Creating your own Kotlin detector in TensorFlow

In this article I will show how to create a mobile object detector for one specific product, a Kotlin mild ketchup:


Not so long ago I was using OpenCV and its Java interface for object detection in one of my projects. I was implementing Haar feature-based cascade classifiers for different types of products. Generating such a classifier involved the following three steps:

  • samples collection
  • training
  • detection

For collecting samples I generated as many distorted images of the identified products as possible. To generate a large number of samples (100) from a single image I used the opencv_createsamples utility:

opencv_createsamples -img [image_name.jpg] -num 100 -bg negatives.dat -vec samples.vec -maxxangle 0.6 -maxyangle 0 -maxzangle 0.3 -maxidev 100 -bgcolor 0 -bgthresh 0 -w 24 -h 24

where negatives.dat was just a list of image paths that didn’t contain the given detected object.

For the training I used opencv_haartraining:

opencv_haartraining -data [classifier_dir_object_name] -vec samples.vec -bg negatives.dat -nstages 10 -precalcValBufSize 1024 -precalcIdxBufSize 1024 -minhitrate 0.995 -maxfalsealarm 0.5 -npos 100 -nneg 100 -w 24 -h 24 -mode ALL

Here samples.dat is a file listing image paths and the coordinates of the given object in each image. Its format is as follows:

[filename] [# of objects] [[x y width height] [… 2nd object] …]

for example:

picture001.jpg 1 140 100 45 45 
picture002.jpg 2 100 200 50 50 50 30 25 25
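
To make the format concrete, here is a minimal Python sketch (a hypothetical helper, not part of OpenCV) that parses one line of this description file:

```python
def parse_sample_line(line):
    """Parse one positive-samples description line:
    [filename] [# of objects] [x y width height] per object."""
    parts = line.split()
    filename = parts[0]
    count = int(parts[1])
    coords = [int(v) for v in parts[2:]]
    assert len(coords) == 4 * count, "expected 4 values per object"
    # Group the flat coordinate list into (x, y, width, height) tuples
    boxes = [tuple(coords[i:i + 4]) for i in range(0, len(coords), 4)]
    return filename, boxes

filename, boxes = parse_sample_line("picture002.jpg 2 100 200 50 50 50 30 25 25")
# boxes -> [(100, 200, 50, 50), (50, 30, 25, 25)]
```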

The application ran on a server. The drawbacks were the cost of generating a valid classifier and the fact that I couldn't easily deploy my solution on mobile.

After TensorFlow was released I decided to use its Object Detection API. The cost of generating a properly working model is still high, but TensorFlow comes with a number of handy scripts and samples, which makes building a mobile app with object detection relatively easy.

First we need to download and set up the TensorFlow environment and clone the appropriate repositories (the tensorflow/models repository, which contains the Object Detection API).

Then we need to collect some Kotlin ketchup images. It is good to downscale them to a fairly low resolution, e.g. 415 x 553px, because I noticed that when using the original large pictures I often got out-of-memory errors in the training phase. Even with only 60 training images (3456 x 4608px each) I got OoME (both on my laptop with 16GB RAM and an i7-4720HQ, and on Google Cloud ML Engine with a Standard_GPU).

We can use mogrify to downscale a bunch of images in one step:

find . -name "*.jpg" | xargs mogrify -resize 15%

I also used Gimp to extract smaller objects from large pictures, especially when a target was visible multiple times:


I used labelImg, an excellent free tool, to generate product bounding boxes in XML format.


Then I divided the resulting set of JPG and XML files into training (277 images) and test (87 images) sets.
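
A split like this can easily be scripted; here is a minimal sketch (a hypothetical helper, not one of the scripts used in the article) that shuffles the file basenames and splits them by a chosen test fraction:

```python
import random

def split_dataset(stems, test_fraction=0.25, seed=42):
    """Split image/annotation basenames into train and test lists.
    Each stem is expected to have a matching .jpg and .xml file."""
    stems = list(stems)
    # Seeded shuffle keeps the split reproducible between runs
    random.Random(seed).shuffle(stems)
    n_test = int(len(stems) * test_fraction)
    return stems[n_test:], stems[:n_test]

train, test = split_dataset([f"img{i:03d}" for i in range(364)])
# 364 files -> 273 train / 91 test at a 25% test fraction
```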

Then I ran a slightly modified version of the xml_to_csv.py script from the Raccoon dataset to generate CSV descriptions of my training and test datasets.

python3 xml_to_csv.py train
python3 xml_to_csv.py test
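
The conversion itself is straightforward; this self-contained sketch shows roughly what xml_to_csv.py does with labelImg's Pascal VOC annotations (field names follow that XML format):

```python
import xml.etree.ElementTree as ET

def xml_to_rows(xml_text):
    """Extract (filename, width, height, class, xmin, ymin, xmax, ymax)
    rows from a single labelImg (Pascal VOC) annotation."""
    root = ET.fromstring(xml_text)
    filename = root.findtext("filename")
    width = int(root.findtext("size/width"))
    height = int(root.findtext("size/height"))
    rows = []
    for obj in root.iter("object"):
        bbox = obj.find("bndbox")
        rows.append((filename, width, height, obj.findtext("name"),
                     int(bbox.findtext("xmin")), int(bbox.findtext("ymin")),
                     int(bbox.findtext("xmax")), int(bbox.findtext("ymax"))))
    return rows

sample = """<annotation>
  <filename>img001.jpg</filename>
  <size><width>415</width><height>553</height></size>
  <object>
    <name>kotlin</name>
    <bndbox><xmin>10</xmin><ymin>20</ymin><xmax>110</xmax><ymax>220</ymax></bndbox>
  </object>
</annotation>"""
rows = xml_to_rows(sample)
# rows -> [('img001.jpg', 415, 553, 'kotlin', 10, 20, 110, 220)]
```

The real script additionally walks a directory of XML files and writes the rows out with pandas; this sketch only covers the per-file extraction.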

Next I used an altered generate_tfrecord.py script (I added the ability to read labels from a file) to generate the TFRecord files needed in the TensorFlow training phase.

python3 generate_tfrecord.py -t train --labels_path=labels.txt --csv_input=train_labels.csv --output_path=train.record
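
The label-reading change can be sketched like this (the labels.txt format is my assumption: one label name per line, with 1-based IDs matching the .pbtxt label map):

```python
def read_label_map(text):
    """Map label names to 1-based class IDs, one label per line."""
    labels = [line.strip() for line in text.splitlines() if line.strip()]
    return {name: idx for idx, name in enumerate(labels, start=1)}

label_map = read_label_map("kotlin\n")

def class_text_to_int(row_label):
    # Used when writing TFRecord examples: class name -> numeric ID
    return label_map[row_label]
```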

In the next step I created a .pbtxt file containing just one entry for the detected label:

item {
  id: 1
  name: 'kotlin'
}

And the last thing before training begins: the config file. It's quite complex, so I based mine on the COCO one (as I'm using transfer learning). I changed num_classes to 1 (we're detecting only one 'kotlin' ketchup object) and used the 'ssd_mobilenet_v2' feature extractor.

In the matcher section I increased matched_threshold (to 0.75) and lowered unmatched_threshold (to 0.25), and I changed fixed_shape_resizer to 500 x 500. Without this step I was getting too many false positives.

To avoid out-of-memory errors, which unfortunately occurred too often, I decreased batch_size in the train_config section to 4. I also pointed fine_tune_checkpoint at the COCO-pretrained model and decreased num_steps to 10000 (to significantly lower the training time).

In the train_input_reader section I decreased:

queue_capacity: 100
min_after_dequeue: 50

again, to avoid OoME.

I set the number of steps to 10 000 (previously it was 200 000). Such a significant drop could affect the accuracy of the trained model; however, I noticed that between 9 000 and 10 000 steps there was no meaningful drop in loss (it always oscillated between 0.5 and 1.5), so I decided not to increase that value. Then I adjusted input_path and label_map_path in both the training and evaluation sections.
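
Taken together, the changes above correspond to a pipeline config excerpt roughly like this (field names follow the TF Object Detection API config schema; only the modified fields are shown, and the file paths are placeholders):

```
model {
  ssd {
    num_classes: 1
    image_resizer {
      fixed_shape_resizer { height: 500 width: 500 }
    }
    feature_extractor { type: 'ssd_mobilenet_v2' }
    matcher {
      argmax_matcher {
        matched_threshold: 0.75
        unmatched_threshold: 0.25
      }
    }
  }
}
train_config {
  batch_size: 4
  fine_tune_checkpoint: "path/to/coco/model.ckpt"
  from_detection_checkpoint: true
  num_steps: 10000
}
train_input_reader {
  queue_capacity: 100
  min_after_dequeue: 50
  tf_record_input_reader { input_path: "train.record" }
  label_map_path: "path/to/label_map.pbtxt"
}
```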

Now let’s begin with the training phase:

cd tensorflow/models/research
export PYTHONPATH=$PYTHONPATH:`pwd`:`pwd`/slim (we have to set this path each time we open a new terminal)
python3 object_detection/train.py --logtostderr --pipeline_config_path=product.config --train_dir=~/training

This will generate a bunch of files in the training directory.

Learning steps are recorded at specific checkpoints (see 'Recording summary at step.' in the command output); one of these checkpoints will be used later to generate the Protobuf model. If training fails for some reason, we can restart it later and (provided from_detection_checkpoint = true) it will resume from the latest recorded checkpoint. When training finishes (with the given config it can take up to 8 hours) we need to export the generated graph to a file that can be used on a mobile device. We will use object_detection/export_inference_graph.py from the TensorFlow models project:

python3 object_detection/export_inference_graph.py --input_type image_tensor --pipeline_config_path=products.config --trained_checkpoint_prefix=model.ckpt-10000 --output_directory=~/products/out

As a result, in the products/out directory we get a model file, which we need to copy to the assets directory of the mobile app (I will describe the app in the next paragraph):

cp ~/products/out/frozen_inference_graph.pb ~/AndroidStudioProjects/KotlinDetector/assets/frozen_inference_graph.pb

Now, what's left is the mobile app. I tweaked a sample TensorFlow object detection app, extracted it and translated it into Kotlin (well, in what other language could it be implemented? :)). The application uses TensorFlowObjectDetectionAPIModel, a wrapper for frozen detection models, which in turn uses TensorFlowInferenceInterface, a wrapper over the TensorFlow API.

If you want to have your own detector, you need to substitute frozen_inference_graph.pb with your own file and change the label names in frozen_inference_labels.txt.

And that's it! Voilà, here is the Kotlin detector in action:


Here you will find code of the mobile app.

Here are Python scripts I used in object detection.

Here is a ready to install .apk file for the app evaluation.