Secrets to OpenCV Haar Cascade Training Revealed

Discuss arguments to opencv

Show trial runs with output

Creating a custom Haar object detector breaks down into two phases: the training phase and the detection phase. In this article we focus on the training phase, which covers preparing the training data and running the training application. The success of your object detector depends directly on how much effort you put into this phase. Take your time with it and get it right.

Using the "opencv_traincascade" program

We will use the opencv_traincascade program to train the Haar cascade for the Robomow detector.

First we will need to collect negative samples.  

Negative Samples

Negative samples are arbitrary images that do NOT contain the Robomow. Ideally they are pictures of the kind of background where we would expect to find the Robomow. For the Robomow, this would be a front or back yard with a lawn. The negative images are described in a special text file that lists the location of each image in the file system, relative to the directory from which we will run the opencv_traincascade program. For example, we could create a negative text file called negative.txt with one image path per line.
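A minimal Python sketch of building such a file is shown here. The function name, the negatives/ directory, and the yard_*.jpg file names are hypothetical and only illustrate the one-path-per-line layout; adjust them to your own setup.

```python
import os
from pathlib import Path

# File types treated as images (an assumption; extend as needed).
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp"}


def write_negative_list(image_dir, list_path="negative.txt", run_dir="."):
    """Write one image path per line, relative to run_dir.

    run_dir is the directory from which opencv_traincascade will be run.
    The resulting file would look like (hypothetical names):
        negatives/yard_001.jpg
        negatives/yard_002.jpg
    """
    images = sorted(
        p for p in Path(image_dir).rglob("*")
        if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
    )
    # Store forward-slash paths relative to the run directory.
    lines = [Path(os.path.relpath(p, run_dir)).as_posix() for p in images]
    Path(list_path).write_text("".join(line + "\n" for line in lines))
    return lines
```

Calling write_negative_list("negatives") from the training directory would then produce a negative.txt listing every image found under negatives/.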


We will pass the negative.txt file as an argument to the opencv_traincascade program.

Your set of negative samples tells the opencv_traincascade program what not to look for when trying to find the Robomow.
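As a rough sketch of how the negative list reaches the trainer, the Python snippet below assembles an opencv_traincascade command line in which negative.txt is passed with the -bg flag. The file names (robomow.vec, classifier) and the sample counts are placeholder assumptions, not values from an actual run.

```python
import shlex


def build_traincascade_cmd(vec_file, bg_file, out_dir, num_pos, num_neg,
                           num_stages=20, width=24, height=24):
    """Return the opencv_traincascade argument list (nothing is executed)."""
    return [
        "opencv_traincascade",
        "-data", str(out_dir),        # directory where the trained cascade is written
        "-vec", str(vec_file),        # positive samples from opencv_createsamples
        "-bg", str(bg_file),          # the negative list, e.g. negative.txt
        "-numPos", str(num_pos),      # positives used per stage
        "-numNeg", str(num_neg),      # negatives used per stage
        "-numStages", str(num_stages),
        "-featureType", "HAAR",
        "-w", str(width),             # must match the -w/-h used for the .vec file
        "-h", str(height),
    ]


# Placeholder values for illustration only.
cmd = build_traincascade_cmd("robomow.vec", "negative.txt", "classifier",
                             num_pos=900, num_neg=500)
print(" ".join(shlex.quote(arg) for arg in cmd))
```

Building the command as a list keeps each flag next to its value and makes it easy to hand to subprocess.run once the sample files exist.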

Positive Images

Next we will need to collect positive samples. Positive samples are created with the opencv_createsamples utility, either from a single image of the object or from a collection of previously marked-up images. They are used at each stage of the cascade training to define what the model should actually look for when trying to find the Robomow. The program supports two ways of generating a positive sample dataset:

  1. You can generate a bunch of positives from a single positive object image.
  2. You can supply all the positives yourself and use the tool only to cut them out, resize them, and store them in the binary format that OpenCV requires.

Let's explore each approach.

The first approach works reasonably well for fixed, rigid objects. If your object is moving, for example in video footage, the second approach will likely work better. In other words, even a relatively small number of hand-selected pictures will likely produce better results than having the tool generate 1,000 samples artificially. We will test both approaches.

Approach #1 - Generating Samples from a single Positive Object

The first approach takes a single object image, for example a company logo, and creates a large set of positive samples from it by randomly rotating the object, changing the image intensity, and placing the image on arbitrary backgrounds. The amount and range of randomness can be controlled by command-line arguments of the opencv_createsamples application.
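The sketch below shows, in Python, how such a command line might be assembled. The flags are the ones opencv_createsamples documents for the single-image mode; the file names (robomow.png, negative.txt, robomow.vec) and the chosen values are assumptions for illustration.

```python
import shlex


def build_createsamples_cmd(object_image, bg_file, vec_file, num=1000,
                            max_x_angle=1.1, max_y_angle=1.1, max_z_angle=0.5,
                            max_intensity_dev=40, width=24, height=24):
    """Return the opencv_createsamples argument list (nothing is executed)."""
    return [
        "opencv_createsamples",
        "-img", str(object_image),         # the single positive object image
        "-bg", str(bg_file),               # backgrounds the object is pasted onto
        "-vec", str(vec_file),             # output .vec file
        "-num", str(num),                  # how many samples to generate
        "-maxxangle", str(max_x_angle),    # maximum rotation angles, in radians
        "-maxyangle", str(max_y_angle),
        "-maxzangle", str(max_z_angle),
        "-maxidev", str(max_intensity_dev),  # maximum intensity deviation
        "-w", str(width),                  # sample width in pixels
        "-h", str(height),                 # sample height in pixels
    ]


# Placeholder file names for illustration only.
cmd = build_createsamples_cmd("robomow.png", "negative.txt", "robomow.vec", num=500)
print(" ".join(shlex.quote(arg) for arg in cmd))
```

Smaller angle and intensity values keep the generated samples closer to the original image; larger values add more variation.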

There are two tasks to complete before training your object detector: collecting the negative samples and creating the positive samples.

Training your Object Detector

  • opencv_createsamples is used to prepare a training dataset of positive and test samples. It produces a dataset of positive samples in a format supported by both the opencv_haartraining and opencv_traincascade applications. The output is a file with a *.vec extension, a binary format that contains the images.

Approach #2 - Generating Samples from a Collection of Marked-Up Images

Please note that you need a large dataset of positive samples before you give it to the utility, because it only applies a perspective transformation. For example, you may need only one positive sample for an absolutely rigid object like an OpenCV logo, but you definitely need hundreds or even thousands of positive samples for faces. In the case of faces you should consider all the race and age groups, emotions, and perhaps beard styles.

So, a single object image may contain a company logo. A large set of positive samples is then created from that object image by randomly rotating it, changing the logo intensity, and placing the logo on arbitrary backgrounds. The amount and range of randomness can be controlled by command-line arguments of the opencv_createsamples utility.
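For a collection of previously marked-up images, opencv_createsamples reads a description file passed with the -info flag. Each line of that file holds an image path, the number of objects in the image, and an x y width height box for each object. The Python sketch below formats such lines and assembles the matching command; the image paths, box coordinates, and the info.dat and robomow.vec names are hypothetical.

```python
def format_info_line(image_path, boxes):
    """Format one description-file line: path, object count, then x y w h per box."""
    parts = [str(image_path), str(len(boxes))]
    for x, y, w, h in boxes:
        parts += [str(x), str(y), str(w), str(h)]
    return " ".join(parts)


def build_createsamples_from_info_cmd(info_file, vec_file, num, width=24, height=24):
    """Return the opencv_createsamples argument list for marked-up images."""
    return [
        "opencv_createsamples",
        "-info", str(info_file),  # description file with the marked-up objects
        "-vec", str(vec_file),    # output .vec file
        "-num", str(num),         # number of marked objects to cut out
        "-w", str(width),         # each object is resized to width x height
        "-h", str(height),
    ]


# Hypothetical annotations: image path -> list of (x, y, width, height) boxes.
annotations = {
    "img/yard_01.jpg": [(140, 100, 45, 45)],
    "img/yard_02.jpg": [(100, 200, 50, 50), (50, 30, 25, 25)],
}
info_lines = [format_info_line(path, boxes) for path, boxes in annotations.items()]
print("\n".join(info_lines))
```

Here -num would normally be set to the total number of marked boxes, which is 3 for the hypothetical annotations above.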

