ROC SDK  2.4.0
Scalable Face Recognition Software
roc-track

Construct templates from a video feed, tracking persons and/or objects between frames.

This command line application creates, runs, and monitors a roc_video_service for processing one or more video streams. The video streams can originate from a live or pre-recorded source.

ROC Track Options

Option Meaning Default
generate-config Write the complete JSON configuration in use to a user-given location. Do not write the JSON configuration.

Video Service Parameters

roc_video_service objects are initialized with a JSON parameters object passed to the API as a roc_string.

A JSON file showing the default settings for a roc_video_service is included with the ROC SDK, in the share folder, and can also be viewed here.

A complete roc_video_service configuration consists of the following JSON Arrays and JSON Objects.

Global Settings Object

JSON object containing system paths, debug/logging information, and general, non-analytics system behavior. These settings can be configured only once, when calling roc_video_service_start. The Video Stream List, by contrast, can be updated while the service is running by calling roc_video_service_update_stream.

Name Type Meaning Default
behavior JSON Object Control high level streaming behavior. See behavior.
cuda JSON Object Enable GPU acceleration. See cuda.
logging JSON Object Enable logging messages. See logging.
opportunistic-batcher JSON Object Control input image batching characteristics. See opportunistic-batcher.
paths JSON Object Configure local file paths. See paths.
status JSON Object Enable status and statistics messages. See status.

behavior

JSON object controlling high level streaming behavior.

Name Type Meaning Default
max-concurrent-streams unsigned int Number of streaming video sources that can be processed concurrently. 2147483647
persistent bool Allow the application to keep running with no streams. false
sender-id string Unique string identifying a roc_video_service. empty
sequence-streams bool Process streams in sequence rather than concurrently. false
verbose-level int Video decoder log level. See roc_video_log_level. ROC_VIDEO_PANIC

cuda

JSON object controlling GPU acceleration.

Name Type Meaning Default
benchmarking-clear-cache bool Clear kernel benchmarking cache. false
device unsigned int Configure which CUDA device (GPU) is used for acceleration. 0
enabled bool Enable GPU acceleration. false
cpu-fallback bool Continue processing (on CPU) if GPU initialization fails. true
no-benchmarking bool Disable CUDA kernel benchmarking. false

logging

JSON object controlling logging messages.

Name Type Meaning Default
file JSON Object Send log messages to a file. See file.
stdout-enabled bool Log messages to STDOUT. false
webhook JSON Object Send log messages to a remote endpoint. See webhook.

opportunistic-batcher

JSON object controlling input image batching characteristics.

Name Type Meaning Default
batch-size unsigned int Number of input images to batch. See roc_set_opportunistic_batching. 1
max-latency unsigned int Maximum wait time in milliseconds when accumulating a batch for processing. 1000

paths

JSON object containing local file path settings.

Name Type Meaning Default
license string Path to ROC license file. empty
models string Path to ROC detection model libraries. empty

status

JSON object controlling status and statistics messages.

Name Type Meaning Default
file JSON Object Send status messages to a file. See file.
interval unsigned int Period in milliseconds in which to generate status messages. 2000
webhook JSON Object Send status messages to a remote endpoint. See webhook.

file

JSON object configuring file output.

Name Type Meaning Default
enabled bool Send output to file. false
path string Path to file. empty

webhook

JSON object configuring remote output.

Name Type Meaning Default
enabled bool Send output to remote endpoint. false
keep-alive bool Continue processing, and retry when failed transmissions occur. true
url string URL of remote endpoint. empty
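
As a rough illustration of how the objects above nest, the following Python sketch assembles a global settings object and prints it as JSON. The key names and defaults are taken from the tables above and the top-level "global-settings" key from the Examples section; the non-default values (GPU enabled, log file path) are illustrative assumptions, not recommendations.

```python
import json

# Keys and defaults are copied from the tables above. The non-default
# choices (GPU on, logging to a file) are illustrative assumptions.
global_settings = {
    "behavior": {"persistent": False, "sequence-streams": False},
    "cuda": {"enabled": True, "device": 0, "cpu-fallback": True},
    "logging": {
        "stdout-enabled": True,
        "file": {"enabled": True, "path": "roc-track.log"},  # hypothetical path
    },
    "opportunistic-batcher": {"batch-size": 1, "max-latency": 1000},
    "status": {"interval": 2000},
}

# Print the fragment in the shape used by the configuration file.
print(json.dumps({"global-settings": global_settings}, indent=2))
```

Settings omitted from the object are assumed here to keep the defaults listed in the tables.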

Video Stream List

JSON array containing a list of video-stream JSON objects. Streams can be added to the list, removed from the list, or updated via calls to roc_video_service_update_stream. When calling roc_video_service_start, the passed in JSON configuration must contain at least one video-stream.

video-stream

JSON object configuring the processing input and analytics parameters for a video source. Video streams are passed to roc_video_service objects as a JSON array, the Video Stream List.

Name Type Meaning Default
enabled bool Process this stream, sending templates to the configured gallery. true
gallery JSON Object Configure processing output. See gallery.
source JSON Object Configure the live or pre-recorded video stream source. See source.
stream-id string Identifier to use when transmitting output to webhook endpoints. empty
tracker JSON Object Configure algorithm parameters and artifact generation. See tracker.

gallery

JSON object controlling processing output.

Name Type Meaning Default
append bool Append to the output gallery instead of overwriting it. false
file JSON Object Send generated templates to file-backed gallery. See file.
set-metadata-hostid bool Append host_id to generated templates' metadata. false
webhook JSON Object Send generated templates to remote gallery. See webhook.

Either gallery:file or gallery:webhook must be enabled for roc_video_service configuration to be valid.

source

JSON object configuring the live or pre-recorded video stream source.

Name Type Meaning Default
fps float Target frames-per-second to process. 5.0
gstreamer JSON Object Configure optional GStreamer video decoding. See gstreamer.
retry-failed-startup bool Periodically retry starting video source (rather than exiting) if it fails. true
threads int Number of threads used to process video file or live stream. -1
url string Path to video file or IP stream. empty
video-start-time string Start time for pre-recorded videos. See roc_video_set_start_time. "0"
warmup unsigned int Milliseconds to drop frames before processing video source. 0

source:url must be a non-empty string for roc_video_service configuration to be valid.
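
The validity rules stated so far (at least one video-stream, an enabled gallery:file or gallery:webhook, and a non-empty source:url) can be checked before launching the service. The sketch below is a hypothetical pre-flight helper, not part of the ROC SDK; the function name validate_config is an assumption, and the "video-streams" key is taken from the Examples section.

```python
def validate_config(config: dict) -> list:
    """Return a list of problems; an empty list means the documented rules pass."""
    problems = []
    streams = config.get("video-streams", [])
    if not streams:
        problems.append("video-streams must contain at least one video-stream")
    for i, stream in enumerate(streams):
        gallery = stream.get("gallery", {})
        file_on = gallery.get("file", {}).get("enabled", False)
        webhook_on = gallery.get("webhook", {}).get("enabled", False)
        if not (file_on or webhook_on):
            problems.append(
                f"video-streams[{i}]: enable gallery:file or gallery:webhook")
        if not stream.get("source", {}).get("url", ""):
            problems.append(
                f"video-streams[{i}]: source:url must be a non-empty string")
    return problems


good = {
    "video-streams": [
        {
            "gallery": {"file": {"enabled": True, "path": "gallery.t"}},
            "source": {"url": "../data/josh.mp4"},
        }
    ]
}
print(validate_config(good))
```

The helper only mirrors the rules written on this page; the service itself may enforce additional constraints.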

gstreamer

JSON object configuring optional GStreamer video decoding.

Name Type Meaning Default
enabled bool Use GStreamer, rather than FFmpeg, for video decoding. false
gpu-decoding bool Use GStreamer GPU video decoding acceleration, if available. false

tracker

JSON object configuring algorithm parameters and artifact generation.

Name Type Meaning Default
analytics-backends JSON Array List of algorithm parameter JSON objects. See analytics-backend.
camera-preview JSON Object Periodically capture video frames (for displaying in a User Interface). See camera-preview.
context-image JSON Object Capture video frames that resulted in template generation. See context-image.
enhance-contrast JSON Object Enhance pixel contrast in video source images. See enhance-contrast.
model-foreground bool Perform background subtraction before running analytics on input images. See roc_foreground. false
reason-elapsed JSON Object Configure roc_event elapsed parameters. See reason-elapsed.
redaction JSON Object Configure redaction settings for stored images. See redaction.
render JSON Object Set rendering options for stored images. See render.
roi JSON Object Configure regions of interest for input video frames. See roi.
video-recordings JSON Object Configure video clip recording. See video-recordings.
watchlists JSON Object Configure gallery watchlists. See watchlists.

analytics-backend

JSON object containing algorithm parameters to be run for each input video frame.

Name Type Meaning Default
algorithm-id JSON Array See Specifying an Algorithm ID. ROC_STANDARD_DETECTION, ROC_STANDARD_REPRESENTATION
backend-id string Identifier specific to this analytics-backend. Not externally visible. empty
enabled bool Run the analytics specified in this analytics-backend. true
event-reasons JSON Array See roc_event_reasons. ROC_NO_EVENT
false-detection-rate float See False Detection Rate. 0.02
filter-text string Regular expressions describing text to check for in detection areas. empty
k int See Maximum Faces. 1
max-time-separation roc_time Maximum milliseconds between two similar templates for them to be considered the same track. 5000
min-count int Specify the minimum number of templates for a track to generate a roc_tracker event. 1
min-detection-overlap float Minimum position overlap for two similar templates to be considered the same track. 0.0
min-quality float Minimum template quality. -4.0
min-similarity float Minimum similarity for two similar templates to be considered the same track. 0.55
no-min-quality bool Disable quality filtering. false
relative-min-size float Specify detection roc_adaptive_minimum_size. 0.04
set-metadata-object string JSON object of key-value pairs to append to a template's metadata. empty
thumbnail-parameters JSON Object Configure generated thumbnail settings. See roc_thumbnail_parameters.

camera-preview

JSON object configuring periodically capturing video frames.

Name Type Meaning Default
enabled bool Store video frames at the specified fps. false
file JSON Object Store video frames to a file. See file.
fps float Frame rate at which to store video frames. 1
height unsigned int Stored video frame height in pixels, or 0 to keep original size. 0
quality unsigned int Stored image quality. 95
webhook JSON Object Send stored video frames to a remote endpoint. See webhook.

context-image

JSON object configuring capturing video frames that resulted in template generation.

Name Type Meaning Default
enabled bool Store video frames that resulted in template generation. false
file JSON Object Store video frames to a file. See file.
format string Format for saved frames, e.g. "png". ".jpg"
quality unsigned int Stored image quality. 95
webhook JSON Object Send stored video frames to a remote endpoint. See webhook.

enhance-contrast

JSON object controlling enhancement of pixel contrast in video source images.

Name Type Meaning Default
clip-limit unsigned int Adaptive histogram equalization contrast limit. 2
enabled bool Enhance contrast in video source images. false
tiles unsigned int Number of horizontal and vertical tiles to partition image into. 0

reason-elapsed

JSON object configuring roc_event elapsed parameters.

Name Type Meaning Default
interval-time unsigned int Milliseconds after which ROC_ELAPSED_INTERVAL event is emitted. 0
threshold-time unsigned int Milliseconds after which ROC_ELAPSED_THRESHOLD event is emitted. 0

redaction

JSON object configuring redaction settings for stored images.

Name Type Meaning Default
context-image JSON Object Configure settings for redacting entire images. See privacy.
face JSON Object Configure settings for redacting faces from stored images. See privacy.

privacy

JSON object controlling image blurring and censoring.

Name Type Meaning Default
behavior string Redact faces or frames that match or don't match a watchlist. See roc_tracker_privacy. ROC_REDACT_NONE
blur bool Blur pixels rather than blackening when redacting. false
enabled bool Allow redaction. false

render

JSON object controlling rendering options for stored images.

Name Type Meaning Default
color string Rendering box RGB color as a comma delimited string. "0,255,0"
detections bool Render detections in stored images. false
foreground bool Render foreground in stored images. false
person-ids bool Render person IDs in stored images. false
recordings bool Render bounding boxes in recorded video clips. false
timestamp bool Render timestamps in stored images. false

roi

JSON object controlling regions of interest for input video frames.

Name Type Meaning Default
enabled bool Only run analytics on user defined image areas. false
exclude-roi JSON Array List of image areas to exclude from analytics processing. See roi-box.
include-roi JSON Array List of image areas to include for analytics processing. See roi-box.
min-roi-overlap float Minimum fraction of detection area that must be contained within an roi-box. 0.5

roi-box

Include and exclude region of interest (roi) areas are given as a list of strings, each a comma-delimited location: <x>,<y>,<width>,<height>.

  • Example: "0.25,0.25,0.5,0.5"
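
A small sketch of how an roi-box string and min-roi-overlap could be interpreted. It assumes that the overlap is the intersection area divided by the detection area, as the min-roi-overlap description suggests; the helper names parse_roi_box and overlap_fraction are hypothetical and not part of the SDK.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # x, y, width, height


def parse_roi_box(text: str) -> Box:
    """Parse "<x>,<y>,<width>,<height>" into four floats."""
    parts = [float(p) for p in text.split(",")]
    if len(parts) != 4:
        raise ValueError(f"expected 4 comma-delimited values, got {len(parts)}")
    x, y, w, h = parts
    return (x, y, w, h)


def overlap_fraction(detection: Box, roi: Box) -> float:
    """Fraction of the detection area that lies inside the roi-box (assumed semantics)."""
    dx, dy, dw, dh = detection
    rx, ry, rw, rh = roi
    ix = max(0.0, min(dx + dw, rx + rw) - max(dx, rx))  # intersection width
    iy = max(0.0, min(dy + dh, ry + rh) - max(dy, ry))  # intersection height
    area = dw * dh
    return (ix * iy) / area if area > 0 else 0.0


roi = parse_roi_box("0.25,0.25,0.5,0.5")
detection = (0.5, 0.5, 0.5, 0.5)
# Under the assumed semantics, a detection is kept when this
# fraction is at least min-roi-overlap (0.5 by default).
print(overlap_fraction(detection, roi) >= 0.5)
```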

video-recordings

JSON object controlling video clip recording.

Name Type Meaning Default
clip-padding unsigned int Additional milliseconds of recording to pad the start and end of a clip with. 1000
enabled bool Record and store video clips. false
file JSON Object Store video clips to a file. See file.
max-frames unsigned int Number of recorded video frames to retain for later retrieval. 5400
persistent-recording JSON Object Record video clips continuously rather than per ROC_FINALIZED_TRACK event. Disables clip padding. See persistent-recording.
webhook JSON Object Store video clips to a remote endpoint. See webhook.

persistent-recording

JSON object configuring continuous video clip recording.

Name Type Meaning Default
clip-duration unsigned int Duration in milliseconds of each clip when persistent recording. 30000
enabled bool Record video clips continuously. false
idr-interval unsigned int Number of milliseconds between recorded IDR frames. 1000

watchlists

JSON object containing gallery watchlists' settings.

Name Type Meaning Default
lists JSON Array Array of watchlists to be searched against. See list.
min-similarity float Minimum similarity to trigger a watchlist hit. 0.55

list

JSON object containing individual watchlist and metadata.

Name Type Meaning Default
enabled bool Search against this gallery. false
path string Location of watchlist gallery file. empty
watchlist-id string Identifier specific to this watchlist. Not externally visible. empty

Examples

Show help
Print application usage and supported arguments.
$ roc-track -h 
Run roc-track
All configuration parameters live in the JSON configuration file.
$ roc-track /path/to/json.config 
Track faces in a video
Track faces in the video ../data/josh.mp4 and write its templates to gallery.t. By default the single best template per track will be saved after the person exits the camera field of view.
Configuration file contents:
{
  "global-settings": {},
  "video-streams": [
    {
      "gallery": {
        "file": {
          "enabled": true,
          "path": "gallery.t"
        }
      },
      "source": {
        "url": "../data/josh.mp4"
      },
      "tracker": {
        "analytics-backends": [
          {
            "event-reasons": [
              "ROC_FINALIZED_TRACK"
            ]
          }
        ]
      }
    }
  ]
}
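The same configuration can also be generated from a script, which avoids hand-editing mistakes such as a missing comma. The following is a minimal sketch: the function name build_tracking_config and the output file name track.json are assumptions, and the command line is only assembled here, not executed.

```python
import json
import tempfile
from pathlib import Path


def build_tracking_config(video_url: str, gallery_path: str) -> dict:
    """Build the minimal face-tracking configuration shown above."""
    return {
        "global-settings": {},
        "video-streams": [
            {
                "gallery": {"file": {"enabled": True, "path": gallery_path}},
                "source": {"url": video_url},
                "tracker": {
                    "analytics-backends": [
                        {"event-reasons": ["ROC_FINALIZED_TRACK"]}
                    ]
                },
            }
        ],
    }


config = build_tracking_config("../data/josh.mp4", "gallery.t")

# Hypothetical output location; any writable path works.
config_path = Path(tempfile.gettempdir()) / "track.json"
config_path.write_text(json.dumps(config, indent=2))

# roc-track takes the configuration file as its argument (see "Run roc-track").
command = ["roc-track", str(config_path)]
print(" ".join(command))
```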
Streaming Video
In the above example the input video ../data/josh.mp4 can be replaced with a URL to a live streaming video. See roc_open_video for details. To debug video connectivity issues set "verbose-level" : "ROC_VIDEO_VERBOSE" in the configuration file.
Streaming Templates
In the above example the output gallery gallery.t can be replaced with a URL to a remote endpoint that is either:
  • A gallery server hosted with roc-serve. This URL would typically not have an http:// prefix.
  • A custom server backend implementing the Web API. This URL would typically have an http:// prefix.
Frames Per Second
The desired processing FPS can be specified with the fps field in source. The default is 5.0. For live videos, frames will be dropped if the system can't keep up with realtime processing. See roc_stream_start and roc_stream_get_true_frame_rate for details.
"source": {
  "fps": 5,
  "url": "<source-video-url>"
},
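To make the idea of a target rate concrete, the sketch below picks which frames of a faster source would be processed at a given fps by enforcing a minimum spacing of 1000 / fps milliseconds. This is an illustration only, not the SDK's scheduler; the function name select_frames is hypothetical.

```python
def select_frames(timestamps_ms, target_fps):
    """Keep frames spaced at least 1000 / target_fps milliseconds apart."""
    if target_fps <= 0:
        raise ValueError("target_fps must be positive")
    interval_ms = 1000.0 / target_fps
    kept = []
    last = None
    for ts in timestamps_ms:
        if last is None or ts - last >= interval_ms:
            kept.append(ts)
            last = ts
    return kept


# One second of a 25 fps source (one frame every 40 ms), processed at 5 fps.
source_timestamps = list(range(0, 1000, 40))
print(select_frames(source_timestamps, 5.0))
```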
Minimum Template Similarity
The minimum between-frame roc_similarity for two templates to be considered the same person is specified with min-similarity. The default value is ROC_DEFAULT_MIN_TRACKING_SIMILARITY. If you observe broken tracks (two tracks for the same person, which can happen in low-quality video), consider lowering the similarity threshold to connect faces between frames more aggressively.
"analytics-backends": [
  {
    "algorithm-id": [
      "ROC_STANDARD_DETECTION",
      "ROC_STANDARD_REPRESENTATION"
    ],
    "backend-id": "",
    "enabled": true,
    "event-reasons": [
      "ROC_FINALIZED_TRACK"
    ],
    "false-detection-rate": 0.02,
    "filter-text": "",
    "k": 1,
    "max-time-separation": 5000,
    "min-count": 1,
    "min-detection-overlap": 0,
    "min-quality": -4,
    "min-similarity": 0.55,
    "no-min-quality": false,
    "relative-min-size": 0.04,
    "set-metadata-object": {},
    "thumbnail-parameters": {
      "height": 384,
      "quality": 80,
      "scale": 0.6,
      "width": 288
    }
  }
]
Maximum Time Separation
The maximum time, in milliseconds, to wait for a person to re-enter the camera field of view is specified by max-time-separation. The default value is 5000. If a person hasn't been seen for this amount of time, ROC_FINALIZED_TRACK will be triggered, and a new track will be created if the person re-appears in the video.
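The way min-similarity, max-time-separation, and min-detection-overlap jointly decide whether two templates belong to one track can be sketched as a single predicate. This is a simplified model that assumes all three conditions must hold; it is not the SDK's tracker, and the names TrackingParams and same_track are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrackingParams:
    # Defaults taken from the analytics-backend table.
    min_similarity: float = 0.55
    max_time_separation_ms: int = 5000
    min_detection_overlap: float = 0.0


def same_track(similarity: float, elapsed_ms: int, overlap: float,
               params: TrackingParams = TrackingParams()) -> bool:
    """Assumed rule: all three thresholds must be satisfied."""
    return (similarity >= params.min_similarity
            and elapsed_ms <= params.max_time_separation_ms
            and overlap >= params.min_detection_overlap)


# A weak match is rejected with the defaults but accepted once the
# similarity threshold is lowered, as suggested for broken tracks.
print(same_track(0.45, 1200, 0.3))
print(same_track(0.45, 1200, 0.3, TrackingParams(min_similarity=0.4)))
```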
Events
You can control the reason(s) for which a roc_event is reported and its roc_event::probe roc_template is saved. The most common are:
  • ROC_FINALIZED_TRACK emits the single best template after the person exits the camera field of view. This is the default reason.
  • ROC_NEW_TRACK emits the first template as soon as a new person enters the camera field of view.
  • ROC_BETTER_PROBE emits when a better capture is obtained for a person actively being tracked.
  • ROC_MERGED_TRACKS emits when two tracks previously thought to be different people are established as the same person.
  • ROC_ELAPSED_THRESHOLD emits the current best template a configured number of milliseconds after a new track starts.
  • ROC_ELAPSED_INTERVAL emits the best template obtained since the last event for a person, a configured number of milliseconds after that event.
"analytics-backends": [
  {
    "algorithm-id": [
      "ROC_STANDARD_REPRESENTATION",
      "ROC_ANALYTICS",
      "ROC_THUMBNAIL",
      "ROC_FRONTAL_DETECTION"
    ],
    "backend-id": "",
    "enabled": true,
    "event-reasons": [
      "ROC_FINALIZED_TRACK",
      "ROC_BETTER_PROBE",
      "ROC_ELAPSED_INTERVAL"
    ],
    "false-detection-rate": 0.02,
    "filter-text": "",
    "k": 1,
    "max-time-separation": 5000,
    "min-count": 1,
    "min-detection-overlap": 0,
    "min-quality": -4,
    "min-similarity": 0.55,
    "no-min-quality": false,
    "relative-min-size": 0.04,
    "set-metadata-object": {},
    "thumbnail-parameters": {
      "height": 384,
      "quality": 80,
      "scale": 0.6,
      "width": 288
    }
  }
]
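The elapsed event reasons take their timing from the reason-elapsed object in tracker, whose values default to 0. The sketch below pairs the two settings and flags elapsed reasons whose time is still 0. Treating 0 as "not configured" is an assumption, and the helper name unset_elapsed_times is hypothetical.

```python
# Maps each elapsed event reason to its reason-elapsed key (from the tables above).
ELAPSED_KEYS = {
    "ROC_ELAPSED_THRESHOLD": "threshold-time",
    "ROC_ELAPSED_INTERVAL": "interval-time",
}


def unset_elapsed_times(tracker: dict) -> list:
    """List elapsed reasons in use whose reason-elapsed time is 0 or missing."""
    elapsed = tracker.get("reason-elapsed", {})
    unset = []
    for backend in tracker.get("analytics-backends", []):
        for reason in backend.get("event-reasons", []):
            key = ELAPSED_KEYS.get(reason)
            if key and elapsed.get(key, 0) == 0 and reason not in unset:
                unset.append(reason)
    return unset


tracker = {
    "analytics-backends": [
        {
            "event-reasons": [
                "ROC_FINALIZED_TRACK",
                "ROC_ELAPSED_THRESHOLD",
                "ROC_ELAPSED_INTERVAL",
            ]
        }
    ],
    # Illustrative values: report 2 s into a track, then every 10 s.
    "reason-elapsed": {"threshold-time": 2000, "interval-time": 10000},
}
print(unset_elapsed_times(tracker))
```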
Cropping the input video
If you only need to process a fixed portion of the camera field of view, you can specify that region of interest (ROI) with roi. See roc_tracker_set_rois for details.
"roi": {
  "enabled": true,
  "exclude-roi": [],
  "include-roi": [
    "0.25,0.25,0.5,0.5"
  ],
  "min-roi-overlap": 0.5
}
Alternatively (or additionally), if you only want to process the region of each frame with motion, you can enable that optimization with model-foreground. See roc_foreground for details.
"tracker": {
  "model-foreground": true
}
Saving full frames
You can write the corresponding frame for each reported event to a roc_database (folder) using context-image. The roc_event::probe roc_template::archive_id will be set to the archived image; see roc_tracker_set_best_image_parameters. Instead of writing to a folder, these frames can be transmitted to a remote database hosted with roc-serve or a server running the Web API.
"context-image": {
  "enabled": true,
  "file": {
    "enabled": true,
    "path": "/path/to/folder"
  },
  "format": ".jpg",
  "quality": 95,
  "webhook": {
    "enabled": false,
    "keep-alive": true,
    "url": ""
  }
}