#11387: dnn_detect filter won't work with yolo4-tiny model when both anchors and
labels filenames are defined
-------------------------------------+-------------------------------------
Reporter: Leandro Santiago           | Type: defect
Status: new                          | Priority: normal
Component: avfilter                  | Version: unspecified
Keywords:                            | Blocked By:
Blocking:                            | Reproduced by developer: 0
Analyzed by developer: 0             |
-------------------------------------+-------------------------------------
Summary of the bug:
System: Manjaro Linux stable (latest as of Dec 30th 2024).
OpenVino version: 2024.6.0.
FFmpeg version:
{{{
ffmpeg version N-118193-g5f38c82536 Copyright (c) 2000-2024 the FFmpeg developers
built with gcc 14.2.1 (GCC) 20240910
configuration: --enable-libopenvino --enable-libharfbuzz --enable-libfribidi --enable-libfreetype --enable-libfontconfig --enable-openssl
libavutil 59. 53.100 / 59. 53.100
libavcodec 61. 28.100 / 61. 28.100
libavformat 61. 9.102 / 61. 9.102
libavdevice 61. 4.100 / 61. 4.100
libavfilter 10. 6.101 / 10. 6.101
libswscale 8. 13.100 / 8. 13.100
libswresample 5. 4.100 / 5. 4.100
}}}
How to reproduce:
Install the `openvino-dev` python package to download the models:
{{{
pip install openvino-dev tensorflow
}}}
Then download and convert the `yolo-v4-tiny-tf` model and fetch the labels file:
{{{
omz_downloader --name yolo-v4-tiny-tf
omz_converter --name yolo-v4-tiny-tf
wget https://raw.githubusercontent.com/openvinotoolkit/open_model_zoo/refs/heads/master/data/dataset_classes/coco_80cl.txt
}}}
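Before running the filter, it may help to confirm that the converted model and the labels file are where the command below expects them. The following is a minimal sketch: `check_inputs` is a hypothetical helper, and the default paths are the ones used in this report, assuming `omz_converter` wrote its output under `public/` in the current directory.

```shell
# Hypothetical helper: verify that the files referenced by the filter exist.
# Default paths are the ones used in this report; pass others as $1 and $2.
check_inputs() {
    model="${1:-public/yolo-v4-tiny-tf/FP32/yolo-v4-tiny-tf.xml}"
    labels="${2:-coco_80cl.txt}"
    status=0
    for f in "$model" "$labels"; do
        if [ ! -f "$f" ]; then
            echo "missing: $f" >&2
            status=1
        fi
    done
    # Count non-empty label lines, to compare against nb_classes=80.
    if [ -f "$labels" ]; then
        n=$(grep -c . "$labels" || true)
        echo "labels: $n"
    fi
    return "$status"
}
```

Calling `check_inputs` with no arguments from the directory where the commands above were run prints the number of label lines and returns non-zero if either file is missing.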
Then run ffplay on an arbitrary video containing several objects that
this model should detect, drawing rectangles and labels on the
detected objects:
{{{
ffplay \
https://videos.pexels.com/video-files/5222540/5222540-uhd_3840_2160_30fps.mp4 \
-vf 'dnn_detect=dnn_backend=openvino:model=public/yolo-v4-tiny-tf/FP32/yolo-v4-tiny-tf.xml:input=image_input:confidence=0.4:model_type=yolov4:anchors=81&82&135&169&344&319:labels=coco_80cl.txt:async=1:nb_classes=80,drawbox=box_source=side_data_detection_bboxes:color=yellow,drawtext=text_source=side_data_detection_bboxes:fontcolor=yellow:bordercolor=yellow:fontsize=40,showinfo'
}}}
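The filter graph above is long enough that a dropped or misplaced option is easy to miss. As a sketch, the same string can be assembled from named parts so that the `anchors=` and `labels=` options are visible at a glance. The variable names are illustrative only; the option values are copied from the command above.

```shell
# Option values copied from the command in this report.
model='public/yolo-v4-tiny-tf/FP32/yolo-v4-tiny-tf.xml'
anchors='81&82&135&169&344&319'
labels='coco_80cl.txt'

# Build the dnn_detect options step by step, joined with ':'.
detect="dnn_detect=dnn_backend=openvino:model=${model}:input=image_input"
detect="${detect}:confidence=0.4:model_type=yolov4:anchors=${anchors}"
detect="${detect}:labels=${labels}:async=1:nb_classes=80"

boxes='drawbox=box_source=side_data_detection_bboxes:color=yellow'
text='drawtext=text_source=side_data_detection_bboxes:fontcolor=yellow:bordercolor=yellow:fontsize=40'

# Filters in the chain are joined with ','.
filtergraph="${detect},${boxes},${text},showinfo"

# Print the assembled graph for inspection.
printf '%s\n' "$filtergraph"
```

The result can then be passed as `-vf "$filtergraph"` in place of the quoted string. Removing the `labels=` part from `detect` gives a quick way to compare behaviour with and without the labels file.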
You'll see many log lines like this:
{{{
[Parsed_dnn_detect_0 @ 0x785bd2f21680] anchors is not set
}}}
This happens because the `anchors=` option given to `dnn_detect` is not
passed to the filter, and anchors are required by the `yolov4` model type.
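To check how often the warning occurs in a given run, the log can be saved to a file and the matching lines counted. This is a minimal sketch: `count_anchor_warnings` is a hypothetical helper, and `ffplay.log` is an assumed file name for stderr captured with `2> ffplay.log`.

```shell
# Hypothetical helper: read log text on stdin and print how many lines
# contain the warning reported above.
count_anchor_warnings() {
    # grep -c prints 0 and exits non-zero when nothing matches;
    # "|| true" keeps the function's exit status at 0 in that case.
    grep -c 'anchors is not set' || true
}

# Example usage (assumes stderr was captured to ffplay.log):
#   count_anchor_warnings < ffplay.log
```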
The expected behaviour is that the drawbox and drawtext filters draw on
the image and that information about the detected objects is logged to
the terminal:
{{{
...
[Parsed_showinfo_3 @ 0x743b02f22b00] index: 0, region: (145, 1042) -> (740, 1495), label: car, confidence: 9918/10000.
[Parsed_showinfo_3 @ 0x743b02f22b00] index: 1, region: (551, 893) -> (551, 893), label: person, confidence: 4277/10000.
[Parsed_showinfo_3 @ 0x743b02f22b00] index: 2, region: (791, 1012) -> (791, 1012), label: person, confidence: 4069/10000.
[Parsed_showinfo_3 @ 0x743b02f22b00] index: 3, region: (1375, 1055) -> (1375, 1055), label: person, confidence: 5944/10000.
[Parsed_showinfo_3 @ 0x743b02f22b00] index: 4, region: (1505, 1065) -> (1505, 1065), label: person, confidence: 7363/10000.
[Parsed_showinfo_3 @ 0x743b02f22b00] index: 5, region: (794, 1011) -> (794, 1011), label: person, confidence: 8378/10000.
[Parsed_showinfo_3 @ 0x743b02f22b00] index: 6, region: (915, 1010) -> (915, 1010), label: person, confidence: 8011/10000.
[Parsed_showinfo_3 @ 0x743b02f22b00] index: 7, region: (1088, 1117) -> (1088, 1117), label: person, confidence: 9511/10000.
[Parsed_showinfo_3 @ 0x743b02f22b00] index: 8, region: (1385, 1052) -> (1385, 1052), label: person, confidence: 7692/10000.
[Parsed_showinfo_3 @ 0x743b02f22b00] index: 9, region: (1644, 1172) -> (1644, 1172), label: person, confidence: 9132/10000.
[Parsed_showinfo_3 @ 0x743b02f22b00] index: 10, region: (1801, 1173) -> (1801, 1173), label: person, confidence: 9828/10000.
[Parsed_showinfo_3 @ 0x743b02f22b00] index: 11, region: (2480, 1299) -> (2480, 1299), label: person, confidence: 9496/10000.
[Parsed_showinfo_3 @ 0x743b02f22b00] index: 12, region: (414, 1239) -> (414, 1239), label: car, confidence: 8610/10000.
[Parsed_showinfo_3 @ 0x743b02f22b00] index: 13, region: (422, 1265) -> (422, 1265), label: car, confidence: 9608/10000.
[Parsed_showinfo_3 @ 0x743b02f22b00] index: 14, region: (452, 1266) -> (452, 1266), label: car, confidence: 9239/10000.
...
}}}
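When comparing runs, a per-label tally of the `showinfo` output is easier to read than the raw lines. The sketch below assumes the log format shown above (`label: <name>, confidence: <n>/<m>.`); `count_labels` is a hypothetical helper, not part of FFmpeg.

```shell
# Hypothetical helper: read showinfo log text on stdin and print
# "<label>: <count>" for each detected label, sorted by label.
count_labels() {
    # Extract the label from lines of the form
    #   ... label: car, confidence: 9918/10000.
    sed -n 's|.*label: \([^,]*\), confidence: [0-9]*/[0-9]*\..*|\1|p' |
        awk '{ n[$0]++ } END { for (l in n) print l ": " n[l] }' |
        sort
}

# Example usage (assumes stderr was captured to a file named ffplay.log):
#   count_labels < ffplay.log
```

Lines that carry no `label:` field, such as the `anchors is not set` warnings, are ignored, so an empty result means no detections were logged.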
--
Ticket URL: <https://trac.ffmpeg.org/ticket/11387>
FFmpeg <https://ffmpeg.org>
FFmpeg issue tracker