https://bugs.kde.org/show_bug.cgi?id=515492

            Bug ID: 515492
           Summary: Feature Request: digiKam AI Face Recognition for Video
                    with SRT Sidecar files option
    Classification: Applications
           Product: digikam
Version First Reported In: unspecified
          Platform: Other
                OS: Other
            Status: REPORTED
          Severity: wishlist
          Priority: NOR
         Component: Faces-Detection
          Assignee: [email protected]
          Reporter: [email protected]
  Target Milestone: ---

I would like to propose an extension of digiKam’s "People" Face Management
engine to support video files. Currently, digiKam is a leader in image
metadata, but video "People" tagging remains a manual process. This feature
would leverage existing AI models (YOLO/OpenVINO) to scan video files and
generate time-coded face data.

Core Functional Requirements:
Video Face Scanning: 
Use a configurable interval (default: 1s) or keyframe-based analysis to detect
and recognize faces within video containers. Leveraging libraries already
present in Kdenlive (for frame extraction/tracking) could potentially reduce
redundant development.
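
As a rough illustration of the two scanning modes described above, the sketch below only picks the timestamps at which a frame would be analysed. The function name `scan_timestamps` and its parameters are hypothetical and not part of digiKam; frame decoding and face detection are left out entirely.

```python
from typing import List, Optional, Sequence


def scan_timestamps(
    duration_s: float,
    interval_s: float = 1.0,
    keyframes_s: Optional[Sequence[float]] = None,
) -> List[float]:
    """Return the times (in seconds) at which a frame would be analysed.

    Hypothetical helper: if keyframe times are supplied they are used as-is
    (sorted, and limited to the video duration); otherwise a fixed grid with
    the configured interval is generated.
    """
    if duration_s < 0 or interval_s <= 0:
        raise ValueError("duration must be >= 0 and interval must be > 0")
    if keyframes_s is not None:
        # Keyframe mode: analyse every keyframe that lies inside the video.
        return sorted(t for t in keyframes_s if 0 <= t <= duration_s)
    # Fixed-interval mode: 0, interval, 2 * interval, ... up to the duration.
    count = int(duration_s // interval_s) + 1
    return [i * interval_s for i in range(count)]
```

With the default 1 s interval, a 3.5 s clip would be sampled at 0, 1, 2 and 3 seconds.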

Probability Grouping: 
Detected faces should be grouped in the "People" sidebar based on match
certainty, similar to the current image workflow, allowing for bulk
confirmation or rejection.

MWG Metadata Embedding:
Once confirmed, names should be written to the video's XMP metadata
(Keywords/PersonInImage) using the ExifTool backend.
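
For illustration only, this is roughly what an equivalent ExifTool command line could look like. digiKam drives ExifTool through its own backend, so this is not how the application would call it; the helper name `exiftool_person_args` is hypothetical, and mapping "Keywords" to `XMP-dc:Subject` is an assumption on my part.

```python
from typing import List, Sequence


def exiftool_person_args(video_path: str, names: Sequence[str]) -> List[str]:
    """Build an ExifTool argument list that adds confirmed names to a video.

    Hypothetical helper. "+=" appends to a list-type tag instead of
    replacing it; -overwrite_original avoids the "_original" backup copy.
    """
    args = ["exiftool", "-overwrite_original"]
    for name in names:
        # IPTC Extension "Person Shown in the Image" field.
        args.append(f"-XMP-iptcExt:PersonInImage+={name}")
        # Assumed keyword mapping: Dublin Core subject.
        args.append(f"-XMP-dc:Subject+={name}")
    args.append(video_path)
    return args
```

The resulting list could be passed to `subprocess.run(args, check=True)` on a system where ExifTool is installed.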

SRT Face-Appearance Generation: 
Export appearance timestamps as SRT sidecar files named
[video_file_name]_([face tag]).srt, one per face tag. This allows standard
video players (VLC, etc.) to display "Face Subtitles" and lets users search
for specific appearances.
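
A minimal sketch of the sidecar naming scheme, assuming that the file name part means the video name without its extension and that path separators in a hierarchical tag need to be replaced. Both are my assumptions, and `srt_sidecar_path` is a hypothetical name.

```python
from pathlib import Path


def srt_sidecar_path(video_path: str, face_tag: str) -> Path:
    """Return the sidecar path <video stem>_(<face tag>).srt next to the video."""
    video = Path(video_path)
    # Assumption: a tag such as "People/Alice" must not create subfolders,
    # so path separators are replaced with a dash.
    safe_tag = face_tag.replace("/", "-").replace("\\", "-")
    return video.with_name(f"{video.stem}_({safe_tag}).srt")
```

For example, `srt_sidecar_path("/videos/holiday.mp4", "Alice")` gives `/videos/holiday_(Alice).srt`.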


Use Case and Benefit:
This would make digiKam the first open-source DAM to offer "Face-Searchable"
video. For users with large archives, this solves the problem of finding a
specific person inside hours of video without having to watch the footage
manually.

Technical Suggestions:
Provide a "Minimum interval between detections" setting to prevent SRT bloat.
For uncompressed or high-bitrate video where keyframes are sparse, allow a
fallback to a fixed temporal interval (e.g., scan every 1 second).
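
The two suggestions above could be sketched as follows. Both function names and all parameters are hypothetical; the first drops detections that follow the previous kept one too closely, the second adds fixed-interval scan points wherever keyframes are too far apart (and always scans at 0 s, which is my assumption).

```python
from typing import List, Sequence


def thin_detections(times_s: Sequence[float], min_interval_s: float) -> List[float]:
    """Keep a detection only if it is at least min_interval_s after the last kept one."""
    kept: List[float] = []
    for t in sorted(times_s):
        if not kept or t - kept[-1] >= min_interval_s:
            kept.append(t)
    return kept


def fill_sparse_keyframes(
    keyframes_s: Sequence[float], duration_s: float, max_gap_s: float = 1.0
) -> List[float]:
    """Return scan points: the keyframes plus fixed-interval points in large gaps."""
    if max_gap_s <= 0:
        raise ValueError("max_gap_s must be > 0")
    # Start of the video plus every keyframe inside it, without duplicates.
    starts = sorted({0.0, *(t for t in keyframes_s if 0 <= t <= duration_s)})
    ends = starts[1:] + [duration_s]
    points: List[float] = []
    for start, end in zip(starts, ends):
        points.append(start)
        step = 1
        # Multiply instead of accumulating to avoid floating-point drift.
        while start + step * max_gap_s < end:
            points.append(start + step * max_gap_s)
            step += 1
    return points
```

With no keyframes at all, `fill_sparse_keyframes` degenerates to a plain fixed-interval scan.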


Proposed SRT Sidecar Format:
[video_file_name]_([face tag]).srt
<begin SRT file contents>
NOTE
This SRT file shows all instances of [face tag] found in [filename]
Minimum keyframe interval - [#] second(s)
Generated by [user] with digiKam Video AI

1
$[HH:MM:SS,mmm] --> [HH:MM:SS,mmm]
$[face tag] - [x, y, w, h]

2
$[HH:MM:SS,mmm] --> [HH:MM:SS,mmm]
$[face tag] - [x, y, w, h]
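
To make the template above concrete, here is a small sketch that renders the numbered cues. The names `srt_time` and `render_srt` and the tuple layout are hypothetical, the NOTE header is left out, and the `$` prefix from the template is treated as a placeholder marker and not emitted.

```python
from typing import Iterable, List, Tuple

Box = Tuple[int, int, int, int]          # x, y, w, h of the face rectangle
Appearance = Tuple[float, float, Box]    # start_s, end_s, box


def srt_time(seconds: float) -> str:
    """Format seconds as an SRT timestamp HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def render_srt(face_tag: str, appearances: Iterable[Appearance]) -> str:
    """Render one numbered cue per appearance, separated by blank lines."""
    blocks: List[str] = []
    for index, (start, end, box) in enumerate(appearances, start=1):
        x, y, w, h = box
        blocks.append(
            f"{index}\n"
            f"{srt_time(start)} --> {srt_time(end)}\n"
            f"{face_tag} - [{x}, {y}, {w}, {h}]\n"
        )
    return "\n".join(blocks)
```

Calling `render_srt("Alice", [(1.0, 2.0, (10, 20, 30, 40))])` produces a single cue running from `00:00:01,000` to `00:00:02,000` with the text `Alice - [10, 20, 30, 40]`.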
