https://bugs.kde.org/show_bug.cgi?id=518409

            Bug ID: 518409
           Summary: Transition Face Recognition from KNN to Centroid
                    Clustering
    Classification: Applications
           Product: digikam
Version First Reported In: 9.1.0
          Platform: unspecified
                OS: All
            Status: REPORTED
          Severity: normal
          Priority: NOR
         Component: Faces-Engine
          Assignee: [email protected]
          Reporter: [email protected]
  Target Milestone: ---

Motivation: 
Currently, digiKam matches new faces by searching for the nearest neighbors
among thousands of individual face vectors. As a library grows, this "raw KNN"
approach leads to a "crowded" vector space where distinct identities begin to
overlap, causing the engine to misassign faces to a few "catch-all" identities.
This results in a significant drop in accuracy and a massive increase in the
computational cost of managing the face database.

The Proposal: 
The engine should switch to a Centroid Clustering model. Instead of storing and
matching against every single confirmed face as a discrete point for search,
the system should calculate one or more "centroids" (mean vectors) for each
identity. When a new face is scanned, the engine only needs to find the nearest
identity-centroid rather than the nearest individual face-vector.
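
As a rough illustration of the matching step described above, here is a minimal
sketch in plain Python. It is not digiKam code: the function names, the
Euclidean distance metric, the optional max_distance rejection threshold, and
the 2-D toy vectors are all assumptions made for the example (real face
embeddings have many more dimensions).

```python
import math


def mean_vector(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_identity(face, centroids, max_distance=None):
    """Return (identity, distance) for the closest centroid.

    If max_distance is given (a hypothetical rejection threshold) and the
    closest centroid is farther away than that, identity is None.
    """
    best_name, best_dist = None, float("inf")
    for name, centroid in centroids.items():
        d = euclidean(face, centroid)
        if d < best_dist:
            best_name, best_dist = name, d
    if max_distance is not None and best_dist > max_distance:
        return None, best_dist
    return best_name, best_dist


# Toy data: confirmed face vectors per identity (hypothetical values).
confirmed = {
    "alice": [[0.0, 0.0], [0.2, 0.0], [0.1, 0.3]],
    "bob": [[1.0, 1.0], [1.2, 0.8]],
}

# One centroid per identity; a query is compared against 2 points, not 5.
centroids = {name: mean_vector(vs) for name, vs in confirmed.items()}
match, dist = nearest_identity([0.1, 0.2], centroids)
```

The loop visits one entry per identity, so the number of distance computations
per query depends on the number of centroids, not on how many faces have been
confirmed.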

Technical Implementation: For each person, the system would maintain a primary
centroid representing their "average" face. To handle variability (e.g.,
profiles, sunglasses, or aging), an identity could support multiple clusters
(e.g., "John Doe - Frontal" and "John Doe - Side"). When a user confirms a new
face, the system simply updates the running average of the corresponding
centroid, a constant-time O(1) operation with respect to the number of stored
faces, rather than re-indexing a global tree of points.
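
The running-average update and the multi-cluster idea could look roughly like
the sketch below. This is an assumption-laden illustration, not a proposed
patch: the Centroid class, the confirm_face helper, and the
new_cluster_sq_dist threshold that decides when to start a new cluster are all
hypothetical names and choices.

```python
class Centroid:
    """Running mean of the face vectors assigned to one cluster."""

    def __init__(self, first_vector):
        self.mean = list(first_vector)
        self.count = 1

    def add(self, vector):
        # Incremental mean: new_mean = mean + (x - mean) / n.
        # Cost depends on the vector length only, not on self.count.
        self.count += 1
        self.mean = [m + (x - m) / self.count
                     for m, x in zip(self.mean, vector)]


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def confirm_face(clusters, vector, new_cluster_sq_dist):
    """Fold a confirmed face into one person's list of clusters.

    The face is merged into the closest existing cluster if it lies within
    new_cluster_sq_dist (a hypothetical threshold); otherwise a new cluster
    is started, e.g. for a profile view. Returns the cluster index used.
    """
    if clusters:
        idx = min(range(len(clusters)),
                  key=lambda i: sq_dist(clusters[i].mean, vector))
        if sq_dist(clusters[idx].mean, vector) <= new_cluster_sq_dist:
            clusters[idx].add(vector)
            return idx
    clusters.append(Centroid(vector))
    return len(clusters) - 1


# Toy usage with 2-D vectors (hypothetical values).
john = []
confirm_face(john, [0.0, 0.0], 0.25)  # starts cluster 0
confirm_face(john, [0.2, 0.0], 0.25)  # merged into cluster 0
confirm_face(john, [2.0, 2.0], 0.25)  # too far away: starts cluster 1
```

Only the mean and a count are kept per cluster, so confirming a face touches
one small record. Un-confirming a face would need the inverse update, which
this sketch does not cover.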

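Benefit: This change reduces the per-query search cost from O(total faces), as
in a linear scan over every stored face vector, to O(total centroids), which is
O(total identities) when each person has a single centroid. For a user with
50,000 photos of 100 people, assuming one face per photo and one centroid per
person, the search space is reduced 500-fold (50,000 / 100). This would
virtually eliminate the "catch-all identity" bug and drastically speed up the
recognition process on large collections.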
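
The 500-fold figure can be checked with a few lines of arithmetic. The sketch
below assumes one face per photo and a brute-force comparison against every
indexed point; the function name and the three-centroids-per-person case are
illustrative assumptions, included to show how multiple clusters per identity
change the factor.

```python
def reduction_factor(total_faces, identities, centroids_per_identity=1):
    """Ratio of points compared per query: raw faces vs. centroids."""
    return total_faces / (identities * centroids_per_identity)


# Figures from the report: 50,000 faces, 100 people.
single = reduction_factor(50_000, 100)    # one centroid per person
triple = reduction_factor(50_000, 100, 3)  # three centroids per person
```

With a single centroid per person the factor is 500; with three centroids per
person it drops to roughly 167, which is still a large reduction.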

Scalability: By reducing the number of points in the active search index,
memory usage is minimized, and the recognition engine remains snappy even as
the photo library grows into the hundreds of thousands. It ensures that "more
data" leads to better accuracy (better centroids) rather than worse performance
(congested KD-trees).

-- 
You are receiving this mail because:
You are watching all bug changes.
