Francois Huaulme created SOLR-18189:
---------------------------------------

             Summary: Improve tracking of duplicate Solr docs with content hash
                 Key: SOLR-18189
                 URL: https://issues.apache.org/jira/browse/SOLR-18189
             Project: Solr
          Issue Type: Improvement
            Reporter: Francois Huaulme


Content hash detection aims to make updates more efficient by identifying and 
bypassing redundant document updates. When a document's content is unchanged, 
the update request processor (URP) chain skips the write and the work done by 
downstream URPs, which reduces CPU and I/O overhead on nodes.
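
As an illustration only, the redundancy check could be sketched as below. This
is not the Solr implementation: the names content_hash, is_redundant and
excluded_fields are hypothetical, and the choice of SHA-256 over sorted field
names and values is an assumption made for the example.

```python
import hashlib


def content_hash(doc, excluded_fields=("id", "_version_")):
    """Return a hex digest over the document's fields.

    Fields are visited in sorted order so that field ordering in the
    incoming document does not change the hash. Excluded fields (an
    assumption for this sketch) are bookkeeping fields, not content.
    """
    digest = hashlib.sha256()
    for name in sorted(doc):
        if name in excluded_fields:
            continue
        # repr() of the (name, value) pair keeps field boundaries unambiguous.
        digest.update(repr((name, doc[name])).encode("utf-8"))
    return digest.hexdigest()


def is_redundant(incoming_doc, stored_hash):
    """True when a hash is already stored and the new content matches it."""
    return stored_hash is not None and content_hash(incoming_doc) == stored_hash
```

An update for which is_redundant returns True is one the chain could drop
before it reaches the downstream processors.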

The implementation also offers a "monitor-only" mode, in which duplicates are 
counted via metrics but the corresponding updates are not discarded. This mode 
gives insight into how much document duplication exists in the Solr data 
before discarding is enabled.
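
The difference between monitor-only and discarding behaviour can be sketched
as follows. The class DuplicateTracker, its fields, and the in-memory dict of
hashes are all hypothetical and exist only for this example; the real feature
would report through Solr's metrics rather than a plain counter.

```python
from dataclasses import dataclass, field


@dataclass
class DuplicateTracker:
    """Toy stand-in for a duplicate-detecting processor (hypothetical)."""

    monitor_only: bool = True
    duplicates_seen: int = 0                     # stands in for a metric
    hashes: dict = field(default_factory=dict)   # doc id -> last content hash

    def should_forward(self, doc_id, new_hash):
        """Return True if the update should continue down the chain."""
        is_duplicate = self.hashes.get(doc_id) == new_hash
        if is_duplicate:
            self.duplicates_seen += 1
            if not self.monitor_only:
                return False  # discard: content is unchanged
        self.hashes[doc_id] = new_hash
        return True
```

With monitor_only=True every update is still forwarded and only the counter
changes; with monitor_only=False an update whose hash matches the stored one
is dropped.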



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

