Francois Huaulme created SOLR-18189:
---------------------------------------
Summary: Improve tracking of duplicate Solr docs with content hash
Key: SOLR-18189
URL: https://issues.apache.org/jira/browse/SOLR-18189
Project: Solr
Issue Type: Improvement
Reporter: Francois Huaulme
Content hash detection aims to improve update efficiency by identifying and
bypassing redundant document updates. When a document's content is unchanged,
the update request processor (URP) chain skips the unnecessary write and the
work done in downstream URPs, reducing CPU and I/O overhead on nodes.
The implementation also offers a "monitor-only" mode, in which duplicate
occurrences are tracked via metrics but the updates are not discarded. This
mode provides detailed insight into the level of document duplication in
Solr data.
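
As a rough illustration of the idea described above, here is a minimal, self-contained Java sketch. It is not the proposed Solr implementation and uses no Solr APIs: the class name ContentHashDedupSketch, its methods, the in-memory map standing in for the stored hash, and the choice of SHA-256 are all assumptions made for this example. Field values are assumed to be non-null strings.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical illustration only; not Solr code. */
class ContentHashDedupSketch {
    // Stand-in for the index: document id -> hash of the content last accepted.
    private final Map<String, String> storedHashes = new HashMap<>();
    // Stand-in for a metric that counts detected duplicates.
    private final AtomicLong duplicateCount = new AtomicLong();
    private final boolean monitorOnly;

    ContentHashDedupSketch(boolean monitorOnly) {
        this.monitorOnly = monitorOnly;
    }

    /** Hashes fields in sorted key order so that field ordering does not affect the result. */
    static String contentHash(Map<String, String> fields) {
        try {
            MessageDigest sha = MessageDigest.getInstance("SHA-256");
            for (Map.Entry<String, String> e : new TreeMap<>(fields).entrySet()) {
                sha.update(e.getKey().getBytes(StandardCharsets.UTF_8));
                sha.update((byte) 0); // separator, so ("ab","c") and ("a","bc") differ
                sha.update(e.getValue().getBytes(StandardCharsets.UTF_8));
                sha.update((byte) 0);
            }
            return Base64.getEncoder().encodeToString(sha.digest());
        } catch (NoSuchAlgorithmException ex) {
            // SHA-256 must be supported by every Java platform implementation.
            throw new IllegalStateException(ex);
        }
    }

    /** Returns true if the update should continue downstream, false if it is discarded. */
    boolean process(String id, Map<String, String> fields) {
        String hash = contentHash(fields);
        boolean duplicate = hash.equals(storedHashes.get(id));
        if (duplicate) {
            duplicateCount.incrementAndGet();
            if (!monitorOnly) {
                return false; // content unchanged: skip downstream work and the write
            }
        }
        storedHashes.put(id, hash);
        return true;
    }

    long duplicates() {
        return duplicateCount.get();
    }
}
```

With monitorOnly set to false, a second process call with the same id and identical fields returns false and increments the counter. With monitorOnly set to true, the same call returns true, so the update still goes through, and only the counter changes.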