[
https://issues.apache.org/jira/browse/SOLR-8709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15226550#comment-15226550
]
Joel Bernstein commented on SOLR-8709:
--------------------------------------
Commit settings were:
1 second hard commits
2 second soft commits
The softCommit is what matters because the TopicStream is working off of index
snapshots.
In reviewing the code, it certainly looks like there is room for out-of-order
version numbers to cross commit boundaries, but I would expect to see it happen
in a stress test.
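
A minimal sketch of how such a miss could occur, assuming a simplified reader that remembers only the highest version it has sent (a checkpoint) and asks each new index snapshot for anything above it. The version numbers and the `read_topic` helper are hypothetical and not taken from the TopicStream code:

```python
def read_topic(visible_versions, checkpoint):
    """Return the versions above the checkpoint, sorted, plus the new checkpoint."""
    batch = sorted(v for v in visible_versions if v > checkpoint)
    new_checkpoint = batch[-1] if batch else checkpoint
    return batch, new_checkpoint

# Hypothetical versions: 101 was assigned before 102, but it only becomes
# visible in the snapshot opened by the following soft commit.
snapshot_1 = {100, 102}
snapshot_2 = {100, 101, 102, 103}

checkpoint = 0
batch_1, checkpoint = read_topic(snapshot_1, checkpoint)
batch_2, checkpoint = read_topic(snapshot_2, checkpoint)

# 101 sits below the checkpoint (102) by the time it is visible, so it is skipped.
delivered = set(batch_1) | set(batch_2)
missed = snapshot_2 - delivered
print(batch_1, batch_2, missed)
```

Within a single snapshot the sort on version number hides the reordering; only a version that surfaces one commit late falls below the checkpoint.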
I'm thinking about changing this ticket to maintain a checksum of the version
numbers for a time window, then resending the time window if the checksums
don't match. This is much more manageable than attempting to track all version
numbers for a time window. This won't provide guaranteed *one time* delivery of
documents, but it will provide guaranteed delivery of all documents in a topic,
with a reasonable expectation of one time delivery.
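
One way the checksum idea could look, as a rough sketch rather than the planned implementation. It assumes version numbers grow roughly with time, so a "time window" is modeled as a fixed-size bucket of version numbers, and it uses a wrapping sum as an order-independent checksum. The class name, `bucket_size`, and `buckets_to_check` are all hypothetical:

```python
from collections import defaultdict

MASK = (1 << 64) - 1  # keep checksums within 64 bits


def bucket_checksums(versions, bucket_size):
    """Wrapping sum of the versions in each bucket (order-independent)."""
    sums = defaultdict(int)
    for v in versions:
        b = v // bucket_size
        sums[b] = (sums[b] + v) & MASK
    return sums


class ChecksumTopicReader:
    def __init__(self, bucket_size, buckets_to_check):
        self.bucket_size = bucket_size            # width of one window, in version units
        self.buckets_to_check = buckets_to_check  # how many recent windows to re-verify
        self.checkpoint = 0
        self.sent = defaultdict(int)              # bucket -> checksum of versions sent

    def poll(self, snapshot):
        out = []

        # Repair path: for recent windows, compare what the index now holds
        # at or below the checkpoint with the checksum of what was sent.
        old = [v for v in snapshot if v <= self.checkpoint]
        indexed = bucket_checksums(old, self.bucket_size)
        newest = self.checkpoint // self.bucket_size
        first = max(0, newest - self.buckets_to_check + 1)
        for b in range(first, newest + 1):
            if indexed[b] != self.sent[b]:
                # Mismatch: resend the whole window, duplicates included.
                out.extend(sorted(v for v in old if v // self.bucket_size == b))
                self.sent[b] = indexed[b]

        # Normal path: send everything above the checkpoint, in version order.
        fresh = sorted(v for v in snapshot if v > self.checkpoint)
        for v in fresh:
            b = v // self.bucket_size
            self.sent[b] = (self.sent[b] + v) & MASK
        if fresh:
            self.checkpoint = fresh[-1]
        return out + fresh


reader = ChecksumTopicReader(bucket_size=10, buckets_to_check=2)
first_poll = reader.poll({100, 102})             # 101 not yet visible
second_poll = reader.poll({100, 101, 102, 103})  # 101 surfaces one commit late
third_poll = reader.poll({100, 101, 102, 103})   # nothing changed
print(first_poll, second_poll, third_poll)
```

In this sketch the late version 101 is delivered on the second poll at the cost of resending 100 and 102, which matches the trade-off described above: every document is delivered, usually but not always exactly once. A summed checksum can collide, so a real implementation would likely want a stronger hash.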
> Account for out-of-order version numbers in the TopicStream
> -----------------------------------------------------------
>
> Key: SOLR-8709
> URL: https://issues.apache.org/jira/browse/SOLR-8709
> Project: Solr
> Issue Type: Bug
> Reporter: Joel Bernstein
>
> Currently the TopicStream can miss documents if version numbers are received
> out-of-order. The TopicStream sorts on version number so it will only miss
> out-of-order versions that span commit boundaries.
> In order to resolve this issue we can adopt an approach that keeps a set of
> the last N version numbers sent for each Topic. As the documents are scanned
> we can check for documents within this time window that do not appear in the
> sent set. These documents can then be sent.