[ https://issues.apache.org/jira/browse/SOLR-14581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17141573#comment-17141573 ]
Erick Erickson commented on SOLR-14581:
---------------------------------------

Oh, let's mention the word "theorem" and intimidate people:
[https://en.wikipedia.org/wiki/CAP_theorem]

Kidding aside, can we boil this down to something like "Solr guarantees consistency as of (NOW - commit_interval). For any given commit_interval, it's possible in some cases that queries may not reflect documents indexed between NOW and (NOW - commit_interval)"?

We could reference that Wikipedia article to forestall questions about "why can't Solr Do What I Want?"

> Document the way auto commits work in SolrCloud
> -----------------------------------------------
>
>                 Key: SOLR-14581
>                 URL: https://issues.apache.org/jira/browse/SOLR-14581
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public)
>          Components: documentation, SolrCloud
>    Affects Versions: master (9.0)
>            Reporter: Bram Van Dam
>            Priority: Minor
>         Attachments: SOLR-14581.patch
>
>
> The documentation is unclear about how auto commits actually work in
> SolrCloud. A mailing list reply by Erick Erickson proved to be enlightening.
> Erick's reply verbatim:
> {quote}Each node has its own timer that starts when it receives an update.
> So in your situation, 60 seconds after any given replica gets its first
> update, all documents that have been received in the interval will
> be committed.
> But note several things:
> 1> commits will tend to cluster for a given shard. By that I mean
> they’ll tend to happen within a few milliseconds of each other
> ’cause it doesn’t take that long for an update to get from the
> leader to all the followers.
> 2> this is per replica. So if you host replicas from multiple collections
> on some node, their commits have no relation to each other. And
> say for some reason you transmit exactly one document that lands
> on shard1. Further, say nodeA contains replicas for shard1 and shard2.
> Only the replica for shard1 would commit.
> 3> Solr promises eventual consistency. In this case, due to all the
> timing variables it is not guaranteed that every replica of a single
> shard has the same document available for search at any given time.
> Say doc1 hits the leader at time T and a follower at time T+10ms.
> Say doc2 hits the leader and gets indexed 5ms before the
> commit is triggered, but for some reason it takes 15ms for it to get
> to the follower. The leader will be able to search doc2, but the
> follower won’t until 60 seconds later.{quote}
> Perhaps the subject deserves a section of its own, but I'll attach a patch
> which includes the gist of Erick's reply as a Tip in the "indexing in
> SolrCloud"-section.
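
The timing argument in the quoted reply can be sketched as a toy model. This is not Solr code: the `Replica` class, its methods, and the millisecond timestamps are hypothetical and exist only for illustration. It assumes a 60-second interval, that a commit makes the pending documents searchable, and that doc2 takes 20 ms (rather than the 15 ms in the reply) to reach the follower, so that it clearly misses the follower's first commit instead of arriving at the same instant.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

COMMIT_INTERVAL_MS = 60_000  # assumed auto-commit interval (the 60 seconds in the reply)


@dataclass
class Replica:
    """Toy replica with its own commit timer (hypothetical, not a Solr class)."""
    name: str
    pending: List[str] = field(default_factory=list)          # received, not yet committed
    visible_at: Dict[str, int] = field(default_factory=dict)  # doc id -> commit time (ms)
    deadline: Optional[int] = None                             # when the running timer fires

    def _commit_if_due(self, now_ms: int) -> None:
        # Fire the timer if it has expired: everything pending becomes searchable.
        if self.deadline is not None and self.deadline <= now_ms:
            for doc_id in self.pending:
                self.visible_at[doc_id] = self.deadline
            self.pending.clear()
            self.deadline = None

    def receive(self, doc_id: str, now_ms: int) -> None:
        # The timer starts on the first update received after the last commit.
        self._commit_if_due(now_ms)
        if self.deadline is None:
            self.deadline = now_ms + COMMIT_INTERVAL_MS
        self.pending.append(doc_id)

    def searchable(self, doc_id: str, now_ms: int) -> bool:
        self._commit_if_due(now_ms)
        return doc_id in self.visible_at and self.visible_at[doc_id] <= now_ms


leader = Replica("shard1-leader")
follower = Replica("shard1-follower")

# Events are fed to each replica in time order, with T = 0.
leader.receive("doc1", 0)           # leader timer fires at 60_000
follower.receive("doc1", 10)        # follower timer fires at 60_010
leader.receive("doc2", 59_995)      # 5 ms before the leader's commit
follower.receive("doc2", 60_015)    # 20 ms later, after the follower's commit

print(leader.searchable("doc2", 60_020))     # True
print(follower.searchable("doc2", 60_020))   # False
print(follower.searchable("doc2", 120_015))  # True, one full interval later
```

In this model both replicas commit doc1 within 10 ms of each other (the clustering described in point 1), while doc2 stays invisible on the follower for roughly another 60 seconds, which is the window the proposed "(NOW - commit_interval)" wording is meant to cover.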