[ https://issues.apache.org/jira/browse/SOLR-10726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hoss Man updated SOLR-10726:
----------------------------
    Attachment: SOLR-10726.semaphore.newsearcher.test.patch
                semaphore.newsearcher.test.log.txt

FWIW...

I was attempting to write a test that would prove/disprove whether waitSearcher=true actually works in SolrCloud, by having a 'newSearcher' event listener that used a semaphore to try to detect if/when a newSearcher was being warmed after the client's commit call had already returned.

I ran into some weird problems, and when I mentioned them in passing to shalin, he pointed me to this jira.

I'm attaching a patch showing what I have at the moment -- it doesn't really do much towards my current goal, but it does help demonstrate a few weird things about when/how newSearchers are being opened in SolrCloud that seem relevant to the related problems shalin mentioned when creating this jira...

* I had to put some special code in to do an initial commit (on the empty index) to work around the fact that evidently SolrCore will re-open a newSearcher after the very first commit -- even if no documents have been added to its index.
** BUT: This doesn't happen on _every_ SolrCore ??? ... it seems to be an "N-1" situation, where N is the total number of cores. I.e.: in a 2 shard collection with repFactor=2, apparently only 3 of the cores open a newSearcher on this (empty) commit
** see the usage of {{nocommit_HACK_ON_HACK_nocommit_seriously_nocommit}} for details
* Once the test actually starts adding docs to the index, things work predictably -- for a bit...
** The test sequentially does an add followed by a commit, and verifies (using the semaphore) that only 2 replicas (presumably of the shard the added document belongs to) open a newSearcher
** in reality, eventually a commit happens where every SolrCore re-opens a newSearcher (even though nothing in the index has changed on the 2 nodes of the other shards) and there aren't enough permits in the semaphore.
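
The permit-counting idea described in the bullets above can be sketched in plain Java. This is not the code from the attached patch: {{SearcherOpenTracker}}, {{expect}}, and {{onNewSearcher}} are hypothetical names used only for illustration, and the wiring into a real Solr newSearcher listener is deliberately left out. It only shows how a semaphore can flag a searcher open that the test did not budget for.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical helper (not part of Solr or of the attached patch). */
class SearcherOpenTracker {
    // One permit per searcher open the test is willing to tolerate.
    private final Semaphore permits = new Semaphore(0);
    // Counts opens that arrived when no permit was left.
    private final AtomicInteger unexpected = new AtomicInteger();

    /** Called by the test before a commit: allow n searcher opens. */
    void expect(int n) {
        permits.release(n);
    }

    /** Assumed to be invoked from a newSearcher listener callback. */
    void onNewSearcher() {
        if (!permits.tryAcquire()) {
            unexpected.incrementAndGet();
        }
    }

    /** Permits left over, i.e. expected opens that never happened. */
    int unusedPermits() {
        return permits.availablePermits();
    }

    /** Opens that happened without a matching permit. */
    int unexpectedOpens() {
        return unexpected.get();
    }

    public static void main(String[] args) {
        SearcherOpenTracker tracker = new SearcherOpenTracker();

        // Commit 1: only the 2 replicas of the affected shard reopen.
        tracker.expect(2);
        tracker.onNewSearcher();
        tracker.onNewSearcher();
        System.out.println("commit 1: unused=" + tracker.unusedPermits()
                + " unexpected=" + tracker.unexpectedOpens());

        // Commit 2: all 4 cores reopen, so 2 opens have no permit.
        tracker.expect(2);
        for (int core = 0; core < 4; core++) {
            tracker.onNewSearcher();
        }
        System.out.println("commit 2: unused=" + tracker.unusedPermits()
                + " unexpected=" + tracker.unexpectedOpens());
    }
}
```

Using {{tryAcquire()}} instead of a blocking {{acquire()}} is a design choice of this sketch: an over-budget open is recorded as a count rather than stalling the thread that would be warming the searcher.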
----

I'm not planning to pursue this at the moment, but I wanted to share it in case it can serve as a useful starting point for anyone else who wants to look into figuring out why it's happening and/or reducing how often SolrCloud is opening newSearchers.


> SolrCloud opens multiple searchers on replica creation/startup
> --------------------------------------------------------------
>
>                 Key: SOLR-10726
>                 URL: https://issues.apache.org/jira/browse/SOLR-10726
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public)
>          Components: search, SolrCloud
>    Affects Versions: 6.5.1
>            Reporter: Shalin Shekhar Mangar
>              Labels: difficulty-medium, impact-high
>         Attachments: semaphore.newsearcher.test.log.txt,
> SOLR-10726.semaphore.newsearcher.test.patch
>
>
> I was investigating some curious behavior reported by a customer around first
> searcher event listeners and multiple searchers being opened when adding a
> new replica.
> Turns out that if you add a new replica to solrcloud:
> 1) Searchers are opened at least twice and possibly a third time
> 2) the first time is because of a new core coming online and opening searcher
> on an empty index -- only firstSearcher event listeners are fired here
> 3) second time is after replication is complete and we have new index files
> available -- firstSearcher event listeners are fired again because the old
> searcher opened on core load has already been closed and disposed so this is
> technically again a first searcher
> 4) third time happens after documents buffered during recovery are replayed
> -- if there was no indexing happening on leader then this step is skipped --
> a newSearcher event is fired here because we had already opened a searcher in
> the last step
> Now if instead of a new replica, a solr node is restarted then there can be
> up to four searcher opens -- the additional open is because of log replay on
> startup.
> So Solr spends a lot of time on unnecessary warming/autowarming on searchers
> that are discarded. It is not just warming because sometimes plugins such as
> SpellCheckComponent and SuggestComponent can also tie in to these listener
> events.
> We can probably cut a few of them or at least defer the decision of whether
> to fire these listeners to places such as RecoveryStrategy which have a
> better idea of whether it is worth it.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org