[
https://issues.apache.org/jira/browse/SOLR-5783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Hoss Man updated SOLR-5783:
---------------------------
Attachment: SOLR-5783.patch
Here's a quick & dirty proof of concept patch to make SolrCore.openNewSearcher
return the current searcher when all of the following are true:
* the indexConfig allows for re-open
* there is already a current searcher open
* the underlying IndexReader is unchanged from the current searcher
* the live schema (getLiveSchema) has not changed from the current searcher
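
As a rough illustration of the four conditions above (this is not the code
from the attached patch), the check could be sketched as follows. The
Searcher class, canReuse, and the parameter names are hypothetical stand-ins
that only model the decision, and comparing the reader and schema by identity
is an assumption of the sketch:

```java
// Hypothetical stand-ins; none of these are real Solr or Lucene classes.
class SearcherReuseSketch {

    /** Minimal model of a searcher: it remembers its reader and schema. */
    static final class Searcher {
        final Object reader;
        final Object schema;

        Searcher(Object reader, Object schema) {
            this.reader = reader;
            this.schema = schema;
        }
    }

    /** True only when all four conditions from the list hold. */
    static boolean canReuse(boolean reopenAllowed, Searcher current,
                            Object latestReader, Object liveSchema) {
        return reopenAllowed
            && current != null
            && current.reader == latestReader   // same reader instance
            && current.schema == liveSchema;    // same schema instance
    }

    /** Returns the current searcher when it can be reused, else a new one. */
    static Searcher openNewSearcher(boolean reopenAllowed, Searcher current,
                                    Object latestReader, Object liveSchema) {
        if (canReuse(reopenAllowed, current, latestReader, liveSchema)) {
            return current;
        }
        return new Searcher(latestReader, liveSchema);
    }

    public static void main(String[] args) {
        Object reader = new Object();
        Object schema = new Object();
        Searcher current = new Searcher(reader, schema);
        // Unchanged reader and schema: the same searcher comes back.
        System.out.println(openNewSearcher(true, current, reader, schema) == current);       // true
        // A different reader: a new searcher is created.
        System.out.println(openNewSearcher(true, current, new Object(), schema) == current); // false
    }
}
```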
The patch also changes SolrCore.getSearcher to skip warming when the
"newSearcher" (returned by SolrCore.openNewSearcher) and the currentSearcher
are identical. (There's a nocommit here to fix the indenting: to keep the
patch simple, all I'm doing is wrapping a bunch of existing warming code in an
"if", so I haven't increased the indent yet.)
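
The shape of that "if" might look roughly like the sketch below. It is only a
model under stated assumptions: WarmingSkipSketch, warm, and the list of
warmed searchers are hypothetical names, and the real warming code in
SolrCore.getSearcher is considerably more involved.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the warming guard; not the real SolrCore code.
class WarmingSkipSketch {

    /** Records every searcher that was warmed, purely for illustration. */
    final List<Object> warmed = new ArrayList<>();

    /** Stand-in for the existing warming work (cache autowarming, listeners). */
    void warm(Object searcher) {
        warmed.add(searcher);
    }

    /** Warms the new searcher only when it differs from the current one. */
    Object getSearcher(Object currentSearcher, Object newSearcher) {
        if (newSearcher != currentSearcher) {
            warm(newSearcher);
        }
        return newSearcher;
    }
}
```

In this model, passing the same object as both arguments leaves the warmed
list untouched, which corresponds to the changeless commit case.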
This seems to work fine, and solves the problem I've been thinking about: if
you do a commit w/o any changes in the index -- nothing happens. Same index,
same reader, same searcher.
As is, this patch causes TestIndexSearcher.testRepopen to fail -- but if I'm
understanding that test correctly, this is because it assumes the index
reader refcount is incremented by 1 after a changeless commit. With the
patch, the reader refcount doesn't increase in that case, because it's still
the same searcher -- so this should be pretty easy to fix.
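
To make the refcount point concrete, here is a toy model (an assumption-laden
sketch, not the real test or the real Lucene/Solr reference counting). Reader,
Searcher, commitAlwaysNew, and commitWithReuse are hypothetical names; the
model simply assumes that creating a searcher takes one reference on its
reader.

```java
// Toy model of reader reference counting; not real Lucene or Solr classes.
class RefCountSketch {

    /** Toy reader that only tracks a reference count. */
    static final class Reader {
        int refCount = 1; // the reference held by whoever opened the reader
    }

    /** Toy searcher: creating one takes a reference on its reader. */
    static final class Searcher {
        final Reader reader;

        Searcher(Reader reader) {
            this.reader = reader;
            reader.refCount++;
        }
    }

    /** Unpatched behavior in this model: a commit always builds a new searcher. */
    static Searcher commitAlwaysNew(Searcher current) {
        return new Searcher(current.reader);
    }

    /** Patched behavior in this model: an unchanged reader keeps the same searcher. */
    static Searcher commitWithReuse(Searcher current) {
        return current;
    }
}
```

Under this model, an assertion that the refcount grew by 1 holds after
commitAlwaysNew but not after commitWithReuse, where a check that the same
searcher instance came back would fit better.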
Obviously, besides fixing TestIndexSearcher.testRepopen, a lot of new tests
should be written before committing this, to ensure that the same searcher is
re-used when we expect and not re-used when we don't -- but before I go down
that rabbit hole (the tests are likely to be much more complicated than the
code itself), does anyone see any problems with this idea that I'm not
thinking of?
> Can we stop opening a new searcher when the index hasn't changed?
> -----------------------------------------------------------------
>
> Key: SOLR-5783
> URL: https://issues.apache.org/jira/browse/SOLR-5783
> Project: Solr
> Issue Type: Improvement
> Reporter: Hoss Man
> Attachments: SOLR-5783.patch
>
>
> I've been thinking recently about how/when we re-open searchers -- and what
> the overhead of that is in terms of caches and what not -- even if the
> underlying index hasn't changed.
> The particular real world case that got me thinking about this recently is
> when a deleteByQuery gets forwarded to all shards in a collection, and then
> the subsequent (soft)Commit (either auto or explicit) opens a new searcher --
> even if that shard was completely unaffected by the delete.
> It got me wondering: why don't we re-use the same searcher when the index is
> unchanged?
> From what I can tell, we're basically 99% of the way there (in
> {{<nrtMode/>}})...
> * IndexWriter.commit is already smart enough to short circuit if there's
> nothing to commit
> * SolrCore.openNewSearcher already uses DirectoryReader.openIfChanged to see
> if the reader can be re-used.
> * for "realtime" purposes, SolrCore.openNewSearcher will return the existing
> searcher if it exists and the DirectoryReader hasn't changed
> ...The only reason I could think of for not _always_ re-using the same
> searcher when the underlying DirectoryReader is identical (ie: that last
> bullet above) is in the situation where the "live" schema has changed -- but
> that seems pretty trivial to account for.
> Is there any other reason why this wouldn't be a good idea for improving
> performance?