[ https://issues.apache.org/jira/browse/SOLR-9941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15806431#comment-15806431 ]
Hoss Man commented on SOLR-9941: -------------------------------- I'm guessing this situation exists because the {{RecentUpdates}} code was added to {{UpdateLog}} to handle (true) out of order updates while a system was "live", but it's not clear to me why the {{RecentUpdates}} are used the way they are during {{UpdateLog.init}} ... it seems like at a minimum DUH2 could inspect the UpdateCommand flags to see if this is a {{REPLAY}} command and if it is skip the {{UpdateLog.getDBQNewer}} call? Here's some steps that make it easy to see how weird this situation can get... {noformat} bin/solr -e techproducts curl -X POST http://localhost:8983/solr/techproducts/config -H 'Content-Type: application/json' --data-binary '{"set-property":{"updateHandler.autoCommit.maxTime":"-1"}}' curl -X POST http://localhost:8983/solr/techproducts/update -H 'Content-Type: application/json' --data-binary '{ "add":{ "doc": { "id": "DOCX", "foo_i": 41 } }, "delete": { "query":"foo_i:42" }, "delete": { "query":"foo_i:43" }, "add":{ "doc": { "id": "DOCX", "foo_i": { "inc" : "1" } } }, "delete": { "query":"foo_i:41" }, "delete": { "query":"foo_i:43" }, "add":{ "doc": { "id": "DOCX", "foo_i": { "inc" : "1" } } }, "delete": { "query":"foo_i:41" }, "delete": { "query":"foo_i:42" } }' # verify the updates were applied correctly and doc didn't get deleted by mistake curl 'http://localhost:8983/solr/techproducts/get?wt=json&id=DOCX' { "doc": { "id":"DOCX", "foo_i":43, "_version_":1555827152896655360}} kill -9 PID # use whatever PID you get from "ps -ef | grep start.jar | grep techproducts" bin/solr start -s example/techproducts/solr/ # re-verify the updates were applied correctly during replay curl 'http://localhost:8983/solr/techproducts/get?wt=json&id=DOCX' { "doc": { "id":"DOCX", "foo_i":43, "_version_":1555827152896655360}} {noformat} And here's a snippet of solr's logs during the replay on (second) startup... {noformat} WARN - 2017-01-07 01:27:45.408; [ x:techproducts] org.apache.solr.update.UpdateLog$LogReplayer; Starting log replay tlog{file=/home/hossman/lucene/dev/solr/example/techproducts/solr/techproducts/data/tlog/tlog.0000000000000000001 refcount=2} active=false starting pos=0 INFO - 2017-01-07 01:27:45.422; [ x:techproducts] org.apache.solr.update.DirectUpdateHandler2; Reordered DBQs detected. Update=add{flags=a,_version_=1555827152848420864,id=DOCX} DBQs=[DBQ{version=1555827152905043968,q=foo_i:42}, DBQ{version=1555827152897703936,q=foo_i:41}, DBQ{version=1555827152894558208,q=foo_i:43}, DBQ{version=1555827152883023872,q=foo_i:41}, DBQ{version=1555827152872538112,q=foo_i:43}, DBQ{version=1555827152849469440,q=foo_i:42}] INFO - 2017-01-07 01:27:45.434; [ x:techproducts] org.apache.solr.core.SolrCore; [techproducts] webapp=null path=null params={q=static+firstSearcher+warming+in+solrconfig.xml&distrib=false&event=firstSearcher} hits=3 status=0 QTime=62 INFO - 2017-01-07 01:27:45.446; [ x:techproducts] org.apache.solr.core.QuerySenderListener; QuerySenderListener done. INFO - 2017-01-07 01:27:45.446; [ x:techproducts] org.apache.solr.handler.component.SpellCheckComponent$SpellCheckerListener; Loading spell index for spellchecker: default INFO - 2017-01-07 01:27:45.446; [ x:techproducts] org.apache.solr.handler.component.SpellCheckComponent$SpellCheckerListener; Loading spell index for spellchecker: wordbreak INFO - 2017-01-07 01:27:45.448; [ x:techproducts] org.apache.solr.core.SolrCore; [techproducts] Registered new searcher Searcher@1826a6c8[techproducts] main{ExitableDirectoryReader(UninvertingDirectoryReader(Uninverting(_0(7.0.0):C32)))} INFO - 2017-01-07 01:27:45.532; [ x:techproducts] org.apache.solr.search.SolrIndexSearcher; Opening [Searcher@5bd6b40f[techproducts] realtime] INFO - 2017-01-07 01:27:45.536; [ x:techproducts] org.apache.solr.update.DirectUpdateHandler2; Reordered DBQs detected. Update=add{flags=a,_version_=1555827152881975296,id=DOCX} DBQs=[DBQ{version=1555827152905043968,q=foo_i:42}, DBQ{version=1555827152897703936,q=foo_i:41}, DBQ{version=1555827152894558208,q=foo_i:43}, DBQ{version=1555827152883023872,q=foo_i:41}] INFO - 2017-01-07 01:27:45.546; [ x:techproducts] org.apache.solr.search.SolrIndexSearcher; Opening [Searcher@667a386f[techproducts] realtime] INFO - 2017-01-07 01:27:45.555; [ x:techproducts] org.apache.solr.update.DirectUpdateHandler2; Reordered DBQs detected. Update=add{flags=a,_version_=1555827152896655360,id=DOCX} DBQs=[DBQ{version=1555827152905043968,q=foo_i:42}, DBQ{version=1555827152897703936,q=foo_i:41}] INFO - 2017-01-07 01:27:45.573; [ x:techproducts] org.apache.solr.search.SolrIndexSearcher; Opening [Searcher@3ed8aa7e[techproducts] realtime] INFO - 2017-01-07 01:27:45.579; [ x:techproducts] org.apache.solr.update.DirectUpdateHandler2; start commit{flags=2,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false} INFO - 2017-01-07 01:27:45.580; [ x:techproducts] org.apache.solr.update.SolrIndexWriter; Calling setCommitData with IW:org.apache.solr.update.SolrIndexWriter@363e3fe6 INFO - 2017-01-07 01:27:45.615; [ x:techproducts] org.apache.solr.search.SolrIndexSearcher; Opening [Searcher@7af13db2[techproducts] main] INFO - 2017-01-07 01:27:45.617; [ x:techproducts] org.apache.solr.update.DirectUpdateHandler2; end_commit_flush INFO - 2017-01-07 01:27:45.617; [ x:techproducts] org.apache.solr.core.QuerySenderListener; QuerySenderListener sending requests to Searcher@7af13db2[techproducts] main{ExitableDirectoryReader(UninvertingDirectoryReader(Uninverting(_0(7.0.0):C32) Uninverting(_3(7.0.0):C1)))} INFO - 2017-01-07 01:27:45.617; [ x:techproducts] org.apache.solr.core.QuerySenderListener; QuerySenderListener done. INFO - 2017-01-07 01:27:45.619; [ x:techproducts] org.apache.solr.core.SolrCore; [techproducts] Registered new searcher Searcher@7af13db2[techproducts] main{ExitableDirectoryReader(UninvertingDirectoryReader(Uninverting(_0(7.0.0):C32) Uninverting(_3(7.0.0):C1)))} INFO - 2017-01-07 01:27:45.620; [ x:techproducts] org.apache.solr.update.processor.LogUpdateProcessorFactory$LogUpdateProcessor; [techproducts] {add=[DOCX (1555827152848420864), DOCX (1555827152881975296), DOCX (1555827152896655360)],deleteByQuery=foo_i:42 (-1555827152849469440),deleteByQuery=foo_i:43 (-1555827152872538112),deleteByQuery=foo_i:41 (-1555827152883023872),deleteByQuery=foo_i:43 (-1555827152894558208),deleteByQuery=foo_i:41 (-1555827152897703936),deleteByQuery=foo_i:42 (-1555827152905043968)} 0 212 WARN - 2017-01-07 01:27:45.620; [ x:techproducts] org.apache.solr.update.UpdateLog$LogReplayer; Log replay finished. recoveryInfo=RecoveryInfo{adds=3 deletes=0 deleteByQuery=6 errors=0 positionOfStart=0} {noformat} ...notice how many times each of those DBQs was listed as being "Reordered" (and thus preemptively applied) even though they are *also* applied as part of the regular replay (note the INFO msg from LogUpdateProcessor) > log replay redundently (pre-)applies DBQs as if they were out of order > ---------------------------------------------------------------------- > > Key: SOLR-9941 > URL: https://issues.apache.org/jira/browse/SOLR-9941 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) > Reporter: Hoss Man > > There's kind of an odd situation that arises when a Solr node starts up > (after a crash) and tries to recover from it's tlog that causes deletes to be > redundantly & excessively applied -- at a minimum it causes confusing really > log messages.... > * {{UpdateLog.init(...)}} creates {{TransactionLog}} instances for the most > recent log files found (based on numRecordsToKeep) and then builds a > {{RecentUpdates}} instance from them > * Delete entries from the {{RecentUpdates}} are used to populate 2 lists: > ** {{deleteByQueries}} > ** {{oldDeletes}} (for deleteById). > * Then when {{UpdateLog.recoverFromLog}} is called a {{LogReplayer}} is used > to replay any (uncommited) {{TransactionLog}} enteries > ** during replay {{UpdateLog}} delegates to the UpdateRequestProcessorChain > to for the various adds/deletes, etc... > ** when an add makes it to {{RunUpdateProcessor}} it delegates to > {{DirectUpdateHandler2}}, which (independent of the fact that we're in log > replay) calls {{UpdateLog.getDBQNewer}} for every add, looking for any > "Reordered" deletes that have a version greater then the add > *** if it finds _any_ DBQs "newer" then the document being added, it does a > low level {{IndexWriter.updateDocument}} and then immediately executes _all_ > the newer DBQs ... _once per add_ > ** these deletes are *also* still executed as part of the normal tlog replay, > because they are in the tlog. > Which means if you are recovering from a tlog with 90 addDocs, followed by 5 > DBQs, then *each* of those 5 DBQs will each be executed 91 times -- and for > 90 of those executions, a DUH2 INFO log messages will say {{"Reordered DBQs > detected. ..."}} even tough the only reason they are out of order is because > Solr is deliberately applying them out of order. > * At a minimum we should improve the log messages > * Ideally we should stop (pre-emptively) applying these deletes during tlog > replay. -- This message was sent by Atlassian JIRA (v6.3.4#6332) --------------------------------------------------------------------- To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org