[jira] [Commented] (SOLR-3884) possible bug in how commits are handled during recovery mode on startup?
[ https://issues.apache.org/jira/browse/SOLR-3884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13462737#comment-13462737 ] Steven Rowe commented on SOLR-3884: ---
bq. I don't think it does happen every time - others seem to have run the smoke test and not seen it at all, steve saw it some runs, I see it every run on my really fast comp. So I'm guessing there is some timing involved.
Mark, what OS are you running the smoke tester on? On another Win7+Cygwin box, the smoke tester succeeds for me.
possible bug in how commits are handled during recovery mode on startup?
--
Key: SOLR-3884
URL: https://issues.apache.org/jira/browse/SOLR-3884
Project: Solr
Issue Type: Bug
Reporter: Hoss Man
Attachments: java6.solr-example.log, java7.solr-example.log, solr-example.log
while testing out 4.0-rc0, sarowe noted that he was seeing the smoke tester script fail while sanity checking the solr example. https://mail-archives.apache.org/mod_mbox/lucene-dev/201209.mbox/%3c6c78e97c707b5b4c8cc61d44f87545863ed...@suex10-mbx-03.ad.syr.edu%3E
I'm not certain, but looking at his logs, i think this suggests a bug in how commits are handled when a newly started server is in recovery mode
--
This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators. For more information on JIRA, see: http://www.atlassian.com/software/jira
-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-3884) possible bug in how commits are handled during recovery mode on startup?
[ https://issues.apache.org/jira/browse/SOLR-3884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13462739#comment-13462739 ] Robert Muir commented on SOLR-3884: ---
Strange too: I've run the smoke tester over 100 times and not encountered this, and sometimes my computer is busy doing other things too.
[jira] [Commented] (SOLR-3884) possible bug in how commits are handled during recovery mode on startup?
[ https://issues.apache.org/jira/browse/SOLR-3884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13462803#comment-13462803 ] Mark Miller commented on SOLR-3884: ---
I'm running it on Ubuntu 12.04.
[jira] [Commented] (SOLR-3884) possible bug in how commits are handled during recovery mode on startup?
[ https://issues.apache.org/jira/browse/SOLR-3884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13462809#comment-13462809 ] Robert Muir commented on SOLR-3884: ---
Me too:
Distributor ID: Ubuntu
Description: Ubuntu 12.04 LTS
Release: 12.04
Codename: precise
I run 'ant nightly-smoke' like this:
ant -DJAVA7_HOME=/usr/local/jdk1.7.0_01 -DJAVA6_HOME=/usr/local/jdk1.6.0_27 nightly-smoke
[jira] [Commented] (SOLR-3884) possible bug in how commits are handled during recovery mode on startup?
[ https://issues.apache.org/jira/browse/SOLR-3884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13462860#comment-13462860 ] Hoss Man commented on SOLR-3884:
{quote}
I tried the clean directory approach - here's the smokeTestRelease.py patch: ...
But this seemed to have no effect - the failure still happens: ...
Attaching the solr-example.log files from each directory.
{quote}
If these logs you attached on 25/Sep/12 are what you got *with* the patch then something _really_ weird is going on -- because using a clean example dir there's no logical reason why it should have been doing tlog recovery in either run.
[jira] [Commented] (SOLR-3884) possible bug in how commits are handled during recovery mode on startup?
[ https://issues.apache.org/jira/browse/SOLR-3884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13462881#comment-13462881 ] Steven Rowe commented on SOLR-3884: ---
bq. If these logs you attached on 25/Sep/12 are what you got with the patch then something really weird is going on -- because using a clean example dir there's no logical reason why it should have been doing tlog recovery in either run.
Those two logs are *with* the patch.
[jira] [Commented] (SOLR-3884) possible bug in how commits are handled during recovery mode on startup?
[ https://issues.apache.org/jira/browse/SOLR-3884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13462996#comment-13462996 ] Yonik Seeley commented on SOLR-3884:
Something to watch out for on Windows/cygwin: I've noticed that ^C in cygwin no longer works to stop the JVM. Doing a kill -2 on the java process from a different window seems to immediately abort the JVM (no graceful shutdown that does a commit). The latter may be happening with the python script too?
[jira] [Commented] (SOLR-3884) possible bug in how commits are handled during recovery mode on startup?
[ https://issues.apache.org/jira/browse/SOLR-3884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13463018#comment-13463018 ] Hoss Man commented on SOLR-3884:
{quote}
Doing a kill -2 on the java process from a different window seems to immediately abort the JVM (no graceful shutdown that does a commit). The latter may be happening with the python script too?
{quote}
That could explain how a tlog would be left in the directory after calling the testSolrExample() method, but it still leaves some open questions...
1) according to sarowe, he's seeing this problem (and his smoke test logs show tlog recovery) even when using a pristine copy of the unpack dir - so why is there a tlog in that case?
2) how should tlog recovery be fixed to better handle this situation (ie: docs and commits coming in during recovery) in a real world situation?
[jira] [Commented] (SOLR-3884) possible bug in how commits are handled during recovery mode on startup?
[ https://issues.apache.org/jira/browse/SOLR-3884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13463019#comment-13463019 ] Hoss Man commented on SOLR-3884:
bq. 1) according to sarowe, he's seeing this problem (and his smoke test logs show tlog recovery) even when using a pristine copy of the unpack dir - so why is there a tlog in that case?
Hmm.. talking with sarowe on IRC, i think there may be a bug in his patch ... he cloned the unpackDir he passes to testSolrExample, but i don't think testSolrExample actually uses that unpack dir for anything, it appears to just run the example relative to the CWD.
[jira] [Commented] (SOLR-3884) possible bug in how commits are handled during recovery mode on startup?
[ https://issues.apache.org/jira/browse/SOLR-3884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13463030#comment-13463030 ] Yonik Seeley commented on SOLR-3884:
bq. I've noticed that ^C in cygwin no longer works to stop the JVM.
I think it might be this: http://cygwin.com/ml/cygwin/2012-07/msg00250.html
The author does note that the Java shutdown hook is no longer called though.
[jira] [Commented] (SOLR-3884) possible bug in how commits are handled during recovery mode on startup?
[ https://issues.apache.org/jira/browse/SOLR-3884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13463034#comment-13463034 ] Steven Rowe commented on SOLR-3884: ---
{quote}
Hmm.. talking with sarowe on IRC, i think there may be a bug in his patch ... he cloned the unpackDir he passes to testSolrExample, but i don't think testSolrExample actually uses that unpack dir for anything, it appears to just run the example relative to the CWD.
{quote}
Yup, buggy patch, I didn't chdir to the unpacked copy, so it was still sharing the original dir between the two runs. Here's the hopefully fixed patch:
{noformat}
Index: dev-tools/scripts/smokeTestRelease.py
===================================================================
--- dev-tools/scripts/smokeTestRelease.py (revision 1389823)
+++ dev-tools/scripts/smokeTestRelease.py (working copy)
@@ -533,12 +536,26 @@
     if project == 'lucene':
       testDemo(isSrc, version)
     else:
+      print('copying unpacked distribution...')
+      java6UnpackPath = '%s-java6' %unpackPath
+      if os.path.exists(java6UnpackPath):
+        shutil.rmtree(java6UnpackPath)
+      shutil.copytree(unpackPath, java6UnpackPath)
+      chdir(java6UnpackPath)
       print('test solr example w/ Java 6...')
-      testSolrExample(unpackPath, JAVA6_HOME, False)
+      testSolrExample(java6UnpackPath, JAVA6_HOME, False)
+      print('copying unpacked distribution...')
+      java7UnpackPath = '%s-java7' %unpackPath
+      if os.path.exists(java7UnpackPath):
+        shutil.rmtree(java7UnpackPath)
+      shutil.copytree(unpackPath, java7UnpackPath)
+      chdir(java7UnpackPath)
       print('test solr example w/ Java 7...')
-      testSolrExample(unpackPath, JAVA7_HOME, False)
+      testSolrExample(java7UnpackPath, JAVA7_HOME, False)
+      chdir(unpackPath)
+      testChangesText('.', version, project)
{noformat}
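The idea behind the patch (give each JVM run its own pristine copy and actually run from inside it) can be sketched in isolation as follows. The helper names `fresh_copy` and `run_in_dir` are hypothetical and do not exist in smokeTestRelease.py; this is a minimal sketch using only the standard library, assuming the unpack path has no trailing separator.

```python
import os
import shutil


def fresh_copy(unpack_path, suffix):
    """Copy unpack_path to '<unpack_path>-<suffix>', replacing any stale copy."""
    target = "%s-%s" % (unpack_path, suffix)
    if os.path.exists(target):
        shutil.rmtree(target)          # drop leftovers (e.g. tlogs) from earlier runs
    shutil.copytree(unpack_path, target)
    return target


def run_in_dir(path, func):
    """Call func() with path as the CWD, then restore the previous CWD."""
    previous = os.getcwd()
    os.chdir(path)
    try:
        return func()
    finally:
        os.chdir(previous)
```

Restoring the CWD in a `finally` block avoids the situation described above, where a later step silently keeps running relative to whatever directory an earlier step left behind.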
[jira] [Commented] (SOLR-3884) possible bug in how commits are handled during recovery mode on startup?
[ https://issues.apache.org/jira/browse/SOLR-3884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13462233#comment-13462233 ] Hoss Man commented on SOLR-3884:
The crux of the issue i (think i) see is that the smoke tester tries to sanity check the same example dir twice in succession, once with java6 and once with java7. in sarowe's case, it's always the java7 version that fails (if it fails) and here are some interesting tidbits from the output/log he posted...
{noformat}
test solr example w/ Java 6...
start Solr instance (log=/home/sarowe/temp/tmpDir/unpack/apache-solr-4.0.0/solr-example.log)...
startup done
test utf8...
index example docs...
run query...
stop server (SIGINT)...
test solr example w/ Java 7...
start Solr instance (log=/home/sarowe/temp/tmpDir/unpack/apache-solr-4.0.0/solr-example.log)...
startup done
test utf8...
index example docs...
run query...
FAILED: response is:
<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader"><int name="status">0</int><int name="QTime">0</int><lst name="params"><str name="q">video</str></lst></lst><result name="response" numFound="0" start="0"></result>
</response>
{noformat}
So the requests to index and query data all succeed, but the results are not what is expected (no docs found for a simple query).
Then we look at the log from that failed second run of the example...
{noformat}
...
INFO: registering core: collection1
Sep 24, 2012 2:08:58 PM org.apache.solr.update.UpdateLog$LogReplayer doReplay
WARNING: Starting log replay tlog{file=solr\collection1\data\tlog\tlog.000 refcount=2} active=false starting pos=0
Sep 24, 2012 2:08:58 PM org.apache.solr.servlet.SolrDispatchFilter init
INFO: user.dir=C:\cygwin\home\sarowe\temp\tmpDir\unpack\apache-solr-4.0.0\example
Sep 24, 2012 2:08:58 PM org.apache.solr.servlet.SolrDispatchFilter init
INFO: SolrDispatchFilter.init() done
2012-09-24 14:08:58.105:INFO:oejs.AbstractConnector:Started SocketConnector@0.0.0.0:8983
Sep 24, 2012 2:08:58 PM org.apache.solr.core.SolrDeletionPolicy onInit
INFO: SolrDeletionPolicy.onInit: commits:num=1
  commit{dir=NRTCachingDirectory(org.apache.lucene.store.MMapDirectory@C:\cygwin\home\sarowe\temp\tmpDir\unpack\apache-solr-4.0.0\example\solr\collection1\data\index lockFactory=org.apache.lucene.store.NativeFSLockFactory@7ea69d83; maxCacheMB=48.0 maxMergeSizeMB=4.0),segFN=segments_1,generation=1,filenames=[segments_1]
Sep 24, 2012 2:08:58 PM org.apache.solr.core.SolrDeletionPolicy updateCommits
INFO: newest commit = 1
Sep 24, 2012 2:08:58 PM org.apache.solr.update.DirectUpdateHandler2 commit
INFO: start commit{flags=2,_version_=0,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=false}
Sep 24, 2012 2:08:58 PM org.apache.solr.core.SolrCore execute
INFO: [collection1] webapp=/solr path=/select params={wt=python&q=hello&params=explicit} hits=0 status=0 QTime=1
Sep 24, 2012 2:08:58 PM org.apache.solr.core.SolrCore execute
...
{noformat}
...so we have a startup detecting things in the transaction log that need to be replayed, and a commit starts -- meanwhile requests from the smoke script (checking utf8) are coming in. After that we start getting real example documents...
{noformat}
...
INFO: [collection1] webapp=/solr path=/update params={} {add=[EN7800GTX/2DHTV/256M (1414015368091926528), 100-435805 (1414015368095072256)]} 0 6
Sep 24, 2012 2:08:59 PM org.apache.solr.update.processor.DistributedUpdateProcessor processCommit
INFO: Ignoring commit while not ACTIVE - state: REPLAYING replay:0
Sep 24, 2012 2:08:59 PM org.apache.solr.update.processor.LogUpdateProcessor finish
INFO: [collection1] webapp=/solr path=/update params={softCommit=true} {commit=} 0 1
Sep 24, 2012 2:08:59 PM org.apache.solr.core.SolrCore execute
INFO: [collection1] webapp=/solr path=/select/ params={q=video} hits=0 status=0 QTime=0
...
{noformat}
...and that's where things look wonky to me -- the smoke script explicitly sent in a bunch of documents, and asked for a commit, but apparently even though the client was given a success response, that commit was ignored because solr was in REPLAYING mode, so then the subsequent query didn't find the docs it was expecting.
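The sequence in those logs can be illustrated with a toy model. This is emphatically not Solr code: `ToyCore`, its fields, and `smoke_sequence` are invented for illustration, and the model only assumes what the log lines show, namely that adds are accepted, that a commit arriving while the state is REPLAYING is ignored, and that the client still sees a success status.

```python
class ToyCore:
    """Toy model of the behavior seen in the logs; names are illustrative only."""

    def __init__(self, replaying):
        self.state = "REPLAYING" if replaying else "ACTIVE"
        self.pending = []      # accepted documents that are not yet searchable
        self.searchable = []   # documents visible to queries

    def add(self, doc):
        self.pending.append(doc)
        return "OK"

    def commit(self):
        if self.state != "ACTIVE":
            # Mirrors "Ignoring commit while not ACTIVE": nothing becomes
            # visible, yet the caller is still told the request succeeded.
            return "OK"
        self.searchable.extend(self.pending)
        self.pending = []
        return "OK"

    def finish_replay(self):
        self.state = "ACTIVE"

    def hits(self):
        return len(self.searchable)


def smoke_sequence(core):
    """Index two docs, commit, then query, as the smoke script does."""
    core.add("EN7800GTX/2DHTV/256M")
    core.add("100-435805")
    status = core.commit()
    return status, core.hits()
```

Running `smoke_sequence` against a core that is still replaying yields a success status with zero hits, which matches the failed run; against an active core the same sequence finds both documents. What real Solr does with the accepted documents once replay finishes is outside this model.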
[jira] [Commented] (SOLR-3884) possible bug in how commits are handled during recovery mode on startup?
[ https://issues.apache.org/jira/browse/SOLR-3884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13462234#comment-13462234 ] Mark Miller commented on SOLR-3884: ---
I guess it looks like it's recovering from a replay log? And it skips commits while it does this. So the problem may just be that we accept updates before we are ready? I've got to head out for a while, but can look into it more later if no one else jumps on it.
[jira] [Commented] (SOLR-3884) possible bug in how commits are handled during recovery mode on startup?
[ https://issues.apache.org/jira/browse/SOLR-3884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13462235#comment-13462235 ] Mark Miller commented on SOLR-3884: ---
Didn't see your post first Hossman - right, that's about what I see happening.
[jira] [Commented] (SOLR-3884) possible bug in how commits are handled during recovery mode on startup?
[ https://issues.apache.org/jira/browse/SOLR-3884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13462271#comment-13462271 ] Yonik Seeley commented on SOLR-3884:
bq. So the problem may just be that we accept updates before we are ready?
Yes, sounds like it from the diagnosis that Hoss gave.
bq. I think this is somewhat related to SOLR-3861 in that the only reason we see these problems during tlog REPLAY is because there was no hard commit on shutdown of the first instance.
I was just going to ask this (why we are seeing a tlog recovery in the first place). We always used to do a commit on shutdown, and I had to add an explicit test hook to disable this for TestRecovery (DirectUpdateHandler2.commitOnClose = false;)
bq. but independent of that we still need to think about how to better deal with documents coming in during RECOVERY
We buffer them to the tlog. They will get added eventually. What we should really think about is documents coming in during recovery that aren't from a leader and we aren't in cloud mode (hence we won't forward to a leader). Perhaps we should fail the update?
bq. On IRC miller suggested that perhaps Solr should block and not accept new updates until REPLAY is done (ideally by not listening on the socket i would think)
Hmmm, yeah I guess that would work too, only if we don't advertise the core being in recovery mode to the cluster. We *don't* want the leader to be forwarding updates if we're going to block.
[jira] [Commented] (SOLR-3884) possible bug in how commits are handled during recovery mode on startup?
[ https://issues.apache.org/jira/browse/SOLR-3884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13462300#comment-13462300 ] Yonik Seeley commented on SOLR-3884:
Hmmm, just did a simple manual test, and I see evidence of a final commit. I started a clean stock server, executed post.sh *xml (after commenting out the commit command in the script), then pressed ^C immediately after the docs were added (before any autocommit could kick in). Last lines on the console:
{code}
INFO: [collection1] webapp=/solr path=/update params={} {add=[EN7800GTX/2DHTV/256M (1414039066694909952), 100-435805 (1414039066700152832)]} 0 10
^C2012-09-24 20:25:44.817:INFO:oejs.Server:Graceful shutdown SocketConnector@0.0.0.0:8983
2012-09-24 20:25:44.817:INFO:oejs.Server:Graceful shutdown o.e.j.w.WebAppContext{/solr,file:/opt/code/lusolr/solr/example/solr-webapp/webapp/},/opt/code/lusolr/solr/example/webapps/solr.war
2012-09-24 20:25:45.887:INFO:oejsh.ContextHandler:stopped o.e.j.w.WebAppContext{/solr,file:/opt/code/lusolr/solr/example/solr-webapp/webapp/},/opt/code/lusolr/solr/example/webapps/solr.war
{code}
The only tlog was capped (an indicator that everything in the log was subjected to a hard commit, hence there should be no need for tlog recovery on restart).
{code}
/opt/code/lusolr/solr/example/solr/collection1/data/tlog$ tail -n1 tlog.000
0??D`-SOLR_TLOG_END/opt/code/lusolr/solr/example/solr/collection1/data/tlog$
{code}
Starting the server again showed no indication of a tlog replay/recovery. So I guess this is something more esoteric related to SOLR-3861 that I don't understand yet (and doesn't happen all the time).
bq. autocommit during shutdown
FYI, the final commit should be done regardless of any autocommits being enabled, so I assume you're just referring to that commit, and not a real autocommit triggered by time or number of documents that happens to coincide with shutdown.
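The manual `tail` check could be scripted along these lines. This is a rough sketch resting on an assumption taken only from the tail output quoted in this thread: that a capped tlog ends with the bytes `SOLR_TLOG_END`. The exact on-disk format is not confirmed here, and `END_MARKER` and `tlog_looks_capped` are made-up names.

```python
import os

# Assumption: the marker is exactly what the quoted tail output shows at the
# end of the file; the bytes before it are treated as opaque.
END_MARKER = b"SOLR_TLOG_END"


def tlog_looks_capped(path):
    """Return True if the file's final bytes equal END_MARKER."""
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        f.seek(max(0, size - len(END_MARKER)))
        return f.read() == END_MARKER
```

A smoke script could call something like this on each tlog file after stopping the server, and flag the run if any file is left uncapped, since that would predict a replay on the next startup.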
[jira] [Commented] (SOLR-3884) possible bug in how commits are handled during recovery mode on startup?
[ https://issues.apache.org/jira/browse/SOLR-3884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13462362#comment-13462362 ] Mark Miller commented on SOLR-3884: ---
I don't think it does happen every time - others seem to have run the smoke test and not seen it at all, steve saw it some runs, I see it every run on my really fast comp. So I'm guessing there is some timing involved.
[jira] [Commented] (SOLR-3884) possible bug in how commits are handled during recovery mode on startup?
[ https://issues.apache.org/jira/browse/SOLR-3884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13462384#comment-13462384 ] Mark Miller commented on SOLR-3884: ---
Hmm...I tried with the SOLR-3861 fix and I still see a fail. The log shows Solr starting and then stopping, but the 'test utf8' fails with ERROR: Solr is not up.