Re: Commit Within and /update/extract handler
This is being triggered by adding the commitWithin param to ContentStreamUpdateRequest (request.setCommitWithin(1);). My configuration has an autoCommit max time of 15s with openSearcher set to false. I'm assuming that changing openSearcher to true should address this, and that adding softCommit=true to the request would make the documents available in the meantime?

On Apr 8, 2014 10:02 AM, Erick Erickson erickerick...@gmail.com wrote:

Got a clue how it's being generated? Because it's not going to show you documents.

commit{,optimize=false,openSearcher=false,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}

openSearcher=false and softCommit=false, so the documents will be invisible. You need one or the other set to true.

What it will do is close the current segment, open a new one and truncate the current transaction log. These may be good things but they have nothing to do with making docs visible :).

See: http://searchhub.org/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/

Best,
Erick
Re: Commit Within and /update/extract handler
On 4/9/2014 7:47 AM, Jamie Johnson wrote:

This is being triggered by adding the commitWithin param to ContentStreamUpdateRequest (request.setCommitWithin(1);). My configuration has an autoCommit max time of 15s with openSearcher set to false. I'm assuming that changing openSearcher to true should address this, and that adding softCommit=true to the request would make the documents available in the meantime?

My personal opinion: autoCommit should not be used for document visibility, even though it CAN be used for it. It belongs in every config that uses the transaction log, with openSearcher set to false, and carefully considered maxTime and/or maxDocs parameters.

I think it's better to control document visibility entirely manually, but if you actually do want to have an automatic commit for document visibility, use autoSoftCommit. It doesn't make any sense to disable openSearcher on a soft commit, so just leave that out. The docs/time intervals for this can be smaller or greater than the intervals for autoCommit, depending on your needs.

Any manual commits that you send probably should be soft commits, but honestly that doesn't really matter if your auto settings are correct.

Thanks,
Shawn
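The division of labour described above can be sketched as a toy model in plain Java. This is not Solr code: the class and method names are made up for illustration, and it assumes each auto-commit timer starts when the document is added, ignoring maxDocs, commitWithin and manual commits. The 5-second autoSoftCommit value is hypothetical, and the thread does not say whether autoSoftCommit is configured at all.

```java
import java.util.OptionalLong;

/** Toy model (not part of Solr): when would an added document first become searchable? */
final class VisibilityModel {

    /**
     * @param autoCommitMs     autoCommit maxTime in ms, or -1 if not configured
     * @param autoCommitOpens  the openSearcher setting inside autoCommit
     * @param autoSoftCommitMs autoSoftCommit maxTime in ms, or -1 if not configured
     * @return delay until the first automatic commit that opens a searcher, if any
     */
    static OptionalLong earliestVisibleMs(long autoCommitMs, boolean autoCommitOpens, long autoSoftCommitMs) {
        long best = Long.MAX_VALUE;
        // A hard auto-commit only helps visibility when it opens a new searcher.
        if (autoCommitMs >= 0 && autoCommitOpens) {
            best = Math.min(best, autoCommitMs);
        }
        // A soft auto-commit always opens a searcher.
        if (autoSoftCommitMs >= 0) {
            best = Math.min(best, autoSoftCommitMs);
        }
        return best == Long.MAX_VALUE ? OptionalLong.empty() : OptionalLong.of(best);
    }

    public static void main(String[] args) {
        // Setup from the thread: autoCommit 15s with openSearcher=false; no autoSoftCommit assumed.
        System.out.println(earliestVisibleMs(15_000, false, -1));
        // Same, plus a hypothetical 5s autoSoftCommit (value chosen only for illustration).
        System.out.println(earliestVisibleMs(15_000, false, 5_000));
    }
}
```

Under these assumptions the first case never becomes visible automatically, which matches the symptom reported in the thread, while adding autoSoftCommit gives visibility without touching the hard-commit settings.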
Re: Commit Within and /update/extract handler
Thanks Shawn, I appreciate the information.
Re: Commit Within and /update/extract handler
Got a clue how it's being generated? Because it's not going to show you documents.

commit{,optimize=false,openSearcher=false,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}

openSearcher=false and softCommit=false, so the documents will be invisible. You need one or the other set to true.

What it will do is close the current segment, open a new one and truncate the current transaction log. These may be good things but they have nothing to do with making docs visible :).

See: http://searchhub.org/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/

Best,
Erick
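The check described above (either openSearcher or softCommit must be true for documents to become visible) can be applied mechanically to a "start commit{...}" log entry. The following is a minimal sketch in plain Java; the class and method names are invented for illustration and are not part of Solr or SolrJ, and it assumes the flag list is formatted exactly as in the entry quoted in this thread.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative helper (not part of Solr): reads the flags from a "start commit{...}" log entry. */
final class CommitLogFlags {
    private static final Pattern BODY = Pattern.compile("commit\\{([^}]*)\\}");

    /** Returns the key=value flags found between the braces, in log order. */
    static Map<String, Boolean> parse(String logEntry) {
        Matcher m = BODY.matcher(logEntry);
        if (!m.find()) {
            throw new IllegalArgumentException("no commit{...} block in: " + logEntry);
        }
        Map<String, Boolean> flags = new LinkedHashMap<>();
        for (String pair : m.group(1).split(",")) {
            int eq = pair.indexOf('=');
            if (eq > 0) { // skips the empty token produced by the leading comma
                flags.put(pair.substring(0, eq).trim(),
                        Boolean.parseBoolean(pair.substring(eq + 1).trim()));
            }
        }
        return flags;
    }

    /** Docs become searchable only if openSearcher or softCommit is true. */
    static boolean makesDocsVisible(String logEntry) {
        Map<String, Boolean> flags = parse(logEntry);
        return flags.getOrDefault("openSearcher", false) || flags.getOrDefault("softCommit", false);
    }

    public static void main(String[] args) {
        String fromThread = "start commit{,optimize=false,openSearcher=false,waitSearcher=true,"
                + "expungeDeletes=false,softCommit=false,prepareCommit=false}";
        System.out.println(makesDocsVisible(fromThread));
    }
}
```

For the commit entry from this thread, both flags are false, so the helper reports that the commit does not make documents visible.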
Re: Commit Within and /update/extract handler
You say you see the commit happen in the log. Is openSearcher specified? This sounds like you're somehow getting a commit with openSearcher=false...

Best,
Erick

On Sun, Apr 6, 2014 at 5:37 PM, Jamie Johnson jej2...@gmail.com wrote:

I'm running Solr 4.6.0 and am noticing that commitWithin doesn't seem to work when I am using the /update/extract request handler. From the logs it looks like a commit is happening, but the documents don't become available for search until I do a commit manually. Could this be some type of configuration issue?
Re: Commit Within and /update/extract handler
What does the call look like? Are you opening a new searcher or not? That should be in the log line where the commit is recorded...

FWIW,
Erick

On Sun, Apr 6, 2014 at 5:37 PM, Jamie Johnson jej2...@gmail.com wrote:

I'm running Solr 4.6.0 and am noticing that commitWithin doesn't seem to work when I am using the /update/extract request handler. From the logs it looks like a commit is happening, but the documents don't become available for search until I do a commit manually. Could this be some type of configuration issue?
Re: Commit Within and /update/extract handler
Below is the log showing what I believe to be the commit

07-Apr-2014 23:40:55.846 INFO [catalina-exec-5] org.apache.solr.update.processor.LogUpdateProcessor.finish [forums] webapp=/solr path=/update/extract params={uprefix=attr_literal.source_id=e4bb4bb6-96ab-4f8f-8a2a-1cf37dc1bcceliteral.content_group=File literal.id=e4bb4bb6-96ab-4f8f-8a2a-1cf37dc1bcceliteral.forum_id=3literal.content_type=application/octet-streamwt=javabinliteral.uploaded_by=+version=2literal.content_type=application/octet-streamliteral.file_name=exclusions} {add=[e4bb4bb6-96ab-4f8f-8a2a-1cf37dc1bcce (1464785652471037952)]} 0 563
07-Apr-2014 23:41:10.847 INFO [commitScheduler-10-thread-1] org.apache.solr.update.DirectUpdateHandler2.commit start commit{,optimize=false,openSearcher=false,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}
07-Apr-2014 23:41:10.847 INFO [commitScheduler-10-thread-1] org.apache.solr.update.LoggingInfoStream.message [IW][commitScheduler-10-thread-1]: commit: start
07-Apr-2014 23:41:10.848 INFO [commitScheduler-10-thread-1] org.apache.solr.update.LoggingInfoStream.message [IW][commitScheduler-10-thread-1]: commit: enter lock
07-Apr-2014 23:41:10.848 INFO [commitScheduler-10-thread-1] org.apache.solr.update.LoggingInfoStream.message [IW][commitScheduler-10-thread-1]: commit: now prepare
07-Apr-2014 23:41:10.848 INFO [commitScheduler-10-thread-1] org.apache.solr.update.LoggingInfoStream.message [IW][commitScheduler-10-thread-1]: prepareCommit: flush
07-Apr-2014 23:41:10.849 INFO [commitScheduler-10-thread-1] org.apache.solr.update.LoggingInfoStream.message [IW][commitScheduler-10-thread-1]: index before flush _y(4.6):C1 _10(4.6):C1 _11(4.6):C1 _12(4.6):C1
07-Apr-2014 23:41:10.849 INFO [commitScheduler-10-thread-1] org.apache.solr.update.LoggingInfoStream.message [DW][commitScheduler-10-thread-1]: commitScheduler-10-thread-1 startFullFlush
07-Apr-2014 23:41:10.849 INFO [commitScheduler-10-thread-1] org.apache.solr.update.LoggingInfoStream.message [DW][commitScheduler-10-thread-1]: anyChanges? numDocsInRam=1 deletes=true hasTickets:false pendingChangesInFullFlush: false
07-Apr-2014 23:41:10.850 INFO [commitScheduler-10-thread-1] org.apache.solr.update.LoggingInfoStream.message [DWFC][commitScheduler-10-thread-1]: addFlushableState DocumentsWriterPerThread [pendingDeletes=gen=0, segment=_14, aborting=false, numDocsInRAM=1, deleteQueue=DWDQ: [ generation: 2 ]]
07-Apr-2014 23:41:10.852 INFO [commitScheduler-10-thread-1] org.apache.solr.update.LoggingInfoStream.message [DWPT][commitScheduler-10-thread-1]: flush postings as segment _14 numDocs=1
07-Apr-2014 23:41:10.904 INFO [commitScheduler-10-thread-1] org.apache.solr.update.LoggingInfoStream.message [DWPT][commitScheduler-10-thread-1]: new segment has 0 deleted docs
07-Apr-2014 23:41:10.904 INFO [commitScheduler-10-thread-1] org.apache.solr.update.LoggingInfoStream.message [DWPT][commitScheduler-10-thread-1]: new segment has no vectors; norms; no docValues; prox; freqs
07-Apr-2014 23:41:10.904 INFO [commitScheduler-10-thread-1] org.apache.solr.update.LoggingInfoStream.message [DWPT][commitScheduler-10-thread-1]: flushedFiles=[_14.nvd, _14_Lucene41_0.pos, _14_Lucene41_0.tip, _14_Lucene41_0.tim, _14.nvm, _14.fdx, _14_Lucene41_0.doc, _14.fnm, _14.fdt]
07-Apr-2014 23:41:10.905 INFO [commitScheduler-10-thread-1] org.apache.solr.update.LoggingInfoStream.message [DWPT][commitScheduler-10-thread-1]: flushed codec=Lucene46
07-Apr-2014 23:41:10.905 INFO [commitScheduler-10-thread-1] org.apache.solr.update.LoggingInfoStream.message [DWPT][commitScheduler-10-thread-1]: flushed: segment=_14 ramUsed=0.122 MB newFlushedSize(includes docstores)=0.003 MB docs/MB=322.937
07-Apr-2014 23:41:10.907 INFO [commitScheduler-10-thread-1] org.apache.solr.update.LoggingInfoStream.message [DW][commitScheduler-10-thread-1]: publishFlushedSegment seg-private updates=null
07-Apr-2014 23:41:10.907 INFO [commitScheduler-10-thread-1] org.apache.solr.update.LoggingInfoStream.message [IW][commitScheduler-10-thread-1]: publishFlushedSegment
07-Apr-2014 23:41:10.907 INFO [commitScheduler-10-thread-1] org.apache.solr.update.LoggingInfoStream.message [BD][commitScheduler-10-thread-1]: push deletes 1 deleted terms (unique count=1) bytesUsed=1024 delGen=4 packetCount=1 totBytesUsed=1024
07-Apr-2014 23:41:10.907 INFO [commitScheduler-10-thread-1] org.apache.solr.update.LoggingInfoStream.message [IW][commitScheduler-10-thread-1]: publish sets newSegment delGen=5 seg=_14(4.6):C1
07-Apr-2014 23:41:10.908 INFO [commitScheduler-10-thread-1] org.apache.solr.update.LoggingInfoStream.message [IFD][commitScheduler-10-thread-1]: now checkpoint _y(4.6):C1 _10(4.6):C1 _11(4.6):C1 _12(4.6):C1 _14(4.6):C1 [5 segments ; isCommit = false]
07-Apr-2014 23:41:10.908 INFO [commitScheduler-10-thread-1] org.apache.solr.update.LoggingInfoStream.message [IFD][commitScheduler-10-thread-1]: 0 msec to checkpoint
07-Apr-2014 23:41:10.908 INFO [commitScheduler-10-thread-1]
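One detail worth pulling out of the log: the add is recorded at 23:40:55.846 and the commit starts at 23:41:10.847. A small sketch in plain Java (hypothetical class name, standard library only) computes that gap from the two timestamps as they appear in the log.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

/** Illustrative helper: measures the time between two log timestamps. */
final class LogGap {
    // Matches timestamps such as "07-Apr-2014 23:40:55.846" from the log above.
    private static final DateTimeFormatter STAMP =
            DateTimeFormatter.ofPattern("dd-MMM-yyyy HH:mm:ss.SSS", Locale.ENGLISH);

    static long millisBetween(String earlier, String later) {
        return Duration.between(
                LocalDateTime.parse(earlier, STAMP),
                LocalDateTime.parse(later, STAMP)).toMillis();
    }

    public static void main(String[] args) {
        // Timestamps of the add entry and the "start commit" entry.
        long gap = millisBetween("07-Apr-2014 23:40:55.846", "07-Apr-2014 23:41:10.847");
        System.out.println(gap + " ms"); // prints "15001 ms"
    }
}
```

The gap is 15001 ms, which lines up with the 15s autoCommit max time mentioned elsewhere in this thread. That is consistent with the commit coming from the autoCommit timer rather than from commitWithin, though the log excerpt alone does not prove it.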
Commit Within and /update/extract handler
I'm running Solr 4.6.0 and am noticing that commitWithin doesn't seem to work when I am using the /update/extract request handler. From the logs it looks like a commit is happening, but the documents don't become available for search until I do a commit manually. Could this be some type of configuration issue?