On Wed, Jun 1, 2016 at 5:40 AM, Ravikumar Govindarajan < [email protected]> wrote:
> > In newer versions of the code there are multiple streams involved. One for
> > each open file handle plus if a sequential read is detected a new stream is
> > created for the instance for better performance
>
> Great. We just patched up our Blur version with this code.
>
> While I was digging at the reader-closed issue, I was quite surprised to
> observe the following behavior
>
> - Issue a commit
> - Lucene opens a new reader via IndexWriter. (Doesn't re-use our already
>   opened DirectoryReader)
> - Processes all updates/deletes/merges
> - Closes the new reader
> - Complete commit
>
> For a big index & lots of commits, opening a new reader for every commit is
> prohibitively expensive.
>
> Here is the JIRA for it...
> https://issues.apache.org/jira/browse/LUCENE-2297
>
> All we need to do is just set "readerPooling=true" in IndexWriterConfig
> class
>
> Please do explore this option when you find time.

Interesting. The current code in the trunk doesn't reopen the indexreader from
the writer. It reuses the existing reader.

> --
> Ravi
>
> On Tue, May 24, 2016 at 7:48 PM, Aaron McCurry <[email protected]> wrote:
>
> > On Tue, May 24, 2016 at 6:06 AM, Ravikumar Govindarajan <
> > [email protected]> wrote:
> >
> > > We have solved it temporarily by using a KeepLastTwoCommits del policy. We
> > > don't get these exceptions now!!!
> >
> > Great!
> >
> > > Btw, I see that pread calls in FSDataInputStream.java are synchronized. Is
> > > it possible that merge DFS read calls could potentially block search DFS
> > > read calls?
> >
> > Yes.
> >
> > > Would it be a good idea to have 2 DFSInputStreams for every file, one for
> > > merge & another for search?
> >
> > In newer versions of the code there are multiple streams involved. One for
> > each open file handle plus if a sequential read is detected a new stream is
> > created for the instance for better performance. Check out the
> > HdfsDirectory class.
> >
> > Aaron
> >
> > > On Tue, May 10, 2016 at 7:43 PM, Ravikumar Govindarajan <
> > > [email protected]> wrote:
> > >
> > > > Sorry, I misunderstood the code.
> > > > I see that it has 2 locks IndexRefreshWriteLock & IndexRefreshReadLock.
> > > > They look to be separate
> > > >
> > > > On Tue, May 10, 2016 at 7:16 PM, Ravikumar Govindarajan <
> > > > [email protected]> wrote:
> > > >
> > > >> Thanks a lot Aaron.
> > > >>
> > > >> I guess we took a commit of 0.2.2 that doesn't have the
> > > >> IndexRefreshWriteLock (IRWL). It looks like it co-ordinates between
> > > >> searches & incoming mutation commits. If so, then it will likely solve the
> > > >> first issue for us (AlreadyClosedException)
> > > >>
> > > >> Can you recollect if that was the reason IRWL was introduced?
> > > >>
> > > >> On Tue, May 10, 2016 at 6:40 PM, Aaron McCurry <[email protected]>
> > > >> wrote:
> > > >>
> > > >>> On Tue, May 10, 2016 at 2:30 AM, Ravikumar Govindarajan <
> > > >>> [email protected]> wrote:
> > > >>>
> > > >>> > Actually there are 2 issues...
> > > >>> >
> > > >>> > 1. IndexReaderClosedException
> > > >>> > 2. HDFS Stream Closed
> > > >>>
> > > >>> Likely when the index is closed it closes the underlying indexinputs as
> > > >>> well, causing the HDFS Stream closed exception.
> > > >>>
> > > >>> > Merge completion results in File Deletion & ultimately HDFS Stream Closed
> > > >>> > during Search....
> > > >>> >
> > > >>> > I use IndexFileDeleter with KeepOnlyLastCommitDeletionPolicy. This blindly
> > > >>> > deletes the file, without bothering to cross-check
> > > >>> > IndexReader.RefCount > 0.
> > > >>>
> > > >>> Hmm.
> > > >>> You can see here:
> > > >>>
> > > >>> https://github.com/apache/incubator-blur/blob/release-0.2.2-incubating/blur-core/src/main/java/org/apache/blur/manager/writer/BlurIndexSimpleWriter.java#L303
> > > >>>
> > > >>> That once the new index is available it is swapped into the index ref
> > > >>> object and the old one is sent to the index closer. Once the refs to the
> > > >>> index are low enough it closes the index. Or at least it should.
> > > >>>
> > > >>> I will continue looking into the problem but I don't have a solution for
> > > >>> you yet.
> > > >>>
> > > >>> Aaron
> > > >>>
> > > >>> > *Exception(message:Unknown error during rewrite,
> > > >>> > stackTraceStr:java.io.IOException: Stream closed*
> > > >>> > at org.apache.hadoop.hdfs.DFSInputStream.pread(DFSInputStream.java:1385)
> > > >>> > at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:1374)
> > > >>> > at org.apache.hadoop.fs.FSDataInputStream.read(FSDataInputStream.java:89)
> > > >>> > at org.apache.blur.store.hdfs.HdfsIndexInput.readInternal(HdfsIndexInput.java:62)
> > > >>> > at org.apache.blur.store.buffer.ReusedBufferedIndexInput.readBytes(ReusedBufferedIndexInput.java:167)
> > > >>> > at org.apache.blur.store.buffer.ReusedBufferedIndexInput.readBytes(ReusedBufferedIndexInput.java:122)
> > > >>> > at org.apache.blur.store.hdfs.MmapCacheIndexInput.readAndcache(MmapCacheIndexInput.java:24)
> > > >>> > at org.apache.blur.store.blockcache_v2.CacheIndexInput.fillNormally(CacheIndexInput.java:354)
> > > >>> > at org.apache.blur.store.blockcache_v2.CacheIndexInput.fill(CacheIndexInput.java:379)
> > > >>> > at org.apache.blur.store.blockcache_v2.CacheIndexInput.tryToFill(CacheIndexInput.java:297)
> > > >>> > at org.apache.blur.store.blockcache_v2.CacheIndexInput.readByte(CacheIndexInput.java:151)
> > > >>> > at org.apache.blur.lucene.warmup.TraceableIndexInput.readByte(TraceableIndexInput.java:62)
> > > >>> > at org.apache.lucene.store.DataInput.readVInt(DataInput.java:108)
> > > >>> > at org.apache.lucene.codecs.BlockTreeTermsReader$FieldReader$SegmentTermsEnum$Frame.loadBlock(BlockTreeTermsReader.java:2366)
> > > >>> > at org.apache.lucene.codecs.BlockTreeTermsReader$FieldReader$SegmentTermsEnum.seekCeil(BlockTreeTermsReader.java:1949)
> > > >>> > at org.apache.blur.index.ExitableReader$ExitableTermsEnum.seekCeil(ExitableReader.java:250)
> > > >>> > at org.apache.lucene.index.FilteredTermsEnum.next(FilteredTermsEnum.java:225)
> > > >>> > at org.apache.lucene.search.TermCollectingRewrite.collectTerms(TermCollectingRewrite.java:78)
> > > >>> > at org.apache.lucene.search.ConstantScoreAutoRewrite.rewrite(ConstantScoreAutoRewrite.java:95)
> > > >>> > at org.apache.lucene.search.MultiTermQuery$ConstantScoreAutoRewrite.rewrite(MultiTermQuery.java:220)
> > > >>> > at org.apache.lucene.search.MultiTermQuery.rewrite(MultiTermQuery.java:288)
> > > >>> > at org.apache.lucene.search.BooleanQuery.rewrite(BooleanQuery.java:412)
> > > >>> > at org.apache.lucene.search.BooleanQuery.rewrite(BooleanQuery.java:412)
> > > >>> > at org.apache.lucene.search.BooleanQuery.rewrite(BooleanQuery.java:412)
> > > >>> > at
> > > >>> >
> > > >>> > On Mon, May 9, 2016 at 4:42 PM, Ravikumar Govindarajan <
> > > >>> > [email protected]> wrote:
> > > >>> >
> > > >>> > > One extra info we gleaned from the logs...
> > > >>> > >
> > > >>> > > 1. Merge Starts & is about to complete
> > > >>> > > 2. Searcher is opened
> > > >>> > > 3. Merge Completes
> > > >>> > > 4. Ref-count drops to 0 in IndexReader
> > > >>> > > 5. IndexReader closed while Searcher is still open
> > > >>> > >
> > > >>> > > This seems to be the main pattern for causing the Exception
> > > >>> > >
> > > >>> > > --
> > > >>> > > Ravi
> > > >>> > >
> > > >>> > > On Mon, May 9, 2016 at 3:08 PM, Ravikumar Govindarajan <
> > > >>> > > [email protected]> wrote:
> > > >>> > >
> > > >>> > >> Thanks Aaron...
> > > >>> > >>
> > > >>> > >> Just a quick question. Lucene itself has ref-counting to close its
> > > >>> > >> readers, no? Or Blur has its own logic to handle it?
> > > >>> > >>
> > > >>> > >> --
> > > >>> > >> Ravi
> > > >>> > >>
> > > >>> > >> On Fri, May 6, 2016 at 7:56 PM, Aaron McCurry <[email protected]>
> > > >>> > >> wrote:
> > > >>> > >>
> > > >>> > >>> Likely yes. If I have a few minutes this weekend I can look through that
> > > >>> > >>> version and see if I can point you in the right direction.
> > > >>> > >>>
> > > >>> > >>> On Fri, May 6, 2016 at 8:46 AM, Ravikumar Govindarajan <
> > > >>> > >>> [email protected]> wrote:
> > > >>> > >>>
> > > >>> > >>> > Sometimes during an ongoing search we receive an
> > > >>> > >>> > IndexReaderClosedException...
> > > >>> > >>> >
> > > >>> > >>> > We are on an older version of Blur (0.2.2). Has this been fixed in newer
> > > >>> > >>> > versions or we have been using it wrongly?
> > > >>> > >>> >
> > > >>> > >>> > *stackTraceStr:org.apache.lucene.store.AlreadyClosedException: this
> > > >>> > >>> > IndexReader cannot be used anymore as one of its child readers was closed*
> > > >>> > >>> > at org.apache.lucene.index.IndexReader.ensureOpen(IndexReader.java:257)
> > > >>> > >>> > at org.apache.lucene.index.FilterAtomicReader.fields(FilterAtomicReader.java:380)
> > > >>> > >>> > at org.apache.blur.index.ExitableReader$ExitableFilterAtomicReader.fields(ExitableReader.java:81)
> > > >>> > >>> > at org.apache.lucene.search.TermCollectingRewrite.collectTerms(TermCollectingRewrite.java:52)
> > > >>> > >>> > at org.apache.lucene.search.ConstantScoreAutoRewrite.rewrite(ConstantScoreAutoRewrite.java:95)
> > > >>> > >>> > at org.apache.lucene.search.MultiTermQuery$ConstantScoreAutoRewrite.rewrite(MultiTermQuery.java:220)
> > > >>> > >>> > at org.apache.lucene.search.MultiTermQuery.rewrite(MultiTermQuery.java:288)
> > > >>> > >>> > at org.apache.lucene.search.BooleanQuery.rewrite(BooleanQuery.java:412)
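
The failure pattern reported in this thread (merge completes, ref-count drops to 0, reader is closed while a searcher is still open) comes down to a search using a reader without holding a reference to it. A minimal sketch of the ref-counting idea follows, using only the JDK. `RefCountSketch` and `RefCountedIndex` are hypothetical names invented for illustration; this is not Blur's or Lucene's code. Lucene's `IndexReader` does expose `incRef()`, `tryIncRef()` and `decRef()`, and the stand-in below only mimics that style under simplified assumptions.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class RefCountSketch {
    /** Hypothetical stand-in for an index reader; not a Lucene or Blur class. */
    static final class RefCountedIndex {
        // Starts at 1: the reference held by the owner (e.g. the writer side).
        private final AtomicInteger refCount = new AtomicInteger(1);
        private volatile boolean closed = false;

        /** Try to take a reference; fails once the count has reached zero. */
        boolean tryIncRef() {
            int count;
            while ((count = refCount.get()) > 0) {
                if (refCount.compareAndSet(count, count + 1)) {
                    return true;
                }
            }
            return false;
        }

        /** Drop a reference; the last one out closes the underlying resources. */
        void decRef() {
            int remaining = refCount.decrementAndGet();
            if (remaining == 0) {
                closed = true; // a real reader would close its streams here
            } else if (remaining < 0) {
                throw new IllegalStateException("decRef called too many times");
            }
        }

        boolean isClosed() {
            return closed;
        }
    }

    public static void main(String[] args) {
        RefCountedIndex oldIndex = new RefCountedIndex();

        // Search starts: it acquires a reference before reading anything.
        boolean acquired = oldIndex.tryIncRef();

        // Merge/commit completes: the owner swaps in a new index and releases the old one.
        oldIndex.decRef();
        System.out.println("closed while search runs: " + oldIndex.isClosed());

        // Search finishes and releases its reference; only now is the old index closed.
        if (acquired) {
            oldIndex.decRef();
        }
        System.out.println("closed after search: " + oldIndex.isClosed());

        // A late search cannot revive an index that has already been closed.
        System.out.println("late acquire: " + oldIndex.tryIncRef());
    }
}
```

In this sketch the old index stays open for as long as the in-flight search holds its reference, and a search arriving after the close gets `false` from `tryIncRef()` and has to fetch the current index instead of hitting a closed stream.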
