We have also written a patch that adds a high-water mark (15% above the
configured block-cache capacity) beyond which cache writes are stopped.

Once the clean-up thread reclaims capacity, we resume adding to the cache.
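
A rough sketch of how such a gate could look, for illustration only. The class
HighWaterMarkGate and its methods are hypothetical names, not part of Blur's
block-cache API, and the choice to resume writes only once usage falls back to
the configured capacity is an assumption about what "reclaimed" means here.

```java
/**
 * Hypothetical high-water-mark gate for a block cache (not Blur's actual API).
 * Cache writes are refused once usage would exceed capacity + excessPercent,
 * and stay refused until usage drops back to the configured capacity.
 */
final class HighWaterMarkGate {
    private final long capacityBytes;
    private final long highWaterMarkBytes;
    private long usedBytes;
    private boolean paused;

    HighWaterMarkGate(long capacityBytes, int excessPercent) {
        this.capacityBytes = capacityBytes;
        // Integer arithmetic: e.g. 1000 bytes with 15% excess gives 1150.
        this.highWaterMarkBytes = capacityBytes + capacityBytes * excessPercent / 100;
    }

    /** Called on the write path; returns false when the block must not be cached. */
    synchronized boolean tryAdd(long blockBytes) {
        if (paused) {
            return false;
        }
        if (usedBytes + blockBytes > highWaterMarkBytes) {
            paused = true; // stop cache writes until the clean-up thread catches up
            return false;
        }
        usedBytes += blockBytes;
        return true;
    }

    /** Called by the clean-up thread after it evicts blocks. */
    synchronized void release(long blockBytes) {
        usedBytes -= blockBytes;
        if (usedBytes <= capacityBytes) {
            paused = false; // capacity reclaimed, resume adding to the cache
        }
    }

    synchronized long usedBytes() {
        return usedBytes;
    }

    synchronized boolean isPaused() {
        return paused;
    }
}
```

A write that is refused simply bypasses the cache and goes straight to the
underlying store, so a refused write costs a cache miss later rather than an
error.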

On Mon, Jul 18, 2016 at 1:58 PM, Ravikumar Govindarajan <
[email protected]> wrote:

> We had an issue with the block-cache growing beyond its configured size &
> shrinking only very rarely. The sequence of events is as follows:
>
>    1. Shard receives incoming mutations, adds them to the index & triggers
>    a background merge.
>    2. Merge produces a new set of files. We have write-through cache
>    enabled, so the new files are added to the block-cache.
>    3. Shard goes silent & doesn't receive any mutations for many minutes
>    altogether.
>    4. Since we perform a commit only upon receiving mutations, the
>    older files are not evicted from the block-cache.
>    5. The problem is exacerbated with a KeepNLastCommit policy, where even
>    after a commit, unused files are not evicted from the block-cache.
>
>
> We are planning to patch SharedMergeScheduler to refresh the IndexReader
> when a merge completes & then delete the merged files from the block-cache.
> This way, I believe the block-cache can be reined in whenever it exceeds
> capacity, irrespective of the commit policy used.
>
> Do let me know if this is fine...
>
> On Thu, Jun 16, 2016 at 4:33 PM, Ravikumar Govindarajan <
> [email protected]> wrote:
>
>>  I didn't fully understand the underlying Lucene reader, writer,
>>> open, close semantics
>>
>>
>> I too don't know the correct behavior. Lucene code is incredibly hairy to
>> follow... :)
>>
>> Have pinged lucene mailing list. Hope someone replies...
>>
>> On Tue, Jun 7, 2016 at 4:46 PM, Aaron McCurry <[email protected]> wrote:
>>
>>> On Wed, Jun 1, 2016 at 7:34 AM, Ravikumar Govindarajan <
>>> [email protected]> wrote:
>>>
>>> > Just one more observation here...
>>> >
>>> > Even if readerPooling is set to true, Lucene has 2 readers (one for
>>> > search & one for updates/deletes).
>>> >
>>> > But with pooling, the reader for updates/deletes is not opened/closed
>>> > on every commit call (which is the default behavior as of today). It is
>>> > opened only once (during the first update/delete call).
>>> >
>>>
>>> I will take a closer look at the code for this one.  Likely when I wrote
>>> this code I didn't fully understand the underlying Lucene reader, writer,
>>> open, close semantics.  Thank you for pointing this out!
>>>
>>> Aaron
>>>
>>>
>>> >
>>> > On Wed, Jun 1, 2016 at 3:10 PM, Ravikumar Govindarajan <
>>> > [email protected]> wrote:
>>> >
>>> > >> In newer versions of the code there are multiple streams involved. One
>>> > >> for each open file handle plus if a sequential read is detected a new
>>> > >> stream is created for the instance for better performance
>>> > >
>>> > >
>>> > > Great. We just patched up our Blur version with this code.
>>> > >
>>> > > While I was digging into the reader-closed issue, I was quite surprised
>>> > > to observe the following behavior:
>>> > >
>>> > >    - Issue a commit
>>> > >    - Lucene opens a new reader via IndexWriter. (Doesn't re-use our
>>> > >    already opened DirectoryReader)
>>> > >    - Processes all updates/deletes/merges
>>> > >    - Closes the new reader
>>> > >    - Complete commit
>>> > >
>>> > > For a big index & lots of commits, opening a new reader for every
>>> > > commit is prohibitively expensive.
>>> > >
>>> > >
>>> > > Here is the JIRA for it...
>>> > > https://issues.apache.org/jira/browse/LUCENE-2297
>>> > >
>>> > > All we need to do is set "readerPooling=true" in the
>>> > > IndexWriterConfig class.
>>> > >
>>> > > Please do explore this option when you find time.
>>> > >
>>> > > --
>>> > > Ravi
>>> > >
>>> > >
>>> > >
>>> > > On Tue, May 24, 2016 at 7:48 PM, Aaron McCurry <[email protected]> wrote:
>>> > >
>>> > >> On Tue, May 24, 2016 at 6:06 AM, Ravikumar Govindarajan <
>>> > >> [email protected]> wrote:
>>> > >>
>>> > >> > We have solved it temporarily by using a KeepLastTwoCommits deletion
>>> > >> > policy. We don't get these exceptions now!
>>> > >> >
>>> > >>
>>> > >> Great!
>>> > >>
>>> > >>
>>> > >> >
>>> > >> > Btw, I see that pread calls in FSDataInputStream.java are
>>> > >> > synchronized. Is it possible that merge DFS read calls could
>>> > >> > block search DFS read calls?
>>> > >> >
>>> > >>
>>> > >> Yes.
>>> > >>
>>> > >>
>>> > >> >
>>> > >> > Would it be a good idea to have 2 DFSInputStreams for every file, one
>>> > >> > for merge & another for search?
>>> > >> >
>>> > >>
>>> > >> In newer versions of the code there are multiple streams involved: one
>>> > >> for each open file handle, plus, if a sequential read is detected, a new
>>> > >> stream is created for the instance for better performance.  Check out
>>> > >> the HdfsDirectory class.
>>> > >>
>>> > >> Aaron
>>> > >>
>>> > >>
>>> > >> >
>>> > >> > On Tue, May 10, 2016 at 7:43 PM, Ravikumar Govindarajan <
>>> > >> > [email protected]> wrote:
>>> > >> >
>>> > >> > > Sorry, I misunderstood the code.
>>> > >> > > I see that it has 2 locks, IndexRefreshWriteLock &
>>> > >> > > IndexRefreshReadLock. They look to be separate.
>>> > >> > >
>>> > >> > > On Tue, May 10, 2016 at 7:16 PM, Ravikumar Govindarajan <
>>> > >> > > [email protected]> wrote:
>>> > >> > >
>>> > >> > >> Thanks a lot Aaron.
>>> > >> > >>
>>> > >> > >> I guess we took a commit of 0.2.2 that doesn't have the
>>> > >> > >> IndexRefreshWriteLock (IRWL). It looks like it coordinates between
>>> > >> > >> searches & incoming mutation commits. If so, then it will likely
>>> > >> > >> solve the first issue for us (AlreadyClosedException).
>>> > >> > >>
>>> > >> > >>
>>> > >> > >> Can you recollect if that was the reason IRWL was introduced?
>>> > >> > >>
>>> > >> > >> On Tue, May 10, 2016 at 6:40 PM, Aaron McCurry <[email protected]> wrote:
>>> > >> > >>
>>> > >> > >>> On Tue, May 10, 2016 at 2:30 AM, Ravikumar Govindarajan <
>>> > >> > >>> [email protected]> wrote:
>>> > >> > >>>
>>> > >> > >>> > Actually there are 2 issues...
>>> > >> > >>> >
>>> > >> > >>> > 1. IndexReaderClosedException
>>> > >> > >>> > 2. HDFS Stream Closed
>>> > >> > >>> >
>>> > >> > >>>
>>> > >> > >>> Likely when the index is closed it closes the underlying IndexInputs
>>> > >> > >>> as well, causing the HDFS "Stream closed" exception.
>>> > >> > >>>
>>> > >> > >>>
>>> > >> > >>> >
>>> > >> > >>> > Merge completion results in file deletion & ultimately an HDFS
>>> > >> > >>> > "Stream closed" exception during search.
>>> > >> > >>> >
>>> > >> > >>> > I use IndexFileDeleter with KeepOnlyLastCommitDeletionPolicy. This
>>> > >> > >>> > blindly deletes the file, without bothering to cross-check that
>>> > >> > >>> > IndexReader.RefCount > 0.
>>> > >> > >>> >
>>> > >> > >>>
>>> > >> > >>> Hmm.  You can see here:
>>> > >> > >>>
>>> > >> > >>> https://github.com/apache/incubator-blur/blob/release-0.2.2-incubating/blur-core/src/main/java/org/apache/blur/manager/writer/BlurIndexSimpleWriter.java#L303
>>> > >> > >>>
>>> > >> > >>> That once the new index is available it is swapped into the index ref
>>> > >> > >>> object and the old one is sent to the index closer.  Once the refs to
>>> > >> > >>> the index are low enough it closes the index.  Or at least it should.
>>> > >> > >>>
>>> > >> > >>> I will continue looking into the problem but I don't have a solution
>>> > >> > >>> for you yet.
>>> > >> > >>>
>>> > >> > >>> Aaron
>>> > >> > >>>
>>> > >> > >>>
>>> > >> > >>>
>>> > >> > >>> >
>>> > >> > >>> >
>>> > >> > >>> > *Exception(message:Unknown error during rewrite,
>>> > >> > >>> > stackTraceStr:java.io.IOException: Stream closed*
>>> > >> > >>> > at org.apache.hadoop.hdfs.DFSInputStream.pread(DFSInputStream.java:1385)
>>> > >> > >>> > at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:1374)
>>> > >> > >>> > at org.apache.hadoop.fs.FSDataInputStream.read(FSDataInputStream.java:89)
>>> > >> > >>> > at org.apache.blur.store.hdfs.HdfsIndexInput.readInternal(HdfsIndexInput.java:62)
>>> > >> > >>> > at org.apache.blur.store.buffer.ReusedBufferedIndexInput.readBytes(ReusedBufferedIndexInput.java:167)
>>> > >> > >>> > at org.apache.blur.store.buffer.ReusedBufferedIndexInput.readBytes(ReusedBufferedIndexInput.java:122)
>>> > >> > >>> > at org.apache.blur.store.hdfs.MmapCacheIndexInput.readAndcache(MmapCacheIndexInput.java:24)
>>> > >> > >>> > at org.apache.blur.store.blockcache_v2.CacheIndexInput.fillNormally(CacheIndexInput.java:354)
>>> > >> > >>> > at org.apache.blur.store.blockcache_v2.CacheIndexInput.fill(CacheIndexInput.java:379)
>>> > >> > >>> > at org.apache.blur.store.blockcache_v2.CacheIndexInput.tryToFill(CacheIndexInput.java:297)
>>> > >> > >>> > at org.apache.blur.store.blockcache_v2.CacheIndexInput.readByte(CacheIndexInput.java:151)
>>> > >> > >>> > at org.apache.blur.lucene.warmup.TraceableIndexInput.readByte(TraceableIndexInput.java:62)
>>> > >> > >>> > at org.apache.lucene.store.DataInput.readVInt(DataInput.java:108)
>>> > >> > >>> > at org.apache.lucene.codecs.BlockTreeTermsReader$FieldReader$SegmentTermsEnum$Frame.loadBlock(BlockTreeTermsReader.java:2366)
>>> > >> > >>> > at org.apache.lucene.codecs.BlockTreeTermsReader$FieldReader$SegmentTermsEnum.seekCeil(BlockTreeTermsReader.java:1949)
>>> > >> > >>> > at org.apache.blur.index.ExitableReader$ExitableTermsEnum.seekCeil(ExitableReader.java:250)
>>> > >> > >>> > at org.apache.lucene.index.FilteredTermsEnum.next(FilteredTermsEnum.java:225)
>>> > >> > >>> > at org.apache.lucene.search.TermCollectingRewrite.collectTerms(TermCollectingRewrite.java:78)
>>> > >> > >>> > at org.apache.lucene.search.ConstantScoreAutoRewrite.rewrite(ConstantScoreAutoRewrite.java:95)
>>> > >> > >>> > at org.apache.lucene.search.MultiTermQuery$ConstantScoreAutoRewrite.rewrite(MultiTermQuery.java:220)
>>> > >> > >>> > at org.apache.lucene.search.MultiTermQuery.rewrite(MultiTermQuery.java:288)
>>> > >> > >>> > at org.apache.lucene.search.BooleanQuery.rewrite(BooleanQuery.java:412)
>>> > >> > >>> > at org.apache.lucene.search.BooleanQuery.rewrite(BooleanQuery.java:412)
>>> > >> > >>> > at org.apache.lucene.search.BooleanQuery.rewrite(BooleanQuery.java:412)
>>> > >> > >>> > at
>>> > >> > >>> >
>>> > >> > >>> > On Mon, May 9, 2016 at 4:42 PM, Ravikumar Govindarajan <
>>> > >> > >>> > [email protected]> wrote:
>>> > >> > >>> >
>>> > >> > >>> > > One extra info we gleaned from the logs...
>>> > >> > >>> > >
>>> > >> > >>> > > 1. Merge Starts & is about to complete
>>> > >> > >>> > > 2. Searcher is opened
>>> > >> > >>> > > 3. Merge Completes
>>> > >> > >>> > > 4. Ref-count drops to 0 in IndexReader
>>> > >> > >>> > > 5. IndexReader closed while Searcher is still open
>>> > >> > >>> > >
>>> > >> > >>> > > This seems to be the main pattern causing the exception.
>>> > >> > >>> > >
>>> > >> > >>> > > --
>>> > >> > >>> > > Ravi
>>> > >> > >>> > >
>>> > >> > >>> > > On Mon, May 9, 2016 at 3:08 PM, Ravikumar Govindarajan <
>>> > >> > >>> > > [email protected]> wrote:
>>> > >> > >>> > >
>>> > >> > >>> > >> Thanks Aaron...
>>> > >> > >>> > >>
>>> > >> > >>> > >> Just a quick question. Lucene itself has ref-counting to close its
>>> > >> > >>> > >> readers, no? Or does Blur have its own logic to handle it?
>>> > >> > >>> > >>
>>> > >> > >>> > >> --
>>> > >> > >>> > >> Ravi
>>> > >> > >>> > >>
>>> > >> > >>> > >> On Fri, May 6, 2016 at 7:56 PM, Aaron McCurry <[email protected]> wrote:
>>> > >> > >>> > >>
>>> > >> > >>> > >>> Likely yes.  If I have a few minutes this weekend I can look through
>>> > >> > >>> > >>> that version and see if I can point you in the right direction.
>>> > >> > >>> > >>>
>>> > >> > >>> > >>> On Fri, May 6, 2016 at 8:46 AM, Ravikumar Govindarajan <
>>> > >> > >>> > >>> [email protected]> wrote:
>>> > >> > >>> > >>>
>>> > >> > >>> > >>> > Sometimes during an ongoing search we receive an
>>> > >> > >>> > >>> > IndexReaderClosedException...
>>> > >> > >>> > >>> >
>>> > >> > >>> > >>> > We are on an older version of Blur (0.2.2). Has this been fixed in
>>> > >> > >>> > >>> > newer versions, or have we been using it wrongly?
>>> > >> > >>> > >>> >
>>> > >> > >>> > >>> >
>>> > >> > >>> > >>> > *stackTraceStr:org.apache.lucene.store.AlreadyClosedException: this
>>> > >> > >>> > >>> > IndexReader cannot be used anymore as one of its child readers was
>>> > >> > >>> > >>> > closed*
>>> > >> > >>> > >>> > at org.apache.lucene.index.IndexReader.ensureOpen(IndexReader.java:257)
>>> > >> > >>> > >>> > at org.apache.lucene.index.FilterAtomicReader.fields(FilterAtomicReader.java:380)
>>> > >> > >>> > >>> > at org.apache.blur.index.ExitableReader$ExitableFilterAtomicReader.fields(ExitableReader.java:81)
>>> > >> > >>> > >>> > at org.apache.lucene.search.TermCollectingRewrite.collectTerms(TermCollectingRewrite.java:52)
>>> > >> > >>> > >>> > at org.apache.lucene.search.ConstantScoreAutoRewrite.rewrite(ConstantScoreAutoRewrite.java:95)
>>> > >> > >>> > >>> > at org.apache.lucene.search.MultiTermQuery$ConstantScoreAutoRewrite.rewrite(MultiTermQuery.java:220)
>>> > >> > >>> > >>> > at org.apache.lucene.search.MultiTermQuery.rewrite(MultiTermQuery.java:288)
>>> > >> > >>> > >>> > at org.apache.lucene.search.BooleanQuery.rewrite(BooleanQuery.java:412)
>>> > >> > >>> > >>> >
>>> > >> > >>> > >>>
>>> > >> > >>> > >>
>>> > >> > >>> > >>
>>> > >> > >>> > >
>>> > >> > >>> >
>>> > >> > >>>
>>> > >> > >>
>>> > >> > >>
>>> > >> > >
>>> > >> >
>>> > >>
>>> > >
>>> > >
>>> >
>>>
>>
>>
>
