Hmm, try calling maybeMerge after each .addIndexes?

Robert opened this issue to fix addIndexes:
https://issues.apache.org/jira/browse/LUCENE-5672
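
A minimal sketch of the maybeMerge suggestion, not tested against the setup described below. To stay self-contained it uses a hypothetical MergeTarget interface in place of IndexWriter; in Lucene 3.6 the corresponding calls would be IndexWriter.addIndexes(Directory...) and IndexWriter.maybeMerge(). The class and method names here are illustrative assumptions, and this thread does not confirm that the change removes the slowdown.

```java
import java.io.IOException;
import java.util.List;

final class MergeAfterEachAdd {

    /** Hypothetical stand-in for the two IndexWriter calls used below. */
    interface MergeTarget {
        // With Lucene 3.6 this would be writer.addIndexes(FSDirectory.open(...)).
        void addIndexes(String miniIndexPath) throws IOException;

        // With Lucene 3.6 this would be writer.maybeMerge().
        void maybeMerge() throws IOException;
    }

    /**
     * Adds each mini-index and asks the target to consider merging right
     * after every add, instead of leaving merge selection until the end.
     * Returns the number of mini-indexes added.
     */
    static int addAll(MergeTarget target, List<String> miniIndexPaths) throws IOException {
        int added = 0;
        for (String path : miniIndexPaths) {
            target.addIndexes(path);
            target.maybeMerge();
            added++;
        }
        return added;
    }
}
```

In a mapper, MergeTarget would be a thin wrapper around the intermediate index's IndexWriter, and miniIndexPaths would be the mini-indexes assigned to that mapper.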

Mike McCandless

http://blog.mikemccandless.com


On Wed, May 14, 2014 at 11:46 AM, danielv <dani...@exlibris.co.il> wrote:
> Hi,
>
> We have an index of about 550M records (~800GB), and once a week we merge
> thousands of mini-indexes using Hadoop: 45 mappers on 2 Hadoop nodes.
> After upgrading to Lucene 3.6.1 we noticed that the merge process
> slows down continuously.
> After testing a couple of options, it looks like we found the source of the
> problem, but we have no idea how to fix it.
> What we do: first we merge all mini-indexes into one intermediate mini-index,
> and then we merge that one into the big (final) one.
> The difference is whether the mini-indexes contain deleted_records:
> If the merged mini-indexes have no deleted_records, the merger runs for
> about 2h, taking about 0.5s-2s per mini-index.
> If we have deleted_records, after about 10 minutes we see a dramatic
> degradation in the time it takes to merge mini-indexes into the
> intermediate one (the first 100-200 mini-index merges take less than a
> second each; after 10 minutes it takes more than 10s for one mini-index,
> and after an hour or two it is a couple of minutes!)
>
> This is from a jstack of a mapper:
>
>    java.lang.Thread.State: RUNNABLE
>         at java.lang.Thread.isAlive(Native Method)
>         at org.apache.lucene.util.CloseableThreadLocal.purge(CloseableThreadLocal.java:115)
>         - locked <0x00000007db0d6140> (a java.util.WeakHashMap)
>         at org.apache.lucene.util.CloseableThreadLocal.maybePurge(CloseableThreadLocal.java:105)
>         at org.apache.lucene.util.CloseableThreadLocal.get(CloseableThreadLocal.java:88)
>         at org.apache.lucene.index.TermInfosReader.getThreadResources(TermInfosReader.java:160)
>         at org.apache.lucene.index.TermInfosReader.get(TermInfosReader.java:184)
>         at org.apache.lucene.index.TermInfosReader.get(TermInfosReader.java:172)
>         at org.apache.lucene.index.SegmentTermDocs.seek(SegmentTermDocs.java:66)
>         at org.apache.lucene.index.BufferedDeletesStream.applyTermDeletes(BufferedDeletesStream.java:346)
>         - locked <0x00000007805766f0> (a org.apache.lucene.index.BufferedDeletesStream)
>         at org.apache.lucene.index.BufferedDeletesStream.applyDeletes(BufferedDeletesStream.java:248)
>         - locked <0x00000007805766f0> (a org.apache.lucene.index.BufferedDeletesStream)
>         at org.apache.lucene.index.IndexWriter.doFlush(IndexWriter.java:3615)
>         - locked <0x00000007805739a0> (a org.apache.lucene.index.IndexWriter)
>         at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java:3552)
>         at org.apache.lucene.index.IndexWriter.addIndexes(IndexWriter.java:3120)
>         at org.apache.lucene.index.IndexWriter.addIndexesNoOptimize(IndexWriter.java:3064)
>
> We tried using org.apache.lucene.index.IndexWriter.addIndexes instead of
> org.apache.lucene.index.IndexWriter.addIndexesNoOptimize: same behavior.
>
> How can we eliminate this behavior and improve the performance of our
> merge?
>
> Thanks!
>
>
>
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Merger-performance-degradation-on-3-6-1-tp4135593.html
> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-user-h...@lucene.apache.org
>

