Hi,

We have an index of about 550M records (~800GB), and once a week we merge thousands of mini-indexes into it using Hadoop - 45 mappers on 2 Hadoop nodes. After upgrading to Lucene 3.6.1 we noticed that the merge process continuously slows down.

After testing a couple of options, it looks like we found the source of the problem, but we have no idea how to fix it. What we do: first we merge all mini-indexes into one intermediate mini-index, and then merge that one into the big (final) index. The deciding factor is whether the merged mini-indexes contain deleted records:

- If there are no deleted records, the merge runs for about 2h, at roughly 0.5s-2s per mini-index.
- If there are deleted records, after about 10 minutes we see dramatic degradation in the time it takes to merge each mini-index into the intermediate one: the first 100-200 mini-indexes each merge in under a second, after 10 minutes a single mini-index takes more than 10s, and after an hour or two it takes a couple of minutes!
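For reference, the two-phase merge described above looks roughly like this. This is a minimal sketch, not our actual mapper code: the directory paths, the analyzer choice, and the way mini-index directories are listed are all placeholders.

```java
import java.io.File;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Version;

public class TwoPhaseMerge {
    public static void main(String[] args) throws Exception {
        // Phase 1: merge all mini-indexes into one intermediate index.
        // Paths are placeholders for illustration only.
        Directory intermediate = FSDirectory.open(new File("/data/intermediate"));
        IndexWriter writer = new IndexWriter(intermediate,
                new IndexWriterConfig(Version.LUCENE_36,
                        new StandardAnalyzer(Version.LUCENE_36)));
        for (File miniDir : new File("/data/mini-indexes").listFiles()) {
            // In 3.6, addIndexes(Directory...) is the successor of the
            // deprecated addIndexesNoOptimize(Directory...).
            writer.addIndexes(FSDirectory.open(miniDir));
        }
        writer.close();

        // Phase 2: merge the intermediate index into the big (final) index.
        Directory finalDir = FSDirectory.open(new File("/data/final"));
        IndexWriter finalWriter = new IndexWriter(finalDir,
                new IndexWriterConfig(Version.LUCENE_36,
                        new StandardAnalyzer(Version.LUCENE_36)));
        finalWriter.addIndexes(intermediate);
        finalWriter.close();
    }
}
```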
This is from a jstack of one of the mappers:

java.lang.Thread.State: RUNNABLE
    at java.lang.Thread.isAlive(Native Method)
    at org.apache.lucene.util.CloseableThreadLocal.purge(CloseableThreadLocal.java:115)
    - locked <0x00000007db0d6140> (a java.util.WeakHashMap)
    at org.apache.lucene.util.CloseableThreadLocal.maybePurge(CloseableThreadLocal.java:105)
    at org.apache.lucene.util.CloseableThreadLocal.get(CloseableThreadLocal.java:88)
    at org.apache.lucene.index.TermInfosReader.getThreadResources(TermInfosReader.java:160)
    at org.apache.lucene.index.TermInfosReader.get(TermInfosReader.java:184)
    at org.apache.lucene.index.TermInfosReader.get(TermInfosReader.java:172)
    at org.apache.lucene.index.SegmentTermDocs.seek(SegmentTermDocs.java:66)
    at org.apache.lucene.index.BufferedDeletesStream.applyTermDeletes(BufferedDeletesStream.java:346)
    - locked <0x00000007805766f0> (a org.apache.lucene.index.BufferedDeletesStream)
    at org.apache.lucene.index.BufferedDeletesStream.applyDeletes(BufferedDeletesStream.java:248)
    - locked <0x00000007805766f0> (a org.apache.lucene.index.BufferedDeletesStream)
    at org.apache.lucene.index.IndexWriter.doFlush(IndexWriter.java:3615)
    - locked <0x00000007805739a0> (a org.apache.lucene.index.IndexWriter)
    at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java:3552)
    at org.apache.lucene.index.IndexWriter.addIndexes(IndexWriter.java:3120)
    at org.apache.lucene.index.IndexWriter.addIndexesNoOptimize(IndexWriter.java:3064)

We tried using org.apache.lucene.index.IndexWriter.addIndexes instead of org.apache.lucene.index.IndexWriter.addIndexesNoOptimize - same behavior. How can we eliminate this behavior and improve the performance of our merge?

Thanks!

--
View this message in context: http://lucene.472066.n3.nabble.com/Merger-performance-degradation-on-3-6-1-tp4135593.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.
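One workaround we are considering, which follows directly from the observation that the slowdown only occurs when mini-indexes contain deleted records: rewrite each mini-index so it carries no deletions before it is merged, so that applyTermDeletes has nothing to apply. A minimal sketch, assuming each mini-index directory can be opened for write before the merge step (forceMergeDeletes() is the 3.6 name for the older expungeDeletes()):

```java
import java.io.File;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Version;

public class PurgeDeletesFirst {
    // Rewrite a mini-index so that its segments contain no deleted
    // documents. Called once per mini-index before the merge phase.
    static void purgeDeletes(File miniDir) throws Exception {
        Directory dir = FSDirectory.open(miniDir);
        IndexWriter w = new IndexWriter(dir,
                new IndexWriterConfig(Version.LUCENE_36,
                        new StandardAnalyzer(Version.LUCENE_36)));
        // Merges away segments with deletions; afterwards the index
        // has numDeletedDocs() == 0 and nothing for the slow
        // BufferedDeletesStream.applyTermDeletes path to do.
        w.forceMergeDeletes();
        w.close();
    }
}
```

This adds an extra rewrite pass per mini-index, but the mini-indexes are small, so that cost may be far cheaper than the minutes-per-merge degradation we see now.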