Here is a snippet from the RAMDirectory JavaDocs:
* <p><b>Warning:</b> This class is not intended to work with huge
* indexes. Everything beyond several hundred megabytes will waste
* resources (GC cycles), because it uses an internal buffer size
* of 1024 bytes, producing millions of {@code byte[1024]} arrays.
* This class is optimized for small memory-resident indexes.
* It also has bad concurrency on multithreaded environments.
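To put the warning in numbers: with a fixed 1024-byte buffer, the number of `byte[1024]` arrays grows linearly with index size. A minimal back-of-the-envelope sketch (the 1 GB figure is just an illustration, not from the JavaDocs):

```java
public class RamBufferCount {
    // RAMFile uses a fixed buffer size of 1024 bytes, so an index of
    // N bytes is split into ceil(N / 1024) separate byte[1024] arrays.
    static final long BUFFER_SIZE = 1024;

    static long buffersFor(long indexBytes) {
        return (indexBytes + BUFFER_SIZE - 1) / BUFFER_SIZE;
    }

    public static void main(String[] args) {
        long oneGb = 1024L * 1024 * 1024;
        // Over a million small arrays for a 1 GB index -- each a separate
        // object the garbage collector has to track.
        System.out.println(buffersFor(oneGb)); // 1048576
    }
}
```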
simon
On Sat, Jul 7, 2012 at 1:29 PM, Simon Willnauer
<[email protected]> wrote:
> On Fri, Jul 6, 2012 at 9:28 PM, Leon Rosenberg <[email protected]>
> wrote:
>> Hello,
>>
>> we have a small internet shop which uses Lucene for product search.
>> With increasing traffic we have a continuous problem with literally
>> hundreds of threads being BLOCKED in Lucene code:
>>
>> here is an example taken with jstack on prod system:
>>
>> "ajp-0.0.0.0-8009-1504" daemon prio=10 tid=0x00007f81a518f800 nid=0x6554 waiting for monitor entry [0x00007f8142c29000]
>>    java.lang.Thread.State: BLOCKED (on object monitor)
>>     at org.apache.lucene.store.RAMFile.numBuffers(RAMFile.java:79)
>>     - waiting to lock <0x000000054b1daa98> (a org.apache.lucene.store.RAMFile)
>>     at org.apache.lucene.store.RAMInputStream.switchCurrentBuffer(RAMInputStream.java:87)
>>     at org.apache.lucene.store.RAMInputStream.readBytes(RAMInputStream.java:73)
>>     at org.apache.lucene.index.TermBuffer.read(TermBuffer.java:82)
>>     at org.apache.lucene.index.SegmentTermEnum.next(SegmentTermEnum.java:131)
>>     at org.apache.lucene.search.FilteredTermEnum.next(FilteredTermEnum.java:77)
>>     at org.apache.lucene.search.MultiTermQueryWrapperFilter.getDocIdSet(MultiTermQueryWrapperFilter.java:131)
>>     at org.apache.lucene.search.ConstantScoreQuery$ConstantWeight.scorer(ConstantScoreQuery.java:139)
>>     at org.apache.lucene.search.BooleanQuery$BooleanWeight.scorer(BooleanQuery.java:298)
>>     at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:524)
>>     at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:391)
>>     at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:298)
>>
>> we have 1172 threads in the same state.
>>
>> there are two threads which already hold locks on the object:
>>
>> "ajp-0.0.0.0-8009-548" daemon prio=10 tid=0x00007f81a510b000 nid=0x1c64 waiting for monitor entry [0x00007f81897d5000]
>>    java.lang.Thread.State: BLOCKED (on object monitor)
>>     at org.apache.lucene.store.RAMFile.getBuffer(RAMFile.java:75)
>>     - locked <0x000000054b1daa98> (a org.apache.lucene.store.RAMFile)
>>     at org.apache.lucene.store.RAMInputStream.switchCurrentBuffer(RAMInputStream.java:97)
>>     at org.apache.lucene.store.RAMInputStream.readBytes(RAMInputStream.java:73)
>>     at org.apache.lucene.index.TermBuffer.read(TermBuffer.java:82)
>>     at org.apache.lucene.index.SegmentTermEnum.next(SegmentTermEnum.java:131)
>>     at org.apache.lucene.search.FilteredTermEnum.next(FilteredTermEnum.java:77)
>>     at org.apache.lucene.search.TermCollectingRewrite.collectTerms(TermCollectingRewrite.java:40)
>>     at org.apache.lucene.search.TopTermsRewrite.rewrite(TopTermsRewrite.java:58)
>>     at org.apache.lucene.search.MultiTermQuery.rewrite(MultiTermQuery.java:296)
>>     at org.apache.lucene.search.BooleanQuery.rewrite(BooleanQuery.java:378)
>>     at org.apache.lucene.search.BooleanQuery.rewrite(BooleanQuery.java:378)
>>     at org.apache.lucene.search.IndexSearcher.rewrite(IndexSearcher.java:589)
>>     at org.apache.lucene.search.Searcher.createNormalizedWeight(Searcher.java:167)
>>     at org.apache.lucene.search.IndexSearcher.createNormalizedWeight(IndexSearcher.java:661)
>>     at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:298)
>>
>>
>> "ajp-0.0.0.0-8009-663" daemon prio=10 tid=0x00007f81c503d800 nid=0x225b waiting for monitor entry [0x00007f8182462000]
>>    java.lang.Thread.State: BLOCKED (on object monitor)
>>     at org.apache.lucene.store.RAMFile.numBuffers(RAMFile.java:79)
>>     - locked <0x000000054b1daa98> (a org.apache.lucene.store.RAMFile)
>>     at org.apache.lucene.store.RAMInputStream.switchCurrentBuffer(RAMInputStream.java:87)
>>     at org.apache.lucene.store.RAMInputStream.readByte(RAMInputStream.java:63)
>>     at org.apache.lucene.store.DataInput.readVInt(DataInput.java:105)
>>     at org.apache.lucene.index.TermBuffer.read(TermBuffer.java:64)
>>     at org.apache.lucene.index.SegmentTermEnum.next(SegmentTermEnum.java:131)
>>     at org.apache.lucene.search.FilteredTermEnum.next(FilteredTermEnum.java:77)
>>     at org.apache.lucene.search.FilteredTermEnum.setEnum(FilteredTermEnum.java:56)
>>     at org.apache.lucene.search.WildcardTermEnum.<init>(WildcardTermEnum.java:65)
>>     at org.apache.lucene.search.WildcardQuery.getEnum(WildcardQuery.java:59)
>>     at org.apache.lucene.search.MultiTermQueryWrapperFilter.getDocIdSet(MultiTermQueryWrapperFilter.java:103)
>>     at org.apache.lucene.search.ConstantScoreQuery$ConstantWeight.scorer(ConstantScoreQuery.java:139)
>>     at org.apache.lucene.search.BooleanQuery$BooleanWeight.scorer(BooleanQuery.java:298)
>>     at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:524)
>>     at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:391)
>>     at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:298)
>>
>> after some time the BLOCKED threads end up waiting to lock another
>> object, and over time the number of blocked threads increases.
>>
>> As a result, the machine is constantly context-switching and runs at
>> 100% CPU load (per vmstat and top).
>>
>> The responsiveness of search is extremely poor; a single search takes minutes.
>> At the moment of the stack trace we had approx. 1000 sessions (but the
>> problems occurred with 400 too) and 1500 threads (we start with 80).
>>
>> We reindex every 10 minutes because of changes in the
>> back-office system.
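One common way to keep queries from blocking across such periodic reindexing is to build the new index on the side and publish it with a single atomic reference swap, so readers never take a shared lock. A minimal sketch of the swap pattern; the `Searcher` class here is a hypothetical stand-in for whatever wraps the real IndexSearcher, not a Lucene API:

```java
import java.util.concurrent.atomic.AtomicReference;

public class SearcherSwap {
    // Hypothetical stand-in for an object wrapping an IndexSearcher.
    static final class Searcher {
        final String version;
        Searcher(String version) { this.version = version; }
        String search(String q) { return version + ":" + q; }
    }

    private final AtomicReference<Searcher> current =
            new AtomicReference<Searcher>(new Searcher("v1"));

    // Queries read the current searcher without taking any lock.
    String search(String q) {
        return current.get().search(q);
    }

    // The periodic reindex job builds the new index aside and publishes
    // it in one atomic step; in-flight queries keep using the old
    // searcher until they finish.
    void publish(Searcher fresh) {
        Searcher old = current.getAndSet(fresh);
        // old.close() would go here once no query still uses it
        // (e.g. via reference counting, as Lucene's SearcherManager does).
    }
}
```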
>>
>> any ideas what is happening here or what we are doing wrong?
>>
>> we replaced the call to Lucene with String.indexOf() to check whether
>> the problem is in our code; that version did not show the problematic behavior.
>>
>> Is there a non-blocking search alternative in lucene?
>
> why don't you just use FSDirectory instead of RAMDirectory? Speed is
> usually better too and it doesn't have the problems that RAMDirectory
> has.
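For reference, the switch is usually a one-line change at the point where the Directory is created; the OS file cache then keeps hot parts of the index in memory without RAMFile's per-buffer locking. A sketch against the Lucene 3.x API of the time (the index path is purely illustrative):

```java
import java.io.File;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

public class OpenFsIndex {
    public static IndexSearcher open() throws Exception {
        // Instead of: Directory dir = new RAMDirectory(...);
        // FSDirectory.open picks a platform-appropriate implementation
        // (e.g. NIOFSDirectory or MMapDirectory).
        Directory dir = FSDirectory.open(new File("/path/to/index")); // illustrative path
        IndexReader reader = IndexReader.open(dir); // read-only in 3.x
        return new IndexSearcher(reader);
    }
}
```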
>
> simon
>> thanks in advance
>> Leon
>>
>> P.S.
>> some data about machine, java, etc
>>
>>
>>
>> machine: quad-core, Ubuntu Linux, kernel 2.6.32, 24 GB RAM
>> java version "1.6.0_30"
>> Java(TM) SE Runtime Environment (build 1.6.0_30-b12)
>> Java HotSpot(TM) 64-Bit Server VM (build 20.5-b03, mixed mode)
>>
>> memory config:
>> jmap -heap 29596
>> Attaching to process ID 29596, please wait...
>> Debugger attached successfully.
>> Server compiler detected.
>> JVM version is 20.5-b03
>>
>> using thread-local object allocation.
>> Parallel GC with 8 thread(s)
>>
>> Heap Configuration:
>> MinHeapFreeRatio = 40
>> MaxHeapFreeRatio = 70
>> MaxHeapSize = 17179869184 (16384.0MB)
>> NewSize = 1310720 (1.25MB)
>> MaxNewSize = 17592186044415 MB
>> OldSize = 5439488 (5.1875MB)
>> NewRatio = 2
>> SurvivorRatio = 8
>> PermSize = 21757952 (20.75MB)
>> MaxPermSize = 4294967296 (4096.0MB)
>>
>>
>> The machine runs with ParallelOldGC and has a full GC every 15-20 minutes.
>> The app runs in jboss 6 with jsf.
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: [email protected]
>> For additional commands, e-mail: [email protected]
>>