Looking further at the Java error, those crashes are mostly related to GC:

VM_Operation (0x0000000041b429e0): parallel gc failed allocation, mode: safepoint, requested by thread 0x00002aab1988c400

I'm following http://java.sun.com/javase/6/webnotes/trouble/TSG-VM/html/gbyzo.html to see whether the workaround there does the trick. If not, I'll try it on a different (better) machine.

Thanks!
-Gaku

Yonik Seeley wrote:
>
> It's most likely a
> 1) hardware issue: bad memory
> OR
> 2) incompatible libraries (most likely libc version for the JVM).
>
> If you have another box around, try that.
>
> -Yonik
>
> On Thu, May 29, 2008 at 9:51 PM, Gaku Mak <[EMAIL PROTECTED]> wrote:
>>
>> Hi Yonik and others,
>>
>> I'm getting this java error after switching to JVM 1.6.0_3. This error
>> occurs after the stress test has been going for a while and failed at 12K
>> docs level and at 18K again. Am I doing something wrong? Please help!
>>
>> Thanks!
>>
>> #
>> # An unexpected error has been detected by Java Runtime Environment:
>> #
>> # SIGSEGV (0xb) at pc=0x00002aaaaadfbf6d, pid=25030, tid=1079175504
>> #
>> # Java VM: Java HotSpot(TM) 64-Bit Server VM (1.6.0_03-b05 mixed mode)
>> # Problematic frame:
>> # V [libjvm.so+0x230f6d]
>> #
>> # An error report file with more information is saved as hs_err_pid25030.log
>> #
>> # If you would like to submit a bug report, please visit:
>> # http://java.sun.com/webapps/bugreport/crash.jsp
>> #
>>
>> -Gaku
>>
>>
>> Yonik Seeley wrote:
>>>
>>> On Wed, May 28, 2008 at 10:30 PM, Gaku Mak <[EMAIL PROTECTED]> wrote:
>>>> I used the admin GUI to get the java info.
>>>> java.vm.specification.vendor = Sun Microsystems Inc.
>>> Well, your original email listed IcedTea... but that is mostly Sun
>>> code, so maybe that's why the vendor is still listed as Sun.
>>>
>>> I'd recommend downloading 1.6.0_3 from java.sun.com and trying that.
>>>
>>> Later versions (1.6.0_04+) have a JVM bug that bites Lucene, so stick
>>> with 1.6.0_03 for now.
>>>
>>> -Yonik
>>>
>>>
>>>> Any suggestion? Thanks a lot for your help!!
>>>>
>>>> -Gaku
>>>>
>>>>
>>>> Yonik Seeley wrote:
>>>>>
>>>>> Not sure why you would be getting an OOM from just indexing, and with
>>>>> the 1.5G heap you've given the JVM.
>>>>> Have you tried Sun's JVM?
>>>>>
>>>>> -Yonik
>>>>>
>>>>> On Wed, May 28, 2008 at 7:35 PM, gaku113 <[EMAIL PROTECTED]> wrote:
>>>>>>
>>>>>> Hi all Solr users/developers/experts,
>>>>>>
>>>>>> I have the following scenario and I appreciate any advice for tuning
>>>>>> my solr master server.
>>>>>>
>>>>>> I have a field in my schema that would index (but not stored) about
>>>>>> ~10000 ids for each document. This field is expected to govern the
>>>>>> size of the document. Each id can contain up to 6 characters. I
>>>>>> figure that there are two alternatives for this field, one is the use
>>>>>> a string multi-valued field, and the other would be to pass a
>>>>>> white-space-delimited string to solr and have solr tokenize such
>>>>>> string based on whitespace (the text_ws fieldType). The master server
>>>>>> is expected to receive constant stream of updates.
>>>>>>
>>>>>> The expected/estimated document size can range from 50k to 100k for a
>>>>>> single document. (I know this is quite large). The number of
>>>>>> documents is expected to be around 200,000 on each master server, and
>>>>>> there can be multiple master servers (sharding). I wish the master
>>>>>> can handle more docs too if I can figure a way out.
>>>>>>
>>>>>> Currently, I'm performing some basic stress tests to simulate the
>>>>>> indexing side on the master server. This stress test would
>>>>>> continuously add new documents at the rate of about 10 documents
>>>>>> every 30 seconds. Autocommit is being used (50 docs and 180 seconds
>>>>>> constraints), but I have no idea if this is the preferred way.
>>>>>> The goal is to keep adding new documents until we can get at least
>>>>>> 200,000 documents (or about 20GB of index) on the master (or even
>>>>>> more if the server can handle it)
>>>>>>
>>>>>> What I experienced from the indexing stress test is that the master
>>>>>> server failed to respond after a while, such as non-pingable when
>>>>>> there are about 30k documents. When looking at the log, they are
>>>>>> mostly:
>>>>>> java.lang.OutOfMemoryError: Java heap space
>>>>>> OR
>>>>>> Ping query caused exception: null (this is probably caused by the OOM
>>>>>> problem)
>>>>>>
>>>>>> There were also a few cases that the java process even went away.
>>>>>>
>>>>>> Questions:
>>>>>> 1) Is it better to use the multi-valued string field or the text_ws
>>>>>> field for this large field?
>>>>>> 2) Is it better to have more outstanding docs per commit or more
>>>>>> frequent commit, in term of maximizing server resources? What is the
>>>>>> preferred way to commit documents assuming that solr master receives
>>>>>> updates frequently? How many updated docs should there be before
>>>>>> issuing a commit?
>>>>>> 3) How to avoid the OOM problem in my case? I'm already doing
>>>>>> (-Xms1536M -Xmx1536M) on a 2-GB machine. Is that not enough? I'm
>>>>>> concerned that adding more Ram would just delay the OOM problem. Any
>>>>>> additional JVM option to consider?
>>>>>> 4) Any recommendation for the master server configuration, in a sense
>>>>>> that I can maximize the number of indexed docs?
>>>>>> 5) How can it disable caching on the master altogether as queries
>>>>>> won't hit the master?
>>>>>> 6) For an average doc size of 50k-100k, is that too large for solr,
>>>>>> or even solr is the right tool? If not, any alternative? If we are
>>>>>> able to reduce the size of docs, can we expect to index more
>>>>>> documents?
>>>>>>
>>>>>> The followings are info related to software/hardware/configuration:
>>>>>>
>>>>>> Solr version (solr nightly build on 5/23/2008)
>>>>>> Solr Specification Version: 1.2.2008.05.23.08.06.59
>>>>>> Solr Implementation Version: nightly
>>>>>> Lucene Specification Version: 2.3.2
>>>>>> Lucene Implementation Version: 2.3.2 652650
>>>>>> Jetty: 6.1.3
>>>>>>
>>>>>> Schema.xml (the section that I think are relevant to the master
>>>>>> server.)
>>>>>>
>>>>>> <fieldType name="string" class="solr.StrField" sortMissingLast="true" omitNorms="true"/>
>>>>>> <fieldType name="text_ws" class="solr.TextField" positionIncrementGap="100">
>>>>>>   <analyzer>
>>>>>>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>>>>>>   </analyzer>
>>>>>> </fieldType>
>>>>>>
>>>>>> <field name="id" type="string" indexed="true" stored="true" required="true" />
>>>>>> <field name="hex_id_multi" type="string" indexed="true" stored="false" multiValued="true" omitNorms="true"/>
>>>>>> <field name="hex_id_string" type="text_ws" indexed="true" stored="false" omitNorms="true"/>
>>>>>>
>>>>>> <uniqueKey>id</uniqueKey>
>>>>>>
>>>>>> Solrconfig.xml
>>>>>> <indexDefaults>
>>>>>>   <useCompoundFile>false</useCompoundFile>
>>>>>>   <mergeFactor>10</mergeFactor>
>>>>>>   <maxBufferedDocs>500</maxBufferedDocs>
>>>>>>   <ramBufferSizeMB>50</ramBufferSizeMB>
>>>>>>   <maxMergeDocs>5000</maxMergeDocs>
>>>>>>   <maxFieldLength>20000</maxFieldLength>
>>>>>>   <writeLockTimeout>1000</writeLockTimeout>
>>>>>>   <commitLockTimeout>10000</commitLockTimeout>
>>>>>>
>>>>>>   <mergePolicy>org.apache.lucene.index.LogByteSizeMergePolicy</mergePolicy>
>>>>>>   <mergeScheduler>org.apache.lucene.index.ConcurrentMergeScheduler</mergeScheduler>
>>>>>>   <lockType>single</lockType>
>>>>>> </indexDefaults>
>>>>>>
>>>>>> <mainIndex>
>>>>>>   <useCompoundFile>false</useCompoundFile>
>>>>>>   <ramBufferSizeMB>50</ramBufferSizeMB>
>>>>>>   <mergeFactor>10</mergeFactor>
>>>>>>   <!-- Deprecated -->
>>>>>>   <maxBufferedDocs>500</maxBufferedDocs>
>>>>>>   <maxMergeDocs>5000</maxMergeDocs>
>>>>>>   <maxFieldLength>20000</maxFieldLength>
>>>>>>   <unlockOnStartup>false</unlockOnStartup>
>>>>>> </mainIndex>
>>>>>> <updateHandler class="solr.DirectUpdateHandler2">
>>>>>>
>>>>>>   <autoCommit>
>>>>>>     <maxDocs>50</maxDocs>
>>>>>>     <maxTime>180000</maxTime>
>>>>>>   </autoCommit>
>>>>>>   <listener event="postCommit" class="solr.RunExecutableListener">
>>>>>>     <str name="exe">solr/bin/snapshooter</str>
>>>>>>     <str name="dir">.</str>
>>>>>>     <bool name="wait">true</bool>
>>>>>>   </listener>
>>>>>> </updateHandler>
>>>>>>
>>>>>> <query>
>>>>>>   <maxBooleanClauses>50</maxBooleanClauses>
>>>>>>   <filterCache
>>>>>>     class="solr.LRUCache"
>>>>>>     size="0"
>>>>>>     initialSize="0"
>>>>>>     autowarmCount="0"/>
>>>>>>   <queryResultCache
>>>>>>     class="solr.LRUCache"
>>>>>>     size="0"
>>>>>>     initialSize="0"
>>>>>>     autowarmCount="0"/>
>>>>>>   <documentCache
>>>>>>     class="solr.LRUCache"
>>>>>>     size="0"
>>>>>>     initialSize="0"
>>>>>>     autowarmCount="0"/>
>>>>>>   <enableLazyFieldLoading>true</enableLazyFieldLoading>
>>>>>>
>>>>>>   <queryResultWindowSize>1</queryResultWindowSize>
>>>>>>   <queryResultMaxDocsCached>1</queryResultMaxDocsCached>
>>>>>>   <HashDocSet maxSize="1000" loadFactor="0.75"/>
>>>>>>   <listener event="newSearcher" class="solr.QuerySenderListener">
>>>>>>     <arr name="queries">
>>>>>>       <lst> <str name="q">user_id</str> <str name="start">0</str> <str name="rows">1</str> </lst>
>>>>>>       <lst><str name="q">static newSearcher warming query from solrconfig.xml</str></lst>
>>>>>>     </arr>
>>>>>>   </listener>
>>>>>>   <listener event="firstSearcher" class="solr.QuerySenderListener">
>>>>>>     <arr name="queries">
>>>>>>       <lst> <str name="q">fast_warm</str> <str name="start">0</str> <str name="rows">10</str> </lst>
>>>>>>       <lst><str name="q">static firstSearcher warming query from solrconfig.xml</str></lst>
>>>>>>     </arr>
>>>>>>   </listener>
>>>>>>   <useColdSearcher>false</useColdSearcher>
>>>>>>   <maxWarmingSearchers>4</maxWarmingSearchers>
>>>>>> </query>
>>>>>>
>>>>>> Replication:
>>>>>> The snappuller is scheduled to run every 15 mins for now.
>>>>>>
>>>>>> Hardware:
>>>>>> AMD (2.1GHz) dual core with 2GB ram 160GB SATA harddrive
>>>>>>
>>>>>> OS:
>>>>>> Fedora 8 (64-bit)
>>>>>>
>>>>>> JVM version:
>>>>>> java version "1.7.0"
>>>>>> IcedTea Runtime Environment (build 1.7.0-b21)
>>>>>> IcedTea 64-Bit Server VM (build 1.7.0-b21, mixed mode)
>>>>>>
>>>>>> Java options:
>>>>>> java -Djetty.home=/path/to/solr/home -d64 -Xms1536M -Xmx1536M -XX:+UseParallelGC -jar start.jar
>>>>>>
>>>>>>
>>>>>> --
>>>>>> View this message in context:
>>>>>> http://www.nabble.com/Solr-indexing-configuration-help-tp17524364p17524364.html
>>>>>> Sent from the Solr - User mailing list archive at Nabble.com.
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>> --
>>>> View this message in context:
>>>> http://www.nabble.com/Solr-indexing-configuration-help-tp17524364p17526135.html
>>>> Sent from the Solr - User mailing list archive at Nabble.com.
>>>>
>>>>
>>>
>>>
>>
>> --
>> View this message in context:
>> http://www.nabble.com/Solr-indexing-configuration-help-tp17524364p17550056.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
>>
>>
>
>

--
View this message in context: http://www.nabble.com/Solr-indexing-configuration-help-tp17524364p17550792.html
Sent from the Solr - User mailing list archive at Nabble.com.