Actually, I dug into the logs again and, surprise, it sometimes still occurs with `random` queries. Here are a few snippets from the error log. There may have been OOM errors somewhere during that time, but the older logs have unfortunately been rotated away.
2011-03-14 00:25:32,152 ERROR [solr.search.SolrCache] - [pool-1-thread-1] - : Error during auto-warming of key:f_sp_eigenschappen:geo:java.lang.ArrayIndexOutOfBoundsException: 431733
    at org.apache.lucene.util.BitVector.get(BitVector.java:102)
    at org.apache.lucene.index.SegmentTermDocs.read(SegmentTermDocs.java:152)
    at org.apache.solr.search.SolrIndexSearcher.getDocSetNC(SolrIndexSearcher.java:642)
    at org.apache.solr.search.SolrIndexSearcher.getDocSet(SolrIndexSearcher.java:545)
    at org.apache.solr.search.SolrIndexSearcher.cacheDocSet(SolrIndexSearcher.java:520)
    at org.apache.solr.search.SolrIndexSearcher$2.regenerateItem(SolrIndexSearcher.java:296)
    at org.apache.solr.search.FastLRUCache.warm(FastLRUCache.java:168)
    at org.apache.solr.search.SolrIndexSearcher.warm(SolrIndexSearcher.java:1481)
    at org.apache.solr.core.SolrCore$2.call(SolrCore.java:1131)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:619)

2011-03-14 00:25:32,795 ERROR [solr.search.SolrCache] - [pool-1-thread-1] - : Error during auto-warming of key:+(titel_i:touareg^5.0 | f_advertentietype:touareg^2.0 | f_automodel_j:touareg^8.0 | facets:touareg^2.0 | omschrijving_i:touareg | catlevel1_i:touareg^2.0 | catlevel2_i:touareg^4.0)~0.1 () (10.0/(7.71E-8*float(ms(const(1300035600000),date(sort_date)))+1.0))^10.0:java.lang.ArrayIndexOutOfBoundsException: 468554
    at org.apache.lucene.util.BitVector.get(BitVector.java:102)
    at org.apache.lucene.index.SegmentTermDocs.readNoTf(SegmentTermDocs.java:169)
    at org.apache.lucene.index.SegmentTermDocs.read(SegmentTermDocs.java:139)
    at org.apache.lucene.search.TermScorer.nextDoc(TermScorer.java:130)
    at org.apache.lucene.search.DisjunctionMaxQuery$DisjunctionMaxWeight.scorer(DisjunctionMaxQuery.java:145)
    at org.apache.lucene.search.BooleanQuery$BooleanWeight.scorer(BooleanQuery.java:297)
    at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:246)
    at org.apache.lucene.search.Searcher.search(Searcher.java:171)
    at org.apache.solr.search.SolrIndexSearcher.getDocSetNC(SolrIndexSearcher.java:651)
    at org.apache.solr.search.SolrIndexSearcher.getDocSet(SolrIndexSearcher.java:545)
    at org.apache.solr.search.SolrIndexSearcher.cacheDocSet(SolrIndexSearcher.java:520)
    at org.apache.solr.search.SolrIndexSearcher$2.regenerateItem(SolrIndexSearcher.java:296)
    at org.apache.solr.search.FastLRUCache.warm(FastLRUCache.java:168)
    at org.apache.solr.search.SolrIndexSearcher.warm(SolrIndexSearcher.java:1481)
    at org.apache.solr.core.SolrCore$2.call(SolrCore.java:1131)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:619)

2011-03-14 00:25:33,051 ERROR [solr.search.SolrCache] - [pool-1-thread-1] - : Error during auto-warming of key:+*:* (10.0/(7.71E-8*float(ms(const(1300035600000),date(sort_date)))+1.0))^10.0:java.lang.ArrayIndexOutOfBoundsException: 489479
    at org.apache.lucene.util.BitVector.get(BitVector.java:102)
    at org.apache.lucene.index.SegmentTermDocs.next(SegmentTermDocs.java:127)
    at org.apache.lucene.search.FieldCacheImpl$LongCache.createValue(FieldCacheImpl.java:562)
    at org.apache.lucene.search.FieldCacheImpl$Cache.get(FieldCacheImpl.java:208)
    at org.apache.lucene.search.FieldCacheImpl.getLongs(FieldCacheImpl.java:525)
    at org.apache.solr.search.function.LongFieldSource.getValues(LongFieldSource.java:57)
    at org.apache.solr.search.function.DualFloatFunction.getValues(DualFloatFunction.java:48)
    at org.apache.solr.search.function.ReciprocalFloatFunction.getValues(ReciprocalFloatFunction.java:61)
    at org.apache.solr.search.function.FunctionQuery$AllScorer.<init>(FunctionQuery.java:123)
    at org.apache.solr.search.function.FunctionQuery$FunctionWeight.scorer(FunctionQuery.java:93)
    at org.apache.lucene.search.BooleanQuery$BooleanWeight.scorer(BooleanQuery.java:297)
    at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:246)
    at org.apache.lucene.search.Searcher.search(Searcher.java:171)
    at org.apache.solr.search.SolrIndexSearcher.getDocSetNC(SolrIndexSearcher.java:651)
    at org.apache.solr.search.SolrIndexSearcher.getDocSet(SolrIndexSearcher.java:545)
    at org.apache.solr.search.SolrIndexSearcher.cacheDocSet(SolrIndexSearcher.java:520)
    at org.apache.solr.search.SolrIndexSearcher$2.regenerateItem(SolrIndexSearcher.java:296)
    at org.apache.solr.search.FastLRUCache.warm(FastLRUCache.java:168)
    at org.apache.solr.search.SolrIndexSearcher.warm(SolrIndexSearcher.java:1481)
    at org.apache.solr.core.SolrCore$2.call(SolrCore.java:1131)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:619)

And several more. Again, I cannot confirm that the system was OOM on Monday at 00:00. I can confirm that after that no OOMs were present, nor this peculiar exception. Thanks again.

> that is odd...
>
> can you let us know exactly what version of Solr/Lucene you are using (if
> it's not an official release, can you let us know exactly what the
> version details on the admin info page say, i'm curious about the svn
> revision)

Of course, that's the stable 1.4.1.
> can you also please let us know what types of queries you are generating?
> ... that's the toString output of a query and it's not entirely clear
> what the original looked like. If you can recognize what the original
> query was, it would also be helpful to know if you can consistently
> reproduce this error on autowarming after executing that query (or
> queries like it with a slightly different date value)

It's extremely difficult to reproduce. It happened on a multinode system
that's being prepared for production. The system has been under heavy load
for a long time already, both updates and queries: it is continuously
updated with real user input and receives real user queries from a source
that is fed from logs. Solr is about to replace an existing search
solution.

With these uncontrollable variables it is impossible to reproduce; I tried
but failed. The error did, however, occur at least a couple more times
after I started this thread.

It hasn't reappeared since I reduced the date precision from milliseconds
to an hour; see my other thread for more information:
http://web.archiveorange.com/archive/v/AAfXfFuqjPhU4tdq53Tv

> One of the things that particularly boggles me is this...
>
> : at org.apache.solr.search.SolrIndexSearcher.getDocSet(SolrIndexSearcher.java:545)
> : at org.apache.solr.search.SolrIndexSearcher.cacheDocSet(SolrIndexSearcher.java:520)
>
> [...]
>
> : Well, I use Dismax's bf parameter to boost very recent documents. I'm
> : not using the queryResultCache or documentCache, only filterCache and
> : Lucene fieldCache.
>
> ... that cache warming stack trace seems to be coming from the filterCache,
> but that contradicts your statement that you don't use the filterCache.
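For reference, the reciprocal boost visible in the logged toString output maps back to a Dismax `bf` parameter along these lines (a sketch reconstructed from the logs: the field name `sort_date` and the constants come from the logged function, the rest is assumption):

```
# toString: (10.0/(7.71E-8*float(ms(const(...),date(sort_date)))+1.0))^10.0
# i.e. recip(x,m,a,b) = a/(m*x+b) with x = ms(NOW,sort_date):
bf=recip(ms(NOW,sort_date),7.71e-8,10,1)^10

# Reduced precision: NOW/HOUR rounds the reference time down to the hour,
# so the function (and any cache key derived from it) stays identical for
# a whole hour instead of changing on every request:
bf=recip(ms(NOW/HOUR,sort_date),7.71e-8,10,1)^10
```

Solr's date math rounding (`NOW/HOUR`) is presumably what makes the autowarmed keys stable between requests, matching the observation that the exception stopped after the precision change.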
> independent of your comments, that's an odd looking query to be cached in
> the filter cache anyway, since it includes a mandatory matchalldocs
> clause, and seems to only exist for boosting on that function.

But I am using the filterCache and fieldCache (and forgot to mention the
obvious fieldValueCache as well).

If you have any methods that might help to reproduce this, I'm of course
willing to take the time and see if I can. It may prove really hard,
though: several weird errors were not reproducible in a more controlled
but similar environment (same load and config), and I can't mess with the
soon-to-be production cluster.

Thanks!

> -Hoss
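For completeness, the auto-warming in the stack traces above is driven by solrconfig.xml cache declarations along these lines (sizes and counts here are illustrative, not taken from this thread; an `autowarmCount` greater than zero is what makes a newly opened searcher run the `FastLRUCache.warm()` calls seen in the traces):

```
<filterCache class="solr.FastLRUCache"
             size="512"
             initialSize="512"
             autowarmCount="128"/>

<fieldValueCache class="solr.FastLRUCache"
                 size="512"
                 autowarmCount="0"/>
```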