Hi David, correct: you should avoid reading the content of a document inside a HitCollector. Normally that means caching everything you need in main memory. A very simple and fast case is a facet with only 255 possible values and exactly one value per document. In this case you only need a byte[IndexReader.maxDoc()] array in the cache and an int[256] array for collecting the results (we have 5 GByte to run Lucene with a couple of facets).
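To make the byte-array idea concrete, here is a minimal sketch in plain Java. It is not taken from our code: the class name ByteFacetCounter and its methods are made up for illustration, and it assumes the byte array has already been filled with one facet value per document (index = document number).

```java
/** Sketch: counts hits per facet value for a facet with exactly one value per document. */
final class ByteFacetCounter {
    // ordinals[docId] holds the facet value of that document as a byte.
    // In a real setup the array would have length IndexReader.maxDoc().
    private final byte[] ordinals;

    ByteFacetCounter(byte[] ordinals) {
        this.ordinals = ordinals;
    }

    /** Work to do per hit, e.g. from inside HitCollector.collect(doc, score). */
    void collect(int docId, int[] counts) {
        // Java bytes are signed, so mask to get an index in 0..255.
        counts[ordinals[docId] & 0xFF]++;
    }

    /** Convenience for trying it out: counts a whole list of hit document numbers. */
    int[] count(int[] hitDocIds) {
        int[] counts = new int[256];
        for (int docId : hitDocIds) {
            collect(docId, counts);
        }
        return counts;
    }
}
```

The per-hit cost is one array read and one increment, with no access to stored fields, which is the point of keeping everything in memory.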
About "facet": for me a facet corresponds to a field in Lucene, so 300 facets are quite a lot. Or did you mean two facets with 150 values each? To find a good solution for your 100M documents, I have three questions:

- How many hits per search?
- Is there more than one value of the facet per document, and if so, how many on average?

INDEXORDER means document number. MultiSearcher also works fine: if you have one index per year and, within each of these indices, the index order follows the date, then the MultiSearcher will also have the correct INDEXORDER. Take a look at the variable "int[] starts" in MultiSearcher.

David Seltzer wrote:
> Is INDEXORDER based on the DocumentID within each individual index? If so
> then the results could be interleaved. Anyone know how this behaves?
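On the MultiSearcher point, a small sketch of why the results are not interleaved. This is my own illustration of the offset idea behind "int[] starts", not Lucene's source; the class DocIdOffsets and its methods are hypothetical, and it assumes the sub-indices are given to the MultiSearcher in year order.

```java
/** Sketch: how per-index document numbers map to one global document number. */
final class DocIdOffsets {
    // starts[i] is the first global document number of sub-index i;
    // the last entry is the total number of documents.
    static int[] starts(int[] maxDocs) {
        int[] starts = new int[maxDocs.length + 1];
        for (int i = 0; i < maxDocs.length; i++) {
            starts[i + 1] = starts[i] + maxDocs[i];
        }
        return starts;
    }

    // Every document of sub-index i gets a smaller global number
    // than any document of sub-index i + 1.
    static int globalDocId(int[] starts, int subIndex, int localDocId) {
        return starts[subIndex] + localDocId;
    }
}
```

With sub-index sizes of 100, 50 and 70, the offsets are 0, 100 and 150, so sorting by global document number lists the first index completely before the second one begins.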