On 6/10/06, zzzzz shalev <[EMAIL PROTECTED]> wrote:
1. could you let me know what kind of response time you were getting with solr (as well as the size of data and result sizes)
A can tell you a little bit about ours... on one CNET faceted browsing implementation using Solr, the number of facets to check per request average somewhere between 100 and 200 (the total number of unique facets is much larger though). The median request time is 3ms (and I don't think the majority of that time is calculating set intersections). We actually don't have the LRUCaches set large enough to achieve a 100% hit rate, but performance is still fine.
2. i took a really really quick look at DocSetHitCollector and saw the dreaded if (bits==null) bits = new BitSet(maxDoc);
Yes, DocSets can be memory intensive. A BitSet is only used when the number of results gets larger than a threshold... below that, a HashDocSet is used that is O(n) rather than O(maxDoc). So the memory footprint also depends on the cardinality of the sets.
since i rewrote some lucene code to support 64-bit search instances i have indexes that may reach quite a few GB's ,
GBs of index size, or actually billions of documents. It's the number of documents that matters in this case.
allocating bitset's (arrays of long's is quite expensive memory wise and i am still a little skeptical about performance with large result sets)
I just checked in a replacement for BitSet that takes intersection counts much faster.
i did some testing of my facet impl and after an overnight webload session received about a 500 milli response time average for full faceting (with result sets from a few thousand to over 100,000)
How many documents was that with, and how many facets per document? I certainly am interested in more memory efficient faceted browsing, and have been meaning to try some alternatives. So far, we've had good results using cached DocSets though. -Yonik http://incubator.apache.org/solr Solr, the open-source Lucene search server --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]