Re: Facet performance with heterogeneous 'facets'?

Yonik Seeley Fri, 22 Sep 2006 13:45:03 -0700

On 9/22/06, Michael Imbeault <[EMAIL PROTECTED]> wrote:

Excellent news; as you guessed, my schema was (for some reason) set to
version 1.0.


Yeah, I just realized that having "version" right next to "name" would
lead people to think it's "their" version number, when it's really
Solr's version number.  I've added a comment to the example schema to
clarify that.

But better yet, the 800 seconds query is now running in 0.5-2 seconds!
Amazing optimization! I can now do faceting on journal title (17 000
different titles) and last author (>400 000 authors), + 12 date range
queries, in a very reasonable time (considering I'm on a test Windows
desktop box and not a server).

The only problem is that if I add first author, I get a
java.lang.OutOfMemoryError: Java heap space. I'm sure this problem will
go away on a server with more than the current 500 megs I can allocate
to Tomcat.


Yes, the Lucene FieldCache takes up a lot of memory.  It basically
holds the entire field in a non-inverted form:
http://lucene.apache.org/java/docs/api/org/apache/lucene/search/FieldCache.StringIndex.html

It's currently also used for sorting, which likewise needs fast
document->fieldvalue lookups rather than the inverted
term->documents_containing_that_term mapping.
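
To make the difference between the two layouts concrete, here is a
rough sketch in plain Java. It is not the Lucene FieldCache code; the
class and method names (UninvertedFieldSketch, valueOf, facetCounts)
are made up for illustration, and it assumes every document has exactly
one value for the field. It turns a small term->documents map into a
document->term-ordinal array and counts facet values with it.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; not the Lucene FieldCache API.
class UninvertedFieldSketch {
    // lookup[ord] is the term with that ordinal (terms kept in sorted order).
    final String[] lookup;
    // order[docId] is the ordinal of that document's single field value.
    final int[] order;

    // Builds the non-inverted form from an inverted term -> docIds map.
    // Assumes each docId in [0, maxDoc) appears under exactly one term.
    UninvertedFieldSketch(Map<String, List<Integer>> invertedIndex, int maxDoc) {
        TreeMap<String, List<Integer>> sorted = new TreeMap<>(invertedIndex);
        lookup = new String[sorted.size()];
        order = new int[maxDoc];
        int ord = 0;
        for (Map.Entry<String, List<Integer>> entry : sorted.entrySet()) {
            lookup[ord] = entry.getKey();
            for (int docId : entry.getValue()) {
                order[docId] = ord;
            }
            ord++;
        }
    }

    // Fast document -> field value lookup: two array reads.
    String valueOf(int docId) {
        return lookup[order[docId]];
    }

    // Counts how many of the matching documents carry each term.
    int[] facetCounts(int[] matchingDocs) {
        int[] counts = new int[lookup.length];
        for (int docId : matchingDocs) {
            counts[order[docId]]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, List<Integer>> inverted = new HashMap<>();
        inverted.put("Nature", Arrays.asList(0, 2));
        inverted.put("Cell", Arrays.asList(1));
        UninvertedFieldSketch field = new UninvertedFieldSketch(inverted, 3);
        int[] counts = field.facetCounts(new int[] {0, 1, 2});
        for (int ord = 0; ord < counts.length; ord++) {
            System.out.println(field.lookup[ord] + ": " + counts[ord]);
        }
    }
}
```

The memory cost in this sketch is one int per document plus every
distinct term held as a String, which is why a field with several
hundred thousand distinct values on a large index gets expensive.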

-Yonik
