On 9/22/06, Michael Imbeault <[EMAIL PROTECTED]> wrote:
Excellent news; as you guessed, my schema was (for some reason) set to version 1.0.
Yeah, I just realized that having "version" right next to "name" would lead people to think it's "their" version number, when it's really Solr's version number. I've added a comment to the example schema to clarify that.
But better yet, the 800-second query is now running in 0.5-2 seconds! Amazing optimization! I can now do faceting on journal title (17,000 different titles) and last author (more than 400,000 authors), plus 12 date range queries, in a very reasonable time (considering I'm on a test Windows desktop box and not a server). The only problem is that if I add first author, I get a java.lang.OutOfMemoryError: Java heap space. I'm sure this problem will go away on a server with more than the current 500 megs I can allocate to Tomcat.
Yes, the Lucene FieldCache takes up a lot of memory. It basically holds the entire field in a non-inverted form:
http://lucene.apache.org/java/docs/api/org/apache/lucene/search/FieldCache.StringIndex.html
It's currently also used for sorting, which also needs fast document->fieldvalue lookups, rather than the inverted term->documents_containing_that_term.

-Yonik
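
To make the inverted vs. non-inverted distinction concrete, here is a minimal Python sketch of the idea. It is a toy illustration, not Lucene or Solr code: the sample data and the helper names (field_value, facet_counts) are made up for the example, and the lookup/order arrays are only loosely modeled on the StringIndex class linked above.

```python
from collections import defaultdict

# Toy corpus (hypothetical data): position = document id,
# value = the single term stored in one field, e.g. last author.
docs = ["smith", "jones", "smith", "lee"]

# Inverted form: term -> ids of the documents containing that term.
# Good for "which documents match this term?".
inverted = defaultdict(list)
for doc_id, term in enumerate(docs):
    inverted[term].append(doc_id)

# Non-inverted form: every unique term once, in sorted order, plus
# one integer per document pointing into that sorted list.
lookup = sorted(inverted)                        # unique terms
ordinal = {term: i for i, term in enumerate(lookup)}
order = [ordinal[term] for term in docs]         # one entry per document


def field_value(doc_id):
    """Fast document -> field value lookup, as needed for sorting."""
    return lookup[order[doc_id]]


def facet_counts(matching_doc_ids):
    """Count how many matching documents carry each term."""
    counts = [0] * len(lookup)
    for doc_id in matching_doc_ids:
        counts[order[doc_id]] += 1
    return {lookup[i]: n for i, n in enumerate(counts) if n}


print(field_value(2))
print(facet_counts([0, 2, 3]))
```

The non-inverted arrays hold one entry per document plus every unique term, so their size grows with both the index size and the number of distinct values in the field. That is consistent with a field of several hundred thousand authors being the one that exhausts a small heap.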