Re: Boosting results

Michael McCandless Mon, 10 Nov 2008 04:56:27 -0800


Well, the FieldCache API is documented here (for 2.4.0):

    http://lucene.apache.org/java/2_4_0/api/core/org/apache/lucene/search/FieldCache.html

For example, you can load ints like this:

    FieldCache.DEFAULT.getInts(reader, "myfield");

This returns an array mapping docID --> int value for that field. You need to ensure that the field has only 1 token per document (and that it parses to an int, for this example).

But: it's slow to load a field for the first time. LUCENE-1231 (column-stride fields) aims to greatly speed up the load time.


It's also memory-consuming.
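
For the "unique categories for the docs matching a query" question, a minimal sketch of how such a docID-indexed array could be used. It assumes the category values have already been loaded into a String[] indexed by docID (in 2.4 something like FieldCache.DEFAULT.getStrings(reader, "category") should give you that) and that the matching docIDs were gathered separately, e.g. with a HitCollector. Both are hard-coded stand-ins here, and FacetSketch and countCategories are made-up names, not part of Lucene:

```java
import java.util.Map;
import java.util.TreeMap;

class FacetSketch {
    // categoryByDoc[docID] holds the single category token of that document,
    // i.e. the same docID --> value shape described above.
    // matchingDocs holds the docIDs that matched the query.
    static Map<String, Integer> countCategories(String[] categoryByDoc, int[] matchingDocs) {
        Map<String, Integer> counts = new TreeMap<String, Integer>();
        for (int docID : matchingDocs) {
            String category = categoryByDoc[docID];
            if (category == null) {
                continue; // this document has no value for the field
            }
            Integer old = counts.get(category);
            counts.put(category, old == null ? 1 : old + 1);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Stand-in data for the four example docs in this thread (docIDs are 0-based here).
        String[] categoryByDoc = {"A", "B", "B", "C"};
        int[] matchingDocs = {0, 2}; // the docs whose text is "john"
        System.out.println(countCategories(categoryByDoc, matchingDocs).keySet()); // prints [A, B]
    }
}
```

The key set gives the unique categories for the refinement list, and the map values give per-category hit counts if you want to show them. With only about 200 distinct categories the map stays tiny; the cost is in the per-document array itself.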

Finally, you might want to instead look at Solr, which provides facet counting out of the box, rather than roll your own...


Mike

Stefan Trcek wrote:

On Friday 07 November 2008 18:46:17 Michael McCandless wrote:


Sorting populates the field cache (internal to Lucene) for that
field, meaning it loads all values for all docs and holds them in
memory. This makes the first query slow and consumes RAM in
proportion to how large your index is.


Can you point me to the API for accessing these cached values?
I'd like to have a function like: "List all unique values of the
categories (A, B, C...) for documents that match this query".

E.g. for the query "text:john", show categories=(A,B)

Doc 1: category=A text=john
Doc 2: category=B text=mary
Doc 3: category=B text=john
Doc 4: category=C text=mary

This is intended for search refinement (I use about 200 categories).
Sorry for hijacking this thread.

Stefan

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



