Re: move TrieRange* to core?

Michael McCandless Wed, 18 Mar 2009 15:12:40 -0700


Uwe Schindler wrote:

I have no problem with it! Thanks!
What I would like to see fixed before moving it to core is the fact that
an additional helper field is needed for the trie values. If everything
could be in one field and the field were still sortable, it would be fine.
For that, the order of terms in the FieldCache should be fixed. As the
current trie fields of highest precision order before all other lower
precision fields, the simplest fix would be to only index the first term
from TermEnum at the document's index in the FieldCache.
Another way would be to just invert the order and let the higher
precision fields appear last in the TermEnum. Both would be possible, but
there should be a clear statement of which term of a multi-term field is
put into the FieldCache (maybe configurable). See LUCENE-1372 for that.


Though, won't this make loading the field cache more costly since
you'll iterate through many more terms?
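
To make the "first term wins" idea and the cost question concrete, here is a
minimal sketch in plain Java. It does not use the Lucene API: the sorted map
stands in for a TermEnum/TermDocs walk, and the term strings, the class name
and fillCache() are all made up for illustration. It assumes, as described
above, that the full-precision terms sort before the lower-precision ones.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

public class FirstTermCacheSketch {
    // Number of terms walked by the last fillCache() call (the cost in question).
    static int termsVisited;

    // Hypothetical stand-in for FieldCache loading: walk all terms in sorted
    // order and keep only the first term seen for each document.
    static String[] fillCache(SortedMap<String, int[]> postings, int maxDoc) {
        String[] cache = new String[maxDoc];
        termsVisited = 0;
        for (Map.Entry<String, int[]> e : postings.entrySet()) {
            termsVisited++;
            for (int doc : e.getValue()) {
                if (cache[doc] == null) {
                    cache[doc] = e.getKey(); // later (lower precision) terms are ignored
                }
            }
        }
        return cache;
    }

    public static void main(String[] args) {
        SortedMap<String, int[]> postings = new TreeMap<>();
        // Illustrative terms: prefix "0:" marks full precision, "1:" and "2:" lower precision.
        postings.put("0:0105", new int[] {0});
        postings.put("0:0230", new int[] {1});
        postings.put("1:01", new int[] {0});
        postings.put("1:02", new int[] {1});
        postings.put("2:0", new int[] {0, 1});

        String[] cache = fillCache(postings, 2);
        System.out.println(Arrays.toString(cache) + ", terms visited: " + termsVisited);
    }
}
```

In this toy version the loop still visits every term, including the
lower-precision ones that never reach the cache, which is the extra loading
cost raised above. Whether a real implementation could stop early depends on
details not covered here.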

If all terms could be in one field, the API to TrieRange could be simpler
and more efficient for the GC. The trieCodeLong/Int() method would just
return a TokenStream that can be indexed using "new
Field(Name,TokenStream)", making more effective use of the Token's char
buffer during trie encoding (it could be reused). This is how it is done by
Solr at the moment (but with the additional allocation of the array) - I do
not like the array allocations for each term and the whole trie encoding at
the moment (1x char[], 1x String[], additional copying,...).


I agree it'd be awesome to have a less GC costly translation
during indexing.
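
As a rough illustration of the buffer-reuse idea, here is a sketch in plain
Java. It is not a Lucene TokenStream and it does not reproduce the actual trie
term format: the class name, the marker letter plus hex digits encoding, and
the reset()/next()/buffer() methods are assumptions made for this example. The
point is only that every precision level of a value is written into the same
char[] instead of allocating a char[] and a String[] per value.

```java
public class TrieTokenSketch {
    // One marker char plus up to 16 hex digits; reused for every term.
    private final char[] buffer = new char[17];
    private final int precisionStep;
    private long sortableBits;
    private int shift = 64; // exhausted until reset() is called

    public TrieTokenSketch(int precisionStep) {
        if (precisionStep < 4 || precisionStep > 64 || precisionStep % 4 != 0) {
            throw new IllegalArgumentException(
                "precisionStep must be a multiple of 4 in [4, 64]");
        }
        this.precisionStep = precisionStep;
    }

    // Start encoding a new value; no allocation happens here.
    public void reset(long value) {
        // Flip the sign bit so unsigned hex order matches signed numeric order.
        sortableBits = value ^ Long.MIN_VALUE;
        shift = 0;
    }

    // Writes the next (lower precision) term into the shared buffer and
    // returns its length, or -1 when all precision levels have been emitted.
    public int next() {
        if (shift >= 64) {
            return -1;
        }
        int digits = (64 - shift) / 4;
        // Marker letter: 'A' for full precision, later letters for coarser terms.
        buffer[0] = (char) ('A' + shift / 4);
        for (int i = 0; i < digits; i++) {
            int nibble = (int) (sortableBits >>> (60 - 4 * i)) & 0xF;
            buffer[1 + i] = Character.forDigit(nibble, 16);
        }
        shift += precisionStep;
        return 1 + digits;
    }

    public char[] buffer() {
        return buffer;
    }

    public static void main(String[] args) {
        TrieTokenSketch enc = new TrieTokenSketch(16);
        for (long v : new long[] {0L, 42L}) {
            enc.reset(v);
            for (int len = enc.next(); len != -1; len = enc.next()) {
                // The String is created only for printing; an indexer would
                // read the chars straight out of the buffer.
                System.out.println(new String(enc.buffer(), 0, len));
            }
        }
    }
}
```

With this marker scheme the full-precision term sorts before the coarser ones,
matching the term order described earlier in the thread.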

I would be happy to have it in core, I could prepare the patch, when the
above is fixed!


OK.

Mike

