I have no problem with it! Thanks! What I would like to see fixed before moving it to core is the fact that an additional helper field is needed for the trie values. If everything could be in one field and the field were still sortable, that would be fine. For that, the order of terms in the FieldCache would have to be fixed. Since trie terms of the highest precision currently order before all lower-precision terms, the simplest fix would be to put only the first term from the TermEnum at the document's index in the FieldCache.
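To illustrate the ordering issue, here is a minimal sketch in plain Java. It is not the trie encoding from contrib: the class name, the precision step of 16, the one-character precision prefix, and the hex payload are all assumptions made for the example, and a sorted set stands in for the order in which a TermEnum would return the terms of one document.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

public class TrieOrderSketch {
    // Hypothetical parameter: number of bits dropped per precision level.
    static final int PRECISION_STEP = 16;

    /** Encodes a value at every precision level; shift 0 is full precision. */
    static List<String> trieTerms(long value) {
        // Flip the sign bit so that lexicographic order matches signed numeric order.
        long sortable = value ^ Long.MIN_VALUE;
        List<String> terms = new ArrayList<>();
        for (int shift = 0; shift < 64; shift += PRECISION_STEP) {
            // Assumed layout: the prefix char grows with the shift, so
            // full-precision terms ('a') sort before all lower-precision terms.
            char prefix = (char) ('a' + shift / PRECISION_STEP);
            String hex = String.format("%016x", sortable >>> shift);
            terms.add(prefix + hex);
        }
        return terms;
    }

    /** The term a "first term only" FieldCache policy would keep for a document. */
    static String firstTermInEnumOrder(long value) {
        // TreeSet sorts strings the way this sketch assumes the term dictionary does.
        return new TreeSet<>(trieTerms(value)).first();
    }

    public static void main(String[] args) {
        for (long v : new long[] {-5L, 0L, 7L}) {
            System.out.println(v + " -> " + trieTerms(v)
                + " cached: " + firstTermInEnumOrder(v));
        }
    }
}
```

With this layout the first term per document is always the full-precision one, so sorting on the cached terms gives numeric order. Inverting the prefix order would make the last term the one to keep instead.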
Another way would be to just invert the order and let the higher-precision terms appear last in the TermEnum. Both would be possible, but there should be a clear statement of which term of a multi-term field is put into the FieldCache (maybe configurable). See LUCENE-1372 for that.

If all terms could be in one field, the API of TrieRange could be simpler and friendlier to the GC. The trieCodeLong/Int() method would just return a TokenStream that can be indexed using "new Field(Name,TokenStream)", making more effective use of the Token's char buffer during trie encoding (it could be reused). This is how it is done by Solr at the moment (but with the additional allocation of the array). I do not like the array allocations for each term and the way the whole trie encoding works at the moment (1x char[], 1x String[], additional copying, ...).

I would be happy to have it in core; I could prepare the patch once the above is fixed!

As for names: NumberUtils, IntRangeFilter and LongRangeFilter are fine; AbstractNumberRangeFilter is internal only (just to have less code duplication, like StringBuffer and StringBuilder in the JDK, which both derive from an internal superclass invisible to the outside).

Uwe

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: [email protected]

> -----Original Message-----
> From: Michael McCandless [mailto:[email protected]]
> Sent: Wednesday, March 18, 2009 9:02 PM
> To: [email protected]
> Subject: move TrieRange* to core?
>
> I think we should move TrieRange* into core before 2.9?
>
> It's received a lot of attention, from both developers (Uwe & Yonik did
> lots of iterations, and Solr is folding it in) and user interest.
>
> It's a simpler & more scalable way to index numeric fields that you
> intend to sort and/or do range querying on; we can do away with tricky
> number padding.
>
> Plus it's just plain cool :)
>
> I also think we should change its name. I know and love "trie", but
> it's a very technical term that's not immediately meaningful to users
> of Lucene's API. Plus I've learned from doing too many renamings
> lately that it's best to try to get the name right at the start.
>
> Maybe just NumberUtils, IntRangeFilter, LongRangeFilter,
> AbstractNumberRangeFilter?
>
> Thoughts?
>
> Mike
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]
