: Very good points, I hadn't considered the term frequency of the digits
: affecting scoring. As an aside, can that aspect of the score be ignored for
: these fields?
The easiest way is to use a boost that is so low it's insignificant, or
you could subclass TermQuery and override getSimilarity to return a
DelegateSimilarity which wraps the real instance and returns constant
values for things like tf() and idf() ... but i'm 95% sure that using a
RangeFilter (or a ConstantScoreRangeQuery) is going to be faster than all
of those TermQueries no matter what.

: I need to spend more time with FunctionQuery, I haven't given it the
: attention it deserves.

i would start by trying out an apples to apples comparison of your current
approach with one where your index only has one indexed field each for
long/lat that uses ConstantScoreRangeQuery to do the boxing. Compare the
size of the resulting indexes, the memory footprint while open, and the
time spent executing comparable queries. You should probably compare
queries that involve both large boxes and small boxes, and depending on
the usage pattern you expect, consider caching your Filters if you expect
many boxes to be reused frequently.

once you've found the "best" way to do your boxing ... then look into
using FunctionQueries to influence your scores based on distance from the
center of the box.

:
: Great feedback, thanks for the notes.
:
: -- jeff
:
: On 2/28/06, Chris Hostetter <[EMAIL PROTECTED]> wrote:
: >
: > : Geo definition:
: > : Boxing around a center point. It's not critical to do a radius search
: > : with a given circle. A boxed approach allows for taller or wider
: > : frames of reference, which are applicable for our use.
: >
: > if you are just looking to confine your results to a box then i think
: > RangeFiltering on both the X and Y axis will be more efficient than the
: > individual term queries you are producing.
: >
: > It will have the added bonus of not artificially affecting the scores of
: > the documents based on how often a particular digit appears in a
: > particular position of the latitude across your corpus.
: >
: > Once you've filtered down to a particular bounding box, you might consider
: > going back to the function query approach to score documents inside that
: > box based on their actual distance from the center point. I don't recall
: > at the moment but i believe FunctionQuery's Scorer supports skipTo in such
: > a way that it won't bother computing the function for a document that has
: > been skipped (ie: when contained in a BooleanQuery with another clause
: > that has already prohibited it, or when executed in the context of a
: > Filter)
: >
: > -Hoss
: >
: > ---------------------------------------------------------------------
: > To unsubscribe, e-mail: [EMAIL PROTECTED]
: > For additional commands, e-mail: [EMAIL PROTECTED]


-Hoss
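
To make the two-step idea in this thread concrete (box first, then score only what is inside the box by distance from the center), here is a minimal sketch in plain Java. It deliberately uses no Lucene classes: BoxSearch, Doc, box and rankByDistance are hypothetical names made up for illustration, the per-axis range check only stands in for what a range filter on each of the two fields would do, and the flat squared-degree distance is an assumption, not what FunctionQuery actually computes.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Plain-Java illustration only (hypothetical names, no Lucene API). */
final class BoxSearch {

    /** A hypothetical document: an id plus one latitude and one longitude value. */
    static final class Doc {
        final String id;
        final double lat;
        final double lon;

        Doc(String id, double lat, double lon) {
            this.id = id;
            this.lat = lat;
            this.lon = lon;
        }
    }

    /**
     * Step 1: keep only documents inside the box (bounds inclusive).
     * This is a yes/no test per axis, so digit frequency plays no part in it.
     */
    static List<Doc> box(List<Doc> docs,
                         double minLat, double maxLat,
                         double minLon, double maxLon) {
        List<Doc> hits = new ArrayList<>();
        for (Doc d : docs) {
            boolean latOk = d.lat >= minLat && d.lat <= maxLat;
            boolean lonOk = d.lon >= minLon && d.lon <= maxLon;
            if (latOk && lonOk) {
                hits.add(d);
            }
        }
        return hits;
    }

    /**
     * Step 2: order the boxed documents, closest to the center first.
     * The distance is computed only for documents that survived step 1.
     */
    static List<Doc> rankByDistance(List<Doc> boxed,
                                    double centerLat, double centerLon) {
        List<Doc> ranked = new ArrayList<>(boxed);
        ranked.sort(Comparator.comparingDouble(
                (Doc d) -> squaredDistance(d, centerLat, centerLon)));
        return ranked;
    }

    /** Assumption: flat squared distance in degrees, good enough for small boxes. */
    static double squaredDistance(Doc d, double centerLat, double centerLon) {
        double dLat = d.lat - centerLat;
        double dLon = d.lon - centerLon;
        return dLat * dLat + dLon * dLon;
    }
}
```

Calling box(...) and then rankByDistance(...) on its result mirrors the order suggested above: the cheap range test runs over everything, and the more expensive distance function runs only over what is left.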