[ https://issues.apache.org/jira/browse/LUCENE-2395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Uwe Schindler updated LUCENE-2395:
----------------------------------

    Attachment: DistanceQuery.java

Added Weight.explain() and fixed a missing replacement.

> Add a scoring DistanceQuery that does not need caches and separate filters
> --------------------------------------------------------------------------
>
>                 Key: LUCENE-2395
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2395
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: contrib/spatial
>            Reporter: Uwe Schindler
>             Fix For: 3.1
>
>         Attachments: DistanceQuery.java, DistanceQuery.java
>
>
> Based on a chat with Chris Male and my own ideas from implementing this for
> PANGAEA, I thought about the broken distance query in contrib. It has the
> following problems:
> - It needs a query/filter for the enclosing bbox (which is constant score)
> - It needs a separate filter for filtering out hits too far away (inside
>   the bbox but outside the distance limit)
> - It has no scoring, so if somebody wants to sort by distance, he needs to
>   use the custom sort. For that to work, spatial caches the distance
>   calculation (which is broken for multi-segment search)
> The idea is now to combine all three things into one query, but customizable:
> We first thought about extending CustomScoreQuery and calculating the distance
> from FieldCache in the customScore method, returning a score of 1 for
> distance=0, score=0 at the max distance and score<0 for farther hits that are
> in the bounding box but not in the distance circle. To filter out such
> negative scores, we would need to override the scorer in CustomScoreQuery,
> which is private.
> My proposal is now to use a very stripped down CustomScoreQuery (but not
> extend it) that calls a method getDistance(docId) in its scorer's advance
> and nextDoc, which calculates the distance for the current doc. It also stores
> this distance in the scorer. If the distance > maxDistance it throws away the
> hit and calls nextDoc() again.
> The score() method returns by default
> weight.value*(maxDistance - distance)/maxDistance and uses the precalculated
> distance, so the distance is calculated only once, in nextDoc()/advance().
> To be able to plug in custom scoring, the following methods in the query can
> be overridden:
> - float getDistanceScore(double distance) - returns by default
>   (maxDistance - distance)/maxDistance; allows score customization
> - DocIdSet getBoundingBoxDocIdSet(Reader, LatLng sw, LatLng ne) - returns a
>   DocIdSet for the bounding box. By default it returns e.g. the DocIdSet of
>   an NRF or a cartesian tier filter. You can even plug in any other DocIdSet,
>   e.g. wrap a Query with QueryWrapperFilter
> - support a setter for the GeoDistanceCalculator that is used by the scorer
>   to get the distance.
> - a LatLng provider (similar to CustomScoreProvider/ValueSource) that returns
>   the lat/lng for a given doc id. This method is called once per IndexReader
>   at scorer creation and retrieves the coordinates. This way we support
>   FieldCache or any other source.
> This query is almost finished in my head, it just needs coding :-)
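
The scorer loop proposed in the issue can be illustrated with a small, Lucene-free sketch. This is not the attached DistanceQuery.java. The class DistanceScorerSketch, the LatLngProvider interface, the haversine helper (standing in for GeoDistanceCalculator) and the plain int[] of candidates (standing in for the bounding-box DocIdSet) are all hypothetical names chosen for illustration. Only nextDoc(), score(), getDistanceScore(), maxDistance and the default formula come from the description above; advance() is left out and would apply the same distance check.

```java
/** Hypothetical, Lucene-free sketch of the scorer loop described in the issue. */
class DistanceScorerSketch {
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    /** Assumed stand-in for the proposed LatLng provider: {lat, lng} in degrees. */
    interface LatLngProvider {
        double[] latLng(int docId);
    }

    private final int[] candidates;        // bounding-box doc ids, ascending
    private final LatLngProvider provider;
    private final double centerLat, centerLng;
    private final double maxDistance;      // kilometres in this sketch
    private final float weightValue;
    private int pos = -1;
    private double distance;               // distance of the current doc, cached

    DistanceScorerSketch(int[] candidates, LatLngProvider provider,
                         double centerLat, double centerLng,
                         double maxDistance, float weightValue) {
        this.candidates = candidates;
        this.provider = provider;
        this.centerLat = centerLat;
        this.centerLng = centerLng;
        this.maxDistance = maxDistance;
        this.weightValue = weightValue;
    }

    /** Haversine distance in km; an assumed substitute for GeoDistanceCalculator. */
    static double haversineKm(double lat1, double lng1, double lat2, double lng2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLng = Math.toRadians(lng2 - lng1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                 + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                 * Math.sin(dLng / 2) * Math.sin(dLng / 2);
        return 2 * 6371.0 * Math.asin(Math.sqrt(a));
    }

    /** Moves to the next candidate inside the circle; each distance is computed once. */
    int nextDoc() {
        while (pos + 1 < candidates.length) {
            pos++;
            double[] p = provider.latLng(candidates[pos]);
            distance = haversineKm(centerLat, centerLng, p[0], p[1]);
            if (distance <= maxDistance) {
                return candidates[pos];    // hit: inside bbox and inside circle
            }
            // otherwise: inside bbox but too far away, so drop it and continue
        }
        return NO_MORE_DOCS;
    }

    /** Overridable hook from the proposal: 1 at the center, 0 at maxDistance. */
    float getDistanceScore(double distance) {
        return (float) ((maxDistance - distance) / maxDistance);
    }

    /** Reuses the distance cached by nextDoc(); only valid after a successful nextDoc(). */
    float score() {
        return weightValue * getDistanceScore(distance);
    }

    public static void main(String[] args) {
        // Made-up coordinates (lat, lng in degrees), indexed by doc id.
        double[][] coords = { {0.0, 0.0}, {0.0, 0.5}, {0.0, 5.0} };
        DistanceScorerSketch scorer = new DistanceScorerSketch(
            new int[] {0, 1, 2}, doc -> coords[doc], 0.0, 0.0, 100.0, 1.0f);
        for (int doc = scorer.nextDoc(); doc != NO_MORE_DOCS; doc = scorer.nextDoc()) {
            System.out.println("doc=" + doc + " score=" + scorer.score());
        }
    }
}
```

Because hits outside the circle are dropped inside nextDoc(), no negative scores ever reach the collector, which is the reason the proposal avoids extending CustomScoreQuery. A subclass could override getDistanceScore() to use a different falloff without touching the iteration logic.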