[
https://issues.apache.org/jira/browse/LUCENE-2395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13836476#comment-13836476
]
Michael McCandless commented on LUCENE-2395:
--------------------------------------------
bq. BTW the idea of having a spatial Query that returns the score as the
distance and doesn't need a point cache to do it, is very doable with Lucene
4's spatial module & the RecursivePrefixTree strategy.
Can't you just use the expressions module for this? (LUCENE-5258)
> Add a scoring DistanceQuery that does not need caches and separate filters
> --------------------------------------------------------------------------
>
> Key: LUCENE-2395
> URL: https://issues.apache.org/jira/browse/LUCENE-2395
> Project: Lucene - Core
> Issue Type: Improvement
> Components: modules/spatial
> Reporter: Uwe Schindler
> Attachments: ASF.LICENSE.NOT.GRANTED--DistanceQuery.java,
> ASF.LICENSE.NOT.GRANTED--DistanceQuery.java
>
>
> After a chat with Chris Male, and based on my own ideas from implementing
> this for PANGAEA, I thought about the broken distance query in contrib. It
> has the following shortcomings:
> - It needs a query/filter for the enclosing bbox (which is constant score)
> - It needs a separate filter for filtering out hits too far away (inside bbox
> but outside distance limit)
> - It has no scoring, so anybody who wants to sort by distance needs to
> use the custom sort. For that to work, spatial caches the distance
> calculations (which is broken for multi-segment search)
> The idea is now to combine all three things into one query, but customizable:
> We first thought about extending CustomScoreQuery and calculating the distance
> from FieldCache in the customScore method, returning a score of 1 for
> distance=0, a score of 0 at the max distance, and score<0 for hits farther
> away that are in the bounding box but not in the distance circle. To filter
> out such negative scores, we would need to override the scorer in
> CustomScoreQuery, which is private.
> My proposal is now to use a very stripped-down CustomScoreQuery (but not
> extend it) that calls a method getDistance(docId) in its scorer's advance()
> and nextDoc() to calculate the distance for the current doc. It also stores
> this distance in the scorer. If distance > maxDistance, it throws away the
> hit and calls nextDoc() again. The score() method will by default return
> weight.value*(maxDistance - distance)/maxDistance, using the precalculated
> distance. So the distance is only calculated once, in nextDoc()/advance().
> To be able to plug in custom scoring, the following methods in the query can
> be overridden:
> - float getDistanceScore(double distance) - by default returns
> (maxDistance - distance)/maxDistance; allows score customization
> - DocIdSet getBoundingBoxDocIdSet(Reader, LatLng sw, LatLng ne) - returns a
> DocIdSet for the bounding box. By default it returns e.g. the DocIdSet of a
> NRF or a cartesian tier filter. You can even plug in any other DocIdSet, e.g.
> wrap a Query with QueryWrapperFilter
> - a setter for the GeoDistanceCalculator that is used by the scorer
> to get the distance.
> - a LatLng provider (similar to CustomScoreProvider/ValueSource) that returns
> the lat/lng for a given doc id. This method is called once per IndexReader
> at scorer creation and will retrieve the coordinates. That way we support
> FieldCache or any other source.
> This query is almost finished in my head, it just needs coding :-)
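
For readers following the quoted proposal, here is a minimal sketch of the scorer loop it describes, in plain Java with no Lucene dependency. It is not the attached DistanceQuery.java. The class name DistanceScorerSketch, the int[] of candidate doc ids (standing in for the bounding-box DocIdSet) and the IntToDoubleFunction (standing in for getDistance(docId)) are all assumptions made for illustration. Only getDistanceScore, nextDoc, advance, score and the default formula come from the description.

```java
import java.util.function.IntToDoubleFunction;

/** Illustrative sketch only: mimics the proposed scorer without using Lucene classes. */
public class DistanceScorerSketch {
    public static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    private final int[] candidates;               // sorted doc ids from the bounding-box pre-filter (assumed)
    private final IntToDoubleFunction distanceOf; // hypothetical stand-in for getDistance(docId)
    private final double maxDistance;
    private final float weightValue;              // stand-in for weight.value
    private int pos = -1;
    private double distance;                      // distance of the current doc, computed once

    public DistanceScorerSketch(int[] candidates, IntToDoubleFunction distanceOf,
                                double maxDistance, float weightValue) {
        this.candidates = candidates;
        this.distanceOf = distanceOf;
        this.maxDistance = maxDistance;
        this.weightValue = weightValue;
    }

    /** Moves to the next candidate inside the distance circle, skipping hits that are too far away. */
    public int nextDoc() {
        while (pos < candidates.length - 1) {
            pos++;
            distance = distanceOf.applyAsDouble(candidates[pos]);
            if (distance <= maxDistance) {
                return candidates[pos];
            }
        }
        pos = candidates.length;
        return NO_MORE_DOCS;
    }

    /** Moves to the first accepted doc whose id is >= target. */
    public int advance(int target) {
        int doc;
        do {
            doc = nextDoc();
        } while (doc < target);
        return doc;
    }

    /** Overridable hook: 1 at distance 0, falling linearly to 0 at maxDistance. */
    protected float getDistanceScore(double distance) {
        return (float) ((maxDistance - distance) / maxDistance);
    }

    /** Uses the distance stored by nextDoc()/advance(); nothing is recomputed here. */
    public float score() {
        return weightValue * getDistanceScore(distance);
    }

    public static void main(String[] args) {
        double[] dist = {0.0, 5.0, 12.0, 10.0};   // made-up distances indexed by doc id
        DistanceScorerSketch scorer =
            new DistanceScorerSketch(new int[] {0, 1, 2, 3}, d -> dist[d], 10.0, 1.0f);
        for (int doc = scorer.nextDoc(); doc != NO_MORE_DOCS; doc = scorer.nextDoc()) {
            System.out.println("doc=" + doc + " score=" + scorer.score());
        }
    }
}
```

In this sketch a subclass could override getDistanceScore to change the falloff. The bounding-box source and the distance function are passed in through the constructor, which roughly corresponds to the pluggable DocIdSet and LatLng provider in the proposal.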