[
https://issues.apache.org/jira/browse/LUCENE-5714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14039061#comment-14039061
]
David Smiley commented on LUCENE-5714:
--------------------------------------
Another API change: I don't think a BBoxSimilarity interface is needed.
DistanceSimilarity can be tossed, and so can BBoxSimilarityValueSource.
Instead, AreaSimilarity can become a ShapeAreaValueSource that takes a
ValueSource producing shapes from its objectVal(doc). This is in the same
vein as DistanceToShapeValueSource. It underscores the pluggability of
ValueSources with, say, SerializedDVStrategy: it's plausible that decoding 4
numbers from a contiguous byte array will be faster than retrieving a number
4 times via DocValues, and the calling code shouldn't have to change to
benefit from that; it's plug and play.
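To make the plug-and-play point concrete, here is a minimal, self-contained sketch. It does not use the Lucene or Spatial4j classes: `Rect`, `ShapeSource`, `fromColumns`, `fromBytes`, `encode`, and `area` are hypothetical stand-ins for Shape, a shape-producing ValueSource, and ShapeAreaValueSource. The rectangle area is plain width times height, so dateline crossing is ignored.

```java
import java.nio.ByteBuffer;

public class ShapeAreaSketch {

    /** Hypothetical stand-in for a rectangle Shape (no dateline handling). */
    public static final class Rect {
        public final double minX, maxX, minY, maxY;

        public Rect(double minX, double maxX, double minY, double maxY) {
            this.minX = minX;
            this.maxX = maxX;
            this.minY = minY;
            this.maxY = maxY;
        }

        public double area() {
            return (maxX - minX) * (maxY - minY);
        }
    }

    /** Hypothetical stand-in for a ValueSource whose objectVal(doc) yields a shape. */
    public interface ShapeSource {
        Rect shape(int doc);
    }

    /** Four separate per-document columns, analogous to four numeric lookups per doc. */
    public static ShapeSource fromColumns(double[] minX, double[] maxX,
                                          double[] minY, double[] maxY) {
        return doc -> new Rect(minX[doc], maxX[doc], minY[doc], maxY[doc]);
    }

    /** One contiguous byte array per document holding the same four doubles. */
    public static ShapeSource fromBytes(byte[][] perDoc) {
        return doc -> {
            ByteBuffer buf = ByteBuffer.wrap(perDoc[doc]);
            return new Rect(buf.getDouble(), buf.getDouble(),
                            buf.getDouble(), buf.getDouble());
        };
    }

    /** Serializes a rectangle into the layout that fromBytes expects. */
    public static byte[] encode(Rect r) {
        return ByteBuffer.allocate(4 * Double.BYTES)
                .putDouble(r.minX).putDouble(r.maxX)
                .putDouble(r.minY).putDouble(r.maxY)
                .array();
    }

    /** The consumer only sees ShapeSource, so the storage layout can be swapped freely. */
    public static double area(ShapeSource source, int doc) {
        return source.shape(doc).area();
    }
}
```

Because `area` depends only on the `ShapeSource` abstraction, switching from the column-backed source to the byte-backed one requires no change to the consumer.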
Continuing this (definitely a separate JIRA issue) and looking at the TODOs:
these two methods would move up to SpatialStrategy:
{code:java}
/**
 * Provides access to each rectangle per document as a ValueSource in which
 * {@link org.apache.lucene.queries.function.FunctionValues#objectVal(int)}
 * returns a {@link Shape}.
 */ //TODO raise to SpatialStrategy
public ValueSource makeShapeValueSource() {
  return new BBoxValueSource(this);
}

@Override
public ValueSource makeDistanceValueSource(Point queryPoint, double multiplier) {
  //TODO if makeShapeValueSource gets lifted to the top; this could become a generic impl.
  return new DistanceToShapeValueSource(makeShapeValueSource(), queryPoint,
      multiplier, ctx);
}
{code}
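As a rough illustration of what the "generic impl" in that TODO could look like once the shape accessor lives on the base class, here is a self-contained sketch. Every type in it is a hypothetical stand-in, not the actual SpatialStrategy API, and it assumes the distance is measured from the query point to the shape's center using plain Euclidean distance rather than a pluggable distance calculator.

```java
public class GenericDistanceSketch {

    /** Hypothetical stand-in for a point. */
    public static final class Point {
        public final double x, y;

        public Point(double x, double y) {
            this.x = x;
            this.y = y;
        }
    }

    /** Hypothetical stand-in for a rectangle shape. */
    public static final class Rect {
        public final double minX, maxX, minY, maxY;

        public Rect(double minX, double maxX, double minY, double maxY) {
            this.minX = minX;
            this.maxX = maxX;
            this.minY = minY;
            this.maxY = maxY;
        }

        public Point center() {
            return new Point((minX + maxX) / 2, (minY + maxY) / 2);
        }
    }

    /** Stand-in for a ValueSource that yields a shape per document. */
    public interface ShapeSource {
        Rect shape(int doc);
    }

    /** Stand-in for a ValueSource that yields a double per document. */
    public interface DoubleSource {
        double value(int doc);
    }

    /** Stand-in for the base strategy: subclasses only supply the shapes. */
    public abstract static class Strategy {
        public abstract ShapeSource makeShapeSource();

        /** Generic implementation, written once against makeShapeSource(). */
        public DoubleSource makeDistanceSource(Point queryPoint, double multiplier) {
            ShapeSource shapes = makeShapeSource();
            return doc -> {
                Point c = shapes.shape(doc).center();
                // Euclidean distance is an assumption made for this sketch.
                return Math.hypot(c.x - queryPoint.x, c.y - queryPoint.y) * multiplier;
            };
        }
    }

    /** A concrete strategy that keeps one rectangle per document in memory. */
    public static final class ArrayBackedStrategy extends Strategy {
        private final Rect[] rects;

        public ArrayBackedStrategy(Rect... rects) {
            this.rects = rects;
        }

        @Override
        public ShapeSource makeShapeSource() {
            return doc -> rects[doc];
        }
    }
}
```

With this shape, a concrete strategy overrides only the shape accessor and inherits distance scoring, which is the effect the TODO is after.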
> Improve tests for BBoxStrategy then port to 4x.
> -----------------------------------------------
>
> Key: LUCENE-5714
> URL: https://issues.apache.org/jira/browse/LUCENE-5714
> Project: Lucene - Core
> Issue Type: Improvement
> Components: modules/spatial
> Reporter: David Smiley
> Assignee: David Smiley
> Fix For: 4.10
>
> Attachments:
> LUCENE-5714__Enhance_BBoxStrategy__more_tests,_fix_dateline_bugs,_new_AreaSimilarity_algor.patch
>
>
> BBoxStrategy needs better tests before I'm comfortable seeing it in 4x.
> Specifically, it should use randomized rectangle-based validation (with
> rectangles that may cross the dateline), akin to the other tests. And I think
> I see an equals/hashCode bug to be fixed in there too.
> One particular thing I'd like to see added is how to handle a zero-area case
> for AreaSimilarity. I think an additional feature in which you declare a
> minimum % area (relative to the query shape) would be good.
> It should be possible for the user to combine rectangle center-point to query
> shape center-point distance sorting as well. I think it is but I need to
> make sure it's possible without _having_ to index a separate center point
> field.
> Another possibility (probably not to be addressed here) is a minimum ratio
> between width and height, perhaps 10%. A long line with almost no height
> should not be massively disadvantaged, relevancy-wise, relative to an equally
> long diagonal road that has a square bbox.