[ 
https://issues.apache.org/jira/browse/SOLR-1568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12851154#action_12851154
 ] 

Yonik Seeley commented on SOLR-1568:
------------------------------------

Things are looking good!  I like the SpatialQueryable interface and the 
SpatialOptions class, which should allow us to use an interface and slowly 
migrate by changing SpatialOptions w/o breaking back compat.  I haven't looked 
much at the implementations of geohash or tier stuff yet - more the basic point 
type and user interfaces.

High Level Interface:
* units: if we do specify units, more standard abbreviations would probably be 
"km" and "mi" rather than "K" and "M"
  * just standardizing on units might be the best option, as I think it would 
be simplest to use (and client conversion is trivial)
  * Examples of confusion and complications that "units" can cause: 
    ** When filtering and sorting, what happens if there is a units mismatch?
    ** does "units" apply to the optional "radius" argument?
    ** what are the units for calculated and returned distances?  People might 
assume that, because they filtered using "km", a distance sort or other 
function query would use the same units
    ** if returned distances can be in any unit, it complicates client code 
that may not know the exact request, and hence the units
    ** boosting or adding in a function of distance to the relevancy score: 
changing the units would unexpectedly change how this worked.
* fl should probably just be "f" (fl stands for "field list")?   Update: 
looking at the code again, it seems that the idea is to allow specifying 
lat,lon separately too?
   we have point and latlon fields for this though... shouldn't it always just 
be a single field?
* "radius" parameter is likely to confuse people (e.g. "I specified radius=1 
because I wanted to search within a 1 mile radius"), while in reality this 
represents the radius of the earth.  Since 99.99% of people should not use this 
parameter, perhaps rename to "planet_radius" or "sphere_radius"?
* Seems like it would be really nice to be able to do both distance filtering 
and bounding box filtering.
  ** Distance filtering would normally be implemented as a combination of a 
bounding box + distance calculations for points within that box.
  ** Distance filtering could get even more efficient in the future... we could 
also calculate an "inner" box within the bounding box.
     Any points in the inner box would be guaranteed to be less than the 
distance, hence no need to calculate exact distance.
  ** It looks like the current implementation(s) does bounding box only?
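
To make the "bounding box + distance calculations" idea concrete, here is a rough sketch (not code from the patch). The class name, method names and the mean earth radius are all made up for illustration, and it assumes the circle does not contain a pole or cross the 180 meridian:

```java
/** Hypothetical sketch, not Solr code: cheap bounding-box test first, exact distance second. */
final class DistanceFilterSketch {
    // Assumption: mean earth radius in km.
    static final double EARTH_RADIUS_KM = 6371.0;

    /** Great-circle distance in km, computed with the haversine formula. */
    static double haversineKm(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                 + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                 * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        return 2 * EARTH_RADIUS_KM * Math.asin(Math.min(1.0, Math.sqrt(a)));
    }

    /** True if (lat, lon) lies in a lat/lon box that encloses the circle around the center. */
    static boolean inBoundingBox(double cLat, double cLon, double distKm,
                                 double lat, double lon) {
        double angular = distKm / EARTH_RADIUS_KM;          // angular distance in radians
        double dLatDeg = Math.toDegrees(angular);
        double ratio = Math.sin(angular) / Math.cos(Math.toRadians(cLat));
        // If ratio >= 1 the circle reaches a pole; fall back to all longitudes.
        double dLonDeg = ratio >= 1.0 ? 180.0 : Math.toDegrees(Math.asin(ratio));
        return Math.abs(lat - cLat) <= dLatDeg && Math.abs(lon - cLon) <= dLonDeg;
    }

    /** Distance filter: only points that pass the box test pay for the exact calculation. */
    static boolean withinDistance(double cLat, double cLon, double distKm,
                                  double lat, double lon) {
        return inBoundingBox(cLat, cLon, distKm, lat, lon)
            && haversineKm(cLat, cLon, lat, lon) <= distKm;
    }
}
```

A point near a corner of the box, such as 57.5,-64 for a 1000 km circle around 49,-77, passes the box test but fails the exact check, which is the difference between bounding box filtering and distance filtering.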

Code:
* SpatialFilterQParserPlugin should probably just be a standard parser - no 
need to explicitly register in solrconfig.xml
  ** oops, I see that this is already done under "sfilt" currently - so it can 
just be removed from solrconfig.xml
  ** is there a reason why it's ResourceLoaderAware?
* for the implementation that creates a single range query for latitude and a 
single one for longitude: we actually need multiple ranges (or multiple boxes) 
to handle boxes that cross the 180 meridian, as well as boxes that cross the 
poles. 
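
As a sketch of the "multiple ranges" point (the class and method names are hypothetical, not from the patch): a longitude interval that runs past +/-180 can be split into two ranges that each stay inside [-180, 180]. This assumes the center longitude is already normalized to [-180, 180]. It does not handle boxes that cross a pole, which would presumably need the full longitude range.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch, not Solr code: split a longitude interval at the 180 meridian. */
final class LonRangeSketch {
    /**
     * Returns one or two {min, max} ranges in degrees, each within [-180, 180],
     * that together cover centerLon +/- halfWidthDeg.
     */
    static List<double[]> lonRanges(double centerLon, double halfWidthDeg) {
        List<double[]> ranges = new ArrayList<>();
        if (halfWidthDeg >= 180.0) {
            // The interval wraps all the way around.
            ranges.add(new double[] {-180.0, 180.0});
            return ranges;
        }
        double min = centerLon - halfWidthDeg;
        double max = centerLon + halfWidthDeg;
        if (min < -180.0) {
            // Spills over the western edge: wrap the overflow to the east side.
            ranges.add(new double[] {min + 360.0, 180.0});
            ranges.add(new double[] {-180.0, max});
        } else if (max > 180.0) {
            // Spills over the eastern edge: wrap the overflow to the west side.
            ranges.add(new double[] {min, 180.0});
            ranges.add(new double[] {-180.0, max - 360.0});
        } else {
            ranges.add(new double[] {min, max});
        }
        return ranges;
    }
}
```

Each returned range could then back its own longitude range query, with the results combined as alternatives.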

Tests:
* SpatialQParserPluginTest - it might be nice (and easier to maintain) if this 
were slightly higher level - test that the produced filter successfully filters 
out points outside the box, rather than testing that a specific type of query 
is returned (currently the test does "query instanceof NumericRangeQuery"). 

Example schema:
* before release, we'll prob want to cut down the spatial field types to just 
one recommended one that people can use w/o having to figure out the difference 
between all the types.  OK to keep multiple in during development though... 
makes for easier ad hoc testing.

With an eye toward the end game, here's an example of what we might want to 
shoot for:

{code}
// use case: filter by distance, sort by distance
&point=49,-77
&dist=1000
&fq={!sfilt fl=location}
&sort=dist($point,location) asc
  // we can't quite do this yet with function query syntax, but it seems nice?

// use case: filter by distance, boost relevance score by a function of distance
&point=49,-77
&dist=1000
&fq={!sfilt fl=location}
&boost=recip(dist($point,location),1,$dist,$dist) 
  // "boost" is an edismax param.
  // for documents that pass the filter, the function will yield values between .5 and 1
{code}
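
To check the ".5 to 1" claim: Solr's recip(x,m,a,b) function computes a/(m*x+b), so with m=1 and a=b=dist the boost is dist/(x+dist). A tiny sketch (the class name is made up, and the distances are in whatever units dist uses):

```java
/** Hypothetical sketch: evaluates a/(m*x+b), the formula behind Solr's recip function. */
final class RecipSketch {
    static double recip(double x, double m, double a, double b) {
        return a / (m * x + b);
    }

    public static void main(String[] args) {
        double dist = 1000.0;
        // Boost at the query point, halfway out, and at the edge of the filter.
        System.out.println(recip(0.0, 1.0, dist, dist));
        System.out.println(recip(500.0, 1.0, dist, dist));
        System.out.println(recip(dist, 1.0, dist, dist));
    }
}
```

The boost is 1 at the query point and falls to .5 at the filter distance. Documents further away would score below .5, but the fq has already removed them.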

> Implement Spatial Filter
> ------------------------
>
>                 Key: SOLR-1568
>                 URL: https://issues.apache.org/jira/browse/SOLR-1568
>             Project: Solr
>          Issue Type: New Feature
>            Reporter: Grant Ingersoll
>            Assignee: Grant Ingersoll
>            Priority: Minor
>             Fix For: 1.5
>
>         Attachments: CartesianTierQParserPlugin.java, 
> SOLR-1568.Mattmann.031010.patch.txt, SOLR-1568.patch, SOLR-1568.patch, 
> SOLR-1568.patch, SOLR-1568.patch
>
>
> Given an index with spatial information (either as a geohash, 
> SpatialTileField (see SOLR-1586) or just two lat/lon pairs), we should be 
> able to pass in a filter query that takes in the field name, lat, lon and 
> distance and produces an appropriate Filter (i.e. one that is aware of the 
> underlying field type) for use by Solr. 
> The interface _could_ look like:
> {code}
> &fq={!sfilt dist=20}location:49.32,-79.0
> {code}
> or it could be:
> {code}
> &fq={!sfilt lat=49.32 lon=-79.0 f=location dist=20}
> {code}
> or:
> {code}
> &fq={!sfilt p=49.32,-79.0 f=location dist=20}
> {code}
> or:
> {code}
> &fq={!sfilt lat=49.32,-79.0 fl=lat,lon dist=20}
> {code}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
