maxDistErr should be about 0.3, based on earlier parts of this discussion,
since your data resolves to one of a couple of hours of the day, not whole
days.  If it were whole days, you would use 1.  Changing this requires a
re-index.  So does changing worldBounds, if you do so.
distErrPct should be 0.  Changing it does not require a re-index because you
are indexing points, not other shapes; distErrPct only affects other shapes.

As for the slight buffer on the query shape that I mentioned in my last
email, it should be less than half of maxDistErr, whatever you set that to.
So use something like 0.1.
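
A minimal sketch of how that buffer could be applied when building the
negated overlap filter, assuming bookings are indexed as points with
x = start day and y = end day, and that worldBounds covers 0..3650 on both
axes. The helper names (availability_filter, is_available) and the default
values are hypothetical, chosen for illustration only. The second function
models the same test with exact arithmetic; the real index works on a grid
governed by maxDistErr, so treat it as an approximation of Solr's behaviour.

```python
def availability_filter(field, check_in, check_out, buffer=0.1,
                        world_min=0, world_max=3650):
    """Build a negated Solr overlap filter for a requested stay.

    A booking (x=start, y=end) overlaps the stay when it starts before
    the stay ends and ends after the stay starts.
    """
    min_x = world_min            # booking may start anywhere from the world minimum...
    max_x = check_out - buffer   # ...up to just before the requested end
    min_y = check_in + buffer    # booking may end anywhere from just after the requested start...
    max_y = world_max            # ...up to the world maximum
    return f'-{field}:"Intersects({min_x:g} {min_y:g} {max_x:g} {max_y:g})"'


def is_available(bookings, check_in, check_out, buffer=0.1):
    """Exact-arithmetic model of the same test, for reasoning on paper."""
    for start, end in bookings:
        if start <= check_out - buffer and end >= check_in + buffer:
            return False  # this booking overlaps the requested stay
    return True


# Sample points taken from the document shown later in this thread.
bookings = [(147.6, 163.4), (164.6, 178.4), (192.6, 220.4), (241.6, 264.4)]

print(availability_filter("availability_spatial", 30, 115))
print(is_available(bookings, 30, 115))        # stay before any booking
print(is_available(bookings, 163.4, 164.6))   # stay adjacent to two bookings
print(is_available(bookings, 160, 170))       # stay overlapping a booking
```

The adjacent case is the one the buffer exists for: without it, a stay that
begins exactly when a booking ends would be treated as a conflict.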

~ David


Chris Atkinson wrote
> Hi David,
> Thanks for your continued help.
> 
> I think that you have hit the nail on the head for me. I'm 100% sure that I
> had previously tried that query without success. I'm not sure if perhaps I
> had the wrong distErrPct or maxDistErr values...
> It's getting late, so I'm going to call it a night (I'm on GMT), but I'll
> put your example into practice tomorrow and get confirmation that it's
> working as expected.
> 
> I'll keep playing around with the distErrPct values as well.
> Do I need to do a reindex if I change these values? (I think yes?)
> 
> 
> On Tue, Jun 4, 2013 at 10:44 PM, Smiley, David W. <dsmiley@> wrote:
> 
>> So "availability" is the absence of any other document's indexed time
>> duration overlapping with your availability query duration.  So I think
>> you should negate an overlaps query.  The overlaps query looks like:
>> Intersects(-Inf start end Inf).  And remember the slight buffering needed
>> as described on the wiki.  You'd add a small fraction to the start time
>> and subtract a small fraction from the end time, so that you don't
>> accidentally match a document that is adjacent.
>>
>> -availability_spatial:"Intersects( 0 30.5 114.5 3650 )"
>>
>> Does that work against your data?  If it doesn't, can you conjecture why
>> it doesn't work based on a sample point in a document that it matched, or
>> a document that should have matched but didn't?
>>
>> ~ David
>>
>> On 6/4/13 3:31 PM, "Chris Atkinson" <chrisacky@> wrote:
>>
>> >Here is an example I have tried.
>> >
>> >So let's assume that I want to checkIn on the 30th day, and leave on the
>> >115th day.
>> >
>> >My query would be:
>> >
>> >-availability_spatial:"Intersects(   30 0  3650 115 )"
>> >
>> >However, that wouldn't match anything. Here is an example document below
>> >so you can see. (I've not negated the spatial field in the filter query
>> >so you can see the field coordinates.)
>> >
>> >In case the formatting is bad: See here
>> >
>> >http://pastie.org/pastes/8006249/text
>> >
>> >
>> >
>> ><?xml version="1.0" encoding="UTF-8"?>
>> ><response>
>> >  <lst name="responseHeader">
>> >    <int name="status">0</int>
>> >    <int name="QTime">1</int>
>> >    <lst name="params">
>> >      <str name="fl">availability_spatial</str>
>> >      <str name="indent">true</str>
>> >      <str name="q">id:38197</str>
>> >      <str name="_">1370374172298</str>
>> >      <str name="wt">xml</str>
>> >      <str name="fq">availability_spatial:"Intersects( 30 0 3650 115 )"</str>
>> >    </lst>
>> >  </lst>
>> >  <result name="response" numFound="1" start="0">
>> >    <doc>
>> >      <arr name="availability_spatial">
>> >        <str>147.6 163.4</str>
>> >        <str>164.6 178.4</str>
>> >        <str>192.6 220.4</str>
>> >        <str>241.6 264.4</str>
>> >      </arr>
>> >    </doc>
>> >  </result>
>> ></response>
>> >
>> >
>> >On Tue, Jun 4, 2013 at 8:14 PM, Chris Atkinson <chrisacky@> wrote:
>> >
>> >> Thanks David.
>> >> Query times are really quick and my index is only 20MB now, which is
>> >> about what I would expect.
>> >> I'm having some problems figuring out what type of query I need to
>> >> find *Available* properties with this new points system.
>> >>
>> >>
>> >> I'm storing bookings against each document. So I have X Y coordinates,
>> >> where X will be  the check in of a previous booking, and Y will be the
>> >> departure.
>> >>
>> >> So, for illustrative purposes, a week's booking from 10th January
>> >> to the 17th would be X Y => 10 17
>> >>
>> >> <field name="booking">10 17</field>
>> >> <field name="booking">22 27</field>
>> >>
>> >> I might have several bookings.
>> >>
>> >> Now, I want to find available properties with my search, but I'm just
>> >> not sure of the ordering of the end/start in the polygon Intersect.
>> >>
>> >> I've looked at this document very carefully and tried to draw it all
>> >> out on paper.
>> >>
>> >> https://people.apache.org/~hossman/spatial-for-non-spatial-meetup-20130117/
>> >> Here are the suggestions:
>> >>
>> >> q=fieldX:"Intersects(-∞ end start ∞)"
>> >> q=fieldX:"Intersects(-∞ start end ∞)"
>> >> q=fieldX:"Intersects(start -∞ ∞ end)"
>> >>
>> >> All of these are great for finding the existence of a field
>> >> coordinate, but I need to make sure that the property is available.
>> >> So I thought I could use one of these three queries in the negative
>> >> by using -fieldX:"Inter...." but none of those work.
>> >>
>> >> Can you shed some light on what I might be missing?
>> >> What ordering would I want for *availability*?
>> >> Thanks very much.
>> >>
>> >>
>> >>
>> >> On Mon, Jun 3, 2013 at 11:45 PM, Smiley, David W. <dsmiley@> wrote:
>> >>
>> >>> Hi Chris:
>> >>>
>> >>> Have you read: http://wiki.apache.org/solr/SpatialForTimeDurations
>> >>> You're modeling your data sub-optimally.  Full precision rectangles
>> >>> (distErrPct=0) don't scale well and you're seeing that.  You should
>> >>> represent your durations as a point and it will take up a fraction of
>> >>> the space (see above).  Furthermore, because your detail gets into one
>> >>> digit to the right of the decimal, your maxDistErr should definitely
>> >>> be smaller than 1 -- use something like 0.5 (given you have two levels
>> >>> of precision below a full day) but to be safer (more certain it's not
>> >>> a problem) use 0.3 -- a little less.  Please report back how that goes.
>> >>>
>> >>> ~ David
>> >>>
>> >>> On 6/3/13 7:27 AM, "Chris Atkinson" <chrisacky@> wrote:
>> >>>
>> >>> >Hi,
>> >>> >I'm seeing really slow query times: 7-25 seconds when I run a simple
>> >>> >filter query that uses my SpatialRecursivePrefixTreeFieldType field.
>> >>> >
>> >>> >My index is about 30k documents. Prior to adding the spatial field,
>> >>> >the on-disk space was about 100MB, so it's a really tiny index. Once I
>> >>> >add the spatial field (which is multi-valued), the index size jumps up
>> >>> >to 2GB. (Is this normal?)
>> >>> >
>> >>> >Only about 10k documents will have any spatial data. Typically, they
>> >>> >will have at most 10 shapes each, but the majority are all one of two
>> >>> >rectangles.
>> >>> >
>> >>> >This is my fieldType definition.
>> >>> >
>> >>> >   
>> >>> >  <fieldType name="date_availability"
>> >>> >             class="solr.SpatialRecursivePrefixTreeFieldType"
>> >>> >             geo="false"
>> >>> >             worldBounds="0 0 3650 1"
>> >>> >             distErrPct="0"
>> >>> >             maxDistErr="1"
>> >>> >             units="degrees"
>> >>> >  />
>> >>> >
>> >>> >And the field
>> >>> >
>> >>> > 
>> >>> >  <field name="availability_spatial" type="date_availability"
>> >>> >         indexed="true" stored="false" multiValued="true" />
>> >>> >
>> >>> >
>> >>> >I am using the field to represent approximately 10 years after
>> >>> >January 1st 2013, where each day is along the X-axis. Because the
>> >>> >availability starts and ends at 2pm and 10am, I was using a decimal
>> >>> >place when creating my shape to show that detail. (Is this approach
>> >>> >wrong?)
>> >>> >
>> >>> >So a typical rectangle when indexed would be (minX minY maxX maxY)
>> >>> >
>> >>> >Rectangle 100.6 0 120.4 1
>> >>> >
>> >>> >Is it wrong that my Y and X values are not of the same scale? Since I
>> >>> >don't care about the Y axis at all, I just set it to always be of
>> >>> >height 1.
>> >>> >
>> >>> >I'm running Solr 4.3, with a small JVM of 768M (can be increased),
>> >>> >and I have 2GB RAM (again, can be increased).
>> >>> >
>> >>> >Thanks
>> >>>
>> >>>
>> >>
>>
>>





-----
 Author: http://www.packtpub.com/apache-solr-3-enterprise-search-server/book
--
View this message in context: 
http://lucene.472066.n3.nabble.com/SpatialRecursivePrefixTreeFieldType-Spatial-Searching-tp4067778p4068216.html
Sent from the Solr - User mailing list archive at Nabble.com.
