Hi Ryan,

>>I'm trying to understand how you have your data indexed so we can give
>>reasonable direction.
>>
>>What field type are you using for your locations?  Is it using the
>>solr spatial field types?  What do you see when you look at the debug
>>information from &debugQuery=true?

We query against a LatLonType field using plain latitudes and
longitudes and the bbox function. We send the bbox filter as an
uncached filter query (we had to do this to bring down the eviction
rate in the filter cache, which had been a problem). Our filter cache
is set up as follows:

Concurrent LRU Cache(maxSize=32768, initialSize=8192, minSize=29491,
acceptableSize=31129, cleanupThread=false, autowarmCount=8192,
regenerator=org.apache.solr.search.SolrIndexSearcher$2@2fd1fc5c)

We restarted the slaves just 30 minutes ago, so these values don't
tell us much yet, but we see a hit rate of up to 97% on the filter
cache:

lookups : 13003
hits : 12440
hitratio : 0.95
inserts : 563
evictions : 0
size : 8927
warmupTime : 116891
cumulative_lookups : 9990103
cumulative_hits : 9583913
cumulative_hitratio : 0.95
cumulative_inserts : 406191
cumulative_evictions : 0

The warmup time looks a bit worrying. Is that a high value in your
experience?

As for debugQuery, here's the relevant snippet for the kind of geo
queries we send:

<arr name="filter_queries">
<str>{!bbox cache=false d=50 sfield=location_ll pt=54.1434,-0.452322}</str>
</arr>
<arr name="parsed_filter_queries">
<str>
WrappedQuery({!cache=false
cost=0}+location_ll_0_coordinate:[53.69373983225355 TO
54.59306016774645] +location_ll_1_coordinate:[-1.2199462259963294 TO
0.31530222599632934])
</str>
</arr>
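
To illustrate how the parsed ranges above relate to pt and d, here is
a rough Python sketch of a bounding-box calculation. It is not Solr's
actual code: the function name, the earth radius constant and the
longitude formula are assumptions on my part, and it ignores the poles
and the date line. For pt=54.1434,-0.452322 and d=50 it gives bounds
close to the ones in the parsed filter, though not identical in every
digit.

```python
import math

# Assumed mean earth radius in km; Solr may use a slightly different value.
EARTH_MEAN_RADIUS_KM = 6371.0087714


def bounding_box(lat_deg, lon_deg, d_km, radius_km=EARTH_MEAN_RADIUS_KM):
    """Return (min_lat, max_lat, min_lon, max_lon) in degrees.

    Hypothetical helper: approximates a box that encloses a circle of
    radius d_km around the point. No handling of poles or the date line.
    """
    angular = d_km / radius_km              # angular distance in radians
    lat_rad = math.radians(lat_deg)

    # Latitude offset is simply the angular distance.
    d_lat = math.degrees(angular)
    # Longitude offset grows with latitude because meridians converge.
    d_lon = math.degrees(math.asin(math.sin(angular) / math.cos(lat_rad)))

    return (lat_deg - d_lat, lat_deg + d_lat,
            lon_deg - d_lon, lon_deg + d_lon)


if __name__ == "__main__":
    print(bounding_box(54.1434, -0.452322, 50))
```

The two resulting ranges correspond to the two range clauses on
location_ll_0_coordinate and location_ll_1_coordinate.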

>>
>>From my experience, there is no single best practice for spatial
>>queries -- it will depend on your data density and distribution.
>>
>>You may also want to look at:
>>http://code.google.com/p/lucene-spatial-playground/
>>but note this is off lucene trunk -- the geohash queries are super fast
>>though

Thanks, I will look into that! I haven't really considered geohashes
yet. As far as I understand, documents with a lat/lon are already
assigned a geohash at index time, is that correct? In what way does a
query get faster when I query by geohash rather than by lat/lon?
Doesn't local lucene already map documents to a cartesian grid at
index time, thus reducing lookup time? Moreover, will this make the
results less accurate, since different lat/lons may collapse into the
same hash?
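
To make the accuracy question concrete, here is a minimal Python
sketch of the standard geohash encoding as I understand it. This is
not the implementation used by Solr or by lucene-spatial-playground;
the function name and the second sample point are made up for
illustration.

```python
# Standard geohash base-32 alphabet (no a, i, l, o).
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash_encode(lat, lon, precision=6):
    """Encode a lat/lon pair as a geohash string of `precision` characters."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    bits = 0
    bit_count = 0
    use_lon = True  # bits alternate, starting with longitude

    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                bits = (bits << 1) | 1
                lon_lo = mid
            else:
                bits <<= 1
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits = (bits << 1) | 1
                lat_lo = mid
            else:
                bits <<= 1
                lat_hi = mid
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:  # every 5 bits become one base-32 character
            chars.append(BASE32[bits])
            bits = 0
            bit_count = 0

    return "".join(chars)


if __name__ == "__main__":
    a = (54.1434, -0.452322)  # the pt from the query above
    b = (54.2, -0.5)          # hypothetical nearby point
    for precision in (3, 6):
        print(precision, geohash_encode(*a, precision),
              geohash_encode(*b, precision))
```

At 3 characters both points fall into the same cell, while at 6
characters they get different hashes, so the collapsing I am asking
about depends on how many characters are kept.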

Thanks!

-- 
Matthias Käppler
Lead Developer API & Mobile

Qype GmbH
Großer Burstah 50-52
20457 Hamburg
Telephone: +49 (0)40 - 219 019 2 - 160
Skype: m_kaeppler
Email: matth...@qype.com

Managing Director: Ian Brotherston
Amtsgericht Hamburg
HRB 95913
