Hi Ryan,

>> I'm trying to understand how you have your data indexed so we can give
>> reasonable direction.
>>
>> What field type are you using for your locations? Is it using the
>> solr spatial field types? What do you see when you look at the debug
>> information from &debugQuery=true?

We query against a LatLonType field using plain latitudes and longitudes and the bbox function. We send the bbox filter in a filter query that is uncached (we had to do this to bring down the eviction rate in the filter cache, which had been a problem for us). Our filter cache is set up as follows:

    Concurrent LRU Cache(maxSize=32768, initialSize=8192, minSize=29491,
    acceptableSize=31129, cleanupThread=false, autowarmCount=8192,
    regenerator=org.apache.solr.search.SolrIndexSearcher$2@2fd1fc5c)

We restarted the slaves just 30 minutes ago, so these values don't tell us much yet, but we see a hit rate of up to 97% on the filter caches:

    lookups : 13003
    hits : 12440
    hitratio : 0.95
    inserts : 563
    evictions : 0
    size : 8927
    warmupTime : 116891
    cumulative_lookups : 9990103
    cumulative_hits : 9583913
    cumulative_hitratio : 0.95
    cumulative_inserts : 406191
    cumulative_evictions : 0

The warmup time looks a bit worrying. Is that a high value in your experience?

As for debugQuery, here's the relevant snippet for the kind of geo queries we send:

    <arr name="filter_queries">
      <str>{!bbox cache=false d=50 sfield=location_ll pt=54.1434,-0.452322}</str>
    </arr>
    <arr name="parsed_filter_queries">
      <str>WrappedQuery({!cache=false cost=0}+location_ll_0_coordinate:[53.69373983225355 TO 54.59306016774645] +location_ll_1_coordinate:[-1.2199462259963294 TO 0.31530222599632934])</str>
    </arr>

>> From my experience, there is no single best practice for spatial
>> queries -- it will depend on your data density and distribution.
>>
>> You may also want to look at:
>> http://code.google.com/p/lucene-spatial-playground/
>> but note this is off lucene trunk -- the geohash queries are super fast
>> though

Thanks, I will look into that! I still haven't really considered geo hashes. As far as I understand, documents with a lat/lon are already assigned a geo hash upon indexing, is that correct? And how does a query get faster when I query by a geo hash rather than by a lat/lon?

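To make sure we mean the same thing by "geo hash", here is a rough sketch of the standard public geohash encoding as I understand it. This is not Solr's or Lucene's actual code, and the name `geohash_encode` and its `precision` parameter are made up purely for illustration. The idea is to halve the longitude and latitude ranges alternately and emit one base-32 character per 5 bits, so a shorter hash is always a prefix of a longer one for the same point.

```python
# Standard geohash base-32 alphabet (no "a", "i", "l", "o").
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash_encode(lat, lon, precision=11):
    """Encode a lat/lon pair as a geohash string of `precision` characters.

    Illustrative sketch only; the function name and signature are hypothetical.
    """
    lat_range = [-90.0, 90.0]
    lon_range = [-180.0, 180.0]
    chars = []
    value = 0        # bits collected for the current character
    bit_count = 0
    use_lon = True   # bits alternate, starting with longitude

    while len(chars) < precision:
        rng, coord = (lon_range, lon) if use_lon else (lat_range, lat)
        mid = (rng[0] + rng[1]) / 2
        if coord >= mid:
            value = (value << 1) | 1
            rng[0] = mid   # keep the upper half
        else:
            value <<= 1
            rng[1] = mid   # keep the lower half
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:
            chars.append(BASE32[value])
            value = 0
            bit_count = 0

    return "".join(chars)


if __name__ == "__main__":
    # The point from the bbox filter query, at a coarse and a finer precision.
    print(geohash_encode(54.1434, -0.452322, precision=4))
    print(geohash_encode(54.1434, -0.452322, precision=9))
```

If that is roughly what happens at index time, then every point inside the same cell gets the same hash at a given length, and adding characters only narrows the cell.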
Doesn't local lucene already map documents to a cartesian grid upon indexing, thus reducing lookup time? Moreover, will this mean the results get less accurate, since different lat/lons may collapse into the same hash?

Thanks!

-- 
Matthias Käppler
Lead Developer API & Mobile
Qype GmbH