Oops, forgot to include solr-user@ in the original email. FYI below...

------ Forwarded Message
From: "Mattmann, Chris A (388J)" <chris.a.mattm...@jpl.nasa.gov>
Reply-To: <d...@lucene.apache.org>
Date: Tue, 24 Aug 2010 07:02:58 -0700
To: <d...@lucene.apache.org>
Subject: [Spatial] Geonames and extension to Spatial Solution for Solr

Hi Folks,

You may have noticed a bunch of Spatial-related contributions over the past few days, in particular:

SOLR-2073 Geonames.org UpdateProcessor for Spatial
SOLR-2074 GeoRSS ResponseWriter
SOLR-2075 SpatialQParserPlugin and HostIP adaptor
SOLR-2076 Spatial example schema updates
SOLR-2077 Spatial example solrconfig updates
SOLR-2079 Expose HttpServletRequest object from SolrQueryRequest object
SOLR-2081 BaseResponseWriter isStreamingDocs causes SingleResponseWriter.end to be called 2x
SOLR-2082 Geopost.jar for loading geonames data

These contributions are the result of the final project of one of my students in CSCI 572: Search Engines and Information Retrieval [1] at USC this past Summer (2010). William and I built on the existing Spatial code in Solr to provide more robustness and out-of-the-box ease of use for Spatial. At a high level, here is what we did:

* Allowed easy loading of Geonames.org data using an UpdateRequestProcessor. You can simply use the Geopost.jar tool we provided to load exports from Geonames.org, which contain city, state, and locale information as well as coordinates (lat, lon).

* Loaded geo-tagged data (lat, lon). This was already supported, but we included some default schema fields that pick these values up automatically on ingestion. For our examples, we grabbed a regular RSS news feed, ran it through the geonames.org GeoRSS converter, and transformed it into a Solr input XML file.

* Extended the spatial QParser. Once Geonames.org data has been loaded, and you've loaded a couple of docs that have been geotagged (with lat, lon), you can issue queries like:

  {!spatial ct=[city] s=[state] c=[country] d=[search radius]}search text

  e.g.

  {!spatial ct=Orlando s=FL c=US d=400}NASA

  (which in our example looked up articles about NASA and JPL stories within 400 km (or miles) of Orlando, FL in the US).

* Allowed for host IP -> lat, lon detection, with a pluggable framework for incorporating services like hostip to provide this functionality.
  That way, if a user doesn't include city, state, etc., in their spatial query, they still get articles "close" to them when using the spatial filter, based on host IP detection. Though this isn't perfect, and is largely dependent on the architectural topology, it's a really good start.

Those are the high-level features. We had to fix a bug along the way, and additionally allow access to some of the objects that Solr tries to insulate you from (of course, with good reason), but it's still a fairly robust spatial solution, and we are incorporating it into several of our projects here at JPL.

We hope it helps out the community and that folks find it useful.

Thanks!

Cheers,
Chris

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: chris.mattm...@jpl.nasa.gov
WWW: http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

------ End of Forwarded Message
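
For anyone who wants to try the {!spatial ...} query form from the forwarded message in a script, here is a minimal Python sketch that assembles the query string and URL-encodes it as the q parameter of a select request. Only the ct, s, c, and d parameter names come from the message. The helper names spatial_query and select_url, the base URL http://localhost:8983/solr, and the use of a /select handler are assumptions for illustration, and the sketch does not try to quote values that contain spaces, since the message doesn't say how the plugin handles those.

```python
from urllib.parse import urlencode


def spatial_query(text, city=None, state=None, country=None, radius=None):
    """Build a query of the form {!spatial ct=.. s=.. c=.. d=..}text.

    The ct/s/c/d names are taken from the message; this helper itself
    is hypothetical and not part of the contributed patches.
    """
    params = [("ct", city), ("s", state), ("c", country), ("d", radius)]
    parts = []
    for name, value in params:
        if value is None:
            continue  # leave out anything the caller did not supply
        value = str(value)
        if any(ch.isspace() for ch in value):
            # Quoting rules for this plugin are not described in the
            # message, so refuse rather than guess.
            raise ValueError(f"{name} value contains whitespace: {value!r}")
        parts.append(f"{name}={value}")
    return "{!spatial " + " ".join(parts) + "}" + text


def select_url(base_url, query):
    """Return a select URL with the query URL-encoded as the q parameter.

    base_url and the /select path are assumptions about the deployment.
    """
    return base_url.rstrip("/") + "/select?" + urlencode({"q": query})


if __name__ == "__main__":
    q = spatial_query("NASA", city="Orlando", state="FL", country="US", radius=400)
    print(q)
    print(select_url("http://localhost:8983/solr", q))
```

URL-encoding matters here because the braces, exclamation mark, spaces, and equals signs in the query would otherwise be misread as part of the request URL.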