solr-user wrote
> Thanks David. No worries about the delay; am always happy and
> appreciative when someone responds.
>
> I don't understand what you mean by "All center points get cached into
> memory upon first use in a score" in question 2 about the Java OOM errors
> I am seeing.

The underlying field type receives one internal Shape instance per WKT string that is handed to it, no matter whether that WKT is a MultiGeometry or not. The center point of that shape is indexed in such a way that it can be read into a cache later. It doesn't matter how many vertices/coordinates your geometries have or how many shapes exist in a single WKT string; one WKT string value results in one point. Just wanted to be clear on that. STNumPoints is the wrong statistic since it counts internal coordinates, from my reading of its documentation just now. STNumGeometries isn't right either if your WKT uses any of the Multi* geometry types.

solr-user wrote
> The Solr instance I have setup for testing has around 200k docs, with one
> WKT field per doc (indexed and stored and set to multivalue).
>
> I did a count of the number of points that get indexed in Solr (computed
> in MS SQL by counting the number of points (using STNumPoints) for each
> geometry (using STNumGeometries) in the WKT data I am indexing), and I
> have around 35M points total.
>
> If only the center points for 190K docs get cached, wouldn't that easily
> fit in 7GB of heap?
>
> Even if Solr was caching 35M points, that still doesn't sound like 7GB
> worth of data.

Yeah... the memory cache may be pig-ish but not that bad. There's something about the implementation that tells me there could be a bug if any of your polygon shapes are small and/or you index at a high resolution.

Given that you have multi-valued spatial data per document, you can't simply use solr.LatLonType. Try this: create a new field called centerPoints or something like that, and give it the same field type as the geohash one you are already using. But for this one, hand Solr the center points of your shape data. Hopefully it's straightforward for you to calculate this. Then when you sort by distance or need to retrieve the distance via a dist:query(...) etc., be sure to use this field and NOT the main shape one that has the full shape indexed. To be sure the spatial module doesn't load the center points for the main shape field, pass needScore=false as a Solr local-param in your filter query for it.

Hopefully that fixes it. If it does, there is a bug and I know what it is.

~ David

-----
Author: http://www.packtpub.com/apache-solr-3-enterprise-search-server/book
--
View this message in context: http://lucene.472066.n3.nabble.com/question-s-re-lucene-spatial-toolkit-aka-LSP-aka-spatial4j-tp3997757p4000276.html
Sent from the Solr - User mailing list archive at Nabble.com.
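
For anyone following along, here is a rough sketch of how the request parameters for the two-field approach above might be put together. It is not a tested recipe: the field name "geo", the "distq" parameter name, the sort expression, and the example spatial clause are all assumptions for illustration, and the exact query syntax depends on the Solr / spatial module version you are running. Only needScore=false on the filter query, the centerPoints field idea, and dist:query(...) come from the message itself.

```python
from urllib.parse import urlencode

# Hypothetical field names; substitute whatever your schema defines.
SHAPE_FIELD = "geo"            # full shapes, used only for filtering
CENTER_FIELD = "centerPoints"  # one center point per shape, used for distance


def build_params(spatial_clause: str) -> dict:
    """Build Solr request parameters for the two-field approach.

    spatial_clause is the spatial predicate text that goes inside the
    quotes; its syntax is version-specific and is an assumption here.
    """
    return {
        "q": "*:*",
        # Filter on the full shapes. needScore=false is the local-param
        # mentioned above, meant to keep this field's center points
        # from being loaded for scoring.
        "fq": f'{{!needScore=false}}{SHAPE_FIELD}:"{spatial_clause}"',
        # Scoring query against the center-point field only (assumed
        # parameter name "distq", dereferenced below with $distq).
        "distq": f'{CENTER_FIELD}:"{spatial_clause}"',
        # Assumed sort expression; check your version's documentation.
        "sort": "query($distq) asc",
        # Return the value as a pseudo-field named dist.
        "fl": "id,dist:query($distq)",
    }


if __name__ == "__main__":
    # Placeholder clause; replace with a predicate valid for your setup.
    params = build_params("Intersects(Circle(-80.9,33.0 d=0.5))")
    print(urlencode(params))
```

The point of keeping the two fields separate is that anything that needs a score or a distance only ever touches centerPoints, while the full-shape field is used purely as a filter.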