I know I'm late to this thread, but "reverse geocoding" caught my
attention.  I recently did this on a public project with Solr, which you
may find of interest:
https://github.com/cga-harvard/hhypermap-bop/tree/master/enrich/solr-geo-admin
I'm super pleased with the performance.

~ David

On Wed, May 17, 2017 at 10:59 PM Tom Hirschfeld <tomhirschf...@gmail.com>
wrote:

> Hey!
>
> I am working on a Lucene-based service for reverse geocoding. We have a
> large index with lots of unique terms (550 million) and it appears that
> we're running into memory issues on our leaf servers, as the term
> dictionary for the entire index is being loaded into heap space. If we
> allocate > 65g heap space, our queries return relatively quickly (10s-100s
> of ms), but if we drop below ~65g heap space on the leaf nodes, query
> performance degrades dramatically, quickly hitting 20+ seconds (our test
> harness drops requests at 20s).
>
> I did some research and found that in past versions of Lucene, one could
> split the loading of the terms dictionary using the 'termInfosIndexDivisor'
> option in the DirectoryReader class. That option was deprecated in Lucene
> 5.0.0
> <https://abi-laboratory.pro/java/tracker/changelog/lucene/5.0.0/log.html>
> in favor of using codecs to achieve similar functionality. Looking at the
> available experimental codecs, I see the BlockTreeTermsWriter
> <https://lucene.apache.org/core/5_3_1/core/org/apache/lucene/codecs/blocktree/BlockTreeTermsWriter.html#BlockTreeTermsWriter(org.apache.lucene.index.SegmentWriteState,%20org.apache.lucene.codecs.PostingsWriterBase,%20int,%20int)>
> that seems like it could be used for a similar purpose, breaking down the
> term dictionary so that we don't load the whole thing into heap space.
>
> Has anyone run into this problem before and found an effective solution?
> Does changing the codec used seem appropriate for this issue? If so, how do
> I go about loading an alternative codec and configuring it to my needs?
> I'm having trouble finding docs/examples of how this is used in the real
> world, so even if you point me to a repo or docs somewhere, I'd appreciate
> it.
> Thanks!
>
> Best,
> Tom Hirschfeld
>
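
On the term-dictionary question quoted above: for anyone following along,
here is a rough sketch of the trade-off that an index divisor controls.
It is NOT Lucene code and does not use any Lucene API. The class name
SparseTermIndex and its methods are made up for illustration, and a sorted
in-memory array stands in for the on-disk term dictionary. Only every
divisor-th term is kept in the "heap" index; a lookup binary-searches that
sample and then scans at most one block of the full dictionary.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative only: a sparse in-memory index over a sorted term list. */
final class SparseTermIndex {
    // Stand-in for the full, sorted term dictionary (on disk in a real system).
    private final String[] sortedTerms;
    // Every divisor-th term; this is the only part assumed to live in heap.
    private final List<String> sampled = new ArrayList<>();
    private final int divisor;

    SparseTermIndex(String[] terms, int divisor) {
        if (divisor < 1) {
            throw new IllegalArgumentException("divisor must be >= 1");
        }
        this.sortedTerms = terms.clone();
        Arrays.sort(this.sortedTerms);
        this.divisor = divisor;
        for (int i = 0; i < sortedTerms.length; i += divisor) {
            sampled.add(sortedTerms[i]);
        }
    }

    /** Number of terms held in the in-memory sample. */
    int heapEntries() {
        return sampled.size();
    }

    /** Returns the term's position in the sorted dictionary, or -1 if absent. */
    int lookup(String term) {
        // Binary search for the last sampled term that is <= the target.
        int lo = 0, hi = sampled.size() - 1, block = -1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            if (sampled.get(mid).compareTo(term) <= 0) {
                block = mid;
                lo = mid + 1;
            } else {
                hi = mid - 1;
            }
        }
        if (block < 0) {
            return -1; // target sorts before every term in the dictionary
        }
        // Scan at most `divisor` entries of the full dictionary.
        int start = block * divisor;
        int end = Math.min(start + divisor, sortedTerms.length);
        for (int i = start; i < end; i++) {
            if (sortedTerms[i].equals(term)) {
                return i;
            }
        }
        return -1;
    }
}
```

A larger divisor means fewer entries in heap but a longer scan per lookup,
which is the same memory-versus-latency trade-off being asked about. How
closely this mirrors what a given codec does internally is an assumption
on my part.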
-- 
Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
LinkedIn: http://linkedin.com/in/davidwsmiley | Book:
http://www.solrenterprisesearchserver.com
