[ 
https://issues.apache.org/jira/browse/LUCENE-6422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14502886#comment-14502886
 ] 

David Smiley commented on LUCENE-6422:
--------------------------------------

I'll check out your patch tonight, tomorrow at the latest.  Karl/Geo3d has kept 
me busy :-)

RE naming: in both cases the current names don't seem bad relative to your 
suggestions.  "prune" is a prefix of "pruneLeafyBranches" (the current name is 
more descriptive; either way one would still need to read the javadocs to 
understand it), and SpatialTrie is synonymous with SpatialPrefixTree, given 
that Trie and PrefixTree are synonyms.  I'm +1 to renaming these as you like 
in 6.x if you think it's worth it.  There are back-compat issues with renaming 
them _now_.  Again, we agree that adding more javadocs (including the suggested 
alternative names) for clarification now would be great.  I'll create a patch 
and seek your input.

RE sandbox: It's not clear to me what is really needed/useful.  If someone 
comes along with some newfangled index/search spatial approach, it could go in 
the module and not hook into any existing interface... except a Lucene Query 
class, and something like a Lucene TokenStream/Field for indexing.

> Add StreamingQuadPrefixTree
> ---------------------------
>
>                 Key: LUCENE-6422
>                 URL: https://issues.apache.org/jira/browse/LUCENE-6422
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: modules/spatial
>    Affects Versions: 5.x
>            Reporter: Nicholas Knize
>         Attachments: LUCENE-6422.patch, LUCENE-6422.patch, 
> LUCENE-6422_with_SPT_factory_and_benchmark.patch
>
>
> To conform to Lucene's inverted index, SpatialStrategies use strings to 
> represent QuadCells and GeoHash cells, yielding 1 byte per QuadCell and 5 
> bits per GeoHash cell, respectively.  To create the terms representing a 
> Shape, the BytesRefIteratorTokenStream first builds all of the terms into an 
> ArrayList of Cells in memory, then passes the ArrayList.Iterator back to 
> invert() which creates a second lexicographically sorted array of Terms. This 
> doubles the memory consumption when indexing a shape.
> This task introduces a PackedQuadPrefixTree that uses a StreamingStrategy to 
> accomplish the following:
> 1.  Create a packed 8-byte representation for a QuadCell
> 2.  Build the Packed cells 'on demand' when incrementToken is called
> Further improvements on this approach could include generating the packed 
> cells using an AutoPrefixAutomaton.
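
The two goals in the description above (an 8-byte packed cell, and cells built 
only when the consumer asks for the next one) can be illustrated with a small 
standalone sketch. This is not the code from the attached patch: the class 
name PackedQuadSketch, its methods, and the bit layout (2 bits per level in 
the high bits, the level number in the low 6 bits) are assumptions made for 
illustration, and nothing here uses Lucene's API.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical sketch: packs a quad-tree cell path into one 64-bit long and
// yields one packed cell per call instead of building a list of cells first.
final class PackedQuadSketch {
    // Assumed layout: 29 levels * 2 bits = 58 high bits, low 6 bits = level.
    public static final int MAX_LEVELS = 29;

    /** Packs the first {@code level} quadrant ids (each 0..3) into a long. */
    public static long pack(int[] quadrants, int level) {
        if (level < 0 || level > MAX_LEVELS || level > quadrants.length) {
            throw new IllegalArgumentException("bad level: " + level);
        }
        long packed = 0L;
        for (int i = 0; i < level; i++) {
            int q = quadrants[i];
            if (q < 0 || q > 3) {
                throw new IllegalArgumentException("bad quadrant: " + q);
            }
            // The first level occupies bits 63..62, the next 61..60, and so on.
            packed |= ((long) q) << (62 - 2 * i);
        }
        return packed | level;
    }

    /** Reads the level back from the low 6 bits. */
    public static int level(long packed) {
        return (int) (packed & 0x3F);
    }

    /** Reads the quadrant id stored for the zero-based level index i. */
    public static int quadrantAt(long packed, int i) {
        return (int) ((packed >>> (62 - 2 * i)) & 3L);
    }

    /** Lazily yields the packed cell for each level, coarsest first. */
    public static Iterator<Long> cells(final int[] quadrants) {
        final int depth = Math.min(quadrants.length, MAX_LEVELS);
        return new Iterator<Long>() {
            private int nextLevel = 1;

            @Override
            public boolean hasNext() {
                return nextLevel <= depth;
            }

            @Override
            public Long next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                // Only one cell is materialized per call.
                return pack(quadrants, nextLevel++);
            }
        };
    }
}
```

In this sketch a consumer that pulls from cells(...) holds a single long at a 
time, which is the property the streaming approach is after; how the real 
patch hooks this into incrementToken is not shown here.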



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

