[
https://issues.apache.org/jira/browse/LUCENE-6422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14498829#comment-14498829
]
Michael McCandless commented on LUCENE-6422:
--------------------------------------------
bq. It's very normal, in this open-source project anyway, that there is back &
forth & peer-review and changes that are asked of the contributor. Don't worry;
something is going to get committed – the query speed is a nice improvement!
It's not quite done – that's all.
I agree iterating/peer review is healthy, but growing the community is
also *very* important in open-source, and at least for the geo3d issue
and now this one it worries me when I see barriers being put up that I
feel should not be blocking issues for committing.
Especially when bus factor is essentially one, in an area as important
as spatial, I think encouraging contributions / growing community
becomes incredibly important. It's like when we humans intervene for
an endangered species... after having caused their predicament in
the first place ... sigh.
Of course, if there are real technical objections/problems/quality
issues for a given patch, those *should* be addressed before committing.
It's important to show new people we are eager and excited for their
contributions, that the bar is not so high for them to have an impact.
We can always review/iterate/benchmark after they are committed, as
long as net/net the patch is a step forward as (I think?) this one is.
I also wonder whether we need a new, lighter weight spatial module
(spatial2? spatial_light?), or maybe spatial_sandbox, where the
barrier is lower? The levels of abstraction in the current module
look excessive to me and with both the geo3d issue and this issue,
"correctly fitting in to the existing abstractions" seemed to be one
of the barriers (e.g. your only blocker here ("The only thing about
this patch that is a blocker (-1) for me is
StreamingPrefixTreeStrategy..") seems to be such an issue). So if we
had a freer "sandbox", the barrier would be lower by design.
> Add StreamingQuadPrefixTree
> ---------------------------
>
> Key: LUCENE-6422
> URL: https://issues.apache.org/jira/browse/LUCENE-6422
> Project: Lucene - Core
> Issue Type: Improvement
> Components: modules/spatial
> Affects Versions: 5.x
> Reporter: Nicholas Knize
> Attachments: LUCENE-6422.patch,
> LUCENE-6422_with_SPT_factory_and_benchmark.patch
>
>
> To conform to Lucene's inverted index, SpatialStrategies use strings to
> represent QuadCells and GeoHash cells, yielding 1 byte per QuadCell and 5
> bits per GeoHash cell, respectively. To create the terms representing a
> Shape, the BytesRefIteratorTokenStream first builds all of the terms into an
> ArrayList of Cells in memory, then passes the ArrayList.Iterator back to
> invert() which creates a second lexicographically sorted array of Terms. This
> doubles the memory consumption when indexing a shape.
> This task introduces a PackedQuadPrefixTree that uses a StreamingStrategy to
> accomplish the following:
> 1. Create a packed 8-byte representation for a QuadCell
> 2. Build the Packed cells 'on demand' when incrementToken is called
> Improvements over this approach include the generation of the packed cells
> using an AutoPrefixAutomaton.
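
For readers unfamiliar with the idea in the description above, here is a minimal sketch of the two steps it lists: packing a quad cell into a single 8-byte long, and producing cells only when the consumer asks for the next one. This is not the code from the attached patches. The class name PackedQuadSketch, the 29-level limit, and the bit layout (2 bits per level from the top of the long, with the level count in the low 5 bits) are all assumptions chosen for illustration.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical illustration only; names and bit layout are assumptions,
// not the layout used by the LUCENE-6422 patch.
final class PackedQuadSketch {
    // 29 levels * 2 bits = 58 bits in the high part of the long;
    // the low 5 bits store the level count (bit 5 is left unused here).
    static final int MAX_LEVELS = 29;

    /** Packs a path of quadrants (each 0..3, root first) into one long. */
    static long pack(int[] quadrants) {
        checkDepth(quadrants);
        long term = 0L;
        for (int i = 0; i < quadrants.length; i++) {
            term |= quadrantBits(quadrants[i], i);
        }
        return term | quadrants.length;
    }

    /** Number of levels stored in a packed cell. */
    static int level(long term) {
        return (int) (term & 0x1F);
    }

    /** Quadrant (0..3) at the given zero-based depth of a packed cell. */
    static int quadrantAt(long term, int depth) {
        return (int) ((term >>> (62 - 2 * depth)) & 3L);
    }

    /**
     * Lazily yields the packed cell for each prefix of the path.
     * Each cell is computed inside next(), so no list of cells is built up front.
     */
    static Iterator<Long> prefixes(final int[] quadrants) {
        checkDepth(quadrants);
        return new Iterator<Long>() {
            private int depth = 0;   // how many levels have been emitted so far
            private long bits = 0L;  // quadrant bits accumulated so far

            @Override
            public boolean hasNext() {
                return depth < quadrants.length;
            }

            @Override
            public Long next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                bits |= quadrantBits(quadrants[depth], depth);
                depth++;
                return bits | depth;
            }
        };
    }

    private static long quadrantBits(int quadrant, int depth) {
        if (quadrant < 0 || quadrant > 3) {
            throw new IllegalArgumentException("quadrant must be 0..3");
        }
        return ((long) quadrant) << (62 - 2 * depth);
    }

    private static void checkDepth(int[] quadrants) {
        if (quadrants.length > MAX_LEVELS) {
            throw new IllegalArgumentException("at most " + MAX_LEVELS + " levels");
        }
    }
}
```

In a real TokenStream the body of next() would presumably live in incrementToken; the point of the sketch is only that each term is derived from running state rather than read from a pre-built list.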