Re: Refactoring Lucene to Variable-Width DocIds

Ed Kohlwey Wed, 10 Jul 2013 18:40:29 -0700

So I think the core of the issue here is the partitioning scheme for
parallel search over index shards.

The current approach strongly favors hash partitioning and locking on a
per-shard incrementer, regardless of which codec is in use. It would be
nice to use the many Lucene libraries, particularly the searcher, with
indexing storage schemes that don't conform to this model. I understand how
others are doing things now, but I would like to discuss an improvement
that suits more use cases.
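
For anyone skimming the thread, here is a minimal sketch of the model
described above: documents are routed to a shard by hashing an external
key, and each shard hands out fixed-width int ids from its own counter.
This is an illustration only, not Lucene code. The class and method names
(ShardedIdAllocator, shardFor, allocate) are hypothetical, and an
AtomicInteger stands in for whatever locking a real indexer would use.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration, not part of Lucene.
final class ShardedIdAllocator {
    // One monotonically increasing counter per shard.
    private final AtomicInteger[] nextDocId;

    ShardedIdAllocator(int numShards) {
        nextDocId = new AtomicInteger[numShards];
        for (int i = 0; i < numShards; i++) {
            nextDocId[i] = new AtomicInteger();
        }
    }

    // Hash partitioning: the external key alone decides the shard.
    int shardFor(String externalKey) {
        return Math.floorMod(externalKey.hashCode(), nextDocId.length);
    }

    // Per-shard incrementer: ids are dense ints, unique only within a shard.
    int allocate(int shard) {
        return nextDocId[shard].getAndIncrement();
    }
}
```

Under these assumptions an id is only meaningful together with its shard
number, and it says nothing about where else the document might live,
which is why storage schemes that partition differently need a mapping
layer on top.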

What could we do to change that?

On Wed, Jul 10, 2013 at 8:48 AM, Robert Muir <[email protected]> wrote:

>
>
> On Wed, Jul 10, 2013 at 8:00 AM, Ed Kohlwey <[email protected]> wrote:
>
>> Yes Grant, that's pretty much exactly it.
>>
>>> I think what Ed is getting at is what if you threw out those
>>> assumptions and that instead the _internal_ ids were variable width (and
>>> perhaps stable?).  Could you then forgo having to do this mapping that
>>> everyone is talking about?
>>>
>>
> If you want that, add a docvalues field. That's what they are for.
>
> docid won't change, look at how it's used!
>
