P.S.
Adrien, are there any docs / references on how to implement index-time sorting
for versions prior to 6.2 and LUCENE-6766?
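
To make the locality argument concrete, here is a rough, Lucene-free toy model of what sorting within a segment is expected to achieve: once the docs of a segment are ordered by cluster, each cluster occupies one contiguous range of doc IDs. Everything in it (`IndexSortSketch`, `Doc`, `sortSegment`, `runsForCluster`, the sample data) is a hypothetical name made up for illustration, not a Lucene API. In Lucene 6.2 and later the real mechanism should be `IndexWriterConfig.setIndexSort(Sort)`; this sketch does not cover how to get the same effect on earlier versions.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Toy model only: no Lucene classes are used. "Doc" and "cluster" are
// hypothetical stand-ins for a document and its cluster (facet) value.
public class IndexSortSketch {

    static final class Doc {
        final String id;
        final String cluster;

        Doc(String id, String cluster) {
            this.id = id;
            this.cluster = cluster;
        }
    }

    // Returns a copy of the segment ordered by cluster. The position in the
    // returned list plays the role of the per-segment doc ID.
    static List<Doc> sortSegment(List<Doc> segment) {
        List<Doc> sorted = new ArrayList<>(segment);
        // List.sort is stable, so docs of one cluster keep their relative order.
        sorted.sort(Comparator.comparing((Doc d) -> d.cluster));
        return sorted;
    }

    // Counts how many separate contiguous runs of doc IDs hold the given
    // cluster. Fewer runs means better locality when reading that cluster.
    static int runsForCluster(List<Doc> segment, String cluster) {
        int runs = 0;
        boolean inRun = false;
        for (Doc d : segment) {
            boolean match = d.cluster.equals(cluster);
            if (match && !inRun) {
                runs++;
            }
            inRun = match;
        }
        return runs;
    }

    // A small segment in arrival order, with clusters interleaved.
    static List<Doc> sampleSegment() {
        List<Doc> segment = new ArrayList<>();
        segment.add(new Doc("d1", "a"));
        segment.add(new Doc("d2", "b"));
        segment.add(new Doc("d3", "a"));
        segment.add(new Doc("d4", "c"));
        segment.add(new Doc("d5", "a"));
        segment.add(new Doc("d6", "b"));
        return segment;
    }

    public static void main(String[] args) {
        List<Doc> segment = sampleSegment();
        System.out.println("runs before sorting: " + runsForCluster(segment, "a"));
        System.out.println("runs after sorting:  " + runsForCluster(sortSegment(segment), "a"));
    }
}
```

With the sample data, cluster "a" is spread over three separate runs in arrival order and collapses into a single run after sorting. Documents still end up in whatever segment they were flushed or merged into, so this only improves locality inside each segment.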

On Fri, 19 May 2017 at 12:38, Tommaso Teofili <
tommaso.teof...@gmail.com> wrote:

> Thanks Adrien, it sounds like a good suggestion, I'll try it out.
> Another approach might be to use separate per-cluster indexes, where one
> can somehow control the number of segments; however, that probably wouldn't
> scale with lots of clusters (and sounds weird too).
>
> Regards,
> Tommaso
>
>
> On Thu, 18 May 2017 at 16:54, Adrien Grand <jpou...@gmail.com>
> wrote:
>
>> You can't make documents more likely to be in the same segment; however,
>> I'm thinking you could use index sorting to make documents closer to each
>> other on a per-segment basis?
>>
>> On Thu, 18 May 2017 at 11:04, Tommaso Teofili <tommaso.teof...@gmail.com>
>> wrote:
>>
>>> Hi all,
>>>
>>> I am working on a use case where my Lucene index stores documents
>>> composed of (relatively short) text and binary values. At retrieval time I
>>> need to retrieve documents that belong to a set of cluster values (e.g.
>>> facets).
>>> In that context I was wondering if and how it'd be possible to make it
>>> more probable that documents (and associated docValues) that belong to
>>> the same cluster fall into the same segment.
>>> That would allow for higher storage locality [1] and presumably
>>> better performance (given that docs belonging to the same cluster get
>>> retrieved together most of the time in my use case).
>>> At first I looked into extending the DV format, but that's
>>> segment-agnostic, so I am thinking of coming up with a merge policy that
>>> produces segments whose docs belong to the same cluster with high
>>> probability.
>>> Any other ideas / suggestions?
>>>
>>> Regards,
>>> Tommaso
>>>
>>> [1] : https://en.wikipedia.org/wiki/Locality_of_reference
>>>
>>
