For complete clarity: "minVersion" for a SegmentInfo is the minimum of the
minVersions of all segments involved in the merge that produced this
segment. If it is a "pure" segment, then minVersion = version.
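
To make the rule concrete, here is a rough sketch, not Lucene code. SegmentStamp and MinVersionSketch are hypothetical names invented for illustration, and only major version numbers are modeled. It also encodes the second condition from the original proposal (every segment stamped with the current version while indexCreatedVersionMajor is older), assuming that condition is checked per segment as written:

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the per-segment version stamps discussed in this
// thread. This is NOT Lucene's SegmentInfo; only major versions are modeled.
final class SegmentStamp {
    final int minVersionMajor; // oldest version that contributed data to the segment
    final int versionMajor;    // version that wrote the segment

    SegmentStamp(int minVersionMajor, int versionMajor) {
        this.minVersionMajor = minVersionMajor;
        this.versionMajor = versionMajor;
    }

    // A "pure" segment: minVersion equals version.
    static SegmentStamp pure(int versionMajor) {
        return new SegmentStamp(versionMajor, versionMajor);
    }
}

final class MinVersionSketch {
    // Stamp of the segment produced by merging `inputs` with a writer of
    // major version `writerMajor`: minVersion is the min over the inputs.
    static SegmentStamp merge(List<SegmentStamp> inputs, int writerMajor) {
        int min = inputs.stream()
                .mapToInt(s -> s.minVersionMajor)
                .min()
                .orElseThrow(() -> new IllegalArgumentException("nothing to merge"));
        return new SegmentStamp(min, writerMajor);
    }

    // Assumed reading of condition 2 in the proposal: the index was created by
    // an older major version, yet every remaining segment carries only the
    // current version in both stamps. An empty list trivially passes the
    // per-segment check.
    static boolean eligibleForReset(List<SegmentStamp> segments,
                                    int indexCreatedMajor, int currentMajor) {
        return indexCreatedMajor < currentMajor
                && segments.stream().allMatch(s ->
                        s.minVersionMajor == currentMajor
                                && s.versionMajor == currentMajor);
    }

    public static void main(String[] args) {
        SegmentStamp seg1 = SegmentStamp.pure(8);
        SegmentStamp seg3 = SegmentStamp.pure(9);
        // Merging an 8.x segment with a 9.x segment under a 9.x writer.
        SegmentStamp merged = merge(Arrays.asList(seg1, seg3), 9);
        System.out.println(merged.minVersionMajor + " / " + merged.versionMajor);
        // Only seg3 remains, but the index was created in 8.x.
        System.out.println(eligibleForReset(Arrays.asList(seg3), 8, 9));
    }
}
```

In this sketch a merged segment keeps the oldest minVersion of its inputs, so merging alone never makes an index look "new"; only an index whose remaining segments are all pure current-version segments passes the check.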

On Sat, Apr 19, 2025 at 10:35 PM Rahul Goswami <rahul196...@gmail.com>
wrote:

> Ankit,
> "I guess the SegmentInfo "minVersion" is the min across all segments
> during the merge process?"
> > That is correct
>
> "I am wondering if there is any way to end up in the 2nd scenario, without
> having deleted all the documents first?"
> > Consider the following sequence of events:
> An index with 2 segments (seg1 and seg2) originally created in Lucene
> 8.x ==> upgrade to 9.x ==> index a few documents and commit ==> seg3 gets
> created with version 9.x, but a merge doesn't kick in ==> documents in seg1
> and seg2 get deleted, followed by a commit ==> you are left with only seg3
> in 9.x, but indexCreatedVersionMajor is still 8 ==> upgrade to Lucene 10.x
> fails.
>
> -Rahul
>
> On Sat, Apr 19, 2025 at 1:01 PM Ankit Jain <jain.ank...@gmail.com> wrote:
>
>> Hi Rahul,
>>
>> Thanks for starting this interesting discussion. I was initially thinking
>> that this API potentially allows upgrading "indexCreatedVersionMajor" via
>> the merge process after rewriting all the segments, but I guess the
>> SegmentInfo "minVersion" is the min across all segments during the merge
>> process?
>>
>> So, I am wondering if there is any way to end up in the 2nd scenario,
>> without having deleted all the documents first?
>>
>>
>> Thanks
>> Ankit
>>
>> On Sat, Apr 19, 2025 at 9:17 AM Rahul Goswami <rahul196...@gmail.com>
>> wrote:
>>
>>> Hello,
>>> Today even after all documents in an index are deleted via an API call,
>>> reindexing still doesn't change the "indexCreatedVersionMajor" property
>>> value in SegmentInfos. Hence even after complete reindexing, an upgrade
>>> path X --> X+1 --> X+2 is still not possible as we end up with an
>>> IndexFormatTooOldException.
>>>
>>> Requesting an API (on IndexWriter?) which can reset this property (upon
>>> a new commit) to the current Lucene version if:
>>> 1) No more live docs present
>>> OR
>>> 2) All SegmentInfo in the index have a "minVersion" AND "version"
>>> stamp of the latest version, but SegmentInfos has an older
>>> "indexCreatedVersionMajor".
>>>
>>> This will help users a LOT since they can now interact with the index
>>> purely via API without needing manual deletion and also help open up a
>>> legitimate path to upgrade when an index doesn't HAVE to be repopulated
>>> from the source.
>>>
>>> If there is agreement, I am happy to pick this up and submit a PR.
>>>
>>> Thanks,
>>> Rahul Goswami
>>>
>>>
>>>
