Hi all.

Some of our indexes out in the wild were created on 2.x. We're about
to try updating Lucene to 5.x, so we have to upgrade them to at least
4.x first.

Firstly, has anyone already put together a tool to do this? I see
several people asking similar questions on the mailing list and figure
that at least one of them should have succeeded by now.

Secondly, assuming the worst, is there good information around on how
to detect what version an index is already at? I tried doing it with
the public API in 3.6.2 and couldn't figure out how to distinguish a
3.x index from a 2.x one.

Doing it based on the files seems like it would be something like:

    - If segments.gen is missing, panic? (or list all segments_N files
      to double-check? or assume it's older than 2.0?)
    - Read segments.gen to figure out which segments_N to read
    - Read segments_N, first 4 bytes contain the format as an int
        - when it's 0x3fd76c17
            - read the next string (discard it?)
            - read the next int containing the actual format version
                - If the value is 0..3, then it's Lucene 4.x
                - If the value is >= 4, then it's 5.x
                - Any other value is presumably a newer format
        - when it's some other positive number, panic? Supposedly
          positive numbers were an older format, but 0x3fd76c17 is also
          a positive number.
        - If the int is negative
            - If it's >= -8, then it's Lucene 2.x
            - If it's <= -9, then it's Lucene 3.x
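
To make the steps concrete, here is a rough Java sketch of the
classification part only (it starts from the bytes of segments_N and
does not read segments.gen). It has no Lucene dependency. The class and
method names (IndexFormatSniffer, classify, classifyBytes) are made up
for illustration, the version thresholds are just my guesses from the
list above and are unverified, and it assumes that ints are big-endian
and that the string is stored as a VInt length followed by that many
bytes.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical helper, not part of Lucene.
final class IndexFormatSniffer {
    // Magic number from the list above.
    static final int CODEC_MAGIC = 0x3fd76c17;

    // Classifies a segments_N stream using the thresholds guessed above.
    static String classify(DataInputStream in) throws IOException {
        int first = in.readInt(); // assumption: big-endian int
        if (first == CODEC_MAGIC) {
            skipString(in);       // the string after the magic, discarded
            int version = in.readInt();
            if (version >= 0 && version <= 3) return "4.x";
            if (version >= 4) return "5.x";
            return "unknown";
        }
        if (first >= 0) return "unknown"; // other positive value: panic?
        return first >= -8 ? "2.x" : "3.x";
    }

    // Convenience wrapper for in-memory bytes; wraps checked exceptions.
    static String classifyBytes(byte[] bytes) {
        try (DataInputStream in =
                 new DataInputStream(new ByteArrayInputStream(bytes))) {
            return classify(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Assumption: a string is a VInt byte length followed by the bytes.
    private static void skipString(DataInputStream in) throws IOException {
        int length = readVInt(in);
        in.readFully(new byte[length]);
    }

    // Variable-length int: 7 bits per byte, high bit means "more follows".
    private static int readVInt(DataInputStream in) throws IOException {
        byte b = in.readByte();
        int value = b & 0x7F;
        for (int shift = 7; (b & 0x80) != 0; shift += 7) {
            b = in.readByte();
            value |= (b & 0x7F) << shift;
        }
        return value;
    }
}
```

Calling IndexFormatSniffer.classifyBytes(...) on the bytes of the
chosen segments_N file would return one of "2.x", "3.x", "4.x", "5.x"
or "unknown". Only the first few dozen bytes are needed.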

Is anything amiss with this logic?

My plan is to upgrade the indexes to 3.x and then to 4.x in separate
migration steps, at the same time as I update our own jar to 4.x. Then
I'll update our own jar to 5.x and decide whether we should migrate
the index format again or delay that until Lucene 6.

TX
