Hi Pablo,

The deafening silence is probably nobody wanting to give you the bad news.
You are on a mission that may not be feasible, and even if you can get it
to "work", the end result is unlikely to be equivalent to indexing the
original data with Lucene 9.x. The indexing process is fundamentally lossy,
and information originally used to produce non-stored fields will have been
thrown out. A simple example is stop words, or anything else analyzed
with subclasses of FilteringTokenFilter. If the stop word list changed, or
the details of one of these filters changed (bugfix?), you will end up with
a different result than indexing with 9.x. That is just one example.
Another is stemming, where the index likely contains only the
stem, not the whole word. Other folks who are more interested in the
details of our codecs than I am can probably provide further examples on a
more fundamental level. Lucene is not a database, and the source documents
should always be retained in a form that can be reindexed. If you have
inherited a system where source material has not been retained, you have a
difficult project and may have some painful expectation setting
to do.
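
To make the lossiness concrete, here is a toy sketch in plain Java. It does
not use Lucene at all: the stop word list, the stem() helper, and the
analyze() method are all made up for illustration and are far cruder than
any real analyzer. The point is only that two different source texts can
produce exactly the same terms, so the terms alone cannot tell you what the
original text was.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

class LossyAnalysisSketch {
    // Hypothetical stop word list. Real analyzers ship their own lists,
    // and those lists can differ between versions.
    static final Set<String> STOP_WORDS = Set.of("the", "a", "is", "of");

    // Toy stemmer: strips a trailing "ing" or "s". This is an assumption
    // for illustration, not the algorithm of any stemmer Lucene provides.
    static String stem(String token) {
        if (token.length() > 5 && token.endsWith("ing")) {
            return token.substring(0, token.length() - 3);
        }
        if (token.length() > 3 && token.endsWith("s")) {
            return token.substring(0, token.length() - 1);
        }
        return token;
    }

    // Lowercase, split on non-word characters, drop stop words, stem the rest.
    static List<String> analyze(String text) {
        List<String> terms = new ArrayList<>();
        for (String raw : text.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (raw.isEmpty() || STOP_WORDS.contains(raw)) {
                continue; // the stop word is gone for good
            }
            terms.add(stem(raw)); // only the stem survives
        }
        return terms;
    }

    public static void main(String[] args) {
        // Two different inputs; compare the terms each one produces.
        System.out.println(analyze("The indexing of documents"));
        System.out.println(analyze("Index a document"));
    }
}
```

Both calls in main() yield the same term list, which is why an index built
from terms like these cannot be turned back into the original documents.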

Best,
Gus



On Fri, Oct 28, 2022 at 8:01 AM Pablo Vázquez Blázquez <pabl...@gmail.com>
wrote:

> Hi all,
>
> I have some indices indexed with lucene 5.5.0. I have updated my
> dependencies and code to Lucene 7 (but my final goal is to use Lucene 9)
> and when trying to work with them I am having the exception:
> org.apache.lucene.index.IndexFormatTooOldException: Format version is not
> supported (resource
>
> BufferedChecksumIndexInput(MMapIndexInput(path=".......\tests\segments_b"))):
> this index is too old (version: 5.5.0). This version of Lucene only
> supports indexes created with release 6.0 and later.
>
> I want to migrate from Lucene 5.x to Lucene 9.x. Which is the best
> strategy? Is there any tool to migrate the indices? Is it mandatory to
> reindex? In this case, how can I deal with this when I do not have the
> sources of documents that generated my current indices (I mean, I just have
> the indices themselves)?
>
> Thanks,
>
> --
> Pablo Vázquez
> (pabl...@gmail.com)
>


-- 
http://www.needhamsoftware.com (work)
http://www.the111shift.com (play)
