Hi all,

Thank you all for your responses.

So, when upgrading to a newer major Lucene version that changes its codecs, there is no way to ensure everything keeps working properly other than re-indexing, right?

Apart from not having some of the original sources that were indexed (which I will try to work around with the *IndexUpgrader* tool), I have another problem: I was using org.apache.lucene.uninverting.UninvertingReader to run queries against the index, mainly through the grouping API. But it has been removed from Lucene (since 7.0). So, again, do I have any alternative other than re-indexing to use docValues?

To give you more context, I am a developer of a tool that multiple customers can use to index their data (currently with Lucene 5.5.5). We are planning to upgrade to Lucene 9 (because of some vulnerabilities affecting Lucene 5.5.5), and I think asking them to reindex will not go down well :(

Regards,

On Sat, 29 Oct 2022 at 23:31, Matt Davis (<kryptonics...@gmail.com>) wrote:

> Inside of Zulia search engine, the object being indexed is always a
> JSON/BSON object and we store the BSON as a stored byte field in the
> index. This allows easy internal reindexing when the searchable fields
> change but also allows us to update to the latest lucene version.
> Combined with using lucene-backward-codecs an older index than the current
> major version can be opened and reindexed. If you have stored all the
> fields (or a json/bson) in the index, it would be easy to reindex in the
> new format. If you have not, maybe opening with lucene-backward-codecs
> will be enough for your use case.
>
> Thanks,
> Matt
>
> On Sat, Oct 29, 2022 at 2:30 PM Baris Kazar <baris.ka...@oracle.com>
> wrote:
>
> > It is always great practice to retain non-indexed
> > data since when Lucene changes version,
> > even minor version, I always reindex.
> >
> > Best regards
> > ________________________________
> > From: Gus Heck <gus.h...@gmail.com>
> > Sent: Saturday, October 29, 2022 2:17 PM
> > To: java-user@lucene.apache.org <java-user@lucene.apache.org>
> > Subject: Re: Best strategy migrate indexes
> >
> > Hi Pablo,
> >
> > The deafening silence is probably nobody wanting to give you the bad
> > news. You are on a mission that may not be feasible, and even if you
> > can get it to "work", the end result won't likely be equivalent to
> > indexing the original data with Lucene 9.x. The indexing process is
> > fundamentally lossy and information originally used to produce
> > non-stored fields will have been thrown out. A simple example is
> > things like stopwords or anything analyzed with subclasses of
> > FilteringTokenFilter. If the stop word list changed, or the details of
> > one of these filters changed (bugfix?), you will end up with a
> > different result than indexing with 9.x. This is just one example,
> > another would be stemming where the index likely only contains the
> > stem, not the whole word. Other folks who are more interested in the
> > details of our codecs than I am can probably provide further examples
> > on a more fundamental level. Lucene is not a database, and the source
> > documents should always be retained in a form that can be reindexed.
> > If you have inherited a system where source material has not been
> > retained, you have a difficult project and may have some potentially
> > painful expectation setting to perform.
> >
> > Best,
> > Gus
> >
> > On Fri, Oct 28, 2022 at 8:01 AM Pablo Vázquez Blázquez
> > <pabl...@gmail.com> wrote:
> >
> > > Hi all,
> > >
> > > I have some indices indexed with lucene 5.5.0. I have updated my
> > > dependencies and code to Lucene 7 (but my final goal is to use
> > > Lucene 9) and when trying to work with them I am having the
> > > exception:
> > >
> > > org.apache.lucene.index.IndexFormatTooOldException: Format version
> > > is not supported (resource
> > > BufferedChecksumIndexInput(MMapIndexInput(path=".......\tests\segments_b"))):
> > > this index is too old (version: 5.5.0). This version of Lucene only
> > > supports indexes created with release 6.0 and later.
> > >
> > > I want to migrate from Lucene 5.x to Lucene 9.x. Which is the best
> > > strategy? Is there any tool to migrate the indices? Is it mandatory
> > > to reindex? In this case, how can I deal with this when I do not
> > > have the sources of documents that generated my current indices (I
> > > mean, I just have the indices themselves)?
> > >
> > > Thanks,
> > >
> > > --
> > > Pablo Vázquez
> > > (pabl...@gmail.com)
> >
> > --
> > https://urldefense.com/v3/__http://www.needhamsoftware.com__;!!ACWV5N9M2RV99hQ!PVR-c0gAs5FpIrnotHWeo3sEWScxV8oFJrVpGdItGZictcDbRvnp5aZSqCRhglMCYqQsewQOuio4iIYARA$
> > (work)
> > https://urldefense.com/v3/__http://www.the111shift.com__;!!ACWV5N9M2RV99hQ!PVR-c0gAs5FpIrnotHWeo3sEWScxV8oFJrVpGdItGZictcDbRvnp5aZSqCRhglMCYqQsewQOuirxfFWpEQ$
> > (play)

--
Pablo Vázquez
(pabl...@gmail.com)
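
P.S. To make Gus's point about lossy analysis concrete, here is a minimal plain-Java sketch. It does not use Lucene at all: the `LossyAnalysisDemo` class, its stop word list and the "strip a trailing s" rule are hypothetical stand-ins for a real analyzer chain (stop filter plus stemmer), chosen only for illustration. It shows two different source texts collapsing to the same indexed terms, which is why the original text cannot be rebuilt from a non-stored field.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class LossyAnalysisDemo {
    // Hypothetical stop word list, for illustration only (not Lucene's default set).
    private static final Set<String> STOP_WORDS = Set.of("the", "a", "an", "is", "are", "of");

    // Toy analyzer: lowercase, split on anything that is not a-z,
    // drop stop words, and strip a trailing "s" as a crude stand-in for stemming.
    public static List<String> analyze(String text) {
        List<String> terms = new ArrayList<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z]+")) {
            if (token.isEmpty() || STOP_WORDS.contains(token)) {
                continue; // the stop word is discarded and never reaches the index
            }
            if (token.length() > 3 && token.endsWith("s")) {
                token = token.substring(0, token.length() - 1); // only the "stem" is kept
            }
            terms.add(token);
        }
        return terms;
    }

    public static void main(String[] args) {
        List<String> first = analyze("The reports of the customers");
        List<String> second = analyze("A report of a customer");
        System.out.println(first);
        System.out.println(second);
        // Both texts yield the same terms, so the terms alone cannot tell them apart.
        System.out.println(first.equals(second));
    }
}
```

If the stop word list or the stemming rule changes between versions, re-analyzing the original text gives different terms, while an upgraded old index keeps the terms produced by the old rules.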