Hi,

> We have a Lucene 3.6-based index set which is quite large and currently in use. What will be the upgrade path to (a) 4.x or (b) 5.x? With respect to the data migration, etc. What are the steps and is it technically possible? I read that 3.x to 5.x is not possible, and throws IndexTooStale exceptions. Can we do it in two hops, like from 3.x to 4.x and 4.x to 5.x.

I would do this by re-indexing all the data with the Lucene 5.x-based application. Of course, you will need extra disk and memory space (and maybe other machines), but I think it is safer and easier than upgrading the index data in two hops.

> If I have a set of documents that have already been indexed with Lucene 3.6 and somehow we are able to upgrade to Lucene 4.x (or maybe 5.x), how can we make sure that we will get the same set of results? I am not sure, but I will check the analyzers and tokenizers used in the 3.6 versions. If we could somehow carry over those to 5.x, will we be guaranteed the same set of results? Or are there other considerations to get the same set of results?

We cannot "guarantee" the same set of results or rankings for arbitrary queries when upgrading the Lucene version. Checking all the analysis chains is a good idea. I would also check the top results for some important queries, which means:

- Select as many important queries as possible (those most frequently issued by users, or those with significant business impact on your service)
- For each query, take a diff between the top N results of the 3.6-based and 5.x-based applications (N would depend on the application or UI)
- Check the differences and make adjustments if any of them cannot be ignored

Regards,
Tomoko

2015-05-28 14:02 GMT+09:00 Sandeep Khanzode <sandeep_khanz...@yahoo.com.invalid>:

> Hi All,
> We have a Lucene 3.6-based index set which is quite large and currently in
> use. What will be the upgrade path to (a) 4.x or (b) 5.x? With respect to
> the data migration, etc. What are the steps and is it technically possible?
> I read that 3.x to 5.x is not possible, and throws IndexTooStale
> exceptions. Can we do it in two hops, like from 3.x to 4.x and 4.x to 5.x.
> If I have a set of documents that have already been indexed with Lucene
> 3.6 and somehow we are able to upgrade to Lucene 4.x (or maybe 5.x), how
> can we make sure that we will get the same set of results?
> I am not sure,
> but I will check the analyzers and tokenizers used in the 3.6 versions. If
> we could somehow carry over those to 5.x, will we be guaranteed the same
> set of results? Or are there other considerations to get the same set of
> results?
> - SRK
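
The top-N comparison suggested in the reply could be sketched roughly as follows. This is only an illustration under some assumptions: `TopNDiff`, `Diff`, `head` and `compare` are hypothetical names, not part of Lucene, and each result list is assumed to hold unique, application-level document keys (for example a stored ID field) in rank order. Lucene's internal document numbers are not suitable for this, because they change when the data is re-indexed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper (not a Lucene class): compares the top-N document keys
// returned for one query by the old (3.6) and the new (5.x) application.
class TopNDiff {

    // Differences found between the two top-N lists.
    static final class Diff {
        final List<String> missing = new ArrayList<>(); // only in the old top N
        final List<String> added = new ArrayList<>();   // only in the new top N
        final List<String> moved = new ArrayList<>();   // in both, at a different rank
    }

    // Returns the first n entries of a ranked list (or fewer if the list is shorter).
    static List<String> head(List<String> ids, int n) {
        return ids.subList(0, Math.min(Math.max(n, 0), ids.size()));
    }

    // Assumes each list contains unique keys, ordered best hit first.
    static Diff compare(List<String> oldIds, List<String> newIds, int n) {
        List<String> oldTop = head(oldIds, n);
        List<String> newTop = head(newIds, n);
        Diff d = new Diff();
        for (int i = 0; i < oldTop.size(); i++) {
            String id = oldTop.get(i);
            int j = newTop.indexOf(id);
            if (j < 0) {
                d.missing.add(id);      // dropped out of the top N after the upgrade
            } else if (j != i) {
                d.moved.add(id);        // still present, but ranked differently
            }
        }
        for (String id : newTop) {
            if (!oldTop.contains(id)) {
                d.added.add(id);        // newly entered the top N after the upgrade
            }
        }
        return d;
    }

    public static void main(String[] args) {
        // Made-up keys for illustration; in practice these would come from
        // running the same query against both applications.
        List<String> oldIds = Arrays.asList("doc1", "doc2", "doc3", "doc4");
        List<String> newIds = Arrays.asList("doc1", "doc3", "doc5", "doc2");
        Diff d = compare(oldIds, newIds, 3);
        System.out.println("missing=" + d.missing + " added=" + d.added + " moved=" + d.moved);
    }
}
```

Running such a comparison over the selected queries gives a list of queries whose top N changed, which can then be reviewed by hand. Whether a reordering inside the top N matters depends on the application, so the `moved` list may or may not be worth acting on.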