Hi,
We have a Lucene 3.6-based index set which is quite large and currently
in use. What will be the upgrade path to (a) 4.x or (b) 5.x with respect
to data migration, etc.? What are the steps, and is it technically
possible? I read that going from 3.x to 5.x directly is not possible and
throws IndexFormatTooOldException. Can we do it in two hops, i.e. from 3.x
to 4.x and then from 4.x to 5.x?
I would do this by re-indexing all the data with the Lucene 5.x-based
application.
Of course, you will need extra disk and memory space (maybe additional
machines), but I think it is safer and easier than upgrading the index
data in two hops.
If I have a set of documents that have already been indexed with Lucene
3.6 and somehow we are able to upgrade to Lucene 4.x (or maybe 5.x), how
can we make sure that we will get the same set of results? I am not sure,
but I will check the analyzers and tokenizers used in the 3.6 version. If
we could somehow carry over those to 5.x, will we be guaranteed the same
set of results? Or are there other considerations to get the same set of
results?
We cannot guarantee the same set of results or rankings for arbitrary
queries when upgrading the Lucene version.
Checking all analysis chains is a good idea. I would also check the top
results for some important queries.
That is:
- Select as many important queries as possible (those most frequently
issued by users, or those with significant business impact on your service)
- For each query, diff the top N results of the 3.6-based and 5.x-based
applications (N depends on the application or UI)
- Check and make adjustments if there are differences you cannot ignore
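
The per-query comparison in the steps above could be sketched roughly as follows. This is only a minimal illustration, not part of Lucene: the class `TopNDiff` and its methods are hypothetical names, and it assumes each application returns its hits as a list of stable, application-level document IDs (for example a stored "id" field), since Lucene's internal doc numbers change when the data is re-indexed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical helper: compares the top-N IDs two applications return for one query. */
public class TopNDiff {

    /** IDs that are in the old top N but not in the new top N, in old rank order. */
    public static List<String> missingFromNew(List<String> oldTop, List<String> newTop, int n) {
        Set<String> newIds = new HashSet<>(head(newTop, n));
        List<String> missing = new ArrayList<>();
        for (String id : head(oldTop, n)) {
            if (!newIds.contains(id)) {
                missing.add(id);
            }
        }
        return missing;
    }

    /** Fraction of the old top N that also appears in the new top N (1.0 means same set). */
    public static double overlap(List<String> oldTop, List<String> newTop, int n) {
        int oldSize = head(oldTop, n).size();
        if (oldSize == 0) {
            return 1.0; // nothing to lose if the old application returned no hits
        }
        int missing = missingFromNew(oldTop, newTop, n).size();
        return (oldSize - missing) / (double) oldSize;
    }

    /** True if the first N IDs are identical and in the same order. */
    public static boolean sameOrder(List<String> oldTop, List<String> newTop, int n) {
        return head(oldTop, n).equals(head(newTop, n));
    }

    /** First n elements of the list, or the whole list if it is shorter than n. */
    private static List<String> head(List<String> ids, int n) {
        return ids.subList(0, Math.min(n, ids.size()));
    }

    public static void main(String[] args) {
        // Made-up IDs standing in for the hits of one query in each application.
        List<String> from36 = Arrays.asList("d1", "d2", "d3", "d4");
        List<String> from5x = Arrays.asList("d2", "d1", "d5", "d3");
        int n = 3;
        System.out.println("overlap     = " + overlap(from36, from5x, n));
        System.out.println("missing     = " + missingFromNew(from36, from5x, n));
        System.out.println("same order  = " + sameOrder(from36, from5x, n));
    }
}
```

Running this over the selected queries and flagging those whose overlap falls below some threshold (the threshold itself is an application decision) gives a short list of queries to inspect by hand.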
Regards,
Tomoko
2015-05-28 14:02 GMT+09:00 Sandeep Khanzode
sandeep_khanz...@yahoo.com.invalid:
Hi All,
We have a Lucene 3.6-based index set which is quite large and currently in
use. What will be the upgrade path to (a) 4.x or (b) 5.x with respect to
data migration, etc.? What are the steps, and is it technically possible?
I read that going from 3.x to 5.x directly is not possible and throws
IndexFormatTooOldException. Can we do it in two hops, i.e. from 3.x to 4.x
and then from 4.x to 5.x?
If I have a set of documents that have already been indexed with Lucene
3.6 and somehow we are able to upgrade to Lucene 4.x (or maybe 5.x), how
can we make sure that we will get the same set of results? I am not sure,
but I will check the analyzers and tokenizers used in the 3.6 version. If
we could somehow carry over those to 5.x, will we be guaranteed the same
set of results? Or are there other considerations to get the same set of
results? - SRK