lowfreq wrote:
I have a very large Lucene index. It was created with a pre-2.1 version, Lucene.Net 2.0.0.4. The index is currently almost 20 GB and has almost 7000 segment files. My problem is that I need to optimize it, and I can't do this without the search functionality of my app being down for a week.
I used the Luke tool from getopt.org and it worked flawlessly, optimizing
the index in just over 2 hours. The problem is that my search cannot use the
result: it either fails with Unknown Format Version errors or simply finds nothing.

You should be careful when using Lucene Java to modify Lucene.Net indexes. I know for a fact that deflated data in Lucene Java is incompatible with the deflater implementation in .Net, so it's easy to create an incompatible index even when you use a supposedly compatible version of Lucene Java. Perhaps versions around 2.0 still worked ok, but no guarantees.



I understand that using a version of Lucene newer than the one the index was
built and is searched with can cause problems.
What can I do to make this work? I have tried older versions of Luke; 0.7
was the oldest I could lay hands on, but even it uses a newer version of
Lucene.

Here are links to older versions of Luke:

        http://www.getopt.org/luke/luke-0.1.zip
        http://www.getopt.org/luke/luke-0.2.zip
        http://www.getopt.org/luke/luke-0.3.zip
        http://www.getopt.org/luke/luke-0.4.zip
        http://www.getopt.org/luke/luke-0.5/luke-0.5.jar
        http://www.getopt.org/luke/luke-0.5/luke-src-0.5.zip
        http://www.getopt.org/luke/luke-0.6/lukeall-0.6.jar
        http://www.getopt.org/luke/luke-0.6/luke-src-0.6.zip



My index version shows as 633103800023469045. The version the index is
written as after optimizing with Luke 0.7 is 633103800023469057.

This is just a timestamp, so it doesn't say which version of Lucene created the index. If you open the index with Luke, the Overview tab has a line that shows the index format version.
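To illustrate the timestamp point, here is a minimal sketch that decodes the version number quoted above. It assumes the value is a .NET DateTime tick count (100-nanosecond units since 0001-01-01), which is only a guess based on its magnitude and on the index having been created by Lucene.Net. The class and method names are hypothetical, and the time zone is unknown, so the result is shown as if the ticks were UTC.

```java
import java.time.LocalDateTime;
import java.time.ZoneOffset;

public class IndexVersionTicks {
    // .NET tick count at 1970-01-01T00:00:00 (ticks are 100 ns units since 0001-01-01).
    static final long TICKS_AT_UNIX_EPOCH = 621_355_968_000_000_000L;
    static final long TICKS_PER_SECOND = 10_000_000L;

    // Assumption: the index version was seeded from a .NET tick count.
    static LocalDateTime ticksToDateTime(long ticks) {
        long sinceEpoch = ticks - TICKS_AT_UNIX_EPOCH;
        long seconds = Math.floorDiv(sinceEpoch, TICKS_PER_SECOND);
        int nanos = (int) (Math.floorMod(sinceEpoch, TICKS_PER_SECOND) * 100L);
        return LocalDateTime.ofEpochSecond(seconds, nanos, ZoneOffset.UTC);
    }

    public static void main(String[] args) {
        long before = 633103800023469045L; // version reported before optimizing
        long after = 633103800023469057L;  // version reported after optimizing
        System.out.println("decoded: " + ticksToDateTime(before));
        // The two values differ only by a small counter, not by elapsed time.
        System.out.println("difference: " + (after - before));
    }
}
```

Under that assumption the value decodes to a date in March 2007, and the two numbers differ by only 12, which fits a creation timestamp that is incremented on later changes rather than a marker of the Lucene release.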

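If Luke is not at hand, the format marker can also be read straight from the segments file. The following is a rough sketch only. It assumes the header layout I believe Lucene 2.x uses (a 4-byte big-endian format marker followed by the 8-byte version), which should be checked against the SegmentInfos source of the release in question. The class name and the file path argument are hypothetical.

```java
import java.io.DataInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.nio.ByteBuffer;

public class SegmentsHeader {
    final int format;    // negative format marker (assumed layout)
    final long version;  // the timestamp-like index version

    SegmentsHeader(int format, long version) {
        this.format = format;
        this.version = version;
    }

    // Parses the first 12 bytes of a segments file; ByteBuffer is big-endian by default.
    static SegmentsHeader parse(byte[] first12Bytes) {
        ByteBuffer buf = ByteBuffer.wrap(first12Bytes);
        return new SegmentsHeader(buf.getInt(), buf.getLong());
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java SegmentsHeader <path-to-segments-file>");
            return;
        }
        byte[] header = new byte[12];
        try (DataInputStream in = new DataInputStream(new FileInputStream(args[0]))) {
            in.readFully(header); // throws EOFException if the file is shorter than 12 bytes
        }
        SegmentsHeader h = parse(header);
        System.out.println("format=" + h.format + " version=" + h.version);
    }
}
```

Comparing the format value before and after running Luke on a copy of the index should show whether the optimize step rewrote the index in a format the older reader does not recognize.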

--
Best regards,
Andrzej Bialecki     <><
 ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com

