RE: Optimization and Corruption Issues

2009-10-28 Thread George Aroush
Sorry, I'm just catching up with my mailing list inbox, ... Andrzej Bialecki wrote: I used the Luke tool from getopt.org and it worked flawlessly, optimizing the index in just over 2 hours. The problem is that my search cannot use it, and it fails with Unknown Format Version errors, or

Re: Optimization and Corruption Issues

2009-10-01 Thread Erick Erickson
Would it work to copy your entire index to a new directory, perhaps on a different machine and optimize *there*? Then copy back to your app. Of course updates would be lost... But taking a week to optimize a 20G index seems just plain wrong. Have you tried playing with the various options to see
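The copy-then-optimize idea above can be sketched roughly as follows. This is only an illustration, not anything posted in the thread: `optimize_offline`, `swap_in`, and the `optimize` callback are hypothetical names, and the callback stands in for whatever actually runs the Lucene optimize (for example, a separate command-line tool). As noted above, any updates made to the live index while the copy is being optimized would be lost.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Callable, Optional


def optimize_offline(live_index: Path,
                     optimize: Callable[[Path], None],
                     work_root: Optional[Path] = None) -> Path:
    """Copy the live index to a scratch directory and optimize the copy.

    `optimize` is a hypothetical callback that performs the real work;
    the live index is only read, never modified, by this function.
    """
    scratch = Path(tempfile.mkdtemp(
        prefix="index-copy-",
        dir=str(work_root) if work_root is not None else None))
    copy = scratch / live_index.name
    shutil.copytree(live_index, copy)
    optimize(copy)
    return copy


def swap_in(optimized: Path, live_index: Path) -> Path:
    """Replace the live index with the optimized copy.

    The old index is kept next to it with a ".bak" suffix so the swap
    can be undone. Searchers should be stopped or reopened around this.
    """
    backup = live_index.with_name(live_index.name + ".bak")
    if backup.exists():
        shutil.rmtree(backup)
    live_index.rename(backup)
    shutil.move(str(optimized), str(live_index))
    return backup
```

Keeping the old directory as a backup is a deliberate choice here: if the application cannot read the optimized copy (for instance because of an index format mismatch), the original can be renamed back.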

Re: Optimization and Corruption Issues

2009-10-01 Thread Mark Miller
2.0 predates Mike's fabulous indexing updates, which for one thing means a single thread does the merging rather than several. I'm sure it's much slower overall. But you can't take advantage of the newer, faster code without updating Lucene in your app. Your best bet is to put it on another machine and

RE: Optimization and Corruption Issues

2009-10-01 Thread Uwe Schindler
The problem you have is that, if you optimize the index with a newer Luke version, it rewrites the index in a later Lucene file format. To read it with your current app, you also have to update your application to at least the version of Lucene that Luke uses. Uwe - Uwe Schindler

Re: Optimization and Corruption Issues

2009-10-01 Thread Andrzej Bialecki
lowfreq wrote: I have a Lucene index that is very large. It was created using a pre-2.1 version of Lucene.net, 2.0.0.4. The index is currently almost 20 GB, and has almost 7000 segment files. The problem I am having is that I need to optimize it, and can't do this without the search

Re: Optimization and Corruption Issues

2009-10-01 Thread Earwin Burrfoot
2.0 predates Mike's fabulous indexing updates, which for one thing means a single thread does the merging rather than several. I'm sure it's much slower overall. If you're doing a full optimize, you're still using a single thread. Am I wrong? -- Kirill Zakharenko/Кирилл Захаренко

Re: Optimization and Corruption Issues

2009-10-01 Thread Michael McCandless
On Thu, Oct 1, 2009 at 12:49 PM, Earwin Burrfoot ear...@gmail.com wrote: If you're doing a full optimize, you're still using a single thread. Am I wrong? Depends on how many merges are required, and on the merge scheduler. In this case (w/ 7000 segments, which is way too many, normally!),

Re: Optimization and Corruption Issues

2009-10-01 Thread Earwin Burrfoot
If you're doing a full optimize, you're still using a single thread. Am I wrong? Depends on how many merges are required, and on the merge scheduler. In this case (w/ 7000 segments, which is way too many, normally!), assuming ConcurrentMergeScheduler, multiple threads will be used since

Re: Optimization and Corruption Issues

2009-10-01 Thread lowfreq
Thank you very much for the detailed information, everyone! I will try to use the information to make my code better. I have parsed out the optimization bits into a command-line app that runs the optimize on another box. It's messy, but effective in keeping downtime to a minimum. This will get the

Re: Optimization and Corruption Issues

2009-10-01 Thread Michael McCandless
On Thu, Oct 1, 2009 at 2:56 PM, Earwin Burrfoot ear...@gmail.com wrote: If you're doing a full optimize, you're still using a single thread. Am I wrong? Depends on how many merges are required, and on the merge scheduler. In this case (w/ 7000 segments, which is way too many, normally!),