Re: Efficiently reopening remotely-distributed indexes in 2.9?

2009-10-09 Thread Nigel
Got it -- thanks, Mark! (Recently I read elsewhere in the archives of this list about the value or lack thereof of segments.gen, so skipping that file was in the back of my mind as well.) Chris On Thu, Oct 8, 2009 at 3:04 PM, Mark Miller wrote: > Nigel wrote: > > Thanks, Mark. That makes sens

Re: Efficiently reopening remotely-distributed indexes in 2.9?

2009-10-08 Thread Mark Miller
Nigel wrote: > Thanks, Mark. That makes sense. I guess if you do it in the right order, > you're guaranteed to have the files in a consistent state, since the only > thing that's actually overwritten is the segments.gen file at the end. > The main thing to do is to copy the segments_N files la
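
The copy order discussed in this exchange can be sketched roughly as follows. This is only an illustration under stated assumptions, not how Solr or rsync actually does it: the helper names copy_order and sync_index and the include_gen flag are made up for this example. It treats every file except segments.gen as write-once, copies segment data files first, then the segments_N commit files, and segments.gen last (or not at all).

```python
import os
import shutil
from pathlib import Path


def copy_order(names):
    """Sort index file names so commit files follow the data they reference.

    Hypothetical helper: rank 0 = write-once segment data files,
    rank 1 = segments_N commit files, rank 2 = segments.gen.
    """
    def rank(name):
        if name == "segments.gen":
            return 2  # the only file overwritten in place; copy last or skip
        if name.startswith("segments_"):
            return 1  # commit point; must arrive after the segment data
        return 0
    return sorted(names, key=lambda n: (rank(n), n))


def sync_index(src, dst, include_gen=True):
    """Copy files missing from dst in a safe order; return the names copied."""
    src, dst = Path(src), Path(dst)
    dst.mkdir(parents=True, exist_ok=True)
    copied = []
    for name in copy_order(p.name for p in src.iterdir() if p.is_file()):
        if name == "segments.gen":
            if not include_gen:
                continue
        elif (dst / name).exists():
            continue  # assumed write-once, so an existing copy is complete
        # Copy under a temporary name, then rename, so a reader never
        # sees a half-written file under its final name.
        tmp = dst / (name + ".tmp")
        shutil.copy2(src / name, tmp)
        os.replace(tmp, dst / name)
        copied.append(name)
    return copied
```

A reader opened on the destination directory at any point during the copy should then see either the old commit or the new one, since each segments_N file appears only after the files it references.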

Re: Efficiently reopening remotely-distributed indexes in 2.9?

2009-10-08 Thread Nigel
Thanks, Mark. That makes sense. I guess if you do it in the right order, you're guaranteed to have the files in a consistent state, since the only thing that's actually overwritten is the segments.gen file at the end. What about the technique of creating a copy of the directory with hard links a

Re: Efficiently reopening remotely-distributed indexes in 2.9?

2009-10-07 Thread Mark Miller
Solr just copies them into the same directory - Lucene files are write once, so it's not much different than what happens locally. Nigel wrote: > Right now we logically re-open an index by making an updated copy of the > index in a new directory (using rsync etc.), opening the new copy, and > closi

Re: Efficiently reopening remotely-distributed indexes in 2.9?

2009-10-07 Thread Nigel
Right now we logically re-open an index by making an updated copy of the index in a new directory (using rsync etc.), opening the new copy, and closing the old one. We don't use IndexReader.reopen() because the updated index is in a different directory (as opposed to being updated in-place). (Rea

Re: Efficiently reopening remotely-distributed indexes in 2.9?

2009-10-05 Thread Mark Miller
I keep considering a full response to this, but I just can't get over the hump and spend the time writing something up. Figured someone else would get to it - perhaps they still will. I will make a comment here though: >Before Lucene 2.9, I don't think this made any difference, as (I think) the

Re: Efficiently reopening remotely-distributed indexes in 2.9?

2009-10-05 Thread Michael Busch
On 10/5/09 5:30 PM, Nigel wrote: Before Lucene 2.9, I don't think this made any difference, as (I think) the only advantage to calling reopen vs. just creating another IndexReader was having reopen figure out whether the index had actually changed. (And we have a different way to figure that out

Re: Efficiently reopening remotely-distributed indexes in 2.9?

2009-10-05 Thread Jason Rutherglen
I'm not sure I understand the question. You're trying to reopen the segments that you've replicated and you're wondering what's changed in Lucene? On Mon, Oct 5, 2009 at 5:30 PM, Nigel wrote: > Anyone have any ideas here? I imagine a lot of other people will have a > similar question when trying

Re: Efficiently reopening remotely-distributed indexes in 2.9?

2009-10-05 Thread Nigel
Anyone have any ideas here? I imagine a lot of other people will have a similar question when trying to take advantage of the reopen improvements in 2.9. Thanks, Chris On Thu, Oct 1, 2009 at 5:15 PM, Nigel wrote: > I have a question about the reopen functionality in Lucene 2.9. As I > underst

Efficiently reopening remotely-distributed indexes in 2.9?

2009-10-01 Thread Nigel
I have a question about the reopen functionality in Lucene 2.9. As I understand it, since FieldCaches are now per-segment, it can avoid reloading everything when the index is reopened, and instead just load the new segments. For background, like many people we have a distributed architecture wher