On 9/14/06, Michael McCandless <[EMAIL PROTECTED]> wrote:
Yonik Seeley wrote:
>> > >> But, I'm still renaming segments_N.new -> segments_N,
>> > >
>> > > Hmmm, remind me why you need the .new file?  Why can't you just create
>> > > segments_N after you are finished writing all of the segments?
>> >
>> > Because there could be a reader that tries to read the file before it's
>> > done being written.  It would hit EOF and throw an IOException.
>>
>> Ahh, right... unlikely (the segments file is pretty small), but possible.
>>
>> Another alternative (since this changes the index format anyway) is to
>> put something in the segments file to detect if it's partially
>> written... something like the size of the file or the number of
>> segments.  I don't know if the extra complexity would be worth saving
>> the creation time of an extra file or not...
>
> Hey wait... the segments file already has the number of segments.
> Can't you tell if it's not yet complete?

Good point!  A reader could easily know that it's dealing with an
unfinished segments file (since the file says how many segments it
has) and then sleep/retry until the file completes, which should be a
rare event.  Note that such contention in the current Lucene (i.e., on
the commit lock) results in a 1.0 second delay and then retry.
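
A minimal sketch of that completeness check, assuming a simplified,
hypothetical file layout (an int segment count followed by one name per
segment) rather than the real Lucene segments format.  The class and
method names are made up for illustration.  A reader that gets null
back would sleep and retry.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

public class SegmentsFileCheck {

    /** Hypothetical layout: an int segment count, then one UTF name per segment. */
    public static byte[] encode(List<String> names) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeInt(names.size());
            for (String name : names) {
                out.writeUTF(name);
            }
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /**
     * Returns the segment names, or null if the data ends before all of the
     * segments promised by the leading count are present.
     */
    public static List<String> tryRead(byte[] data) {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
        try {
            int count = in.readInt();
            List<String> names = new ArrayList<>();
            for (int i = 0; i < count; i++) {
                names.add(in.readUTF());
            }
            return names;
        } catch (EOFException e) {
            // Fewer entries than the header declared: the file is unfinished.
            return null;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A count alone only catches truncation.  It cannot tell a writer that is
still busy from one that has crashed.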

Though what if the writer has crashed and so the new segments file
will never be done?  I guess the reader could fall back to the previous
segments_(N-1) file after some time, at the cost of more delay.

If it will happen so rarely, make it simpler and go directly for
segments_(N-1)... (treat it like your previous plan if segments_N.done
hadn't been written yet).
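
A rough sketch of that fallback, under the assumption that the reader
can list the available generations and has some completeness test for
each one.  The map of generation to file bytes and the isComplete
predicate are hypothetical stand-ins, not Lucene APIs.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Predicate;

public class SegmentsFallback {

    /**
     * Returns the newest generation N whose segments_N contents pass
     * isComplete, or -1 if no generation does.  No sleeping or retrying:
     * an incomplete segments_N is skipped in favor of segments_(N-1).
     */
    public static long newestCompleteGeneration(Map<Long, byte[]> files,
                                                Predicate<byte[]> isComplete) {
        TreeMap<Long, byte[]> byGeneration = new TreeMap<>(files);
        // Walk from the highest generation down to the lowest.
        for (Map.Entry<Long, byte[]> entry : byGeneration.descendingMap().entrySet()) {
            if (isComplete.test(entry.getValue())) {
                return entry.getKey();
            }
        }
        return -1L;
    }
}
```

Going straight to the older generation avoids the wait, but the reader
may briefly see a slightly stale index if the writer was just about to
finish.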

I think that approach would work, but I'm still worried about the
interaction with filesystem caching.  E.g., how much latency does the
caching add before a reader sees that this file now has some more data?

Local filesystems don't have that problem.
Remote filesystems would hopefully check for new blocks on demand (as
you try to read it).


-Yonik
http://incubator.apache.org/solr Solr, the open-source Lucene search server
