On Tue, Sep 9, 2008 at 11:42 AM, Ning Li <[EMAIL PROTECTED]> wrote:
> On Tue, Sep 9, 2008 at 10:02 AM, Yonik Seeley <[EMAIL PROTECTED]> wrote:
>> Yeah, I think the underlying RandomAccessFile might do the right
>> thing, but IndexInput isn't required to see any changes on the fly
>> (and current implementations don't) so at a minimum it would be a
>> change of IndexInput semantics.  Maybe there would need to be a
>> refresh() function added, or we would need to require a specific
>> Directory impl?
>>
>> OR, if all writes are append-only, perhaps we don't ever need to
>> invalidate the read buffer and would just need to remove the current
>> logic that caches the file length and then let the underlying
>> RandomAccessFile do the EOF checking.
>
> We cannot assume it's always RandomAccessFile, can we?

No, so it would essentially be a change in IndexInput semantics that
all Directory implementations would need to support.
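
To make the append-only idea concrete, here is a rough sketch in plain
Java (not Lucene code). It assumes a local file system and a reader
that never caches the file length, so the RandomAccessFile itself
decides where EOF is. UncachedLengthReader and demo() are hypothetical
names used only for illustration.

```java
import java.io.EOFException;
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;

class AppendOnlyReadSketch {
    /** Hypothetical reader that asks the file for EOF on every read. */
    static final class UncachedLengthReader implements AutoCloseable {
        private final RandomAccessFile raf;

        UncachedLengthReader(File f) throws IOException {
            raf = new RandomAccessFile(f, "r");
        }

        byte readByte(long pos) throws IOException {
            raf.seek(pos);
            int b = raf.read(); // returns -1 at the current end of file
            if (b < 0) {
                throw new EOFException("read past EOF at " + pos);
            }
            return (byte) b;
        }

        @Override
        public void close() throws IOException {
            raf.close();
        }
    }

    /** Appends while a reader is open and records what the reader sees. */
    static String demo() {
        try {
            File f = File.createTempFile("append", ".bin");
            f.deleteOnExit();
            try (RandomAccessFile writer = new RandomAccessFile(f, "rw");
                 UncachedLengthReader reader = new UncachedLengthReader(f)) {
                StringBuilder seen = new StringBuilder();
                writer.write(new byte[] {1, 2});
                seen.append(reader.readByte(1));
                try {
                    reader.readByte(2); // nothing appended here yet
                    seen.append(",?");
                } catch (EOFException e) {
                    seen.append(",EOF");
                }
                writer.write(new byte[] {3}); // append-only write
                seen.append(',').append(reader.readByte(2));
                return seen.toString();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The point is only that nothing in the reader has to be invalidated
when the file grows. Whether other Directory implementations can offer
the same guarantee is exactly the open question.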

> So we may have to flush after writing each document.

We would only need to flush when creating a new index view (which
could be after every added document, but doesn't have to be).
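
As a toy illustration of flushing per view rather than per document,
here is a sketch in plain Java. The Writer class, newView() and
flushCount() are made-up names, and in-memory lists stand in for the
real buffers and files.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class FlushOnViewSketch {
    /** Hypothetical writer: documents are buffered until a view is requested. */
    static final class Writer {
        private final List<String> flushed = new ArrayList<>();  // stands in for on-disk data
        private final List<String> buffered = new ArrayList<>(); // not yet flushed
        private int flushCount = 0;

        void addDocument(String doc) {
            buffered.add(doc); // no flush here
        }

        /** Flushes pending documents, then returns a point-in-time snapshot. */
        List<String> newView() {
            if (!buffered.isEmpty()) {
                flushed.addAll(buffered);
                buffered.clear();
                flushCount++;
            }
            return Collections.unmodifiableList(new ArrayList<>(flushed));
        }

        int flushCount() {
            return flushCount;
        }
    }

    static String demo() {
        Writer w = new Writer();
        w.addDocument("a");
        w.addDocument("b");
        List<String> v1 = w.newView(); // first flush covers two documents
        w.addDocument("c");
        List<String> v2 = w.newView(); // second flush covers one document
        // v1 is a snapshot, so it still holds only the first two documents
        return v1.size() + "," + v2.size() + "," + w.flushCount();
    }
}
```

Three documents cost two flushes here, and the older view keeps seeing
the state it was created with.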

> Even so,
> this may not be sufficient for some FS such as HDFS... Is it
> reasonable in this case to keep in memory everything including
> stored fields and term vectors?

Maybe we could use something like a proxy IndexInput/IndexOutput that
lets the reader update its read buffer from the writer's buffer.
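
A very rough sketch of what I mean, in plain Java rather than against
the real IndexInput/IndexOutput classes. ProxyOutput and ProxyInput
are hypothetical names, and a byte array stands in for the flushed
file. The reader is served from the flushed bytes where it can and
from the writer's pending buffer otherwise.

```java
import java.io.ByteArrayOutputStream;
import java.io.EOFException;

class ProxyBufferSketch {
    /** Hypothetical writer side: bytes stay in 'pending' until flush(). */
    static final class ProxyOutput {
        private final ByteArrayOutputStream flushed = new ByteArrayOutputStream(); // stands in for the file
        private final ByteArrayOutputStream pending = new ByteArrayOutputStream(); // writer buffer

        synchronized void writeByte(byte b) {
            pending.write(b);
        }

        synchronized void flush() {
            byte[] p = pending.toByteArray();
            flushed.write(p, 0, p.length);
            pending.reset();
        }

        synchronized long length() {
            return (long) flushed.size() + pending.size();
        }

        /** Serves a byte from the flushed data or, failing that, the writer buffer. */
        synchronized byte byteAt(long pos) throws EOFException {
            if (pos < 0 || pos >= length()) {
                throw new EOFException("read past EOF at " + pos);
            }
            // toByteArray() copies; acceptable for a sketch, not for real use
            if (pos < flushed.size()) {
                return flushed.toByteArray()[(int) pos];
            }
            return pending.toByteArray()[(int) (pos - flushed.size())];
        }
    }

    /** Hypothetical reader side that delegates to the writer's proxy. */
    static final class ProxyInput {
        private final ProxyOutput source;
        private long pos = 0;

        ProxyInput(ProxyOutput source) {
            this.source = source;
        }

        byte readByte() throws EOFException {
            byte b = source.byteAt(pos);
            pos++; // advance only after a successful read
            return b;
        }
    }

    static String demo() {
        try {
            ProxyOutput out = new ProxyOutput();
            ProxyInput in = new ProxyInput(out);
            out.writeByte((byte) 1);
            out.writeByte((byte) 2);
            StringBuilder seen = new StringBuilder();
            seen.append(in.readByte()).append(',').append(in.readByte()); // both unflushed
            out.flush();
            out.writeByte((byte) 3);
            seen.append(',').append(in.readByte()); // unflushed again
            return seen.toString();
        } catch (EOFException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Nothing has to reach the file system before the reader can see it,
which is the property that would matter for something like HDFS. The
cost is that reader and writer have to share the proxy object.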

-Yonik

