Yonik Seeley wrote:

On Tue, Sep 9, 2008 at 11:42 AM, Ning Li <[EMAIL PROTECTED]> wrote:
On Tue, Sep 9, 2008 at 10:02 AM, Yonik Seeley <[EMAIL PROTECTED]> wrote:
Yeah, I think the underlying RandomAccessFile might do the right
thing, but IndexInput isn't required to see any changes on the fly
(and current implementations don't) so at a minimum it would be a
change of IndexInput semantics.  Maybe there would need to be a
refresh() function added, or we would need to require a specific
Directory impl?

OR, if all writes are append-only, perhaps we don't ever need to
invalidate the read buffer and would just need to remove the current
logic that caches the file length and then let the underlying
RandomAccessFile do the EOF checking.
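
To make the append-only idea concrete, here is a rough sketch, not Lucene code. AppendVisibleReader is a made-up name and it skips the read buffer entirely. It assumes a local filesystem where bytes written through one RandomAccessFile are visible to another one opened on the same file, and it never caches the length, so EOF is decided by the file at read time.

```java
import java.io.EOFException;
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

/** Hypothetical reader (not a Lucene class) that never caches the file length. */
final class AppendVisibleReader implements AutoCloseable {
    private final RandomAccessFile file;
    private long position = 0;

    AppendVisibleReader(File path) throws IOException {
        this.file = new RandomAccessFile(path, "r");
    }

    /** Reads one byte; the underlying file decides EOF at the time of the call. */
    byte readByte() throws IOException {
        file.seek(position);
        int b = file.read(); // -1 means "end of file right now"
        if (b < 0) {
            throw new EOFException("read past EOF at position " + position);
        }
        position++;
        return (byte) b;
    }

    @Override
    public void close() throws IOException {
        file.close();
    }

    /** Small demo: the reader hits EOF, the writer appends, the reader continues. */
    static String demo() throws IOException {
        File tmp = File.createTempFile("append", ".bin");
        tmp.deleteOnExit();
        try (RandomAccessFile writer = new RandomAccessFile(tmp, "rw");
             AppendVisibleReader reader = new AppendVisibleReader(tmp)) {
            StringBuilder seen = new StringBuilder();
            writer.write('a');
            seen.append((char) reader.readByte());
            try {
                reader.readByte();
            } catch (EOFException e) {
                seen.append('|'); // nothing more to read yet
            }
            writer.write('b'); // append while the reader is still open
            seen.append((char) reader.readByte());
            return seen.toString();
        }
    }
}
```

A buffered version would keep whatever it has already read, since appended bytes never change earlier ones, and would only refill past the old end.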

We cannot assume it's always RandomAccessFile, can we?

No, it would essentially be a change in the semantics that all
implementations would need to support.

Right, the new semantics being that you are allowed to open an IndexInput on a file while an IndexOutput has that same file open and is still appending to it.

So we may have to flush after writing each document.

Flush when creating a new index view (which could possibly be after
every document is added, but doesn't have to be).
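
Roughly what I mean by flushing per view rather than per document, as a toy sketch. ViewFlushingWriter and its methods are hypothetical names, and an in-memory stream stands in for the real file.

```java
import java.io.ByteArrayOutputStream;

/** Hypothetical writer (not a Lucene class) that flushes only when a view is opened. */
final class ViewFlushingWriter {
    // Bytes buffered by the writer and not yet visible to readers.
    private final ByteArrayOutputStream pending = new ByteArrayOutputStream();
    // Stand-in for the file contents that readers can see.
    private final ByteArrayOutputStream flushed = new ByteArrayOutputStream();
    private int flushCount = 0;

    /** Adding a document only buffers it; no flush happens here. */
    void addDocument(byte[] doc) {
        pending.write(doc, 0, doc.length);
    }

    /** Opening a view flushes pending bytes (if any) and returns a snapshot of the visible data. */
    byte[] openView() {
        if (pending.size() > 0) {
            flushed.write(pending.toByteArray(), 0, pending.size());
            pending.reset();
            flushCount++;
        }
        return flushed.toByteArray();
    }

    int flushCount() {
        return flushCount;
    }
}
```

Adding three documents and then opening one view costs a single flush; opening a view after every document degenerates to a flush per document.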

Assuming we can make the above change to the IndexInput semantics, then we don't need to flush on opening a new RAM reader?

Even so, this may not be sufficient for some filesystems such as HDFS...
Is it reasonable in this case to keep everything in memory, including
stored fields and term vectors?

We could maybe do something like a proxy IndexInput/IndexOutput that
would allow updating the read buffer from the writer buffer.
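
One possible shape for such a proxy, purely as a sketch. SharedAppendBuffer and ProxyInput are invented names, not existing Lucene classes, and this assumes the writer only ever appends. The reader pulls newly written bytes straight from the writer's buffer, so nothing has to reach the filesystem first.

```java
import java.util.Arrays;

/** Hypothetical append-only buffer shared by one writer and any number of readers. */
final class SharedAppendBuffer {
    private byte[] data = new byte[16];
    private int length = 0;

    /** Writer side: append bytes; earlier bytes are never modified. */
    synchronized void append(byte[] src) {
        if (length + src.length > data.length) {
            data = Arrays.copyOf(data, Math.max(data.length * 2, length + src.length));
        }
        System.arraycopy(src, 0, data, length, src.length);
        length += src.length;
    }

    /** Reader side: copy of everything from pos up to the current end (empty if none). */
    synchronized byte[] copyFrom(long pos) {
        return Arrays.copyOfRange(data, (int) pos, length);
    }
}

/** Hypothetical reader proxy that refreshes itself from the writer's buffer. */
final class ProxyInput {
    private final SharedAppendBuffer shared;
    private long position = 0;

    ProxyInput(SharedAppendBuffer shared) {
        this.shared = shared;
    }

    /** Returns the bytes appended since the last call, possibly none. */
    byte[] readAvailable() {
        byte[] chunk = shared.copyFrom(position);
        position += chunk.length;
        return chunk;
    }
}
```

Because writes are append-only, bytes a reader has already consumed never need invalidating; the reader only ever asks for what lies past its current position.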

Does HDFS disallow a reader from reading a file that's still open for append?

Mike
