Hi Dawid,

Those docs are stale -- we removed random access writing a long time ago.
Please fix :)

Opening a file for read that is still open for writing is less well defined
-- it certainly happens for segments_N (we stopped writing segments.gen a
while ago), but really should not happen for any other index files, I think?
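
For illustration, here is a minimal sketch of the stricter contract under discussion: files are written once with an append-only output, and opening a file for read while its output is still open fails with an IOException. WriteOnceStore and its methods are hypothetical names for this example; this is plain JDK code and not Lucene's Directory API. The only detail taken from the thread is the "still open for writing" message text.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.file.NoSuchFileException;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical in-memory store (an assumption for this sketch, not Lucene's Directory). */
class WriteOnceStore {
  private final Map<String, byte[]> files = new HashMap<>();
  private final Set<String> openForWrite = new HashSet<>();

  /** Append-only output: no seek; the file becomes readable only after close(). */
  final class Output implements AutoCloseable {
    private final String name;
    private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    private boolean closed;

    private Output(String name) {
      this.name = name;
    }

    void writeByte(byte b) throws IOException {
      if (closed) {
        throw new IOException("already closed: " + name);
      }
      buffer.write(b);
    }

    @Override
    public void close() {
      synchronized (WriteOnceStore.this) {
        if (closed) {
          return;
        }
        closed = true;
        files.put(name, buffer.toByteArray()); // publish the finished file
        openForWrite.remove(name);
      }
    }
  }

  /** Creates a new file; a name can be written only once. */
  synchronized Output createOutput(String name) throws IOException {
    if (files.containsKey(name) || !openForWrite.add(name)) {
      throw new IOException("file already exists: " + name);
    }
    return new Output(name);
  }

  /** Returns a copy of the file's bytes, refusing files whose output is still open. */
  synchronized byte[] openInput(String name) throws IOException {
    if (openForWrite.contains(name)) {
      throw new IOException("file \"" + name + "\" is still open for writing");
    }
    byte[] data = files.get(name);
    if (data == null) {
      throw new NoSuchFileException(name);
    }
    return data.clone();
  }
}
```

With this sketch, a reader that races with a writer sees an IOException rather than partial data, and sees the complete file once the output has been closed.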

Mike McCandless

http://blog.mikemccandless.com

On Thu, Jul 19, 2018 at 5:47 AM, Dawid Weiss <[email protected]> wrote:

> While looking at the code I came across the following in the Directory
> class:
>
>  * A Directory is a flat list of files.  Files may be written once, when they
>  * are created.  Once a file is created it may only be opened for read, or
>  * deleted.  Random access is permitted both when reading and writing.
>
> What does "Random access is permitted both when reading and
> writing" mean? Specifically, IndexOutput doesn't allow seeks, and if
> "once a file is created it may only be opened for read" means "ONLY
> after a file is created may it be opened for read", then we should
> allow directory implementations in which opening a file whose
> IndexOutput is still open for writing results in an IOException...
>
> We currently make an exception to the above for "segments*" files,
> as shown in MockDirectoryWrapper:
>
>     // cannot open a file for input if it's still open for
>     // output, except for segments.gen and segments_N
>     if (!allowReadingFilesStillOpenForWrite
>         && openFilesForWrite.contains(name)
>         && !name.startsWith("segments")) {
>
> and BaseDirectoryTestCase:
>
>     try {
>       IndexInput input = dir.openInput(file, newIOContext(random()));
>       input.close();
>     } catch (FileNotFoundException | NoSuchFileException e) {
>       // ignore
>     } catch (IOException e) {
>       if (e.getMessage() != null
>           && e.getMessage().contains("still open for writing")) {
>         // ignore
>       } else {
>         throw new RuntimeException(e);
>       }
>     }
>
> (For the record, Solr's MockDirectoryFactory lifts this restriction
> entirely and allows opening files that are still being written to.)
>
> I understand SegmentInfos.finishCommit does an atomic rename (and dir
> metadata flush) from a temporary (pending) segments file to the final
> segments_X so there should be no possibility of reading or ever
> accessing a partially written (or still open for writing) segments*
> file.
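
The write-then-rename pattern described in the paragraph above can be sketched with plain JDK file APIs. This is an illustration only and not the code in SegmentInfos.finishCommit: AtomicCommitSketch, commit, and the "pending_" prefix are hypothetical names, and the sketch omits the file and directory metadata sync that a real commit performs.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical helper for this sketch; not part of Lucene. */
class AtomicCommitSketch {

  /** Writes the payload to a pending file, then renames it to its final name. */
  static Path commit(Path dir, String finalName, byte[] payload) throws IOException {
    Path pending = dir.resolve("pending_" + finalName); // assumed naming scheme
    Path target = dir.resolve(finalName);

    // The pending file is fully written and closed before the rename.
    Files.write(pending, payload);

    // ATOMIC_MOVE either renames atomically or throws
    // AtomicMoveNotSupportedException; it does not fall back to copying.
    return Files.move(pending, target, StandardCopyOption.ATOMIC_MOVE);
  }
}
```

Under these assumptions a reader listing the directory sees either no file with the final name or a complete one, which is why no open-for-write exception would be needed for such files.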
>
> Am I missing something? Are the above assumptions and exceptions
> historical leftovers that can be cleaned up, so that the contract of
> the Directory class can be clarified?
>
> Dawid
