Hi Dawid,

Those docs are stale -- we removed random access writing a long time
ago. Please fix :)

Opening a file for read that is still open for writing is less well
defined -- it certainly happens for segments_N (we stopped writing
segments.gen a while ago), but really should not happen for any other
index files, I think?

Mike McCandless

http://blog.mikemccandless.com

On Thu, Jul 19, 2018 at 5:47 AM, Dawid Weiss <[email protected]> wrote:

> While looking at the code I came across the following in the Directory
> class:
>
>  * A Directory is a flat list of files. Files may be written once, when they
>  * are created. Once a file is created it may only be opened for read, or
>  * deleted. Random access is permitted both when reading and writing.
>
> What does "Random access is permitted both when reading and
> writing" mean? Specifically, IndexOutput doesn't allow seeks, and if
> "once a file is created it may only be opened for read" means "ONLY
> after a file is created may it be opened for read", then we should
> allow directory implementations in which opening a file for read while
> an IndexOutput is still open for writing to it results in an
> IOException...
>
> We currently make an exception to the above for "segments*" files,
> as shown in MockDirectoryWrapper:
>
>   // cannot open a file for input if it's still open for
>   // output, except for segments.gen and segments_N
>   if (!allowReadingFilesStillOpenForWrite &&
>       openFilesForWrite.contains(name) && !name.startsWith("segments")) {
>
> and BaseDirectoryTestCase:
>
>   try {
>     IndexInput input = dir.openInput(file, newIOContext(random()));
>     input.close();
>   } catch (FileNotFoundException | NoSuchFileException e) {
>     // ignore
>   } catch (IOException e) {
>     if (e.getMessage() != null &&
>         e.getMessage().contains("still open for writing")) {
>       // ignore
>     } else {
>       throw new RuntimeException(e);
>     }
>   }
>
> (For the record, Solr's MockDirectoryFactory allows any file that is
> still being written to be opened.)
>
> I understand SegmentInfos.finishCommit does an atomic rename (and dir
> metadata flush) from a temporary (pending) segments file to the final
> segments_X so there should be no possibility of reading or ever
> accessing a partially written (or still open for writing) segments*
> file.
>
> Am I missing something? Are the above assumptions and exceptions a
> historical heritage that can be cleaned up and the contract of the
> Directory class clarified?
>
> Dawid
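
For reference, here is a rough sketch of the stricter contract discussed
in this thread: files are written once and append-only, a file cannot be
opened for read while it is still open for writing, and a finished file
is published under its final name by a rename. This is a toy in-memory
model, not Lucene's Directory API; the class name, the method names and
the file names used in demo() are all made up for illustration.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Toy in-memory model of a write-once directory. Hypothetical, NOT Lucene's API. */
public class WriteOnceDirectorySketch {
  private final Map<String, StringBuilder> files = new HashMap<>();
  private final Set<String> openForWrite = new HashSet<>();

  /** Creates a new file and opens it for writing; a name can be created only once. */
  public void create(String name) throws IOException {
    if (files.containsKey(name)) {
      throw new IOException("file already exists: " + name);
    }
    files.put(name, new StringBuilder());
    openForWrite.add(name);
  }

  /** Append-only write: there is no seek, hence no random-access writing. */
  public void append(String name, String data) throws IOException {
    if (!openForWrite.contains(name)) {
      throw new IOException("not open for writing: " + name);
    }
    files.get(name).append(data);
  }

  /** Closes the output; after this the file can only be read, renamed or deleted. */
  public void closeOutput(String name) {
    openForWrite.remove(name);
  }

  /** Publishes a fully written file under its final name. */
  public void rename(String source, String dest) throws IOException {
    if (!files.containsKey(source)) {
      throw new IOException("no such file: " + source);
    }
    if (openForWrite.contains(source)) {
      throw new IOException("file is still open for writing: " + source);
    }
    if (files.containsKey(dest)) {
      throw new IOException("file already exists: " + dest);
    }
    files.put(dest, files.remove(source));
  }

  /** Reads a file; rejected with no exceptions by name while the file is open for writing. */
  public String read(String name) throws IOException {
    if (!files.containsKey(name)) {
      throw new IOException("no such file: " + name);
    }
    if (openForWrite.contains(name)) {
      throw new IOException("file is still open for writing: " + name);
    }
    return files.get(name).toString();
  }

  /** Walks through a commit: write a pending file, close it, rename it, then read it. */
  public static String demo() {
    WriteOnceDirectorySketch dir = new WriteOnceDirectorySketch();
    StringBuilder log = new StringBuilder();
    try {
      dir.create("pending_segments_1");
      dir.append("pending_segments_1", "commit data");
      try {
        dir.read("pending_segments_1");
        log.append("read-while-writing allowed;");
      } catch (IOException e) {
        log.append("read-while-writing rejected;");
      }
      dir.closeOutput("pending_segments_1");
      dir.rename("pending_segments_1", "segments_1");
      log.append("read after rename: ").append(dir.read("segments_1"));
    } catch (IOException e) {
      log.append("unexpected: ").append(e.getMessage());
    }
    return log.toString();
  }

  public static void main(String[] args) {
    System.out.println(demo());
  }
}
```

In this model the final segments_1 name only ever refers to a closed,
complete file, so read() needs no special case for names starting with
"segments". Whether real Directory implementations can be held to the
same rule is exactly the open question above.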
