On Thu, May 11, 2017 at 8:02 AM, Venkateswara Rao Jujjuri <jujj...@gmail.com > wrote:
> That is correct. But we are talking about active entrylog file(s) and > journal file(s). > Do you see lot of active journal files that won't get updated but read for > long time? > I am a bit confused about this question here. When talking about durability, entry log files and journal files should be same, right? > Anyway just trying broaden the discussion and if we have any doubt if java > does fdatasync all the time. > > On Wed, May 10, 2017 at 4:57 PM, Matteo Merli <matteo.me...@gmail.com> > wrote: > >> My understanding is that fsync() forces the IO on the metadata even if >> the file size has not changed, for example to update the access timestamp. >> >> On Wed, May 10, 2017 at 4:47 PM Venkateswara Rao Jujjuri < >> jujj...@gmail.com> wrote: >> >>> Let me try to consolidate the discussion. >>> >>> 1. We need to flush the metadata whenever we cause metadata changes due >>> to data changes. Mostly Append/Filesize change. >>> 2. I don't know any operations where we perform "metadata operation" >>> only and we needed it to be "persisted" >>> Even if we can think/come up with some obscure cases of #2, majority of >>> of our use-case falls into #1. >>> So why not put force(true) all the time? What cases we may incur >>> penalty? If we pass true, then there is no debate on implementation/source. >>> >>> What do you guys say? >>> JV >>> >>> On Wed, May 10, 2017 at 4:42 PM, Sijie Guo <guosi...@gmail.com> wrote: >>> >>>> >>>> >>>> On Thu, May 11, 2017 at 2:28 AM, Charan Reddy G < >>>> reddychara...@gmail.com> wrote: >>>> >>>>> @JV and @Sijie I think Android java source is not the appropriate one >>>>> to look into. For FileChannelImpl we need to look into jdk code >>>>> >>>>> http://grepcode.com/file/repository.grepcode.com/java/root/ >>>>> jdk/openjdk/8u40-b25/sun/nio/ch/FileChannelImpl.java#FileCh >>>>> annelImpl.force%28boolean%29 >>>>> >>>>> >>>>> public void force(boolean metaData) throws IOException { >>>>> ensureOpen(); >>>>> int rv = -1; >>>>> int ti = -1; >>>>> try { >>>>> begin(); >>>>> ti = threads.add(); >>>>> if (!isOpen()) >>>>> return; >>>>> do { >>>>> rv = nd.force(fd, metaData); >>>>> } while ((rv == IOStatus.INTERRUPTED) && isOpen()); >>>>> } finally { >>>>> threads.remove(ti); >>>>> end(rv > -1); >>>>> assert IOStatus.check(rv); >>>>> } >>>>> } >>>>> >>>>> here 'nd' is of type FileDispatcherImpl. I tried to look into its code >>>>> but since it is native code, we wont be able to read that >>>>> >>>>> http://grepcode.com/file/repository.grepcode.com/java/root/ >>>>> jdk/openjdk/8u40-b25/sun/nio/ch/FileDispatcherImpl.java# >>>>> FileDispatcherImpl >>>>> >>>>> So it is not clear if it is just difference of fdatasync() and >>>>> fsync(). Even if we assume that it is the difference of fdatasync() >>>>> and fsync(), man pages of fdatasync() says >>>>> >>>>> "fdatasync() is similar to fsync(), but does not flush modified >>>>> metadata unless that metadata is needed in order to allow a subsequent >>>>> data >>>>> retrieval to be correctly handled. For example, changes to >>>>> st_atime or st_mtime (respectively, time of last access and time of last >>>>> modification; see stat(2)) do not require flushing because they are not >>>>> necessary for a subsequent data read to be handled correctly. *On >>>>> the other hand, a change to the file size (st_size, as made by say >>>>> ftruncate(2)), would require a metadata flush.**"* >>>>> >>>>> >>>> fdatasync - "does not flush modified metadata unless that metadata is >>>> needed in order to allow a subsequent data retrieval to be correctly >>>> handled" >>>> >>>> In practice, this means that unless the file size changed, the file >>>> metadata will not be synced when calling fdatasync. >>>> >>>> >>>>> Also, when we look into java API for force method - >>>>> https://docs.oracle.com/javase/7/docs/api/java/nio/channel >>>>> s/FileChannel.html#force(boolean), it says the implementation is >>>>> unspecified and system-dependent. So I don't think it is safe to make any >>>>> assumptions here. >>>>> >>>>> "The metaData parameter can be used to limit the number of I/O >>>>> operations that this method is required to perform. Passing false for >>>>> this parameter indicates that only updates to the file's content need be >>>>> written to storage; passing true indicates that updates to both the >>>>> file's content and metadata must be written, which generally requires at >>>>> least one more I/O operation. Whether this parameter actually has any >>>>> effect is dependent upon the underlying operating system and is therefore >>>>> unspecified. Invoking this method may cause an I/O operation to occur >>>>> even if the channel was only opened for reading. Some operating systems, >>>>> for example, maintain a last-access time as part of a file's metadata, and >>>>> this time is updated whenever the file is read. Whether or not this is >>>>> actually done is system-dependent and is therefore unspecified." >>>>> >>>>> Thanks, >>>>> Charan >>>>> >>>>> On Wed, May 10, 2017 at 4:07 AM, Sijie Guo <guosi...@gmail.com> wrote: >>>>> >>>>>> >>>>>> >>>>>> On May 10, 2017 12:42 PM, "Venkateswara Rao Jujjuri" < >>>>>> jujj...@gmail.com> wrote: >>>>>> >>>>>> Looked at the source and it is fdatasync() >>>>>> https://android.googlesource.com/platform/libcore/+/2496a68/ >>>>>> luni/src/main/java/java/nio/FileChannelImpl.java >>>>>> if (metadata) { >>>>>> Libcore.os.fsync(fd); >>>>>> } else { >>>>>> Libcore.os.fdatasync(fd); >>>>>> } >>>>>> >>>>>> So, agreed. fdatasync() fluesh metadata when the file size etc >>>>>> changes, and avoids metadata flush if no data related changes but >>>>>> but that tells me there may be no situation for us to call >>>>>> force(true) in the code. >>>>>> But our code calls it with true in some code path, should we change >>>>>> everything to false for consistency >>>>>> >>>>>> >>>>>> I don't remember that we called force(true) on file channel. Do you >>>>>> mean the buffered channel? >>>>>> >>>>>> >>>>>> JV >>>>>> >>>>>> On Tue, May 9, 2017 at 4:40 PM, Sijie Guo <guosi...@gmail.com> wrote: >>>>>> >>>>>>> We don't force flush metadata in bookies. Because: >>>>>>> >>>>>>> - flush metadata requires more I/O operations. >>>>>>> - flush metadata usually updates file metadata information (e.g. >>>>>>> access permissions, last modification/access time). >>>>>>> >>>>>>> The difference of force(false) and force(true) is the difference >>>>>>> between fdatasync() and fsync(). The former one update last modification >>>>>>> time, while the latter one doesn't. We don't care about the metadata >>>>>>> information like access/modification time, that's the reason we only >>>>>>> called >>>>>>> force(false). >>>>>>> >>>>>>> - Sijie >>>>>>> >>>>>>> >>>>>>> On Sat, May 6, 2017 at 9:10 AM, Charan Reddy G < >>>>>>> reddychara...@gmail.com> wrote: >>>>>>> >>>>>>>> By metadata I mean metadata of file >>>>>>>> >>>>>>>> in BufferedChannel.java >>>>>>>> >>>>>>>> public void flush(boolean shouldForceWrite) throws IOException { >>>>>>>> synchronized(this) { >>>>>>>> flushInternal(); >>>>>>>> } >>>>>>>> if (shouldForceWrite) { >>>>>>>> forceWrite(false); <------------- false here >>>>>>>> } >>>>>>>> } >>>>>>>> >>>>>>>> public long forceWrite(boolean forceMetadata) throws >>>>>>>> IOException { >>>>>>>> // This is the point up to which we had flushed to the file >>>>>>>> system page cache >>>>>>>> // before issuing this force write hence is guaranteed to >>>>>>>> be made durable by >>>>>>>> // the force write, any flush that happens after this may >>>>>>>> or may >>>>>>>> // not be flushed >>>>>>>> long positionForceWrite = writeBufferStartPosition.get(); >>>>>>>> fileChannel.force(forceMetadata); <--------------- >>>>>>>> forcemetadata false >>>>>>>> return positionForceWrite; >>>>>>>> } >>>>>>>> >>>>>>>> On Fri, May 5, 2017 at 5:56 PM, Charan Reddy G < >>>>>>>> reddychara...@gmail.com> wrote: >>>>>>>> >>>>>>>>> Hey, >>>>>>>>> >>>>>>>>> In GarbageCollectorThread.doCompactEntryLogs { >>>>>>>>> >>>>>>>>> ... >>>>>>>>> >>>>>>>>> try { >>>>>>>>> compactEntryLog(scannerFactory, meta); >>>>>>>>> scannerFactory.flush(); >>>>>>>>> <--------------- this will eventually call >>>>>>>>> entrylogger.flushCurrentLog and >>>>>>>>> it force writes the content of the BufferedChannel but not the >>>>>>>>> metadata of >>>>>>>>> the file >>>>>>>>> >>>>>>>>> LOG.info("Removing entry log {} after compaction", >>>>>>>>> meta.getEntryLogId()); >>>>>>>>> removeEntryLog(meta.getEntryLogId()); >>>>>>>>> <----------- this will delete the compacted entrylog >>>>>>>>> >>>>>>>>> } >>>>>>>>> ... >>>>>>>>> >>>>>>>>> >>>>>>>>> in doCompactEntryLogs, we first write the non-deleted entries of >>>>>>>>> the entryLog which is getting compacted to the currentLog in >>>>>>>>> entryLogger, >>>>>>>>> then we flush the entrylogger before deleting the compacted entrylog. >>>>>>>>> But >>>>>>>>> when currentLog is flushed by the entrylogger, it flushes only the >>>>>>>>> content >>>>>>>>> of the file but not the metadata. After flush is completed the >>>>>>>>> compacted >>>>>>>>> entrylog is deleted. It is not ok to not to force flush the metadata >>>>>>>>> of the >>>>>>>>> currentLog for the persisted (checkpointed) data. The filesystem >>>>>>>>> behaviour >>>>>>>>> is unexpected in this case and there is possibility of data loss if >>>>>>>>> the >>>>>>>>> Bookie crashes before closing that logchannel. >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Charan >>>>>>>>> >>>>>>>> >>>>>>>> >>>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> Jvrao >>>>>> --- >>>>>> First they ignore you, then they laugh at you, then they fight you, >>>>>> then you win. - Mahatma Gandhi >>>>>> >>>>>> >>>>>> >>>>>> >>>>> >>>> >>> >>> >>> -- >>> Jvrao >>> --- >>> First they ignore you, then they laugh at you, then they fight you, then >>> you win. - Mahatma Gandhi >>> >>> >>> > > > -- > Jvrao > --- > First they ignore you, then they laugh at you, then they fight you, then > you win. - Mahatma Gandhi > > >