My understanding is that fsync() forces the IO on the metadata even if the
file size has not changed, for example to update the access timestamp.

On Wed, May 10, 2017 at 4:47 PM Venkateswara Rao Jujjuri <jujj...@gmail.com>
wrote:

> Let me try to consolidate the discussion.
>
> 1. We need to flush the metadata whenever we cause metadata changes due to
> data changes. Mostly Append/Filesize change.
> 2. I don't know any operations where we perform "metadata operation" only
> and we needed it to be "persisted"
> Even if we can think/come up with some obscure cases of #2, majority of of
> our use-case falls into #1.
> So why not put force(true) all the time? What cases we may incur penalty?
> If we pass true, then there is no debate on implementation/source.
>
> What do you guys say?
> JV
>
> On Wed, May 10, 2017 at 4:42 PM, Sijie Guo <guosi...@gmail.com> wrote:
>
>>
>>
>> On Thu, May 11, 2017 at 2:28 AM, Charan Reddy G <reddychara...@gmail.com>
>> wrote:
>>
>>>  @JV and @Sijie I think Android java source is not the appropriate one
>>> to look into. For FileChannelImpl we need to look into jdk code
>>>
>>>
>>> http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/8u40-b25/sun/nio/ch/FileChannelImpl.java#FileChannelImpl.force%28boolean%29
>>>
>>>
>>>     public void force(boolean metaData) throws IOException {
>>>         ensureOpen();
>>>         int rv = -1;
>>>         int ti = -1;
>>>         try {
>>>             begin();
>>>             ti = threads.add();
>>>             if (!isOpen())
>>>                 return;
>>>             do {
>>>                 rv = nd.force(fd, metaData);
>>>             } while ((rv == IOStatus.INTERRUPTED) && isOpen());
>>>         } finally {
>>>             threads.remove(ti);
>>>             end(rv > -1);
>>>             assert IOStatus.check(rv);
>>>         }
>>>     }
>>>
>>> here 'nd' is of type FileDispatcherImpl. I tried to look into its code
>>> but since it is native code, we wont be able to read that
>>>
>>>
>>> http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/8u40-b25/sun/nio/ch/FileDispatcherImpl.java#FileDispatcherImpl
>>>
>>> So it is not clear if it is just difference of fdatasync() and fsync().
>>> Even if we assume that it is the difference of fdatasync() and fsync(),
>>> man pages of fdatasync() says
>>>
>>> "fdatasync() is similar to fsync(), but does not flush modified metadata
>>> unless that metadata is needed in order to allow a subsequent data
>>> retrieval to be correctly handled.   For  example, changes  to
>>>  st_atime or st_mtime (respectively, time of last access and time of last
>>> modification; see stat(2)) do not require flushing because they are not
>>> necessary for a subsequent data read to be handled correctly.  *On the
>>> other hand, a change to the file size (st_size, as made by say
>>> ftruncate(2)), would require a metadata flush.**"*
>>>
>>>
>> fdatasync - "does not flush modified metadata unless that metadata is
>> needed in order to allow a subsequent data retrieval to be correctly
>> handled"
>>
>> In practice, this means that unless the file size changed, the file
>> metadata will not be synced when calling fdatasync.
>>
>>
>>> Also, when we look into java API for force method -
>>> https://docs.oracle.com/javase/7/docs/api/java/nio/channels/FileChannel.html#force(boolean),
>>> it says the implementation is unspecified and system-dependent. So I don't
>>> think it is safe to make any assumptions here.
>>>
>>> "The metaData parameter can be used to limit the number of I/O
>>> operations that this method is required to perform. Passing false for
>>> this parameter indicates that only updates to the file's content need be
>>> written to storage; passing true indicates that updates to both the
>>> file's content and metadata must be written, which generally requires at
>>> least one more I/O operation. Whether this parameter actually has any
>>> effect is dependent upon the underlying operating system and is therefore
>>> unspecified. Invoking this method may cause an I/O operation to occur
>>> even if the channel was only opened for reading. Some operating systems,
>>> for example, maintain a last-access time as part of a file's metadata, and
>>> this time is updated whenever the file is read. Whether or not this is
>>> actually done is system-dependent and is therefore unspecified."
>>>
>>> Thanks,
>>> Charan
>>>
>>> On Wed, May 10, 2017 at 4:07 AM, Sijie Guo <guosi...@gmail.com> wrote:
>>>
>>>>
>>>>
>>>> On May 10, 2017 12:42 PM, "Venkateswara Rao Jujjuri" <jujj...@gmail.com>
>>>> wrote:
>>>>
>>>> Looked at the source and it is fdatasync()
>>>>
>>>> https://android.googlesource.com/platform/libcore/+/2496a68/luni/src/main/java/java/nio/FileChannelImpl.java
>>>> if (metadata) {
>>>>     Libcore.os.fsync(fd);
>>>> } else {
>>>>      Libcore.os.fdatasync(fd);
>>>> }
>>>>
>>>> So, agreed. fdatasync() fluesh metadata when the file size etc changes,
>>>> and avoids metadata flush if no data related changes but
>>>> but that tells me there may be no situation for us to call force(true)
>>>> in the code.
>>>> But our code calls it with true in some code path, should we change
>>>> everything to false for consistency
>>>>
>>>>
>>>> I don't remember that we called force(true) on file channel. Do you
>>>> mean the buffered channel?
>>>>
>>>>
>>>> JV
>>>>
>>>> On Tue, May 9, 2017 at 4:40 PM, Sijie Guo <guosi...@gmail.com> wrote:
>>>>
>>>>> We don't force flush metadata in bookies. Because:
>>>>>
>>>>> - flush metadata requires more I/O operations.
>>>>> - flush metadata usually updates file metadata information (e.g.
>>>>> access permissions, last modification/access time).
>>>>>
>>>>> The difference of force(false) and force(true) is the difference
>>>>> between fdatasync() and fsync(). The former one update last modification
>>>>> time, while the latter one doesn't. We don't care about the metadata
>>>>> information like access/modification time, that's the reason we only 
>>>>> called
>>>>> force(false).
>>>>>
>>>>> - Sijie
>>>>>
>>>>>
>>>>> On Sat, May 6, 2017 at 9:10 AM, Charan Reddy G <
>>>>> reddychara...@gmail.com> wrote:
>>>>>
>>>>>> By metadata I mean metadata of file
>>>>>>
>>>>>> in BufferedChannel.java
>>>>>>
>>>>>>     public void flush(boolean shouldForceWrite) throws IOException {
>>>>>>         synchronized(this) {
>>>>>>             flushInternal();
>>>>>>         }
>>>>>>         if (shouldForceWrite) {
>>>>>>             forceWrite(false);          <------------- false here
>>>>>>         }
>>>>>>     }
>>>>>>
>>>>>>     public long forceWrite(boolean forceMetadata) throws IOException {
>>>>>>         // This is the point up to which we had flushed to the file
>>>>>> system page cache
>>>>>>         // before issuing this force write hence is guaranteed to be
>>>>>> made durable by
>>>>>>         // the force write, any flush that happens after this may or
>>>>>> may
>>>>>>         // not be flushed
>>>>>>         long positionForceWrite = writeBufferStartPosition.get();
>>>>>>         fileChannel.force(forceMetadata);    <---------------
>>>>>> forcemetadata false
>>>>>>         return positionForceWrite;
>>>>>>     }
>>>>>>
>>>>>> On Fri, May 5, 2017 at 5:56 PM, Charan Reddy G <
>>>>>> reddychara...@gmail.com> wrote:
>>>>>>
>>>>>>> Hey,
>>>>>>>
>>>>>>> In GarbageCollectorThread.doCompactEntryLogs {
>>>>>>>
>>>>>>> ...
>>>>>>>
>>>>>>>             try {
>>>>>>>                 compactEntryLog(scannerFactory, meta);
>>>>>>>                 scannerFactory.flush();            <---------------
>>>>>>> this will eventually call entrylogger.flushCurrentLog and it force 
>>>>>>> writes
>>>>>>> the content of the BufferedChannel but not the metadata of the file
>>>>>>>
>>>>>>>                 LOG.info("Removing entry log {} after compaction",
>>>>>>> meta.getEntryLogId());
>>>>>>>                 removeEntryLog(meta.getEntryLogId());  <-----------
>>>>>>> this will delete the compacted entrylog
>>>>>>>
>>>>>>>             }
>>>>>>> ...
>>>>>>>
>>>>>>>
>>>>>>> in doCompactEntryLogs, we first write the non-deleted entries of the
>>>>>>> entryLog which is getting compacted to the currentLog in entryLogger, 
>>>>>>> then
>>>>>>> we flush the entrylogger before deleting the compacted entrylog. But 
>>>>>>> when
>>>>>>> currentLog is flushed by the entrylogger, it flushes only the content of
>>>>>>> the file but not the metadata. After flush is completed the compacted
>>>>>>> entrylog is deleted. It is not ok to not to force flush the metadata of 
>>>>>>> the
>>>>>>> currentLog for the persisted (checkpointed) data. The filesystem 
>>>>>>> behaviour
>>>>>>> is unexpected in this case and there is possibility of data loss if the
>>>>>>> Bookie crashes before closing that logchannel.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Charan
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> Jvrao
>>>> ---
>>>> First they ignore you, then they laugh at you, then they fight you,
>>>> then you win. - Mahatma Gandhi
>>>>
>>>>
>>>>
>>>>
>>>
>>
>
>
> --
> Jvrao
> ---
> First they ignore you, then they laugh at you, then they fight you, then
> you win. - Mahatma Gandhi
>
>
>

Reply via email to