[
https://issues.apache.org/jira/browse/DERBY-7034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16855272#comment-16855272
]
Bryan Pendleton commented on DERBY-7034:
----------------------------------------
Hi David, sorry your questions kind of fell through the cracks. Derby
development has been a bit quiet of late, and so we haven't tuned up our "new
contributor" docs in a while, but perhaps you can help us update them and fix
any rough edges.
Here are some places to start:
# https://cwiki.apache.org/confluence/display/DERBY/ForNewDevelopers
# https://cwiki.apache.org/confluence/display/DERBY/DerbyContributorChecklist
# https://cwiki.apache.org/confluence/display/DERBY/DerbyCommitProcess
In terms of how to "panic" when such an error arises, the first thing that
occurs to me is to follow the code path taken by a "Sane" (debug) build when
it hits an "Assert", and see whether that gives us reasonable crash behavior.
> Derby's sync() handling can lead to database corruption (at least on Linux)
> ---------------------------------------------------------------------------
>
> Key: DERBY-7034
> URL: https://issues.apache.org/jira/browse/DERBY-7034
> Project: Derby
> Issue Type: Bug
> Components: Store
> Affects Versions: 10.14.2.0
> Reporter: David Sitsky
> Priority: Major
>
> I recently read about "fsyncgate 2018" that the Postgres team raised:
> https://wiki.postgresql.org/wiki/Fsync_Errors.
> https://lwn.net/Articles/752063/ has a good overview of the issue relating to
> fsync() behaviour on Linux. The short summary is that on some versions of
> Linux, if you retry fsync() after it has failed, the retry will report
> success even though the dirty pages were dropped, and you end up with
> corrupted data on disk.
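> To illustrate the safe pattern in plain Java (a standalone sketch, nothing
> Derby-specific; the class and method names are made up):
> {code}
> import java.io.IOException;
> import java.nio.channels.FileChannel;
>
> final class FailFastSync
> {
>     // Call force() exactly once and propagate any failure. After a
>     // failed fsync the kernel may already have discarded the dirty
>     // pages, so a retried fsync can report success without the data
>     // ever having reached disk. The caller must assume the data is
>     // lost and recover (e.g. by replaying a log), not retry the sync.
>     static void syncOnce( FileChannel channel) throws IOException
>     {
>         channel.force( true); // true: also sync file metadata
>     }
> }
> {code}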
> From a quick glance at the Derby code, I have already seen two places where
> sync() is retried in a loop, which is clearly dangerous. There could be other
> areas too.
> In LogAccessFile:
> {code}
>     /**
>      * Guarantee all writes up to the last call to flushLogAccessFile on disk.
>      * <p>
>      * A call for clients of LogAccessFile to insure that all data written
>      * up to the last call to flushLogAccessFile() are written to disk.
>      * This call will not return until those writes have hit disk.
>      * <p>
>      * Note that this routine may block waiting for I/O to complete so
>      * callers should limit the number of resources held locked while this
>      * operation is called.
>      * Note that this routine only "writes" the data to the file, this does
>      * not mean that the data has been synced to disk. The only way to insure
>      * that is to first call switchLogBuffer() and then follow by a call of
>      * sync().
>      **/
>     public void syncLogAccessFile()
>         throws IOException, StandardException
>     {
>         for( int i=0; ; )
>         {
>             // 3311: JVM sync call sometimes fails under high load against
>             // NFS mounted disk. We re-try to do this 20 times.
>             try
>             {
>                 synchronized( this)
>                 {
>                     log.sync();
>                 }
>                 // the sync succeeded, so return
>                 break;
>             }
>             catch( SyncFailedException sfe )
>             {
>                 i++;
>                 try
>                 {
>                     // wait for .2 of a second, hopefully I/O is done by now
>                     // we wait a max of 4 seconds before we give up
>                     Thread.sleep( 200 );
>                 }
>                 catch( InterruptedException ie )
>                 {
>                     InterruptStatus.setInterrupted();
>                 }
>                 if( i > 20 )
>                     throw StandardException.newException(
>                         SQLState.LOG_FULL, sfe);
>             }
>         }
>     }
> {code}
> And LogToFile has similar retry code, but it catches the broader IOException
> rather than SyncFailedException specifically:
> {code}
>     /**
>      * Utility routine to call sync() on the input file descriptor.
>      * <p>
>      */
>     private void syncFile( StorageRandomAccessFile raf)
>         throws StandardException
>     {
>         for( int i=0; ; )
>         {
>             // 3311: JVM sync call sometimes fails under high load against
>             // NFS mounted disk. We re-try to do this 20 times.
>             try
>             {
>                 raf.sync();
>                 // the sync succeeded, so return
>                 break;
>             }
>             catch (IOException ioe)
>             {
>                 i++;
>                 try
>                 {
>                     // wait for .2 of a second, hopefully I/O is done by now
>                     // we wait a max of 4 seconds before we give up
>                     Thread.sleep(200);
>                 }
>                 catch( InterruptedException ie )
>                 {
>                     InterruptStatus.setInterrupted();
>                 }
>                 if( i > 20 )
>                 {
>                     throw StandardException.newException(
>                         SQLState.LOG_FULL, ioe);
>                 }
>             }
>         }
>     }
> {code}
> It seems Postgres, MySQL and MongoDB have already changed their code to
> "panic" if an error comes from an fsync() call.
> There is a lot more complexity in how fsync() reports errors (if at all).
> It is worth digging into this further, as I am not familiar with Derby's
> internals or with how badly it could be affected.
> Interestingly, people have indicated this issue is more likely to happen on
> network filesystems, since write failures are more common when the network
> goes down. In the past it was easy just to say "NFS is broken"... but in
> actual fact the problem was, in some cases, with fsync() and how it was
> called in a loop.
> I've been trying to find out whether Windows has similar issues, without much
> luck. But given the mysterious corruption issues I have seen in the past with
> Windows/CIFS... I do wonder if this is related somehow.