See also: https://github.com/google/renameio

On Friday, April 18, 2025 at 1:31:52 PM UTC+2 Brian Candler wrote:

> On Thursday, 17 April 2025 at 00:10:08 UTC+1 Karel Bílek wrote:
>
> I was quite surprised recently that, at least on Linux, file.Close() does 
> not guarantee file.Sync(); in some edge cases, files may not be properly 
> persisted to the filesystem.
>
> If data integrity is critical, it seems better to call file.Sync() 
> before file.Close().
>
>
> It depends what you mean by "data integrity". If you want to be reasonably 
> sure[^1] that the data has been persisted to disk *before you continue with 
> anything else*, e.g. in case the power is pulled out later, then yes, you 
> need to fsync the file [^2].
>
> However, of course, someone could pull the power plug *before* your 
> program gets to the point of calling fsync() and/or close().  Therefore, 
> it's really a question of how your application recovers from errors when it 
> next starts, and/or how it communicates with other applications. For 
> example, say that its next step is to send a reply saying "yes I got your 
> request, you don't need to worry about it any more", then maybe the 
> contract says that the other party is entitled to assume that the message 
> has been persisted safely when it receives that message. Therefore, you 
> should persist to disk before sending the reply.
>
> Theodore Ts'o explains this very nicely:
> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/317781/comments/54
> https://thunk.org/tytso/blog/2009/03/15/dont-fear-the-fsync/
>
> https://thunk.org/tytso/blog/2009/03/12/delayed-allocation-and-the-zero-length-file-problem/
>
> In particular, people were assuming that if you write and close a file 
> (without syncing), followed by an atomic rename, the filesystem would 
> guarantee that *either* the old file *or* the new file would be persisted 
> to disk. Because of delayed allocation this was a false assumption - but it 
> is so ingrained in many pieces of code that a workaround was put into ext4 
> so that people get the behaviour they "expect".
>
> Aside: sometimes all that you need for data integrity is sequencing, which 
> can be enforced by write barriers, without having to wait for things to 
> complete (since the write barrier passes down the stack even through 
> deferred writes).
>
> For example, suppose your application did the following:
> - write chunk A
> - barrier
> - write chunk B
>
> The power could be pulled out *at any point*, even in the middle of a 
> write. On restart you will have to deal with these situations:
> - corrupt or incomplete A only
> - complete A, corrupt or incomplete B
> - complete A, complete B
>
> But with a write barrier, you will never see:
> - corrupt or incomplete A, corrupt or incomplete B
> - corrupt or incomplete A, complete B
>
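
As far as I know, Go's standard library does not expose a pure write barrier, so the sketch below uses `Sync` between the two chunks as a stronger (and slower) stand-in: B is not written until A has been flushed. The framing format (length, CRC-32, payload) and the names `encodeChunk` and `readChunks` are assumptions made for this example; they show one way a program could tell "complete" from "corrupt or incomplete" chunks on restart.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"hash/crc32"
	"os"
	"path/filepath"
)

// encodeChunk frames a payload as: 4-byte length, 4-byte CRC-32, payload.
func encodeChunk(payload []byte) []byte {
	buf := make([]byte, 8, 8+len(payload))
	binary.LittleEndian.PutUint32(buf[0:4], uint32(len(payload)))
	binary.LittleEndian.PutUint32(buf[4:8], crc32.ChecksumIEEE(payload))
	return append(buf, payload...)
}

// readChunks returns the complete chunks at the start of data and stops at
// the first corrupt or incomplete one.
func readChunks(data []byte) [][]byte {
	var out [][]byte
	for len(data) >= 8 {
		n := int(binary.LittleEndian.Uint32(data[0:4]))
		sum := binary.LittleEndian.Uint32(data[4:8])
		// A zero length is treated as invalid, so a zero-filled region
		// is not mistaken for a run of empty chunks.
		if n <= 0 || n > len(data)-8 {
			break
		}
		payload := data[8 : 8+n]
		if crc32.ChecksumIEEE(payload) != sum {
			break
		}
		out = append(out, payload)
		data = data[8+n:]
	}
	return out
}

func main() {
	dir, err := os.MkdirTemp("", "barrier")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "log")
	f, err := os.Create(path)
	if err != nil {
		panic(err)
	}
	for _, chunk := range []string{"chunk A", "chunk B"} {
		if _, err := f.Write(encodeChunk([]byte(chunk))); err != nil {
			panic(err)
		}
		// Sync stands in for the barrier: the next chunk is only written
		// after this one has been flushed.
		if err := f.Sync(); err != nil {
			panic(err)
		}
	}
	if err := f.Close(); err != nil {
		panic(err)
	}

	data, err := os.ReadFile(path)
	if err != nil {
		panic(err)
	}
	fmt.Println(len(readChunks(data)), "complete chunks recovered")
}
```

Because `readChunks` stops at the first bad chunk, the recovery code only ever has to handle the three states listed above: nothing usable, A only, or A and B.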
> [^1] if the device lies, e.g. it says the data has been put in persistent 
> storage but it's only in non-battery-backed RAM, then all bets are off.
>
> [^2] it's also essential to check the return code from fsync():
> https://wiki.postgresql.org/wiki/Fsync_Errors
>

