On 2019-07-21 1:23 p.m., Paul Davis wrote:
I’m browsing on my phone but I’m pretty sure we should add an `ok =` to
this line so that we force a bad match:

https://github.com/apache/couchdb/blob/9d098787a71d1c7f7f6adea05da15b0da3ecc7ef/src/couch/src/couch_file.erl#L223

Unless I’m missing somewhere else that we’re making that assertion.

This looks good to me. I've asked Paul a question in the PR, which he answered extremely well. If you're curious about how this would play out should an entire filesystem go read-only or go missing from underneath CouchDB, I recommend reading it. (It's also on the [email protected] list if you prefer.)

+1 to the fix.

-Joan



On Sun, Jul 21, 2019 at 4:21 AM Jan Lehnardt <[email protected]> wrote:

Hey all,


Joan sent this along in IRC and it reads badly enough[tm] that we should
have at least a few pairs of eyes looking at whether we have to do anything:

     http://danluu.com/fsyncgate/

(It is long and dense; you’ll need to read 10-15% of that page to get the
main picture.)

tl;dr: on Linux, once an fsync() has reported EIO, running fsync() again
clears that error state, with no way of recovering the lost writes.

There are two ways of handling this correctly:

1. keep whatever you passed to write() between the last successful fsync()
and the fsync() that raised the error until after the second fsync(),
so you can write() it again.

2. if any one fsync() returns EIO, report this back up immediately, so
whoever calls you can retry.
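
For illustration only, option 2 could be sketched in C roughly as below. This is not code from Erlang/OTP or CouchDB; the names `sync_result` and `sync_or_report` are hypothetical. The only point it tries to make is that the wrapper hands the first fsync() failure to the caller and never calls fsync() a second time on its own.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical result type for this sketch; not from OTP or CouchDB. */
typedef enum { SYNC_OK = 0, SYNC_FAILED = -1 } sync_result;

/*
 * Flush fd to stable storage with a single fsync() call.
 *
 * On failure, copy errno into *saved_errno (if non-NULL) and return
 * SYNC_FAILED immediately. The function deliberately does not retry:
 * a second fsync() may report success even though the data that
 * triggered the first error never reached the disk.
 */
sync_result sync_or_report(int fd, int *saved_errno)
{
    if (fsync(fd) == -1) {
        if (saved_errno != NULL) {
            *saved_errno = errno;
        }
        return SYNC_FAILED;
    }
    return SYNC_OK;
}
```

A caller that gets SYNC_FAILED back would then have to treat everything written since the last successful sync as suspect, for example by rewriting it from its own copy or by failing the whole operation.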

* * *

We seem to be doing 2. as per my reading.

Erlang looks like it correctly just raises whatever error fsync() might
return:

1.
https://github.com/erlang/otp/blob/maint-r14/erts/emulator/drivers/unix/unix_efile.c#L792-L809
2.
https://github.com/erlang/otp/blob/maint-r14/erts/emulator/drivers/unix/unix_efile.c#L151-L163

couch_file too:

1.
https://github.com/apache/couchdb/blob/master/src/couch/src/couch_file.erl#L215-L223

I glanced at a few paths going up this chain and couldn’t spot a catch
where we’d hide that error, but it’d be great to get some confirmation on
this.

* * *

Please double-check my understanding of the issue, the correct ways
forward and the findings in Erlang and CouchDB.

Best
Jan
--
Professional Support for Apache CouchDB:
https://neighbourhood.ie/couchdb-support/


