Assuming Erlang doesn't lie about the return status, we'd throw an error on a failed fsync, which would kill the couch_db_updater. With delayed_commits we'd lose the last delayed-commit interval of writes, just as with any other error.
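To make "doesn't lie about the return status" a bit more concrete, here's a minimal C sketch of the pattern I have in mind: the fsync result goes through a small checker that records errno on failure, and nothing gets retried or swallowed. The names sync_error, check_result and sync_fd are made up for illustration; this is not the actual driver code, just my reading of its shape.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical stand-in for the driver's error record; NOT the real
 * structure used in unix_efile.c. */
typedef struct {
    int posix_errno;   /* errno captured at the point of failure, 0 if none */
} sync_error;

/* Returns 1 on success, 0 on failure. On failure the current errno is
 * stored so the caller can report it instead of a bare "ok". */
int check_result(int result, sync_error *err)
{
    if (result < 0) {
        err->posix_errno = errno;
        return 0;
    }
    return 1;
}

/* Flush fd to stable storage. Every failure, EINTR included, is handed
 * back to the caller; there is no retry loop here. */
int sync_fd(int fd, sync_error *err)
{
    err->posix_errno = 0;
    return check_result(fsync(fd), err);
}
```

With this shape, sync_fd(-1, &err) returns 0 and leaves EBADF in err.posix_errno. An interrupted call would surface EINTR the same way, and it's then up to the caller to crash or retry.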
The claim about the crash is based on these two lines in couch_file.erl, where we assert that the return value is ok:

https://github.com/apache/couchdb-couch/blob/master/src/couch_file.erl#L207
https://github.com/apache/couchdb-couch/blob/master/src/couch_file.erl#L312

A quick skim of unix_efile.c shows that it's passing the return of fsync to check_error, which records the errno if there was an error. So assuming that EINTR doesn't somehow crazily get mutated into an ok atom response, we're fine.

https://github.com/erlang/otp/blob/master/erts/emulator/drivers/unix/unix_efile.c#L478-L482
https://github.com/erlang/otp/blob/master/erts/emulator/drivers/unix/unix_efile.c#L94-L102

On Thu, May 21, 2015 at 3:10 PM, Jan Lehnardt <[email protected]> wrote:
>
>> On 21 May 2015, at 21:40, Alexander Shorin <[email protected]> wrote:
>>
>> I think it worth to cross post to erlang-questions@ ML. Would you?
>
> if we don’t get any further here, sure :) I just don’t want to make
> a fool of myself, should this be a simple answer and I feel more
> comfortable in this particular crowd, with the CoC and all :)
>
> Best
> Jan
> --
>
>> --
>> ,,,^..^,,,
>>
>>
>> On Thu, May 21, 2015 at 10:23 PM, Jan Lehnardt <[email protected]> wrote:
>>> Hi all,
>>>
>>> I stumbled across https://ldpreload.com/blog/signalfd-is-useless and
>>> wondered how this squares against our use of fsync().
>>>
>>> A quick glance at
>>> https://github.com/erlang/otp/blob/master/erts/emulator/drivers/unix/unix_efile.c
>>> reveals that EINTR is handled in multiple places, but only in
>>> read/write/sendfile functions, but not fsync. I also tried to trace the
>>> calling code of efile_fsync() (or efile_fdatasync()), but I got lost pretty
>>> quickly in some dtrace macro indirections, so I don’t know if there is any
>>> retry logic higher up.
>>>
>>> I’m not experienced enough here to make a call, but does that mean that we
>>> have a possible scenario where EINTR interrupts an fsync call after which a
>>> crash (machine or CouchDB) leaves part of a database not fsynced? Or would
>>> the failing fsync bubble up to the corresponding, say, PUT request handler?
>>> How about with delayed_commits=true, is the possible data-loss window then
>>> 2 seconds rather than the documented 1s?
>>>
>>> Can anyone shed any light on this?
>>>
>>> Best
>>> Jan
>>> --
>
> --
> Professional Support for Apache CouchDB:
> http://www.neighbourhood.ie/couchdb-support/
