Re: FFS sync/async/softdep mount opts clarifications&doc gap q&how stable is softdep now?

2021-03-14 Thread Joseph Mayer
On Sunday, 14 March 2021 16:31, Joseph Mayer wrote:
> Hi misc@! (Copying posters to the previous threads on this topic)

Pondering further:


5) "mount -o sync" is practically never useful, is it?

mount's default synchronicity setting is that data is written
asynchronously.

But fsync(int fd) "causes all modified data and attributes of fd to be
moved to a permanent storage device" (http://man.openbsd.org/fsync.2),
thereby serving as a synchronous checkpoint for a file's writes to disk
up to that moment, doesn't it?

(And if fsync() is called at such sync points, there is no point in
also calling sync() (http://man.openbsd.org/sync.2) on top of that,
right?)

Normal Unix software does not expect its fwrite() calls to hit disk in
sequential order; instead, transactional mechanisms use fsync() to
ensure pending writes have been flushed to disk.
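To make the checkpoint idea concrete, here is a minimal sketch in C of
a write that only reports success once fsync(2) has returned. The
function name write_durably() is made up for illustration, and the
sketch assumes plain POSIX open/write/fsync/close. It does not retry on
EINTR, and it does not address whether the directory entry of a newly
created file is on disk.

```c
#include <sys/types.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: write len bytes from buf to path and return
 * only after fsync(2) has succeeded on the file descriptor.
 * Returns 0 on success, -1 on any error.
 */
int
write_durably(const char *path, const void *buf, size_t len)
{
	const char *p = buf;
	int fd;

	fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
	if (fd == -1)
		return -1;

	/* write(2) may be short; loop until everything is handed over. */
	while (len > 0) {
		ssize_t n = write(fd, p, len);
		if (n == -1) {
			close(fd);
			return -1;
		}
		p += n;
		len -= (size_t)n;
	}

	/* The checkpoint: ask the kernel to push data and attributes out. */
	if (fsync(fd) == -1) {
		close(fd);
		return -1;
	}
	return close(fd);
}
```

A caller would treat a 0 return from, say,
write_durably("journal.dat", rec, reclen) as the point after which the
record may be relied upon, regardless of whether the file system is
mounted with or without "-o sync".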

(Whether the underlying disk actually flushes its own cache is a
separate question altogether, and varies between SSD manufacturers.)


Relative to mount's default setting, "mount -o sync" just enforces some
form of stricter ordering of writes to disk, and in practice software
does not expect such an ordering and so gains nothing from it.

Therefore, "mount -o sync" is normally not practically relevant on
either development or production machines?


(Note http://man.openbsd.org/sync.2 says the in-kernel process runs
sync() every 30 seconds, good to know.

I presume this means sync() will write any not yet written *data* to
disk where data is in asynchronous mode, while not yet written
metadata will always? sometimes? *NOT* be written where metadata is
in asynchronous mode.)



FFS sync/async/softdep mount opts clarifications&doc gap q&how stable is softdep now?

2021-03-14 Thread Joseph Mayer
Hi misc@! (Copying posters to the previous threads on this topic)

I just took the time to go through the ML archive's writeups between
now and 2015, about FFS mount options in respect of synchronicity and
especially softdep.

Here I would like to bring up four points for discussion:

 1) OpenBSD's synchronicity for FFS actually consists of two separate
sub-settings: one for metadata and one for data
 2) The -o sync/async/softdep options are a bit unintuitive
 3) Generally don't use "async"
 4) How stable is softdep now?


Here we go:

1) A clarification: OpenBSD's synchronicity for FFS actually consists
of two separate sub-settings:

 (a) Synchronicity of access to metadata (as in directory and file
 structures).

 The options here are synchronous, asynchronous, and soft
 dependencies.

 Synchronous is the default.

 (b) Synchronicity of access to data (as in file contents).

 The options here are synchronous and asynchronous.

 Asynchronous is the default.

This is not at all obvious from the mount(8) man page, however
pondering the question and reading the ML archives carefully, this
is what I see. Maybe this could be added in very brief form to
the man page.


2) The mount options in respect of synchronicity are a bit unintuitive.
Especially what is easy to miss is that the default setting is
NOT reached by any of the three major options which are called "sync",
"async" and "softdep".

What these three actually do is the following:

sync   : Make both metadata and data access synchronous.

 (That means to switch off the asynchronicity of access to
 data.)

async  : Make both metadata and data access asynchronous.

 (That means to switch off the synchronicity of access to
 metadata.)

softdep: Operate metadata according to the logic of the special
 "softdep" mode, and keep data access asynchronous.

So again for clarity, note that none of these three is the default.

Is there even any "-o" option that imitates the default?


3) Normally don't ever use async:

In the past I tried running OpenBSD with "async", and after an
unexpected system crash (power loss etc.) the file system was in
shambles and could need OS reinstallation, reinstallation of programs
and such, so I generally recommend against it.

Writing lots of files as in "tar xvfz ports.tar.gz" is considerably
sped up in async mode however. I presume what "async" does under the
hood is to postpone flushing filesystem metadata writes to disk as long
as it can.

I presume this is also why I saw such a tendency to corruption: there
did not even seem to be a built-in timer to flush the metadata to
disk. Did I get this right?

Does even fsync(2) cause an async FFS to write its metadata to disk?


A "/tmp" partition can be "async" I guess, presuming that you would
newfs(8) it on every reboot, thereby protecting your boot process
from having fsck fail.


4) How stable is softdep now?

If I got it right, softdep emerged in the era of HDDs, as a way to
reduce the number of disk seek operations when creating/modifying
many files and directories.

I understand on an SSD, softdep is performance-positive compared to
default settings though much less noticeably.

Softdep is supposed to be as reliable as synchronous metadata, but
it batches the I/O operations, leading to fewer write operations in
total. Is this why it has a performance benefit on SSDs too?

(2015 performance benefit on SSD report:
https://marc.info/?l=openbsd-misc&m=142294090200592&w=2
1.17 seconds on FFS, 0.76 seconds on softdep = 54% speedup.)
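For reference, the 54% figure follows from the ratio of the two run
times. A small sketch in C of that arithmetic, with function names
invented here for illustration; the second function shows the same two
numbers expressed as the share of run time saved, which comes out at
roughly 35%.

```c
/* Speedup of the fast run relative to the slow run, in percent. */
double
speedup_percent(double slow_seconds, double fast_seconds)
{
	return (slow_seconds / fast_seconds - 1.0) * 100.0;
}

/* Share of the slow run's time that the fast run saves, in percent. */
double
time_saved_percent(double slow_seconds, double fast_seconds)
{
	return (slow_seconds - fast_seconds) / slow_seconds * 100.0;
}
```

With the figures from the linked report, speedup_percent(1.17, 0.76)
is about 53.9 and time_saved_percent(1.17, 0.76) is about 35.0.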


Operationally softdep is supposed to be noncontroversial - it has some
RAM overhead, which should be fine on modern machines.

A system with defaults (=sync+async) and one with "softdep"
(=softdep+async) will behave essentially equivalently, with one caveat:

Nick pointed out in an old ML post that in case of power loss, with
softdep, a not-closed file that was being written to will likely be
truncated.

I saw a mention of one person in 2015 complaining of having lost data
with softdep: https://marc.info/?l=openbsd-misc&m=142174547612722&w=2


In the ML archive are some stability concerns though:

There is some mention that if you do too many file creations/
deletions, the softdep logic could "fall behind" and... the kernel
would crash - is this real!?

(2015 someone mentioned that slow disk writes can cause the fall behind
error: https://marc.info/?l=openbsd-misc&m=142193536805243&w=2 )


Also, if the underlying block device reports a write error to the
softdep logic, then the kernel could panic too - why not just report
to dmesg or such and fail in a more agreeable fashion? If this is
the failure mode, then softdep is mostly not appropriate for remote
servers, and instead it should be used for laptops/desktops only -
computers where unexpected reboots are manageable.

Has softdep been updated, or is there any sysctl available, for it to
not cause system reboots?

(An example of such a