On Tue, Nov 10, 2009 at 1:18 PM, Theo de Raadt <dera...@cvs.openbsd.org>
wrote:
>>On Tue, Nov 10, 2009 at 4:29 AM, Nick Guenther <kou...@gmail.com> wrote:
>>> So, as nicely summarized at
>>> http://www.h-online.com/open/news/item/Possible-data-loss-in-Ext4-740467.html,
>>> ext4 is kind of broken. It won't honor fsync and, as a /feature/, will
>>> wait up to two minutes to write out data, leading to lots of files
>>> emptied to the great bitbucket in the sky if the machine goes down in
>>> that period. Why is this relevant to OpenBSD? Well sometimes I've been
>>> writing a file in vi or mg and had my machine go down, and when it
>>> comes back I find that the file is empty and I'm just trying to figure
>>> out if this is just because the data wasn't fsync'd or if it's because
>>> of softdep or what.
>>
>>softdep has that effect.  The file was created and then data written.
>>But softdep cares more about the first op than the second, so there's
>>a window where crashing will cause you to wake up with empty files.
>>
>>Without softdep, it's more likely you'll have your data (though it may
>>even be the old version, and you may have to look in lost+found for
>>it).  softdep works fine with fsync, but the old unix trick of write
>>data then rename leads to empty files, because the rename is "sped up"
>>but the data isn't.
>
> There is a very simple explanation for why things are so.
>
> Actual data file loss has never been what these things were coded for.
>
> filesystem *tree and meta-data*, ie. the structure of how things are
> knit together, is the main concern.  If you lose the filesystem tree
> structure, you've lost all your files, not just the newest ones.
> Therefore the goal is safe metadata handling.  The result is you can
> lose specific data in specific (newly written to) files, but the
> structure of the filesystem is consistent enough for fsck to not damage
> it.
>
> If you want to never lose data, you have an option.  Make the filesystem
> synchronous, using the -o sync option.
>
> If you can't accept the performance hit from that, then please accept
> that all the work done over the ages is only on ensuring metadata-safety
> for a low performance penalty.  It has never been about trying to
> promise file data consistency when that could only be achieved by
> synchronous file data writing.
>

Thank you Ted and Theo for setting the record straight. I'm still a
bit confused so in the hopes of enlightening us all I'd like to keep
asking.

See, since it seems that BSD doesn't have this file-data consistency
guarantee, are Linus' worries about ext4's potential data loss just
being alarmist? It seems to me that the case described in
https://bugs.edge.launchpad.net/ubuntu/+source/linux/+bug/317781/comments/45
is just as likely to happen on OpenBSD--if I run KDE or GNOME and mess
around with my settings, then quickly murder the system, the files will
be resurrected empty, right?
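
To make the scenario concrete, here is a minimal sketch of the "write
data then rename" trick Ted mentioned, with an fsync added before the
rename (he says softdep works fine with fsync). The function name
atomic_write and the file name in the usage note are hypothetical, and I
am not claiming this is what KDE or GNOME actually do:

```python
import os
import tempfile


def atomic_write(path, data):
    """Replace the file at path with data (bytes) via a temporary file."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file in the same directory so the final
    # rename stays within one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()             # push Python's buffer to the kernel
            os.fsync(tmp.fileno())  # ask the kernel to write the data out
        # Only rename once the data has been fsync'd; without the fsync,
        # the rename may reach the disk before the data does.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temporary file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

Calling atomic_write("settings.conf", b"new settings") would then
replace the file in one step. Whether the fsync alone is enough to rule
out the empty-file case on every filesystem discussed here is exactly
what I am asking about.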

Another summary article,
http://www.h-online.com/open/news/item/Kernel-developers-squabble-over-Ext3-and-Ext4-740787.html,
says that with ext3 mounted with data=ordered, "changes to metadata only
become valid after writing the payload data". My understanding is that
the way this works is the metadata gets journalled to a scratch area
on the disk, then once the syncer gets around to actually writing the
file data (the 'payload') to some new unused location on disk, the
metadata in the journal gets written to the disk too. If the system
goes down before the payload gets written (or even after, but before
the metadata) then the old version of the file is the one still in the
filesystem. This way a file is either in its old state or new state,
never in-between. So then where would my empty file example fit in? Is
it impossible on ext3?
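
Here is a toy model of my understanding above, just to show why the
order of the two writes matters. It is not how ext3 or FFS is
implemented; the names (crash_outcomes, "old_block", "new_block") are
made up for illustration, and the "disk" is a dictionary:

```python
def crash_outcomes(old, new, data_first):
    """List what the file would contain if the machine crashed
    before the first write, between the two writes, or after both."""
    # Two data blocks on the toy disk; the new one starts out empty.
    blocks = {"old_block": old, "new_block": ""}
    # The "metadata" is just a pointer to the block the file uses.
    inode = "old_block"
    visible = [blocks[inode]]  # crash before anything is written

    # An update is two writes; the modes differ only in their order.
    steps = ["data", "metadata"] if data_first else ["metadata", "data"]
    for step in steps:
        if step == "data":
            blocks["new_block"] = new   # payload reaches the disk
        else:
            inode = "new_block"         # metadata now points at the new block
        visible.append(blocks[inode])   # what a crash right here would leave
    return visible


# Data before metadata (my reading of data=ordered): old or new, never empty.
print(crash_outcomes("old", "new", data_first=True))   # ['old', 'old', 'new']
# Metadata before data: a crash in between leaves an empty file.
print(crash_outcomes("old", "new", data_first=False))  # ['old', '', 'new']
```

If this model is right, the empty file can only appear when the
metadata is allowed to hit the disk before the payload.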

I know I'm getting off topic a bit, but I know this list is clear
enough to clean up the mud puddle. I'm trying to understand the
implementation choices of my chosen OS, so that I can defend it
to Linux zealots. This table summarizes my understanding of the
approximate equivalencies between the various ext and ffs modes.
Please, if I'm totally off, hit me:
[ext3 data= ~= FFS mode: what it ensures]
journal   ~= sync:    consistency of both metadata and file data
ordered   ~= softdep: consistency of metadata, both internally and
                      with file data
writeback ~= default: consistency of metadata internally, but real
                      file data may not agree (e.g. my empty file)
Additionally FFS has the async flag which turns off the internal
consistency of the metadata structures; I guess there's no equivalent
for this in ext?
What is the reason softdep isn't on by default?

Sorry for being long winded, but I'm thinking some people will be
tempted into geeking out with me,
-Nick

[My own personal See Also section:
ext3 from the horse's mouth
<http://olstrans.sourceforge.net/release/OLS2000-ext3/OLS2000-ext3.html>
ext3 internals in reality
<http://www.sans.org/reading_room/whitepapers/forensics/taking_advantage_of_ext3_journaling_file_system_in_a_forensic_investigation_2011>
softdep from the horse's mouth <http://www.mckusick.com/softdep/>]
