Chris Murphy <li...@colorremedies.com> wrote:

> 
> On Feb 7, 2014, at 2:07 PM, Kai Krakow <hurikhan77+bt...@gmail.com> wrote:
> 
>> Chris Murphy <li...@colorremedies.com> wrote:
>> 
>>>> If the database/virtual machine/whatever is crash safe, then the
>>>> atomic state that a snapshot grabs will be useful.
>>> 
>>> How fast is this state fixed on disk from the time of the snapshot
>>> command? Loosely speaking. I'm curious if this is < 1 second; a few
>>> seconds; or possibly up to the 30 second default commit interval? And
>>> also if it's even related to the commit interval time at all?
>> 
>> Such constructs can only be crash-safe if write-barriers are passed down
>> through the cow logic of btrfs to the storage layer. That won't probably
>> ever happen. Atomic and transactional updates cannot happen without
>> write-barriers or synchronous writes. To make it work, you need to
>> design the storage-layers from the ground up to work without
>> write-barriers, like having battery-backed write-caches, synchronous
>> logical file-system layers etc. Otherwise, database/vm/whatever
>> transactional/atomic writes are just having undefined status down at the
>> lowest storage layer.
> 
> This explanation makes sense. But I failed to qualify the "state fixed on
> disk". I'm not concerned about when bits actually arrive on disk. I'm
> wondering what state they describe. So assume no crash or power failure,
> and assume writes eventually make it onto the media without a problem.
> What I'm wondering is, what state of the subvolume I'm snapshotting do I
> end up with? Is there a delay and how long is it, or is it pretty much
> instant? The command completes really quickly even when the file system is
> actively being used, so the feedback is that the snapshot state is
> established very fast but I'm not sure what bearing that has in reality.

From that perspective, I think taking a snapshot is more or less the same
as cycling the power, at least as far as the consistency of the file is
concerned. I understand your point about "state fixed on disk", but what I
meant is that, from the perspective of the writing process, the situation
is the same: at the moment of the snapshot, the data file is in a crashed
state. That is like cycling the power without having a mechanism that
supports transactional guarantees.
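
To illustrate what "crash safe" means on the application side, here is a
minimal sketch in Python. It assumes a hypothetical append-only log format
in which every record carries its length and a CRC32; the names
encode_record and recover are made up for this example and do not come from
any real database. Recovery keeps the valid prefix and drops a torn tail,
which is what such a process would have to do with a data file found in a
snapshot if the snapshot really is equivalent to a crash.

```python
import struct
import zlib

# Hypothetical record header: payload length and CRC32 of the payload.
HEADER = struct.Struct("<II")


def encode_record(payload: bytes) -> bytes:
    """Frame a payload so that a torn write can be detected later."""
    return HEADER.pack(len(payload), zlib.crc32(payload)) + payload


def recover(log: bytes) -> list:
    """Return all intact records from the start of the log.

    Parsing stops at the first record that is truncated or whose checksum
    does not match, so a half-written tail is ignored.
    """
    records = []
    offset = 0
    while offset + HEADER.size <= len(log):
        length, crc = HEADER.unpack_from(log, offset)
        start = offset + HEADER.size
        payload = log[start:start + length]
        if len(payload) < length or zlib.crc32(payload) != crc:
            break  # torn or corrupted record: treat it as never written
        records.append(payload)
        offset = start + length
    return records
```

Note that this only detects a torn tail. It does not help if an older
record is missing while a newer one is present, which is the ordering
problem.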

So the question is: do btrfs snapshots give the same guarantees at the
filesystem level that write barriers give at the storage level, which are
exactly the guarantees those processes rely upon? The cleanest solution
would be for processes to give btrfs hints about what belongs to their
transactions, so that at the moment of a snapshot the data file would be in
a clean state. I guess snapshots are atomic in the sense that pending
writes will never reach a snapshot that has just been taken, which is good.

But what about the ordering of writes? Maybe some newer write requests have
already made it to the disk while older ones have not. The file system
usually only has to care about its own transactional integrity, not that of
its writing processes, and its integrity is completely unrelated to what
the writing process expects. In other words: after a subsequent crash, only
the active subvolume being written to is guaranteed to be clean from the
transactional perspective of the process; the snapshot may be broken. As
far as I know, user processes cannot tell the filesystem when to issue
write barriers; they can only issue fsyncs (which hurts performance).
Otherwise this discussion would be a whole different story.
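
For reference, this is roughly how a process enforces ordering with fsync
today. It is a minimal sketch in Python; ordered_commit, write_all and the
"COMMIT" marker are hypothetical names chosen for this example. It only
shows the application side and says nothing about what btrfs does when a
snapshot is taken between the two fsync calls.

```python
import os


def write_all(fd: int, data: bytes) -> None:
    """Write the whole buffer, retrying on short writes."""
    view = memoryview(data)
    while len(view) > 0:
        written = os.write(fd, view)
        view = view[written:]


def ordered_commit(path: str, data: bytes,
                   commit_marker: bytes = b"COMMIT\n") -> None:
    """Append data, then a commit marker, with an fsync after each step.

    The first fsync asks the kernel to make the data durable before the
    marker is written, so the marker should never be on disk without the
    data it refers to. The second fsync makes the marker itself durable.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
    try:
        write_all(fd, data)
        os.fsync(fd)  # data first
        write_all(fd, commit_marker)
        os.fsync(fd)  # then the marker
    finally:
        os.close(fd)
```

Each transaction costs two fsync calls here, which is the performance hit I
mentioned.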

Did you test how btrfs snapshots behave while fsync is running with a lot
of data to be committed? That could give a clue...

-- 
Replies to list only preferred.
