On 12/01/2016 11:27 AM, Denis V. Lunev wrote:

On 12/01/2016 10:09 PM, Maxim Patlasov wrote:
On 12/01/2016 12:06 AM, Dmitry Monakhov wrote:

Maxim Patlasov <mpatla...@virtuozzo.com> writes:

Alexey,


You're right. And while composing the patch I understood quite well that it's
possible to rework fuse_sync_writes() using a counter instead of a negative
bias. But the problem with flush_mtime still exists anyway. Think about it:
we first read the mtime from the local inode, then fill and submit an
mtime-update request. After that, we don't know when exactly the fuse daemon
will apply that new mtime to its metadata structures. If another mtime update
is generated in between (e.g. "touch -d <date> file", or even simpler --
just a single direct write implicitly updating mtime), we won't know which
of those two mtime-update requests is processed by fused first. That comes
from a general FUSE protocol limitation: when kernel fuse queues request A,
then request B, it cannot be sure whether userspace will process them
as <A, then B> or <B, then A>.
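To make the race concrete, here is a hypothetical userspace demo of the two
competing mtime updates mentioned above (the file path and sizes are made up;
this only illustrates the pattern against a FUSE mount, it is not part of any
patch):

/*
 * One thread sets an explicit mtime, like "touch -d <date> file", while
 * another does a direct write that implicitly updates mtime. Over a FUSE
 * mount the daemon may apply the two resulting updates in either order.
 */
#define _GNU_SOURCE
#include <fcntl.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

static const char *path = "/mnt/fuse/testfile";  /* assumed FUSE mount */

static void *set_mtime(void *arg)
{
	/* Explicit mtime update: keep atime, set mtime to an arbitrary value. */
	struct timespec ts[2] = {
		{ .tv_nsec = UTIME_OMIT },
		{ .tv_sec = 1000000000, .tv_nsec = 0 },
	};

	if (utimensat(AT_FDCWD, path, ts, 0))
		perror("utimensat");
	return NULL;
}

static void *direct_write(void *arg)
{
	/* Direct write that implicitly updates mtime as a side effect. */
	int fd = open(path, O_WRONLY | O_DIRECT);
	void *buf = NULL;

	if (fd < 0 || posix_memalign(&buf, 4096, 4096)) {
		if (fd >= 0)
			close(fd);
		return NULL;
	}
	memset(buf, 'x', 4096);
	if (pwrite(fd, buf, 4096, 0) < 0)
		perror("pwrite");
	free(buf);
	close(fd);
	return NULL;
}

int main(void)
{
	pthread_t t1, t2;

	/* Both updates race; which mtime the daemon ends up with depends
	 * on the order in which it happens to process the two requests. */
	pthread_create(&t1, NULL, set_mtime, NULL);
	pthread_create(&t2, NULL, direct_write, NULL);
	pthread_join(t1, NULL);
	pthread_join(t2, NULL);
	return 0;
}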


The big advantage of the patch I sent is that it's very simple and
straightforward, and presumably will remove 99% of the contention between
fsync and io_submit (assuming we spend most of the time waiting for the
userspace ACK of the FUSE_FSYNC request). There are actually three questions
to answer:

1) Must we really honor a crazy app that mixes a lot of fsyncs with a lot
of io_submits? The goal of fsync is to ensure that some state has actually
gone to the platters. An app that races io_submit-s with fsync-s doesn't
actually care which state ends up on the platters. I'm not sure it's
reasonable to work very hard to achieve the best possible performance for
such a marginal app.
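For reference, the io_submit + fsync mix in question looks roughly like the
sketch below (illustrative only: the file name, sizes and the 1:8 fsync ratio
are made up, and in real life the writes and the fsyncs typically come from
different threads or guest processes; build with -laio):

#define _GNU_SOURCE
#include <fcntl.h>
#include <libaio.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
	io_context_t ctx = 0;
	struct iocb cb, *cbs[1] = { &cb };
	struct io_event ev;
	void *buf = NULL;
	int fd, i;

	fd = open("/mnt/fuse/data", O_WRONLY | O_CREAT | O_DIRECT, 0644);
	if (fd < 0 || io_setup(128, &ctx) || posix_memalign(&buf, 4096, 4096))
		return 1;
	memset(buf, 0, 4096);

	for (i = 0; i < 1024; i++) {
		/* Asynchronous direct write... */
		io_prep_pwrite(&cb, fd, buf, 4096, (long long)i * 4096);
		if (io_submit(ctx, 1, cbs) != 1)
			break;
		if (io_getevents(ctx, 1, 1, &ev, NULL) != 1)
			break;
		/* ...mixed with periodic fsync; when the two come from
		 * different threads and i_mutex is held across the whole
		 * FUSE_FSYNC round-trip, the writes stall behind the fsync. */
		if ((i & 7) == 0 && fsync(fd))
			perror("fsync");
	}

	io_destroy(ctx);
	close(fd);
	free(buf);
	return 0;
}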
Obviously, any filesystem behaves like this. Task A (mail server) may
perform write/fsync while task B (mysql) does a lot of io_submit-s.
All that IO may happen in parallel; the fs guarantees only that metadata
will be serialized. So all that concurrent IO flows to the block device,
which does not have i_mutex, so all IO indeed happens concurrently.
It looks like you're comparing an app doing POSIX
open/read/write/fsync/close with the fs doing submit_bio. This is a
stretch. But OK, there is a similarity. Still, I don't think this rather
vague similarity proves anything.
We are speaking about a VM process, which essentially
re-submits IO from the guest to the host like above. For sure
QEMU and VM_app have this IO pattern. Thus this
pattern MUST be optimized, as this is one of our
main loads.

Yes, I agree. That's exactly why I wrote in the same email (next paragraph):

This really makes sense. If an app inside a VM loops over ordinary direct writes while another app (in the same VM) does fsync, it's not fair to suspend the first app for a long while just because fuse holds i_mutex for a long time somewhere deep in fuse_fsync.
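To make that direction concrete, here is a rough kernel-style sketch (not the actual patch; struct fsync_req, queue_fsync_request() and wait_fsync_reply() are hypothetical names, and it assumes a kernel where the inode lock is still i_mutex): flush and wait for writeback under i_mutex as before, but drop i_mutex before blocking on the userspace FUSE_FSYNC reply.

#include <linux/fs.h>
#include <linux/mutex.h>

/* Hypothetical request handle and helpers -- not an existing FUSE API. */
struct fsync_req;
static struct fsync_req *queue_fsync_request(struct file *file, int datasync);
static int wait_fsync_reply(struct fsync_req *req);

static int fuse_fsync_sketch(struct file *file, loff_t start, loff_t end,
			     int datasync)
{
	struct inode *inode = file_inode(file);
	struct fsync_req *req;
	int err;

	mutex_lock(&inode->i_mutex);

	/* Push dirty pages and wait for in-flight writeback with the
	 * lock held, as before. */
	err = filemap_write_and_wait_range(inode->i_mapping, start, end);
	if (err) {
		mutex_unlock(&inode->i_mutex);
		return err;
	}

	/* Queue FUSE_FSYNC for the daemon, but don't wait for it yet. */
	req = queue_fsync_request(file, datasync);	/* hypothetical */

	mutex_unlock(&inode->i_mutex);

	/* Wait for the daemon's ACK without holding i_mutex, so that
	 * io_submit-driven direct writes can proceed meanwhile. */
	return wait_fsync_reply(req);			/* hypothetical */
}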

Max


That is why I think this case is not marginal
and is important.

Den

