On 12/01/2016 11:27 AM, Denis V. Lunev wrote:
On 12/01/2016 10:09 PM, Maxim Patlasov wrote:
On 12/01/2016 12:06 AM, Dmitry Monakhov wrote:
Maxim Patlasov <mpatla...@virtuozzo.com> writes:
Alexey,
You're right. And while composing the patch I well understood that it's
possible to rework fuse_sync_writes() using a counter instead of
negative bias. But the problem with flush_mtime still exists anyway.
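To make the counter idea concrete, here is a rough user-space sketch. It is not the fuse code and not the patch: the class and method names (InflightWrites, begin, end, wait_idle) are hypothetical, and it only models "count in-flight writes, let a syncer wait until the count drops to zero".

```python
import threading


class InflightWrites:
    """Hypothetical analogue of a plain in-flight write counter."""

    def __init__(self):
        self._cond = threading.Condition()
        self._count = 0

    def begin(self):
        # A write was submitted.
        with self._cond:
            self._count += 1

    def end(self):
        # A write completed; wake waiters once nothing is in flight.
        with self._cond:
            self._count -= 1
            if self._count == 0:
                self._cond.notify_all()

    def wait_idle(self, timeout=None):
        # Returns True if the counter reached zero, False on timeout.
        with self._cond:
            return self._cond.wait_for(lambda: self._count == 0, timeout)
```

Note that a plain counter like this does not stop new writes from starting while someone waits, so the waiter can be delayed by a steady stream of writers. That is a property of this sketch, not a claim about the real code.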
Think about it: we first acquire the local mtime from the local inode, then
fill and submit an mtime-update request. From then on, we don't know when
exactly the fuse daemon will apply that new mtime to its metadata
structures. If another mtime update is generated in between (e.g.
"touch -d <date> file", or even simpler -- just a single direct write
implicitly updating mtime), we wouldn't know which of those two
mtime-update requests is processed by fused first. That comes from a
general FUSE protocol limitation: when kernel fuse queues request A,
then request B, it cannot be sure whether they will be processed by
userspace as <A, then B> or <B, then A>.
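A toy model of that ordering problem, just to illustrate the point. The timestamps and the helper name final_mtime are made up; the only assumption is that the daemon may apply the two queued mtime updates in either order and the last one applied wins.

```python
from itertools import permutations


def final_mtime(requests, processing_order):
    """Apply mtime updates in the order the daemon happens to process them."""
    mtime = None
    for idx in processing_order:
        mtime = requests[idx]  # last applied update wins
    return mtime


# Kernel queue order: A (flushed local mtime), then B (e.g. an explicit touch -d).
queued = [100, 200]  # hypothetical timestamps

# Enumerate every order in which userspace could process the queue.
outcomes = {
    order: final_mtime(queued, order)
    for order in permutations(range(len(queued)))
}
```

Here outcomes maps (0, 1) to 200 and (1, 0) to 100: the same queue yields two different final mtimes, and the kernel side cannot tell which one it got.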
The big advantage of the patch I sent is that it's very simple,
straightforward and presumably will remove 99% of contention between
fsync and io_submit (assuming we spend most of the time waiting for the
userspace ACK for the FUSE_FSYNC request). There are actually three
questions to answer:
1) Must we really honor a crazy app that mixes a lot of fsyncs with a
lot of io_submits? The goal of fsync is to ensure that some state has
actually reached the platters. An app that races io_submit-s with fsync-s
actually doesn't care which state will reach the platters. I'm not sure
that it's reasonable to work very hard to achieve the best possible
performance for such a marginal app.
Obviously, any filesystem behaves like this.
Task A (mail-server) may perform write/fsync while task B (mysql) does a lot
of io_submit-s.
All that IO may happen in parallel; the fs guarantees only that metadata
will be serialized. So all that concurrent IO flows to the block device, which
does not have i_mutex, so all IO indeed happens concurrently.
Looks like you're comparing an app doing POSIX
open/read/write/fsync/close with fs doing submit_bio. This is a
stretch. But OK, there is a similarity. But I don't think this rather
vague similarity proves something.
We are speaking about a VM process, which essentially
re-submits IO from the guest to the host as described above. For sure
QEMU and VM_app have this IO pattern. Thus this
pattern MUST be optimized, as this is one of our
main loads.
Yes, I agree. That's exactly why I wrote in the same email (next paragraph):
This really makes sense. If an app inside a VM loops over ordinary
direct writes, while another app (in the same VM) does fsync, it's not
fair to suspend the first app for a long while just because fuse holds
i_mutex for a long time somewhere deep in fuse_fsync.
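A small user-space model of that stall, for illustration only. It is not the patch: writer_gets_mutex and the event names are hypothetical, and the assumption is simply that the fix amounts to not holding the lock while fsync waits for the daemon's ACK.

```python
import threading


def writer_gets_mutex(hold_mutex_while_waiting):
    """Return True if a writer can take the mutex while fsync awaits the ACK."""
    i_mutex = threading.Lock()         # stands in for the inode's i_mutex
    daemon_ack = threading.Event()     # stands in for the userspace ACK
    fsync_waiting = threading.Event()  # tells the writer fsync is now waiting

    def fsync():
        i_mutex.acquire()
        # Work that needs the lock would go here.
        if not hold_mutex_while_waiting:
            i_mutex.release()          # drop the lock before the long wait
        fsync_waiting.set()
        daemon_ack.wait()              # the long wait for userspace
        if hold_mutex_while_waiting:
            i_mutex.release()

    t = threading.Thread(target=fsync)
    t.start()
    fsync_waiting.wait()
    # The writer (e.g. a direct write) tries the lock without blocking.
    got = i_mutex.acquire(blocking=False)
    if got:
        i_mutex.release()
    daemon_ack.set()
    t.join()
    return got
```

In this model writer_gets_mutex(True) returns False (the writer is stuck behind fsync for the whole wait), while writer_gets_mutex(False) returns True.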
Max
That is why I think that this case is not marginal
but important.
Den
_______________________________________________
Devel mailing list
Devel@openvz.org
https://lists.openvz.org/mailman/listinfo/devel