I've just noticed that our conversation was not sent to the mailing list because I 
accidentally "Replied" instead of "Reply All"-ed. For posterity, I've put the rest 
of our conversation below:

DUO: Yes --- I have open file descriptors on the host and guest simultaneously.

Bouvier: It's not a good way to share files between systems. A lot of nasty 
things can happen.

DUO: I'm not using it as an actual drive, where I mount a filesystem on both the 
guest and host, but as a single file to pass data host<->guest (it turns out 
that reading/writing to a drive is much faster than TCP/UDP --- 4GB/s vs 100MB/s).

Bouvier: If you want a bidirectional share, I still think it's not a good idea.
Your host and guest OS assume nothing else will modify their disks. That's why 
there are caches for filesystems.

DUO: Hmmm.... I see. If I were to enable ivshmem for macOS hosts, would the ivshmem 
device have the same behavior (i.e., guest->host works fine, but host->guest is 
cached)?

Bouvier:
I think the problem you observe is not due to QEMU's caching policies, but 
rather to your host OS and guest filesystem cache settings. So I'm not sure it 
would really be different with a RAM-based filesystem.

Technically, when your guest writes to a file, the guest OS decides whether to 
cache it. Then, once the guest decides to flush it, the QEMU device can decide 
whether to write it on the host, based on the cache policy you used.

I don't know the details of QEMU's caching implementation (does QEMU keep some 
blocks in its own memory, or does it rely on the host OS file cache instead?), 
but I'm sure it will be a problem one day.
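For reference, the cache policy in question is selected per drive on the QEMU 
command line; a minimal sketch (the image path is illustrative):

```shell
# cache= picks the host-side caching policy for this drive:
#   writethrough - host page cache used for reads; writes are flushed through
#   none         - opens the image with O_DIRECT, bypassing the host page cache
#   writeback    - host page cache used for both reads and writes (the default)
qemu-system-x86_64 \
    -drive file=shared.img,format=raw,cache=writethrough
```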

DUO: After writing to the file, I've set up the guest and host to flush the 
writes (on Linux, sync_file_range(SYNC_FILE_RANGE_WAIT_BEFORE | 
SYNC_FILE_RANGE_WRITE | SYNC_FILE_RANGE_WAIT_AFTER); on macOS, 
fcntl(F_FULLFSYNC)). Right now, I've set the QEMU disk caching policy to 
writethrough, though all of the available settings exhibit the same behavior.

Bouvier:
If you reach the point where you are sure data is correctly written (and 
flushed) to the disk file, the only thing left I can think of is the read cache 
on the Linux guest.

You can try something like drop_caches to confirm this:
https://stackoverflow.com/questions/9551838/how-to-purge-disk-i-o-caches-on-linux
If your file suddenly appears, then that's it. Let me know if it solves this.

However, I still don't know how you can reliably force the guest to re-read 
something which is not supposed to have changed to start with.

DUO:
Huh, wouldn't you know --- that was the problem (running system("echo 3 | sudo tee 
/proc/sys/vm/drop_caches") before every read finally caused my tests to pass). Of 
course, since I now have to run this command before every read, reading is much slower. 
Is there a way to purge the caches for one file, instead of all of them (which is 
presumably what the command is doing)?

Bouvier: Good to know! It seemed the most rational explanation to me.
So, for the next question, the answer is no.
grep drop_caches in https://www.kernel.org/doc/Documentation/sysctl/vm.txt

You can drop only the page cache, or only slab objects (dentries and inodes), or 
both, but there is no way to purge the cache for a single file. It makes sense in 
some way.

One thing you might try is to unmount and remount the disk. This *might* work, 
but to be honest I'm not even sure (the cache may operate on blocks rather than 
on the filesystem itself).

The next question will probably be: what's the fastest way to transfer files to a 
VM? Beyond network-based protocols, you might want to give 
https://virtio-fs.gitlab.io/ a try. I only used it once, for a Windows VM, and it 
was annoying to set up. You might have much better results with a Linux guest.
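For a Linux guest, the virtio-fs setup looks roughly like the following sketch; 
the socket path, source directory, and tag name are all illustrative, and the 
authoritative invocation is on the virtio-fs site:

```shell
# On the host: export a directory over a vhost-user socket
virtiofsd --socket-path=/tmp/vfsd.sock -o source=/srv/shared &

# Start the guest with a vhost-user-fs device backed by that socket;
# virtio-fs requires the guest RAM to be a shared memory backend
qemu-system-x86_64 \
    -object memory-backend-memfd,id=mem,size=4G,share=on \
    -numa node,memdev=mem \
    -chardev socket,id=char0,path=/tmp/vfsd.sock \
    -device vhost-user-fs-pci,chardev=char0,tag=hostshare \
    # ... plus the usual disk/CPU/display options

# In the guest: mount the share by its tag
mount -t virtiofs hostshare /mnt
```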
