Hi all,

I'm developing a paravirtualized target Linux system that runs multiple Linux containers (LXC) inside it. (For those unfamiliar with LXC: simply put, a container is an isolated group of userspace processes with its own rootfs.) Each container should be given access to its rootfs, which is located on the host, and container execution should be deterministic. In particular, this means that container I/O operations must be synchronized within some predefined quantum of guest _virtual_ time, i.e. I/O activity must not be delayed by host performance or by activity on the host or in other containers. In other words, the guest should see either infinite throughput and zero latency, or some predefined throughput/latency characteristics guaranteed for each rootfs.
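
To make the requirement concrete, here is a minimal Python sketch of the behaviour I have in mind. It is only a toy model, not qemu code: the names `QUANTUM_NS`, `completion_deadline` and `run` are made up for illustration, and it assumes the guest virtual clock can be stalled at a quantum boundary until the host has finished every outstanding request.

```python
import random

QUANTUM_NS = 10_000_000  # 10 ms of guest virtual time (illustrative value)


def completion_deadline(submit_vtime_ns, quantum_ns=QUANTUM_NS):
    """Return the first quantum boundary strictly after the submission time."""
    return (submit_vtime_ns // quantum_ns + 1) * quantum_ns


def run(requests, host_latency_ns, quantum_ns=QUANTUM_NS):
    """Toy model of guest-visible I/O completions.

    requests: list of (request_id, submit_vtime_ns) in submission order.
    host_latency_ns: callable modelling host-side jitter; it decides the
        order in which the host finishes requests, nothing else.
    Returns a list of (request_id, guest_visible_completion_vtime_ns).
    """
    finished = []
    for seq, (req_id, submit) in enumerate(requests):
        finished.append((host_latency_ns(), seq, req_id, submit))

    # The host finishes requests in an arbitrary, jitter-dependent order...
    finished.sort()

    # ...but the guest observes each completion only at the quantum boundary,
    # ordered by submission sequence number, never by host finish time.
    visible = [(completion_deadline(submit, quantum_ns), seq, req_id)
               for _latency, seq, req_id, submit in finished]
    visible.sort()
    return [(req_id, deadline) for deadline, _seq, req_id in visible]


if __name__ == "__main__":
    reqs = [("a", 1_000_000), ("b", 3_000_000), ("c", 12_000_000)]
    rng = random.Random()  # unseeded on purpose: host timing is unpredictable
    print(run(reqs, lambda: rng.randrange(50_000_000)))
```

Whatever latencies the host produces, the list returned by `run` depends only on the submission times and the quantum, which is the property I need from the real datapath.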

While other sources of non-determinism seem to be eliminated (using TCG, -icount, etc.), asynchronous I/O still introduces it.

What is the scope of the term "(asynchronous) I/O" within qemu? Does it relate only to the block device layer, or is it a generic term covering the whole datapath between the vCPU and the backend? If it relates to block devices only, does using VirtFS guarantee deterministic access, or does it still involve some asynchrony relative to the guest virtual clock?

Is it possible to force asynchronous I/O within qemu to be blocking by some external means (host OS configuration, hooks, etc.)? I know it may greatly slow down the guest, but that's still better than nothing. Maybe some trivial patch could be made to the qemu code at the virtio, block backend or platform syscall level? Maybe I/O automatically (and reliably) falls back to synchronous mode in some particular configurations, such as a block device whose image is located on tmpfs in RAM (either directly or via an overlay fs)? If so, great! Or maybe some other solution exists?

The main problem is to organize access from the guest Linux to some file system on the host (directory, mount point, image file... it doesn't matter) in a deterministic manner.
The secondary problem is to optimize performance as much as possible by:
- avoiding unnecessary overhead (e.g. by using the virtio infrastructure, preferring virtfs over a blk device, etc.);
- allowing some asynchrony within a defined quantum of time (e.g. 10 ms), i.e. I/O order and speed are free to float within each quantum's borders, while the result seen by the guest at the end of the quantum is always the same.

Actually, what I'm trying to achieve directly contradicts what most people try to avoid, because synchronous I/O degrades performance in the vast majority of usage scenarios.

Does anyone have any thoughts on this?

Best regards,
Artem Pisarenko