Re: [Qemu-devel] SD card subsystem synchronous I/O
On Fri, Apr 20, 2012 at 12:21 AM, andrzej zaborowski balr...@gmail.com wrote:
> On 18 April 2012 14:35, Stefan Hajnoczi stefa...@gmail.com wrote:
>> Recently there have been new SD card emulation patches so I want to raise the issue of synchronous I/O while there is focus on the SD subsystem. Maybe some of the people who are improving the SD subsystem will be able to help.
>>
>> sd_blk_read() and sd_blk_write() use the synchronous block I/O functions to read/write data on behalf of the guest. Device emulation runs in the vcpu thread with the QEMU global mutex held, and therefore both the guest vcpu and QEMU's own monitor and VNC server are unresponsive while bdrv_read()/bdrv_write() is blocked.
>>
>> This makes bdrv_read()/bdrv_write() in device emulation code a performance problem - the guest becomes unresponsive and laggy under heavy I/O. In extreme cases, like image files on NFS with a network connectivity issue, it can affect the reliability of QEMU as a whole because the monitor and VNC are unavailable until the I/O operation completes.
>>
>> Device emulation should use the bdrv_aio_readv()/bdrv_aio_writev() functions so that control can return to the guest. When the I/O operation completes a callback function is invoked and the device emulation can signal completion to the guest - usually by setting bits in hardware registers and raising an interrupt. The result is good responsiveness and the monitor/VNC remain available even under heavy I/O.
>>
>> The challenge is how to convert hw/sd.c and possibly update emulated SD controllers. We need to stop assuming that a read/write operation can be performed instantly and need to use a bdrv_aio_readv()/bdrv_aio_writev() callback function to complete the I/O.
>>
>> Since I am not familiar with the SD specification or the hw/sd.c code very well I want to check:
>>
>> * Is anyone willing to convert the SD subsystem?
>> * Will it be possible to convert just hw/sd.c without affecting emulated SD controllers?
>> * If we're going to need to fix all controllers in addition to hw/sd.c, then adding more controllers grows the problem.
>
> Yes, controllers would be affected, but there are various ways to go about it. Some could be simple to implement (looking at pxa2xx_mmci.c).
>
> First of all the SD specification pretty much assumes the storage medium is flash and data is available immediately after it is requested. The host drives the clock and there's a fixed number of cycles that pass between a command and the response. There's a mechanism for the card to indicate it is busy programming after data is written, but it doesn't apply to some types of writes. However the number of cycles between command and response can be different between card manufacturers, so it looks like the card can pull either the CMD or the DAT line high before starting to send the command response or the data.
>
> In qemu you could either make the data transfers async, or the response transfers async, there's no need to do both.
>
> If the image is on a network filesystem then there could be problems caused by the synchronous IO. Anything else, I'd guess that the caches, readahead and what not make sync IO the same or unnoticeably faster overall.
>
> pxa2xx_mmci.c would be easy to convert to async, but some host controllers that are more software than hardware might theoretically give up if the card doesn't respond in N cycles.

Even in a case where the bus specification is strict about timing, it's possible that the controllers that guest drivers talk to hide those details and instead work on an interrupt-driven basis.

In other words, maybe most of the work will be converting controllers to implement the busy state while we do the actual block I/O. Is this possible, or do SD controllers expose the real low-level timing aspects of the bus to the guest drivers?

Stefan
Re: [Qemu-devel] SD card subsystem synchronous I/O
On 20 April 2012 11:50, Stefan Hajnoczi stefa...@gmail.com wrote:
> On Fri, Apr 20, 2012 at 12:21 AM, andrzej zaborowski balr...@gmail.com wrote:
>> Yes, controllers would be affected, but there are various ways to go about it. Some could be simple to implement (looking at pxa2xx_mmci.c).
>>
>> First of all the SD specification pretty much assumes the storage medium is flash and data is available immediately after it is requested. The host drives the clock and there's a fixed number of cycles that pass between a command and the response. There's a mechanism for the card to indicate it is busy programming after data is written, but it doesn't apply to some types of writes. However the number of cycles between command and response can be different between card manufacturers, so it looks like the card can pull either the CMD or the DAT line high before starting to send the command response or the data.
>>
>> In qemu you could either make the data transfers async, or the response transfers async, there's no need to do both.
>>
>> If the image is on a network filesystem then there could be problems caused by the synchronous IO. Anything else, I'd guess that the caches, readahead and what not make sync IO the same or unnoticeably faster overall.
>>
>> pxa2xx_mmci.c would be easy to convert to async, but some host controllers that are more software than hardware might theoretically give up if the card doesn't respond in N cycles.
>
> Even in a case where the bus specification is strict about timing it's possible that the controllers that guest drivers talk to hide those details and instead work on an interrupt-driven basis.

Yep.

> In other words, maybe most of the work will be converting controllers to implement the busy state while we do actual block I/O. Is this possible or do SD controllers expose the real low-level timing aspects of the bus to the guest drivers?

The PXA270 one does not, and it would be quite easy to convert, as mentioned. That's probably true for most SoCs' controllers.

Many devices (not necessarily emulated in QEMU) just have the SD card's pads wired to GPIOs and drive them in software, or use other solutions somewhere between fully software and fully hardware. Linux doesn't have any generic bit-banging driver for them as far as I can see.

Cheers
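To make the "give up after N cycles" concern concrete, here is a minimal toy model in C. It is not QEMU code: ToyAsyncCard, toy_card_clock() and toy_host_wait_response() are hypothetical names, and the "cycles" are just loop iterations standing in for clock edges driven by a software (bit-banged) host. It only illustrates that a host with a fixed cycle budget sees a timeout when the backing I/O takes longer than that budget.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical card model: the response becomes available only after
 * the backing I/O has had this many more host clock cycles to finish. */
typedef struct {
    int cycles_until_ready;
} ToyAsyncCard;

/* One host-driven clock cycle. Returns true once the response is ready. */
static bool toy_card_clock(ToyAsyncCard *card)
{
    if (card->cycles_until_ready > 0) {
        card->cycles_until_ready--;
        return false;
    }
    return true;
}

typedef enum { HOST_OK, HOST_TIMEOUT } HostResult;

/* Software-driven host: clocks the card at most max_cycles times and
 * gives up if no response has appeared by then. */
static HostResult toy_host_wait_response(ToyAsyncCard *card, int max_cycles)
{
    assert(max_cycles > 0);
    for (int i = 0; i < max_cycles; i++) {
        if (toy_card_clock(card)) {
            return HOST_OK;
        }
    }
    return HOST_TIMEOUT;
}
```

With a budget of 8 cycles, a card that needs 3 more cycles is answered in time, while one that needs 100 is reported as HOST_TIMEOUT. A controller that hides bus timing behind a status register and an interrupt, as described for the PXA270, would not have this fixed budget.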
Re: [Qemu-devel] SD card subsystem synchronous I/O
On 18 April 2012 14:35, Stefan Hajnoczi stefa...@gmail.com wrote:
> Recently there have been new SD card emulation patches so I want to raise the issue of synchronous I/O while there is focus on the SD subsystem. Maybe some of the people who are improving the SD subsystem will be able to help.
>
> sd_blk_read() and sd_blk_write() use the synchronous block I/O functions to read/write data on behalf of the guest. Device emulation runs in the vcpu thread with the QEMU global mutex held, and therefore both the guest vcpu and QEMU's own monitor and VNC server are unresponsive while bdrv_read()/bdrv_write() is blocked.
>
> This makes bdrv_read()/bdrv_write() in device emulation code a performance problem - the guest becomes unresponsive and laggy under heavy I/O. In extreme cases, like image files on NFS with a network connectivity issue, it can affect the reliability of QEMU as a whole because the monitor and VNC are unavailable until the I/O operation completes.
>
> Device emulation should use the bdrv_aio_readv()/bdrv_aio_writev() functions so that control can return to the guest. When the I/O operation completes a callback function is invoked and the device emulation can signal completion to the guest - usually by setting bits in hardware registers and raising an interrupt. The result is good responsiveness and the monitor/VNC remain available even under heavy I/O.
>
> The challenge is how to convert hw/sd.c and possibly update emulated SD controllers. We need to stop assuming that a read/write operation can be performed instantly and need to use a bdrv_aio_readv()/bdrv_aio_writev() callback function to complete the I/O.
>
> Since I am not familiar with the SD specification or the hw/sd.c code very well I want to check:
>
> * Is anyone willing to convert the SD subsystem?
> * Will it be possible to convert just hw/sd.c without affecting emulated SD controllers?
> * If we're going to need to fix all controllers in addition to hw/sd.c, then adding more controllers grows the problem.

Yes, controllers would be affected, but there are various ways to go about it. Some could be simple to implement (looking at pxa2xx_mmci.c).

First of all, the SD specification pretty much assumes the storage medium is flash and data is available immediately after it is requested. The host drives the clock and there's a fixed number of cycles that pass between a command and the response. There's a mechanism for the card to indicate it is busy programming after data is written, but it doesn't apply to some types of writes. However, the number of cycles between command and response can differ between card manufacturers, so it looks like the card can pull either the CMD or the DAT line high before starting to send the command response or the data.

In QEMU you could make either the data transfers or the response transfers async; there's no need to do both.

If the image is on a network filesystem then there could be problems caused by the synchronous I/O. For anything else, I'd guess that the caches, readahead and whatnot make sync I/O the same or unnoticeably faster overall.

pxa2xx_mmci.c would be easy to convert to async, but some host controllers that are more software than hardware might theoretically give up if the card doesn't respond in N cycles.

Cheers
[Qemu-devel] SD card subsystem synchronous I/O
Recently there have been new SD card emulation patches so I want to raise the issue of synchronous I/O while there is focus on the SD subsystem. Maybe some of the people who are improving the SD subsystem will be able to help.

sd_blk_read() and sd_blk_write() use the synchronous block I/O functions to read/write data on behalf of the guest. Device emulation runs in the vcpu thread with the QEMU global mutex held, and therefore both the guest vcpu and QEMU's own monitor and VNC server are unresponsive while bdrv_read()/bdrv_write() is blocked.

This makes bdrv_read()/bdrv_write() in device emulation code a performance problem - the guest becomes unresponsive and laggy under heavy I/O. In extreme cases, like image files on NFS with a network connectivity issue, it can affect the reliability of QEMU as a whole because the monitor and VNC are unavailable until the I/O operation completes.

Device emulation should use the bdrv_aio_readv()/bdrv_aio_writev() functions so that control can return to the guest. When the I/O operation completes a callback function is invoked and the device emulation can signal completion to the guest - usually by setting bits in hardware registers and raising an interrupt. The result is good responsiveness and the monitor/VNC remain available even under heavy I/O.

The challenge is how to convert hw/sd.c and possibly update emulated SD controllers. We need to stop assuming that a read/write operation can be performed instantly and need to use a bdrv_aio_readv()/bdrv_aio_writev() callback function to complete the I/O.

Since I am not familiar with the SD specification or the hw/sd.c code very well I want to check:

* Is anyone willing to convert the SD subsystem?
* Will it be possible to convert just hw/sd.c without affecting emulated SD controllers?
* If we're going to need to fix all controllers in addition to hw/sd.c, then adding more controllers grows the problem.

Stefan
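For readers unfamiliar with the callback style described above, the following is a self-contained toy model in C of "submit, return immediately, complete in a callback". It deliberately does not use the real QEMU block layer: ToyBackend, toy_aio_read(), toy_run_completions() and ToyCard are hypothetical stand-ins, and the interrupt is reduced to a counter. It is a sketch of the pattern only, not of how hw/sd.c would actually be converted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define SECTOR_SIZE 512
#define MAX_PENDING 4

/* Completion callback: ret is 0 on success, negative on error. */
typedef void completion_fn(void *opaque, int ret);

typedef struct {
    uint8_t *buf;
    uint64_t sector;
    completion_fn *cb;
    void *opaque;
} PendingRead;

/* Hypothetical backend standing in for the block layer. */
typedef struct {
    const uint8_t *image;   /* backing data, nb_sectors * SECTOR_SIZE bytes */
    uint64_t nb_sectors;
    PendingRead queue[MAX_PENDING];
    int pending;
} ToyBackend;

/* Queue a read and return at once; no data is copied here. */
static void toy_aio_read(ToyBackend *be, uint64_t sector, uint8_t *buf,
                         completion_fn *cb, void *opaque)
{
    assert(be->pending < MAX_PENDING);
    be->queue[be->pending++] = (PendingRead){ buf, sector, cb, opaque };
}

/* Stand-in for the main loop: finish queued reads and run callbacks.
 * The queue is copied first so a callback may safely submit new reads. */
static void toy_run_completions(ToyBackend *be)
{
    PendingRead done[MAX_PENDING];
    int n = be->pending;

    memcpy(done, be->queue, sizeof(done[0]) * (size_t)n);
    be->pending = 0;
    for (int i = 0; i < n; i++) {
        int ret = 0;
        if (done[i].sector >= be->nb_sectors) {
            ret = -1;
        } else {
            memcpy(done[i].buf, be->image + done[i].sector * SECTOR_SIZE,
                   SECTOR_SIZE);
        }
        done[i].cb(done[i].opaque, ret);
    }
}

/* Hypothetical device state visible to the guest. */
typedef struct {
    ToyBackend *be;
    uint8_t fifo[SECTOR_SIZE];
    bool busy;
    bool data_ready;
    bool error;
    int irq_count;          /* stands in for raising an interrupt */
} ToyCard;

/* Runs when the read completes: update status bits, then "interrupt". */
static void toy_card_read_done(void *opaque, int ret)
{
    ToyCard *card = opaque;

    card->busy = false;
    card->error = ret < 0;
    card->data_ready = ret == 0;
    card->irq_count++;
}

/* Guest-triggered read: mark the card busy and return to the guest. */
static void toy_card_start_read(ToyCard *card, uint64_t sector)
{
    card->busy = true;
    card->data_ready = false;
    card->error = false;
    toy_aio_read(card->be, sector, card->fifo, toy_card_read_done, card);
}
```

Between toy_card_start_read() and toy_run_completions() the card reports busy and no interrupt has fired, which is the window in which the guest, the monitor and VNC would keep running in the real case.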