Re: [Qemu-devel] SD card subsystem synchronous I/O

2012-04-20 Thread Stefan Hajnoczi
On Fri, Apr 20, 2012 at 12:21 AM, andrzej zaborowski balr...@gmail.com wrote:
 Yes, controllers would be affected, but there are various ways to go
 about it.  Some could be simple to implement (looking at
 pxa2xx_mmci.c).  First of all the SD specification pretty much assumes
 the storage medium is flash and data is available immediately after
 it is requested.  The host drives the clock and there's a fixed number
 of cycles that pass between a command and the response.  There's a
 mechanism for the card to indicate it is busy programming after data
 is written, but it doesn't apply to some types of writes.

 However the number of cycles between command and response can be
 different between card manufacturers, so it looks like the card can
 pull either the CMD or the DAT line high before starting to send the
 command response or the data.  In qemu you could either make the data
 transfers async, or the response transfers async, there's no need to
 do both.

 If the image is on a network filesystem then there could be problems
 caused by the synchronous I/O.  For anything else, I'd guess that
 caches, readahead and whatnot make sync I/O just as fast or
 imperceptibly faster overall.  pxa2xx_mmci.c would be easy to convert to async, but
 some host controllers that are more software than hardware might
 theoretically give up if the card doesn't respond in N cycles.

Even in a case where the bus specification is strict about timing, it's
possible that the controllers that guest drivers talk to hide those
details and instead work on an interrupt-driven basis.

In other words, maybe most of the work will be converting controllers
to implement the busy state while we do actual block I/O.  Is this
possible or do SD controllers expose the real low-level timing aspects
of the bus to the guest drivers?
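
To make the "busy state" idea concrete, here is a minimal sketch of a
controller that reports busy while a block request is in flight and
raises an interrupt on completion.  Everything in it is hypothetical:
MockCtrl, the register offsets and the status bits are invented for
illustration and do not correspond to pxa2xx_mmci.c or any real
controller.  mock_ctrl_io_complete() stands in for whatever the block
layer's completion callback would call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical register map, not taken from any real controller. */
enum { REG_CMD = 0x00, REG_STATUS = 0x04, REG_IRQ_MASK = 0x08 };
enum { STATUS_BUSY = 1u << 0, STATUS_DATA_READY = 1u << 1 };

typedef struct MockCtrl {
    uint32_t status;     /* guest-visible status bits */
    uint32_t irq_mask;   /* which status bits may raise the IRQ */
    bool irq_level;      /* current level of the interrupt line */
} MockCtrl;

/* Recompute the interrupt line from status and mask. */
static void mock_ctrl_update_irq(MockCtrl *c)
{
    c->irq_level = (c->status & c->irq_mask & STATUS_DATA_READY) != 0;
}

/* Guest register write. A command only marks the controller busy;
 * the block I/O itself would be submitted asynchronously here. */
static void mock_ctrl_write(MockCtrl *c, uint32_t addr, uint32_t val)
{
    switch (addr) {
    case REG_CMD:
        if (c->status & STATUS_BUSY) {
            break;               /* ignore commands while busy */
        }
        c->status = STATUS_BUSY; /* clears any stale DATA_READY */
        break;
    case REG_IRQ_MASK:
        c->irq_mask = val;
        break;
    }
    mock_ctrl_update_irq(c);
}

/* Guest register read. Never blocks, even while I/O is pending. */
static uint32_t mock_ctrl_read(const MockCtrl *c, uint32_t addr)
{
    switch (addr) {
    case REG_STATUS:
        return c->status;
    case REG_IRQ_MASK:
        return c->irq_mask;
    default:
        return 0;
    }
}

/* Called when the (simulated) block request finishes. */
static void mock_ctrl_io_complete(MockCtrl *c)
{
    c->status = STATUS_DATA_READY;
    mock_ctrl_update_irq(c);
}
```

A guest driver that works on an interrupt-driven basis would write
REG_CMD, return, and only look at REG_STATUS again once the interrupt
fires.  A driver that bit-bangs the bus and counts clock cycles would
not fit this model.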

Stefan



Re: [Qemu-devel] SD card subsystem synchronous I/O

2012-04-20 Thread andrzej zaborowski
On 20 April 2012 11:50, Stefan Hajnoczi stefa...@gmail.com wrote:
 Even in a case where the bus specification is strict about timing, it's
 possible that the controllers that guest drivers talk to hide those
 details and instead work on an interrupt-driven basis.

Yep.


 In other words, maybe most of the work will be converting controllers
 to implement the busy state while we do actual block I/O.  Is this
 possible or do SD controllers expose the real low-level timing aspects
 of the bus to the guest drivers?

The PXA270 one does not, and it would be quite easy to convert, as
mentioned.  That's probably true for most SoCs' controllers.  However,
many devices (not necessarily emulated in QEMU) just have the SD card's
pads wired to GPIOs and driven in software, or use other solutions
somewhere between fully software and fully hardware.  Linux doesn't
have any generic bit-banging driver for them as far as I can see.

Cheers



Re: [Qemu-devel] SD card subsystem synchronous I/O

2012-04-19 Thread andrzej zaborowski
On 18 April 2012 14:35, Stefan Hajnoczi stefa...@gmail.com wrote:
 * Is anyone willing to convert the SD subsystem?

 * Will it be possible to convert just hw/sd.c without affecting
 emulated SD controllers?
  * If we're going to need to fix all controllers in addition to
 hw/sd.c, then adding more controllers grows the problem.

Yes, controllers would be affected, but there are various ways to go
about it.  Some could be simple to implement (looking at
pxa2xx_mmci.c).  First of all the SD specification pretty much assumes
the storage medium is flash and data is available immediately after
it is requested.  The host drives the clock and there's a fixed number
of cycles that pass between a command and the response.  There's a
mechanism for the card to indicate it is busy programming after data
is written, but it doesn't apply to some types of writes.

However the number of cycles between command and response can be
different between card manufacturers, so it looks like the card can
pull either the CMD or the DAT line high before starting to send the
command response or the data.  In qemu you could either make the data
transfers async, or the response transfers async, there's no need to
do both.

If the image is on a network filesystem then there could be problems
caused by the synchronous I/O.  For anything else, I'd guess that
caches, readahead and whatnot make sync I/O just as fast or
imperceptibly faster overall.  pxa2xx_mmci.c would be easy to convert to async, but
some host controllers that are more software than hardware might
theoretically give up if the card doesn't respond in N cycles.

Cheers



[Qemu-devel] SD card subsystem synchronous I/O

2012-04-18 Thread Stefan Hajnoczi
Recently there have been new SD card emulation patches so I want to
raise the issue of synchronous I/O while there is focus on the SD
subsystem.  Maybe some of the people who are improving the SD
subsystem will be able to help.

sd_blk_read() and sd_blk_write() use the synchronous block I/O
functions to read/write data on behalf of the guest.  Device emulation
runs in the vcpu thread with the QEMU global mutex held, and therefore
both the guest vcpu and QEMU's own monitor and VNC server are
unresponsive while bdrv_read()/bdrv_write() is blocked.

This makes bdrv_read()/bdrv_write() in device emulation code a
performance problem - the guest becomes unresponsive and laggy under
heavy I/O.  In extreme cases, like image files on NFS with a network
connectivity issue, it can affect the reliability of QEMU as a whole
because the monitor and VNC are unavailable until the I/O operation
completes.

Device emulation should use the bdrv_aio_readv()/bdrv_aio_writev()
functions so that control can return to the guest.  When the I/O
operation completes, a callback function is invoked and the device
emulation can signal completion to the guest - usually by setting bits
in hardware registers and raising an interrupt.  The result is good
responsiveness and the monitor/VNC remain available even under heavy
I/O.
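
As a rough illustration of the callback-driven flow described above,
here is a minimal, self-contained C sketch.  It does not use QEMU's
block layer: MockBlock, mock_aio_read() and mock_run_completions() are
hypothetical stand-ins for an AIO submit call and the event loop, and
MockSDState is not the real hw/sd.c state.  Only the (opaque, ret)
shape of the completion callback is modelled on the block layer's
convention.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define SECTOR_SIZE 512
#define NUM_SECTORS 4

/* Completion callback: opaque is the caller's state, ret < 0 is an error. */
typedef void CompletionFunc(void *opaque, int ret);

/* Hypothetical stand-in for the block layer; holds one pending read. */
typedef struct MockBlock {
    uint8_t disk[NUM_SECTORS * SECTOR_SIZE];
    bool pending;
    int sector;
    uint8_t *dest;
    CompletionFunc *cb;
    void *opaque;
} MockBlock;

/* Queue a read and return immediately, as an AIO submit would. */
static void mock_aio_read(MockBlock *blk, int sector, uint8_t *dest,
                          CompletionFunc *cb, void *opaque)
{
    assert(!blk->pending);
    assert(sector >= 0 && sector < NUM_SECTORS);
    blk->pending = true;
    blk->sector = sector;
    blk->dest = dest;
    blk->cb = cb;
    blk->opaque = opaque;
}

/* Stand-in for the event loop: finish the pending request, if any. */
static bool mock_run_completions(MockBlock *blk)
{
    if (!blk->pending) {
        return false;
    }
    blk->pending = false;
    memcpy(blk->dest, &blk->disk[blk->sector * SECTOR_SIZE], SECTOR_SIZE);
    blk->cb(blk->opaque, 0);
    return true;
}

/* Hypothetical card state, much simpler than the real one. */
typedef struct MockSDState {
    MockBlock *blk;
    uint8_t data[SECTOR_SIZE];
    bool busy;   /* a request is in flight */
    bool error;  /* last request failed */
    bool irq;    /* completion signalled to the guest */
} MockSDState;

/* Completion callback: update device state and signal the guest. */
static void mock_sd_read_done(void *opaque, int ret)
{
    MockSDState *sd = opaque;
    sd->busy = false;
    sd->error = (ret < 0);
    sd->irq = true;
}

/* Start a read without blocking; control returns to the guest. */
static void mock_sd_start_read(MockSDState *sd, int sector)
{
    sd->busy = true;
    sd->irq = false;
    mock_aio_read(sd->blk, sector, sd->data, mock_sd_read_done, sd);
}
```

The point of the split is that mock_sd_start_read() returns before any
data has moved, so nothing else is held up while the request is
pending.  The data only becomes valid once the callback has run.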

The challenge is how to convert hw/sd.c and possibly update emulated
SD controllers.  We need to stop assuming that a read/write operation
can be performed instantly and need to use a
bdrv_aio_readv()/bdrv_aio_writev() callback function to complete the
I/O.

Since I am not very familiar with the SD specification or the hw/sd.c
code, I want to check:

* Is anyone willing to convert the SD subsystem?

* Will it be possible to convert just hw/sd.c without affecting
emulated SD controllers?
  * If we're going to need to fix all controllers in addition to
hw/sd.c, then adding more controllers grows the problem.

Stefan