Re: [Qemu-devel] [PATCH 2/2 v2] Direct IDE I/O
Gerd Hoffmann wrote:
> Anthony Liguori wrote:
>> I really want to use readv/writev though. With virtio, we get a
>> scatter/gather list for each IO request.
> Yep, I've also missed pwritev (or whatever that syscall would be named).
>> Once I post the virtio-blk driver, I'll follow up a little later with
>> some refactoring of the block device layers. I think it can be made
>> much simpler while still remaining asynchronous.
>>> IMHO the only alternative to that scheme would be to turn the block
>>> drivers into some kind of remapping drivers for the various file
>>> formats which don't actually perform the I/O. Then you can handle the
>>> actual I/O in a generic way using whatever API is available, be it
>>> posix-aio, linux-aio or slow-sync-io.
>> That's part of my plan.
> Oh, cool. Can you also turn them into a sane shared library while being
> at it? The current approach to compile it once for qemu and once for
> qemu-img with -DQEMU_TOOL isn't that great. But if you factor out the
> actual I/O the block-raw.c code should have no need to mess with qemu
> internals any more and become much cleaner and simpler ...

Yeah, it is definitely something that should be turned into a shared
library. I don't think I'll attempt that at first, but I do agree it's
the right direction to move toward.

Regards,

Anthony Liguori
Re: [Qemu-devel] [PATCH 2/2 v2] Direct IDE I/O
Anthony Liguori wrote:
> I really want to use readv/writev though. With virtio, we get a
> scatter/gather list for each IO request.

Yep, I've also missed pwritev (or whatever that syscall would be named).

> Once I post the virtio-blk driver, I'll follow up a little later with
> some refactoring of the block device layers. I think it can be made
> much simpler while still remaining asynchronous.
>
>> IMHO the only alternative to that scheme would be to turn the block
>> drivers into some kind of remapping drivers for the various file
>> formats which don't actually perform the I/O. Then you can handle the
>> actual I/O in a generic way using whatever API is available, be it
>> posix-aio, linux-aio or slow-sync-io.
>
> That's part of my plan.

Oh, cool. Can you also turn them into a sane shared library while being
at it? The current approach to compile it once for qemu and once for
qemu-img with -DQEMU_TOOL isn't that great. But if you factor out the
actual I/O the block-raw.c code should have no need to mess with qemu
internals any more and become much cleaner and simpler ...

cheers,
  Gerd
Re: [Qemu-devel] [PATCH 2/2 v2] Direct IDE I/O
Gerd Hoffmann wrote:
> Anthony Liguori wrote:
>>> IMHO it would be a much better idea to kill the aio interface
>>> altogether and instead make the block drivers reentrant. Then you can
>>> use (multiple) posix threads to run the I/O async if you want.
>> Threads are a poor substitute for a proper AIO interface. linux-aio
>> gives you everything you could possibly want in an interface since it
>> allows you to submit multiple vectored operations in a single syscall,
>> use an fd to signal request completion, complete multiple requests in
>> a single syscall, and inject barriers via fdsync.
> I still think implementing async i/o at the block driver level is the
> wrong thing to do. You'll end up reinventing the wheel over and over
> again and add complexity to the block drivers which simply doesn't
> belong there (or not supporting async I/O for most file formats). Just
> look at the insane file size of the block driver for the simplest
> possible disk format: block-raw.c. It will become even worse when
> adding a linux-specific aio variant.
>
> In contrast: making the disk drivers reentrant should be easy for most
> of them. For the raw driver it should be just using pread/pwrite
> syscalls instead of lseek + read/write (also saves a syscall along the
> way, yea!). Others probably need an additional lock for metadata
> updates. With that in place you can easily implement async I/O via
> threads one layer above, and only once, in block.c.

I really want to use readv/writev though. With virtio, we get a
scatter/gather list for each IO request.

Once I post the virtio-blk driver, I'll follow up a little later with
some refactoring of the block device layers. I think it can be made much
simpler while still remaining asynchronous.

> IMHO the only alternative to that scheme would be to turn the block
> drivers into some kind of remapping drivers for the various file
> formats which don't actually perform the I/O. Then you can handle the
> actual I/O in a generic way using whatever API is available, be it
> posix-aio, linux-aio or slow-sync-io.

That's part of my plan.

Regards,

Anthony Liguori
Re: [Qemu-devel] [PATCH 2/2 v2] Direct IDE I/O
Anthony Liguori wrote:
>> IMHO it would be a much better idea to kill the aio interface
>> altogether and instead make the block drivers reentrant. Then you can
>> use (multiple) posix threads to run the I/O async if you want.
>
> Threads are a poor substitute for a proper AIO interface. linux-aio
> gives you everything you could possibly want in an interface since it
> allows you to submit multiple vectored operations in a single syscall,
> use an fd to signal request completion, complete multiple requests in a
> single syscall, and inject barriers via fdsync.

I still think implementing async i/o at the block driver level is the
wrong thing to do. You'll end up reinventing the wheel over and over
again and add complexity to the block drivers which simply doesn't
belong there (or not supporting async I/O for most file formats). Just
look at the insane file size of the block driver for the simplest
possible disk format: block-raw.c. It will become even worse when adding
a linux-specific aio variant.

In contrast: making the disk drivers reentrant should be easy for most
of them. For the raw driver it should be just using pread/pwrite
syscalls instead of lseek + read/write (also saves a syscall along the
way, yea!). Others probably need an additional lock for metadata
updates. With that in place you can easily implement async I/O via
threads one layer above, and only once, in block.c.

IMHO the only alternative to that scheme would be to turn the block
drivers into some kind of remapping drivers for the various file formats
which don't actually perform the I/O. Then you can handle the actual I/O
in a generic way using whatever API is available, be it posix-aio,
linux-aio or slow-sync-io.

cheers,
  Gerd
Re: [Qemu-devel] [PATCH 2/2 v2] Direct IDE I/O
On Tuesday 04 December 2007 at 13:49 +0100, Gerd Hoffmann wrote:
> Anthony Liguori wrote:
> > I have a patch that uses linux-aio for the virtio-blk driver I'll be
> > posting tomorrow and I'm extremely happy with the results. In recent
> > kernels, you can use an eventfd interface along with linux-aio so that
> > polling is unnecessary.
>
> Which kernel version is "recent"?

I think it is 2.6.22 and after.

Laurent
--
- [EMAIL PROTECTED] --
"Any sufficiently advanced technology is indistinguishable from magic."
- Arthur C. Clarke
Re: [Qemu-devel] [PATCH 2/2 v2] Direct IDE I/O
Anthony Liguori wrote:
> I have a patch that uses linux-aio for the virtio-blk driver I'll be
> posting tomorrow and I'm extremely happy with the results. In recent
> kernels, you can use an eventfd interface along with linux-aio so that
> polling is unnecessary.

Which kernel version is "recent"?

cheers,
  Gerd
Re: [Qemu-devel] [PATCH 2/2 v2] Direct IDE I/O
On Monday 03 December 2007 at 19:16, Paul Brook wrote:
> > Yes, librt is providing posix-aio, and librt coming with GNU libc uses
> > threads. But if I remember correctly librt coming with RHEL uses a mix
> > of threads and linux kernel AIO (you can have a look at the .srpm of
> > libc).
> >
> > BTW, if everyone thinks it could be a good idea I can port block-raw.c
> > to use linux kernel AIO (without removing POSIX AIO support, of course)
>
> This seems rather pointless, given a user can just use a linux-AIO librt
> instead.

Just a comment: to use linux-aio, the file must be opened with O_DIRECT.
(It's a good reason to include my patch, isn't it?)

Laurent
--
- [EMAIL PROTECTED] --
"Any sufficiently advanced technology is indistinguishable from magic."
- Arthur C. Clarke
Re: [Qemu-devel] [PATCH 2/2 v2] Direct IDE I/O
Gerd Hoffmann wrote:
>> BTW, if everyone thinks it could be a good idea I can port block-raw.c
>> to use linux kernel AIO (without removing POSIX AIO support, of course)
>
> IMHO it would be a much better idea to kill the aio interface altogether
> and instead make the block drivers reentrant. Then you can use
> (multiple) posix threads to run the I/O async if you want.

Threads are a poor substitute for a proper AIO interface. linux-aio
gives you everything you could possibly want in an interface since it
allows you to submit multiple vectored operations in a single syscall,
use an fd to signal request completion, complete multiple requests in a
single syscall, and inject barriers via fdsync.

Regards,

Anthony Liguori
Re: [Qemu-devel] [PATCH 2/2 v2] Direct IDE I/O
Paul Brook wrote:
>> Yes, librt is providing posix-aio, and librt coming with GNU libc uses
>> threads. But if I remember correctly librt coming with RHEL uses a mix
>> of threads and linux kernel AIO (you can have a look at the .srpm of
>> libc).
>>
>> BTW, if everyone thinks it could be a good idea I can port block-raw.c
>> to use linux kernel AIO (without removing POSIX AIO support, of course)
>
> This seems rather pointless, given a user can just use a linux-AIO librt
> instead.

Not at all. linux-aio is the only interface that allows you to do an
asynchronous fdsync, which simulates a barrier, which allows for an
ordered queue.

I have a patch that uses linux-aio for the virtio-blk driver I'll be
posting tomorrow and I'm extremely happy with the results. In recent
kernels, you can use an eventfd interface along with linux-aio so that
polling is unnecessary. Along with O_DIRECT and the preadv/pwritev
interface, you can make a block backend in userspace that performs just
as well as if it were in the kernel. The posix-aio interface simply
doesn't provide a mechanism to do these things.

Regards,

Anthony Liguori
Re: [Qemu-devel] [PATCH 2/2 v2] Direct IDE I/O
Gerd Hoffmann, on Mon 03 Dec 2007 22:13:07 +0100, wrote:
> > BTW, if everyone thinks it could be a good idea I can port block-raw.c
> > to use linux kernel AIO (without removing POSIX AIO support, of course)
>
> IMHO it would be a much better idea to kill the aio interface altogether
> and instead make the block drivers reentrant. Then you can use
> (multiple) posix threads to run the I/O async if you want.

Mmm, that will not make my life easier... I'm precisely trying to avoid
threads so as to get better throughput.

Samuel
Re: [Qemu-devel] [PATCH 2/2 v2] Direct IDE I/O
Hi,

> BTW, if everyone thinks it could be a good idea I can port block-raw.c
> to use linux kernel AIO (without removing POSIX AIO support, of course)

IMHO it would be a much better idea to kill the aio interface altogether
and instead make the block drivers reentrant. Then you can use
(multiple) posix threads to run the I/O async if you want.

cheers,
  Gerd
Re: [Qemu-devel] [PATCH 2/2 v2] Direct IDE I/O
Paul Brook, on Mon 03 Dec 2007 15:39:48, wrote:
> I think host caching is still useful enough to be enabled by default,
> and provides a significant performance increase in several cases.
>
> - The guest typically has a relatively small quantity of RAM, compared
>   to a modern machine. Allowing the host OS to act as a demand-based L2
>   cache allows this to be used without having to dedicate excessive
>   quantities of RAM to qemu.
> - I've seen reports that it significantly speeds up the Windows
>   installer.
> - Host cache is persistent between multiple qemu runs. If you're doing
>   anything that requires frequent guest reboots (e.g. kernel debugging)
>   this is going to be a huge win.
> - You're running a guest OS that has limited or no caching (e.g. DOS).

Yes, and in other cases (e.g. real-production KVM/Xen servers), this is
just cache duplication.

> I'd hope that the host OS would have cache use heuristics that would
> help limit cache pollution.

How could it? It can't detect that the guest also has a buffer/page
cache.

Samuel
Re: [Qemu-devel] [PATCH 2/2 v2] Direct IDE I/O
> Yes, librt is providing posix-aio, and librt coming with GNU libc uses
> threads. But if I remember correctly librt coming with RHEL uses a mix
> of threads and linux kernel AIO (you can have a look at the .srpm of
> libc).
>
> BTW, if everyone thinks it could be a good idea I can port block-raw.c
> to use linux kernel AIO (without removing POSIX AIO support, of course)

This seems rather pointless, given a user can just use a linux-AIO librt
instead.

Paul
Re: [Qemu-devel] [PATCH 2/2 v2] Direct IDE I/O
> Well, let's separate a few things. QEMU uses posix-aio which uses
> threads and normal read/write operations. It also limits the number of
> threads that aio uses to 1 which effectively makes everything
> synchronous anyway.

This is a bug. Allegedly this is to work around an old broken glibc, so
we should probably make it conditional on old glibc.

Paul
Re: [Qemu-devel] [PATCH 2/2 v2] Direct IDE I/O
On Monday 03 December 2007 at 12:06 -0600, Anthony Liguori wrote:
> Samuel Thibault wrote:
> > Anthony Liguori, on Mon 03 Dec 2007 09:54:47 -0600, wrote:
> >> Have you done any performance testing? Buffered IO should absolutely
> >> beat direct IO simply because buffered IO allows writes to complete
> >> before they actually hit disk.
> >
> > Since qemu can use the aio interface, that shouldn't matter.
>
> Well, let's separate a few things. QEMU uses posix-aio which uses
> threads and normal read/write operations. It also limits the number of
> threads that aio uses to 1 which effectively makes everything
> synchronous anyway.

Yes, librt is providing posix-aio, and librt coming with GNU libc uses
threads. But if I remember correctly librt coming with RHEL uses a mix
of threads and linux kernel AIO (you can have a look at the .srpm of
libc).

There is also the libaio I wrote some years ago (with Sébastien Dugué)
which is purely linux kernel AIO (but the kernel patches were never
merged because of Zach Brown's asynchronous system call work).

BTW, if everyone thinks it could be a good idea I can port block-raw.c
to use linux kernel AIO (without removing POSIX AIO support, of course).

> But it still doesn't matter. When you issue a write() on an O_DIRECT
> fd, the write does not complete until the data has made its way to
> disk. The guest can still run if you're using O_NONBLOCK but the IDE
> device will not submit another IO request until you complete the DMA
> operation.
>
> The SCSI device supports multiple outstanding operations, but it's
> limited to 16, and you'll never see more than one request at a time in
> QEMU currently because of the limitation to a single thread.
>
> Regards,
>
> Anthony Liguori

Laurent
--
- [EMAIL PROTECTED] --
"Any sufficiently advanced technology is indistinguishable from magic."
- Arthur C. Clarke
Re: [Qemu-devel] [PATCH 2/2 v2] Direct IDE I/O
On Monday 03 December 2007 at 09:54 -0600, Anthony Liguori wrote:
> Laurent Vivier wrote:
> > On Monday 03 December 2007 at 11:23 +0100, Fabrice Bellard wrote:
> >> Laurent Vivier wrote:
> >>> This patch enhances the "-drive ,cache=off" mode with IDE drive
> >>> emulation by removing the buffer used in the IDE emulation.
> >>
> >> What's the use of keeping the buffered case?
> >
> > Well, I don't like to remove code written by others...
> > and I don't want to break something.
> >
> > But if you think I should remove the buffered case, I can.
> >
> > BTW, do you think I should enable "cache=off" by default?
> > Or even remove the option from the command line and always use
> > O_DIRECT?
>
> Hi Laurent,

Hi Anthony,

> Have you done any performance testing? Buffered IO should absolutely
> beat direct IO simply because buffered IO allows writes to complete
> before they actually hit disk. I've observed this myself. Plus the
> host typically has a much larger page cache than the guest so the
> second level of caching helps an awful lot.

I don't have real benchmarks. I just saw some improvements with dbench
(which is not a good benchmark, I know...).

Direct I/O can be good in some cases (because it avoids multiple copies)
and buffered I/O good in others (because it avoids disk access and, as
you say, doesn't wait for I/O completion). But there are at least two
other good reasons to use direct I/O:

- reliability: by avoiding the cache we improve the probability that
  data are on disk (and the ordering of I/O). And as you say, as we wait
  for write completion, we are sure data have been written.
- isolation: it avoids polluting the host cache with guest data (and if
  we have several guests, it avoids a performance impact at the cache
  level between guests).

But there is no perfect solution, which is why I think it's a good thing
to leave the choice to the user.

Laurent
--
- [EMAIL PROTECTED] --
"Any sufficiently advanced technology is indistinguishable from magic."
- Arthur C. Clarke
Re: [Qemu-devel] [PATCH 2/2 v2] Direct IDE I/O
Anthony Liguori wrote:
> > With the IDE emulation, when the emulated "disk write cache" flag is
> > on it may be reasonable to report a write as completed when the AIO
> > is dispatched, without waiting for the AIO to complete.
> >
> > An IDE flush cache command would wait for all outstanding write AIOs
> > to complete, and then issue a flush cache (fdatasync) to the real
> > device before reporting it has completed.
> >
> > That's roughly equivalent to what an IDE disk with write caching
> > does, and it would provide exactly the guarantees for safe storage to
> > the real physical medium that a journalling filesystem or database in
> > the guest requires.
>
> Except that in an enterprise environment, you typically have battery
> backed disk cache. It really doesn't matter though b/c in QEMU today,
> submitting the request blocks until it's completed anyway (which is
> nearly instant anyway since I/O is buffered).

Buffered I/O is less reliable in a sense. With buffered I/O, if the host
crashes, you may lose data that a filesystem or database on the guest
reported as committed to applications. That can result, on those rare
occasions, in guest journalled filesystem corruption (something that
should be impossible), and in database corruption or durability failure.

With direct I/O and write cache emulation (as described), when a guest
journalling filesystem or database reports data as committed, it has
much the same commitment/durability guarantee that the same applications
would have running on the host. Namely, the data has reached the disk,
and the disk has reported it as committed.

This may matter if you want to run those sorts of applications in a
guest, which clearly people often do, especially with KVM or Xen.

Anecdote: this is already a problem in some environments. I have a
rented virtual machine; it's running UML. The UML disk uses O_SYNC
writes (nowadays), because buffered host writes resulted in occasional
guest data loss and journalled filesystem corruption. Unfortunately,
this is a performance slowdown, but it's better than occasional
corruption. I imagine similar things apply to Qemu machines
occasionally.

-- Jamie
Re: [Qemu-devel] [PATCH 2/2 v2] Direct IDE I/O
Jamie Lokier wrote:
> Paul Brook wrote:
>> On Monday 03 December 2007, Samuel Thibault wrote:
>>> Anthony Liguori, on Mon 03 Dec 2007 09:54:47 -0600, wrote:
>>>> Have you done any performance testing? Buffered IO should absolutely
>>>> beat direct IO simply because buffered IO allows writes to complete
>>>> before they actually hit disk.
>>> Since qemu can use the aio interface, that shouldn't matter.
>> Only if the emulated hardware and guest OS support multiple concurrent
>> commands. IDE supports async operation, but not concurrent commands.
>> In practice this means you only get full performance if you're using
>> the SCSI emulation.
> With the IDE emulation, when the emulated "disk write cache" flag is on
> it may be reasonable to report a write as completed when the AIO is
> dispatched, without waiting for the AIO to complete.
>
> An IDE flush cache command would wait for all outstanding write AIOs to
> complete, and then issue a flush cache (fdatasync) to the real device
> before reporting it has completed.
>
> That's roughly equivalent to what an IDE disk with write caching does,
> and it would provide exactly the guarantees for safe storage to the
> real physical medium that a journalling filesystem or database in the
> guest requires.
>
> If a guest doesn't use journalling with IDE write cache safely (e.g.
> 2.4 Linux and earlier), it can simply turn off the IDE "disk write
> cache" flag, which is what it has to do on a real physical disk too.
>
> Terminating the qemu process abruptly might cancel some AIOs, but even
> that is ok, as it's equivalent to pulling the power on a real disk with
> uncommitted cached writes.

Except that in an enterprise environment, you typically have battery
backed disk cache. It really doesn't matter though b/c in QEMU today,
submitting the request blocks until it's completed anyway (which is
nearly instant anyway since I/O is buffered).

Regards,

Anthony Liguori
Re: [Qemu-devel] [PATCH 2/2 v2] Direct IDE I/O
Samuel Thibault wrote:
> Anthony Liguori, on Mon 03 Dec 2007 09:54:47 -0600, wrote:
>> Have you done any performance testing? Buffered IO should absolutely
>> beat direct IO simply because buffered IO allows writes to complete
>> before they actually hit disk.
>
> Since qemu can use the aio interface, that shouldn't matter.

Well, let's separate a few things. QEMU uses posix-aio which uses
threads and normal read/write operations. It also limits the number of
threads that aio uses to 1 which effectively makes everything
synchronous anyway.

But it still doesn't matter. When you issue a write() on an O_DIRECT fd,
the write does not complete until the data has made its way to disk. The
guest can still run if you're using O_NONBLOCK but the IDE device will
not submit another IO request until you complete the DMA operation.

The SCSI device supports multiple outstanding operations, but it's
limited to 16, and you'll never see more than one request at a time in
QEMU currently because of the limitation to a single thread.

Regards,

Anthony Liguori
Re: [Qemu-devel] [PATCH 2/2 v2] Direct IDE I/O
Paul Brook wrote:
> On Monday 03 December 2007, Samuel Thibault wrote:
> > Anthony Liguori, on Mon 03 Dec 2007 09:54:47 -0600, wrote:
> > > Have you done any performance testing? Buffered IO should
> > > absolutely beat direct IO simply because buffered IO allows writes
> > > to complete before they actually hit disk.
> >
> > Since qemu can use the aio interface, that shouldn't matter.
>
> Only if the emulated hardware and guest OS support multiple concurrent
> commands. IDE supports async operation, but not concurrent commands. In
> practice this means you only get full performance if you're using the
> SCSI emulation.

With the IDE emulation, when the emulated "disk write cache" flag is on
it may be reasonable to report a write as completed when the AIO is
dispatched, without waiting for the AIO to complete.

An IDE flush cache command would wait for all outstanding write AIOs to
complete, and then issue a flush cache (fdatasync) to the real device
before reporting it has completed.

That's roughly equivalent to what an IDE disk with write caching does,
and it would provide exactly the guarantees for safe storage to the real
physical medium that a journalling filesystem or database in the guest
requires.

If a guest doesn't use journalling with IDE write cache safely (e.g. 2.4
Linux and earlier), it can simply turn off the IDE "disk write cache"
flag, which is what it has to do on a real physical disk too.

Terminating the qemu process abruptly might cancel some AIOs, but even
that is ok, as it's equivalent to pulling the power on a real disk with
uncommitted cached writes.

-- Jamie
Re: [Qemu-devel] [PATCH 2/2 v2] Direct IDE I/O
On Monday 03 December 2007, Samuel Thibault wrote:
> Anthony Liguori, on Mon 03 Dec 2007 09:54:47 -0600, wrote:
> > Have you done any performance testing? Buffered IO should absolutely
> > beat direct IO simply because buffered IO allows writes to complete
> > before they actually hit disk.
>
> Since qemu can use the aio interface, that shouldn't matter.

Only if the emulated hardware and guest OS support multiple concurrent
commands. IDE supports async operation, but not concurrent commands. In
practice this means you only get full performance if you're using the
SCSI emulation.

Paul
Re: [Qemu-devel] [PATCH 2/2 v2] Direct IDE I/O
Anthony Liguori, on Mon 03 Dec 2007 09:54:47 -0600, wrote:
> Have you done any performance testing? Buffered IO should absolutely
> beat direct IO simply because buffered IO allows writes to complete
> before they actually hit disk.

Since qemu can use the aio interface, that shouldn't matter.

Samuel
Re: [Qemu-devel] [PATCH 2/2 v2] Direct IDE I/O
Laurent Vivier wrote:
> On Monday 03 December 2007 at 11:23 +0100, Fabrice Bellard wrote:
>> Laurent Vivier wrote:
>>> This patch enhances the "-drive ,cache=off" mode with IDE drive
>>> emulation by removing the buffer used in the IDE emulation.
>>
>> What's the use of keeping the buffered case?
>
> Well, I don't like to remove code written by others...
> and I don't want to break something.
>
> But if you think I should remove the buffered case, I can.
>
> BTW, do you think I should enable "cache=off" by default?
> Or even remove the option from the command line and always use
> O_DIRECT?

Hi Laurent,

Have you done any performance testing? Buffered IO should absolutely
beat direct IO simply because buffered IO allows writes to complete
before they actually hit disk. I've observed this myself. Plus the host
typically has a much larger page cache than the guest, so the second
level of caching helps an awful lot.

Regards,

Anthony Liguori
Re: [Qemu-devel] [PATCH 2/2 v2] Direct IDE I/O
On Monday 03 December 2007, Markus Hitter wrote:
> On 03.12.2007 at 11:30, Laurent Vivier wrote:
> > But if you think I should remove the buffered case, I can.
>
> If in doubt, less code is always better. For the unlikely case you
> broke something badly, there's always the option to take back the
> patch.
>
> > BTW, do you think I should enable "cache=off" by default?
>
> This would be fine for a transition phase, but likely, the cache=on
> case gets forgotten to be removed later. So, do it now.

I think host caching is still useful enough to be enabled by default,
and it provides a significant performance increase in several cases:

- The guest typically has a relatively small quantity of RAM, compared
  to a modern machine. Allowing the host OS to act as a demand-based L2
  cache allows this to be used without having to dedicate excessive
  quantities of RAM to qemu.
- I've seen reports that it significantly speeds up the Windows
  installer.
- Host cache is persistent between multiple qemu runs. If you're doing
  anything that requires frequent guest reboots (e.g. kernel debugging)
  this is going to be a huge win.
- You're running a guest OS that has limited or no caching (e.g. DOS).

I'd hope that the host OS would have cache use heuristics that would
help limit cache pollution.

Paul
Re: [Qemu-devel] [PATCH 2/2 v2] Direct IDE I/O
On 03.12.2007 at 11:30, Laurent Vivier wrote:
> But if you think I should remove the buffered case, I can.

If in doubt, less code is always better. For the unlikely case you broke
something badly, there's always the option to take back the patch.

> BTW, do you think I should enable "cache=off" by default?

This would be fine for a transition phase, but likely the cache=on case
would be forgotten and never removed later. So, do it now.

my $0.02,
Markus

- - - - - - - - - - - - - - - - - - -
Dipl. Ing. Markus Hitter
http://www.jump-ing.de/
Re: [Qemu-devel] [PATCH 2/2 v2] Direct IDE I/O
Hi,

On Mon, 3 Dec 2007, Fabrice Bellard wrote:
> Laurent Vivier wrote:
> > This patch enhances the "-drive ,cache=off" mode with IDE drive
> > emulation by removing the buffer used in the IDE emulation.
>
> What's the use of keeping the buffered case?

AFAICT if your guest is DOS without a disk caching driver, you do not
really want to use O_DIRECT.

Ciao,
Dscho
Re: [Qemu-devel] [PATCH 2/2 v2] Direct IDE I/O
On Monday 03 December 2007 at 11:23 +0100, Fabrice Bellard wrote:
> Laurent Vivier wrote:
> > This patch enhances the "-drive ,cache=off" mode with IDE drive
> > emulation by removing the buffer used in the IDE emulation.
>
> What's the use of keeping the buffered case?

Well, I don't like to remove code written by others...
and I don't want to break something.

But if you think I should remove the buffered case, I can.

BTW, do you think I should enable "cache=off" by default?
Or even remove the option from the command line and always use O_DIRECT?

Regards,
Laurent
--
- [EMAIL PROTECTED] --
"Any sufficiently advanced technology is indistinguishable from magic."
- Arthur C. Clarke
Re: [Qemu-devel] [PATCH 2/2 v2] Direct IDE I/O
Laurent Vivier wrote:
> This patch enhances the "-drive ,cache=off" mode with IDE drive
> emulation by removing the buffer used in the IDE emulation.

What's the use of keeping the buffered case?

Fabrice.
[Qemu-devel] [PATCH 2/2 v2] Direct IDE I/O
This patch enhances the "-drive ,cache=off" mode with IDE drive
emulation by removing the buffer used in the IDE emulation.
---
 block.c     |   10 +++
 block.h     |    2
 block_int.h |    1
 cpu-all.h   |    1
 exec.c      |   19 ++
 hw/ide.c    |  176 +---
 vl.c        |    1
 7 files changed, 204 insertions(+), 6 deletions(-)

Index: qemu/block.c
===
--- qemu.orig/block.c	2007-12-03 09:54:47.0 +0100
+++ qemu/block.c	2007-12-03 09:54:53.0 +0100
@@ -758,6 +758,11 @@ void bdrv_set_translation_hint(BlockDriv
     bs->translation = translation;
 }
 
+void bdrv_set_cache_hint(BlockDriverState *bs, int cache)
+{
+    bs->cache = cache;
+}
+
 void bdrv_get_geometry_hint(BlockDriverState *bs,
                             int *pcyls, int *pheads, int *psecs)
 {
@@ -786,6 +791,11 @@ int bdrv_is_read_only(BlockDriverState *
     return bs->read_only;
 }
 
+int bdrv_is_cached(BlockDriverState *bs)
+{
+    return bs->cache;
+}
+
 /* XXX: no longer used */
 void bdrv_set_change_cb(BlockDriverState *bs,
                         void (*change_cb)(void *opaque), void *opaque)
Index: qemu/block.h
===
--- qemu.orig/block.h	2007-12-03 09:54:47.0 +0100
+++ qemu/block.h	2007-12-03 09:54:53.0 +0100
@@ -113,6 +113,7 @@ void bdrv_set_geometry_hint(BlockDriverS
                             int cyls, int heads, int secs);
 void bdrv_set_type_hint(BlockDriverState *bs, int type);
 void bdrv_set_translation_hint(BlockDriverState *bs, int translation);
+void bdrv_set_cache_hint(BlockDriverState *bs, int cache);
 void bdrv_get_geometry_hint(BlockDriverState *bs,
                             int *pcyls, int *pheads, int *psecs);
 int bdrv_get_type_hint(BlockDriverState *bs);
@@ -120,6 +121,7 @@ int bdrv_get_translation_hint(BlockDrive
 int bdrv_is_removable(BlockDriverState *bs);
 int bdrv_is_read_only(BlockDriverState *bs);
 int bdrv_is_inserted(BlockDriverState *bs);
+int bdrv_is_cached(BlockDriverState *bs);
 int bdrv_media_changed(BlockDriverState *bs);
 int bdrv_is_locked(BlockDriverState *bs);
 void bdrv_set_locked(BlockDriverState *bs, int locked);
Index: qemu/block_int.h
===
--- qemu.orig/block_int.h	2007-12-03 09:53:30.0 +0100
+++ qemu/block_int.h	2007-12-03 09:54:53.0 +0100
@@ -124,6 +124,7 @@ struct BlockDriverState {
        drivers. They are not used by the block driver */
     int cyls, heads, secs, translation;
     int type;
+    int cache;
     char device_name[32];
     BlockDriverState *next;
 };
Index: qemu/vl.c
===
--- qemu.orig/vl.c	2007-12-03 09:54:47.0 +0100
+++ qemu/vl.c	2007-12-03 09:54:53.0 +0100
@@ -5112,6 +5112,7 @@ static int drive_init(const char *str, i
         bdrv_flags |= BDRV_O_SNAPSHOT;
     if (!cache)
         bdrv_flags |= BDRV_O_DIRECT;
+    bdrv_set_cache_hint(bdrv, cache);
     if (bdrv_open(bdrv, file, bdrv_flags) < 0 ||
         qemu_key_check(bdrv, file)) {
         fprintf(stderr, "qemu: could not open disk image %s\n", file);
Index: qemu/hw/ide.c
===
--- qemu.orig/hw/ide.c	2007-12-03 09:54:47.0 +0100
+++ qemu/hw/ide.c	2007-12-03 09:54:53.0 +0100
@@ -816,7 +816,7 @@ static int dma_buf_rw(BMDMAState *bm, in
 }
 
 /* XXX: handle errors */
-static void ide_read_dma_cb(void *opaque, int ret)
+static void ide_read_dma_cb_buffered(void *opaque, int ret)
 {
     BMDMAState *bm = opaque;
     IDEState *s = bm->ide_if;
@@ -856,7 +856,86 @@ static void ide_read_dma_cb(void *opaque
     printf("aio_read: sector_num=%lld n=%d\n", sector_num, n);
 #endif
     bm->aiocb = bdrv_aio_read(s->bs, sector_num, s->io_buffer, n,
-                              ide_read_dma_cb, bm);
+                              ide_read_dma_cb_buffered, bm);
+}
+
+static void ide_read_dma_cb_unbuffered(void *opaque, int ret)
+{
+    BMDMAState *bm = opaque;
+    IDEState *s = bm->ide_if;
+    int64_t sector_num;
+    int nsector;
+    int len;
+    uint8_t *phy_addr;
+
+    if (s->nsector == 0) {
+        s->status = READY_STAT | SEEK_STAT;
+        ide_set_irq(s);
+    eot:
+        bm->status &= ~BM_STATUS_DMAING;
+        bm->status |= BM_STATUS_INT;
+        bm->dma_cb = NULL;
+        bm->ide_if = NULL;
+        bm->aiocb = NULL;
+        return;
+    }
+
+    /* launch next transfer */
+
+    if (bm->cur_prd_len == 0) {
+        struct {
+            uint32_t addr;
+            uint32_t size;
+        } prd;
+
+        cpu_physical_memory_read(bm->cur_addr, (uint8_t *)&prd, 8);
+
+        bm->cur_addr += 8;
+        prd.addr = le32_to_cpu(prd.addr);
+        prd.size = le32_to_cpu(prd.size);
+        len = prd.size