date:20100315

Re: [Qemu-devel] [PATCH, RFC] Replace assert(0) with abort() or cpu_abort()

2010-03-15 Thread Markus Armbruster

Blue Swirl blauwir...@gmail.com writes:

 When building with -DNDEBUG, assert(0) will not stop execution
 so it must not be used for abnormal termination.

For each case: are you sure the code does not recover after assert(0)?
Not saying it does, just asking whether you checked.

 Use cpu_abort() when in CPU context, abort() otherwise.

I sympathize with the general idea, but I don't like dead code after
abort().  What about cleaning that up?

Re: [Qemu-devel] [PATCH 0/6] Multi-level page tables and userland mapping fixes, v3

2010-03-15 Thread Riku Voipio

On Sat, Mar 13, 2010 at 03:00:52PM +0100, Martin Mohring wrote:
 I am sorry to bring us all down to earth, we all had wished that this stuff 
 gets finally fixed, but it seems that those patches applied to QEMU master 
 have killed QEMU user mode.
 
sid and maemo userlands work for me like before (64bit and 32bit hosts).

 I am referring to the user mode fixes after commit 
 0aef4261ac0ec9089ade0e3a92f986cb4ba7317e
 
 I had QEMU working on PowerPC and ARM in chroots from the Linux versions:
 
 - Fedora 11 / 12
 - Ubuntu 9.04, 9.10, 10.04
 - Debian 4, 5 and Sid
 
 for ARM and PowerPC. My host OS is openSUSE 11.2 using a 2.6.31 kernel, or 
 openSUSE 11.1. All machines are 64 Bit machines.
 
 Now I have the situation that all the listed target chroots fail, except:
 - Fedora 11 / 12 @ ARM

Please try to be more specific. What is killed? what do you mean by 
chroots fail. qemu segfault? some binary doesn't work that before did?
some syscall doesn't work anymore?

Re: [Qemu-devel] -fda fat:dir -snapshot

2010-03-15 Thread Kevin Wolf

Am 13.03.2010 20:18, schrieb Michael Tokarev:
 Apparently this does not work, and for a lng time:
 
  $ kvm -fda fat:dir
  [ it opens the sdl window ]
  $ kvm -fda fat:dir -snapshot
  qemu: could not open disk image fat:dir: No such file or directory
 
 Is it supposed to work?

Wow, that's a crazy case. I guess nobody has ever tested this, and
indeed it looks like it never has worked. As you might know, -snapshot
internally creates a temporary qcow2 image which takes the image you
originally asked for as a backing file.

So we have a qcow2 file with a backing file tmp:dir. Now, backing file
paths are always resolved relative to the COW file, so we get
/tmp/fat:dir. Oops.

We could just disable this for protocols as a quick fix, but I think in
fact you do want to have this behaviour when using protocols as a
backing file for a persistent COW image. I guess this needs some more
thought, especially in respect to the discussions of making
file/host_device/... protocols.

If you really have a use case for this, you can use an absolute path
after fat: as a workaround, it won't touch the path then.

Kevin

[Qemu-devel] [PATCH] linux-user: use arm features based on cpu model

2010-03-15 Thread Lars Munch

Use arm features based on cpu model. The hardcoded feature list gave
problems in the setjmp/longjmp functions of glibc since it tried to use
VFP instructions even though I specified a pxa270 as cpu model.

Signed-off-by: Lars Munch l...@segv.dk
---
 linux-user/elfload.c |   10 ++
 1 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/linux-user/elfload.c b/linux-user/elfload.c
index 91eea62..79af51d 100644
--- a/linux-user/elfload.c
+++ b/linux-user/elfload.c
@@ -333,10 +333,12 @@ enum
   ARM_HWCAP_ARM_VFPv3D16  = 1  13,
 };
 
-#define ELF_HWCAP (ARM_HWCAP_ARM_SWP | ARM_HWCAP_ARM_HALF  \
-| ARM_HWCAP_ARM_THUMB | ARM_HWCAP_ARM_FAST_MULT \
-| ARM_HWCAP_ARM_FPA | ARM_HWCAP_ARM_VFP \
-| ARM_HWCAP_ARM_NEON | ARM_HWCAP_ARM_VFPv3 )
+#define ELF_HWCAP get_elf_hwcap()
+
+static uint32_t get_elf_hwcap(void)
+{
+return thread_env-features;
+}
 
 #endif
 
-- 
1.7.0

Re: [Qemu-devel] Ideas wiki for GSoC 2010

2010-03-15 Thread Avi Kivity


On 03/10/2010 11:30 PM, Luiz Capitulino wrote:

  Hi there,

  Our wiki page for the Summer of Code 2010 is doing quite well:

http://wiki.qemu.org/Google_Summer_of_Code_2010
   


I will add another project - iommu emulation.  Could be very useful for 
doing device assignment to nested guests, which could make testing a lot 
easier.


--
error compiling committee.c: too many arguments to function

Re: [Qemu-devel] [PATCH] linux-user: use arm features based on cpu model

2010-03-15 Thread Paul Brook

 +static uint32_t get_elf_hwcap(void)
 +{
 +return thread_env-features;
 +}

No.  These values are not the same.

Paul

[Qemu-devel] Re: [PATCH v2 1/2] read-only: minor cleanup

2010-03-15 Thread Paolo Bonzini


On 03/14/2010 02:19 PM, Naphtali Sprei wrote:

Really use read-only flags for opening the file when asked for read-only

Signed-off-by: Naphtali Spreinsp...@redhat.com
---
  qemu-nbd.c |1 +
  1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/qemu-nbd.c b/qemu-nbd.c
index eac0c21..a393583 100644
--- a/qemu-nbd.c
+++ b/qemu-nbd.c
@@ -258,6 +258,7 @@ int main(int argc, char **argv)
  break;
  case 'r':
  readonly = true;
+flags= ~BDRV_O_RDWR;
  break;
  case 'P':
  partition = strtol(optarg,end, 0);


Acked-by: Paolo Bonzini pbonz...@redhat.com

Paolo

[Qemu-devel] [PATCH] linux-user: fix running programs with iwmmxt instructions

2010-03-15 Thread Lars Munch


When using linux-user for emulating an pxa270 we cannot generate an illegal
instruction trap to the kernel to save/load the iwmmxt registers.

Signed-off-by: Lars Munch l...@segv.dk
---
 target-arm/translate.c |2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/target-arm/translate.c b/target-arm/translate.c
index cdfe946..6b621b1 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -5802,9 +5802,11 @@ static int disas_coproc_insn(CPUState * env, 
DisasContext *s, uint32_t insn)
 int cpnum;
 
 cpnum = (insn  8)  0xf;
+#if !defined(CONFIG_USER_ONLY)
 if (arm_feature(env, ARM_FEATURE_XSCALE)
 ((env-cp15.c15_cpar ^ 0x3fff)  (1  cpnum)))
return 1;
+#endif
 
 switch (cpnum) {
   case 0:
-- 
1.7.0

Re: [Qemu-devel] Ideas wiki for GSoC 2010

2010-03-15 Thread Avi Kivity


On 03/15/2010 02:38 PM, Joerg Roedel wrote:

On Mon, Mar 15, 2010 at 02:25:41PM +0200, Avi Kivity wrote:
   

On 03/10/2010 11:30 PM, Luiz Capitulino wrote:
 

   Hi there,

   Our wiki page for the Summer of Code 2010 is doing quite well:

http://wiki.qemu.org/Google_Summer_of_Code_2010

   

I will add another project - iommu emulation.  Could be very useful for
doing device assignment to nested guests, which could make testing a lot
easier.
 

Good idea. If there is interest I could help to mentor this project.
   


Thanks.  I volunteered Anthony, but he may be a little overcommitted.

--
error compiling committee.c: too many arguments to function

Re: [Qemu-devel] [PATCH 0/6] Multi-level page tables and userland mapping fixes, v3

2010-03-15 Thread Jan-Simon Möller

  I had QEMU working on PowerPC and ARM in chroots from the Linux versions:
 
  - Fedora 11 / 12
  - Ubuntu 9.04, 9.10, 10.04
  - Debian 4, 5 and Sid
 
  for ARM and PowerPC. My host OS is openSUSE 11.2 using a 2.6.31 kernel,
  or openSUSE 11.1. All machines are 64 Bit machines.
 
  Now I have the situation that all the listed target chroots fail, except:
  - Fedora 11 / 12 @ ARM
 
 Please try to be more specific. What is killed? what do you mean by
 chroots fail. qemu segfault? some binary doesn't work that before did?
 some syscall doesn't work anymore?
 

We're still investigating the topic. So far, we tracked one failure down to
ldconfig.real which is a static arm binary executed after the chroot got the 
basic packages installed. If we setup a chroot with an older qemu-arm and 
exchange it afterwards with the new qemu and rerun just the build step, it 
works. Thus it seems to be an issue with static arm binaries atm.

Strace of a call of ldconfig.real with the qemu-arm failing:

http://pastie.org/870189

Sort version:

r...@frodo:/# qemu-arm -strace /sbin/ldconfig.real  
16359 uname(0x403fef78) = 0 
16359 brk(NULL) = 0x000a9000
16359 brk(0x000a9d08) = 0x000a9d08  
16359 open(/dev/urandom,O_RDONLY) = 3 
16359 read(3,0x403ff27d,3) = 3  
16359 close(3) = 0  
[...]
16359 stat64(/usr/lib/libgettextlib.so,0x403fdf28) = 0
16359 stat64(/usr/lib/libgettextpo.so.0,0x403fdec0) = 0
16359 stat64(/usr/lib/libgettextpo.so.0.4.0,0x403fdf28) = 0
16359 stat64(/usr/lib/libpython2.6.so.1.0,0x403fdec0) = 0
16359 stat64(/usr/lib/libpython2.6.so.1.0,0x403fdf28) = 0
16359 open(/etc/ld.so.cache~,O_WRONLY|O_CREAT|O_NOFOLLOW|O_TRUNC,0600) = 3
16359 write(3,0xb03d0,1288) = 1288
16359 write(3,0x403ff0a0,0) = -1 errno=14 (Bad address)
16359 write(2,0x403fca08,21)/sbin/ldconfig.real:  = 21
16359 write(2,0x403fc9e8,28)Writing of cache data failed = 28
16359 write(2,0x403fc5b8,13): Bad address = 13
16359 write(2,0x403fc9c0,1)
 = 1
16359 exit_group(1)

Best,
Jan-Simon

Re: [Qemu-devel] [PATCH] linux-user: fix running programs with iwmmxt instructions

2010-03-15 Thread Paul Brook

 +#if !defined(CONFIG_USER_ONLY)
  if (arm_feature(env, ARM_FEATURE_XSCALE)
((env-cp15.c15_cpar ^ 0x3fff)  (1  cpnum)))
   return 1;
 +#endif

This is almost certainly the wrong way to fix this.

Paul

Re: [Qemu-devel] Ideas wiki for GSoC 2010

2010-03-15 Thread Avi Kivity


On 03/15/2010 03:03 PM, Joerg Roedel wrote:



I will add another project - iommu emulation.  Could be very useful
for doing device assignment to nested guests, which could make
testing a lot easier.
   

Our experiments show that nested device assignment is pretty much
required for I/O performance in nested scenarios.
 

Really? I did a small test with virtio-blk in a nested guest (disk read
with dd, so not a real benchmark) and got a reasonable read-performance
of around 25MB/s from the disk in the l2-guest.

   


Your guest wasn't doing a zillion VMREADs and VMWRITEs every exit.

I plan to reduce VMREAD/VMWRITE overhead for kvm, but not much we can do 
for other guests.


--
error compiling committee.c: too many arguments to function

Re: [Qemu-devel] Ideas wiki for GSoC 2010

2010-03-15 Thread Anthony Liguori


On 03/15/2010 07:42 AM, Avi Kivity wrote:

On 03/15/2010 02:38 PM, Joerg Roedel wrote:

On Mon, Mar 15, 2010 at 02:25:41PM +0200, Avi Kivity wrote:

On 03/10/2010 11:30 PM, Luiz Capitulino wrote:

   Hi there,

   Our wiki page for the Summer of Code 2010 is doing quite well:

http://wiki.qemu.org/Google_Summer_of_Code_2010


I will add another project - iommu emulation.  Could be very useful for
doing device assignment to nested guests, which could make testing a 
lot

easier.

Good idea. If there is interest I could help to mentor this project.


Thanks.  I volunteered Anthony, but he may be a little overcommitted.


Joerg, feel free to put your name against too.

Regards,

Anthony Liguori

Re: [Qemu-devel] Ideas wiki for GSoC 2010

2010-03-15 Thread Anthony Liguori


On 03/15/2010 08:11 AM, Avi Kivity wrote:

On 03/15/2010 03:03 PM, Joerg Roedel wrote:



I will add another project - iommu emulation.  Could be very useful
for doing device assignment to nested guests, which could make
testing a lot easier.

Our experiments show that nested device assignment is pretty much
required for I/O performance in nested scenarios.

Really? I did a small test with virtio-blk in a nested guest (disk read
with dd, so not a real benchmark) and got a reasonable read-performance
of around 25MB/s from the disk in the l2-guest.



Your guest wasn't doing a zillion VMREADs and VMWRITEs every exit.

I plan to reduce VMREAD/VMWRITE overhead for kvm, but not much we can 
do for other guests.


VMREAD/VMWRITEs are generally optimized by hypervisors as they tend to 
be costly.  KVM is a bit unusual in terms of how many times the 
instructions are executed per exit.


Regards,

Anthony Liguori

[Qemu-devel] [PATCH] pcnet: make subsystem vendor id match hardware

2010-03-15 Thread Michael S. Tsirkin

Real pcnet device (AT2450) apparently has subsystem
device and vendor id set to 0, this is out of spec
(which requires that vendor id is obtained from PCI SIG)
but windows xp driver seems to need this in order
to associate.

qemu sets pci subsystem id to qumranet/qemu
since d350d97d196a632b6c7493acf07a061017fc6f7d,
debian does not yet have this patch.

https://bugzilla.redhat.com/show_bug.cgi?id=521247

Signed-off-by: Michael S. Tsirkin m...@redhat.com
Cc: Gerd Hoffmann kra...@redhat.com
Cc: Anthony Liguori aligu...@us.ibm.com
---
 hw/pcnet.c |3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/hw/pcnet.c b/hw/pcnet.c
index 44b5b31..12260be 100644
--- a/hw/pcnet.c
+++ b/hw/pcnet.c
@@ -1997,6 +1997,9 @@ static int pci_pcnet_init(PCIDevice *pci_dev)
 pci_set_long(pci_conf + PCI_BASE_ADDRESS_0 + 4,
  PCI_BASE_ADDRESS_SPACE_MEMORY);
 
+pci_set_word(pci_conf + PCI_SUBSYSTEM_VENDOR_ID, 0x0);
+pci_set_word(pci_conf + PCI_SUBSYSTEM_ID, 0x0);
+
 /* TODO: value must be 0 at RST# */
 pci_conf[PCI_INTERRUPT_PIN] = 1; // interrupt pin 0
 pci_conf[PCI_MIN_GNT] = 0x06;
-- 
1.7.0.18.g0d53a5

Re: [Qemu-devel] Ideas wiki for GSoC 2010

2010-03-15 Thread Anthony Liguori


On 03/15/2010 08:24 AM, Joerg Roedel wrote:

On Mon, Mar 15, 2010 at 03:11:42PM +0200, Avi Kivity wrote:
   

On 03/15/2010 03:03 PM, Joerg Roedel wrote:
 
   

I will add another project - iommu emulation.  Could be very useful
for doing device assignment to nested guests, which could make
testing a lot easier.

   

Our experiments show that nested device assignment is pretty much
required for I/O performance in nested scenarios.

 

Really? I did a small test with virtio-blk in a nested guest (disk read
with dd, so not a real benchmark) and got a reasonable read-performance
of around 25MB/s from the disk in the l2-guest.


   

Your guest wasn't doing a zillion VMREADs and VMWRITEs every exit.

I plan to reduce VMREAD/VMWRITE overhead for kvm, but not much we can do
for other guests.
 

Does it matter for the ept-on-ept case? The initial patchset of
nested-vmx implemented it and they reported a performance drop of around
12% between levels which is reasonable. So I expected the loss of
io-performance for l2 also reasonable in this case. My small measurement
was also done using npt-on-npt.
   


But that was something like kernbench IIRC which is actually exit light 
once ept is enabled.


Network IO is typically exit heavy and becomes something more of a 
pathological work load (both for nested ept and nested npt).


Regards,

Anthony Liguori


Joerg

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [Qemu-devel] -fda fat:dir -snapshot

2010-03-15 Thread Michael Tokarev

Kevin Wolf wrote:
 Am 13.03.2010 20:18, schrieb Michael Tokarev:
 Apparently this does not work, and for a lng time:

  $ kvm -fda fat:dir
  [ it opens the sdl window ]
  $ kvm -fda fat:dir -snapshot
  qemu: could not open disk image fat:dir: No such file or directory

 Is it supposed to work?
 
 Wow, that's a crazy case. I guess nobody has ever tested this, and
 indeed it looks like it never has worked. As you might know, -snapshot
 internally creates a temporary qcow2 image which takes the image you
 originally asked for as a backing file.

Yeah, I looked at strace output at least.

[]
 We could just disable this for protocols as a quick fix, but I think in
 fact you do want to have this behaviour when using protocols as a
 backing file for a persistent COW image. I guess this needs some more
 thought, especially in respect to the discussions of making
 file/host_device/... protocols.
 
 If you really have a use case for this, you can use an absolute path
 after fat: as a workaround, it won't touch the path then.

Initially it was a bugreport against kvm in Debian - see
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=504049
Virtual FAT drive doesn't work with -snapshot, I'm not
really sure what use case there is.  But for me, I already
had a more-or-less valid use case - trying a new driver in
windows, supplying it in a virtual floppy (hence fat:dir)
and checking if it works (hence -snapshot).  I actually
tried to do something in the past, but didn't know the fat:
thing (apparently it is not mentioned in the manpage).

But the whole thing indeed looks somewhat.. messy. ;)

Thanks!

/mjt

Re: [Qemu-devel] [PATCH 0/6] Multi-level page tables and userland mapping fixes, v3

2010-03-15 Thread Riku Voipio

On Mon, Mar 15, 2010 at 01:46:10PM +0100, Jan-Simon Möller wrote:
 We're still investigating the topic. So far, we tracked one failure down to
 ldconfig.real which is a static arm binary executed after the chroot got the 
 basic packages installed. If we setup a chroot with an older qemu-arm and 
 exchange it afterwards with the new qemu and rerun just the build step, it 
 works. Thus it seems to be an issue with static arm binaries atm.
 
 Strace of a call of ldconfig.real with the qemu-arm failing:
 
 http://pastie.org/870189
 
 Sort version:
 
 r...@frodo:/# qemu-arm -strace /sbin/ldconfig.real  
 16359 uname(0x403fef78) = 0 
 16359 brk(NULL) = 0x000a9000
 16359 brk(0x000a9d08) = 0x000a9d08  
 16359 open(/dev/urandom,O_RDONLY) = 3 
 16359 read(3,0x403ff27d,3) = 3  
 16359 close(3) = 0  
 [...]
 16359 stat64(/usr/lib/libgettextlib.so,0x403fdf28) = 0
 16359 stat64(/usr/lib/libgettextpo.so.0,0x403fdec0) = 0
 16359 stat64(/usr/lib/libgettextpo.so.0.4.0,0x403fdf28) = 0
 16359 stat64(/usr/lib/libpython2.6.so.1.0,0x403fdec0) = 0
 16359 stat64(/usr/lib/libpython2.6.so.1.0,0x403fdf28) = 0
 16359 open(/etc/ld.so.cache~,O_WRONLY|O_CREAT|O_NOFOLLOW|O_TRUNC,0600) = 3
 16359 write(3,0xb03d0,1288) = 1288
 16359 write(3,0x403ff0a0,0) = -1 errno=14 (Bad address)

A zero sized write. According to manpage ok. 

In qemu we do a lock_user to to get the string to write. Richards change changes
the access checks the get called by lock_user:

page_check_range:

-if (start + len  start)
-/* we've wrapped around */
...
+if (start + len - 1  start) {
+/* We've wrapped around.  */

This now blows up with len = 0;

 16359 write(2,0x403fca08,21)/sbin/ldconfig.real:  = 21
 16359 write(2,0x403fc9e8,28)Writing of cache data failed = 28
 16359 write(2,0x403fc5b8,13): Bad address = 13
 16359 write(2,0x403fc9c0,1)
  = 1
 16359 exit_group(1)
 
 Best,
 Jan-Simon

[Qemu-devel] wake-on-lan IPMI implementation; real power-off and -no-shutdown

2010-03-15 Thread François Revol

Hello,
while working on a demonstrator for a green-IT project, to show
scheduled machine shutdown and powering depending on various
conditions, I wondered if I could use QEMU with wake-on-lan
transparently, but it seems it's not implemented at all.

I though I could try to add support for it, and with -S it theorically
should be doable at least for the first boot, but the network packets
do not go much further until the NIC is actually initialized, as most
network layers use qemu_can_send_packet() which returns 0 if the
machine is stopped.
Hacking this function to return 1 seems to push the packet upward, but
I couldn't find a single point where I could check for WOL packets,
different -net subsystems using different code paths.

Also, it seems -no-shutdown doesn't actually stop the emulation as
said in the manual, it actually keeps the vm running (and using cpu),
despite the OS trying to shutdown via ACPI. At least I tested so with
Haiku (and acpi=true in kernel config), which properly exits QEMU
without -no-shutdown.

Ideally this would evolve into supporting IPMI, which would allow
managing VMs exactly like physical servers without concern, appart
launching the actual process first.
cf.
http://en.wikipedia.org/wiki/Intelligent_Platform_Management_Interface
http://openipmi.sourceforge.net/

Anyone interested in this ?
Anyone tried already ?

François.

Re: [Qemu-devel] Ideas wiki for GSoC 2010

2010-03-15 Thread Avi Kivity


On 03/15/2010 03:23 PM, Anthony Liguori wrote:

On 03/15/2010 08:11 AM, Avi Kivity wrote:

On 03/15/2010 03:03 PM, Joerg Roedel wrote:



I will add another project - iommu emulation.  Could be very useful
for doing device assignment to nested guests, which could make
testing a lot easier.

Our experiments show that nested device assignment is pretty much
required for I/O performance in nested scenarios.

Really? I did a small test with virtio-blk in a nested guest (disk read
with dd, so not a real benchmark) and got a reasonable read-performance
of around 25MB/s from the disk in the l2-guest.



Your guest wasn't doing a zillion VMREADs and VMWRITEs every exit.

I plan to reduce VMREAD/VMWRITE overhead for kvm, but not much we can 
do for other guests.


VMREAD/VMWRITEs are generally optimized by hypervisors as they tend to 
be costly.  KVM is a bit unusual in terms of how many times the 
instructions are executed per exit.


Do you know offhand of any unnecessary read/writes?  There's 
update_cr8_intercept(), but on normal exits, I don't see what else we 
can remove.


--
error compiling committee.c: too many arguments to function

Re: [Qemu-devel] [PATCH 0/6] Multi-level page tables and userland mapping fixes, v3

2010-03-15 Thread Jan-Simon Möller

Am Montag, 15. März 2010 15:48:03 schrieb Riku Voipio:
 On Mon, Mar 15, 2010 at 01:46:10PM +0100, Jan-Simon Möller wrote:
  We're still investigating the topic. So far, we tracked one failure down
  to ldconfig.real which is a static arm binary executed after the chroot
  got the basic packages installed. If we setup a chroot with an older
  qemu-arm and exchange it afterwards with the new qemu and rerun just the
  build step, it works. Thus it seems to be an issue with static arm
  binaries atm.
 
  Strace of a call of ldconfig.real with the qemu-arm failing:
 
  http://pastie.org/870189
 
  Sort version:
 
  r...@frodo:/# qemu-arm -strace /sbin/ldconfig.real
  16359 uname(0x403fef78) = 0
  16359 brk(NULL) = 0x000a9000
  16359 brk(0x000a9d08) = 0x000a9d08
  16359 open(/dev/urandom,O_RDONLY) = 3
  16359 read(3,0x403ff27d,3) = 3
  16359 close(3) = 0
  [...]
  16359 stat64(/usr/lib/libgettextlib.so,0x403fdf28) = 0
  16359 stat64(/usr/lib/libgettextpo.so.0,0x403fdec0) = 0
  16359 stat64(/usr/lib/libgettextpo.so.0.4.0,0x403fdf28) = 0
  16359 stat64(/usr/lib/libpython2.6.so.1.0,0x403fdec0) = 0
  16359 stat64(/usr/lib/libpython2.6.so.1.0,0x403fdf28) = 0
  16359 open(/etc/ld.so.cache~,O_WRONLY|O_CREAT|O_NOFOLLOW|O_TRUNC,0600)
  = 3 16359 write(3,0xb03d0,1288) = 1288
  16359 write(3,0x403ff0a0,0) = -1 errno=14 (Bad address)
 
 A zero sized write. According to manpage ok.
 
 In qemu we do a lock_user to to get the string to write. Richards change
  changes the access checks the get called by lock_user:
 
 page_check_range:
 
 -if (start + len  start)
 -/* we've wrapped around */
 ...
 +if (start + len - 1  start) {
 +/* We've wrapped around.  */
 
 This now blows up with len = 0;

Confirmed. A quick test with  if (len  0) around and ldconfig.real runs.

Best,
Jan-Simon

Re: [Qemu-devel] wake-on-lan IPMI implementation; real power-off and -no-shutdown

2010-03-15 Thread Paul Brook

 Ideally this would evolve into supporting IPMI, which would allow
 managing VMs exactly like physical servers without concern, appart
 launching the actual process first.
 cf.
 http://en.wikipedia.org/wiki/Intelligent_Platform_Management_Interface
 http://openipmi.sourceforge.net/
 
 Anyone interested in this ?
 Anyone tried already ?

TBH I don't really see the point.

This definitely seems like something that should be handled by your your 
mangement app (via libvirt, or whatever). i.e. have that implement whatever 
remote protocol you want, and just talk to qemu over the normal monitor 
interface.

Paul

Re: [Qemu-devel] wake-on-lan IPMI implementation; real power-off and -no-shutdown

2010-03-15 Thread François Revol

Le Mon, 15 Mar 2010 15:26:24 +, Paul Brook a écrit :
  Ideally this would evolve into supporting IPMI, which would allow
  managing VMs exactly like physical servers without concern, appart
  launching the actual process first.
  cf.
  http://en.wikipedia.org/wiki/Intelligent_Platform_Management_Interface
http://openipmi.sourceforge.net/
 
  Anyone interested in this ?
  Anyone tried already ?

 TBH I don't really see the point.

 This definitely seems like something that should be handled by your
 your
 mangement app (via libvirt, or whatever). i.e. have that implement
 whatever
 remote protocol you want, and just talk to qemu over the normal
 monitor
 interface.

Hmm for IPMI that might probably indeed be a better way, moreover since
they use a parallel subnet on the primary ethernet card...
Though it would require the implementation to listen to the same
virtual card, which are mapped differently to the host depending on the
VM (-net user, ...).

As for WOL, it would still be handy to have I think... btw, do we
support suspending the emulation via ACPI ? VirtualBox has something
called Pause mode, which I'm not sure actually if it's reflected to
ACPI, which allows to avoid wasting cpu when not usign the guest,
though it requires support from the Guest Additions to avoid clock
drifts.

François.

Re: [Qemu-devel] wake-on-lan IPMI implementation; real power-off and -no-shutdown

2010-03-15 Thread Anthony Liguori


On 03/15/2010 10:37 AM, François Revol wrote:

Le Mon, 15 Mar 2010 15:26:24 +, Paul Brook a écrit :
   

Ideally this would evolve into supporting IPMI, which would allow
managing VMs exactly like physical servers without concern, appart
launching the actual process first.
cf.
http://en.wikipedia.org/wiki/Intelligent_Platform_Management_Interface
   

http://openipmi.sourceforge.net/
   

Anyone interested in this ?
Anyone tried already ?
   

TBH I don't really see the point.

This definitely seems like something that should be handled by your
your
mangement app (via libvirt, or whatever). i.e. have that implement
whatever
remote protocol you want, and just talk to qemu over the normal
monitor
interface.
 

Hmm for IPMI that might probably indeed be a better way, moreover since
they use a parallel subnet on the primary ethernet card...
Though it would require the implementation to listen to the same
virtual card, which are mapped differently to the host depending on the
VM (-net user, ...).

As for WOL, it would still be handy to have I think... btw, do we
support suspending the emulation via ACPI ? VirtualBox has something
called Pause mode, which I'm not sure actually if it's reflected to
ACPI, which allows to avoid wasting cpu when not usign the guest,
though it requires support from the Guest Additions to avoid clock
drifts.
   


It's just stop.

It won't cause time drift in qemu when using pvclock but you will get 
soft lockups in Linux.  I presume you get that with Virtual Box too.


Regards,

Anthony Liguori


François.

Re: [Qemu-devel] wake-on-lan IPMI implementation; real power-off and -no-shutdown

2010-03-15 Thread François Revol

  As for WOL, it would still be handy to have I think... btw, do we
  support suspending the emulation via ACPI ? VirtualBox has
  something
  called Pause mode, which I'm not sure actually if it's reflected
  to
  ACPI, which allows to avoid wasting cpu when not usign the guest,
  though it requires support from the Guest Additions to avoid clock
  drifts.
 

 It's just stop.

 It won't cause time drift in qemu when using pvclock but you will get
 soft lockups in Linux.  I presume you get that with Virtual Box too.

Hmm yeah I didn't try playing with the various clock options, since
Haiku doesn't support them anyway, only TSC, and HPET but it's in
progress.

François.

Re: [Qemu-devel] wake-on-lan IPMI implementation; real power-off and -no-shutdown

2010-03-15 Thread Daniel P. Berrange

On Mon, Mar 15, 2010 at 04:01:27PM +0100, Fran?ois Revol wrote:
 Hello,
 while working on a demonstrator for a green-IT project, to show
 scheduled machine shutdown and powering depending on various
 conditions, I wondered if I could use QEMU with wake-on-lan
 transparently, but it seems it's not implemented at all.
 
 I though I could try to add support for it, and with -S it theorically
 should be doable at least for the first boot, but the network packets
 do not go much further until the NIC is actually initialized, as most
 network layers use qemu_can_send_packet() which returns 0 if the
 machine is stopped.
 Hacking this function to return 1 seems to push the packet upward, but
 I couldn't find a single point where I could check for WOL packets,
 different -net subsystems using different code paths.
 
 Also, it seems -no-shutdown doesn't actually stop the emulation as
 said in the manual, it actually keeps the vm running (and using cpu),
 despite the OS trying to shutdown via ACPI. At least I tested so with
 Haiku (and acpi=true in kernel config), which properly exits QEMU
 without -no-shutdown.

Hmm, I think -no-shutdown should at least stop the CPUs executing. It is
not really useful on its own though. The app managing QEMU would want to
use the new JSON based monitor to listen for the SHUTDOWN event to be
emitted, so it can detect the shutdown completing  then take action it
wants either reset the guest, or kill QEMU, etc

Daniel
-- 
|: Red Hat, Engineering, London-o-   http://people.redhat.com/berrange/ :|
|: http://libvirt.org -o- http://virt-manager.org -o- http://deltacloud.org :|
|: http://autobuild.org-o- http://search.cpan.org/~danberr/ :|
|: GnuPG: 7D3B9505  -o-   F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|

Re: [Qemu-devel] [PATCH] linux-user: fix running programs with iwmmxt instructions

2010-03-15 Thread Lars Munch

On Mon, Mar 15, 2010 at 12:53:25PM +, Paul Brook wrote:
  +#if !defined(CONFIG_USER_ONLY)
   if (arm_feature(env, ARM_FEATURE_XSCALE)
   ((env-cp15.c15_cpar ^ 0x3fff)  (1  cpnum)))
  return 1;
  +#endif
 
 This is almost certainly the wrong way to fix this.
 
 Paul


Here is a different fix. Hopefully I am on the right track this time. This
patch inits the cpu to have CP0 and/or CP1 accessible from the beginning if in
linux-user mode.

Signed-off-by: Lars Munch l...@segv.dk
---
 target-arm/helper.c |6 ++
 1 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/target-arm/helper.c b/target-arm/helper.c
index 1a181ac..3a55da2 100644
--- a/target-arm/helper.c
+++ b/target-arm/helper.c
@@ -165,6 +165,9 @@ static void cpu_reset_model_id(CPUARMState *env, uint32_t 
id)
 /* JTAG_ID is ((id  28) | 0x09265013) */
 env-cp15.c0_cachetype = 0xd172172;
 env-cp15.c1_sys = 0x0078;
+#if defined (CONFIG_USER_ONLY)
+env-cp15.c15_cpar = 0x0001;
+#endif
 break;
 case ARM_CPUID_PXA270_A0:
 case ARM_CPUID_PXA270_A1:
@@ -178,6 +181,9 @@ static void cpu_reset_model_id(CPUARMState *env, uint32_t 
id)
 env-iwmmxt.cregs[ARM_IWMMXT_wCID] = 0x69051000 | 'Q';
 env-cp15.c0_cachetype = 0xd172172;
 env-cp15.c1_sys = 0x0078;
+#if defined (CONFIG_USER_ONLY)
+env-cp15.c15_cpar = 0x0003;
+#endif
 break;
 default:
 cpu_abort(env, Bad CPU ID: %x\n, id);
-- 
1.7.0

Re: [Qemu-devel] [PATCH] linux-user: use arm features based on cpu model

2010-03-15 Thread Lars Munch

On Mon, Mar 15, 2010 at 12:26:27PM +, Paul Brook wrote:
  +static uint32_t get_elf_hwcap(void)
  +{
  +return thread_env-features;
  +}
 
 No.  These values are not the same.
 
 Paul


Yes, these values are indeed not the same. Below is an updated patch with a
function similar to the PPC get_elf_hwcap function. I am unsure if I have too
many or too few features as I do not know the details on all the capability
flags, so comments are more than welcome.

Signed-off-by: Lars Munch l...@segv.dk
---
 linux-user/elfload.c |   24 
 1 files changed, 20 insertions(+), 4 deletions(-)

diff --git a/linux-user/elfload.c b/linux-user/elfload.c
index 91eea62..6364176 100644
--- a/linux-user/elfload.c
+++ b/linux-user/elfload.c
@@ -333,10 +333,26 @@ enum
   ARM_HWCAP_ARM_VFPv3D16  = 1  13,
 };
 
-#define ELF_HWCAP (ARM_HWCAP_ARM_SWP | ARM_HWCAP_ARM_HALF  \
-| ARM_HWCAP_ARM_THUMB | ARM_HWCAP_ARM_FAST_MULT \
-| ARM_HWCAP_ARM_FPA | ARM_HWCAP_ARM_VFP \
-| ARM_HWCAP_ARM_NEON | ARM_HWCAP_ARM_VFPv3 )
+#define ELF_HWCAP get_elf_hwcap()
+
+static uint32_t get_elf_hwcap(void)
+{
+CPUARMState *e = thread_env;
+uint32_t features = ARM_HWCAP_ARM_SWP | ARM_HWCAP_ARM_HALF
+  | ARM_HWCAP_ARM_THUMB | ARM_HWCAP_ARM_FAST_MULT
+  | ARM_HWCAP_ARM_FPA;
+
+#define GET_FEATURE(flag, feature)  \
+do {if (e-features  flag) features |= feature; } while(0)
+GET_FEATURE(ARM_FEATURE_VFP, ARM_HWCAP_ARM_VFP);
+GET_FEATURE(ARM_FEATURE_IWMMXT, ARM_HWCAP_ARM_IWMMXT);
+GET_FEATURE(ARM_FEATURE_NEON, ARM_HWCAP_ARM_NEON);
+GET_FEATURE(ARM_FEATURE_VFP3, ARM_HWCAP_ARM_VFPv3);
+GET_FEATURE(ARM_FEATURE_VFP_FP16, ARM_HWCAP_ARM_VFPv3D16);
+#undef GET_FEATURE
+
+return features;
+}
 
 #endif
 
-- 
1.7.0

Re: [Qemu-devel] wake-on-lan IPMI implementation; real power-off and -no-shutdown

2010-03-15 Thread Anthony Liguori


On 03/15/2010 10:55 AM, Daniel P. Berrange wrote:

On Mon, Mar 15, 2010 at 04:01:27PM +0100, Fran?ois Revol wrote:
   

Hello,
while working on a demonstrator for a green-IT project, to show
scheduled machine shutdown and powering depending on various
conditions, I wondered if I could use QEMU with wake-on-lan
transparently, but it seems it's not implemented at all.

I though I could try to add support for it, and with -S it theorically
should be doable at least for the first boot, but the network packets
do not go much further until the NIC is actually initialized, as most
network layers use qemu_can_send_packet() which returns 0 if the
machine is stopped.
Hacking this function to return 1 seems to push the packet upward, but
I couldn't find a single point where I could check for WOL packets,
different -net subsystems using different code paths.

Also, it seems -no-shutdown doesn't actually stop the emulation as
said in the manual, it actually keeps the vm running (and using cpu),
despite the OS trying to shutdown via ACPI. At least I tested so with
Haiku (and acpi=true in kernel config), which properly exits QEMU
without -no-shutdown.
 

Hmm, I think -no-shutdown should at least stop the CPUs executing. It is
not really useful on its own though. The app managing QEMU would want to
use the new JSON based monitor to listen for the SHUTDOWN event to be
emitted, so it can detect the shutdown completing  then take action it
wants either reset the guest, or kill QEMU, etc
   


The semantics of -no-shutdown are awful.

I'd personally prefer to see the option deprecated and a new set of 
options introduced with clearer semantics.


Currently, -no-shutdown does too many things.  It affects reboot 
behaviour, shutdown behaviour, the behavior of the SDL close button.  
Each of these things should be individual tunables.


Regards,

Anthony Liguori


Daniel

Re: [Qemu-devel] wake-on-lan IPMI implementation; real power-off and -no-shutdown

2010-03-15 Thread François Revol

  Also, it seems -no-shutdown doesn't actually stop the emulation
   as
  said in the manual, it actually keeps the vm running (and using
   cpu),
  despite the OS trying to shutdown via ACPI. At least I tested so
   with
  Haiku (and acpi=true in kernel config), which properly exits QEMU
  without -no-shutdown.
 
  Hmm, I think -no-shutdown should at least stop the CPUs executing.
  It is
  not really useful on its own though. The app managing QEMU would
  want to
  use the new JSON based monitor to listen for the SHUTDOWN event to
  be
  emitted, so it can detect the shutdown completing  then take
  action it
  wants either reset the guest, or kill QEMU, etc
 

 The semantics of -no-shutdown are awful.

 I'd personally prefer to see the option deprecated and a new set of
 options introduced with clearer semantics.

 Currently, -no-shutdown does too many things.  It affects reboot
 behaviour, shutdown behaviour, the behavior of the SDL close button.
 Each of these things should be individual tunables.

Indeed, it both means do not allow user to close the vm and do not
allow the guest to stop (or reboot ?) it seems...

François.

[Qemu-devel] [RFC][PATCH 0/7] blkdebug

2010-03-15 Thread Kevin Wolf

This patch series introduces a new block driver which acts as a protocol and
whose purpose it is to fail requests. To be more precise, I want it to fail in
configurable places, so that qemu-iotests can be extended with tests for the
error paths (for example for the case when something with metadata writes goes
wrong deep in qcow2).

It works like this (I think this is self-explanatory):

$ cat /tmp/blkdebug.cfg
[inject-error]
event = l1_update
errno = 5
immediately = on
$ qemu-io blkdebug:/tmp/blkdebug.cfg:/tmp/empty.qcow2
qemu-io read 0 4k
read 4096/4096 bytes at offset 0
4 KiB, 1 ops; 0. sec (195.312 MiB/sec and 5. ops/sec)
qemu-io write 0 4k
bdrv_debug_event: 4
blkdebug_debug_event: 4
write failed: Input/output error

Basically what I think is left to do is:
- Resolve TODOs and FIXMEs left in the code
- Require blkdebug to be explicitly enabled before compiling in event
  generation (not sure if it really matters)
- Add more events, they are only added for qcow2-clusters so far
- Address your comments (if any)

Kevin Wolf (7):
  qemu-config: qemu_read_config_file() reads the normal config file
  qemu-config: Make qemu_config_parse more generic
  blkdebug: Basic request passthrough
  blkdebug: Inject errors
  Make qemu-config available for tools
  blkdebug: Add events and rules
  qcow2: Trigger blkdebug events

 Makefile.objs |6 +-
 block.c   |   13 ++
 block.h   |   28 
 block/blkdebug.c  |  423 +
 block/qcow2-cluster.c |   15 ++
 block_int.h   |2 +
 hw/qdev-properties.c  |   19 ++-
 hw/qdev.h |1 -
 qemu-config.c |   45 +++---
 qemu-config.h |4 +-
 vl.c  |   34 ++---
 11 files changed, 540 insertions(+), 50 deletions(-)
 create mode 100644 block/blkdebug.c

[Qemu-devel] [RFC][PATCH 2/7] qemu-config: Make qemu_config_parse more generic

2010-03-15 Thread Kevin Wolf

qemu_config_parse gets the option groups as a parameter now instead of
hardcoding the VM configuration groups. This way it can be used for other
configurations, too.

Signed-off-by: Kevin Wolf kw...@redhat.com
---
 qemu-config.c |   15 ---
 qemu-config.h |2 +-
 2 files changed, 9 insertions(+), 8 deletions(-)

diff --git a/qemu-config.c b/qemu-config.c
index 8166b5e..e121f2a 100644
--- a/qemu-config.c
+++ b/qemu-config.c
@@ -290,7 +290,7 @@ QemuOptsList qemu_cpudef_opts = {
 },
 };
 
-static QemuOptsList *lists[] = {
+static QemuOptsList *vm_config_groups[] = {
 qemu_drive_opts,
 qemu_chardev_opts,
 qemu_device_opts,
@@ -303,7 +303,7 @@ static QemuOptsList *lists[] = {
 NULL,
 };
 
-static QemuOptsList *find_list(const char *group)
+static QemuOptsList *find_list(QemuOptsList **lists, const char *group)
 {
 int i;
 
@@ -330,7 +330,7 @@ int qemu_set_option(const char *str)
 return -1;
 }
 
-list = find_list(group);
+list = find_list(vm_config_groups, group);
 if (list == NULL) {
 return -1;
 }
@@ -415,6 +415,7 @@ static int config_write_opts(QemuOpts *opts, void *opaque)
 void qemu_config_write(FILE *fp)
 {
 struct ConfigWriteData data = { .fp = fp };
+QemuOptsList **lists = vm_config_groups;
 int i;
 
 fprintf(fp, # qemu config file\n\n);
@@ -424,7 +425,7 @@ void qemu_config_write(FILE *fp)
 }
 }
 
-int qemu_config_parse(FILE *fp)
+int qemu_config_parse(FILE *fp, QemuOptsList **lists)
 {
 char line[1024], group[64], id[64], arg[64], value[1024];
 QemuOptsList *list = NULL;
@@ -441,7 +442,7 @@ int qemu_config_parse(FILE *fp)
 }
 if (sscanf(line, [%63s \%63[^\]\], group, id) == 2) {
 /* group with id */
-list = find_list(group);
+list = find_list(lists, group);
 if (list == NULL)
 return -1;
 opts = qemu_opts_create(list, id, 1);
@@ -449,7 +450,7 @@ int qemu_config_parse(FILE *fp)
 }
 if (sscanf(line, [%63[^]]], group) == 1) {
 /* group without id */
-list = find_list(group);
+list = find_list(lists, group);
 if (list == NULL)
 return -1;
 opts = qemu_opts_create(list, NULL, 0);
@@ -481,7 +482,7 @@ int qemu_read_config_file(const char *filename)
 return -errno;
 }
 
-if (qemu_config_parse(f) != 0) {
+if (qemu_config_parse(f, vm_config_groups) != 0) {
 return -EINVAL;
 }
 fclose(f);
diff --git a/qemu-config.h b/qemu-config.h
index bbe9d41..086aa2a 100644
--- a/qemu-config.h
+++ b/qemu-config.h
@@ -16,7 +16,7 @@ int qemu_global_option(const char *str);
 void qemu_add_globals(void);
 
 void qemu_config_write(FILE *fp);
-int qemu_config_parse(FILE *fp);
+int qemu_config_parse(FILE *fp, QemuOptsList **lists);
 
 int qemu_read_config_file(const char *filename);
 
-- 
1.6.6.1

[Qemu-devel] [RFC][PATCH 1/7] qemu-config: qemu_read_config_file() reads the normal config file

2010-03-15 Thread Kevin Wolf

Introduce a new function qemu_read_config_file which reads the VM configuration
from a config file. Unlike qemu_config_parse it doesn't take a open file but a
filename and reduces code duplication as a side effect.

Signed-off-by: Kevin Wolf kw...@redhat.com
---
 qemu-config.c |   15 +++
 qemu-config.h |2 ++
 vl.c  |   34 +-
 3 files changed, 30 insertions(+), 21 deletions(-)

diff --git a/qemu-config.c b/qemu-config.c
index 246fae6..8166b5e 100644
--- a/qemu-config.c
+++ b/qemu-config.c
@@ -473,3 +473,18 @@ int qemu_config_parse(FILE *fp)
 }
 return 0;
 }
+
+int qemu_read_config_file(const char *filename)
+{
+FILE *f = fopen(filename, r);
+if (f == NULL) {
+return -errno;
+}
+
+if (qemu_config_parse(f) != 0) {
+return -EINVAL;
+}
+fclose(f);
+
+return 0;
+}
diff --git a/qemu-config.h b/qemu-config.h
index b335c42..bbe9d41 100644
--- a/qemu-config.h
+++ b/qemu-config.h
@@ -18,4 +18,6 @@ void qemu_add_globals(void);
 void qemu_config_write(FILE *fp);
 int qemu_config_parse(FILE *fp);
 
+int qemu_read_config_file(const char *filename);
+
 #endif /* QEMU_CONFIG_H */
diff --git a/vl.c b/vl.c
index a3e43ad..078e3f9 100644
--- a/vl.c
+++ b/vl.c
@@ -4940,21 +4940,17 @@ int main(int argc, char **argv, char **envp)
 }
 
 if (defconfig) {
-FILE *fp;
-fp = fopen(CONFIG_QEMU_CONFDIR /qemu.conf, r);
-if (fp) {
-if (qemu_config_parse(fp) != 0) {
-exit(1);
-}
-fclose(fp);
+int ret;
+
+ret = qemu_read_config_file(CONFIG_QEMU_CONFDIR /qemu.conf);
+if (ret == -EINVAL) {
+exit(1);
 }
 
-fp = fopen(CONFIG_QEMU_CONFDIR /target- TARGET_ARCH .conf, r);
-if (fp) {
-if (qemu_config_parse(fp) != 0) {
-exit(1);
-}
-fclose(fp);
+ret = qemu_read_config_file(CONFIG_QEMU_CONFDIR
+/target- TARGET_ARCH .conf);
+if (ret == -EINVAL) {
+exit(1);
 }
 }
 #if defined(cpudef_setup)
@@ -5635,16 +5631,12 @@ int main(int argc, char **argv, char **envp)
 #endif
 case QEMU_OPTION_readconfig:
 {
-FILE *fp;
-fp = fopen(optarg, r);
-if (fp == NULL) {
-fprintf(stderr, open %s: %s\n, optarg, 
strerror(errno));
+int ret = qemu_read_config_file(optarg);
+if (ret  0) {
+fprintf(stderr, read config %s: %s\n, optarg,
+strerror(-ret));
 exit(1);
 }
-if (qemu_config_parse(fp) != 0) {
-exit(1);
-}
-fclose(fp);
 break;
 }
 case QEMU_OPTION_writeconfig:
-- 
1.6.6.1

[Qemu-devel] [RFC][PATCH 3/7] blkdebug: Basic request passthrough

2010-03-15 Thread Kevin Wolf

This isn't doing anything interesting. It creates the blkdebug block driver as
a protocol which just passes everything through to raw.

Signed-off-by: Kevin Wolf kw...@redhat.com
---
 Makefile.objs|2 +-
 block/blkdebug.c |  104 ++
 2 files changed, 105 insertions(+), 1 deletions(-)
 create mode 100644 block/blkdebug.c

diff --git a/Makefile.objs b/Makefile.objs
index e791dd5..ece02d5 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -14,7 +14,7 @@ block-obj-$(CONFIG_LINUX_AIO) += linux-aio.o
 
 block-nested-y += cow.o qcow.o vdi.o vmdk.o cloop.o dmg.o bochs.o vpc.o vvfat.o
 block-nested-y += qcow2.o qcow2-refcount.o qcow2-cluster.o qcow2-snapshot.o
-block-nested-y += parallels.o nbd.o
+block-nested-y += parallels.o nbd.o blkdebug.o
 block-nested-$(CONFIG_WIN32) += raw-win32.o
 block-nested-$(CONFIG_POSIX) += raw-posix.o
 block-nested-$(CONFIG_CURL) += curl.o
diff --git a/block/blkdebug.c b/block/blkdebug.c
new file mode 100644
index 000..2c7e0dd
--- /dev/null
+++ b/block/blkdebug.c
@@ -0,0 +1,104 @@
+/*
+ * Block protocol for I/O error injection
+ *
+ * Copyright (c) 2010 Kevin Wolf kw...@redhat.com
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the Software), to 
deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING 
FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+
+#include qemu-common.h
+#include block_int.h
+#include module.h
+
+typedef struct BDRVBlkdebugState {
+BlockDriverState *hd;
+} BDRVBlkdebugState;
+
+static int blkdebug_open(BlockDriverState *bs, const char *filename, int flags)
+{
+BDRVBlkdebugState *s = bs-opaque;
+
+if (strncmp(filename, blkdebug:, strlen(blkdebug:))) {
+return -EINVAL;
+}
+filename += strlen(blkdebug:);
+
+return bdrv_file_open(s-hd, filename, flags);
+}
+
+static BlockDriverAIOCB *blkdebug_aio_readv(BlockDriverState *bs,
+int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
+BlockDriverCompletionFunc *cb, void *opaque)
+{
+BDRVBlkdebugState *s = bs-opaque;
+BlockDriverAIOCB *acb =
+bdrv_aio_readv(s-hd, sector_num, qiov, nb_sectors, cb, opaque);
+return acb;
+}
+
+static BlockDriverAIOCB *blkdebug_aio_writev(BlockDriverState *bs,
+int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
+BlockDriverCompletionFunc *cb, void *opaque)
+{
+BDRVBlkdebugState *s = bs-opaque;
+BlockDriverAIOCB *acb =
+bdrv_aio_writev(s-hd, sector_num, qiov, nb_sectors, cb, opaque);
+return acb;
+}
+
+static void blkdebug_close(BlockDriverState *bs)
+{
+BDRVBlkdebugState *s = bs-opaque;
+bdrv_delete(s-hd);
+}
+
+static void blkdebug_flush(BlockDriverState *bs)
+{
+BDRVBlkdebugState *s = bs-opaque;
+bdrv_flush(s-hd);
+}
+
+static BlockDriverAIOCB *blkdebug_aio_flush(BlockDriverState *bs,
+BlockDriverCompletionFunc *cb, void *opaque)
+{
+BDRVBlkdebugState *s = bs-opaque;
+return bdrv_aio_flush(s-hd, cb, opaque);
+}
+
+static BlockDriver bdrv_blkdebug = {
+.format_name= blkdebug,
+.protocol_name  = blkdebug,
+
+.instance_size  = sizeof(BDRVBlkdebugState),
+
+.bdrv_open  = blkdebug_open,
+.bdrv_close = blkdebug_close,
+.bdrv_flush = blkdebug_flush,
+
+.bdrv_aio_readv = blkdebug_aio_readv,
+.bdrv_aio_writev= blkdebug_aio_writev,
+.bdrv_aio_flush = blkdebug_aio_flush,
+};
+
+static void bdrv_blkdebug_init(void)
+{
+bdrv_register(bdrv_blkdebug);
+}
+
+block_init(bdrv_blkdebug_init);
-- 
1.6.6.1

[Qemu-devel] [RFC][PATCH 4/7] blkdebug: Inject errors

2010-03-15 Thread Kevin Wolf

Add a mechanism to inject errors instead of passing requests on. With no
further patches applied, you can use it by setting inject_errno in gdb.

Signed-off-by: Kevin Wolf kw...@redhat.com
---
 block/blkdebug.c |   63 ++
 1 files changed, 63 insertions(+), 0 deletions(-)

diff --git a/block/blkdebug.c b/block/blkdebug.c
index 2c7e0dd..f8ccd3c 100644
--- a/block/blkdebug.c
+++ b/block/blkdebug.c
@@ -26,10 +26,26 @@
 #include block_int.h
 #include module.h
 
+#include stdbool.h
+
 typedef struct BDRVBlkdebugState {
 BlockDriverState *hd;
+
+int inject_errno;
+bool inject_once;
+
+/* Decides if aio_readv/writev fails right away (true) or returns an error
+ * return value only in the callback (false) */
+bool inject_immediately;
 } BDRVBlkdebugState;
 
+struct callback_data {
+QEMUBH *bh;
+BlockDriverCompletionFunc *cb;
+int ret;
+void *opaque;
+};
+
 static int blkdebug_open(BlockDriverState *bs, const char *filename, int flags)
 {
 BDRVBlkdebugState *s = bs-opaque;
@@ -42,11 +58,53 @@ static int blkdebug_open(BlockDriverState *bs, const char 
*filename, int flags)
 return bdrv_file_open(s-hd, filename, flags);
 }
 
+static void error_callback_bh(void *opaque)
+{
+struct callback_data *d = opaque;
+qemu_bh_delete(d-bh);
+d-cb(d-opaque, d-ret);
+qemu_free(d);
+}
+
+static BlockDriverAIOCB *inject_error(BlockDriverState *bs,
+BlockDriverCompletionFunc *cb, void *opaque)
+{
+BDRVBlkdebugState *s = bs-opaque;
+int error = s-inject_errno;
+struct callback_data *d;
+QEMUBH *bh;
+
+if (s-inject_once) {
+s-inject_errno = 0;
+}
+
+if (s-inject_immediately) {
+return NULL;
+}
+
+d = qemu_mallocz(sizeof(*d));
+d-ret = error;
+d-opaque = opaque;
+d-cb = cb;
+
+bh = qemu_bh_new(error_callback_bh, d);
+d-bh = bh;
+qemu_bh_schedule(bh);
+
+// TODO Use a real AIOCB
+return (BlockDriverAIOCB*) 42;
+}
+
 static BlockDriverAIOCB *blkdebug_aio_readv(BlockDriverState *bs,
 int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
 BlockDriverCompletionFunc *cb, void *opaque)
 {
 BDRVBlkdebugState *s = bs-opaque;
+
+if (s-inject_errno) {
+return inject_error(bs, cb, opaque);
+}
+
 BlockDriverAIOCB *acb =
 bdrv_aio_readv(s-hd, sector_num, qiov, nb_sectors, cb, opaque);
 return acb;
@@ -57,6 +115,11 @@ static BlockDriverAIOCB 
*blkdebug_aio_writev(BlockDriverState *bs,
 BlockDriverCompletionFunc *cb, void *opaque)
 {
 BDRVBlkdebugState *s = bs-opaque;
+
+if (s-inject_errno) {
+return inject_error(bs, cb, opaque);
+}
+
 BlockDriverAIOCB *acb =
 bdrv_aio_writev(s-hd, sector_num, qiov, nb_sectors, cb, opaque);
 return acb;
-- 
1.6.6.1

[Qemu-devel] [RFC][PATCH 5/7] Make qemu-config available for tools

2010-03-15 Thread Kevin Wolf

To be able to use config files for blkdebug, we need to make these functions
available in the tools. This involves moving two functions that can only be
built in the context of the emulator.

Signed-off-by: Kevin Wolf kw...@redhat.com
---
 Makefile.objs|4 ++--
 hw/qdev-properties.c |   19 ++-
 hw/qdev.h|1 -
 qemu-config.c|   17 -
 4 files changed, 20 insertions(+), 21 deletions(-)

diff --git a/Makefile.objs b/Makefile.objs
index ece02d5..930408b 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -8,7 +8,7 @@ qobject-obj-y += qerror.o
 # block-obj-y is code used by both qemu system emulation and qemu-img
 
 block-obj-y = cutils.o cache-utils.o qemu-malloc.o qemu-option.o module.o
-block-obj-y += nbd.o block.o aio.o aes.o osdep.o
+block-obj-y += nbd.o block.o aio.o aes.o osdep.o qemu-config.o
 block-obj-$(CONFIG_POSIX) += posix-aio-compat.o
 block-obj-$(CONFIG_LINUX_AIO) += linux-aio.o
 
@@ -76,7 +76,7 @@ common-obj-y += buffered_file.o migration.o migration-tcp.o 
qemu-sockets.o
 common-obj-y += qemu-char.o savevm.o #aio.o
 common-obj-y += msmouse.o ps2.o
 common-obj-y += qdev.o qdev-properties.o
-common-obj-y += qemu-config.o block-migration.o
+common-obj-y += block-migration.o
 
 common-obj-$(CONFIG_BRLAPI) += baum.o
 common-obj-$(CONFIG_POSIX) += migration-exec.o migration-unix.o migration-fd.o
diff --git a/hw/qdev-properties.c b/hw/qdev-properties.c
index 24671af..35d284e 100644
--- a/hw/qdev-properties.c
+++ b/hw/qdev-properties.c
@@ -666,7 +666,7 @@ void qdev_prop_set_defaults(DeviceState *dev, Property 
*props)
 
 static QTAILQ_HEAD(, GlobalProperty) global_props = 
QTAILQ_HEAD_INITIALIZER(global_props);
 
-void qdev_prop_register_global(GlobalProperty *prop)
+static void qdev_prop_register_global(GlobalProperty *prop)
 {
 QTAILQ_INSERT_TAIL(global_props, prop, next);
 }
@@ -694,3 +694,20 @@ void qdev_prop_set_globals(DeviceState *dev)
 }
 }
 }
+
+static int qdev_add_one_global(QemuOpts *opts, void *opaque)
+{
+GlobalProperty *g;
+
+g = qemu_mallocz(sizeof(*g));
+g-driver   = qemu_opt_get(opts, driver);
+g-property = qemu_opt_get(opts, property);
+g-value= qemu_opt_get(opts, value);
+qdev_prop_register_global(g);
+return 0;
+}
+
+void qemu_add_globals(void)
+{
+qemu_opts_foreach(qemu_global_opts, qdev_add_one_global, NULL, 0);
+}
diff --git a/hw/qdev.h b/hw/qdev.h
index adfcf79..30f31a1 100644
--- a/hw/qdev.h
+++ b/hw/qdev.h
@@ -273,7 +273,6 @@ void qdev_prop_set_macaddr(DeviceState *dev, const char 
*name, uint8_t *value);
 void qdev_prop_set_ptr(DeviceState *dev, const char *name, void *value);
 void qdev_prop_set_defaults(DeviceState *dev, Property *props);
 
-void qdev_prop_register_global(GlobalProperty *prop);
 void qdev_prop_register_global_list(GlobalProperty *props);
 void qdev_prop_set_globals(DeviceState *dev);
 
diff --git a/qemu-config.c b/qemu-config.c
index e121f2a..9070729 100644
--- a/qemu-config.c
+++ b/qemu-config.c
@@ -367,23 +367,6 @@ int qemu_global_option(const char *str)
 return 0;
 }
 
-static int qemu_add_one_global(QemuOpts *opts, void *opaque)
-{
-GlobalProperty *g;
-
-g = qemu_mallocz(sizeof(*g));
-g-driver   = qemu_opt_get(opts, driver);
-g-property = qemu_opt_get(opts, property);
-g-value= qemu_opt_get(opts, value);
-qdev_prop_register_global(g);
-return 0;
-}
-
-void qemu_add_globals(void)
-{
-qemu_opts_foreach(qemu_global_opts, qemu_add_one_global, NULL, 0);
-}
-
 struct ConfigWriteData {
 QemuOptsList *list;
 FILE *fp;
-- 
1.6.6.1

[Qemu-devel] [RFC][PATCH 6/7] blkdebug: Add events and rules

2010-03-15 Thread Kevin Wolf

Block drivers can trigger a blkdebug event whenever they reach a place where it
could be useful to inject an error for testing/debugging purposes.

Rules are read from a blkdebug config file and describe which action is taken
when an event is triggered. For now this is only injecting an error (with a few
options) or changing the state (which is an integer). Rules can be declared to
be active only in a specific state; this way later rules can distiguish on
which path we came to trigger their event.

Signed-off-by: Kevin Wolf kw...@redhat.com
---
 block.c  |   13 +++
 block.h  |9 ++
 block/blkdebug.c |  241 +-
 block_int.h  |2 +
 4 files changed, 264 insertions(+), 1 deletions(-)

diff --git a/block.c b/block.c
index 31d1ba4..d4adbc3 100644
--- a/block.c
+++ b/block.c
@@ -1532,6 +1532,19 @@ int bdrv_load_vmstate(BlockDriverState *bs, uint8_t *buf,
 return drv-bdrv_load_vmstate(bs, buf, pos, size);
 }
 
+void bdrv_debug_event(BlockDriverState *bs, BlkDebugEvent event)
+{
+BlockDriver *drv = bs-drv;
+
+fprintf(stderr, bdrv_debug_event: %d\n, event);
+if (!drv || !drv-bdrv_debug_event) {
+return;
+}
+
+return drv-bdrv_debug_event(bs, event);
+
+}
+
 /**/
 /* handling of snapshots */
 
diff --git a/block.h b/block.h
index edf5704..2fd8361 100644
--- a/block.h
+++ b/block.h
@@ -207,4 +207,13 @@ int bdrv_get_dirty(BlockDriverState *bs, int64_t sector);
 void bdrv_reset_dirty(BlockDriverState *bs, int64_t cur_sector,
   int nr_sectors);
 int64_t bdrv_get_dirty_count(BlockDriverState *bs);
+
+
+typedef enum {
+BLKDBG_EVENT_MAX,
+} BlkDebugEvent;
+
+#define BLKDBG_EVENT(bs, evt) bdrv_debug_event(bs, evt)
+void bdrv_debug_event(BlockDriverState *bs, BlkDebugEvent event);
+
 #endif
diff --git a/block/blkdebug.c b/block/blkdebug.c
index f8ccd3c..22b1768 100644
--- a/block/blkdebug.c
+++ b/block/blkdebug.c
@@ -31,12 +31,15 @@
 typedef struct BDRVBlkdebugState {
 BlockDriverState *hd;
 
+int state;
 int inject_errno;
 bool inject_once;
 
 /* Decides if aio_readv/writev fails right away (true) or returns an error
  * return value only in the callback (false) */
 bool inject_immediately;
+
+QLIST_HEAD(list, blkdebug_rule) rules[BLKDBG_EVENT_MAX];
 } BDRVBlkdebugState;
 
 struct callback_data {
@@ -46,16 +49,210 @@ struct callback_data {
 void *opaque;
 };
 
+enum {
+ACTION_INJECT_ERROR,
+ACTION_SET_STATE,
+};
+
+struct blkdebug_rule {
+BlkDebugEvent event;
+int action;
+int state;
+union {
+struct {
+int error;
+int immediately;
+int once;
+} inject;
+struct {
+int new_state;
+} set_state;
+} options;
+QLIST_ENTRY(blkdebug_rule) next;
+};
+
+static QemuOptsList inject_error_opts = {
+.name = inject-error,
+.head = QTAILQ_HEAD_INITIALIZER(inject_error_opts.head),
+.desc = {
+{
+.name = event,
+.type = QEMU_OPT_STRING,
+},
+{
+.name = state,
+.type = QEMU_OPT_NUMBER,
+},
+{
+.name = errno,
+.type = QEMU_OPT_NUMBER,
+},
+{
+.name = once,
+.type = QEMU_OPT_BOOL,
+},
+{
+.name = immediately,
+.type = QEMU_OPT_BOOL,
+},
+{ /* end of list */ }
+},
+};
+
+static QemuOptsList set_state_opts = {
+.name = set-state,
+.head = QTAILQ_HEAD_INITIALIZER(inject_error_opts.head),
+.desc = {
+{
+.name = event,
+.type = QEMU_OPT_STRING,
+},
+{
+.name = state,
+.type = QEMU_OPT_NUMBER,
+},
+{
+.name = new_state,
+.type = QEMU_OPT_NUMBER,
+},
+{ /* end of list */ }
+},
+};
+
+static QemuOptsList *config_groups[] = {
+inject_error_opts,
+set_state_opts,
+NULL
+};
+
+static const char *event_names[BLKDBG_EVENT_MAX] = {
+};
+
+static int get_event_by_name(const char *name, BlkDebugEvent *event)
+{
+int i;
+
+for (i = 0; i  BLKDBG_EVENT_MAX; i++) {
+if (!strcmp(event_names[i], name)) {
+*event = i;
+return 0;
+}
+}
+
+return -1;
+}
+
+struct add_rule_data {
+BDRVBlkdebugState *s;
+int action;
+};
+
+static int add_rule(QemuOpts *opts, void *opaque)
+{
+struct add_rule_data *d = opaque;
+BDRVBlkdebugState *s = d-s;
+const char* event_name;
+BlkDebugEvent event;
+struct blkdebug_rule *rule;
+
+/* Find the right event for the rule */
+event_name = qemu_opt_get(opts, event);
+if (!event_name || get_event_by_name(event_name, event)  0) {
+return -1;
+}
+
+/* Set attributes common for all actions */
+rule =

[Qemu-devel] [RFC][PATCH 7/7] qcow2: Trigger blkdebug events

2010-03-15 Thread Kevin Wolf

This adds blkdebug events to qcow2 to allow injecting I/O errors in specific
places.

Signed-off-by: Kevin Wolf kw...@redhat.com
---
 block.h   |   19 +++
 block/blkdebug.c  |   17 +
 block/qcow2-cluster.c |   15 +++
 3 files changed, 51 insertions(+), 0 deletions(-)

diff --git a/block.h b/block.h
index 2fd8361..c0f17aa 100644
--- a/block.h
+++ b/block.h
@@ -210,6 +210,25 @@ int64_t bdrv_get_dirty_count(BlockDriverState *bs);
 
 
 typedef enum {
+BLKDBG_L1_UPDATE,
+
+BLKDBG_L1_GROW_ALLOC_TABLE,
+BLKDBG_L1_GROW_WRITE_TABLE,
+BLKDBG_L1_GROW_ACTIVATE_TABLE,
+
+BLKDBG_L2_LOAD,
+BLKDBG_L2_UPDATE,
+BLKDBG_L2_UPDATE_COMPRESSED,
+BLKDBG_L2_ALLOC_COW_READ,
+BLKDBG_L2_ALLOC_WRITE,
+
+BLKDBG_READ,
+BLKDBG_READ_BACKING,
+BLKDBG_READ_COMPRESSED,
+
+BLKDBG_COW_READ,
+BLKDBG_COW_WRITE,
+
 BLKDBG_EVENT_MAX,
 } BlkDebugEvent;
 
diff --git a/block/blkdebug.c b/block/blkdebug.c
index 22b1768..2398975 100644
--- a/block/blkdebug.c
+++ b/block/blkdebug.c
@@ -126,6 +126,23 @@ static QemuOptsList *config_groups[] = {
 };
 
 static const char *event_names[BLKDBG_EVENT_MAX] = {
+[BLKDBG_L1_UPDATE]  = l1_update,
+[BLKDBG_L1_GROW_ALLOC_TABLE]= l1_grow.alloc_table,
+[BLKDBG_L1_GROW_WRITE_TABLE]= l1_grow.write_table,
+[BLKDBG_L1_GROW_ACTIVATE_TABLE] = l1_grow.activate_table,
+
+[BLKDBG_L2_LOAD]= l2_load,
+[BLKDBG_L2_UPDATE]  = l2_update,
+[BLKDBG_L2_UPDATE_COMPRESSED]   = l2_update_compressed,
+[BLKDBG_L2_ALLOC_COW_READ]  = l2_alloc.cow_read,
+[BLKDBG_L2_ALLOC_WRITE] = l2_alloc.write,
+
+[BLKDBG_READ]   = read,
+[BLKDBG_READ_BACKING]   = read_backing,
+[BLKDBG_READ_COMPRESSED]= read_compressed,
+
+[BLKDBG_COW_READ]   = cow_read,
+[BLKDBG_COW_WRITE]  = cow_write,
 };
 
 static int get_event_by_name(const char *name, BlkDebugEvent *event)
diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index b13b693..9b3d686 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -54,12 +54,14 @@ int qcow2_grow_l1_table(BlockDriverState *bs, int min_size)
 memcpy(new_l1_table, s-l1_table, s-l1_size * sizeof(uint64_t));
 
 /* write new table (align to cluster) */
+BLKDBG_EVENT(s-hd, BLKDBG_L1_GROW_ALLOC_TABLE);
 new_l1_table_offset = qcow2_alloc_clusters(bs, new_l1_size2);
 if (new_l1_table_offset  0) {
 qemu_free(new_l1_table);
 return new_l1_table_offset;
 }
 
+BLKDBG_EVENT(s-hd, BLKDBG_L1_GROW_WRITE_TABLE);
 for(i = 0; i  s-l1_size; i++)
 new_l1_table[i] = cpu_to_be64(new_l1_table[i]);
 ret = bdrv_pwrite(s-hd, new_l1_table_offset, new_l1_table, new_l1_size2);
@@ -69,6 +71,7 @@ int qcow2_grow_l1_table(BlockDriverState *bs, int min_size)
 new_l1_table[i] = be64_to_cpu(new_l1_table[i]);
 
 /* set new table */
+BLKDBG_EVENT(s-hd, BLKDBG_L1_GROW_ACTIVATE_TABLE);
 cpu_to_be32w((uint32_t*)data, new_l1_size);
 cpu_to_be64w((uint64_t*)(data + 4), new_l1_table_offset);
 ret = bdrv_pwrite(s-hd, offsetof(QCowHeader, l1_size), data,sizeof(data));
@@ -170,6 +173,8 @@ static uint64_t *l2_load(BlockDriverState *bs, uint64_t 
l2_offset)
 
 min_index = l2_cache_new_entry(bs);
 l2_table = s-l2_cache + (min_index  s-l2_bits);
+
+BLKDBG_EVENT(s-hd, BLKDBG_L2_LOAD);
 if (bdrv_pread(s-hd, l2_offset, l2_table, s-l2_size * sizeof(uint64_t)) 
!=
 s-l2_size * sizeof(uint64_t))
 return NULL;
@@ -195,6 +200,7 @@ static int write_l1_entry(BDRVQcowState *s, int l1_index)
 buf[i] = cpu_to_be64(s-l1_table[l1_start_index + i]);
 }
 
+BLKDBG_EVENT(s-hd, BLKDBG_L1_UPDATE);
 if (bdrv_pwrite(s-hd, s-l1_table_offset + 8 * l1_start_index,
 buf, sizeof(buf)) != sizeof(buf))
 {
@@ -248,12 +254,14 @@ static uint64_t *l2_allocate(BlockDriverState *bs, int 
l1_index)
 memset(l2_table, 0, s-l2_size * sizeof(uint64_t));
 } else {
 /* if there was an old l2 table, read it from the disk */
+BLKDBG_EVENT(s-hd, BLKDBG_L2_ALLOC_COW_READ);
 if (bdrv_pread(s-hd, old_l2_offset,
l2_table, s-l2_size * sizeof(uint64_t)) !=
 s-l2_size * sizeof(uint64_t))
 return NULL;
 }
 /* write the l2 table to the file */
+BLKDBG_EVENT(s-hd, BLKDBG_L2_ALLOC_WRITE);
 if (bdrv_pwrite(s-hd, l2_offset,
 l2_table, s-l2_size * sizeof(uint64_t)) !=
 s-l2_size * sizeof(uint64_t))
@@ -335,6 +343,7 @@ static int qcow_read(BlockDriverState *bs, int64_t 
sector_num,
 /* read from the base image */
 n1 = qcow2_backing_read1(bs-backing_hd, sector_num, buf, n);
 if (n1  0) {
+BLKDBG_EVENT(s-hd, BLKDBG_READ_BACKING);
 ret = bdrv_read(bs-backing_hd, sector_num, buf,

[Qemu-devel] [PATCH] resource leak fixes for iwmmxt disassemble

2010-03-15 Thread Lars Munch


This patch fixes few resource leaks in the iwmmxt disassemble.

Signed-off-by: Lars Munch l...@segv.dk
---
 target-arm/translate.c |2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/target-arm/translate.c b/target-arm/translate.c
index cdfe946..2ab7881 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -1131,6 +1131,7 @@ static inline TCGv iwmmxt_load_creg(int reg)
 static inline void iwmmxt_store_creg(int reg, TCGv var)
 {
 tcg_gen_st_i32(var, cpu_env, offsetof(CPUState, iwmmxt.cregs[reg]));
+dead_tmp(var);
 }
 
 static inline void gen_op_iwmmxt_movq_wRn_M0(int rn)
@@ -1415,6 +1416,7 @@ static int disas_iwmmxt_insn(CPUState *env, DisasContext 
*s, uint32_t insn)
 }
 }
 }
+dead_tmp(addr);
 return 0;
 }
 
-- 
1.7.0

Re: [Qemu-devel] [PATCH, RFC] Replace assert(0) with abort() or cpu_abort()

2010-03-15 Thread Blue Swirl

On 3/15/10, Markus Armbruster arm...@redhat.com wrote:
 Blue Swirl blauwir...@gmail.com writes:

   When building with -DNDEBUG, assert(0) will not stop execution
   so it must not be used for abnormal termination.


 For each case: are you sure the code does not recover after assert(0)?
  Not saying it does, just asking whether you checked.

I had not checked but did just now,

QEMU system emulators do not handle SIGABRT. The situation is
different for user emulator, but then assert(0) and abort() would be
equally correct or incorrect.

   Use cpu_abort() when in CPU context, abort() otherwise.


 I sympathize with the general idea, but I don't like dead code after
  abort().  What about cleaning that up?

Good idea, but it should be a separate patch. This patch is safe,
whereas the cleanup patch could cause problems if it's not done
carefully.

Re: [Qemu-devel] [RFC][PATCH 0/7] blkdebug

2010-03-15 Thread Blue Swirl

On 3/15/10, Kevin Wolf kw...@redhat.com wrote:
 This patch series introduces a new block driver which acts as a protocol and
  whose purpose it is to fail requests. To be more precise, I want it to fail 
 in
  configurable places, so that qemu-iotests can be extended with tests for the
  error paths (for example for the case when something with metadata writes 
 goes
  wrong deep in qcow2).

Nice. Do you think this could be extended (later) to inject errors
also to guest from QEMU? Then it would be nice to be able to inject
errors when reading/writing to a specific guest block range and for
other formats (raw etc.) too.

Re: [Qemu-devel] [RFC][PATCH 0/7] blkdebug

2010-03-15 Thread Kevin Wolf

Am 15.03.2010 18:40, schrieb Blue Swirl:
 On 3/15/10, Kevin Wolf kw...@redhat.com wrote:
 This patch series introduces a new block driver which acts as a protocol and
  whose purpose it is to fail requests. To be more precise, I want it to fail 
 in
  configurable places, so that qemu-iotests can be extended with tests for the
  error paths (for example for the case when something with metadata writes 
 goes
  wrong deep in qcow2).
 
 Nice. Do you think this could be extended (later) to inject errors
 also to guest from QEMU? 

Not sure what you mean by from QEMU? To allow injecting errors from
the monitor instead of based on rules? Sounds certainly doable.

Or do you just mean to make it work in qemu as opposed to qemu-io? This
really doesn't make a difference to the block layer. If it works in one,
it works in the other one, too.

 Then it would be nice to be able to inject
 errors when reading/writing to a specific guest block range and for
 other formats (raw etc.) too.

Should work with all formats, it's implemented as a protocol (but it's
untested with everything but qcow2). So you basically stick it between
the format driver and the image file. The only thing that could be
needed for other formats is some additional events.

For the block range thing, this would need to implement some more
conditions instead of just checking the state, but in general this is
exactly the kind of things that I'd like to see supported eventually.

So I completely agree that this is the right direction for future
improvements, however I can't tell yet if (or when) I'll get to
implement it myself. My focus is really the qcow2 error paths for now.

Kevin

[Qemu-devel] git clone --recursive fails on git://git.qemu.org/qemu.git

2010-03-15 Thread Chris Webb

I tried to grab a recursive clone of qemu.git in order to get the head of
the seabios and vgabios trees to match the head of qemu.git, but the
operation failed with

  $ git clone --recursive git://git.qemu.org/qemu.git qemu-r
  Initialized empty Git repository in /home/chris/git/qemu-r/.git/
  remote: Counting objects: 59824, done.
  remote: Compressing objects: 100% (15786/15786), done.
  remote: Total 59824 (delta 47532), reused 55283 (delta 43936)
  Receiving objects: 100% (59824/59824), 24.67 MiB | 292 KiB/s, done.
  Resolving deltas: 100% (47532/47532), done.
  Submodule 'roms/seabios' (git://git.qemu.org/seabios.git/) registered for 
path 'roms/seabios'
  Submodule 'roms/vgabios' (git://git.qemu.org/vgabios.git/) registered for 
path 'roms/vgabios'
  Initialized empty Git repository in /home/chris/git/qemu-r/roms/seabios/.git/
  remote: Counting objects: 4711, done.
  remote: Compressing objects: 100% (1259/1259), done.
  remote: Total 4711 (delta 3758), reused 4360 (delta 3450)
  Receiving objects: 100% (4711/4711), 1.04 MiB | 266 KiB/s, done.
  Resolving deltas: 100% (3758/3758), done.
  fatal: reference is not a tree: 8f469b9676127ba6bb52609d89ec774e61db0ee1
  Unable to checkout '8f469b9676127ba6bb52609d89ec774e61db0ee1' in submodule 
path 'roms/seabios'
 
Is this a problem with qemu.git, or am I just failing to drive git properly?

Cheers,

Chris.

Re: [Qemu-devel] [RFC][PATCH 0/7] blkdebug

2010-03-15 Thread Blue Swirl

On 3/15/10, Kevin Wolf kw...@redhat.com wrote:
 Am 15.03.2010 18:40, schrieb Blue Swirl:

  On 3/15/10, Kevin Wolf kw...@redhat.com wrote:
   This patch series introduces a new block driver which acts as a protocol 
 and
whose purpose it is to fail requests. To be more precise, I want it to 
 fail in
configurable places, so that qemu-iotests can be extended with tests for 
 the
error paths (for example for the case when something with metadata 
 writes goes
wrong deep in qcow2).
  
   Nice. Do you think this could be extended (later) to inject errors
   also to guest from QEMU?


 Not sure what you mean by from QEMU? To allow injecting errors from
  the monitor instead of based on rules? Sounds certainly doable.

I was thinking rule file, but monitor would be more flexible. Even
better: allow switching rule files from monitor :-).

  Or do you just mean to make it work in qemu as opposed to qemu-io? This
  really doesn't make a difference to the block layer. If it works in one,
  it works in the other one, too.


   Then it would be nice to be able to inject
   errors when reading/writing to a specific guest block range and for
   other formats (raw etc.) too.


 Should work with all formats, it's implemented as a protocol (but it's
  untested with everything but qcow2). So you basically stick it between
  the format driver and the image file. The only thing that could be
  needed for other formats is some additional events.

  For the block range thing, this would need to implement some more
  conditions instead of just checking the state, but in general this is
  exactly the kind of things that I'd like to see supported eventually.

  So I completely agree that this is the right direction for future
  improvements, however I can't tell yet if (or when) I'll get to
  implement it myself. My focus is really the qcow2 error paths for now.

Ok, it's enough to know at this stage that the architecture is
flexible for this.

[Qemu-devel] Re: [PATCH, RFC] Replace assert(0) with abort() or cpu_abort()

2010-03-15 Thread Paolo Bonzini




I sympathize with the general idea, but I don't like dead code

after abort().  What about cleaning that up?


Good idea, but it should be a separate patch. This patch is safe,
whereas the cleanup patch could cause problems if it's not done
carefully.


This patch is safe, however I'd consider not changing 
assert(0)-abort() if there is code after the assert that looks like an 
attempt at recovering.  Example:


   if (!p) {
   printf (the impossible has happened!);
   assert (0);
   }

   return p-q;

should be changed to abort, while

   if (!p) {
   printf (the impossible has happened!);
   assert (0);
   return 0;
   }

   return p-q;

should not.

Paolo

[Qemu-devel] Re: [PATCH, RFC] Replace assert(0) with abort() or cpu_abort()

2010-03-15 Thread Blue Swirl

On 3/15/10, Paolo Bonzini pbonz...@redhat.com wrote:


 
  
I sympathize with the general idea, but I don't like dead code
   
   after abort().  What about cleaning that up?
  
  
  Good idea, but it should be a separate patch. This patch is safe,
  whereas the cleanup patch could cause problems if it's not done
  carefully.
 

  This patch is safe, however I'd consider not changing assert(0)-abort()
 if there is code after the assert that looks like an attempt at recovering.
 Example:

if (!p) {
printf (the impossible has happened!);
assert (0);
}

return p-q;

  should be changed to abort, while

if (!p) {
printf (the impossible has happened!);
assert (0);
return 0;
}

return p-q;

  should not.

Why not? According to manual page, assert(x) is equal to if (!x) abort().
As I mentioned earlier, system emulators don't handle SIGABRT and for
user emulators the level of bugginess remains constant (or rather, it
no longer depends on NDEBUG). Therefore the recovery code will never
be executed.

[Qemu-devel] Re: [PATCH, RFC] Replace assert(0) with abort() or cpu_abort()

2010-03-15 Thread Markus Armbruster

Paolo Bonzini pbonz...@redhat.com writes:

 I sympathize with the general idea, but I don't like dead code
 after abort().  What about cleaning that up?

 Good idea, but it should be a separate patch. This patch is safe,
 whereas the cleanup patch could cause problems if it's not done
 carefully.

 This patch is safe, however I'd consider not changing
 assert(0)-abort() if there is code after the assert that looks like
 an attempt at recovering.  Example:

if (!p) {
printf (the impossible has happened!);
assert (0);
}

return p-q;

 should be changed to abort, while

if (!p) {
printf (the impossible has happened!);
assert (0);
return 0;
}

return p-q;

 should not.

Except when you find that the recovery attempt is insufficient, of
course.  Requires closer inspection.

[Qemu-devel] Re: [PATCH, RFC] Replace assert(0) with abort() or cpu_abort()

2010-03-15 Thread Paolo Bonzini


I'd consider not changing assert(0)-abort()
if there is code after the assert that looks like an attempt at recovering.
Example:

if (!p) {
printf (the impossible has happened!);
assert (0);
}

return p-q;

  should be changed to abort, while

if (!p) {
printf (the impossible has happened!);
assert (0);
return 0;
}

return p-q;

  should not.


Why not? According to manual page, assert(x) is equal to if (!x) abort().
As I mentioned earlier, system emulators don't handle SIGABRT


... which won't be generated if !NDEBUG.  Only if the recovery code 
makes sense, of course.  However, my point was that those cases where 
there is recovery code are not no-brainers.


Paolo

Re: [Qemu-devel] [PATCH, RFC] Replace assert(0) with abort() or cpu_abort()

2010-03-15 Thread Markus Armbruster

Blue Swirl blauwir...@gmail.com writes:

 On 3/15/10, Markus Armbruster arm...@redhat.com wrote:
 Blue Swirl blauwir...@gmail.com writes:

   When building with -DNDEBUG, assert(0) will not stop execution
   so it must not be used for abnormal termination.


 For each case: are you sure the code does not recover after assert(0)?
  Not saying it does, just asking whether you checked.

 I had not checked but did just now,

 QEMU system emulators do not handle SIGABRT. The situation is
 different for user emulator, but then assert(0) and abort() would be
 equally correct or incorrect.

Please don't tell me that user emulators make abort() return.  abort()
is declared __noreturn__, and the optimizer may well rely on that.

   Use cpu_abort() when in CPU context, abort() otherwise.


 I sympathize with the general idea, but I don't like dead code after
  abort().  What about cleaning that up?

 Good idea, but it should be a separate patch. This patch is safe,
 whereas the cleanup patch could cause problems if it's not done
 carefully.

I support keeping separate things separate.  However, separating
something like

if (mapping-first_mapping_index != first_mapping_index
 mapping-info.file.offset  0) {
-   assert(0);
+abort();
copy_it = 1;
}

from

if (mapping-first_mapping_index != first_mapping_index
 mapping-info.file.offset  0) {
abort();
-   copy_it = 1;
}

doesn't seem worth it.

Re: [Qemu-devel] git clone --recursive fails on git://git.qemu.org/qemu.git

2010-03-15 Thread Anthony Liguori


On 03/15/2010 01:16 PM, Chris Webb wrote:

I tried to grab a recursive clone of qemu.git in order to get the head of
the seabios and vgabios trees to match the head of qemu.git, but the
operation failed with

   $ git clone --recursive git://git.qemu.org/qemu.git qemu-r
   Initialized empty Git repository in /home/chris/git/qemu-r/.git/
   remote: Counting objects: 59824, done.
   remote: Compressing objects: 100% (15786/15786), done.
   remote: Total 59824 (delta 47532), reused 55283 (delta 43936)
   Receiving objects: 100% (59824/59824), 24.67 MiB | 292 KiB/s, done.
   Resolving deltas: 100% (47532/47532), done.
   Submodule 'roms/seabios' (git://git.qemu.org/seabios.git/) registered for 
path 'roms/seabios'
   Submodule 'roms/vgabios' (git://git.qemu.org/vgabios.git/) registered for 
path 'roms/vgabios'
   Initialized empty Git repository in /home/chris/git/qemu-r/roms/seabios/.git/
   remote: Counting objects: 4711, done.
   remote: Compressing objects: 100% (1259/1259), done.
   remote: Total 4711 (delta 3758), reused 4360 (delta 3450)
   Receiving objects: 100% (4711/4711), 1.04 MiB | 266 KiB/s, done.
   Resolving deltas: 100% (3758/3758), done.
   fatal: reference is not a tree: 8f469b9676127ba6bb52609d89ec774e61db0ee1
   Unable to checkout '8f469b9676127ba6bb52609d89ec774e61db0ee1' in submodule 
path 'roms/seabios'

Is this a problem with qemu.git, or am I just failing to drive git properly?
   


Should work now.  Sorry about that.

Regards,

Anthony Liguori


Cheers,

Chris.

[Qemu-devel] Re: [PATCH, RFC] Replace assert(0) with abort() or cpu_abort()

2010-03-15 Thread Blue Swirl

On 3/15/10, Paolo Bonzini pbonz...@redhat.com wrote:
 
   I'd consider not changing assert(0)-abort()
   if there is code after the assert that looks like an attempt at
 recovering.
   Example:
  
  if (!p) {
  printf (the impossible has happened!);
  assert (0);
  }
  
  return p-q;
  
should be changed to abort, while
  
  if (!p) {
  printf (the impossible has happened!);
  assert (0);
  return 0;
  }
  
  return p-q;
  
should not.
  
 
  Why not? According to manual page, assert(x) is equal to if (!x) abort().
  As I mentioned earlier, system emulators don't handle SIGABRT
 

  ... which won't be generated if !NDEBUG.  Only if the recovery code makes
 sense, of course.  However, my point was that those cases where there is
 recovery code are not no-brainers.

Except that compiling with -DNDEBUG was broken and fixed only recently
with a6c6f76ceb95a0986fd1a36cc30f8241734d20c3. Thus I suspect nobody
uses -DNDEBUG for production builds and the code paths after assert(0)
are untested.

Re: [Qemu-devel] [PATCH 1/7] Add support for generic notifier lists

2010-03-15 Thread Anthony Liguori


On 03/11/2010 08:19 AM, Paul Brook wrote:

On 03/11/2010 06:57 AM, Paul Brook wrote:
 

+struct QEMUNotifier
+{
+void (*notify)(QEMUNotifier *notifier);
+};
 

I suggest combining this with QEMUBH.
   

I take it your not opposed to converting QEMUBH to be a QEMUNotifier?
If so, I'm happy to do it.
 

It's unclear to me why you've invented a new thing in the first place.
   


It's a better approximation of the command pattern because the command 
is a single object as opposed to a tuple.  Because the command is an 
object, you can also do things like binding.  For instance:


Which now gives you a notifier that has an fd bound to it's second 
argument (which is pretty useful for IO dispatch).


You can do this with a tuple representation, but it gets awkward.  One 
could argue for formalizing the tuple as a struct but extending by 
nesting becomes more complicated.  Also, for the most part, you already 
have a state for the command and embedding the object means less dynamic 
memory allocation and less code to handle that.


Regards,

Anthony Liguori

[Qemu-devel] [PATCH 1/7] Add support for generic notifier lists (v2)

2010-03-15 Thread Anthony Liguori

Notifiers are data-less callbacks and a notifier list is a list of registered
notifiers that all are interested in a particular event.

We'll use this in a few patches to implement mouse change notification.

Signed-off-by: Anthony Liguori aligu...@us.ibm.com
---
v1 - v2
 - Do not do memory allocations by placing list nodes in notifier
---
 Makefile.objs |1 +
 notify.c  |   39 +++
 notify.h  |   43 +++
 3 files changed, 83 insertions(+), 0 deletions(-)
 create mode 100644 notify.c
 create mode 100644 notify.h

diff --git a/Makefile.objs b/Makefile.objs
index e791dd5..dcb5a92 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -104,6 +104,7 @@ common-obj-$(CONFIG_VNC_TLS) += vnc-tls.o 
vnc-auth-vencrypt.o
 common-obj-$(CONFIG_VNC_SASL) += vnc-auth-sasl.o
 common-obj-$(CONFIG_COCOA) += cocoa.o
 common-obj-$(CONFIG_IOTHREAD) += qemu-thread.o
+common-obj-y += notify.o
 
 slirp-obj-y = cksum.o if.o ip_icmp.o ip_input.o ip_output.o
 slirp-obj-y += slirp.o mbuf.o misc.o sbuf.o socket.o tcp_input.o tcp_output.o
diff --git a/notify.c b/notify.c
new file mode 100644
index 000..bcd3fc5
--- /dev/null
+++ b/notify.c
@@ -0,0 +1,39 @@
+/*
+ * Notifier lists
+ *
+ * Copyright IBM, Corp. 2010
+ *
+ * Authors:
+ *  Anthony Liguori   aligu...@us.ibm.com
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ *
+ */
+
+#include qemu-common.h
+#include notify.h
+
+void notifier_list_init(NotifierList *list)
+{
+QTAILQ_INIT(list-notifiers);
+}
+
+void notifier_list_add(NotifierList *list, Notifier *notifier)
+{
+QTAILQ_INSERT_HEAD(list-notifiers, notifier, node);
+}
+
+void notifier_list_remove(NotifierList *list, Notifier *notifier)
+{
+QTAILQ_REMOVE(list-notifiers, notifier, node);
+}
+
+void notifier_list_notify(NotifierList *list)
+{
+Notifier *notifier, *next;
+
+QTAILQ_FOREACH_SAFE(notifier, list-notifiers, node, next) {
+notifier-notify(notifier);
+}
+}
diff --git a/notify.h b/notify.h
new file mode 100644
index 000..b40522f
--- /dev/null
+++ b/notify.h
@@ -0,0 +1,43 @@
+/*
+ * Notifier lists
+ *
+ * Copyright IBM, Corp. 2010
+ *
+ * Authors:
+ *  Anthony Liguori   aligu...@us.ibm.com
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ *
+ */
+
+#ifndef QEMU_NOTIFY_H
+#define QEMU_NOTIFY_H
+
+#include qemu-queue.h
+
+typedef struct Notifier Notifier;
+
+struct Notifier
+{
+void (*notify)(Notifier *notifier);
+QTAILQ_ENTRY(Notifier) node;
+};
+
+typedef struct NotifierList
+{
+QTAILQ_HEAD(, Notifier) notifiers;
+} NotifierList;
+
+#define NOTIFIER_LIST_INITIALIZER(head) \
+{ QTAILQ_HEAD_INITIALIZER((head).notifiers) }
+
+void notifier_list_init(NotifierList *list);
+
+void notifier_list_add(NotifierList *list, Notifier *notifier);
+
+void notifier_list_remove(NotifierList *list, Notifier *notifier);
+
+void notifier_list_notify(NotifierList *list);
+
+#endif
-- 
1.6.5.2

[Qemu-devel] [PATCH 3/7] Add kbd_mouse_has_absolute()

2010-03-15 Thread Anthony Liguori

kbd_mouse_is_absolute tells us whether the current mouse handler is an absolute
device.  kbd_mouse_has_absolute tells us whether we have any device that is
capable of absolute input.

This lets us tell a user that they have configured an absolute device but that
the guest is not currently using it.

Signed-off-by: Anthony Liguori aligu...@us.ibm.com
---
 console.h |5 +
 input.c   |   13 +
 2 files changed, 18 insertions(+), 0 deletions(-)

diff --git a/console.h b/console.h
index 94d9cae..f3c619f 100644
--- a/console.h
+++ b/console.h
@@ -53,8 +53,13 @@ void qemu_remove_led_event_handler(QEMUPutLEDEntry *entry);
 void kbd_put_keycode(int keycode);
 void kbd_put_ledstate(int ledstate);
 void kbd_mouse_event(int dx, int dy, int dz, int buttons_state);
+
+/* Does the current mouse generate absolute events */
 int kbd_mouse_is_absolute(void);
 
+/* Of all the mice, is there one that generates absolute events */
+int kbd_mouse_has_absolute(void);
+
 struct MouseTransformInfo {
 /* Touchscreen resolution */
 int x;
diff --git a/input.c b/input.c
index 397bfc5..8d5a14d 100644
--- a/input.c
+++ b/input.c
@@ -153,6 +153,19 @@ int kbd_mouse_is_absolute(void)
 return QTAILQ_FIRST(mouse_handlers)-qemu_put_mouse_event_absolute;
 }
 
+int kbd_mouse_has_absolute(void)
+{
+QEMUPutMouseEntry *entry;
+
+QTAILQ_FOREACH(entry, mouse_handlers, node) {
+if (entry-qemu_put_mouse_event_absolute) {
+return 1;
+}
+}
+
+return 0;
+}
+
 static void info_mice_iter(QObject *data, void *opaque)
 {
 QDict *mouse;
-- 
1.6.5.2

[Qemu-devel] [PATCH 2/7] Rewrite mouse handlers to use QTAILQ and to have an activation function

2010-03-15 Thread Anthony Liguori

And convert usb-hid to use it (to avoid regression with bisection)

Right now, when we do info mice and we've added a usb tablet, we don't see it
until the guest starts using the tablet.  We implement this behavior in order
to provide a means to delay registration of a mouse handler since we treat
the last registered handler as the current handler.

This is a usability problem though as we would like to give the user feedback
that they've either 1) not added an absolute device 2) there is an absolute
device but the guest isn't using it 3) we have an absolute device and it's
active.

By using QTAILQ and having an explicit activation function that moves the
handler to the front of the queue, we can implement the same semantics as
before with respect to automatically switching to usb tablet while providing
the user with a whole lot more information.

Signed-off-by: Anthony Liguori aligu...@us.ibm.com
---
 console.h|6 +++-
 hw/usb-hid.c |   15 ++--
 input.c  |  109 +++---
 3 files changed, 59 insertions(+), 71 deletions(-)

diff --git a/console.h b/console.h
index 71e8ff2..94d9cae 100644
--- a/console.h
+++ b/console.h
@@ -28,8 +28,10 @@ typedef struct QEMUPutMouseEntry {
 int qemu_put_mouse_event_absolute;
 char *qemu_put_mouse_event_name;
 
+int index;
+
 /* used internally by qemu for handling mice */
-struct QEMUPutMouseEntry *next;
+QTAILQ_ENTRY(QEMUPutMouseEntry) node;
 } QEMUPutMouseEntry;
 
 typedef struct QEMUPutLEDEntry {
@@ -43,6 +45,8 @@ QEMUPutMouseEntry 
*qemu_add_mouse_event_handler(QEMUPutMouseEvent *func,
 void *opaque, int absolute,
 const char *name);
 void qemu_remove_mouse_event_handler(QEMUPutMouseEntry *entry);
+void qemu_activate_mouse_event_handler(QEMUPutMouseEntry *entry);
+
 QEMUPutLEDEntry *qemu_add_led_event_handler(QEMUPutLEDEvent *func, void 
*opaque);
 void qemu_remove_led_event_handler(QEMUPutLEDEntry *entry);
 
diff --git a/hw/usb-hid.c b/hw/usb-hid.c
index 2e4e647..8e6c6e0 100644
--- a/hw/usb-hid.c
+++ b/hw/usb-hid.c
@@ -511,8 +511,7 @@ static int usb_mouse_poll(USBHIDState *hs, uint8_t *buf, 
int len)
 USBMouseState *s = hs-ptr;
 
 if (!s-mouse_grabbed) {
-   s-eh_entry = qemu_add_mouse_event_handler(usb_mouse_event, hs,
-  0, QEMU USB Mouse);
+qemu_activate_mouse_event_handler(s-eh_entry);
s-mouse_grabbed = 1;
 }
 
@@ -553,8 +552,7 @@ static int usb_tablet_poll(USBHIDState *hs, uint8_t *buf, 
int len)
 USBMouseState *s = hs-ptr;
 
 if (!s-mouse_grabbed) {
-   s-eh_entry = qemu_add_mouse_event_handler(usb_tablet_event, hs,
-  1, QEMU USB Tablet);
+qemu_activate_mouse_event_handler(s-eh_entry);
s-mouse_grabbed = 1;
 }
 
@@ -866,6 +864,15 @@ static int usb_hid_initfn(USBDevice *dev, int kind)
 USBHIDState *s = DO_UPCAST(USBHIDState, dev, dev);
 s-dev.speed = USB_SPEED_FULL;
 s-kind = kind;
+
+if (s-kind == USB_MOUSE) {
+s-ptr.eh_entry = qemu_add_mouse_event_handler(usb_mouse_event, s,
+   0, QEMU USB Mouse);
+} else if (s-kind == USB_TABLET) {
+s-ptr.eh_entry = qemu_add_mouse_event_handler(usb_tablet_event, s,
+   1, QEMU USB Tablet);
+}
+
 /* Force poll routine to be run and grab input the first time.  */
 s-changed = 1;
 return 0;
diff --git a/input.c b/input.c
index baaa4c6..397bfc5 100644
--- a/input.c
+++ b/input.c
@@ -31,9 +31,9 @@
 
 static QEMUPutKBDEvent *qemu_put_kbd_event;
 static void *qemu_put_kbd_event_opaque;
-static QEMUPutMouseEntry *qemu_put_mouse_event_head;
-static QEMUPutMouseEntry *qemu_put_mouse_event_current;
 static QTAILQ_HEAD(, QEMUPutLEDEntry) led_handlers = 
QTAILQ_HEAD_INITIALIZER(led_handlers);
+static QTAILQ_HEAD(, QEMUPutMouseEntry) mouse_handlers =
+QTAILQ_HEAD_INITIALIZER(mouse_handlers);
 
 void qemu_add_kbd_event_handler(QEMUPutKBDEvent *func, void *opaque)
 {
@@ -45,7 +45,8 @@ QEMUPutMouseEntry 
*qemu_add_mouse_event_handler(QEMUPutMouseEvent *func,
 void *opaque, int absolute,
 const char *name)
 {
-QEMUPutMouseEntry *s, *cursor;
+QEMUPutMouseEntry *s;
+static int mouse_index = 0;
 
 s = qemu_mallocz(sizeof(QEMUPutMouseEntry));
 
@@ -53,51 +54,24 @@ QEMUPutMouseEntry 
*qemu_add_mouse_event_handler(QEMUPutMouseEvent *func,
 s-qemu_put_mouse_event_opaque = opaque;
 s-qemu_put_mouse_event_absolute = absolute;
 s-qemu_put_mouse_event_name = qemu_strdup(name);
-s-next = NULL;
+s-index = mouse_index++;
 
-if (!qemu_put_mouse_event_head) {
-qemu_put_mouse_event_head = qemu_put_mouse_event_current

[Qemu-devel] [PATCH 7/7] sdl: use mouse mode notifier

2010-03-15 Thread Anthony Liguori

Today we poll the mouse mode whenever there is a mouse movement.  There is a
subtle usability problem with this though.

If we're in relative mode and grab is enabled, when we change to absolute mode,
we break grab.  This gives a user a seamless transition when the new pointer
is enabled.

But because we poll for mouse change, this grab break won't occur until the user
attempts to move the mouse.  By using notifiers, the grab break happens as soon
as possible.

Signed-off-by: Anthony Liguori aligu...@us.ibm.com
---
 sdl.c |   31 ---
 1 files changed, 20 insertions(+), 11 deletions(-)

diff --git a/sdl.c b/sdl.c
index 1d1e100..e9edbcc 100644
--- a/sdl.c
+++ b/sdl.c
@@ -57,6 +57,7 @@ static SDL_Cursor *guest_sprite = NULL;
 static uint8_t allocator;
 static SDL_PixelFormat host_format;
 static int scaling_active = 0;
+static Notifier mouse_mode_notifier;
 
 static void sdl_update(DisplayState *ds, int x, int y, int w, int h)
 {
@@ -485,6 +486,22 @@ static void sdl_grab_end(void)
 sdl_update_caption();
 }
 
+static void sdl_mouse_mode_change(Notifier *notify)
+{
+if (kbd_mouse_is_absolute()) {
+if (!absolute_enabled) {
+sdl_hide_cursor();
+if (gui_grab) {
+sdl_grab_end();
+}
+absolute_enabled = 1;
+}
+} else if (absolute_enabled) {
+   sdl_show_cursor();
+   absolute_enabled = 0;
+}
+}
+
 static void sdl_send_mouse_event(int dx, int dy, int dz, int x, int y, int 
state)
 {
 int buttons;
@@ -497,19 +514,8 @@ static void sdl_send_mouse_event(int dx, int dy, int dz, 
int x, int y, int state
 buttons |= MOUSE_EVENT_MBUTTON;
 
 if (kbd_mouse_is_absolute()) {
-   if (!absolute_enabled) {
-   sdl_hide_cursor();
-   if (gui_grab) {
-   sdl_grab_end();
-   }
-   absolute_enabled = 1;
-   }
-
dx = x * 0x7FFF / (width - 1);
dy = y * 0x7FFF / (height - 1);
-} else if (absolute_enabled) {
-   sdl_show_cursor();
-   absolute_enabled = 0;
 } else if (guest_cursor) {
 x -= guest_x;
 y -= guest_y;
@@ -875,6 +881,9 @@ void sdl_display_init(DisplayState *ds, int full_screen, 
int no_frame)
 dpy_resize(ds);
 }
 
+mouse_mode_notifier.notify = sdl_mouse_mode_change;
+qemu_add_mouse_mode_change_notifier(mouse_mode_notifier);
+
 sdl_update_caption();
 SDL_EnableKeyRepeat(250, 50);
 gui_grab = 0;
-- 
1.6.5.2

[Qemu-devel] [PATCH 6/7] input: make vnc use mouse mode notifiers

2010-03-15 Thread Anthony Liguori

When we switch to absolute mode, we send out a notification (if the client
supports it).  Today, we only send this notification when the client sends us
a mouse event and we're in the wrong mode.

Signed-off-by: Anthony Liguori aligu...@us.ibm.com
---
 vnc.c |   13 -
 vnc.h |2 ++
 2 files changed, 10 insertions(+), 5 deletions(-)

diff --git a/vnc.c b/vnc.c
index 7ce73dc..14ded6c 100644
--- a/vnc.c
+++ b/vnc.c
@@ -1110,6 +1110,7 @@ static void vnc_disconnect_finish(VncState *vs)
 dcl-idle = 1;
 }
 
+qemu_remove_mouse_mode_change_notifier(vs-mouse_mode_notifier);
 vnc_remove_timer(vs-vd);
 qemu_remove_led_event_handler(vs-led);
 qemu_free(vs);
@@ -1427,8 +1428,11 @@ static void client_cut_text(VncState *vs, size_t len, 
uint8_t *text)
 {
 }
 
-static void check_pointer_type_change(VncState *vs, int absolute)
+static void check_pointer_type_change(Notifier *notifier)
 {
+VncState *vs = container_of(notifier, VncState, mouse_mode_notifier);
+int absolute = kbd_mouse_is_absolute();
+
 if (vnc_has_feature(vs, VNC_FEATURE_POINTER_TYPE_CHANGE)  vs-absolute 
!= absolute) {
 vnc_write_u8(vs, 0);
 vnc_write_u8(vs, 0);
@@ -1474,8 +1478,6 @@ static void pointer_event(VncState *vs, int button_mask, 
int x, int y)
 vs-last_x = x;
 vs-last_y = y;
 }
-
-check_pointer_type_change(vs, kbd_mouse_is_absolute());
 }
 
 static void reset_keys(VncState *vs)
@@ -1814,8 +1816,6 @@ static void set_encodings(VncState *vs, int32_t 
*encodings, size_t n_encodings)
 break;
 }
 }
-
-check_pointer_type_change(vs, kbd_mouse_is_absolute());
 }
 
 static void set_pixel_conversion(VncState *vs)
@@ -2431,6 +2431,9 @@ static void vnc_connect(VncDisplay *vd, int csock)
 reset_keys(vs);
 vs-led = qemu_add_led_event_handler(kbd_leds, vs);
 
+vs-mouse_mode_notifier.notify = check_pointer_type_change;
+qemu_add_mouse_mode_change_notifier(vs-mouse_mode_notifier);
+
 vnc_init_timer(vd);
 
 /* vs might be free()ed here */
diff --git a/vnc.h b/vnc.h
index 0fc89bd..8052000 100644
--- a/vnc.h
+++ b/vnc.h
@@ -167,6 +167,8 @@ struct VncState
 Buffer zlib_tmp;
 z_stream zlib_stream[4];
 
+Notifier mouse_mode_notifier;
+
 QTAILQ_ENTRY(VncState) next;
 };
 
-- 
1.6.5.2

[Qemu-devel] [PATCH 4/7] Add notifier for mouse mode changes

2010-03-15 Thread Anthony Liguori

Right now, DisplayState clients rely on polling the mouse mode to determine
when the device is changed to an absolute device.  Use a notification list to
add an explicit notification.

Signed-off-by: Anthony Liguori aligu...@us.ibm.com
---
 console.h |3 +++
 input.c   |   37 -
 2 files changed, 39 insertions(+), 1 deletions(-)

diff --git a/console.h b/console.h
index f3c619f..6def115 100644
--- a/console.h
+++ b/console.h
@@ -3,6 +3,7 @@
 
 #include qemu-char.h
 #include qdict.h
+#include notify.h
 
 /* keyboard/mouse support */
 
@@ -56,6 +57,8 @@ void kbd_mouse_event(int dx, int dy, int dz, int 
buttons_state);
 
 /* Does the current mouse generate absolute events */
 int kbd_mouse_is_absolute(void);
+void qemu_add_mouse_mode_change_notifier(Notifier *notify);
+void qemu_remove_mouse_mode_change_notifier(Notifier *notify);
 
 /* Of all the mice, is there one that generates absolute events */
 int kbd_mouse_has_absolute(void);
diff --git a/input.c b/input.c
index 8d5a14d..c956e06 100644
--- a/input.c
+++ b/input.c
@@ -28,12 +28,13 @@
 #include console.h
 #include qjson.h
 
-
 static QEMUPutKBDEvent *qemu_put_kbd_event;
 static void *qemu_put_kbd_event_opaque;
 static QTAILQ_HEAD(, QEMUPutLEDEntry) led_handlers = 
QTAILQ_HEAD_INITIALIZER(led_handlers);
 static QTAILQ_HEAD(, QEMUPutMouseEntry) mouse_handlers =
 QTAILQ_HEAD_INITIALIZER(mouse_handlers);
+static NotifierList mouse_mode_notifiers = 
+NOTIFIER_LIST_INITIALIZER(mouse_mode_notifiers);
 
 void qemu_add_kbd_event_handler(QEMUPutKBDEvent *func, void *opaque)
 {
@@ -41,6 +42,24 @@ void qemu_add_kbd_event_handler(QEMUPutKBDEvent *func, void 
*opaque)
 qemu_put_kbd_event = func;
 }
 
+static void check_mode_change(void)
+{
+static int current_is_absolute, current_has_absolute;
+int is_absolute;
+int has_absolute;
+
+is_absolute = kbd_mouse_is_absolute();
+has_absolute = kbd_mouse_has_absolute();
+
+if (is_absolute != current_is_absolute ||
+has_absolute != current_has_absolute) {
+notifier_list_notify(mouse_mode_notifiers);
+}
+
+current_is_absolute = is_absolute;
+current_has_absolute = has_absolute;
+}
+
 QEMUPutMouseEntry *qemu_add_mouse_event_handler(QEMUPutMouseEvent *func,
 void *opaque, int absolute,
 const char *name)
@@ -58,6 +77,8 @@ QEMUPutMouseEntry 
*qemu_add_mouse_event_handler(QEMUPutMouseEvent *func,
 
 QTAILQ_INSERT_TAIL(mouse_handlers, s, node);
 
+check_mode_change();
+
 return s;
 }
 
@@ -75,6 +96,8 @@ void qemu_remove_mouse_event_handler(QEMUPutMouseEntry *entry)
 
 qemu_free(entry-qemu_put_mouse_event_name);
 qemu_free(entry);
+
+check_mode_change();
 }
 
 QEMUPutLEDEntry *qemu_add_led_event_handler(QEMUPutLEDEvent *func,
@@ -256,4 +279,16 @@ void do_mouse_set(Monitor *mon, const QDict *qdict)
 if (!found) {
 monitor_printf(mon, Mouse at given index not found\n);
 }
+
+check_mode_change();
+}
+
+void qemu_add_mouse_mode_change_notifier(Notifier *notify)
+{
+notifier_list_add(mouse_mode_notifiers, notify);
+}
+
+void qemu_remove_mouse_mode_change_notifier(Notifier *notify)
+{
+notifier_list_remove(mouse_mode_notifiers, notify);
 }
-- 
1.6.5.2

[Qemu-devel] [PATCH 5/7] Expose whether a mouse is an absolute device via QMP and the human monitor.

2010-03-15 Thread Anthony Liguori

For QMP, we just add an attribute which is backwards compatible.  For the human
monitor, we add (absolute) to the end of the line.

Signed-off-by: Anthony Liguori aligu...@us.ibm.com
---
 input.c |   18 --
 1 files changed, 12 insertions(+), 6 deletions(-)

diff --git a/input.c b/input.c
index c956e06..8f0941e 100644
--- a/input.c
+++ b/input.c
@@ -195,9 +195,10 @@ static void info_mice_iter(QObject *data, void *opaque)
 Monitor *mon = opaque;
 
 mouse = qobject_to_qdict(data);
-monitor_printf(mon, %c Mouse #% PRId64 : %s\n,
+monitor_printf(mon, %c Mouse #% PRId64 : %s%s\n,
   (qdict_get_bool(mouse, current) ? '*' : ' '),
-  qdict_get_int(mouse, index), qdict_get_str(mouse, name));
+   qdict_get_int(mouse, index), qdict_get_str(mouse, name),
+   qdict_get_bool(mouse, absolute) ?  (absolute) : );
 }
 
 void do_info_mice_print(Monitor *mon, const QObject *data)
@@ -224,11 +225,12 @@ void do_info_mice_print(Monitor *mon, const QObject *data)
  * - name: mouse's name
  * - index: mouse's index
  * - current: true if this mouse is receiving events, false otherwise
+ * - absolute: true if the mouse generates absolute input events
  *
  * Example:
  *
- * [ { name: QEMU Microsoft Mouse, index: 0, current: false },
- *   { name: QEMU PS/2 Mouse, index: 1, current: true } ]
+ * [ { name: QEMU Microsoft Mouse, index: 0, current: false, 
absolute: false },
+ *   { name: QEMU PS/2 Mouse, index: 1, current: true, absolute: 
true } ]
  */
 void do_info_mice(Monitor *mon, QObject **ret_data)
 {
@@ -246,10 +248,14 @@ void do_info_mice(Monitor *mon, QObject **ret_data)
 
 QTAILQ_FOREACH(cursor, mouse_handlers, node) {
 QObject *obj;
-obj = qobject_from_jsonf({ 'name': %s, 'index': %d, 'current': %i },
+obj = qobject_from_jsonf({ 'name': %s,
+   'index': %d,
+   'current': %i,
+   'absolute': %i },
  cursor-qemu_put_mouse_event_name,
  cursor-index,
- cursor-index == current);
+ cursor-index == current,
+ !!cursor-qemu_put_mouse_event_absolute);
 qlist_append_obj(mice_list, obj);
 }
 
-- 
1.6.5.2

Re: [Qemu-devel] [PATCH 1/7] Add support for generic notifier lists

2010-03-15 Thread Anthony Liguori


On 03/15/2010 03:31 PM, Anthony Liguori wrote:

On 03/11/2010 08:19 AM, Paul Brook wrote:

On 03/11/2010 06:57 AM, Paul Brook wrote:

+struct QEMUNotifier
+{
+void (*notify)(QEMUNotifier *notifier);
+};

I suggest combining this with QEMUBH.

I take it your not opposed to converting QEMUBH to be a QEMUNotifier?
If so, I'm happy to do it.

It's unclear to me why you've invented a new thing in the first place.


It's a better approximation of the command pattern because the command 
is a single object as opposed to a tuple.  Because the command is an 
object, you can also do things like binding.  For instance:


typedef struct IONotifier
{
   Notifier parent;
   void (*notify)(IONotifier *notifier, int fd);
   int fd;
} IONotifier;

It's been a long day...

Which now gives you a notifier that has an fd bound to it's second 
argument (which is pretty useful for IO dispatch).


You can do this with a tuple representation, but it gets awkward.  One 
could argue for formalizing the tuple as a struct but extending by 
nesting becomes more complicated.  Also, for the most part, you 
already have a state for the command and embedding the object means 
less dynamic memory allocation and less code to handle that.


Regards,

Anthony Liguori



Regards,

Anthony Liguori

Re: [Qemu-devel] Ideas wiki for GSoC 2010

2010-03-15 Thread Joerg Roedel

On Mon, Mar 15, 2010 at 02:25:41PM +0200, Avi Kivity wrote:
 On 03/10/2010 11:30 PM, Luiz Capitulino wrote:
   Hi there,

   Our wiki page for the Summer of Code 2010 is doing quite well:

 http://wiki.qemu.org/Google_Summer_of_Code_2010


 I will add another project - iommu emulation.  Could be very useful for  
 doing device assignment to nested guests, which could make testing a lot  
 easier.

Good idea. If there is interest I could help to mentor this project.

Joerg

Re: [Qemu-devel] Ideas wiki for GSoC 2010

2010-03-15 Thread Muli Ben-Yehuda

On Mon, Mar 15, 2010 at 02:25:41PM +0200, Avi Kivity wrote:
 On 03/10/2010 11:30 PM, Luiz Capitulino wrote:

   Hi there,
 
   Our wiki page for the Summer of Code 2010 is doing quite well:
 
 http://wiki.qemu.org/Google_Summer_of_Code_2010
 
 I will add another project - iommu emulation.  Could be very useful
 for doing device assignment to nested guests, which could make
 testing a lot easier.

Our experiments show that nested device assignment is pretty much
required for I/O performance in nested scenarios.

Cheers,
Muli

Re: [Qemu-devel] Ideas wiki for GSoC 2010

2010-03-15 Thread Joerg Roedel

On Mon, Mar 15, 2010 at 05:53:13AM -0700, Muli Ben-Yehuda wrote:
 On Mon, Mar 15, 2010 at 02:25:41PM +0200, Avi Kivity wrote:
  On 03/10/2010 11:30 PM, Luiz Capitulino wrote:
 
Hi there,
  
Our wiki page for the Summer of Code 2010 is doing quite well:
  
  http://wiki.qemu.org/Google_Summer_of_Code_2010
  
  I will add another project - iommu emulation.  Could be very useful
  for doing device assignment to nested guests, which could make
  testing a lot easier.
 
 Our experiments show that nested device assignment is pretty much
 required for I/O performance in nested scenarios.

Really? I did a small test with virtio-blk in a nested guest (disk read
with dd, so not a real benchmark) and got a reasonable read-performance
of around 25MB/s from the disk in the l2-guest.

Joerg

Re: [Qemu-devel] Ideas wiki for GSoC 2010

2010-03-15 Thread Joerg Roedel

On Mon, Mar 15, 2010 at 03:11:42PM +0200, Avi Kivity wrote:
 On 03/15/2010 03:03 PM, Joerg Roedel wrote:

 I will add another project - iommu emulation.  Could be very useful
 for doing device assignment to nested guests, which could make
 testing a lot easier.

 Our experiments show that nested device assignment is pretty much
 required for I/O performance in nested scenarios.
  
 Really? I did a small test with virtio-blk in a nested guest (disk read
 with dd, so not a real benchmark) and got a reasonable read-performance
 of around 25MB/s from the disk in the l2-guest.



 Your guest wasn't doing a zillion VMREADs and VMWRITEs every exit.

 I plan to reduce VMREAD/VMWRITE overhead for kvm, but not much we can do  
 for other guests.

Does it matter for the ept-on-ept case? The initial patchset of
nested-vmx implemented it and they reported a performance drop of around
12% between levels which is reasonable. So I expected the loss of
io-performance for l2 also reasonable in this case. My small measurement
was also done using npt-on-npt.

Joerg

Re: [Qemu-devel] Ideas wiki for GSoC 2010

2010-03-15 Thread Muli Ben-Yehuda

On Mon, Mar 15, 2010 at 02:03:11PM +0100, Joerg Roedel wrote:

 On Mon, Mar 15, 2010 at 05:53:13AM -0700, Muli Ben-Yehuda wrote:

  On Mon, Mar 15, 2010 at 02:25:41PM +0200, Avi Kivity wrote:

   On 03/10/2010 11:30 PM, Luiz Capitulino wrote:
  
 Hi there,
   
 Our wiki page for the Summer of Code 2010 is doing quite well:
   
   http://wiki.qemu.org/Google_Summer_of_Code_2010
   
   I will add another project - iommu emulation.  Could be very
   useful for doing device assignment to nested guests, which could
   make testing a lot easier.
  
  Our experiments show that nested device assignment is pretty much
  required for I/O performance in nested scenarios.
 
 Really? I did a small test with virtio-blk in a nested guest (disk
 read with dd, so not a real benchmark) and got a reasonable
 read-performance of around 25MB/s from the disk in the l2-guest.

Netperf running in L1 with direct access: ~950 Mbps throughput with
25% CPU utilization. Netperf running in L2 with virtio between L2 and
L1 and direct assignment between L1 and L0: roughly the same
throughput, but over 90% CPU utilization! Now extrapolate to 10GbE.

Cheers,
Muli

Re: [Qemu-devel] Ideas wiki for GSoC 2010

2010-03-15 Thread Joerg Roedel

On Mon, Mar 15, 2010 at 08:14:29AM -0500, Anthony Liguori wrote:
 On 03/15/2010 07:42 AM, Avi Kivity wrote:
 On 03/15/2010 02:38 PM, Joerg Roedel wrote:
 On Mon, Mar 15, 2010 at 02:25:41PM +0200, Avi Kivity wrote:
 On 03/10/2010 11:30 PM, Luiz Capitulino wrote:
Hi there,

Our wiki page for the Summer of Code 2010 is doing quite well:

 http://wiki.qemu.org/Google_Summer_of_Code_2010

 I will add another project - iommu emulation.  Could be very useful for
 doing device assignment to nested guests, which could make testing 
 a lot
 easier.
 Good idea. If there is interest I could help to mentor this project.

 Thanks.  I volunteered Anthony, but he may be a little overcommitted.

 Joerg, feel free to put your name against too.

[x] Done.

Thanks,
Joerg

Re: [Qemu-devel] Ideas wiki for GSoC 2010

2010-03-15 Thread Anthony Liguori


On 03/15/2010 10:06 AM, Avi Kivity wrote:

On 03/15/2010 03:23 PM, Anthony Liguori wrote:

On 03/15/2010 08:11 AM, Avi Kivity wrote:

On 03/15/2010 03:03 PM, Joerg Roedel wrote:



I will add another project - iommu emulation.  Could be very useful
for doing device assignment to nested guests, which could make
testing a lot easier.

Our experiments show that nested device assignment is pretty much
required for I/O performance in nested scenarios.
Really? I did a small test with virtio-blk in a nested guest (disk 
read
with dd, so not a real benchmark) and got a reasonable 
read-performance

of around 25MB/s from the disk in the l2-guest.



Your guest wasn't doing a zillion VMREADs and VMWRITEs every exit.

I plan to reduce VMREAD/VMWRITE overhead for kvm, but not much we 
can do for other guests.


VMREAD/VMWRITEs are generally optimized by hypervisors as they tend 
to be costly.  KVM is a bit unusual in terms of how many times the 
instructions are executed per exit.


Do you know offhand of any unnecessary read/writes?  There's 
update_cr8_intercept(), but on normal exits, I don't see what else we 
can remove.


Yeah, there are a number of examples.

vmcs_clear_bits() and vmcs_set_bits() read a field of the VMCS and then 
immediately writes it.  This is unnecessary as the same information 
could be kept in a shadow variable.  In vmx_fpu_activate, we call 
vmcs_clear_bits() followed immediately by vmcs_set_bits(). which means 
we're reading GUEST_CR0 twice and writing it twice.


vmx_get_rflags() reads from the VMCS and we frequently call get_rflags() 
followed by a set_rflags() to update a bit.  We also don't cache the 
value between calls and there's a few spots in the code that make 
multiple calls.


Regards,

Anthony Liguori

Re: [Qemu-devel] Ideas wiki for GSoC 2010

2010-03-15 Thread Avi Kivity


On 03/16/2010 03:21 AM, Anthony Liguori wrote:

On 03/15/2010 10:06 AM, Avi Kivity wrote:

On 03/15/2010 03:23 PM, Anthony Liguori wrote:

On 03/15/2010 08:11 AM, Avi Kivity wrote:

On 03/15/2010 03:03 PM, Joerg Roedel wrote:



I will add another project - iommu emulation.  Could be very useful
for doing device assignment to nested guests, which could make
testing a lot easier.

Our experiments show that nested device assignment is pretty much
required for I/O performance in nested scenarios.
Really? I did a small test with virtio-blk in a nested guest (disk 
read
with dd, so not a real benchmark) and got a reasonable 
read-performance

of around 25MB/s from the disk in the l2-guest.



Your guest wasn't doing a zillion VMREADs and VMWRITEs every exit.

I plan to reduce VMREAD/VMWRITE overhead for kvm, but not much we 
can do for other guests.


VMREAD/VMWRITEs are generally optimized by hypervisors as they tend 
to be costly.  KVM is a bit unusual in terms of how many times the 
instructions are executed per exit.


Do you know offhand of any unnecessary read/writes?  There's 
update_cr8_intercept(), but on normal exits, I don't see what else we 
can remove.


Yeah, there are a number of examples.

vmcs_clear_bits() and vmcs_set_bits() read a field of the VMCS and 
then immediately writes it.  This is unnecessary as the same 
information could be kept in a shadow variable.  In vmx_fpu_activate, 
we call vmcs_clear_bits() followed immediately by vmcs_set_bits(). 
which means we're reading GUEST_CR0 twice and writing it twice.


This should be much better these days (2.6.34-rc1) as vmx_fpu_activate() 
is called at most once per heavyweight exit (and I have evil plans to 
reduce it even further).  Still, that code should be optimized.


vmx_get_rflags() reads from the VMCS and we frequently call 
get_rflags() followed by a set_rflags() to update a bit.  We also 
don't cache the value between calls and there's a few spots in the 
code that make multiple calls.


We definitely should cache that (and segment access from the emulator as 
well).  But I'd have thought this to be relatively infrequent.  At least 
with Linux, using x2apic and virtio allows you to eliminate most 
emulator access, if you have npt or ept.


--
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.

68 matches

Mail list logo