[Qemu-devel] Training program for your software

2011-10-07 Thread OCS Partners

  
  







Dear Sir or Madam,
 
I noticed that your company provides Open
  Source software, but does not appear to have training for this
  software.  My organization,
  One Course Source, is a training development and instructional
  staffing company, specializing in Open Source technologies.
 
Having a solid training program is important
  for any software program to be successful.  Customers need to know that
  they can receive good training when they opt to switch to your
  software product.  
 
One Course Source provides the following
  services:
 

  Instructors: Our instructors learn your product and can teach from
  your existing course materials.

  Courseware design: If you need courses developed, our instructors are
  also courseware developers.  We can either develop course materials
  from scratch or edit your existing materials.

  Sales: For a commission on sales, we can include your classes as part
  of our sales and marketing strategy.

 
I would be interested in opening a dialog
  regarding providing these services to your organization.  If you aren't the correct
  person for this, can you please forward this to the person in your
  organization who is responsible for this sort of business
  relationship?
 
Sincerely,
 
Bob Smith
  One Course Source

  




[Qemu-devel] [Bug 722311] Re: Segmentation fault if started without -enable-kvm parameter

2011-10-07 Thread Bug Reporter
The problem reported above was the same up to and including qemu 0.15.0.
Meanwhile I found this on the LinuxFromScratch (LFS) bug tracker:

  "Glibc-2.14 causes segfaults in SDL",
http://wiki.linuxfromscratch.org/lfs/ticket/2920

After applying their patch to GLIBC, qemu finally works again on the
Pentium 4. As far as I am concerned, this bug report can now be closed.


** Bug watch added: Linux From Scratch Trac #2920
   http://wiki.linuxfromscratch.org/lfs/ticket/2920

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/722311

Title:
  Segmentation fault if started without -enable-kvm parameter

Status in QEMU:
  New

Bug description:
  I start qemu (Linux) from the same USB memory stick on several
  computers. Up to and including qemu 0.12.5, I could use or not use
  qemu's "-enable-kvm" command line parameter as appropriate for the
  hardware, and qemu would run. In contrast, qemu 0.13.0 and 0.14.0
  segfault if started without "-enable-kvm". I get a black window
  appearing for fractions of a second, disappearing immediately, and
  then the error message "Segmentation fault".

  Hardware: Pentium 4, and Core 2 Duo.
  Command line: either "qemu" or "qemu -enable-kvm" (after manually loading
  the kvm-intel module on the Core 2 Duo).
  Reproducible: always.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/722311/+subscriptions



Re: [Qemu-devel] QEMU + ARM11MPCore

2011-10-07 Thread Antti P Miettinen
"TusharK"  writes:
> Hello,
> I tried executing QEMU (realview-smp ARM11MPCore) with Linux kernel 2.6.39.3,
> but it failed. Kernel itself is not getting decompressed. I compiled the
> kernel with realview-smp_config and build was successful. Can you please let
> me know how can test QEMU + ARM11MPcore combination.

In 2009 at least the following kernel options seemed relevant:

CONFIG_MACH_REALVIEW_EB=y
CONFIG_REALVIEW_EB_ARM11MP=y
CONFIG_REALVIEW_EB_ARM11MP_REVB=y
# CONFIG_REALVIEW_HIGH_PHYS_OFFSET is not set

-- 
Antti P Miettinen
http://www.iki.fi/~ananaza/




Re: [Qemu-devel] In-kernel emulation

2011-10-07 Thread Xin Tong
I am speculating here, so I could be wrong.


MMU: in the kernel you can manipulate the EPT register, which is normally
used by KVM (introduced by Intel VMX or AMD SVM).

I/O: network packets, specifically, do not need to go to user space; they
can be handed to the virtual machine directly.


Thanks


Xin


On Fri, Oct 7, 2011 at 3:10 PM, 陳韋任  wrote:

> > guest isa is different from host isa in this case.
> >
> > Xin
> >
> > On Fri, Oct 7, 2011 at 12:33 PM, 陳韋任  wrote:
> >
> > > > I am wondering that whether there are any attempts (product-oriented or
> > > > research-based ) to push QEMU into the Linux kernel to speed up emulation.
> > > > If the emulation is running in the kernel, there are some resources it can
> > > > manipulate to speed up emulation in comparison to the when it is running as
> > > > a user process, i.e. MMU. Also, IO emulation may become faster, because 2
>
>   I would like to know how you can leverage linux kernel to speed up
> MMU/IO emulation if guest and host are different ISAs. :)
>
> > > > kernel enters and exits are incurred for a network packet if QEMU is running
> > > > as a user process. If QEMU is running in the kernel, only 1 kernel enter and
> > > > exit are needed.  Any suggestions or discussions are welcome.
> > >
> > >   You want to use QEMU to emulate guest ISA different from the host?
> > > If the ISA of guest and host is the same, then KVM is enough, I think.
>
>
> --
> Wei-Ren Chen (陳韋任)
> Computer Systems Lab, Institute of Information Science,
> Academia Sinica, Taiwan (R.O.C.)
> Tel:886-2-2788-3799 #1667
>


[Qemu-devel] [PATCH] ui/vnc: Fix use of free() instead of g_free()

2011-10-07 Thread Stefan Weil
Please note that mechlist still uses malloc / strdup / free.

Signed-off-by: Stefan Weil 
---
 ui/vnc-auth-sasl.c   |    8 ++++----
 ui/vnc-enc-hextile.c |    4 ++--
 ui/vnc-tls.c         |    2 +-
 ui/vnc.c             |    8 ++++----
 4 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/ui/vnc-auth-sasl.c b/ui/vnc-auth-sasl.c
index e96095a..23b1bf5 100644
--- a/ui/vnc-auth-sasl.c
+++ b/ui/vnc-auth-sasl.c
@@ -34,7 +34,7 @@ void vnc_sasl_client_cleanup(VncState *vs)
 vs->sasl.runSSF = vs->sasl.waitWriteSSF = vs->sasl.wantSSF = 0;
 vs->sasl.encodedLength = vs->sasl.encodedOffset = 0;
 vs->sasl.encoded = NULL;
-free(vs->sasl.username);
+g_free(vs->sasl.username);
 free(vs->sasl.mechlist);
 vs->sasl.username = vs->sasl.mechlist = NULL;
 sasl_dispose(&vs->sasl.conn);
@@ -506,7 +506,7 @@ void start_auth_sasl(VncState *vs)
 goto authabort;
 
 if (!(remoteAddr = vnc_socket_remote_addr("%s;%s", vs->csock))) {
-free(localAddr);
+g_free(localAddr);
 goto authabort;
 }
 
@@ -518,8 +518,8 @@ void start_auth_sasl(VncState *vs)
   NULL, /* Callbacks, not needed */
   SASL_SUCCESS_DATA,
   &vs->sasl.conn);
-free(localAddr);
-free(remoteAddr);
+g_free(localAddr);
+g_free(remoteAddr);
 localAddr = remoteAddr = NULL;
 
 if (err != SASL_OK) {
diff --git a/ui/vnc-enc-hextile.c b/ui/vnc-enc-hextile.c
index d2905c8..c860dbb 100644
--- a/ui/vnc-enc-hextile.c
+++ b/ui/vnc-enc-hextile.c
@@ -80,8 +80,8 @@ int vnc_hextile_send_framebuffer_update(VncState *vs, int x,
   last_bg, last_fg, &has_bg, &has_fg);
 }
 }
-free(last_fg);
-free(last_bg);
+g_free(last_fg);
+g_free(last_bg);
 
 return 1;
 }
diff --git a/ui/vnc-tls.c b/ui/vnc-tls.c
index ffbd172..3aaa939 100644
--- a/ui/vnc-tls.c
+++ b/ui/vnc-tls.c
@@ -413,7 +413,7 @@ void vnc_tls_client_cleanup(struct VncState *vs)
 vs->tls.session = NULL;
 }
 vs->tls.wiremode = VNC_WIREMODE_CLEAR;
-free(vs->tls.dname);
+g_free(vs->tls.dname);
 }
 
 
diff --git a/ui/vnc.c b/ui/vnc.c
index fc3a612..3107918 100644
--- a/ui/vnc.c
+++ b/ui/vnc.c
@@ -2880,7 +2880,7 @@ int vnc_display_open(DisplayState *ds, const char *display)
 if ((saslErr = sasl_server_init(NULL, "qemu")) != SASL_OK) {
 fprintf(stderr, "Failed to initialize SASL auth %s",
 sasl_errstring(saslErr, NULL, NULL));
-free(vs->display);
+g_free(vs->display);
 vs->display = NULL;
 return -1;
 }
@@ -2894,7 +2894,7 @@ int vnc_display_open(DisplayState *ds, const char *display)
 else
 vs->lsock = inet_connect(display, SOCK_STREAM);
 if (-1 == vs->lsock) {
-free(vs->display);
+g_free(vs->display);
 vs->display = NULL;
 return -1;
 } else {
@@ -2915,10 +2915,10 @@ int vnc_display_open(DisplayState *ds, const char *display)
 vs->lsock = inet_listen(display, dpy, 256, SOCK_STREAM, 5900);
 }
 if (-1 == vs->lsock) {
-free(dpy);
+g_free(dpy);
 return -1;
 } else {
-free(vs->display);
+g_free(vs->display);
 vs->display = dpy;
 }
 }
-- 
1.7.2.5




Re: [Qemu-devel] In-kernel emulation

2011-10-07 Thread 陳韋任
> guest isa is different from host isa in this case.
> 
> Xin
> 
> On Fri, Oct 7, 2011 at 12:33 PM, 陳韋任  wrote:
> 
> > > I am wondering that whether there are any attempts (product-oriented or
> > > research-based ) to push QEMU into the Linux kernel to speed up emulation.
> > > If the emulation is running in the kernel, there are some resources it can
> > > manipulate to speed up emulation in comparison to the when it is running as
> > > a user process, i.e. MMU. Also, IO emulation may become faster, because 2

  I would like to know how you can leverage linux kernel to speed up
MMU/IO emulation if guest and host are different ISAs. :)

> > > kernel enters and exits are incurred for a network packet if QEMU is running
> > > as a user process. If QEMU is running in the kernel, only 1 kernel enter and
> > > exit are needed.  Any suggestions or discussions are welcome.
> >
> >   You want to use QEMU to emulate guest ISA different from the host?
> > If the ISA of guest and host is the same, then KVM is enough, I think.


-- 
Wei-Ren Chen (陳韋任)
Computer Systems Lab, Institute of Information Science,
Academia Sinica, Taiwan (R.O.C.)
Tel:886-2-2788-3799 #1667



Re: [Qemu-devel] [PATCH] tcg: Introduce TCGReg for all TCG hosts (fixes build for s390 hosts)

2011-10-07 Thread Stefan Weil

On 07.10.2011 20:13, Richard Henderson wrote:
> On 10/07/2011 11:00 AM, Stefan Weil wrote:
> > +#define TCGReg int /* TODO: Remove this line when tcg-target.c uses TCGReg. */
>
> Nack. This is too ugly to live.
>
> How hard can it be to simply change the prototype in each
> of the backend files instead? Yes, full conversion to
> TCGReg is desirable, but this is not C++ -- integers and
> enums are interchangeable without casts.
>
> r~


The goal of my patch is to get code which compiles again
on all hosts with minimum risk and which is a base for further
improvements.

I cannot run build tests for all possible hosts, and even
changing 4 prototypes for each host is a risk when it is not
tested. Yes, you can review the changes, some developers can
try builds, but that takes a lot of time.

As soon as my patch is applied, it is possible to add
more TCGReg usage to tcg.c, s390/tcg-target.c and all
other TCG targets in independent patches.

I'm just preparing patches for tcg.c and i386/tcg-target.*.

But let me repeat: these new patches need a good review
which will take some time. The bug fix should be applied soon.
Maybe you can change your mind and send an ack.

- Stefan




Re: [Qemu-devel] [PATCH] tcg: Introduce TCGReg for all TCG hosts (fixes build for s390 hosts)

2011-10-07 Thread Richard Henderson
On 10/07/2011 11:00 AM, Stefan Weil wrote:
> +#define TCGReg int /* TODO: Remove this line when tcg-target.c uses TCGReg. */

Nack.  This is too ugly to live.

How hard can it be to simply change the prototype in each
of the backend files instead?  Yes, full conversion to
TCGReg is desirable, but this is not C++ -- integers and
enums are interchangeable without casts.


r~
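
To illustrate the point about C, here is a minimal sketch. It uses a
hypothetical three-register enum rather than any real backend's register
list, and the function names are made up for the example. It only shows
that an int can be passed where a TCGReg is expected, and an enum value
where an int is expected, without a cast.

```c
/* Hypothetical three-register target, for illustration only. */
typedef enum {
    TCG_REG_R0 = 0,
    TCG_REG_R1,
    TCG_REG_R2,
} TCGReg;

/* A backend function whose prototype has been converted to TCGReg. */
int reg_number(TCGReg reg)
{
    return reg;             /* enum -> int, no cast needed in C */
}

/* A caller that still keeps register numbers in plain ints. */
int caller_with_int(void)
{
    int r = 2;
    return reg_number(r);   /* int -> enum, no cast needed in C */
}

/* A function still declared with int accepts an enum value unchanged. */
int legacy_reg_number(int reg)
{
    return reg;
}

int caller_with_enum(void)
{
    return legacy_reg_number(TCG_REG_R1);
}
```

The same source would be rejected by a C++ compiler at the int -> enum
call, which is the distinction being drawn above.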



[Qemu-devel] [PATCH] tcg: Introduce TCGReg for all TCG hosts (fixes build for s390 hosts)

2011-10-07 Thread Stefan Weil
s390 already uses TCGReg - this patch defines TCGReg
for all other TCG targets.

The targets still don't use TCGReg, therefore a temporary define
maps 'TCGReg' to 'int' for all unfinished targets.

This define allows usage of TCGReg in tcg.c thus fixing some forward
declarations which had broken compilation on s390 hosts since commit
c0ad3001bf12292b137b05e1c4643f31c6b0a727.

Signed-off-by: Stefan Weil 
---
 tcg/arm/tcg-target.h   |    6 ++++--
 tcg/hppa/tcg-target.h  |    6 ++++--
 tcg/i386/tcg-target.h  |    6 ++++--
 tcg/ia64/tcg-target.h  |    7 +++++--
 tcg/mips/tcg-target.h  |    6 ++++--
 tcg/ppc/tcg-target.h   |    6 ++++--
 tcg/ppc64/tcg-target.h |    6 ++++--
 tcg/sparc/tcg-target.h |    6 ++++--
 tcg/tcg.c              |    8 ++++----
 9 files changed, 37 insertions(+), 20 deletions(-)

diff --git a/tcg/arm/tcg-target.h b/tcg/arm/tcg-target.h
index 33afd97..047b6d0 100644
--- a/tcg/arm/tcg-target.h
+++ b/tcg/arm/tcg-target.h
@@ -27,7 +27,7 @@
 #undef TCG_TARGET_WORDS_BIGENDIAN
 #undef TCG_TARGET_STACK_GROWSUP
 
-enum {
+typedef enum {
 TCG_REG_R0 = 0,
 TCG_REG_R1,
 TCG_REG_R2,
@@ -44,7 +44,9 @@ enum {
 TCG_REG_R13,
 TCG_REG_R14,
 TCG_REG_PC,
-};
+} TCGReg;
+
+#define TCGReg int /* TODO: Remove this line when tcg-target.c uses TCGReg. */
 
 #define TCG_TARGET_NB_REGS 16
 
diff --git a/tcg/hppa/tcg-target.h b/tcg/hppa/tcg-target.h
index ec9a7bf..8c220ae 100644
--- a/tcg/hppa/tcg-target.h
+++ b/tcg/hppa/tcg-target.h
@@ -32,7 +32,7 @@
 
 #define TCG_TARGET_NB_REGS 32
 
-enum {
+typedef enum {
 TCG_REG_R0 = 0,
 TCG_REG_R1,
 TCG_REG_RP,
@@ -65,7 +65,9 @@ enum {
 TCG_REG_RET1,
 TCG_REG_SP,
 TCG_REG_R31,
-};
+} TCGReg;
+
+#define TCGReg int /* TODO: Remove this line when tcg-target.c uses TCGReg. */
 
 #define TCG_CT_CONST_0    0x0100
 #define TCG_CT_CONST_S5   0x0200
diff --git a/tcg/i386/tcg-target.h b/tcg/i386/tcg-target.h
index b9c9d4e..0466216 100644
--- a/tcg/i386/tcg-target.h
+++ b/tcg/i386/tcg-target.h
@@ -36,7 +36,7 @@
 # define TCG_TARGET_NB_REGS 8
 #endif
 
-enum {
+typedef enum {
 TCG_REG_EAX = 0,
 TCG_REG_ECX,
 TCG_REG_EDX,
@@ -64,7 +64,9 @@ enum {
 TCG_REG_RBP = TCG_REG_EBP,
 TCG_REG_RSI = TCG_REG_ESI,
 TCG_REG_RDI = TCG_REG_EDI,
-};
+} TCGReg;
+
+#define TCGReg int /* TODO: Remove this line when tcg-target.c uses TCGReg. */
 
 #define TCG_CT_CONST_S32 0x100
 #define TCG_CT_CONST_U32 0x200
diff --git a/tcg/ia64/tcg-target.h b/tcg/ia64/tcg-target.h
index 578cf29..fc83559 100644
--- a/tcg/ia64/tcg-target.h
+++ b/tcg/ia64/tcg-target.h
@@ -26,7 +26,8 @@
 
 /* We only map the first 64 registers */
 #define TCG_TARGET_NB_REGS 64
-enum {
+
+typedef enum {
 TCG_REG_R0 = 0,
 TCG_REG_R1,
 TCG_REG_R2,
@@ -91,7 +92,9 @@ enum {
 TCG_REG_R61,
 TCG_REG_R62,
 TCG_REG_R63,
-};
+} TCGReg;
+
+#define TCGReg int /* TODO: Remove this line when tcg-target.c uses TCGReg. */
 
 #define TCG_CT_CONST_ZERO 0x100
 #define TCG_CT_CONST_S22 0x200
diff --git a/tcg/mips/tcg-target.h b/tcg/mips/tcg-target.h
index e2a2571..1b01206 100644
--- a/tcg/mips/tcg-target.h
+++ b/tcg/mips/tcg-target.h
@@ -31,7 +31,7 @@
 
 #define TCG_TARGET_NB_REGS 32
 
-enum {
+typedef enum {
 TCG_REG_ZERO = 0,
 TCG_REG_AT,
 TCG_REG_V0,
@@ -64,7 +64,9 @@ enum {
 TCG_REG_SP,
 TCG_REG_FP,
 TCG_REG_RA,
-};
+} TCGReg;
+
+#define TCGReg int /* TODO: Remove this line when tcg-target.c uses TCGReg. */
 
 #define TCG_CT_CONST_ZERO 0x100
 #define TCG_CT_CONST_U16  0x200
diff --git a/tcg/ppc/tcg-target.h b/tcg/ppc/tcg-target.h
index 5c2d612..d737809 100644
--- a/tcg/ppc/tcg-target.h
+++ b/tcg/ppc/tcg-target.h
@@ -26,7 +26,7 @@
 #define TCG_TARGET_WORDS_BIGENDIAN
 #define TCG_TARGET_NB_REGS 32
 
-enum {
+typedef enum {
 TCG_REG_R0 = 0,
 TCG_REG_R1,
 TCG_REG_R2,
@@ -59,7 +59,9 @@ enum {
 TCG_REG_R29,
 TCG_REG_R30,
 TCG_REG_R31
-};
+} TCGReg;
+
+#define TCGReg int /* TODO: Remove this line when tcg-target.c uses TCGReg. */
 
 /* used for function call generation */
 #define TCG_REG_CALL_STACK TCG_REG_R1
diff --git a/tcg/ppc64/tcg-target.h b/tcg/ppc64/tcg-target.h
index 8d1fb73..a37259e 100644
--- a/tcg/ppc64/tcg-target.h
+++ b/tcg/ppc64/tcg-target.h
@@ -26,7 +26,7 @@
 #define TCG_TARGET_WORDS_BIGENDIAN
 #define TCG_TARGET_NB_REGS 32
 
-enum {
+typedef enum {
 TCG_REG_R0 = 0,
 TCG_REG_R1,
 TCG_REG_R2,
@@ -59,7 +59,9 @@ enum {
 TCG_REG_R29,
 TCG_REG_R30,
 TCG_REG_R31
-};
+} TCGReg;
+
+#define TCGReg int /* TODO: Remove this line when tcg-target.c uses TCGReg. */
 
 /* used for function call generation */
 #define TCG_REG_CALL_STACK TCG_REG_R1
diff --git a/tcg/sparc/tcg-target.h b/tcg/sparc/tcg-target.h
index 1464ef4..527b32f 100644
--- a/tcg/sparc/tcg-target.h
+++ b/tcg/sparc/tcg-target.h
@@ -27,7 +27,7 @@
 
 #define TCG_TARGET_NB_REGS 32
 
-enum {
+typedef enum {
 TCG_REG_G0 = 0,
 TCG_REG_G1,
 TCG_REG_G2,
@@ -60,7 +60,9 @@ enum {
 TCG_REG_I5,
 TCG_REG_I6,
 TCG_REG_I7,
-};

Re: [Qemu-devel] [PATCH] Make cpu_single_env thread local (Linux only for now)

2011-10-07 Thread David Gilbert
On 5 October 2011 10:21, Paolo Bonzini  wrote:


> If interested people can test the patches more and submit them more
> formally, I'd be very glad.  I wrote it for RCU, but of course that one is
> not really going to be 1.0 material (even for 9p).

Hmm this got a bit more complex than the original patch; still it covers a lot
more bases.

Should this also replace the THREAD that's defined in
linux-user/qemu.h and bsd-user/qemu.h (that is __thread if built with
NPTL)?
It seems to only be there for 'thread_env' which is also a CPUState*
(hmm - what state does that contain that cpu_single_env doesn't?)

Dave
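
As a rough illustration of what a __thread-qualified pointer provides,
here is a minimal sketch. DemoCPUState, demo_current_env and the function
names are hypothetical and not taken from the QEMU tree. __thread is a
GCC/Clang extension, and the example assumes a POSIX threads platform
(build with -pthread where the toolchain requires it).

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a per-CPU state structure. */
typedef struct DemoCPUState {
    int cpu_index;
} DemoCPUState;

/* One pointer per thread, in the spirit of a thread-local cpu_single_env. */
static __thread DemoCPUState *demo_current_env;

static void *vcpu_thread(void *opaque)
{
    demo_current_env = opaque;  /* sets this thread's copy only */
    return demo_current_env;
}

/*
 * Returns 1 if each thread saw its own pointer and the calling
 * thread's copy was never touched, 0 otherwise.
 */
int demo_thread_local(void)
{
    DemoCPUState cpus[2] = { { 0 }, { 1 } };
    int i;

    for (i = 0; i < 2; i++) {
        pthread_t thread;
        void *seen = NULL;

        if (pthread_create(&thread, NULL, vcpu_thread, &cpus[i]) != 0) {
            return 0;
        }
        if (pthread_join(thread, &seen) != 0 || seen != &cpus[i]) {
            return 0;
        }
    }

    /* The caller never assigned its own copy, so it is still NULL. */
    return demo_current_env == NULL;
}
```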



Re: [Qemu-devel] In-kernel emulation

2011-10-07 Thread 陳韋任
> I am wondering that whether there are any attempts (product-oriented or
> research-based ) to push QEMU into the Linux kernel to speed up emulation.
> If the emulation is running in the kernel, there are some resources it can
> manipulate to speed up emulation in comparison to the when it is running as
> a user process, i.e. MMU. Also, IO emulation may become faster, because 2
> kernel enters and exits are incurred for a network packet if QEMU is running
> as a user process. If QEMU is running in the kernel, only 1 kernel enter and
> exit are needed.  Any suggestions or discussions are welcome.

  You want to use QEMU to emulate guest ISA different from the host?
If the ISA of guest and host is the same, then KVM is enough, I think.

Regards,
chenwj

-- 
Wei-Ren Chen (陳韋任)
Computer Systems Lab, Institute of Information Science,
Academia Sinica, Taiwan (R.O.C.)
Tel:886-2-2788-3799 #1667



Re: [Qemu-devel] [PATCH v3] Add AACI audio playback support to the ARM Versatile/PB platform

2011-10-07 Thread Peter Maydell
On 29 September 2011 18:31, Mathieu Sonet  wrote:
> This driver emulates the ARM AACI interface (PL041) connected to a LM4549
> codec.
> It enables audio playback for the Versatile/PB platform.
>
> Limitations:
> - Supports only a playback on one channel (Versatile/Vexpress)
> - Supports only a TX FIFO in compact-mode.

Actually you seem to have implemented a weird hybrid of compact
and non-compact modes.

Non-compact mode: FIFOs are 20 bits wide; a word write from the
CPU writes to bits [19:0] of the FIFO slot; each 'slot' of data
to the LM4549 reads 20 bits from the FIFO.
Compact mode: FIFOs are 40 bits wide (and half as deep); a word
write from the CPU writes to bits [31:0] of the FIFO slot; each
'slot' of data to the LM4549 reads 16 bits from the FIFO (bits
[15:0] then [31:16]), and you have to read two slots (i.e. the
2 channels of stereo).

You've implemented a single 32 bit depth FIFO which one CPU
write writes to and which always passes one 32 bit word to the
LM4549. There are two issues here:
 (1) the LM4549 can take up to 18 bits of data per slot,
which your current lm4549_write_sample() API doesn't allow.
Passing two 32 bit words here would fix that. [My point here
is that we shouldn't hardwire the missing non-compact mode
support into the API between the two components.]
 (2) when the Linux driver dynamically identifies the size
of the FIFO it does it in non-compact mode, so it will return
a value half as big as is actually right for the compact-mode
behaviour you've implemented.

(Also your FIFO is twice as deep as it should be if we're
only implementing compact mode.)
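
The compact-mode read order described above can be sketched as follows.
This is only an illustration of the bit slicing; CompactPair and
compact_split are hypothetical names, not part of the patch, and which
half maps to which audio channel is not assumed here.

```c
#include <stdint.h>

/* Two 16-bit samples packed into one compact-mode FIFO word. */
typedef struct {
    uint16_t first;   /* bits [15:0], read for the first slot */
    uint16_t second;  /* bits [31:16], read for the second slot */
} CompactPair;

/* Split one 32-bit CPU write into the two 16-bit slot values. */
CompactPair compact_split(uint32_t fifo_word)
{
    CompactPair p;

    p.first = (uint16_t)(fifo_word & 0xffff);
    p.second = (uint16_t)(fifo_word >> 16);
    return p;
}
```

For example, a CPU write of 0x12345678 yields 0x5678 for the first slot
and 0x1234 for the second.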

> Playback tested successfully with speaker-test/aplay/mpg123.

I find that on vexpress mpg123 playback works but can be quite stuttery.
madplay (integer only) is somewhat better, so I suspect that qemu is
spending ages emulating Neon/VFP in mpg123...

Further (minor) comments below.

> +    regfile[LM4549_PCM_Front_DAC_Rate]  = 0xBB80;
> +    regfile[LM4549_PCM_ADC_Rate]        = 0xBB80;
> +    regfile[LM4549_Vendor_ID1]          = 0x4e53;
> +    regfile[LM4549_Vendor_ID2]          = 0x4331;

Can we be consistent about upper or lower case for the hex?

> +#define MAX_FIFO_DEPTH                 (1024)
> +#define VERSATILEPB_DEFAULT_FIFO_DEPTH (256)  /* AN115B - Table 1.1 */

pl041.c shouldn't know anything about VersatilePB. The default
fifo depth should be 8 (same as the hardware PL041).

> +static uint8_t pl041_compute_periphid3(pl041_state *s)
> +{
> +    uint8_t id3 = (1 | ((s->fifo_depth >> 4) << 3));
> +    return id3;
> +}

This isn't right.

[5:3] FIFO depth (non-compact mode)
b000  8
b001  16
b010  32
b011  64
b100  128
b101  256
b110  512
b111  1024

...which isn't what your function calculates.
(Linux determines FIFO depth programmatically by stuffing words
into the FIFO until the status register says it's full, which is
why it doesn't complain.)

NB that some of the board TRMs have what seem to be incorrect
labelling on the tables of depth vs ID register bits.
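
A minimal sketch of the [5:3] encoding from the table above, assuming the
depth is one of the eight listed powers of two. fifo_depth_field is a
hypothetical helper name; it produces only the depth field and says
nothing about the other bits of the ID register. For comparison, with a
depth of 256 the quoted function evaluates to 1 | ((256 >> 4) << 3) =
0x81, whereas b101 in bits [5:3] is 0x28.

```c
#include <stdint.h>

/*
 * Encode a non-compact FIFO depth into bits [5:3]:
 * 8 -> b000, 16 -> b001, ..., 1024 -> b111.
 * Returns -1 for depths the table does not list.
 * All bits outside [5:3] are left at zero.
 */
int fifo_depth_field(unsigned int depth)
{
    unsigned int code = 0;
    unsigned int d = 8;

    /* Walk the table: each step doubles the depth and bumps the code. */
    while (d < depth && code < 7) {
        d <<= 1;
        code++;
    }
    if (d != depth) {
        return -1;
    }
    return (int)(code << 3);
}
```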

> +    /* Update the irq state */
> +    qemu_set_irq(s->irq, ((s->regs.isr1 & s->regs.ie1) > 0) ? 1 : 0);
> +    DBG_L2("Set interrupt sr1 = 0x%08x isr1 = 0x%08x masked = 0x%08x\n",
> +           s->regs.sr1, s->regs.isr1, s->regs.isr1 & mask);
> +}

This debug printf won't compile if enabled -- you forgot
to update it when you changed the main code to remove the
'mask' variable.

> +static int pl041_post_load(void *opaque, int version_id)
> +{
> +    pl041_state *s = opaque;
> +    lm4549_post_load(&s->codec);
> +    return 0;
> +}

Is it not possible to just register lm4549_post_load()
as the post_load function for lm4549_state, rather than
having a pl041 post_load hook which only passes it through?

(Something in your mailsending path is wrapping long lines, by the way.)

-- PMM



[Qemu-devel] [PATCH 0/3] block: zero write detection

2011-10-07 Thread Stefan Hajnoczi
Image streaming copies data from the backing file into the image file.  It is
important to represent zero regions from the backing file efficiently during
streaming, otherwise the image file grows to the full virtual disk size and
loses sparseness.

There are two ways to implement zero write detection, and they are subtly different:

1. Allow image formats to provide efficient representations for zero regions.
   QED does this with "zero clusters" and it has been discussed for qcow2v3.

2. During streaming, check for zeroes and skip writing to the image file when
   zeroes are detected.

However, there are some disadvantages to #2 because it leaves unallocated holes
in the image file.  If image streaming is aborted before it completes then it
will be necessary to reread all unallocated clusters from the backing file upon
resuming image streaming.  Potentially worse is that a backing file over a
slow remote connection will have the zero regions fetched again and again if
the guest accesses them.  #1 avoids these problems because the image file
contains information on which regions are zeroes and do not need to be
refetched.

This patch series implements #1 with the existing QED zero cluster feature.  In
the future we can add qcow2v3 zero clusters too.  We can also implement #2
directly in the image streaming code as a fallback when the BlockDriver does
not support zero detection #1 itself.  That way we get the best possible zero
write detection, depending on the image format.

Here is a qemu-iotest to verify that zero write detection is working:
http://repo.or.cz/w/qemu-iotests/stefanha.git/commitdiff/226949695eef51bdcdea3e6ce3d7e5a863427f37

Stefan Hajnoczi (3):
  block: add zero write detection interface
  qed: add zero write detection support
  qemu-io: add zero write detection option

 block.c |   16 +++
 block.h |2 +
 block/qed.c |   81 +--
 block_int.h |   13 +
 qemu-io.c   |   35 -
 5 files changed, 132 insertions(+), 15 deletions(-)

-- 
1.7.6.3
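
For readers who want a feel for the check that both approaches rely on,
here is a minimal standalone sketch. It is not the code from this series:
is_cluster_aligned and is_zero_write are hypothetical names, the cluster
size is passed in explicitly, and the buffer is scanned byte by byte for
simplicity rather than in wider words.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/uio.h>

/* True if value is a multiple of a nonzero cluster_size. */
bool is_cluster_aligned(uint64_t value, uint64_t cluster_size)
{
    return cluster_size != 0 && value % cluster_size == 0;
}

/*
 * True if a write starting at offset covers whole clusters and every
 * byte in the scatter-gather list is zero.
 */
bool is_zero_write(const struct iovec *iov, int niov,
                   uint64_t offset, uint64_t cluster_size)
{
    uint64_t total = 0;
    int i;

    for (i = 0; i < niov; i++) {
        const unsigned char *p = iov[i].iov_base;
        size_t j;

        for (j = 0; j < iov[i].iov_len; j++) {
            if (p[j] != 0) {
                return false;   /* first nonzero byte ends the scan */
            }
        }
        total += iov[i].iov_len;
    }

    return total > 0 &&
           is_cluster_aligned(offset, cluster_size) &&
           is_cluster_aligned(total, cluster_size);
}
```

A write that passes this check can be recorded as a zero region (#1) or
skipped (#2); anything else goes through the normal write path.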




[Qemu-devel] [PATCH 2/3] qed: add zero write detection support

2011-10-07 Thread Stefan Hajnoczi
The QED image format is able to efficiently represent clusters
containing zeroes with a magic offset value.  This patch implements zero
write detection for allocating writes so that image streaming can copy
over zero clusters from a backing file without expanding the image file
unnecessarily.

This is based on code by Anthony Liguori.

Signed-off-by: Stefan Hajnoczi 
---
 block/qed.c |   81 +--
 1 files changed, 73 insertions(+), 8 deletions(-)

diff --git a/block/qed.c b/block/qed.c
index e87dc4d..ec3113b 100644
--- a/block/qed.c
+++ b/block/qed.c
@@ -947,9 +947,8 @@ static void qed_aio_write_l1_update(void *opaque, int ret)
 /**
  * Update L2 table with new cluster offsets and write them out
  */
-static void qed_aio_write_l2_update(void *opaque, int ret)
+static void qed_aio_write_l2_update(QEDAIOCB *acb, int ret, uint64_t offset)
 {
-QEDAIOCB *acb = opaque;
 BDRVQEDState *s = acb_to_s(acb);
 bool need_alloc = acb->find_cluster_ret == QED_CLUSTER_L1;
 int index;
@@ -965,7 +964,7 @@ static void qed_aio_write_l2_update(void *opaque, int ret)
 
 index = qed_l2_index(s, acb->cur_pos);
 qed_update_l2_table(s, acb->request.l2_table->table, index, acb->cur_nclusters,
- acb->cur_cluster);
+ offset);
 
 if (need_alloc) {
 /* Write out the whole new L2 table */
@@ -982,6 +981,51 @@ err:
 qed_aio_complete(acb, ret);
 }
 
+static void qed_aio_write_l2_update_cb(void *opaque, int ret)
+{
+QEDAIOCB *acb = opaque;
+qed_aio_write_l2_update(acb, ret, acb->cur_cluster);
+}
+
+/**
+ * Determine if we have a zero write to a block of clusters
+ *
+ * We validate that the write is aligned to a cluster boundary, and that it's
+ * a multiple of cluster size with all zeros.
+ */
+static bool qed_is_zero_write(QEDAIOCB *acb)
+{
+BDRVQEDState *s = acb_to_s(acb);
+int i;
+
+if (!qed_offset_is_cluster_aligned(s, acb->cur_pos)) {
+return false;
+}
+
+if (!qed_offset_is_cluster_aligned(s, acb->cur_qiov.size)) {
+return false;
+}
+
+for (i = 0; i < acb->cur_qiov.niov; i++) {
+struct iovec *iov = &acb->cur_qiov.iov[i];
+uint64_t *v;
+int j;
+
+if ((iov->iov_len & 0x07)) {
+return false;
+}
+
+v = iov->iov_base;
+for (j = 0; j < iov->iov_len; j += sizeof(v[0])) {
+if (v[j >> 3]) {
+return false;
+}
+}
+}
+
+return true;
+}
+
 /**
  * Flush new data clusters before updating the L2 table
  *
@@ -996,7 +1040,7 @@ static void qed_aio_write_flush_before_l2_update(void *opaque, int ret)
 QEDAIOCB *acb = opaque;
 BDRVQEDState *s = acb_to_s(acb);
 
-if (!bdrv_aio_flush(s->bs->file, qed_aio_write_l2_update, opaque)) {
+if (!bdrv_aio_flush(s->bs->file, qed_aio_write_l2_update_cb, opaque)) {
 qed_aio_complete(acb, -EIO);
 }
 }
@@ -1026,7 +1070,7 @@ static void qed_aio_write_main(void *opaque, int ret)
 if (s->bs->backing_hd) {
 next_fn = qed_aio_write_flush_before_l2_update;
 } else {
-next_fn = qed_aio_write_l2_update;
+next_fn = qed_aio_write_l2_update_cb;
 }
 }
 
@@ -1092,6 +1136,18 @@ static bool qed_should_set_need_check(BDRVQEDState *s)
 return !(s->header.features & QED_F_NEED_CHECK);
 }
 
+static void qed_aio_write_zero_cluster(void *opaque, int ret)
+{
+QEDAIOCB *acb = opaque;
+
+if (ret) {
+qed_aio_complete(acb, ret);
+return;
+}
+
+qed_aio_write_l2_update(acb, 0, 1);
+}
+
 /**
  * Write new data cluster
  *
@@ -1103,6 +1159,7 @@ static bool qed_should_set_need_check(BDRVQEDState *s)
 static void qed_aio_write_alloc(QEDAIOCB *acb, size_t len)
 {
 BDRVQEDState *s = acb_to_s(acb);
+BlockDriverCompletionFunc *cb;
 
 /* Cancel timer when the first allocating request comes in */
 if (QSIMPLEQ_EMPTY(&s->allocating_write_reqs)) {
@@ -1120,14 +1177,21 @@ static void qed_aio_write_alloc(QEDAIOCB *acb, size_t len)
 
 acb->cur_nclusters = qed_bytes_to_clusters(s,
 qed_offset_into_cluster(s, acb->cur_pos) + len);
-acb->cur_cluster = qed_alloc_clusters(s, acb->cur_nclusters);
 qemu_iovec_copy(&acb->cur_qiov, acb->qiov, acb->qiov_offset, len);
 
+/* Zero write detection */
+if (s->bs->use_zero_detection && qed_is_zero_write(acb)) {
+cb = qed_aio_write_zero_cluster;
+} else {
+cb = qed_aio_write_prefill;
+acb->cur_cluster = qed_alloc_clusters(s, acb->cur_nclusters);
+}
+
 if (qed_should_set_need_check(s)) {
 s->header.features |= QED_F_NEED_CHECK;
-qed_write_header(s, qed_aio_write_prefill, acb);
+qed_write_header(s, cb, acb);
 } else {
-qed_aio_write_prefill(acb, 0);
+cb(acb, 0);
 }
 }
 
@@ -1474,6 +1538,7 @@ static BlockDriver bdrv_qed = {
 .format_name  = "qed",
 

[Qemu-devel] [PATCH 1/3] block: add zero write detection interface

2011-10-07 Thread Stefan Hajnoczi
Some image formats can represent zero regions efficiently even when a
backing file is present.  In order to use this feature they need to
detect zero writes and handle them specially.

Since zero write detection consumes CPU cycles it is disabled by default
and must be explicitly enabled.  This patch adds an interface to do so.

No block drivers support zero write detection yet.
This is addressed in follow-up patches.

Signed-off-by: Stefan Hajnoczi 
---
 block.c |   16 
 block.h |2 ++
 block_int.h |   13 +
 3 files changed, 31 insertions(+), 0 deletions(-)

diff --git a/block.c b/block.c
index e3fe97f..5cf53d6 100644
--- a/block.c
+++ b/block.c
@@ -481,6 +481,7 @@ static int bdrv_open_common(BlockDriverState *bs, const char *filename,
 bs->valid_key = 0;
 bs->open_flags = flags;
 bs->buffer_alignment = 512;
+bs->use_zero_detection = false;
 
 pstrcpy(bs->filename, sizeof(bs->filename), filename);
 
@@ -3344,3 +3345,18 @@ out:
 
 return ret;
 }
+
+int bdrv_set_zero_detection(BlockDriverState *bs, bool enable)
+{
+BlockDriver *drv = bs->drv;
+
+if (!drv) {
+return -ENOMEDIUM;
+}
+if (!drv->has_zero_detection) {
+return -ENOTSUP;
+}
+
+bs->use_zero_detection = enable;
+return 0;
+}
diff --git a/block.h b/block.h
index 16bfa0a..283dc27 100644
--- a/block.h
+++ b/block.h
@@ -273,6 +273,8 @@ int bdrv_img_create(const char *filename, const char *fmt,
 void bdrv_set_buffer_alignment(BlockDriverState *bs, int align);
 void *qemu_blockalign(BlockDriverState *bs, size_t size);
 
+int bdrv_set_zero_detection(BlockDriverState *bs, bool enable);
+
 #define BDRV_SECTORS_PER_DIRTY_CHUNK 2048
 
 void bdrv_set_dirty_tracking(BlockDriverState *bs, int enable);
diff --git a/block_int.h b/block_int.h
index 8c3b863..3e8d768 100644
--- a/block_int.h
+++ b/block_int.h
@@ -146,6 +146,16 @@ struct BlockDriver {
  */
 int (*bdrv_has_zero_init)(BlockDriverState *bs);
 
+/*
+ * True if zero write detection is supported, false otherwise.
+ *
+ * Block drivers that declare support for zero detection should check
+ * BlockDriverState.use_zero_detection for each write request to decide
+ * whether or not to perform detection.  Since zero detection consumes CPU
+ * cycles it is disabled by default.
+ */
+bool has_zero_detection;
+
 QLIST_ENTRY(BlockDriver) list;
 };
 
@@ -195,6 +205,9 @@ struct BlockDriverState {
 /* do we need to tell the quest if we have a volatile write cache? */
 int enable_write_cache;
 
+/* is zero write detection enabled? */
+bool use_zero_detection;
+
 /* NOTE: the following infos are only hints for real hardware
drivers. They are not used by the block driver */
 int cyls, heads, secs, translation;
-- 
1.7.6.3
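
The opt-in pattern from this patch can be tried in isolation with the
following sketch. DemoDriver, DemoState and the demo_* functions are
hypothetical, stripped-down stand-ins and not the real block layer
structures; only the error-code behaviour mirrors the patch. ENOMEDIUM is
assumed to be available, as it is on Linux.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Capability declared once per image format. */
typedef struct DemoDriver {
    bool has_zero_detection;
} DemoDriver;

/* Per-device state; detection is off unless explicitly enabled. */
typedef struct DemoState {
    const DemoDriver *drv;      /* NULL when no medium is open */
    bool use_zero_detection;
} DemoState;

int demo_set_zero_detection(DemoState *bs, bool enable)
{
    if (!bs->drv) {
        return -ENOMEDIUM;
    }
    if (!bs->drv->has_zero_detection) {
        return -ENOTSUP;
    }
    bs->use_zero_detection = enable;
    return 0;
}

/* Caller-side pattern: try to enable, report whether a fallback is needed. */
bool demo_needs_fallback(DemoState *bs)
{
    return demo_set_zero_detection(bs, true) < 0;
}
```

A caller such as image streaming could use the failure return to decide
whether to do its own zero check instead.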




[Qemu-devel] [PATCH 3/3] qemu-io: add zero write detection option

2011-10-07 Thread Stefan Hajnoczi
Add a -z option to qemu-io and the 'open' command to enable zero write
detection.  This is used by the qemu-iotests 029 test case and allows
scripts to exercise zero write detection.

Signed-off-by: Stefan Hajnoczi 
---
 qemu-io.c |   35 ---
 1 files changed, 28 insertions(+), 7 deletions(-)

diff --git a/qemu-io.c b/qemu-io.c
index e91af37..94beaf6 100644
--- a/qemu-io.c
+++ b/qemu-io.c
@@ -1593,7 +1593,7 @@ static const cmdinfo_t close_cmd = {
 .oneline= "close the current open file",
 };
 
-static int openfile(char *name, int flags, int growable)
+static int openfile(char *name, int flags, int growable, int detect_zeroes)
 {
 if (bs) {
 fprintf(stderr, "file open already, try 'help close'\n");
@@ -1615,6 +1615,16 @@ static int openfile(char *name, int flags, int growable)
 }
 }
 
+if (detect_zeroes) {
+if (bdrv_set_zero_detection(bs, true) < 0) {
+fprintf(stderr, "%s: format does not support zero detection\n",
+progname);
+bdrv_delete(bs);
+bs = NULL;
+return 1;
+}
+}
+
 return 0;
 }
 
@@ -1632,6 +1642,7 @@ static void open_help(void)
 " -s, -- use snapshot file\n"
 " -n, -- disable host cache\n"
 " -g, -- allow file to grow (only applies to protocols)"
+" -z  -- use zero write detection (supported formats only)\n"
 "\n");
 }
 
@@ -1644,7 +1655,7 @@ static const cmdinfo_t open_cmd = {
 .argmin = 1,
 .argmax = -1,
 .flags  = CMD_NOFILE_OK,
-.args   = "[-Crsn] [path]",
+.args   = "[-Crsnz] [path]",
 .oneline= "open the file specified by path",
 .help   = open_help,
 };
@@ -1654,9 +1665,10 @@ static int open_f(int argc, char **argv)
 int flags = 0;
 int readonly = 0;
 int growable = 0;
+int detect_zeroes = 0;
 int c;
 
-while ((c = getopt(argc, argv, "snrg")) != EOF) {
+while ((c = getopt(argc, argv, "snrgz")) != EOF) {
 switch (c) {
 case 's':
 flags |= BDRV_O_SNAPSHOT;
@@ -1670,6 +1682,9 @@ static int open_f(int argc, char **argv)
 case 'g':
 growable = 1;
 break;
+case 'z':
+detect_zeroes = 1;
+break;
 default:
 return command_usage(&open_cmd);
 }
@@ -1683,7 +1698,7 @@ static int open_f(int argc, char **argv)
 return command_usage(&open_cmd);
 }
 
-return openfile(argv[optind], flags, growable);
+return openfile(argv[optind], flags, growable, detect_zeroes);
 }
 
 static int init_args_command(int index)
@@ -1710,7 +1725,7 @@ static int init_check_command(const cmdinfo_t *ct)
 static void usage(const char *name)
 {
 printf(
-"Usage: %s [-h] [-V] [-rsnm] [-c cmd] ... [file]\n"
+"Usage: %s [-h] [-V] [-rsnmkz] [-c cmd] ... [file]\n"
 "QEMU Disk exerciser\n"
 "\n"
 "  -c, --cmdcommand to execute\n"
@@ -1720,6 +1735,7 @@ static void usage(const char *name)
 "  -g, --growable   allow file to grow (only applies to protocols)\n"
 "  -m, --misalign   misalign allocations for O_DIRECT\n"
 "  -k, --native-aio use kernel AIO implementation (on Linux only)\n"
+"  -z, --detect-zeroes  use zero write detection (supported formats only)\n"
 "  -h, --help   display this help and exit\n"
 "  -V, --versionoutput version information and exit\n"
 "\n",
@@ -1731,7 +1747,8 @@ int main(int argc, char **argv)
 {
 int readonly = 0;
 int growable = 0;
-const char *sopt = "hVc:rsnmgk";
+int detect_zeroes = 0;
+const char *sopt = "hVc:rsnmgkz";
 const struct option lopt[] = {
 { "help", 0, NULL, 'h' },
 { "version", 0, NULL, 'V' },
@@ -1743,6 +1760,7 @@ int main(int argc, char **argv)
 { "misalign", 0, NULL, 'm' },
 { "growable", 0, NULL, 'g' },
 { "native-aio", 0, NULL, 'k' },
+{ "detect-zeroes", 0, NULL, 'z' },
 { NULL, 0, NULL, 0 }
 };
 int c;
@@ -1774,6 +1792,9 @@ int main(int argc, char **argv)
 case 'k':
 flags |= BDRV_O_NATIVE_AIO;
 break;
+case 'z':
+detect_zeroes = 1;
+break;
 case 'V':
 printf("%s version %s\n", progname, VERSION);
 exit(0);
@@ -1823,7 +1844,7 @@ int main(int argc, char **argv)
 }
 
 if ((argc - optind) == 1) {
-openfile(argv[optind], flags, growable);
+openfile(argv[optind], flags, growable, detect_zeroes);
 }
 command_loop();
 
-- 
1.7.6.3
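
The option handling added above follows the usual getopt_long() pattern:
one short option letter, one matching long option, and one flag variable.
A stripped-down sketch of that pattern is shown here. struct sketch_opts
and sketch_parse() are made-up names for illustration, and resetting
optind to 1 so the function can be called repeatedly is an assumption
that holds for glibc.

```c
#include <getopt.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical container for the two flags parsed below. */
struct sketch_opts {
    bool growable;
    bool detect_zeroes;
};

/*
 * Parse -g/--growable and -z/--detect-zeroes.
 * Returns the index of the first non-option argument, or -1 on an
 * unknown option.
 */
static int sketch_parse(int argc, char **argv, struct sketch_opts *o)
{
    static const struct option lopt[] = {
        { "growable",      no_argument, NULL, 'g' },
        { "detect-zeroes", no_argument, NULL, 'z' },
        { NULL, 0, NULL, 0 }
    };
    int c;

    o->growable = false;
    o->detect_zeroes = false;
    optind = 1;   /* restart scanning (assumes glibc behaviour) */
    opterr = 0;   /* do not print diagnostics for unknown options */

    while ((c = getopt_long(argc, argv, "gz", lopt, NULL)) != -1) {
        switch (c) {
        case 'g':
            o->growable = true;
            break;
        case 'z':
            o->detect_zeroes = true;
            break;
        default:
            return -1;
        }
    }
    return optind;
}
```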




Re: [Qemu-devel] [Question] dump memory when host pci device is used by guest

2011-10-07 Thread Jan Kiszka
On 2011-10-07 16:05, Wen Congyang wrote:
> On 2011/10/7 20:56, Jan Kiszka wrote:
>> On 2011-10-07 14:25, Wen Congyang wrote:
>>> On 2011/10/7 18:16, Jan Kiszka wrote:
>>>> On 2011-10-07 11:46, Wen Congyang wrote:
>>>>> Currently, virsh dump uses monitor command migrate to dump guest's memory
>>>>> to file, and we can use crash to analyze the file.
>>>>>
>>>>> Unfortunately, virsh dump can not work if guest uses host pci device. The
>>>>> reason is that the device's status is also needed to migrate to remote
>>>>> machine, and the host pci device's status is not stored in qemu. So it is
>>>>> unmigratable.
>>>>>
>>>>> I think we can add an option to qmp command migrate (eg: skip) to allow
>>>>> the user to skip the check, and this option should be used only when
>>>>> dumping the guest's memory.
>>>>
>>>> Why not simply attach gdb? That works independently of migration.
>>>
>>> If qemu has some problem, we can use gdb to debug it. But if guest os
>>> has problem
>>> (eg:kernel panic and kdump does not work), we should dump guest's memory
>>> and use
>>> crash to analyze.
>>
>> qemu-system-xxx -s (or "gdbserver" via monitor if qemu is already
>> running), gdb vmlinux, then "target remote :1234".
> 
> Hmm, if i use qemu, i can do it as the above. But i can not hope our 
> customer
> do it because it is difficult for them to debug kernel.
> So the customer can use 'virsh dump'(the guest is managed by libvirt) or 
> autodump(if
> the guest has a watchdog) to dump the memory. The supporter can debug 
> kernel in another
> machine.

gdb is surely scriptable. It's just a bit more complex than running
"generate-core-file" - gdb does not yet support this for remote
sessions. Ideally, that feature should be added, and you are done,
independent of QEMU migration format X.Y.whatever or limitations due to
unmigratable devices. The current approach is, well, "pragmatic".

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux



Re: [Qemu-devel] [PATCH 1/4] Add basic version of bridge helper

2011-10-07 Thread Corey Bryant



On 10/07/2011 10:45 AM, Daniel P. Berrange wrote:

On Fri, Oct 07, 2011 at 10:40:56AM -0400, Corey Bryant wrote:



On 10/07/2011 05:04 AM, Daniel P. Berrange wrote:

On Thu, Oct 06, 2011 at 02:38:56PM -0400, Corey Bryant wrote:



On 10/06/2011 02:04 PM, Anthony Liguori wrote:

On 10/06/2011 11:41 AM, Daniel P. Berrange wrote:

On Thu, Oct 06, 2011 at 11:38:25AM -0400, Richa Marwaha wrote:

This patch adds a helper that can be used to create a tap device
attached to
a bridge device. Since this helper is minimal in what it does, it can be
given CAP_NET_ADMIN which allows qemu to avoid running as root while
still
satisfying the majority of what users tend to want to do with tap
devices.

The way this all works is that qemu launches this helper passing a
bridge
name and the name of an inherited file descriptor. The descriptor is one
end of a socketpair() of domain sockets. This domain socket is used to
transmit a file descriptor of the opened tap device from the helper
to qemu.

The helper can then exit and let qemu use the tap device.
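
The descriptor hand-off described above is normally done with an
SCM_RIGHTS control message on a Unix domain socket. The following is a
minimal sketch of that mechanism, not the code from this patch: the
function names sketch_send_fd() and sketch_recv_fd() are invented for
illustration, and error handling is reduced to returning -1.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Send descriptor fd over the Unix domain socket sock. 0 on success. */
static int sketch_send_fd(int sock, int fd)
{
    char byte = 0;   /* one byte of ordinary data carries the message */
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union {
        struct cmsghdr align;   /* forces correct alignment of buf */
        char buf[CMSG_SPACE(sizeof(int))];
    } ctrl;
    struct msghdr msg;
    struct cmsghdr *cmsg;

    memset(&ctrl, 0, sizeof(ctrl));
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof(ctrl.buf);

    cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive a descriptor from sock. Returns the new fd, or -1 on error. */
static int sketch_recv_fd(int sock)
{
    char byte;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union {
        struct cmsghdr align;
        char buf[CMSG_SPACE(sizeof(int))];
    } ctrl;
    struct msghdr msg;
    struct cmsghdr *cmsg;
    int fd;

    memset(&ctrl, 0, sizeof(ctrl));
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof(ctrl.buf);

    if (recvmsg(sock, &msg, 0) != 1) {
        return -1;
    }
    cmsg = CMSG_FIRSTHDR(&msg);
    if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS ||
        cmsg->cmsg_len != CMSG_LEN(sizeof(int))) {
        return -1;
    }
    memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return fd;
}
```

In the scenario discussed here, the helper would play the sender role with
the opened tap device, and the parent (qemu, or potentially libvirt) would
play the receiver role on the other end of the socketpair().
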


When QEMU is run by libvirt, we generally like to use capng to
remove the ability for QEMU to run setuid programs at all. So
obviously it will struggle to run the qemu-bridge-helper binary
in such a scenario.

With the way you transmit the TAP device FD back to the caller,
it looks like libvirt itself could execute the qemu-bridge-helper
receiving the FD, and then pass the FD onto QEMU using the
traditional tap,fd=XX syntax.


Exactly. This would allow tap-based networking using libvirt session://
URIs.



I'll take note of this.  It seems like it would be a nice future
addition to libvirt.

A slight tangent, but a point on DAC isolation.  The helper enables
DAC isolation for qemu:///session but we still need some work in
libvirt to provide DAC isolation for qemu:///system.  This could be
done by allowing management applications to specify custom
user/group IDs when creating guests rather than hard coding the IDs
in the configuration file.


Yes, this is an item on our todo list for libvirt. There are a couple of
work items involved

  - Extend the XML to allow multiple <seclabel> elements, one per
    security driver in use.
  - Add a new API to allow fetching of live seclabel data per
    security driver
  - Extend the current DAC security driver to automatically allocate
    UIDs from an admin defined range, and/or pull them from the XML
    provided by app.

Technically we could do item 3, without doing items 1/2, but that would
necessitate *not* using the sVirt security driver. I don't think that's
too useful, so items 1/2 let us use both the sVirt & enhanced DAC driver
at the same time.



I think I'm missing something here and could use some more details
to understand 1 & 2.  Here's what I'm currently picturing.

With DAC isolation:
 QEMU A runs under userA:groupA and QEMU B runs under userB:groupB

versus currently:
 QEMU A runs under qemu:qemu and QEMU B runs under qemu:qemu

In either case, guests A and B have separate domain XML and a single
unique seclabel, such as this dynamic SELinux label:


   system_u:system_r:svirt_t:s0:c633,c712
   system_u:object_r:svirt_image_t:s0:c633,c712



If we're going to make the DAC user ID/group ID configurable, then we
need to expose this to applications in the XML so that

  a. apps can allocate unique user/group IDs *cluster wide* when shared
     filesystems are in use. libvirt can only ensure per-host uniqueness.

  b. apps can know what user/group ID has been allocated to each guest
     and this can be reported in virsh dominfo, as with svirt info.

ie, we'll need something like this:

   
 system_u:system_r:svirt_t:s0:c633,c712
 system_u:object_r:svirt_image_t:s0:c633,c712
   
   
 102:102
 102:102
   


And:

# virsh dominfo f16x86_64
Id: 29
Name:   f16x86_64
UUID:   1e9f3097-0a45-ea06-d0d8-40507999a1cd
OS Type:hvm
State:  running
CPU(s): 1
CPU time:   19.5s
Max memory: 819200 kB
Used memory:819200 kB
Persistent: yes
Autostart:  disable
Security model: selinux
Security DOI:   0
Security label: system_u:system_r:svirt_t:s0:c244,c424 (permissive)
Security model: dac
Security DOI:   0
Security label: 102:102 (enforcing)

Regards,
Daniel


Ah, yes.  That makes complete sense.  Thanks for the clarification.

--
Regards,
Corey




Re: [Qemu-devel] [PATCH 1/4] Add basic version of bridge helper

2011-10-07 Thread Corey Bryant



On 10/07/2011 05:04 AM, Daniel P. Berrange wrote:

On Thu, Oct 06, 2011 at 02:38:56PM -0400, Corey Bryant wrote:



On 10/06/2011 02:04 PM, Anthony Liguori wrote:

On 10/06/2011 11:41 AM, Daniel P. Berrange wrote:

On Thu, Oct 06, 2011 at 11:38:25AM -0400, Richa Marwaha wrote:

This patch adds a helper that can be used to create a tap device
attached to
a bridge device. Since this helper is minimal in what it does, it can be
given CAP_NET_ADMIN which allows qemu to avoid running as root while
still
satisfying the majority of what users tend to want to do with tap
devices.

The way this all works is that qemu launches this helper passing a
bridge
name and the name of an inherited file descriptor. The descriptor is one
end of a socketpair() of domain sockets. This domain socket is used to
transmit a file descriptor of the opened tap device from the helper
to qemu.

The helper can then exit and let qemu use the tap device.


When QEMU is run by libvirt, we generally like to use capng to
remove the ability for QEMU to run setuid programs at all. So
obviously it will struggle to run the qemu-bridge-helper binary
in such a scenario.

With the way you transmit the TAP device FD back to the caller,
it looks like libvirt itself could execute the qemu-bridge-helper
receiving the FD, and then pass the FD onto QEMU using the
traditional tap,fd=XX syntax.


Exactly. This would allow tap-based networking using libvirt session://
URIs.



I'll take note of this.  It seems like it would be a nice future
addition to libvirt.

A slight tangent, but a point on DAC isolation.  The helper enables
DAC isolation for qemu:///session but we still need some work in
libvirt to provide DAC isolation for qemu:///system.  This could be
done by allowing management applications to specify custom
user/group IDs when creating guests rather than hard coding the IDs
in the configuration file.


Yes, this is an item on our todo list for libvirt. There are a couple of
work items involved

  - Extend the XML to allow multiple <seclabel> elements, one per
    security driver in use.
  - Add a new API to allow fetching of live seclabel data per
    security driver
  - Extend the current DAC security driver to automatically allocate
    UIDs from an admin defined range, and/or pull them from the XML
    provided by app.

Technically we could do item 3, without doing items 1/2, but that would
necessitate *not* using the sVirt security driver. I don't think that's
too useful, so items 1/2 let us use both the sVirt & enhanced DAC driver
at the same time.



I think I'm missing something here and could use some more details to 
understand 1 & 2.  Here's what I'm currently picturing.


With DAC isolation:
QEMU A runs under userA:groupA and QEMU B runs under userB:groupB

versus currently:
QEMU A runs under qemu:qemu and QEMU B runs under qemu:qemu

In either case, guests A and B have separate domain XML and a single 
unique seclabel, such as this dynamic SELinux label:



  system_u:system_r:svirt_t:s0:c633,c712
  system_u:object_r:svirt_image_t:s0:c633,c712




Regards,
Daniel


--
Regards,
Corey




Re: [Qemu-devel] [PATCH 1/4] Add basic version of bridge helper

2011-10-07 Thread Daniel P. Berrange
On Fri, Oct 07, 2011 at 10:40:56AM -0400, Corey Bryant wrote:
> 
> 
> On 10/07/2011 05:04 AM, Daniel P. Berrange wrote:
> >On Thu, Oct 06, 2011 at 02:38:56PM -0400, Corey Bryant wrote:
> >>
> >>
> >>On 10/06/2011 02:04 PM, Anthony Liguori wrote:
> >>>On 10/06/2011 11:41 AM, Daniel P. Berrange wrote:
> >>>>On Thu, Oct 06, 2011 at 11:38:25AM -0400, Richa Marwaha wrote:
> >>>>>This patch adds a helper that can be used to create a tap device
> >>>>>attached to
> >>>>>a bridge device. Since this helper is minimal in what it does, it can be
> >>>>>given CAP_NET_ADMIN which allows qemu to avoid running as root while
> >>>>>still
> >>>>>satisfying the majority of what users tend to want to do with tap
> >>>>>devices.
> >>>>>
> >>>>>The way this all works is that qemu launches this helper passing a
> >>>>>bridge
> >>>>>name and the name of an inherited file descriptor. The descriptor is one
> >>>>>end of a socketpair() of domain sockets. This domain socket is used to
> >>>>>transmit a file descriptor of the opened tap device from the helper
> >>>>>to qemu.
> >>>>>
> >>>>>The helper can then exit and let qemu use the tap device.
> >>>>
> >>>>When QEMU is run by libvirt, we generally like to use capng to
> >>>>remove the ability for QEMU to run setuid programs at all. So
> >>>>obviously it will struggle to run the qemu-bridge-helper binary
> >>>>in such a scenario.
> >>>>
> >>>>With the way you transmit the TAP device FD back to the caller,
> >>>>it looks like libvirt itself could execute the qemu-bridge-helper
> >>>>receiving the FD, and then pass the FD onto QEMU using the
> >>>>traditional tap,fd=XX syntax.
> >>>
> >>>Exactly. This would allow tap-based networking using libvirt session://
> >>>URIs.
> >>>
> >>
> >>I'll take note of this.  It seems like it would be a nice future
> >>addition to libvirt.
> >>
> >>A slight tangent, but a point on DAC isolation.  The helper enables
> >>DAC isolation for qemu:///session but we still need some work in
> >>libvirt to provide DAC isolation for qemu:///system.  This could be
> >>done by allowing management applications to specify custom
> >>user/group IDs when creating guests rather than hard coding the IDs
> >>in the configuration file.
> >
> >Yes, this is an item on our todo list for libvirt. There are a couple of
> >work items involved
> >
> >  - Extend the XML to allow multiple <seclabel> elements, one per
> >    security driver in use.
> >  - Add a new API to allow fetching of live seclabel data per
> >    security driver
> >  - Extend the current DAC security driver to automatically allocate
> >    UIDs from an admin defined range, and/or pull them from the XML
> >    provided by app.
> >
> >Technically we could do item 3, without doing items 1/2, but that would
> >necessitate *not* using the sVirt security driver. I don't think that's
> >too useful, so items 1/2 let us use both the sVirt & enhanced DAC driver
> >at the same time.
> >
> 
> I think I'm missing something here and could use some more details
> to understand 1 & 2.  Here's what I'm currently picturing.
> 
> With DAC isolation:
> QEMU A runs under userA:groupA and QEMU B runs under userB:groupB
> 
> versus currently:
> QEMU A runs under qemu:qemu and QEMU B runs under qemu:qemu
> 
> In either case, guests A and B have separate domain XML and a single
> unique seclabel, such as this dynamic SELinux label:
> 
> <seclabel type='dynamic' model='selinux'>
>   <label>system_u:system_r:svirt_t:s0:c633,c712</label>
>   <imagelabel>system_u:object_r:svirt_image_t:s0:c633,c712</imagelabel>
> </seclabel>

If we're going to make the DAC user ID/group ID configurable, then we
need to expose this to applications in the XML so that

 a. apps can allocate unique user/group IDs *cluster wide* when shared
    filesystems are in use. libvirt can only ensure per-host uniqueness.

 b. apps can know what user/group ID has been allocated to each guest
    and this can be reported in virsh dominfo, as with svirt info.

ie, we'll need something like this:

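  <seclabel type='dynamic' model='selinux'>
    <label>system_u:system_r:svirt_t:s0:c633,c712</label>
    <imagelabel>system_u:object_r:svirt_image_t:s0:c633,c712</imagelabel>
  </seclabel>
  <seclabel type='dynamic' model='dac'>
    <label>102:102</label>
    <imagelabel>102:102</imagelabel>
  </seclabel>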


And:

# virsh dominfo f16x86_64
Id: 29
Name:   f16x86_64
UUID:   1e9f3097-0a45-ea06-d0d8-40507999a1cd
OS Type:hvm
State:  running
CPU(s): 1
CPU time:   19.5s
Max memory: 819200 kB
Used memory:819200 kB
Persistent: yes
Autostart:  disable
Security model: selinux
Security DOI:   0
Security label: system_u:system_r:svirt_t:s0:c244,c424 (permissive)
Security model: dac
Security DOI:   0
Security label: 102:102 (enforcing)

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|



Re: [Qemu-devel] [PATCH] checkpatch: remove rule on non-indented labels

2011-10-07 Thread Peter Maydell
On 7 October 2011 14:59, Paolo Bonzini  wrote:
> There are 508 non-indented (non-default) labels, and 511 that are
> indented.  So the rule is debatable at least.  Actually, in the
> common case of labels at the outermost scope, there is really just
> one place where to put the label, so the rule is just wrong IMHO.

Agreed. This is one of the checkpatch shibboleths that I just
ignore when submitting patches :-)

-- PMM



Re: [Qemu-devel] [Question] dump memory when host pci device is used by guest

2011-10-07 Thread Wen Congyang

On 2011/10/7 20:56, Jan Kiszka wrote:

On 2011-10-07 14:25, Wen Congyang wrote:

On 2011/10/7 18:16, Jan Kiszka wrote:

On 2011-10-07 11:46, Wen Congyang wrote:

Currently, virsh dump uses monitor command migrate to dump guest's memory
to file, and we can use crash to analyze the file.

Unfortunately, virsh dump can not work if guest uses host pci device. The
reason is that the device's status is also needed to migrate to remote machine,
and the host pci device's status is not stored in qemu. So it is unmigratable.

I think we can  we can add a option to qmp command migrate(eg: skip) to allow
the user to skip the check, and this option should be used only when dumping
the guest's memory.


Why not simply attach gdb? That works independently of migration.


If qemu has some problem, we can use gdb to debug it. But if guest os
has problem
(eg:kernel panic and kdump does not work), we should dump guest's memory
and use
crash to analyze.


qemu-system-xxx -s (or "gdbserver" via monitor if qemu is already
running), gdb vmlinux, then "target remote :1234".


Hmm, if I use qemu, I can do it as above. But I cannot expect our
customers to do it, because debugging a kernel is difficult for them.
So the customer can use 'virsh dump' (the guest is managed by libvirt) or
autodump (if the guest has a watchdog) to dump the memory. The supporter
can then debug the kernel on another machine.

I still think that supporting memory dumps when the guest uses a host pci
device is necessary.


Thanks
Wen Congyang


Jan





[Qemu-devel] [PATCH] checkpatch: remove rule on non-indented labels

2011-10-07 Thread Paolo Bonzini
There are 508 non-indented (non-default) labels, and 511 that are
indented.  So the rule is debatable at least.  Actually, in the
common case of labels at the outermost scope, there is really just
one place where to put the label, so the rule is just wrong IMHO.

Signed-off-by: Paolo Bonzini 
---
 scripts/checkpatch.pl |6 --
 1 files changed, 0 insertions(+), 6 deletions(-)

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index 0eba357..7a71324 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -2206,12 +2206,6 @@ sub process {
ERROR("space prohibited before that close parenthesis ')'\n" . $herecurr);
}
 
-#goto labels aren't indented, allow a single space however
-   if ($line=~/^.\s+[A-Za-z\d_]+:(?![0-9]+)/ and
-  !($line=~/^. [A-Za-z\d_]+:/) and !($line=~/^.\s+default:/)) {
-   WARN("labels should not be indented\n" . $herecurr);
-   }
-
 # Return is not a function.
if (defined($stat) && $stat =~ /^.\s*return(\s*)(\(.*);/s) {
my $spacing = $1;
-- 
1.7.6
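
To make the "labels at the outermost scope" case from the commit message
concrete, here is a small C sketch. sketch_parse_digits() is a made-up
function, not QEMU code; it only shows the common error-exit idiom where
the label belongs to the function body itself, so column 0 is the one
natural place for it.

```c
/*
 * Hypothetical example: parse a string of decimal digits.
 * Returns 0 and stores the result in *value, or -1 on bad input.
 */
static int sketch_parse_digits(const char *s, int *value)
{
    int ret = -1;
    int acc = 0;

    if (!s || !*s) {
        goto out;
    }
    for (; *s; s++) {
        if (*s < '0' || *s > '9') {
            goto out;
        }
        acc = acc * 10 + (*s - '0');
    }
    *value = acc;
    ret = 0;
out:    /* label at function scope, hence not indented */
    return ret;
}
```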




Re: [Qemu-devel] [PATCH] Set an invalid-bits mask for each SPE instructions

2011-10-07 Thread Fabien Chouteau
On 07/10/2011 14:40, Alexander Graf wrote:
> On 09/28/2011 05:54 PM, Fabien Chouteau wrote:
>> SPE instructions are defined by pairs. Currently, the invalid-bits mask is 
>> set
>> for the first instruction, but the second one can have a different mask.
>>
>> example:
>> GEN_SPE(efdcmpeq,efdcfs,  0x17, 0x0B, 0x0060, 0x0018, 
>> PPC_SPE_DOUBLE),
>>
>> Signed-off-by: Fabien Chouteau
> 
> It certainly doesn't make the code more ugly than it was before :). Applied 
> to my local ppc-next branch. I take it that you verified all the invalid 
> masks are sane.
> 

Yes I checked all the masks so they should be OK.
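
As a rough sketch of the idea (not the actual translate.c code): each
GEN_SPE entry carries one invalid-bits mask per instruction of the pair,
and the mask that applies depends on which of the two instructions the
opcode selects. The struct and function names below are hypothetical,
and treating the lowest opcode bit as the selector is an assumption made
for the example. The mask values are the ones from the GEN_SPE line
quoted above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical description of one SPE instruction pair. */
struct sketch_spe_pair {
    uint32_t inval0;   /* invalid-bits mask for the first instruction  */
    uint32_t inval1;   /* invalid-bits mask for the second instruction */
};

/*
 * Returns true if opcode has none of the invalid bits set for the
 * instruction it selects. Assumption: bit 0 selects the second
 * instruction of the pair.
 */
static bool sketch_spe_is_valid(const struct sketch_spe_pair *p,
                                uint32_t opcode)
{
    uint32_t inval = (opcode & 1) ? p->inval1 : p->inval0;

    return (opcode & inval) == 0;
}
```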

> There are some lines exceeding 80 characters, but I'm fairly sure they did 
> before too. So I'll let this slip through for the sake of readability.
> 

For this kind of definition list, the 80-character limit can result in
very awful code.

Regards,

-- 
Fabien Chouteau



Re: [Qemu-devel] [Question] dump memory when host pci device is used by guest

2011-10-07 Thread Jan Kiszka
On 2011-10-07 14:25, Wen Congyang wrote:
> On 2011/10/7 18:16, Jan Kiszka wrote:
>> On 2011-10-07 11:46, Wen Congyang wrote:
>>> Currently, virsh dump uses monitor command migrate to dump guest's memory
>>> to file, and we can use crash to analyze the file.
>>>
>>> Unfortunately, virsh dump can not work if guest uses host pci device. The
>>> reason is that the device's status is also needed to migrate to remote 
>>> machine,
>>> and the host pci device's status is not stored in qemu. So it is 
>>> unmigratable.
>>>
>>> I think we can  we can add a option to qmp command migrate(eg: skip) to 
>>> allow
>>> the user to skip the check, and this option should be used only when dumping
>>> the guest's memory.
>>
>> Why not simply attach gdb? That works independently of migration.
> 
> If qemu has some problem, we can use gdb to debug it. But if guest os 
> has problem
> (eg:kernel panic and kdump does not work), we should dump guest's memory 
> and use
> crash to analyze.

qemu-system-xxx -s (or "gdbserver" via monitor if qemu is already
running), gdb vmlinux, then "target remote :1234".

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux



Re: [Qemu-devel] [PATCH 2/2] LAPIC: make lapic support cpu hotplug

2011-10-07 Thread liu ping fan
On 10/6/11, Jan Kiszka  wrote:
> On 2011-10-06 03:13, liu ping fan wrote:
>> On Wed, Oct 5, 2011 at 7:01 PM, Jan Kiszka  wrote:
>>> On 2011-10-05 12:26, liu ping fan wrote:
>>>>>> And make the creation of apic as part of cpu initialization, so
>>>>>> apic's state has been ready, before setting kvm_apic.
>>>>>
>>>>> There is no kvm-apic upstream yet, so it's hard to judge why we need
>>>>> this here. If we do, this has to be a separate patch. But I seriously
>>>>> doubt we need it (my hack worked without it, and that was not because
>>>>> of its hack nature).
>>>>>
>>>> Sorry, I did not explain it clearly. What I mean is that
>>>> “env->apic_state” must be prepared
>>>> before qemu_kvm_cpu_thread_fn() -> ... -> kvm_put_sregs(), where we get
>>>> apic_base by
>>>> “sregs.apic_base = cpu_get_apic_base(env->apic_state);”
>>>> and then call “kvm_vcpu_ioctl(env, KVM_SET_SREGS, &sregs);” which will
>>>> finally affect the kvm_apic structure in kernel.
>>>>
>>>> But as current code, in pc_new_cpu(), we call apic_init() to initialize
>>>> apic_state, after cpu_init(),
>>>> so we can not guarantee the order of apic_state initialization and the
>>>> setting to kernel.
>>>>
>>>> Because LAPIC is part of x86 chip, I want to move it into
>>>> cpu_x86_init(), and ensure apic_init()
>>>> called before thread “qemu_kvm_cpu_thread_fn()” creation.
>>>
>>> The LAPIC is part of the CPU, the classic APIC was a dedicated chip.
>> Sorry, a little puzzle.  I think x86 interrupt system consists of two
>> parts: IOAPIC/LAPIC.
>> For we have "hw/ioapic.c" to simulate IOAPIC,  I think "hw/apic.c"
>> takes the role as LAPIC,
>> especially that we create an APICState instance for each CPUX86State,
>> just like each LAPIC
>> for x86 CPU in real machine.
>> So we can consider apic_init() to create a LAPIC instance, rather than
>> create a  "classic APIC"?
>>
>> I guess If there is lack of something in IOAPIC/LAPIC bus topology,
>> that will be the arbitrator of ICC bus, right?
>> So "the classic APIC was a dedicated chip" what you said, play this
>> role,  right?
>> Could you tell me a sample chipset of APIC, and I can increase my
>> knowledge about it, thanks.
>
> The 82489DX was used as a discrete APIC on 486 SMP systems.
>
>>
>>>
>>> For various reasons, a safer approach for creating a new CPU is to stop
>>> the machine, add the new device models, run cpu_synchronize_post_init on
>>> that new cpu (looks like you missed that) and then resume everything.
>>> See
>>> http://git.kiszka.org/?p=qemu-kvm.git;a=commitdiff;h=be8f21c6b54eac82f7add7ee9d4ecf9cb8ebb320
>>>
>> Great job. And I am interesting about it. Could you give some sample
>> reason about why we need to stop
>> the machine, or list all of the reasons, so we can resolve it one by
>> one. I can not figure out such scenes by myself.
>> From my view, especially for KVM, the creation of vcpu are protected
>> well by lock mechanism from other
>> vcpu threads in kernel, so we need not to stop all of the threads.
>
> Maybe I was seeing ghosts: I thought that there is a race window between
> VCPU_CREATE and the last initialization IOCTL when we allow other VCPUs
> to interact with the new one already. However, I do not find the
> scenario again ATM.
>
> But if you want to move the VCPU resource initialization completely over
> the VCPU thread, you also have to handle env->halted in that context.
> See [1] for this topic and associated todos.
>
> And don't forget the cpu_synchronize_post_init. Running this after each
> VCPU creation directly should also obsolete cpu_synchronize_all_post_init.
Thanks, Jan.  I will dig into this and follow the thread to see what
to do in the next step.

Regards,
ping fan
>
> Jan
>
> [1] http://thread.gmane.org/gmane.comp.emulators.qemu/100806
>
>



Re: [Qemu-devel] [PATCH] Set an invalid-bits mask for each SPE instructions

2011-10-07 Thread Alexander Graf

On 09/28/2011 05:54 PM, Fabien Chouteau wrote:

SPE instructions are defined by pairs. Currently, the invalid-bits mask is set
for the first instruction, but the second one can have a different mask.

example:
GEN_SPE(efdcmpeq,efdcfs,  0x17, 0x0B, 0x0060, 0x0018, 
PPC_SPE_DOUBLE),

Signed-off-by: Fabien Chouteau


It certainly doesn't make the code more ugly than it was before :). 
Applied to my local ppc-next branch. I take it that you verified all the 
invalid masks are sane.


There are some lines exceeding 80 characters, but I'm fairly sure they 
did before too. So I'll let this slip through for the sake of readability.



Alex




[Qemu-devel] In-kernel emulation

2011-10-07 Thread Xin Tong
I am wondering whether there have been any attempts (product-oriented or
research-based) to push QEMU into the Linux kernel to speed up emulation.
If the emulation runs in the kernel, there are some resources, e.g. the MMU,
that it can manipulate to speed up emulation compared with running as a
user process. I/O emulation may also become faster: 2 kernel entries and
exits are incurred for a network packet if QEMU runs as a user process,
whereas only 1 kernel entry and exit is needed if QEMU runs in the kernel.
Any suggestions or discussions are welcome.


Thanks

Xin


Re: [Qemu-devel] [Question] dump memory when host pci device is used by guest

2011-10-07 Thread Wen Congyang

On 2011/10/7 18:16, Jan Kiszka wrote:

On 2011-10-07 11:46, Wen Congyang wrote:

Currently, virsh dump uses monitor command migrate to dump guest's memory
to file, and we can use crash to analyze the file.

Unfortunately, virsh dump can not work if guest uses host pci device. The
reason is that the device's status is also needed to migrate to remote machine,
and the host pci device's status is not stored in qemu. So it is unmigratable.

I think we can add an option to the qmp command migrate (e.g. skip) to allow
the user to skip the check, and this option should be used only when dumping
the guest's memory.


Why not simply attach gdb? That works independently of migration.


If qemu has some problem, we can use gdb to debug it. But if the guest OS has
a problem (e.g. a kernel panic where kdump does not work), we should dump the
guest's memory and use crash to analyze it.

Thanks
Wen Congyang




Jan





Re: [Qemu-devel] [PATCH] pcnet: Add link state support

2011-10-07 Thread Jan Kiszka
On 2011-10-07 12:27, Jan Kiszka wrote:
> Update lnkst on link state changes so that guests can obtain this
> information via reading back the LED output pin. Works for Linux but
> not for guests that depend on the missing PHY.

Strike the second sentence: The older Am79C970A that QEMU emulated
provided no PHY access, only newer Am79C973/Am79C979 did. So the patch
is sufficient for our current level of support.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux



Re: [Qemu-devel] [PATCH] Raise 9pfs mount_tag limit from 32 to 255 bytes

2011-10-07 Thread Aneesh Kumar K.V
On Fri, 7 Oct 2011 10:27:56 +0100, "Daniel P. Berrange" wrote:
> On Thu, Sep 29, 2011 at 04:22:16PM +0100, Daniel P. Berrange wrote:
> > On Thu, Sep 29, 2011 at 08:23:49PM +0530, Aneesh Kumar K.V wrote:
> > > On Thu, 29 Sep 2011 11:34:21 +0100, "Daniel P. Berrange" wrote:
> > > > From: "Daniel P. Berrange" 
> > > > 
> > > > The Linux guest kernel does not appear to have a problem handling
> > > > a mount_tag larger than 32 bytes. Increase the limit to 255 bytes,
> > > > though perhaps it can be made larger still, or not limited at all ?
> > > > 
> > > > Tested with a 3.0.4 kernel and a mount_tag 255 bytes in length.
> > > > 
> > > > * hw/9pfs/virtio-9p.h: Change MAX_TAG_LEN to 255
> > > 
> > > 
> > > mount_tag is passed via pci config space, do we want to have 255 bytes
> > > out of that for device identification.
> > 
> > How big is the config space available for each 9pfs device and what
> > other info does it need to keep there ?
> 
> Does anyone have a clear answer for this ?
> 
> I've done some tests with ever larger mount tags, and managed to increase
> the MAX_TAG_LEN value to 1023  before I started getting guest failures.
> 
> So if the config space is really 1023 bytes in size, it doesn't seem too
> unrealistic to allow 255 bytes of it for the mount_tag, or at the very
> least increase it from 32 to 128 ?
> 

Last time we discussed this Anthony wanted to keep the config space
usage minimal, hence we agreed on the size 32 bytes. 

Anthony,

Any comments ?

-aneesh




[Qemu-devel] [PATCH 2/5] savevm: some coding style cleanups

2011-10-07 Thread Juan Quintela
This patch makes it easier to move code in the following patches while
keeping checkpatch happy.

Signed-off-by: Juan Quintela
Reviewed-by: Anthony Liguori 
Signed-off-by: Juan Quintela 
---
 savevm.c |   21 ++---
 1 files changed, 14 insertions(+), 7 deletions(-)

diff --git a/savevm.c b/savevm.c
index 743c304..4069b34 100644
--- a/savevm.c
+++ b/savevm.c
@@ -536,8 +536,9 @@ int qemu_get_buffer(QEMUFile *f, uint8_t *buf, int size1)
 {
 int size, l;

-if (f->is_write)
+if (f->is_write) {
 abort();
+}

 size = size1;
 while (size > 0) {
@@ -545,11 +546,13 @@ int qemu_get_buffer(QEMUFile *f, uint8_t *buf, int size1)
 if (l == 0) {
 qemu_fill_buffer(f);
 l = f->buf_size - f->buf_index;
-if (l == 0)
+if (l == 0) {
 break;
+}
 }
-if (l > size)
+if (l > size) {
 l = size;
+}
 memcpy(buf, f->buf + f->buf_index, l);
 f->buf_index += l;
 buf += l;
@@ -560,26 +563,30 @@ int qemu_get_buffer(QEMUFile *f, uint8_t *buf, int size1)

 static int qemu_peek_byte(QEMUFile *f)
 {
-if (f->is_write)
+if (f->is_write) {
 abort();
+}

 if (f->buf_index >= f->buf_size) {
 qemu_fill_buffer(f);
-if (f->buf_index >= f->buf_size)
+if (f->buf_index >= f->buf_size) {
 return 0;
+}
 }
 return f->buf[f->buf_index];
 }

 int qemu_get_byte(QEMUFile *f)
 {
-if (f->is_write)
+if (f->is_write) {
 abort();
+}

 if (f->buf_index >= f->buf_size) {
 qemu_fill_buffer(f);
-if (f->buf_index >= f->buf_size)
+if (f->buf_index >= f->buf_size) {
 return 0;
+}
 }
 return f->buf[f->buf_index++];
 }
-- 
1.7.6.4




[Qemu-devel] [PATCH 5/5] Revert "savevm: fix corruption in vmstate_subsection_load()."

2011-10-07 Thread Juan Quintela
This reverts commit eb60260de0b050a5e8ab725e84d377d0b44c43ae.

Conflicts:

savevm.c

We changed qemu_peek_byte() prototype, just fixed the rejects.

Signed-off-by: Juan Quintela
Reviewed-by: Anthony Liguori 
Signed-off-by: Juan Quintela 
---
 savevm.c |   10 +-
 1 files changed, 1 insertions(+), 9 deletions(-)

diff --git a/savevm.c b/savevm.c
index aafdc7b..5001dd5 100644
--- a/savevm.c
+++ b/savevm.c
@@ -1704,12 +1704,6 @@ static const VMStateDescription *vmstate_get_subsection(const VMStateSubsection
 static int vmstate_subsection_load(QEMUFile *f, const VMStateDescription *vmsd,
void *opaque)
 {
-const VMStateSubsection *sub = vmsd->subsections;
-
-if (!sub || !sub->needed) {
-return 0;
-}
-
 while (qemu_peek_byte(f, 0) == QEMU_VM_SUBSECTION) {
 char idstr[256];
 int ret;
@@ -1731,7 +1725,7 @@ static int vmstate_subsection_load(QEMUFile *f, const VMStateDescription *vmsd,
 /* it don't have a valid subsection name */
 return 0;
 }
-sub_vmsd = vmstate_get_subsection(sub, idstr);
+sub_vmsd = vmstate_get_subsection(vmsd->subsections, idstr);
 if (sub_vmsd == NULL) {
 return -ENOENT;
 }
@@ -1740,7 +1734,6 @@ static int vmstate_subsection_load(QEMUFile *f, const VMStateDescription *vmsd,
 qemu_file_skip(f, len); /* idstr */
 version_id = qemu_get_be32(f);

-assert(!sub_vmsd->subsections);
 ret = vmstate_load_state(f, sub_vmsd, opaque, version_id);
 if (ret) {
 return ret;
@@ -1764,7 +1757,6 @@ static void vmstate_subsection_save(QEMUFile *f, const VMStateDescription *vmsd,
 qemu_put_byte(f, len);
 qemu_put_buffer(f, (uint8_t *)vmsd->name, len);
 qemu_put_be32(f, vmsd->version_id);
-assert(!vmsd->subsections);
 vmstate_save_state(f, vmsd, opaque);
 }
 sub++;
-- 
1.7.6.4




[Qemu-devel] [PATCH 3/5] savevm: define qemu_get_byte() using qemu_peek_byte()

2011-10-07 Thread Juan Quintela
Signed-off-by: Juan Quintela
Signed-off-by: Juan Quintela 
---
 savevm.c |   15 ++-
 1 files changed, 6 insertions(+), 9 deletions(-)

diff --git a/savevm.c b/savevm.c
index 4069b34..94628c6 100644
--- a/savevm.c
+++ b/savevm.c
@@ -578,17 +578,14 @@ static int qemu_peek_byte(QEMUFile *f)

 int qemu_get_byte(QEMUFile *f)
 {
-if (f->is_write) {
-abort();
-}
+int result;

-if (f->buf_index >= f->buf_size) {
-qemu_fill_buffer(f);
-if (f->buf_index >= f->buf_size) {
-return 0;
-}
+result = qemu_peek_byte(f);
+
+if (f->buf_index < f->buf_size) {
+f->buf_index++;
 }
-return f->buf[f->buf_index++];
+return result;
 }

 int64_t qemu_ftell(QEMUFile *f)
-- 
1.7.6.4




[Qemu-devel] [PATCH 4/5] savevm: improve subsections detection on load

2011-10-07 Thread Juan Quintela
We add qemu_peek_buffer(), which is identical to qemu_get_buffer() except
that it doesn't update f->buf_index.

We add a parameter to qemu_peek_byte() to be able to peek at bytes beyond
the first one.
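
The peek/get relationship described above can be sketched on a plain
in-memory buffer. This is only an illustration under stated assumptions: the
names Reader, reader_peek, reader_skip and reader_get are hypothetical, the
struct is not the QEMUFile layout, and refilling from a backend is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a read buffer; not the QEMUFile layout. */
typedef struct {
    const unsigned char *buf;
    size_t size;   /* number of valid bytes in buf */
    size_t index;  /* next unread byte */
} Reader;

/* Copy up to len bytes starting at index + offset without consuming them. */
static size_t reader_peek(const Reader *r, unsigned char *out,
                          size_t len, size_t offset)
{
    size_t start, pending;

    assert(r->index <= r->size);
    if (offset > r->size - r->index) {
        return 0;
    }
    start = r->index + offset;
    pending = r->size - start;
    if (len > pending) {
        len = pending;
    }
    memcpy(out, r->buf + start, len);
    return len;
}

/* Consume len bytes, clamped to what is available. */
static void reader_skip(Reader *r, size_t len)
{
    size_t pending = r->size - r->index;

    r->index += len < pending ? len : pending;
}

/* A consuming read is a peek at offset 0 followed by a skip. */
static size_t reader_get(Reader *r, unsigned char *out, size_t len)
{
    size_t n = reader_peek(r, out, len, 0);

    reader_skip(r, n);
    return n;
}
```

Building the consuming read on top of the peek keeps the bounds handling in
one place, which appears to be the motivation for the rewrite in this patch.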

Once this is done, to see if we have a subsection we check that:
- the 1st byte is QEMU_VM_SUBSECTION
- the 2nd byte is a length, and is bigger than the section name
- the 3rd element is a string that starts with section_name

So we shouldn't get false positives (yes, the content could still fool us,
but the probability is really low).

v2:
- Alex Williamson found that we could get negative values on index.
- Rework code to fix that part.
- Rewrite qemu_get_buffer() using qemu_peek_buffer()

v3:
- return "done" on error case

Signed-off-by: Juan Quintela
Signed-off-by: Juan Quintela 
---
 savevm.c |  110 ++
 1 files changed, 75 insertions(+), 35 deletions(-)

diff --git a/savevm.c b/savevm.c
index 94628c6..aafdc7b 100644
--- a/savevm.c
+++ b/savevm.c
@@ -532,59 +532,85 @@ void qemu_put_byte(QEMUFile *f, int v)
 qemu_fflush(f);
 }

-int qemu_get_buffer(QEMUFile *f, uint8_t *buf, int size1)
+static void qemu_file_skip(QEMUFile *f, int size)
 {
-int size, l;
+if (f->buf_index + size < f->buf_size) {
+f->buf_index += size;
+}
+}
+
+static int qemu_peek_buffer(QEMUFile *f, uint8_t *buf, int size, size_t offset)
+{
+int pending;
+int index;

 if (f->is_write) {
 abort();
 }

-size = size1;
-while (size > 0) {
-l = f->buf_size - f->buf_index;
-if (l == 0) {
-qemu_fill_buffer(f);
-l = f->buf_size - f->buf_index;
-if (l == 0) {
-break;
-}
-}
-if (l > size) {
-l = size;
+index = f->buf_index + offset;
+pending = f->buf_size - index;
+if (pending < size) {
+qemu_fill_buffer(f);
+index = f->buf_index + offset;
+pending = f->buf_size - index;
+}
+
+if (pending <= 0) {
+return 0;
+}
+if (size > pending) {
+size = pending;
+}
+
+memcpy(buf, f->buf + index, size);
+return size;
+}
+
+int qemu_get_buffer(QEMUFile *f, uint8_t *buf, int size)
+{
+int pending = size;
+int done = 0;
+
+while (pending > 0) {
+int res;
+
+res = qemu_peek_buffer(f, buf, pending, 0);
+if (res == 0) {
+return done;
 }
-memcpy(buf, f->buf + f->buf_index, l);
-f->buf_index += l;
-buf += l;
-size -= l;
+qemu_file_skip(f, res);
+buf += res;
+pending -= res;
+done += res;
 }
-return size1 - size;
+return done;
 }

-static int qemu_peek_byte(QEMUFile *f)
+static int qemu_peek_byte(QEMUFile *f, int offset)
 {
+int index = f->buf_index + offset;
+
 if (f->is_write) {
 abort();
 }

-if (f->buf_index >= f->buf_size) {
+if (index >= f->buf_size) {
 qemu_fill_buffer(f);
-if (f->buf_index >= f->buf_size) {
+index = f->buf_index + offset;
+if (index >= f->buf_size) {
 return 0;
 }
 }
-return f->buf[f->buf_index];
+return f->buf[index];
 }

 int qemu_get_byte(QEMUFile *f)
 {
 int result;

-result = qemu_peek_byte(f);
-
-if (f->buf_index < f->buf_size) {
-f->buf_index++;
-}
+result = qemu_peek_byte(f, 0);
+qemu_file_skip(f, 1);
 return result;
 }

@@ -1684,22 +1710,36 @@ static int vmstate_subsection_load(QEMUFile *f, const VMStateDescription *vmsd,
 return 0;
 }

-while (qemu_peek_byte(f) == QEMU_VM_SUBSECTION) {
+while (qemu_peek_byte(f, 0) == QEMU_VM_SUBSECTION) {
 char idstr[256];
 int ret;
-uint8_t version_id, len;
+uint8_t version_id, len, size;
 const VMStateDescription *sub_vmsd;

-qemu_get_byte(f); /* subsection */
-len = qemu_get_byte(f);
-qemu_get_buffer(f, (uint8_t *)idstr, len);
-idstr[len] = 0;
-version_id = qemu_get_be32(f);
+len = qemu_peek_byte(f, 1);
+if (len < strlen(vmsd->name) + 1) {
+/* subsection name has be be "section_name/a" */
+return 0;
+}
+size = qemu_peek_buffer(f, (uint8_t *)idstr, len, 2);
+if (size != len) {
+return 0;
+}
+idstr[size] = 0;

+if (strncmp(vmsd->name, idstr, strlen(vmsd->name)) != 0) {
+/* it don't have a valid subsection name */
+return 0;
+}
 sub_vmsd = vmstate_get_subsection(sub, idstr);
 if (sub_vmsd == NULL) {
 return -ENOENT;
 }
+qemu_file_skip(f, 1); /* subsection */
+qemu_file_skip(f, 1); /* len */
+qemu_file_skip(f, len); /* idstr */
+version_id = qemu_get_be32(f);
+
 assert(!sub_vmsd->subsections);
 ret = vmstate_load_state(f, sub_vmsd, opaque, version_id);
 if (ret) {
-- 

[Qemu-devel] [PATCH v3 0/5] migration: Improve subsections detection

2011-10-07 Thread Juan Quintela
Hi

v3:
- fix return value on qemu_get_buffer.

Anthony, all reviewers' comments are addressed; please consider applying.

Later, Juan.

v2:
- rename "used" to "remaining" (Alex suggestion)
- implement qemu_get_{byte,buffer} on top of qemu_peek_{byte, buffer}
  (Anthony suggestion)
- fix qemu_peek_buffer logic (Alex discovered the problem)

v1:
This series moves the subsection detection code from:
- Check that it starts with 5
To:
- Check that it starts with 5 (SUBSECTION)
- Look at the length
- Check that the length is bigger than the section name
- Look at the idstr and check that it starts with the section name.
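
The checks listed above can be sketched against a plain byte buffer. This is
an illustration only: looks_like_subsection is a hypothetical helper, and the
marker value 5 is taken from the description above. The real code works on a
QEMUFile and refills its buffer as needed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define VM_SUBSECTION 0x05  /* marker byte, per the description above */

/*
 * Return 1 if buf starts with a plausible subsection header for the
 * section called section_name: marker byte, length byte, then an id
 * string that is longer than section_name and begins with it.
 */
static int looks_like_subsection(const unsigned char *buf, size_t buf_len,
                                 const char *section_name)
{
    size_t name_len, id_len;

    assert(buf != NULL && section_name != NULL);
    name_len = strlen(section_name);
    if (buf_len < 2 || buf[0] != VM_SUBSECTION) {
        return 0;
    }
    id_len = buf[1];
    if (id_len < name_len + 1) {   /* must be at least "name" + 1 byte */
        return 0;
    }
    if (buf_len - 2 < id_len) {    /* id string not fully buffered */
        return 0;
    }
    return memcmp(buf + 2, section_name, name_len) == 0;
}
```

Nothing is consumed while checking, so a caller that gets 0 back can treat the
same bytes as ordinary section data.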

Please review.

Later, Juan.

Juan Quintela (5):
  savevm: teach qemu_fill_buffer to do partial refills
  savevm: some coding style cleanups
  savevm: define qemu_get_byte() using qemu_peek_byte()
  savevm: improve subsections detection on load
  Revert "savevm: fix corruption in vmstate_subsection_load()."

 savevm.c |  144 -
 1 files changed, 94 insertions(+), 50 deletions(-)

-- 
1.7.6.4




[Qemu-devel] [PATCH 1/5] savevm: teach qemu_fill_buffer to do partial refills

2011-10-07 Thread Juan Quintela
The next patch needs this in order to look ahead in the buffer.

v2: rename "used" to "pending" (Alex Williamson)
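
A partial refill can be sketched as follows: keep the unread tail, move it to
the front of the buffer, then top up the free space from the source. The names
Buf, buf_refill, fill_fn and read_from_string are hypothetical and the
capacity is deliberately tiny; this is not the QEMUFile code itself.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BUF_CAPACITY 8

/* Hypothetical source callback: writes up to len bytes, returns the count. */
typedef size_t (*fill_fn)(void *opaque, unsigned char *dst, size_t len);

typedef struct {
    unsigned char data[BUF_CAPACITY];
    size_t size;   /* valid bytes in data */
    size_t index;  /* next unread byte */
} Buf;

/* Keep the unread tail, move it to the front, then top up from the source. */
static void buf_refill(Buf *b, fill_fn fill, void *opaque)
{
    size_t pending;

    assert(b->index <= b->size && b->size <= BUF_CAPACITY);
    pending = b->size - b->index;
    if (pending > 0) {
        memmove(b->data, b->data + b->index, pending);
    }
    b->index = 0;
    b->size = pending;
    b->size += fill(opaque, b->data + pending, BUF_CAPACITY - pending);
}

/* Example source: hands out consecutive bytes of a C string. */
static size_t read_from_string(void *opaque, unsigned char *dst, size_t len)
{
    const char **cursor = opaque;
    size_t avail = strlen(*cursor);
    size_t n = len < avail ? len : avail;

    memcpy(dst, *cursor, n);
    *cursor += n;
    return n;
}
```

memmove rather than memcpy is used because the source and destination ranges
can overlap when more than half of the buffer is still unread.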

Signed-off-by: Juan Quintela
Reviewed-by: Anthony Liguori 
Signed-off-by: Juan Quintela 
---
 savevm.c |   14 +++---
 1 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/savevm.c b/savevm.c
index 46f2447..743c304 100644
--- a/savevm.c
+++ b/savevm.c
@@ -455,6 +455,7 @@ void qemu_fflush(QEMUFile *f)
 static void qemu_fill_buffer(QEMUFile *f)
 {
 int len;
+int pending;

 if (!f->get_buffer)
 return;
@@ -462,10 +463,17 @@ static void qemu_fill_buffer(QEMUFile *f)
 if (f->is_write)
 abort();

-len = f->get_buffer(f->opaque, f->buf, f->buf_offset, IO_BUF_SIZE);
+pending = f->buf_size - f->buf_index;
+if (pending > 0) {
+memmove(f->buf, f->buf + f->buf_index, pending);
+}
+f->buf_index = 0;
+f->buf_size = pending;
+
+len = f->get_buffer(f->opaque, f->buf + pending, f->buf_offset,
+IO_BUF_SIZE - pending);
 if (len > 0) {
-f->buf_index = 0;
-f->buf_size = len;
+f->buf_size += len;
 f->buf_offset += len;
 } else if (len != -EAGAIN)
 f->has_error = 1;
-- 
1.7.6.4




[Qemu-devel] [PATCH] pcnet: Add link state support

2011-10-07 Thread Jan Kiszka
Update lnkst on link state changes so that guests can obtain this
information via reading back the LED output pin. Works for Linux but
not for guests that depend on the missing PHY.

Signed-off-by: Jan Kiszka 
---
 hw/lance.c |1 +
 hw/pcnet-pci.c |1 +
 hw/pcnet.c |7 +++
 hw/pcnet.h |1 +
 4 files changed, 10 insertions(+), 0 deletions(-)

diff --git a/hw/lance.c b/hw/lance.c
index d83e7f5..93d5fda 100644
--- a/hw/lance.c
+++ b/hw/lance.c
@@ -97,6 +97,7 @@ static NetClientInfo net_lance_info = {
 .size = sizeof(NICState),
 .can_receive = pcnet_can_receive,
 .receive = pcnet_receive,
+.link_status_changed = pcnet_set_link_status,
 .cleanup = lance_cleanup,
 };
 
diff --git a/hw/pcnet-pci.c b/hw/pcnet-pci.c
index cab1116..548bb50 100644
--- a/hw/pcnet-pci.c
+++ b/hw/pcnet-pci.c
@@ -287,6 +287,7 @@ static NetClientInfo net_pci_pcnet_info = {
 .size = sizeof(NICState),
 .can_receive = pcnet_can_receive,
 .receive = pcnet_receive,
+.link_status_changed = pcnet_set_link_status,
 .cleanup = pci_pcnet_cleanup,
 };
 
diff --git a/hw/pcnet.c b/hw/pcnet.c
index add3ec2..cba253b 100644
--- a/hw/pcnet.c
+++ b/hw/pcnet.c
@@ -1197,6 +1197,13 @@ ssize_t pcnet_receive(VLANClientState *nc, const uint8_t *buf, size_t size_)
 return size_;
 }
 
+void pcnet_set_link_status(VLANClientState *nc)
+{
+PCNetState *d = DO_UPCAST(NICState, nc, nc)->opaque;
+
+d->lnkst = nc->link_down ? 0 : 0x40;
+}
+
 static void pcnet_transmit(PCNetState *s)
 {
 target_phys_addr_t xmit_cxda = 0;
diff --git a/hw/pcnet.h b/hw/pcnet.h
index 52cc52e..edc81c9 100644
--- a/hw/pcnet.h
+++ b/hw/pcnet.h
@@ -58,6 +58,7 @@ uint32_t pcnet_ioport_readl(void *opaque, uint32_t addr);
 uint32_t pcnet_bcr_readw(PCNetState *s, uint32_t rap);
 int pcnet_can_receive(VLANClientState *nc);
 ssize_t pcnet_receive(VLANClientState *nc, const uint8_t *buf, size_t size_);
+void pcnet_set_link_status(VLANClientState *nc);
 void pcnet_common_cleanup(PCNetState *d);
 int pcnet_common_init(DeviceState *dev, PCNetState *s, NetClientInfo *info);
 extern const VMStateDescription vmstate_pcnet;
-- 
1.7.3.4



Re: [Qemu-devel] [Question] dump memory when host pci device is used by guest

2011-10-07 Thread Jan Kiszka
On 2011-10-07 11:46, Wen Congyang wrote:
> Currently, virsh dump uses monitor command migrate to dump guest's memory
> to file, and we can use crash to analyze the file.
> 
> Unfortunately, virsh dump can not work if guest uses host pci device. The
> reason is that the device's status is also needed to migrate to remote machine,
> and the host pci device's status is not stored in qemu. So it is unmigratable.
> 
> I think we can add an option to the qmp command migrate (e.g. skip) to allow
> the user to skip the check, and this option should be used only when dumping
> the guest's memory.

Why not simply attach gdb? That works independently of migration.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux



[Qemu-devel] Logging Memory Writes in Qemu

2011-10-07 Thread Johannes Stuettgen

Hello,

I am trying to perform some memory measurements and was hoping you could 
point me in the right direction:


My goal is to log every write access to physical memory: the physical 
address written as well as the total number of bytes that are written 
(modified) in each access.
My first idea after reading the documentation was to change the 
MemoryOps->write ptr to point to a logging function and then pass the 
arguments on to the original. However, I couldn't reliably locate the 
place in the code where these ops get initialized.
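
The wrapping idea described above can be sketched in isolation. Everything
here is hypothetical (MemOps, install_write_logger, the fixed-size log array)
and is not QEMU's API; it only shows saving the original pointer and
forwarding to it after recording the address and length.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical ops table with a single write callback. */
typedef struct {
    void (*write)(void *opaque, uint64_t addr, const uint8_t *buf, size_t len);
} MemOps;

typedef struct {
    uint64_t addr;
    size_t len;
} WriteRecord;

#define LOG_CAPACITY 16
static WriteRecord write_log[LOG_CAPACITY];
static size_t write_log_count;
static void (*original_write)(void *, uint64_t, const uint8_t *, size_t);

/* Record address and length, then forward to the original callback. */
static void logging_write(void *opaque, uint64_t addr,
                          const uint8_t *buf, size_t len)
{
    if (write_log_count < LOG_CAPACITY) {
        write_log[write_log_count].addr = addr;
        write_log[write_log_count].len = len;
        write_log_count++;
    }
    original_write(opaque, addr, buf, len);
}

/* Swap the logger in, remembering the callback it replaces. */
static void install_write_logger(MemOps *ops)
{
    assert(ops->write != NULL && ops->write != logging_write);
    original_write = ops->write;
    ops->write = logging_write;
}

/* Example backend: copies into a small RAM array passed as opaque. */
static void ram_write(void *opaque, uint64_t addr,
                      const uint8_t *buf, size_t len)
{
    uint8_t *ram = opaque;
    size_t i;

    for (i = 0; i < len; i++) {
        ram[addr + i] = buf[i];
    }
}
```

A hook like this only sees writes that actually go through the callback, so
any path that writes to memory directly would bypass it.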


I also had a look at exec.c, and tried to hook into 'void 
cpu_physical_memory_rw(target_phys_addr_t addr, uint8_t *buf, int len, 
int is_write)'. This function gets called when starting qemu without any 
arguments (and thus no hard disks or CD-ROM images); however, as soon as I 
boot from a hard disk the function does not seem to get called anymore.


These are the calls I get when booting an empty system:
0xAddress:written_bytes
---
0x07FDD000:590
0x07FDC900:16
0x07FDC88C:4
0x07FDD800:590
0x07FDC910:16
0x07FDC89C:4
0x07FDE000:590
0x07FDC920:16
0x07FDC8AC:4

What would be the 'right' place to place such a hook in your opinion?

Sincerely,
Johannes Stuettgen



[Qemu-devel] [Question] dump memory when host pci device is used by guest

2011-10-07 Thread Wen Congyang
Currently, virsh dump uses monitor command migrate to dump guest's memory
to file, and we can use crash to analyze the file.

Unfortunately, virsh dump can not work if guest uses host pci device. The
reason is that the device's status is also needed to migrate to remote machine,
and the host pci device's status is not stored in qemu. So it is unmigratable.

I think we can add an option to the qmp command migrate (e.g. skip) to allow
the user to skip the check, and this option should be used only when dumping
the guest's memory.

Thanks
Wen Congyang



Re: [Qemu-devel] [PATCH] Raise 9pfs mount_tag limit from 32 to 255 bytes

2011-10-07 Thread Daniel P. Berrange
On Thu, Sep 29, 2011 at 04:22:16PM +0100, Daniel P. Berrange wrote:
> On Thu, Sep 29, 2011 at 08:23:49PM +0530, Aneesh Kumar K.V wrote:
> > On Thu, 29 Sep 2011 11:34:21 +0100, "Daniel P. Berrange" wrote:
> > > From: "Daniel P. Berrange" 
> > > 
> > > The Linux guest kernel does not appear to have a problem handling
> > > a mount_tag larger than 32 bytes. Increase the limit to 255 bytes,
> > > though perhaps it can be made larger still, or not limited at all ?
> > > 
> > > Tested with a 3.0.4 kernel and a mount_tag 255 bytes in length.
> > > 
> > > * hw/9pfs/virtio-9p.h: Change MAX_TAG_LEN to 255
> > 
> > 
> > mount_tag is passed via pci config space, do we want to have 255 bytes
> > out of that for device identification.
> 
> How big is the config space available for each 9pfs device and what
> other info does it need to keep there ?

Does anyone have a clear answer for this ?

I've done some tests with ever larger mount tags, and managed to increase
the MAX_TAG_LEN value to 1023  before I started getting guest failures.

So if the config space is really 1023 bytes in size, it doesn't seem too
unrealistic to allow 255 bytes of it for the mount_tag, or at the very
least increase it from 32 to 128 ?

Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|



Re: [Qemu-devel] [PATCH 1/4] Add basic version of bridge helper

2011-10-07 Thread Daniel P. Berrange
On Thu, Oct 06, 2011 at 02:38:56PM -0400, Corey Bryant wrote:
> 
> 
> On 10/06/2011 02:04 PM, Anthony Liguori wrote:
> >On 10/06/2011 11:41 AM, Daniel P. Berrange wrote:
> >>On Thu, Oct 06, 2011 at 11:38:25AM -0400, Richa Marwaha wrote:
> >>>This patch adds a helper that can be used to create a tap device
> >>>attached to
> >>>a bridge device. Since this helper is minimal in what it does, it can be
> >>>given CAP_NET_ADMIN which allows qemu to avoid running as root while
> >>>still
> >>>satisfying the majority of what users tend to want to do with tap
> >>>devices.
> >>>
> >>>The way this all works is that qemu launches this helper passing a
> >>>bridge
> >>>name and the name of an inherited file descriptor. The descriptor is one
> >>>end of a socketpair() of domain sockets. This domain socket is used to
> >>>transmit a file descriptor of the opened tap device from the helper
> >>>to qemu.
> >>>
> >>>The helper can then exit and let qemu use the tap device.
> >>
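
The descriptor hand-off described in the quoted text can be sketched with
SCM_RIGHTS ancillary data on a Unix domain socket. This is a minimal sketch,
not the helper's actual code: send_fd and recv_fd are hypothetical names, and
the helper in the patch is a separate process that inherits one end of the
socketpair.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Send one file descriptor over a Unix domain socket. Returns 0 on success. */
static int send_fd(int sock, int fd)
{
    char byte = 0;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union {
        struct cmsghdr align;  /* forces correct alignment of buf */
        char buf[CMSG_SPACE(sizeof(int))];
    } control;
    struct msghdr msg;
    struct cmsghdr *cmsg;

    assert(fd >= 0);
    memset(&control, 0, sizeof(control));
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;            /* at least one data byte must be sent */
    msg.msg_iovlen = 1;
    msg.msg_control = control.buf;
    msg.msg_controllen = sizeof(control.buf);

    cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one file descriptor. Returns the new descriptor, or -1. */
static int recv_fd(int sock)
{
    char byte;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union {
        struct cmsghdr align;
        char buf[CMSG_SPACE(sizeof(int))];
    } control;
    struct msghdr msg;
    struct cmsghdr *cmsg;
    int fd;

    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = control.buf;
    msg.msg_controllen = sizeof(control.buf);

    if (recvmsg(sock, &msg, 0) != 1) {
        return -1;
    }
    cmsg = CMSG_FIRSTHDR(&msg);
    if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS ||
        cmsg->cmsg_len != CMSG_LEN(sizeof(int))) {
        return -1;
    }
    memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return fd;
}
```

The receiver gets a new descriptor number that refers to the same open file,
so the sender can close its copy and exit once the message has been sent.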
> >>When QEMU is run by libvirt, we generally like to use capng to
> >>remove the ability for QEMU to run setuid programs at all. So
> >>obviously it will struggle to run the qemu-bridge-helper binary
> >>in such a scenario.
> >>
> >>With the way you transmit the TAP device FD back to the caller,
> >>it looks like libvirt itself could execute the qemu-bridge-helper
> >>receiving the FD, and then pass the FD onto QEMU using the
> >>traditional tap,fd=XX syntax.
> >
> >Exactly. This would allow tap-based networking using libvirt session://
> >URIs.
> >
> 
> I'll take note of this.  It seems like it would be a nice future
> addition to libvirt.
> 
> A slight tangent, but a point on DAC isolation.  The helper enables
> DAC isolation for qemu:///session but we still need some work in
> libvirt to provide DAC isolation for qemu:///system.  This could be
> done by allowing management applications to specify custom
> user/group IDs when creating guests rather than hard coding the IDs
> in the configuration file.

Yes, this is an item on our todo list for libvirt. There are a couple of
work items involved:

 - Extend the XML to allow multiple <seclabel> elements, one per
   security driver in use.
 - Add a new API to allow fetching of live seclabel data per
   security driver
 - Extend the current DAC security driver to automatically allocate
   UIDs from an admin defined range, and/or pull them from the XML
   provided by app.

Technically we could do item 3 without doing items 1/2, but that would
necessitate *not* using the sVirt security driver. I don't think that's
too useful, so items 1/2 let us use both the sVirt & enhanced DAC driver
at the same time.

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|



Re: [Qemu-devel] qemu guest agent spins in poll/nanosleep(100ms) when nothing is listening on host

2011-10-07 Thread Daniel P. Berrange
On Thu, Oct 06, 2011 at 04:15:07PM -0500, Michael Roth wrote:
> On Thu, 6 Oct 2011 12:31:05 +0100, "Daniel P. Berrange" wrote:
> > I get the feeling that this kind of problem is inherent in the use of any
> > virtio-serial channel, in the same way you can't detect EOF for a regular
> > serial device channel either. Given that virtio-serial is a nice paravirt
> > device, is there anything we can do to it, to allow better handling of
> > EOF by applications ?
> 
> Indeed, and there was a discussion a while back where I think we had tentative
> agreement on a path forward for this. Unfortunately there doesn't seem to be
> a clear solution for doing it purely in guest-userspace:
> 
> http://www.mail-archive.com/qemu-devel@nongnu.org/msg57002.html
> 
> The gist of it is basically making the (guest-side) virtio-serial chardev
> behave more like a unix socket, i.e. if the host hangs up you get a single EOF
> and then your FD becomes invalid, at which point you need to re-open the
> chardev to get a valid FD. This could potentially be done via a new set of
> -chardev/-device flags.

Ah interesting idea.

> > Or perhaps there is some way to make use of epoll() in edge-triggered
> > mode to detect it already, because IIUC, edge-triggered mode should only
> > fire once for the EOF condition, and then not fire again until something
> > in the host actually sends some data ?
> > 
> > Of course glib's event loop doesn't support edge-triggered events/epoll,
> > but perhaps we could just call epoll() directly in the event handler,
> > instead of the usleep() call ?
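
The difference between level-triggered and edge-triggered reporting of an EOF
condition can be sketched on a socketpair. This is an illustration under
stated assumptions: count_ready_reports is a hypothetical helper, and whether
a virtio-serial port in the guest reports a host disconnect the same way as a
socket is not established here.

```c
#include <assert.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Register fd for EPOLLIN with the given extra flags (0 for
 * level-triggered, EPOLLET for edge-triggered) and return how many of
 * two back-to-back zero-timeout waits report it ready, or -1 on error.
 */
static int count_ready_reports(int fd, unsigned int extra_flags)
{
    struct epoll_event ev = { 0 };
    struct epoll_event out;
    int ep, i, n, reports = 0;

    assert(fd >= 0);
    ep = epoll_create1(0);
    if (ep < 0) {
        return -1;
    }
    ev.events = EPOLLIN | extra_flags;
    ev.data.fd = fd;
    if (epoll_ctl(ep, EPOLL_CTL_ADD, fd, &ev) < 0) {
        close(ep);
        return -1;
    }
    for (i = 0; i < 2; i++) {
        /* Nothing is read between the waits, so the condition persists. */
        n = epoll_wait(ep, &out, 1, 0);
        if (n > 0) {
            reports++;
        }
    }
    close(ep);
    return reports;
}
```

On a socket whose peer has been closed, the level-triggered registration is
reported by both waits, while the edge-triggered one is reported only by the
first; a loop built on the latter could block in epoll_wait() instead of
sleeping and polling.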
> 
> That's definitely worth looking into. Has the 100ms sleep been causing any
> issues though? My main concern with the polling behavior was less a matter of
> performance than being able to provide a "session" where the start and end
> of a stream could be reliably determined, which we don't have currently. But
> the guest agent has since been reworked to persist state between host
> connects/disconnects so it didn't seem to be a major issue anymore.

We're intending to have the agent installed in all Fedora 16 and later
guests by default, and used by libvirt for shutdown/reboot. So I
was just looking at how it was working to ensure there are no surprises
and happened to notice the wakeups when disconnected on the host. So it
hasn't actually caused any problems for me; I just have a general desire
to ensure code doesn't do frequent wakeups when there's no work to do.

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|



[Qemu-devel] [PATCH 4/4] scsi-disk: fix retrying a flush

2011-10-07 Thread Paolo Bonzini
Flush no longer goes through scsi_disk_emulate_command.

Signed-off-by: Paolo Bonzini 
---
 hw/scsi-disk.c |9 +++--
 1 files changed, 3 insertions(+), 6 deletions(-)

diff --git a/hw/scsi-disk.c b/hw/scsi-disk.c
index d6f2345..eb0c679 100644
--- a/hw/scsi-disk.c
+++ b/hw/scsi-disk.c
@@ -81,7 +81,7 @@ struct SCSIDiskState
 };
 
 static int scsi_handle_rw_error(SCSIDiskReq *r, int error, int type);
-static int scsi_disk_emulate_command(SCSIDiskReq *r);
+static int32_t scsi_send_command(SCSIRequest *req, uint8_t *buf);
 
 static void scsi_free_request(SCSIRequest *req)
 {
@@ -335,7 +335,6 @@ static void scsi_dma_restart_bh(void *opaque)
 r = DO_UPCAST(SCSIDiskReq, req, req);
 if (r->status & SCSI_REQ_STATUS_RETRY) {
 int status = r->status;
-int ret;
 
 r->status &=
 ~(SCSI_REQ_STATUS_RETRY | SCSI_REQ_STATUS_RETRY_TYPE_MASK);
@@ -348,10 +347,8 @@ static void scsi_dma_restart_bh(void *opaque)
 scsi_write_data(&r->req);
 break;
 case SCSI_REQ_STATUS_RETRY_FLUSH:
-ret = scsi_disk_emulate_command(r);
-if (ret == 0) {
-scsi_req_complete(&r->req, GOOD);
-}
+scsi_send_command(&r->req, r->req.cmd.buf);
+break;
 }
 /* This reference was left in by scsi_handle_rw_error.  */
 scsi_req_unref(&r->req);
-- 
1.7.6




[Qemu-devel] [PATCH 3/4] scsi-disk: bump SCSIRequest reference count until aio completion runs

2011-10-07 Thread Paolo Bonzini
In some cases a request may be canceled before the completion callback
runs.  Take a reference to the request when starting an AIO operation,
and let scsi_*_complete drop it.

Since scsi_handle_rw_error returns whether something else has to be done
for the request by the caller, it makes sense to transfer ownership of
the ref to scsi_handle_rw_error when it returns 1; scsi_dma_restart_bh
will then free the reference after restarting the operation.

This is reproducible by doing an "eject -f" during an installer's media
test, using the lsi adapter.  The resulting "ABORT" message causes the
request to be canceled and freed before the read completes.
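
The ownership rule described above can be sketched with a toy request object.
All names here (Request, request_start_io, request_cancel,
request_io_complete) are hypothetical and the struct is not SCSIRequest; the
sketch only shows why the pending callback needs its own reference.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical request object; not the SCSIRequest layout. */
typedef struct {
    int refcount;
    int canceled;
    int *freed_flag;  /* set to 1 when the request is released */
} Request;

static Request *request_new(int *freed_flag)
{
    Request *req = calloc(1, sizeof(*req));

    if (req != NULL) {
        req->refcount = 1;  /* reference held by the owner */
        req->freed_flag = freed_flag;
    }
    return req;
}

static void request_ref(Request *req)
{
    assert(req->refcount > 0);
    req->refcount++;
}

static void request_unref(Request *req)
{
    assert(req->refcount > 0);
    if (--req->refcount == 0) {
        *req->freed_flag = 1;
        free(req);
    }
}

/* Start an asynchronous operation: the pending callback owns a reference. */
static void request_start_io(Request *req)
{
    request_ref(req);
}

/* Cancel: the owner gives up its reference before the callback has run. */
static void request_cancel(Request *req)
{
    req->canceled = 1;
    request_unref(req);
}

/* Completion callback: req is still valid here; drop the callback's ref. */
static void request_io_complete(Request *req)
{
    if (!req->canceled) {
        /* deliver data to the owner here */
    }
    request_unref(req);
}
```

Without the extra reference taken in request_start_io, the cancel would free
the request and the completion callback would then touch freed memory.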

Signed-off-by: Paolo Bonzini 
---
 hw/scsi-disk.c |   14 ++
 1 files changed, 14 insertions(+), 0 deletions(-)

diff --git a/hw/scsi-disk.c b/hw/scsi-disk.c
index 6497655..d6f2345 100644
--- a/hw/scsi-disk.c
+++ b/hw/scsi-disk.c
@@ -139,6 +139,7 @@ static void scsi_read_complete(void * opaque, int ret)
 
 if (ret) {
 if (scsi_handle_rw_error(r, -ret, SCSI_REQ_STATUS_RETRY_READ)) {
+/* Leave in ref for scsi_dma_restart_bh.  */
 return;
 }
 }
@@ -149,6 +150,7 @@ static void scsi_read_complete(void * opaque, int ret)
 r->sector += n;
 r->sector_count -= n;
 scsi_req_data(&r->req, r->qiov.size);
+scsi_req_unref(&r->req);
 }
 
 static void scsi_flush_complete(void * opaque, int ret)
@@ -163,11 +165,13 @@ static void scsi_flush_complete(void * opaque, int ret)
 
 if (ret < 0) {
 if (scsi_handle_rw_error(r, -ret, SCSI_REQ_STATUS_RETRY_FLUSH)) {
+/* Leave in ref for scsi_dma_restart_bh.  */
 return;
 }
 }
 
 scsi_req_complete(&r->req, GOOD);
+scsi_req_unref(&r->req);
 }
 
 /* Read more data from scsi device into buffer.  */
@@ -202,6 +206,9 @@ static void scsi_read_data(SCSIRequest *req)
 if (s->tray_open) {
 scsi_read_complete(r, -ENOMEDIUM);
 }
+
+/* Save a ref for scsi_read_complete, in case r is canceled.  */
+scsi_req_ref(&r->req);
 n = scsi_init_iovec(r);
 bdrv_acct_start(s->bs, &r->acct, n * BDRV_SECTOR_SIZE, BDRV_ACCT_READ);
 r->req.aiocb = bdrv_aio_readv(s->bs, r->sector, &r->qiov, n,
@@ -278,6 +285,7 @@ static void scsi_write_complete(void * opaque, int ret)
 DPRINTF("Write complete tag=0x%x more=%d\n", r->req.tag, r->qiov.size);
 scsi_req_data(&r->req, r->qiov.size);
 }
+scsi_req_unref(&r->req);
 }
 
 static void scsi_write_data(SCSIRequest *req)
@@ -295,6 +303,8 @@ static void scsi_write_data(SCSIRequest *req)
 return;
 }
 
+/* Save a ref for scsi_write_complete, in case r is canceled.  */
+scsi_req_ref(&r->req);
 n = r->qiov.size / 512;
 if (n) {
 if (s->tray_open) {
@@ -343,6 +353,8 @@ static void scsi_dma_restart_bh(void *opaque)
 scsi_req_complete(&r->req, GOOD);
 }
 }
+/* This reference was left in by scsi_handle_rw_error.  */
+scsi_req_unref(&r->req);
 }
 }
 }
@@ -1324,6 +1336,8 @@ static int32_t scsi_send_command(SCSIRequest *req, uint8_t *buf)
 r->iov.iov_len = rc;
 break;
 case SYNCHRONIZE_CACHE:
+/* Save a ref for scsi_flush_complete, in case r is canceled.  */
+scsi_req_ref(&r->req);
 bdrv_acct_start(s->bs, &r->acct, 0, BDRV_ACCT_FLUSH);
 r->req.aiocb = bdrv_aio_flush(s->bs, scsi_flush_complete, r);
 if (r->req.aiocb == NULL) {
-- 
1.7.6





[Qemu-devel] [PATCH 0/4] scsi: miscellaneous fixes

2011-10-07 Thread Paolo Bonzini
The most important part is a fix for use-after-free that I found while
testing CD-ROM eject.

Paolo Bonzini (4):
  scsi-disk: fail READ CAPACITY if LBA != 0 but PMI == 0
  scsi-disk: do not complete requests twice
  scsi-disk: bump SCSIRequest reference count until aio completion runs
  scsi-disk: fix retrying a flush

 hw/scsi-disk.c |   38 +-
 2 files changed, 30 insertions(+), 10 deletions(-)

-- 
1.7.6




[Qemu-devel] [PATCH 1/4] scsi-disk: fail READ CAPACITY if LBA != 0 but PMI == 0

2011-10-07 Thread Paolo Bonzini
Tested by the Windows Logo Kit SCSI Compliance test. From SBC-3, paragraph
5.25: "The LOGICAL BLOCK ADDRESS field shall be set to zero if the PMI
bit is set to zero. If the PMI bit is set to zero and the LOGICAL BLOCK
ADDRESS field is not set to zero, then the device server shall terminate
the command with CHECK CONDITION status with the sense key set to ILLEGAL
REQUEST and the additional sense code set to INVALID FIELD IN CDB".

Signed-off-by: Paolo Bonzini 
---
 hw/scsi-disk.c |   12 ++--
 2 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/hw/scsi-disk.c b/hw/scsi-disk.c
index d9fa8f7..4757a02 100644
--- a/hw/scsi-disk.c
+++ b/hw/scsi-disk.c
@@ -1160,8 +1160,12 @@ static int scsi_disk_emulate_command(SCSIDiskReq *r)
 /* The normal LEN field for this command is zero.  */
 memset(outbuf, 0, 8);
 bdrv_get_geometry(s->bs, &nb_sectors);
-if (!nb_sectors)
+if (!nb_sectors) {
 goto not_ready;
+}
+if ((req->cmd.buf[8] & 1) == 0 && req->cmd.lba) {
+goto illegal_request;
+}
 nb_sectors /= s->cluster_size;
 /* Returned value is the address of the last sector.  */
 nb_sectors--;
@@ -1206,8 +1210,12 @@ static int scsi_disk_emulate_command(SCSIDiskReq *r)
 DPRINTF("SAI READ CAPACITY(16)\n");
 memset(outbuf, 0, req->cmd.xfer);
 bdrv_get_geometry(s->bs, &nb_sectors);
-if (!nb_sectors)
+if (!nb_sectors) {
 goto not_ready;
+}
+if ((req->cmd.buf[14] & 1) == 0 && req->cmd.lba) {
+goto illegal_request;
+}
 nb_sectors /= s->cluster_size;
 /* Returned value is the address of the last sector.  */
 nb_sectors--;
-- 
1.7.6
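
The SBC-3 rule quoted in the commit message reduces to a single predicate. A minimal sketch, assuming the PMI bit is bit 0 of the CDB byte passed in (byte 8 for READ CAPACITY(10) and byte 14 for the 16-byte variant, as in the hunks above); read_capacity_cdb_ok is a hypothetical helper, not a function in hw/scsi-disk.c:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns true if the LOGICAL BLOCK ADDRESS field is
 * acceptable for the given PMI bit. With PMI == 0 the LBA must be zero;
 * otherwise the command should fail with ILLEGAL REQUEST.
 */
static bool read_capacity_cdb_ok(const uint8_t *cdb, int pmi_byte, uint64_t lba)
{
    bool pmi = (cdb[pmi_byte] & 1) != 0;

    return pmi || lba == 0;
}
```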





[Qemu-devel] [PATCH 2/4] scsi-disk: do not complete requests twice

2011-10-07 Thread Paolo Bonzini
When scsi_handle_rw_error reports a CHECK CONDITION code, the
owner should not call scsi_req_complete.

Signed-off-by: Paolo Bonzini 
---
 hw/scsi-disk.c |3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/hw/scsi-disk.c b/hw/scsi-disk.c
index 4757a02..6497655 100644
--- a/hw/scsi-disk.c
+++ b/hw/scsi-disk.c
@@ -230,6 +230,7 @@ static int scsi_handle_rw_error(SCSIDiskReq *r, int error, 
int type)
 
 bdrv_mon_event(s->bs, BDRV_ACTION_STOP, is_read);
 vm_stop(RSTATE_IO_ERROR);
+return 1;
 } else {
 switch (error) {
 case ENOMEDIUM:
@@ -246,8 +247,8 @@ static int scsi_handle_rw_error(SCSIDiskReq *r, int error, 
int type)
 break;
 }
 bdrv_mon_event(s->bs, BDRV_ACTION_REPORT, is_read);
+return 0;
 }
-return 1;
 }
 
 static void scsi_write_complete(void * opaque, int ret)
-- 
1.7.6





[Qemu-devel] [PATCH v2 08/23] i8259: Introduce per-PIC output interrupt

2011-10-07 Thread Jan Kiszka
As a first step towards more generic master-slave support, remove
parent_irq in favor of a per-PIC output interrupt line. The slave's
line is attached to IRQ2 of the master, but it remains unused for now.

Signed-off-by: Jan Kiszka 
---
 hw/i8259.c |   21 -
 1 files changed, 12 insertions(+), 9 deletions(-)

diff --git a/hw/i8259.c b/hw/i8259.c
index de2d5ca..65123bd 100644
--- a/hw/i8259.c
+++ b/hw/i8259.c
@@ -58,6 +58,7 @@ typedef struct PicState {
 uint8_t single_mode; /* true if slave pic is not initialized */
 uint8_t elcr; /* PIIX edge/trigger selection*/
 uint8_t elcr_mask;
+qemu_irq int_out;
 PicState2 *pics_state;
 MemoryRegion base_io;
 MemoryRegion elcr_io;
@@ -67,7 +68,6 @@ struct PicState2 {
 /* 0 is master pic, 1 is slave pic */
 /* XXX: better separation between the two pics */
 PicState pics[2];
-qemu_irq parent_irq;
 void *irq_request_opaque;
 };
 
@@ -148,9 +148,9 @@ static void pic_update_irq(PicState2 *s)
 }
 printf("pic: cpu_interrupt\n");
 #endif
-qemu_irq_raise(s->parent_irq);
+qemu_irq_raise(s->pics[0].int_out);
 } else {
-qemu_irq_lower(s->parent_irq);
+qemu_irq_lower(s->pics[0].int_out);
 }
 }
 
@@ -297,7 +297,7 @@ static void pic_ioport_write(void *opaque, 
target_phys_addr_t addr64,
 /* init */
 pic_reset(s);
 /* deassert a pending interrupt */
-qemu_irq_lower(s->pics_state->parent_irq);
+qemu_irq_lower(s->pics_state->pics[0].int_out);
 s->init_state = 1;
 s->init4 = val & 1;
 s->single_mode = val & 2;
@@ -502,8 +502,10 @@ static const MemoryRegionOps pic_elcr_ioport_ops = {
 };
 
 /* XXX: add generic master/slave system */
-static void pic_init1(int io_addr, int elcr_addr, PicState *s)
+static void pic_init(int io_addr, int elcr_addr, PicState *s, qemu_irq int_out)
 {
+s->int_out = int_out;
+
 memory_region_init_io(&s->base_io, &pic_base_ioport_ops, s, "pic", 2);
 memory_region_init_io(&s->elcr_io, &pic_elcr_ioport_ops, s, "elcr", 1);
 
@@ -553,16 +555,17 @@ void irq_info(Monitor *mon)
 
 qemu_irq *i8259_init(qemu_irq parent_irq)
 {
+qemu_irq *irqs;
 PicState2 *s;
 
 s = g_malloc0(sizeof(PicState2));
-pic_init1(0x20, 0x4d0, &s->pics[0]);
-pic_init1(0xa0, 0x4d1, &s->pics[1]);
+irqs = qemu_allocate_irqs(i8259_set_irq, s, 16);
+pic_init(0x20, 0x4d0, &s->pics[0], parent_irq);
+pic_init(0xa0, 0x4d1, &s->pics[1], irqs[2]);
 s->pics[0].elcr_mask = 0xf8;
 s->pics[1].elcr_mask = 0xde;
-s->parent_irq = parent_irq;
 s->pics[0].pics_state = s;
 s->pics[1].pics_state = s;
 isa_pic = s;
-return qemu_allocate_irqs(i8259_set_irq, s, 16);
+return irqs;
 }
-- 
1.7.3.4




[Qemu-devel] [PATCH v2 15/23] i8259: Clean up pic_ioport_read

2011-10-07 Thread Jan Kiszka
Drop redundant local address variable.

Signed-off-by: Jan Kiszka 
---
 hw/i8259.c |3 +--
 1 files changed, 1 insertions(+), 2 deletions(-)

diff --git a/hw/i8259.c b/hw/i8259.c
index 31962c0..545d723 100644
--- a/hw/i8259.c
+++ b/hw/i8259.c
@@ -385,11 +385,10 @@ static uint32_t pic_poll_read(PicState *s)
 return ret;
 }
 
-static uint64_t pic_ioport_read(void *opaque, target_phys_addr_t addr1,
+static uint64_t pic_ioport_read(void *opaque, target_phys_addr_t addr,
 unsigned size)
 {
 PicState *s = opaque;
-unsigned int addr = addr1;
 int ret;
 
 if (s->poll) {
-- 
1.7.3.4




[Qemu-devel] [PATCH v2 10/23] i8259: Reorder intack in pic_read_irq

2011-10-07 Thread Jan Kiszka
As we want to move the IRQ update to pic_intack, ordering matters: the
slave ack must be executed before the master ack to avoid missing
further pending slave IRQs.

Signed-off-by: Jan Kiszka 
---
 hw/i8259.c |   10 ++
 1 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/hw/i8259.c b/hw/i8259.c
index cddd3c7..b7a011f 100644
--- a/hw/i8259.c
+++ b/hw/i8259.c
@@ -228,7 +228,6 @@ int pic_read_irq(PicState2 *s)
 
 irq = pic_get_irq(&s->pics[0]);
 if (irq >= 0) {
-pic_intack(&s->pics[0], irq);
 if (irq == 2) {
 irq2 = pic_get_irq(&s->pics[1]);
 if (irq2 >= 0) {
@@ -238,12 +237,10 @@ int pic_read_irq(PicState2 *s)
 irq2 = 7;
 }
 intno = s->pics[1].irq_base + irq2;
-#if defined(DEBUG_PIC) || defined(DEBUG_IRQ_LATENCY)
-irq = irq2 + 8;
-#endif
 } else {
 intno = s->pics[0].irq_base + irq;
 }
+pic_intack(&s->pics[0], irq);
 } else {
 /* spurious IRQ on host controller */
 irq = 7;
@@ -251,6 +248,11 @@ int pic_read_irq(PicState2 *s)
 }
 pic_update_irq(s);
 
+#if defined(DEBUG_PIC) || defined(DEBUG_IRQ_LATENCY)
+if (irq == 2) {
+irq = irq2 + 8;
+}
+#endif
 #ifdef DEBUG_IRQ_LATENCY
 printf("IRQ%d latency=%0.3fus\n",
irq,
-- 
1.7.3.4




[Qemu-devel] [PATCH v2 18/23] i8259: Eliminate PicState2

2011-10-07 Thread Jan Kiszka
Introduce a reference to the slave PIC for the few cases we need to
access it without a proper pointer at hand and drop PicState2. We could
even live without slave_pic if we had a better way of modeling the
cascade bus the PICs are attached to (in addition to the ISA bus).

Signed-off-by: Jan Kiszka 
---
 hw/i8259.c |   61 +--
 hw/pc.h|8 +++---
 2 files changed, 34 insertions(+), 35 deletions(-)

diff --git a/hw/i8259.c b/hw/i8259.c
index dc13b12..df23bb8 100644
--- a/hw/i8259.c
+++ b/hw/i8259.c
@@ -40,7 +40,7 @@
 //#define DEBUG_IRQ_LATENCY
 //#define DEBUG_IRQ_COUNT
 
-typedef struct PicState {
+struct PicState {
 uint8_t last_irr; /* edge detection */
 uint8_t irr; /* interrupt request register */
 uint8_t imr; /* interrupt mask register */
@@ -62,13 +62,6 @@ typedef struct PicState {
 bool master; /* reflects /SP input pin */
 MemoryRegion base_io;
 MemoryRegion elcr_io;
-} PicState;
-
-struct PicState2 {
-/* 0 is master pic, 1 is slave pic */
-/* XXX: better separation between the two pics */
-PicState pics[2];
-void *irq_request_opaque;
 };
 
 #if defined(DEBUG_PIC) || defined (DEBUG_IRQ_COUNT)
@@ -77,7 +70,8 @@ static int irq_level[16];
 #ifdef DEBUG_IRQ_COUNT
 static uint64_t irq_count[16];
 #endif
-PicState2 *isa_pic;
+PicState *isa_pic;
+static PicState *slave_pic;
 
 /* return the highest priority found in mask (highest = smallest
number). Return 8 if no irq */
@@ -168,7 +162,7 @@ int64_t irq_time[16];
 
 static void i8259_set_irq(void *opaque, int irq, int level)
 {
-PicState2 *s = opaque;
+PicState *s = irq <= 7 ? isa_pic : slave_pic;
 
 #if defined(DEBUG_PIC) || defined(DEBUG_IRQ_COUNT)
 if (level != irq_level[irq]) {
@@ -185,7 +179,7 @@ static void i8259_set_irq(void *opaque, int irq, int level)
 irq_time[irq] = qemu_get_clock_ns(vm_clock);
 }
 #endif
-pic_set_irq1(&s->pics[irq >> 3], irq & 7, level);
+pic_set_irq1(s, irq & 7, level);
 }
 
 /* acknowledge interrupt 'irq' */
@@ -203,29 +197,29 @@ static void pic_intack(PicState *s, int irq)
 pic_update_irq(s);
 }
 
-int pic_read_irq(PicState2 *s)
+int pic_read_irq(PicState *s)
 {
 int irq, irq2, intno;
 
-irq = pic_get_irq(&s->pics[0]);
+irq = pic_get_irq(s);
 if (irq >= 0) {
 if (irq == 2) {
-irq2 = pic_get_irq(&s->pics[1]);
+irq2 = pic_get_irq(slave_pic);
 if (irq2 >= 0) {
-pic_intack(&s->pics[1], irq2);
+pic_intack(slave_pic, irq2);
 } else {
 /* spurious IRQ on slave controller */
 irq2 = 7;
 }
-intno = s->pics[1].irq_base + irq2;
+intno = slave_pic->irq_base + irq2;
 } else {
-intno = s->pics[0].irq_base + irq;
+intno = s->irq_base + irq;
 }
-pic_intack(&s->pics[0], irq);
+pic_intack(s, irq);
 } else {
 /* spurious IRQ on host controller */
 irq = 7;
-intno = s->pics[0].irq_base + irq;
+intno = s->irq_base + irq;
 }
 
 #if defined(DEBUG_PIC) || defined(DEBUG_IRQ_LATENCY)
@@ -390,9 +384,9 @@ static uint64_t pic_ioport_read(void *opaque, 
target_phys_addr_t addr,
 return ret;
 }
 
-int pic_get_output(PicState2 *s)
+int pic_get_output(PicState *s)
 {
-return (pic_get_irq(&s->pics[0]) >= 0);
+return (pic_get_irq(s) >= 0);
 }
 
 static void elcr_ioport_write(void *opaque, target_phys_addr_t addr,
@@ -480,8 +474,8 @@ void pic_info(Monitor *mon)
 if (!isa_pic)
 return;
 
-for(i=0;i<2;i++) {
-s = &isa_pic->pics[i];
+for (i = 0; i < 2; i++) {
+s = i == 0 ? isa_pic : slave_pic;
 monitor_printf(mon, "pic%d: irr=%02x imr=%02x isr=%02x hprio=%d "
"irq_base=%02x rr_sel=%d elcr=%02x fnm=%d\n",
i, s->irr, s->imr, s->isr, s->priority_add,
@@ -510,14 +504,19 @@ void irq_info(Monitor *mon)
 qemu_irq *i8259_init(qemu_irq parent_irq)
 {
 qemu_irq *irqs;
-PicState2 *s;
-
-s = g_malloc0(sizeof(PicState2));
-irqs = qemu_allocate_irqs(i8259_set_irq, s, 16);
-pic_init(0x20, 0x4d0, &s->pics[0], parent_irq, true);
-pic_init(0xa0, 0x4d1, &s->pics[1], irqs[2], false);
-s->pics[0].elcr_mask = 0xf8;
-s->pics[1].elcr_mask = 0xde;
+PicState *s;
+
+irqs = qemu_allocate_irqs(i8259_set_irq, NULL, 16);
+
+s = g_malloc0(sizeof(PicState));
+pic_init(0x20, 0x4d0, s, parent_irq, true);
+s->elcr_mask = 0xf8;
 isa_pic = s;
+
+s = g_malloc0(sizeof(PicState));
+pic_init(0xa0, 0x4d1, s, irqs[2], false);
+s->elcr_mask = 0xde;
+slave_pic = s;
+
 return irqs;
 }
diff --git a/hw/pc.h b/hw/pc.h
index 14c61a2..bfe3dd1 100644
--- a/hw/pc.h
+++ b/hw/pc.h
@@ -60,11 +60,11 @@ bool parallel_mm_init(target_phys_addr_t base, int 
it_shift, qemu_irq irq,
 
 /* i8259.c */
 
-typedef struct PicState2 PicState2;
-extern

[Qemu-devel] [PATCH v2 16/23] i8259: PREP: Replace pic_intack_read with pic_read_irq

2011-10-07 Thread Jan Kiszka
There is nothing in the i8259 spec that justifies the special
pic_intack_read. At least the Linux PREP kernels configure the PICs
properly so that pic_read_irq returns identical values, and setting
read_reg_select in PIC0 cannot be derived from any special i8259 mode.

So switch ppc_prep to pic_read_irq and drop the now unused PIC code.

CC: Andreas Färber 
Signed-off-by: Jan Kiszka 
---
 hw/i8259.c|   39 ---
 hw/pc.h   |1 -
 hw/ppc_prep.c |2 +-
 3 files changed, 1 insertions(+), 41 deletions(-)

diff --git a/hw/i8259.c b/hw/i8259.c
index 545d723..8870277 100644
--- a/hw/i8259.c
+++ b/hw/i8259.c
@@ -361,30 +361,6 @@ static void pic_ioport_write(void *opaque, 
target_phys_addr_t addr64,
 }
 }
 
-static uint32_t pic_poll_read(PicState *s)
-{
-int ret;
-
-ret = pic_get_irq(s);
-if (ret >= 0) {
-bool slave = (s == &isa_pic->pics[1]);
-
-if (slave) {
-s->pics_state->pics[0].isr &= ~(1 << 2);
-s->pics_state->pics[0].irr &= ~(1 << 2);
-}
-s->irr &= ~(1 << ret);
-s->isr &= ~(1 << ret);
-if (slave || ret != 2) {
-pic_update_irq(s);
-}
-} else {
-ret = 0x07;
-}
-
-return ret;
-}
-
 static uint64_t pic_ioport_read(void *opaque, target_phys_addr_t addr,
 unsigned size)
 {
@@ -414,21 +390,6 @@ static uint64_t pic_ioport_read(void *opaque, 
target_phys_addr_t addr,
 return ret;
 }
 
-/* memory mapped interrupt status */
-/* XXX: may be the same than pic_read_irq() */
-uint32_t pic_intack_read(PicState2 *s)
-{
-int ret;
-
-ret = pic_poll_read(&s->pics[0]);
-if (ret == 2)
-ret = pic_poll_read(&s->pics[1]) + 8;
-/* Prepare for ISR read */
-s->pics[0].read_reg_select = 1;
-
-return ret;
-}
-
 int pic_get_output(PicState2 *s)
 {
 return (pic_get_irq(&s->pics[0]) >= 0);
diff --git a/hw/pc.h b/hw/pc.h
index fd5f9b2..14c61a2 100644
--- a/hw/pc.h
+++ b/hw/pc.h
@@ -65,7 +65,6 @@ extern PicState2 *isa_pic;
 qemu_irq *i8259_init(qemu_irq parent_irq);
 int pic_read_irq(PicState2 *s);
 int pic_get_output(PicState2 *s);
-uint32_t pic_intack_read(PicState2 *s);
 void pic_info(Monitor *mon);
 void irq_info(Monitor *mon);
 
diff --git a/hw/ppc_prep.c b/hw/ppc_prep.c
index d26049b..6427baa 100644
--- a/hw/ppc_prep.c
+++ b/hw/ppc_prep.c
@@ -130,7 +130,7 @@ static inline uint32_t _PPC_intack_read(target_phys_addr_t 
addr)
 uint32_t retval = 0;
 
 if ((addr & 0xf) == 0)
-retval = pic_intack_read(isa_pic);
+retval = pic_read_irq(isa_pic);
 #if 0
 printf("%s: 0x" TARGET_FMT_plx " <= %08" PRIx32 "\n", __func__, addr,
retval);
-- 
1.7.3.4




Re: [Qemu-devel] [PATCH v2 19/23] qdev: Add HEX8 property

2011-10-07 Thread Andreas Färber

On 07.10.2011 09:19, Jan Kiszka wrote:

Signed-off-by: Jan Kiszka


Reviewed-by: Andreas Färber 

If you resend the series, a one-sentence description would be nice.

Andreas


---
  hw/qdev-properties.c |   29 +
  hw/qdev.h|3 +++
  2 files changed, 32 insertions(+), 0 deletions(-)

diff --git a/hw/qdev-properties.c b/hw/qdev-properties.c
index e0e54aa..f0b811c 100644
--- a/hw/qdev-properties.c
+++ b/hw/qdev-properties.c
@@ -93,6 +93,35 @@ PropertyInfo qdev_prop_uint8 = {
  .print = print_uint8,
  };

+/* --- 8bit hex value --- */
+
+static int parse_hex8(DeviceState *dev, Property *prop, const char *str)
+{
+uint8_t *ptr = qdev_get_prop_ptr(dev, prop);
+char *end;
+
+*ptr = strtoul(str,&end, 16);
+if ((*end != '\0') || (end == str)) {
+return -EINVAL;
+}
+
+return 0;
+}
+
+static int print_hex8(DeviceState *dev, Property *prop, char *dest, size_t len)
+{
+uint8_t *ptr = qdev_get_prop_ptr(dev, prop);
+return snprintf(dest, len, "0x%" PRIx8, *ptr);
+}
+
+PropertyInfo qdev_prop_hex8 = {
+.name  = "hex8",
+.type  = PROP_TYPE_UINT8,
+.size  = sizeof(uint8_t),
+.parse = parse_hex8,
+.print = print_hex8,
+};
+
  /* --- 16bit integer --- */

  static int parse_uint16(DeviceState *dev, Property *prop, const char *str)
diff --git a/hw/qdev.h b/hw/qdev.h
index 8a13ec9..aa7ae36 100644
--- a/hw/qdev.h
+++ b/hw/qdev.h
@@ -224,6 +224,7 @@ extern PropertyInfo qdev_prop_uint16;
  extern PropertyInfo qdev_prop_uint32;
  extern PropertyInfo qdev_prop_int32;
  extern PropertyInfo qdev_prop_uint64;
+extern PropertyInfo qdev_prop_hex8;
  extern PropertyInfo qdev_prop_hex32;
  extern PropertyInfo qdev_prop_hex64;
  extern PropertyInfo qdev_prop_string;
@@ -267,6 +268,8 @@ extern PropertyInfo qdev_prop_pci_devfn;
  DEFINE_PROP_DEFAULT(_n, _s, _f, _d, qdev_prop_int32, int32_t)
  #define DEFINE_PROP_UINT64(_n, _s, _f, _d)  \
  DEFINE_PROP_DEFAULT(_n, _s, _f, _d, qdev_prop_uint64, uint64_t)
+#define DEFINE_PROP_HEX8(_n, _s, _f, _d)   \
+DEFINE_PROP_DEFAULT(_n, _s, _f, _d, qdev_prop_hex8, uint8_t)
  #define DEFINE_PROP_HEX32(_n, _s, _f, _d)   \
  DEFINE_PROP_DEFAULT(_n, _s, _f, _d, qdev_prop_hex32, uint32_t)
  #define DEFINE_PROP_HEX64(_n, _s, _f, _d)   \




[Qemu-devel] [PATCH v2 23/23] i8259: Move to hw library

2011-10-07 Thread Jan Kiszka
No target-specific bits remaining, let's move it over.

Signed-off-by: Jan Kiszka 
---
 Makefile.objs|1 +
 Makefile.target  |8 
 default-configs/alpha-softmmu.mak|1 +
 default-configs/i386-softmmu.mak |1 +
 default-configs/mips-softmmu.mak |1 +
 default-configs/mips64-softmmu.mak   |1 +
 default-configs/mips64el-softmmu.mak |1 +
 default-configs/mipsel-softmmu.mak   |1 +
 default-configs/ppc-softmmu.mak  |1 +
 default-configs/ppc64-softmmu.mak|1 +
 default-configs/ppcemb-softmmu.mak   |1 +
 default-configs/x86_64-softmmu.mak   |1 +
 12 files changed, 15 insertions(+), 4 deletions(-)

diff --git a/Makefile.objs b/Makefile.objs
index 8d23fbb..f551375 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -220,6 +220,7 @@ hw-obj-$(CONFIG_APPLESMC) += applesmc.o
 hw-obj-$(CONFIG_SMARTCARD) += usb-ccid.o ccid-card-passthru.o
 hw-obj-$(CONFIG_SMARTCARD_NSS) += ccid-card-emulated.o
 hw-obj-$(CONFIG_USB_REDIR) += usb-redir.o
+hw-obj-$(CONFIG_I8259) += i8259.o
 
 # PPC devices
 hw-obj-$(CONFIG_OPENPIC) += openpic.o
diff --git a/Makefile.target b/Makefile.target
index 88d2f1f..393f58d 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -219,7 +219,7 @@ obj-$(CONFIG_IVSHMEM) += ivshmem.o
 
 # Hardware support
 obj-i386-y += vga.o
-obj-i386-y += mc146818rtc.o i8259.o pc.o
+obj-i386-y += mc146818rtc.o pc.o
 obj-i386-y += cirrus_vga.o sga.o apic.o ioapic.o piix_pci.o
 obj-i386-y += vmport.o
 obj-i386-y += device-hotplug.o pci-hotplug.o smbios.o wdt_ib700.o
@@ -232,7 +232,7 @@ obj-i386-$(CONFIG_SPICE) += qxl.o qxl-logger.o qxl-render.o
 obj-ppc-y = ppc.o
 obj-ppc-y += vga.o
 # PREP target
-obj-ppc-y += i8259.o mc146818rtc.o
+obj-ppc-y += mc146818rtc.o
 obj-ppc-y += ppc_prep.o
 # OldWorld PowerMac
 obj-ppc-y += ppc_oldworld.o
@@ -283,7 +283,7 @@ obj-lm32-y += framebuffer.o
 
 obj-mips-y = mips_r4k.o mips_jazz.o mips_malta.o mips_mipssim.o
 obj-mips-y += mips_addr.o mips_timer.o mips_int.o
-obj-mips-y += vga.o i8259.o
+obj-mips-y += vga.o
 obj-mips-y += jazz_led.o
 obj-mips-y += gt64xxx.o mc146818rtc.o
 obj-mips-y += cirrus_vga.o
@@ -365,7 +365,7 @@ obj-m68k-y += m68k-semi.o dummy_m68k.o
 
 obj-s390x-y = s390-virtio-bus.o s390-virtio.o
 
-obj-alpha-y = i8259.o mc146818rtc.o
+obj-alpha-y = mc146818rtc.o
 obj-alpha-y += vga.o cirrus_vga.o
 
 obj-xtensa-y += xtensa_pic.o
diff --git a/default-configs/alpha-softmmu.mak 
b/default-configs/alpha-softmmu.mak
index abadcff..3889213 100644
--- a/default-configs/alpha-softmmu.mak
+++ b/default-configs/alpha-softmmu.mak
@@ -7,3 +7,4 @@ CONFIG_VGA_PCI=y
 CONFIG_IDE_CORE=y
 CONFIG_IDE_QDEV=y
 CONFIG_VMWARE_VGA=y
+CONFIG_I8259=y
diff --git a/default-configs/i386-softmmu.mak b/default-configs/i386-softmmu.mak
index 55589fa..e67ebb3 100644
--- a/default-configs/i386-softmmu.mak
+++ b/default-configs/i386-softmmu.mak
@@ -21,3 +21,4 @@ CONFIG_PIIX_PCI=y
 CONFIG_SOUND=y
 CONFIG_HPET=y
 CONFIG_APPLESMC=y
+CONFIG_I8259=y
diff --git a/default-configs/mips-softmmu.mak b/default-configs/mips-softmmu.mak
index 45bdefb..94a3486 100644
--- a/default-configs/mips-softmmu.mak
+++ b/default-configs/mips-softmmu.mak
@@ -27,3 +27,4 @@ CONFIG_DS1225Y=y
 CONFIG_MIPSNET=y
 CONFIG_PFLASH_CFI01=y
 CONFIG_G364FB=y
+CONFIG_I8259=y
diff --git a/default-configs/mips64-softmmu.mak 
b/default-configs/mips64-softmmu.mak
index d43e33c..b5d3108 100644
--- a/default-configs/mips64-softmmu.mak
+++ b/default-configs/mips64-softmmu.mak
@@ -27,3 +27,4 @@ CONFIG_DS1225Y=y
 CONFIG_MIPSNET=y
 CONFIG_PFLASH_CFI01=y
 CONFIG_G364FB=y
+CONFIG_I8259=y
diff --git a/default-configs/mips64el-softmmu.mak 
b/default-configs/mips64el-softmmu.mak
index f307e8d..2831f44 100644
--- a/default-configs/mips64el-softmmu.mak
+++ b/default-configs/mips64el-softmmu.mak
@@ -29,3 +29,4 @@ CONFIG_MIPSNET=y
 CONFIG_PFLASH_CFI01=y
 CONFIG_FULONG=y
 CONFIG_G364FB=y
+CONFIG_I8259=y
diff --git a/default-configs/mipsel-softmmu.mak 
b/default-configs/mipsel-softmmu.mak
index 1a66bc3..14c949d 100644
--- a/default-configs/mipsel-softmmu.mak
+++ b/default-configs/mipsel-softmmu.mak
@@ -27,3 +27,4 @@ CONFIG_DS1225Y=y
 CONFIG_MIPSNET=y
 CONFIG_PFLASH_CFI01=y
 CONFIG_G364FB=y
+CONFIG_I8259=y
diff --git a/default-configs/ppc-softmmu.mak b/default-configs/ppc-softmmu.mak
index 4563742..c85cdce 100644
--- a/default-configs/ppc-softmmu.mak
+++ b/default-configs/ppc-softmmu.mak
@@ -31,3 +31,4 @@ CONFIG_SOUND=y
 CONFIG_PFLASH_CFI01=y
 CONFIG_PFLASH_CFI02=y
 CONFIG_PTIMER=y
+CONFIG_I8259=y
diff --git a/default-configs/ppc64-softmmu.mak 
b/default-configs/ppc64-softmmu.mak
index d5073b3..8874115 100644
--- a/default-configs/ppc64-softmmu.mak
+++ b/default-configs/ppc64-softmmu.mak
@@ -31,3 +31,4 @@ CONFIG_SOUND=y
 CONFIG_PFLASH_CFI01=y
 CONFIG_PFLASH_CFI02=y
 CONFIG_PTIMER=y
+CONFIG_I8259=y
diff --git a/default-configs/ppcemb-softmmu.mak 
b/default-configs/ppcemb-softmmu.mak
index 9f0730c..5db7205 100644
--- a/default-configs/ppcemb-softmmu.mak
+++ b/defau

Re: [Qemu-devel] Integrating Dynamips and GNS3 UDP tunnels (Patches)

2011-10-07 Thread Jan Kiszka
On 2011-10-06 19:08, Benjamin Epitech wrote:
> GNS3 team developed a GUI in order to inter-connect different emulated
> hardware. In order
> to achieve a network inter-connection between hosts, one single
> protocol is used: a
> UDP tunneling protocol introduced by Dynamips (a Cisco hardware emulator).
> 
> Since the beginning, GNS3 supports Qemu by providing patches for its users,
> these patches
> bring to Qemu the implementation of Dynamips UDP tunneling protocol.
> 
> As GNS3 improves and now supports VirtualBox, it should be time to free
> users of the hassle
> of having to patch Qemu themselves. FreeBSD integrated our patches in the
> ports tree, we
> ship a patched Qemu for Windows, and we're now looking forward to integrate
> those patches
> upstream.
> 
> Here are the patches that apply on the latest release of Qemu, I hereby
> submit them for your
> approval or not.
> 
> 1) Basic patch in order to build the new source file
> http://code.gns3.net/qemu-patches/file/6a927b6cdaf8/Makefile_objs.patch
> 
> 2) Parse -net udp
> http://code.gns3.net/qemu-patches/file/6a927b6cdaf8/net_c.patch
> 
> 3) New NET_CLIENT_TYPE_UDP macro
> http://code.gns3.net/qemu-patches/file/6a927b6cdaf8/net_h.patch
> 
> 4) New source code file, implementation of the UDP tunneling protocol
> http://code.gns3.net/qemu-patches/file/6a927b6cdaf8/net_udp_c.patch
> 
> 5) Corresponding header file
> http://code.gns3.net/qemu-patches/file/6a927b6cdaf8/net_udp_h.patch
> 
> The hw_e1000_c.patch is no longer needed, it was a dirty hack that we kept
> for too long.
> The block_raw-win32_c.patch fixes a minor issue that arises only on Windows,
> it may deserve
> another topic.
> 
> Please include me in the replies as I am not subscribed to the list.
> 

You should send out the changes as proper patch series, rebased on
current git head. See http://wiki.qemu.org/Contribute/SubmitAPatch for
further requirements. And make sure that no patch breaks the build so
that bisectability is preserved.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux



[Qemu-devel] [PATCH v2 01/23] pc: Drop useless test from isa_irq_handler

2011-10-07 Thread Jan Kiszka
IsaIrqState::ioapic is always non-NULL. Probably, the concrete
qemu_irq was supposed to be tested, but that's already done by
qemu_set_irq.

Signed-off-by: Jan Kiszka 
---
 hw/pc.c |5 ++---
 1 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/hw/pc.c b/hw/pc.c
index 203627d..a15d165 100644
--- a/hw/pc.c
+++ b/hw/pc.c
@@ -96,9 +96,8 @@ void isa_irq_handler(void *opaque, int n, int level)
 if (n < 16) {
 qemu_set_irq(isa->i8259[n], level);
 }
-if (isa->ioapic)
-qemu_set_irq(isa->ioapic[n], level);
-};
+qemu_set_irq(isa->ioapic[n], level);
+}
 
 static void ioport80_write(void *opaque, uint32_t addr, uint32_t data)
 {
-- 
1.7.3.4




[Qemu-devel] [PATCH v2 14/23] i8259: Fix poll command

2011-10-07 Thread Jan Kiszka
This was probably never used so far: According to the spec, polling
means ack'ing the pending IRQ and setting its corresponding bit in isr.
Moreover, we have to signal a pending IRQ via bit 7 of the returned
value, and we must not return a spurious IRQ if none is pending.

This implements the poll command without the help of pic_poll_read, which
is left untouched as pic_intack_read is still using it.

Signed-off-by: Jan Kiszka 
---
 hw/i8259.c |8 +++-
 1 files changed, 7 insertions(+), 1 deletions(-)

diff --git a/hw/i8259.c b/hw/i8259.c
index bb257e6..31962c0 100644
--- a/hw/i8259.c
+++ b/hw/i8259.c
@@ -393,7 +393,13 @@ static uint64_t pic_ioport_read(void *opaque, 
target_phys_addr_t addr1,
 int ret;
 
 if (s->poll) {
-ret = pic_poll_read(s);
+ret = pic_get_irq(s);
+if (ret >= 0) {
+pic_intack(s, ret);
+ret |= 0x80;
+} else {
+ret = 0;
+}
 s->poll = 0;
 } else {
 if (addr == 0) {
-- 
1.7.3.4




[Qemu-devel] [PATCH v2 17/23] i8259: Replace PicState::pics_state with master flag

2011-10-07 Thread Jan Kiszka
This reflects how real PICs identify their role (in non-buffered mode):
Pass the state of the /SP input to pic_init and use it instead of
pics_state to differentiate between master and slave mode.

Signed-off-by: Jan Kiszka 
---
 hw/i8259.c |   18 +-
 1 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/hw/i8259.c b/hw/i8259.c
index 8870277..dc13b12 100644
--- a/hw/i8259.c
+++ b/hw/i8259.c
@@ -59,7 +59,7 @@ typedef struct PicState {
 uint8_t elcr; /* PIIX edge/trigger selection*/
 uint8_t elcr_mask;
 qemu_irq int_out;
-PicState2 *pics_state;
+bool master; /* reflects /SP input pin */
 MemoryRegion base_io;
 MemoryRegion elcr_io;
 } PicState;
@@ -107,8 +107,9 @@ static int pic_get_irq(PicState *s)
 mask = s->isr;
 if (s->special_mask)
 mask &= ~s->imr;
-if (s->special_fully_nested_mode && s == &s->pics_state->pics[0])
+if (s->special_fully_nested_mode && s->master) {
 mask &= ~(1 << 2);
+}
 cur_priority = get_priority(s, mask);
 if (priority < cur_priority) {
 /* higher priority found: an irq should be generated */
@@ -126,8 +127,7 @@ static void pic_update_irq(PicState *s)
 irq = pic_get_irq(s);
 if (irq >= 0) {
 DPRINTF("pic%d: imr=%x irr=%x padd=%d\n",
-s == &s->pics_state->pics[0] ? 0 : 1, s->imr, s->irr,
-s->priority_add);
+s->master ? 0 : 1, s->imr, s->irr, s->priority_add);
 qemu_irq_raise(s->int_out);
 } else {
 qemu_irq_lower(s->int_out);
@@ -454,9 +454,11 @@ static const MemoryRegionOps pic_elcr_ioport_ops = {
 };
 
 /* XXX: add generic master/slave system */
-static void pic_init(int io_addr, int elcr_addr, PicState *s, qemu_irq int_out)
+static void pic_init(int io_addr, int elcr_addr, PicState *s, qemu_irq int_out,
+ bool master)
 {
 s->int_out = int_out;
+s->master = master;
 
 memory_region_init_io(&s->base_io, &pic_base_ioport_ops, s, "pic", 2);
 memory_region_init_io(&s->elcr_io, &pic_elcr_ioport_ops, s, "elcr", 1);
@@ -512,12 +514,10 @@ qemu_irq *i8259_init(qemu_irq parent_irq)
 
 s = g_malloc0(sizeof(PicState2));
 irqs = qemu_allocate_irqs(i8259_set_irq, s, 16);
-pic_init(0x20, 0x4d0, &s->pics[0], parent_irq);
-pic_init(0xa0, 0x4d1, &s->pics[1], irqs[2]);
+pic_init(0x20, 0x4d0, &s->pics[0], parent_irq, true);
+pic_init(0xa0, 0x4d1, &s->pics[1], irqs[2], false);
 s->pics[0].elcr_mask = 0xf8;
 s->pics[1].elcr_mask = 0xde;
-s->pics[0].pics_state = s;
-s->pics[1].pics_state = s;
 isa_pic = s;
 return irqs;
 }
-- 
1.7.3.4




[Qemu-devel] [PATCH v2 13/23] i8259: Switch to per-PIC IRQ update

2011-10-07 Thread Jan Kiszka
This converts pic_update_irq to work against a single PIC instead of the
complete cascade. Along with this change, the required update after
pic_set_irq1 is now moved into that function.

Signed-off-by: Jan Kiszka 
---
 hw/i8259.c |   59 ---
 1 files changed, 20 insertions(+), 39 deletions(-)

diff --git a/hw/i8259.c b/hw/i8259.c
index d18fc62..bb257e6 100644
--- a/hw/i8259.c
+++ b/hw/i8259.c
@@ -118,39 +118,19 @@ static int pic_get_irq(PicState *s)
 }
 }
 
-static void pic_set_irq1(PicState *s, int irq, int level);
-
-/* raise irq to CPU if necessary. must be called every time the active
-   irq may change */
-static void pic_update_irq(PicState2 *s)
+/* Update INT output. Must be called every time the output may have changed. */
+static void pic_update_irq(PicState *s)
 {
-int irq2, irq;
-
-/* first look at slave pic */
-irq2 = pic_get_irq(&s->pics[1]);
-if (irq2 >= 0) {
-/* if irq request by slave pic, signal master PIC */
-pic_set_irq1(&s->pics[0], 2, 1);
-pic_set_irq1(&s->pics[0], 2, 0);
-}
-/* look at requested irq */
-irq = pic_get_irq(&s->pics[0]);
-if (irq >= 0) {
-#if defined(DEBUG_PIC)
-{
-int i;
-for(i = 0; i < 2; i++) {
-printf("pic%d: imr=%x irr=%x padd=%d\n",
-   i, s->pics[i].imr, s->pics[i].irr,
-   s->pics[i].priority_add);
+int irq;
 
-}
-}
-printf("pic: cpu_interrupt\n");
-#endif
-qemu_irq_raise(s->pics[0].int_out);
+irq = pic_get_irq(s);
+if (irq >= 0) {
+DPRINTF("pic%d: imr=%x irr=%x padd=%d\n",
+s == &s->pics_state->pics[0] ? 0 : 1, s->imr, s->irr,
+s->priority_add);
+qemu_irq_raise(s->int_out);
 } else {
-qemu_irq_lower(s->pics[0].int_out);
+qemu_irq_lower(s->int_out);
 }
 }
 
@@ -179,6 +159,7 @@ static void pic_set_irq1(PicState *s, int irq, int level)
 s->last_irr &= ~mask;
 }
 }
+pic_update_irq(s);
 }
 
 #ifdef DEBUG_IRQ_LATENCY
@@ -205,7 +186,6 @@ static void i8259_set_irq(void *opaque, int irq, int level)
 }
 #endif
 pic_set_irq1(&s->pics[irq >> 3], irq & 7, level);
-pic_update_irq(s);
 }
 
 /* acknowledge interrupt 'irq' */
@@ -220,6 +200,7 @@ static void pic_intack(PicState *s, int irq)
 /* We don't clear a level sensitive interrupt here */
 if (!(s->elcr & (1 << irq)))
 s->irr &= ~(1 << irq);
+pic_update_irq(s);
 }
 
 int pic_read_irq(PicState2 *s)
@@ -246,7 +227,6 @@ int pic_read_irq(PicState2 *s)
 irq = 7;
 intno = s->pics[0].irq_base + irq;
 }
-pic_update_irq(s);
 
 #if defined(DEBUG_PIC) || defined(DEBUG_IRQ_LATENCY)
 if (irq == 2) {
@@ -281,7 +261,7 @@ static void pic_init_reset(PicState *s)
 s->init4 = 0;
 s->single_mode = 0;
 /* Note: ELCR is not reset */
-pic_update_irq(s->pics_state);
+pic_update_irq(s);
 }
 
 static void pic_reset(void *opaque)
@@ -331,23 +311,23 @@ static void pic_ioport_write(void *opaque, 
target_phys_addr_t addr64,
 s->isr &= ~(1 << irq);
 if (cmd == 5)
 s->priority_add = (irq + 1) & 7;
-pic_update_irq(s->pics_state);
+pic_update_irq(s);
 }
 break;
 case 3:
 irq = val & 7;
 s->isr &= ~(1 << irq);
-pic_update_irq(s->pics_state);
+pic_update_irq(s);
 break;
 case 6:
 s->priority_add = (val + 1) & 7;
-pic_update_irq(s->pics_state);
+pic_update_irq(s);
 break;
 case 7:
 irq = val & 7;
 s->isr &= ~(1 << irq);
 s->priority_add = (irq + 1) & 7;
-pic_update_irq(s->pics_state);
+pic_update_irq(s);
 break;
 default:
 /* no operation */
@@ -359,7 +339,7 @@ static void pic_ioport_write(void *opaque, 
target_phys_addr_t addr64,
 case 0:
 /* normal mode */
 s->imr = val;
-pic_update_irq(s->pics_state);
+pic_update_irq(s);
 break;
 case 1:
 s->irq_base = val & 0xf8;
@@ -395,8 +375,9 @@ static uint32_t pic_poll_read(PicState *s)
 }
 s->irr &= ~(1 << ret);
 s->isr &= ~(1 << ret);
-if (slave || ret != 2)
-pic_update_irq(s->pics_state);
+if (slave || ret != 2) {
+pic_update_irq(s);
+}
 } else {
 ret = 0x07;
 }
-- 
1.7.3.4
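
How a per-PIC update propagates through the cascade can be seen in a toy model. This is a heavily simplified sketch under explicit assumptions, not the i8259 code: ToyPic and the toy_* functions are hypothetical, all lines are level-triggered, and priorities, the ISR and ELCR are ignored. Each PIC recomputes only its own INT output, and the slave's output is wired to an input line of the master.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct ToyPic ToyPic;
struct ToyPic {
    uint8_t irr;      /* pending request lines */
    uint8_t imr;      /* masked lines */
    bool int_out;     /* state of this PIC's INT output */
    ToyPic *parent;   /* PIC our INT output is wired to, or NULL */
    int parent_line;  /* input line on the parent (2 for the usual slave) */
};

static void toy_set_line(ToyPic *s, int line, bool level);

/* Recompute this PIC's output only and forward it to the parent, if any. */
static void toy_update(ToyPic *s)
{
    s->int_out = (s->irr & ~s->imr) != 0;
    if (s->parent != NULL) {
        toy_set_line(s->parent, s->parent_line, s->int_out);
    }
}

/* Change one input line, then update the output of the affected PIC. */
static void toy_set_line(ToyPic *s, int line, bool level)
{
    uint8_t mask = (uint8_t)(1u << line);

    if (level) {
        s->irr |= mask;
    } else {
        s->irr &= (uint8_t)~mask;
    }
    toy_update(s);
}

/* Route a global IRQ 0..15 to the master (0..7) or the slave (8..15). */
static void toy_set_irq(ToyPic *master, ToyPic *slave, int irq, bool level)
{
    toy_set_line(irq <= 7 ? master : slave, irq & 7, level);
}
```

Raising IRQ 9 in this model sets line 1 on the slave, whose output then raises line 2 on the master; no code has to walk both PICs explicitly.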




Re: [Qemu-devel] [PATCH 3/3] pseries: Correctly create ibm, segment-page-sizes property

2011-10-07 Thread Alexander Graf

On 30.09.2011, at 09:50, David Gibson wrote:

> Current versions of the PowerPC architecture require and fully define
> 4kB and 16MB page sizes.  Other pagesizes (e.g. 64kB, 1MB) are
> permitted and are often supported, but the exact encodings used to set
> them up can vary from chip to chip.
> 
> The supported pagesizes and required encodings are advertised to the
> OS via the ibm,segment-page-sizes property in the device tree.
> Currently we do not put this property in our device tree, so guests
> are restricted to the architected 4kB and 16MB pagesizes.
> 
> The base sizes are all that we implement in tcg, however with KVM the
> guest can use anything supported by the host as long as the guest's
> base memory is backed by pages at least as large.  Furthermore, in
> order to use any extended page sizes, the guest needs to know the
> correct encodings for the host.
> 
> This patch, therefore, reads the host's pagesize information, filters
> it based on the pagesize backing RAM, and passes it into the guest.
> 
> Signed-off-by: Nishanth Aravamudan 
> Signed-off-by: David Gibson 
> ---
> hw/spapr.c   |  127 ++
> target-ppc/kvm.c |   43 +
> target-ppc/kvm_ppc.h |6 ++
> 3 files changed, 176 insertions(+), 0 deletions(-)
> 
> diff --git a/hw/spapr.c b/hw/spapr.c
> index 8089d83..72b6c6a 100644
> --- a/hw/spapr.c
> +++ b/hw/spapr.c
> @@ -24,6 +24,8 @@
>  * THE SOFTWARE.
>  *
>  */
> +#include 
> +
> #include "sysemu.h"
> #include "hw.h"
> #include "elf.h"
> @@ -88,6 +90,122 @@ qemu_irq spapr_allocate_irq(uint32_t hint, uint32_t 
> *irq_num)
> return qirq;
> }
> 
> +#define HUGETLBFS_MAGIC   0x958458f6
> +
> +static long getrampagesize(void)
> +{
> +struct statfs fs;
> +int ret;
> +
> +if (!mem_path) {
> +/* guest RAM is backed by normal anonymous pages */
> +return getpagesize();
> +}
> +
> +do {
> +ret = statfs(mem_path, &fs);
> +} while (ret != 0 && errno == EINTR);
> +
> +if (ret != 0) {
> +fprintf(stderr, "Couldn't statfs() memory path: %s\n",
> +strerror(errno));
> +exit(1);
> +}
> +
> +if (fs.f_type != HUGETLBFS_MAGIC) {
> +/* Explicit mempath, but it's ordinary pages */
> +return getpagesize();
> +}
> +
> +/* It's hugepage, return the huge page size */
> +return fs.f_bsize;
> +}

Would this function compile and work on win32 hosts? If not, it should probably 
go to kvm.c.

> +
> +static size_t create_page_sizes_prop(uint32_t *prop, size_t maxsize)
> +{
> +int cells;
> +target_ulong ram_page_size = getrampagesize();
> +int i, j;
> +
> +if (!kvm_enabled()) {
> +/* For the supported CPUs in emulation, we support just 4k and
> + * 16MB pages, with the usual encodings.  This is the default
> + * set the guest will assume if we don't specify anything */
> +return 0;
> +}
> +
> +cells = kvmppc_read_segment_page_sizes(prop, maxsize / sizeof(uint32_t));

Shouldn't we rather be asking the kvm kernel module to tell us its supported 
segment sizes? Just because the host doesn't support 256MB page size doesn't 
mean we can't expose it to the guest, right? Depending on the KVM mode of 
course.

For HV we would pass through the hardware ones. For PR we could pretty much 
support anything since we're shadowing the htab. But there it'd be a win too, 
since we would get less page table entries and could potentially also back 
things with huge pages.

Also, this depends heavily on the guest CPU architecture. For 970, we can't 
support anything but 4k and 16MB (and even that one is crap). For p7, things 
are a lot more flexible. But we need to make sure that what we tell the guest 
is actually possible to do on the particular CPU we're emulating / virtualizing.


Alex
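
To make the filtering step from the quoted commit message concrete, here is a minimal sketch in C. It is an illustration only, not the code from the patch: the function name sketch_filter_page_shifts and the flat array of page shifts are assumptions made for this example, and the real ibm,segment-page-sizes property carries additional encoding fields that are left out here. The rule it applies is the one stated above: a page size is only usable if guest RAM is backed by pages at least as large.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of the patch): compact 'shifts' in place,
 * keeping only page sizes that are no larger than the page size backing
 * guest RAM. Returns the number of entries kept.
 */
static size_t sketch_filter_page_shifts(uint32_t *shifts, size_t n,
                                        uint64_t ram_page_size)
{
    size_t i, kept = 0;

    assert(shifts || n == 0);
    for (i = 0; i < n; i++) {
        /* Guard the shift width, then compare 2^shift with the backing size */
        if (shifts[i] < 64 && (UINT64_C(1) << shifts[i]) <= ram_page_size) {
            shifts[kept++] = shifts[i];
        }
    }
    return kept;
}
```

With RAM backed by 64kB pages, a host list of 4kB, 64kB and 16MB (shifts 12, 16, 24) would be reduced to the first two entries.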




[Qemu-devel] [PATCH v2 06/23] i8259: Drop obsolete prototypes

2011-10-07 Thread Jan Kiszka
Signed-off-by: Jan Kiszka 
---
 hw/pc.h |2 --
 1 files changed, 0 insertions(+), 2 deletions(-)

diff --git a/hw/pc.h b/hw/pc.h
index 60da282..fd5f9b2 100644
--- a/hw/pc.h
+++ b/hw/pc.h
@@ -62,8 +62,6 @@ bool parallel_mm_init(target_phys_addr_t base, int it_shift, 
qemu_irq irq,
 
 typedef struct PicState2 PicState2;
 extern PicState2 *isa_pic;
-void pic_set_irq(int irq, int level);
-void pic_set_irq_new(void *opaque, int irq, int level);
 qemu_irq *i8259_init(qemu_irq parent_irq);
 int pic_read_irq(PicState2 *s);
 int pic_get_output(PicState2 *s);
-- 
1.7.3.4




[Qemu-devel] [PATCH v2 05/23] i8259: Remove premature inline function attributes

2011-10-07 Thread Jan Kiszka
The compiler is smarter in choosing the right optimization.

Signed-off-by: Jan Kiszka 
---
 hw/i8259.c |6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/hw/i8259.c b/hw/i8259.c
index 6006123..f1d58ba 100644
--- a/hw/i8259.c
+++ b/hw/i8259.c
@@ -80,7 +80,7 @@ static uint64_t irq_count[16];
 PicState2 *isa_pic;
 
 /* set irq level. If an edge is detected, then the IRR is set to 1 */
-static inline void pic_set_irq1(PicState *s, int irq, int level)
+static void pic_set_irq1(PicState *s, int irq, int level)
 {
 int mask;
 mask = 1 << irq;
@@ -107,7 +107,7 @@ static inline void pic_set_irq1(PicState *s, int irq, int 
level)
 
 /* return the highest priority found in mask (highest = smallest
number). Return 8 if no irq */
-static inline int get_priority(PicState *s, int mask)
+static int get_priority(PicState *s, int mask)
 {
 int priority;
 if (mask == 0)
@@ -206,7 +206,7 @@ static void i8259_set_irq(void *opaque, int irq, int level)
 }
 
 /* acknowledge interrupt 'irq' */
-static inline void pic_intack(PicState *s, int irq)
+static void pic_intack(PicState *s, int irq)
 {
 if (s->auto_eoi) {
 if (s->rotate_on_auto_eoi)
-- 
1.7.3.4
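
As a side note on get_priority touched above: the rotation via priority_add can be seen in isolation with the following minimal sketch. SketchPic and the sketch_* names are hypothetical stand-ins invented for this example; only the loop mirrors the logic visible in the diff context. One deliberate difference, stated as an assumption: the sketch masks the input to 8 bits so that a mask with only higher bits set cannot loop forever.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the relevant part of PicState. */
typedef struct {
    uint8_t priority_add; /* IRQ line that currently has highest priority */
} SketchPic;

/*
 * Return the priority (0 = highest) of the most urgent bit set in mask,
 * or 8 if no bit in the low byte is set.
 */
static int sketch_get_priority(const SketchPic *s, int mask)
{
    int priority = 0;

    assert(s);
    if ((mask & 0xff) == 0) {
        return 8;
    }
    while ((mask & (1 << ((priority + s->priority_add) & 7))) == 0) {
        priority++;
    }
    return priority;
}

/* Translate a priority back to the IRQ line it belongs to. */
static int sketch_priority_to_irq(const SketchPic *s, int priority)
{
    assert(s);
    return (priority + s->priority_add) & 7;
}
```

With priority_add set to 4, lines 4 to 7 outrank lines 0 to 3, so a mask with IRQ 1 and IRQ 3 pending yields priority 5, which maps back to IRQ 1.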




[Qemu-devel] [PATCH] s390x: Add shutdown for TCG s390-virtio machine

2011-10-07 Thread Alexander Graf
Now that we have code in place to do refcounting of online CPUs, we
can drag the TCG code along and implement shutdown for that one too,
so it doesn't feel left out by its KVM counterpart.

Signed-off-by: Alexander Graf 
---
 target-s390x/cpu.h|9 +
 target-s390x/helper.c |   15 ---
 2 files changed, 21 insertions(+), 3 deletions(-)

diff --git a/target-s390x/cpu.h b/target-s390x/cpu.h
index a66aa01..202c098 100644
--- a/target-s390x/cpu.h
+++ b/target-s390x/cpu.h
@@ -315,6 +315,15 @@ unsigned s390_del_running_cpu(CPUState *env);
 /* from s390-virtio-bus */
 extern const target_phys_addr_t virtio_size;
 
+#else
+static inline void s390_add_running_cpu(CPUState *env)
+{
+}
+
+static inline unsigned s390_del_running_cpu(CPUState *env)
+{
+return 0;
+}
 #endif
 void cpu_lock(void);
 void cpu_unlock(void);
diff --git a/target-s390x/helper.c b/target-s390x/helper.c
index 4145104..10cc9dd 100644
--- a/target-s390x/helper.c
+++ b/target-s390x/helper.c
@@ -26,6 +26,9 @@
 #include "gdbstub.h"
 #include "qemu-common.h"
 #include "qemu-timer.h"
+#ifndef CONFIG_USER_ONLY
+#include "sysemu.h"
+#endif
 
 //#define DEBUG_S390
 //#define DEBUG_S390_PTE
@@ -131,6 +134,7 @@ void cpu_reset(CPUS390XState *env)
 memset(env, 0, offsetof(CPUS390XState, breakpoints));
 /* FIXME: reset vector? */
 tlb_flush(env, 1);
+s390_add_running_cpu(env);
 }
 
 #ifndef CONFIG_USER_ONLY
@@ -466,11 +470,15 @@ target_phys_addr_t cpu_get_phys_page_debug(CPUState *env, 
target_ulong vaddr)
 void load_psw(CPUState *env, uint64_t mask, uint64_t addr)
 {
 if (mask & PSW_MASK_WAIT) {
-env->halted = 1;
-env->exception_index = EXCP_HLT;
 if (!(mask & (PSW_MASK_IO | PSW_MASK_EXT | PSW_MASK_MCHECK))) {
-/* XXX disabled wait state - CPU is dead */
+if (s390_del_running_cpu(env) == 0) {
+#ifndef CONFIG_USER_ONLY
+qemu_system_shutdown_request();
+#endif
+}
 }
+env->halted = 1;
+env->exception_index = EXCP_HLT;
 }
 
 env->psw.addr = addr;
@@ -599,6 +607,7 @@ void do_interrupt (CPUState *env)
 qemu_log("%s: %d at pc=%" PRIx64 "\n", __FUNCTION__, env->exception_index,
  env->psw.addr);
 
+s390_add_running_cpu(env);
 /* handle external interrupts */
 if ((env->psw.mask & PSW_MASK_EXT) &&
 env->exception_index == -1) {
-- 
1.6.0.2
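
For readers who want the refcounting idea in isolation, the following is a minimal sketch under stated assumptions. It is not the s390_add_running_cpu/s390_del_running_cpu implementation (that lives elsewhere and is not shown in this patch); SketchCpu, the per-CPU running flag, and all sketch_* names are hypothetical. It only models the flow visible in load_psw above: a CPU entering a disabled wait is removed from the running count, and a shutdown is requested once that count reaches zero.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-CPU state; the real code tracks this in CPUState. */
typedef struct {
    bool running;
} SketchCpu;

static unsigned sketch_running_cpus;
static bool sketch_shutdown_requested;

/* Count the CPU as running (idempotent). */
static void sketch_add_running_cpu(SketchCpu *cpu)
{
    assert(cpu);
    if (!cpu->running) {
        cpu->running = true;
        sketch_running_cpus++;
    }
}

/* Stop counting the CPU; return how many CPUs are still running. */
static unsigned sketch_del_running_cpu(SketchCpu *cpu)
{
    assert(cpu);
    if (cpu->running) {
        cpu->running = false;
        sketch_running_cpus--;
    }
    return sketch_running_cpus;
}

/* Model of a wait PSW load: only a disabled wait can trigger shutdown. */
static void sketch_enter_wait(SketchCpu *cpu, bool interrupts_enabled)
{
    if (!interrupts_enabled && sketch_del_running_cpu(cpu) == 0) {
        sketch_shutdown_requested = true;
    }
}
```

A CPU that waits with interrupts enabled stays in the count, since an interrupt can still wake it up; only the last CPU entering a disabled wait triggers the shutdown request.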




[Qemu-devel] [PATCH v2 09/23] i8259: Do not update IRQ output after spurious pic_poll_read

2011-10-07 Thread Jan Kiszka
If pic_poll_read finds no pending IRQ and returns a spurious one instead,
no PIC state is changed, thus we do not need to call pic_update_irq.

Signed-off-by: Jan Kiszka 
---
 hw/i8259.c |1 -
 1 files changed, 0 insertions(+), 1 deletions(-)

diff --git a/hw/i8259.c b/hw/i8259.c
index 65123bd..cddd3c7 100644
--- a/hw/i8259.c
+++ b/hw/i8259.c
@@ -393,7 +393,6 @@ static uint32_t pic_poll_read(PicState *s)
 pic_update_irq(s->pics_state);
 } else {
 ret = 0x07;
-pic_update_irq(s->pics_state);
 }
 
 return ret;
-- 
1.7.3.4




[Qemu-devel] [PATCH v2 21/23] i8259: Fix coding style

2011-10-07 Thread Jan Kiszka
No functional changes.

Signed-off-by: Jan Kiszka 
---
 hw/i8259.c |   54 ++
 1 files changed, 34 insertions(+), 20 deletions(-)

diff --git a/hw/i8259.c b/hw/i8259.c
index b4e1867..ab519de 100644
--- a/hw/i8259.c
+++ b/hw/i8259.c
@@ -67,7 +67,7 @@ struct PicState {
 MemoryRegion elcr_io;
 };
 
-#if defined(DEBUG_PIC) || defined (DEBUG_IRQ_COUNT)
+#if defined(DEBUG_PIC) || defined(DEBUG_IRQ_COUNT)
 static int irq_level[16];
 #endif
 #ifdef DEBUG_IRQ_COUNT
@@ -84,11 +84,14 @@ static PicState *slave_pic;
 static int get_priority(PicState *s, int mask)
 {
 int priority;
-if (mask == 0)
+
+if (mask == 0) {
 return 8;
+}
 priority = 0;
-while ((mask & (1 << ((priority + s->priority_add) & 7))) == 0)
+while ((mask & (1 << ((priority + s->priority_add) & 7))) == 0) {
 priority++;
+}
 return priority;
 }
 
@@ -99,14 +102,16 @@ static int pic_get_irq(PicState *s)
 
 mask = s->irr & ~s->imr;
 priority = get_priority(s, mask);
-if (priority == 8)
+if (priority == 8) {
 return -1;
+}
 /* compute current priority. If special fully nested mode on the
master, the IRQ coming from the slave is not taken into account
for the priority computation. */
 mask = s->isr;
-if (s->special_mask)
+if (s->special_mask) {
 mask &= ~s->imr;
+}
 if (s->special_fully_nested_mode && s->master) {
 mask &= ~(1 << 2);
 }
@@ -188,14 +193,16 @@ static void pic_set_irq(void *opaque, int irq, int level)
 static void pic_intack(PicState *s, int irq)
 {
 if (s->auto_eoi) {
-if (s->rotate_on_auto_eoi)
+if (s->rotate_on_auto_eoi) {
 s->priority_add = (irq + 1) & 7;
+}
 } else {
 s->isr |= (1 << irq);
 }
 /* We don't clear a level sensitive interrupt here */
-if (!(s->elcr & (1 << irq)))
+if (!(s->elcr & (1 << irq))) {
 s->irr &= ~(1 << irq);
+}
 pic_update_irq(s);
 }
 
@@ -283,18 +290,22 @@ static void pic_ioport_write(void *opaque, 
target_phys_addr_t addr64,
 s->init_state = 1;
 s->init4 = val & 1;
 s->single_mode = val & 2;
-if (val & 0x08)
+if (val & 0x08) {
 hw_error("level sensitive irq not supported");
+}
 } else if (val & 0x08) {
-if (val & 0x04)
+if (val & 0x04) {
 s->poll = 1;
-if (val & 0x02)
+}
+if (val & 0x02) {
 s->read_reg_select = val & 1;
-if (val & 0x40)
+}
+if (val & 0x40) {
 s->special_mask = (val >> 5) & 1;
+}
 } else {
 cmd = val >> 5;
-switch(cmd) {
+switch (cmd) {
 case 0:
 case 4:
 s->rotate_on_auto_eoi = cmd >> 2;
@@ -305,8 +316,9 @@ static void pic_ioport_write(void *opaque, 
target_phys_addr_t addr64,
 if (priority != 8) {
 irq = (priority + s->priority_add) & 7;
 s->isr &= ~(1 << irq);
-if (cmd == 5)
+if (cmd == 5) {
 s->priority_add = (irq + 1) & 7;
+}
 pic_update_irq(s);
 }
 break;
@@ -331,7 +343,7 @@ static void pic_ioport_write(void *opaque, 
target_phys_addr_t addr64,
 }
 }
 } else {
-switch(s->init_state) {
+switch (s->init_state) {
 case 0:
 /* normal mode */
 s->imr = val;
@@ -374,10 +386,11 @@ static uint64_t pic_ioport_read(void *opaque, 
target_phys_addr_t addr,
 s->poll = 0;
 } else {
 if (addr == 0) {
-if (s->read_reg_select)
+if (s->read_reg_select) {
 ret = s->isr;
-else
+} else {
 ret = s->irr;
+}
 } else {
 ret = s->imr;
 }
@@ -410,7 +423,7 @@ static const VMStateDescription vmstate_pic = {
 .version_id = 1,
 .minimum_version_id = 1,
 .minimum_version_id_old = 1,
-.fields  = (VMStateField []) {
+.fields = (VMStateField[]) {
 VMSTATE_UINT8(last_irr, PicState),
 VMSTATE_UINT8(irr, PicState),
 VMSTATE_UINT8(imr, PicState),
@@ -474,9 +487,9 @@ void pic_info(Monitor *mon)
 int i;
 PicState *s;
 
-if (!isa_pic)
+if (!isa_pic) {
 return;
-
+}
 for (i = 0; i < 2; i++) {
 s = i == 0 ? isa_pic : slave_pic;
 monitor_printf(mon, "pic%d: irr=%02x imr=%02x isr=%02x hprio=%d "
@@ -498,8 +511,9 @@ void irq_info(Monitor *mon)
 monitor_printf(mon, "IRQ statistics:\n");
 for (i = 0; i < 16; i++) {
 count = irq_count[i];
-if (count > 0)
+if (count > 0) {
 monitor_printf(mon, "%2d: %" PRI

[Qemu-devel] [PATCH v2 04/23] pc: Fix and clean up PIC-to-APIC IRQ path

2011-10-07 Thread Jan Kiszka
The master PIC is connected to the LINTIN0 of the APICs. As the APIC
currently does not track the state of that line, we have to ask the PIC
to reinject its IRQ after the CPU picked up an event from the APIC.

This introduces pic_get_output to read the master PIC IRQ line state
without changing it. The APIC uses this function to decide if a PIC IRQ
should be reinjected on apic_update_irq. This reflects better how the
real hardware works.

The patch fixes some failures of the kvm unit tests apic and eventinj by
enabling proper CPU IRQ deassertion when the guest masks some pending
IRQs at the PIC level.

Signed-off-by: Jan Kiszka 
---
 hw/apic.c  |4 
 hw/i8259.c |   15 +++
 hw/pc.c|3 ---
 hw/pc.h|2 +-
 4 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/hw/apic.c b/hw/apic.c
index d8f56c8..8289eef 100644
--- a/hw/apic.c
+++ b/hw/apic.c
@@ -23,6 +23,7 @@
 #include "host-utils.h"
 #include "sysbus.h"
 #include "trace.h"
+#include "pc.h"
 
 /* APIC Local Vector Table */
 #define APIC_LVT_TIMER   0
@@ -399,6 +400,9 @@ static void apic_update_irq(APICState *s)
 }
 if (apic_irq_pending(s) > 0) {
 cpu_interrupt(s->cpu_env, CPU_INTERRUPT_HARD);
+} else if (apic_accept_pic_intr(&s->busdev.qdev) &&
+   pic_get_output(isa_pic)) {
+apic_deliver_pic_intr(&s->busdev.qdev, 1);
 }
 }
 
diff --git a/hw/i8259.c b/hw/i8259.c
index e5323ff..6006123 100644
--- a/hw/i8259.c
+++ b/hw/i8259.c
@@ -146,8 +146,7 @@ static int pic_get_irq(PicState *s)
 
 /* raise irq to CPU if necessary. must be called every time the active
irq may change */
-/* XXX: should not export it, but it is needed for an APIC kludge */
-void pic_update_irq(PicState2 *s)
+static void pic_update_irq(PicState2 *s)
 {
 int irq2, irq;
 
@@ -174,14 +173,9 @@ void pic_update_irq(PicState2 *s)
 printf("pic: cpu_interrupt\n");
 #endif
 qemu_irq_raise(s->parent_irq);
-}
-
-/* all targets should do this rather than acking the IRQ in the cpu */
-#if defined(TARGET_MIPS) || defined(TARGET_PPC) || defined(TARGET_ALPHA)
-else {
+} else {
 qemu_irq_lower(s->parent_irq);
 }
-#endif
 }
 
 #ifdef DEBUG_IRQ_LATENCY
@@ -441,6 +435,11 @@ uint32_t pic_intack_read(PicState2 *s)
 return ret;
 }
 
+int pic_get_output(PicState2 *s)
+{
+return (pic_get_irq(&s->pics[0]) >= 0);
+}
+
 static void elcr_ioport_write(void *opaque, target_phys_addr_t addr,
   uint64_t val, unsigned size)
 {
diff --git a/hw/pc.c b/hw/pc.c
index c979d4b..ff2111c 100644
--- a/hw/pc.c
+++ b/hw/pc.c
@@ -155,9 +155,6 @@ int cpu_get_pic_interrupt(CPUState *env)
 
 intno = apic_get_interrupt(env->apic_state);
 if (intno >= 0) {
-/* set irq request if a PIC irq is still pending */
-/* XXX: improve that */
-pic_update_irq(isa_pic);
 return intno;
 }
 /* read the irq from the PIC */
diff --git a/hw/pc.h b/hw/pc.h
index 2870be4..60da282 100644
--- a/hw/pc.h
+++ b/hw/pc.h
@@ -66,7 +66,7 @@ void pic_set_irq(int irq, int level);
 void pic_set_irq_new(void *opaque, int irq, int level);
 qemu_irq *i8259_init(qemu_irq parent_irq);
 int pic_read_irq(PicState2 *s);
-void pic_update_irq(PicState2 *s);
+int pic_get_output(PicState2 *s);
 uint32_t pic_intack_read(PicState2 *s);
 void pic_info(Monitor *mon);
 void irq_info(Monitor *mon);
-- 
1.7.3.4




[Qemu-devel] [PATCH v2 12/23] i8259: Clear ELCR on reset

2011-10-07 Thread Jan Kiszka
The ELCR is actually part of the chipset, but we model it here for
simplicity. The PIIX3 clears the ELCR on reset, which was once
broken by 4dbe19e181. Fix this by splitting up pic_init_reset from
pic_reset and clearing the register in the latter.

Signed-off-by: Jan Kiszka 
---
 hw/i8259.c |   15 ++-
 1 files changed, 10 insertions(+), 5 deletions(-)

diff --git a/hw/i8259.c b/hw/i8259.c
index 3498c6b..d18fc62 100644
--- a/hw/i8259.c
+++ b/hw/i8259.c
@@ -263,10 +263,8 @@ int pic_read_irq(PicState2 *s)
 return intno;
 }
 
-static void pic_reset(void *opaque)
+static void pic_init_reset(PicState *s)
 {
-PicState *s = opaque;
-
 s->last_irr = 0;
 s->irr = 0;
 s->imr = 0;
@@ -286,6 +284,14 @@ static void pic_reset(void *opaque)
 pic_update_irq(s->pics_state);
 }
 
+static void pic_reset(void *opaque)
+{
+PicState *s = opaque;
+
+pic_init_reset(s);
+s->elcr = 0;
+}
+
 static void pic_ioport_write(void *opaque, target_phys_addr_t addr64,
  uint64_t val64, unsigned size)
 {
@@ -297,8 +303,7 @@ static void pic_ioport_write(void *opaque, 
target_phys_addr_t addr64,
 DPRINTF("write: addr=0x%02x val=0x%02x\n", addr, val);
 if (addr == 0) {
 if (val & 0x10) {
-/* init */
-pic_reset(s);
+pic_init_reset(s);
 s->init_state = 1;
 s->init4 = val & 1;
 s->single_mode = val & 2;
-- 
1.7.3.4
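
The split described in the commit message can be summarized by the following minimal sketch. It uses a hypothetical, reduced SketchPic structure and sketch_* names rather than the real PicState, and it leaves out most of the registers: the guest-triggered re-initialization keeps the ELCR, while a full device reset clears it as well.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, reduced stand-in for PicState. */
typedef struct {
    uint8_t irr;  /* interrupt request register */
    uint8_t imr;  /* interrupt mask register */
    uint8_t isr;  /* interrupt service register */
    uint8_t elcr; /* edge/level control, modeled as part of the PIC */
} SketchPic;

/* Re-initialization via an init command: the ELCR must survive. */
static void sketch_init_reset(SketchPic *s)
{
    assert(s);
    s->irr = 0;
    s->imr = 0;
    s->isr = 0;
    /* Note: ELCR is not touched here */
}

/* Full device reset: everything above, plus the ELCR. */
static void sketch_reset(SketchPic *s)
{
    sketch_init_reset(s);
    s->elcr = 0;
}
```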




[Qemu-devel] [PATCH v2 07/23] i8259: Move pic_set_irq1 after pic_update_irq

2011-10-07 Thread Jan Kiszka
We are about to call the latter from the former. No functional changes.

Signed-off-by: Jan Kiszka 
---
 hw/i8259.c |   55 +--
 1 files changed, 29 insertions(+), 26 deletions(-)

diff --git a/hw/i8259.c b/hw/i8259.c
index f1d58ba..de2d5ca 100644
--- a/hw/i8259.c
+++ b/hw/i8259.c
@@ -79,32 +79,6 @@ static uint64_t irq_count[16];
 #endif
 PicState2 *isa_pic;
 
-/* set irq level. If an edge is detected, then the IRR is set to 1 */
-static void pic_set_irq1(PicState *s, int irq, int level)
-{
-int mask;
-mask = 1 << irq;
-if (s->elcr & mask) {
-/* level triggered */
-if (level) {
-s->irr |= mask;
-s->last_irr |= mask;
-} else {
-s->irr &= ~mask;
-s->last_irr &= ~mask;
-}
-} else {
-/* edge triggered */
-if (level) {
-if ((s->last_irr & mask) == 0)
-s->irr |= mask;
-s->last_irr |= mask;
-} else {
-s->last_irr &= ~mask;
-}
-}
-}
-
 /* return the highest priority found in mask (highest = smallest
number). Return 8 if no irq */
 static int get_priority(PicState *s, int mask)
@@ -144,6 +118,8 @@ static int pic_get_irq(PicState *s)
 }
 }
 
+static void pic_set_irq1(PicState *s, int irq, int level);
+
 /* raise irq to CPU if necessary. must be called every time the active
irq may change */
 static void pic_update_irq(PicState2 *s)
@@ -178,6 +154,33 @@ static void pic_update_irq(PicState2 *s)
 }
 }
 
+/* set irq level. If an edge is detected, then the IRR is set to 1 */
+static void pic_set_irq1(PicState *s, int irq, int level)
+{
+int mask;
+mask = 1 << irq;
+if (s->elcr & mask) {
+/* level triggered */
+if (level) {
+s->irr |= mask;
+s->last_irr |= mask;
+} else {
+s->irr &= ~mask;
+s->last_irr &= ~mask;
+}
+} else {
+/* edge triggered */
+if (level) {
+if ((s->last_irr & mask) == 0) {
+s->irr |= mask;
+}
+s->last_irr |= mask;
+} else {
+s->last_irr &= ~mask;
+}
+}
+}
+
 #ifdef DEBUG_IRQ_LATENCY
 int64_t irq_time[16];
 #endif
-- 
1.7.3.4
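
Since the function being moved is the core of the edge/level handling, here is a minimal standalone sketch of that logic. SketchPic and sketch_set_irq are hypothetical names for this example; the branches follow pic_set_irq1 as shown in the diff, reduced to the three registers involved. For an edge-triggered line, IRR is only set on a 0-to-1 transition of the input, tracked through last_irr; for a level-triggered line, IRR simply follows the input.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, reduced stand-in for PicState. */
typedef struct {
    uint8_t irr;      /* interrupt request register */
    uint8_t last_irr; /* previous input levels, used for edge detection */
    uint8_t elcr;     /* bit set = line is level triggered */
} SketchPic;

static void sketch_set_irq(SketchPic *s, int irq, int level)
{
    uint8_t mask;

    assert(s && irq >= 0 && irq < 8);
    mask = (uint8_t)(1u << irq);
    if (s->elcr & mask) {
        /* level triggered: the request bit follows the input line */
        if (level) {
            s->irr |= mask;
            s->last_irr |= mask;
        } else {
            s->irr &= (uint8_t)~mask;
            s->last_irr &= (uint8_t)~mask;
        }
    } else {
        /* edge triggered: only a rising edge sets the request bit */
        if (level) {
            if ((s->last_irr & mask) == 0) {
                s->irr |= mask;
            }
            s->last_irr |= mask;
        } else {
            s->last_irr &= (uint8_t)~mask;
        }
    }
}
```

Raising an edge-triggered line that is already high does not set the request bit again after it was acknowledged; the line has to go low first.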




[Qemu-devel] [PATCH][uq/master] kvm: Add tool for querying VMX capabilities

2011-10-07 Thread Jan Kiszka
Taken from original qemu-kvm/kvm/scripts/vmxcap.

Signed-off-by: Jan Kiszka 
---
 scripts/kvm/vmxcap |  224 
 1 files changed, 224 insertions(+), 0 deletions(-)
 create mode 100755 scripts/kvm/vmxcap

diff --git a/scripts/kvm/vmxcap b/scripts/kvm/vmxcap
new file mode 100755
index 000..a74ce71
--- /dev/null
+++ b/scripts/kvm/vmxcap
@@ -0,0 +1,224 @@
+#!/usr/bin/python
+#
+# tool for querying VMX capabilities
+#
+# Copyright 2009-2010 Red Hat, Inc.
+#
+# Authors:
+#  Avi Kivity 
+#
+# This work is licensed under the terms of the GNU GPL, version 2.  See
+# the COPYING file in the top-level directory.
+
+MSR_IA32_VMX_BASIC = 0x480
+MSR_IA32_VMX_PINBASED_CTLS = 0x481
+MSR_IA32_VMX_PROCBASED_CTLS = 0x482
+MSR_IA32_VMX_EXIT_CTLS = 0x483
+MSR_IA32_VMX_ENTRY_CTLS = 0x484
+MSR_IA32_VMX_MISC_CTLS = 0x485
+MSR_IA32_VMX_PROCBASED_CTLS2 = 0x48B
+MSR_IA32_VMX_EPT_VPID_CAP = 0x48C
+MSR_IA32_VMX_TRUE_PINBASED_CTLS = 0x48D
+MSR_IA32_VMX_TRUE_PROCBASED_CTLS = 0x48E
+MSR_IA32_VMX_TRUE_EXIT_CTLS = 0x48F
+MSR_IA32_VMX_TRUE_ENTRY_CTLS = 0x490
+
+class msr(object):
+def __init__(self):
+try:
+self.f = file('/dev/cpu/0/msr')
+except:
+self.f = file('/dev/msr0')
+def read(self, index, default = None):
+import struct
+self.f.seek(index)
+try:
+return struct.unpack('Q', self.f.read(8))[0]
+except:
+return default
+
+class Control(object):
+def __init__(self, name, bits, cap_msr, true_cap_msr = None):
+self.name = name
+self.bits = bits
+self.cap_msr = cap_msr
+self.true_cap_msr = true_cap_msr
+def read2(self, nr):
+m = msr()
+val = m.read(nr, 0)
+return (val & 0xffffffff, val >> 32)
+def show(self):
+print self.name
+mbz, mb1 = self.read2(self.cap_msr)
+tmbz, tmb1 = 0, 0
+if self.true_cap_msr:
+tmbz, tmb1 = self.read2(self.true_cap_msr)
+for bit in sorted(self.bits.keys()):
+zero = not (mbz & (1 << bit))
+one = mb1 & (1 << bit)
+true_zero = not (tmbz & (1 << bit))
+true_one = tmb1 & (1 << bit)
+s= '?'
+if (self.true_cap_msr and true_zero and true_one
+and one and not zero):
+s = 'default'
+elif zero and not one:
+s = 'no'
+elif one and not zero:
+s = 'forced'
+elif one and zero:
+s = 'yes'
+print '  %-40s %s' % (self.bits[bit], s)
+
+class Misc(object):
+def __init__(self, name, bits, msr):
+self.name = name
+self.bits = bits
+self.msr = msr
+def show(self):
+print self.name
+value = msr().read(self.msr, 0)
+def first_bit(key):
+if type(key) is tuple:
+return key[0]
+else:
+return key
+for bits in sorted(self.bits.keys(), key = first_bit):
+if type(bits) is tuple:
+lo, hi = bits
+fmt = int
+else:
+lo = hi = bits
+def fmt(x):
+return { True: 'yes', False: 'no' }[x]
+v = (value >> lo) & ((1 << (hi - lo + 1)) - 1)
+print '  %-40s %s' % (self.bits[bits], fmt(v))
+
+controls = [
+Control(
+name = 'pin-based controls',
+bits = {
+0: 'External interrupt exiting',
+3: 'NMI exiting',
+5: 'Virtual NMIs',
+6: 'Activate VMX-preemption timer',
+},
+cap_msr = MSR_IA32_VMX_PINBASED_CTLS,
+true_cap_msr = MSR_IA32_VMX_TRUE_PINBASED_CTLS,
+),
+
+Control(
+name = 'primary processor-based controls',
+bits = {
+2: 'Interrupt window exiting',
+3: 'Use TSC offsetting',
+7: 'HLT exiting',
+9: 'INVLPG exiting',
+10: 'MWAIT exiting',
+11: 'RDPMC exiting',
+12: 'RDTSC exiting',
+15: 'CR3-load exiting',
+16: 'CR3-store exiting',
+19: 'CR8-load exiting',
+20: 'CR8-store exiting',
+21: 'Use TPR shadow',
+22: 'NMI-window exiting',
+23: 'MOV-DR exiting',
+24: 'Unconditional I/O exiting',
+25: 'Use I/O bitmaps',
+27: 'Monitor trap flag',
+28: 'Use MSR bitmaps',
+29: 'MONITOR exiting',
+30: 'PAUSE exiting',
+31: 'Activate secondary control',
+},
+cap_msr = MSR_IA32_VMX_PROCBASED_CTLS,
+true_cap_msr = MSR_IA32_VMX_TRUE_PROCBASED_CTLS,
+),
+
+Control(
+name = 'secondary processor-based controls',
+bits = {
+0: 'Virtualize APIC accesses',
+1: 'Enable EPT',
+2: 'Descriptor-table exiting',
+   

[Qemu-devel] [PATCH][uq/master] kvm: Add top-like kvm statistics script

2011-10-07 Thread Jan Kiszka
Taken from original qemu-kvm/kvm/kvm_stat.

Signed-off-by: Jan Kiszka 
---
 scripts/kvm/kvm_stat |  480 ++
 1 files changed, 480 insertions(+), 0 deletions(-)
 create mode 100755 scripts/kvm/kvm_stat

diff --git a/scripts/kvm/kvm_stat b/scripts/kvm/kvm_stat
new file mode 100755
index 000..56d2bd7
--- /dev/null
+++ b/scripts/kvm/kvm_stat
@@ -0,0 +1,480 @@
+#!/usr/bin/python
+#
+# top-like utility for displaying kvm statistics
+#
+# Copyright 2006-2008 Qumranet Technologies
+# Copyright 2008-2011 Red Hat, Inc.
+#
+# Authors:
+#  Avi Kivity 
+#
+# This work is licensed under the terms of the GNU GPL, version 2.  See
+# the COPYING file in the top-level directory.
+
+import curses
+import sys, os, time, optparse
+
+class DebugfsProvider(object):
+def __init__(self):
+self.base = '/sys/kernel/debug/kvm'
+self._fields = os.listdir(self.base)
+def fields(self):
+return self._fields
+def select(self, fields):
+self._fields = fields
+def read(self):
+def val(key):
+return int(file(self.base + '/' + key).read())
+return dict([(key, val(key)) for key in self._fields])
+
+vmx_exit_reasons = {
+0: 'EXCEPTION_NMI',
+1: 'EXTERNAL_INTERRUPT',
+2: 'TRIPLE_FAULT',
+7: 'PENDING_INTERRUPT',
+8: 'NMI_WINDOW',
+9: 'TASK_SWITCH',
+10: 'CPUID',
+12: 'HLT',
+14: 'INVLPG',
+15: 'RDPMC',
+16: 'RDTSC',
+18: 'VMCALL',
+19: 'VMCLEAR',
+20: 'VMLAUNCH',
+21: 'VMPTRLD',
+22: 'VMPTRST',
+23: 'VMREAD',
+24: 'VMRESUME',
+25: 'VMWRITE',
+26: 'VMOFF',
+27: 'VMON',
+28: 'CR_ACCESS',
+29: 'DR_ACCESS',
+30: 'IO_INSTRUCTION',
+31: 'MSR_READ',
+32: 'MSR_WRITE',
+33: 'INVALID_STATE',
+36: 'MWAIT_INSTRUCTION',
+39: 'MONITOR_INSTRUCTION',
+40: 'PAUSE_INSTRUCTION',
+41: 'MCE_DURING_VMENTRY',
+43: 'TPR_BELOW_THRESHOLD',
+44: 'APIC_ACCESS',
+48: 'EPT_VIOLATION',
+49: 'EPT_MISCONFIG',
+54: 'WBINVD',
+55: 'XSETBV',
+}
+
+svm_exit_reasons = {
+0x000: 'READ_CR0',
+0x003: 'READ_CR3',
+0x004: 'READ_CR4',
+0x008: 'READ_CR8',
+0x010: 'WRITE_CR0',
+0x013: 'WRITE_CR3',
+0x014: 'WRITE_CR4',
+0x018: 'WRITE_CR8',
+0x020: 'READ_DR0',
+0x021: 'READ_DR1',
+0x022: 'READ_DR2',
+0x023: 'READ_DR3',
+0x024: 'READ_DR4',
+0x025: 'READ_DR5',
+0x026: 'READ_DR6',
+0x027: 'READ_DR7',
+0x030: 'WRITE_DR0',
+0x031: 'WRITE_DR1',
+0x032: 'WRITE_DR2',
+0x033: 'WRITE_DR3',
+0x034: 'WRITE_DR4',
+0x035: 'WRITE_DR5',
+0x036: 'WRITE_DR6',
+0x037: 'WRITE_DR7',
+0x040: 'EXCP_BASE',
+0x060: 'INTR',
+0x061: 'NMI',
+0x062: 'SMI',
+0x063: 'INIT',
+0x064: 'VINTR',
+0x065: 'CR0_SEL_WRITE',
+0x066: 'IDTR_READ',
+0x067: 'GDTR_READ',
+0x068: 'LDTR_READ',
+0x069: 'TR_READ',
+0x06a: 'IDTR_WRITE',
+0x06b: 'GDTR_WRITE',
+0x06c: 'LDTR_WRITE',
+0x06d: 'TR_WRITE',
+0x06e: 'RDTSC',
+0x06f: 'RDPMC',
+0x070: 'PUSHF',
+0x071: 'POPF',
+0x072: 'CPUID',
+0x073: 'RSM',
+0x074: 'IRET',
+0x075: 'SWINT',
+0x076: 'INVD',
+0x077: 'PAUSE',
+0x078: 'HLT',
+0x079: 'INVLPG',
+0x07a: 'INVLPGA',
+0x07b: 'IOIO',
+0x07c: 'MSR',
+0x07d: 'TASK_SWITCH',
+0x07e: 'FERR_FREEZE',
+0x07f: 'SHUTDOWN',
+0x080: 'VMRUN',
+0x081: 'VMMCALL',
+0x082: 'VMLOAD',
+0x083: 'VMSAVE',
+0x084: 'STGI',
+0x085: 'CLGI',
+0x086: 'SKINIT',
+0x087: 'RDTSCP',
+0x088: 'ICEBP',
+0x089: 'WBINVD',
+0x08a: 'MONITOR',
+0x08b: 'MWAIT',
+0x08c: 'MWAIT_COND',
+0x400: 'NPF',
+}
+
+vendor_exit_reasons = {
+'vmx': vmx_exit_reasons,
+'svm': svm_exit_reasons,
+}
+
+exit_reasons = None
+
+for line in file('/proc/cpuinfo').readlines():
+if line.startswith('flags'):
+for flag in line.split():
+if flag in vendor_exit_reasons:
+exit_reasons = vendor_exit_reasons[flag]
+
+filters = {
+'kvm_exit': ('exit_reason', exit_reasons)
+}
+
+def invert(d):
+return dict((x[1], x[0]) for x in d.iteritems())
+
+for f in filters:
+filters[f] = (filters[f][0], invert(filters[f][1]))
+
+import ctypes, struct, array
+
+libc = ctypes.CDLL('libc.so.6')
+syscall = libc.syscall
+class perf_event_attr(ctypes.Structure):
+_fields_ = [('type', ctypes.c_uint32),
+('size', ctypes.c_uint32),
+('config', ctypes.c_uint64),
+('sample_freq', ctypes.c_uint64),
+('sample_type', ctypes.c_uint64),
+('read_format', ctypes.c_uint64),
+('flags', ctypes.c_uint64),
+('wakeup_events', ctypes.c_uint32),
+('bp_type', ctypes.c_uint32),
+('bp_addr', ctypes.c_uint64),
+('bp_len', ctypes.c_uint64),
+]
+def _perf_event_open(attr, pid, cpu, g

Re: [Qemu-devel] [PATCHv3] s390: Fix cpu shutdown for KVM

2011-10-07 Thread Alexander Graf

On 04.10.2011, at 17:20, Christian Borntraeger wrote:

> On s390 a shutdown is the state of all CPUs being either stopped
> or disabled (for interrupts) waiting. We have to track the overall
> number of running CPUs to call the shutdown sequence accordingly.
> This patch implements the counting and shutdown handling for the 
> kvm path in qemu.
> Let's also wrap changes to env->halted and env->exception_index.
> 
> Signed-off-by: Christian Borntraeger 

Thanks, applied to s390-next.

Alex




[Qemu-devel] [PATCH v2 11/23] i8259: Update IRQ state after reset

2011-10-07 Thread Jan Kiszka
MIPS and PPC users of the i8259 output signal expect us to report state
updates also after reset. As no consumer (including the master PIC) can
misinterpret the deassert as an activation event, it is safe to simply
update the IRQ state after reset.

Signed-off-by: Jan Kiszka 
---
 hw/i8259.c |3 +--
 1 files changed, 1 insertions(+), 2 deletions(-)

diff --git a/hw/i8259.c b/hw/i8259.c
index b7a011f..3498c6b 100644
--- a/hw/i8259.c
+++ b/hw/i8259.c
@@ -283,6 +283,7 @@ static void pic_reset(void *opaque)
 s->init4 = 0;
 s->single_mode = 0;
 /* Note: ELCR is not reset */
+pic_update_irq(s->pics_state);
 }
 
 static void pic_ioport_write(void *opaque, target_phys_addr_t addr64,
@@ -298,8 +299,6 @@ static void pic_ioport_write(void *opaque, 
target_phys_addr_t addr64,
 if (val & 0x10) {
 /* init */
 pic_reset(s);
-/* deassert a pending interrupt */
-qemu_irq_lower(s->pics_state->pics[0].int_out);
 s->init_state = 1;
 s->init4 = val & 1;
 s->single_mode = val & 2;
-- 
1.7.3.4




[Qemu-devel] [PATCH v2 20/23] i8259: Convert to qdev

2011-10-07 Thread Jan Kiszka
This key cleanup step requires moving the IRQ debugging bit from
i8259_set_irq directly to the per-PIC pic_set_irq, passing the PIC
parameters (I/O base, ELCR address and mask, master/slave mode) as
qdev properties, and interconnecting the PICs with their environment via
GPIO pins.

Signed-off-by: Jan Kiszka 
---
 hw/i8259.c |  157 ++--
 1 files changed, 100 insertions(+), 57 deletions(-)

diff --git a/hw/i8259.c b/hw/i8259.c
index df23bb8..b4e1867 100644
--- a/hw/i8259.c
+++ b/hw/i8259.c
@@ -41,6 +41,7 @@
 //#define DEBUG_IRQ_COUNT
 
 struct PicState {
+ISADevice dev;
 uint8_t last_irr; /* edge detection */
 uint8_t irr; /* interrupt request register */
 uint8_t imr; /* interrupt mask register */
@@ -58,8 +59,10 @@ struct PicState {
 uint8_t single_mode; /* true if slave pic is not initialized */
 uint8_t elcr; /* PIIX edge/trigger selection*/
 uint8_t elcr_mask;
-qemu_irq int_out;
-bool master; /* reflects /SP input pin */
+qemu_irq int_out[1];
+uint32_t master; /* reflects /SP input pin */
+uint32_t iobase;
+uint32_t elcr_addr;
 MemoryRegion base_io;
 MemoryRegion elcr_io;
 };
@@ -70,6 +73,9 @@ static int irq_level[16];
 #ifdef DEBUG_IRQ_COUNT
 static uint64_t irq_count[16];
 #endif
+#ifdef DEBUG_IRQ_LATENCY
+static int64_t irq_time[16];
+#endif
 PicState *isa_pic;
 static PicState *slave_pic;
 
@@ -122,17 +128,39 @@ static void pic_update_irq(PicState *s)
 if (irq >= 0) {
 DPRINTF("pic%d: imr=%x irr=%x padd=%d\n",
 s->master ? 0 : 1, s->imr, s->irr, s->priority_add);
-qemu_irq_raise(s->int_out);
+qemu_irq_raise(s->int_out[0]);
 } else {
-qemu_irq_lower(s->int_out);
+qemu_irq_lower(s->int_out[0]);
 }
 }
 
 /* set irq level. If an edge is detected, then the IRR is set to 1 */
-static void pic_set_irq1(PicState *s, int irq, int level)
+static void pic_set_irq(void *opaque, int irq, int level)
 {
-int mask;
-mask = 1 << irq;
+PicState *s = opaque;
+int mask = 1 << irq;
+
+#if defined(DEBUG_PIC) || defined(DEBUG_IRQ_COUNT) || \
+defined(DEBUG_IRQ_LATENCY)
+int irq_index = s->master ? irq : irq + 8;
+#endif
+#if defined(DEBUG_PIC) || defined(DEBUG_IRQ_COUNT)
+if (level != irq_level[irq_index]) {
+DPRINTF("pic_set_irq: irq=%d level=%d\n", irq_index, level);
+irq_level[irq_index] = level;
+#ifdef DEBUG_IRQ_COUNT
+if (level == 1) {
+irq_count[irq_index]++;
+}
+#endif
+}
+#endif
+#ifdef DEBUG_IRQ_LATENCY
+if (level) {
+irq_time[irq_index] = qemu_get_clock_ns(vm_clock);
+}
+#endif
+
 if (s->elcr & mask) {
 /* level triggered */
 if (level) {
@@ -156,32 +184,6 @@ static void pic_set_irq1(PicState *s, int irq, int level)
 pic_update_irq(s);
 }
 
-#ifdef DEBUG_IRQ_LATENCY
-int64_t irq_time[16];
-#endif
-
-static void i8259_set_irq(void *opaque, int irq, int level)
-{
-PicState *s = irq <= 7 ? isa_pic : slave_pic;
-
-#if defined(DEBUG_PIC) || defined(DEBUG_IRQ_COUNT)
-if (level != irq_level[irq]) {
-DPRINTF("i8259_set_irq: irq=%d level=%d\n", irq, level);
-irq_level[irq] = level;
-#ifdef DEBUG_IRQ_COUNT
-   if (level == 1)
-   irq_count[irq]++;
-#endif
-}
-#endif
-#ifdef DEBUG_IRQ_LATENCY
-if (level) {
-irq_time[irq] = qemu_get_clock_ns(vm_clock);
-}
-#endif
-pic_set_irq1(s, irq & 7, level);
-}
-
 /* acknowledge interrupt 'irq' */
 static void pic_intack(PicState *s, int irq)
 {
@@ -258,9 +260,9 @@ static void pic_init_reset(PicState *s)
 pic_update_irq(s);
 }
 
-static void pic_reset(void *opaque)
+static void pic_reset(DeviceState *dev)
 {
-PicState *s = opaque;
+PicState *s = container_of(dev, PicState, dev.qdev);
 
 pic_init_reset(s);
 s->elcr = 0;
@@ -447,23 +449,24 @@ static const MemoryRegionOps pic_elcr_ioport_ops = {
 },
 };
 
-/* XXX: add generic master/slave system */
-static void pic_init(int io_addr, int elcr_addr, PicState *s, qemu_irq int_out,
- bool master)
+static int pic_initfn(ISADevice *dev)
 {
-s->int_out = int_out;
-s->master = master;
+PicState *s = DO_UPCAST(PicState, dev, dev);
 
 memory_region_init_io(&s->base_io, &pic_base_ioport_ops, s, "pic", 2);
 memory_region_init_io(&s->elcr_io, &pic_elcr_ioport_ops, s, "elcr", 1);
 
-isa_register_ioport(NULL, &s->base_io, io_addr);
-if (elcr_addr >= 0) {
-isa_register_ioport(NULL, &s->elcr_io, elcr_addr);
+isa_register_ioport(NULL, &s->base_io, s->iobase);
+if (s->elcr_addr != -1) {
+isa_register_ioport(NULL, &s->elcr_io, s->elcr_addr);
 }
 
-vmstate_register(NULL, io_addr, &vmstate_pic, s);
-qemu_register_reset(pic_reset, s);
+qdev_init_gpio_out(&dev->qdev, s->int_out, ARRAY_SIZE(s->int_out));
+qdev_init_gpio_in(&dev->qdev, pic_set_irq, 8);
+
+qdev_set_legacy_instance_id(&dev->q

[Qemu-devel] [PATCH v2 19/23] qdev: Add HEX8 property

2011-10-07 Thread Jan Kiszka
Signed-off-by: Jan Kiszka 
---
 hw/qdev-properties.c |   29 +
 hw/qdev.h|3 +++
 2 files changed, 32 insertions(+), 0 deletions(-)

diff --git a/hw/qdev-properties.c b/hw/qdev-properties.c
index e0e54aa..f0b811c 100644
--- a/hw/qdev-properties.c
+++ b/hw/qdev-properties.c
@@ -93,6 +93,35 @@ PropertyInfo qdev_prop_uint8 = {
 .print = print_uint8,
 };
 
+/* --- 8bit hex value --- */
+
+static int parse_hex8(DeviceState *dev, Property *prop, const char *str)
+{
+uint8_t *ptr = qdev_get_prop_ptr(dev, prop);
+char *end;
+
+*ptr = strtoul(str, &end, 16);
+if ((*end != '\0') || (end == str)) {
+return -EINVAL;
+}
+
+return 0;
+}
+
+static int print_hex8(DeviceState *dev, Property *prop, char *dest, size_t len)
+{
+uint8_t *ptr = qdev_get_prop_ptr(dev, prop);
+return snprintf(dest, len, "0x%" PRIx8, *ptr);
+}
+
+PropertyInfo qdev_prop_hex8 = {
+.name  = "hex8",
+.type  = PROP_TYPE_UINT8,
+.size  = sizeof(uint8_t),
+.parse = parse_hex8,
+.print = print_hex8,
+};
+
 /* --- 16bit integer --- */
 
 static int parse_uint16(DeviceState *dev, Property *prop, const char *str)
diff --git a/hw/qdev.h b/hw/qdev.h
index 8a13ec9..aa7ae36 100644
--- a/hw/qdev.h
+++ b/hw/qdev.h
@@ -224,6 +224,7 @@ extern PropertyInfo qdev_prop_uint16;
 extern PropertyInfo qdev_prop_uint32;
 extern PropertyInfo qdev_prop_int32;
 extern PropertyInfo qdev_prop_uint64;
+extern PropertyInfo qdev_prop_hex8;
 extern PropertyInfo qdev_prop_hex32;
 extern PropertyInfo qdev_prop_hex64;
 extern PropertyInfo qdev_prop_string;
@@ -267,6 +268,8 @@ extern PropertyInfo qdev_prop_pci_devfn;
 DEFINE_PROP_DEFAULT(_n, _s, _f, _d, qdev_prop_int32, int32_t)
 #define DEFINE_PROP_UINT64(_n, _s, _f, _d)  \
 DEFINE_PROP_DEFAULT(_n, _s, _f, _d, qdev_prop_uint64, uint64_t)
+#define DEFINE_PROP_HEX8(_n, _s, _f, _d)   \
+DEFINE_PROP_DEFAULT(_n, _s, _f, _d, qdev_prop_hex8, uint8_t)
 #define DEFINE_PROP_HEX32(_n, _s, _f, _d)   \
 DEFINE_PROP_DEFAULT(_n, _s, _f, _d, qdev_prop_hex32, uint32_t)
 #define DEFINE_PROP_HEX64(_n, _s, _f, _d)   \
-- 
1.7.3.4
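
Note on the parser above: parse_hex8 stores the strtoul() result directly
into a uint8_t, so an input such as "100" would be truncated silently. The
following is a minimal stand-alone sketch of a range-checked variant, not
part of the patch. The names parse_hex8_checked and print_hex8_value are
hypothetical, and the qdev Property plumbing is left out on purpose.

```c
#include <errno.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: parse a hex string into a uint8_t with range check. */
static int parse_hex8_checked(const char *str, uint8_t *out)
{
    char *end;
    unsigned long val;

    errno = 0;
    val = strtoul(str, &end, 16);
    /* Reject empty input and trailing garbage, as the patch does. */
    if (end == str || *end != '\0') {
        return -EINVAL;
    }
    /* Additional check (assumption, not in the patch): value must fit. */
    if (errno == ERANGE || val > UINT8_MAX) {
        return -ERANGE;
    }
    *out = (uint8_t)val;
    return 0;
}

/* Hypothetical helper: format the value the same way print_hex8 does. */
static int print_hex8_value(uint8_t val, char *dest, size_t len)
{
    return snprintf(dest, len, "0x%x", (unsigned)val);
}
```

With base 16, strtoul() also accepts an optional "0x" prefix, so both "1f"
and "0x1f" parse to the same value.
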




[Qemu-devel] [PATCH v2 22/23] monitor: Restrict pic/irq_info to supporting targets

2011-10-07 Thread Jan Kiszka
Signed-off-by: Jan Kiszka 
Acked-by: Blue Swirl 
---
 hw/an5206.c |   10 --
 hw/arm_pic.c|   11 ---
 hw/cris_pic_cpu.c   |6 --
 hw/etraxfs.h|1 +
 hw/lm32_pic.c   |4 ++--
 hw/lm32_pic.h   |3 +++
 hw/microblaze_pic_cpu.c |6 --
 hw/s390-virtio.c|   11 ---
 hw/shix.c   |   11 ---
 hw/sun4m.c  |4 ++--
 hw/sun4m.h  |4 
 hw/sun4u.c  |8 
 hw/xtensa_pic.c |   10 --
 monitor.c   |   21 +
 14 files changed, 33 insertions(+), 77 deletions(-)

diff --git a/hw/an5206.c b/hw/an5206.c
index 481ae60..3fe1f00 100644
--- a/hw/an5206.c
+++ b/hw/an5206.c
@@ -7,7 +7,6 @@
  */
 
 #include "hw.h"
-#include "pc.h"
 #include "mcf.h"
 #include "boards.h"
 #include "loader.h"
@@ -18,15 +17,6 @@
 #define AN5206_MBAR_ADDR 0x1000
 #define AN5206_RAMBAR_ADDR 0x2000
 
-/* Stub functions for hardware that doesn't exist.  */
-void pic_info(Monitor *mon)
-{
-}
-
-void irq_info(Monitor *mon)
-{
-}
-
 /* Board init.  */
 
 static void an5206_init(ram_addr_t ram_size,
diff --git a/hw/arm_pic.c b/hw/arm_pic.c
index 985148a..4e63845 100644
--- a/hw/arm_pic.c
+++ b/hw/arm_pic.c
@@ -8,19 +8,8 @@
  */
 
 #include "hw.h"
-#include "pc.h"
 #include "arm-misc.h"
 
-/* Stub functions for hardware that doesn't exist.  */
-void pic_info(Monitor *mon)
-{
-}
-
-void irq_info(Monitor *mon)
-{
-}
-
-
 /* Input 0 is IRQ and input 1 is FIQ.  */
 static void arm_pic_cpu_handler(void *opaque, int irq, int level)
 {
diff --git a/hw/cris_pic_cpu.c b/hw/cris_pic_cpu.c
index 7f1e4ab..06ae484 100644
--- a/hw/cris_pic_cpu.c
+++ b/hw/cris_pic_cpu.c
@@ -24,16 +24,10 @@
 
 #include "sysbus.h"
 #include "hw.h"
-#include "pc.h"
 #include "etraxfs.h"
 
 #define D(x)
 
-void pic_info(Monitor *mon)
-{}
-void irq_info(Monitor *mon)
-{}
-
 static void cris_pic_cpu_handler(void *opaque, int irq, int level)
 {
 CPUState *env = (CPUState *)opaque;
diff --git a/hw/etraxfs.h b/hw/etraxfs.h
index 1554b0b..24e8fd8 100644
--- a/hw/etraxfs.h
+++ b/hw/etraxfs.h
@@ -22,6 +22,7 @@
  * THE SOFTWARE.
  */
 
+#include "net.h"
 #include "etraxfs_dma.h"
 
 qemu_irq *cris_pic_init_cpu(CPUState *env);
diff --git a/hw/lm32_pic.c b/hw/lm32_pic.c
index 02941a7..8dd0050 100644
--- a/hw/lm32_pic.c
+++ b/hw/lm32_pic.c
@@ -39,7 +39,7 @@ struct LM32PicState {
 typedef struct LM32PicState LM32PicState;
 
 static LM32PicState *pic;
-void pic_info(Monitor *mon)
+void lm32_do_pic_info(Monitor *mon)
 {
 if (pic == NULL) {
 return;
@@ -49,7 +49,7 @@ void pic_info(Monitor *mon)
 pic->im, pic->ip, pic->irq_state);
 }
 
-void irq_info(Monitor *mon)
+void lm32_irq_info(Monitor *mon)
 {
 int i;
 uint32_t count;
diff --git a/hw/lm32_pic.h b/hw/lm32_pic.h
index e6479b8..14456f3 100644
--- a/hw/lm32_pic.h
+++ b/hw/lm32_pic.h
@@ -8,4 +8,7 @@ uint32_t lm32_pic_get_im(DeviceState *d);
 void lm32_pic_set_ip(DeviceState *d, uint32_t ip);
 void lm32_pic_set_im(DeviceState *d, uint32_t im);
 
+void lm32_do_pic_info(Monitor *mon);
+void lm32_irq_info(Monitor *mon);
+
 #endif /* QEMU_HW_LM32_PIC_H */
diff --git a/hw/microblaze_pic_cpu.c b/hw/microblaze_pic_cpu.c
index 9ad48b4..8b5623c 100644
--- a/hw/microblaze_pic_cpu.c
+++ b/hw/microblaze_pic_cpu.c
@@ -23,16 +23,10 @@
  */
 
 #include "hw.h"
-#include "pc.h"
 #include "microblaze_pic_cpu.h"
 
 #define D(x)
 
-void pic_info(Monitor *mon)
-{}
-void irq_info(Monitor *mon)
-{}
-
 static void microblaze_pic_cpu_handler(void *opaque, int irq, int level)
 {
 CPUState *env = (CPUState *)opaque;
diff --git a/hw/s390-virtio.c b/hw/s390-virtio.c
index acbf026..778cffe 100644
--- a/hw/s390-virtio.c
+++ b/hw/s390-virtio.c
@@ -62,17 +62,6 @@
 static VirtIOS390Bus *s390_bus;
 static CPUState **ipi_states;
 
-void irq_info(Monitor *mon);
-void pic_info(Monitor *mon);
-
-void irq_info(Monitor *mon)
-{
-}
-
-void pic_info(Monitor *mon)
-{
-}
-
 CPUState *s390_cpu_addr2state(uint16_t cpu_addr)
 {
 if (cpu_addr >= smp_cpus) {
diff --git a/hw/shix.c b/hw/shix.c
index 638bf16..dbf4764 100644
--- a/hw/shix.c
+++ b/hw/shix.c
@@ -28,7 +28,6 @@
More information in target-sh4/README.sh4
 */
 #include "hw.h"
-#include "pc.h"
 #include "sh.h"
 #include "sysemu.h"
 #include "boards.h"
@@ -37,16 +36,6 @@
 #define BIOS_FILENAME "shix_bios.bin"
 #define BIOS_ADDRESS 0xA000
 
-void irq_info(Monitor *mon)
-{
-/* X */
-}
-
-void pic_info(Monitor *mon)
-{
-/* X */
-}
-
 static void shix_init(ram_addr_t ram_size,
const char *boot_device,
   const char *kernel_filename, const char *kernel_cmdline,
diff --git a/hw/sun4m.c b/hw/sun4m.c
index dcaed38..71bf648 100644
--- a/hw/sun4m.c
+++ b/hw/sun4m.c
@@ -216,13 +216,13 @@ static void nvram_init(M48t59State *nvram, uint8_t *macaddr,
 
 static DeviceState *slavio_intctl;
 
-void pic_info(Monitor *mon)
+void sun4m_pic_info(Monit

[Qemu-devel] [PATCH v2 02/23] pc: Generalize ISA IRQs to GSIs

2011-10-07 Thread Jan Kiszka
The ISA bus IRQ range is 0..15. What isa_irq_handler and IsaIrqState are
actually dealing with are the Global System Interrupts. Refactor the
code to clarify this.

Signed-off-by: Jan Kiszka 
---
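
Not part of the patch: a minimal stand-alone sketch of the routing that
gsi_handler performs in the hw/pc.c hunk below. GSIs below ISA_NUM_IRQS go
to both the i8259 and the IOAPIC, the remaining ones only to the IOAPIC.
GSIModel and gsi_model_set are hypothetical names, plain int levels stand in
for qemu_irq, and the bounds check is an addition that the real handler does
not have.

```c
#define ISA_NUM_IRQS    16
#define IOAPIC_NUM_PINS 24
#define GSI_NUM_PINS    IOAPIC_NUM_PINS

/* Simplified stand-in for GSIState: line levels instead of qemu_irq. */
typedef struct GSIModel {
    int i8259_level[ISA_NUM_IRQS];
    int ioapic_level[IOAPIC_NUM_PINS];
} GSIModel;

/* Route GSI n to its consumers; returns -1 for an out-of-range pin. */
static int gsi_model_set(GSIModel *s, int n, int level)
{
    if (n < 0 || n >= GSI_NUM_PINS) {
        return -1; /* assumption: reject instead of indexing out of bounds */
    }
    if (n < ISA_NUM_IRQS) {
        s->i8259_level[n] = level; /* legacy PIC sees only GSIs 0..15 */
    }
    s->ioapic_level[n] = level;    /* IOAPIC sees every GSI */
    return 0;
}
```
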
 hw/ioapic.h  |7 +++
 hw/isa.h |2 ++
 hw/pc.c  |   18 +-
 hw/pc.h  |   18 ++
 hw/pc_piix.c |   28 ++--
 5 files changed, 42 insertions(+), 31 deletions(-)

diff --git a/hw/ioapic.h b/hw/ioapic.h
index cb2642a..86e63da 100644
--- a/hw/ioapic.h
+++ b/hw/ioapic.h
@@ -17,4 +17,11 @@
  * License along with this library; if not, see .
  */
 
+#ifndef HW_IOAPIC_H
+#define HW_IOAPIC_H
+
+#define IOAPIC_NUM_PINS 24
+
 void ioapic_eoi_broadcast(int vector);
+
+#endif /* !HW_IOAPIC_H */
diff --git a/hw/isa.h b/hw/isa.h
index 432d17a..820c390 100644
--- a/hw/isa.h
+++ b/hw/isa.h
@@ -7,6 +7,8 @@
 #include "memory.h"
 #include "qdev.h"
 
+#define ISA_NUM_IRQS 16
+
 typedef struct ISABus ISABus;
 typedef struct ISADevice ISADevice;
 typedef struct ISADeviceInfo ISADeviceInfo;
diff --git a/hw/pc.c b/hw/pc.c
index a15d165..c979d4b 100644
--- a/hw/pc.c
+++ b/hw/pc.c
@@ -88,15 +88,15 @@ struct e820_table {
 static struct e820_table e820_table;
 struct hpet_fw_config hpet_cfg = {.count = UINT8_MAX};
 
-void isa_irq_handler(void *opaque, int n, int level)
+void gsi_handler(void *opaque, int n, int level)
 {
-IsaIrqState *isa = (IsaIrqState *)opaque;
+GSIState *s = opaque;
 
-DPRINTF("isa_irqs: %s irq %d\n", level? "raise" : "lower", n);
-if (n < 16) {
-qemu_set_irq(isa->i8259[n], level);
+DPRINTF("pc: %s GSI %d\n", level ? "raising" : "lowering", n);
+if (n < ISA_NUM_IRQS) {
+qemu_set_irq(s->i8259_irq[n], level);
 }
-qemu_set_irq(isa->ioapic[n], level);
+qemu_set_irq(s->ioapic_irq[n], level);
 }
 
 static void ioport80_write(void *opaque, uint32_t addr, uint32_t data)
@@ -1115,7 +1115,7 @@ static void cpu_request_exit(void *opaque, int irq, int level)
 }
 }
 
-void pc_basic_device_init(qemu_irq *isa_irq,
+void pc_basic_device_init(qemu_irq *gsi,
   ISADevice **rtc_state,
   bool no_vmport)
 {
@@ -1134,8 +1134,8 @@ void pc_basic_device_init(qemu_irq *isa_irq,
 DeviceState *hpet = sysbus_try_create_simple("hpet", HPET_BASE, NULL);
 
 if (hpet) {
-for (i = 0; i < 24; i++) {
-sysbus_connect_irq(sysbus_from_qdev(hpet), i, isa_irq[i]);
+for (i = 0; i < GSI_NUM_PINS; i++) {
+sysbus_connect_irq(sysbus_from_qdev(hpet), i, gsi[i]);
 }
 rtc_irq = qdev_get_gpio_in(hpet, 0);
 }
diff --git a/hw/pc.h b/hw/pc.h
index 7e6ddba..4333898 100644
--- a/hw/pc.h
+++ b/hw/pc.h
@@ -8,6 +8,7 @@
 #include "fdc.h"
 #include "net.h"
 #include "memory.h"
+#include "ioapic.h"
 
 /* PC-style peripherals (also used by other machines).  */
 
@@ -70,15 +71,16 @@ uint32_t pic_intack_read(PicState2 *s);
 void pic_info(Monitor *mon);
 void irq_info(Monitor *mon);
 
-/* ISA */
-#define IOAPIC_NUM_PINS 0x18
+/* Global System Interrupts */
 
-typedef struct isa_irq_state {
-qemu_irq *i8259;
-qemu_irq ioapic[IOAPIC_NUM_PINS];
-} IsaIrqState;
+#define GSI_NUM_PINS IOAPIC_NUM_PINS
 
-void isa_irq_handler(void *opaque, int n, int level);
+typedef struct GSIState {
+qemu_irq *i8259_irq;
+qemu_irq ioapic_irq[IOAPIC_NUM_PINS];
+} GSIState;
+
+void gsi_handler(void *opaque, int n, int level);
 
 /* i8254.c */
 
@@ -141,7 +143,7 @@ void pc_memory_init(MemoryRegion *system_memory,
 MemoryRegion **ram_memory);
 qemu_irq *pc_allocate_cpu_irq(void);
 void pc_vga_init(PCIBus *pci_bus);
-void pc_basic_device_init(qemu_irq *isa_irq,
+void pc_basic_device_init(qemu_irq *gsi,
   ISADevice **rtc_state,
   bool no_vmport);
 void pc_init_ne2k_isa(NICInfo *nd);
diff --git a/hw/pc_piix.c b/hw/pc_piix.c
index ce1c87f..e6e280c 100644
--- a/hw/pc_piix.c
+++ b/hw/pc_piix.c
@@ -53,7 +53,7 @@ static const int ide_iobase[MAX_IDE_BUS] = { 0x1f0, 0x170 };
 static const int ide_iobase2[MAX_IDE_BUS] = { 0x3f6, 0x376 };
 static const int ide_irq[MAX_IDE_BUS] = { 14, 15 };
 
-static void ioapic_init(IsaIrqState *isa_irq_state)
+static void ioapic_init(GSIState *gsi_state)
 {
 DeviceState *dev;
 SysBusDevice *d;
@@ -65,7 +65,7 @@ static void ioapic_init(IsaIrqState *isa_irq_state)
 sysbus_mmio_map(d, 0, 0xfec0);
 
 for (i = 0; i < IOAPIC_NUM_PINS; i++) {
-isa_irq_state->ioapic[i] = qdev_get_gpio_in(dev, i);
+gsi_state->ioapic_irq[i] = qdev_get_gpio_in(dev, i);
 }
 }
 
@@ -87,11 +87,11 @@ static void pc_init1(MemoryRegion *system_memory,
 PCII440FXState *i440fx_state;
 int piix3_devfn = -1;
 qemu_irq *cpu_irq;
-qemu_irq *isa_irq;
+qemu_irq *gsi;
 qemu_irq *i8259;
 qemu_irq *cmos_s3;
 qemu_irq *smi_irq;
-IsaIrqState *isa_irq_state;
+GSIStat

[Qemu-devel] [PATCH v2 00/23] Rework i8259 and PC interrupt models

2011-10-07 Thread Jan Kiszka
Highlights of this series:
 - generic i8259, now part of hwlib
 - qdev conversion of i8259
 - fix for i8259 poll mode (and removal of PREP hack)

The refactoring will also be important to instantiate i8259-kvm devices
for in-kernel irqchip acceleration one day.

Changes in v2:
 - kept PIC irq state update after reset but clarified why this is
   required and only valid here
 - additional fix: Clear ELCR on reset
 - included already posted updates of patch 22 and 23

CC: Andreas Färber 

Jan Kiszka (23):
  pc: Drop useless test from isa_irq_handler
  pc: Generalize ISA IRQs to GSIs
  pc: Convert GSIState::i8259_irq into array
  pc: Fix and clean up PIC-to-APIC IRQ path
  i8259: Remove premature inline function attributes
  i8259: Drop obsolete prototypes
  i8259: Move pic_set_irq1 after pic_update_irq
  i8259: Introduce per-PIC output interrupt
  i8259: Do not update IRQ output after spurious pic_poll_read
  i8259: Reorder intack in pic_read_irq
  i8259: Update IRQ state after reset
  i8259: Clear ELCR on reset
  i8259: Switch to per-PIC IRQ update
  i8259: Fix poll command
  i8259: Clean up pic_ioport_read
  i8259: PREP: Replace pic_intack_read with pic_read_irq
  i8259: Replace PicState::pics_state with master flag
  i8259: Eliminate PicState2
  qdev: Add HEX8 property
  i8259: Convert to qdev
  i8259: Fix coding style
  monitor: Restrict pic/irq_info to supporting targets
  i8259: Move to hw library

 Makefile.objs|1 +
 Makefile.target  |8 +-
 default-configs/alpha-softmmu.mak|1 +
 default-configs/i386-softmmu.mak |1 +
 default-configs/mips-softmmu.mak |1 +
 default-configs/mips64-softmmu.mak   |1 +
 default-configs/mips64el-softmmu.mak |1 +
 default-configs/mipsel-softmmu.mak   |1 +
 default-configs/ppc-softmmu.mak  |1 +
 default-configs/ppc64-softmmu.mak|1 +
 default-configs/ppcemb-softmmu.mak   |1 +
 default-configs/x86_64-softmmu.mak   |1 +
 hw/an5206.c  |   10 -
 hw/apic.c|4 +
 hw/arm_pic.c |   11 -
 hw/cris_pic_cpu.c|6 -
 hw/etraxfs.h |1 +
 hw/i8259.c   |  397 ++
 hw/ioapic.h  |7 +
 hw/isa.h |2 +
 hw/lm32_pic.c|4 +-
 hw/lm32_pic.h|3 +
 hw/microblaze_pic_cpu.c  |6 -
 hw/pc.c  |   24 +--
 hw/pc.h  |   29 ++--
 hw/pc_piix.c |   30 ++--
 hw/ppc_prep.c|2 +-
 hw/qdev-properties.c |   29 +++
 hw/qdev.h|3 +
 hw/s390-virtio.c |   11 -
 hw/shix.c|   11 -
 hw/sun4m.c   |4 +-
 hw/sun4m.h   |4 +
 hw/sun4u.c   |8 -
 hw/xtensa_pic.c  |   10 -
 monitor.c|   21 ++
 36 files changed, 339 insertions(+), 317 deletions(-)

-- 
1.7.3.4




[Qemu-devel] [PATCH v2 03/23] pc: Convert GSIState::i8259_irq into array

2011-10-07 Thread Jan Kiszka
Will be required when we no longer let i8259_init allocate the PIC IRQs
but convert that chip to qdev.

Signed-off-by: Jan Kiszka 
---
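
Not part of the patch: a minimal sketch of what the conversion below buys.
Once the IRQ handles are copied into an array embedded in the state
structure, the structure no longer depends on the caller's allocation.
irq_handle, GSIArrayModel and gsi_copy_i8259 are hypothetical stand-ins for
qemu_irq, GSIState and the loop added to pc_init1.

```c
#include <stddef.h>

#define ISA_NUM_IRQS 16

typedef int irq_handle; /* hypothetical stand-in for qemu_irq */

typedef struct GSIArrayModel {
    irq_handle i8259_irq[ISA_NUM_IRQS]; /* embedded array, not a pointer */
} GSIArrayModel;

/* Copy the handles element by element, as the new loop in pc_init1 does. */
static void gsi_copy_i8259(GSIArrayModel *s, const irq_handle *src)
{
    size_t i;

    for (i = 0; i < ISA_NUM_IRQS; i++) {
        s->i8259_irq[i] = src[i];
    }
}
```
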
 hw/pc.h  |2 +-
 hw/pc_piix.c |4 +++-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/hw/pc.h b/hw/pc.h
index 4333898..2870be4 100644
--- a/hw/pc.h
+++ b/hw/pc.h
@@ -76,7 +76,7 @@ void irq_info(Monitor *mon);
 #define GSI_NUM_PINS IOAPIC_NUM_PINS
 
 typedef struct GSIState {
-qemu_irq *i8259_irq;
+qemu_irq i8259_irq[ISA_NUM_IRQS];
 qemu_irq ioapic_irq[IOAPIC_NUM_PINS];
 } GSIState;
 
diff --git a/hw/pc_piix.c b/hw/pc_piix.c
index e6e280c..c89042f 100644
--- a/hw/pc_piix.c
+++ b/hw/pc_piix.c
@@ -158,7 +158,9 @@ static void pc_init1(MemoryRegion *system_memory,
 i8259 = xen_interrupt_controller_init();
 }
 
-gsi_state->i8259_irq = i8259;
+for (i = 0; i < ISA_NUM_IRQS; i++) {
+gsi_state->i8259_irq[i] = i8259[i];
+}
 if (pci_enabled) {
 ioapic_init(gsi_state);
 }
-- 
1.7.3.4




Re: [Qemu-devel] [0/4] pseries: Support and improvements for KVM Book3S-HV support (v2)

2011-10-07 Thread Alexander Graf

On 30.09.2011, at 09:39, David Gibson wrote:

> Alex Graf has added support for KVM acceleration of the pseries
> machine, using his Book3S-PR KVM variant, which runs the guest in
> userspace, emulating supervisor operations.  Recent kernels now have
> the Book3S-HV KVM variant which uses the hardware hypervisor features
> of recent POWER CPUs.  Alex's changes to qemu are enough to get qemu
> working roughly with Book3S-HV, but taking full advantage of this mode
> needs more work.  This patch series makes a start on better exploiting
> Book3S-HV.
> 
> Even with these patches, qemu won't quite be able to run on a current
> Book3S-HV KVM kernel.  That's because current Book3S-HV requires guest
> memory to be backed by hugepages, but qemu refuses to use hugepages
> for guest memory unless KVM advertises CAP_SYNC_MMU, which Book3S-HV
> does not currently do.  We're working on improvements to the KVM code
> which will implement CAP_SYNC_MMU and allow smallpage backing of
> guests, but they're not there yet.  So, in order to test Book3S-HV for
> now you need to either:
> 
> * Hack the host kernel to lie and advertise CAP_SYNC_MMU even though
>   it doesn't really implement it.
> 
> or
> 
> * Hack qemu so it does not check for CAP_SYNC_MMU when the -mem-path
>   option is used.
> 
> Both approaches are ugly and unsafe, but it seems we can generally get
> away with it in practice.  Obviously this is only an interim hack
> until the proper CAP_SYNC_MMU support is ready.

Thanks, applied all to my local ppc-next tree. Will push to repo.or.cz when 
Blue pulls the current ppc-next tree from there.


Alex




[Qemu-devel] [PATCH] qemu-kvm: Restore VAPIC option ROM installation

2011-10-07 Thread Jan Kiszka
Still needed but was accidentally removed by 8bc62bc6be.

Signed-off-by: Jan Kiszka 
---

As the guilty patch is only in next so far, you may also fold this one
in.

 hw/pc.c |4 
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/hw/pc.c b/hw/pc.c
index a616029..70e0d08 100644
--- a/hw/pc.c
+++ b/hw/pc.c
@@ -1049,6 +1049,10 @@ void pc_memory_init(MemoryRegion *system_memory,
 (uint32_t)(-bios_size),
 bios);
 
+option_rom[nb_option_roms].name = g_strdup(VAPIC_FILENAME);
+option_rom[nb_option_roms].bootindex = -1;
+nb_option_roms++;
+
 fw_cfg = bochs_bios_init();
 rom_set_fw(fw_cfg);
 
-- 
1.7.3.4