date:20100606

Re: [Qemu-devel] Re: [RFT][PATCH 07/15] qemu_irq: Add IRQ handlers with delivery feedback

2010-06-06 Thread Gleb Natapov

On Sat, Jun 05, 2010 at 02:04:01AM +0200, Jan Kiszka wrote:
  I'd like to also support EOI handling. When the guest clears the
  interrupt condition, the EOI callback would be called. This could occur
  much later than the IRQ delivery time. I'm not sure if we need the
  result code in that case.
  
  If any intermediate device (IOAPIC?) needs to be informed about either
  delivery or EOI also, it could create a proxy message with its
  callbacks in place. But we need then a separate opaque field (in
  addition to payload) to store the original message.
  
  struct IRQMsg {
   DeviceState *src;
   void (*delivery_cb)(IRQMsg *msg, int result);
   void (*eoi_cb)(IRQMsg *msg, int result);
   void *src_opaque;
   void *payload;
  };
 
 Extending the lifetime of IRQMsg objects beyond the delivery call stack
 means qemu_malloc/free for every delivery. I think it takes a _very_
 appealing reason to justify this. But so far I do not see any use case
 for eoi_cb at all.
 
I dislike use of eoi for reinjecting missing interrupts since
it eliminates use of internal PIC/APIC queue of not yet delivered
interrupts. PIC and APIC have an internal queue that can handle two elements:
one is delivered, but not yet acked interrupt in isr and another is
pending interrupt in irr. Using eoi callback (or ack notifier as it's
called inside kernel) interrupt will be considered coalesced even if irr
is cleared, but no ack was received for previously delivered interrupt.
But ack notifiers actually have another use: device assignment. There is
a plan to move device assignment from kernel to userspace and for that
ack notifiers will have to be extended to userspace too. If so we can
use them to do irq decoalescing as well. I doubt they should be part
of IRQMsg though. Why not do what kernel does: have globally registered
notifier based on irqchip/pin.
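
A minimal sketch of what a globally registered notifier keyed on irqchip/pin could look like. All names here (AckNotifier, ack_notifier_register, ack_notify, DemoDevice, the table sizes) are hypothetical and chosen for illustration; this is neither the kernel's nor QEMU's actual API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sizes; a real table would follow the machine's irqchips. */
enum { NUM_IRQCHIPS = 3, NUM_PINS = 24 };

typedef struct AckNotifier AckNotifier;
struct AckNotifier {
    void (*acked)(AckNotifier *n);  /* called when the guest acks the pin */
    AckNotifier *next;              /* next notifier on the same pin */
};

/* One list head per (irqchip, pin) pair, registered globally. */
static AckNotifier *ack_notifiers[NUM_IRQCHIPS][NUM_PINS];

static void ack_notifier_register(int irqchip, int pin, AckNotifier *n)
{
    assert(irqchip >= 0 && irqchip < NUM_IRQCHIPS);
    assert(pin >= 0 && pin < NUM_PINS);
    n->next = ack_notifiers[irqchip][pin];
    ack_notifiers[irqchip][pin] = n;
}

/* Called by the interrupt controller model when the guest acks a pin.
 * Returns the number of notifiers that were invoked. */
static int ack_notify(int irqchip, int pin)
{
    int called = 0;
    AckNotifier *n;

    for (n = ack_notifiers[irqchip][pin]; n != NULL; n = n->next) {
        n->acked(n);
        called++;
    }
    return called;
}

/* Example user: a device embedding the notifier as its first member. */
typedef struct DemoDevice {
    AckNotifier notifier;  /* first member, so the pointer can be cast back */
    int acks_seen;
} DemoDevice;

static void demo_acked(AckNotifier *n)
{
    DemoDevice *d = (DemoDevice *)n;

    d->acks_seen++;
}
```

With this shape the device never sees a per-delivery message object; it only learns that "its" pin was acked, which is the trade-off under discussion.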

--
Gleb.

[Qemu-devel] Re: sun framebuffer selection (was option-rom)

2010-06-06 Thread Blue Swirl

On Sat, Jun 5, 2010 at 11:10 PM, Bob Breuer breu...@mc.net wrote:
 Blue Swirl wrote:
  but again: should we have a new machine with cg14 or
 some switch to select TCX vs. cg14?

 Maybe the recently proposed machine subtype patches could help here.

 Well, let's try to figure out a method of selecting the framebuffer
 type.  I'll try to list some of the options, even if they might be
 ridiculous.

 1) Use the -vga option.  I know TCX and cg14 are not vga, but I think
 it's the closest existing command line option available.

 2) Switch based on the -g WxH option.  At the moment, the TCX emulation
 doesn't really handle anything other than 1024x768, so switch to cg14
 for other resolutions if supported.

 3) Use some other existing command line option, -device, -set or
 -global?  Might work, but the syntax may not be easy to remember.

We don't have an equivalent of -chardev, -netdev and -drive for displays.

 4) Machine subtype.

 5) New command line option.  Anything above might be better.

 6) New machine type.  Is it a big enough feature to demand its own
 machine type?  Maybe, but see next option.

 7) Select as default video for SS-20.  The SS-10 and SS-600MP are
 already very similar.  This would allow for some differentiation between
 the machines, but there could still be an option to switch back to TCX.
 Note that TCX was really only available for the SS-4 and SS-5.


 Is there anything else that I missed?

Combined 7 & 6: make cg14 default for SS-20, add a deprecated
compatibility machine for SS-20 with TCX.


 I'm going to go ahead with option 2 in the short term.  I'm inclined to
 narrow it down to options 1, 4, and 7.  I know that 7 would have
 backwards compatibility concerns.  The cg14 seems to have at least the
 same capabilities as TCX so there shouldn't be any loss of
 functionality.  Even though SS-20 is not the default machine, do you
 know of any OS that works with the sun4m implementation today but
 doesn't have a cg14 driver?  Possible downside to cg14 for video is that
 any acceleration is handled by the SX pixel processor which has no
 available documentation.  TCX also has some amount of unimplemented
 acceleration.

It would be nice to use some basic device with well defined
acceleration or just a frame buffer as default.

[Qemu-devel] Re: [PATCH 6/6] apic: avoid using CPUState internals

2010-06-06 Thread Jan Kiszka

Blue Swirl wrote:
 Use only an opaque CPUState pointer and move the actual CPUState
 contents handling to cpu.h and cpuid.c.
 
 Set env->halted in pc.c and add a function to get the local APIC state
 of the current CPU for the MMIO.
 
 Signed-off-by: Blue Swirl blauwir...@gmail.com
 ---
  hw/apic.c   |   40 +++-
  hw/apic.h   |9 -
  hw/pc.c |   12 +++-
  target-i386/cpu.h   |   27 ---
  target-i386/cpuid.c |6 ++
  5 files changed, 56 insertions(+), 38 deletions(-)
 
 diff --git a/hw/apic.c b/hw/apic.c
 index 91c8d93..332c66e 100644
 --- a/hw/apic.c
 +++ b/hw/apic.c
 @@ -95,7 +95,7 @@
  #define MSI_ADDR_SIZE   0x10
 
  struct APICState {
 -CPUState *cpu_env;
 +void *cpu_env;
  uint32_t apicbase;
  uint8_t id;
  uint8_t arb_id;
 @@ -320,7 +320,7 @@ void cpu_set_apic_base(APICState *s, uint64_t val)
      /* if disabled, cannot be enabled again */
      if (!(val & MSR_IA32_APICBASE_ENABLE)) {
          s->apicbase &= ~MSR_IA32_APICBASE_ENABLE;
 -        s->cpu_env->cpuid_features &= ~CPUID_APIC;
 +        cpu_clear_apic_feature(s->cpu_env);
          s->spurious_vec &= ~APIC_SV_ENABLE;
      }
  }
 @@ -508,8 +508,6 @@ void apic_init_reset(APICState *s)
      s->initial_count_load_time = 0;
      s->next_time = 0;
      s->wait_for_sipi = 1;
 -
 -    s->cpu_env->halted = !(s->apicbase & MSR_IA32_APICBASE_BSP);

We are now lacking 'halted' initialization after system reset. Could be
addressed by a special reset handler in hw/pc.c, I guess.
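
A rough sketch of such a reset handler, under stated assumptions: DemoCPU and demo_pc_cpu_reset are stand-in names, not code from the patch, and the real handler would operate on CPUState and be hooked up with something like qemu_register_reset(). Only the expression comes from the removed line.

```c
#include <assert.h>
#include <stdint.h>

/* Bit 8 of the IA32_APIC_BASE MSR marks the bootstrap processor. */
#define MSR_IA32_APICBASE_BSP (1u << 8)

/* Stand-in for the CPU state; only the fields relevant here. */
typedef struct DemoCPU {
    int halted;
    uint32_t apicbase;
} DemoCPU;

/* Hypothetical reset handler living in hw/pc.c: application processors
 * come out of reset halted, the bootstrap processor starts running. */
static void demo_pc_cpu_reset(void *opaque)
{
    DemoCPU *cpu = opaque;

    assert(cpu);
    cpu->halted = !(cpu->apicbase & MSR_IA32_APICBASE_BSP);
}
```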

Jan





Re: [Qemu-devel] Re: [RFT][PATCH 07/15] qemu_irq: Add IRQ handlers with delivery feedback

2010-06-06 Thread Jan Kiszka

Gleb Natapov wrote:
 On Sat, Jun 05, 2010 at 02:04:01AM +0200, Jan Kiszka wrote:
 I'd like to also support EOI handling. When the guest clears the
 interrupt condition, the EOI callback would be called. This could occur
 much later than the IRQ delivery time. I'm not sure if we need the
 result code in that case.

 If any intermediate device (IOAPIC?) needs to be informed about either
 delivery or EOI also, it could create a proxy message with its
 callbacks in place. But we need then a separate opaque field (in
 addition to payload) to store the original message.

 struct IRQMsg {
  DeviceState *src;
  void (*delivery_cb)(IRQMsg *msg, int result);
  void (*eoi_cb)(IRQMsg *msg, int result);
  void *src_opaque;
  void *payload;
 };
 Extending the lifetime of IRQMsg objects beyond the delivery call stack
 means qemu_malloc/free for every delivery. I think it takes a _very_
 appealing reason to justify this. But so far I do not see any use case
 for eoi_cb at all.

 I dislike use of eoi for reinjecting missing interrupts since
 it eliminates use of internal PIC/APIC queue of not yet delivered
 interrupts. PIC and APIC have an internal queue that can handle two elements:
 one is delivered, but not yet acked interrupt in isr and another is
 pending interrupt in irr. Using eoi callback (or ack notifier as it's
 called inside kernel) interrupt will be considered coalesced even if irr
 is cleared, but no ack was received for previously delivered interrupt.
 But ack notifiers actually have another use: device assignment. There is
 a plan to move device assignment from kernel to userspace and for that
 ack notifiers will have to be extended to userspace too. If so we can
 use them to do irq decoalescing as well. I doubt they should be part
 of IRQMsg though. Why not do what kernel does: have globally registered
 notifier based on irqchip/pin.

I read this twice but I still don't get your plan. Do you like or
dislike using EOI for de-coalescing? And how should these notifiers work?

Jan




Re: [Qemu-devel] Re: [RFT][PATCH 07/15] qemu_irq: Add IRQ handlers with delivery feedback

2010-06-06 Thread Blue Swirl

On Sun, Jun 6, 2010 at 7:15 AM, Gleb Natapov g...@redhat.com wrote:
 On Sat, Jun 05, 2010 at 02:04:01AM +0200, Jan Kiszka wrote:
  I'd like to also support EOI handling. When the guest clears the
  interrupt condition, the EOI callback would be called. This could occur
  much later than the IRQ delivery time. I'm not sure if we need the
  result code in that case.
 
  If any intermediate device (IOAPIC?) needs to be informed about either
  delivery or EOI also, it could create a proxy message with its
  callbacks in place. But we need then a separate opaque field (in
  addition to payload) to store the original message.
 
  struct IRQMsg {
   DeviceState *src;
   void (*delivery_cb)(IRQMsg *msg, int result);
   void (*eoi_cb)(IRQMsg *msg, int result);
   void *src_opaque;
   void *payload;
  };

 Extending the lifetime of IRQMsg objects beyond the delivery call stack
 means qemu_malloc/free for every delivery. I think it takes a _very_
 appealing reason to justify this. But so far I do not see any use case
 for eoi_cb at all.

 I dislike use of eoi for reinjecting missing interrupts since
 it eliminates use of internal PIC/APIC queue of not yet delivered
 interrupts. PIC and APIC have an internal queue that can handle two elements:
 one is delivered, but not yet acked interrupt in isr and another is
 pending interrupt in irr. Using eoi callback (or ack notifier as it's
 called inside kernel) interrupt will be considered coalesced even if irr
 is cleared, but no ack was received for previously delivered interrupt.
 But ack notifiers actually have another use: device assignment. There is
 a plan to move device assignment from kernel to userspace and for that
 ack notifiers will have to be extended to userspace too. If so we can
 use them to do irq decoalescing as well. I doubt they should be part
 of IRQMsg though. Why not do what kernel does: have globally registered
 notifier based on irqchip/pin.

Because translation at IOAPIC may be lossy, IRQs from many devices
pointing to the same vector? With IRQMsg you know where a specific
message came from. The situation is different inside the kernel: it
manages both translation and registration, whereas in QEMU we could
only control registration.
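
To illustrate the point about knowing where a message came from, here is a small sketch of the proxy idea from the quoted proposal. It assumes a trimmed IRQMsg (no eoi_cb, src as a plain pointer) and invented result codes and helper names (DEMO_DELIVERED, DEMO_COALESCED, demo_*); none of this is taken from the actual patches.

```c
#include <assert.h>
#include <stddef.h>

typedef struct IRQMsg IRQMsg;
struct IRQMsg {
    void *src;                                    /* DeviceState * in the proposal */
    void (*delivery_cb)(IRQMsg *msg, int result);
    void *src_opaque;                             /* private to whoever built the msg */
    void *payload;
};

enum { DEMO_DELIVERED = 0, DEMO_COALESCED = 1 };  /* assumed result codes */

/* Final sink (think local APIC): reports whether the vector was still pending. */
static void demo_sink_deliver(IRQMsg *msg, int already_pending)
{
    int result = already_pending ? DEMO_COALESCED : DEMO_DELIVERED;

    if (msg->delivery_cb) {
        msg->delivery_cb(msg, result);
    }
}

/* Callback installed by the intermediate device: pass the result upstream. */
static void demo_proxy_delivery_cb(IRQMsg *proxy, int result)
{
    IRQMsg *orig = proxy->src_opaque;

    assert(orig != NULL);
    if (orig->delivery_cb) {
        orig->delivery_cb(orig, result);
    }
}

/* Intermediate device (IOAPIC-like): wraps the original message in a proxy
 * that lives on the stack for the duration of the delivery call. */
static void demo_intermediate_forward(IRQMsg *orig, int already_pending)
{
    IRQMsg proxy = {
        .src = NULL,
        .delivery_cb = demo_proxy_delivery_cb,
        .src_opaque = orig,
        .payload = orig->payload,
    };

    demo_sink_deliver(&proxy, already_pending);
}

/* Source side: store the last delivery result where src_opaque points. */
static void demo_source_delivery_cb(IRQMsg *msg, int result)
{
    *(int *)msg->src_opaque = result;
}
```

Because the result travels back along the chain of messages, two devices whose lines end up on the same vector still each get feedback for their own message.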

[Qemu-devel] Re: [PATCH 6/6] apic: avoid using CPUState internals

2010-06-06 Thread Blue Swirl

On Sun, Jun 6, 2010 at 7:36 AM, Jan Kiszka jan.kis...@web.de wrote:
 Blue Swirl wrote:
 Use only an opaque CPUState pointer and move the actual CPUState
 contents handling to cpu.h and cpuid.c.

 Set env->halted in pc.c and add a function to get the local APIC state
 of the current CPU for the MMIO.

 Signed-off-by: Blue Swirl blauwir...@gmail.com
 ---
  hw/apic.c           |   40 +++-
  hw/apic.h           |    9 -
  hw/pc.c             |   12 +++-
  target-i386/cpu.h   |   27 ---
  target-i386/cpuid.c |    6 ++
  5 files changed, 56 insertions(+), 38 deletions(-)

 diff --git a/hw/apic.c b/hw/apic.c
 index 91c8d93..332c66e 100644
 --- a/hw/apic.c
 +++ b/hw/apic.c
 @@ -95,7 +95,7 @@
  #define MSI_ADDR_SIZE                   0x10

  struct APICState {
 -    CPUState *cpu_env;
 +    void *cpu_env;
      uint32_t apicbase;
      uint8_t id;
      uint8_t arb_id;
 @@ -320,7 +320,7 @@ void cpu_set_apic_base(APICState *s, uint64_t val)
      /* if disabled, cannot be enabled again */
      if (!(val & MSR_IA32_APICBASE_ENABLE)) {
          s->apicbase &= ~MSR_IA32_APICBASE_ENABLE;
 -        s->cpu_env->cpuid_features &= ~CPUID_APIC;
 +        cpu_clear_apic_feature(s->cpu_env);
          s->spurious_vec &= ~APIC_SV_ENABLE;
      }
  }
 @@ -508,8 +508,6 @@ void apic_init_reset(APICState *s)
      s->initial_count_load_time = 0;
      s->next_time = 0;
      s->wait_for_sipi = 1;
 -
 -    s->cpu_env->halted = !(s->apicbase & MSR_IA32_APICBASE_BSP);

 We are now lacking 'halted' initialization after system reset. Could be
 addressed by a special reset handler in hw/pc.c, I guess.

Good catch, I forgot to do that.

Re: [Qemu-devel] Re: [RFT][PATCH 07/15] qemu_irq: Add IRQ handlers with delivery feedback

2010-06-06 Thread Gleb Natapov

On Sun, Jun 06, 2010 at 09:39:04AM +0200, Jan Kiszka wrote:
 Gleb Natapov wrote:
  On Sat, Jun 05, 2010 at 02:04:01AM +0200, Jan Kiszka wrote:
  I'd like to also support EOI handling. When the guest clears the
  interrupt condition, the EOI callback would be called. This could occur
  much later than the IRQ delivery time. I'm not sure if we need the
  result code in that case.
 
  If any intermediate device (IOAPIC?) needs to be informed about either
  delivery or EOI also, it could create a proxy message with its
  callbacks in place. But we need then a separate opaque field (in
  addition to payload) to store the original message.
 
  struct IRQMsg {
   DeviceState *src;
   void (*delivery_cb)(IRQMsg *msg, int result);
   void (*eoi_cb)(IRQMsg *msg, int result);
   void *src_opaque;
   void *payload;
  };
  Extending the lifetime of IRQMsg objects beyond the delivery call stack
  means qemu_malloc/free for every delivery. I think it takes a _very_
  appealing reason to justify this. But so far I do not see any use case
  for eoi_cb at all.
 
  I dislike use of eoi for reinjecting missing interrupts since
  it eliminates use of internal PIC/APIC queue of not yet delivered
  interrupts. PIC and APIC have an internal queue that can handle two elements:
  one is delivered, but not yet acked interrupt in isr and another is
  pending interrupt in irr. Using eoi callback (or ack notifier as it's
  called inside kernel) interrupt will be considered coalesced even if irr
  is cleared, but no ack was received for previously delivered interrupt.
  But ack notifiers actually have another use: device assignment. There is
  a plan to move device assignment from kernel to userspace and for that
  ack notifiers will have to be extended to userspace too. If so we can
  use them to do irq decoalescing as well. I doubt they should be part
  of IRQMsg though. Why not do what kernel does: have globally registered
  notifier based on irqchip/pin.
 
 I read this twice but I still don't get your plan. Do you like or
 dislike using EOI for de-coalescing? And how should these notifiers work?
 
That's because I confused myself :) I _dislike_ them being used, but
since device assignment requires ack notifiers anyway, maybe it is better
to introduce one mechanism for device assignment + de-coalescing instead
of introducing two different mechanisms. Using ack notifiers should be
easy: RTC registers an ack notifier and keeps track of delivered interrupts.
If the timer triggers after the previous irq was set, but before it was acked,
the coalesced counter is incremented. In the ack notifier callback the coalesced
counter is checked and if it is not zero a new irq is set.
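
The RTC scheme described above can be sketched roughly as follows. DemoRTC and the demo_rtc_* functions are hypothetical names, and irqs_raised merely stands in for actually setting the irq line; how the ack callback gets registered and routed is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical device model state. */
typedef struct DemoRTC {
    bool irq_in_flight;  /* an irq was set and has not been acked yet */
    int coalesced;       /* timer ticks that fired while one was in flight */
    int irqs_raised;     /* stands in for actually raising the irq line */
} DemoRTC;

static void demo_rtc_raise(DemoRTC *s)
{
    assert(!s->irq_in_flight);
    s->irq_in_flight = true;
    s->irqs_raised++;
}

/* Periodic timer callback. */
static void demo_rtc_tick(DemoRTC *s)
{
    if (s->irq_in_flight) {
        s->coalesced++;  /* previous irq not acked yet: count the miss */
        return;
    }
    demo_rtc_raise(s);
}

/* Ack notifier callback: the guest acked the previously delivered irq. */
static void demo_rtc_acked(DemoRTC *s)
{
    s->irq_in_flight = false;
    if (s->coalesced > 0) {
        s->coalesced--;
        demo_rtc_raise(s);  /* reinject one missed tick */
    }
}
```

Three ticks without an ack raise one irq and leave a coalesced count of two; each subsequent ack then reinjects one of the missed interrupts.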

--
Gleb.

Re: [Qemu-devel] Re: [RFT][PATCH 07/15] qemu_irq: Add IRQ handlers with delivery feedback

2010-06-06 Thread Gleb Natapov

On Sun, Jun 06, 2010 at 07:39:49AM +, Blue Swirl wrote:
 On Sun, Jun 6, 2010 at 7:15 AM, Gleb Natapov g...@redhat.com wrote:
  On Sat, Jun 05, 2010 at 02:04:01AM +0200, Jan Kiszka wrote:
   I'd like to also support EOI handling. When the guest clears the
   interrupt condition, the EOI callback would be called. This could occur
   much later than the IRQ delivery time. I'm not sure if we need the
   result code in that case.
  
   If any intermediate device (IOAPIC?) needs to be informed about either
   delivery or EOI also, it could create a proxy message with its
   callbacks in place. But we need then a separate opaque field (in
   addition to payload) to store the original message.
  
   struct IRQMsg {
    DeviceState *src;
    void (*delivery_cb)(IRQMsg *msg, int result);
    void (*eoi_cb)(IRQMsg *msg, int result);
    void *src_opaque;
    void *payload;
   };
 
  Extending the lifetime of IRQMsg objects beyond the delivery call stack
  means qemu_malloc/free for every delivery. I think it takes a _very_
  appealing reason to justify this. But so far I do not see any use case
  for eoi_cb at all.
 
  I dislike use of eoi for reinjecting missing interrupts since
  it eliminates use of internal PIC/APIC queue of not yet delivered
  interrupts. PIC and APIC have an internal queue that can handle two elements:
  one is delivered, but not yet acked interrupt in isr and another is
  pending interrupt in irr. Using eoi callback (or ack notifier as it's
  called inside kernel) interrupt will be considered coalesced even if irr
  is cleared, but no ack was received for previously delivered interrupt.
  But ack notifiers actually have another use: device assignment. There is
  a plan to move device assignment from kernel to userspace and for that
  ack notifiers will have to be extended to userspace too. If so we can
  use them to do irq decoalescing as well. I doubt they should be part
  of IRQMsg though. Why not do what kernel does: have globally registered
  notifier based on irqchip/pin.
 
 Because translation at IOAPIC may be lossy, IRQs from many devices
 pointing to the same vector? With IRQMsg you know where a specific
 message came from. The situation is different inside the kernel: it
 manages both translation and registration, whereas in QEMU we could
 only control registration.
Configuring IOAPIC like that is against x86 architecture. OS will not be
able to map from interrupt vector back to device.

--
Gleb.

Re: [Qemu-devel] Re: [RFT][PATCH 07/15] qemu_irq: Add IRQ handlers with delivery feedback

2010-06-06 Thread Jan Kiszka

Gleb Natapov wrote:
 On Sun, Jun 06, 2010 at 09:39:04AM +0200, Jan Kiszka wrote:
 Gleb Natapov wrote:
 On Sat, Jun 05, 2010 at 02:04:01AM +0200, Jan Kiszka wrote:
 I'd like to also support EOI handling. When the guest clears the
 interrupt condition, the EOI callback would be called. This could occur
 much later than the IRQ delivery time. I'm not sure if we need the
 result code in that case.

 If any intermediate device (IOAPIC?) needs to be informed about either
 delivery or EOI also, it could create a proxy message with its
 callbacks in place. But we need then a separate opaque field (in
 addition to payload) to store the original message.

 struct IRQMsg {
  DeviceState *src;
  void (*delivery_cb)(IRQMsg *msg, int result);
  void (*eoi_cb)(IRQMsg *msg, int result);
  void *src_opaque;
  void *payload;
 };
 Extending the lifetime of IRQMsg objects beyond the delivery call stack
 means qemu_malloc/free for every delivery. I think it takes a _very_
 appealing reason to justify this. But so far I do not see any use case
 for eoi_cb at all.

 I dislike use of eoi for reinjecting missing interrupts since
 it eliminates use of internal PIC/APIC queue of not yet delivered
 interrupts. PIC and APIC have an internal queue that can handle two elements:
 one is delivered, but not yet acked interrupt in isr and another is
 pending interrupt in irr. Using eoi callback (or ack notifier as it's
 called inside kernel) interrupt will be considered coalesced even if irr
 is cleared, but no ack was received for previously delivered interrupt.
 But ack notifiers actually have another use: device assignment. There is
 a plan to move device assignment from kernel to userspace and for that
 ack notifiers will have to be extended to userspace too. If so we can
 use them to do irq decoalescing as well. I doubt they should be part
 of IRQMsg though. Why not do what kernel does: have globally registered
 notifier based on irqchip/pin.
 I read this twice but I still don't get your plan. Do you like or
 dislike using EOI for de-coalescing? And how should these notifiers work?

 That's because I confused myself :) I _dislike_ them being used, but
 since device assignment requires ack notifiers anyway, maybe it is better
 to introduce one mechanism for device assignment + de-coalescing instead
 of introducing two different mechanisms. Using ack notifiers should be
 easy: RTC registers an ack notifier and keeps track of delivered interrupts.
 If the timer triggers after the previous irq was set, but before it was acked,
 the coalesced counter is incremented. In the ack notifier callback the coalesced
 counter is checked and if it is not zero a new irq is set.

Ack notifier registrations and event deliveries still need to be routed.
Piggy-backing this on IRQ messages may be unavoidable for that reason.

Anyway, I'm going to post my HPET updates with the infrastructure for
IRQMsg now. Maybe it's helpful to see the other option in reality.

Jan




[Qemu-devel] [PATCH 04/16] hpet: Move static timer field initialization

2010-06-06 Thread Jan Kiszka

From: Jan Kiszka jan.kis...@siemens.com

Properly initialize HPETTimer::tn and HPETTimer::state once during
hpet_init instead of (re-)writing them on every reset.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
 hw/hpet.c |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/hpet.c b/hw/hpet.c
index bcb160b..fd7a1fd 100644
--- a/hw/hpet.c
+++ b/hw/hpet.c
@@ -575,12 +575,10 @@ static void hpet_reset(void *opaque)
         HPETTimer *timer = &s->timer[i];
 
         hpet_del_timer(timer);
-        timer->tn = i;
         timer->cmp = ~0ULL;
         timer->config =  HPET_TN_PERIODIC_CAP | HPET_TN_SIZE_CAP;
         /* advertise availability of ioapic inti2 */
         timer->config |=  0x0004ULL << 32;
-        timer->state = s;
         timer->period = 0ULL;
         timer->wrap_flag = 0;
     }
@@ -617,6 +615,8 @@ void hpet_init(qemu_irq *irq)
     for (i = 0; i < HPET_NUM_TIMERS; i++) {
         timer = &s->timer[i];
         timer->qemu_timer = qemu_new_timer(vm_clock, hpet_timer, timer);
+        timer->tn = i;
+        timer->state = s;
     }
     vmstate_register(-1, &vmstate_hpet, s);
     qemu_register_reset(hpet_reset, s);
-- 
1.6.0.2

[Qemu-devel] [PATCH 00/16] HPET cleanups, fixes, enhancements

2010-06-06 Thread Jan Kiszka

Second round, specifically addressing:
 - IRQMsg framework to refactor existing de-coalescing code
 - RTC IRQ output as GPIO pin (routed depending on HPET or -no-hpet)
 - ISA reservation for RTC IRQ

If discussion around IRQMsg and de-coalescing happens to continue, I
would suggest to merge patches 1..7 as they are likely uncontroversial
and also fix bugs.

Jan Kiszka (16):
  hpet: Catch out-of-bounds timer access
  hpet: Coding style cleanups and some refactorings
  hpet: Silence warning on write to running main counter
  hpet: Move static timer field initialization
  hpet: Convert to qdev
  hpet: Start/stop timer when HPET_TN_ENABLE is modified
  monitor/QMP: Drop info hpet / query-hpet
  Pass IRQ object on handler invocation
  Enable message delivery via IRQs
  x86: Refactor RTC IRQ coalescing workaround
  hpet/rtc: Rework RTC IRQ replacement by HPET
  hpet: Drop static state
  hpet: Add support for level-triggered interrupts
  vmstate: Add VMSTATE_STRUCT_VARRAY_UINT8
  hpet: Make number of timers configurable
  hpet: Add MSI support

 QMP/vm-info |2 +-
 hw/acpi_piix4.c |3 +-
 hw/apic.c   |   66 +++---
 hw/apic.h   |   11 +-
 hw/arm11mpcore.c|   12 +-
 hw/arm_gic.c|   18 +-
 hw/arm_pic.c|6 +-
 hw/arm_timer.c  |4 +-
 hw/bitbang_i2c.c|4 +-
 hw/bt-hci-csr.c |2 +-
 hw/cbus.c   |6 +-
 hw/cris_pic_cpu.c   |4 +-
 hw/esp.c|2 +-
 hw/etraxfs_pic.c|   16 +-
 hw/fdc.c|2 +-
 hw/heathrow_pic.c   |3 +-
 hw/hpet.c   |  595 ++-
 hw/hpet_emul.h  |   46 +---
 hw/hw.h |   10 +
 hw/i8259.c  |   28 ++-
 hw/ide/cmd646.c |2 +-
 hw/ide/microdrive.c |2 +-
 hw/integratorcp.c   |   10 +-
 hw/ioapic.c |   22 ++-
 hw/irq.c|   48 -
 hw/irq.h|   42 +++-
 hw/lance.c  |2 +-
 hw/max7310.c|2 +-
 hw/mc146818rtc.c|  111 +-
 hw/mc146818rtc.h|4 +-
 hw/mcf5206.c|6 +-
 hw/mcf_intc.c   |   14 +-
 hw/microblaze_pic_cpu.c |5 +-
 hw/mips_int.c   |   10 +-
 hw/mips_jazz.c  |4 +-
 hw/mips_malta.c |4 +-
 hw/mips_r4k.c   |2 +-
 hw/mst_fpga.c   |   10 +-
 hw/musicpal.c   |   16 +-
 hw/nseries.c|4 +-
 hw/omap.h   |2 +-
 hw/omap1.c  |   34 ++--
 hw/omap2.c  |8 +-
 hw/omap_dma.c   |8 +-
 hw/omap_mmc.c   |2 +-
 hw/openpic.c|6 +-
 hw/palm.c   |2 +-
 hw/pc.c |   59 --
 hw/pc.h |8 +-
 hw/pci.c|4 +-
 hw/pl061.c  |4 +-
 hw/pl190.c  |6 +-
 hw/ppc.c|8 +-
 hw/ppc4xx_devs.c|2 +-
 hw/ppc_prep.c   |4 +-
 hw/pxa2xx.c |2 +-
 hw/pxa2xx_gpio.c|2 +-
 hw/pxa2xx_pcmcia.c  |3 +-
 hw/pxa2xx_pic.c |   10 +-
 hw/r2d.c|2 +-
 hw/rc4030.c |7 +-
 hw/sbi.c|2 +-
 hw/sh_intc.c|4 +-
 hw/sh_intc.h|2 +-
 hw/sharpsl.h|1 -
 hw/slavio_intctl.c  |   16 +-
 hw/slavio_misc.c|3 +-
 hw/sparc32_dma.c|2 +-
 hw/spitz.c  |   14 +-
 hw/ssd0323.c|2 +-
 hw/stellaris.c  |6 +-
 hw/sun4c_intctl.c   |8 +-
 hw/sun4m.c  |   14 +-
 hw/sun4u.c  |   12 +-
 hw/syborg_interrupt.c   |8 +-
 hw/tc6393xb.c   |7 +-
 hw/tosa.c   |2 +-
 hw/tusb6010.c   |3 +-
 hw/twl92230.c   |5 +-
 hw/versatilepb.c|   10 +-
 hw/xilinx_intc.c|8 +-
 hw/zaurus.c |2 +-
 monitor.c   |   22 --
 qemu-monitor.hx |   21 --
 84 files changed, 874 insertions(+), 643 deletions(-)

[Qemu-devel] [PATCH 02/16] hpet: Coding style cleanups and some refactorings

2010-06-06 Thread Jan Kiszka

From: Jan Kiszka jan.kis...@siemens.com

This moves the private HPET structures into the C module, simplifies
some helper functions and fixes most coding style issues (biggest chunk
was improper switch-case indentation). No functional changes.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
Reviewed-by: Juan Quintela quint...@redhat.com
---
 hw/hpet.c  |  413 ++-
 hw/hpet_emul.h |   31 +
 2 files changed, 226 insertions(+), 218 deletions(-)

diff --git a/hw/hpet.c b/hw/hpet.c
index 1980906..2836fb0 100644
--- a/hw/hpet.c
+++ b/hw/hpet.c
@@ -37,21 +37,47 @@
 #define DPRINTF(...)
 #endif
 
+struct HPETState;
+typedef struct HPETTimer {  /* timers */
+    uint8_t tn;             /*timer number*/
+    QEMUTimer *qemu_timer;
+    struct HPETState *state;
+    /* Memory-mapped, software visible timer registers */
+    uint64_t config;        /* configuration/cap */
+    uint64_t cmp;           /* comparator */
+    uint64_t fsb;           /* FSB route, not supported now */
+    /* Hidden register state */
+    uint64_t period;        /* Last value written to comparator */
+    uint8_t wrap_flag;      /* timer pop will indicate wrap for one-shot 32-bit
+                             * mode. Next pop will be actual timer expiration.
+                             */
+} HPETTimer;
+
+typedef struct HPETState {
+    uint64_t hpet_offset;
+    qemu_irq *irqs;
+    HPETTimer timer[HPET_NUM_TIMERS];
+
+    /* Memory-mapped, software visible registers */
+    uint64_t capability;        /* capabilities */
+    uint64_t config;            /* configuration */
+    uint64_t isr;               /* interrupt status reg */
+    uint64_t hpet_counter;      /* main counter */
+} HPETState;
+
 static HPETState *hpet_statep;
 
 uint32_t hpet_in_legacy_mode(void)
 {
-    if (hpet_statep)
-        return hpet_statep->config & HPET_CFG_LEGACY;
-    else
+    if (!hpet_statep) {
         return 0;
+    }
+    return hpet_statep->config & HPET_CFG_LEGACY;
 }
 
 static uint32_t timer_int_route(struct HPETTimer *timer)
 {
-    uint32_t route;
-    route = (timer->config & HPET_TN_INT_ROUTE_MASK) >> HPET_TN_INT_ROUTE_SHIFT;
-    return route;
+    return (timer->config & HPET_TN_INT_ROUTE_MASK) >> HPET_TN_INT_ROUTE_SHIFT;
 }
 
 static uint32_t hpet_enabled(void)
@@ -108,9 +134,7 @@ static int deactivating_bit(uint64_t old, uint64_t new, uint64_t mask)
 
 static uint64_t hpet_get_ticks(void)
 {
-    uint64_t ticks;
-    ticks = ns_to_ticks(qemu_get_clock(vm_clock) + hpet_statep->hpet_offset);
-    return ticks;
+    return ns_to_ticks(qemu_get_clock(vm_clock) + hpet_statep->hpet_offset);
 }
 
 /*
@@ -121,12 +145,14 @@ static inline uint64_t hpet_calculate_diff(HPETTimer *t, uint64_t current)
 
     if (t->config & HPET_TN_32BIT) {
         uint32_t diff, cmp;
+
         cmp = (uint32_t)t->cmp;
         diff = cmp - (uint32_t)current;
         diff = (int32_t)diff > 0 ? diff : (uint32_t)0;
         return (uint64_t)diff;
     } else {
         uint64_t diff, cmp;
+
         cmp = t->cmp;
         diff = cmp - current;
         diff = (int64_t)diff > 0 ? diff : (uint64_t)0;
@@ -136,7 +162,6 @@ static inline uint64_t hpet_calculate_diff(HPETTimer *t, uint64_t current)
 
 static void update_irq(struct HPETTimer *timer)
 {
-    qemu_irq irq;
     int route;
 
     if (timer->tn <= 1 && hpet_in_legacy_mode()) {
@@ -144,22 +169,20 @@ static void update_irq(struct HPETTimer *timer)
          * timer0 be routed to IRQ0 in NON-APIC or IRQ2 in the I/O APIC,
          * timer1 be routed to IRQ8 in NON-APIC or IRQ8 in the I/O APIC.
          */
-        if (timer->tn == 0) {
-            irq=timer->state->irqs[0];
-        } else
-            irq=timer->state->irqs[8];
+        route = (timer->tn == 0) ? 0 : 8;
     } else {
-        route=timer_int_route(timer);
-        irq=timer->state->irqs[route];
+        route = timer_int_route(timer);
     }
-    if (timer_enabled(timer) && hpet_enabled()) {
-        qemu_irq_pulse(irq);
+    if (!timer_enabled(timer) || !hpet_enabled()) {
+        return;
     }
+    qemu_irq_pulse(timer->state->irqs[route]);
 }
 
 static void hpet_pre_save(void *opaque)
 {
     HPETState *s = opaque;
+
     /* save current counter value */
     s->hpet_counter = hpet_get_ticks();
 }
@@ -212,7 +235,7 @@ static const VMStateDescription vmstate_hpet = {
  */
 static void hpet_timer(void *opaque)
 {
-    HPETTimer *t = (HPETTimer*)opaque;
+    HPETTimer *t = opaque;
     uint64_t diff;
 
     uint64_t period = t->period;
@@ -220,20 +243,22 @@ static void hpet_timer(void *opaque)
 
     if (timer_is_periodic(t) && period != 0) {
         if (t->config & HPET_TN_32BIT) {
-            while (hpet_time_after(cur_tick, t->cmp))
+            while (hpet_time_after(cur_tick, t->cmp)) {
                 t->cmp = (uint32_t)(t->cmp + t->period);
-        } else
-            while (hpet_time_after64(cur_tick, t->cmp))
+            }
+        } else {
+            while (hpet_time_after64(cur_tick, t->cmp)) {
[Qemu-devel] [PATCH 05/16] hpet: Convert to qdev

2010-06-06 Thread Jan Kiszka

From: Jan Kiszka jan.kis...@siemens.com

Register the HPET as a sysbus device and create it that way. As it can
route its IRQs to any ISA IRQ, we need to connect it to all 24 of them.
Once converted to qdev, we can move reset handler and vmstate
registration into its hands as well.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
 hw/hpet.c  |   43 ++-
 hw/hpet_emul.h |3 ++-
 hw/pc.c|7 ++-
 3 files changed, 38 insertions(+), 15 deletions(-)

diff --git a/hw/hpet.c b/hw/hpet.c
index fd7a1fd..6974935 100644
--- a/hw/hpet.c
+++ b/hw/hpet.c
@@ -29,6 +29,7 @@
 #include "console.h"
 #include "qemu-timer.h"
 #include "hpet_emul.h"
+#include "sysbus.h"
 
 //#define HPET_DEBUG
 #ifdef HPET_DEBUG
@@ -54,8 +55,9 @@ typedef struct HPETTimer {  /* timers */
 } HPETTimer;
 
 typedef struct HPETState {
+    SysBusDevice busdev;
     uint64_t hpet_offset;
-    qemu_irq *irqs;
+    qemu_irq irqs[HPET_NUM_IRQ_ROUTES];
     HPETTimer timer[HPET_NUM_TIMERS];
 
     /* Memory-mapped, software visible registers */
@@ -565,9 +567,9 @@ static CPUWriteMemoryFunc * const hpet_ram_write[] = {
     hpet_ram_writel,
 };
 
-static void hpet_reset(void *opaque)
+static void hpet_reset(DeviceState *d)
 {
-    HPETState *s = opaque;
+    HPETState *s = FROM_SYSBUS(HPETState, sysbus_from_qdev(d));
     int i;
     static int count = 0;
 
@@ -600,28 +602,43 @@ static void hpet_reset(void *opaque)
     count = 1;
 }
 
-
-void hpet_init(qemu_irq *irq)
+static int hpet_init(SysBusDevice *dev)
 {
+    HPETState *s = FROM_SYSBUS(HPETState, dev);
     int i, iomemtype;
     HPETTimer *timer;
-    HPETState *s;
-
-    DPRINTF ("hpet_init\n");
 
-    s = qemu_mallocz(sizeof(HPETState));
+    assert(!hpet_statep);
     hpet_statep = s;
-    s->irqs = irq;
+    for (i = 0; i < HPET_NUM_IRQ_ROUTES; i++) {
+        sysbus_init_irq(dev, &s->irqs[i]);
+    }
     for (i = 0; i < HPET_NUM_TIMERS; i++) {
         timer = &s->timer[i];
         timer->qemu_timer = qemu_new_timer(vm_clock, hpet_timer, timer);
         timer->tn = i;
         timer->state = s;
     }
-    vmstate_register(-1, &vmstate_hpet, s);
-    qemu_register_reset(hpet_reset, s);
+
     /* HPET Area */
     iomemtype = cpu_register_io_memory(hpet_ram_read,
                                        hpet_ram_write, s);
-    cpu_register_physical_memory(HPET_BASE, 0x400, iomemtype);
+    sysbus_init_mmio(dev, 0x400, iomemtype);
+    return 0;
 }
+
+static SysBusDeviceInfo hpet_device_info = {
+    .qdev.name    = "hpet",
+    .qdev.size    = sizeof(HPETState),
+    .qdev.no_user = 1,
+    .qdev.vmsd    = &vmstate_hpet,
+    .qdev.reset   = hpet_reset,
+    .init         = hpet_init,
+};
+
+static void hpet_register_device(void)
+{
+    sysbus_register_withprop(&hpet_device_info);
+}
+
+device_init(hpet_register_device)
diff --git a/hw/hpet_emul.h b/hw/hpet_emul.h
index 2f5f8ba..785f850 100644
--- a/hw/hpet_emul.h
+++ b/hw/hpet_emul.h
@@ -19,6 +19,8 @@
 #define FS_PER_NS 100
 #define HPET_NUM_TIMERS 3
 
+#define HPET_NUM_IRQ_ROUTES 32
+
 #define HPET_CFG_ENABLE 0x001
 #define HPET_CFG_LEGACY 0x002
 
@@ -47,7 +49,6 @@
 
 #if defined TARGET_I386
 extern uint32_t hpet_in_legacy_mode(void);
-extern void hpet_init(qemu_irq *irq);
 #endif
 
 #endif
diff --git a/hw/pc.c b/hw/pc.c
index 9b85c42..ae31e2e 100644
--- a/hw/pc.c
+++ b/hw/pc.c
@@ -35,6 +35,7 @@
 #include "elf.h"
 #include "multiboot.h"
 #include "mc146818rtc.h"
+#include "sysbus.h"
 
 /* output Bochs bios info messages */
 //#define DEBUG_BIOS
@@ -957,7 +958,11 @@ void pc_basic_device_init(qemu_irq *isa_irq,
 pit = pit_init(0x40, isa_reserve_irq(0));
 pcspk_init(pit);
 if (!no_hpet) {
-hpet_init(isa_irq);
+DeviceState *hpet = sysbus_create_simple("hpet", HPET_BASE, NULL);
+
+for (i = 0; i < 24; i++) {
+sysbus_connect_irq(sysbus_from_qdev(hpet), i, isa_irq[i]);
+}
 }
 
 for(i = 0; i < MAX_SERIAL_PORTS; i++) {
-- 
1.6.0.2

[Qemu-devel] [PATCH 10/16] x86: Refactor RTC IRQ coalescing workaround

2010-06-06 Thread Jan Kiszka

From: Jan Kiszka jan.kis...@siemens.com

Make use of the new IRQ message and report delivery results from the
sink to the source. As a by-product, this also adds de-coalescing
support to the PIC.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
 hw/apic.c|   64 +++--
 hw/apic.h|9 ++
 hw/i8259.c   |   16 ++-
 hw/ioapic.c  |   20 ++---
 hw/mc146818rtc.c |   83 ++---
 hw/pc.c  |   29 --
 6 files changed, 141 insertions(+), 80 deletions(-)

diff --git a/hw/apic.c b/hw/apic.c
index 7fbd79b..f9587d1 100644
--- a/hw/apic.c
+++ b/hw/apic.c
@@ -123,10 +123,8 @@ typedef struct APICState {
 static int apic_io_memory;
 static APICState *local_apics[MAX_APICS + 1];
 static int last_apic_idx = 0;
-static int apic_irq_delivered;
 
-
-static void apic_set_irq(APICState *s, int vector_num, int trigger_mode);
+static int apic_set_irq(APICState *s, int vector_num, int trigger_mode);
 static void apic_update_irq(APICState *s);
 static void apic_get_delivery_bitmask(uint32_t *deliver_bitmask,
   uint8_t dest, uint8_t dest_mode);
@@ -239,12 +237,12 @@ void apic_deliver_pic_intr(CPUState *env, int level)
 }\
 }
 
-static void apic_bus_deliver(const uint32_t *deliver_bitmask,
- uint8_t delivery_mode,
- uint8_t vector_num, uint8_t polarity,
- uint8_t trigger_mode)
+static int apic_bus_deliver(const uint32_t *deliver_bitmask,
+uint8_t delivery_mode, uint8_t vector_num,
+uint8_t polarity, uint8_t trigger_mode)
 {
 APICState *apic_iter;
+int ret;
 
 switch (delivery_mode) {
 case APIC_DM_LOWPRI:
@@ -261,11 +259,12 @@ static void apic_bus_deliver(const uint32_t *deliver_bitmask,
 if (d >= 0) {
 apic_iter = local_apics[d];
 if (apic_iter) {
-apic_set_irq(apic_iter, vector_num, trigger_mode);
+return apic_set_irq(apic_iter, vector_num,
+trigger_mode);
 }
 }
 }
-return;
+return QEMU_IRQ_MASKED;
 
 case APIC_DM_FIXED:
 break;
@@ -273,34 +272,42 @@ static void apic_bus_deliver(const uint32_t *deliver_bitmask,
 case APIC_DM_SMI:
 foreach_apic(apic_iter, deliver_bitmask,
 cpu_interrupt(apic_iter->cpu_env, CPU_INTERRUPT_SMI) );
-return;
+return QEMU_IRQ_DELIVERED;
 
 case APIC_DM_NMI:
 foreach_apic(apic_iter, deliver_bitmask,
 cpu_interrupt(apic_iter->cpu_env, CPU_INTERRUPT_NMI) );
-return;
+return QEMU_IRQ_DELIVERED;
 
 case APIC_DM_INIT:
 /* normal INIT IPI sent to processors */
 foreach_apic(apic_iter, deliver_bitmask,
 cpu_interrupt(apic_iter->cpu_env, CPU_INTERRUPT_INIT) );
-return;
+return QEMU_IRQ_DELIVERED;
 
 case APIC_DM_EXTINT:
 /* handled in I/O APIC code */
 break;
 
 default:
-return;
+return QEMU_IRQ_MASKED;
 }
 
+ret = QEMU_IRQ_MASKED;
 foreach_apic(apic_iter, deliver_bitmask,
- apic_set_irq(apic_iter, vector_num, trigger_mode) );
+if (ret == QEMU_IRQ_MASKED)
+ret = QEMU_IRQ_COALESCED;
+if (apic_set_irq(apic_iter, vector_num,
+ trigger_mode) == QEMU_IRQ_DELIVERED) {
+ret = QEMU_IRQ_DELIVERED;
+}
+);
+return ret;
 }
 
-void apic_deliver_irq(uint8_t dest, uint8_t dest_mode,
-  uint8_t delivery_mode, uint8_t vector_num,
-  uint8_t polarity, uint8_t trigger_mode)
+int apic_deliver_irq(uint8_t dest, uint8_t dest_mode,
+ uint8_t delivery_mode, uint8_t vector_num,
+ uint8_t polarity, uint8_t trigger_mode)
 {
 uint32_t deliver_bitmask[MAX_APIC_WORDS];
 
@@ -308,8 +315,8 @@ void apic_deliver_irq(uint8_t dest, uint8_t dest_mode,
 " polarity %d trigger_mode %d\n", __func__, dest, dest_mode,
 delivery_mode, vector_num, polarity, trigger_mode);
 apic_get_delivery_bitmask(deliver_bitmask, dest, dest_mode);
-apic_bus_deliver(deliver_bitmask, delivery_mode, vector_num, polarity,
- trigger_mode);
+return apic_bus_deliver(deliver_bitmask, delivery_mode, vector_num,
+polarity, trigger_mode);
 }
 
 void cpu_set_apic_base(CPUState *env, uint64_t val)
@@ -402,22 +409,10 @@ static void apic_update_irq(APICState *s)
 cpu_interrupt(s->cpu_env, CPU_INTERRUPT_HARD);
 }
 
-void apic_reset_irq_delivered(void)
-{
-DPRINTF_C(%s: old
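
The way apic_bus_deliver() folds the per-APIC results into one return value
in the hunk above can be sketched in isolation. This is only an illustration
under stated assumptions: combine_delivery_results() is a hypothetical helper
that does not exist in the patch, and the array stands in for the return
values of apic_set_irq() on each APIC in the delivery bitmask. The result
codes are the ones introduced in hw/irq.h by patch 09/16.

```c
#define QEMU_IRQ_DELIVERED  0
#define QEMU_IRQ_COALESCED  (-1)
#define QEMU_IRQ_MASKED     (-2)

/*
 * Hypothetical helper mirroring the loop at the end of apic_bus_deliver():
 * - no destination at all            -> QEMU_IRQ_MASKED
 * - at least one destination, none
 *   of which reported DELIVERED      -> QEMU_IRQ_COALESCED
 * - any destination DELIVERED        -> QEMU_IRQ_DELIVERED
 */
int combine_delivery_results(const int *results, int count)
{
    int ret = QEMU_IRQ_MASKED;
    int i;

    for (i = 0; i < count; i++) {
        /* Having any destination at all rules out "masked". */
        if (ret == QEMU_IRQ_MASKED) {
            ret = QEMU_IRQ_COALESCED;
        }
        /* One successful delivery is enough for the whole message. */
        if (results[i] == QEMU_IRQ_DELIVERED) {
            ret = QEMU_IRQ_DELIVERED;
        }
    }
    return ret;
}
```

The effect is that a source only sees "coalesced" when every targeted APIC
failed to take the interrupt, which seems to be the conservative choice for
the RTC workaround.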

[Qemu-devel] [PATCH 09/16] Enable message delivery via IRQs

2010-06-06 Thread Jan Kiszka

From: Jan Kiszka jan.kis...@siemens.com

This patch allows optionally attaching a message to an IRQ event. The
message can contain a payload reference and a callback that the IRQ
handler may invoke to report the delivery result. The former can be used
to model message-signaled interrupts, the latter to cleanly implement
IRQ de-coalescing logic.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
 hw/irq.c |   37 -
 hw/irq.h |   38 +-
 2 files changed, 73 insertions(+), 2 deletions(-)

diff --git a/hw/irq.c b/hw/irq.c
index 24fb09d..db5136b 100644
--- a/hw/irq.c
+++ b/hw/irq.c
@@ -28,12 +28,31 @@ struct IRQState {
 qemu_irq_handler handler;
 void *opaque;
 int n;
+IRQMsg *msg;
 };
 
-void qemu_set_irq(qemu_irq irq, int level)
+void qemu_set_irq_msg(qemu_irq irq, int level, IRQMsg *msg)
 {
 if (irq) {
+irq->msg = msg;
 irq->handler(irq, irq->opaque, irq->n, level);
+irq->msg = NULL;
+}
+}
+
+void *qemu_irq_get_payload(qemu_irq irq)
+{
+IRQMsg *msg = irq->msg;
+
+return msg ? msg->payload : NULL;
+}
+
+void qemu_irq_fire_delivery_cb(qemu_irq irq, int level, int result)
+{
+IRQMsg *msg = irq->msg;
+
+if (msg && msg->delivery_cb) {
+msg->delivery_cb(irq, msg->delivery_opaque, irq->n, level, result);
 }
 }
 
@@ -61,11 +80,27 @@ void qemu_free_irqs(qemu_irq *s)
 qemu_free(s);
 }
 
+static void qemu_notirq_delivery_cb(qemu_irq irq, void *opaque, int line,
+int level, int result)
+{
+qemu_irq orig_irq = opaque;
+
+qemu_irq_fire_delivery_cb(orig_irq, !level, result);
+}
+
 static void qemu_notirq(qemu_irq irq, void *opaque, int line, int level)
 {
 struct IRQState *inv_irq = opaque;
+IRQMsg msg;
 
+if (irq->msg) {
+msg.delivery_cb = qemu_notirq_delivery_cb;
+msg.delivery_opaque = irq;
+msg.payload = irq->msg->payload;
+inv_irq->msg = &msg;
+}
 inv_irq->handler(inv_irq, inv_irq->opaque, inv_irq->n, !level);
+inv_irq->msg = NULL;
 }
 
 qemu_irq qemu_irq_invert(qemu_irq irq)
diff --git a/hw/irq.h b/hw/irq.h
index d0f83e3..01f96af 100644
--- a/hw/irq.h
+++ b/hw/irq.h
@@ -3,26 +3,62 @@
 
 /* Generic IRQ/GPIO pin infrastructure.  */
 
+#define QEMU_IRQ_DELIVERED  0
+#define QEMU_IRQ_COALESCED  (-1)
+#define QEMU_IRQ_MASKED (-2)
+
+typedef void (*qemu_irq_delivery_cb)(qemu_irq irq, void *opaque, int n,
+ int level, int result);
 typedef void (*qemu_irq_handler)(qemu_irq irq, void *opaque, int n, int level);
 
-void qemu_set_irq(qemu_irq irq, int level);
+typedef struct IRQMsg {
+qemu_irq_delivery_cb delivery_cb;
+void *delivery_opaque;
+void *payload;
+} IRQMsg;
+
+void qemu_set_irq_msg(qemu_irq irq, int level, IRQMsg *msg);
+
+static inline void qemu_set_irq(qemu_irq irq, int level)
+{
+qemu_set_irq_msg(irq, level, NULL);
+}
 
 static inline void qemu_irq_raise(qemu_irq irq)
 {
 qemu_set_irq(irq, 1);
 }
 
+static inline void qemu_irq_raise_msg(qemu_irq irq, IRQMsg *msg)
+{
+qemu_set_irq_msg(irq, 1, msg);
+}
+
 static inline void qemu_irq_lower(qemu_irq irq)
 {
 qemu_set_irq(irq, 0);
 }
 
+static inline void qemu_irq_lower_msg(qemu_irq irq, IRQMsg *msg)
+{
+qemu_set_irq_msg(irq, 0, msg);
+}
+
 static inline void qemu_irq_pulse(qemu_irq irq)
 {
 qemu_set_irq(irq, 1);
 qemu_set_irq(irq, 0);
 }
 
+static inline void qemu_irq_pulse_msg(qemu_irq irq, IRQMsg *msg)
+{
+qemu_set_irq_msg(irq, 1, msg);
+qemu_set_irq_msg(irq, 0, msg);
+}
+
+void qemu_irq_fire_delivery_cb(qemu_irq irq, int level, int result);
+void *qemu_irq_get_payload(qemu_irq irq);
+
 /* Returns an array of N IRQs.  */
 qemu_irq *qemu_allocate_irqs(qemu_irq_handler handler, void *opaque, int n);
 void qemu_free_irqs(qemu_irq *s);
-- 
1.6.0.2
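
The feedback path added by this patch can be tried outside QEMU with a small
stand-alone model. The sketch below copies the IRQMsg layout, the result
codes, and the two core functions from the diff above. Everything prefixed
with Toy/toy_ is hypothetical and exists only for illustration: it assumes a
sink with a single pending slot, so a second raise before an acknowledge is
reported back to the source as coalesced.

```c
#include <stddef.h>

/* Result codes as defined in the patch's hw/irq.h. */
#define QEMU_IRQ_DELIVERED  0
#define QEMU_IRQ_COALESCED  (-1)
#define QEMU_IRQ_MASKED     (-2)

typedef struct IRQState *qemu_irq;
typedef void (*qemu_irq_delivery_cb)(qemu_irq irq, void *opaque, int n,
                                     int level, int result);
typedef void (*qemu_irq_handler)(qemu_irq irq, void *opaque, int n, int level);

typedef struct IRQMsg {
    qemu_irq_delivery_cb delivery_cb;
    void *delivery_opaque;
    void *payload;
} IRQMsg;

struct IRQState {
    qemu_irq_handler handler;
    void *opaque;
    int n;
    IRQMsg *msg;   /* only valid while the handler is running */
};

void qemu_set_irq_msg(qemu_irq irq, int level, IRQMsg *msg)
{
    if (irq) {
        irq->msg = msg;
        irq->handler(irq, irq->opaque, irq->n, level);
        irq->msg = NULL;
    }
}

void qemu_irq_fire_delivery_cb(qemu_irq irq, int level, int result)
{
    IRQMsg *msg = irq->msg;

    if (msg && msg->delivery_cb) {
        msg->delivery_cb(irq, msg->delivery_opaque, irq->n, level, result);
    }
}

/* Hypothetical sink: one pending slot, no acknowledge modeled. */
typedef struct ToySink { int pending; } ToySink;

static void toy_sink_handler(qemu_irq irq, void *opaque, int n, int level)
{
    ToySink *sink = opaque;
    int result;

    (void)n;
    if (!level) {
        return;
    }
    if (sink->pending) {
        result = QEMU_IRQ_COALESCED;
    } else {
        sink->pending = 1;
        result = QEMU_IRQ_DELIVERED;
    }
    qemu_irq_fire_delivery_cb(irq, level, result);
}

/* Hypothetical source: counts how many of its events were coalesced. */
typedef struct ToySource { int coalesced; } ToySource;

static void toy_source_delivery_cb(qemu_irq irq, void *opaque, int n,
                                   int level, int result)
{
    ToySource *src = opaque;

    (void)irq; (void)n; (void)level;
    if (result == QEMU_IRQ_COALESCED) {
        src->coalesced++;
    }
}

/* Raise the line twice without an acknowledge; return the coalesced count. */
int toy_demo(void)
{
    ToySink sink = { 0 };
    ToySource src = { 0 };
    struct IRQState line = { toy_sink_handler, &sink, 0, NULL };
    IRQMsg msg = { toy_source_delivery_cb, &src, NULL };

    qemu_set_irq_msg(&line, 1, &msg);
    qemu_set_irq_msg(&line, 1, &msg);
    return src.coalesced;
}
```

Because the message pointer is only stored for the duration of the handler
call, the IRQMsg can live on the caller's stack, which matches the design
argument made earlier in the thread against heap-allocated messages.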

[Qemu-devel] [PATCH 06/16] hpet: Start/stop timer when HPET_TN_ENABLE is modified

2010-06-06 Thread Jan Kiszka

From: Jan Kiszka jan.kis...@siemens.com

We have to update the qemu timer when the per-timer enable bit is
toggled, just like for HPET_CFG_ENABLE changes.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
 hw/hpet.c |5 +
 1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/hw/hpet.c b/hw/hpet.c
index 6974935..041dd84 100644
--- a/hw/hpet.c
+++ b/hw/hpet.c
@@ -430,6 +430,11 @@ static void hpet_ram_writel(void *opaque, target_phys_addr_t addr,
 printf("qemu: level-triggered hpet not supported\n");
 exit (-1);
 }
+if (activating_bit(old_val, new_val, HPET_TN_ENABLE)) {
+hpet_set_timer(timer);
+} else if (deactivating_bit(old_val, new_val, HPET_TN_ENABLE)) {
+hpet_del_timer(timer);
+}
 break;
 case HPET_TN_CFG + 4: // Interrupt capabilities
 DPRINTF("qemu: invalid HPET_TN_CFG+4 write\n");
-- 
1.6.0.2
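
The edge detection this patch relies on can be sketched stand-alone. The body
of deactivating_bit() matches what hw/hpet.c shows elsewhere in this series;
activating_bit() is written here as its mirror image, which is an assumption.
The value of HPET_TN_ENABLE (bit 2 of the timer configuration register) is
likewise assumed, and toy_enable_transition() with its enum is a hypothetical
wrapper for illustration only.

```c
#include <stdint.h>

#define HPET_TN_ENABLE 0x004   /* assumption: bit 2 of Tn config */

/* True when the masked bit goes from 0 to 1. */
static int activating_bit(uint64_t old, uint64_t new_val, uint64_t mask)
{
    return !(old & mask) && (new_val & mask);
}

/* True when the masked bit goes from 1 to 0. */
static int deactivating_bit(uint64_t old, uint64_t new_val, uint64_t mask)
{
    return (old & mask) && !(new_val & mask);
}

enum toy_timer_action { TOY_TIMER_KEEP, TOY_TIMER_START, TOY_TIMER_STOP };

/* Hypothetical wrapper: which timer action a config write should trigger. */
enum toy_timer_action toy_enable_transition(uint64_t old_val, uint64_t new_val)
{
    if (activating_bit(old_val, new_val, HPET_TN_ENABLE)) {
        return TOY_TIMER_START;   /* the patch calls hpet_set_timer() here */
    } else if (deactivating_bit(old_val, new_val, HPET_TN_ENABLE)) {
        return TOY_TIMER_STOP;    /* the patch calls hpet_del_timer() here */
    }
    return TOY_TIMER_KEEP;        /* bit unchanged: leave the timer alone */
}
```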

[Qemu-devel] [PATCH 07/16] monitor/QMP: Drop info hpet / query-hpet

2010-06-06 Thread Jan Kiszka

From: Jan Kiszka jan.kis...@siemens.com

This command was of minimal use before; now it is useless, as the HPET
has become a qdev device and is thus easily discoverable. We should
definitely not set query-hpet in stone in QMP, and there is also no good
reason to keep it for the interactive monitor.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
 QMP/vm-info |2 +-
 monitor.c   |   22 --
 qemu-monitor.hx |   21 -
 3 files changed, 1 insertions(+), 44 deletions(-)

diff --git a/QMP/vm-info b/QMP/vm-info
index b150d82..8ebaeb3 100755
--- a/QMP/vm-info
+++ b/QMP/vm-info
@@ -25,7 +25,7 @@ def main():
 qemu = qmp.QEMUMonitorProtocol(argv[1])
 qemu.connect()
 
-for cmd in [ 'version', 'hpet', 'kvm', 'status', 'uuid', 'balloon' ]:
+for cmd in [ 'version', 'kvm', 'status', 'uuid', 'balloon' ]:
 print cmd + ': ' + str(qemu.send('query-' + cmd))
 
 if __name__ == '__main__':
diff --git a/monitor.c b/monitor.c
index 15b53b9..14f77bd 100644
--- a/monitor.c
+++ b/monitor.c
@@ -740,20 +740,6 @@ static void do_info_commands(Monitor *mon, QObject **ret_data)
 *ret_data = QOBJECT(cmd_list);
 }
 
-#if defined(TARGET_I386)
-static void do_info_hpet_print(Monitor *mon, const QObject *data)
-{
-monitor_printf(mon, "HPET is %s by QEMU\n",
-   qdict_get_bool(qobject_to_qdict(data), "enabled") ?
-   "enabled" : "disabled");
-}
-
-static void do_info_hpet(Monitor *mon, QObject **ret_data)
-{
-*ret_data = qobject_from_jsonf("{ 'enabled': %i }", !no_hpet);
-}
-#endif
-
 static void do_info_uuid_print(Monitor *mon, const QObject *data)
 {
 monitor_printf(mon, "%s\n", qdict_get_str(qobject_to_qdict(data), "UUID"));
@@ -2509,14 +2495,6 @@ static const mon_cmd_t info_cmds[] = {
 .help   = "show the active virtual memory mappings",
 .mhandler.info = mem_info,
 },
-{
-.name   = "hpet",
-.args_type  = "",
-.params = "",
-.help   = "show state of HPET",
-.user_print = do_info_hpet_print,
-.mhandler.info_new = do_info_hpet,
-},
 #endif
 {
 .name   = "jit",
diff --git a/qemu-monitor.hx b/qemu-monitor.hx
index f6a94f2..9f62b94 100644
--- a/qemu-monitor.hx
+++ b/qemu-monitor.hx
@@ -2144,27 +2144,6 @@ show the active virtual memory mappings (i386 only)
 ETEXI
 
 STEXI
-@item info hpet
-show state of HPET (i386 only)
-ETEXI
-SQMP
-query-hpet
---
-
-Show HPET state.
-
-Return a json-object with the following information:
-
-- enabled: true if hpet if enabled, false otherwise (json-bool)
-
-Example:
-
--> { "execute": "query-hpet" }
-<- { "return": { "enabled": true } }
-
-EQMP
-
-STEXI
 @item info jit
 show dynamic compiler info
 @item info kvm
-- 
1.6.0.2

[Qemu-devel] [PATCH 14/16] vmstate: Add VMSTATE_STRUCT_VARRAY_UINT8

2010-06-06 Thread Jan Kiszka

From: Jan Kiszka jan.kis...@siemens.com

Required for hpet.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
 hw/hw.h |   10 ++
 1 files changed, 10 insertions(+), 0 deletions(-)

diff --git a/hw/hw.h b/hw/hw.h
index fc2d184..36be0be 100644
--- a/hw/hw.h
+++ b/hw/hw.h
@@ -474,6 +474,16 @@ extern const VMStateInfo vmstate_info_unused_buffer;
 .offset = vmstate_offset_array(_state, _field, _type, _num), \
 }
 
+#define VMSTATE_STRUCT_VARRAY_UINT8(_field, _state, _field_num, _version, _vmsd, _type) { \
+.name   = (stringify(_field)),   \
+.num_offset = vmstate_offset_value(_state, _field_num, uint8_t),  \
+.version_id = (_version),\
+.vmsd   = &(_vmsd),  \
+.size   = sizeof(_type), \
+.flags  = VMS_STRUCT|VMS_VARRAY_INT32,   \
+.offset = offsetof(_state, _field),  \
+}
+
 #define VMSTATE_STATIC_BUFFER(_field, _state, _version, _test, _start, _size) { \
 .name = (stringify(_field)), \
 .version_id   = (_version),  \
-- 
1.6.0.2
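
The idea behind a "variable array" field, where the element count is read at
run time from another member of the same state structure, can be sketched
without the vmstate machinery. All names below (ToyState, ToyVarArrayField,
toy_varray_count and so on) are hypothetical and are not part of hw/hw.h. The
sketch assumes the count member is a uint8_t and reads it with exactly that
width, which is what the _UINT8 suffix of the new macro suggests.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct ToyTimer { uint64_t cmp; } ToyTimer;

typedef struct ToyState {
    uint8_t  num_timers;       /* run-time element count */
    ToyTimer timer[32];        /* storage sized for the maximum */
} ToyState;

/* Hypothetical descriptor: where the array and its count live. */
typedef struct ToyVarArrayField {
    size_t offset;             /* offset of the array in the state */
    size_t num_offset;         /* offset of the uint8_t count member */
    size_t size;               /* size of one element */
} ToyVarArrayField;

static const ToyVarArrayField toy_timer_field = {
    offsetof(ToyState, timer),
    offsetof(ToyState, num_timers),
    sizeof(ToyTimer),
};

/* Read the element count through the descriptor, as one byte. */
size_t toy_varray_count(const void *state, const ToyVarArrayField *f)
{
    const uint8_t *base = state;

    return base[f->num_offset];
}

/* Number of bytes a save/load pass over the array would touch. */
size_t toy_varray_bytes(const void *state, const ToyVarArrayField *f)
{
    return toy_varray_count(state, f) * f->size;
}

/* Convenience wrapper used for experimenting with different counts. */
size_t toy_demo_bytes(uint8_t num_timers)
{
    ToyState s = { 0 };

    s.num_timers = num_timers;
    return toy_varray_bytes(&s, &toy_timer_field);
}
```

Reading the count with the same width as the member matters: a wider load at
num_offset would pick up neighbouring bytes of the structure.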

[Qemu-devel] [PATCH 12/16] hpet: Drop static state

2010-06-06 Thread Jan Kiszka

From: Jan Kiszka jan.kis...@siemens.com

Instead of keeping a static reference around, pass the state to
hpet_enabled and hpet_get_ticks. All callers now have it at hand. This
will eventually allow instantiating the HPET more than once.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
 hw/hpet.c |   38 +-
 1 files changed, 17 insertions(+), 21 deletions(-)

diff --git a/hw/hpet.c b/hw/hpet.c
index d26cad5..3866061 100644
--- a/hw/hpet.c
+++ b/hw/hpet.c
@@ -69,8 +69,6 @@ typedef struct HPETState {
 uint64_t hpet_counter;  /* main counter */
 } HPETState;
 
-static HPETState *hpet_statep;
-
 static uint32_t hpet_in_legacy_mode(HPETState *s)
 {
 return s->config & HPET_CFG_LEGACY;
@@ -81,9 +79,9 @@ static uint32_t timer_int_route(struct HPETTimer *timer)
 return (timer->config & HPET_TN_INT_ROUTE_MASK) >> HPET_TN_INT_ROUTE_SHIFT;
 }
 
-static uint32_t hpet_enabled(void)
+static uint32_t hpet_enabled(HPETState *s)
 {
-return hpet_statep->config & HPET_CFG_ENABLE;
+return s->config & HPET_CFG_ENABLE;
 }
 
 static uint32_t timer_is_periodic(HPETTimer *t)
@@ -133,9 +131,9 @@ static int deactivating_bit(uint64_t old, uint64_t new, uint64_t mask)
 return ((old & mask) && !(new & mask));
 }
 
-static uint64_t hpet_get_ticks(void)
+static uint64_t hpet_get_ticks(HPETState *s)
 {
-return ns_to_ticks(qemu_get_clock(vm_clock) + hpet_statep->hpet_offset);
+return ns_to_ticks(qemu_get_clock(vm_clock) + s->hpet_offset);
 }
 
 /*
@@ -174,7 +172,7 @@ static void update_irq(struct HPETTimer *timer)
 } else {
 route = timer_int_route(timer);
 }
-if (!timer_enabled(timer) || !hpet_enabled()) {
+if (!timer_enabled(timer) || !hpet_enabled(timer->state)) {
 return;
 }
 qemu_irq_pulse(timer->state->irqs[route]);
@@ -185,7 +183,7 @@ static void hpet_pre_save(void *opaque)
 HPETState *s = opaque;
 
 /* save current counter value */
-s->hpet_counter = hpet_get_ticks();
+s->hpet_counter = hpet_get_ticks(s);
 }
 
 static int hpet_post_load(void *opaque, int version_id)
@@ -240,7 +238,7 @@ static void hpet_timer(void *opaque)
 uint64_t diff;
 
 uint64_t period = t->period;
-uint64_t cur_tick = hpet_get_ticks();
+uint64_t cur_tick = hpet_get_ticks(t->state);
 
 if (timer_is_periodic(t) && period != 0) {
 if (t->config & HPET_TN_32BIT) {
@@ -270,7 +268,7 @@ static void hpet_set_timer(HPETTimer *t)
 {
 uint64_t diff;
 uint32_t wrap_diff;  /* how many ticks until we wrap? */
-uint64_t cur_tick = hpet_get_ticks();
+uint64_t cur_tick = hpet_get_ticks(t->state);
 
 /* whenever new timer is being set up, make sure wrap_flag is 0 */
 t->wrap_flag = 0;
@@ -353,16 +351,16 @@ static uint32_t hpet_ram_readl(void *opaque, target_phys_addr_t addr)
 DPRINTF("qemu: invalid HPET_CFG + 4 hpet_ram_readl \n");
 return 0;
 case HPET_COUNTER:
-if (hpet_enabled()) {
-cur_tick = hpet_get_ticks();
+if (hpet_enabled(s)) {
+cur_tick = hpet_get_ticks(s);
 } else {
 cur_tick = s->hpet_counter;
 }
 DPRINTF("qemu: reading counter  = %" PRIx64 "\n", cur_tick);
 return cur_tick;
 case HPET_COUNTER + 4:
-if (hpet_enabled()) {
-cur_tick = hpet_get_ticks();
+if (hpet_enabled(s)) {
+cur_tick = hpet_get_ticks(s);
 } else {
 cur_tick = s->hpet_counter;
 }
@@ -457,7 +455,7 @@ static void hpet_ram_writel(void *opaque, target_phys_addr_t addr,
 (timer-period  0xULL) | new_val;
 }
 timer->config &= ~HPET_TN_SETVAL;
-if (hpet_enabled()) {
+if (hpet_enabled(s)) {
 hpet_set_timer(timer);
 }
 break;
@@ -476,7 +474,7 @@ static void hpet_ram_writel(void *opaque, target_phys_addr_t addr,
 (timer-period  0xULL) | new_val  32;
 }
 timer->config &= ~HPET_TN_SETVAL;
-if (hpet_enabled()) {
+if (hpet_enabled(s)) {
 hpet_set_timer(timer);
 }
 break;
@@ -506,7 +504,7 @@ static void hpet_ram_writel(void *opaque, target_phys_addr_t addr,
 }
 } else if (deactivating_bit(old_val, new_val, HPET_CFG_ENABLE)) {
 /* Halt main counter and disable interrupt generation. */
-s->hpet_counter = hpet_get_ticks();
+s->hpet_counter = hpet_get_ticks(s);
 for (i = 0; i < HPET_NUM_TIMERS; i++) {
 hpet_del_timer(&s->timer[i]);
 }
@@ -527,7 +525,7 @@ static void hpet_ram_writel(void *opaque, target_phys_addr_t addr,
 /* FIXME: need to handle level-triggered interrupts */
 break;
 case HPET_COUNTER:
-

[Qemu-devel] [PATCH 11/16] hpet/rtc: Rework RTC IRQ replacement by HPET

2010-06-06 Thread Jan Kiszka

From: Jan Kiszka jan.kis...@siemens.com

Allow intercepting the RTC IRQ for the HPET legacy mode. Then push
routing to IRQ8 completely into the HPET. This allows turning
hpet_in_legacy_mode() into a private function. Furthermore, this stops
the RTC from clearing IRQ8 even if the HPET is in control.

This patch comes with a side effect: The RTC timers will no longer be
stopped when there is no IRQ consumer, possibly causing a minor
performance degradation. But as the guest may want to redirect the RTC to
the SCI in that mode, it should normally disable unused IRQ sources
anyway.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
 hw/hpet.c|   42 +++---
 hw/hpet_emul.h   |4 
 hw/mc146818rtc.c |   54 +++---
 hw/mc146818rtc.h |4 +++-
 hw/mips_jazz.c   |2 +-
 hw/mips_malta.c  |2 +-
 hw/mips_r4k.c|2 +-
 hw/pc.c  |   14 --
 hw/ppc_prep.c|2 +-
 9 files changed, 65 insertions(+), 61 deletions(-)

diff --git a/hw/hpet.c b/hw/hpet.c
index 041dd84..d26cad5 100644
--- a/hw/hpet.c
+++ b/hw/hpet.c
@@ -30,6 +30,7 @@
 #include "qemu-timer.h"
 #include "hpet_emul.h"
 #include "sysbus.h"
+#include "mc146818rtc.h"
 
 //#define HPET_DEBUG
 #ifdef HPET_DEBUG
@@ -58,6 +59,7 @@ typedef struct HPETState {
 SysBusDevice busdev;
 uint64_t hpet_offset;
 qemu_irq irqs[HPET_NUM_IRQ_ROUTES];
+uint8_t rtc_irq_level;
 HPETTimer timer[HPET_NUM_TIMERS];
 
 /* Memory-mapped, software visible registers */
@@ -69,12 +71,9 @@ typedef struct HPETState {
 
 static HPETState *hpet_statep;
 
-uint32_t hpet_in_legacy_mode(void)
+static uint32_t hpet_in_legacy_mode(HPETState *s)
 {
-if (!hpet_statep) {
-return 0;
-}
-return hpet_statep->config & HPET_CFG_LEGACY;
+return s->config & HPET_CFG_LEGACY;
 }
 
 static uint32_t timer_int_route(struct HPETTimer *timer)
@@ -166,12 +165,12 @@ static void update_irq(struct HPETTimer *timer)
 {
 int route;
 
-if (timer->tn <= 1 && hpet_in_legacy_mode()) {
+if (timer->tn <= 1 && hpet_in_legacy_mode(timer->state)) {
 /* if LegacyReplacementRoute bit is set, HPET specification requires
  * timer0 be routed to IRQ0 in NON-APIC or IRQ2 in the I/O APIC,
  * timer1 be routed to IRQ8 in NON-APIC or IRQ8 in the I/O APIC.
  */
-route = (timer->tn == 0) ? 0 : 8;
+route = (timer->tn == 0) ? 0 : RTC_ISA_IRQ;
 } else {
 route = timer_int_route(timer);
 }
@@ -515,8 +514,10 @@ static void hpet_ram_writel(void *opaque, target_phys_addr_t addr,
 /* i8254 and RTC are disabled when HPET is in legacy mode */
 if (activating_bit(old_val, new_val, HPET_CFG_LEGACY)) {
 hpet_pit_disable();
+qemu_irq_lower(s->irqs[RTC_ISA_IRQ]);
 } else if (deactivating_bit(old_val, new_val, HPET_CFG_LEGACY)) {
 hpet_pit_enable();
+qemu_set_irq(s->irqs[RTC_ISA_IRQ], s->rtc_irq_level);
 }
 break;
 case HPET_CFG + 4:
@@ -607,6 +608,30 @@ static void hpet_reset(DeviceState *d)
 count = 1;
 }
 
+static void hpet_rtc_delivery_cb(qemu_irq irq, void *opaque, int n, int level,
+ int result)
+{
+qemu_irq orig_irq = opaque;
+
+qemu_irq_fire_delivery_cb(orig_irq, level, result);
+}
+
+static void hpet_handle_rtc_irq(qemu_irq irq, void *opaque, int n, int level)
+{
+HPETState *s = FROM_SYSBUS(HPETState, opaque);
+IRQMsg msg = {
+.delivery_cb = hpet_rtc_delivery_cb,
+.delivery_opaque = irq,
+};
+
+s->rtc_irq_level = level;
+if (hpet_in_legacy_mode(s)) {
+qemu_irq_fire_delivery_cb(irq, level, QEMU_IRQ_MASKED);
+} else {
+qemu_set_irq_msg(s->irqs[RTC_ISA_IRQ], level, &msg);
+}
+}
+
 static int hpet_init(SysBusDevice *dev)
 {
 HPETState *s = FROM_SYSBUS(HPETState, dev);
@@ -625,6 +650,9 @@ static int hpet_init(SysBusDevice *dev)
 timer->state = s;
 }
 
+isa_reserve_irq(RTC_ISA_IRQ);
+qdev_init_gpio_in(&dev->qdev, hpet_handle_rtc_irq, 1);
+
 /* HPET Area */
 iomemtype = cpu_register_io_memory(hpet_ram_read,
hpet_ram_write, s);
diff --git a/hw/hpet_emul.h b/hw/hpet_emul.h
index 785f850..9c268cc 100644
--- a/hw/hpet_emul.h
+++ b/hw/hpet_emul.h
@@ -47,8 +47,4 @@
 #define HPET_TN_INT_ROUTE_CAP_SHIFT 32
 #define HPET_TN_CFG_BITS_READONLY_OR_RESERVED 0x80b1U
 
-#if defined TARGET_I386
-extern uint32_t hpet_in_legacy_mode(void);
-#endif
-
 #endif
diff --git a/hw/mc146818rtc.c b/hw/mc146818rtc.c
index cbb98a4..ac82810 100644
--- a/hw/mc146818rtc.c
+++ b/hw/mc146818rtc.c
@@ -26,7 +26,6 @@
 #include "sysemu.h"
 #include "pc.h"
 #include "isa.h"
-#include "hpet_emul.h"
 #include "mc146818rtc.h"
 
 //#define DEBUG_CMOS
@@ -100,24 +99,6 @@ typedef struct RTCState {
 QEMUTimer *second_timer2;
 } RTCState;
 
-static void rtc_irq_raise(RTCState *s,
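
The interception logic of this patch can be modeled in a few lines. The
sketch below is not the patch's code: ToyHpet and the toy_ functions are
hypothetical, the IRQ8 line is reduced to a plain integer, and the downstream
sink is assumed to always accept the interrupt. It only shows the state
handling described in the commit message: the HPET remembers the last RTC
level, reports "masked" while legacy mode is active, lowers IRQ8 when legacy
mode is entered, and restores the remembered level when it is left.

```c
#define QEMU_IRQ_DELIVERED  0
#define QEMU_IRQ_MASKED     (-2)

/* Hypothetical reduced HPET state. */
typedef struct ToyHpet {
    int legacy_mode;     /* stands in for HPET_CFG_LEGACY in s->config */
    int rtc_irq_level;   /* last level the RTC asked for */
    int irq8_level;      /* what the interrupt controller currently sees */
} ToyHpet;

/* Models hpet_handle_rtc_irq(): forward or swallow the RTC's request. */
int toy_hpet_handle_rtc_irq(ToyHpet *s, int level)
{
    s->rtc_irq_level = level;
    if (s->legacy_mode) {
        return QEMU_IRQ_MASKED;      /* HPET owns IRQ8, RTC is cut off */
    }
    s->irq8_level = level;
    return QEMU_IRQ_DELIVERED;       /* assumption: sink always accepts */
}

/* Models the HPET_CFG_LEGACY transitions in hpet_ram_writel(). */
void toy_hpet_set_legacy(ToyHpet *s, int enable)
{
    if (enable && !s->legacy_mode) {
        s->irq8_level = 0;                  /* drop any RTC-driven level */
    } else if (!enable && s->legacy_mode) {
        s->irq8_level = s->rtc_irq_level;   /* hand the line back */
    }
    s->legacy_mode = enable;
}

/* RTC raises while the HPET is in legacy mode; return IRQ8 afterwards. */
int toy_hpet_demo(void)
{
    ToyHpet s = { 0, 0, 0 };

    toy_hpet_set_legacy(&s, 1);
    toy_hpet_handle_rtc_irq(&s, 1);   /* swallowed, but remembered */
    toy_hpet_set_legacy(&s, 0);       /* remembered level is restored */
    return s.irq8_level;
}
```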

[Qemu-devel] [PATCH 13/16] hpet: Add support for level-triggered interrupts

2010-06-06 Thread Jan Kiszka

From: Jan Kiszka jan.kis...@siemens.com

By implementing this feature we can also remove a nasty way to kill qemu
(by trying to enable level-triggered hpet interrupts).

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
 hw/hpet.c |   32 ++--
 1 files changed, 22 insertions(+), 10 deletions(-)

diff --git a/hw/hpet.c b/hw/hpet.c
index 3866061..eafdccb 100644
--- a/hw/hpet.c
+++ b/hw/hpet.c
@@ -159,8 +159,10 @@ static inline uint64_t hpet_calculate_diff(HPETTimer *t, uint64_t current)
 }
 }
 
-static void update_irq(struct HPETTimer *timer)
+static void update_irq(struct HPETTimer *timer, int set)
 {
+uint64_t mask;
+HPETState *s;
 int route;
 
 if (timer->tn <= 1 && hpet_in_legacy_mode(timer->state)) {
@@ -172,10 +174,18 @@ static void update_irq(struct HPETTimer *timer)
 } else {
 route = timer_int_route(timer);
 }
-if (!timer_enabled(timer) || !hpet_enabled(timer->state)) {
-return;
+s = timer->state;
+mask = 1 << timer->tn;
+if (!set || !timer_enabled(timer) || !hpet_enabled(timer->state)) {
+s->isr &= ~mask;
+qemu_irq_lower(s->irqs[route]);
+} else if (timer->config & HPET_TN_TYPE_LEVEL) {
+s->isr |= mask;
+qemu_irq_raise(s->irqs[route]);
+} else {
+s->isr &= ~mask;
+qemu_irq_pulse(s->irqs[route]);
 }
-qemu_irq_pulse(timer->state->irqs[route]);
 }
 
 static void hpet_pre_save(void *opaque)
@@ -261,7 +271,7 @@ static void hpet_timer(void *opaque)
 t->wrap_flag = 0;
 }
 }
-update_irq(t);
+update_irq(t, 1);
 }
 
 static void hpet_set_timer(HPETTimer *t)
@@ -291,6 +301,7 @@ static void hpet_set_timer(HPETTimer *t)
 static void hpet_del_timer(HPETTimer *t)
 {
 qemu_del_timer(t->qemu_timer);
+update_irq(t, 0);
 }
 
 #ifdef HPET_DEBUG
@@ -423,10 +434,6 @@ static void hpet_ram_writel(void *opaque, target_phys_addr_t addr,
 timer->cmp = (uint32_t)timer->cmp;
 timer->period = (uint32_t)timer->period;
 }
-if (new_val & HPET_TN_TYPE_LEVEL) {
-printf("qemu: level-triggered hpet not supported\n");
-exit (-1);
-}
 if (activating_bit(old_val, new_val, HPET_TN_ENABLE)) {
 hpet_set_timer(timer);
 } else if (deactivating_bit(old_val, new_val, HPET_TN_ENABLE)) {
@@ -522,7 +529,12 @@ static void hpet_ram_writel(void *opaque, target_phys_addr_t addr,
 DPRINTF("qemu: invalid HPET_CFG+4 write \n");
 break;
 case HPET_STATUS:
-/* FIXME: need to handle level-triggered interrupts */
+val = new_val & s->isr;
+for (i = 0; i < HPET_NUM_TIMERS; i++) {
+if (val & (1 << i)) {
+update_irq(&s->timer[i], 0);
+}
+}
 break;
 case HPET_COUNTER:
 if (hpet_enabled(s)) {
-- 
1.6.0.2
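
The HPET_STATUS handling added above is a write-1-to-clear register: only
bits that the guest writes as 1 and that are currently set in the interrupt
status get cleared. A minimal sketch of just that arithmetic follows.
toy_status_write() is a hypothetical helper, not a function from hw/hpet.c,
and it leaves out the side effect the patch performs for each cleared bit,
namely lowering the timer's IRQ line via update_irq(timer, 0).

```c
#include <stdint.h>

/*
 * Hypothetical helper: return the new interrupt status after the guest
 * writes new_val to the status register of an HPET with num_timers timers.
 */
uint64_t toy_status_write(uint64_t isr, uint64_t new_val, unsigned num_timers)
{
    uint64_t val = new_val & isr;   /* only set bits can be acknowledged */
    unsigned i;

    for (i = 0; i < num_timers; i++) {
        uint64_t mask = (uint64_t)1 << i;

        if (val & mask) {
            isr &= ~mask;           /* the patch also lowers the IRQ here */
        }
    }
    return isr;
}
```

For example, with timers 0 and 2 pending (status 0x5), writing 0x1
acknowledges only timer 0 and leaves 0x4, while writing 0x2 changes nothing
because timer 1 was not pending.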

[Qemu-devel] [PATCH 15/16] hpet: Make number of timers configurable

2010-06-06 Thread Jan Kiszka

From: Jan Kiszka jan.kis...@siemens.com

One HPET block supports up to 32 timers. Allow instantiating more than
the recommended and implemented minimum of 3. The number is configured
via the qdev property "timers". It is also saved/restored so that it
need not match between migration peers.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
 hw/hpet.c  |   53 -
 hw/hpet_emul.h |6 +-
 2 files changed, 45 insertions(+), 14 deletions(-)

diff --git a/hw/hpet.c b/hw/hpet.c
index eafdccb..7219967 100644
--- a/hw/hpet.c
+++ b/hw/hpet.c
@@ -60,7 +60,8 @@ typedef struct HPETState {
 uint64_t hpet_offset;
 qemu_irq irqs[HPET_NUM_IRQ_ROUTES];
 uint8_t rtc_irq_level;
-HPETTimer timer[HPET_NUM_TIMERS];
+uint8_t num_timers;
+HPETTimer timer[HPET_MAX_TIMERS];
 
 /* Memory-mapped, software visible registers */
 uint64_t capability;/* capabilities */
@@ -196,12 +197,25 @@ static void hpet_pre_save(void *opaque)
 s->hpet_counter = hpet_get_ticks(s);
 }
 
+static int hpet_pre_load(void *opaque)
+{
+HPETState *s = opaque;
+
+/* version 1 only supports 3, later versions will load the actual value */
+s->num_timers = HPET_MIN_TIMERS;
+return 0;
+}
+
 static int hpet_post_load(void *opaque, int version_id)
 {
 HPETState *s = opaque;
 
 /* Recalculate the offset between the main counter and guest time */
 s->hpet_offset = ticks_to_ns(s->hpet_counter) - qemu_get_clock(vm_clock);
+
+/* Push number of timers into capability returned via HPET_ID */
+s->capability &= ~HPET_ID_NUM_TIM_MASK;
+s->capability |= (s->num_timers - 1) << HPET_ID_NUM_TIM_SHIFT;
 return 0;
 }
 
@@ -224,17 +238,19 @@ static const VMStateDescription vmstate_hpet_timer = {
 
 static const VMStateDescription vmstate_hpet = {
 .name = "hpet",
-.version_id = 1,
+.version_id = 2,
 .minimum_version_id = 1,
 .minimum_version_id_old = 1,
 .pre_save = hpet_pre_save,
+.pre_load = hpet_pre_load,
 .post_load = hpet_post_load,
 .fields  = (VMStateField []) {
 VMSTATE_UINT64(config, HPETState),
 VMSTATE_UINT64(isr, HPETState),
 VMSTATE_UINT64(hpet_counter, HPETState),
-VMSTATE_STRUCT_ARRAY(timer, HPETState, HPET_NUM_TIMERS, 0,
- vmstate_hpet_timer, HPETTimer),
+VMSTATE_UINT8_V(num_timers, HPETState, 2),
+VMSTATE_STRUCT_VARRAY_UINT8(timer, HPETState, num_timers, 0,
+vmstate_hpet_timer, HPETTimer),
 VMSTATE_END_OF_LIST()
 }
 };
@@ -330,7 +346,7 @@ static uint32_t hpet_ram_readl(void *opaque, target_phys_addr_t addr)
 uint8_t timer_id = (addr - 0x100) / 0x20;
 HPETTimer *timer = &s->timer[timer_id];
 
-if (timer_id > HPET_NUM_TIMERS - 1) {
+if (timer_id > s->num_timers) {
 DPRINTF("qemu: timer id out of range\n");
 return 0;
 }
@@ -421,7 +437,7 @@ static void hpet_ram_writel(void *opaque, target_phys_addr_t addr,
 HPETTimer *timer = &s->timer[timer_id];
 
 DPRINTF("qemu: hpet_ram_writel timer_id = %#x \n", timer_id);
-if (timer_id > HPET_NUM_TIMERS - 1) {
+if (timer_id > s->num_timers) {
 DPRINTF("qemu: timer id out of range\n");
 return;
 }
@@ -504,7 +520,7 @@ static void hpet_ram_writel(void *opaque, target_phys_addr_t addr,
 /* Enable main counter and interrupt generation. */
 s->hpet_offset =
 ticks_to_ns(s->hpet_counter) - qemu_get_clock(vm_clock);
-for (i = 0; i < HPET_NUM_TIMERS; i++) {
+for (i = 0; i < s->num_timers; i++) {
 if ((&s->timer[i])->cmp != ~0ULL) {
 hpet_set_timer(&s->timer[i]);
 }
@@ -512,7 +528,7 @@ static void hpet_ram_writel(void *opaque, target_phys_addr_t addr,
 } else if (deactivating_bit(old_val, new_val, HPET_CFG_ENABLE)) {
 /* Halt main counter and disable interrupt generation. */
 s->hpet_counter = hpet_get_ticks(s);
-for (i = 0; i < HPET_NUM_TIMERS; i++) {
+for (i = 0; i < s->num_timers; i++) {
 hpet_del_timer(&s->timer[i]);
 }
 }
@@ -530,7 +546,7 @@ static void hpet_ram_writel(void *opaque, target_phys_addr_t addr,
 break;
 case HPET_STATUS:
 val = new_val & s->isr;
-for (i = 0; i < HPET_NUM_TIMERS; i++) {
+for (i = 0; i < s->num_timers; i++) {
 if (val & (1 << i)) {
 update_irq(&s->timer[i], 0);
 }
@@ -589,7 +605,7 @@ static void hpet_reset(DeviceState *d)
 int i;
 static int count = 0;
 
-for (i = 0; i < HPET_NUM_TIMERS; i++) {
+for (i = 0; i < s->num_timers; i++) {
 HPETTimer *timer = &s->timer[i];
 
 hpet_del_timer(timer);
@@ -603,8 +619,9 @@ static
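
How hpet_post_load() pushes the timer count into the capability register can
be checked with a small stand-alone sketch. The field position used here
(bits 12:8, holding the index of the last timer, i.e. count minus one) is an
assumption taken from the HPET specification's General Capabilities and ID
register; the patch's own HPET_ID_NUM_TIM_MASK and HPET_ID_NUM_TIM_SHIFT
definitions are not visible in this excerpt. The toy_ helpers are
hypothetical, and the capability value in the usage note is just a sample.

```c
#include <stdint.h>

/* Assumed layout: NUM_TIM_CAP occupies bits 12:8 of the capability. */
#define HPET_ID_NUM_TIM_SHIFT 8
#define HPET_ID_NUM_TIM_MASK  0x1f00

/* Store num_timers (valid range 1..32) as "index of last timer". */
uint64_t toy_set_num_timers(uint64_t capability, uint8_t num_timers)
{
    capability &= ~(uint64_t)HPET_ID_NUM_TIM_MASK;
    capability |= (uint64_t)(num_timers - 1) << HPET_ID_NUM_TIM_SHIFT;
    return capability;
}

/* Recover the timer count a guest would derive from the register. */
uint8_t toy_get_num_timers(uint64_t capability)
{
    return (uint8_t)(((capability & HPET_ID_NUM_TIM_MASK)
                      >> HPET_ID_NUM_TIM_SHIFT) + 1);
}
```

With a sample capability of 0x8086a201, setting 3 timers leaves the value
unchanged (the field already holds 2), and setting 32 timers yields
0x8086bf01. A count of 0 is outside the valid range and is not handled.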

[Qemu-devel] Re: RFC: blockdev_add friends, brief rationale, QMP docs

2010-06-06 Thread Avi Kivity


On 06/04/2010 05:16 PM, Markus Armbruster wrote:

- protocol: json-array of json-object
   Each element object has a member name
 - Possible values: file, nbd, ...
   Additional members depend on the value of name.
   For name = file:
 - file: file name (json-string)
   For name = nbd:
 - domain: address family (json-string, optional)
 - Possible values: inet (default), unix
 - file: file name (json-string), only with domain = unix
 - host: host name (json-string), only with domain = inet
 - port: port (json-int), only with domain = inet
   ...

   


This loses the nesting that protocols have.  I'd like to see each
nested protocol as a member of the parent protocol.  Besides the lovely }
} }s in the json representation, this allows us to have more complicated
protocols, for example a mirror protocol that has two child protocols,
each specifying a different backing store.


--
error compiling committee.c: too many arguments to function

[Qemu-devel] [PATCH 16/16] hpet: Add MSI support

2010-06-06 Thread Jan Kiszka

From: Jan Kiszka jan.kis...@siemens.com

This implements the HPET capability of routing IRQs to the front-side
bus, aka MSI support. This feature can be enabled via the qdev property
"msi" and is off by default.

Note that switching it on can cause guests (at least Linux) to use the
HPET as timer instead of the LAPIC. KVM users should recall that only
the latter is currently available as a fast in-kernel model.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
 hw/apic.c  |2 +-
 hw/apic.h  |1 +
 hw/hpet.c  |   39 +++
 hw/hpet_emul.h |4 +++-
 4 files changed, 40 insertions(+), 6 deletions(-)

diff --git a/hw/apic.c b/hw/apic.c
index f9587d1..f33d20a 100644
--- a/hw/apic.c
+++ b/hw/apic.c
@@ -776,7 +776,7 @@ static uint32_t apic_mem_readl(void *opaque, target_phys_addr_t addr)
 return val;
 }
 
-static void apic_send_msi(target_phys_addr_t addr, uint32 data)
+void apic_send_msi(target_phys_addr_t addr, uint32 data)
 {
 uint8_t dest = (addr & MSI_ADDR_DEST_ID_MASK) >> MSI_ADDR_DEST_ID_SHIFT;
 uint8_t vector = (data & MSI_DATA_VECTOR_MASK) >> MSI_DATA_VECTOR_SHIFT;
diff --git a/hw/apic.h b/hw/apic.h
index 738d98a..9c646f0 100644
--- a/hw/apic.h
+++ b/hw/apic.h
@@ -5,6 +5,7 @@ typedef struct IOAPICState IOAPICState;
 int apic_deliver_irq(uint8_t dest, uint8_t dest_mode,
  uint8_t delivery_mode, uint8_t vector_num,
  uint8_t polarity, uint8_t trigger_mode);
+void apic_send_msi(target_phys_addr_t addr, uint32 data);
 int apic_init(CPUState *env);
 int apic_accept_pic_intr(CPUState *env);
 void apic_deliver_pic_intr(CPUState *env, int level);
diff --git a/hw/hpet.c b/hw/hpet.c
index 7219967..490a804 100644
--- a/hw/hpet.c
+++ b/hw/hpet.c
@@ -31,6 +31,7 @@
 #include "hpet_emul.h"
 #include "sysbus.h"
 #include "mc146818rtc.h"
+#include "apic.h"
 
 //#define HPET_DEBUG
 #ifdef HPET_DEBUG
@@ -39,6 +40,8 @@
 #define DPRINTF(...)
 #endif
 
+#define HPET_MSI_SUPPORT        0
+
 struct HPETState;
 typedef struct HPETTimer {  /* timers */
 uint8_t tn; /*timer number*/
@@ -47,7 +50,7 @@ typedef struct HPETTimer {  /* timers */
 /* Memory-mapped, software visible timer registers */
 uint64_t config;/* configuration/cap */
 uint64_t cmp;   /* comparator */
-uint64_t fsb;   /* FSB route, not supported now */
+uint64_t fsb;   /* FSB route */
 /* Hidden register state */
 uint64_t period;/* Last value written to comparator */
 uint8_t wrap_flag;  /* timer pop will indicate wrap for one-shot 32-bit
@@ -59,6 +62,7 @@ typedef struct HPETState {
 SysBusDevice busdev;
 uint64_t hpet_offset;
 qemu_irq irqs[HPET_NUM_IRQ_ROUTES];
+uint32_t flags;
 uint8_t rtc_irq_level;
 uint8_t num_timers;
 HPETTimer timer[HPET_MAX_TIMERS];
@@ -80,6 +84,11 @@ static uint32_t timer_int_route(struct HPETTimer *timer)
 return (timer->config & HPET_TN_INT_ROUTE_MASK) >> HPET_TN_INT_ROUTE_SHIFT;
 }
 
+static uint32_t timer_fsb_route(HPETTimer *t)
+{
+return t->config & HPET_TN_FSB_ENABLE;
+}
+
 static uint32_t hpet_enabled(HPETState *s)
 {
 return s->config & HPET_CFG_ENABLE;
@@ -179,7 +188,11 @@ static void update_irq(struct HPETTimer *timer, int set)
 mask = 1 << timer->tn;
 if (!set || !timer_enabled(timer) || !hpet_enabled(timer->state)) {
 s->isr &= ~mask;
-qemu_irq_lower(s->irqs[route]);
+if (!timer_fsb_route(timer)) {
+qemu_irq_lower(s->irqs[route]);
+}
+} else if (timer_fsb_route(timer)) {
+apic_send_msi(timer->fsb >> 32, timer->fsb & 0xffffffff);
 } else if (timer->config & HPET_TN_TYPE_LEVEL) {
 s->isr |= mask;
 qemu_irq_raise(s->irqs[route]);
@@ -216,6 +229,12 @@ static int hpet_post_load(void *opaque, int version_id)
 /* Push number of timers into capability returned via HPET_ID */
 s->capability &= ~HPET_ID_NUM_TIM_MASK;
 s->capability |= (s->num_timers - 1) << HPET_ID_NUM_TIM_SHIFT;
+
+/* Derive HPET_MSI_SUPPORT from the capability of the first timer. */
+s->flags &= ~(1 << HPET_MSI_SUPPORT);
+if (s->timer[0].config & HPET_TN_FSB_CAP) {
+s->flags |= 1 << HPET_MSI_SUPPORT;
+}
 return 0;
 }
 
@@ -361,6 +380,8 @@ static uint32_t hpet_ram_readl(void *opaque, target_phys_addr_t addr)
 case HPET_TN_CMP + 4:
 return timer->cmp >> 32;
 case HPET_TN_ROUTE:
+return timer->fsb;
+case HPET_TN_ROUTE + 4:
 return timer->fsb >> 32;
 default:
 DPRINTF("qemu: invalid hpet_ram_readl\n");
@@ -444,6 +465,9 @@ static void hpet_ram_writel(void *opaque, target_phys_addr_t addr,
 switch ((addr - 0x100) % 0x20) {
 case HPET_TN_CFG:
 DPRINTF("qemu: hpet_ram_writel HPET_TN_CFG\n");
+if (activating_bit(old_val, new_val, HPET_TN_FSB_ENABLE)) {
+update_irq(timer, 0);
+}
 val = hpet_fixup_reg(new_val, old_val,

[Qemu-devel] Re: [PATCH 10/16] x86: Refactor RTC IRQ coalescing workaround

2010-06-06 Thread Blue Swirl

On Sun, Jun 6, 2010 at 8:10 AM, Jan Kiszka jan.kis...@web.de wrote:
 From: Jan Kiszka jan.kis...@siemens.com

 Make use of the new IRQ message and report delivery results from the
 sink to the source. As a by-product, this also adds de-coalescing
 support to the PIC.

 Signed-off-by: Jan Kiszka jan.kis...@siemens.com
 ---
  hw/apic.c        |   64 +++--
  hw/apic.h        |    9 ++
  hw/i8259.c       |   16 ++-
  hw/ioapic.c      |   20 ++---
  hw/mc146818rtc.c |   83 ++---
  hw/pc.c          |   29 --
  6 files changed, 141 insertions(+), 80 deletions(-)

 diff --git a/hw/apic.c b/hw/apic.c
 index 7fbd79b..f9587d1 100644
 --- a/hw/apic.c
 +++ b/hw/apic.c
 @@ -123,10 +123,8 @@ typedef struct APICState {
  static int apic_io_memory;
  static APICState *local_apics[MAX_APICS + 1];
  static int last_apic_idx = 0;
 -static int apic_irq_delivered;

 -
 -static void apic_set_irq(APICState *s, int vector_num, int trigger_mode);
 +static int apic_set_irq(APICState *s, int vector_num, int trigger_mode);
  static void apic_update_irq(APICState *s);
  static void apic_get_delivery_bitmask(uint32_t *deliver_bitmask,
                                       uint8_t dest, uint8_t dest_mode);
 @@ -239,12 +237,12 @@ void apic_deliver_pic_intr(CPUState *env, int level)
     }\
  }

 -static void apic_bus_deliver(const uint32_t *deliver_bitmask,
 -                             uint8_t delivery_mode,
 -                             uint8_t vector_num, uint8_t polarity,
 -                             uint8_t trigger_mode)
 +static int apic_bus_deliver(const uint32_t *deliver_bitmask,
 +                            uint8_t delivery_mode, uint8_t vector_num,
 +                            uint8_t polarity, uint8_t trigger_mode)
  {
     APICState *apic_iter;
 +    int ret;

     switch (delivery_mode) {
         case APIC_DM_LOWPRI:
 @@ -261,11 +259,12 @@ static void apic_bus_deliver(const uint32_t *deliver_bitmask,
                 if (d >= 0) {
                     apic_iter = local_apics[d];
                     if (apic_iter) {
 -                        apic_set_irq(apic_iter, vector_num, trigger_mode);
 +                        return apic_set_irq(apic_iter, vector_num,
 +                                            trigger_mode);
                     }
                 }
             }
 -            return;
 +            return QEMU_IRQ_MASKED;

         case APIC_DM_FIXED:
             break;
 @@ -273,34 +272,42 @@ static void apic_bus_deliver(const uint32_t *deliver_bitmask,
         case APIC_DM_SMI:
             foreach_apic(apic_iter, deliver_bitmask,
                 cpu_interrupt(apic_iter->cpu_env, CPU_INTERRUPT_SMI) );
 -            return;
 +            return QEMU_IRQ_DELIVERED;

         case APIC_DM_NMI:
             foreach_apic(apic_iter, deliver_bitmask,
                 cpu_interrupt(apic_iter->cpu_env, CPU_INTERRUPT_NMI) );
 -            return;
 +            return QEMU_IRQ_DELIVERED;

         case APIC_DM_INIT:
             /* normal INIT IPI sent to processors */
             foreach_apic(apic_iter, deliver_bitmask,
                          cpu_interrupt(apic_iter->cpu_env, CPU_INTERRUPT_INIT) );
 -            return;
 +            return QEMU_IRQ_DELIVERED;

         case APIC_DM_EXTINT:
             /* handled in I/O APIC code */
             break;

         default:
 -            return;
 +            return QEMU_IRQ_MASKED;
     }

 +    ret = QEMU_IRQ_MASKED;
     foreach_apic(apic_iter, deliver_bitmask,
 -                 apic_set_irq(apic_iter, vector_num, trigger_mode) );
 +        if (ret == QEMU_IRQ_MASKED)
 +            ret = QEMU_IRQ_COALESCED;
 +        if (apic_set_irq(apic_iter, vector_num,
 +                         trigger_mode) == QEMU_IRQ_DELIVERED) {
 +            ret = QEMU_IRQ_DELIVERED;
 +        }
 +    );
 +    return ret;
  }

 -void apic_deliver_irq(uint8_t dest, uint8_t dest_mode,
 -                      uint8_t delivery_mode, uint8_t vector_num,
 -                      uint8_t polarity, uint8_t trigger_mode)
 +int apic_deliver_irq(uint8_t dest, uint8_t dest_mode,
 +                     uint8_t delivery_mode, uint8_t vector_num,
 +                     uint8_t polarity, uint8_t trigger_mode)
  {
     uint32_t deliver_bitmask[MAX_APIC_WORDS];

 @@ -308,8 +315,8 @@ void apic_deliver_irq(uint8_t dest, uint8_t dest_mode,
             " polarity %d trigger_mode %d\n", __func__, dest, dest_mode,
             delivery_mode, vector_num, polarity, trigger_mode);
     apic_get_delivery_bitmask(deliver_bitmask, dest, dest_mode);
 -    apic_bus_deliver(deliver_bitmask, delivery_mode, vector_num, polarity,
 -                     trigger_mode);
 +    return apic_bus_deliver(deliver_bitmask, delivery_mode, vector_num,
 +                            polarity, trigger_mode);
  }

  void cpu_set_apic_base(CPUState *env, uint64_t val)
 @@ -402,22 +409,10 @@ static void

[Qemu-devel] Re: [PATCH 11/16] hpet/rtc: Rework RTC IRQ replacement by HPET

2010-06-06 Thread Blue Swirl

On Sun, Jun 6, 2010 at 8:11 AM, Jan Kiszka jan.kis...@web.de wrote:
 From: Jan Kiszka jan.kis...@siemens.com

 Allow intercepting the RTC IRQ for the HPET legacy mode. Then push
 routing to IRQ8 completely into the HPET. This allows turning
 hpet_in_legacy_mode() into a private function. Furthermore, this stops
 the RTC from clearing IRQ8 even if the HPET is in control.

 This patch comes with a side effect: The RTC timers will no longer be
 stopped when there is no IRQ consumer, possibly causing a minor
 performance degradation. But as the guest may want to redirect the RTC to
 the SCI in that mode, it should normally disable unused IRQ sources
 anyway.

 Signed-off-by: Jan Kiszka jan.kis...@siemens.com
 ---
  hw/hpet.c        |   42 +++---
  hw/hpet_emul.h   |    4 
  hw/mc146818rtc.c |   54 
 +++---
  hw/mc146818rtc.h |    4 +++-
  hw/mips_jazz.c   |    2 +-
  hw/mips_malta.c  |    2 +-
  hw/mips_r4k.c    |    2 +-
  hw/pc.c          |   14 --
  hw/ppc_prep.c    |    2 +-
  9 files changed, 65 insertions(+), 61 deletions(-)

 diff --git a/hw/hpet.c b/hw/hpet.c
 index 041dd84..d26cad5 100644
 --- a/hw/hpet.c
 +++ b/hw/hpet.c
 @@ -30,6 +30,7 @@
  #include "qemu-timer.h"
  #include "hpet_emul.h"
  #include "sysbus.h"
 +#include "mc146818rtc.h"

  //#define HPET_DEBUG
  #ifdef HPET_DEBUG
 @@ -58,6 +59,7 @@ typedef struct HPETState {
     SysBusDevice busdev;
     uint64_t hpet_offset;
     qemu_irq irqs[HPET_NUM_IRQ_ROUTES];
 +    uint8_t rtc_irq_level;
     HPETTimer timer[HPET_NUM_TIMERS];

     /* Memory-mapped, software visible registers */
 @@ -69,12 +71,9 @@ typedef struct HPETState {

  static HPETState *hpet_statep;

 -uint32_t hpet_in_legacy_mode(void)
 +static uint32_t hpet_in_legacy_mode(HPETState *s)
  {
 -    if (!hpet_statep) {
 -        return 0;
 -    }
 -    return hpet_statep->config & HPET_CFG_LEGACY;
 +    return s->config & HPET_CFG_LEGACY;
  }

  static uint32_t timer_int_route(struct HPETTimer *timer)
 @@ -166,12 +165,12 @@ static void update_irq(struct HPETTimer *timer)
  {
     int route;

 -    if (timer->tn <= 1 && hpet_in_legacy_mode()) {
 +    if (timer->tn <= 1 && hpet_in_legacy_mode(timer->state)) {
         /* if LegacyReplacementRoute bit is set, HPET specification requires
          * timer0 be routed to IRQ0 in NON-APIC or IRQ2 in the I/O APIC,
          * timer1 be routed to IRQ8 in NON-APIC or IRQ8 in the I/O APIC.
          */
 -        route = (timer->tn == 0) ? 0 : 8;
 +        route = (timer->tn == 0) ? 0 : RTC_ISA_IRQ;
     } else {
         route = timer_int_route(timer);
     }
 @@ -515,8 +514,10 @@ static void hpet_ram_writel(void *opaque, target_phys_addr_t addr,
             /* i8254 and RTC are disabled when HPET is in legacy mode */
             if (activating_bit(old_val, new_val, HPET_CFG_LEGACY)) {
                 hpet_pit_disable();
 +                qemu_irq_lower(s->irqs[RTC_ISA_IRQ]);
             } else if (deactivating_bit(old_val, new_val, HPET_CFG_LEGACY)) {
                 hpet_pit_enable();
 +                qemu_set_irq(s->irqs[RTC_ISA_IRQ], s->rtc_irq_level);
             }
             break;
         case HPET_CFG + 4:
 @@ -607,6 +608,30 @@ static void hpet_reset(DeviceState *d)
     count = 1;
  }

 +static void hpet_rtc_delivery_cb(qemu_irq irq, void *opaque, int n, int level,
 +                                 int result)
 +{
 +    qemu_irq orig_irq = opaque;
 +
 +    qemu_irq_fire_delivery_cb(orig_irq, level, result);
 +}
 +
 +static void hpet_handle_rtc_irq(qemu_irq irq, void *opaque, int n, int level)
 +{
 +    HPETState *s = FROM_SYSBUS(HPETState, opaque);
 +    IRQMsg msg = {
 +        .delivery_cb = hpet_rtc_delivery_cb,
 +        .delivery_opaque = irq,
 +    };
 +
 +    s->rtc_irq_level = level;
 +    if (hpet_in_legacy_mode(s)) {
 +        qemu_irq_fire_delivery_cb(irq, level, QEMU_IRQ_MASKED);
 +    } else {
 +        qemu_set_irq_msg(s->irqs[RTC_ISA_IRQ], level, &msg);

This is the problem with passing around stack-allocated objects: after
this function finishes, s->irqs[RTC_ISA_IRQ].msg is a dangling pointer
to some stack space.

 +    }
 +}
 +
  static int hpet_init(SysBusDevice *dev)
  {
     HPETState *s = FROM_SYSBUS(HPETState, dev);
 @@ -625,6 +650,9 @@ static int hpet_init(SysBusDevice *dev)
         timer-state = s;
     }

 +    isa_reserve_irq(RTC_ISA_IRQ);
 +    qdev_init_gpio_in(&dev->qdev, hpet_handle_rtc_irq, 1);
 +
     /* HPET Area */
     iomemtype = cpu_register_io_memory(hpet_ram_read,
                                        hpet_ram_write, s);
 diff --git a/hw/hpet_emul.h b/hw/hpet_emul.h
 index 785f850..9c268cc 100644
 --- a/hw/hpet_emul.h
 +++ b/hw/hpet_emul.h
 @@ -47,8 +47,4 @@
  #define HPET_TN_INT_ROUTE_CAP_SHIFT 32
  #define HPET_TN_CFG_BITS_READONLY_OR_RESERVED 0x80b1U

 -#if defined TARGET_I386
 -extern uint32_t hpet_in_legacy_mode(void);
 -#endif
 -
  #endif
 diff --git a/hw/mc146818rtc.c b/hw/mc146818rtc.c
 index

[Qemu-devel] Re: [PATCH 00/16] HPET cleanups, fixes, enhancements

2010-06-06 Thread Blue Swirl

On Sun, Jun 6, 2010 at 8:10 AM, Jan Kiszka jan.kis...@web.de wrote:
 Second round, specifically addressing:
  - IRQMsg framework to refactor existing de-coalescing code
  - RTC IRQ output as GPIO pin (routed depending on HPET or -no-hpet)
  - ISA reservation for RTC IRQ

 If discussion around IRQMsg and de-coalescing happens to continue, I
 would suggest to merge patches 1..7 as they are likely uncontroversial
 and also fix bugs.

Otherwise everything looks fine to me, but 10 and 11 had minor
problems. Nice work!

I'd suppose one possible cleanup could be to use the message payload
in place of apic_deliver_irq()?

 Jan Kiszka (16):
  hpet: Catch out-of-bounds timer access
  hpet: Coding style cleanups and some refactorings
  hpet: Silence warning on write to running main counter
  hpet: Move static timer field initialization
  hpet: Convert to qdev
  hpet: Start/stop timer when HPET_TN_ENABLE is modified
  monitor/QMP: Drop info hpet / query-hpet
  Pass IRQ object on handler invocation
  Enable message delivery via IRQs
  x86: Refactor RTC IRQ coalescing workaround
  hpet/rtc: Rework RTC IRQ replacement by HPET
  hpet: Drop static state
  hpet: Add support for level-triggered interrupts
  vmstate: Add VMSTATE_STRUCT_VARRAY_UINT8
  hpet: Make number of timers configurable
  hpet: Add MSI support

  QMP/vm-info             |    2 +-
  hw/acpi_piix4.c         |    3 +-
  hw/apic.c               |   66 +++---
  hw/apic.h               |   11 +-
  hw/arm11mpcore.c        |   12 +-
  hw/arm_gic.c            |   18 +-
  hw/arm_pic.c            |    6 +-
  hw/arm_timer.c          |    4 +-
  hw/bitbang_i2c.c        |    4 +-
  hw/bt-hci-csr.c         |    2 +-
  hw/cbus.c               |    6 +-
  hw/cris_pic_cpu.c       |    4 +-
  hw/esp.c                |    2 +-
  hw/etraxfs_pic.c        |   16 +-
  hw/fdc.c                |    2 +-
  hw/heathrow_pic.c       |    3 +-
  hw/hpet.c               |  595 
 ++-
  hw/hpet_emul.h          |   46 +---
  hw/hw.h                 |   10 +
  hw/i8259.c              |   28 ++-
  hw/ide/cmd646.c         |    2 +-
  hw/ide/microdrive.c     |    2 +-
  hw/integratorcp.c       |   10 +-
  hw/ioapic.c             |   22 ++-
  hw/irq.c                |   48 -
  hw/irq.h                |   42 +++-
  hw/lance.c              |    2 +-
  hw/max7310.c            |    2 +-
  hw/mc146818rtc.c        |  111 +-
  hw/mc146818rtc.h        |    4 +-
  hw/mcf5206.c            |    6 +-
  hw/mcf_intc.c           |   14 +-
  hw/microblaze_pic_cpu.c |    5 +-
  hw/mips_int.c           |   10 +-
  hw/mips_jazz.c          |    4 +-
  hw/mips_malta.c         |    4 +-
  hw/mips_r4k.c           |    2 +-
  hw/mst_fpga.c           |   10 +-
  hw/musicpal.c           |   16 +-
  hw/nseries.c            |    4 +-
  hw/omap.h               |    2 +-
  hw/omap1.c              |   34 ++--
  hw/omap2.c              |    8 +-
  hw/omap_dma.c           |    8 +-
  hw/omap_mmc.c           |    2 +-
  hw/openpic.c            |    6 +-
  hw/palm.c               |    2 +-
  hw/pc.c                 |   59 --
  hw/pc.h                 |    8 +-
  hw/pci.c                |    4 +-
  hw/pl061.c              |    4 +-
  hw/pl190.c              |    6 +-
  hw/ppc.c                |    8 +-
  hw/ppc4xx_devs.c        |    2 +-
  hw/ppc_prep.c           |    4 +-
  hw/pxa2xx.c             |    2 +-
  hw/pxa2xx_gpio.c        |    2 +-
  hw/pxa2xx_pcmcia.c      |    3 +-
  hw/pxa2xx_pic.c         |   10 +-
  hw/r2d.c                |    2 +-
  hw/rc4030.c             |    7 +-
  hw/sbi.c                |    2 +-
  hw/sh_intc.c            |    4 +-
  hw/sh_intc.h            |    2 +-
  hw/sharpsl.h            |    1 -
  hw/slavio_intctl.c      |   16 +-
  hw/slavio_misc.c        |    3 +-
  hw/sparc32_dma.c        |    2 +-
  hw/spitz.c              |   14 +-
  hw/ssd0323.c            |    2 +-
  hw/stellaris.c          |    6 +-
  hw/sun4c_intctl.c       |    8 +-
  hw/sun4m.c              |   14 +-
  hw/sun4u.c              |   12 +-
  hw/syborg_interrupt.c   |    8 +-
  hw/tc6393xb.c           |    7 +-
  hw/tosa.c               |    2 +-
  hw/tusb6010.c           |    3 +-
  hw/twl92230.c           |    5 +-
  hw/versatilepb.c        |   10 +-
  hw/xilinx_intc.c        |    8 +-
  hw/zaurus.c             |    2 +-
  monitor.c               |   22 --
  qemu-monitor.hx         |   21 --
  84 files changed, 874 insertions(+), 643 deletions(-)

[Qemu-devel] Re: [PATCH 10/16] x86: Refactor RTC IRQ coalescing workaround

2010-06-06 Thread Jan Kiszka

Blue Swirl wrote:
 On Sun, Jun 6, 2010 at 8:10 AM, Jan Kiszka jan.kis...@web.de wrote:
 From: Jan Kiszka jan.kis...@siemens.com

 Make use of the new IRQ message and report delivery results from the
 sink to the source. As a by-product, this also adds de-coalescing
 support to the PIC.

 Signed-off-by: Jan Kiszka jan.kis...@siemens.com
 ---
  hw/apic.c|   64 +++--
  hw/apic.h|9 ++
  hw/i8259.c   |   16 ++-
  hw/ioapic.c  |   20 ++---
  hw/mc146818rtc.c |   83 
 ++---
  hw/pc.c  |   29 --
  6 files changed, 141 insertions(+), 80 deletions(-)

 diff --git a/hw/apic.c b/hw/apic.c
 index 7fbd79b..f9587d1 100644
 --- a/hw/apic.c
 +++ b/hw/apic.c
 @@ -123,10 +123,8 @@ typedef struct APICState {
  static int apic_io_memory;
  static APICState *local_apics[MAX_APICS + 1];
  static int last_apic_idx = 0;
 -static int apic_irq_delivered;

 -
 -static void apic_set_irq(APICState *s, int vector_num, int trigger_mode);
 +static int apic_set_irq(APICState *s, int vector_num, int trigger_mode);
  static void apic_update_irq(APICState *s);
  static void apic_get_delivery_bitmask(uint32_t *deliver_bitmask,
   uint8_t dest, uint8_t dest_mode);
 @@ -239,12 +237,12 @@ void apic_deliver_pic_intr(CPUState *env, int level)
 }\
  }

 -static void apic_bus_deliver(const uint32_t *deliver_bitmask,
 - uint8_t delivery_mode,
 - uint8_t vector_num, uint8_t polarity,
 - uint8_t trigger_mode)
 +static int apic_bus_deliver(const uint32_t *deliver_bitmask,
 +uint8_t delivery_mode, uint8_t vector_num,
 +uint8_t polarity, uint8_t trigger_mode)
  {
 APICState *apic_iter;
 +int ret;

 switch (delivery_mode) {
 case APIC_DM_LOWPRI:
 @@ -261,11 +259,12 @@ static void apic_bus_deliver(const uint32_t *deliver_bitmask,
 if (d >= 0) {
 apic_iter = local_apics[d];
 if (apic_iter) {
 -apic_set_irq(apic_iter, vector_num, trigger_mode);
 +return apic_set_irq(apic_iter, vector_num,
 +trigger_mode);
 }
 }
 }
 -return;
 +return QEMU_IRQ_MASKED;

 case APIC_DM_FIXED:
 break;
 @@ -273,34 +272,42 @@ static void apic_bus_deliver(const uint32_t 
 *deliver_bitmask,
 case APIC_DM_SMI:
 foreach_apic(apic_iter, deliver_bitmask,
 cpu_interrupt(apic_iter->cpu_env, CPU_INTERRUPT_SMI) );
 -return;
 +return QEMU_IRQ_DELIVERED;

 case APIC_DM_NMI:
 foreach_apic(apic_iter, deliver_bitmask,
 cpu_interrupt(apic_iter->cpu_env, CPU_INTERRUPT_NMI) );
 -return;
 +return QEMU_IRQ_DELIVERED;

 case APIC_DM_INIT:
 /* normal INIT IPI sent to processors */
 foreach_apic(apic_iter, deliver_bitmask,
  cpu_interrupt(apic_iter->cpu_env, CPU_INTERRUPT_INIT) );
 -return;
 +return QEMU_IRQ_DELIVERED;

 case APIC_DM_EXTINT:
 /* handled in I/O APIC code */
 break;

 default:
 -return;
 +return QEMU_IRQ_MASKED;
 }

 +ret = QEMU_IRQ_MASKED;
 foreach_apic(apic_iter, deliver_bitmask,
 - apic_set_irq(apic_iter, vector_num, trigger_mode) );
 +if (ret == QEMU_IRQ_MASKED)
 +ret = QEMU_IRQ_COALESCED;
 +if (apic_set_irq(apic_iter, vector_num,
 + trigger_mode) == QEMU_IRQ_DELIVERED) {
 +ret = QEMU_IRQ_DELIVERED;
 +}
 +);
 +return ret;
  }

 -void apic_deliver_irq(uint8_t dest, uint8_t dest_mode,
 -  uint8_t delivery_mode, uint8_t vector_num,
 -  uint8_t polarity, uint8_t trigger_mode)
 +int apic_deliver_irq(uint8_t dest, uint8_t dest_mode,
 + uint8_t delivery_mode, uint8_t vector_num,
 + uint8_t polarity, uint8_t trigger_mode)
  {
 uint32_t deliver_bitmask[MAX_APIC_WORDS];

 @@ -308,8 +315,8 @@ void apic_deliver_irq(uint8_t dest, uint8_t dest_mode,
  " polarity %d trigger_mode %d\n", __func__, dest, dest_mode,
 delivery_mode, vector_num, polarity, trigger_mode);
 apic_get_delivery_bitmask(deliver_bitmask, dest, dest_mode);
 -apic_bus_deliver(deliver_bitmask, delivery_mode, vector_num, polarity,
 - trigger_mode);
 +return apic_bus_deliver(deliver_bitmask, delivery_mode, vector_num,
 +polarity, trigger_mode);
  }

  void cpu_set_apic_base(CPUState *env, uint64_t val)
 @@ -402,22 +409,10 @@

[Qemu-devel] Re: [PATCH 11/16] hpet/rtc: Rework RTC IRQ replacement by HPET

2010-06-06 Thread Jan Kiszka

Blue Swirl wrote:
 On Sun, Jun 6, 2010 at 8:11 AM, Jan Kiszka jan.kis...@web.de wrote:
 From: Jan Kiszka jan.kis...@siemens.com

 Allow intercepting the RTC IRQ for the HPET legacy mode. Then push
 routing to IRQ8 completely into the HPET. This allows turning
 hpet_in_legacy_mode() into a private function. Furthermore, this stops
 the RTC from clearing IRQ8 even if the HPET is in control.

 This patch comes with a side effect: The RTC timers will no longer be
 stopped when there is no IRQ consumer, possibly causing a minor
 performance degradation. But as the guest may want to redirect the RTC to
 the SCI in that mode, it should normally disable unused IRQ sources
 anyway.

 Signed-off-by: Jan Kiszka jan.kis...@siemens.com
 ---
  hw/hpet.c|   42 +++---
  hw/hpet_emul.h   |4 
  hw/mc146818rtc.c |   54 
 +++---
  hw/mc146818rtc.h |4 +++-
  hw/mips_jazz.c   |2 +-
  hw/mips_malta.c  |2 +-
  hw/mips_r4k.c|2 +-
  hw/pc.c  |   14 --
  hw/ppc_prep.c|2 +-
  9 files changed, 65 insertions(+), 61 deletions(-)

 diff --git a/hw/hpet.c b/hw/hpet.c
 index 041dd84..d26cad5 100644
 --- a/hw/hpet.c
 +++ b/hw/hpet.c
 @@ -30,6 +30,7 @@
  #include "qemu-timer.h"
  #include "hpet_emul.h"
  #include "sysbus.h"
 +#include "mc146818rtc.h"

  //#define HPET_DEBUG
  #ifdef HPET_DEBUG
 @@ -58,6 +59,7 @@ typedef struct HPETState {
 SysBusDevice busdev;
 uint64_t hpet_offset;
 qemu_irq irqs[HPET_NUM_IRQ_ROUTES];
 +uint8_t rtc_irq_level;
 HPETTimer timer[HPET_NUM_TIMERS];

 /* Memory-mapped, software visible registers */
 @@ -69,12 +71,9 @@ typedef struct HPETState {

  static HPETState *hpet_statep;

 -uint32_t hpet_in_legacy_mode(void)
 +static uint32_t hpet_in_legacy_mode(HPETState *s)
  {
 -if (!hpet_statep) {
 -return 0;
 -}
 -return hpet_statep->config & HPET_CFG_LEGACY;
 +return s->config & HPET_CFG_LEGACY;
  }

  static uint32_t timer_int_route(struct HPETTimer *timer)
 @@ -166,12 +165,12 @@ static void update_irq(struct HPETTimer *timer)
  {
 int route;

 -if (timer->tn <= 1 && hpet_in_legacy_mode()) {
 +if (timer->tn <= 1 && hpet_in_legacy_mode(timer->state)) {
 /* if LegacyReplacementRoute bit is set, HPET specification requires
  * timer0 be routed to IRQ0 in NON-APIC or IRQ2 in the I/O APIC,
  * timer1 be routed to IRQ8 in NON-APIC or IRQ8 in the I/O APIC.
  */
 -route = (timer->tn == 0) ? 0 : 8;
 +route = (timer->tn == 0) ? 0 : RTC_ISA_IRQ;
 } else {
 route = timer_int_route(timer);
 }
 @@ -515,8 +514,10 @@ static void hpet_ram_writel(void *opaque, target_phys_addr_t addr,
 /* i8254 and RTC are disabled when HPET is in legacy mode */
 if (activating_bit(old_val, new_val, HPET_CFG_LEGACY)) {
 hpet_pit_disable();
 +qemu_irq_lower(s->irqs[RTC_ISA_IRQ]);
 } else if (deactivating_bit(old_val, new_val, HPET_CFG_LEGACY)) {
 hpet_pit_enable();
 +qemu_set_irq(s->irqs[RTC_ISA_IRQ], s->rtc_irq_level);
 }
 break;
 case HPET_CFG + 4:
 @@ -607,6 +608,30 @@ static void hpet_reset(DeviceState *d)
 count = 1;
  }

 +static void hpet_rtc_delivery_cb(qemu_irq irq, void *opaque, int n, int level,
 + int result)
 +{
 +qemu_irq orig_irq = opaque;
 +
 +qemu_irq_fire_delivery_cb(orig_irq, level, result);
 +}
 +
 +static void hpet_handle_rtc_irq(qemu_irq irq, void *opaque, int n, int level)
 +{
 +HPETState *s = FROM_SYSBUS(HPETState, opaque);
 +IRQMsg msg = {
 +.delivery_cb = hpet_rtc_delivery_cb,
 +.delivery_opaque = irq,
 +};
 +
 +s->rtc_irq_level = level;
 +if (hpet_in_legacy_mode(s)) {
 +qemu_irq_fire_delivery_cb(irq, level, QEMU_IRQ_MASKED);
 +} else {
 +qemu_set_irq_msg(s->irqs[RTC_ISA_IRQ], level, &msg);
 
 This is the problem with passing around stack-allocated objects: after
 this function finishes, s->irqs[RTC_ISA_IRQ].msg is a dangling pointer
 to some stack space.

s->irqs[RTC_ISA_IRQ].msg is NULL again by the time qemu_set_irq_msg
returns, so msg itself will not leak out of the qemu_irq subsystem.
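
A minimal sketch of the lifetime rule described above, assuming the
message pointer is attached to the IRQ only for the duration of the
handler call. All names here (IRQState, set_irq_msg, fire_delivery_cb,
and so on) are simplified, hypothetical stand-ins and not the actual
code from this series:

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the types discussed in the thread. */
typedef struct IRQMsg IRQMsg;
typedef struct IRQState IRQState;

struct IRQMsg {
    void (*delivery_cb)(IRQState *irq, void *opaque, int level, int result);
    void *delivery_opaque;
};

struct IRQState {
    void (*handler)(IRQState *irq, int level);
    IRQMsg *msg;            /* valid only while handler() is running */
};

enum { IRQ_MASKED, IRQ_COALESCED, IRQ_DELIVERED };

/* Attach msg for the duration of the handler call only. */
static void set_irq_msg(IRQState *irq, int level, IRQMsg *msg)
{
    irq->msg = msg;
    irq->handler(irq, level);
    irq->msg = NULL;        /* nothing outlives the call, so a stack msg is safe */
}

/* Called by a sink from inside its handler to report the outcome. */
static void fire_delivery_cb(IRQState *irq, int level, int result)
{
    if (irq->msg && irq->msg->delivery_cb) {
        irq->msg->delivery_cb(irq, irq->msg->delivery_opaque, level, result);
    }
}

static int last_result = -1;

/* Source-side callback: just remember what the sink reported. */
static void record_result(IRQState *irq, void *opaque, int level, int result)
{
    (void)irq;
    (void)opaque;
    (void)level;
    last_result = result;
}

/* Example sink: reports DELIVERED for a raised line, MASKED otherwise. */
static void sink_handler(IRQState *irq, int level)
{
    fire_delivery_cb(irq, level, level ? IRQ_DELIVERED : IRQ_MASKED);
}

/* Source side: the message lives on this function's stack. */
static void raise_with_feedback(IRQState *irq)
{
    IRQMsg msg = { .delivery_cb = record_result, .delivery_opaque = NULL };

    set_irq_msg(irq, 1, &msg);
}
```

The point of the sketch is that the callback can only fire while the
handler is on the call stack. A sink that wanted to report a result
later (an EOI, say) would need a heap-allocated message instead.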

Jan




[Qemu-devel] Re: [PATCH 00/16] HPET cleanups, fixes, enhancements

2010-06-06 Thread Jan Kiszka

Blue Swirl wrote:
 On Sun, Jun 6, 2010 at 8:10 AM, Jan Kiszka jan.kis...@web.de wrote:
 Second round, specifically addressing:
  - IRQMsg framework to refactor existing de-coalescing code
  - RTC IRQ output as GPIO pin (routed depending on HPET or -no-hpet)
  - ISA reservation for RTC IRQ

 If discussion around IRQMsg and de-coalescing happens to continue, I
 would suggest to merge patches 1..7 as they are likely uncontroversial
 and also fix bugs.
 
 Otherwise everything looks fine to me, but 10 and 11 had minor
 problems. Nice work!

Thanks for the quick feedback!

 
 I'd suppose one possible cleanup could be to use the message payload
 in place of apic_deliver_irq()?

Haven't looked into such things yet, the series is already long enough.
:) I could also imagine that we may avoid exporting the APIC MSI
functions for HPET use and instead provide a single MSI qemu_irq object,
pushing the vector information into the message payload.

Jan




[Qemu-devel] Re: [PATCHv3 1/2] virtio: support layout with avail ring before idx

2010-06-06 Thread Michael S. Tsirkin

On Sat, Jun 05, 2010 at 01:40:26PM +0930, Rusty Russell wrote:
 On Fri, 4 Jun 2010 09:12:05 pm Michael S. Tsirkin wrote:
  On Fri, Jun 04, 2010 at 08:46:49PM +0930, Rusty Russell wrote:
   I'm uncomfortable with moving a field.
   
   We haven't done that before and I wonder what will break with old code.
  
  With e.g. my patch, we only do this conditionally when the bit is negotiated.
 
 Of course, but see this change:
 
 commit ef688e151c00e5d529703be9a04fd506df8bc54e
 Author: Rusty Russell ru...@rustcorp.com.au
 Date:   Fri Jun 12 22:16:35 2009 -0600
 
 virtio: meet virtio spec by finalizing features before using device
 
 Virtio devices are supposed to negotiate features before they start using
 the device, but the current code doesn't do this.  This is because the
 driver's probe() function invariably has to add buffers to a virtqueue,
 or probe the disk (virtio_blk).
 
 This currently doesn't matter since no existing backend is strict about
 the feature negotiation.  But it's possible to imagine a future feature
 which completely changes how a device operates: in this case, we'd need
 to acknowledge it before using the device.
 
 Signed-off-by: Rusty Russell ru...@rustcorp.com.au
 
 Now, this isn't impossible to overcome: we know that if they use the ring
 before completing feature negotiation then they don't understand the new
 format.
 
 But we have to be aware of that on the qemu side.  Are we?

I think we are ok. virtqueue_init, which sets the avail/used pointers, is
called when we write the base address.  So we only need to be careful
not to change this feature bit after creating the rings.


   Should we instead just abandon the flags field and use last_used only?
   Or, more radically, put flags == last_used when the feature is on?
   
   Thoughts?
   Rusty.
  
  Hmm, e.g. with TX and virtio net, we almost never want interrupts,
  whatever the index value.
 
 Good point.  OK, I give in, I'll take your patch which moves the fields
 to the end.  Is that your preference?

Yes, I think so.
You mean PATCHv3 unchanged with 254 byte padding?

 Please be careful with the qemu side though...
 
 It's not inconceivable that I'll write that virtio cacheline simulator this
 (coming) week, too...
 
 Thanks.
 Rusty.

Re: [Qemu-devel] Re: [RFT][PATCH 07/15] qemu_irq: Add IRQ handlers with delivery feedback

2010-06-06 Thread Gleb Natapov

On Sun, Jun 06, 2010 at 10:07:48AM +0200, Jan Kiszka wrote:
 Gleb Natapov wrote:
  On Sun, Jun 06, 2010 at 09:39:04AM +0200, Jan Kiszka wrote:
  Gleb Natapov wrote:
  On Sat, Jun 05, 2010 at 02:04:01AM +0200, Jan Kiszka wrote:
  I'd like to also support EOI handling. When the guest clears the
  interrupt condition, the EOI callback would be called. This could occur
  much later than the IRQ delivery time. I'm not sure if we need the
  result code in that case.
 
  If any intermediate device (IOAPIC?) needs to be informed about either
  delivery or EOI also, it could create a proxy message with its
  callbacks in place. But we need then a separate opaque field (in
  addition to payload) to store the original message.
 
  struct IRQMsg {
   DeviceState *src;
   void (*delivery_cb)(IRQMsg *msg, int result);
   void (*eoi_cb)(IRQMsg *msg, int result);
   void *src_opaque;
   void *payload;
  };
  Extending the lifetime of IRQMsg objects beyond the delivery call stack
  means qemu_malloc/free for every delivery. I think it takes a _very_
  appealing reason to justify this. But so far I do not see any use case
  for eoi_cb at all.
 
  I dislike use of eoi for reinjecting missing interrupts since
  it eliminates use of the internal PIC/APIC queue of not yet delivered
  interrupts. PIC and APIC have an internal queue that can handle two elements:
  one is the delivered, but not yet acked interrupt in isr and another is
  the pending interrupt in irr. Using an eoi callback (or ack notifier as it's
  called inside the kernel) an interrupt will be considered coalesced even if irr
  is cleared, but no ack was received for the previously delivered interrupt.
  But ack notifiers actually have another use: device assignment. There is
  a plan to move device assignment from kernel to userspace and for that
  ack notifiers will have to be extended to userspace too. If so we can
  use them to do irq decoalescing as well. I doubt they should be part
  of IRQMsg though. Why not do what the kernel does: have a globally registered
  notifier based on irqchip/pin?
  I read this twice but I still don't get your plan. Do you like or
  dislike using EOI for de-coalescing? And how should these notifiers work?
 
  That's because I confused myself :) I _dislike_ them to be used, but
  since device assignment requires ack notifiers anyway, maybe it is better
  to introduce one mechanism for device assignment + de-coalescing instead
  of introducing two different mechanisms. Using ack notifiers should be
  easy: the RTC registers an ack notifier and keeps track of delivered interrupts.
  If the timer triggers after the previous irq was set, but before it was acked,
  the coalesced counter is incremented. In the ack notifier callback the coalesced
  counter is checked and if it is not zero a new irq is set.
 
 Ack notifier registrations and event deliveries still need to be routed.
 Piggy-backing this on IRQ messages may be unavoidable for that reason.
It is done in the kernel without piggy-backing.
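
The ack-notifier scheme quoted above can be sketched roughly as
follows. This is only an illustration of the counter logic, assuming a
single in-flight interrupt per source. RTCDecoalesce and the function
names are hypothetical and are neither QEMU nor KVM interfaces:

```c
#include <stdbool.h>

/* Hypothetical RTC-side bookkeeping; none of these names come from QEMU. */
typedef struct {
    bool irq_in_flight;     /* raised but not yet acked by the guest */
    unsigned coalesced;     /* ticks that arrived while one was in flight */
    unsigned injected;      /* interrupts actually raised (for illustration) */
} RTCDecoalesce;

static void rtc_raise(RTCDecoalesce *r)
{
    r->irq_in_flight = true;
    r->injected++;
}

/* Periodic timer fired. */
static void rtc_timer_tick(RTCDecoalesce *r)
{
    if (r->irq_in_flight) {
        r->coalesced++;     /* previous IRQ not acked yet: remember the tick */
    } else {
        rtc_raise(r);
    }
}

/* Ack notifier callback, as it would be registered for the RTC's irqchip/pin. */
static void rtc_ack_notify(RTCDecoalesce *r)
{
    r->irq_in_flight = false;
    if (r->coalesced > 0) {
        r->coalesced--;
        rtc_raise(r);       /* re-inject one missed tick */
    }
}
```

With three ticks arriving before the first ack, one interrupt is raised
immediately and the other two are replayed, one per ack.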

 
 Anyway, I'm going to post my HPET updates with the infrastructure for
 IRQMsg now. Maybe it's helpful to see the other option in reality.
 
One other thing to consider: the current approach does not always work.
Win2K3-64bit-smp and Win2k8-64bit-smp configure the RTC interrupt to be
broadcast to all cpus, but only the boot cpu does time calculation. With
the current approach, if the interrupt is delivered to at least one vcpu
it will not be considered coalesced, but if the cpu it was delivered to is
not the cpu that does time accounting then the clock will drift.
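
To make the broadcast problem concrete, here is a small sketch that
mirrors the aggregation in the apic_bus_deliver() hunk of patch 10/16
(MASKED with no target, DELIVERED if any target accepted, COALESCED
otherwise). The enum values and helper names are hypothetical, and
treating index 0 as the time-keeping CPU is an assumption made only
for the example:

```c
enum { IRQ_MASKED, IRQ_COALESCED, IRQ_DELIVERED };

/* Fold per-CPU results into one value the way the patch's loop does. */
static int aggregate_results(const int *per_cpu, int n)
{
    int ret = IRQ_MASKED;

    for (int i = 0; i < n; i++) {
        if (ret == IRQ_MASKED) {
            ret = IRQ_COALESCED;        /* at least one target exists */
        }
        if (per_cpu[i] == IRQ_DELIVERED) {
            ret = IRQ_DELIVERED;        /* any single delivery wins */
        }
    }
    return ret;
}

/* True when the aggregate hides a tick the time-keeping CPU never saw. */
static int tick_lost_unnoticed(const int *per_cpu, int n, int timekeeper)
{
    return aggregate_results(per_cpu, n) == IRQ_DELIVERED &&
           per_cpu[timekeeper] != IRQ_DELIVERED;
}
```

If the boot cpu coalesces the tick while a secondary cpu accepts it,
the aggregate is still "delivered", so the source sees no reason to
re-inject.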

--
Gleb.

Re: [Qemu-devel] Re: [RFT][PATCH 07/15] qemu_irq: Add IRQ handlers with delivery feedback

2010-06-06 Thread Jan Kiszka

Gleb Natapov wrote:
 On Sun, Jun 06, 2010 at 10:07:48AM +0200, Jan Kiszka wrote:
 Gleb Natapov wrote:
 On Sun, Jun 06, 2010 at 09:39:04AM +0200, Jan Kiszka wrote:
 Gleb Natapov wrote:
 On Sat, Jun 05, 2010 at 02:04:01AM +0200, Jan Kiszka wrote:
 I'd like to also support EOI handling. When the guest clears the
 interrupt condtion, the EOI callback would be called. This could occur
 much later than the IRQ delivery time. I'm not sure if we need the
 result code in that case.

 If any intermediate device (IOAPIC?) needs to be informed about either
 delivery or EOI also, it could create a proxy message with its
 callbacks in place. But we need then a separate opaque field (in
 addition to payload) to store the original message.

 struct IRQMsg {
  DeviceState *src;
  void (*delivery_cb)(IRQMsg *msg, int result);
  void (*eoi_cb)(IRQMsg *msg, int result);
  void *src_opaque;
  void *payload;
 };
 Extending the lifetime of IRQMsg objects beyond the delivery call stack
 means qemu_malloc/free for every delivery. I think it takes a _very_
 appealing reason to justify this. But so far I do not see any use case
 for eio_cb at all.

 I dislike the use of eoi for reinjecting missing interrupts since
 it eliminates use of the internal PIC/APIC queue of not yet delivered
 interrupts. PIC and APIC have an internal queue that can handle two
 elements: one is the delivered, but not yet acked interrupt in isr and
 another is the pending interrupt in irr. Using an eoi callback (or ack
 notifier as it's called inside the kernel) an interrupt will be
 considered coalesced even if irr is cleared, but no ack was received
 for the previously delivered interrupt.
 But ack notifiers actually has another use: device assignment. There is
 a plan to move device assignment from kernel to userspace and for that
 ack notifiers will have to be extended to userspace too. If so we can
 use them to do irq decoalescing as well. I doubt they should be part
 of IRQMsg though. Why not do what kernel does: have globally registered
 notifier based on irqchip/pin.
 I read this twice but I still don't get your plan. Do you like or
 dislike using EOI for de-coalescing? And how should these notifiers work?

 That's because I confused myself :) I _dislike_ them being used, but
 since device assignment requires ack notifiers anyway, maybe it is better
 to introduce one mechanism for device assignment + de-coalescing instead
 of introducing two different mechanisms. Using ack notifiers should be
 easy: the RTC registers an ack notifier and keeps track of delivered
 interrupts. If the timer triggers after the previous irq was set, but
 before it was acked, the coalesced counter is incremented. In the ack
 notifier callback the coalesced counter is checked, and if it is not
 zero a new irq is set.
 Ack notifier registrations and event deliveries still need to be routed.
 Piggy-backing this on IRQ messages may be unavoidable for that reason.
 It is done in the kernel without piggy-backing.

As it does not include any IRQ routers in front of the interrupt
controller. Maybe it works for x86, but it is no generic solution.

Also, periodic timer sources get no information about the fact that
their interrupt is masked somewhere along the path to the VCPUs and will
possibly replay countless IRQs when the masking ends, no?

 
 Anyway, I'm going to post my HPET updates with the infrastructure for
 IRQMsg now. Maybe it's helpful to see the other option in reality.

 One other thing to consider: the current approach does not always work.
 Win2K3-64bit-smp and Win2k8-64bit-smp configure the RTC interrupt to be
 broadcast to all cpus, but only the boot cpu does time calculation. With
 the current approach, if the interrupt is delivered to at least one vcpu
 it will not be considered coalesced, but if the cpu it was delivered to
 is not the cpu that does time accounting then the clock will drift.

That means we would have to fire callbacks per receiving CPU and report
its number back. Is there a way to find out if we are running such a
guest without an '-enable-win2k[38]-64bit-smp-rtc-drift-fix'?
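
A rough sketch of what per-CPU feedback would change, assuming the delivery
result is reported as a bitmask of the vcpus that accepted the interrupt.
The mask representation and both helper names are hypothetical; nothing like
this exists in the posted series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Current behaviour: coalesced only if no vcpu at all took the irq. */
static bool coalesced_any_cpu(uint32_t delivered_mask)
{
    return delivered_mask == 0;
}

/*
 * Per-CPU feedback: coalesced unless the vcpu that does the time
 * accounting (assumed to be known, e.g. the boot cpu) took the irq.
 */
static bool coalesced_for_cpu(uint32_t delivered_mask, unsigned cpu)
{
    assert(cpu < 32);
    return (delivered_mask & (UINT32_C(1) << cpu)) == 0;
}
```

With a broadcast that reaches only vcpus 1 and 2, the first check reports
the tick as delivered while the second one reports it as lost for vcpu 0,
which is the drift case described above.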

Jan




Re: [Qemu-devel] Re: [RFT][PATCH 07/15] qemu_irq: Add IRQ handlers with delivery feedback

2010-06-06 Thread Gleb Natapov

On Sun, Jun 06, 2010 at 12:10:07PM +0200, Jan Kiszka wrote:
 Gleb Natapov wrote:
  On Sun, Jun 06, 2010 at 10:07:48AM +0200, Jan Kiszka wrote:
  Gleb Natapov wrote:
  On Sun, Jun 06, 2010 at 09:39:04AM +0200, Jan Kiszka wrote:
  Gleb Natapov wrote:
  On Sat, Jun 05, 2010 at 02:04:01AM +0200, Jan Kiszka wrote:
  I'd like to also support EOI handling. When the guest clears the
  interrupt condtion, the EOI callback would be called. This could occur
  much later than the IRQ delivery time. I'm not sure if we need the
  result code in that case.
 
  If any intermediate device (IOAPIC?) needs to be informed about either
  delivery or EOI also, it could create a proxy message with its
  callbacks in place. But we need then a separate opaque field (in
  addition to payload) to store the original message.
 
  struct IRQMsg {
   DeviceState *src;
   void (*delivery_cb)(IRQMsg *msg, int result);
   void (*eoi_cb)(IRQMsg *msg, int result);
   void *src_opaque;
   void *payload;
  };
  Extending the lifetime of IRQMsg objects beyond the delivery call stack
  means qemu_malloc/free for every delivery. I think it takes a _very_
  appealing reason to justify this. But so far I do not see any use case
  for eio_cb at all.
 
  I dislike the use of eoi for reinjecting missing interrupts since
  it eliminates use of the internal PIC/APIC queue of not yet delivered
  interrupts. PIC and APIC have an internal queue that can handle two
  elements: one is the delivered, but not yet acked interrupt in isr and
  another is the pending interrupt in irr. Using an eoi callback (or ack
  notifier as it's called inside the kernel) an interrupt will be
  considered coalesced even if irr is cleared, but no ack was received
  for the previously delivered interrupt.
  But ack notifiers actually has another use: device assignment. There is
  a plan to move device assignment from kernel to userspace and for that
  ack notifiers will have to be extended to userspace too. If so we can
  use them to do irq decoalescing as well. I doubt they should be part
  of IRQMsg though. Why not do what kernel does: have globally registered
  notifier based on irqchip/pin.
  I read this twice but I still don't get your plan. Do you like or
  dislike using EOI for de-coalescing? And how should these notifiers work?
 
  That's because I confused myself :) I _dislike_ them being used, but
  since device assignment requires ack notifiers anyway, maybe it is better
  to introduce one mechanism for device assignment + de-coalescing instead
  of introducing two different mechanisms. Using ack notifiers should be
  easy: the RTC registers an ack notifier and keeps track of delivered
  interrupts. If the timer triggers after the previous irq was set, but
  before it was acked, the coalesced counter is incremented. In the ack
  notifier callback the coalesced counter is checked, and if it is not
  zero a new irq is set.
  Ack notifier registrations and event deliveries still need to be routed.
  Piggy-backing this on IRQ messages may be unavoidable for that reason.
  It is done in the kernel without piggy-backing.
 
 As it does not include any IRQ routers in front of the interrupt
 controller. Maybe it works for x86, but it is no generic solution.
 
x86 has an IRQ router in front of the interrupt controller, inside the pci
host bridge.

 Also, periodic timer sources get no information about the fact that
 their interrupt is masked somewhere along the path to the VCPUs and will
 possibly replay countless IRQs when the masking ends, no?
 
Correct, for that we have mask notifiers in the kernel. Gets uglier by the
minute.

  
  Anyway, I'm going to post my HPET updates with the infrastructure for
  IRQMsg now. Maybe it's helpful to see the other option in reality.
 
  One other thing to consider: the current approach does not always work.
  Win2K3-64bit-smp and Win2k8-64bit-smp configure the RTC interrupt to be
  broadcast to all cpus, but only the boot cpu does time calculation. With
  the current approach, if the interrupt is delivered to at least one vcpu
  it will not be considered coalesced, but if the cpu it was delivered to
  is not the cpu that does time accounting then the clock will drift.
 
 That means we would have to fire callbacks per receiving CPU and report
 its number back. Is there a way to find out if we are running such a
 guest without an '-enable-win2k[38]-64bit-smp-rtc-drift-fix'?
 
Not that I know of.

--
Gleb.

Re: [Qemu-devel] [PATCH 01/17] vl.c: Remove double include of netinet/in.h for Solaris

2010-06-06 Thread Andreas Färber


On 04.06.2010 at 18:08, jes.soren...@redhat.com wrote:


From: Jes Sorensen jes.soren...@redhat.com

vl.c: netinet/in.h is already included once above in the generic
POSIX section.

Signed-off-by: Jes Sorensen jes.soren...@redhat.com


Acked-by: Andreas Faerber afaer...@opensolaris.org


---
vl.c |1 -
1 files changed, 0 insertions(+), 1 deletions(-)

diff --git a/vl.c b/vl.c
index 417554f..7c4298a 100644
--- a/vl.c
+++ b/vl.c
@@ -70,7 +70,6 @@
 #include <sys/ethernet.h>
 #include <sys/sockio.h>
 #include <netinet/arp.h>
-#include <netinet/in.h>
 #include <netinet/in_systm.h>
 #include <netinet/ip.h>
 #include <netinet/ip_icmp.h> // must come after ip.h
--
1.6.5.2

[Qemu-devel] just one more qemu problem

2010-06-06 Thread Kristoffer Gustafsson

Hi.
Well, just one more problem I need help with, then I'll not post these
things to the list again.
I upgraded qemu, and got sound in the console using the curses option, so
windows works.
But when starting in a gnome terminal with
qemu windows.ovl -soundhw all -localtime
the microsoft windows screen comes up and windows tries to boot, but nothing
happens.
What have I missed now?
/Kristoffer
Kristoffer Gustafsson
Trelleborgsvägen 1b
514 33 Tranemo

tel: 0325-42093
mobile: 073-8226473
e-mail: k...@dreamwld.com
or
kristoffer_gustafs...@allmail.net

Re: [Qemu-devel] Re: [PATCH v2 2/2] vnc: threaded VNC server

2010-06-06 Thread Avi Kivity


On 06/05/2010 11:03 AM, Corentin Chary wrote:


So it's disabled by default? Sounds like a pretty cool and useful feature to me 
that should be enabled by default.
 

Because it does not work on windows (qemu-thread.c only uses
pthread) and because I don't want to break everything :)
   


One option is to disable vnc on Windows and let a Windows maintainer 
materialize and add the corresponding support.  Introducing more and 
more config options is not a good approach.


--
error compiling committee.c: too many arguments to function

Re: [Qemu-devel] [PATCH v2 2/2] vnc: threaded VNC server

2010-06-06 Thread Avi Kivity


On 06/04/2010 04:20 PM, Corentin Chary wrote:


+    if (vnc_trylock_display(vd)) {
+        vd->timer_interval = VNC_REFRESH_INTERVAL_BASE;
+        qemu_mod_timer(vd->timer, qemu_get_clock(rt_clock) +
+                       vd->timer_interval);
+        return;
+    }
+
     has_dirty = vnc_refresh_server_surface(vd);
+    vnc_unlock_display(vd);
   


This could delay the update by quite a bit, no?

A more elaborate approach would be to enqueue the refresh job into the 
queue.  May need the iothread enabled so we have qemu_mutex.


btw, I could not find other uses of vd->mutex, shouldn't it protect
against the work thread?



--
error compiling committee.c: too many arguments to function

Re: [Qemu-devel] Re: [PATCH v2 2/2] vnc: threaded VNC server

2010-06-06 Thread Corentin Chary

On Sun, Jun 6, 2010 at 3:54 PM, Avi Kivity a...@redhat.com wrote:
 On 06/05/2010 11:03 AM, Corentin Chary wrote:

 So it's disabled by default? Sounds like a pretty cool and useful feature
 to me that should be enabled by default.


 Because it does not work on windows (qemu-thread.c only uses
 pthread) and because I don't want to break everything :)


 One option is to disable vnc on Windows and let a Windows maintainer
 materialize and add the corresponding support.  Introducing more and more
 config options is not a good approach.

I think keeping the non-threaded code is a good thing (and there is
not much code). There are probably cases where you want to avoid
threads.



-- 
Corentin Chary
http://xf.iksaif.net

Re: [Qemu-devel] [PATCH v2 2/2] vnc: threaded VNC server

2010-06-06 Thread Corentin Chary

On Sun, Jun 6, 2010 at 4:11 PM, Avi Kivity a...@redhat.com wrote:
 On 06/04/2010 04:20 PM, Corentin Chary wrote:

 +    if (vnc_trylock_display(vd)) {
 +        vd->timer_interval = VNC_REFRESH_INTERVAL_BASE;
 +        qemu_mod_timer(vd->timer, qemu_get_clock(rt_clock) +
 +                       vd->timer_interval);
 +        return;
 +    }
 +
      has_dirty = vnc_refresh_server_surface(vd);
 +    vnc_unlock_display(vd);


 This could delay the update by quite a bit, no?

Yep, but it's far better than waiting for the lock because it doesn't slow
down the main thread.
I played the Big Buck Bunny trailer (33 sec) in mplayer with tight encoding:
- ~40 sec with the non-threaded server
- ~37 sec with a lock
- ~33 sec with a try_lock

 A more elaborate approach would be to enqueue the refresh job into the
 queue.  May need the iothread enabled so we have qemu_mutex.

Maybe, but I'd like to wait for the generic async work subsystem before
adding different kinds of jobs to the queue. And it's already a big
improvement over the current code :).

 btw, I could not find other uses of vd->mutex, shouldn't it protect against
 the work thread?

Check vnc-jobs.c, there is a qemu_mutex_lock(&vs->vd->mutex);

-- 
Corentin Chary
http://xf.iksaif.net

Re: [Qemu-devel] [PATCH v2 2/2] vnc: threaded VNC server

2010-06-06 Thread Avi Kivity


On 06/06/2010 05:48 PM, Corentin Chary wrote:

On Sun, Jun 6, 2010 at 4:11 PM, Avi Kivity a...@redhat.com wrote:
   

On 06/04/2010 04:20 PM, Corentin Chary wrote:
 

+    if (vnc_trylock_display(vd)) {
+        vd->timer_interval = VNC_REFRESH_INTERVAL_BASE;
+        qemu_mod_timer(vd->timer, qemu_get_clock(rt_clock) +
+                       vd->timer_interval);
+        return;
+    }
+
     has_dirty = vnc_refresh_server_surface(vd);
+    vnc_unlock_display(vd);

   

This could delay the update by quite a bit, no?
 

Yep, but it's far better than waiting for the lock because it doesn't slow
down the main thread.
I played the Big Buck Bunny trailer (33 sec) in mplayer with tight encoding:
- ~40 sec with the non-threaded server
- ~37 sec with a lock
- ~33 sec with a try_lock
   


Definitely, blocking the main thread is a no-no.


A more elaborate approach would be to enqueue the refresh job into the
queue.  May need the iothread enabled so we have qemu_mutex.
 

Maybe, but I'd like to wait for the generic async work subsystem before
adding different kinds of jobs to the queue. And it's already a big
improvement over the current code :).
   


Hm, ok.


btw, I could not find other uses of vd->mutex, shouldn't it protect against
the work thread?
 

Check vnc-jobs.c, there is a qemu_mutex_lock(&vs->vd->mutex);

   


Shouldn't it use vnc_lock_display()?  That's why I missed it.

--
error compiling committee.c: too many arguments to function

Re: [Qemu-devel] [PATCH v2 2/2] vnc: threaded VNC server

2010-06-06 Thread Corentin Chary

On Sun, Jun 6, 2010 at 4:53 PM, Avi Kivity a...@redhat.com wrote:
 On 06/06/2010 05:48 PM, Corentin Chary wrote:

 On Sun, Jun 6, 2010 at 4:11 PM, Avi Kivity a...@redhat.com wrote:


 On 06/04/2010 04:20 PM, Corentin Chary wrote:


 +    if (vnc_trylock_display(vd)) {
 +        vd->timer_interval = VNC_REFRESH_INTERVAL_BASE;
 +        qemu_mod_timer(vd->timer, qemu_get_clock(rt_clock) +
 +                       vd->timer_interval);
 +        return;
 +    }
 +
      has_dirty = vnc_refresh_server_surface(vd);
 +    vnc_unlock_display(vd);



 This could delay the update by quite a bit, no?


 Yep, but it's far better than waiting for the lock because it doesn't slow
 down the main thread.
 I played the Big Buck Bunny trailer (33 sec) in mplayer with tight encoding:
 - ~40 sec with the non-threaded server
 - ~37 sec with a lock
 - ~33 sec with a try_lock


 Definitely, blocking the main thread is a no-no.

 A more elaborate approach would be to enqueue the refresh job into the
 queue.  May need the iothread enabled so we have qemu_mutex.


 Maybe, but I'd like to wait for the generic async work subsystem before
 adding different kinds of jobs to the queue. And it's already a big
 improvement over the current code :).


 Hm, ok.

 btw, I could not find other uses of vd->mutex, shouldn't it protect
 against the work thread?


 Check vnc-jobs.c, there is a qemu_mutex_lock(&vs->vd->mutex);



 Shouldn't it use vnc_lock_display()?  That's why I missed it.

I didn't use vnc_lock_display because I didn't want to export it first.
Maybe I should also use vnc_lock_output() in vnc-jobs.c ...



-- 
Corentin Chary
http://xf.iksaif.net

Re: [Qemu-devel] [PATCH v6 5/6] Inter-VM shared memory PCI device

2010-06-06 Thread Avi Kivity


On 06/05/2010 12:44 PM, Blue Swirl wrote:

On Fri, Jun 4, 2010 at 9:45 PM, Cam Macdonell c...@cs.ualberta.ca wrote:
   

Support an inter-vm shared memory device that maps a shared-memory object as a
PCI device in the guest.  This patch also supports interrupts between guest by
communicating over a unix domain socket.  This patch applies to the qemu-kvm
repository.

-device ivshmem,size=<size in format accepted by -m>[,shm=<shm name>]

Interrupts are supported between multiple VMs by using a shared memory server
that is reached through a chardev socket.

-device ivshmem,size=<size in format accepted by -m>[,shm=<shm name>]
   [,chardev=<id>][,msi=on][,irqfd=on][,vectors=n][,role=peer|master]
-chardev socket,path=<path>,id=<id>

(shared memory server is qemu.git/contrib/ivshmem-server)

Sample programs and init scripts are in a git repo here:

 

Why is this KVM specific BTW, Posix SHM is available on many
platforms? What would happen if kvm_set_foobar functions were not
called when KVM is not being used? Is host eventfd support essential?
   


It's not kvm specific, it's 
atomic-ops-on-shared-memory-are-visible-as-atomic-ops specific, which is 
currently only available with kvm.  When tcg gains true smp support (and 
not just against other tcg threads) this can work with tcg as well.


I guess that needs a host with at least 32/64 bit CAS for 32/64 bit 
targets respectively, and double that if the target has DCAS.  Not sure 
how targets with ll/sc can be implemented, especially if there are 
limits as to what can go in between.
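
To make "atomic ops on shared memory" concrete, here is a minimal sketch of
the kind of guest-side code that depends on it: a 32-bit ownership word that
peers claim with compare-and-swap. It assumes C11 atomics, and in real use
the word would live in the shared region rather than in ordinary memory.
shm_try_claim() and shm_release() are hypothetical names, not part of
ivshmem.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Claim the word for my_id; succeeds only if it is currently free (0). */
static bool shm_try_claim(_Atomic uint32_t *owner, uint32_t my_id)
{
    uint32_t expected = 0;
    assert(my_id != 0);
    return atomic_compare_exchange_strong(owner, &expected, my_id);
}

/* Release the word, but only if my_id really is the current owner. */
static void shm_release(_Atomic uint32_t *owner, uint32_t my_id)
{
    uint32_t expected = my_id;
    atomic_compare_exchange_strong(owner, &expected, 0);
}
```

If one peer's compare-and-swap is not atomic with respect to the other
peer's, two guests can both believe they own the word, which is why the
emulator has to preserve the atomicity of the guest instruction.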


--
error compiling committee.c: too many arguments to function

[Qemu-devel] Re: [RFC] QMP: Introduce query-netdevices documentation

2010-06-06 Thread Avi Kivity


On 06/04/2010 05:06 PM, Miguel Di Ciurcio Filho wrote:

This introduces the protocol specification for querying information about
network devices available on a VM and a new monitor command that show the same
information.

Signed-off-by: Miguel Di Ciurcio Filho miguel.fi...@gmail.com
---
  qemu-monitor.hx |   69 +++
  1 files changed, 69 insertions(+), 0 deletions(-)

diff --git a/qemu-monitor.hx b/qemu-monitor.hx
index f6a94f2..8600129 100644
--- a/qemu-monitor.hx
+++ b/qemu-monitor.hx
@@ -1674,6 +1674,75 @@ show the various VLANs and the associated devices
  ETEXI

  STEXI
+@item info netdevices
+show information about network devices
+ETEXI
+SQMP
+query-netdevices
+
+
+Each device is represented by a json-object. The returned value is a json-array
+of all devices.
+
+Each json-object contain the following:
+
+- device: device name (json-string)
+- vlan: only present if the device is attached to a VLAN (json-int)
+- info: json-object containing the following:
+  - model: type of the device (json-string)
+  - Possible values: tap, socket, xen, slirp, dump,
+ vde, ne2k_pci, i82551, i82557b,
+ i82559er, rtl8139, e1000, pcnet,
+ virtio, dp83932, lan9118, mcf_fec,
+ xilinx-ethlite, lance, stellaris,
+ smc91c111, ne2k_isa, mv88w8618,
+ mipsnet, fseth, dp83932, usb
   


This casts the vlan model into concrete.  I thought we wanted to move 
away from it?  Instead have separate entries for host and guest devices.


--
error compiling committee.c: too many arguments to function

[Qemu-devel] Re: sun framebuffer selection (was option-rom)

2010-06-06 Thread Artyom Tarasenko

2010/6/6 Blue Swirl blauwir...@gmail.com:
 On Sat, Jun 5, 2010 at 11:10 PM, Bob Breuer breu...@mc.net wrote:
 Blue Swirl wrote:
  but again: should we have a new machine with cg14 or
 some switch to select TCX vs. cg14?


Why not just probe for both devices? OpenBIOS has the intention
to run one day on real hardware, doesn't it?

 Maybe the recently proposed machine subtype patches could help here.

How is the graphic card different from cpu or a disk drive?

 Well, let's try to figure out a method of selecting the framebuffer
 type.  I'll try to list some of the options, even if they might be
 ridiculous.

 1) Use the -vga option.  I know TCX and cg14 are not vga, but I think
 it's the closest existing command line option available.

 2) Switch based on the -g WxH option.  At the moment, the TCX emulation
 doesn't really handle anything other than 1024x768, so switch to cg14
 for other resolutions if supported.

 3) Use some other existing command line option, -device, -set or
 -global?  Might work, but the syntax may not be easy to remember.

 We don't have an equivalent of -chardev, -netdev and -drive for displays.

I guess that's only because the other emulated platforms don't have that
much choice (yet).
Why not just use the generic -device option?

 4) Machine subtype.

 5) New command line option.  Anything above might be better.

 6) New machine type.  Is it a big enough feature to demand it's own
 machine type?  Maybe, but see next option.

 7) Select as default video for SS-20.  The SS-10 and SS-600MP are
 already very similar.  This would allow for some differentiation between
 the machines, but there could still be an option to switch back to TCX.
 Note that TCX was really only available for the SS-4 and SS-5.


They are similar in qemu. But it's rather a bug than a feature. The
real SS-600 is a much more complex VME-bus machine.


 Is there anything else that I missed?

 Combined 7 & 6: make cg14 default for SS-20, add a deprecated
 compatibility machine for SS-20 with TCX.


 I'm going to go ahead with option 2 in the short term.  I'm inclined to
 narrow it down to options 1, 4, and 7.  I know that 7 would have
 backwards compatibility concerns.  The cg14 seems to have at least the
 same capabilities as TCX so there shouldn't be any loss of
 functionality.  Even though SS-20 is not the default machine, do you
 know of any OS that works with the sun4m implementation today but
 doesn't have a cg14 driver?  Possible downside to cg14 for video is that
 any acceleration is handled by the SX pixel processor which has no
 available documentation.  TCX also has some amount of unimplemented
 acceleration.

 It would be nice to use some basic device with well defined
 acceleration or just a frame buffer as default.


AFAIK the open source OSes don't use the cg14 acceleration anyway. So
we'll only have potential problems with Solaris and NeXTStep here.


-- 
Regards,
Artyom Tarasenko

solaris/sparc under qemu blog: http://tyom.blogspot.com/

[Qemu-devel] [PATCH] virtio-net: truncating packet

2010-06-06 Thread Michael S. Tsirkin

virtio net attempts to peek into the virtio queue to
determine whether we have enough space for the complete
packet to fit. However, it fails to account for the space
consumed by the virtio net header when it does this.
Under stress this results in a failure
with a message 'truncating packet'.

redhat bz 591494.

Signed-off-by: Michael S. Tsirkin m...@redhat.com
---
 hw/virtio-net.c |   15 +--
 1 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/hw/virtio-net.c b/hw/virtio-net.c
index 6a9d560..bf67e73 100644
--- a/hw/virtio-net.c
+++ b/hw/virtio-net.c
@@ -532,16 +532,17 @@ static ssize_t virtio_net_receive(VLANClientState *nc, const uint8_t *buf, size_
     if (!virtio_net_can_receive(&n->nic->nc))
         return -1;
 
-    if (!virtio_net_has_buffers(n, size))
+    /* hdr_len refers to the header we supply to the guest */
+    hdr_len = n->mergeable_rx_bufs ?
+        sizeof(struct virtio_net_hdr_mrg_rxbuf) : sizeof(struct virtio_net_hdr);
+
+
+    if (!virtio_net_has_buffers(n, size + hdr_len))
         return 0;
 
     if (!receive_filter(n, buf, size))
         return size;
 
-    /* hdr_len refers to the header we supply to the guest */
-    hdr_len = n->mergeable_rx_bufs ?
-        sizeof(struct virtio_net_hdr_mrg_rxbuf) : sizeof(struct virtio_net_hdr);
-
     offset = i = 0;
 
     while (offset < size) {
@@ -555,7 +556,9 @@ static ssize_t virtio_net_receive(VLANClientState *nc, const uint8_t *buf, size_
             virtqueue_pop(n->rx_vq, &elem) == 0) {
             if (i == 0)
                 return -1;
-            fprintf(stderr, "virtio-net truncating packet\n");
+            fprintf(stderr, "virtio-net truncating packet: "
+                    "offset %zd, size %zd, hdr %zd\n",
+                    offset, size, hdr_len);
             exit(1);
         }
 
-- 
1.7.1.12.g42b7f
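
For readers skimming the patch, the off-by-header problem can be shown with
a small stand-alone model. This is only a sketch: has_room() is a simplified
stand-in for the buffer check, the buffer sizes are invented, and the header
length is passed in as a plain number rather than taken from the real virtio
structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* True if the first buffers in the queue can hold `needed` bytes. */
static bool has_room(const size_t *buf_len, size_t nbufs, size_t needed)
{
    size_t total = 0;
    assert(buf_len != NULL || nbufs == 0);
    for (size_t i = 0; i < nbufs; i++) {
        total += buf_len[i];
        if (total >= needed) {
            return true;
        }
    }
    return false;
}

/* The fixed check: the guest-visible header needs space too. */
static bool can_receive_packet(const size_t *buf_len, size_t nbufs,
                               size_t pkt_len, size_t hdr_len)
{
    return has_room(buf_len, nbufs, pkt_len + hdr_len);
}
```

With 1524 bytes of buffer space, a 1514-byte packet passes the old check but
fails the new one once a 12-byte header is added, which is the case that
previously ended in 'truncating packet'.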

[Qemu-devel] Re: [RFC PATCH v4 3/3] block: add sheepdog driver for distributed storage support

2010-06-06 Thread MORITA Kazutaka

At Fri, 04 Jun 2010 13:04:00 +0200,
Kevin Wolf wrote:
 
 On 03.06.2010 18:23, MORITA Kazutaka wrote:
  +static void sd_aio_cancel(BlockDriverAIOCB *blockacb)
  +{
  + SheepdogAIOCB *acb = (SheepdogAIOCB *)blockacb;
  +
  + acb->canceled = 1;
  +}
 
  Does this provide the right semantics? You haven't really cancelled the
  request, but you pretend to. So you actually complete the request in the
  background and then throw the return code away.
 
  I seem to remember that posix-aio-compat.c waits at this point for
  completion of the requests, calls the callbacks and only afterwards
  returns from aio_cancel when no more requests are in flight.
 
  Or if you can really cancel requests, it would be the best option, of
  course.
 
  
  Sheepdog cannot cancel the requests which are already sent to the
  servers.  So, as you say, we pretend to cancel the requests without
  waiting for completion of them.  However, is there any situation
  where pretending to cancel causes problems in practice?
 
 I'm not sure how often it would happen in practice, but if the guest OS
 thinks the old value is on disk when in fact the new one is, this could
 lead to corruption. I think if it can happen, even without evidence that
 it actually does, it's already relevant enough.
 

I agree.

  To wait for completion of the requests here, we may need to create
  another thread for processing I/O like posix-aio-compat.c.
 
 I don't think you need a thread to get the same behaviour, you just need
 to call the fd handlers like in the main loop. It would probably be the
 first driver doing this, though, and it's not an often used code path,
 so it might be a bad idea.
 
 Maybe it's reasonable to just complete the request with -EIO? This way
 the guest couldn't make any assumption about the data written. On the
 other hand, it could be unhappy about failed requests, but that's
 probably better than corruption.
 

Completing with -EIO looks good to me.  Thanks for the advice.
I'll send an updated patch tomorrow.
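
As a rough sketch of what I understand by that (all names here are made up
for illustration and are not the actual driver code): cancel reports -EIO to
the caller right away, and the late reply from the server is dropped.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

typedef void CompletionFn(void *opaque, int ret);

/* Hypothetical request control block. */
typedef struct {
    CompletionFn *cb;
    void *opaque;
    bool canceled;
} SketchAIOCB;

/* Illustration-only callback target that records what was reported. */
typedef struct {
    int calls;
    int last_ret;
} Record;

static void record_result(void *opaque, int ret)
{
    Record *r = opaque;
    r->calls++;
    r->last_ret = ret;
}

/* Cancel: the request cannot be recalled, so fail it with -EIO now. */
static void sketch_aio_cancel(SketchAIOCB *acb)
{
    assert(acb != NULL && acb->cb != NULL);
    if (acb->canceled) {
        return;
    }
    acb->canceled = true;
    acb->cb(acb->opaque, -EIO);
}

/* Server reply arrived: report it unless the request was canceled. */
static void sketch_aio_done(SketchAIOCB *acb, int ret)
{
    assert(acb != NULL && acb->cb != NULL);
    if (acb->canceled) {
        return; /* caller already saw -EIO; never complete twice */
    }
    acb->cb(acb->opaque, ret);
}
```

The guest then cannot assume anything about whether the data reached the
disk, which avoids the corruption scenario at the price of a failed request.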

Regards,

Kazutaka

[Qemu-devel] [Bug 590456] [NEW] qemu forum (http://qemu-forum.ipi.fi/) not available

2010-06-06 Thread raymond

Public bug reported:

I have been receiving an error message since this week:
---
General Error
SQL ERROR [ mysql4 ]

Host 'www-hostnet' is blocked because of many connection errors; unblock
with 'mysqladmin flush-hosts' [1129]

An sql error occurred while fetching this page. Please contact an administrator 
if this problem persists.
---

It's not a bug in qemu itself, but in the project's related website.

** Affects: qemu
 Importance: Undecided
 Status: New

-- 
qemu forum (http://qemu-forum.ipi.fi/) not available
https://bugs.launchpad.net/bugs/590456
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.

Status in QEMU: New


[Qemu-devel] Re: [PATCH 6/6] apic: avoid using CPUState internals

2010-06-06 Thread Paolo Bonzini


On 06/05/2010 11:31 PM, Blue Swirl wrote:

Use only an opaque CPUState pointer and move the actual CPUState
contents handling to cpu.h and cpuid.c.

Set env->halted in pc.c and add a function to get the local APIC state
of the current CPU for the MMIO.

Signed-off-by: Blue Swirl blauwir...@gmail.com
---
  hw/apic.c   |   40 +++-
  hw/apic.h   |9 -
  hw/pc.c |   12 +++-
  target-i386/cpu.h   |   27 ---
  target-i386/cpuid.c |6 ++
  5 files changed, 56 insertions(+), 38 deletions(-)

diff --git a/hw/apic.c b/hw/apic.c
index 91c8d93..332c66e 100644
--- a/hw/apic.c
+++ b/hw/apic.c
@@ -95,7 +95,7 @@
  #define MSI_ADDR_SIZE   0x10

  struct APICState {
-CPUState *cpu_env;
+void *cpu_env;


I proposed having an opaque CPUState type in hw/ but it was rejected. 
But I don't think using a void pointer is any better.


Paolo

[Qemu-devel] Moving a memory mapping

2010-06-06 Thread Olivier Galibert

  Hi all,

I create a mmio memory mapping with cpu_register_io_memory followed by
cpu_register_physical_memory.  The host then twiddles a control
register that changes the mapping address.  How can I move the mapping
to the new address?  I could not find a cpu_unregister_physical_memory
function or equivalent.

Best,

  OG.

Re: [Qemu-devel] qemu:virtio-9p: [RFC] [PATCH 01/02] Send iounit to client for read/write operations

2010-06-06 Thread Venkateswararao Jujjuri (JV)

Sripathi Kodi wrote:
 On Tue,  1 Jun 2010 19:47:14 +0530
 M. Mohan Kumar mo...@in.ibm.com wrote:
 
 Compute iounit based on the host filesystem block size and pass it to
 client with open/create response. Also return iounit as statfs's f_bsize
 for optimal block size transfers.

 Signed-off-by: M. Mohan Kumar mo...@in.ibm.com
 ---
  hw/virtio-9p.c |   56 
 ++--
  hw/virtio-9p.h |3 +++
  2 files changed, 45 insertions(+), 14 deletions(-)

 diff --git a/hw/virtio-9p.c b/hw/virtio-9p.c
 index f087122..4357f1f 100644
 --- a/hw/virtio-9p.c
 +++ b/hw/virtio-9p.c
 @@ -1,4 +1,4 @@
 -/*
 +/*
   * Virtio 9p backend
   *
   * Copyright IBM, Corp. 2010
 @@ -269,6 +269,11 @@ static int v9fs_do_fsync(V9fsState *s, int fd)
  return s-ops-fsync(s-ctx, fd);
  }

 +static int v9fs_do_statfs(V9fsState *s, V9fsString *path, struct statfs 
 *stbuf)
 +{
 +return s-ops-statfs(s-ctx, path-data, stbuf);
 +}
 +
  static void v9fs_string_init(V9fsString *str)
  {
  str-data = NULL;
 @@ -1035,11 +1040,10 @@ static void v9fs_fix_path(V9fsString *dst, 
 V9fsString *src, int len)

  static void v9fs_version(V9fsState *s, V9fsPDU *pdu)
  {
 -int32_t msize;
  V9fsString version;
  size_t offset = 7;

 -pdu_unmarshal(pdu, offset, ds, msize, version);
 +pdu_unmarshal(pdu, offset, ds, s-msize, version);

  if (!strcmp(version.data, 9P2000.u)) {
  s-proto_version = V9FS_PROTO_2000U;
 @@ -1049,7 +1053,7 @@ static void v9fs_version(V9fsState *s, V9fsPDU *pdu)
  v9fs_string_sprintf(version, unknown);
  }

 -offset += pdu_marshal(pdu, offset, ds, msize, version);
 +offset += pdu_marshal(pdu, offset, ds, s-msize, version);
  complete_pdu(s, pdu, offset);

  v9fs_string_free(version);
 @@ -1304,6 +1308,20 @@ out:
  v9fs_walk_complete(s, vs, err);
  }

 +static int32_t get_iounit(V9fsState *s, V9fsString *name)
 +{
 +    struct statfs stbuf;
 +    int32_t iounit = 0;
 +
 +
 +    if (!v9fs_do_statfs(s, name, &stbuf)) {
 +        iounit = stbuf.f_bsize;
 +        iounit *= (s->msize - P9_IOHDRSZ)/stbuf.f_bsize;
 
 If (s->msize - P9_IOHDRSZ) is less than stbuf.f_bsize, iounit becomes
 zero. See below.
 
 +}
 +
 +return iounit;
 +}
 +
  static void v9fs_open_post_opendir(V9fsState *s, V9fsOpenState *vs, int err)
  {
  if (vs-fidp-dir == NULL) {
 @@ -1321,12 +1339,15 @@ out:

  static void v9fs_open_post_open(V9fsState *s, V9fsOpenState *vs, int err)
  {
 +int32_t iounit;
 +
  if (vs-fidp-fd == -1) {
  err = -errno;
  goto out;
  }

 -vs-offset += pdu_marshal(vs-pdu, vs-offset, Qd, vs-qid, 0);
 +iounit = get_iounit(s, vs-fidp-path);
 +vs-offset += pdu_marshal(vs-pdu, vs-offset, Qd, vs-qid, iounit);
  err = vs-offset;
  out:
  complete_pdu(s, vs-pdu, err);
 @@ -1800,11 +1821,16 @@ out:

  static void v9fs_post_create(V9fsState *s, V9fsCreateState *vs, int err)
  {
 +int32_t iounit;
 +
 +iounit = get_iounit(s, &vs->fidp->path);
 +
  if (err == 0) {
  v9fs_string_copy(&vs->fidp->path, &vs->fullname);
  stat_to_qid(&vs->stbuf, &vs->qid);

 -vs->offset += pdu_marshal(vs->pdu, vs->offset, "Qd", &vs->qid, 0);
 +vs->offset += pdu_marshal(vs->pdu, vs->offset, "Qd", &vs->qid,
 +iounit);

  err = vs->offset;
  }
 @@ -2295,23 +2321,25 @@ out:
  qemu_free(vs);
  }

 -static int v9fs_do_statfs(V9fsState *s, V9fsString *path, struct statfs 
 *stbuf)
 -{
 -return s->ops->statfs(s->ctx, path->data, stbuf);
 -}
 -
  static void v9fs_statfs_post_statfs(V9fsState *s, V9fsStatfsState *vs, int 
 err)
  {
 +int32_t bsize_factor;
 +
  if (err) {
  err = -errno;
  goto out;
  }

 +bsize_factor = (s->msize - P9_IOHDRSZ)/vs->stbuf.f_bsize;
 +if (!bsize_factor) {
 +bsize_factor = 1;
 +}
 
 Again, if (s->msize - P9_IOHDRSZ) is less than stbuf.f_bsize,
 bsize_factor becomes zero. The following divisions become divide by
 zero!

Yes, I think we should leave iounit alone: return it with open/create and
handle it as per the 9P protocol, and return whatever the file server gives
for stat and statfs.

- JV

 
 Thanks,
 Sripathi.
 
  vs->v9statfs.f_type = vs->stbuf.f_type;
  vs->v9statfs.f_bsize = vs->stbuf.f_bsize;
 -vs->v9statfs.f_blocks = vs->stbuf.f_blocks;
 -vs->v9statfs.f_bfree = vs->stbuf.f_bfree;
 -vs->v9statfs.f_bavail = vs->stbuf.f_bavail;
 +vs->v9statfs.f_bsize *= bsize_factor;
 +vs->v9statfs.f_blocks = vs->stbuf.f_blocks/bsize_factor;
 +vs->v9statfs.f_bfree = vs->stbuf.f_bfree/bsize_factor;
 +vs->v9statfs.f_bavail = vs->stbuf.f_bavail/bsize_factor;
  vs->v9statfs.f_files = vs->stbuf.f_files;
  vs->v9statfs.f_ffree = vs->stbuf.f_ffree;
  vs->v9statfs.fsid_val = (unsigned int) vs->stbuf.f_fsid.__val[0] |
 diff --git a/hw/virtio-9p.h b/hw/virtio-9p.h
 index 6b3d4a4..9264163 100644
 --- a/hw/virtio-9p.h
 +++ b/hw/virtio-9p.h
 @@ -72,6 +72,8 @@ enum p9_proto_version {
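
The review comments above note that the integer division by f_bsize yields zero whenever the usable payload (msize minus the I/O header) is smaller than the block size. A minimal sketch of one way to make that case explicit, assuming a header size of 24 bytes (the value of P9_IOHDRSZ is not shown in this excerpt) and using hypothetical helper names that are not part of the patch:

```c
#include <stdint.h>

/* Assumed value for illustration only; the real P9_IOHDRSZ is defined
 * in hw/virtio-9p.h and is not shown in this thread excerpt. */
#define SKETCH_IOHDRSZ 24

/*
 * Hypothetical helper: round the usable payload down to a multiple of
 * the block size. Returns 0 when no whole block fits, so the caller
 * reports "no preferred I/O unit" instead of a silently zeroed product.
 */
static int32_t sketch_iounit(int32_t msize, int32_t bsize)
{
    int32_t payload = msize - SKETCH_IOHDRSZ;

    if (bsize <= 0 || payload < bsize) {
        return 0;
    }
    return bsize * (payload / bsize);
}

/*
 * Hypothetical counterpart for the statfs scaling: the factor is used
 * as a divisor later, so it is clamped to at least 1.
 */
static int32_t sketch_bsize_factor(int32_t msize, int32_t bsize)
{
    int32_t payload = msize - SKETCH_IOHDRSZ;
    int32_t factor;

    if (bsize <= 0 || payload <= 0) {
        return 1;
    }
    factor = payload / bsize;
    return factor ? factor : 1;
}
```

With msize = 4096 and a 4096-byte block size the payload is 4072 bytes, so sketch_iounit() returns 0 and sketch_bsize_factor() returns 1. Whether zero is the right answer for iounit is exactly the question discussed in the replies above.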

[Qemu-devel] [Bug 590552] [NEW] New default network card doesn't work with tap networking

2010-06-06 Thread Gabriele Tozzi

Public bug reported:

Unfortunately, I can provide very little information.

Hope this will be useful anyway.

I've upgraded qemu using Debian apt to the latest unstable (QEMU PC
emulator version 0.12.4 (Debian 0.12.4+dfsg-2), Copyright (c) 2003-2008
Fabrice Bellard): it looks like at some point the default network card for
the -net nic option was switched to Intel gigabit instead of the good old
ne2k_pci.

I was using the -net tap -net nic options and my network stopped working.
When it is not working,
- tcpdump on the host shows that all packets are sent to and received from
the guest fine
- tcpdump on the guest shows that packets from the host are NOT received

Obviously, both the host tap interface and the guest eth0 interface, routing
tables, DNS, firewall, etc. are configured correctly.

Having banged my head for a while, I finally stopped the host and
started it again using -net nic,model=ne2k_pci option, then my network
magically started working again.

** Affects: qemu
 Importance: Undecided
 Status: New

-- 
New default network card doesn't work with tap networking
https://bugs.launchpad.net/bugs/590552
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.

Status in QEMU: New


[Qemu-devel] [Bug 590456] Re: qemu forum (http://qemu-forum.ipi.fi/) not available

2010-06-06 Thread Anthony Liguori

The QEMU forum is in no way officially associated with QEMU.  We have no
access to the server and no ability to fix it.  I don't even know who
runs it.

** Changed in: qemu
   Status: New => Invalid

-- 
qemu forum (http://qemu-forum.ipi.fi/) not available
https://bugs.launchpad.net/bugs/590456
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.

Status in QEMU: Invalid

Bug description:
I have been receiving this error message since this week:
---
General Error
SQL ERROR [ mysql4 ]

Host 'www-hostnet' is blocked because of many connection errors; unblock with 
'mysqladmin flush-hosts' [1129]

An sql error occurred while fetching this page. Please contact an administrator 
if this problem persists.
---

It's not a bug in QEMU itself, but in a project-related website.

[Qemu-devel] [PATCH v5 RESEND 2/4] Introduce cpu_physical_memory_get_dirty_range().

2010-06-06 Thread Yoshiaki Tamura

It checks the first row and puts the dirty addrs in the array.  If the first row
is empty, it skips to the first non-empty (dirty) row or the end addr, and puts
the skipped length in the first entry of the array.

Signed-off-by: Yoshiaki Tamura tamura.yoshi...@lab.ntt.co.jp
Signed-off-by: OHMURA Kei ohmura@lab.ntt.co.jp
---
 cpu-all.h |4 +++
 exec.c|   67 +
 2 files changed, 71 insertions(+), 0 deletions(-)

diff --git a/cpu-all.h b/cpu-all.h
index 77ef939..6ac8fc2 100644
--- a/cpu-all.h
+++ b/cpu-all.h
@@ -1019,6 +1019,10 @@ static inline void 
cpu_physical_memory_mask_dirty_range(ram_addr_t start,
  }
 }
 
+int cpu_physical_memory_get_dirty_range(ram_addr_t start, ram_addr_t end, 
+ram_addr_t *dirty_rams, int length,
+int dirty_flags);
+
 void cpu_physical_memory_reset_dirty(ram_addr_t start, ram_addr_t end,
  int dirty_flags);
 void cpu_tlb_update_dirty(CPUState *env);
diff --git a/exec.c b/exec.c
index 21299be..5fabb24 100644
--- a/exec.c
+++ b/exec.c
@@ -2023,6 +2023,73 @@ static inline void tlb_reset_dirty_range(CPUTLBEntry 
*tlb_entry,
 }
 }
 
+/* It checks the first row and puts dirty addrs in the array.
+   If the first row is empty, it skips to the first non-empty (dirty) row
+   or the end addr, and puts the length in the first entry of the array. */
+int cpu_physical_memory_get_dirty_range(ram_addr_t start, ram_addr_t end, 
+ram_addr_t *dirty_rams, int length,
+int dirty_flag)
+{
+unsigned long p = 0, page_number;
+ram_addr_t addr;
+ram_addr_t s_idx = (start >> TARGET_PAGE_BITS) / HOST_LONG_BITS;
+ram_addr_t e_idx = (end >> TARGET_PAGE_BITS) / HOST_LONG_BITS;
+int i, j, offset, dirty_idx = dirty_flag_to_idx(dirty_flag);
+
+/* mask bits before the start addr */
+offset = (start >> TARGET_PAGE_BITS) & (HOST_LONG_BITS - 1);
+cpu_physical_memory_sync_master(s_idx);
+p |= phys_ram_dirty[dirty_idx][s_idx] & ~((1UL << offset) - 1);
+
+if (s_idx == e_idx) {
+/* mask bits after the end addr */
+offset = (end >> TARGET_PAGE_BITS) & (HOST_LONG_BITS - 1);
+p &= (1UL << offset) - 1;
+}
+
+if (p == 0) {
+/* when the row is empty */
+ram_addr_t skip;
+if (s_idx == e_idx) {
+skip = end;
+} else {
+/* skip empty rows */
+while (s_idx < e_idx) {
+s_idx++;
+cpu_physical_memory_sync_master(s_idx);
+
+if (phys_ram_dirty[dirty_idx][s_idx] != 0) {
+break;
+}
+}
+skip = (s_idx * HOST_LONG_BITS * TARGET_PAGE_SIZE);
+}
+dirty_rams[0] = skip - start;
+i = 0;
+
+} else if (p == ~0UL) {
+/* when the row is fully dirtied */
+addr = start;
+for (i = 0; i < length; i++) {
+dirty_rams[i] = addr;
+addr += TARGET_PAGE_SIZE;
+}
+} else {
+/* when the row is partially dirtied */
+i = 0;
+do {
+j = ffsl(p) - 1;
+p &= ~(1UL << j);
+page_number = s_idx * HOST_LONG_BITS + j;
+addr = page_number * TARGET_PAGE_SIZE;
+dirty_rams[i] = addr;
+i++;
+} while (p != 0 && i < length);
+}
+
+return i;
+}
+
 /* Note: start and end must be within the same ram block.  */
 void cpu_physical_memory_reset_dirty(ram_addr_t start, ram_addr_t end,
  int dirty_flags)
-- 
1.7.0.31.g1df487
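
The partially dirtied case in the patch above walks one bitmap word with find-first-set and clears each bit as it goes. A minimal standalone sketch of that single-row scan, using hypothetical names and the GCC/Clang builtin __builtin_ctzl in place of ffsl() (an assumption made here for portability of the example, not what the patch uses):

```c
#include <limits.h>

#define SKETCH_LONG_BITS ((int)(sizeof(unsigned long) * CHAR_BIT))

/*
 * Hypothetical single-row scan: collect the page numbers of the set
 * bits in `row`, ignoring bits below `first_bit` (which must be in the
 * range 0 .. SKETCH_LONG_BITS - 1). `row_idx` is the index of the row
 * in the bitmap. At most `max` entries are written to `pages`; the
 * number of entries written is returned.
 */
static int sketch_scan_row(unsigned long row, int row_idx, int first_bit,
                           unsigned long *pages, int max)
{
    int n = 0;

    /* mask bits before the start offset */
    row &= ~((1UL << first_bit) - 1);

    while (row != 0 && n < max) {
        int j = __builtin_ctzl(row);   /* index of the lowest set bit */

        row &= row - 1;                /* clear that bit */
        pages[n++] = (unsigned long)row_idx * SKETCH_LONG_BITS + j;
    }
    return n;
}
```

The cost of the loop is proportional to the number of dirty bits in the word rather than to the word width, which is the property the series relies on when most rows are clean.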

[Qemu-devel] [PATCH v5 RESEND 4/4] Use cpu_physical_memory_get_dirty_range() to check multiple dirty pages.

2010-06-06 Thread Yoshiaki Tamura

Modifies ram_save_block() and ram_save_remaining() to use
cpu_physical_memory_get_dirty_range() to check multiple dirty and non-dirty
pages at once.

Signed-off-by: Yoshiaki Tamura tamura.yoshi...@lab.ntt.co.jp
Signed-off-by: OHMURA Kei ohmura@lab.ntt.co.jp
---
 arch_init.c |   57 +++--
 1 files changed, 35 insertions(+), 22 deletions(-)

diff --git a/arch_init.c b/arch_init.c
index 8e849a8..6f2ed29 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -108,32 +108,39 @@ static int ram_save_block(QEMUFile *f)
 static ram_addr_t current_addr = 0;
 ram_addr_t saved_addr = current_addr;
 ram_addr_t addr = 0;
-int bytes_sent = 0;
+ram_addr_t dirty_rams[HOST_LONG_BITS];
+int i, found, bytes_sent = 0;
 
 while (addr < last_ram_offset) {
-if (cpu_physical_memory_get_dirty(current_addr, MIGRATION_DIRTY_FLAG)) 
{
+if ((found = cpu_physical_memory_get_dirty_range(
+ current_addr, last_ram_offset, dirty_rams, HOST_LONG_BITS,
+ MIGRATION_DIRTY_FLAG))) {
 uint8_t *p;
 
-cpu_physical_memory_reset_dirty(current_addr,
-current_addr + TARGET_PAGE_SIZE,
-MIGRATION_DIRTY_FLAG);
-
-p = qemu_get_ram_ptr(current_addr);
-
-if (is_dup_page(p, *p)) {
-qemu_put_be64(f, current_addr | RAM_SAVE_FLAG_COMPRESS);
-qemu_put_byte(f, *p);
-bytes_sent = 1;
-} else {
-qemu_put_be64(f, current_addr | RAM_SAVE_FLAG_PAGE);
-qemu_put_buffer(f, p, TARGET_PAGE_SIZE);
-bytes_sent = TARGET_PAGE_SIZE;
+for (i = 0; i < found; i++) {
+ram_addr_t page_addr = dirty_rams[i];
+cpu_physical_memory_reset_dirty(page_addr,
+page_addr + TARGET_PAGE_SIZE,
+MIGRATION_DIRTY_FLAG);
+
+p = qemu_get_ram_ptr(page_addr);
+
+if (is_dup_page(p, *p)) {
+qemu_put_be64(f, page_addr | RAM_SAVE_FLAG_COMPRESS);
+qemu_put_byte(f, *p);
+bytes_sent++;
+} else {
+qemu_put_be64(f, page_addr | RAM_SAVE_FLAG_PAGE);
+qemu_put_buffer(f, p, TARGET_PAGE_SIZE);
+bytes_sent += TARGET_PAGE_SIZE;
+}
 }
 
 break;
+} else {
+addr += dirty_rams[0];
+current_addr = (saved_addr + addr) % last_ram_offset;
 }
-addr += TARGET_PAGE_SIZE;
-current_addr = (saved_addr + addr) % last_ram_offset;
 }
 
 return bytes_sent;
@@ -143,12 +150,18 @@ static uint64_t bytes_transferred;
 
 static ram_addr_t ram_save_remaining(void)
 {
-ram_addr_t addr;
+ram_addr_t addr = 0;
 ram_addr_t count = 0;
+ram_addr_t dirty_rams[HOST_LONG_BITS];
+int found = 0;
 
-for (addr = 0; addr < last_ram_offset; addr += TARGET_PAGE_SIZE) {
-if (cpu_physical_memory_get_dirty(addr, MIGRATION_DIRTY_FLAG)) {
-count++;
+while (addr < last_ram_offset) {
+if ((found = cpu_physical_memory_get_dirty_range(addr, last_ram_offset,
+dirty_rams, HOST_LONG_BITS, MIGRATION_DIRTY_FLAG))) {
+count += found;
+addr = dirty_rams[found - 1] + TARGET_PAGE_SIZE;
+} else {
+addr += dirty_rams[0];
 }
 }
 
-- 
1.7.0.31.g1df487
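
The callers in this patch depend on a two-way return convention: a non-zero return means that many dirty addresses were stored in the array, while a zero return means the first array entry holds the number of bytes to skip. A toy sketch of that calling convention follows, with one flag byte per page standing in for the real bitmap; all names and the 4 KiB page size are assumptions for illustration only:

```c
#define TOY_PAGE 4096UL

/*
 * Toy stand-in for the range query. If the page at `start` is clean,
 * store the length of the clean run in out[0] and return 0. Otherwise
 * store up to `max` consecutive dirty page addresses and return how
 * many were stored.
 */
static int toy_get_dirty_range(const unsigned char *dirty,
                               unsigned long start, unsigned long end,
                               unsigned long *out, int max)
{
    unsigned long addr = start;
    int n = 0;

    while (addr < end && !dirty[addr / TOY_PAGE]) {
        addr += TOY_PAGE;              /* skip clean pages */
    }
    if (addr != start) {
        out[0] = addr - start;         /* bytes the caller may skip */
        return 0;
    }
    while (addr < end && dirty[addr / TOY_PAGE] && n < max) {
        out[n++] = addr;               /* record a dirty page address */
        addr += TOY_PAGE;
    }
    return n;
}

/* Count dirty pages the way ram_save_remaining() consumes the results. */
static unsigned long toy_count_dirty(const unsigned char *dirty,
                                     unsigned long end)
{
    unsigned long out[4], addr = 0, count = 0;

    while (addr < end) {
        int found = toy_get_dirty_range(dirty, addr, end, out, 4);

        if (found) {
            count += found;
            addr = out[found - 1] + TOY_PAGE;
        } else {
            addr += out[0];
        }
    }
    return count;
}
```

Every call either reports at least one dirty page or a non-zero skip length, so the caller's loop always makes progress.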

[Qemu-devel] [PATCH v5 RESEND 1/4] Modify DIRTY_FLAG value and introduce DIRTY_IDX to use as indexes of bit-based phys_ram_dirty.

2010-06-06 Thread Yoshiaki Tamura

Replaces byte-based phys_ram_dirty bitmap with four (MASTER, VGA, CODE,
MIGRATION) bit-based phys_ram_dirty bitmap.  On allocation, it sets all bits in
the bitmap.  It uses ffs() to convert DIRTY_FLAG to DIRTY_IDX.

Modifies wrapper functions for byte-based phys_ram_dirty bitmap to bit-based
phys_ram_dirty bitmap.  MASTER works as a buffer, and upon get_dirty() or
get_dirty_flags(), it calls cpu_physical_memory_sync_master() to update VGA and
MIGRATION.

Signed-off-by: Yoshiaki Tamura tamura.yoshi...@lab.ntt.co.jp
---
 cpu-all.h |  128 -
 exec.c|   15 --
 qemu-common.h |3 +
 3 files changed, 121 insertions(+), 25 deletions(-)

diff --git a/cpu-all.h b/cpu-all.h
index 77eaf85..77ef939 100644
--- a/cpu-all.h
+++ b/cpu-all.h
@@ -37,6 +37,9 @@
 
 #include "softfloat.h"
 
+/* to use ffs in flag_to_idx() */
+#include <strings.h>
+
 #if defined(HOST_WORDS_BIGENDIAN) != defined(TARGET_WORDS_BIGENDIAN)
 #define BSWAP_NEEDED
 #endif
@@ -859,7 +862,6 @@ target_phys_addr_t cpu_get_phys_page_debug(CPUState *env, 
target_ulong addr);
 /* memory API */
 
 extern int phys_ram_fd;
-extern uint8_t *phys_ram_dirty;
 extern ram_addr_t ram_size;
 extern ram_addr_t last_ram_offset;
 
@@ -884,51 +886,137 @@ extern int mem_prealloc;
 /* Set if TLB entry is an IO callback.  */
 #define TLB_MMIO        (1 << 5)
 
-#define VGA_DIRTY_FLAG   0x01
-#define CODE_DIRTY_FLAG  0x02
-#define MIGRATION_DIRTY_FLAG 0x08
+/* Use DIRTY_IDX as indexes of bit-based phys_ram_dirty. */
+#define MASTER_DIRTY_IDX0
+#define VGA_DIRTY_IDX   1
+#define CODE_DIRTY_IDX  2
+#define MIGRATION_DIRTY_IDX 3
+#define NUM_DIRTY_IDX   4
+
+#define MASTER_DIRTY_FLAG    (1 << MASTER_DIRTY_IDX)
+#define VGA_DIRTY_FLAG       (1 << VGA_DIRTY_IDX)
+#define CODE_DIRTY_FLAG      (1 << CODE_DIRTY_IDX)
+#define MIGRATION_DIRTY_FLAG (1 << MIGRATION_DIRTY_IDX)
+
+extern unsigned long *phys_ram_dirty[NUM_DIRTY_IDX];
+
+static inline int dirty_flag_to_idx(int flag)
+{
+return ffs(flag) - 1;
+}
+
+static inline int dirty_idx_to_flag(int idx)
+{
+return 1 << idx;
+}
 
 /* read dirty bit (return 0 or 1) */
 static inline int cpu_physical_memory_is_dirty(ram_addr_t addr)
 {
-return phys_ram_dirty[addr >> TARGET_PAGE_BITS] == 0xff;
+unsigned long mask;
+ram_addr_t index = (addr >> TARGET_PAGE_BITS) / HOST_LONG_BITS;
+int offset = (addr >> TARGET_PAGE_BITS) & (HOST_LONG_BITS - 1);
+ 
+mask = 1UL << offset;
+return (phys_ram_dirty[MASTER_DIRTY_IDX][index] & mask) == mask;
+}
+
+static inline void cpu_physical_memory_sync_master(ram_addr_t index)
+{
+if (phys_ram_dirty[MASTER_DIRTY_IDX][index]) {
+phys_ram_dirty[VGA_DIRTY_IDX][index]
+|=  phys_ram_dirty[MASTER_DIRTY_IDX][index];
+phys_ram_dirty[MIGRATION_DIRTY_IDX][index]
+|=  phys_ram_dirty[MASTER_DIRTY_IDX][index];
+phys_ram_dirty[MASTER_DIRTY_IDX][index] = 0UL;
+}
 }
 
 static inline int cpu_physical_memory_get_dirty_flags(ram_addr_t addr)
 {
-return phys_ram_dirty[addr >> TARGET_PAGE_BITS];
+ unsigned long mask;
+ ram_addr_t index = (addr >> TARGET_PAGE_BITS) / HOST_LONG_BITS;
+ int offset = (addr >> TARGET_PAGE_BITS) & (HOST_LONG_BITS - 1);
+ int ret = 0, i;
+ 
+ mask = 1UL << offset;
+ cpu_physical_memory_sync_master(index);
+
+ for (i = VGA_DIRTY_IDX; i <= MIGRATION_DIRTY_IDX; i++) {
+ if (phys_ram_dirty[i][index] & mask) {
+ ret |= dirty_idx_to_flag(i);
+ }
+ }
+ 
+ return ret;
+}
+
+static inline int cpu_physical_memory_get_dirty_idx(ram_addr_t addr,
+int dirty_idx)
+{
+unsigned long mask;
+ram_addr_t index = (addr >> TARGET_PAGE_BITS) / HOST_LONG_BITS;
+int offset = (addr >> TARGET_PAGE_BITS) & (HOST_LONG_BITS - 1);
+
+mask = 1UL << offset;
+cpu_physical_memory_sync_master(index);
+return (phys_ram_dirty[dirty_idx][index] & mask) == mask;
 }
 
 static inline int cpu_physical_memory_get_dirty(ram_addr_t addr,
 int dirty_flags)
 {
-return phys_ram_dirty[addr >> TARGET_PAGE_BITS] & dirty_flags;
+return cpu_physical_memory_get_dirty_idx(addr,
+ dirty_flag_to_idx(dirty_flags));
 }
 
 static inline void cpu_physical_memory_set_dirty(ram_addr_t addr)
 {
-phys_ram_dirty[addr >> TARGET_PAGE_BITS] = 0xff;
+unsigned long mask;
+ram_addr_t index = (addr >> TARGET_PAGE_BITS) / HOST_LONG_BITS;
+int offset = (addr >> TARGET_PAGE_BITS) & (HOST_LONG_BITS - 1);
+
+mask = 1UL << offset;
+phys_ram_dirty[MASTER_DIRTY_IDX][index] |= mask;
 }
 
-static inline int cpu_physical_memory_set_dirty_flags(ram_addr_t addr,
-  int dirty_flags)
+static inline void cpu_physical_memory_set_dirty_range(ram_addr_t addr,
+   unsigned long mask)
 {
-return
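
The commit message above describes two mechanisms: converting a single-bit DIRTY_FLAG to a bitmap index with ffs(), and treating MASTER as a buffer whose pending bits are folded into VGA and MIGRATION on demand. A self-contained sketch of both, with hypothetical sk_ names and a caller-supplied bitmap array instead of the global phys_ram_dirty:

```c
#include <stddef.h>
#include <strings.h>   /* ffs() */

enum {
    SK_MASTER_IDX,
    SK_VGA_IDX,
    SK_CODE_IDX,
    SK_MIGRATION_IDX,
    SK_NUM_IDX
};

/* A single-bit flag maps to the position of that bit. */
static int sk_flag_to_idx(int flag)
{
    return ffs(flag) - 1;
}

static int sk_idx_to_flag(int idx)
{
    return 1 << idx;
}

/*
 * Fold the pending MASTER bits of one row into the VGA and MIGRATION
 * bitmaps, then clear MASTER. The CODE bitmap is left untouched, as in
 * the patch.
 */
static void sk_sync_master(unsigned long *bitmaps[SK_NUM_IDX], size_t row)
{
    unsigned long pending = bitmaps[SK_MASTER_IDX][row];

    if (pending) {
        bitmaps[SK_VGA_IDX][row] |= pending;
        bitmaps[SK_MIGRATION_IDX][row] |= pending;
        bitmaps[SK_MASTER_IDX][row] = 0UL;
    }
}
```

Buffering writes in MASTER means a store only has to set one bit, and the per-client bitmaps are brought up to date when a reader asks for them.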

[Qemu-devel] [PATCH v5 RESEND 3/4] Use cpu_physical_memory_set_dirty_range() to update phys_ram_dirty.

2010-06-06 Thread Yoshiaki Tamura

Modifies kvm_physical_sync_dirty_bitmap to use
cpu_physical_memory_set_dirty_range() to update the row of the bit-based
phys_ram_dirty bitmap at once.

Signed-off-by: OHMURA Kei ohmura@lab.ntt.co.jp
Signed-off-by: Yoshiaki Tamura tamura.yoshi...@lab.ntt.co.jp
---
 kvm-all.c |   24 +---
 1 files changed, 9 insertions(+), 15 deletions(-)

diff --git a/kvm-all.c b/kvm-all.c
index c238f54..0d29798 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -290,8 +290,8 @@ static int kvm_get_dirty_pages_log_range(unsigned long 
start_addr,
  unsigned long offset,
  unsigned long mem_size)
 {
-unsigned int i, j;
-unsigned long page_number, addr, addr1, c;
+unsigned int i;
+unsigned long page_number, addr, addr1;
 ram_addr_t ram_addr;
 unsigned int len = ((mem_size / TARGET_PAGE_SIZE) + HOST_LONG_BITS - 1) /
 HOST_LONG_BITS;
@@ -302,23 +302,17 @@ static int kvm_get_dirty_pages_log_range(unsigned long 
start_addr,
  */
 for (i = 0; i < len; i++) {
 if (bitmap[i] != 0) {
-c = leul_to_cpu(bitmap[i]);
-do {
-j = ffsl(c) - 1;
-c &= ~(1ul << j);
-page_number = i * HOST_LONG_BITS + j;
-addr1 = page_number * TARGET_PAGE_SIZE;
-addr = offset + addr1;
-ram_addr = cpu_get_physical_page_desc(addr);
-cpu_physical_memory_set_dirty(ram_addr);
-} while (c != 0);
+page_number = i * HOST_LONG_BITS;
+addr1 = page_number * TARGET_PAGE_SIZE;
+addr = offset + addr1;
+ram_addr = cpu_get_physical_page_desc(addr);
+cpu_physical_memory_set_dirty_range(ram_addr,
+leul_to_cpu(bitmap[i]));
 }
 }
 return 0;
 }
 
-#define ALIGN(x, y)  (((x)+(y)-1) & ~((y)-1))
-
 /**
  * kvm_physical_sync_dirty_bitmap - Grab dirty bitmap from kernel space
  * This function updates qemu's dirty bitmap using 
cpu_physical_memory_set_dirty().
@@ -343,7 +337,7 @@ static int 
kvm_physical_sync_dirty_bitmap(target_phys_addr_t start_addr,
 break;
 }
 
-size = ALIGN(((mem->memory_size) >> TARGET_PAGE_BITS), HOST_LONG_BITS) 
/ 8;
+size = BITMAP_SIZE(mem->memory_size);
 if (!d.dirty_bitmap) {
 d.dirty_bitmap = qemu_malloc(size);
 } else if (size > allocated_size) {
-- 
1.7.0.31.g1df487
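
The definition of BITMAP_SIZE() is not shown in this excerpt. Assuming it is equivalent to the ALIGN expression it replaces in the patch above, the computation can be sketched as follows; the sk_ names and the 4 KiB page size are assumptions for illustration:

```c
#include <limits.h>
#include <stddef.h>

#define SK_PAGE_BITS 12   /* assumed 4 KiB target pages */
#define SK_LONG_BITS (sizeof(unsigned long) * CHAR_BIT)
#define SK_ALIGN(x, y) (((x) + (y) - 1) & ~((y) - 1))

/*
 * Bytes needed for a one-bit-per-page bitmap covering `mem_size` bytes,
 * rounded up to a whole number of unsigned longs so that the bitmap can
 * be processed one row (word) at a time.
 */
static size_t sk_bitmap_size(size_t mem_size)
{
    size_t pages = mem_size >> SK_PAGE_BITS;

    return SK_ALIGN(pages, SK_LONG_BITS) / CHAR_BIT;
}
```

Rounding up to whole words is what lets kvm_get_dirty_pages_log_range() hand each bitmap[i] to cpu_physical_memory_set_dirty_range() as a complete row.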

[Qemu-devel] [PATCH v5 RESEND 0/4] Introduce bit-based phys_ram_dirty, and bit-based dirty page checker.

2010-06-06 Thread Yoshiaki Tamura

Currently, dirty and non-dirty pages are checked one by one.  When most of the
memory is not dirty, checking several pages at a time should be much faster
than checking them one by one.  We introduced a bit-based phys_ram_dirty for
VGA, CODE, MIGRATION and MASTER, and cpu_physical_memory_get_dirty_range(),
for this purpose.

The following numbers show the speedup from the bit-based phys_ram_dirty.  The
speedup grows as the number of rows whose contents are 0 gets larger.

Test Environment:
CPU: 4x Intel Xeon Quad Core 2.66GHz
Mem size: 96GB

Host OS: CentOS (kernel 2.6.33)
Guest OS: Debian/GNU Linux lenny (kernel 2.6.26)
Guest Mem size: 512MB

Conditions of experiments are as follows:
Cond1: Guest OS periodically dirties 256MB of contiguous pages.
Cond2: Guest OS periodically creates 256MB of alternating dirty and non-dirty
pages.
Cond3: Guest OS reads a 1GB file, which is bigger than memory.
Cond4: Guest OS writes a 1GB file, which is bigger than memory.

Experimental results:
Cond1: 5 ~ 83 times speed up
Cond2: 5 ~ 52 times speed up
Cond3: 5 ~ 132 times speed up
Cond4: 5 ~ 57 times speed up

Changes from v4 to v5 are:
- Rebased to HEAD (0ffbba357c557d9fa5caf9476878a4b9c155a614)
- Use BITMAP_SIZE() in kvm_physical_sync_dirty_bitmap() (3/4)

Changes from v3 to v4 are:

- Merged {1,2,3}/6 to compile correctly.
- Fix setting bits after phys_ram_dirty allocation.
- renamed DIRTY_FLAG and DIRTY_IDX converter function.

Changes from v2 to v3 are:

- Change FLAGS value to (1,2,4,8), and add IDX (0,1,2,3)
- Use ffs to convert FLAGS to IDX.
- Add a helper function which takes IDX.
- Change the behavior of MASTER as a buffer.
- Change dirty bitmap access to a loop.
- Add brace after if ()

Yoshiaki Tamura (4):
  Modify DIRTY_FLAG value and introduce DIRTY_IDX to use as indexes of
bit-based phys_ram_dirty.
  Introduce cpu_physical_memory_get_dirty_range().
  Use cpu_physical_memory_set_dirty_range() to update phys_ram_dirty.
  Use cpu_physical_memory_get_dirty_range() to check multiple dirty
pages.

 arch_init.c   |   57 +++-
 cpu-all.h |  132 -
 exec.c|   82 +--
 kvm-all.c |   24 --
 qemu-common.h |3 +
 5 files changed, 236 insertions(+), 62 deletions(-)

[Qemu-devel] Few Questions about QEMU JSON

2010-06-06 Thread akshay st

Hello,
Basically I want to separate QEMU (instruction translation, hardware emulation
drivers, etc.) from the simulator (UI, events, etc.). Someone suggested that I
use the JSON mechanism. I want to understand more about JSON; can you please
give me some insight? If there is any document or similar, it would be helpful.
Also, is it possible to separate QEMU and the simulator? If yes, can we use JSON?

Warm Regards,
Akshay
