date:20110914

[Qemu-devel] [PATCH] Fix subtle integer overflow bug in memory API

2011-09-14 Thread David Gibson

It is quite common to have a MemoryRegion with size of INT64_MAX.
When processing alias regions in render_memory_region() it's quite
easy to find a case where it will construct a temporary AddrRange with
a non-zero start, and size still of INT64_MAX.  This means attempting
to compute the end of such a range as start + size will result in
signed integer overflow.

This integer overflow means that addrrange_intersects() can
incorrectly report regions as not intersecting when they do.  For
example consider the case of address ranges {0x100,
0x7fff} and {0x1001000, 0x1000} where the second
is in fact included completely in the first.

This patch rearranges addrrange_intersects() to avoid the integer
overflow, correcting this behaviour.

Signed-off-by: David Gibson da...@gibson.dropbear.id.au
---
 memory.c |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/memory.c b/memory.c
index 57f0fa4..101b67c 100644
--- a/memory.c
+++ b/memory.c
@@ -55,8 +55,8 @@ static AddrRange addrrange_shift(AddrRange range, int64_t delta)
 
 static bool addrrange_intersects(AddrRange r1, AddrRange r2)
 {
-    return (r1.start >= r2.start && r1.start < r2.start + r2.size)
-        || (r2.start >= r1.start && r2.start < r1.start + r1.size);
+    return (r1.start >= r2.start && (r1.start - r2.start) < r2.size)
+        || (r2.start >= r1.start && (r2.start - r1.start) < r1.size);
 }
 
 static AddrRange addrrange_intersection(AddrRange r1, AddrRange r2)
-- 
1.7.5.4
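
To illustrate the rearrangement in the patch above, here is a minimal standalone sketch. The `AddrRange` struct only mirrors the two fields the patch uses; `wrapped_end()`, `naive_intersects()`, `addrrange_demo()` and the sample ranges are made up for this illustration. Signed overflow is undefined behaviour in C, so the old behaviour is emulated with unsigned arithmetic, assuming the usual two's complement wraparound when converting back to `int64_t`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in for the AddrRange used in memory.c (assumption). */
typedef struct {
    int64_t start;
    int64_t size;
} AddrRange;

/* Emulate the wrapped result of start + size.  The unsigned addition is
 * well defined; converting back to int64_t wraps on common compilers. */
static int64_t wrapped_end(AddrRange r)
{
    return (int64_t)((uint64_t)r.start + (uint64_t)r.size);
}

/* Old form: compares against an end address that may have wrapped. */
static bool naive_intersects(AddrRange r1, AddrRange r2)
{
    return (r1.start >= r2.start && r1.start < wrapped_end(r2))
        || (r2.start >= r1.start && r2.start < wrapped_end(r1));
}

/* Patched form: compares the distance between the two starts with the
 * size, so no end address is ever computed. */
static bool addrrange_intersects(AddrRange r1, AddrRange r2)
{
    return (r1.start >= r2.start && (r1.start - r2.start) < r2.size)
        || (r2.start >= r1.start && (r2.start - r1.start) < r1.size);
}

/* Self-check with made-up ranges: the second lies inside the first. */
void addrrange_demo(void)
{
    AddrRange big = { 0x1000, INT64_MAX };
    AddrRange small = { 0x2000, 0x1000 };

    assert(addrrange_intersects(big, small));
    assert(!naive_intersects(big, small)); /* wrapped end hides the overlap */
}
```

The subtraction in the patched form cannot overflow as long as both starts are non-negative, because it is only evaluated when the left operand is the larger one.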

Re: [Qemu-devel] [PATCH v3 5/6] vga: Use linear mapping + dirty logging in chain 4 memory access mode

2011-09-14 Thread Avi Kivity


On 09/13/2011 10:39 PM, Blue Swirl wrote:


  Here is the problem: Both the vram and the ISA range get mapped into
  system address space, but the former eclipses the latter as it shows up
  earlier in the list and has the same priority. This picture changes with
  the chain-4 alias which has prio 2, thus maps over the vram.

  It looks to me like the ISA address space is either misplaced at
  0x8000 or is not supposed to be mapped at all on PPC. Comments?

Since there is no PCI-ISA bridge, ISA address space shouldn't exist.


Where does the vga device sit then?

--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.

Re: [Qemu-devel] [PATCH] hid: vmstat fix

2011-09-14 Thread Paolo Bonzini


On 09/14/2011 05:03 AM, TeLeMan wrote:

The commit "usb/hid: add hid_pointer_activate, use it" used
HIDMouseState.mouse_grabbed in hid_pointer_activate(), so
mouse_grabbed should be added into vmstat.


Does this fix a bug?  qemu_activate_mouse_event_handler is meant to be 
called once per execution of the VM, it is not guest state.


Paolo

[Qemu-devel] [PATCH 0/3] virtio-serial: Bug fix, add stats for bytes transferred

2011-09-14 Thread Amit Shah

Hello,

These patches fix one bug (patch 2), and add some stats for bytes
sent, received and discarded, mainly for debugging purposes.  These
stats are shown in the 'info qtree' output.  More details in the
commit logs.

Please apply,


Amit Shah (3):
  virtio-serial-bus: add port arg to discard_vq_data()
  virtio-serial-bus: discard data in already popped-out elem
  virtio-serial-bus: Add per-port stats for received, sent, discarded
bytes

 hw/virtio-serial-bus.c |   37 +++--
 hw/virtio-serial.h |   11 +++
 2 files changed, 42 insertions(+), 6 deletions(-)

-- 
1.7.6

[Qemu-devel] [PATCH 1/3] virtio-serial-bus: add port arg to discard_vq_data()

2011-09-14 Thread Amit Shah

To discard throttled data as well as maintain statistics of bytes
received and discarded, discard_vq_data() will need the port associated
with the vq.

Signed-off-by: Amit Shah amit.s...@redhat.com
---
 hw/virtio-serial-bus.c |9 +
 1 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/hw/virtio-serial-bus.c b/hw/virtio-serial-bus.c
index a4825b9..6838d73 100644
--- a/hw/virtio-serial-bus.c
+++ b/hw/virtio-serial-bus.c
@@ -114,7 +114,8 @@ static size_t write_to_port(VirtIOSerialPort *port,
     return offset;
 }
 
-static void discard_vq_data(VirtQueue *vq, VirtIODevice *vdev)
+static void discard_vq_data(VirtIOSerialPort *port, VirtQueue *vq,
+                            VirtIODevice *vdev)
 {
     VirtQueueElement elem;
 
@@ -248,7 +249,7 @@ int virtio_serial_close(VirtIOSerialPort *port)
      * consume, reset the throttling flag and discard the data.
      */
     port->throttled = false;
-    discard_vq_data(port->ovq, &port->vser->vdev);
+    discard_vq_data(port, port->ovq, &port->vser->vdev);
 
     send_control_event(port, VIRTIO_CONSOLE_PORT_OPEN, 0);
 
@@ -473,7 +474,7 @@ static void handle_output(VirtIODevice *vdev, VirtQueue *vq)
     info = port ? DO_UPCAST(VirtIOSerialPortInfo, qdev, port->dev.info) : NULL;
 
     if (!port || !port->host_connected || !info->have_data) {
-        discard_vq_data(vq, vdev);
+        discard_vq_data(port, vq, vdev);
         return;
     }
 
@@ -730,7 +731,7 @@ static void remove_port(VirtIOSerial *vser, uint32_t port_id)
 
     port = find_port_by_id(vser, port_id);
     /* Flush out any unconsumed buffers first */
-    discard_vq_data(port->ovq, &port->vser->vdev);
+    discard_vq_data(port, port->ovq, &port->vser->vdev);
 
     send_control_event(port, VIRTIO_CONSOLE_PORT_REMOVE, 1);
 }
-- 
1.7.6

[Qemu-devel] [PATCH 2/3] virtio-serial-bus: discard data in already popped-out elem

2011-09-14 Thread Amit Shah

Previously, when discarding data, an elem that had already been popped
out of the vq but not yet pushed back to the guest (because the backend
was throttled) was never pushed back to the guest.  Fix that by checking
whether there is an in-progress elem and, if so, pushing it out to the
guest before emptying the vq.

Signed-off-by: Amit Shah amit.s...@redhat.com
---
 hw/virtio-serial-bus.c |4 
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/hw/virtio-serial-bus.c b/hw/virtio-serial-bus.c
index 6838d73..2c84398 100644
--- a/hw/virtio-serial-bus.c
+++ b/hw/virtio-serial-bus.c
@@ -122,6 +122,10 @@ static void discard_vq_data(VirtIOSerialPort *port, VirtQueue *vq,
     if (!virtio_queue_ready(vq)) {
         return;
     }
+    if (port && port->elem.out_num) {
+        virtqueue_push(vq, &port->elem, 0);
+        port->elem.out_num = 0;
+    }
     while (virtqueue_pop(vq, &elem)) {
         virtqueue_push(vq, &elem, 0);
     }
-- 
1.7.6

[Qemu-devel] [PATCH 3/3] virtio-serial-bus: Add per-port stats for received, sent, discarded bytes

2011-09-14 Thread Amit Shah

This commit adds port-specific stats for the number of bytes received,
sent and discarded.  They can be seen in the 'info qtree' monitor output
for the specific port.

This data can be used to check for data loss bugs (or disprove such
claims). It can also be used for accounting, if there's such a need.

The stats remain valid throughout the lifetime of the port. Unplugging a
port will reset the stats.  The numbers are not reset across port
opens/closes.

Signed-off-by: Amit Shah amit.s...@redhat.com
---
 hw/virtio-serial-bus.c |   24 ++--
 hw/virtio-serial.h |   11 +++
 2 files changed, 33 insertions(+), 2 deletions(-)

diff --git a/hw/virtio-serial-bus.c b/hw/virtio-serial-bus.c
index 2c84398..deefda4 100644
--- a/hw/virtio-serial-bus.c
+++ b/hw/virtio-serial-bus.c
@@ -108,6 +108,7 @@ static size_t write_to_port(VirtIOSerialPort *port,
         offset += len;
 
         virtqueue_push(vq, &elem, len);
+        port->stats.bytes_sent += len;
     }
 
     virtio_notify(&port->vser->vdev, vq);
@@ -123,10 +124,24 @@ static void discard_vq_data(VirtIOSerialPort *port, VirtQueue *vq,
         return;
     }
     if (port && port->elem.out_num) {
+        port->stats.bytes_discarded += (iov_size(port->elem.out_sg,
+                                                 elem.out_num)
+                                        - iov_size(port->elem.out_sg,
+                                                   port->iov_idx)
+                                        - port->iov_offset);
         virtqueue_push(vq, &port->elem, 0);
         port->elem.out_num = 0;
     }
     while (virtqueue_pop(vq, &elem)) {
+        if (port) {
+            unsigned long size;
+
+            size = iov_size(elem.out_sg, elem.out_num);
+
+            /* We haven't counted these bytes in the received stats yet. */
+            port->stats.bytes_received += size;
+            port->stats.bytes_discarded += size;
+        }
         virtqueue_push(vq, &elem, 0);
     }
     virtio_notify(vdev, vq);
@@ -152,6 +167,8 @@ static void do_flush_queued_data(VirtIOSerialPort *port, VirtQueue *vq,
             }
             port->iov_idx = 0;
             port->iov_offset = 0;
+            port->stats.bytes_received += iov_size(port->elem.out_sg,
+                                                   port->elem.out_num);
         }
 
         for (i = port->iov_idx; i < port->elem.out_num; i++) {
@@ -684,11 +701,14 @@ static void virtser_bus_dev_print(Monitor *mon, DeviceState *qdev, int indent)
 {
     VirtIOSerialPort *port = DO_UPCAST(VirtIOSerialPort, dev, qdev);
 
-    monitor_printf(mon, "%*sport %d, guest %s, host %s, throttle %s\n",
+    monitor_printf(mon, "%*sport %d, guest %s, host %s, throttle %s, bytes_sent %lu, bytes_received %lu, bytes_discarded: %lu\n",
                    indent, "", port->id,
                    port->guest_connected ? "on" : "off",
                    port->host_connected ? "on" : "off",
-                   port->throttled ? "on" : "off");
+                   port->throttled ? "on" : "off",
+                   port->stats.bytes_sent,
+                   port->stats.bytes_received,
+                   port->stats.bytes_discarded);
 }
 
 /* This function is only used if a port id is not provided by the user */
diff --git a/hw/virtio-serial.h b/hw/virtio-serial.h
index ab13803..34d36d7 100644
--- a/hw/virtio-serial.h
+++ b/hw/virtio-serial.h
@@ -67,6 +67,10 @@ typedef struct VirtIOSerialBus VirtIOSerialBus;
 typedef struct VirtIOSerialPort VirtIOSerialPort;
 typedef struct VirtIOSerialPortInfo VirtIOSerialPortInfo;
 
+typedef struct {
+    unsigned long bytes_sent, bytes_received, bytes_discarded;
+} PortStats;
+
 /*
  * This is the state that's shared between all the ports.  Some of the
  * state is configurable via command-line options. Some of it can be
@@ -87,6 +91,13 @@ struct VirtIOSerialPort {
     VirtQueue *ivq, *ovq;
 
     /*
+     * Keep count of the bytes sent, received and discarded for
+     * this port for accounting and debugging purposes.  These
+     * counts are not reset across port open / close events.
+     */
+    PortStats stats;
+
+    /*
      * This name is sent to the guest and exported via sysfs.
      * The guest could create symlinks based on this information.
      * The name is in the reverse fqdn format, like org.qemu.console.0
-- 
1.7.6
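
The discarded-bytes accounting in the patch above subtracts what was already consumed from the total size of the in-progress element. A minimal sketch of that arithmetic follows, using the POSIX `struct iovec`. The helpers `sketch_iov_size()`, `bytes_remaining()` and `bytes_remaining_demo()` are hypothetical names for this illustration and are not QEMU's own functions.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/uio.h>

/* Total length of the first iov_cnt entries (stand-in for an iov_size helper). */
static size_t sketch_iov_size(const struct iovec *iov, unsigned int iov_cnt)
{
    size_t len = 0;

    for (unsigned int i = 0; i < iov_cnt; i++) {
        len += iov[i].iov_len;
    }
    return len;
}

/* Bytes not yet consumed, given that entries [0, idx) are fully consumed
 * and `offset` bytes of entry idx have been consumed as well. */
static size_t bytes_remaining(const struct iovec *iov, unsigned int iov_cnt,
                              unsigned int idx, size_t offset)
{
    return sketch_iov_size(iov, iov_cnt) - sketch_iov_size(iov, idx) - offset;
}

/* Self-check: three buffers of 10, 20 and 30 bytes; the first buffer and
 * 5 bytes of the second are already consumed, leaving 45 bytes. */
void bytes_remaining_demo(void)
{
    char a[10], b[20], c[30];
    struct iovec iov[] = {
        { .iov_base = a, .iov_len = sizeof(a) },
        { .iov_base = b, .iov_len = sizeof(b) },
        { .iov_base = c, .iov_len = sizeof(c) },
    };

    assert(bytes_remaining(iov, 3, 1, 5) == 45);
}
```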

Re: [Qemu-devel] [PATCH] pseries: Update SLOF firmware image

2011-09-14 Thread Paolo Bonzini


On 09/01/2011 07:13 AM, David Gibson wrote:

The current SLOF firmware for the pseries machine has a bug in SCSI
condition handling that was exposed by recent updates to qemu's SCSI
emulation.  This patch updates the SLOF image to one with the bug fixed.


Ping for this and 
http://permalink.gmane.org/gmane.comp.emulators.qemu/114461


Paolo

Re: [Qemu-devel] [PATCH] hid: vmstat fix

2011-09-14 Thread TeLeMan

On Wed, Sep 14, 2011 at 15:15, Paolo Bonzini pbonz...@redhat.com wrote:
 On 09/14/2011 05:03 AM, TeLeMan wrote:

 The commit "usb/hid: add hid_pointer_activate, use it" used
 HIDMouseState.mouse_grabbed in hid_pointer_activate(), so
 mouse_grabbed should be added into vmstat.

 Does this fix a bug?  qemu_activate_mouse_event_handler is meant to be
 called once per execution of the VM, it is not guest state.
Yes, this patch fixes the USB mouse not working after loadvm in a
Windows guest.


 Paolo

Re: [Qemu-devel] [PATCH v3 5/6] vga: Use linear mapping + dirty logging in chain 4 memory access mode

2011-09-14 Thread Alexander Graf


On 14.09.2011, at 09:11, Avi Kivity wrote:

 On 09/13/2011 10:39 PM, Blue Swirl wrote:
 
   Here is the problem: Both the vram and the ISA range get mapped into
   system address space, but the former eclipses the latter as it shows up
   earlier in the list and has the same priority. This picture changes with
   the chain-4 alias which has prio 2, thus maps over the vram.
 
   It looks to me like the ISA address space is either misplaced at
   0x8000 or is not supposed to be mapped at all on PPC. Comments?
 
 Since there is no PCI-ISA bridge, ISA address space shouldn't exist.
 
 Where does the vga device sit then?

On the PCI bus? :)


Alex

Re: [Qemu-devel] [Qemu-ppc] [PATCH] pseries: Update SLOF firmware image

2011-09-14 Thread Alexander Graf


On 14.09.2011, at 09:38, Paolo Bonzini wrote:

 On 09/01/2011 07:13 AM, David Gibson wrote:
 The current SLOF firmware for the pseries machine has a bug in SCSI
 condition handling that was exposed by recent updates to qemu's SCSI
 emulation.  This patch updates the SLOF image to one with the bug fixed.
 
 Ping for this and http://permalink.gmane.org/gmane.comp.emulators.qemu/114461

Yeah, sorry, I introduced a regression with the KVM ABI in my HIOR patches and 
still need to rework that before I can push out the tree (otherwise it's a hell 
lot of work to untangle the changes). My hope is that we have VGA fixed until 
then too, so all ppc targets will work again ;)


Alex

Re: [Qemu-devel] [PATCH] pc_init: Fail on bad kernel

2011-09-14 Thread Sasha Levin

Ping?

On Sat, 2011-09-03 at 22:35 +0300, Sasha Levin wrote:
 When providing QEMU with a bad '-kernel' parameter, such as a file which
 is not really a kernel, QEMU will attempt to allocate a huge amount of
 memory and fail either with "Failed to allocate memory: Cannot allocate
 memory" or a GLib error: "GLib-ERROR **: gmem.c:170: failed to allocate
 18446744073709529965 bytes"
 
 This patch handles the case where the magic sig wasn't located in the
 provided kernel, and loading it as multiboot failed as well.
 
 Cc: Anthony Liguori aligu...@us.ibm.com
 Signed-off-by: Sasha Levin levinsasha...@gmail.com
 ---
  hw/pc.c |8 +++-
  1 files changed, 7 insertions(+), 1 deletions(-)
 
 diff --git a/hw/pc.c b/hw/pc.c
 index 6b3662e..428440b 100644
 --- a/hw/pc.c
 +++ b/hw/pc.c
 @@ -691,8 +691,14 @@ static void load_linux(void *fw_cfg,
          /* This looks like a multiboot kernel. If it is, let's stop
             treating it like a Linux kernel. */
          if (load_multiboot(fw_cfg, f, kernel_filename, initrd_filename,
 -                           kernel_cmdline, kernel_size, header))
 +                           kernel_cmdline, kernel_size, header)) {
              return;
 +        } else {
 +            fprintf(stderr, "qemu: could not load kernel '%s': %s\n",
 +                    kernel_filename, strerror(errno));
 +            exit(1);
 +        }
 +
          protocol = 0;
      }
  

-- 

Sasha.
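
The byte count in the GLib error quoted above is what a small negative signed value looks like once it is treated as an unsigned 64-bit size. A minimal sketch of that arithmetic follows. The value -21651 is chosen here only because it maps to the quoted number; where the negative value actually comes from inside QEMU is not stated in this thread, and `size_is_sane()` is a hypothetical guard, not the check the patch adds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* A signed size reinterpreted as an unsigned 64-bit byte count.
 * The conversion is defined as reduction modulo 2^64. */
static uint64_t as_alloc_request(int64_t size)
{
    return (uint64_t)size;
}

/* Hypothetical guard: refuse to allocate for a non-positive size. */
static bool size_is_sane(int64_t size)
{
    return size > 0;
}

/* Self-check: 2^64 - 21651 is the byte count from the error message. */
void alloc_request_demo(void)
{
    assert(as_alloc_request(-21651) == UINT64_C(18446744073709529965));
    assert(!size_is_sane(-21651));
}
```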

Re: [Qemu-devel] [PATCH] hid: vmstat fix

2011-09-14 Thread Paolo Bonzini


On 09/14/2011 09:40 AM, TeLeMan wrote:


  The commit "usb/hid: add hid_pointer_activate, use it" used
  HIDMouseState.mouse_grabbed in hid_pointer_activate(), so
  mouse_grabbed should be added into vmstat.


  Does this fix a bug?  qemu_activate_mouse_event_handler is meant to be
  called once per execution of the VM, it is not guest state.

Yes, this patch fixes the USB mouse not working after loadvm in a
Windows guest.


I'm wondering if, with your patch, Windows is actually using the PS/2 
mouse after loadvm...  If that is the case, perhaps instead you can move


    if (hs->kind == HID_MOUSE || hs->kind == HID_TABLET) {
        hid_pointer_activate(hs);
    }

from hw/usb-hid.c to hid_set_next_idle, which is called at post-load time.

Paolo

Re: [Qemu-devel] [PATCH v3 5/6] vga: Use linear mapping + dirty logging in chain 4 memory access mode

2011-09-14 Thread Jan Kiszka

On 2011-09-14 09:42, Alexander Graf wrote:
 
 On 14.09.2011, at 09:11, Avi Kivity wrote:
 
 On 09/13/2011 10:39 PM, Blue Swirl wrote:

  Here is the problem: Both the vram and the ISA range get mapped into
  system address space, but the former eclipses the latter as it shows up
  earlier in the list and has the same priority. This picture changes with
  the chain-4 alias which has prio 2, thus maps over the vram.

  It looks to me like the ISA address space is either misplaced at
  0x8000 or is not supposed to be mapped at all on PPC. Comments?

 Since there is no PCI-ISA bridge, ISA address space shouldn't exist.

 Where does the vga device sit then?
 
 On the PCI bus? :)

Then make sure that the container for ISA resources is a dummy region -
or even NULL so that VGA will know that it's supposed to skip ISA
registrations.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

Re: [Qemu-devel] [PATCH v3 5/6] vga: Use linear mapping + dirty logging in chain 4 memory access mode

2011-09-14 Thread Avi Kivity


On 09/14/2011 10:42 AM, Alexander Graf wrote:

On 14.09.2011, at 09:11, Avi Kivity wrote:

  On 09/13/2011 10:39 PM, Blue Swirl wrote:
  
 Here is the problem: Both the vram and the ISA range get mapped into
 system address space, but the former eclipses the latter as it shows up
 earlier in the list and has the same priority. This picture changes with
 the chain-4 alias which has prio 2, thus maps over the vram.
  
 It looks to me like the ISA address space is either misplaced at
 0x8000 or is not supposed to be mapped at all on PPC. Comments?

  Since there is no PCI-ISA bridge, ISA address space shouldn't exist.

  Where does the vga device sit then?

On the PCI bus? :)



I thought it was std vga, which is an ISA device.

Anyway PCI supports the vga region at 0xa-0xc.  Where is it 
supposed to be mapped?


--
error compiling committee.c: too many arguments to function

Re: [Qemu-devel] [PATCH v3 5/6] vga: Use linear mapping + dirty logging in chain 4 memory access mode

2011-09-14 Thread Jan Kiszka

On 2011-09-14 10:17, Avi Kivity wrote:
 On 09/14/2011 10:42 AM, Alexander Graf wrote:
 On 14.09.2011, at 09:11, Avi Kivity wrote:

  On 09/13/2011 10:39 PM, Blue Swirl wrote:
  
 Here is the problem: Both the vram and the ISA range get mapped into
 system address space, but the former eclipses the latter as it shows 
 up
 earlier in the list and has the same priority. This picture changes 
 with
 the chain-4 alias which has prio 2, thus maps over the vram.
  
 It looks to me like the ISA address space is either misplaced at
 0x8000 or is not supposed to be mapped at all on PPC. Comments?

  Since there is no PCI-ISA bridge, ISA address space shouldn't exist.

  Where does the vga device sit then?

 On the PCI bus? :)

 
 I thought it was std vga, which is an ISA device.

There are both types (ISA-only and PCI).

 
 Anyway PCI supports the vga region at 0xa-0xc.  Where is it 
 supposed to be mapped?

...but not all PCI bridges make use of this feature / forward legacy
requests.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

Re: [Qemu-devel] [PATCH v3 5/6] vga: Use linear mapping + dirty logging in chain 4 memory access mode

2011-09-14 Thread Avi Kivity


On 09/14/2011 11:20 AM, Jan Kiszka wrote:


  Anyway PCI supports the vga region at 0xa-0xc.  Where is it
  supposed to be mapped?

...but not all PCI bridges make use of this feature / forward legacy
requests.



Then this should be fixed in the bridge?

--
error compiling committee.c: too many arguments to function

Re: [Qemu-devel] [PATCH] Fix subtle integer overflow bug in memory API

2011-09-14 Thread Avi Kivity


On 09/14/2011 10:02 AM, David Gibson wrote:

It is quite common to have a MemoryRegion with size of INT64_MAX.
When processing alias regions in render_memory_region() it's quite
easy to find a case where it will construct a temporary AddrRange with
a non-zero start, and size still of INT64_MAX.  This means attempting
to compute the end of such a range as start + size will result in
signed integer overflow.

This integer overflow means that addrrange_intersects() can
incorrectly report regions as not intersecting when they do.  For
example consider the case of address ranges {0x100,
0x7fff} and {0x1001000, 0x1000} where the second
is in fact included completely in the first.


Good catch, thanks for digging this out.


This patch rearranges addrrange_intersects() to avoid the integer
overflow, correcting this behaviour.


I expect that the bad behaviour can still be triggered, for example by 
pointing aliases towards the end of very large regions.  Not that I 
expect this to occur in practice.


I think we should move towards using __int128 internally.  Is there any 
relevant host which does not support __int128?


Meanwhile, applied to memory/core, and will request a pull shortly.

--
error compiling committee.c: too many arguments to function

Re: [Qemu-devel] [PATCH v3 5/6] vga: Use linear mapping + dirty logging in chain 4 memory access mode

2011-09-14 Thread Jan Kiszka

On 2011-09-14 10:22, Avi Kivity wrote:
 On 09/14/2011 11:20 AM, Jan Kiszka wrote:

  Anyway PCI supports the vga region at 0xa-0xc.  Where is it
  supposed to be mapped?

 ...but not all PCI bridges make use of this feature / forward legacy
 requests.

 
 Then this should be fixed in the bridge?

Yes, it's a PPC bug.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

Re: [Qemu-devel] [PATCH v3 5/6] vga: Use linear mapping + dirty logging in chain 4 memory access mode

2011-09-14 Thread Alexander Graf


On 14.09.2011, at 10:24, Jan Kiszka wrote:

 On 2011-09-14 10:22, Avi Kivity wrote:
 On 09/14/2011 11:20 AM, Jan Kiszka wrote:
 
 Anyway PCI supports the vga region at 0xa-0xc.  Where is it
 supposed to be mapped?
 
 ...but not all PCI bridges make use of this feature / forward legacy
 requests.
 
 
 Then this should be fixed in the bridge?
 
 Yes, it's a PPC bug.

So how does the bridge not forward it then?


Alex

Re: [Qemu-devel] [PATCH v3 5/6] vga: Use linear mapping + dirty logging in chain 4 memory access mode

2011-09-14 Thread Jan Kiszka

On 2011-09-14 10:27, Alexander Graf wrote:
 
 On 14.09.2011, at 10:24, Jan Kiszka wrote:
 
 On 2011-09-14 10:22, Avi Kivity wrote:
 On 09/14/2011 11:20 AM, Jan Kiszka wrote:

 Anyway PCI supports the vga region at 0xa-0xc.  Where is it
 supposed to be mapped?

 ...but not all PCI bridges make use of this feature / forward legacy
 requests.


 Then this should be fixed in the bridge?

 Yes, it's a PPC bug.
 
 So how does the bridge not forward it then?

On real HW, by keeping the VGA Enable bit off. Or just not issuing
requests to the a..b range.

Under QEMU, I would simply provide the VGA model a memory region for
legacy stuff that remains unregistered.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

Re: [Qemu-devel] [PATCH v3 5/6] vga: Use linear mapping + dirty logging in chain 4 memory access mode

2011-09-14 Thread Avi Kivity


On 09/14/2011 11:27 AM, Alexander Graf wrote:

On 14.09.2011, at 10:24, Jan Kiszka wrote:

  On 2011-09-14 10:22, Avi Kivity wrote:
  On 09/14/2011 11:20 AM, Jan Kiszka wrote:

  Anyway PCI supports the vga region at 0xa-0xc.  Where is it
  supposed to be mapped?

  ...but not all PCI bridges make use of this feature / forward legacy
  requests.


  Then this should be fixed in the bridge?

  Yes, it's a PPC bug.

So how does the bridge not forward it then?



I expect that currently vga adds the region to pci_address_space().  We 
need to create a pci_address_space_vga() function that returns a region 
for vga to use.  Then add or remove the region to pci_address_space(), 
within the bridge code, depending on whether the bridge forwards vga 
accesses or not.


(assuming I understood the problem correctly - not sure)

--
error compiling committee.c: too many arguments to function

Re: [Qemu-devel] [PATCH] Fix subtle integer overflow bug in memory API

2011-09-14 Thread Avi Kivity


On 09/14/2011 11:23 AM, Avi Kivity wrote:


I think we should move towards using __int128 internally.  Is there 
any relevant host which does not support __int128?


Crap, it's not even supported on i386.

--
error compiling committee.c: too many arguments to function

[Qemu-devel] [PULL 00/58] ppc patch queue 2011-09-14

2011-09-14 Thread Alexander Graf

Hi Aurelien / Blue,

This is my current patch queue for ppc. Please pull.

Alex


The following changes since commit 44520db10b1b92f272348ab7028e7afc68ac3edf:
  Fabien Chouteau (1):
Gdbstub: Fix back-trace on SPARC32

are available in the git repository at:

  git://repo.or.cz/qemu/agraf.git ppc-next

Alexander Graf (38):
  PPC: Move openpic to target specific code compilation
  PPC: Add CPU local MMIO regions to MPIC
  PPC: Extend MPIC MMIO range
  PPC: Fix IPI support in MPIC
  PPC: Set MPIC IDE for IPI to 0
  PPC: MPIC: Remove read functionality for WO registers
  PPC: MPIC: Fix CI bit definitions
  PPC: Bump MPIC up to 32 supported CPUs
  PPC: E500: create multiple envs
  PPC: E500: Generate IRQ lines for many CPUs
  device tree: add nop_node
  PPC: bamboo: Move host fdt copy to target
  PPC: KVM: Add generic function to read host clockfreq
  PPC: E500: Use generic kvm function for freq
  PPC: E500: Remove mpc8544_copy_soc_cell
  PPC: bamboo: Use kvm api for freq and clock frequencies
  PPC: KVM: Remove kvmppc_read_host_property
  PPC: KVM: Add stubs for kvm helper functions
  PPC: E500: Update freqs for all CPUs
  PPC: E500: Remove unneeded CPU nodes
  PPC: E500: Add PV spinning code
  PPC: E500: Update cpu-release-addr property in cpu nodes
  device tree: add add_subnode command
  device tree: dont fail operations
  device tree: give dt more size
  MPC8544DS: Remove CPU nodes
  MPC8544DS: Generate CPU nodes on init
  PPC: E500: Bump CPU count to 15
  PPC: Add new target config for pseries
  KVM: update kernel headers
  PPC: Enable to use PAPR with PR style KVM
  PPC: SPAPR: Use KVM function for time info
  KVM: Update kernel headers
  openpic: Unfold read_IRQreg
  openpic: Unfold write_IRQreg
  PPC: Fix via-cuda memory registration
  PPC: Fix heathrow PIC to use little endian MMIO
  KVM: Update kernel headers

David Gibson (8):
  pseries: Bugfixes for interrupt numbering in XICS code
  pseries: Add a phandle to the xicp interrupt controller device tree node
  pseries: interrupt controller should not have a 'reg' property
  pseries: More complete WIMG validation in H_ENTER code
  pseries: Add real mode debugging hcalls
  Implement POWER7's CFAR in TCG
  pseries: Implement hcall-bulk hypervisor interface
  pseries: Update SLOF firmware image

Elie Richa (1):
  PPC: Fix sync instructions problem in SMP

Fabien Chouteau (1):
  Gdbstub: handle read of fpscr

Laurent Vivier (1):
  ppc: move ADB stuff from ppc_mac.h to adb.h

Nishanth Aravamudan (1):
  pseries: use macro for firmware filename

Paolo Bonzini (4):
  spapr: proper qdevification
  spapr: prepare for qdevification of irq
  spapr: make irq customizable via qdev
  vscsi: send the CHECK_CONDITION status down together with autosense data

Scott Wood (3):
  kvm: ppc: booke206: use MMU API
  ppc: booke206: add info tlb support
  ppc: booke206: use MAV=2.0 TSIZE definition, fix 4G pages

Stefan Hajnoczi (1):
  ppc405: use RAM_ADDR_FMT instead of %08lx

 Makefile.objs|1 -
 Makefile.target  |   10 +-
 configure|3 +
 cpu-exec.c   |1 +
 device_tree.c|   92 ++--
 device_tree.h|2 +
 gdbstub.c|2 +-
 hmp-commands.hx  |2 +-
 hw/adb.c |2 +-
 hw/adb.h |   67 +
 hw/cuda.c|   29 +++--
 hw/heathrow_pic.c|2 +-
 hw/openpic.c |  289 +-
 hw/ppc405_boards.c   |5 +-
 hw/ppc440_bamboo.c   |   16 ++-
 hw/ppc_mac.h |   42 --
 hw/ppc_newworld.c|1 +
 hw/ppc_oldworld.c|1 +
 hw/ppce500_mpc8544ds.c   |  195 +++---
 hw/ppce500_spin.c|  186 
 hw/spapr.c   |   52 ---
 hw/spapr.h   |9 ++
 hw/spapr_hcall.c |  220 +++--
 hw/spapr_llan.c  |   11 +--
 hw/spapr_vio.c   |   11 ++
 hw/spapr_vio.h   |   18 ++--
 hw/spapr_vscsi.c |   13 +--
 hw/spapr_vty.c   |   10 +-
 hw/xics.c|   17 +--
 linux-headers/asm-powerpc/kvm.h  |   59 -
 linux-headers/asm-x86/kvm_para.h |   14 ++
 linux-headers/linux/kvm.h|   42 +-
 linux-headers/linux/kvm_para.h   |1 +
 monitor.c|5 +-
 pc-bios/README   |2 +-
 pc-bios/mpc8544ds.dtb|  Bin 2277 -> 2028 bytes
 pc-bios/mpc8544ds.dts|   12 --

[Qemu-devel] [PATCH 26/58] device tree: add add_subnode command

2011-09-14 Thread Alexander Graf

We want to be able to create subnodes in our device tree, so export it through
the qemu device tree abstraction framework.

Signed-off-by: Alexander Graf ag...@suse.de
---
 device_tree.c |   24 
 device_tree.h |1 +
 2 files changed, 25 insertions(+), 0 deletions(-)

diff --git a/device_tree.c b/device_tree.c
index 23e89e3..f4a78c8 100644
--- a/device_tree.c
+++ b/device_tree.c
@@ -118,3 +118,27 @@ int qemu_devtree_nop_node(void *fdt, const char *node_path)
 
     return fdt_nop_node(fdt, offset);
 }
+
+int qemu_devtree_add_subnode(void *fdt, const char *name)
+{
+    int offset;
+    char *dupname = g_strdup(name);
+    char *basename = strrchr(dupname, '/');
+    int retval;
+
+    if (!basename) {
+        return -1;
+    }
+
+    basename[0] = '\0';
+    basename++;
+
+    offset = fdt_path_offset(fdt, dupname);
+    if (offset < 0) {
+        return offset;
+    }
+
+    retval = fdt_add_subnode(fdt, offset, basename);
+    g_free(dupname);
+    return retval;
+}
diff --git a/device_tree.h b/device_tree.h
index 76fce5f..4378685 100644
--- a/device_tree.h
+++ b/device_tree.h
@@ -23,5 +23,6 @@ int qemu_devtree_setprop_cell(void *fdt, const char *node_path,
 int qemu_devtree_setprop_string(void *fdt, const char *node_path,
                                 const char *property, const char *string);
 int qemu_devtree_nop_node(void *fdt, const char *node_path);
+int qemu_devtree_add_subnode(void *fdt, const char *name);
 
 #endif /* __DEVICE_TREE_H__ */
 
 #endif /* __DEVICE_TREE_H__ */
-- 
1.6.0.2
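
The path handling in the patch above (duplicate the string, cut it at the last '/', use the two halves as parent path and node name) can be sketched on its own. This version uses only the C standard library instead of GLib and libfdt, and it frees the duplicate on the error path, which the patch as posted does not do on its early returns. `split_node_path()` and `split_node_path_demo()` are hypothetical names for this illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Split "/a/b/c" into parent "/a/b" and leaf "c".  On success returns 0 and
 * stores a heap copy in *parent_out that the caller must free; *leaf_out
 * points into that same copy.  Returns -1 if the path contains no '/' or
 * if the allocation fails. */
int split_node_path(const char *path, char **parent_out, const char **leaf_out)
{
    size_t len = strlen(path) + 1;
    char *copy = malloc(len);
    char *slash;

    if (!copy) {
        return -1;
    }
    memcpy(copy, path, len);

    slash = strrchr(copy, '/');
    if (!slash) {
        free(copy);             /* release the copy on the error path too */
        return -1;
    }

    *slash = '\0';              /* terminate the parent part */
    *parent_out = copy;
    *leaf_out = slash + 1;      /* leaf starts right after the last '/' */
    return 0;
}

/* Self-check with a made-up node path. */
void split_node_path_demo(void)
{
    char *parent;
    const char *leaf;

    assert(split_node_path("/cpus/cpu@0", &parent, &leaf) == 0);
    assert(strcmp(parent, "/cpus") == 0);
    assert(strcmp(leaf, "cpu@0") == 0);
    free(parent);

    assert(split_node_path("no-slash", &parent, &leaf) == -1);
}
```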

[Qemu-devel] [PATCH 53/58] openpic: Unfold read_IRQreg

2011-09-14 Thread Alexander Graf

The helper function read_IRQreg was always called with a specific argument on
the type of register to access. Inside the function we were simply doing a
switch on that constant argument again. It's a lot easier to just unfold this
into two separate functions and call each individually.

Reported-by: Blue Swirl blauwir...@gmail.com
Signed-off-by: Alexander Graf ag...@suse.de
---
 hw/openpic.c |   56 +---
 1 files changed, 25 insertions(+), 31 deletions(-)

diff --git a/hw/openpic.c b/hw/openpic.c
index 03e442b..fbd8837 100644
--- a/hw/openpic.c
+++ b/hw/openpic.c
@@ -472,20 +472,14 @@ static void openpic_reset (void *opaque)
 opp-glbc = 0x;
 }
 
-static inline uint32_t read_IRQreg (openpic_t *opp, int n_IRQ, uint32_t reg)
+static inline uint32_t read_IRQreg_ide(openpic_t *opp, int n_IRQ)
 {
-    uint32_t retval;
-
-    switch (reg) {
-    case IRQ_IPVP:
-        retval = opp->src[n_IRQ].ipvp;
-        break;
-    case IRQ_IDE:
-        retval = opp->src[n_IRQ].ide;
-        break;
-    }
+    return opp->src[n_IRQ].ide;
+}
 
-    return retval;
+static inline uint32_t read_IRQreg_ipvp(openpic_t *opp, int n_IRQ)
+{
+    return opp->src[n_IRQ].ipvp;
 }
 
 static inline void write_IRQreg (openpic_t *opp, int n_IRQ,
@@ -523,10 +517,10 @@ static uint32_t read_doorbell_register (openpic_t *opp,
 
 switch (offset) {
 case DBL_IPVP_OFFSET:
-retval = read_IRQreg(opp, IRQ_DBL0 + n_dbl, IRQ_IPVP);
+retval = read_IRQreg_ipvp(opp, IRQ_DBL0 + n_dbl);
 break;
 case DBL_IDE_OFFSET:
-retval = read_IRQreg(opp, IRQ_DBL0 + n_dbl, IRQ_IDE);
+retval = read_IRQreg_ide(opp, IRQ_DBL0 + n_dbl);
 break;
 case DBL_DMR_OFFSET:
 retval = opp-doorbells[n_dbl].dmr;
@@ -564,10 +558,10 @@ static uint32_t read_mailbox_register (openpic_t *opp,
 retval = opp-mailboxes[n_mbx].mbr;
 break;
 case MBX_IVPR_OFFSET:
-retval = read_IRQreg(opp, IRQ_MBX0 + n_mbx, IRQ_IPVP);
+retval = read_IRQreg_ipvp(opp, IRQ_MBX0 + n_mbx);
 break;
 case MBX_DMR_OFFSET:
-retval = read_IRQreg(opp, IRQ_MBX0 + n_mbx, IRQ_IDE);
+retval = read_IRQreg_ide(opp, IRQ_MBX0 + n_mbx);
 break;
 }
 
@@ -695,7 +689,7 @@ static uint32_t openpic_gbl_read (void *opaque, 
target_phys_addr_t addr)
 {
 int idx;
 idx = (addr - 0x10A0)  4;
-retval = read_IRQreg(opp, opp-irq_ipi0 + idx, IRQ_IPVP);
+retval = read_IRQreg_ipvp(opp, opp-irq_ipi0 + idx);
 }
 break;
 case 0x10E0: /* SPVE */
@@ -765,10 +759,10 @@ static uint32_t openpic_timer_read (void *opaque, 
uint32_t addr)
 retval = opp-timers[idx].tibc;
 break;
 case 0x20: /* TIPV */
-retval = read_IRQreg(opp, opp-irq_tim0 + idx, IRQ_IPVP);
+retval = read_IRQreg_ipvp(opp, opp-irq_tim0 + idx);
 break;
 case 0x30: /* TIDE */
-retval = read_IRQreg(opp, opp-irq_tim0 + idx, IRQ_IDE);
+retval = read_IRQreg_ide(opp, opp-irq_tim0 + idx);
 break;
 }
 DPRINTF(%s: = %08x\n, __func__, retval);
@@ -809,10 +803,10 @@ static uint32_t openpic_src_read (void *opaque, uint32_t 
addr)
 idx = addr  5;
 if (addr  0x10) {
 /* EXDE / IFEDE / IEEDE */
-retval = read_IRQreg(opp, idx, IRQ_IDE);
+retval = read_IRQreg_ide(opp, idx);
 } else {
 /* EXVP / IFEVP / IEEVP */
-retval = read_IRQreg(opp, idx, IRQ_IPVP);
+retval = read_IRQreg_ipvp(opp, idx);
 }
 DPRINTF(%s: = %08x\n, __func__, retval);
 
@@ -1368,13 +1362,13 @@ static uint32_t mpic_timer_read (void *opaque, target_phys_addr_t addr)
 retval = mpp->timers[idx].tibc;
 break;
 case 0x20: /* TIPV */
-retval = read_IRQreg(mpp, MPIC_TMR_IRQ + idx, IRQ_IPVP);
+retval = read_IRQreg_ipvp(mpp, MPIC_TMR_IRQ + idx);
 break;
 case 0x30: /* TIDR */
 if ((addr & 0xF0) == 0XF0)
 retval = mpp->dst[cpu].tfrr;
 else
-retval = read_IRQreg(mpp, MPIC_TMR_IRQ + idx, IRQ_IDE);
+retval = read_IRQreg_ide(mpp, MPIC_TMR_IRQ + idx);
 break;
 }
 DPRINTF("%s: => %08x\n", __func__, retval);
@@ -1421,10 +1415,10 @@ static uint32_t mpic_src_ext_read (void *opaque, target_phys_addr_t addr)
 idx += (addr & 0xFFF0) >> 5;
 if (addr & 0x10) {
 /* EXDE / IFEDE / IEEDE */
-retval = read_IRQreg(mpp, idx, IRQ_IDE);
+retval = read_IRQreg_ide(mpp, idx);
 } else {
 /* EXVP / IFEVP / IEEVP */
-retval = read_IRQreg(mpp, idx, IRQ_IPVP);
+retval = read_IRQreg_ipvp(mpp, idx);
 }
 DPRINTF("%s: => %08x\n", __func__, retval);
 }
@@ -1471,10 +1465,10 @@ static uint32_t mpic_src_int_read (void *opaque, target_phys_addr_t addr)
 idx += (addr & 0xFFF0) >> 5;
 if (addr & 0x10) {
 /* EXDE / IFEDE / IEEDE */

Re: [Qemu-devel] [PATCH][RFC][0/2] REF+/REF- optimization

2011-09-14 Thread Kevin Wolf

On 13.09.2011 15:36, Frediano Ziglio wrote:
 2011/9/13 Kevin Wolf kw...@redhat.com:
 On 13.09.2011 09:53, Frediano Ziglio wrote:
 These patches try to trade-off between leaks and speed for clusters
 refcounts.

 Refcount increments (REF+ or refp) are handled in a different way from
 decrements (REF- or refm). The reason is that postponing or not flushing
 a REF- causes just a leak, while postponing a REF+ causes corruption.

 To optimize REF- I just used an array to store offsets; when a
 flush is requested or the array reaches a limit (currently 1022), the array
 is sorted and written to disk. I use an array of offsets instead of
 ranges to support compression (an offset could appear multiple times
 in the array).
 I consider this patch quite ready.

 Ok, first of all let's clarify what this optimises. I don't think it
 changes anything at all for the writeback cache modes, because these
 already do most operations in memory only. So this must be about
 optimising some operations with cache=writethrough. REF- isn't about
 normal cluster allocation, it is about COW with internal snapshots or
 bdrv_discard. Do you have benchmarks for any of them?

 I strongly disagree with your approach for REF-. We already have a
 cache, and introducing a second one sounds like a bad idea. I think we
 could get a very similar effect if we introduced a
 qcow2_cache_entry_mark_dirty_wb() that marks a given refcount block as
 dirty, but at the same time tells the cache that even in write-through
 mode it can still treat this block as write-back. This should require
 much less code changes.

 
 Yes, it mainly optimizes for writethrough. I did not test with writeback,
 but it should improve that as well (I think you have some flushes there to
 keep consistency).
 I'll try to write a qcow2_cache_entry_mark_dirty_wb patch and test it.

Great, thanks!

 But let's measure the effects first, I suspect that for cluster
 allocation it doesn't help much because every REF- comes with a REF+.

 
 That's 50% of effort if REF- clusters are far from REF+ :)

I would expect that the next REF+ allocates exactly the REF- cluster.
But you still have a point, we save the write on REF- and combine it
with the REF+ write.

 To optimize REF+ I mark a range as allocated and use this range to
 get new ones (avoiding writing refcount to disk). When a flush is
 requested or in some situations (like snapshot) this cache is disabled
 and flushed (written as REF-).
 I do not consider this patch ready; it works and passes all io-tests,
 but for instance I would avoid allocating new clusters for refcount
 during preallocation.

 The only question here is if improving cache=writethrough cluster
 allocation performance is worth the additional complexity in the already
 complex refcounting code.

 
 I didn't see this optimization as a second level cache, but yes, for
 REF- is a second cache.
 
 The alternative that was discussed before is the dirty bit approach that
 is used in QED and would allow us to use writeback for all refcount
 blocks, regardless of REF- or REF+. It would be an easier approach
 requiring less code changes, but it comes with the cost of requiring an
 fsck after a qemu crash.

 
 I was thinking about changing the header magic the first time we change a
 refcount, in order to mark the image as dirty, so that a newer QEMU recognizes
 the flag while an older one does not recognize the image. The magic would
 obviously be reverted on image close.

We've discussed this idea before and I think it wasn't considered a
great idea to automagically change the header in an incompatible way.
But we can always say that for improved performance you need to upgrade
your image to qcow2 v3.

 The resulting speed-up is quite visible when allocating clusters (more than 20%).

 What benchmark do you use for testing this?

 Kevin

 
 Currently I'm using bonnie++, but I noted similar improvements with iozone.
 The test script formats an image, then launches a Linux machine which runs
 a script and saves the result to a file.
 The test image is seen by this virtual machine as a separate disk.
 The file on the host resides in a separate LV.
 I got quite consistent results (of course I was not working on the machine
 while testing, but it is not actually dedicated to this job).
 
 I'm currently running the test (I added a test working in a snapshot image).

Okay. Let me guess the remaining variables: The image is on an ext4 host
filesystem, you use cache=writethrough and virtio-blk. You don't use
backing files, compression and encryption. For your tests with internal
snapshots you have exactly one internal snapshot that is taken
immediately before the benchmark. Oh, and not to forget, KVM is enabled.

Are these assumptions correct?

Kevin
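
The REF- batching described in the quoted proposal (collect offsets in an array, then sort and write them out when a flush is requested or the array reaches its limit) could look roughly like the sketch below. This is a standalone illustration and not the code from the patches: RefmBatch, refm_add, refm_flush, the callback type and the Recorder helper are all hypothetical names, and only the limit of 1022 comes from the thread.

```c
#include <stdint.h>
#include <stdlib.h>

#define REFM_LIMIT 1022  /* array limit mentioned in the thread */

/* Hypothetical callback that applies one refcount decrement. */
typedef void (*refm_apply_fn)(uint64_t cluster_offset, void *opaque);

typedef struct RefmBatch {
    uint64_t offsets[REFM_LIMIT];  /* pending REF- cluster offsets */
    size_t count;
    refm_apply_fn apply;
    void *opaque;
} RefmBatch;

static int cmp_u64(const void *a, const void *b)
{
    uint64_t x = *(const uint64_t *)a;
    uint64_t y = *(const uint64_t *)b;
    return (x > y) - (x < y);
}

/* Sort the pending decrements so that neighbouring clusters are handled
 * together, apply them in order, and empty the batch. */
static void refm_flush(RefmBatch *b)
{
    size_t i;

    qsort(b->offsets, b->count, sizeof(b->offsets[0]), cmp_u64);
    for (i = 0; i < b->count; i++) {
        b->apply(b->offsets[i], b->opaque);
    }
    b->count = 0;
}

/* Queue one decrement; the same offset may be queued more than once. */
static void refm_add(RefmBatch *b, uint64_t cluster_offset)
{
    if (b->count == REFM_LIMIT) {
        refm_flush(b);
    }
    b->offsets[b->count++] = cluster_offset;
}

/* Example callback: remembers the order in which decrements were applied. */
typedef struct Recorder {
    uint64_t seen[8];
    size_t n;
} Recorder;

static void record(uint64_t cluster_offset, void *opaque)
{
    Recorder *r = opaque;

    if (r->n < 8) {
        r->seen[r->n++] = cluster_offset;
    }
}
```

Keeping plain offsets rather than ranges means a duplicated offset simply produces two decrements, which matches the reason given above for compressed clusters. Anything still queued at a crash is only a leaked cluster, which is the trade-off the thread discusses.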

[Qemu-devel] [PATCH 42/58] pseries: use macro for firmware filename

2011-09-14 Thread Alexander Graf

From: Nishanth Aravamudan n...@us.ibm.com

For some time we've had a nicely defined macro with the filename for our
firmware image.  However we didn't actually use it in the place we're
supposed to.  This patch fixes it.

Signed-off-by: Nishanth Aravamudan n...@us.ibm.com
Signed-off-by: David Gibson da...@gibson.dropbear.id.au
Signed-off-by: Alexander Graf ag...@suse.de
---
 hw/spapr.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/hw/spapr.c b/hw/spapr.c
index 00aed62..91953cf 100644
--- a/hw/spapr.c
+++ b/hw/spapr.c
@@ -442,7 +442,7 @@ static void ppc_spapr_init(ram_addr_t ram_size,
 "%ldM guest RAM\n", MIN_RAM_SLOF);
 exit(1);
 }
-filename = qemu_find_file(QEMU_FILE_TYPE_BIOS, "slof.bin");
+filename = qemu_find_file(QEMU_FILE_TYPE_BIOS, FW_FILE_NAME);
 fw_size = load_image_targphys(filename, 0, FW_MAX_SIZE);
 if (fw_size < 0) {
 hw_error("qemu: could not load LPAR rtas '%s'\n", filename);
-- 
1.6.0.2

[Qemu-devel] [PATCH 09/58] PPC: MPIC: Remove read functionality for WO registers

2011-09-14 Thread Alexander Graf

The IPI dispatch registers are write only according to every MPIC
spec I have found. So instead of pretending you could read back something
from them, better not handle them at all.

Reported-by: Elie Richa ri...@adacore.com
Signed-off-by: Alexander Graf ag...@suse.de
---
 hw/openpic.c |7 ---
 1 files changed, 0 insertions(+), 7 deletions(-)

diff --git a/hw/openpic.c b/hw/openpic.c
index 31ad175..dfec52e 100644
--- a/hw/openpic.c
+++ b/hw/openpic.c
@@ -952,13 +952,6 @@ static uint32_t openpic_cpu_read_internal(void *opaque, target_phys_addr_t addr,
 case 0xB0: /* PEOI */
 retval = 0;
 break;
-#if MAX_IPI > 0
-case 0x40: /* IDE */
-case 0x50:
-idx = (addr - 0x40) >> 4;
-retval = read_IRQreg(opp, opp->irq_ipi0 + idx, IRQ_IDE);
-break;
-#endif
 default:
 break;
 }
-- 
1.6.0.2

[Qemu-devel] [PATCH] raw-posix: Fix bdrv_flush error return values

2011-09-14 Thread Kevin Wolf

bdrv_flush is supposed to use 0/-errno return values

Signed-off-by: Kevin Wolf kw...@redhat.com
---
 block/raw-posix.c |9 -
 1 files changed, 8 insertions(+), 1 deletions(-)

diff --git a/block/raw-posix.c b/block/raw-posix.c
index a624f56..305998d 100644
--- a/block/raw-posix.c
+++ b/block/raw-posix.c
@@ -839,7 +839,14 @@ static int raw_create(const char *filename, QEMUOptionParameter *options)
 static int raw_flush(BlockDriverState *bs)
 {
 BDRVRawState *s = bs->opaque;
-return qemu_fdatasync(s->fd);
+int ret;
+
+ret = qemu_fdatasync(s->fd);
+if (ret < 0) {
+return -errno;
+}
+
+return 0;
 }
 
 #ifdef CONFIG_XFS
-- 
1.7.6
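
The 0/-errno convention the patch restores can be shown outside QEMU with a small sketch. The helper name flush_fd is hypothetical, and plain POSIX fsync() stands in for qemu_fdatasync(), so this illustrates the return-value mapping only and is not the raw-posix code.

```c
#include <errno.h>
#include <unistd.h>

/* Flush a file descriptor to stable storage.
 * Returns 0 on success or a negative errno value on failure, instead of
 * passing through the raw -1 that the system call reports. */
static int flush_fd(int fd)
{
    if (fsync(fd) < 0) {
        return -errno;
    }
    return 0;
}
```

A caller can then distinguish failure causes, for example by comparing the result with -EBADF or -EIO, without consulting errno itself.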

[Qemu-devel] [PATCH 3/3] memory: optimize empty transactions due to mutators

2011-09-14 Thread Avi Kivity

The mutating memory APIs can easily cause empty transactions,
where the mutators don't actually change anything, or perhaps
only modify disabled regions.  Detect these conditions and
avoid regenerating the memory topology.

Signed-off-by: Avi Kivity a...@redhat.com
---
 memory.c |8 +++-
 1 files changed, 7 insertions(+), 1 deletions(-)

diff --git a/memory.c b/memory.c
index 3b0cc25..1370fac 100644
--- a/memory.c
+++ b/memory.c
@@ -19,6 +19,7 @@
 #include <assert.h>
 
 unsigned memory_region_transaction_depth = 0;
+static bool memory_region_update_pending = false;
 
 typedef struct AddrRange AddrRange;
 
@@ -717,6 +718,7 @@ static void address_space_update_topology(AddressSpace *as)
 static void memory_region_update_topology(MemoryRegion *mr)
 {
 if (memory_region_transaction_depth) {
+memory_region_update_pending |= !mr || mr->enabled;
 return;
 }
 
@@ -730,6 +732,8 @@ static void memory_region_update_topology(MemoryRegion *mr)
 if (address_space_io.root) {
 address_space_update_topology(&address_space_io);
 }
+
+memory_region_update_pending = false;
 }
 
 void memory_region_transaction_begin(void)
@@ -741,7 +745,9 @@ void memory_region_transaction_commit(void)
 {
 assert(memory_region_transaction_depth);
 --memory_region_transaction_depth;
-memory_region_update_topology(NULL);
+if (!memory_region_transaction_depth && memory_region_update_pending) {
+memory_region_update_topology(NULL);
+}
 }
 
 static void memory_region_destructor_none(MemoryRegion *mr)
-- 
1.7.6.3
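
The effect of the pending flag can be reproduced with a toy model. The sketch below is an assumption-laden simplification, not the memory.c code: Region, rebuild_count and the function names are made up, and a counter stands in for regenerating the flattened topology.

```c
#include <stdbool.h>
#include <stddef.h>

typedef struct Region {
    bool enabled;
} Region;

static unsigned transaction_depth;
static bool update_pending;
static unsigned rebuild_count;  /* counts simulated topology rebuilds */

static void update_topology(Region *r)
{
    if (transaction_depth) {
        /* Inside a transaction: defer, but remember whether the change
         * touched anything that is actually visible. */
        update_pending |= !r || r->enabled;
        return;
    }
    if (r && !r->enabled) {
        return;  /* a change to a disabled region alters nothing */
    }
    rebuild_count++;  /* stands in for regenerating the flat view */
    update_pending = false;
}

static void transaction_begin(void)
{
    ++transaction_depth;
}

static void transaction_commit(void)
{
    --transaction_depth;
    /* Only the outermost commit rebuilds, and only if something changed. */
    if (!transaction_depth && update_pending) {
        update_topology(NULL);
    }
}
```

With this model, a transaction that only touches disabled regions commits without any rebuild, while a transaction containing at least one visible change rebuilds exactly once at the outermost commit.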

[Qemu-devel] [PATCH 2/3] memory: introduce memory_region_set_address()

2011-09-14 Thread Avi Kivity

Allow changing the address of a memory region while it is
in the memory hierarchy.

Signed-off-by: Avi Kivity a...@redhat.com
---
 memory.c |   20 
 memory.h |   11 +++
 2 files changed, 31 insertions(+), 0 deletions(-)

diff --git a/memory.c b/memory.c
index ce0f3fd..3b0cc25 100644
--- a/memory.c
+++ b/memory.c
@@ -1260,6 +1260,26 @@ void memory_region_set_enabled(MemoryRegion *mr, bool enabled)
 memory_region_update_topology(NULL);
 }
 
+void memory_region_set_address(MemoryRegion *mr, target_phys_addr_t addr)
+{
+MemoryRegion *parent = mr->parent;
+unsigned priority = mr->priority;
+bool may_overlap = mr->may_overlap;
+
+if (addr == mr->addr || !parent) {
+return;
+}
+
+memory_region_transaction_begin();
+memory_region_del_subregion(parent, mr);
+if (may_overlap) {
+memory_region_add_subregion_overlap(parent, addr, mr, priority);
+} else {
+memory_region_add_subregion(parent, addr, mr);
+}
+memory_region_transaction_commit();
+}
+
 void set_system_memory_map(MemoryRegion *mr)
 {
 address_space_memory.root = mr;
diff --git a/memory.h b/memory.h
index 60b1449..468970b 100644
--- a/memory.h
+++ b/memory.h
@@ -509,6 +509,17 @@ void memory_region_del_subregion(MemoryRegion *mr,
  */
 void memory_region_set_enabled(MemoryRegion *mr, bool enabled);
 
+/*
+ * memory_region_set_address: dynamically update the address of a region
+ *
+ * Dynamically updates the address of a region, relative to its parent.
+ * May be used on regions that are currently part of a memory hierarchy.
+ *
+ * @mr: the region to be updated
+ * @addr: new address, relative to parent region
+ */
+void memory_region_set_address(MemoryRegion *mr, target_phys_addr_t addr);
+
 /* Start a transaction; changes will be accumulated and made visible only
  * when the transaction ends.
  */
-- 
1.7.6.3

[Qemu-devel] [PATCH 0/3] Memory API mutators

2011-09-14 Thread Avi Kivity

This patchset introduces memory_region_set_enabled() and
memory_region_set_address() to avoid the requirement on memory
routers to track the internal state of the memory API (so they know
whether they need to add or remove a region).  Instead, they can
simply copy the state of the region from the guest-exposed register
to the memory core, via the new mutator functions.

Please review.  Do we need a memory_region_set_size() as well?  Do we want

  memory_region_set_attributes(mr,
                               MR_ATTR_ENABLED | MR_ATTR_SIZE,
                               (MemoryRegionAttributes) {
                                   .enabled = s->enabled,
                                   .address = s->addr,
                               });

?

Avi Kivity (3):
  memory: introduce memory_region_set_enabled()
  memory: introduce memory_region_set_address()
  memory: optimize empty transactions due to mutators

 memory.c |   64 -
 memory.h |   28 +++
 2 files changed, 82 insertions(+), 10 deletions(-)

-- 
1.7.6.3
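
The "copy the register state to the memory core" pattern from the cover letter might look like the following for a memory router. Everything here is a hypothetical stand-in: ToyRegion and the toy_set_* helpers only mimic the early-return behaviour of the new mutators, and the register layout (bit 0 = enable, upper bits = base address) is invented for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for a memory region; 'updates' counts real changes. */
typedef struct ToyRegion {
    bool enabled;
    uint64_t addr;
    unsigned updates;
} ToyRegion;

/* Like the mutators in this series, do nothing if the value is unchanged. */
static void toy_set_enabled(ToyRegion *r, bool enabled)
{
    if (r->enabled == enabled) {
        return;
    }
    r->enabled = enabled;
    r->updates++;
}

static void toy_set_address(ToyRegion *r, uint64_t addr)
{
    if (r->addr == addr) {
        return;
    }
    r->addr = addr;
    r->updates++;
}

/* Invented register layout for the example. */
#define BAR_ENABLE    0x1u
#define BAR_ADDR_MASK (~(uint64_t)0xfff)

/* The router does not track whether the region is currently mapped;
 * it just forwards what the guest wrote. */
static void router_write_bar(ToyRegion *r, uint64_t value)
{
    toy_set_enabled(r, (value & BAR_ENABLE) != 0);
    toy_set_address(r, value & BAR_ADDR_MASK);
}
```

The point of the design is that the router keeps no shadow state: rewriting the same register value is a no-op, and the decision whether anything has to be added, removed or moved is left to the mutators.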

[Qemu-devel] [PATCH 1/3] memory: introduce memory_region_set_enabled()

2011-09-14 Thread Avi Kivity

This allows users to disable a memory region without removing
it from the hierarchy, simplifying the implementation of
memory routers.

Signed-off-by: Avi Kivity a...@redhat.com
---
 memory.c |   38 --
 memory.h |   17 +
 2 files changed, 45 insertions(+), 10 deletions(-)

diff --git a/memory.c b/memory.c
index 101b67c..ce0f3fd 100644
--- a/memory.c
+++ b/memory.c
@@ -494,6 +494,10 @@ static void render_memory_region(FlatView *view,
 FlatRange fr;
 AddrRange tmp;
 
+if (!mr->enabled) {
+return;
+}
+
 base += mr->addr;
 
 tmp = addrrange_make(base, mr->size);
@@ -710,12 +714,16 @@ static void address_space_update_topology(AddressSpace *as)
 address_space_update_ioeventfds(as);
 }
 
-static void memory_region_update_topology(void)
+static void memory_region_update_topology(MemoryRegion *mr)
 {
 if (memory_region_transaction_depth) {
 return;
 }
 
+if (mr && !mr->enabled) {
+return;
+}
+
 if (address_space_memory.root) {
 address_space_update_topology(&address_space_memory);
 }
@@ -733,7 +741,7 @@ void memory_region_transaction_commit(void)
 {
 assert(memory_region_transaction_depth);
 --memory_region_transaction_depth;
-memory_region_update_topology();
+memory_region_update_topology(NULL);
 }
 
 static void memory_region_destructor_none(MemoryRegion *mr)
@@ -770,6 +778,7 @@ void memory_region_init(MemoryRegion *mr,
 mr->size = size;
 mr->addr = 0;
 mr->offset = 0;
+mr->enabled = true;
 mr->terminates = false;
 mr->readable = true;
 mr->destructor = memory_region_destructor_none;
@@ -1005,7 +1014,7 @@ void memory_region_set_log(MemoryRegion *mr, bool log, unsigned client)
 uint8_t mask = 1 << client;
 
 mr->dirty_log_mask = (mr->dirty_log_mask & ~mask) | (log * mask);
-memory_region_update_topology();
+memory_region_update_topology(mr);
 }
 
 bool memory_region_get_dirty(MemoryRegion *mr, target_phys_addr_t addr,
@@ -1042,7 +1051,7 @@ void memory_region_rom_device_set_readable(MemoryRegion *mr, bool readable)
 {
 if (mr->readable != readable) {
 mr->readable = readable;
-memory_region_update_topology();
+memory_region_update_topology(mr);
 }
 }
 
@@ -1144,7 +1153,7 @@ void memory_region_add_eventfd(MemoryRegion *mr,
 memmove(&mr->ioeventfds[i+1], &mr->ioeventfds[i],
 sizeof(*mr->ioeventfds) * (mr->ioeventfd_nb-1 - i));
 mr->ioeventfds[i] = mrfd;
-memory_region_update_topology();
+memory_region_update_topology(mr);
 }
 
 void memory_region_del_eventfd(MemoryRegion *mr,
@@ -1174,7 +1183,7 @@ void memory_region_del_eventfd(MemoryRegion *mr,
 --mr->ioeventfd_nb;
 mr->ioeventfds = g_realloc(mr->ioeventfds,
   sizeof(*mr->ioeventfds)*mr->ioeventfd_nb + 1);
-memory_region_update_topology();
+memory_region_update_topology(mr);
 }
 
 static void memory_region_add_subregion_common(MemoryRegion *mr,
@@ -1210,7 +1219,7 @@ static void memory_region_add_subregion_common(MemoryRegion *mr,
 }
 QTAILQ_INSERT_TAIL(&mr->subregions, subregion, subregions_link);
 done:
-memory_region_update_topology();
+memory_region_update_topology(mr);
 }
 
 
@@ -1239,17 +1248,26 @@ void memory_region_del_subregion(MemoryRegion *mr,
 assert(subregion->parent == mr);
 subregion->parent = NULL;
 QTAILQ_REMOVE(&mr->subregions, subregion, subregions_link);
-memory_region_update_topology();
+memory_region_update_topology(mr);
+}
+
+void memory_region_set_enabled(MemoryRegion *mr, bool enabled)
+{
+if (enabled == mr->enabled) {
+return;
+}
+mr->enabled = enabled;
+memory_region_update_topology(NULL);
 }
 
 void set_system_memory_map(MemoryRegion *mr)
 {
 address_space_memory.root = mr;
-memory_region_update_topology();
+memory_region_update_topology(NULL);
 }
 
 void set_system_io_map(MemoryRegion *mr)
 {
 address_space_io.root = mr;
-memory_region_update_topology();
+memory_region_update_topology(NULL);
 }
diff --git a/memory.h b/memory.h
index 06b83ae..60b1449 100644
--- a/memory.h
+++ b/memory.h
@@ -114,6 +114,7 @@ struct MemoryRegion {
 IORange iorange;
 bool terminates;
 bool readable;
+bool enabled;
 MemoryRegion *alias;
 target_phys_addr_t alias_offset;
 unsigned priority;
@@ -492,6 +493,22 @@ void memory_region_add_subregion_overlap(MemoryRegion *mr,
 void memory_region_del_subregion(MemoryRegion *mr,
  MemoryRegion *subregion);
 
+
+/*
+ * memory_region_set_enabled: dynamically enable or disable a region
+ *
+ * Enables or disables a memory region.  A disabled memory region
+ * ignores all accesses to itself and its subregions.  It does not
+ * obscure sibling subregions with lower priority - it simply behaves as
+ * if it was removed from the hierarchy.
+ *
+ * Regions default to being enabled.
+ *
+ * @mr: the region to be updated
+ *

[Qemu-devel] [PATCH 51/58] Gdbstub: handle read of fpscr

2011-09-14 Thread Alexander Graf

From: Fabien Chouteau chout...@adacore.com

Signed-off-by: Fabien Chouteau chout...@adacore.com
Signed-off-by: Alexander Graf ag...@suse.de
---
 gdbstub.c   |2 +-
 target-ppc/translate_init.c |3 +--
 2 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/gdbstub.c b/gdbstub.c
index 90683a4..efe7b5f 100644
--- a/gdbstub.c
+++ b/gdbstub.c
@@ -733,7 +733,7 @@ static int cpu_gdb_read_register(CPUState *env, uint8_t *mem_buf, int n)
 {
 if (gdb_has_xml)
 return 0;
-GET_REG32(0); /* fpscr */
+GET_REG32(env->fpscr);
 }
 }
 }
diff --git a/target-ppc/translate_init.c b/target-ppc/translate_init.c
index 211f3bd..d09c7ca 100644
--- a/target-ppc/translate_init.c
+++ b/target-ppc/translate_init.c
@@ -9700,8 +9700,7 @@ static int gdb_get_float_reg(CPUState *env, uint8_t *mem_buf, int n)
 return 8;
 }
 if (n == 32) {
-/* FPSCR not implemented  */
-memset(mem_buf, 0, 4);
+stl_p(mem_buf, env->fpscr);
 return 4;
 }
 return 0;
-- 
1.6.0.2

[Qemu-devel] [PATCH 22/58] PPC: E500: Update freqs for all CPUs

2011-09-14 Thread Alexander Graf

Now that we can so nicely find out the host's frequencies, we should also
make sure that we get them into all virtual CPUs' device tree nodes.

Signed-off-by: Alexander Graf ag...@suse.de
---
 hw/ppce500_mpc8544ds.c |   10 +++---
 1 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/hw/ppce500_mpc8544ds.c b/hw/ppce500_mpc8544ds.c
index 2c7c677..0791e27 100644
--- a/hw/ppce500_mpc8544ds.c
+++ b/hw/ppce500_mpc8544ds.c
@@ -70,9 +70,9 @@ static int mpc8544_load_device_tree(CPUState *env,
 int fdt_size;
 void *fdt;
 uint8_t hypercall[16];
-char cpu_name[128] = "/cpus/PowerPC,8544@0";
 uint32_t clock_freq = 4;
 uint32_t tb_freq = 4;
+int i;
 
 filename = qemu_find_file(QEMU_FILE_TYPE_BIOS, BINARY_DEVICE_TREE_FILE);
 if (!filename) {
@@ -122,8 +122,12 @@ static int mpc8544_load_device_tree(CPUState *env,
  hypercall, sizeof(hypercall));
 }
 
-qemu_devtree_setprop_cell(fdt, cpu_name, "clock-frequency", clock_freq);
-qemu_devtree_setprop_cell(fdt, cpu_name, "timebase-frequency", tb_freq);
+for (i = 0; i < smp_cpus; i++) {
+char cpu_name[128];
+snprintf(cpu_name, sizeof(cpu_name), "/cpus/PowerPC,8544@%x", i);
+qemu_devtree_setprop_cell(fdt, cpu_name, "clock-frequency", clock_freq);
+qemu_devtree_setprop_cell(fdt, cpu_name, "timebase-frequency", tb_freq);
+}
 
 ret = rom_add_blob_fixed(BINARY_DEVICE_TREE_FILE, fdt, fdt_size, addr);
 g_free(fdt);
-- 
1.6.0.2
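
The per-CPU node paths built in the loop follow the pattern sketched below. The helper name format_cpu_node_path is hypothetical; only the format string is taken from the patch. Whether a node with that name exists for every index depends on the device tree blob in use, which this sketch does not check.

```c
#include <stdio.h>

/* Format the device tree path of the CPU node with the given index.
 * The unit address after '@' is written in hexadecimal, as in the patch.
 * Returns the value reported by snprintf(). */
static int format_cpu_node_path(char *buf, size_t len, unsigned index)
{
    return snprintf(buf, len, "/cpus/PowerPC,8544@%x", index);
}
```

Because of the %x conversion, the node for the eleventh CPU (index 10) is addressed as PowerPC,8544@a rather than PowerPC,8544@10.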

[Qemu-devel] [PATCH 58/58] KVM: Update kernel headers

2011-09-14 Thread Alexander Graf

Removes ABI-breaking HIOR parts - KVM patch to follow.

Signed-off-by: Alexander Graf ag...@suse.de
---
 linux-headers/asm-powerpc/kvm.h |   12 ++--
 linux-headers/linux/kvm.h   |1 -
 2 files changed, 2 insertions(+), 11 deletions(-)

diff --git a/linux-headers/asm-powerpc/kvm.h b/linux-headers/asm-powerpc/kvm.h
index 28eecf0..a635e22 100644
--- a/linux-headers/asm-powerpc/kvm.h
+++ b/linux-headers/asm-powerpc/kvm.h
@@ -149,12 +149,6 @@ struct kvm_regs {
 #define KVM_SREGS_E_UPDATE_DBSR (1 << 3)
 
 /*
- * Book3S special bits to indicate contents in the struct by maintaining
- * backwards compatibility with older structs. If adding a new field,
- * please make sure to add a flag for that new field */
-#define KVM_SREGS_S_HIOR   (1 << 0)
-
-/*
  * In KVM_SET_SREGS, reserved/pad fields must be left untouched from a
  * previous KVM_GET_REGS.
  *
@@ -176,11 +170,9 @@ struct kvm_sregs {
} ppc64;
struct {
__u32 sr[16];
-   __u64 ibat[8];
-   __u64 dbat[8];
+   __u64 ibat[8]; 
+   __u64 dbat[8]; 
} ppc32;
-   __u64 flags; /* KVM_SREGS_S_ */
-   __u64 hior;
} s;
struct {
union {
diff --git a/linux-headers/linux/kvm.h b/linux-headers/linux/kvm.h
index 8bb6cde..6f5095c 100644
--- a/linux-headers/linux/kvm.h
+++ b/linux-headers/linux/kvm.h
@@ -554,7 +554,6 @@ struct kvm_ppc_pvinfo {
 #define KVM_CAP_PPC_SMT 64
 #define KVM_CAP_PPC_RMA 65
 #define KVM_CAP_MAX_VCPUS 66   /* returns max vcpus per vm */
-#define KVM_CAP_PPC_HIOR 67
 #define KVM_CAP_PPC_PAPR 68
 #define KVM_CAP_SW_TLB 69
 
-- 
1.6.0.2

[Qemu-devel] [PATCH 50/58] pseries: Update SLOF firmware image

2011-09-14 Thread Alexander Graf

From: David Gibson da...@gibson.dropbear.id.au

The current SLOF firmware for the pseries machine has a bug in SCSI
condition handling that was exposed by recent updates to qemu's SCSI
emulation.  This patch updates the SLOF image to one with the bug fixed.

Signed-off-by: David Gibson da...@gibson.dropbear.id.au
Signed-off-by: Alexander Graf ag...@suse.de
---
 pc-bios/README   |2 +-
 pc-bios/slof.bin |  Bin 579072 - 57 bytes
 2 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/pc-bios/README b/pc-bios/README
index f74b246..8912211 100644
--- a/pc-bios/README
+++ b/pc-bios/README
@@ -17,7 +17,7 @@
 - SLOF (Slimline Open Firmware) is a free IEEE 1275 Open Firmware
   implementation for certain IBM POWER hardware.  The sources are at
   https://github.com/dgibson/SLOF, and the image currently in qemu is
-  built from git tag qemu-slof-20110323.
+  built from git tag qemu-slof-20110830.
 
 - The PXE roms come from the iPXE project. Built with BANNER_TIME 0.
   Sources available at http://ipxe.org.  Vendor:Device ID - ROM mapping:
diff --git a/pc-bios/slof.bin b/pc-bios/slof.bin
index 22c4c7f5c448e3002aefecf3438f5d080586d666..66163031c6eb5539b54b73214bf18b8cb6aa8743 100644
GIT binary patch
[binary delta for pc-bios/slof.bin omitted]
[Qemu-devel] [PATCH 47/58] Implement POWER7's CFAR in TCG

2011-09-14 Thread Alexander Graf

From: David Gibson da...@gibson.dropbear.id.au

This patch implements support for the CFAR SPR on POWER7 (Come From
Address Register), which snapshots the PC value at the time of a branch or
an rfid.  The latest powerpc-next kernel also catches it and can show it in
xmon or in the signal frames.

This works well enough to let recent kernels boot (which otherwise oops
on the CFAR access).  It hasn't been tested enough to be confident that the
CFAR values are actually accurate, but one thing at a time.

Signed-off-by: Ben Herrenschmidt b...@kernel.crashing.org
Signed-off-by: David Gibson da...@gibson.dropbear.id.au
Signed-off-by: Alexander Graf ag...@suse.de
---
 target-ppc/cpu.h|8 
 target-ppc/translate.c  |   28 
 target-ppc/translate_init.c |   23 ++-
 3 files changed, 58 insertions(+), 1 deletions(-)

diff --git a/target-ppc/cpu.h b/target-ppc/cpu.h
index 32706df..3f4af22 100644
--- a/target-ppc/cpu.h
+++ b/target-ppc/cpu.h
@@ -555,6 +555,8 @@ enum {
 /* Decrementer clock: RTC clock (POWER, 601) or bus clock*/
 POWERPC_FLAG_RTC_CLK  = 0x0001,
 POWERPC_FLAG_BUS_CLK  = 0x0002,
+/* Has CFAR  */
+POWERPC_FLAG_CFAR = 0x0004,
 };
 
 /*/
@@ -872,6 +874,10 @@ struct CPUPPCState {
 target_ulong ctr;
 /* condition register */
 uint32_t crf[8];
+#if defined(TARGET_PPC64)
+/* CFAR */
+target_ulong cfar;
+#endif
 /* XER */
 target_ulong xer;
 /* Reservation address */
@@ -1204,6 +1210,7 @@ static inline void cpu_clone_regs(CPUState *env, 
target_ulong newsp)
 #define SPR_601_UDECR (0x006)
 #define SPR_LR    (0x008)
 #define SPR_CTR   (0x009)
+#define SPR_DSCR  (0x011)
 #define SPR_DSISR (0x012)
 #define SPR_DAR   (0x013) /* DAE for PowerPC 601 */
 #define SPR_601_RTCU  (0x014)
@@ -1212,6 +1219,7 @@ static inline void cpu_clone_regs(CPUState *env, 
target_ulong newsp)
 #define SPR_SDR1  (0x019)
 #define SPR_SRR0  (0x01A)
 #define SPR_SRR1  (0x01B)
+#define SPR_CFAR  (0x01C)
 #define SPR_AMR   (0x01D)
 #define SPR_BOOKE_PID (0x030)
 #define SPR_BOOKE_DECAR   (0x036)
diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index 4277460..1e362fc 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -69,6 +69,9 @@ static TCGv cpu_nip;
 static TCGv cpu_msr;
 static TCGv cpu_ctr;
 static TCGv cpu_lr;
+#if defined(TARGET_PPC64)
+static TCGv cpu_cfar;
+#endif
 static TCGv cpu_xer;
 static TCGv cpu_reserve;
 static TCGv_i32 cpu_fpscr;
@@ -154,6 +157,11 @@ void ppc_translate_init(void)
 cpu_lr = tcg_global_mem_new(TCG_AREG0,
 offsetof(CPUState, lr), "lr");
 
+#if defined(TARGET_PPC64)
+cpu_cfar = tcg_global_mem_new(TCG_AREG0,
+  offsetof(CPUState, cfar), "cfar");
+#endif
+
 cpu_xer = tcg_global_mem_new(TCG_AREG0,
  offsetof(CPUState, xer), "xer");
 
@@ -187,6 +195,7 @@ typedef struct DisasContext {
 int le_mode;
 #if defined(TARGET_PPC64)
 int sf_mode;
+int has_cfar;
 #endif
 int fpu_enabled;
 int altivec_enabled;
@@ -3345,6 +3354,14 @@ static inline void gen_qemu_st32fiw(DisasContext *ctx, TCGv_i64 arg1, TCGv arg2)
 /* stfiwx */
 GEN_STXF(stfiw, st32fiw, 0x17, 0x1E, PPC_FLOAT_STFIWX);
 
+static inline void gen_update_cfar(DisasContext *ctx, target_ulong nip)
+{
+#if defined(TARGET_PPC64)
+if (ctx->has_cfar)
+tcg_gen_movi_tl(cpu_cfar, nip);
+#endif
+}
+
 /***Branch ***/
 static inline void gen_goto_tb(DisasContext *ctx, int n, target_ulong dest)
 {
@@ -3407,6 +3424,7 @@ static void gen_b(DisasContext *ctx)
 target = li;
 if (LK(ctx->opcode))
 gen_setlr(ctx, ctx->nip);
+gen_update_cfar(ctx, ctx->nip);
 gen_goto_tb(ctx, 0, target);
 }
 
@@ -3469,6 +3487,7 @@ static inline void gen_bcond(DisasContext *ctx, int type)
 }
 tcg_temp_free_i32(temp);
 }
+gen_update_cfar(ctx, ctx->nip);
 if (type == BCOND_IM) {
 target_ulong li = (target_long)((int16_t)(BD(ctx->opcode)));
 if (likely(AA(ctx->opcode) == 0)) {
@@ -3580,6 +3599,7 @@ static void gen_rfi(DisasContext *ctx)
 gen_inval_exception(ctx, POWERPC_EXCP_PRIV_OPC);
 return;
 }
+gen_update_cfar(ctx, ctx->nip);
 gen_helper_rfi();
 gen_sync_exception(ctx);
 #endif
@@ -3596,6 +3616,7 @@ static void gen_rfid(DisasContext *ctx)
 gen_inval_exception(ctx, POWERPC_EXCP_PRIV_OPC);
 return;
 }
+gen_update_cfar(ctx, ctx->nip);
 gen_helper_rfid();
 gen_sync_exception(ctx);
 #endif
@@ -9263,6 +9284,12 @@ void cpu_dump_state

[Qemu-devel] [PATCH 14/58] device tree: add nop_node

2011-09-14 Thread Alexander Graf

We have a qemu internal abstraction layer on FDT. While I'm not fully convinced
we need it at all, it's missing the nop_node functionality that we now need
on e500. So let's add it and think about the general future of that API later.

Signed-off-by: Alexander Graf ag...@suse.de
---
 device_tree.c |   11 +++
 device_tree.h |1 +
 2 files changed, 12 insertions(+), 0 deletions(-)

diff --git a/device_tree.c b/device_tree.c
index 3a224d1..23e89e3 100644
--- a/device_tree.c
+++ b/device_tree.c
@@ -107,3 +107,14 @@ int qemu_devtree_setprop_string(void *fdt, const char 
*node_path,
 
 return fdt_setprop_string(fdt, offset, property, string);
 }
+
+int qemu_devtree_nop_node(void *fdt, const char *node_path)
+{
+int offset;
+
+offset = fdt_path_offset(fdt, node_path);
+if (offset < 0)
+return offset;
+
+return fdt_nop_node(fdt, offset);
+}
diff --git a/device_tree.h b/device_tree.h
index cecd98f..76fce5f 100644
--- a/device_tree.h
+++ b/device_tree.h
@@ -22,5 +22,6 @@ int qemu_devtree_setprop_cell(void *fdt, const char *node_path,
   const char *property, uint32_t val);
 int qemu_devtree_setprop_string(void *fdt, const char *node_path,
 const char *property, const char *string);
+int qemu_devtree_nop_node(void *fdt, const char *node_path);
 
 #endif /* __DEVICE_TREE_H__ */
-- 
1.6.0.2

Re: [Qemu-devel] [PATCH 0/3] Memory API mutators

2011-09-14 Thread Avi Kivity


Jan, too, was interested in this.

On 09/14/2011 12:23 PM, Avi Kivity wrote:

This patchset introduces memory_region_set_enabled() and
memory_region_set_address() to avoid the requirement on memory
routers to track the internal state of the memory API (so they know
whether they need to add or remove a region).  Instead, they can
simply copy the state of the region from the guest-exposed register
to the memory core, via the new mutator functions.

Please review.  Do we need a memory_region_set_size() as well?  Do we want

   memory_region_set_attributes(mr,
MR_ATTR_ENABLED | MR_ATTR_SIZE,
(MemoryRegionAttributes) {
.enabled = s->enabled,
.address = s->addr,
});

?

Avi Kivity (3):
   memory: introduce memory_region_set_enabled()
   memory: introduce memory_region_set_address()
   memory: optimize empty transactions due to mutators

  memory.c |   64 -
  memory.h |   28 +++
  2 files changed, 82 insertions(+), 10 deletions(-)




--
error compiling committee.c: too many arguments to function
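
A minimal sketch of the pattern the cover letter describes, in self-contained C. The Region struct and the region_set_enabled() / region_set_address() / bar_write() helpers are hypothetical stand-ins, not the QEMU memory API. The only point is that the device copies guest-visible register state to the core, and the core decides whether anything actually changed.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a memory region; not QEMU's MemoryRegion. */
typedef struct Region {
    bool enabled;
    uint64_t addr;
    unsigned updates;   /* counts how often the "core" had to re-render */
} Region;

/* The core compares against its own state, so callers need not track it. */
static void region_set_enabled(Region *r, bool enabled)
{
    if (r->enabled == enabled) {
        return;             /* empty transaction: nothing to do */
    }
    r->enabled = enabled;
    r->updates++;
}

static void region_set_address(Region *r, uint64_t addr)
{
    if (r->addr == addr) {
        return;
    }
    r->addr = addr;
    r->updates++;
}

/* A device's register write handler just copies guest state across. */
static void bar_write(Region *r, bool enable_bit, uint64_t base)
{
    region_set_enabled(r, enable_bit);
    region_set_address(r, base);
}
```

Calling bar_write() twice with the same values leaves the update counter unchanged the second time, which is roughly what the "optimize empty transactions" patch in the series is about.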

Re: [Qemu-devel] [PATCH][RFC][0/2] REF+/REF- optimization

2011-09-14 Thread Frediano Ziglio

2011/9/14 Kevin Wolf kw...@redhat.com:
 Am 13.09.2011 15:36, schrieb Frediano Ziglio:
 2011/9/13 Kevin Wolf kw...@redhat.com:
 Am 13.09.2011 09:53, schrieb Frediano Ziglio:
 These patches try to trade-off between leaks and speed for clusters
 refcounts.

 Refcount increments (REF+ or refp) are handled in a different way from
 decrements (REF- or refm). The reason is that postponing or not flushing
 a REF- only causes a leak, while postponing a REF+ causes corruption.

 To optimize REF- I just used an array to store offsets; when a
 flush is requested or the array reaches a limit (currently 1022) the array
 is sorted and written to disk. I use an array of offsets instead of
 ranges to support compression (an offset could appear multiple times
 in the array).
 I consider this patch quite ready.

 Ok, first of all let's clarify what this optimises. I don't think it
 changes anything at all for the writeback cache modes, because these
 already do most operations in memory only. So this must be about
 optimising some operations with cache=writethrough. REF- isn't about
 normal cluster allocation, it is about COW with internal snapshots or
 bdrv_discard. Do you have benchmarks for any of them?

 I strongly disagree with your approach for REF-. We already have a
 cache, and introducing a second one sounds like a bad idea. I think we
 could get a very similar effect if we introduced a
 qcow2_cache_entry_mark_dirty_wb() that marks a given refcount block as
 dirty, but at the same time tells the cache that even in write-through
 mode it can still treat this block as write-back. This should require
 much less code changes.


 Yes, this mainly optimizes writethrough. I did not test with writeback,
 but it should improve that too (I think there are some flushes there to keep
 consistency).
 I'll try to write a qcow2_cache_entry_mark_dirty_wb patch and test it.

 Great, thanks!


Don't expect the patch too soon, however; I'm quite busy these days.

 But let's measure the effects first, I suspect that for cluster
 allocation it doesn't help much because every REF- comes with a REF+.


 That's 50% of effort if REF- clusters are far from REF+ :)

 I would expect that the next REF+ allocates exactly the REF- cluster.
 But you still have a point, we save the write on REF- and combine it
 with the REF+ write.


This is still a TODO for the REF+ patch.

Oh... some time ago, looking at the refcount code, I realized that a single
deallocation might, in some cases, be reused only after a Qemu restart.
For instance:
- we get a single-cluster REF- which takes the refcount to 0
- free_cluster_index gets decreased to this index
- we get a new request for 2 clusters
- free_cluster_index gets increased
We skip the freed cluster, and unless we later get a deallocation for
a cluster with an index lower than our freed cluster, this cluster is not
reused.
(I didn't test this behavior; no leak, no corruption, just the image could
be larger than expected.)
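
The scenario above can be reproduced with a toy model. This is only a sketch under assumptions: the Allocator struct, free_cluster() and alloc_clusters() are hypothetical and merely imitate a forward-moving free_cluster_index hint; they were not checked against the real qcow2 refcount code.

```c
#include <stddef.h>
#include <stdint.h>

#define NB_CLUSTERS 16

/* Hypothetical toy allocator; refcount[i] == 0 means cluster i is free. */
typedef struct Allocator {
    uint16_t refcount[NB_CLUSTERS];
    size_t free_cluster_index;  /* hint: where the next scan starts */
} Allocator;

/* Drop one reference; pull the hint back if the cluster became free. */
static void free_cluster(Allocator *a, size_t idx)
{
    if (--a->refcount[idx] == 0 && idx < a->free_cluster_index) {
        a->free_cluster_index = idx;
    }
}

/* Find n contiguous free clusters, scanning forward from the hint.
 * Returns the first index, or -1 if the toy image is full. */
static long alloc_clusters(Allocator *a, size_t n)
{
    size_t run = 0;

    while (a->free_cluster_index < NB_CLUSTERS) {
        size_t idx = a->free_cluster_index++;   /* hint only moves forward */

        run = (a->refcount[idx] == 0) ? run + 1 : 0;
        if (run == n) {
            size_t first = idx + 1 - n;
            for (size_t i = first; i <= idx; i++) {
                a->refcount[i] = 1;
            }
            return (long)first;
        }
    }
    return -1;
}
```

With clusters 0..9 in use, freeing cluster 5 and then asking for 2 clusters moves the hint past cluster 5, so a later single-cluster request is served from the end of the image while cluster 5 stays free.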

 To optimize REF+ I mark a range as allocated and use this range to
 get new ones (avoiding writing refcount to disk). When a flush is
 requested or in some situations (like snapshot) this cache is disabled
 and flushed (written as REF-).
 I do not consider this patch ready, it works and pass all io-tests
 but for instance I would avoid allocating new clusters for refcount
 during preallocation.

 The only question here is if improving cache=writethrough cluster
 allocation performance is worth the additional complexity in the already
 complex refcounting code.


 I didn't see this optimization as a second level cache, but yes, for
 REF- is a second cache.

 The alternative that was discussed before is the dirty bit approach that
 is used in QED and would allow us to use writeback for all refcount
 blocks, regardless of REF- or REF+. It would be an easier approach
 requiring less code changes, but it comes with the cost of requiring an
 fsck after a qemu crash.


 I was thinking about changing the header magic the first time we change a
 refcount, in order to mark the image as dirty, so that a newer Qemu recognizes
 the flag while an older one does not recognize the image. Obviously the magic
 is reverted on image close.

 We've discussed this idea before and I think it wasn't considered a
 great idea to automagically change the header in an incompatible way.
 But we can always say that for improved performance you need to upgrade
 your image to qcow2 v3.


I don't understand why there is no wiki page detailing the qcow3
changes. I saw your post from May. I have followed this ML since August, so I
think I missed a lot of discussion on qcow improvements.

 The end speed-up is quite visible when allocating clusters (more than 20%).

 What benchmark do you use for testing this?

 Kevin


 Currently I'm using bonnie++ but I noted similar improvements with iozone.
 The test script formats an image, then launches a Linux machine which runs
 a script and saves the result to a file.
 The test image is seen by this virtual machine as a separate disk.
 The file on the host resides in a separate LV.
 I got quite

[Qemu-devel] [PATCH 23/58] PPC: E500: Remove unneeded CPU nodes

2011-09-14 Thread Alexander Graf

We should only keep CPU nodes in the device tree around that we really have
virtual CPUs for. So remove all superfluous entries that we just keep there
in case someone wants to create a lot of vCPUs.

Signed-off-by: Alexander Graf ag...@suse.de
---
 hw/ppce500_mpc8544ds.c |6 ++
 1 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/hw/ppce500_mpc8544ds.c b/hw/ppce500_mpc8544ds.c
index 0791e27..9379624 100644
--- a/hw/ppce500_mpc8544ds.c
+++ b/hw/ppce500_mpc8544ds.c
@@ -129,6 +129,12 @@ static int mpc8544_load_device_tree(CPUState *env,
 qemu_devtree_setprop_cell(fdt, cpu_name, "timebase-frequency", tb_freq);
 }
 
+for (i = smp_cpus; i < 32; i++) {
+char cpu_name[128];
+snprintf(cpu_name, sizeof(cpu_name), "/cpus/PowerPC,8544@%x", i);
+qemu_devtree_nop_node(fdt, cpu_name);
+}
+
 ret = rom_add_blob_fixed(BINARY_DEVICE_TREE_FILE, fdt, fdt_size, addr);
 g_free(fdt);
 
-- 
1.6.0.2
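
As a rough illustration of the loop in this patch, the sketch below only builds the node paths and counts how many would be nopped; it does not touch a real device tree. MAX_CPU_NODES, cpu_node_path() and count_nopped_nodes() are hypothetical names; the upper bound of 32 and the path format are taken from the patch.

```c
#include <stdio.h>

#define MAX_CPU_NODES 32   /* upper bound used by the loop in the patch */

/* Format the device tree path of CPU node i into buf.
 * Returns snprintf's result. */
static int cpu_node_path(char *buf, size_t len, int i)
{
    /* %x: the unit address after '@' is printed in hexadecimal */
    return snprintf(buf, len, "/cpus/PowerPC,8544@%x", i);
}

/* Count how many template nodes would be nopped for a given vCPU count. */
static int count_nopped_nodes(int smp_cpus)
{
    int n = 0;

    for (int i = smp_cpus; i < MAX_CPU_NODES; i++) {
        n++;
    }
    return n;
}
```

Note that because of %x, the eleventh CPU node is addressed as @a rather than @10.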

Re: [Qemu-devel] [PATCH 0/3] Memory API mutators

2011-09-14 Thread Peter Maydell

On 14 September 2011 10:23, Avi Kivity a...@redhat.com wrote:
 This patchset introduces memory_region_set_enabled() and
 memory_region_set_address() to avoid the requirement on memory
 routers to track the internal state of the memory API (so they know
 whether they need to add or remove a region).  Instead, they can
 simply copy the state of the region from the guest-exposed register
 to the memory core, via the new mutator functions.

 Please review.  Do we need a memory_region_set_size() as well?

Would set_size() allow things like omap_gpmc() to avoid the need
to create an intermediate container subregion to enforce size
clipping on the child region it's trying to map?

(Strictly speaking what omap_gpmc() wants is not merely clipping
to a guest-specified size but also wrapping, so you can take a
16MB child region and map the bottom 4MB of it repeating into
a 32MB chunk of address space, say. But that would require a lot
of playing games with aliases to implement a bizarre corner
case that nobody uses in practice.)

-- PMM
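
The wrapping case described above comes down to a modulo on the offset. A minimal sketch, assuming a hypothetical wrap_offset() helper that is not part of any QEMU API:

```c
#include <stdint.h>

/* Translate an offset within the mapped window into an offset within the
 * child region, repeating the first `visible` bytes of the child.
 * Returns UINT64_MAX if the offset is outside the window. */
static uint64_t wrap_offset(uint64_t off, uint64_t window, uint64_t visible)
{
    if (off >= window || visible == 0) {
        return UINT64_MAX;
    }
    return off % visible;
}
```

For the example in the mail (bottom 4MB repeating into a 32MB window), an access at 31MB into the window lands 3MB into the child region.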

[Qemu-devel] [PATCH 57/58] PPC: Fix heathrow PIC to use little endian MMIO

2011-09-14 Thread Alexander Graf

During the memory API conversion, the indication on little endianness of
MMIO for the heathrow PIC got dropped. This patch adds it back again.

Signed-off-by: Alexander Graf ag...@suse.de
---
 hw/heathrow_pic.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/hw/heathrow_pic.c b/hw/heathrow_pic.c
index 51996ab..16f48d1 100644
--- a/hw/heathrow_pic.c
+++ b/hw/heathrow_pic.c
@@ -126,7 +126,7 @@ static uint64_t pic_read(void *opaque, target_phys_addr_t 
addr,
 static const MemoryRegionOps heathrow_pic_ops = {
 .read = pic_read,
 .write = pic_write,
-.endianness = DEVICE_NATIVE_ENDIAN,
+.endianness = DEVICE_LITTLE_ENDIAN,
 };
 
 static void heathrow_pic_set_irq(void *opaque, int num, int level)
-- 
1.6.0.2

[Qemu-devel] [PATCH 11/58] PPC: Bump MPIC up to 32 supported CPUs

2011-09-14 Thread Alexander Graf

The MPIC emulation is now capable of handling up to 32 CPUs. Reflect that in
the code exporting the numbers out and fix an integer overflow while at it.

Signed-off-by: Alexander Graf ag...@suse.de

---

v1 -> v2:

  - Max cpus is 15 due to cINT routing
  - Report nb_cpus not MAX_CPUS in MPIC capabilities
---
 hw/openpic.c |   10 +++---
 1 files changed, 3 insertions(+), 7 deletions(-)

diff --git a/hw/openpic.c b/hw/openpic.c
index 109c1bc..03e442b 100644
--- a/hw/openpic.c
+++ b/hw/openpic.c
@@ -63,7 +63,7 @@
 
 #elif defined(USE_MPCxxx)
 
-#define MAX_CPU 2
+#define MAX_CPU    15
 #define MAX_IRQ   128
 #define MAX_DBL 0
 #define MAX_MBX 0
@@ -507,7 +507,7 @@ static inline void write_IRQreg (openpic_t *opp, int n_IRQ,
 break;
 case IRQ_IDE:
 tmp = val & 0xC000;
-tmp |= val & ((1 << MAX_CPU) - 1);
+tmp |= val & ((1ULL << MAX_CPU) - 1);
 opp->src[n_IRQ].ide = tmp;
 DPRINTF("Set IDE %d to 0x%08x\n", n_IRQ, opp->src[n_IRQ].ide);
 break;
@@ -1283,7 +1283,7 @@ static void mpic_reset (void *opaque)
 
 mpp->glbc = 0x8000;
 /* Initialise controller registers */
-mpp->frep = 0x004f0002;
+mpp->frep = 0x004f0002 | ((mpp->nb_cpus - 1) << 8);
 mpp->veni = VENI;
 mpp->pint = 0x;
 mpp->spve = 0x;
@@ -1684,10 +1684,6 @@ qemu_irq *mpic_init (target_phys_addr_t base, int 
nb_cpus,
 {mpic_cpu_read, mpic_cpu_write, MPIC_CPU_REG_START, MPIC_CPU_REG_SIZE},
 };
 
-/* XXX: for now, only one CPU is supported */
-if (nb_cpus != 1)
-return NULL;
-
 mpp = g_malloc0(sizeof(openpic_t));
 
 for (i = 0; i < sizeof(list)/sizeof(list[0]); i++) {
-- 
1.6.0.2
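
To make the overflow fix concrete: in C, shifting a 32-bit int by 32 is undefined, so a CPU mask built as (1 << MAX_CPU) - 1 breaks once MAX_CPU reaches 32. A small sketch follows; cpu_mask() and frep_value() are hypothetical helpers, and the FREP constant is the one from the patch.

```c
#include <stdint.h>

/* Build a mask with the low nr_cpus bits set, for nr_cpus in 0..32.
 * Using a 64-bit one keeps the shift defined when nr_cpus == 32;
 * (1 << 32) on a 32-bit int is undefined behaviour in C. */
static uint32_t cpu_mask(unsigned nr_cpus)
{
    return (uint32_t)((1ULL << nr_cpus) - 1);
}

/* Feature register value as computed in mpic_reset(): the number of CPUs
 * minus one is placed at bit 8. Assumes nb_cpus >= 1. */
static uint32_t frep_value(unsigned nb_cpus)
{
    return 0x004f0002u | ((nb_cpus - 1) << 8);
}
```

With the limit of 15 CPUs from the v2 changelog the plain int shift would not overflow, but the 64-bit form stays correct if the limit is raised later.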

Re: [Qemu-devel] [PATCH 0/3] Memory API mutators

2011-09-14 Thread Avi Kivity


On 09/14/2011 12:56 PM, Peter Maydell wrote:

On 14 September 2011 10:23, Avi Kivitya...@redhat.com  wrote:
  This patchset introduces memory_region_set_enabled() and
  memory_region_set_address() to avoid the requirement on memory
  routers to track the internal state of the memory API (so they know
  whether they need to add or remove a region).  Instead, they can
  simply copy the state of the region from the guest-exposed register
  to the memory core, via the new mutator functions.

  Please review.  Do we need a memory_region_set_size() as well?

Would set_size() allow things like omap_gpmc() to avoid the need
to create an intermediate container subregion to enforce size
clipping on the child region it's trying to map?


I'd recommend not calling _set_size() on somebody else's region - this 
quickly leads to confusion.  Only call set_size() if you also called 
_init() and will call _destroy().


Can you point me at the code in question?

_set_size() may be useful for dynamic bridge windows and the like.


(Strictly speaking what omap_gpmc() wants is not merely clipping
to a guest-specified size but also wrapping, so you can take a
16MB child region and map the bottom 4MB of it repeating into
a 32MB chunk of address space, say. But that would require a lot
of playing games with aliases to implement a bizarre corner
case that nobody uses in practice.)



That's best done in the memory core, the rendering loop can be adjusted 
to do this replication.


--
error compiling committee.c: too many arguments to function

[Qemu-devel] [PATCH 17/58] PPC: E500: Use generic kvm function for freq

2011-09-14 Thread Alexander Graf

Now that we have generic KVM functions to read out the host tb and clock
frequencies, let's use them in the e500 code!

Signed-off-by: Alexander Graf ag...@suse.de
---
 hw/ppce500_mpc8544ds.c |   44 +---
 1 files changed, 9 insertions(+), 35 deletions(-)

diff --git a/hw/ppce500_mpc8544ds.c b/hw/ppce500_mpc8544ds.c
index 9cb01f3..8748531 100644
--- a/hw/ppce500_mpc8544ds.c
+++ b/hw/ppce500_mpc8544ds.c
@@ -14,8 +14,6 @@
  * (at your option) any later version.
  */
 
-#include <dirent.h>
-
 #include "config.h"
 #include "qemu-common.h"
 #include "net.h"
@@ -96,6 +94,9 @@ static int mpc8544_load_device_tree(CPUState *env,
 int fdt_size;
 void *fdt;
 uint8_t hypercall[16];
+char cpu_name[128] = "/cpus/PowerPC,8544@0";
+uint32_t clock_freq = 4;
+uint32_t tb_freq = 4;
 
 filename = qemu_find_file(QEMU_FILE_TYPE_BIOS, BINARY_DEVICE_TREE_FILE);
 if (!filename) {
@@ -133,32 +134,9 @@ static int mpc8544_load_device_tree(CPUState *env,
 fprintf(stderr, "couldn't set /chosen/bootargs\n");
 
 if (kvm_enabled()) {
-struct dirent *dirp;
-DIR *dp;
-char buf[128];
-
-if ((dp = opendir("/proc/device-tree/cpus/")) == NULL) {
-printf("Can't open directory /proc/device-tree/cpus/\n");
-ret = -1;
-goto out;
-}
-
-buf[0] = '\0';
-while ((dirp = readdir(dp)) != NULL) {
-if (strncmp(dirp->d_name, "PowerPC", 7) == 0) {
-snprintf(buf, 128, "/cpus/%s", dirp->d_name);
-break;
-}
-}
-closedir(dp);
-if (buf[0] == '\0') {
-printf("Unknow host!\n");
-ret = -1;
-goto out;
-}
-
-mpc8544_copy_soc_cell(fdt, buf, "clock-frequency");
-mpc8544_copy_soc_cell(fdt, buf, "timebase-frequency");
+/* Read out host's frequencies */
+clock_freq = kvmppc_get_clockfreq();
+tb_freq = kvmppc_get_tbfreq();
 
 /* indicate KVM hypercall interface */
 qemu_devtree_setprop_string(fdt, "/hypervisor", "compatible",
@@ -166,15 +144,11 @@ static int mpc8544_load_device_tree(CPUState *env,
 kvmppc_get_hypercall(env, hypercall, sizeof(hypercall));
 qemu_devtree_setprop(fdt, "/hypervisor", "hcall-instructions",
  hypercall, sizeof(hypercall));
-} else {
-const uint32_t freq = 4;
-
-qemu_devtree_setprop_cell(fdt, "/cpus/PowerPC,8544@0",
-  "clock-frequency", freq);
-qemu_devtree_setprop_cell(fdt, "/cpus/PowerPC,8544@0",
-  "timebase-frequency", freq);
 }
 
+qemu_devtree_setprop_cell(fdt, cpu_name, "clock-frequency", clock_freq);
+qemu_devtree_setprop_cell(fdt, cpu_name, "timebase-frequency", tb_freq);
+
 ret = rom_add_blob_fixed(BINARY_DEVICE_TREE_FILE, fdt, fdt_size, addr);
 g_free(fdt);
 
-- 
1.6.0.2

[Qemu-devel] [PATCH 04/58] PPC: Move openpic to target specific code compilation

2011-09-14 Thread Alexander Graf

The MPIC has some funny feature where it maps different registers to an MMIO
region depending which CPU accesses them.

To be able to reflect that, we need to make OpenPIC be compiled in the target
code, so it can access cpu_single_env.

Signed-off-by: Alexander Graf ag...@suse.de
---
 Makefile.objs   |1 -
 Makefile.target |2 ++
 2 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/Makefile.objs b/Makefile.objs
index 62020d7..60c63af 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -221,7 +221,6 @@ hw-obj-$(CONFIG_SMARTCARD_NSS) += ccid-card-emulated.o
 hw-obj-$(CONFIG_USB_REDIR) += usb-redir.o
 
 # PPC devices
-hw-obj-$(CONFIG_OPENPIC) += openpic.o
 hw-obj-$(CONFIG_PREP_PCI) += prep_pci.o
 # Mac shared devices
 hw-obj-$(CONFIG_MACIO) += macio.o
diff --git a/Makefile.target b/Makefile.target
index f708453..2ed9099 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -252,6 +252,8 @@ obj-ppc-y += ppce500_mpc8544ds.o mpc8544_guts.o
 obj-ppc-y += virtex_ml507.o
 obj-ppc-$(CONFIG_KVM) += kvm_ppc.o
 obj-ppc-$(CONFIG_FDT) += device_tree.o
+# PowerPC OpenPIC
+obj-ppc-y += openpic.o
 
 # Xilinx PPC peripherals
 obj-ppc-y += xilinx_intc.o
-- 
1.6.0.2

[Qemu-devel] [PATCH 45/58] ppc: booke206: add info tlb support

2011-09-14 Thread Alexander Graf

From: Scott Wood scottw...@freescale.com

Signed-off-by: Scott Wood scottw...@freescale.com
Signed-off-by: Alexander Graf ag...@suse.de
---
 hmp-commands.hx |2 +-
 monitor.c   |5 ++-
 target-ppc/cpu.h|2 +
 target-ppc/helper.c |   88 +++
 4 files changed, 94 insertions(+), 3 deletions(-)

diff --git a/hmp-commands.hx b/hmp-commands.hx
index 9e1cca8..506014c 100644
--- a/hmp-commands.hx
+++ b/hmp-commands.hx
@@ -1306,7 +1306,7 @@ show i8259 (PIC) state
 @item info pci
 show emulated PCI device info
 @item info tlb
-show virtual to physical memory mappings (i386, SH4 and SPARC only)
+show virtual to physical memory mappings (i386, SH4, SPARC, and PPC only)
 @item info mem
 show the active virtual memory mappings (i386 only)
 @item info jit
diff --git a/monitor.c b/monitor.c
index 03ae997..46bfeec 100644
--- a/monitor.c
+++ b/monitor.c
@@ -2456,7 +2456,7 @@ static void tlb_info(Monitor *mon)
 
 #endif
 
-#if defined(TARGET_SPARC)
+#if defined(TARGET_SPARC) || defined(TARGET_PPC)
 static void tlb_info(Monitor *mon)
 {
 CPUState *env1 = mon_get_cpu();
@@ -2949,7 +2949,8 @@ static const mon_cmd_t info_cmds[] = {
 .user_print = do_pci_info_print,
 .mhandler.info_new = do_pci_info,
 },
-#if defined(TARGET_I386) || defined(TARGET_SH4) || defined(TARGET_SPARC)
+#if defined(TARGET_I386) || defined(TARGET_SH4) || defined(TARGET_SPARC) || \
+defined(TARGET_PPC)
 {
 .name   = "tlb",
 .args_type  = "",
diff --git a/target-ppc/cpu.h b/target-ppc/cpu.h
index 3e7f797..5200e6e 100644
--- a/target-ppc/cpu.h
+++ b/target-ppc/cpu.h
@@ -2045,4 +2045,6 @@ static inline void cpu_pc_from_tb(CPUState *env, TranslationBlock *tb)
 env->nip = tb->pc;
 }
 
+void dump_mmu(FILE *f, fprintf_function cpu_fprintf, CPUState *env);
+
 #endif /* !defined (__CPU_PPC_H__) */
diff --git a/target-ppc/helper.c b/target-ppc/helper.c
index 5ec83f2..d1bc574 100644
--- a/target-ppc/helper.c
+++ b/target-ppc/helper.c
@@ -1465,6 +1465,94 @@ found_tlb:
 return ret;
 }
 
+static const char *book3e_tsize_to_str[32] = {
+"1K", "2K", "4K", "8K", "16K", "32K", "64K", "128K", "256K", "512K",
+"1M", "2M", "4M", "8M", "16M", "32M", "64M", "128M", "256M", "512M",
+"1G", "2G", "4G", "8G", "16G", "32G", "64G", "128G", "256G", "512G",
+"1T", "2T"
+};
+
+static void mmubooke206_dump_one_tlb(FILE *f, fprintf_function cpu_fprintf,
+ CPUState *env, int tlbn, int offset,
+ int tlbsize)
+{
+ppcmas_tlb_t *entry;
+int i;
+
+cpu_fprintf(f, "\nTLB%d:\n", tlbn);
+cpu_fprintf(f, "Effective  Physical   Size TID   TS SRWX URWX WIMGE U0123\n");
+
+entry = &env->tlb.tlbm[offset];
+for (i = 0; i < tlbsize; i++, entry++) {
+target_phys_addr_t ea, pa, size;
+int tsize;
+
+if (!(entry->mas1 & MAS1_VALID)) {
+continue;
+}
+
+tsize = (entry->mas1 & MAS1_TSIZE_MASK) >> MAS1_TSIZE_SHIFT;
+size = 1024ULL << tsize;
+ea = entry->mas2 & ~(size - 1);
+pa = entry->mas7_3 & ~(size - 1);
+
+cpu_fprintf(f, "0x%016" PRIx64 " 0x%016" PRIx64 " %4s %-5u %1u  S%c%c%c U%c%c%c %c%c%c%c%c U%c%c%c%c\n",
+(uint64_t)ea, (uint64_t)pa,
+book3e_tsize_to_str[tsize],
+(entry->mas1 & MAS1_TID_MASK) >> MAS1_TID_SHIFT,
+(entry->mas1 & MAS1_TS) >> MAS1_TS_SHIFT,
+entry->mas7_3 & MAS3_SR ? 'R' : '-',
+entry->mas7_3 & MAS3_SW ? 'W' : '-',
+entry->mas7_3 & MAS3_SX ? 'X' : '-',
+entry->mas7_3 & MAS3_UR ? 'R' : '-',
+entry->mas7_3 & MAS3_UW ? 'W' : '-',
+entry->mas7_3 & MAS3_UX ? 'X' : '-',
+entry->mas2 & MAS2_W ? 'W' : '-',
+entry->mas2 & MAS2_I ? 'I' : '-',
+entry->mas2 & MAS2_M ? 'M' : '-',
+entry->mas2 & MAS2_G ? 'G' : '-',
+entry->mas2 & MAS2_E ? 'E' : '-',
+entry->mas7_3 & MAS3_U0 ? '0' : '-',
+entry->mas7_3 & MAS3_U1 ? '1' : '-',
+entry->mas7_3 & MAS3_U2 ? '2' : '-',
+entry->mas7_3 & MAS3_U3 ? '3' : '-');
+}
+}
+
+static void mmubooke206_dump_mmu(FILE *f, fprintf_function cpu_fprintf,
+ CPUState *env)
+{
+int offset = 0;
+int i;
+
+if (kvm_enabled() && !env->kvm_sw_tlb) {
+cpu_fprintf(f, "Cannot access KVM TLB\n");
+return;
+}
+
+for (i = 0; i < BOOKE206_MAX_TLBN; i++) {
+int size = booke206_tlb_size(env, i);
+
+if (size == 0) {
+continue;
+}
+
+mmubooke206_dump_one_tlb(f, cpu_fprintf, env, i, offset, size);
+offset += size;
+}
+}
+
+void dump_mmu(FILE *f, fprintf_function cpu_fprintf, CPUState *env)
+{
+switch (env->mmu_model) {
+case POWERPC_MMU_BOOKE206:
+mmubooke206_dump_mmu(f, cpu_fprintf,

[Qemu-devel] [PATCH 28/58] device tree: give dt more size

2011-09-14 Thread Alexander Graf

We currently load a device tree blob and then just take its size x2 to
account for modifications we do inside. While this is nice and great,
it fails when we have a small device tree as blob and lots of nodes added
in machine init code.

So for now, just make it 20k bigger than it was before. We maybe want to
be more clever about this later.

Signed-off-by: Alexander Graf ag...@suse.de
---
 device_tree.c |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/device_tree.c b/device_tree.c
index 751538e..dc69232 100644
--- a/device_tree.c
+++ b/device_tree.c
@@ -41,6 +41,7 @@ void *load_device_tree(const char *filename_path, int *sizep)
 }
 
 /* Expand to 2x size to give enough room for manipulation.  */
+dt_size += 1;
 dt_size *= 2;
 /* First allocate space in qemu for device tree */
 fdt = g_malloc0(dt_size);
-- 
1.6.0.2
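
The sizing rule from the commit message can be written out as below. This is a sketch only: padded_dt_size() is a hypothetical helper, and DT_SLACK is an assumed value picked so that, after the existing doubling, the buffer ends up about 20k larger than before, as the message states.

```c
#include <stddef.h>

/* Assumed slack value; it is doubled below, so it adds roughly 20k. */
#define DT_SLACK 10000

/* Size of the working buffer for a blob of blob_size bytes:
 * add the slack first, then double as the existing code does. */
static size_t padded_dt_size(size_t blob_size)
{
    return (blob_size + DT_SLACK) * 2;
}
```

For a 1000-byte blob the old rule gives 2000 bytes, while this rule gives 22000, which leaves room for nodes added in machine init code.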

Re: [Qemu-devel] [PATCH 05/58] PPC: Add CPU local MMIO regions to MPIC

2011-09-14 Thread Peter Maydell

On 14 September 2011 09:42, Alexander Graf ag...@suse.de wrote:
 The MPIC exports a register set for each CPU connected to it. They can all
 be accessed through specific registers or using a shadow page that is mapped
 differently depending on which CPU accesses it.

 This patch implements the shadow map, making it possible for guests to access
 the CPU local registers using the same address on each CPU.

 +static int get_current_cpu(void)
 +{
 +  return cpu_single_env->cpu_index;
 +}

This is the standard way of doing this (we use it on ARM as well), but
it's pretty clearly a hack. "which master sent this memory transaction"
is an attribute that ought to be passed down to the MMIO read/write
functions, really (along with other interesting things like "priv or
not?" and probably architecture specific attributes like ARM's
secure/non-secure); this matches how hardware does it where the
attributes are passed along as extra signals in the bus fabric.
(Sometimes hardware also does this by having buses from the different
cores be totally separate paths at the point where this kind of device
is connected, before merging together later; we don't really support
modelling that either :-))

Not a nak, just an observation while I'm thinking about it.

-- PMM
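
What passing the attribute down could look like, as a sketch only. TxAttrs, BankedDev and banked_read() are hypothetical; QEMU's MMIO callbacks at the time of this thread take no such argument.

```c
#include <stdint.h>

#define NB_CPUS 4

/* Hypothetical per-transaction attributes, passed alongside the address. */
typedef struct TxAttrs {
    int master;         /* index of the CPU (bus master) issuing the access */
    int privileged;     /* nonzero for a privileged access */
} TxAttrs;

/* Hypothetical device with one banked register per CPU. */
typedef struct BankedDev {
    uint32_t per_cpu_reg[NB_CPUS];
} BankedDev;

/* The read handler picks the bank from the attributes, not from a global. */
static uint32_t banked_read(const BankedDev *d, uint64_t addr, TxAttrs attrs)
{
    (void)addr;         /* every CPU uses the same address */
    if (attrs.master < 0 || attrs.master >= NB_CPUS) {
        return 0;
    }
    return d->per_cpu_reg[attrs.master];
}
```

Compared with reading cpu_single_env, the handler here has no hidden dependency on which CPU happens to be executing.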

[Qemu-devel] [PATCH 05/58] PPC: Add CPU local MMIO regions to MPIC

2011-09-14 Thread Alexander Graf

The MPIC exports a register set for each CPU connected to it. They can all
be accessed through specific registers or using a shadow page that is mapped
differently depending on which CPU accesses it.

This patch implements the shadow map, making it possible for guests to access
the CPU local registers using the same address on each CPU.

Signed-off-by: Alexander Graf ag...@suse.de
---
 hw/openpic.c |  110 ++
 1 files changed, 72 insertions(+), 38 deletions(-)

diff --git a/hw/openpic.c b/hw/openpic.c
index 26c96e2..cf89f23 100644
--- a/hw/openpic.c
+++ b/hw/openpic.c
@@ -2,6 +2,7 @@
  * OpenPIC emulation
  *
  * Copyright (c) 2004 Jocelyn Mayer
+ *   2011 Alexander Graf
  *
  * Permission is hereby granted, free of charge, to any person obtaining a copy
  * of this software and associated documentation files (the "Software"), to deal
@@ -161,6 +162,16 @@ static inline int test_bit (uint32_t *field, int bit)
 return (field[bit >> 5] & 1 << (bit & 0x1F)) != 0;
 }
 
+static int get_current_cpu(void)
+{
+  return cpu_single_env->cpu_index;
+}
+
+static uint32_t openpic_cpu_read_internal(void *opaque, target_phys_addr_t addr,
+  int idx);
+static void openpic_cpu_write_internal(void *opaque, target_phys_addr_t addr,
+   uint32_t val, int idx);
+
 enum {
 IRQ_EXTERNAL = 0x01,
 IRQ_INTERNAL = 0x02,
@@ -590,18 +601,27 @@ static void openpic_gbl_write (void *opaque, target_phys_addr_t addr, uint32_t v
 DPRINTF("%s: addr " TARGET_FMT_plx " <= %08x\n", __func__, addr, val);
 if (addr & 0xF)
 return;
-addr &= 0xFF;
 switch (addr) {
-case 0x00: /* FREP */
+case 0x40:
+case 0x50:
+case 0x60:
+case 0x70:
+case 0x80:
+case 0x90:
+case 0xA0:
+case 0xB0:
+openpic_cpu_write_internal(opp, addr, val, get_current_cpu());
+break;
+case 0x1000: /* FREP */
 break;
-case 0x20: /* GLBC */
+case 0x1020: /* GLBC */
 if (val & 0x8000 && opp->reset)
 opp->reset(opp);
 opp->glbc = val & ~0x8000;
 break;
-case 0x80: /* VENI */
+case 0x1080: /* VENI */
 break;
-case 0x90: /* PINT */
+case 0x1090: /* PINT */
 for (idx = 0; idx < opp->nb_cpus; idx++) {
 if ((val & (1 << idx)) && !(opp->pint & (1 << idx))) {
 DPRINTF("Raise OpenPIC RESET output for CPU %d\n", idx);
@@ -615,22 +635,20 @@ static void openpic_gbl_write (void *opaque, target_phys_addr_t addr, uint32_t v
 }
 opp->pint = val;
 break;
-#if MAX_IPI > 0
-case 0xA0: /* IPI_IPVP */
-case 0xB0:
-case 0xC0:
-case 0xD0:
+case 0x10A0: /* IPI_IPVP */
+case 0x10B0:
+case 0x10C0:
+case 0x10D0:
 {
 int idx;
-idx = (addr - 0xA0) >> 4;
+idx = (addr - 0x10A0) >> 4;
 write_IRQreg(opp, opp->irq_ipi0 + idx, IRQ_IPVP, val);
 }
 break;
-#endif
-case 0xE0: /* SPVE */
+case 0x10E0: /* SPVE */
 opp->spve = val & 0x00FF;
 break;
-case 0xF0: /* TIFR */
+case 0x10F0: /* TIFR */
 opp->tifr = val;
 break;
 default:
@@ -647,36 +665,43 @@ static uint32_t openpic_gbl_read (void *opaque, target_phys_addr_t addr)
 retval = 0x;
 if (addr & 0xF)
 return retval;
-addr &= 0xFF;
 switch (addr) {
-case 0x00: /* FREP */
+case 0x1000: /* FREP */
 retval = opp->frep;
 break;
-case 0x20: /* GLBC */
+case 0x1020: /* GLBC */
 retval = opp->glbc;
 break;
-case 0x80: /* VENI */
+case 0x1080: /* VENI */
 retval = opp->veni;
 break;
-case 0x90: /* PINT */
+case 0x1090: /* PINT */
 retval = 0x;
 break;
-#if MAX_IPI > 0
-case 0xA0: /* IPI_IPVP */
+case 0x40:
+case 0x50:
+case 0x60:
+case 0x70:
+case 0x80:
+case 0x90:
+case 0xA0:
 case 0xB0:
-case 0xC0:
-case 0xD0:
+retval = openpic_cpu_read_internal(opp, addr, get_current_cpu());
+break;
+case 0x10A0: /* IPI_IPVP */
+case 0x10B0:
+case 0x10C0:
+case 0x10D0:
 {
 int idx;
-idx = (addr - 0xA0) >> 4;
+idx = (addr - 0x10A0) >> 4;
 retval = read_IRQreg(opp, opp->irq_ipi0 + idx, IRQ_IPVP);
 }
 break;
-#endif
-case 0xE0: /* SPVE */
+case 0x10E0: /* SPVE */
 retval = opp->spve;
 break;
-case 0xF0: /* TIFR */
+case 0x10F0: /* TIFR */
 retval = opp->tifr;
 break;
 default:
@@ -794,23 +819,23 @@ static uint32_t openpic_src_read (void *opaque, uint32_t addr)
 return retval;
 }
 
-static void openpic_cpu_write (void *opaque, target_phys_addr_t addr, uint32_t 
val)
+static void openpic_cpu_write_internal(void *opaque, target_phys_addr_t addr,
+

Re: [Qemu-devel] [PATCH 05/58] PPC: Add CPU local MMIO regions to MPIC

2011-09-14 Thread Alexander Graf


Am 14.09.2011 um 12:07 schrieb Peter Maydell peter.mayd...@linaro.org:

 On 14 September 2011 09:42, Alexander Graf ag...@suse.de wrote:
 The MPIC exports a register set for each CPU connected to it. They can all
 be accessed through specific registers or using a shadow page that is mapped
 differently depending on which CPU accesses it.
 
 This patch implements the shadow map, making it possible for guests to access
 the CPU local registers using the same address on each CPU.
 
 +static int get_current_cpu(void)
 +{
 +  return cpu_single_env->cpu_index;
 +}
 
 This is the standard way of doing this (we use it on ARM as well), but
 it's pretty clearly a hack. "which master sent this memory transaction"
 is an attribute that ought to be passed down to the MMIO read/write
 functions, really (along with other interesting things like "priv or
 not?" and probably architecture specific attributes like ARM's
 secure/non-secure); this matches how hardware does it where the
 attributes are passed along as extra signals in the bus fabric.
 (Sometimes hardware also does this by having buses from the different
 cores be totally separate paths at the point where this kind of device
 is connected, before merging together later; we don't really support
 modelling that either :-))
 
 Not a nak, just an observation while I'm thinking about it.

Yeah, I tend to agree in general. I'm not 100% sure in this case, as it's 
almost an in-cpu device. But it would be nice to pass this information on the 
mmio callbacks.

However, right now this is the only way to do it, as we don't have the pretty 
flexible one implemented yet ;).

Alex

[Qemu-devel] [PATCH 48/58] pseries: Implement hcall-bulk hypervisor interface

2011-09-14 Thread Alexander Graf

From: David Gibson da...@gibson.dropbear.id.au

This patch adds support for the H_REMOVE_BULK hypercall on the pseries
machine.  Strictly speaking this isn't necessary, since the kernel will
only attempt to use this if hcall-bulk is advertised in the device tree,
which previously it was not.

Adding this support may give a marginal performance increase, but more
importantly it reduces the differences between the emulated machine and
an existing PowerVM or kvm system, both of which already implement
hcall-bulk.

Signed-off-by: David Gibson da...@gibson.dropbear.id.au
Signed-off-by: Alexander Graf ag...@suse.de
---
 hw/spapr.c   |2 +-
 hw/spapr_hcall.c |  125 -
 2 files changed, 114 insertions(+), 13 deletions(-)

diff --git a/hw/spapr.c b/hw/spapr.c
index 91953cf..deb4ae5 100644
--- a/hw/spapr.c
+++ b/hw/spapr.c
@@ -75,7 +75,7 @@ static void *spapr_create_fdt_skel(const char *cpu_model,
 uint32_t end_prop = cpu_to_be32(initrd_base + initrd_size);
 uint32_t pft_size_prop[] = {0, cpu_to_be32(hash_shift)};
 char hypertas_prop[] = "hcall-pft\0hcall-term\0hcall-dabr\0hcall-interrupt"
-"\0hcall-tce\0hcall-vio\0hcall-splpar";
+"\0hcall-tce\0hcall-vio\0hcall-splpar\0hcall-bulk";
 uint32_t interrupt_server_ranges_prop[] = {0, cpu_to_be32(smp_cpus)};
 int i;
 char *modelname;
diff --git a/hw/spapr_hcall.c b/hw/spapr_hcall.c
index 0c61c10..84281be 100644
--- a/hw/spapr_hcall.c
+++ b/hw/spapr_hcall.c
@@ -174,20 +174,26 @@ static target_ulong h_enter(CPUState *env, 
sPAPREnvironment *spapr,
 return H_SUCCESS;
 }
 
-static target_ulong h_remove(CPUState *env, sPAPREnvironment *spapr,
- target_ulong opcode, target_ulong *args)
+enum {
+REMOVE_SUCCESS = 0,
+REMOVE_NOT_FOUND = 1,
+REMOVE_PARM = 2,
+REMOVE_HW = 3,
+};
+
+static target_ulong remove_hpte(CPUState *env, target_ulong ptex,
+target_ulong avpn,
+target_ulong flags,
+target_ulong *vp, target_ulong *rp)
 {
-target_ulong flags = args[0];
-target_ulong pte_index = args[1];
-target_ulong avpn = args[2];
 uint8_t *hpte;
 target_ulong v, r, rb;
 
-if ((pte_index * HASH_PTE_SIZE_64) & ~env->htab_mask) {
-return H_PARAMETER;
+if ((ptex * HASH_PTE_SIZE_64) & ~env->htab_mask) {
+return REMOVE_PARM;
 }
 
-hpte = env->external_htab + (pte_index * HASH_PTE_SIZE_64);
+hpte = env->external_htab + (ptex * HASH_PTE_SIZE_64);
 while (!lock_hpte(hpte, HPTE_V_HVLOCK)) {
 /* We have no real concurrency in qemu soft-emulation, so we
  * will never actually have a contested lock */
@@ -202,14 +208,106 @@ static target_ulong h_remove(CPUState *env, 
sPAPREnvironment *spapr,
 ((flags  H_ANDCOND)  (v  avpn) != 0)) {
 stq_p(hpte, v  ~HPTE_V_HVLOCK);
 assert(!(ldq_p(hpte)  HPTE_V_HVLOCK));
-return H_NOT_FOUND;
+return REMOVE_NOT_FOUND;
 }
-args[0] = v  ~HPTE_V_HVLOCK;
-args[1] = r;
+*vp = v  ~HPTE_V_HVLOCK;
+*rp = r;
 stq_p(hpte, 0);
-rb = compute_tlbie_rb(v, r, pte_index);
+rb = compute_tlbie_rb(v, r, ptex);
 ppc_tlb_invalidate_one(env, rb);
 assert(!(ldq_p(hpte) & HPTE_V_HVLOCK));
+return REMOVE_SUCCESS;
+}
+
+static target_ulong h_remove(CPUState *env, sPAPREnvironment *spapr,
+ target_ulong opcode, target_ulong *args)
+{
+target_ulong flags = args[0];
+target_ulong pte_index = args[1];
+target_ulong avpn = args[2];
+int ret;
+
+ret = remove_hpte(env, pte_index, avpn, flags,
+  &args[0], &args[1]);
+
+switch (ret) {
+case REMOVE_SUCCESS:
+return H_SUCCESS;
+
+case REMOVE_NOT_FOUND:
+return H_NOT_FOUND;
+
+case REMOVE_PARM:
+return H_PARAMETER;
+
+case REMOVE_HW:
+return H_HARDWARE;
+}
+
+assert(0);
+}
+
+#define H_BULK_REMOVE_TYPE 0xc000ULL
+#define   H_BULK_REMOVE_REQUEST0x4000ULL
+#define   H_BULK_REMOVE_RESPONSE   0x8000ULL
+#define   H_BULK_REMOVE_END0xc000ULL
+#define H_BULK_REMOVE_CODE 0x3000ULL
+#define   H_BULK_REMOVE_SUCCESS0xULL
+#define   H_BULK_REMOVE_NOT_FOUND  0x1000ULL
+#define   H_BULK_REMOVE_PARM   0x2000ULL
+#define   H_BULK_REMOVE_HW 0x3000ULL
+#define H_BULK_REMOVE_RC   0x0c00ULL
+#define H_BULK_REMOVE_FLAGS0x0300ULL
+#define   H_BULK_REMOVE_ABSOLUTE   0xULL
+#define   H_BULK_REMOVE_ANDCOND0x0100ULL
+#define   H_BULK_REMOVE_AVPN   0x0200ULL
+#define H_BULK_REMOVE_PTEX 0x00ffULL
+
+#define H_BULK_REMOVE_MAX_BATCH4
+
+static target_ulong

[Qemu-devel] [PATCH 36/58] pseries: Bugfixes for interrupt numbering in XICS code

2011-09-14 Thread Alexander Graf

From: David Gibson da...@gibson.dropbear.id.au

The implementation of the XICS interrupt controller contains several
(difficult to trigger) bugs due to the fact that we were not 100%
consistent with which irq numbering we used.  In most places, global
numbers were used as handled by the presentation layer, however a few
functions took local numberings, that is the source number within
the interrupt source controller which is offset from the global
number.  In most cases the function and its caller agreed on this, but
in a few cases they didn't.

This patch cleans this up by always using global numbering.
Translation to the local number is now always and only done when we
look up the individual interrupt source state structure.  This should
remove the existing bugs and with luck reduce the chances of
re-introducing such bugs.

Signed-off-by: David Gibson da...@gibson.dropbear.id.au
Signed-off-by: Alexander Graf ag...@suse.de
---
 hw/xics.c |   17 -
 1 files changed, 8 insertions(+), 9 deletions(-)

diff --git a/hw/xics.c b/hw/xics.c
index 9bf82aa..bd8d4cd 100644
--- a/hw/xics.c
+++ b/hw/xics.c
@@ -187,17 +187,17 @@ static int ics_valid_irq(struct ics_state *ics, uint32_t 
nr)
  && (nr < (ics->offset + ics->nr_irqs));
 }
 
-static void ics_set_irq_msi(void *opaque, int nr, int val)
+static void ics_set_irq_msi(void *opaque, int srcno, int val)
 {
 struct ics_state *ics = (struct ics_state *)opaque;
-struct ics_irq_state *irq = ics->irqs + nr;
+struct ics_irq_state *irq = ics->irqs + srcno;
 
 if (val) {
 if (irq->priority == 0xff) {
 irq->masked_pending = 1;
 /* masked pending */ ;
 } else  {
-icp_irq(ics->icp, irq->server, nr + ics->offset, irq->priority);
+icp_irq(ics->icp, irq->server, srcno + ics->offset, irq->priority);
 }
 }
 }
@@ -229,7 +229,7 @@ static void ics_resend_msi(struct ics_state *ics)
 static void ics_write_xive_msi(struct ics_state *ics, int nr, int server,
uint8_t priority)
 {
-struct ics_irq_state *irq = ics->irqs + nr;
+struct ics_irq_state *irq = ics->irqs + nr - ics->offset;
 
 irq->server = server;
 irq->priority = priority;
@@ -239,7 +239,7 @@ static void ics_write_xive_msi(struct ics_state *ics, int 
nr, int server,
 }
 
 irq->masked_pending = 0;
-icp_irq(ics->icp, server, nr + ics->offset, priority);
+icp_irq(ics-icp, server, nr, priority);
 }
 
 static void ics_reject(struct ics_state *ics, int nr)
@@ -334,7 +334,7 @@ static void rtas_set_xive(sPAPREnvironment *spapr, uint32_t 
token,
 return;
 }
 
-ics_write_xive_msi(ics, nr - ics->offset, server, priority);
+ics_write_xive_msi(ics, nr, server, priority);
 
 rtas_st(rets, 0, 0); /* Success */
 }
@@ -388,7 +388,7 @@ static void rtas_int_off(sPAPREnvironment *spapr, uint32_t 
token,
 struct ics_irq_state *irq = xics->irqs + (nr - xics->offset);
 
 irq->saved_priority = irq->priority;
-ics_write_xive_msi(xics, nr - xics->offset, irq->server, 0xff);
+ics_write_xive_msi(xics, nr, irq-server, 0xff);
 #endif
 
 rtas_st(rets, 0, 0); /* Success */
@@ -418,8 +418,7 @@ static void rtas_int_on(sPAPREnvironment *spapr, uint32_t 
token,
 #if 0
 struct ics_irq_state *irq = xics->irqs + (nr - xics->offset);
 
-ics_write_xive_msi(xics, nr - xics->offset,
-   irq->server, irq->saved_priority);
+ics_write_xive_msi(xics, nr, irq->server, irq->saved_priority);
 #endif
 
 rtas_st(rets, 0, 0); /* Success */
-- 
1.6.0.2
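
The numbering rule that the XICS patch above enforces can be sketched in isolation. This is only a minimal sketch and not the hw/xics.c code: the names demo_irq_state, demo_ics, demo_ics_valid_irq and demo_ics_lookup are hypothetical. The assumption is that callers pass global interrupt numbers everywhere, and that the source offset is subtracted in exactly one place, where the per-source state is looked up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-source interrupt state. */
struct demo_irq_state {
    int server;
    uint8_t priority;
};

/* Hypothetical stand-in for an interrupt source controller. */
struct demo_ics {
    uint32_t offset;              /* global number of the first source */
    uint32_t nr_irqs;             /* number of sources */
    struct demo_irq_state *irqs;  /* indexed by local source number */
};

/* Returns nonzero if the global number nr belongs to this controller. */
static int demo_ics_valid_irq(const struct demo_ics *ics, uint32_t nr)
{
    return nr >= ics->offset && nr < ics->offset + ics->nr_irqs;
}

/* The only place where a global number is turned into a local index. */
static struct demo_irq_state *demo_ics_lookup(struct demo_ics *ics, uint32_t nr)
{
    if (!demo_ics_valid_irq(ics, nr)) {
        return NULL;
    }
    return &ics->irqs[nr - ics->offset];
}
```

Keeping the translation in a single helper means a caller cannot accidentally apply the offset twice or not at all, which is the class of bug the commit message describes.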

[Qemu-devel] [PATCH 01/58] spapr: proper qdevification

2011-09-14 Thread Alexander Graf

From: Paolo Bonzini pbonz...@redhat.com

Right now the spapr devices cannot be instantiated with -device,
because the IRQs need to be passed to the spapr_*_create functions.
Do this instead in the bus's init wrapper.

This is particularly important with the conversion from scsi-disk
to scsi-{cd,hd} that Markus made.  After his patches, if you
specify a scsi-cd device attached to an if=none drive, the default
VSCSI controller will not be created and, without qdevification,
you will not be able to add yours.

NOTE from agraf: added small compile fix

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
Cc: Alexander Graf ag...@suse.de
Cc: David Gibson da...@gibson.dropbear.id.au
Signed-off-by: Alexander Graf ag...@suse.de
---
 hw/spapr.c   |   15 +--
 hw/spapr.h   |8 
 hw/spapr_llan.c  |7 +--
 hw/spapr_vio.c   |5 +
 hw/spapr_vio.h   |   13 -
 hw/spapr_vscsi.c |8 +---
 hw/spapr_vty.c   |8 +---
 7 files changed, 25 insertions(+), 39 deletions(-)

diff --git a/hw/spapr.c b/hw/spapr.c
index 1265cee..8cf93fe 100644
--- a/hw/spapr.c
+++ b/hw/spapr.c
@@ -298,7 +298,6 @@ static void ppc_spapr_init(ram_addr_t ram_size,
 long kernel_size, initrd_size, fw_size;
 long pteg_shift = 17;
 char *filename;
-int irq = 16;
 
 spapr = g_malloc(sizeof(*spapr));
 cpu_ppc_hypercall = emulate_spapr_hypercall;
@@ -360,15 +359,14 @@ static void ppc_spapr_init(ram_addr_t ram_size,
 /* Set up VIO bus */
 spapr->vio_bus = spapr_vio_bus_init();
 
-for (i = 0; i < MAX_SERIAL_PORTS; i++, irq++) {
+for (i = 0; i < MAX_SERIAL_PORTS; i++) {
 if (serial_hds[i]) {
 spapr_vty_create(spapr->vio_bus, SPAPR_VTY_BASE_ADDRESS + i,
- serial_hds[i], xics_find_qirq(spapr->icp, irq),
- irq);
+ serial_hds[i]);
 }
 }
 
-for (i = 0; i < nb_nics; i++, irq++) {
+for (i = 0; i < nb_nics; i++) {
 NICInfo *nd = &nd_table[i];
 
 if (!nd->model) {
@@ -376,8 +374,7 @@ static void ppc_spapr_init(ram_addr_t ram_size,
 }
 
 if (strcmp(nd->model, "ibmveth") == 0) {
-spapr_vlan_create(spapr->vio_bus, 0x1000 + i, nd,
-  xics_find_qirq(spapr->icp, irq), irq);
+spapr_vlan_create(spapr->vio_bus, 0x1000 + i, nd);
 } else {
 fprintf(stderr, "pSeries (sPAPR) platform does not support "
 "NIC model '%s' (only ibmveth is supported)\n",
@@ -387,9 +384,7 @@ static void ppc_spapr_init(ram_addr_t ram_size,
 }
 
 for (i = 0; i <= drive_get_max_bus(IF_SCSI); i++) {
-spapr_vscsi_create(spapr->vio_bus, 0x2000 + i,
-   xics_find_qirq(spapr->icp, irq), irq);
-irq++;
+spapr_vscsi_create(spapr->vio_bus, 0x2000 + i);
 }
 
 if (kernel_filename) {
diff --git a/hw/spapr.h b/hw/spapr.h
index 263691b..009c459 100644
--- a/hw/spapr.h
+++ b/hw/spapr.h
@@ -1,6 +1,8 @@
 #if !defined(__HW_SPAPR_H__)
 #define __HW_SPAPR_H__
 
+#include "hw/xics.h"
+
 struct VIOsPAPRBus;
 struct icp_state;
 
@@ -278,6 +280,12 @@ void spapr_register_hypercall(target_ulong opcode, 
spapr_hcall_fn fn);
 target_ulong spapr_hypercall(CPUState *env, target_ulong opcode,
  target_ulong *args);
 
+static inline qemu_irq spapr_find_qirq(sPAPREnvironment *spapr,
+int irq_num)
+{
+return xics_find_qirq(spapr->icp, irq_num);
+}
+
 static inline uint32_t rtas_ld(target_ulong phys, int n)
 {
 return ldl_be_phys(phys + 4*n);
diff --git a/hw/spapr_llan.c b/hw/spapr_llan.c
index c18efc7..2597748 100644
--- a/hw/spapr_llan.c
+++ b/hw/spapr_llan.c
@@ -195,11 +195,9 @@ static int spapr_vlan_init(VIOsPAPRDevice *sdev)
 return 0;
 }
 
-void spapr_vlan_create(VIOsPAPRBus *bus, uint32_t reg, NICInfo *nd,
-   qemu_irq qirq, uint32_t vio_irq_num)
+void spapr_vlan_create(VIOsPAPRBus *bus, uint32_t reg, NICInfo *nd)
 {
 DeviceState *dev;
-VIOsPAPRDevice *sdev;
 
 dev = qdev_create(&bus->bus, "spapr-vlan");
 qdev_prop_set_uint32(dev, "reg", reg);
@@ -207,9 +205,6 @@ void spapr_vlan_create(VIOsPAPRBus *bus, uint32_t reg, 
NICInfo *nd,
 qdev_set_nic_properties(dev, nd);
 
 qdev_init_nofail(dev);
-sdev = (VIOsPAPRDevice *)dev;
-sdev->qirq = qirq;
-sdev->vio_irq_num = vio_irq_num;
 }
 
 static int spapr_vlan_devnode(VIOsPAPRDevice *dev, void *fdt, int node_off)
diff --git a/hw/spapr_vio.c b/hw/spapr_vio.c
index ce6558b..ba2e1c1 100644
--- a/hw/spapr_vio.c
+++ b/hw/spapr_vio.c
@@ -32,6 +32,7 @@
 
 #include "hw/spapr.h"
 #include "hw/spapr_vio.h"
+#include "hw/xics.h"
 
 #ifdef CONFIG_FDT
 #include <libfdt.h>
@@ -595,6 +596,7 @@ static int spapr_vio_busdev_init(DeviceState *qdev, 
DeviceInfo *qinfo)
 {
 VIOsPAPRDeviceInfo *info = (VIOsPAPRDeviceInfo *)qinfo;
 VIOsPAPRDevice *dev = (VIOsPAPRDevice *)qdev;
+VIOsPAPRBus *bus =

[Qemu-devel] [PATCH 13/58] PPC: E500: Generate IRQ lines for many CPUs

2011-09-14 Thread Alexander Graf

Now that we can generate multiple envs for all our virtual CPUs, we
also need to tell the MPIC that we have multiple CPUs connected and
connect them all to the respective virtual interrupt lines.

Signed-off-by: Alexander Graf ag...@suse.de
---
 hw/ppce500_mpc8544ds.c |   17 -
 1 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/hw/ppce500_mpc8544ds.c b/hw/ppce500_mpc8544ds.c
index 8d05587..9cb01f3 100644
--- a/hw/ppce500_mpc8544ds.c
+++ b/hw/ppce500_mpc8544ds.c
@@ -237,7 +237,7 @@ static void mpc8544ds_init(ram_addr_t ram_size,
 target_long initrd_size=0;
 int i=0;
 unsigned int pci_irq_nrs[4] = {1, 2, 3, 4};
-qemu_irq *irqs, *mpic;
+qemu_irq **irqs, *mpic;
 DeviceState *dev;
 struct boot_info *boot_info;
 CPUState *firstenv = NULL;
@@ -247,6 +247,8 @@ static void mpc8544ds_init(ram_addr_t ram_size,
 cpu_model = "e500v2_v30";
 }
 
+irqs = g_malloc0(smp_cpus * sizeof(qemu_irq *));
+irqs[0] = g_malloc0(smp_cpus * sizeof(qemu_irq) * OPENPIC_OUTPUT_NB);
 for (i = 0; i < smp_cpus; i++) {
 qemu_irq *input;
 env = cpu_ppc_init(cpu_model);
@@ -259,6 +261,10 @@ static void mpc8544ds_init(ram_addr_t ram_size,
 firstenv = env;
 }
 
+irqs[i] = irqs[0] + (i * OPENPIC_OUTPUT_NB);
+input = (qemu_irq *)env->irq_inputs;
+irqs[i][OPENPIC_OUTPUT_INT] = input[PPCE500_INPUT_INT];
+irqs[i][OPENPIC_OUTPUT_CINT] = input[PPCE500_INPUT_CINT];
 env->spr[SPR_BOOKE_PIR] = env->cpu_index = i;
 
 /* XXX register timer? */
@@ -283,10 +289,11 @@ static void mpc8544ds_init(ram_addr_t ram_size,
 "mpc8544ds.ram", ram_size));
 
 /* MPIC */
-irqs = g_malloc0(sizeof(qemu_irq) * OPENPIC_OUTPUT_NB);
-irqs[OPENPIC_OUTPUT_INT] = ((qemu_irq *)env->irq_inputs)[PPCE500_INPUT_INT];
-irqs[OPENPIC_OUTPUT_CINT] = ((qemu_irq *)env->irq_inputs)[PPCE500_INPUT_CINT];
-mpic = mpic_init(MPC8544_MPIC_REGS_BASE, 1, irqs, NULL);
+mpic = mpic_init(MPC8544_MPIC_REGS_BASE, smp_cpus, irqs, NULL);
+
+if (!mpic) {
+cpu_abort(env, "MPIC failed to initialize\n");
+}
 
 /* Serial */
 if (serial_hds[0]) {
-- 
1.6.0.2
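
The allocation pattern used in the E500 patch above (one array of row pointers, backed by a single contiguous block that is carved into per-CPU rows) can be sketched on its own. This is a minimal sketch under stated assumptions: it uses plain int cells and calloc() instead of qemu_irq and g_malloc0(), and DEMO_OUTPUT_NB, demo_alloc_irq_table and demo_free_irq_table are hypothetical names, not QEMU API.

```c
#include <assert.h>
#include <stdlib.h>

#define DEMO_OUTPUT_NB 3   /* hypothetical number of output lines per CPU */

/* Allocate rows[cpu][line], with all rows living in one zeroed block. */
static int **demo_alloc_irq_table(size_t ncpus)
{
    int **rows;
    size_t i;

    if (ncpus == 0) {
        return NULL;
    }
    rows = calloc(ncpus, sizeof(*rows));
    if (!rows) {
        return NULL;
    }
    rows[0] = calloc(ncpus * DEMO_OUTPUT_NB, sizeof(**rows));
    if (!rows[0]) {
        free(rows);
        return NULL;
    }
    /* Each later row points DEMO_OUTPUT_NB cells further into the block. */
    for (i = 1; i < ncpus; i++) {
        rows[i] = rows[0] + i * DEMO_OUTPUT_NB;
    }
    return rows;
}

/* Two allocations were made, so two frees are enough. */
static void demo_free_irq_table(int **rows)
{
    if (rows) {
        free(rows[0]);
        free(rows);
    }
}
```

The benefit of this layout is that the table needs only two allocations regardless of the CPU count, while still being indexable as table[cpu][line].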

[Qemu-devel] [PATCH 34/58] PPC: Enable to use PAPR with PR style KVM

2011-09-14 Thread Alexander Graf

When running PR style KVM, we need to tell the kernel that we want
to run in PAPR mode now. This means that we need to pass some more
register information down and enable papr mode. We also need to align
the HTAB to htab_size boundary.

Using this patch, -M pseries works with kvm even on non-hv kvm
implementations, as long as the preceding kernel patches are in.

Signed-off-by: Alexander Graf ag...@suse.de

---

v1 -> v2:

  - match on CONFIG_PSERIES

v2 -> v3:

  - remove HIOR pieces from PAPR patch (ABI breakage)
---
 hw/spapr.c   |   14 +-
 target-ppc/kvm.c |   40 
 target-ppc/kvm_ppc.h |5 +
 3 files changed, 58 insertions(+), 1 deletions(-)

diff --git a/hw/spapr.c b/hw/spapr.c
index 8cf93fe..c5c9a95 100644
--- a/hw/spapr.c
+++ b/hw/spapr.c
@@ -38,6 +38,9 @@
 #include "hw/spapr_vio.h"
 #include "hw/xics.h"
 
+#include "kvm.h"
+#include "kvm_ppc.h"
+
 #include <libfdt.h>
 
 #define KERNEL_LOAD_ADDR0x
@@ -336,12 +339,21 @@ static void ppc_spapr_init(ram_addr_t ram_size,
  * later we should probably make it scale to the size of guest
  * RAM */
 spapr->htab_size = 1ULL << (pteg_shift + 7);
-spapr->htab = g_malloc(spapr->htab_size);
+spapr->htab = qemu_memalign(spapr->htab_size, spapr->htab_size);
 
 for (env = first_cpu; env != NULL; env = env->next_cpu) {
 env->external_htab = spapr->htab;
 env->htab_base = -1;
 env->htab_mask = spapr->htab_size - 1;
+
+/* Tell KVM that we're in PAPR mode */
+env->spr[SPR_SDR1] = (unsigned long)spapr->htab |
+ ((pteg_shift + 7) - 18);
+env->spr[SPR_HIOR] = 0;
+
+if (kvm_enabled()) {
+kvmppc_set_papr(env);
+}
 }
 
 filename = qemu_find_file(QEMU_FILE_TYPE_BIOS, "spapr-rtas.bin");
diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
index 77b98c4..f65b6e1 100644
--- a/target-ppc/kvm.c
+++ b/target-ppc/kvm.c
@@ -29,6 +29,10 @@
 #include "cpu.h"
 #include "device_tree.h"
 
+#include "hw/sysbus.h"
+#include "hw/spapr.h"
+#include "hw/spapr_vio.h"
+
 //#define DEBUG_KVM
 
 #ifdef DEBUG_KVM
@@ -455,6 +459,14 @@ int kvm_arch_handle_exit(CPUState *env, struct kvm_run 
*run)
 dprintf("handle halt\n");
 ret = kvmppc_handle_halt(env);
 break;
+#ifdef CONFIG_PSERIES
+case KVM_EXIT_PAPR_HCALL:
+dprintf("handle PAPR hypercall\n");
+run->papr_hcall.ret = spapr_hypercall(env, run->papr_hcall.nr,
+  run->papr_hcall.args);
+ret = 1;
+break;
+#endif
 default:
 fprintf(stderr, "KVM: unknown exit reason %d\n", run->exit_reason);
 ret = -1;
@@ -606,6 +618,34 @@ int kvmppc_get_hypercall(CPUState *env, uint8_t *buf, int 
buf_len)
 return 0;
 }
 
+void kvmppc_set_papr(CPUState *env)
+{
+struct kvm_enable_cap cap;
+int ret;
+
+memset(&cap, 0, sizeof(cap));
+cap.cap = KVM_CAP_PPC_PAPR;
+ret = kvm_vcpu_ioctl(env, KVM_ENABLE_CAP, &cap);
+
+if (ret) {
+goto fail;
+}
+
+/*
+ * XXX We set HIOR here. It really should be a qdev property of
+ * the CPU node, but we don't have CPUs converted to qdev yet.
+ *
+ * Once we have qdev CPUs, move HIOR to a qdev property and
+ * remove this chunk.
+ */
+/* XXX Set HIOR using new ioctl */
+
+return;
+
+fail:
+cpu_abort(env, "This KVM version does not support PAPR\n");
+}
+
 bool kvm_arch_stop_on_emulation_error(CPUState *env)
 {
 return true;
diff --git a/target-ppc/kvm_ppc.h b/target-ppc/kvm_ppc.h
index 76f98d9..c484e60 100644
--- a/target-ppc/kvm_ppc.h
+++ b/target-ppc/kvm_ppc.h
@@ -17,6 +17,7 @@ uint32_t kvmppc_get_tbfreq(void);
 uint64_t kvmppc_get_clockfreq(void);
 int kvmppc_get_hypercall(CPUState *env, uint8_t *buf, int buf_len);
 int kvmppc_set_interrupt(CPUState *env, int irq, int level);
+void kvmppc_set_papr(CPUState *env);
 
 #else
 
@@ -40,6 +41,10 @@ static inline int kvmppc_set_interrupt(CPUState *env, int 
irq, int level)
 return -1;
 }
 
+static inline void kvmppc_set_papr(CPUState *env)
+{
+}
+
 #endif
 
 #ifndef CONFIG_KVM
-- 
1.6.0.2
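
The reason the PAPR patch above aligns the HTAB to its own size can be illustrated with a little arithmetic. The following is a minimal sketch, assuming 128-byte PTE groups (hence the "+ 7") and the same size encoding the patch writes into SPR_SDR1; demo_htab_size and demo_sdr1 are hypothetical names. Because the table is aligned to its size, the low address bits are zero and the size code can be ORed in without disturbing the address.

```c
#include <assert.h>
#include <stdint.h>

/* Size in bytes of a hash table with 2^pteg_shift groups of 128 bytes each. */
static uint64_t demo_htab_size(unsigned pteg_shift)
{
    return 1ULL << (pteg_shift + 7);
}

/*
 * Combine a table address with the size code, mirroring the expression
 * in the patch: address | ((pteg_shift + 7) - 18).
 * The address must be aligned to the table size, and the table must be
 * at least 2^18 bytes so that the size code is not negative.
 */
static uint64_t demo_sdr1(uint64_t htab_addr, unsigned pteg_shift)
{
    uint64_t size = demo_htab_size(pteg_shift);

    assert(pteg_shift + 7 >= 18);
    assert((htab_addr & (size - 1)) == 0);
    return htab_addr | (uint64_t)((pteg_shift + 7) - 18);
}
```

With pteg_shift = 17, as in ppc_spapr_init(), the table is 16 MiB and the size code is 6. A g_malloc() result gives no such alignment guarantee, which is why the patch switches to qemu_memalign().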

Re: [Qemu-devel] [PATCH][RFC][0/2] REF+/REF- optimization

2011-09-14 Thread Kevin Wolf

Am 14.09.2011 11:52, schrieb Frediano Ziglio:
 2011/9/14 Kevin Wolf kw...@redhat.com:
 Am 13.09.2011 15:36, schrieb Frediano Ziglio:
 2011/9/13 Kevin Wolf kw...@redhat.com:
 Am 13.09.2011 09:53, schrieb Frediano Ziglio:
 These patches try to trade-off between leaks and speed for clusters
 refcounts.

 Refcount increments (REF+ or refp) are handled in a different way from
 decrements (REF- or refm). The reason is that postponing or not flushing
 a REF- causes just a leak, while postponing a REF+ causes corruption.

 To optimize REF- I just used an array to store offsets; when a
 flush is requested or the array reaches a limit (currently 1022), the array
 is sorted and written to disk. I use an array of offsets instead of
 ranges to support compression (an offset could appear multiple times
 in the array).
 I consider this patch quite ready.

 Ok, first of all let's clarify what this optimises. I don't think it
 changes anything at all for the writeback cache modes, because these
 already do most operations in memory only. So this must be about
 optimising some operations with cache=writethrough. REF- isn't about
 normal cluster allocation, it is about COW with internal snapshots or
 bdrv_discard. Do you have benchmarks for any of them?

 I strongly disagree with your approach for REF-. We already have a
 cache, and introducing a second one sounds like a bad idea. I think we
 could get a very similar effect if we introduced a
 qcow2_cache_entry_mark_dirty_wb() that marks a given refcount block as
 dirty, but at the same time tells the cache that even in write-through
 mode it can still treat this block as write-back. This should require
 much less code changes.


 Yes, this mainly optimizes writethrough. I did not test with writeback,
 but it should improve that too (I think there are some flushes there to
 keep consistency).
 I'll try to write a qcow2_cache_entry_mark_dirty_wb patch and test it.

 Great, thanks!

 
 However, don't expect the patch too soon; I'm quite busy these days.

Ok, no problem. It's not really urgent either.

 But let's measure the effects first, I suspect that for cluster
 allocation it doesn't help much because every REF- comes with a REF+.


 That's 50% of effort if REF- clusters are far from REF+ :)

 I would expect that the next REF+ allocates exactly the REF- cluster.
 But you still have a point, we save the write on REF- and combine it
 with the REF+ write.

 
 This is still a TODO for REF+ patch.

Actually, I was talking about the qcow2_cache_entry_mark_dirty_wb() case
without any other change. You get it automatically then.

 Oh... some time ago, looking at the refcount code, I realized that a single
 deallocation could in some cases be reused only after a QEMU restart.
 For instance:
 - we get a single-cluster REF- which takes the refcount to 0
 - free_cluster_index gets decreased to this index
 - we get a new allocation request for 2 clusters
 - free_cluster_index gets increased
 We skip the freed cluster, and unless we get a new deallocation for
 a cluster with an index lower than our freed cluster, this cluster is not
 reused.
 (I didn't test this behavior; no leak, no corruption, just the image could
 be larger than expected.)

Yes, I'm aware of that. I'm not sure if it matters in practice.

 To optimize REF+ I mark a range as allocated and use this range to
 get new ones (avoiding writing refcount to disk). When a flush is
 requested or in some situations (like snapshot) this cache is disabled
 and flushed (written as REF-).
 I do not consider this patch ready; it works and passes all io-tests,
 but for instance I would avoid allocating new clusters for refcount
 during preallocation.

 The only question here is if improving cache=writethrough cluster
 allocation performance is worth the additional complexity in the already
 complex refcounting code.


 I didn't see this optimization as a second level cache, but yes, for
 REF- it is a second cache.

 The alternative that was discussed before is the dirty bit approach that
 is used in QED and would allow us to use writeback for all refcount
 blocks, regardless of REF- or REF+. It would be an easier approach
 requiring less code changes, but it comes with the cost of requiring an
 fsck after a qemu crash.


 I was thinking about changing the header magic the first time we change a
 refcount, in order to mark the image as dirty, so that a newer QEMU
 recognizes the flag while an older one does not recognize the image.
 Obviously, the magic would be reverted on image close.

 We've discussed this idea before and I think it wasn't considered a
 great idea to automagically change the header in an incompatible way.
 But we can always say that for improved performance you need to upgrade
 your image to qcow2 v3.

 
 I don't understand why there is no wiki page for the detailed qcow3
 changes. I saw your post in May. I have followed this ML since August, so I
 think I missed a lot of discussion on qcow improvements.

Unfortunately there have been almost no comments, so you can consider
RFC v2 as the current proposal.

 End speed up is
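
The REF- batching described earlier in this thread (store the offsets in an array, then sort and write them out on flush or when the array reaches 1022 entries) can be sketched as follows. This is only an illustration of that description, not Frediano's patch: demo_refm_batch, demo_refm_add and demo_refm_drain are hypothetical names, and the sketch hands the sorted offsets back to the caller instead of writing refcount blocks to disk.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_BATCH_LIMIT 1022   /* limit mentioned in the thread */

/* Pending refcount decrements, stored as cluster offsets. */
struct demo_refm_batch {
    uint64_t offsets[DEMO_BATCH_LIMIT];
    size_t count;
};

static int demo_cmp_u64(const void *a, const void *b)
{
    uint64_t x = *(const uint64_t *)a;
    uint64_t y = *(const uint64_t *)b;

    return (x > y) - (x < y);
}

/*
 * Queue one decrement. The same offset may be queued several times.
 * Returns 1 when the batch is full and must be drained before the next add.
 */
static int demo_refm_add(struct demo_refm_batch *b, uint64_t offset)
{
    assert(b->count < DEMO_BATCH_LIMIT);
    b->offsets[b->count++] = offset;
    return b->count == DEMO_BATCH_LIMIT;
}

/*
 * Sort the queued offsets, copy them to out (which must have room for
 * DEMO_BATCH_LIMIT entries) and empty the batch. Returns the entry count.
 */
static size_t demo_refm_drain(struct demo_refm_batch *b, uint64_t *out)
{
    size_t n = b->count;

    qsort(b->offsets, n, sizeof(b->offsets[0]), demo_cmp_u64);
    memcpy(out, b->offsets, n * sizeof(b->offsets[0]));
    b->count = 0;
    return n;
}
```

Sorting before the write groups decrements that hit the same refcount block, so each block needs to be updated only once per drain. Losing a batch on a crash only leaks clusters, which is the property the thread relies on for REF-.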

[Qemu-devel] [PATCH 27/58] device tree: dont fail operations

2011-09-14 Thread Alexander Graf

When we screw up and issue an FDT command that doesn't work, we really need to
know immediately and usually can't continue to create the machine. To make sure
we don't need to add error checking in all device tree modification code users,
we can just add the fail checks to the qemu abstract functions.

Signed-off-by: Alexander Graf ag...@suse.de
---
 device_tree.c |   76 ++--
 1 files changed, 51 insertions(+), 25 deletions(-)

diff --git a/device_tree.c b/device_tree.c
index f4a78c8..751538e 100644
--- a/device_tree.c
+++ b/device_tree.c
@@ -72,56 +72,81 @@ fail:
 return NULL;
 }
 
-int qemu_devtree_setprop(void *fdt, const char *node_path,
- const char *property, void *val_array, int size)
+static int findnode_nofail(void *fdt, const char *node_path)
 {
 int offset;
 
 offset = fdt_path_offset(fdt, node_path);
-if (offset < 0)
-return offset;
+if (offset < 0) {
+fprintf(stderr, "%s Couldn't find node %s: %s\n", __func__, node_path,
+fdt_strerror(offset));
+exit(1);
+}
+
+return offset;
+}
+
+int qemu_devtree_setprop(void *fdt, const char *node_path,
+ const char *property, void *val_array, int size)
+{
+int r;
+
+r = fdt_setprop(fdt, findnode_nofail(fdt, node_path), property, val_array, size);
+if (r < 0) {
+fprintf(stderr, "%s: Couldn't set %s/%s: %s\n", __func__, node_path,
+property, fdt_strerror(r));
+exit(1);
+}
 
-return fdt_setprop(fdt, offset, property, val_array, size);
+return r;
 }
 
 int qemu_devtree_setprop_cell(void *fdt, const char *node_path,
   const char *property, uint32_t val)
 {
-int offset;
+int r;
 
-offset = fdt_path_offset(fdt, node_path);
-if (offset < 0)
-return offset;
+r = fdt_setprop_cell(fdt, findnode_nofail(fdt, node_path), property, val);
+if (r < 0) {
+fprintf(stderr, "%s: Couldn't set %s/%s = %#08x: %s\n", __func__,
+node_path, property, val, fdt_strerror(r));
+exit(1);
+}
 
-return fdt_setprop_cell(fdt, offset, property, val);
+return r;
 }
 
 int qemu_devtree_setprop_string(void *fdt, const char *node_path,
 const char *property, const char *string)
 {
-int offset;
+int r;
 
-offset = fdt_path_offset(fdt, node_path);
-if (offset < 0)
-return offset;
+r = fdt_setprop_string(fdt, findnode_nofail(fdt, node_path), property, string);
+if (r < 0) {
+fprintf(stderr, "%s: Couldn't set %s/%s = %s: %s\n", __func__,
+node_path, property, string, fdt_strerror(r));
+exit(1);
+}
 
-return fdt_setprop_string(fdt, offset, property, string);
+return r;
 }
 
 int qemu_devtree_nop_node(void *fdt, const char *node_path)
 {
-int offset;
+int r;
 
-offset = fdt_path_offset(fdt, node_path);
-if (offset < 0)
-return offset;
+r = fdt_nop_node(fdt, findnode_nofail(fdt, node_path));
+if (r < 0) {
+fprintf(stderr, "%s: Couldn't nop node %s: %s\n", __func__, node_path,
+fdt_strerror(r));
+exit(1);
+}
 
-return fdt_nop_node(fdt, offset);
+return r;
 }
 
 int qemu_devtree_add_subnode(void *fdt, const char *name)
 {
-int offset;
 char *dupname = g_strdup(name);
 char *basename = strrchr(dupname, '/');
 int retval;
@@ -133,12 +158,13 @@ int qemu_devtree_add_subnode(void *fdt, const char *name)
 basename[0] = '\0';
 basename++;
 
-offset = fdt_path_offset(fdt, dupname);
-if (offset < 0) {
-return offset;
+retval = fdt_add_subnode(fdt, findnode_nofail(fdt, dupname), basename);
+if (retval < 0) {
+fprintf(stderr, "FDT: Failed to create subnode %s: %s\n", name,
+fdt_strerror(retval));
+exit(1);
 }
 
-retval = fdt_add_subnode(fdt, offset, basename);
 g_free(dupname);
 return retval;
 }
-- 
1.6.0.2

[Qemu-devel] [PATCH 44/58] kvm: ppc: booke206: use MMU API

2011-09-14 Thread Alexander Graf

From: Scott Wood scottw...@freescale.com

Share the TLB array with KVM.  This allows us to set the initial TLB
both on initial boot and reset, is useful for debugging, and could
eventually be used to support migration.

Signed-off-by: Scott Wood scottw...@freescale.com
Signed-off-by: Alexander Graf ag...@suse.de
---
 hw/ppce500_mpc8544ds.c |2 +
 target-ppc/cpu.h   |2 +
 target-ppc/kvm.c   |   85 
 3 files changed, 89 insertions(+), 0 deletions(-)

diff --git a/hw/ppce500_mpc8544ds.c b/hw/ppce500_mpc8544ds.c
index b86a008..61151d8 100644
--- a/hw/ppce500_mpc8544ds.c
+++ b/hw/ppce500_mpc8544ds.c
@@ -189,6 +189,8 @@ static void mmubooke_create_initial_mapping(CPUState *env,
 tlb->mas2 = va & TARGET_PAGE_MASK;
 tlb->mas7_3 = pa & TARGET_PAGE_MASK;
 tlb-mas7_3 |= MAS3_UR | MAS3_UW | MAS3_UX | MAS3_SR | MAS3_SW | MAS3_SX;
+
+env->tlb_dirty = true;
 }
 
 static void mpc8544ds_cpu_reset_sec(void *opaque)
diff --git a/target-ppc/cpu.h b/target-ppc/cpu.h
index b8d42e0..3e7f797 100644
--- a/target-ppc/cpu.h
+++ b/target-ppc/cpu.h
@@ -934,6 +934,8 @@ struct CPUPPCState {
 ppc_tlb_t tlb;   /* TLB is optional. Allocate them only if needed*/
 /* 403 dedicated access protection registers */
 target_ulong pb[4];
+bool tlb_dirty;   /* Set to non-zero when modifying TLB  */
+bool kvm_sw_tlb;  /* non-zero if KVM SW TLB API is active*/
 #endif
 
 /* Other registers */
diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
index f65b6e1..35a6f10 100644
--- a/target-ppc/kvm.c
+++ b/target-ppc/kvm.c
@@ -112,6 +112,52 @@ static int kvm_arch_sync_sregs(CPUState *cenv)
 return kvm_vcpu_ioctl(cenv, KVM_SET_SREGS, &sregs);
 }
 
+/* Set up a shared TLB array with KVM */
+static int kvm_booke206_tlb_init(CPUState *env)
+{
+struct kvm_book3e_206_tlb_params params = {};
+struct kvm_config_tlb cfg = {};
+struct kvm_enable_cap encap = {};
+unsigned int entries = 0;
+int ret, i;
+
+if (!kvm_enabled() ||
+!kvm_check_extension(env->kvm_state, KVM_CAP_SW_TLB)) {
+return 0;
+}
+
+assert(ARRAY_SIZE(params.tlb_sizes) == BOOKE206_MAX_TLBN);
+
+for (i = 0; i < BOOKE206_MAX_TLBN; i++) {
+params.tlb_sizes[i] = booke206_tlb_size(env, i);
+params.tlb_ways[i] = booke206_tlb_ways(env, i);
+entries += params.tlb_sizes[i];
+}
+
+assert(entries == env->nb_tlb);
+assert(sizeof(struct kvm_book3e_206_tlb_entry) == sizeof(ppcmas_tlb_t));
+
+env->tlb_dirty = true;
+
+cfg.array = (uintptr_t)env->tlb.tlbm;
+cfg.array_len = sizeof(ppcmas_tlb_t) * entries;
+cfg.params = (uintptr_t)&params;
+cfg.mmu_type = KVM_MMU_FSL_BOOKE_NOHV;
+
+encap.cap = KVM_CAP_SW_TLB;
+encap.args[0] = (uintptr_t)&cfg;
+
+ret = kvm_vcpu_ioctl(env, KVM_ENABLE_CAP, &encap);
+if (ret < 0) {
+fprintf(stderr, "%s: couldn't enable KVM_CAP_SW_TLB: %s\n",
+__func__, strerror(-ret));
+return ret;
+}
+
+env->kvm_sw_tlb = true;
+return 0;
+}
+
 int kvm_arch_init_vcpu(CPUState *cenv)
 {
 int ret;
@@ -123,6 +169,15 @@ int kvm_arch_init_vcpu(CPUState *cenv)
 
 idle_timer = qemu_new_timer_ns(vm_clock, kvm_kick_env, cenv);
 
+/* Some targets support access to KVM's guest TLB. */
+switch (cenv->mmu_model) {
+case POWERPC_MMU_BOOKE206:
+ret = kvm_booke206_tlb_init(cenv);
+break;
+default:
+break;
+}
+
 return ret;
 }
 
@@ -130,6 +185,31 @@ void kvm_arch_reset_vcpu(CPUState *env)
 {
 }
 
+static void kvm_sw_tlb_put(CPUState *env)
+{
+struct kvm_dirty_tlb dirty_tlb;
+unsigned char *bitmap;
+int ret;
+
+if (!env->kvm_sw_tlb) {
+return;
+}
+
+bitmap = g_malloc((env->nb_tlb + 7) / 8);
+memset(bitmap, 0xFF, (env->nb_tlb + 7) / 8);
+
+dirty_tlb.bitmap = (uintptr_t)bitmap;
+dirty_tlb.num_dirty = env->nb_tlb;
+
+ret = kvm_vcpu_ioctl(env, KVM_DIRTY_TLB, &dirty_tlb);
+if (ret) {
+fprintf(stderr, "%s: KVM_DIRTY_TLB: %s\n",
+__func__, strerror(-ret));
+}
+
+g_free(bitmap);
+}
+
 int kvm_arch_put_registers(CPUState *env, int level)
 {
 struct kvm_regs regs;
@@ -167,6 +247,11 @@ int kvm_arch_put_registers(CPUState *env, int level)
 if (ret < 0)
 return ret;
 
+if (env->tlb_dirty) {
+kvm_sw_tlb_put(env);
+env->tlb_dirty = false;
+}
+
 return ret;
 }
 
-- 
1.6.0.2

Re: [Qemu-devel] [PATCH 00/12] nbd improvements

2011-09-14 Thread Kevin Wolf

Am 08.09.2011 17:24, schrieb Paolo Bonzini:
 I find nbd quite useful to test migration, but it is limited:
 it can only do synchronous operation, it is not safe because it
 does not support flush, and it has no discard either.  qemu-nbd
 is also limited to 1MB requests, and the nbd block driver does
 not take this into account.
 
 Luckily, flush/FUA support is being worked out by upstream,
 and discard can also be added with the same framework (patches
 1 to 6).
 
 Asynchronous support is also very similar to what sheepdog is
 already doing (patches 7 to 12).
 
 Paolo Bonzini (12):
   nbd: support feature negotiation
   nbd: sync API definitions with upstream
   nbd: support NBD_SET_FLAGS ioctl
   nbd: add support for NBD_CMD_FLUSH
   nbd: add support for NBD_CMD_FLAG_FUA
   nbd: support NBD_CMD_TRIM in the server
   sheepdog: add coroutine_fn markers
   add socket_set_block
   sheepdog: move coroutine send/recv function to generic code
   block: add bdrv_co_flush support
   nbd: switch to asynchronous operation
   nbd: split requests

Okay, completed the review for this series now. I think if you consider
the comments posted so far for v2 we should be good.

Kevin

Re: [Qemu-devel] [PATCH 0/3] Memory API mutators

2011-09-14 Thread Peter Maydell

On 14 September 2011 11:02, Avi Kivity a...@redhat.com wrote:
 On 09/14/2011 12:56 PM, Peter Maydell wrote:

 On 14 September 2011 10:23, Avi Kivitya...@redhat.com  wrote:
   This patchset introduces memory_region_set_enabled() and
   memory_region_set_address() to avoid the requirement on memory
   routers to track the internal state of the memory API (so they know
   whether they need to add or remove a region).  Instead, they can
   simply copy the state of the region from the guest-exposed register
   to the memory core, via the new mutator functions.
 
   Please review.  Do we need a memory_region_set_size() as well?

 Would set_size() allow things like omap_gpmc() to avoid the need
 to create an intermediate container subregion to enforce size
 clipping on the child region it's trying to map?

 I'd recommend not calling _set_size() on somebody else's region - this
 quickly leads to confusion.  Only call set_size() if you also called _init()
 and will call _destroy().

 Can you point me at the code in question?

hw/omap_gpmc.c:omap_gpmc_cs_map(). For each of the 8 children you
can connect to it, the GPMC has a base and mask register. The
hardware logic is effectively
 if ((address & mask) == base) { send transaction to this child }

(complicated only slightly by the register for base only having
bits [29:24] with the others implied-zero, and the register for
mask only having bits [27:24].) The effect is that you can use
the mask value to set the size of the area the child is mapped in.
(Silly mask settings with holes are discouraged by the TRM,
and the current code doesn't handle them.)

The repeated-in-the-space effect happens if the child is smaller
than the space it's in: the child hardware just ignores the higher
bits of the address so appears multiple times.

 (Strictly speaking what omap_gpmc() wants is not merely clipping
 to a guest-specified size but also wrapping, so you can take a
 16MB child region and map the bottom 4MB of it repeating into
 a 32MB chunk of address space, say. But that would require a lot
 of playing games with aliases to implement a bizarre corner
 case that nobody uses in practice.)

 That's best done in the memory core, the rendering loop can be adjusted to
 do this replication.

That would be nice, although as I say nobody is actually relying
on it so probably not worth the effort unless there's another user
for it.

-- PMM
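
The base/mask decode and the repeat-in-the-space effect described in the message above can be sketched as follows. This is a simplified sketch, not hw/omap_gpmc.c: demo_cs, demo_cs_selects and demo_cs_child_offset are hypothetical names, base and mask are taken as full 32-bit values rather than the narrow register fields, the mask is assumed to be contiguous (no holes), and the child size is assumed to be a power of two.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical description of one chip-select window. */
struct demo_cs {
    uint32_t base;        /* address bits that must match under the mask */
    uint32_t mask;        /* which address bits take part in the match */
    uint32_t child_size;  /* size of the attached child, a power of two */
};

/* Nonzero if a transaction at this address is sent to the child. */
static int demo_cs_selects(const struct demo_cs *cs, uint32_t address)
{
    return (address & cs->mask) == cs->base;
}

/*
 * Offset the child actually decodes. The bits cleared in the mask give
 * the offset inside the window; a child smaller than the window ignores
 * the higher bits, so its contents repeat across the window.
 */
static uint32_t demo_cs_child_offset(const struct demo_cs *cs, uint32_t address)
{
    return (address & ~cs->mask) & (cs->child_size - 1);
}
```

For example, a 4MB child behind a 32MB window at 0x04000000 (mask 0xfe000000) sees offset 0x10 for both 0x04000010 and 0x04400010.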

[Qemu-devel] [PATCH 55/58] ppc: move ADB stuff from ppc_mac.h to adb.h

2011-09-14 Thread Alexander Graf

From: Laurent Vivier laur...@vivier.eu

Allow ADB to be used on non-PPC Macintosh machines.

Signed-off-by: Laurent Vivier laur...@vivier.eu
Signed-off-by: Alexander Graf ag...@suse.de
---
 hw/adb.c  |2 +-
 hw/adb.h  |   67 +
 hw/cuda.c |1 +
 hw/ppc_mac.h  |   42 -
 hw/ppc_newworld.c |1 +
 hw/ppc_oldworld.c |1 +
 6 files changed, 71 insertions(+), 43 deletions(-)
 create mode 100644 hw/adb.h

diff --git a/hw/adb.c b/hw/adb.c
index 8dedbf8..aa15f55 100644
--- a/hw/adb.c
+++ b/hw/adb.c
@@ -22,7 +22,7 @@
  * THE SOFTWARE.
  */
 #include "hw.h"
-#include "ppc_mac.h"
+#include "adb.h"
 #include "console.h"
 
 /* debug ADB */
diff --git a/hw/adb.h b/hw/adb.h
new file mode 100644
index 0000000..b2a591c
--- /dev/null
+++ b/hw/adb.h
@@ -0,0 +1,67 @@
+/*
+ * QEMU ADB emulation shared definitions and prototypes
+ *
+ * Copyright (c) 2004-2007 Fabrice Bellard
+ * Copyright (c) 2007 Jocelyn Mayer
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+
+#if !defined(__ADB_H__)
+#define __ADB_H__
+
+#define MAX_ADB_DEVICES 16
+
+#define ADB_MAX_OUT_LEN 16
+
+typedef struct ADBDevice ADBDevice;
+
+/* buf = NULL means polling */
+typedef int ADBDeviceRequest(ADBDevice *d, uint8_t *buf_out,
+  const uint8_t *buf, int len);
+typedef int ADBDeviceReset(ADBDevice *d);
+
+struct ADBDevice {
+struct ADBBusState *bus;
+int devaddr;
+int handler;
+ADBDeviceRequest *devreq;
+ADBDeviceReset *devreset;
+void *opaque;
+};
+
+typedef struct ADBBusState {
+ADBDevice devices[MAX_ADB_DEVICES];
+int nb_devices;
+int poll_index;
+} ADBBusState;
+
+int adb_request(ADBBusState *s, uint8_t *buf_out,
+const uint8_t *buf, int len);
+int adb_poll(ADBBusState *s, uint8_t *buf_out);
+
+ADBDevice *adb_register_device(ADBBusState *s, int devaddr,
+   ADBDeviceRequest *devreq,
+   ADBDeviceReset *devreset,
+   void *opaque);
+void adb_kbd_init(ADBBusState *bus);
+void adb_mouse_init(ADBBusState *bus);
+
+extern ADBBusState adb_bus;
+#endif /* !defined(__ADB_H__) */
diff --git a/hw/cuda.c b/hw/cuda.c
index 5c92d81..6f05975 100644
--- a/hw/cuda.c
+++ b/hw/cuda.c
@@ -24,6 +24,7 @@
  */
 #include "hw.h"
 #include "ppc_mac.h"
+#include "adb.h"
 #include "qemu-timer.h"
 #include "sysemu.h"
 
diff --git a/hw/ppc_mac.h b/hw/ppc_mac.h
index 7351bb6..af75e45 100644
--- a/hw/ppc_mac.h
+++ b/hw/ppc_mac.h
@@ -77,46 +77,4 @@ void macio_nvram_setup_bar(MacIONVRAMState *s, MemoryRegion *bar,
 void pmac_format_nvram_partition (MacIONVRAMState *nvr, int len);
 uint32_t macio_nvram_read (void *opaque, uint32_t addr);
 void macio_nvram_write (void *opaque, uint32_t addr, uint32_t val);
-
-/* adb.c */
-
-#define MAX_ADB_DEVICES 16
-
-#define ADB_MAX_OUT_LEN 16
-
-typedef struct ADBDevice ADBDevice;
-
-/* buf = NULL means polling */
-typedef int ADBDeviceRequest(ADBDevice *d, uint8_t *buf_out,
-  const uint8_t *buf, int len);
-typedef int ADBDeviceReset(ADBDevice *d);
-
-struct ADBDevice {
-struct ADBBusState *bus;
-int devaddr;
-int handler;
-ADBDeviceRequest *devreq;
-ADBDeviceReset *devreset;
-void *opaque;
-};
-
-typedef struct ADBBusState {
-ADBDevice devices[MAX_ADB_DEVICES];
-int nb_devices;
-int poll_index;
-} ADBBusState;
-
-int adb_request(ADBBusState *s, uint8_t *buf_out,
-const uint8_t *buf, int len);
-int adb_poll(ADBBusState *s, uint8_t *buf_out);
-
-ADBDevice *adb_register_device(ADBBusState *s, int devaddr,
-   ADBDeviceRequest *devreq,
-   ADBDeviceReset *devreset,
-   void *opaque);
-void adb_kbd_init(ADBBusState *bus);
-void adb_mouse_init(ADBBusState *bus);
-
-extern ADBBusState adb_bus;
-
 #endif /*

Re: [Qemu-devel] [PATCH 05/58] PPC: Add CPU local MMIO regions to MPIC

2011-09-14 Thread Jan Kiszka

On 2011-09-14 12:07, Peter Maydell wrote:
 On 14 September 2011 09:42, Alexander Graf ag...@suse.de wrote:
 The MPIC exports a register set for each CPU connected to it. They can all
 be accessed through specific registers or using a shadow page that is mapped
 differently depending on which CPU accesses it.

 This patch implements the shadow map, making it possible for guests to access
 the CPU local registers using the same address on each CPU.
 
 +static int get_current_cpu(void)
 +{
 +  return cpu_single_env->cpu_index;
 +}
 
 This is the standard way of doing this (we use it on ARM as well), but
 it's pretty clearly a hack. "which master sent this memory transaction"
 is an attribute that ought to be passed down to the MMIO read/write
 functions, really (along with other interesting things like "priv or
 not?" and probably architecture specific attributes like ARM's
 secure/non-secure); this matches how hardware does it where the
 attributes are passed along as extra signals in the bus fabric.
 (Sometimes hardware also does this by having buses from the different
 cores be totally separate paths at the point where this kind of device
 is connected, before merging together later; we don't really support
 modelling that either :-))
 
 Not a nak, just an observation while I'm thinking about it.

Same problem has to be solved on x86. The way the local APIC is hooked
up right now is totally broken, just works by chance because normal
guests don't seriously stress the architecture.

If we start dispatching CPU memory accesses via per-CPU memory roots,
the problem can be solved without passing additional source information
to the callbacks.
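
To make the idea concrete, here is a minimal sketch of per-CPU roots. It is
not QEMU code: LocalRegs, Window, CpuRoot and every function below are
hypothetical names, and a real implementation would dispatch through a full
region tree rather than a single window. The point it shows is that the read
callback only sees an opaque pointer and an offset, yet each CPU reaches its
own register bank at the same guest address.

```c
#include <stddef.h>
#include <stdint.h>

#define NUM_CPUS 2

/* Hypothetical CPU-local register bank (stand-in for MPIC/APIC state). */
typedef struct {
    uint32_t regs[4];
} LocalRegs;

/* Hypothetical MMIO window: a guest address range bound to some state. */
typedef struct {
    uint64_t base;
    uint64_t size;
    LocalRegs *opaque;
} Window;

/* One memory root per CPU; here it holds just the CPU-local window. */
typedef struct {
    Window local;
} CpuRoot;

static LocalRegs banks[NUM_CPUS];
static CpuRoot roots[NUM_CPUS];

/* Map every CPU's private bank at the same guest address. */
static void roots_init(uint64_t base)
{
    for (int i = 0; i < NUM_CPUS; i++) {
        roots[i].local.base = base;
        roots[i].local.size = sizeof(banks[i].regs);
        roots[i].local.opaque = &banks[i];
    }
}

/* Device callback: no notion of which CPU issued the access. */
static uint32_t local_read(void *opaque, uint64_t offset)
{
    LocalRegs *r = opaque;
    return r->regs[offset / 4];
}

/* Dispatch through the issuing CPU's root. Returns 0 on a hit, -1 on a
 * miss. The range test subtracts first so it cannot overflow. */
static int cpu_mmio_read(int cpu, uint64_t addr, uint32_t *val)
{
    if (cpu < 0 || cpu >= NUM_CPUS) {
        return -1;
    }
    Window *w = &roots[cpu].local;
    if (addr < w->base || addr - w->base >= w->size) {
        return -1;
    }
    *val = local_read(w->opaque, addr - w->base);
    return 0;
}
```

The source information lives in the choice of root, not in an extra callback
argument, which is the trade-off being proposed here.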

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

[Qemu-devel] [PATCH 56/58] PPC: Fix via-cuda memory registration

2011-09-14 Thread Alexander Graf

Commit 23c5e4ca (convert to memory API) broke the VIA Cuda emulation layer
by not registering the IO structs.

This patch registers them properly and thus makes -M g3beige and -M mac99
work again.

Tested-by: Andreas Färber andreas.faer...@web.de
Signed-off-by: Alexander Graf ag...@suse.de
---
 hw/cuda.c |   28 
 1 files changed, 16 insertions(+), 12 deletions(-)

diff --git a/hw/cuda.c b/hw/cuda.c
index 6f05975..4077436 100644
--- a/hw/cuda.c
+++ b/hw/cuda.c
@@ -634,16 +634,20 @@ static uint32_t cuda_readl (void *opaque, target_phys_addr_t addr)
 return 0;
 }
 
-static CPUWriteMemoryFunc * const cuda_write[] = {
-cuda_writeb,
-cuda_writew,
-cuda_writel,
-};
-
-static CPUReadMemoryFunc * const cuda_read[] = {
-cuda_readb,
-cuda_readw,
-cuda_readl,
+static MemoryRegionOps cuda_ops = {
+.old_mmio = {
+.write = {
+cuda_writeb,
+cuda_writew,
+cuda_writel,
+},
+.read = {
+cuda_readb,
+cuda_readw,
+cuda_readl,
+},
+},
+.endianness = DEVICE_NATIVE_ENDIAN,
 };
 
 static bool cuda_timer_exist(void *opaque, int version_id)
@@ -740,8 +744,8 @@ void cuda_init (MemoryRegion **cuda_mem, qemu_irq irq)
 s->tick_offset = (uint32_t)mktimegm(&tm) + RTC_OFFSET;
 
 s->adb_poll_timer = qemu_new_timer_ns(vm_clock, cuda_adb_poll, s);
-cpu_register_io_memory(cuda_read, cuda_write, s,
- DEVICE_NATIVE_ENDIAN);
+memory_region_init_io(&s->mem, &cuda_ops, s, "cuda", 0x2000);
+
 *cuda_mem = &s->mem;
 vmstate_register(NULL, -1, &vmstate_cuda, s);
 qemu_register_reset(cuda_reset, s);
-- 
1.6.0.2

Re: [Qemu-devel] [PATCH 0/3] Memory API mutators

2011-09-14 Thread Jan Kiszka

On 2011-09-14 11:49, Avi Kivity wrote:
 Jan, too, was interested in this.
 
 On 09/14/2011 12:23 PM, Avi Kivity wrote:
 This patchset introduces memory_region_set_enabled() and
 memory_region_set_address() to avoid the requirement on memory
 routers to track the internal state of the memory API (so they know
 whether they need to add or remove a region).  Instead, they can
 simply copy the state of the region from the guest-exposed register
 to the memory core, via the new mutator functions.

 Please review.  Do we need a memory_region_set_size() as well?  Do we want

memory_region_set_attributes(mr,
 MR_ATTR_ENABLED | MR_ATTR_SIZE,
 &(MemoryRegionAttributes) {
 .enabled = s->enabled,
 .address = s->addr,
 });

 ?

 Avi Kivity (3):
memory: introduce memory_region_set_enabled()
memory: introduce memory_region_set_address()
memory: optimize empty transactions due to mutators

   memory.c |   64 
 -
   memory.h |   28 +++
   2 files changed, 82 insertions(+), 10 deletions(-)


Whatever the outcome is (tons of memory_region_set/get_X functions or
huge attribute structures + set/get_attributes), it should be consistent
for all attributes of a memory region. And there should be only one way
of doing this.

I think the decision multiple set/get vs. attribute struct depends on
some (estimated) usage stats: How many call sites will access multiple
attributes in one run and how many will only manipulate a single one?
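
For comparison, a toy sketch of the two styles follows. None of this is the
memory API itself: ToyRegion, ToyAttrs, the TOY_ATTR_* flags and all three
functions are hypothetical, and the "updates" counter merely stands in for
whatever work the core does when a region changes.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy region, not QEMU's MemoryRegion. */
typedef struct {
    bool enabled;
    uint64_t addr;
    unsigned updates;   /* how often the imaginary core had to re-render */
} ToyRegion;

/* Style 1: one mutator per attribute. A call that changes nothing is a
 * no-op, so callers can copy guest register state unconditionally. */
static void toy_set_enabled(ToyRegion *r, bool enabled)
{
    if (r->enabled != enabled) {
        r->enabled = enabled;
        r->updates++;
    }
}

static void toy_set_address(ToyRegion *r, uint64_t addr)
{
    if (r->addr != addr) {
        r->addr = addr;
        r->updates++;
    }
}

/* Style 2: one call taking a mask of valid fields plus a struct. */
enum {
    TOY_ATTR_ENABLED = 1 << 0,
    TOY_ATTR_ADDRESS = 1 << 1,
};

typedef struct {
    bool enabled;
    uint64_t addr;
} ToyAttrs;

static void toy_set_attributes(ToyRegion *r, unsigned mask, const ToyAttrs *a)
{
    bool changed = false;

    if ((mask & TOY_ATTR_ENABLED) && r->enabled != a->enabled) {
        r->enabled = a->enabled;
        changed = true;
    }
    if ((mask & TOY_ATTR_ADDRESS) && r->addr != a->addr) {
        r->addr = a->addr;
        changed = true;
    }
    if (changed) {
        r->updates++;   /* several attributes, a single update */
    }
}
```

In this toy, changing two attributes costs two updates with the per-attribute
mutators and one with the struct form, so the answer does depend on how often
call sites touch more than one attribute at a time.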

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

[Qemu-devel] [PATCH 39/58] pseries: More complete WIMG validation in H_ENTER code

2011-09-14 Thread Alexander Graf

From: David Gibson da...@gibson.dropbear.id.au

Currently our implementation of the H_ENTER hypercall, which inserts a
mapping in the hash page table, assumes that only ordinary memory is ever
mapped, and only permits mapping attribute bits accordingly (WIMG==0010).

However, we intend to start adding emulated IO to the pseries platform
(and real IO with PCI passthrough on kvm) which means this simple test
will no longer suffice.

This patch extends the h_enter validation code to check if the given
address is a RAM address.  If it is, it enforces WIMG==0010; otherwise
it assumes that it is an IO mapping and instead enforces WIMG=010x.

Signed-off-by: David Gibson da...@gibson.dropbear.id.au
Signed-off-by: Alexander Graf ag...@suse.de
---
 hw/spapr.c   |3 ++-
 hw/spapr.h   |1 +
 hw/spapr_hcall.c |   22 ++
 3 files changed, 21 insertions(+), 5 deletions(-)

diff --git a/hw/spapr.c b/hw/spapr.c
index 9eefef9..00aed62 100644
--- a/hw/spapr.c
+++ b/hw/spapr.c
@@ -336,7 +336,8 @@ static void ppc_spapr_init(ram_addr_t ram_size,
 }
 
 /* allocate RAM */
-ram_offset = qemu_ram_alloc(NULL, "ppc_spapr.ram", ram_size);
+spapr->ram_limit = ram_size;
+ram_offset = qemu_ram_alloc(NULL, "ppc_spapr.ram", spapr->ram_limit);
 cpu_register_physical_memory(0, ram_size, ram_offset);
 
 /* allocate hash page table.  For now we always make this 16mb,
diff --git a/hw/spapr.h b/hw/spapr.h
index 009c459..3d21b7a 100644
--- a/hw/spapr.h
+++ b/hw/spapr.h
@@ -10,6 +10,7 @@ typedef struct sPAPREnvironment {
 struct VIOsPAPRBus *vio_bus;
 struct icp_state *icp;
 
+target_phys_addr_t ram_limit;
 void *htab;
 long htab_size;
 target_phys_addr_t fdt_addr, rtas_addr;
diff --git a/hw/spapr_hcall.c b/hw/spapr_hcall.c
index f7ead04..70f853c 100644
--- a/hw/spapr_hcall.c
+++ b/hw/spapr_hcall.c
@@ -99,6 +99,8 @@ static target_ulong h_enter(CPUState *env, sPAPREnvironment *spapr,
 target_ulong pte_index = args[1];
 target_ulong pteh = args[2];
 target_ulong ptel = args[3];
+target_ulong page_shift = 12;
+target_ulong raddr;
 target_ulong i;
 uint8_t *hpte;
 
@@ -111,6 +113,7 @@ static target_ulong h_enter(CPUState *env, sPAPREnvironment *spapr,
 #endif
 if ((ptel & 0xff000) == 0) {
 /* 16M page */
+page_shift = 24;
 /* lowest AVA bit must be 0 for 16M pages */
 if (pteh & 0x80) {
 return H_PARAMETER;
@@ -120,12 +123,23 @@ static target_ulong h_enter(CPUState *env, sPAPREnvironment *spapr,
 }
 }
 
-/* FIXME: bounds check the pa? */
+raddr = (ptel & HPTE_R_RPN) & ~((1ULL << page_shift) - 1);
 
-/* Check WIMG */
-if ((ptel & HPTE_R_WIMG) != HPTE_R_M) {
-return H_PARAMETER;
+if (raddr < spapr->ram_limit) {
+/* Regular RAM - should have WIMG=0010 */
+if ((ptel & HPTE_R_WIMG) != HPTE_R_M) {
+return H_PARAMETER;
+}
+} else {
+/* Looks like an IO address */
+/* FIXME: What WIMG combinations could be sensible for IO?
+ * For now we allow WIMG=010x, but are there others? */
+/* FIXME: Should we check against registered IO addresses? */
+if ((ptel & (HPTE_R_W | HPTE_R_I | HPTE_R_M)) != HPTE_R_I) {
+return H_PARAMETER;
+}
 }
+
 pteh &= ~0x60ULL;
 
 if ((pte_index * HASH_PTE_SIZE_64) & ~env->htab_mask) {
-- 
1.6.0.2
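
The check added by the patch above can be tried outside QEMU with a small
standalone sketch. The TOY_R_* bit values below are assumptions made for
illustration (W, I, M, G as four adjacent bits in the low byte of the second
PTE doubleword, and a wide RPN mask); they are not copied from QEMU's
headers, and toy_wimg_ok() is a hypothetical helper rather than part of
h_enter().

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit layout, for illustration only. */
#define TOY_R_RPN  0x3ffffffffffff000ULL
#define TOY_R_W    0x40ULL
#define TOY_R_I    0x20ULL
#define TOY_R_M    0x10ULL
#define TOY_R_G    0x08ULL
#define TOY_R_WIMG (TOY_R_W | TOY_R_I | TOY_R_M | TOY_R_G)

/* Returns true when the WIMG bits in ptel are acceptable for the real
 * address it maps: exactly 0010 below ram_limit, 010x at or above it. */
static bool toy_wimg_ok(uint64_t ptel, unsigned page_shift, uint64_t ram_limit)
{
    /* Real address of the page: RPN field with the in-page bits cleared. */
    uint64_t raddr = (ptel & TOY_R_RPN) & ~((1ULL << page_shift) - 1);

    if (raddr < ram_limit) {
        /* Regular RAM: only M may be set. */
        return (ptel & TOY_R_WIMG) == TOY_R_M;
    }
    /* Treated as IO: I must be set, W and M clear, G is a don't-care. */
    return (ptel & (TOY_R_W | TOY_R_I | TOY_R_M)) == TOY_R_I;
}
```

Note that G is left out of the mask in the IO branch, which is what turns the
accepted pattern into 010x rather than a single fixed value.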

[Qemu-devel] [PATCH 35/58] PPC: SPAPR: Use KVM function for time info

2011-09-14 Thread Alexander Graf

One of the things we can't fake on PPC is the timer speed. So
we need to extract the frequency information from the host and
put it back into the guest device tree.

Luckily, we already have functions for that from the non-pseries
targets, so all we need to do is to connect the dots and the guest
suddenly gets to know its real timer speeds.

Signed-off-by: Alexander Graf ag...@suse.de
---
 hw/spapr.c |8 
 1 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/hw/spapr.c b/hw/spapr.c
index c5c9a95..760e323 100644
--- a/hw/spapr.c
+++ b/hw/spapr.c
@@ -140,6 +140,8 @@ static void *spapr_create_fdt_skel(const char *cpu_model,
 char *nodename;
 uint32_t segs[] = {cpu_to_be32(28), cpu_to_be32(40),
 0xffffffff, 0xffffffff};
+uint32_t tbfreq = kvm_enabled() ? kvmppc_get_tbfreq() : TIMEBASE_FREQ;
+uint32_t cpufreq = kvm_enabled() ? kvmppc_get_clockfreq() : 1000000000;
 
 if (asprintf(&nodename, "%s@%x", modelname, index) < 0) {
 fprintf(stderr, "Allocation failure\n");
@@ -158,10 +160,8 @@ static void *spapr_create_fdt_skel(const char *cpu_model,
 env->dcache_line_size)));
 _FDT((fdt_property_cell(fdt, "icache-block-size",
 env->icache_line_size)));
-_FDT((fdt_property_cell(fdt, "timebase-frequency", TIMEBASE_FREQ)));
-/* Hardcode CPU frequency for now.  It's kind of arbitrary on
- * full emu, for kvm we should copy it from the host */
-_FDT((fdt_property_cell(fdt, "clock-frequency", 1000000000)));
+_FDT((fdt_property_cell(fdt, "timebase-frequency", tbfreq)));
+_FDT((fdt_property_cell(fdt, "clock-frequency", cpufreq)));
 _FDT((fdt_property_cell(fdt, "ibm,slb-size", env->slb_nr)));
 _FDT((fdt_property(fdt, "ibm,pft-size",
 pft_size_prop, sizeof(pft_size_prop))));
-- 
1.6.0.2

[Qemu-devel] [PATCH 15/58] PPC: bamboo: Move host fdt copy to target

2011-09-14 Thread Alexander Graf

We have some code in generic kvm_ppc.c that is only used by 440. Move to
the 440 specific device code.

Signed-off-by: Alexander Graf ag...@suse.de
---
 hw/ppc440_bamboo.c   |   37 +++--
 target-ppc/kvm_ppc.c |   30 --
 target-ppc/kvm_ppc.h |1 -
 3 files changed, 35 insertions(+), 33 deletions(-)

diff --git a/hw/ppc440_bamboo.c b/hw/ppc440_bamboo.c
index 1addb68..65d4f0f 100644
--- a/hw/ppc440_bamboo.c
+++ b/hw/ppc440_bamboo.c
@@ -31,6 +31,38 @@
 #define FDT_ADDR 0x1800000
 #define RAMDISK_ADDR 0x1900000
 
+#ifdef CONFIG_FDT
+static int bamboo_copy_host_cell(void *fdt, const char *node, const char *prop)
+{
+uint32_t cell;
+int ret;
+
+ret = kvmppc_read_host_property(node, prop, &cell, sizeof(cell));
+if (ret < 0) {
+fprintf(stderr, "couldn't read host %s/%s\n", node, prop);
+goto out;
+}
+
+ret = qemu_devtree_setprop_cell(fdt, node, prop, cell);
+if (ret < 0) {
+fprintf(stderr, "couldn't set guest %s/%s\n", node, prop);
+goto out;
+}
+
+out:
+return ret;
+}
+
+static void bamboo_fdt_update(void *fdt)
+{
+/* Copy data from the host device tree into the guest. Since the guest can
+ * directly access the timebase without host involvement, we must expose
+ * the correct frequencies. */
+bamboo_copy_host_cell(fdt, "/cpus/cpu@0", "clock-frequency");
+bamboo_copy_host_cell(fdt, "/cpus/cpu@0", "timebase-frequency");
+}
+#endif
+
 static int bamboo_load_device_tree(target_phys_addr_t addr,
  uint32_t ramsize,
  target_phys_addr_t initrd_base,
@@ -76,8 +108,9 @@ static int bamboo_load_device_tree(target_phys_addr_t addr,
 if (ret < 0)
 fprintf(stderr, "couldn't set /chosen/bootargs\n");
 
-if (kvm_enabled())
-kvmppc_fdt_update(fdt);
+if (kvm_enabled()) {
+bamboo_fdt_update(fdt);
+}
 
 ret = rom_add_blob_fixed(BINARY_DEVICE_TREE_FILE, fdt, fdt_size, addr);
 g_free(fdt);
diff --git a/target-ppc/kvm_ppc.c b/target-ppc/kvm_ppc.c
index 867dc1d..233115e 100644
--- a/target-ppc/kvm_ppc.c
+++ b/target-ppc/kvm_ppc.c
@@ -54,36 +54,6 @@ free:
 free(path);
 return ret;
 }
-
-static int kvmppc_copy_host_cell(void *fdt, const char *node, const char *prop)
-{
-uint32_t cell;
-int ret;
-
-ret = kvmppc_read_host_property(node, prop, &cell, sizeof(cell));
-if (ret < 0) {
-fprintf(stderr, "couldn't read host %s/%s\n", node, prop);
-goto out;
-}
-
-ret = qemu_devtree_setprop_cell(fdt, node, prop, cell);
-if (ret < 0) {
-fprintf(stderr, "couldn't set guest %s/%s\n", node, prop);
-goto out;
-}
-
-out:
-return ret;
-}
-
-void kvmppc_fdt_update(void *fdt)
-{
-/* Copy data from the host device tree into the guest. Since the guest can
- * directly access the timebase without host involvement, we must expose
- * the correct frequencies. */
-kvmppc_copy_host_cell(fdt, "/cpus/cpu@0", "clock-frequency");
-kvmppc_copy_host_cell(fdt, "/cpus/cpu@0", "timebase-frequency");
-}
 #endif
 
 static void kvmppc_timer_hack(void *opaque)
diff --git a/target-ppc/kvm_ppc.h b/target-ppc/kvm_ppc.h
index 45a1373..2f32249 100644
--- a/target-ppc/kvm_ppc.h
+++ b/target-ppc/kvm_ppc.h
@@ -10,7 +10,6 @@
 #define __KVM_PPC_H__
 
 void kvmppc_init(void);
-void kvmppc_fdt_update(void *fdt);
 #ifndef CONFIG_KVM
 static inline int kvmppc_read_host_property(const char *node_path, const char *prop,
 void *val, size_t len)
-- 
1.6.0.2

[Qemu-devel] [PATCH 02/58] spapr: prepare for qdevification of irq

2011-09-14 Thread Alexander Graf

From: Paolo Bonzini pbonz...@redhat.com

Restructure common properties for sPAPR devices so that IRQ definitions
can be added in one place.

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
Cc: Alexander Graf ag...@suse.de
Cc: David Gibson da...@gibson.dropbear.id.au
Signed-off-by: Alexander Graf ag...@suse.de
---
 hw/spapr_llan.c  |4 +---
 hw/spapr_vio.h   |5 +
 hw/spapr_vscsi.c |4 +---
 hw/spapr_vty.c   |2 +-
 4 files changed, 8 insertions(+), 7 deletions(-)

diff --git a/hw/spapr_llan.c b/hw/spapr_llan.c
index 2597748..abe1297 100644
--- a/hw/spapr_llan.c
+++ b/hw/spapr_llan.c
@@ -495,9 +495,7 @@ static VIOsPAPRDeviceInfo spapr_vlan = {
 .qdev.name = "spapr-vlan",
 .qdev.size = sizeof(VIOsPAPRVLANDevice),
 .qdev.props = (Property[]) {
-DEFINE_PROP_UINT32("reg", VIOsPAPRDevice, reg, 0x1000),
-DEFINE_PROP_UINT32("dma-window", VIOsPAPRDevice, rtce_window_size,
-   0x1000),
+DEFINE_SPAPR_PROPERTIES(VIOsPAPRVLANDevice, sdev, 0x1000, 0x1000),
 DEFINE_NIC_PROPERTIES(VIOsPAPRVLANDevice, nicconf),
 DEFINE_PROP_END_OF_LIST(),
 },
diff --git a/hw/spapr_vio.h b/hw/spapr_vio.h
index faa5d94..7eb5367 100644
--- a/hw/spapr_vio.h
+++ b/hw/spapr_vio.h
@@ -60,6 +60,11 @@ typedef struct VIOsPAPRDevice {
 VIOsPAPR_CRQ crq;
 } VIOsPAPRDevice;
 
+#define DEFINE_SPAPR_PROPERTIES(type, field, default_reg, default_dma_window) \
+DEFINE_PROP_UINT32("reg", type, field.reg, default_reg), \
+DEFINE_PROP_UINT32("dma-window", type, field.rtce_window_size, \
+   default_dma_window)
+
 typedef struct VIOsPAPRBus {
 BusState bus;
 int irq;
diff --git a/hw/spapr_vscsi.c b/hw/spapr_vscsi.c
index d2d0415..6fc82f6 100644
--- a/hw/spapr_vscsi.c
+++ b/hw/spapr_vscsi.c
@@ -930,9 +930,7 @@ static VIOsPAPRDeviceInfo spapr_vscsi = {
 .qdev.name = "spapr-vscsi",
 .qdev.size = sizeof(VSCSIState),
 .qdev.props = (Property[]) {
-DEFINE_PROP_UINT32("reg", VIOsPAPRDevice, reg, 0x2000),
-DEFINE_PROP_UINT32("dma-window", VIOsPAPRDevice,
-   rtce_window_size, 0x1000),
+DEFINE_SPAPR_PROPERTIES(VSCSIState, vdev, 0x2000, 0x1000),
 DEFINE_PROP_END_OF_LIST(),
 },
 };
diff --git a/hw/spapr_vty.c b/hw/spapr_vty.c
index 607b81b..a9d4b03 100644
--- a/hw/spapr_vty.c
+++ b/hw/spapr_vty.c
@@ -140,7 +140,7 @@ static VIOsPAPRDeviceInfo spapr_vty = {
 .qdev.name = "spapr-vty",
 .qdev.size = sizeof(VIOsPAPRVTYDevice),
 .qdev.props = (Property[]) {
-DEFINE_PROP_UINT32("reg", VIOsPAPRDevice, reg, 0),
+DEFINE_SPAPR_PROPERTIES(VIOsPAPRVTYDevice, sdev, 0, 0),
 DEFINE_PROP_CHR("chardev", VIOsPAPRVTYDevice, chardev),
 DEFINE_PROP_END_OF_LIST(),
 },
-- 
1.6.0.2

[Qemu-devel] [PATCH 33/58] KVM: update kernel headers

2011-09-14 Thread Alexander Graf

This patch updates the kvm kernel headers to the latest version.

Signed-off-by: Alexander Graf ag...@suse.de
---
 linux-headers/asm-powerpc/kvm.h  |   23 +++
 linux-headers/asm-x86/kvm_para.h |   14 ++
 linux-headers/linux/kvm.h|   25 +
 linux-headers/linux/kvm_para.h   |1 +
 4 files changed, 55 insertions(+), 8 deletions(-)

diff --git a/linux-headers/asm-powerpc/kvm.h b/linux-headers/asm-powerpc/kvm.h
index 777d307..579e219 100644
--- a/linux-headers/asm-powerpc/kvm.h
+++ b/linux-headers/asm-powerpc/kvm.h
@@ -22,6 +22,10 @@
 
 #include <linux/types.h>
 
+/* Select powerpc specific features in linux/kvm.h */
+#define __KVM_HAVE_SPAPR_TCE
+#define __KVM_HAVE_PPC_SMT
+
 struct kvm_regs {
__u64 pc;
__u64 cr;
@@ -145,6 +149,12 @@ struct kvm_regs {
 #define KVM_SREGS_E_UPDATE_DBSR (1 << 3)
 
 /*
+ * Book3S special bits to indicate contents in the struct by maintaining
+ * backwards compatibility with older structs. If adding a new field,
+ * please make sure to add a flag for that new field */
+#define KVM_SREGS_S_HIOR   (1 << 0)
+
+/*
  * In KVM_SET_SREGS, reserved/pad fields must be left untouched from a
  * previous KVM_GET_REGS.
  *
@@ -169,6 +179,8 @@ struct kvm_sregs {
__u64 ibat[8];
__u64 dbat[8];
} ppc32;
+   __u64 flags; /* KVM_SREGS_S_ */
+   __u64 hior;
} s;
struct {
union {
@@ -272,4 +284,15 @@ struct kvm_guest_debug_arch {
 #define KVM_INTERRUPT_UNSET -2U
 #define KVM_INTERRUPT_SET_LEVEL -3U
 
+/* for KVM_CAP_SPAPR_TCE */
+struct kvm_create_spapr_tce {
+   __u64 liobn;
+   __u32 window_size;
+};
+
+/* for KVM_ALLOCATE_RMA */
+struct kvm_allocate_rma {
+   __u64 rma_size;
+};
+
 #endif /* __LINUX_KVM_POWERPC_H */
diff --git a/linux-headers/asm-x86/kvm_para.h b/linux-headers/asm-x86/kvm_para.h
index 834d71e..f2ac46a 100644
--- a/linux-headers/asm-x86/kvm_para.h
+++ b/linux-headers/asm-x86/kvm_para.h
@@ -21,6 +21,7 @@
  */
 #define KVM_FEATURE_CLOCKSOURCE2 3
 #define KVM_FEATURE_ASYNC_PF   4
+#define KVM_FEATURE_STEAL_TIME 5
 
 /* The last 8 bits are used to indicate how to interpret the flags field
  * in pvclock structure. If no bits are set, all flags are ignored.
@@ -30,10 +31,23 @@
 #define MSR_KVM_WALL_CLOCK  0x11
 #define MSR_KVM_SYSTEM_TIME 0x12
 
+#define KVM_MSR_ENABLED 1
 /* Custom MSRs falls in the range 0x4b564d00-0x4b564dff */
 #define MSR_KVM_WALL_CLOCK_NEW  0x4b564d00
 #define MSR_KVM_SYSTEM_TIME_NEW 0x4b564d01
 #define MSR_KVM_ASYNC_PF_EN 0x4b564d02
+#define MSR_KVM_STEAL_TIME  0x4b564d03
+
+struct kvm_steal_time {
+   __u64 steal;
+   __u32 version;
+   __u32 flags;
+   __u32 pad[12];
+};
+
+#define KVM_STEAL_ALIGNMENT_BITS 5
+#define KVM_STEAL_VALID_BITS ((-1ULL << (KVM_STEAL_ALIGNMENT_BITS + 1)))
+#define KVM_STEAL_RESERVED_MASK (((1 << KVM_STEAL_ALIGNMENT_BITS) - 1 ) << 1)
 
 #define KVM_MAX_MMU_OP_BATCH   32
 
diff --git a/linux-headers/linux/kvm.h b/linux-headers/linux/kvm.h
index fc63b73..2062375 100644
--- a/linux-headers/linux/kvm.h
+++ b/linux-headers/linux/kvm.h
@@ -161,6 +161,7 @@ struct kvm_pit_config {
 #define KVM_EXIT_NMI  16
 #define KVM_EXIT_INTERNAL_ERROR   17
 #define KVM_EXIT_OSI  18
+#define KVM_EXIT_PAPR_HCALL  19
 
 /* For KVM_EXIT_INTERNAL_ERROR */
 #define KVM_INTERNAL_ERROR_EMULATION 1
@@ -264,6 +265,11 @@ struct kvm_run {
struct {
__u64 gprs[32];
} osi;
+   struct {
+   __u64 nr;
+   __u64 ret;
+   __u64 args[9];
+   } papr_hcall;
/* Fix the size of the union. */
char padding[256];
};
@@ -457,7 +463,7 @@ struct kvm_ppc_pvinfo {
 #define KVM_CAP_VAPIC 6
 #define KVM_CAP_EXT_CPUID 7
 #define KVM_CAP_CLOCKSOURCE 8
-#define KVM_CAP_NR_VCPUS 9   /* returns max vcpus per vm */
+#define KVM_CAP_NR_VCPUS 9   /* returns recommended max vcpus per vm */
 #define KVM_CAP_NR_MEMSLOTS 10   /* returns max memory slots per vm */
 #define KVM_CAP_PIT 11
 #define KVM_CAP_NOP_IO_DELAY 12
@@ -544,6 +550,12 @@ struct kvm_ppc_pvinfo {
 #define KVM_CAP_TSC_CONTROL 60
 #define KVM_CAP_GET_TSC_KHZ 61
 #define KVM_CAP_PPC_BOOKE_SREGS 62
+#define KVM_CAP_SPAPR_TCE 63
+#define KVM_CAP_PPC_SMT 64
+#define KVM_CAP_PPC_RMA 65
+#define KVM_CAP_MAX_VCPUS 66   /* returns max vcpus per vm */
+#define KVM_CAP_PPC_HIOR 67
+#define KVM_CAP_PPC_PAPR 68
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
@@ -746,6 +758,9 @@ struct kvm_clock_data {
 /* Available with KVM_CAP_XCRS */
 #define KVM_GET_XCRS _IOR(KVMIO,  0xa6, struct kvm_xcrs)
 #define KVM_SET_XCRS _IOW(KVMIO,  0xa7, struct kvm_xcrs)
+#define

[Qemu-devel] [PATCH 32/58] PPC: Add new target config for pseries

2011-09-14 Thread Alexander Graf

We only support -M pseries when certain prerequisites are met, such
as a PPC64 guest and libfdt. To only gather these requirements in
a single place, this patch introduces a new CONFIG_PSERIES variable
that gets set when all prerequisites are met.

Signed-off-by: Alexander Graf ag...@suse.de
---
 Makefile.target |6 ++
 configure   |3 +++
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/Makefile.target b/Makefile.target
index 3f689ce..7160b35 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -239,10 +239,8 @@ obj-ppc-y += ppc_oldworld.o
 # NewWorld PowerMac
 obj-ppc-y += ppc_newworld.o
 # IBM pSeries (sPAPR)
-ifeq ($(CONFIG_FDT)$(TARGET_PPC64),yy)
-obj-ppc-y += spapr.o spapr_hcall.o spapr_rtas.o spapr_vio.o
-obj-ppc-y += xics.o spapr_vty.o spapr_llan.o spapr_vscsi.o
-endif
+obj-ppc-$(CONFIG_PSERIES) += spapr.o spapr_hcall.o spapr_rtas.o spapr_vio.o
+obj-ppc-$(CONFIG_PSERIES) += xics.o spapr_vty.o spapr_llan.o spapr_vscsi.o
 # PowerPC 4xx boards
 obj-ppc-y += ppc4xx_devs.o ppc4xx_pci.o ppc405_uc.o ppc405_boards.o
 obj-ppc-y += ppc440.o ppc440_bamboo.o
diff --git a/configure b/configure
index 0875f95..d59fbd5 100755
--- a/configure
+++ b/configure
@@ -3402,6 +3402,9 @@ case $target_arch2 in
   fi
 fi
 esac
+if test "$target_arch2" = "ppc64" -a "$fdt" = "yes"; then
+  echo "CONFIG_PSERIES=y" >> $config_target_mak
+fi
 if test "$target_bigendian" = "yes" ; then
   echo "TARGET_WORDS_BIGENDIAN=y" >> $config_target_mak
 fi
-- 
1.6.0.2

[Qemu-devel] [PATCH 54/58] openpic: Unfold write_IRQreg

2011-09-14 Thread Alexander Graf

The helper function write_IRQreg was always called with a specific argument on
the type of register to access. Inside the function we were simply doing a
switch on that constant argument again. It's a lot easier to just unfold this
into two separate functions and call each individually.

Reported-by: Blue Swirl blauwir...@gmail.com
Signed-off-by: Alexander Graf ag...@suse.de
---
 hw/openpic.c |   79 +++--
 1 files changed, 37 insertions(+), 42 deletions(-)

diff --git a/hw/openpic.c b/hw/openpic.c
index fbd8837..43b8f27 100644
--- a/hw/openpic.c
+++ b/hw/openpic.c
@@ -482,30 +482,25 @@ static inline uint32_t read_IRQreg_ipvp(openpic_t *opp, int n_IRQ)
 return opp->src[n_IRQ].ipvp;
 }
 
-static inline void write_IRQreg (openpic_t *opp, int n_IRQ,
- uint32_t reg, uint32_t val)
+static inline void write_IRQreg_ide(openpic_t *opp, int n_IRQ, uint32_t val)
 {
 uint32_t tmp;
 
-switch (reg) {
-case IRQ_IPVP:
-/* NOTE: not fully accurate for special IRQs, but simple and
-   sufficient */
-/* ACTIVITY bit is read-only */
-opp->src[n_IRQ].ipvp =
-(opp->src[n_IRQ].ipvp & 0x40000000) |
-(val & 0x800F00FF);
-openpic_update_irq(opp, n_IRQ);
-DPRINTF("Set IPVP %d to 0x%08x -> 0x%08x\n",
-n_IRQ, val, opp->src[n_IRQ].ipvp);
-break;
-case IRQ_IDE:
-tmp = val & 0xC0000000;
-tmp |= val & ((1ULL << MAX_CPU) - 1);
-opp->src[n_IRQ].ide = tmp;
-DPRINTF("Set IDE %d to 0x%08x\n", n_IRQ, opp->src[n_IRQ].ide);
-break;
-}
+tmp = val & 0xC0000000;
+tmp |= val & ((1ULL << MAX_CPU) - 1);
+opp->src[n_IRQ].ide = tmp;
+DPRINTF("Set IDE %d to 0x%08x\n", n_IRQ, opp->src[n_IRQ].ide);
+}
+
+static inline void write_IRQreg_ipvp(openpic_t *opp, int n_IRQ, uint32_t val)
+{
+/* NOTE: not fully accurate for special IRQs, but simple and sufficient */
+/* ACTIVITY bit is read-only */
+opp->src[n_IRQ].ipvp = (opp->src[n_IRQ].ipvp & 0x40000000)
+ | (val & 0x800F00FF);
+openpic_update_irq(opp, n_IRQ);
+DPRINTF("Set IPVP %d to 0x%08x -> 0x%08x\n", n_IRQ, val,
+opp->src[n_IRQ].ipvp);
 }
 
 #if 0 // Code provision for Intel model
@@ -535,10 +530,10 @@ static void write_doorbell_register (penpic_t *opp, int n_dbl,
 {
 switch (offset) {
 case DBL_IVPR_OFFSET:
-write_IRQreg(opp, IRQ_DBL0 + n_dbl, IRQ_IPVP, value);
+write_IRQreg_ipvp(opp, IRQ_DBL0 + n_dbl, value);
 break;
 case DBL_IDE_OFFSET:
-write_IRQreg(opp, IRQ_DBL0 + n_dbl, IRQ_IDE, value);
+write_IRQreg_ide(opp, IRQ_DBL0 + n_dbl, value);
 break;
 case DBL_DMR_OFFSET:
 opp->doorbells[n_dbl].dmr = value;
@@ -576,10 +571,10 @@ static void write_mailbox_register (openpic_t *opp, int n_mbx,
 opp->mailboxes[n_mbx].mbr = value;
 break;
 case MBX_IVPR_OFFSET:
-write_IRQreg(opp, IRQ_MBX0 + n_mbx, IRQ_IPVP, value);
+write_IRQreg_ipvp(opp, IRQ_MBX0 + n_mbx, value);
 break;
 case MBX_DMR_OFFSET:
-write_IRQreg(opp, IRQ_MBX0 + n_mbx, IRQ_IDE, value);
+write_IRQreg_ide(opp, IRQ_MBX0 + n_mbx, value);
 break;
 }
 }
@@ -636,7 +631,7 @@ static void openpic_gbl_write (void *opaque, target_phys_addr_t addr, uint32_t v
 {
 int idx;
 idx = (addr - 0x10A0) >> 4;
-write_IRQreg(opp, opp->irq_ipi0 + idx, IRQ_IPVP, val);
+write_IRQreg_ipvp(opp, opp->irq_ipi0 + idx, val);
 }
 break;
 case 0x10E0: /* SPVE */
@@ -729,10 +724,10 @@ static void openpic_timer_write (void *opaque, uint32_t addr, uint32_t val)
 opp->timers[idx].tibc = val;
 break;
 case 0x20: /* TIVP */
-write_IRQreg(opp, opp->irq_tim0 + idx, IRQ_IPVP, val);
+write_IRQreg_ipvp(opp, opp->irq_tim0 + idx, val);
 break;
 case 0x30: /* TIDE */
-write_IRQreg(opp, opp->irq_tim0 + idx, IRQ_IDE, val);
+write_IRQreg_ide(opp, opp->irq_tim0 + idx, val);
 break;
 }
 }
@@ -782,10 +777,10 @@ static void openpic_src_write (void *opaque, uint32_t addr, uint32_t val)
 idx = addr >> 5;
 if (addr & 0x10) {
 /* EXDE / IFEDE / IEEDE */
-write_IRQreg(opp, idx, IRQ_IDE, val);
+write_IRQreg_ide(opp, idx, val);
 } else {
 /* EXVP / IFEVP / IEEVP */
-write_IRQreg(opp, idx, IRQ_IPVP, val);
+write_IRQreg_ipvp(opp, idx, val);
 }
 }
 
@@ -835,8 +830,8 @@ static void openpic_cpu_write_internal(void *opaque, target_phys_addr_t addr,
 case 0x70:
 idx = (addr - 0x40) >> 4;
 /* we use IDE as mask which CPUs to deliver the IPI to still. */
-write_IRQreg(opp, opp->irq_ipi0 + idx, IRQ_IDE,
- opp->src[opp->irq_ipi0 + idx].ide | val);
+write_IRQreg_ide(opp, opp->irq_ipi0 + idx,
+ opp->src[opp->irq_ipi0 + idx].ide

[Qemu-devel] [PATCH 12/58] PPC: E500: create multiple envs

2011-09-14 Thread Alexander Graf

When creating a VM, we should go through smp_cpus and create a virtual CPU for
every CPU the user requested. This patch adds support for that and moves some
code around to make that more convenient.

Signed-off-by: Alexander Graf ag...@suse.de
---
 hw/ppce500_mpc8544ds.c |   44 +---
 1 files changed, 29 insertions(+), 15 deletions(-)

diff --git a/hw/ppce500_mpc8544ds.c b/hw/ppce500_mpc8544ds.c
index 1274a3e..8d05587 100644
--- a/hw/ppce500_mpc8544ds.c
+++ b/hw/ppce500_mpc8544ds.c
@@ -226,7 +226,7 @@ static void mpc8544ds_init(ram_addr_t ram_size,
  const char *cpu_model)
 {
 PCIBus *pci_bus;
-CPUState *env;
+CPUState *env = NULL;
 uint64_t elf_entry;
 uint64_t elf_lowaddr;
 target_phys_addr_t entry=0;
@@ -240,24 +240,40 @@ static void mpc8544ds_init(ram_addr_t ram_size,
 qemu_irq *irqs, *mpic;
 DeviceState *dev;
 struct boot_info *boot_info;
+CPUState *firstenv = NULL;
 
-/* Setup CPU */
+/* Setup CPUs */
 if (cpu_model == NULL) {
 cpu_model = "e500v2_v30";
 }
 
-env = cpu_ppc_init(cpu_model);
-if (!env) {
-fprintf(stderr, "Unable to initialize CPU!\n");
-exit(1);
-}
+for (i = 0; i < smp_cpus; i++) {
+qemu_irq *input;
+env = cpu_ppc_init(cpu_model);
+if (!env) {
+fprintf(stderr, "Unable to initialize CPU!\n");
+exit(1);
+}
+
+if (!firstenv) {
+firstenv = env;
+}
 
-/* XXX register timer? */
-ppc_emb_timers_init(env, 4, PPC_INTERRUPT_DECR);
-ppc_dcr_init(env, NULL, NULL);
+env->spr[SPR_BOOKE_PIR] = env->cpu_index = i;
 
-/* Register reset handler */
-qemu_register_reset(mpc8544ds_cpu_reset, env);
+/* XXX register timer? */
+ppc_emb_timers_init(env, 4, PPC_INTERRUPT_DECR);
+ppc_dcr_init(env, NULL, NULL);
+/* XXX Enable DEC interrupts - probably wrong in the backend */
+env-spr[SPR_40x_TCR] = 1  26;
+
+/* Register reset handler */
+boot_info = g_malloc0(sizeof(struct boot_info));
+qemu_register_reset(mpc8544ds_cpu_reset, env);
+env-load_info = boot_info;
+}
+
+env = firstenv;
 
 /* Fixup Memory size on a alignment boundary */
 ram_size = ~(RAM_SIZES_ALIGN - 1);
@@ -336,8 +352,6 @@ static void mpc8544ds_init(ram_addr_t ram_size,
 }
 }
 
-boot_info = g_malloc0(sizeof(struct boot_info));
-
 /* If we're loading a kernel directly, we must load the device tree too. */
 if (kernel_filename) {
 #ifndef CONFIG_FDT
@@ -350,10 +364,10 @@ static void mpc8544ds_init(ram_addr_t ram_size,
 exit(1);
 }
 
+boot_info = env->load_info;
 boot_info->entry = entry;
 boot_info->dt_base = dt_base;
 }
-env->load_info = boot_info;
 
 if (kvm_enabled()) {
 kvmppc_init();
-- 
1.6.0.2

[Qemu-devel] [PATCH 03/58] spapr: make irq customizable via qdev

2011-09-14 Thread Alexander Graf

From: Paolo Bonzini pbonz...@redhat.com

This also lets the user see the irq in info qtree.

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
Cc: Alexander Graf ag...@suse.de
Cc: David Gibson da...@gibson.dropbear.id.au
Signed-off-by: Alexander Graf ag...@suse.de
---
 hw/spapr_vio.c |8 +++-
 1 files changed, 7 insertions(+), 1 deletions(-)

diff --git a/hw/spapr_vio.c b/hw/spapr_vio.c
index ba2e1c1..0546ccb 100644
--- a/hw/spapr_vio.c
+++ b/hw/spapr_vio.c
@@ -52,6 +52,10 @@
 static struct BusInfo spapr_vio_bus_info = {
 .name   = "spapr-vio",
 .size   = sizeof(VIOsPAPRBus),
+.props = (Property[]) {
+DEFINE_PROP_UINT32("irq", VIOsPAPRDevice, vio_irq_num, 0), \
+DEFINE_PROP_END_OF_LIST(),
+},
 };
 
 VIOsPAPRDevice *spapr_vio_find_by_reg(VIOsPAPRBus *bus, uint32_t reg)
@@ -604,7 +608,9 @@ static int spapr_vio_busdev_init(DeviceState *qdev, 
DeviceInfo *qinfo)
 }
 
 dev->qdev.id = id;
-dev->vio_irq_num = bus->irq++;
+if (!dev->vio_irq_num) {
+dev->vio_irq_num = bus->irq++;
+}
 dev->qirq = spapr_find_qirq(spapr, dev->vio_irq_num);
 
 rtce_init(dev);
-- 
1.6.0.2

[Qemu-devel] [PATCH 25/58] PPC: E500: Update cpu-release-addr property in cpu nodes

2011-09-14 Thread Alexander Graf

The guest OS needs to know the address at which secondary CPUs spin, so
publish it while we are updating the CPU nodes with their frequencies anyway.

Signed-off-by: Alexander Graf ag...@suse.de

---

v1 - v2:

  - use new spin table address
---
 hw/ppce500_mpc8544ds.c |6 ++
 1 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/hw/ppce500_mpc8544ds.c b/hw/ppce500_mpc8544ds.c
index 3b8b449..a3e1ce4 100644
--- a/hw/ppce500_mpc8544ds.c
+++ b/hw/ppce500_mpc8544ds.c
@@ -125,9 +125,15 @@ static int mpc8544_load_device_tree(CPUState *env,
 
 for (i = 0; i < smp_cpus; i++) {
 char cpu_name[128];
+uint64_t cpu_release_addr[] = {
+cpu_to_be64(MPC8544_SPIN_BASE + (i * 0x20))
+};
+
 snprintf(cpu_name, sizeof(cpu_name), "/cpus/PowerPC,8544@%x", i);
 qemu_devtree_setprop_cell(fdt, cpu_name, "clock-frequency", clock_freq);
 qemu_devtree_setprop_cell(fdt, cpu_name, "timebase-frequency", tb_freq);
+qemu_devtree_setprop(fdt, cpu_name, "cpu-release-addr",
+ cpu_release_addr, sizeof(cpu_release_addr));
 }
 
 for (i = smp_cpus; i < 32; i++) {
-- 
1.6.0.2
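
A minimal sketch of how the cpu-release-addr value in the patch above is
formed: one spin-table entry every 0x20 bytes, stored big-endian as device
tree properties require. EXAMPLE_SPIN_BASE, spin_entry_addr() and
store_be64() are names made up for this illustration; the real value of
MPC8544_SPIN_BASE is not shown in this message, so the base used here is only
an assumption.

```c
#include <stdint.h>

/* Assumed base address, for illustration only. */
#define EXAMPLE_SPIN_BASE  UINT64_C(0xEF000000)
/* Stride between per-CPU spin-table entries, as used in the patch. */
#define SPIN_ENTRY_SIZE    UINT64_C(0x20)

/* Address of the spin-table entry that belongs to a given CPU index. */
static uint64_t spin_entry_addr(unsigned cpu_index)
{
    return EXAMPLE_SPIN_BASE + cpu_index * SPIN_ENTRY_SIZE;
}

/* Store a 64-bit value in big-endian byte order, independent of the host. */
static void store_be64(uint8_t out[8], uint64_t value)
{
    for (int i = 0; i < 8; i++) {
        out[i] = (uint8_t)(value >> (56 - 8 * i));
    }
}
```

With these assumptions CPU 2 would get spin_entry_addr(2), that is the base
plus 0x40, and the eight bytes produced by store_be64() are what would end up
in the property.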

[Qemu-devel] [PATCH 07/58] PPC: Fix IPI support in MPIC

2011-09-14 Thread Alexander Graf

The current IPI support in the MPIC code is incomplete and doesn't work. This
code adds proper support for IPIs in MPIC by using the IDE register to remember
which CPUs IPIs are still outstanding to. New triggers through the IPI trigger
register only add to the list of CPUs we want to IPI.

Signed-off-by: Alexander Graf ag...@suse.de

---

v1 - v2:

  - Use MAX_IPI instead of hardcoded 4

Signed-off-by: Alexander Graf ag...@suse.de
---
 hw/openpic.c |   17 +++--
 1 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/hw/openpic.c b/hw/openpic.c
index f7d5583..9710ac0 100644
--- a/hw/openpic.c
+++ b/hw/openpic.c
@@ -57,7 +57,7 @@
 #define MAX_MBX 4
 #define MAX_TMR 4
 #define VECTOR_BITS 8
-#define MAX_IPI 0
+#define MAX_IPI 4
 
 #define VID (0x)
 
@@ -840,7 +840,9 @@ static void openpic_cpu_write_internal(void *opaque, 
target_phys_addr_t addr,
 case 0x60:
 case 0x70:
 idx = (addr - 0x40) >> 4;
-write_IRQreg(opp, opp->irq_ipi0 + idx, IRQ_IDE, val);
+/* we use IDE as mask which CPUs to deliver the IPI to still. */
+write_IRQreg(opp, opp->irq_ipi0 + idx, IRQ_IDE,
+ opp->src[opp->irq_ipi0 + idx].ide | val);
 openpic_set_irq(opp, opp->irq_ipi0 + idx, 1);
 openpic_set_irq(opp, opp->irq_ipi0 + idx, 0);
 break;
@@ -934,6 +936,17 @@ static uint32_t openpic_cpu_read_internal(void *opaque, 
target_phys_addr_t addr,
 reset_bit(&src->ipvp, IPVP_ACTIVITY);
 src->pending = 0;
 }
+
+if ((n_IRQ >= opp->irq_ipi0) && (n_IRQ < (opp->irq_ipi0 + MAX_IPI))) {
+src->ide &= ~(1 << idx);
+if (src->ide && !test_bit(&src->ipvp, IPVP_SENSE)) {
+/* trigger on CPUs that didn't know about it yet */
+openpic_set_irq(opp, n_IRQ, 1);
+openpic_set_irq(opp, n_IRQ, 0);
+/* if all CPUs knew about it, set active bit again */
+set_bit(&src->ipvp, IPVP_ACTIVITY);
+}
+}
 }
 break;
 case 0xB0: /* PEOI */
-- 
1.6.0.2
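
The bookkeeping described above (the IDE register as a mask of CPUs that
still have to see the IPI) can be sketched in isolation. This is a simplified
model under stated assumptions, not the openpic.c code: IpiSource,
ipi_trigger() and ipi_ack() are hypothetical names, and CPU indices are
assumed to be below 32.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-IPI state; stands in for the IDE register of one source. */
typedef struct {
    uint32_t pending_cpus; /* one bit per CPU that still has to see the IPI */
} IpiSource;

/* A write to the trigger register adds targets instead of replacing them. */
static void ipi_trigger(IpiSource *src, uint32_t cpu_mask)
{
    src->pending_cpus |= cpu_mask;
}

/*
 * A CPU acknowledges the IPI. Returns true while other CPUs are still
 * outstanding, i.e. the interrupt has to be raised again for them.
 */
static bool ipi_ack(IpiSource *src, unsigned cpu)
{
    src->pending_cpus &= ~(UINT32_C(1) << cpu);
    return src->pending_cpus != 0;
}
```

Starting from an empty mask also matches the reasoning of the later "Set MPIC
IDE for IPI to 0" patch in this series: a stale bit would make the first
trigger reach a CPU that was never addressed.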

[Qemu-devel] [PATCH 43/58] KVM: Update kernel headers

2011-09-14 Thread Alexander Graf

Another round of KVM features, another round of kernel header updates :)

Signed-off-by: Alexander Graf ag...@suse.de
---
 linux-headers/asm-powerpc/kvm.h |   40 +++
 linux-headers/linux/kvm.h   |   18 +
 2 files changed, 58 insertions(+), 0 deletions(-)

diff --git a/linux-headers/asm-powerpc/kvm.h b/linux-headers/asm-powerpc/kvm.h
index 579e219..28eecf0 100644
--- a/linux-headers/asm-powerpc/kvm.h
+++ b/linux-headers/asm-powerpc/kvm.h
@@ -284,6 +284,11 @@ struct kvm_guest_debug_arch {
 #define KVM_INTERRUPT_UNSET     -2U
 #define KVM_INTERRUPT_SET_LEVEL -3U
 
+#define KVM_CPU_440    1
+#define KVM_CPU_E500V2 2
+#define KVM_CPU_3S_32  3
+#define KVM_CPU_3S_64  4
+
 /* for KVM_CAP_SPAPR_TCE */
 struct kvm_create_spapr_tce {
__u64 liobn;
@@ -295,4 +300,39 @@ struct kvm_allocate_rma {
__u64 rma_size;
 };
 
+struct kvm_book3e_206_tlb_entry {
+   __u32 mas8;
+   __u32 mas1;
+   __u64 mas2;
+   __u64 mas7_3;
+};
+
+struct kvm_book3e_206_tlb_params {
+   /*
+* For mmu types KVM_MMU_FSL_BOOKE_NOHV and KVM_MMU_FSL_BOOKE_HV:
+*
+* - The number of ways of TLB0 must be a power of two between 2 and
+*   16.
+* - TLB1 must be fully associative.
+* - The size of TLB0 must be a multiple of the number of ways, and
+*   the number of sets must be a power of two.
+* - The size of TLB1 may not exceed 64 entries.
+* - TLB0 supports 4 KiB pages.
+* - The page sizes supported by TLB1 are as indicated by
+*   TLB1CFG (if MMUCFG[MAVN] = 0) or TLB1PS (if MMUCFG[MAVN] = 1)
+*   as returned by KVM_GET_SREGS.
+* - TLB2 and TLB3 are reserved, and their entries in tlb_sizes[]
+*   and tlb_ways[] must be zero.
+*
+* tlb_ways[n] = tlb_sizes[n] means the array is fully associative.
+*
+* KVM will adjust TLBnCFG based on the sizes configured here,
+* though arrays greater than 2048 entries will have TLBnCFG[NENTRY]
+* set to zero.
+*/
+   __u32 tlb_sizes[4];
+   __u32 tlb_ways[4];
+   __u32 reserved[8];
+};
+
 #endif /* __LINUX_KVM_POWERPC_H */
diff --git a/linux-headers/linux/kvm.h b/linux-headers/linux/kvm.h
index 2062375..8bb6cde 100644
--- a/linux-headers/linux/kvm.h
+++ b/linux-headers/linux/kvm.h
@@ -556,6 +556,7 @@ struct kvm_ppc_pvinfo {
 #define KVM_CAP_MAX_VCPUS 66   /* returns max vcpus per vm */
 #define KVM_CAP_PPC_HIOR 67
 #define KVM_CAP_PPC_PAPR 68
+#define KVM_CAP_SW_TLB 69
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
@@ -635,6 +636,21 @@ struct kvm_clock_data {
__u32 pad[9];
 };
 
+#define KVM_MMU_FSL_BOOKE_NOHV 0
+#define KVM_MMU_FSL_BOOKE_HV   1
+
+struct kvm_config_tlb {
+   __u64 params;
+   __u64 array;
+   __u32 mmu_type;
+   __u32 array_len;
+};
+
+struct kvm_dirty_tlb {
+   __u64 bitmap;
+   __u32 num_dirty;
+};
+
 /*
  * ioctls for VM fds
  */
@@ -761,6 +777,8 @@ struct kvm_clock_data {
 #define KVM_CREATE_SPAPR_TCE _IOW(KVMIO,  0xa8, struct kvm_create_spapr_tce)
 /* Available with KVM_CAP_RMA */
 #define KVM_ALLOCATE_RMA _IOR(KVMIO,  0xa9, struct kvm_allocate_rma)
+/* Available with KVM_CAP_SW_TLB */
+#define KVM_DIRTY_TLB _IOW(KVMIO,  0xaa, struct kvm_dirty_tlb)
 
 #define KVM_DEV_ASSIGN_ENABLE_IOMMU (1 << 0)
 
-- 
1.6.0.2

[Qemu-devel] [PATCH 16/58] PPC: KVM: Add generic function to read host clockfreq

2011-09-14 Thread Alexander Graf

We need to find out the host's clock-frequency when running on KVM, so
let's export a respective function.

Signed-off-by: Alexander Graf ag...@suse.de

---

v1 - v2:

  - enable 64bit values
---
 target-ppc/kvm.c |   67 ++
 target-ppc/kvm_ppc.h |1 +
 2 files changed, 68 insertions(+), 0 deletions(-)

diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
index 21f35af..77b98c4 100644
--- a/target-ppc/kvm.c
+++ b/target-ppc/kvm.c
@@ -14,6 +14,7 @@
  *
  */
 
+#include <dirent.h>
 #include <sys/types.h>
 #include <sys/ioctl.h>
 #include <sys/mman.h>
@@ -38,6 +39,8 @@
 do { } while (0)
 #endif
 
+#define PROC_DEVTREE_CPU  "/proc/device-tree/cpus/"
+
 const KVMCapabilityInfo kvm_arch_required_capabilities[] = {
 KVM_CAP_LAST_INFO
 };
@@ -509,6 +512,70 @@ uint32_t kvmppc_get_tbfreq(void)
 return retval;
 }
 
+/* Try to find a device tree node for a CPU with clock-frequency property */
+static int kvmppc_find_cpu_dt(char *buf, int buf_len)
+{
+struct dirent *dirp;
+DIR *dp;
+
+if ((dp = opendir(PROC_DEVTREE_CPU)) == NULL) {
+printf("Can't open directory " PROC_DEVTREE_CPU "\n");
+return -1;
+}
+
+buf[0] = '\0';
+while ((dirp = readdir(dp)) != NULL) {
+FILE *f;
+snprintf(buf, buf_len, "%s%s/clock-frequency", PROC_DEVTREE_CPU,
+ dirp->d_name);
+f = fopen(buf, "r");
+if (f) {
+snprintf(buf, buf_len, "%s%s", PROC_DEVTREE_CPU, dirp->d_name);
+fclose(f);
+break;
+}
+buf[0] = '\0';
+}
+closedir(dp);
+if (buf[0] == '\0') {
+printf("Unknown host!\n");
+return -1;
+}
+
+return 0;
+}
+
+uint64_t kvmppc_get_clockfreq(void)
+{
+char buf[512];
+uint32_t tb[2];
+FILE *f;
+int len;
+
+if (kvmppc_find_cpu_dt(buf, sizeof(buf))) {
+return 0;
+}
+
+strncat(buf, "/clock-frequency", sizeof(buf) - strlen(buf));
+
+f = fopen(buf, "rb");
+if (!f) {
+return -1;
+}
+
+len = fread(tb, sizeof(tb[0]), 2, f);
+fclose(f);
+switch (len) {
+case 1:
+/* freq is only a single cell */
+return tb[0];
+case 2:
+return *(uint64_t*)tb;
+}
+
+return 0;
+}
+
 int kvmppc_get_hypercall(CPUState *env, uint8_t *buf, int buf_len)
 {
 uint32_t *hc = (uint32_t*)buf;
diff --git a/target-ppc/kvm_ppc.h b/target-ppc/kvm_ppc.h
index 2f32249..7c08c0f 100644
--- a/target-ppc/kvm_ppc.h
+++ b/target-ppc/kvm_ppc.h
@@ -23,6 +23,7 @@ int kvmppc_read_host_property(const char *node_path, const 
char *prop,
 #endif
 
 uint32_t kvmppc_get_tbfreq(void);
+uint64_t kvmppc_get_clockfreq(void);
 int kvmppc_get_hypercall(CPUState *env, uint8_t *buf, int buf_len);
 int kvmppc_set_interrupt(CPUState *env, int irq, int level);
 
-- 
1.6.0.2
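
The patch above reads the clock-frequency property as one or two 32-bit cells
straight into host integers, which fits a big-endian PowerPC host. As a
hedged illustration of the same "one cell or two" logic, here is a byte-order
independent sketch. dt_decode_cells() is a hypothetical helper, not a QEMU
function, and it assumes the raw property bytes have already been read.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Decode a device-tree property made of one or two 32-bit big-endian cells.
 * Returns 0 for any other length, mirroring the fallback in the patch.
 */
static uint64_t dt_decode_cells(const uint8_t *data, size_t len)
{
    uint64_t value = 0;

    if (len != 4 && len != 8) {
        return 0;
    }
    /* Big-endian: the most significant byte comes first. */
    for (size_t i = 0; i < len; i++) {
        value = (value << 8) | data[i];
    }
    return value;
}
```

A single cell holding 0x3B9ACA00 decodes to 1000000000 (1 GHz); two cells are
combined with the first cell as the upper 32 bits.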

Re: [Qemu-devel] [PATCH] PPC: Fix for the gdb single step problem on an rfi instruction

2011-09-14 Thread Sebastian Bauer


Hi!

On Fri, 12 Aug 2011 15:29:58 +0200, Elie Richa wrote:
I've had this problem recently and your patch does fix the issue, 
thanks!


I'd like to bump this, as it was not in the latest ppc patch queue. Is
there anything wrong with the patch?


TIA

Best,
Sebastian


On 08/10/2011 01:41 PM, Sebastian Bauer wrote:
When using gdb to single step a ppc interrupt routine, the execution
flow passes the rfi instruction without actually returning from the
interrupt. The patch fixes this by not updating the nip when the debug
exception is raised and a previous POWERPC_EXCP_SYNC was set. The latter
is the case only if code for rfi or a related instruction was generated.

Signed-off-by: Sebastian Bauer m...@sebastianbauer.info
---
diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index fd7c208..42b91fd 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -287,7 +287,7 @@ static inline void gen_debug_exception(DisasContext *ctx)
{
TCGv_i32 t0;

- if (ctx->exception != POWERPC_EXCP_BRANCH)
+ if (ctx->exception != POWERPC_EXCP_BRANCH && ctx->exception != POWERPC_EXCP_SYNC)
gen_update_nip(ctx, ctx->nip);
t0 = tcg_const_i32(EXCP_DEBUG);
gen_helper_raise_exception(t0);

[Qemu-devel] [PATCH 46/58] ppc: booke206: use MAV=2.0 TSIZE definition, fix 4G pages

2011-09-14 Thread Alexander Graf

From: Scott Wood scottw...@freescale.com

This definition is backward compatible with MAV=1.0 as long as
the guest does not set reserved bits in MAS1/MAS4.

Also, fix the shift in booke206_tlb_to_page_size -- it's the base
that should be able to hold a 4G page size, not the shift count.

Signed-off-by: Scott Wood scottw...@freescale.com
Signed-off-by: Alexander Graf ag...@suse.de
---
 hw/ppce500_mpc8544ds.c |2 +-
 target-ppc/cpu.h   |4 ++--
 target-ppc/helper.c|5 +++--
 3 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/hw/ppce500_mpc8544ds.c b/hw/ppce500_mpc8544ds.c
index 61151d8..8095516 100644
--- a/hw/ppce500_mpc8544ds.c
+++ b/hw/ppce500_mpc8544ds.c
@@ -174,7 +174,7 @@ out:
 /* Create -kernel TLB entries for BookE, linearly spanning 256MB.  */
 static inline target_phys_addr_t booke206_page_size_to_tlb(uint64_t size)
 {
-return (ffs(size >> 10) - 1) >> 1;
+return ffs(size >> 10) - 1;
 }
 
 static void mmubooke_create_initial_mapping(CPUState *env,
diff --git a/target-ppc/cpu.h b/target-ppc/cpu.h
index 5200e6e..32706df 100644
--- a/target-ppc/cpu.h
+++ b/target-ppc/cpu.h
@@ -667,8 +667,8 @@ enum {
 #define MAS0_ATSEL_TLB 0
 #define MAS0_ATSEL_LRAT MAS0_ATSEL
 
-#define MAS1_TSIZE_SHIFT   8
-#define MAS1_TSIZE_MASK (0xf << MAS1_TSIZE_SHIFT)
+#define MAS1_TSIZE_SHIFT   7
+#define MAS1_TSIZE_MASK (0x1f << MAS1_TSIZE_SHIFT)
 
 #define MAS1_TS_SHIFT  12
 #define MAS1_TS (1 << MAS1_TS_SHIFT)
diff --git a/target-ppc/helper.c b/target-ppc/helper.c
index d1bc574..73796c8 100644
--- a/target-ppc/helper.c
+++ b/target-ppc/helper.c
@@ -1293,7 +1293,7 @@ target_phys_addr_t booke206_tlb_to_page_size(CPUState 
*env, ppcmas_tlb_t *tlb)
 {
 uint32_t tlbncfg;
 int tlbn = booke206_tlbm_to_tlbn(env, tlb);
-target_phys_addr_t tlbm_size;
+int tlbm_size;
 
 tlbncfg = env->spr[SPR_BOOKE_TLB0CFG + tlbn];
 
@@ -1301,9 +1301,10 @@ target_phys_addr_t booke206_tlb_to_page_size(CPUState *env, ppcmas_tlb_t *tlb)
 tlbm_size = (tlb->mas1 & MAS1_TSIZE_MASK) >> MAS1_TSIZE_SHIFT;
 } else {
 tlbm_size = (tlbncfg & TLBnCFG_MINSIZE) >> TLBnCFG_MINSIZE_SHIFT;
+tlbm_size <<= 1;
 }
 
-return (1 << (tlbm_size << 1)) << 10;
+return 1024ULL << tlbm_size;
 }
 
 /* TLB check function for MAS based SoftTLBs */
-- 
1.6.0.2
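
To illustrate the size encoding this patch switches to, here is a small
sketch of the MAV 2.0 rule "page size = 1 KiB << TSIZE" and its inverse. The
function names are hypothetical and the code is not taken from helper.c. The
base constant is 64 bits wide, which is the point of the fix: with TSIZE = 22
the result is 4 GiB and no longer fits a 32-bit int.

```c
#include <stdint.h>

/* MAV 2.0 encoding: page size = 1 KiB << TSIZE (tsize assumed below 54). */
static uint64_t tsize_to_page_size(unsigned tsize)
{
    return UINT64_C(1024) << tsize;
}

/* Inverse for power-of-two sizes of at least 1 KiB; returns -1 otherwise. */
static int page_size_to_tsize(uint64_t size)
{
    int tsize = 0;

    if (size < 1024 || (size & (size - 1)) != 0) {
        return -1;
    }
    while ((UINT64_C(1024) << tsize) != size) {
        tsize++;
    }
    return tsize;
}
```

Under this encoding a 4 KiB page is TSIZE 2 and a 4 GiB page is TSIZE 22. The
older MAV 1.0 encoding only uses even values, which is why the patch doubles
the TLBnCFG minimum size before shifting.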

[Qemu-devel] [PATCH 30/58] MPC8544DS: Generate CPU nodes on init

2011-09-14 Thread Alexander Graf

With this patch, we generate CPU nodes in the machine initialization, giving
us the freedom to generate as many nodes as we want and as the machine supports,
but only those.

This is a first step towards a much cleaner device tree generation
infrastructure, where we would not require precompiled dtb blobs anymore.

Signed-off-by: Alexander Graf ag...@suse.de
---
 hw/ppce500_mpc8544ds.c |   46 +-
 1 files changed, 33 insertions(+), 13 deletions(-)

diff --git a/hw/ppce500_mpc8544ds.c b/hw/ppce500_mpc8544ds.c
index a3e1ce4..dfa8034 100644
--- a/hw/ppce500_mpc8544ds.c
+++ b/hw/ppce500_mpc8544ds.c
@@ -123,23 +123,43 @@ static int mpc8544_load_device_tree(CPUState *env,
  hypercall, sizeof(hypercall));
 }
 
-for (i = 0; i < smp_cpus; i++) {
+/* We need to generate the cpu nodes in reverse order, so Linux can pick
+   the first node as boot node and be happy */
+for (i = smp_cpus - 1; i >= 0; i--) {
 char cpu_name[128];
-uint64_t cpu_release_addr[] = {
-cpu_to_be64(MPC8544_SPIN_BASE + (i * 0x20))
-};
+uint64_t cpu_release_addr = cpu_to_be64(MPC8544_SPIN_BASE + (i * 0x20));
+
+for (env = first_cpu; env != NULL; env = env->next_cpu) {
+if (env->cpu_index == i) {
+break;
+}
+}
+
+if (!env) {
+continue;
+}
 
-snprintf(cpu_name, sizeof(cpu_name), "/cpus/PowerPC,8544@%x", i);
+snprintf(cpu_name, sizeof(cpu_name), "/cpus/PowerPC,8544@%x", env->cpu_index);
+qemu_devtree_add_subnode(fdt, cpu_name);
 qemu_devtree_setprop_cell(fdt, cpu_name, "clock-frequency", clock_freq);
 qemu_devtree_setprop_cell(fdt, cpu_name, "timebase-frequency", tb_freq);
-qemu_devtree_setprop(fdt, cpu_name, "cpu-release-addr",
- cpu_release_addr, sizeof(cpu_release_addr));
-}
-
-for (i = smp_cpus; i < 32; i++) {
-char cpu_name[128];
-snprintf(cpu_name, sizeof(cpu_name), "/cpus/PowerPC,8544@%x", i);
-qemu_devtree_nop_node(fdt, cpu_name);
+qemu_devtree_setprop_string(fdt, cpu_name, "device_type", "cpu");
+qemu_devtree_setprop_cell(fdt, cpu_name, "reg", env->cpu_index);
+qemu_devtree_setprop_cell(fdt, cpu_name, "d-cache-line-size",
+  env->dcache_line_size);
+qemu_devtree_setprop_cell(fdt, cpu_name, "i-cache-line-size",
+  env->icache_line_size);
+qemu_devtree_setprop_cell(fdt, cpu_name, "d-cache-size", 0x8000);
+qemu_devtree_setprop_cell(fdt, cpu_name, "i-cache-size", 0x8000);
+qemu_devtree_setprop_cell(fdt, cpu_name, "bus-frequency", 0);
+if (env->cpu_index) {
+qemu_devtree_setprop_string(fdt, cpu_name, "status", "disabled");
+qemu_devtree_setprop_string(fdt, cpu_name, "enable-method", "spin-table");
+qemu_devtree_setprop(fdt, cpu_name, "cpu-release-addr",
+ &cpu_release_addr, sizeof(cpu_release_addr));
+} else {
+qemu_devtree_setprop_string(fdt, cpu_name, "status", "okay");
+}
 }
 
 ret = rom_add_blob_fixed(BINARY_DEVICE_TREE_FILE, fdt, fdt_size, addr);
-- 
1.6.0.2

[Qemu-devel] [PATCH 10/58] PPC: MPIC: Fix CI bit definitions

2011-09-14 Thread Alexander Graf

The bit definitions for critical interrupt routing are in PowerPC order
(most significant bit is 0), while we end up shifting it with normal bit
order. Turn the numbers around so we actually end up fetching the
right ones.

Signed-off-by: Alexander Graf ag...@suse.de
---
 hw/openpic.c |   10 +-
 1 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/hw/openpic.c b/hw/openpic.c
index dfec52e..109c1bc 100644
--- a/hw/openpic.c
+++ b/hw/openpic.c
@@ -131,11 +131,11 @@ enum {
 #define MPIC_CPU_REG_SIZE 0x100 + ((MAX_CPU - 1) * 0x1000)
 
 enum mpic_ide_bits {
-IDR_EP = 0,
-IDR_CI0 = 1,
-IDR_CI1 = 2,
-IDR_P1 = 30,
-IDR_P0 = 31,
+IDR_EP = 31,
+IDR_CI0 = 30,
+IDR_CI1 = 29,
+IDR_P1 = 1,
+IDR_P0 = 0,
 };
 
 #else
-- 
1.6.0.2
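
The renumbering in this patch follows from the two bit-numbering conventions:
PowerPC documentation counts bit 0 as the most significant bit of a 32-bit
register, while a C shift counts from the least significant bit. A minimal
sketch of the conversion, with hypothetical helper names that are not part of
openpic.c:

```c
#include <stdint.h>

/* Convert a PowerPC (MSB = 0) bit number of a 32-bit register to a shift. */
static unsigned ppc_bit_to_shift(unsigned ppc_bit)
{
    return 31u - ppc_bit;
}

/* Mask for a bit given in PowerPC numbering (ppc_bit assumed below 32). */
static uint32_t ppc_bit_mask(unsigned ppc_bit)
{
    return UINT32_C(1) << ppc_bit_to_shift(ppc_bit);
}
```

Applied to the old enum values this gives the new ones in the diff: 0 becomes
31, 1 becomes 30, 2 becomes 29, 30 becomes 1 and 31 becomes 0.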

Re: [Qemu-devel] [PATCH 50/58] pseries: Update SLOF firmware image

2011-09-14 Thread Peter Maydell

On 14 September 2011 09:43, Alexander Graf ag...@suse.de wrote:
 From: David Gibson da...@gibson.dropbear.id.au

 The current SLOF firmware for the pseries machine has a bug in SCSI
 condition handling that was exposed by recent updates to qemu's SCSI
 emulation.  This patch updates the SLOF image to one with the bug fixed.

 Signed-off-by: David Gibson da...@gibson.dropbear.id.au
 Signed-off-by: Alexander Graf ag...@suse.de
 ---
  pc-bios/README   |    2 +-
  pc-bios/slof.bin |  Bin 579072 - 57 bytes
  2 files changed, 1 insertions(+), 1 deletions(-)

I confess to not really understanding how we keep the git
submodules and the binary blobs in sync, but shouldn't there
be a reference in the commit message to the git commit hash
for the slof sources corresponding to this blob, and maybe
also an update to roms/SLOF here? (cf commit d67c3f2c for
example) ?

-- PMM

[Qemu-devel] [PATCH 31/58] PPC: E500: Bump CPU count to 15

2011-09-14 Thread Alexander Graf

Now that we have everything in place, make the machine description
aware of the fact that we can now handle 15 virtual CPUs!

Signed-off-by: Alexander Graf ag...@suse.de

---

v1 - v2:

  - Max cpus is 15 because of MPIC
---
 hw/ppce500_mpc8544ds.c |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/hw/ppce500_mpc8544ds.c b/hw/ppce500_mpc8544ds.c
index dfa8034..b86a008 100644
--- a/hw/ppce500_mpc8544ds.c
+++ b/hw/ppce500_mpc8544ds.c
@@ -396,6 +396,7 @@ static QEMUMachine mpc8544ds_machine = {
 .name = "mpc8544ds",
 .desc = "mpc8544ds",
 .init = mpc8544ds_init,
+.max_cpus = 15,
 };
 
 static void mpc8544ds_machine_init(void)
-- 
1.6.0.2

[Qemu-devel] [PATCH 21/58] PPC: KVM: Add stubs for kvm helper functions

2011-09-14 Thread Alexander Graf

We have a bunch of helper functions that don't have any stubs for them in case
we don't have CONFIG_KVM enabled. That didn't bite us so far, because gcc can
optimize them out pretty well, but we should really provide them.

Signed-off-by: Alexander Graf ag...@suse.de

---

v1 - v2:

   - use uint64_t for clockfreq
---
 target-ppc/kvm_ppc.h |   26 ++
 1 files changed, 26 insertions(+), 0 deletions(-)

diff --git a/target-ppc/kvm_ppc.h b/target-ppc/kvm_ppc.h
index 0c659c8..76f98d9 100644
--- a/target-ppc/kvm_ppc.h
+++ b/target-ppc/kvm_ppc.h
@@ -11,11 +11,37 @@
 
 void kvmppc_init(void);
 
+#ifdef CONFIG_KVM
+
 uint32_t kvmppc_get_tbfreq(void);
 uint64_t kvmppc_get_clockfreq(void);
 int kvmppc_get_hypercall(CPUState *env, uint8_t *buf, int buf_len);
 int kvmppc_set_interrupt(CPUState *env, int irq, int level);
 
+#else
+
+static inline uint32_t kvmppc_get_tbfreq(void)
+{
+return 0;
+}
+
+static inline uint64_t kvmppc_get_clockfreq(void)
+{
+return 0;
+}
+
+static inline int kvmppc_get_hypercall(CPUState *env, uint8_t *buf, int buf_len)
+{
+return -1;
+}
+
+static inline int kvmppc_set_interrupt(CPUState *env, int irq, int level)
+{
+return -1;
+}
+
+#endif
+
 #ifndef CONFIG_KVM
 #define kvmppc_eieio() do { } while (0)
 #else
-- 
1.6.0.2
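
The stub pattern used in this patch can be shown in a self-contained form.
EXAMPLE_HAVE_ACCEL is a hypothetical switch standing in for CONFIG_KVM, and
accel_get_clockfreq() and pick_clockfreq() are made-up names; the sketch only
illustrates why callers need no #ifdef of their own once a stub exists.

```c
#include <stdint.h>

#ifdef EXAMPLE_HAVE_ACCEL
/* Real implementation lives in another translation unit. */
uint64_t accel_get_clockfreq(void);
#else
/* Stub: reports "unknown" so callers fall back to their own default. */
static inline uint64_t accel_get_clockfreq(void)
{
    return 0;
}
#endif

/* Caller code is identical in both configurations. */
static uint64_t pick_clockfreq(uint64_t fallback)
{
    uint64_t freq = accel_get_clockfreq();

    return freq ? freq : fallback;
}
```

When the switch is not defined, the stub is an ordinary inline function, so
the build no longer relies on the compiler discarding the unreachable call.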

[Qemu-devel] [PATCH 38/58] pseries: interrupt controller should not have a 'reg' property

2011-09-14 Thread Alexander Graf

From: David Gibson da...@gibson.dropbear.id.au

The interrupt controller presented in the device tree for the pseries
machine is manipulated by the guest only through hypervisor calls.  It
has no real or emulated registers for the guest to access.

However, it currently has a bogus 'reg' property advertising a register
window.  Moreover, this property has an invalid format, being a 32-bit
zero, when the #address-cells property on the root bus indicates that it
needs a 64-bit address.  Since the guest never attempts to manipulate
the node directly, it works, but it is ugly and can cause warnings when
manipulating the device tree in other tools (such as future firmware
versions).

This patch, therefore, corrects the problem by entirely removing the
interrupt-controller node's 'reg' property.

Signed-off-by: David Gibson da...@gibson.dropbear.id.au
Signed-off-by: Alexander Graf ag...@suse.de
---
 hw/spapr.c |3 +--
 1 files changed, 1 insertions(+), 2 deletions(-)

diff --git a/hw/spapr.c b/hw/spapr.c
index bb00ae6..9eefef9 100644
--- a/hw/spapr.c
+++ b/hw/spapr.c
@@ -194,12 +194,11 @@ static void *spapr_create_fdt_skel(const char *cpu_model,
 _FDT((fdt_end_node(fdt)));
 
 /* interrupt controller */
-_FDT((fdt_begin_node(fdt, "interrupt-controller@0")));
+_FDT((fdt_begin_node(fdt, "interrupt-controller")));
 
 _FDT((fdt_property_string(fdt, "device_type",
   "PowerPC-External-Interrupt-Presentation")));
 _FDT((fdt_property_string(fdt, "compatible", "IBM,ppc-xicp")));
-_FDT((fdt_property_cell(fdt, "reg", 0)));
 _FDT((fdt_property(fdt, "interrupt-controller", NULL, 0)));
 _FDT((fdt_property(fdt, "ibm,interrupt-server-ranges",
interrupt_server_ranges_prop,
-- 
1.6.0.2

[Qemu-devel] [PATCH 18/58] PPC: E500: Remove mpc8544_copy_soc_cell

2011-09-14 Thread Alexander Graf

We don't need mpc8544_copy_soc_cell anymore, since we're explicitly reading
host values and writing guest values respectively.

Signed-off-by: Alexander Graf ag...@suse.de
---
 hw/ppce500_mpc8544ds.c |   24 
 1 files changed, 0 insertions(+), 24 deletions(-)

diff --git a/hw/ppce500_mpc8544ds.c b/hw/ppce500_mpc8544ds.c
index 8748531..2c7c677 100644
--- a/hw/ppce500_mpc8544ds.c
+++ b/hw/ppce500_mpc8544ds.c
@@ -56,30 +56,6 @@ struct boot_info
 uint32_t entry;
 };
 
-#ifdef CONFIG_FDT
-static int mpc8544_copy_soc_cell(void *fdt, const char *node, const char *prop)
-{
-uint32_t cell;
-int ret;
-
-ret = kvmppc_read_host_property(node, prop, &cell, sizeof(cell));
-if (ret < 0) {
-fprintf(stderr, "couldn't read host %s/%s\n", node, prop);
-goto out;
-}
-
-ret = qemu_devtree_setprop_cell(fdt, "/cpus/PowerPC,8544@0",
-prop, cell);
-if (ret < 0) {
-fprintf(stderr, "couldn't set guest /cpus/PowerPC,8544@0/%s\n", prop);
-goto out;
-}
-
-out:
-return ret;
-}
-#endif
-
 static int mpc8544_load_device_tree(CPUState *env,
 target_phys_addr_t addr,
 uint32_t ramsize,
-- 
1.6.0.2

[Qemu-devel] [PATCH 06/58] PPC: Extend MPIC MMIO range

2011-09-14 Thread Alexander Graf

The MPIC exports a page for each CPU that it controls. To support more than
one CPU, we also need to reserve MMIO space according to the number of
CPUs we want to support.

Signed-off-by: Alexander Graf ag...@suse.de
---
 hw/openpic.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/hw/openpic.c b/hw/openpic.c
index cf89f23..f7d5583 100644
--- a/hw/openpic.c
+++ b/hw/openpic.c
@@ -128,7 +128,7 @@ enum {
 #define MPIC_MSI_REG_START0x11C00
 #define MPIC_MSI_REG_SIZE 0x100
 #define MPIC_CPU_REG_START0x2
-#define MPIC_CPU_REG_SIZE 0x100
+#define MPIC_CPU_REG_SIZE 0x100 + ((MAX_CPU - 1) * 0x1000)
 
 enum mpic_ide_bits {
 IDR_EP = 0,
-- 
1.6.0.2
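
The size computation in this patch can be written as a function-like macro
for illustration. The constants are the ones visible in the diff (0x100 for
the first CPU window, 0x1000 per additional CPU); the macro names are
hypothetical. The sketch adds outer parentheses, which the macro in the patch
does not have, so that it stays correct inside larger expressions.

```c
#include <stdint.h>

#define CPU_REG_FIRST_SIZE  0x100u   /* register window of the first CPU */
#define CPU_REG_STRIDE      0x1000u  /* one page per additional CPU */

/* Fully parenthesized so it can be multiplied or compared safely. */
#define CPU_REG_SIZE(ncpus) \
    (CPU_REG_FIRST_SIZE + (((ncpus) - 1u) * CPU_REG_STRIDE))
```

For one CPU this yields 0x100 as before; for 15 CPUs it yields 0xE100.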

[Qemu-devel] [PATCH 49/58] vscsi: send the CHECK_CONDITION status down together with autosense data

2011-09-14 Thread Alexander Graf

From: Paolo Bonzini pbonz...@redhat.com

I introduced this bug in commit 05751d3 (vscsi: always use get_sense,
2011-08-03) because at the time there was no way to expose a sense
condition to SLOF and Linux manages to work around the bug.  However,
the bug becomes evident now that SCSI devices also report unit
attention on reset.

SLOF also has problems dealing with unit attention conditions, so
it still will not boot even with this fix (just like OpenBIOS).
IBM folks are aware of their part of the bug. :-)

Reported-by: Thomas Huth th...@linux.vnet.ibm.com
Signed-off-by: Paolo Bonzini pbonz...@redhat.com
Acked-by: David Gibson da...@gibson.dropbear.id.au
Signed-off-by: Alexander Graf ag...@suse.de
---
 hw/spapr_vscsi.c |1 -
 1 files changed, 0 insertions(+), 1 deletions(-)

diff --git a/hw/spapr_vscsi.c b/hw/spapr_vscsi.c
index 6fc82f6..e8426d7 100644
--- a/hw/spapr_vscsi.c
+++ b/hw/spapr_vscsi.c
@@ -483,7 +483,6 @@ static void vscsi_command_complete(SCSIRequest *sreq, 
uint32_t status)
 if (status == CHECK_CONDITION) {
 req->senselen = scsi_req_get_sense(req->sreq, req->sense,
 sizeof(req->sense));
-status = 0;
 dprintf("VSCSI: Sense data, %d bytes:\n", len);
 dprintf("   %02x  %02x  %02x  %02x  %02x  %02x  %02x  %02x\n",
 req->sense[0], req->sense[1], req->sense[2], req->sense[3],
-- 
1.6.0.2

[Qemu-devel] [PATCH 41/58] pseries: Add real mode debugging hcalls

2011-09-14 Thread Alexander Graf

From: David Gibson da...@gibson.dropbear.id.au

PAPR systems support several hypercalls intended for use in real mode
debugging tools.  These implement reads and writes to arbitrary guest
physical addresses.  This is useful for real mode software because it
allows access to IO addresses and memory outside the RMA without going
through the somewhat involved process of setting up the hash page table
and enabling translation.

We want these so that when we add real IO devices, the SLOF firmware can
boot from them without having to enter virtual mode.

Signed-off-by: Benjamin Herrenschmidt b...@kernel.crashing.org
Signed-off-by: David Gibson d...@au1.ibm.com
Signed-off-by: Alexander Graf ag...@suse.de
---
 hw/spapr_hcall.c |   73 ++
 1 files changed, 73 insertions(+), 0 deletions(-)

diff --git a/hw/spapr_hcall.c b/hw/spapr_hcall.c
index 70f853c..0c61c10 100644
--- a/hw/spapr_hcall.c
+++ b/hw/spapr_hcall.c
@@ -463,6 +463,67 @@ static target_ulong h_rtas(CPUState *env, sPAPREnvironment 
*spapr,
nret, rtas_r3 + 12 + 4*nargs);
 }
 
+static target_ulong h_logical_load(CPUState *env, sPAPREnvironment *spapr,
+   target_ulong opcode, target_ulong *args)
+{
+target_ulong size = args[0];
+target_ulong addr = args[1];
+
+switch (size) {
+case 1:
+args[0] = ldub_phys(addr);
+return H_SUCCESS;
+case 2:
+args[0] = lduw_phys(addr);
+return H_SUCCESS;
+case 4:
+args[0] = ldl_phys(addr);
+return H_SUCCESS;
+case 8:
+args[0] = ldq_phys(addr);
+return H_SUCCESS;
+}
+return H_PARAMETER;
+}
+
+static target_ulong h_logical_store(CPUState *env, sPAPREnvironment *spapr,
+target_ulong opcode, target_ulong *args)
+{
+target_ulong size = args[0];
+target_ulong addr = args[1];
+target_ulong val  = args[2];
+
+switch (size) {
+case 1:
+stb_phys(addr, val);
+return H_SUCCESS;
+case 2:
+stw_phys(addr, val);
+return H_SUCCESS;
+case 4:
+stl_phys(addr, val);
+return H_SUCCESS;
+case 8:
+stq_phys(addr, val);
+return H_SUCCESS;
+}
+return H_PARAMETER;
+}
+
+static target_ulong h_logical_icbi(CPUState *env, sPAPREnvironment *spapr,
+   target_ulong opcode, target_ulong *args)
+{
+/* Nothing to do on emulation, KVM will trap this in the kernel */
+return H_SUCCESS;
+}
+
+static target_ulong h_logical_dcbf(CPUState *env, sPAPREnvironment *spapr,
+   target_ulong opcode, target_ulong *args)
+{
+/* Nothing to do on emulation, KVM will trap this in the kernel */
+return H_SUCCESS;
+}
+
 static spapr_hcall_fn papr_hypercall_table[(MAX_HCALL_OPCODE / 4) + 1];
 static spapr_hcall_fn kvmppc_hypercall_table[KVMPPC_HCALL_MAX - 
KVMPPC_HCALL_BASE + 1];
 
@@ -527,6 +588,18 @@ static void hypercall_init(void)
 spapr_register_hypercall(H_REGISTER_VPA, h_register_vpa);
 spapr_register_hypercall(H_CEDE, h_cede);
 
+/* debugger hcalls (also used by SLOF). Note: We do -not- differentiate
+ * here between the CI and the CACHE variants, they will use whatever
+ * mapping attributes qemu is using. When using KVM, the kernel will
+ * enforce the attributes more strongly
+ */
+spapr_register_hypercall(H_LOGICAL_CI_LOAD, h_logical_load);
+spapr_register_hypercall(H_LOGICAL_CI_STORE, h_logical_store);
+spapr_register_hypercall(H_LOGICAL_CACHE_LOAD, h_logical_load);
+spapr_register_hypercall(H_LOGICAL_CACHE_STORE, h_logical_store);
+spapr_register_hypercall(H_LOGICAL_ICBI, h_logical_icbi);
+spapr_register_hypercall(H_LOGICAL_DCBF, h_logical_dcbf);
+
 /* qemu/KVM-PPC specific hcalls */
 spapr_register_hypercall(KVMPPC_H_RTAS, h_rtas);
 }
-- 
1.6.0.2
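
The size dispatch in h_logical_load can be modelled without QEMU. The sketch
below is an illustration under stated assumptions: guest memory is a flat
byte buffer, values are big-endian, and EX_SUCCESS and EX_PARAMETER are
placeholder return codes rather than the real H_SUCCESS and H_PARAMETER
constants. The bounds check is an addition of this sketch; in the patch the
ld*_phys helpers handle the address.

```c
#include <stddef.h>
#include <stdint.h>

/* Placeholder return codes for this sketch only. */
enum { EX_SUCCESS = 0, EX_PARAMETER = -1 };

/* Load 1, 2, 4 or 8 bytes (big-endian) from a flat guest memory buffer. */
static int logical_load(const uint8_t *mem, size_t mem_len,
                        uint64_t addr, uint64_t size, uint64_t *out)
{
    uint64_t value = 0;

    if (size != 1 && size != 2 && size != 4 && size != 8) {
        return EX_PARAMETER;            /* unsupported access width */
    }
    if (addr > mem_len || size > mem_len - addr) {
        return EX_PARAMETER;            /* access would leave the buffer */
    }
    for (uint64_t i = 0; i < size; i++) {
        value = (value << 8) | mem[addr + i];
    }
    *out = value;
    return EX_SUCCESS;
}
```

A store variant would follow the same shape, writing the bytes of the value
from most to least significant instead of reading them.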

[Qemu-devel] [PATCH 08/58] PPC: Set MPIC IDE for IPI to 0

2011-09-14 Thread Alexander Graf

We use the IDE register with IPIs as a mask to keep track of which processors
have already acknowledged the respective interrupt. So we need to initialize
it to 0 to make sure that it doesn't accidentally fire an IPI on CPU0 when the
first IPI is triggered.

Reported-by: Elie Richa ri...@adacore.com
Signed-off-by: Alexander Graf ag...@suse.de

---

v2 - v3:

  - fix IDE IPI reset
---
 hw/openpic.c |4 
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/hw/openpic.c b/hw/openpic.c
index 9710ac0..31ad175 100644
--- a/hw/openpic.c
+++ b/hw/openpic.c
@@ -1299,6 +1299,10 @@ static void mpic_reset (void *opaque)
 mpp->src[i].ipvp = 0x8080;
 mpp->src[i].ide  = 0x0001;
 }
+/* Set IDE for IPIs to 0 so we don't get spurious interrupts */
+for (i = mpp->irq_ipi0; i < (mpp->irq_ipi0 + MAX_IPI); i++) {
+mpp->src[i].ide = 0;
+}
 /* Initialise IRQ destinations */
 for (i = 0; i < MAX_CPU; i++) {
 mpp->dst[i].pctp  = 0x000F;
-- 
1.6.0.2

[Qemu-devel] [PATCH 29/58] MPC8544DS: Remove CPU nodes

2011-09-14 Thread Alexander Graf

We want to generate the CPU nodes in machine init code, so remove them from
the device tree definition that we precompile.

Signed-off-by: Alexander Graf ag...@suse.de
---
 pc-bios/mpc8544ds.dtb |  Bin 2277 - 2028 bytes
 pc-bios/mpc8544ds.dts |   12 
 2 files changed, 0 insertions(+), 12 deletions(-)

diff --git a/pc-bios/mpc8544ds.dtb b/pc-bios/mpc8544ds.dtb
index 
ae318b1fe83846cc2e133951a3666fcfcdf87f79..c6d302153c7407d5d0127be29b0c35f80e47f8fb
 100644
GIT binary patch
delta 424
zcmaDV_=aEO0`I@K3=HgV7#J8V7#P?t0BH%76f7eAO-?P8KC%#jT*{~lRq;qVGNu+
zgGpO80wTx2Se#mvnV92XVrpOj5@H5o79dUoaVFO=n@yHu7E~+@qhp%%K^lVK%DC
zOh63N(K9)OS(!0yas{(Dk?LOn)z6*G!y?7RuxYXeOPCPDVW4@8NM@d#Jb@*NiQ(ep
zFDXnqh(mFQVQVF#G^2fdh6R3*?3eKTTP3F~Xi9v};53e2?KrxtWfnp$OF!E7
zLApVn2ZjurwO}FhPlr`dQBX*1n*4+n2~GZ2Ia}o?0Nb{iFxU%#SBTM#ky%lsfDGf
edC8Rw$*DOxx|w+?sTB;#Ir+)i2u`Q9CHADZ%Xk1

delta 636
zcmaFE|5Q-p0`I@K3=AAk85kHW7#P?qfV2h3j(nK5CZ{YEPTIqlPkLJ!3$Ad1_IB
zvyO$SiHU;Seh9~vH-DTazQCb0LJ$Paex5E4+OFmkod`He4yqApb%Vr6B@stfk6!
z4_B}V%tP=uK19QJs6iW9+=rQJZnmWEm!T#^aN1n7mbC@*oFs0P!Ut)gQCAci^e
z?LL0%0TrOh*s~wtStKu$plcqfdJG*M`*4%wa-|B0wQVBw?w^FPM{7?mdbu4v=
zD`BxVle%fkldm(RuP2me-bdkuuD*+UPxfUqK2ntdV_z%P`#=!_^f#-u;0ETO
z4yM|z{ml*!iFuFF?#X@wu$vAy2**j8L7HCnR%(Y#hF#944D`rFf}OBU`|P9Zfa6u
zajI@wQEFjnYF=_BLsDrm5-L?KRFwTUzC`ao?6V1oSKuPo0*rA%2+WufPD@C{)cX6

diff --git a/pc-bios/mpc8544ds.dts b/pc-bios/mpc8544ds.dts
index a88b47c..7eb3160 100644
--- a/pc-bios/mpc8544ds.dts
+++ b/pc-bios/mpc8544ds.dts
@@ -25,18 +25,6 @@
 	cpus {
 		#address-cells = <1>;
 		#size-cells = <0>;
-
-		PowerPC,8544@0 {
-			device_type = "cpu";
-			reg = <0x0>;
-			d-cache-line-size = <32>;	// 32 bytes
-			i-cache-line-size = <32>;	// 32 bytes
-			d-cache-size = <0x8000>;	// L1, 32K
-			i-cache-size = <0x8000>;	// L1, 32K
-			timebase-frequency = <0>;
-			bus-frequency = <0>;
-			clock-frequency = <0>;
-		};
 	};
 
memory {
-- 
1.6.0.2

Re: [Qemu-devel] [PATCH v8 3/4] block: add block timer and throttling algorithm

2011-09-14 Thread Marcelo Tosatti

On Tue, Sep 13, 2011 at 11:09:46AM +0800, Zhi Yong Wu wrote:
 On Fri, Sep 9, 2011 at 10:44 PM, Marcelo Tosatti mtosa...@redhat.com wrote:
  On Thu, Sep 08, 2011 at 06:11:07PM +0800, Zhi Yong Wu wrote:
  Note:
       1.) When bps/iops limits are specified to a small value such as 511 
  bytes/s, this VM will hang up. We are considering how to handle this 
  scenario.
 
  You can increase the length of the slice, if the request is larger than
  slice_time * bps_limit.
 Yeah, but working out how to increase it is the challenge. Do you have a good 
 idea?

If the queue is empty, and the request being processed does not fit the
queue, increase the slice so that the request fits.

That is, make BLOCK_IO_SLICE_TIME dynamic and adjust it as described
above (if the bps or io limits change, reset it to the default
BLOCK_IO_SLICE_TIME).
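
The adjustment can be sketched roughly as follows. This is an illustration only, not code from the patch series: the function names, the nanosecond units, and the 100 ms default are assumptions, and the multiplication can overflow for very large requests or limits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_NS_PER_SEC 1000000000ULL
/* Hypothetical default slice (100 ms); stands in for BLOCK_IO_SLICE_TIME. */
#define SKETCH_DEFAULT_SLICE_NS 100000000ULL

/* Return the slice length to use for the request being processed.
 * The slice only grows when the queue is empty and the request does not
 * fit into the byte budget of the current slice. */
static uint64_t sketch_adjust_slice(uint64_t slice_ns, uint64_t bps_limit,
                                    uint64_t request_bytes, bool queue_empty)
{
    assert(bps_limit > 0);

    /* Bytes the limit allows within the current slice. */
    uint64_t budget = bps_limit * slice_ns / SKETCH_NS_PER_SEC;
    if (!queue_empty || request_bytes <= budget) {
        return slice_ns;
    }
    /* Smallest slice whose budget covers the request (rounded up). */
    return (request_bytes * SKETCH_NS_PER_SEC + bps_limit - 1) / bps_limit;
}

/* When the bps or iops limits change, fall back to the default slice. */
static uint64_t sketch_reset_slice(void)
{
    return SKETCH_DEFAULT_SLICE_NS;
}
```

With a limit of 511 bytes/s and a single 512-byte request, the slice grows to just over one second, so the request can eventually be dispatched instead of waiting forever.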

       2.) When dd command is issued in guest, if its option bs is set to 
  a large value such as bs=1024K, the resulting speed will be slightly higher 
  than the limits.
 
  Why?
 This issue no longer exists. I will remove it.
 When drive bps=100, I did some testing on the guest VM.
 1.) bs=1024K
 18+0 records in
 18+0 records out
 18874368 bytes (19 MB) copied, 26.6268 s, 709 kB/s
 2.) bs=2048K
 18+0 records in
 18+0 records out
 37748736 bytes (38 MB) copied, 46.5336 s, 811 kB/s
 
 
  There are lots of debugging leftovers in the patch.
 Sorry, I forgot to remove them.

[Qemu-devel] [PATCH 37/58] pseries: Add a phandle to the xicp interrupt controller device tree node

2011-09-14 Thread Alexander Graf

From: David Gibson da...@gibson.dropbear.id.au

Future devices we will be adding to the pseries machine (e.g. PCI) will
need nodes in the device tree which explicitly reference the top-level
interrupt controller via interrupt-parent or interrupt-map properties.

In order to do this, the interrupt controller node needs an assigned
phandle.  This patch adds the appropriate property, in preparation.

Signed-off-by: David Gibson da...@gibson.dropbear.id.au
Signed-off-by: Alexander Graf ag...@suse.de
---
 hw/spapr.c |5 +
 1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/hw/spapr.c b/hw/spapr.c
index 760e323..bb00ae6 100644
--- a/hw/spapr.c
+++ b/hw/spapr.c
@@ -57,6 +57,8 @@
 #define MAX_CPUS256
 #define XICS_IRQS  1024
 
+#define PHANDLE_XICP0x
+
 sPAPREnvironment *spapr;
 
 static void *spapr_create_fdt_skel(const char *cpu_model,
@@ -202,6 +204,9 @@ static void *spapr_create_fdt_skel(const char *cpu_model,
     _FDT((fdt_property(fdt, "ibm,interrupt-server-ranges",
                        interrupt_server_ranges_prop,
                        sizeof(interrupt_server_ranges_prop))));
+    _FDT((fdt_property_cell(fdt, "#interrupt-cells", 2)));
+    _FDT((fdt_property_cell(fdt, "linux,phandle", PHANDLE_XICP)));
+    _FDT((fdt_property_cell(fdt, "phandle", PHANDLE_XICP)));
 
 _FDT((fdt_end_node(fdt)));
 
-- 
1.6.0.2

[Qemu-devel] [PATCH 52/58] ppc405: use RAM_ADDR_FMT instead of %08lx

2011-09-14 Thread Alexander Graf

From: Stefan Hajnoczi stefa...@linux.vnet.ibm.com

The RAM_ADDR_FMT macro hides the type of ram_addr_t so that format
strings can be safely used.  Make sure to use RAM_ADDR_FMT so that the
build works on 32-bit hosts with Xen enabled.  Whether Xen should affect
ppc TCG targets is questionable but a separate issue.
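
The general pattern looks roughly like the sketch below. `sketch_ram_addr_t` and `SKETCH_RAM_ADDR_FMT` are hypothetical stand-ins rather than QEMU's real definitions; the point is only that the typedef and its format macro are defined side by side, so call sites never hard-code a length modifier such as `%08lx`.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-ins for ram_addr_t and RAM_ADDR_FMT. If the typedef
 * changes width, only the macro next to it has to change with it. */
typedef uint64_t sketch_ram_addr_t;
#define SKETCH_RAM_ADDR_FMT "%" PRIx64

/* Format an offset without naming its concrete integer type at the call
 * site. Returns what snprintf returns. */
static int sketch_format_offset(char *buf, size_t len, sketch_ram_addr_t off)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len,
                    "register SRAM at offset " SKETCH_RAM_ADDR_FMT, off);
}
```

Adjacent string literals are concatenated by the compiler, which is why the macro can be dropped between two quoted pieces of the format string.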

Signed-off-by: Stefan Hajnoczi stefa...@linux.vnet.ibm.com
Signed-off-by: Alexander Graf ag...@suse.de
---
 hw/ppc405_boards.c |5 +++--
 1 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/hw/ppc405_boards.c b/hw/ppc405_boards.c
index e6c8ac6..712a6be 100644
--- a/hw/ppc405_boards.c
+++ b/hw/ppc405_boards.c
@@ -213,7 +213,8 @@ static void ref405ep_init (ram_addr_t ram_size,
     sram_size = 512 * 1024;
     sram_offset = qemu_ram_alloc(NULL, "ef405ep.sram", sram_size);
 #ifdef DEBUG_BOARD_INIT
-    printf("%s: register SRAM at offset %08lx\n", __func__, sram_offset);
+    printf("%s: register SRAM at offset " RAM_ADDR_FMT "\n",
+           __func__, sram_offset);
 #endif
     cpu_register_physical_memory(0xFFF0, sram_size,
                                  sram_offset | IO_MEM_RAM);
@@ -357,7 +358,7 @@ static void ref405ep_init (ram_addr_t ram_size,
 #ifdef DEBUG_BOARD_INIT
     printf("%s: Done\n", __func__);
 #endif
-    printf("bdloc %016lx\n", (unsigned long)bdloc);
+    printf("bdloc " RAM_ADDR_FMT "\n", bdloc);
 }
 
 static QEMUMachine ref405ep_machine = {
-- 
1.6.0.2

[Qemu-devel] [PATCH 24/58] PPC: E500: Add PV spinning code

2011-09-14 Thread Alexander Graf

CPUs that are not the boot CPU need to run in spinning code to check if they
should run off to execute and if so where to jump to. This usually happens
by leaving secondary CPUs looping and checking if some variable in memory
changed.

In an environment like QEMU, however, we can be more clever. We can just export
the spin table the primary CPU modifies as an MMIO region that wakes up the
respective secondary CPUs in an event-based fashion. That saves us quite some
cycles while the secondary CPUs are not up yet.

So this patch adds a PV device that simply exports the spinning table into the
guest and thus allows the primary CPU to wake up secondary ones.

Signed-off-by: Alexander Graf ag...@suse.de

---

v1 - v2:

  - change into MMIO scheme
  - map the secondary NIP instead of 0 1:1
  - only map 64MB for TLB, same as u-boot
  - prepare code for 64-bit spinnings

v2 - v3:

  - remove r6
  - set MAS2_M
  - map EA 0
  - use second TLB1 entry

v3 - v4:

  - change to memoryops

v4 - v5:

  - fix endianness bugs
---
 Makefile.target|2 +-
 hw/ppce500_mpc8544ds.c |   33 -
 hw/ppce500_spin.c  |  186 
 3 files changed, 216 insertions(+), 5 deletions(-)
 create mode 100644 hw/ppce500_spin.c

diff --git a/Makefile.target b/Makefile.target
index 2ed9099..3f689ce 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -247,7 +247,7 @@ endif
 obj-ppc-y += ppc4xx_devs.o ppc4xx_pci.o ppc405_uc.o ppc405_boards.o
 obj-ppc-y += ppc440.o ppc440_bamboo.o
 # PowerPC E500 boards
-obj-ppc-y += ppce500_mpc8544ds.o mpc8544_guts.o
+obj-ppc-y += ppce500_mpc8544ds.o mpc8544_guts.o ppce500_spin.o
 # PowerPC 440 Xilinx ML507 reference board.
 obj-ppc-y += virtex_ml507.o
 obj-ppc-$(CONFIG_KVM) += kvm_ppc.o
diff --git a/hw/ppce500_mpc8544ds.c b/hw/ppce500_mpc8544ds.c
index 9379624..3b8b449 100644
--- a/hw/ppce500_mpc8544ds.c
+++ b/hw/ppce500_mpc8544ds.c
@@ -49,6 +49,7 @@
 #define MPC8544_PCI_IO 0xE100
 #define MPC8544_PCI_IOLEN  0x1
 #define MPC8544_UTIL_BASE  (MPC8544_CCSRBAR_BASE + 0xe)
+#define MPC8544_SPIN_BASE  0xEF00
 
 struct boot_info
 {
@@ -164,6 +165,18 @@ static void mmubooke_create_initial_mapping(CPUState *env,
     tlb->mas7_3 |= MAS3_UR | MAS3_UW | MAS3_UX | MAS3_SR | MAS3_SW | MAS3_SX;
 }
 
+static void mpc8544ds_cpu_reset_sec(void *opaque)
+{
+    CPUState *env = opaque;
+
+    cpu_reset(env);
+
+    /* Secondary CPU starts in halted state for now. Needs to change when
+       implementing non-kernel boot. */
+    env->halted = 1;
+    env->exception_index = EXCP_HLT;
+}
+
 static void mpc8544ds_cpu_reset(void *opaque)
 {
     CPUState *env = opaque;
@@ -172,6 +185,7 @@ static void mpc8544ds_cpu_reset(void *opaque)
     cpu_reset(env);
 
     /* Set initial guest state. */
+    env->halted = 0;
     env->gpr[1] = (16<<20) - 8;
     env->gpr[3] = bi->dt_base;
     env->nip = bi->entry;
@@ -199,7 +213,6 @@ static void mpc8544ds_init(ram_addr_t ram_size,
     unsigned int pci_irq_nrs[4] = {1, 2, 3, 4};
     qemu_irq **irqs, *mpic;
     DeviceState *dev;
-    struct boot_info *boot_info;
     CPUState *firstenv = NULL;
 
     /* Setup CPUs */
@@ -234,9 +247,16 @@ static void mpc8544ds_init(ram_addr_t ram_size,
         env->spr[SPR_40x_TCR] = 1 << 26;
 
         /* Register reset handler */
-        boot_info = g_malloc0(sizeof(struct boot_info));
-        qemu_register_reset(mpc8544ds_cpu_reset, env);
-        env->load_info = boot_info;
+        if (!i) {
+            /* Primary CPU */
+            struct boot_info *boot_info;
+            boot_info = g_malloc0(sizeof(struct boot_info));
+            qemu_register_reset(mpc8544ds_cpu_reset, env);
+            env->load_info = boot_info;
+        } else {
+            /* Secondary CPUs */
+            qemu_register_reset(mpc8544ds_cpu_reset_sec, env);
+        }
     }
 
     env = firstenv;
@@ -289,6 +309,9 @@ static void mpc8544ds_init(ram_addr_t ram_size,
 }
 }
 
+    /* Register spinning region */
+    sysbus_create_simple("e500-spin", MPC8544_SPIN_BASE, NULL);
+
     /* Load kernel. */
     if (kernel_filename) {
         kernel_size = load_uimage(kernel_filename, &entry, &loadaddr, NULL);
@@ -321,6 +344,8 @@ static void mpc8544ds_init(ram_addr_t ram_size,
 
     /* If we're loading a kernel directly, we must load the device tree too. */
     if (kernel_filename) {
+        struct boot_info *boot_info;
+
 #ifndef CONFIG_FDT
         cpu_abort(env, "Compiled without FDT support - can't load kernel\n");
 #endif
diff --git a/hw/ppce500_spin.c b/hw/ppce500_spin.c
new file mode 100644
index 000..38451ac
--- /dev/null
+++ b/hw/ppce500_spin.c
@@ -0,0 +1,186 @@
+#include "hw.h"
+#include "sysemu.h"
+#include "sysbus.h"
+#include "kvm.h"
+
+#define MAX_CPUS 32
+
+typedef struct spin_info {
+    uint64_t addr;
+    uint64_t r3;
+    uint32_t resv;
+    uint32_t pir;
+    uint64_t reserved;
+} __attribute__ ((packed)) SpinInfo;
+
+typedef struct spin_state {
+    SysBusDevice busdev;
+

[Qemu-devel] [PATCH 19/58] PPC: bamboo: Use kvm api for freq and clock frequencies

2011-09-14 Thread Alexander Graf

Now that we have nice and shiny APIs to read out the host's clock and timebase
frequencies, let's use them in the bamboo code as well!

Signed-off-by: Alexander Graf ag...@suse.de
---
 hw/ppc440_bamboo.c |   45 -
 1 files changed, 12 insertions(+), 33 deletions(-)

diff --git a/hw/ppc440_bamboo.c b/hw/ppc440_bamboo.c
index 65d4f0f..1523764 100644
--- a/hw/ppc440_bamboo.c
+++ b/hw/ppc440_bamboo.c
@@ -31,38 +31,6 @@
 #define FDT_ADDR 0x180
 #define RAMDISK_ADDR 0x190
 
-#ifdef CONFIG_FDT
-static int bamboo_copy_host_cell(void *fdt, const char *node, const char *prop)
-{
-    uint32_t cell;
-    int ret;
-
-    ret = kvmppc_read_host_property(node, prop, &cell, sizeof(cell));
-    if (ret < 0) {
-        fprintf(stderr, "couldn't read host %s/%s\n", node, prop);
-        goto out;
-    }
-
-    ret = qemu_devtree_setprop_cell(fdt, node, prop, cell);
-    if (ret < 0) {
-        fprintf(stderr, "couldn't set guest %s/%s\n", node, prop);
-        goto out;
-    }
-
-out:
-    return ret;
-}
-
-static void bamboo_fdt_update(void *fdt)
-{
-    /* Copy data from the host device tree into the guest. Since the guest can
-     * directly access the timebase without host involvement, we must expose
-     * the correct frequencies. */
-    bamboo_copy_host_cell(fdt, "/cpus/cpu@0", "clock-frequency");
-    bamboo_copy_host_cell(fdt, "/cpus/cpu@0", "timebase-frequency");
-}
-#endif
-
 static int bamboo_load_device_tree(target_phys_addr_t addr,
                                      uint32_t ramsize,
                                      target_phys_addr_t initrd_base,
@@ -75,6 +43,8 @@ static int bamboo_load_device_tree(target_phys_addr_t addr,
     char *filename;
     int fdt_size;
     void *fdt;
+    uint32_t tb_freq = 4;
+    uint32_t clock_freq = 4;
 
     filename = qemu_find_file(QEMU_FILE_TYPE_BIOS, BINARY_DEVICE_TREE_FILE);
     if (!filename) {
@@ -108,10 +78,19 @@ static int bamboo_load_device_tree(target_phys_addr_t addr,
     if (ret < 0)
         fprintf(stderr, "couldn't set /chosen/bootargs\n");
 
+    /* Copy data from the host device tree into the guest. Since the guest can
+     * directly access the timebase without host involvement, we must expose
+     * the correct frequencies. */
     if (kvm_enabled()) {
-        bamboo_fdt_update(fdt);
+        tb_freq = kvmppc_get_tbfreq();
+        clock_freq = kvmppc_get_clockfreq();
     }
 
+    qemu_devtree_setprop_cell(fdt, "/cpus/cpu@0", "clock-frequency",
+                              clock_freq);
+    qemu_devtree_setprop_cell(fdt, "/cpus/cpu@0", "timebase-frequency",
+                              tb_freq);
+
     ret = rom_add_blob_fixed(BINARY_DEVICE_TREE_FILE, fdt, fdt_size, addr);
     g_free(fdt);
 
-- 
1.6.0.2
