from:"Alexander Graf"

Re: [Qemu-devel] [Qemu-ppc] real cdrom access failure

2013-06-10 Thread Alexander Graf


On 06/10/2013 03:39 PM, Programmingkid wrote:

On Jun 9, 2013, at 12:34 PM, Alexander Graf wrote:


On 09.06.2013, at 18:28, Programmingkid wrote:


I am trying to access the cdrom drive in QEMU 1.5.0, but can't. This is the 
error I see: qemu-system-ppc: -cdrom /dev/cdrom: could not open disk image 
/dev/cdrom: No such file or directory. I think this is a bug with version 1.5.0 
on Mac OS X. Anybody else notice this problem?

Mac OS X doesn't provide a /dev/cdrom link. You have to point it directly to 
the target device. To get a list of available devices, try

   $ diskutil list

Also make sure that all partitions and file systems on top of the CD-ROM are 
unmounted (diskutil unmount or just umount), as OSX won't allow direct access 
to /dev/disk1 otherwise.


Alex



The -cdrom /dev/cdrom option always worked in the past. Just not with version 
1.5.0.


Hrm. CC'ing Andreas and Peter. They're the people who know their way around 
OS X host support best :). Also changing qemu-ppc@ to qemu-devel@, as this is 
100% unrelated to the ppc target.



Alex

Re: [Qemu-devel] [PATCH v2]booke: timer: Deactivate timer for target_bit above 61

2013-06-10 Thread Alexander Graf

On 06/10/2013 02:47 PM, Bhushan Bharat-R65777 wrote:

-Original Message-
From: Andreas Färber [mailto:afaer...@suse.de]
Sent: Monday, June 10, 2013 5:43 PM
To: Bhushan Bharat-R65777
Cc: qemu-...@nongnu.org; qemu-devel@nongnu.org; ag...@suse.de; Wood Scott-
B07421; Bhushan Bharat-R65777
Subject: Re: [Qemu-devel] [PATCH v2]booke: timer: Deactivate timer for
target_bit above 61

Am 10.06.2013 09:55, schrieb Bharat Bhushan:

QEMU timer supports a maximum timer of INT64_MAX. So start the timer
only for times calculated using target_bit < 62, and
deactivate/stop the timer if the target bit is above 61.

This patch also fixes the time calculation from target_bit.
The code was doing (1 << (target_bit + 1)) while this should be
(1ULL << (target_bit + 1)).

Signed-off-by: Bharat Bhushan <bharat.bhus...@freescale.com>
---
v1->v2
  - Added booke: timer: in patch subject

  hw/ppc/ppc_booke.c |8 +++-
  1 files changed, 7 insertions(+), 1 deletions(-)

diff --git a/hw/ppc/ppc_booke.c b/hw/ppc/ppc_booke.c
index e41b036..f4eda15 100644
--- a/hw/ppc/ppc_booke.c
+++ b/hw/ppc/ppc_booke.c
@@ -133,9 +133,15 @@ static void booke_update_fixed_timer(CPUPPCState *env,
     ppc_tb_t *tb_env = env->tb_env;
     uint64_t lapse;
     uint64_t tb;
-    uint64_t period = 1 << (target_bit + 1);
+    uint64_t period;
     uint64_t now;

+    /* Deactivate timer for target_bit > 61 */
+    if (target_bit > 61)
+        return;

Braces missing and trailing whitespace after return.

Ok, will correct

So IIUC we can only allow 63 bits due to signedness, thus a maximum of
(1 << 62), thus target_bit <= 61.

Any chance at least the comment can be worded to explain that any better? Maybe
also use (target_bit + 1 >= 63) or period > INT64_MAX as condition?

How about this:
 /* QEMU timer supports a maximum timer of INT64_MAX (0x7fff_).
  * Run booke fit/wdog timer when
  * ((1ULL << target_bit + 1) < 0x4000_), i.e. target_bit <= 61.
  * Also the time with this maximum target_bit (with current range of
  * CPU frequency PowerPC supports) will be many many years. So it is
  * pretty safe to stop the timer above this threshold. */

How about

  /* This timeout will take years to trigger. Treat the timer as disabled. */

Alex
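
The two points in the patch description (the shift must be done in 64 bits, and periods above INT64_MAX cannot be handed to the timer layer) can be sketched in standalone C. This is a minimal sketch and not the code from the patch: `fixed_timer_period` and `timer_bit_is_representable` are hypothetical helper names, and the only facts relied on are that a plain `1` is an `int` in C and that the thread gives INT64_MAX as the timer limit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: period of a fixed-interval timer keyed to bit
 * `target_bit` of the time base. The literal has to be 1ULL: a plain `1`
 * is an int, so shifting it by 32 or more would be undefined behaviour. */
static uint64_t fixed_timer_period(unsigned target_bit)
{
    assert(target_bit < 63);          /* keep the shift count below 64 */
    return 1ULL << (target_bit + 1);
}

/* Hypothetical helper: true when the period still fits into the signed
 * 64-bit range (INT64_MAX) that the thread names as the timer limit. */
static bool timer_bit_is_representable(unsigned target_bit)
{
    return target_bit < 63 &&
           fixed_timer_period(target_bit) <= (uint64_t)INT64_MAX;
}
```

With these definitions the largest representable case is target_bit = 61 (a period of 1ULL << 62), which matches the threshold discussed in the review.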

Re: [Qemu-devel] [PATCH v2]booke: timer: Deactivate timer for target_bit above 61

2013-06-10 Thread Alexander Graf


On 10.06.2013, at 19:20, Scott Wood wrote:

 On 06/10/2013 09:26:18 AM, Alexander Graf wrote:
 On 06/10/2013 02:47 PM, Bhushan Bharat-R65777 wrote:
 -Original Message-
 From: Andreas Färber [mailto:afaer...@suse.de]
 Sent: Monday, June 10, 2013 5:43 PM
 To: Bhushan Bharat-R65777
 Cc: qemu-...@nongnu.org; qemu-devel@nongnu.org; ag...@suse.de; Wood Scott-
 B07421; Bhushan Bharat-R65777
 Subject: Re: [Qemu-devel] [PATCH v2]booke: timer: Deactivate timer for
 target_bit above 61
 So IIUC we can only allow 63 bits due to signedness, thus a maximum of
 (1 << 62), thus target_bit <= 61.
 Any chance at least the comment can be worded to explain that any better?
 Maybe also use (target_bit + 1 >= 63) or period > INT64_MAX as condition?
 How about this:
  /* QEMU timer supports a maximum timer of INT64_MAX (0x7fff_).
   * Run booke fit/wdog timer when
   * ((1ULL << target_bit + 1) < 0x4000_), i.e. target_bit <= 61.
   * Also the time with this maximum target_bit (with current range of
   * CPU frequency PowerPC supports) will be many many years. So it is
   * pretty safe to stop the timer above this threshold. */
 How about
  /* This timeout will take years to trigger. Treat the timer as disabled. */
 
 There should be at least a brief mention that it's because the QEMU timer 
 can't handle larger values,

If it can't handle higher values, maybe it's better to just set the timer value 
to INT64_MAX when we detect an overflow? That would make the code plainly 
obvious.


Alex

 with the detailed explanation in the changelog.  A better lower bound on the 
 number of years would be nice as well (e.g. hundreds of years).
 
 -Scott

Re: [Qemu-devel] [PATCH v2]booke: timer: Deactivate timer for target_bit above 61

2013-06-11 Thread Alexander Graf

On 06/11/2013 01:40 PM, Bhushan Bharat-R65777 wrote:

-Original Message-
From: Alexander Graf [mailto:ag...@suse.de]
Sent: Monday, June 10, 2013 11:40 PM
To: Wood Scott-B07421
Cc: Bhushan Bharat-R65777; Andreas Färber; qemu-...@nongnu.org; qemu-
de...@nongnu.org; Wood Scott-B07421
Subject: Re: [Qemu-devel] [PATCH v2]booke: timer: Deactivate timer for
target_bit above 61

On 10.06.2013, at 19:20, Scott Wood wrote:

On 06/10/2013 09:26:18 AM, Alexander Graf wrote:

On 06/10/2013 02:47 PM, Bhushan Bharat-R65777 wrote:


There should be at least a brief mention that it's because the QEMU
timer can't handle larger values,

If it can't handle higher values, maybe it's better to just set the timer value
to INT64_MAX when we detect an overflow? That would make the code plainly
obvious.

What about below comment (a mix of both :)):

 /* Timeout calculated with (target_bit + 1) > 62 will take
  * hundreds of years to trigger. Treat the timer as disabled.
  * Also this timeout is within the qemu supported maximum
  * timeout limit (INT64_MAX). */

Ok, next question: Why does return disable the timer?

Alex
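
The "years" and "hundreds of years" estimates in the proposed comments can be checked with a short calculation. This is a minimal sketch under stated assumptions: `ticks_to_years` is a hypothetical helper, and the time-base frequencies used with it are example values, not figures from the thread.

```c
#include <assert.h>
#include <stdint.h>

/* Length of a Julian year in seconds (365.25 days). */
#define SECONDS_PER_YEAR (365.25 * 24.0 * 60.0 * 60.0)

/* Hypothetical helper: years needed for `ticks` time-base ticks when the
 * time base advances at `tb_freq_hz` ticks per second. */
static double ticks_to_years(uint64_t ticks, double tb_freq_hz)
{
    assert(tb_freq_hz > 0.0);
    return (double)ticks / tb_freq_hz / SECONDS_PER_YEAR;
}
```

For a period of 1ULL << 62 ticks this gives roughly 146 years at an assumed 1 GHz time base and roughly 1461 years at an assumed 100 MHz, so how many "hundreds of years" the comment can claim depends on the time-base frequency.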

Re: [Qemu-devel] [PATCH v2]booke: timer: Deactivate timer for target_bit above 61

2013-06-11 Thread Alexander Graf

On 06/11/2013 02:47 PM, Bhushan Bharat-R65777 wrote:

-Original Message-
From: Alexander Graf [mailto:ag...@suse.de]
Sent: Tuesday, June 11, 2013 6:10 PM
To: Bhushan Bharat-R65777
Cc: Wood Scott-B07421; Andreas Färber; qemu-...@nongnu.org; qemu-
de...@nongnu.org
Subject: Re: [Qemu-devel] [PATCH v2]booke: timer: Deactivate timer for
target_bit above 61

On 06/11/2013 01:40 PM, Bhushan Bharat-R65777 wrote:


What about below comment (a mix of both :)):

  /* Timeout calculated with (target_bit + 1) > 62 will take
   * hundreds of years to trigger. Treat the timer as disabled.
   * Also this timeout is within the qemu supported maximum
   * timeout limit (INT64_MAX). */

Ok, next question: Why does return disable the timer?

Actually, here "disabled" means _not_ starting the timer. This function is
called to start the timer initially, and is then called again to restart it
after every expiry. If we do not start it, it is as good as stopped/disabled
(it is not disabled in TCR). Saying "do not start the qemu timer" or something
similar is probably better than saying "disabling the timer".

Couldn't you simply make things obvious from the code flow without 
pulling up assumptions?

Something along the lines of

    if (overflow) {
        *next = INT64_MAX;
    }

    qemu_mod_timer(timer, *next);

Then everyone knows what's going on, we can always assume the timer is 
running and there's no need to understand complex corner cases. It feels 
more like the timer framework would be the one to decide to ignore 
timeouts that take years to finish.

Alex
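
The "clamp on overflow" idea from the reply above can be written as a small saturating addition. This is a minimal sketch, assuming a non-negative current time in nanoseconds; `saturating_expiry` is a hypothetical helper name and is not part of QEMU's timer API.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: absolute expiry = now + delta, saturated at
 * INT64_MAX instead of wrapping around when the sum does not fit. */
static int64_t saturating_expiry(int64_t now_ns, uint64_t delta_ns)
{
    assert(now_ns >= 0);
    if (delta_ns > (uint64_t)INT64_MAX - (uint64_t)now_ns) {
        return INT64_MAX;   /* overflow: use the largest supported value */
    }
    return now_ns + (int64_t)delta_ns;
}
```

The caller can then always arm the timer with the returned value, which keeps the "timer is always running" invariant that the reply argues for.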

Re: [Qemu-devel] [PATCH v2]booke: timer: Deactivate timer for target_bit above 61

2013-06-11 Thread Alexander Graf

On 06/11/2013 03:18 PM, Bhushan Bharat-R65777 wrote:

-Original Message-
From: Alexander Graf [mailto:ag...@suse.de]
Sent: Tuesday, June 11, 2013 6:27 PM
To: Bhushan Bharat-R65777
Cc: Wood Scott-B07421; Andreas Färber; qemu-...@nongnu.org; qemu-
de...@nongnu.org
Subject: Re: [Qemu-devel] [PATCH v2]booke: timer: Deactivate timer for
target_bit above 61

On 06/11/2013 02:47 PM, Bhushan Bharat-R65777 wrote:


Ok, next question: Why does return disable the timer?

Actually, here "disabled" means _not_ starting the timer. This function is
called to start the timer initially, and is then called again to restart it
after every expiry. If we do not start it, it is as good as stopped/disabled
(it is not disabled in TCR). Saying "do not start the qemu timer" or something
similar is probably better than saying "disabling the timer".

Couldn't you simply make things obvious from the code flow without pulling up
assumptions?

You yourself suggested to stop/disable timer above a threshold :)

Something along the lines of

if (overflow) {

What is overflow?

The reason you're jumping through the hoops :).

Do you mean something like this:
diff --git a/hw/ppc/ppc_booke.c b/hw/ppc/ppc_booke.c
index e41b036..5b84b96 100644
--- a/hw/ppc/ppc_booke.c
+++ b/hw/ppc/ppc_booke.c
@@ -133,15 +133,19 @@ static void booke_update_fixed_timer(CPUPPCState *env,
     ppc_tb_t *tb_env = env->tb_env;
     uint64_t lapse;
     uint64_t tb;
-    uint64_t period = 1 << (target_bit + 1);
+    uint64_t period;
     uint64_t now;

     now = qemu_get_clock_ns(vm_clock);
     tb  = cpu_ppc_get_tb(tb_env, now, tb_env->tb_offset);

-    lapse = period - ((tb - (1 << target_bit)) & (period - 1));
-
-    *next = now + muldiv64(lapse, get_ticks_per_sec(), tb_env->tb_freq);
+    if (target_bit >= 62) {

/* This would overflow our calculation, so just max the timer out to the 
biggest value the timer framework can handle */

+        *next = INT64_MAX;
+    } else {
+        period = 1ULL << (target_bit + 1);
+        lapse = period - ((tb - (1 << target_bit)) & (period - 1));
+        *next = now + muldiv64(lapse, get_ticks_per_sec(), tb_env->tb_freq);
+    }

Alex
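
The lapse formula in the diff can be exercised on its own with small numbers. This is a minimal sketch, assuming 64-bit constants throughout; `ticks_until_bit_sets` is a hypothetical name, and the conversion from ticks to nanoseconds (done with muldiv64() in the diff) is left out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: time-base ticks until bit `target_bit` of `tb`
 * next goes from 0 to 1, using the same formula as the diff. */
static uint64_t ticks_until_bit_sets(uint64_t tb, unsigned target_bit)
{
    assert(target_bit <= 61);                 /* period must fit in 63 bits */
    uint64_t period = 1ULL << (target_bit + 1);
    /* The unsigned subtraction may wrap; the mask keeps only the phase. */
    return period - ((tb - (1ULL << target_bit)) & (period - 1));
}
```

For target_bit = 3 the bit is set at tb = 8, 24, 40 and so on, so the function returns 8 for tb = 0, 1 for tb = 7, and a full period of 16 for tb = 8.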

Re: [Qemu-devel] KVM call agenda for 2013-06-25

2013-06-11 Thread Alexander Graf


On 11.06.2013, at 17:52, Juan Quintela wrote:

 
 Hi
 
 Now we have moved to one call each other week.
 Please, send any topic that you are interested in covering.

VFIO for device tree based platforms


Alex

 
 Thanks, Juan.
 
 PD.  If you want to attend and you don't have the call details,
  contact me.

Re: [Qemu-devel] [PATCH 8/9] kvm/openpic: in-kernel mpic support

2013-06-12 Thread Alexander Graf


On 01.05.2013, at 03:48, Scott Wood wrote:

 Enables support for the in-kernel MPIC that has been merged into the
 KVM next branch.  This includes irqfd/KVM_IRQ_LINE support from Alex
 Graf (along with some other improvements).
 
 Note from Alex regarding kvm_irqchip_create():
 
  On x86, one would call kvm_irqchip_create() to initialize an
  in-kernel interrupt controller.  That function then goes ahead and
  initializes global capability variables as well as the default irq
  routing table.
 
  On ppc, we can't call kvm_irqchip_create() because we can have
  different types of interrupt controllers.  So we want to do all the
  things that function would do for us in the in-kernel device init
  handler.
 
 Signed-off-by: Scott Wood <scottw...@freescale.com>
 ---
 default-configs/ppc-softmmu.mak   |1 +
 default-configs/ppc64-softmmu.mak |1 +
 hw/intc/Makefile.objs |1 +
 hw/intc/openpic_kvm.c |  256 +
 hw/ppc/e500.c |   79 +++-
 include/hw/ppc/openpic.h  |2 +-
 6 files changed, 334 insertions(+), 6 deletions(-)
 create mode 100644 hw/intc/openpic_kvm.c
 
 diff --git a/default-configs/ppc-softmmu.mak b/default-configs/ppc-softmmu.mak
 index cc3587f..63255dc 100644
 --- a/default-configs/ppc-softmmu.mak
 +++ b/default-configs/ppc-softmmu.mak
 @@ -43,5 +43,6 @@ CONFIG_XILINX=y
 CONFIG_XILINX_ETHLITE=y
 CONFIG_OPENPIC=y
 CONFIG_E500=$(CONFIG_FDT)
 +CONFIG_OPENPIC_KVM=$(and $(CONFIG_E500),$(CONFIG_KVM))
 # For PReP
 CONFIG_MC146818RTC=y
 diff --git a/default-configs/ppc64-softmmu.mak 
 b/default-configs/ppc64-softmmu.mak
 index 884ea8a..e3c0c68 100644
 --- a/default-configs/ppc64-softmmu.mak
 +++ b/default-configs/ppc64-softmmu.mak
 @@ -44,6 +44,7 @@ CONFIG_XILINX_ETHLITE=y
 CONFIG_OPENPIC=y
 CONFIG_PSERIES=$(CONFIG_FDT)
 CONFIG_E500=$(CONFIG_FDT)
 +CONFIG_OPENPIC_KVM=$(and $(CONFIG_E500),$(CONFIG_KVM))
 # For pSeries
 CONFIG_PCI_HOTPLUG=y
 # For PReP
 diff --git a/hw/intc/Makefile.objs b/hw/intc/Makefile.objs
 index 718d97a..837ef19 100644
 --- a/hw/intc/Makefile.objs
 +++ b/hw/intc/Makefile.objs
 @@ -20,4 +20,5 @@ obj-$(CONFIG_GRLIB) += grlib_irqmp.o
 obj-$(CONFIG_IOAPIC) += ioapic.o
 obj-$(CONFIG_OMAP) += omap_intc.o
 obj-$(CONFIG_OPENPIC) += openpic.o
 +obj-$(CONFIG_OPENPIC_KVM) += openpic_kvm.o
 obj-$(CONFIG_SH4) += sh_intc.o
 diff --git a/hw/intc/openpic_kvm.c b/hw/intc/openpic_kvm.c
 new file mode 100644
 index 000..e57ae2f
 --- /dev/null
 +++ b/hw/intc/openpic_kvm.c
 @@ -0,0 +1,256 @@
 +/*
 + * KVM in-kernel OpenPIC
 + *
 + * Copyright 2013 Freescale Semiconductor, Inc.
 + *
 + * Permission is hereby granted, free of charge, to any person obtaining a copy
 + * of this software and associated documentation files (the "Software"), to deal
 + * in the Software without restriction, including without limitation the rights
 + * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 + * copies of the Software, and to permit persons to whom the Software is
 + * furnished to do so, subject to the following conditions:
 + *
 + * The above copyright notice and this permission notice shall be included in
 + * all copies or substantial portions of the Software.
 + *
 + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
 + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
 + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
 + * THE SOFTWARE.
 + */
 +
 +#include <sys/ioctl.h>
 +#include "exec/address-spaces.h"
 +#include "hw/hw.h"
 +#include "hw/ppc/openpic.h"
 +#include "hw/pci/msi.h"
 +#include "hw/sysbus.h"
 +#include "sysemu/kvm.h"
 +#include "qemu/log.h"
 +
 +typedef struct KVMOpenPICState {
 +    SysBusDevice busdev;
 +    MemoryRegion mem;
 +    MemoryListener mem_listener;
 +    uint32_t fd;
 +    uint32_t model;
 +} KVMOpenPICState;
 +
 +static void kvm_openpic_set_irq(void *opaque, int n_IRQ, int level)
 +{
 +    kvm_set_irq(kvm_state, n_IRQ, level);
 +}
 +
 +static void kvm_openpic_reset(DeviceState *d)
 +{
 +    qemu_log_mask(LOG_UNIMP, "%s: unimplemented\n", __func__);
 +}
 +
 +static void kvm_openpic_write(void *opaque, hwaddr addr, uint64_t val,
 +                              unsigned size)
 +{
 +    KVMOpenPICState *opp = opaque;
 +    struct kvm_device_attr attr;
 +    uint32_t val32 = val;
 +    int ret;
 +
 +    attr.group = KVM_DEV_MPIC_GRP_REGISTER;
 +    attr.attr = addr;
 +    attr.addr = (uint64_t)(unsigned long)&val32;
 +
 +    ret = ioctl(opp->fd, KVM_SET_DEVICE_ATTR, &attr);
 +    if (ret < 0) {
 +        qemu_log_mask(LOG_UNIMP, "%s: %s %llx\n", __func__,
 +                      strerror(errno), attr.attr);
 +    }
 +}
 +
 +static uint64_t kvm_openpic_read(void

Re: [Qemu-devel] [PATCH v3 1/9] KVM: Don't assume that mpstate exists with in-kernel PIC always

2013-06-12 Thread Alexander Graf


On 01.05.2013, at 03:48, Scott Wood wrote:

 From: Alexander Graf <ag...@suse.de>
 
 On PPC, we don't support MP state. So far it's not necessary and I'm
 not convinced yet that we really need to support it ever.
 
 However, the current idle logic in QEMU assumes that an in-kernel PIC
 also means we support MP state. This assumption is not true anymore.
 
 Let's split up the two cases into two different variables. That way
 PPC can expose an in-kernel PIC, while not implementing MP state.
 
 Signed-off-by: Alexander Graf <ag...@suse.de>
 CC: Jan Kiszka <jan.kis...@siemens.com>
 Signed-off-by: Scott Wood <scottw...@freescale.com>

Thanks, applied all except 8/9 to ppc-next.


Alex
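
The split described in the commit message (one variable for "the PIC lives in the kernel", another for "the kernel handles halted/MP state") can be sketched as follows. This is a minimal sketch with hypothetical names; `struct accel_caps` and `userspace_tracks_idle` are illustrative and are not the identifiers used by the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical capability flags; the names are illustrative only. */
struct accel_caps {
    bool irqchip_in_kernel;  /* interrupt controller is emulated in the kernel */
    bool halt_in_kernel;     /* kernel also tracks the halted state of vCPUs */
};

/* Decide whether user space has to treat a halted vCPU thread as idle.
 * Only halt_in_kernel matters here; irqchip_in_kernel no longer implies it. */
static bool userspace_tracks_idle(const struct accel_caps *caps,
                                  bool halted, bool work_pending)
{
    assert(caps);
    if (caps->halt_in_kernel) {
        return false;        /* the kernel blocks the vCPU by itself */
    }
    return halted && !work_pending;
}
```

In this sketch a PPC-like configuration would set irqchip_in_kernel without halt_in_kernel, so a halted vCPU with no pending work is still put to sleep by user space.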

Re: [Qemu-devel] [PATCH] spapr: add yet another maintainer

2013-06-12 Thread Alexander Graf


On 12.06.2013, at 16:27, Alexey Kardashevskiy wrote:

 Signed-off-by: Alexey Kardashevskiy <a...@ozlabs.ru>
 ---
 MAINTAINERS |1 +
 1 file changed, 1 insertion(+)
 
 diff --git a/MAINTAINERS b/MAINTAINERS
 index 13c0cc5..1e00bb1 100644
 --- a/MAINTAINERS
 +++ b/MAINTAINERS
 @@ -430,6 +430,7 @@ F: hw/isa/pc87312.[hc]
 sPAPR
 M: David Gibson <da...@gibson.dropbear.id.au>

David should get removed here then, no?


Alex

 M: Alexander Graf <ag...@suse.de>
 +M: Alexey Kardashevskiy <a...@ozlabs.ru>
 L: qemu-...@nongnu.org
 S: Supported
 F: hw/*/spapr*
 -- 
 1.7.10.4

Re: [Qemu-devel] [PATCH v3 1/9] KVM: Don't assume that mpstate exists with in-kernel PIC always

2013-06-12 Thread Alexander Graf


On 12.06.2013, at 22:16, Scott Wood wrote:

 On 06/12/2013 08:04:55 AM, Alexander Graf wrote:
 On 01.05.2013, at 03:48, Scott Wood wrote:
  From: Alexander Graf <ag...@suse.de>
 
  On PPC, we don't support MP state. So far it's not necessary and I'm
  not convinced yet that we really need to support it ever.
 
  However, the current idle logic in QEMU assumes that an in-kernel PIC
  also means we support MP state. This assumption is not true anymore.
 
  Let's split up the two cases into two different variables. That way
  PPC can expose an in-kernel PIC, while not implementing MP state.
 
  Signed-off-by: Alexander Graf <ag...@suse.de>
  CC: Jan Kiszka <jan.kis...@siemens.com>
  Signed-off-by: Scott Wood <scottw...@freescale.com>
 Thanks, applied all except 8/9 to ppc-next.
 
 Did you push?  I don't see anything since early May on either repo.or.cz or 
 github.

Sorry, I forgot to push. It's pushed to github now. I'm gradually deprecating 
the repo.or.cz one.


Alex

Re: [Qemu-devel] [PATCH v2] kvm/openpic: in-kernel mpic support

2013-06-12 Thread Alexander Graf


On 12.06.2013, at 22:32, Scott Wood wrote:

 Enables support for the in-kernel MPIC that has been merged into the
 KVM next branch.  This includes irqfd/KVM_IRQ_LINE support from Alex
 Graf (along with some other improvements).
 
 Note from Alex regarding kvm_irqchip_create():
 
  On x86, one would call kvm_irqchip_create() to initialize an
  in-kernel interrupt controller.  That function then goes ahead and
  initializes global capability variables as well as the default irq
  routing table.
 
  On ppc, we can't call kvm_irqchip_create() because we can have
  different types of interrupt controllers.  So we want to do all the
  things that function would do for us in the in-kernel device init
  handler.
 
 Signed-off-by: Scott Wood <scottw...@freescale.com>

Thanks, applied to ppc-next.


Alex

Re: [Qemu-devel] [PATCH] kvm/openpic: add kvm_irqchip_commit_routes

2013-06-12 Thread Alexander Graf


On 12.06.2013, at 23:21, Scott Wood wrote:

 The patch that added kvm_irqchip_commit_routes was originally
 meant to come after the in-kernel mpic patch, and thus it updated
 hw/intc/openpic_kvm.c.  However, it was applied before the in-kernel
 mpic patch (which creates hw/intc/openpic_kvm.c), and thus this hunk
 got lost.
 
 Signed-off-by: Scott Wood <scottw...@freescale.com>

I'll squash this in with your previous commit if you're ok with that.


Alex

 ---
 hw/intc/openpic_kvm.c |2 ++
 1 file changed, 2 insertions(+)
 
 diff --git a/hw/intc/openpic_kvm.c b/hw/intc/openpic_kvm.c
 index 809b34b..17d0a35 100644
 --- a/hw/intc/openpic_kvm.c
 +++ b/hw/intc/openpic_kvm.c
 @@ -204,6 +204,8 @@ static int kvm_openpic_init(SysBusDevice *dev)
 kvm_msi_via_irqfd_allowed = true;
 kvm_gsi_routing_allowed = true;
 
 +kvm_irqchip_commit_routes(s);
 +
 return 0;
 }
 
 -- 
 1.7.10.4

Re: [Qemu-devel] [PATCH] kvm/openpic: add kvm_irqchip_commit_routes

2013-06-12 Thread Alexander Graf


On 12.06.2013, at 23:25, Scott Wood wrote:

 On 06/12/2013 04:23:09 PM, Alexander Graf wrote:
 On 12.06.2013, at 23:21, Scott Wood wrote:
  The patch that added kvm_irqchip_commit_routes was originally
  meant to come after the in-kernel mpic patch, and thus it updated
  hw/intc/openpic_kvm.c.  However, it was applied before the in-kernel
  mpic patch (which creates hw/intc/openpic_kvm.c), and thus this hunk
  got lost.
 
  Signed-off-by: Scott Wood <scottw...@freescale.com>
 I'll squash this in with your previous commit if you're ok with that.
 
 That's fine.

Ok, done.


Alex

Re: [Qemu-devel] [PATCH v3] spapr-rtas: fix h_rtas parameters reading

2013-09-30 Thread Alexander Graf


On 27.09.2013, at 10:10, Alexey Kardashevskiy wrote:

 On real hardware, RTAS is called in real mode and therefore the
 top 4 bits of the address passed in the call are ignored.
 This patch makes QEMU do the same.
 
 This converts h_rtas() to use the existing rtas_ld() handlers.
 
 This fixes rtas_ld()/rtas_st() to ignore the top 4 bits.
 
 Signed-off-by: Alexey Kardashevskiy <a...@ozlabs.ru>

Thanks, applied to ppc-next.


Alex
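
Ignoring the top 4 bits of a 64-bit address, as the commit message describes for real-mode RTAS calls, comes down to a single mask. This is a minimal sketch; `REAL_MODE_ADDR_MASK` and `real_mode_addr` are hypothetical names and not the helpers touched by the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Clears bits 60..63 and keeps the low 60 bits of an address. */
#define REAL_MODE_ADDR_MASK (~(0xFULL << 60))

static_assert(REAL_MODE_ADDR_MASK == 0x0FFFFFFFFFFFFFFFULL,
              "mask must keep exactly the low 60 bits");

/* Hypothetical helper: address as seen when the top 4 bits are ignored. */
static uint64_t real_mode_addr(uint64_t addr)
{
    return addr & REAL_MODE_ADDR_MASK;
}
```

An address such as 0xC000000000001000 would then be treated as 0x1000 before the RTAS arguments are read.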

Re: [Qemu-devel] [PATCH v2] spapr: Add ibm, purr property on power7 and newer

2013-09-30 Thread Alexander Graf


On 27.09.2013, at 10:11, Alexey Kardashevskiy wrote:

 PAPR+ says that no ibm,purr tells the guest that H_PURR is not
 supported. However some guests still try calling H_PURR on POWER7 unless
 the property is present and equal to 0. This adds the property for CPUs
 supporting the PURR special register.
 
 Signed-off-by: Benjamin Herrenschmidt <b...@kernel.crashing.org>
 Signed-off-by: Alexey Kardashevskiy <a...@ozlabs.ru>

Thanks, applied to ppc-next.


Alex

Re: [Qemu-devel] [PATCH] spapr: add compat machine option

2013-09-30 Thread Alexander Graf


On 27.09.2013, at 10:06, Alexey Kardashevskiy wrote:

 To be able to boot on newer hardware than the software supports,
 PowerISA defines a logical PVR, one per every PowerISA specification
 version from 2.04.
 
 This adds the compat option which takes values 205 or 206 and forces
 QEMU to boot the guest with a logical PVR (CPU_POWERPC_LOGICAL_2_05 or
 CPU_POWERPC_LOGICAL_2_06).
 
 The guest reads the logical PVR value from cpu-version property of
 a CPU device node.
 
 Cc: Nikunj A Dadhania <nik...@linux.vnet.ibm.com>
 Cc: Andreas Färber <afaer...@suse.de>
 Signed-off-by: Alexey Kardashevskiy <a...@ozlabs.ru>
 ---
 hw/ppc/spapr.c  | 40 
 include/hw/ppc/spapr.h  |  2 ++
 target-ppc/cpu-models.h | 10 ++
 target-ppc/cpu.h|  3 +++
 target-ppc/kvm.c|  2 ++
 vl.c|  4 
 6 files changed, 61 insertions(+)
 
 diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
 index a09a1d9..737452d 100644
 --- a/hw/ppc/spapr.c
 +++ b/hw/ppc/spapr.c
 @@ -33,6 +33,7 @@
  #include "sysemu/kvm.h"
  #include "kvm_ppc.h"
  #include "mmu-hash64.h"
 +#include "cpu-models.h"
 
  #include "hw/boards.h"
  #include "hw/ppc/ppc.h"
 @@ -196,6 +197,26 @@ static XICSState *xics_system_init(int nr_servers, int nr_irqs)
      return icp;
  }
 
 +static void spapr_compat_mode_init(sPAPREnvironment *spapr)
 +{
 +    QemuOpts *machine_opts = qemu_get_machine_opts();
 +    uint64_t compat = qemu_opt_get_number(machine_opts, "compat", 0);
 +
 +    switch (compat) {
 +    case 0:
 +        break;
 +    case 205:
 +        spapr->arch_compat = CPU_POWERPC_LOGICAL_2_05;
 +        break;
 +    case 206:
 +        spapr->arch_compat = CPU_POWERPC_LOGICAL_2_06;

Does it make sense to declare compat mode a number or would a string be easier 
for users? I can imagine that -machine compat=power6 is easier to understand 
for a user than -machine compat=205.

 +        break;
 +    default:
 +        perror("Unsupported mode, only are 205, 206 supported\n");
 +        break;
 +    }
 +}
 +
  static int spapr_fixup_cpu_dt(void *fdt, sPAPREnvironment *spapr)
  {
      int ret = 0, offset;
 @@ -206,6 +227,7 @@ static int spapr_fixup_cpu_dt(void *fdt, sPAPREnvironment *spapr)
 
      CPU_FOREACH(cpu) {
          DeviceClass *dc = DEVICE_GET_CLASS(cpu);
 +        CPUPPCState *env = &(POWERPC_CPU(cpu)->env);
          uint32_t associativity[] = {cpu_to_be32(0x5),
                                      cpu_to_be32(0x0),
                                      cpu_to_be32(0x0),
 @@ -238,6 +260,14 @@ static int spapr_fixup_cpu_dt(void *fdt, sPAPREnvironment *spapr)
          if (ret < 0) {
              return ret;
          }
 +
 +        if (env->arch_compat) {
 +            ret = fdt_setprop(fdt, offset, "cpu-version",
 +                              &env->arch_compat, sizeof(env->arch_compat));
 +            if (ret < 0) {
 +                return ret;
 +            }
 +        }
      }
      return ret;
  }
 @@ -1145,6 +1175,8 @@ static void ppc_spapr_init(QEMUMachineInitArgs *args)
      spapr = g_malloc0(sizeof(*spapr));
      QLIST_INIT(&spapr->phbs);
 
 +    spapr_compat_mode_init(spapr);
 +
      cpu_ppc_hypercall = emulate_spapr_hypercall;
 
      /* Allocate RMA if necessary */
 @@ -1226,6 +1258,14 @@ static void ppc_spapr_init(QEMUMachineInitArgs *args)
 
          xics_cpu_setup(spapr->icp, cpu);
 
 +        /*
 +         * If compat mode is set in the command line, pass it to CPU so KVM
 +         * will be able to set it in the host kernel.
 +         */
 +        if (spapr->arch_compat) {
 +            env->arch_compat = spapr->arch_compat;

You should set the compat mode in KVM here, rather than doing it in the 
put_registers call which gets invoked on every register sync. Or can the guest 
change the mode?

Also, we need to handle failure. If the kernel can not set the CPU to 2.05 mode 
for example (IIRC POWER8 doesn't allow you to) we should bail out here.

And then there's the TCG question. We either have to disable CPU features 
similar to how we handle it in KVM (by setting and honoring the respective bits 
in PCR) or we need to bail out too and declare compat mode unsupported for TCG.

And then there's the fact that the kernel interface isn't upstream in a way that

 +}
 +
 qemu_register_reset(spapr_cpu_reset, cpu);
 }
 
 diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
 index ca175b0..201c578 100644
 --- a/include/hw/ppc/spapr.h
 +++ b/include/hw/ppc/spapr.h
 @@ -34,6 +34,8 @@ typedef struct sPAPREnvironment {
 uint32_t epow_irq;
 Notifier epow_notifier;
 
 +uint32_t arch_compat;/* Compatible PVR from the command line */
 +
 /* Migration state */
 int htab_save_index;
 bool htab_first_pass;
 diff --git a/target-ppc/cpu-models.h b/target-ppc/cpu-models.h
 index 49ba4a4..d7c033c 100644
 --- a/target-ppc/cpu-models.h
 +++ b/target-ppc/cpu-models.h
 @@ -583,6 +583,16 @@ enum {
 CPU_POWERPC_RS64II = 0x0034,
 CPU_POWERPC_RS64III
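
The review suggestion to accept a string such as compat=power6 instead of a number could look roughly like the sketch below. It is a minimal sketch under stated assumptions: `enum compat_isa` and `parse_compat` are hypothetical, the result codes are placeholders rather than the logical PVR values from the patch, and the only fact relied on is that POWER6 implements ISA 2.05 and POWER7 implements ISA 2.06.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result codes; NOT the logical PVR values from the patch. */
enum compat_isa {
    COMPAT_NONE = 0,     /* option absent: run the CPU in its native mode */
    COMPAT_ISA_2_05,     /* "power6" */
    COMPAT_ISA_2_06,     /* "power7" */
    COMPAT_INVALID       /* unknown string: the caller should bail out */
};

static_assert(COMPAT_NONE == 0, "an absent option must map to zero");

/* Hypothetical helper: map a user-facing option string to a compat level. */
static enum compat_isa parse_compat(const char *opt)
{
    if (opt == NULL || opt[0] == '\0') {
        return COMPAT_NONE;
    }
    if (strcmp(opt, "power6") == 0) {
        return COMPAT_ISA_2_05;
    }
    if (strcmp(opt, "power7") == 0) {
        return COMPAT_ISA_2_06;
    }
    return COMPAT_INVALID;
}
```

Returning an explicit COMPAT_INVALID gives the caller a place to report the error and stop, which is in line with the request in the review to handle failure instead of continuing silently.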

Re: [Qemu-devel] [PATCH v7] powerpc: add PVR mask support

2013-09-30 Thread Alexander Graf


On 27.09.2013, at 10:05, Alexey Kardashevskiy wrote:

 IBM POWERPC processors encode PVR as a CPU family in higher 16 bits and
 a CPU version in lower 16 bits. Since there is no significant change
 in behavior between versions, there is no point to add every single CPU
 version in QEMU's CPU list. Also, new CPU versions of already supported
 CPU won't break the existing code.
 
 This adds PVR value/mask support for KVM, i.e. for -cpu host option.
 
 As CPU family class name for POWER7 is POWER7-family, there is no need
 to touch aliases.
 
 Cc: Andreas Färber afaer...@suse.de
 Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
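
To make the family/version split and the value/mask comparison from the commit message above concrete, here is a minimal sketch. The helper names and the mask constant are hypothetical and not taken from the patch; the POWER7 family value 0x003F is only used for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical constants for illustration, not names from the patch. */
#define PVR_FAMILY_MASK   0xFFFF0000u   /* compare the CPU family only */
#define PVR_POWER7_FAMILY 0x003F0000u   /* assumed POWER7 family value */

/* Upper 16 bits of the PVR: CPU family. */
static inline uint16_t pvr_family(uint32_t pvr)
{
    return (uint16_t)(pvr >> 16);
}

/* Lower 16 bits of the PVR: CPU version within the family. */
static inline uint16_t pvr_version(uint32_t pvr)
{
    return (uint16_t)(pvr & 0xFFFFu);
}

/* True when pvr agrees with value on every bit selected by mask. */
static inline bool pvr_match(uint32_t pvr, uint32_t value, uint32_t mask)
{
    return (pvr & mask) == (value & mask);
}
```

With a family-only mask, any version of a supported family matches one table entry, which is the point of not listing every CPU version separately.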

Looks reasonable to me, but I'll wait for an ack from Andreas.


Alex

Re: [Qemu-devel] [PATCH] spapr: add compat machine option

2013-09-30 Thread Alexander Graf


On 09/30/2013 03:22 PM, Alexey Kardashevskiy wrote:

On 30.09.2013 21:25, Alexander Graf wrote:

On 27.09.2013, at 10:06, Alexey Kardashevskiy wrote:


To be able to boot on newer hardware than the software supports,
PowerISA defines a logical PVR, one per every PowerISA specification
version from 2.04.

This adds the compat option which takes values 205 or 206 and forces
QEMU to boot the guest with a logical PVR (CPU_POWERPC_LOGICAL_2_05 or
CPU_POWERPC_LOGICAL_2_06).

The guest reads the logical PVR value from cpu-version property of
a CPU device node.

Cc: Nikunj A Dadhania <nik...@linux.vnet.ibm.com>
Cc: Andreas Färber <afaer...@suse.de>
Signed-off-by: Alexey Kardashevskiy <a...@ozlabs.ru>
---
hw/ppc/spapr.c  | 40 
include/hw/ppc/spapr.h  |  2 ++
target-ppc/cpu-models.h | 10 ++
target-ppc/cpu.h|  3 +++
target-ppc/kvm.c|  2 ++
vl.c|  4 
6 files changed, 61 insertions(+)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index a09a1d9..737452d 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -33,6 +33,7 @@
#include "sysemu/kvm.h"
#include "kvm_ppc.h"
#include "mmu-hash64.h"
+#include "cpu-models.h"

#include "hw/boards.h"
#include "hw/ppc/ppc.h"
@@ -196,6 +197,26 @@ static XICSState *xics_system_init(int nr_servers, int 
nr_irqs)
 return icp;
}

+static void spapr_compat_mode_init(sPAPREnvironment *spapr)
+{
+QemuOpts *machine_opts = qemu_get_machine_opts();
+uint64_t compat = qemu_opt_get_number(machine_opts, "compat", 0);
+
+switch (compat) {
+case 0:
+break;
+case 205:
+spapr->arch_compat = CPU_POWERPC_LOGICAL_2_05;
+break;
+case 206:
+spapr->arch_compat = CPU_POWERPC_LOGICAL_2_06;

Does it make sense to declare compat mode a number or would a string
be easier for users? I can imagine that -machine compat=power6 is
easier to understand for a user than -machine compat=205.

I just follow the PowerISA spec. It does not say anywhere (at least I do
not see it) that 2.05==power6. 2.05 was released when power6 was
released and power6 supports 2.05 but these are not synonyms. And
compat=power6 would not set cpu-version to any of power6 PVRs, it
still will be a logical PVR. It confuses me too to tell qemu 205
instead of power6 but it is the spec to blame :)


So what is 2_06 plus then? :)

To me it really sounds like a 1:1 mapping to cores rather than specs - 
the ISA defines a lot more capabilities than a single core necessarily 
supports, especially with the inclusion of booke into the generic ppc spec.
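
For the sake of discussion, a string-based option could accept both spellings and map them to the same logical PVR. This is only a sketch: the table, the function name and the macro names are hypothetical, and the logical PVR values are my understanding of PAPR rather than something taken from this patch.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed logical PVR values; the patch defines its own constants. */
#define LOGICAL_PVR_2_05 0x0F000002u
#define LOGICAL_PVR_2_06 0x0F000003u

struct compat_name {
    const char *name;
    uint32_t logical_pvr;
};

/* Both the ISA level and the core name select the same logical PVR. */
static const struct compat_name compat_names[] = {
    { "205",    LOGICAL_PVR_2_05 },
    { "power6", LOGICAL_PVR_2_05 },
    { "206",    LOGICAL_PVR_2_06 },
    { "power7", LOGICAL_PVR_2_06 },
};

/* Returns 0 and fills *pvr on success, -1 for an unknown name. */
static int compat_lookup(const char *name, uint32_t *pvr)
{
    size_t n = sizeof(compat_names) / sizeof(compat_names[0]);

    for (size_t i = 0; i < n; i++) {
        if (strcmp(name, compat_names[i].name) == 0) {
            *pvr = compat_names[i].logical_pvr;
            return 0;
        }
    }
    return -1;
}
```

Either way the guest still sees a logical PVR in cpu-version, never a real POWER6 PVR, so the naming question is purely about the command line.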







+break;
+default:
+perror("Unsupported mode, only are 205, 206 supported\n");
+break;
+}
+}
+
static int spapr_fixup_cpu_dt(void *fdt, sPAPREnvironment *spapr)
{
 int ret = 0, offset;
@@ -206,6 +227,7 @@ static int spapr_fixup_cpu_dt(void *fdt, sPAPREnvironment 
*spapr)

 CPU_FOREACH(cpu) {
 DeviceClass *dc = DEVICE_GET_CLASS(cpu);
+CPUPPCState *env = &(POWERPC_CPU(cpu)->env);
 uint32_t associativity[] = {cpu_to_be32(0x5),
 cpu_to_be32(0x0),
 cpu_to_be32(0x0),
@@ -238,6 +260,14 @@ static int spapr_fixup_cpu_dt(void *fdt, sPAPREnvironment 
*spapr)
 if (ret < 0) {
 return ret;
 }
+
+if (env->arch_compat) {
+ret = fdt_setprop(fdt, offset, "cpu-version",
+&env->arch_compat, sizeof(env->arch_compat));
+if (ret < 0) {
+return ret;
+}
+}
 }
 return ret;
}
@@ -1145,6 +1175,8 @@ static void ppc_spapr_init(QEMUMachineInitArgs *args)
 spapr = g_malloc0(sizeof(*spapr));
 QLIST_INIT(&spapr->phbs);

+spapr_compat_mode_init(spapr);
+
 cpu_ppc_hypercall = emulate_spapr_hypercall;

 /* Allocate RMA if necessary */
@@ -1226,6 +1258,14 @@ static void ppc_spapr_init(QEMUMachineInitArgs *args)

 xics_cpu_setup(spapr->icp, cpu);

+/*
+ * If compat mode is set in the command line, pass it to CPU so KVM
+ * will be able to set it in the host kernel.
+ */
+if (spapr->arch_compat) {
+env->arch_compat = spapr->arch_compat;

You should set the compat mode in KVM here, rather than doing it in the
put_registers call which gets invoked on every register sync. Or can the
guest change the mode?


I will change it here in the next patch (which requires kernel changes
which are not there yet). The guest cannot change it directly but it can
indirectly via client-architecture-support.


They probably want a generic callback then. What happens on reset?





Also, we need to handle failure. If the kernel can not set the CPU to
2.05 mode for example (IIRC POWER8 doesn't allow you to) we should bail
out here.

Yep, I'll add this easy check :)
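
A rough sketch of what "set it once and bail out on failure" could look like. Everything here is hypothetical: the callback stands in for whatever kernel interface ends up being used, and the fake host below simply pretends that only one mode is accepted.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the host call: 0 on success, negative on error. */
typedef int (*set_compat_fn)(uint32_t logical_pvr);

/*
 * Apply the compat mode once at CPU init time instead of on every
 * register sync. The caller is expected to abort startup when this
 * returns a negative value.
 */
static int apply_compat_mode(uint32_t logical_pvr, set_compat_fn set_compat)
{
    int ret;

    if (logical_pvr == 0) {
        return 0;               /* no compat mode requested */
    }

    ret = set_compat(logical_pvr);
    if (ret < 0) {
        fprintf(stderr, "host rejected compat mode 0x%08x (%d)\n",
                (unsigned)logical_pvr, ret);
    }
    return ret;
}

/* Fake host for illustration: accepts exactly one (assumed) logical PVR. */
static int fake_host_set_compat(uint32_t logical_pvr)
{
    return logical_pvr == 0x0F000003u ? 0 : -1;
}
```

The same function could be called again from a reset handler or from client-architecture-support, which is why a generic callback seems preferable to wiring it into put_registers.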


And then there's the TCG question. We either have to disable CPU
features similar to how we handle it in KVM (by setting and honoring the
respective bits in PCR) or we need

Re: [Qemu-devel] [PATCH v5 00/14] xics: reworks and in-kernel support

2013-09-30 Thread Alexander Graf


On 09/26/2013 08:18 AM, Alexey Kardashevskiy wrote:

Yet another try with XICS and XICS-KVM.


Thanks, applied to ppc-next.


Alex



v4-v5:
Rebased onto upstream;
Put few reviewed-by: Andreas;
Added IRQFD enablement patches.

v3-v4:
Addressed multiple comments from Alex;
Split out many tiny patches to make them easier to review;
Fixed xics_cpu_setup not to call the parent;
And many, many small changes.

v2-v3:
Addressed multiple comments from Andreas;
Added 2 patches for XICS from Ben - I included them into the series as they
are about XICS and they won't rebase automatically if moved before XICS rework
so it seemed to me that it would be better to carry them together. If it is
wrong, please let me know, I'll repost them separately.

v1-v2:
The main change is this adds xics-common parent for emulated XICS and 
XICS-KVM.
And many, many small changes, mostly to address Andreas comments.

Migration from XICS to XICS-KVM and vice versa still works.


Alexey Kardashevskiy (10):
   xics: move reset and cpu_setup
   spapr: move cpu_setup after kvmppc_set_papr
   xics: replace fprintf with error_report
   xics: add pre_save/post_load dispatchers
   xics: convert init() to realize()
   xics: add missing const specifiers to TypeInfo
   xics: split to xics and xics-common
   xics: add cpu_setup callback
   xics-kvm: enable irqfd for MSI
   spapr-pci: enable irqfd for INTx

Benjamin Herrenschmidt (2):
   xics: Implement H_IPOLL
   xics: Implement H_XIRR_X

David Gibson (2):
   target-ppc: Add helper for KVM_PPC_RTAS_DEFINE_TOKEN
   xics-kvm: Support for in-kernel XICS interrupt controller

  default-configs/ppc64-softmmu.mak |   1 +
  hw/intc/Makefile.objs |   1 +
  hw/intc/xics.c| 331 -
  hw/intc/xics_kvm.c| 494 ++
  hw/ppc/spapr.c|  27 ++-
  hw/ppc/spapr_pci.c|  13 +
  include/hw/ppc/spapr.h|   1 +
  include/hw/ppc/xics.h |  57 +
  target-ppc/kvm.c  |  14 ++
  target-ppc/kvm_ppc.h  |   7 +
  10 files changed, 884 insertions(+), 62 deletions(-)
  create mode 100644 hw/intc/xics_kvm.c

Re: [Qemu-devel] [PATCH -V4 1/4] target-ppc: Update slb array with correct index values.

2013-09-30 Thread Alexander Graf


On 09/25/2013 05:41 PM, Aneesh Kumar K.V wrote:

Hi Alex,

Any update on this ?


The patch itself never made it to the qemu-devel mailing list which I 
pull things off of (through patchworks). Please resend.



Alex



-aneesh

Aneesh Kumar K.V <aneesh.ku...@linux.vnet.ibm.com> writes:


From: Aneesh Kumar K.V <aneesh.ku...@linux.vnet.ibm.com>

Without this, a value of rb=0 and rs=0 results in replacing the 0th
index. This can be observed when using gdb remote debugging support.

(gdb) x/10i do_fork
0xc0085330 <do_fork>: Cannot access memory at address 0xc0085330
(gdb)

This is because when we do the slb sync via kvm_cpu_synchronize_state,
we overwrite the slb entry (0th entry) for 0xc0085330

Signed-off-by: Aneesh Kumar K.V <aneesh.ku...@linux.vnet.ibm.com>
---
  target-ppc/kvm.c | 17 +++--
  1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
index 30a870e..1838465 100644
--- a/target-ppc/kvm.c
+++ b/target-ppc/kvm.c
@@ -1033,9 +1033,22 @@ int kvm_arch_get_registers(CPUState *cs)

  /* Sync SLB */
  #ifdef TARGET_PPC64
+/*
+ * The packed SLB array we get from KVM_GET_SREGS only contains
+ * information about valid entries. So we flush our internal
+ * copy to get rid of stale ones, then put all valid SLB entries
+ * back in.
+ */
+memset(env->slb, 0, sizeof(env->slb));
  for (i = 0; i < 64; i++) {
-ppc_store_slb(env, sregs.u.s.ppc64.slb[i].slbe,
-   sregs.u.s.ppc64.slb[i].slbv);
+target_ulong rb = sregs.u.s.ppc64.slb[i].slbe;
+target_ulong rs = sregs.u.s.ppc64.slb[i].slbv;
+/*
+ * Only restore valid entries
+ */
+if (rb & SLB_ESID_V) {
+ppc_store_slb(env, rb, rs);
+}
  }
  #endif

--
1.8.1.2
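
The failure mode described in the commit message can be reproduced with a small model. This is a simplified sketch, not the QEMU code: the struct, the function names and the slot-selection rule (low 12 bits of rb) are assumptions made for the example.

```c
#include <stdint.h>
#include <string.h>

#define SLB_ENTRIES    64
#define SLB_ESID_V     0x0000000008000000ULL   /* valid bit in the ESID word */
#define SLB_INDEX_MASK 0xfffULL                /* assumed: slot number in rb */

struct slb_entry {
    uint64_t esid;
    uint64_t vsid;
};

/* Simplified store: the low bits of rb select the slot to overwrite. */
static int slb_store(struct slb_entry *slb, uint64_t rb, uint64_t rs)
{
    uint64_t slot = rb & SLB_INDEX_MASK;

    if (slot >= SLB_ENTRIES) {
        return -1;
    }
    slb[slot].esid = rb & ~SLB_INDEX_MASK;
    slb[slot].vsid = rs;
    return 0;
}

/* Old behaviour: every packed entry is stored, so rb=0/rs=0 hits slot 0. */
static void slb_restore_naive(struct slb_entry *slb,
                              const struct slb_entry *packed, int n)
{
    for (int i = 0; i < n; i++) {
        slb_store(slb, packed[i].esid, packed[i].vsid);
    }
}

/* Patched behaviour: flush stale state, then restore valid entries only. */
static void slb_restore_valid(struct slb_entry *slb,
                              const struct slb_entry *packed, int n)
{
    memset(slb, 0, SLB_ENTRIES * sizeof(*slb));
    for (int i = 0; i < n; i++) {
        if (packed[i].esid & SLB_ESID_V) {
            slb_store(slb, packed[i].esid, packed[i].vsid);
        }
    }
}
```

With one valid entry in packed slot 0 and the rest zeroed, the naive loop ends with slot 0 wiped by the trailing all-zero entries, while the filtered loop keeps it.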

Re: [Qemu-devel] [PATCH -V4 2/4] target-ppc: Fix page table lookup with kvm enabled

2013-09-30 Thread Alexander Graf


On 09/05/2013 10:16 AM, Aneesh Kumar K.V wrote:

From: Aneesh Kumar K.V <aneesh.ku...@linux.vnet.ibm.com>

With kvm enabled, we store the hash page table information in the hypervisor.
Use ioctl to read the htab contents. Without this we get the below error when
trying to read the guest address

  (gdb) x/10 do_fork
  0xc0098660 <do_fork>:   Cannot access memory at address 0xc0098660
  (gdb)

Signed-off-by: Aneesh Kumar K.V <aneesh.ku...@linux.vnet.ibm.com>
---
  target-ppc/kvm.c| 59 +
  target-ppc/kvm_ppc.h| 12 +-
  target-ppc/mmu-hash64.c | 57 ---
  3 files changed, 104 insertions(+), 24 deletions(-)

diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
index 1838465..05b066c 100644
--- a/target-ppc/kvm.c
+++ b/target-ppc/kvm.c
@@ -1888,3 +1888,62 @@ int kvm_arch_on_sigbus(int code, void *addr)
  void kvm_arch_init_irq_routing(KVMState *s)
  {
  }
+
+hwaddr kvmppc_hash64_pteg_search(PowerPCCPU *cpu, hwaddr hash,
+ bool secondary, target_ulong ptem,
+ target_ulong *hpte0, target_ulong *hpte1)
+{
+int htab_fd;
+uint64_t index;
+hwaddr pte_offset;
+target_ulong pte0, pte1;
+struct kvm_get_htab_fd ghf;
+struct kvm_get_htab_buf {
+struct kvm_get_htab_header header;
+/*
+ * Older kernel required one extra byte.


Older than what?


+ */
+unsigned long hpte[(HPTES_PER_GROUP * 2) + 1];
+} hpte_buf;
+
+index = (hash * HPTES_PER_GROUP) & cpu->env.htab_mask;
+*hpte0 = 0;
+*hpte1 = 0;
+if (!cap_htab_fd) {
+return 0;
+}
+
+ghf.flags = 0;
+ghf.start_index = index;
+htab_fd = kvm_vm_ioctl(kvm_state, KVM_PPC_GET_HTAB_FD, &ghf);
+if (htab_fd < 0) {
+goto error_out;
+}
+/*
+ * Read the hpte group
+ */
+if (read(htab_fd, &hpte_buf, sizeof(hpte_buf)) < 0) {
+goto out;
+}
+
+index = 0;
+pte_offset = (hash * HASH_PTEG_SIZE_64) & cpu->env.htab_mask;;
+while (index < hpte_buf.header.n_valid) {
+pte0 = hpte_buf.hpte[(index * 2)];
+pte1 = hpte_buf.hpte[(index * 2) + 1];
+if ((pte0 & HPTE64_V_VALID)
+ && (secondary == !!(pte0 & HPTE64_V_SECONDARY))
+ && HPTE64_V_COMPARE(pte0, ptem)) {
+*hpte0 = pte0;
+*hpte1 = pte1;
+close(htab_fd);
+return pte_offset;
+}
+index++;
+pte_offset += HASH_PTE_SIZE_64;
+}
+out:
+close(htab_fd);
+error_out:
+return -1;
+}
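
The matching loop in the function above can be looked at in isolation, without the ioctl and read() parts. The following is a sketch over a plain in-memory buffer; the function names are made up, and the compare mask is my assumption of what HPTE64_V_COMPARE checks.

```c
#include <stdbool.h>
#include <stdint.h>

#define HASH_PTE_SIZE_64   16                      /* two 64-bit words per PTE */
#define HPTE64_V_VALID     0x0000000000000001ULL
#define HPTE64_V_SECONDARY 0x0000000000000002ULL
#define HPTE64_V_AVPN_MASK 0xffffffffffffff80ULL   /* assumed compare mask */

/* An entry matches if it is valid, uses the expected hash, and its tag fits. */
static bool hpte_matches(uint64_t pte0, bool secondary, uint64_t ptem)
{
    return (pte0 & HPTE64_V_VALID)
        && (secondary == !!(pte0 & HPTE64_V_SECONDARY))
        && !((pte0 ^ ptem) & HPTE64_V_AVPN_MASK);
}

/*
 * Scan n_valid entries stored as (pte0, pte1) pairs in hpte[].
 * Returns the byte offset of the match relative to the table start,
 * or -1 when no entry in the group matches.
 */
static int64_t pteg_search_buf(const uint64_t *hpte, unsigned n_valid,
                               uint64_t pteg_off, bool secondary,
                               uint64_t ptem,
                               uint64_t *pte0, uint64_t *pte1)
{
    for (unsigned i = 0; i < n_valid; i++) {
        uint64_t v = hpte[i * 2];

        if (hpte_matches(v, secondary, ptem)) {
            *pte0 = v;
            *pte1 = hpte[i * 2 + 1];
            return (int64_t)(pteg_off + (uint64_t)i * HASH_PTE_SIZE_64);
        }
    }
    return -1;
}
```

Keeping the comparison in one helper would also let the KVM path and the TCG path share it instead of duplicating the three-way condition.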
diff --git a/target-ppc/kvm_ppc.h b/target-ppc/kvm_ppc.h
index 4ae7bf2..dad0e57 100644
--- a/target-ppc/kvm_ppc.h
+++ b/target-ppc/kvm_ppc.h
@@ -42,7 +42,9 @@ int kvmppc_get_htab_fd(bool write);
  int kvmppc_save_htab(QEMUFile *f, int fd, size_t bufsize, int64_t max_ns);
  int kvmppc_load_htab_chunk(QEMUFile *f, int fd, uint32_t index,
 uint16_t n_valid, uint16_t n_invalid);
-
+hwaddr kvmppc_hash64_pteg_search(PowerPCCPU *cpu, hwaddr hash,
+ bool secondary, target_ulong ptem,
+ target_ulong *hpte0, target_ulong *hpte1);
  #else

  static inline uint32_t kvmppc_get_tbfreq(void)
@@ -181,6 +183,14 @@ static inline int kvmppc_load_htab_chunk(QEMUFile *f, int 
fd, uint32_t index,
  abort();
  }

+static inline hwaddr kvmppc_hash64_pteg_search(PowerPCCPU *cpu, hwaddr hash,
+   bool secondary,
+   target_ulong ptem,
+   target_ulong *hpte0,
+   target_ulong *hpte1)
+{
+abort();
+}
  #endif

  #ifndef CONFIG_KVM
diff --git a/target-ppc/mmu-hash64.c b/target-ppc/mmu-hash64.c
index 67fc1b5..2288fe8 100644
--- a/target-ppc/mmu-hash64.c
+++ b/target-ppc/mmu-hash64.c
@@ -302,37 +302,50 @@ static int ppc_hash64_amr_prot(CPUPPCState *env, 
ppc_hash_pte64_t pte)
  return prot;
  }

-static hwaddr ppc_hash64_pteg_search(CPUPPCState *env, hwaddr pteg_off,
+static hwaddr ppc_hash64_pteg_search(CPUPPCState *env, hwaddr hash,
   bool secondary, target_ulong ptem,
   ppc_hash_pte64_t *pte)
  {
-hwaddr pte_offset = pteg_off;
+hwaddr pte_offset;
  target_ulong pte0, pte1;
-int i;
-
-for (i = 0; i < HPTES_PER_GROUP; i++) {
-pte0 = ppc_hash64_load_hpte0(env, pte_offset);
-pte1 = ppc_hash64_load_hpte1(env, pte_offset);
-
-if ((pte0 & HPTE64_V_VALID)
- && (secondary == !!(pte0 & HPTE64_V_SECONDARY))
- && HPTE64_V_COMPARE(pte0, ptem)) {
-pte->pte0 = pte0;
-pte->pte1 = pte1;
-return pte_offset;
+int i, ret = 0;
+
+if (kvm_enabled()) {
+ret = kvmppc_hash64_pteg_search(ppc_env_get_cpu(env), hash,
+secondary, ptem,

Re: [Qemu-devel] [PATCH] target-ppc: dump-guest-memory support

2013-09-30 Thread Alexander Graf


On 09/25/2013 05:40 PM, Aneesh Kumar K.V wrote:

Hi Alex,

Any update on this ?


Sent to qemu-de...@nongnu.og instead of qemu-devel@nongnu.org. I can't 
(and won't) apply patches that didn't land on qemu-devel@nongnu.org.


The patch itself looks reasonable to me though :).


Alex

Re: [Qemu-devel] [PATCH v2 1/4] target-ppc: Fill in OpenFirmware names for some PowerPCCPU families

2013-09-30 Thread Alexander Graf


On 09/25/2013 11:01 AM, Alexey Kardashevskiy wrote:

On 09/17/2013 12:16 AM, Alexey Kardashevskiy wrote:

On 09/10/2013 02:15 PM, Alexey Kardashevskiy wrote:

On 08/16/2013 08:35 AM, Andreas Färber wrote:

Set the expected values for POWER7, POWER7+, POWER8 and POWER5+.
Note that POWER5+ and POWER7+ are intentionally lacking the '+', so the
lack of a POWER7P family constitutes no problem.

Signed-off-by: Andreas Färber <afaer...@suse.de>


Out of curiosity - is anything going to happen to this series? Is it
awaiting for someone's review? Just asking as it is quite old and nobody
seems to care :)

Ping, anyone? Not sure if any of my mails even reaches maillists :-/

Ping?

It conflicts with [PATCH] pseries: Fix loading of little endian kernels
posted today. It would be great to have either this series or the new patch
upstream...


Andreas is going to rework it, yes :).


Alex

Re: [Qemu-devel] [PATCH 01/10] target-s390: Move facilities bits to env

2013-09-30 Thread Alexander Graf


On 09/23/2013 04:04 PM, Richard Henderson wrote:

Rather than simply hard-coding them in STFL instruction.

Signed-off-by: Richard Henderson <r...@twiddle.net>
---
  target-s390x/cpu.c   |  3 +++
  target-s390x/cpu.h   |  1 +
  target-s390x/translate.c | 10 +-
  3 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/target-s390x/cpu.c b/target-s390x/cpu.c
index 3c89f8a..ff691df 100644
--- a/target-s390x/cpu.c
+++ b/target-s390x/cpu.c
@@ -181,6 +181,9 @@ static void s390_cpu_initfn(Object *obj)
  env->cpu_num = cpu_num++;
  env->ext_index = -1;

+env->facilities[0] = 0xc000ull;
+env->facilities[1] = 0;


Could we add CPU definitions along the way here? I'm fine with making z9 
the default CPU type, but we should make this explicit :).



+
  if (tcg_enabled() && !inited) {
  inited = true;
  s390x_translate_init();
diff --git a/target-s390x/cpu.h b/target-s390x/cpu.h
index 8be5648..746aec8 100644
--- a/target-s390x/cpu.h
+++ b/target-s390x/cpu.h
@@ -136,6 +136,7 @@ typedef struct CPUS390XState {
  CPU_COMMON

  /* reset does memset(0) up to here */
+uint64_t facilities[2];

  int cpu_num;
  uint8_t *storage_keys;
diff --git a/target-s390x/translate.c b/target-s390x/translate.c
index afe90eb..d4dc8ea 100644
--- a/target-s390x/translate.c
+++ b/target-s390x/translate.c
@@ -3230,12 +3230,12 @@ static ExitStatus op_spt(DisasContext *s, DisasOps *o)

  static ExitStatus op_stfl(DisasContext *s, DisasOps *o)
  {
-TCGv_i64 f, a;
-/* We really ought to have more complete indication of facilities
-   that we implement.  Address this when STFLE is implemented.  */
+TCGv_i64 f = tcg_temp_new_i64();
+TCGv_i64 a = tcg_const_i64(200);
+
  check_privileged(s);
-f = tcg_const_i64(0xc000);
-a = tcg_const_i64(200);
+tcg_gen_ld_i64(f, cpu_env, offsetof(CPUS390XState, facilities[0]));
+tcg_gen_shri_i64(f, f, 32);


IMHO the facility list should be stored in DisasContext. That way we can 
check whether we're generating code against the correct target.
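
To show what I mean by checking against the facility list, here is a small sketch in plain C. The helper names are hypothetical, and the example value in the usage note is an assumption chosen so that the top two facility bits are set.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Facility bits are numbered from the most significant bit:
 * facility 0 is the MSB of the first doubleword.
 */
static inline bool test_facility(const uint64_t *facilities, unsigned nr)
{
    return (facilities[nr / 64] >> (63 - (nr % 64))) & 1;
}

/* STFL only reports the first 32 facility bits, as a single word. */
static inline uint32_t stfl_word(const uint64_t *facilities)
{
    return (uint32_t)(facilities[0] >> 32);
}
```

With facilities[0] = 0xc000000000000000 the first two facilities test as present and stfl_word() yields 0xc0000000, which is what the shift by 32 in the generated code computes at run time.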



Alex

Re: [Qemu-devel] [PATCH 07/10] target-s390: Fix STIDP

2013-09-30 Thread Alexander Graf


On 09/23/2013 04:04 PM, Richard Henderson wrote:

The implementation had been incomplete, as we did not store the
machine type.

Signed-off-by: Richard Henderson <r...@twiddle.net>
---
  target-s390x/cpu.c   |  2 ++
  target-s390x/cpu.h   | 14 +-
  target-s390x/translate.c |  2 +-
  3 files changed, 16 insertions(+), 2 deletions(-)

diff --git a/target-s390x/cpu.c b/target-s390x/cpu.c
index 01ff49b..d003dcf 100644
--- a/target-s390x/cpu.c
+++ b/target-s390x/cpu.c
@@ -257,6 +257,8 @@ static void s390_cpu_initfn(Object *obj)
  env->facilities[0] |= FAC0_Z9_109;
  #endif

+env->machine_type = 0x2094;   /* ??? Also Z9-109.  */


This really wants to be selected by -cpu.


+
  if (tcg_enabled() && !inited) {
  inited = true;
  s390x_translate_init();
diff --git a/target-s390x/cpu.h b/target-s390x/cpu.h
index a0bafef..95f9cab 100644
--- a/target-s390x/cpu.h
+++ b/target-s390x/cpu.h
@@ -197,7 +197,19 @@ typedef struct CPUS390XState {
  /* reset does memset(0) up to here */
  uint64_t facilities[2];

-int cpu_num;
+union {
+uint64_t cpuid;
+struct {
+#ifdef HOST_WORDS_BIGENDIAN
+uint32_t cpu_num;
+uint32_t machine_type;
+#else
+uint32_t machine_type;
+uint32_t cpu_num;
+#endif


Are we guaranteed that we don't need to pack? Also anonymous 
unions/structs are a gcc extension IIRC. And why do you swap endianness 
here, but not above when defining the machine_type value?



Alex


+};
+};
+
  uint8_t *storage_keys;

  uint64_t tod_offset;
diff --git a/target-s390x/translate.c b/target-s390x/translate.c
index 192d54e..25a6537 100644
--- a/target-s390x/translate.c
+++ b/target-s390x/translate.c
@@ -3242,7 +3242,7 @@ static ExitStatus op_stctl(DisasContext *s, DisasOps *o)
  static ExitStatus op_stidp(DisasContext *s, DisasOps *o)
  {
  check_privileged(s);
-tcg_gen_ld32u_i64(o->out, cpu_env, offsetof(CPUS390XState, cpu_num));
+tcg_gen_ld_i64(o->out, cpu_env, offsetof(CPUS390XState, cpuid));
  return NO_EXIT;
  }

Re: [Qemu-devel] [PATCH 0/9] target-s390 tcg improvements

2013-09-30 Thread Alexander Graf


On 09/23/2013 04:04 PM, Richard Henderson wrote:

With this patch set we can boot the fedora 19 kernel, and make
it all the way to /bin/init.  At which point the process either
hangs or crashes; in either case the kernel winds up with no
runnable processes and spends its time in the idle loop.

The choice of z9-109 for the facilities is because that appears
to be what fedora 19 is targeting as the minimum.

That said, a debian install can make it all the way through to
completion, so the fedora crash/hang must be related to something
in the extra z9-109 insns.


Reviewed-by: Alexander Graf ag...@suse.de

I couldn't spot anything obviously wrong.


Alex





r~


Richard Henderson (10):
   target-s390: Move facilities bits to env
   target-s390: Implement STFLE
   target-s390: Add facilities bits and sets
   target-s390: Raise OPERATION exception for disabled insns
   target-s390: Implement SAM31 and SAM64
   target-s390: Implement EPSW
   target-s390: Fix STIDP
   target-s390: Fix STURA
   target-s390: Implement LURA, LURAG, STURG
   target-s390: Implement ECAG

  target-s390x/cpu.c |  78 ++
  target-s390x/cpu.h |  74 -
  target-s390x/helper.h  |   4 ++
  target-s390x/insn-data.def |  18 +++--
  target-s390x/mem_helper.c  |  18 -
  target-s390x/misc_helper.c |  13 
  target-s390x/translate.c   | 161 -
  7 files changed, 329 insertions(+), 37 deletions(-)

Re: [Qemu-devel] [PATCH] turn firmware image filename into a machine option

2013-10-01 Thread Alexander Graf


On 10/01/2013 04:40 PM, Gerd Hoffmann wrote:

   Hi,


SLOF is what is loaded from the very beginning, it configures PCI, cooks
the device tree and boots the guest system (directly or via yaboot/grub,
from disk, network or ram). Normal firmware, as usual. It knows all the
details about the machine so the guest system (linux) does not need to know
details about PCI host bus adapter or anything like this.

So pretty much like seabios on x86.


RTAS is an agent which always lives in RAM when the guest system (linux,
aix) is up and running. It is a light-weight version of SLOF which is left
in RAM by SLOF and can do board/machine specific tasks such as PCI config
space access or PCI hotplug - something what SLOF already knows about and
something what the guest does not want to know about in details. This came
from IBM pHyp (traditional server PPC64 hypervisor) and it is quite a big
firmware. In the case of KVM, it is very small stub which simply passes
requests to QEMU which does the rest. But it is still a separate binary
image even in the current QEMU.

How that does get loaded?  Is it there at machine init?  Or does SLOF
load RTAS from somewhere?


It gets loaded to a fixed address similar to the device tree. But 
there's no reason that couldn't be changed to on demand loading or even 
an integrated RTAS blob inside of SLOF.



Alex

Re: [Qemu-devel] [PATCH 07/10] target-s390: Fix STIDP

2013-10-01 Thread Alexander Graf


On 09/30/2013 09:48 PM, Richard Henderson wrote:

On 09/30/2013 11:13 AM, Alexander Graf wrote:

-int cpu_num;
+union {
+uint64_t cpuid;
+struct {
+#ifdef HOST_WORDS_BIGENDIAN
+uint32_t cpu_num;
+uint32_t machine_type;
+#else
+uint32_t machine_type;
+uint32_t cpu_num;
+#endif

Are we guaranteed that we don't need to pack? Also anonymous unions/structs are
a gcc extension IIRC. And why do you swap endianness here, but not above when
defining the machine_type value?

(1) I can't imagine that we would; such struct/unions are used all over.


*shrug* you're the expert :).


(2) Sure, but we've so many other gcc extensions I figured it didn't matter.


Avi complained about it to me in Linux patches. Not sure how much we 
care in QEMU.



(3) Of course.  I want host endianness, not target endianness.


Phew. I think I'm slowly starting to grasp what you're trying to do 
here. Any way you could make this more explicit through shifts and ors 
and other explicit operations? This feels like too much magic to just 
understand on a glimpse to me.
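
Something along these lines is what I have in mind. It is only a sketch with made-up helper names; it builds the same 64-bit value the union produces (cpu_num in the upper word, machine_type in the lower word) without depending on host byte order.

```c
#include <stdint.h>

/* Compose the 64-bit CPU ID: cpu_num in bits 63..32, machine_type below. */
static inline uint64_t make_cpuid(uint32_t cpu_num, uint32_t machine_type)
{
    return ((uint64_t)cpu_num << 32) | machine_type;
}

/* Extract the upper word again. */
static inline uint32_t cpuid_cpu_num(uint64_t cpuid)
{
    return (uint32_t)(cpuid >> 32);
}

/* Extract the lower word again. */
static inline uint32_t cpuid_machine_type(uint64_t cpuid)
{
    return (uint32_t)(cpuid & 0xffffffffu);
}
```

That way env only needs a single uint64_t field and the #ifdef HOST_WORDS_BIGENDIAN block goes away.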



Alex

Re: [Qemu-devel] [PATCH 01/10] target-s390: Move facilities bits to env

2013-10-01 Thread Alexander Graf


On 09/30/2013 09:15 PM, Richard Henderson wrote:

On 09/30/2013 11:03 AM, Alexander Graf wrote:

On 09/23/2013 04:04 PM, Richard Henderson wrote:

Rather than simply hard-coding them in STFL instruction.

Signed-off-by: Richard Hendersonr...@twiddle.net
---
   target-s390x/cpu.c   |  3 +++
   target-s390x/cpu.h   |  1 +
   target-s390x/translate.c | 10 +-
   3 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/target-s390x/cpu.c b/target-s390x/cpu.c
index 3c89f8a..ff691df 100644
--- a/target-s390x/cpu.c
+++ b/target-s390x/cpu.c
@@ -181,6 +181,9 @@ static void s390_cpu_initfn(Object *obj)
   env->cpu_num = cpu_num++;
   env->ext_index = -1;

+env->facilities[0] = 0xc000ull;
+env->facilities[1] = 0;

Could we add CPU definitions along the way here? I'm fine with making z9 the
default CPU type, but we should make this explicit :).

Certainly that's what we should do.  I just hadn't yet researched the
currently correct way to do that.  I know there's some amount of out-of-date
examples in the current source base.


I'll leave the answer to that to Andreas :).



Pointers?


   static ExitStatus op_stfl(DisasContext *s, DisasOps *o)
   {
-TCGv_i64 f, a;
-/* We really ought to have more complete indication of facilities
-   that we implement.  Address this when STFLE is implemented.  */
+TCGv_i64 f = tcg_temp_new_i64();
+TCGv_i64 a = tcg_const_i64(200);
+
   check_privileged(s);
-f = tcg_const_i64(0xc000);
-a = tcg_const_i64(200);
+tcg_gen_ld_i64(f, cpu_env, offsetof(CPUS390XState, facilities[0]));
+tcg_gen_shri_i64(f, f, 32);

IMHO the facility list should be stored in DisasContext. That way we can check
whether we're generating code against the correct target.

See patch 4.

As for the code we generate here, does it really matter if we load the value
from env, or have it encoded as a constant?  It still has to get stored to
memory, so it's not like the TCG optimizer is going to do anything with the
constant.


No, it only seemed more straight forward to me from a single source of 
information point of view. But it really doesn't matter. Shifting in C 
seems to be easier to read :).



Alex

Re: [Qemu-devel] [PATCH 01/10] target-s390: Move facilities bits to env

2013-10-01 Thread Alexander Graf


On 10/01/2013 05:52 PM, Richard Henderson wrote:

On 10/01/2013 08:48 AM, Alexander Graf wrote:

On 09/30/2013 09:15 PM, Richard Henderson wrote:

On 09/30/2013 11:03 AM, Alexander Graf wrote:

On 09/23/2013 04:04 PM, Richard Henderson wrote:

Rather than simply hard-coding them in STFL instruction.

Signed-off-by: Richard Hendersonr...@twiddle.net
---
target-s390x/cpu.c   |  3 +++
target-s390x/cpu.h   |  1 +
target-s390x/translate.c | 10 +-
3 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/target-s390x/cpu.c b/target-s390x/cpu.c
index 3c89f8a..ff691df 100644
--- a/target-s390x/cpu.c
+++ b/target-s390x/cpu.c
@@ -181,6 +181,9 @@ static void s390_cpu_initfn(Object *obj)
env->cpu_num = cpu_num++;
env->ext_index = -1;

+env->facilities[0] = 0xc000ull;
+env->facilities[1] = 0;

Could we add CPU definitions along the way here? I'm fine with making z9 the
default CPU type, but we should make this explicit :).

Certainly that's what we should do.  I just hadn't yet researched the
currently correct way to do that.  I know there's some amount of out-of-date
examples in the current source base.

I'll leave the answer to that to Andreas :).

Can we leave that for a separate patch series then?


Just make sure you actually check for feature bits on every instruction 
(which I think you do, but the current code is way too magical to me to 
really understand it anymore) so that we can always implement a z900 cpu 
type later on.





-TCGv_i64 f, a;
-/* We really ought to have more complete indication of facilities
-   that we implement.  Address this when STFLE is implemented.  */
+TCGv_i64 f = tcg_temp_new_i64();
+TCGv_i64 a = tcg_const_i64(200);
+
check_privileged(s);
-f = tcg_const_i64(0xc000);
-a = tcg_const_i64(200);
+tcg_gen_ld_i64(f, cpu_env, offsetof(CPUS390XState, facilities[0]));
+tcg_gen_shri_i64(f, f, 32);

IMHO the facility list should be stored in DisasContext. That way we can check
whether we're generating code against the correct target.

See patch 4.

As for the code we generate here, does it really matter if we load the value
from env, or have it encoded as a constant?  It still has to get stored to
memory, so it's not like the TCG optimizer is going to do anything with the
constant.

No, it only seemed more straight forward to me from a single source of
information point of view. But it really doesn't matter. Shifting in C seems
to be easier to read :).

Fair enough.  I'll rearrange the order of the patches so that we can
update STFL to use the DisasContext data.


Thanks :)


Alex

Re: [Qemu-devel] [PATCH v2 0/9] target-s390 tcg improvements

2013-10-02 Thread Alexander Graf


On 01.10.2013, at 19:17, Richard Henderson wrote:

 Changes v1-v2:
  * Squashed patches 1, 2, 4.  The feedback from patch 1 called for a
re-ordering of the patch set to better deal with STFL, but I could
not find an ordering that worked well.  I think this is probably
just as clear as 3 separate patches.
 
  * STFL and STIDP implementations adjusted for feedback.
 
  * Minor tweaks to SAM implementations.
 
  * Split out z9-109 hack to a separate patch.

Acked-by: Alexander Graf ag...@suse.de


Alex

Re: [Qemu-devel] [PATCH -V4 2/4] target-ppc: Fix page table lookup with kvm enabled

2013-10-02 Thread Alexander Graf


On 01.10.2013, at 03:27, Aneesh Kumar K.V wrote:

 Alexander Graf ag...@suse.de writes:
 
 On 09/05/2013 10:16 AM, Aneesh Kumar K.V wrote:
 From: Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com
 
 With kvm enabled, we store the hash page table information in the hypervisor.
 Use ioctl to read the htab contents. Without this we get the below error when
 trying to read the guest address
 
  (gdb) x/10 do_fork
  0xc0098660 do_fork:   Cannot access memory at address 0xc0098660
  (gdb)
 
 Signed-off-by: Aneesh Kumar K.Vaneesh.ku...@linux.vnet.ibm.com
 ---
  target-ppc/kvm.c        | 59 +
  target-ppc/kvm_ppc.h    | 12 +-
  target-ppc/mmu-hash64.c | 57 ---
  3 files changed, 104 insertions(+), 24 deletions(-)
 
 diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
 index 1838465..05b066c 100644
 --- a/target-ppc/kvm.c
 +++ b/target-ppc/kvm.c
 @@ -1888,3 +1888,62 @@ int kvm_arch_on_sigbus(int code, void *addr)
  void kvm_arch_init_irq_routing(KVMState *s)
  {
  }
 +
 +hwaddr kvmppc_hash64_pteg_search(PowerPCCPU *cpu, hwaddr hash,
 + bool secondary, target_ulong ptem,
 + target_ulong *hpte0, target_ulong *hpte1)
 +{
 +int htab_fd;
 +uint64_t index;
 +hwaddr pte_offset;
 +target_ulong pte0, pte1;
 +struct kvm_get_htab_fd ghf;
 +struct kvm_get_htab_buf {
 +struct kvm_get_htab_header header;
 +/*
 + * Older kernel required one extra byte.
 
 Older than what?
 
 + */
 
 Since we decided to drop that kernel patch, that should be updated as
 kernel requires one extra byte.
 
 +unsigned long hpte[(HPTES_PER_GROUP * 2) + 1];
 +} hpte_buf;
 +
 +index = (hash * HPTES_PER_GROUP) & cpu->env.htab_mask;
 +*hpte0 = 0;
 +*hpte1 = 0;
 +if (!cap_htab_fd) {
 +return 0;
 +}
 +
 
 .
 
 
 -static hwaddr ppc_hash64_pteg_search(CPUPPCState *env, hwaddr pteg_off,
 +static hwaddr ppc_hash64_pteg_search(CPUPPCState *env, hwaddr hash,
   bool secondary, target_ulong ptem,
   ppc_hash_pte64_t *pte)
  {
 -hwaddr pte_offset = pteg_off;
 +hwaddr pte_offset;
  target_ulong pte0, pte1;
 -int i;
 -
 -for (i = 0; i < HPTES_PER_GROUP; i++) {
 -pte0 = ppc_hash64_load_hpte0(env, pte_offset);
 -pte1 = ppc_hash64_load_hpte1(env, pte_offset);
 -
 -if ((pte0 & HPTE64_V_VALID)
 -&& (secondary == !!(pte0 & HPTE64_V_SECONDARY))
 -&& HPTE64_V_COMPARE(pte0, ptem)) {
 -pte->pte0 = pte0;
 -pte->pte1 = pte1;
 -return pte_offset;
 +int i, ret = 0;
 +
 +if (kvm_enabled()) {
 +ret = kvmppc_hash64_pteg_search(ppc_env_get_cpu(env), hash,
 +secondary, ptem,
 +&pte->pte0, &pte->pte1);
 
 Instead of duplicating the search, couldn't you just hook yourself into 
 ppc_hash64_load_hpte0/1 and return the respective ptes from there? Just 
 cache the current pteg to ensure things don't become dog slow.
 
 
 Can you explain this better?

You're basically doing

hwaddr ppc_hash64_pteg_search(...)
{
    if (kvm) {
        pteg = read_from_kvm();
        foreach pte in pteg {
            if (match) return offset;
        }
        return -1;
    } else {
        foreach pte in pteg {
            pte = read_pte_from_memory();
            if (match) return offset;
        }
        return -1;
    }
}

This is massive code duplication. The only difference between kvm and tcg is 
the source for the pteg read. David already abstracted the actual pte0/pte1 
reads away in ppc_hash64_load_hpte0 and ppc_hash64_load_hpte1 wrapper functions.

Now imagine we could add a temporary pteg store in env. Then have something 
like this in ppc_hash64_load_hpte0:

if (need_kvm_htab_access) {
    if (env->current_cached_pteg != this_pteg) {
        read_pteg(env->cached_pteg);
    }
    return env->cached_pteg[x].pte0;
} else {
    do what was done before
}

That way the actual resolver doesn't care where the PTEG comes from, as it only 
ever checks pte0/pte1 and leaves all the magic on where those come from to the 
load function.
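
A minimal, self-contained sketch of that idea follows. It is not QEMU code: cpu_state_t, read_pteg_from_kvm(), load_hpte0() and pteg_search() are hypothetical stand-ins, the "KVM" table is just an array, and the ioctl is replaced by a copy. It only shows how a one-PTEG cache inside the load function keeps a single search loop for both cases.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define HPTES_PER_GROUP 8

typedef struct {
    uint64_t pte0;
    uint64_t pte1;
} hpte_t;

/* Hypothetical stand-in for the CPU state; not the real CPUPPCState. */
typedef struct {
    bool     kvm_htab_access;   /* true when the hash table lives in KVM */
    bool     cache_valid;
    uint64_t cached_pteg_index;
    hpte_t   cached_pteg[HPTES_PER_GROUP];
    hpte_t  *guest_htab;        /* table readable directly (TCG case) */
    hpte_t  *kvm_htab;          /* table held by the "hypervisor" */
    unsigned kvm_reads;         /* number of whole-PTEG fetches so far */
} cpu_state_t;

/* Stand-in for fetching one PTEG from the kernel (an ioctl in real life). */
static void read_pteg_from_kvm(cpu_state_t *env, uint64_t pteg)
{
    for (int i = 0; i < HPTES_PER_GROUP; i++) {
        env->cached_pteg[i] = env->kvm_htab[pteg * HPTES_PER_GROUP + i];
    }
    env->cached_pteg_index = pteg;
    env->cache_valid = true;
    env->kvm_reads++;
}

/* The only place that knows where a PTE comes from. */
static uint64_t load_hpte0(cpu_state_t *env, uint64_t pteg, int slot)
{
    assert(slot >= 0 && slot < HPTES_PER_GROUP);
    if (env->kvm_htab_access) {
        if (!env->cache_valid || env->cached_pteg_index != pteg) {
            read_pteg_from_kvm(env, pteg);
        }
        return env->cached_pteg[slot].pte0;
    }
    return env->guest_htab[pteg * HPTES_PER_GROUP + slot].pte0;
}

/* The resolver: one loop, unaware of the data source. */
static int pteg_search(cpu_state_t *env, uint64_t pteg, uint64_t want)
{
    for (int i = 0; i < HPTES_PER_GROUP; i++) {
        if (load_hpte0(env, pteg, i) == want) {
            return i;
        }
    }
    return -1;
}
```

With the cache in place, searching one PTEG costs a single fetch from the kernel regardless of how many slots the loop inspects. A real version would also have to invalidate the cache whenever the guest state is resynchronized.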


Alex

Re: [Qemu-devel] [PATCH -V4 RESEND 0/6] target-ppc: Add support for dumping guest memory using qemu gdb server

2013-10-02 Thread Alexander Graf


On 01.10.2013, at 18:19, Aneesh Kumar K.V wrote:

 Hi,
 
 This patch series implements support for dumping guest memory using qemu gdb 
 server. The last patch also enables qemu monitor command dump-guest-memory

Thanks, applied all but 2/6 to ppc-next. I think the core dump bits should 
eventually be smarter about determining whether we want to save ppc32 or ppc64 
dumps, as qemu-system-ppc64 can execute 32bit CPUs. But for now your patch as 
is should work.


Alex

Re: [Qemu-devel] [RFC] QEMU/KVM PowerPC: virtio and guest endianness

2013-10-04 Thread Alexander Graf

CC'ing qemu-devel - please use qemu-ppc@ only as a tag, every mail needs to go 
to qemu-devel as well.

On 03.10.2013, at 16:29, Greg Kurz wrote:

 Hi,
 
 There has been some work on the topic lately but no agreement has
 been reached yet. I want to consolidate the facts in a single thread of
 mail and re-start the discussion. Please find below a recap of what we
 have as of today:
 
 From a virtio POV, guest endianness is reflected by the endianness of
 the interrupt vectors (ILE bit in the LPCR register). The guest kernel
 relies on the H_SET_MODE_RESOURCE_LE hcall to set this bit, early in the
 boot process.
 
 Rusty sent a patchset on qemu-devel@ to provide the necessary bits to
 perform byteswap in the QEMU:
 
 http://patchwork.ozlabs.org/patch/266451/
 http://patchwork.ozlabs.org/patch/266452/
 http://patchwork.ozlabs.org/patch/266450/
 (plus other enablement patches for virtio drivers, not essential for
 the discussion).
 
 In non-KVM mode, QEMU implements the H_SET_MODE_RESOURCE_LE and updates
 its internal value for LPCR when the guest requests it. Rusty's patchset
 works out-of-the-box in this mode: I could successfully setup and use a
 9p share over virtio transport (broader virtio testing still to be done
 though).
 
 When using KVM, the story is different: QEMU is not on this
 endianness change flow anymore, provided KVM has the following
 patch from Anton:
 
 http://patchwork.ozlabs.org/patch/277079/
 
 There are *at least* two approaches to bring back endianness knowledge
 to QEMU: polling (1) and propagation (2).
 
 (1) QEMU must retrieve LPCR from the kernel using the following API:
 
 http://patchwork.ozlabs.org/patch/273029/
 
 (2) KVM can resume execution to the host and thus propagating
 H_SET_MODE_RESOURCE_LE to QEMU. Laurent came up with a patch on
 linuxppc-dev@ to do this:
 
 http://patchwork.ozlabs.org/patch/278590/
 
 I would say (1) is a standard and sane way of addressing the issue:
 since the LPCR register value is held by KVM, it makes sense to
 introduce an API to get/set it. Then, it is up to QEMU to use this API.
 
 We can dumbly do the polling in all the places where byteswapping
 matters: it is clearly suboptimal, especially since the LPCR_ILE bit
 doesn't change so often. Rusty suggested we can retrieve it at virtio
 device reset time and cache it, since an endianness change after the
 devices have started to be used is nonsensical.
 
 I have searched for an appropriate place to add the polling and I must
 admit I did not find any... I am no QEMU expert but I suspect we would
 need some kind of arch specific hook to be called from the virtio code
 to do this... :-\ I hope I am wrong, please correct me if so.

Just put it into the normal register sync function and call 
cpu_synchronize_state() on virtio reset.

 On the other hand, (2) looks a bit hacky: KVM usually returns to the
 host when it cannot fully handle the h_call. Propagating may look like
 a useless path to follow from a KVM POV. From a QEMU POV, things are
 different: propagation will trigger the fallback code in QEMU, already
 working in non-KVM mode. Nothing more to be done.

We have to decide which scheme to follow. There are usually 2 ways we can / 
should handle registers:

  a) owned by QEMU
  b) owned by KVM

If they're owned by QEMU, every hypercall needs to go into QEMU which then 
propagates that change through an ioctl back into KVM.
If they're owned by KVM, QEMU needs to fetch them whenever it needs them.

As a general rule of thumb path b is easier to hack up, path a is easier to 
maintain long term. Which is pretty much what you're seeing here.

 I have a better feeling for (2) because:
 - 2-liner patch in KVM
 - no extra code change in QEMU
 - already *partially* tested

I don't understand. QEMU would get triggered, then have to propagate things 
back into KVM. We definitely do _not_ want KVM to do magic, then tell QEMU to 
handle a hypercall again.

 Also, I understood Rusty is working on the next virtio specification
 which should address the endian issue: probably not worth to add too
 many temporary lines in the QEMU code...

Does 3.13 support LE mode? Does 3.13 support the new and shiny virtio spec? 
There's a good chance we'd have to deal with guest kernels that can do LE, but 
not sane virtio.

 Of course, I probably lack some essential knowledge that would be
 more favorable to (1)... so please comment and argue ! :)

I think a 100% QEMU implementation that just goes through all vcpus and does a 
simple SET_ONE_REG for LPCR to set ILE would be the best. Anton's patch isn't 
in Linus' tree yet, right? So all it takes is a partial revert of that one to 
not handle the actual hypercall in KVM. And some code in kvmppc_set_lpcr() to 
also set intr_msr (not changing it is a bug today already).


Alex

Re: [Qemu-devel] [Qemu-ppc] [RFC] QEMU/KVM PowerPC: virtio and guest endianness

2013-10-04 Thread Alexander Graf


On 04.10.2013, at 16:08, Greg Kurz wrote:

 Answering to both Paul and Alex.
 
 On Fri, 4 Oct 2013 13:54:25 +0200
 Alexander Graf ag...@suse.de wrote:
 
 
 On 04.10.2013, at 13:53, Paul Mackerras wrote:
 
 I don't mind particularly whether H_SET_MODE for the endianness
 setting gets handled in the kernel or in QEMU, but I don't think it
 should be handled in both.  If you want QEMU to know about the
 endianness setting immediately, make the kernel version do nothing
 and get QEMU to handle it -- which if KVM is enabled will mean
 iterating over all vcpus and getting them all to send the new LPCR
 setting to the kernel via the SET_ONE_REG ioctl.
 
 However, I want the setting of breakpoint registers (CIABR and
 DAWR/X) via H_SET_MODE to happen in the kernel, preferably in real
 mode, since that can happen on context switch and thus needs to be
 quick.
 
 
 Paul,
 
 As far as virtio is concerned, QEMU only needs to know about the guest
 endianness if a virtio device shows up. The virtio reset flow is a
 good candidate for that.
 
 I don't want to see a single hypercall be split across the QEMU/KVM
 barrier. So if there's a reasonable incentive to handle H_SET_MODE in
 KVM, we should handle all of it in KVM.
 
 
 Alex,
 
 The appropriate solution would be then to let KVM implement the whole
 H_SET_MODE hcall and own LPCR. QEMU will poll it with cpu_synchronize_state().
 It seems to preserve all the requirements.

Yes. Since breakpoint registers are part of H_SET_MODE, we want to have it 
owned by KVM rather than QEMU. I still don't know what those PAPR people think 
they're doing, shoving completely unrelated things into the same hypercall 
though :).


Alex

Re: [Qemu-devel] [PATCH v2 1/2] KVM: s390: add and extend interrupt information data structs

2013-10-04 Thread Alexander Graf


On 06.09.2013, at 14:19, Jens Freimann wrote:

 With the currently available struct kvm_s390_interrupt it is not possible to
 inject every kind of interrupt as defined in the z/Architecture. Add
 additional interruption parameters to the structures and move it to kvm.h
 
 Signed-off-by: Jens Freimann jf...@linux.vnet.ibm.com
 Reviewed-by: Cornelia Huck cornelia.h...@de.ibm.com
 ---
 arch/s390/include/asm/kvm_host.h | 34 +-
 include/uapi/linux/kvm.h | 63 
 2 files changed, 64 insertions(+), 33 deletions(-)
 
 diff --git a/arch/s390/include/asm/kvm_host.h b/arch/s390/include/asm/kvm_host.h
 index e87ecaa..adb05c5 100644
 --- a/arch/s390/include/asm/kvm_host.h
 +++ b/arch/s390/include/asm/kvm_host.h
 @@ -16,6 +16,7 @@
 #include <linux/hrtimer.h>
 #include <linux/interrupt.h>
 #include <linux/kvm_host.h>
 +#include <linux/kvm.h>
 #include <asm/debug.h>
 #include <asm/cpu.h>
 
 @@ -162,18 +163,6 @@ struct kvm_vcpu_stat {
   u32 diagnose_9c;
 };
 
 -struct kvm_s390_io_info {
 - __u16 subchannel_id; /* 0x0b8 */
 - __u16 subchannel_nr; /* 0x0ba */
 - __u32 io_int_parm;   /* 0x0bc */
 - __u32 io_int_word;   /* 0x0c0 */
 -};
 -
 -struct kvm_s390_ext_info {
 - __u32 ext_params;
 - __u64 ext_params2;
 -};
 -
 #define PGM_OPERATION      0x01
 #define PGM_PRIVILEGED_OP  0x02
 #define PGM_EXECUTE        0x03
 @@ -182,27 +171,6 @@ struct kvm_s390_ext_info {
 #define PGM_SPECIFICATION  0x06
 #define PGM_DATA           0x07
 
 -struct kvm_s390_pgm_info {
 - __u16 code;
 -};
 -
 -struct kvm_s390_prefix_info {
 - __u32 address;
 -};
 -
 -struct kvm_s390_extcall_info {
 - __u16 code;
 -};
 -
 -struct kvm_s390_emerg_info {
 - __u16 code;
 -};
 -
 -struct kvm_s390_mchk_info {
 - __u64 cr14;
 - __u64 mcic;
 -};
 -
 struct kvm_s390_interrupt_info {
   struct list_head list;
   u64 type;
 diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
 index 99c2533..eeb08a1 100644
 --- a/include/uapi/linux/kvm.h
 +++ b/include/uapi/linux/kvm.h
 @@ -434,6 +434,69 @@ struct kvm_s390_interrupt {
   __u64 parm64;
 };
 
 +struct kvm_s390_io_info {
 + __u16 subchannel_id;
 + __u16 subchannel_nr;
 + __u32 io_int_parm;
 + __u32 io_int_word;
 +};
 +
 +struct kvm_s390_ext_info {
 + __u32 ext_params;
 + __u32 pad;
 + __u64 ext_params2;
 +};
 +
 +struct kvm_s390_pgm_info {
 + __u64 trans_exc_code;
 + __u64 mon_code;
 + __u64 per_address;
 + __u32 data_exc_code;
 + __u16 code;
 + __u16 mon_class_nr;
 + __u8 per_code;
 + __u8 per_atmid;
 + __u8 exc_access_id;
 + __u8 per_access_id;
 + __u8 op_access_id;
 + __u8 pad[3];
 +};
 +
 +struct kvm_s390_prefix_info {
 + __u32 address;
 +};
 +
 +struct kvm_s390_extcall_info {
 + __u16 code;
 +};
 +
 +struct kvm_s390_emerg_info {
 + __u16 code;
 +};
 +
 +struct kvm_s390_mchk_info {
 + __u64 cr14;
 + __u64 mcic;
 + __u64 failing_storage_address;
 + __u32 ext_damage_code;
 + __u32 pad;
 + __u8 fixed_logout[16];
 +};
 +
 +struct kvm_s390_irq {
 + __u64 type;
 + union {
 + struct kvm_s390_io_info io;
 + struct kvm_s390_ext_info ext;
 + struct kvm_s390_pgm_info pgm;
 + struct kvm_s390_emerg_info emerg;
 + struct kvm_s390_extcall_info extcall;
 + struct kvm_s390_prefix_info prefix;
 + struct kvm_s390_mchk_info mchk;
 + char reserved[64];
 + };

Avi always complained about anonymous structs :). Apparently they're a gcc only 
extension.
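
For illustration, a cut-down sketch of the same layout with the union given a name. The struct and field names here (irq_named, io_info, pgm_info, the member name u) are hypothetical and much smaller than the structures in the patch. Unnamed struct/union members were only standardized in C11; compilers in older modes accept them as an extension, which is why a named member is the more portable choice for a userspace-visible header.

```c
#include <stdint.h>
#include <string.h>

struct io_info {
    uint16_t subchannel_id;
    uint16_t subchannel_nr;
};

struct pgm_info {
    uint16_t code;
};

/* Hypothetical, cut-down interrupt record with a *named* union member. */
struct irq_named {
    uint64_t type;
    union {
        struct io_info  io;
        struct pgm_info pgm;
        char reserved[64];   /* pads the union to a fixed size */
    } u;                     /* the name makes this valid pre-C11 C */
};

/* Build a program-interrupt record; the type value is arbitrary here. */
static struct irq_named make_pgm_irq(uint16_t code)
{
    struct irq_named irq;

    memset(&irq, 0, sizeof(irq));
    irq.type = 1;
    irq.u.pgm.code = code;   /* access goes through the union's name */
    return irq;
}
```

The cost is one extra path component at every access (irq.u.pgm.code instead of irq.pgm.code); the layout and size are unchanged.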


Alex

Re: [Qemu-devel] [PATCH v2 2/2] KVM: s390: add floating irq controller

2013-10-04 Thread Alexander Graf


On 06.09.2013, at 14:19, Jens Freimann wrote:

 This patch adds a floating irq controller as a kvm_device.
 It will be necessary for migration of floating interrupts as well
 as for hardening the reset code by allowing user space to explicitly
 remove all pending floating interrupts.
 
 Signed-off-by: Jens Freimann jf...@linux.vnet.ibm.com
 Reviewed-by: Cornelia Huck cornelia.h...@de.ibm.com
 ---
 Documentation/virtual/kvm/devices/s390_flic.txt |  36 +++
 arch/s390/include/asm/kvm_host.h                |   1 +
 arch/s390/include/uapi/asm/kvm.h                |   5 +
 arch/s390/kvm/interrupt.c                       | 304
 arch/s390/kvm/kvm-s390.c                        |   1 +
 include/linux/kvm_host.h                        |   1 +
 include/uapi/linux/kvm.h                        |   2 +
 virt/kvm/kvm_main.c                             |   5 +
 8 files changed, 304 insertions(+), 51 deletions(-)
 
 diff --git a/Documentation/virtual/kvm/devices/s390_flic.txt b/Documentation/virtual/kvm/devices/s390_flic.txt
 new file mode 100644
 index 000..06aef31
 --- /dev/null
 +++ b/Documentation/virtual/kvm/devices/s390_flic.txt
 @@ -0,0 +1,36 @@
 +FLIC (floating interrupt controller)
 +
 +
 +FLIC handles floating (non per-cpu) interrupts, i.e.  I/O, service and some
 +machine check interruptions. All interrupts are stored in a per-vm list of
 +pending interrupts. FLIC performs operations on this list.
 +
 +Only one FLIC instance may be instantiated.
 +
 +FLIC provides support to
 +- add/delete interrupts (KVM_DEV_FLIC_ENQUEUE and _DEQUEUE)
 +- purge all pending floating interrupts (KVM_DEV_FLIC_CLEAR_IRQS)
 +
 +Groups:
 +  KVM_DEV_FLIC_ENQUEUE
 +Adds one interrupt to the list of pending floating interrupts. Interrupts
 +are taken from this list for injection into the guest. attr contains
 +a struct kvm_s390_irq which contains all data relevant for
 +interrupt injection.
 +The format of the data structure kvm_s390_irq as it is copied from userspace
 +is defined in usr/include/linux/kvm.h.
 +For historic reasons list members are stored in a different data structure, i.e.
 +we need to copy the relevant data into a struct kvm_s390_interrupt_info
 +which can then be added to the list.
 +
 +  KVM_DEV_FLIC_DEQUEUE
 +Takes one element off the pending interrupts list and copies it into userspace.
 +Dequeued interrupts are not injected into the guest.
 +attr->addr contains the userspace address of a struct kvm_s390_irq.
 +List elements are stored in the format of struct kvm_s390_interrupt_info
 +(arch/s390/include/asm/kvm_host.h) and are copied into a struct kvm_s390_irq
 +(usr/include/linux/kvm.h)
 +
 +  KVM_DEV_FLIC_CLEAR_IRQS
 +Simply deletes all elements from the list of currently pending floating interrupts.
 +No interrupts are injected into the guest.
 diff --git a/arch/s390/include/asm/kvm_host.h b/arch/s390/include/asm/kvm_host.h
 index adb05c5..e1cc166 100644
 --- a/arch/s390/include/asm/kvm_host.h
 +++ b/arch/s390/include/asm/kvm_host.h
 @@ -238,6 +238,7 @@ struct kvm_arch{
   struct sca_block *sca;
   debug_info_t *dbf;
   struct kvm_s390_float_interrupt float_int;
 + struct kvm_device *flic;
   struct gmap *gmap;
   int css_support;
 };
 diff --git a/arch/s390/include/uapi/asm/kvm.h b/arch/s390/include/uapi/asm/kvm.h
 index d25da59..33d52b8 100644
 --- a/arch/s390/include/uapi/asm/kvm.h
 +++ b/arch/s390/include/uapi/asm/kvm.h
 @@ -16,6 +16,11 @@
 
 #define __KVM_S390
 
 +/* Device control API: s390-specific devices */
 +#define KVM_DEV_FLIC_DEQUEUE 1
 +#define KVM_DEV_FLIC_ENQUEUE 2
 +#define KVM_DEV_FLIC_CLEAR_IRQS 3
 +
 /* for KVM_GET_REGS and KVM_SET_REGS */
 struct kvm_regs {
   /* general purpose regs for s390 */
 diff --git a/arch/s390/kvm/interrupt.c b/arch/s390/kvm/interrupt.c
 index 7f35cb3..d6d5e36 100644
 --- a/arch/s390/kvm/interrupt.c
 +++ b/arch/s390/kvm/interrupt.c
 @@ -656,53 +656,86 @@ struct kvm_s390_interrupt_info *kvm_s390_get_io_int(struct kvm *kvm,
   return inti;
 }
 
 -int kvm_s390_inject_vm(struct kvm *kvm,
 -struct kvm_s390_interrupt *s390int)
 +static void __inject_vm(struct kvm *kvm, struct kvm_s390_interrupt_info *inti)

This doesn't only inject; it enqueues an interrupt and then also injects it, 
right?

 {
   struct kvm_s390_local_interrupt *li;
   struct kvm_s390_float_interrupt *fi;
 - struct kvm_s390_interrupt_info *inti, *iter;
 + struct kvm_s390_interrupt_info *iter;
   int sigcpu;
 
 + mutex_lock(&kvm->lock);
 + fi = &kvm->arch.float_int;

You probably want to move this into your device structure eventually.

 + spin_lock(&fi->lock);
 + if (!is_ioint(inti->type)) {
 + list_add_tail(&inti->list, &fi->list);
 + } else {
 + u64 isc_bits = int_word_to_isc_bits(inti->io.io_int_word);
 +
 + /* Keep I/O

Re: [Qemu-devel] [PATCH v2 0/2] KVM: s390: add floating irq controller

2013-10-04 Thread Alexander Graf


On 06.09.2013, at 15:30, Christian Borntraeger wrote:

 On 06/09/13 14:19, Jens Freimann wrote:
 This series adds a kvm_device that acts as a irq controller for floating
 interrupts.  As a first step it implements functionality to retrieve and inject
 interrupts for the purpose of migration and for hardening the reset code by
 allowing user space to explicitly remove all pending floating interrupts.
 
 PFAULT patches will also use this device for enabling/disabling pfault, therefore
 the pfault patch series will be reworked to use this device.
 
 * Patch 1/2 adds a new data structure to hold interrupt information. The current
  one (struct kvm_s390_interrupt) does not allow to inject every kind of interrupt,
  e.g. some data for program interrupts and machine check interruptions were
  missing.
 
 * Patch 2/2 adds a kvm_device which supports getting/setting currently pending
  floating interrupts as well as deleting all currently pending interrupts
 
 
 Jens Freimann (2):
  KVM: s390: add and extend interrupt information data structs
  KVM: s390: add floating irq controller
 
 Documentation/virtual/kvm/devices/s390_flic.txt |  36 +++
 arch/s390/include/asm/kvm_host.h                |  35 +--
 arch/s390/include/uapi/asm/kvm.h                |   5 +
 arch/s390/kvm/interrupt.c                       | 304
 arch/s390/kvm/kvm-s390.c                        |   1 +
 include/linux/kvm_host.h                        |   1 +
 include/uapi/linux/kvm.h                        |  65 +
 virt/kvm/kvm_main.c                             |   5 +
 8 files changed, 368 insertions(+), 84 deletions(-)
 create mode 100644 Documentation/virtual/kvm/devices/s390_flic.txt
 
 
 
 Gleb, Paolo,
 
 since the qemu part relies on a kernel header file, it makes sense to not only let the kernel
 part go via the kvm tree, but also the qemu part. I want Alex to Ack the interface, and if he
 agrees then I am fine with applying the whole series.

I think the interface works. My comments are almost exclusively on internal 
code structure, which can be followed up on in a later patch. The only thing 
that definitely needs fixing now is the unnamed union.


Alex

 
 If nothing else comes up, feel free to apply the small change request from Peter yourself or
 ask Jens for a resend.
 
 --snip
 
 --- a/include/uapi/linux/kvm.h
 +++ b/include/uapi/linux/kvm.h
 @@ -908,7 +908,7 @@ struct kvm_device_attr {
 #define KVM_DEV_TYPE_FSL_MPIC_20   1
 #define KVM_DEV_TYPE_FSL_MPIC_42   2
 #define KVM_DEV_TYPE_XICS  3
 -#define KVM_DEV_TYPE_FLIC  4
 +#define KVM_DEV_TYPE_FLIC  5
 
 /*
  * ioctls for VM fds
 
 --snip

Re: [Qemu-devel] [PATCH -V4 2/4] target-ppc: Fix page table lookup with kvm enabled

2013-10-07 Thread Alexander Graf


On 07.10.2013, at 15:58, Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com wrote:

 Alexander Graf ag...@suse.de writes:
 
 On 01.10.2013, at 03:27, Aneesh Kumar K.V wrote:
 
 Alexander Graf ag...@suse.de writes:
 
 On 09/05/2013 10:16 AM, Aneesh Kumar K.V wrote:
 From: Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com
 
 
 
 
 
 Can you explain this better?
 
 You're basically doing
 
 hwaddr ppc_hash64_pteg_search(...)
 {
     if (kvm) {
         pteg = read_from_kvm();
         foreach pte in pteg {
             if (match) return offset;
         }
         return -1;
     } else {
         foreach pte in pteg {
             pte = read_pte_from_memory();
             if (match) return offset;
         }
         return -1;
     }
 }
 
 This is massive code duplication. The only difference between kvm and
 tcg is the source for the pteg read. David already abstracted the
 actual pte0/pte1 reads away in ppc_hash64_load_hpte0 and
 ppc_hash64_load_hpte1 wrapper functions.
 
 Now imagine we could add a temporary pteg store in env. Then have something 
 like this in ppc_hash64_load_hpte0:
 
 if (need_kvm_htab_access) {
     if (env->current_cached_pteg != this_pteg) {
         read_pteg(env->cached_pteg);
     }
     return env->cached_pteg[x].pte0;
 } else {
     do what was done before
 }
 
 That way the actual resolver doesn't care where the PTEG comes from,
 as it only ever checks pte0/pte1 and leaves all the magic on where
 those come from to the load function.
 
 I tried to do this and didn't like the end result. For one we
 unnecessarily bloat the CPUPPCState struct to now carry pteg information
 and an associated array, i.e. we now need to have the below in CPUPPCState.

How about something like

token = ppc_hash64_start_access();
foreach (hpte entry) {
   pte0 = ppc_hash64_load_hpte0(token, ...);
   ...
}
ppc_hash64_stop_access(token);

That way you could put the buffer and pteg_group into the token struct and only 
allocate and use it when KVM with HV is in use.
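
A rough, self-contained sketch of what such a token could look like. Everything here is hypothetical (htab_token_t, start_access(), stop_access(), find_free_slot(), SKETCH_V_VALID, and the "remote" flag standing in for "KVM with HV"); the copy is a plain memcpy where a real implementation would read from the kernel.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define HPTES_PER_GROUP 8
#define SKETCH_V_VALID  1ULL   /* stand-in for the "valid" bit in pte0 */

typedef struct {
    uint64_t pte0;
    uint64_t pte1;
} hpte_t;

/* Hypothetical token: owns a buffer only when the table is not local. */
typedef struct {
    const hpte_t *ptes;   /* what the load helpers read from */
    hpte_t *copy;         /* heap buffer, NULL for direct access */
} htab_token_t;

static htab_token_t *start_access(const hpte_t *htab, uint64_t pteg, bool remote)
{
    const hpte_t *group = htab + pteg * HPTES_PER_GROUP;
    htab_token_t *t = malloc(sizeof(*t));

    if (!t) {
        return NULL;
    }
    t->ptes = group;
    t->copy = NULL;
    if (remote) {
        /* Only the remote case pays for a buffer and a copy. */
        t->copy = malloc(HPTES_PER_GROUP * sizeof(hpte_t));
        if (!t->copy) {
            free(t);
            return NULL;
        }
        memcpy(t->copy, group, HPTES_PER_GROUP * sizeof(hpte_t));
        t->ptes = t->copy;
    }
    return t;
}

static uint64_t load_hpte0(const htab_token_t *t, int i)
{
    return t->ptes[i].pte0;
}

static void stop_access(htab_token_t *t)
{
    if (t) {
        free(t->copy);
        free(t);
    }
}

/* One search loop for both cases; returns the first invalid slot or -1. */
static int find_free_slot(const hpte_t *htab, uint64_t pteg, bool remote)
{
    htab_token_t *t = start_access(htab, pteg, remote);
    int slot = -1;

    if (!t) {
        return -1;
    }
    for (int i = 0; i < HPTES_PER_GROUP; i++) {
        if ((load_hpte0(t, i) & SKETCH_V_VALID) == 0) {
            slot = i;
            break;
        }
    }
    stop_access(t);
    return slot;
}
```

The buffer lives exactly as long as the token, so nothing has to be added to the CPU state and there is no cached copy to invalidate later.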

 
 int pteg_group;
 unsigned long hpte[(HPTES_PER_GROUP * 2) + 1];
 
 Also our search can be much more effective with the current code, 

We're anything but performance critical at this point.

 
while (index < hpte_buf.header.n_valid) {
 
 against 
 
for (i = 0; i < HPTES_PER_GROUP; i++) {
 
 I guess the former is better when we can find invalid hpte entries.
 
 We now also need to update kvm_cpu_synchronize_state to clear
 pte_group so that we would not look at the stale values. If we ever want
 to use reading pteg in any other place we could possibly look at doing
 this. But at this stage, IMHO it unnecessarily makes it all complex and
 less efficient.

The point is to make it less complex. I don't like the idea of having 2 hash 
lookups in the same code base that do basically the same thing. And efficiency 
only ever counts in the TCG case here.


Alex

Re: [Qemu-devel] [PATCH -V5] target-ppc: Fix page table lookup with kvm enabled

2013-10-11 Thread Alexander Graf


On 11.10.2013, at 13:13, Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com wrote:

 From: Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com
 
 With kvm enabled, we store the hash page table information in the hypervisor.
 Use ioctl to read the htab contents. Without this we get the below error when
 trying to read the guest address
 
 (gdb) x/10 do_fork
 0xc0098660 do_fork:   Cannot access memory at address 0xc0098660
 (gdb)
 
 Signed-off-by: Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com
 ---
 Changes from V4:
 * Rewrite to avoid two code paths for doing hash lookups
 
 hw/ppc/spapr_hcall.c    | 44 ---
 target-ppc/kvm.c        | 47 +
 target-ppc/kvm_ppc.h    | 16 +++
 target-ppc/mmu-hash64.c | 70 +++--
 target-ppc/mmu-hash64.h | 33 ---
 5 files changed, 170 insertions(+), 40 deletions(-)
 
 diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
 index f10ba8a..96348e3 100644
 --- a/hw/ppc/spapr_hcall.c
 +++ b/hw/ppc/spapr_hcall.c
 @@ -52,6 +52,7 @@ static target_ulong h_enter(PowerPCCPU *cpu, sPAPREnvironment *spapr,
 target_ulong raddr;
 target_ulong i;
 hwaddr hpte;
 +struct ppc_hash64_hpte_token *token;
 
 /* only handle 4k and 16M pages for now */
 if (pteh & HPTE64_V_LARGE) {
 @@ -94,25 +95,32 @@ static target_ulong h_enter(PowerPCCPU *cpu, sPAPREnvironment *spapr,
 if ((pte_index * HASH_PTE_SIZE_64) & ~env->htab_mask) {
 return H_PARAMETER;
 }
 +
 +i = 0;
 +hpte = pte_index * HASH_PTE_SIZE_64;
 if (likely((flags & H_EXACT) == 0)) {
 pte_index &= ~7ULL;
 -hpte = pte_index * HASH_PTE_SIZE_64;
 -for (i = 0; ; ++i) {
 +token = ppc_hash64_start_access(ppc_env_get_cpu(env), pte_index);
 +do {

Why convert this into a while loop?

 if (i == 8) {
 +ppc_hash64_stop_access(token);
 return H_PTEG_FULL;
 }
 -if ((ppc_hash64_load_hpte0(env, hpte) & HPTE64_V_VALID) == 0) {
 +if ((ppc_hash64_load_hpte0(token, i) & HPTE64_V_VALID) == 0) {
 break;
 }
 -hpte += HASH_PTE_SIZE_64;
 -}
 +} while (i++);
 +ppc_hash64_stop_access(token);
 } else {
 -i = 0;
 -hpte = pte_index * HASH_PTE_SIZE_64;
 -if (ppc_hash64_load_hpte0(env, hpte) & HPTE64_V_VALID) {
 +token = ppc_hash64_start_access(ppc_env_get_cpu(env), pte_index);
 +if (ppc_hash64_load_hpte0(token, 0) & HPTE64_V_VALID) {
 +ppc_hash64_stop_access(token);
 return H_PTEG_FULL;
 }
 +ppc_hash64_stop_access(token);
 }
 +hpte += i * HASH_PTE_SIZE_64;
 +
 ppc_hash64_store_hpte1(env, hpte, ptel);
 /* eieio();  FIXME: need some sort of barrier for smp? */
 ppc_hash64_store_hpte0(env, hpte, pteh | HPTE64_V_HPTE_DIRTY);
 @@ -135,15 +143,16 @@ static RemoveResult remove_hpte(CPUPPCState *env, target_ulong ptex,
 {
 hwaddr hpte;
 target_ulong v, r, rb;
 +struct ppc_hash64_hpte_token  *token;
 
 if ((ptex * HASH_PTE_SIZE_64) & ~env->htab_mask) {
 return REMOVE_PARM;
 }
 
 -hpte = ptex * HASH_PTE_SIZE_64;
 -
 -v = ppc_hash64_load_hpte0(env, hpte);
 -r = ppc_hash64_load_hpte1(env, hpte);
 +token = ppc_hash64_start_access(ppc_env_get_cpu(env), ptex);
 +v = ppc_hash64_load_hpte0(token, 0);
 +r = ppc_hash64_load_hpte1(token, 0);
 +ppc_hash64_stop_access(token);
 
 if ((v & HPTE64_V_VALID) == 0 ||
 ((flags & H_AVPN) && (v & ~0x7fULL) != avpn) ||
 @@ -152,6 +161,7 @@ static RemoveResult remove_hpte(CPUPPCState *env, target_ulong ptex,
 }
 *vp = v;
 *rp = r;
 +hpte = ptex * HASH_PTE_SIZE_64;
 ppc_hash64_store_hpte0(env, hpte, HPTE64_V_HPTE_DIRTY);
 rb = compute_tlbie_rb(v, r, ptex);
 ppc_tlb_invalidate_one(env, rb);
 @@ -261,15 +271,16 @@ static target_ulong h_protect(PowerPCCPU *cpu, sPAPREnvironment *spapr,
 target_ulong avpn = args[2];
 hwaddr hpte;
 target_ulong v, r, rb;
 +struct ppc_hash64_hpte_token *token;
 
 if ((pte_index * HASH_PTE_SIZE_64) & ~env->htab_mask) {
 return H_PARAMETER;
 }
 
 -hpte = pte_index * HASH_PTE_SIZE_64;
 -
 -v = ppc_hash64_load_hpte0(env, hpte);
 -r = ppc_hash64_load_hpte1(env, hpte);
 +token = ppc_hash64_start_access(ppc_env_get_cpu(env), pte_index);
 +v = ppc_hash64_load_hpte0(token, 0);
 +r = ppc_hash64_load_hpte1(token, 0);
 +ppc_hash64_stop_access(token);
 
 if ((v & HPTE64_V_VALID) == 0 ||
 ((flags & H_AVPN) && (v & ~0x7fULL) != avpn)) {
 @@ -282,6 +293,7 @@ static target_ulong h_protect(PowerPCCPU *cpu, sPAPREnvironment *spapr,
 r |= (flags << 48) & HPTE64_R_KEY_HI;
 r |= flags & (HPTE64_R_PP | HPTE64_R_N | HPTE64_R_KEY_LO);
 rb = compute_tlbie_rb(v, r, pte_index);
 +hpte = pte_index *

Re: [Qemu-devel] [RFC PATCH] spapr-vty: workaround reg property for old kernels

2013-10-15 Thread Alexander Graf


On 10/15/2013 10:50 AM, Alexey Kardashevskiy wrote:

Old kernels ( < 3.1) handle hvcX devices differently in different parts.
Sometimes the kernel assumes that the hvc device numbers start from zero
and if there is just one hvc, then it is hvc0.

However kernel's add_preferred_console() uses the very last byte of
the VTY's reg property as an hvc number so it might end up with something
different than hvc.

The problem appears on SLES11SP3 and RHEL6. If QEMU is run without
-nodefaults, then the default VTY is created first on a VIO bus and gets
reg==0x7100 so it will be hvc0 and everything will be fine.
If QEMU is run with:
  -nodefaults \
  -chardev socket,id=char1,host=localhost,port=8001,server,telnet,mux=on \
  -device spapr-vty,chardev=char1 \
  -mon chardev=char1,mode=readline,id=mon1 \

then exactly the same config is expected but in this case spapr-vty
gets reg==0x7101 and therefore it becomes hvc1 and lots of debug
output is missing. SLES11SP3 does not even boot as /dev/console is
redirected to /dev/hvc0 which is dead.

The issue can be solved by manual selection of VTY's reg property to
have last byte equal to zero.

The alternative would be to use separate reg property counter for
automatic reg property generation and this is what the patch does.

Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
---

Since libvirt uses -nodefault a lot and in this case spapr-nvram gets
created first and gets reg=0x7100, we cannot just ignore this. Also,
it does not seem an option to require libvirt users to specify spapr-vty
reg property every time.

Can anyone think of a simpler solution? Thanks.


---
  hw/ppc/spapr_vio.c | 7 ++-
  include/hw/ppc/spapr_vio.h | 1 +
  2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/hw/ppc/spapr_vio.c b/hw/ppc/spapr_vio.c
index a6a0a51..2d56950 100644
--- a/hw/ppc/spapr_vio.c
+++ b/hw/ppc/spapr_vio.c
@@ -438,7 +438,11 @@ static int spapr_vio_busdev_init(DeviceState *qdev)
  VIOsPAPRBus *bus = DO_UPCAST(VIOsPAPRBus, bus, dev->qdev.parent_bus);

  do {
-dev->reg = bus->next_reg++;
+if (!object_dynamic_cast(OBJECT(qdev), "spapr-vty")) {
+dev->reg = bus->next_reg++;
+} else {
+dev->reg = bus->next_vty_reg++;
+}
+}
  } while (reg_conflict(dev));
  }

@@ -501,6 +505,7 @@ VIOsPAPRBus *spapr_vio_bus_init(void)
  qbus = qbus_create(TYPE_SPAPR_VIO_BUS, dev, "spapr-vio");
  bus = DO_UPCAST(VIOsPAPRBus, bus, qbus);
  bus->next_reg = 0x7100;
+bus->next_vty_reg = 0x71000100;


This breaks as soon as you pass in more than 0x100 devices that are 
non-vty into the guest, no?
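
To make the concern concrete, here is a small stand-alone sketch of two counters whose start values sit 0x100 apart. The base values, bus_t, alloc_reg() and devices_until_overlap() are hypothetical, and the sketch deliberately leaves out the reg_conflict() retry loop from the patch; it only shows where the two ranges meet.

```c
#include <stdint.h>

/* Hypothetical bases; only the 0x100 gap between them matters here. */
#define REG_BASE      0x1000u
#define VTY_REG_BASE  (REG_BASE + 0x100u)

typedef struct {
    uint32_t next_reg;      /* counter for non-vty devices */
    uint32_t next_vty_reg;  /* separate counter for vty devices */
} bus_t;

static void bus_init(bus_t *bus)
{
    bus->next_reg = REG_BASE;
    bus->next_vty_reg = VTY_REG_BASE;
}

/* Hand out the next reg value from the counter matching the device kind. */
static uint32_t alloc_reg(bus_t *bus, int is_vty)
{
    return is_vty ? bus->next_vty_reg++ : bus->next_reg++;
}

/*
 * Count how many non-vty devices get a reg below the first vty's reg.
 * The next non-vty device after that would be handed the vty's value.
 */
static unsigned devices_until_overlap(void)
{
    bus_t bus;
    uint32_t first_vty;
    unsigned n = 0;

    bus_init(&bus);
    first_vty = alloc_reg(&bus, 1);
    while (alloc_reg(&bus, 0) != first_vty) {
        n++;
    }
    return n;
}
```

In this sketch exactly 0x100 non-vty devices fit before the non-vty counter reaches the first vty value, which is the limit the question above is pointing at.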


The reg property really describes the virtual slot a device is in. 
Couldn't we do that allocation explicitly and push it from libvirt, just 
like we do it with the slots for PCI?



Alex




  /* hcall-vio */
  spapr_register_hypercall(H_VIO_SIGNAL, h_vio_signal);
diff --git a/include/hw/ppc/spapr_vio.h b/include/hw/ppc/spapr_vio.h
index d8b3b03..3a92d9e 100644
--- a/include/hw/ppc/spapr_vio.h
+++ b/include/hw/ppc/spapr_vio.h
@@ -73,6 +73,7 @@ struct VIOsPAPRDevice {
  struct VIOsPAPRBus {
  BusState bus;
  uint32_t next_reg;
+uint32_t next_vty_reg;
  int (*init)(VIOsPAPRDevice *dev);
  int (*devnode)(VIOsPAPRDevice *dev, void *fdt, int node_off);
  };

Re: [Qemu-devel] [PATCH v2] pseries: Update SLOF firmware image

2013-10-16 Thread Alexander Graf


On 10/15/2013 07:00 AM, Alexey Kardashevskiy wrote:

SLOF git commit is e2e8ac901e617573ea383f9cffd136146d0675a4

The main changes are:
* fixed bug with not passing arguments from -append
* client-architecture-support hypercall
* netboot
* USB stack fixes

The full list of changes:
client-architecture-support: fix wrong version read
client-architecture-support: fix redundant stack drop
Update device tree returned by CAS hypercall
fdt: introduce fdt-init
Add ibm,client-architecture-support method
Kernel parameter passed from qemu commandline ignored
Allow more than one client to open net devices simultaneously
ci: add missing close in else condition
Add GPT support
pci: fix interrupt-map for bridges
usb-ohci: preserve the toggleCarry bit in ED
usb-ohci: done_head processing fixes
usb-ohci: update init and rationalize timings
usb-msc: handle stall and other fixes
scsi: make probe more error resilient
usb-core: Add CLEAR FEATURE api
Implement range allocator
Remove bcm57xx network driver as module
Remove e1000 network driver as module
Remove virtio-net network driver as module
Remove veth network driver as module
Add missing close-dev in ping
Remove lodable network driver modules and related functions
Add bcm57xx network driver in libbcm
Add e1000 network driver in libe1k
Add virtio-net driver in libvirtio
Add veth driver in libveth
Get MAC address for client interface module
Add SLOF usleep wrapper
Add SLOF pci wrapper functions
Fix 'canon' client interface

Cc: Nikunj A Dadhania nik...@linux.vnet.ibm.com
Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
---
Changes:
v2:
* fixed a bug in client-architecture-support: fix wrong version read

---

Alex, please note that the patch is made against your ppc-next branch
rather than qemu.org/master as the previous SLOF update request did not
reach upstream yet.

And yes, this is v2. My bad, sorry, the updates are huge for email :(
---
  pc-bios/README   |   2 +-
  pc-bios/slof.bin | Bin 875424 -> 873920 bytes
  roms/SLOF|   2 +-
  3 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/pc-bios/README b/pc-bios/README
index cf97365..e2622d2 100644
--- a/pc-bios/README
+++ b/pc-bios/README
@@ -17,7 +17,7 @@
  - SLOF (Slimline Open Firmware) is a free IEEE 1275 Open Firmware
implementation for certain IBM POWER hardware.  The sources are at
https://github.com/aik/SLOF, and the image currently in qemu is
-  built from git tag qemu-slof-20130827.
+  built from git tag qemu-slof-20131015.
  
  - sgabios (the Serial Graphics Adapter option ROM) provides a means for

legacy x86 software to communicate with an attached serial console as
diff --git a/pc-bios/slof.bin b/pc-bios/slof.bin
index 
0e8b51ad1f3ecde581db2a49c1b75e96e9084c41..92a9831be7ade3d09a8b0aea08d0694da792da51
 100644
GIT binary patch
delta 238635


[...]


diff --git a/roms/SLOF b/roms/SLOF
index a523d1b..e2e8ac9 160000
--- a/roms/SLOF
+++ b/roms/SLOF
@@ -1 +1 @@
-Subproject commit a523d1b0cd6e96cf5e393f0a10f897e8ed639fdc
+Subproject commit e2e8ac901e617573ea383f9cffd136146d0675a4


Is this commit in git.qemu.org/SLOF.git? I can't find it there. Anthony, 
can you pull it?



Alex

Re: [Qemu-devel] [PATCH v2] pseries: Update SLOF firmware image

2013-10-16 Thread Alexander Graf



Am 16.10.2013 um 12:47 schrieb Alexey Kardashevskiy a...@ozlabs.ru:

 On 10/16/2013 08:54 PM, Alexander Graf wrote:
 On 10/15/2013 07:00 AM, Alexey Kardashevskiy wrote:
 [...]
 
 Is this commit in git.qemu.org/SLOF.git? I can't find it there.
 
 It is from git in pc-bios/README so it is my git on github. I do not have
 authority over git.qemu.org :)

Well, the submodule points to the copy on git.qemu.org for legal reasons. So I 
can only include this after Anthony updated it :)

Alex

 
 
 Anthony, can you pull it?
 
 
 
 
 
 -- 
 Alexey

Re: [Qemu-devel] [Qemu-ppc] [PATCH V2] Fix float64_to_uint64

2013-10-17 Thread Alexander Graf


On 16.10.2013, at 23:10, Tom Musta tommu...@gmail.com wrote:

 The comment preceding the float64_to_uint64 routine suggests that
 the implementation is broken.  And this is, indeed, the case.
 
 This patch properly implements the conversion of a 64-bit floating
 point number to an unsigned, 64 bit integer.
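
As a rough illustration of the field extraction and shift arithmetic such a conversion relies on, here is a minimal sketch. It is not the patch's implementation: it only models round-toward-zero, reports a single "invalid" flag, does not track inexact, and treats every overflow as invalid. The function names are made up for this example.

```python
import struct

UINT64_MAX = 0xFFFFFFFFFFFFFFFF


def float64_fields(x):
    """Split a Python float into IEEE 754 double sign, exponent, fraction."""
    bits = struct.unpack(">Q", struct.pack(">d", x))[0]
    sign = bits >> 63
    exp = (bits >> 52) & 0x7FF
    frac = bits & ((1 << 52) - 1)
    return sign, exp, frac


def to_uint64_round_to_zero(x):
    """Return (value, invalid) for a truncating double -> uint64 conversion."""
    sign, exp, frac = float64_fields(x)
    if sign:
        # Negative inputs yield zero; anything with a nonzero exponent
        # is flagged as invalid in this simplified model.
        return 0, exp != 0
    if exp:
        frac |= 1 << 52              # restore the implicit leading one
    shift = 0x433 - exp              # 0x433 = exponent bias (0x3FF) + 52
    if shift <= 0:
        if exp > 0x43E:              # 2**64 and above, infinity, NaN
            return UINT64_MAX, True
        return frac << -shift, False
    return frac >> shift, False      # drops the fractional bits


if __name__ == "__main__":
    for value in (1.0, 3.7, 2.0 ** 63, 2.0 ** 64, -1.0):
        print(value, to_uint64_round_to_zero(value))
```

A full implementation would additionally honour the current rounding mode and raise the inexact flag when bits are shifted out.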
 
 Note that the patch does not pass scripts/checkpatch.pl because it
 maintains the coding style of fpu/softfloat.c.
 
 V2: This contribution can be licensed under either the softfloat-2a or -2b
 license.

Missing a SoB line.


Alex

 
 ---
 fpu/softfloat.c |   92 ++
 1 files changed, 85 insertions(+), 7 deletions(-)
 
 diff --git a/fpu/softfloat.c b/fpu/softfloat.c
 index 7ba51b6..f8c7f92 100644
 --- a/fpu/softfloat.c
 +++ b/fpu/softfloat.c
 @@ -204,6 +204,46 @@ static int64 roundAndPackInt64( flag zSign, uint64_t 
 absZ0, uint64_t absZ1 STATU
 }
 
 /*
 +| Takes the 128-bit fixed-point value formed by concatenating `absZ0' and
 +| `absZ1', with binary point between bits 63 and 64 (between the input 
 words),
 +| and returns the properly rounded 64-bit unsigned integer corresponding to 
 the
 +| input.  Ordinarily, the fixed-point input is simply rounded to an integer,
 +| with the inexact exception raised if the input cannot be represented 
 exactly
 +| as an integer.  However, if the fixed-point input is too large, the invalid
 +| exception is raised and the largest unsigned integer is returned.
 +**/
 +
 +static int64 roundAndPackUint64( uint64_t absZ0, uint64_t absZ1 STATUS_PARAM)
 +{
 +int8 roundingMode;
 +flag roundNearestEven, increment;
 +int64_t z;
 +
 +roundingMode = STATUS(float_rounding_mode);
 +roundNearestEven = ( roundingMode == float_round_nearest_even );
 +increment = ( (int64_t) absZ1 < 0 );
 +if ( ! roundNearestEven ) {
 +if ( roundingMode == float_round_to_zero ) {
 +increment = 0;
 +}
 +else {
 +increment = ( roundingMode == float_round_up ) && absZ1;
 +}
 +}
 +if ( increment ) {
 +++absZ0;
 +if ( absZ0 == 0 ) {
 +float_raise( float_flag_invalid STATUS_VAR);
 +return LIT64( 0xFFFFFFFFFFFFFFFF );
 +}
 +absZ0 &= ~ ( ( (uint64_t) ( absZ1<<1 ) == 0 ) & roundNearestEven );
 +}
 +z = absZ0;
 +if ( absZ1 ) STATUS(float_exception_flags) |= float_flag_inexact;
 +return z;
 +}
 +
 +/*
 | Returns the fraction bits of the single-precision floating-point value `a'.
 **/
 
 @@ -6536,18 +6576,56 @@ uint_fast16_t float64_to_uint16_round_to_zero(float64 
 a STATUS_PARAM)
 return res;
 }
 
 -/* FIXME: This looks broken.  */
 -uint64_t float64_to_uint64 (float64 a STATUS_PARAM)
 +/*
 +| Returns the result of converting the double-precision floating-point value
 +| `a' to the 64-bit unsigned integer format.  The conversion is
 +| performed according to the IEC/IEEE Standard for Binary Floating-Point
 +| Arithmetic---which means in particular that the conversion is rounded
 +| according to the current rounding mode.  If `a' is a NaN, the largest
 +| positive integer is returned.  If the conversion overflows, the
 +| largest unsigned integer is returned.  If 'a' is negative, zero is
 +| returned.
 +**/
 +
 +uint64_t float64_to_uint64( float64 a STATUS_PARAM )
 {
 -int64_t v;
 +flag aSign;
 +int_fast16_t aExp, shiftCount;
 +uint64_t aSig, aSigExtra;
 +a = float64_squash_input_denormal(a STATUS_VAR);
 
 -v = float64_val(int64_to_float64(INT64_MIN STATUS_VAR));
 -v += float64_val(a);
 -v = float64_to_int64(make_float64(v) STATUS_VAR);
 +aSig = extractFloat64Frac( a );
 +aExp = extractFloat64Exp( a );
 +aSign = extractFloat64Sign( a );
 +if ( aSign ) {
 +if ( aExp ) {
 +float_raise( float_flag_invalid STATUS_VAR);
 +} else if ( aSig ) { /* negative denormalized */
 +float_raise( float_flag_inexact STATUS_VAR);
 +}
 +return 0;
 +}
 +if ( aExp ) aSig |= LIT64( 0x0010000000000000 );
 +shiftCount = 0x433 - aExp;
 +if ( shiftCount <= 0 ) {
 +if ( 0x43E < aExp ) {
 +if ( ( aSig != LIT64( 0x0010000000000000 ) ) ||
 + ( aExp == 0x7FF ) ) {
 +float_raise( float_flag_invalid STATUS_VAR);
 +}
 +return LIT64( 0xFFFFFFFFFFFFFFFF );
 +}
 +aSigExtra = 0;
 +aSig <<= - shiftCount;
 +}
 +else {
 +shift64ExtraRightJamming( aSig, 0,

Re: [Qemu-devel] [PATCH 00/60] AArch64 TCG emulation support

2013-10-17 Thread Alexander Graf


On 16.10.2013, at 21:54, Edgar E. Iglesias edgar.igles...@gmail.com wrote:

 On Fri, Sep 27, 2013 at 02:47:54AM +0200, Alexander Graf wrote:
 Howdy,
 
 This is the first batch of patches to implement AArch64 instruction
 emulation in QEMU. It implements enough to execute simple AArch64
 programs in linux-user mode.
 
 We still have quite a big number of patches outstanding that will
 come after this initial set, both in linux-user code as well as in
 the AArch64 instruction emulator. But this series is already quite
 big, so let's get this one through first.
 
 
 Impressive work Alex!
 
 It would be fun to try this out, do you have a public repo with these
 patches?

I even have a repo with a not as clean, but fully working backend:

  https://github.com/openSUSE/qemu/commits/aarch64-work

 How much progress have you made on system emulation?

None :). So far the target was user emulation. When I get a bit of spare time 
on my hands again I'll take a stab at system emulation too, but so far I 
haven't gotten around to it. In fact, doing v2 is definitely higher on my todo 
list than system emulation ;).


Alex

Re: [Qemu-devel] [PATCH v3 0/2] target-ppc: Tidy sPAPR device tree CPU nodes

2013-10-17 Thread Alexander Graf


On 15.10.2013, at 18:33, Andreas Färber afaer...@suse.de wrote:

 Hello Alexey and Alex,
 
 This series cleans up the fdt CPU nodes for -M pseries as attempted by Prerna.
 
 v3 uses DeviceClass::fw_name for name storage exclusively, with 
 PowerPC,UNKNOWN
 as fallback.

Thanks, applied all to ppc-next.


Alex

Re: [Qemu-devel] virtio-blk-pci: how to tell if it is CD or HDD?

2013-10-17 Thread Alexander Graf


On 17.10.2013, at 14:54, Paolo Bonzini pbonz...@redhat.com wrote:

 Il 17/10/2013 14:38, Alexey Kardashevskiy ha scritto:
 qdev_get_fw_dev_path:
 /spapr-vio-bridge/spapr-vscsi/channel@0/disk@3,2 suffix=(null)
 /spapr-vio-bridge/spapr-vscsi/channel@0/disk@3,1 suffix=(null)
 
 You need to implement qdev_fw_get_path to change
 
 spapr-vio-bridge -> vdevice
 spapr-vscsi -> v-scsi@REG
 
 /pci@8002000/ethernet@1 suffix=/ethernet-phy@0
 
 The extra suffix is not a problem since you can parse a prefix successfully.
 
 /pci@8002000/scsi@0/channel@0/disk@3,2 suffix=(null)
 /pci@8002000/scsi@0/channel@0/disk@3,1 suffix=(null)
 
 I guess this is virtio-scsi.
 
 SLOF:
 0 > devalias
 cdrom123 : /pci@8002000/scsi@0/disk@1030001
 cdrom12 : /pci@8002000/scsi@0/disk@1030002
 hvterm : /vdevice/vty@71000100
 net : /pci@8002000/ethernet@1
 scsi : /vdevice/v-scsi@7101
 cdrom1 : /vdevice/v-scsi@7101/disk@8301
 cdrom : /vdevice/v-scsi@7101/disk@8302
 nvram : /vdevice/nvram@7100 ok
 
 
 In an ideal world I would want to get in QEMU what SLOF can understand and
 pass this to SLOF. But QEMU APIs return something which cannot be converted
 straight away.
 
 Or I could simply put bootindex to the device tree nodes (as
 qemu,bootindex) but in this case wildcard nodes support fails as there
 is just a single node /vdevice/v-scsi@7101/disk in the device tree
 for all LUNs. And we definitely do not want to create nodes for all disk
 devices.
 
 Or I can implement a smart converter from QEMU strings to OF pathnames.
 
 Or I can implement third set of callbacks, something like qdev_OF_dev_path().
 
 Or not support bootindex at all.
 
 All possibilities suck but which one sucks less? :) Thanks!
 
 In general, try to make QEMU produce SLOF APIs by modifying the devices
 that instantiate the buses.

But please make sure to not block the path for non-SLOF machines. -M mac99 
should still be able to get different path names for PCI devices for example.


Alex

Re: [Qemu-devel] [PATCH v8 0/3] hw/arm: Add 'virt' platform

2013-10-18 Thread Alexander Graf


On 18.10.2013, at 13:12, Peter Maydell peter.mayd...@linaro.org wrote:

 On 17 October 2013 17:48, Peter Maydell peter.mayd...@linaro.org wrote:
 This patch series adds a 'virt' platform which uses the
 kernel's mach-virt (fully device-tree driven) support
 to create a simple minimalist platform intended for
 use for KVM VM guests.
 
 Changes v7->v8:
 * iterate through virtio-mmio nodes the opposite way round so
   that they appear in the device tree lowest-address-first;
   this matches PPC behaviour and the vexpress code
 
 ...it turns out this isn't quite right. We need to create
 the actual devices in forwards order (so that devices created
 on the qemu command line populate the transports lowest address
 first) and then create the dtb nodes in reverse order (so that
 the transports appear in the final dtb lowest address first). Ugh.
 
 Given this plus the fact that you still need a kernel patch to
 get the thing to boot at all [would anybody on the kernel side
 like to pick up that particular ball?], I'm leaning toward not
 putting this in 1.7 now.

We could add a fdt_append_subnode_namelen() function that instead of putting it 
after the parent's properties puts the new node after all subnodes. While we're 
waiting for it to trickle into libfdt we could keep a copy in device_tree.c.

Then we just switch everything to natural non-reverse order append_subnode().
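
To make the ordering issue concrete, here is a toy model (a sketch, not libfdt). It assumes, as described above, that the existing add operation places a new node right after the parent's properties, i.e. in front of any existing subnodes, while the proposed append variant places it after them. `Node`, `add_subnode` and `append_subnode` are illustrative names, and the addresses are invented sample values.

```python
class Node:
    """Minimal stand-in for a device tree node with ordered children."""

    def __init__(self, name):
        self.name = name
        self.children = []

    def add_subnode(self, name):
        # Modelled existing behaviour: the new node lands before
        # all subnodes that were added earlier.
        child = Node(name)
        self.children.insert(0, child)
        return child

    def append_subnode(self, name):
        # Modelled proposed behaviour: the new node lands after
        # all existing subnodes.
        child = Node(name)
        self.children.append(child)
        return child

    def child_names(self):
        return [child.name for child in self.children]


if __name__ == "__main__":
    addresses = [0x0A000000, 0x0A000200, 0x0A000400]  # made-up sample values

    prepend_root = Node("/")
    for addr in addresses:
        prepend_root.add_subnode("virtio_mmio@%x" % addr)
    print(prepend_root.child_names())   # highest address ends up first

    append_root = Node("/")
    for addr in addresses:
        append_root.append_subnode("virtio_mmio@%x" % addr)
    print(append_root.child_names())    # creation order is preserved
```

With an append-style helper, device creation and node creation can both run in ascending address order, so no reverse iteration is needed.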


Alex

[Qemu-devel] [PATCH 3/3] Add migration stream analyzation script

2013-10-23 Thread Alexander Graf

This patch adds a python tool to the scripts directory that can read
a dumped migration stream which contains the debug_migration device
and construct a human readable JSON stream out of it.

It's very simple to use:

  $ qemu-system-x86_64 -device debug-migration
  (qemu) migrate "exec:cat > mig"
  $ ./scripts/analyze-migration.py -f mig

Signed-off-by: Alexander Graf ag...@suse.de
---
 scripts/analyze-migration.py | 483 +++
 1 file changed, 483 insertions(+)
 create mode 100755 scripts/analyze-migration.py

diff --git a/scripts/analyze-migration.py b/scripts/analyze-migration.py
new file mode 100755
index 0000000..bf70749
--- /dev/null
+++ b/scripts/analyze-migration.py
@@ -0,0 +1,483 @@
+#!/usr/bin/env python
+#
+#  Migration Stream Analyzer
+#
+#  Copyright (c) 2013 Alexander Graf ag...@suse.de
+#
+# This library is free software; you can redistribute it and/or
+# modify it under the terms of the GNU Lesser General Public
+# License as published by the Free Software Foundation; either
+# version 2 of the License, or (at your option) any later version.
+#
+# This library is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+# Lesser General Public License for more details.
+#
+# You should have received a copy of the GNU Lesser General Public
+# License along with this library; if not, see http://www.gnu.org/licenses/.
+
+import numpy as np
+import json
+import os
+import argparse
+import collections
+import pprint
+
+class MigrationFile(object):
+    def __init__(self, filename):
+        self.filename = filename
+        self.file = open(self.filename, "rb")
+
+    def read64(self):
+        return np.asscalar(np.fromfile(self.file, count=1, dtype='>i8')[0])
+
+    def read32(self):
+        return np.asscalar(np.fromfile(self.file, count=1, dtype='>i4')[0])
+
+    def read8(self):
+        return np.asscalar(np.fromfile(self.file, count=1, dtype='>i1')[0])
+
+    def readstr(self, len = None):
+        if len is None:
+            len = self.read8()
+        if len == 0:
+            return ""
+        return np.fromfile(self.file, count=1, dtype=('S%d' % len))[0]
+
+    def readvar(self, size = None):
+        if size is None:
+            size = self.read8()
+        if size == 0:
+            return ""
+        value = self.file.read(size)
+        if len(value) != size:
+            raise Exception("Unexpected end of %s at 0x%x" % (self.filename, self.file.tell()))
+        return value
+
+    # Search the current file from the current position onwards for a JSON
+    # migration descriptor. Returns the JSON string blob.
+    def read_migration_debug_json(self):
+        pos = self.file.tell()
+        data = self.file.read()
+        dbgpos = data.find("Debug Migration")
+        if dbgpos == -1:
+            raise Exception("No Debug Migration device found")
+
+        # The full file read closed the file as well, reopen it where we were
+        self.file = open(self.filename, "rb")
+        self.file.seek(pos, 0)
+
+        # We assume that our JSON blob starts after the Debug Migration magic
+        # and is null terminated.
+        return data[(dbgpos + 16):].split('\0',1)[0]
+
+    def close(self):
+        self.file.close()
+
+
+class RamSection(object):
+    RAM_SAVE_FLAG_COMPRESS = 0x02
+    RAM_SAVE_FLAG_MEM_SIZE = 0x04
+    RAM_SAVE_FLAG_PAGE     = 0x08
+    RAM_SAVE_FLAG_EOS      = 0x10
+    RAM_SAVE_FLAG_CONTINUE = 0x20
+    RAM_SAVE_FLAG_XBZRLE   = 0x40
+    RAM_SAVE_FLAG_HOOK     = 0x80
+    # This can be dynamic, but all targets we care about have 4k pages
+    TARGET_PAGE_SIZE       = 0x1000
+    blocks = []
+
+    def __init__(self, file, version_id, device, section_key):
+        if version_id != 4:
+            raise Exception("Unknown RAM version %d" % version_id)
+
+        self.file = file
+        self.section_key = section_key
+
+    def read(self):
+        # Read all RAM sections
+        while True:
+            addr = self.file.read64()
+            flags = addr & 0xfff
+            addr &= 0xfffffffffffff000;
+
+            if flags & self.RAM_SAVE_FLAG_MEM_SIZE:
+                while True:
+                    namelen = self.file.read8()
+                    # We assume that no RAM chunk is big enough to ever
+                    # hit the first byte of the address, so when we see
+                    # a zero here we know it has to be an address, not the
+                    # length of the next block.
+                    if namelen == 0:
+                        self.file.file.seek(-1, 1)
+                        break
+                    name = self.file.readstr(len = namelen)
+                    len = self.file.read64()
+                    self.blocks.append((name, len))
+                flags &= ~self.RAM_SAVE_FLAG_MEM_SIZE
+
+            if flags & self.RAM_SAVE_FLAG_COMPRESS:
+                if flags

[Qemu-devel] [PATCH 0/3] Migration Debugging Helper Device

2013-10-23 Thread Alexander Graf

This patch set adds support for a simple migration debugging method.

It adds a device that exports all metadata required to read a migration
stream from an external program as part of the migration stream. The
external program then does not need to have any knowledge of device internals
of the target virtual machine.

The patch set also adds a python script that serves as such an external program,
allowing users to easily introspect the contents of a live migrated stream.

This approach consciously does not modify the way QEMU operates. To QEMU, it is
completely transparent. QEMU does not read that stream either, so you cannot
use it to recover from migration breakage within the code. For that, we should
simply improve the migration protocol to be more future proof.

This approach is about enabling offline introspection of migration stream data
and structure, so that we have one more tool in our hands to see what goes wrong
inside a virtual machine.
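
How an external program might pull the embedded description out of a dumped stream can be sketched as follows. This mirrors the assumption made by the script in patch 3/3 (the JSON blob starts 16 bytes after the start of the "Debug Migration" magic and is NUL terminated); `extract_descriptor` is a hypothetical helper name and the sample stream is invented for the demonstration.

```python
import json

MAGIC = b"Debug Migration"
MAGIC_FIELD_SIZE = 16  # assumed: the magic occupies a 16-byte field


def extract_descriptor(stream):
    """Locate the magic in a byte string and decode the JSON blob after it."""
    pos = stream.find(MAGIC)
    if pos == -1:
        raise ValueError("No Debug Migration device found")
    # Everything after the magic field up to the first NUL byte is
    # taken to be the JSON description.
    blob = stream[pos + MAGIC_FIELD_SIZE:].split(b"\0", 1)[0]
    return json.loads(blob.decode("utf-8"))


if __name__ == "__main__":
    # Invented sample: unrelated bytes, the padded magic, a tiny JSON
    # document, a terminating NUL and more unrelated bytes.
    sample = (b"\x01\x02" + MAGIC + b"\x00" +
              b'{"devices": []}' + b"\x00" + b"\xff\xff")
    print(extract_descriptor(sample))
```

Because the description travels inside the stream itself, the reader needs no knowledge of the devices in the guest, which is the point made above.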

  Example decoded migration: http://csgraf.de/mig/mig.txt
  Presentation: https://www.youtube.com/watch?v=iq1x40Qsrew
  Slides: https://www.dropbox.com/s/otp2pk2n3g087zp/Live%20Migration.pdf

Alexander Graf (3):
  Export savevm handlers outside of savevm.c
  Add migration debug device
  Add migration stream analyzation script

 hw/misc/Makefile.objs|   1 +
 hw/misc/debug_migration.c| 498 +++
 include/qemu/savevm.h|  28 +++
 savevm.c |  24 +--
 scripts/analyze-migration.py | 483 +
 5 files changed, 1012 insertions(+), 22 deletions(-)
 create mode 100644 hw/misc/debug_migration.c
 create mode 100644 include/qemu/savevm.h
 create mode 100755 scripts/analyze-migration.py

-- 
1.7.12.4

[Qemu-devel] [PATCH 1/3] Export savevm handlers outside of savevm.c

2013-10-23 Thread Alexander Graf

We need to be able to access savevm handlers from code that lives
outside of savevm.c. Extract its struct definitions and declaration
into a separate header file.

Signed-off-by: Alexander Graf ag...@suse.de
---
 include/qemu/savevm.h | 28 
 savevm.c  | 24 ++--
 2 files changed, 30 insertions(+), 22 deletions(-)
 create mode 100644 include/qemu/savevm.h

diff --git a/include/qemu/savevm.h b/include/qemu/savevm.h
new file mode 100644
index 0000000..5dae243
--- /dev/null
+++ b/include/qemu/savevm.h
@@ -0,0 +1,28 @@
+#ifndef QEMU_SAVEVM_H
+#define QEMU_SAVEVM_H
+
+typedef struct CompatEntry {
+char idstr[256];
+int instance_id;
+} CompatEntry;
+
+typedef struct SaveStateEntry {
+QTAILQ_ENTRY(SaveStateEntry) entry;
+char idstr[256];
+int instance_id;
+int alias_id;
+int version_id;
+int section_id;
+SaveVMHandlers *ops;
+const VMStateDescription *vmsd;
+void *opaque;
+CompatEntry *compat;
+int no_migrate;
+int is_ram;
+} SaveStateEntry;
+
+typedef QTAILQ_HEAD(EHCIQueueHead, EHCIQueue) EHCIQueueHead;
+typedef QTAILQ_HEAD(savevm_handlers, SaveStateEntry) SaveStateEntryHead;
+extern SaveStateEntryHead savevm_handlers;
+
+#endif /* QEMU_SAVEVM_H */
diff --git a/savevm.c b/savevm.c
index 2f631d4..eea45e1 100644
--- a/savevm.c
+++ b/savevm.c
@@ -1457,29 +1457,9 @@ const VMStateInfo vmstate_info_bitmap = {
 .put = put_bitmap,
 };
 
-typedef struct CompatEntry {
-char idstr[256];
-int instance_id;
-} CompatEntry;
-
-typedef struct SaveStateEntry {
-QTAILQ_ENTRY(SaveStateEntry) entry;
-char idstr[256];
-int instance_id;
-int alias_id;
-int version_id;
-int section_id;
-SaveVMHandlers *ops;
-const VMStateDescription *vmsd;
-void *opaque;
-CompatEntry *compat;
-int no_migrate;
-int is_ram;
-} SaveStateEntry;
-
+#include "qemu/savevm.h"
 
-static QTAILQ_HEAD(savevm_handlers, SaveStateEntry) savevm_handlers =
-QTAILQ_HEAD_INITIALIZER(savevm_handlers);
+SaveStateEntryHead savevm_handlers = QTAILQ_HEAD_INITIALIZER(savevm_handlers);
 static int global_section_id;
 
 static int calculate_new_instance_id(const char *idstr)
-- 
1.7.12.4

[Qemu-devel] [PATCH 2/3] Add migration debug device

2013-10-23 Thread Alexander Graf

This patch adds a pseudo device whose sole purpose is to encapsulate
a machine readable layout description of the vmstate stream layout inside
of the stream.

With this device enabled in the system while a migration is happening, we
have the chance to decipher the contents of the stream from an external
program without any knowledge of the device layout of the guest.

Signed-off-by: Alexander Graf ag...@suse.de
---
 hw/misc/Makefile.objs |   1 +
 hw/misc/debug_migration.c | 498 ++
 2 files changed, 499 insertions(+)
 create mode 100644 hw/misc/debug_migration.c

diff --git a/hw/misc/Makefile.objs b/hw/misc/Makefile.objs
index 2578e29..4cfe8a4 100644
--- a/hw/misc/Makefile.objs
+++ b/hw/misc/Makefile.objs
@@ -41,3 +41,4 @@ obj-$(CONFIG_SLAVIO) += slavio_misc.o
 obj-$(CONFIG_ZYNQ) += zynq_slcr.o
 
 obj-$(CONFIG_PVPANIC) += pvpanic.o
+obj-y += debug_migration.o
diff --git a/hw/misc/debug_migration.c b/hw/misc/debug_migration.c
new file mode 100644
index 0000000..813041e
--- /dev/null
+++ b/hw/misc/debug_migration.c
@@ -0,0 +1,498 @@
+/*
+ *  QEMU pseudo-device to expose migration details
+ *
+ *  Copyright (c) 2013 Alexander Graf ag...@suse.de
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see http://www.gnu.org/licenses/.
+ */
+
+#include "hw/hw.h"
+#include "qemu/savevm.h"
+#include "qapi/qmp/qstring.h"
+
+#define TYPE_DEBUG_MIGRATION_DEVICE "debug-migration"
+
+typedef struct DebugMigration {
+    DeviceState parent_obj;
+
+    uint8_t magic[16];
+    int32_t size;
+    char *data;
+} DebugMigration;
+
+
+/* QJSON */
+
+typedef struct QJSON {
+    QString *str;
+    bool omit_comma;
+    unsigned long self_size_offset;
+} QJSON;
+
+static void json_emit_element(QJSON *json, const char *name)
+{
+    /* Check whether we need to print a , before an element */
+    if (json->omit_comma) {
+        json->omit_comma = false;
+    } else {
+        qstring_append(json->str, ", ");
+    }
+
+    if (name) {
+        qstring_append(json->str, "\"");
+        qstring_append(json->str, name);
+        qstring_append(json->str, "\" : ");
+    }
+}
+
+static void json_start_object(QJSON *json, const char *name)
+{
+    json_emit_element(json, name);
+    qstring_append(json->str, "{ ");
+    json->omit_comma = true;
+}
+
+static void json_end_object(QJSON *json)
+{
+    qstring_append(json->str, " }");
+    json->omit_comma = false;
+}
+
+static void json_start_array(QJSON *json, const char *name)
+{
+    json_emit_element(json, name);
+    qstring_append(json->str, "[ ");
+    json->omit_comma = true;
+}
+
+static void json_end_array(QJSON *json)
+{
+    qstring_append(json->str, " ]");
+    json->omit_comma = false;
+}
+
+static void json_prop_int(QJSON *json, const char *name, int64_t val)
+{
+    json_emit_element(json, name);
+    qstring_append_int(json->str, val);
+}
+
+static void json_prop_str(QJSON *json, const char *name, const char *str)
+{
+    json_emit_element(json, name);
+    qstring_append_chr(json->str, '"');
+    qstring_append(json->str, str);
+    qstring_append_chr(json->str, '"');
+}
+
+static QJSON *qjson_new(void)
+{
+    QJSON *json = g_new(QJSON, 1);
+    json->str = qstring_from_str("{ ");
+    json->omit_comma = true;
+    return json;
+}
+
+static void qjson_finish(QJSON *json)
+{
+    json_end_object(json);
+}
+
+
+/* fake_file */
+
+
+static int fake_file_put_buffer(void *opaque, const uint8_t *json,
+                                int64_t pos, int size)
+{
+    size_t *offset = (size_t *)opaque;
+    *offset += size;
+    return size;
+}
+
+const QEMUFileOps fake_file_ops = {
+    .put_buffer = fake_file_put_buffer,
+};
+
+
+/* debug_migration */
+
+static void print_vmsd(QJSON *json, const VMStateDescription *vmsd,
+                       void *opaque);
+static void print_vmsd_one(QJSON *json, const VMStateDescription *vmsd,
+                           void *opaque, int version_id);
+static const VMStateDescription vmstate_debug_migration;
+
+
+static void print_non_vmstate(QJSON *json, SaveStateEntry *se)
+{
+    QEMUFile *fakefile;
+    size_t offset = 0;
+
+    fakefile = qemu_fopen_ops(&offset, &fake_file_ops);
+
+    offset = 0;
+    se->ops->save_state(fakefile, se->opaque);
+    qemu_fflush(fakefile);
+
+    json_prop_int(json, "size", offset);
+    json_start_array(json, "fields"
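
The comma handling used by the QJSON helpers in this patch can be sketched in a few lines. This is an illustrative re-creation of the technique, not the patch's code: a single `omit_comma` flag records whether the next element is the first one in its container. `JsonWriter` and its method names are hypothetical, and unlike the C helpers this sketch escapes strings via `json.dumps`.

```python
import json


class JsonWriter:
    """Streaming JSON emitter that tracks commas with one boolean flag."""

    def __init__(self):
        self.parts = ["{ "]        # the top-level object is opened up front
        self.omit_comma = True     # first element needs no leading comma

    def _emit_element(self, name):
        if self.omit_comma:
            self.omit_comma = False
        else:
            self.parts.append(", ")
        if name is not None:
            self.parts.append(json.dumps(name) + " : ")

    def start_object(self, name=None):
        self._emit_element(name)
        self.parts.append("{ ")
        self.omit_comma = True

    def end_object(self):
        self.parts.append(" }")
        self.omit_comma = False

    def start_array(self, name=None):
        self._emit_element(name)
        self.parts.append("[ ")
        self.omit_comma = True

    def end_array(self):
        self.parts.append(" ]")
        self.omit_comma = False

    def prop_int(self, name, value):
        self._emit_element(name)
        self.parts.append(str(int(value)))

    def prop_str(self, name, value):
        self._emit_element(name)
        self.parts.append(json.dumps(value))

    def finish(self):
        # Close the top-level object and return the complete document.
        self.end_object()
        return "".join(self.parts)


if __name__ == "__main__":
    writer = JsonWriter()
    writer.prop_str("name", "ram")
    writer.start_array("fields")
    writer.prop_int(None, 1)
    writer.prop_int(None, 2)
    writer.end_array()
    writer.prop_int("size", 4)
    print(writer.finish())
```

The flag avoids any look-ahead: opening a container sets it, and closing a container or emitting an element clears it.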

[Qemu-devel] [PULL 00/29] ppc patch queue 2013-10-25

2013-10-25 Thread Alexander Graf

Hi Blue / Aurelien / Anthony,

This is my current patch queue for ppc.  Please pull.

Alex


The following changes since commit fc8ead74674b7129e8f31c2595c76658e5622197:

  Merge remote-tracking branch 'qemu-kvm/uq/master' into staging (2013-10-18 
10:03:24 -0700)

are available in the git repository at:


  git://github.com/agraf/qemu.git ppc-for-upstream

for you to fetch changes up to 3bbf37f2692652cc9d48030a9e7f34e2207429f6:

  spapr: Use DeviceClass::fw_name for device tree CPU node (2013-10-25 23:25:48 
+0200)


Alexander Graf (1):
  PPC: Fix L2CR write accesses

Alexey Kardashevskiy (14):
  pseries: Update SLOF firmware image
  spapr: increase temporary fdt buffer size
  spapr: Add ibm, purr property on power7 and newer
  spapr-rtas: fix h_rtas parameters reading
  xics: move reset and cpu_setup
  spapr: move cpu_setup after kvmppc_set_papr
  xics: replace fprintf with error_report
  xics: add pre_save/post_load dispatchers
  xics: convert init() to realize()
  xics: add missing const specifiers to TypeInfo
  xics: split to xics and xics-common
  xics: add cpu_setup callback
  xics-kvm: enable irqfd for MSI
  spapr-pci: enable irqfd for INTx

Andreas Färber (2):
  target-ppc: Fill in OpenFirmware names for some PowerPCCPU families
  spapr: Use DeviceClass::fw_name for device tree CPU node

Aneesh Kumar K.V (5):
  target-ppc: Update slb array with correct index values.
  target-ppc: Check for error on address translation in memsave command
  target-ppc: Use #define for max slb entries
  dump-guest-memory: Check for the correct return value
  target-ppc: dump-guest-memory support

Benjamin Herrenschmidt (3):
  pseries: Fix loading of little endian kernels
  xics: Implement H_IPOLL
  xics: Implement H_XIRR_X

David Gibson (2):
  target-ppc: Add helper for KVM_PPC_RTAS_DEFINE_TOKEN
  xics-kvm: Support for in-kernel XICS interrupt controller

Tom Musta (2):
  ppc: Add CFAR, DAR and DSISR to the dictionary of printable registers
  target-ppc: Little Endian Correction to Load/Store Vector Element

 cpus.c|   5 +-
 default-configs/ppc64-softmmu.mak |   1 +
 dump.c|   4 +-
 hw/intc/Makefile.objs |   1 +
 hw/intc/xics.c| 327 -
 hw/intc/xics_kvm.c| 494 ++
 hw/ppc/spapr.c|  72 --
 hw/ppc/spapr_hcall.c  |   6 +-
 hw/ppc/spapr_pci.c|  13 +
 include/elf.h |   3 +
 include/hw/ppc/spapr.h|  11 +-
 include/hw/ppc/xics.h |  57 +
 monitor.c |   3 +
 pc-bios/README|   2 +-
 pc-bios/slof.bin                  | Bin 909720 -> 875424 bytes
 roms/SLOF |   2 +-
 target-ppc/Makefile.objs  |   2 +-
 target-ppc/arch_dump.c| 253 +++
 target-ppc/cpu-qom.h  |   5 +-
 target-ppc/cpu.h  |   3 +-
 target-ppc/kvm.c  |  35 ++-
 target-ppc/kvm_ppc.h  |   7 +
 target-ppc/machine.c  |   2 +-
 target-ppc/mem_helper.c   |   2 +
 target-ppc/translate_init.c   |  38 ++-
 25 files changed, 1235 insertions(+), 113 deletions(-)
 create mode 100644 hw/intc/xics_kvm.c
 create mode 100644 target-ppc/arch_dump.c

[Qemu-devel] [PULL 17/29] xics: add cpu_setup callback

2013-10-25 Thread Alexander Graf

From: Alexey Kardashevskiy a...@ozlabs.ru

This adds a cpu_setup callback to the XICS device class (as XICS-KVM
will do it differently); xics_cpu_setup() calls it if it is set.

Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
Signed-off-by: Alexander Graf ag...@suse.de
---
 hw/intc/xics.c| 5 +
 include/hw/ppc/xics.h | 1 +
 2 files changed, 6 insertions(+)

diff --git a/hw/intc/xics.c b/hw/intc/xics.c
index 5ed2618..1c6e6f5 100644
--- a/hw/intc/xics.c
+++ b/hw/intc/xics.c
@@ -37,9 +37,14 @@ void xics_cpu_setup(XICSState *icp, PowerPCCPU *cpu)
     CPUState *cs = CPU(cpu);
     CPUPPCState *env = &cpu->env;
     ICPState *ss = &icp->ss[cs->cpu_index];
+    XICSStateClass *info = XICS_COMMON_GET_CLASS(icp);
 
     assert(cs->cpu_index < icp->nr_servers);
 
+    if (info->cpu_setup) {
+        info->cpu_setup(icp, cpu);
+    }
+
     switch (PPC_INPUT(env)) {
     case PPC_FLAGS_INPUT_POWER7:
         ss->output = env->irq_inputs[POWER7_INPUT_INT];
diff --git a/include/hw/ppc/xics.h b/include/hw/ppc/xics.h
index 7e702a0..343bba8 100644
--- a/include/hw/ppc/xics.h
+++ b/include/hw/ppc/xics.h
@@ -64,6 +64,7 @@ typedef struct ICSIRQState ICSIRQState;
 struct XICSStateClass {
     DeviceClass parent_class;
 
+    void (*cpu_setup)(XICSState *icp, PowerPCCPU *cpu);
     void (*set_nr_irqs)(XICSState *icp, uint32_t nr_irqs, Error **errp);
     void (*set_nr_servers)(XICSState *icp, uint32_t nr_servers, Error **errp);
 };
-- 
1.8.1.4
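
A minimal sketch, not part of the patch, of the optional-callback pattern the commit message describes: the common setup routine calls a class hook only when a subclass has set it. All Demo* and demo_* names are hypothetical stand-ins for XICSState, XICSStateClass and xics_cpu_setup(); the QOM class machinery is not modeled.

```c
#include <stddef.h>

typedef struct DemoState DemoState;

/* Stand-in for XICSStateClass: one optional hook. */
typedef struct DemoClass {
    /* NULL means the subclass has nothing extra to do. */
    void (*cpu_setup)(DemoState *s, int cpu_index);
} DemoClass;

/* Stand-in for XICSState, with counters so the behavior is observable. */
struct DemoState {
    const DemoClass *klass;
    int hook_calls;     /* how often the class hook ran */
    int common_calls;   /* how often the common path ran */
};

/* A hook such as a KVM-backed subclass might install. */
void demo_kvm_cpu_setup(DemoState *s, int cpu_index)
{
    (void)cpu_index;
    s->hook_calls++;
}

const DemoClass demo_plain_class = { NULL };
const DemoClass demo_kvm_class = { demo_kvm_cpu_setup };

/* Same shape as the patched function: hook first (if set), then common work. */
void demo_cpu_setup(DemoState *s, int cpu_index)
{
    if (s->klass->cpu_setup) {
        s->klass->cpu_setup(s, cpu_index);
    }
    s->common_calls++;
}
```

With the plain class only the common path runs; with the KVM-style class the hook runs once before it.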

[Qemu-devel] [PULL 02/29] pseries: Fix loading of little endian kernels

2013-10-25 Thread Alexander Graf

From: Benjamin Herrenschmidt b...@kernel.crashing.org

Try loading the kernel as little endian if it fails big endian.

Signed-off-by: Benjamin Herrenschmidt b...@kernel.crashing.org
Reviewed-by: Anton Blanchard an...@samba.org
Signed-off-by: Alexander Graf ag...@suse.de
---
 hw/ppc/spapr.c | 13 -
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 004184d..5bf6c3b 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -273,6 +273,7 @@ static void *spapr_create_fdt_skel(const char *cpu_model,
hwaddr initrd_base,
hwaddr initrd_size,
hwaddr kernel_size,
+   bool little_endian,
const char *boot_device,
const char *kernel_cmdline,
uint32_t epow_irq)
@@ -326,6 +327,9 @@ static void *spapr_create_fdt_skel(const char *cpu_model,
   cpu_to_be64(kernel_size) };
 
         _FDT((fdt_property(fdt, "qemu,boot-kernel", &kprop, sizeof(kprop))));
+        if (little_endian) {
+            _FDT((fdt_property(fdt, "qemu,boot-kernel-le", NULL, 0)));
+        }
     }
     if (boot_device) {
         _FDT((fdt_property_string(fdt, "qemu,boot-device", boot_device)));
@@ -1102,6 +1106,7 @@ static void ppc_spapr_init(QEMUMachineInitArgs *args)
 uint32_t initrd_base = 0;
 long kernel_size = 0, initrd_size = 0;
 long load_limit, rtas_limit, fw_size;
+bool kernel_le = false;
 char *filename;
 
 msi_supported = true;
@@ -1282,6 +1287,12 @@ static void ppc_spapr_init(QEMUMachineInitArgs *args)
         kernel_size = load_elf(kernel_filename, translate_kernel_address, NULL,
                                NULL, &lowaddr, NULL, 1, ELF_MACHINE, 0);
         if (kernel_size < 0) {
+            kernel_size = load_elf(kernel_filename,
+                                   translate_kernel_address, NULL,
+                                   NULL, &lowaddr, NULL, 0, ELF_MACHINE, 0);
+            kernel_le = kernel_size > 0;
+        }
+        if (kernel_size < 0) {
             kernel_size = load_image_targphys(kernel_filename,
                                               KERNEL_LOAD_ADDR,
                                               load_limit - KERNEL_LOAD_ADDR);
@@ -1331,7 +1342,7 @@ static void ppc_spapr_init(QEMUMachineInitArgs *args)
     /* Prepare the device tree */
     spapr->fdt_skel = spapr_create_fdt_skel(cpu_model,
                                             initrd_base, initrd_size,
-                                            kernel_size,
+                                            kernel_size, kernel_le,
                                             boot_device, kernel_cmdline,
                                             spapr->epow_irq);
     assert(spapr->fdt_skel != NULL);
-- 
1.8.1.4
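
A minimal sketch, not part of the patch, of the "try big endian, then little endian" fallback described above. demo_load_elf() is a hypothetical stand-in for QEMU's load_elf(): it only inspects the ELF identification bytes (EI_DATA at index 5, 1 for little endian, 2 for big endian) and loads nothing.

```c
#include <stdbool.h>
#include <stddef.h>

#define DEMO_EI_NIDENT   16  /* size of the ELF identification array */
#define DEMO_EI_DATA     5   /* index of the data-encoding byte */
#define DEMO_ELFDATA2LSB 1   /* little endian */
#define DEMO_ELFDATA2MSB 2   /* big endian */

/* Hypothetical loader: returns the image size, or -1 when the image is not
 * an ELF file of the requested byte order. */
long demo_load_elf(const unsigned char *img, size_t len, bool big_endian)
{
    unsigned char want = big_endian ? DEMO_ELFDATA2MSB : DEMO_ELFDATA2LSB;

    if (len < DEMO_EI_NIDENT || img[0] != 0x7f || img[1] != 'E' ||
        img[2] != 'L' || img[3] != 'F') {
        return -1;
    }
    return img[DEMO_EI_DATA] == want ? (long)len : -1;
}

/* Mirrors the patched flow: big endian first, little endian as fallback,
 * and remember which one succeeded. */
long demo_load_kernel(const unsigned char *img, size_t len, bool *little_endian)
{
    long size = demo_load_elf(img, len, true);

    *little_endian = false;
    if (size < 0) {
        size = demo_load_elf(img, len, false);
        *little_endian = size > 0;
    }
    return size;
}
```

In the patch a third attempt (a raw image load) follows when both ELF attempts fail; the sketch simply returns the negative size in that case.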

[Qemu-devel] [PULL 20/29] xics: Implement H_XIRR_X

2013-10-25 Thread Alexander Graf

From: Benjamin Herrenschmidt b...@kernel.crashing.org

This implements H_XIRR_X hypercall in addition to H_XIRR as
it is mandatory for PAPR+ and there is no way for the guest to
detect whether it is supported or not so just add it.

As the Partition Adjunct Option is not supported at the moment,
the CPPR parameter of the hypercall is ignored.

Signed-off-by: Benjamin Herrenschmidt b...@kernel.crashing.org
Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
Signed-off-by: Alexander Graf ag...@suse.de
---
 hw/intc/xics.c | 14 ++
 include/hw/ppc/spapr.h |  1 +
 2 files changed, 15 insertions(+)

diff --git a/hw/intc/xics.c b/hw/intc/xics.c
index eb93276..a05 100644
--- a/hw/intc/xics.c
+++ b/hw/intc/xics.c
@@ -27,6 +27,7 @@
 
 #include "hw/hw.h"
 #include "trace.h"
+#include "qemu/timer.h"
 #include "hw/ppc/spapr.h"
 #include "hw/ppc/xics.h"
 #include "qemu/error-report.h"
@@ -679,6 +680,18 @@ static target_ulong h_xirr(PowerPCCPU *cpu, 
sPAPREnvironment *spapr,
 return H_SUCCESS;
 }
 
+static target_ulong h_xirr_x(PowerPCCPU *cpu, sPAPREnvironment *spapr,
+                             target_ulong opcode, target_ulong *args)
+{
+    CPUState *cs = CPU(cpu);
+    ICPState *ss = &spapr->icp->ss[cs->cpu_index];
+    uint32_t xirr = icp_accept(ss);
+
+    args[0] = xirr;
+    args[1] = cpu_get_real_ticks();
+    return H_SUCCESS;
+}
+
 static target_ulong h_eoi(PowerPCCPU *cpu, sPAPREnvironment *spapr,
   target_ulong opcode, target_ulong *args)
 {
@@ -853,6 +866,7 @@ static void xics_realize(DeviceState *dev, Error **errp)
 spapr_register_hypercall(H_CPPR, h_cppr);
 spapr_register_hypercall(H_IPI, h_ipi);
 spapr_register_hypercall(H_XIRR, h_xirr);
+spapr_register_hypercall(H_XIRR_X, h_xirr_x);
 spapr_register_hypercall(H_EOI, h_eoi);
 spapr_register_hypercall(H_IPOLL, h_ipoll);
 
diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index 6407c8a..5ae0b58 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -283,6 +283,7 @@ typedef struct sPAPREnvironment {
 #define H_GET_EM_PARMS          0x2B8
 #define H_SET_MPP               0x2D0
 #define H_GET_MPP               0x2D4
+#define H_XIRR_X                0x2FC
 #define H_SET_MODE              0x31C
 #define MAX_HCALL_OPCODE        H_SET_MODE
 
-- 
1.8.1.4

[Qemu-devel] [PULL 25/29] target-ppc: Use #define for max slb entries

2013-10-25 Thread Alexander Graf

From: Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com

Instead of open-coding 64, use MAX_SLB_ENTRIES. We don't update the kernel
header here.

Signed-off-by: Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com
Signed-off-by: Alexander Graf ag...@suse.de
---
 target-ppc/cpu.h | 3 ++-
 target-ppc/kvm.c | 4 ++--
 target-ppc/machine.c | 2 +-
 3 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/target-ppc/cpu.h b/target-ppc/cpu.h
index 422a6bb..26acdba 100644
--- a/target-ppc/cpu.h
+++ b/target-ppc/cpu.h
@@ -405,6 +405,7 @@ struct ppc_slb_t {
 uint64_t vsid;
 };
 
+#define MAX_SLB_ENTRIES         64
 #define SEGMENT_SHIFT_256M      28
 #define SEGMENT_MASK_256M       (~((1ULL << SEGMENT_SHIFT_256M) - 1))
 
@@ -949,7 +950,7 @@ struct CPUPPCState {
 #if !defined(CONFIG_USER_ONLY)
 #if defined(TARGET_PPC64)
 /* PowerPC 64 SLB area */
-ppc_slb_t slb[64];
+ppc_slb_t slb[MAX_SLB_ENTRIES];
 int32_t slb_nr;
 #endif
 /* segment registers */
diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
index e2f8b03..b77ce5e 100644
--- a/target-ppc/kvm.c
+++ b/target-ppc/kvm.c
@@ -818,7 +818,7 @@ int kvm_arch_put_registers(CPUState *cs, int level)
 
 /* Sync SLB */
 #ifdef TARGET_PPC64
-        for (i = 0; i < 64; i++) {
+        for (i = 0; i < ARRAY_SIZE(env->slb); i++) {
             sregs.u.s.ppc64.slb[i].slbe = env->slb[i].esid;
             sregs.u.s.ppc64.slb[i].slbv = env->slb[i].vsid;
         }
@@ -1040,7 +1040,7 @@ int kvm_arch_get_registers(CPUState *cs)
  * back in.
  */
         memset(env->slb, 0, sizeof(env->slb));
-        for (i = 0; i < 64; i++) {
+        for (i = 0; i < ARRAY_SIZE(env->slb); i++) {
             target_ulong rb = sregs.u.s.ppc64.slb[i].slbe;
             target_ulong rs = sregs.u.s.ppc64.slb[i].slbv;
 /*
diff --git a/target-ppc/machine.c b/target-ppc/machine.c
index 12e1512..12c174f 100644
--- a/target-ppc/machine.c
+++ b/target-ppc/machine.c
@@ -312,7 +312,7 @@ static const VMStateDescription vmstate_slb = {
 .minimum_version_id_old = 1,
 .fields  = (VMStateField []) {
 VMSTATE_INT32_EQUAL(env.slb_nr, PowerPCCPU),
-VMSTATE_SLB_ARRAY(env.slb, PowerPCCPU, 64),
+VMSTATE_SLB_ARRAY(env.slb, PowerPCCPU, MAX_SLB_ENTRIES),
 VMSTATE_END_OF_LIST()
 }
 };
-- 
1.8.1.4
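
A minimal sketch, not part of the patch, of why deriving the loop bound from the array itself is safer than repeating the literal 64. The DEMO_* macros and Demo* types are hypothetical; DEMO_ARRAY_SIZE is the usual sizeof-based idiom and is assumed to match what QEMU's ARRAY_SIZE does.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_SLB_ENTRIES 64
/* Element count of a real array (not a pointer). */
#define DEMO_ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))

typedef struct {
    uint64_t esid;
    uint64_t vsid;
} demo_slb_t;

typedef struct {
    demo_slb_t slb[DEMO_MAX_SLB_ENTRIES];
} DemoEnv;

/* Counts entries with a non-zero esid. The bound follows the array
 * declaration, so changing DEMO_MAX_SLB_ENTRIES needs no edit here. */
size_t demo_count_used(const DemoEnv *env)
{
    size_t i, n = 0;

    for (i = 0; i < DEMO_ARRAY_SIZE(env->slb); i++) {
        if (env->slb[i].esid != 0) {
            n++;
        }
    }
    return n;
}
```

The idiom only works on true arrays; applied to a pointer it silently yields the wrong count, which is one reason the struct member is an array of fixed size.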

[Qemu-devel] [PULL 03/29] ppc: Add CFAR, DAR and DSISR to the dictionary of printable registers

2013-10-25 Thread Alexander Graf

From: Tom Musta tommu...@gmail.com

The CFAR, DAR and DSISR registers are currently missing from the
dictionary of registers that may be printed in the QEMU console.
These are interesting registers when debugging.  With this patch,
the following commands work properly:

 (qemu) print $cfar
 (qemu) print $dar
 (qemu) print $dsisr

Signed-off-by: Tom Musta tommu...@gmail.com
Reviewed-by: Anton Blanchard an...@samba.org
Signed-off-by: Alexander Graf ag...@suse.de
---
 monitor.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/monitor.c b/monitor.c
index 74f3f1b..b02b21c 100644
--- a/monitor.c
+++ b/monitor.c
@@ -3186,6 +3186,9 @@ static const MonitorDef monitor_defs[] = {
 
     { "srr0", offsetof(CPUPPCState, spr[SPR_SRR0]) },
     { "srr1", offsetof(CPUPPCState, spr[SPR_SRR1]) },
+    { "dar", offsetof(CPUPPCState, spr[SPR_DAR]) },
+    { "dsisr", offsetof(CPUPPCState, spr[SPR_DSISR]) },
+    { "cfar", offsetof(CPUPPCState, spr[SPR_CFAR]) },
     { "sprg0", offsetof(CPUPPCState, spr[SPR_SPRG0]) },
     { "sprg1", offsetof(CPUPPCState, spr[SPR_SPRG1]) },
     { "sprg2", offsetof(CPUPPCState, spr[SPR_SPRG2]) },
-- 
1.8.1.4
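
A minimal sketch, not part of the patch, of how a name-to-offset dictionary like monitor_defs[] can resolve "print $dar": look the name up, then read the value at the recorded offset inside the CPU state. All Demo* and demo_* names are hypothetical, and the real monitor code has more fields and a getter hook that are not modeled. Using an array subscript inside offsetof() relies on a compiler extension that GCC and Clang accept, as the patch itself does.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { DEMO_SPR_DAR, DEMO_SPR_DSISR, DEMO_SPR_CFAR, DEMO_SPR_COUNT };

typedef struct {
    uint64_t spr[DEMO_SPR_COUNT];
} DemoCPUState;

typedef struct {
    const char *name;
    size_t offset;      /* byte offset of the register in DemoCPUState */
} DemoMonitorDef;

const DemoMonitorDef demo_defs[] = {
    { "dar",   offsetof(DemoCPUState, spr[DEMO_SPR_DAR]) },
    { "dsisr", offsetof(DemoCPUState, spr[DEMO_SPR_DSISR]) },
    { "cfar",  offsetof(DemoCPUState, spr[DEMO_SPR_CFAR]) },
    { NULL, 0 },        /* end marker */
};

/* Returns 0 and stores the register value, or -1 for an unknown name. */
int demo_get_register(const DemoCPUState *env, const char *name, uint64_t *val)
{
    const DemoMonitorDef *d;

    for (d = demo_defs; d->name; d++) {
        if (strcmp(d->name, name) == 0) {
            memcpy(val, (const char *)env + d->offset, sizeof(*val));
            return 0;
        }
    }
    return -1;
}
```

A register that is missing from the table, as CFAR, DAR and DSISR were before this patch, simply fails the lookup.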

[Qemu-devel] [PULL 19/29] xics: Implement H_IPOLL

2013-10-25 Thread Alexander Graf

From: Benjamin Herrenschmidt b...@kernel.crashing.org

This adds support for the H_IPOLL hypercall which the guest
uses to poll for a pending interrupt. This hypercall is
mandatory for PAPR+ and there is no way for the guest to
detect whether it is supported or not so just add it.

Signed-off-by: Benjamin Herrenschmidt b...@kernel.crashing.org
Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
Acked-by: Alexander Graf ag...@suse.de
Signed-off-by: Alexander Graf ag...@suse.de
---
 hw/intc/xics.c | 13 +
 1 file changed, 13 insertions(+)

diff --git a/hw/intc/xics.c b/hw/intc/xics.c
index 1c6e6f5..eb93276 100644
--- a/hw/intc/xics.c
+++ b/hw/intc/xics.c
@@ -689,6 +689,18 @@ static target_ulong h_eoi(PowerPCCPU *cpu, 
sPAPREnvironment *spapr,
 return H_SUCCESS;
 }
 
+static target_ulong h_ipoll(PowerPCCPU *cpu, sPAPREnvironment *spapr,
+                            target_ulong opcode, target_ulong *args)
+{
+    CPUState *cs = CPU(cpu);
+    ICPState *ss = &spapr->icp->ss[cs->cpu_index];
+
+    args[0] = ss->xirr;
+    args[1] = ss->mfrr;
+
+    return H_SUCCESS;
+}
+
 static void rtas_set_xive(PowerPCCPU *cpu, sPAPREnvironment *spapr,
   uint32_t token,
   uint32_t nargs, target_ulong args,
@@ -842,6 +854,7 @@ static void xics_realize(DeviceState *dev, Error **errp)
 spapr_register_hypercall(H_IPI, h_ipi);
 spapr_register_hypercall(H_XIRR, h_xirr);
 spapr_register_hypercall(H_EOI, h_eoi);
+spapr_register_hypercall(H_IPOLL, h_ipoll);
 
     object_property_set_bool(OBJECT(icp->ics), true, "realized", &error);
 if (error) {
-- 
1.8.1.4

[Qemu-devel] [PULL 05/29] PPC: Fix L2CR write accesses

2013-10-25 Thread Alexander Graf

Commit 2345f1c01 was supposed to render L2CR writes into noops. Instead,
it made them illegal instruction traps which apparently didn't confuse
XNU, but can easily confuse other OSs.

Fix it up by actually doing nothing when we write to L2CR.

Reported-by: Julio Guerra gu...@julio.in
Signed-off-by: Alexander Graf ag...@suse.de
Tested-by: Julio Guerra gu...@julio.in
---
 target-ppc/translate_init.c | 29 +
 1 file changed, 17 insertions(+), 12 deletions(-)

diff --git a/target-ppc/translate_init.c b/target-ppc/translate_init.c
index 651da6b..807dab3 100644
--- a/target-ppc/translate_init.c
+++ b/target-ppc/translate_init.c
@@ -108,6 +108,11 @@ static void spr_write_clear (void *opaque, int sprn, int 
gprn)
 tcg_temp_free(t0);
 tcg_temp_free(t1);
 }
+
+static void spr_access_nop(void *opaque, int sprn, int gprn)
+{
+}
+
 #endif
 
 /* SPR common to all PowerPC */
@@ -1382,7 +1387,7 @@ static void gen_spr_74xx (CPUPPCState *env)
 /* XXX : not implemented */
     spr_register(env, SPR_L2CR, "L2CR",
                  SPR_NOACCESS, SPR_NOACCESS,
-                 spr_read_generic, NULL,
+                 spr_read_generic, spr_access_nop,
                  0x00000000);
     /* Not strictly an SPR */
     vscr_init(env, 0x00010000);
@@ -5170,7 +5175,7 @@ static void init_proc_750 (CPUPPCState *env)
 /* XXX : not implemented */
     spr_register(env, SPR_L2CR, "L2CR",
                  SPR_NOACCESS, SPR_NOACCESS,
-                 spr_read_generic, NULL,
+                 spr_read_generic, spr_access_nop,
                  0x00000000);
 /* Time base */
 gen_tbl(env);
@@ -5233,7 +5238,7 @@ static void init_proc_750cl (CPUPPCState *env)
 /* XXX : not implemented */
     spr_register(env, SPR_L2CR, "L2CR",
                  SPR_NOACCESS, SPR_NOACCESS,
-                 spr_read_generic, NULL,
+                 spr_read_generic, spr_access_nop,
                  0x00000000);
 /* Time base */
 gen_tbl(env);
@@ -5419,7 +5424,7 @@ static void init_proc_750cx (CPUPPCState *env)
 /* XXX : not implemented */
     spr_register(env, SPR_L2CR, "L2CR",
                  SPR_NOACCESS, SPR_NOACCESS,
-                 spr_read_generic, NULL,
+                 spr_read_generic, spr_access_nop,
                  0x00000000);
 /* Time base */
 gen_tbl(env);
@@ -5486,7 +5491,7 @@ static void init_proc_750fx (CPUPPCState *env)
 /* XXX : not implemented */
     spr_register(env, SPR_L2CR, "L2CR",
                  SPR_NOACCESS, SPR_NOACCESS,
-                 spr_read_generic, NULL,
+                 spr_read_generic, spr_access_nop,
                  0x00000000);
 /* Time base */
 gen_tbl(env);
@@ -5558,7 +5563,7 @@ static void init_proc_750gx (CPUPPCState *env)
 /* XXX : not implemented (XXX: different from 750fx) */
     spr_register(env, SPR_L2CR, "L2CR",
                  SPR_NOACCESS, SPR_NOACCESS,
-                 spr_read_generic, NULL,
+                 spr_read_generic, spr_access_nop,
                  0x00000000);
 /* Time base */
 gen_tbl(env);
@@ -5694,7 +5699,7 @@ static void init_proc_755 (CPUPPCState *env)
 /* XXX : not implemented */
     spr_register(env, SPR_L2CR, "L2CR",
                  SPR_NOACCESS, SPR_NOACCESS,
-                 spr_read_generic, NULL,
+                 spr_read_generic, spr_access_nop,
                  0x00000000);
     /* XXX : not implemented */
     spr_register(env, SPR_L2PMCR, "L2PMCR",
@@ -6650,7 +6655,7 @@ static void init_proc_970 (CPUPPCState *env)
 /* XXX : not implemented */
     spr_register(env, SPR_L2CR, "L2CR",
                  SPR_NOACCESS, SPR_NOACCESS,
-                 spr_read_generic, NULL,
+                 spr_read_generic, spr_access_nop,
                  0x00000000);
 /* Memory management */
 /* XXX: not correct */
@@ -6750,7 +6755,7 @@ static void init_proc_970FX (CPUPPCState *env)
 /* XXX : not implemented */
     spr_register(env, SPR_L2CR, "L2CR",
                  SPR_NOACCESS, SPR_NOACCESS,
-                 spr_read_generic, NULL,
+                 spr_read_generic, spr_access_nop,
                  0x00000000);
 /* Memory management */
 /* XXX: not correct */
@@ -6862,7 +6867,7 @@ static void init_proc_970GX (CPUPPCState *env)
 /* XXX : not implemented */
     spr_register(env, SPR_L2CR, "L2CR",
                  SPR_NOACCESS, SPR_NOACCESS,
-                 spr_read_generic, NULL,
+                 spr_read_generic, spr_access_nop,
                  0x00000000);
 /* Memory management */
 /* XXX: not correct */
@@ -6962,7 +6967,7 @@ static void init_proc_970MP (CPUPPCState *env)
 /* XXX : not implemented */
     spr_register(env, SPR_L2CR, "L2CR",
                  SPR_NOACCESS, SPR_NOACCESS,
-                 spr_read_generic, NULL,
+                 spr_read_generic, spr_access_nop,
                  0x00000000);
 /* Memory management */
 /* XXX: not correct */
@@ -7054,7 +7059,7 @@ static void init_proc_power5plus(CPUPPCState *env
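
A minimal sketch, not part of the patch, of the difference the commit message describes: a missing (NULL) write handler makes the write trap, while an explicit empty handler makes it a no-op. All demo_* names are hypothetical, and the handlers here act on a plain variable; in QEMU the registered callbacks emit translated code instead.

```c
#include <stddef.h>

typedef enum { DEMO_OK, DEMO_ILLEGAL_INSN } DemoResult;

typedef void (*demo_spr_write_fn)(unsigned *reg, unsigned val);

/* Ordinary handler: the write lands in the register. */
void demo_spr_write_generic(unsigned *reg, unsigned val)
{
    *reg = val;
}

/* Counterpart of spr_access_nop(): accept the write and discard it. */
void demo_spr_access_nop(unsigned *reg, unsigned val)
{
    (void)reg;
    (void)val;
}

/* Dispatch a register write: no handler at all is reported as an illegal
 * instruction, any handler (even an empty one) counts as success. */
DemoResult demo_mtspr(demo_spr_write_fn fn, unsigned *reg, unsigned val)
{
    if (fn == NULL) {
        return DEMO_ILLEGAL_INSN;
    }
    fn(reg, val);
    return DEMO_OK;
}
```

Both the NULL case and the no-op case leave the register unchanged; only the result reported to the guest differs, which is what the fix relies on.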

[Qemu-devel] [PULL 22/29] spapr-pci: enable irqfd for INTx

2013-10-25 Thread Alexander Graf

From: Alexey Kardashevskiy a...@ozlabs.ru

This enables IRQFD for LSI (level triggered INTx interrupts) by adding
a spapr_route_intx_pin_to_irq() callback to the sPAPR PCI host bus. This
callback is called to know the global interrupt number to link resampling fd
with IRQFD's fd in KVM.

Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
Signed-off-by: Alexander Graf ag...@suse.de
---
 hw/ppc/spapr_pci.c | 13 +
 1 file changed, 13 insertions(+)

diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
index 9b6ee32..edb4cb0 100644
--- a/hw/ppc/spapr_pci.c
+++ b/hw/ppc/spapr_pci.c
@@ -432,6 +432,17 @@ static void pci_spapr_set_irq(void *opaque, int irq_num, 
int level)
 qemu_set_irq(spapr_phb_lsi_qirq(phb, irq_num), level);
 }
 
+static PCIINTxRoute spapr_route_intx_pin_to_irq(void *opaque, int pin)
+{
+    sPAPRPHBState *sphb = SPAPR_PCI_HOST_BRIDGE(opaque);
+    PCIINTxRoute route;
+
+    route.mode = PCI_INTX_ENABLED;
+    route.irq = sphb->lsi_table[pin].irq;
+
+    return route;
+}
+
 /*
  * MSI/MSIX memory region implementation.
  * The handler handles both MSI and MSIX.
@@ -610,6 +621,8 @@ static int spapr_phb_init(SysBusDevice *s)
 
 pci_setup_iommu(bus, spapr_pci_dma_iommu, sphb);
 
+pci_bus_set_route_irq_fn(bus, spapr_route_intx_pin_to_irq);
+
     QLIST_INSERT_HEAD(&spapr->phbs, sphb, list);
 
 /* Initialize the LSI table */
-- 
1.8.1.4

[Qemu-devel] [PULL 10/29] xics: move reset and cpu_setup

2013-10-25 Thread Alexander Graf

From: Alexey Kardashevskiy a...@ozlabs.ru

This simple change makes the following patches nicer.

Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
Acked-by: David Gibson da...@gibson.dropbear.id.au
Signed-off-by: Alexander Graf ag...@suse.de
---
 hw/intc/xics.c | 72 +-
 1 file changed, 36 insertions(+), 36 deletions(-)

diff --git a/hw/intc/xics.c b/hw/intc/xics.c
index bb018d1..a0d71ef 100644
--- a/hw/intc/xics.c
+++ b/hw/intc/xics.c
@@ -30,6 +30,42 @@
 #include "hw/ppc/spapr.h"
 #include "hw/ppc/xics.h"
 
+void xics_cpu_setup(XICSState *icp, PowerPCCPU *cpu)
+{
+    CPUState *cs = CPU(cpu);
+    CPUPPCState *env = &cpu->env;
+    ICPState *ss = &icp->ss[cs->cpu_index];
+
+    assert(cs->cpu_index < icp->nr_servers);
+
+    switch (PPC_INPUT(env)) {
+    case PPC_FLAGS_INPUT_POWER7:
+        ss->output = env->irq_inputs[POWER7_INPUT_INT];
+        break;
+
+    case PPC_FLAGS_INPUT_970:
+        ss->output = env->irq_inputs[PPC970_INPUT_INT];
+        break;
+
+    default:
+        fprintf(stderr, "XICS interrupt controller does not support this CPU "
+                "bus model\n");
+        abort();
+    }
+}
+
+static void xics_reset(DeviceState *d)
+{
+    XICSState *icp = XICS(d);
+    int i;
+
+    for (i = 0; i < icp->nr_servers; i++) {
+        device_reset(DEVICE(&icp->ss[i]));
+    }
+
+    device_reset(DEVICE(icp->ics));
+}
+
 /*
  * ICP: Presentation layer
  */
@@ -600,42 +636,6 @@ static void rtas_int_on(PowerPCCPU *cpu, sPAPREnvironment 
*spapr,
  * XICS
  */
 
-static void xics_reset(DeviceState *d)
-{
-    XICSState *icp = XICS(d);
-    int i;
-
-    for (i = 0; i < icp->nr_servers; i++) {
-        device_reset(DEVICE(&icp->ss[i]));
-    }
-
-    device_reset(DEVICE(icp->ics));
-}
-
-void xics_cpu_setup(XICSState *icp, PowerPCCPU *cpu)
-{
-    CPUState *cs = CPU(cpu);
-    CPUPPCState *env = &cpu->env;
-    ICPState *ss = &icp->ss[cs->cpu_index];
-
-    assert(cs->cpu_index < icp->nr_servers);
-
-    switch (PPC_INPUT(env)) {
-    case PPC_FLAGS_INPUT_POWER7:
-        ss->output = env->irq_inputs[POWER7_INPUT_INT];
-        break;
-
-    case PPC_FLAGS_INPUT_970:
-        ss->output = env->irq_inputs[PPC970_INPUT_INT];
-        break;
-
-    default:
-        fprintf(stderr, "XICS interrupt controller does not support this CPU "
-                "bus model\n");
-        abort();
-    }
-}
-
 static void xics_realize(DeviceState *dev, Error **errp)
 {
 XICSState *icp = XICS(dev);
-- 
1.8.1.4

[Qemu-devel] [PULL 04/29] target-ppc: Little Endian Correction to Load/Store Vector Element

2013-10-25 Thread Alexander Graf

From: Tom Musta tommu...@gmail.com

The Load Vector Element (lve*x) and Store Vector Element (stve*x)
instructions not only byte-swap in Little Endian mode, they also
invert the element that is accessed. For example, the RTL for
lvehx contains this:

 eb <-- EA[60:63]
 if Big-Endian byte ordering then
     VRT[8*eb:8*eb+15] <-- MEM(EA,2)
 else
     VRT[112-(8*eb):127-(8*eb)] <-- MEM(EA,2)

This patch adds the element inversion, as described in the last line
of the RTL.

Signed-off-by: Tom Musta tommu...@gmail.com
Reviewed-by: Anton Blanchard an...@samba.org
Signed-off-by: Alexander Graf ag...@suse.de
---
 target-ppc/mem_helper.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/target-ppc/mem_helper.c b/target-ppc/mem_helper.c
index d8e63ca..f35ed03 100644
--- a/target-ppc/mem_helper.c
+++ b/target-ppc/mem_helper.c
@@ -212,6 +212,7 @@ target_ulong helper_lscbx(CPUPPCState *env, target_ulong 
addr, uint32_t reg,
         int index = (addr & 0xf) >> sh;                         \
                                                                 \
         if (msr_le) {                                           \
+            index = n_elems - index - 1;                        \
             r->element[LO_IDX ? index : (adjust - index)] =     \
                 swap(access(env, addr));                        \
         } else {                                                \
@@ -236,6 +237,7 @@ LVE(lvewx, cpu_ldl_data, bswap32, u32)
         int index = (addr & 0xf) >> sh;                         \
                                                                 \
         if (msr_le) {                                           \
+            index = n_elems - index - 1;                        \
             access(env, addr, swap(r->element[LO_IDX ? index :  \
                                               (adjust - index)])); \
         } else {                                                \
-- 
1.8.1.4
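
A minimal sketch, not part of the patch, of the element inversion taken from the RTL above. demo_lve_index() is a hypothetical helper that returns the element number in the architecture's big-endian numbering (element 0 covers VRT bits 0 to 8*size-1); how QEMU maps that number onto host storage (LO_IDX, adjust) is not modeled. elem_bytes is assumed to be 1, 2 or 4.

```c
#include <stdint.h>

/* Which vector element an lve*x / stve*x access touches.
 * ea:            effective address (only the low 4 bits matter)
 * elem_bytes:    1 for lvebx, 2 for lvehx, 4 for lvewx
 * little_endian: non-zero when running in Little Endian mode */
unsigned demo_lve_index(uint64_t ea, unsigned elem_bytes, int little_endian)
{
    unsigned n_elems = 16 / elem_bytes;                /* elements per 16-byte vector */
    unsigned index = (unsigned)(ea & 0xf) / elem_bytes;

    if (little_endian) {
        index = n_elems - index - 1;                   /* the inversion the patch adds */
    }
    return index;
}
```

For lvehx with eb = 4, the big-endian branch of the RTL selects VRT[32:47], which is element 2; the little-endian branch selects VRT[80:95], which is element 5 = 8 - 2 - 1.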

[Qemu-devel] [PULL 14/29] xics: convert init() to realize()

2013-10-25 Thread Alexander Graf

From: Alexey Kardashevskiy a...@ozlabs.ru

This fixes XICS according to the new QOM rules.

This converts ICS's init() callbacks to realize().

This converts legacy qdev_init_nofail() to property_set(realized).

Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
Reviewed-by: Andreas Färber afaer...@suse.de
Signed-off-by: Alexander Graf ag...@suse.de
---
 hw/intc/xics.c | 28 ++--
 1 file changed, 22 insertions(+), 6 deletions(-)

diff --git a/hw/intc/xics.c b/hw/intc/xics.c
index eeb64f5..76654db 100644
--- a/hw/intc/xics.c
+++ b/hw/intc/xics.c
@@ -479,15 +479,17 @@ static const VMStateDescription vmstate_ics = {
 },
 };
 
-static int ics_realize(DeviceState *dev)
+static void ics_realize(DeviceState *dev, Error **errp)
 {
     ICSState *ics = ICS(dev);
 
+    if (!ics->nr_irqs) {
+        error_setg(errp, "Number of interrupts needs to be greater 0");
+        return;
+    }
     ics->irqs = g_malloc0(ics->nr_irqs * sizeof(ICSIRQState));
     ics->islsi = g_malloc0(ics->nr_irqs * sizeof(bool));
     ics->qirqs = qemu_allocate_irqs(ics_set_irq, ics, ics->nr_irqs);
-
-    return 0;
 }
 
 static void ics_class_init(ObjectClass *klass, void *data)
@@ -495,7 +497,7 @@ static void ics_class_init(ObjectClass *klass, void *data)
 DeviceClass *dc = DEVICE_CLASS(klass);
 ICSStateClass *isc = ICS_CLASS(klass);
 
-    dc->init = ics_realize;
+    dc->realize = ics_realize;
     dc->vmsd = &vmstate_ics;
     dc->reset = ics_reset;
     isc->post_load = ics_post_load;
@@ -691,8 +693,14 @@ static void xics_realize(DeviceState *dev, Error **errp)
 {
     XICSState *icp = XICS(dev);
     ICSState *ics = icp->ics;
+    Error *error = NULL;
     int i;
 
+    if (!icp->nr_servers) {
+        error_setg(errp, "Number of servers needs to be greater 0");
+        return;
+    }
+
     /* Registration of global state belongs into realize */
     spapr_rtas_register("ibm,set-xive", rtas_set_xive);
     spapr_rtas_register("ibm,get-xive", rtas_get_xive);
@@ -707,7 +715,11 @@ static void xics_realize(DeviceState *dev, Error **errp)
     ics->nr_irqs = icp->nr_irqs;
     ics->offset = XICS_IRQ_BASE;
     ics->icp = icp;
-    qdev_init_nofail(DEVICE(ics));
+    object_property_set_bool(OBJECT(icp->ics), true, "realized", &error);
+    if (error) {
+        error_propagate(errp, error);
+        return;
+    }
 
     icp->ss = g_malloc0(icp->nr_servers*sizeof(ICPState));
     for (i = 0; i < icp->nr_servers; i++) {
@@ -715,7 +727,11 @@ static void xics_realize(DeviceState *dev, Error **errp)
         object_initialize(&icp->ss[i], sizeof(icp->ss[i]), TYPE_ICP);
         snprintf(buffer, sizeof(buffer), "icp[%d]", i);
         object_property_add_child(OBJECT(icp), buffer, OBJECT(&icp->ss[i]), NULL);
-        qdev_init_nofail(DEVICE(&icp->ss[i]));
+        object_property_set_bool(OBJECT(&icp->ss[i]), true, "realized", &error);
+        if (error) {
+            error_propagate(errp, error);
+            return;
+        }
     }
 }
 
-- 
1.8.1.4
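
A minimal sketch, not part of the patch, of the error-reporting style that realize() brings: a function reports failure through an out-parameter instead of a return code, and a caller that realizes a child forwards the child's error to its own caller. DemoError and the demo_* functions are hypothetical; a message pointer stands in for QEMU's Error object, and the assignment marked below plays the role of error_propagate().

```c
#include <stddef.h>

/* Stand-in for QEMU's Error: NULL means "no error". */
typedef const char *DemoError;

/* Child device: rejects a zero interrupt count. */
void demo_ics_realize(unsigned nr_irqs, DemoError *errp)
{
    if (!nr_irqs) {
        *errp = "Number of interrupts needs to be greater than 0";
        return;
    }
    /* allocation of per-interrupt state would go here */
}

/* Parent device: validates its own input, then realizes the child and
 * forwards any error instead of aborting. */
void demo_xics_realize(unsigned nr_servers, unsigned nr_irqs, DemoError *errp)
{
    DemoError local_err = NULL;

    if (!nr_servers) {
        *errp = "Number of servers needs to be greater than 0";
        return;
    }

    demo_ics_realize(nr_irqs, &local_err);
    if (local_err) {
        *errp = local_err;   /* role of error_propagate() */
        return;
    }
}
```

The point of the conversion is visible in the caller: a bad property value now surfaces as an error the caller can handle, where qdev_init_nofail() would have terminated the process.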

[Qemu-devel] [PULL 09/29] target-ppc: Add helper for KVM_PPC_RTAS_DEFINE_TOKEN

2013-10-25 Thread Alexander Graf

From: David Gibson da...@gibson.dropbear.id.au

Recent PowerKVM allows the kernel to intercept some RTAS calls from the
guest directly.  This is used to implement the more efficient in-kernel
XICS for example.  qemu is still responsible for assigning the RTAS token
numbers however, and needs to tell the kernel which RTAS function name is
assigned to a given token value.  This patch adds a convenience wrapper for
the KVM_PPC_RTAS_DEFINE_TOKEN ioctl() which is used for this purpose.

Signed-off-by: David Gibson da...@gibson.dropbear.id.au
Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
Acked-by: David Gibson da...@gibson.dropbear.id.au
Signed-off-by: Alexander Graf ag...@suse.de
---
 target-ppc/kvm.c | 14 ++
 target-ppc/kvm_ppc.h |  7 +++
 2 files changed, 21 insertions(+)

diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
index 8a196c6..0b5d391 100644
--- a/target-ppc/kvm.c
+++ b/target-ppc/kvm.c
@@ -1789,6 +1789,20 @@ static int kvm_ppc_register_host_cpu_type(void)
 return 0;
 }
 
+int kvmppc_define_rtas_kernel_token(uint32_t token, const char *function)
+{
+    struct kvm_rtas_token_args args = {
+        .token = token,
+    };
+
+    if (!kvm_check_extension(kvm_state, KVM_CAP_PPC_RTAS)) {
+        return -ENOENT;
+    }
+
+    strncpy(args.name, function, sizeof(args.name));
+
+    return kvm_vm_ioctl(kvm_state, KVM_PPC_RTAS_DEFINE_TOKEN, &args);
+}
 
 int kvmppc_get_htab_fd(bool write)
 {
diff --git a/target-ppc/kvm_ppc.h b/target-ppc/kvm_ppc.h
index 4ae7bf2..5f78e4b 100644
--- a/target-ppc/kvm_ppc.h
+++ b/target-ppc/kvm_ppc.h
@@ -38,6 +38,7 @@ uint64_t kvmppc_rma_size(uint64_t current_size, unsigned int 
hash_shift);
 #endif /* !CONFIG_USER_ONLY */
 int kvmppc_fixup_cpu(PowerPCCPU *cpu);
 bool kvmppc_has_cap_epr(void);
+int kvmppc_define_rtas_kernel_token(uint32_t token, const char *function);
 int kvmppc_get_htab_fd(bool write);
 int kvmppc_save_htab(QEMUFile *f, int fd, size_t bufsize, int64_t max_ns);
 int kvmppc_load_htab_chunk(QEMUFile *f, int fd, uint32_t index,
@@ -164,6 +165,12 @@ static inline bool kvmppc_has_cap_epr(void)
 return false;
 }
 
+static inline int kvmppc_define_rtas_kernel_token(uint32_t token,
+  const char *function)
+{
+return -1;
+}
+
 static inline int kvmppc_get_htab_fd(bool write)
 {
 return -1;
-- 
1.8.1.4
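
A minimal sketch, not part of the patch, of filling a fixed-size name field the way the wrapper above prepares its ioctl argument. struct demo_token_args and DEMO_NAME_LEN are hypothetical (the real struct kvm_rtas_token_args comes from the kernel headers), and the explicit length check is an addition of this sketch, not something the patch does.

```c
#include <stdint.h>
#include <string.h>

#define DEMO_NAME_LEN 16   /* hypothetical field size, chosen for the demo */

struct demo_token_args {
    char name[DEMO_NAME_LEN];
    uint64_t token;
};

/* Returns 0 on success, -1 when the name would not fit with its
 * terminating NUL. */
int demo_fill_token_args(struct demo_token_args *args, uint32_t token,
                         const char *function)
{
    memset(args, 0, sizeof(*args));
    args->token = token;

    if (strlen(function) >= sizeof(args->name)) {
        return -1;          /* extra guard, not present in the patch */
    }
    /* The buffer was zeroed above, so the last byte stays NUL. */
    strncpy(args->name, function, sizeof(args->name) - 1);
    return 0;
}
```

strncpy() alone does not terminate the destination when the source fills it completely, so the sketch rejects such names up front rather than passing an unterminated string on.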

[Qemu-devel] [PULL 12/29] xics: replace fprintf with error_report

2013-10-25 Thread Alexander Graf

From: Alexey Kardashevskiy a...@ozlabs.ru

This replaces old-style fprintf with new style error_report.

Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
Reviewed-by: Andreas Färber afaer...@suse.de
Acked-by: David Gibson da...@gibson.dropbear.id.au
Signed-off-by: Alexander Graf ag...@suse.de
---
 hw/intc/xics.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/hw/intc/xics.c b/hw/intc/xics.c
index a0d71ef..666888d 100644
--- a/hw/intc/xics.c
+++ b/hw/intc/xics.c
@@ -29,6 +29,7 @@
 #include "trace.h"
 #include "hw/ppc/spapr.h"
 #include "hw/ppc/xics.h"
+#include "qemu/error-report.h"
 
 void xics_cpu_setup(XICSState *icp, PowerPCCPU *cpu)
 {
@@ -48,8 +49,8 @@ void xics_cpu_setup(XICSState *icp, PowerPCCPU *cpu)
 break;
 
 default:
-        fprintf(stderr, "XICS interrupt controller does not support this CPU "
-                "bus model\n");
+        error_report("XICS interrupt controller does not support this CPU "
+                     "bus model");
         abort();
 }
 }
-- 
1.8.1.4

[Qemu-devel] [PULL 06/29] spapr: increase temporary fdt buffer size

2013-10-25 Thread Alexander Graf

From: Alexey Kardashevskiy a...@ozlabs.ru

At the moment the size of the buffer is set to 64K which is
enough for approximately 150 VCPUs which is not the limit.

This increases the buffer up to 256K which allows having
a tree for approximately 600 VCPUs which is way beyond the real
number we need.

As only the real size of the tree is copied to the guest, there
will be no impact on existing configurations.

Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
Signed-off-by: Alexander Graf ag...@suse.de
---
 hw/ppc/spapr.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 5bf6c3b..6322c98 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -62,7 +62,7 @@
  *
  * We load our kernel at 4M, leaving space for SLOF initial image
  */
-#define FDT_MAX_SIZE            0x10000
+#define FDT_MAX_SIZE            0x40000
 #define RTAS_MAX_SIZE           0x10000
 #define FW_MAX_SIZE             0x400000
 #define FW_FILE_NAME            "slof.bin"
-- 
1.8.1.4
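
A minimal sketch, not part of the patch, of the arithmetic behind the numbers in the commit message. It assumes the tree grows roughly linearly with the VCPU count and ignores the fixed, non-CPU part of the tree, so the results are estimates only. The demo_* names are hypothetical.

```c
#include <stddef.h>

#define DEMO_FDT_OLD_SIZE 0x10000u   /* 64K, the old buffer size */
#define DEMO_FDT_NEW_SIZE 0x40000u   /* 256K, the new buffer size */

/* Per-VCPU footprint implied by "64K is enough for about 150 VCPUs". */
size_t demo_bytes_per_vcpu(void)
{
    return DEMO_FDT_OLD_SIZE / 150;
}

/* Estimated number of VCPUs whose nodes fit in a buffer of this size. */
size_t demo_vcpus_that_fit(size_t buffer_bytes)
{
    return buffer_bytes / demo_bytes_per_vcpu();
}
```

Under these assumptions each VCPU costs about 436 bytes, so quadrupling the buffer gives room for about 601 VCPUs, in line with the "approximately 600" quoted above.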

[Qemu-devel] [PULL 18/29] xics-kvm: Support for in-kernel XICS interrupt controller

2013-10-25 Thread Alexander Graf

From: David Gibson da...@gibson.dropbear.id.au

Recent (host) kernels support emulating the PAPR defined XICS interrupt
controller system within KVM.  This patch allows qemu to initialize and
configure the in-kernel XICS, and keep its state in sync with qemu's XICS
state as necessary.

This should give considerable performance improvements.  e.g. on a simple
IPI ping-pong test between hardware threads, using qemu XICS gives us
around 5,000 irqs/second, whereas the in-kernel XICS gives us around
70,000 irqs/s on the same hardware configuration.

Signed-off-by: David Gibson da...@gibson.dropbear.id.au
[Mike Qiu qiud...@linux.vnet.ibm.com: fixed mistype which caused 
ics_set_kvm_state() to fail]
Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
Reviewed-by: Alexander Graf ag...@suse.de
Signed-off-by: Alexander Graf ag...@suse.de
---
 default-configs/ppc64-softmmu.mak |   1 +
 hw/intc/Makefile.objs |   1 +
 hw/intc/xics_kvm.c| 488 ++
 hw/ppc/spapr.c|  21 +-
 include/hw/ppc/xics.h |  10 +
 5 files changed, 520 insertions(+), 1 deletion(-)
 create mode 100644 hw/intc/xics_kvm.c

diff --git a/default-configs/ppc64-softmmu.mak 
b/default-configs/ppc64-softmmu.mak
index 975112a..fb34a9b 100644
--- a/default-configs/ppc64-softmmu.mak
+++ b/default-configs/ppc64-softmmu.mak
@@ -46,6 +46,7 @@ CONFIG_E500=y
 CONFIG_OPENPIC_KVM=$(and $(CONFIG_E500),$(CONFIG_KVM))
 # For pSeries
 CONFIG_XICS=$(CONFIG_PSERIES)
+CONFIG_XICS_KVM=$(and $(CONFIG_PSERIES),$(CONFIG_KVM))
 # For PReP
 CONFIG_I82378=y
 CONFIG_I8259=y
diff --git a/hw/intc/Makefile.objs b/hw/intc/Makefile.objs
index 2851eed..47ac442 100644
--- a/hw/intc/Makefile.objs
+++ b/hw/intc/Makefile.objs
@@ -23,3 +23,4 @@ obj-$(CONFIG_OMAP) += omap_intc.o
 obj-$(CONFIG_OPENPIC_KVM) += openpic_kvm.o
 obj-$(CONFIG_SH4) += sh_intc.o
 obj-$(CONFIG_XICS) += xics.o
+obj-$(CONFIG_XICS_KVM) += xics_kvm.o
diff --git a/hw/intc/xics_kvm.c b/hw/intc/xics_kvm.c
new file mode 100644
index 0000000..a2ccafa
--- /dev/null
+++ b/hw/intc/xics_kvm.c
@@ -0,0 +1,488 @@
+/*
+ * QEMU PowerPC pSeries Logical Partition (aka sPAPR) hardware System Emulator
+ *
+ * PAPR Virtualized Interrupt System, aka ICS/ICP aka xics, in-kernel emulation
+ *
+ * Copyright (c) 2013 David Gibson, IBM Corporation.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ *
+ */
+
+#include "hw/hw.h"
+#include "trace.h"
+#include "hw/ppc/spapr.h"
+#include "hw/ppc/xics.h"
+#include "kvm_ppc.h"
+#include "qemu/config-file.h"
+#include "qemu/error-report.h"
+
+#include <sys/ioctl.h>
+
+typedef struct KVMXICSState {
+XICSState parent_obj;
+
+uint32_t set_xive_token;
+uint32_t get_xive_token;
+uint32_t int_off_token;
+uint32_t int_on_token;
+int kernel_xics_fd;
+} KVMXICSState;
+
+/*
+ * ICP-KVM
+ */
+static void icp_get_kvm_state(ICPState *ss)
+{
+    uint64_t state;
+    struct kvm_one_reg reg = {
+        .id = KVM_REG_PPC_ICP_STATE,
+        .addr = (uintptr_t)&state,
+    };
+    int ret;
+
+    /* ICP for this CPU thread is not in use, exiting */
+    if (!ss->cs) {
+        return;
+    }
+
+    ret = kvm_vcpu_ioctl(ss->cs, KVM_GET_ONE_REG, &reg);
+    if (ret != 0) {
+        error_report("Unable to retrieve KVM interrupt controller state"
+                " for CPU %d: %s", ss->cs->cpu_index, strerror(errno));
+        exit(1);
+    }
+
+    ss->xirr = state >> KVM_REG_PPC_ICP_XISR_SHIFT;
+    ss->mfrr = (state >> KVM_REG_PPC_ICP_MFRR_SHIFT)
+        & KVM_REG_PPC_ICP_MFRR_MASK;
+    ss->pending_priority = (state >> KVM_REG_PPC_ICP_PPRI_SHIFT)
+        & KVM_REG_PPC_ICP_PPRI_MASK;
+}
+
+static int icp_set_kvm_state(ICPState *ss, int version_id)
+{
+    uint64_t state;
+    struct kvm_one_reg reg = {
+        .id = KVM_REG_PPC_ICP_STATE,
+        .addr = (uintptr_t)&state,
+    };
+    int ret;
+
+    /* ICP for this CPU thread is not in use, exiting */
+    if (!ss->cs) {
+        return 0;
+    }
+
+state
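
[Editor's sketch] The bit manipulation in icp_get_kvm_state() above can be tried in isolation. The following is a minimal standalone sketch, not code from the patch: the shift and mask values are assumptions meant to mirror the KVM_REG_PPC_ICP_* constants in the Linux KVM headers, and the IcpFields type and the icp_unpack()/icp_pack() helpers are hypothetical names introduced only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed field layout; verify against the kernel headers actually in use. */
#define ICP_XISR_SHIFT 32
#define ICP_MFRR_SHIFT 24
#define ICP_MFRR_MASK  0xffu
#define ICP_PPRI_SHIFT 16
#define ICP_PPRI_MASK  0xffu

/* Hypothetical container for the decoded fields. */
typedef struct IcpFields {
    uint32_t xirr;             /* upper 32 bits of the state word */
    uint8_t  mfrr;             /* pending IPI priority */
    uint8_t  pending_priority; /* priority of the pending interrupt */
} IcpFields;

/* Split a packed 64-bit state word into its fields. */
static IcpFields icp_unpack(uint64_t state)
{
    IcpFields f;

    f.xirr = (uint32_t)(state >> ICP_XISR_SHIFT);
    f.mfrr = (uint8_t)((state >> ICP_MFRR_SHIFT) & ICP_MFRR_MASK);
    f.pending_priority = (uint8_t)((state >> ICP_PPRI_SHIFT) & ICP_PPRI_MASK);
    return f;
}

/* Inverse operation, as a restore path would plausibly need it. */
static uint64_t icp_pack(IcpFields f)
{
    return ((uint64_t)f.xirr << ICP_XISR_SHIFT)
         | ((uint64_t)f.mfrr << ICP_MFRR_SHIFT)
         | ((uint64_t)f.pending_priority << ICP_PPRI_SHIFT);
}
```

Packing and unpacking round-trip as long as the low 16 bits of the word are zero, which is what this assumed layout implies.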

[Qemu-devel] [PULL 21/29] xics-kvm: enable irqfd for MSI

2013-10-25 Thread Alexander Graf

From: Alexey Kardashevskiy a...@ozlabs.ru

This enables IRQFD support for sPAPR. The feature decreases the latency
of interrupt handling.

To enable IRQFD for MSI, this sets kvm_gsi_direct_mapping to true which
enables direct MSI mapping.

To enable IRQFD for LSI (level triggered INTx interrupts), a PCI host bus
callback is required. The patch for that is coming next.

Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
Signed-off-by: Alexander Graf ag...@suse.de
---
 hw/intc/xics_kvm.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/hw/intc/xics_kvm.c b/hw/intc/xics_kvm.c
index a2ccafa..c203646 100644
--- a/hw/intc/xics_kvm.c
+++ b/hw/intc/xics_kvm.c
@@ -441,6 +441,12 @@ static void xics_kvm_realize(DeviceState *dev, Error **errp)
             goto fail;
         }
     }
+
+    kvm_kernel_irqchip = true;
+    kvm_irqfds_allowed = true;
+    kvm_msi_via_irqfd_allowed = true;
+    kvm_gsi_direct_mapping = true;
+
     return;
 
 fail:
-- 
1.8.1.4

[Qemu-devel] [PULL 23/29] target-ppc: Update slb array with correct index values.

2013-10-25 Thread Alexander Graf

From: Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com

Without this, a value of rb=0 and rs=0 results in replacing the 0th
index. This can be observed when using gdb remote debugging support.

(gdb) x/10i do_fork
   0xc0085330 <do_fork>:	Cannot access memory at address 0xc0085330
(gdb)

This is because when we do the slb sync via kvm_cpu_synchronize_state,
we overwrite the slb entry (0th entry) for 0xc0085330

Signed-off-by: Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com
Signed-off-by: Alexander Graf ag...@suse.de
---
 target-ppc/kvm.c | 17 +++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
index 0b5d391..e2f8b03 100644
--- a/target-ppc/kvm.c
+++ b/target-ppc/kvm.c
@@ -1033,9 +1033,22 @@ int kvm_arch_get_registers(CPUState *cs)
 
     /* Sync SLB */
 #ifdef TARGET_PPC64
+    /*
+     * The packed SLB array we get from KVM_GET_SREGS only contains
+     * information about valid entries. So we flush our internal
+     * copy to get rid of stale ones, then put all valid SLB entries
+     * back in.
+     */
+    memset(env->slb, 0, sizeof(env->slb));
     for (i = 0; i < 64; i++) {
-        ppc_store_slb(env, sregs.u.s.ppc64.slb[i].slbe,
-                           sregs.u.s.ppc64.slb[i].slbv);
+        target_ulong rb = sregs.u.s.ppc64.slb[i].slbe;
+        target_ulong rs = sregs.u.s.ppc64.slb[i].slbv;
+        /*
+         * Only restore valid entries
+         */
+        if (rb & SLB_ESID_V) {
+            ppc_store_slb(env, rb, rs);
+        }
     }
 #endif
 
-- 
1.8.1.4
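
[Editor's sketch] The failure mode described in the patch above (an empty packed entry with rb=0 and rs=0 clobbering slot 0) can be reproduced with a small model. This is a sketch under stated assumptions, not QEMU code: store_slb() is a hypothetical stand-in for ppc_store_slb(), and the valid-bit position and the "slot index in the low bits of rb" convention are assumptions made for the illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define N_SLB         64
#define SLB_VALID     0x08000000u  /* assumed position of the valid bit */
#define SLB_SLOT_MASK 0xfffu       /* assumed: slot index in the low bits of rb */

typedef struct SlbEntry  { uint64_t esid, vsid; } SlbEntry;   /* internal copy */
typedef struct PackedSlb { uint64_t slbe, slbv; } PackedSlb;  /* as read back */

/* Hypothetical stand-in for ppc_store_slb(): the slot is taken from rb. */
static void store_slb(SlbEntry *slb, uint64_t rb, uint64_t rs)
{
    unsigned slot = (unsigned)(rb & SLB_SLOT_MASK);

    assert(slot < N_SLB);
    slb[slot].esid = rb & ~(uint64_t)SLB_SLOT_MASK;
    slb[slot].vsid = rs;
}

/* Flush the internal copy, then restore only the entries marked valid. */
static void sync_slb(SlbEntry *slb, const PackedSlb *packed, int n)
{
    memset(slb, 0, N_SLB * sizeof(*slb));
    for (int i = 0; i < n; i++) {
        if (packed[i].slbe & SLB_VALID) {
            store_slb(slb, packed[i].slbe, packed[i].slbv);
        }
    }
}
```

Without the validity check, any all-zero packed entry that follows the real slot 0 entry would be stored with slot index 0 and silently replace it.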

[Qemu-devel] [PULL 29/29] spapr: Use DeviceClass::fw_name for device tree CPU node

2013-10-25 Thread Alexander Graf

From: Andreas Färber afaer...@suse.de

Instead of relying on cpu_model, obtain the device tree node label
per CPU. Use DeviceClass::fw_name as source.

Whenever DeviceClass::fw_name is unknown, default to "PowerPC,UNKNOWN".

As a consequence, spapr_fixup_cpu_dt() can operate on each CPU's fw_name,
obsoleting sPAPREnvironment::cpu_model, and spapr_create_fdt_skel() can
drop its cpu_model argument.

Signed-off-by: Prerna Saxena pre...@linux.vnet.ibm.com
Signed-off-by: Andreas Färber afaer...@suse.de
Signed-off-by: Alexander Graf ag...@suse.de
---
 hw/ppc/spapr.c  | 26 ++
 include/hw/ppc/spapr.h  |  1 -
 target-ppc/translate_init.c |  2 ++
 3 files changed, 8 insertions(+), 21 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index c0613e4..f76b355 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -204,9 +204,8 @@ static int spapr_fixup_cpu_dt(void *fdt, sPAPREnvironment *spapr)
     int smt = kvmppc_smt_threads();
     uint32_t pft_size_prop[] = {0, cpu_to_be32(spapr->htab_shift)};
 
-    assert(spapr->cpu_model);
-
     CPU_FOREACH(cpu) {
+        DeviceClass *dc = DEVICE_GET_CLASS(cpu);
         uint32_t associativity[] = {cpu_to_be32(0x5),
                                     cpu_to_be32(0x0),
                                     cpu_to_be32(0x0),
@@ -218,7 +217,7 @@ static int spapr_fixup_cpu_dt(void *fdt, sPAPREnvironment *spapr)
             continue;
         }
 
-        snprintf(cpu_model, 32, "/cpus/%s@%x", spapr->cpu_model,
+        snprintf(cpu_model, 32, "/cpus/%s@%x", dc->fw_name,
                  cpu->cpu_index);
 
         offset = fdt_path_offset(fdt, cpu_model);
@@ -288,8 +287,7 @@ static size_t create_page_sizes_prop(CPUPPCState *env, uint32_t *prop,
     } while (0)
 
 
-static void *spapr_create_fdt_skel(const char *cpu_model,
-                                   hwaddr initrd_base,
+static void *spapr_create_fdt_skel(hwaddr initrd_base,
                                    hwaddr initrd_size,
                                    hwaddr kernel_size,
                                    bool little_endian,
@@ -306,7 +304,6 @@ static void *spapr_create_fdt_skel(const char *cpu_model,
     char qemu_hypertas_prop[] = "hcall-memop1";
     uint32_t refpoints[] = {cpu_to_be32(0x4), cpu_to_be32(0x4)};
     uint32_t interrupt_server_ranges_prop[] = {0, cpu_to_be32(smp_cpus)};
-    char *modelname;
     int i, smt = kvmppc_smt_threads();
     unsigned char vec5[] = {0x0, 0x0, 0x0, 0x0, 0x0, 0x80};
 
@@ -365,18 +362,10 @@ static void *spapr_create_fdt_skel(const char *cpu_model,
     _FDT((fdt_property_cell(fdt, "#address-cells", 0x1)));
     _FDT((fdt_property_cell(fdt, "#size-cells", 0x0)));
 
-    modelname = g_strdup(cpu_model);
-
-    for (i = 0; i < strlen(modelname); i++) {
-        modelname[i] = toupper(modelname[i]);
-    }
-
-    /* This is needed during FDT finalization */
-    spapr->cpu_model = g_strdup(modelname);
-
     CPU_FOREACH(cs) {
         PowerPCCPU *cpu = POWERPC_CPU(cs);
         CPUPPCState *env = &cpu->env;
+        DeviceClass *dc = DEVICE_GET_CLASS(cs);
         PowerPCCPUClass *pcc = POWERPC_CPU_GET_CLASS(cs);
         int index = cs->cpu_index;
         uint32_t servers_prop[smp_threads];
@@ -393,7 +382,7 @@ static void *spapr_create_fdt_skel(const char *cpu_model,
             continue;
         }
 
-        nodename = g_strdup_printf("%s@%x", modelname, index);
+        nodename = g_strdup_printf("%s@%x", dc->fw_name, index);
 
         _FDT((fdt_begin_node(fdt, nodename)));
 
@@ -477,8 +466,6 @@ static void *spapr_create_fdt_skel(const char *cpu_model,
         _FDT((fdt_end_node(fdt)));
     }
 
-    g_free(modelname);
-
     _FDT((fdt_end_node(fdt)));
 
     /* RTAS */
@@ -1363,8 +1350,7 @@ static void ppc_spapr_init(QEMUMachineInitArgs *args)
                     &savevm_htab_handlers, spapr);
 
     /* Prepare the device tree */
-    spapr->fdt_skel = spapr_create_fdt_skel(cpu_model,
-                                            initrd_base, initrd_size,
+    spapr->fdt_skel = spapr_create_fdt_skel(initrd_base, initrd_size,
                                             kernel_size, kernel_le,
                                             boot_device, kernel_cmdline,
                                             spapr->epow_irq);
diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index 5ae0b58..fdaab2d 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -29,7 +29,6 @@ typedef struct sPAPREnvironment {
 target_ulong entry_point;
 uint32_t next_irq;
 uint64_t rtc_offset;
-char *cpu_model;
 bool has_graphics;
 
 uint32_t epow_irq;
diff --git a/target-ppc/translate_init.c b/target-ppc/translate_init.c
index 9e29caa..47825ac 100644
--- a/target-ppc/translate_init.c
+++ b/target-ppc/translate_init.c
@@ -8587,6 +8587,8 @@ static void ppc_cpu_class_init(ObjectClass *oc, void *data)
 #else
     cc->gdb_core_xml_file = "power-core.xml";
 #endif
+
+    dc->fw_name = "PowerPC,UNKNOWN";
 }
 
 static const

[Qemu-devel] [PULL 26/29] dump-guest-memory: Check for the correct return value

2013-10-25 Thread Alexander Graf

From: Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com

We should check for error with s->note_size

Signed-off-by: Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com
Signed-off-by: Alexander Graf ag...@suse.de
---
 dump.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/dump.c b/dump.c
index 846155c..80a9116 100644
--- a/dump.c
+++ b/dump.c
@@ -66,7 +66,7 @@ typedef struct DumpState {
 uint32_t sh_info;
 bool have_section;
 bool resume;
-size_t note_size;
+ssize_t note_size;
 hwaddr memory_offset;
 int fd;
 
@@ -765,7 +765,7 @@ static int dump_init(DumpState *s, int fd, bool paging, bool has_filter,
 
     s->note_size = cpu_get_note_size(s->dump_info.d_class,
                                      s->dump_info.d_machine, nr_cpus);
-    if (ret < 0) {
+    if (s->note_size < 0) {
         error_set(errp, QERR_UNSUPPORTED);
         goto cleanup;
     }
-- 
1.8.1.4
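
[Editor's sketch] The switch from size_t to ssize_t in the patch above matters because a negative error return cannot be detected once it has been stored in an unsigned type. A minimal illustration follows; get_note_size() and the two wrapper functions are hypothetical helpers invented for this sketch and are not the QEMU functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>   /* ssize_t (POSIX) */

/* Hypothetical helper: a note size, or -1 when the target is unsupported. */
static ssize_t get_note_size(int supported)
{
    return supported ? 512 : -1;
}

/* Stored in size_t, the -1 wraps around to the largest representable value. */
static size_t stored_as_size_t(int supported)
{
    return (size_t)get_note_size(supported);
}

/* Stored in ssize_t, the error stays negative and a "< 0" test works. */
static ssize_t stored_as_ssize_t(int supported)
{
    return get_note_size(supported);
}
```

With the unsigned field, the error value looks like an enormous but valid size, so a "less than zero" check can never fire.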

[Qemu-devel] [PULL 28/29] target-ppc: Fill in OpenFirmware names for some PowerPCCPU families

2013-10-25 Thread Alexander Graf

From: Andreas Färber afaer...@suse.de

Set the expected values for POWER7, POWER7+, POWER8 and POWER5+.
Note that POWER5+ and POWER7+ are intentionally lacking the '+', so the
lack of a POWER7P family constitutes no problem.

Signed-off-by: Andreas Färber afaer...@suse.de
Signed-off-by: Alexander Graf ag...@suse.de
---
 target-ppc/translate_init.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/target-ppc/translate_init.c b/target-ppc/translate_init.c
index f778eaa..9e29caa 100644
--- a/target-ppc/translate_init.c
+++ b/target-ppc/translate_init.c
@@ -7108,6 +7108,7 @@ POWERPC_FAMILY(POWER5P)(ObjectClass *oc, void *data)
     DeviceClass *dc = DEVICE_CLASS(oc);
     PowerPCCPUClass *pcc = POWERPC_CPU_CLASS(oc);
 
+    dc->fw_name = "PowerPC,POWER5";
     dc->desc = "POWER5+";
     pcc->init_proc = init_proc_power5plus;
     pcc->check_pow = check_pow_970FX;
@@ -7218,6 +7219,7 @@ POWERPC_FAMILY(POWER7)(ObjectClass *oc, void *data)
     DeviceClass *dc = DEVICE_CLASS(oc);
     PowerPCCPUClass *pcc = POWERPC_CPU_CLASS(oc);
 
+    dc->fw_name = "PowerPC,POWER7";
     dc->desc = "POWER7";
     pcc->init_proc = init_proc_POWER7;
     pcc->check_pow = check_pow_nocheck;
@@ -7252,6 +7254,7 @@ POWERPC_FAMILY(POWER8)(ObjectClass *oc, void *data)
     DeviceClass *dc = DEVICE_CLASS(oc);
     PowerPCCPUClass *pcc = POWERPC_CPU_CLASS(oc);
 
+    dc->fw_name = "PowerPC,POWER8";
     dc->desc = "POWER8";
     pcc->init_proc = init_proc_POWER7;
     pcc->check_pow = check_pow_nocheck;
-- 
1.8.1.4

[Qemu-devel] [PULL 27/29] target-ppc: dump-guest-memory support

2013-10-25 Thread Alexander Graf

From: Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com

This patch adds support for dumping guest memory using the dump-guest-memory
monitor command.

Before patch:

(qemu) dump-guest-memory testcrash
this feature or command is not currently supported
(qemu)

After patch:

(qemu) dump-guest-memory testcrash
(qemu)

The crash utility was able to read the file:

crash> bt
PID: 0  TASK: c0c0d0d0  CPU: 0   COMMAND: swapper/0

 R0:  2884R1:  c0cafa50R2:  c0cb05b0
 R3:  R4:  c0bc4cb0R5:  
 R6:  001efe93b800R7:  R8:  
 R9:  b0001032R10: 0001R11: 0001eb2117e00d55

...

NOTE: Currently the crash tool doesn't look at ELF notes in the dump on ppc64.

Signed-off-by: Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com
Signed-off-by: Alexander Graf ag...@suse.de
---
 include/elf.h   |   3 +
 target-ppc/Makefile.objs|   2 +-
 target-ppc/arch_dump.c  | 253 
 target-ppc/cpu-qom.h|   5 +-
 target-ppc/translate_init.c |   4 +
 5 files changed, 265 insertions(+), 2 deletions(-)
 create mode 100644 target-ppc/arch_dump.c

diff --git a/include/elf.h b/include/elf.h
index 58bfbf8..b818091 100644
--- a/include/elf.h
+++ b/include/elf.h
@@ -1359,6 +1359,9 @@ typedef struct elf64_shdr {
 #define NT_S390_TODPREG 0x303   /* s390 TOD programmable register */
 #define NT_S390_TODCMP  0x302   /* s390 TOD clock comparator register */
 #define NT_S390_TIMER   0x301   /* s390 timer register */
+#define NT_PPC_VMX   0x100  /* PowerPC Altivec/VMX registers */
+#define NT_PPC_SPE   0x101  /* PowerPC SPE/EVR registers */
+#define NT_PPC_VSX   0x102  /* PowerPC VSX registers */
 
 
 /* Note header in a PT_NOTE section */
diff --git a/target-ppc/Makefile.objs b/target-ppc/Makefile.objs
index 94d6d0c..3cb23e0 100644
--- a/target-ppc/Makefile.objs
+++ b/target-ppc/Makefile.objs
@@ -2,7 +2,7 @@ obj-y += cpu-models.o
 obj-y += translate.o
 ifeq ($(CONFIG_SOFTMMU),y)
 obj-y += machine.o mmu_helper.o mmu-hash32.o
-obj-$(TARGET_PPC64) += mmu-hash64.o
+obj-$(TARGET_PPC64) += mmu-hash64.o arch_dump.o
 endif
 obj-$(CONFIG_KVM) += kvm.o kvm_ppc.o
 obj-$(call lnot,$(CONFIG_KVM)) += kvm-stub.o
diff --git a/target-ppc/arch_dump.c b/target-ppc/arch_dump.c
new file mode 100644
index 0000000..17fd4c6
--- /dev/null
+++ b/target-ppc/arch_dump.c
@@ -0,0 +1,253 @@
+/*
+ * writing ELF notes for ppc64 arch
+ *
+ *
+ * Copyright IBM, Corp. 2013
+ *
+ * Authors:
+ * Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ *
+ */
+
+#include "cpu.h"
+#include "elf.h"
+#include "exec/cpu-all.h"
+#include "sysemu/dump.h"
+#include "sysemu/kvm.h"
+
+struct PPC64UserRegStruct {
+uint64_t gpr[32];
+uint64_t nip;
+uint64_t msr;
+uint64_t orig_gpr3;
+uint64_t ctr;
+uint64_t link;
+uint64_t xer;
+uint64_t ccr;
+uint64_t softe;
+uint64_t trap;
+uint64_t dar;
+uint64_t dsisr;
+uint64_t result;
+} QEMU_PACKED;
+
+struct PPC64ElfPrstatus {
+char pad1[112];
+struct PPC64UserRegStruct pr_reg;
+uint64_t pad2[4];
+} QEMU_PACKED;
+
+
+struct PPC64ElfFpregset {
+uint64_t fpr[32];
+uint64_t fpscr;
+}  QEMU_PACKED;
+
+
+struct PPC64ElfVmxregset {
+ppc_avr_t avr[32];
+ppc_avr_t vscr;
+union {
+ppc_avr_t unused;
+uint32_t value;
+} vrsave;
+}  QEMU_PACKED;
+
+struct PPC64ElfVsxregset {
+uint64_t vsr[32];
+}  QEMU_PACKED;
+
+struct PPC64ElfSperegset {
+uint32_t evr[32];
+uint64_t spe_acc;
+uint32_t spe_fscr;
+}  QEMU_PACKED;
+
+typedef struct noteStruct {
+Elf64_Nhdr hdr;
+char name[5];
+char pad3[3];
+union {
+struct PPC64ElfPrstatus  prstatus;
+struct PPC64ElfFpregset  fpregset;
+struct PPC64ElfVmxregset vmxregset;
+struct PPC64ElfVsxregset vsxregset;
+struct PPC64ElfSperegset speregset;
+} contents;
+} QEMU_PACKED Note;
+
+
+static void ppc64_write_elf64_prstatus(Note *note, PowerPCCPU *cpu)
+{
+    int i;
+    uint64_t cr;
+    struct PPC64ElfPrstatus *prstatus;
+    struct PPC64UserRegStruct *reg;
+
+    note->hdr.n_type = cpu_to_be32(NT_PRSTATUS);
+
+    prstatus = &note->contents.prstatus;
+    memset(prstatus, 0, sizeof(*prstatus));
+    reg = &prstatus->pr_reg;
+
+    for (i = 0; i < 32; i++) {
+        reg->gpr[i] = cpu_to_be64(cpu->env.gpr[i]);
+    }
+    reg->nip = cpu_to_be64(cpu->env.nip);
+    reg->msr = cpu_to_be64(cpu->env.msr);
+    reg->ctr = cpu_to_be64(cpu->env.ctr);
+    reg->link = cpu_to_be64(cpu->env.lr);
+    reg->xer = cpu_to_be64(cpu_read_xer(&cpu->env));
+
+    cr = 0;
+    for (i = 0; i < 8; i++) {
+        cr |= (cpu->env.crf[i] & 15) << (4 * (7 - i));
+    }
+    reg->ccr = cpu_to_be64(cr);
+}
+
+static void
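
[Editor's sketch] The loop at the end of ppc64_write_elf64_prstatus() above folds eight 4-bit condition register fields into one word, with field 0 in the most significant nibble. The standalone sketch below shows the same shift arithmetic; pack_cr() is a hypothetical name, and using a uint8_t array for the fields is an assumption made to keep the example small.

```c
#include <assert.h>
#include <stdint.h>

/* Pack eight 4-bit CR fields (field 0 first) into one 32-bit word. */
static uint32_t pack_cr(const uint8_t crf[8])
{
    uint32_t cr = 0;

    for (int i = 0; i < 8; i++) {
        /* Field i lands in nibble (7 - i), counting from the low end. */
        cr |= (uint32_t)(crf[i] & 15) << (4 * (7 - i));
    }
    return cr;
}
```

The mask with 15 guards against stray high bits in a field, so only the low nibble of each element contributes to the result.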

[Qemu-devel] [PULL 13/29] xics: add pre_save/post_load dispatchers

2013-10-25 Thread Alexander Graf

From: Alexey Kardashevskiy a...@ozlabs.ru

The upcoming support of in-kernel XICS will redefine migration callbacks
for both ICS and ICP so classes and callback pointers are added.

Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
Signed-off-by: Alexander Graf ag...@suse.de
---
 hw/intc/xics.c| 56 ---
 include/hw/ppc/xics.h | 26 
 2 files changed, 79 insertions(+), 3 deletions(-)

diff --git a/hw/intc/xics.c b/hw/intc/xics.c
index 666888d..eeb64f5 100644
--- a/hw/intc/xics.c
+++ b/hw/intc/xics.c
@@ -190,11 +190,35 @@ static void icp_irq(XICSState *icp, int server, int nr, uint8_t priority)
     }
 }
 
+static void icp_dispatch_pre_save(void *opaque)
+{
+    ICPState *ss = opaque;
+    ICPStateClass *info = ICP_GET_CLASS(ss);
+
+    if (info->pre_save) {
+        info->pre_save(ss);
+    }
+}
+
+static int icp_dispatch_post_load(void *opaque, int version_id)
+{
+    ICPState *ss = opaque;
+    ICPStateClass *info = ICP_GET_CLASS(ss);
+
+    if (info->post_load) {
+        return info->post_load(ss, version_id);
+    }
+
+    return 0;
+}
+
 static const VMStateDescription vmstate_icp_server = {
     .name = "icp/server",
     .version_id = 1,
     .minimum_version_id = 1,
     .minimum_version_id_old = 1,
+    .pre_save = icp_dispatch_pre_save,
+    .post_load = icp_dispatch_post_load,
     .fields  = (VMStateField []) {
         /* Sanity check */
         VMSTATE_UINT32(xirr, ICPState),
@@ -229,6 +253,7 @@ static TypeInfo icp_info = {
 .parent = TYPE_DEVICE,
 .instance_size = sizeof(ICPState),
 .class_init = icp_class_init,
+.class_size = sizeof(ICPStateClass),
 };
 
 /*
@@ -390,10 +415,9 @@ static void ics_reset(DeviceState *dev)
     }
 }
 
-static int ics_post_load(void *opaque, int version_id)
+static int ics_post_load(ICSState *ics, int version_id)
 {
     int i;
-    ICSState *ics = opaque;
 
     for (i = 0; i < ics->icp->nr_servers; i++) {
         icp_resend(ics->icp, i);
@@ -402,6 +426,28 @@ static int ics_post_load(void *opaque, int version_id)
     return 0;
 }
 
+static void ics_dispatch_pre_save(void *opaque)
+{
+    ICSState *ics = opaque;
+    ICSStateClass *info = ICS_GET_CLASS(ics);
+
+    if (info->pre_save) {
+        info->pre_save(ics);
+    }
+}
+
+static int ics_dispatch_post_load(void *opaque, int version_id)
+{
+    ICSState *ics = opaque;
+    ICSStateClass *info = ICS_GET_CLASS(ics);
+
+    if (info->post_load) {
+        return info->post_load(ics, version_id);
+    }
+
+    return 0;
+}
+
 static const VMStateDescription vmstate_ics_irq = {
     .name = "ics/irq",
     .version_id = 1,
@@ -421,7 +467,8 @@ static const VMStateDescription vmstate_ics = {
     .version_id = 1,
     .minimum_version_id = 1,
     .minimum_version_id_old = 1,
-    .post_load = ics_post_load,
+    .pre_save = ics_dispatch_pre_save,
+    .post_load = ics_dispatch_post_load,
     .fields  = (VMStateField []) {
         /* Sanity check */
         VMSTATE_UINT32_EQUAL(nr_irqs, ICSState),
@@ -446,10 +493,12 @@ static int ics_realize(DeviceState *dev)
 static void ics_class_init(ObjectClass *klass, void *data)
 {
     DeviceClass *dc = DEVICE_CLASS(klass);
+    ICSStateClass *isc = ICS_CLASS(klass);
 
     dc->init = ics_realize;
     dc->vmsd = &vmstate_ics;
     dc->reset = ics_reset;
+    isc->post_load = ics_post_load;
 }
 
 static TypeInfo ics_info = {
@@ -457,6 +506,7 @@ static TypeInfo ics_info = {
 .parent = TYPE_DEVICE,
 .instance_size = sizeof(ICSState),
 .class_init = ics_class_init,
+.class_size = sizeof(ICSStateClass),
 };
 
 /*
diff --git a/include/hw/ppc/xics.h b/include/hw/ppc/xics.h
index 66364c5..6e3b605 100644
--- a/include/hw/ppc/xics.h
+++ b/include/hw/ppc/xics.h
@@ -42,7 +42,9 @@
  *  that yet)
  */
 typedef struct XICSState XICSState;
+typedef struct ICPStateClass ICPStateClass;
 typedef struct ICPState ICPState;
+typedef struct ICSStateClass ICSStateClass;
 typedef struct ICSState ICSState;
 typedef struct ICSIRQState ICSIRQState;
 
@@ -59,6 +61,18 @@ struct XICSState {
 #define TYPE_ICP "icp"
 #define ICP(obj) OBJECT_CHECK(ICPState, (obj), TYPE_ICP)
 
+#define ICP_CLASS(klass) \
+ OBJECT_CLASS_CHECK(ICPStateClass, (klass), TYPE_ICP)
+#define ICP_GET_CLASS(obj) \
+ OBJECT_GET_CLASS(ICPStateClass, (obj), TYPE_ICP)
+
+struct ICPStateClass {
+DeviceClass parent_class;
+
+void (*pre_save)(ICPState *s);
+int (*post_load)(ICPState *s, int version_id);
+};
+
 struct ICPState {
 /* private */
 DeviceState parent_obj;
@@ -72,6 +86,18 @@ struct ICPState {
 #define TYPE_ICS "ics"
 #define ICS(obj) OBJECT_CHECK(ICSState, (obj), TYPE_ICS)
 
+#define ICS_CLASS(klass) \
+ OBJECT_CLASS_CHECK(ICSStateClass, (klass), TYPE_ICS)
+#define ICS_GET_CLASS(obj) \
+ OBJECT_GET_CLASS(ICSStateClass, (obj), TYPE_ICS)
+
+struct ICSStateClass {
+DeviceClass parent_class;
+
+void (*pre_save)(ICSState *s);
+int (*post_load)(ICSState *s, int version_id);
+};
+
 struct
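
[Editor's sketch] The dispatcher pattern introduced by the patch above (a fixed callback that forwards to an optional per-class hook) can be shown without the QOM machinery. In this sketch the Dev and DevClass types, the klass pointer, and emulated_post_load() are all hypothetical; real QEMU code resolves the class through macros such as ICP_GET_CLASS() rather than a plain pointer.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Dev Dev;

/* Hypothetical class structure with optional migration hooks. */
typedef struct DevClass {
    void (*pre_save)(Dev *d);
    int  (*post_load)(Dev *d, int version_id);
} DevClass;

struct Dev {
    const DevClass *klass;
    int resend_count;
};

/* Fixed entry points: call the class hook only if the class provides one. */
static void dispatch_pre_save(void *opaque)
{
    Dev *d = opaque;

    if (d->klass->pre_save) {
        d->klass->pre_save(d);
    }
}

static int dispatch_post_load(void *opaque, int version_id)
{
    Dev *d = opaque;

    if (d->klass->post_load) {
        return d->klass->post_load(d, version_id);
    }
    return 0;
}

/* Example hook for an emulated device: record the resend, report success. */
static int emulated_post_load(Dev *d, int version_id)
{
    (void)version_id;
    d->resend_count++;
    return 0;
}

/* A class that overrides post_load but leaves pre_save unset. */
static const DevClass emulated_class = {
    .pre_save = NULL,
    .post_load = emulated_post_load,
};
```

The design choice is that the migration description always points at the dispatchers, so a subclass (for example an in-kernel variant) only has to fill in its own hooks.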

[Qemu-devel] [PULL 24/29] target-ppc: Check for error on address translation in memsave command

2013-10-25 Thread Alexander Graf

From: Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com

Check for errors when translating the virtual address to a physical one.

Signed-off-by: Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com
Signed-off-by: Alexander Graf ag...@suse.de
---
 cpus.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/cpus.c b/cpus.c
index 398229e..912938c 100644
--- a/cpus.c
+++ b/cpus.c
@@ -1403,7 +1403,10 @@ void qmp_memsave(int64_t addr, int64_t size, const char *filename,
         l = sizeof(buf);
         if (l > size)
             l = size;
-        cpu_memory_rw_debug(cpu, addr, buf, l, 0);
+        if (cpu_memory_rw_debug(cpu, addr, buf, l, 0) != 0) {
+            error_setg(errp, "Invalid addr 0x%016" PRIx64 "specified", addr);
+            goto exit;
+        }
         if (fwrite(buf, 1, l, f) != l) {
             error_set(errp, QERR_IO_ERROR);
             goto exit;
-- 
1.8.1.4

[Qemu-devel] [PULL 11/29] spapr: move cpu_setup after kvmppc_set_papr

2013-10-25 Thread Alexander Graf

From: Alexey Kardashevskiy a...@ozlabs.ru

This moves the xics_cpu_setup() call after kvmppc_set_papr()
in order to get VCPUs initialized as this is required by upcoming
XICS-KVM.

Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
Acked-by: David Gibson da...@gibson.dropbear.id.au
Signed-off-by: Alexander Graf ag...@suse.de
---
 hw/ppc/spapr.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 259df92..a276377 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1184,8 +1184,6 @@ static void ppc_spapr_init(QEMUMachineInitArgs *args)
         }
         env = &cpu->env;
 
-        xics_cpu_setup(spapr->icp, cpu);
-
         /* Set time-base frequency to 512 MHz */
         cpu_ppc_tb_init(env, TIMEBASE_FREQ);
 
@@ -1199,6 +1197,8 @@ static void ppc_spapr_init(QEMUMachineInitArgs *args)
             kvmppc_set_papr(cpu);
         }
 
+        xics_cpu_setup(spapr->icp, cpu);
+
         qemu_register_reset(spapr_cpu_reset, cpu);
     }
 
-- 
1.8.1.4

[Qemu-devel] [PULL 07/29] spapr: Add ibm, purr property on power7 and newer

2013-10-25 Thread Alexander Graf

From: Alexey Kardashevskiy a...@ozlabs.ru

PAPR+ says that no "ibm,purr" tells the guest that H_PURR is not
supported. However some guests still try calling H_PURR on POWER7 unless
the property is present and equal to 0. This adds the property for CPUs
supporting the PURR special register.

Signed-off-by: Benjamin Herrenschmidt b...@kernel.crashing.org
Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
Signed-off-by: Alexander Graf ag...@suse.de
---
 hw/ppc/spapr.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 6322c98..259df92 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -422,6 +422,10 @@ static void *spapr_create_fdt_skel(const char *cpu_model,
         _FDT((fdt_property(fdt, "ibm,ppc-interrupt-gserver#s",
                            gservers_prop, sizeof(gservers_prop))));
 
+        if (env->spr_cb[SPR_PURR].oea_read) {
+            _FDT((fdt_property(fdt, "ibm,purr", NULL, 0)));
+        }
+
         if (env->mmu_model & POWERPC_MMU_1TSEG) {
             _FDT((fdt_property(fdt, "ibm,processor-segment-sizes",
                                segs, sizeof(segs))));
-- 
1.8.1.4

[Qemu-devel] [PULL 08/29] spapr-rtas: fix h_rtas parameters reading

2013-10-25 Thread Alexander Graf

From: Alexey Kardashevskiy a...@ozlabs.ru

On real hardware, RTAS is called in real mode, so the top 4 bits of the
address passed in the call are ignored. This patch makes QEMU do the same.

This converts h_rtas() to use existing rtas_ld() handlers.

This fixes rtas_ld()/rtas_st() to ignore the top 4 bits.

Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
Signed-off-by: Alexander Graf ag...@suse.de
---
 hw/ppc/spapr_hcall.c   | 6 +++---
 include/hw/ppc/spapr.h | 9 +++--
 2 files changed, 10 insertions(+), 5 deletions(-)

diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
index f10ba8a..f755a53 100644
--- a/hw/ppc/spapr_hcall.c
+++ b/hw/ppc/spapr_hcall.c
@@ -521,9 +521,9 @@ static target_ulong h_rtas(PowerPCCPU *cpu, sPAPREnvironment *spapr,
                            target_ulong opcode, target_ulong *args)
 {
     target_ulong rtas_r3 = args[0];
-    uint32_t token = ldl_be_phys(rtas_r3);
-    uint32_t nargs = ldl_be_phys(rtas_r3 + 4);
-    uint32_t nret = ldl_be_phys(rtas_r3 + 8);
+    uint32_t token = rtas_ld(rtas_r3, 0);
+    uint32_t nargs = rtas_ld(rtas_r3, 1);
+    uint32_t nret = rtas_ld(rtas_r3, 2);
 
     return spapr_rtas_call(cpu, spapr, token, nargs, rtas_r3 + 12,
                            nret, rtas_r3 + 12 + 4*nargs);
diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index e37b419..6407c8a 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -332,14 +332,19 @@ static inline int spapr_allocate_lsi(int hint)
     return spapr_allocate_irq(hint, true);
 }
 
+static inline uint64_t ppc64_phys_to_real(uint64_t addr)
+{
+    return addr & ~0xF000000000000000ULL;
+}
+
 static inline uint32_t rtas_ld(target_ulong phys, int n)
 {
-    return ldl_be_phys(phys + 4*n);
+    return ldl_be_phys(ppc64_phys_to_real(phys + 4*n));
 }
 
 static inline void rtas_st(target_ulong phys, int n, uint32_t val)
 {
-    stl_be_phys(phys + 4*n, val);
+    stl_be_phys(ppc64_phys_to_real(phys + 4*n), val);
 }
 
 typedef void (*spapr_rtas_fn)(PowerPCCPU *cpu, sPAPREnvironment *spapr,
-- 
1.8.1.4
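
[Editor's sketch] The address handling described in the patch above (ignore the top 4 bits, then index 32-bit cells) is easy to check in isolation. This is a minimal sketch: phys_to_real() and rtas_cell_addr() are hypothetical names, and the mask assumes a 64-bit address whose top nibble is to be cleared, as the commit message states.

```c
#include <assert.h>
#include <stdint.h>

/* Clear the top 4 bits of a 64-bit address. */
static uint64_t phys_to_real(uint64_t addr)
{
    return addr & ~0xF000000000000000ULL;
}

/* Address of the n-th 32-bit cell in an argument buffer starting at base. */
static uint64_t rtas_cell_addr(uint64_t base, int n)
{
    assert(n >= 0);
    return phys_to_real(base + 4 * (uint64_t)n);
}
```

Applying the mask after adding the cell offset, as the patch does, means an offset that carries into the top nibble is also discarded.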

[Qemu-devel] [PULL 16/29] xics: split to xics and xics-common

2013-10-25 Thread Alexander Graf

From: Alexey Kardashevskiy a...@ozlabs.ru

The upcoming XICS-KVM support will use bits of emulated XICS code.
So this introduces a new level of hierarchy - the xics-common class. Both
emulated XICS and XICS-KVM will inherit from it and override class
callbacks when required.

The new xics-common class implements:
1. replaces static nr_irqs and nr_servers properties with
the dynamic ones and adds callbacks to be executed when properties
are set.
2. xics_cpu_setup() callback renamed to xics_common_cpu_setup() as
it is a common part for both XICS'es
3. xics_reset() renamed to xics_common_reset() for the same reason.

The emulated XICS changes:
1. the part of xics_realize() which creates ICPs is moved to
the nr_servers property callback as realize() is too late to
create/initialize devices and instance_init() is too early to create
devices as the number of child devices comes via the nr_servers
property.
2. added ics_initfn() which does a little part of what xics_realize() did.

Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
Reviewed-by: Alexander Graf ag...@suse.de
Signed-off-by: Alexander Graf ag...@suse.de
---
 hw/intc/xics.c| 156 +++---
 hw/ppc/spapr.c|   2 +-
 include/hw/ppc/xics.h |  20 +++
 3 files changed, 157 insertions(+), 21 deletions(-)

diff --git a/hw/intc/xics.c b/hw/intc/xics.c
index c90eb0a..5ed2618 100644
--- a/hw/intc/xics.c
+++ b/hw/intc/xics.c
@@ -30,6 +30,7 @@
 #include "hw/ppc/spapr.h"
 #include "hw/ppc/xics.h"
 #include "qemu/error-report.h"
+#include "qapi/visitor.h"
 
 void xics_cpu_setup(XICSState *icp, PowerPCCPU *cpu)
 {
@@ -55,9 +56,12 @@ void xics_cpu_setup(XICSState *icp, PowerPCCPU *cpu)
     }
 }
 
-static void xics_reset(DeviceState *d)
+/*
+ * XICS Common class - parent for emulated XICS and KVM-XICS
+ */
+static void xics_common_reset(DeviceState *d)
 {
-    XICSState *icp = XICS(d);
+    XICSState *icp = XICS_COMMON(d);
     int i;
 
     for (i = 0; i < icp->nr_servers; i++) {
@@ -67,6 +71,99 @@ static void xics_reset(DeviceState *d)
     device_reset(DEVICE(icp->ics));
 }
 
+static void xics_prop_get_nr_irqs(Object *obj, Visitor *v,
+                                  void *opaque, const char *name, Error **errp)
+{
+    XICSState *icp = XICS_COMMON(obj);
+    int64_t value = icp->nr_irqs;
+
+    visit_type_int(v, &value, name, errp);
+}
+
+static void xics_prop_set_nr_irqs(Object *obj, Visitor *v,
+                                  void *opaque, const char *name, Error **errp)
+{
+    XICSState *icp = XICS_COMMON(obj);
+    XICSStateClass *info = XICS_COMMON_GET_CLASS(icp);
+    Error *error = NULL;
+    int64_t value;
+
+    visit_type_int(v, &value, name, &error);
+    if (error) {
+        error_propagate(errp, error);
+        return;
+    }
+    if (icp->nr_irqs) {
+        error_setg(errp, "Number of interrupts is already set to %u",
+                   icp->nr_irqs);
+        return;
+    }
+
+    assert(info->set_nr_irqs);
+    assert(icp->ics);
+    info->set_nr_irqs(icp, value, errp);
+}
+
+static void xics_prop_get_nr_servers(Object *obj, Visitor *v,
+                                     void *opaque, const char *name,
+                                     Error **errp)
+{
+    XICSState *icp = XICS_COMMON(obj);
+    int64_t value = icp->nr_servers;
+
+    visit_type_int(v, &value, name, errp);
+}
+
+static void xics_prop_set_nr_servers(Object *obj, Visitor *v,
+                                     void *opaque, const char *name,
+                                     Error **errp)
+{
+    XICSState *icp = XICS_COMMON(obj);
+    XICSStateClass *info = XICS_COMMON_GET_CLASS(icp);
+    Error *error = NULL;
+    int64_t value;
+
+    visit_type_int(v, &value, name, &error);
+    if (error) {
+        error_propagate(errp, error);
+        return;
+    }
+    if (icp->nr_servers) {
+        error_setg(errp, "Number of servers is already set to %u",
+                   icp->nr_servers);
+        return;
+    }
+
+    assert(info->set_nr_servers);
+    info->set_nr_servers(icp, value, errp);
+}
+
+static void xics_common_initfn(Object *obj)
+{
+    object_property_add(obj, "nr_irqs", "int",
+                        xics_prop_get_nr_irqs, xics_prop_set_nr_irqs,
+                        NULL, NULL, NULL);
+    object_property_add(obj, "nr_servers", "int",
+                        xics_prop_get_nr_servers, xics_prop_set_nr_servers,
+                        NULL, NULL, NULL);
+}
+
+static void xics_common_class_init(ObjectClass *oc, void *data)
+{
+    DeviceClass *dc = DEVICE_CLASS(oc);
+
+    dc->reset = xics_common_reset;
+}
+
+static const TypeInfo xics_common_info = {
+    .name          = TYPE_XICS_COMMON,
+    .parent        = TYPE_SYS_BUS_DEVICE,
+    .instance_size = sizeof(XICSState),
+    .class_size    = sizeof(XICSStateClass),
+    .instance_init = xics_common_initfn,
+    .class_init    = xics_common_class_init,
+};
+
+
 /*
  * ICP: Presentation layer
  */
@@ -479,6 +576,13 @@ static const VMStateDescription vmstate_ics
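
[Editor's sketch] The property setters in the patch above refuse a second assignment once a count has been set. The set-once rule can be sketched without QOM as follows; the Ctrl type and ctrl_set_nr_servers() are hypothetical, and the range check is an addition of this sketch that the patch itself does not show.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical controller state; zero means "not configured yet". */
typedef struct Ctrl {
    uint32_t nr_servers;
} Ctrl;

/* Returns 0 on success, -1 if already set or the value is out of range. */
static int ctrl_set_nr_servers(Ctrl *c, int64_t value)
{
    if (c->nr_servers) {
        return -1;              /* already configured: reject the change */
    }
    if (value <= 0 || value > UINT32_MAX) {
        return -1;              /* assumption: only positive 32-bit counts */
    }
    c->nr_servers = (uint32_t)value;
    return 0;
}
```

Rejecting a second write fits the commit message's point that child devices are created from the property callback, so the count cannot change afterwards without tearing them down.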

[Qemu-devel] [PULL 15/29] xics: add missing const specifiers to TypeInfo

2013-10-25 Thread Alexander Graf

From: Alexey Kardashevskiy a...@ozlabs.ru

This adds missing const specifiers to ICS and ICP TypeInfo's.

Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
Reviewed-by: Andreas Färber afaer...@suse.de
Signed-off-by: Alexander Graf ag...@suse.de
---
 hw/intc/xics.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/intc/xics.c b/hw/intc/xics.c
index 76654db..c90eb0a 100644
--- a/hw/intc/xics.c
+++ b/hw/intc/xics.c
@@ -248,7 +248,7 @@ static void icp_class_init(ObjectClass *klass, void *data)
     dc->vmsd = &vmstate_icp_server;
 }
 
-static TypeInfo icp_info = {
+static const TypeInfo icp_info = {
 .name = TYPE_ICP,
 .parent = TYPE_DEVICE,
 .instance_size = sizeof(ICPState),
@@ -503,7 +503,7 @@ static void ics_class_init(ObjectClass *klass, void *data)
     isc->post_load = ics_post_load;
 }
 
-static TypeInfo ics_info = {
+static const TypeInfo ics_info = {
 .name = TYPE_ICS,
 .parent = TYPE_DEVICE,
 .instance_size = sizeof(ICSState),
-- 
1.8.1.4

Re: [Qemu-devel] [Qemu-ppc] [PULL 00/29] ppc patch queue 2013-10-25

2013-10-25 Thread Alexander Graf

Hey Mark,

On 25.10.2013 at 23:59, Mark Cave-Ayland mark.cave-ayl...@ilande.co.uk wrote:

> On 25/10/13 22:27, Alexander Graf wrote:
> 
>> Hi Blue / Aurelien / Anthony,
>> 
>> This is my current patch queue for ppc.  Please pull.
>> 
>> Alex
> 
> Hi Alex,
> 
> Did you get my repost of the PPC PCI configuration space patch to qemu-devel 
> here: http://lists.gnu.org/archive/html/qemu-devel/2013-10/msg01491.html? Or 
> should that go via someone else's tree?

Thanks a lot for the reminder. There is absolutely nothing wrong with the 
patch, but I wanted to make sure that I have a fully autotested tree synced out 
before the hard freeze. I'll send this together with the next SLOF update as 
soon as the SLOF git tree is synchronized.

Since this is a genuine bugfix, we can always get it into QEMU after the hard 
freeze deadline.


Alex

Re: [Qemu-devel] [PATCH v7] powerpc: add PVR mask support

2013-10-27 Thread Alexander Graf


On 23.10.2013, at 07:57, Andreas Färber afaer...@suse.de wrote:

> On 27.09.2013 09:05, Alexey Kardashevskiy wrote:
>> IBM POWERPC processors encode PVR as a CPU family in higher 16 bits and
>> a CPU version in lower 16 bits. Since there is no significant change
>> in behavior between versions, there is no point to add every single CPU
>> version in QEMU's CPU list. Also, new CPU versions of already supported
>> CPU won't break the existing code.
>> 
>> This adds PVR value/mask support for KVM, i.e. for -cpu host option.
>> 
>> As CPU family class name for POWER7 is POWER7-family, there is no need
>> to touch aliases.
>> 
>> Cc: Andreas Färber afaer...@suse.de
>> Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
> 
> As promised to Paul, using the Hackathon timeslot to review this:
> 
> Reviewed-by: Andreas Färber afaer...@suse.de

Thanks, applied to ppc-next-1.8


Alex
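
[Editor's sketch] The value/mask matching described in the quoted patch can be illustrated as follows. This is a sketch under stated assumptions, not the QEMU implementation: the CpuFamily table, the find_family() helper, and the specific PVR family values are illustrative assumptions; only the "family in the upper 16 bits, version in the lower 16 bits" split comes from the message above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical family table: match on the upper 16 bits of the PVR only. */
typedef struct CpuFamily {
    const char *name;
    uint32_t pvr;       /* family value with the version bits cleared */
    uint32_t pvr_mask;  /* bits of the PVR that must match */
} CpuFamily;

static const CpuFamily families[] = {
    { "POWER7", 0x003F0000u, 0xFFFF0000u },  /* assumed family value */
    { "POWER8", 0x004B0000u, 0xFFFF0000u },  /* assumed family value */
};

/* Return the first family whose masked PVR matches, or NULL if none does. */
static const CpuFamily *find_family(uint32_t host_pvr)
{
    for (size_t i = 0; i < sizeof(families) / sizeof(families[0]); i++) {
        if ((host_pvr & families[i].pvr_mask) == families[i].pvr) {
            return &families[i];
        }
    }
    return NULL;
}
```

Because the version bits are masked off, a newer revision of a supported CPU resolves to the same family entry without any table change.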

Re: [Qemu-devel] [PATCH] spapr: add vio-bus devices to categories

2013-10-27 Thread Alexander Graf


On 10.10.2013, at 20:08, Alexey Kardashevskiy a...@ozlabs.ru wrote:

> In order for devices to appear in the output of
> ./qemu-system-ppc64 -device ?,
> they must be assigned to one of DEVICE_CATEGORY_.
> 
> This puts VIO devices classes to corresponding categories.
> 
> Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru

Thanks, applied to ppc-next-1.8


Alex

Re: [Qemu-devel] [RFC PATCH] spapr: add ibmveth to the supported network adapters list

2013-10-27 Thread Alexander Graf


On 10.10.2013, at 20:09, Alexey Kardashevskiy a...@ozlabs.ru wrote:

 The problem is that -net nic,model=? does not print ibmveth in
 the list while it is actually supported.
 
 Most of the QEMU emulated network devices are PCI but ibmveth
 (a.k.a. spapr-vlan) is not. However with -net nic,model=?, QEMU prints
 only PCI devices in the list, even if it does not say that the list is
 all about PCI devices.
 
 This adds ?/help handling in spapr.c and adds ibmveth in the beginning
 of the list.
 
 Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
 ---
 
 This is an RFC patch.
 
 The other solutions could be:
 1. add ibmveth into pci_nic_models[] in hw/pci/pci.c but this would not
 be correct as ibmveth is not PCI and it must appear only on pseries machine.
 
 2. implement a short version of qdev_print_category_devices() and call it
 with DEVICE_CATEGORY_NETWORK but that would print more devices than
 pci_nic_init_nofail() can handle (vmxnet3, usb-bt-dongle).
 
 3. fix qemu_check_nic_model() to specifically say that this is a list of
 PCI devices and there might be some other devices which -net nic,model+
 supports but there are not PCI but that could break compatibility (some
 management software may rely on this exact string).
 
 4. Reject the patch and just say that people must stop using -net. Ok for 
 me :)
 
 Since -net is kind of obsolete interface and does not seem to be extended 
 ever,
 the proposed patch does not look too ugly, does not it?
 ---
 hw/ppc/spapr.c | 15 +++
 1 file changed, 15 insertions(+)
 
 diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
 index c0613e4..45ed3da 100644
 --- a/hw/ppc/spapr.c
 +++ b/hw/ppc/spapr.c
 @@ -1276,6 +1276,21 @@ static void ppc_spapr_init(QEMUMachineInitArgs *args)
 
 if (strcmp(nd-model, ibmveth) == 0) {
 spapr_vlan_create(spapr-vio_bus, nd);
 +} else if (is_help_option(nd-model)) {
 +static const char * const nic_models[] = {
 +ibmveth,
 +ne2k_pci,
 +i82551,
 +i82557b,
 +i82559er,
 +rtl8139,
 +e1000,
 +pcnet,
 +virtio,
 +NULL
 +};

I don't like the idea of duplicating that list. Basically the list of supported 
-net models is incorrect today even on x86 where you can say -net 
nic,model=ne2k_isa. It really is only a list of PCI devices.

I can think of a number of convoluted ways to fix this up, but I think that 
ignoring full accuracy of the output of -net model=? is the most 
straightforward thing to do.


Alex

Re: [Qemu-devel] [PATCH -V6 2/3] target-ppc: Fix page table lookup with kvm enabled

2013-10-27 Thread Alexander Graf


On 15.10.2013, at 01:58, Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com 
wrote:

 From: Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com
 
 With kvm enabled, we store the hash page table information in the hypervisor.
 Use ioctl to read the htab contents. Without this we get the below error when
 trying to read the guest address
 
 (gdb) x/10 do_fork
 0xc0098660 do_fork:   Cannot access memory at address 
 0xc0098660
 (gdb)
 
 Signed-off-by: Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com
 ---
 Changes from V5:
 
 * Added two new patches
 * Address review comments
 
 hw/ppc/spapr_hcall.c| 47 --
 target-ppc/kvm.c| 53 ++
 target-ppc/kvm_ppc.h| 19 
 target-ppc/mmu-hash64.c | 77 -
 target-ppc/mmu-hash64.h | 23 ++-
 5 files changed, 181 insertions(+), 38 deletions(-)
 
 diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
 index f10ba8a..e04bf6c 100644
 --- a/hw/ppc/spapr_hcall.c
 +++ b/hw/ppc/spapr_hcall.c
 @@ -52,6 +52,8 @@ static target_ulong h_enter(PowerPCCPU *cpu, 
 sPAPREnvironment *spapr,
 target_ulong raddr;
 target_ulong i;
 hwaddr hpte;
 +void *token;
 +bool htab_fd;
 
 /* only handle 4k and 16M pages for now */
 if (pteh  HPTE64_V_LARGE) {
 @@ -94,25 +96,32 @@ static target_ulong h_enter(PowerPCCPU *cpu, 
 sPAPREnvironment *spapr,
 if ((pte_index * HASH_PTE_SIZE_64)  ~env-htab_mask) {
 return H_PARAMETER;
 }
 +
 +i = 0;
 +hpte = pte_index * HASH_PTE_SIZE_64;
 if (likely((flags  H_EXACT) == 0)) {
 pte_index = ~7ULL;
 -hpte = pte_index * HASH_PTE_SIZE_64;
 -for (i = 0; ; ++i) {
 +token = ppc_hash64_start_access(cpu, pte_index, htab_fd);
 +do {
 if (i == 8) {
 +ppc_hash64_stop_access(token, htab_fd);
 return H_PTEG_FULL;
 }
 -if ((ppc_hash64_load_hpte0(env, hpte)  HPTE64_V_VALID) == 0) {
 +if ((ppc_hash64_load_hpte0(env, token, i)  HPTE64_V_VALID) == 
 0) {
 break;
 }
 -hpte += HASH_PTE_SIZE_64;
 -}
 +} while (i++);
 +ppc_hash64_stop_access(token, htab_fd);
 } else {
 -i = 0;
 -hpte = pte_index * HASH_PTE_SIZE_64;
 -if (ppc_hash64_load_hpte0(env, hpte)  HPTE64_V_VALID) {
 +token = ppc_hash64_start_access(cpu, pte_index, htab_fd);
 +if (ppc_hash64_load_hpte0(env, token, 0)  HPTE64_V_VALID) {
 +ppc_hash64_stop_access(token, htab_fd);
 return H_PTEG_FULL;
 }
 +ppc_hash64_stop_access(token, htab_fd);
 }
 +hpte += i * HASH_PTE_SIZE_64;
 +
 ppc_hash64_store_hpte1(env, hpte, ptel);
 /* eieio();  FIXME: need some sort of barrier for smp? */
 ppc_hash64_store_hpte0(env, hpte, pteh | HPTE64_V_HPTE_DIRTY);
 @@ -134,16 +143,18 @@ static RemoveResult remove_hpte(CPUPPCState *env, 
 target_ulong ptex,
 target_ulong *vp, target_ulong *rp)
 {
 hwaddr hpte;
 +void *token;
 +bool htab_fd;
 target_ulong v, r, rb;
 
 if ((ptex * HASH_PTE_SIZE_64)  ~env-htab_mask) {
 return REMOVE_PARM;
 }
 
 -hpte = ptex * HASH_PTE_SIZE_64;
 -
 -v = ppc_hash64_load_hpte0(env, hpte);
 -r = ppc_hash64_load_hpte1(env, hpte);
 +token = ppc_hash64_start_access(ppc_env_get_cpu(env), ptex, htab_fd);
 +v = ppc_hash64_load_hpte0(env, token, 0);
 +r = ppc_hash64_load_hpte1(env, token, 0);
 +ppc_hash64_stop_access(token, htab_fd);
 
 if ((v  HPTE64_V_VALID) == 0 ||
 ((flags  H_AVPN)  (v  ~0x7fULL) != avpn) ||
 @@ -152,6 +163,7 @@ static RemoveResult remove_hpte(CPUPPCState *env, 
 target_ulong ptex,
 }
 *vp = v;
 *rp = r;
 +hpte = ptex * HASH_PTE_SIZE_64;
 ppc_hash64_store_hpte0(env, hpte, HPTE64_V_HPTE_DIRTY);
 rb = compute_tlbie_rb(v, r, ptex);
 ppc_tlb_invalidate_one(env, rb);
 @@ -260,16 +272,18 @@ static target_ulong h_protect(PowerPCCPU *cpu, 
 sPAPREnvironment *spapr,
 target_ulong pte_index = args[1];
 target_ulong avpn = args[2];
 hwaddr hpte;
 +void *token;
 +bool htab_fd;
 target_ulong v, r, rb;
 
 if ((pte_index * HASH_PTE_SIZE_64)  ~env-htab_mask) {
 return H_PARAMETER;
 }
 
 -hpte = pte_index * HASH_PTE_SIZE_64;
 -
 -v = ppc_hash64_load_hpte0(env, hpte);
 -r = ppc_hash64_load_hpte1(env, hpte);
 +token = ppc_hash64_start_access(cpu, pte_index, htab_fd);
 +v = ppc_hash64_load_hpte0(env, token, 0);
 +r = ppc_hash64_load_hpte1(env, token, 0);
 +ppc_hash64_stop_access(token, htab_fd);
 
 if ((v  HPTE64_V_VALID) == 0 ||
 ((flags  H_AVPN)  (v  ~0x7fULL) != avpn)) {
 @@ -282,6 +296,7 @@ static target_ulong h_protect(PowerPCCPU *cpu, 
 sPAPREnvironment *spapr,
 r |= (flags  48)  HPTE64_R_KEY_HI;
 r |= flags  (HPTE64_R_PP

Re: [Qemu-devel] [PATCH -V6 3/3] target-ppc: Fix htab_mask calculation

2013-10-27 Thread Alexander Graf


On 15.10.2013, at 01:58, Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com 
wrote:

 From: Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com
 
 Correctly update the htab_mask using the return value of
 KVM_PPC_ALLOCATE_HTAB ioctl. Also we don't update sdr1
 on GET_SREGS for HV. So don't update htab_mask if sdr1
 is found to be zero. Fix the pte index calculation to be
 same as that found in the kernel
 
 Signed-off-by: Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com
 ---
 hw/ppc/spapr.c  | 3 ++-
 target-ppc/mmu-hash64.c | 2 +-
 target-ppc/mmu_helper.c | 4 +++-
 3 files changed, 6 insertions(+), 3 deletions(-)
 
 diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
 index 22f2a8a..d4f3502 100644
 --- a/hw/ppc/spapr.c
 +++ b/hw/ppc/spapr.c
 @@ -724,7 +724,8 @@ static void spapr_cpu_reset(void *opaque)
 env-external_htab = (void *)1;
 }
 env-htab_base = -1;
 -env->htab_mask = HTAB_SIZE(spapr) - 1;
 +/* 128 (2**7) bytes in each HPTEG */
 +env->htab_mask = (1ULL << ((spapr)->htab_shift - 7)) - 1;

HTAB_SIZE(spapr) / 128? The compiler should be smart enough to produce the same 
code out of that.

However, could you please explain why it's better to have the mask be on the 
PTEG rather than the offset? Is this something you missed in the previous 
patch? If so, please change the semantics on what htab_mask means before you 
break the code as that makes bisecting hard.

Furthermore, since you are changing the semantics of htab_mask, have you 
checked all other users of it? Most notably the hash32 code.


Alex

 env-spr[SPR_SDR1] = (target_ulong)(uintptr_t)spapr-htab |
 (spapr-htab_shift - 18);
 }
 diff --git a/target-ppc/mmu-hash64.c b/target-ppc/mmu-hash64.c
 index 5c797c3..ddd8440 100644
 --- a/target-ppc/mmu-hash64.c
 +++ b/target-ppc/mmu-hash64.c
 @@ -354,7 +354,7 @@ static hwaddr ppc_hash64_pteg_search(CPUPPCState *env, 
 hwaddr hash,
 target_ulong pte0, pte1;
 unsigned long pte_index;
 
 -pte_index = (hash * HPTES_PER_GROUP) & env->htab_mask;
 +pte_index = (hash & env->htab_mask) * HPTES_PER_GROUP;
 token = ppc_hash64_start_access(ppc_env_get_cpu(env), pte_index, 
 htab_fd);
 if (!token) {
 return -1;
 diff --git a/target-ppc/mmu_helper.c b/target-ppc/mmu_helper.c
 index 04a840b..c39cb7b 100644
 --- a/target-ppc/mmu_helper.c
 +++ b/target-ppc/mmu_helper.c
 @@ -2025,7 +2025,9 @@ void ppc_store_sdr1(CPUPPCState *env, target_ulong 
 value)
  stored in SDR1\n, htabsize);
 htabsize = 28;
 }
 -env->htab_mask = (1ULL << (htabsize + 18)) - 1;
 +if (htabsize) {
 +env->htab_mask = (1ULL << (htabsize + 18 - 7)) - 1;
 +}
 env-htab_base = value  SDR_64_HTABORG;
 } else
 #endif /* defined(TARGET_PPC64) */
 -- 
 1.8.3.2
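
To make the htab_mask discussion in this mail concrete, here is a small standalone sketch. It is not QEMU code; it assumes that the hash table size is 1 << htab_shift bytes and that a PTE is 16 bytes (so a group of 8 is the 128 bytes mentioned in the patch comment). The function names are hypothetical. It shows that the shift-based mask from the patch and a size-divided-by-128 form give the same PTEG mask, and how a PTE index is derived from a hash once the mask counts PTEGs rather than bytes.

```c
#include <stdint.h>

#define HPTES_PER_GROUP   8
#define HASH_PTE_SIZE_64  16
#define HASH_PTEG_SIZE_64 (HPTES_PER_GROUP * HASH_PTE_SIZE_64) /* 128 bytes */

/* Form used in the patch: mask over PTEG numbers, computed from the shift. */
static uint64_t pteg_mask_from_shift(unsigned htab_shift)
{
    return (1ULL << (htab_shift - 7)) - 1;
}

/* Equivalent form based on the table size (assumed to be 1 << htab_shift). */
static uint64_t pteg_mask_from_size(unsigned htab_shift)
{
    uint64_t htab_size = 1ULL << htab_shift;

    return htab_size / HASH_PTEG_SIZE_64 - 1;
}

/* With a PTEG mask, mask the hash first and then scale to a PTE index. */
static uint64_t pte_index_for_hash(uint64_t hash, uint64_t pteg_mask)
{
    return (hash & pteg_mask) * HPTES_PER_GROUP;
}
```

Because the mask changes from a byte-offset mask to a PTEG-number mask, every user of the field has to switch to the mask-then-scale order at the same time, which is the bisectability concern raised above.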

Re: [Qemu-devel] [PATCH -V5] target-ppc: Fix page table lookup with kvm enabled

2013-10-27 Thread Alexander Graf


On 11.10.2013, at 09:58, Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com 
wrote:

 Alexander Graf ag...@suse.de writes:
 
 On 11.10.2013, at 13:13, Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com 
 wrote:
 
 From: Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com
 
 With kvm enabled, we store the hash page table information in the 
 hypervisor.
 Use ioctl to read the htab contents. Without this we get the below error 
 when
 trying to read the guest address
 
 (gdb) x/10 do_fork
 0xc0098660 do_fork:   Cannot access memory at address 
 0xc0098660
 (gdb)
 
 Signed-off-by: Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com
 ---
 Changes from V4:
 * Rewrite to avoid two code paths for doing hash lookups
 
 
 
 +
 +i = 0;
 +hpte = pte_index * HASH_PTE_SIZE_64;
if (likely((flags  H_EXACT) == 0)) {
pte_index = ~7ULL;
 -hpte = pte_index * HASH_PTE_SIZE_64;
 -for (i = 0; ; ++i) {
 +token = ppc_hash64_start_access(ppc_env_get_cpu(env), pte_index);
 +do {
 
 Why convert this into a while loop?
 
 I am moving i = 0 outside the loop. Hence found while () better than 
 for(;;++i) 

Outside of what loop? You're only moving it outside of the if().

 
 
if (i == 8) {
 +ppc_hash64_stop_access(token);
return H_PTEG_FULL;
}
 -if ((ppc_hash64_load_hpte0(env, hpte)  HPTE64_V_VALID) == 0) {
 +if ((ppc_hash64_load_hpte0(token, i)  HPTE64_V_VALID) == 0) {
break;
}
 -hpte += HASH_PTE_SIZE_64;
 -}
 +} while (i++);
 +ppc_hash64_stop_access(token);
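
A side note on the two loop shapes in this hunk: in C, "while (i++)" tests the value of i before the increment, so a do/while that starts with i = 0 leaves the loop after the first pass. The standalone sketch below is not QEMU code (the function names and the valid[] array are stand-ins for the HPTE valid-bit checks); it only compares how the two shapes behave when searching a group of 8 slots for the first free one.

```c
#include <stdbool.h>

#define SLOTS_PER_GROUP 8   /* mirrors the "i == 8" check in the hunk */

/* Original shape: for (i = 0; ; ++i) scans every slot until one is free. */
static int first_free_for(const bool valid[SLOTS_PER_GROUP])
{
    int i;

    for (i = 0; ; ++i) {
        if (i == SLOTS_PER_GROUP) {
            return -1;          /* group is full */
        }
        if (!valid[i]) {
            break;              /* found a free slot */
        }
    }
    return i;
}

/* New shape: the condition sees the old value of i, which is 0 at first. */
static int first_free_do_while(const bool valid[SLOTS_PER_GROUP])
{
    int i = 0;

    do {
        if (i == SLOTS_PER_GROUP) {
            return -1;
        }
        if (!valid[i]) {
            break;
        }
    } while (i++);
    return i;
}
```

When slot 0 is free both variants agree. When slot 0 is occupied, the do/while variant stops after one pass with i == 1 without having looked at slot 1, so the two are not interchangeable.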
 
 
 
 
 +
 +int kvm_ppc_hash64_start_access(PowerPCCPU *cpu, unsigned long pte_index,
 +struct ppc_hash64_hpte_token *token)
 +{
 +int htab_fd;
 +int hpte_group_size;
 +struct kvm_get_htab_fd ghf;
 +struct kvm_get_htab_buf {
 +struct kvm_get_htab_header header;
 +/*
 + * We required one extra byte for read
 + */
 +unsigned long hpte[(HPTES_PER_GROUP * 2) + 1];
 +} hpte_buf;;
 
 Double semicolon?
 
 Will fix
 
 
 +
 +ghf.flags = 0;
 +ghf.start_index = pte_index;
 +htab_fd = kvm_vm_ioctl(kvm_state, KVM_PPC_GET_HTAB_FD, ghf);
 +if (htab_fd  0) {
 +goto error_out;
 +}
 +memset(hpte_buf, 0, sizeof(hpte_buf));
 
 
 
 diff --git a/target-ppc/mmu-hash64.c b/target-ppc/mmu-hash64.c
 index 67fc1b5..aeb4593 100644
 --- a/target-ppc/mmu-hash64.c
 +++ b/target-ppc/mmu-hash64.c
 @@ -302,29 +302,73 @@ static int ppc_hash64_amr_prot(CPUPPCState *env, 
 ppc_hash_pte64_t pte)
return prot;
 }
 
 -static hwaddr ppc_hash64_pteg_search(CPUPPCState *env, hwaddr pteg_off,
 +struct ppc_hash64_hpte_token *ppc_hash64_start_access(PowerPCCPU *cpu,
 +  unsigned long 
 pte_index)
 
 How about you also pass in the number of PTEs you want to access?
 Let's call it pte_num for now. Then if you only care about one PTE
 you can indicate so, otherwise it's clear that you want to access 8
 PTEs beginning from the one you're pointing at.
 
 So if we want to pass pte_num, then i can be any number, 1, 8, 10. That
 would make the code complex, because now we need to make the buffer
 passed to read() of variable size. Also I would need another allocation
 for the return buffer. I can do tricks like make the token handle the
 pointer to actual buffer skipping the header. But ppc_hash64_stop_access then
 would have to know about kvm htab read header which i found not nice.
 We can possibly update the function name to indicate that it will always
 read hptegroup from the pte_index. Something like ppc64_start_hpteg_access() 
 ?. 

Just abort() if pte_num is not 1 or 8.

 
 
 +{
 +hwaddr pte_offset;
 +struct ppc_hash64_hpte_token *token;
 
 void *token = NULL;
 
 if (kvmppc_uses_htab_fd(cpu)) {
/* HTAB is controlled by KVM. Fetch the PTEG into a new buffer. */
 
int hpte_group_size = sizeof(unsigned long) * 2 * pte_num;
token = g_malloc(hpte_group_size);
if (kvm_ppc_hash64_read_pteg(cpu, pte_index, token)) {
 
 That is the tricky part, the read buffer needs to have a header in the
 beginning. Maybe I can do kvm_ppc_hash64_stop_access(void *token) that
 does the pointer math, gets to the head of the token and frees it. Will try that.
 
free(token);
return NULL;
}
 } else {
/* HTAB is controlled by QEMU. Just point to the internally accessible 
 PTEG. */
hwaddr pte_offset;
 
pte_offset = pte_index * HASH_PTE_SIZE_64;
if (cpu-env.external_htab) {
token = cpu-env.external_htab + pte_offset;
} else {
token = (uint8_t *) cpu-env.htab_base + pte_offset;
}
 }
 
 return token;
 
 This way it's more obvious which path the normal code flow would be. We 
 also only clearly choose what to do depending on in-kernel HTAB or not. As a 
 big plus we don't need a struct that we need
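
The two ideas discussed in this mail (abort() unless pte_num is 1 or 8, and handing out a token that points past a read header while still being able to free the whole buffer) can be sketched as follows. This is a standalone illustration under stated assumptions, not the KVM or QEMU code: all demo_* names are hypothetical, the header layout is a stand-in, and the actual read() of the PTEs from the htab fd is left out.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define HPTES_PER_GROUP 8

/* Stand-in for the header that precedes the PTEs in the read buffer. */
struct demo_htab_header {
    uint32_t index;
    uint16_t n_valid;
    uint16_t n_invalid;
};

struct demo_pteg_buf {
    struct demo_htab_header header;
    uint64_t hpte[HPTES_PER_GROUP * 2];   /* two 64-bit words per PTE */
};

/* Returns a token pointing at the PTE words, skipping the header. */
static uint64_t *demo_start_access(int pte_num)
{
    struct demo_pteg_buf *buf;

    if (pte_num != 1 && pte_num != HPTES_PER_GROUP) {
        abort();                /* only single-PTE or whole-group access */
    }
    buf = calloc(1, sizeof(*buf));
    if (!buf) {
        return NULL;
    }
    /* A real implementation would fill buf from the htab fd here. */
    return buf->hpte;
}

/* Pointer math back from the token to the start of the allocation. */
static struct demo_pteg_buf *demo_token_to_buf(uint64_t *token)
{
    return (struct demo_pteg_buf *)
        ((char *)token - offsetof(struct demo_pteg_buf, hpte));
}

static void demo_stop_access(uint64_t *token)
{
    if (token) {
        free(demo_token_to_buf(token));
    }
}
```

Callers only ever see a pointer to PTE words, so the same load helpers can serve both the KVM path and the path where the table lives in QEMU memory; only the stop function needs to know about the header.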

Re: [Qemu-devel] [PATCH 14/60] AArch64: Add orr instruction emulation

2013-10-30 Thread Alexander Graf


On 27.09.2013, at 11:25, Richard Henderson r...@twiddle.net wrote:

 On 09/26/2013 05:48 PM, Alexander Graf wrote:
 This patch adds emulation support for the orr instruction.
 
 Signed-off-by: Alexander Graf ag...@suse.de
 ---
 target-arm/helper-a64.c|  28 +++
 target-arm/helper-a64.h|   1 +
 target-arm/translate-a64.c | 120 
 +
 3 files changed, 149 insertions(+)
 
 diff --git a/target-arm/helper-a64.c b/target-arm/helper-a64.c
 index 8105fb5..da72b7f 100644
 --- a/target-arm/helper-a64.c
 +++ b/target-arm/helper-a64.c
 @@ -24,3 +24,31 @@
 #include sysemu/sysemu.h
 #include qemu/bitops.h
 
 +uint32_t HELPER(pstate_add)(uint32_t pstate, uint64_t a1, uint64_t a2,
 +uint64_t ar)
 +{
 +int64_t s1 = a1;
 +int64_t s2 = a2;
 +int64_t sr = ar;
 +
 +pstate = ~(PSTATE_N | PSTATE_Z | PSTATE_C | PSTATE_V);
 +
 +if (sr  0) {
 +pstate |= PSTATE_N;
 +}
 +
 +if (!ar) {
 +pstate |= PSTATE_Z;
 +}
 +
 +if (ar  (ar  a1)) {
 +pstate |= PSTATE_C;
 +}
 +
 +if ((s1  0  s2  0  sr  0) ||
 +(s1  0  s2  0  sr  0)) {
 +pstate |= PSTATE_V;
 +}
 +
 +return pstate;
 +}
 
 Why are you not using the same split apart bits as A32?

There is an architecturally defined register that specifies what pstate looks 
like and IIRC that includes system level state as well, similar to EFLAGS. So I 
figured it's more straightforward to use a single variable for it.

I don't think it really makes much of a difference either way though. If we see 
that doing it in a split way makes more sense we can always just switch to that 
later.


Alex
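
For readers following the single-variable approach described above, here is a minimal standalone sketch of computing the N, Z, C and V flags for a 64-bit addition into one packed word. It is not the helper from the patch: the function and macro names are hypothetical, the result is computed inside the function rather than passed in, and the bit positions follow the architectural NZCV layout (bits 31 to 28).

```c
#include <stdint.h>

/* Hypothetical names; bit positions follow the NZCV layout. */
#define DEMO_PSTATE_N (1u << 31)
#define DEMO_PSTATE_Z (1u << 30)
#define DEMO_PSTATE_C (1u << 29)
#define DEMO_PSTATE_V (1u << 28)

static uint32_t demo_pstate_add(uint32_t pstate, uint64_t a, uint64_t b)
{
    uint64_t r = a + b;         /* unsigned add wraps modulo 2^64 */

    pstate &= ~(DEMO_PSTATE_N | DEMO_PSTATE_Z | DEMO_PSTATE_C | DEMO_PSTATE_V);

    if (r >> 63) {
        pstate |= DEMO_PSTATE_N;            /* result negative as signed */
    }
    if (r == 0) {
        pstate |= DEMO_PSTATE_Z;            /* result is zero */
    }
    if (r < a) {
        pstate |= DEMO_PSTATE_C;            /* unsigned carry out */
    }
    if (((a ^ r) & (b ^ r)) >> 63) {
        pstate |= DEMO_PSTATE_V;            /* operands share a sign, result differs */
    }
    return pstate;
}
```

The overflow test uses the sign-bit XOR form instead of spelling out both sign combinations; the two are equivalent for addition. Other bits of the packed word pass through unchanged, which is what keeping system-level state in the same variable relies on.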

Re: [Qemu-devel] [Qemu-ppc] [v2 02/13] Add lxsdx

2013-10-31 Thread Alexander Graf


On 11.10.2013, at 05:57, Tom Musta tommu...@gmail.com wrote:

 This patch adds the Load VSX Scalar Doubleword Indexed (lxsdx)
 instruction.
 
 The lower 8 bytes of the target register are undefined; this
 implementation leaves those bytes unaltered.
 
 Signed-off-by: Tom Musta tommu...@gmail.com

The diff seems to be broken. Patchworks and my mail client show 2 spaces in 
already existing code parts of the patch.


Alex


diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index bd5e89d..6ee0d80 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -7007,6 +7007,21 @@ static inline TCGv_i64 cpu_vsrl(int n)
  }
  }

+static void gen_lxsdx(DisasContext *ctx)
+{
+TCGv EA;
+if (unlikely(!ctx-vsx_enabled)) {
+gen_exception(ctx, POWERPC_EXCP_VSXU);
+return;
+}
+gen_set_access_type(ctx, ACCESS_INT);
+EA = tcg_temp_new();
+gen_addr_reg_index(ctx, EA);
+gen_qemu_ld64(ctx, cpu_vsrh(xT(ctx-opcode)), EA);
+/* NOTE: cpu_vsrl is undefined */
+tcg_temp_free(EA);
+}
+
  static void gen_lxvd2x(DisasContext *ctx)
  {
  TCGv EA;
@@ -9518,6 +9533,7 @@ GEN_VAFORM_PAIRED(vmsumshm, vmsumshs, 20),
  GEN_VAFORM_PAIRED(vsel, vperm, 21),
  GEN_VAFORM_PAIRED(vmaddfp, vnmsubfp, 23),

+GEN_HANDLER_E(lxsdx, 0x1F, 0x0C, 0x12, 0, PPC_NONE, PPC2_VSX),
  GEN_HANDLER_E(lxvd2x, 0x1F, 0x0C, 0x1A, 0, PPC_NONE, PPC2_VSX),

  GEN_HANDLER_E(stxvd2x, 0x1F, 0xC, 0x1E, 0, PPC_NONE, PPC2_VSX),

Re: [Qemu-devel] [RFC PATCH] spapr: add ibmveth to the supported network adapters list

2013-11-01 Thread Alexander Graf



Am 01.11.2013 um 03:52 schrieb Alexey Kardashevskiy a...@ozlabs.ru:

 On 10/28/2013 05:03 AM, Alexander Graf wrote:
 
 On 10.10.2013, at 20:09, Alexey Kardashevskiy a...@ozlabs.ru wrote:
 
 The problem is that -net nic,model=? does not print ibmveth in
 the list while it is actually supported.
 
 Most of the QEMU emulated network devices are PCI but ibmveth
 (a.k.a. spapr-vlan) is not. However with -net nic,model=?, QEMU prints
 only PCI devices in the list, even if it does not say that the list is
 all about PCI devices.
 
 This adds ?/help handling in spapr.c and adds ibmveth in the beginning
 of the list.
 
 Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
 ---
 
 This is an RFC patch.
 
 The other solutions could be:
 1. add ibmveth into pci_nic_models[] in hw/pci/pci.c but this would not
 be correct as ibmveth is not PCI and it must appear only on pseries 
 machine.
 
 2. implement a short version of qdev_print_category_devices() and call it
 with DEVICE_CATEGORY_NETWORK but that would print more devices than
 pci_nic_init_nofail() can handle (vmxnet3, usb-bt-dongle).
 
 3. fix qemu_check_nic_model() to specifically say that this is a list of
 PCI devices and there might be some other devices which -net nic,model+
 supports but there are not PCI but that could break compatibility (some
 management software may rely on this exact string).
 
 4. Reject the patch and just say that people must stop using -net. Ok for 
 me :)
 
 Since -net is kind of obsolete interface and does not seem to be extended 
 ever,
 the proposed patch does not look too ugly, does not it?
 ---
 hw/ppc/spapr.c | 15 +++
 1 file changed, 15 insertions(+)
 
 diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
 index c0613e4..45ed3da 100644
 --- a/hw/ppc/spapr.c
 +++ b/hw/ppc/spapr.c
 @@ -1276,6 +1276,21 @@ static void ppc_spapr_init(QEMUMachineInitArgs *args)
 
if (strcmp(nd-model, ibmveth) == 0) {
spapr_vlan_create(spapr-vio_bus, nd);
 +} else if (is_help_option(nd-model)) {
 +static const char * const nic_models[] = {
 +ibmveth,
 +ne2k_pci,
 +i82551,
 +i82557b,
 +i82559er,
 +rtl8139,
 +e1000,
 +pcnet,
 +virtio,
 +NULL
 +};
 
 I don't like the idea of duplicating that list.
 
 Neither do I :) But the list itself already looks quite ugly.
 
 Basically the list of supported -net models is incorrect today even on
 x86 where you can say -net nic,model=ne2k_isa. It really is only a list
 of PCI devices.
 
 
 I can think of a number of convoluted ways to fix this up, but I think
 that ignoring fully accuracy of the output of -net model=? is the most
 straight forward thing to do.
 
 Does any of your convoluted ways include adding a new category
 (DEVICE_CATEGORY_NETWORK_LEGACY?) into enum DeviceCategory, adding devices
 from the list above and fixing qemu_show_nic_models() to show what is in
 the category?

Most of them consist of a full redesign of the way -net works :).

 
 Or is the -net interface deprecated and we do not even want to touch it?

I don't think we should deprecate it. It's easier to use than anything else. 
AHCI adoption heavily suffered from not being enabled in -drive - I don't want 
that again here.

Alex

 
 
 
 -- 
 Alexey

Re: [Qemu-devel] [RFC PATCH] spapr: add initial ibm, client-architecture-support rtas call support

2013-09-04 Thread Alexander Graf


On 04.09.2013, at 12:19, Alexey Kardashevskiy wrote:

   This is an RFC patch.
 
 The modern Linux kernel supports every known POWERPC CPU so when
 it boots, it can always find a matching cpu_spec from the cpu_specs array.
 However if the kernel is quite old, it may be missing the definition of
 the actual CPU. To provide ability for old kernels to work on modern
 hardware, a Logical Processor Version concept was introduced in PowerISA.
 From the hardware perspective, it is supported by PCR (Processor
 Compatibility Register) which is defined in PowerISA. The register
 enables compatibility mode which can be set to PowerISA 2.05 or 2.06.
 
 PAPR+ specification defines a Logical Processor Version per every
 version of PowerISA specification. PAPR+ also defines
 a ibm,client-architecture-support rtas call which purpose is to provide
 a negotiation mechanism for the guest and the hypervisor to work out
 the best Logical Processor Version to continue with.
 
 At the moment, the Linux kernel calls the ibm,client-architecture-support
 method and only then reads the device. The current RTAS's handler checks
 the capabilities from the array supplied by the guest kernel, analyses
 if QEMU can or cannot provide with the requested features.
 If QEMU supports everything the guest has requested, it returns from rtas
 call and the guest continues booting.
 If some parameter changes, QEMU fixes the device tree and reboots
 the guest with a new tree.
 
 In this version, the ibm,client-architecture-support handler checks
 if the current CPU is in the list from the guest and if it is not, QEMU
 adds a cpu-version property to a cpu node with the best of logical PVRs
 supported by the guest.
 
 Technically QEMU reboots and as a part of reboot, it fixes the tree and
 this is when the cpu-version property is actually added.
 
 Although it seems possible to add a custom interface between SLOF and QEMU
 and implement device tree update on the fly to avoid a guest reboot,
 there still may be cases when device tree change would not be enough.
 As an example, the guest may ask for a bigger RMA area than QEMU allocates
 by default.
 
 The patch depends on [PATCH v5] powerpc: add PVR mask support.
 
 Cc: Nikunj A Dadhania nik...@linux.vnet.ibm.com
 Cc: Andreas Färber afaer...@suse.de
 Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
 ---
 hw/ppc/spapr.c  | 10 ++
 hw/ppc/spapr_hcall.c| 76 +
 include/hw/ppc/spapr.h  |  7 -
 target-ppc/cpu-models.h | 13 
 target-ppc/cpu-qom.h|  1 +
 target-ppc/translate_init.c |  3 ++
 6 files changed, 109 insertions(+), 1 deletion(-)
 
 diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
 index 13574bf..5adf53c 100644
 --- a/hw/ppc/spapr.c
 +++ b/hw/ppc/spapr.c
 @@ -238,6 +238,16 @@ static int spapr_fixup_cpu_dt(void *fdt, 
 sPAPREnvironment *spapr)
 if (ret  0) {
 return ret;
 }
 +
 +if (spapr-pvr_new) {
 +ret = fdt_setprop(fdt, offset, cpu-version,
 +  spapr-pvr_new, sizeof(spapr-pvr_new));
 +if (ret  0) {
 +return ret;
 +}
 +/* Reset as the guest after reboot may give other PVR set */
 +spapr-pvr_new = 0;
 +}
 }
 return ret;
 }
 diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
 index 9f6e7b8..509de89 100644
 --- a/hw/ppc/spapr_hcall.c
 +++ b/hw/ppc/spapr_hcall.c
 @@ -3,6 +3,7 @@
 #include helper_regs.h
 #include hw/ppc/spapr.h
 #include mmu-hash64.h
 +#include cpu-models.h
 
 static target_ulong h_random(PowerPCCPU *cpu, sPAPREnvironment *spapr,
target_ulong opcode, target_ulong *args)
 @@ -792,6 +793,78 @@ out:
 return ret;
 }
 
 +static target_ulong h_client_architecture_support(PowerPCCPU *cpu,
 +  sPAPREnvironment *spapr,
 +  target_ulong opcode,
 +  target_ulong *args)
 +{
 +target_ulong list = args[0];
 +int i, number_of_option_vectors;
 +PowerPCCPUClass *pcc = POWERPC_CPU_GET_CLASS(cpu);
 +bool cpu_match = false;
 +unsigned compat_cpu_level = 0, pvr_new;
 +
 +/* Parse PVR list */
 +for ( ; ; ) {
 +uint32_t pvr, pvr_mask;
 +
 +pvr_mask = ldl_phys(list);
 +list += 4;
 +pvr = ldl_phys(list);
 +list += 4;
 +
 +if ((cpu-env.spr[SPR_PVR]  pvr_mask) == (pvr  pvr_mask)) {
 +cpu_match = true;
 +pvr_new = cpu-env.spr[SPR_PVR];
 +}
 +
 +/* Is it a logical PVR? */
 +if ((pvr  CPU_POWERPC_LOGICAL_MASK) == CPU_POWERPC_LOGICAL_MASK) {
 +switch (pvr) {
 +case CPU_POWERPC_LOGICAL_2_05:
 +if ((pcc-pcr  POWERPC_ISA_COMPAT_2_05) 
 +(compat_cpu_level  2050)) {
 +compat_cpu_level = 2050;
 +
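
The negotiation described in the commit message above (walk the guest's list of mask/PVR pairs, note whether the real CPU is known to the guest, otherwise pick the best logical PVR both sides support) can be sketched as below. This is a standalone illustration, not the truncated handler from the patch: the demo_* names are hypothetical, the list is passed as an array with an explicit length instead of being read from guest memory, and the logical PVR constants are illustrative values assumed for this sketch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative values, assumed for this sketch. */
#define DEMO_LOGICAL_MASK 0x0f000000u
#define DEMO_LOGICAL_2_05 0x0f000002u
#define DEMO_LOGICAL_2_06 0x0f000003u

struct demo_pvr_entry {
    uint32_t mask;
    uint32_t pvr;
};

/*
 * Returns 0 when no cpu-version property is needed (the guest knows the
 * real CPU, or no common logical PVR exists), otherwise the best logical
 * PVR supported by both the guest and the host.
 */
static uint32_t demo_pick_cpu_version(uint32_t host_pvr,
                                      bool host_has_2_05,
                                      bool host_has_2_06,
                                      const struct demo_pvr_entry *list,
                                      size_t n)
{
    bool cpu_match = false;
    uint32_t best = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        uint32_t mask = list[i].mask;
        uint32_t pvr = list[i].pvr;

        if ((host_pvr & mask) == (pvr & mask)) {
            cpu_match = true;           /* guest has a spec for this CPU */
        }
        if ((pvr & DEMO_LOGICAL_MASK) == DEMO_LOGICAL_MASK) {
            bool ok = (pvr == DEMO_LOGICAL_2_05 && host_has_2_05) ||
                      (pvr == DEMO_LOGICAL_2_06 && host_has_2_06);

            if (ok && pvr > best) {
                best = pvr;             /* higher value = newer ISA level */
            }
        }
    }
    return cpu_match ? 0 : best;
}
```

A non-zero return value corresponds to the case where the device tree is fixed up with a cpu-version property and the guest is rebooted; zero corresponds to returning from the call and letting the guest continue booting.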

Re: [Qemu-devel] [RFC PATCH] spapr: add initial ibm, client-architecture-support rtas call support

2013-09-04 Thread Alexander Graf


On 04.09.2013, at 13:40, Alexey Kardashevskiy wrote:

 On 09/04/2013 08:42 PM, Alexander Graf wrote:
 
 On 04.09.2013, at 12:19, Alexey Kardashevskiy wrote:
 
  This is an RFC patch.
 
 The modern Linux kernel supports every known POWERPC CPU so when
 it boots, it can always find a matching cpu_spec from the cpu_specs array.
 However if the kernel is quite old, it may be missing the definition of
 the actual CPU. To provide ability for old kernels to work on modern
 hardware, a Logical Processor Version concept was introduced in PowerISA.
 From the hardware perspective, it is supported by PCR (Processor
 Compatibility Register) which is defined in PowerISA. The register
 enables compatibility mode which can be set to PowerISA 2.05 or 2.06.
 
 PAPR+ specification defines a Logical Processor Version per every
 version of PowerISA specification. PAPR+ also defines
 a ibm,client-architecture-support rtas call which purpose is to provide
 a negotiation mechanism for the guest and the hypervisor to work out
 the best Logical Processor Version to continue with.
 
 At the moment, the Linux kernel calls the ibm,client-architecture-support
 method and only then reads the device. The current RTAS's handler checks
 the capabilities from the array supplied by the guest kernel, analyses
 if QEMU can or cannot provide with the requested features.
 If QEMU supports everything the guest has requested, it returns from rtas
 call and the guest continues booting.
 If some parameter changes, QEMU fixes the device tree and reboots
 the guest with a new tree.
 
 In this version, the ibm,client-architecture-support handler checks
 if the current CPU is in the list from the guest and if it is not, QEMU
 adds a cpu-version property to a cpu node with the best of logical PVRs
 supported by the guest.
 
 Technically QEMU reboots and as a part of reboot, it fixes the tree and
 this is when the cpu-version property is actually added.
 
 Although it seems possible to add a custom interface between SLOF and QEMU
 and implement device tree update on the fly to avoid a guest reboot,
 there still may be cases when device tree change would not be enough.
 As an example, the guest may ask for a bigger RMA area than QEMU allocates
 by default.
 
 The patch depends on [PATCH v5] powerpc: add PVR mask support.
 
 Cc: Nikunj A Dadhania nik...@linux.vnet.ibm.com
 Cc: Andreas Färber afaer...@suse.de
 Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
 ---
 hw/ppc/spapr.c  | 10 ++
 hw/ppc/spapr_hcall.c| 76 
 +
 include/hw/ppc/spapr.h  |  7 -
 target-ppc/cpu-models.h | 13 
 target-ppc/cpu-qom.h|  1 +
 target-ppc/translate_init.c |  3 ++
 6 files changed, 109 insertions(+), 1 deletion(-)
 
 diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
 index 13574bf..5adf53c 100644
 --- a/hw/ppc/spapr.c
 +++ b/hw/ppc/spapr.c
 @@ -238,6 +238,16 @@ static int spapr_fixup_cpu_dt(void *fdt, 
 sPAPREnvironment *spapr)
if (ret  0) {
return ret;
}
 +
 +if (spapr-pvr_new) {
 +ret = fdt_setprop(fdt, offset, cpu-version,
 +  spapr-pvr_new, sizeof(spapr-pvr_new));
 +if (ret  0) {
 +return ret;
 +}
 +/* Reset as the guest after reboot may give other PVR set */
 +spapr-pvr_new = 0;
 +}
}
return ret;
 }
 diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
 index 9f6e7b8..509de89 100644
 --- a/hw/ppc/spapr_hcall.c
 +++ b/hw/ppc/spapr_hcall.c
 @@ -3,6 +3,7 @@
 #include helper_regs.h
 #include hw/ppc/spapr.h
 #include mmu-hash64.h
 +#include cpu-models.h
 
 static target_ulong h_random(PowerPCCPU *cpu, sPAPREnvironment *spapr,
   target_ulong opcode, target_ulong *args)
 @@ -792,6 +793,78 @@ out:
return ret;
 }
 
 +static target_ulong h_client_architecture_support(PowerPCCPU *cpu,
 +  sPAPREnvironment *spapr,
 +  target_ulong opcode,
 +  target_ulong *args)
 +{
 +target_ulong list = args[0];
 +int i, number_of_option_vectors;
 +PowerPCCPUClass *pcc = POWERPC_CPU_GET_CLASS(cpu);
 +bool cpu_match = false;
 +unsigned compat_cpu_level = 0, pvr_new;
 +
 +/* Parse PVR list */
 +for ( ; ; ) {
 +uint32_t pvr, pvr_mask;
 +
 +pvr_mask = ldl_phys(list);
 +list += 4;
 +pvr = ldl_phys(list);
 +list += 4;
 +
 +if ((cpu-env.spr[SPR_PVR]  pvr_mask) == (pvr  pvr_mask)) {
 +cpu_match = true;
 +pvr_new = cpu-env.spr[SPR_PVR];
 +}
 +
 +/* Is it a logical PVR? */
 +if ((pvr  CPU_POWERPC_LOGICAL_MASK) == CPU_POWERPC_LOGICAL_MASK) {
 +switch (pvr) {
 +case CPU_POWERPC_LOGICAL_2_05:
 +if ((pcc-pcr  POWERPC_ISA_COMPAT_2_05

Re: [Qemu-devel] [RFC PATCH] spapr: add initial ibm, client-architecture-support rtas call support

2013-09-04 Thread Alexander Graf


On 04.09.2013, at 15:08, Alexey Kardashevskiy wrote:

 On 09/04/2013 10:13 PM, Alexander Graf wrote:
 
 On 04.09.2013, at 13:40, Alexey Kardashevskiy wrote:
 
 On 09/04/2013 08:42 PM, Alexander Graf wrote:
 
 On 04.09.2013, at 12:19, Alexey Kardashevskiy wrote:
 
 This is an RFC patch.
 
 The modern Linux kernel supports every known POWERPC CPU so when
 it boots, it can always find a matching cpu_spec from the cpu_specs array.
 However if the kernel is quite old, it may be missing the definition of
 the actual CPU. To provide ability for old kernels to work on modern
 hardware, a Logical Processor Version concept was introduced in PowerISA.
 From the hardware perspective, it is supported by PCR (Processor
 Compatibility Register) which is defined in PowerISA. The register
 enables compatibility mode which can be set to PowerISA 2.05 or 2.06.
 
 PAPR+ specification defines a Logical Processor Version per every
 version of PowerISA specification. PAPR+ also defines
 a ibm,client-architecture-support rtas call which purpose is to provide
 a negotiation mechanism for the guest and the hypervisor to work out
 the best Logical Processor Version to continue with.
 
 At the moment, the Linux kernel calls the ibm,client-architecture-support
 method and only then reads the device. The current RTAS's handler checks
 the capabilities from the array supplied by the guest kernel, analyses
 if QEMU can or cannot provide with the requested features.
 If QEMU supports everything the guest has requested, it returns from rtas
 call and the guest continues booting.
 If some parameter changes, QEMU fixes the device tree and reboots
 the guest with a new tree.
 
 In this version, the ibm,client-architecture-support handler checks
 if the current CPU is in the list from the guest and if it is not, QEMU
 adds a cpu-version property to a cpu node with the best of logical PVRs
 supported by the guest.
 
 Technically QEMU reboots and as a part of reboot, it fixes the tree and
 this is when the cpu-version property is actually added.
 
 Although it seems possible to add a custom interface between SLOF and QEMU
 and implement device tree update on the fly to avoid a guest reboot,
 there still may be cases when device tree change would not be enough.
 As an example, the guest may ask for a bigger RMA area than QEMU allocates
 by default.
 
 The patch depends on [PATCH v5] powerpc: add PVR mask support.
 
 Cc: Nikunj A Dadhania <nik...@linux.vnet.ibm.com>
 Cc: Andreas Färber <afaer...@suse.de>
 Signed-off-by: Alexey Kardashevskiy <a...@ozlabs.ru>
 ---
 hw/ppc/spapr.c              | 10 ++
 hw/ppc/spapr_hcall.c        | 76 +
 include/hw/ppc/spapr.h      |  7 -
 target-ppc/cpu-models.h     | 13
 target-ppc/cpu-qom.h        |  1 +
 target-ppc/translate_init.c |  3 ++
 6 files changed, 109 insertions(+), 1 deletion(-)
 
 diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
 index 13574bf..5adf53c 100644
 --- a/hw/ppc/spapr.c
 +++ b/hw/ppc/spapr.c
 @@ -238,6 +238,16 @@ static int spapr_fixup_cpu_dt(void *fdt, sPAPREnvironment *spapr)
          if (ret < 0) {
              return ret;
          }
 +
 +        if (spapr->pvr_new) {
 +            ret = fdt_setprop(fdt, offset, "cpu-version",
 +                              &spapr->pvr_new, sizeof(spapr->pvr_new));
 +            if (ret < 0) {
 +                return ret;
 +            }
 +            /* Reset as the guest after reboot may give other PVR set */
 +            spapr->pvr_new = 0;
 +        }
      }
      return ret;
  }
 diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
 index 9f6e7b8..509de89 100644
 --- a/hw/ppc/spapr_hcall.c
 +++ b/hw/ppc/spapr_hcall.c
 @@ -3,6 +3,7 @@
 #include "helper_regs.h"
 #include "hw/ppc/spapr.h"
 #include "mmu-hash64.h"
 +#include "cpu-models.h"
 
 static target_ulong h_random(PowerPCCPU *cpu, sPAPREnvironment *spapr,
  target_ulong opcode, target_ulong *args)
 @@ -792,6 +793,78 @@ out:
   return ret;
 }
 
 +static target_ulong h_client_architecture_support(PowerPCCPU *cpu,
 +                                                  sPAPREnvironment *spapr,
 +                                                  target_ulong opcode,
 +                                                  target_ulong *args)
 +{
 +    target_ulong list = args[0];
 +    int i, number_of_option_vectors;
 +    PowerPCCPUClass *pcc = POWERPC_CPU_GET_CLASS(cpu);
 +    bool cpu_match = false;
 +    unsigned compat_cpu_level = 0, pvr_new;
 +
 +    /* Parse PVR list */
 +    for ( ; ; ) {
 +        uint32_t pvr, pvr_mask;
 +
 +        pvr_mask = ldl_phys(list);
 +        list += 4;
 +        pvr = ldl_phys(list);
 +        list += 4;
 +
 +        if ((cpu->env.spr[SPR_PVR] & pvr_mask) == (pvr & pvr_mask)) {
 +            cpu_match = true;
 +            pvr_new = cpu->env.spr[SPR_PVR];
 +        }
 +
 +        /* Is it a logical PVR? */
 +        if ((pvr & CPU_POWERPC_LOGICAL_MASK) == CPU_POWERPC_LOGICAL_MASK) {
 +            switch (pvr

Re: [Qemu-devel] [PATCH] spapr-rtas: reset top 4 bits in parameters address

2013-09-05 Thread Alexander Graf



On 05.09.2013, at 07:58, Alexey Kardashevskiy <a...@ozlabs.ru> wrote:

 On the real hardware, RTAS is called in real mode and therefore
 ignores the top 4 bits of the address passed in the call.

Shouldn't we ignore the upper 4 bits for every memory access in real mode, not 
just that one parameter?

Alex

 
 This fixes QEMU to do the same thing.
 
 Signed-off-by: Alexey Kardashevskiy <a...@ozlabs.ru>
 ---
 hw/ppc/spapr_rtas.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
 
 diff --git a/hw/ppc/spapr_rtas.c b/hw/ppc/spapr_rtas.c
 index eb542f2..ab03d67 100644
 --- a/hw/ppc/spapr_rtas.c
 +++ b/hw/ppc/spapr_rtas.c
 @@ -240,7 +240,8 @@ target_ulong spapr_rtas_call(PowerPCCPU *cpu, sPAPREnvironment *spapr,
          struct rtas_call *call = rtas_table + (token - TOKEN_BASE);
 
          if (call->fn) {
 -            call->fn(cpu, spapr, token, nargs, args, nret, rets);
 +            call->fn(cpu, spapr, token, nargs, args & 0x0FFFFFFFFFFFFFFFUL,
 +                     nret, rets);
              return H_SUCCESS;
          }
      }
 -- 
 1.8.4.rc4

Re: [Qemu-devel] [Qemu-ppc] [PATCH 16/16] target-ppc: Convert to new ldst opcodes

2013-09-05 Thread Alexander Graf


On 04.09.2013, at 23:05, Richard Henderson wrote:

 This lets us change le_mode to end_mode, fold away nearly all
 of the tests for the current cpu endianness, and remove all of the
 explicitly generated bswap opcodes.
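
As a rough illustration of the idea in the description above, and not TCG code, the sketch below contrasts a load followed by a conditional byte swap with a load that takes the endianness as a parameter. All names (`bswap32`, `load32_then_swap`, `load32`, `enum endian`) are hypothetical, and the assumption that the old path loads big-endian first is mine.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Byte-swap a 32-bit value, as an explicit bswap operation would. */
static uint32_t bswap32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000FF00u) |
           ((v << 8) & 0x00FF0000u) | (v << 24);
}

/* Old shape: load big-endian, then swap when the CPU is in LE mode. */
static uint32_t load32_then_swap(const uint8_t *p, bool le_mode)
{
    uint32_t v;

    assert(p != NULL);
    v = ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
        ((uint32_t)p[2] << 8) | (uint32_t)p[3];
    return le_mode ? bswap32(v) : v;
}

/* New shape: the endianness is an argument of the load itself. */
enum endian { END_BIG, END_LITTLE };

static uint32_t load32(const uint8_t *p, enum endian e)
{
    assert(p != NULL);
    if (e == END_LITTLE) {
        return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
               ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
    }
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}
```

Both shapes return the same value for the same bytes and mode; the second leaves no separate swap step for the caller to emit or test for.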
 
 Cc: qemu-...@nongnu.org
 Signed-off-by: Richard Henderson <r...@twiddle.net>

No complaints from me, apart from the usual "LE mode isn't necessarily what you 
think it is on PPC" one. But the code would be as broken as before IIUC.

Ben, you had some insight into how LE mode works on different PPC flavors. Could 
you please make sure we're not walking in the wrong direction here?


Alex

Re: [Qemu-devel] [RFC PATCH] spapr: support time base offset migration

2013-09-05 Thread Alexander Graf


On 05.09.2013, at 06:54, Alexey Kardashevskiy wrote:

 On 09/05/2013 02:30 PM, David Gibson wrote:
 On Tue, Sep 03, 2013 at 05:31:42PM +1000, Alexey Kardashevskiy wrote:
 This allows guests to have a different timebase origin from the host.
 
 This is needed for migration, where a guest can migrate from one host
 to another and the two hosts might have a different timebase origin.
 However, the timebase seen by the guest must not go backwards, and
 should go forwards only by a small amount corresponding to the time
 taken for the migration.
 
 This is only supported for recent POWER hardware which has the TBU40
 (timebase upper 40 bits) register. That includes POWER6, 7, 8 but not
 970.
 
 This adds kvm_access_one_reg() to access a special register which is not
 in env->spr.
 
 The feature must be present in the host kernel.
 
 Signed-off-by: Alexey Kardashevskiy <a...@ozlabs.ru>
 ---
 
 This is an RFC, not a final patch. It may break something, but I just do not 
 see what.
 
 ---
 hw/ppc/ppc.c | 49 +
 include/hw/ppc/ppc.h |  4 
 target-ppc/kvm.c | 23 +++
 target-ppc/machine.c | 44 
 trace-events |  3 +++
 5 files changed, 123 insertions(+)
 
 diff --git a/hw/ppc/ppc.c b/hw/ppc/ppc.c
 index 1e3cab3..7d08c9a 100644
 --- a/hw/ppc/ppc.c
 +++ b/hw/ppc/ppc.c
 @@ -31,6 +31,7 @@
 #include "hw/loader.h"
 #include "sysemu/kvm.h"
 #include "kvm_ppc.h"
 +#include "trace.h"
 
 //#define PPC_DEBUG_IRQ
 #define PPC_DEBUG_TB
 @@ -796,6 +797,54 @@ static void cpu_ppc_set_tb_clk (void *opaque, uint32_t freq)
 cpu_ppc_store_purr(cpu, 0x0000000000000000ULL);
 }
 
 +/*
 + * Calculate timebase on the destination side of migration
 
 + * We calculate new timebase offset as shown below:
 + * 1) Gtb2 = Gtb1 + max(tod2 - tod1, 0)
 + *Gtb2 = tb2 + off2
 + *Gtb1 = tb1 + off1
 + * 2) tb2 + off2 = tb1 + off1 + max(tod2 - tod1, 0)
 + * 3) off2 = tb1 - tb2 + off1 + max(tod2 - tod1, 0)
 + *
 + * where:
 + * Gtb2 - destination guest timebase
 + * tb2 - destination host timebase
 + * off2 - destination timebase offset
 + * tod2 - destination time of the day
 + * Gtb1 - source guest timebase
 + * tb1 - source host timebase
 + * off1 - source timebase offset
 + * tod1 - source time of the day
 + *
 + * The result we want is in @off2
 + *
 + * Two conditions must be met for @off2:
 + * 1) off2 must be a multiple of 2^24 ticks as it will be set via TBU40 SPR
 + * 2) Gtb2 >= Gtb1
 
 What about the TCG case, where there is no host timebase, only a
 host system clock?
 
 
 cpu_get_real_ticks() returns ticks, this is what the patch cares about.
 What is the difference between KVM and TCG here?
 
 
 + */
 +void cpu_ppc_adjust_tb_offset(ppc_tb_t *tb_env)
 +{
 +    uint64_t tb2, tod2, off2;
 +    int ratio = tb_env->tb_freq / 1000000;
 +    struct timeval tv;
 +
 +    tb2 = cpu_get_real_ticks();
 +    gettimeofday(&tv, NULL);
 +    tod2 = tv.tv_sec * 1000000 + tv.tv_usec;
 +
 +    off2 = tb_env->timebase - tb2 + tb_env->tb_offset;
 +    if (tod2 > tb_env->time_of_the_day) {
 +        off2 += (tod2 - tb_env->time_of_the_day) * ratio;
 +    }
 +    off2 = ROUND_UP(off2, 1 << 24);
 +
 +    trace_ppc_tb_adjust(tb_env->tb_offset, off2,
 +                        (int64_t)off2 - tb_env->tb_offset);
 +
 +    tb_env->tb_offset = off2;
 +}
 +
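
The arithmetic from the comment and function above can be tried outside QEMU. The following is a standalone sketch, assuming microsecond timestamps and a timebase frequency that is a whole number of MHz; `dest_tb_offset`, `round_up_pow2` and `TBU40_GRANULE` are hypothetical names, not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* The low 24 bits cannot be set through TBU40, so offsets move in 2^24 steps. */
#define TBU40_GRANULE (UINT64_C(1) << 24)

/* Round v up to the next multiple of a power-of-two alignment. */
static uint64_t round_up_pow2(uint64_t v, uint64_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (v + align - 1) & ~(align - 1);
}

/*
 * off2 = tb1 - tb2 + off1 + max(tod2 - tod1, 0) converted to ticks,
 * then rounded up so that the guest timebase never goes backwards.
 */
static uint64_t dest_tb_offset(uint64_t tb1, uint64_t off1, uint64_t tod1_us,
                               uint64_t tb2, uint64_t tod2_us,
                               uint64_t tb_freq_hz)
{
    uint64_t ticks_per_us = tb_freq_hz / 1000000;
    uint64_t off2 = tb1 - tb2 + off1;   /* unsigned wraparound is intended */

    if (tod2_us > tod1_us) {
        off2 += (tod2_us - tod1_us) * ticks_per_us;
    }
    return round_up_pow2(off2, TBU40_GRANULE);
}
```

Rounding up rather than down is what keeps condition 2 (Gtb2 >= Gtb1) true after the offset is forced onto a 2^24 boundary; the cost is that the guest timebase may jump forward by up to one granule more than the migration took.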
 /* Set up (once) timebase frequency (in Hz) */
 clk_setup_cb cpu_ppc_tb_init (CPUPPCState *env, uint32_t freq)
 {
 diff --git a/include/hw/ppc/ppc.h b/include/hw/ppc/ppc.h
 index 132ab97..235871c 100644
 --- a/include/hw/ppc/ppc.h
 +++ b/include/hw/ppc/ppc.h
 @@ -32,6 +32,9 @@ struct ppc_tb_t {
 uint64_t purr_start;
 void *opaque;
 uint32_t flags;
 +/* Cached values for live migration purposes */
 +uint64_t timebase;
 +uint64_t time_of_the_day;
 
 How is the time of day encoded here?
 
 
 Microseconds. I'll put a comment here, I just thought it is quite obvious
 as gettimeofday() returns microseconds.
 
 
 };
 
 /* PPC Timers flags */
 @@ -46,6 +49,7 @@ struct ppc_tb_t {
*/
 
 uint64_t cpu_ppc_get_tb(ppc_tb_t *tb_env, uint64_t vmclk, int64_t tb_offset);
 +void cpu_ppc_adjust_tb_offset(ppc_tb_t *tb_env);
 clk_setup_cb cpu_ppc_tb_init (CPUPPCState *env, uint32_t freq);
 /* Embedded PowerPC DCR management */
 typedef uint32_t (*dcr_read_cb)(void *opaque, int dcrn);
 diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
 index 7af9e3d..93df955 100644
 --- a/target-ppc/kvm.c
 +++ b/target-ppc/kvm.c
 @@ -35,6 +35,7 @@
 #include "hw/sysbus.h"
 #include "hw/ppc/spapr.h"
 #include "hw/ppc/spapr_vio.h"
 +#include "hw/ppc/ppc.h"
 #include "sysemu/watchdog.h"
 
 //#define DEBUG_KVM
 @@ -761,6 +762,22 @@ static int kvm_put_vpa(CPUState *cs)
 }
 #endif /* TARGET_PPC64 */
 
 +static int kvm_access_one_reg(CPUState *cs, bool set, __u64 id, void *addr)
 
 I think it would be nicer to have separate set_one_reg and get_one_reg
 functions, rather than

Re: [Qemu-devel] [PATCH] spapr-rtas: reset top 4 bits in parameters address

2013-09-05 Thread Alexander Graf


On 05.09.2013, at 09:40, Alexey Kardashevskiy wrote:

 On 09/05/2013 05:08 PM, Alexander Graf wrote:
 
 
 On 05.09.2013, at 07:58, Alexey Kardashevskiy <a...@ozlabs.ru> wrote:
 
 On the real hardware, RTAS is called in real mode and therefore
 ignores the top 4 bits of the address passed in the call.
 
 Shouldn't we ignore the upper 4 bits for every memory access in real mode, 
 not just that one parameter?
 
 We probably should but I just do not see any easy way of doing this. Yet
 another "Ignore N bits on the top" memory region type? No idea.

Well, it already works for code that runs inside of guest context, because 
there the softmmu code for real mode strips the upper 4 bits.

I basically see 2 ways of fixing this correctly:

1) Don't access memory through cpu_physical_memory_rw or ldx_phys but instead 
through real mode wrappers that strip the upper 4 bits, similar to how we 
handle virtual memory differently from physical memory.

2) Create 15 aliases to system_memory at the upper 4 bits of address space. 
That should at the end of the day give you the same effect.

The fix as you're proposing it wouldn't work for indirect memory descriptors. 
Imagine you have an address parameter that gives you a pointer to a struct in 
memory that again contains a pointer. You still want that pointer to be 
interpreted correctly, no?
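
A rough sketch of option 1, and of why the indirect case matters, in plain C. `guest_ram`, `ldq_real`, `stq_real` and `read_indirect` are made-up names standing in for QEMU's memory API, and the 64-bit big-endian layout is an assumption for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Real mode ignores the top 4 bits of a 64-bit address. */
#define REAL_MODE_MASK UINT64_C(0x0FFFFFFFFFFFFFFF)

/* Toy stand-in for guest RAM. */
static uint8_t guest_ram[256];

/* Hypothetical real-mode load: strip the top 4 bits, read 64 bits big-endian. */
static uint64_t ldq_real(uint64_t addr)
{
    uint64_t pa = addr & REAL_MODE_MASK;
    uint64_t v = 0;

    assert(pa + 8 <= sizeof(guest_ram));
    for (int i = 0; i < 8; i++) {
        v = (v << 8) | guest_ram[pa + i];
    }
    return v;
}

/* Hypothetical real-mode store, the mirror image of ldq_real(). */
static void stq_real(uint64_t addr, uint64_t v)
{
    uint64_t pa = addr & REAL_MODE_MASK;

    assert(pa + 8 <= sizeof(guest_ram));
    for (int i = 0; i < 8; i++) {
        guest_ram[pa + i] = (uint8_t)(v >> (56 - 8 * i));
    }
}

/* Follow a descriptor whose first field is itself a pointer to the value. */
static uint64_t read_indirect(uint64_t desc_addr)
{
    uint64_t inner = ldq_real(desc_addr);   /* may also carry top bits */

    return ldq_real(inner);
}
```

Because every access goes through the wrapper, the pointer stored inside the descriptor is masked as well, which masking only the outer parameter would not do.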


Alex
