Re: [PATCH] powerpc/e500: move qemu machine spec together with the rest

2015-09-14 Thread Alexander Graf


> On 14.09.2015 at 15:17, Laurentiu Tudor wrote:
> 
>> On 09/10/2015 02:01 AM, Scott Wood wrote:
>>> On Fri, 2015-09-04 at 15:46 +0300, Laurentiu Tudor wrote:
>>> This way we get rid of an entire file of mostly
>>> duplicated code, plus a Kconfig option that you always
>>> had to remember to enable in order for kvm to work.
>>> 
>>> Signed-off-by: Laurentiu Tudor 
>>> ---
>>> arch/powerpc/platforms/85xx/Kconfig   | 15 -
>>> arch/powerpc/platforms/85xx/Makefile  |  1 -
>>> arch/powerpc/platforms/85xx/corenet_generic.c |  1 +
>>> arch/powerpc/platforms/85xx/qemu_e500.c   | 85 
>> 
>> 
>> qemu_e500 is not only for corenet chips.  
> 
> That's too bad. :-(
> I remember there were discussions about dropping e500v2 support at some point?
> 
>> We can add it to the defconfig (in fact I've been meaning to do so).
> 
> Or maybe just drop the Kconfig option and
> wrap the file in an #ifdef CONFIG_KVM or something along these lines?

CONFIG_KVM is for host support though. This is for the guest kernel.
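For context, the split is roughly this (a sketch; CONFIG_KVM_GUEST is the
guest-side option in arch/powerpc, named here for illustration):

    /*
     * CONFIG_KVM       - host-side support: builds the code that runs guests
     * CONFIG_KVM_GUEST - guest-side support: what a kernel that itself runs
     *                    under qemu/KVM would key off
     */
    #ifdef CONFIG_KVM_GUEST         /* note: not CONFIG_KVM */
    /* ... guest-only platform code would go here ... */
    #endif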

Alex



Re: [PATCH 3/3] KVM: PPC: Book3S HV: Implement H_CLEAR_REF and H_CLEAR_MOD

2015-09-12 Thread Alexander Graf


> On 12.09.2015 at 18:47, Nathan Whitehorn wrote:
> 
>> On 09/06/15 16:52, Paul Mackerras wrote:
>>> On Sun, Sep 06, 2015 at 12:47:12PM -0700, Nathan Whitehorn wrote:
>>> Anything I can do to help move these along? It's a big performance
>>> improvement for FreeBSD guests.
>> These patches are in Paolo's kvm-ppc-next branch and should go into
>> Linus' tree in the next couple of days.
>> 
>> Paul.
> 
> One additional question: what is your preferred way to enable these? Since
> they are part of the mandatory PAPR spec, I think there's an
> argument for adding them to the default_hcall_list. Otherwise, they should be
> enabled by default in QEMU (I can take care of sending that patch if you
> prefer this route).

The default hcall list just describes which hcalls were implicitly enabled at
the time we made them controllable by user space. IMHO no new hcalls
should get added there.

So yes, please send a patch to qemu :).
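For reference, the user-space side of enabling an hcall boils down to one
KVM_ENABLE_CAP vm ioctl per hcall (a minimal sketch of the raw interface;
error handling omitted):

    #include <linux/kvm.h>
    #include <sys/ioctl.h>

    /* Enable a single hcall for a VM: args[0] = hcall number, args[1] = 1 */
    static int enable_hcall(int vm_fd, unsigned long hcall)
    {
            struct kvm_enable_cap cap = {
                    .cap  = KVM_CAP_PPC_ENABLE_HCALL,
                    .args = { hcall, 1 },
            };

            return ioctl(vm_fd, KVM_ENABLE_CAP, &cap);
    }

IIRC QEMU already wraps essentially this in a kvmppc_enable_hcall() helper,
so the patch should be small.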


Alex



Re: [Qemu-ppc] KVM memory slots limit on powerpc

2015-09-04 Thread Alexander Graf


On 04.09.15 11:59, Christian Borntraeger wrote:
> On 04.09.2015 at 11:35, Thomas Huth wrote:
>>
>>  Hi all,
>>
>> now that we are getting memory hotplug for the spapr machine on qemu-ppc,
>> too, it seems we can easily hit the limit of KVM-internal memory
>> slots ("#define KVM_USER_MEM_SLOTS 32" in
>> arch/powerpc/include/asm/kvm_host.h). For example, start
>> qemu-system-ppc64 with a couple of "-device secondary-vga" and "-m
>> 4G,slots=32,maxmem=40G" and then try to hot-plug all 32 DIMMs ... and
>> you'll see that it aborts well before the last one.
>>
>> The x86 code has already increased KVM_USER_MEM_SLOTS to 509
>> (+3 internal slots = 512) ... maybe we should now increase the
>> number of slots on powerpc, too? Since we don't use internal slots on
>> POWER, would 512 be a good value? Or would fewer be sufficient?
> 
> While you're at it, the s390 value should also be increased, I guess.

That constant defines the size of the memslot array in struct kvm, which
in turn gets allocated by kzalloc, so it's pinned kernel memory that has
to be physically contiguous. Big allocations like that can become a
problem at runtime.

So maybe there is another way? Can we extend the memslot array size
dynamically somehow? Allocate it separately? How much memory does the
memslot array use up with 512 entries?
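A back-of-the-envelope answer (a sketch only; the per-slot size is an
assumed round number, the real sizeof(struct kvm_memory_slot) depends on
arch and config):

    #include <stdio.h>

    int main(void)
    {
            unsigned long assumed_slot_size = 100;  /* bytes, illustrative */
            unsigned long nslots = 512;

            /* ~50 KiB -> a multi-page, physically contiguous allocation
             * if it stays embedded in a kzalloc'd structure */
            printf("~%lu KiB\n", assumed_slot_size * nslots / 1024);
            return 0;
    }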


Alex


Re: [PATCH] KVM: ppc: Fix size of the PSPB register

2015-09-02 Thread Alexander Graf


> On 02.09.2015 at 09:26, Thomas Huth wrote:
> 
>> On 02/09/15 00:55, Benjamin Herrenschmidt wrote:
>>> On Wed, 2015-09-02 at 08:45 +1000, Paul Mackerras wrote:
>>> On Wed, Sep 02, 2015 at 08:25:05AM +1000, Benjamin Herrenschmidt
>>> wrote:
 On Tue, 2015-09-01 at 23:41 +0200, Thomas Huth wrote:
> The size of the Problem State Priority Boost Register is only
> 32 bits, so let's change the type of the corresponding variable
> accordingly to avoid future trouble.
 
 It's not future trouble, it's broken today for LE and this should
 fix it BUT 
>>> 
>>> No, it's broken today for BE hosts, which will always see 0 for the
>>> PSPB register value.  LE hosts are fine.
> 
> Right ... I just meant that nobody has really experienced trouble with this
> yet, but the bug is of course already present today.

Sounds like a great candidate for kvm-unit-tests then, no? ;)
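For anyone wanting to reproduce the effect in isolation, here is a sketch
of why a 32-bit access to a u64 field is endian-sensitive (plain user-space
C; in the kernel case the 32-bit access is the asm's lwz/stw on the field):

    #include <stdio.h>
    #include <stdint.h>

    int main(void)
    {
            uint64_t pspb = 0;

            /* what a 32-bit store (stw) through the field's address does */
            *(uint32_t *)&pspb = 0x1234;

            /* LE: lands in the low word  -> pspb = 0x0000000000001234
             * BE: lands in the high word -> pspb = 0x0000123400000000;
             * conversely, a 32-bit load at this address on BE reads the
             * high word of a u64-stored value, i.e. 0. */
            printf("pspb = 0x%016llx\n", (unsigned long long)pspb);
            return 0;
    }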


Alex



Re: [PATCH] vfio: Enable VFIO device for powerpc

2015-08-26 Thread Alexander Graf


On 13.08.15 03:15, David Gibson wrote:
> ec53500f "kvm: Add VFIO device" added a special KVM pseudo-device which is
> used to handle any necessary interactions between KVM and VFIO.
> 
> Currently that device is built on x86 and ARM, but not powerpc, although
> powerpc does support both KVM and VFIO.  This makes things awkward in
> userspace.
> 
> Currently qemu prints an alarming error message if you attempt to use VFIO
> and it can't initialize the KVM VFIO device.  We don't want to remove the
> warning, because lack of the KVM VFIO device could mean coherency problems
> on x86.  On powerpc, however, the error is harmless but looks disturbing,
> and a test based on host architecture in qemu would be ugly, and would break
> if we ever do need the KVM VFIO device for something important in the future.
> 
> There's nothing preventing the KVM VFIO device from being built for
> powerpc, so this patch turns it on.  It won't actually do anything, since
> we don't define any of the arch_*() hooks, but it will make qemu happy and
> we can extend it in future if we need to.
> 
> Signed-off-by: David Gibson 
> Reviewed-by: Eric Auger 

Paul is going to take care of the kvm-ppc tree for 4.3. Also, ppc kvm
patches should get CC'd to the kvm-ppc@vger mailing list ;).

Paul, could you please pick this one up?


Thanks!

Alex


Re: Build regressions/improvements in v4.2-rc8

2015-08-26 Thread Alexander Graf


On 24.08.15 10:36, Geert Uytterhoeven wrote:
> On Mon, Aug 24, 2015 at 10:34 AM, Geert Uytterhoeven
>  wrote:
>> JFYI, when comparing v4.2-rc8[1] to v4.2-rc7[3], the summaries are:
>>   - build errors: +4/-7
> 
> 4 regressions:
>   + /home/kisskb/slave/src/include/linux/kvm_host.h: error: array
> subscript is above array bounds [-Werror=array-bounds]:  => 430:19
> (arch/powerpc/kvm/book3s_64_mmu.c: In function 'kvmppc_mmu_book3s_64_tlbie':)
> 
> powerpc-randconfig (seen before in a v3.15-rc1 build?)

I'm not quite sure what's going wrong here. The code in question does

  kvm_for_each_vcpu(i, v, vcpu->kvm)
          kvmppc_mmu_pte_vflush(v, va >> 12, mask);

and IIUC the array we're potentially overrunning would be
kvm->vcpus[i]. But that index is bounded by the kvm_for_each_vcpu loop, no?
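
For reference, the macro is roughly this (include/linux/kvm_host.h around
v4.2):

    #define kvm_for_each_vcpu(idx, vcpup, kvm) \
            for (idx = 0; \
                 idx < atomic_read(&kvm->online_vcpus) && \
                 (vcpup = kvm_get_vcpu(kvm, idx)) != NULL; \
                 idx++)

The index is bounded by online_vcpus, an atomic counter whose value gcc
presumably cannot relate to the size of the vcpus[] array, which would
explain a false-positive -Warray-bounds on some configs.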


Alex


Re: [PULL 00/12] ppc patch queue 2015-08-22

2015-08-23 Thread Alexander Graf


On 22.08.15 15:32, Paolo Bonzini wrote:
> 
> 
> On 22/08/2015 02:21, Alexander Graf wrote:
>> Hi Paolo,
>>
>> This is my current patch queue for ppc.  Please pull.
> 
> Done, but this queue has not been in linux-next.  Please push to
> kvm-ppc-next on your github Linux tree as well; please keep an eye on

Ah, sorry. I pushed to kvm-ppc-next in parallel with sending the pull request.

> Stephen Rothwell's messages in the next few days, and I'll send the pull
> request sometime next week via webmail if everything goes fine.

Nothing exciting has come in so far, so I hope we're good :).


Alex


[PULL 02/12] KVM: PPC: Remove PPC970 from KVM_BOOK3S_64_HV text in Kconfig

2015-08-22 Thread Alexander Graf
From: Thomas Huth 

Since the PPC970 support has been removed from the kvm-hv kernel
module recently, we should also reflect this change in the help
text of the corresponding Kconfig option.

Signed-off-by: Thomas Huth 
Signed-off-by: Alexander Graf 
---
 arch/powerpc/kvm/Kconfig | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/kvm/Kconfig b/arch/powerpc/kvm/Kconfig
index 3caec2c..c2024ac 100644
--- a/arch/powerpc/kvm/Kconfig
+++ b/arch/powerpc/kvm/Kconfig
@@ -74,14 +74,14 @@ config KVM_BOOK3S_64
  If unsure, say N.
 
 config KVM_BOOK3S_64_HV
-   tristate "KVM support for POWER7 and PPC970 using hypervisor mode in host"
+   tristate "KVM for POWER7 and later using hypervisor mode in host"
depends on KVM_BOOK3S_64 && PPC_POWERNV
select KVM_BOOK3S_HV_POSSIBLE
select MMU_NOTIFIER
select CMA
---help---
  Support running unmodified book3s_64 guest kernels in
- virtual machines on POWER7 and PPC970 processors that have
+ virtual machines on POWER7 and newer processors that have
  hypervisor mode available to the host.
 
  If you say Y here, KVM will use the hardware virtualization
@@ -89,8 +89,8 @@ config KVM_BOOK3S_64_HV
  guest operating systems will run at full hardware speed
  using supervisor and user modes.  However, this also means
  that KVM is not usable under PowerVM (pHyp), is only usable
- on POWER7 (or later) processors and PPC970-family processors,
- and cannot emulate a different processor from the host processor.
+ on POWER7 or later processors, and cannot emulate a
+ different processor from the host processor.
 
  If unsure, say N.
 
-- 
1.8.1.4



[PULL 04/12] KVM: PPC: add missing pt_regs initialization

2015-08-22 Thread Alexander Graf
From: Tudor Laurentiu 

In this switch branch the regs initialization
doesn't happen, so add it. This was found
with the help of a static code analysis tool.

Signed-off-by: Laurentiu Tudor 
Signed-off-by: Alexander Graf 
---
 arch/powerpc/kvm/booke.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index cc58426..ae458f0 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -933,6 +933,7 @@ static void kvmppc_restart_interrupt(struct kvm_vcpu *vcpu,
 #endif
break;
case BOOKE_INTERRUPT_CRITICAL:
+   kvmppc_fill_pt_regs(®s);
unknown_exception(®s);
break;
case BOOKE_INTERRUPT_DEBUG:
-- 
1.8.1.4



[PULL 00/12] ppc patch queue 2015-08-22

2015-08-22 Thread Alexander Graf
Hi Paolo,

This is my current patch queue for ppc.  Please pull.

Alex


The following changes since commit 4d283ec908e617fa28bcb06bce310206f0655d67:

  x86/kvm: Rename VMX's segment access rights defines (2015-08-15 00:47:13 
+0200)

are available in the git repository at:

  git://github.com/agraf/linux-2.6.git tags/signed-kvm-ppc-next

for you to fetch changes up to c63517c2e3810071359af926f621c1f784388c3f:

  KVM: PPC: Book3S: correct width in XER handling (2015-08-22 11:16:19 +0200)


Patch queue for ppc - 2015-08-22

Highlights for KVM PPC this time around:

  - Book3S: A few bug fixes
  - Book3S: Allow micro-threading on POWER8


Paul Mackerras (7):
  KVM: PPC: Book3S HV: Make use of unused threads when running guests
  KVM: PPC: Book3S HV: Implement dynamic micro-threading on POWER8
  KVM: PPC: Book3S HV: Fix race in reading change bit when removing HPTE
  KVM: PPC: Book3S HV: Fix bug in dirty page tracking
  KVM: PPC: Book3S HV: Implement H_CLEAR_REF and H_CLEAR_MOD
  KVM: PPC: Book3S HV: Fix preempted vcore list locking
  KVM: PPC: Book3S HV: Fix preempted vcore stolen time calculation

Sam Bobroff (1):
  KVM: PPC: Book3S: correct width in XER handling

Thomas Huth (2):
  KVM: PPC: Remove PPC970 from KVM_BOOK3S_64_HV text in Kconfig
  KVM: PPC: Fix warnings from sparse

Tudor Laurentiu (2):
  KVM: PPC: fix suspicious use of conditional operator
  KVM: PPC: add missing pt_regs initialization

 arch/powerpc/include/asm/kvm_book3s.h |   5 +-
 arch/powerpc/include/asm/kvm_book3s_asm.h |  22 +-
 arch/powerpc/include/asm/kvm_booke.h  |   4 +-
 arch/powerpc/include/asm/kvm_host.h   |  24 +-
 arch/powerpc/include/asm/ppc-opcode.h |   2 +-
 arch/powerpc/kernel/asm-offsets.c |   9 +
 arch/powerpc/kvm/Kconfig  |   8 +-
 arch/powerpc/kvm/book3s.c |   3 +-
 arch/powerpc/kvm/book3s_32_mmu_host.c |   1 +
 arch/powerpc/kvm/book3s_64_mmu_host.c |   1 +
 arch/powerpc/kvm/book3s_64_mmu_hv.c   |   8 +-
 arch/powerpc/kvm/book3s_emulate.c |   1 +
 arch/powerpc/kvm/book3s_hv.c  | 660 ++
 arch/powerpc/kvm/book3s_hv_builtin.c  |  32 +-
 arch/powerpc/kvm/book3s_hv_rm_mmu.c   | 161 +++-
 arch/powerpc/kvm/book3s_hv_rm_xics.c  |   4 +-
 arch/powerpc/kvm/book3s_hv_rmhandlers.S   | 128 +-
 arch/powerpc/kvm/book3s_paired_singles.c  |   2 +-
 arch/powerpc/kvm/book3s_segment.S |   4 +-
 arch/powerpc/kvm/booke.c  |   1 +
 arch/powerpc/kvm/e500_mmu.c   |   2 +-
 arch/powerpc/kvm/powerpc.c|   2 +-
 22 files changed, 938 insertions(+), 146 deletions(-)


[PULL 01/12] KVM: PPC: fix suspicious use of conditional operator

2015-08-22 Thread Alexander Graf
From: Tudor Laurentiu 

This was signaled by a static code analysis tool.

Signed-off-by: Laurentiu Tudor 
Reviewed-by: Scott Wood 
Signed-off-by: Alexander Graf 
---
 arch/powerpc/kvm/e500_mmu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/kvm/e500_mmu.c b/arch/powerpc/kvm/e500_mmu.c
index 50860e9..29911a0 100644
--- a/arch/powerpc/kvm/e500_mmu.c
+++ b/arch/powerpc/kvm/e500_mmu.c
@@ -377,7 +377,7 @@ int kvmppc_e500_emul_tlbsx(struct kvm_vcpu *vcpu, gva_t ea)
| MAS0_NV(vcpu_e500->gtlb_nv[tlbsel]);
vcpu->arch.shared->mas1 =
  (vcpu->arch.shared->mas6 & MAS6_SPID0)
-   | (vcpu->arch.shared->mas6 & (MAS6_SAS ? MAS1_TS : 0))
+   | ((vcpu->arch.shared->mas6 & MAS6_SAS) ? MAS1_TS : 0)
| (vcpu->arch.shared->mas4 & MAS4_TSIZED(~0));
vcpu->arch.shared->mas2 &= MAS2_EPN;
vcpu->arch.shared->mas2 |= vcpu->arch.shared->mas4 &
-- 
1.8.1.4
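
For anyone puzzling over the one-liner: MAS6_SAS is a nonzero constant, so
the old expression reduced to (mas6 & MAS1_TS), i.e. it tested the wrong
bit. A standalone demonstration (illustrative bit values, not the real
ones):

    #include <stdio.h>

    #define MAS6_SAS 0x1u
    #define MAS1_TS  0x1000u

    int main(void)
    {
            unsigned mas6  = MAS6_SAS;  /* SAS set, TS clear */
            unsigned buggy = mas6 & (MAS6_SAS ? MAS1_TS : 0); /* == mas6 & MAS1_TS */
            unsigned fixed = (mas6 & MAS6_SAS) ? MAS1_TS : 0;

            printf("buggy=0x%x fixed=0x%x\n", buggy, fixed); /* 0x0 vs 0x1000 */
            return 0;
    }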



[PULL 05/12] KVM: PPC: Book3S HV: Make use of unused threads when running guests

2015-08-22 Thread Alexander Graf
From: Paul Mackerras 

When running a virtual core of a guest that is configured with fewer
threads per core than the physical cores have, the extra physical
threads are currently unused.  This makes it possible to use them to
run one or more other virtual cores from the same guest when certain
conditions are met.  This applies on POWER7, and on POWER8 to guests
with one thread per virtual core.  (It doesn't apply to POWER8 guests
with multiple threads per vcore because they require a 1-1 virtual to
physical thread mapping in order to be able to use msgsndp and the
TIR.)

The idea is that we maintain a list of preempted vcores for each
physical cpu (i.e. each core, since the host runs single-threaded).
Then, when a vcore is about to run, it checks to see if there are
any vcores on the list for its physical cpu that could be
piggybacked onto this vcore's execution.  If so, those additional
vcores are put into state VCORE_PIGGYBACK and their runnable VCPU
threads are started as well as the original vcore, which is called
the master vcore.

After the vcores have exited the guest, the extra ones are put back
onto the preempted list if any of their VCPUs are still runnable and
not idle.

This means that vcpu->arch.ptid is no longer necessarily the same as
the physical thread that the vcpu runs on.  In order to make it easier
for code that wants to send an IPI to know which CPU to target, we
now store that in a new field in struct vcpu_arch, called thread_cpu.
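
A simplified sketch of the resulting kick path (the full version is the
kvmppc_fast_vcpu_kick_hv() change in the diff below; online/range checks
omitted here):

    int cpu = READ_ONCE(vcpu->arch.thread_cpu); /* hw thread running the vcpu */

    if (cpu < 0)                    /* vcpu not in the guest right now */
            cpu = vcpu->cpu;        /* fall back to the runner's cpu */
    if (cpu >= 0)
            kvmppc_ipi_thread(cpu);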

Reviewed-by: David Gibson 
Tested-by: Laurent Vivier 
Signed-off-by: Paul Mackerras 
Signed-off-by: Alexander Graf 
---
 arch/powerpc/include/asm/kvm_host.h |  19 +-
 arch/powerpc/kernel/asm-offsets.c   |   2 +
 arch/powerpc/kvm/book3s_hv.c| 333 ++--
 arch/powerpc/kvm/book3s_hv_builtin.c|   7 +-
 arch/powerpc/kvm/book3s_hv_rm_xics.c|   4 +-
 arch/powerpc/kvm/book3s_hv_rmhandlers.S |   5 +
 6 files changed, 298 insertions(+), 72 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_host.h 
b/arch/powerpc/include/asm/kvm_host.h
index d91f65b..2b74490 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -278,7 +278,9 @@ struct kvmppc_vcore {
u16 last_cpu;
u8 vcore_state;
u8 in_guest;
+   struct kvmppc_vcore *master_vcore;
struct list_head runnable_threads;
+   struct list_head preempt_list;
spinlock_t lock;
wait_queue_head_t wq;
spinlock_t stoltb_lock; /* protects stolen_tb and preempt_tb */
@@ -300,12 +302,18 @@ struct kvmppc_vcore {
 #define VCORE_EXIT_MAP(vc) ((vc)->entry_exit_map >> 8)
 #define VCORE_IS_EXITING(vc)   (VCORE_EXIT_MAP(vc) != 0)
 
-/* Values for vcore_state */
+/*
+ * Values for vcore_state.
+ * Note that these are arranged such that lower values
+ * (< VCORE_SLEEPING) don't require stolen time accounting
+ * on load/unload, and higher values do.
+ */
 #define VCORE_INACTIVE 0
-#define VCORE_SLEEPING 1
-#define VCORE_PREEMPT  2
-#define VCORE_RUNNING  3
-#define VCORE_EXITING  4
+#define VCORE_PREEMPT  1
+#define VCORE_PIGGYBACK2
+#define VCORE_SLEEPING 3
+#define VCORE_RUNNING  4
+#define VCORE_EXITING  5
 
 /*
  * Struct used to manage memory for a virtual processor area
@@ -619,6 +627,7 @@ struct kvm_vcpu_arch {
int trap;
int state;
int ptid;
+   int thread_cpu;
bool timer_running;
wait_queue_head_t cpu_run;
 
diff --git a/arch/powerpc/kernel/asm-offsets.c 
b/arch/powerpc/kernel/asm-offsets.c
index 9823057..a78cdbf 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -512,6 +512,8 @@ int main(void)
DEFINE(VCPU_VPA, offsetof(struct kvm_vcpu, arch.vpa.pinned_addr));
DEFINE(VCPU_VPA_DIRTY, offsetof(struct kvm_vcpu, arch.vpa.dirty));
DEFINE(VCPU_HEIR, offsetof(struct kvm_vcpu, arch.emul_inst));
+   DEFINE(VCPU_CPU, offsetof(struct kvm_vcpu, cpu));
+   DEFINE(VCPU_THREAD_CPU, offsetof(struct kvm_vcpu, arch.thread_cpu));
 #endif
 #ifdef CONFIG_PPC_BOOK3S
DEFINE(VCPU_VCPUID, offsetof(struct kvm_vcpu, vcpu_id));
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 6e588ac..0173ce2 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -81,6 +81,9 @@ static DECLARE_BITMAP(default_enabled_hcalls, 
MAX_HCALL_OPCODE/4 + 1);
 #define MPP_BUFFER_ORDER   3
 #endif
 
+static int target_smt_mode;
+module_param(target_smt_mode, int, S_IRUGO | S_IWUSR);
+MODULE_PARM_DESC(target_smt_mode, "Target threads per core (0 = max)");
 
 static void kvmppc_end_cede(struct kvm_vcpu *vcpu);
 static int kvmppc_hv_setup_htab_rma(struct kvm_vcpu *vcpu);
@@ -114,7 +117,7 @@ static bool kvmppc_ipi_thread(int cpu)
 
 static void kvmppc_fast_vcpu_kick_hv(struct kvm_vcpu *vcpu)
 {
-   int cpu = vcpu->cpu;
+   int cpu;
wait_queue_head_t *wqp;
 
wqp = kvm_arch_vcpu_

[PULL 08/12] KVM: PPC: Book3S HV: Fix bug in dirty page tracking

2015-08-22 Thread Alexander Graf
From: Paul Mackerras 

This fixes a bug in the tracking of pages that get modified by the
guest.  If the guest creates a large-page HPTE, writes to memory
somewhere within the large page, and then removes the HPTE, we only
record the modified state for the first normal page within the large
page, when in fact the guest might have modified some other normal
page within the large page.

To fix this we use some unused bits in the rmap entry to record the
order (log base 2) of the size of the page that was modified, when
removing an HPTE.  Then in kvm_test_clear_dirty_npages() we use that
order to return the correct number of modified pages.

The same thing could in principle happen when removing a HPTE at the
host's request, i.e. when paging out a page, except that we never
page out large pages, and the guest can only create large-page HPTEs
if the guest RAM is backed by large pages.  However, we also fix
this case for the sake of future-proofing.

The reference bit is also subject to the same loss of information.  We
don't make the same fix here for the reference bit because there isn't
an interface for userspace to find out which pages the guest has
referenced, whereas there is one for userspace to find out which pages
the guest has modified.  Because of this loss of information, the
kvm_age_hva_hv() and kvm_test_age_hva_hv() functions might incorrectly
say that a page has not been referenced when it has, but that doesn't
matter greatly because we never page or swap out large pages.
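
A worked example of the order encoding (assuming 4 KiB base pages and a
16 MiB large-page HPTE; runnable arithmetic only):

    #include <stdio.h>

    int main(void)
    {
            unsigned page_shift   = 12;   /* 4 KiB base pages */
            unsigned change_order = 24;   /* log2(16 MiB), stored in the rmap */
            unsigned long npages_dirty = 1ul << (change_order - page_shift);

            printf("npages_dirty = %lu\n", npages_dirty); /* 4096, not 1 */
            return 0;
    }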

Signed-off-by: Paul Mackerras 
Signed-off-by: Alexander Graf 
---
 arch/powerpc/include/asm/kvm_book3s.h |  1 +
 arch/powerpc/include/asm/kvm_host.h   |  2 ++
 arch/powerpc/kvm/book3s_64_mmu_hv.c   |  8 +++-
 arch/powerpc/kvm/book3s_hv_rm_mmu.c   | 17 +
 4 files changed, 27 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/kvm_book3s.h 
b/arch/powerpc/include/asm/kvm_book3s.h
index b91e74a..e6b2534 100644
--- a/arch/powerpc/include/asm/kvm_book3s.h
+++ b/arch/powerpc/include/asm/kvm_book3s.h
@@ -158,6 +158,7 @@ extern pfn_t kvmppc_gpa_to_pfn(struct kvm_vcpu *vcpu, gpa_t 
gpa, bool writing,
bool *writable);
 extern void kvmppc_add_revmap_chain(struct kvm *kvm, struct revmap_entry *rev,
unsigned long *rmap, long pte_index, int realmode);
+extern void kvmppc_update_rmap_change(unsigned long *rmap, unsigned long 
psize);
 extern void kvmppc_invalidate_hpte(struct kvm *kvm, __be64 *hptep,
unsigned long pte_index);
 void kvmppc_clear_ref_hpte(struct kvm *kvm, __be64 *hptep,
diff --git a/arch/powerpc/include/asm/kvm_host.h 
b/arch/powerpc/include/asm/kvm_host.h
index 80eb29a..e187b6a 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -205,8 +205,10 @@ struct revmap_entry {
  */
 #define KVMPPC_RMAP_LOCK_BIT   63
 #define KVMPPC_RMAP_RC_SHIFT   32
+#define KVMPPC_RMAP_CHG_SHIFT  48
 #define KVMPPC_RMAP_REFERENCED (HPTE_R_R << KVMPPC_RMAP_RC_SHIFT)
 #define KVMPPC_RMAP_CHANGED(HPTE_R_C << KVMPPC_RMAP_RC_SHIFT)
+#define KVMPPC_RMAP_CHG_ORDER  (0x3ful << KVMPPC_RMAP_CHG_SHIFT)
 #define KVMPPC_RMAP_PRESENT0x1ul
 #define KVMPPC_RMAP_INDEX  0xul
 
diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c 
b/arch/powerpc/kvm/book3s_64_mmu_hv.c
index dab68b7..1f9c0a1 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_hv.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c
@@ -761,6 +761,8 @@ static int kvm_unmap_rmapp(struct kvm *kvm, unsigned long 
*rmapp,
/* Harvest R and C */
rcbits = be64_to_cpu(hptep[1]) & (HPTE_R_R | HPTE_R_C);
*rmapp |= rcbits << KVMPPC_RMAP_RC_SHIFT;
+   if (rcbits & HPTE_R_C)
+   kvmppc_update_rmap_change(rmapp, psize);
if (rcbits & ~rev[i].guest_rpte) {
rev[i].guest_rpte = ptel | rcbits;
note_hpte_modification(kvm, &rev[i]);
@@ -927,8 +929,12 @@ static int kvm_test_clear_dirty_npages(struct kvm *kvm, 
unsigned long *rmapp)
  retry:
lock_rmap(rmapp);
if (*rmapp & KVMPPC_RMAP_CHANGED) {
-   *rmapp &= ~KVMPPC_RMAP_CHANGED;
+   long change_order = (*rmapp & KVMPPC_RMAP_CHG_ORDER)
+   >> KVMPPC_RMAP_CHG_SHIFT;
+   *rmapp &= ~(KVMPPC_RMAP_CHANGED | KVMPPC_RMAP_CHG_ORDER);
npages_dirty = 1;
+   if (change_order > PAGE_SHIFT)
+   npages_dirty = 1ul << (change_order - PAGE_SHIFT);
}
if (!(*rmapp & KVMPPC_RMAP_PRESENT)) {
unlock_rmap(rmapp);
diff --git a/arch/powerpc/kvm/book3s_hv_rm_mmu.c 
b/arch/powerpc/kvm/book3s_hv_rm_mmu.c
index c6d601c..c7a3ab2 100644
--- a/arch/powerpc/kvm/book3s_hv_rm_mmu.c
+++ b/arch/powerpc/k

[PULL 06/12] KVM: PPC: Book3S HV: Implement dynamic micro-threading on POWER8

2015-08-22 Thread Alexander Graf
From: Paul Mackerras 

This builds on the ability to run more than one vcore on a physical
core by using the micro-threading (split-core) modes of the POWER8
chip.  Previously, only vcores from the same VM could be run together,
and (on POWER8) only if they had just one thread per core.  With the
ability to split the core on guest entry and unsplit it on guest exit,
we can run up to 8 vcpu threads from up to 4 different VMs, and we can
run multiple vcores with 2 or 4 vcpus per vcore.

Dynamic micro-threading is only available if the static configuration
of the cores is whole-core mode (unsplit), and only on POWER8.

To manage this, we introduce a new kvm_split_mode struct which is
shared across all of the subcores in the core, with a pointer in the
paca on each thread.  In addition we extend the core_info struct to
have information on each subcore.  When deciding whether to add a
vcore to the set already on the core, we now have two possibilities:
(a) piggyback the vcore onto an existing subcore, or (b) start a new
subcore.

Currently, when any vcpu needs to exit the guest and switch to host
virtual mode, we interrupt all the threads in all subcores and switch
the core back to whole-core mode.  It may be possible in future to
allow some of the subcores to keep executing in the guest while
subcore 0 switches to the host, but that is not implemented in this
patch.

This adds a module parameter called dynamic_mt_modes which controls
which micro-threading (split-core) modes the code will consider, as a
bitmap.  In other words, if it is 0, no micro-threading mode is
considered; if it is 2, only 2-way micro-threading is considered; if
it is 4, only 4-way, and if it is 6, both 2-way and 4-way
micro-threading mode will be considered.  The default is 6.
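
An illustrative check only (hypothetical helper name; the real gating lives
in the subcore-packing code of book3s_hv.c):

    /* n_subcores is 1 (whole core), 2 or 4 */
    static bool split_mode_allowed(int n_subcores)
    {
            if (n_subcores == 1)
                    return true;    /* whole-core mode is always allowed */
            /* the bitmap is tested directly: 2 allows 2-way, 4 allows 4-way */
            return (dynamic_mt_modes & n_subcores) != 0;
    }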

With this, we now have secondary threads which are the primary thread
for their subcore and therefore need to do the MMU switch.  These
threads will need to be started even if they have no vcpu to run, so
we use the vcore pointer in the PACA rather than the vcpu pointer to
trigger them.

It is now possible for thread 0 to find that an exit has been
requested before it gets to switch the subcore state to the guest.  In
that case we haven't added the guest's timebase offset to the
timebase, so we need to be careful not to subtract the offset in the
guest exit path.  In fact we just skip the whole path that switches
back to host context, since we haven't switched to the guest context.

Signed-off-by: Paul Mackerras 
Signed-off-by: Alexander Graf 
---
 arch/powerpc/include/asm/kvm_book3s_asm.h |  20 ++
 arch/powerpc/include/asm/kvm_host.h   |   3 +
 arch/powerpc/kernel/asm-offsets.c |   7 +
 arch/powerpc/kvm/book3s_hv.c  | 367 ++
 arch/powerpc/kvm/book3s_hv_builtin.c  |  25 +-
 arch/powerpc/kvm/book3s_hv_rmhandlers.S   | 113 +++--
 6 files changed, 473 insertions(+), 62 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_book3s_asm.h 
b/arch/powerpc/include/asm/kvm_book3s_asm.h
index 5bdfb5d..57d5dfe 100644
--- a/arch/powerpc/include/asm/kvm_book3s_asm.h
+++ b/arch/powerpc/include/asm/kvm_book3s_asm.h
@@ -25,6 +25,12 @@
 #define XICS_MFRR  0xc
 #define XICS_IPI   2   /* interrupt source # for IPIs */
 
+/* Maximum number of threads per physical core */
+#define MAX_SMT_THREADS8
+
+/* Maximum number of subcores per physical core */
+#define MAX_SUBCORES   4
+
 #ifdef __ASSEMBLY__
 
 #ifdef CONFIG_KVM_BOOK3S_HANDLER
@@ -65,6 +71,19 @@ kvmppc_resume_\intno:
 
 #else  /*__ASSEMBLY__ */
 
+struct kvmppc_vcore;
+
+/* Struct used for coordinating micro-threading (split-core) mode changes */
+struct kvm_split_mode {
+   unsigned long   rpr;
+   unsigned long   pmmar;
+   unsigned long   ldbar;
+   u8  subcore_size;
+   u8  do_nap;
+   u8  napped[MAX_SMT_THREADS];
+   struct kvmppc_vcore *master_vcs[MAX_SUBCORES];
+};
+
 /*
  * This struct goes in the PACA on 64-bit processors.  It is used
  * to store host state that needs to be saved when we enter a guest
@@ -100,6 +119,7 @@ struct kvmppc_host_state {
u64 host_spurr;
u64 host_dscr;
u64 dec_expires;
+   struct kvm_split_mode *kvm_split_mode;
 #endif
 #ifdef CONFIG_PPC_BOOK3S_64
u64 cfar;
diff --git a/arch/powerpc/include/asm/kvm_host.h 
b/arch/powerpc/include/asm/kvm_host.h
index 2b74490..80eb29a 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -302,6 +302,9 @@ struct kvmppc_vcore {
 #define VCORE_EXIT_MAP(vc) ((vc)->entry_exit_map >> 8)
 #define VCORE_IS_EXITING(vc)   (VCORE_EXIT_MAP(vc) != 0)
 
+/* This bit is used when a vcore exit is triggered from outside the vcore */
+#define VCORE_EXIT_REQ 0x1
+
 /*
  * Values for vcore_state.
  * Note that these are arranged such that lower values
diff --git a/arch/powerpc/kernel/asm-o

[PULL 03/12] KVM: PPC: Fix warnings from sparse

2015-08-22 Thread Alexander Graf
From: Thomas Huth 

When compiling the KVM code for POWER with "make C=1", sparse
complains about functions missing proper prototypes and a 64-bit
constant missing the ULL suffix. Let's fix this by making the
functions static or by including the proper header with the
prototypes, and by appending a ULL suffix to the constant
PPC_MPPE_ADDRESS_MASK.

Signed-off-by: Thomas Huth 
Signed-off-by: Alexander Graf 
---
 arch/powerpc/include/asm/ppc-opcode.h| 2 +-
 arch/powerpc/kvm/book3s.c| 3 ++-
 arch/powerpc/kvm/book3s_32_mmu_host.c| 1 +
 arch/powerpc/kvm/book3s_64_mmu_host.c| 1 +
 arch/powerpc/kvm/book3s_emulate.c| 1 +
 arch/powerpc/kvm/book3s_hv.c | 8 
 arch/powerpc/kvm/book3s_paired_singles.c | 2 +-
 arch/powerpc/kvm/powerpc.c   | 2 +-
 8 files changed, 12 insertions(+), 8 deletions(-)

diff --git a/arch/powerpc/include/asm/ppc-opcode.h 
b/arch/powerpc/include/asm/ppc-opcode.h
index 8452335..790f5d1 100644
--- a/arch/powerpc/include/asm/ppc-opcode.h
+++ b/arch/powerpc/include/asm/ppc-opcode.h
@@ -287,7 +287,7 @@
 
 /* POWER8 Micro Partition Prefetch (MPP) parameters */
 /* Address mask is common for LOGMPP instruction and MPPR SPR */
-#define PPC_MPPE_ADDRESS_MASK 0xc000
+#define PPC_MPPE_ADDRESS_MASK 0xc000ULL
 
 /* Bits 60 and 61 of MPP SPR should be set to one of the following */
 /* Aborting the fetch is indeed setting 00 in the table size bits */
diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c
index 05ea8fc..53285d5 100644
--- a/arch/powerpc/kvm/book3s.c
+++ b/arch/powerpc/kvm/book3s.c
@@ -240,7 +240,8 @@ void kvmppc_core_queue_inst_storage(struct kvm_vcpu *vcpu, 
ulong flags)
kvmppc_book3s_queue_irqprio(vcpu, BOOK3S_INTERRUPT_INST_STORAGE);
 }
 
-int kvmppc_book3s_irqprio_deliver(struct kvm_vcpu *vcpu, unsigned int priority)
+static int kvmppc_book3s_irqprio_deliver(struct kvm_vcpu *vcpu,
+unsigned int priority)
 {
int deliver = 1;
int vec = 0;
diff --git a/arch/powerpc/kvm/book3s_32_mmu_host.c 
b/arch/powerpc/kvm/book3s_32_mmu_host.c
index 2035d16..d5c9bfe 100644
--- a/arch/powerpc/kvm/book3s_32_mmu_host.c
+++ b/arch/powerpc/kvm/book3s_32_mmu_host.c
@@ -26,6 +26,7 @@
 #include 
 #include 
 #include 
+#include "book3s.h"
 
 /* #define DEBUG_MMU */
 /* #define DEBUG_SR */
diff --git a/arch/powerpc/kvm/book3s_64_mmu_host.c 
b/arch/powerpc/kvm/book3s_64_mmu_host.c
index b982d92..79ad35a 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_host.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_host.c
@@ -28,6 +28,7 @@
 #include 
 #include 
 #include "trace_pr.h"
+#include "book3s.h"
 
 #define PTE_SIZE 12
 
diff --git a/arch/powerpc/kvm/book3s_emulate.c 
b/arch/powerpc/kvm/book3s_emulate.c
index 5a2bc4b..2afdb9c 100644
--- a/arch/powerpc/kvm/book3s_emulate.c
+++ b/arch/powerpc/kvm/book3s_emulate.c
@@ -23,6 +23,7 @@
 #include 
 #include 
 #include 
+#include "book3s.h"
 
 #define OP_19_XOP_RFID 18
 #define OP_19_XOP_RFI  50
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 68d067a..6e588ac 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -214,12 +214,12 @@ static void kvmppc_set_msr_hv(struct kvm_vcpu *vcpu, u64 
msr)
kvmppc_end_cede(vcpu);
 }
 
-void kvmppc_set_pvr_hv(struct kvm_vcpu *vcpu, u32 pvr)
+static void kvmppc_set_pvr_hv(struct kvm_vcpu *vcpu, u32 pvr)
 {
vcpu->arch.pvr = pvr;
 }
 
-int kvmppc_set_arch_compat(struct kvm_vcpu *vcpu, u32 arch_compat)
+static int kvmppc_set_arch_compat(struct kvm_vcpu *vcpu, u32 arch_compat)
 {
unsigned long pcr = 0;
struct kvmppc_vcore *vc = vcpu->arch.vcore;
@@ -259,7 +259,7 @@ int kvmppc_set_arch_compat(struct kvm_vcpu *vcpu, u32 
arch_compat)
return 0;
 }
 
-void kvmppc_dump_regs(struct kvm_vcpu *vcpu)
+static void kvmppc_dump_regs(struct kvm_vcpu *vcpu)
 {
int r;
 
@@ -292,7 +292,7 @@ void kvmppc_dump_regs(struct kvm_vcpu *vcpu)
   vcpu->arch.last_inst);
 }
 
-struct kvm_vcpu *kvmppc_find_vcpu(struct kvm *kvm, int id)
+static struct kvm_vcpu *kvmppc_find_vcpu(struct kvm *kvm, int id)
 {
int r;
struct kvm_vcpu *v, *ret = NULL;
diff --git a/arch/powerpc/kvm/book3s_paired_singles.c 
b/arch/powerpc/kvm/book3s_paired_singles.c
index bd6ab16..a759d9a 100644
--- a/arch/powerpc/kvm/book3s_paired_singles.c
+++ b/arch/powerpc/kvm/book3s_paired_singles.c
@@ -352,7 +352,7 @@ static inline u32 inst_get_field(u32 inst, int msb, int lsb)
return kvmppc_get_field(inst, msb + 32, lsb + 32);
 }
 
-bool kvmppc_inst_is_paired_single(struct kvm_vcpu *vcpu, u32 inst)
+static bool kvmppc_inst_is_paired_single(struct kvm_vcpu *vcpu, u32 inst)
 {
if (!(vcpu->arch.hflags & BOOK3S_HFLAG_PAIRED_SINGLE))
return false;
diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index e5dd

[PULL 11/12] KVM: PPC: Book3S HV: Fix preempted vcore stolen time calculation

2015-08-22 Thread Alexander Graf
From: Paul Mackerras 

Whenever a vcore state is VCORE_PREEMPT we need to be counting stolen
time for it.  This currently isn't the case when we have a vcore that
no longer has any runnable threads in it but still has a runner task,
so we do an explicit call to kvmppc_core_start_stolen() in that case.

Signed-off-by: Paul Mackerras 
Signed-off-by: Alexander Graf 
---
 arch/powerpc/kvm/book3s_hv.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 3d02276..fad52f2 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -2283,9 +2283,14 @@ static void post_guest_process(struct kvmppc_vcore *vc, 
bool is_master)
}
list_del_init(&vc->preempt_list);
if (!is_master) {
-   vc->vcore_state = vc->runner ? VCORE_PREEMPT : VCORE_INACTIVE;
-   if (still_running > 0)
+   if (still_running > 0) {
kvmppc_vcore_preempt(vc);
+   } else if (vc->runner) {
+   vc->vcore_state = VCORE_PREEMPT;
+   kvmppc_core_start_stolen(vc);
+   } else {
+   vc->vcore_state = VCORE_INACTIVE;
+   }
if (vc->n_runnable > 0 && vc->runner == NULL) {
/* make sure there's a candidate runner awake */
vcpu = list_first_entry(&vc->runnable_threads,
-- 
1.8.1.4



[PULL 10/12] KVM: PPC: Book3S HV: Fix preempted vcore list locking

2015-08-22 Thread Alexander Graf
From: Paul Mackerras 

When a vcore gets preempted, we put it on the preempted vcore list for
the current CPU.  The runner task then calls schedule() and comes back
some time later and takes itself off the list.  We need to be careful
to lock the list that it was put onto, which may not be the list for the
current CPU since the runner task may have moved to another CPU.

Signed-off-by: Paul Mackerras 
Signed-off-by: Alexander Graf 
---
 arch/powerpc/kvm/book3s_hv.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 6e3ef30..3d02276 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -1962,10 +1962,11 @@ static void kvmppc_vcore_preempt(struct kvmppc_vcore 
*vc)
 
 static void kvmppc_vcore_end_preempt(struct kvmppc_vcore *vc)
 {
-   struct preempted_vcore_list *lp = this_cpu_ptr(&preempted_vcores);
+   struct preempted_vcore_list *lp;
 
kvmppc_core_end_stolen(vc);
if (!list_empty(&vc->preempt_list)) {
+   lp = &per_cpu(preempted_vcores, vc->pcpu);
spin_lock(&lp->lock);
list_del_init(&vc->preempt_list);
spin_unlock(&lp->lock);
-- 
1.8.1.4



[PULL 12/12] KVM: PPC: Book3S: correct width in XER handling

2015-08-22 Thread Alexander Graf
From: Sam Bobroff 

In 64 bit kernels, the Fixed Point Exception Register (XER) is a 64
bit field (e.g. in kvm_regs and kvm_vcpu_arch) and in most places it is
accessed as such.

This patch corrects places where it is accessed as a 32 bit field by a
64 bit kernel.  In some cases this is via a 32 bit load or store
instruction which, depending on endianness, will cause either the
lower or upper 32 bits to be missed.  In another case it is cast as a
u32, causing the upper 32 bits to be cleared.

This patch corrects those places by extending the access methods to
64 bits.

Signed-off-by: Sam Bobroff 
Reviewed-by: Laurent Vivier 
Reviewed-by: Thomas Huth 
Tested-by: Thomas Huth 
Signed-off-by: Alexander Graf 
---
 arch/powerpc/include/asm/kvm_book3s.h | 4 ++--
 arch/powerpc/include/asm/kvm_book3s_asm.h | 2 +-
 arch/powerpc/include/asm/kvm_booke.h  | 4 ++--
 arch/powerpc/kvm/book3s_hv_rmhandlers.S   | 6 +++---
 arch/powerpc/kvm/book3s_segment.S | 4 ++--
 5 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_book3s.h 
b/arch/powerpc/include/asm/kvm_book3s.h
index e6b2534..9fac01c 100644
--- a/arch/powerpc/include/asm/kvm_book3s.h
+++ b/arch/powerpc/include/asm/kvm_book3s.h
@@ -226,12 +226,12 @@ static inline u32 kvmppc_get_cr(struct kvm_vcpu *vcpu)
return vcpu->arch.cr;
 }
 
-static inline void kvmppc_set_xer(struct kvm_vcpu *vcpu, u32 val)
+static inline void kvmppc_set_xer(struct kvm_vcpu *vcpu, ulong val)
 {
vcpu->arch.xer = val;
 }
 
-static inline u32 kvmppc_get_xer(struct kvm_vcpu *vcpu)
+static inline ulong kvmppc_get_xer(struct kvm_vcpu *vcpu)
 {
return vcpu->arch.xer;
 }
diff --git a/arch/powerpc/include/asm/kvm_book3s_asm.h 
b/arch/powerpc/include/asm/kvm_book3s_asm.h
index 57d5dfe..72b6225 100644
--- a/arch/powerpc/include/asm/kvm_book3s_asm.h
+++ b/arch/powerpc/include/asm/kvm_book3s_asm.h
@@ -132,7 +132,7 @@ struct kvmppc_book3s_shadow_vcpu {
bool in_use;
ulong gpr[14];
u32 cr;
-   u32 xer;
+   ulong xer;
ulong ctr;
ulong lr;
ulong pc;
diff --git a/arch/powerpc/include/asm/kvm_booke.h 
b/arch/powerpc/include/asm/kvm_booke.h
index 3286f0d..bc6e29e 100644
--- a/arch/powerpc/include/asm/kvm_booke.h
+++ b/arch/powerpc/include/asm/kvm_booke.h
@@ -54,12 +54,12 @@ static inline u32 kvmppc_get_cr(struct kvm_vcpu *vcpu)
return vcpu->arch.cr;
 }
 
-static inline void kvmppc_set_xer(struct kvm_vcpu *vcpu, u32 val)
+static inline void kvmppc_set_xer(struct kvm_vcpu *vcpu, ulong val)
 {
vcpu->arch.xer = val;
 }
 
-static inline u32 kvmppc_get_xer(struct kvm_vcpu *vcpu)
+static inline ulong kvmppc_get_xer(struct kvm_vcpu *vcpu)
 {
return vcpu->arch.xer;
 }
diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S 
b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
index e347766..472680f 100644
--- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
+++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
@@ -944,7 +944,7 @@ END_FTR_SECTION_IFCLR(CPU_FTR_ARCH_207S)
blt hdec_soon
 
ld  r6, VCPU_CTR(r4)
-   lwz r7, VCPU_XER(r4)
+   ld  r7, VCPU_XER(r4)
 
mtctr   r6
mtxer   r7
@@ -1181,7 +1181,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
mfctr   r3
mfxer   r4
std r3, VCPU_CTR(r9)
-   stw r4, VCPU_XER(r9)
+   std r4, VCPU_XER(r9)
 
/* If this is a page table miss then see if it's theirs or ours */
cmpwi   r12, BOOK3S_INTERRUPT_H_DATA_STORAGE
@@ -1763,7 +1763,7 @@ kvmppc_hdsi:
bl  kvmppc_msr_interrupt
 fast_interrupt_c_return:
 6: ld  r7, VCPU_CTR(r9)
-   lwz r8, VCPU_XER(r9)
+   ld  r8, VCPU_XER(r9)
mtctr   r7
mtxer   r8
mr  r4, r9
diff --git a/arch/powerpc/kvm/book3s_segment.S 
b/arch/powerpc/kvm/book3s_segment.S
index acee37c..ca8f174 100644
--- a/arch/powerpc/kvm/book3s_segment.S
+++ b/arch/powerpc/kvm/book3s_segment.S
@@ -123,7 +123,7 @@ no_dcbz32_on:
PPC_LL  r8, SVCPU_CTR(r3)
PPC_LL  r9, SVCPU_LR(r3)
lwz r10, SVCPU_CR(r3)
-   lwz r11, SVCPU_XER(r3)
+   PPC_LL  r11, SVCPU_XER(r3)
 
mtctr   r8
mtlrr9
@@ -237,7 +237,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_HVMODE)
mfctr   r8
mflrr9
 
-   stw r5, SVCPU_XER(r13)
+   PPC_STL r5, SVCPU_XER(r13)
PPC_STL r6, SVCPU_FAULT_DAR(r13)
stw r7, SVCPU_FAULT_DSISR(r13)
PPC_STL r8, SVCPU_CTR(r13)
-- 
1.8.1.4



[PULL 09/12] KVM: PPC: Book3S HV: Implement H_CLEAR_REF and H_CLEAR_MOD

2015-08-22 Thread Alexander Graf
From: Paul Mackerras 

This adds implementations for the H_CLEAR_REF (test and clear reference
bit) and H_CLEAR_MOD (test and clear changed bit) hypercalls.

When clearing the reference or change bit in the guest view of the HPTE,
we also have to clear it in the real HPTE so that we can detect future
references or changes.  When we do so, we transfer the R or C bit value
to the rmap entry for the underlying host page so that kvm_age_hva_hv(),
kvm_test_age_hva_hv() and kvmppc_hv_get_dirty_log() know that the page
has been referenced and/or changed.

These hypercalls are not used by Linux guests.  These implementations
have been tested using a FreeBSD guest.

Signed-off-by: Paul Mackerras 
Signed-off-by: Alexander Graf 
---
 arch/powerpc/kvm/book3s_hv_rm_mmu.c | 126 ++--
 arch/powerpc/kvm/book3s_hv_rmhandlers.S |   4 +-
 2 files changed, 121 insertions(+), 9 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv_rm_mmu.c 
b/arch/powerpc/kvm/book3s_hv_rm_mmu.c
index c7a3ab2..c1df9bb 100644
--- a/arch/powerpc/kvm/book3s_hv_rm_mmu.c
+++ b/arch/powerpc/kvm/book3s_hv_rm_mmu.c
@@ -112,25 +112,38 @@ void kvmppc_update_rmap_change(unsigned long *rmap, 
unsigned long psize)
 }
 EXPORT_SYMBOL_GPL(kvmppc_update_rmap_change);
 
+/* Returns a pointer to the revmap entry for the page mapped by a HPTE */
+static unsigned long *revmap_for_hpte(struct kvm *kvm, unsigned long hpte_v,
+ unsigned long hpte_gr)
+{
+   struct kvm_memory_slot *memslot;
+   unsigned long *rmap;
+   unsigned long gfn;
+
+   gfn = hpte_rpn(hpte_gr, hpte_page_size(hpte_v, hpte_gr));
+   memslot = __gfn_to_memslot(kvm_memslots_raw(kvm), gfn);
+   if (!memslot)
+   return NULL;
+
+   rmap = real_vmalloc_addr(&memslot->arch.rmap[gfn - memslot->base_gfn]);
+   return rmap;
+}
+
 /* Remove this HPTE from the chain for a real page */
 static void remove_revmap_chain(struct kvm *kvm, long pte_index,
struct revmap_entry *rev,
unsigned long hpte_v, unsigned long hpte_r)
 {
struct revmap_entry *next, *prev;
-   unsigned long gfn, ptel, head;
-   struct kvm_memory_slot *memslot;
+   unsigned long ptel, head;
unsigned long *rmap;
unsigned long rcbits;
 
rcbits = hpte_r & (HPTE_R_R | HPTE_R_C);
ptel = rev->guest_rpte |= rcbits;
-   gfn = hpte_rpn(ptel, hpte_page_size(hpte_v, ptel));
-   memslot = __gfn_to_memslot(kvm_memslots_raw(kvm), gfn);
-   if (!memslot)
+   rmap = revmap_for_hpte(kvm, hpte_v, ptel);
+   if (!rmap)
return;
-
-   rmap = real_vmalloc_addr(&memslot->arch.rmap[gfn - memslot->base_gfn]);
lock_rmap(rmap);
 
head = *rmap & KVMPPC_RMAP_INDEX;
@@ -678,6 +691,105 @@ long kvmppc_h_read(struct kvm_vcpu *vcpu, unsigned long 
flags,
return H_SUCCESS;
 }
 
+long kvmppc_h_clear_ref(struct kvm_vcpu *vcpu, unsigned long flags,
+   unsigned long pte_index)
+{
+   struct kvm *kvm = vcpu->kvm;
+   __be64 *hpte;
+   unsigned long v, r, gr;
+   struct revmap_entry *rev;
+   unsigned long *rmap;
+   long ret = H_NOT_FOUND;
+
+   if (pte_index >= kvm->arch.hpt_npte)
+   return H_PARAMETER;
+
+   rev = real_vmalloc_addr(&kvm->arch.revmap[pte_index]);
+   hpte = (__be64 *)(kvm->arch.hpt_virt + (pte_index << 4));
+   while (!try_lock_hpte(hpte, HPTE_V_HVLOCK))
+   cpu_relax();
+   v = be64_to_cpu(hpte[0]);
+   r = be64_to_cpu(hpte[1]);
+   if (!(v & (HPTE_V_VALID | HPTE_V_ABSENT)))
+   goto out;
+
+   gr = rev->guest_rpte;
+   if (rev->guest_rpte & HPTE_R_R) {
+   rev->guest_rpte &= ~HPTE_R_R;
+   note_hpte_modification(kvm, rev);
+   }
+   if (v & HPTE_V_VALID) {
+   gr |= r & (HPTE_R_R | HPTE_R_C);
+   if (r & HPTE_R_R) {
+   kvmppc_clear_ref_hpte(kvm, hpte, pte_index);
+   rmap = revmap_for_hpte(kvm, v, gr);
+   if (rmap) {
+   lock_rmap(rmap);
+   *rmap |= KVMPPC_RMAP_REFERENCED;
+   unlock_rmap(rmap);
+   }
+   }
+   }
+   vcpu->arch.gpr[4] = gr;
+   ret = H_SUCCESS;
+ out:
+   unlock_hpte(hpte, v & ~HPTE_V_HVLOCK);
+   return ret;
+}
+
+long kvmppc_h_clear_mod(struct kvm_vcpu *vcpu, unsigned long flags,
+   unsigned long pte_index)
+{
+   struct kvm *kvm = vcpu->kvm;
+   __be64 *hpte;
+   unsigned long v, r, gr;
+   struct revmap_entry *rev;
+   unsigned long *rmap;
+   long ret = H_NOT_FOUND;
+
+   if (pte_index >= kvm->arch.hpt_npte)
+   return H_

[PULL 07/12] KVM: PPC: Book3S HV: Fix race in reading change bit when removing HPTE

2015-08-22 Thread Alexander Graf
From: Paul Mackerras 

The reference (R) and change (C) bits in a HPT entry can be set by
hardware at any time up until the HPTE is invalidated and the TLB
invalidation sequence has completed.  This means that when removing
a HPTE, we need to read the HPTE after the invalidation sequence has
completed in order to obtain reliable values of R and C.  The code
in kvmppc_do_h_remove() used to do this.  However, commit 6f22bd3265fb
("KVM: PPC: Book3S HV: Make HTAB code LE host aware") removed the
read after invalidation as a side effect of other changes.  This
restores the read of the HPTE after invalidation.

The user-visible effect of this bug would be that when migrating a
guest, there is a small probability that a page modified by the guest
and then unmapped by the guest might not get re-transmitted and thus
the destination might end up with a stale copy of the page.

Fixes: 6f22bd3265fb
Signed-off-by: Paul Mackerras 
Signed-off-by: Alexander Graf 
---
 arch/powerpc/kvm/book3s_hv_rm_mmu.c | 18 --
 1 file changed, 12 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv_rm_mmu.c 
b/arch/powerpc/kvm/book3s_hv_rm_mmu.c
index b027a89..c6d601c 100644
--- a/arch/powerpc/kvm/book3s_hv_rm_mmu.c
+++ b/arch/powerpc/kvm/book3s_hv_rm_mmu.c
@@ -421,14 +421,20 @@ long kvmppc_do_h_remove(struct kvm *kvm, unsigned long 
flags,
rev = real_vmalloc_addr(&kvm->arch.revmap[pte_index]);
v = pte & ~HPTE_V_HVLOCK;
if (v & HPTE_V_VALID) {
-   u64 pte1;
-
-   pte1 = be64_to_cpu(hpte[1]);
hpte[0] &= ~cpu_to_be64(HPTE_V_VALID);
-   rb = compute_tlbie_rb(v, pte1, pte_index);
+   rb = compute_tlbie_rb(v, be64_to_cpu(hpte[1]), pte_index);
do_tlbies(kvm, &rb, 1, global_invalidates(kvm, flags), true);
-   /* Read PTE low word after tlbie to get final R/C values */
-   remove_revmap_chain(kvm, pte_index, rev, v, pte1);
+   /*
+* The reference (R) and change (C) bits in a HPT
+* entry can be set by hardware at any time up until
+* the HPTE is invalidated and the TLB invalidation
+* sequence has completed.  This means that when
+* removing a HPTE, we need to re-read the HPTE after
+* the invalidation sequence has completed in order to
+* obtain reliable values of R and C.
+*/
+   remove_revmap_chain(kvm, pte_index, rev, v,
+   be64_to_cpu(hpte[1]));
}
r = rev->guest_rpte & ~HPTE_GR_RESERVED;
note_hpte_modification(kvm, rev);
-- 
1.8.1.4



Re: [PATCH] kvm:powerpc:Fix incorrect return statement in the function mpic_set_default_irq_routing

2015-08-12 Thread Alexander Graf


On 12.08.15 21:06, nick wrote:
> 
> 
> On 2015-08-12 03:05 PM, Alexander Graf wrote:
>>
>>
>> On 07.08.15 17:54, Nicholas Krause wrote:
>>> This fixes the incorrect return statement in the function
>>> mpic_set_default_irq_routing, which always returned zero
>>> to signal success to its caller. Instead, return the
>>> return value of kvm_set_irq_routing, as that function
>>> can fail and we need to correctly signal the caller of
>>> mpic_set_default_irq_routing when the call has failed.
>>>
>>> Signed-off-by: Nicholas Krause 
>>
>> I like the patch, but I don't see it on the kvm-ppc mailing list. It
>> doesn't show up on patchwork or spinics. Did something go wrong while
>> sending it out?
>>
>>
>> Alex
>>
> Alex,
> Ask Paolo about it, as he would be able to explain it better than I can.

Well, whatever the reason, I can only apply patches that actually
appeared on the public mailing list. Otherwise people may not get the
chance to review them ;).


Alex


Re: [PATCH] kvm:powerpc:Fix return statements for wrapper functions in the file book3s_64_mmu_hv.c

2015-08-12 Thread Alexander Graf


On 10.08.15 17:27, Nicholas Krause wrote:
> This fixes the wrapper functions kvm_unmap_hva_hv and
> kvm_unmap_hva_range_hv to return the return value of the function
> kvm_handle_hva or kvm_handle_hva_range that they wrap
> internally, rather than always making the caller of these
> wrapper functions think they ran successfully by returning
> zero directly.
> 
> Signed-off-by: Nicholas Krause 

Paul, could you please take on this one?

Thanks,

Alex

> ---
>  arch/powerpc/kvm/book3s_64_mmu_hv.c | 6 ++
>  1 file changed, 2 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c 
> b/arch/powerpc/kvm/book3s_64_mmu_hv.c
> index dab68b7..0905c8f 100644
> --- a/arch/powerpc/kvm/book3s_64_mmu_hv.c
> +++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c
> @@ -774,14 +774,12 @@ static int kvm_unmap_rmapp(struct kvm *kvm, unsigned 
> long *rmapp,
>  
>  int kvm_unmap_hva_hv(struct kvm *kvm, unsigned long hva)
>  {
> - kvm_handle_hva(kvm, hva, kvm_unmap_rmapp);
> - return 0;
> + return kvm_handle_hva(kvm, hva, kvm_unmap_rmapp);
>  }
>  
>  int kvm_unmap_hva_range_hv(struct kvm *kvm, unsigned long start, unsigned 
> long end)
>  {
> - kvm_handle_hva_range(kvm, start, end, kvm_unmap_rmapp);
> - return 0;
> + return kvm_handle_hva_range(kvm, start, end, kvm_unmap_rmapp);
>  }
>  
>  void kvmppc_core_flush_memslot_hv(struct kvm *kvm,
> 


Re: [PATCH] kvm:powerpc:Fix incorrect return statement in the function mpic_set_default_irq_routing

2015-08-12 Thread Alexander Graf


On 07.08.15 17:54, Nicholas Krause wrote:
> This fixes the incorrect return statement in the function
> mpic_set_default_irq_routing, which always returned zero
> to signal success to its caller. Instead, return the
> return value of kvm_set_irq_routing, as that function
> can fail and we need to correctly signal the caller of
> mpic_set_default_irq_routing when the call has failed.
> 
> Signed-off-by: Nicholas Krause 

I like the patch, but I don't see it on the kvm-ppc mailing list. It
doesn't show up on patchwork or spinics. Did something go wrong while
sending it out?


Alex


Re: [PATCH v3 1/1] KVM: PPC: Book3S: correct width in XER handling

2015-08-12 Thread Alexander Graf


On 06.08.15 12:16, Laurent Vivier wrote:
> Hi,
> 
> I'd also like to see this patch in mainline, as it fixes a bug
> that appears when we switch from vCPU context to hypervisor context
> (guest crash).

Thanks, applied to kvm-ppc-queue.


Alex


Re: [kvm-unit-tests PATCH 11/14] powerpc/ppc64: add rtas_power_off

2015-08-03 Thread Alexander Graf


On 03.08.15 19:02, Andrew Jones wrote:
> On Mon, Aug 03, 2015 at 07:08:17PM +0200, Paolo Bonzini wrote:
>>
>>
>> On 03/08/2015 16:41, Andrew Jones wrote:
>>> Add enough RTAS support to support power-off, and apply it to
>>> exit().
>>>
>>> Signed-off-by: Andrew Jones 
>>
>> Why not use virtio-mmio + testdev on ppc as well?  Similar to how we're
>> not using PSCI on ARM or ACPI on x86.
> 
> I have some longer term plans to add minimal virtio-pci support to
> kvm-unit-tests, and then we could plug virtio-serial+chr-testdev into
> that. I didn't think I could use virtio-mmio directly with spapr, but
> maybe I can? Actually, I sort of like this approach more in some

You would need to add support for the dynamic sysbus device allocation
in the spapr machine, but then I don't see why it wouldn't work.

PCI however is the more natural choice on sPAPR if you want to do virtio.

That said, if all you need is a chr transport, IIRC there should be a
way to get you additional channels on the existing "serial port" - which
really is just a simple hypercall interface. But David is the best
person to guide you to the best path forward here.
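
For readers following along, the power-off path the patch adds boils down
to something like this (a sketch; helper signatures are assumed from the
series, not verified here):

    #include <stdint.h>
    #include <stddef.h>

    /* assumed helpers from the series' RTAS support */
    extern int rtas_token(const char *service, uint32_t *token);
    extern int rtas_call(int token, int nargs, int nret, int *outputs, ...);

    void rtas_power_off(void)
    {
            uint32_t token;

            if (rtas_token("power-off", &token) == 0)
                    rtas_call(token, 2, 1, NULL, -1, -1); /* does not return */
            /* reached only if the "power-off" service is missing */
    }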


Alex

> respects though, as it doesn't require a special testdev or virtio
> support, keeping the unit test extra minimal. In fact, I was even
> thinking about posting patches (which I've already written) that
> allow chr-testdev to be optional for ARM too, now that it could use
> the exitcode snooper.
> 
> Thanks,
> drew
> 
>>
>> Paolo


Re: [PATCH 0/2] Two fixes for dynamic micro-threading

2015-07-23 Thread Alexander Graf


On 20.07.15 08:49, David Gibson wrote:
> On Thu, Jul 16, 2015 at 05:11:12PM +1000, Paul Mackerras wrote:
>> This series contains two fixes for the new dynamic micro-threading
>> code that was added recently for HV-mode KVM on Power servers.
>> The patches are against Alex Graf's kvm-ppc-queue branch.  Please
>> apply.
> 
> agraf,
> 
> Any word on these?  These appear to fix a really nasty host crash in
> current upstream.  I'd really like to see them merged ASAP.

Thanks, applied to kvm-ppc-queue.

The host crash should only occur with dynamic micro-threading enabled,
which is not in Linus' tree, correct?


Alex



Re: [PATCH 0/5] PPC: Current patch queue for HV KVM

2015-07-01 Thread Alexander Graf


On 24.06.15 13:18, Paul Mackerras wrote:
> This is my current queue of patches for HV KVM.  This series is based
> on the kvm next branch.  They have all been posted 6 weeks ago or
> more, though I have just added a 3-line fix to patch 2/5 to fix a bug
> that we found in testing migration, and I expanded a comment (no code
> change) in patch 3/5 following a suggestion by Aneesh.
> 
> I'd like to see these go into 4.2 if possible.

Thanks, applied all to kvm-ppc-queue.


Alex


Re: [PATCH 2/5] KVM: PPC: Book3S HV: Implement dynamic micro-threading on POWER8

2015-06-30 Thread Alexander Graf

On 06/24/15 13:18, Paul Mackerras wrote:

This builds on the ability to run more than one vcore on a physical
core by using the micro-threading (split-core) modes of the POWER8
chip.  Previously, only vcores from the same VM could be run together,
and (on POWER8) only if they had just one thread per core.  With the
ability to split the core on guest entry and unsplit it on guest exit,
we can run up to 8 vcpu threads from up to 4 different VMs, and we can
run multiple vcores with 2 or 4 vcpus per vcore.

Dynamic micro-threading is only available if the static configuration
of the cores is whole-core mode (unsplit), and only on POWER8.

To manage this, we introduce a new kvm_split_mode struct which is
shared across all of the subcores in the core, with a pointer in the
paca on each thread.  In addition we extend the core_info struct to
have information on each subcore.  When deciding whether to add a
vcore to the set already on the core, we now have two possibilities:
(a) piggyback the vcore onto an existing subcore, or (b) start a new
subcore.

Currently, when any vcpu needs to exit the guest and switch to host
virtual mode, we interrupt all the threads in all subcores and switch
the core back to whole-core mode.  It may be possible in future to
allow some of the subcores to keep executing in the guest while
subcore 0 switches to the host, but that is not implemented in this
patch.

This adds a module parameter called dynamic_mt_modes which controls
which micro-threading (split-core) modes the code will consider, as a
bitmap.  In other words, if it is 0, no micro-threading mode is
considered; if it is 2, only 2-way micro-threading is considered; if
it is 4, only 4-way, and if it is 6, both 2-way and 4-way
micro-threading mode will be considered.  The default is 6.
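
(For illustration, with the modes encoded as their subcore counts, the
bitmap test reduces to something like the following sketch; the helper
name is made up and this is not the actual book3s_hv.c code:)

    /* n_subcores is 2 for 2-way and 4 for 4-way micro-threading */
    static bool split_mode_allowed(int n_subcores, unsigned int dynamic_mt_modes)
    {
            return (dynamic_mt_modes & n_subcores) != 0;
    }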

With this, we now have secondary threads which are the primary thread
for their subcore and therefore need to do the MMU switch.  These
threads will need to be started even if they have no vcpu to run, so
we use the vcore pointer in the PACA rather than the vcpu pointer to
trigger them.

It is now possible for thread 0 to find that an exit has been
requested before it gets to switch the subcore state to the guest.  In
that case we haven't added the guest's timebase offset to the
timebase, so we need to be careful not to subtract the offset in the
guest exit path.  In fact we just skip the whole path that switches
back to host context, since we haven't switched to the guest context.

Signed-off-by: Paul Mackerras 
---
  arch/powerpc/include/asm/kvm_book3s_asm.h |  20 ++
  arch/powerpc/include/asm/kvm_host.h   |   3 +
  arch/powerpc/kernel/asm-offsets.c |   7 +
  arch/powerpc/kvm/book3s_hv.c  | 369 ++
  arch/powerpc/kvm/book3s_hv_builtin.c  |  25 +-
  arch/powerpc/kvm/book3s_hv_rmhandlers.S   | 113 +++--
  6 files changed, 475 insertions(+), 62 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_book3s_asm.h 
b/arch/powerpc/include/asm/kvm_book3s_asm.h
index 5bdfb5d..4024d24 100644
--- a/arch/powerpc/include/asm/kvm_book3s_asm.h
+++ b/arch/powerpc/include/asm/kvm_book3s_asm.h
@@ -25,6 +25,12 @@
  #define XICS_MFRR 0xc
  #define XICS_IPI  2   /* interrupt source # for IPIs */
  
+/* Maximum number of threads per physical core */

+#define MAX_THREADS8
+
+/* Maximum number of subcores per physical core */
+#define MAX_SUBCORES   4
+
  #ifdef __ASSEMBLY__
  
  #ifdef CONFIG_KVM_BOOK3S_HANDLER

@@ -65,6 +71,19 @@ kvmppc_resume_\intno:
  
  #else  /*__ASSEMBLY__ */
  
+struct kvmppc_vcore;

+
+/* Struct used for coordinating micro-threading (split-core) mode changes */
+struct kvm_split_mode {
+   unsigned long   rpr;
+   unsigned long   pmmar;
+   unsigned long   ldbar;
+   u8  subcore_size;
+   u8  do_nap;
+   u8  napped[MAX_THREADS];
+   struct kvmppc_vcore *master_vcs[MAX_SUBCORES];
+};
+
  /*
   * This struct goes in the PACA on 64-bit processors.  It is used
   * to store host state that needs to be saved when we enter a guest
@@ -100,6 +119,7 @@ struct kvmppc_host_state {
u64 host_spurr;
u64 host_dscr;
u64 dec_expires;
+   struct kvm_split_mode *kvm_split_mode;
  #endif
  #ifdef CONFIG_PPC_BOOK3S_64
u64 cfar;
diff --git a/arch/powerpc/include/asm/kvm_host.h 
b/arch/powerpc/include/asm/kvm_host.h
index 2b74490..80eb29a 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -302,6 +302,9 @@ struct kvmppc_vcore {
  #define VCORE_EXIT_MAP(vc)((vc)->entry_exit_map >> 8)
  #define VCORE_IS_EXITING(vc)  (VCORE_EXIT_MAP(vc) != 0)
  
+/* This bit is used when a vcore exit is triggered from outside the vcore */

+#define VCORE_EXIT_REQ 0x1
+
  /*
   * Values for vcore_state.
   * Note that these are arranged such that lower values
diff --git a/arch/powerpc/kernel/asm-offsets.c 
b/arch/powerpc/

Re: [PATCH 1/3] powerpc: implement barrier primitives

2015-06-17 Thread Alexander Graf


On 17.06.15 12:15, Will Deacon wrote:
> On Wed, Jun 17, 2015 at 10:43:48AM +0100, Andre Przywara wrote:
>> Instead of referring to the Linux header including the barrier
>> macros, copy over the rather simple implementation for the PowerPC
>> barrier instructions kvmtool uses. This fixes build for powerpc.
>>
>> Signed-off-by: Andre Przywara 
>> ---
>> Hi,
>>
>> I just took what kvmtool seems to have used before, I actually have
>> no idea if "sync" is the right instruction or "lwsync" would do.
>> Would be nice if some people with PowerPC knowledge could comment.
> 
> I *think* we can use lwsync for rmb and wmb, but would want confirmation
> from a ppc guy before making that change!

Also I'd prefer to play safe for now :)


Alex


Re: [PATCH] treewide: Fix typo compatability -> compatibility

2015-06-01 Thread Alexander Graf


On 27.05.15 14:05, Laurent Pinchart wrote:
> Even though 'compatability' has a dedicated entry in the Wiktionary,
> it's listed as 'Misspelling of compatibility'. Fix it.
> 
> Signed-off-by: Laurent Pinchart 
> ---
>  arch/metag/include/asm/elf.h | 2 +-


>  arch/powerpc/kvm/book3s.c    | 2 +-

Acked-by: Alexander Graf 

for the PPC KVM bit.


Alex


Re: [PATCH v2 1/1] KVM: PPC: Book3S: correct width in XER handling

2015-05-26 Thread Alexander Graf


On 26.05.15 02:27, Sam Bobroff wrote:
> In 64 bit kernels, the Fixed Point Exception Register (XER) is a 64
> bit field (e.g. in kvm_regs and kvm_vcpu_arch) and in most places it is
> accessed as such.
> 
> This patch corrects places where it is accessed as a 32 bit field by a
> 64 bit kernel.  In some cases this is via a 32 bit load or store
> instruction which, depending on endianness, will cause either the
> lower or upper 32 bits to be missed.  In another case it is cast as a
> u32, causing the upper 32 bits to be cleared.
> 
> This patch corrects those places by extending the access methods to
> 64 bits.
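
(As a user-space illustration of the failure mode, not kernel code: a
32-bit access to a 64-bit value keeps one half and drops the other, and
byte order decides which half a 32-bit store lands on:)

    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
            uint64_t xer  = 0xdeadbeef00000001ULL;
            uint32_t as32 = (uint32_t)xer;  /* upper 32 bits cleared */
            uint64_t slot = 0;

            /* like 'stw' into a doubleword slot: 4 of 8 bytes written;
             * little-endian hits the low half, big-endian the high half */
            memcpy(&slot, &as32, sizeof(as32));

            printf("xer=%#llx as32=%#x slot=%#llx\n",
                   (unsigned long long)xer, as32, (unsigned long long)slot);
            return 0;
    }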
> 
> Signed-off-by: Sam Bobroff 
> ---
> 
> v2:
> 
> Also extend kvmppc_book3s_shadow_vcpu.xer to 64 bit.
> 
>  arch/powerpc/include/asm/kvm_book3s.h |4 ++--
>  arch/powerpc/include/asm/kvm_book3s_asm.h |2 +-
>  arch/powerpc/kvm/book3s_hv_rmhandlers.S   |6 +++---
>  arch/powerpc/kvm/book3s_segment.S |4 ++--
>  4 files changed, 8 insertions(+), 8 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/kvm_book3s.h 
> b/arch/powerpc/include/asm/kvm_book3s.h
> index b91e74a..05a875a 100644
> --- a/arch/powerpc/include/asm/kvm_book3s.h
> +++ b/arch/powerpc/include/asm/kvm_book3s.h
> @@ -225,12 +225,12 @@ static inline u32 kvmppc_get_cr(struct kvm_vcpu *vcpu)
>   return vcpu->arch.cr;
>  }
>  
> -static inline void kvmppc_set_xer(struct kvm_vcpu *vcpu, u32 val)
> +static inline void kvmppc_set_xer(struct kvm_vcpu *vcpu, ulong val)

Now we have book3s and booke files with different prototypes on the same
inline function names. That's really ugly. Please keep them in sync ;).


Alex

>  {
>   vcpu->arch.xer = val;
>  }
>  
> -static inline u32 kvmppc_get_xer(struct kvm_vcpu *vcpu)
> +static inline ulong kvmppc_get_xer(struct kvm_vcpu *vcpu)
>  {
>   return vcpu->arch.xer;
>  }
> diff --git a/arch/powerpc/include/asm/kvm_book3s_asm.h 
> b/arch/powerpc/include/asm/kvm_book3s_asm.h
> index 5bdfb5d..c4ccd2d 100644
> --- a/arch/powerpc/include/asm/kvm_book3s_asm.h
> +++ b/arch/powerpc/include/asm/kvm_book3s_asm.h
> @@ -112,7 +112,7 @@ struct kvmppc_book3s_shadow_vcpu {
>   bool in_use;
>   ulong gpr[14];
>   u32 cr;
> - u32 xer;
> + ulong xer;
>   ulong ctr;
>   ulong lr;
>   ulong pc;
> diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S 
> b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> index 4d70df2..d75be59 100644
> --- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> +++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> @@ -870,7 +870,7 @@ END_FTR_SECTION_IFCLR(CPU_FTR_ARCH_207S)
>   blt hdec_soon
>  
>   ld  r6, VCPU_CTR(r4)
> - lwz r7, VCPU_XER(r4)
> + ld  r7, VCPU_XER(r4)
>  
>   mtctr   r6
>   mtxer   r7
> @@ -1103,7 +1103,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
>   mfctr   r3
>   mfxer   r4
>   std r3, VCPU_CTR(r9)
> - stw r4, VCPU_XER(r9)
> + std r4, VCPU_XER(r9)
>  
>   /* If this is a page table miss then see if it's theirs or ours */
>   cmpwi   r12, BOOK3S_INTERRUPT_H_DATA_STORAGE
> @@ -1675,7 +1675,7 @@ kvmppc_hdsi:
>   bl  kvmppc_msr_interrupt
>  fast_interrupt_c_return:
>  6:   ld  r7, VCPU_CTR(r9)
> - lwz r8, VCPU_XER(r9)
> + ld  r8, VCPU_XER(r9)
>   mtctr   r7
>   mtxer   r8
>   mr  r4, r9
> diff --git a/arch/powerpc/kvm/book3s_segment.S 
> b/arch/powerpc/kvm/book3s_segment.S
> index acee37c..ca8f174 100644
> --- a/arch/powerpc/kvm/book3s_segment.S
> +++ b/arch/powerpc/kvm/book3s_segment.S
> @@ -123,7 +123,7 @@ no_dcbz32_on:
>   PPC_LL  r8, SVCPU_CTR(r3)
>   PPC_LL  r9, SVCPU_LR(r3)
>   lwz r10, SVCPU_CR(r3)
> - lwz r11, SVCPU_XER(r3)
> + PPC_LL  r11, SVCPU_XER(r3)
>  
>   mtctr   r8
>   mtlrr9
> @@ -237,7 +237,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_HVMODE)
>   mfctr   r8
>   mflrr9
>  
> - stw r5, SVCPU_XER(r13)
> + PPC_STL r5, SVCPU_XER(r13)
>   PPC_STL r6, SVCPU_FAULT_DAR(r13)
>   stw r7, SVCPU_FAULT_DSISR(r13)
>   PPC_STL r8, SVCPU_CTR(r13)
> 


Re: [PATCH 1/1] KVM: PPC: Book3S: correct width in XER handling

2015-05-25 Thread Alexander Graf


On 26.05.15 02:14, Sam Bobroff wrote:
> On Mon, May 25, 2015 at 11:08:08PM +0200, Alexander Graf wrote:
>>
>>
>> On 20.05.15 07:26, Sam Bobroff wrote:
>>> In 64 bit kernels, the Fixed Point Exception Register (XER) is a 64
>>> bit field (e.g. in kvm_regs and kvm_vcpu_arch) and in most places it is
>>> accessed as such.
>>>
>>> This patch corrects places where it is accessed as a 32 bit field by a
>>> 64 bit kernel.  In some cases this is via a 32 bit load or store
>>> instruction which, depending on endianness, will cause either the
>>> lower or upper 32 bits to be missed.  In another case it is cast as a
>>> u32, causing the upper 32 bits to be cleared.
>>>
>>> This patch corrects those places by extending the access methods to
>>> 64 bits.
>>>
>>> Signed-off-by: Sam Bobroff 
>>> ---
>>>
>>>  arch/powerpc/include/asm/kvm_book3s.h   |4 ++--
>>>  arch/powerpc/kvm/book3s_hv_rmhandlers.S |6 +++---
>>>  arch/powerpc/kvm/book3s_segment.S   |4 ++--
>>>  3 files changed, 7 insertions(+), 7 deletions(-)
>>>
>>> diff --git a/arch/powerpc/include/asm/kvm_book3s.h 
>>> b/arch/powerpc/include/asm/kvm_book3s.h
>>> index b91e74a..05a875a 100644
>>> --- a/arch/powerpc/include/asm/kvm_book3s.h
>>> +++ b/arch/powerpc/include/asm/kvm_book3s.h
>>> @@ -225,12 +225,12 @@ static inline u32 kvmppc_get_cr(struct kvm_vcpu *vcpu)
>>> return vcpu->arch.cr;
>>>  }
>>>  
>>> -static inline void kvmppc_set_xer(struct kvm_vcpu *vcpu, u32 val)
>>> +static inline void kvmppc_set_xer(struct kvm_vcpu *vcpu, ulong val)
>>>  {
>>> vcpu->arch.xer = val;
>>>  }
>>>  
>>> -static inline u32 kvmppc_get_xer(struct kvm_vcpu *vcpu)
>>> +static inline ulong kvmppc_get_xer(struct kvm_vcpu *vcpu)
>>>  {
>>> return vcpu->arch.xer;
>>>  }
>>> diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S 
>>> b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
>>> index 4d70df2..d75be59 100644
>>> --- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
>>> +++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
>>> @@ -870,7 +870,7 @@ END_FTR_SECTION_IFCLR(CPU_FTR_ARCH_207S)
>>> blt hdec_soon
>>>  
>>> ld  r6, VCPU_CTR(r4)
>>> -   lwz r7, VCPU_XER(r4)
>>> +   ld  r7, VCPU_XER(r4)
>>>  
>>> mtctr   r6
>>> mtxer   r7
>>> @@ -1103,7 +1103,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
>>> mfctr   r3
>>> mfxer   r4
>>> std r3, VCPU_CTR(r9)
>>> -   stw r4, VCPU_XER(r9)
>>> +   std r4, VCPU_XER(r9)
>>>  
>>> /* If this is a page table miss then see if it's theirs or ours */
>>> cmpwi   r12, BOOK3S_INTERRUPT_H_DATA_STORAGE
>>> @@ -1675,7 +1675,7 @@ kvmppc_hdsi:
>>> bl  kvmppc_msr_interrupt
>>>  fast_interrupt_c_return:
>>>  6: ld  r7, VCPU_CTR(r9)
>>> -   lwz r8, VCPU_XER(r9)
>>> +   ld  r8, VCPU_XER(r9)
>>> mtctr   r7
>>> mtxer   r8
>>> mr  r4, r9
>>> diff --git a/arch/powerpc/kvm/book3s_segment.S 
>>> b/arch/powerpc/kvm/book3s_segment.S
>>> index acee37c..ca8f174 100644
>>> --- a/arch/powerpc/kvm/book3s_segment.S
>>> +++ b/arch/powerpc/kvm/book3s_segment.S
>>> @@ -123,7 +123,7 @@ no_dcbz32_on:
>>> PPC_LL  r8, SVCPU_CTR(r3)
>>> PPC_LL  r9, SVCPU_LR(r3)
>>> lwz r10, SVCPU_CR(r3)
>>> -   lwz r11, SVCPU_XER(r3)
>>> +   PPC_LL  r11, SVCPU_XER(r3)
>>
>> struct kvmppc_book3s_shadow_vcpu {
>> bool in_use;
>> ulong gpr[14];
>> u32 cr;
>> u32 xer;
>> [...]
>>
>> so at least this change looks wrong. Please double-check all fields in
>> your patch again.
>>
>>
>> Alex
> 
> Thanks for the review and the catch!
> 
> The xer field in kvm_vcpu_arch is already ulong, so it looks like the one in
> kvmppc_book3s_shadow_vcpu is the only other case. I'll fix that and repost.

I guess given that the one in pt_regs is also ulong, going ulong rather
than u32 is the better choice, yes.

While at it, could you please just do a grep -i xer across all kvm (.c
and .h) files and just sanity check that we're staying in sync?


Thanks!

Alex


Re: [PATCH] KVM: PPC: add missing pt_regs initialization

2015-05-25 Thread Alexander Graf


On 18.05.15 14:44, Laurentiu Tudor wrote:
> On this switch branch the regs initialization
> doesn't happen so add it.
> This was found with the help of a static
> code analysis tool.
> 
> Signed-off-by: Laurentiu Tudor 
> Cc: Scott Wood 
> Cc: Mihai Caraman 

Thanks, applied to kvm-ppc-queue.


Alex


Re: [PATCH 1/1] KVM: PPC: Book3S: correct width in XER handling

2015-05-25 Thread Alexander Graf


On 20.05.15 07:26, Sam Bobroff wrote:
> In 64 bit kernels, the Fixed Point Exception Register (XER) is a 64
> bit field (e.g. in kvm_regs and kvm_vcpu_arch) and in most places it is
> accessed as such.
> 
> This patch corrects places where it is accessed as a 32 bit field by a
> 64 bit kernel.  In some cases this is via a 32 bit load or store
> instruction which, depending on endianness, will cause either the
> lower or upper 32 bits to be missed.  In another case it is cast as a
> u32, causing the upper 32 bits to be cleared.
> 
> This patch corrects those places by extending the access methods to
> 64 bits.
> 
> Signed-off-by: Sam Bobroff 
> ---
> 
>  arch/powerpc/include/asm/kvm_book3s.h   |4 ++--
>  arch/powerpc/kvm/book3s_hv_rmhandlers.S |6 +++---
>  arch/powerpc/kvm/book3s_segment.S   |4 ++--
>  3 files changed, 7 insertions(+), 7 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/kvm_book3s.h 
> b/arch/powerpc/include/asm/kvm_book3s.h
> index b91e74a..05a875a 100644
> --- a/arch/powerpc/include/asm/kvm_book3s.h
> +++ b/arch/powerpc/include/asm/kvm_book3s.h
> @@ -225,12 +225,12 @@ static inline u32 kvmppc_get_cr(struct kvm_vcpu *vcpu)
>   return vcpu->arch.cr;
>  }
>  
> -static inline void kvmppc_set_xer(struct kvm_vcpu *vcpu, u32 val)
> +static inline void kvmppc_set_xer(struct kvm_vcpu *vcpu, ulong val)
>  {
>   vcpu->arch.xer = val;
>  }
>  
> -static inline u32 kvmppc_get_xer(struct kvm_vcpu *vcpu)
> +static inline ulong kvmppc_get_xer(struct kvm_vcpu *vcpu)
>  {
>   return vcpu->arch.xer;
>  }
> diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S 
> b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> index 4d70df2..d75be59 100644
> --- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> +++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> @@ -870,7 +870,7 @@ END_FTR_SECTION_IFCLR(CPU_FTR_ARCH_207S)
>   blt hdec_soon
>  
>   ld  r6, VCPU_CTR(r4)
> - lwz r7, VCPU_XER(r4)
> + ld  r7, VCPU_XER(r4)
>  
>   mtctr   r6
>   mtxer   r7
> @@ -1103,7 +1103,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
>   mfctr   r3
>   mfxer   r4
>   std r3, VCPU_CTR(r9)
> - stw r4, VCPU_XER(r9)
> + std r4, VCPU_XER(r9)
>  
>   /* If this is a page table miss then see if it's theirs or ours */
>   cmpwi   r12, BOOK3S_INTERRUPT_H_DATA_STORAGE
> @@ -1675,7 +1675,7 @@ kvmppc_hdsi:
>   bl  kvmppc_msr_interrupt
>  fast_interrupt_c_return:
>  6:   ld  r7, VCPU_CTR(r9)
> - lwz r8, VCPU_XER(r9)
> + ld  r8, VCPU_XER(r9)
>   mtctr   r7
>   mtxer   r8
>   mr  r4, r9
> diff --git a/arch/powerpc/kvm/book3s_segment.S 
> b/arch/powerpc/kvm/book3s_segment.S
> index acee37c..ca8f174 100644
> --- a/arch/powerpc/kvm/book3s_segment.S
> +++ b/arch/powerpc/kvm/book3s_segment.S
> @@ -123,7 +123,7 @@ no_dcbz32_on:
>   PPC_LL  r8, SVCPU_CTR(r3)
>   PPC_LL  r9, SVCPU_LR(r3)
>   lwz r10, SVCPU_CR(r3)
> - lwz r11, SVCPU_XER(r3)
> + PPC_LL  r11, SVCPU_XER(r3)

struct kvmppc_book3s_shadow_vcpu {
bool in_use;
ulong gpr[14];
u32 cr;
u32 xer;
[...]

so at least this change looks wrong. Please double-check all fields in
your patch again.


Alex


Re: [PATCH] KVM: PPC: check for lookup_linux_ptep() returning NULL

2015-05-25 Thread Alexander Graf


On 21.05.15 21:37, Scott Wood wrote:
> On Thu, 2015-05-21 at 16:26 +0300, Laurentiu Tudor wrote:
>> If passed a larger page size lookup_linux_ptep()
>> may fail, so add a check for that and bail out
>> if that's the case.
>> This was found with the help of a static
>> code analysis tool.
>>
>> Signed-off-by: Mihai Caraman 
>> Signed-off-by: Laurentiu Tudor 
>> Cc: Scott Wood 
>> ---
>> based on https://github.com/agraf/linux-2.6.git kvm-ppc-next
>>
>>  arch/powerpc/kvm/e500_mmu_host.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> Reviewed-by: Scott Wood 

Thanks, applied to kvm-ppc-queue.


Alex


Re: [PATCH] KVM: PPC: Fix warnings from sparse

2015-05-25 Thread Alexander Graf


On 22.05.15 09:25, Thomas Huth wrote:
> When compiling the KVM code for POWER with "make C=1", sparse
> complains about functions missing proper prototypes and a 64-bit
> constant missing the ULL prefix. Let's fix this by making the
> functions static or by including the proper header with the
> prototypes, and by appending a ULL prefix to the constant
> PPC_MPPE_ADDRESS_MASK.
> 
> Signed-off-by: Thomas Huth 

Thanks, applied to kvm-ppc-queue.

Alex


Re: [PATCH] KVM: PPC: Remove PPC970 from KVM_BOOK3S_64_HV text in Kconfig

2015-05-25 Thread Alexander Graf


On 22.05.15 11:41, Thomas Huth wrote:
> Since the PPC970 support has been removed from the kvm-hv kernel
> module recently, we should also reflect this change in the help
> text of the corresponding Kconfig option.
> 
> Signed-off-by: Thomas Huth 

Thanks, applied to kvm-ppc-queue.

Alex


Re: [PATCH] KVM: PPC: fix suspicious use of conditional operator

2015-05-25 Thread Alexander Graf


On 25.05.15 10:48, Laurentiu Tudor wrote:
> This was signaled by a static code analysis tool.
> 
> Signed-off-by: Laurentiu Tudor 
> Reviewed-by: Scott Wood 

Thanks, applied to kvm-ppc-queue.


Alex


[PULL 15/21] KVM: PPC: Book3S HV: Get rid of vcore nap_count and n_woken

2015-04-21 Thread Alexander Graf
From: Paul Mackerras 

We can tell when a secondary thread has finished running a guest by
the fact that it clears its kvm_hstate.kvm_vcpu pointer, so there
is no real need for the nap_count field in the kvmppc_vcore struct.
This changes kvmppc_wait_for_nap to poll the kvm_hstate.kvm_vcpu
pointers of the secondary threads rather than polling vc->nap_count.
Besides reducing the size of the kvmppc_vcore struct by 8 bytes,
this also means that we can tell which secondary threads have got
stuck and thus print a more informative error message.
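
(A toy user-space model of the handshake this relies on, with made-up
names: the primary publishes a per-thread pointer, each secondary clears
its slot when it finishes, and the primary polls the slots for NULL
instead of waiting on a counter:)

    #include <stddef.h>

    #define NTHREADS 8

    /* stands in for paca[cpu].kvm_hstate.kvm_vcpu */
    static void * volatile slot[NTHREADS];

    static int all_secondaries_done(void)
    {
            int i;

            for (i = 1; i < NTHREADS; i++)
                    if (slot[i] != NULL)    /* still running its guest */
                            return 0;
            return 1;
    }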

Signed-off-by: Paul Mackerras 
Signed-off-by: Alexander Graf 
---
 arch/powerpc/include/asm/kvm_host.h |  2 --
 arch/powerpc/kernel/asm-offsets.c   |  1 -
 arch/powerpc/kvm/book3s_hv.c| 47 +++--
 arch/powerpc/kvm/book3s_hv_rmhandlers.S | 19 +
 4 files changed, 34 insertions(+), 35 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_host.h 
b/arch/powerpc/include/asm/kvm_host.h
index 83c4425..1517faa 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -272,8 +272,6 @@ struct kvmppc_vcore {
int n_runnable;
int num_threads;
int entry_exit_count;
-   int n_woken;
-   int nap_count;
int napping_threads;
int first_vcpuid;
u16 pcpu;
diff --git a/arch/powerpc/kernel/asm-offsets.c 
b/arch/powerpc/kernel/asm-offsets.c
index 92ec3fc..8aa8246 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -563,7 +563,6 @@ int main(void)
DEFINE(VCPU_WORT, offsetof(struct kvm_vcpu, arch.wort));
DEFINE(VCPU_SHADOW_SRR1, offsetof(struct kvm_vcpu, arch.shadow_srr1));
DEFINE(VCORE_ENTRY_EXIT, offsetof(struct kvmppc_vcore, 
entry_exit_count));
-   DEFINE(VCORE_NAP_COUNT, offsetof(struct kvmppc_vcore, nap_count));
DEFINE(VCORE_IN_GUEST, offsetof(struct kvmppc_vcore, in_guest));
DEFINE(VCORE_NAPPING_THREADS, offsetof(struct kvmppc_vcore, 
napping_threads));
DEFINE(VCORE_KVM, offsetof(struct kvmppc_vcore, kvm));
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index fb4f166..7c1335d 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -1729,8 +1729,10 @@ static int kvmppc_grab_hwthread(int cpu)
tpaca = &paca[cpu];
 
/* Ensure the thread won't go into the kernel if it wakes */
-   tpaca->kvm_hstate.hwthread_req = 1;
tpaca->kvm_hstate.kvm_vcpu = NULL;
+   tpaca->kvm_hstate.napping = 0;
+   smp_wmb();
+   tpaca->kvm_hstate.hwthread_req = 1;
 
/*
 * If the thread is already executing in the kernel (e.g. handling
@@ -1773,35 +1775,43 @@ static void kvmppc_start_thread(struct kvm_vcpu *vcpu)
}
cpu = vc->pcpu + vcpu->arch.ptid;
tpaca = &paca[cpu];
-   tpaca->kvm_hstate.kvm_vcpu = vcpu;
tpaca->kvm_hstate.kvm_vcore = vc;
tpaca->kvm_hstate.ptid = vcpu->arch.ptid;
vcpu->cpu = vc->pcpu;
+   /* Order stores to hstate.kvm_vcore etc. before store to kvm_vcpu */
smp_wmb();
+   tpaca->kvm_hstate.kvm_vcpu = vcpu;
 #if defined(CONFIG_PPC_ICP_NATIVE) && defined(CONFIG_SMP)
-   if (cpu != smp_processor_id()) {
+   if (cpu != smp_processor_id())
xics_wake_cpu(cpu);
-   if (vcpu->arch.ptid)
-   ++vc->n_woken;
-   }
 #endif
 }
 
-static void kvmppc_wait_for_nap(struct kvmppc_vcore *vc)
+static void kvmppc_wait_for_nap(void)
 {
-   int i;
+   int cpu = smp_processor_id();
+   int i, loops;
 
-   HMT_low();
-   i = 0;
-   while (vc->nap_count < vc->n_woken) {
-   if (++i >= 100) {
-   pr_err("kvmppc_wait_for_nap timeout %d %d\n",
-  vc->nap_count, vc->n_woken);
-   break;
+   for (loops = 0; loops < 100; ++loops) {
+   /*
+* Check if all threads are finished.
+* We set the vcpu pointer when starting a thread
+* and the thread clears it when finished, so we look
+* for any threads that still have a non-NULL vcpu ptr.
+*/
+   for (i = 1; i < threads_per_subcore; ++i)
+   if (paca[cpu + i].kvm_hstate.kvm_vcpu)
+   break;
+   if (i == threads_per_subcore) {
+   HMT_medium();
+   return;
}
-   cpu_relax();
+   HMT_low();
}
HMT_medium();
+   for (i = 1; i < threads_per_subcore; ++i)
+   if (paca[cpu + i].kvm_hstate.kvm_vcpu)
+   pr_err("KVM: CPU %d seems to be stuck\n", cpu + i);
 }
 
 /*
@@ -1942,8 +1952,6 @@ static void kvmppc_run

[PULL 11/21] KVM: PPC: Book3S HV: Accumulate timing information for real-mode code

2015-04-21 Thread Alexander Graf
From: Paul Mackerras 

This reads the timebase at various points in the real-mode guest
entry/exit code and uses that to accumulate total, minimum and
maximum time spent in those parts of the code.  Currently these
times are accumulated per vcpu in 5 parts of the code:

* rm_entry - time taken from the start of kvmppc_hv_entry() until
  just before entering the guest.
* rm_intr - time from when we take a hypervisor interrupt in the
  guest until we either re-enter the guest or decide to exit to the
  host.  This includes time spent handling hcalls in real mode.
* rm_exit - time from when we decide to exit the guest until the
  return from kvmppc_hv_entry().
* guest - time spent in the guest
* cede - time spent napping in real mode due to an H_CEDE hcall
  while other threads in the same vcore are active.

These times are exposed in debugfs in a directory per vcpu that
contains a file called "timings".  This file contains one line for
each of the 5 timings above, with the name followed by a colon and
4 numbers, which are the count (number of times the code has been
executed), the total time, the minimum time, and the maximum time,
all in nanoseconds.
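
(For illustration only, a "timings" file following that layout might
read as below; the path and all numbers are invented:)

    # cat /sys/kernel/debug/kvm/12345-10/vcpu0/timings
    rm_entry: 282109 531194664 448 13784
    rm_intr: 282109 127293159 212 41398
    rm_exit: 282109 375420871 320 24857
    guest: 282109 17025432287 360 870317
    cede: 971 12932718 712 921044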

The overhead of the extra code amounts to about 30ns for an hcall that
is handled in real mode (e.g. H_SET_DABR), which is about 25%.  Since
production environments may not wish to incur this overhead, the new
code is conditional on a new config symbol,
CONFIG_KVM_BOOK3S_HV_EXIT_TIMING.

Signed-off-by: Paul Mackerras 
Signed-off-by: Alexander Graf 
---
 arch/powerpc/include/asm/kvm_host.h |  21 +
 arch/powerpc/include/asm/time.h |   3 +
 arch/powerpc/kernel/asm-offsets.c   |  13 +++
 arch/powerpc/kernel/time.c  |   6 ++
 arch/powerpc/kvm/Kconfig|  14 +++
 arch/powerpc/kvm/book3s_hv.c| 150 
 arch/powerpc/kvm/book3s_hv_rmhandlers.S | 141 +-
 7 files changed, 346 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_host.h 
b/arch/powerpc/include/asm/kvm_host.h
index f1d0bbc..d2068bb 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -369,6 +369,14 @@ struct kvmppc_slb {
u8 base_page_size;  /* MMU_PAGE_xxx */
 };
 
+/* Struct used to accumulate timing information in HV real mode code */
+struct kvmhv_tb_accumulator {
+   u64 seqcount;   /* used to synchronize access, also count * 2 */
+   u64 tb_total;   /* total time in timebase ticks */
+   u64 tb_min; /* min time */
+   u64 tb_max; /* max time */
+};
+
 # ifdef CONFIG_PPC_FSL_BOOK3E
 #define KVMPPC_BOOKE_IAC_NUM   2
 #define KVMPPC_BOOKE_DAC_NUM   2
@@ -657,6 +665,19 @@ struct kvm_vcpu_arch {
 
u32 emul_inst;
 #endif
+
+#ifdef CONFIG_KVM_BOOK3S_HV_EXIT_TIMING
+   struct kvmhv_tb_accumulator *cur_activity;  /* What we're timing */
+   u64 cur_tb_start;   /* when it started */
+   struct kvmhv_tb_accumulator rm_entry;   /* real-mode entry code */
+   struct kvmhv_tb_accumulator rm_intr;/* real-mode intr handling */
+   struct kvmhv_tb_accumulator rm_exit;/* real-mode exit code */
+   struct kvmhv_tb_accumulator guest_time; /* guest execution */
+   struct kvmhv_tb_accumulator cede_time;  /* time napping inside guest */
+
+   struct dentry *debugfs_dir;
+   struct dentry *debugfs_timings;
+#endif /* CONFIG_KVM_BOOK3S_HV_EXIT_TIMING */
 };
 
 #define VCPU_FPR(vcpu, i)  (vcpu)->arch.fp.fpr[i][TS_FPROFFSET]
diff --git a/arch/powerpc/include/asm/time.h b/arch/powerpc/include/asm/time.h
index 03cbada..10fc784 100644
--- a/arch/powerpc/include/asm/time.h
+++ b/arch/powerpc/include/asm/time.h
@@ -211,5 +211,8 @@ extern void secondary_cpu_time_init(void);
 
 DECLARE_PER_CPU(u64, decrementers_next_tb);
 
+/* Convert timebase ticks to nanoseconds */
+unsigned long long tb_to_ns(unsigned long long tb_ticks);
+
 #endif /* __KERNEL__ */
 #endif /* __POWERPC_TIME_H */
diff --git a/arch/powerpc/kernel/asm-offsets.c 
b/arch/powerpc/kernel/asm-offsets.c
index 4717859..3fea721 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -459,6 +459,19 @@ int main(void)
DEFINE(VCPU_SPRG2, offsetof(struct kvm_vcpu, arch.shregs.sprg2));
DEFINE(VCPU_SPRG3, offsetof(struct kvm_vcpu, arch.shregs.sprg3));
 #endif
+#ifdef CONFIG_KVM_BOOK3S_HV_EXIT_TIMING
+   DEFINE(VCPU_TB_RMENTRY, offsetof(struct kvm_vcpu, arch.rm_entry));
+   DEFINE(VCPU_TB_RMINTR, offsetof(struct kvm_vcpu, arch.rm_intr));
+   DEFINE(VCPU_TB_RMEXIT, offsetof(struct kvm_vcpu, arch.rm_exit));
+   DEFINE(VCPU_TB_GUEST, offsetof(struct kvm_vcpu, arch.guest_time));
+   DEFINE(VCPU_TB_CEDE, offsetof(struct kvm_vcpu, arch.cede_time));
+   DEFINE(VCPU_CUR_ACTIVITY, offsetof(struct kvm_vcpu, arch.cur_activity));
+   DEFINE(VCPU_ACTIVITY_START, offset

[PULL 14/21] KVM: PPC: Book3S HV: Move vcore preemption point up into kvmppc_run_vcpu

2015-04-21 Thread Alexander Graf
From: Paul Mackerras 

Rather than calling cond_resched() in kvmppc_run_core() before doing
the post-processing for the vcpus that we have just run (that is,
calling kvmppc_handle_exit_hv(), kvmppc_set_timer(), etc.), we now do
that post-processing before calling cond_resched(), and that post-
processing is moved out into its own function, post_guest_process().

The reschedule point is now in kvmppc_run_vcpu() and we define a new
vcore state, VCORE_PREEMPT, to indicate that the vcore's runner
task is runnable but not running.  (Doing the reschedule with the
vcore in VCORE_INACTIVE state would be bad because there are potentially
other vcpus waiting for the runner in kvmppc_wait_for_exec() which
then wouldn't get woken up.)

Also, we make use of the handy cond_resched_lock() function, which
unlocks and relocks vc->lock for us around the reschedule.
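
(Condensed into a sketch with made-up helpers, the idiom is:)

    static void process_under_lock(struct kvmppc_vcore *vc)
    {
            spin_lock(&vc->lock);
            while (more_work(vc)) {         /* hypothetical predicate */
                    do_one_item(vc);        /* hypothetical work item */
                    /* drops vc->lock, schedules, retakes vc->lock --
                     * but only if a reschedule is actually pending */
                    cond_resched_lock(&vc->lock);
            }
            spin_unlock(&vc->lock);
    }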

Signed-off-by: Paul Mackerras 
Signed-off-by: Alexander Graf 
---
 arch/powerpc/include/asm/kvm_host.h |  5 +-
 arch/powerpc/kvm/book3s_hv.c| 92 +
 2 files changed, 55 insertions(+), 42 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_host.h 
b/arch/powerpc/include/asm/kvm_host.h
index 3eecd88..83c4425 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -304,8 +304,9 @@ struct kvmppc_vcore {
 /* Values for vcore_state */
 #define VCORE_INACTIVE 0
 #define VCORE_SLEEPING 1
-#define VCORE_RUNNING  2
-#define VCORE_EXITING  3
+#define VCORE_PREEMPT  2
+#define VCORE_RUNNING  3
+#define VCORE_EXITING  4
 
 /*
  * Struct used to manage memory for a virtual processor area
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index b38c10e..fb4f166 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -1882,15 +1882,50 @@ static void prepare_threads(struct kvmppc_vcore *vc)
}
 }
 
+static void post_guest_process(struct kvmppc_vcore *vc)
+{
+   u64 now;
+   long ret;
+   struct kvm_vcpu *vcpu, *vnext;
+
+   now = get_tb();
+   list_for_each_entry_safe(vcpu, vnext, &vc->runnable_threads,
+arch.run_list) {
+   /* cancel pending dec exception if dec is positive */
+   if (now < vcpu->arch.dec_expires &&
+   kvmppc_core_pending_dec(vcpu))
+   kvmppc_core_dequeue_dec(vcpu);
+
+   trace_kvm_guest_exit(vcpu);
+
+   ret = RESUME_GUEST;
+   if (vcpu->arch.trap)
+   ret = kvmppc_handle_exit_hv(vcpu->arch.kvm_run, vcpu,
+   vcpu->arch.run_task);
+
+   vcpu->arch.ret = ret;
+   vcpu->arch.trap = 0;
+
+   if (vcpu->arch.ceded) {
+   if (!is_kvmppc_resume_guest(ret))
+   kvmppc_end_cede(vcpu);
+   else
+   kvmppc_set_timer(vcpu);
+   }
+   if (!is_kvmppc_resume_guest(vcpu->arch.ret)) {
+   kvmppc_remove_runnable(vc, vcpu);
+   wake_up(&vcpu->arch.cpu_run);
+   }
+   }
+}
+
 /*
  * Run a set of guest threads on a physical core.
  * Called with vc->lock held.
  */
 static void kvmppc_run_core(struct kvmppc_vcore *vc)
 {
-   struct kvm_vcpu *vcpu, *vnext;
-   long ret;
-   u64 now;
+   struct kvm_vcpu *vcpu;
int i;
int srcu_idx;
 
@@ -1922,8 +1957,11 @@ static void kvmppc_run_core(struct kvmppc_vcore *vc)
 */
if ((threads_per_core > 1) &&
((vc->num_threads > threads_per_subcore) || !on_primary_thread())) {
-   list_for_each_entry(vcpu, &vc->runnable_threads, arch.run_list)
+   list_for_each_entry(vcpu, &vc->runnable_threads, arch.run_list) 
{
vcpu->arch.ret = -EBUSY;
+   kvmppc_remove_runnable(vc, vcpu);
+   wake_up(&vcpu->arch.cpu_run);
+   }
goto out;
}
 
@@ -1979,44 +2017,12 @@ static void kvmppc_run_core(struct kvmppc_vcore *vc)
kvm_guest_exit();
 
preempt_enable();
-   cond_resched();
 
spin_lock(&vc->lock);
-   now = get_tb();
-   list_for_each_entry(vcpu, &vc->runnable_threads, arch.run_list) {
-   /* cancel pending dec exception if dec is positive */
-   if (now < vcpu->arch.dec_expires &&
-   kvmppc_core_pending_dec(vcpu))
-   kvmppc_core_dequeue_dec(vcpu);
-
-   trace_kvm_guest_exit(vcpu);
-
-   ret = RESUME_GUEST;
-   if (vcpu->arch.trap)
-   ret = kvmppc_handle_exit_hv(vcpu->arch.kvm_run, vcpu,
-

[PULL 07/21] KVM: PPC: Book3S HV: Convert ICS mutex lock to spin lock

2015-04-21 Thread Alexander Graf
From: Suresh Warrier 

Replaces the ICS mutex lock with a spin lock since we will be porting
these routines to real mode. Note that we need to disable interrupts
before we take the lock in anticipation of the fact that on the guest
side, we are running in the context of a hard irq and interrupts are
disabled (EE bit off) when the lock is acquired. Again, because we
will be acquiring the lock in hypervisor real mode, we need to use
an arch_spinlock_t instead of a normal spinlock here as we want to
avoid running any lockdep code (which may not be safe to execute in
real mode).
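
(Condensed, the pattern applied at each ICS access site is the
following sketch; the function itself is made up:)

    static void ics_update_locked(struct kvmppc_ics *ics)
    {
            unsigned long flags;

            local_irq_save(flags);          /* hard-disable interrupts first */
            arch_spin_lock(&ics->lock);     /* raw lock: no lockdep code runs */
            /* ... touch ics->irq_state[] here ... */
            arch_spin_unlock(&ics->lock);
            local_irq_restore(flags);
    }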

Signed-off-by: Suresh Warrier 
Signed-off-by: Paul Mackerras 
Signed-off-by: Alexander Graf 
---
 arch/powerpc/kvm/book3s_xics.c | 68 +-
 arch/powerpc/kvm/book3s_xics.h |  2 +-
 2 files changed, 48 insertions(+), 22 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_xics.c b/arch/powerpc/kvm/book3s_xics.c
index 60bdbac..5f7beebd 100644
--- a/arch/powerpc/kvm/book3s_xics.c
+++ b/arch/powerpc/kvm/book3s_xics.c
@@ -20,6 +20,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -39,7 +40,7 @@
  * LOCKING
  * ===
  *
- * Each ICS has a mutex protecting the information about the IRQ
+ * Each ICS has a spin lock protecting the information about the IRQ
  * sources and avoiding simultaneous deliveries if the same interrupt.
  *
  * ICP operations are done via a single compare & swap transaction
@@ -109,7 +110,10 @@ static void ics_check_resend(struct kvmppc_xics *xics, 
struct kvmppc_ics *ics,
 {
int i;
 
-   mutex_lock(&ics->lock);
+   unsigned long flags;
+
+   local_irq_save(flags);
+   arch_spin_lock(&ics->lock);
 
for (i = 0; i < KVMPPC_XICS_IRQ_PER_ICS; i++) {
struct ics_irq_state *state = &ics->irq_state[i];
@@ -120,12 +124,15 @@ static void ics_check_resend(struct kvmppc_xics *xics, 
struct kvmppc_ics *ics,
XICS_DBG("resend %#x prio %#x\n", state->number,
  state->priority);
 
-   mutex_unlock(&ics->lock);
+   arch_spin_unlock(&ics->lock);
+   local_irq_restore(flags);
icp_deliver_irq(xics, icp, state->number);
-   mutex_lock(&ics->lock);
+   local_irq_save(flags);
+   arch_spin_lock(&ics->lock);
}
 
-   mutex_unlock(&ics->lock);
+   arch_spin_unlock(&ics->lock);
+   local_irq_restore(flags);
 }
 
 static bool write_xive(struct kvmppc_xics *xics, struct kvmppc_ics *ics,
@@ -133,8 +140,10 @@ static bool write_xive(struct kvmppc_xics *xics, struct 
kvmppc_ics *ics,
   u32 server, u32 priority, u32 saved_priority)
 {
bool deliver;
+   unsigned long flags;
 
-   mutex_lock(&ics->lock);
+   local_irq_save(flags);
+   arch_spin_lock(&ics->lock);
 
state->server = server;
state->priority = priority;
@@ -145,7 +154,8 @@ static bool write_xive(struct kvmppc_xics *xics, struct 
kvmppc_ics *ics,
deliver = true;
}
 
-   mutex_unlock(&ics->lock);
+   arch_spin_unlock(&ics->lock);
+   local_irq_restore(flags);
 
return deliver;
 }
@@ -186,6 +196,7 @@ int kvmppc_xics_get_xive(struct kvm *kvm, u32 irq, u32 
*server, u32 *priority)
struct kvmppc_ics *ics;
struct ics_irq_state *state;
u16 src;
+   unsigned long flags;
 
if (!xics)
return -ENODEV;
@@ -195,10 +206,12 @@ int kvmppc_xics_get_xive(struct kvm *kvm, u32 irq, u32 
*server, u32 *priority)
return -EINVAL;
state = &ics->irq_state[src];
 
-   mutex_lock(&ics->lock);
+   local_irq_save(flags);
+   arch_spin_lock(&ics->lock);
*server = state->server;
*priority = state->priority;
-   mutex_unlock(&ics->lock);
+   arch_spin_unlock(&ics->lock);
+   local_irq_restore(flags);
 
return 0;
 }
@@ -365,6 +378,7 @@ static void icp_deliver_irq(struct kvmppc_xics *xics, 
struct kvmppc_icp *icp,
struct kvmppc_ics *ics;
u32 reject;
u16 src;
+   unsigned long flags;
 
/*
 * This is used both for initial delivery of an interrupt and
@@ -391,7 +405,8 @@ static void icp_deliver_irq(struct kvmppc_xics *xics, 
struct kvmppc_icp *icp,
state = &ics->irq_state[src];
 
/* Get a lock on the ICS */
-   mutex_lock(&ics->lock);
+   local_irq_save(flags);
+   arch_spin_lock(&ics->lock);
 
/* Get our server */
if (!icp || state->server != icp->server_num) {
@@ -434,7 +449,7 @@ static void icp_deliver_irq(struct kvmppc_xics *xics, 
struct kvmppc_icp *icp,
 *
 * Note that if successful, the new delivery might have itself
 * rejected an interrupt th

[PULL 09/21] KVM: PPC: Book3S HV: Add ICP real mode counters

2015-04-21 Thread Alexander Graf
From: Suresh Warrier 

Add two counters to count how often we generate real-mode ICS resend
and reject events. The counters provide some performance statistics
that could be used in the future to consider if the real mode functions
need further optimizing. The counters are displayed as part of ICS and
ICP state provided by /sys/kernel/debug/powerpc/kvm* for each VM.

Also added two counters that count (approximately) how many times we
don't find an ICP or ICS we're looking for. These are not currently
exposed through sysfs, but can be useful when debugging crashes.

Signed-off-by: Suresh Warrier 
Signed-off-by: Paul Mackerras 
Signed-off-by: Alexander Graf 
---
 arch/powerpc/kvm/book3s_hv_rm_xics.c |  7 +++
 arch/powerpc/kvm/book3s_xics.c   | 10 --
 arch/powerpc/kvm/book3s_xics.h   |  5 +
 3 files changed, 20 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv_rm_xics.c 
b/arch/powerpc/kvm/book3s_hv_rm_xics.c
index 73bbe92..6dded8c 100644
--- a/arch/powerpc/kvm/book3s_hv_rm_xics.c
+++ b/arch/powerpc/kvm/book3s_hv_rm_xics.c
@@ -227,6 +227,7 @@ static void icp_rm_deliver_irq(struct kvmppc_xics *xics, 
struct kvmppc_icp *icp,
ics = kvmppc_xics_find_ics(xics, new_irq, &src);
if (!ics) {
/* Unsafe increment, but this does not need to be accurate */
+   xics->err_noics++;
return;
}
state = &ics->irq_state[src];
@@ -239,6 +240,7 @@ static void icp_rm_deliver_irq(struct kvmppc_xics *xics, 
struct kvmppc_icp *icp,
icp = kvmppc_xics_find_server(xics->kvm, state->server);
if (!icp) {
/* Unsafe increment again*/
+   xics->err_noicp++;
goto out;
}
}
@@ -383,6 +385,7 @@ static void icp_rm_down_cppr(struct kvmppc_xics *xics, 
struct kvmppc_icp *icp,
 * separately here as well.
 */
if (resend) {
+   icp->n_check_resend++;
icp_rm_check_resend(xics, icp);
}
 }
@@ -500,11 +503,13 @@ int kvmppc_rm_h_ipi(struct kvm_vcpu *vcpu, unsigned long 
server,
 
/* Handle reject in real mode */
if (reject && reject != XICS_IPI) {
+   this_icp->n_reject++;
icp_rm_deliver_irq(xics, icp, reject);
}
 
/* Handle resends in real mode */
if (resend) {
+   this_icp->n_check_resend++;
icp_rm_check_resend(xics, icp);
}
 
@@ -566,6 +571,7 @@ int kvmppc_rm_h_cppr(struct kvm_vcpu *vcpu, unsigned long 
cppr)
 * attempt (see comments in icp_rm_deliver_irq).
 */
if (reject && reject != XICS_IPI) {
+   icp->n_reject++;
icp_rm_deliver_irq(xics, icp, reject);
}
  bail:
@@ -616,6 +622,7 @@ int kvmppc_rm_h_eoi(struct kvm_vcpu *vcpu, unsigned long 
xirr)
 
/* Still asserted, resend it */
if (state->asserted) {
+   icp->n_reject++;
icp_rm_deliver_irq(xics, icp, irq);
}
 
diff --git a/arch/powerpc/kvm/book3s_xics.c b/arch/powerpc/kvm/book3s_xics.c
index 5f7beebd..8f3e6cc 100644
--- a/arch/powerpc/kvm/book3s_xics.c
+++ b/arch/powerpc/kvm/book3s_xics.c
@@ -901,6 +901,7 @@ static int xics_debug_show(struct seq_file *m, void 
*private)
unsigned long flags;
unsigned long t_rm_kick_vcpu, t_rm_check_resend;
unsigned long t_rm_reject, t_rm_notify_eoi;
+   unsigned long t_reject, t_check_resend;
 
if (!kvm)
return 0;
@@ -909,6 +910,8 @@ static int xics_debug_show(struct seq_file *m, void 
*private)
t_rm_notify_eoi = 0;
t_rm_check_resend = 0;
t_rm_reject = 0;
+   t_check_resend = 0;
+   t_reject = 0;
 
seq_printf(m, "=\nICP state\n=\n");
 
@@ -928,12 +931,15 @@ static int xics_debug_show(struct seq_file *m, void 
*private)
t_rm_notify_eoi += icp->n_rm_notify_eoi;
t_rm_check_resend += icp->n_rm_check_resend;
t_rm_reject += icp->n_rm_reject;
+   t_check_resend += icp->n_check_resend;
+   t_reject += icp->n_reject;
}
 
-   seq_puts(m, "ICP Guest Real Mode exit totals: ");
-   seq_printf(m, "\tkick_vcpu=%lu check_resend=%lu reject=%lu 
notify_eoi=%lu\n",
+   seq_printf(m, "ICP Guest->Host totals: kick_vcpu=%lu check_resend=%lu 
reject=%lu notify_eoi=%lu\n",
t_rm_kick_vcpu, t_rm_check_resend,
t_rm_reject, t_rm_notify_eoi);
+   seq_printf(m, "ICP Real Mode totals: check_resend=%lu resend=%lu\n",
+   t_check_resend, t_reject);
for (icsid = 0; icsid <= KVMPPC_XICS_MAX_ICS_ID; icsid++) {
struct kvmppc_ics *ics = xics->ics[icsid];
 
diff --g

[PULL 06/21] KVM: PPC: Book3S HV: Add guest->host real mode completion counters

2015-04-21 Thread Alexander Graf
From: "Suresh E. Warrier" 

Add counters to track number of times we switch from guest real mode
to host virtual mode during an interrupt-related hyper call because the
hypercall requires actions that cannot be completed in real mode. This
will help when making optimizations that reduce guest-host transitions.

It is safe to use an ordinary increment rather than an atomic operation
because there is one ICP per virtual CPU and kvmppc_xics_rm_complete()
only works on the ICP for the current VCPU.

The counters are displayed as part of ICS and ICP state provided by
/sys/kernel/debug/powerpc/kvm* for each VM.

Signed-off-by: Suresh Warrier 
Signed-off-by: Paul Mackerras 
Signed-off-by: Alexander Graf 
---
 arch/powerpc/kvm/book3s_xics.c | 31 +++
 arch/powerpc/kvm/book3s_xics.h |  6 ++
 2 files changed, 33 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_xics.c b/arch/powerpc/kvm/book3s_xics.c
index a4a8d9f..60bdbac 100644
--- a/arch/powerpc/kvm/book3s_xics.c
+++ b/arch/powerpc/kvm/book3s_xics.c
@@ -802,14 +802,22 @@ static noinline int kvmppc_xics_rm_complete(struct 
kvm_vcpu *vcpu, u32 hcall)
XICS_DBG("XICS_RM: H_%x completing, act: %x state: %lx tgt: %p\n",
 hcall, icp->rm_action, icp->rm_dbgstate.raw, icp->rm_dbgtgt);
 
-   if (icp->rm_action & XICS_RM_KICK_VCPU)
+   if (icp->rm_action & XICS_RM_KICK_VCPU) {
+   icp->n_rm_kick_vcpu++;
kvmppc_fast_vcpu_kick(icp->rm_kick_target);
-   if (icp->rm_action & XICS_RM_CHECK_RESEND)
+   }
+   if (icp->rm_action & XICS_RM_CHECK_RESEND) {
+   icp->n_rm_check_resend++;
icp_check_resend(xics, icp->rm_resend_icp);
-   if (icp->rm_action & XICS_RM_REJECT)
+   }
+   if (icp->rm_action & XICS_RM_REJECT) {
+   icp->n_rm_reject++;
icp_deliver_irq(xics, icp, icp->rm_reject);
-   if (icp->rm_action & XICS_RM_NOTIFY_EOI)
+   }
+   if (icp->rm_action & XICS_RM_NOTIFY_EOI) {
+   icp->n_rm_notify_eoi++;
kvm_notify_acked_irq(vcpu->kvm, 0, icp->rm_eoied_irq);
+   }
 
icp->rm_action = 0;
 
@@ -872,10 +880,17 @@ static int xics_debug_show(struct seq_file *m, void 
*private)
struct kvm *kvm = xics->kvm;
struct kvm_vcpu *vcpu;
int icsid, i;
+   unsigned long t_rm_kick_vcpu, t_rm_check_resend;
+   unsigned long t_rm_reject, t_rm_notify_eoi;
 
if (!kvm)
return 0;
 
+   t_rm_kick_vcpu = 0;
+   t_rm_notify_eoi = 0;
+   t_rm_check_resend = 0;
+   t_rm_reject = 0;
+
seq_printf(m, "=\nICP state\n=\n");
 
kvm_for_each_vcpu(i, vcpu, kvm) {
@@ -890,8 +905,16 @@ static int xics_debug_show(struct seq_file *m, void 
*private)
   icp->server_num, state.xisr,
   state.pending_pri, state.cppr, state.mfrr,
   state.out_ee, state.need_resend);
+   t_rm_kick_vcpu += icp->n_rm_kick_vcpu;
+   t_rm_notify_eoi += icp->n_rm_notify_eoi;
+   t_rm_check_resend += icp->n_rm_check_resend;
+   t_rm_reject += icp->n_rm_reject;
}
 
+   seq_puts(m, "ICP Guest Real Mode exit totals: ");
+   seq_printf(m, "\tkick_vcpu=%lu check_resend=%lu reject=%lu 
notify_eoi=%lu\n",
+   t_rm_kick_vcpu, t_rm_check_resend,
+   t_rm_reject, t_rm_notify_eoi);
for (icsid = 0; icsid <= KVMPPC_XICS_MAX_ICS_ID; icsid++) {
struct kvmppc_ics *ics = xics->ics[icsid];
 
diff --git a/arch/powerpc/kvm/book3s_xics.h b/arch/powerpc/kvm/book3s_xics.h
index 73f0f27..de970ec 100644
--- a/arch/powerpc/kvm/book3s_xics.h
+++ b/arch/powerpc/kvm/book3s_xics.h
@@ -78,6 +78,12 @@ struct kvmppc_icp {
u32  rm_reject;
u32  rm_eoied_irq;
 
+   /* Counters for each reason we exited real mode */
+   unsigned long n_rm_kick_vcpu;
+   unsigned long n_rm_check_resend;
+   unsigned long n_rm_reject;
+   unsigned long n_rm_notify_eoi;
+
/* Debug stuff for real mode */
union kvmppc_icp_state rm_dbgstate;
struct kvm_vcpu *rm_dbgtgt;
-- 
1.8.1.4



[PULL 12/21] KVM: PPC: Book3S HV: Simplify handling of VCPUs that need a VPA update

2015-04-21 Thread Alexander Graf
From: Paul Mackerras 

Previously, if kvmppc_run_core() was running a VCPU that needed a VPA
update (i.e. one of its 3 virtual processor areas needed to be pinned
in memory so the host real mode code can update it on guest entry and
exit), we would drop the vcore lock and do the update there and then.
Future changes will make it inconvenient to drop the lock, so instead
we now remove the VCPU from the list of runnable VCPUs and wake up its
VCPU task.  This will have the effect that the VCPU task will exit
kvmppc_run_vcpu(), go around the do loop in kvmppc_vcpu_run_hv(), and
re-enter kvmppc_run_vcpu(), whereupon it will do the necessary call
to kvmppc_update_vpas() and then rejoin the vcore.

The one complication is that the runner VCPU (whose VCPU task is the
current task) might be one of the ones that gets removed from the
runnable list.  In that case we just return from kvmppc_run_core()
and let the code in kvmppc_run_vcpu() wake up another VCPU task to be
the runner if necessary.

This all means that the VCORE_STARTING state is no longer used, so we
remove it.

Signed-off-by: Paul Mackerras 
Signed-off-by: Alexander Graf 
---
 arch/powerpc/include/asm/kvm_host.h |  5 ++--
 arch/powerpc/kvm/book3s_hv.c| 56 -
 2 files changed, 32 insertions(+), 29 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_host.h 
b/arch/powerpc/include/asm/kvm_host.h
index d2068bb..2f339ff 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -306,9 +306,8 @@ struct kvmppc_vcore {
 /* Values for vcore_state */
 #define VCORE_INACTIVE 0
 #define VCORE_SLEEPING 1
-#define VCORE_STARTING 2
-#define VCORE_RUNNING  3
-#define VCORE_EXITING  4
+#define VCORE_RUNNING  2
+#define VCORE_EXITING  3
 
 /*
  * Struct used to manage memory for a virtual processor area
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 64a02d4..b38c10e 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -1863,6 +1863,25 @@ static void kvmppc_start_restoring_l2_cache(const struct 
kvmppc_vcore *vc)
mtspr(SPRN_MPPR, mpp_addr | PPC_MPPR_FETCH_WHOLE_TABLE);
 }
 
+static void prepare_threads(struct kvmppc_vcore *vc)
+{
+   struct kvm_vcpu *vcpu, *vnext;
+
+   list_for_each_entry_safe(vcpu, vnext, &vc->runnable_threads,
+arch.run_list) {
+   if (signal_pending(vcpu->arch.run_task))
+   vcpu->arch.ret = -EINTR;
+   else if (vcpu->arch.vpa.update_pending ||
+vcpu->arch.slb_shadow.update_pending ||
+vcpu->arch.dtl.update_pending)
+   vcpu->arch.ret = RESUME_GUEST;
+   else
+   continue;
+   kvmppc_remove_runnable(vc, vcpu);
+   wake_up(&vcpu->arch.cpu_run);
+   }
+}
+
 /*
  * Run a set of guest threads on a physical core.
  * Called with vc->lock held.
@@ -1872,46 +1891,31 @@ static void kvmppc_run_core(struct kvmppc_vcore *vc)
struct kvm_vcpu *vcpu, *vnext;
long ret;
u64 now;
-   int i, need_vpa_update;
+   int i;
int srcu_idx;
-   struct kvm_vcpu *vcpus_to_update[threads_per_core];
 
-   /* don't start if any threads have a signal pending */
-   need_vpa_update = 0;
-   list_for_each_entry(vcpu, &vc->runnable_threads, arch.run_list) {
-   if (signal_pending(vcpu->arch.run_task))
-   return;
-   if (vcpu->arch.vpa.update_pending ||
-   vcpu->arch.slb_shadow.update_pending ||
-   vcpu->arch.dtl.update_pending)
-   vcpus_to_update[need_vpa_update++] = vcpu;
-   }
+   /*
+* Remove from the list any threads that have a signal pending
+* or need a VPA update done
+*/
+   prepare_threads(vc);
+
+   /* if the runner is no longer runnable, let the caller pick a new one */
+   if (vc->runner->arch.state != KVMPPC_VCPU_RUNNABLE)
+   return;
 
/*
-* Initialize *vc, in particular vc->vcore_state, so we can
-* drop the vcore lock if necessary.
+* Initialize *vc.
 */
vc->n_woken = 0;
vc->nap_count = 0;
vc->entry_exit_count = 0;
vc->preempt_tb = TB_NIL;
-   vc->vcore_state = VCORE_STARTING;
vc->in_guest = 0;
vc->napping_threads = 0;
vc->conferring_threads = 0;
 
/*
-* Updating any of the vpas requires calling kvmppc_pin_guest_page,
-* which can't be called with any spinlocks held.
-*/
-   if (need_vpa_update) {
-   spin_unlock(&vc->lock);
-   for (i = 0; i < need_vpa_update; ++i)
-   kvmppc_update_vpas(vcpus_to_update[i]);
-   sp

[PULL 13/21] KVM: PPC: Book3S HV: Minor cleanups

2015-04-21 Thread Alexander Graf
From: Paul Mackerras 

* Remove unused kvmppc_vcore::n_busy field.
* Remove setting of RMOR, since it was only used on PPC970 and the
  PPC970 KVM support has been removed.
* Don't use r1 or r2 in setting the runlatch since they are
  conventionally reserved for other things; use r0 instead.
* Streamline the code a little and remove the ext_interrupt_to_host
  label.
* Add some comments about register usage.
* hcall_try_real_mode doesn't need to be global, and can't be
  called from C code anyway.

Signed-off-by: Paul Mackerras 
Signed-off-by: Alexander Graf 
---
 arch/powerpc/include/asm/kvm_host.h |  2 --
 arch/powerpc/kernel/asm-offsets.c   |  1 -
 arch/powerpc/kvm/book3s_hv_rmhandlers.S | 44 ++---
 3 files changed, 19 insertions(+), 28 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_host.h 
b/arch/powerpc/include/asm/kvm_host.h
index 2f339ff..3eecd88 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -227,7 +227,6 @@ struct kvm_arch {
unsigned long host_sdr1;
int tlbie_lock;
unsigned long lpcr;
-   unsigned long rmor;
unsigned long vrma_slb_v;
int hpte_setup_done;
u32 hpt_order;
@@ -271,7 +270,6 @@ struct kvm_arch {
  */
 struct kvmppc_vcore {
int n_runnable;
-   int n_busy;
int num_threads;
int entry_exit_count;
int n_woken;
diff --git a/arch/powerpc/kernel/asm-offsets.c 
b/arch/powerpc/kernel/asm-offsets.c
index 3fea721..92ec3fc 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -505,7 +505,6 @@ int main(void)
DEFINE(KVM_NEED_FLUSH, offsetof(struct kvm, arch.need_tlb_flush.bits));
DEFINE(KVM_ENABLED_HCALLS, offsetof(struct kvm, arch.enabled_hcalls));
DEFINE(KVM_LPCR, offsetof(struct kvm, arch.lpcr));
-   DEFINE(KVM_RMOR, offsetof(struct kvm, arch.rmor));
DEFINE(KVM_VRMA_SLB_V, offsetof(struct kvm, arch.vrma_slb_v));
DEFINE(VCPU_DSISR, offsetof(struct kvm_vcpu, arch.shregs.dsisr));
DEFINE(VCPU_DAR, offsetof(struct kvm_vcpu, arch.shregs.dar));
diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S 
b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
index b06fe53..f8267e5 100644
--- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
+++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
@@ -245,9 +245,9 @@ kvm_novcpu_exit:
 kvm_start_guest:
 
/* Set runlatch bit the minute you wake up from nap */
-   mfspr   r1, SPRN_CTRLF
-   ori r1, r1, 1
-   mtspr   SPRN_CTRLT, r1
+   mfspr   r0, SPRN_CTRLF
+   ori r0, r0, 1
+   mtspr   SPRN_CTRLT, r0
 
ld  r2,PACATOC(r13)
 
@@ -493,11 +493,9 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_207S)
cmpwi   r0,0
beq 20b
 
-   /* Set LPCR and RMOR. */
+   /* Set LPCR. */
 10:ld  r8,VCORE_LPCR(r5)
mtspr   SPRN_LPCR,r8
-   ld  r8,KVM_RMOR(r9)
-   mtspr   SPRN_RMOR,r8
isync
 
/* Check if HDEC expires soon */
@@ -1075,7 +1073,8 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
bne 2f
mfspr   r3,SPRN_HDEC
cmpwi   r3,0
-   bge ignore_hdec
+   mr  r4,r9
+   bge fast_guest_return
 2:
/* See if this is an hcall we can handle in real mode */
cmpwi   r12,BOOK3S_INTERRUPT_SYSCALL
@@ -1083,26 +1082,21 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
 
/* External interrupt ? */
cmpwi   r12, BOOK3S_INTERRUPT_EXTERNAL
-   bne+ext_interrupt_to_host
+   bne+guest_exit_cont
 
/* External interrupt, first check for host_ipi. If this is
 * set, we know the host wants us out so let's do it now
 */
bl  kvmppc_read_intr
cmpdi   r3, 0
-   bgt ext_interrupt_to_host
+   bgt guest_exit_cont
 
/* Check if any CPU is heading out to the host, if so head out too */
ld  r5, HSTATE_KVM_VCORE(r13)
lwz r0, VCORE_ENTRY_EXIT(r5)
cmpwi   r0, 0x100
-   bge ext_interrupt_to_host
-
-   /* Return to guest after delivering any pending interrupt */
mr  r4, r9
-   b   deliver_guest_interrupt
-
-ext_interrupt_to_host:
+   blt deliver_guest_interrupt
 
 guest_exit_cont:   /* r9 = vcpu, r12 = trap, r13 = paca */
/* Save more register state  */
@@ -1763,8 +1757,10 @@ kvmppc_hisi:
  * Returns to the guest if we handle it, or continues on up to
  * the kernel if we can't (i.e. if we don't have a handler for
  * it, or if the handler returns H_TOO_HARD).
+ *
+ * r5 - r8 contain hcall args,
+ * r9 = vcpu, r10 = pc, r11 = msr, r12 = trap, r13 = paca
  */
-   .globl  hcall_try_real_mode
 hcall_try_real_mode:
ld  r3,VCPU_GPR(R3)(r9)
andi.   r0,r11,MSR_PR
@@ -2024,10 +2020,6 @@ hcall_real_table:
.globl  hcall_real_table_end
 hcall_real_table_end:
 
-ignore_hdec:
-

[PULL 20/21] KVM: PPC: Book3S HV: Translate kvmhv_commence_exit to C

2015-04-21 Thread Alexander Graf
From: Paul Mackerras 

This replaces the assembler code for kvmhv_commence_exit() with C code
in book3s_hv_builtin.c.  It also moves the IPI sending code that was
in book3s_hv_rm_xics.c into a new kvmhv_rm_send_ipi() function so it
can be used by kvmhv_commence_exit() as well as icp_rm_set_vcpu_irq().

Signed-off-by: Paul Mackerras 
Signed-off-by: Alexander Graf 
---
 arch/powerpc/include/asm/kvm_book3s_64.h |  2 +
 arch/powerpc/kvm/book3s_hv_builtin.c | 63 ++
 arch/powerpc/kvm/book3s_hv_rm_xics.c | 12 +-
 arch/powerpc/kvm/book3s_hv_rmhandlers.S  | 66 
 4 files changed, 75 insertions(+), 68 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_book3s_64.h 
b/arch/powerpc/include/asm/kvm_book3s_64.h
index 869c53f..2b84e48 100644
--- a/arch/powerpc/include/asm/kvm_book3s_64.h
+++ b/arch/powerpc/include/asm/kvm_book3s_64.h
@@ -438,6 +438,8 @@ static inline struct kvm_memslots *kvm_memslots_raw(struct 
kvm *kvm)
 
 extern void kvmppc_mmu_debugfs_init(struct kvm *kvm);
 
+extern void kvmhv_rm_send_ipi(int cpu);
+
 #endif /* CONFIG_KVM_BOOK3S_HV_POSSIBLE */
 
 #endif /* __ASM_KVM_BOOK3S_64_H__ */
diff --git a/arch/powerpc/kvm/book3s_hv_builtin.c 
b/arch/powerpc/kvm/book3s_hv_builtin.c
index 2754251..c42aa55 100644
--- a/arch/powerpc/kvm/book3s_hv_builtin.c
+++ b/arch/powerpc/kvm/book3s_hv_builtin.c
@@ -22,6 +22,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #define KVM_CMA_CHUNK_ORDER18
 
@@ -184,3 +185,65 @@ long kvmppc_h_random(struct kvm_vcpu *vcpu)
 
return H_HARDWARE;
 }
+
+static inline void rm_writeb(unsigned long paddr, u8 val)
+{
+   __asm__ __volatile__("stbcix %0,0,%1"
+   : : "r" (val), "r" (paddr) : "memory");
+}
+
+/*
+ * Send an interrupt to another CPU.
+ * This can only be called in real mode.
+ * The caller needs to include any barrier needed to order writes
+ * to memory vs. the IPI/message.
+ */
+void kvmhv_rm_send_ipi(int cpu)
+{
+   unsigned long xics_phys;
+
+   /* Poke the target */
+   xics_phys = paca[cpu].kvm_hstate.xics_phys;
+   rm_writeb(xics_phys + XICS_MFRR, IPI_PRIORITY);
+}
+
+/*
+ * The following functions are called from the assembly code
+ * in book3s_hv_rmhandlers.S.
+ */
+static void kvmhv_interrupt_vcore(struct kvmppc_vcore *vc, int active)
+{
+   int cpu = vc->pcpu;
+
+   /* Order setting of exit map vs. msgsnd/IPI */
+   smp_mb();
+   for (; active; active >>= 1, ++cpu)
+   if (active & 1)
+   kvmhv_rm_send_ipi(cpu);
+}
+
+void kvmhv_commence_exit(int trap)
+{
+   struct kvmppc_vcore *vc = local_paca->kvm_hstate.kvm_vcore;
+   int ptid = local_paca->kvm_hstate.ptid;
+   int me, ee;
+
+   /* Set our bit in the threads-exiting-guest map in the 0xff00
+  bits of vcore->entry_exit_map */
+   me = 0x100 << ptid;
+   do {
+   ee = vc->entry_exit_map;
+   } while (cmpxchg(&vc->entry_exit_map, ee, ee | me) != ee);
+
+   /* Are we the first here? */
+   if ((ee >> 8) != 0)
+   return;
+
+   /*
+* Trigger the other threads in this vcore to exit the guest.
+* If this is a hypervisor decrementer interrupt then they
+* will be already on their way out of the guest.
+*/
+   if (trap != BOOK3S_INTERRUPT_HV_DECREMENTER)
+   kvmhv_interrupt_vcore(vc, ee & ~(1 << ptid));
+}
diff --git a/arch/powerpc/kvm/book3s_hv_rm_xics.c 
b/arch/powerpc/kvm/book3s_hv_rm_xics.c
index 6dded8c..00e45b6 100644
--- a/arch/powerpc/kvm/book3s_hv_rm_xics.c
+++ b/arch/powerpc/kvm/book3s_hv_rm_xics.c
@@ -26,12 +26,6 @@
 static void icp_rm_deliver_irq(struct kvmppc_xics *xics, struct kvmppc_icp 
*icp,
u32 new_irq);
 
-static inline void rm_writeb(unsigned long paddr, u8 val)
-{
-   __asm__ __volatile__("sync; stbcix %0,0,%1"
-   : : "r" (val), "r" (paddr) : "memory");
-}
-
 /* -- ICS routines -- */
 static void ics_rm_check_resend(struct kvmppc_xics *xics,
struct kvmppc_ics *ics, struct kvmppc_icp *icp)
@@ -60,7 +54,6 @@ static void icp_rm_set_vcpu_irq(struct kvm_vcpu *vcpu,
struct kvm_vcpu *this_vcpu)
 {
struct kvmppc_icp *this_icp = this_vcpu->arch.icp;
-   unsigned long xics_phys;
int cpu;
 
/* Mark the target VCPU as having an interrupt pending */
@@ -83,9 +76,8 @@ static void icp_rm_set_vcpu_irq(struct kvm_vcpu *vcpu,
/* In SMT cpu will always point to thread 0, we adjust it */
cpu += vcpu->arch.ptid;
 
-   /* Not too hard, then poke the target */
-   xics_phys = paca[cpu].kvm_hstate.xics_phys;
-   rm_writeb(xics_phys + XICS_MFRR, IPI_PRIORITY);
+   smp_mb();
+   kvmhv_rm_send_ipi(cpu);
 }
 
 static void icp_rm_clr_v

[PULL 16/21] KVM: PPC: Book3S HV: Don't wake thread with no vcpu on guest IPI

2015-04-21 Thread Alexander Graf
From: Paul Mackerras 

When running a multi-threaded guest and vcpu 0 in a virtual core
is not running in the guest (i.e. it is busy elsewhere in the host),
thread 0 of the physical core will switch the MMU to the guest and
then go to nap mode in the code at kvm_do_nap.  If the guest sends
an IPI to thread 0 using the msgsndp instruction, that will wake
up thread 0 and cause all the threads in the guest to exit to the
host unnecessarily.  To avoid the unnecessary exit, this arranges
for the PECEDP bit to be cleared in this situation.  When napping
due to a H_CEDE from the guest, we still set PECEDP so that the
thread will wake up on an IPI sent using msgsndp.
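
As a rough C rendering of what this amounts to (illustrative only; the
patch does this in assembly, and pecedp_mask stands for the value the
caller passes in r3, either 0 or LPCR_PECEDP):

        unsigned long lpcr;

        /* Always wake on decrementer/external (PECE0|PECE1); wake on a
         * privileged (OS) doorbell (PECEDP) only when the caller asked
         * for it, i.e. when napping because of H_CEDE. */
        lpcr = mfspr(SPRN_LPCR) | LPCR_PECE0 | LPCR_PECE1;
        lpcr = (lpcr & ~LPCR_PECEDP) | (pecedp_mask & LPCR_PECEDP);
        mtspr(SPRN_LPCR, lpcr);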

Signed-off-by: Paul Mackerras 
Signed-off-by: Alexander Graf 
---
 arch/powerpc/kvm/book3s_hv_rmhandlers.S | 10 +++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S 
b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
index 6716db3..12d7e4c 100644
--- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
+++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
@@ -191,6 +191,7 @@ kvmppc_primary_no_guest:
li  r3, NAPPING_NOVCPU
stb r3, HSTATE_NAPPING(r13)
 
+   li  r3, 0   /* Don't wake on privileged (OS) doorbell */
b   kvm_do_nap
 
 kvm_novcpu_wakeup:
@@ -2129,10 +2130,13 @@ _GLOBAL(kvmppc_h_cede)  /* r3 = vcpu pointer, 
r11 = msr, r13 = paca */
bl  kvmhv_accumulate_time
 #endif
 
+   lis r3, LPCR_PECEDP@h   /* Do wake on privileged doorbell */
+
/*
 * Take a nap until a decrementer or external or doorbell interrupt
-* occurs, with PECE1, PECE0 and PECEDP set in LPCR. Also clear the
-* runlatch bit before napping.
+* occurs, with PECE1 and PECE0 set in LPCR.
+* On POWER8, if we are ceding, also set PECEDP.
+* Also clear the runlatch bit before napping.
 */
 kvm_do_nap:
mfspr   r0, SPRN_CTRLF
@@ -2144,7 +2148,7 @@ kvm_do_nap:
mfspr   r5,SPRN_LPCR
ori r5,r5,LPCR_PECE0 | LPCR_PECE1
 BEGIN_FTR_SECTION
-   orisr5,r5,LPCR_PECEDP@h
+   rlwimi  r5, r3, 0, LPCR_PECEDP
 END_FTR_SECTION_IFSET(CPU_FTR_ARCH_207S)
mtspr   SPRN_LPCR,r5
isync
-- 
1.8.1.4



[PULL 03/21] KVM: PPC: Book3S HV: Add fast real-mode H_RANDOM implementation.

2015-04-21 Thread Alexander Graf
From: Michael Ellerman 

Some PowerNV systems include a hardware random-number generator.
This HWRNG is present on POWER7+ and POWER8 chips and is capable of
generating one 64-bit random number every microsecond.  The random
numbers are produced by sampling a set of 64 unstable high-frequency
oscillators and are almost completely entropic.

PAPR defines an H_RANDOM hypercall which guests can use to obtain one
64-bit random sample from the HWRNG.  This adds a real-mode
implementation of the H_RANDOM hypercall.  This hypercall was
implemented in real mode because the latency of reading the HWRNG is
generally small compared to the latency of a guest exit and entry for
all the threads in the same virtual core.

Userspace can detect the presence of the HWRNG and the H_RANDOM
implementation by querying the KVM_CAP_PPC_HWRNG capability.  The
H_RANDOM hypercall implementation will only be invoked when the guest
does an H_RANDOM hypercall if userspace first enables the in-kernel
H_RANDOM implementation using the KVM_CAP_PPC_ENABLE_HCALL capability.
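
As a concrete illustration of that last step, a minimal userspace
sketch (not part of this patch; vm_fd and error handling are assumed):

        struct kvm_enable_cap cap = { .cap = KVM_CAP_PPC_ENABLE_HCALL };

        if (ioctl(vm_fd, KVM_CHECK_EXTENSION, KVM_CAP_PPC_HWRNG) > 0) {
                cap.args[0] = H_RANDOM; /* hcall number to enable */
                cap.args[1] = 1;        /* non-zero enables the handler */
                ioctl(vm_fd, KVM_ENABLE_CAP, &cap);
        }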

Signed-off-by: Michael Ellerman 
Signed-off-by: Paul Mackerras 
Signed-off-by: Alexander Graf 
---
 Documentation/virtual/kvm/api.txt   |  17 +
 arch/powerpc/include/asm/archrandom.h   |  11 ++-
 arch/powerpc/include/asm/kvm_ppc.h  |   2 +
 arch/powerpc/kvm/book3s_hv_builtin.c|  15 +
 arch/powerpc/kvm/book3s_hv_rmhandlers.S | 115 
 arch/powerpc/kvm/powerpc.c  |   3 +
 arch/powerpc/platforms/powernv/rng.c|  29 
 include/uapi/linux/kvm.h|   1 +
 8 files changed, 191 insertions(+), 2 deletions(-)

diff --git a/Documentation/virtual/kvm/api.txt 
b/Documentation/virtual/kvm/api.txt
index bc9f6fe..9fa2bf8 100644
--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -3573,3 +3573,20 @@ struct {
 @ar   - access register number
 
 KVM handlers should exit to userspace with rc = -EREMOTE.
+
+
+8. Other capabilities.
+--
+
+This section lists capabilities that give information about other
+features of the KVM implementation.
+
+8.1 KVM_CAP_PPC_HWRNG
+
+Architectures: ppc
+
+This capability, if KVM_CHECK_EXTENSION indicates that it is
+available, means that the kernel has an implementation of the
+H_RANDOM hypercall backed by a hardware random-number generator.
+If present, the kernel H_RANDOM handler can be enabled for guest use
+with the KVM_CAP_PPC_ENABLE_HCALL capability.
diff --git a/arch/powerpc/include/asm/archrandom.h 
b/arch/powerpc/include/asm/archrandom.h
index bde5311..0cc6eed 100644
--- a/arch/powerpc/include/asm/archrandom.h
+++ b/arch/powerpc/include/asm/archrandom.h
@@ -30,8 +30,6 @@ static inline int arch_has_random(void)
return !!ppc_md.get_random_long;
 }
 
-int powernv_get_random_long(unsigned long *v);
-
 static inline int arch_get_random_seed_long(unsigned long *v)
 {
return 0;
@@ -47,4 +45,13 @@ static inline int arch_has_random_seed(void)
 
 #endif /* CONFIG_ARCH_RANDOM */
 
+#ifdef CONFIG_PPC_POWERNV
+int powernv_hwrng_present(void);
+int powernv_get_random_long(unsigned long *v);
+int powernv_get_random_real_mode(unsigned long *v);
+#else
+static inline int powernv_hwrng_present(void) { return 0; }
+static inline int powernv_get_random_real_mode(unsigned long *v) { return 0; }
+#endif
+
 #endif /* _ASM_POWERPC_ARCHRANDOM_H */
diff --git a/arch/powerpc/include/asm/kvm_ppc.h 
b/arch/powerpc/include/asm/kvm_ppc.h
index 46bf652..b8475da 100644
--- a/arch/powerpc/include/asm/kvm_ppc.h
+++ b/arch/powerpc/include/asm/kvm_ppc.h
@@ -302,6 +302,8 @@ static inline bool is_kvmppc_hv_enabled(struct kvm *kvm)
return kvm->arch.kvm_ops == kvmppc_hv_ops;
 }
 
+extern int kvmppc_hwrng_present(void);
+
 /*
  * Cuts out inst bits with ordering according to spec.
  * That means the leftmost bit is zero. All given bits are included.
diff --git a/arch/powerpc/kvm/book3s_hv_builtin.c 
b/arch/powerpc/kvm/book3s_hv_builtin.c
index 1f083ff..1954a1c 100644
--- a/arch/powerpc/kvm/book3s_hv_builtin.c
+++ b/arch/powerpc/kvm/book3s_hv_builtin.c
@@ -21,6 +21,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #define KVM_CMA_CHUNK_ORDER18
 
@@ -169,3 +170,17 @@ int kvmppc_hcall_impl_hv_realmode(unsigned long cmd)
return 0;
 }
 EXPORT_SYMBOL_GPL(kvmppc_hcall_impl_hv_realmode);
+
+int kvmppc_hwrng_present(void)
+{
+   return powernv_hwrng_present();
+}
+EXPORT_SYMBOL_GPL(kvmppc_hwrng_present);
+
+long kvmppc_h_random(struct kvm_vcpu *vcpu)
+{
+   if (powernv_get_random_real_mode(&vcpu->arch.gpr[4]))
+   return H_SUCCESS;
+
+   return H_HARDWARE;
+}
diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S 
b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
index 6cbf163..0814ca1 100644
--- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
+++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
@@ -1839,6 +1839,121 @@ hcall_real_table:
.long   0   /* 0x12c */
.long   0 

[PULL 21/21] KVM: PPC: Book3S HV: Use msgsnd for signalling threads on POWER8

2015-04-21 Thread Alexander Graf
From: Paul Mackerras 

This uses msgsnd where possible for signalling other threads within
the same core on POWER8 systems, rather than IPIs through the XICS
interrupt controller.  This includes waking secondary threads to run
the guest, the interrupts generated by the virtual XICS, and the
interrupts to bring the other threads out of the guest when exiting.

Aggregated statistics from debugfs across vcpus for a guest with 32
vcpus, 8 threads/vcore, running on a POWER8, show this before the
change:

 rm_entry: 3387.6ns (228 - 86600, 1008969 samples)
  rm_exit: 4561.5ns (12 - 3477452, 1009402 samples)
  rm_intr: 1660.0ns (12 - 553050, 3600051 samples)

and this after the change:

 rm_entry: 3060.1ns (212 - 65138, 953873 samples)
  rm_exit: 4244.1ns (12 - 9693408, 954331 samples)
  rm_intr: 1342.3ns (12 - 1104718, 3405326 samples)

for a test of booting Fedora 20 big-endian to the login prompt.

The time taken for a H_PROD hcall (which is handled in the host
kernel) went down from about 35 microseconds to about 16 microseconds
with this change.

The noinline added to kvmppc_run_core turned out to be necessary for
good performance, at least with gcc 4.9.2 as packaged with Fedora 21
and a little-endian POWER8 host.

Signed-off-by: Paul Mackerras 
Signed-off-by: Alexander Graf 
---
 arch/powerpc/kernel/asm-offsets.c   |  3 ++
 arch/powerpc/kvm/book3s_hv.c| 51 ++---
 arch/powerpc/kvm/book3s_hv_builtin.c| 16 +--
 arch/powerpc/kvm/book3s_hv_rmhandlers.S | 22 --
 4 files changed, 70 insertions(+), 22 deletions(-)

diff --git a/arch/powerpc/kernel/asm-offsets.c 
b/arch/powerpc/kernel/asm-offsets.c
index 0d07efb..0034b6b 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -37,6 +37,7 @@
 #include 
 #include 
 #include 
+#include 
 #ifdef CONFIG_PPC64
 #include 
 #include 
@@ -759,5 +760,7 @@ int main(void)
offsetof(struct paca_struct, subcore_sibling_mask));
 #endif
 
+   DEFINE(PPC_DBELL_SERVER, PPC_DBELL_SERVER);
+
return 0;
 }
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index ea1600f..48d3c5d 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -51,6 +51,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -84,9 +85,35 @@ static DECLARE_BITMAP(default_enabled_hcalls, 
MAX_HCALL_OPCODE/4 + 1);
 static void kvmppc_end_cede(struct kvm_vcpu *vcpu);
 static int kvmppc_hv_setup_htab_rma(struct kvm_vcpu *vcpu);
 
+static bool kvmppc_ipi_thread(int cpu)
+{
+   /* On POWER8 for IPIs to threads in the same core, use msgsnd */
+   if (cpu_has_feature(CPU_FTR_ARCH_207S)) {
+   preempt_disable();
+   if (cpu_first_thread_sibling(cpu) ==
+   cpu_first_thread_sibling(smp_processor_id())) {
+   unsigned long msg = PPC_DBELL_TYPE(PPC_DBELL_SERVER);
+   msg |= cpu_thread_in_core(cpu);
+   smp_mb();
+   __asm__ __volatile__ (PPC_MSGSND(%0) : : "r" (msg));
+   preempt_enable();
+   return true;
+   }
+   preempt_enable();
+   }
+
+#if defined(CONFIG_PPC_ICP_NATIVE) && defined(CONFIG_SMP)
+   if (cpu >= 0 && cpu < nr_cpu_ids && paca[cpu].kvm_hstate.xics_phys) {
+   xics_wake_cpu(cpu);
+   return true;
+   }
+#endif
+
+   return false;
+}
+
 static void kvmppc_fast_vcpu_kick_hv(struct kvm_vcpu *vcpu)
 {
-   int me;
int cpu = vcpu->cpu;
wait_queue_head_t *wqp;
 
@@ -96,20 +123,12 @@ static void kvmppc_fast_vcpu_kick_hv(struct kvm_vcpu *vcpu)
++vcpu->stat.halt_wakeup;
}
 
-   me = get_cpu();
+   if (kvmppc_ipi_thread(cpu + vcpu->arch.ptid))
+   return;
 
/* CPU points to the first thread of the core */
-   if (cpu != me && cpu >= 0 && cpu < nr_cpu_ids) {
-#ifdef CONFIG_PPC_ICP_NATIVE
-   int real_cpu = cpu + vcpu->arch.ptid;
-   if (paca[real_cpu].kvm_hstate.xics_phys)
-   xics_wake_cpu(real_cpu);
-   else
-#endif
-   if (cpu_online(cpu))
-   smp_send_reschedule(cpu);
-   }
-   put_cpu();
+   if (cpu >= 0 && cpu < nr_cpu_ids && cpu_online(cpu))
+   smp_send_reschedule(cpu);
 }
 
 /*
@@ -1781,10 +1800,8 @@ static void kvmppc_start_thread(struct kvm_vcpu *vcpu)
/* Order stores to hstate.kvm_vcore etc. before store to kvm_vcpu */
smp_wmb();
tpaca->kvm_hstate.kvm_vcpu = vcpu;
-#if defined(CONFIG_PPC_ICP_NATIVE) && defined(CONFIG_SMP)
if (cpu != smp_processor_id())
-   xics_wake_cpu(cpu);
-#endif
+   kvmppc_ipi_thread(cp

[PULL 19/21] KVM: PPC: Book3S HV: Streamline guest entry and exit

2015-04-21 Thread Alexander Graf
From: Paul Mackerras 

On entry to the guest, secondary threads now wait for the primary to
switch the MMU after loading up most of their state, rather than before.
This means that the secondary threads get into the guest sooner, in the
common case where the secondary threads get to kvmppc_hv_entry before
the primary thread.

On exit, the first thread out increments the exit count and interrupts
the other threads (to get them out of the guest) before saving most
of its state, rather than after.  That means that the other threads
exit sooner and means that the first thread doesn't spend so much
time waiting for the other threads at the point where the MMU gets
switched back to the host.

This pulls out the code that increments the exit count and interrupts
other threads into a separate function, kvmhv_commence_exit().
This also makes sure that r12 and vcpu->arch.trap are set correctly
in some corner cases.

Statistics from /sys/kernel/debug/kvm/vm*/vcpu*/timings show the
improvement.  Aggregating across vcpus for a guest with 32 vcpus,
8 threads/vcore, running on a POWER8, gives this before the change:

 rm_entry: avg 4537.3ns (222 - 48444, 1068878 samples)
  rm_exit: avg 4787.6ns (152 - 165490, 1010717 samples)
  rm_intr: avg 1673.6ns (12 - 341304, 3818691 samples)

and this after the change:

 rm_entry: avg 3427.7ns (232 - 68150, 1118921 samples)
  rm_exit: avg 4716.0ns (12 - 150720, 1119477 samples)
  rm_intr: avg 1614.8ns (12 - 522436, 3850432 samples)

showing a substantial reduction in the time spent per guest entry in
the real-mode guest entry code, and smaller reductions in the real
mode guest exit and interrupt handling times.  (The test was to start
the guest and boot Fedora 20 big-endian to the login prompt.)

Signed-off-by: Paul Mackerras 
Signed-off-by: Alexander Graf 
---
 arch/powerpc/kvm/book3s_hv_rmhandlers.S | 212 +++-
 1 file changed, 126 insertions(+), 86 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S 
b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
index 245f5c9..3f6fd78 100644
--- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
+++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
@@ -175,6 +175,19 @@ kvmppc_primary_no_guest:
/* put the HDEC into the DEC, since HDEC interrupts don't wake us */
mfspr   r3, SPRN_HDEC
mtspr   SPRN_DEC, r3
+   /*
+* Make sure the primary has finished the MMU switch.
+* We should never get here on a secondary thread, but
+* check it for robustness' sake.
+*/
+   ld  r5, HSTATE_KVM_VCORE(r13)
+65:lbz r0, VCORE_IN_GUEST(r5)
+   cmpwi   r0, 0
+   beq 65b
+   /* Set LPCR. */
+   ld  r8,VCORE_LPCR(r5)
+   mtspr   SPRN_LPCR,r8
+   isync
/* set our bit in napping_threads */
ld  r5, HSTATE_KVM_VCORE(r13)
lbz r7, HSTATE_PTID(r13)
@@ -206,7 +219,7 @@ kvm_novcpu_wakeup:
 
/* check the wake reason */
bl  kvmppc_check_wake_reason
-   
+
/* see if any other thread is already exiting */
lwz r0, VCORE_ENTRY_EXIT(r5)
cmpwi   r0, 0x100
@@ -244,7 +257,15 @@ kvm_novcpu_wakeup:
b   kvmppc_got_guest
 
 kvm_novcpu_exit:
-   b   hdec_soon
+#ifdef CONFIG_KVM_BOOK3S_HV_EXIT_TIMING
+   ld  r4, HSTATE_KVM_VCPU(r13)
+   cmpdi   r4, 0
+   beq 13f
+   addir3, r4, VCPU_TB_RMEXIT
+   bl  kvmhv_accumulate_time
+#endif
+13:bl  kvmhv_commence_exit
+   b   kvmhv_switch_to_host
 
 /*
  * We come in here when wakened from nap mode.
@@ -422,7 +443,7 @@ kvmppc_hv_entry:
/* Primary thread switches to guest partition. */
ld  r9,VCORE_KVM(r5)/* pointer to struct kvm */
cmpwi   r6,0
-   bne 20f
+   bne 10f
ld  r6,KVM_SDR1(r9)
lwz r7,KVM_LPID(r9)
li  r0,LPID_RSVD/* switch to reserved LPID */
@@ -493,26 +514,9 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_207S)
 
li  r0,1
stb r0,VCORE_IN_GUEST(r5)   /* signal secondaries to continue */
-   b   10f
-
-   /* Secondary threads wait for primary to have done partition switch */
-20:lbz r0,VCORE_IN_GUEST(r5)
-   cmpwi   r0,0
-   beq 20b
-
-   /* Set LPCR. */
-10:ld  r8,VCORE_LPCR(r5)
-   mtspr   SPRN_LPCR,r8
-   isync
-
-   /* Check if HDEC expires soon */
-   mfspr   r3,SPRN_HDEC
-   cmpwi   r3,512  /* 1 microsecond */
-   li  r12,BOOK3S_INTERRUPT_HV_DECREMENTER
-   blt hdec_soon
 
/* Do we have a guest vcpu to run? */
-   cmpdi   r4, 0
+10:cmpdi   r4, 0
beq kvmppc_primary_no_guest
 kvmppc_got_guest:
 
@@ -837,6 +841,30 @@ END_FTR_SECTION_IFCLR(CPU_FTR_ARCH_207S)
clrrdi  r6,r6,1
mtspr   SPRN_CTRLT,r6
 4:
+   /* Secondary threads wait for primary to have done partition switch */

[PULL 04/21] KVM: PPC: Book3S HV: Remove RMA-related variables from code

2015-04-21 Thread Alexander Graf
From: "Aneesh Kumar K.V" 

We don't support real-mode areas now that 970 support is removed.
Remove the remaining details of rma from the code.  Also rename
rma_setup_done to hpte_setup_done to better reflect the changes.

Signed-off-by: Aneesh Kumar K.V 
Signed-off-by: Paul Mackerras 
Signed-off-by: Alexander Graf 
---
 arch/powerpc/include/asm/kvm_host.h |  3 +--
 arch/powerpc/kvm/book3s_64_mmu_hv.c | 28 ++--
 arch/powerpc/kvm/book3s_hv.c| 10 +-
 3 files changed, 20 insertions(+), 21 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_host.h 
b/arch/powerpc/include/asm/kvm_host.h
index 8ef0512..015773f 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -228,9 +228,8 @@ struct kvm_arch {
int tlbie_lock;
unsigned long lpcr;
unsigned long rmor;
-   struct kvm_rma_info *rma;
unsigned long vrma_slb_v;
-   int rma_setup_done;
+   int hpte_setup_done;
u32 hpt_order;
atomic_t vcpus_running;
u32 online_vcores;
diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c 
b/arch/powerpc/kvm/book3s_64_mmu_hv.c
index 534acb3..dbf1271 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_hv.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c
@@ -116,12 +116,12 @@ long kvmppc_alloc_reset_hpt(struct kvm *kvm, u32 
*htab_orderp)
long order;
 
mutex_lock(&kvm->lock);
-   if (kvm->arch.rma_setup_done) {
-   kvm->arch.rma_setup_done = 0;
-   /* order rma_setup_done vs. vcpus_running */
+   if (kvm->arch.hpte_setup_done) {
+   kvm->arch.hpte_setup_done = 0;
+   /* order hpte_setup_done vs. vcpus_running */
smp_mb();
if (atomic_read(&kvm->arch.vcpus_running)) {
-   kvm->arch.rma_setup_done = 1;
+   kvm->arch.hpte_setup_done = 1;
goto out;
}
}
@@ -1339,20 +1339,20 @@ static ssize_t kvm_htab_write(struct file *file, const 
char __user *buf,
unsigned long tmp[2];
ssize_t nb;
long int err, ret;
-   int rma_setup;
+   int hpte_setup;
 
if (!access_ok(VERIFY_READ, buf, count))
return -EFAULT;
 
/* lock out vcpus from running while we're doing this */
mutex_lock(&kvm->lock);
-   rma_setup = kvm->arch.rma_setup_done;
-   if (rma_setup) {
-   kvm->arch.rma_setup_done = 0;   /* temporarily */
-   /* order rma_setup_done vs. vcpus_running */
+   hpte_setup = kvm->arch.hpte_setup_done;
+   if (hpte_setup) {
+   kvm->arch.hpte_setup_done = 0;  /* temporarily */
+   /* order hpte_setup_done vs. vcpus_running */
smp_mb();
if (atomic_read(&kvm->arch.vcpus_running)) {
-   kvm->arch.rma_setup_done = 1;
+   kvm->arch.hpte_setup_done = 1;
mutex_unlock(&kvm->lock);
return -EBUSY;
}
@@ -1405,7 +1405,7 @@ static ssize_t kvm_htab_write(struct file *file, const 
char __user *buf,
   "r=%lx\n", ret, i, v, r);
goto out;
}
-   if (!rma_setup && is_vrma_hpte(v)) {
+   if (!hpte_setup && is_vrma_hpte(v)) {
unsigned long psize = hpte_base_page_size(v, r);
unsigned long senc = slb_pgsize_encoding(psize);
unsigned long lpcr;
@@ -1414,7 +1414,7 @@ static ssize_t kvm_htab_write(struct file *file, const 
char __user *buf,
(VRMA_VSID << SLB_VSID_SHIFT_1T);
lpcr = senc << (LPCR_VRMASD_SH - 4);
kvmppc_update_lpcr(kvm, lpcr, LPCR_VRMASD);
-   rma_setup = 1;
+   hpte_setup = 1;
}
++i;
hptp += 2;
@@ -1430,9 +1430,9 @@ static ssize_t kvm_htab_write(struct file *file, const 
char __user *buf,
}
 
  out:
-   /* Order HPTE updates vs. rma_setup_done */
+   /* Order HPTE updates vs. hpte_setup_done */
smp_wmb();
-   kvm->arch.rma_setup_done = rma_setup;
+   kvm->arch.hpte_setup_done = hpte_setup;
mutex_unlock(&kvm->lock);
 
if (err)
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index b9c11a3..dde14fd 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -2044,11 +2044,11 @@ static int kvmppc_vcpu_run_hv(struct kvm_run *run, 
struct kvm_vcpu *vcpu)
}
 
atomic_inc(&vcpu->kvm->

[PULL 18/21] KVM: PPC: Book3S HV: Use bitmap of active threads rather than count

2015-04-21 Thread Alexander Graf
From: Paul Mackerras 

Currently, the entry_exit_count field in the kvmppc_vcore struct
contains two 8-bit counts, one of the threads that have started entering
the guest, and one of the threads that have started exiting the guest.
This changes it to an entry_exit_map field which contains two bitmaps
of 8 bits each.  The advantage of doing this is that it gives us a
bitmap of which threads need to be signalled when exiting the guest.
That means that we no longer need to use the trick of setting the
HDEC to 0 to pull the other threads out of the guest, which led in
some cases to a spurious HDEC interrupt on the next guest entry.
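
In C, the lock-free entry protocol the bitmap makes possible looks
roughly like this (a sketch, not code from the patch; ptid is the
thread's position within the vcore):

        int old;

        do {
                old = vc->entry_exit_map;
                if (old >> 8)           /* some thread is already exiting */
                        return;         /* too late to enter the guest */
        } while (cmpxchg(&vc->entry_exit_map, old,
                         old | (1 << ptid)) != old);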

Signed-off-by: Paul Mackerras 
Signed-off-by: Alexander Graf 
---
 arch/powerpc/include/asm/kvm_host.h | 15 
 arch/powerpc/kernel/asm-offsets.c   |  2 +-
 arch/powerpc/kvm/book3s_hv.c|  5 ++-
 arch/powerpc/kvm/book3s_hv_builtin.c| 10 +++---
 arch/powerpc/kvm/book3s_hv_rmhandlers.S | 61 +++--
 5 files changed, 44 insertions(+), 49 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_host.h 
b/arch/powerpc/include/asm/kvm_host.h
index 1517faa..d67a838 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -263,15 +263,15 @@ struct kvm_arch {
 
 /*
  * Struct for a virtual core.
- * Note: entry_exit_count combines an entry count in the bottom 8 bits
- * and an exit count in the next 8 bits.  This is so that we can
- * atomically increment the entry count iff the exit count is 0
- * without taking the lock.
+ * Note: entry_exit_map combines a bitmap of threads that have entered
+ * in the bottom 8 bits and a bitmap of threads that have exited in the
+ * next 8 bits.  This is so that we can atomically set the entry bit
+ * iff the exit map is 0 without taking a lock.
  */
 struct kvmppc_vcore {
int n_runnable;
int num_threads;
-   int entry_exit_count;
+   int entry_exit_map;
int napping_threads;
int first_vcpuid;
u16 pcpu;
@@ -296,8 +296,9 @@ struct kvmppc_vcore {
ulong conferring_threads;
 };
 
-#define VCORE_ENTRY_COUNT(vc)  ((vc)->entry_exit_count & 0xff)
-#define VCORE_EXIT_COUNT(vc)   ((vc)->entry_exit_count >> 8)
+#define VCORE_ENTRY_MAP(vc)((vc)->entry_exit_map & 0xff)
+#define VCORE_EXIT_MAP(vc) ((vc)->entry_exit_map >> 8)
+#define VCORE_IS_EXITING(vc)   (VCORE_EXIT_MAP(vc) != 0)
 
 /* Values for vcore_state */
 #define VCORE_INACTIVE 0
diff --git a/arch/powerpc/kernel/asm-offsets.c 
b/arch/powerpc/kernel/asm-offsets.c
index 8aa8246..0d07efb 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -562,7 +562,7 @@ int main(void)
DEFINE(VCPU_ACOP, offsetof(struct kvm_vcpu, arch.acop));
DEFINE(VCPU_WORT, offsetof(struct kvm_vcpu, arch.wort));
DEFINE(VCPU_SHADOW_SRR1, offsetof(struct kvm_vcpu, arch.shadow_srr1));
-   DEFINE(VCORE_ENTRY_EXIT, offsetof(struct kvmppc_vcore, 
entry_exit_count));
+   DEFINE(VCORE_ENTRY_EXIT, offsetof(struct kvmppc_vcore, entry_exit_map));
DEFINE(VCORE_IN_GUEST, offsetof(struct kvmppc_vcore, in_guest));
DEFINE(VCORE_NAPPING_THREADS, offsetof(struct kvmppc_vcore, 
napping_threads));
DEFINE(VCORE_KVM, offsetof(struct kvmppc_vcore, kvm));
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 7c1335d..ea1600f 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -1952,7 +1952,7 @@ static void kvmppc_run_core(struct kvmppc_vcore *vc)
/*
 * Initialize *vc.
 */
-   vc->entry_exit_count = 0;
+   vc->entry_exit_map = 0;
vc->preempt_tb = TB_NIL;
vc->in_guest = 0;
vc->napping_threads = 0;
@@ -2119,8 +2119,7 @@ static int kvmppc_run_vcpu(struct kvm_run *kvm_run, 
struct kvm_vcpu *vcpu)
 * this thread straight away and have it join in.
 */
if (!signal_pending(current)) {
-   if (vc->vcore_state == VCORE_RUNNING &&
-   VCORE_EXIT_COUNT(vc) == 0) {
+   if (vc->vcore_state == VCORE_RUNNING && !VCORE_IS_EXITING(vc)) {
kvmppc_create_dtl_entry(vcpu, vc);
kvmppc_start_thread(vcpu);
trace_kvm_guest_enter(vcpu);
diff --git a/arch/powerpc/kvm/book3s_hv_builtin.c 
b/arch/powerpc/kvm/book3s_hv_builtin.c
index 1954a1c..2754251 100644
--- a/arch/powerpc/kvm/book3s_hv_builtin.c
+++ b/arch/powerpc/kvm/book3s_hv_builtin.c
@@ -115,11 +115,11 @@ long int kvmppc_rm_h_confer(struct kvm_vcpu *vcpu, int 
target,
int rv = H_SUCCESS; /* => don't yield */
 
set_bit(vcpu->arch.ptid, &vc->conferring_threads);
-   while ((get_tb() < stop) && (VCORE_EXIT_COUNT(vc) == 0)) {
-   threads_running = VCORE_ENTRY_COUNT(vc);
-   threads_ceded = hweight32(vc->napping

[PULL 17/21] KVM: PPC: Book3S HV: Use decrementer to wake napping threads

2015-04-21 Thread Alexander Graf
From: Paul Mackerras 

This arranges for threads that are napping due to their vcpu having
ceded or due to not having a vcpu to wake up at the end of the guest's
timeslice without having to be poked with an IPI.  We do that by
arranging for the decrementer to contain a value no greater than the
number of timebase ticks remaining until the end of the timeslice.
In the case of a thread with no vcpu, this number is in the hypervisor
decrementer already.  In the case of a ceded vcpu, we use the smaller
of the HDEC value and the DEC value.

Using the DEC like this when ceded means we need to save and restore
the guest decrementer value around the nap.
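
In pseudo-C, the bookkeeping added around the nap is roughly the
following (illustrative only; the patch does this in assembly, with
mfspr/mftb standing in for the SPR and timebase reads):

        long dec, hdec;

        dec  = mfspr(SPRN_DEC);
        hdec = mfspr(SPRN_HDEC);
        if (dec > hdec)
                mtspr(SPRN_DEC, hdec);  /* wake by the end of the timeslice */
        /* remember when the guest decrementer expires, in host TB units */
        vcpu->arch.dec_expires = (s64)(s32)dec + mftb() - vc->tb_offset;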

Signed-off-by: Paul Mackerras 
Signed-off-by: Alexander Graf 
---
 arch/powerpc/kvm/book3s_hv_rmhandlers.S | 43 +++--
 1 file changed, 41 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S 
b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
index 12d7e4c..16719af 100644
--- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
+++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
@@ -172,6 +172,9 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_207S)
 
 kvmppc_primary_no_guest:
/* We handle this much like a ceded vcpu */
+   /* put the HDEC into the DEC, since HDEC interrupts don't wake us */
+   mfspr   r3, SPRN_HDEC
+   mtspr   SPRN_DEC, r3
/* set our bit in napping_threads */
ld  r5, HSTATE_KVM_VCORE(r13)
lbz r7, HSTATE_PTID(r13)
@@ -223,6 +226,12 @@ kvm_novcpu_wakeup:
cmpdi   r3, 0
bge kvm_novcpu_exit
 
+   /* See if our timeslice has expired (HDEC is negative) */
+   mfspr   r0, SPRN_HDEC
+   li  r12, BOOK3S_INTERRUPT_HV_DECREMENTER
+   cmpwi   r0, 0
+   blt kvm_novcpu_exit
+
/* Got an IPI but other vcpus aren't yet exiting, must be a latecomer */
ld  r4, HSTATE_KVM_VCPU(r13)
cmpdi   r4, 0
@@ -1493,10 +1502,10 @@ kvmhv_do_exit:  /* r12 = trap, r13 = 
paca */
cmpwi   r3,0x100/* Are we the first here? */
bge 43f
cmpwi   r12,BOOK3S_INTERRUPT_HV_DECREMENTER
-   beq 40f
+   beq 43f
li  r0,0
mtspr   SPRN_HDEC,r0
-40:
+
/*
 * Send an IPI to any napping threads, since an HDEC interrupt
 * doesn't wake CPUs up from nap.
@@ -2124,6 +2133,27 @@ _GLOBAL(kvmppc_h_cede)   /* r3 = vcpu pointer, 
r11 = msr, r13 = paca */
/* save FP state */
bl  kvmppc_save_fp
 
+   /*
+* Set DEC to the smaller of DEC and HDEC, so that we wake
+* no later than the end of our timeslice (HDEC interrupts
+* don't wake us from nap).
+*/
+   mfspr   r3, SPRN_DEC
+   mfspr   r4, SPRN_HDEC
+   mftbr5
+   cmpwr3, r4
+   ble 67f
+   mtspr   SPRN_DEC, r4
+67:
+   /* save expiry time of guest decrementer */
+   extsw   r3, r3
+   add r3, r3, r5
+   ld  r4, HSTATE_KVM_VCPU(r13)
+   ld  r5, HSTATE_KVM_VCORE(r13)
+   ld  r6, VCORE_TB_OFFSET(r5)
+   subfr3, r6, r3  /* convert to host TB value */
+   std r3, VCPU_DEC_EXPIRES(r4)
+
 #ifdef CONFIG_KVM_BOOK3S_HV_EXIT_TIMING
ld  r4, HSTATE_KVM_VCPU(r13)
addir3, r4, VCPU_TB_CEDE
@@ -2181,6 +2211,15 @@ kvm_end_cede:
/* load up FP state */
bl  kvmppc_load_fp
 
+   /* Restore guest decrementer */
+   ld  r3, VCPU_DEC_EXPIRES(r4)
+   ld  r5, HSTATE_KVM_VCORE(r13)
+   ld  r6, VCORE_TB_OFFSET(r5)
+   add r3, r3, r6  /* convert host TB to guest TB value */
+   mftbr7
+   subfr3, r7, r3
+   mtspr   SPRN_DEC, r3
+
/* Load NV GPRS */
ld  r14, VCPU_GPR(R14)(r4)
ld  r15, VCPU_GPR(R15)(r4)
-- 
1.8.1.4



[PULL 02/21] kvmppc: Implement H_LOGICAL_CI_{LOAD,STORE} in KVM

2015-04-21 Thread Alexander Graf
From: David Gibson 

On POWER, storage caching is usually configured via the MMU - attributes
such as cache-inhibited are stored in the TLB and the hashed page table.

This makes correctly performing cache inhibited IO accesses awkward when
the MMU is turned off (real mode).  Some CPU models provide special
registers to control the cache attributes of real mode loads and stores but
this is not at all consistent.  This is a problem in particular for SLOF,
the firmware used on KVM guests, which runs entirely in real mode, but
which needs to do IO to load the kernel.

To simplify this, qemu implements two special hypercalls, H_LOGICAL_CI_LOAD
and H_LOGICAL_CI_STORE which simulate a cache-inhibited load or store to
a logical address (aka guest physical address).  SLOF uses these for IO.

However, because these are implemented within qemu, not the host kernel,
these bypass any IO devices emulated within KVM itself.  The simplest way
to see this problem is to attempt to boot a KVM guest from a virtio-blk
device with iothread / dataplane enabled.  The iothread code relies on an
in-kernel implementation of the virtio queue notification, which is not
triggered by the IO hcalls, and so the guest will stall in SLOF unable to
load the guest OS.

This patch addresses this by providing in-kernel implementations of the
2 hypercalls, which correctly scan the KVM IO bus.  Any access to an
address not handled by the KVM IO bus will cause a VM exit, hitting the
qemu implementation as before.

Note that a userspace change is also required, in order to enable these
new hcall implementations with KVM_CAP_PPC_ENABLE_HCALL.
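
For context, a guest-side caller would use the interface roughly as
follows (an illustrative Linux-style sketch using plpar_hcall(); SLOF
has its own hcall wrapper, and this code is not part of the patch):

        unsigned long retbuf[PLPAR_HCALL_BUFSIZE];
        long rc;

        /* cache-inhibited load of 4 bytes from guest physical address gpa */
        rc = plpar_hcall(H_LOGICAL_CI_LOAD, retbuf, 4, gpa);
        if (rc == H_SUCCESS)
                val = retbuf[0];        /* the loaded value comes back in r4 */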

Signed-off-by: David Gibson 
[agraf: fix compilation]
Signed-off-by: Alexander Graf 
---
 arch/powerpc/include/asm/kvm_book3s.h |  3 ++
 arch/powerpc/kvm/book3s.c | 76 +++
 arch/powerpc/kvm/book3s_hv.c  | 12 ++
 arch/powerpc/kvm/book3s_pr_papr.c | 28 +
 4 files changed, 119 insertions(+)

diff --git a/arch/powerpc/include/asm/kvm_book3s.h 
b/arch/powerpc/include/asm/kvm_book3s.h
index 942c7b1..578e550 100644
--- a/arch/powerpc/include/asm/kvm_book3s.h
+++ b/arch/powerpc/include/asm/kvm_book3s.h
@@ -292,6 +292,9 @@ static inline bool kvmppc_supports_magic_page(struct 
kvm_vcpu *vcpu)
return !is_kvmppc_hv_enabled(vcpu->kvm);
 }
 
+extern int kvmppc_h_logical_ci_load(struct kvm_vcpu *vcpu);
+extern int kvmppc_h_logical_ci_store(struct kvm_vcpu *vcpu);
+
 /* Magic register values loaded into r3 and r4 before the 'sc' assembly
  * instruction for the OSI hypercalls */
 #define OSI_SC_MAGIC_R30x113724FA
diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c
index cfbcdc6..453a8a4 100644
--- a/arch/powerpc/kvm/book3s.c
+++ b/arch/powerpc/kvm/book3s.c
@@ -821,6 +821,82 @@ void kvmppc_core_destroy_vm(struct kvm *kvm)
 #endif
 }
 
+int kvmppc_h_logical_ci_load(struct kvm_vcpu *vcpu)
+{
+   unsigned long size = kvmppc_get_gpr(vcpu, 4);
+   unsigned long addr = kvmppc_get_gpr(vcpu, 5);
+   u64 buf;
+   int ret;
+
+   if (!is_power_of_2(size) || (size > sizeof(buf)))
+   return H_TOO_HARD;
+
+   ret = kvm_io_bus_read(vcpu, KVM_MMIO_BUS, addr, size, &buf);
+   if (ret != 0)
+   return H_TOO_HARD;
+
+   switch (size) {
+   case 1:
+   kvmppc_set_gpr(vcpu, 4, *(u8 *)&buf);
+   break;
+
+   case 2:
+   kvmppc_set_gpr(vcpu, 4, be16_to_cpu(*(__be16 *)&buf));
+   break;
+
+   case 4:
+   kvmppc_set_gpr(vcpu, 4, be32_to_cpu(*(__be32 *)&buf));
+   break;
+
+   case 8:
+   kvmppc_set_gpr(vcpu, 4, be64_to_cpu(*(__be64 *)&buf));
+   break;
+
+   default:
+   BUG();
+   }
+
+   return H_SUCCESS;
+}
+EXPORT_SYMBOL_GPL(kvmppc_h_logical_ci_load);
+
+int kvmppc_h_logical_ci_store(struct kvm_vcpu *vcpu)
+{
+   unsigned long size = kvmppc_get_gpr(vcpu, 4);
+   unsigned long addr = kvmppc_get_gpr(vcpu, 5);
+   unsigned long val = kvmppc_get_gpr(vcpu, 6);
+   u64 buf;
+   int ret;
+
+   switch (size) {
+   case 1:
+   *(u8 *)&buf = val;
+   break;
+
+   case 2:
+   *(__be16 *)&buf = cpu_to_be16(val);
+   break;
+
+   case 4:
+   *(__be32 *)&buf = cpu_to_be32(val);
+   break;
+
+   case 8:
+   *(__be64 *)&buf = cpu_to_be64(val);
+   break;
+
+   default:
+   return H_TOO_HARD;
+   }
+
+   ret = kvm_io_bus_write(vcpu, KVM_MMIO_BUS, addr, size, &buf);
+   if (ret != 0)
+   return H_TOO_HARD;
+
+   return H_SUCCESS;
+}
+EXPORT_SYMBOL_GPL(kvmppc_h_logical_ci_store);
+
 int kvmppc_core_check_processor_compat(void)
 {
/*
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index de74756.

[PULL 08/21] KVM: PPC: Book3S HV: Move virtual mode ICP functions to real-mode

2015-04-21 Thread Alexander Graf
From: Suresh Warrier 

Interrupt-based hypercalls return H_TOO_HARD to inform KVM that it needs
to switch to the host to complete the rest of the hypercall function in
virtual mode. This patch ports the virtual mode ICS/ICP reject and resend
functions to be runnable in hypervisor real mode, thus avoiding the need
to switch to the host to execute these functions in virtual mode. However,
the hypercalls continue to return H_TOO_HARD for vcpu_wakeup and notify
events - these events cannot be done in real mode and they will still need
a switch to host virtual mode.

There are sufficient differences between the real mode code and the
virtual mode code for the ICS/ICP resend and reject functions that
for now the code has been duplicated instead of sharing common code.
In the future, we can look at creating common functions.

Signed-off-by: Suresh Warrier 
Signed-off-by: Paul Mackerras 
Signed-off-by: Alexander Graf 
---
 arch/powerpc/kvm/book3s_hv_rm_xics.c | 225 ---
 1 file changed, 211 insertions(+), 14 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv_rm_xics.c 
b/arch/powerpc/kvm/book3s_hv_rm_xics.c
index 7c22997..73bbe92 100644
--- a/arch/powerpc/kvm/book3s_hv_rm_xics.c
+++ b/arch/powerpc/kvm/book3s_hv_rm_xics.c
@@ -23,12 +23,39 @@
 
 #define DEBUG_PASSUP
 
+static void icp_rm_deliver_irq(struct kvmppc_xics *xics, struct kvmppc_icp 
*icp,
+   u32 new_irq);
+
 static inline void rm_writeb(unsigned long paddr, u8 val)
 {
__asm__ __volatile__("sync; stbcix %0,0,%1"
: : "r" (val), "r" (paddr) : "memory");
 }
 
+/* -- ICS routines -- */
+static void ics_rm_check_resend(struct kvmppc_xics *xics,
+   struct kvmppc_ics *ics, struct kvmppc_icp *icp)
+{
+   int i;
+
+   arch_spin_lock(&ics->lock);
+
+   for (i = 0; i < KVMPPC_XICS_IRQ_PER_ICS; i++) {
+   struct ics_irq_state *state = &ics->irq_state[i];
+
+   if (!state->resend)
+   continue;
+
+   arch_spin_unlock(&ics->lock);
+   icp_rm_deliver_irq(xics, icp, state->number);
+   arch_spin_lock(&ics->lock);
+   }
+
+   arch_spin_unlock(&ics->lock);
+}
+
+/* -- ICP routines -- */
+
 static void icp_rm_set_vcpu_irq(struct kvm_vcpu *vcpu,
struct kvm_vcpu *this_vcpu)
 {
@@ -116,6 +143,178 @@ static inline int check_too_hard(struct kvmppc_xics *xics,
return (xics->real_mode_dbg || icp->rm_action) ? H_TOO_HARD : H_SUCCESS;
 }
 
+static void icp_rm_check_resend(struct kvmppc_xics *xics,
+struct kvmppc_icp *icp)
+{
+   u32 icsid;
+
+   /* Order this load with the test for need_resend in the caller */
+   smp_rmb();
+   for_each_set_bit(icsid, icp->resend_map, xics->max_icsid + 1) {
+   struct kvmppc_ics *ics = xics->ics[icsid];
+
+   if (!test_and_clear_bit(icsid, icp->resend_map))
+   continue;
+   if (!ics)
+   continue;
+   ics_rm_check_resend(xics, ics, icp);
+   }
+}
+
+static bool icp_rm_try_to_deliver(struct kvmppc_icp *icp, u32 irq, u8 priority,
+  u32 *reject)
+{
+   union kvmppc_icp_state old_state, new_state;
+   bool success;
+
+   do {
+   old_state = new_state = READ_ONCE(icp->state);
+
+   *reject = 0;
+
+   /* See if we can deliver */
+   success = new_state.cppr > priority &&
+   new_state.mfrr > priority &&
+   new_state.pending_pri > priority;
+
+   /*
+* If we can, check for a rejection and perform the
+* delivery
+*/
+   if (success) {
+   *reject = new_state.xisr;
+   new_state.xisr = irq;
+   new_state.pending_pri = priority;
+   } else {
+   /*
+* If we failed to deliver we set need_resend
+* so a subsequent CPPR state change causes us
+* to try a new delivery.
+*/
+   new_state.need_resend = true;
+   }
+
+   } while (!icp_rm_try_update(icp, old_state, new_state));
+
+   return success;
+}
+
+static void icp_rm_deliver_irq(struct kvmppc_xics *xics, struct kvmppc_icp 
*icp,
+   u32 new_irq)
+{
+   struct ics_irq_state *state;
+   struct kvmppc_ics *ics;
+   u32 reject;
+   u16 src;
+
+   /*
+* This is used both for initial delivery of an interrupt and
+* for subsequent rejection.
+*
+* Rejection can be racy vs. resends. We have evaluated the
+

[PULL 05/21] KVM: PPC: Book3S HV: Add helpers for lock/unlock hpte

2015-04-21 Thread Alexander Graf
From: "Aneesh Kumar K.V" 

This adds helper routines for locking and unlocking HPTEs, and uses
them in the rest of the code.  We don't change any locking rules in
this patch.

Signed-off-by: Aneesh Kumar K.V 
Signed-off-by: Paul Mackerras 
Signed-off-by: Alexander Graf 
---
 arch/powerpc/include/asm/kvm_book3s_64.h | 14 ++
 arch/powerpc/kvm/book3s_64_mmu_hv.c  | 25 ++---
 arch/powerpc/kvm/book3s_hv_rm_mmu.c  | 25 +
 3 files changed, 33 insertions(+), 31 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_book3s_64.h 
b/arch/powerpc/include/asm/kvm_book3s_64.h
index 2d81e20..0789a0f 100644
--- a/arch/powerpc/include/asm/kvm_book3s_64.h
+++ b/arch/powerpc/include/asm/kvm_book3s_64.h
@@ -85,6 +85,20 @@ static inline long try_lock_hpte(__be64 *hpte, unsigned long 
bits)
return old == 0;
 }
 
+static inline void unlock_hpte(__be64 *hpte, unsigned long hpte_v)
+{
+   hpte_v &= ~HPTE_V_HVLOCK;
+   asm volatile(PPC_RELEASE_BARRIER "" : : : "memory");
+   hpte[0] = cpu_to_be64(hpte_v);
+}
+
+/* Without barrier */
+static inline void __unlock_hpte(__be64 *hpte, unsigned long hpte_v)
+{
+   hpte_v &= ~HPTE_V_HVLOCK;
+   hpte[0] = cpu_to_be64(hpte_v);
+}
+
 static inline int __hpte_actual_psize(unsigned int lp, int psize)
 {
int i, shift;
diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c 
b/arch/powerpc/kvm/book3s_64_mmu_hv.c
index dbf1271..6c6825a 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_hv.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c
@@ -338,9 +338,7 @@ static int kvmppc_mmu_book3s_64_hv_xlate(struct kvm_vcpu 
*vcpu, gva_t eaddr,
v = be64_to_cpu(hptep[0]) & ~HPTE_V_HVLOCK;
gr = kvm->arch.revmap[index].guest_rpte;
 
-   /* Unlock the HPTE */
-   asm volatile("lwsync" : : : "memory");
-   hptep[0] = cpu_to_be64(v);
+   unlock_hpte(hptep, v);
preempt_enable();
 
gpte->eaddr = eaddr;
@@ -469,8 +467,7 @@ int kvmppc_book3s_hv_page_fault(struct kvm_run *run, struct 
kvm_vcpu *vcpu,
hpte[0] = be64_to_cpu(hptep[0]) & ~HPTE_V_HVLOCK;
hpte[1] = be64_to_cpu(hptep[1]);
hpte[2] = r = rev->guest_rpte;
-   asm volatile("lwsync" : : : "memory");
-   hptep[0] = cpu_to_be64(hpte[0]);
+   unlock_hpte(hptep, hpte[0]);
preempt_enable();
 
if (hpte[0] != vcpu->arch.pgfault_hpte[0] ||
@@ -621,7 +618,7 @@ int kvmppc_book3s_hv_page_fault(struct kvm_run *run, struct 
kvm_vcpu *vcpu,
 
hptep[1] = cpu_to_be64(r);
eieio();
-   hptep[0] = cpu_to_be64(hpte[0]);
+   __unlock_hpte(hptep, hpte[0]);
asm volatile("ptesync" : : : "memory");
preempt_enable();
if (page && hpte_is_writable(r))
@@ -642,7 +639,7 @@ int kvmppc_book3s_hv_page_fault(struct kvm_run *run, struct 
kvm_vcpu *vcpu,
return ret;
 
  out_unlock:
-   hptep[0] &= ~cpu_to_be64(HPTE_V_HVLOCK);
+   __unlock_hpte(hptep, be64_to_cpu(hptep[0]));
preempt_enable();
goto out_put;
 }
@@ -771,7 +768,7 @@ static int kvm_unmap_rmapp(struct kvm *kvm, unsigned long 
*rmapp,
}
}
unlock_rmap(rmapp);
-   hptep[0] &= ~cpu_to_be64(HPTE_V_HVLOCK);
+   __unlock_hpte(hptep, be64_to_cpu(hptep[0]));
}
return 0;
 }
@@ -857,7 +854,7 @@ static int kvm_age_rmapp(struct kvm *kvm, unsigned long 
*rmapp,
}
ret = 1;
}
-   hptep[0] &= ~cpu_to_be64(HPTE_V_HVLOCK);
+   __unlock_hpte(hptep, be64_to_cpu(hptep[0]));
} while ((i = j) != head);
 
unlock_rmap(rmapp);
@@ -974,8 +971,7 @@ static int kvm_test_clear_dirty_npages(struct kvm *kvm, 
unsigned long *rmapp)
 
/* Now check and modify the HPTE */
if (!(hptep[0] & cpu_to_be64(HPTE_V_VALID))) {
-   /* unlock and continue */
-   hptep[0] &= ~cpu_to_be64(HPTE_V_HVLOCK);
+   __unlock_hpte(hptep, be64_to_cpu(hptep[0]));
continue;
}
 
@@ -996,9 +992,9 @@ static int kvm_test_clear_dirty_npages(struct kvm *kvm, 
unsigned long *rmapp)
npages_dirty = n;
eieio();
}
-   v &= ~(HPTE_V_ABSENT | HPTE_V_HVLOCK);
+   v &= ~HPTE_V_ABSENT;
v |= HPTE_V_VALID;
-   hptep[0] = cpu_to_be64(v);
+   __unlock_hpte(hptep, v);
} while ((i = j) != head);
 
unlock_rmap(rmapp);
@@ -1218,8 +1214,7 @@ static long record_hpte(unsigned long flags, __be64 *hptp,
r &= ~HPTE_GR_MODIFIED;
revp->guest_rpte = r;
}
-  

[PULL 01/21] powerpc: Export __spin_yield

2015-04-21 Thread Alexander Graf
From: "Suresh E. Warrier" 

Export __spin_yield so that the arch_spin_unlock() function can
be invoked from a module. This will be required for modules where
we want to take a lock that is also acquired in hypervisor
real mode. Because we want to avoid running any lockdep code
(which may not be safe in real mode), this lock needs to be
an arch_spinlock_t instead of a normal spinlock.
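
A minimal sketch of the intended module-side usage (the names here are
made up for illustration):

        static arch_spinlock_t rm_safe_lock = __ARCH_SPIN_LOCK_UNLOCKED;

        arch_spin_lock(&rm_safe_lock);  /* may call __spin_yield() */
        /* ... critical section that may also run in real mode ... */
        arch_spin_unlock(&rm_safe_lock);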

Signed-off-by: Suresh Warrier 
Acked-by: Paul Mackerras 
Acked-by: Michael Ellerman 
Signed-off-by: Alexander Graf 
---
 arch/powerpc/lib/locks.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/powerpc/lib/locks.c b/arch/powerpc/lib/locks.c
index 170a034..f7deebd 100644
--- a/arch/powerpc/lib/locks.c
+++ b/arch/powerpc/lib/locks.c
@@ -41,6 +41,7 @@ void __spin_yield(arch_spinlock_t *lock)
plpar_hcall_norets(H_CONFER,
get_hard_smp_processor_id(holder_cpu), yield_count);
 }
+EXPORT_SYMBOL_GPL(__spin_yield);
 
 /*
  * Waiting for a read lock or a write lock on a rwlock...
-- 
1.8.1.4



[PULL 00/21] ppc patch queue 2015-04-21 for 4.1

2015-04-21 Thread Alexander Graf
Hi Paolo / Marcelo,

This is my current patch queue for ppc.  Please pull.

Alex


The following changes since commit b79013b2449c23f1f505bdf39c5a6c330338b244:

  Merge tag 'staging-4.1-rc1' of 
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging (2015-04-13 
17:37:33 -0700)

are available in the git repository at:


  git://github.com/agraf/linux-2.6.git tags/signed-kvm-ppc-queue

for you to fetch changes up to 66feed61cdf6ee65fd551d3460b1efba6bee55b8:

  KVM: PPC: Book3S HV: Use msgsnd for signalling threads on POWER8 (2015-04-21 
15:21:34 +0200)


Patch queue for ppc - 2015-04-21

This is the latest queue for KVM on PowerPC changes. Highlights this
time around:

  - Book3S HV: Debugging aids
  - Book3S HV: Minor performance improvements
  - Book3S HV: Cleanups


Aneesh Kumar K.V (2):
  KVM: PPC: Book3S HV: Remove RMA-related variables from code
  KVM: PPC: Book3S HV: Add helpers for lock/unlock hpte

David Gibson (1):
  kvmppc: Implement H_LOGICAL_CI_{LOAD,STORE} in KVM

Michael Ellerman (1):
  KVM: PPC: Book3S HV: Add fast real-mode H_RANDOM implementation.

Paul Mackerras (12):
  KVM: PPC: Book3S HV: Create debugfs file for each guest's HPT
  KVM: PPC: Book3S HV: Accumulate timing information for real-mode code
  KVM: PPC: Book3S HV: Simplify handling of VCPUs that need a VPA update
  KVM: PPC: Book3S HV: Minor cleanups
  KVM: PPC: Book3S HV: Move vcore preemption point up into kvmppc_run_vcpu
  KVM: PPC: Book3S HV: Get rid of vcore nap_count and n_woken
  KVM: PPC: Book3S HV: Don't wake thread with no vcpu on guest IPI
  KVM: PPC: Book3S HV: Use decrementer to wake napping threads
  KVM: PPC: Book3S HV: Use bitmap of active threads rather than count
  KVM: PPC: Book3S HV: Streamline guest entry and exit
  KVM: PPC: Book3S HV: Translate kvmhv_commence_exit to C
  KVM: PPC: Book3S HV: Use msgsnd for signalling threads on POWER8

Suresh E. Warrier (2):
  powerpc: Export __spin_yield
  KVM: PPC: Book3S HV: Add guest->host real mode completion counters

Suresh Warrier (3):
  KVM: PPC: Book3S HV: Convert ICS mutex lock to spin lock
  KVM: PPC: Book3S HV: Move virtual mode ICP functions to real-mode
  KVM: PPC: Book3S HV: Add ICP real mode counters

 Documentation/virtual/kvm/api.txt|  17 +
 arch/powerpc/include/asm/archrandom.h|  11 +-
 arch/powerpc/include/asm/kvm_book3s.h|   3 +
 arch/powerpc/include/asm/kvm_book3s_64.h |  18 +
 arch/powerpc/include/asm/kvm_host.h  |  47 ++-
 arch/powerpc/include/asm/kvm_ppc.h   |   2 +
 arch/powerpc/include/asm/time.h  |   3 +
 arch/powerpc/kernel/asm-offsets.c|  20 +-
 arch/powerpc/kernel/time.c   |   6 +
 arch/powerpc/kvm/Kconfig |  14 +
 arch/powerpc/kvm/book3s.c|  76 +
 arch/powerpc/kvm/book3s_64_mmu_hv.c  | 189 +--
 arch/powerpc/kvm/book3s_hv.c | 435 ++--
 arch/powerpc/kvm/book3s_hv_builtin.c | 100 +-
 arch/powerpc/kvm/book3s_hv_rm_mmu.c  |  25 +-
 arch/powerpc/kvm/book3s_hv_rm_xics.c | 238 +++--
 arch/powerpc/kvm/book3s_hv_rmhandlers.S  | 559 +++
 arch/powerpc/kvm/book3s_pr_papr.c|  28 ++
 arch/powerpc/kvm/book3s_xics.c   | 105 --
 arch/powerpc/kvm/book3s_xics.h   |  13 +-
 arch/powerpc/kvm/powerpc.c   |   3 +
 arch/powerpc/lib/locks.c |   1 +
 arch/powerpc/platforms/powernv/rng.c |  29 ++
 include/uapi/linux/kvm.h |   1 +
 virt/kvm/kvm_main.c  |   1 +
 25 files changed, 1580 insertions(+), 364 deletions(-)


[PULL 10/21] KVM: PPC: Book3S HV: Create debugfs file for each guest's HPT

2015-04-21 Thread Alexander Graf
From: Paul Mackerras 

This creates a debugfs directory for each HV guest (assuming debugfs
is enabled in the kernel config), and within that directory, a file
by which the contents of the guest's HPT (hashed page table) can be
read.  The directory is named vm<pid>, where <pid> is the PID of the
process that created the guest.  The file is named "htab".  This is
intended to help in debugging problems in the host's management
of guest memory.

The contents of the file consist of a series of lines like this:

  3f48 4000d032bf003505 000bd7ff1196 0003b5c71196

The first field is the index of the entry in the HPT, the second and
third are the HPT entry, so the third field contains the real page
number that is mapped by the entry if the entry's valid bit is set.
The fourth field is the guest's view of the second doubleword of the
entry, so it contains the guest physical address.  (The format of the
second through fourth fields is described in the Power ISA and also
in arch/powerpc/include/asm/mmu-hash64.h.)
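
Reading the file from the host is then straightforward; an
illustrative sketch (the PID 12345 is made up, debugfs is assumed to
be mounted at /sys/kernel/debug, and error handling is omitted):

        char buf[4096];
        ssize_t n;
        int fd = open("/sys/kernel/debug/kvm/vm12345/htab", O_RDONLY);

        while ((n = read(fd, buf, sizeof(buf))) > 0)
                fwrite(buf, 1, n, stdout);
        close(fd);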

Signed-off-by: Paul Mackerras 
Signed-off-by: Alexander Graf 
---
 arch/powerpc/include/asm/kvm_book3s_64.h |   2 +
 arch/powerpc/include/asm/kvm_host.h  |   2 +
 arch/powerpc/kvm/book3s_64_mmu_hv.c  | 136 +++
 arch/powerpc/kvm/book3s_hv.c |  12 +++
 virt/kvm/kvm_main.c  |   1 +
 5 files changed, 153 insertions(+)

diff --git a/arch/powerpc/include/asm/kvm_book3s_64.h 
b/arch/powerpc/include/asm/kvm_book3s_64.h
index 0789a0f..869c53f 100644
--- a/arch/powerpc/include/asm/kvm_book3s_64.h
+++ b/arch/powerpc/include/asm/kvm_book3s_64.h
@@ -436,6 +436,8 @@ static inline struct kvm_memslots *kvm_memslots_raw(struct 
kvm *kvm)
return rcu_dereference_raw_notrace(kvm->memslots);
 }
 
+extern void kvmppc_mmu_debugfs_init(struct kvm *kvm);
+
 #endif /* CONFIG_KVM_BOOK3S_HV_POSSIBLE */
 
 #endif /* __ASM_KVM_BOOK3S_64_H__ */
diff --git a/arch/powerpc/include/asm/kvm_host.h 
b/arch/powerpc/include/asm/kvm_host.h
index 015773f..f1d0bbc 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -238,6 +238,8 @@ struct kvm_arch {
atomic_t hpte_mod_interest;
cpumask_t need_tlb_flush;
int hpt_cma_alloc;
+   struct dentry *debugfs_dir;
+   struct dentry *htab_dentry;
 #endif /* CONFIG_KVM_BOOK3S_HV_POSSIBLE */
 #ifdef CONFIG_KVM_BOOK3S_PR_POSSIBLE
struct mutex hpt_mutex;
diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c 
b/arch/powerpc/kvm/book3s_64_mmu_hv.c
index 6c6825a..d6fe308 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_hv.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c
@@ -27,6 +27,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -1490,6 +1491,141 @@ int kvm_vm_ioctl_get_htab_fd(struct kvm *kvm, struct 
kvm_get_htab_fd *ghf)
return ret;
 }
 
+struct debugfs_htab_state {
+   struct kvm  *kvm;
+   struct mutexmutex;
+   unsigned long   hpt_index;
+   int chars_left;
+   int buf_index;
+   charbuf[64];
+};
+
+static int debugfs_htab_open(struct inode *inode, struct file *file)
+{
+   struct kvm *kvm = inode->i_private;
+   struct debugfs_htab_state *p;
+
+   p = kzalloc(sizeof(*p), GFP_KERNEL);
+   if (!p)
+   return -ENOMEM;
+
+   kvm_get_kvm(kvm);
+   p->kvm = kvm;
+   mutex_init(&p->mutex);
+   file->private_data = p;
+
+   return nonseekable_open(inode, file);
+}
+
+static int debugfs_htab_release(struct inode *inode, struct file *file)
+{
+   struct debugfs_htab_state *p = file->private_data;
+
+   kvm_put_kvm(p->kvm);
+   kfree(p);
+   return 0;
+}
+
+static ssize_t debugfs_htab_read(struct file *file, char __user *buf,
+size_t len, loff_t *ppos)
+{
+   struct debugfs_htab_state *p = file->private_data;
+   ssize_t ret, r;
+   unsigned long i, n;
+   unsigned long v, hr, gr;
+   struct kvm *kvm;
+   __be64 *hptp;
+
+   ret = mutex_lock_interruptible(&p->mutex);
+   if (ret)
+   return ret;
+
+   if (p->chars_left) {
+   n = p->chars_left;
+   if (n > len)
+   n = len;
+   r = copy_to_user(buf, p->buf + p->buf_index, n);
+   n -= r;
+   p->chars_left -= n;
+   p->buf_index += n;
+   buf += n;
+   len -= n;
+   ret = n;
+   if (r) {
+   if (!n)
+   ret = -EFAULT;
+   goto out;
+   }
+   }
+
+   kvm = p->kvm;
+   i = p->hpt_index;
+   hptp = (__be64 *)(kvm->arch.hpt_virt + (i * HPTE_SIZE));
+   for (; len != 0 && i < kvm->arch.hpt_npte; ++i, hptp += 2) {
+   if (!(be64_t

Re: [PATCHv4] kvmppc: Implement H_LOGICAL_CI_{LOAD,STORE} in KVM

2015-04-21 Thread Alexander Graf

On 04/21/2015 02:41 AM, David Gibson wrote:

On POWER, storage caching is usually configured via the MMU - attributes
such as cache-inhibited are stored in the TLB and the hashed page table.

This makes correctly performing cache inhibited IO accesses awkward when
the MMU is turned off (real mode).  Some CPU models provide special
registers to control the cache attributes of real mode loads and stores but
this is not at all consistent.  This is a problem in particular for SLOF,
the firmware used on KVM guests, which runs entirely in real mode, but
which needs to do IO to load the kernel.

To simplify this, qemu implements two special hypercalls, H_LOGICAL_CI_LOAD
and H_LOGICAL_CI_STORE which simulate a cache-inhibited load or store to
a logical address (aka guest physical address).  SLOF uses these for IO.

However, because these are implemented within qemu, not the host kernel,
these bypass any IO devices emulated within KVM itself.  The simplest way
to see this problem is to attempt to boot a KVM guest from a virtio-blk
device with iothread / dataplane enabled.  The iothread code relies on an
in-kernel implementation of the virtio queue notification, which is not
triggered by the IO hcalls, and so the guest will stall in SLOF unable to
load the guest OS.

This patch addresses this by providing in-kernel implementations of the
2 hypercalls, which correctly scan the KVM IO bus.  Any access to an
address not handled by the KVM IO bus will cause a VM exit, hitting the
qemu implementation as before.

Note that a userspace change is also required, in order to enable these
new hcall implementations with KVM_CAP_PPC_ENABLE_HCALL.

Signed-off-by: David Gibson 
---
  arch/powerpc/include/asm/kvm_book3s.h |  3 ++
  arch/powerpc/kvm/book3s.c | 76 +++
  arch/powerpc/kvm/book3s_hv.c  | 12 ++
  arch/powerpc/kvm/book3s_pr_papr.c | 28 +
  4 files changed, 119 insertions(+)

Changes in v4:
  * Rebase onto 4.0+, correct for changed signature of kvm_io_bus_{read,write}

Alex, I saw from some build system notifications that you seemed to
hit some troubles compiling the last version of this patch. This
should fix it - hope it's not too late to get into 4.1.


Oh, I already fixed it up in my tree, no worries.


Alex



Re: [PATCH v2 00/12] Remaining improvements for HV KVM

2015-04-15 Thread Alexander Graf


On 09.04.15 10:49, Paolo Bonzini wrote:
> 
> 
> On 09/04/2015 00:57, Alexander Graf wrote:
>>>
>>> The last patch in this series needs a definition of PPC_MSGCLR that is
>>> added by the patch "powerpc/powernv: Fixes for hypervisor doorbell
>>> handling", which has now gone upstream into Linus' tree as commit
>>> 755563bc79c7 via the linuxppc-dev mailing list.  Alex, how do you want
>>> to handle that?  You could pull in the master branch of the kvm tree,
>>> which includes 755563bc79c7, or you could cherry-pick 755563bc79c7 and
>>> let the subsequent merge fix it up.
>>
>> I've just cherry-picked it for now since it still lives in my queue, so
>> it will get thrown out automatically once I rebase on next if it's
>> included in there.
>>
>> Paolo / Marcelo, could you please try to somehow get the commit above
>> into the next branch somehow? I guess the easiest would be to merge
>> linus/master into kvm/next.
>>
>> Thanks, applied all to kvm-ppc-queue.
> 
> I plan to send the x86/MIPS/s390/ARM merge very early to Linus, maybe
> even tomorrow.  So you can just rebase on top of 4.0-rc6 and send your
> pull request relative to Linus's tree instead of kvm/next.
> 
> Does that work for you?

Phew, that really complicates things on my side. I usually do

  kvm-ppc-queue -> kvm-ppc-next -> kvm/next

which means that my queue already contains your next patches. I could of
course do a rebase --onto and remove anything that is in the kvm tree,
but then we'd end up conflicting on documentation changes.

Since you already did send out the first pull request, just let me know
when you pulled linus' tree back into kvm/next (or kvm/master) so that I
can fast-forward merge this in my kvm-ppc-next branch and then rebase my
queue on top, merge it into the next branch and send you a pull request ;)


Alex


Re: [PATCH v2 00/12] Remaining improvements for HV KVM

2015-04-15 Thread Alexander Graf


On 14.04.15 13:56, Paul Mackerras wrote:
> On Thu, Apr 09, 2015 at 12:57:58AM +0200, Alexander Graf wrote:
>> On 03/28/2015 04:21 AM, Paul Mackerras wrote:
>>> This is the rest of my current patch queue for HV KVM on PPC.  This
>>> series is based on Alex Graf's kvm-ppc-queue branch.  The only change
>> >from the previous version of this series is that patch 2 has been
>>> updated to take account of the timebase offset.
>>>
>>> The last patch in this series needs a definition of PPC_MSGCLR that is
>>> added by the patch "powerpc/powernv: Fixes for hypervisor doorbell
>>> handling", which has now gone upstream into Linus' tree as commit
>>> 755563bc79c7 via the linuxppc-dev mailing list.  Alex, how do you want
>>> to handle that?  You could pull in the master branch of the kvm tree,
>>> which includes 755563bc79c7, or you could cherry-pick 755563bc79c7 and
>>> let the subsequent merge fix it up.
>>
>> I've just cherry-picked it for now since it still lives in my queue, so it
>> will get thrown out automatically once I rebase on next if it's included in
>> there.
>>
>> Paolo / Marcelo, could you please try to somehow get the commit above into
>> the next branch somehow? I guess the easiest would be to merge linus/master
>> into kvm/next.
>>
>> Thanks, applied all to kvm-ppc-queue.
> 
> Did you forget to push it out or something?  Your kvm-ppc-queue branch
> is still at 4.0-rc1 as far as I can see.

Oops, not sure how that happened. Does it show up correctly for you now?


Alex


Re: [PATCH v2 00/12] Remaining improvements for HV KVM

2015-04-08 Thread Alexander Graf

On 03/28/2015 04:21 AM, Paul Mackerras wrote:

This is the rest of my current patch queue for HV KVM on PPC.  This
series is based on Alex Graf's kvm-ppc-queue branch.  The only change
from the previous version of this series is that patch 2 has been
updated to take account of the timebase offset.

The last patch in this series needs a definition of PPC_MSGCLR that is
added by the patch "powerpc/powernv: Fixes for hypervisor doorbell
handling", which has now gone upstream into Linus' tree as commit
755563bc79c7 via the linuxppc-dev mailing list.  Alex, how do you want
to handle that?  You could pull in the master branch of the kvm tree,
which includes 755563bc79c7, or you could cherry-pick 755563bc79c7 and
let the subsequent merge fix it up.


I've just cherry-picked it for now since it still lives in my queue, so 
it will get thrown out automatically once I rebase on next if it's 
included in there.


Paolo / Marcelo, could you please try to somehow get the commit above 
into the next branch somehow? I guess the easiest would be to merge 
linus/master into kvm/next.


Thanks, applied all to kvm-ppc-queue.


Alex



[PULL 3/3] KVM: PPC: Book3S HV: Fix instruction emulation

2015-03-25 Thread Alexander Graf
From: Paul Mackerras 

Commit 4a157d61b48c ("KVM: PPC: Book3S HV: Fix endianness of
instruction obtained from HEIR register") had the side effect that
we no longer reset vcpu->arch.last_inst to -1 on guest exit in
the cases where the instruction is not fetched from the guest.
This means that if instruction emulation turns out to be required
in those cases, the host will emulate the wrong instruction, since
vcpu->arch.last_inst will contain the last instruction that was
emulated.

This fixes it by making sure that vcpu->arch.last_inst is reset
to -1 in those cases.

Signed-off-by: Paul Mackerras 
Signed-off-by: Alexander Graf 
---
 arch/powerpc/kvm/book3s_hv_rmhandlers.S | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S 
b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
index bb94e6f..6cbf163 100644
--- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
+++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
@@ -1005,6 +1005,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
/* Save HEIR (HV emulation assist reg) in emul_inst
   if this is an HEI (HV emulation interrupt, e40) */
li  r3,KVM_INST_FETCH_FAILED
+   stw r3,VCPU_LAST_INST(r9)
cmpwi   r12,BOOK3S_INTERRUPT_H_EMUL_ASSIST
bne 11f
mfspr   r3,SPRN_HEIR
-- 
1.8.1.4



[PULL 1/3] KVM: PPC: Book3S HV: Fix spinlock/mutex ordering issue in kvmppc_set_lpcr()

2015-03-25 Thread Alexander Graf
From: Paul Mackerras 

Currently, kvmppc_set_lpcr() has a spinlock around the whole function,
and inside that does mutex_lock(&kvm->lock).  It is not permitted to
take a mutex while holding a spinlock, because the mutex_lock might
call schedule().  In addition, this causes lockdep to warn about a
lock ordering issue:

==
[ INFO: possible circular locking dependency detected ]
3.18.0-kvm-04645-gdfea862-dirty #131 Not tainted
---
qemu-system-ppc/8179 is trying to acquire lock:
 (&kvm->lock){+.+.+.}, at: [] .kvmppc_set_lpcr+0xf4/0x1c0 
[kvm_hv]

but task is already holding lock:
 (&(&vcore->lock)->rlock){+.+...}, at: [] 
.kvmppc_set_lpcr+0x40/0x1c0 [kvm_hv]

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #1 (&(&vcore->lock)->rlock){+.+...}:
   [] .mutex_lock_nested+0x80/0x570
   [] .kvmppc_vcpu_run_hv+0xc4/0xe40 [kvm_hv]
   [] .kvmppc_vcpu_run+0x2c/0x40 [kvm]
   [] .kvm_arch_vcpu_ioctl_run+0x54/0x160 [kvm]
   [] .kvm_vcpu_ioctl+0x4a8/0x7b0 [kvm]
   [] .do_vfs_ioctl+0x444/0x770
   [] .SyS_ioctl+0xc4/0xe0
   [] syscall_exit+0x0/0x98

-> #0 (&kvm->lock){+.+.+.}:
   [] .lock_acquire+0xcc/0x1a0
   [] .mutex_lock_nested+0x80/0x570
   [] .kvmppc_set_lpcr+0xf4/0x1c0 [kvm_hv]
   [] .kvmppc_set_one_reg_hv+0x4dc/0x990 [kvm_hv]
   [] .kvmppc_set_one_reg+0x44/0x330 [kvm]
   [] .kvm_vcpu_ioctl_set_one_reg+0x5c/0x150 [kvm]
   [] .kvm_arch_vcpu_ioctl+0x214/0x2c0 [kvm]
   [] .kvm_vcpu_ioctl+0xe0/0x7b0 [kvm]
   [] .do_vfs_ioctl+0x444/0x770
   [] .SyS_ioctl+0xc4/0xe0
   [] syscall_exit+0x0/0x98

other info that might help us debug this:

 Possible unsafe locking scenario:

       CPU0                          CPU1
       ----                          ----
  lock(&(&vcore->lock)->rlock);
                                     lock(&kvm->lock);
                                     lock(&(&vcore->lock)->rlock);
  lock(&kvm->lock);

 *** DEADLOCK ***

2 locks held by qemu-system-ppc/8179:
 #0:  (&vcpu->mutex){+.+.+.}, at: [] .vcpu_load+0x28/0x90 
[kvm]
 #1:  (&(&vcore->lock)->rlock){+.+...}, at: [] 
.kvmppc_set_lpcr+0x40/0x1c0 [kvm_hv]

stack backtrace:
CPU: 4 PID: 8179 Comm: qemu-system-ppc Not tainted 
3.18.0-kvm-04645-gdfea862-dirty #131
Call Trace:
[c01a66c0f310] [c0b486ac] .dump_stack+0x88/0xb4 (unreliable)
[c01a66c0f390] [c00f8bec] .print_circular_bug+0x27c/0x3d0
[c01a66c0f440] [c00fe9e8] .__lock_acquire+0x2028/0x2190
[c01a66c0f5d0] [c00ff28c] .lock_acquire+0xcc/0x1a0
[c01a66c0f6a0] [c0b3c120] .mutex_lock_nested+0x80/0x570
[c01a66c0f7c0] [decc1f54] .kvmppc_set_lpcr+0xf4/0x1c0 [kvm_hv]
[c01a66c0f860] [decc510c] .kvmppc_set_one_reg_hv+0x4dc/0x990 
[kvm_hv]
[c01a66c0f8d0] [deb9f234] .kvmppc_set_one_reg+0x44/0x330 [kvm]
[c01a66c0f960] [deb9c9dc] .kvm_vcpu_ioctl_set_one_reg+0x5c/0x150 
[kvm]
[c01a66c0f9f0] [deb9ced4] .kvm_arch_vcpu_ioctl+0x214/0x2c0 [kvm]
[c01a66c0faf0] [deb940b0] .kvm_vcpu_ioctl+0xe0/0x7b0 [kvm]
[c01a66c0fcb0] [c026cbb4] .do_vfs_ioctl+0x444/0x770
[c01a66c0fd90] [c026cfa4] .SyS_ioctl+0xc4/0xe0
[c01a66c0fe30] [c0009264] syscall_exit+0x0/0x98

This fixes it by moving the mutex_lock()/mutex_unlock() pair outside
the spin-locked region.

Signed-off-by: Paul Mackerras 
Signed-off-by: Alexander Graf 
---
 arch/powerpc/kvm/book3s_hv.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index de4018a..b273193 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -942,20 +942,20 @@ static int kvm_arch_vcpu_ioctl_set_sregs_hv(struct 
kvm_vcpu *vcpu,
 static void kvmppc_set_lpcr(struct kvm_vcpu *vcpu, u64 new_lpcr,
bool preserve_top32)
 {
+   struct kvm *kvm = vcpu->kvm;
struct kvmppc_vcore *vc = vcpu->arch.vcore;
u64 mask;
 
+   mutex_lock(&kvm->lock);
spin_lock(&vc->lock);
/*
 * If ILE (interrupt little-endian) has changed, update the
 * MSR_LE bit in the intr_msr for each vcpu in this vcore.
 */
if ((new_lpcr & LPCR_ILE) != (vc->lpcr & LPCR_ILE)) {
-   struct kvm *kvm = vcpu->kvm;
struct kvm_vcpu *vcpu;
int i;
 
-   mutex_lock(&kvm->lock);
kvm_for_each_vcpu(i, vcpu, kvm) {
if (vcpu->arch.vcore != vc)
continue;
@@ -964,7 +964,6 @@ static void kvmppc_set_lpcr(struct kvm_vcpu *vcpu, u64 
new_lpcr,
else
 				vcpu->arch.intr_msr &= ~MSR_LE;
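
[The rule the commit message relies on, restated in miniature; an illustrative sketch, not part of the patch. A mutex may sleep, so it has to be taken before entering, and released after leaving, any spinlocked region:

	/* Broken: mutex_lock() can schedule() while vc->lock is held */
	spin_lock(&vc->lock);
	mutex_lock(&kvm->lock);

	/* Fixed ordering, as the patch above arranges it */
	mutex_lock(&kvm->lock);
	spin_lock(&vc->lock);
	/* ... update the vcpus in this vcore ... */
	spin_unlock(&vc->lock);
	mutex_unlock(&kvm->lock);
]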

[PULL 2/3] KVM: PPC: Book3S HV: Endian fix for accessing VPA yield count

2015-03-25 Thread Alexander Graf
From: Paul Mackerras 

The VPA (virtual processor area) is defined by PAPR and is therefore
big-endian, so we need a be32_to_cpu when reading it in
kvmppc_get_yield_count().  Without this, H_CONFER always fails on a
little-endian host, causing SMP guests to waste time spinning on
spinlocks.

Signed-off-by: Paul Mackerras 
Signed-off-by: Alexander Graf 
---
 arch/powerpc/kvm/book3s_hv.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index b273193..de74756 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -636,7 +636,7 @@ static int kvmppc_get_yield_count(struct kvm_vcpu *vcpu)
spin_lock(&vcpu->arch.vpa_update_lock);
lppaca = (struct lppaca *)vcpu->arch.vpa.pinned_addr;
if (lppaca)
-   yield_count = lppaca->yield_count;
+   yield_count = be32_to_cpu(lppaca->yield_count);
spin_unlock(&vcpu->arch.vpa_update_lock);
return yield_count;
 }
-- 
1.8.1.4
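
[To see why the raw load misbehaves on a little-endian host, here is a standalone illustration using the userspace equivalents of the kernel byte-swap helpers; the value 5 is arbitrary:

#include <endian.h>
#include <stdint.h>
#include <stdio.h>

int main(void)
{
	/* The guest stores yield_count big-endian, as PAPR requires */
	uint32_t vpa_yield_count = htobe32(5);

	printf("raw load:     %u\n", vpa_yield_count);          /* 83886080 on LE */
	printf("byte-swapped: %u\n", be32toh(vpa_yield_count)); /* 5 */
	return 0;
}

With the raw load, the yield count H_CONFER compares against essentially never matches, which is exactly the spinlock spinning the commit message describes.]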



[PULL 0/3] 4.0 patch queue 2015-03-25

2015-03-25 Thread Alexander Graf
Hi Paolo,

This is my current patch queue for 4.0.  Please pull.

Alex


The following changes since commit f710a12d73dfa1c3a5d2417f2482b970f03bb850:

  Merge tag 'kvm-arm-fixes-4.0-rc5' of 
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm (2015-03-16 
20:08:56 -0300)

are available in the git repository at:


  git://github.com/agraf/linux-2.6.git tags/signed-for-4.0

for you to fetch changes up to 2bf27601c7b50b6ced72f27304109dc52eb52919:

  KVM: PPC: Book3S HV: Fix instruction emulation (2015-03-20 11:42:33 +0100)


Patch queue for 4.0 - 2015-03-25

A few bug fixes for Book3S HV KVM:

  - Fix spinlock ordering
  - Fix idle guests on LE hosts
  - Fix instruction emulation


Paul Mackerras (3):
  KVM: PPC: Book3S HV: Fix spinlock/mutex ordering issue in 
kvmppc_set_lpcr()
  KVM: PPC: Book3S HV: Endian fix for accessing VPA yield count
  KVM: PPC: Book3S HV: Fix instruction emulation

 arch/powerpc/kvm/book3s_hv.c| 8 
 arch/powerpc/kvm/book3s_hv_rmhandlers.S | 1 +
 2 files changed, 5 insertions(+), 4 deletions(-)


Re: [kvm-ppc:kvm-ppc-queue 7/9] ERROR: ".__spin_yield" [arch/powerpc/kvm/kvm.ko] undefined!

2015-03-23 Thread Alexander Graf


On 23.03.15 04:03, Michael Ellerman wrote:
> On Mon, 2015-03-23 at 14:00 +1100, Paul Mackerras wrote:
>> On Fri, Mar 20, 2015 at 08:07:53PM +0800, kbuild test robot wrote:
>>> tree:   git://github.com/agraf/linux-2.6.git kvm-ppc-queue
>>> head:   9b1daf3cfba1801768aa41b1b6ad0b653844241f
>>> commit: aba777f5ce0accb4c6a277e671de0330752954e8 [7/9] KVM: PPC: Book3S HV: 
>>> Convert ICS mutex lock to spin lock
>>> config: powerpc-defconfig (attached as .config)
>>> reproduce:
>>>   wget 
>>> https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross
>>>  -O ~/bin/make.cross
>>>   chmod +x ~/bin/make.cross
>>>   git checkout aba777f5ce0accb4c6a277e671de0330752954e8
>>>   # save the attached .config to linux build tree
>>>   make.cross ARCH=powerpc 
>>>
>>> All error/warnings:
>>>
> ERROR: ".__spin_yield" [arch/powerpc/kvm/kvm.ko] undefined!
>>
>> Yes, this is the patch that depends on the "powerpc: Export
>> __spin_yield" patch that Suresh posted to linuxppc-...@ozlabs.org and
>> I acked.
>>
>> I think the best thing at this stage is probably for Alex to take that
>> patch through his tree, assuming Michael is OK with that.
> 
> Fine by me.
> 
> Acked-by: Michael Ellerman 

Awesome, thanks, applied to kvm-ppc-queue.


Alex


Re: [PATCH 07/23] KVM: PPC: Book3S: Allow reuse of vCPU object

2015-03-23 Thread Alexander Graf


On 23.03.15 08:50, Bharata B Rao wrote:
> On Sat, Mar 21, 2015 at 8:28 PM, Alexander Graf  wrote:
>>
>>
>> On 20.03.15 16:51, Bharata B Rao wrote:
>>> On Fri, Mar 20, 2015 at 12:34:18PM +0100, Alexander Graf wrote:
>>>>
>>>>
>>>> On 20.03.15 12:26, Paul Mackerras wrote:
>>>>> On Fri, Mar 20, 2015 at 12:01:32PM +0100, Alexander Graf wrote:
>>>>>>
>>>>>>
>>>>>> On 20.03.15 10:39, Paul Mackerras wrote:
>>>>>>> From: Bharata B Rao 
>>>>>>>
>>>>>>> Since KVM isn't equipped to handle closure of vcpu fd from 
>>>>>>> userspace(QEMU)
>>>>>>> correctly, certain workarounds have to be employed to allow reuse of
>>>>>>> vcpu array slot in KVM during cpu hot plug/unplug from guest. One such
>>>>>>> proposed workaround is to park the vcpu fd in userspace during cpu 
>>>>>>> unplug
>>>>>>> and reuse it later during next hotplug.
>>>>>>>
>>>>>>> More details can be found here:
>>>>>>> KVM: https://www.mail-archive.com/kvm@vger.kernel.org/msg102839.html
>>>>>>> QEMU: http://lists.gnu.org/archive/html/qemu-devel/2014-12/msg00859.html
>>>>>>>
>>>>>>> In order to support this workaround with PowerPC KVM, don't create or
>>>>>>> initialize ICP if the vCPU is found to be already associated with an 
>>>>>>> ICP.
>>>>>>>
>>>>>>> Signed-off-by: Bharata B Rao 
>>>>>>> Signed-off-by: Paul Mackerras 
>>>>>>
>>>>>> This probably makes some sense, but please make sure that user space has
>>>>>> some way to figure out whether hotplug works at all.
>>>>>
>>>>> Bharata is working on the qemu side of all this, so I assume he has
>>>>> that covered.
>>>>
>>>> Well, so far the kernel doesn't expose anything he can query, so I
>>>> suppose he just blindly assumes that older host kernels will randomly
>>>> break and nobody cares. I'd rather prefer to see a CAP exposed that qemu
>>>> can check on.
>>>
>>> I see that you have already taken this into your tree. I have an updated
>>> patch to expose a CAP. If the below patch looks ok, then let me know how
>>> you would prefer to take this patch in.
>>>
>>> Regards,
>>> Bharata.
>>>
>>> KVM: PPC: BOOK3S: Allow reuse of vCPU object
>>>
>>> From: Bharata B Rao 
>>>
>>> Since KVM isn't equipped to handle closure of vcpu fd from userspace(QEMU)
>>> correctly, certain workarounds have to be employed to allow reuse of
>>> vcpu array slot in KVM during cpu hot plug/unplug from guest. One such
>>> proposed workaround is to park the vcpu fd in userspace during cpu unplug
>>> and reuse it later during next hotplug.
>>>
>>> More details can be found here:
>>> KVM: https://www.mail-archive.com/kvm@vger.kernel.org/msg102839.html
>>> QEMU: http://lists.gnu.org/archive/html/qemu-devel/2014-12/msg00859.html
>>>
>>> In order to support this workaround with PowerPC KVM, don't create or
>>> initialize ICP if the vCPU is found to be already associated with an ICP.
>>> User space (QEMU) can reuse the vCPU after checking for the availability
>>> of KVM_CAP_SPAPR_REUSE_VCPU capability.
>>>
>>> Signed-off-by: Bharata B Rao 
>>> ---
>>>  arch/powerpc/kvm/book3s_xics.c |9 +++--
>>>  arch/powerpc/kvm/powerpc.c |   12 
>>>  include/uapi/linux/kvm.h   |1 +
>>>  3 files changed, 20 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/arch/powerpc/kvm/book3s_xics.c b/arch/powerpc/kvm/book3s_xics.c
>>> index a4a8d9f..ead3a35 100644
>>> --- a/arch/powerpc/kvm/book3s_xics.c
>>> +++ b/arch/powerpc/kvm/book3s_xics.c
>>> @@ -1313,8 +1313,13 @@ int kvmppc_xics_connect_vcpu(struct kvm_device *dev, 
>>> struct kvm_vcpu *vcpu,
>>>   return -EPERM;
>>>   if (xics->kvm != vcpu->kvm)
>>>   return -EPERM;
>>> - if (vcpu->arch.irq_type)
>>> - return -EBUSY;
>>> +
>>> + /*
>>> +	 * If irq_type is already set, don't reinitialize but
>>> +	 * return success allowing this vcpu to be reused.

Re: [PATCH 07/23] KVM: PPC: Book3S: Allow reuse of vCPU object

2015-03-21 Thread Alexander Graf


On 20.03.15 16:51, Bharata B Rao wrote:
> On Fri, Mar 20, 2015 at 12:34:18PM +0100, Alexander Graf wrote:
>>
>>
>> On 20.03.15 12:26, Paul Mackerras wrote:
>>> On Fri, Mar 20, 2015 at 12:01:32PM +0100, Alexander Graf wrote:
>>>>
>>>>
>>>> On 20.03.15 10:39, Paul Mackerras wrote:
>>>>> From: Bharata B Rao 
>>>>>
>>>>> Since KVM isn't equipped to handle closure of vcpu fd from userspace(QEMU)
>>>>> correctly, certain work arounds have to be employed to allow reuse of
>>>>> vcpu array slot in KVM during cpu hot plug/unplug from guest. One such
>>>>> proposed workaround is to park the vcpu fd in userspace during cpu unplug
>>>>> and reuse it later during next hotplug.
>>>>>
>>>>> More details can be found here:
>>>>> KVM: https://www.mail-archive.com/kvm@vger.kernel.org/msg102839.html
>>>>> QEMU: http://lists.gnu.org/archive/html/qemu-devel/2014-12/msg00859.html
>>>>>
>>>>> In order to support this workaround with PowerPC KVM, don't create or
>>>>> initialize ICP if the vCPU is found to be already associated with an ICP.
>>>>>
>>>>> Signed-off-by: Bharata B Rao 
>>>>> Signed-off-by: Paul Mackerras 
>>>>
>>>> This probably makes some sense, but please make sure that user space has
>>>> some way to figure out whether hotplug works at all.
>>>
>>> Bharata is working on the qemu side of all this, so I assume he has
>>> that covered.
>>
>> Well, so far the kernel doesn't expose anything he can query, so I
>> suppose he just blindly assumes that older host kernels will randomly
>> break and nobody cares. I'd rather prefer to see a CAP exposed that qemu
>> can check on.
> 
> I see that you have already taken this into your tree. I have an updated
> patch to expose a CAP. If the below patch looks ok, then let me know how
> you would prefer to take this patch in.
> 
> Regards,
> Bharata.
> 
> KVM: PPC: BOOK3S: Allow reuse of vCPU object
> 
> From: Bharata B Rao 
> 
> Since KVM isn't equipped to handle closure of vcpu fd from userspace(QEMU)
> correctly, certain workarounds have to be employed to allow reuse of
> vcpu array slot in KVM during cpu hot plug/unplug from guest. One such
> proposed workaround is to park the vcpu fd in userspace during cpu unplug
> and reuse it later during next hotplug.
> 
> More details can be found here:
> KVM: https://www.mail-archive.com/kvm@vger.kernel.org/msg102839.html
> QEMU: http://lists.gnu.org/archive/html/qemu-devel/2014-12/msg00859.html
> 
> In order to support this workaround with PowerPC KVM, don't create or
> initialize ICP if the vCPU is found to be already associated with an ICP.
> User space (QEMU) can reuse the vCPU after checking for the availability
> of KVM_CAP_SPAPR_REUSE_VCPU capability.
> 
> Signed-off-by: Bharata B Rao 
> ---
>  arch/powerpc/kvm/book3s_xics.c |9 +++--
>  arch/powerpc/kvm/powerpc.c |   12 
>  include/uapi/linux/kvm.h   |1 +
>  3 files changed, 20 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/powerpc/kvm/book3s_xics.c b/arch/powerpc/kvm/book3s_xics.c
> index a4a8d9f..ead3a35 100644
> --- a/arch/powerpc/kvm/book3s_xics.c
> +++ b/arch/powerpc/kvm/book3s_xics.c
> @@ -1313,8 +1313,13 @@ int kvmppc_xics_connect_vcpu(struct kvm_device *dev, 
> struct kvm_vcpu *vcpu,
>   return -EPERM;
>   if (xics->kvm != vcpu->kvm)
>   return -EPERM;
> - if (vcpu->arch.irq_type)
> - return -EBUSY;
> +
> + /*
> +	 * If irq_type is already set, don't reinitialize but
> +  * return success allowing this vcpu to be reused.
> +  */
> + if (vcpu->arch.irq_type != KVMPPC_IRQ_DEFAULT)
> + return 0;
>  
>   r = kvmppc_xics_create_icp(vcpu, xcpu);
>   if (!r)
> diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
> index 27c0fac..5b7007c 100644
> --- a/arch/powerpc/kvm/powerpc.c
> +++ b/arch/powerpc/kvm/powerpc.c
> @@ -564,6 +564,18 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long 
> ext)
>   r = 1;
>   break;
>  #endif
> + case KVM_CAP_SPAPR_REUSE_VCPU:
> + /*
> +  * Kernel currently doesn't support closing of vCPU fd from
> +  * user space (QEMU) correctly. Hence the option available
> +  * is to park the vCPU fd in user space whenever a guest
> +  * CPU is hot removed and reuse the 
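
[On the userspace side, the check Alex asked for is a standard capability probe. A minimal sketch follows; KVM_CHECK_EXTENSION is the standard KVM ioctl, while the value of KVM_CAP_SPAPR_REUSE_VCPU would come from the merged header, since the patch above is not upstream yet:

#include <sys/ioctl.h>
#include <linux/kvm.h>

static int can_park_and_reuse_vcpu_fd(int kvm_fd)
{
	/* > 0 means the kernel supports reusing a parked vcpu fd */
	return ioctl(kvm_fd, KVM_CHECK_EXTENSION, KVM_CAP_SPAPR_REUSE_VCPU) > 0;
}
]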

Re: [PATCH v4 2/4] kvm/ppc/mpic: drop unused IRQ_testbit

2015-03-21 Thread Alexander Graf


On 21.03.15 07:56, Arseny Solokha wrote:
> Drop unused static procedure which doesn't have callers within its
> translation unit. It had been already removed independently in QEMU[1]
> from the OpenPIC implementation borrowed by the kernel.
> 
> [1] https://lists.gnu.org/archive/html/qemu-devel/2014-06/msg01812.html
> 
> v4: Fixed the comment regarding the origination of OpenPIC codebase
> and CC'ed KVM mailing lists, as suggested by Alexander Graf.
> 
> v3: In patch 4/4, do not remove fsl_mpic_primary_get_version() from
> arch/powerpc/sysdev/mpic.c because the patch by Jia Hongtao
> ("powerpc/85xx: workaround for chips with MSI hardware errata") makes
> use of it.
> 
> v2: Added a brief explanation to each patch description of why removed
> functions are unused, as suggested by Michael Ellerman.
> 
> Signed-off-by: Arseny Solokha 

Thanks, applied to kvm-ppc-queue (for 4.1).


Alex


Re: [PATCH 00/23] Bug fixes and improvements for HV KVM

2015-03-20 Thread Alexander Graf


On 20.03.15 10:39, Paul Mackerras wrote:
> This is my current patch queue for HV KVM on PPC.  This series is
> based on the "queue" branch of the KVM tree, i.e. roughly v4.0-rc3
> plus a set of recent KVM changes which don't intersect with the
> changes in this series.  On top of that, in my testing I have some
> patches which are not KVM-related but are needed to boot and run a
> recent upstream kernel successfully:
> 
> tick/broadcast-hrtimer : Fix suspicious RCU usage in idle loop
> tick/hotplug: Handover time related duties before cpu offline
> powerpc/powernv: Check image loaded or not before calling flash
> powerpc/powernv: Fixes for hypervisor doorbell handling
> powerpc/powernv: Fix return value from power7_nap() et al.
> powerpc: Export __spin_yield
> 
> These patches have been posted by their authors and are on their way
> upstream via various trees.  They are not included in this series.
> 
> The first three patches are bug fixes that should go into v4.0 if
> possible.  The remainder are intended for the 4.1 merge window.
> 
> The patch "powerpc: Export __spin_yield" is a prerequisite for patch
> 9/23 of this series ("KVM: PPC: Book3S HV: Convert ICS mutex lock to
> spin lock").  It is on its way upstream through the linuxppc-dev
> mailing list.
> 
> The patch "powerpc/powernv: Fixes for hypervisor doorbell handling" is
> needed for correct operation with patch 20/23, "KVM: PPC: Book3S HV:
> Use msgsnd for signalling threads".  It is also on its way upstream
> through the linuxppc-dev list.  I am expecting both of these
> prerequisite patches to go into 4.0.
> 
> Finally, the last patch in this series converts some of the assembly
> code in book3s_hv_rmhandlers.S into C.  I intend to continue this
> trend.

Thanks, applied patches 4-11 to kvm-ppc-queue.


Alex


Re: [PATCH 13/23] KVM: PPC: Book3S HV: Accumulate timing information for real-mode code

2015-03-20 Thread Alexander Graf


On 20.03.15 12:25, Paul Mackerras wrote:
> On Fri, Mar 20, 2015 at 12:15:15PM +0100, Alexander Graf wrote:
>>
>>
>> On 20.03.15 10:39, Paul Mackerras wrote:
>>> This reads the timebase at various points in the real-mode guest
>>> entry/exit code and uses that to accumulate total, minimum and
>>> maximum time spent in those parts of the code.  Currently these
>>> times are accumulated per vcpu in 5 parts of the code:
>>>
>>> * rm_entry - time taken from the start of kvmppc_hv_entry() until
>>>   just before entering the guest.
>>> * rm_intr - time from when we take a hypervisor interrupt in the
>>>   guest until we either re-enter the guest or decide to exit to the
>>>   host.  This includes time spent handling hcalls in real mode.
>>> * rm_exit - time from when we decide to exit the guest until the
>>>   return from kvmppc_hv_entry().
>>> * guest - time spent in the guest
>>> * cede - time spent napping in real mode due to an H_CEDE hcall
>>>   while other threads in the same vcore are active.
>>>
>>> These times are exposed in debugfs in a directory per vcpu that
>>> contains a file called "timings".  This file contains one line for
>>> each of the 5 timings above, with the name followed by a colon and
>>> 4 numbers, which are the count (number of times the code has been
>>> executed), the total time, the minimum time, and the maximum time,
>>> all in nanoseconds.
>>>
>>> Signed-off-by: Paul Mackerras 
>>
>> Have you measured the additional overhead this brings?
> 
> I haven't - in fact I did this patch so I could measure the overhead
> or improvement from other changes I did, but it doesn't measure its
> own overhead, of course.  I guess I need a workload that does a
> defined number of guest entries and exits and measure how fast it runs
> with and without the patch (maybe something like H_SET_MODE in a
> loop).  I'll figure something out and post the results.  

Yeah, just measure the number of exits you can handle for a simple
hcall. If there is measurable overhead, it's probably a good idea to
move the statistics gathering into #ifdef paths for DEBUGFS or maybe
even a separate EXIT_TIMING config option as we have it for booke.


Alex
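
[For reference, the booke statistics mentioned above sit behind CONFIG_KVM_EXIT_TIMING. A sketch of analogous gating for this series, using a hypothetical CONFIG_KVM_BOOK3S_HV_EXIT_TIMING symbol; the real accounting lives partly in assembly, so this only shows the shape of the C side:

#ifdef CONFIG_KVM_BOOK3S_HV_EXIT_TIMING
	/* Start timing the real-mode entry phase */
	vcpu->arch.cur_activity = &vcpu->arch.rm_entry;
	vcpu->arch.cur_tb_start = mftb();
#endif
]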


Re: [PATCH 07/23] KVM: PPC: Book3S: Allow reuse of vCPU object

2015-03-20 Thread Alexander Graf


On 20.03.15 12:26, Paul Mackerras wrote:
> On Fri, Mar 20, 2015 at 12:01:32PM +0100, Alexander Graf wrote:
>>
>>
>> On 20.03.15 10:39, Paul Mackerras wrote:
>>> From: Bharata B Rao 
>>>
>>> Since KVM isn't equipped to handle closure of vcpu fd from userspace(QEMU)
>>> correctly, certain workarounds have to be employed to allow reuse of
>>> vcpu array slot in KVM during cpu hot plug/unplug from guest. One such
>>> proposed workaround is to park the vcpu fd in userspace during cpu unplug
>>> and reuse it later during next hotplug.
>>>
>>> More details can be found here:
>>> KVM: https://www.mail-archive.com/kvm@vger.kernel.org/msg102839.html
>>> QEMU: http://lists.gnu.org/archive/html/qemu-devel/2014-12/msg00859.html
>>>
>>> In order to support this workaround with PowerPC KVM, don't create or
>>> initialize ICP if the vCPU is found to be already associated with an ICP.
>>>
>>> Signed-off-by: Bharata B Rao 
>>> Signed-off-by: Paul Mackerras 
>>
>> This probably makes some sense, but please make sure that user space has
>> some way to figure out whether hotplug works at all.
> 
> Bharata is working on the qemu side of all this, so I assume he has
> that covered.

Well, so far the kernel doesn't expose anything he can query, so I
suppose he just blindly assumes that older host kernels will randomly
break and nobody cares. I'd rather prefer to see a CAP exposed that qemu
can check on.

> 
>> Also Paul, for patches that you pick up from others, I'd prefer if they
>> send the patches to the ML themselves first and you pick them up from
>> there then. That way we give everyone the same treatment.
> 
> Fair enough.  In fact Bharata did post the patch but he sent it to
> linuxppc-...@ozlabs.org not the KVM lists.

Please make sure you only take patches into your queue that made it to
at least kvm@vger, preferably kvm-ppc@vger as well. If you see related
patches on other mailing lists, just ask the respective people to resend
with proper ML exposure.


Alex


Re: [PATCH 20/23] KVM: PPC: Book3S HV: Use msgsnd for signalling threads on POWER8

2015-03-20 Thread Alexander Graf


On 20.03.15 10:39, Paul Mackerras wrote:
> This uses msgsnd where possible for signalling other threads within
> the same core on POWER8 systems, rather than IPIs through the XICS
> interrupt controller.  This includes waking secondary threads to run
> the guest, the interrupts generated by the virtual XICS, and the
> interrupts to bring the other threads out of the guest when exiting.
> 
> Signed-off-by: Paul Mackerras 
> ---
>  arch/powerpc/kernel/asm-offsets.c   |  4 +++
>  arch/powerpc/kvm/book3s_hv.c| 48 
> ++---
>  arch/powerpc/kvm/book3s_hv_rm_xics.c| 11 
>  arch/powerpc/kvm/book3s_hv_rmhandlers.S | 41 
>  4 files changed, 83 insertions(+), 21 deletions(-)
> 
> diff --git a/arch/powerpc/kernel/asm-offsets.c 
> b/arch/powerpc/kernel/asm-offsets.c
> index fa7b57d..0ce2aa6 100644
> --- a/arch/powerpc/kernel/asm-offsets.c
> +++ b/arch/powerpc/kernel/asm-offsets.c
> @@ -37,6 +37,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #ifdef CONFIG_PPC64
>  #include 
>  #include 
> @@ -568,6 +569,7 @@ int main(void)
>   DEFINE(VCORE_LPCR, offsetof(struct kvmppc_vcore, lpcr));
>   DEFINE(VCORE_PCR, offsetof(struct kvmppc_vcore, pcr));
>   DEFINE(VCORE_DPDES, offsetof(struct kvmppc_vcore, dpdes));
> + DEFINE(VCORE_PCPU, offsetof(struct kvmppc_vcore, pcpu));
>   DEFINE(VCPU_SLB_E, offsetof(struct kvmppc_slb, orige));
>   DEFINE(VCPU_SLB_V, offsetof(struct kvmppc_slb, origv));
>   DEFINE(VCPU_SLB_SIZE, sizeof(struct kvmppc_slb));
> @@ -757,5 +759,7 @@ int main(void)
>   offsetof(struct paca_struct, subcore_sibling_mask));
>  #endif
>  
> + DEFINE(PPC_DBELL_SERVER, PPC_DBELL_SERVER);
> +
>   return 0;
>  }
> diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
> index 03a8bb4..2c34bae 100644
> --- a/arch/powerpc/kvm/book3s_hv.c
> +++ b/arch/powerpc/kvm/book3s_hv.c
> @@ -51,6 +51,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -84,9 +85,34 @@ static DECLARE_BITMAP(default_enabled_hcalls, 
> MAX_HCALL_OPCODE/4 + 1);
>  static void kvmppc_end_cede(struct kvm_vcpu *vcpu);
>  static int kvmppc_hv_setup_htab_rma(struct kvm_vcpu *vcpu);
>  
> +static bool kvmppc_ipi_thread(int cpu)
> +{
> + /* On POWER8 for IPIs to threads in the same core, use msgsnd */
> + if (cpu_has_feature(CPU_FTR_ARCH_207S)) {
> + preempt_disable();
> + if ((cpu & ~7) == (smp_processor_id() & ~7)) {
> + unsigned long msg = PPC_DBELL_TYPE(PPC_DBELL_SERVER);
> + msg |= cpu & 7;
> + smp_mb();
> + __asm__ __volatile__ (PPC_MSGSND(%0) : : "r" (msg));
> + preempt_enable();
> + return true;
> + }
> + preempt_enable();
> + }
> +
> +#if defined(CONFIG_PPC_ICP_NATIVE) && defined(CONFIG_SMP)
> + if (cpu >= 0 && cpu < nr_cpu_ids && paca[cpu].kvm_hstate.xics_phys) {
> + xics_wake_cpu(cpu);
> + return true;
> + }
> +#endif
> +
> + return false;
> +}
> +
>  static void kvmppc_fast_vcpu_kick_hv(struct kvm_vcpu *vcpu)
>  {
> - int me;
>   int cpu = vcpu->cpu;
>   wait_queue_head_t *wqp;
>  
> @@ -96,20 +122,12 @@ static void kvmppc_fast_vcpu_kick_hv(struct kvm_vcpu 
> *vcpu)
>   ++vcpu->stat.halt_wakeup;
>   }
>  
> - me = get_cpu();
> + if (kvmppc_ipi_thread(cpu + vcpu->arch.ptid))
> + return;
>  
>   /* CPU points to the first thread of the core */
> - if (cpu != me && cpu >= 0 && cpu < nr_cpu_ids) {
> -#ifdef CONFIG_PPC_ICP_NATIVE
> - int real_cpu = cpu + vcpu->arch.ptid;
> - if (paca[real_cpu].kvm_hstate.xics_phys)
> - xics_wake_cpu(real_cpu);
> - else
> -#endif
> - if (cpu_online(cpu))
> - smp_send_reschedule(cpu);
> - }
> - put_cpu();
> + if (cpu >= 0 && cpu < nr_cpu_ids && cpu_online(cpu))
> + smp_send_reschedule(cpu);
>  }
>  
>  /*
> @@ -1754,10 +1772,8 @@ static void kvmppc_start_thread(struct kvm_vcpu *vcpu)
>   /* Order stores to hstate.kvm_vcore etc. before store to kvm_vcpu */
>   smp_wmb();
>   tpaca->kvm_hstate.kvm_vcpu = vcpu;
> -#if defined(CONFIG_PPC_ICP_NATIVE) && defined(CONFIG_SMP)
>   if (cpu != smp_processor_id())
> - xics_wake_cpu(cpu);
> -#endif
> + kvmppc_ipi_thread(cpu);
>  }
>  
>  static void kvmppc_wait_for_nap(void)
> diff --git a/arch/powerpc/kvm/book3s_hv_rm_xics.c 
> b/arch/powerpc/kvm/book3s_hv_rm_xics.c
> index 6dded8c..457a8b1 100644
> --- a/arch/powerpc/kvm/book3s_hv_rm_xics.c
> +++ b/arch/powerpc/kvm/book3s_hv_rm_xics.c
> @@ -18,6 +18,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #include "book3s_xics.h"
>  
> @@ -83,6 +84,16 @@ static void icp_rm_set_vcpu_irq(struct kvm_vc
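
[The same-core test in kvmppc_ipi_thread() above leans on POWER8's fixed layout of 8 threads per core. A standalone illustration of that arithmetic:

#include <assert.h>

/*
 * Masking off the low 3 bits of a cpu number gives its core's first
 * thread, so two cpus share a core iff those masked values match.
 */
static int same_core(int cpu_a, int cpu_b)
{
	return (cpu_a & ~7) == (cpu_b & ~7);
}

int main(void)
{
	assert(same_core(9, 14));	/* threads 1 and 6 of core 1 */
	assert(!same_core(7, 8));	/* core 0's last thread vs core 1 */
	return 0;
}
]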

Re: [PATCH 12/23] KVM: PPC: Book3S HV: Create debugfs file for each guest's HPT

2015-03-20 Thread Alexander Graf


On 20.03.15 10:39, Paul Mackerras wrote:
> This creates a debugfs directory for each HV guest (assuming debugfs
> is enabled in the kernel config), and within that directory, a file
> by which the contents of the guest's HPT (hashed page table) can be
> read.  The directory is named vm<pid>, where <pid> is the PID of the
> process that created the guest.  The file is named "htab".  This is
> intended to help in debugging problems in the host's management
> of guest memory.
> 
> The contents of the file consist of a series of lines like this:
> 
>   3f48 4000d032bf003505 000bd7ff1196 0003b5c71196
> 
> The first field is the index of the entry in the HPT, the second and
> third are the HPT entry, so the third field contains the real page
> number that is mapped by the entry if the entry's valid bit is set.
> The fourth field is the guest's view of the second doubleword of the
> entry, so it contains the guest physical address.  (The format of the
> second through fourth fields is described in the Power ISA and also
> in arch/powerpc/include/asm/mmu-hash64.h.)
> 
> Signed-off-by: Paul Mackerras 
> ---
>  arch/powerpc/include/asm/kvm_book3s_64.h |   2 +
>  arch/powerpc/include/asm/kvm_host.h  |   2 +
>  arch/powerpc/kvm/book3s_64_mmu_hv.c  | 136 
> +++
>  arch/powerpc/kvm/book3s_hv.c |  12 +++
>  virt/kvm/kvm_main.c  |   1 +
>  5 files changed, 153 insertions(+)
> 
> diff --git a/arch/powerpc/include/asm/kvm_book3s_64.h 
> b/arch/powerpc/include/asm/kvm_book3s_64.h
> index 0789a0f..869c53f 100644
> --- a/arch/powerpc/include/asm/kvm_book3s_64.h
> +++ b/arch/powerpc/include/asm/kvm_book3s_64.h
> @@ -436,6 +436,8 @@ static inline struct kvm_memslots 
> *kvm_memslots_raw(struct kvm *kvm)
>   return rcu_dereference_raw_notrace(kvm->memslots);
>  }
>  
> +extern void kvmppc_mmu_debugfs_init(struct kvm *kvm);
> +
>  #endif /* CONFIG_KVM_BOOK3S_HV_POSSIBLE */
>  
>  #endif /* __ASM_KVM_BOOK3S_64_H__ */
> diff --git a/arch/powerpc/include/asm/kvm_host.h 
> b/arch/powerpc/include/asm/kvm_host.h
> index 015773f..f1d0bbc 100644
> --- a/arch/powerpc/include/asm/kvm_host.h
> +++ b/arch/powerpc/include/asm/kvm_host.h
> @@ -238,6 +238,8 @@ struct kvm_arch {
>   atomic_t hpte_mod_interest;
>   cpumask_t need_tlb_flush;
>   int hpt_cma_alloc;
> + struct dentry *debugfs_dir;
> + struct dentry *htab_dentry;
>  #endif /* CONFIG_KVM_BOOK3S_HV_POSSIBLE */
>  #ifdef CONFIG_KVM_BOOK3S_PR_POSSIBLE
>   struct mutex hpt_mutex;
> diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c 
> b/arch/powerpc/kvm/book3s_64_mmu_hv.c
> index 6c6825a..d6fe308 100644
> --- a/arch/powerpc/kvm/book3s_64_mmu_hv.c
> +++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c
> @@ -27,6 +27,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #include 
>  #include 
> @@ -1490,6 +1491,141 @@ int kvm_vm_ioctl_get_htab_fd(struct kvm *kvm, struct 
> kvm_get_htab_fd *ghf)
>   return ret;
>  }
>  
> +struct debugfs_htab_state {
> + struct kvm  *kvm;
> + struct mutexmutex;
> + unsigned long   hpt_index;
> + int chars_left;
> + int buf_index;
> + charbuf[64];
> +};
> +
> +static int debugfs_htab_open(struct inode *inode, struct file *file)
> +{
> + struct kvm *kvm = inode->i_private;
> + struct debugfs_htab_state *p;
> +
> + p = kzalloc(sizeof(*p), GFP_KERNEL);
> + if (!p)
> + return -ENOMEM;
> +
> + kvm_get_kvm(kvm);
> + p->kvm = kvm;
> + mutex_init(&p->mutex);
> + file->private_data = p;
> +
> + return nonseekable_open(inode, file);
> +}
> +
> +static int debugfs_htab_release(struct inode *inode, struct file *file)
> +{
> + struct debugfs_htab_state *p = file->private_data;
> +
> + kvm_put_kvm(p->kvm);
> + kfree(p);
> + return 0;
> +}
> +
> +static ssize_t debugfs_htab_read(struct file *file, char __user *buf,
> +  size_t len, loff_t *ppos)
> +{
> + struct debugfs_htab_state *p = file->private_data;
> + ssize_t ret, r;
> + unsigned long i, n;
> + unsigned long v, hr, gr;
> + struct kvm *kvm;
> + __be64 *hptp;
> +
> + ret = mutex_lock_interruptible(&p->mutex);
> + if (ret)
> + return ret;
> +
> + if (p->chars_left) {
> + n = p->chars_left;
> + if (n > len)
> + n = len;
> + r = copy_to_user(buf, p->buf + p->buf_index, n);
> + n -= r;
> + p->chars_left -= n;
> + p->buf_index += n;
> + buf += n;
> + len -= n;
> + ret = n;
> + if (r) {
> + if (!n)
> + ret = -EFAULT;
> + goto out;
> + }
> + }
> +
> + kvm = p->kvm;
> + i = p->hpt_index;
> + hptp = (__be64 *)(kvm->arch.hpt_virt + (i * HPTE_SIZE));
> +	for (; len != 0 && i < kvm->arch.hpt_npte; ++i, hptp += 2) {
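
[As a consumer-side illustration of the file format the commit message describes (four whitespace-separated hex fields per line; HPTE_V_VALID is bit 0 of the first doubleword), a minimal line parser might look like:

#include <stdio.h>

static int parse_htab_line(const char *line)
{
	unsigned long long idx, v, hr, gr;

	/* index, HPTE dword 0, HPTE dword 1, guest view of dword 1 */
	if (sscanf(line, "%llx %llx %llx %llx", &idx, &v, &hr, &gr) != 4)
		return -1;
	printf("slot 0x%llx: %s, guest dword 1 = 0x%llx\n",
	       idx, (v & 1) ? "valid" : "not valid", gr);
	return 0;
}
]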

Re: [PATCH 13/23] KVM: PPC: Book3S HV: Accumulate timing information for real-mode code

2015-03-20 Thread Alexander Graf


On 20.03.15 10:39, Paul Mackerras wrote:
> This reads the timebase at various points in the real-mode guest
> entry/exit code and uses that to accumulate total, minimum and
> maximum time spent in those parts of the code.  Currently these
> times are accumulated per vcpu in 5 parts of the code:
> 
> * rm_entry - time taken from the start of kvmppc_hv_entry() until
>   just before entering the guest.
> * rm_intr - time from when we take a hypervisor interrupt in the
>   guest until we either re-enter the guest or decide to exit to the
>   host.  This includes time spent handling hcalls in real mode.
> * rm_exit - time from when we decide to exit the guest until the
>   return from kvmppc_hv_entry().
> * guest - time spent in the guest
> * cede - time spent napping in real mode due to an H_CEDE hcall
>   while other threads in the same vcore are active.
> 
> These times are exposed in debugfs in a directory per vcpu that
> contains a file called "timings".  This file contains one line for
> each of the 5 timings above, with the name followed by a colon and
> 4 numbers, which are the count (number of times the code has been
> executed), the total time, the minimum time, and the maximum time,
> all in nanoseconds.
> 
> Signed-off-by: Paul Mackerras 

Have you measured the additional overhead this brings?

> ---
>  arch/powerpc/include/asm/kvm_host.h |  19 +
>  arch/powerpc/include/asm/time.h |   3 +
>  arch/powerpc/kernel/asm-offsets.c   |  11 +++
>  arch/powerpc/kernel/time.c  |   6 ++
>  arch/powerpc/kvm/book3s_hv.c| 135 
> 
>  arch/powerpc/kvm/book3s_hv_rmhandlers.S | 105 -
>  6 files changed, 276 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/kvm_host.h 
> b/arch/powerpc/include/asm/kvm_host.h
> index f1d0bbc..286c0ce 100644
> --- a/arch/powerpc/include/asm/kvm_host.h
> +++ b/arch/powerpc/include/asm/kvm_host.h
> @@ -369,6 +369,14 @@ struct kvmppc_slb {
>   u8 base_page_size;  /* MMU_PAGE_xxx */
>  };
>  
> +/* Struct used to accumulate timing information in HV real mode code */
> +struct kvmhv_tb_accumulator {
> + u64 seqcount;   /* used to synchronize access, also count * 2 */
> + u64 tb_total;   /* total time in timebase ticks */
> + u64 tb_min; /* min time */
> + u64 tb_max; /* max time */
> +};
> +
>  # ifdef CONFIG_PPC_FSL_BOOK3E
>  #define KVMPPC_BOOKE_IAC_NUM 2
>  #define KVMPPC_BOOKE_DAC_NUM 2
> @@ -656,6 +664,17 @@ struct kvm_vcpu_arch {
>   u64 busy_preempt;
>  
>   u32 emul_inst;
> +
> + struct kvmhv_tb_accumulator *cur_activity;  /* What we're timing */
> + u64 cur_tb_start;   /* when it started */
> + struct kvmhv_tb_accumulator rm_entry;   /* real-mode entry code */
> + struct kvmhv_tb_accumulator rm_intr;/* real-mode intr handling */
> + struct kvmhv_tb_accumulator rm_exit;/* real-mode exit code */
> + struct kvmhv_tb_accumulator guest_time; /* guest execution */
> + struct kvmhv_tb_accumulator cede_time;  /* time napping inside guest */
> +
> + struct dentry *debugfs_dir;
> + struct dentry *debugfs_timings;
>  #endif
>  };
>  
> diff --git a/arch/powerpc/include/asm/time.h b/arch/powerpc/include/asm/time.h
> index 03cbada..10fc784 100644
> --- a/arch/powerpc/include/asm/time.h
> +++ b/arch/powerpc/include/asm/time.h
> @@ -211,5 +211,8 @@ extern void secondary_cpu_time_init(void);
>  
>  DECLARE_PER_CPU(u64, decrementers_next_tb);
>  
> +/* Convert timebase ticks to nanoseconds */
> +unsigned long long tb_to_ns(unsigned long long tb_ticks);
> +
>  #endif /* __KERNEL__ */
>  #endif /* __POWERPC_TIME_H */
> diff --git a/arch/powerpc/kernel/asm-offsets.c 
> b/arch/powerpc/kernel/asm-offsets.c
> index 4717859..ec9f59c 100644
> --- a/arch/powerpc/kernel/asm-offsets.c
> +++ b/arch/powerpc/kernel/asm-offsets.c
> @@ -458,6 +458,17 @@ int main(void)
>   DEFINE(VCPU_SPRG1, offsetof(struct kvm_vcpu, arch.shregs.sprg1));
>   DEFINE(VCPU_SPRG2, offsetof(struct kvm_vcpu, arch.shregs.sprg2));
>   DEFINE(VCPU_SPRG3, offsetof(struct kvm_vcpu, arch.shregs.sprg3));
> + DEFINE(VCPU_TB_RMENTRY, offsetof(struct kvm_vcpu, arch.rm_entry));
> + DEFINE(VCPU_TB_RMINTR, offsetof(struct kvm_vcpu, arch.rm_intr));
> + DEFINE(VCPU_TB_RMEXIT, offsetof(struct kvm_vcpu, arch.rm_exit));
> + DEFINE(VCPU_TB_GUEST, offsetof(struct kvm_vcpu, arch.guest_time));
> + DEFINE(VCPU_TB_CEDE, offsetof(struct kvm_vcpu, arch.cede_time));
> + DEFINE(VCPU_CUR_ACTIVITY, offsetof(struct kvm_vcpu, arch.cur_activity));
> + DEFINE(VCPU_ACTIVITY_START, offsetof(struct kvm_vcpu, 
> arch.cur_tb_start));
> + DEFINE(TAS_SEQCOUNT, offsetof(struct kvmhv_tb_accumulator, seqcount));
> + DEFINE(TAS_TOTAL, offsetof(struct kvmhv_tb_accumulator, tb_total));
> + DEFINE(TAS_MIN, offsetof(struct kvmhv_tb_accumulator, tb_min));
> +	DEFINE(TAS_MAX, offsetof(struct kvmhv_tb_accumulator, tb_max));
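
[The seqcount field above doubles as a sequence lock and an entry counter: odd while the real-mode updater is mid-update, and equal to twice the entry count when stable. A hedged sketch of how a consumer could take a consistent snapshot; the barrier choice here is an assumption, not taken from the patch:

static void snapshot_accumulator(struct kvmhv_tb_accumulator *acc,
				 u64 *count, u64 *tb_total)
{
	u64 seq;

	do {
		seq = READ_ONCE(acc->seqcount);
		smp_rmb();
		*tb_total = acc->tb_total;
		smp_rmb();
	} while ((seq & 1) || READ_ONCE(acc->seqcount) != seq);

	*count = seq / 2;	/* seqcount is also "number of entries * 2" */
}
]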

Re: [PATCH 07/23] KVM: PPC: Book3S: Allow reuse of vCPU object

2015-03-20 Thread Alexander Graf


On 20.03.15 10:39, Paul Mackerras wrote:
> From: Bharata B Rao 
> 
> Since KVM isn't equipped to handle closure of vcpu fd from userspace(QEMU)
> correctly, certain work arounds have to be employed to allow reuse of
> vcpu array slot in KVM during cpu hot plug/unplug from guest. One such
> proposed workaround is to park the vcpu fd in userspace during cpu unplug
> and reuse it later during next hotplug.
> 
> More details can be found here:
> KVM: https://www.mail-archive.com/kvm@vger.kernel.org/msg102839.html
> QEMU: http://lists.gnu.org/archive/html/qemu-devel/2014-12/msg00859.html
> 
> In order to support this workaround with PowerPC KVM, don't create or
> initialize ICP if the vCPU is found to be already associated with an ICP.
> 
> Signed-off-by: Bharata B Rao 
> Signed-off-by: Paul Mackerras 

This probably makes some sense, but please make sure that user space has
some way to figure out whether hotplug works at all.

Also Paul, for patches that you pick up from others, I'd prefer if they
send the patches to the ML themselves first and you pick them up from
there then. That way we give everyone the same treatment.


Alex


Re: [PATCH 00/23] Bug fixes and improvements for HV KVM

2015-03-20 Thread Alexander Graf


On 20.03.15 10:39, Paul Mackerras wrote:
> This is my current patch queue for HV KVM on PPC.  This series is
> based on the "queue" branch of the KVM tree, i.e. roughly v4.0-rc3
> plus a set of recent KVM changes which don't intersect with the
> changes in this series.  On top of that, in my testing I have some
> patches which are not KVM-related but are needed to boot and run a
> recent upstream kernel successfully:
> 
> tick/broadcast-hrtimer : Fix suspicious RCU usage in idle loop
> tick/hotplug: Handover time related duties before cpu offline
> powerpc/powernv: Check image loaded or not before calling flash
> powerpc/powernv: Fixes for hypervisor doorbell handling
> powerpc/powernv: Fix return value from power7_nap() et al.
> powerpc: Export __spin_yield
> 
> These patches have been posted by their authors and are on their way
> upstream via various trees.  They are not included in this series.
> 
> The first three patches are bug fixes that should go into v4.0 if
> possible.

Thanks, applied the first 3 to my for-4.0 branch which is going through
autotest now. If everything runs fine, I'll send it to Paolo for
upstream merge.


Alex


Re: [PATCHv3] kvmppc: Implement H_LOGICAL_CI_{LOAD,STORE} in KVM

2015-03-16 Thread Alexander Graf


On 16.03.15 21:41, David Gibson wrote:
> On Thu, Feb 05, 2015 at 01:57:11AM +0100, Alexander Graf wrote:
>>
>>
>> On 05.02.15 01:53, David Gibson wrote:
>>> On POWER, storage caching is usually configured via the MMU - attributes
>>> such as cache-inhibited are stored in the TLB and the hashed page table.
>>>
>>> This makes correctly performing cache inhibited IO accesses awkward when
>>> the MMU is turned off (real mode).  Some CPU models provide special
>>> registers to control the cache attributes of real mode load and stores but
>>> this is not at all consistent.  This is a problem in particular for SLOF,
>>> the firmware used on KVM guests, which runs entirely in real mode, but
>>> which needs to do IO to load the kernel.
>>>
>>> To simplify this qemu implements two special hypercalls, H_LOGICAL_CI_LOAD
>>> and H_LOGICAL_CI_STORE which simulate a cache-inhibited load or store to
>>> a logical address (aka guest physical address).  SLOF uses these for IO.
>>>
>>> However, because these are implemented within qemu, not the host kernel,
>>> these bypass any IO devices emulated within KVM itself.  The simplest way
>>> to see this problem is to attempt to boot a KVM guest from a virtio-blk
>>> device with iothread / dataplane enabled.  The iothread code relies on an
>>> in kernel implementation of the virtio queue notification, which is not
>>> triggered by the IO hcalls, and so the guest will stall in SLOF unable to
>>> load the guest OS.
>>>
>>> This patch addresses this by providing in-kernel implementations of the
>>> 2 hypercalls, which correctly scan the KVM IO bus.  Any access to an
>>> address not handled by the KVM IO bus will cause a VM exit, hitting the
>>> qemu implementation as before.
>>>
>>> Note that a userspace change is also required, in order to enable these
>>> new hcall implementations with KVM_CAP_PPC_ENABLE_HCALL.
>>>
>>> Signed-off-by: David Gibson 
>>
>> Thanks, applied to kvm-ppc-queue.
> 
> Any news on when this might go up to mainline?

I'm aiming for 4.1.


Alex


Re: H_CLEAR_REF and H_CLEAR_MOD

2015-02-18 Thread Alexander Graf

> On 18.02.2015 at 07:12, Nathan Whitehorn wrote:
> 
> It seems like KVM doesn't implement the H_CLEAR_REF and H_CLEAR_MOD 
> hypervisor calls, which are absolutely critical for memory management in the 
> FreeBSD kernel (and are marked "mandatory" in the PAPR manual). It seems some 
> patches have been contributed already in 
> https://lists.ozlabs.org/pipermail/linuxppc-dev/2011-December/095013.html, so 
> it would be fantastic if these could end up upstream.

Paul, I guess we never included this because there was no user. If FreeBSD 
does use it though, I think it makes a lot of sense to resend it for inclusion.

> 
> I'm going to try to get some kind of workaround in the meantime so we can at 
> least run on existing kernels.

Please don't add hacks in FreeBSD only because kvm is missing a feature. Let's 
just get this done properly :).


Alex
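
[For readers unfamiliar with the hcalls in question: PAPR's H_CLEAR_REF clears the reference (R) bit in the second doubleword of an HPTE and reports its previous state, and H_CLEAR_MOD does the same for the change bit. Reduced to the bit manipulation, and leaving aside all the locking and TLB concerns a real kernel implementation has to handle, the semantics look like this sketch:

#include <stdint.h>

#define HPTE_R_R	0x0000000000000100ULL	/* reference bit in HPTE dword 1 */

static uint64_t clear_ref(uint64_t *hpte_r)
{
	uint64_t old = *hpte_r;

	*hpte_r = old & ~HPTE_R_R;	/* clear R ... */
	return old & HPTE_R_R;		/* ... and report its old state */
}
]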



Re: [PATCHv3] kvmppc: Implement H_LOGICAL_CI_{LOAD,STORE} in KVM

2015-02-04 Thread Alexander Graf


On 05.02.15 01:53, David Gibson wrote:
> On POWER, storage caching is usually configured via the MMU - attributes
> such as cache-inhibited are stored in the TLB and the hashed page table.
> 
> This makes correctly performing cache inhibited IO accesses awkward when
> the MMU is turned off (real mode).  Some CPU models provide special
> registers to control the cache attributes of real mode load and stores but
> this is not at all consistent.  This is a problem in particular for SLOF,
> the firmware used on KVM guests, which runs entirely in real mode, but
> which needs to do IO to load the kernel.
> 
> To simplify this qemu implements two special hypercalls, H_LOGICAL_CI_LOAD
> and H_LOGICAL_CI_STORE which simulate a cache-inhibited load or store to
> a logical address (aka guest physical address).  SLOF uses these for IO.
> 
> However, because these are implemented within qemu, not the host kernel,
> these bypass any IO devices emulated within KVM itself.  The simplest way
> to see this problem is to attempt to boot a KVM guest from a virtio-blk
> device with iothread / dataplane enabled.  The iothread code relies on an
> in kernel implementation of the virtio queue notification, which is not
> triggered by the IO hcalls, and so the guest will stall in SLOF unable to
> load the guest OS.
> 
> This patch addresses this by providing in-kernel implementations of the
> 2 hypercalls, which correctly scan the KVM IO bus.  Any access to an
> address not handled by the KVM IO bus will cause a VM exit, hitting the
> qemu implementation as before.
> 
> Note that a userspace change is also required, in order to enable these
> new hcall implementations with KVM_CAP_PPC_ENABLE_HCALL.
> 
> Signed-off-by: David Gibson 

Thanks, applied to kvm-ppc-queue.


Alex


Re: [PATCHv2] kvmppc: Implement H_LOGICAL_CI_{LOAD,STORE} in KVM

2015-02-04 Thread Alexander Graf


On 03.02.15 06:44, David Gibson wrote:
> On POWER, storage caching is usually configured via the MMU - attributes
> such as cache-inhibited are stored in the TLB and the hashed page table.
> 
> This makes correctly performing cache inhibited IO accesses awkward when
> the MMU is turned off (real mode).  Some CPU models provide special
> registers to control the cache attributes of real mode load and stores but
> this is not at all consistent.  This is a problem in particular for SLOF,
> the firmware used on KVM guests, which runs entirely in real mode, but
> which needs to do IO to load the kernel.
> 
> To simplify this qemu implements two special hypercalls, H_LOGICAL_CI_LOAD
> and H_LOGICAL_CI_STORE which simulate a cache-inhibited load or store to
> a logical address (aka guest physical address).  SLOF uses these for IO.
> 
> However, because these are implemented within qemu, not the host kernel,
> these bypass any IO devices emulated within KVM itself.  The simplest way
> to see this problem is to attempt to boot a KVM guest from a virtio-blk
> device with iothread / dataplane enabled.  The iothread code relies on an
> in kernel implementation of the virtio queue notification, which is not
> triggered by the IO hcalls, and so the guest will stall in SLOF unable to
> load the guest OS.
> 
> This patch addresses this by providing in-kernel implementations of the
> 2 hypercalls, which correctly scan the KVM IO bus.  Any access to an
> address not handled by the KVM IO bus will cause a VM exit, hitting the
> qemu implementation as before.
> 
> Note that a userspace change is also required, in order to enable these
> new hcall implementations with KVM_CAP_PPC_ENABLE_HCALL.
> 
> Signed-off-by: David Gibson 
> ---
>  arch/powerpc/include/asm/kvm_book3s.h |  3 ++
>  arch/powerpc/kvm/book3s.c | 76 
> +++
>  arch/powerpc/kvm/book3s_hv.c  | 12 ++
>  arch/powerpc/kvm/book3s_pr_papr.c | 28 +
>  4 files changed, 119 insertions(+)
> 
> v2:
>   - Removed some debugging printk()s that were accidentally left in
>   - Fix endianness; like all PAPR hypercalls, these should always act
> big-endian, even if the guest is little-endian (in practice this
> makes no difference, since the only user is SLOF, which is always
> big-endian)
> 
> diff --git a/arch/powerpc/include/asm/kvm_book3s.h 
> b/arch/powerpc/include/asm/kvm_book3s.h
> index 942c7b1..578e550 100644
> --- a/arch/powerpc/include/asm/kvm_book3s.h
> +++ b/arch/powerpc/include/asm/kvm_book3s.h
> @@ -292,6 +292,9 @@ static inline bool kvmppc_supports_magic_page(struct 
> kvm_vcpu *vcpu)
>   return !is_kvmppc_hv_enabled(vcpu->kvm);
>  }
>  
> +extern int kvmppc_h_logical_ci_load(struct kvm_vcpu *vcpu);
> +extern int kvmppc_h_logical_ci_store(struct kvm_vcpu *vcpu);
> +
>  /* Magic register values loaded into r3 and r4 before the 'sc' assembly
>   * instruction for the OSI hypercalls */
>  #define OSI_SC_MAGIC_R3  0x113724FA
> diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c
> index 888bf46..7b51492 100644
> --- a/arch/powerpc/kvm/book3s.c
> +++ b/arch/powerpc/kvm/book3s.c
> @@ -820,6 +820,82 @@ void kvmppc_core_destroy_vm(struct kvm *kvm)
>  #endif
>  }
>  
> +int kvmppc_h_logical_ci_load(struct kvm_vcpu *vcpu)
> +{
> + unsigned long size = kvmppc_get_gpr(vcpu, 4);
> + unsigned long addr = kvmppc_get_gpr(vcpu, 5);
> + u64 buf;
> + int ret;
> +
> + if (!is_power_of_2(size) || (size > sizeof(buf)))
> + return H_TOO_HARD;
> +
> + ret = kvm_io_bus_read(vcpu->kvm, KVM_MMIO_BUS, addr, size, &buf);
> + if (ret != 0)
> + return H_TOO_HARD;
> +
> + switch (size) {
> + case 1:
> + kvmppc_set_gpr(vcpu, 4, *(u8 *)&buf);
> + break;
> +
> + case 2:
> + kvmppc_set_gpr(vcpu, 4, be16_to_cpu(*(u16 *)&buf));
> + break;
> +
> + case 4:
> + kvmppc_set_gpr(vcpu, 4, be32_to_cpu(*(u32 *)&buf));
> + break;
> +
> + case 8:
> + kvmppc_set_gpr(vcpu, 4, be64_to_cpu(*(u64 *)&buf));

Shouldn't these casts be __be types?
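I.e. something along these lines (untested sketch), which would also
keep sparse quiet about the endian conversions:

	switch (size) {
	case 1:
		kvmppc_set_gpr(vcpu, 4, *(u8 *)&buf);
		break;
	case 2:
		kvmppc_set_gpr(vcpu, 4, be16_to_cpu(*(__be16 *)&buf));
		break;
	case 4:
		kvmppc_set_gpr(vcpu, 4, be32_to_cpu(*(__be32 *)&buf));
		break;
	case 8:
		kvmppc_set_gpr(vcpu, 4, be64_to_cpu(*(__be64 *)&buf));
		break;
	}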

> + break;
> +
> + default:
> + BUG();
> + }
> +
> + return H_SUCCESS;
> +}
> +EXPORT_SYMBOL_GPL(kvmppc_h_logical_ci_load); /* For use by the kvm-pr module 
> */

No need for the comment.

> +
> +int kvmppc_h_logical_ci_store(struct kvm_vcpu *vcpu)
> +{
> + unsigned long size = kvmppc_get_gpr(vcpu, 4);
> + unsigned long addr = kvmppc_get_gpr(vcpu, 5);
> + unsigned long val = kvmppc_get_gpr(vcpu, 6);
> + u64 buf;
> + int ret;
> +
> + switch (size) {
> + case 1:
> + *(u8 *)&buf = val;
> + break;
> +
> + case 2:
> + *(u16 *)&buf = cpu_to_be16(val);
> + break;
> +
> + case 4:
> + *(u32 *)&buf = cpu_to_be32(val);
> + break;
> +
> + case 8:
> + *(u64 *

Re: [PATCH 0/8] current ACCESS_ONCE patch queue

2015-01-16 Thread Alexander Graf


On 15.01.15 09:58, Christian Borntraeger wrote:
> Folks,
> 
> fyi, this is my current patch queue for the next merge window. It
> does contain a patch that will disallow ACCESS_ONCE on non-scalar
> types.
> 
> The tree is part of linux-next and can be found at
> git://git.kernel.org/pub/scm/linux/kernel/git/borntraeger/linux.git linux-next
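
For anyone skimming the archive: ACCESS_ONCE() is a volatile cast of
its lvalue, which gcc (notably 4.6/4.7) can miscompile for non-scalar
types, so aggregate users move over to READ_ONCE()/WRITE_ONCE().
Illustrative snippet, not taken from the series itself:

static void access_once_example(void)
{
	struct pair { int lo, hi; } p = { 0, 0 }, snap;
	int x = 0, v;

	snap = ACCESS_ONCE(p);	/* non-scalar: disallowed from now on */
	snap = READ_ONCE(p);	/* the replacement for aggregates */
	v = ACCESS_ONCE(x);	/* scalar: still fine */
	(void)snap; (void)v;
}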

KVM PPC bits are:

 Acked-by: Alexander Graf 



Alex


Re: [PATCH] powerpc: powernv: Return to cpu offline loop when finished in KVM guest

2014-12-21 Thread Alexander Graf

On 21.12.14 15:13, Andreas Schwab wrote:
> arch/powerpc/kvm/built-in.o: In function `kvm_no_guest':
> arch/powerpc/kvm/book3s_hv_rmhandlers.o:(.text+0x724): undefined reference to 
> `power7_wakeup_loss'

Ugh. We just removed support for 970 HV mode, but that obviously doesn't
mean you can't compile in support for HV mode without enabling p7.

Paul, what would you think of a patch that makes BOOK3S_HV depend on
PPC_POWERNV?


Alex


[PULL 16/18] KVM: PPC: Book3S HV: Fix endianness of instruction obtained from HEIR register

2014-12-17 Thread Alexander Graf
From: Paul Mackerras 

There are two ways in which a guest instruction can be obtained from
the guest in the guest exit code in book3s_hv_rmhandlers.S.  If the
exit was caused by a Hypervisor Emulation interrupt (i.e. an illegal
instruction), the offending instruction is in the HEIR register
(Hypervisor Emulation Instruction Register).  If the exit was caused
by a load or store to an emulated MMIO device, we load the instruction
from the guest by turning data relocation on and loading the instruction
with an lwz instruction.

Unfortunately, in the case where the guest has opposite endianness to
the host, these two methods give results of different endianness, but
both get put into vcpu->arch.last_inst.  The HEIR value has been loaded
using guest endianness, whereas the lwz will load the instruction using
host endianness.  The rest of the code that uses vcpu->arch.last_inst
assumes it was loaded using host endianness.

To fix this, we define a new vcpu field to store the HEIR value.  Then,
in kvmppc_handle_exit_hv(), we transfer the value from this new field to
vcpu->arch.last_inst, doing a byte-swap if the guest and host endianness
differ.

Signed-off-by: Paul Mackerras 
Signed-off-by: Alexander Graf 
---
 arch/powerpc/include/asm/kvm_host.h | 2 ++
 arch/powerpc/kernel/asm-offsets.c   | 1 +
 arch/powerpc/kvm/book3s_hv.c| 4 
 arch/powerpc/kvm/book3s_hv_rmhandlers.S | 4 ++--
 4 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_host.h 
b/arch/powerpc/include/asm/kvm_host.h
index 5686a42..6544187 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -651,6 +651,8 @@ struct kvm_vcpu_arch {
spinlock_t tbacct_lock;
u64 busy_stolen;
u64 busy_preempt;
+
+   u32 emul_inst;
 #endif
 };
 
diff --git a/arch/powerpc/kernel/asm-offsets.c 
b/arch/powerpc/kernel/asm-offsets.c
index 815212e..b14716b 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -498,6 +498,7 @@ int main(void)
DEFINE(VCPU_DAR, offsetof(struct kvm_vcpu, arch.shregs.dar));
DEFINE(VCPU_VPA, offsetof(struct kvm_vcpu, arch.vpa.pinned_addr));
DEFINE(VCPU_VPA_DIRTY, offsetof(struct kvm_vcpu, arch.vpa.dirty));
+   DEFINE(VCPU_HEIR, offsetof(struct kvm_vcpu, arch.emul_inst));
 #endif
 #ifdef CONFIG_PPC_BOOK3S
DEFINE(VCPU_VCPUID, offsetof(struct kvm_vcpu, vcpu_id));
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 1ee4e9e..299351e 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -831,6 +831,10 @@ static int kvmppc_handle_exit_hv(struct kvm_run *run, 
struct kvm_vcpu *vcpu,
 * Accordingly return to Guest or Host.
 */
case BOOK3S_INTERRUPT_H_EMUL_ASSIST:
+   if (vcpu->arch.emul_inst != KVM_INST_FETCH_FAILED)
+   vcpu->arch.last_inst = kvmppc_need_byteswap(vcpu) ?
+   swab32(vcpu->arch.emul_inst) :
+   vcpu->arch.emul_inst;
if (vcpu->guest_debug & KVM_GUESTDBG_USE_SW_BP) {
r = kvmppc_emulate_debug_inst(run, vcpu);
} else {
diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S 
b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
index c0f9e68..26a5b8d 100644
--- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
+++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
@@ -983,13 +983,13 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
 
stw r12,VCPU_TRAP(r9)
 
-   /* Save HEIR (HV emulation assist reg) in last_inst
+   /* Save HEIR (HV emulation assist reg) in emul_inst
   if this is an HEI (HV emulation interrupt, e40) */
li  r3,KVM_INST_FETCH_FAILED
cmpwi   r12,BOOK3S_INTERRUPT_H_EMUL_ASSIST
bne 11f
mfspr   r3,SPRN_HEIR
-11:stw r3,VCPU_LAST_INST(r9)
+11:stw r3,VCPU_HEIR(r9)
 
/* these are volatile across C function calls */
mfctr   r3
-- 
1.8.1.4



[PULL 14/18] KVM: PPC: Book3S HV: Tracepoints for KVM HV guest interactions

2014-12-17 Thread Alexander Graf
From: "Suresh E. Warrier" 

This patch adds trace points in the guest entry and exit code and also
for exceptions handled by the host in kernel mode - hypercalls and page
faults. The new events are added to /sys/kernel/debug/tracing/events
under a new subsystem called kvm_hv.
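
The new trace_hv.h is not quoted in full in the diff below; for
orientation, one of its events would be declared along these lines (a
sketch following the standard TRACE_EVENT() pattern, not the literal
file contents):

#include <linux/tracepoint.h>

TRACE_EVENT(kvm_guest_enter,
	TP_PROTO(struct kvm_vcpu *vcpu),
	TP_ARGS(vcpu),

	TP_STRUCT__entry(
		__field(int,		vcpu_id)
		__field(unsigned long,	pc)
	),

	TP_fast_assign(
		__entry->vcpu_id = vcpu->vcpu_id;
		__entry->pc	 = kvmppc_get_pc(vcpu);
	),

	TP_printk("VCPU %d: pc=0x%lx", __entry->vcpu_id, __entry->pc)
);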

Acked-by: Paul Mackerras 
Signed-off-by: Suresh Warrier 
Signed-off-by: Alexander Graf 
---
 arch/powerpc/kvm/book3s_64_mmu_hv.c |  13 +-
 arch/powerpc/kvm/book3s_hv.c|  19 ++
 arch/powerpc/kvm/trace_book3s.h |  32 +++
 arch/powerpc/kvm/trace_hv.h | 477 
 arch/powerpc/kvm/trace_pr.h |  25 +-
 5 files changed, 539 insertions(+), 27 deletions(-)
 create mode 100644 arch/powerpc/kvm/trace_book3s.h
 create mode 100644 arch/powerpc/kvm/trace_hv.h

diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c 
b/arch/powerpc/kvm/book3s_64_mmu_hv.c
index 59425f1..311e4a3 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_hv.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c
@@ -37,6 +37,8 @@
 #include 
 #include 
 
+#include "trace_hv.h"
+
 /* POWER7 has 10-bit LPIDs, PPC970 has 6-bit LPIDs */
 #define MAX_LPID_970   63
 
@@ -622,6 +624,8 @@ int kvmppc_book3s_hv_page_fault(struct kvm_run *run, struct 
kvm_vcpu *vcpu,
gfn = gpa >> PAGE_SHIFT;
memslot = gfn_to_memslot(kvm, gfn);
 
+   trace_kvm_page_fault_enter(vcpu, hpte, memslot, ea, dsisr);
+
/* No memslot means it's an emulated MMIO region */
if (!memslot || (memslot->flags & KVM_MEMSLOT_INVALID))
return kvmppc_hv_emulate_mmio(run, vcpu, gpa, ea,
@@ -641,6 +645,7 @@ int kvmppc_book3s_hv_page_fault(struct kvm_run *run, struct 
kvm_vcpu *vcpu,
mmu_seq = kvm->mmu_notifier_seq;
smp_rmb();
 
+   ret = -EFAULT;
is_io = 0;
pfn = 0;
page = NULL;
@@ -664,7 +669,7 @@ int kvmppc_book3s_hv_page_fault(struct kvm_run *run, struct 
kvm_vcpu *vcpu,
}
	up_read(&current->mm->mmap_sem);
if (!pfn)
-   return -EFAULT;
+   goto out_put;
} else {
page = pages[0];
pfn = page_to_pfn(page);
@@ -694,14 +699,14 @@ int kvmppc_book3s_hv_page_fault(struct kvm_run *run, 
struct kvm_vcpu *vcpu,
}
}
 
-   ret = -EFAULT;
if (psize > pte_size)
goto out_put;
 
/* Check WIMG vs. the actual page we're accessing */
if (!hpte_cache_flags_ok(r, is_io)) {
if (is_io)
-   return -EFAULT;
+   goto out_put;
+
/*
 * Allow guest to map emulated device memory as
 * uncacheable, but actually make it cacheable.
@@ -765,6 +770,8 @@ int kvmppc_book3s_hv_page_fault(struct kvm_run *run, struct 
kvm_vcpu *vcpu,
SetPageDirty(page);
 
  out_put:
+   trace_kvm_page_fault_exit(vcpu, hpte, ret);
+
if (page) {
/*
 * We drop pages[0] here, not page because page might
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 74afa2d..325ed94 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -58,6 +58,9 @@
 
 #include "book3s.h"
 
+#define CREATE_TRACE_POINTS
+#include "trace_hv.h"
+
 /* #define EXIT_DEBUG */
 /* #define EXIT_DEBUG_SIMPLE */
 /* #define EXIT_DEBUG_INT */
@@ -1730,6 +1733,7 @@ static void kvmppc_run_core(struct kvmppc_vcore *vc)
list_for_each_entry(vcpu, &vc->runnable_threads, arch.run_list) {
kvmppc_start_thread(vcpu);
kvmppc_create_dtl_entry(vcpu, vc);
+   trace_kvm_guest_enter(vcpu);
}
 
/* Set this explicitly in case thread 0 doesn't have a vcpu */
@@ -1738,6 +1742,9 @@ static void kvmppc_run_core(struct kvmppc_vcore *vc)
 
vc->vcore_state = VCORE_RUNNING;
preempt_disable();
+
+   trace_kvmppc_run_core(vc, 0);
+
spin_unlock(&vc->lock);
 
kvm_guest_enter();
@@ -1783,6 +1790,8 @@ static void kvmppc_run_core(struct kvmppc_vcore *vc)
kvmppc_core_pending_dec(vcpu))
kvmppc_core_dequeue_dec(vcpu);
 
+   trace_kvm_guest_exit(vcpu);
+
ret = RESUME_GUEST;
if (vcpu->arch.trap)
ret = kvmppc_handle_exit_hv(vcpu->arch.kvm_run, vcpu,
@@ -1808,6 +1817,8 @@ static void kvmppc_run_core(struct kvmppc_vcore *vc)
wake_up(&vcpu->arch.cpu_run);
}
}
+
+   trace_kvmppc_run_core(vc, 1);
 }
 
 /*
@@ -1854,11 +1865,13 @@ static void kvmppc_vcore_blocked(struct kvmppc_vcore 
*vc)
}
 
vc->vcore_state = VCORE_SLEEPING;
+   trace_kvmppc_vcore_blocked(vc, 0);
spin_unlock(&vc->lock);
schedule();
finish_wait(&v

[PULL 05/18] KVM: PPC: Book3S HV: Fix KSM memory corruption

2014-12-17 Thread Alexander Graf
From: Paul Mackerras 

Testing with KSM active in the host showed occasional corruption of
guest memory.  Typically a page that should have contained zeroes
would contain values that look like the contents of a user process
stack (values such as 0x_3fff__xxx).

Code inspection in kvmppc_h_protect revealed that there was a race
condition with the possibility of granting write access to a page
which is read-only in the host page tables.  The code attempts to keep
the host mapping read-only if the host userspace PTE is read-only, but
if that PTE had been temporarily made invalid for any reason, the
read-only check would not trigger and the host HPTE could end up
read-write.  Examination of the guest HPT in the failure situation
revealed that there were indeed shared pages which should have been
read-only that were mapped read-write.

To close this race, we don't let a page go from being read-only to
being read-write, as far as the real HPTE mapping the page is
concerned (the guest view can go to read-write, but the actual mapping
stays read-only).  When the guest tries to write to the page, we take
an HDSI and let kvmppc_book3s_hv_page_fault take care of providing a
writable HPTE for the page.

This eliminates the occasional corruption of shared pages
that was previously seen with KSM active.

Signed-off-by: Paul Mackerras 
Signed-off-by: Alexander Graf 
---
 arch/powerpc/kvm/book3s_hv_rm_mmu.c | 44 ++---
 1 file changed, 17 insertions(+), 27 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv_rm_mmu.c 
b/arch/powerpc/kvm/book3s_hv_rm_mmu.c
index 084ad54..411720f 100644
--- a/arch/powerpc/kvm/book3s_hv_rm_mmu.c
+++ b/arch/powerpc/kvm/book3s_hv_rm_mmu.c
@@ -667,40 +667,30 @@ long kvmppc_h_protect(struct kvm_vcpu *vcpu, unsigned 
long flags,
rev->guest_rpte = r;
note_hpte_modification(kvm, rev);
}
-   r = (be64_to_cpu(hpte[1]) & ~mask) | bits;
 
/* Update HPTE */
if (v & HPTE_V_VALID) {
-   rb = compute_tlbie_rb(v, r, pte_index);
-   hpte[0] = cpu_to_be64(v & ~HPTE_V_VALID);
-   do_tlbies(kvm, &rb, 1, global_invalidates(kvm, flags), true);
/*
-* If the host has this page as readonly but the guest
-* wants to make it read/write, reduce the permissions.
-* Checking the host permissions involves finding the
-* memslot and then the Linux PTE for the page.
+* If the page is valid, don't let it transition from
+* readonly to writable.  If it should be writable, we'll
+* take a trap and let the page fault code sort it out.
 */
-   if (hpte_is_writable(r) && kvm->arch.using_mmu_notifiers) {
-   unsigned long psize, gfn, hva;
-   struct kvm_memory_slot *memslot;
-   pgd_t *pgdir = vcpu->arch.pgdir;
-   pte_t pte;
-
-   psize = hpte_page_size(v, r);
-   gfn = ((r & HPTE_R_RPN) & ~(psize - 1)) >> PAGE_SHIFT;
-   memslot = __gfn_to_memslot(kvm_memslots_raw(kvm), gfn);
-   if (memslot) {
-   hva = __gfn_to_hva_memslot(memslot, gfn);
-   pte = lookup_linux_pte_and_update(pgdir, hva,
- 1, &psize);
-   if (pte_present(pte) && !pte_write(pte))
-   r = hpte_make_readonly(r);
-   }
+   pte = be64_to_cpu(hpte[1]);
+   r = (pte & ~mask) | bits;
+   if (hpte_is_writable(r) && kvm->arch.using_mmu_notifiers &&
+   !hpte_is_writable(pte))
+   r = hpte_make_readonly(r);
+   /* If the PTE is changing, invalidate it first */
+   if (r != pte) {
+   rb = compute_tlbie_rb(v, r, pte_index);
+   hpte[0] = cpu_to_be64((v & ~HPTE_V_VALID) |
+ HPTE_V_ABSENT);
+   do_tlbies(kvm, &rb, 1, global_invalidates(kvm, flags),
+ true);
+   hpte[1] = cpu_to_be64(r);
}
}
-   hpte[1] = cpu_to_be64(r);
-   eieio();
-   hpte[0] = cpu_to_be64(v & ~HPTE_V_HVLOCK);
+   unlock_hpte(hpte, v & ~HPTE_V_HVLOCK);
asm volatile("ptesync" : : : "memory");
return H_SUCCESS;
 }
-- 
1.8.1.4



[PULL 17/18] KVM: PPC: Book3S HV: Improve H_CONFER implementation

2014-12-17 Thread Alexander Graf
From: Sam Bobroff 

Currently the H_CONFER hcall is implemented in kernel virtual mode,
meaning that whenever a guest thread does an H_CONFER, all the threads
in that virtual core have to exit the guest.  This is bad for
performance because it interrupts the other threads even if they
are doing useful work.

The H_CONFER hcall is called by a guest VCPU when it is spinning on a
spinlock and it detects that the spinlock is held by a guest VCPU that
is currently not running on a physical CPU.  The idea is to give this
VCPU's time slice to the holder VCPU so that it can make progress
towards releasing the lock.

To avoid having the other threads exit the guest unnecessarily,
we add a real-mode implementation of H_CONFER that checks whether
the other threads are doing anything.  If all the other threads
are idle (i.e. in H_CEDE) or trying to confer (i.e. in H_CONFER),
it returns H_TOO_HARD which causes a guest exit and allows the
H_CONFER to be handled in virtual mode.

Otherwise it spins for a short time (up to 10 microseconds) to give
other threads the chance to observe that this thread is trying to
confer.  The spin loop also terminates when any thread exits the guest
or when all other threads are idle or trying to confer.  If the
timeout is reached, the H_CONFER returns H_SUCCESS.  In this case the
guest VCPU will recheck the spinlock word and most likely call
H_CONFER again.

This also improves the implementation of the H_CONFER virtual mode
handler.  If the VCPU is part of a virtual core (vcore) which is
runnable, there will be a 'runner' VCPU which has taken responsibility
for running the vcore.  In this case we yield to the runner VCPU
rather than the target VCPU.

We also introduce a check on the target VCPU's yield count: if it
differs from the yield count passed to H_CONFER, the target VCPU
has run since H_CONFER was called and may have already released
the lock.  This check is required by PAPR.
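
For context, the guest-side trigger for all of this is the shared
processor spinlock slow path, which looks roughly like the following
(simplified sketch modeled on Linux's __spin_yield(), not quoted from
it):

static void yield_to_lock_holder(arch_spinlock_t *lock)
{
	unsigned int lock_value, holder_cpu, yield_count;

	lock_value = lock->slock;
	if (lock_value == 0)
		return;
	holder_cpu = lock_value & 0xffff;
	yield_count = be32_to_cpu(lppaca_of(holder_cpu).yield_count);
	if ((yield_count & 1) == 0)
		return;		/* holder is running: keep spinning */
	/* Donate our timeslice; the yield count we pass lets the
	 * hypervisor detect whether the holder has run (and possibly
	 * released the lock) since we sampled it. */
	plpar_hcall_norets(H_CONFER,
			   get_hard_smp_processor_id(holder_cpu), yield_count);
}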

Signed-off-by: Sam Bobroff 
Signed-off-by: Paul Mackerras 
Signed-off-by: Alexander Graf 
---
 arch/powerpc/include/asm/kvm_host.h |  1 +
 arch/powerpc/kvm/book3s_hv.c| 41 -
 arch/powerpc/kvm/book3s_hv_builtin.c| 32 +
 arch/powerpc/kvm/book3s_hv_rmhandlers.S |  2 +-
 4 files changed, 74 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_host.h 
b/arch/powerpc/include/asm/kvm_host.h
index 6544187..7efd666a 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -295,6 +295,7 @@ struct kvmppc_vcore {
ulong dpdes;/* doorbell state (POWER8) */
void *mpp_buffer; /* Micro Partition Prefetch buffer */
bool mpp_buffer_is_valid;
+   ulong conferring_threads;
 };
 
 #define VCORE_ENTRY_COUNT(vc)  ((vc)->entry_exit_count & 0xff)
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 299351e..de4018a 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -607,10 +607,45 @@ static int kvmppc_h_set_mode(struct kvm_vcpu *vcpu, 
unsigned long mflags,
}
 }
 
+static int kvm_arch_vcpu_yield_to(struct kvm_vcpu *target)
+{
+   struct kvmppc_vcore *vcore = target->arch.vcore;
+
+   /*
+* We expect to have been called by the real mode handler
+* (kvmppc_rm_h_confer()) which would have directly returned
+* H_SUCCESS if the source vcore wasn't idle (e.g. if it may
+* have useful work to do and should not confer) so we don't
+* recheck that here.
+*/
+
+   spin_lock(&vcore->lock);
+   if (target->arch.state == KVMPPC_VCPU_RUNNABLE &&
+   vcore->vcore_state != VCORE_INACTIVE)
+   target = vcore->runner;
+   spin_unlock(&vcore->lock);
+
+   return kvm_vcpu_yield_to(target);
+}
+
+static int kvmppc_get_yield_count(struct kvm_vcpu *vcpu)
+{
+   int yield_count = 0;
+   struct lppaca *lppaca;
+
+   spin_lock(&vcpu->arch.vpa_update_lock);
+   lppaca = (struct lppaca *)vcpu->arch.vpa.pinned_addr;
+   if (lppaca)
+   yield_count = lppaca->yield_count;
+   spin_unlock(&vcpu->arch.vpa_update_lock);
+   return yield_count;
+}
+
 int kvmppc_pseries_do_hcall(struct kvm_vcpu *vcpu)
 {
unsigned long req = kvmppc_get_gpr(vcpu, 3);
unsigned long target, ret = H_SUCCESS;
+   int yield_count;
struct kvm_vcpu *tvcpu;
int idx, rc;
 
@@ -646,7 +681,10 @@ int kvmppc_pseries_do_hcall(struct kvm_vcpu *vcpu)
ret = H_PARAMETER;
break;
}
-   kvm_vcpu_yield_to(tvcpu);
+   yield_count = kvmppc_get_gpr(vcpu, 5);
+   if (kvmppc_get_yield_count(tvcpu) != yield_count)
+   break;
+   kvm_arch_vcpu_yield_to(tvcpu);
break;
  

[PULL 11/18] arch: powerpc: kvm: book3s_pr.c: Remove unused function

2014-12-17 Thread Alexander Graf
From: Rickard Strandqvist 

Remove the function get_fpr_index() that is not used anywhere.

This was partially found by using a static code analysis program called 
cppcheck.

Signed-off-by: Rickard Strandqvist 
Signed-off-by: Alexander Graf 
---
 arch/powerpc/kvm/book3s_pr.c | 5 -
 1 file changed, 5 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_pr.c b/arch/powerpc/kvm/book3s_pr.c
index cf2eb16..f573839 100644
--- a/arch/powerpc/kvm/book3s_pr.c
+++ b/arch/powerpc/kvm/book3s_pr.c
@@ -644,11 +644,6 @@ int kvmppc_handle_pagefault(struct kvm_run *run, struct 
kvm_vcpu *vcpu,
return r;
 }
 
-static inline int get_fpr_index(int i)
-{
-   return i * TS_FPRWIDTH;
-}
-
 /* Give up external provider (FPU, Altivec, VSX) */
 void kvmppc_giveup_ext(struct kvm_vcpu *vcpu, ulong msr)
 {
-- 
1.8.1.4



[PULL 09/18] arch: powerpc: kvm: book3s_32_mmu.c: Remove unused function

2014-12-17 Thread Alexander Graf
From: Rickard Strandqvist 

Remove the function sr_nx() that is not used anywhere.

This was partially found by using a static code analysis program called 
cppcheck.

Signed-off-by: Rickard Strandqvist 
Signed-off-by: Alexander Graf 
---
 arch/powerpc/kvm/book3s_32_mmu.c | 5 -
 1 file changed, 5 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_32_mmu.c b/arch/powerpc/kvm/book3s_32_mmu.c
index cd0b073..a2eb6d3 100644
--- a/arch/powerpc/kvm/book3s_32_mmu.c
+++ b/arch/powerpc/kvm/book3s_32_mmu.c
@@ -78,11 +78,6 @@ static inline bool sr_kp(u32 sr_raw)
return (sr_raw & 0x2000) ? true: false;
 }
 
-static inline bool sr_nx(u32 sr_raw)
-{
-   return (sr_raw & 0x1000) ? true: false;
-}
-
 static int kvmppc_mmu_book3s_32_xlate_bat(struct kvm_vcpu *vcpu, gva_t eaddr,
  struct kvmppc_pte *pte, bool data,
  bool iswrite);
-- 
1.8.1.4



[PULL 15/18] KVM: PPC: Book3S HV: Remove code for PPC970 processors

2014-12-17 Thread Alexander Graf
From: Paul Mackerras 

This removes the code that was added to enable HV KVM to work
on PPC970 processors.  The PPC970 is an old CPU that doesn't
support virtualizing guest memory.  Removing PPC970 support also
lets us remove the code for allocating and managing contiguous
real-mode areas, the code for the !kvm->arch.using_mmu_notifiers
case, the code for pinning pages of guest memory when first
accessed and keeping track of which pages have been pinned, and
the code for handling H_ENTER hypercalls in virtual mode.

Book3S HV KVM is now supported only on POWER7 and POWER8 processors.
The KVM_CAP_PPC_RMA capability now always returns 0.

Signed-off-by: Paul Mackerras 
Signed-off-by: Alexander Graf 
---
 arch/powerpc/include/asm/kvm_book3s.h|   2 -
 arch/powerpc/include/asm/kvm_book3s_64.h |   1 -
 arch/powerpc/include/asm/kvm_host.h  |  14 --
 arch/powerpc/include/asm/kvm_ppc.h   |   2 -
 arch/powerpc/kernel/asm-offsets.c|   1 -
 arch/powerpc/kvm/book3s_64_mmu_hv.c  | 200 ++---
 arch/powerpc/kvm/book3s_hv.c | 292 +++
 arch/powerpc/kvm/book3s_hv_builtin.c | 104 +--
 arch/powerpc/kvm/book3s_hv_interrupts.S  |  39 +
 arch/powerpc/kvm/book3s_hv_ras.c |   5 +-
 arch/powerpc/kvm/book3s_hv_rm_mmu.c  | 110 ++--
 arch/powerpc/kvm/book3s_hv_rmhandlers.S  | 245 +-
 arch/powerpc/kvm/powerpc.c   |  10 +-
 13 files changed, 70 insertions(+), 955 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_book3s.h 
b/arch/powerpc/include/asm/kvm_book3s.h
index 6acf0c2..942c7b1 100644
--- a/arch/powerpc/include/asm/kvm_book3s.h
+++ b/arch/powerpc/include/asm/kvm_book3s.h
@@ -170,8 +170,6 @@ extern void *kvmppc_pin_guest_page(struct kvm *kvm, 
unsigned long addr,
unsigned long *nb_ret);
 extern void kvmppc_unpin_guest_page(struct kvm *kvm, void *addr,
unsigned long gpa, bool dirty);
-extern long kvmppc_virtmode_h_enter(struct kvm_vcpu *vcpu, unsigned long flags,
-   long pte_index, unsigned long pteh, unsigned long ptel);
 extern long kvmppc_do_h_enter(struct kvm *kvm, unsigned long flags,
long pte_index, unsigned long pteh, unsigned long ptel,
pgd_t *pgdir, bool realmode, unsigned long *idx_ret);
diff --git a/arch/powerpc/include/asm/kvm_book3s_64.h 
b/arch/powerpc/include/asm/kvm_book3s_64.h
index a37f1a4..2d81e20 100644
--- a/arch/powerpc/include/asm/kvm_book3s_64.h
+++ b/arch/powerpc/include/asm/kvm_book3s_64.h
@@ -37,7 +37,6 @@ static inline void svcpu_put(struct kvmppc_book3s_shadow_vcpu 
*svcpu)
 
 #ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE
 #define KVM_DEFAULT_HPT_ORDER  24  /* 16MB HPT by default */
-extern unsigned long kvm_rma_pages;
 #endif
 
 #define VRMA_VSID  0x1ffUL /* 1TB VSID reserved for VRMA */
diff --git a/arch/powerpc/include/asm/kvm_host.h 
b/arch/powerpc/include/asm/kvm_host.h
index 7cf94a5..5686a42 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -180,11 +180,6 @@ struct kvmppc_spapr_tce_table {
struct page *pages[0];
 };
 
-struct kvm_rma_info {
-   atomic_t use_count;
-   unsigned long base_pfn;
-};
-
 /* XICS components, defined in book3s_xics.c */
 struct kvmppc_xics;
 struct kvmppc_icp;
@@ -214,16 +209,9 @@ struct revmap_entry {
 #define KVMPPC_RMAP_PRESENT0x1ul
 #define KVMPPC_RMAP_INDEX  0xul
 
-/* Low-order bits in memslot->arch.slot_phys[] */
-#define KVMPPC_PAGE_ORDER_MASK 0x1f
-#define KVMPPC_PAGE_NO_CACHE   HPTE_R_I/* 0x20 */
-#define KVMPPC_PAGE_WRITETHRU  HPTE_R_W/* 0x40 */
-#define KVMPPC_GOT_PAGE0x80
-
 struct kvm_arch_memory_slot {
 #ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE
unsigned long *rmap;
-   unsigned long *slot_phys;
 #endif /* CONFIG_KVM_BOOK3S_HV_POSSIBLE */
 };
 
@@ -242,14 +230,12 @@ struct kvm_arch {
struct kvm_rma_info *rma;
unsigned long vrma_slb_v;
int rma_setup_done;
-   int using_mmu_notifiers;
u32 hpt_order;
atomic_t vcpus_running;
u32 online_vcores;
unsigned long hpt_npte;
unsigned long hpt_mask;
atomic_t hpte_mod_interest;
-   spinlock_t slot_phys_lock;
cpumask_t need_tlb_flush;
int hpt_cma_alloc;
 #endif /* CONFIG_KVM_BOOK3S_HV_POSSIBLE */
diff --git a/arch/powerpc/include/asm/kvm_ppc.h 
b/arch/powerpc/include/asm/kvm_ppc.h
index a6dcdb6..46bf652 100644
--- a/arch/powerpc/include/asm/kvm_ppc.h
+++ b/arch/powerpc/include/asm/kvm_ppc.h
@@ -170,8 +170,6 @@ extern long kvmppc_h_put_tce(struct kvm_vcpu *vcpu, 
unsigned long liobn,
 unsigned long ioba, unsigned long tce);
 extern long kvmppc_h_get_tce(struct kvm_vcpu *vcpu, unsigned long liobn,
 unsigned long ioba);
-extern struct kvm_rma_info *kvm_alloc_rma(void);
-

[PULL 00/18] ppc patch queue 2014-12-18

2014-12-17 Thread Alexander Graf
Hi Paolo,

This is my current patch queue for ppc.  Please pull.

After the merge with Linus' tree, e500v2 compilation will be broken because
commit 69111bac42f5 broke it upstream. Could you please take care to apply the
fix I CC'ed you on for it?


Thanks!

Alex


The following changes since commit e08e833616f7eefebdacfd1d743d80ff3c3b2585:

  KVM: cpuid: recompute CPUID 0xD.0:EBX,ECX (2014-12-05 13:57:49 +0100)

are available in the git repository at:

  git://github.com/agraf/linux-2.6.git tags/signed-kvm-ppc-next

for you to fetch changes up to 476ce5ef09b21a76e74d07ff9d723ba0de49b53b:

  KVM: PPC: Book3S: Enable in-kernel XICS emulation by default (2014-12-17 
22:23:22 +0100)


Patch queue for ppc - 2014-12-18

Highlights this time around:

  - Removal of HV support for 970. It became a maintenance burden and received
practically no testing. POWER8 with HV is available now, so just grab one
of those boxes if PR isn't enough for you.
  - Some bug fixes and performance improvements
  - Tracepoints for book3s_hv

----
Alexander Graf (1):
  KVM: PPC: BookE: Improve irq inject tracepoint

Aneesh Kumar K.V (1):
  KVM: PPC: Book3S HV: Add missing HPTE unlock

Anton Blanchard (1):
  KVM: PPC: Book3S: Enable in-kernel XICS emulation by default

Cédric Le Goater (1):
  KVM: PPC: Book3S HV: ptes are big endian

Mahesh Salgaonkar (1):
  KVM: PPC: Book3S HV: Fix an issue where guest is paused on receiving HMI

Paul Mackerras (5):
  KVM: PPC: Book3S HV: Fix computation of tlbie operand
  KVM: PPC: Book3S HV: Fix KSM memory corruption
  KVM: PPC: Book3S HV: Simplify locking around stolen time calculations
  KVM: PPC: Book3S HV: Remove code for PPC970 processors
  KVM: PPC: Book3S HV: Fix endianness of instruction obtained from HEIR 
register

Rickard Strandqvist (4):
  arch: powerpc: kvm: book3s_32_mmu.c: Remove unused function
  arch: powerpc: kvm: book3s.c: Remove some unused functions
  arch: powerpc: kvm: book3s_pr.c: Remove unused function
  arch: powerpc: kvm: book3s_paired_singles.c: Remove unused function

Sam Bobroff (1):
  KVM: PPC: Book3S HV: Improve H_CONFER implementation

Suresh E. Warrier (3):
  KVM: PPC: Book3S HV: Fix inaccuracies in ICP emulation for H_IPI
  KVM: PPC: Book3S HV: Check wait conditions before sleeping in 
kvmppc_vcore_blocked
  KVM: PPC: Book3S HV: Tracepoints for KVM HV guest interactions

 arch/powerpc/include/asm/kvm_book3s.h|   2 -
 arch/powerpc/include/asm/kvm_book3s_64.h |   3 +-
 arch/powerpc/include/asm/kvm_host.h  |  18 +-
 arch/powerpc/include/asm/kvm_ppc.h   |   2 -
 arch/powerpc/kernel/asm-offsets.c|   2 +-
 arch/powerpc/kvm/Kconfig |   1 +
 arch/powerpc/kvm/book3s.c|   8 -
 arch/powerpc/kvm/book3s_32_mmu.c |   5 -
 arch/powerpc/kvm/book3s_64_mmu_hv.c  | 224 +++
 arch/powerpc/kvm/book3s_hv.c | 438 ++--
 arch/powerpc/kvm/book3s_hv_builtin.c | 136 +++--
 arch/powerpc/kvm/book3s_hv_interrupts.S  |  39 +--
 arch/powerpc/kvm/book3s_hv_ras.c |   5 +-
 arch/powerpc/kvm/book3s_hv_rm_mmu.c  | 150 +++---
 arch/powerpc/kvm/book3s_hv_rm_xics.c |  36 ++-
 arch/powerpc/kvm/book3s_hv_rmhandlers.S  | 251 +---
 arch/powerpc/kvm/book3s_paired_singles.c |   8 -
 arch/powerpc/kvm/book3s_pr.c |   5 -
 arch/powerpc/kvm/book3s_xics.c   |  30 +-
 arch/powerpc/kvm/book3s_xics.h   |   1 +
 arch/powerpc/kvm/e500.c  |   8 -
 arch/powerpc/kvm/powerpc.c   |  10 +-
 arch/powerpc/kvm/trace_book3s.h  |  32 +++
 arch/powerpc/kvm/trace_booke.h   |  47 ++-
 arch/powerpc/kvm/trace_hv.h  | 477 +++
 arch/powerpc/kvm/trace_pr.h  |  25 +-
 26 files changed, 870 insertions(+), 1093 deletions(-)
 create mode 100644 arch/powerpc/kvm/trace_book3s.h
 create mode 100644 arch/powerpc/kvm/trace_hv.h


[PULL 13/18] KVM: PPC: Book3S HV: Simplify locking around stolen time calculations

2014-12-17 Thread Alexander Graf
From: Paul Mackerras 

Currently the calculations of stolen time for PPC Book3S HV guests
uses fields in both the vcpu struct and the kvmppc_vcore struct.  The
fields in the kvmppc_vcore struct are protected by the
vcpu->arch.tbacct_lock of the vcpu that has taken responsibility for
running the virtual core.  This works correctly but confuses lockdep,
because it sees that the code takes the tbacct_lock for a vcpu in
kvmppc_remove_runnable() and then takes another vcpu's tbacct_lock in
vcore_stolen_time(), and it thinks there is a possibility of deadlock,
causing it to print reports like this:

=============================================
[ INFO: possible recursive locking detected ]
3.18.0-rc7-kvm-00016-g8db4bc6 #89 Not tainted
---------------------------------------------
qemu-system-ppc/6188 is trying to acquire lock:
 (&(&vcpu->arch.tbacct_lock)->rlock){..}, at: [] 
.vcore_stolen_time+0x48/0xd0 [kvm_hv]

but task is already holding lock:
 (&(&vcpu->arch.tbacct_lock)->rlock){..}, at: [] 
.kvmppc_remove_runnable.part.3+0x30/0xd0 [kvm_hv]

other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock(&(&vcpu->arch.tbacct_lock)->rlock);
  lock(&(&vcpu->arch.tbacct_lock)->rlock);

 *** DEADLOCK ***

 May be due to missing lock nesting notation

3 locks held by qemu-system-ppc/6188:
 #0:  (&vcpu->mutex){+.+.+.}, at: [] .vcpu_load+0x28/0xe0 
[kvm]
 #1:  (&(&vcore->lock)->rlock){+.+...}, at: [] 
.kvmppc_vcpu_run_hv+0x530/0x1530 [kvm_hv]
 #2:  (&(&vcpu->arch.tbacct_lock)->rlock){..}, at: [] 
.kvmppc_remove_runnable.part.3+0x30/0xd0 [kvm_hv]

stack backtrace:
CPU: 40 PID: 6188 Comm: qemu-system-ppc Not tainted 
3.18.0-rc7-kvm-00016-g8db4bc6 #89
Call Trace:
[c00b2754f3f0] [c0b31b6c] .dump_stack+0x88/0xb4 (unreliable)
[c00b2754f470] [c00faeb8] .__lock_acquire+0x1878/0x2190
[c00b2754f600] [c00fbf0c] .lock_acquire+0xcc/0x1a0
[c00b2754f6d0] [c0b2954c] ._raw_spin_lock_irq+0x4c/0x70
[c00b2754f760] [decb1fe8] .vcore_stolen_time+0x48/0xd0 [kvm_hv]
[c00b2754f7f0] [decb25b4] .kvmppc_remove_runnable.part.3+0x44/0xd0 
[kvm_hv]
[c00b2754f880] [decb43ec] .kvmppc_vcpu_run_hv+0x76c/0x1530 [kvm_hv]
[c00b2754f9f0] [deb9f46c] .kvmppc_vcpu_run+0x2c/0x40 [kvm]
[c00b2754fa60] [deb9c9a4] .kvm_arch_vcpu_ioctl_run+0x54/0x160 [kvm]
[c00b2754faf0] [deb94538] .kvm_vcpu_ioctl+0x498/0x760 [kvm]
[c00b2754fcb0] [c0267eb4] .do_vfs_ioctl+0x444/0x770
[c00b2754fd90] [c02682a4] .SyS_ioctl+0xc4/0xe0
[c00b2754fe30] [c00092e4] syscall_exit+0x0/0x98

In order to make the locking easier to analyse, we change the code to
use a spinlock in the kvmppc_vcore struct to protect the stolen_tb and
preempt_tb fields.  This lock needs to be an irq-safe lock since it is
used in the kvmppc_core_vcpu_load_hv() and kvmppc_core_vcpu_put_hv()
functions, which are called with the scheduler rq lock held, which is
an irq-safe lock.
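
In code terms the change boils down to the following pattern (sketch of
the vcpu-load path; field names as in the patch below):

static void kvmppc_core_vcpu_load_hv(struct kvm_vcpu *vcpu, int cpu)
{
	struct kvmppc_vcore *vc = vcpu->arch.vcore;
	unsigned long flags;

	/* stolen_tb/preempt_tb now live under the vcore's own irq-safe
	 * lock instead of another vcpu's tbacct_lock, so lockdep sees a
	 * single, distinct lock class. */
	spin_lock_irqsave(&vc->stoltb_lock, flags);
	if (vc->runner == vcpu && vc->vcore_state != VCORE_INACTIVE &&
	    vc->preempt_tb != TB_NIL) {
		vc->stolen_tb += mftb() - vc->preempt_tb;
		vc->preempt_tb = TB_NIL;
	}
	spin_unlock_irqrestore(&vc->stoltb_lock, flags);
	/* per-vcpu busy_stolen accounting (under tbacct_lock) elided */
}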

Signed-off-by: Paul Mackerras 
Signed-off-by: Alexander Graf 
---
 arch/powerpc/include/asm/kvm_host.h |  1 +
 arch/powerpc/kvm/book3s_hv.c| 60 +++--
 2 files changed, 32 insertions(+), 29 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_host.h 
b/arch/powerpc/include/asm/kvm_host.h
index 0478556..7cf94a5 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -297,6 +297,7 @@ struct kvmppc_vcore {
struct list_head runnable_threads;
spinlock_t lock;
wait_queue_head_t wq;
+   spinlock_t stoltb_lock; /* protects stolen_tb and preempt_tb */
u64 stolen_tb;
u64 preempt_tb;
struct kvm_vcpu *runner;
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 1a7a281..74afa2d 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -135,11 +135,10 @@ static void kvmppc_fast_vcpu_kick_hv(struct kvm_vcpu 
*vcpu)
  * stolen.
  *
  * Updates to busy_stolen are protected by arch.tbacct_lock;
- * updates to vc->stolen_tb are protected by the arch.tbacct_lock
- * of the vcpu that has taken responsibility for running the vcore
- * (i.e. vc->runner).  The stolen times are measured in units of
- * timebase ticks.  (Note that the != TB_NIL checks below are
- * purely defensive; they should never fail.)
+ * updates to vc->stolen_tb are protected by the vcore->stoltb_lock
+ * lock.  The stolen times are measured in units of timebase ticks.
+ * (Note that the != TB_NIL checks below are purely defensive;
+ * they should never fail.)
  */
 
 static void kvmppc_core_vcpu_load_hv(struct kvm_vcpu *vcpu, int cpu)
@@ -147,12 +146,21 @@ static void kvmppc_core_vcpu_load_hv(struct kvm_vcpu 
*vcpu, int cpu)
struct kvmppc_vcore *vc = vcpu->arch.vcore;
unsi

[PULL 12/18] arch: powerpc: kvm: book3s_paired_singles.c: Remove unused function

2014-12-17 Thread Alexander Graf
From: Rickard Strandqvist 

Remove the function inst_set_field() that is not used anywhere.

This was partially found by using a static code analysis program called 
cppcheck.

Signed-off-by: Rickard Strandqvist 
Signed-off-by: Alexander Graf 
---
 arch/powerpc/kvm/book3s_paired_singles.c | 8 
 1 file changed, 8 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_paired_singles.c 
b/arch/powerpc/kvm/book3s_paired_singles.c
index bfb8035..bd6ab16 100644
--- a/arch/powerpc/kvm/book3s_paired_singles.c
+++ b/arch/powerpc/kvm/book3s_paired_singles.c
@@ -352,14 +352,6 @@ static inline u32 inst_get_field(u32 inst, int msb, int 
lsb)
return kvmppc_get_field(inst, msb + 32, lsb + 32);
 }
 
-/*
- * Replaces inst bits with ordering according to spec.
- */
-static inline u32 inst_set_field(u32 inst, int msb, int lsb, int value)
-{
-   return kvmppc_set_field(inst, msb + 32, lsb + 32, value);
-}
-
 bool kvmppc_inst_is_paired_single(struct kvm_vcpu *vcpu, u32 inst)
 {
if (!(vcpu->arch.hflags & BOOK3S_HFLAG_PAIRED_SINGLE))
-- 
1.8.1.4



[PULL 02/18] KVM: PPC: Book3S HV: Add missing HPTE unlock

2014-12-17 Thread Alexander Graf
From: "Aneesh Kumar K.V" 

In kvm_test_clear_dirty(), if we find an invalid HPTE we move on to the
next HPTE without unlocking the invalid one.  In fact we should never
find an invalid and unlocked HPTE in the rmap chain, but for robustness
we should unlock it.  This adds the missing unlock.

Reported-by: Benjamin Herrenschmidt 
Signed-off-by: Aneesh Kumar K.V 
Signed-off-by: Paul Mackerras 
Signed-off-by: Alexander Graf 
---
 arch/powerpc/kvm/book3s_64_mmu_hv.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c 
b/arch/powerpc/kvm/book3s_64_mmu_hv.c
index d407702..41f96c5 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_hv.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c
@@ -1117,8 +1117,11 @@ static int kvm_test_clear_dirty_npages(struct kvm *kvm, 
unsigned long *rmapp)
}
 
/* Now check and modify the HPTE */
-   if (!(hptep[0] & cpu_to_be64(HPTE_V_VALID)))
+   if (!(hptep[0] & cpu_to_be64(HPTE_V_VALID))) {
+   /* unlock and continue */
+   hptep[0] &= ~cpu_to_be64(HPTE_V_HVLOCK);
continue;
+   }
 
/* need to make it temporarily absent so C is stable */
hptep[0] |= cpu_to_be64(HPTE_V_ABSENT);
-- 
1.8.1.4



[PULL 04/18] KVM: PPC: Book3S HV: Fix an issue where guest is paused on receiving HMI

2014-12-17 Thread Alexander Graf
From: Mahesh Salgaonkar 

When we get an HMI (hypervisor maintenance interrupt) while in a
guest, the guest ends up in the paused state.  The reason is that
kvmppc_handle_exit_hv() falls through the default path and returns to
the host instead of resuming the guest.  HMI is a hypervisor-only
interrupt and it is safe to resume the guest, since the host has
already handled it.  This patch adds a switch case to resume the
guest.

Without this patch, the guest enters the paused state with the
following console messages:

[ 3003.329351] Severe Hypervisor Maintenance interrupt [Recovered]
[ 3003.329356]  Error detail: Timer facility experienced an error
[ 3003.329359]  HMER: 0840
[ 3003.329360]  TFMR: 4a12000980a84000
[ 3003.329366] vcpu c007c35094c0 (40):
[ 3003.329368] pc  = c00c2ba0  msr = 80009032  trap = e60
[ 3003.329370] r 0 = c021ddc0  r16 = 0046
[ 3003.329372] r 1 = c0007a02bbd0  r17 = 327d5d98
[ 3003.329375] r 2 = c10980b8  r18 = 1fc9a0b0
[ 3003.329377] r 3 = c142d6b8  r19 = c142d6b8
[ 3003.329379] r 4 = 0002  r20 = 
[ 3003.329381] r 5 = c524a110  r21 = 
[ 3003.329383] r 6 = 0001  r22 = 
[ 3003.329386] r 7 =   r23 = c524a110
[ 3003.329388] r 8 =   r24 = 0001
[ 3003.329391] r 9 = 0001  r25 = c0007c31da38
[ 3003.329393] r10 = c14280b8  r26 = 0002
[ 3003.329395] r11 = 746f6f6c2f68656c  r27 = c524a110
[ 3003.329397] r12 = 28004484  r28 = c0007c31da38
[ 3003.329399] r13 = cfe01400  r29 = 0002
[ 3003.329401] r14 = 0046  r30 = c3011e00
[ 3003.329403] r15 = ffba  r31 = 0002
[ 3003.329404] ctr = c041a670  lr  = c0272520
[ 3003.329405] srr0 = c007e8d8 srr1 = 90001002
[ 3003.329406] sprg0 =  sprg1 = cfe01400
[ 3003.329407] sprg2 = cfe01400 sprg3 = 0005
[ 3003.329408] cr = 48004482  xer = 2000  dsisr = 4200
[ 3003.329409] dar = 010015020048
[ 3003.329410] fault dar = 010015020048 dsisr = 4200
[ 3003.329411] SLB (8 entries):
[ 3003.329412]   ESID = c800 VSID = 40016e7779000510
[ 3003.329413]   ESID = d801 VSID = 400142add1000510
[ 3003.329414]   ESID = f804 VSID = 4000eb1a81000510
[ 3003.329415]   ESID = 1f00080b VSID = 40004fda0a000d90
[ 3003.329416]   ESID = 3f00080c VSID = 400039f536000d90
[ 3003.329417]   ESID = 180d VSID = 0001251b35150d90
[ 3003.329417]   ESID = 0100080e VSID = 4001e4609d90
[ 3003.329418]   ESID = d8000819 VSID = 40013d349c000400
[ 3003.329419] lpcr = c04881847001 sdr1 = 001b1906 last_inst = 

[ 3003.329421] trap=0xe60 | pc=0xc00c2ba0 | msr=0x80009032
[ 3003.329524] Severe Hypervisor Maintenance interrupt [Recovered]
[ 3003.329526]  Error detail: Timer facility experienced an error
[ 3003.329527]  HMER: 0840
[ 3003.329527]  TFMR: 4a12000980a94000
[ 3006.359786] Severe Hypervisor Maintenance interrupt [Recovered]
[ 3006.359792]  Error detail: Timer facility experienced an error
[ 3006.359795]  HMER: 0840
[ 3006.359797]  TFMR: 4a12000980a84000

 Id    Name                           State
----------------------------------------------------
 2     guest2                         running
 3     guest3                         paused
 4     guest4                         running

Signed-off-by: Mahesh Salgaonkar 
Signed-off-by: Paul Mackerras 
Signed-off-by: Alexander Graf 
---
 arch/powerpc/kvm/book3s_hv.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index e63587d..cd7e030 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -769,6 +769,8 @@ static int kvmppc_handle_exit_hv(struct kvm_run *run, 
struct kvm_vcpu *vcpu,
vcpu->stat.ext_intr_exits++;
r = RESUME_GUEST;
break;
+   /* HMI is hypervisor interrupt and host has handled it. Resume guest.*/
+   case BOOK3S_INTERRUPT_HMI:
case BOOK3S_INTERRUPT_PERFMON:
r = RESUME_GUEST;
break;
-- 
1.8.1.4



[PULL 08/18] KVM: PPC: Book3S HV: Check wait conditions before sleeping in kvmppc_vcore_blocked

2014-12-17 Thread Alexander Graf
From: "Suresh E. Warrier" 

The kvmppc_vcore_blocked() code does not check for the wait condition
after putting the process on the wait queue. This means that it is
possible for an external interrupt to become pending, but the vcpu to
remain asleep until the next decrementer interrupt.  The fix is to
make one last check for pending exceptions and ceded state before
calling schedule().

Signed-off-by: Suresh Warrier 
Signed-off-by: Paul Mackerras 
Signed-off-by: Alexander Graf 
---
 arch/powerpc/kvm/book3s_hv.c | 20 
 1 file changed, 20 insertions(+)

diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index cd7e030..1a7a281 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -1828,9 +1828,29 @@ static void kvmppc_wait_for_exec(struct kvm_vcpu *vcpu, 
int wait_state)
  */
 static void kvmppc_vcore_blocked(struct kvmppc_vcore *vc)
 {
+   struct kvm_vcpu *vcpu;
+   int do_sleep = 1;
+
DEFINE_WAIT(wait);
 
prepare_to_wait(&vc->wq, &wait, TASK_INTERRUPTIBLE);
+
+   /*
+* Check one last time for pending exceptions and ceded state after
+* we put ourselves on the wait queue
+*/
+   list_for_each_entry(vcpu, &vc->runnable_threads, arch.run_list) {
+   if (vcpu->arch.pending_exceptions || !vcpu->arch.ceded) {
+   do_sleep = 0;
+   break;
+   }
+   }
+
+   if (!do_sleep) {
+   finish_wait(&vc->wq, &wait);
+   return;
+   }
+
vc->vcore_state = VCORE_SLEEPING;
spin_unlock(&vc->lock);
schedule();
-- 
1.8.1.4



  1   2   3   4   5   6   7   8   9   10   >