Re: [Patch v2] kexec: increase max of kexec segments and use dynamic allocation

2010-07-29 Thread Cong Wang

On 07/27/10 18:00, Milton Miller wrote:

[ Added kexec at lists.infradead.org and linuxppc-dev@lists.ozlabs.org ]



Currently KEXEC_SEGMENT_MAX is only 16, which is too small for machines with
many memory ranges.  When hibernating on a machine with disjoint memory we
need one segment for each memory region. Increase this hard limit to 16K,
which is reasonably large.

Also change ->segment from a static array to dynamically allocated memory.
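The allocation hunk itself is not quoted in this mail; roughly, the dynamic
allocation would look like the following sketch (helper name and call site are
assumptions, and it only applies once ->segment has become a pointer):

/*
 * Hedged sketch, not the actual hunk: with the limit at 16K the segment
 * array is far too large to embed in struct kimage, so allocate it when
 * the image is loaded and free it together with the image.
 */
static int alloc_segment_list(struct kimage *image, unsigned long nr_segments)
{
	image->segment = kcalloc(nr_segments, sizeof(*image->segment),
				 GFP_KERNEL);
	if (!image->segment)
		return -ENOMEM;
	image->nr_segments = nr_segments;
	return 0;
}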

Cc: Neil Horman <nhor...@redhat.com>
Cc: huang ying <huang.ying.cari...@gmail.com>
Cc: Eric W. Biederman <ebied...@xmission.com>
Signed-off-by: WANG Cong <amw...@redhat.com>

---
diff --git a/arch/powerpc/kernel/machine_kexec_64.c 
b/arch/powerpc/kernel/machine_kexec_64.c
index ed31a29..f115585 100644
--- a/arch/powerpc/kernel/machine_kexec_64.c
+++ b/arch/powerpc/kernel/machine_kexec_64.c
@@ -131,10 +131,7 @@ static void copy_segments(unsigned long ind)
  void kexec_copy_flush(struct kimage *image)
  {
long i, nr_segments = image->nr_segments;
-   struct  kexec_segment ranges[KEXEC_SEGMENT_MAX];
-
-   /* save the ranges on the stack to efficiently flush the icache */
-   memcpy(ranges, image->segment, sizeof(ranges));
+   struct  kexec_segment range;


I'm glad you found our copy on the stack and removed the stack overflow
that comes with this bump, but ...



/*
 * After this call we may not use anything allocated in dynamic
@@ -148,9 +145,11 @@ void kexec_copy_flush(struct kimage *image)
 * we need to clear the icache for all dest pages sometime,
 * including ones that were in place on the original copy
 */
-   for (i = 0; i < nr_segments; i++)
-   flush_icache_range((unsigned long)__va(ranges[i].mem),
-   (unsigned long)__va(ranges[i].mem + ranges[i].memsz));
+   for (i = 0; i < nr_segments; i++) {
+   memcpy(&range, &image->segment[i], sizeof(range));
+   flush_icache_range((unsigned long)__va(range.mem),
+   (unsigned long)__va(range.mem + range.memsz));
+   }
  }


This is executed after the copy, so as it says,
we may not use anything allocated in dynamic memory.

We could allocate control pages to copy the segment list into.
Actually ppc64 doesn't use the existing control page, but that
is only 4kB today.

We need the list to icache flush all the pages in all the segments,
as the indirect list doesn't include pages that were allocated at
their destination.

Or maybe the icache flush should be done in the generic code
like it does for crash load segments?



I don't get the point here; according to the comments,
it is copied onto the stack for efficiency.
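For reference, the control-page idea Milton describes might look roughly like
the sketch below. It is only an illustration (the helper name and the use of
kimage_alloc_control_pages() at load time are assumptions, not a posted patch):

/*
 * Copy the now dynamically allocated segment list into control pages at
 * load time, so kexec_copy_flush() can still walk it after the point
 * where dynamic memory must no longer be touched.
 */
static struct kexec_segment *segment_copy;

static int stash_segment_list(struct kimage *image)
{
	size_t sz = image->nr_segments * sizeof(*segment_copy);
	struct page *page;

	page = kimage_alloc_control_pages(image, get_order(sz));
	if (!page)
		return -ENOMEM;

	segment_copy = page_address(page);
	memcpy(segment_copy, image->segment, sz);
	return 0;
}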

--
The opposite of love is not hate, it's indifference.
 - Elie Wiesel
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [PATCH V5] powerpc/mpc512x: Add gpio driver

2010-07-29 Thread Grant Likely
On Wed, Jul 7, 2010 at 5:28 AM, Peter Korsgaard jac...@sunsite.dk wrote:
 Anatolij == Anatolij Gustschin ag...@denx.de writes:

 Hi,

 Old mail, I know ..

  Anatolij From: Matthias Fuchs matthias.fu...@esd.eu
  Anatolij This patch adds a gpio driver for MPC512X PowerPCs.

  Anatolij It has been tested on our CAN-CBX-CPU5201 module that
  Anatolij uses a MPC5121 CPU. This platform comes with a couple of
  Anatolij LEDs and configuration switches that have been used for testing.

  Anatolij After change to the of-gpio api the reworked driver has been
  Anatolij tested on pdm360ng board with some configuration switches.

 This looks very similar to the existing
 arch/powerpc/sysdev/mpc8xxx_gpio.c - Couldn't we just add 5121 support
 there instead?

  Anatolij +struct mpc512x_gpio_regs {
  Anatolij +    u32 gpdir;
  Anatolij +    u32 gpodr;
  Anatolij +    u32 gpdat;
  Anatolij +    u32 gpier;
  Anatolij +    u32 gpimr;
  Anatolij +    u32 gpicr1;
  Anatolij +    u32 gpicr2;
  Anatolij +};

Hi Anatolij,

Peter's right, the register map looks the same, except for the
additional gpicr1 & 2 registers in the 512x version.  Can the 512x
gpios be supported by the 8xxx gpio driver?

g.
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: i meet some surprising things,when i modify the dts file

2010-07-29 Thread Grant Likely
On Thu, Jul 29, 2010 at 12:08 AM, hacklu embedway.t...@gmail.com wrote:
  local...@f0010100 {
 
 ranges = <
  0 0 FC00 100
  2 0 FA00 100
  1 0 7000 100
    >;
 fl...@0,0{

 .
 }

 fl...@2,0{

 
 }

  board-cont...@1,0{
     .
 }

 }

 this is part of my dts file. I don't know what each field in the
 ranges config means.
 for instance, 2 0 FA00 100 .
 I only know this:
 2 means the chip select.
 0 is what?
 Fa0 means the start address.
 100 means the range of the device

Ranges translates from the child address domain to the parent address
domain.  It consists of 3 fields: the child base address, the parent
base address, and the size.  In this case:

child base address := 2 0  (#address-cells = 2 in this node)
parent base address := 0xfa00 (#address-cells = 1 in parent node)
and length = 0x100 (16MB)

For the child address, #address-cells is set to 2, meaning 1 cell for
the chip select #, and 1 cell for an offset into the chip select
range.  In most cases the offset will be zero in a ranges property.

So in this case, the ranges property states that chip select 2 is a
16MB region mapped to base address 0xfa00.


 but I got somewhat puzzled.
 when I set the two flash chips on the 0,1 chip selects or the 0,2 chip
 selects my linux works well.
 and, the board-control can only be set at chip select 1, otherwise the pci
 doesn't get detected.

Unless the bus controller hardware needs to know the chip select
number for another purpose (ie. setting up a local bus DMA transfer),
you could really use any number for the chip select as long as it is
consistent between the child node and the ranges property.


 so, what is the chip select? is it based on hardware?

Yes, it is based on hardware.  The .dts file is describing which CS
line each external device is attached to.

 but my flash can use
 chip select 0, 1 or 2.
 or is it just set by software? but my pci device can only work on chip
 select 1.


 BTW:
 I also want to know how to write the dts file. I want to understand each
 node in the dts files.
 but I can't get enough documents. I have read the linux/document/...
 could you provide me some useful information?

See here:

http://www.devicetree.org/Device_Tree_Usage

g.
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [PATCH V5] powerpc/mpc512x: Add gpio driver

2010-07-29 Thread Anatolij Gustschin
On Thu, 29 Jul 2010 01:19:23 -0600
Grant Likely grant.lik...@secretlab.ca wrote:

 On Wed, Jul 7, 2010 at 5:28 AM, Peter Korsgaard jac...@sunsite.dk wrote:
  Anatolij == Anatolij Gustschin ag...@denx.de writes:
 
  Hi,
 
  Old mail, I know ..
 
   Anatolij From: Matthias Fuchs matthias.fu...@esd.eu
   Anatolij This patch adds a gpio driver for MPC512X PowerPCs.
 
   Anatolij It has been tested on our CAN-CBX-CPU5201 module that
   Anatolij uses a MPC5121 CPU. This platform comes with a couple of
   Anatolij LEDs and configuration switches that have been used for testing.
 
   Anatolij After change to the of-gpio api the reworked driver has been
   Anatolij tested on pdm360ng board with some configuration switches.
 
  This looks very similar to the existing
  arch/powerpc/sysdev/mpc8xxx_gpio.c - Couldn't we just add 5121 support
  there instead?
 
   Anatolij +struct mpc512x_gpio_regs {
   Anatolij +    u32 gpdir;
   Anatolij +    u32 gpodr;
   Anatolij +    u32 gpdat;
   Anatolij +    u32 gpier;
   Anatolij +    u32 gpimr;
   Anatolij +    u32 gpicr1;
   Anatolij +    u32 gpicr2;
   Anatolij +};
 
 Hi Anatolij,
 
 Peter's right, the register map looks the same, except for the
  additional gpicr1 & 2 registers in the 512x version.  Can the 512x
 gpios be supported by the 8xxx gpio driver?

Hi Grant,

I wanted to extend/test this driver but didn't have time so far. I'll
look at 8xxx gpio driver this weekend to see if it can be used for
512x gpios.

Anatolij
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


i meet some surprising things,when i modify the dts file

2010-07-29 Thread hacklu
 local...@f0010100 {

ranges = <
 0 0 FC00 100
 2 0 FA00 100
 1 0 7000 100
   >;
fl...@0,0{

.
}

fl...@2,0{


}

 board-cont...@1,0{
.
}

}

this is part of my dts file. I don't know what each field in the ranges
config means.
for instance, 2 0 FA00 100 .
I only know this:
2 means the chip select.
0 is what?
Fa0 means the start address.
100 means the range of the device

but I got somewhat puzzled.
when I set the two flash chips on the 0,1 chip selects or the 0,2 chip selects
my linux works well.
and, the board-control can only be set at chip select 1, otherwise the pci
doesn't get detected.

so, what is the chip select? is it based on hardware? but my flash can use
chip select 0, 1 or 2.
or is it just set by software? but my pci device can only work on chip
select 1.


BTW:
I also want to know how to write the dts file. I want to understand each node 
in the dts files.
but I can't get enough documents. I have read the linux/document/...
could you provide me some useful information?

thank you very much~

2010-07-29 



hacklu 
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: Memory Mapping Buffers smaller than page size?

2010-07-29 Thread Ravi Gupta
Hi Simon,

Thanks for the quick reply. One more thing I want to ask: what if I create
a dma pool (using pci_pool_create()), allocate dma buffers from that pool
and then try to memory map them? will the buffers in that case be
contiguous, and is it possible to memory map them into a single user space
page?

Thanks in advance
Ravi Gupta

On Wed, Jul 28, 2010 at 7:51 PM, Simon Richter simon.rich...@hogyros.de wrote:

 Hi,

 On Wed, Jul 28, 2010 at 06:44:10PM +0530, Ravi Gupta wrote:

  I am new to linux device driver development. I have created 16 buffers of
  size 256 bytes each (using kmalloc()) in my device driver code. I want to
  memory map these buffers to user space. Now is it possible to memory map
  these buffers (16*256 = 4096 = 1 page on 32 bit linux) into a single page in
  user space OR do I have to map them into individual pages in user space? Note,
  all the buffers may not be stored in contiguous memory locations.

 Pages are the smallest unit for mappings, so each buffer would end up in
 its own mapping. If you want the buffers to be accessible without an
 offset, then you cannot have them in continuous locations, as you cannot
 map memory from the middle of a page to the beginning either.

 So your options are: one page per buffer (wasteful, but gives you
 granular access control), or allocating all the buffers as a single
 block.

   Simon
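To make the "single block" option Simon describes concrete, a minimal sketch
of a driver mmap handler (illustrative only; names, the global buffer and the
surrounding init/teardown are assumptions):

#include <linux/fs.h>
#include <linux/io.h>
#include <linux/mm.h>
#include <linux/module.h>
#include <linux/slab.h>

#define NBUF	16
#define BUFSZ	256		/* 16 * 256 bytes = exactly one 4K page */

static void *buf_block;		/* one contiguous allocation for all buffers */

static int mybuf_mmap(struct file *filp, struct vm_area_struct *vma)
{
	unsigned long pfn = virt_to_phys(buf_block) >> PAGE_SHIFT;

	if (vma->vm_end - vma->vm_start != PAGE_SIZE)
		return -EINVAL;

	/* Buffer i then shows up at offset i * BUFSZ in the user mapping. */
	return remap_pfn_range(vma, vma->vm_start, pfn, PAGE_SIZE,
			       vma->vm_page_prot);
}

static int __init mybuf_init(void)
{
	/* A kmalloc of a full page is page aligned, so it maps cleanly. */
	buf_block = kmalloc(NBUF * BUFSZ, GFP_KERNEL);
	return buf_block ? 0 : -ENOMEM;
}
module_init(mybuf_init);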

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH 0/6] Remove owner field from sysfs attribute structure

2010-07-29 Thread Eric Biederman
On Wed, Jul 28, 2010 at 10:09 PM, Guenter Roeck
guenter.ro...@ericsson.com wrote:
 The following comment is found in include/linux/sysfs.h:

   /* FIXME
    * The *owner field is no longer used.
    * x86 tree has been cleaned up. The owner
    * attribute is still left for other arches.
    */

 As it turns out, the *owner field is (again?) initialized in several modules,
 suggesting that such initialization may be creeping back into the code.

 This patch set removes the above comment, the *owner field, and each instance
 in the code where it was found to be initialized.

 Compiled with x86 allmodconfig as well as with all alpha, arm, mips, powerpc,
 and sparc defconfig builds.

This seems reasonable to me.  Can we get this in linux-next?

Eric
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [PATCH v2 5/7] Add support for ramdisk on ppc32 for uImage-ppc and Elf-ppc

2010-07-29 Thread Simon Horman
On Tue, Jul 20, 2010 at 03:14:58PM -0500, Matthew McClintock wrote:
 This fixes --reuseinitrd and --ramdisk option for ppc32 on
 uImage-ppc and Elf. It works for normal kexec as well as for
 kdump.
 
 When using --reuseinitrd you need to specify retain_initrd
 on the command line. Also, if you are doing kdump you need to make
 sure your initrd lives in the crashdump region otherwise the
 kdump kernel will not be able to access it. The --ramdisk option
 should always work.

Thanks, I have applied this change.
I had to do a minor merge on the Makefile,
could you verify that the result is correct?
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [PATCH 1/4] irq: rename IRQF_TIMER to IRQF_NO_SUSPEND

2010-07-29 Thread Thomas Gleixner
On Wed, 28 Jul 2010, Ian Campbell wrote:

 Continue to provide IRQF_TIMER as an alias to IRQF_NO_SUSPEND since I
 think it is worth preserving the nice self-documenting name (where it
 is used appropriately). It also avoids needing to patch all the many
 users who are using the flag for an actual timer interrupt.

I'm not happy about the alias. What about:

#define __IRQF_TIMER		0x0200
#define IRQF_NO_SUSPEND		0x0400

#define IRQF_TIMER		(__IRQF_TIMER | IRQF_NO_SUSPEND)

Thanks,

tglx
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [PATCH 1/4] irq: rename IRQF_TIMER to IRQF_NO_SUSPEND

2010-07-29 Thread Ian Campbell
On Thu, 2010-07-29 at 09:49 +0100, Thomas Gleixner wrote:
 On Wed, 28 Jul 2010, Ian Campbell wrote:
 
  Continue to provide IRQF_TIMER as an alias to IRQF_NO_SUSPEND since I
  think it is worth preserving the nice self-documenting name (where it
  is used appropriately). It also avoids needing to patch all the many
  users who are using the flag for an actual timer interrupt.
 
 I'm not happy about the alias. What about:
 
 #define __IRQF_TIMER		0x0200
 #define IRQF_NO_SUSPEND		0x0400
 
 #define IRQF_TIMER		(__IRQF_TIMER | IRQF_NO_SUSPEND)

Sure, I'll rework along those lines.

Ian.



___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [PATCH v3 0/7] Fixup booting with device trees and uImage/elf on ppc32

2010-07-29 Thread Simon Horman
On Mon, Jul 26, 2010 at 11:22:58PM -0500, Matthew McClintock wrote:
 
 On Jul 26, 2010, at 9:55 PM, Simon Horman wrote:
 
  [Cced linuxppc-dev]
  
  On Tue, Jul 20, 2010 at 11:42:57PM -0500, Matthew McClintock wrote:
  This patch series adds full support for booting with a flat device tree
  with either uImage or elf file formats. Kexec and Kdump should work, and
  you should also be able to use ramdisks or reuse your current ramdisk as 
  well
  
  This patch series was tested on an mpc85xx system with a kernel version
  2.6.35-rc3
  
  v1: Initial version
  
  v2: Added support for fs2dt (file system to device tree)
  
  v3: Fix some misc. git problems I had and other code cleanups
  
  Hi Matthew,
  
  I'm a little concerned that these changes are non trivial and haven't had
  much review. But I am prepared to put them into my tree once 2.0.2 is
  released - perhaps that way they will get some test coverage. Does
  that work for you?
 
 Either way works for me. I know they could use more review, however as Maxim
 said the current tree does not work AFAIK. Either way, I'm willing to keep
 addressing everyone's concerns and wait, or to move forward and make some quick
 fixes as well.

All applied.

I made some minor changes to three of the patches.
I have noted each change in separate emails.

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


[GIT/PATCH 0/4] Do not use IRQF_TIMER for non timer interrupts

2010-07-29 Thread Ian Campbell
On Thu, 2010-07-29 at 09:49 +0100, Thomas Gleixner wrote:
 On Wed, 28 Jul 2010, Ian Campbell wrote:
 
  Continue to provide IRQF_TIMER as an alias to IRQF_NO_SUSPEND since I
  think it is worth preserving the nice self-documenting name (where it
  is used appropriately). It also avoids needing to patch all the many
  users who are using the flag for an actual timer interrupt.
 
 I'm not happy about the alias. What about:
 
 #define __IRQF_TIMER		0x0200
 #define IRQF_NO_SUSPEND		0x0400
 
 #define IRQF_TIMER		(__IRQF_TIMER | IRQF_NO_SUSPEND)

Resending with this change. Plus I ran checkpatch on the whole lot (I
previously managed to run it only on the first patch) and fixed the
complaints.

Ian.

The following changes since commit fc0f5ac8fe693d1b05f5a928cc48135d1c8b7f2e:
  Linus Torvalds (1):
Merge branch 'for-linus' of git://git.kernel.org/.../ericvh/v9fs

are available in the git repository at:

  git://xenbits.xensource.com/people/ianc/linux-2.6.git for-irq/irqf-no-suspend

Ian Campbell (4):
  irq: Add new IRQ flag IRQF_NO_SUSPEND
  ixp4xx-beeper: Use IRQF_NO_SUSPEND not IRQF_TIMER for non-timer interrupt
  powerpc: Use IRQF_NO_SUSPEND not IRQF_TIMER for non-timer interrupts
  xen: do not suspend IPI IRQs.

 arch/powerpc/platforms/powermac/low_i2c.c |5 +++--
 drivers/input/misc/ixp4xx-beeper.c|3 ++-
 drivers/macintosh/via-pmu.c   |9 +
 drivers/xen/events.c  |1 +
 include/linux/interrupt.h |7 ++-
 kernel/irq/manage.c   |2 +-
 6 files changed, 18 insertions(+), 9 deletions(-)


___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


[PATCH 1/4] irq: Add new IRQ flag IRQF_NO_SUSPEND

2010-07-29 Thread Ian Campbell
A small number of users of IRQF_TIMER are using it for the implied no
suspend behaviour on interrupts which are not timer interrupts.

Therefore add a new IRQF_NO_SUSPEND flag, rename IRQF_TIMER to
__IRQF_TIMER and redefine IRQF_TIMER in terms of these new flags.

Signed-off-by: Ian Campbell ian.campb...@citrix.com
Cc: Thomas Gleixner t...@linutronix.de
Cc: Jeremy Fitzhardinge jer...@goop.org
Cc: Dmitry Torokhov dmitry.torok...@gmail.com
Cc: Benjamin Herrenschmidt b...@kernel.crashing.org
Cc: Paul Mackerras pau...@samba.org
Cc: Grant Likely grant.lik...@secretlab.ca
Cc: xen-de...@lists.xensource.com
Cc: linux-in...@vger.kernel.org
Cc: linuxppc-...@ozlabs.org
Cc: devicetree-disc...@lists.ozlabs.org
---
 include/linux/interrupt.h |7 ++-
 kernel/irq/manage.c   |2 +-
 2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h
index c233113..a0384a4 100644
--- a/include/linux/interrupt.h
+++ b/include/linux/interrupt.h
@@ -53,16 +53,21 @@
  * IRQF_ONESHOT - Interrupt is not reenabled after the hardirq handler 
finished.
  *Used by threaded interrupts which need to keep the
  *irq line disabled until the threaded handler has been run.
+ * IRQF_NO_SUSPEND - Do not disable this IRQ during suspend
+ *
  */
 #define IRQF_DISABLED  0x0020
 #define IRQF_SAMPLE_RANDOM 0x0040
 #define IRQF_SHARED0x0080
 #define IRQF_PROBE_SHARED  0x0100
-#define IRQF_TIMER 0x0200
+#define __IRQF_TIMER   0x0200
 #define IRQF_PERCPU		0x0400
 #define IRQF_NOBALANCING	0x0800
 #define IRQF_IRQPOLL		0x1000
 #define IRQF_ONESHOT		0x2000
+#define IRQF_NO_SUSPEND		0x4000
+
+#define IRQF_TIMER (__IRQF_TIMER | IRQF_NO_SUSPEND)
 
 /*
  * Bits used by threaded handlers:
diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c
index e149748..c3003e9 100644
--- a/kernel/irq/manage.c
+++ b/kernel/irq/manage.c
@@ -216,7 +216,7 @@ static inline int setup_affinity(unsigned int irq, struct 
irq_desc *desc)
 void __disable_irq(struct irq_desc *desc, unsigned int irq, bool suspend)
 {
if (suspend) {
-   if (!desc->action || (desc->action->flags & IRQF_TIMER))
+   if (!desc->action || (desc->action->flags & IRQF_NO_SUSPEND))
return;
desc->status |= IRQ_SUSPENDED;
}
-- 
1.5.6.5

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


[PATCH 3/4] powerpc: Use IRQF_NO_SUSPEND not IRQF_TIMER for non-timer interrupts

2010-07-29 Thread Ian Campbell
kw_i2c_irq and via_pmu_interrupt are not timer interrupts and
therefore should not use IRQF_TIMER. Use the recently introduced
IRQF_NO_SUSPEND instead since that is the actual desired behaviour.

Signed-off-by: Ian Campbell ian.campb...@citrix.com
Cc: Thomas Gleixner t...@linutronix.de
Cc: Benjamin Herrenschmidt b...@kernel.crashing.org
Cc: Paul Mackerras pau...@samba.org
Cc: Grant Likely grant.lik...@secretlab.ca
Cc: linuxppc-...@ozlabs.org
Cc: devicetree-disc...@lists.ozlabs.org
---
 arch/powerpc/platforms/powermac/low_i2c.c |5 +++--
 drivers/macintosh/via-pmu.c   |9 +
 2 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/platforms/powermac/low_i2c.c 
b/arch/powerpc/platforms/powermac/low_i2c.c
index 06a137c..480567e 100644
--- a/arch/powerpc/platforms/powermac/low_i2c.c
+++ b/arch/powerpc/platforms/powermac/low_i2c.c
@@ -542,11 +542,12 @@ static struct pmac_i2c_host_kw *__init 
kw_i2c_host_init(struct device_node *np)
/* Make sure IRQ is disabled */
kw_write_reg(reg_ier, 0);
 
-   /* Request chip interrupt. We set IRQF_TIMER because we don't
+   /* Request chip interrupt. We set IRQF_NO_SUSPEND because we don't
 * want that interrupt disabled between the 2 passes of driver
 * suspend or we'll have issues running the pfuncs
 */
-   if (request_irq(host->irq, kw_i2c_irq, IRQF_TIMER, "keywest i2c", host))
+   if (request_irq(host->irq, kw_i2c_irq, IRQF_NO_SUSPEND,
+   "keywest i2c", host))
host->irq = NO_IRQ;
 
printk(KERN_INFO "KeyWest i2c @0x%08x irq %d %s\n",
diff --git a/drivers/macintosh/via-pmu.c b/drivers/macintosh/via-pmu.c
index 3d4fc0f..35bc273 100644
--- a/drivers/macintosh/via-pmu.c
+++ b/drivers/macintosh/via-pmu.c
@@ -400,11 +400,12 @@ static int __init via_pmu_start(void)
printk(KERN_ERR "via-pmu: can't map interrupt\n");
return -ENODEV;
}
-   /* We set IRQF_TIMER because we don't want the interrupt to be disabled
-* between the 2 passes of driver suspend, we control our own disabling
-* for that one
+   /* We set IRQF_NO_SUSPEND because we don't want the interrupt
+* to be disabled between the 2 passes of driver suspend, we
+* control our own disabling for that one
 */
-   if (request_irq(irq, via_pmu_interrupt, IRQF_TIMER, "VIA-PMU", (void *)0)) {
+   if (request_irq(irq, via_pmu_interrupt, IRQF_NO_SUSPEND,
+   "VIA-PMU", (void *)0)) {
printk(KERN_ERR "via-pmu: can't request irq %d\n", irq);
return -ENODEV;
}
-- 
1.5.6.5

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: Problems using UART on MPC5200

2010-07-29 Thread Detlev Zundel
Hi Sven,

 I am using a PowerPC MPC5200 from Freescale (with STK5200-Board), ELDK 4.2
 from DENX and the Kernel 2.6.34-rc5.

 My Kernel is running fine. The console output is coming over the device
 ttyPSC0.

 In future I want to log in over telnet. So I deactivated the kernel option
 to output the console over the UART device.

It would help if you were more precise in describing what you did and
what you try to achieve.  What exact option did you change?

 Now I want to read and write to the RS232 interface from a program.
 But when I try to open the device ttyPSC* I get the following error:
 unable to read portsettings : Inappropriate ioctl for device

The message means what it says - whatever device driver is connected to
the device file you open does not support the ioctl you call on it.
Now to better understand this, it would help if you tell us what device
file you open, what major and minor number this has, what /proc/devices
shows this hooks to and what ioctl you do in your application.

 What does this mean ? How can I send and receive Data from/to the UART ?

This should all work with standard procedures.
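For example, the usual way to get and set the port settings from a user space
program looks roughly like this (the device name is only an assumption):

#include <fcntl.h>
#include <stdio.h>
#include <termios.h>
#include <unistd.h>

int main(void)
{
	struct termios tio;
	int fd = open("/dev/ttyPSC0", O_RDWR | O_NOCTTY);

	if (fd < 0) {
		perror("open");
		return 1;
	}
	if (tcgetattr(fd, &tio) < 0) {	/* reads the port settings */
		perror("tcgetattr");
		return 1;
	}
	cfsetispeed(&tio, B115200);
	cfsetospeed(&tio, B115200);
	cfmakeraw(&tio);
	if (tcsetattr(fd, TCSANOW, &tio) < 0) {
		perror("tcsetattr");
		return 1;
	}
	write(fd, "hello\n", 6);
	close(fd);
	return 0;
}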

Cheers
  Detlev

--
DENX Software Engineering GmbH,  MD: Wolfgang Denk & Detlev Zundel
HRB 165235 Munich,  Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-40 Fax: (+49)-8142-66989-80 Email: d...@denx.de

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


[PATCH 00/27] KVM PPC PV framework v3

2010-07-29 Thread Alexander Graf
On PPC we run PR=0 (kernel mode) code in PR=1 (user mode) and don't use the
hypervisor extensions.

While that is all great to show that virtualization is possible, there are
quite some cases where the emulation overhead of privileged instructions is
killing performance.

This patchset tackles exactly that issue. It introduces a paravirtual framework
using which KVM and Linux share a page to exchange register state with. That
way we don't have to switch to the hypervisor just to change a value of a
privileged register.
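As a rough illustration of the idea (this is not code from the series; the
real layout is the kvm_vcpu_arch_shared structure introduced in the patches
below, and the fixed mapping address comes from the magic page patches):

/* Sketch only: assumes the shared page has been mapped by a hypercall at a
 * fixed effective address, as the series later does at -4096. */
#define MAGIC_PAGE_EA	(-4096L)

static inline struct kvm_vcpu_arch_shared *magic_page(void)
{
	return (struct kvm_vcpu_arch_shared *)MAGIC_PAGE_EA;
}

static inline unsigned long guest_mfmsr(void)
{
	return magic_page()->msr;	/* a plain load, no exit to the host */
}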

To prove my point, I ran the same test I did for the MMU optimizations against
the PV framework. Here are the results:

[without]

debian-powerpc:~# time for i in {1..1000}; do /bin/echo hello > /dev/null; done

real0m14.659s
user0m8.967s
sys 0m5.688s

[with]

debian-powerpc:~# time for i in {1..1000}; do /bin/echo hello > /dev/null; done

real0m7.557s
user0m4.121s
sys 0m3.426s


So this is a significant performance improvement! I'm quite happy how fast this
whole thing becomes :)

I tried to take all comments I've heard from people so far about such a PV
framework into account. In case you told me something before that is a no-go
and I still did it, please just tell me again.

To make use of this whole thing you also need patches to qemu and openbios. I
have them in my queue, but want to see this set upstream first before I start
sending patches to the other projects.

Now go and have fun with fast VMs on PPC! Get yourself a G5 on ebay and start
experiencing the power yourself. - heh

v1 -> v2:

  - change hypervisor calls to use r0 and r3
  - make crit detection only trigger in supervisor mode
  - RMO -> PAM
  - introduce kvm_patch_ins
  - only flush icache when patching
  - introduce kvm_patch_ins_b
  - update documentation

v2 -> v3:

  - use pPAPR conventions for hypercall interface
  - only use r0 as magic sc number
  - remove PVR detection
  - remove BookE shared page mapping support
  - combine book3s-64 and -32 magic page ra override
  - add self-test check if the mapping works to guest code
  - add safety check for relocatable kernels

Alexander Graf (27):
  KVM: PPC: Introduce shared page
  KVM: PPC: Convert MSR to shared page
  KVM: PPC: Convert DSISR to shared page
  KVM: PPC: Convert DAR to shared page.
  KVM: PPC: Convert SRR0 and SRR1 to shared page
  KVM: PPC: Convert SPRG[0-4] to shared page
  KVM: PPC: Implement hypervisor interface
  KVM: PPC: Add PV guest critical sections
  KVM: PPC: Add PV guest scratch registers
  KVM: PPC: Tell guest about pending interrupts
  KVM: PPC: Make PAM a define
  KVM: PPC: First magic page steps
  KVM: PPC: Magic Page Book3s support
  KVM: PPC: Expose magic page support to guest
  KVM: Move kvm_guest_init out of generic code
  KVM: PPC: Generic KVM PV guest support
  KVM: PPC: KVM PV guest stubs
  KVM: PPC: PV instructions to loads and stores
  KVM: PPC: PV tlbsync to nop
  KVM: PPC: Introduce kvm_tmp framework
  KVM: PPC: Introduce branch patching helper
  KVM: PPC: PV assembler helpers
  KVM: PPC: PV mtmsrd L=1
  KVM: PPC: PV mtmsrd L=0 and mtmsr
  KVM: PPC: PV wrteei
  KVM: PPC: Add Documentation about PV interface
  KVM: PPC: Add get_pvinfo interface to query hypercall instructions

 Documentation/kvm/api.txt|   23 ++
 Documentation/kvm/ppc-pv.txt |  180 +++
 arch/powerpc/include/asm/kvm_book3s.h|2 +-
 arch/powerpc/include/asm/kvm_host.h  |   15 +-
 arch/powerpc/include/asm/kvm_para.h  |  135 -
 arch/powerpc/include/asm/kvm_ppc.h   |1 +
 arch/powerpc/kernel/Makefile |2 +
 arch/powerpc/kernel/asm-offsets.c|   18 +-
 arch/powerpc/kernel/kvm.c|  485 ++
 arch/powerpc/kernel/kvm_emul.S   |  247 +++
 arch/powerpc/kvm/44x.c   |7 +
 arch/powerpc/kvm/44x_tlb.c   |8 +-
 arch/powerpc/kvm/book3s.c|  188 
 arch/powerpc/kvm/book3s_32_mmu.c |   28 ++-
 arch/powerpc/kvm/book3s_32_mmu_host.c|6 +-
 arch/powerpc/kvm/book3s_64_mmu.c |   42 +++-
 arch/powerpc/kvm/book3s_64_mmu_host.c|   13 +-
 arch/powerpc/kvm/book3s_emulate.c|   25 +-
 arch/powerpc/kvm/book3s_paired_singles.c |   11 +-
 arch/powerpc/kvm/booke.c |   83 --
 arch/powerpc/kvm/booke.h |6 +-
 arch/powerpc/kvm/booke_emulate.c |   14 +-
 arch/powerpc/kvm/booke_interrupts.S  |3 +-
 arch/powerpc/kvm/e500.c  |7 +
 arch/powerpc/kvm/e500_tlb.c  |   12 +-
 arch/powerpc/kvm/e500_tlb.h  |2 +-
 arch/powerpc/kvm/emulate.c   |   36 ++-
 arch/powerpc/kvm/powerpc.c   |   84 +-
 arch/powerpc/platforms/Kconfig   |   10 +
 arch/x86/include/asm/kvm_para.h  |6 +
 include/linux/kvm.h  |   11 +
 include/linux/kvm_para.h |7 +-
 32 files changed, 1538 

[PATCH 08/27] KVM: PPC: Add PV guest critical sections

2010-07-29 Thread Alexander Graf
When running in hooked code we need a way to disable interrupts without
clobbering any interrupts or exiting out to the hypervisor.

To achieve this, we have an additional critical field in the shared page. If
that field is equal to the r1 register of the guest, it tells the hypervisor
that we're in such a critical section and thus may not receive any interrupts.
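The hunks below are the host side of that check; the guest side is not part
of this patch. As a hedged sketch of what the guest convention amounts to
(helper names are made up, and the real guest code in this series is
assembler, not C):

register unsigned long r1 asm("r1");	/* the guest stack pointer */

static inline void guest_enter_crit(struct kvm_vcpu_arch_shared *shared)
{
	shared->critical = r1;		/* host now holds off interrupt delivery */
}

static inline void guest_exit_crit(struct kvm_vcpu_arch_shared *shared)
{
	shared->critical = 0;		/* deferred interrupts may be delivered */
}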

Signed-off-by: Alexander Graf ag...@suse.de

---

v1 -> v2:

  - make crit detection only trigger in supervisor mode
---
 arch/powerpc/include/asm/kvm_para.h |1 +
 arch/powerpc/kvm/book3s.c   |   18 --
 arch/powerpc/kvm/booke.c|   15 +++
 3 files changed, 32 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_para.h 
b/arch/powerpc/include/asm/kvm_para.h
index 556fd59..4577e7b 100644
--- a/arch/powerpc/include/asm/kvm_para.h
+++ b/arch/powerpc/include/asm/kvm_para.h
@@ -24,6 +24,7 @@
 #include <linux/of.h>
 
 struct kvm_vcpu_arch_shared {
+   __u64 critical; /* Guest may not get interrupts if == r1 */
__u64 sprg0;
__u64 sprg1;
__u64 sprg2;
diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c
index 5cb5f0d..d6227ff 100644
--- a/arch/powerpc/kvm/book3s.c
+++ b/arch/powerpc/kvm/book3s.c
@@ -251,14 +251,28 @@ int kvmppc_book3s_irqprio_deliver(struct kvm_vcpu *vcpu, 
unsigned int priority)
int deliver = 1;
int vec = 0;
ulong flags = 0ULL;
+   ulong crit_raw = vcpu->arch.shared->critical;
+   ulong crit_r1 = kvmppc_get_gpr(vcpu, 1);
+   bool crit;
+
+   /* Truncate crit indicators in 32 bit mode */
+   if (!(vcpu->arch.shared->msr & MSR_SF)) {
+   crit_raw &= 0xffffffff;
+   crit_r1 &= 0xffffffff;
+   }
+
+   /* Critical section when crit == r1 */
+   crit = (crit_raw == crit_r1);
+   /* ... and we're in supervisor mode */
+   crit = crit && !(vcpu->arch.shared->msr & MSR_PR);
 
switch (priority) {
case BOOK3S_IRQPRIO_DECREMENTER:
-   deliver = vcpu->arch.shared->msr & MSR_EE;
+   deliver = (vcpu->arch.shared->msr & MSR_EE) && !crit;
vec = BOOK3S_INTERRUPT_DECREMENTER;
break;
case BOOK3S_IRQPRIO_EXTERNAL:
-   deliver = vcpu->arch.shared->msr & MSR_EE;
+   deliver = (vcpu->arch.shared->msr & MSR_EE) && !crit;
vec = BOOK3S_INTERRUPT_EXTERNAL;
break;
case BOOK3S_IRQPRIO_SYSTEM_RESET:
diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index 13e0747..104d0ee 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -147,6 +147,20 @@ static int kvmppc_booke_irqprio_deliver(struct kvm_vcpu 
*vcpu,
int allowed = 0;
ulong uninitialized_var(msr_mask);
bool update_esr = false, update_dear = false;
+   ulong crit_raw = vcpu->arch.shared->critical;
+   ulong crit_r1 = kvmppc_get_gpr(vcpu, 1);
+   bool crit;
+
+   /* Truncate crit indicators in 32 bit mode */
+   if (!(vcpu->arch.shared->msr & MSR_SF)) {
+   crit_raw &= 0xffffffff;
+   crit_r1 &= 0xffffffff;
+   }
+
+   /* Critical section when crit == r1 */
+   crit = (crit_raw == crit_r1);
+   /* ... and we're in supervisor mode */
+   crit = crit && !(vcpu->arch.shared->msr & MSR_PR);
 
switch (priority) {
case BOOKE_IRQPRIO_DTLB_MISS:
@@ -181,6 +195,7 @@ static int kvmppc_booke_irqprio_deliver(struct kvm_vcpu 
*vcpu,
case BOOKE_IRQPRIO_DECREMENTER:
case BOOKE_IRQPRIO_FIT:
allowed = vcpu->arch.shared->msr & MSR_EE;
+   allowed = allowed && !crit;
msr_mask = MSR_CE|MSR_ME|MSR_DE;
break;
case BOOKE_IRQPRIO_DEBUG:
-- 
1.6.0.2

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


[PATCH 01/27] KVM: PPC: Introduce shared page

2010-07-29 Thread Alexander Graf
For transparent variable sharing between the hypervisor and guest, I introduce
a shared page. This shared page will contain all the registers the guest can
read and write safely without exiting guest context.

This patch only implements the stubs required for the basic structure of the
shared page. The actual register moving follows.

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/include/asm/kvm_host.h |2 ++
 arch/powerpc/include/asm/kvm_para.h |5 +
 arch/powerpc/kernel/asm-offsets.c   |1 +
 arch/powerpc/kvm/44x.c  |7 +++
 arch/powerpc/kvm/book3s.c   |9 -
 arch/powerpc/kvm/e500.c |7 +++
 6 files changed, 30 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_host.h 
b/arch/powerpc/include/asm/kvm_host.h
index b0b23c0..53edacd 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -25,6 +25,7 @@
 #include <linux/interrupt.h>
 #include <linux/types.h>
 #include <linux/kvm_types.h>
+#include <linux/kvm_para.h>
 #include <asm/kvm_asm.h>
 
 #define KVM_MAX_VCPUS 1
@@ -290,6 +291,7 @@ struct kvm_vcpu_arch {
struct tasklet_struct tasklet;
u64 dec_jiffies;
unsigned long pending_exceptions;
+   struct kvm_vcpu_arch_shared *shared;
 
 #ifdef CONFIG_PPC_BOOK3S
struct hlist_head hpte_hash_pte[HPTEG_HASH_NUM_PTE];
diff --git a/arch/powerpc/include/asm/kvm_para.h 
b/arch/powerpc/include/asm/kvm_para.h
index 2d48f6a..1485ba8 100644
--- a/arch/powerpc/include/asm/kvm_para.h
+++ b/arch/powerpc/include/asm/kvm_para.h
@@ -20,6 +20,11 @@
 #ifndef __POWERPC_KVM_PARA_H__
 #define __POWERPC_KVM_PARA_H__
 
+#include <linux/types.h>
+
+struct kvm_vcpu_arch_shared {
+};
+
 #ifdef __KERNEL__
 
 static inline int kvm_para_available(void)
diff --git a/arch/powerpc/kernel/asm-offsets.c 
b/arch/powerpc/kernel/asm-offsets.c
index 496cc5b..944f593 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -400,6 +400,7 @@ int main(void)
DEFINE(VCPU_SPRG6, offsetof(struct kvm_vcpu, arch.sprg6));
DEFINE(VCPU_SPRG7, offsetof(struct kvm_vcpu, arch.sprg7));
DEFINE(VCPU_SHADOW_PID, offsetof(struct kvm_vcpu, arch.shadow_pid));
+   DEFINE(VCPU_SHARED, offsetof(struct kvm_vcpu, arch.shared));
 
/* book3s */
 #ifdef CONFIG_PPC_BOOK3S
diff --git a/arch/powerpc/kvm/44x.c b/arch/powerpc/kvm/44x.c
index 73c0a3f..e7b1f3f 100644
--- a/arch/powerpc/kvm/44x.c
+++ b/arch/powerpc/kvm/44x.c
@@ -123,8 +123,14 @@ struct kvm_vcpu *kvmppc_core_vcpu_create(struct kvm *kvm, 
unsigned int id)
if (err)
goto free_vcpu;
 
+   vcpu->arch.shared = (void*)__get_free_page(GFP_KERNEL|__GFP_ZERO);
+   if (!vcpu->arch.shared)
+   goto uninit_vcpu;
+
return vcpu;
 
+uninit_vcpu:
+   kvm_vcpu_uninit(vcpu);
 free_vcpu:
kmem_cache_free(kvm_vcpu_cache, vcpu_44x);
 out:
@@ -135,6 +141,7 @@ void kvmppc_core_vcpu_free(struct kvm_vcpu *vcpu)
 {
struct kvmppc_vcpu_44x *vcpu_44x = to_44x(vcpu);
 
+   free_page((unsigned long)vcpu->arch.shared);
kvm_vcpu_uninit(vcpu);
kmem_cache_free(kvm_vcpu_cache, vcpu_44x);
 }
diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c
index a3cef30..b3385dd 100644
--- a/arch/powerpc/kvm/book3s.c
+++ b/arch/powerpc/kvm/book3s.c
@@ -1242,6 +1242,10 @@ struct kvm_vcpu *kvmppc_core_vcpu_create(struct kvm 
*kvm, unsigned int id)
if (err)
goto free_shadow_vcpu;
 
+   vcpu->arch.shared = (void*)__get_free_page(GFP_KERNEL|__GFP_ZERO);
+   if (!vcpu->arch.shared)
+   goto uninit_vcpu;
+
vcpu->arch.host_retip = kvm_return_point;
vcpu->arch.host_msr = mfmsr();
 #ifdef CONFIG_PPC_BOOK3S_64
@@ -1268,10 +1272,12 @@ struct kvm_vcpu *kvmppc_core_vcpu_create(struct kvm 
*kvm, unsigned int id)
 
err = kvmppc_mmu_init(vcpu);
if (err < 0)
-   goto free_shadow_vcpu;
+   goto uninit_vcpu;
 
return vcpu;
 
+uninit_vcpu:
+   kvm_vcpu_uninit(vcpu);
 free_shadow_vcpu:
kfree(vcpu_book3s->shadow_vcpu);
 free_vcpu:
@@ -1284,6 +1290,7 @@ void kvmppc_core_vcpu_free(struct kvm_vcpu *vcpu)
 {
struct kvmppc_vcpu_book3s *vcpu_book3s = to_book3s(vcpu);
 
+   free_page((unsigned long)vcpu->arch.shared);
kvm_vcpu_uninit(vcpu);
kfree(vcpu_book3s->shadow_vcpu);
vfree(vcpu_book3s);
diff --git a/arch/powerpc/kvm/e500.c b/arch/powerpc/kvm/e500.c
index e8a00b0..71750f2 100644
--- a/arch/powerpc/kvm/e500.c
+++ b/arch/powerpc/kvm/e500.c
@@ -117,8 +117,14 @@ struct kvm_vcpu *kvmppc_core_vcpu_create(struct kvm *kvm, 
unsigned int id)
if (err)
goto uninit_vcpu;
 
+   vcpu->arch.shared = (void*)__get_free_page(GFP_KERNEL|__GFP_ZERO);
+   if (!vcpu->arch.shared)
+   goto uninit_tlb;
+
return vcpu;
 
+uninit_tlb:
+   kvmppc_e500_tlb_uninit(vcpu_e500);
 uninit_vcpu:
 

[PATCH 06/27] KVM: PPC: Convert SPRG[0-4] to shared page

2010-07-29 Thread Alexander Graf
When in kernel mode there are 4 additional registers available that are
simple data storage. Instead of exiting to the hypervisor to read and
write those, we can just share them with the guest using the page.

This patch converts all users of the current field to the shared page.

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/include/asm/kvm_host.h |4 
 arch/powerpc/include/asm/kvm_para.h |4 
 arch/powerpc/kvm/book3s.c   |   16 
 arch/powerpc/kvm/booke.c|   16 
 arch/powerpc/kvm/emulate.c  |   24 
 5 files changed, 36 insertions(+), 28 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_host.h 
b/arch/powerpc/include/asm/kvm_host.h
index 5255d75..221cf85 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -217,10 +217,6 @@ struct kvm_vcpu_arch {
ulong guest_owned_ext;
 #endif
u32 mmucr;
-   ulong sprg0;
-   ulong sprg1;
-   ulong sprg2;
-   ulong sprg3;
ulong sprg4;
ulong sprg5;
ulong sprg6;
diff --git a/arch/powerpc/include/asm/kvm_para.h 
b/arch/powerpc/include/asm/kvm_para.h
index d7fc6c2..e402999 100644
--- a/arch/powerpc/include/asm/kvm_para.h
+++ b/arch/powerpc/include/asm/kvm_para.h
@@ -23,6 +23,10 @@
 #include <linux/types.h>
 
 struct kvm_vcpu_arch_shared {
+   __u64 sprg0;
+   __u64 sprg1;
+   __u64 sprg2;
+   __u64 sprg3;
__u64 srr0;
__u64 srr1;
__u64 dar;
diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c
index afa0dd4..cfd7fe5 100644
--- a/arch/powerpc/kvm/book3s.c
+++ b/arch/powerpc/kvm/book3s.c
@@ -1062,10 +1062,10 @@ int kvm_arch_vcpu_ioctl_get_regs(struct kvm_vcpu *vcpu, 
struct kvm_regs *regs)
regs->srr0 = vcpu->arch.shared->srr0;
regs->srr1 = vcpu->arch.shared->srr1;
regs->pid = vcpu->arch.pid;
-   regs->sprg0 = vcpu->arch.sprg0;
-   regs->sprg1 = vcpu->arch.sprg1;
-   regs->sprg2 = vcpu->arch.sprg2;
-   regs->sprg3 = vcpu->arch.sprg3;
+   regs->sprg0 = vcpu->arch.shared->sprg0;
+   regs->sprg1 = vcpu->arch.shared->sprg1;
+   regs->sprg2 = vcpu->arch.shared->sprg2;
+   regs->sprg3 = vcpu->arch.shared->sprg3;
regs->sprg5 = vcpu->arch.sprg4;
regs->sprg6 = vcpu->arch.sprg5;
regs->sprg7 = vcpu->arch.sprg6;
@@ -1088,10 +1088,10 @@ int kvm_arch_vcpu_ioctl_set_regs(struct kvm_vcpu *vcpu, 
struct kvm_regs *regs)
kvmppc_set_msr(vcpu, regs->msr);
vcpu->arch.shared->srr0 = regs->srr0;
vcpu->arch.shared->srr1 = regs->srr1;
-   vcpu->arch.sprg0 = regs->sprg0;
-   vcpu->arch.sprg1 = regs->sprg1;
-   vcpu->arch.sprg2 = regs->sprg2;
-   vcpu->arch.sprg3 = regs->sprg3;
+   vcpu->arch.shared->sprg0 = regs->sprg0;
+   vcpu->arch.shared->sprg1 = regs->sprg1;
+   vcpu->arch.shared->sprg2 = regs->sprg2;
+   vcpu->arch.shared->sprg3 = regs->sprg3;
vcpu->arch.sprg5 = regs->sprg4;
vcpu->arch.sprg6 = regs->sprg5;
vcpu->arch.sprg7 = regs->sprg6;
diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index 793df28..b2c8c42 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -495,10 +495,10 @@ int kvm_arch_vcpu_ioctl_get_regs(struct kvm_vcpu *vcpu, 
struct kvm_regs *regs)
regs->srr0 = vcpu->arch.shared->srr0;
regs->srr1 = vcpu->arch.shared->srr1;
regs->pid = vcpu->arch.pid;
-   regs->sprg0 = vcpu->arch.sprg0;
-   regs->sprg1 = vcpu->arch.sprg1;
-   regs->sprg2 = vcpu->arch.sprg2;
-   regs->sprg3 = vcpu->arch.sprg3;
+   regs->sprg0 = vcpu->arch.shared->sprg0;
+   regs->sprg1 = vcpu->arch.shared->sprg1;
+   regs->sprg2 = vcpu->arch.shared->sprg2;
+   regs->sprg3 = vcpu->arch.shared->sprg3;
regs->sprg5 = vcpu->arch.sprg4;
regs->sprg6 = vcpu->arch.sprg5;
regs->sprg7 = vcpu->arch.sprg6;
@@ -521,10 +521,10 @@ int kvm_arch_vcpu_ioctl_set_regs(struct kvm_vcpu *vcpu, 
struct kvm_regs *regs)
kvmppc_set_msr(vcpu, regs->msr);
vcpu->arch.shared->srr0 = regs->srr0;
vcpu->arch.shared->srr1 = regs->srr1;
-   vcpu->arch.sprg0 = regs->sprg0;
-   vcpu->arch.sprg1 = regs->sprg1;
-   vcpu->arch.sprg2 = regs->sprg2;
-   vcpu->arch.sprg3 = regs->sprg3;
+   vcpu->arch.shared->sprg0 = regs->sprg0;
+   vcpu->arch.shared->sprg1 = regs->sprg1;
+   vcpu->arch.shared->sprg2 = regs->sprg2;
+   vcpu->arch.shared->sprg3 = regs->sprg3;
vcpu->arch.sprg5 = regs->sprg4;
vcpu->arch.sprg6 = regs->sprg5;
vcpu->arch.sprg7 = regs->sprg6;
diff --git a/arch/powerpc/kvm/emulate.c b/arch/powerpc/kvm/emulate.c
index ad0fa4f..454869b 100644
--- a/arch/powerpc/kvm/emulate.c
+++ b/arch/powerpc/kvm/emulate.c
@@ -263,13 +263,17 @@ int kvmppc_emulate_instruction(struct kvm_run *run, 
struct kvm_vcpu *vcpu)
kvmppc_set_gpr(vcpu, rt, get_tb()); break;
 
case SPRN_SPRG0:
-   kvmppc_set_gpr(vcpu, rt, 

[PATCH 02/27] KVM: PPC: Convert MSR to shared page

2010-07-29 Thread Alexander Graf
One of the most obvious registers to share with the guest directly is the
MSR. The MSR contains the interrupts enabled flag which the guest has to
toggle in critical sections.

So in order to bring the overhead of interrupt en- and disabling down, let's
put msr into the shared page. Keep in mind that even though you can fully read
its contents, writing to it doesn't always update all state. There are a few
safe fields that don't require hypervisor interaction. See the documentation
for a list of MSR bits that are safe to be set from inside the guest.

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/include/asm/kvm_host.h  |1 -
 arch/powerpc/include/asm/kvm_para.h  |1 +
 arch/powerpc/kernel/asm-offsets.c|2 +-
 arch/powerpc/kvm/44x_tlb.c   |8 ++--
 arch/powerpc/kvm/book3s.c|   65 --
 arch/powerpc/kvm/book3s_32_mmu.c |   12 +++---
 arch/powerpc/kvm/book3s_32_mmu_host.c|4 +-
 arch/powerpc/kvm/book3s_64_mmu.c |   12 +++---
 arch/powerpc/kvm/book3s_64_mmu_host.c|4 +-
 arch/powerpc/kvm/book3s_emulate.c|9 ++--
 arch/powerpc/kvm/book3s_paired_singles.c |7 ++-
 arch/powerpc/kvm/booke.c |   20 +-
 arch/powerpc/kvm/booke.h |6 +-
 arch/powerpc/kvm/booke_emulate.c |6 +-
 arch/powerpc/kvm/booke_interrupts.S  |3 +-
 arch/powerpc/kvm/e500_tlb.c  |   12 +++---
 arch/powerpc/kvm/e500_tlb.h  |2 +-
 arch/powerpc/kvm/powerpc.c   |3 +-
 18 files changed, 93 insertions(+), 84 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_host.h 
b/arch/powerpc/include/asm/kvm_host.h
index 53edacd..ba20f90 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -211,7 +211,6 @@ struct kvm_vcpu_arch {
u32 cr;
 #endif
 
-   ulong msr;
 #ifdef CONFIG_PPC_BOOK3S
ulong shadow_msr;
ulong hflags;
diff --git a/arch/powerpc/include/asm/kvm_para.h 
b/arch/powerpc/include/asm/kvm_para.h
index 1485ba8..a17dc52 100644
--- a/arch/powerpc/include/asm/kvm_para.h
+++ b/arch/powerpc/include/asm/kvm_para.h
@@ -23,6 +23,7 @@
 #include <linux/types.h>
 
 struct kvm_vcpu_arch_shared {
+   __u64 msr;
 };
 
 #ifdef __KERNEL__
diff --git a/arch/powerpc/kernel/asm-offsets.c 
b/arch/powerpc/kernel/asm-offsets.c
index 944f593..a55d47e 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -394,13 +394,13 @@ int main(void)
DEFINE(VCPU_HOST_STACK, offsetof(struct kvm_vcpu, arch.host_stack));
DEFINE(VCPU_HOST_PID, offsetof(struct kvm_vcpu, arch.host_pid));
DEFINE(VCPU_GPRS, offsetof(struct kvm_vcpu, arch.gpr));
-   DEFINE(VCPU_MSR, offsetof(struct kvm_vcpu, arch.msr));
DEFINE(VCPU_SPRG4, offsetof(struct kvm_vcpu, arch.sprg4));
DEFINE(VCPU_SPRG5, offsetof(struct kvm_vcpu, arch.sprg5));
DEFINE(VCPU_SPRG6, offsetof(struct kvm_vcpu, arch.sprg6));
DEFINE(VCPU_SPRG7, offsetof(struct kvm_vcpu, arch.sprg7));
DEFINE(VCPU_SHADOW_PID, offsetof(struct kvm_vcpu, arch.shadow_pid));
DEFINE(VCPU_SHARED, offsetof(struct kvm_vcpu, arch.shared));
+   DEFINE(VCPU_SHARED_MSR, offsetof(struct kvm_vcpu_arch_shared, msr));
 
/* book3s */
 #ifdef CONFIG_PPC_BOOK3S
diff --git a/arch/powerpc/kvm/44x_tlb.c b/arch/powerpc/kvm/44x_tlb.c
index 8123125..4cbbca7 100644
--- a/arch/powerpc/kvm/44x_tlb.c
+++ b/arch/powerpc/kvm/44x_tlb.c
@@ -221,14 +221,14 @@ gpa_t kvmppc_mmu_xlate(struct kvm_vcpu *vcpu, unsigned 
int gtlb_index,
 
 int kvmppc_mmu_itlb_index(struct kvm_vcpu *vcpu, gva_t eaddr)
 {
-   unsigned int as = !!(vcpu->arch.msr & MSR_IS);
+   unsigned int as = !!(vcpu->arch.shared->msr & MSR_IS);
 
return kvmppc_44x_tlb_index(vcpu, eaddr, vcpu->arch.pid, as);
 }
 
 int kvmppc_mmu_dtlb_index(struct kvm_vcpu *vcpu, gva_t eaddr)
 {
-   unsigned int as = !!(vcpu->arch.msr & MSR_DS);
+   unsigned int as = !!(vcpu->arch.shared->msr & MSR_DS);
 
return kvmppc_44x_tlb_index(vcpu, eaddr, vcpu->arch.pid, as);
 }
@@ -353,7 +353,7 @@ void kvmppc_mmu_map(struct kvm_vcpu *vcpu, u64 gvaddr, 
gpa_t gpaddr,
 
stlbe.word1 = (hpaddr & 0xfffffc00) | ((hpaddr >> 32) & 0xf);
stlbe.word2 = kvmppc_44x_tlb_shadow_attrib(flags,
-   vcpu->arch.msr & MSR_PR);
+   vcpu->arch.shared->msr & MSR_PR);
stlbe.tid = !(asid & 0xff);
 
/* Keep track of the reference so we can properly release it later. */
@@ -422,7 +422,7 @@ static int tlbe_is_host_safe(const struct kvm_vcpu *vcpu,
 
/* Does it match current guest AS? */
/* XXX what about IS != DS? */
-   if (get_tlb_ts(tlbe) != !!(vcpu->arch.msr & MSR_IS))
+   if (get_tlb_ts(tlbe) != !!(vcpu->arch.shared->msr & MSR_IS))
return 0;
 
gpa = get_tlb_raddr(tlbe);
diff 

[PATCH 14/27] KVM: PPC: Expose magic page support to guest

2010-07-29 Thread Alexander Graf
Now that we have the shared page in place and the MMU code knows about
the magic page, we can expose that capability to the guest!

Signed-off-by: Alexander Graf ag...@suse.de

---

v2 -> v3:

  - align hypercalls to in/out of ePAPR
---
 arch/powerpc/include/asm/kvm_para.h |2 ++
 arch/powerpc/kvm/powerpc.c  |   11 +++
 2 files changed, 13 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_para.h 
b/arch/powerpc/include/asm/kvm_para.h
index 0653b0d..7438ab3 100644
--- a/arch/powerpc/include/asm/kvm_para.h
+++ b/arch/powerpc/include/asm/kvm_para.h
@@ -45,6 +45,8 @@ struct kvm_vcpu_arch_shared {
 #define HC_EV_SUCCESS  0
 #define HC_EV_UNIMPLEMENTED12
 
+#define KVM_FEATURE_MAGIC_PAGE 1
+
 #ifdef __KERNEL__
 
 #ifdef CONFIG_KVM_GUEST
diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index a4cf4b4..fecfe04 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -61,8 +61,19 @@ int kvmppc_kvm_pv(struct kvm_vcpu *vcpu)
}
 
switch (nr) {
+   case HC_VENDOR_KVM | KVM_HC_PPC_MAP_MAGIC_PAGE:
+   {
+   vcpu->arch.magic_page_pa = param1;
+   vcpu->arch.magic_page_ea = param2;
+
+   r = HC_EV_SUCCESS;
+   break;
+   }
case HC_VENDOR_KVM | KVM_HC_FEATURES:
r = HC_EV_SUCCESS;
+#if defined(CONFIG_PPC_BOOK3S) /* XXX Missing magic page on BookE */
+   r2 |= (1 << KVM_FEATURE_MAGIC_PAGE);
+#endif
 
/* Second return value is in r4 */
kvmppc_set_gpr(vcpu, 4, r2);
-- 
1.6.0.2

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


[PATCH 12/27] KVM: PPC: First magic page steps

2010-07-29 Thread Alexander Graf
We will be introducing a method to project the shared page into guest context.
As soon as we're talking about this coupling, the shared page is called the
magic page.

This patch introduces simple defines, so the follow-up patches are easier to
read.

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/include/asm/kvm_host.h |2 ++
 include/linux/kvm_para.h|1 +
 2 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_host.h 
b/arch/powerpc/include/asm/kvm_host.h
index 1674da8..e1da775 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -287,6 +287,8 @@ struct kvm_vcpu_arch {
u64 dec_jiffies;
unsigned long pending_exceptions;
struct kvm_vcpu_arch_shared *shared;
+   unsigned long magic_page_pa; /* phys addr to map the magic page to */
+   unsigned long magic_page_ea; /* effect. addr to map the magic page to */
 
 #ifdef CONFIG_PPC_BOOK3S
struct hlist_head hpte_hash_pte[HPTEG_HASH_NUM_PTE];
diff --git a/include/linux/kvm_para.h b/include/linux/kvm_para.h
index 3b8080e..ac2015a 100644
--- a/include/linux/kvm_para.h
+++ b/include/linux/kvm_para.h
@@ -18,6 +18,7 @@
 #define KVM_HC_VAPIC_POLL_IRQ  1
 #define KVM_HC_MMU_OP  2
 #define KVM_HC_FEATURES3
+#define KVM_HC_PPC_MAP_MAGIC_PAGE  4
 
 /*
  * hypercalls use architecture specific
-- 
1.6.0.2

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


[PATCH 17/27] KVM: PPC: KVM PV guest stubs

2010-07-29 Thread Alexander Graf
We will soon start and replace instructions from the text section with
other, paravirtualized versions. To ease the readability of those patches
I split the generic looping and magic page mapping code out.

This patch still only contains stubs. But at least it loops through the
text section :).

Signed-off-by: Alexander Graf ag...@suse.de

---

v1 -> v2:

  - kvm guest patch framework: introduce patch_ins

v2 -> v3:

  - add self-test in guest code
  - remove superfluous new lines in generic guest code
---
 arch/powerpc/kernel/kvm.c |   95 +
 1 files changed, 95 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/kernel/kvm.c b/arch/powerpc/kernel/kvm.c
index a5ece71..e93366f 100644
--- a/arch/powerpc/kernel/kvm.c
+++ b/arch/powerpc/kernel/kvm.c
@@ -33,6 +33,62 @@
 #define KVM_MAGIC_PAGE (-4096L)
 #define magic_var(x) KVM_MAGIC_PAGE + offsetof(struct kvm_vcpu_arch_shared, x)
 
+#define KVM_MASK_RT		0x03e00000
+
+static bool kvm_patching_worked = true;
+
+static inline void kvm_patch_ins(u32 *inst, u32 new_inst)
+{
+   *inst = new_inst;
+   flush_icache_range((ulong)inst, (ulong)inst + 4);
+}
+
+static void kvm_map_magic_page(void *data)
+{
+   kvm_hypercall2(KVM_HC_PPC_MAP_MAGIC_PAGE,
+  KVM_MAGIC_PAGE,  /* Physical Address */
+  KVM_MAGIC_PAGE); /* Effective Address */
+}
+
+static void kvm_check_ins(u32 *inst)
+{
+   u32 _inst = *inst;
+   u32 inst_no_rt = _inst & ~KVM_MASK_RT;
+   u32 inst_rt = _inst & KVM_MASK_RT;
+
+   switch (inst_no_rt) {
+   }
+
+   switch (_inst) {
+   }
+}
+
+static void kvm_use_magic_page(void)
+{
+   u32 *p;
+   u32 *start, *end;
+   u32 tmp;
+
+   /* Tell the host to map the magic page to -4096 on all CPUs */
+   on_each_cpu(kvm_map_magic_page, NULL, 1);
+
+   /* Quick self-test to see if the mapping works */
+   if (__get_user(tmp, (u32*)KVM_MAGIC_PAGE)) {
+   kvm_patching_worked = false;
+   return;
+   }
+
+   /* Now loop through all code and find instructions */
+   start = (void*)_stext;
+   end = (void*)_etext;
+
+   for (p = start; p < end; p++)
+   kvm_check_ins(p);
+
+   printk(KERN_INFO "KVM: Live patching for a fast VM %s\n",
+kvm_patching_worked ? "worked" : "failed");
+}
+
 unsigned long kvm_hypercall(unsigned long *in,
unsigned long *out,
unsigned long nr)
@@ -69,3 +125,42 @@ unsigned long kvm_hypercall(unsigned long *in,
return r3;
 }
 EXPORT_SYMBOL_GPL(kvm_hypercall);
+
+static int kvm_para_setup(void)
+{
+   extern u32 kvm_hypercall_start;
+   struct device_node *hyper_node;
+   u32 *insts;
+   int len, i;
+
+   hyper_node = of_find_node_by_path(/hypervisor);
+   if (!hyper_node)
+   return -1;
+
+   insts = (u32*)of_get_property(hyper_node, "hcall-instructions", &len);
+   if (len % 4)
+   return -1;
+   if (len < (4 * 4))
+   return -1;
+
+   for (i = 0; i < (len / 4); i++)
+   kvm_patch_ins(&(&kvm_hypercall_start)[i], insts[i]);
+
+   return 0;
+}
+
+static int __init kvm_guest_init(void)
+{
+   if (!kvm_para_available())
+   return 0;
+
+   if (kvm_para_setup())
+   return 0;
+
+   if (kvm_para_has_feature(KVM_FEATURE_MAGIC_PAGE))
+   kvm_use_magic_page();
+
+   return 0;
+}
+
+postcore_initcall(kvm_guest_init);
-- 
1.6.0.2

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


[PATCH 10/27] KVM: PPC: Tell guest about pending interrupts

2010-07-29 Thread Alexander Graf
When the guest turns on interrupts again, it needs to know if we have an
interrupt pending for it. Because if so, it should rather get out of guest
context and get the interrupt.

So we introduce a new field in the shared page that we use to tell the guest
that there's a pending interrupt lying around.
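The guest-side consumer is not in this patch (it arrives with the paravirt
mtmsr/wrteei patches later in the series); as a rough C sketch of the intent,
with the fallback helper purely hypothetical:

static inline void guest_irq_enable(struct kvm_vcpu_arch_shared *shared)
{
	shared->msr |= MSR_EE;		/* plain store, no exit to the host */
	if (shared->int_pending)
		trap_to_hypervisor();	/* hypothetical: take the exit so the
					 * pending interrupt gets delivered */
}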

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/include/asm/kvm_para.h |1 +
 arch/powerpc/kvm/book3s.c   |7 +++
 arch/powerpc/kvm/booke.c|7 +++
 3 files changed, 15 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_para.h 
b/arch/powerpc/include/asm/kvm_para.h
index 5be00c9..0653b0d 100644
--- a/arch/powerpc/include/asm/kvm_para.h
+++ b/arch/powerpc/include/asm/kvm_para.h
@@ -37,6 +37,7 @@ struct kvm_vcpu_arch_shared {
__u64 dar;
__u64 msr;
__u32 dsisr;
+   __u32 int_pending;  /* Tells the guest if we have an interrupt */
 };
 
 #define KVM_SC_MAGIC_R00x4b564d21 /* KVM! */
diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c
index d6227ff..06229fe 100644
--- a/arch/powerpc/kvm/book3s.c
+++ b/arch/powerpc/kvm/book3s.c
@@ -337,6 +337,7 @@ int kvmppc_book3s_irqprio_deliver(struct kvm_vcpu *vcpu, 
unsigned int priority)
 void kvmppc_core_deliver_interrupts(struct kvm_vcpu *vcpu)
 {
unsigned long *pending = vcpu->arch.pending_exceptions;
+   unsigned long old_pending = vcpu->arch.pending_exceptions;
unsigned int priority;
 
 #ifdef EXIT_DEBUG
@@ -356,6 +357,12 @@ void kvmppc_core_deliver_interrupts(struct kvm_vcpu *vcpu)
 BITS_PER_BYTE * sizeof(*pending),
 priority + 1);
}
+
+   /* Tell the guest about our interrupt status */
+   if (*pending)
+   vcpu->arch.shared->int_pending = 1;
+   else if (old_pending)
+   vcpu->arch.shared->int_pending = 0;
 }
 
 void kvmppc_set_pvr(struct kvm_vcpu *vcpu, u32 pvr)
diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index 104d0ee..c604277 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -224,6 +224,7 @@ static int kvmppc_booke_irqprio_deliver(struct kvm_vcpu 
*vcpu,
 void kvmppc_core_deliver_interrupts(struct kvm_vcpu *vcpu)
 {
unsigned long *pending = vcpu->arch.pending_exceptions;
+   unsigned long old_pending = vcpu->arch.pending_exceptions;
unsigned int priority;
 
priority = __ffs(*pending);
@@ -235,6 +236,12 @@ void kvmppc_core_deliver_interrupts(struct kvm_vcpu *vcpu)
 BITS_PER_BYTE * sizeof(*pending),
 priority + 1);
}
+
+   /* Tell the guest about our interrupt status */
+   if (*pending)
+   vcpu->arch.shared->int_pending = 1;
+   else if (old_pending)
+   vcpu->arch.shared->int_pending = 0;
 }
 
 /**
-- 
1.6.0.2

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


[PATCH 20/27] KVM: PPC: Introduce kvm_tmp framework

2010-07-29 Thread Alexander Graf
We will soon require more sophisticated methods to replace single instructions
with multiple instructions. We do that by branching to a memory region where we
write replacement code for the instruction to.

This region needs to be within 32 MB of the patched instruction though, because
that's the furthest we can jump with immediate branches.

So we keep 1MB of free space around in bss. After we're done initing we can just
tell the mm system that the unused pages are free, but until then we have enough
space to fit all our code in.

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/kernel/kvm.c |   42 --
 1 files changed, 40 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kernel/kvm.c b/arch/powerpc/kernel/kvm.c
index 3258922..926f93f 100644
--- a/arch/powerpc/kernel/kvm.c
+++ b/arch/powerpc/kernel/kvm.c
@@ -65,6 +65,8 @@
 #define KVM_INST_TLBSYNC   0x7c00046c
 
 static bool kvm_patching_worked = true;
+static char kvm_tmp[1024 * 1024];
+static int kvm_tmp_index;
 
 static inline void kvm_patch_ins(u32 *inst, u32 new_inst)
 {
@@ -105,6 +107,23 @@ static void kvm_patch_ins_nop(u32 *inst)
kvm_patch_ins(inst, KVM_INST_NOP);
 }
 
+static u32 *kvm_alloc(int len)
+{
+   u32 *p;
+
+   if ((kvm_tmp_index + len) > ARRAY_SIZE(kvm_tmp)) {
+   printk(KERN_ERR "KVM: No more space (%d + %d)\n",
+   kvm_tmp_index, len);
+   kvm_patching_worked = false;
+   return NULL;
+   }
+
+   p = (void*)&kvm_tmp[kvm_tmp_index];
+   kvm_tmp_index += len;
+
+   return p;
+}
+
 static void kvm_map_magic_page(void *data)
 {
kvm_hypercall2(KVM_HC_PPC_MAP_MAGIC_PAGE,
@@ -270,17 +289,36 @@ static int kvm_para_setup(void)
return 0;
 }
 
+static __init void kvm_free_tmp(void)
+{
+   unsigned long start, end;
+
+   start = (ulong)&kvm_tmp[kvm_tmp_index + (PAGE_SIZE - 1)] & PAGE_MASK;
+   end = (ulong)&kvm_tmp[ARRAY_SIZE(kvm_tmp)] & PAGE_MASK;
+
+   /* Free the tmp space we don't need */
+   for (; start  end; start += PAGE_SIZE) {
+   ClearPageReserved(virt_to_page(start));
+   init_page_count(virt_to_page(start));
+   free_page(start);
+   totalram_pages++;
+   }
+}
+
 static int __init kvm_guest_init(void)
 {
if (!kvm_para_available())
-   return 0;
+   goto free_tmp;
 
if (kvm_para_setup())
-   return 0;
+   goto free_tmp;
 
if (kvm_para_has_feature(KVM_FEATURE_MAGIC_PAGE))
kvm_use_magic_page();
 
+free_tmp:
+   kvm_free_tmp();
+
return 0;
 }
 
-- 
1.6.0.2

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


[PATCH 09/27] KVM: PPC: Add PV guest scratch registers

2010-07-29 Thread Alexander Graf
While running in hooked code we need to stash register contents away, because
we must not clobber any registers.

So let's add some fields to the shared page we can just happily write to.

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/include/asm/kvm_para.h |3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_para.h 
b/arch/powerpc/include/asm/kvm_para.h
index 4577e7b..5be00c9 100644
--- a/arch/powerpc/include/asm/kvm_para.h
+++ b/arch/powerpc/include/asm/kvm_para.h
@@ -24,6 +24,9 @@
 #include linux/of.h
 
 struct kvm_vcpu_arch_shared {
+   __u64 scratch1;
+   __u64 scratch2;
+   __u64 scratch3;
__u64 critical; /* Guest may not get interrupts if == r1 */
__u64 sprg0;
__u64 sprg1;
-- 
1.6.0.2

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


[PATCH 11/27] KVM: PPC: Make PAM a define

2010-07-29 Thread Alexander Graf
On PowerPC it's very normal to not support all of the physical RAM in real mode.
To check if we're matching on the shared page or not, we need to know the limits
so we can restrain ourselves to that range.

So let's make it a define instead of open-coding it. And while at it, let's also
increase it.
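
For illustration (my sketch, not part of the patch), the real-mode translation
path below then simply clips the effective address instead of walking any page
table:

	/* real mode: no page table walk, just mask the effective address */
	pte->raddr = eaddr & KVM_PAM;	/* drops the top 4 address bits */
	pte->vpage = VSID_REAL | eaddr >> 12;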

Signed-off-by: Alexander Graf ag...@suse.de

v2 - v3:

  - RMO - PAM (non-magic page)
---
 arch/powerpc/include/asm/kvm_host.h |3 +++
 arch/powerpc/kvm/book3s.c   |4 ++--
 2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_host.h 
b/arch/powerpc/include/asm/kvm_host.h
index 221cf85..1674da8 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -48,6 +48,9 @@
 #define HPTEG_HASH_NUM_VPTE(1  HPTEG_HASH_BITS_VPTE)
 #define HPTEG_HASH_NUM_VPTE_LONG   (1  HPTEG_HASH_BITS_VPTE_LONG)
 
+/* Physical Address Mask - allowed range of real mode RAM access */
+#define KVM_PAM			0x0fffffffffffffffULL
+
 struct kvm;
 struct kvm_run;
 struct kvm_vcpu;
diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c
index 06229fe..0ed5376 100644
--- a/arch/powerpc/kvm/book3s.c
+++ b/arch/powerpc/kvm/book3s.c
@@ -465,7 +465,7 @@ static int kvmppc_xlate(struct kvm_vcpu *vcpu, ulong eaddr, 
bool data,
r = vcpu-arch.mmu.xlate(vcpu, eaddr, pte, data);
} else {
pte-eaddr = eaddr;
-   pte-raddr = eaddr  0x;
+   pte-raddr = eaddr  KVM_PAM;
pte-vpage = VSID_REAL | eaddr  12;
pte-may_read = true;
pte-may_write = true;
@@ -579,7 +579,7 @@ int kvmppc_handle_pagefault(struct kvm_run *run, struct 
kvm_vcpu *vcpu,
pte.may_execute = true;
pte.may_read = true;
pte.may_write = true;
-   pte.raddr = eaddr  0x;
+   pte.raddr = eaddr  KVM_PAM;
pte.eaddr = eaddr;
pte.vpage = eaddr  12;
}
-- 
1.6.0.2

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


[PATCH 15/27] KVM: Move kvm_guest_init out of generic code

2010-07-29 Thread Alexander Graf
Currently x86 is the only architecture that uses kvm_guest_init(). With
PowerPC we're getting a second user, but the signature is different there
and we don't need to export it, as it uses the normal kernel init framework.

So let's move the x86 specific definition of that function over to the x86
specific header file.

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/x86/include/asm/kvm_para.h |6 ++
 include/linux/kvm_para.h|5 -
 2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/arch/x86/include/asm/kvm_para.h b/arch/x86/include/asm/kvm_para.h
index 05eba5e..7b562b6 100644
--- a/arch/x86/include/asm/kvm_para.h
+++ b/arch/x86/include/asm/kvm_para.h
@@ -158,6 +158,12 @@ static inline unsigned int kvm_arch_para_features(void)
return cpuid_eax(KVM_CPUID_FEATURES);
 }
 
+#ifdef CONFIG_KVM_GUEST
+void __init kvm_guest_init(void);
+#else
+#define kvm_guest_init() do { } while (0)
 #endif
 
+#endif /* __KERNEL__ */
+
 #endif /* _ASM_X86_KVM_PARA_H */
diff --git a/include/linux/kvm_para.h b/include/linux/kvm_para.h
index ac2015a..47a070b 100644
--- a/include/linux/kvm_para.h
+++ b/include/linux/kvm_para.h
@@ -26,11 +26,6 @@
 #include asm/kvm_para.h
 
 #ifdef __KERNEL__
-#ifdef CONFIG_KVM_GUEST
-void __init kvm_guest_init(void);
-#else
-#define kvm_guest_init() do { } while (0)
-#endif
 
 static inline int kvm_para_has_feature(unsigned int feature)
 {
-- 
1.6.0.2

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


[PATCH 04/27] KVM: PPC: Convert DAR to shared page.

2010-07-29 Thread Alexander Graf
The DAR register contains the address a data page fault occurred at. This
register behaves pretty much like a simple data storage register that gets
written to on data faults. There is no hypervisor interaction required on
read or write.

This patch converts all users of the current field to the shared page.

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/include/asm/kvm_host.h  |1 -
 arch/powerpc/include/asm/kvm_para.h  |1 +
 arch/powerpc/kvm/book3s.c|   14 +++---
 arch/powerpc/kvm/book3s_emulate.c|6 +++---
 arch/powerpc/kvm/book3s_paired_singles.c |2 +-
 arch/powerpc/kvm/booke.c |2 +-
 arch/powerpc/kvm/booke_emulate.c |4 ++--
 7 files changed, 15 insertions(+), 15 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_host.h 
b/arch/powerpc/include/asm/kvm_host.h
index ba20f90..c852408 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -231,7 +231,6 @@ struct kvm_vcpu_arch {
ulong csrr1;
ulong dsrr0;
ulong dsrr1;
-   ulong dear;
ulong esr;
u32 dec;
u32 decar;
diff --git a/arch/powerpc/include/asm/kvm_para.h 
b/arch/powerpc/include/asm/kvm_para.h
index 9f7565b..ec72a1c 100644
--- a/arch/powerpc/include/asm/kvm_para.h
+++ b/arch/powerpc/include/asm/kvm_para.h
@@ -23,6 +23,7 @@
 #include linux/types.h
 
 struct kvm_vcpu_arch_shared {
+   __u64 dar;
__u64 msr;
__u32 dsisr;
 };
diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c
index eb401b6..4d46f8b 100644
--- a/arch/powerpc/kvm/book3s.c
+++ b/arch/powerpc/kvm/book3s.c
@@ -594,14 +594,14 @@ int kvmppc_handle_pagefault(struct kvm_run *run, struct 
kvm_vcpu *vcpu,
 
if (page_found == -ENOENT) {
/* Page not found in guest PTE entries */
-   vcpu-arch.dear = kvmppc_get_fault_dar(vcpu);
+   vcpu-arch.shared-dar = kvmppc_get_fault_dar(vcpu);
vcpu-arch.shared-dsisr = to_svcpu(vcpu)-fault_dsisr;
vcpu-arch.shared-msr |=
(to_svcpu(vcpu)-shadow_srr1  0xf800ULL);
kvmppc_book3s_queue_irqprio(vcpu, vec);
} else if (page_found == -EPERM) {
/* Storage protection */
-   vcpu-arch.dear = kvmppc_get_fault_dar(vcpu);
+   vcpu-arch.shared-dar = kvmppc_get_fault_dar(vcpu);
vcpu-arch.shared-dsisr =
to_svcpu(vcpu)-fault_dsisr  ~DSISR_NOHPTE;
vcpu-arch.shared-dsisr |= DSISR_PROTFAULT;
@@ -610,7 +610,7 @@ int kvmppc_handle_pagefault(struct kvm_run *run, struct 
kvm_vcpu *vcpu,
kvmppc_book3s_queue_irqprio(vcpu, vec);
} else if (page_found == -EINVAL) {
/* Page not found in guest SLB */
-   vcpu-arch.dear = kvmppc_get_fault_dar(vcpu);
+   vcpu-arch.shared-dar = kvmppc_get_fault_dar(vcpu);
kvmppc_book3s_queue_irqprio(vcpu, vec + 0x80);
} else if (!is_mmio 
   kvmppc_visible_gfn(vcpu, pte.raddr  PAGE_SHIFT)) {
@@ -867,17 +867,17 @@ int kvmppc_handle_exit(struct kvm_run *run, struct 
kvm_vcpu *vcpu,
if (to_svcpu(vcpu)-fault_dsisr  DSISR_NOHPTE) {
r = kvmppc_handle_pagefault(run, vcpu, dar, exit_nr);
} else {
-   vcpu-arch.dear = dar;
+   vcpu-arch.shared-dar = dar;
vcpu-arch.shared-dsisr = to_svcpu(vcpu)-fault_dsisr;
kvmppc_book3s_queue_irqprio(vcpu, exit_nr);
-   kvmppc_mmu_pte_flush(vcpu, vcpu-arch.dear, ~0xFFFUL);
+   kvmppc_mmu_pte_flush(vcpu, dar, ~0xFFFUL);
r = RESUME_GUEST;
}
break;
}
case BOOK3S_INTERRUPT_DATA_SEGMENT:
if (kvmppc_mmu_map_segment(vcpu, kvmppc_get_fault_dar(vcpu))  
0) {
-   vcpu-arch.dear = kvmppc_get_fault_dar(vcpu);
+   vcpu-arch.shared-dar = kvmppc_get_fault_dar(vcpu);
kvmppc_book3s_queue_irqprio(vcpu,
BOOK3S_INTERRUPT_DATA_SEGMENT);
}
@@ -997,7 +997,7 @@ program_interrupt:
if (kvmppc_read_inst(vcpu) == EMULATE_DONE) {
vcpu-arch.shared-dsisr = kvmppc_alignment_dsisr(vcpu,
kvmppc_get_last_inst(vcpu));
-   vcpu-arch.dear = kvmppc_alignment_dar(vcpu,
+   vcpu-arch.shared-dar = kvmppc_alignment_dar(vcpu,
kvmppc_get_last_inst(vcpu));
kvmppc_book3s_queue_irqprio(vcpu, exit_nr);
}
diff --git a/arch/powerpc/kvm/book3s_emulate.c 
b/arch/powerpc/kvm/book3s_emulate.c
index 9982ff1..c147864 100644
--- a/arch/powerpc/kvm/book3s_emulate.c
+++ 

[PATCH 18/27] KVM: PPC: PV instructions to loads and stores

2010-07-29 Thread Alexander Graf
Some instructions can simply be replaced by load and store instructions to
or from the magic page.

This patch replaces frequently executed instructions that fall into the above category.
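
To make the rewrite concrete, here is one case from the patch below (the magic
page itself is introduced elsewhere in the series): a privileged mfsprg0 that
would normally trap becomes a plain load from the shared page at -4096.

	case KVM_INST_MFSPR_SPRG0:
		/* was: mfspr rX, SPRG0 (traps on every execution)
		 * now: ld rX, sprg0-offset(-4096), no trap at all */
		kvm_patch_ins_ld(inst, magic_var(sprg0), inst_rt);
		break;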

Signed-off-by: Alexander Graf ag...@suse.de

---

v1 - v2:

  - use kvm_patch_ins
---
 arch/powerpc/kernel/kvm.c |  109 +
 1 files changed, 109 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/kernel/kvm.c b/arch/powerpc/kernel/kvm.c
index e93366f..9ec572c 100644
--- a/arch/powerpc/kernel/kvm.c
+++ b/arch/powerpc/kernel/kvm.c
@@ -33,7 +33,34 @@
 #define KVM_MAGIC_PAGE (-4096L)
 #define magic_var(x) KVM_MAGIC_PAGE + offsetof(struct kvm_vcpu_arch_shared, x)
 
+#define KVM_INST_LWZ		0x80000000
+#define KVM_INST_STW		0x90000000
+#define KVM_INST_LD		0xe8000000
+#define KVM_INST_STD		0xf8000000
+#define KVM_INST_NOP		0x60000000
+#define KVM_INST_B		0x48000000
+#define KVM_INST_B_MASK		0x03ffffff
+#define KVM_INST_B_MAX		0x01ffffff
+
 #define KVM_MASK_RT		0x03e00000
+#define KVM_INST_MFMSR		0x7c0000a6
+#define KVM_INST_MFSPR_SPRG0	0x7c1042a6
+#define KVM_INST_MFSPR_SPRG1	0x7c1142a6
+#define KVM_INST_MFSPR_SPRG2	0x7c1242a6
+#define KVM_INST_MFSPR_SPRG3	0x7c1342a6
+#define KVM_INST_MFSPR_SRR0	0x7c1a02a6
+#define KVM_INST_MFSPR_SRR1	0x7c1b02a6
+#define KVM_INST_MFSPR_DAR	0x7c1302a6
+#define KVM_INST_MFSPR_DSISR	0x7c1202a6
+
+#define KVM_INST_MTSPR_SPRG0	0x7c1043a6
+#define KVM_INST_MTSPR_SPRG1	0x7c1143a6
+#define KVM_INST_MTSPR_SPRG2	0x7c1243a6
+#define KVM_INST_MTSPR_SPRG3	0x7c1343a6
+#define KVM_INST_MTSPR_SRR0	0x7c1a03a6
+#define KVM_INST_MTSPR_SRR1	0x7c1b03a6
+#define KVM_INST_MTSPR_DAR	0x7c1303a6
+#define KVM_INST_MTSPR_DSISR	0x7c1203a6
 
 static bool kvm_patching_worked = true;
 
@@ -43,6 +70,34 @@ static inline void kvm_patch_ins(u32 *inst, u32 new_inst)
flush_icache_range((ulong)inst, (ulong)inst + 4);
 }
 
+static void kvm_patch_ins_ld(u32 *inst, long addr, u32 rt)
+{
+#ifdef CONFIG_64BIT
+	kvm_patch_ins(inst, KVM_INST_LD | rt | (addr & 0x0000fffc));
+#else
+	kvm_patch_ins(inst, KVM_INST_LWZ | rt | ((addr + 4) & 0x0000fffc));
+#endif
+}
+
+static void kvm_patch_ins_lwz(u32 *inst, long addr, u32 rt)
+{
+	kvm_patch_ins(inst, KVM_INST_LWZ | rt | (addr & 0x0000ffff));
+}
+
+static void kvm_patch_ins_std(u32 *inst, long addr, u32 rt)
+{
+#ifdef CONFIG_64BIT
+	kvm_patch_ins(inst, KVM_INST_STD | rt | (addr & 0x0000fffc));
+#else
+	kvm_patch_ins(inst, KVM_INST_STW | rt | ((addr + 4) & 0x0000fffc));
+#endif
+}
+
+static void kvm_patch_ins_stw(u32 *inst, long addr, u32 rt)
+{
+	kvm_patch_ins(inst, KVM_INST_STW | rt | (addr & 0x0000fffc));
+}
+
 static void kvm_map_magic_page(void *data)
 {
kvm_hypercall2(KVM_HC_PPC_MAP_MAGIC_PAGE,
@@ -57,6 +112,60 @@ static void kvm_check_ins(u32 *inst)
u32 inst_rt = _inst  KVM_MASK_RT;
 
switch (inst_no_rt) {
+   /* Loads */
+   case KVM_INST_MFMSR:
+   kvm_patch_ins_ld(inst, magic_var(msr), inst_rt);
+   break;
+   case KVM_INST_MFSPR_SPRG0:
+   kvm_patch_ins_ld(inst, magic_var(sprg0), inst_rt);
+   break;
+   case KVM_INST_MFSPR_SPRG1:
+   kvm_patch_ins_ld(inst, magic_var(sprg1), inst_rt);
+   break;
+   case KVM_INST_MFSPR_SPRG2:
+   kvm_patch_ins_ld(inst, magic_var(sprg2), inst_rt);
+   break;
+   case KVM_INST_MFSPR_SPRG3:
+   kvm_patch_ins_ld(inst, magic_var(sprg3), inst_rt);
+   break;
+   case KVM_INST_MFSPR_SRR0:
+   kvm_patch_ins_ld(inst, magic_var(srr0), inst_rt);
+   break;
+   case KVM_INST_MFSPR_SRR1:
+   kvm_patch_ins_ld(inst, magic_var(srr1), inst_rt);
+   break;
+   case KVM_INST_MFSPR_DAR:
+   kvm_patch_ins_ld(inst, magic_var(dar), inst_rt);
+   break;
+   case KVM_INST_MFSPR_DSISR:
+   kvm_patch_ins_lwz(inst, magic_var(dsisr), inst_rt);
+   break;
+
+   /* Stores */
+   case KVM_INST_MTSPR_SPRG0:
+   kvm_patch_ins_std(inst, magic_var(sprg0), inst_rt);
+   break;
+   case KVM_INST_MTSPR_SPRG1:
+   kvm_patch_ins_std(inst, magic_var(sprg1), inst_rt);
+   break;
+   case KVM_INST_MTSPR_SPRG2:
+   kvm_patch_ins_std(inst, magic_var(sprg2), inst_rt);
+   break;
+   case KVM_INST_MTSPR_SPRG3:
+   kvm_patch_ins_std(inst, magic_var(sprg3), inst_rt);
+   break;
+   case KVM_INST_MTSPR_SRR0:
+   kvm_patch_ins_std(inst, magic_var(srr0), inst_rt);
+   break;
+   case KVM_INST_MTSPR_SRR1:
+   kvm_patch_ins_std(inst, magic_var(srr1), inst_rt);
+   break;
+   case KVM_INST_MTSPR_DAR:
+   kvm_patch_ins_std(inst, 

[PATCH 05/27] KVM: PPC: Convert SRR0 and SRR1 to shared page

2010-07-29 Thread Alexander Graf
The SRR0 and SRR1 registers contain cached values of the PC and MSR
respectively. They get written to by the hypervisor when an interrupt
occurs or directly by the kernel. They are also used to tell the rfi(d)
instruction where to jump to.

Because it only gets touched on well-defined events, it's very simple to
share with the guest. Hypervisor and guest both have full r/w access.

This patch converts all users of the current field to the shared page.

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/include/asm/kvm_host.h |2 --
 arch/powerpc/include/asm/kvm_para.h |2 ++
 arch/powerpc/kvm/book3s.c   |   12 ++--
 arch/powerpc/kvm/book3s_emulate.c   |4 ++--
 arch/powerpc/kvm/booke.c|   15 ---
 arch/powerpc/kvm/booke_emulate.c|4 ++--
 arch/powerpc/kvm/emulate.c  |   12 
 7 files changed, 28 insertions(+), 23 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_host.h 
b/arch/powerpc/include/asm/kvm_host.h
index c852408..5255d75 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -225,8 +225,6 @@ struct kvm_vcpu_arch {
ulong sprg5;
ulong sprg6;
ulong sprg7;
-   ulong srr0;
-   ulong srr1;
ulong csrr0;
ulong csrr1;
ulong dsrr0;
diff --git a/arch/powerpc/include/asm/kvm_para.h 
b/arch/powerpc/include/asm/kvm_para.h
index ec72a1c..d7fc6c2 100644
--- a/arch/powerpc/include/asm/kvm_para.h
+++ b/arch/powerpc/include/asm/kvm_para.h
@@ -23,6 +23,8 @@
 #include linux/types.h
 
 struct kvm_vcpu_arch_shared {
+   __u64 srr0;
+   __u64 srr1;
__u64 dar;
__u64 msr;
__u32 dsisr;
diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c
index 4d46f8b..afa0dd4 100644
--- a/arch/powerpc/kvm/book3s.c
+++ b/arch/powerpc/kvm/book3s.c
@@ -162,8 +162,8 @@ void kvmppc_set_msr(struct kvm_vcpu *vcpu, u64 msr)
 
 void kvmppc_inject_interrupt(struct kvm_vcpu *vcpu, int vec, u64 flags)
 {
-   vcpu-arch.srr0 = kvmppc_get_pc(vcpu);
-   vcpu-arch.srr1 = vcpu-arch.shared-msr | flags;
+   vcpu-arch.shared-srr0 = kvmppc_get_pc(vcpu);
+   vcpu-arch.shared-srr1 = vcpu-arch.shared-msr | flags;
kvmppc_set_pc(vcpu, to_book3s(vcpu)-hior + vec);
vcpu-arch.mmu.reset_msr(vcpu);
 }
@@ -1059,8 +1059,8 @@ int kvm_arch_vcpu_ioctl_get_regs(struct kvm_vcpu *vcpu, 
struct kvm_regs *regs)
regs-lr = kvmppc_get_lr(vcpu);
regs-xer = kvmppc_get_xer(vcpu);
regs-msr = vcpu-arch.shared-msr;
-   regs-srr0 = vcpu-arch.srr0;
-   regs-srr1 = vcpu-arch.srr1;
+   regs-srr0 = vcpu-arch.shared-srr0;
+   regs-srr1 = vcpu-arch.shared-srr1;
regs-pid = vcpu-arch.pid;
regs-sprg0 = vcpu-arch.sprg0;
regs-sprg1 = vcpu-arch.sprg1;
@@ -1086,8 +1086,8 @@ int kvm_arch_vcpu_ioctl_set_regs(struct kvm_vcpu *vcpu, 
struct kvm_regs *regs)
kvmppc_set_lr(vcpu, regs-lr);
kvmppc_set_xer(vcpu, regs-xer);
kvmppc_set_msr(vcpu, regs-msr);
-   vcpu-arch.srr0 = regs-srr0;
-   vcpu-arch.srr1 = regs-srr1;
+   vcpu-arch.shared-srr0 = regs-srr0;
+   vcpu-arch.shared-srr1 = regs-srr1;
vcpu-arch.sprg0 = regs-sprg0;
vcpu-arch.sprg1 = regs-sprg1;
vcpu-arch.sprg2 = regs-sprg2;
diff --git a/arch/powerpc/kvm/book3s_emulate.c 
b/arch/powerpc/kvm/book3s_emulate.c
index c147864..f333cb4 100644
--- a/arch/powerpc/kvm/book3s_emulate.c
+++ b/arch/powerpc/kvm/book3s_emulate.c
@@ -73,8 +73,8 @@ int kvmppc_core_emulate_op(struct kvm_run *run, struct 
kvm_vcpu *vcpu,
switch (get_xop(inst)) {
case OP_19_XOP_RFID:
case OP_19_XOP_RFI:
-   kvmppc_set_pc(vcpu, vcpu-arch.srr0);
-   kvmppc_set_msr(vcpu, vcpu-arch.srr1);
+   kvmppc_set_pc(vcpu, vcpu-arch.shared-srr0);
+   kvmppc_set_msr(vcpu, vcpu-arch.shared-srr1);
*advance = 0;
break;
 
diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index 4aab6d2..793df28 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -64,7 +64,8 @@ void kvmppc_dump_vcpu(struct kvm_vcpu *vcpu)
 
printk(pc:   %08lx msr:  %08llx\n, vcpu-arch.pc, 
vcpu-arch.shared-msr);
printk(lr:   %08lx ctr:  %08lx\n, vcpu-arch.lr, vcpu-arch.ctr);
-   printk(srr0: %08lx srr1: %08lx\n, vcpu-arch.srr0, vcpu-arch.srr1);
+   printk(srr0: %08llx srr1: %08llx\n, vcpu-arch.shared-srr0,
+   vcpu-arch.shared-srr1);
 
printk(exceptions: %08lx\n, vcpu-arch.pending_exceptions);
 
@@ -189,8 +190,8 @@ static int kvmppc_booke_irqprio_deliver(struct kvm_vcpu 
*vcpu,
}
 
if (allowed) {
-   vcpu-arch.srr0 = vcpu-arch.pc;
-   vcpu-arch.srr1 = vcpu-arch.shared-msr;
+   vcpu-arch.shared-srr0 = vcpu-arch.pc;
+   

[PATCH 07/27] KVM: PPC: Implement hypervisor interface

2010-07-29 Thread Alexander Graf
To communicate with KVM directly we need to plumb some sort of interface
between the guest and KVM. Usually those interfaces use hypercalls.

This hypercall implementation is described in the last patch of the series
in a special documentation file. Please read that for further information.

This patch implements stubs to handle KVM PPC hypercalls on the host and
guest side alike.
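
As a usage sketch (not part of this patch): once the wrappers below exist, a
guest can issue e.g. a two-argument hypercall such as the magic page mapping
call added later in the series. The HC_VENDOR_KVM OR-ing is handled inside the
wrappers, so callers only pass the generic hypercall number and parameters.

	if (kvm_para_available())
		kvm_hypercall2(KVM_HC_PPC_MAP_MAGIC_PAGE,
			       KVM_MAGIC_PAGE,
			       KVM_MAGIC_PAGE);	/* effective and physical address */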

Signed-off-by: Alexander Graf ag...@suse.de

---

v1 - v2:

  - change hypervisor calls to use new register values

v2 - v3:

  - move PV interface to ePAPR
  - only check R0 on hypercall
  - remove PVR hack
  - align hypercalls to in/out of ePAPR
  - add kvm.c with hypercall function
---
 arch/powerpc/include/asm/kvm_para.h |  114 ++-
 arch/powerpc/include/asm/kvm_ppc.h  |1 +
 arch/powerpc/kernel/Makefile|2 +
 arch/powerpc/kernel/kvm.c   |   68 +
 arch/powerpc/kvm/book3s.c   |9 ++-
 arch/powerpc/kvm/booke.c|   10 +++-
 arch/powerpc/kvm/powerpc.c  |   32 ++
 include/linux/kvm_para.h|1 +
 8 files changed, 233 insertions(+), 4 deletions(-)
 create mode 100644 arch/powerpc/kernel/kvm.c

diff --git a/arch/powerpc/include/asm/kvm_para.h 
b/arch/powerpc/include/asm/kvm_para.h
index e402999..556fd59 100644
--- a/arch/powerpc/include/asm/kvm_para.h
+++ b/arch/powerpc/include/asm/kvm_para.h
@@ -21,6 +21,7 @@
 #define __POWERPC_KVM_PARA_H__
 
 #include linux/types.h
+#include linux/of.h
 
 struct kvm_vcpu_arch_shared {
__u64 sprg0;
@@ -34,16 +35,127 @@ struct kvm_vcpu_arch_shared {
__u32 dsisr;
 };
 
+#define KVM_SC_MAGIC_R0		0x4b564d21 /* "KVM!" */
+#define HC_VENDOR_KVM		(42 << 16)
+#define HC_EV_SUCCESS		0
+#define HC_EV_UNIMPLEMENTED	12
+
 #ifdef __KERNEL__
 
+#ifdef CONFIG_KVM_GUEST
+
+static inline int kvm_para_available(void)
+{
+   struct device_node *hyper_node;
+
+   hyper_node = of_find_node_by_path(/hypervisor);
+   if (!hyper_node)
+   return 0;
+
+   if (!of_device_is_compatible(hyper_node, linux,kvm))
+   return 0;
+
+   return 1;
+}
+
+extern unsigned long kvm_hypercall(unsigned long *in,
+  unsigned long *out,
+  unsigned long nr);
+
+#else
+
 static inline int kvm_para_available(void)
 {
return 0;
 }
 
+static unsigned long kvm_hypercall(unsigned long *in,
+  unsigned long *out,
+  unsigned long nr)
+{
+   return HC_EV_UNIMPLEMENTED;
+}
+
+#endif
+
+static inline long kvm_hypercall0_1(unsigned int nr, unsigned long *r2)
+{
+   unsigned long in[8];
+   unsigned long out[8];
+   unsigned long r;
+
+   r = kvm_hypercall(in, out, nr | HC_VENDOR_KVM);
+   *r2 = out[0];
+
+   return r;
+}
+
+static inline long kvm_hypercall0(unsigned int nr)
+{
+   unsigned long in[8];
+   unsigned long out[8];
+
+   return kvm_hypercall(in, out, nr | HC_VENDOR_KVM);
+}
+
+static inline long kvm_hypercall1(unsigned int nr, unsigned long p1)
+{
+   unsigned long in[8];
+   unsigned long out[8];
+
+   in[0] = p1;
+   return kvm_hypercall(in, out, nr | HC_VENDOR_KVM);
+}
+
+static inline long kvm_hypercall2(unsigned int nr, unsigned long p1,
+ unsigned long p2)
+{
+   unsigned long in[8];
+   unsigned long out[8];
+
+   in[0] = p1;
+   in[1] = p2;
+   return kvm_hypercall(in, out, nr | HC_VENDOR_KVM);
+}
+
+static inline long kvm_hypercall3(unsigned int nr, unsigned long p1,
+ unsigned long p2, unsigned long p3)
+{
+   unsigned long in[8];
+   unsigned long out[8];
+
+   in[0] = p1;
+   in[1] = p2;
+   in[2] = p3;
+   return kvm_hypercall(in, out, nr | HC_VENDOR_KVM);
+}
+
+static inline long kvm_hypercall4(unsigned int nr, unsigned long p1,
+ unsigned long p2, unsigned long p3,
+ unsigned long p4)
+{
+   unsigned long in[8];
+   unsigned long out[8];
+
+   in[0] = p1;
+   in[1] = p2;
+   in[2] = p3;
+   in[3] = p4;
+   return kvm_hypercall(in, out, nr | HC_VENDOR_KVM);
+}
+
+
 static inline unsigned int kvm_arch_para_features(void)
 {
-   return 0;
+   unsigned long r;
+
+   if (!kvm_para_available())
+   return 0;
+
+   if(kvm_hypercall0_1(KVM_HC_FEATURES, &r))
+   return 0;
+
+   return r;
 }
 
 #endif /* __KERNEL__ */
diff --git a/arch/powerpc/include/asm/kvm_ppc.h 
b/arch/powerpc/include/asm/kvm_ppc.h
index 18d139e..ecb3bc7 100644
--- a/arch/powerpc/include/asm/kvm_ppc.h
+++ b/arch/powerpc/include/asm/kvm_ppc.h
@@ -107,6 +107,7 @@ extern int kvmppc_booke_init(void);
 extern void kvmppc_booke_exit(void);
 
 extern void kvmppc_core_destroy_mmu(struct kvm_vcpu *vcpu);
+extern int kvmppc_kvm_pv(struct 

[PATCH 03/27] KVM: PPC: Convert DSISR to shared page

2010-07-29 Thread Alexander Graf
The DSISR register contains information about a data page fault. It is fully
read/write from inside the guest context and we don't need to take any special
action when the guest writes to it.

This patch converts all users of the current field to the shared page.

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/include/asm/kvm_book3s.h|1 -
 arch/powerpc/include/asm/kvm_para.h  |1 +
 arch/powerpc/kvm/book3s.c|   11 ++-
 arch/powerpc/kvm/book3s_emulate.c|6 +++---
 arch/powerpc/kvm/book3s_paired_singles.c |2 +-
 5 files changed, 11 insertions(+), 10 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_book3s.h 
b/arch/powerpc/include/asm/kvm_book3s.h
index 8274a2d..b5b1961 100644
--- a/arch/powerpc/include/asm/kvm_book3s.h
+++ b/arch/powerpc/include/asm/kvm_book3s.h
@@ -85,7 +85,6 @@ struct kvmppc_vcpu_book3s {
u64 hid[6];
u64 gqr[8];
int slb_nr;
-   u32 dsisr;
u64 sdr1;
u64 hior;
u64 msr_mask;
diff --git a/arch/powerpc/include/asm/kvm_para.h 
b/arch/powerpc/include/asm/kvm_para.h
index a17dc52..9f7565b 100644
--- a/arch/powerpc/include/asm/kvm_para.h
+++ b/arch/powerpc/include/asm/kvm_para.h
@@ -24,6 +24,7 @@
 
 struct kvm_vcpu_arch_shared {
__u64 msr;
+   __u32 dsisr;
 };
 
 #ifdef __KERNEL__
diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c
index 2efe692..eb401b6 100644
--- a/arch/powerpc/kvm/book3s.c
+++ b/arch/powerpc/kvm/book3s.c
@@ -595,15 +595,16 @@ int kvmppc_handle_pagefault(struct kvm_run *run, struct 
kvm_vcpu *vcpu,
if (page_found == -ENOENT) {
/* Page not found in guest PTE entries */
vcpu-arch.dear = kvmppc_get_fault_dar(vcpu);
-   to_book3s(vcpu)-dsisr = to_svcpu(vcpu)-fault_dsisr;
+   vcpu-arch.shared-dsisr = to_svcpu(vcpu)-fault_dsisr;
vcpu-arch.shared-msr |=
(to_svcpu(vcpu)-shadow_srr1  0xf800ULL);
kvmppc_book3s_queue_irqprio(vcpu, vec);
} else if (page_found == -EPERM) {
/* Storage protection */
vcpu-arch.dear = kvmppc_get_fault_dar(vcpu);
-   to_book3s(vcpu)-dsisr = to_svcpu(vcpu)-fault_dsisr  
~DSISR_NOHPTE;
-   to_book3s(vcpu)-dsisr |= DSISR_PROTFAULT;
+   vcpu-arch.shared-dsisr =
+   to_svcpu(vcpu)-fault_dsisr  ~DSISR_NOHPTE;
+   vcpu-arch.shared-dsisr |= DSISR_PROTFAULT;
vcpu-arch.shared-msr |=
(to_svcpu(vcpu)-shadow_srr1  0xf800ULL);
kvmppc_book3s_queue_irqprio(vcpu, vec);
@@ -867,7 +868,7 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu 
*vcpu,
r = kvmppc_handle_pagefault(run, vcpu, dar, exit_nr);
} else {
vcpu-arch.dear = dar;
-   to_book3s(vcpu)-dsisr = to_svcpu(vcpu)-fault_dsisr;
+   vcpu-arch.shared-dsisr = to_svcpu(vcpu)-fault_dsisr;
kvmppc_book3s_queue_irqprio(vcpu, exit_nr);
kvmppc_mmu_pte_flush(vcpu, vcpu-arch.dear, ~0xFFFUL);
r = RESUME_GUEST;
@@ -994,7 +995,7 @@ program_interrupt:
}
case BOOK3S_INTERRUPT_ALIGNMENT:
if (kvmppc_read_inst(vcpu) == EMULATE_DONE) {
-   to_book3s(vcpu)-dsisr = kvmppc_alignment_dsisr(vcpu,
+   vcpu-arch.shared-dsisr = kvmppc_alignment_dsisr(vcpu,
kvmppc_get_last_inst(vcpu));
vcpu-arch.dear = kvmppc_alignment_dar(vcpu,
kvmppc_get_last_inst(vcpu));
diff --git a/arch/powerpc/kvm/book3s_emulate.c 
b/arch/powerpc/kvm/book3s_emulate.c
index 35d3c16..9982ff1 100644
--- a/arch/powerpc/kvm/book3s_emulate.c
+++ b/arch/powerpc/kvm/book3s_emulate.c
@@ -221,7 +221,7 @@ int kvmppc_core_emulate_op(struct kvm_run *run, struct 
kvm_vcpu *vcpu,
else if (r == -EPERM)
dsisr |= DSISR_PROTFAULT;
 
-   to_book3s(vcpu)-dsisr = dsisr;
+   vcpu-arch.shared-dsisr = dsisr;
to_svcpu(vcpu)-fault_dsisr = dsisr;
 
kvmppc_book3s_queue_irqprio(vcpu,
@@ -327,7 +327,7 @@ int kvmppc_core_emulate_mtspr(struct kvm_vcpu *vcpu, int 
sprn, int rs)
to_book3s(vcpu)-sdr1 = spr_val;
break;
case SPRN_DSISR:
-   to_book3s(vcpu)-dsisr = spr_val;
+   vcpu-arch.shared-dsisr = spr_val;
break;
case SPRN_DAR:
vcpu-arch.dear = spr_val;
@@ -440,7 +440,7 @@ int kvmppc_core_emulate_mfspr(struct kvm_vcpu *vcpu, int 
sprn, int rt)
kvmppc_set_gpr(vcpu, rt, to_book3s(vcpu)-sdr1);

[PATCH 21/27] KVM: PPC: Introduce branch patching helper

2010-07-29 Thread Alexander Graf
We will need to patch several instruction streams over to a different
code path, so we need a way to patch a single instruction with a branch
somewhere else.

This patch adds a helper to facilitate this patching.
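
For reference, a sketch of the branch encoding the helper relies on (my
illustration, not part of the patch; "target" and "offset" are placeholder
names): the relative-branch opcode carries a signed 26-bit displacement, which
is where the 32 MB limit mentioned in the kvm_tmp patch comes from.

	/* build a "b target" to be placed at inst; must stay within +/- 32 MB */
	long offset = (ulong)target - (ulong)inst;
	u32 branch  = KVM_INST_B | (offset & KVM_INST_B_MASK);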

Signed-off-by: Alexander Graf ag...@suse.de

---

v2 - v3:

  - add safety check for relocatable kernels
---
 arch/powerpc/kernel/kvm.c |   14 ++
 1 files changed, 14 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/kernel/kvm.c b/arch/powerpc/kernel/kvm.c
index 926f93f..239a70d 100644
--- a/arch/powerpc/kernel/kvm.c
+++ b/arch/powerpc/kernel/kvm.c
@@ -107,6 +107,20 @@ static void kvm_patch_ins_nop(u32 *inst)
kvm_patch_ins(inst, KVM_INST_NOP);
 }
 
+static void kvm_patch_ins_b(u32 *inst, int addr)
+{
+#ifdef CONFIG_RELOCATABLE
+   /* On relocatable kernels interrupts handlers and our code
+  can be in different regions, so we don't patch them */
+
+   extern u32 __end_interrupts;
+   if ((ulong)inst  (ulong)__end_interrupts)
+   return;
+#endif
+
+   kvm_patch_ins(inst, KVM_INST_B | (addr  KVM_INST_B_MASK));
+}
+
 static u32 *kvm_alloc(int len)
 {
u32 *p;
-- 
1.6.0.2

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


[PATCH 24/27] KVM: PPC: PV mtmsrd L=0 and mtmsr

2010-07-29 Thread Alexander Graf
There is also a form of mtmsr where all bits need to be addressed. While the
PPC64 Linux kernel behaves reasonably well here, on PPC32 there is no L=1 form,
so it uses mtmsr even for simple things like only changing EE.

So we need to hook into that one as well and check for a mask of bits that we
deem safe to change from within guest context.
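
In C terms the fast path implemented by the assembler below is roughly the
following sketch (not from the patch; do_real_mtmsr() is a placeholder for
trapping into the hypervisor and "shared" stands for the magic page):

	ulong changed = shared->msr ^ new_msr;

	if (changed & MSR_CRITICAL_BITS)
		do_real_mtmsr(new_msr);		/* critical bits changed: trap as usual */
	else if ((new_msr & MSR_EE) && shared->int_pending)
		do_real_mtmsr(new_msr);		/* EE re-enabled with an interrupt queued */
	else
		shared->msr = new_msr;		/* stay in the guest, no trap */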

Signed-off-by: Alexander Graf ag...@suse.de

---

v1 - v2:

  - use kvm_patch_ins_b
---
 arch/powerpc/kernel/kvm.c  |   51 
 arch/powerpc/kernel/kvm_emul.S |   84 
 2 files changed, 135 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/kernel/kvm.c b/arch/powerpc/kernel/kvm.c
index 717ab0d..8ac57e2 100644
--- a/arch/powerpc/kernel/kvm.c
+++ b/arch/powerpc/kernel/kvm.c
@@ -63,7 +63,9 @@
 #define KVM_INST_MTSPR_DSISR   0x7c1203a6
 
 #define KVM_INST_TLBSYNC   0x7c00046c
+#define KVM_INST_MTMSRD_L0 0x7c000164
 #define KVM_INST_MTMSRD_L1 0x7c010164
+#define KVM_INST_MTMSR 0x7c000124
 
 static bool kvm_patching_worked = true;
 static char kvm_tmp[1024 * 1024];
@@ -176,6 +178,49 @@ static void kvm_patch_ins_mtmsrd(u32 *inst, u32 rt)
kvm_patch_ins_b(inst, distance_start);
 }
 
+extern u32 kvm_emulate_mtmsr_branch_offs;
+extern u32 kvm_emulate_mtmsr_reg1_offs;
+extern u32 kvm_emulate_mtmsr_reg2_offs;
+extern u32 kvm_emulate_mtmsr_reg3_offs;
+extern u32 kvm_emulate_mtmsr_orig_ins_offs;
+extern u32 kvm_emulate_mtmsr_len;
+extern u32 kvm_emulate_mtmsr[];
+
+static void kvm_patch_ins_mtmsr(u32 *inst, u32 rt)
+{
+   u32 *p;
+   int distance_start;
+   int distance_end;
+   ulong next_inst;
+
+   p = kvm_alloc(kvm_emulate_mtmsr_len * 4);
+   if (!p)
+   return;
+
+   /* Find out where we are and put everything there */
+   distance_start = (ulong)p - (ulong)inst;
+   next_inst = ((ulong)inst + 4);
+   distance_end = next_inst - (ulong)p[kvm_emulate_mtmsr_branch_offs];
+
+   /* Make sure we only write valid b instructions */
+   if (distance_start  KVM_INST_B_MAX) {
+   kvm_patching_worked = false;
+   return;
+   }
+
+   /* Modify the chunk to fit the invocation */
+   memcpy(p, kvm_emulate_mtmsr, kvm_emulate_mtmsr_len * 4);
+   p[kvm_emulate_mtmsr_branch_offs] |= distance_end  KVM_INST_B_MASK;
+   p[kvm_emulate_mtmsr_reg1_offs] |= rt;
+   p[kvm_emulate_mtmsr_reg2_offs] |= rt;
+   p[kvm_emulate_mtmsr_reg3_offs] |= rt;
+   p[kvm_emulate_mtmsr_orig_ins_offs] = *inst;
+   flush_icache_range((ulong)p, (ulong)p + kvm_emulate_mtmsr_len * 4);
+
+   /* Patch the invocation */
+   kvm_patch_ins_b(inst, distance_start);
+}
+
 static void kvm_map_magic_page(void *data)
 {
kvm_hypercall2(KVM_HC_PPC_MAP_MAGIC_PAGE,
@@ -256,6 +301,12 @@ static void kvm_check_ins(u32 *inst)
if (get_rt(inst_rt)  30)
kvm_patch_ins_mtmsrd(inst, inst_rt);
break;
+   case KVM_INST_MTMSR:
+   case KVM_INST_MTMSRD_L0:
+   /* We use r30 and r31 during the hook */
+   if (get_rt(inst_rt)  30)
+   kvm_patch_ins_mtmsr(inst, inst_rt);
+   break;
}
 
switch (_inst) {
diff --git a/arch/powerpc/kernel/kvm_emul.S b/arch/powerpc/kernel/kvm_emul.S
index 10dc4a6..8cd22f4 100644
--- a/arch/powerpc/kernel/kvm_emul.S
+++ b/arch/powerpc/kernel/kvm_emul.S
@@ -120,3 +120,87 @@ kvm_emulate_mtmsrd_reg_offs:
 .global kvm_emulate_mtmsrd_len
 kvm_emulate_mtmsrd_len:
.long (kvm_emulate_mtmsrd_end - kvm_emulate_mtmsrd) / 4
+
+
+#define MSR_SAFE_BITS (MSR_EE | MSR_CE | MSR_ME | MSR_RI)
+#define MSR_CRITICAL_BITS ~MSR_SAFE_BITS
+
+.global kvm_emulate_mtmsr
+kvm_emulate_mtmsr:
+
+   SCRATCH_SAVE
+
+   /* Fetch old MSR in r31 */
+   LL64(r31, KVM_MAGIC_PAGE + KVM_MAGIC_MSR, 0)
+
+   /* Find the changed bits between old and new MSR */
+kvm_emulate_mtmsr_reg1:
+   xor r31, r0, r31
+
+   /* Check if we need to really do mtmsr */
+   LOAD_REG_IMMEDIATE(r30, MSR_CRITICAL_BITS)
+   and.r31, r31, r30
+
+   /* No critical bits changed? Maybe we can stay in the guest. */
+   beq maybe_stay_in_guest
+
+do_mtmsr:
+
+   SCRATCH_RESTORE
+
+   /* Just fire off the mtmsr if it's critical */
+kvm_emulate_mtmsr_orig_ins:
+   mtmsr   r0
+
+   b   kvm_emulate_mtmsr_branch
+
+maybe_stay_in_guest:
+
+   /* Check if we have to fetch an interrupt */
+   lwz r31, (KVM_MAGIC_PAGE + KVM_MAGIC_INT)(0)
+   cmpwi   r31, 0
+   beq+no_mtmsr
+
+   /* Check if we may trigger an interrupt */
+kvm_emulate_mtmsr_reg2:
+   andi.   r31, r0, MSR_EE
+   beq no_mtmsr
+
+   b   do_mtmsr
+
+no_mtmsr:
+
+   /* Put MSR into magic page because we don't call mtmsr */
+kvm_emulate_mtmsr_reg3:
+   STL64(r0, KVM_MAGIC_PAGE + KVM_MAGIC_MSR, 0)
+
+   SCRATCH_RESTORE
+
+   /* Go back to 

[PATCH 27/27] KVM: PPC: Add get_pvinfo interface to query hypercall instructions

2010-07-29 Thread Alexander Graf
We need to tell the guest the opcodes that make up a hypercall through
interfaces that are controlled by userspace. So we need to add a call
for userspace to allow it to query those opcodes so it can pass them
on.

This is required because the hypercall opcodes can change based on
the hypervisor conditions. If we're running in hardware accelerated
hypervisor mode, a hypercall looks different from when we're running
without hardware acceleration.
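
A sketch of how userspace might consume the new ioctl (not part of the patch;
vmfd and the device-tree plumbing are placeholders), feeding the returned
opcodes into the guest as described in the documentation patch:

	struct kvm_ppc_pvinfo pvinfo;
	u32 hcall[4];

	if (ioctl(vmfd, KVM_PPC_GET_PVINFO, &pvinfo) == 0)
		/* the 4 instructions to expose as the guest's
		 * /hypervisor "hypercall-instructions" property */
		memcpy(hcall, pvinfo.hcall, sizeof(hcall));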

Signed-off-by: Alexander Graf ag...@suse.de
---
 Documentation/kvm/api.txt  |   23 +++
 arch/powerpc/kvm/powerpc.c |   38 ++
 include/linux/kvm.h|   11 +++
 3 files changed, 72 insertions(+), 0 deletions(-)

diff --git a/Documentation/kvm/api.txt b/Documentation/kvm/api.txt
index 5f5b649..44d9893 100644
--- a/Documentation/kvm/api.txt
+++ b/Documentation/kvm/api.txt
@@ -1032,6 +1032,29 @@ are defined as follows:
eax, ebx, ecx, edx: the values returned by the cpuid instruction for
  this function/index combination
 
+4.46 KVM_PPC_GET_PVINFO
+
+Capability: KVM_CAP_PPC_GET_PVINFO
+Architectures: ppc
+Type: vm ioctl
+Parameters: struct kvm_ppc_pvinfo (out)
+Returns: 0 on success, !0 on error
+
+struct kvm_ppc_pvinfo {
+   __u32 flags;
+   __u32 hcall[4];
+   __u8  pad[108];
+};
+
+This ioctl fetches PV specific information that need to be passed to the guest
+using the device tree or other means from vm context.
+
+For now the only implemented piece of information distributed here is an array
+of 4 instructions that make up a hypercall.
+
+If any additional field gets added to this structure later on, a bit for that
+additional piece of information will be set in the flags bitmap.
+
 5. The kvm_run structure
 
 Application code obtains a pointer to the kvm_run structure by
diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index fecfe04..6a53a3f 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -191,6 +191,7 @@ int kvm_dev_ioctl_check_extension(long ext)
case KVM_CAP_PPC_UNSET_IRQ:
case KVM_CAP_ENABLE_CAP:
case KVM_CAP_PPC_OSI:
+   case KVM_CAP_PPC_GET_PVINFO:
r = 1;
break;
case KVM_CAP_COALESCED_MMIO:
@@ -578,16 +579,53 @@ out:
return r;
 }
 
+static int kvm_vm_ioctl_get_pvinfo(struct kvm_ppc_pvinfo *pvinfo)
+{
+   u32 inst_lis = 0x3c000000;
+   u32 inst_ori = 0x60000000;
+   u32 inst_nop = 0x60000000;
+   u32 inst_sc = 0x44000002;
+   u32 inst_imm_mask = 0xffff;
+
+   /*
+* The hypercall to get into KVM from within guest context is as
+* follows:
+*
+*lis r0, r0, KVM_SC_MAGIC_R0@h
+*ori r0, KVM_SC_MAGIC_R0@l
+*sc
+*nop
+*/
+   pvinfo->hcall[0] = inst_lis | ((KVM_SC_MAGIC_R0 >> 16) & inst_imm_mask);
+   pvinfo->hcall[1] = inst_ori | (KVM_SC_MAGIC_R0 & inst_imm_mask);
+   pvinfo->hcall[2] = inst_sc;
+   pvinfo->hcall[3] = inst_nop;
+
+   return 0;
+}
+
 long kvm_arch_vm_ioctl(struct file *filp,
unsigned int ioctl, unsigned long arg)
 {
+   void __user *argp = (void __user *)arg;
long r;
 
switch (ioctl) {
+   case KVM_PPC_GET_PVINFO: {
+   struct kvm_ppc_pvinfo pvinfo;
+   r = kvm_vm_ioctl_get_pvinfo(pvinfo);
+   if (copy_to_user(argp, pvinfo, sizeof(pvinfo))) {
+   r = -EFAULT;
+   goto out;
+   }
+
+   break;
+   }
default:
r = -ENOTTY;
}
 
+out:
return r;
 }
 
diff --git a/include/linux/kvm.h b/include/linux/kvm.h
index 636fc38..3707704 100644
--- a/include/linux/kvm.h
+++ b/include/linux/kvm.h
@@ -414,6 +414,14 @@ struct kvm_enable_cap {
__u8  pad[64];
 };
 
+/* for KVM_PPC_GET_PVINFO */
+struct kvm_ppc_pvinfo {
+   /* out */
+   __u32 flags;
+   __u32 hcall[4];
+   __u8  pad[108];
+};
+
 #define KVMIO 0xAE
 
 /*
@@ -530,6 +538,7 @@ struct kvm_enable_cap {
 #ifdef __KVM_HAVE_XCRS
 #define KVM_CAP_XCRS 56
 #endif
+#define KVM_CAP_PPC_GET_PVINFO 57
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
@@ -664,6 +673,8 @@ struct kvm_clock_data {
 /* Available with KVM_CAP_PIT_STATE2 */
 #define KVM_GET_PIT2  _IOR(KVMIO,  0x9f, struct kvm_pit_state2)
 #define KVM_SET_PIT2  _IOW(KVMIO,  0xa0, struct kvm_pit_state2)
+/* Available with KVM_CAP_PPC_GET_PVINFO */
+#define KVM_PPC_GET_PVINFO   _IOW(KVMIO,  0xa1, struct kvm_ppc_pvinfo)
 
 /*
  * ioctls for vcpu fds
-- 
1.6.0.2

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


[PATCH 26/27] KVM: PPC: Add Documentation about PV interface

2010-07-29 Thread Alexander Graf
We just introduced a new PV interface that screams for documentation. So here
it is - a shiny new and awesome text file describing the internal workings of
the PPC KVM paravirtual interface.
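
Before the full text, a short guest-side sketch of the detection and opcode
lookup the document describes (my illustration; the device-tree helpers are the
standard kernel ones, the node and property names are the ones documented
below):

	static const u32 *kvm_hypercall_insns(int *len)
	{
		struct device_node *hyp = of_find_node_by_path("/hypervisor");

		if (!hyp || !of_device_is_compatible(hyp, "linux,kvm"))
			return NULL;

		/* up to 4 opcodes that make up one hypercall */
		return of_get_property(hyp, "hypercall-instructions", len);
	}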

Signed-off-by: Alexander Graf ag...@suse.de

---

v1 - v2:

  - clarify guest implementation
  - clarify that privileged instructions still work
  - explain safe MSR bits
  - Fix dsisr patch description
  - change hypervisor calls to use new register values

v2 - v3:

  - update documentation to new hypercall interface
  - change detection to be device tree based
---
 Documentation/kvm/ppc-pv.txt |  180 ++
 1 files changed, 180 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/kvm/ppc-pv.txt

diff --git a/Documentation/kvm/ppc-pv.txt b/Documentation/kvm/ppc-pv.txt
new file mode 100644
index 000..960cd51
--- /dev/null
+++ b/Documentation/kvm/ppc-pv.txt
@@ -0,0 +1,180 @@
+The PPC KVM paravirtual interface
+=
+
+The basic execution principle by which KVM on PowerPC works is to run all 
kernel
+space code in PR=1 which is user space. This way we trap all privileged
+instructions and can emulate them accordingly.
+
+Unfortunately that is also the downfall. There are quite a few privileged
+instructions that needlessly return us to the hypervisor even though they
+could be handled differently.
+
+This is what the PPC PV interface helps with. It takes privileged instructions
+and transforms them into unprivileged ones with some help from the hypervisor.
+This cuts down virtualization costs by about 50% on some of my benchmarks.
+
+The code for that interface can be found in arch/powerpc/kernel/kvm*
+
+Querying for existence
+==
+
+To find out if we're running on KVM or not, we leverage the device tree. When
+Linux is running on KVM, a node /hypervisor exists. That node contains a
+compatible property with the value "linux,kvm".
+
+Once you have determined you're running under a PV capable KVM, you can use
+hypercalls as described below.
+
+KVM hypercalls
+==
+
+Inside the device tree's /hypervisor node there's a property called
+'hypercall-instructions'. This property contains at most 4 opcodes that make
+up the hypercall. To make a hypercall, just execute these instructions.
+
+The parameters are as follows:
+
+   RegisterIN  OUT
+
+   r0  -   volatile
+   r3  1st parameter   Return code
+   r4  2nd parameter   1st output value
+   r5  3rd parameter   2nd output value
+   r6  4th parameter   3rd output value
+   r7  5th parameter   4th output value
+   r8  6th parameter   5th output value
+   r9  7th parameter   6th output value
+   r10 8th parameter   7th output value
+   r11 hypercall number8th output value
+   r12 -   volatile
+
+Hypercall definitions are shared in generic code, so the same hypercall numbers
+apply for x86 and powerpc alike with the exception that each KVM hypercall
+also needs to be ORed with the KVM vendor code which is (42 << 16).
+
+Return codes can be as follows:
+
+   CodeMeaning
+
+   0   Success
+   12  Hypercall not implemented
+   <0  Error
+
+The magic page
+==
+
+To enable communication between the hypervisor and guest there is a new shared
+page that contains parts of supervisor visible register state. The guest can
+map this shared page using the KVM hypercall KVM_HC_PPC_MAP_MAGIC_PAGE.
+
+With this hypercall issued the guest always gets the magic page mapped at the
+desired location in effective and physical address space. For now, we always
+map the page to -4096. This way we can access it using absolute load and store
+functions. The following instruction reads the first field of the magic page:
+
+   ld  rX, -4096(0)
+
+The interface is designed to be extensible should there be need later to add
+additional registers to the magic page. If you add fields to the magic page,
+also define a new hypercall feature to indicate that the host can give you more
+registers. Only if the host supports the additional features, make use of them.
+
+The magic page has the following layout as described in
+arch/powerpc/include/asm/kvm_para.h:
+
+struct kvm_vcpu_arch_shared {
+   __u64 scratch1;
+   __u64 scratch2;
+   __u64 scratch3;
+   __u64 critical; /* Guest may not get interrupts if == r1 */
+   __u64 sprg0;
+   __u64 sprg1;
+   __u64 sprg2;
+   __u64 sprg3;
+   __u64 srr0;
+   __u64 srr1;
+   __u64 dar;
+   __u64 msr;
+   __u32 dsisr;
+   __u32 int_pending;  /* Tells the guest if we have an interrupt */
+};
+
+Additions to the 

[PATCH 22/27] KVM: PPC: PV assembler helpers

2010-07-29 Thread Alexander Graf
When we hook an instruction we need to make sure we don't clobber any of
the registers at that point. So we write them out to scratch space in the
magic page. To make sure we don't fall into a race with another piece of
hooked code, we need to disable interrupts.

To make the later patches and code in general easier readable, let's introduce
a set of defines that save and restore r30, r31 and cr. Let's also define some
helpers to read the lower 32 bits of a 64 bit field on 32 bit systems.

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/kernel/kvm_emul.S |   29 +
 1 files changed, 29 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/kernel/kvm_emul.S b/arch/powerpc/kernel/kvm_emul.S
index e0d4183..1dac72d 100644
--- a/arch/powerpc/kernel/kvm_emul.S
+++ b/arch/powerpc/kernel/kvm_emul.S
@@ -35,3 +35,32 @@ kvm_hypercall_start:
 
 #define KVM_MAGIC_PAGE (-4096)
 
+#ifdef CONFIG_64BIT
+#define LL64(reg, offs, reg2)  ld  reg, (offs)(reg2)
+#define STL64(reg, offs, reg2) std reg, (offs)(reg2)
+#else
+#define LL64(reg, offs, reg2)  lwz reg, (offs + 4)(reg2)
+#define STL64(reg, offs, reg2) stw reg, (offs + 4)(reg2)
+#endif
+
+#define SCRATCH_SAVE   \
+   /* Enable critical section. We are critical if  \
+  shared-critical == r1 */\
+   STL64(r1, KVM_MAGIC_PAGE + KVM_MAGIC_CRITICAL, 0);  \
+   \
+   /* Save state */\
+   PPC_STL r31, (KVM_MAGIC_PAGE + KVM_MAGIC_SCRATCH1)(0);  \
+   PPC_STL r30, (KVM_MAGIC_PAGE + KVM_MAGIC_SCRATCH2)(0);  \
+   mfcrr31;\
+   stw r31, (KVM_MAGIC_PAGE + KVM_MAGIC_SCRATCH3)(0);
+
+#define SCRATCH_RESTORE
\
+   /* Restore state */ \
+   PPC_LL  r31, (KVM_MAGIC_PAGE + KVM_MAGIC_SCRATCH1)(0);  \
+   lwz r30, (KVM_MAGIC_PAGE + KVM_MAGIC_SCRATCH3)(0);  \
+   mtcrr30;\
+   PPC_LL  r30, (KVM_MAGIC_PAGE + KVM_MAGIC_SCRATCH2)(0);  \
+   \
+   /* Disable critical section. We are critical if \
+  shared-critical == r1 and r2 is always != r1 */ \
+   STL64(r2, KVM_MAGIC_PAGE + KVM_MAGIC_CRITICAL, 0);
-- 
1.6.0.2

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


[PATCH 19/27] KVM: PPC: PV tlbsync to nop

2010-07-29 Thread Alexander Graf
With our current MMU scheme we don't need to know about the tlbsync instruction.
So we can just nop it out.

Signed-off-by: Alexander Graf ag...@suse.de

---

v1 - v2:

  - use kvm_patch_ins
---
 arch/powerpc/kernel/kvm.c |   12 
 1 files changed, 12 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/kernel/kvm.c b/arch/powerpc/kernel/kvm.c
index 9ec572c..3258922 100644
--- a/arch/powerpc/kernel/kvm.c
+++ b/arch/powerpc/kernel/kvm.c
@@ -62,6 +62,8 @@
 #define KVM_INST_MTSPR_DAR 0x7c1303a6
 #define KVM_INST_MTSPR_DSISR   0x7c1203a6
 
+#define KVM_INST_TLBSYNC   0x7c00046c
+
 static bool kvm_patching_worked = true;
 
 static inline void kvm_patch_ins(u32 *inst, u32 new_inst)
@@ -98,6 +100,11 @@ static void kvm_patch_ins_stw(u32 *inst, long addr, u32 rt)
kvm_patch_ins(inst, KVM_INST_STW | rt | (addr  0xfffc));
 }
 
+static void kvm_patch_ins_nop(u32 *inst)
+{
+   kvm_patch_ins(inst, KVM_INST_NOP);
+}
+
 static void kvm_map_magic_page(void *data)
 {
kvm_hypercall2(KVM_HC_PPC_MAP_MAGIC_PAGE,
@@ -166,6 +173,11 @@ static void kvm_check_ins(u32 *inst)
case KVM_INST_MTSPR_DSISR:
kvm_patch_ins_stw(inst, magic_var(dsisr), inst_rt);
break;
+
+   /* Nops */
+   case KVM_INST_TLBSYNC:
+   kvm_patch_ins_nop(inst);
+   break;
}
 
switch (_inst) {
-- 
1.6.0.2

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


[PATCH 16/27] KVM: PPC: Generic KVM PV guest support

2010-07-29 Thread Alexander Graf
We have all the hypervisor pieces in place now, but the guest parts are still
missing.

This patch implements basic awareness of KVM when running Linux as guest. It
doesn't do anything with it yet though.

Signed-off-by: Alexander Graf ag...@suse.de

---

v2 - v3:

  - Add hypercall stub
---
 arch/powerpc/kernel/Makefile  |2 +-
 arch/powerpc/kernel/asm-offsets.c |   15 +++
 arch/powerpc/kernel/kvm.c |3 +++
 arch/powerpc/kernel/kvm_emul.S|   37 +
 arch/powerpc/platforms/Kconfig|   10 ++
 5 files changed, 66 insertions(+), 1 deletions(-)
 create mode 100644 arch/powerpc/kernel/kvm_emul.S

diff --git a/arch/powerpc/kernel/Makefile b/arch/powerpc/kernel/Makefile
index 5ea853d..d8e29b4 100644
--- a/arch/powerpc/kernel/Makefile
+++ b/arch/powerpc/kernel/Makefile
@@ -125,7 +125,7 @@ ifneq ($(CONFIG_XMON)$(CONFIG_KEXEC),)
 obj-y  += ppc_save_regs.o
 endif
 
-obj-$(CONFIG_KVM_GUEST)+= kvm.o
+obj-$(CONFIG_KVM_GUEST)+= kvm.o kvm_emul.o
 
 # Disable GCOV in odd or sensitive code
 GCOV_PROFILE_prom_init.o := n
diff --git a/arch/powerpc/kernel/asm-offsets.c 
b/arch/powerpc/kernel/asm-offsets.c
index a55d47e..e3e740b 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -465,6 +465,21 @@ int main(void)
DEFINE(VCPU_FAULT_ESR, offsetof(struct kvm_vcpu, arch.fault_esr));
 #endif /* CONFIG_PPC_BOOK3S */
 #endif
+
+#ifdef CONFIG_KVM_GUEST
+   DEFINE(KVM_MAGIC_SCRATCH1, offsetof(struct kvm_vcpu_arch_shared,
+   scratch1));
+   DEFINE(KVM_MAGIC_SCRATCH2, offsetof(struct kvm_vcpu_arch_shared,
+   scratch2));
+   DEFINE(KVM_MAGIC_SCRATCH3, offsetof(struct kvm_vcpu_arch_shared,
+   scratch3));
+   DEFINE(KVM_MAGIC_INT, offsetof(struct kvm_vcpu_arch_shared,
+  int_pending));
+   DEFINE(KVM_MAGIC_MSR, offsetof(struct kvm_vcpu_arch_shared, msr));
+   DEFINE(KVM_MAGIC_CRITICAL, offsetof(struct kvm_vcpu_arch_shared,
+   critical));
+#endif
+
 #ifdef CONFIG_44x
DEFINE(PGD_T_LOG2, PGD_T_LOG2);
DEFINE(PTE_T_LOG2, PTE_T_LOG2);
diff --git a/arch/powerpc/kernel/kvm.c b/arch/powerpc/kernel/kvm.c
index 4f85505..a5ece71 100644
--- a/arch/powerpc/kernel/kvm.c
+++ b/arch/powerpc/kernel/kvm.c
@@ -30,6 +30,9 @@
 #include asm/cacheflush.h
 #include asm/disassemble.h
 
+#define KVM_MAGIC_PAGE (-4096L)
+#define magic_var(x) KVM_MAGIC_PAGE + offsetof(struct kvm_vcpu_arch_shared, x)
+
 unsigned long kvm_hypercall(unsigned long *in,
unsigned long *out,
unsigned long nr)
diff --git a/arch/powerpc/kernel/kvm_emul.S b/arch/powerpc/kernel/kvm_emul.S
new file mode 100644
index 000..e0d4183
--- /dev/null
+++ b/arch/powerpc/kernel/kvm_emul.S
@@ -0,0 +1,37 @@
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301, USA.
+ *
+ * Copyright SUSE Linux Products GmbH 2010
+ *
+ * Authors: Alexander Graf ag...@suse.de
+ */
+
+#include asm/ppc_asm.h
+#include asm/kvm_asm.h
+#include asm/reg.h
+#include asm/page.h
+#include asm/asm-offsets.h
+
+/* Hypercall entry point. Will be patched with device tree instructions. */
+
+.global kvm_hypercall_start
+kvm_hypercall_start:
+   li  r3, -1
+   nop
+   nop
+   nop
+   blr
+
+#define KVM_MAGIC_PAGE (-4096)
+
diff --git a/arch/powerpc/platforms/Kconfig b/arch/powerpc/platforms/Kconfig
index d1663db..1744349 100644
--- a/arch/powerpc/platforms/Kconfig
+++ b/arch/powerpc/platforms/Kconfig
@@ -21,6 +21,16 @@ source arch/powerpc/platforms/44x/Kconfig
 source arch/powerpc/platforms/40x/Kconfig
 source arch/powerpc/platforms/amigaone/Kconfig
 
+config KVM_GUEST
+   bool KVM Guest support
+   default y
+   ---help---
+ This option enables various optimizations for running under the KVM
+ hypervisor. Overhead for the kernel when not running inside KVM should
+ be minimal.
+
+ In case of doubt, say Y
+
 config PPC_NATIVE
bool
depends on 6xx || PPC64
-- 
1.6.0.2

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org

[PATCH 23/27] KVM: PPC: PV mtmsrd L=1

2010-07-29 Thread Alexander Graf
The PowerPC ISA has a special instruction for mtmsr that only changes the EE
and RI bits, namely the L=1 form.

Since that one is reasonably often occurring and simple to implement, let's
go with this first. Writing EE=0 is always just a store. Doing EE=1 also
requires us to check for pending interrupts and if necessary exit back to the
hypervisor.
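
Written out as a C sketch, the emulation blob below does roughly this (not code
from the patch; "shared" stands for the magic page and trap_to_host() is a
placeholder for the tlbsync that bounces us back to the hypervisor):

	ulong msr = (shared->msr & ~(MSR_EE | MSR_RI)) | (val & (MSR_EE | MSR_RI));

	shared->msr = msr;			/* plain store, no trap needed */
	if ((val & MSR_EE) && shared->int_pending)
		trap_to_host();			/* let the host deliver the interrupt */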

Signed-off-by: Alexander Graf ag...@suse.de

---

v1 - v2:

  - use kvm_patch_ins_b
---
 arch/powerpc/kernel/kvm.c  |   45 
 arch/powerpc/kernel/kvm_emul.S |   56 
 2 files changed, 101 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/kernel/kvm.c b/arch/powerpc/kernel/kvm.c
index 239a70d..717ab0d 100644
--- a/arch/powerpc/kernel/kvm.c
+++ b/arch/powerpc/kernel/kvm.c
@@ -63,6 +63,7 @@
 #define KVM_INST_MTSPR_DSISR   0x7c1203a6
 
 #define KVM_INST_TLBSYNC   0x7c00046c
+#define KVM_INST_MTMSRD_L1 0x7c010164
 
 static bool kvm_patching_worked = true;
 static char kvm_tmp[1024 * 1024];
@@ -138,6 +139,43 @@ static u32 *kvm_alloc(int len)
return p;
 }
 
+extern u32 kvm_emulate_mtmsrd_branch_offs;
+extern u32 kvm_emulate_mtmsrd_reg_offs;
+extern u32 kvm_emulate_mtmsrd_len;
+extern u32 kvm_emulate_mtmsrd[];
+
+static void kvm_patch_ins_mtmsrd(u32 *inst, u32 rt)
+{
+   u32 *p;
+   int distance_start;
+   int distance_end;
+   ulong next_inst;
+
+   p = kvm_alloc(kvm_emulate_mtmsrd_len * 4);
+   if (!p)
+   return;
+
+   /* Find out where we are and put everything there */
+   distance_start = (ulong)p - (ulong)inst;
+   next_inst = ((ulong)inst + 4);
+   distance_end = next_inst - (ulong)p[kvm_emulate_mtmsrd_branch_offs];
+
+   /* Make sure we only write valid b instructions */
+   if (distance_start  KVM_INST_B_MAX) {
+   kvm_patching_worked = false;
+   return;
+   }
+
+   /* Modify the chunk to fit the invocation */
+   memcpy(p, kvm_emulate_mtmsrd, kvm_emulate_mtmsrd_len * 4);
+   p[kvm_emulate_mtmsrd_branch_offs] |= distance_end  KVM_INST_B_MASK;
+   p[kvm_emulate_mtmsrd_reg_offs] |= rt;
+   flush_icache_range((ulong)p, (ulong)p + kvm_emulate_mtmsrd_len * 4);
+
+   /* Patch the invocation */
+   kvm_patch_ins_b(inst, distance_start);
+}
+
 static void kvm_map_magic_page(void *data)
 {
kvm_hypercall2(KVM_HC_PPC_MAP_MAGIC_PAGE,
@@ -211,6 +249,13 @@ static void kvm_check_ins(u32 *inst)
case KVM_INST_TLBSYNC:
kvm_patch_ins_nop(inst);
break;
+
+   /* Rewrites */
+   case KVM_INST_MTMSRD_L1:
+   /* We use r30 and r31 during the hook */
+   if (get_rt(inst_rt)  30)
+   kvm_patch_ins_mtmsrd(inst, inst_rt);
+   break;
}
 
switch (_inst) {
diff --git a/arch/powerpc/kernel/kvm_emul.S b/arch/powerpc/kernel/kvm_emul.S
index 1dac72d..10dc4a6 100644
--- a/arch/powerpc/kernel/kvm_emul.S
+++ b/arch/powerpc/kernel/kvm_emul.S
@@ -64,3 +64,59 @@ kvm_hypercall_start:
/* Disable critical section. We are critical if \
   shared-critical == r1 and r2 is always != r1 */ \
STL64(r2, KVM_MAGIC_PAGE + KVM_MAGIC_CRITICAL, 0);
+
+.global kvm_emulate_mtmsrd
+kvm_emulate_mtmsrd:
+
+   SCRATCH_SAVE
+
+   /* Put MSR  ~(MSR_EE|MSR_RI) in r31 */
+   LL64(r31, KVM_MAGIC_PAGE + KVM_MAGIC_MSR, 0)
+   lis r30, (~(MSR_EE | MSR_RI))@h
+   ori r30, r30, (~(MSR_EE | MSR_RI))@l
+   and r31, r31, r30
+
+   /* OR the register's (MSR_EE|MSR_RI) on MSR */
+kvm_emulate_mtmsrd_reg:
+   andi.   r30, r0, (MSR_EE|MSR_RI)
+   or  r31, r31, r30
+
+   /* Put MSR back into magic page */
+   STL64(r31, KVM_MAGIC_PAGE + KVM_MAGIC_MSR, 0)
+
+   /* Check if we have to fetch an interrupt */
+   lwz r31, (KVM_MAGIC_PAGE + KVM_MAGIC_INT)(0)
+   cmpwi   r31, 0
+   beq+no_check
+
+   /* Check if we may trigger an interrupt */
+   andi.   r30, r30, MSR_EE
+   beq no_check
+
+   SCRATCH_RESTORE
+
+   /* Nag hypervisor */
+   tlbsync
+
+   b   kvm_emulate_mtmsrd_branch
+
+no_check:
+
+   SCRATCH_RESTORE
+
+   /* Go back to caller */
+kvm_emulate_mtmsrd_branch:
+   b   .
+kvm_emulate_mtmsrd_end:
+
+.global kvm_emulate_mtmsrd_branch_offs
+kvm_emulate_mtmsrd_branch_offs:
+   .long (kvm_emulate_mtmsrd_branch - kvm_emulate_mtmsrd) / 4
+
+.global kvm_emulate_mtmsrd_reg_offs
+kvm_emulate_mtmsrd_reg_offs:
+   .long (kvm_emulate_mtmsrd_reg - kvm_emulate_mtmsrd) / 4
+
+.global kvm_emulate_mtmsrd_len
+kvm_emulate_mtmsrd_len:
+   .long (kvm_emulate_mtmsrd_end - kvm_emulate_mtmsrd) / 4
-- 
1.6.0.2

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


[PATCH 13/27] KVM: PPC: Magic Page Book3s support

2010-07-29 Thread Alexander Graf
We need to override EA as well as PA lookups for the magic page. When the guest
tells us to project it, the magic page overrides any guest mappings.

In order to reflect that, we need to hook into all the MMU layers of KVM to
force map the magic page if necessary.

Signed-off-by: Alexander Graf ag...@suse.de

---

v2 - v3:

  - RMO - PAM
  - combine 32 and 64 real page magic override
  - remove leftover goto point
  - align hypercalls to in/out of ePAPR
---
 arch/powerpc/include/asm/kvm_book3s.h |1 +
 arch/powerpc/kvm/book3s.c |   35 ++--
 arch/powerpc/kvm/book3s_32_mmu.c  |   16 +++
 arch/powerpc/kvm/book3s_32_mmu_host.c |2 +-
 arch/powerpc/kvm/book3s_64_mmu.c  |   30 +++-
 arch/powerpc/kvm/book3s_64_mmu_host.c |9 +--
 6 files changed, 81 insertions(+), 12 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_book3s.h 
b/arch/powerpc/include/asm/kvm_book3s.h
index b5b1961..00cf8b0 100644
--- a/arch/powerpc/include/asm/kvm_book3s.h
+++ b/arch/powerpc/include/asm/kvm_book3s.h
@@ -130,6 +130,7 @@ extern void kvmppc_set_bat(struct kvm_vcpu *vcpu, struct 
kvmppc_bat *bat,
   bool upper, u32 val);
 extern void kvmppc_giveup_ext(struct kvm_vcpu *vcpu, ulong msr);
 extern int kvmppc_emulate_paired_single(struct kvm_run *run, struct kvm_vcpu 
*vcpu);
+extern pfn_t kvmppc_gfn_to_pfn(struct kvm_vcpu *vcpu, gfn_t gfn);
 
 extern u32 kvmppc_trampoline_lowmem;
 extern u32 kvmppc_trampoline_enter;
diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c
index 0ed5376..eee97b5 100644
--- a/arch/powerpc/kvm/book3s.c
+++ b/arch/powerpc/kvm/book3s.c
@@ -419,6 +419,25 @@ void kvmppc_set_pvr(struct kvm_vcpu *vcpu, u32 pvr)
}
 }
 
+pfn_t kvmppc_gfn_to_pfn(struct kvm_vcpu *vcpu, gfn_t gfn)
+{
+   ulong mp_pa = vcpu-arch.magic_page_pa;
+
+   /* Magic page override */
+   if (unlikely(mp_pa) 
+   unlikely(((gfn  PAGE_SHIFT)  KVM_PAM) ==
+((mp_pa  PAGE_MASK)  KVM_PAM))) {
+   ulong shared_page = ((ulong)vcpu-arch.shared)  PAGE_MASK;
+   pfn_t pfn;
+
+   pfn = (pfn_t)virt_to_phys((void*)shared_page)  PAGE_SHIFT;
+   get_page(pfn_to_page(pfn));
+   return pfn;
+   }
+
+   return gfn_to_pfn(vcpu-kvm, gfn);
+}
+
 /* Book3s_32 CPUs always have 32 bytes cache line size, which Linux assumes. To
  * make Book3s_32 Linux work on Book3s_64, we have to make sure we trap dcbz to
  * emulate 32 bytes dcbz length.
@@ -554,6 +573,13 @@ mmio:
 
 static int kvmppc_visible_gfn(struct kvm_vcpu *vcpu, gfn_t gfn)
 {
+   ulong mp_pa = vcpu-arch.magic_page_pa;
+
+   if (unlikely(mp_pa) 
+   unlikely((mp_pa  KVM_PAM)  PAGE_SHIFT == gfn)) {
+   return 1;
+   }
+
return kvm_is_visible_gfn(vcpu-kvm, gfn);
 }
 
@@ -1257,6 +1283,7 @@ struct kvm_vcpu *kvmppc_core_vcpu_create(struct kvm *kvm, 
unsigned int id)
struct kvmppc_vcpu_book3s *vcpu_book3s;
struct kvm_vcpu *vcpu;
int err = -ENOMEM;
+   unsigned long p;
 
vcpu_book3s = vmalloc(sizeof(struct kvmppc_vcpu_book3s));
if (!vcpu_book3s)
@@ -1274,8 +1301,10 @@ struct kvm_vcpu *kvmppc_core_vcpu_create(struct kvm 
*kvm, unsigned int id)
if (err)
goto free_shadow_vcpu;
 
-   vcpu->arch.shared = (void*)__get_free_page(GFP_KERNEL|__GFP_ZERO);
-   if (!vcpu->arch.shared)
+   p = __get_free_page(GFP_KERNEL|__GFP_ZERO);
+   /* the real shared page fills the last 4k of our page */
+   vcpu->arch.shared = (void*)(p + PAGE_SIZE - 4096);
+   if (!p)
goto uninit_vcpu;
 
vcpu-arch.host_retip = kvm_return_point;
@@ -1322,7 +1351,7 @@ void kvmppc_core_vcpu_free(struct kvm_vcpu *vcpu)
 {
struct kvmppc_vcpu_book3s *vcpu_book3s = to_book3s(vcpu);
 
-   free_page((unsigned long)vcpu->arch.shared);
+   free_page((unsigned long)vcpu->arch.shared & PAGE_MASK);
kvm_vcpu_uninit(vcpu);
kfree(vcpu_book3s-shadow_vcpu);
vfree(vcpu_book3s);
diff --git a/arch/powerpc/kvm/book3s_32_mmu.c b/arch/powerpc/kvm/book3s_32_mmu.c
index 449bce5..a7d121a 100644
--- a/arch/powerpc/kvm/book3s_32_mmu.c
+++ b/arch/powerpc/kvm/book3s_32_mmu.c
@@ -281,8 +281,24 @@ static int kvmppc_mmu_book3s_32_xlate(struct kvm_vcpu 
*vcpu, gva_t eaddr,
  struct kvmppc_pte *pte, bool data)
 {
int r;
+   ulong mp_ea = vcpu->arch.magic_page_ea;
 
	pte->eaddr = eaddr;
+
+   /* Magic page override */
+   if (unlikely(mp_ea) &&
+       unlikely((eaddr & ~0xfffULL) == (mp_ea & ~0xfffULL)) &&
+       !(vcpu->arch.shared->msr & MSR_PR)) {
+       pte->vpage = kvmppc_mmu_book3s_32_ea_to_vp(vcpu, eaddr, data);
+       pte->raddr = vcpu->arch.magic_page_pa | (pte->raddr & 0xfff);
+       pte->raddr &= KVM_PAM;
+       pte->may_execute = true;
+   

[PATCH 25/27] KVM: PPC: PV wrteei

2010-07-29 Thread Alexander Graf
On BookE the preferred way to write the EE bit is the wrteei instruction. It
already encodes the EE bit in the instruction.

So in order to get BookE some speedups as well, let's also PV'nize that
instruction.
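
As a side note on why the later "(*inst & MSR_EE)" patching step works: the
wrteei opcode carries its E operand in the same bit position as MSR_EE, as the
two instruction constants added by this patch show (0x7c000146 for wrteei 0
versus 0x7c008146 for wrteei 1). A small illustrative helper, not part of the
patch:

	/* Sketch: extract the E bit straight from the trapped instruction.
	 * 0x00008000 is both the wrteei E field and MSR_EE. */
	static inline u32 wrteei_ee_from_inst(u32 inst)
	{
		return inst & 0x00008000;
	}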

Signed-off-by: Alexander Graf ag...@suse.de

---

v1 - v2:

  - use kvm_patch_ins_b
---
 arch/powerpc/kernel/kvm.c  |   50 
 arch/powerpc/kernel/kvm_emul.S |   41 
 2 files changed, 91 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/kernel/kvm.c b/arch/powerpc/kernel/kvm.c
index 8ac57e2..e936817 100644
--- a/arch/powerpc/kernel/kvm.c
+++ b/arch/powerpc/kernel/kvm.c
@@ -67,6 +67,9 @@
 #define KVM_INST_MTMSRD_L1 0x7c010164
 #define KVM_INST_MTMSR 0x7c000124
 
+#define KVM_INST_WRTEEI_0  0x7c000146
+#define KVM_INST_WRTEEI_1  0x7c008146
+
 static bool kvm_patching_worked = true;
 static char kvm_tmp[1024 * 1024];
 static int kvm_tmp_index;
@@ -221,6 +224,47 @@ static void kvm_patch_ins_mtmsr(u32 *inst, u32 rt)
kvm_patch_ins_b(inst, distance_start);
 }
 
+#ifdef CONFIG_BOOKE
+
+extern u32 kvm_emulate_wrteei_branch_offs;
+extern u32 kvm_emulate_wrteei_ee_offs;
+extern u32 kvm_emulate_wrteei_len;
+extern u32 kvm_emulate_wrteei[];
+
+static void kvm_patch_ins_wrteei(u32 *inst)
+{
+   u32 *p;
+   int distance_start;
+   int distance_end;
+   ulong next_inst;
+
+   p = kvm_alloc(kvm_emulate_wrteei_len * 4);
+   if (!p)
+   return;
+
+   /* Find out where we are and put everything there */
+   distance_start = (ulong)p - (ulong)inst;
+   next_inst = ((ulong)inst + 4);
+   distance_end = next_inst - (ulong)&p[kvm_emulate_wrteei_branch_offs];
+
+   /* Make sure we only write valid b instructions */
+   if (distance_start > KVM_INST_B_MAX) {
+   kvm_patching_worked = false;
+   return;
+   }
+
+   /* Modify the chunk to fit the invocation */
+   memcpy(p, kvm_emulate_wrteei, kvm_emulate_wrteei_len * 4);
+   p[kvm_emulate_wrteei_branch_offs] |= distance_end & KVM_INST_B_MASK;
+   p[kvm_emulate_wrteei_ee_offs] |= (*inst & MSR_EE);
+   flush_icache_range((ulong)p, (ulong)p + kvm_emulate_wrteei_len * 4);
+
+   /* Patch the invocation */
+   kvm_patch_ins_b(inst, distance_start);
+}
+
+#endif
+
 static void kvm_map_magic_page(void *data)
 {
kvm_hypercall2(KVM_HC_PPC_MAP_MAGIC_PAGE,
@@ -310,6 +354,12 @@ static void kvm_check_ins(u32 *inst)
}
 
switch (_inst) {
+#ifdef CONFIG_BOOKE
+   case KVM_INST_WRTEEI_0:
+   case KVM_INST_WRTEEI_1:
+   kvm_patch_ins_wrteei(inst);
+   break;
+#endif
}
 }
 
diff --git a/arch/powerpc/kernel/kvm_emul.S b/arch/powerpc/kernel/kvm_emul.S
index 8cd22f4..3199f65 100644
--- a/arch/powerpc/kernel/kvm_emul.S
+++ b/arch/powerpc/kernel/kvm_emul.S
@@ -204,3 +204,44 @@ kvm_emulate_mtmsr_orig_ins_offs:
 .global kvm_emulate_mtmsr_len
 kvm_emulate_mtmsr_len:
.long (kvm_emulate_mtmsr_end - kvm_emulate_mtmsr) / 4
+
+
+
+.global kvm_emulate_wrteei
+kvm_emulate_wrteei:
+
+   SCRATCH_SAVE
+
+   /* Fetch old MSR in r31 */
+   LL64(r31, KVM_MAGIC_PAGE + KVM_MAGIC_MSR, 0)
+
+   /* Remove MSR_EE from old MSR */
+   li  r30, 0
+   ori r30, r30, MSR_EE
+   andc    r31, r31, r30
+
+   /* OR new MSR_EE onto the old MSR */
+kvm_emulate_wrteei_ee:
+   ori r31, r31, 0
+
+   /* Write new MSR value back */
+   STL64(r31, KVM_MAGIC_PAGE + KVM_MAGIC_MSR, 0)
+
+   SCRATCH_RESTORE
+
+   /* Go back to caller */
+kvm_emulate_wrteei_branch:
+   b   .
+kvm_emulate_wrteei_end:
+
+.global kvm_emulate_wrteei_branch_offs
+kvm_emulate_wrteei_branch_offs:
+   .long (kvm_emulate_wrteei_branch - kvm_emulate_wrteei) / 4
+
+.global kvm_emulate_wrteei_ee_offs
+kvm_emulate_wrteei_ee_offs:
+   .long (kvm_emulate_wrteei_ee - kvm_emulate_wrteei) / 4
+
+.global kvm_emulate_wrteei_len
+kvm_emulate_wrteei_len:
+   .long (kvm_emulate_wrteei_end - kvm_emulate_wrteei) / 4
-- 
1.6.0.2

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


[PATCH 1/7] KVM: PPC: Book3S_32 MMU debug compile fixes

2010-07-29 Thread Alexander Graf
Due to previous changes, the Book3S_32 guest MMU code didn't compile properly
when enabling debugging.

This patch repairs the broken code paths, making it possible to define DEBUG_MMU
and friends again.
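
For context, the dprintk()/dprintk_pte() calls touched below are compiled out
unless the corresponding debug macro is defined, along the lines of the usual
pattern (approximate sketch, not the exact lines from the file):

	#ifdef DEBUG_MMU
	#define dprintk(fmt, ...) printk(KERN_INFO fmt, ##__VA_ARGS__)
	#else
	#define dprintk(fmt, ...) do { } while (0)
	#endif

which is why the broken format arguments only show up once DEBUG_MMU is turned
on.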

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/kvm/book3s_32_mmu.c |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_32_mmu.c b/arch/powerpc/kvm/book3s_32_mmu.c
index a7d121a..5bf4bf8 100644
--- a/arch/powerpc/kvm/book3s_32_mmu.c
+++ b/arch/powerpc/kvm/book3s_32_mmu.c
@@ -104,7 +104,7 @@ static hva_t kvmppc_mmu_book3s_32_get_pteg(struct 
kvmppc_vcpu_book3s *vcpu_book3
	pteg = (vcpu_book3s->sdr1 & 0xffff0000) | hash;
 
	dprintk("MMU: pc=0x%lx eaddr=0x%lx sdr1=0x%llx pteg=0x%x vsid=0x%x\n",
-		vcpu_book3s->vcpu.arch.pc, eaddr, vcpu_book3s->sdr1, pteg,
+		kvmppc_get_pc(&vcpu_book3s->vcpu), eaddr, vcpu_book3s->sdr1, pteg,
		sre->vsid);
 
	r = gfn_to_hva(vcpu_book3s->vcpu.kvm, pteg >> PAGE_SHIFT);
@@ -269,7 +269,7 @@ no_page_found:
	dprintk_pte("KVM MMU: No PTE found (sdr1=0x%llx ptegp=0x%lx)\n",
		    to_book3s(vcpu)->sdr1, ptegp);
	for (i=0; i<16; i+=2) {
-		dprintk_pte("   %02d: 0x%x - 0x%x (0x%llx)\n",
+		dprintk_pte("   %02d: 0x%x - 0x%x (0x%x)\n",
			    i, pteg[i], pteg[i+1], ptem);
}
}
-- 
1.6.0.2

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


[PATCH 7/7] KVM: PPC: Move KVM trampolines before __end_interrupts

2010-07-29 Thread Alexander Graf
When using a relocatable kernel we need to make sure that the trampoline code
and the interrupt handlers are both copied to low memory. The only way to do
this reliably is to put them in the copied section.

This patch should make relocated kernels work with KVM.

KVM-Stable-Tag
Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/kernel/exceptions-64s.S |6 ++
 arch/powerpc/kernel/head_64.S|6 --
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/kernel/exceptions-64s.S 
b/arch/powerpc/kernel/exceptions-64s.S
index 3e423fb..a0f25fb 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -299,6 +299,12 @@ slb_miss_user_pseries:
b   .   /* prevent spec. execution */
 #endif /* __DISABLED__ */
 
+/* KVM's trampoline code needs to be close to the interrupt handlers */
+
+#ifdef CONFIG_KVM_BOOK3S_64_HANDLER
+#include "../kvm/book3s_rmhandlers.S"
+#endif
+
.align  7
.globl  __end_interrupts
 __end_interrupts:
diff --git a/arch/powerpc/kernel/head_64.S b/arch/powerpc/kernel/head_64.S
index 844a44b..d3010a3 100644
--- a/arch/powerpc/kernel/head_64.S
+++ b/arch/powerpc/kernel/head_64.S
@@ -166,12 +166,6 @@ exception_marker:
 #include "exceptions-64s.S"
 #endif
 
-/* KVM trampoline code needs to be close to the interrupt handlers */
-
-#ifdef CONFIG_KVM_BOOK3S_64_HANDLER
-#include "../kvm/book3s_rmhandlers.S"
-#endif
-
 _GLOBAL(generic_secondary_thread_init)
mr  r24,r3
 
-- 
1.6.0.2

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


[PATCH 4/7] KVM: PPC: Add book3s_32 tlbie flush acceleration

2010-07-29 Thread Alexander Graf
On Book3s_32 the tlbie instruction flushed effective addresses by the mask
0x0ffff000. This is pretty hard to reflect with a hash that hashes ~0xfff, so
to speed up that target we should also keep a special hash around for it.
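
In other words, instead of recursing once per 256MB segment, the flush path can
hash the incoming EA with the same reduced mask and walk a single bucket. A
rough sketch of the bucket selection (mirrors kvmppc_mmu_hash_pte_long() added
below; simplified):

	/* Sketch: pick the bucket for a "flush by page index, any segment" op */
	static inline u64 pte_long_bucket(u64 eaddr)
	{
		return hash_64(eaddr & 0x0ffff000ULL, HPTEG_HASH_BITS_PTE_LONG);
	}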

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/include/asm/kvm_host.h |4 +++
 arch/powerpc/kvm/book3s_mmu_hpte.c  |   40 ++
 2 files changed, 39 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_host.h 
b/arch/powerpc/include/asm/kvm_host.h
index fafc71a..bba3b9b 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -42,9 +42,11 @@
 
 #define HPTEG_CACHE_NUM			(1 << 15)
 #define HPTEG_HASH_BITS_PTE		13
+#define HPTEG_HASH_BITS_PTE_LONG	12
 #define HPTEG_HASH_BITS_VPTE		13
 #define HPTEG_HASH_BITS_VPTE_LONG	5
 #define HPTEG_HASH_NUM_PTE		(1 << HPTEG_HASH_BITS_PTE)
+#define HPTEG_HASH_NUM_PTE_LONG		(1 << HPTEG_HASH_BITS_PTE_LONG)
 #define HPTEG_HASH_NUM_VPTE		(1 << HPTEG_HASH_BITS_VPTE)
 #define HPTEG_HASH_NUM_VPTE_LONG	(1 << HPTEG_HASH_BITS_VPTE_LONG)
 
@@ -163,6 +165,7 @@ struct kvmppc_mmu {
 
 struct hpte_cache {
struct hlist_node list_pte;
+   struct hlist_node list_pte_long;
struct hlist_node list_vpte;
struct hlist_node list_vpte_long;
struct rcu_head rcu_head;
@@ -293,6 +296,7 @@ struct kvm_vcpu_arch {
 
 #ifdef CONFIG_PPC_BOOK3S
struct hlist_head hpte_hash_pte[HPTEG_HASH_NUM_PTE];
+   struct hlist_head hpte_hash_pte_long[HPTEG_HASH_NUM_PTE_LONG];
struct hlist_head hpte_hash_vpte[HPTEG_HASH_NUM_VPTE];
struct hlist_head hpte_hash_vpte_long[HPTEG_HASH_NUM_VPTE_LONG];
int hpte_cache_count;
diff --git a/arch/powerpc/kvm/book3s_mmu_hpte.c 
b/arch/powerpc/kvm/book3s_mmu_hpte.c
index b643893..02c64ab 100644
--- a/arch/powerpc/kvm/book3s_mmu_hpte.c
+++ b/arch/powerpc/kvm/book3s_mmu_hpte.c
@@ -45,6 +45,12 @@ static inline u64 kvmppc_mmu_hash_pte(u64 eaddr)
	return hash_64(eaddr >> PTE_SIZE, HPTEG_HASH_BITS_PTE);
 }
 
+static inline u64 kvmppc_mmu_hash_pte_long(u64 eaddr)
+{
+   return hash_64((eaddr & 0x0ffff000) >> PTE_SIZE,
+                  HPTEG_HASH_BITS_PTE_LONG);
+}
+
 static inline u64 kvmppc_mmu_hash_vpte(u64 vpage)
 {
	return hash_64(vpage & 0xfffffffffULL, HPTEG_HASH_BITS_VPTE);
@@ -66,6 +72,11 @@ void kvmppc_mmu_hpte_cache_map(struct kvm_vcpu *vcpu, struct 
hpte_cache *pte)
	index = kvmppc_mmu_hash_pte(pte->pte.eaddr);
	hlist_add_head_rcu(&pte->list_pte, &vcpu->arch.hpte_hash_pte[index]);
 
+   /* Add to ePTE_long list */
+   index = kvmppc_mmu_hash_pte_long(pte->pte.eaddr);
+   hlist_add_head_rcu(&pte->list_pte_long,
+                      &vcpu->arch.hpte_hash_pte_long[index]);
+
	/* Add to vPTE list */
	index = kvmppc_mmu_hash_vpte(pte->pte.vpage);
	hlist_add_head_rcu(&pte->list_vpte, &vcpu->arch.hpte_hash_vpte[index]);
@@ -99,6 +110,7 @@ static void invalidate_pte(struct kvm_vcpu *vcpu, struct 
hpte_cache *pte)
spin_lock(vcpu-arch.mmu_lock);
 
	hlist_del_init_rcu(&pte->list_pte);
+   hlist_del_init_rcu(&pte->list_pte_long);
	hlist_del_init_rcu(&pte->list_vpte);
	hlist_del_init_rcu(&pte->list_vpte_long);
 
@@ -150,10 +162,28 @@ static void kvmppc_mmu_pte_flush_page(struct kvm_vcpu 
*vcpu, ulong guest_ea)
rcu_read_unlock();
 }
 
-void kvmppc_mmu_pte_flush(struct kvm_vcpu *vcpu, ulong guest_ea, ulong ea_mask)
+static void kvmppc_mmu_pte_flush_long(struct kvm_vcpu *vcpu, ulong guest_ea)
 {
-   u64 i;
+   struct hlist_head *list;
+   struct hlist_node *node;
+   struct hpte_cache *pte;
+
+   /* Find the list of entries in the map */
+   list = &vcpu->arch.hpte_hash_pte_long[
+           kvmppc_mmu_hash_pte_long(guest_ea)];
 
+   rcu_read_lock();
+
+   /* Check the list for matching entries and invalidate */
+   hlist_for_each_entry_rcu(pte, node, list, list_pte_long)
+       if ((pte->pte.eaddr & 0x0ffff000UL) == guest_ea)
+   invalidate_pte(vcpu, pte);
+
+   rcu_read_unlock();
+}
+
+void kvmppc_mmu_pte_flush(struct kvm_vcpu *vcpu, ulong guest_ea, ulong ea_mask)
+{
	dprintk_mmu("KVM: Flushing %d Shadow PTEs: 0x%lx & 0x%lx\n",
		    vcpu->arch.hpte_cache_count, guest_ea, ea_mask);
 
@@ -164,9 +194,7 @@ void kvmppc_mmu_pte_flush(struct kvm_vcpu *vcpu, ulong 
guest_ea, ulong ea_mask)
kvmppc_mmu_pte_flush_page(vcpu, guest_ea);
break;
	case 0x0ffff000:
-		/* 32-bit flush w/o segment, go through all possible segments */
-		for (i = 0; i < 0x100000000ULL; i += 0x10000000ULL)
-			kvmppc_mmu_pte_flush(vcpu, guest_ea | i, ~0xfffUL);
+   kvmppc_mmu_pte_flush_long(vcpu, guest_ea);
break;
case 0:
/* Doing a complete 

[PATCH 5/7] KVM: PPC: Use MSR_DR for external load_up

2010-07-29 Thread Alexander Graf
Book3S_32 requires MSR_DR to be disabled during load_up_xxx while on Book3S_64
it's supposed to be enabled. I misread the code and disabled it in both cases,
potentially breaking the PS3 which has a really small RMA.

This patch makes KVM work on the PS3 again.

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/kvm/book3s_rmhandlers.S |   28 +++-
 1 files changed, 19 insertions(+), 9 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_rmhandlers.S 
b/arch/powerpc/kvm/book3s_rmhandlers.S
index 506d5c3..229d3d6 100644
--- a/arch/powerpc/kvm/book3s_rmhandlers.S
+++ b/arch/powerpc/kvm/book3s_rmhandlers.S
@@ -202,8 +202,25 @@ _GLOBAL(kvmppc_rmcall)
 
 #if defined(CONFIG_PPC_BOOK3S_32)
 #define STACK_LR   INT_FRAME_SIZE+4
+
+/* load_up_xxx have to run with MSR_DR=0 on Book3S_32 */
+#define MSR_EXT_START  \
+   PPC_STL r20, _NIP(r1);  \
+   mfmsr   r20;\
+   LOAD_REG_IMMEDIATE(r3, MSR_DR|MSR_EE);  \
+   andc    r3,r20,r3;  /* Disable DR,EE */ \
+   mtmsr   r3; \
+   sync
+
+#define MSR_EXT_END\
+   mtmsr   r20;/* Enable DR,EE */  \
+   sync;   \
+   PPC_LL  r20, _NIP(r1)
+
 #elif defined(CONFIG_PPC_BOOK3S_64)
 #define STACK_LR   _LINK
+#define MSR_EXT_START
+#define MSR_EXT_END
 #endif
 
 /*
@@ -215,19 +232,12 @@ _GLOBAL(kvmppc_load_up_ ## what); 
\
PPC_STLU r1, -INT_FRAME_SIZE(r1);   \
mflrr3; \
PPC_STL r3, STACK_LR(r1);   \
-   PPC_STL r20, _NIP(r1);  \
-   mfmsr   r20;\
-   LOAD_REG_IMMEDIATE(r3, MSR_DR|MSR_EE);  \
-   andc    r3,r20,r3;  /* Disable DR,EE */ \
-   mtmsr   r3; \
-   sync;   \
+   MSR_EXT_START;  \
\
bl  FUNC(load_up_ ## what); \
\
-   mtmsr   r20;/* Enable DR,EE */  \
-   sync;   \
+   MSR_EXT_END;\
PPC_LL  r3, STACK_LR(r1);   \
-   PPC_LL  r20, _NIP(r1);  \
mtlrr3; \
addir1, r1, INT_FRAME_SIZE; \
blr
-- 
1.6.0.2

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


[PATCH 2/7] KVM: PPC: RCU'ify the Book3s MMU

2010-07-29 Thread Alexander Graf
So far we've been running all code without locking of any sort. This wasn't
really an issue because I didn't see any parallel access to the shadow MMU
code coming.

But then I started to implement dirty bitmapping to MOL which has the video
code in its own thread, so suddenly we had the dirty bitmap code run in
parallel to the shadow mmu code. And with that came trouble.

So I went ahead and made the MMU modifying functions as parallelizable as
I could think of. I hope I didn't screw up too much RCU logic :-). If you
know your way around RCU and locking and what needs to be done when, please
take a look at this patch.
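
For readers not steeped in RCU, the shape of the conversion is the usual one:
writers serialize on a spinlock, readers only take rcu_read_lock(), and frees
are deferred with call_rcu() until all readers have drained. A generic
illustration of that pattern (not code from this patch):

	#include <linux/rculist.h>
	#include <linux/slab.h>
	#include <linux/spinlock.h>

	struct item {
		struct hlist_node node;
		struct rcu_head rcu;
	};

	static DEFINE_SPINLOCK(map_lock);
	static HLIST_HEAD(map);

	static void item_free_rcu(struct rcu_head *head)
	{
		kfree(container_of(head, struct item, rcu));
	}

	static void item_remove(struct item *it)
	{
		spin_lock(&map_lock);
		hlist_del_init_rcu(&it->node);	/* readers may still see it */
		spin_unlock(&map_lock);
		call_rcu(&it->rcu, item_free_rcu);	/* freed after the grace period */
	}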

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/include/asm/kvm_host.h |2 +
 arch/powerpc/kvm/book3s_mmu_hpte.c  |   78 ++
 2 files changed, 61 insertions(+), 19 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_host.h 
b/arch/powerpc/include/asm/kvm_host.h
index e1da775..fafc71a 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -165,6 +165,7 @@ struct hpte_cache {
struct hlist_node list_pte;
struct hlist_node list_vpte;
struct hlist_node list_vpte_long;
+   struct rcu_head rcu_head;
u64 host_va;
u64 pfn;
ulong slot;
@@ -295,6 +296,7 @@ struct kvm_vcpu_arch {
struct hlist_head hpte_hash_vpte[HPTEG_HASH_NUM_VPTE];
struct hlist_head hpte_hash_vpte_long[HPTEG_HASH_NUM_VPTE_LONG];
int hpte_cache_count;
+   spinlock_t mmu_lock;
 #endif
 };
 
diff --git a/arch/powerpc/kvm/book3s_mmu_hpte.c 
b/arch/powerpc/kvm/book3s_mmu_hpte.c
index 4868d4a..b643893 100644
--- a/arch/powerpc/kvm/book3s_mmu_hpte.c
+++ b/arch/powerpc/kvm/book3s_mmu_hpte.c
@@ -60,68 +60,94 @@ void kvmppc_mmu_hpte_cache_map(struct kvm_vcpu *vcpu, 
struct hpte_cache *pte)
 {
u64 index;
 
+   spin_lock(&vcpu->arch.mmu_lock);
+
	/* Add to ePTE list */
	index = kvmppc_mmu_hash_pte(pte->pte.eaddr);
-   hlist_add_head(&pte->list_pte, &vcpu->arch.hpte_hash_pte[index]);
+   hlist_add_head_rcu(&pte->list_pte, &vcpu->arch.hpte_hash_pte[index]);
 
	/* Add to vPTE list */
	index = kvmppc_mmu_hash_vpte(pte->pte.vpage);
-   hlist_add_head(&pte->list_vpte, &vcpu->arch.hpte_hash_vpte[index]);
+   hlist_add_head_rcu(&pte->list_vpte, &vcpu->arch.hpte_hash_vpte[index]);
 
	/* Add to vPTE_long list */
	index = kvmppc_mmu_hash_vpte_long(pte->pte.vpage);
-   hlist_add_head(&pte->list_vpte_long,
-                  &vcpu->arch.hpte_hash_vpte_long[index]);
+   hlist_add_head_rcu(&pte->list_vpte_long,
+                      &vcpu->arch.hpte_hash_vpte_long[index]);
+
+   spin_unlock(&vcpu->arch.mmu_lock);
+}
+
+static void free_pte_rcu(struct rcu_head *head)
+{
+   struct hpte_cache *pte = container_of(head, struct hpte_cache, rcu_head);
+   kmem_cache_free(hpte_cache, pte);
 }
 
 static void invalidate_pte(struct kvm_vcpu *vcpu, struct hpte_cache *pte)
 {
+   /* pte already invalidated? */
+   if (hlist_unhashed(&pte->list_pte))
+       return;
+
	dprintk_mmu("KVM: Flushing SPT: 0x%lx (0x%llx) -> 0x%llx\n",
		    pte->pte.eaddr, pte->pte.vpage, pte->host_va);
 
	/* Different for 32 and 64 bit */
	kvmppc_mmu_invalidate_pte(vcpu, pte);
 
+   spin_lock(&vcpu->arch.mmu_lock);
+
+   hlist_del_init_rcu(&pte->list_pte);
+   hlist_del_init_rcu(&pte->list_vpte);
+   hlist_del_init_rcu(&pte->list_vpte_long);
+
+   spin_unlock(&vcpu->arch.mmu_lock);
+
	if (pte->pte.may_write)
		kvm_release_pfn_dirty(pte->pfn);
	else
		kvm_release_pfn_clean(pte->pfn);
 
-   hlist_del(&pte->list_pte);
-   hlist_del(&pte->list_vpte);
-   hlist_del(&pte->list_vpte_long);
-
	vcpu->arch.hpte_cache_count--;
-   kmem_cache_free(hpte_cache, pte);
+   call_rcu(&pte->rcu_head, free_pte_rcu);
 }
 
 static void kvmppc_mmu_pte_flush_all(struct kvm_vcpu *vcpu)
 {
struct hpte_cache *pte;
-   struct hlist_node *node, *tmp;
+   struct hlist_node *node;
int i;
 
+   rcu_read_lock();
+
	for (i = 0; i < HPTEG_HASH_NUM_VPTE_LONG; i++) {
		struct hlist_head *list = &vcpu->arch.hpte_hash_vpte_long[i];
 
-   hlist_for_each_entry_safe(pte, node, tmp, list, list_vpte_long)
+   hlist_for_each_entry_rcu(pte, node, list, list_vpte_long)
invalidate_pte(vcpu, pte);
}
+
+   rcu_read_unlock();
 }
 
 static void kvmppc_mmu_pte_flush_page(struct kvm_vcpu *vcpu, ulong guest_ea)
 {
struct hlist_head *list;
-   struct hlist_node *node, *tmp;
+   struct hlist_node *node;
struct hpte_cache *pte;
 
/* Find the list of entries in the map */
	list = &vcpu->arch.hpte_hash_pte[kvmppc_mmu_hash_pte(guest_ea)];
 
+   rcu_read_lock();
+
/* Check the list for matching entries and invalidate */
-   hlist_for_each_entry_safe(pte, 

[PATCH 3/7] KVM: PPC: correctly check gfn_to_pfn() return value

2010-07-29 Thread Alexander Graf
From: Gleb Natapov g...@redhat.com

On failure gfn_to_pfn returns bad_page so use correct function to check
for that.
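
The distinction matters because kvmppc_gfn_to_pfn() returns a page frame
number, not a host virtual address, so its failure sentinel has to be tested
with the pfn-flavoured helper. Roughly:

	pfn = kvmppc_gfn_to_pfn(vcpu, gfn);
	if (is_error_pfn(pfn))		/* bad_page sentinel, not a bad HVA */
		return -EINVAL;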

Signed-off-by: Gleb Natapov g...@redhat.com
Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/kvm/book3s_32_mmu_host.c |2 +-
 arch/powerpc/kvm/book3s_64_mmu_host.c |2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_32_mmu_host.c 
b/arch/powerpc/kvm/book3s_32_mmu_host.c
index 05e8c9e..343452c 100644
--- a/arch/powerpc/kvm/book3s_32_mmu_host.c
+++ b/arch/powerpc/kvm/book3s_32_mmu_host.c
@@ -148,7 +148,7 @@ int kvmppc_mmu_map_page(struct kvm_vcpu *vcpu, struct 
kvmppc_pte *orig_pte)
 
/* Get host physical address for gpa */
	hpaddr = kvmppc_gfn_to_pfn(vcpu, orig_pte->raddr >> PAGE_SHIFT);
-   if (kvm_is_error_hva(hpaddr)) {
+   if (is_error_pfn(hpaddr)) {
		printk(KERN_INFO "Couldn't get guest page for gfn %lx!\n",
		       orig_pte->eaddr);
 orig_pte-eaddr);
return -EINVAL;
diff --git a/arch/powerpc/kvm/book3s_64_mmu_host.c 
b/arch/powerpc/kvm/book3s_64_mmu_host.c
index 6cdd19a..672b149 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_host.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_host.c
@@ -102,7 +102,7 @@ int kvmppc_mmu_map_page(struct kvm_vcpu *vcpu, struct 
kvmppc_pte *orig_pte)
 
/* Get host physical address for gpa */
	hpaddr = kvmppc_gfn_to_pfn(vcpu, orig_pte->raddr >> PAGE_SHIFT);
-   if (kvm_is_error_hva(hpaddr)) {
+   if (is_error_pfn(hpaddr)) {
		printk(KERN_INFO "Couldn't get guest page for gfn %lx!\n",
		       orig_pte->eaddr);
return -EINVAL;
}
-- 
1.6.0.2

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


[PATCH 0/7] Rest of my KVM-PPC patch queue

2010-07-29 Thread Alexander Graf
During the past few weeks a couple of fixes have gathered in my queue. This
is a dump of everything that is not related to the PV framework.

Please apply on top of the PV stuff.

Alexander Graf (6):
  KVM: PPC: Book3S_32 MMU debug compile fixes
  KVM: PPC: RCU'ify the Book3s MMU
  KVM: PPC: Add book3s_32 tlbie flush acceleration
  KVM: PPC: Use MSR_DR for external load_up
  KVM: PPC: Make long relocations be ulong
  KVM: PPC: Move KVM trampolines before __end_interrupts

Gleb Natapov (1):
  KVM: PPC: correctly check gfn_to_pfn() return value

 arch/powerpc/include/asm/kvm_book3s.h |4 +-
 arch/powerpc/include/asm/kvm_host.h   |6 ++
 arch/powerpc/kernel/exceptions-64s.S  |6 ++
 arch/powerpc/kernel/head_64.S |6 --
 arch/powerpc/kvm/book3s_32_mmu.c  |4 +-
 arch/powerpc/kvm/book3s_32_mmu_host.c |2 +-
 arch/powerpc/kvm/book3s_64_mmu_host.c |2 +-
 arch/powerpc/kvm/book3s_mmu_hpte.c|  118 ++---
 arch/powerpc/kvm/book3s_rmhandlers.S  |   32 ++---
 9 files changed, 133 insertions(+), 47 deletions(-)

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


[PATCH 6/7] KVM: PPC: Make long relocations be ulong

2010-07-29 Thread Alexander Graf
On Book3S KVM we directly expose some asm pointers to C code as
variables. These need to be relocated and thus break on relocatable
kernels.

To make sure we can at least build, let's mark them as long instead
of u32 where 64bit relocations don't work.

This fixes the following build error:

WARNING: 2 bad relocations^M
 c0008590 R_PPC64_ADDR32.text+0x40008460^M
 c0008594 R_PPC64_ADDR32.text+0x40008598^M

Please keep in mind that actually using KVM on a relocated kernel
might still break. This only fixes the compile problem.

Reported-by: Subrata Modak subr...@linux.vnet.ibm.com
Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/include/asm/kvm_book3s.h |4 ++--
 arch/powerpc/kvm/book3s_rmhandlers.S  |4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_book3s.h 
b/arch/powerpc/include/asm/kvm_book3s.h
index 00cf8b0..f04f516 100644
--- a/arch/powerpc/include/asm/kvm_book3s.h
+++ b/arch/powerpc/include/asm/kvm_book3s.h
@@ -132,8 +132,8 @@ extern void kvmppc_giveup_ext(struct kvm_vcpu *vcpu, ulong 
msr);
 extern int kvmppc_emulate_paired_single(struct kvm_run *run, struct kvm_vcpu 
*vcpu);
 extern pfn_t kvmppc_gfn_to_pfn(struct kvm_vcpu *vcpu, gfn_t gfn);
 
-extern u32 kvmppc_trampoline_lowmem;
-extern u32 kvmppc_trampoline_enter;
+extern ulong kvmppc_trampoline_lowmem;
+extern ulong kvmppc_trampoline_enter;
 extern void kvmppc_rmcall(ulong srr0, ulong srr1);
 extern void kvmppc_load_up_fpu(void);
 extern void kvmppc_load_up_altivec(void);
diff --git a/arch/powerpc/kvm/book3s_rmhandlers.S 
b/arch/powerpc/kvm/book3s_rmhandlers.S
index 229d3d6..2b9c908 100644
--- a/arch/powerpc/kvm/book3s_rmhandlers.S
+++ b/arch/powerpc/kvm/book3s_rmhandlers.S
@@ -252,10 +252,10 @@ define_load_up(vsx)
 
 .global kvmppc_trampoline_lowmem
 kvmppc_trampoline_lowmem:
-   .long kvmppc_handler_lowmem_trampoline - CONFIG_KERNEL_START
+   PPC_LONG kvmppc_handler_lowmem_trampoline - CONFIG_KERNEL_START
 
 .global kvmppc_trampoline_enter
 kvmppc_trampoline_enter:
-   .long kvmppc_handler_trampoline_enter - CONFIG_KERNEL_START
+   PPC_LONG kvmppc_handler_trampoline_enter - CONFIG_KERNEL_START
 
 #include book3s_segment.S
-- 
1.6.0.2

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [PATCH v2 5/7] Add support for ramdisk on ppc32 for uImage-ppc and Elf-ppc

2010-07-29 Thread Matthew McClintock

On Jul 29, 2010, at 3:33 AM, Simon Horman wrote:

 On Tue, Jul 20, 2010 at 03:14:58PM -0500, Matthew McClintock wrote:
 This fixes --reuseinitrd and --ramdisk option for ppc32 on
 uImage-ppc and Elf. It works for normal kexec as well as for
 kdump.
 
  When using --reuseinitrd you need to specify retain_initrd
 on the command line. Also, if you are doing kdump you need to make
 sure your initrd lives in the crashdump region otherwise the
 kdump kernel will not be able to access it. The --ramdisk option
 should always work.
 
 Thanks, I have applied this change.
 I had to do a minor merge on the Makefile,
 could you verify that the result is correct?
 

Tested and looks good.

-M

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [PATCH 4/6] regulator: Remove owner field from attribute initialization in regulator core driver

2010-07-29 Thread Mark Brown
On Wed, Jul 28, 2010 at 10:09:24PM -0700, Guenter Roeck wrote:
 Signed-off-by: Guenter Roeck guenter.ro...@ericsson.com

Acked-by: Mark Brown broo...@opensource.wolfsonmicro.com
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [PATCH 0/2 v1.04] Add support for DWC OTG driver.

2010-07-29 Thread Greg KH
On Thu, Jul 29, 2010 at 09:26:12AM -0700, Fushen Chen wrote:
  [PATCH 1/2 v1.04]
  1. License information is under clarification.
 
 I meant that APM is still working with Synopsys to resolve the GPL License.
 There is no result yet.

Then I would be very careful in posting the code like you have done.  As
it is, the code is not something that can be legally posted or used in
any device, and you might be held liable for it :(

good luck,

greg k-h
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [PATCH 0/2 v1.04] Add support for DWC OTG driver.

2010-07-29 Thread Fushen Chen
 [PATCH 1/2 v1.04]
 1. License information is under clarification.

I meant that APM is still working with Synopys to resolve the GPL License.
There is no result yet.
I'll change this line to License issue is resolved. if that happens.
I modified other part of the patch according to other reviewer's comment.

Thanks,
Fushen

On Wed, Jul 28, 2010 at 8:05 PM, Greg KH gre...@suse.de wrote:

 On Wed, Jul 28, 2010 at 05:28:41PM -0700, Fushen Chen wrote:
  [PATCH 1/2 v1.04]
  1. License information is under clarification.

 What do you mean by this?  I fail to see a change here, why just repost
 the same code again?

 What is being done to resolve the issues I outlined previously?

 greg k-h

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

[PATCH] of: Provide default of_node_to_nid() implementation.

2010-07-29 Thread Grant Likely
of_node_to_nid() is only relevant in a few architectures.  Don't force
everyone to implement it anyway.
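
The override mechanism is the usual "define a macro with the function's own
name" trick: an architecture that provides a real of_node_to_nid() also defines
the macro, and the generic header only supplies the -1 fallback when the macro
is absent. Schematically (a generic illustration of the pattern used in the
hunks below, not an extra change):

	/* arch header: real implementation, plus the marker macro */
	extern int of_node_to_nid(struct device_node *device);
	#define of_node_to_nid of_node_to_nid

	/* common header: weak default, only if no arch spoke up */
	#ifndef of_node_to_nid
	static inline int of_node_to_nid(struct device_node *np) { return -1; }
	#define of_node_to_nid of_node_to_nid
	#endif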

Signed-off-by: Grant Likely grant.lik...@secretlab.ca
---

v3: make -1 the default return value and let powerpc override it to 0 when
CONFIG_NUMA not set.

 arch/microblaze/include/asm/topology.h |   10 --
 arch/powerpc/include/asm/prom.h|7 +++
 arch/powerpc/include/asm/topology.h|7 ---
 arch/sparc/include/asm/prom.h  |3 +--
 include/linux/of.h |5 +
 5 files changed, 13 insertions(+), 19 deletions(-)

diff --git a/arch/microblaze/include/asm/topology.h 
b/arch/microblaze/include/asm/topology.h
index 96bcea5..5428f33 100644
--- a/arch/microblaze/include/asm/topology.h
+++ b/arch/microblaze/include/asm/topology.h
@@ -1,11 +1 @@
 #include asm-generic/topology.h
-
-#ifndef _ASM_MICROBLAZE_TOPOLOGY_H
-#define _ASM_MICROBLAZE_TOPOLOGY_H
-
-struct device_node;
-static inline int of_node_to_nid(struct device_node *device)
-{
-   return 0;
-}
-#endif /* _ASM_MICROBLAZE_TOPOLOGY_H */
diff --git a/arch/powerpc/include/asm/prom.h b/arch/powerpc/include/asm/prom.h
index da7dd63..55bccc0 100644
--- a/arch/powerpc/include/asm/prom.h
+++ b/arch/powerpc/include/asm/prom.h
@@ -103,6 +103,13 @@ struct device_node *of_find_next_cache_node(struct 
device_node *np);
 /* Get the MAC address */
 extern const void *of_get_mac_address(struct device_node *np);
 
+#ifdef CONFIG_NUMA
+extern int of_node_to_nid(struct device_node *device);
+#else
+static inline int of_node_to_nid(struct device_node *device) { return 0; }
+#endif
+#define of_node_to_nid of_node_to_nid
+
 /**
  * of_irq_map_pci - Resolve the interrupt for a PCI device
  * @pdev:  the device whose interrupt is to be resolved
diff --git a/arch/powerpc/include/asm/topology.h 
b/arch/powerpc/include/asm/topology.h
index 32adf72..09dd38c 100644
--- a/arch/powerpc/include/asm/topology.h
+++ b/arch/powerpc/include/asm/topology.h
@@ -41,8 +41,6 @@ static inline int cpu_to_node(int cpu)
   cpu_all_mask :   \
   node_to_cpumask_map[node])
 
-int of_node_to_nid(struct device_node *device);
-
 struct pci_bus;
 #ifdef CONFIG_PCI
 extern int pcibus_to_node(struct pci_bus *bus);
@@ -94,11 +92,6 @@ extern void sysfs_remove_device_from_node(struct sys_device 
*dev, int nid);
 
 #else
 
-static inline int of_node_to_nid(struct device_node *device)
-{
-   return 0;
-}
-
 static inline void dump_numa_cpu_topology(void) {}
 
 static inline int sysfs_add_device_to_node(struct sys_device *dev, int nid)
diff --git a/arch/sparc/include/asm/prom.h b/arch/sparc/include/asm/prom.h
index c82a7da..291f125 100644
--- a/arch/sparc/include/asm/prom.h
+++ b/arch/sparc/include/asm/prom.h
@@ -43,8 +43,7 @@ extern int of_getintprop_default(struct device_node *np,
 extern int of_find_in_proplist(const char *list, const char *match, int len);
 #ifdef CONFIG_NUMA
 extern int of_node_to_nid(struct device_node *dp);
-#else
-#define of_node_to_nid(dp) (-1)
+#define of_node_to_nid of_node_to_nid
 #endif
 
 extern void prom_build_devicetree(void);
diff --git a/include/linux/of.h b/include/linux/of.h
index b0756f3..cad7cf0 100644
--- a/include/linux/of.h
+++ b/include/linux/of.h
@@ -146,6 +146,11 @@ static inline unsigned long of_read_ulong(const __be32 
*cell, int size)
 
 #define OF_BAD_ADDR((u64)-1)
 
+#ifndef of_node_to_nid
+static inline int of_node_to_nid(struct device_node *np) { return -1; }
+#define of_node_to_nid of_node_to_nid
+#endif
+
 extern struct device_node *of_find_node_by_name(struct device_node *from,
const char *name);
 #define for_each_node_by_name(dn, name) \

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


[PATCH] of/address: Clean up function declarations

2010-07-29 Thread Grant Likely
This patch moves the declaration of of_get_address(), of_get_pci_address(),
and of_pci_address_to_resource() out of arch code and into the common
linux/of_address header file.

This patch also fixes some of the asm/prom.h ordering issues.  It still
includes some header files that it ideally shouldn't be, but at least the
ordering is consistent now so that of_* overrides work.

Signed-off-by: Grant Likely grant.lik...@secretlab.ca
---
 arch/microblaze/include/asm/prom.h|   33 +++-
 arch/powerpc/include/asm/prom.h   |   49 +++--
 arch/powerpc/kernel/legacy_serial.c   |1 +
 arch/powerpc/kernel/pci-common.c  |1 +
 arch/powerpc/platforms/52xx/lite5200.c|1 +
 arch/powerpc/platforms/amigaone/setup.c   |3 +-
 arch/powerpc/platforms/iseries/mf.c   |1 +
 arch/powerpc/platforms/powermac/feature.c |2 +
 drivers/char/bsr.c|1 +
 drivers/net/fsl_pq_mdio.c |1 +
 drivers/net/xilinx_emaclite.c |2 +
 drivers/serial/uartlite.c |1 +
 drivers/spi/mpc512x_psc_spi.c |1 +
 drivers/spi/mpc52xx_psc_spi.c |1 +
 drivers/spi/xilinx_spi_of.c   |1 +
 drivers/usb/gadget/fsl_qe_udc.c   |1 +
 drivers/video/controlfb.c |2 +
 drivers/video/offb.c  |3 +-
 include/linux/of_address.h|   32 +++
 19 files changed, 74 insertions(+), 63 deletions(-)

diff --git a/arch/microblaze/include/asm/prom.h 
b/arch/microblaze/include/asm/prom.h
index cb9c3dd..101fa09 100644
--- a/arch/microblaze/include/asm/prom.h
+++ b/arch/microblaze/include/asm/prom.h
@@ -20,11 +20,6 @@
 #ifndef __ASSEMBLY__
 
 #include <linux/types.h>
-#include <linux/of_address.h>
-#include <linux/of_irq.h>
-#include <linux/of_fdt.h>
-#include <linux/proc_fs.h>
-#include <linux/platform_device.h>
 #include <asm/irq.h>
 #include <asm/atomic.h>
 
@@ -52,25 +47,9 @@ extern void pci_create_OF_bus_map(void);
  * OF address retreival  translation
  */
 
-/* Extract an address from a device, returns the region size and
- * the address space flags too. The PCI version uses a BAR number
- * instead of an absolute index
- */
-extern const u32 *of_get_address(struct device_node *dev, int index,
-   u64 *size, unsigned int *flags);
-extern const u32 *of_get_pci_address(struct device_node *dev, int bar_no,
-   u64 *size, unsigned int *flags);
-
-extern int of_pci_address_to_resource(struct device_node *dev, int bar,
-   struct resource *r);
-
 #ifdef CONFIG_PCI
 extern unsigned long pci_address_to_pio(phys_addr_t address);
-#else
-static inline unsigned long pci_address_to_pio(phys_addr_t address)
-{
-   return (unsigned long)-1;
-}
+#define pci_address_to_pio pci_address_to_pio
 #endif /* CONFIG_PCI */
 
 /* Parse the ibm,dma-window property of an OF node into the busno, phys and
@@ -99,8 +78,18 @@ extern const void *of_get_mac_address(struct device_node 
*np);
  * resolving using the OF tree walking.
  */
 struct pci_dev;
+struct of_irq;
 extern int of_irq_map_pci(struct pci_dev *pdev, struct of_irq *out_irq);
 
 #endif /* __ASSEMBLY__ */
 #endif /* __KERNEL__ */
+
+/* These includes are put at the bottom because they may contain things
+ * that are overridden by this file.  Ideally they shouldn't be included
+ * by this file, but there are a bunch of .c files that currently depend
+ * on it.  Eventually they will be cleaned up. */
+#include <linux/of_fdt.h>
+#include <linux/of_irq.h>
+#include <linux/platform_device.h>
+
 #endif /* _ASM_MICROBLAZE_PROM_H */
diff --git a/arch/powerpc/include/asm/prom.h b/arch/powerpc/include/asm/prom.h
index 55bccc0..ae26f2e 100644
--- a/arch/powerpc/include/asm/prom.h
+++ b/arch/powerpc/include/asm/prom.h
@@ -17,11 +17,6 @@
  * 2 of the License, or (at your option) any later version.
  */
 #include <linux/types.h>
-#include <linux/of_fdt.h>
-#include <linux/of_address.h>
-#include <linux/of_irq.h>
-#include <linux/proc_fs.h>
-#include <linux/platform_device.h>
 #include <asm/irq.h>
 #include <asm/atomic.h>
 
@@ -49,41 +44,9 @@ extern void pci_create_OF_bus_map(void);
 extern u64 of_translate_dma_address(struct device_node *dev,
const u32 *in_addr);
 
-/* Extract an address from a device, returns the region size and
- * the address space flags too. The PCI version uses a BAR number
- * instead of an absolute index
- */
-extern const u32 *of_get_address(struct device_node *dev, int index,
-  u64 *size, unsigned int *flags);
-#ifdef CONFIG_PCI
-extern const u32 *of_get_pci_address(struct device_node *dev, int bar_no,
-  u64 *size, unsigned int *flags);
-#else
-static inline const u32 *of_get_pci_address(struct device_node *dev,
-   int bar_no, u64 *size, unsigned int *flags)
-{
-   return NULL;
-}
-#endif /* 

Re: Commit 3da34aa breaks MSI support on MPC8308 (possibly all MPC83xx) [REPOST]

2010-07-29 Thread Wolfgang Denk
Dear Kumar  Kim,

any comments on this issue?

Thanks.

In message 4c48b384.1020...@emcraft.com Ilya Yanok wrote:
   Hi Kumar, Kim, Josh, everybody,
 
 I hope to disturb you but I haven't got any reply for my first posting...
 
 I've found that MSI work correctly with older kernels on my MPC8308RDB 
 board and don't work with newer ones. After bisecting I've found that 
 the source of the problem is commit 3da34aa:
 
 commit 3da34aae03d498ee62f75aa7467de93cce3030fd
 Author: Kumar Gala ga...@kernel.crashing.org
 Date:   Tue May 12 15:51:56 2009 -0500
 
  powerpc/fsl: Support unique MSI addresses per PCIe Root Complex
 
  Its feasible based on how the PCI address map is setup that the region
  of PCI address space used for MSIs differs for each PHB on the same 
 SoC.
 
  Instead of assuming that the address maps to CCSRBAR 1:1 we read
  PEXCSRBAR (BAR0) for the PHB that the given pci_dev is on.
 
  Signed-off-by: Kumar Gala ga...@kernel.crashing.org
 
 I can see BAR0 initialization for 85xx/86xx hardware but not for 83xx 
 neither in the kernel nor in U-Boot (that makes me think that all 83xx 
 can be affected).
 I'm not actually an PCI expert so I've just tried to write IMMR base 
 address to the BAR0 register from the U-Boot to get the correct address 
 but this doesn't help.
 Please direct me how to init 83xx PCIE controller to make it compatible 
 with this patch.
 
 Kim, I think MPC8315E is affected too, could you please test it?
 
 Thanks in advance.
 
 Regards, Ilya.

Best regards,

Wolfgang Denk

-- 
DENX Software Engineering GmbH, MD: Wolfgang Denk  Detlev Zundel
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-10 Fax: (+49)-8142-66989-80 Email: w...@denx.de
A good aphorism is too hard for the tooth of time, and  is  not  worn
away  by  all  the  centuries,  although  it serves as food for every
epoch.  - Friedrich Wilhelm Nietzsche
  _Miscellaneous Maxims and Opinions_ no. 168
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [PATCH v2] powerpc/kexec: Fix orphaned offline CPUs across kexec

2010-07-29 Thread Michael Neuling


In message 4c511216.30...@ozlabs.org you wrote:
 When CPU hotplug is used, some CPUs may be offline at the time a kexec is
 performed.  The subsequent kernel may expect these CPUs to be already running,
 and will declare them stuck.  On pseries, there's also a soft-offline (cede)
 state that CPUs may be in; this can also cause problems as the kexeced kernel
 may ask RTAS if they're online -- and RTAS would say they are.  Again, stuck.
 
 This patch kicks each present offline CPU awake before the kexec, so that
 none are lost to these assumptions in the subsequent kernel.

There are a lot of cleanups in this patch.  The change you are making
would be a lot clearer without all the additional cleanups in there.  I
think I'd like to see this as two patches.  One for cleanups and one for
the addition of wake_offline_cpus().

Other than that, I'm not completely convinced this is the functionality
we want.  Do we really want to online these cpus?  Why where they
offlined in the first place?  I understand the stuck problem, but is the
solution to online them, or to change the device tree so that the second
kernel doesn't detect them as stuck?  

Mikey

 
 Signed-off-by: Matt Evans m...@ozlabs.org
 ---
 v2:   Added FIXME comment noting a possible problem with incorrectly
   started secondary CPUs, following feedback from Milton.
 
  arch/powerpc/kernel/machine_kexec_64.c |   55 --
-
  1 files changed, 49 insertions(+), 6 deletions(-)
 
 diff --git a/arch/powerpc/kernel/machine_kexec_64.c b/arch/powerpc/kernel/machine_kexec_64.c
 index 4fbb3be..37f805e 100644
 --- a/arch/powerpc/kernel/machine_kexec_64.c
 +++ b/arch/powerpc/kernel/machine_kexec_64.c
 @@ -15,6 +15,8 @@
  #include <linux/thread_info.h>
  #include <linux/init_task.h>
  #include <linux/errno.h>
 +#include <linux/kernel.h>
 +#include <linux/cpu.h>
  
  #include <asm/page.h>
  #include <asm/current.h>
 @@ -181,7 +183,20 @@ static void kexec_prepare_cpus_wait(int wait_state)
   int my_cpu, i, notified=-1;
  
   my_cpu = get_cpu();
 - /* Make sure each CPU has atleast made it to the state we need */
 + /* Make sure each CPU has at least made it to the state we need.
 +  *
 +  * FIXME: There is a (slim) chance of a problem if not all of the CPUs
 +  * are correctly onlined.  If somehow we start a CPU on boot with RTAS
 +  * start-cpu, but somehow that CPU doesn't write callin_cpu_map[] in
 +  * time, the boot CPU will timeout.  If it does eventually execute
 +  * stuff, the secondary will start up (paca[].cpu_start was written) and
 +  * get into a peculiar state.  If the platform supports
 +  * smp_ops->take_timebase(), the secondary CPU will probably be spinning
 +  * in there.  If not (i.e. pseries), the secondary will continue on and
 +  * try to online itself/idle/etc. If it survives that, we need to find
 +  * these possible-but-not-online-but-should-be CPUs and chaperone them
 +  * into kexec_smp_wait().
 +  */
   for_each_online_cpu(i) {
   if (i == my_cpu)
   continue;
 @@ -189,9 +204,9 @@ static void kexec_prepare_cpus_wait(int wait_state)
   while (paca[i].kexec_state  wait_state) {
   barrier();
   if (i != notified) {
 -				printk( "kexec: waiting for cpu %d (physical"
 -						" %d) to enter %i state\n",
 -					i, paca[i].hw_cpu_id, wait_state);
 +				printk(KERN_INFO "kexec: waiting for cpu %d "
 +				       "(physical %d) to enter %i state\n",
 +				       i, paca[i].hw_cpu_id, wait_state);
   notified = i;
   }
   }
 @@ -199,9 +214,32 @@ static void kexec_prepare_cpus_wait(int wait_state)
   mb();
  }
  
 -static void kexec_prepare_cpus(void)
 +/*
 + * We need to make sure each present CPU is online.  The next kernel will scan
 + * the device tree and assume primary threads are online and query secondary
 + * threads via RTAS to online them if required.  If we don't online primary
 + * threads, they will be stuck.  However, we also online secondary threads as we
 + * may be using 'cede offline'.  In this case RTAS doesn't see the secondary
 + * threads as offline -- and again, these CPUs will be stuck.
 + *
 + * So, we online all CPUs that should be running, including secondary threads.
 + */
 +static void wake_offline_cpus(void)
  {
 + int cpu = 0;
  
 + for_each_present_cpu(cpu) {
 + if (!cpu_online(cpu)) {
 + printk(KERN_INFO kexec: Waking offline cpu %d.\n,
 +cpu);
 + cpu_up(cpu);
 + }
 + }
 +}
 +
 +static void kexec_prepare_cpus(void)
 +{
 + wake_offline_cpus();
   smp_call_function(kexec_smp_down, NULL, /* wait */0);
   

Re: [PATCH 1/2 v1.03] Add support for DWC OTG HCD function.

2010-07-29 Thread Feng Kan
Hi Greg:

We will change to a BSD 3 clause license header. Our legal counsel is
talking to Synopsis to make this change. We will resubmit once this
is in place. Please let me know if you have any additional concerns.

Feng Kan
Applied Micro

On Mon, Jul 26, 2010 at 4:16 PM, Greg KH gre...@suse.de wrote:
 On Mon, Jul 26, 2010 at 04:05:49PM -0700, Feng Kan wrote:
 Hi Greg:

 We are having our legal revisit this again. What would you advise us
 to do at this point?

 I thought I was very clear below as to what is needed.

 Disclose the agreement or have someone with legal authority reply this
 thread.

 Neither will resolve the end issue, right?

 Perhaps something in the header that states Applied Micro verified
 with Synopsys to use this code for GPL purpose.

 No, that will just make it messier.  Someone needs to delete all of the
 mess in the file, put the proper license information for what the code
 is being licensed under (whatever it is), and provide a signed-off-by
 from a person from Synopsys and APM that can speak for the company that
 they agree that the code can properly be placed into the Linux kernel.

 thanks,

 greg k-h




-- 
Feng Kan
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH v2] powerpc/kexec: Fix orphaned offline CPUs across kexec

2010-07-29 Thread Matt Evans
Michael Neuling wrote:
 In message 4c511216.30...@ozlabs.org you wrote:
 When CPU hotplug is used, some CPUs may be offline at the time a kexec is
 performed.  The subsequent kernel may expect these CPUs to be already running,
 and will declare them stuck.  On pseries, there's also a soft-offline (cede)
 state that CPUs may be in; this can also cause problems as the kexeced kernel
 may ask RTAS if they're online -- and RTAS would say they are.  Again, stuck.

 This patch kicks each present offline CPU awake before the kexec, so that
 none are lost to these assumptions in the subsequent kernel.
 
 There are a lot of cleanups in this patch.  The change you are making
 would be a lot clearer without all the additional cleanups in there.  I
 think I'd like to see this as two patches.  One for cleanups and one for
 the addition of wake_offline_cpus().

Okay, I can split this.  Typofixy-add-debug in one, wake_offline_cpus in 
another.

 Other than that, I'm not completely convinced this is the functionality
 we want.  Do we really want to online these cpus?  Why where they
 offlined in the first place?  I understand the stuck problem, but is the
 solution to online them, or to change the device tree so that the second
 kernel doesn't detect them as stuck?

Well... There are two cases.  If a CPU is soft-offlined on pseries, it must be 
woken from that cede loop (in platforms/pseries/hotplug-cpu.c) as we're 
replacing code under its feet.  We could either special-case the wakeup from 
this cede loop to get that CPU to RTAS stop-self itself properly.  (Kind of 
like a wake to die.)

So that leaves hard-offline CPUs (perhaps including the above): I don't know 
why they might have been offlined.  If it's something serious, like fire, 
they'd be removed from the present set too (and thus not be considered in this 
restarting case).  We could add a mask to the CPU node to show which of the 
threads (if any) are running, and alter the startup code to start everything if 
this mask doesn't exist (non-kexec) or only online currently-running threads if 
the mask is present.  That feels a little weird.

My reasoning for restarting everything was:  The first time you boot, all of 
your present CPUs are started up.  When you reboot, any CPUs you offlined for 
fun are restarted.  Kexec is (in this non-crash sense) a user-initiated 'quick 
reboot', so I reasoned that it should look the same as a 'hard reboot' and your 
new invocation would have all available CPUs running as is usual.


Cheers,


Matt
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [PATCH 0/2 v1.04] Add support for DWC OTG driver.

2010-07-29 Thread David Daney

On 07/28/2010 05:28 PM, Fushen Chen wrote:

[PATCH 1/2 v1.04]

.
.
.

PATCH 1/2 seems to not have made it to linux-...@vger.kernel.org.  I 
suspect that a spam filter got it.


Could you remove whatever there is in the patch that triggers the 
filter?  Or failing that, change the filter so we can all see the patch?


Thanks,
David Daney
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [PATCH 1/2 v1.03] Add support for DWC OTG HCD function.

2010-07-29 Thread Greg KH
On Thu, Jul 29, 2010 at 05:14:59PM -0700, Feng Kan wrote:
 Hi Greg:
 
 We will change to a BSD 3 clause license header. Our legal counsel is
 talking to Synopsis to make this change.

Why BSD?  You do realize what that means when combined within the body
of the kernel, right?

Are you going to be expecting others to contribute back to the code
under this license, or will you accept the fact that future
contributions from the community will cause the license to change?

 We will resubmit once this is in place. Please let me know if you have
 any additional concerns.

My main concern is that you, and everyone else involved in the driver,
never considered the license of the code in the first place and expected
the kernel community to accept it as-is, placing the problem on us.

What will be done in the future to prevent this from happening again?

thanks,

greg k-h
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [PATCH 1/2 v1.03] Add support for DWC OTG HCD function.

2010-07-29 Thread Feng Kan
Hi Greg:

On Thu, Jul 29, 2010 at 5:50 PM, Greg KH gre...@suse.de wrote:
 On Thu, Jul 29, 2010 at 05:14:59PM -0700, Feng Kan wrote:
 Hi Greg:

 We will change to a BSD 3 clause license header. Our legal counsel is
 talking to Synopsis to make this change.

 Why BSD?  You do realize what that means when combined within the body
 of the kernel, right?


FKAN: We will shoot for a dual BSD/GPL license such as the one in the HP
   Hil driver.

 Are you going to be expecting others to contribute back to the code
 under this license, or will you accept the fact that future
 contributions from the community will cause the license to change?

 We will resubmit once this is in place. Please let me know if you have
 any additional concerns.

 My main concern is that you, and everyone else involved in the driver,
 never considered the license of the code in the first place and expected
 the kernel community to accept it as-is, placing the problem on us.

FKAN: Please don't think this is the case, we gone through this exercise
  with Denx. We had legal looking into the header before submission
  to them and the kernel.


 What will be done in the future to prevent this from happening again?

FKAN: agreed, once bitten  :)


 thanks,

 greg k-h




-- 
Feng Kan
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH 1/2 v1.03] Add support for DWC OTG HCD function.

2010-07-29 Thread Greg KH
On Thu, Jul 29, 2010 at 06:19:25PM -0700, Feng Kan wrote:
 Hi Greg:
 
 On Thu, Jul 29, 2010 at 5:50 PM, Greg KH gre...@suse.de wrote:
  On Thu, Jul 29, 2010 at 05:14:59PM -0700, Feng Kan wrote:
  Hi Greg:
 
  We will change to a BSD 3 clause license header. Our legal counsel is
  talking to Synopsis to make this change.
 
  Why BSD?  You do realize what that means when combined within the body
  of the kernel, right?
 
 
 FKAN: We will shoot for a dual BSD/GPL license such as the one in the HP
Hil driver.

What specific driver is this?

And are you sure that all of the contributors to the code agree with
this licensing change?  Are you going to require contributors to
dual-license their changes?

If so, why keep it BSD, what does that get you?

  Are you going to be expecting others to contribute back to the code
  under this license, or will you accept the fact that future
  contributions from the community will cause the license to change?


You didn't answer this question, which is a very important one before I
can accept this driver.

  We will resubmit once this is in place. Please let me know if you have
  any additional concerns.
 
  My main concern is that you, and everyone else involved in the driver,
  never considered the license of the code in the first place and expected
  the kernel community to accept it as-is, placing the problem on us.
 
 FKAN: Please don't think this is the case, we gone through this exercise
   with Denx.

What is Denx?

 We had legal looking into the header before submission
   to them and the kernel.

Then what happened here?  Just curious as to how the driver was public
for so long before someone realized this.

  What will be done in the future to prevent this from happening again?
 
 FKAN: agreed, once bitten  :)

That didn't answer the question :)

thanks,

greg k-h
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [PATCH V4] powerpc/prom: Export device tree physical address via proc

2010-07-29 Thread David Gibson
On Thu, Jul 15, 2010 at 11:39:21AM -0500, Matthew McClintock wrote:
 On Thu, 2010-07-15 at 10:22 -0600, Grant Likely wrote:
   Thanks for taking a look. My first thought was to just blow away all
  the
   memreserve regions and start over. But, there are reserve regions
  for
   other things that I might not want to blow away. For example, on
  mpc85xx
   SMP systems we have an additional reserve region for our boot page.
  
  What is your starting point?  Where does the device tree (and
  memreserve list) come from
  that you're passing to kexec?  My first impression is that if you have
  to scrub the memreserve list, then the source being used to
  obtain the memreserves is either faulty or unsuitable to the task. 
 
 I'm pulling the device tree passed in via u-boot and passing it to
 kexec. It is the most complete device tree and requires the least amount
 of fixup.
 
 I have to scrub two items, the ramdisk/initrd and the device tree
 because upon kexec'ing the kernel we have the ability to pass in new
 ramdisk/initrd and device tree. They can also live at different
 physical addresses for the second reboot.

 The initrd addresses are already exposed, so we can
 update/remove/reuse that entry, we just need a way for kexec to
 determine the current device tree address so it can replace the
 correct memreserve region for the kexec'ing kernels' device tree.

Ok, be careful with this.  You do have the information you need, but
you might have to split an existing entry.  Having a single reserve
entry to cover the initrd would be typical, but it doesn't have to
happen that way - e.g. if a firmware reserves a big region for its own
purposes, and places the initrd within that region.

Also, the latest specs do *not* require the device tree itself to be
mem reserved.

 The whole problem comes from repeatedly kexec'ing, we need to make
 sure we don't keep losing blobs of memory to reserve regions (so we
 can't just blindly add). We also need to make sure we don't lose
 other memreserve regions that might be important for other things
 (so we can't just blow them all away).

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [PATCH V4] powerpc/prom: Export device tree physical address via proc

2010-07-29 Thread David Gibson
On Thu, Jul 15, 2010 at 01:18:21PM -0600, Grant Likely wrote:
 On Thu, Jul 15, 2010 at 12:58 PM, Matthew McClintock m...@freescale.com 
 wrote:
  On Thu, 2010-07-15 at 12:37 -0600, Grant Likely wrote:
  On Thu, Jul 15, 2010 at 12:03 PM, Matthew McClintock m...@freescale.com 
  wrote:
   Yes. Where would we get a list of memreserve sections?
 
  I would say the list of reserves that are not under the control of
  Linux should be explicitly described in the device tree proper.  For
  instance, if you have a region that firmware depends on, then have a
  node for describing the firmware and a property stating the memory
  regions that it depends on.  The memreserve regions can be generated
  from that.
 
  Ok, so we could traverse the tree node-by-bode for a
  persistent-memreserve property and add them to the /memreserve/ list in
  the kexec user space tools?

Well.. I don't think it should be this way as a matter of spec.  But
you could use a property as an interim stash for memreserve
information.

I agree that the precise defined semantics of the memreserve regions
is kind of fuzzy and non-obvious.  Here's how I believe they need to
work:

memory in a reserved region must *never* be touched by the OS
(or subsequent kexec-invoked OSes) unless something else in the device
tree explicitly instructs it how

There already exist several mechanisms for instructing the OS to use
particular reserved regions for particular purposes: e.g. the initrd
properties, and the spin-table properties.  More such mechanisms might
be added in future ePAPR (or whatever) revisions.  But if the OS
version doesn't understand such a future mechanism, it must fall back
to assuming that the memory is reserved in perpetuity.

Now, some of these mechanisms (implicitly) permit the OS to re-use the
reserved memory after it's done using them as instructed (initrd is
the most obvious one).  In that case the OS can re-add the reserved
space to it's general pools, and excise it from the reserved space for
subsequent kexec()-style boots.  However that's (potentially) a more
complex process than just removing an entry - the initial firmware is
free to combine adjacent reserved regions into one reserve entry, or
even to cover a single reserved region with multiple entries.  So in
order to do this manipulation you will need an allocator of sorts that
does the region reservation/dereservation correctly handling the
semantics on a byte-by-byte basis.
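
To make the splitting concern concrete, releasing a sub-range can turn one
reserve entry into two, so a simple "delete entry N" pass is not enough. A
minimal userspace-style sketch (names and array handling invented purely for
illustration):

	struct resv { unsigned long long start, size; };

	/* Release [rstart, rstart+rsize) from entry i; assumes the range lies
	 * inside the entry and that r[] has room for one extra element.
	 * Returns the new entry count. */
	static int resv_release(struct resv *r, int n, int i,
				unsigned long long rstart,
				unsigned long long rsize)
	{
		unsigned long long end = r[i].start + r[i].size;
		unsigned long long rend = rstart + rsize;

		if (rstart == r[i].start && rend == end) {	/* drop whole entry */
			r[i] = r[n - 1];
			return n - 1;
		}
		if (rstart == r[i].start) {			/* trim the head */
			r[i].start = rend;
			r[i].size = end - rend;
			return n;
		}
		if (rend == end) {				/* trim the tail */
			r[i].size = rstart - r[i].start;
			return n;
		}
		/* hole in the middle: shrink the head, append the tail */
		r[i].size = rstart - r[i].start;
		r[n].start = rend;
		r[n].size = end - rend;
		return n + 1;
	}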

You should also be careful that the regions you're handling do
actually lie in memory space.  Linux doesn't support this right now,
but I do have an experimental patch that allows the initrd properties
to point to (e.g.) flash instead of RAM.  In that case the initrd
wouldn't have to lie in an explicitly reserved region, and obviously
could not be returned to the general pool after use.
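
So before handing a de-reserved range back to the free pool, you would
want to clip it against the actual memory ranges first, along these lines
(a sketch only; the mem[] table and release() helper are assumed to exist
elsewhere, e.g. built from the /memory reg properties):

#include <stdint.h>

struct range { uint64_t start, end; };		/* [start, end) */

extern struct range mem[];	/* RAM ranges from the device tree */
extern int nr_mem;
extern void release(uint64_t start, uint64_t size);

/* Free only the RAM-backed parts of [start, end); flash etc. stays put. */
void release_if_ram(uint64_t start, uint64_t end)
{
	int i;

	for (i = 0; i < nr_mem; i++) {
		uint64_t s = start > mem[i].start ? start : mem[i].start;
		uint64_t e = end < mem[i].end ? end : mem[i].end;

		if (s < e)
			release(s, e - s);
	}
}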

 I *think* that is okay, but I'd like to hear from Segher, Ben, Mitch,
 David Gibson, and other device tree experts on whether or not that
 exact property naming is a good one.
 
 Write up a proposed binding (you can use devicetree.org).  Post it for
 review (make sure you cc: both devicetree-discuss and linuxppc-dev, as
 well as cc'ing the people listed above.)
 
   Should we export
   the reserve sections instead of the device tree location?
 
  It shouldn't really be something that the kernel is explicitly
  exporting because it is a characteristic of the board design.  It is
  something that belongs in the tree-proper.  ie. when you extract the
  tree you have data telling what the region is, and why it is reserved.
 
  Agreed.
 
 
   We just need a
   way to preserve what was there at boot to pass to the new kernel.
 
  Yet there is no differentiation between the board-dictated memory
  reserves and the things that U-Boot/Linux made an arbitrary decision
  on.  The solution should focus not on "can I throw this one away?" but
  rather "Is this one I should keep?"  :-)  A subtle difference, I know,
  but it changes the way you approach the solution.
 
  Fair enough. I think the above solution will work nicely, and I can
  start implementing something if you agree - if I interpreted your idea
  correctly. Although it should not require any changes to the kernel
  proper.
 
 Correct.
 
 g.
 ___
 Linuxppc-dev mailing list
 Linuxppc-dev@lists.ozlabs.org
 https://lists.ozlabs.org/listinfo/linuxppc-dev
 

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


Re: [PATCH 1/2 v1.03] Add support for DWC OTG HCD function.

2010-07-29 Thread Feng Kan
On Thu, Jul 29, 2010 at 6:26 PM, Greg KH gre...@suse.de wrote:
 On Thu, Jul 29, 2010 at 06:19:25PM -0700, Feng Kan wrote:
 Hi Greg:

 On Thu, Jul 29, 2010 at 5:50 PM, Greg KH gre...@suse.de wrote:
  On Thu, Jul 29, 2010 at 05:14:59PM -0700, Feng Kan wrote:
  Hi Greg:
 
  We will change to a BSD 3 clause license header. Our legal counsel is
   talking to Synopsys to make this change.
 
  Why BSD?  You do realize what that means when combined within the body
  of the kernel, right?
 

 FKAN: We will shoot for a dual BSD/GPL license such as the one in the HP
            Hil driver.

 What specific driver is this?

FKAN: this is driver/input/serio/hil_mlc.c and quite a number of others.
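
For reference, the dual licence in those drivers comes down to matching
licence text in the file header plus the module tag, roughly like this
(illustrative sketch only, not the actual DWC OTG header):

/* Sketch of how a dual-licensed kernel driver typically declares itself. */
#include <linux/module.h>

MODULE_AUTHOR("Example Author <author@example.com>");
MODULE_DESCRIPTION("DWC OTG HCD (example declaration only)");
MODULE_LICENSE("Dual BSD/GPL");	/* string recognised by the module loader */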


 And are you sure that all of the contributors to the code agree with
 this licensing change?  Are you going to require contributors to
 dual-license their changes?

 If so, why keep it BSD, what does that get you?

FKAN: for one thing, to make it future proof on other submissions.


  Are you going to be expecting others to contribute back to the code
  under this license, or will you accept the fact that future
  contributions from the community will cause the license to change?


 You didn't answer this question, which is a very important one before I
 can accept this driver.

FKAN: Yes, all of the above. Our legal is working on that. I thought by default
   GPL defines the above statement.


  We will resubmit once this is in place. Please let me know if you have
  any additional concerns.
 
  My main concern is that you, and everyone else involved in the driver,
  never considered the license of the code in the first place and expected
  the kernel community to accept it as-is, placing the problem on us.

  FKAN: Please don't think this is the case, we've gone through this exercise
           with Denx.

 What is Denx?

FKAN: U-Boot Denx.de


 We had legal looking into the header before submission
           to them and the kernel.

 Then what happened here?  Just curious as to how the driver was public
 for so long before someone realized this.


FKAN:  this was a few years back. At the time we had the header changed
   so it was BSD-like to be accepted by Denx.

  What will be done in the future to prevent this from happening again?

 FKAN: agreed, once bitten  :)

 That didn't answer the question :)

FKAN: we have a system of checks for every patch that goes out. I will send
   out a guideline to all reviewers to make sure the header
follows kernel precedent.
   Legal is quite aware of the issue now too.


 thanks,

 greg k-h




-- 
Feng Kan

Re: [PATCH v2] powerpc/kexec: Fix orphaned offline CPUs across kexec

2010-07-29 Thread Michael Neuling
(adding kexec list to CC)

In message 4c521fd2.4050...@ozlabs.org you wrote:
 Michael Neuling wrote:
  In message 4c511216.30...@ozlabs.org you wrote:
  When CPU hotplug is used, some CPUs may be offline at the time a kexec is
  performed.  The subsequent kernel may expect these CPUs to be already running,
  and will declare them stuck.  On pseries, there's also a soft-offline (cede)
  state that CPUs may be in; this can also cause problems as the kexeced kernel
  may ask RTAS if they're online -- and RTAS would say they are.  Again, stuck.
 
  This patch kicks each present offline CPU awake before the kexec, so that
  none are lost to these assumptions in the subsequent kernel.
  
  There are a lot of cleanups in this patch.  The change you are making
  would be a lot clearer without all the additional cleanups in there.  I
  think I'd like to see this as two patches.  One for cleanups and one for
  the addition of wake_offline_cpus().
 
 Okay, I can split this.  Typofixy-add-debug in one, wake_offline_cpus
 in another. 

Thanks.

 
  Other than that, I'm not completely convinced this is the functionality
  we want.  Do we really want to online these cpus?  Why were they
  offlined in the first place?  I understand the stuck problem, but is the
  solution to online them, or to change the device tree so that the second
  kernel doesn't detect them as stuck?
 
 Well... There are two cases.  If a CPU is soft-offlined on pseries, it
 must be woken from that cede loop (in
 platforms/pseries/hotplug-cpu.c) as we're replacing code under its
 feet.  We could either special-case the wakeup from this cede loop to
 get that CPU to RTAS stop-self itself properly.  (Kind of like a
 "wake to die".)

Makes sense.  

 So that leaves hard-offline CPUs (perhaps including the above): I
 don't know why they might have been offlined.  If it's something
 serious, like fire, they'd be removed from the present set too (and
 thus not be considered in this restarting case).  We could add a mask
 to the CPU node to show which of the threads (if any) are running, and
 alter the startup code to start everything if this mask doesn't exist
 (non-kexec) or only online currently-running threads if the mask is
 present.  That feels a little weird.
 
 My reasoning for restarting everything was: The first time you boot,
 all of your present CPUs are started up.  When you reboot, any CPUs
 you offlined for fun are restarted.  Kexec is (in this non-crash
 sense) a user-initiated 'quick reboot', so I reasoned that it should
 look the same as a 'hard reboot' and your new invocation would have
 all available CPUs running as is usual.

OK, I like this justification.  Would be good to include it in the
checkin comment since we're changing functionality somewhat.

Mikey


Re: [PATCH 1/2 v1.03] Add support for DWC OTG HCD function.

2010-07-29 Thread Greg KH
On Thu, Jul 29, 2010 at 07:02:44PM -0700, Feng Kan wrote:
 On Thu, Jul 29, 2010 at 6:26 PM, Greg KH gre...@suse.de wrote:
  On Thu, Jul 29, 2010 at 06:19:25PM -0700, Feng Kan wrote:
  Hi Greg:
 
  On Thu, Jul 29, 2010 at 5:50 PM, Greg KH gre...@suse.de wrote:
   On Thu, Jul 29, 2010 at 05:14:59PM -0700, Feng Kan wrote:
   Hi Greg:
  
   We will change to a BSD 3 clause license header. Our legal counsel is
    talking to Synopsys to make this change.
  
   Why BSD?  You do realize what that means when combined within the body
   of the kernel, right?
  
 
  FKAN: We will shoot for a dual BSD/GPL license such as the one in the HP
             Hil driver.
 
  What specific driver is this?
 
 FKAN: this is driver/input/serio/hil_mlc.c and quite a number of others.

Ok, thanks.

Are you _sure_ that you didn't take any existing GPL code and put it
into this driver when making it?  Did all contributors to the code
release their contributions under both licenses?

  And are you sure that all of the contributors to the code agree with
  this licensing change?  Are you going to require contributors to
  dual-license their changes?
 
  If so, why keep it BSD, what does that get you?
 
 FKAN: for one thing, to make it future proof on other submissions.

What do you mean by this?  What can you do with this code other than use
it on a Linux system?  You can't put it into any other operating system
with a different license, can you?

   Are you going to be expecting others to contribute back to the code
   under this license, or will you accept the fact that future
   contributions from the community will cause the license to change?
 
 
  You didn't answer this question, which is a very important one before I
  can accept this driver.
 
 FKAN: Yes, all of the above. Our legal is working on that. I thought by default
    GPL defines the above statement.

The GPL does, but as you are trying to dual-license the code, you have
to be careful about how you accept changes, and under what license.
It's a lot more work than I think you realize.  What process do you have
in place to handle this?

   We will resubmit once this is in place. Please let me know if you have
   any additional concerns.
  
   My main concern is that you, and everyone else involved in the driver,
   never considered the license of the code in the first place and expected
   the kernel community to accept it as-is, placing the problem on us.
 
  FKAN: Please don't think this is the case, we've gone through this exercise
             with Denx.
 
  What is Denx?
 
 FKAN: U-Boot Denx.de

Ah, thanks.

  We had legal looking into the header before submission
             to them and the kernel.
 
  Then what happened here?  Just curious as to how the driver was public
  for so long before someone realized this.
 
 
 FKAN:  this was a few years back. At the time we had the header changed
    so it was BSD-like to be accepted by Denx.
 
   What will be done in the future to prevent this from happening again?
 
  FKAN: agreed, once bitten  :)
 
  That didn't answer the question :)
 
 FKAN: we have a system of checks for every patch that goes out. I will send
    out a guideline to all reviewers to make sure the header
 follows kernel precedent.

But you took this code from a different vendor, are you able to properly
identify the code contributions to this base and what license it is
under and where they got it from?

Legal is quite aware of the issue now too.

As they should be :)

Please reconsider the dual licensing unless you really are ready to
handle the implications of it.

thanks,

greg k-h


Re: [PATCH v2] powerpc/kexec: Fix orphaned offline CPUs across kexec

2010-07-29 Thread Simon Horman
On Fri, Jul 30, 2010 at 01:15:14PM +1000, Michael Neuling wrote:
 (adding kexec list to CC)
 
 In message 4c521fd2.4050...@ozlabs.org you wrote:
  Michael Neuling wrote:
   In message 4c511216.30...@ozlabs.org you wrote:
    When CPU hotplug is used, some CPUs may be offline at the time a kexec is
    performed.  The subsequent kernel may expect these CPUs to be already running,
    and will declare them stuck.  On pseries, there's also a soft-offline (cede)
    state that CPUs may be in; this can also cause problems as the kexeced kernel
    may ask RTAS if they're online -- and RTAS would say they are.  Again, stuck.
  
   This patch kicks each present offline CPU awake before the kexec, so that
   none are lost to these assumptions in the subsequent kernel.
   
   There are a lot of cleanups in this patch.  The change you are making
   would be a lot clearer without all the additional cleanups in there.  I
   think I'd like to see this as two patches.  One for cleanups and one for
   the addition of wake_offline_cpus().
  
  Okay, I can split this.  Typofixy-add-debug in one, wake_offline_cpus
  in another. 
 
 Thanks.
 
  
   Other than that, I'm not completely convinced this is the functionality
    we want.  Do we really want to online these cpus?  Why were they
   offlined in the first place?  I understand the stuck problem, but is the
   solution to online them, or to change the device tree so that the second
   kernel doesn't detect them as stuck?
  
   Well... There are two cases.  If a CPU is soft-offlined on pseries, it
   must be woken from that cede loop (in
   platforms/pseries/hotplug-cpu.c) as we're replacing code under its
   feet.  We could either special-case the wakeup from this cede loop to
   get that CPU to RTAS stop-self itself properly.  (Kind of like a
   "wake to die".)
 
 Makes sense.  
 
  So that leaves hard-offline CPUs (perhaps including the above): I
  don't know why they might have been offlined.  If it's something
  serious, like fire, they'd be removed from the present set too (and
  thus not be considered in this restarting case).  We could add a mask
  to the CPU node to show which of the threads (if any) are running, and
  alter the startup code to start everything if this mask doesn't exist
  (non-kexec) or only online currently-running threads if the mask is
  present.  That feels a little weird.
  
  My reasoning for restarting everything was: The first time you boot,
  all of your present CPUs are started up.  When you reboot, any CPUs
  you offlined for fun are restarted.  Kexec is (in this non-crash
  sense) a user-initiated 'quick reboot', so I reasoned that it should
  look the same as a 'hard reboot' and your new invocation would have
  all available CPUs running as is usual.
 
 OK, I like this justification.  Would be good to include it in the
 checkin comment since we're changing functionality somewhat.

FWIW, I do too. Personally I like to think of kexec as soft-reboot.
Where soft means "in software", not "soft-touch".



[PATCH 3/3 v2] mmc: Add ESDHC weird voltage bits workaround

2010-07-29 Thread Roy Zang
The P4080 ESDHC controller does not support 1.8V and 3.0V, but the
host controller capabilities register wrongly sets these bits.
This patch adds a workaround to correct the wrongly-set voltage bits.

Signed-off-by: Roy Zang tie-fei.z...@freescale.com
---
This is the second version of patch 
http://patchwork.ozlabs.org/patch/60106/
According to the review comments, some unnecessary settings have been removed.

Together with patch
http://patchwork.ozlabs.org/patch/60111/
http://patchwork.ozlabs.org/patch/60116/

This series of patches adds MMC support for the P4080 silicon.

 drivers/mmc/host/sdhci-of-core.c |    4 ++++
 drivers/mmc/host/sdhci.c         |    8 ++++++++
 drivers/mmc/host/sdhci.h         |    4 ++++
 3 files changed, 16 insertions(+), 0 deletions(-)

diff --git a/drivers/mmc/host/sdhci-of-core.c b/drivers/mmc/host/sdhci-of-core.c
index 0c30242..1f3913d 100644
--- a/drivers/mmc/host/sdhci-of-core.c
+++ b/drivers/mmc/host/sdhci-of-core.c
@@ -164,6 +164,10 @@ static int __devinit sdhci_of_probe(struct of_device *ofdev,
	if (sdhci_of_wp_inverted(np))
		host->quirks |= SDHCI_QUIRK_INVERTED_WRITE_PROTECT;
 
+	if (of_device_is_compatible(np, "fsl,p4080-esdhc"))
+		host->quirks |= (SDHCI_QUIRK_QORIQ_NO_VDD_180
+				| SDHCI_QUIRK_QORIQ_NO_VDD_300);
+
	clk = of_get_property(np, "clock-frequency", &size);
	if (clk && size == sizeof(*clk) && *clk)
		of_host->clock = *clk;
diff --git a/drivers/mmc/host/sdhci.c b/drivers/mmc/host/sdhci.c
index 1424d08..a667790 100644
--- a/drivers/mmc/host/sdhci.c
+++ b/drivers/mmc/host/sdhci.c
@@ -1699,6 +1699,14 @@ int sdhci_add_host(struct sdhci_host *host)
 
caps = sdhci_readl(host, SDHCI_CAPABILITIES);
 
+	/* Workaround for P4080 host controller capabilities:
+	 * 1.8V and 3.0V are not supported. */
+	if (host->quirks & SDHCI_QUIRK_QORIQ_NO_VDD_180)
+		caps &= ~SDHCI_CAN_VDD_180;
+
+	if (host->quirks & SDHCI_QUIRK_QORIQ_NO_VDD_300)
+		caps &= ~SDHCI_CAN_VDD_300;
+
	if (host->quirks & SDHCI_QUIRK_FORCE_DMA)
		host->flags |= SDHCI_USE_SDMA;
	else if (!(caps & SDHCI_CAN_DO_SDMA))
diff --git a/drivers/mmc/host/sdhci.h b/drivers/mmc/host/sdhci.h
index aa112aa..389b58c 100644
--- a/drivers/mmc/host/sdhci.h
+++ b/drivers/mmc/host/sdhci.h
@@ -243,6 +243,10 @@ struct sdhci_host {
 #define SDHCI_QUIRK_NO_ENDATTR_IN_NOPDESC		(1<<26)
 /* Controller uses Auto CMD12 command to stop the transfer */
 #define SDHCI_QUIRK_MULTIBLOCK_READ_ACMD12		(1<<27)
+/* Controller cannot support 1.8V */
+#define SDHCI_QUIRK_QORIQ_NO_VDD_180			(1<<28)
+/* Controller cannot support 3.0V */
+#define SDHCI_QUIRK_QORIQ_NO_VDD_300			(1<<29)
 
int irq;/* Device IRQ */
void __iomem *  ioaddr; /* Mapped address */
-- 
1.5.6.5




[PATCH 0/2] powerpc/kexec: Fix orphaned offline CPUs, add comments/debug

2010-07-29 Thread Matt Evans
Separated the tidy-up comments & debug away from the fix of restarting offline
available CPUs before waiting for them on kexec.


Matt Evans (2):
  powerpc/kexec: Add to and tidy debug/comments in machine_kexec64.c
  powerpc/kexec: Fix orphaned offline CPUs across kexec

 arch/powerpc/kernel/machine_kexec_64.c |   55 ---
 1 files changed, 49 insertions(+), 6 deletions(-)



[PATCH 1/2] powerpc/kexec: Add to and tidy debug/comments in machine_kexec64.c

2010-07-29 Thread Matt Evans
Tidies some typos, KERN_INFO-ises an info msg, and adds a debug msg showing
when the final sequence starts.

Also adds a comment to kexec_prepare_cpus_wait() to make note of a possible
problem: the need for kexec to deal with CPUs that failed to originally start
up.

Signed-off-by: Matt Evans m...@ozlabs.org
---
 arch/powerpc/kernel/machine_kexec_64.c |   29 -
 1 files changed, 24 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/kernel/machine_kexec_64.c b/arch/powerpc/kernel/machine_kexec_64.c
index 4fbb3be..aa3d5cd 100644
--- a/arch/powerpc/kernel/machine_kexec_64.c
+++ b/arch/powerpc/kernel/machine_kexec_64.c
@@ -15,6 +15,7 @@
#include <linux/thread_info.h>
#include <linux/init_task.h>
#include <linux/errno.h>
+#include <linux/kernel.h>
 
#include <asm/page.h>
#include <asm/current.h>
@@ -181,7 +182,20 @@ static void kexec_prepare_cpus_wait(int wait_state)
int my_cpu, i, notified=-1;
 
my_cpu = get_cpu();
-   /* Make sure each CPU has atleast made it to the state we need */
+   /* Make sure each CPU has at least made it to the state we need.
+*
+* FIXME: There is a (slim) chance of a problem if not all of the CPUs
+* are correctly onlined.  If somehow we start a CPU on boot with RTAS
+* start-cpu, but somehow that CPU doesn't write callin_cpu_map[] in
+* time, the boot CPU will timeout.  If it does eventually execute
+* stuff, the secondary will start up (paca[].cpu_start was written) and
+* get into a peculiar state.  If the platform supports
+* smp_ops->take_timebase(), the secondary CPU will probably be spinning
+* in there.  If not (i.e. pseries), the secondary will continue on and
+* try to online itself/idle/etc. If it survives that, we need to find
+* these possible-but-not-online-but-should-be CPUs and chaperone them
+* into kexec_smp_wait().
+*/
for_each_online_cpu(i) {
if (i == my_cpu)
continue;
@@ -189,9 +203,9 @@ static void kexec_prepare_cpus_wait(int wait_state)
	while (paca[i].kexec_state < wait_state) {
		barrier();
		if (i != notified) {
-			printk( "kexec: waiting for cpu %d (physical"
-					" %d) to enter %i state\n",
-				i, paca[i].hw_cpu_id, wait_state);
+			printk(KERN_INFO "kexec: waiting for cpu %d "
+			       "(physical %d) to enter %i state\n",
+			       i, paca[i].hw_cpu_id, wait_state);
notified = i;
}
}
@@ -215,7 +229,10 @@ static void kexec_prepare_cpus(void)
if (ppc_md.kexec_cpu_down)
ppc_md.kexec_cpu_down(0, 0);
 
-	/* Before removing MMU mapings make sure all CPUs have entered real mode */
+   /*
+* Before removing MMU mappings make sure all CPUs have entered real
+* mode:
+*/
kexec_prepare_cpus_wait(KEXEC_STATE_REAL_MODE);
 
put_cpu();
@@ -284,6 +301,8 @@ void default_machine_kexec(struct kimage *image)
if (crashing_cpu == -1)
kexec_prepare_cpus();
 
+	pr_debug("kexec: Starting switchover sequence.\n");
+
/* switch to a staticly allocated stack.  Based on irq stack code.
 * XXX: the task struct will likely be invalid once we do the copy!
 */
-- 
1.6.3.3



[PATCH 2/2] powerpc/kexec: Fix orphaned offline CPUs across kexec

2010-07-29 Thread Matt Evans
When CPU hotplug is used, some CPUs may be offline at the time a kexec is
performed.  The subsequent kernel may expect these CPUs to be already running,
and will declare them stuck.  On pseries, there's also a soft-offline (cede)
state that CPUs may be in; this can also cause problems as the kexeced kernel
may ask RTAS if they're online -- and RTAS would say they are.  The CPU will
either appear stuck, or will cause a crash as we replace its cede loop beneath
it.

This patch kicks each present offline CPU awake before the kexec, so that
none are forever lost to these assumptions in the subsequent kernel.

Now, the behaviour is that all available CPUs that were offlined are now
online & usable after the kexec.  This mimics the behaviour of a full reboot
(on which all CPUs will be restarted).

Signed-off-by: Matt Evans m...@ozlabs.org
---
 arch/powerpc/kernel/machine_kexec_64.c |   26 +-
 1 files changed, 25 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/kernel/machine_kexec_64.c b/arch/powerpc/kernel/machine_kexec_64.c
index aa3d5cd..37f805e 100644
--- a/arch/powerpc/kernel/machine_kexec_64.c
+++ b/arch/powerpc/kernel/machine_kexec_64.c
@@ -16,6 +16,7 @@
#include <linux/init_task.h>
#include <linux/errno.h>
#include <linux/kernel.h>
+#include <linux/cpu.h>
 
#include <asm/page.h>
#include <asm/current.h>
@@ -213,9 +214,32 @@ static void kexec_prepare_cpus_wait(int wait_state)
mb();
 }
 
-static void kexec_prepare_cpus(void)
+/*
+ * We need to make sure each present CPU is online.  The next kernel will scan
+ * the device tree and assume primary threads are online and query secondary
+ * threads via RTAS to online them if required.  If we don't online primary
+ * threads, they will be stuck.  However, we also online secondary threads as we
+ * may be using 'cede offline'.  In this case RTAS doesn't see the secondary
+ * threads as offline -- and again, these CPUs will be stuck.
+ *
+ * So, we online all CPUs that should be running, including secondary threads.
+ */
+static void wake_offline_cpus(void)
 {
+   int cpu = 0;
+
+   for_each_present_cpu(cpu) {
+   if (!cpu_online(cpu)) {
+			printk(KERN_INFO "kexec: Waking offline cpu %d.\n",
+  cpu);
+   cpu_up(cpu);
+   }
+   }
+}
 
+static void kexec_prepare_cpus(void)
+{
+   wake_offline_cpus();
smp_call_function(kexec_smp_down, NULL, /* wait */0);
local_irq_disable();
mb(); /* make sure IRQs are disabled before we say they are */
-- 
1.6.3.3
