Re: [PATCH 1/2] powerpc: add 16K/64K pages support for the 44x PPC32 architectures.

2008-10-22 Thread Christian Ehrhardt
 +713,7 @@ _GLOBAL(copy_page)
dcbt    r5,r4
li  r11,L1_CACHE_BYTES+4
 #endif /* MAX_COPY_PREFETCH */
-   li  r0,4096/L1_CACHE_BYTES - MAX_COPY_PREFETCH
+   li  r0,PAGE_SIZE/L1_CACHE_BYTES - MAX_COPY_PREFETCH
crclr   4*cr0+eq
 2:
mtctr   r0
diff --git a/arch/powerpc/mm/pgtable_32.c b/arch/powerpc/mm/pgtable_32.c
index 2001abd..4eed001 100644
--- a/arch/powerpc/mm/pgtable_32.c
+++ b/arch/powerpc/mm/pgtable_32.c
@@ -72,12 +72,7 @@ extern unsigned long p_mapped_by_tlbcam(unsigned long pa);
 #define p_mapped_by_tlbcam(x)  (0UL)
 #endif /* HAVE_TLBCAM */

-#ifdef CONFIG_PTE_64BIT
-/* 44x uses an 8kB pgdir because it has 8-byte Linux PTEs. */
-#define PGDIR_ORDER 1
-#else
-#define PGDIR_ORDER 0
-#endif
+#define PGDIR_ORDER max(32 + PGD_T_LOG2 - PGDIR_SHIFT - PAGE_SHIFT, 0)

 pgd_t *pgd_alloc(struct mm_struct *mm)
 {
@@ -400,7 +395,7 @@ void kernel_map_pages(struct page *page, int numpages, int enable)
 #endif /* CONFIG_DEBUG_PAGEALLOC */

 static int fixmaps;
-unsigned long FIXADDR_TOP = 0xf000;
+unsigned long FIXADDR_TOP = (-PAGE_SIZE);
 EXPORT_SYMBOL(FIXADDR_TOP);

 void __set_fixmap (enum fixed_addresses idx, phys_addr_t phys, pgprot_t flags)
diff --git a/arch/powerpc/platforms/Kconfig.cputype b/arch/powerpc/platforms/Kconfig.cputype
index 7f65127..a1386a4 100644
--- a/arch/powerpc/platforms/Kconfig.cputype
+++ b/arch/powerpc/platforms/Kconfig.cputype
@@ -202,7 +202,7 @@ config PPC_STD_MMU_32

 config PPC_MM_SLICES
    bool
-   default y if HUGETLB_PAGE || PPC_64K_PAGES
+   default y if HUGETLB_PAGE || (PPC64 && PPC_64K_PAGES)
    default n

 config VIRT_CPU_ACCOUNTING
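
To see what the new PGDIR_ORDER expression and the new FIXADDR_TOP
evaluate to, here is a small userspace sketch. It is not part of the
patch. It assumes PGD_T_LOG2 is 2 (a 4-byte pgd_t) and that
PGDIR_SHIFT = PAGE_SHIFT + (PAGE_SHIFT - PTE_T_LOG2); those definitions
are not in the hunks quoted here, so treat both as assumptions.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption, not shown in the quoted hunks: log2(sizeof(pgd_t)) on PPC32. */
#define PGD_T_LOG2 2

/*
 * Assumed layout: one PTE page holds 2^(page_shift - pte_t_log2) entries,
 * so one pgd entry covers 2^(page_shift + (page_shift - pte_t_log2)) bytes.
 */
static int pgdir_shift(int page_shift, int pte_t_log2)
{
	return page_shift + (page_shift - pte_t_log2);
}

/* Mirrors: max(32 + PGD_T_LOG2 - PGDIR_SHIFT - PAGE_SHIFT, 0) */
static int pgdir_order(int page_shift, int pte_t_log2)
{
	int order = 32 + PGD_T_LOG2
		    - pgdir_shift(page_shift, pte_t_log2) - page_shift;

	return order > 0 ? order : 0;
}

/* Mirrors: FIXADDR_TOP = (-PAGE_SIZE), evaluated as a 32-bit address. */
static uint32_t fixaddr_top(int page_shift)
{
	return (uint32_t)0 - ((uint32_t)1 << page_shift);
}
```

Under these assumptions pgdir_order(12, 3) is 1 and pgdir_order(12, 2)
is 0, which matches the removed #ifdef for 4K pages with 8-byte and
4-byte PTEs. For 16K and 64K pages the expression goes negative and is
clamped to 0.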
  



--

Grüsse / regards, 
Christian Ehrhardt

IBM Linux Technology Center, Open Virtualization

___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH 1/2] powerpc: add 16K/64K pages support for the 44x PPC32 architectures.

2008-10-22 Thread Christian Ehrhardt
Ilya, here is the snippet you asked for, with CONFIG_DEBUG_BUGVERBOSE 
enabled and bootmem_debug set.


## Booting kernel from Legacy Image at 0400 ...
  Image Name:   Linux-2.6.27-dirty
  Image Type:   PowerPC Linux Kernel Image (gzip compressed)
  Data Size:1521505 Bytes =  1.5 MB
  Load Address: 0040
  Entry Point:  00400458
  Verifying Checksum ... OK
  Uncompressing Kernel Image ... OK
CPU clock-frequency - 0x27bc86a4 (667MHz)
CPU timebase-frequency - 0x27bc86a4 (667MHz)
/plb: clock-frequency - 9ef21a9 (167MHz)
/plb/opb: clock-frequency - 4f790d4 (83MHz)
/plb/opb/ebc: clock-frequency - 34fb5e3 (56MHz)
/plb/opb/[EMAIL PROTECTED]: clock-frequency - a8c000 (11MHz)
/plb/opb/[EMAIL PROTECTED]: clock-frequency - a8c000 (11MHz)
/plb/opb/[EMAIL PROTECTED]: clock-frequency - 42ecac (4MHz)
/plb/opb/[EMAIL PROTECTED]: clock-frequency - 42ecac (4MHz)
Memory - 0x0 0x0 0x000 (255MB)
ethernet0: local-mac-address - 00:10:ec:00:e2:3e
ethernet1: local-mac-address - 00:10:ec:80:e2:3e

zImage starting: loaded at 0x0040 (sp: 0x0fe3c820)
Allocating 0x3d54dc bytes for kernel ...
gunzipping (0x - 0x0040e000:0x007b24a4)...done 0x390af8 bytes

Linux/PowerPC load: console=ttyS0,115200 ip=dhcp 
nfsroot=192.168.1.2:/home/paelzer/ubuntu_ppc.8.04 root=/dev/nfs rw 
bootmem_debug

Finalizing device tree... flat tree at 0x40bed8
Using PowerPC 44x Platform machine description
Linux version 2.6.27-dirty ([EMAIL PROTECTED]) (gcc version 4.2.3) #12 
Wed Oct 22 19:40:49 CEST 2008

console [udbg0] enabled
bootmem::init_bootmem_core nid=0 start=0 map=ffd end=fff mapsize=200
bootmem::mark_bootmem_node nid=0 start=0 end=fff reserve=0 flags=0
bootmem::__free nid=0 start=0 end=fff
bootmem::mark_bootmem_node nid=0 start=0 end=3e reserve=1 flags=0
bootmem::__reserve nid=0 start=0 end=3e flags=0
bootmem::mark_bootmem_node nid=0 start=40 end=41 reserve=1 flags=0
bootmem::__reserve nid=0 start=40 end=41 flags=0
bootmem::mark_bootmem_node nid=0 start=ffd end=fff reserve=1 flags=0
bootmem::__reserve nid=0 start=ffd end=fff flags=0
[ cut here ]
kernel BUG at mm/bootmem.c:320!
Oops: Exception in kernel mode, sig: 5 [#1]
PowerPC 44x Platform
NIP: c02ce838 LR: c02ca4e4 CTR: c000dcf8
REGS: c0361eb0 TRAP: 0700   Not tainted  (2.6.27-dirty)
MSR: 00021000 ME  CR: 22004022  XER: 005f
TASK = c03304a8[0] 'swapper' THREAD: c036
GPR00: c02e0c98 c0361f60 c03304a8 0fff 1000 0001  
4000
GPR08: e000   c02e0c90 2224  0ffa6800 
0ffbf000
GPR16: 100c  100c  0ffa7500 0fe3cb20 0001 
c02e0c98
GPR24:  0001 1000 0fff c03a 0fff c03ad1e0 
c02e0c84

NIP [c02ce838] mark_bootmem+0xe0/0x124
LR [c02ca4e4] do_init_bootmem+0x134/0x168
Call Trace:
[c0361f60] [c02ce810] mark_bootmem+0xb8/0x124 (unreliable)
[c0361f90] [c02ca4e4] do_init_bootmem+0x134/0x168
[c0361fb0] [c02c8e00] setup_arch+0x13c/0x1b8
[c0361fc0] [c02c066c] start_kernel+0x94/0x2ac
[c0361ff0] [c1e8] skpinv+0x190/0x1cc
Instruction dump:
7f07c378 4bfffe15 7c7e1b78 4192000c 2f83 409e0024 7f9ae000 419e0050
817f0014 83bf0004 3bebffec 4b68 0fe0 4800 7f63db78 7fa4eb78
---[ end trace 31fd0ba7d8756001 ]---
Kernel panic - not syncing: Attempted to kill the idle task!
Rebooting in 180 seconds..

Christian Ehrhardt wrote:

Hi Ilya,
I just tried your patch on my 440 board because it would help us in 
our environment.

Unfortunately I ran into a bug during early boot (in mark_bootmem).

A log can be found in this mail; this is the bug when running with 64k 
page size.
I tried this with and without your 2/2 256k patch and also with the page 
size configured to 16k; the error is the same in all cases.


I used an earlier version of your patch in the past and it worked 
fine. Applying this old patch causes the same problem.
Therefore I expect that there was some other code changed that breaks 
with page size != 4k.


I did not check that in detail yet, but I would be happy for every 
hint I could get to fix this.


= bootm
## Booting kernel from Legacy Image at 0400 ...
  Image Name:   Linux-2.6.27-dirty
  Image Type:   PowerPC Linux Kernel Image (gzip compressed)
  Data Size:1512203 Bytes =  1.4 MB
  Load Address: 0040
  Entry Point:  00400458
  Verifying Checksum ... OK
  Uncompressing Kernel Image ... OK
CPU clock-frequency - 0x27bc86a4 (667MHz)
CPU timebase-frequency - 0x27bc86a4 (667MHz)
/plb: clock-frequency - 9ef21a9 (167MHz)
/plb/opb: clock-frequency - 4f790d4 (83MHz)
/plb/opb/ebc: clock-frequency - 34fb5e3 (56MHz)
/plb/opb/[EMAIL PROTECTED]: clock-frequency - a8c000 (11MHz)
/plb/opb/[EMAIL PROTECTED]: clock-frequency - a8c000 (11MHz)
/plb/opb/[EMAIL PROTECTED]: clock-frequency - 42ecac (4MHz)
/plb/opb/[EMAIL PROTECTED]: clock-frequency - 42ecac (4MHz)
Memory - 0x0 0x0 0x000 (255MB)
ethernet0: local-mac-address - 00:10:ec:00:e2:3e
ethernet1: local-mac-address - 00:10:ec:80:e2:3e

zImage starting: loaded at 0x0040 (sp: 0x0fe3c820)
Allocating

[PATCH 2/3] kvmppc: add hypercall infrastructure - guest part v3

2008-09-16 Thread ehrhardt
From: Christian Ehrhardt [EMAIL PROTECTED]

This adds the guest portion of the hypercall infrastructure.

Version 3 now follows the beat ABI, but proposes a new implementation style:
static inline asm functions instead of pure assembler code. That should give
the compiler more flexibility and therefore allow better optimization.

If people agree on that new implementation style we might merge this code.
The current implementation of beat style hypercalls can be found in
arch/powerpc/platforms/cell/beat_hvCall.S

Signed-off-by: Christian Ehrhardt [EMAIL PROTECTED]
---

[diffstat]
 epapr_hcalls.h |   59 +
 1 file changed, 59 insertions(+)

[diff]

diff --git a/include/asm-powerpc/epapr_hcalls.h b/include/asm-powerpc/epapr_hcalls.h
new file mode 100644
--- /dev/null
+++ b/include/asm-powerpc/epapr_hcalls.h
@@ -0,0 +1,59 @@
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301, USA.
+ *
+ * Copyright IBM Corp. 2008
+ *
+ * Authors:
+ * Christian Ehrhardt [EMAIL PROTECTED]
+ */
+
+#ifndef __POWERPC_EPAPR_HCALLS_H__
+#define __POWERPC_EPAPR_HCALLS_H__
+
+#ifdef __KERNEL__
+
+/* Hypercalls use the beat ABI */
+#define KVM_HYPERCALL_BIN 0x4422
+
+static inline long epapr_hypercall_1in_1out(unsigned int nr, unsigned long p1)
+{
+   register unsigned long hcall asm ("r11") = nr;
+   register unsigned long arg1_ret asm ("r3") = p1;
+
+   asm volatile(".long %1"
+       : "+r"(arg1_ret)
+       : "i"(KVM_HYPERCALL_BIN), "r"(hcall)
+       : "r4", "r5", "r6", "r7", "r8",
+         "r9", "r10", "r12", "cc");
+   return arg1_ret;
+}
+
+static inline long epapr_hypercall_2in_1out(unsigned int nr,
+   unsigned long p1, unsigned long p2)
+{
+   register unsigned long hcall asm ("r11") = nr;
+   register unsigned long arg1_ret asm ("r3") = p1;
+   register unsigned long arg2 asm ("r4") = p2;
+
+   asm volatile(".long %1"
+       : "+r"(arg1_ret)
+       : "i"(KVM_HYPERCALL_BIN), "r"(hcall), "r"(arg2)
+       : "r5", "r6", "r7", "r8",
+         "r9", "r10", "r12", "cc");
+   return arg1_ret;
+}
+
+#endif /* __KERNEL__ */
+
+#endif /* __POWERPC_EPAPR_HCALLS_H__ */


[PATCH 0/3][RFC] kvmppc: paravirtualization interface - guest part v3

2008-09-16 Thread ehrhardt
From: Christian Ehrhardt [EMAIL PROTECTED]

Version 3 updates:
- guest hypercall infrastructure is now generic (in epapr_hcalls.h)
  while the kvm specific functions stay in kvm_para.h
- the hypercalls now use beat style ABI
- dropped the guest coop patch changing wrteei to wrtee (now mfmsr is
  rewritten, avoiding side effects and a lot of corner cases; additionally
  this does not need any guest cooperation to be effective)

This patch series implements a paravirtualization interface using:
- the device tree mechanism to pass hypervisor information to the guest
- hypercalls for guest-host calls
- an example exploiter of that interface (magic page)

The device tree format used here (= base for the discussions on
embedded-hypervisor) is the following:
- A node "hypervisor" to show the general availability of some hypervisor data
- flags for features like the example "feature,pv-magicpage";
  setting 1 = available, everything else = unavailable
- Some features might need to pass more data and can use an entry in the
  device tree like the example of "data,pv-magicpage-size"
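
The flag convention above (1 = available, anything else = unavailable)
can be sketched in plain userspace C. The struct dt_prop type and the
find_prop() helper below are hypothetical stand-ins for the kernel's
device tree accessors, and the bit number is illustrative only.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one property of the "hypervisor" node. */
struct dt_prop {
	const char *name;
	int value;
};

/* Maps a device tree flag name to a feature bit number. */
struct para_feature {
	const char *dtcell;
	int feature;
};

#define FEATURE_MAGICPAGE 0	/* bit number, illustrative */

static const struct para_feature para_features[] = {
	{ "feature,pv-magicpage", FEATURE_MAGICPAGE },
};

/* Linear lookup; returns NULL if the property is absent. */
static const int *find_prop(const struct dt_prop *props, size_t n,
			    const char *name)
{
	for (size_t i = 0; i < n; i++)
		if (strcmp(props[i].name, name) == 0)
			return &props[i].value;
	return NULL;
}

/* A flag counts as available only if it is present and exactly 1. */
static unsigned int para_feature_mask(const struct dt_prop *props, size_t n)
{
	unsigned int features = 0;
	size_t count = sizeof(para_features) / sizeof(para_features[0]);

	for (size_t i = 0; i < count; i++) {
		const int *val = find_prop(props, n, para_features[i].dtcell);

		if (val && *val == 1)
			features |= 1u << para_features[i].feature;
	}
	return features;
}
```

A node carrying "feature,pv-magicpage" = 1 yields a mask with bit 0 set;
a missing property or any other value leaves the bit clear.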

The host side of these patches can be found on [EMAIL PROTECTED]

I hope that eventually this guest patch series (which modifies the ppc boot
process and adds e.g. new ppc fixmaps) could go upstream (once discussed
and agreed upon) via linuxppc-dev, while the kvm host part will go via
kvm (Avi Kivity).

[patches in series]
[PATCH 1/3] kvmppc: read device tree hypervisor node infrastructure
[PATCH 2/3] kvmppc: add hypercall infrastructure - guest part
[PATCH 3/3] kvmppc: magic page paravirtualization - guest part

---
[diffstat]
 arch/powerpc/kernel/kvm.c|   53 +++
 b/arch/powerpc/kernel/Makefile   |2 +
 b/arch/powerpc/kernel/kvm.c  |   30 +
 b/arch/powerpc/kernel/setup_32.c |3 +
 b/arch/powerpc/platforms/44x/Kconfig |7 
 b/include/asm-powerpc/epapr_hcalls.h |   59 +++
 b/include/asm-powerpc/fixmap.h   |   10 +
 b/include/asm-powerpc/kvm_para.h |   43 +++--
 include/asm-powerpc/kvm_para.h   |   26 +++
 9 files changed, 229 insertions(+), 4 deletions(-)


[PATCH 1/3] kvmppc: read device tree hypervisor node infrastructure

2008-09-16 Thread ehrhardt
From: Christian Ehrhardt [EMAIL PROTECTED]

This patch adds the guest portion of the device tree based host-guest
communication. Using the device tree infrastructure this patch implements
kvm_para_available and kvm_arch_para_features (in this patch just the
infrastructure, no specific feature registered).

Signed-off-by: Christian Ehrhardt [EMAIL PROTECTED]
---

[diffstat]
 arch/powerpc/kernel/Makefile   |2 +
 arch/powerpc/kernel/kvm.c  |   30 +
 arch/powerpc/kernel/setup_32.c |3 ++
 arch/powerpc/platforms/44x/Kconfig |7 ++
 include/asm-powerpc/kvm_para.h |   43 ++---
 5 files changed, 82 insertions(+), 3 deletions(-)
[diff]

diff --git a/arch/powerpc/kernel/Makefile b/arch/powerpc/kernel/Makefile
--- a/arch/powerpc/kernel/Makefile
+++ b/arch/powerpc/kernel/Makefile
@@ -80,6 +80,8 @@
 
 obj-$(CONFIG_8XX_MINIMAL_FPEMU) += softemu8xx.o
 
+obj-$(CONFIG_KVM_GUEST)+= kvm.o
+
 ifneq ($(CONFIG_PPC_INDIRECT_IO),y)
 obj-y  += iomap.o
 endif
diff --git a/arch/powerpc/kernel/kvm.c b/arch/powerpc/kernel/kvm.c
new file mode 100644
--- /dev/null
+++ b/arch/powerpc/kernel/kvm.c
@@ -0,0 +1,30 @@
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301, USA.
+ *
+ * Copyright IBM Corp. 2008
+ *
+ * Authors:
+ * Hollis Blanchard [EMAIL PROTECTED]
+ * Christian Ehrhardt [EMAIL PROTECTED]
+ */
+
+#include <linux/percpu.h>
+#include <linux/mm.h>
+#include <linux/kvm_para.h>
+
+void __init kvm_guest_init(void)
+{
+   if (!kvm_para_available())
+   return;
+}
diff --git a/arch/powerpc/kernel/setup_32.c b/arch/powerpc/kernel/setup_32.c
--- a/arch/powerpc/kernel/setup_32.c
+++ b/arch/powerpc/kernel/setup_32.c
@@ -17,6 +17,7 @@
 #include <linux/cpu.h>
 #include <linux/console.h>
 #include <linux/lmb.h>
+#include <linux/kvm_para.h>
 
 #include <asm/io.h>
 #include <asm/prom.h>
@@ -319,5 +320,7 @@
    ppc_md.setup_arch();
    if ( ppc_md.progress ) ppc_md.progress("arch: exit", 0x3eab);
 
+   kvm_guest_init();
+
paging_init();
 }
diff --git a/arch/powerpc/platforms/44x/Kconfig b/arch/powerpc/platforms/44x/Kconfig
--- a/arch/powerpc/platforms/44x/Kconfig
+++ b/arch/powerpc/platforms/44x/Kconfig
@@ -152,3 +152,10 @@
 # 44x errata/workaround config symbols, selected by the CPU models above
 config IBM440EP_ERR42
bool
+
+config KVM_GUEST
+   bool "KVM Guest support"
+   depends on EXPERIMENTAL
+   help
+   This option enables various optimizations for running under the KVM
+   hypervisor.
diff --git a/include/asm-powerpc/kvm_para.h b/include/asm-powerpc/kvm_para.h
--- a/include/asm-powerpc/kvm_para.h
+++ b/include/asm-powerpc/kvm_para.h
@@ -14,7 +14,9 @@
  *
  * Copyright IBM Corp. 2008
  *
- * Authors: Hollis Blanchard [EMAIL PROTECTED]
+ * Authors:
+ * Hollis Blanchard [EMAIL PROTECTED]
+ * Christian Ehrhardt [EMAIL PROTECTED]
  */
 
 #ifndef __POWERPC_KVM_PARA_H__
@@ -22,15 +24,50 @@
 
 #ifdef __KERNEL__
 
+#include <linux/of.h>
+
+static struct kvmppc_para_features {
+   char *dtcell;
+   int feature;
+} para_features[] = {
+};
+
 static inline int kvm_para_available(void)
 {
-   return 0;
+   struct device_node *dn;
+   int ret;
+
+   dn = of_find_node_by_path("/hypervisor");
+   ret = !!dn;
+
+   of_node_put(dn);
+
+   return ret;
 }
 
 static inline unsigned int kvm_arch_para_features(void)
 {
-   return 0;
+   struct device_node *dn;
+   const int *dtval;
+   unsigned int features = 0;
+   int i;
+
+   dn = of_find_node_by_path("/hypervisor");
+   if (!dn)
+       return 0;
+
+   for (i = 0; i < ARRAY_SIZE(para_features); i++) {
+       dtval = of_get_property(dn, para_features[i].dtcell, NULL);
+       if (dtval && *dtval == 1)
+           features |= (1 << para_features[i].feature);
+   }
+
+   of_node_put(dn);
+
+   return features;
 }
+
+void kvm_guest_init(void);
 
 #endif /* __KERNEL__ */
 


Re: [PATCH 4/4] kvmppc: convert wrteei to wrtee as kvm guest optimization

2008-08-22 Thread Christian Ehrhardt

Scott Wood wrote:

On Thu, Aug 21, 2008 at 09:21:39AM -0500, Kumar Gala wrote:
  
Where is the other discussion?  I'd like to understand what's going on  
here.. (especially since I added the wrtee[i] changes to kernel way  
back when).



Presumably, they want to be able to replace wrtee with a store to a
hypervisor/guest shared memory area, and there's no store-immediate
instruction.

-Scott
  

Exactly, Scott.

And to answer your question, Kumar: in the last submission I was asked to 
split the host and guest patches, so the host discussion lives on 
[EMAIL PROTECTED], as I mentioned (maybe a bit too hidden) 
in the [0/4] mail of this series.
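
To illustrate Scott's point, here is a userspace sketch of a one-for-one
instruction rewrite. The opcode constants are the standard Book E
encodings of wrtee, wrteei and stw; the shared-slot address is
hypothetical, and the real host-side rewriting lives in the KVM patches
that are not part of this thread. The sketch also ignores that wrtee only
copies the EE bit while stw stores the whole register.

```c
#include <assert.h>
#include <stdint.h>

/* Book E encodings: wrtee rS, wrteei E, and stw rS,D(rA). */
#define INST_WRTEE	0x7c000106u
#define INST_WRTEEI	0x7c000146u
#define INST_WRTEEI_E	0x00008000u	/* the immediate E bit */
#define INST_STW	0x90000000u
#define INST_RS_MASK	0x03e00000u	/* source register field */
#define INST_X_MASK	0xfc0007feu	/* primary + extended opcode */

/*
 * Hypothetical address of the MSR slot in the shared page. It has to
 * fit into the signed 16-bit displacement of stw when rA is 0.
 */
#define SHARED_MSR_ADDR	(-4096)

/*
 * Returns the replacement instruction, or 0 if the input cannot be
 * rewritten into a single store.
 */
static uint32_t rewrite_wrtee(uint32_t inst)
{
	if ((inst & INST_X_MASK) == INST_WRTEE) {
		/* Keep rS, use rA = 0 (literal base 0), D = shared slot. */
		return INST_STW | (inst & INST_RS_MASK) |
		       ((uint32_t)SHARED_MSR_ADDR & 0xffffu);
	}
	/* wrteei carries its value in the opcode: no register to store. */
	return 0;
}
```

With these constants, "wrtee r5" (0x7ca00106) becomes "stw r5,-4096(0)"
(0x90a0f000), while both forms of wrteei are rejected because there is no
store-immediate instruction to replace them with.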


--

Grüsse / regards, 
Christian Ehrhardt

IBM Linux Technology Center, Open Virtualization



Re: [PATCH 4/4] kvmppc: convert wrteei to wrtee as kvm guest optimization

2008-08-21 Thread Christian Ehrhardt

Josh Boyer wrote:

On Wed, 20 Aug 2008 14:06:51 -0500
Hollis Blanchard [EMAIL PROTECTED] wrote:  
  
To be honest I unfortunately don't know how big the impact for 
non-virtualized systems is. I would like to test it, but without 
hardware performance counters on the core I have I'm not sure (yet) how 
to measure that in a good way - any suggestion welcome.
  

I don't see why we need performance counters. Can't we just compare any
bare metal benchmark results with the patch both applied and not?


Do you know of one that causes a large amount of
local_irq_{disable,enable}s to be called?
  

I think *every* workload causes a large number of
local_irq_{disable,enable} calls... :)



Well, sure.  I was just trying to test the change as specifically as
possible.  One could write a module that did X number of
disable/enable pairs and reported the timebase at start and end to
compare.  X could even be a module parameter.  Just to try and
eliminate noise or whatever from the testing.

/me shrugs.

josh
  
Yeah, I thought of something like that too, because I expect the 
difference to be very small.
Instead of a module I wanted to put this somewhere before the kernel 
mounts the root fs, to avoid interference from whatever userspace is doing 
(e.g. causing thousands of interrupts to come in while the test is 
running).
Eventually we need a synthetic benchmark like that AND a check of how it 
affects a common system to be sure.
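
The synthetic benchmark Josh describes could look roughly like the
following userspace sketch. The two fake_irq_*() functions are
placeholders for local_irq_disable()/local_irq_enable(), and clock()
stands in for reading the timebase; a real test would run in the kernel
and use the real calls.

```c
#include <assert.h>
#include <time.h>

/* Counts calls so the loop cannot be optimized away entirely. */
static volatile unsigned long toggle_count;

/* Placeholders for local_irq_disable()/local_irq_enable(). */
static void fake_irq_disable(void) { toggle_count++; }
static void fake_irq_enable(void)  { toggle_count++; }

/* Runs `pairs` disable/enable pairs and returns the elapsed clock ticks. */
static clock_t time_irq_pairs(unsigned long pairs)
{
	clock_t start = clock();

	for (unsigned long i = 0; i < pairs; i++) {
		fake_irq_disable();
		fake_irq_enable();
	}
	return clock() - start;
}
```

Running time_irq_pairs() once per kernel build (with and without the
patch) and comparing the two results is the comparison being proposed;
the number of pairs plays the role of the module parameter X.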



--

Grüsse / regards, 
Christian Ehrhardt

IBM Linux Technology Center, Open Virtualization



Re: [PATCH 4/4] kvmppc: convert wrteei to wrtee as kvm guest optimization

2008-08-21 Thread Christian Ehrhardt

Kumar Gala wrote:


On Aug 19, 2008, at 5:36 AM, [EMAIL PROTECTED] wrote:


From: Christian Ehrhardt [EMAIL PROTECTED]

Dependent on the already existing CONFIG_KVM_GUEST config option this patch
changes wrteei to wrtee allowing the hypervisor to rewrite those to nontrapping
instructions. Maybe we should split the kvm guest otpimizations in two parts
one for the overhead free optimizations and on for the rest that might add
some complexity for non virtualized execution (like this one).

Signed-off-by: Christian Ehrhardt [EMAIL PROTECTED]
---


So this commit message doesn't explain why 'wrtee' facilitates 
whatever optimization this is and 'wrteei' doesn't.



yep I only explained it elsewhere.
I'll add it here too in the next version - thanks Kumar
(and fix the word otpimizations)

--

Grüsse / regards, 
Christian Ehrhardt

IBM Linux Technology Center, Open Virtualization



Re: [PATCH 2/4] kvmppc: add hypercall infrastructure - guest part

2008-08-20 Thread Christian Ehrhardt

Arnd Bergmann wrote:

On Tuesday 19 August 2008, [EMAIL PROTECTED] wrote:
  

+static inline long kvm_hypercall1(unsigned int nr, unsigned long p1)
+{
+   register unsigned long hcall asm ("r0") = nr;
+   register unsigned long arg1 asm ("r3") = p1;
+   register long ret asm ("r11");
+
+   asm volatile(".long %1"
+       : "=r"(ret)
+       : "i"(KVM_HYPERCALL_BIN), "r"(hcall), "r"(arg1)
+       : "r4", "r5", "r6", "r7", "r8",
+         "r9", "r10", "r12", "cc");
+   return ret;
+}



What is the reasoning for making the calling convention different from
all the existing hcall interfaces here?

pseries uses r3 for the hcall number, lv1 and beat use r11, so using
r0 just for the sake of being different seems counterintuitive.

Arnd 
  
Some documentation is here:
http://kvm.qumranet.com/kvmwiki/PowerPC_Hypercall_ABI
As far as I remember it was modeled on system calls; from my point of view 
we can still change it at the moment.
When we discussed that, I was too new to the Power architecture to 
really get all the details, but I assume Hollis and Jimi can answer 
that.



--

Grüsse / regards, 
Christian Ehrhardt

IBM Linux Technology Center, Open Virtualization



Re: [PATCH 4/4] kvmppc: convert wrteei to wrtee as kvm guest optimization

2008-08-20 Thread Christian Ehrhardt

Arnd Bergmann wrote:

On Tuesday 19 August 2008, [EMAIL PROTECTED] wrote:
  

Dependent on the already existing CONFIG_KVM_GUEST config option this patch
changes wrteei to wrtee allowing the hypervisor to rewrite those to nontrapping
instructions. Maybe we should split the kvm guest otpimizations in two parts
one for the overhead free optimizations and on for the rest that might add
some complexity for non virtualized execution (like this one).

Signed-off-by: Christian Ehrhardt [EMAIL PROTECTED]



How significant is the performance impact of this change for non-virtualized
systems? If it's very low, maybe you should not bother with the #ifdef, and
if it's noticable, you might be better off using dynamic patching for this.

Arnd 
  
To be honest I unfortunately don't know how big the impact on 
non-virtualized systems is. I would like to test it, but without 
hardware performance counters on the core I have I'm not sure (yet) how 
to measure that in a good way - any suggestion is welcome.
I'm quite sure that any jumping-around style of dynamic patching in the 
guest (function pointers etc.) will be slower than just leaving the load 
there. Unfortunately I cannot rewrite wrteei from the hypervisor, because 
I would need a stwi (store immediate) to replace it with a single 
instruction, and there is none.
The patch as it is today lets you choose between a 10% benefit for 
virtualized guests and an unknown but surely very small overhead on native 
hardware.


--

Grüsse / regards, 
Christian Ehrhardt

IBM Linux Technology Center, Open Virtualization



[PATCH 2/4] kvmppc: add hypercall infrastructure - guest part

2008-08-19 Thread ehrhardt
From: Christian Ehrhardt [EMAIL PROTECTED]

This adds the guest portion of the hypercall infrastructure, basically an
illegal instruction with a defined layout.
See http://kvm.qumranet.com/kvmwiki/PowerPC_Hypercall_ABI for more detail
on the hypercall ABI for powerpc.

Signed-off-by: Christian Ehrhardt [EMAIL PROTECTED]
---

[diffstat]
 kvm_para.h |   33 +
 1 file changed, 33 insertions(+)

[diff]

diff --git a/include/asm-powerpc/kvm_para.h b/include/asm-powerpc/kvm_para.h
--- a/include/asm-powerpc/kvm_para.h
+++ b/include/asm-powerpc/kvm_para.h
@@ -25,6 +25,8 @@
 #ifdef __KERNEL__
 
 #include <linux/of.h>
+
+#define KVM_HYPERCALL_BIN 0x03ff
 
 static struct kvmppc_para_features {
char *dtcell;
@@ -69,6 +71,37 @@
 
 void kvm_guest_init(void);
 
+static inline long kvm_hypercall1(unsigned int nr, unsigned long p1)
+{
+   register unsigned long hcall asm ("r0") = nr;
+   register unsigned long arg1 asm ("r3") = p1;
+   register long ret asm ("r11");
+
+   asm volatile(".long %1"
+       : "=r"(ret)
+       : "i"(KVM_HYPERCALL_BIN), "r"(hcall), "r"(arg1)
+       : "r4", "r5", "r6", "r7", "r8",
+         "r9", "r10", "r12", "cc");
+   return ret;
+}
+
+static inline long kvm_hypercall2(unsigned int nr,
+   unsigned long p1, unsigned long p2)
+{
+   register unsigned long hcall asm ("r0") = nr;
+   register unsigned long arg1 asm ("r3") = p1;
+   register unsigned long arg2 asm ("r4") = p2;
+   register long ret asm ("r11");
+
+   asm volatile(".long %1"
+       : "=r"(ret)
+       : "i"(KVM_HYPERCALL_BIN), "r"(hcall),
+         "r"(arg1), "r"(arg2)
+       : "r5", "r6", "r7", "r8",
+         "r9", "r10", "r12", "cc");
+   return ret;
+}
+
 #endif /* __KERNEL__ */
 
 #endif /* __POWERPC_KVM_PARA_H__ */


[PATCH 4/4] kvmppc: convert wrteei to wrtee as kvm guest optimization

2008-08-19 Thread ehrhardt
From: Christian Ehrhardt [EMAIL PROTECTED]

Dependent on the already existing CONFIG_KVM_GUEST config option this patch
changes wrteei to wrtee allowing the hypervisor to rewrite those to nontrapping
instructions. Maybe we should split the kvm guest otpimizations in two parts
one for the overhead free optimizations and on for the rest that might add
some complexity for non virtualized execution (like this one).

Signed-off-by: Christian Ehrhardt [EMAIL PROTECTED]
---

[diffstat]
 hw_irq.h |   12 
 1 file changed, 12 insertions(+)

[diff]
diff --git a/include/asm-powerpc/hw_irq.h b/include/asm-powerpc/hw_irq.h
--- a/include/asm-powerpc/hw_irq.h
+++ b/include/asm-powerpc/hw_irq.h
@@ -72,7 +72,11 @@
 static inline void local_irq_disable(void)
 {
 #ifdef CONFIG_BOOKE
+#ifdef CONFIG_KVM_GUEST
+   __asm__ __volatile__("wrtee %0": : "r"(0) :"memory");
+#else
    __asm__ __volatile__("wrteei 0": : :"memory");
+#endif
 #else
    unsigned long msr;
    __asm__ __volatile__("": : :"memory");
@@ -84,7 +88,11 @@
 static inline void local_irq_enable(void)
 {
 #ifdef CONFIG_BOOKE
+#ifdef CONFIG_KVM_GUEST
+   __asm__ __volatile__("wrtee %0": : "r"(MSR_EE) :"memory");
+#else
    __asm__ __volatile__("wrteei 1": : :"memory");
+#endif
 #else
    unsigned long msr;
    __asm__ __volatile__("": : :"memory");
@@ -99,7 +107,11 @@
msr = mfmsr();
*flags = msr;
 #ifdef CONFIG_BOOKE
+#ifdef CONFIG_KVM_GUEST
+   __asm__ __volatile__("wrtee %0": : "r"(0) :"memory");
+#else
    __asm__ __volatile__("wrteei 0": : :"memory");
+#endif
 #else
    SET_MSR_EE(msr & ~MSR_EE);
 #endif


[PATCH 0/4][RFC] kvmppc: paravirtualization interface - guest part v2

2008-08-19 Thread ehrhardt
From: Christian Ehrhardt [EMAIL PROTECTED]

This patch series implements a paravirtualization interface using:
- the device tree mechanism to pass hypervisor information to the guest
- hypercalls for guest-host calls
- an example exploiter of that interface (magic page)

Version 2 includes changes in response to the feedback on my last submission
and is now tested against the implemented and working host part. The host part
discussion can be found on [EMAIL PROTECTED]

The hypercall ABI used was already discussed on the embedded-hypervisor mailing
list and is available at http://kvm.qumranet.com/kvmwiki/PowerPC_Hypercall_ABI

The device tree format used here (= base for the discussions on
embedded-hypervisor) is the following:
- A node "hypervisor" to show the general availability of some hypervisor data
- flags for features like the example "feature,pv-magicpage";
  setting 1 = available, everything else = unavailable
- Some features might need to pass more data and can use an entry in the
  device tree like the example of "data,pv-magicpage-size"

I hope that eventually this guest patch series (which modifies the boot
process and adds e.g. new ppc fixmaps) could go upstream (once discussed
and agreed upon) via linuxppc-dev, while the kvm host part will go via
kvm (Avi Kivity).

[patches in series]
[PATCH 1/4] kvmppc: read device tree hypervisor node infrastructure
[PATCH 2/4] kvmppc: add hypercall infrastructure - guest part
[PATCH 3/4] kvmppc: magic page paravirtualization - guest part
[PATCH 4/4] kvmppc: convert wrteei to wrtee as kvm guest optimization

---
[diffstat]
 arch/powerpc/kernel/kvm.c|   51 ++
 b/arch/powerpc/kernel/Makefile   |2 +
 b/arch/powerpc/kernel/kvm.c  |   30 +
 b/arch/powerpc/kernel/setup_32.c |3 +
 b/arch/powerpc/platforms/44x/Kconfig |7 
 b/include/asm-powerpc/fixmap.h   |   10 +
 b/include/asm-powerpc/hw_irq.h   |   12 +++
 b/include/asm-powerpc/kvm_para.h |   43 +++--
 b/mm/page_alloc.c|1
 include/asm-powerpc/kvm_para.h   |   59 +++
 10 files changed, 214 insertions(+), 4 deletions(-)


[PATCH 3/4] kvmppc: magic page paravirtualization - guest part

2008-08-19 Thread ehrhardt
From: Christian Ehrhardt [EMAIL PROTECTED]

This patch adds the guest handling for the magic page mechanism. A Hypervisor
can modify the device tree passed to the guest. Using that already existing
interface a guest can simply detect available hypervisor features and agree
on the supported ones using hypercalls.
This example checks for the feature switch "feature,pv-magicpage" in the
hypervisor node, plus additional data giving the size the hypervisor
requests in "data,pv-magicpage-size".
When the guest reads that data and wants to support the feature, the memory is
allocated and passed to the hypervisor using the KVM_HCALL_RESERVE_MAGICPAGE
hypercall.
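
The guest virtual address handed to the hypervisor comes from a fixmap
slot (FIX_KVM_PV in the diff below), which has to be reachable with a
16-bit displacement off base 0. A minimal userspace sketch of that
constraint follows; it assumes 4K pages, a FIXADDR_TOP of 0xfffff000 and
the usual "slots grow downwards from FIXADDR_TOP" fixmap layout, none of
which is stated in this mail.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT	12		/* assumption: 4K pages */
#define FIXADDR_TOP	0xfffff000u	/* assumption: top of the fixmap area */

/* Slot numbering as in the patch: FIX_HOLE first, then FIX_KVM_PV. */
enum { FIX_HOLE, FIX_KVM_PV };

/* Assumed layout: one page per index, growing down from FIXADDR_TOP. */
static uint32_t fix_to_virt(unsigned int idx)
{
	return FIXADDR_TOP - ((uint32_t)idx << PAGE_SHIFT);
}

/* True if addr can be encoded as a signed 16-bit displacement off base 0. */
static bool reachable_from_base0(uint32_t addr)
{
	return addr <= 0x00007fffu || addr >= 0xffff8000u;
}
```

With these assumptions fix_to_virt(FIX_KVM_PV) is 0xffffe000, i.e. 8k
below the top of the address space, which is inside the 32k window.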

Signed-off-by: Christian Ehrhardt [EMAIL PROTECTED]
---

[diffstat]
 arch/powerpc/kernel/kvm.c  |   51 +
 include/asm-powerpc/fixmap.h   |   10 +++-
 include/asm-powerpc/kvm_para.h |   26 
 mm/page_alloc.c|1
 4 files changed, 87 insertions(+), 1 deletion(-)

[diff]

diff --git a/arch/powerpc/kernel/kvm.c b/arch/powerpc/kernel/kvm.c
--- a/arch/powerpc/kernel/kvm.c
+++ b/arch/powerpc/kernel/kvm.c
@@ -22,9 +22,60 @@
 #include <linux/percpu.h>
 #include <linux/mm.h>
 #include <linux/kvm_para.h>
+#include <linux/bootmem.h>
+#include <asm/fixmap.h>
+
+/*
+ * this is guest memory granted to the hypervisor;
+ * the hypervisor can place data in this area and rewrite
+ * privileged instructions to read from this area without
+ * trapping.
+ * Only the Hypervisor needs to be aware of the structure layout
+ * which makes the guest more flexible - the guest only guarantees
+ * the size which is requested by the hypervisor and read from a
+ * device tree entry.
+ */
+static void *kvm_magicpage;
+
+static void __init kvmppc_register_magic_page(void)
+{
+   unsigned long gvaddr;
+   unsigned long gpaddr;
+   int size;
+   long err;
+
+   size = kvmppc_pv_read_data(KVM_PVDATA_MAGICPAGE_SIZE);
+   if (size < 0) {
+       printk(KERN_ERR "%s: couldn't read size for kvmppc style "
+           "paravirtualization support (got %d)\n",
+           __func__, size);
+       return;
+   }
+
+   /* FIXME Guest SMP needs that percpu which */
+   kvm_magicpage = alloc_bootmem(size);
+   if (!kvm_magicpage) {
+       printk(KERN_ERR "%s - failed to allocate %d bytes\n",
+            __func__, size);
+       return;
+   }
+   gpaddr = (unsigned long)__pa(kvm_magicpage);
+   gvaddr = fix_to_virt(FIX_KVM_PV);
+
+   err = kvm_hypercall2(KVM_HCALL_RESERVE_MAGICPAGE, gvaddr, gpaddr);
+   if (err)
+       printk(KERN_ERR "%s: couldn't register pv mem\n", __func__);
+   else
+       printk(KERN_NOTICE "%s: registered %d bytes for pv mem support"
+            " (gvaddr 0x%08lx gpaddr 0x%08lx)\n",
+            __func__, size, gvaddr, gpaddr);
+}
 
 void __init kvm_guest_init(void)
 {
if (!kvm_para_available())
return;
+
+   if (kvm_para_has_feature(KVM_FEATURE_PPCPV_MAGICPAGE))
+   kvmppc_register_magic_page();
 }
diff --git a/include/asm-powerpc/fixmap.h b/include/asm-powerpc/fixmap.h
--- a/include/asm-powerpc/fixmap.h
+++ b/include/asm-powerpc/fixmap.h
@@ -36,7 +36,7 @@
  *
  * these 'compile-time allocated' memory buffers are
  * fixed-size 4k pages. (or larger if used with an increment
- * highger than 1) use fixmap_set(idx,phys) to associate
+ * higher than 1) use fixmap_set(idx,phys) to associate
  * physical memory with fixmap indices.
  *
  * TLB entries of such buffers will not be flushed across
@@ -44,6 +44,14 @@
  */
 enum fixed_addresses {
FIX_HOLE,
+#ifdef CONFIG_KVM_GUEST
+   /*
+* reserved virtual address space for paravirtualization - needs to be
+* <=32k away from base address 0 to be able to reach it with
+* immediate addressing using base 0 instead of needing a register.
+*/
+   FIX_KVM_PV,
+#endif
 #ifdef CONFIG_HIGHMEM
FIX_KMAP_BEGIN, /* reserved pte's for temporary kernel mappings */
FIX_KMAP_END = FIX_KMAP_BEGIN+(KM_TYPE_NR*NR_CPUS)-1,
diff --git a/include/asm-powerpc/kvm_para.h b/include/asm-powerpc/kvm_para.h
--- a/include/asm-powerpc/kvm_para.h
+++ b/include/asm-powerpc/kvm_para.h
@@ -28,10 +28,18 @@
 
 #define KVM_HYPERCALL_BIN 0x03ff
 
+#define KVM_HCALL_RESERVE_MAGICPAGE 0
+
+#define KVM_PVDATA_MAGICPAGE_SIZE  "data,pv-magicpage-size"
+
+/* List of PV features supported, returned as a bitfield */
+#define KVM_FEATURE_PPCPV_MAGICPAGE 0
+
 static struct kvmppc_para_features {
char *dtcell;
int feature;
 } para_features[] = {
+   { "feature,pv-magicpage", KVM_FEATURE_PPCPV_MAGICPAGE }
 };
 
 static inline int kvm_para_available(void)
@@ -67,6 +75,24 @@
of_node_put(dn);
 
return features;
+}
+
+/* reads the specified data field out of the hypervisor node */
+static inline int

[PATCH 1/4] kvmppc: read device tree hypervisor node infrastructure

2008-08-19 Thread ehrhardt
From: Christian Ehrhardt [EMAIL PROTECTED]

This patch adds the guest portion of the device tree based host-guest
communication. Using the device tree infrastructure this patch implements
kvm_para_available and kvm_arch_para_features (in this patch just the
infrastructure, no specific feature registered).

Signed-off-by: Christian Ehrhardt [EMAIL PROTECTED]
---

[diffstat]
 arch/powerpc/kernel/Makefile   |2 ++
 arch/powerpc/kernel/kvm.c  |   30 ++
 arch/powerpc/kernel/setup_32.c |3 +++
 arch/powerpc/platforms/44x/Kconfig |7 +++
 include/asm-powerpc/kvm_para.h |   37 ++---
 5 files changed, 76 insertions(+), 3 deletions(-)

[diff]

diff --git a/arch/powerpc/kernel/Makefile b/arch/powerpc/kernel/Makefile
--- a/arch/powerpc/kernel/Makefile
+++ b/arch/powerpc/kernel/Makefile
@@ -80,6 +80,8 @@
 
 obj-$(CONFIG_8XX_MINIMAL_FPEMU) += softemu8xx.o
 
+obj-$(CONFIG_KVM_GUEST) += kvm.o
+
 ifneq ($(CONFIG_PPC_INDIRECT_IO),y)
 obj-y  += iomap.o
 endif
diff --git a/arch/powerpc/kernel/kvm.c b/arch/powerpc/kernel/kvm.c
new file mode 100644
--- /dev/null
+++ b/arch/powerpc/kernel/kvm.c
@@ -0,0 +1,30 @@
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301, USA.
+ *
+ * Copyright IBM Corp. 2008
+ *
+ * Authors:
+ * Hollis Blanchard [EMAIL PROTECTED]
+ * Christian Ehrhardt [EMAIL PROTECTED]
+ */
+
+#include <linux/percpu.h>
+#include <linux/mm.h>
+#include <linux/kvm_para.h>
+
+void __init kvm_guest_init(void)
+{
+   if (!kvm_para_available())
+   return;
+}
diff --git a/arch/powerpc/kernel/setup_32.c b/arch/powerpc/kernel/setup_32.c
--- a/arch/powerpc/kernel/setup_32.c
+++ b/arch/powerpc/kernel/setup_32.c
@@ -17,6 +17,7 @@
 #include <linux/cpu.h>
 #include <linux/console.h>
 #include <linux/lmb.h>
+#include <linux/kvm_para.h>
 
 #include <asm/io.h>
 #include <asm/prom.h>
@@ -319,5 +320,7 @@
ppc_md.setup_arch();
if ( ppc_md.progress ) ppc_md.progress("arch: exit", 0x3eab);
 
+   kvm_guest_init();
+
paging_init();
 }
diff --git a/arch/powerpc/platforms/44x/Kconfig b/arch/powerpc/platforms/44x/Kconfig
--- a/arch/powerpc/platforms/44x/Kconfig
+++ b/arch/powerpc/platforms/44x/Kconfig
@@ -152,3 +152,10 @@
 # 44x errata/workaround config symbols, selected by the CPU models above
 config IBM440EP_ERR42
bool
+
+config KVM_GUEST
+   bool "KVM Guest support"
+   depends on EXPERIMENTAL
+   help
+   This option enables various optimizations for running under the KVM
+   hypervisor.
diff --git a/include/asm-powerpc/kvm_para.h b/include/asm-powerpc/kvm_para.h
--- a/include/asm-powerpc/kvm_para.h
+++ b/include/asm-powerpc/kvm_para.h
@@ -14,7 +14,9 @@
  *
  * Copyright IBM Corp. 2008
  *
- * Authors: Hollis Blanchard [EMAIL PROTECTED]
+ * Authors:
+ * Hollis Blanchard [EMAIL PROTECTED]
+ * Christian Ehrhardt [EMAIL PROTECTED]
  */
 
 #ifndef __POWERPC_KVM_PARA_H__
@@ -22,15 +24,50 @@
 
 #ifdef __KERNEL__
 
+#include <linux/of.h>
+
+static struct kvmppc_para_features {
+   char *dtcell;
+   int feature;
+} para_features[] = {
+};
+
 static inline int kvm_para_available(void)
 {
-   return 0;
+   struct device_node *dn;
+   int ret;
+
+   dn = of_find_node_by_path("/hypervisor");
+   ret = !!dn;
+
+   of_node_put(dn);
+
+   return ret;
 }
 
 static inline unsigned int kvm_arch_para_features(void)
 {
-   return 0;
+   struct device_node *dn;
+   const int *dtval;
+   unsigned int features = 0;
+   int i;
+
+   dn = of_find_node_by_path("/hypervisor");
+   if (!dn)
+   return 0;
+
+   for (i = 0; i < ARRAY_SIZE(para_features); i++) {
+   dtval = of_get_property(dn, para_features[i].dtcell, NULL);
+   if (dtval && *dtval == 1)
+   features |= (1 << para_features[i].feature);
+   }
+
+   of_node_put(dn);
+
+   return features;
 }
+
+void kvm_guest_init(void);
 
 #endif /* __KERNEL__ */
 
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH 1/6] kvmppc: read device tree hypervisor node infrastructure

2008-07-24 Thread Christian Ehrhardt

Tony Breeds wrote:

On Wed, Jul 23, 2008 at 10:36:42AM +0200, [EMAIL PROTECTED] wrote:

Hi Christian,
A few comments inlined ...

  

[...]

+
 static inline int kvm_para_available(void)
 {
-   return 0;
+   struct device_node *dn;
+
+   dn = of_find_node_by_path("/hypervisor");



You need an of_node_put(dn);

  
I just looked at the linux/of.h and did not see that I have to free it 
again.

Thanks for the hint, I inserted both calls.

+
+   return !!dn;
 }
 
 static inline unsigned int kvm_arch_para_features(void)

 {
-   return 0;
+   struct device_node *dn;
+   const int *dtval;
+   unsigned int features = 0;
+   int i;
+
+   dn = of_find_node_by_path("/hypervisor");
+   if (!dn)
+   return 0;
+
+   for (i = 0; i < ARRAY_SIZE(para_features)-1; i++) {



Why -1?  Isn't ARRAY_SIZE(para_features) adequate?
  


Yeah, I already had this, but the change was folded into the wrong patch;
fixed now.
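
The effect of the stray -1 discussed above can be reproduced in plain C. The
following is a minimal userspace sketch, not code from the patch: ARRAY_SIZE
is redefined locally, and the one-entry table and both helper functions are
hypothetical names chosen for illustration.

```c
#include <stddef.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical one-entry table shaped like para_features[]. */
struct feature_entry {
	const char *dtcell;
	int feature;
};

static const struct feature_entry table[] = {
	{ "feature,pv-magicpage", 0 },
};

/* Loop bound with the stray -1: the last entry is never visited. */
static unsigned int mask_with_minus_one(void)
{
	unsigned int mask = 0;
	size_t i;

	for (i = 0; i < ARRAY_SIZE(table) - 1; i++)
		mask |= 1u << table[i].feature;
	return mask;
}

/* Corrected loop bound: every entry contributes its bit. */
static unsigned int mask_full(void)
{
	unsigned int mask = 0;
	size_t i;

	for (i = 0; i < ARRAY_SIZE(table); i++)
		mask |= 1u << table[i].feature;
	return mask;
}
```

With a single entry, mask_with_minus_one() yields 0 while mask_full() yields
1, so the only registered feature would never be reported. Because sizeof
produces an unsigned size_t, the subtraction on an empty table would wrap
around rather than become negative.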


[...]

Yours Tony

  linux.conf.au    http://www.marchsouth.org/
  Jan 19 - 24 2009 The Australian Linux Technical Conference!

  



--

Grüsse / regards, 
Christian Ehrhardt

IBM Linux Technology Center, Open Virtualization


Re: [PATCH 3/6] kvmppc: add hypercall infrastructure - guest part

2008-07-24 Thread Christian Ehrhardt

Tony Breeds wrote:

On Wed, Jul 23, 2008 at 10:36:44AM +0200, [EMAIL PROTECTED] wrote:
  

From: Christian Ehrhardt [EMAIL PROTECTED]



Hi Christian,
  

This adds the guest portion of the hypercall infrastructure, basically an
illegal instruction with a defined layout.
See http://kvm.qumranet.com/kvmwiki/PowerPC_Hypercall_ABI for more detail
on the hypercall ABI for powerpc.

Signed-off-by: Christian Ehrhardt [EMAIL PROTECTED]
---

[diffstat]
 kvm_para.h |   16 
 1 file changed, 16 insertions(+)

[diff]
diff --git a/include/asm-powerpc/kvm_para.h b/include/asm-powerpc/kvm_para.h
--- a/include/asm-powerpc/kvm_para.h
+++ b/include/asm-powerpc/kvm_para.h
@@ -25,6 +25,8 @@
 #ifdef __KERNEL__
 
 #include <linux/of.h>

+
+#define KVM_HYPERCALL_BIN 0x03ff



Ummm didn't you add this in patch 2 of 6?
  

This is just because I initially wanted to split the Host & Guest patch series.
I need to separate my patches a bit more anyway for the next submission.
Thanks for pointing out this duplication.



Yours Tony

  linux.conf.au    http://www.marchsouth.org/
  Jan 19 - 24 2009 The Australian Linux Technical Conference!

  



--

Grüsse / regards, 
Christian Ehrhardt

IBM Linux Technology Center, Open Virtualization


Re: [PATCH 0/6][RFC] kvmppc: paravirtualization interface

2008-07-24 Thread Christian Ehrhardt

Tony Breeds wrote:

On Wed, Jul 23, 2008 at 10:36:41AM +0200, [EMAIL PROTECTED] wrote:
  

From: Christian Ehrhardt [EMAIL PROTECTED]

This patch series implements a paravirtualization interface using:
- the device tree mechanism to pass hypervisor information to the guest
- hypercalls for guest-host calls
- an example exploiter of that interface (magic page)
This is work in progress, but working so far. I have only just started to
really exploit the functionality behind the magic page mechanism, therefore
I can't provide any performance improvements so far, but it is evolved
enough for an RFC and to start the standardization discussion.



Are you aiming this for the current merge window, ie for 2.6.27?
  
The aim is not really fixed. It would be nice to get into 2.6.27, but
I can't yet estimate how long it will take.


Actually the guest patches could already go through review and
upstream, since the guest code changes are not that big (the
major part of the implementation will go via kvmppc -> kvm upstream).
But since I first want to discuss the standardization on the embedded
hypervisor list, the naming of the device tree entries is not
fixed yet.
Therefore I can't yet say which kernel version merge window I'll
target/reach.


By the way, regarding embedded hypervisor: I was advised that this is a
closed list, which I had forgotten.
Sorry to everyone who got bounces on a reply-all. The next version
of the patch series will go to the involved open source lists only, and a
separate mail series, more standardization than patch style, will go to
embedded hypervisor.



Yours Tony

  linux.conf.au    http://www.marchsouth.org/
  Jan 19 - 24 2009 The Australian Linux Technical Conference!
  

--

Grüsse / regards, 
Christian Ehrhardt

IBM Linux Technology Center, Open Virtualization


[PATCH 1/6] kvmppc: read device tree hypervisor node infrastructure

2008-07-23 Thread ehrhardt
From: Christian Ehrhardt [EMAIL PROTECTED]

This patch adds the guest portion of the device tree based host-guest
communication. Using the device tree infrastructure this patch implements
kvm_para_available and kvm_arch_para_features (in this patch just the
infrastructure, no specific feature registered).

Signed-off-by: Christian Ehrhardt [EMAIL PROTECTED]
---

[diffstat]
 arch/powerpc/kernel/Makefile   |2 ++
 arch/powerpc/kernel/kvm.c  |   30 ++
 arch/powerpc/kernel/setup_32.c |3 +++
 arch/powerpc/platforms/44x/Kconfig |7 +++
 include/asm-powerpc/kvm_para.h |   37 ++---
 5 files changed, 76 insertions(+), 3 deletions(-)

[diff]

diff --git a/arch/powerpc/kernel/Makefile b/arch/powerpc/kernel/Makefile
--- a/arch/powerpc/kernel/Makefile
+++ b/arch/powerpc/kernel/Makefile
@@ -80,6 +80,8 @@
 
 obj-$(CONFIG_8XX_MINIMAL_FPEMU) += softemu8xx.o
 
+obj-$(CONFIG_KVM_GUEST) += kvm.o
+
 ifneq ($(CONFIG_PPC_INDIRECT_IO),y)
 obj-y  += iomap.o
 endif
diff --git a/arch/powerpc/kernel/kvm.c b/arch/powerpc/kernel/kvm.c
new file mode 100644
--- /dev/null
+++ b/arch/powerpc/kernel/kvm.c
@@ -0,0 +1,30 @@
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301, USA.
+ *
+ * Copyright IBM Corp. 2008
+ *
+ * Authors:
+ * Hollis Blanchard [EMAIL PROTECTED]
+ * Christian Ehrhardt [EMAIL PROTECTED]
+ */
+
+#include <linux/percpu.h>
+#include <linux/mm.h>
+#include <linux/kvm_para.h>
+
+void __init kvm_guest_init(void)
+{
+   if (!kvm_para_available())
+   return;
+}
diff --git a/arch/powerpc/kernel/setup_32.c b/arch/powerpc/kernel/setup_32.c
--- a/arch/powerpc/kernel/setup_32.c
+++ b/arch/powerpc/kernel/setup_32.c
@@ -17,6 +17,7 @@
 #include <linux/cpu.h>
 #include <linux/console.h>
 #include <linux/lmb.h>
+#include <linux/kvm_para.h>
 
 #include <asm/io.h>
 #include <asm/prom.h>
@@ -319,5 +320,7 @@
ppc_md.setup_arch();
if ( ppc_md.progress ) ppc_md.progress("arch: exit", 0x3eab);
 
+   kvm_guest_init();
+
paging_init();
 }
diff --git a/arch/powerpc/platforms/44x/Kconfig b/arch/powerpc/platforms/44x/Kconfig
--- a/arch/powerpc/platforms/44x/Kconfig
+++ b/arch/powerpc/platforms/44x/Kconfig
@@ -152,3 +152,10 @@
 # 44x errata/workaround config symbols, selected by the CPU models above
 config IBM440EP_ERR42
bool
+
+config KVM_GUEST
+   bool "KVM Guest support"
+   depends on EXPERIMENTAL
+   help
+   This option enables various optimizations for running under the KVM
+   hypervisor.
diff --git a/include/asm-powerpc/kvm_para.h b/include/asm-powerpc/kvm_para.h
--- a/include/asm-powerpc/kvm_para.h
+++ b/include/asm-powerpc/kvm_para.h
@@ -14,7 +14,9 @@
  *
  * Copyright IBM Corp. 2008
  *
- * Authors: Hollis Blanchard [EMAIL PROTECTED]
+ * Authors:
+ * Hollis Blanchard [EMAIL PROTECTED]
+ * Christian Ehrhardt [EMAIL PROTECTED]
  */
 
 #ifndef __POWERPC_KVM_PARA_H__
@@ -22,15 +24,44 @@
 
 #ifdef __KERNEL__
 
+#include <linux/of.h>
+
+static struct kvmppc_para_features {
+   char *dtcell;
+   int feature;
+} para_features[] = {
+};
+
 static inline int kvm_para_available(void)
 {
-   return 0;
+   struct device_node *dn;
+
+   dn = of_find_node_by_path("/hypervisor");
+
+   return !!dn;
 }
 
 static inline unsigned int kvm_arch_para_features(void)
 {
-   return 0;
+   struct device_node *dn;
+   const int *dtval;
+   unsigned int features = 0;
+   int i;
+
+   dn = of_find_node_by_path("/hypervisor");
+   if (!dn)
+   return 0;
+
+   for (i = 0; i < ARRAY_SIZE(para_features)-1; i++) {
+   dtval = of_get_property(dn, para_features[i].dtcell, NULL);
+   if (dtval && *dtval == 1)
+   features |= (1 << para_features[i].feature);
+   }
+
+   return features;
 }
+
+void kvm_guest_init(void);
 
 #endif /* __KERNEL__ */
 


[PATCH 5/6] kvmppc: magic page paravirtualization - guest part

2008-07-23 Thread ehrhardt
From: Christian Ehrhardt [EMAIL PROTECTED]

This patch adds the guest handling for the magic page mechanism. A hypervisor
can modify the device tree passed to the guest. Using that already existing
interface, a guest can simply detect available hypervisor features and agree
on the supported ones using hypercalls.
In this example the guest checks for the feature switch feature,pv-magicpage
in the hypervisor node and for additional data, data,pv-magicpage-size, which
represents the size the hypervisor requests.
When the guest has read that data and wants to support the feature, the memory
is allocated and passed to the hypervisor using the
KVM_HCALL_RESERVE_MAGICPAGE hypercall.
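
The registration sequence described above can be sketched in userspace C.
This is only an illustration under stated assumptions: the device tree read
and the hypercall are replaced by function pointers, malloc() stands in for
alloc_bootmem(), and all names in the block (enum reg_result,
register_magic_page(), the mock callbacks) are hypothetical.

```c
#include <stdint.h>
#include <stdlib.h>

/* Possible outcomes of the sketch; these names are not from the patch. */
enum reg_result { REG_OK, REG_NO_SIZE, REG_NO_MEM, REG_HCALL_FAILED };

/* Stand-ins for the device tree read and the hypercall (hypothetical). */
typedef int (*read_size_fn)(void);
typedef long (*reserve_hcall_fn)(uintptr_t addr);

static void *magic_area;

static enum reg_result register_magic_page(read_size_fn read_size,
					   reserve_hcall_fn reserve)
{
	int size = read_size();

	/* A negative size means the data entry could not be read. */
	if (size < 0)
		return REG_NO_SIZE;

	/* The guest only guarantees the requested size, not a layout. */
	magic_area = malloc((size_t)size);
	if (!magic_area)
		return REG_NO_MEM;

	/* Hand the address of the area to the hypervisor. */
	if (reserve((uintptr_t)magic_area) != 0) {
		free(magic_area);
		magic_area = NULL;
		return REG_HCALL_FAILED;
	}
	return REG_OK;
}

/* Mock callbacks used to exercise the flow. */
static int size_from_tree(void) { return 1024; }
static int size_missing(void) { return -22; /* -EINVAL on Linux */ }
static long reserve_ok(uintptr_t addr) { (void)addr; return 0; }
static long reserve_fail(uintptr_t addr) { (void)addr; return -1; }
```

Calling register_magic_page(size_from_tree, reserve_ok) walks the success
path; swapping in size_missing or reserve_fail exercises the two error paths
that the patch reports via printk.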

Signed-off-by: Christian Ehrhardt [EMAIL PROTECTED]
---

[diffstat]
 arch/powerpc/kernel/kvm.c  |   48 +
 include/asm-powerpc/kvm_para.h |   27 ++-
 2 files changed, 74 insertions(+), 1 deletion(-)

[diff]
diff --git a/arch/powerpc/kernel/kvm.c b/arch/powerpc/kernel/kvm.c
--- a/arch/powerpc/kernel/kvm.c
+++ b/arch/powerpc/kernel/kvm.c
@@ -22,9 +22,57 @@
 #include <linux/percpu.h>
 #include <linux/mm.h>
 #include <linux/kvm_para.h>
+#include <linux/bootmem.h>
+
+/*
+ * this is guest memory granted to the hypervisor;
+ * the hypervisor can place data in this area and rewrite
+ * privileged instructions to read from this area without
+ * trapping.
+ * Only the Hypervisor needs to be aware of the structure layout
+ * which makes the guest more flexible - the guest only guarantees
+ * the size which is requested by the hypervisor and read from a
+ * device tree entry.
+ */
+void *kvm_magicpage;
+
+static void __init kvmppc_register_magic_page(void)
+{
+   unsigned long paddr;
+   int size;
+   long err;
+
+   size = kvmppc_pv_read_data(KVM_PVDATA_MAGICPAGE_SIZE);
+   if (size < 0) {
+   printk(KERN_ERR"%s: couldn't read size for kvmppc style "
+   "paravirtualization support (got %d)\n",
+   __func__, size);
+   return;
+   }
+
+   /* FIXME Guest SMP needs that percpu
+* On SMP we might also need a free implementation */
+   kvm_magicpage = alloc_bootmem(size);
+   if (!kvm_magicpage) {
+   printk(KERN_ERR"%s - failed to allocate %d bytes\n",
+__func__, size);
+   return;
+   }
+
+   paddr = (unsigned long)__pa(kvm_magicpage);
+   err = kvm_hypercall1(KVM_HCALL_RESERVE_MAGICPAGE, paddr);
+   if (err)
+   printk(KERN_ERR"%s: couldn't register magic page\n", __func__);
+   else
+   printk(KERN_NOTICE"%s: registered %d bytes for "
+   "virtualization support\n", __func__, size);
+}
 
 void __init kvm_guest_init(void)
 {
if (!kvm_para_available())
return;
+
+   if (kvm_para_has_feature(KVM_FEATURE_PPCPV_MAGICPAGE))
+   kvmppc_register_magic_page();
 }
diff --git a/include/asm-powerpc/kvm_para.h b/include/asm-powerpc/kvm_para.h
--- a/include/asm-powerpc/kvm_para.h
+++ b/include/asm-powerpc/kvm_para.h
@@ -28,10 +28,18 @@
 
 #define KVM_HYPERCALL_BIN 0x03ff
 
+#define KVM_HCALL_RESERVE_MAGICPAGE 0
+
+#define KVM_PVDATA_MAGICPAGE_SIZE  "data,pv-magicpage-size"
+
+/* List of PV features supported, returned as a bitfield */
+#define KVM_FEATURE_PPCPV_MAGICPAGE 0
+
 static struct kvmppc_para_features {
char *dtcell;
int feature;
 } para_features[] = {
+   { "feature,pv-magicpage", KVM_FEATURE_PPCPV_MAGICPAGE }
 };
 
 static inline int kvm_para_available(void)
@@ -54,13 +62,30 @@
if (!dn)
return 0;
 
-   for (i = 0; i < ARRAY_SIZE(para_features)-1; i++) {
+   for (i = 0; i < ARRAY_SIZE(para_features); i++) {
dtval = of_get_property(dn, para_features[i].dtcell, NULL);
if (dtval && *dtval == 1)
features |= (1 << para_features[i].feature);
}
 
return features;
+}
+
+/* reads the specified data field out of the hypervisor node */
+static inline int kvmppc_pv_read_data(char *dtcell)
+{
+   struct device_node *dn;
+   const int *dtval;
+
+   dn = of_find_node_by_path("/hypervisor");
+   if (!dn)
+   return -EINVAL;
+
+   dtval = of_get_property(dn, dtcell, NULL);
+   if (dtval)
+   return *dtval;
+   else
+   return -EINVAL;
 }
 
 void kvm_guest_init(void);


[PATCH 3/6] kvmppc: add hypercall infrastructure - guest part

2008-07-23 Thread ehrhardt
From: Christian Ehrhardt [EMAIL PROTECTED]

This adds the guest portion of the hypercall infrastructure, basically an
illegal instruction with a defined layout.
See http://kvm.qumranet.com/kvmwiki/PowerPC_Hypercall_ABI for more detail
on the hypercall ABI for powerpc.
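
How such an "illegal instruction with a defined layout" can be recognized is
sketched below. On PowerPC the primary opcode is the upper 6 bits of the
32-bit instruction word, so a word with primary opcode 0 that matches the
agreed constant can be treated as a hypercall. The helper names
primary_opcode() and is_hypercall() are hypothetical, and the constant is
copied as it appears in this patch text.

```c
#include <stdbool.h>
#include <stdint.h>

#define KVM_HYPERCALL_BIN 0x03ff /* value as it appears in this patch */

/* Primary opcode: the upper 6 bits of a 32-bit PowerPC instruction word. */
static inline uint32_t primary_opcode(uint32_t inst)
{
	return inst >> 26;
}

/* A hypercall is the one reserved word inside primary opcode 0. */
static inline bool is_hypercall(uint32_t inst)
{
	return primary_opcode(inst) == 0 && inst == KVM_HYPERCALL_BIN;
}
```

For comparison, the word 0x7c0004ac (sync) has primary opcode 31 and is
therefore rejected before the full-word comparison is even needed.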

Signed-off-by: Christian Ehrhardt [EMAIL PROTECTED]
---

[diffstat]
 kvm_para.h |   16 
 1 file changed, 16 insertions(+)

[diff]
diff --git a/include/asm-powerpc/kvm_para.h b/include/asm-powerpc/kvm_para.h
--- a/include/asm-powerpc/kvm_para.h
+++ b/include/asm-powerpc/kvm_para.h
@@ -25,6 +25,8 @@
 #ifdef __KERNEL__
 
 #include <linux/of.h>
+
+#define KVM_HYPERCALL_BIN 0x03ff
 
 static struct kvmppc_para_features {
char *dtcell;
@@ -63,6 +65,20 @@
 
 void kvm_guest_init(void);
 
+static inline long kvm_hypercall1(unsigned int nr, unsigned long p1)
+{
+   register unsigned long hcall asm ("r0") = nr;
+   register unsigned long arg1 asm ("r3") = p1;
+   register long ret asm ("r11");
+
+   asm volatile(".long %1"
+   : "=r"(ret)
+   : "i"(KVM_HYPERCALL_BIN), "r"(hcall), "r"(arg1)
+   : "r4", "r5", "r6", "r7", "r8",
+ "r9", "r10", "r12", "cc");
+   return ret;
+}
+
 #endif /* __KERNEL__ */
 
 #endif /* __POWERPC_KVM_PARA_H__ */


[PATCH 6/6] kvmppc: kvm-userspace: device tree modification for magicpage

2008-07-23 Thread ehrhardt
From: Christian Ehrhardt [EMAIL PROTECTED]

This patch to kvm-userspace connects the other host & guest patches in this
series. On guest initialization it checks the host's capabilities for the
magicpage mechanism. If available, the device tree passed to the guest gets the
hypervisor node added, and in that node the feature flag and the requested
magic page size (read from the host kernel via an ioctl) are stored.
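
The decision logic described above can be modelled without libfdt. This is a
sketch under assumptions: struct hv_node_sketch and describe_magicpage() are
hypothetical, the capability check and the size query are passed in as plain
values, and an error is returned where the patch itself calls exit(1).

```c
#include <stdbool.h>

/* Simplified picture of the /hypervisor node handed to the guest. */
struct hv_node_sketch {
	bool present;               /* node was added to the tree */
	int feature_pv_magicpage;   /* models feature,pv-magicpage */
	int data_pv_magicpage_size; /* models data,pv-magicpage-size */
};

/*
 * Fill in the node if the host offers the magic page capability.
 * Returns 0 on success (including "nothing to do"), -1 if the size
 * query failed.
 */
static int describe_magicpage(bool host_has_cap, int host_size,
			      struct hv_node_sketch *node)
{
	node->present = false;
	node->feature_pv_magicpage = 0;
	node->data_pv_magicpage_size = 0;

	if (!host_has_cap)
		return 0;   /* guest sees no hypervisor node at all */

	if (host_size < 0)
		return -1;  /* the patch treats this as fatal */

	node->present = true;
	node->feature_pv_magicpage = 1;
	node->data_pv_magicpage_size = host_size;
	return 0;
}
```

A guest booted from a tree built this way would find the feature flag set to
1 only when the host kernel reported the capability.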

Signed-off-by: Christian Ehrhardt [EMAIL PROTECTED]
---

[diffstat]
 libkvm/libkvm-powerpc.c |6 ++
 libkvm/libkvm.h |6 ++
 qemu/hw/device_tree.c   |   10 ++
 qemu/hw/device_tree.h   |1 +
 qemu/hw/ppc440_bamboo.c |   15 +++
 qemu/qemu-kvm-powerpc.c |5 +
 qemu/qemu-kvm.h |1 +
 7 files changed, 44 insertions(+)

[diff]
diff --git a/libkvm/libkvm-powerpc.c b/libkvm/libkvm-powerpc.c
--- a/libkvm/libkvm-powerpc.c
+++ b/libkvm/libkvm-powerpc.c
@@ -19,6 +19,7 @@
 
 #include "libkvm.h"
 #include "kvm-powerpc.h"
+#include <sys/ioctl.h>
 #include <errno.h>
 #include <stdio.h>
 #include <inttypes.h>
@@ -105,6 +106,11 @@
return 0;
 }
 
+int kvm_get_magicpage_size(kvm_context_t kvm)
+{
+   return ioctl(kvm->fd, KVM_GET_PPCPV_MAGICPAGE_SIZE, 0);
+}
+
 int kvm_arch_run(struct kvm_run *run, kvm_context_t kvm, int vcpu)
 {
int ret = 0;
diff --git a/libkvm/libkvm.h b/libkvm/libkvm.h
--- a/libkvm/libkvm.h
+++ b/libkvm/libkvm.h
@@ -639,6 +639,12 @@
 
 #endif
 
+#ifdef KVM_CAP_PPCPV_MAGICPAGE
+
+int kvm_get_magicpage_size(kvm_context_t kvm);
+
+#endif
+
 int kvm_translate(kvm_context_t kvm, int vcpu, struct kvm_translation *tr);
 
 #endif
diff --git a/qemu/hw/device_tree.c b/qemu/hw/device_tree.c
--- a/qemu/hw/device_tree.c
+++ b/qemu/hw/device_tree.c
@@ -190,4 +190,14 @@
exit(1);
}
 }
+
+void dt_add_subnode(void *fdt, const char *name, char *node_path)
+{
+   int offset;
+   offset = get_offset_of_node(fdt, node_path);
+   if (fdt_add_subnode(fdt, offset, name) < 0) {
+   printf("Unable to create device tree node '%s'\n", name);
+   exit(1);
+   }
+}
 #endif
diff --git a/qemu/hw/device_tree.h b/qemu/hw/device_tree.h
--- a/qemu/hw/device_tree.h
+++ b/qemu/hw/device_tree.h
@@ -23,4 +23,5 @@
uint32_t *val_array, int size);
 void dt_string(void *fdt, char *node_path, char *property,
char *string);
+void dt_add_subnode(void *fdt, const char *name, char *node_path);
 #endif
diff --git a/qemu/hw/ppc440_bamboo.c b/qemu/hw/ppc440_bamboo.c
--- a/qemu/hw/ppc440_bamboo.c
+++ b/qemu/hw/ppc440_bamboo.c
@@ -51,6 +51,7 @@
uint32_t cpu_freq;
uint32_t timebase_freq;
uint32_t mem_reg_property[]={0, 0, ram_size};
+   int pv_magicpage_size;
 
printf("%s: START\n", __func__);
 
@@ -167,6 +168,20 @@
dt_cell(fdt, "/chosen", "linux,initrd-end",
(initrd_base + initrd_size));
dt_string(fdt, "/chosen", "bootargs", (char *)kernel_cmdline);
+
+   if (kvm_enabled()
+&& kvm_qemu_check_extension(KVM_CAP_PPCPV_MAGICPAGE)) {
+   pv_magicpage_size = kvmppc_pv_get_magicpage_size();
+   if (pv_magicpage_size < 0) {
+   fprintf(stderr, "%s: error reading magic page size\n",
+__func__);
+   exit(1);
+   }
+   dt_add_subnode(fdt, "hypervisor", "/");
+   dt_cell(fdt, "/hypervisor", "feature,pv-magicpage", 1);
+   dt_cell(fdt, "/hypervisor", "data,pv-magicpage-size",
+   pv_magicpage_size);
+   }
 #endif
 
if (kvm_enabled()) {
diff --git a/qemu/qemu-kvm-powerpc.c b/qemu/qemu-kvm-powerpc.c
--- a/qemu/qemu-kvm-powerpc.c
+++ b/qemu/qemu-kvm-powerpc.c
@@ -214,6 +214,11 @@
 return 0; /* XXX ignore failed DCR ops */
 }
 
+int kvmppc_pv_get_magicpage_size(void)
+{
+   return kvm_get_magicpage_size(kvm_context);
+}
+
 int mmukvm_get_physical_address(CPUState *env, mmu_ctx_t *ctx,
 target_ulong eaddr, int rw, int access_type)
 {
diff --git a/qemu/qemu-kvm.h b/qemu/qemu-kvm.h
--- a/qemu/qemu-kvm.h
+++ b/qemu/qemu-kvm.h
@@ -86,6 +86,7 @@
 #ifdef TARGET_PPC
 int handle_powerpc_dcr_read(int vcpu, uint32_t dcrn, uint32_t *data);
 int handle_powerpc_dcr_write(int vcpu,uint32_t dcrn, uint32_t data);
+int kvmppc_pv_get_magicpage_size();
 #endif
 
 #if !defined(SYS_signalfd)


[PATCH 0/6][RFC] kvmppc: paravirtualization interface

2008-07-23 Thread ehrhardt
From: Christian Ehrhardt [EMAIL PROTECTED]

This patch series implements a paravirtualization interface using:
- the device tree mechanism to pass hypervisor information to the guest
- hypercalls for guest-host calls
- an example exploiter of that interface (magic page)
This is work in progress, but working so far. I have only just started to
really exploit the functionality behind the magic page mechanism, therefore
I can't provide any performance improvements so far, but it is evolved
enough for an RFC and to start the standardization discussion.

The used hypercall ABI was already discussed on the embedded-hypervisor mailing
list and is available at http://kvm.qumranet.com/kvmwiki/PowerPC_Hypercall_ABI

The device tree format used here (=base for the discussions on
embedded-hypervisor) is the following.
- A node hypervisor to show the general availability of some hypervisor data
- flags for features like the example feature,pv-magicpage
  setting 1 = available, everything else = unavailable
- Some features might need to pass more data and can use an entry in the
  device tree like the example of data,pv-magicpage-size
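
How a guest could interpret a node in this format is sketched below. The
sketch assumes a mocked property table instead of the real device tree API;
struct hv_prop, find_prop(), feature_available() and read_data() are
hypothetical names, the entry "feature,example-disabled" is made up, and 1024
is simply the size the host patch in this series returns.

```c
#include <stddef.h>
#include <string.h>

/* One property of the mocked hypervisor node. */
struct hv_prop {
	const char *name;
	int value;
};

/* Mocked node contents; the last entry is a made-up disabled feature. */
static const struct hv_prop hypervisor_node[] = {
	{ "feature,pv-magicpage", 1 },
	{ "data,pv-magicpage-size", 1024 },
	{ "feature,example-disabled", 2 },
};

/* Look up a property by name, or return NULL if it is absent. */
static const int *find_prop(const char *name)
{
	size_t n = sizeof(hypervisor_node) / sizeof(hypervisor_node[0]);
	size_t i;

	for (i = 0; i < n; i++)
		if (strcmp(hypervisor_node[i].name, name) == 0)
			return &hypervisor_node[i].value;
	return NULL;
}

/* A feature is available only if its flag is exactly 1. */
static int feature_available(const char *name)
{
	const int *v = find_prop(name);

	return v != NULL && *v == 1;
}

/* Data entries return their value, or -1 if the entry is absent. */
static int read_data(const char *name)
{
	const int *v = find_prop(name);

	return v ? *v : -1;
}
```

Treating every value other than 1 as "unavailable" keeps absent entries and
unexpected values on the same safe path.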

Parties on cc:
linuxppc-dev@ozlabs.org
  The patches affect code in the generic powerpc boot setup, so I would be
  happy about comments on whether the hooks are ok that way.
[EMAIL PROTECTED]
  This power.org TSC discusses standardization of the virtualization
  interfaces. With its simple changes, this patch series is well suited
  to start the discussion about the device tree there.
[EMAIL PROTECTED]
  The code is made for kvm on powerpc which lives on this list.

[patches in series]
Subject: [PATCH 1/6] kvmppc: read device tree hypervisor node infrastructure
  Provides the guest functionality to read hypervisor features from the
  device tree and adds the basic hook to the powerpc boot and setup code
Subject: [PATCH 2/6] kvmppc: add hypercall infrastructure - host part
Subject: [PATCH 3/6] kvmppc: add hypercall infrastructure - guest part
  patches 2 & 3 add the hypercall infrastructure as mentioned above
Subject: [PATCH 4/6] kvmppc: magic page hypercall - host part
Subject: [PATCH 5/6] kvmppc: magic page paravirtualization - guest part
  patches 4 & 5 add the magic page mechanism which will later on be used for
  binary rewriting the guest.
Subject: [PATCH 6/6] kvmppc: kvm-userspace: device tree modification for magicpage
  This connects host and guest by reading host capabilities and modifying the
  device tree passed to the guest accordingly

---
[diffstat]
 arch/powerpc/kernel/kvm.c|   48 +++
 arch/powerpc/kvm/emulate.c   |5 +++
 b/arch/powerpc/kernel/Makefile   |2 +
 b/arch/powerpc/kernel/kvm.c  |   30 +
 b/arch/powerpc/kernel/setup_32.c |3 ++
 b/arch/powerpc/kvm/emulate.c |   27 +++
 b/arch/powerpc/kvm/powerpc.c |   18 -
 b/arch/powerpc/platforms/44x/Kconfig |7 +
 b/include/asm-powerpc/kvm_para.h |   37 --
 b/include/linux/kvm.h|6 
 b/libkvm/libkvm-powerpc.c|6 
 b/libkvm/libkvm.h|6 
 b/qemu/hw/device_tree.c  |   10 +++
 b/qemu/hw/device_tree.h  |1
 b/qemu/hw/ppc440_bamboo.c|   15 ++
 b/qemu/qemu-kvm-powerpc.c|5 +++
 b/qemu/qemu-kvm.h|1
 include/asm-powerpc/kvm_para.h   |   47 +-
 18 files changed, 269 insertions(+), 5 deletions(-)


[PATCH 4/6] kvmppc: magic page hypercall - host part

2008-07-23 Thread ehrhardt
From: Christian Ehrhardt [EMAIL PROTECTED]

This adds the host part of the magic page registration. This is a memory
area of the guest granted to the host.
The patch just introduces the infrastructure to receive the guest paddr.
This is work in progress and it is intended to later on use this memory
as a storage area a guest can read unprivileged (using binary rewriting to
change privileged instructions).

Signed-off-by: Christian Ehrhardt [EMAIL PROTECTED]
---

[diffstat]
 arch/powerpc/kvm/emulate.c |5 +
 arch/powerpc/kvm/powerpc.c |   18 +-
 include/asm-powerpc/kvm_para.h |2 ++
 include/linux/kvm.h|6 ++
 4 files changed, 30 insertions(+), 1 deletion(-)

[diff]
diff --git a/arch/powerpc/kvm/emulate.c b/arch/powerpc/kvm/emulate.c
--- a/arch/powerpc/kvm/emulate.c
+++ b/arch/powerpc/kvm/emulate.c
@@ -208,6 +208,11 @@
int ret = 0;
 
switch (vcpu->arch.gpr[0]) {
+   case KVM_HCALL_RESERVE_MAGICPAGE:
+   /* FIXME TODO implement the real functionality using that */
+   printk(KERN_ERR"%s - receive magicpage address 0x%x\n",
+   __func__, vcpu->arch.gpr[3]);
+   break;
default:
printk(KERN_ERR"unknown hypercall %d\n", vcpu->arch.gpr[0]);
kvmppc_dump_vcpu(vcpu);
diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -148,6 +148,9 @@
case KVM_CAP_COALESCED_MMIO:
r = KVM_COALESCED_MMIO_PAGE_OFFSET;
break;
+   case KVM_CAP_PPCPV_MAGICPAGE:
+   r = 1;
+   break;
default:
r = 0;
break;
@@ -159,7 +162,20 @@
 long kvm_arch_dev_ioctl(struct file *filp,
 unsigned int ioctl, unsigned long arg)
 {
-   return -EINVAL;
+   long r = -EINVAL;
+
+   switch (ioctl) {
+   case KVM_GET_PPCPV_MAGICPAGE_SIZE:
+   r = -EINVAL;
+   if (arg)
+   goto out;
+   r = 1024;
+   break;
+   default:
+   r = -EINVAL;
+   }
+out:
+   return r;
 }
 
 int kvm_arch_set_memory_region(struct kvm *kvm,
diff --git a/include/asm-powerpc/kvm_para.h b/include/asm-powerpc/kvm_para.h
--- a/include/asm-powerpc/kvm_para.h
+++ b/include/asm-powerpc/kvm_para.h
@@ -24,6 +24,8 @@
 
 #define KVM_HYPERCALL_BIN 0x03ff
 
+#define KVM_HCALL_RESERVE_MAGICPAGE0
+
 static inline int kvm_para_available(void)
 {
return 0;
diff --git a/include/linux/kvm.h b/include/linux/kvm.h
--- a/include/linux/kvm.h
+++ b/include/linux/kvm.h
@@ -365,6 +365,11 @@
 #define KVM_TRACE_PAUSE   _IO(KVMIO,  0x07)
 #define KVM_TRACE_DISABLE _IO(KVMIO,  0x08)
 /*
+ * ioctls for powerpc paravirtualization extensions
+ */
+#define KVM_GET_PPCPV_MAGICPAGE_SIZE   _IO(KVMIO,   0x09)
+
+/*
  * Extension capability list.
  */
 #define KVM_CAP_IRQCHIP  0
@@ -382,6 +387,7 @@
 #define KVM_CAP_PV_MMU 13
 #define KVM_CAP_MP_STATE 14
 #define KVM_CAP_COALESCED_MMIO 15
+#define KVM_CAP_PPCPV_MAGICPAGE 16
 
 /*
  * ioctls for VM fds


[PATCH 2/6] kvmppc: add hypercall infrastructure - host part

2008-07-23 Thread ehrhardt
From: Christian Ehrhardt [EMAIL PROTECTED]

This adds the host portion of the hypercall infrastructure which receives
the guest calls - no specific hcall function is implemented in this patch.
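
The register convention used by the host dispatcher (hypercall number in r0,
first argument in r3, return value in r11, program counter advanced by one
instruction) can be modelled in a few lines. This is a sketch, not the kernel
code: struct vcpu_sketch and do_hypercall_sketch() are hypothetical, and the
single case shown borrows the number that a later patch in this series
assigns to the magic page call.

```c
#include <errno.h>
#include <stdint.h>

#define HCALL_RESERVE_MAGICPAGE 0 /* number used later in this series */

/* Minimal stand-in for the vcpu state the dispatcher touches. */
struct vcpu_sketch {
	uint32_t gpr[32];
	uint32_t pc;
	uint32_t magicpage_addr; /* hypothetical field for the example */
};

static int do_hypercall_sketch(struct vcpu_sketch *vcpu)
{
	int ret = 0;

	switch (vcpu->gpr[0]) {          /* r0 selects the hypercall */
	case HCALL_RESERVE_MAGICPAGE:
		vcpu->magicpage_addr = vcpu->gpr[3]; /* r3: first argument */
		break;
	default:
		ret = -ENOSYS;           /* unknown call number */
	}

	vcpu->gpr[11] = (uint32_t)ret;   /* r11 carries the result back */
	vcpu->pc += 4;                   /* step over the hypercall word */
	return ret;
}
```

Advancing the program counter inside the dispatcher is why the caller in the
patch sets advance = 0 for this case.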

Signed-off-by: Christian Ehrhardt [EMAIL PROTECTED]
---

[diffstat]
 arch/powerpc/kvm/emulate.c |   27 +++
 include/asm-powerpc/kvm_para.h |2 ++
 2 files changed, 29 insertions(+)

[diff]
diff --git a/arch/powerpc/kvm/emulate.c b/arch/powerpc/kvm/emulate.c
--- a/arch/powerpc/kvm/emulate.c
+++ b/arch/powerpc/kvm/emulate.c
@@ -203,6 +203,24 @@
kvmppc_set_msr(vcpu, vcpu->arch.srr1);
 }
 
+static int kvmppc_do_hypercall(struct kvm_vcpu *vcpu)
+{
+   int ret = 0;
+
+   switch (vcpu->arch.gpr[0]) {
+   default:
+   printk(KERN_ERR"unknown hypercall %d\n", vcpu->arch.gpr[0]);
+   kvmppc_dump_vcpu(vcpu);
+   ret = -ENOSYS;
+   }
+
+   vcpu->arch.gpr[11] = ret;
+   vcpu->arch.pc += 4; /* Advance past hypercall instruction. */
+
+   return ret;
+}
+
+
 /* XXX to do:
  * lhax
  * lhaux
@@ -232,6 +250,15 @@
int advance = 1;
 
switch (get_op(inst)) {
+   case 0:
+   if (inst == KVM_HYPERCALL_BIN) {
+   kvmppc_do_hypercall(vcpu);
+   advance = 0; /* kvmppc_do_hypercall handles the PC. */
+   } else {
+   printk(KERN_ERR "unknown op %d\n", get_op(inst));
+   emulated = EMULATE_FAIL;
+   }
+   break;
case 3: /* trap */
printk("trap!\n");
kvmppc_queue_exception(vcpu, BOOKE_INTERRUPT_PROGRAM);
diff --git a/include/asm-powerpc/kvm_para.h b/include/asm-powerpc/kvm_para.h
--- a/include/asm-powerpc/kvm_para.h
+++ b/include/asm-powerpc/kvm_para.h
@@ -22,6 +22,8 @@
 
 #ifdef __KERNEL__
 
+#define KVM_HYPERCALL_BIN 0x03ff
+
 static inline int kvm_para_available(void)
 {
return 0;


Re: pci issue - wrong detection of pci ressources

2008-04-22 Thread Christian Ehrhardt
Grüsse / regards, 
Christian Ehrhardt

IBM Linux Technology Center, Open Virtualization
Subject: [PATCH][dts][radeonfb]: fix pci mem in dts and radeonfb resource 
variables

From: Christian Ehrhardt [EMAIL PROTECTED]

This patch fixes the sequoia.dts device tree file to match the values defined
in the 440EPx data sheet from AMCC.
That fixes an issue where my graphics card could not initialize because the PCI
resource space was not big enough.
The related mail thread with the background on this has the subject "pci
issue - wrong detection of pci ressources".
After these values were fixed, another modification that came up in the mail
thread was needed to prevent an error. This change fixes the type of the
resource variables in the radeon frame buffer driver (we might want to split
that into two patches).

Signed-off-by: Christian Ehrhardt [EMAIL PROTECTED]
---

[diffstat]
 arch/powerpc/boot/dts/sequoia.dts |9 +++--
 drivers/video/aty/radeonfb.h  |4 ++--
 2 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/boot/dts/sequoia.dts 
b/arch/powerpc/boot/dts/sequoia.dts
--- a/arch/powerpc/boot/dts/sequoia.dts
+++ b/arch/powerpc/boot/dts/sequoia.dts
@@ -344,9 +344,14 @@
/* Outbound ranges, one memory and one IO,
 * later cannot be changed. Chip supports a second
 * IO range but we don't use it for now
+* From the 440EPx user manual:
+* PCI 1 Memory 1 8000  1 BFFF  1GB
+* I/O  1 E800  1 E800  64KB
+* I/O  1 E880  1 EBFF  56MB
 */
-   ranges = 0200 0 8000 1 8000 0 1000
-   0100 0  1 e800 0 0010;
+   ranges = 0200 0 8000 1 8000 0 4000
+   0100 0  1 e800 0 0001
+   0100 0  1 e880 0 0380;
 
/* Inbound 2GB range starting at 0 */
dma-ranges = 4200 0 0 0 0 0 8000;
diff --git a/drivers/video/aty/radeonfb.h b/drivers/video/aty/radeonfb.h
--- a/drivers/video/aty/radeonfb.h
+++ b/drivers/video/aty/radeonfb.h
@@ -287,8 +287,8 @@ struct radeonfb_info {
 
charname[DEVICE_NAME_SIZE];
 
-   unsigned long   mmio_base_phys;
-   unsigned long   fb_base_phys;
+   resource_size_t mmio_base_phys;
+   resource_size_t fb_base_phys;
 
void __iomem*mmio_base;
void __iomem*fb_base;

Re: pci issue - wrong detection of pci ressources

2008-04-22 Thread Christian Ehrhardt

Sergei Shtylyov wrote:

Hello.

Christian Ehrhardt wrote:


[...]



The Documentation of the 440EPx core lists these spaces:
PCI 1 Memory 1 8000  1 BFFF  1GB
I/O  1 E800  1 E800  64KB
I/O  1 E880  1 EBFF  56MB


Having 2 I/O spaces looks just wrong. Actually, PCs do well with only 
64K of I/O space.




ok - I just wanted to be complete.
I removed the 56M section from the new dts file patch.

[...]


radeonfb: EDID probed
Parsing EDID data for panel info
Setting up default mode based on panel info
radeonfb (:00:0a.0): ATI Radeon Y`


   Hm, what's that Y`?


That's the final message in the radeonfb driver after initializing everything:
  printk ("radeonfb (%s): %s\n", pci_name(rinfo->pdev), rinfo->name);
I wonder why rinfo->name is clobbered - maybe another issue; I have to 
keep that in mind.

[...]


   I think you'd better use Ben's patch that he's just posted:

http://patchwork.ozlabs.org/linuxppc/patch?id=18034

WBR, Sergei


yep - I use Ben's patch now which reduces my patch to the actual dts fix.
Updated patch attached.

--

Grüsse / regards, 
Christian Ehrhardt

IBM Linux Technology Center, Open Virtualization
Subject: [PATCH][dts]: fix pci mem in sequoia dts

From: Christian Ehrhardt [EMAIL PROTECTED]

This patch fixes the sequoia.dts device tree file to match the values defined
in the 440EPx data sheet from AMCC.
That fixes an issue where my graphics card could not initialize because the PCI
resource space was not big enough.
The related mail thread with the background on this has the subject "pci
issue - wrong detection of pci ressources".

Signed-off-by: Christian Ehrhardt [EMAIL PROTECTED]
---

[diffstat]
 sequoia.dts |8 ++--
 1 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/boot/dts/sequoia.dts 
b/arch/powerpc/boot/dts/sequoia.dts
--- a/arch/powerpc/boot/dts/sequoia.dts
+++ b/arch/powerpc/boot/dts/sequoia.dts
@@ -344,9 +344,13 @@
/* Outbound ranges, one memory and one IO,
 * later cannot be changed. Chip supports a second
 * IO range but we don't use it for now
+* From the 440EPx user manual:
+* PCI 1 Memory 1 8000  1 BFFF  1GB
+* I/O  1 E800  1 E800  64KB
+* I/O  1 E880  1 EBFF  56MB
 */
-   ranges = 0200 0 8000 1 8000 0 1000
-   0100 0  1 e800 0 0010;
+   ranges = 0200 0 8000 1 8000 0 4000
+   0100 0  1 e800 0 0001;
 
/* Inbound 2GB range starting at 0 */
dma-ranges = 4200 0 0 0 0 0 8000;

Re: [PATCH 1/3] radeonfb: Fix 64 bits resources on 32 bits archs

2008-04-22 Thread Christian Ehrhardt

Benjamin Herrenschmidt wrote:

This fixes radeonfb to not truncate 64 bits resources on 32 bits
platforms. Unfortunately, there are still issues with addresses
returned to userspace via struct fb_fix_screeninfo. This will
have to be dealt with separately.


Thanks for this patch, Benjamin. I use it together with what we have discussed in the 
"pci issue - wrong detection of pci ressources" thread.
Unfortunately I now hit exactly the issue with fb_fix_screeninfo that you describe.

For everyone else: fb_fix_screeninfo has two unsigned long variables that need to 
store an I/O address.
This fails in my case on a 32-bit powerpc system (sizeof(long) == 4) that has physical 
addresses above 4GB, and it should actually affect any 32-bit platform with physical addresses above 4GB.
You see it e.g. when you try to initialize X11: the X11 radeon driver issues a 
FBIOGET_FSCREENINFO ioctl, and because our address is above 4GB it gets truncated by 
that unsigned long in the fb_fix_screeninfo structure (the value comes from a 
resource_size_t variable, which holds the correct 64-bit value).

struct fb_fix_screeninfo {
   char id[16];              /* identification string, e.g. "TT Builtin" */
   unsigned long smem_start; /* Start of frame buffer mem */
[...]
   unsigned long mmio_start; /* Start of Memory Mapped I/O   */
[...]

I tried the naive solution of just changing the fb_fix_screeninfo structure to 
resource_size_t, but that changes the size of the data the ioctl transports and would require 
awareness in the userspace applications using that ioctl.
The X11 and framebuffer drivers work on 64-bit systems, so I think it's just a 
matter of not cutting that data down to 32 bits when transporting it (has anyone 
already checked the X11 drivers? I hope they don't use unsigned long too).

I wanted to ask if there are any known workarounds atm that would allow me to 
use my X11 for now?

--

Grüsse / regards, 
Christian Ehrhardt

IBM Linux Technology Center, Open Virtualization


Re: pci issue - wrong detection of pci ressources

2008-04-21 Thread Christian Ehrhardt

Benjamin Herrenschmidt wrote:

Yes, you're right. Early in the PCI initialization there are errors allocating 
PCI resources.
And those are exactly the resources failing later, so the PCI initialization 
seems to be the reason for my problem.
Was there any simple solution (e.g. just somehow increasing the memory reserved for 
PCI) when you came across that issue, Johan?


Hrm... I was expecting to see a lot more output here, make sure you have
debug on your command line (or enable early debug output, same
effect).


There is nothing more, even with debug on the kernel command line.
But I added some printk's around the resource handling to get a better feeling for 
what's happening.
I attached the new extended boot log and the patch with the printk's to make it 
clearer which message is printed where.

I tried to understand how I might e.g. increase the number of available 
resources or their size, but unfortunately that is not the simplest code when 
working with it for the first time ;-)


Cheers,
Ben.




For comparison I defined DEBUG in the good kernel (arch=ppc) and this is what 
the initialization prints (pci ...:0a:1 is the secondary head of the same 
graphics card, and it's not an issue if that's not allocated):
good case:
PCI: Probing PCI hardware
PCI: bridge rsrc 0.. (100), parent c0354624
PCI: bridge rsrc 8000..8fff (200), parent c0354608
PCI::00:0a.0: Resource 0: 8800-8fff (f=1208)
PCI::00:0a.0: Resource 1: ff00- (f=101)
PCI::00:0a.0: Resource 2: 87ff-87ff (f=200)
PCI::00:0a.1: Resource 0: 7800-7fff (f=1208)
PCI: Cannot allocate resource region 0 of device :00:0a.1
PCI::00:0a.1: Resource 1: 77ff-77ff (f=200)
PCI: Cannot allocate resource region 1 of device :00:0a.1
PCI: Failed to allocate mem resource #0:[EMAIL PROTECTED] for :00:0a.1

--

Grüsse / regards, 
Christian Ehrhardt

IBM Linux Technology Center, Open Virtualization
PCI host bridge /plb/[EMAIL PROTECTED] (primary) ranges:
 MEM 0x00018000..0x00018fff - 0x8000
  IO 0x0001e800..0x0001e80f - 0x
4xx PCI DMA offset set to 0x
PCI: Probing PCI hardware
PCI: Hiding 4xx host bridge resources :00:00.0
Try to map irq for :00:00.0...
 - got one, spec 2 cells (0x0003 0x0008...) on /interrupt-controller2
 - mapped to linux irq 16
Try to map irq for :00:0a.0...
 - got one, spec 2 cells (0x0003 0x0008...) on /interrupt-controller2
 - mapped to linux irq 16
Try to map irq for :00:0a.1...
PCI: PHB (bus 0) bridge rsrc 0: -000f [0x100], 
parent c0365060 (PCI IO)
__request_resource - request cf8045b0 name '/plb/[EMAIL PROTECTED]' start 0x0 
end 0x0
__request_resource - no conflict parent c0365060 sibling 
PCI: PHB (bus 0) bridge rsrc 1: 00018000-00018fff [0x200], 
parent c0365038 (PCI mem)
__request_resource - request cf8045d8 name '/plb/[EMAIL PROTECTED]' start 0x1 
end 0x8000
__request_resource - no conflict parent c0365038 sibling 
PCI: Assigning unassigned resouces...
pci_assign_resource - allocate with IORESOURCE_PREFETCH
pci_bus_alloc_resource - enter
pci_assign_resource - second pci_bus_alloc_resource call
pci_bus_alloc_resource - enter
pci_bus_alloc_resource - call allocate ressource size 0 startcalc 134217728, 
align -1
find_resource - size 0, min 0x800, max 0x1
find_resource - found start 0x1 end 0x8000
__request_resource - request cf810578 name ':00:0a.0' start 0x1 end 
0x8000
__request_resource - no conflict parent cf8045d8 sibling 
pci_assign_resource - allocate with IORESOURCE_PREFETCH
pci_bus_alloc_resource - enter
pci_assign_resource - second pci_bus_alloc_resource call
pci_bus_alloc_resource - enter
pci_bus_alloc_resource - call allocate ressource size 0 startcalc 134217728, 
align -1
find_resource - size 0, min 0x800, max 0x1
find_resource - continue with start 0x1 on 8800
find_resource - found start 0x1 end 0x8800
__request_resource - request cf810178 name ':00:0a.1' start 0x1 end 
0x8800
__request_resource - no conflict parent cf8045d8 sibling 
pci_assign_resource - allocate with IORESOURCE_PREFETCH
pci_bus_alloc_resource - enter
pci_assign_resource - second pci_bus_alloc_resource call
pci_bus_alloc_resource - enter
pci_bus_alloc_resource - call allocate ressource size 0 startcalc 131072, align 
-1
find_resource - size 0, min 0x2, max 0x1
find_resource - continue with start 0x1 on 8800
find_resource - continue with start 0x1 on 9000
find_resource - no this - exit
PCI: pci_assign_resource - Failed to allocate mem resource #6:[EMAIL PROTECTED] 
for :00:0a.0
pci_assign_resource - allocate with IORESOURCE_PREFETCH
pci_bus_alloc_resource - enter
pci_bus_alloc_resource - call allocate ressource size 0 startcalc 65536, align 
-1
find_resource - size 0, min

Re: pci issue - wrong detection of pci ressources

2008-04-21 Thread Christian Ehrhardt

Sergei Shtylyov wrote:

Hello.

Christian Ehrhardt wrote:


Cheers,
Ben.


For comparison I defined DEBUG in the good kernel (arch=ppc) and this 
is what the initialization prints (pci ...:0a:1 is the secondary head 
of the same graphics card, and it's not an issue if that's not allocated):

[...]
   You've changed fb_base_phys and mmio_base_phys to resource_size_t 
which is 64-bit, so use %llx to print them.



Thanks, you're absolutely right. I sometimes forget that I need long long for 
64-bit values on 32-bit archs (and ignored the warnings :-( ).
I corrected the printk format strings and attached the new logs.

[...]

+else {
+printk(KERN_ERR "%s - continue with start 0x%0lx on %p\n", __func__, (this->end + 1), this->sibling);
+}
new->start = this->end + 1;
this = this->sibling;
this = this-sibling;


   And here. Yet it's not clear why you call resource's 'end' 'start'...


It's the new->start that gets calculated one line after that new else part.
I printed that one to see a bit of how the loop iterates over the resource elements.



WBR, Sergei



--

Grüsse / regards, 
Christian Ehrhardt

IBM Linux Technology Center, Open Virtualization
With DEBUG in arch/powerpc/kernel/pci-common.c, debug on the command line and a 
patch with some printk's (I attached the patch because it is the usual "put 
printk's everywhere" approach, so the diff helps to understand where the prints come 
from). Corrected printf format specifiers according to the comments from Sergei 
Shtylyov.

PCI host bridge /plb/[EMAIL PROTECTED] (primary) ranges:
 MEM 0x00018000..0x00018fff - 0x8000
  IO 0x0001e800..0x0001e80f - 0x
4xx PCI DMA offset set to 0x
PCI: Probing PCI hardware
PCI: Hiding 4xx host bridge resources :00:00.0
Try to map irq for :00:00.0...
 - got one, spec 2 cells (0x0003 0x0008...) on /interrupt-controller2
 - mapped to linux irq 16
Try to map irq for :00:0a.0...
 - got one, spec 2 cells (0x0003 0x0008...) on /interrupt-controller2
 - mapped to linux irq 16
Try to map irq for :00:0a.1...
PCI: PHB (bus 0) bridge rsrc 0: -000f [0x100], 
parent c0365060 (PCI IO)
__request_resource - request 0xcf8045b0 name '/plb/[EMAIL PROTECTED]' start 0x0 
end 0xf
__request_resource - no conflict parent 0xc0365060 sibling 0x
PCI: PHB (bus 0) bridge rsrc 1: 00018000-00018fff [0x200], 
parent c0365038 (PCI mem)
__request_resource - request 0xcf8045d8 name '/plb/[EMAIL PROTECTED]' start 
0x18000 end 0x18fff
__request_resource - no conflict parent 0xc0365038 sibling 0x
PCI: Assigning unassigned resouces...
pci_assign_unassigned_resources -#1- bus 0xcf82d400
pci_assign_unassigned_resources -#2- bus 0xcf82d400
pci_assign_resource - allocate with IORESOURCE_PREFETCH
pci_bus_alloc_resource - enter
pci_assign_resource - second pci_bus_alloc_resource call
pci_bus_alloc_resource - enter
pci_bus_alloc_resource - call allocate ressource size 0x800 startcalc 
0x, align 0x800
find_resource - size 0x800, min 0x18000, max 0x
find_resource - found start 0x18000 end 0x187ff
__request_resource - request 0xcf810578 name ':00:0a.0' start 0x18000 
end 0x187ff
__request_resource - no conflict parent 0xcf8045d8 sibling 0x
pci_assign_resource - allocate with IORESOURCE_PREFETCH
pci_bus_alloc_resource - enter
pci_assign_resource - second pci_bus_alloc_resource call
pci_bus_alloc_resource - enter
pci_bus_alloc_resource - call allocate ressource size 0x800 startcalc 
0x, align 0x800
find_resource - size 0x800, min 0x18000, max 0x
find_resource - continue with start 0x18800 on 0x
find_resource - found start 0x18800 end 0x18fff
__request_resource - request 0xcf810178 name ':00:0a.1' start 0x18800 
end 0x18fff
__request_resource - no conflict parent 0xcf8045d8 sibling 0x
pci_assign_resource - allocate with IORESOURCE_PREFETCH
pci_bus_alloc_resource - enter
pci_assign_resource - second pci_bus_alloc_resource call
pci_bus_alloc_resource - enter
pci_bus_alloc_resource - call allocate ressource size 0x2 startcalc 
0x, align 0x2
find_resource - size 0x2, min 0x18000, max 0x
find_resource - continue with start 0x18800 on 0xcf810178
find_resource - continue with start 0x19000 on 0x
find_resource - no this - exit
PCI: pci_assign_resource - Failed to allocate mem resource #6:[EMAIL PROTECTED] 
for :00:0a.0
pci_assign_resource - allocate with IORESOURCE_PREFETCH
pci_bus_alloc_resource - enter
pci_bus_alloc_resource - call allocate ressource size 0x1 startcalc 
0x, align 0x1
find_resource - size 0x1, min 0x18000, max 0x
find_resource - continue with start 0x18800 on 0xcf810178
find_resource - continue with start 0x19000 on 0x
find_resource - no this - exit
PCI

Re: pci issue - wrong detection of pci ressources

2008-04-20 Thread Christian Ehrhardt

Johan Borkhuis wrote:

Hello Christian,

Christian Ehrhardt wrote:

Hi,
I tried to use a Radeon R200-based graphics card on a Sequoia PPC 
(440EPx) board. I was puzzled by the radeonfb initialization, which 
failed with

__ioremap(): phys addr 0x0 is RAM lr c029cf80
radeonfb (:00:0a.0): cannot map MMIO
radeonfb: probe of :00:0a.0 failed with error -5


[...]


I came across a similar problem, which (ultimately) was caused by a lack 
of memory reserved for PCI. I moved from 2.6.14(ppc) to 2.6.20(powerpc), 
and suddenly some cards stopped working: the BAR registers were not 
initialized, so it was not possible to access the cards.
Have a look at the boot-time messages, especially the early messages, as 
the PCI subsystem is started very early in the boot process. You could 
also try switching on PCI-debugging, and have a look at the debug 
messages, or add some extra debugging info to the pci-initialization code.


Yes, you're right. Early in the PCI initialization there are errors allocating 
PCI resources.
And those are exactly the resources failing later, so the PCI initialization 
seems to be the reason for my problem.
Was there any simple solution (e.g. just somehow increasing the memory reserved for 
PCI) when you came across that issue, Johan?

With DEBUG in pci-common.c enabled (bad kernel) and an extension showing which 
function's allocation fails (added a %s for __func__):
PCI host bridge /plb/[EMAIL PROTECTED] (primary) ranges:
MEM 0x00018000..0x00018fff - 0x8000
 IO 0x0001e800..0x0001e80f - 0x
4xx PCI DMA offset set to 0x
PCI: Probing PCI hardware
PCI: Hiding 4xx host bridge resources :00:00.0
Try to map irq for :00:00.0...
- got one, spec 2 cells (0x0003 0x0008...) on /interrupt-controller2
- mapped to linux irq 16
Try to map irq for :00:0a.0...
- got one, spec 2 cells (0x0003 0x0008...) on /interrupt-controller2
- mapped to linux irq 16
Try to map irq for :00:0a.1...
PCI: PHB (bus 0) bridge rsrc 0: -000f [0x100], 
parent c0363060 (PCI IO)
PCI: PHB (bus 0) bridge rsrc 1: 00018000-00018fff [0x200], 
parent c0363038 (PCI mem)
PCI: Assigning unassigned resouces...
PCI: pci_assign_resource - Failed to allocate mem resource #6:[EMAIL PROTECTED] 
for :00:0a.0
PCI: pci_assign_resource - Failed to allocate mem resource #2:[EMAIL PROTECTED] 
for :00:0a.0
PCI: pci_assign_resource - Failed to allocate mem resource #1:[EMAIL PROTECTED] 
for :00:0a.1

--

Grüsse / regards, 
Christian Ehrhardt

IBM Linux Technology Center, Open Virtualization





To be complete, in case we need it, I'll answer all the other questions:

Benjamin Herrenschmidt wrote:

On Fri, 2008-04-18 at 14:07 +0200, Christian Ehrhardt wrote:

=> Region 2 is not detected with our kernel; this later breaks things
like radeonfb initialization.


I'll need some information here:

- Your device-tree (is that the base sequoia one ?)


DTS File is the normal sequoia.dts file in arch/powerpc/boot/dts with the 
latest change being:
user:Stefan Roese [EMAIL PROTECTED]
date:Fri Feb 15 21:35:30 2008 -0600
summary: [POWERPC] 4xx: Remove i2c and xxmii-interface device_types 
from dts


- Enable DEBUG in arch/powerpc/kernel/pci-common.c and pci_32.c
- Send me the resulting dmesg log


done - full dmesg attached


- Also include the output of /proc/iomem


/proc/iomem - bad kernel
[EMAIL PROTECTED]:~# cat /proc/iomem
e300-e38f : ehci_hcd
18000-18fff : /plb/[EMAIL PROTECTED]
 18000-187ff : :00:0a.0
 18800-18fff : :00:0a.1
1ef600300-1ef600307 : serial
1ef600400-1ef600407 : serial
1ef600500-1ef600507 : serial
1ef600600-1ef600607 : serial
1fc00-1 : 1fc00.nor_flash

/proc/iomem - good kernel
[EMAIL PROTECTED]:~# cat /proc/iomem
8000-8fff : PCI host bridge
 8000-8000 : :00:0a.1
 8002-8003 : :00:0a.0
 87ff-87ff : :00:0a.0
   87ff-87ff : radeonfb mmio
 8800-8fff : :00:0a.0
   8800-8fff : radeonfb framebuffer
d000-d0001fff : ndfc-nand.0
e100-e17f : musbhsfc_udc.0
 e100-e17f : musbhsfc_udc
e300-e3ff : ppc-soc-ehci.0
e400-e4ff : ppc-soc-ohci.0
fc00- : physmap-flash.0
 fc00- : physmap-flash.0



Actually, there's a bug in radeonfb:

In radeonfb.h, try changing

unsigned long   mmio_base_phys;
unsigned long   fb_base_phys;

To

resource_size_t mmio_base_phys;
resource_size_t fb_base_phys;


This did not fix the issue, as we have seen that it is caused earlier, in the PCI 
initialization.
But that fix corrects the code whether or not it is useful in my case ;-)



Sergei Shtylyov wrote:

Christian Ehrhardt wrote:


[...]

Bad kernel:
00:0a.0 VGA compatible controller: ATI Technologies Inc RV280 [Radeon 
9200 PRO] (rev 01) (prog

pci issue - wrong detection of pci ressources

2008-04-18 Thread Christian Ehrhardt

Hi,
I tried to use a Radeon R200-based graphics card on a Sequoia PPC (440EPx) 
board. I was puzzled by the radeonfb initialization, which failed with
__ioremap(): phys addr 0x0 is RAM lr c029cf80
radeonfb (:00:0a.0): cannot map MMIO
radeonfb: probe of :00:0a.0 failed with error -5

I trigger a check in ioremap because the address it wants to remap is 0x0, 
which can never work. The reason for that is that the PCI resource of the 
graphics card is not properly detected.

With some help I found two kernels - one that works and one that has this issue.
Unfortunately they are very different:
  good = 2.6.24.2 from linux-2.6-denx - built for arch=ppc
  bad = 2.6.25-rc9 (used in our kvm ppc project atm) - built for 
arch=powerpc
I tried building 2.6.25-rc9 with arch=ppc, but that one does not boot so 
far. Because of that I can't tell you for sure whether it is only that difference that 
breaks the PCI detection.
We need arch=powerpc for our kvm code anyway, so I hope there is another 
solution than switching to arch=ppc ;-)

I have just started to debug this, but I wanted to ask here whether there are 
any known issues causing it and/or to get some hints on where to look.

The issue is much more visible when I boot these two kernels and use lspci 
-vvv

Good kernel:
00:0a.0 VGA compatible controller: ATI Technologies Inc RV280 [Radeon 9200 PRO] 
(rev 01) (prog-if 00 [VGA])
   Subsystem: PC Partner Limited Unknown device 0250
   Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- 
Stepping- SERR- FastB2B-
   Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium TAbort- TAbort- 
MAbort- SERR- PERR-
   Latency: 128 (2000ns min)
   Interrupt: pin A routed to IRQ 67
   Region 0: Memory at 8800 (32-bit, prefetchable) [size=128M]
   Region 1: I/O ports at ff00 [size=256]
   Region 2: Memory at 87ff (32-bit, non-prefetchable) [size=64K]
   Expansion ROM at 8002 [disabled] [size=128K]
   Capabilities: [50] Power Management version 2
   Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA 
PME(D0-,D1-,D2-,D3hot-,D3cold-)
   Status: D0 PME-Enable- DSel=0 DScale=0 PME-

Bad kernel:
00:0a.0 VGA compatible controller: ATI Technologies Inc RV280 [Radeon 9200 PRO] 
(rev 01) (prog-if 00 [VGA])
   Subsystem: PC Partner Limited Unknown device 0250
   Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- 
Stepping- SERR- FastB2B-
   Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium TAbort- TAbort- 
MAbort- SERR- PERR-
   Latency: 128 (2000ns min)
   Interrupt: pin A routed to IRQ 16
   Region 0: Memory at 18000 (32-bit, prefetchable) [size=128M]
   Region 1: I/O ports at 1000 [size=256]
   Region 2: Memory at ignored (32-bit, non-prefetchable)
   Capabilities: [50] Power Management version 2
   Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA 
PME(D0-,D1-,D2-,D3hot-,D3cold-)
   Status: D0 PME-Enable- DSel=0 DScale=0 PME-


=> Region 2 is not detected with our kernel; this later breaks things like 
radeonfb initialization.

--

Grüsse / regards, 
Christian Ehrhardt

IBM Linux Technology Center, Open Virtualization

P.S. I tested both pci slots of my board and both behave the same


Re: [kvm-devel] [PATCH 4 of 4] [KVM POWERPC] PowerPC 440 KVM implementation

2008-04-17 Thread Christian Ehrhardt

Avi Kivity wrote:

Avi Kivity wrote:

Hollis Blanchard wrote:

+config KVM
+ tristate "Kernel-based Virtual Machine (KVM) support"
+ depends on EXPERIMENTAL
+ select PREEMPT_NOTIFIERS
+ select ANON_INODES
+ ---help---
+ Support hosting virtualized guest machines. You will also
+ need to select one or more of the processor modules below.
+
+ This module provides access to the hardware capabilities through
+ a character device node named /dev/kvm.
+
+ To compile this as a module, choose M here: the module
+ will be called kvm.
+
+ If unsure, say N.


In my ignorance, I set KVM=m on a non-44x build, which then failed. 
This needs either to depend on 44x, or to be fixed to compile.


Setting 44x, I get

AS [M] arch/powerpc/kvm/booke_interrupts.o
arch/powerpc/kvm/booke_interrupts.S: Assembler messages:
arch/powerpc/kvm/booke_interrupts.S:351: Error: unsupported relocation 
against VCPU_HOST_TLB
arch/powerpc/kvm/booke_interrupts.S:352: Error: unsupported relocation 
against VCPU_SHADOW_TLB


Afaik we just don't support building kvm as a module atm. So a simple and fast 
solution would be to change the Kconfig options from tristate to bool.
Additionally, we still have some cross references between the code built for the two 
symbols used (KVM and KVM_BOOKE_HOST), which means that we need to ensure that if 
KVM is configured for powerpc we also select exactly one host implementation. To do 
so I changed the second option to a choice field, which always has 
one suboption selected.

To ensure that the only suboption we have atm can be selected, we need the 
smallest common denominator as a dependency of the KVM config option, which atm means 
putting "depends on 44x" there.
We can change that once we support modules and/or separate selections.

A patch for that is attached, but I would like to wait for Hollis' comments on 
it before you apply it.


So further Kconfig restrictions are needed, or perhaps a patch. .config 
attached.




[...]

--

Grüsse / regards, 
Christian Ehrhardt

IBM Linux Technology Center, Open Virtualization
--- arch/powerpc/kvm/Kconfig.save	2008-04-17 15:21:54.0 +0200
+++ arch/powerpc/kvm/Kconfig	2008-04-17 15:58:52.0 +0200
@@ -15,8 +15,8 @@
 if VIRTUALIZATION
 
 config KVM
-	tristate "Kernel-based Virtual Machine (KVM) support"
-	depends on EXPERIMENTAL
+	bool "Kernel-based Virtual Machine (KVM) support"
+	depends on EXPERIMENTAL && 44x
 	select PREEMPT_NOTIFIERS
 	select ANON_INODES
 	---help---
@@ -31,13 +31,22 @@
 
 	  If unsure, say N.
 
+choice
+	prompt "KVM host PowerPC processor support"
+	depends on KVM && 44x
+	default KVM_BOOKE_HOST
+	help
+	  This option sets the Kind of PowerPC processor to virtualize.
+
 config KVM_BOOKE_HOST
-	tristate "KVM host support for Book E PowerPC processors"
+	bool "Book E"
 	depends on KVM && 44x
 	---help---
 	  Provides host support for KVM on Book E PowerPC processors. Currently
 	  this works on 440 processors only.
 
+endchoice
+
 source drivers/virtio/Kconfig
 
 endif # VIRTUALIZATION