Please pull my perfcounters.git tree

2009-08-17 Thread Paul Mackerras
Ben,

The following changes since commit 64f1607ffbbc772685733ea63e6f7f4183df1b16:
  Linus Torvalds (1):
      Linux 2.6.31-rc6

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/paulus/perfcounters.git master

Please pull them into your powerpc-next branch.  I'll also ask Ingo
Molnar to pull them into the tip tree.

Thanks,
Paul.

Paul Mackerras (3):
  powerpc/32: Always order writes to halves of 64-bit PTEs
  powerpc: Allow perf_counters to access user memory at interrupt time
  perf_counter: powerpc: Add callchain support

 arch/powerpc/include/asm/pgtable.h   |    6 +-
 arch/powerpc/kernel/Makefile         |    2 +-
 arch/powerpc/kernel/asm-offsets.c    |    2 +
 arch/powerpc/kernel/exceptions-64s.S |   19 ++
 arch/powerpc/kernel/perf_callchain.c |  527 ++
 arch/powerpc/mm/slb.c                |   37 ++-
 arch/powerpc/mm/stab.c               |   11 +-
 7 files changed, 588 insertions(+), 16 deletions(-)
 create mode 100644 arch/powerpc/kernel/perf_callchain.c
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: simple gpio driver

2009-08-17 Thread Heiko Schocher
Hello Anton,

Anton Vorontsov wrote:
> Oops, I missed that patch, sorry.
> 
> On Mon, Aug 17, 2009 at 03:18:37PM -0600, Grant Likely wrote:
>> On Wed, Aug 12, 2009 at 11:49 PM, Heiko Schocher wrote:
>>> Hello Anton,
>>>
>>> i am trying to use the arch/powerpc/sysdev/simple_gpio.c driver,
>>> for accessing some gpios, and found, that u8_gpio_get()
>>> returns not only a 1 or a 0, instead it returns the real bit
>>> position from the gpio:
>>>
>>> gpio    return
>>> base    value
>>> 0       0/0x01
>>> 1       0/0x02
>>> 2       0/0x04
>>> 3       0/0x08
>>> 4       0/0x10
>>> 5       0/0x20
>>> 6       0/0x40
>>> 7       0/0x80
>>>
>>> I also use the arch/powerpc/platforms/52xx/mpc52xx_gpio.c and
>>> mpc52xx_gpt.c drivers, they all return for a gpio just a 1 or 0,
> 
> There is also arch/powerpc/sysdev/qe_lib/gpio.c and
> arch/powerpc/sysdev/mpc8xxx_gpio.c that don't do that.

Ah, okay.

>>> which seems correct to me, because a gpio can have only 1 or 0
>>> as state ... what do you think?
>> I think returning '1' is perhaps slightly 'better' (however you define
>> that), but I don't think the caller should make any assumptions beyond
>> zero/non-zero.
> 
> Yep. So I don't think that the patch is needed.

Yes, if the gpio lib only differs in zero versus non zero.

Thanks for the info

bye
Heiko
-- 
DENX Software Engineering GmbH, MD: Wolfgang Denk & Detlev Zundel
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany


Re: simple gpio driver

2009-08-17 Thread Heiko Schocher
Hello Grant,

Grant Likely wrote:
> On Wed, Aug 12, 2009 at 11:49 PM, Heiko Schocher wrote:
>> Hello Anton,
>>
>> i am trying to use the arch/powerpc/sysdev/simple_gpio.c driver,
>> for accessing some gpios, and found, that u8_gpio_get()
>> returns not only a 1 or a 0, instead it returns the real bit
>> position from the gpio:
>>
>> gpio    return
>> base    value
>> 0       0/0x01
>> 1       0/0x02
>> 2       0/0x04
>> 3       0/0x08
>> 4       0/0x10
>> 5       0/0x20
>> 6       0/0x40
>> 7       0/0x80
>>
>> I also use the arch/powerpc/platforms/52xx/mpc52xx_gpio.c and
>> mpc52xx_gpt.c drivers, they all return for a gpio just a 1 or 0,
>> which seems correct to me, because a gpio can have only 1 or 0
>> as state ... what do you think?
> 
> I think returning '1' is perhaps slightly 'better' (however you define

Yep.

> that), but I don't think the caller should make any assumptions beyond
> zero/non-zero.

Hmm... why? I think a gpio_pin can have as value only 0 or 1.
Ah, if you say zero versus non zero ... hmm... okay.

>> I solved this issue (if it is) with the following patch:
>>
>> diff --git a/arch/powerpc/sysdev/simple_gpio.c b/arch/powerpc/sysdev/simple_gpio.c
>> index 43c4569..bb0d79c 100644
>> --- a/arch/powerpc/sysdev/simple_gpio.c
>> +++ b/arch/powerpc/sysdev/simple_gpio.c
>> @@ -46,7 +46,7 @@ static int u8_gpio_get(struct gpio_chip *gc, unsigned int gpio)
>>  {
>>struct of_mm_gpio_chip *mm_gc = to_of_mm_gpio_chip(gc);
>>
>> -   return in_8(mm_gc->regs) & u8_pin2mask(gpio);
>> +   return (in_8(mm_gc->regs) & u8_pin2mask(gpio) ? 1 : 0);
> 
> For clarity, the brackets should be just around the & operands, and
> "!= 0" instead of "? 1 : 0" might result in slightly smaller code.
> 
> return (in_8(mm_gc->regs) & u8_pin2mask(gpio)) != 0;

Yep, you are right, thanks for the info.

bye
Heiko
-- 
DENX Software Engineering GmbH, MD: Wolfgang Denk & Detlev Zundel
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
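
The difference between the two return conventions discussed in this thread can be sketched in plain C. This is only an illustration, not the driver code: pin_to_mask(), gpio_get_raw() and gpio_get_bool() are hypothetical names, the register value is passed in instead of being read with in_8(), and the bit numbering simply follows the table quoted above (the real u8_pin2mask() may number bits differently).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical mask helper; bit numbering follows the table in the thread. */
static uint8_t pin_to_mask(unsigned int pin)
{
	assert(pin < 8);
	return (uint8_t)(1u << pin);
}

/* Raw form: returns 0 or the mask value itself (0x01 .. 0x80). */
static int gpio_get_raw(uint8_t reg, unsigned int pin)
{
	return reg & pin_to_mask(pin);
}

/* Normalized form, as suggested in the review: returns exactly 0 or 1. */
static int gpio_get_bool(uint8_t reg, unsigned int pin)
{
	return (reg & pin_to_mask(pin)) != 0;
}
```

A caller that only tests for zero versus non-zero behaves the same with either form; only a caller that compares the result against 1 would notice the difference.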


Re: Mailing lists (Was: Re: powerpc/405ex: Support cuImage for PPC405EX)

2009-08-17 Thread tiejun.chen
Stephen Rothwell wrote:
> Please do *not* send mail to both linuxppc-...@ozlabs.org and
> linuxppc-...@lists.ozlabs.org.   We all end up with two copies :-(
> 
> They are the same list.

Sorry for the inconvenience I caused :(

Best Regards
Tiejun


Re: [PATCH 1/3 v3] powerpc/32: Always order writes to halves of 64-bit PTEs

2009-08-17 Thread Benjamin Herrenschmidt
On Tue, 2009-08-18 at 09:00 +1000, Paul Mackerras wrote:
> On 32-bit systems with 64-bit PTEs, the PTEs have to be written in two
> 32-bit halves.  On SMP we write the higher-order half and then the
> lower-order half, with a write barrier between the two halves, but on
> UP there was no particular ordering of the writes to the two halves.
> 
> This extends the ordering that we already do on SMP to the UP case as
> well.  The reason is that with the perf_counter subsystem potentially
> accessing user memory at interrupt time to get stack traces, we have
> to be careful not to create an incorrect but apparently valid PTE even
> on UP.
> 
> Signed-off-by: Paul Mackerras 

Acked-by: Benjamin Herrenschmidt 
---
> ---
>  arch/powerpc/include/asm/pgtable.h |6 +++---
>  1 files changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/pgtable.h b/arch/powerpc/include/asm/pgtable.h
> index eb17da7..2a5da06 100644
> --- a/arch/powerpc/include/asm/pgtable.h
> +++ b/arch/powerpc/include/asm/pgtable.h
> @@ -104,8 +104,8 @@ static inline void __set_pte_at(struct mm_struct *mm, unsigned long addr,
>   else
>   pte_update(ptep, ~_PAGE_HASHPTE, pte_val(pte));
>  
> -#elif defined(CONFIG_PPC32) && defined(CONFIG_PTE_64BIT) && defined(CONFIG_SMP)
> - /* Second case is 32-bit with 64-bit PTE in SMP mode. In this case, we
> +#elif defined(CONFIG_PPC32) && defined(CONFIG_PTE_64BIT)
> + /* Second case is 32-bit with 64-bit PTE.  In this case, we
>* can just store as long as we do the two halves in the right order
>* with a barrier in between. This is possible because we take care,
>* in the hash code, to pre-invalidate if the PTE was already hashed,
> @@ -140,7 +140,7 @@ static inline void __set_pte_at(struct mm_struct *mm, unsigned long addr,
>  
>  #else
>   /* Anything else just stores the PTE normally. That covers all 64-bit
> -  * cases, and 32-bit non-hash with 64-bit PTEs in UP mode
> +  * cases, and 32-bit non-hash with 32-bit PTEs.
>*/
>   *ptep = pte;
>  #endif
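
The ordering described in the commit message above can be sketched in portable C. This is a minimal illustration under stated assumptions, not the kernel implementation: struct pte64, PTE_VALID and set_pte_ordered() are hypothetical names, the valid bit is assumed to live in the lower-order half, and a C11 release fence stands in for the kernel's write barrier.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: one 64-bit entry stored as two 32-bit words. */
struct pte64 {
	volatile uint32_t hi;	/* higher-order half */
	volatile uint32_t lo;	/* lower-order half; assumed to carry the valid bit */
};

#define PTE_VALID 0x1u		/* hypothetical flag, for illustration only */

/*
 * Write the higher-order half first, then a barrier, then the
 * lower-order half.  An observer that sees the valid bit in the lower
 * half therefore also sees the matching higher half.
 */
static void set_pte_ordered(struct pte64 *p, uint64_t val)
{
	assert(p != NULL);
	p->hi = (uint32_t)(val >> 32);
	atomic_thread_fence(memory_order_release);	/* stand-in for the write barrier */
	p->lo = (uint32_t)val;
}

/* Reassemble the 64-bit value from the two halves. */
static uint64_t read_pte(const struct pte64 *p)
{
	assert(p != NULL);
	return ((uint64_t)p->hi << 32) | p->lo;
}
```

The sketch assumes the entry is invalid before the update; without a fixed order, an interrupt arriving between the two stores could find a lower half marked valid next to a stale higher half.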



Re: [PATCH 2/3 v3] powerpc: Allow perf_counters to access user memory at interrupt time

2009-08-17 Thread Benjamin Herrenschmidt
On Tue, 2009-08-18 at 09:00 +1000, Paul Mackerras wrote:
> This provides a mechanism to allow the perf_counters code to access
> user memory in a PMU interrupt routine.  Such an access can cause
> various kinds of interrupt: SLB miss, MMU hash table miss, segment
> table miss, or TLB miss, depending on the processor.  This commit
> only deals with 64-bit classic/server processors, which use an MMU
> hash table.  32-bit processors are already able to access user memory
> at interrupt time.  Since we don't soft-disable on 32-bit, we avoid
> the possibility of reentering hash_page or the TLB miss handlers,
> since they run with interrupts disabled.

  .../...

> 
> Signed-off-by: Paul Mackerras 

Acked-by: Benjamin Herrenschmidt 




Re: [PATCH 3/3 v3] perf_counter: powerpc: Add callchain support

2009-08-17 Thread Benjamin Herrenschmidt
On Tue, 2009-08-18 at 09:01 +1000, Paul Mackerras wrote:
> This adds support for tracing callchains for powerpc, both 32-bit
> and 64-bit, and both in the kernel and userspace, from PMU interrupt
> context.

> Signed-off-by: Paul Mackerras 

Acked-by: Benjamin Herrenschmidt 





[RFC] Clock binding

2009-08-17 Thread Benjamin Herrenschmidt
So here's a followup to my discussion about the clock API.

I'm cooking up a patch that replaces our current primitive implementation
in arch/powerpc/kernel/clock.c with something along the lines of what I
described. However, I want a bit more churn here on the device-tree
related bits.

So, basically, the goal here is to define a binding so that we can link
a device's clock inputs to a clock provider's clock outputs.

In general, in a system, there are actually three "names" involved. The clock
provider output name, the clock signal name, and the clock input name on
the device. However, I want to avoid involving the clock signal name as
it's a "global" name and it will just end up being a mess if we start
exposing that.

So basically, it boils down to a device having some clock inputs,
referenced by names, that need to be linked to another node which is a
clock provider, which has outputs, referenced either by number or by name,
see discussion below.

First, why names, and not numbers? I.e. it's the OF "tradition" for
resources to just be an array, like interrupts, or address ranges in
"reg" properties, and one has to know what the Nth interrupt corresponds
to.

My answer here is that it may be the tradition, but it's crap :-) Names are
much better in the long run, besides it makes it easier to represent if
not all inputs have been wired. Also, to some extent, things like PCI do
encode a "name" with "reg" or "assigned-addresses" properties as part of
the config space offset in the top part of the address, and that has
proved very useful.

Thus I think using names is the way to go, and we should even generalize
that and add a new "interrupt-names" property to name the members of an
"interrupts" :-)

So back to the subject at hand. That leaves us with having to populate
the driver with some kind of map (I call it clock-map). Ideally, if
everything is named, which is the best approach imho, that map would
bind a list of:

- clock input name
- clock provider phandle
- clock output name on provider

However, it's a bit nasty to mix strings and numbers (phandles) in a
single property. It's possible, but would likely lead to the phandle not
being aligned and tools such as lsprop failing miserably to display
those properties in any kind of readable form.
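
To make the alignment concern concrete, here is a small sketch in C. It assumes a property value that starts on a 4-byte boundary and packs a NUL-terminated string directly followed by a 32-bit cell; the helper names and the example strings are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Offset at which a 32-bit cell would start if it directly followed the
 * NUL-terminated string `name` placed at offset `start` in a property.
 */
static size_t cell_offset_after_string(size_t start, const char *name)
{
	assert(name != NULL);
	return start + strlen(name) + 1;	/* +1 for the terminating NUL */
}

/* A cell is naturally aligned when its offset is a multiple of 4. */
static int is_cell_aligned(size_t offset)
{
	return offset % 4 == 0;
}
```

For example, a three-character name such as "bus" ends at offset 4, so the following cell happens to be aligned, while a four-character name such as "uart" pushes the cell to offset 5.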

My earlier emails proposed an approach like this:

- clock input names go into a "clock-names" property
  (which I suggest naming instead "clock-input-names" btw)

- the map goes into a "clock-map" property and for each input
  provides a phandle and a one cell numerical ID that identifies
  the clock on the source.

However, I really dislike that numerical clock ID. Magic numbers suck.
It should be a string. But I don't want to add a 3rd property in there.

Hence my idea below. It's not perfect but it's the least sucky I've come
up with so far. And then we can do some small refinements.

* Device has:

- "clock-input-names" as above
- "clock-map" contains list of phandle,index

* Clock source has:

- "clock-output-names" list of strings

The "index" in the clock map thus would reference the
"clock-output-names" array in the clock provider. That means that the
"magic number" here is entirely local to a given device-tree, doesn't
leak into driver code, which continues using names.

In addition, we can even have some smooth "upgrade" path from existing
"clock-frequency" properties by assuming that if "clock-output-names" is
absent, but "clock-frequency" exists, then index 0 references a fixed
frequency clock source without a driver. This could be generally handy
anyway to represent crystals or fixed bus clocks without having to write
a clock source driver for them.
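
The lookup implied by this proposal can be sketched in C. This is only an illustration of the idea, not the patch mentioned below: the structs are a hypothetical in-memory model of the proposed properties, resolve_clock() is an invented name, and the "clock-map" entries are assumed to run parallel to "clock-input-names".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory model of a clock provider node. */
struct clock_provider {
	unsigned int phandle;
	const char *const *output_names;	/* "clock-output-names", NULL if absent */
	size_t n_outputs;
	unsigned long clock_frequency;		/* "clock-frequency", 0 if absent */
};

/* One "clock-map" entry: provider phandle plus output index. */
struct clock_map_entry {
	unsigned int phandle;
	unsigned int index;
};

/* Hypothetical model of a device node that consumes clocks. */
struct clock_consumer {
	const char *const *input_names;		/* "clock-input-names" */
	const struct clock_map_entry *map;	/* "clock-map", parallel to input_names */
	size_t n_inputs;
};

struct clock_ref {
	const struct clock_provider *provider;
	const char *output_name;		/* NULL for a fixed-frequency source */
};

/* Resolve a device clock input name to a provider output; 0 on success, -1 on failure. */
static int resolve_clock(const struct clock_consumer *dev, const char *input,
			 const struct clock_provider *provs, size_t n_provs,
			 struct clock_ref *out)
{
	assert(dev != NULL && input != NULL && provs != NULL && out != NULL);

	for (size_t i = 0; i < dev->n_inputs; i++) {
		if (strcmp(dev->input_names[i], input) != 0)
			continue;
		for (size_t p = 0; p < n_provs; p++) {
			const struct clock_provider *prov = &provs[p];

			if (prov->phandle != dev->map[i].phandle)
				continue;
			if (prov->output_names != NULL) {
				/* The index is local to the tree: it selects an output name. */
				if (dev->map[i].index >= prov->n_outputs)
					return -1;
				out->provider = prov;
				out->output_name = prov->output_names[dev->map[i].index];
				return 0;
			}
			/* Upgrade path: no output names, but a fixed "clock-frequency". */
			if (prov->clock_frequency != 0 && dev->map[i].index == 0) {
				out->provider = prov;
				out->output_name = NULL;
				return 0;
			}
			return -1;
		}
		return -1;	/* phandle does not match any known provider */
	}
	return -1;		/* no such input name on this device */
}
```

In this model driver code only ever passes a name, and the numeric index never leaves the device-tree data.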

Any comments ?

I'll post a patch, maybe later today, implementing the above (I may or
may not have time to also convert the existing 512x code to it, we'll
see).

Cheers,
Ben.


 




Re: powerpc/405ex: Support cuImage for PPC405EX

2009-08-17 Thread Benjamin Herrenschmidt
On Tue, 2009-08-18 at 10:28 +0800, Tiejun Chen wrote:
> Summary: powerpc/405ex: Support cuImage for PPC405EX
> Reviewers: Benjamin and linux-ppc
> 
> This patch series adds cuImage support for the Kilauea board, which is based
> on the PPC405EX.

Thanks !

I'll let Josh pick that up.

Cheers,
Ben.


> Tested on the amcc kilauea board:
> ===
> ...
> => tftp 100 cuImage.kilauea
> Waiting for PHY auto negotiation to complete.. done
> ENET Speed is 100 Mbps - FULL duplex connection (EMAC0)
> Using ppc_4xx_eth0 device
> TFTP from server 192.168.1.2; our IP address is 192.168.1.103
> Filename 'cuImage.kilauea'.
> Load address: 0x100
> Loading: #
>  #
>  #
>  #
>  #
> done
> Bytes transferred = 1540945 (178351 hex)
> => bootm
> ## Booting kernel from Legacy Image at 0100 ...
>Image Name:   Linux-2.6.31-rc5-57857-g8df7f47-
>Created:  2009-08-17   6:31:13 UTC
>Image Type:   PowerPC Linux Kernel Image (gzip compressed)
>Data Size:1540881 Bytes =  1.5 MB
>Load Address: 0040
>Entry Point:  00400468
>Verifying Checksum ... OK
>Uncompressing Kernel Image ... OK
> CPU clock-frequency <- 0x23c345fa (600MHz)
> CPU timebase-frequency <- 0x23c345fa (600MHz)
> /plb: clock-frequency <- bebc1fe (200MHz)
> /plb/opb: clock-frequency <- 5f5e0ff (100MHz)
> /plb/opb/ebc: clock-frequency <- 5f5e0ff (100MHz)
> /plb/opb/ser...@ef600200: clock-frequency <- a8c000 (11MHz)
> /plb/opb/ser...@ef600300: clock-frequency <- a8c000 (11MHz)
> Memory <- <0x0 0x1000> (256MB)
> ethernet0: local-mac-address <- 00:06:4b:10:22:6c
> ethernet1: local-mac-address <- 00:06:4b:10:22:6d
> 
> zImage starting: loaded at 0x0040 (sp: 0x0fe9ec08)
> Allocating 0x330c70 bytes for kernel ...
> gunzipping (0x <- 0x0040f000:0x0073a03c)...done 0x31425c bytes
> 
> Linux/PowerPC load: root=/dev/nfs rw nfsroot=192.168.1.2:/home/vividfe/rootfsf
> Finalizing device tree... flat tree at 0x747300
> Using PowerPC 40x Platform machine description
> ...
> 
> Best Regards
> Tiejun



Mailing lists (Was: Re: powerpc/405ex: Support cuImage for PPC405EX)

2009-08-17 Thread Stephen Rothwell
Please do *not* send mail to both linuxppc-...@ozlabs.org and
linuxppc-...@lists.ozlabs.org.   We all end up with two copies :-(

They are the same list.
-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au
http://www.canb.auug.org.au/~sfr/



[PATCH 1/2] powerpc/405ex: provide necessary fixup function to support cuImage

2009-08-17 Thread Tiejun Chen
For the cuImage format we have to provide clock fixups, since U-Boot does not
pass the necessary clock frequencies into the dtb that is included in the
cuImage. This implements the clock fixups as defined in the technical
documentation for the board and updates the header files with the basic
register definitions.

Signed-off-by: Tiejun Chen 
---
 arch/powerpc/boot/4xx.c |  142 +++
 arch/powerpc/boot/4xx.h |    1 +
 arch/powerpc/boot/dcr.h |   12
 3 files changed, 155 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/boot/4xx.c b/arch/powerpc/boot/4xx.c
index 325b310..b5561b3 100644
--- a/arch/powerpc/boot/4xx.c
+++ b/arch/powerpc/boot/4xx.c
@@ -8,6 +8,10 @@
  *   Eugene Surovegin  or 
  *   Copyright (c) 2003, 2004 Zultys Technologies
  *
+ * Copyright (C) 2009 Wind River Systems, Inc.
+ *   Updated for supporting PPC405EX on Kilauea.
+ *   Tiejun Chen 
+ *
  * This program is free software; you can redistribute it and/or
  * modify it under the terms of the GNU General Public License
  * as published by the Free Software Foundation; either version
@@ -659,3 +663,141 @@ void ibm405ep_fixup_clocks(unsigned int sys_clk)
dt_fixup_clock("/plb/opb/ser...@ef600300", uart0);
dt_fixup_clock("/plb/opb/ser...@ef600400", uart1);
 }
+
+static u8 fwdv_multi_bits[] = {
+   /* values for:  1 - 16 */
+   0x01, 0x02, 0x0e, 0x09, 0x04, 0x0b, 0x10, 0x0d, 0x0c, 0x05,
+   0x06, 0x0f, 0x0a, 0x07, 0x08, 0x03
+};
+
+u32 get_fwdva(unsigned long cpr_fwdv)
+{
+   u32 index;
+
+   for (index = 0; index < ARRAY_SIZE(fwdv_multi_bits); index++)
+   if (cpr_fwdv == (u32)fwdv_multi_bits[index])
+   return index + 1;
+
+   return 0;
+}
+
+static u8 fbdv_multi_bits[] = {
+   /* values for:  1 - 100 */
+   0x00, 0xff, 0x7e, 0xfd, 0x7a, 0xf5, 0x6a, 0xd5, 0x2a, 0xd4,
+   0x29, 0xd3, 0x26, 0xcc, 0x19, 0xb3, 0x67, 0xce, 0x1d, 0xbb,
+   0x77, 0xee, 0x5d, 0xba, 0x74, 0xe9, 0x52, 0xa5, 0x4b, 0x96,
+   0x2c, 0xd8, 0x31, 0xe3, 0x46, 0x8d, 0x1b, 0xb7, 0x6f, 0xde,
+   0x3d, 0xfb, 0x76, 0xed, 0x5a, 0xb5, 0x6b, 0xd6, 0x2d, 0xdb,
+   0x36, 0xec, 0x59, 0xb2, 0x64, 0xc9, 0x12, 0xa4, 0x48, 0x91,
+   0x23, 0xc7, 0x0e, 0x9c, 0x38, 0xf0, 0x61, 0xc2, 0x05, 0x8b,
+   0x17, 0xaf, 0x5f, 0xbe, 0x7c, 0xf9, 0x72, 0xe5, 0x4a, 0x95,
+   0x2b, 0xd7, 0x2e, 0xdc, 0x39, 0xf3, 0x66, 0xcd, 0x1a, 0xb4,
+   0x68, 0xd1, 0x22, 0xc4, 0x09, 0x93, 0x27, 0xcf, 0x1e, 0xbc,
+   /* values for:  101 - 200 */
+   0x78, 0xf1, 0x62, 0xc5, 0x0a, 0x94, 0x28, 0xd0, 0x21, 0xc3,
+   0x06, 0x8c, 0x18, 0xb0, 0x60, 0xc1, 0x02, 0x84, 0x08, 0x90,
+   0x20, 0xc0, 0x01, 0x83, 0x07, 0x8f, 0x1f, 0xbf, 0x7f, 0xfe,
+   0x7d, 0xfa, 0x75, 0xea, 0x55, 0xaa, 0x54, 0xa9, 0x53, 0xa6,
+   0x4c, 0x99, 0x33, 0xe7, 0x4e, 0x9d, 0x3b, 0xf7, 0x6e, 0xdd,
+   0x3a, 0xf4, 0x69, 0xd2, 0x25, 0xcb, 0x16, 0xac, 0x58, 0xb1,
+   0x63, 0xc6, 0x0d, 0x9b, 0x37, 0xef, 0x5e, 0xbd, 0x7b, 0xf6,
+   0x6d, 0xda, 0x35, 0xeb, 0x56, 0xad, 0x5b, 0xb6, 0x6c, 0xd9,
+   0x32, 0xe4, 0x49, 0x92, 0x24, 0xc8, 0x11, 0xa3, 0x47, 0x8e,
+   0x1c, 0xb8, 0x70, 0xe1, 0x42, 0x85, 0x0b, 0x97, 0x2f, 0xdf,
+   /* values for:  201 - 255 */
+   0x3e, 0xfc, 0x79, 0xf2, 0x65, 0xca, 0x15, 0xab, 0x57, 0xae,
+   0x5c, 0xb9, 0x73, 0xe6, 0x4d, 0x9a, 0x34, 0xe8, 0x51, 0xa2,
+   0x44, 0x89, 0x13, 0xa7, 0x4f, 0x9e, 0x3c, 0xf8, 0x71, 0xe2,
+   0x45, 0x8a, 0x14, 0xa8, 0x50, 0xa1, 0x43, 0x86, 0x0c, 0x98,
+   0x30, 0xe0, 0x41, 0x82, 0x04, 0x88, 0x10, 0xa0, 0x40, 0x81,
+   0x03, 0x87, 0x0f, 0x9f, 0x3f  /* END */
+};
+
+u32 get_fbdv(unsigned long cpr_fbdv)
+{
+   u32 index;
+
+   for (index = 0; index < ARRAY_SIZE(fbdv_multi_bits); index++)
+   if (cpr_fbdv == (u32)fbdv_multi_bits[index])
+   return index + 1;
+
+   return 0;
+}
+
+void ibm405ex_fixup_clocks(unsigned int sys_clk, unsigned int uart_clk)
+{
+   /* PLL config */
+   u32 pllc  = CPR0_READ(CPR0_PLLC);
+   u32 plld  = CPR0_READ(CPR0_PLLD);
+   u32 cpud  = CPR0_READ(CPR0_CPUD);
+   u32 plbd  = CPR0_READ(CPR0_PLBD);
+   u32 opbd  = CPR0_READ(CPR0_OPBD);
+   u32 perd  = CPR0_READ(CPR0_PERD);
+
+   /* Dividers */
+   u32 fbdv   = get_fbdv(__fix_zero((plld >> 24) & 0xff, 1));
+
+   u32 fwdva  = get_fwdva(__fix_zero((plld >> 16) & 0x0f, 1));
+
+   u32 cpudv0 = __fix_zero((cpud >> 24) & 7, 8);
+   
+   /* PLBDV0 is hardwired to 010. */
+   u32 plbdv0 = 2;
+   u32 plb2xdv0 = __fix_zero((plbd >> 16) & 7, 8);
+
+   u32 opbdv0 = __fix_zero((opbd >> 24) & 3, 4);
+
+   u32 perdv0 = __fix_zero((perd >> 24) & 3, 4);
+
+   /* Resulting clocks */
+   u32 cpu, plb, opb, ebc, vco, tb, uart0, uart1; 
+
+   /* PLL's VCO is the source for primary forward ? */
+   if (pllc & 0x4000) {
+   u32 m;
+
+   /* Feedback path */
+   switch ((pllc >> 24) & 7)

[PATCH 2/2] powerpc/405ex: support cuImage via included dtb

2009-08-17 Thread Tiejun Chen
To support cuImage, we need to initialize the required sections and 
ensure that it is built.

Signed-off-by: Tiejun Chen 
---
 arch/powerpc/boot/Makefile         |    3 +-
 arch/powerpc/boot/cuboot-kilauea.c |   50
 2 files changed, 52 insertions(+), 1 deletions(-)
 create mode 100644 arch/powerpc/boot/cuboot-kilauea.c

diff --git a/arch/powerpc/boot/Makefile b/arch/powerpc/boot/Makefile
index 9ae7b7e..44ce95b 100644
--- a/arch/powerpc/boot/Makefile
+++ b/arch/powerpc/boot/Makefile
@@ -75,7 +75,7 @@ src-plat := of.c cuboot-52xx.c cuboot-824x.c cuboot-83xx.c cuboot-85xx.c holly.c
cuboot-katmai.c cuboot-rainier.c redboot-8xx.c ep8248e.c \
cuboot-warp.c cuboot-85xx-cpm2.c cuboot-yosemite.c simpleboot.c \
virtex405-head.S virtex.c redboot-83xx.c cuboot-sam440ep.c \
-   cuboot-acadia.c cuboot-amigaone.c
+   cuboot-acadia.c cuboot-amigaone.c cuboot-kilauea.c
 src-boot := $(src-wlib) $(src-plat) empty.c
 
 src-boot := $(addprefix $(obj)/, $(src-boot))
@@ -192,6 +192,7 @@ image-$(CONFIG_DEFAULT_UIMAGE)  += uImage
 image-$(CONFIG_EP405)  += dtbImage.ep405
 image-$(CONFIG_WALNUT) += treeImage.walnut
 image-$(CONFIG_ACADIA) += cuImage.acadia
+image-$(CONFIG_KILAUEA)+= cuImage.kilauea
 
 # Board ports in arch/powerpc/platform/44x/Kconfig
 image-$(CONFIG_EBONY)  += treeImage.ebony cuImage.ebony
diff --git a/arch/powerpc/boot/cuboot-kilauea.c b/arch/powerpc/boot/cuboot-kilauea.c
new file mode 100644
index 000..7db1b39
--- /dev/null
+++ b/arch/powerpc/boot/cuboot-kilauea.c
@@ -0,0 +1,50 @@
+/*
+ * Old U-boot compatibility for PPC405EX. This image already includes
+ * a dtb.
+ *
+ * Author: Tiejun Chen 
+ *
+ * Copyright (C) 2009 Wind River Systems, Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License version 2 as published
+ * by the Free Software Foundation.
+ */
+
+#include "ops.h"
+#include "io.h"
+#include "dcr.h"
+#include "stdio.h"
+#include "4xx.h"
+#include "44x.h"
+#include "cuboot.h"
+
+#define TARGET_4xx
+#define TARGET_44x
+#include "ppcboot.h"
+
+#define KILAUEA_SYS_EXT_SERIAL_CLOCK 11059200    /* ext. 11.059MHz clk */
+
+static bd_t bd;
+
+static void kilauea_fixups(void)
+{
+   /* TODO: Set this to the real system clock. Note that it should be in the 33MHz~100MHz range. */
+   unsigned long sysclk = ;
+
+   ibm405ex_fixup_clocks(sysclk, KILAUEA_SYS_EXT_SERIAL_CLOCK);
+   dt_fixup_memory(bd.bi_memstart, bd.bi_memsize);
+   ibm4xx_fixup_ebc_ranges("/plb/opb/ebc");
+   dt_fixup_mac_address_by_alias("ethernet0", bd.bi_enetaddr);
+   dt_fixup_mac_address_by_alias("ethernet1", bd.bi_enet1addr);
+}
+
+void platform_init(unsigned long r3, unsigned long r4, unsigned long r5,
+   unsigned long r6, unsigned long r7)
+{
+   CUBOOT_INIT();
+   platform_ops.fixups = kilauea_fixups;
+   platform_ops.exit = ibm40x_dbcr_reset;
+   fdt_init(_dtb_start);
+   serial_console_init();
+}
-- 
1.5.6



powerpc/405ex: Support cuImage for PPC405EX

2009-08-17 Thread Tiejun Chen
Summary: powerpc/405ex: Support cuImage for PPC405EX
Reviewers: Benjamin and linux-ppc

This patch series adds cuImage support for the Kilauea board, which is based
on the PPC405EX.

Tested on the amcc kilauea board:
===
...
=> tftp 100 cuImage.kilauea
Waiting for PHY auto negotiation to complete.. done
ENET Speed is 100 Mbps - FULL duplex connection (EMAC0)
Using ppc_4xx_eth0 device
TFTP from server 192.168.1.2; our IP address is 192.168.1.103
Filename 'cuImage.kilauea'.
Load address: 0x100
Loading: #
 #
 #
 #
 #
done
Bytes transferred = 1540945 (178351 hex)
=> bootm
## Booting kernel from Legacy Image at 0100 ...
   Image Name:   Linux-2.6.31-rc5-57857-g8df7f47-
   Created:  2009-08-17   6:31:13 UTC
   Image Type:   PowerPC Linux Kernel Image (gzip compressed)
   Data Size:1540881 Bytes =  1.5 MB
   Load Address: 0040
   Entry Point:  00400468
   Verifying Checksum ... OK
   Uncompressing Kernel Image ... OK
CPU clock-frequency <- 0x23c345fa (600MHz)
CPU timebase-frequency <- 0x23c345fa (600MHz)
/plb: clock-frequency <- bebc1fe (200MHz)
/plb/opb: clock-frequency <- 5f5e0ff (100MHz)
/plb/opb/ebc: clock-frequency <- 5f5e0ff (100MHz)
/plb/opb/ser...@ef600200: clock-frequency <- a8c000 (11MHz)
/plb/opb/ser...@ef600300: clock-frequency <- a8c000 (11MHz)
Memory <- <0x0 0x1000> (256MB)
ethernet0: local-mac-address <- 00:06:4b:10:22:6c
ethernet1: local-mac-address <- 00:06:4b:10:22:6d

zImage starting: loaded at 0x0040 (sp: 0x0fe9ec08)
Allocating 0x330c70 bytes for kernel ...
gunzipping (0x <- 0x0040f000:0x0073a03c)...done 0x31425c bytes

Linux/PowerPC load: root=/dev/nfs rw nfsroot=192.168.1.2:/home/vividfe/rootfsf
Finalizing device tree... flat tree at 0x747300
Using PowerPC 40x Platform machine description
...

Best Regards
Tiejun


Re: [PATCH] powerpc: Fix __flush_icache_range on 44x

2009-08-17 Thread Benjamin Herrenschmidt
On Mon, 2009-08-17 at 20:16 -0400, Josh Boyer wrote:
> 
> You can if you'd like.  My biggest concern is getting time to
> recreate.  I
> think I'll have time later in the week if you'd like to wait until
> then.
> I simply didn't want to send out a patch that I wasn't sure fixed the
> issue.
> 
That's ok. It's a bug fix so it's less constrained by the upcoming merge
window and we can send it back to -stable later.

Cheers,
Ben.




FW: need help getting SPI controller working on 405EX [PPM2009081200000033]

2009-08-17 Thread Tirumala Reddy Marri
 

1) It looks like the correct entry in kilauea.dts file should be: 
    IIC1: i...@ef600500 {
        compatible = "ibm,iic-405ex", "ibm,iic";
        reg = ;
        interrupt-parent = <&UIC0>;
        interrupts = <7 4>;
        #address-cells = <1>;
        #size-cells = <0>;
    };

    SPI0: s...@ef600600 {
        /* compatible = "ibm,iic-405ex", "ibm,iic"; */
        compatible = "amcc,scp-405ex";
        reg = ;
        interrupts = <8 4>;
        interrupt-parent = <&UIC0>;
    };

    RGMII0: emac-rg...@ef600b00 {
        compatible = "ibm,rgmii-405ex", "ibm,rgmii";
        reg = ;
        has-mdio;
    };

    EMAC0: ether...@ef600900 {
2) Right now scp-dev.c, for example, is in the drivers/scp directory of one of
the internal releases I have found, NOT in e.g. 2.6.29.

Additional comments: 
- Ideally the file should be moved to drivers/spi, like all other spi
drivers. 
- Even in the internal release, the files do NOT compile properly,
because of a missing file, the need for CONFIG_PINE, etc.
[supp...@localhost linux]$ make uImage 
scripts/kconfig/conf -s arch/powerpc/Kconfig 
  CHK     include/linux/version.h
  CHK     include/linux/utsrelease.h
  CALL    scripts/checksyscalls.sh
  CHK     include/linux/compile.h
  CALL    arch/powerpc/kernel/systbl_chk.sh
  CC      drivers/scp/scp-dev.o
drivers/scp/scp-dev.c:84:24: error: asm/ibm4xx.h: No such file or directory
drivers/scp/scp-dev.c:705: error: 'scpdev_init' undeclared here (not in a function)
make[2]: *** [drivers/scp/scp-dev.o] Error 1 
make[1]: *** [drivers/scp] Error 2 
make: *** [drivers] Error 2 
[supp...@localhost linux] 

Q: Marri, what do we need to provide to Nathan French ? 
Q: Fan, per Jinag-An's request, what is the procedure for cleaning this
up before releasing to Linux community ? 

Regards, Samuel 

-Original Message- 
From: support_re...@amcc.com [mailto:support_re...@amcc.com] 
Sent: Fri 8/7/2009 9:24 AM 
To: Samuel Wang 
Subject: FW: need help getting SPI controller working on 405EX
[PPM200908120033311192] 
  
Sender          : tma...@amcc.com
Tracking Number : PPM200908120033311192
Pool            : PPC_MID
Sent to         : "AMCC Product Support"
Date            : 8/7/09 9:24 AM
--- 

Forwarded by: Alan Millard 

(no comments entered) 
--- 

 

-Original Message- 
From: linuxppc-dev-bounces+tmarri=amcc@lists.ozlabs.org 
[mailto:linuxppc-dev-bounces+tmarri=amcc@lists.ozlabs.org] On Behalf Of Nathan French
Sent: Thursday, August 06, 2009 9:08 AM 
To: linuxppc-dev@lists.ozlabs.org 
Subject: need help getting SPI controller working on 405EX 

Hi, I am trying to add support for the 405EX's SPI controller on a 
Kilauea board.  I've added the below to the device tree (under 
plb/opb/): 

[nfre...@nfrench-laptop linux-2.6-denx]$ diff -C2 arch/powerpc/boot/dts/kilauea.dts spi.dts
*** arch/powerpc/boot/dts/kilauea.dts   2009-05-05 15:56:16.0 -0700
--- spi.dts 2009-08-06 08:42:19.0 -0700
*** 
*** 207,210  
--- 207,221  
#size-cells = <0>; 
}; 
+ 
+ SPI0: s...@ef600600 { 
+ cell-index = <0>; 
+ compatible = "ibm,spi-405ex", "ibm,spi"; 
+ reg = ; 
+ interrupts = <8 4>; 
+ interrupt-parent = <&UIC0>; 
+ mode = "cpu"; 
+ }; 
  
RGMII0: emac-rg...@ef600b00 { 

I've also compiled my kernel with the following enabled: 

CONFIG_SPI=y 
CONFIG_SPI_MASTER=y 
CONFIG_SPI_SPIDEV=y 

I see this make it into the device tree after boot: 

[r...@10.2.3.28 /]$ find /proc/device-tree/ | grep spi 
/proc/device-tree/plb/opb/s...@ef600600 
/proc/device-tree/plb/opb/s...@ef600600/name 
/proc/device-tree/plb/opb/s...@ef600600/mode 
/proc/device-tree/plb/opb/s...@ef600600/interrupt-parent 
/proc/device-tree/plb/opb/s...@ef600600/interrupts 
/proc/device-tree/plb/opb/s...@ef600600/reg 
/proc/device-tree/plb/opb/s...@ef600600/compatible 
/proc/device-tree/plb/opb/s...@ef600600/cell-index 

But I don't see any /dev/spidev* devices created or any mention of SPI 
at boot time.  I'm starting to suspect that I don't have the kernel 
configured right, otherwise I would see at least the SPI driver 
complaining about something, right? 

Thanks, 

Nathan French 


Re: [PATCH] powerpc: Fix __flush_icache_range on 44x

2009-08-17 Thread Josh Boyer
On Tue, Aug 18, 2009 at 07:46:28AM +1000, Benjamin Herrenschmidt wrote:
>On Mon, 2009-08-17 at 12:07 -0400, Josh Boyer wrote:
>> 
>> Olof pointed out that we could probably do the iccci before the icbi loop and
>> just skip that loop entirely on 44x.  This is most certainly valid, but at
>> this particular moment I don't have time to try and reproduce the issue with
>> an alternative fix and I wanted to get _something_ out there to fix the 
>> issue.  
>> 
>> I suck for that, I know.
>
>Well, I can massage your patch if you want. The fact is, the icbi loop
>and iccci are definitely redundant :-)

You can if you'd like.  My biggest concern is getting time to recreate.  I
think I'll have time later in the week if you'd like to wait until then.
I simply didn't want to send out a patch that I wasn't sure fixed the issue.

josh


Re: [PATCH 1/3 v3] powerpc/32: Always order writes to halves of 64-bit PTEs

2009-08-17 Thread Paul Mackerras
Kumar Gala writes:

> On Aug 17, 2009, at 6:00 PM, Paul Mackerras wrote:
> 
> > On 32-bit systems with 64-bit PTEs, the PTEs have to be written in two
> > 32-bit halves.  On SMP we write the higher-order half and then the
> > lower-order half, with a write barrier between the two halves, but on
> > UP there was no particular ordering of the writes to the two halves.
> >
> > This extends the ordering that we already do on SMP to the UP case as
> > well.  The reason is that with the perf_counter subsystem potentially
> > accessing user memory at interrupt time to get stack traces, we have
> > to be careful not to create an incorrect but apparently valid PTE even
> > on UP.
> >
> > Signed-off-by: Paul Mackerras 
> > ---
> > arch/powerpc/include/asm/pgtable.h |6 +++---
> > 1 files changed, 3 insertions(+), 3 deletions(-)
> 
> Just out of interest did you end up hitting this in testing?

No.  Ben told me he wanted this change, so I did what I was told. :)

Paul.


Re: [PATCH 1/3 v3] powerpc/32: Always order writes to halves of 64-bit PTEs

2009-08-17 Thread Kumar Gala


On Aug 17, 2009, at 6:00 PM, Paul Mackerras wrote:

> On 32-bit systems with 64-bit PTEs, the PTEs have to be written in two
> 32-bit halves.  On SMP we write the higher-order half and then the
> lower-order half, with a write barrier between the two halves, but on
> UP there was no particular ordering of the writes to the two halves.
>
> This extends the ordering that we already do on SMP to the UP case as
> well.  The reason is that with the perf_counter subsystem potentially
> accessing user memory at interrupt time to get stack traces, we have
> to be careful not to create an incorrect but apparently valid PTE even
> on UP.
>
> Signed-off-by: Paul Mackerras 
> ---
> arch/powerpc/include/asm/pgtable.h |6 +++---
> 1 files changed, 3 insertions(+), 3 deletions(-)

Just out of interest did you end up hitting this in testing?

- k


Re: simple gpio driver

2009-08-17 Thread Anton Vorontsov
Oops, I missed that patch, sorry.

On Mon, Aug 17, 2009 at 03:18:37PM -0600, Grant Likely wrote:
> On Wed, Aug 12, 2009 at 11:49 PM, Heiko Schocher wrote:
> > Hello Anton,
> >
> > i am trying to use the arch/powerpc/sysdev/simple_gpio.c driver,
> > for accessing some gpios, and found, that u8_gpio_get()
> > returns not only a 1 or a 0, instead it returns the real bit
> > position from the gpio:
> >
> > gpio    return
> > base    value
> > 0       0/0x01
> > 1       0/0x02
> > 2       0/0x04
> > 3       0/0x08
> > 4       0/0x10
> > 5       0/0x20
> > 6       0/0x40
> > 7       0/0x80
> >
> > I also use the arch/powerpc/platforms/52xx/mpc52xx_gpio.c and
> > mpc52xx_gpt.c drivers, they all return for a gpio just a 1 or 0,

There is also arch/powerpc/sysdev/qe_lib/gpio.c and
arch/powerpc/sysdev/mpc8xxx_gpio.c that don't do that.

> > which seems correct to me, because a gpio can have only 1 or 0
> > as state ... what do you think?
> 
> I think returning '1' is perhaps slightly 'better' (however you define
> that), but I don't think the caller should make any assumptions beyond
> zero/non-zero.

Yep. So I don't think that the patch is needed.

Thanks,

-- 
Anton Vorontsov
email: cbouatmai...@gmail.com
irc://irc.freenode.net/bd2

[PATCH 1/3 v3] powerpc/32: Always order writes to halves of 64-bit PTEs

2009-08-17 Thread Paul Mackerras
On 32-bit systems with 64-bit PTEs, the PTEs have to be written in two
32-bit halves.  On SMP we write the higher-order half and then the
lower-order half, with a write barrier between the two halves, but on
UP there was no particular ordering of the writes to the two halves.

This extends the ordering that we already do on SMP to the UP case as
well.  The reason is that with the perf_counter subsystem potentially
accessing user memory at interrupt time to get stack traces, we have
to be careful not to create an incorrect but apparently valid PTE even
on UP.

Signed-off-by: Paul Mackerras 
---
 arch/powerpc/include/asm/pgtable.h |6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/include/asm/pgtable.h b/arch/powerpc/include/asm/pgtable.h
index eb17da7..2a5da06 100644
--- a/arch/powerpc/include/asm/pgtable.h
+++ b/arch/powerpc/include/asm/pgtable.h
@@ -104,8 +104,8 @@ static inline void __set_pte_at(struct mm_struct *mm, unsigned long addr,
 	else
 		pte_update(ptep, ~_PAGE_HASHPTE, pte_val(pte));
 
-#elif defined(CONFIG_PPC32) && defined(CONFIG_PTE_64BIT) && defined(CONFIG_SMP)
-	/* Second case is 32-bit with 64-bit PTE in SMP mode. In this case, we
+#elif defined(CONFIG_PPC32) && defined(CONFIG_PTE_64BIT)
+	/* Second case is 32-bit with 64-bit PTE.  In this case, we
 	 * can just store as long as we do the two halves in the right order
 	 * with a barrier in between. This is possible because we take care,
 	 * in the hash code, to pre-invalidate if the PTE was already hashed,
@@ -140,7 +140,7 @@ static inline void __set_pte_at(struct mm_struct *mm, unsigned long addr,
 
 #else
 	/* Anything else just stores the PTE normally. That covers all 64-bit
-	 * cases, and 32-bit non-hash with 64-bit PTEs in UP mode
+	 * cases, and 32-bit non-hash with 32-bit PTEs.
 	 */
 	*ptep = pte;
 #endif
-- 
1.6.0.4
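
The ordering described in the commit message above can be illustrated
with a small user-space sketch.  This is only an approximation under
stated assumptions: struct pte64_halves and set_pte_ordered() are
hypothetical names, and a C11 release fence stands in for the kernel's
write barrier.  It is not the code from pgtable.h.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for a 64-bit PTE kept as two 32-bit halves. */
struct pte64_halves {
	volatile uint32_t hi;	/* higher-order half, written first */
	volatile uint32_t lo;	/* lower-order half, written last */
};

/*
 * Write the higher-order half, then a barrier, then the lower-order
 * half, so the lower-order half is only updated once the higher-order
 * half is already in place.
 */
static inline void set_pte_ordered(struct pte64_halves *ptep, uint64_t pte)
{
	ptep->hi = (uint32_t)(pte >> 32);
	/* Assumption: this fence plays the role of the kernel write barrier. */
	atomic_thread_fence(memory_order_release);
	ptep->lo = (uint32_t)pte;
}
```

With the patch applied, the same store order is used whether or not
CONFIG_SMP is set, which is the point of dropping the SMP condition.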



[PATCH 3/3 v3] perf_counter: powerpc: Add callchain support

2009-08-17 Thread Paul Mackerras
This adds support for tracing callchains for powerpc, both 32-bit
and 64-bit, and both in the kernel and userspace, from PMU interrupt
context.

The first three entries stored for each callchain are the NIP (next
instruction pointer), LR (link register), and the contents of the LR
save area in the second stack frame (the first is ignored because the
ABI convention on powerpc is that functions save their return address
in their caller's stack frame).  Because leaf functions don't have to
save their return address (LR value) and don't have to establish a
stack frame, it's possible for either or both of LR and the second
stack frame's LR save area to have valid return addresses in them.
This is basically impossible to disambiguate without either reading
the code or looking at auxiliary information such as CFI tables.
Since we don't want to do either of those things at interrupt time,
we store both LR and the second stack frame's LR save area.

Once we get past the second stack frame, there is no ambiguity; all
return addresses we get are reliable.

For kernel traces, we check whether they are valid kernel instruction
addresses and store zero instead if they are not (rather than
omitting them, which would make it impossible for userspace to know
which was which).  We also store zero instead of the second stack
frame's LR save area value if it is the same as LR.

For kernel traces, we check for interrupt frames, and for user traces,
we check for signal frames.  In each case, since we're starting a new
trace, we store a PERF_CONTEXT_KERNEL/USER marker so that userspace
knows that the next three entries are NIP, LR and the second stack frame
for the interrupted context.

We read user memory with __get_user_inatomic.  On 64-bit, if this
PMU interrupt occurred while interrupts are soft-disabled, and
there is no MMU hash table entry for the page, we will get an
-EFAULT return from __get_user_inatomic even if there is a valid
Linux PTE for the page, since hash_page isn't reentrant.  Thus we
have code here to read the Linux PTE and access the page via the
kernel linear mapping.  Since 64-bit doesn't use (or need) highmem
there is no need to do kmap_atomic.  On 32-bit, we don't do soft
interrupt disabling, so this complication doesn't occur and there
is no need to fall back to reading the Linux PTE, since hash_page
(or the TLB miss handler) will get called automatically if necessary.

Note that we cannot get PMU interrupts in the interval during
context switch between switch_mm (which switches the user address
space) and switch_to (which actually changes current to the new
process).  On 64-bit this is because interrupts are hard-disabled
in switch_mm and stay hard-disabled until they are soft-enabled
later, after switch_to has returned.  So there is no possibility
of trying to do a user stack trace when the user address space is
not current's address space.
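
The handling of the first three entries of a kernel trace, as described
above, can be sketched in plain C.  Everything prefixed demo_ or DEMO_
is hypothetical: the buffer depth, the address range treated as kernel
text, and the choice to store NIP unchecked are assumptions made for
the illustration, not details taken from perf_callchain.c.

```c
#include <stdint.h>

#define DEMO_MAX_STACK_DEPTH 8	/* hypothetical; the patch uses PERF_MAX_STACK_DEPTH */

struct demo_callchain {
	unsigned int nr;
	uint64_t ip[DEMO_MAX_STACK_DEPTH];
};

/* Append one entry, dropping it when the buffer is already full. */
static void demo_store(struct demo_callchain *entry, uint64_t ip)
{
	if (entry->nr < DEMO_MAX_STACK_DEPTH)
		entry->ip[entry->nr++] = ip;
}

/* Assumption: one fixed range stands in for valid kernel text. */
static int demo_valid_kernel_addr(uint64_t addr)
{
	return addr >= 0xc0000000ULL && addr < 0xc1000000ULL;
}

/*
 * Store NIP, LR and the second frame's LR save area.  Invalid
 * addresses become zero rather than being omitted, so the reader can
 * still tell which slot is which; a frame value equal to LR is also
 * stored as zero.
 */
static void demo_store_head(struct demo_callchain *entry, uint64_t nip,
			    uint64_t lr, uint64_t frame2_lr)
{
	demo_store(entry, nip);
	demo_store(entry, demo_valid_kernel_addr(lr) ? lr : 0);
	if (frame2_lr == lr || !demo_valid_kernel_addr(frame2_lr))
		frame2_lr = 0;
	demo_store(entry, frame2_lr);
}
```

Keeping a fixed three-slot prefix means userspace never has to guess
whether LR or the frame value was dropped.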

Signed-off-by: Paul Mackerras 
---
 arch/powerpc/kernel/Makefile |2 +-
 arch/powerpc/kernel/perf_callchain.c |  527 ++
 2 files changed, 528 insertions(+), 1 deletions(-)
 create mode 100644 arch/powerpc/kernel/perf_callchain.c

diff --git a/arch/powerpc/kernel/Makefile b/arch/powerpc/kernel/Makefile
index b73396b..9619285 100644
--- a/arch/powerpc/kernel/Makefile
+++ b/arch/powerpc/kernel/Makefile
@@ -97,7 +97,7 @@ obj64-$(CONFIG_AUDIT) += compat_audit.o
 
 obj-$(CONFIG_DYNAMIC_FTRACE)   += ftrace.o
 obj-$(CONFIG_FUNCTION_GRAPH_TRACER)+= ftrace.o
-obj-$(CONFIG_PPC_PERF_CTRS)+= perf_counter.o
+obj-$(CONFIG_PPC_PERF_CTRS)+= perf_counter.o perf_callchain.o
 obj64-$(CONFIG_PPC_PERF_CTRS)  += power4-pmu.o ppc970-pmu.o power5-pmu.o \
   power5+-pmu.o power6-pmu.o power7-pmu.o
 obj32-$(CONFIG_PPC_PERF_CTRS)  += mpc7450-pmu.o
diff --git a/arch/powerpc/kernel/perf_callchain.c b/arch/powerpc/kernel/perf_callchain.c
new file mode 100644
index 000..f74b62c
--- /dev/null
+++ b/arch/powerpc/kernel/perf_callchain.c
@@ -0,0 +1,527 @@
+/*
+ * Performance counter callchain support - powerpc architecture code
+ *
+ * Copyright © 2009 Paul Mackerras, IBM Corporation.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ */
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#ifdef CONFIG_PPC64
+#include "ppc32.h"
+#endif
+
+/*
+ * Store another value in a callchain_entry.
+ */
+static inline void callchain_store(struct perf_callchain_entry *entry, u64 ip)
+{
+   unsigned int nr = entry->nr;
+
+   if (nr < PERF_MAX_STACK_DEPTH) {
+   entry->ip[nr] = ip;
+   entry->nr = nr + 1;
+   }
+}
+
+/*
+ * Is sp valid as the address of the next kernel stack frame after prev_sp?
+ * The ne

[PATCH 2/3 v3] powerpc: Allow perf_counters to access user memory at interrupt time

2009-08-17 Thread Paul Mackerras
This provides a mechanism to allow the perf_counters code to access
user memory in a PMU interrupt routine.  Such an access can cause
various kinds of interrupt: SLB miss, MMU hash table miss, segment
table miss, or TLB miss, depending on the processor.  This commit
only deals with 64-bit classic/server processors, which use an MMU
hash table.  32-bit processors are already able to access user memory
at interrupt time.  Since we don't soft-disable on 32-bit, we avoid
the possibility of reentering hash_page or the TLB miss handlers,
since they run with interrupts disabled.

On 64-bit processors, an SLB miss interrupt on a user address will
update the slb_cache and slb_cache_ptr fields in the paca.  This is
OK except in the case where a PMU interrupt occurs in switch_slb,
which also accesses those fields.  To prevent this, we hard-disable
interrupts in switch_slb.  Interrupts are already soft-disabled at
this point, and will get hard-enabled when they get soft-enabled
later.

This also reworks slb_flush_and_rebolt: to avoid hard-disabling twice,
and to make sure that it clears the slb_cache_ptr when called from
other callers than switch_slb, the existing routine is renamed to
__slb_flush_and_rebolt, which is called by switch_slb and the new
version of slb_flush_and_rebolt.

Similarly, switch_stab (used on POWER3 and RS64 processors) gets a
hard_irq_disable() to protect the per-cpu variables used there and
in ste_allocate.

If a MMU hashtable miss interrupt occurs, normally we would call
hash_page to look up the Linux PTE for the address and create a HPTE.
However, hash_page is fairly complex and takes some locks, so to
avoid the possibility of deadlock, we check the preemption count
to see if we are in a (pseudo-)NMI handler, and if so, we don't call
hash_page but instead treat it like a bad access that will get
reported up through the exception table mechanism.  An interrupt
whose handler runs even though the interrupt occurred when
soft-disabled (such as the PMU interrupt) is considered a pseudo-NMI
handler, which should use nmi_enter()/nmi_exit() rather than
irq_enter()/irq_exit().
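
The decision described in the last paragraph can be sketched in C as a
simple test on the preemption count.  The mask value and all demo_
names below are hypothetical; the real NMI_MASK comes from the kernel
headers, and the real check is done in assembly in exceptions-64s.S.

```c
#include <stdint.h>

/* Assumption: an arbitrary bit in the upper half stands in for NMI_MASK. */
#define DEMO_NMI_MASK 0x04000000u

enum demo_action {
	DEMO_CALL_HASH_PAGE,	/* normal path: look up the Linux PTE */
	DEMO_BAD_PAGE_FAULT	/* pseudo-NMI path: report via exception table */
};

/* Choose how to handle an MMU hash table miss from the preempt count. */
static enum demo_action demo_hash_miss_action(uint32_t preempt_count)
{
	if (preempt_count & DEMO_NMI_MASK)
		return DEMO_BAD_PAGE_FAULT;
	return DEMO_CALL_HASH_PAGE;
}
```

This only works if the PMU interrupt handler has marked itself with
nmi_enter() before touching user memory, as the description notes.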

Signed-off-by: Paul Mackerras 
---
 arch/powerpc/kernel/asm-offsets.c|2 +
 arch/powerpc/kernel/exceptions-64s.S |   19 +
 arch/powerpc/mm/slb.c|   37 +++--
 arch/powerpc/mm/stab.c   |   11 +-
 4 files changed, 57 insertions(+), 12 deletions(-)

diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm-offsets.c
index 561b646..197b156 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -67,6 +67,8 @@ int main(void)
 	DEFINE(MMCONTEXTID, offsetof(struct mm_struct, context.id));
 #ifdef CONFIG_PPC64
 	DEFINE(AUDITCONTEXT, offsetof(struct task_struct, audit_context));
+	DEFINE(SIGSEGV, SIGSEGV);
+	DEFINE(NMI_MASK, NMI_MASK);
 #else
 	DEFINE(THREAD_INFO, offsetof(struct task_struct, stack));
 #endif /* CONFIG_PPC64 */
diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
index eb89811..8ac85e0 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -729,6 +729,11 @@ BEGIN_FTR_SECTION
 	bne-	do_ste_alloc		/* If so handle it */
 END_FTR_SECTION_IFCLR(CPU_FTR_SLB)
 
+   clrrdi  r11,r1,THREAD_SHIFT
+   lwz r0,TI_PREEMPT(r11)  /* If we're in an "NMI" */
+   andis.  r0,r0,nmi_m...@h/* (i.e. an irq when soft-disabled) */
+   bne 77f /* then don't call hash_page now */
+
/*
 * On iSeries, we soft-disable interrupts here, then
 * hard-enable interrupts so that the hash_page code can spin on
@@ -833,6 +838,20 @@ handle_page_fault:
bl  .low_hash_fault
b   .ret_from_except
 
+/*
+ * We come here as a result of a DSI at a point where we don't want
+ * to call hash_page, such as when we are accessing memory (possibly
+ * user memory) inside a PMU interrupt that occurred while interrupts
+ * were soft-disabled.  We want to invoke the exception handler for
+ * the access, or panic if there isn't a handler.
+ */
+77:	bl	.save_nvgprs
+	mr	r4,r3
+	addi	r3,r1,STACK_FRAME_OVERHEAD
+	li	r5,SIGSEGV
+	bl	.bad_page_fault
+	b	.ret_from_except
+
/* here we have a segment miss */
 do_ste_alloc:
bl  .ste_allocate   /* try to insert stab entry */
diff --git a/arch/powerpc/mm/slb.c b/arch/powerpc/mm/slb.c
index 5b7038f..a685652 100644
--- a/arch/powerpc/mm/slb.c
+++ b/arch/powerpc/mm/slb.c
@@ -92,15 +92,13 @@ static inline void create_shadowed_slbe(unsigned long ea, int ssize,
 : "memory" );
 }
 
-void slb_flush_and_rebolt(void)
+static void __slb_flush_and_rebolt(void)
 {
/* If you change this make sure you change SLB_NUM_BOLTED
 * appropriately too. */
unsigned lo

Re: [PATCH] powerpc: Fix __flush_icache_range on 44x

2009-08-17 Thread Benjamin Herrenschmidt
On Mon, 2009-08-17 at 12:07 -0400, Josh Boyer wrote:
> 
> Olof pointed out that we could probably do the iccci before the icbi loop and
> just skip that loop entirely on 44x.  This is most certainly valid, but at
> this particular moment I don't have time to try and reproduce the issue with
> an alternative fix and I wanted to get _something_ out there to fix the 
> issue.  
> 
> I suck for that, I know.

Well, I can massage your patch if you want. The fact is, the icbi loop
and iccci are definitely redundant :-)

Cheers,
Ben.



Re: simple gpio driver

2009-08-17 Thread Grant Likely
On Wed, Aug 12, 2009 at 11:49 PM, Heiko Schocher wrote:
> Hello Anton,
>
> i am trying to use the arch/powerpc/sysdev/simple_gpio.c driver,
> for accessing some gpios, and found, that u8_gpio_get()
> returns not only a 1 or a 0, instead it returns the real bit
> position from the gpio:
>
> gpio    return
> base    value
> 0       0/0x01
> 1       0/0x02
> 2       0/0x04
> 3       0/0x08
> 4       0/0x10
> 5       0/0x20
> 6       0/0x40
> 7       0/0x80
>
> I also use the arch/powerpc/platforms/52xx/mpc52xx_gpio.c and
> mpc52xx_gpt.c drivers, they all return for a gpio just a 1 or 0,
> which seems correct to me, because a gpio can have only 1 or 0
> as state ... what do you think?

I think returning '1' is perhaps slightly 'better' (however you define
that), but I don't think the caller should make any assumptions beyond
zero/non-zero.

>
> I solved this issue (if it is) with the following patch:
>
> diff --git a/arch/powerpc/sysdev/simple_gpio.c 
> b/arch/powerpc/sysdev/simple_gpio.c
> index 43c4569..bb0d79c 100644
> --- a/arch/powerpc/sysdev/simple_gpio.c
> +++ b/arch/powerpc/sysdev/simple_gpio.c
> @@ -46,7 +46,7 @@ static int u8_gpio_get(struct gpio_chip *gc, unsigned int 
> gpio)
>  {
>        struct of_mm_gpio_chip *mm_gc = to_of_mm_gpio_chip(gc);
>
> -       return in_8(mm_gc->regs) & u8_pin2mask(gpio);
> +       return (in_8(mm_gc->regs) & u8_pin2mask(gpio) ? 1 : 0);

For clarity, the brackets should be just around the & operands, and
"!= 0" instead of "? 1 : 0" might result in slightly smaller code.

return (in_8(mm_gc->regs) & u8_pin2mask(gpio)) != 0;
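
To make the difference concrete, here is a small stand-alone sketch of
the two return conventions.  It is not the driver code: the demo_
helpers are hypothetical, the register is passed in as a plain byte
instead of being read with in_8(), and the pin-to-mask mapping simply
follows the table in Heiko's mail (pin 0 gives 0x01).

```c
#include <stdint.h>

/* Assumption: pin N maps to bit N, matching the table quoted above. */
static uint8_t demo_pin2mask(unsigned int pin)
{
	return (uint8_t)(1u << pin);
}

/* Current behaviour: returns 0 or the pin's mask value. */
static int demo_gpio_get_raw(uint8_t reg, unsigned int pin)
{
	return reg & demo_pin2mask(pin);
}

/* Suggested form: brackets around the & operands, then "!= 0". */
static int demo_gpio_get_bool(uint8_t reg, unsigned int pin)
{
	return (reg & demo_pin2mask(pin)) != 0;
}
```

A caller that only tests the result for zero or non-zero behaves the
same with either helper; only a caller comparing against 1 would see a
difference.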


Re: [PATCH 1/1 v1] powerpc44x: Add Eiger AMCC (AppliedMicro) PPC460SX evaluation board support.

2009-08-17 Thread Feng Kan

Please do, much appreciated.

Thanks
Feng Kan
AMCC Software

On 08/17/2009 08:34 AM, Josh Boyer wrote:
> On Wed, Aug 12, 2009 at 05:38:47PM -0700, Feng Kan wrote:
>> This patch adds support for the AMCC (AppliedMicro) PPC460SX Eiger evaluation
>> board.
>>
>> Signed-off-by: Tai Tri Nguyen
>> Acked-by: Feng Kan
>> Acked-by: Tirumala Marri
>> ---
>> arch/powerpc/boot/dts/eiger.dts|  421 ++
>> arch/powerpc/configs/44x/eiger_defconfig   | 1200 
>> arch/powerpc/platforms/44x/Kconfig |   12 +
>> arch/powerpc/platforms/44x/ppc44x_simple.c |1 +
>> 4 files changed, 1634 insertions(+), 0 deletions(-)
>> create mode 100644 arch/powerpc/boot/dts/eiger.dts
>> create mode 100644 arch/powerpc/configs/44x/eiger_defconfig
>
> Thanks, this looks great.
>
> If you have no objections, I will commit an updated defconfig against the
> current kernel sources instead of the one attached.  Some of the options
> will move around a bit, but there should be no overall changes.
>
> josh
   



Re: [PATCH] powerpc: Fix __flush_icache_range on 44x

2009-08-17 Thread Josh Boyer
On Mon, Aug 17, 2009 at 09:41:36AM -0400, Josh Boyer wrote:
>The ptrace POKETEXT interface allows a process to modify the text pages of
>a child process being ptraced, usually to insert breakpoints via trap
>instructions.  The kernel eventually calls copy_to_user_page, which in turn
>calls __flush_icache_range to invalidate the icache lines for the child
>process.
>
>However, this function does not work on 44x due to the icache being virtually
>indexed.  This was noticed by a breakpoint being triggered after it had been
>cleared by ltrace on a 440EPx board.  The convenient solution is to do a
>flash invalidate of the icache in the __flush_icache_range function.
>
>Signed-off-by: Josh Boyer 
>
>---
>
>diff --git a/arch/powerpc/kernel/misc_32.S b/arch/powerpc/kernel/misc_32.S
>index 15f28e0..c9805a4 100644
>--- a/arch/powerpc/kernel/misc_32.S
>+++ b/arch/powerpc/kernel/misc_32.S
>@@ -346,6 +346,9 @@ END_FTR_SECTION_IFSET(CPU_FTR_COHERENT_ICACHE)
> 2:	icbi	0,r6
> 	addi	r6,r6,L1_CACHE_BYTES
> 	bdnz	2b
>+#ifdef CONFIG_44x
>+  iccci   r0, r0
>+#endif

Olof pointed out that we could probably do the iccci before the icbi loop and
just skip that loop entirely on 44x.  This is most certainly valid, but at
this particular moment I don't have time to try and reproduce the issue with
an alternative fix and I wanted to get _something_ out there to fix the issue.  

I suck for that, I know.

josh


Re: [PATCH 1/1 v1] powerpc44x: Add Eiger AMCC (AppliedMicro) PPC460SX evaluation board support.

2009-08-17 Thread Josh Boyer
On Wed, Aug 12, 2009 at 05:38:47PM -0700, Feng Kan wrote:
>This patch adds support for the AMCC (AppliedMicro) PPC460SX Eiger evaluation 
>board.
>
>Signed-off-by: Tai Tri Nguyen 
>Acked-by: Feng Kan 
>Acked-by: Tirumala Marri 
>---
> arch/powerpc/boot/dts/eiger.dts|  421 ++
> arch/powerpc/configs/44x/eiger_defconfig   | 1200 
> arch/powerpc/platforms/44x/Kconfig |   12 +
> arch/powerpc/platforms/44x/ppc44x_simple.c |1 +
> 4 files changed, 1634 insertions(+), 0 deletions(-)
> create mode 100644 arch/powerpc/boot/dts/eiger.dts
> create mode 100644 arch/powerpc/configs/44x/eiger_defconfig

Thanks, this looks great.

If you have no objections, I will commit an updated defconfig against the
current kernel sources instead of the one attached.  Some of the options
will move around a bit, but there should be no overall changes.

josh


Re: [PATCH] Add support for the ESTeem 195E (PPC405EP) SBC

2009-08-17 Thread Josh Boyer
On Thu, Jul 30, 2009 at 04:08:49PM -0400, Josh Boyer wrote:
>>> Ok.  So I'm not really all that thrilled with changes to ppcboot.h.  
>>> We try to keep this file as much in-sync with U-Boot as we can.  Did 
>>> your HOTFOOT changes get pulled into upstream U-Boot?
>>
>>Yeah, I thought this may be a problem, but I didn't know a better way to 
>>go about this and still maintain compatibility with the many thousands 
>>of boards already in the field.  I mean, I could strip out the ppcboot.h 
>>changes and maintain that as an out-of-tree patch, but without that 
>>patch, the kernel won't boot on in-the-field boards, rendering the 
>>upstreaming of support for this board kinda pointless.
>>
>>I haven't tried to push anything to upstream u-boot, given how ancient 
>>the in-production bootloader is.  The guy who originally mangled u-boot 
>>for this board did so before the "standard" 405EP dual ethernet layout 
>>was added, and never tried to push it upstream.  Any upstream uboot work 
>>will take the form of a native dts/fdt port that probably won't use 
>>ppcboot.h anyway, which brings us full circle...
>
>There is another way.  Perhaps you could just copy ppcboot.h to a new file
>called "hotfoot.h" and just use that.  It's a duplication of ppcboot.h to
>some degree, but it seems to make sense for your board and it helps preserve
>the "stock" ppcboot.h for other boards.

Solomon, any update on this?  As far as I'm concerned, the ppcboot.h issue is
the only thing that really needs to be reworked before we bring this patch
in.

josh


Re: Poll: Rebasing of powerpc-next

2009-08-17 Thread Becky Bruce


On Aug 15, 2009, at 5:20 PM, Benjamin Herrenschmidt wrote:

> Hi !
>
> I'd like to rebase powerpc-next ... a few bugs have been found that it
> would be nice to fix in the original patch rather than introducing a
> bisection breakage, and Kumar also just noticed a potentially misleading
> error in a commit message from his tree.
>
> So who is not ok with me doing that tomorrow or tuesday ?

Sounds like a swell idea to me.  I hate bisection breakage.

-Becky



Re: [PATCH 0/3] cpu: idle state framework for offline CPUs.

2009-08-17 Thread Dipankar Sarma
On Mon, Aug 17, 2009 at 01:28:15PM +0530, Dipankar Sarma wrote:
> On Mon, Aug 17, 2009 at 09:15:57AM +0200, Peter Zijlstra wrote:
> > On Mon, 2009-08-17 at 11:54 +0530, Dipankar Sarma wrote:
> > > For most parts, we do. The guest kernel doesn't manage the offline
> > > CPU state. That is typically done by the hypervisor. However, offline
> > > operation as defined now always result in a VM resize in some hypervisor
> > > systems (like pseries) - it would be convenient to have a non-resize
> > > offline operation which lets the guest cede the cpu to hypervisor
> > > with the hint that the VM shouldn't be resized and the guest needs the 
> > > guarantee
> > > to get the cpu back any time. The hypervisor can do whatever it wants
> > > with the ceded CPU including putting it in a low power state, but
> > > not change the physical cpu shares of the VM. The pseries hypervisor,
> > > for example, clearly distinguishes between the two - "rtas-stop-self" call
> > > to resize VM vs. H_CEDE hypercall with a hint. What I am suggesting
> > > is that we allow this with an extension to existing interfaces because it 
> > > makes sense to allow sort of "hibernation" of the cpus without changing 
> > > any
> > > configuration of the VMs.
> > 
> > From my POV the thing you call cede is the only sane thing to do for a
> > guest. Let the hypervisor management interface deal with resizing guests
> > if and when that's needed.
> 
> That is more or less how it currently works - at least for pseries hypervisor. 
> The current "offline" operation with "rtas-stop-self" call I mentioned
> earlier is initiated by the hypervisor management interfaces/tool in
> pseries system. This wakes up a guest system tool that echoes "1"
> to the offline file resulting in the configuration change.

Should have said - echoes "0" to the online file. 

You don't necessarily need this in the guest Linux as long as there is
a way for hypervisor tools to internally move Linux tasks/interrupts
from a vcpu - async event handled by the kernel, for example.
But I think it is too late for that - the interface has long been
exported.
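
For readers unfamiliar with the interface being discussed, this is
roughly what "echoes 0 to the online file" amounts to in C.  It is a
minimal sketch assuming the usual sysfs layout
/sys/devices/system/cpu/cpuN/online; the demo_ function names are
hypothetical, and writing the file requires root on a kernel with CPU
hotplug enabled.

```c
#include <stdio.h>

/* Build the sysfs path for a CPU's online file; 0 on success, -1 if it does not fit. */
static int demo_cpu_online_path(unsigned int cpu, char *buf, size_t len)
{
	int n = snprintf(buf, len, "/sys/devices/system/cpu/cpu%u/online", cpu);

	return (n > 0 && (size_t)n < len) ? 0 : -1;
}

/* Write "0" (offline) or "1" (online) to the file; 0 on success, -1 on error. */
static int demo_set_cpu_online(unsigned int cpu, int online)
{
	char path[64];
	FILE *f;
	int ok;

	if (demo_cpu_online_path(cpu, path, sizeof(path)) != 0)
		return -1;
	f = fopen(path, "w");
	if (!f)
		return -1;
	ok = fputs(online ? "1" : "0", f) >= 0;
	if (fclose(f) != 0)
		ok = 0;
	return ok ? 0 : -1;
}
```

Whether such a write leads to a VM resize or to a cede is exactly the
distinction this thread is about; the sketch says nothing about that.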


> The OS involvement is necessary to evacuate tasks/interrupts
> from the released CPU. We don't really want to initiate this from guests.
> 
> > Thing is, you don't want a guest to be able to influence the amount of
> > cpu shares attributed to it. You want that in explicit control of
> > whomever manages the hypervisor.
> 
> Agreed. But given a fixed cpu share by the hypervisor management tools,
> we would like to be able to cede cpus to hypervisor leaving the hypervisor
> configuration intact. This, we don't have at the moment and want to just
> extend the current interface for this.
> 
> Thanks
> Dipankar
> 


Re: [PATCH 4/8] powerpc/qe&cpm: Implement static inline stubs for non-QE/CPM builds

2009-08-17 Thread Anton Vorontsov
On Sat, Aug 15, 2009 at 02:25:49AM +0400, Anton Vorontsov wrote:
[]
> +#ifdef CONFIG_CPM
>  int cpm_command(u32 command, u8 opcode);
> +#else
> +static inline int cpm_command(u32 command, u8 opcode)
> +{
> + return -ENOSYS;
> +}
> +#endif /* CONFIG_CPM */

It appears that fsl_qe_udc.h defines its own inlined version, and so
fsl_qe_udc.c's build breaks:

In file included from drivers/usb/gadget/fsl_qe_udc.c:44:
fsl_qe_udc.h:432: error: redefinition of ‘qe_issue_cmd’
arch/powerpc/include/asm/qe.h:153: error: previous definition of ‘qe_issue_cmd’ was here

I didn't notice that earlier because USB_GADGET_FSL_QE isn't enabled
in any defconfig.

I'll send v2 soon.
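
For reference, the stub pattern from the quoted hunk looks like this
as a stand-alone sketch.  DEMO_CONFIG_CPM and demo_cpm_command() are
hypothetical stand-ins for CONFIG_CPM and cpm_command(); the macro is
deliberately left undefined so the stub branch is the one compiled.

```c
#include <errno.h>
#include <stdint.h>

#ifdef DEMO_CONFIG_CPM
/* Real implementation lives elsewhere when the feature is built in. */
int demo_cpm_command(uint32_t command, uint8_t opcode);
#else
/* Stub for builds without the feature: callers compile, calls fail. */
static inline int demo_cpm_command(uint32_t command, uint8_t opcode)
{
	(void)command;
	(void)opcode;
	return -ENOSYS;
}
#endif /* DEMO_CONFIG_CPM */
```

The build failure reported above is the downside of this pattern: if
a driver header already carries its own static inline with the same
name, including both headers gives two definitions in one translation
unit.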

Thanks,

-- 
Anton Vorontsov
email: cbouatmai...@gmail.com
irc://irc.freenode.net/bd2

[PATCH] powerpc: Fix __flush_icache_range on 44x

2009-08-17 Thread Josh Boyer
The ptrace POKETEXT interface allows a process to modify the text pages of
a child process being ptraced, usually to insert breakpoints via trap
instructions.  The kernel eventually calls copy_to_user_page, which in turn
calls __flush_icache_range to invalidate the icache lines for the child
process.

However, this function does not work on 44x due to the icache being virtually
indexed.  This was noticed by a breakpoint being triggered after it had been
cleared by ltrace on a 440EPx board.  The convenient solution is to do a
flash invalidate of the icache in the __flush_icache_range function.
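
As background for the icbi loop that the patch leaves in place, the
following sketch shows how a per-line invalidate loop might count the
cache lines covered by an address range.  It rests on assumptions: a
32-byte line size is chosen for the example, the rounding is the
conventional one, and the demo_ names are hypothetical.  It is not a
transcription of misc_32.S.

```c
#include <stdint.h>

#define DEMO_L1_CACHE_SHIFT 5	/* assumption: 32-byte cache lines */
#define DEMO_L1_CACHE_BYTES (1u << DEMO_L1_CACHE_SHIFT)

/* Number of cache lines a per-line loop would visit for [start, stop). */
static uint32_t demo_icache_lines(uint32_t start, uint32_t stop)
{
	/* Round the start address down to a line boundary. */
	uint32_t first = start & ~(DEMO_L1_CACHE_BYTES - 1);

	/* Round the remaining length up to whole lines. */
	return (stop - first + DEMO_L1_CACHE_BYTES - 1) >> DEMO_L1_CACHE_SHIFT;
}
```

A flash invalidate needs no such walk, since it drops the whole
icache regardless of the range.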

Signed-off-by: Josh Boyer 

---

diff --git a/arch/powerpc/kernel/misc_32.S b/arch/powerpc/kernel/misc_32.S
index 15f28e0..c9805a4 100644
--- a/arch/powerpc/kernel/misc_32.S
+++ b/arch/powerpc/kernel/misc_32.S
@@ -346,6 +346,9 @@ END_FTR_SECTION_IFSET(CPU_FTR_COHERENT_ICACHE)
 2:	icbi	0,r6
 	addi	r6,r6,L1_CACHE_BYTES
 	bdnz	2b
+#ifdef CONFIG_44x
+	iccci	r0, r0
+#endif
 	sync				/* additional sync needed on g4 */
 	isync
 	blr


Re: Poll: Rebasing of powerpc-next

2009-08-17 Thread Josh Boyer
On Sun, Aug 16, 2009 at 08:20:17AM +1000, Benjamin Herrenschmidt wrote:
>Hi !
>
>I'd like to rebase powerpc-next ... a few bugs have been found that it
>would be nice to fix in the original patch rather than introducing a
>bisection breakage, and Kumar also just noticed a potentially misleading
>error in a commit message from his tree.
>
>So who is not ok with me doing that tomorrow or tuesday ?

That's fine with me.  I have a few pending patches to pull in, and another
to send out myself.  I'll do that after the rebase.

josh


Re: [PATCH 0/3] cpu: idle state framework for offline CPUs.

2009-08-17 Thread Dipankar Sarma
On Mon, Aug 17, 2009 at 09:15:57AM +0200, Peter Zijlstra wrote:
> On Mon, 2009-08-17 at 11:54 +0530, Dipankar Sarma wrote:
> > For most parts, we do. The guest kernel doesn't manage the offline
> > CPU state. That is typically done by the hypervisor. However, offline
> > operation as defined now always result in a VM resize in some hypervisor
> > systems (like pseries) - it would be convenient to have a non-resize
> > offline operation which lets the guest cede the cpu to hypervisor
> > with the hint that the VM shouldn't be resized and the guest needs the 
> > guarantee
> > to get the cpu back any time. The hypervisor can do whatever it wants
> > with the ceded CPU including putting it in a low power state, but
> > not change the physical cpu shares of the VM. The pseries hypervisor,
> > for example, clearly distinguishes between the two - "rtas-stop-self" call
> > to resize VM vs. H_CEDE hypercall with a hint. What I am suggesting
> > is that we allow this with an extension to existing interfaces because it 
> > makes sense to allow sort of "hibernation" of the cpus without changing any
> > configuration of the VMs.
> 
> From my POV the thing you call cede is the only sane thing to do for a
> guest. Let the hypervisor management interface deal with resizing guests
> if and when that's needed.

That is more or less how it currently works - at least for pseries hypervisor. 
The current "offline" operation with "rtas-stop-self" call I mentioned
earlier is initiated by the hypervisor management interfaces/tool in
pseries system. This wakes up a guest system tool that echoes "1"
to the offline file resulting in the configuration change.
The OS involvement is necessary to evacuate tasks/interrupts
from the released CPU. We don't really want to initiate this from guests.

> Thing is, you don't want a guest to be able to influence the amount of
> cpu shares attributed to it. You want that in explicit control of
> whomever manages the hypervisor.

Agreed. But given a fixed cpu share by the hypervisor management tools,
we would like to be able to cede cpus to hypervisor leaving the hypervisor
configuration intact. This, we don't have at the moment and want to just
extend the current interface for this.

Thanks
Dipankar



Re: [PATCH 0/3] cpu: idle state framework for offline CPUs.

2009-08-17 Thread Peter Zijlstra
On Mon, 2009-08-17 at 11:54 +0530, Dipankar Sarma wrote:
> On Sun, Aug 16, 2009 at 11:53:22PM +0200, Peter Zijlstra wrote:
> > On Mon, 2009-08-17 at 01:14 +0530, Balbir Singh wrote:
> > > Agreed, I've tried to come up with a little ASCII art to depict your
> > > scenarios graphically
> > > 
> > > 
> > > +--------+ don't need (offline)
> > > |  OS    +----------->+------------+
> > > +--+-----+            | hypervisor +-----> Reuse CPU
> > >    |                  |            |       for something
> > >    |                  |            |       else
> > >    |                  |            |       (visible to users
> > >    |                  |            |       as resource changed)
> > >    |                  +------------+
> > >    V (needed, but can cede)
> > > +------------+
> > > | hypervisor | Don't reuse CPU
> > > |            | (CPU ceded)
> > > |            | give back to OS
> > > +------------+ when needed.
> > >                (Not visible to
> > >                users as so resource
> > >                binding changed)
> > 
> > I still don't get it... _why_ should this be exposed in the guest
> > kernel? Why not let the hypervisor manage a guest's offline cpus in a
> > way it sees fit?
> 
> For most parts, we do. The guest kernel doesn't manage the offline
> CPU state. That is typically done by the hypervisor. However, offline
> operation as defined now always result in a VM resize in some hypervisor
> systems (like pseries) - it would be convenient to have a non-resize
> offline operation which lets the guest cede the cpu to hypervisor
> with the hint that the VM shouldn't be resized and the guest needs the
> guarantee to get the cpu back any time. The hypervisor can do whatever
> it wants
> with the ceded CPU including putting it in a low power state, but
> not change the physical cpu shares of the VM. The pseries hypervisor,
> for example, clearly distinguishes between the two - "rtas-stop-self" call
> to resize VM vs. H_CEDE hypercall with a hint. What I am suggesting
> is that we allow this with an extension to existing interfaces because it 
> makes sense to allow sort of "hibernation" of the cpus without changing any
> configuration of the VMs.

From my POV the thing you call cede is the only sane thing to do for a
guest. Let the hypervisor management interface deal with resizing guests
if and when that's needed.

Thing is, you don't want a guest to be able to influence the amount of
cpu shares attributed to it. You want that in explicit control of
whomever manages the hypervisor.
