Re: [PATCH 2/5] sched: add asymmetric packing option for sibling domain

2010-04-14 Thread Michael Neuling
In message 1271161767.4807.1281.ca...@twins you wrote:
 On Fri, 2010-04-09 at 16:21 +1000, Michael Neuling wrote:
  Peter: Since this is based mainly off your initial patch, it should
  have your signed-off-by too, but I didn't want to add without your
  permission.  Can I add it?
 
 Of course! :-)
 
 This thing does need a better changelog though, and maybe a larger
 comment with check_asym_packing(), explaining why and what we're doing
 and what we're assuming (that lower cpu number also means lower thread
 number).
 

OK, updated patch below...

Mikey


[PATCH 2/5] sched: add asymmetric group packing option for sibling domain

Check to see if the group is packed in a sched domain.

This is primarily intended to be used at the sibling level.  Some cores
like POWER7 prefer to use lower numbered SMT threads.  In the case of
POWER7, it can move to lower SMT modes only when higher threads are
idle.  When in lower SMT modes, the threads will perform better since
they share less core resources.  Hence when we have idle threads, we
want them to be the higher ones.

This adds a hook into f_b_g() called check_asym_packing() to check the
packing.  This packing function is run on idle threads.  It checks to
see if the busiest CPU in this domain (core in the P7 case) has a
higher CPU number than the CPU the packing function is being run
on.  If it does, calculate the imbalance and return the higher busier
thread as the busiest group to f_b_g().  Here we are assuming a lower
CPU number will be equivalent to a lower SMT thread number.

It also creates a new SD_ASYM_PACKING flag to enable this feature at
any scheduler domain level.

It also creates an arch hook to enable this feature at the sibling
level.  The default function doesn't enable this feature.
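
As a rough illustration of the check described above, the following
user-space C sketch models the decision only.  It is not the kernel
implementation: struct toy_cpu, toy_check_asym_packing() and the plain
running-task count are assumptions made for the example, and, as in the
changelog, a lower CPU number is assumed to mean a lower SMT thread number.

```c
#include <stddef.h>

/* Hypothetical, simplified view of one SMT thread in a core. */
struct toy_cpu {
    int id;          /* CPU number; lower is assumed to mean lower SMT thread */
    int nr_running;  /* runnable tasks on this thread */
};

/*
 * Toy model of the packing check: run from an idle thread (this_cpu),
 * find the busiest thread in the domain, and return its id only if it
 * is numbered higher than this_cpu. Returns -1 when nothing should move.
 */
static int toy_check_asym_packing(const struct toy_cpu *cpus, size_t n,
                                  int this_cpu)
{
    const struct toy_cpu *busiest = NULL;
    size_t i;

    for (i = 0; i < n; i++) {
        /* The check only runs on idle threads. */
        if (cpus[i].id == this_cpu && cpus[i].nr_running > 0)
            return -1;
        if (cpus[i].nr_running > 0 &&
            (!busiest || cpus[i].nr_running > busiest->nr_running))
            busiest = &cpus[i];
    }

    /* Pull work down only from a higher-numbered thread. */
    if (busiest && busiest->id > this_cpu)
        return busiest->id;
    return -1;
}
```

With threads 0..3 of one core and only thread 2 busy, calling the function
from idle thread 0 reports thread 2 as the one to pull from, while calling
it from idle thread 3 reports nothing, so the idle threads end up being the
higher-numbered ones.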

Based heavily on a patch from Peter Zijlstra.

Signed-off-by: Michael Neuling mi...@neuling.org
Signed-off-by: Peter Zijlstra pet...@infradead.org
---
 include/linux/sched.h    |    4 +-
 include/linux/topology.h |    1
 kernel/sched_fair.c      |   93 +--
 3 files changed, 94 insertions(+), 4 deletions(-)

Index: linux-2.6-ozlabs/include/linux/sched.h
===
--- linux-2.6-ozlabs.orig/include/linux/sched.h
+++ linux-2.6-ozlabs/include/linux/sched.h
@@ -799,7 +799,7 @@ enum cpu_idle_type {
 #define SD_POWERSAVINGS_BALANCE 0x0100  /* Balance for power savings */
 #define SD_SHARE_PKG_RESOURCES  0x0200  /* Domain members share cpu pkg resources */
 #define SD_SERIALIZE            0x0400  /* Only a single load balancing instance */
-
+#define SD_ASYM_PACKING         0x0800  /* Place busy groups earlier in the domain */
 #define SD_PREFER_SIBLING       0x1000  /* Prefer to place tasks in a sibling domain */
 
 enum powersavings_balance_level {
@@ -834,6 +834,8 @@ static inline int sd_balance_for_package
return SD_PREFER_SIBLING;
 }
 
+extern int __weak arch_sd_sibiling_asym_packing(void);
+
 /*
  * Optimise SD flags for power savings:
  * SD_BALANCE_NEWIDLE helps agressive task consolidation and power savings.
Index: linux-2.6-ozlabs/include/linux/topology.h
===
--- linux-2.6-ozlabs.orig/include/linux/topology.h
+++ linux-2.6-ozlabs/include/linux/topology.h
@@ -102,6 +102,7 @@ int arch_update_cpu_topology(void);
| 1*SD_SHARE_PKG_RESOURCES  \
| 0*SD_SERIALIZE\
| 0*SD_PREFER_SIBLING   \
+   | arch_sd_sibiling_asym_packing()   \
,   \
.last_balance   = jiffies,  \
.balance_interval   = 1,\
Index: linux-2.6-ozlabs/kernel/sched_fair.c
===
--- linux-2.6-ozlabs.orig/kernel/sched_fair.c
+++ linux-2.6-ozlabs/kernel/sched_fair.c
@@ -2493,6 +2493,39 @@ static inline void update_sg_lb_stats(st
 }
 
 /**
+ * update_sd_pick_busiest - return 1 on busiest group
+ * @sd: sched_domain whose statistics are to be checked
+ * @sds: sched_domain statistics
+ * @sg: sched_group candidate to be checked for being the busiest
+ * @sgs: sched_group statistics
+ *
+ * This returns 1 for the busiest group. If asymmetric packing is
+ * enabled and we already have a busiest, but this candidate group has
+ * a higher cpu number than the current busiest, pick this sg.
+ */
+static int update_sd_pick_busiest(struct sched_domain *sd,
+ struct sd_lb_stats *sds,
+ struct sched_group *sg,
+ struct sg_lb_stats *sgs)
+{
+   if (sgs->sum_nr_running > sgs->group_capacity)
+       return 1;
+
+   if (sgs->group_imb)
+

[PATCH] powerpc/perf_event: Fix oops due to perf_event_do_pending call

2010-04-14 Thread Paul Mackerras
Anton Blanchard found that large POWER systems would occasionally
crash in the exception exit path when profiling with perf_events.
The symptom was that an interrupt would occur late in the exit path
when the MSR[RI] (recoverable interrupt) bit was clear.  Interrupts
should be hard-disabled at this point but they were enabled.  Because
the interrupt was not recoverable the system panicked.

The reason is that the exception exit path was calling
perf_event_do_pending after hard-disabling interrupts, and
perf_event_do_pending will re-enable interrupts.

The simplest and cleanest fix for this is to use the same mechanism
that 32-bit powerpc does, namely to cause a self-IPI by setting the
decrementer to 1.  This means we can remove the tests in the exception
exit path and raw_local_irq_restore.

This also makes sure that the call to perf_event_do_pending from
timer_interrupt() happens within irq_enter/irq_exit.  (Note that
calling perf_event_do_pending from timer_interrupt does not mean that
there is a possible 1/HZ latency; setting the decrementer to 1 ensures
that the timer interrupt will happen immediately, i.e. within one
timebase tick, which is a few nanoseconds or 10s of nanoseconds.)
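
The mechanism described above can be sketched as a small user-space model.
This is only an illustration under stated assumptions, not the patch itself:
struct toy_cpu_state and all toy_* functions are hypothetical, the
decrementer is modelled as a simple countdown that fires when it reaches
zero, and only perf_event_do_pending, timer_interrupt and
irq_enter/irq_exit are names taken from the description.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for per-CPU state and the decrementer register. */
struct toy_cpu_state {
    bool perf_event_pending;  /* work queued for later */
    unsigned int decrementer; /* ticks until the next timer interrupt */
    int irq_depth;            /* >0 while between irq_enter and irq_exit */
    int pending_runs;         /* how often the deferred work ran */
    bool ran_in_irq_context;  /* whether it ran inside irq_enter/irq_exit */
};

/* Queue deferred work and force a timer interrupt as soon as possible. */
static void toy_set_perf_event_pending(struct toy_cpu_state *cpu)
{
    cpu->perf_event_pending = true;
    cpu->decrementer = 1;   /* self-IPI: fire within one timebase tick */
}

/* Deferred work; in the kernel this is perf_event_do_pending(). */
static void toy_do_pending(struct toy_cpu_state *cpu)
{
    cpu->pending_runs++;
    cpu->ran_in_irq_context = cpu->irq_depth > 0;
}

/* Timer interrupt handler: the only place the deferred work is run. */
static void toy_timer_interrupt(struct toy_cpu_state *cpu)
{
    cpu->irq_depth++;                 /* irq_enter() */
    if (cpu->perf_event_pending) {
        cpu->perf_event_pending = false;
        toy_do_pending(cpu);
    }
    cpu->irq_depth--;                 /* irq_exit() */
}

/* Advance the toy timebase by one tick; returns true if the timer fired. */
static bool toy_tick(struct toy_cpu_state *cpu)
{
    if (cpu->decrementer == 0 || --cpu->decrementer > 0)
        return false;
    toy_timer_interrupt(cpu);
    return true;
}
```

In this model the exception exit path never runs the deferred work itself.
Setting the pending flag only rearms the countdown, and the work runs on the
very next tick, inside the timer interrupt.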

Signed-off-by: Paul Mackerras pau...@samba.org
Cc: sta...@kernel.org
---
Ben, please put this in your tree of fixes to go to Linus for 2.6.34,
since it fixes a potential panic.

 arch/powerpc/include/asm/hw_irq.h |   38 
 arch/powerpc/kernel/asm-offsets.c |1 
 arch/powerpc/kernel/entry_64.S|9 -
 arch/powerpc/kernel/irq.c |6 ---
 arch/powerpc/kernel/time.c|   60 ++
 5 files changed, 48 insertions(+), 66 deletions(-)
diff --git a/arch/powerpc/include/asm/hw_irq.h b/arch/powerpc/include/asm/hw_irq.h
index 9f4c9d4..bd100fc 100644
--- a/arch/powerpc/include/asm/hw_irq.h
+++ b/arch/powerpc/include/asm/hw_irq.h
@@ -130,43 +130,5 @@ static inline int irqs_disabled_flags(unsigned long flags)
  */
 struct irq_chip;
 
-#ifdef CONFIG_PERF_EVENTS
-
-#ifdef CONFIG_PPC64
-static inline unsigned long test_perf_event_pending(void)
-{
-   unsigned long x;
-
-   asm volatile("lbz %0,%1(13)"
-   : "=r" (x)
-   : "i" (offsetof(struct paca_struct, perf_event_pending)));
-   return x;
-}
-
-static inline void set_perf_event_pending(void)
-{
-   asm volatile("stb %0,%1(13)" : :
-   "r" (1),
-   "i" (offsetof(struct paca_struct, perf_event_pending)));
-}
-
-static inline void clear_perf_event_pending(void)
-{
-   asm volatile("stb %0,%1(13)" : :
-   "r" (0),
-   "i" (offsetof(struct paca_struct, perf_event_pending)));
-}
-#endif /* CONFIG_PPC64 */
-
-#else  /* CONFIG_PERF_EVENTS */
-
-static inline unsigned long test_perf_event_pending(void)
-{
-   return 0;
-}
-
-static inline void clear_perf_event_pending(void) {}
-#endif /* CONFIG_PERF_EVENTS */
-
 #endif /* __KERNEL__ */
 #endif /* _ASM_POWERPC_HW_IRQ_H */
diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm-offsets.c
index 957ceb7..c09138d 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -133,7 +133,6 @@ int main(void)
DEFINE(PACAKMSR, offsetof(struct paca_struct, kernel_msr));
DEFINE(PACASOFTIRQEN, offsetof(struct paca_struct, soft_enabled));
DEFINE(PACAHARDIRQEN, offsetof(struct paca_struct, hard_enabled));
-   DEFINE(PACAPERFPEND, offsetof(struct paca_struct, perf_event_pending));
DEFINE(PACACONTEXTID, offsetof(struct paca_struct, context.id));
 #ifdef CONFIG_PPC_MM_SLICES
DEFINE(PACALOWSLICESPSIZE, offsetof(struct paca_struct,
diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
index 07109d8..42e9d90 100644
--- a/arch/powerpc/kernel/entry_64.S
+++ b/arch/powerpc/kernel/entry_64.S
@@ -556,15 +556,6 @@ ALT_FW_FTR_SECTION_END_IFCLR(FW_FEATURE_ISERIES)
 2:
TRACE_AND_RESTORE_IRQ(r5);
 
-#ifdef CONFIG_PERF_EVENTS
-   /* check paca->perf_event_pending if we're enabling ints */
-   lbz r3,PACAPERFPEND(r13)
-   and.    r3,r3,r5
-   beq 27f
-   bl  .perf_event_do_pending
-27:
-#endif /* CONFIG_PERF_EVENTS */
-
    /* extract EE bit and use it to restore paca->hard_enabled */
    ld  r3,_MSR(r1)
    rldicl  r4,r3,49,63 /* r0 = (r3 >> 15) & 1 */
diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c
index 64f6f20..066bd31 100644
--- a/arch/powerpc/kernel/irq.c
+++ b/arch/powerpc/kernel/irq.c
@@ -53,7 +53,6 @@
 #include <linux/bootmem.h>
 #include <linux/pci.h>
 #include <linux/debugfs.h>
-#include <linux/perf_event.h>
 
 #include <asm/uaccess.h>
 #include <asm/system.h>
@@ -145,11 +144,6 @@ notrace void raw_local_irq_restore(unsigned long en)
}
 #endif /* CONFIG_PPC_STD_MMU_64 */
 
-   if (test_perf_event_pending()) {
-   clear_perf_event_pending();
-   perf_event_do_pending();
-   }
-
/*
  

Re: [PATCH] powerpc/perf_event: Fix oops due to perf_event_do_pending call

2010-04-14 Thread Benjamin Herrenschmidt
On Wed, 2010-04-14 at 16:46 +1000, Paul Mackerras wrote:
 Ben, please put this in your tree of fixes to go to Linus for 2.6.34,
 since it fixes a potential panic.

Should it go to -stable as well ? How far back ?

Cheers,
Ben.

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Questions about creating an OF platform device

2010-04-14 Thread Terren Chow
Dear all,
I'm new to PPC Linux development. I have recently been learning how to write a
Linux device driver on an MPC5121-based embedded system, and I'm confused
about the OF-related code in the kernel.
I know that in the Linux device driver model, a driver is attached to a
device through the bus_type methods. For example, a platform_driver is
attached to a platform device through the platform_driver_register() method.
This method scans the device list on the bus and calls the probe()
method when the match() method succeeds.
My question is: in PPC Linux, there is a structure named
*of_platform_driver* which is similar to *platform_driver*, and there is also
a bus_type named *of_platform_bus_type* which is similar to
*platform_bus_type*. But I can't find any information about the structure
of_platform_device. How does the kernel create the of_platform_device and
place it into the device list of the *platform_bus_type*? I have searched the
Internet and can't find any information about this. Could anyone give me
some information so that I can find out the magic behind it? Thanks!
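
For readers following along, the match/probe flow described in the question
can be sketched as a toy user-space model. Everything here (struct toy_bus,
toy_bus_match(), toy_driver_register(), the "compatible" string used for
matching) is a hypothetical stand-in rather than the kernel's API, and the
sketch does not model how the kernel creates OF devices from the device tree.

```c
#include <stddef.h>
#include <string.h>

struct toy_driver;

/* Hypothetical, user-space stand-ins for a bus, its devices and a driver. */
struct toy_device {
    const char *compatible;        /* identifier used for matching */
    const struct toy_driver *drv;  /* bound driver, NULL if none */
};

struct toy_driver {
    const char *compatible;
    int (*probe)(struct toy_device *dev);  /* returns 0 on success */
};

struct toy_bus {
    struct toy_device *devices;
    size_t ndevices;
};

/* Bus match() method: decide whether a driver can handle a device. */
static int toy_bus_match(const struct toy_device *dev,
                         const struct toy_driver *drv)
{
    return strcmp(dev->compatible, drv->compatible) == 0;
}

/* Example probe() that always succeeds. */
static int toy_probe_ok(struct toy_device *dev)
{
    (void)dev;
    return 0;
}

/*
 * Driver registration: walk the bus's device list, and for every unbound
 * device that matches, call probe() and bind on success.
 * Returns the number of devices bound.
 */
static int toy_driver_register(struct toy_bus *bus,
                               const struct toy_driver *drv)
{
    int bound = 0;
    size_t i;

    for (i = 0; i < bus->ndevices; i++) {
        struct toy_device *dev = &bus->devices[i];

        if (dev->drv || !toy_bus_match(dev, drv))
            continue;
        if (drv->probe(dev) == 0) {
            dev->drv = drv;
            bound++;
        }
    }
    return bound;
}
```

The point of the model is that the devices must already be on the bus's list
before a driver registers, which is exactly the step the question asks about.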

-- 
Terren.Chow
College of Informatics, SCAU
Graduate Student
Lab of Embedded System and Wireless Sensor Network
MSN: terren.c...@hotmail.com

mpc5121e-Real Time Clock

2010-04-14 Thread CTAG / Moisés Domínguez
Hi,

I am trying to use the internal Real Time Clock of my ads5121 board, so I
am using the /dev/rtc-0 device with the linux/Documentation/rtc.txt example. I can
read and write the time/date, but I notice the RTC is not being updated, as if VBAT
were off or there were no oscillator (I tested the oscillator and VBAT).
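
One way to confirm the symptom from userspace is to read the RTC twice and
compare the readings. The sketch below is only an illustration: it uses the
standard RTC_RD_TIME ioctl from <linux/rtc.h>, but the helper names
rtc_seconds_of_day() and rtc_is_ticking(), and the device path passed in,
are assumptions made for the example.

```c
#include <fcntl.h>
#include <linux/rtc.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Seconds since midnight for an RTC reading (date fields ignored). */
static long rtc_seconds_of_day(const struct rtc_time *tm)
{
    return tm->tm_hour * 3600L + tm->tm_min * 60L + tm->tm_sec;
}

/*
 * Read the RTC twice, wait_s seconds apart.
 * Returns 1 if the clock advanced, 0 if it did not, -1 on I/O error.
 * A reading that wraps past midnight also counts as advancing.
 */
static int rtc_is_ticking(const char *path, unsigned int wait_s)
{
    struct rtc_time before, after;
    int fd = open(path, O_RDONLY);
    int ret = -1;

    if (fd < 0)
        return -1;
    if (ioctl(fd, RTC_RD_TIME, &before) == 0) {
        sleep(wait_s);
        if (ioctl(fd, RTC_RD_TIME, &after) == 0)
            ret = rtc_seconds_of_day(&after) != rtc_seconds_of_day(&before);
    }
    close(fd);
    return ret;
}
```

For example, rtc_is_ticking("/dev/rtc0", 2) returning 0 would match the
behaviour described here: the time can be read, but it never advances.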

 

According to the Freescale MPC5121e errata document, there is an issue involving
the RTC:

Description:
“The TAMP bit in the SRTC does not reliably provide an indication that the
RTC power supply or the RTC oscillator has been interrupted”

Workaround:
“In your system’s design, do not rely on the tamper indication to be
reliable”

Looking in the rtc-mpc5121.c code I didn’t find anything related to the Keep Alive
Register (where the TAMP bit is), so I would like to know where this issue is
taken into account, or whether it is really being taken into account in the
driver.

Regards,  Moisés.


Re: [PATCH] powerpc/perf_event: Fix oops due to perf_event_do_pending call

2010-04-14 Thread Michael Ellerman
On Wed, 2010-04-14 at 16:46 +1000, Paul Mackerras wrote:
 Anton Blanchard found that large POWER systems would occasionally
 crash in the exception exit path when profiling with perf_events.
 The symptom was that an interrupt would occur late in the exit path
 when the MSR[RI] (recoverable interrupt) bit was clear.  Interrupts
 should be hard-disabled at this point but they were enabled.  Because
 the interrupt was not recoverable the system panicked.
 
...
 diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c
 index 1b16b9a..0441bbd 100644
 --- a/arch/powerpc/kernel/time.c
 +++ b/arch/powerpc/kernel/time.c
 @@ -532,25 +532,60 @@ void __init iSeries_time_init_early(void)
  }
  #endif /* CONFIG_PPC_ISERIES */
  
 -#if defined(CONFIG_PERF_EVENTS) && defined(CONFIG_PPC32)
 -DEFINE_PER_CPU(u8, perf_event_pending);
 +#ifdef CONFIG_PERF_EVENTS
  
 -void set_perf_event_pending(void)
 +/*
 + * 64-bit uses a byte in the PACA, 32-bit uses a per-cpu variable...
 + */

Any reason not to switch to per-cpu for both, now that you don't need to
access it from asm?

cheers



Re: Resetting PCI-E devices after linux boot

2010-04-14 Thread Jake Magee
Dan,

Were you able to get PCI-E hotplug working?  I could not get this working
myself and assumed that driver support was lacking.  I'm actually using a
PPC405.

Thanks,
Jake


On Thu, Mar 25, 2010 at 8:26 PM, Dan Wilson dwil...@fulcrummicro.com wrote:

 We are building a PCI-E device for use in an embedded system with an 85xx
 processor.  One of our customers is adamant that linux PCI-E hot-swap
 support will not allow us to either bring the device up after linux boot
 (i.e., the PCI-E device must be present when linux scans for PCI-E devices
 at startup) or to reset the device once linux is up.  It was our impression
 that the PCI-E hot-swap support should allow for devices to appear after
 linux boot, be properly initialized, and then later be able to shut them
 down and bring them back up again.

 Has anyone successfully used the PCI-E hot-swap capabilities in the linux
 kernel in a PPC 85xx environment?  Any known gotchas we need to be aware of?

 Thanks in advance for your responses,

 Dan.


Re: [PATCH v2 1/2] perf: Move arch specific code into separate arch directory

2010-04-14 Thread Masami Hiramatsu
Ian Munsie wrote:
 From: Ian Munsie imun...@au.ibm.com
 
 The perf userspace tool included some architecture specific code to map
 registers from the DWARF register number into the names used by the regs
 and stack access API.
 
 This patch moves the architecture specific code out into a separate
 arch/x86 directory along with the infrastructure required to use it.
 
 Signed-off-by: Ian Munsie imun...@au.ibm.com
 ---
 Changes since v1: From Masami Hiramatsu's suggestion, I added a check in the
 Makefile for if the arch specific Makefile defines PERF_HAVE_DWARF_REGS,
 printing a message during build if it has not. This simplifies the code
 removing the odd macro from the previous version and the need for an arch
 specific arch_dwarf-regs.h. I have not entirely disabled DWARF support for
 architectures that don't implement the register mappings, so that they can
 still add a probe based on a line number (they will be missing the ability to
 capture the value of a variable from a register).

Hmm, sorry, I don't think that is a good way to go... IMHO, porting dwarf-regs.c
is easy (you can just refer to systemtap/runtime/loc2c-runtime.h), easier
than porting kprobe-tracer to another arch. And perf is part of the kernel tree,
which means that whoever ports kprobe-tracer should port
dwarf-regs.c too. In that case, the PERF_HAVE_DWARF_REGS flag would be used only
between those two patches in the same patchset. So I suggested that you drop DWARF
support if the dwarf-regs mapping doesn't exist.

AFAIK, at this point, only s390 users are affected. I'd like to ask
them to just port a register mapping to perf and test it too.

Thank you,

-- 
Masami Hiramatsu
e-mail: mhira...@redhat.com


Re: [PATCH v2 1/2] perf: Move arch specific code into separate arch directory

2010-04-14 Thread Heiko Carstens
On Wed, Apr 14, 2010 at 07:46:12AM -0700, Masami Hiramatsu wrote:
 Ian Munsie wrote:
  From: Ian Munsie imun...@au.ibm.com
  
  The perf userspace tool included some architecture specific code to map
  registers from the DWARF register number into the names used by the regs
  and stack access API.
  
  This patch moves the architecture specific code out into a separate
  arch/x86 directory along with the infrastructure required to use it.
  
  Signed-off-by: Ian Munsie imun...@au.ibm.com
  ---
  Changes since v1: From Masami Hiramatsu's suggestion, I added a check in the
  Makefile for if the arch specific Makefile defines PERF_HAVE_DWARF_REGS,
  printing a message during build if it has not. This simplifies the code
  removing the odd macro from the previous version and the need for an arch
  specific arch_dwarf-regs.h. I have not entirely disabled DWARF support for
  architectures that don't implement the register mappings, so that they can
  still add a probe based on a line number (they will be missing the ability 
  to
  capture the value of a variable from a register).
 
 Hmm, sorry, I don't think it is a good way to go... IMHO, porting dwarf-regs.c
 is so easy (you can just refer systemtap/runtime/loc2c-runtime.h), easier
 than porting kprobe-tracer on another arch. And perf is a part of kernel tree.
 It means that someone who are porting kprobe-tracer, he should port
 dwarf-regs.c too. In that case, PERF_HAVE_DWARF_REGS flag will be used only
 between those two patches in same patchset. So, I suggested you to drop dwarf
 support if dwarf-regs mapping doesn't exist.
 
 AFAIK, at this point, only s390 users are affected. I'd like to ask
 them to just port a register mapping on perf and test it too.

Hm, I'm a bit lost here. Probably due to lack of context. What would be missing
on s390 and what am I supposed to implement and how can I test it?
Any pointers to git commits?


Re: [PATCH v2 1/2] perf: Move arch specific code into separate arch directory

2010-04-14 Thread Masami Hiramatsu
Heiko Carstens wrote:
 On Wed, Apr 14, 2010 at 07:46:12AM -0700, Masami Hiramatsu wrote:
 Ian Munsie wrote:
 From: Ian Munsie imun...@au.ibm.com

 The perf userspace tool included some architecture specific code to map
 registers from the DWARF register number into the names used by the regs
 and stack access API.

 This patch moves the architecture specific code out into a separate
 arch/x86 directory along with the infrastructure required to use it.

 Signed-off-by: Ian Munsie imun...@au.ibm.com
 ---
 Changes since v1: From Masami Hiramatsu's suggestion, I added a check in the
 Makefile for if the arch specific Makefile defines PERF_HAVE_DWARF_REGS,
 printing a message during build if it has not. This simplifies the code
 removing the odd macro from the previous version and the need for an arch
 specific arch_dwarf-regs.h. I have not entirely disabled DWARF support for
 architectures that don't implement the register mappings, so that they can
 still add a probe based on a line number (they will be missing the ability 
 to
 capture the value of a variable from a register).

 Hmm, sorry, I don't think it is a good way to go... IMHO, porting 
 dwarf-regs.c
 is so easy (you can just refer systemtap/runtime/loc2c-runtime.h), easier
 than porting kprobe-tracer on another arch. And perf is a part of kernel 
 tree.
 It means that someone who are porting kprobe-tracer, he should port
 dwarf-regs.c too. In that case, PERF_HAVE_DWARF_REGS flag will be used only
 between those two patches in same patchset. So, I suggested you to drop dwarf
 support if dwarf-regs mapping doesn't exist.

 AFAIK, at this point, only s390 users are affected. I'd like to ask
 them to just port a register mapping on perf and test it too.
 
 Hm, I'm a bit lost here. Probably due to lack of context. What would be 
 missing
 on s390 and what am I supposed to implement and how can I test it?
 Any pointers to git commits?

Ah, sorry about that. We're talking about porting perf-probe
to the architectures which support kprobe-tracer.
Ian's patch (https://patchwork.kernel.org/patch/92328/) is currently under
discussion, so there is no git commit yet (but there will be in a few days).

So what I'd like to suggest is that you implement an s390 version of the DWARF register
mapping support (the ppc version is here: https://patchwork.kernel.org/patch/92329/)
for perf probe (a subcommand of the perf tools in tools/perf) and test that perf-probe
works.

For testing, you may need to compile the kernel with CONFIG_DEBUG_INFO, install
elfutils-devel, and build the perf tools (cd tools/perf; make).
Then execute the command below.

$ ./perf probe -v --add 'vfs_read file'


Thank you,

-- 
Masami Hiramatsu
e-mail: mhira...@redhat.com



Re: mpc5121e-Real Time Clock

2010-04-14 Thread Wolfgang Denk
Dear CTAG / Moisés Domínguez,

In message e70643a89af74de1bca25c3c4ae2c...@ctag you wrote:
 
 Looking in rtc-mpc5121.c code I didn't find anything related to Keep Alive
 Register (where TAMP bit is) so I would like to know where this issue is
 taking into account or if it is really being taking into account in the
 driver.

It is not.

Best regards,

Wolfgang Denk

-- 
DENX Software Engineering GmbH, MD: Wolfgang Denk & Detlev Zundel
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-10 Fax: (+49)-8142-66989-80 Email: w...@denx.de
To the systems programmer,  users  and  applications  serve  only  to
provide a test load.


Re: Resetting PCI-E devices after linux boot

2010-04-14 Thread Benjamin Herrenschmidt
On Wed, 2010-04-14 at 08:55 -0500, Jake Magee wrote:
 On Thu, Mar 25, 2010 at 8:26 PM, Dan Wilson dwil...@fulcrummicro.com
 wrote:
 We are building a PCI-E device for use in an embedded system
 with an 85xx processor.  One of our customers is adamant that
 linux PCI-E hot-swap support will not allow us to either bring
 the device up after linux boot (i.e., the PCI-E device must be
 present when linux scans for PCI-E devices at startup) or to
 reset the device once linux is up.  It was our impression that
 the PCI-E hot-swap support should allow for devices to appear
 after linux boot, be properly initialized, and then later be
 able to shut them down and bring them back up again.
 
 Has anyone successfully used the PCI-E hot-swap capabilities
 in the linux kernel in a PPC 85xx environment?  Any known
 gotchas we need to be aware of?
 
 Thanks in advance for your responses,

It should be possible to get that working, but I suspect not without
some code changes. I know the current PCIe hotswap driver has ACPI hooks
that would need to be replaced by appropriate hooks into the powerpc
code to perform the right resource manipulation etc...

We do PCIe hotswap on IBM pSeries, but this is using specific FW
interfaces for which we have a dedicated PCI hotplug driver.

Can the slot power be SW controlled on the Canyonlands PCIe slot ? In
that case I should be able to toy with that myself at some stage (but
not for a couple of weeks).

Cheers,
Ben.



Re: mpc5121e-Real Time Clock

2010-04-14 Thread Terren Chow
I don't think the internal RTC is workable. I've read the RTC section in the
MPC5121 data sheet, and I found that the RTC only keeps the time tick
register alive using VBAT. So if there is a power failure, the RTC will lose its
minute and hour information. After the system boots up, the kernel reads
the RTC to get its wall time, so the system time will reset to 1970. I
think this is a design mistake in the CPU.
The only solution is to use the mt41 RTC chip on the ADS5121 board as the RTC
device.

-- 
Terren Chow
Graduate student
College of Informatics, SCAU
MSN: terren.c...@hotmail.com

Xorg on Fujitsu Lime with MPC5200b?

2010-04-14 Thread Bill Gatliff
Guys:


I'm not quite sure where to ask this question, but all my attempts
elsewhere have come up short, so...

Put simply, I have an MPC5200b platform with a Fujitsu Lime GDC, and
I'm trying to run Debian squeeze's xorg on it.

Actually, I *have* Debian squeeze's xorg running on the platform just
fine, with a 2.6.34-rc1 kernel (kernel.org).  Problem is, every single
diagonal line is very blocky--- not smooth at all.  I used to think this
was a problem with X's fonts, but now I don't think so because the mouse
cursor's diagonal lines also look equally bad.

It's almost as if any time the platform tries to draw a diagonal line,
truncation/rounding errors are causing it problems in figuring out which
pixels to turn on and off.

A non-Linux kernel on this hardware, running a non-X GUI, seems to work
fine so I think the hardware isn't the problem.

Anyone have any suggestions on where to start with this one?  Anyone
else running a similar configuration with any success?  I'm completely
lost, and running out of hair *fast*...


Thanks!


b.g.

-- 
Bill Gatliff
b...@billgatliff.com



[PATCH v3] perf: Split out arch specific code & improve PowerPC perf probe support

2010-04-14 Thread Ian Munsie
These patches add the required mappings to use perf probe on PowerPC.

Part 1 of the patch series moves the arch dependent x86 32 and 64 bit DWARF
register number mappings out into a separate arch directory and adds the
necessary Makefile foo to use it.

Part 2 of the patch series adds the PowerPC mappings -
Functionality wise it requires the patch titled "powerpc: Add kprobe-based
event tracer" from the powerpc-next tree to provide the
HAVE_REGS_AND_STACK_ACCESS_API required for CONFIG_KPROBE_EVENT. The code will
still compile cleanly without it and will fail gracefully at runtime on the
missing CONFIG_KPROBE_EVENT support as before as well as printing a warning
message during compilation.


Changes since v2: From Masami Hiramatsu's feedback DWARF support is disabled
altogether if the architecture specific Makefile does not define
PERF_HAVE_DWARF_REGS - ie, DWARF register mappings are missing for the
architecture. A message indicating this is printed out during compilation.




[PATCH v3 1/2] perf: Move arch specific code into separate arch directory

2010-04-14 Thread Ian Munsie
From: Ian Munsie imun...@au.ibm.com

The perf userspace tool included some architecture specific code to map
registers from the DWARF register number into the names used by the regs
and stack access API.

This patch moves the architecture specific code out into a separate
arch/x86 directory along with the infrastructure required to use it.

Signed-off-by: Ian Munsie imun...@au.ibm.com
---
Changes since v2: From Masami Hiramatsu's feedback DWARF support is disabled
altogether if the architecture specific Makefile does not define
PERF_HAVE_DWARF_REGS - ie, DWARF register mappings are missing for the
architecture. A message indicating this is printed out during compilation.

 tools/perf/Makefile   |   26 ++-
 tools/perf/arch/x86/Makefile  |2 +
 tools/perf/arch/x86/util/dwarf-regs.c |   75 +
 tools/perf/util/include/dwarf-regs.h  |8 
 tools/perf/util/probe-finder.c|   55 +---
 5 files changed, 110 insertions(+), 56 deletions(-)
 create mode 100644 tools/perf/arch/x86/Makefile
 create mode 100644 tools/perf/arch/x86/util/dwarf-regs.c
 create mode 100644 tools/perf/util/include/dwarf-regs.h

diff --git a/tools/perf/Makefile b/tools/perf/Makefile
index 57b3569..269d5dd 100644
--- a/tools/perf/Makefile
+++ b/tools/perf/Makefile
@@ -173,6 +173,20 @@ uname_R := $(shell sh -c 'uname -r 2>/dev/null || echo not')
 uname_P := $(shell sh -c 'uname -p 2>/dev/null || echo not')
 uname_V := $(shell sh -c 'uname -v 2>/dev/null || echo not')
 
+ARCH ?= $(shell echo $(uname_M) | sed -e s/i.86/i386/ -e s/sun4u/sparc64/ \
+ -e s/arm.*/arm/ -e s/sa110/arm/ \
+ -e s/s390x/s390/ -e s/parisc64/parisc/ \
+ -e s/ppc.*/powerpc/ -e s/mips.*/mips/ \
+ -e s/sh[234].*/sh/ )
+
+# Additional ARCH settings for x86
+ifeq ($(ARCH),i386)
+ARCH := x86
+endif
+ifeq ($(ARCH),x86_64)
+ARCH := x86
+endif
+
 # CFLAGS and LDFLAGS are for the users to override from the command line.
 
 #
@@ -285,7 +299,7 @@ endif
 # Those must not be GNU-specific; they are shared with perl/ which may
 # be built by a different compiler. (Note that this is an artifact now
 # but it still might be nice to keep that distinction.)
-BASIC_CFLAGS = -Iutil/include
+BASIC_CFLAGS = -Iutil/include -Iarch/$(ARCH)/include
 BASIC_LDFLAGS =
 
 # Guard against environment variables
@@ -367,6 +381,7 @@ LIB_H += util/include/asm/byteorder.h
 LIB_H += util/include/asm/swab.h
 LIB_H += util/include/asm/system.h
 LIB_H += util/include/asm/uaccess.h
+LIB_H += util/include/dwarf-regs.h
 LIB_H += perf.h
 LIB_H += util/cache.h
 LIB_H += util/callchain.h
@@ -485,6 +500,7 @@ PERFLIBS = $(LIB_FILE)
 
 -include config.mak.autogen
 -include config.mak
+-include arch/$(ARCH)/Makefile
 
 ifeq ($(uname_S),Darwin)
ifndef NO_FINK
@@ -521,12 +537,16 @@ endif
 ifneq ($(shell sh -c "(echo '\#include <dwarf.h>'; echo '\#include <libdw.h>'; echo 'int main(void) { Dwarf *dbg; dbg = dwarf_begin(0, DWARF_C_READ); return (long)dbg; }') | $(CC) -x c - $(ALL_CFLAGS) -I/usr/include/elfutils -ldw -lelf -o $(BITBUCKET) $(ALL_LDFLAGS) $(EXTLIBS) "$(QUIET_STDERR)" && echo y"), y)
    msg := $(warning No libdw.h found or old libdw.h found, disables dwarf support. Please install elfutils-devel/elfutils-dev);
 else
+ifeq ($(origin PERF_HAVE_DWARF_REGS), undefined)
+   msg := $(warning DWARF register mappings have not been defined for architecture $(ARCH), DWARF support disabled);
+else
 ifndef NO_DWARF
BASIC_CFLAGS += -I/usr/include/elfutils -DDWARF_SUPPORT
EXTLIBS += -lelf -ldw
LIB_OBJS += $(OUTPUT)util/probe-finder.o
-endif
-endif
+endif # NO_DWARF
+endif # PERF_HAVE_DWARF_REGS
+endif # Dwarf support
 
 ifneq ($(shell sh -c "(echo '\#include <newt.h>'; echo 'int main(void) { newtInit(); newtCls(); return newtFinished(); }') | $(CC) -x c - $(ALL_CFLAGS) -D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64 -lnewt -o $(BITBUCKET) $(ALL_LDFLAGS) $(EXTLIBS) "$(QUIET_STDERR)" && echo y"), y)
    msg := $(warning newt not found, disables TUI support. Please install newt-devel or libnewt-dev);
diff --git a/tools/perf/arch/x86/Makefile b/tools/perf/arch/x86/Makefile
new file mode 100644
index 000..cbd7833
--- /dev/null
+++ b/tools/perf/arch/x86/Makefile
@@ -0,0 +1,2 @@
+PERF_HAVE_DWARF_REGS := 1
+LIB_OBJS += $(OUTPUT)arch/$(ARCH)/util/dwarf-regs.o
diff --git a/tools/perf/arch/x86/util/dwarf-regs.c b/tools/perf/arch/x86/util/dwarf-regs.c
new file mode 100644
index 000..a794d30
--- /dev/null
+++ b/tools/perf/arch/x86/util/dwarf-regs.c
@@ -0,0 +1,75 @@
+/*
+ * dwarf-regs.c : Mapping of DWARF debug register numbers into register names.
+ * Extracted from probe-finder.c
+ *
+ * Written by Masami Hiramatsu mhira...@redhat.com
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License 

[PATCH v3 2/2] perf probe: Add PowerPC DWARF register number mappings

2010-04-14 Thread Ian Munsie
From: Ian Munsie imun...@au.ibm.com

This patch adds mappings from the register numbers from DWARF to the
register names used in the PowerPC Regs and Stack Access API. This
allows perf probe to be used to record variable contents on PowerPC.

This patch depends on functionality in the powerpc/next tree, though it
will compile fine without it. Specifically, this patch depends on the commit
"powerpc: Add kprobe-based event tracer".

Signed-off-by: Ian Munsie imun...@au.ibm.com
---
 tools/perf/arch/powerpc/Makefile  |2 +
 tools/perf/arch/powerpc/util/dwarf-regs.c |   88 +
 2 files changed, 90 insertions(+), 0 deletions(-)
 create mode 100644 tools/perf/arch/powerpc/Makefile
 create mode 100644 tools/perf/arch/powerpc/util/dwarf-regs.c

diff --git a/tools/perf/arch/powerpc/Makefile b/tools/perf/arch/powerpc/Makefile
new file mode 100644
index 000..cbd7833
--- /dev/null
+++ b/tools/perf/arch/powerpc/Makefile
@@ -0,0 +1,2 @@
+PERF_HAVE_DWARF_REGS := 1
+LIB_OBJS += $(OUTPUT)arch/$(ARCH)/util/dwarf-regs.o
diff --git a/tools/perf/arch/powerpc/util/dwarf-regs.c 
b/tools/perf/arch/powerpc/util/dwarf-regs.c
new file mode 100644
index 000..48ae0c5
--- /dev/null
+++ b/tools/perf/arch/powerpc/util/dwarf-regs.c
@@ -0,0 +1,88 @@
+/*
+ * Mapping of DWARF debug register numbers into register names.
+ *
+ * Copyright (C) 2010 Ian Munsie, IBM Corporation.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ */
+
+#include <libio.h>
+#include <dwarf-regs.h>
+
+
+struct pt_regs_dwarfnum {
+   const char *name;
+   unsigned int dwarfnum;
+};
+
+#define STR(s) #s
+#define REG_DWARFNUM_NAME(r, num) {.name = r, .dwarfnum = num}
+#define GPR_DWARFNUM_NAME(num) \
+   {.name = STR(%gpr##num), .dwarfnum = num}
+#define REG_DWARFNUM_END {.name = NULL, .dwarfnum = 0}
+
+/*
+ * Reference:
+ * http://refspecs.linuxfoundation.org/ELF/ppc64/PPC-elf64abi-1.9.html
+ */
+static const struct pt_regs_dwarfnum regdwarfnum_table[] = {
+   GPR_DWARFNUM_NAME(0),
+   GPR_DWARFNUM_NAME(1),
+   GPR_DWARFNUM_NAME(2),
+   GPR_DWARFNUM_NAME(3),
+   GPR_DWARFNUM_NAME(4),
+   GPR_DWARFNUM_NAME(5),
+   GPR_DWARFNUM_NAME(6),
+   GPR_DWARFNUM_NAME(7),
+   GPR_DWARFNUM_NAME(8),
+   GPR_DWARFNUM_NAME(9),
+   GPR_DWARFNUM_NAME(10),
+   GPR_DWARFNUM_NAME(11),
+   GPR_DWARFNUM_NAME(12),
+   GPR_DWARFNUM_NAME(13),
+   GPR_DWARFNUM_NAME(14),
+   GPR_DWARFNUM_NAME(15),
+   GPR_DWARFNUM_NAME(16),
+   GPR_DWARFNUM_NAME(17),
+   GPR_DWARFNUM_NAME(18),
+   GPR_DWARFNUM_NAME(19),
+   GPR_DWARFNUM_NAME(20),
+   GPR_DWARFNUM_NAME(21),
+   GPR_DWARFNUM_NAME(22),
+   GPR_DWARFNUM_NAME(23),
+   GPR_DWARFNUM_NAME(24),
+   GPR_DWARFNUM_NAME(25),
+   GPR_DWARFNUM_NAME(26),
+   GPR_DWARFNUM_NAME(27),
+   GPR_DWARFNUM_NAME(28),
+   GPR_DWARFNUM_NAME(29),
+   GPR_DWARFNUM_NAME(30),
+   GPR_DWARFNUM_NAME(31),
+   REG_DWARFNUM_NAME("%msr",   66),
+   REG_DWARFNUM_NAME("%ctr",   109),
+   REG_DWARFNUM_NAME("%link",  108),
+   REG_DWARFNUM_NAME("%xer",   101),
+   REG_DWARFNUM_NAME("%dar",   119),
+   REG_DWARFNUM_NAME("%dsisr", 118),
+   REG_DWARFNUM_END,
+};
+
+/**
+ * get_arch_regstr() - lookup register name from its DWARF register number
+ * @n: the DWARF register number
+ *
+ * get_arch_regstr() returns the name of the register in struct
+ * regdwarfnum_table from its DWARF register number. If the register is not
+ * found in the table, this returns NULL.
+ */
+const char *get_arch_regstr(unsigned int n)
+{
+   const struct pt_regs_dwarfnum *roff;
+   for (roff = regdwarfnum_table; roff->name != NULL; roff++)
+       if (roff->dwarfnum == n)
+           return roff->name;
+   return NULL;
+}
-- 
1.7.0
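
For readers who want to try the lookup outside the perf tree, the following is a minimal standalone sketch of the same sentinel-terminated table scan. It assumes a shortened three-entry table and the hypothetical names `sketch_table` and `lookup_regstr`; only the register names and numbers shown are taken from the patch above, and it is not the perf implementation itself.

```c
#include <stddef.h>

/* Same layout as the patch's struct pt_regs_dwarfnum. */
struct pt_regs_dwarfnum {
	const char *name;
	unsigned int dwarfnum;
};

/* Shortened table for illustration; the patch's table lists all 32 GPRs. */
static const struct pt_regs_dwarfnum sketch_table[] = {
	{ .name = "%gpr1", .dwarfnum = 1 },
	{ .name = "%gpr3", .dwarfnum = 3 },
	{ .name = "%link", .dwarfnum = 108 },
	{ .name = NULL,    .dwarfnum = 0 },	/* sentinel: ends the scan */
};

/*
 * Hypothetical stand-in for get_arch_regstr(): walk the table until the
 * NULL-named sentinel and return the first entry whose number matches.
 */
static const char *lookup_regstr(unsigned int n)
{
	const struct pt_regs_dwarfnum *roff;

	for (roff = sketch_table; roff->name != NULL; roff++)
		if (roff->dwarfnum == n)
			return roff->name;
	return NULL;	/* number not present in the table */
}
```

Callers have to check for NULL before using the result, since any DWARF register number missing from the table falls through to the final return.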

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [PATCH 4/5] sched: Mark the balance type for use in need_active_balance()

2010-04-14 Thread Michael Neuling
> On Fri, 2010-04-09 at 16:21 +1000, Michael Neuling wrote:
> > need_active_balance() gates the asymmetric packing based due to power
> > save logic, but for packing we don't care.
>
> This explanation lacks a how/why.
>
> So the problem is that need_active_balance() ends up returning false and
> prevents the active balance from pulling a task to a lower available SMT
> sibling?

Correct.  I've put a more detailed description in the patch below.  

> > This marks the type of balanace we are attempting to do perform from
> > f_b_g() and stops need_active_balance() power save logic gating a
> > balance in the asymmetric packing case.
>
> At the very least this wants more comments in the code.

Sorry again for the lackluster comments. I've updated this patch also.

> I'm not really charmed by having to add yet another variable to pass
> around that mess, but I can't seem to come up with something cleaner
> either.

Yeah, the current case only ever reads the balance type in the !=
BALANCE_POWER check, so a full enum might be overkill, but I thought it
might come in useful for someone else.

Updated patch below.

Mikey


[PATCH 4/5] sched: fix need_active_balance() from preventing asymmetric packing 

need_active_balance() prevents a task being pulled onto a newly idle
package in an attempt to completely free it so it can be powered down.
Hence it returns false to load_balance() and prevents the active
balance from occurring.

Unfortunately, when asymmetric packing is enabled at the sibling level
this power save logic is preventing the packing balance from moving a
task to a lower idle thread.  At the sibling level SD_SHARE_CPUPOWER
and parent(SD_POWERSAVINGS_BALANCE) are enabled and the domain is also
non-idle (since we have at least 1 task we are trying to move down).
Hence the following code prevents an active balance from
occurring:

	if (!sd_idle && sd->flags & SD_SHARE_CPUPOWER &&
	    !test_sd_parent(sd, SD_POWERSAVINGS_BALANCE))
		return 0;

To fix this, this patch classifies the type of balance we are
attempting to perform into none, load, power and packing based on which
function finds the busiest group in f_b_g().  This classification is then used
by need_active_balance() to prevent the above power saving logic from
stopping a balance due to asymmetric packing.  This ensures tasks can
be correctly moved down to lower sibling threads.  
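
As a rough illustration of the idea described above, the gate can be sketched in isolation. This is a simplified sketch under stated assumptions, not the code from the patch: `power_save_gate_blocks` is a hypothetical helper, the three integer flags stand in for `sd_idle`, `sd->flags & SD_SHARE_CPUPOWER` and `test_sd_parent(sd, SD_POWERSAVINGS_BALANCE)`, and the exact condition used in need_active_balance() is not visible in this excerpt.

```c
#include <stdbool.h>

/* Mirrors the enum this patch adds to kernel/sched_fair.c. */
enum balance_type {
	BALANCE_NONE = 0,
	BALANCE_LOAD,
	BALANCE_POWER,
	BALANCE_PACKING
};

/*
 * Hypothetical, simplified stand-in for the power-save gate in
 * need_active_balance(). Returns true when the gate would stop the
 * active balance.
 */
static bool power_save_gate_blocks(enum balance_type bt, int sd_idle,
				   int share_cpupower, int parent_powersave)
{
	/* Assumed shape of the fix: a packing balance skips the gate. */
	if (bt == BALANCE_PACKING)
		return false;

	/* Same boolean shape as the condition quoted in the changelog. */
	return !sd_idle && share_cpupower && !parent_powersave;
}
```

With a non-idle sibling domain (`sd_idle == 0`, `share_cpupower == 1`, `parent_powersave == 0`), this sketch blocks a BALANCE_LOAD request but lets a BALANCE_PACKING request through, which is the behaviour the changelog asks for.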

Signed-off-by: Michael Neuling mi...@neuling.org
---

 kernel/sched_fair.c |   35 ++-
 1 file changed, 30 insertions(+), 5 deletions(-)

Index: linux-2.6-ozlabs/kernel/sched_fair.c
===
--- linux-2.6-ozlabs.orig/kernel/sched_fair.c
+++ linux-2.6-ozlabs/kernel/sched_fair.c
@@ -91,6 +91,14 @@ const_debug unsigned int sysctl_sched_mi
 
 static const struct sched_class fair_sched_class;
 
+/* Enum to classify the type of balance we are attempting to perform */
+enum balance_type {
+   BALANCE_NONE = 0,
+   BALANCE_LOAD,
+   BALANCE_POWER,
+   BALANCE_PACKING
+};
+
 /**
  * CFS operations on generic schedulable entities:
  */
@@ -2803,16 +2811,19 @@ static inline void calculate_imbalance(s
  * @cpus: The set of CPUs under consideration for load-balancing.
  * @balance: Pointer to a variable indicating if this_cpu
  * is the appropriate cpu to perform load balancing at this_level.
+ * @bt: returns the type of imbalance found
  *
  * Returns:- the busiest group if imbalance exists.
  * - If no imbalance and user has opted for power-savings balance,
  *return the least loaded group whose CPUs can be
  *put to idle by rebalancing its tasks onto our group.
+ * - *bt classifies the type of imbalance found
  */
 static struct sched_group *
 find_busiest_group(struct sched_domain *sd, int this_cpu,
   unsigned long *imbalance, enum cpu_idle_type idle,
-  int *sd_idle, const struct cpumask *cpus, int *balance)
+  int *sd_idle, const struct cpumask *cpus, int *balance,
+  enum balance_type *bt)
 {
struct sd_lb_stats sds;
 
@@ -2837,6 +2848,7 @@ find_busiest_group(struct sched_domain *
if (!(*balance))
goto ret;
 
+   *bt = BALANCE_PACKING;
if ((idle == CPU_IDLE || idle == CPU_NEWLY_IDLE) &&
check_asym_packing(sd, &sds, this_cpu, imbalance))
return sds.busiest;
@@ -2857,6 +2869,7 @@ find_busiest_group(struct sched_domain *
 
/* Looks like there is an imbalance. Compute it */
calculate_imbalance(&sds, this_cpu, imbalance);
+   *bt = BALANCE_LOAD;
return sds.busiest;
 
 out_balanced:
@@ -2864,10 +2877,12 @@ out_balanced:
 * There is no obvious imbalance. But check if we can do some balancing
 * to save power.
 */
+   *bt = BALANCE_POWER;
if 

Re: Xorg on Fujitsu Lime with MPC5200b?

2010-04-14 Thread Benjamin Herrenschmidt
On Wed, 2010-04-14 at 22:07 -0500, Bill Gatliff wrote:
 
> Anyone have any suggestions on where to start with this one?  Anyone
> else running a similar configuration with any success?  I'm completely
> lost, and running out of hair *fast*...

Most probably endian bugs in the Lime driver ...

Cheers,
Ben.




Re: [PATCH 5/5] sched: make fix_small_imbalance work with asymmetric packing

2010-04-14 Thread Michael Neuling
In message 1271208670.2834.55.ca...@sbs-t61.sc.intel.com you wrote:
> On Tue, 2010-04-13 at 05:29 -0700, Peter Zijlstra wrote:
> > On Fri, 2010-04-09 at 16:21 +1000, Michael Neuling wrote:
> > > With the asymmetric packing infrastructure, fix_small_imbalance is
> > > causing idle higher threads to pull tasks off lower threads.
> > >
> > > This is being caused by an off-by-one error.
> > >
> > > Signed-off-by: Michael Neuling mi...@neuling.org
> > > ---
> > > I'm not sure this is the right fix but without it, higher threads pull
> > > tasks off the lower threads, then the packing pulls it back down, etc
> > > etc and tasks bounce around constantly.
> >
> > Would help if you expand upon the why/how it manages to get pulled up.
> >
> > I can't immediately spot anything wrong with the patch, but then that
> > isn't my favourite piece of code either.. Suresh, any comments?
> >
>
> Sorry didn't pay much attention to this patchset. But based on the
> comments from Michael and looking at this patchset, it has SMT/MC
> implications. I will review and run some tests and get back in a day.
>
> As far as this particular patch is concerned, original code is coming
> from Ingo's original CFS code commit (dd41f596) and the below hunk
> pretty much explains what the change is about.
>
> -   if (max_load - this_load >= busiest_load_per_task * imbn) {
> +   if (max_load - this_load + SCHED_LOAD_SCALE_FUZZ >=
> +   busiest_load_per_task * imbn) {
>
> So the below proposed change will probably break what the above
> mentioned commit was trying to achieve, which is: for fairness reasons
> we were bouncing the small extra load (between the max_load and
> this_load) around.

Actually, you can drop this patch.  

In the process of clarifying why it was needed for the changelog, I
discovered I don't actually need it.  

Sorry about that.

Mikey

 
> > > ---
> > >
> > >  kernel/sched_fair.c |2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > Index: linux-2.6-ozlabs/kernel/sched_fair.c
> > > ===
> > > --- linux-2.6-ozlabs.orig/kernel/sched_fair.c
> > > +++ linux-2.6-ozlabs/kernel/sched_fair.c
> > > @@ -2652,7 +2652,7 @@ static inline void fix_small_imbalance(s
> > >   * SCHED_LOAD_SCALE;
> > >  scaled_busy_load_per_task /= sds->busiest->cpu_power;
> > >
> > > - if (sds->max_load - sds->this_load + scaled_busy_load_per_task >=
> > > + if (sds->max_load - sds->this_load + scaled_busy_load_per_task >
> > >  (scaled_busy_load_per_task * imbn)) {
> > >  *imbalance = sds->busiest_load_per_task;
> > >  return;
> >
>
> thanks,
> suresh
 
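
The only difference in the proposed (and since dropped) hunk is the comparison operator, `>=` versus `>`, so the two versions disagree only when both sides are exactly equal. A minimal sketch of that boundary follows, assuming hypothetical helper names, plain unsigned long loads with `max_load >= this_load`, and `imbn = 2` for the worked values; none of the numbers come from the thread.

```c
#include <stdbool.h>

/* Existing condition: uses '>='. */
static bool small_imbalance_ge(unsigned long max_load, unsigned long this_load,
			       unsigned long per_task, unsigned int imbn)
{
	return max_load - this_load + per_task >= per_task * imbn;
}

/* Proposed condition: uses '>'. */
static bool small_imbalance_gt(unsigned long max_load, unsigned long this_load,
			       unsigned long per_task, unsigned int imbn)
{
	return max_load - this_load + per_task > per_task * imbn;
}
```

With `per_task = 1024` and `imbn = 2`, a load difference of exactly 1024 makes the first helper return true and the second return false; for any larger or smaller difference the two agree. That boundary case is the one in which an idle thread sees exactly one task's worth of extra load on its sibling.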