Re: linux-next: generic-ipi tree build failure

2008-07-03 Thread Ingo Molnar

* Stephen Rothwell <[EMAIL PROTECTED]> wrote:

> Hi Ingo, Jens,
> 
> Today's linux-next build (powerpc ppc64_defconfig) failed like this:
> 
> arch/powerpc/mm/tlb_64.c: In function 'pgtable_free_now':
> arch/powerpc/mm/tlb_64.c:66: error: too many arguments to function 'smp_call_function'
> arch/powerpc/mm/slice.c: In function 'slice_get_unmapped_area':
> arch/powerpc/mm/slice.c:559: error: too many arguments to function 'on_each_cpu'
> arch/powerpc/kernel/machine_kexec_64.c: In function 'kexec_prepare_cpus':
> arch/powerpc/kernel/machine_kexec_64.c:175: error: too many arguments to function 'smp_call_function'
> 
> I applied the patch below.

thanks, applied the tlb_64.c and machine_kexec_64.c bits to 
tip/generic-ipi [see the commit below]. (Please carry the slice.c bits 
in linux-next separately as that's due to new code in linux-next.)

Ingo

>
commit 392096e98fd55e54035978fe03796fca8d26a574
Author: Stephen Rothwell <[EMAIL PROTECTED]>
Date:   Thu Jul 3 17:10:07 2008 +1000

generic-ipi: fix linux-next tree build failure

Today's linux-next build (powerpc ppc64_defconfig) failed like this:

arch/powerpc/mm/tlb_64.c: In function 'pgtable_free_now':
arch/powerpc/mm/tlb_64.c:66: error: too many arguments to function 'smp_call_function'
arch/powerpc/kernel/machine_kexec_64.c: In function 'kexec_prepare_cpus':
arch/powerpc/kernel/machine_kexec_64.c:175: error: too many arguments to function 'smp_call_function'
    
Signed-off-by: Stephen Rothwell <[EMAIL PROTECTED]>
Acked-by: Jens Axboe <[EMAIL PROTECTED]>
Cc: Paul Mackerras <[EMAIL PROTECTED]>
Cc: 
Signed-off-by: Ingo Molnar <[EMAIL PROTECTED]>

diff --git a/arch/powerpc/kernel/machine_kexec_64.c b/arch/powerpc/kernel/machine_kexec_64.c
index 704375b..b732b5f 100644
--- a/arch/powerpc/kernel/machine_kexec_64.c
+++ b/arch/powerpc/kernel/machine_kexec_64.c
@@ -172,7 +172,7 @@ static void kexec_prepare_cpus(void)
 {
int my_cpu, i, notified=-1;
 
-   smp_call_function(kexec_smp_down, NULL, 0, /* wait */0);
+   smp_call_function(kexec_smp_down, NULL, /* wait */0);
my_cpu = get_cpu();
 
/* check the others cpus are now down (via paca hw cpu id == -1) */
diff --git a/arch/powerpc/mm/tlb_64.c b/arch/powerpc/mm/tlb_64.c
index e2d867c..69ad829 100644
--- a/arch/powerpc/mm/tlb_64.c
+++ b/arch/powerpc/mm/tlb_64.c
@@ -66,7 +66,7 @@ static void pgtable_free_now(pgtable_free_t pgf)
 {
pte_freelist_forced_free++;
 
-   smp_call_function(pte_free_smp_sync, NULL, 0, 1);
+   smp_call_function(pte_free_smp_sync, NULL, 1);
 
pgtable_free(pgf);
 }
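
The failure reported here is a plain arity mismatch: the call sites still passed four arguments after smp_call_function() had been reduced to three. A minimal userspace sketch of a converted call site follows. It assumes a mock (`mock_smp_call_function`, a hypothetical name, not the kernel implementation) that just runs the callback once per simulated remote CPU; `MOCK_NR_CPUS`, `count_call` and `run_on_other_cpus` are likewise made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define MOCK_NR_CPUS 4  /* hypothetical CPU count for this sketch */

/*
 * Hypothetical stand-in for the three-argument form seen in the diff:
 * (func, info, wait). The real kernel function sends IPIs; this mock
 * simply calls func once for every "other" CPU.
 */
static int mock_smp_call_function(void (*func)(void *info), void *info, int wait)
{
    assert(func != NULL);
    (void)wait;  /* the mock always runs synchronously */
    for (int cpu = 1; cpu < MOCK_NR_CPUS; cpu++)
        func(info);
    return 0;
}

/* Example callback: count how many times it was invoked. */
static void count_call(void *info)
{
    int *counter = info;
    (*counter)++;
}

/*
 * Converted call site: the extra argument between info and wait is gone.
 * A stale four-argument call would be rejected by the compiler with
 * "too many arguments to function", as in the build log.
 */
static int run_on_other_cpus(void)
{
    int calls = 0;

    mock_smp_call_function(count_call, &calls, /* wait */ 1);
    return calls;
}
```

With four simulated CPUs the callback runs on the three that are not the caller.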
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: the printk problem

2008-07-05 Thread Ingo Molnar

* Linus Torvalds <[EMAIL PROTECTED]> wrote:

> > Still all happily untested, of course. And still with no actual 
> > users converted.
> 
> Ok, it's tested, and here's an example usage conversion.
> 
> The diffstat pretty much says it all. It _does_ change the format of 
> the stack trace entry a bit, but I don't think it's for the worse 
> (unless it breaks things like the oops tracker - Arjan?)
> 
> It changes the symbol-in-module format from
> 
>   :ext3:add_dirent_to_buf+0x6c/0x26c
> 
> to
> 
>   add_dirent_to_buf+0x6c/0x26c [ext3]
> 
> but quite frankly, the latter was the standard format anyway (it's 
> what "sprint_symbol()" gives you), and traps_64.c was the odd man out.
> 
> In fact, traps_32.c already uses the standard print_symbol() format, 
> so it really was an issue of the 64-bit code being odd (and I assume 
> that this also means that it cannot break the oops tracker, since it 
> already had to be able to handle both formats).
> 
> I also removed the KALLSYMS dependency, so if KALLSYMS isn't enabled 
> it will now give the same hex format twice, but I doubt we really care 
> (such stack traces are unreadable whether it shows up once or twice, 
> and the simplicity is worth it).
> 
> If people do just a few more conversions like this, then the 52 added 
> lines in lib/vsprintf.c are more than made up for by removed lines 
> elsewhere (and more readable source code).

applied (with the commit message below) to tip/x86/debug for v2.6.27 
merging, thanks Linus. Can i add your SOB too?

Ingo

>
commit 4afd2534d6d4a77f4b7497c92f1ff7528d8f4eaa
Author: Linus Torvalds <[EMAIL PROTECTED]>
Date:   Sat Jul 5 15:32:41 2008 -0700

x86, 64-bit: standardize printk_address()

Changes the symbol-in-module format from

:ext3:add_dirent_to_buf+0x6c/0x26c

to

add_dirent_to_buf+0x6c/0x26c [ext3]

the latter was the standard format anyway (it's what "sprint_symbol()"
gives you), and traps_64.c was the odd man out.

In fact, traps_32.c already uses the standard print_symbol() format, so
it really was an issue of the 64-bit code being odd (and I assume that
this also means that it cannot break the oops tracker, since it already
had to be able to handle both formats).

I also removed the KALLSYMS dependency, so if KALLSYMS isn't enabled it
will now give the same hex format twice, but I doubt we really care
(such stack traces are unreadable whether it shows up once or twice, and
the simplicity is worth it).

If people do just a few more conversions like this, then the 52 added
lines in lib/vsprintf.c are more than made up for by removed lines
elsewhere (and more readable source code).

Signed-off-by: Ingo Molnar <[EMAIL PROTECTED]>

diff --git a/arch/x86/kernel/traps_64.c b/arch/x86/kernel/traps_64.c
index adff76e..f1a95d1 100644
--- a/arch/x86/kernel/traps_64.c
+++ b/arch/x86/kernel/traps_64.c
@@ -104,30 +104,7 @@ int kstack_depth_to_print = 12;
 
 void printk_address(unsigned long address, int reliable)
 {
-#ifdef CONFIG_KALLSYMS
-   unsigned long offset = 0, symsize;
-   const char *symname;
-   char *modname;
-   char *delim = ":";
-   char namebuf[KSYM_NAME_LEN];
-   char reliab[4] = "";
-
-   symname = kallsyms_lookup(address, &symsize, &offset,
-   &modname, namebuf);
-   if (!symname) {
-   printk(" [<%016lx>]\n", address);
-   return;
-   }
-   if (!reliable)
-   strcpy(reliab, "? ");
-
-   if (!modname)
-   modname = delim = "";
-   printk(" [<%016lx>] %s%s%s%s%s+0x%lx/0x%lx\n",
-   address, reliab, delim, modname, delim, symname, offset, symsize);
-#else
-   printk(" [<%016lx>]\n", address);
-#endif
+   printk(" [<%016lx>] %s%pS\n", address, reliable ? "": "? ", (void *) address);
 }
 
 static unsigned long *in_exception_stack(unsigned cpu, unsigned long stack,
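
To make the format change in this commit concrete, here is a small userspace sketch that builds both strings with snprintf(). It does not use kallsyms or '%pS'; the helper names (`format_old_style`, `format_standard_style`) are hypothetical, and the symbol data is passed in by hand.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Old traps_64.c layout: ":module:symbol+0xoff/0xsize".
 * Without a module both colons disappear, mirroring the removed code's
 * 'modname = delim = ""' branch.
 */
static int format_old_style(char *buf, size_t len, const char *mod,
                            const char *sym, unsigned long off,
                            unsigned long size)
{
    const char *delim = mod ? ":" : "";

    assert(buf != NULL && sym != NULL);
    return snprintf(buf, len, "%s%s%s%s+0x%lx/0x%lx",
                    delim, mod ? mod : "", delim, sym, off, size);
}

/* Standard layout: "symbol+0xoff/0xsize [module]". */
static int format_standard_style(char *buf, size_t len, const char *mod,
                                 const char *sym, unsigned long off,
                                 unsigned long size)
{
    assert(buf != NULL && sym != NULL);
    if (mod)
        return snprintf(buf, len, "%s+0x%lx/0x%lx [%s]", sym, off, size, mod);
    return snprintf(buf, len, "%s+0x%lx/0x%lx", sym, off, size);
}
```

Feeding both helpers "ext3", "add_dirent_to_buf", 0x6c and 0x26c reproduces the two example lines from the commit message.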


Re: the printk problem

2008-07-05 Thread Ingo Molnar

* Linus Torvalds <[EMAIL PROTECTED]> wrote:

> On Sun, 6 Jul 2008, Ingo Molnar wrote:
> > 
> > applied (with the commit message below) to tip/x86/debug for v2.6.27 
> > merging, thanks Linus. Can i add your SOB too?
> 
> Sure, add my S-O-B. But I hope/assume that you also added my earlier 
> patch that added the support for '%pS' too? I'm not entirely sure that 
> should go in an x86-specific branch, since it has nothing x86-specific 
> in it.

yeah, agreed, combined it's not an x86 topic anymore.

[ There's some lkml trouble so i've missed the earlier patch. I'm not 
  sure the email problem is on my side, see how incomplete the 
  discussion is on lkml.org as well:

 http://lkml.org/lkml/2008/6/25/170   ]

Anyway, i have added this second patch of yours to tip/core/printk 
and moved your first patch over to that topic (which relies on it).

That topic has a few other (smaller) printk enhancements queued for 
v2.6.27 already:

   git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip.git core/printk

[find other stats below]

so it fits in naturally.

Ingo

-->
Ingo Molnar (1):
  printk: export console_drivers

Jan Kiszka (1):
  printk: don't prefer unsuited consoles on registration

Jiri Slaby (1):
  x86, generic: mark early_printk as asmlinkage

Linus Torvalds (2):
  printk: add support for '%pS'
  x86, 64-bit: standardize printk_address()

Nick Andrew (2):
  printk: refactor processing of line severity tokens
  printk: remember the message level for multi-line output

Tejun Heo (1):
  printk: clean up recursion check related static variables

Thomas Gleixner (2):
  namespacecheck: fix kernel printk.c
  namespacecheck: more kernel/printk.c fixes

 arch/x86/kernel/early_printk.c |    2 +-
 arch/x86/kernel/traps_64.c     |   25 +
 include/linux/kernel.h         |    8 +---
 kernel/printk.c                |  107 +++-
 lib/vsprintf.c                 |  118 +---
 5 files changed, 133 insertions(+), 127 deletions(-)

diff --git a/arch/x86/kernel/early_printk.c b/arch/x86/kernel/early_printk.c
index 643fd86..ff9e735 100644
--- a/arch/x86/kernel/early_printk.c
+++ b/arch/x86/kernel/early_printk.c
@@ -196,7 +196,7 @@ static struct console simnow_console = {
 static struct console *early_console = &early_vga_console;
 static int early_console_initialized;
 
-void early_printk(const char *fmt, ...)
+asmlinkage void early_printk(const char *fmt, ...)
 {
char buf[512];
int n;
diff --git a/arch/x86/kernel/traps_64.c b/arch/x86/kernel/traps_64.c
index adff76e..f1a95d1 100644
--- a/arch/x86/kernel/traps_64.c
+++ b/arch/x86/kernel/traps_64.c
@@ -104,30 +104,7 @@ int kstack_depth_to_print = 12;
 
 void printk_address(unsigned long address, int reliable)
 {
-#ifdef CONFIG_KALLSYMS
-   unsigned long offset = 0, symsize;
-   const char *symname;
-   char *modname;
-   char *delim = ":";
-   char namebuf[KSYM_NAME_LEN];
-   char reliab[4] = "";
-
-   symname = kallsyms_lookup(address, &symsize, &offset,
-   &modname, namebuf);
-   if (!symname) {
-   printk(" [<%016lx>]\n", address);
-   return;
-   }
-   if (!reliable)
-   strcpy(reliab, "? ");
-
-   if (!modname)
-   modname = delim = "";
-   printk(" [<%016lx>] %s%s%s%s%s+0x%lx/0x%lx\n",
-   address, reliab, delim, modname, delim, symname, offset, symsize);
-#else
-   printk(" [<%016lx>]\n", address);
-#endif
+   printk(" [<%016lx>] %s%pS\n", address, reliable ? "": "? ", (void *) address);
 }
 
 static unsigned long *in_exception_stack(unsigned cpu, unsigned long stack,
diff --git a/include/linux/kernel.h b/include/linux/kernel.h
index 792bf0a..4cb8d3d 100644
--- a/include/linux/kernel.h
+++ b/include/linux/kernel.h
@@ -184,9 +184,6 @@ asmlinkage int vprintk(const char *fmt, va_list args)
__attribute__ ((format (printf, 1, 0)));
 asmlinkage int printk(const char * fmt, ...)
__attribute__ ((format (printf, 1, 2))) __cold;
-extern int log_buf_get_len(void);
-extern int log_buf_read(int idx);
-extern int log_buf_copy(char *dest, int idx, int len);
 
 extern int printk_ratelimit_jiffies;
 extern int printk_ratelimit_burst;
@@ -202,9 +199,6 @@ static inline int vprintk(const char *s, va_list args) { return 0; }
 static inline int printk(const char *s, ...)
__attribute__ ((format (printf, 1, 2)));
 static inline int __cold printk(const char *s, ...) { return 0; }
-static inline int log_buf_get_len(void) { return 0; }
-static inline int log_buf_read(int idx) { return 0; }
-static inline int log_buf

Re: the printk problem

2008-07-05 Thread Ingo Molnar

* Ingo Molnar <[EMAIL PROTECTED]> wrote:

> yeah, agreed, combined it's not an x86 topic anymore.
> 
> [ There's some lkml trouble so i've missed the earlier patch. I'm not 
>   sure the email problem is on my side, see how incomplete the 
>   discussion is on lkml.org as well:
> 
>  http://lkml.org/lkml/2008/6/25/170   ]
> 
> Anyway, i have added this second patch of yours to 
> tip/core/printk and moved your first patch over to that topic (which 
> relies on it).

ah, i found the reason - it started out on linux-ia64 originally then 
moved over to lkml - so only half of the discussion was visible there. 
And linux-ia64 is one of the few vger lists i'm not subscribed to 
apparently. (there's no vger-please-give-me-all-emails list - making the 
following of Linux development even harder)

Ingo


Re: [PATCH -next-20080709] fixup stop_machine use cpu mask vs ftrace

2008-07-11 Thread Ingo Molnar

* Milton Miller <[EMAIL PROTECTED]> wrote:

> Hi Rusty, Ingo.
> 
> Rusty's patch [PATCH 3/3] stop_machine: use cpu mask rather than magic 
> numbers didn't find kernel/trace/ftrace.c in -next, causing an 
> immediate almost NULL pointer dereference in ftrace_dynamic_init.

Rusty - what's going on here? Please do not change APIs like that, which 
cause code to crash. Either do a compatible API change, or change it 
over in a way that causes clear build failures, not crashes.

Ingo
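
One way to get the "clear build failure" behaviour asked for here is to change the function's name together with its parameter type, so that stale callers cannot compile or link. The sketch below is a userspace illustration only: `run_on_cpus_sketch`, `struct cpu_set_sketch` and `bump` are hypothetical names, and this is not how stop_machine itself is implemented.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical CPU set type standing in for a cpumask. */
struct cpu_set_sketch {
    unsigned long bits;
};

/* Example callback: increments the int that data points to. */
static int bump(void *data)
{
    int *counter = data;

    (*counter)++;
    return 0;
}

/*
 * New entry point under a NEW name. If the old name had been kept while
 * its third parameter silently changed from an integer to a pointer, a
 * stale caller passing a small integer could still build (with only a
 * warning) and then dereference an almost-NULL pointer at run time.
 * With the rename, a stale caller fails at compile or link time.
 */
static int run_on_cpus_sketch(int (*fn)(void *), void *data,
                              const struct cpu_set_sketch *cpus)
{
    int ran = 0;

    assert(fn != NULL);
    assert(cpus != NULL);
    for (unsigned int cpu = 0; cpu < 8 * sizeof(cpus->bits); cpu++) {
        if ((cpus->bits >> cpu) & 1UL) {
            fn(data);  /* pretend to run fn on this CPU */
            ran++;
        }
    }
    return ran;
}
```

The other option mentioned, a compatible change, would instead keep the old name and signature as a thin wrapper around the new function until all callers are converted.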


Re: [PATCH -next-20080709] fixup stop_machine use cpu mask vs ftrace

2008-07-11 Thread Ingo Molnar

* Ingo Molnar <[EMAIL PROTECTED]> wrote:

> 
> * Milton Miller <[EMAIL PROTECTED]> wrote:
> 
> > Hi Rusty, Ingo.
> > 
> > Rusty's patch [PATCH 3/3] stop_machine: use cpu mask rather than magic 
> > numbers didn't find kernel/trace/ftrace.c in -next, causing an 
> > immediate almost NULL pointer dereference in ftrace_dynamic_init.
> 
> Rusty - what's going on here? Please do not change APIs like that, 
> which cause code to crash. Either do a compatible API change, or 
> change it over in a way that causes clear build failures, not crashes.

ah, i see it from Rusty's other reply that there's going to be another 
version of this. Good :-)

Ingo


Re: 2.6.23-rc1-mm2

2007-08-01 Thread Ingo Molnar

* Andrew Morton <[EMAIL PROTECTED]> wrote:

> On Wed, 1 Aug 2007 10:02:30 +0200 Mariusz Kozlowski <[EMAIL PROTECTED]> wrote:
> 
> > Hello,
> > 
> > I get this warning. Looking at the comment in kernel/irq/resend.c
> > it's harmless. Is it?

yeah, harmless.

Ingo


Re: [PATCH] Restore deterministic CPU accounting on powerpc

2007-11-03 Thread Ingo Molnar

* Ingo Molnar <[EMAIL PROTECTED]> wrote:

> * Paul Mackerras <[EMAIL PROTECTED]> wrote:
> 
> > Since powerpc started using CONFIG_GENERIC_CLOCKEVENTS, the 
> > deterministic CPU accounting (CONFIG_VIRT_CPU_ACCOUNTING) has been 
> > broken on powerpc, because we end up counting user time twice: once in 
> > timer_interrupt() and once in update_process_times().
> > 
> > This fixes the problem by pulling the code in update_process_times 
> > that updates utime and stime into a separate function called 
> > account_process_tick.  If CONFIG_VIRT_CPU_ACCOUNTING is not defined, 
> > there is a version of account_process_tick in kernel/timer.c that 
> > simply accounts a whole tick to either utime or stime as before.  If 
> > CONFIG_VIRT_CPU_ACCOUNTING is defined, then arch code gets to 
> > implement account_process_tick.
> > 
> > This also lets us simplify the s390 code a bit; it means that the s390 
> > timer interrupt can now call update_process_times even when 
> > CONFIG_VIRT_CPU_ACCOUNTING is turned on, and can just implement a 
> > suitable account_process_tick().
> > 
> > Signed-off-by: Paul Mackerras <[EMAIL PROTECTED]>
> 
> let's push this towards Linus via the scheduler tree, ok?

hm, i've removed it for now because it doesn't even build due to:

+#ifndef CONFIG_VIRT_CPU_ACCOUNTING
+void account_process_tick(int user_tick)
+{
+   if (user_tick) {
+   account_user_time(p, jiffies_to_cputime(1));
+   account_user_time_scaled(p, jiffies_to_cputime(1));

Ingo


Re: [PATCH] Restore deterministic CPU accounting on powerpc

2007-11-03 Thread Ingo Molnar

* Paul Mackerras <[EMAIL PROTECTED]> wrote:

> Since powerpc started using CONFIG_GENERIC_CLOCKEVENTS, the 
> deterministic CPU accounting (CONFIG_VIRT_CPU_ACCOUNTING) has been 
> broken on powerpc, because we end up counting user time twice: once in 
> timer_interrupt() and once in update_process_times().
> 
> This fixes the problem by pulling the code in update_process_times 
> that updates utime and stime into a separate function called 
> account_process_tick.  If CONFIG_VIRT_CPU_ACCOUNTING is not defined, 
> there is a version of account_process_tick in kernel/timer.c that 
> simply accounts a whole tick to either utime or stime as before.  If 
> CONFIG_VIRT_CPU_ACCOUNTING is defined, then arch code gets to 
> implement account_process_tick.
> 
> This also lets us simplify the s390 code a bit; it means that the s390 
> timer interrupt can now call update_process_times even when 
> CONFIG_VIRT_CPU_ACCOUNTING is turned on, and can just implement a 
> suitable account_process_tick().
> 
> Signed-off-by: Paul Mackerras <[EMAIL PROTECTED]>

let's push this towards Linus via the scheduler tree, ok?

Ingo


Re: [PATCH v2] Restore deterministic CPU accounting on powerpc

2007-11-04 Thread Ingo Molnar

* Paul Mackerras <[EMAIL PROTECTED]> wrote:

> Signed-off-by: Paul Mackerras <[EMAIL PROTECTED]>
> ---
> account_process_tick now takes the task_struct * as an argument.
> Tested both with and without CONFIG_VIRT_CPU_ACCOUNTING.

thanks, applied.

Ingo
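
For reference, the shape of the v2 interface (the task passed in explicitly) can be sketched in userspace as follows. This is only an illustration of the generic, non-CONFIG_VIRT_CPU_ACCOUNTING behaviour described in the changelog, "a whole tick to either utime or stime"; `struct task_sketch` and `account_process_tick_sketch` are hypothetical names, and time is counted in plain ticks rather than cputime units.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, minimal stand-in for the task fields touched here. */
struct task_sketch {
    unsigned long utime;  /* ticks accounted as user time */
    unsigned long stime;  /* ticks accounted as system time */
};

/*
 * Charge one whole tick to either user or system time. The task is an
 * explicit parameter, so the body no longer refers to a 'p' that the
 * function never received.
 */
static void account_process_tick_sketch(struct task_sketch *p, int user_tick)
{
    assert(p != NULL);
    if (user_tick)
        p->utime += 1;
    else
        p->stime += 1;
}
```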


Re: [BUG] 2.6.24-rc2-mm1 - kernel bug on nfs v4

2007-11-17 Thread Ingo Molnar

* Torsten Kaiser <[EMAIL PROTECTED]> wrote:

> Sadly lockdep does not work for me, as it gets turned off early:
> [   39.851594] -
> [   39.855963] inconsistent {softirq-on-W} -> {in-softirq-W} usage.
> [   39.861981] swapper/0 [HC0[0]:SC1[1]:HE0:SE0] takes:
> [   39.866963]  (&n->list_lock){-+..}, at: []

hey, that means it found a bug - which is not sad at all :-)

Ingo


Re: [BUG] 2.6.24-rc3-git2 softlockup detected

2007-12-04 Thread Ingo Molnar

* Kamalesh Babulal <[EMAIL PROTECTED]> wrote:

> > So 2.6.24-rc3 was OK and 2.6.24-rc3-git2 is not?
> 
> Yes, the 2.6.24-rc3 was Ok and this is seen from 2.6.24-rc3-git2/3/4.

just to make sure: this is a real lockup and failed bootup (or device 
init), not just a message, right?

Ingo
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [BUG] 2.6.24-rc3-git2 softlockup detected

2007-12-04 Thread Ingo Molnar

* Kamalesh Babulal <[EMAIL PROTECTED]> wrote:

> Hi Ingo,
> 
> This softlockup is seen in 2.6.24-rc4 as well, and it looks like just a 
> message: it is seen while running tbench, and the machine continues 
> running other tests after the softlockup messages. It is sometimes seen 
> during bootup too, but the machine reaches the login prompt and is 
> able to continue running tests.

do you know whether there's any true delay when this happens, or is it a 
pure softlockup-detector false positive?

Ingo


Re: [PATCH 2/2] x86-64: seccomp: fix 32/64 syscall hole

2009-02-27 Thread Ingo Molnar

* Roland McGrath  wrote:

> +#ifdef CONFIG_COMPAT
> + if (is_compat_task())
>   syscall = mode1_syscalls_32;
>  #endif

btw., shouldn't is_compat_task() expand to 0 in the 
!CONFIG_COMPAT case? That way we could remove this #ifdef too. 
(and move the first #ifdef inside the array initialization so 
that we always have a mode1_syscalls_32[] array.)

Ingo
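
The suggestion amounts to the usual stub pattern: give the predicate a constant-0 definition when the config option is off and keep the 32-bit table always defined, so the call site needs no #ifdef. A userspace sketch, assuming a made-up switch `SKETCH_CONFIG_COMPAT` in place of CONFIG_COMPAT and placeholder syscall numbers (none of these names come from the kernel):

```c
#include <assert.h>
#include <stddef.h>

#ifdef SKETCH_CONFIG_COMPAT
static int sketch_compat_flag;  /* would be set for 32-bit tasks */
static inline int sketch_is_compat_task(void) { return sketch_compat_flag; }
#else
/* Stub: always 0, so the compiler can discard the compat branch. */
static inline int sketch_is_compat_task(void) { return 0; }
#endif

/* 0-terminated placeholder lists; the 32-bit one is always defined. */
static const int sketch_syscalls[]    = { 10, 11, 12, 13, 0 };
static const int sketch_syscalls_32[] = { 20, 21, 22, 23, 0 };

static const int *sketch_pick_table(void)
{
    const int *syscall = sketch_syscalls;

    if (sketch_is_compat_task())  /* no #ifdef needed at the call site */
        syscall = sketch_syscalls_32;
    assert(syscall[0] != 0);      /* tables are never empty in this sketch */
    return syscall;
}
```

Built without `SKETCH_CONFIG_COMPAT`, the function always returns the native table.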


Re: [RESEND GIT PATCH tj-percpu] percpu: fix spurious alignment WARN in legacy SMP percpu allocator

2009-03-11 Thread Ingo Molnar

* Tejun Heo  wrote:

> Impact: remove spurious WARN on legacy SMP percpu allocator
> 
> Commit f2a8205c4ef1af917d175c36a4097ae5587791c8 incorrectly added too
> tight WARN_ON_ONCE() on alignments for UP and legacy SMP percpu
> allocator.  Commit e317603694bfd17b28a40de9d65e1a4ec12f816e fixed it
> for UP but legacy SMP allocator was forgotten.  Fix it.
> 
> Signed-off-by: Tejun Heo 
> Reported-by: Sachin P. Sant 
> ---
> (RESEND: cc'ing Ingo. :-)
> 
> Oops, that was a stupid omission.  This patch should fix it.  Ingo,
> please pull from the following git vector to receive the first
> four patches from the use-dynamic-percpu-allocator-by-default patchset
> (without the actual conversion which can disrupt archs) + this patch.
> I moved the actual conversion patch into #tj-percpu-exp branch, so the
> pull should be safe.
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/tj/misc.git tj-percpu
> 
> Thanks.

Pulled into tip:core/percpu, thanks a lot Tejun!

Ingo


Re: linux-next: cpus4096 tree build failure

2009-03-19 Thread Ingo Molnar

* Rusty Russell  wrote:

> On Thursday 19 March 2009 21:23:00 Stephen Rothwell wrote:
> > From: Stephen Rothwell 
> > Date: Thu, 19 Mar 2009 21:35:24 +1100
> > Subject: [PATCH] powerpc: mmzone.h needs cpumask_t to be defined
> > 
> > Commit 082edb7bf443eb8eda15b482d16ad9dd8137ad24 ("numa,cpumask: move
> > numa_node_id default implementation to topology.h") removed the include
> > of linux/topology.h from linux/mmzone.h which exposed this lack.
> > 
> > Signed-off-by: Stephen Rothwell 
> 
> Acked-by: Rusty Russell 
> 
> Ingo, please apply.
> 
> Apparently sparc is similarly broken :(

I've applied it.

( I cannot push it to the auto-cpus4096-next branch yet because you
  broke the scheduler as well (boot crash) so the topic is marked
  broken. )

Ingo


Re: [PATCH] tracing: Fix TRACING_SUPPORT dependency

2009-03-20 Thread Ingo Molnar

* Anton Vorontsov  wrote:

> commit 40ada30f9621fbd831ac2437b9a2a399aad34b00 ("tracing: clean 
> up menu"), despite the "clean up" in its purpose, introduced 
> a behavioural change for Kconfig symbols: we are no longer able to 
> select tracing support on PPC32 (because IRQFLAGS_SUPPORT isn't 
> yet implemented).

Could you please solve this by implementing proper irqflag-tracing 
support? It's been available upstream for almost three years. It's 
needed for lockdep support as well, etc.

Ingo


Re: [PATCH] tracing: Fix TRACING_SUPPORT dependency

2009-03-20 Thread Ingo Molnar

* Anton Vorontsov  wrote:

> On Fri, Mar 20, 2009 at 08:04:28PM +0100, Ingo Molnar wrote:
> > 
> > * Anton Vorontsov  wrote:
> > 
> > > commit 40ada30f9621fbd831ac2437b9a2a399aad34b00 ("tracing: clean 
> > > up menu"), despite the "clean up" in its purpose, introduced 
> > > a behavioural change for Kconfig symbols: we are no longer able to 
> > > select tracing support on PPC32 (because IRQFLAGS_SUPPORT isn't 
> > > yet implemented).
> > 
> > Could you please solve this by implementing proper 
> > irqflag-tracing support? It's been available upstream for almost 
> > three years. It's needed for lockdep support as well, etc.
> 
> Breaking things via clean up patches is an interesting method of 
> encouraging something to implement. ;-)
>
> Surely I'll look into implementing irqflags tracing, but 
> considering that no one ever needed this for almost three years, 
> [...]

Weird, there's no lockdep support?

Ingo


Re: [PATCH] tracing: Fix TRACING_SUPPORT dependency

2009-03-21 Thread Ingo Molnar

* Anton Vorontsov  wrote:

> On Fri, Mar 20, 2009 at 08:57:43PM +0100, Ingo Molnar wrote:
> > 
> > * Anton Vorontsov  wrote:
> > 
> > > On Fri, Mar 20, 2009 at 08:04:28PM +0100, Ingo Molnar wrote:
> > > > 
> > > > * Anton Vorontsov  wrote:
> > > > 
> > > > > commit 40ada30f9621fbd831ac2437b9a2a399aad34b00 ("tracing: clean 
> > > > > up menu"), despite the "clean up" in its purpose, introduced 
> > > > > a behavioural change for Kconfig symbols: we are no longer able to 
> > > > > select tracing support on PPC32 (because IRQFLAGS_SUPPORT isn't 
> > > > > yet implemented).
> > > > 
> > > > Could you please solve this by implementing proper 
> > > > irqflag-tracing support? It's been available upstream for almost 
> > > > three years. It's needed for lockdep support as well, etc.
> > > 
> > > Breaking things via clean up patches is an interesting method of 
> > > encouraging something to implement. ;-)
> > >
> > > Surely I'll look into implementing irqflags tracing, but 
> > > considering that no one ever needed this for almost three years, 
> > > [...]
> > 
> > Weird, there's no lockdep support?
> 
> *ashamed*: apparently no such support currently exists for PPC32. ;-)

Hm, do all the tracers even compile on ppc32 with your patch?

We had periodic build failures on weird, unmaintained architectures 
that had no irqflags-tracing support and hence didn't know the 
raw_local_irq_save/restore primitives ...

I'm not trying to make things more difficult for you (and we can 
apply your patch if it builds fine and does not cause problems 
elsewhere), but there were some real downsides to not having proper 
irq APIs ...

Ingo


Re: [PATCH] tracing: Fix TRACING_SUPPORT dependency

2009-03-21 Thread Ingo Molnar

* Steven Rostedt  wrote:

> 
> On Sat, 21 Mar 2009, Ingo Molnar wrote:
> 
> > 
> > * Anton Vorontsov  wrote:
> > 
> > > On Fri, Mar 20, 2009 at 08:57:43PM +0100, Ingo Molnar wrote:
> > > > 
> > > > * Anton Vorontsov  wrote:
> > > > 
> > > > > On Fri, Mar 20, 2009 at 08:04:28PM +0100, Ingo Molnar wrote:
> > > > > > 
> > > > > > * Anton Vorontsov  wrote:
> > > > > > 
> > > > > > > commit 40ada30f9621fbd831ac2437b9a2a399aad34b00 ("tracing: clean 
> > > > > > > up menu"), despite the "clean up" in its purpose, introduced 
> > > > > > > a behavioural change for Kconfig symbols: we are no longer able to 
> > > > > > > select tracing support on PPC32 (because IRQFLAGS_SUPPORT isn't 
> > > > > > > yet implemented).
> > > > > > 
> > > > > > Could you please solve this by implementing proper 
> > > > > > irqflag-tracing support? It's been available upstream for almost 
> > > > > > three years. It's needed for lockdep support as well, etc.
> > > > > 
> > > > > Breaking things via clean up patches is an interesting method of 
> > > > > encouraging something to implement. ;-)
> > > > >
> > > > > Surely I'll look into implementing irqflags tracing, but 
> > > > > considering that no one ever needed this for almost three years, 
> > > > > [...]
> > > > 
> > > > Weird, there's no lockdep support?
> > > 
> > > > *ashamed*: apparently no such support currently exists for PPC32. ;-)
> > 
> > Hm, do all the tracers even compile on ppc32 with your patch?
> > 
> > We had periodic build failures on weird, unmaintained architectures 
> > that had no irqflags-tracing support and hence didn't know the 
> > raw_local_irq_save/restore primitives ...
> > 
> > I'm not trying to make things more difficult for you (and we can 
> > apply your patch if it builds fine and does not cause problems 
> > elsewhere), but there were some real downsides to not having proper 
> > irq APIs ...
> 
> Note, the issue is not with the hooks into local_irq_save/restore, 
> but with the entry.S code. That code is very sensitive where the 
> irqs are enabled and disabled.

i know. What i'm talking about is that non-lockdep architectures 
have the habit of not defining raw_local_irq_save() - which the 
tracing core relies on.

Ingo


Re: [PATCH] tracing: Fix TRACING_SUPPORT dependency

2009-03-21 Thread Ingo Molnar

* Steven Rostedt  wrote:

> 
> On Sat, 21 Mar 2009, Steven Rostedt wrote:
> > 
> > Since we know that's not an issue with PPC32, perhaps we should add (I 
> > hate to do this)...
> > 
> > 
> > depends on TRACE_IRQFLAGS_SUPPORT || PPC32
> > 
> > And document that the "|| PPC32" should go when PowerPC32 gets its act
> > together.  :-/
> 
> Note, the only tracer broken on PPC32 is the IRQSOFF tracer, and 
> that already depends on TRACE_IRQFLAGS_SUPPORT.

Ok, that's fine with me too. Perhaps we could add a TRACING_SUPPORT 
thing that architectures can enable - but it's probably overkill in 
this case.

Ingo


Re: linux-next: tracing/powerpc tree build failure

2009-04-01 Thread Ingo Molnar

* Stephen Rothwell  wrote:

> Hi all,
> 
> This patch is now applicable to the tracing tree after merging 
> with Linus' tree.

Thanks, that's useful info.

There's the skb tracepoints related merge fixlet needed too. 
Anything else in this context you are aware of?

Ingo


Re: [PATCH 2/2] x86-64: seccomp: fix 32/64 syscall hole

2009-05-06 Thread Ingo Molnar

* Markus Gutschke (顧孟勤)  wrote:

> On Wed, May 6, 2009 at 14:29, Ingo Molnar  wrote:
> > That's a pretty interesting usage. What would be the fallback mode you
> > are using if the kernel doesn't have seccomp built in? Completely
> > non-sandboxed? Or a ptrace/PTRACE_SYSCALL based sandbox?
> 
> Ptrace has performance and/or reliability problems when used to 
> sandbox threaded applications due to potential race conditions 
> when inspecting system call arguments. We hope that we can avoid 
> this problem with seccomp. It is very attractive that the kernel 
> automatically terminates any application that violates the very 
> well-defined constraints of the sandbox.
> 
> In general, we are currently exploring different options based on 
> general availability, functionality, and complexity of 
> implementation. Seccomp is a good middle ground that we expect to 
> be able to use in the medium term to provide an acceptable 
> solution for a large segment of Linux users. Although the 
> restriction to just four unfiltered system calls is painful.

Which other system calls would you like to use? Futexes might be 
one, for fast synchronization primitives?

Ingo

Re: [PATCH 2/2] x86-64: seccomp: fix 32/64 syscall hole

2009-05-06 Thread Ingo Molnar

* Markus Gutschke (顧孟勤)  wrote:

> On Wed, May 6, 2009 at 14:54, Ingo Molnar  wrote:
> > Which other system calls would you like to use? Futexes might be
> > one, for fast synchronization primitives?
> 
> There are a large number of system calls that "normal" C/C++ code 
> uses quite frequently, and that are not security sensitive. A 
> typical example would be gettimeofday(). But there are other 
> system calls, where the sandbox would not really need to inspect 
> arguments as the call does not expose any exploitable interface.
> 
> It is currently awkward that in order to use seccomp we have to 
> intercept all system calls and provide alternative implementations 
> for them; whereas we really only care about a comparatively small 
> number of security critical operations that we need to restrict.
> 
> Also, any redirected system call ends up incurring at least two 
> context switches, which is needlessly expensive for the large 
> number of trivial system calls. We are quite happy that read() and 
> write(), which are quite important to us, do not incur this 
> penalty.

doing a (per arch) bitmap of harmless syscalls and replacing the 
mode1_syscalls[] check with that in kernel/seccomp.c would be a 
pretty reasonable extension. (.config controllable perhaps, for 
old-style-seccomp)

It would probably be faster than the current loop over 
mode1_syscalls[] as well.

Ingo
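
A rough userspace sketch of the bitmap idea next to a linear scan over a 0-terminated list: every name and constant here (`SKETCH_NR_SYSCALLS`, `sketch_allow`, `sketch_is_allowed`, `sketch_is_allowed_scan`) is hypothetical and nothing is taken from kernel/seccomp.c. The point is only that a bitmap test is one shift and mask regardless of how many syscalls are permitted.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define SKETCH_NR_SYSCALLS 512  /* hypothetical upper bound */
#define SKETCH_WORD_BITS   (sizeof(unsigned long) * CHAR_BIT)
#define SKETCH_WORDS \
    ((SKETCH_NR_SYSCALLS + SKETCH_WORD_BITS - 1) / SKETCH_WORD_BITS)

/* One bit per syscall number; a set bit means "harmless, let it through". */
static unsigned long sketch_allowed[SKETCH_WORDS];

static void sketch_allow(unsigned int nr)
{
    assert(nr < SKETCH_NR_SYSCALLS);
    sketch_allowed[nr / SKETCH_WORD_BITS] |= 1UL << (nr % SKETCH_WORD_BITS);
}

/* Constant-time check: index a word, shift, mask. */
static int sketch_is_allowed(unsigned int nr)
{
    if (nr >= SKETCH_NR_SYSCALLS)
        return 0;
    return (int)((sketch_allowed[nr / SKETCH_WORD_BITS]
                  >> (nr % SKETCH_WORD_BITS)) & 1UL);
}

/* For comparison: linear scan over a 0-terminated list of numbers. */
static int sketch_is_allowed_scan(const int *list, int nr)
{
    for (; *list; list++)
        if (*list == nr)
            return 1;
    return 0;
}
```

A per-architecture table would fill the bitmap once at build or init time; the scan stays linear in the length of the list.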

Re: [PATCH 2/2] x86-64: seccomp: fix 32/64 syscall hole

2009-05-06 Thread Ingo Molnar

* Markus Gutschke (顧孟勤)  wrote:

> On Sat, Feb 28, 2009 at 10:23, Linus Torvalds
>  wrote:

> > And I guess the seccomp interaction means that this is 
> > potentially a 2.6.29 thing. Not that I know whether anybody 
> > actually _uses_ seccomp. It does seem to be enabled in at least 
> > Fedora kernels, but it might not be used anywhere.
> 
> In the Linux version of Google Chrome, we are currently working on 
> code that will use seccomp for parts of our sandboxing solution.

That's a pretty interesting usage. What would be the fallback mode 
you are using if the kernel doesn't have seccomp built in? Completely 
non-sandboxed? Or a ptrace/PTRACE_SYSCALL based sandbox?

Ingo

Re: [PATCH 2/2] x86-64: seccomp: fix 32/64 syscall hole

2009-05-07 Thread Ingo Molnar

* Nicholas Miell  wrote:

> On Wed, 2009-05-06 at 15:21 -0700, Markus Gutschke (顧孟勤) wrote:
> > On Wed, May 6, 2009 at 15:13, Ingo Molnar  wrote:
> > > doing a (per arch) bitmap of harmless syscalls and replacing the
> > > mode1_syscalls[] check with that in kernel/seccomp.c would be a
> > > pretty reasonable extension. (.config controllable perhaps, for
> > > old-style-seccomp)
> > >
> > > It would probably be faster than the current loop over
> > > mode1_syscalls[] as well.
> > 
> > This would be a great option to improve performance of our sandbox. I
> > can detect the availability of the new kernel API dynamically, and
> > then not intercept the bulk of the system calls. This would allow the
> > sandbox to work both with existing and with newer kernels.
> > 
> > We'll post a kernel patch for discussion in the next few days,
> > 
> 
> I suspect the correct thing to do would be to leave seccomp mode 1 
> alone and introduce a mode 2 with a less restricted set of system 
> calls -- the interface was designed to be extended in this way, 
> after all.

Yes, that is what i alluded to above via the '.config controllable' 
aspect.

Mode 2 could be implemented like this: extend prctl_set_seccomp() 
with a bitmap pointer, and copy it to a per task seccomp context 
structure.

A bitmap for 300 syscalls takes only about 40 bytes.

Please take care to implement nesting properly: if a seccomp context 
does a seccomp call (which mode 2 could allow), then the resulting 
bitmap should be the logical-AND of the parent and child bitmaps. 
There's no reason why seccomp couldn't be used in a hierarchy of 
sandboxes, in a gradually less permissive fashion.
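
A minimal userspace sketch of the nesting rule, assuming a per-task context that holds one bit per syscall. The structure and function names are hypothetical; no such mode 2 interface exists in the code discussed here.

```c
#include <assert.h>
#include <limits.h>
#include <string.h>

#define SKETCH_NR_SYSCALLS 300
/* 300 bits rounded up to whole bytes: 38 with 8-bit bytes */
#define SKETCH_BYTES ((SKETCH_NR_SYSCALLS + CHAR_BIT - 1) / CHAR_BIT)

/* Hypothetical per-task seccomp context: one "allowed" bit per syscall. */
struct seccomp_sketch {
	unsigned char allowed[SKETCH_BYTES];
};

static void sketch_allow(struct seccomp_sketch *ctx, int nr)
{
	if (nr >= 0 && nr < SKETCH_NR_SYSCALLS)
		ctx->allowed[nr / CHAR_BIT] |= (unsigned char)(1u << (nr % CHAR_BIT));
}

static int sketch_test(const struct seccomp_sketch *ctx, int nr)
{
	if (nr < 0 || nr >= SKETCH_NR_SYSCALLS)
		return 0;
	return (ctx->allowed[nr / CHAR_BIT] >> (nr % CHAR_BIT)) & 1;
}

/*
 * Entering a nested sandbox: the child ends up with the logical AND of
 * what the parent has and what was requested, so it can never gain a
 * syscall that the parent lacks.
 */
static void sketch_nest(struct seccomp_sketch *child,
			const struct seccomp_sketch *parent,
			const struct seccomp_sketch *requested)
{
	size_t i;

	for (i = 0; i < SKETCH_BYTES; i++)
		child->allowed[i] = parent->allowed[i] & requested->allowed[i];
}
```

Because the AND can only clear bits, repeated nesting yields sandboxes that are each at most as permissive as the one before.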

Ingo

Re: question about softirqs

2009-05-12 Thread Ingo Molnar

* Chris Friesen  wrote:

> This started out as a thread on the ppc list, but on the 
> suggestion of DaveM and Paul Mackerras I'm expanding the receiver 
> list a bit.
> 
> Currently, if a softirq is raised in process context the 
> TIF_RESCHED_PENDING flag gets set and on return to userspace we 
> run the scheduler, expecting it to switch to ksoftirqd to handle 
> the softirqd processing.
> 
> I think I see a possible problem with this. Suppose I have a 
> SCHED_FIFO task spinning on recvmsg() with MSG_DONTWAIT set. Under 
> the scenario above, schedule() would re-run the spinning task 
> rather than ksoftirqd, thus preventing any incoming packets from 
> being sent up the stack until we get a real hardware 
> interrupt--which could be a whole jiffy if interrupt mitigation is 
> enabled in the net device.

TIF_RESCHED_PENDING will not be set if a SCHED_FIFO task wakes up a 
SCHED_OTHER ksoftirqd task. But starvation of ksoftirqd processing 
will occur.

> DaveM pointed out that if we're doing transmits we're likely to 
> hit local_bh_enable(), which would process the softirq work.  
> However, I think we may still have a problem in the above rx-only 
> scenario--or is it too contrived to matter?

This could occur, and the problem is really that task priorities do 
not extend across softirq work processing.

This could occur in ordinary SCHED_OTHER tasks as well, if the 
softirq is bounced to ksoftirqd - which it only should be if there's 
serious softirq overload - or, as you describe it above, if the 
softirq is raised in process context:

	if (!in_interrupt())
		wakeup_softirqd();

That's not really clean. We should look into eliminating process-context 
use of raise_softirq_irqsoff(). A code sequence like this:

	local_irq_save(flags);
	...
	raise_softirq_irqsoff(nr);
	...
	local_irq_restore(flags);

should be converted to something like:

	local_irq_save(flags);
	...
	raise_softirq_irqsoff(nr);
	...
	local_irq_restore(flags);
	recheck_softirqs();

If someone does not do proper local_bh_disable()/enable() sequences 
for micro-optimization reasons, then push the check to after the 
critical section - and don't cause extra reschedules by waking up 
ksoftirqd. raise_softirq_irqsoff() will also be faster.
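
The pattern can be modelled in plain userspace C. This is only a sketch under stated assumptions: recheck_softirqs() is named above but not defined, so sketch_recheck() is a guess at its intent, and the pending mask and handler counters are stand-ins rather than kernel structures.

```c
#include <assert.h>

#define SKETCH_NR_SOFTIRQS 8

static unsigned int pending_mask;		/* stand-in for the per-CPU pending word */
static int handled[SKETCH_NR_SOFTIRQS];		/* how often each "handler" ran */

/* Mark a softirq pending; note that no helper thread is woken here. */
static void sketch_raise(unsigned int nr)
{
	if (nr < SKETCH_NR_SOFTIRQS)
		pending_mask |= 1u << nr;
}

/*
 * Guess at what a recheck after the critical section would do: run
 * whatever is pending in the caller's own context, looping in case a
 * handler raised more work.
 */
static void sketch_recheck(void)
{
	while (pending_mask) {
		unsigned int mask = pending_mask;
		unsigned int nr;

		pending_mask = 0;
		for (nr = 0; nr < SKETCH_NR_SOFTIRQS; nr++)
			if (mask & (1u << nr))
				handled[nr]++;
	}
}
```

In this model the work runs at the priority of the task that raised it, which is the property the ksoftirqd wakeup loses.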

Ingo


Re: question about softirqs

2009-05-12 Thread Ingo Molnar

* Peter Zijlstra  wrote:

> On Tue, 2009-05-12 at 10:12 +0200, Ingo Molnar wrote:
> > * Chris Friesen  wrote:
> > 
> > > This started out as a thread on the ppc list, but on the 
> > > suggestion of DaveM and Paul Mackerras I'm expanding the receiver 
> > > list a bit.
> > > 
> > > Currently, if a softirq is raised in process context the 
> > > TIF_RESCHED_PENDING flag gets set and on return to userspace we 
> > > run the scheduler, expecting it to switch to ksoftirqd to handle 
> > > the softirqd processing.
> > > 
> > > I think I see a possible problem with this. Suppose I have a 
> > > SCHED_FIFO task spinning on recvmsg() with MSG_DONTWAIT set. Under 
> > > the scenario above, schedule() would re-run the spinning task 
> > > rather than ksoftirqd, thus preventing any incoming packets from 
> > > being sent up the stack until we get a real hardware 
> > > interrupt--which could be a whole jiffy if interrupt mitigation is 
> > > enabled in the net device.
> > 
> > TIF_RESCHED_PENDING will not be set if a SCHED_FIFO task wakes up a 
> > SCHED_OTHER ksoftirqd task. But starvation of ksoftirqd processing 
> > will occur.
> > 
> > > DaveM pointed out that if we're doing transmits we're likely to 
> > > hit local_bh_enable(), which would process the softirq work.  
> > > However, I think we may still have a problem in the above rx-only 
> > > scenario--or is it too contrived to matter?
> > 
> > This could occur, and the problem is really that task priorities do 
> > not extend across softirq work processing.
> > 
> > This could occur in ordinary SCHED_OTHER tasks as well, if the 
> > softirq is bounced to ksoftirqd - which it only should be if there's 
> > serious softirq overload - or, as you describe it above, if the 
> > softirq is raised in process context:
> > 
> > 	if (!in_interrupt())
> > 		wakeup_softirqd();
> > 
> > That's not really clean. We should look into eliminating process-context 
> > use of raise_softirq_irqsoff(). A code sequence like this:
> > 
> > 	local_irq_save(flags);
> > 	...
> > 	raise_softirq_irqsoff(nr);
> > 	...
> > 	local_irq_restore(flags);
> > 
> > should be converted to something like:
> > 
> > 	local_irq_save(flags);
> > 	...
> > 	raise_softirq_irqsoff(nr);
> > 	...
> > 	local_irq_restore(flags);
> > 	recheck_softirqs();
> > 
> > If someone does not do proper local_bh_disable()/enable() sequences 
> > for micro-optimization reasons, then push the check to after the 
> > critical section - and don't cause extra reschedules by waking up 
> > ksoftirqd. raise_softirq_irqsoff() will also be faster.
> 
> 
> Wouldn't the even better solution be to get rid of softirqs 
> altogether?
> 
> I see the recent work by Thomas to get threaded interrupts 
> upstream as a good first step towards that goal. Once the RX 
> processing is moved to a thread (or multiple threads) one can 
> prioritize them in the regular sys_sched_setscheduler() way, and 
> it's obvious that a FIFO task above the priority of the network 
> tasks will have network starvation issues.

Yeah, that would be "nice". A single IRQ thread plus the process 
context(s) doing networking might perform well.

Multiple IRQ threads (softirq and hardirq threads mixed) i'm not so 
sure about - it's extra context-switching cost.

Btw, i noticed that using scheduling for work (packet, etc.) flow 
distribution standardizes and evens out the behavior of workloads. 
Softirq scheduling is really quite random currently. We have a 
random processing loop-limit in the core code and various batching 
and work-limit controls at individual usage sites. We sometimes 
piggyback to ksoftirqd. It's far easier to keep performance in check 
when things are more predictable.

But this is not an easy endeavour, and performance regressions have 
to be expected and addressed if they occur. There can be random 
packet queuing details in networking drivers that just happen to 
work fine now, and might work worse with a kernel thread in place. 
So there has to be broad buy-in for the concept, and a concerted 
effort to eliminate softirq processing and most of hardirq 
processing by pushing those two elements into a single hardirq 
thread (and the rest into process context).

Not for the faint hearted. Nor is it recommended to be done without 
a good layer of asbestos.

Ingo


Re: linux-next: PowerPC WARN_ON_ONCE() after merge of the final tree (tip related)

2010-04-15 Thread Ingo Molnar

* David Miller  wrote:

> From: Ingo Molnar 
> Date: Thu, 15 Apr 2010 08:49:40 +0200
> 
> > Btw., WARN_ON trapping on PowerPC is clearly a PowerPC bug - there's a good 
> > reason we have WARN_ON versus BUG_ON - it should be fixed.
> 
> I disagree, an implementation should be allowed to use the most
> efficient implementation possible for both interfaces.

It trades robustness for slightly better space/code efficiency.

Such a trap based mechanism exists on x86 as well and we use it for BUG_ON(). 
We intentionally don't use it to generate warnings and don't override __WARN(), 
because it would blow up way too often when a warning triggers in some 
sensitive codepath that cannot take a trap.

Anyway, the warning obviously has to be fixed - but the boot crash itself is 
PowerPC's own doing.
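
To illustrate the distinction in userspace terms (a sketch only, not how any architecture implements these macros): the warning below is an ordinary function call that reports once per call site and carries on, while the bug check stops execution. All SKETCH_ and sketch_ names are made up for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

static int sketch_warnings;	/* number of warnings reported so far */

/* Ordinary out-of-line call: report and return to the caller. */
static void sketch_report(const char *file, int line)
{
	fprintf(stderr, "WARNING: at %s:%d\n", file, line);
	sketch_warnings++;
}

/* Warn at most once per call site, then keep running. */
#define SKETCH_WARN_ON_ONCE(cond)				\
	do {							\
		static int warned_;				\
		if ((cond) && !warned_) {			\
			warned_ = 1;				\
			sketch_report(__FILE__, __LINE__);	\
		}						\
	} while (0)

/* Fatal check: execution does not continue past a failure. */
#define SKETCH_BUG_ON(cond)					\
	do {							\
		if (cond)					\
			abort();				\
	} while (0)
```

The point of keeping the two apart is that the warning path needs nothing beyond a normal call and return, so it stays usable in code that cannot tolerate an exception.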

Ingo
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: linux-next: PowerPC WARN_ON_ONCE() after merge of the final tree (tip related)

2010-04-15 Thread Ingo Molnar

* Stephen Rothwell  wrote:

> Hi all,
> 
> Yesterday's (and today's) linux-next boot (PowerPC) failed like this:
> 
> [ cut here ]
> Badness at kernel/lockdep.c:2301
> NIP: c00a35c8 LR: c00084c4 CTR: 
> REGS: c0bf77e0 TRAP: 0700   Not tainted  (2.6.34-rc4-autokern1)
> MSR: 80021032   CR: 2444  XER: 0004
> TASK = c0aa3d30[0] 'swapper' THREAD: c0bf4000 CPU: 0
> GPR00: 0001 c0bf7a60 c0bf32f0 c00084c4 
> GPR04:  0a00  0068 
> GPR08: 0008 c0c4fabe  7265677368657265 
> GPR12: 80009032 c7691000 01c0 c0770bf8 
> GPR16: c076f390  0043 024876f0 
> GPR20: c0887480 02487480 c08876f0 01b5f8d0 
> GPR24: c0770478 0330 c0c1f1c8 c0884610 
> GPR28: c0c1b290 c00084c4 c0b45068 c0aa3d30 
> NIP [c00a35c8] .trace_hardirqs_on_caller+0xb0/0x224
> LR [c00084c4] system_call_common+0xc4/0x114
> Call Trace:
> [c0bf7a60] [c0bf7ba0] init_thread_union+0x3ba0/0x4000 (unreliable)
> [c0bf7af0] [c00084c4] system_call_common+0xc4/0x114
> --- Exception: c01 at .kernel_thread+0x28/0x70
> LR = .rest_init+0x34/0xf8
> [c0bf7de0] [c086916c] .proc_sys_init+0x20/0x64 (unreliable)
> [c0bf7e50] [c00099c0] .rest_init+0x20/0xf8
> [c0bf7ee0] [c0848af0] .start_kernel+0x484/0x4a8
> [c0bf7f90] [c00083c0] .start_here_common+0x1c/0x5c
> Instruction dump:
> 409e0188 0fe0 48000180 801f08d8 2f80 41be0050 880d01da 2fa0 
> 41be0028 e93e8538 8809 6801 <0b00> 2fa0 41be0010 e93e8538 
> [ cut here ]
> 
> Caused by commit bd6d29c25bb1a24a4c160ec5de43e0004e01f72b ("lockstat:
> Make lockstat counting per cpu").  This added a WARN_ON_ONCE to
> debug_atomic_inc() which is called from trace_hardirqs_on_caller() with
> irqs enabled.
> 
> Line 2301 is:
> 
> 	if (unlikely(curr->hardirqs_enabled)) {
> 		debug_atomic_inc(redundant_hardirqs_on);   <--- 2301
> 		return;
> 	}
> 
> This is especially bad since on PowerPC, WARN_ON is a TRAP and the return
> path from the TRAP also calls trace_hardirqs_on_caller(), so the TRAP
> recurses ...

Ok, we'll fix the warning.

Btw., WARN_ON trapping on PowerPC is clearly a PowerPC bug - there's a good 
reason we have WARN_ON versus BUG_ON - it should be fixed.

Ingo


Re: linux-next: PowerPC WARN_ON_ONCE() after merge of the final tree (tip related)

2010-04-15 Thread Ingo Molnar

* Frederic Weisbecker  wrote:

> On Thu, Apr 15, 2010 at 08:49:40AM +0200, Ingo Molnar wrote:
> > 
> > * Stephen Rothwell  wrote:
> > 
> > > Hi all,
> > > 
> > > Yesterday's (and today's) linux-next boot (PowerPC) failed like this:
> > > 
> > > [ cut here ]
> > > Badness at kernel/lockdep.c:2301
> > > NIP: c00a35c8 LR: c00084c4 CTR: 
> > > REGS: c0bf77e0 TRAP: 0700   Not tainted  (2.6.34-rc4-autokern1)
> > > MSR: 80021032   CR: 2444  XER: 0004
> > > TASK = c0aa3d30[0] 'swapper' THREAD: c0bf4000 CPU: 0
> > > GPR00: 0001 c0bf7a60 c0bf32f0 c00084c4
> > > GPR04:  0a00  0068
> > > GPR08: 0008 c0c4fabe  7265677368657265
> > > GPR12: 80009032 c7691000 01c0 c0770bf8
> > > GPR16: c076f390  0043 024876f0
> > > GPR20: c0887480 02487480 c08876f0 01b5f8d0
> > > GPR24: c0770478 0330 c0c1f1c8 c0884610
> > > GPR28: c0c1b290 c00084c4 c0b45068 c0aa3d30
> > > NIP [c00a35c8] .trace_hardirqs_on_caller+0xb0/0x224
> > > LR [c00084c4] system_call_common+0xc4/0x114
> > > Call Trace:
> > > [c0bf7a60] [c0bf7ba0] init_thread_union+0x3ba0/0x4000 (unreliable)
> > > [c0bf7af0] [c00084c4] system_call_common+0xc4/0x114
> > > --- Exception: c01 at .kernel_thread+0x28/0x70
> > > LR = .rest_init+0x34/0xf8
> > > [c0bf7de0] [c086916c] .proc_sys_init+0x20/0x64 (unreliable)
> > > [c0bf7e50] [c00099c0] .rest_init+0x20/0xf8
> > > [c0bf7ee0] [c0848af0] .start_kernel+0x484/0x4a8
> > > [c0bf7f90] [c00083c0] .start_here_common+0x1c/0x5c
> > > Instruction dump:
> > > 409e0188 0fe0 48000180 801f08d8 2f80 41be0050 880d01da 2fa0 
> > > 41be0028 e93e8538 8809 6801 <0b00> 2fa0 41be0010 e93e8538 
> > > [ cut here ]
> > > 
> > > Caused by commit bd6d29c25bb1a24a4c160ec5de43e0004e01f72b ("lockstat:
> > > Make lockstat counting per cpu").  This added a WARN_ON_ONCE to
> > > debug_atomic_inc() which is called from trace_hardirqs_on_caller() with
> > > irqs enabled.
> > > 
> > > Line 2301 is:
> > > 
> > > 	if (unlikely(curr->hardirqs_enabled)) {
> > > 		debug_atomic_inc(redundant_hardirqs_on);   <--- 2301
> > > 		return;
> > > 	}
> > > 
> > > This is especially bad since on PowerPC, WARN_ON is a TRAP and the return
> > > path from the TRAP also calls trace_hardirqs_on_caller(), so the TRAP
> > > recurses ...
> > 
> > Ok, we'll fix the warning.
> > 
> > Btw., WARN_ON trapping on PowerPC is clearly a PowerPC bug - there's a good 
> > reason we have WARN_ON versus BUG_ON - it should be fixed.
> 
> 
> In this case, I guess the following fix should be sufficient?
> I'm going to test it and provide a sane changelog.
> 
> 
> diff --git a/kernel/lockdep.c b/kernel/lockdep.c
> index 78325f8..65d4336 100644
> --- a/kernel/lockdep.c
> +++ b/kernel/lockdep.c
> @@ -2298,7 +2298,11 @@ void trace_hardirqs_on_caller(unsigned long ip)
>  		return;
>  
>  	if (unlikely(curr->hardirqs_enabled)) {
> +		unsigned long flags;
> +
> +		raw_local_irq_save(flags);
>  		debug_atomic_inc(redundant_hardirqs_on);
> +		raw_local_irq_restore(flags);
>  		return;
>  	}
>  	/* we'll do an OFF -> ON transition: */

that looks rather ugly. Why not do a raw:

	this_cpu_inc(lockdep_stats.redundant_hardirqs_on);

which basically open-codes debug_atomic_inc(), but without the warning?

Btw., using the this_cpu() methods might result in faster code for all the 
debug_atomic_inc() macros as well?

Ingo


Re: linux-next: PowerPC WARN_ON_ONCE() after merge of the final tree (tip related)

2010-04-15 Thread Ingo Molnar

* Frederic Weisbecker  wrote:

> On Thu, Apr 15, 2010 at 04:03:58PM +0200, Ingo Molnar wrote:
> > > 
> > > 
> > > diff --git a/kernel/lockdep.c b/kernel/lockdep.c
> > > index 78325f8..65d4336 100644
> > > --- a/kernel/lockdep.c
> > > +++ b/kernel/lockdep.c
> > > @@ -2298,7 +2298,11 @@ void trace_hardirqs_on_caller(unsigned long ip)
> > >  		return;
> > >  
> > >  	if (unlikely(curr->hardirqs_enabled)) {
> > > +		unsigned long flags;
> > > +
> > > +		raw_local_irq_save(flags);
> > >  		debug_atomic_inc(redundant_hardirqs_on);
> > > +		raw_local_irq_restore(flags);
> > >  		return;
> > >  	}
> > >  	/* we'll do an OFF -> ON transition: */
> > 
> > that looks rather ugly. Why not do a raw:
> > 
> > 	this_cpu_inc(lockdep_stats.redundant_hardirqs_on);
> > 
> > which basically open-codes debug_atomic_inc(), but without the warning?
> 
> 
> There is also no guarantee we are in a non-preemptable section. We can then
> also race against another cpu.
> 
> I'm not sure what to do.

it's a statistics counter so worst-case we lose a count. It's not a real issue 
- but might be worth adding a comment.
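
A small userspace model of such a counter, as a sketch only: each CPU gets its own slot, increments are plain and unlocked, and a reader sums the slots. The CPU index is passed in by hand here, which is an assumption of the model; the names are hypothetical and this is not the kernel's this_cpu_inc().

```c
#include <assert.h>

#define SKETCH_NR_CPUS 4

struct sketch_stats {
	unsigned long redundant_hardirqs_on;
};

static struct sketch_stats sketch_percpu[SKETCH_NR_CPUS];

/*
 * Plain, unlocked increment of one CPU's slot. If the task moved to
 * another CPU between the load and the store, one update could be
 * overwritten; for a statistics counter that costs a single count.
 */
static void sketch_inc(int cpu)
{
	sketch_percpu[cpu].redundant_hardirqs_on++;
}

/* Readers add up all slots to get the total. */
static unsigned long sketch_sum(void)
{
	unsigned long sum = 0;
	int cpu;

	for (cpu = 0; cpu < SKETCH_NR_CPUS; cpu++)
		sum += sketch_percpu[cpu].redundant_hardirqs_on;
	return sum;
}
```

The trade-off is exactness for speed: no lock or interrupt disabling on the hot path, at the price of a total that may occasionally undercount.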

Ingo


Re: [GIT PULL] Perf probe support for PowerPC, from Ian Munsie

2010-04-23 Thread Ingo Molnar

* Paul Mackerras  wrote:

> Ingo,
> 
> Please pull my perf.git master branch:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/paulus/perf.git master
> 
> It has two commits from Ian Munsie that allow us to access local
> variables with perf probe on PowerPC.  We also need a commit in Ben's
> powerpc-next branch for it to function, but it compiles without that.
> 
> Thanks,
> Paul.
> 
> ---
> 
> The following changes since commit 6eca8cc35b50af1037bc919106dd6dd332c959c2:
>   Frederic Weisbecker (1):
> perf: Fix perf probe build error
> 
> are available in the git repository at:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/paulus/perf.git master
> 
> Ian Munsie (2):
>   perf: Move arch specific code into separate arch directory
>   perf probe: Add PowerPC DWARF register number mappings
> 
>  tools/perf/Makefile                       |   36 ++--
>  tools/perf/arch/powerpc/Makefile          |    4 +
>  tools/perf/arch/powerpc/util/dwarf-regs.c |   88 +
>  tools/perf/arch/x86/Makefile              |    4 +
>  tools/perf/arch/x86/util/dwarf-regs.c     |   75
>  tools/perf/util/include/dwarf-regs.h      |    8 +++
>  tools/perf/util/probe-finder.c            |   55 +-
>  7 files changed, 211 insertions(+), 59 deletions(-)
>  create mode 100644 tools/perf/arch/powerpc/Makefile
>  create mode 100644 tools/perf/arch/powerpc/util/dwarf-regs.c
>  create mode 100644 tools/perf/arch/x86/Makefile
>  create mode 100644 tools/perf/arch/x86/util/dwarf-regs.c
>  create mode 100644 tools/perf/util/include/dwarf-regs.h

Pulled, thanks a lot Paul!

Ingo


Re: linux-next: origin tree build failure

2009-06-12 Thread Ingo Molnar

* Peter Zijlstra  wrote:

> On Fri, 2009-06-12 at 19:33 +1000, Benjamin Herrenschmidt wrote:
> > We should at least -try- to follow the
> > process we've defined, don't you think ?
>
> So you're saying -next should include whole new subsystems even 
> though it's not clear they will be merged?
> 
> That'll invariably create the opposite case where a tree doesn't 
> get pulled and breaks bits due to its absence.
> 
> -next does a great job of sorting the existing subsystem trees, 
> but I don't think it's Stephen's job to decide if things will get 
> merged.
> 
> Therefore when things are in limbo (there was no definite ACK from 
> Linus on perf counters) both inclusion and exclusion from -next 
> can lead to trouble.

Precisely. linux-next is for the uncontroversial stuff from existing 
subsystems. Sometimes for features pushed by or approved by existing 
subsystem maintainers. But it is not for controversial stuff - Linus 
is the upstream maintainer, not Stephen.

We had a real mess with perfmon3 which was included into linux-next 
in a rogue way without Cc:-ing the affected maintainers and against 
the maintainers' wishes. There was a repeat incident recently as 
well, where a tree was included into linux-next without the approval 
(and without the Cc:) of affected maintainers. linux-next needs to be 
more careful about adding trees.

All in all, we did the same with perfcounters that we expected of 
perfmonv3. No double standard.

Nor is there any real issue here. The bug was my fault, it was 
trivial to fix, it affects a small subset of testers and it is 
already upstream, applied on the same day perfcounters were pulled.

Ingo


Re: linux-next: origin tree build failure

2009-06-12 Thread Ingo Molnar

* Benjamin Herrenschmidt  wrote:

> On Fri, 2009-06-12 at 10:24 +1000, Stephen Rothwell wrote:
> 
> > From: Stephen Rothwell 
> > Date: Fri, 12 Jun 2009 10:14:22 +1000
> > Subject: [PATCH] perfcounters: remove powerpc definitions of 
> > perf_counter_do_pending
> > 
> > Commit 925d519ab82b6dd7aca9420d809ee83819c08db2 ("perf_counter:
> > unify and fix delayed counter wakeup") added global definitions.
> > 
> > Signed-off-by: Stephen Rothwell 
> 
> Acked-by: Benjamin Herrenschmidt 

Ah - thanks. The bug was caused by me being a bit too optimistic in 
applying the shiny-new Power7 support patches on the last day. (nice 
CPU btw.)

> Linus, please apply. BTW, This is _EXACTLY_ why this should have 
> been in -next for a few days before being merged :-(

Not really: for example current upstream is build-broken on x86 due 
to an integration artifact via the kmemleak tree - despite it having 
been in linux-next for months.

Paulus was building and booting powerpc on a daily basis and i ran 
cross-builds as well.

Such bugs happen, and they are easy enough to fix. What matters 
aren't the 1-2 short-lived bugs that do happen when a new combination 
of trees is created, but the long-lived combination bugs and 
conflicts.

Ingo


Re: linux-next: origin tree build failure

2009-06-12 Thread Ingo Molnar

* Benjamin Herrenschmidt  wrote:

> > Ah - thanks. The bug was caused by me being a bit too optimistic 
> > in applying the shiny-new Power7 support patches on the last 
> > day. (nice CPU btw.)
> 
> In that case paulus tells me it's actually Peter screwing up 
> moving something from the powerpc code to generic :-)

Yes, but i committed it and it's my task to make sure that the thing 
works as a whole so it's my fault still :)

>  .../...
> 
> > Such bugs happen, and they are easy enough to fix. What matters 
> > aren't the 1-2 short-lived bugs that do happen when a new 
> > combination of trees is created, but the long-lived combination 
> > bugs and conflicts.
> 
> I'm not saying -next would fix world hunger ... but in this case 
> we have two sets of issues, perfctr and the init ordering change 
> which both got merged totally bypassing -next... We should at 
> least -try- to follow the process we've defined, don't you think ?

You are trying to define a process that does not exist in that form 
and which never existed in that form.

It was never true that new code _MUST_ go via linux-next - and i 
hope it will never be true.

linux-next has integration testing so that interactions between 
maintainer trees are mapped and that architectures that otherwise 
few people use get build-tested too (well beyond their practical 
relevance, i have to add) - but there's little critical review done 
in linux-next. Nor should it be the forum for that, it simply 
contains way too much stuff and has a weird history format with 
daily rebases that makes review hard and expensive in that form.

linux-next should not be second-guessing maintainers and should not 
act as an "approval forum" for controversial features, increasing 
the (already quite substantial) pressure on maintainers to apply 
more crap.

And that is true even if it's a new feature that i happen to support 
- as in this case - it sure would have been handy to have more 
perfcounters test coverage, every little bit of extra testing helps.

If linux-next wants to do that then it should be renamed to 
something else and not called linux-next.

Ingo


Re: linux-next: origin tree build failure

2009-06-12 Thread Ingo Molnar

* Benjamin Herrenschmidt  wrote:

> > linux-next should not be second-guessing maintainers and should 
> > not act as an "approval forum" for controversial features, 
> > increasing the (already quite substantial) pressure on 
> > maintainers to apply more crap.
> 
> I agree here. That's not the point. The idea is that for things 
> that -are- approved by their respective maintainers, to get some 
> integration testing and ironing of those mechanical bugs so that 
> by the time they hit mainstream, they don't break bisection among 
> others.

This is certainly doable for agreeable features - which is the bulk 
- and it is being done.

But this is a catch-22 for _controversial_ new features - which 
perfcounters clearly was, in case you turned off your lkml 
subscription ;-)

And if you hit that build breakage during bisection you can do:

   git cherry-pick e14112d

Also, you seem to brush off the notion that far more bugs slip 
through linux-next than get caught by it.

So if you think linux-next matters in terms of _regression_ testing, 
the numbers don't seem to support that notion. This particular 
incident does support that notion though, granted - but it's taken 
out of context IMHO:

In terms of test coverage, at least for our trees, less than 1% of 
the bugs we handle get reported in a linux-next context - and most 
of the bugs that get reported (against say the scheduler tree) are 
related to rare architectures.

In fact, i checked, there were _zero_ x86 bugs reported against 
linux-next and solved against it between v2.6.30-rc1 and v2.6.30:

   git log --grep=next -i v2.6.30-rc1..v2.6.30 arch/x86/

Doing it over the full cycle shows one commit altogether - a Xen 
build failure. In fact, i just checked the whole stabilization cycle 
for the whole kernel (v2.6.30-rc1..v2.6.30-final), and there were 
only 5 linux-next originated patches, most of them build failures.

I did this by looking at all occurrences of 'next', in all commit 
logs:

   git log --grep=next -i v2.6.30-rc1..v2.6.30

and then manually checking the context of all 'next' matches and 
counting the linux-next related commits.

So let's be generous and say that because some people don't put the 
bug report originator into the changelog it was four times as many, 
20 - but that's still dwarfed by the sheer amount of post-rc1 
changes: thousands of changes and hundreds of regressions.

linux-next is mostly useful (to me at least) not for the 
cross-builds it does, but in terms of mapping out upcoming conflicts 
- which also drives early detection of problematic patches and 
problematic conflicts.

Ingo


Re: linux-next: origin tree build failure

2009-06-12 Thread Ingo Molnar

* Benjamin Herrenschmidt  wrote:

> On Fri, 2009-06-12 at 23:10 +1000, Benjamin Herrenschmidt wrote:
> > On Fri, 2009-06-12 at 14:53 +0200, Ingo Molnar wrote:
> 
> > To some extent, here, the issue is on Linus side and it's up to him (Hey
> > Linus ! still listening ?) to maybe be more proactive at giving an ack
> > or nack so that we can get a chance to do that final pass of ironing out
> > the mechanical bugs before we hit the main tree.
> 
> Let me add a little bit more background to my reasoning here and why I
> think having this integration testing step is so valuable...
> 
> It all boils down to bisection and having a bisectable tree.

I think you are way too concentrated on this particular incident, 
and you are generalizing it into something that is not so in 
practice.

Even in this particular case, there are just 3 other commit points in 
the Git tree between commit 8a1ca8c (the breakage on PowerPC) and 
e14112d (the fix). We'll have up to 10,000 commits in this cycle.

I bisect on an almost daily basis, and i'm not seeing unreasonable 
problems.

Ingo


Re: linux-next: origin tree build failure

2009-06-12 Thread Ingo Molnar

* Benjamin Herrenschmidt  wrote:

> On Fri, 2009-06-12 at 15:44 +0200, Ingo Molnar wrote:
> 
> > This is certainly doable for agreeable features - which is the bulk 
> > - and it is being done.
> > 
> > But this is a catch-22 for _controversial_ new features - which 
> > perfcounters clearly was, in case you turned off your lkml 
> > subscription ;-)
> 
> I didn't :-) My point here is that Linus can make a decision with 
> an email -before- merging so that -next gets a chance, at least 
> for a couple of days, to do the integration testing once the 
> controversy has been sorted by his highness.

Uhm, the bug you are making a big deal of would have been found and 
fixed by Paulus a few hours after any such mail - and probably by me 
too as i do daily cross builds to Power.

So yes, we had a bug, but any extra linux-next hoops would not have 
prevented it: i could still have messed up by getting lured by that 
nice piece of Power7 hardware enablement patch on the last day ;-)

So the bug was my fault for being too fast-and-loose with that 
particular patch, creating a ~5-commits-hop build breakage bisection 
window on Power.

Now that i'm sufficiently chastised, can we now move on please? :)

Ingo


Re: linux-next: origin tree build failure

2009-06-12 Thread Ingo Molnar

* Benjamin Herrenschmidt  wrote:

> On Fri, 2009-06-12 at 15:49 +0200, Ingo Molnar wrote:
> > * Benjamin Herrenschmidt  wrote:
> > 
> > > On Fri, 2009-06-12 at 23:10 +1000, Benjamin Herrenschmidt wrote:
> > > > On Fri, 2009-06-12 at 14:53 +0200, Ingo Molnar wrote:
> > > 
> > > > To some extent, here, the issue is on Linus side and it's up to him (Hey
> > > > Linus ! still listening ?) to maybe be more proactive at giving an ack
> > > > or nack so that we can get a chance to do that final pass of ironing out
> > > > the mechanical bugs before we hit the main tree.
> > > 
> > > Let me add a little bit more background to my reasoning here and why I
> > > think having this integration testing step is so valuable...
> > > 
> > > It all boils down to bisection and having a bisectable tree.
> > 
> > I think you are way too concentrated on this particular incident, 
> > and you are generalizing it into something that is not so in 
> > practice.
> 
> Maybe. But maybe it's representative... so far in this merge 
> window, 100% of the powerpc build and runtime breakage upstream 
> comes from stuff that didn't get into -next before.

But that's axiomatic, isn't it? linux-next build-tests PowerPC as the 
first in the row of tests - so no change that was in linux-next can 
ever cause a build failure on PowerPC, right?

Ingo


Re: [PATCH 1/2] lib: Provide generic atomic64_t implementation

2009-06-13 Thread Ingo Molnar

* Linus Torvalds  wrote:

> On Sat, 13 Jun 2009, Linus Torvalds wrote:
> > 
> > On Sat, 13 Jun 2009, Paul Mackerras wrote:
> > >
> > > Linus, Andrew: OK if this goes in via the powerpc tree?
> > 
> > Ok by me.
> 
> Btw, do 32-bit architectures really necessarily want 64-bit 
> performance counters?
> 
> I realize that 32-bit counters will overflow pretty easily, but I 
> do wonder about the performance impact of doing things like hashed 
> spinlocks for 64-bit counters. Maybe the downsides of 64-bit perf 
> counters on such architectures might outweigh the upsides?

We account all sorts of non-hw bits via atomic64_t as well - for 
example time related counters in nanoseconds - which wrap 32 bits at 
4 seconds.
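
The "4 seconds" figure is just 2^32 nanoseconds expressed in seconds. A minimal sketch of that arithmetic follows; the function names are made up for illustration and are not kernel APIs:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Seconds until an unsigned counter of the given width wraps if it
 * advances by one per nanosecond. Only widths 1..63 are handled, so
 * that the shift below stays well-defined.
 */
static double ns_counter_wrap_seconds(unsigned int bits)
{
	assert(bits >= 1 && bits <= 63);
	return (double)(UINT64_C(1) << bits) / 1e9;
}

/* The same figure for a full 64-bit counter, in 365-day years. */
static double ns_counter64_wrap_years(void)
{
	/* 2^64 ns, written as a floating-point constant to avoid a 64-bit shift. */
	double ns = 18446744073709551616.0;

	return ns / 1e9 / (365.0 * 24 * 3600);
}
```

With 32 bits this gives roughly 4.29 seconds; with 64 bits the wrap is centuries away, which is why nanosecond accounting wants the wider type.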

There's also security/stability relevant bits:

counter->id = atomic64_inc_return(&perf_counter_id);

We don't really want that ID to wrap ever - it could leak one PMU 
context into another. (We could rewrite it by putting a global lock 
around it, but still - this is a convenient primitive.)

In select places we might be able to reduce the use of atomic64_t 
(that might make performance sense anyway) - but to get rid of all 
of them would be quite painful. We initially started with a 32-bit 
implementation and it was quite painful with fast-paced units.

So since Paul has already coded the wrappers up ... i'd really 
prefer that, unless there's really compelling reasons not to do it.

Ingo


Re: [PATCH RFC] powerpc: perf_counter: Enable use of software counters on 32-bit powerpc

2009-06-13 Thread Ingo Molnar

* Paul Mackerras  wrote:

> +extern void set_perf_counter_pending(void);

btw., Mike Frysinger pointed out that this prototype should be in 
include/linux/perf_counter.h, not spread out in every architecture 
pointlessly.

Ingo


Re: [PATCH 1/6] perf_counter: powerpc: Enable use of software counters on 32-bit powerpc

2009-06-17 Thread Ingo Molnar

* Paul Mackerras  wrote:

> This depends on the generic atomic64_t patches, which are now in 
> Linus' tree.  Ingo, if you're putting these in, please pull Linus' 
> tree in first.

yes, i already did that earlier today - so all should be fine with 
the lib/atomic64.c dependency.

Ingo


Re: [PATCH 6/6] perf_counter: tools: Makefile tweaks for 64-bit powerpc

2009-06-17 Thread Ingo Molnar

* Paul Mackerras  wrote:

> +++ b/tools/perf/Makefile
> @@ -157,9 +157,21 @@ uname_R := $(shell sh -c 'uname -r 2>/dev/null || echo not')
>  uname_P := $(shell sh -c 'uname -p 2>/dev/null || echo not')
>  uname_V := $(shell sh -c 'uname -v 2>/dev/null || echo not')
>  
> +# If we're on a 64-bit kernel, use -m64
> +ifneq ($(patsubst %64,%,$(uname_M)),$(uname_M))
> +  M64 := -m64
> +endif

this is fine.
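
The patsubst test in the quoted hunk strips a trailing "64" from $(uname_M) and checks whether anything changed, i.e. it asks "does the machine name end in 64?". The same check, sketched in C for realistic machine names (the function name is invented for illustration):

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * True if the machine name ends in "64", e.g. "ppc64" or "x86_64".
 * This is the condition under which the Makefile hunk sets M64 := -m64.
 */
static bool machine_name_ends_in_64(const char *uname_m)
{
	size_t len;

	assert(uname_m != NULL);
	len = strlen(uname_m);

	return len >= 2 && strcmp(uname_m + len - 2, "64") == 0;
}
```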

> +# Don't use -Werror on ppc64; we get warnings due to using
> +# %Lx formats on __u64, which is unsigned long.
> +Werror := -Werror
> +ifeq ($(uname_M),ppc64)
> +  Werror :=
> +endif

hm, i don't really like this one - it just adds a special case on an 
arch. Why is __u64 unsigned long on powerpc and not unsigned long 
long? I thought the whole mess with u64 was fixed there recently and 
powerpc too now uses include/asm-generic/int-ll64.h ?

ah, it does this:

/*
 * This is here because we used to use l64 for 64bit powerpc
 * and we don't want to impact user mode with our change to ll64
 * in the kernel.
 */
#if defined(__powerpc64__) && !defined(__KERNEL__)
# include <asm-generic/int-l64.h>
#else
# include <asm-generic/int-ll64.h>
#endif

That's crappy really.

Ingo


Re: [PATCH 6/6] perf_counter: tools: Makefile tweaks for 64-bit powerpc

2009-06-17 Thread Ingo Molnar

* Paul Mackerras  wrote:

> Ingo Molnar writes:
> 
> > ah, it does this:
> > 
> > /*
> >  * This is here because we used to use l64 for 64bit powerpc
> >  * and we don't want to impact user mode with our change to ll64
> >  * in the kernel.
> >  */
> > #if defined(__powerpc64__) && !defined(__KERNEL__)
> > # include <asm-generic/int-l64.h>
> > #else
> > # include <asm-generic/int-ll64.h>
> > #endif
> > 
> > That's crappy really.
> 
> We were concerned that changing the userland-visible type of __u64 
> from unsigned long to unsigned long long, etc., would be breaking 
> the ABI, even if only in a small way - I thought it could possibly 
> change C++ mangled function names, for instance, and it would 
> cause fresh compile warnings on existing user code that prints 
> __u64 with %lx, which has always been the correct thing to do on 
> ppc64.
> 
> A counter-argument would be, I guess, that __u64 et al. are purely 
> for use in describing the kernel/user interface, so we have a 
> little more latitude than with the type of e.g. u_int64_t.  I 
> dunno.  I don't recall getting much of an answer from the glibc 
> guys about what they thought of the idea of changing it.
> 
> Anyway, of the 64-bit architectures, alpha, ia64, and mips64 also 
> have __u64 as unsigned long in userspace, so this issue will still 
> crop up even if we change it on powerpc.

Having crap elsewhere is no reason to spread it further really. We 
need consistent types. Can we define __KERNEL__ perhaps to get to 
the real types?

Ingo


Re: [PATCH 1/6] perf_counter: powerpc: Enable use of software counters on 32-bit powerpc

2009-06-17 Thread Ingo Molnar

* Paul Mackerras  wrote:

> This depends on the generic atomic64_t patches, which are now in 
> Linus' tree.  Ingo, if you're putting these in, please pull Linus' 
> tree in first.

Note, i've created a new branch, tip:perfcounters/powerpc, so we can 
keep these things separate and Ben can pull them too. I see there 
was some review feedback - do you want to send a v2 version perhaps?

Ingo


Re: [PATCH 1/6] perf_counter: powerpc: Enable use of software counters on 32-bit powerpc

2009-06-17 Thread Ingo Molnar

* Kumar Gala  wrote:

> On Jun 17, 2009, at 9:21 AM, Ingo Molnar wrote:
>
>> * Paul Mackerras  wrote:
>>
>>> This depends on the generic atomic64_t patches, which are now in 
>>> Linus' tree.  Ingo, if you're putting these in, please pull 
>>> Linus' tree in first.
>>
>> Note, i've created a new branch, tip:perfcounters/powerpc, so we 
>> can keep these things separate and Ben can pull them too. I see 
>> there was some review feedback - do you want to send a v2 version 
>> perhaps?
>
> Out of interest, is the intent to try and get these changes into 
> .31?  I ask because I want to know if I should try to find time to 
> add support for the FSL embedded perfmon ASAP or not.

I think it would be nice to have more platform support in .31. 
Perfcounters is a brand-new feature so there's no risk of 
regression. In the end it will depend on Linus to pull of course, 
and BenH can veto it too if he'd like no more PowerPC changes in 
this cycle. Worst-case it's all .32 material.

Ingo


Re: [PATCH 1/6] perf_counter: powerpc: Enable use of software counters on 32-bit powerpc

2009-06-18 Thread Ingo Molnar

* Paul Mackerras  wrote:

> Ingo Molnar writes:
> 
> > Note, i've created a new branch, tip:perfcounters/powerpc, so we can 
> > keep these things separate and Ben can pull them too. I see there 
> > was some review feedback - do you want to send a v2 version perhaps?
> 
> Kumar's comments seemed to me to be wanting changes to accommodate 
> code that doesn't exist yet, so I think those changes should be 
> done later when that code exists and we know exactly what is 
> needed.  So the current patches are fine as-is IMO.

ok - will queue them up.

Ingo


Re: [PATCH 6/6] perf_counter: tools: Makefile tweaks for 64-bit powerpc

2009-06-18 Thread Ingo Molnar

* Paul Mackerras  wrote:

> This also removes the -Werror flag when building on a 64-bit powerpc
> machine.  The userspace definition of u64 is unsigned long rather
> than unsigned long long, meaning that gcc warns every time a u64
> is printed with %Lx or %llx (though that does work properly).
> In future we may use PRI64 etc. for printing 64-bit quantities,
> which would eliminate these warnings.

> +# Don't use -Werror on ppc64; we get warnings due to using
> +# %Lx formats on __u64, which is unsigned long.
> +Werror := -Werror
> +ifeq ($(uname_M),ppc64)
> +  Werror :=
> +endif

Note, i left out this bit from the commit - we need to find a better 
solution than to allow ugly warnings on PowerPC.

Could we use the kernel's u64 type directly perhaps? That would 
allow us to change all __u64 to u64 in all of tools/perf/ which is a 
nice clean-up in any case.
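
To make the warning concrete: %llx expects unsigned long long, so it only matches a 64-bit type that is actually defined that way. A small sketch of two warning-free options, assuming a tool-local typedef (the helper names are illustrative, not from tools/perf):

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Assumption: the tool defines its own u64, independent of the exported __u64. */
typedef unsigned long long u64;

/* %llx always matches here, because u64 is unsigned long long on every ABI. */
static int format_u64_hex(char *buf, size_t len, u64 val)
{
	assert(buf != NULL);
	return snprintf(buf, len, "%llx", val);
}

/* Alternative: keep uint64_t and let PRIx64 pick the right length modifier. */
static int format_uint64_hex(char *buf, size_t len, uint64_t val)
{
	assert(buf != NULL);
	return snprintf(buf, len, "%" PRIx64, val);
}
```

Either variant compiles cleanly with -Werror whether the platform's native 64-bit type is unsigned long or unsigned long long, so no per-arch exception in the Makefile would be needed.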

Ingo


Re: [PATCH 6/6] perf_counter: tools: Makefile tweaks for 64-bit powerpc

2009-06-19 Thread Ingo Molnar

* Paul Mackerras  wrote:

> Ingo Molnar writes:
> 
> > Note, i left out this bit from the commit - we need to find a 
> > better solution than to allow ugly warnings on PowerPC.
> > 
> > Could we use the kernel's u64 type directly perhaps? That would 
> > allow us to change all __u64 to u64 in all of tools/perf/ which 
> > is a nice clean-up in any case.
> 
> This is userspace, we can use "u64" however we like. :) I'll cook 
> up a patch to add "typedef unsigned long long u64" and change 
> __u64 to u64.

Thanks, i've applied the patch. Note, it crossed with a few changes 
from Peter - i fixed up the conflicts - please double check whether 
it's still fine on PowerPC too.

Ingo


Re: [PATCH 1/6] perf_counter: powerpc: Enable use of software counters on 32-bit powerpc

2009-06-20 Thread Ingo Molnar

* Benjamin Herrenschmidt  wrote:

> On Wed, 2009-06-17 at 16:27 +0200, Ingo Molnar wrote:
> > I think it would be nice to have more platform support in .31. 
> > Perfcounters is a brand-new feature so there's no risk of 
> > regression. In the end it will depend on Linus to pull of course, 
> > and BenH can veto it too if he'd like no more PowerPC changes in 
> > this cycle. Worst-case it's all .32 material.
> 
> There have been few PowerPC changes in this cycle and I agree 
> with you on that it's a nice feature to have with little risk of 
> regression.

Ok - thanks - i'll push it to Linus probably later today.

> In fact, I also have an up-to-date (and hopefully working) 
> irqtrace/lockdep patch for 32-bit powerpc (we only do 64-bit right 
> now) that I'm considering merging this time around, the benefit it 
> brings is worth the risk I believe.

Nice :-)

Ingo


Re: [PATCH 2/2] perf_counter: powerpc: Add callchain support

2009-06-27 Thread Ingo Molnar

* Peter Zijlstra  wrote:

> On Sat, 2009-06-27 at 15:31 +1000, Paul Mackerras wrote:
> > +   if (regs) {
> > +   if (current_is_64bit())
> > +   perf_callchain_user_64(regs, entry);
> > +   else
> > +   perf_callchain_user_32(regs, entry);
> > +   }
> 
> Ingo do we need 32 on 64 stuff like that too?

hm, indeed.

Ingo


Re: [00/15] swiotlb cleanup

2009-07-09 Thread Ingo Molnar

* FUJITA Tomonori  wrote:

> - removes unused (and unnecessary) hooks in swiotlb.
> 
> - adds dma_capable() and converts swiotlb to use it. It can be used to
> know if a memory area is dma capable or not. I added
> is_buffer_dma_capable() for the same purpose long ago but it turned
> out that the function doesn't work on POWERPC.
> 
> This can be applied cleanly to linux-next, -mm, and mainline. This
> patchset touches multiple architectures (ia64, powerpc, x86) so I
> guess that -mm is appropriate for this patchset (I don't care much
> what tree would merge this though).
> 
> This is tested on x86 but only compile tested on POWERPC and IA64.
> 
> Thanks,
> 
> =
>  arch/ia64/include/asm/dma-mapping.h    |  18 ++
>  arch/powerpc/include/asm/dma-mapping.h |  23 +++
>  arch/powerpc/kernel/dma-swiotlb.c      |  48 +---
>  arch/x86/include/asm/dma-mapping.h     |  18 ++
>  arch/x86/kernel/pci-dma.c              |   2 +-
>  arch/x86/kernel/pci-gart_64.c          |   5 +-
>  arch/x86/kernel/pci-nommu.c            |   2 +-
>  arch/x86/kernel/pci-swiotlb.c          |  25 
>  include/linux/dma-mapping.h            |   5 --
>  include/linux/swiotlb.h                |  11 
>  lib/swiotlb.c                          | 102 +---
>  11 files changed, 92 insertions(+), 167 deletions(-)
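
As background for the dma_capable() discussion: the question such a helper answers is whether a whole buffer lies within what the device can address. A simplified userspace sketch, assuming a device described only by a DMA mask (the demo_* types and names are hypothetical, not the interface from the patchset):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t demo_dma_addr_t;

struct demo_device {
	const uint64_t *dma_mask;	/* NULL: device cannot do DMA at all */
};

/* True if every byte of [addr, addr + size) is reachable under the mask. */
static bool demo_dma_capable(const struct demo_device *dev,
			     demo_dma_addr_t addr, size_t size)
{
	assert(dev != NULL);

	if (!dev->dma_mask || size == 0)
		return false;

	/* Compare against the last byte, arranged so nothing can overflow. */
	return addr <= *dev->dma_mask && size - 1 <= *dev->dma_mask - addr;
}
```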

Hm, the functions and facilities you remove here were added as part 
of preparatory patches for Xen guest support. You were aware of 
them, you were involved in discussions about those aspects with Ian 
and Jeremy but still you chose not to Cc: either of them and you 
failed to address that aspect in the changelogs.

I'd like the Xen code to become cleaner more than anyone else here i 
guess, but patch submission methods like this are not really 
helpful. A far better method is to be open about such disagreements, 
to declare them, to Cc: everyone who disagrees, and to lay out the 
arguments in the changelogs as well - instead of just curtly 
declaring those APIs 'unused' and failing to Cc: involved parties.

Alas, on the technical level the cleanups themselves look mostly 
fine to me. Ian, Jeremy, the changes will alter Xen's use of 
swiotlb, but can the Xen side still live with these new methods - in 
particular is dma_capable() sufficient as a mechanism and can the 
Xen side filter out DMA allocations to make them physically 
contiguous?

Ben, Tony, Becky, any objections wrt. the PowerPC / IA64 impact? If 
everyone agrees i can apply them to the IOMMU tree, test it and push 
it out to -next, etc.

Ingo


Re: [00/15] swiotlb cleanup

2009-07-10 Thread Ingo Molnar

* Ian Campbell  wrote:

> I've not examined the series in detail it looks OK but I don't 
> think it is quite sufficient. The Xen determination of whether a 
> buffer is dma_capable or not is based on the physical address 
> while dma_capable takes only the dma address.
> 
> I'm not sure we can "invert" our conditions to work back from dma 
> address to physical since given a start dma address and a length 
> we would need to check that dma_to_phys(dma+PAGE_SIZE) == 
> dma_to_phys(dma)+PAGE_SIZE etc. However dma+PAGE_SIZE might belong 
> to a different domain so translating it to a physical address in 
> isolation tells us nothing especially useful since it would give 
> us the physical address in that other guest which is useless to 
> us. If we could pass both physical and dma address to dma_capable 
> I think that would probably be sufficient for our purposes.
> 
> As well as that Xen needs some way to influence the allocation of 
> the actual bounce buffer itself since we need to arrange for it to 
> be machine address contiguous as well as physical address 
> contiguous. This series explicitly removes those hooks without 
> replacement. My most recent proposal was to have a new 
> swiotlb_init variant which was given a preallocated buffer which 
> this series doesn't necessarily preclude.
> 
> The phys_to_dma and dma_to_phys translation points are the last 
> piece Xen needs and seem to be preserved in this series.
> 
> However Fujita's objection to all of the previous swiotlb-for-xen 
> proposals was around the addition of the Xen hooks in whichever 
> location. Originally these hooks were via __weak functions and 
> later proposals implemented them via function pointer hooks in the 
> x86 implementations of the arch-abstract interfaces (phys<->dma 
> and dma_capable etc). I don't think this series addresses those 
> objections (fair enough -- it wasn't intended to) or leads to any 
> new approach to solving the issue, although I also don't think it 
> makes the issue any harder to address. I don't think it will be 
> possible to make progress on Xen usage of swiotlb until a solution 
> can be found to this conflict of opinion.
> 
> Fujita suggested that we export the core sync_single() 
> functionality and reimplemented the surrounding infrastructure in 
> terms of that (and incorporating our additional requirements). I 
> prototyped this (it is currently unworking, in fact it seems to 
> have developed rather a taste for filesystems :-() but the 
> diffstat of my WIP patch is:
>
>  arch/x86/kernel/pci-swiotlb.c |    6 
>  arch/x86/xen/pci-swiotlb.c    |    2 
>  drivers/pci/xen-iommu.c       |  385 --
>  include/linux/swiotlb.h       |   12 +
>  lib/swiotlb.c                 |   10 -
>  5 files changed, 385 insertions(+), 30 deletions(-)
>
> where a fair number of the lines in xen-iommu.c are copies of 
> functions from swiotlb.c with minor modifications. As I say it 
> doesn't work yet but I think it's roughly indicative of what such 
> an approach would look like. I don't like it much but am happy to 
> run with it if it looks to be the most acceptable approach. [...]
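
The contiguity condition Ian describes is easy to state when starting from the guest's pseudo-physical frames: consecutive frames must map to consecutive machine frames. A toy sketch of that forward check, assuming a flat frame-translation table (the table and function are hypothetical, not Xen's real interface):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * p2m[pfn] is the machine frame backing pseudo-physical frame pfn.
 * Returns true if the npages frames starting at first_pfn are backed by
 * consecutive machine frames, i.e. the range is machine-contiguous.
 */
static bool demo_range_machine_contiguous(const uint64_t *p2m, size_t nr_frames,
					  size_t first_pfn, size_t npages)
{
	size_t i;

	assert(p2m != NULL);

	if (npages == 0 || first_pfn >= nr_frames ||
	    npages > nr_frames - first_pfn)
		return false;

	for (i = 1; i < npages; i++)
		if (p2m[first_pfn + i] != p2m[first_pfn] + i)
			return false;

	return true;
}
```

Going the other way, from a DMA address back to physical, has no such table to consult for neighbouring frames that may belong to another domain, which is the difficulty raised in the quoted text.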

+400 lines of code to avoid much fewer lines of generic code impact 
on the lib/swiotlb.c side sounds like a bad technical choice to me. 

It makes the swiotlb code less useful and basically forks a random 
implementation of it in drivers/pci/xen-iommu.c.

Fujita-san, can you think of a solution that avoids the whole-sale 
copying of hundreds of lines of code?

Ingo


Re: [00/15] swiotlb cleanup

2009-07-18 Thread Ingo Molnar

* FUJITA Tomonori  wrote:

> On Mon, 13 Jul 2009 13:20:22 +0900
> FUJITA Tomonori  wrote:
> 
> > On Fri, 10 Jul 2009 16:12:48 +0200
> > Ingo Molnar  wrote:
> > 
> > > > functionality and reimplemented the surrounding infrastructure in 
> > > > terms of that (and incorporating our additional requirements). I 
> > > > prototyped this (it is currently unworking, in fact it seems to 
> > > > have developed rather a taste for filesystems :-() but the 
> > > > diffstat of my WIP patch is:
> > > >
> > > >  arch/x86/kernel/pci-swiotlb.c |6 
> > > >  arch/x86/xen/pci-swiotlb.c|2 
> > > >  drivers/pci/xen-iommu.c   |  385 
> > > > --
> > > >  include/linux/swiotlb.h   |   12 +
> > > >  lib/swiotlb.c |   10 -
> > > >  5 files changed, 385 insertions(+), 30 deletions(-)
> > > >
> > > > where a fair number of the lines in xen-iommu.c are copies of 
> > > > functions from swiotlb.c with minor modifications. As I say it 
> > > > doesn't work yet but I think it's roughly indicative of what such 
> > > > an approach would look like. I don't like it much but am happy to 
> > > > run with it if it looks to be the most acceptable approach. [...]
> > > 
> > > +400 lines of code to avoid much fewer lines of generic code impact 
> > > on the lib/swiotlb.c side sounds like a bad technical choice to me. 
> > 
> > The amount of code is not the point. The way to impact on the
> > lib/swiotlb.c is totally wrong from the perspective of the kernel
> > design; it uses architecture code in the very original (xen) way.
> 
> btw, '+400 lines of code to avoid much fewer lines of generic code
> impact on the lib/swiotlb.c' doesn't sound true to me.
> 
> Here is a patch in the way that Xen people want to do:
> 
> http://patchwork.kernel.org/patch/26343/
> 
> ---
>  arch/x86/Kconfig                 |    4 +
>  arch/x86/include/asm/io.h        |    2 +
>  arch/x86/include/asm/pci_x86.h   |    1 +
>  arch/x86/include/asm/xen/iommu.h |   12 ++
>  arch/x86/kernel/pci-dma.c        |    3 +
>  arch/x86/pci/Makefile            |    1 +
>  arch/x86/pci/init.c              |    6 +
>  arch/x86/pci/xen.c               |   51 +++
>  drivers/pci/Makefile             |    2 +
>  drivers/pci/xen-iommu.c          |  271 ++
> 
> Even with the way that Xen people want to do, 
> drivers/pci/xen-iommu.c is about 300 lines. And my patchset 
> removes the nice amount of lines for dom0 support. I don't see 
> much difference wrt lines.

ok, that kind of impact looks reasonable. If we are wrong and the 
Xen model becomes duplicated anywhere else it can still be 
generalized into core swiotlb code.

Ingo


Re: [PATCH RFC 1/2] Makefile: Never use -fno-omit-frame-pointer

2009-07-18 Thread Ingo Molnar

* Anton Vorontsov  wrote:

> On Wed, Jun 17, 2009 at 12:16:30AM +0400, Anton Vorontsov wrote:
> > According to Segher Boessenkool and GCC manual, -fomit-frame-pointer
> > is only the default when optimising on archs/ABIs where it doesn't
> > hinder debugging and -pg. So, we do not get it by default on x86,
> > not at any optimisation level.
> > 
> > On the other hand, *using* -fno-omit-frame-pointer causes gcc to
> > produce buggy code on PowerPC targets.
> > 
> > If Segher and GCC manual are right, this patch should be a no-op
> > for all arches except PowerPC, where the patch fixes gcc issues.
> > 
> > Signed-off-by: Anton Vorontsov 
> > ---
> > 
> > See this thread for more discussion:
> > http://osdir.com/ml/linux-kernel/2009-05/msg01754.html
> > 
> > p.s.
> > Obviously, I didn't test this patch on anything else but PPC32. ;-)
> > 
> > Segher, do you know if all GCC versions that we support for
> > building Linux are behaving the way that GCC manual describe?
> 
> No news is good news... Ingo, can we merge this into -tip for 
> testing?

Changes to the top level Makefile should really go via Sam's kbuild 
tree.

Ingo


Re: ftrace scripts and make V=1

2009-08-05 Thread Ingo Molnar

* Steven Rostedt  wrote:

> Well we tracked it down and it is powerpc64 specific.
> 
> Seems that in drivers/hwmon/lm93.c there's a function called:
> 
>LM93_IN_FROM_REG()
> 
> But PPC64 has function descriptors and the real function names (the ones 
> you see in objdump) start with a '.'. Thus in objdump you have:
> 
>  Disassembly of section .text:
> 
>   <.LM93_IN_FROM_REG>:
>0:   7c 08 02 a6 mflrr0
>4:   fb 81 ff e0 std r28,-32(r1)
> 
> 
> The function name used is .LM93_IN_FROM_REG. But gcc considers 
> symbols that start with ".L" as a special symbol that is used 
> inside the assembly stage.
> 
> The nm passed into recordmcount uses the --synthetic option which 
> shows the ".L" symbols (my runs outside of the build did not 
> include the --synthetic option, so my older patch worked). We see 
> the function as a local.
> 
> Now to capture all the locations that use "mcount" we need to have 
> a reference to link into the object file a list of mcount callers. 
> We need a reference that will not disappear. We try to use a 
> global function and if that does not work, we use a local function 
> as a reference. But to relink the section back into the object, we 
> need to make it global. In this case, we run objcopy using 
> --globalize-symbol and --localize-symbol to convert the symbol 
> into a global symbol, link the mcount list, then convert it back 
> to a local symbol.
> 
> This works great except for this case. .L* symbols can not be 
> converted into a global symbol, and the mcount section referencing 
> it will remain unresolved.
> 
> Try this patch and see if it fixes your issue.
> 
> Thanks!
> 
> -- Steve
> 
> diff --git a/scripts/recordmcount.pl b/scripts/recordmcount.pl
> index d29baa2..4889c44 100755
> --- a/scripts/recordmcount.pl
> +++ b/scripts/recordmcount.pl
> @@ -414,7 +414,10 @@ while (<IN>) {
>   $offset = hex $1;
>   } else {
>   # if we already have a function, and this is weak, skip it
> - if (!defined($ref_func) && !defined($weak{$text})) {
> + if (!defined($ref_func) && !defined($weak{$text}) &&
> +  # PPC64 can have symbols that start with .L and
> +  # gcc considers these special. Don't use them!
> +  $text !~ /^\.L/) {
>   $ref_func = $text;
>   $offset = hex $1;
>   }

Ah, indeed. I'm wondering whether also emitting a build warning 
would be useful - just in the (admittedly unlikely) case of someone 
wondering about why LM93_IN_FROM_REG does not show up in function 
traces.
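
For illustration, the selection rule in Steve's patch plus the suggested warning could look roughly like this in C. This is a sketch only: the real logic lives in the Perl script, and the struct and function names here are invented:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

struct demo_sym {
	const char *name;
	bool weak;
};

/* Names starting with ".L" are treated as assembler-local by the toolchain. */
static bool demo_is_dot_L(const char *name)
{
	return strncmp(name, ".L", 2) == 0;
}

/*
 * Return the first symbol usable as the mcount reference function, or
 * NULL if there is none. Weak symbols are skipped silently; skipped ".L"
 * symbols are counted so the caller can emit a build warning.
 */
static const char *demo_pick_ref_func(const struct demo_sym *syms, size_t n,
				      size_t *skipped_dot_L)
{
	size_t i;

	assert(syms != NULL && skipped_dot_L != NULL);
	*skipped_dot_L = 0;

	for (i = 0; i < n; i++) {
		if (syms[i].weak)
			continue;
		if (demo_is_dot_L(syms[i].name)) {
			(*skipped_dot_L)++;
			continue;
		}
		return syms[i].name;
	}

	return NULL;
}
```

A caller that sees a non-zero skip count has what it needs to print a note naming the object file, so a missing function in the trace output is less of a surprise.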

Ingo


Re: [PATCH -v2 0/7] powerpc: use asm-generic/dma-mapping-common.h

2009-08-13 Thread Ingo Molnar

* FUJITA Tomonori  wrote:

> On Thu, 13 Aug 2009 15:48:42 +1000
> Benjamin Herrenschmidt  wrote:
> 
> > On Wed, 2009-08-05 at 14:08 +0900, FUJITA Tomonori wrote:
> > 
> > > The above swiotlb patchset was merged in -tip so I think that merging
> > > this patchset via -tip too is the easiest way to handle this patchset.
> > > 
> > > The patchset also is available via a git tree:
> > > 
> > > git://git.kernel.org/pub/scm/linux/kernel/git/tomo/linux-2.6-misc.git 
> > > powerpc
> > 
> > Hi !
> > 
> > While I generally agree here with the patches, I'm not sure it should be
> > merged via -tip since it mostly touches arch/powerpc files (and I need
> > to review it a bit more carefully, hopefully you'll have Ack's hitting
> > your mailbox later today).
> 
> Thanks!
> 
> This patchset depends on my swiotlb cleanup patchset:
> 
> git://git.kernel.org/pub/scm/linux/kernel/git/tomo/linux-2.6-misc.git swiotlb
> 
> http://marc.info/?l=linux-ia64&m=124718816520156&w=2
> 
> My swiotlb cleanup patchset has been in -tip. It might be easier 
> to merge both the swiotlb patchset and this patchset in powerpc 
> tree?

Ben, what's your preference? I waited for your reaction with these 
bits, i.e. they are not in tip:core/iommu yet.

One variant would be what Fujita suggested: you could pull 
core/iommu as a basis (it's a well-tested, problem-free tree at the 
moment, with no big risky items), and then pull/apply the powerpc 
specific bits from Fujita.

A second variant would be that we could pull these bits into 
core/iommu ... albeit you are right that the PowerPC tree is much 
better at testing PowerPC patches.

A third variant would be to wait with these bits until the swiotlb 
bits in core/iommu hit upstream. This would increase patch latency.

Any of these variants is good to me. What Fujita suggests seems to 
be the best to me: #1 gets us the most testing and the lowest 
latency - at the cost of tree dependency. We won't rebase core/iommu.

[ We've got three good tree properties: "tree independence",
  "good testing", "low patch latency", but we cannot have all
  three at once, we must pick two of them ;-) ]

Ingo


Re: [PATCH -v2 0/7] powerpc: use asm-generic/dma-mapping-common.h

2009-08-13 Thread Ingo Molnar

* Benjamin Herrenschmidt  wrote:

> 
> > Ben, what's your preference? I waited for your reaction with these 
> > bits, i.e. they are not in tip:core/iommu yet.
> 
> Oh I thought they were... discard my previous private mail about 
> missing Ack's then :-)
> 
> I'll review them more in depth hopefully tomorrow but they look 
> good.

Sure - take your time.

> > One variant would be what Fujita suggested: you could pull 
> > core/iommu as a basis (it's a well-tested, problem-free tree at 
> > the moment, with no big risky items), and then pull/apply the 
> > powerpc specific bits from Fujita.
> 
> Or we can have the patches in core/iommu and I pull the whole 
> thing in powerpc-next. [...]

Ok! We could also stage it a bit (one or two weeks) in a separate 
branch and allow a rebase, should you find any bugs during testing?

Ingo


Re: [PATCH] perf_counter/powerpc: Fix compilation after perf_counter_overflow change

2009-09-21 Thread Ingo Molnar

* Metzger, Markus T  wrote:

> >-Original Message-
> >From: Paul Mackerras [mailto:pau...@samba.org]
> >Sent: Monday, September 21, 2009 8:45 AM
> 
> 
> >Markus, please take care in future to mention it in the changelog if
> >your patches touch definitions used by other architectures.  If you
> >could go so far as to use grep a bit more and fix up other
> >architectures' callsites for the things you're changing, that would be
> >very much appreciated.  Thanks.
> 
> I'm sorry I missed that.
> 
> There's one more place in arch/sparc/.
> The below patch should fix it, but I have no means to test it.

You also missed a third thing:

+static inline int
+perf_output_begin(struct perf_output_handle *handle, struct perf_counter *c,
+ unsigned int size, int nmi, int sample)   { }

an 'int' function returning void ...

Plus the whole !PERF_COUNTERS branch of empty inlines is pointless - these 
facilities are used by perfcounters code only. I fixed that too.

> 
> Index: b/arch/sparc/kernel/perf_counter.c
> ===
> --- a/arch/sparc/kernel/perf_counter.c
> +++ b/arch/sparc/kernel/perf_counter.c
> @@ -493,7 +493,6 @@ static int __kprobes perf_counter_nmi_ha
>  
>   regs = args->regs;
>  
> - data.regs = regs;
>   data.addr = 0;
>  
>   cpuc = &__get_cpu_var(cpu_hw_counters);
> @@ -513,7 +512,7 @@ static int __kprobes perf_counter_nmi_ha
>   if (!sparc_perf_counter_set_period(counter, hwc, idx))
>   continue;
>  
> - if (perf_counter_overflow(counter, 1, &data))
> + if (perf_counter_overflow(counter, 1, &data, regs))
>   sparc_pmu_disable_counter(hwc, idx);
>   }

Looks correct to me and i've also done a Sparc cross build with the fix 
in place and it builds fine besides the unrelated build error pasted 
below. I've added it to the other fix and if David acks it will send it 
to Linus later today.

Thanks,

Ingo

/home/mingo/tip/drivers/video/console/vgacon.c: In function 'vgacon_startup':
/home/mingo/tip/drivers/video/console/vgacon.c:516: warning: passing argument 1 
of 'scr_readw' discards qualifiers from pointer target type
/home/mingo/tip/drivers/video/console/vgacon.c:517: warning: passing argument 1 
of 'scr_readw' discards qualifiers from pointer target type
/home/mingo/tip/drivers/video/console/vgacon.c:518: warning: passing argument 2 
of 'scr_writew' discards qualifiers from pointer target type
/home/mingo/tip/drivers/video/console/vgacon.c:519: warning: passing argument 2 
of 'scr_writew' discards qualifiers from pointer target type
/home/mingo/tip/drivers/video/console/vgacon.c:520: warning: passing argument 1 
of 'scr_readw' discards qualifiers from pointer target type
/home/mingo/tip/drivers/video/console/vgacon.c:520: warning: passing argument 1 
of 'scr_readw' discards qualifiers from pointer target type
/home/mingo/tip/drivers/video/console/vgacon.c:521: warning: passing argument 2 
of 'scr_writew' discards qualifiers from pointer target type
/home/mingo/tip/drivers/video/console/vgacon.c:522: warning: passing argument 2 
of 'scr_writew' discards qualifiers from pointer target type
/home/mingo/tip/drivers/video/console/vgacon.c:525: warning: passing argument 2 
of 'scr_writew' discards qualifiers from pointer target type
/home/mingo/tip/drivers/video/console/vgacon.c:526: warning: passing argument 2 
of 'scr_writew' discards qualifiers from pointer target type
/home/mingo/tip/drivers/video/console/vgacon.c:527: warning: passing argument 1 
of 'scr_readw' discards qualifiers from pointer target type
/home/mingo/tip/drivers/video/console/vgacon.c:527: warning: passing argument 1 
of 'scr_readw' discards qualifiers from pointer target type
/home/mingo/tip/drivers/video/console/vgacon.c:528: warning: passing argument 2 
of 'scr_writew' discards qualifiers from pointer target type
/home/mingo/tip/drivers/video/console/vgacon.c:529: warning: passing argument 2 
of 'scr_writew' discards qualifiers from pointer target type
/home/mingo/tip/drivers/video/console/vgacon.c:532: warning: passing argument 2 
of 'scr_writew' discards qualifiers from pointer target type
/home/mingo/tip/drivers/video/console/vgacon.c:533: warning: passing argument 2 
of 'scr_writew' discards qualifiers from pointer target type
/home/mingo/tip/drivers/video/console/vgacon.c: In function 'vgacon_do_font_op':
/home/mingo/tip/drivers/video/console/vgacon.c:1126: error: implicit 
declaration of function 'vga_writeb'
/home/mingo/tip/drivers/video/console/vgacon.c:1129: error: implicit 
declaration of function 'vga_readb'
make[4]: *** [drivers/video/console/vgacon.o] Error 1
make[3]: *** [drivers/video/console] Error 2
make[2]: *** [drivers/video] Error 2
make[2]: *** Waiting for unfinished jobs
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-d

Re: [PATCH] perf_counter/powerpc: Fix compilation after perf_counter_overflow change

2009-09-21 Thread Ingo Molnar

* Heiko Carstens  wrote:

> On Mon, Sep 21, 2009 at 09:30:43AM +0200, Ingo Molnar wrote:
> > 
> > * Metzger, Markus T  wrote:
> > 
> > > > >-----Original Message-----
> > > >From: Paul Mackerras [mailto:pau...@samba.org]
> > > >Sent: Monday, September 21, 2009 8:45 AM
> > > 
> > > 
> > > >Markus, please take care in future to mention it in the changelog if
> > > >your patches touch definitions used by other architectures.  If you
> > > >could go so far as to use grep a bit more and fix up other
> > > >architectures' callsites for the things you're changing, that would be
> > > >very much appreciated.  Thanks.
> > > 
> > > I'm sorry I missed that.
> > > 
> > > There's one more place in arch/sparc/.
> > > The below patch should fix it, but I have no means to test it.
> > 
> > You also missed a third thing:
> > 
> > +static inline int
> > +perf_output_begin(struct perf_output_handle *handle, struct perf_counter *c,
> > +		  unsigned int size, int nmi, int sample)	{ }
> > 
> > an 'int' function returning void ...
> > 
> > Plus all the !PERF_COUNTERS branch of empty inlines is pointless - these 
> > facilities are used by perfcounters code only. I fixed that too.
> 
> Hi Ingo,
> 
> did you fix all of these warnings for !PERF_COUNTERS?
> 
> include/linux/perf_counter.h: In function 'perf_output_begin':
> include/linux/perf_counter.h:854: warning: no return statement in function 
> returning non-void
> include/linux/perf_counter.h: At top level:
> include/linux/perf_counter.h:863: warning: 'struct perf_sample_data' declared 
> inside parameter list
> include/linux/perf_counter.h:863: warning: its scope is only this definition 
> or declaration, which is probably not what you want
> include/linux/perf_counter.h:868: warning: 'struct perf_sample_data' declared 
> inside parameter list

Yes. The full commit is below.

Ingo
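
For illustration, the "'int' function returning void" problem quoted above 
can be reproduced and avoided in a few lines of standalone C. This is only a 
minimal sketch under assumptions: the demo_* names and the 
DEMO_PERF_COUNTERS switch are hypothetical stand-ins, not the kernel's 
definitions, and the return value of 0 is chosen arbitrarily. Going by the 
diffstat, the commit below appears to delete the stubs rather than repair 
them.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures; only the shape matters. */
struct demo_output_handle { int unused; };
struct demo_counter { int unused; };

#ifdef DEMO_PERF_COUNTERS
/* With the feature built in, the real implementation lives elsewhere. */
int demo_output_begin(struct demo_output_handle *handle,
		      struct demo_counter *c,
		      unsigned int size, int nmi, int sample);
#else
/*
 * Stub for the feature-disabled build. Because the return type is int,
 * the body has to return a value; an empty "{ }" body is what triggers
 * "no return statement in function returning non-void".
 * Returning 0 is an assumption made for this sketch.
 */
static inline int
demo_output_begin(struct demo_output_handle *handle,
		  struct demo_counter *c,
		  unsigned int size, int nmi, int sample)
{
	(void)handle;
	(void)c;
	(void)size;
	(void)nmi;
	(void)sample;
	return 0;
}
#endif
```

A stub like this only earns its place if code outside the feature calls it; 
when the only callers are inside the feature itself, dropping the stub is 
the simpler fix.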

>
From cd74c86bdf705f824d494a2bbda393d1d562b40a Mon Sep 17 00:00:00 2001
From: Paul Mackerras 
Date: Mon, 21 Sep 2009 16:44:32 +1000
Subject: [PATCH] perf_counter, powerpc, sparc: Fix compilation after 
perf_counter_overflow() change

Commit 5622f295 ("x86, perf_counter, bts: Optimize BTS overflow
handling") removed the regs field from struct perf_sample_data and
added a regs parameter to perf_counter_overflow().  This breaks the
build on powerpc (and Sparc) as reported by Sachin Sant:

  arch/powerpc/kernel/perf_counter.c: In function 'record_and_restart':
  arch/powerpc/kernel/perf_counter.c:1165: error: unknown field 'regs' 
specified in initializer

This adjusts arch/powerpc/kernel/perf_counter.c to correspond with the
new struct perf_sample_data and perf_counter_overflow().

[ v2: also fix Sparc, Markus Metzger  ]

Reported-by: Sachin Sant 
Signed-off-by: Paul Mackerras 
Cc: Markus Metzger 
Cc: David S. Miller 
Cc: b...@kernel.crashing.org
Cc: linuxppc-...@ozlabs.org
Cc: Peter Zijlstra 
LKML-Reference: <19127.8400.376239.586...@drongo.ozlabs.ibm.com>
Signed-off-by: Ingo Molnar 
---
 arch/powerpc/kernel/perf_counter.c |    3 +--
 arch/sparc/kernel/perf_counter.c   |    3 +--
 include/linux/perf_counter.h       |   17 -----------------
 3 files changed, 2 insertions(+), 21 deletions(-)

diff --git a/arch/powerpc/kernel/perf_counter.c b/arch/powerpc/kernel/perf_counter.c
index 7ceefaf..5ccf9bc 100644
--- a/arch/powerpc/kernel/perf_counter.c
+++ b/arch/powerpc/kernel/perf_counter.c
@@ -1162,7 +1162,6 @@ static void record_and_restart(struct perf_counter *counter, unsigned long val,
 	 */
 	if (record) {
 		struct perf_sample_data data = {
-			.regs	= regs,
 			.addr	= 0,
 			.period	= counter->hw.last_period,
 		};
@@ -1170,7 +1169,7 @@ static void record_and_restart(struct perf_counter *counter, unsigned long val,
 		if (counter->attr.sample_type & PERF_SAMPLE_ADDR)
 			perf_get_data_addr(regs, &data.addr);
 
-		if (perf_counter_overflow(counter, nmi, &data)) {
+		if (perf_counter_overflow(counter, nmi, &data, regs)) {
 			/*
 			 * Interrupts are coming too fast - throttle them
 			 * by setting the counter to 0, so it will be
diff --git a/arch/sparc/kernel/perf_counter.c b/arch/sparc/kernel/perf_counter.c
index 09de403..b1265ce 100644
--- a/arch/sparc/kernel/perf_counter.c
+++ b/arch/sparc/kernel/perf_counter.c
@@ -493,7 +493,6 @@ static int __kprobes perf_counter_nmi_handler(struct 
notifier_block

Re: [PATCH] perf_counter/powerpc: Fix compilation after perf_counter_overflow change

2009-09-21 Thread Ingo Molnar

* Paul Mackerras  wrote:

> Commit 5622f295 ("x86, perf_counter, bts: Optimize BTS overflow
> handling") removed the regs field from struct perf_sample_data and
> added a regs parameter to perf_counter_overflow().  This breaks the
> build on powerpc as reported by Sachin Sant:
> 
> arch/powerpc/kernel/perf_counter.c: In function 'record_and_restart':
> arch/powerpc/kernel/perf_counter.c:1165: error: unknown field 'regs' 
> specified in initializer
> cc1: warnings being treated as errors
> arch/powerpc/kernel/perf_counter.c:1165: error: initialization makes integer 
> from pointer without a cast
> arch/powerpc/kernel/perf_counter.c:1173: error: too few arguments to function 
> 'perf_counter_overflow'
> make[1]: *** [arch/powerpc/kernel/perf_counter.o] Error 1
> make: *** [arch/powerpc/kernel] Error 2
> 
> This adjusts arch/powerpc/kernel/perf_counter.c to correspond with the
> new struct perf_sample_data and perf_counter_overflow().
> 
> Reported-by: Sachin Sant 
> Signed-off-by: Paul Mackerras 

Applied, thanks Paul.
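
The interface change described in the quoted changelog (the regs field 
dropped from the sample-data structure and passed as a separate argument 
instead) can be sketched in standalone C. This is a hedged illustration, 
not the kernel code: every demo_* name is hypothetical and the throttling 
condition is invented purely so that the function returns something 
observable.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct pt_regs. */
struct demo_regs {
	unsigned long ip;
};

/* New-style sample data: no 'regs' member any more. */
struct demo_sample_data {
	unsigned long addr;
	unsigned long period;
};

/*
 * New-style overflow hook: the registers arrive as their own argument.
 * The "throttle when period is 0" rule is an assumption for the sketch.
 */
static int demo_counter_overflow(int nmi, const struct demo_sample_data *data,
				 const struct demo_regs *regs)
{
	(void)nmi;
	return data->period == 0 && regs != NULL;
}

/* Caller updated the same way as the powerpc hunk: no .regs initializer. */
static int demo_record_and_restart(const struct demo_regs *regs,
				   unsigned long last_period, int nmi)
{
	struct demo_sample_data data = {
		.addr	= 0,
		.period	= last_period,
	};

	return demo_counter_overflow(nmi, &data, regs);
}
```

A caller that still initializes .regs or still passes three arguments fails 
to compile against definitions like these, which is the class of error in 
the report above.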

> ---
>
> I missed this problem when the "x86, perf_counter, bts: Optimize BTS 
> overflow handling" patch was posted because the headline made it seem 
> entirely x86-specific, and the changes to struct perf_sample_data and 
> perf_counter_overflow() were not mentioned in the changelog.
> 
> Markus, please take care in future to mention it in the changelog if 
> your patches touch definitions used by other architectures.  If you 
> could go so far as to use grep a bit more and fix up other 
> architectures' callsites for the things you're changing, that would be 
> very much appreciated.  Thanks.

Yes, that should be done in general - still, nothing beats actual 
testing.

Paul, you might also want to test the perfcounter bits of -tip on 
PowerPC a bit more frequently - this patch was there for 5 days before i 
sent it to Linus.

Cross-builds didn't catch it as perfcounters isn't enabled by default in 
any of the powerpc defconfigs:

phoenix:~/linux/linux> grep -w CONFIG_PERF_COUNTERS arch/powerpc/configs/*
arch/powerpc/configs/adder875_defconfig:# CONFIG_PERF_COUNTERS is not set
arch/powerpc/configs/c2k_defconfig:# CONFIG_PERF_COUNTERS is not set
arch/powerpc/configs/ep8248e_defconfig:# CONFIG_PERF_COUNTERS is not set
arch/powerpc/configs/ep88xc_defconfig:# CONFIG_PERF_COUNTERS is not set
arch/powerpc/configs/linkstation_defconfig:# CONFIG_PERF_COUNTERS is not set
arch/powerpc/configs/mgcoge_defconfig:# CONFIG_PERF_COUNTERS is not set
arch/powerpc/configs/mgsuvd_defconfig:# CONFIG_PERF_COUNTERS is not set
arch/powerpc/configs/mpc7448_hpc2_defconfig:# CONFIG_PERF_COUNTERS is not set
arch/powerpc/configs/mpc8272_ads_defconfig:# CONFIG_PERF_COUNTERS is not set
arch/powerpc/configs/mpc83xx_defconfig:# CONFIG_PERF_COUNTERS is not set
arch/powerpc/configs/mpc85xx_defconfig:# CONFIG_PERF_COUNTERS is not set
arch/powerpc/configs/mpc85xx_smp_defconfig:# CONFIG_PERF_COUNTERS is not set
arch/powerpc/configs/mpc866_ads_defconfig:# CONFIG_PERF_COUNTERS is not set
arch/powerpc/configs/mpc86xx_defconfig:# CONFIG_PERF_COUNTERS is not set
arch/powerpc/configs/mpc885_ads_defconfig:# CONFIG_PERF_COUNTERS is not set
arch/powerpc/configs/pq2fads_defconfig:# CONFIG_PERF_COUNTERS is not set
arch/powerpc/configs/prpmc2800_defconfig:# CONFIG_PERF_COUNTERS is not set
arch/powerpc/configs/ps3_defconfig:# CONFIG_PERF_COUNTERS is not set
arch/powerpc/configs/storcenter_defconfig:# CONFIG_PERF_COUNTERS is not set

There aren't that many PowerPC users, so any extra testing help would be 
very welcome. Enabling perfcounters in the powerpc defconfigs would be 
helpful as well.

Thanks,

Ingo


Re: [PATCH] perf_event, powerpc: Fix compilation after big perf_counter rename

2009-09-22 Thread Ingo Molnar

* Benjamin Herrenschmidt  wrote:

> On Tue, 2009-09-22 at 09:48 +1000, Paul Mackerras wrote:
>
> > This fixes two places in the powerpc perf_event (perf_counter) code 
> > where 'list_entry' needs to be changed to 'group_entry', but were 
> > missed in commit 65abc865 ("perf_counter: Rename list_entry -> 
> > group_entry, counter_list -> group_list").

Oops, indeed - queued up the fix and will send it to Linus shortly - 
thanks!

> Ingo: This is becoming a recurring one now... powerpc build upstream 
> is broken approx everyday by some new perfctr build breakage.
>
> You really aren't build testing other architectures than x86 right ?

On the contrary - i am build testing every architecture on a daily 
basis. (and sometimes i do it multiple times a day - yesterday i did 5 
cross builds during the rename) In fact i am testing more architectures 
than linux-next does.

Here's the log of the test i ran yesterday before i sent those bits to 
Linus:

testing 24 architectures.
                                 (warns)               (warns)
testing      alpha:  -git:  pass (   24),  -tip:  pass (   24)
testing        arm:  -git:  fail (   11),  -tip:  fail (   13)
testing   blackfin:  -git:  pass (    3),  -tip:  pass (    3)
testing       cris:  -git:  fail (   34),  -tip:  pass (   20)
testing        frv:  -git:  fail (   13),  -tip:  fail (   13)
testing      h8300:  -git:  fail (  441),  -tip:  fail (  185)
testing       i386:  -git:  pass (    2),  -tip:  pass (    5)
testing       ia64:  -git:  fail (  172),  -tip:  pass (  160)
testing       m32r:  -git:  pass (   39),  -tip:  pass (   39)
testing       m68k:  -git:  pass (   42),  -tip:  pass (   42)
testing  m68knommu:  -git:  fail (   80),  -tip:  fail (   80)
testing microblaze:  -git:  fail (   14),  -tip:  fail (   14)
testing       mips:  -git:  pass (    6),  -tip:  pass (    6)
testing    mn10300:  -git:  fail (   10),  -tip:  fail (   10)
testing     parisc:  -git:  pass (   26),  -tip:  pass (   26)
testing    powerpc:  -git:  fail (   36),  -tip:  fail (   45)
testing       s390:  -git:  pass (    6),  -tip:  pass (    6)
testing      score:  -git:  fail (   13),  -tip:  fail (   13)
testing         sh:  -git:  fail (   22),  -tip:  fail (   19)
testing      sparc:  -git:  pass (    3),  -tip:  pass (    3)
testing         um:  -git:  pass (    3),  -tip:  pass (    3)
testing     xtensa:  -git:  fail (   46),  -tip:  fail (   46)
testing     x86-64:  -git:  pass (    0),  -tip:  pass (    0)
testing     x86-32:  -git:  pass (    0),  -tip:  pass (    0)

In fact there are architectures that don't build in Linus's tree and 
build in -tip:

testing       cris:  -git:  fail (   34),  -tip:  pass (   20)

Because not only do i test every architecture i also try to fix upstream 
bugs on non-x86 pro-actively. See for example this upstream fix:

 8d7ac69: Blackfin: Fix link errors with binutils 2.19 and GCC 4.3

Nevertheless you are right that i should have caught this particular 
PowerPC build bug - i missed it - sorry about that!

Thanks,

Ingo


Re: [PATCH] perf_event, powerpc: Fix compilation after big perf_counter rename

2009-09-22 Thread Ingo Molnar

* Benjamin Herrenschmidt  wrote:

> On Tue, 2009-09-22 at 09:28 +0200, Ingo Molnar wrote:
> > 
> > Nevertheless you are right that i should have caught this particular 
> > PowerPC build bug - i missed it - sorry about that!
> 
> All right. Well, to help in general, we are setting up a build-bot here 
> too that will build -tip HEAD for at least powerpc daily with a few 
> configs too.

Cool, that's really useful! That will be especially helpful during the 
weekends: in that timeframe linux-next driven testing has a latency of 
72-95 hours and -tip usually has an uptick in patches.

Thanks,

Ingo


Re: [PATCH] perf_event, powerpc: Fix compilation after big perf_counter rename

2009-09-23 Thread Ingo Molnar

* Michael Ellerman  wrote:

> On Tue, 2009-09-22 at 18:00 +1000, Benjamin Herrenschmidt wrote:
> > On Tue, 2009-09-22 at 09:28 +0200, Ingo Molnar wrote:
> > > 
> > > Nevertheless you are right that i should have caught this particular 
> > > PowerPC build bug - i missed it - sorry about that!
> > > 
> > All right. Well, to help in general, we are setting up a build-bot
> > here too that will build -tip HEAD for at least powerpc daily with
> > a few configs too.
> 
> Results here:
> 
> http://kisskb.ellerman.id.au/kisskb/branch/12/

OK, seems green for today - of the two failures, one appears to be a 
powerpc toolchain problem and the other is a mainline warning.

Btw., for me to be able to notice failures there it would have to email 
me automatically if there are any -tip build failures that do not occur 
with the upstream branch. Does it have such a feature?

Ingo
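
The filter being asked for here (mail only when -tip fails but the upstream 
branch builds) can be written down as a small decision function. This is a 
sketch under assumptions: the demo_* names are hypothetical, and nothing 
here reflects how the build-bot is actually implemented.

```c
#include <stdbool.h>

/* Possible outcomes when the same config is built on both branches. */
enum demo_verdict {
	DEMO_BOTH_PASS,
	DEMO_UPSTREAM_FAILURE,	/* broken upstream as well */
	DEMO_FIXED_IN_TIP,	/* upstream fails, -tip builds */
	DEMO_TIP_REGRESSION,	/* upstream builds, -tip fails */
};

static enum demo_verdict demo_classify(bool upstream_pass, bool tip_pass)
{
	if (upstream_pass && tip_pass)
		return DEMO_BOTH_PASS;
	if (upstream_pass)
		return DEMO_TIP_REGRESSION;
	if (tip_pass)
		return DEMO_FIXED_IN_TIP;
	return DEMO_UPSTREAM_FAILURE;
}

/* Only a failure that -tip introduced is worth a mail to the -tip maintainer. */
static bool demo_should_mail_tip(bool upstream_pass, bool tip_pass)
{
	return demo_classify(upstream_pass, tip_pass) == DEMO_TIP_REGRESSION;
}
```

A failure that also occurs upstream is classified separately, so it would 
not generate -tip mail under this rule.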



Re: [PATCH] perf_event, powerpc: Fix compilation after big perf_counter rename

2009-09-24 Thread Ingo Molnar

* Michael Ellerman  wrote:

> On Wed, 2009-09-23 at 14:44 +0200, Ingo Molnar wrote:
> > * Michael Ellerman  wrote:
> > 
> > > On Tue, 2009-09-22 at 18:00 +1000, Benjamin Herrenschmidt wrote:
> > > > On Tue, 2009-09-22 at 09:28 +0200, Ingo Molnar wrote:
> > > > > 
> > > > > Nevertheless you are right that i should have caught this particular 
> > > > > PowerPC build bug - i missed it - sorry about that!
> > > > > 
> > > > All right. Well, to help in general, we are setting up a build-bot
> > > > here too that will build -tip HEAD for at least powerpc daily with
> > > > a few configs too.
> > > 
> > > Results here:
> > > 
> > > http://kisskb.ellerman.id.au/kisskb/branch/12/
> > 
> > ok, seems green for today - the two failures are: one a powerpc 
> > toolchain problem it appears, plus a mainline warning.
> 
> Yep that looks more or less normal.
> 
> > Btw., for me to be able to notice failures there it would have to 
> > email me automatically if there are any -tip build failures that do 
> > not occur with the upstream branch. Does it have such a feature?
> 
> Not really, it sends mails to me, but it doesn't have a way to filter 
> them by branch. I think the plan is we'll keep an eye on it and either 
> send you patches or at least let you know that it's broken.

How many mails are those per day, typically? If there aren't too many and 
if there's a way to send all of them to me, I could post-filter them for 
-tip relevance, if that is feasible. You bouncing them to me later is 
certainly also a solution (but it lengthens the latency of fixes, 
obviously).

Ingo


Re: [PATCH] perf_event, powerpc: Fix compilation after big perf_counter rename

2009-10-01 Thread Ingo Molnar

* Stephen Rothwell  wrote:

> Hi Ingo,
> 
> On Thu, 24 Sep 2009 23:25:55 +1000 Michael Ellerman  
> wrote:
> >
> > Give me a day or two, I should be able to add a per-branch setting for
> > who to send mails to without too much trouble.
> 
> In the meantime I don't know if someone has pointed you at these today:
> 
> http://kisskb.ellerman.id.au/kisskb/branch/12/

That's an upstream warning.

-tip supports fail-on-build-warnings build mode (for the whole kernel) 
via the CONFIG_ALLOW_WARNINGS .config setting. So if you do allnoconfig 
builds, make sure you turn on CONFIG_ALLOW_WARNINGS=y to get the same 
build behavior as with Linus's tree.

Ingo


Re: [GIT PULL 00/21] perf/core improvements and fixes

2018-08-02 Thread Ingo Molnar


* Arnaldo Carvalho de Melo  wrote:

> Hi Ingo,
> 
>   Please consider pulling, contains a recently merged
> tip/perf/urgent,
> 
> - Arnaldo
> 
> Test results at the end of this message, as usual.
> 
> The following changes since commit c2586cfbb905939b79b49a9121fb0a59a5668fd6:
> 
>   Merge remote-tracking branch 'tip/perf/urgent' into perf/core (2018-07-31 
> 09:55:45 -0300)
> 
> are available in the Git repository at:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git 
> tags/perf-core-for-mingo-4.19-20180801
> 
> for you to fetch changes up to b912885ab75c7c8aa841c615108afd755d0b97f8:
> 
>   perf trace: Do not require --no-syscalls to suppress strace like output 
> (2018-08-01 16:20:28 -0300)
> 
> 
> perf/core improvements and fixes:
> 
> perf trace: (Arnaldo Carvalho de Melo)
> 
> - Do not require --no-syscalls to suppress strace like output, i.e.
> 
>  # perf trace -e sched:*switch
> 
>   will show just sched:sched_switch events, not strace-like formatted
>   syscall events, use --syscalls to get the previous behaviour.
> 
>   If instead:
> 
>  # perf trace
> 
>   is used, i.e. no events specified, then --syscalls is implied and
>   system wide strace like formatting will be applied to all syscalls.
> 
>   The behaviour when just a syscall subset is used with '-e' is unchanged:
> 
>  # perf trace -e *sleep,sched:*switch
> 
>   will work as before: just the 'nanosleep' syscall will be strace-like
>   formatted plus the sched:sched_switch tracepoint event, system wide.
> 
> - Allow string table generators to use a default header dir, allowing
>   use of them without parameters to see the table it generates on
>   stdout, e.g.:
> 
> $ tools/perf/trace/beauty/kvm_ioctl.sh
> static const char *kvm_ioctl_cmds[] = {
> [0x00] = "GET_API_VERSION",
> [0x01] = "CREATE_VM",
> [0x02] = "GET_MSR_INDEX_LIST",
> [0x03] = "CHECK_EXTENSION",
> 
> [0xe0] = "CREATE_DEVICE",
> [0xe1] = "SET_DEVICE_ATTR",
> [0xe2] = "GET_DEVICE_ATTR",
> [0xe3] = "HAS_DEVICE_ATTR",
> };
> $
> 
>   See 'ls tools/perf/trace/beauty/*.sh' to see the available string
>   table generators.
> 
> - Add a generator for IPPROTO_ socket's protocol constants.
> 
> perf record: (Kan Liang)
> 
> - Fix error out while applying initial delay and using LBR, due to
>   the use of a PERF_TYPE_SOFTWARE/PERF_COUNT_SW_DUMMY event to track
>   PERF_RECORD_MMAP events while waiting for the initial delay. Such
>   events fail when configured asking PERF_SAMPLE_BRANCH_STACK in
>   perf_event_attr.sample_type.
> 
> perf c2c: (Jiri Olsa)
> 
> - Fix report crash for empty browser, when processing a perf.data file
>   without events of interest, either because not asked for in
>   'perf record' or because the workload didn't trigger such events.
> 
> perf list: (Michael Petlan)
> 
> - Align metric group description format with PMU event description.
> 
> perf tests: (Sandipan Das)
> 
> - Fix indexing when invoking subtests, which caused BPF tests to
>   get results for the next test in the list, with the last one
>   reporting a failure.
> 
> eBPF:
> 
> - Fix installation directory for header files included from eBPF proggies,
>   avoiding clashing with relative paths used to build other software projects
>   such as glibc. (Thomas Richter)
> 
> - Show better message when failing to load an object. (Arnaldo Carvalho de Melo)
> 
> General: (Christophe Leroy)
> 
> - Allow overriding MAX_NR_CPUS at compile time, to make the tooling
>   usable in systems with less memory, in time this has to be changed
>   to properly allocate based on _NPROCESSORS_ONLN.
> 
> Architecture specific:
> 
> - Update arm64's ThunderX2 implementation defined pmu core events (Ganapatrao 
> Kulkarni)
> 
> - Fix complex event name parsing in 'perf test' for PowerPC, where the
>   'umask' event modifier isn't present. (Sandipan Das)
> 
> CoreSight ARM hardware tracing: (Leo Yan)
> 
> - Fix start tracing packet handling.
> 
> - Support dummy address value for CS_ETM_TRACE_ON packet.
> 
> - Generate branch sample when receiving a CS_ETM_TRACE_ON packet.
> 
> - Generate branch sample for CS_ETM_TRACE_ON packet.
> 
> Signed-off-by: Arnaldo Carvalho de Melo 
> 
> 
> Arnaldo Carvalho de Melo (9):
>   perf trace beauty: Default header_dir to cwd to work without parms
>   tools include uapi: Grab a copy of linux/in.h
>   perf beauty: Add a generator for IPPROTO_ socket's protocol constants
>   perf trace beauty: Do not print NULL strarray entries
>   perf trace beauty: Add beautifiers for 'socket''s 'protocol' arg
>   perf trace: Beautify the AF_INET & AF_INET6 'socket' syscall 'protocol' 
> args
>   perf bpf: Show better message when failing to load an object
>   perf bpf: Include uapi/linux/bpf.h from the 'perf trace' script's bp

Re: [PATCH v6 00/11] hugetlb: Factorize hugetlb architecture primitives

2018-08-07 Thread Ingo Molnar
  | 54 ++---
>  arch/sparc/include/asm/hugetlb.h | 40 +++--
>  arch/x86/include/asm/hugetlb.h   | 69 --
>  include/asm-generic/hugetlb.h| 88 
> +++-
>  15 files changed, 135 insertions(+), 394 deletions(-)

The x86 bits look good to me (assuming it's all tested on all relevant 
architectures, etc.)

Acked-by: Ingo Molnar 

Thanks,

Ingo


Re: [PATCH 2/2] x86, powerpc: remove -funit-at-a-time compiler option entirely

2018-11-11 Thread Ingo Molnar


* Masahiro Yamada  wrote:

> GCC 4.6 manual says:
> 
> -funit-at-a-time
>   This option is left for compatibility reasons. -funit-at-a-time has
>   no effect, while -fno-unit-at-a-time implies -fno-toplevel-reorder
>   and -fno-section-anchors.
>   Enabled by default.
> 
> Signed-off-by: Masahiro Yamada 
> ---
> 
>  arch/powerpc/Makefile | 4 ----
>  arch/x86/Makefile     | 4 ----
>  arch/x86/Makefile.um  | 5 -----
>  3 files changed, 13 deletions(-)
> 
> diff --git a/arch/x86/Makefile b/arch/x86/Makefile
> index 88398fd..3508049 100644
> --- a/arch/x86/Makefile
> +++ b/arch/x86/Makefile
> @@ -130,10 +130,6 @@ else
>  
>  KBUILD_CFLAGS += -mno-red-zone
>  KBUILD_CFLAGS += -mcmodel=kernel
> -
> -# -funit-at-a-time shrinks the kernel .text considerably
> -# unfortunately it makes reading oopses harder.
> -KBUILD_CFLAGS += $(call cc-option,-funit-at-a-time)
>  endif
>  
>  ifdef CONFIG_X86_X32
> diff --git a/arch/x86/Makefile.um b/arch/x86/Makefile.um
> index 577976b..1db7913 100644
> --- a/arch/x86/Makefile.um
> +++ b/arch/x86/Makefile.um
> @@ -26,9 +26,6 @@ cflags-y += $(call cc-option,-mpreferred-stack-boundary=2)
>  # an unresolved reference.
>  cflags-y += -ffreestanding
>  
> -# gcc 4.3.0 needs -funit-at-a-time for extern inline functions.
> -KBUILD_CFLAGS += $(call cc-option,-funit-at-a-time)
> -
>  KBUILD_CFLAGS += $(cflags-y)
>  
>  else
> @@ -50,6 +47,4 @@ ELF_FORMAT := elf64-x86-64
>  LINK-$(CONFIG_LD_SCRIPT_DYN) += -Wl,-rpath,/lib64
>  LINK-y += -m64
>  
> -# Do unit-at-a-time unconditionally on x86_64, following the host
> -KBUILD_CFLAGS += $(call cc-option,-funit-at-a-time)
>  endif

Acked-by: Ingo Molnar 

Thanks,

Ingo


Re: [PATCH 02/17] x86: Add support for ZSTD-compressed kernel

2018-11-11 Thread Ingo Molnar


* Adam Borowski  wrote:

> From: Nick Terrell 
> 
> Integrates the ZSTD decompression code to the x86 pre-boot code.
> 
> Zstandard requires slightly more memory during the kernel decompression
> on x86 (192 KB vs 64 KB), and the memory usage is independent of the
> window size.
> 
> Zstandard requires memory proportional to the window size used during
> compression for decompressing the ramdisk image, since streaming mode is
> used. Newer versions of zstd (1.3.2+) list the window size of a file
> with `zstd -lv '. The absolute maximum amount of memory required
> is just over 8 MB.
> 
> Signed-off-by: Nick Terrell 
> ---
>  Documentation/x86/boot.txt| 6 +++---
>  arch/x86/Kconfig  | 1 +
>  arch/x86/boot/compressed/Makefile | 5 -
>  arch/x86/boot/compressed/misc.c   | 4 
>  arch/x86/boot/header.S| 8 +++-
>  arch/x86/include/asm/boot.h   | 6 --
>  6 files changed, 23 insertions(+), 7 deletions(-)

Acked-by: Ingo Molnar 

> diff --git a/arch/x86/boot/header.S b/arch/x86/boot/header.S
> index 4c881c850125..af2efb256527 100644
> --- a/arch/x86/boot/header.S
> +++ b/arch/x86/boot/header.S
> @@ -526,8 +526,14 @@ pref_address:.quad LOAD_PHYSICAL_ADDR
> # preferred load addr
>  # the size-dependent part now grows so fast.
>  #
>  # extra_bytes = (uncompressed_size >> 8) + 65536
> +#
> +# ZSTD compressed data grows by at most 3 bytes per 128K, and only has a 22
> +# byte fixed overhead but has a maximum block size of 128K, so it needs a
> +# larger margin.
> +#
> +# extra_bytes = (uncompressed_size >> 8) + 131072
>  
> -#define ZO_z_extra_bytes ((ZO_z_output_len >> 8) + 65536)
> +#define ZO_z_extra_bytes ((ZO_z_output_len >> 8) + 131072)

This change would also affect other decompressors, not just ZSTD, 
correct?

Might want to split this change out into a separate preparatory patch to 
allow it to be bisected to, or at least mention it in the changelog more 
explicitly?

Thanks,

Ingo
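
To make the size of the change concrete, the extra_bytes formula from the 
quoted hunk can be evaluated for both margins. The sketch below is plain 
userspace C; the demo_* names are hypothetical, and only the formula and 
the two constants (65536 and 131072) are taken from the patch.

```c
/* Fixed margins from the quoted hunk: before and after the patch. */
#define DEMO_OLD_MARGIN	65536UL
#define DEMO_NEW_MARGIN	131072UL

/*
 * extra_bytes = (uncompressed_size >> 8) + margin
 *
 * The size-dependent part is 1/256 of the uncompressed size; the margin
 * is the fixed part that the patch doubles.
 */
static unsigned long demo_extra_bytes(unsigned long uncompressed_size,
				      unsigned long margin)
{
	return (uncompressed_size >> 8) + margin;
}
```

For a 32 MiB (33554432 byte) uncompressed image this gives 196608 bytes 
with the old margin and 262144 bytes with the new one, i.e. 64 KiB more. 
The hunk changes the one definition of ZO_z_extra_bytes, which is why the 
question above about other decompressors arises.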


Re: [PATCH v9 0/6] add support for relative references in special sections

2018-07-03 Thread Ingo Molnar


* Ard Biesheuvel  wrote:

> On 27 June 2018 at 17:15, Will Deacon  wrote:
> > Hi Ard,
> >
> > On Tue, Jun 26, 2018 at 08:27:55PM +0200, Ard Biesheuvel wrote:
> >> This adds support for emitting special sections such as initcall arrays,
> >> PCI fixups and tracepoints as relative references rather than absolute
> >> references. This reduces the size by 50% on 64-bit architectures, but
> >> more importantly, it removes the need for carrying relocation metadata
> >> for these sections in relocatable kernels (e.g., for KASLR) that needs
> >> to be fixed up at boot time. On arm64, this reduces the vmlinux footprint
> >> of such a reference by 8x (8 byte absolute reference + 24 byte RELA entry
> >> vs 4 byte relative reference)
> >>
> >> Patch #3 was sent out before as a single patch. This series supersedes
> >> the previous submission. This version makes relative ksymtab entries
> >> dependent on the new Kconfig symbol HAVE_ARCH_PREL32_RELOCATIONS rather
> >> than trying to infer from kbuild test robot replies for which architectures
> >> it should be blacklisted.
> >>
> >> Patch #1 introduces the new Kconfig symbol HAVE_ARCH_PREL32_RELOCATIONS,
> >> and sets it for the main architectures that are expected to benefit the
> >> most from this feature, i.e., 64-bit architectures or ones that use
> >> runtime relocations.
> >>
> >> Patch #2 add support for #define'ing __DISABLE_EXPORTS to get rid of
> >> ksymtab/kcrctab sections in decompressor and EFI stub objects when
> >> rebuilding existing C files to run in a different context.
> >
> > I had a small question on patch 3, but it's really for my understanding.
> > So, for patches 1-3:
> >
> > Reviewed-by: Will Deacon 
> >
> 
> Thanks all.
> 
> Thomas, Ingo,
> 
> Except for the below tweak against patch #3 for powerpc, which may
> apparently get confused by an input section called .discard without
> any suffixes, this series is good to go, but requires your ack to
> proceed, so I would like to ask you to share your comments and/or
> objections. Also, any suggestions or recommendations regarding the
> route these patches should take are highly appreciated.

LGTM:

Acked-by: Ingo Molnar 

Regarding route - I suspect -mm would be good, or any other tree that does a 
lot of cross-arch testing?

Thanks,

Ingo
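
As a rough illustration of the relative-reference idea in the quoted cover 
letter: instead of storing an 8-byte pointer (which needs a relocation 
entry in a relocatable kernel), each entry stores a 32-bit offset from its 
own address to the target. The sketch below is userspace C with 
hypothetical demo_* names; it is not the kernel's implementation, and it 
assumes the target lies within 2 GiB of the entry.

```c
#include <stdint.h>

/* One place-relative reference: 4 bytes, no relocation needed. */
struct demo_rel_ref {
	int32_t offset;
};

/* Store 'target' as a signed distance from the entry itself. */
static void demo_rel_set(struct demo_rel_ref *ref, const void *target)
{
	ref->offset = (int32_t)((intptr_t)target - (intptr_t)ref);
}

/* Recover the absolute address: entry address plus stored offset. */
static void *demo_rel_get(const struct demo_rel_ref *ref)
{
	return (void *)((intptr_t)ref + ref->offset);
}

/* Sizes quoted in the cover letter for arm64. */
enum {
	DEMO_ABS_REF_BYTES = 8,		/* absolute reference */
	DEMO_RELA_BYTES    = 24,	/* RELA entry describing it */
	DEMO_REL_REF_BYTES = 4,		/* relative reference */
};

static unsigned int demo_footprint_ratio(void)
{
	return (DEMO_ABS_REF_BYTES + DEMO_RELA_BYTES) / DEMO_REL_REF_BYTES;
}
```

The ratio works out to (8 + 24) / 4 = 8, matching the "8x" figure in the 
quoted text.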


Re: [PATCH] watchdog/softlockup: Fix SOFTLOCKUP_DETECTOR=n build

2018-07-10 Thread Ingo Molnar


* Peter Zijlstra  wrote:

> On Mon, Jul 09, 2018 at 11:40:14PM +0530, Abdul Haleem wrote:
> 
> > Thanks Peter for the patch, build and boot is fine.
> > 
> > Reported-and-tested-by: Abdul Haleem 
> 
> Excellent, Ingo can you stick this in?

Sure, done!

Thanks,

Ingo


Re: [GIT PULL 0/5] perf/urgent fixes

2018-07-30 Thread Ingo Molnar


* Arnaldo Carvalho de Melo  wrote:

> Hi Ingo,
> 
>   Please consider pulling, just to get the build without warnings
> and finishing successfully in all my test environments,
> 
> - Arnaldo
> 
> Test results at the end of this message, as usual.
> 
> The following changes since commit 7f635ff187ab6be0b350b3ec06791e376af238ab:
> 
>   perf/core: Fix crash when using HW tracing kernel filters (2018-07-25 
> 11:46:22 +0200)
> 
> are available in the Git repository at:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git 
> tags/perf-urgent-for-mingo-4.18-20180730
> 
> for you to fetch changes up to 44fe619b1418ff4e9d2f9518a940fbe2fb686a08:
> 
>   perf tools: Fix the build on the alpine:edge distro (2018-07-30 13:15:03 
> -0300)
> 
> 
> perf/urgent fixes: (Arnaldo Carvalho de Melo)
> 
> - Update the tools copy of several files, including perf_event.h,
>   powerpc's asm/unistd.h (new io_pgetevents syscall), bpf.h and
>   x86's memcpy_64.s (used in 'perf bench mem'), silencing the
>   respective warnings during the perf tools build.
> 
> - Fix the build on the alpine:edge distro.
> 
> Signed-off-by: Arnaldo Carvalho de Melo 
> 
> 
> Arnaldo Carvalho de Melo (5):
>   tools headers uapi: Update tools's copy of linux/perf_event.h
>   tools headers powerpc: Update asm/unistd.h copy to pick new
>   tools headers uapi: Refresh linux/bpf.h copy
>   tools arch: Update arch/x86/lib/memcpy_64.S copy used in 'perf bench 
> mem memcpy'
>   perf tools: Fix the build on the alpine:edge distro
> 
>  tools/arch/powerpc/include/uapi/asm/unistd.h |   1 +
>  tools/arch/x86/include/asm/mcsafe_test.h |  13 
>  tools/arch/x86/lib/memcpy_64.S   | 112 
> +--
>  tools/include/uapi/linux/bpf.h   |  28 +--
>  tools/include/uapi/linux/perf_event.h|   2 +
>  tools/perf/arch/x86/util/pmu.c   |   1 +
>  tools/perf/arch/x86/util/tsc.c   |   1 +
>  tools/perf/bench/Build   |   1 +
>  tools/perf/bench/mem-memcpy-x86-64-asm.S |   1 +
>  tools/perf/bench/mem-memcpy-x86-64-lib.c |  24 ++
>  tools/perf/perf.h|   1 +
>  tools/perf/util/header.h |   1 +
>  tools/perf/util/namespaces.h |   1 +
>  13 files changed, 124 insertions(+), 63 deletions(-)
>  create mode 100644 tools/arch/x86/include/asm/mcsafe_test.h
>  create mode 100644 tools/perf/bench/mem-memcpy-x86-64-lib.c

Pulled, thanks a lot Arnaldo!

Ingo


Re: [PATCH tip/core/rcu 2/3] srcu: Force full grace-period ordering

2017-01-14 Thread Ingo Molnar

* Paul E. McKenney  wrote:

> diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
> index 357b32aaea48..5fdfe874229e 100644
> --- a/include/linux/rcupdate.h
> +++ b/include/linux/rcupdate.h
> @@ -1175,11 +1175,11 @@ do { \
>   * if the UNLOCK and LOCK are executed by the same CPU or if the
>   * UNLOCK and LOCK operate on the same lock variable.
>   */
> -#ifdef CONFIG_PPC
> +#ifdef CONFIG_ARCH_WEAK_RELACQ
>  #define smp_mb__after_unlock_lock()  smp_mb()  /* Full ordering for lock. */
> -#else /* #ifdef CONFIG_PPC */
> +#else /* #ifdef CONFIG_ARCH_WEAK_RELACQ */
>  #define smp_mb__after_unlock_lock()  do { } while (0)
> -#endif /* #else #ifdef CONFIG_PPC */
> +#endif /* #else #ifdef CONFIG_ARCH_WEAK_RELACQ */
>  
>  

So at the risk of sounding totally pedantic, why not structure it like the 
existing smp_mb__before/after*() primitives in barrier.h?

That allows asm-generic/barrier.h to pick up the definition - for example in
the case of smp_acquire__after_ctrl_dep() we do:

 #ifndef smp_acquire__after_ctrl_dep
 #define smp_acquire__after_ctrl_dep()   smp_rmb()
 #endif

Which allows Tile to relax it:

  arch/tile/include/asm/barrier.h:#define smp_acquire__after_ctrl_dep()   barrier()

I.e. I'd move the API definition out of rcupdate.h and into barrier.h - even 
though tree-RCU is the only user of this barrier type.
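
For illustration only, a minimal user-space sketch of that override pattern: the "arch" part defines one primitive and the "generic" part fills in whatever is still undefined. The demo_* names and the redefinition of smp_mb()/smp_rmb() as counting stubs are assumptions made so the example builds outside the kernel; this is not how the kernel implements these barriers.

```c
/* Counting stubs standing in for the real barriers (assumption). */
static int demo_mb_calls;
static int demo_rmb_calls;
#define smp_mb()	do { demo_mb_calls++; } while (0)
#define smp_rmb()	do { demo_rmb_calls++; } while (0)

/* What a weakly ordered arch's asm/barrier.h could provide: */
#define smp_mb__after_unlock_lock()	smp_mb()

/* What asm-generic/barrier.h would then fall back to: */
#ifndef smp_mb__after_unlock_lock
#define smp_mb__after_unlock_lock()	do { } while (0)
#endif

#ifndef smp_acquire__after_ctrl_dep
#define smp_acquire__after_ctrl_dep()	smp_rmb()
#endif

/* Issue both primitives once; the counters show which definition won. */
static void demo_issue_barriers(void)
{
	smp_mb__after_unlock_lock();	/* arch override: full barrier */
	smp_acquire__after_ctrl_dep();	/* generic default: read barrier */
}
```

An architecture that needs nothing extra would simply omit its #define and get the generic no-op.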

Thanks,

Ingo


Re: [PATCH tip/core/rcu 2/3] srcu: Force full grace-period ordering

2017-01-14 Thread Ingo Molnar

* Paul E. McKenney  wrote:

> On Sun, Jan 15, 2017 at 08:11:23AM +0100, Ingo Molnar wrote:
> > 
> > * Paul E. McKenney  wrote:
> > 
> > > diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
> > > index 357b32aaea48..5fdfe874229e 100644
> > > --- a/include/linux/rcupdate.h
> > > +++ b/include/linux/rcupdate.h
> > > @@ -1175,11 +1175,11 @@ do { \
> > >   * if the UNLOCK and LOCK are executed by the same CPU or if the
> > >   * UNLOCK and LOCK operate on the same lock variable.
> > >   */
> > > -#ifdef CONFIG_PPC
> > > +#ifdef CONFIG_ARCH_WEAK_RELACQ
> > >  #define smp_mb__after_unlock_lock()  smp_mb()  /* Full ordering for lock. */
> > > -#else /* #ifdef CONFIG_PPC */
> > > +#else /* #ifdef CONFIG_ARCH_WEAK_RELACQ */
> > >  #define smp_mb__after_unlock_lock()  do { } while (0)
> > > -#endif /* #else #ifdef CONFIG_PPC */
> > > +#endif /* #else #ifdef CONFIG_ARCH_WEAK_RELACQ */
> > >  
> > >  
> > 
> > So at the risk of sounding totally pedantic, why not structure it like the 
> > existing smp_mb__before/after*() primitives in barrier.h?
> > 
> > That allows asm-generic/barrier.h to pick up the definition - for example in
> > the case of smp_acquire__after_ctrl_dep() we do:
> > 
> >  #ifndef smp_acquire__after_ctrl_dep
> >  #define smp_acquire__after_ctrl_dep()   smp_rmb()
> >  #endif
> > 
> > Which allows Tile to relax it:
> > 
> >   arch/tile/include/asm/barrier.h:#define smp_acquire__after_ctrl_dep()   barrier()
> > 
> > I.e. I'd move the API definition out of rcupdate.h and into barrier.h - even
> > though tree-RCU is the only user of this barrier type.
> 
> I wouldn't have any problem with that, however, some time back it was
> moved into RCU because (you guessed it!) RCU is the only user.  ;-)

Indeed ...

[sounds of rummaging around in the Git tree]

I found this commit of yours from ancient history (more than a year ago!):

  commit 12d560f4ea87030667438a169912380be00cea4b
  Author: Paul E. McKenney 
  Date:   Tue Jul 14 18:35:23 2015 -0700

rcu,locking: Privatize smp_mb__after_unlock_lock()

RCU is the only thing that uses smp_mb__after_unlock_lock(), and is
likely the only thing that ever will use it, so this commit makes this
macro private to RCU.

Signed-off-by: Paul E. McKenney 
Cc: Will Deacon 
Cc: Peter Zijlstra 
Cc: Benjamin Herrenschmidt 
Cc: "linux-a...@vger.kernel.org" 

So I concur and I'm fine with your patch - or with the status quo code as well.

Thanks,

Ingo


Re: [PATCH tip/core/rcu 2/3] srcu: Force full grace-period ordering

2017-01-15 Thread Ingo Molnar

* Paul E. McKenney  wrote:

> > [sounds of rummaging around in the Git tree]
> > 
> > I found this commit of yours from ancient history (more than a year ago!):
> > 
> >   commit 12d560f4ea87030667438a169912380be00cea4b
> >   Author: Paul E. McKenney 
> >   Date:   Tue Jul 14 18:35:23 2015 -0700
> > 
> > rcu,locking: Privatize smp_mb__after_unlock_lock()
> > 
> > RCU is the only thing that uses smp_mb__after_unlock_lock(), and is
> > likely the only thing that ever will use it, so this commit makes this
> > macro private to RCU.
> > 
> > Signed-off-by: Paul E. McKenney 
> > Cc: Will Deacon 
> > Cc: Peter Zijlstra 
> > Cc: Benjamin Herrenschmidt 
> > Cc: "linux-a...@vger.kernel.org" 
> > 
> > So I concur and I'm fine with your patch - or with the status quo code as well.
> 
> I already have the patch queued, so how about I keep it if I get an ack
> from the powerpc guys and drop it otherwise?

Yeah, sounds good! Your patch made me look up 'RelAcq' so it has documentation 
value as well ;-)

Thanks,

Ingo


Re: [PATCH tip/core/rcu 2/3] srcu: Force full grace-period ordering

2017-01-15 Thread Ingo Molnar

* Paul E. McKenney  wrote:

> On Sun, Jan 15, 2017 at 10:40:58AM +0100, Ingo Molnar wrote:
> > 
> > * Paul E. McKenney  wrote:
> > 
> > > > [sounds of rummaging around in the Git tree]
> > > > 
> > > > I found this commit of yours from ancient history (more than a year ago!):
> > > > 
> > > >   commit 12d560f4ea87030667438a169912380be00cea4b
> > > >   Author: Paul E. McKenney 
> > > >   Date:   Tue Jul 14 18:35:23 2015 -0700
> > > > 
> > > > rcu,locking: Privatize smp_mb__after_unlock_lock()
> > > > 
> > > > RCU is the only thing that uses smp_mb__after_unlock_lock(), and is
> > > > likely the only thing that ever will use it, so this commit makes this
> > > > macro private to RCU.
> > > > 
> > > > Signed-off-by: Paul E. McKenney 
> > > > Cc: Will Deacon 
> > > > Cc: Peter Zijlstra 
> > > > Cc: Benjamin Herrenschmidt 
> > > > Cc: "linux-a...@vger.kernel.org" 
> > > > 
> > > > So I concur and I'm fine with your patch - or with the status quo code 
> > > > as well.
> > > 
> > > I already have the patch queued, so how about I keep it if I get an ack
> > > from the powerpc guys and drop it otherwise?
> > 
> > Yeah, sounds good! Your patch made me look up 'RelAcq' so it has
> > documentation value as well ;-)
> 
> ;-) ;-) ;-)
> 
> Looking forward, my guess would be that if some other code needs
> smp_mb__after_unlock_lock() or if some other architecture needs
> non-smp_mb() special handling, I should consider making it work the
> same as smp_mb__after_atomic() and friends.  Does that seem like a
> reasonable thought?

Yeah, absolutely - it's just that the pattern triggered the 'this looks a bit
too specialized' response in me, but after seeing the details (again ...) I
agree that this time is different!

Thanks,

Ingo


Re: [PATCH 1/3] kprobes: introduce weak variant of kprobe_exceptions_notify

2017-02-09 Thread Ingo Molnar

* Michael Ellerman  wrote:

> "Naveen N. Rao"  writes:
> 
> > kprobe_exceptions_notify() is not used on some of the architectures such
> > as arm[64] and powerpc anymore. Introduce a weak variant for such
> > architectures.
> 
> I'll merge patch 1 & 3 via the powerpc tree for v4.11.

Acked-by: Ingo Molnar 
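
For readers unfamiliar with weak symbols, a rough sketch of the general pattern follows. It is not the patch itself: the function name kprobe_exceptions_notify_sketch and the local NOTIFY_DONE stand-in are assumptions for illustration, and the attribute is spelled out in GCC/Clang syntax rather than via the kernel's own macro.

```c
/* Stand-in for the kernel's notifier return value (assumption). */
#define NOTIFY_DONE 0

struct notifier_block;	/* opaque here; only passed through */

/*
 * Weak default: the linker uses it only when no other object file
 * supplies a strong definition of the same symbol.
 */
__attribute__((weak))
int kprobe_exceptions_notify_sketch(struct notifier_block *self,
				    unsigned long val, void *data)
{
	(void)self;
	(void)val;
	(void)data;
	return NOTIFY_DONE;
}
```

An architecture that still needs the notifier would provide a non-weak function with the same name and signature, which then takes precedence at link time.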

Thanks,

Ingo


Re: [GIT PULL 00/35] perf/core improvements and fixes

2017-03-06 Thread Ingo Molnar

* Arnaldo Carvalho de Melo  wrote:

> From: Arnaldo Carvalho de Melo 
> 
> Hi Ingo,
> 
>   Please consider pulling,
> 
> - Arnaldo
> 
> Test results at the end of this message, as usual.
> 
> The following changes since commit 9d020d33fc1b2faa0eb35859df1381ca5dc94ffe:
> 
>   Merge branch 'linus' into perf/urgent, to resolve conflict (2017-03-02 08:05:45 +0100)
> 
> are available in the git repository at:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-4.11-20170306
> 
> for you to fetch changes up to 001916b94a04809a94abb07daba6f9ace01906ba:
> 
>   perf bench numa: Add more comment for -c option (2017-03-06 12:39:30 -0300)
> 
> 
> perf/core improvements and fixes:
> 
> New features:
> 
> - Allow sorting by symbol_size in 'perf report' and 'perf top' (Charles Baylis)
> 
>   E.g.:
> 
>   # perf report -s symbol_size,symbol
> 
>   Samples: 9K of event 'cycles:k', Event count (approx.): 2870461623
>   Overhead  Symbol size  Symbol
>     14.55%          326  [k] flush_tlb_mm_range
>      7.20%         1045  [k] filemap_map_pages
>      5.82%          124  [k] vma_interval_tree_insert
>      5.18%         2430  [k] unmap_page_range
>      2.57%          571  [k] vma_interval_tree_remove
>      1.94%          494  [k] page_add_file_rmap
>      1.82%          740  [k] page_remove_rmap
>      1.66%         1017  [k] release_pages
>      1.57%         1636  [k] update_blocked_averages
>      1.57%           76  [k] unlock_page
> 
> - Add support for -p/--pid, -a/--all-cpus and -C/--cpu in 'perf ftrace' (Namhyung Kim)
> 
> Change in behaviour:
> 
> - Make system wide (-a) the default option if no target was specified and one
>   of following conditions is met:
> 
>   - No workload specified (current behaviour)
> 
>   - A workload is specified but all requested events are system wide ones,
> like uncore ones. (Jiri Olsa)
> 
> Fixes:
> 
> - Add missing initialization to the instruction decoder used in the
>   intel PT/BTS code, which was causing lots of failures in 'perf test',
>   looking for a value when there was none (Adrian Hunter)
> 
> Infrastructure:
> 
> - Add arch code needed to adopt the kernel's refcount_t to aid in
>   catching bugs when using atomic_t as a reference counter, basically
>   cmpxchg related functions (Arnaldo Carvalho de Melo)
> 
> - Convert the code using atomic_t as reference counts to refcount_t
>   (Elena Reshetova)
> 
> - Add feature test for sched_getcpu() to more easily check for its
>   presence in the many libc implementations and across different
>   versions of such C libraries (Arnaldo Carvalho de Melo)
> 
> - Issue a HW watchdog disable hint in 'perf stat' for when some of the
>   requested events can't get counted because a PMU counter is taken by that
>   watchdog (Borislav Petkov).
> 
> - Add mapping for Intel's KnightsMill PMU events (Karol Wachowski)
> 
> Documentation:
> 
> - Clarify the term 'convergence' in:
> 
>perf bench numa numa-mem -h --show_convergence (Jiri Olsa)
> 
> Kernel code:
> 
> - Ensure probe location is at function entry in kretprobes (Naveen N. Rao)
> 
> - Allow return probes with offsets and absolute addresses (Naveen N. Rao)
> 
> Signed-off-by: Arnaldo Carvalho de Melo 
> 
> 
> Adrian Hunter (1):
>   perf intel-PT/BTS: Add missing initialization
> 
> Arnaldo Carvalho de Melo (12):
>   tools include: Adopt __compiletime_error
>   tools arch x86: Include asm/cmpxchg.h
>   tools arch x86: Introduce atomic_cmpxchg()
>   tools include: Introduce atomic_cmpxchg_{relaxed,release}()
>   tools include: Provide gcc based cmpxchg fallback for !x86
>   tools include: Add UINT_MAX def to kernel.h
>   tools include: Adopt kernel's refcount.h
>   perf evlist: Clarify a bit the use of perf_mmap->refcnt
>   tools build: Add test for sched_getcpu()
>   perf bench futex: Use __maybe_unused
>   perf bench futex: Fix build on musl + clang
>   tools build: Use the same CC for feature detection and actual build
> 
> Borislav Petkov (1):
>   perf stat: Issue a HW watchdog disable hint
> 
> Charles Baylis (1):
>   perf tools: Allow sorting by symbol size
> 
> Elena Reshetova (9):
>   perf cgroup: Convert cgroup_sel.refcnt from atomic_t to refcount_t
>   perf cpumap: Convert cpu_map.refcnt from atomic_t to refcount_t
>   perf comm: Convert comm_str.refcnt from atomic_t to refcount_t
>   perf dso: Convert dso.refcnt from atomic_t to refcount_t
>   perf map: Convert map.refcnt from atomic_t to refcount_t
>   perf map: Convert map_groups.refcnt from atomic_t to refcount_t
>   perf evlist: Convert perf_map.refcnt from atomic_t to refcount_t
>   perf thread: convert thread.refcnt from atomic_t to refcount_t
>   perf thread_map: Convert thread_map.refcnt from atomic_t to refcount_t
> 
> Jiri Ols

Re: [PATCH v5 00/15] livepatch: hybrid consistency model

2017-03-07 Thread Ingo Molnar

* Josh Poimboeuf  wrote:

>  arch/Kconfig |   6 +
>  arch/powerpc/include/asm/thread_info.h   |   4 +-
>  arch/powerpc/kernel/signal.c |   4 +
>  arch/s390/include/asm/thread_info.h  |  24 +-
>  arch/s390/kernel/entry.S |  31 +-
>  arch/x86/Kconfig |   1 +
>  arch/x86/entry/common.c  |   9 +-
>  arch/x86/include/asm/thread_info.h   |  13 +-
>  arch/x86/include/asm/unwind.h|   6 +
>  arch/x86/kernel/stacktrace.c |  96 +++-
>  arch/x86/kernel/unwind_frame.c   |   2 +

for the x86 and scheduler changes:

Acked-by: Ingo Molnar 

Thanks,

Ingo


Re: [GIT PULL 00/19] perf/core improvements and fixes

2017-03-15 Thread Ingo Molnar

* Arnaldo Carvalho de Melo  wrote:

> Hi Ingo,
> 
>   Please consider pulling,
> 
> - Arnaldo
> 
> Test results at the end of this message, as usual.
> 
> The following changes since commit 84e5b549214f2160c12318aac549de85f600c79a:
> 
>   Merge tag 'perf-core-for-mingo-4.11-20170306' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core (2017-03-07 08:14:14 +0100)
> 
> are available in the git repository at:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-4.12-20170314
> 
> for you to fetch changes up to 5f6bee34707973ea7879a7857fd63ddccc92fff3:
> 
>   kprobes: Convert kprobe_exceptions_notify to use NOKPROBE_SYMBOL (2017-03-14 15:17:40 -0300)
> 
> 
> perf/core improvements and fixes:
> 
> New features:
> 
> - Add PERF_RECORD_NAMESPACES so that the kernel can record information
>   required to associate samples to namespaces, helping in container
>   problem characterization.
> 
>   Now 'perf record' has a '--namespace' option to ask for such info,
>   and when present, it can be used, initially, via a new sort order,
>   'cgroup_id', allowing histogram entry bucketization by a (device, inode)
>   based cgroup identifier (Hari Bathini)
> 
> - Add --next option to 'perf sched timehist', showing what is the next
>   thread to run (Brendan Gregg)
> 
> Fixes:
> 
> - Fix segfault with basic block 'cycles' sort dimension (Changbin Du)
> 
> - Add c2c to command-list.txt, making it appear in the 'perf help'
>   output (Changbin Du)
> 
> - Fix zeroing of 'abs_path' variable in the perf hists browser switch
>   file code (Changbin Du)
> 
> - Hide tips messages when -q/--quiet is given to 'perf report' (Namhyung Kim)
> 
> Infrastructure:
> 
> - Use ref_reloc_sym + offset to setup kretprobes (Naveen Rao)
> 
> - Ignore generated files pmu-events/{jevents,pmu-events.c} for git (Changbin Du)
> 
> Documentation:
> 
> - Document +field style argument support for --field option (Changbin Du)
> 
> - Clarify 'perf c2c --stats' help message (Namhyung Kim)
> 
> Signed-off-by: Arnaldo Carvalho de Melo 
> 
> 
> Brendan Gregg (1):
>   perf sched timehist: Add --next option
> 
> Changbin Du (5):
>   perf tools: Missing c2c command in command-list
>   perf tools: Ignore generated files pmu-events/{jevents,pmu-events.c} for git
>   perf sort: Fix segfault with basic block 'cycles' sort dimension
>   perf report: Document +field style argument support for --field option
>   perf hists browser: Fix typo in function switch_data_file
> 
> Hari Bathini (5):
>   perf: Add PERF_RECORD_NAMESPACES to include namespaces related info
>   perf tools: Add PERF_RECORD_NAMESPACES to include namespaces related info
>   perf record: Synthesize namespace events for current processes
>   perf script: Add script print support for namespace events
>   perf tools: Add 'cgroup_id' sort order keyword
> 
> Namhyung Kim (3):
>   perf report: Hide tip message when -q option is given
>   perf c2c: Clarify help message of --stats option
>   perf c2c: Fix display bug when using pipe
> 
> Naveen N. Rao (5):
>   perf probe: Factor out the ftrace README scanning
>   perf kretprobes: Offset from reloc_sym if kernel supports it
>   perf powerpc: Choose local entry point with kretprobes
>   doc: trace/kprobes: add information about NOKPROBE_SYMBOL
>   kprobes: Convert kprobe_exceptions_notify to use NOKPROBE_SYMBOL
> 
>  Documentation/trace/kprobetrace.txt |   5 +-
>  include/linux/perf_event.h  |   2 +
>  include/uapi/linux/perf_event.h |  32 +-
>  kernel/events/core.c| 139 ++
>  kernel/fork.c   |   2 +
>  kernel/kprobes.c|   5 +-
>  kernel/nsproxy.c|   3 +
>  tools/include/uapi/linux/perf_event.h   |  32 +-
>  tools/perf/.gitignore   |   2 +
>  tools/perf/Documentation/perf-record.txt|   3 +
>  tools/perf/Documentation/perf-report.txt|   7 +-
>  tools/perf/Documentation/perf-sched.txt |   4 +
>  tools/perf/Documentation/perf-script.txt|   3 +
>  tools/perf/arch/powerpc/util/sym-handling.c |  14 ++-
>  tools/perf/builtin-annotate.c   |   1 +
>  tools/perf/builtin-c2c.c|   4 +-
>  tools/perf/builtin-diff.c   |   1 +
>  tools/perf/builtin-inject.c |  13 +++
>  tools/perf/builtin-kmem.c   |   1 +
>  tools/perf/builtin-kvm.c|   2 +
>  tools/perf/builtin-lock.c   |   1 +
>  tools/perf/builtin-mem.c|   1 +
>  tools/perf/builtin-record.c |  35 ++-
>  tools/perf/builtin-report.c |   4 +-
>  tools/perf/b

Re: [GIT PULL 0/6] perf/core improvements and fixes

2017-03-16 Thread Ingo Molnar

* Arnaldo Carvalho de Melo  wrote:

> Hi Ingo,
> 
>   Please consider pulling,
> 
> - Arnaldo
> 
> Test results at the end of this message, as usual.
> 
> The following changes since commit ffa86c2f1a8862cf58c873f6f14d4b2c3250fb48:
> 
>   Merge tag 'perf-core-for-mingo-4.12-20170314' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core (2017-03-15 19:27:27 +0100)
> 
> are available in the git repository at:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-4.12-20170316
> 
> for you to fetch changes up to 61f35d750683b21e9e3836e309195c79c1daed74:
> 
>   uprobes: Default UPROBES_EVENTS to Y (2017-03-16 12:42:02 -0300)
> 
> 
> perf/core improvements and fixes:
> 
> New features:
> 
> - Add 'brstackinsn' field in 'perf script' to reuse the x86 instruction
>   decoder used in the Intel PT code to study hot paths to samples (Andi Kleen)
> 
> Kernel:
> 
> - Default UPROBES_EVENTS to Y (Alexei Starovoitov)
> 
> - Fix check for kretprobe offset within function entry (Naveen N. Rao)
> 
> Infrastructure:
> 
> - Introduce util func is_sdt_event() (Ravi Bangoria)
> 
> - Make perf_event__synthesize_mmap_events() scale on older kernels where
>   reading /proc/pid/maps is way slower than reading /proc/pid/task/pid/maps (Stephane Eranian)
> 
> Signed-off-by: Arnaldo Carvalho de Melo 
> 
> 
> Andi Kleen (1):
>   perf script: Add 'brstackinsn' for branch stacks
> 
> Arnaldo Carvalho de Melo (2):
>   tools headers: Sync {tools/,}arch/x86/include/asm/cpufeatures.h
>   uprobes: Default UPROBES_EVENTS to Y
> 
> Naveen N. Rao (1):
>   trace/kprobes: Fix check for kretprobe offset within function entry
> 
> Ravi Bangoria (1):
>   perf probe: Introduce util func is_sdt_event()
> 
> Stephane Eranian (1):
>   perf tools: Make perf_event__synthesize_mmap_events() scale
> 
>  include/linux/kprobes.h|   1 +
>  kernel/kprobes.c   |  40 ++--
>  kernel/trace/Kconfig   |   2 +-
>  kernel/trace/trace_kprobe.c|   2 +-
>  tools/arch/x86/include/asm/cpufeatures.h   |   5 +-
>  tools/perf/Documentation/perf-script.txt   |  13 +-
>  tools/perf/builtin-script.c| 264 -
>  tools/perf/util/Build  |   1 +
>  tools/perf/util/dump-insn.c|  14 ++
>  tools/perf/util/dump-insn.h|  22 ++
>  tools/perf/util/event.c|   4 +-
>  .../util/intel-pt-decoder/intel-pt-insn-decoder.c  |  24 ++
>  tools/perf/util/parse-events.h |  20 ++
>  tools/perf/util/probe-event.c  |   9 +-
>  14 files changed, 381 insertions(+), 40 deletions(-)
>  create mode 100644 tools/perf/util/dump-insn.c
>  create mode 100644 tools/perf/util/dump-insn.h

Pulled, thanks a lot Arnaldo!

Ingo


Re: [PATCH v6 2/8] module: use relative references for __ksymtab entries

2017-12-28 Thread Ingo Molnar

* Ard Biesheuvel  wrote:

> Annoyingly, we need this because there is a single instance of a
> special section that ends up in the EFI stub code: we build lib/sort.c
> again as a EFI libstub object, and given that sort() is exported, we
> end up with a ksymtab section in the EFI stub. The sort() thing has
> caused issues before [0], so perhaps I should just clone sort.c into
> drivers/firmware/efi/libstub and get rid of that hack.
> 
> [0] 
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=29f9007b3182ab3f328a31da13e6b1c9072f7a95

If the root problem is early bootstrap code randomly using a generic facility
that isn't __init, then we should definitely improve tooling to at least
detect these problems.

As bootstrap code gets improved (KASLR, more complex decompression, etc. etc.)
we keep using new bits of generic facilities...

So this should definitely not be hidden by open coding that function (which has 
various other disadvantages as well), but should be turned from silent breakage 
either into non-breakage (and do so not only for sort() but for other generic 
functions as well), or should be turned into a build failure.

Thanks,

Ingo


Re: [PATCH v11 0/3] mm, x86, powerpc: Enhancements to Memory Protection Keys.

2018-01-30 Thread Ingo Molnar

* Ram Pai  wrote:

> This patch series provides arch-neutral enhancements to
> enable memory-keys on new architecutes, and the corresponding
> changes in x86 and powerpc specific code to support that.
> 
> a) Provides ability to support upto 32 keys.  PowerPC
>   can handle 32 keys and hence needs this.
> 
> b) Arch-neutral code; and not the arch-specific code,
>determines the format of the string, that displays the key
>for each vma in smaps.
> 
> PowerPC implementation of memory-keys is now in powerpc/next tree.
> https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git/commit/?h=next&id=92e3da3cf193fd27996909956c12a23c0333da44

All three patches look sane to me. If you would like to carry these generic
bits in the PowerPC tree as well then:

  Reviewed-by: Ingo Molnar 

Thanks,

Ingo


Re: [PATCH for 4.16 v7 02/11] powerpc: membarrier: Skip memory barrier in switch_mm()

2018-02-05 Thread Ingo Molnar

* Mathieu Desnoyers  wrote:

>  
> +config ARCH_HAS_MEMBARRIER_HOOKS
> + bool

Yeah, so I have renamed this to ARCH_HAS_MEMBARRIER_CALLBACKS, and propagated
it through the rest of the patches. "Callback" is the canonical name, and I
also cringe every time I see 'hook'.

Please let me know if there are any big objections against this minor cleanup.

Thanks,

Ingo


Re: [PATCH] headers: untangle kmemleak.h from mm.h

2018-02-11 Thread Ingo Molnar

* Randy Dunlap  wrote:

> From: Randy Dunlap 
> 
> Currently <linux/slab.h> #includes <linux/kmemleak.h> for no obvious
> reason. It looks like it's only a convenience, so remove kmemleak.h
> from slab.h and add <linux/kmemleak.h> to any users of kmemleak_*
> that don't already #include it.
> Also remove <linux/kmemleak.h> from source files that do not use it.
> 
> This is tested on i386 allmodconfig and x86_64 allmodconfig. It
> would be good to run it through the 0day bot for other $ARCHes.
> I have neither the horsepower nor the storage space for the other
> $ARCHes.
> 
> [slab.h is the second most used header file after module.h; kernel.h
> is right there with slab.h. There could be some minor error in the
> counting due to some #includes having comments after them and I
> didn't combine all of those.]
> 
> This is Lingchi patch #1 (death by a thousand cuts, applied to kernel
> header files).
> 
> Signed-off-by: Randy Dunlap 

Nice find:

Reviewed-by: Ingo Molnar 

I agree that it needs to go through 0-day to find any hidden dependencies we
might have grown due to this.

Thanks,

Ingo


Re: [GIT PULL 00/41] perf/core improvements and fixes

2018-02-17 Thread Ingo Molnar

* Arnaldo Carvalho de Melo  wrote:

> Hi Ingo,
> 
>   Please consider pulling, this is on top of tip/perf/urgent.
> 
> - Arnaldo
> 
> Test results at the end of this message, as usual.
> 
> The following changes since commit 297f9233b53a08fd457815e19f1d6f2c3389857b:
> 
>   kprobes: Propagate error from disarm_kprobe_ftrace() (2018-02-16 09:12:58 +0100)
> 
> are available in the Git repository at:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-4.17-20180216
> 
> for you to fetch changes up to 21316ac6803d4a1aadd74b896db8d60a92cd1140:
> 
>   perf tests shell lib: Use a wildcard to remove the vfs_getname probe (2018-02-16 15:31:12 -0300)
> 
> 
> perf/core improvements and fixes:
> 
> - Fix wrong jump arrow in systems with branch records with cycles,
>   i.e. Intel's >= Skylake (Jin Yao)
> 
> - Fix 'perf record --per-thread' problem introduced when
>   implementing 'perf stat --per-thread' (Jin Yao)
> 
> - Use arch__compare_symbol_names() to fix 'perf test vmlinux',
>   that was using strcmp(symbol names) while the dso routines
>   doing symbol lookups used the arch overridable one, making
>   this test fail in architectures that overrode that function
>   with something other than strcmp() (Jiri Olsa)
> 
> - Add 'perf script --show-round-event' to display
>   PERF_RECORD_FINISHED_ROUND entries (Jiri Olsa)
> 
> - Fix dwarf unwind for stripped binaries in 'perf test' (Jiri Olsa)
> 
> - Use ordered_events for 'perf report --tasks', otherwise we may get
>   artifacts when PERF_RECORD_FORK gets processed before PERF_RECORD_COMM
>   (when they got recorded in different CPUs) (Jiri Olsa)
> 
> - Add support to display group output for non group events, i.e.
>   now when one uses 'perf report --group' on a perf.data file
>   recorded without explicitly grouping events with {} (e.g.
>   "perf record -e '{cycles,instructions}'"), one gets the same output
>   that grouping would produce, i.e. sees all those non-grouped events in
>   multiple columns, at the same time (Jiri Olsa)
> 
> - Skip non-address kallsyms entries, e.g. '(null)' for !root (Jiri Olsa)
> 
> - Kernel maps fixes wrt perf.data(report) versus live system (top)
>   (Jiri Olsa)
> 
> - Fix memory corruption when using 'perf record -j call -g -a '
>   followed by 'perf report --branch-history' (Jiri Olsa)
> 
> - ARM CoreSight fixes (Mathieu Poirier)
> 
> - Add inject capability for CoreSight Traces (Robert Walker)
> 
> - Update documentation for use of 'perf' + ARM CoreSight (Robert Walker)
> 
> - Man pages fixes (Sangwon Hong, Jaecheol Shin)
> 
> - Fix some 'perf test' cases on s/390 and x86_64 (some backtraces
>   changed with a glibc update) (Thomas Richter)
> 
> - Add detailed CPUID info in the 'perf.data' headers for s/390 to
>   then use it in 'perf annotate' (Thomas Richter)
> 
> - Add '--interval-count N' to 'perf stat', to use with -I, i.e.
>   'perf stat -I 1000 --interval-count 2' will show stats every
>1000ms, two times (yuzhoujian)
> 
> - Add 'perf stat --timeout Nms', that will run for that many
>   milliseconds and then stop, printing the counters (yuzhoujian)
> 
> - Fix description for 'perf report --mem-mode' (Andi Kleen)
> 
> - Use a wildcard to remove the vfs_getname probe in the
>   'perf test' shell based test cases (Arnaldo Carvalho de Melo)
> 
> Signed-off-by: Arnaldo Carvalho de Melo 
> 
> 
> Andi Kleen (1):
>   perf report: Fix description for --mem-mode
> 
> Arnaldo Carvalho de Melo (1):
>   perf tests shell lib: Use a wildcard to remove the vfs_getname probe
> 
> Jaecheol Shin (1):
>   perf annotate: Add missing arguments in Man page
> 
> Jin Yao (2):
>   perf tools: Use target->per_thread and target->system_wide flags
>   perf report: Fix wrong jump arrow
> 
> Jiri Olsa (18):
>   perf record: Put new line after target override warning
>   perf script: Add --show-round-event to display PERF_RECORD_FINISHED_ROUND
>   tools lib api fs: Add filename__read_xll function
>   tools lib api fs: Add sysfs__read_xll function
>   perf tests: Fix dwarf unwind for stripped binaries
>   perf tools: Fix comment for sort__* compare functions
>   perf report: Ask for ordered events for --tasks option
>   perf report: Add support to display group output for non group events
>   tools lib symbol: Skip non-address kallsyms line
>   perf symbols: Check if we read regular file in dso__load()
>   perf machine: Free root_dir in machine__init() error path
>   perf machine: Move kernel mmap name into struct machine
>   perf machine: Generalize machine__set_kernel_mmap()
>   perf machine: Don't search for active kernel start in __machine__create_kernel_maps
>   perf machine: Remove machine__load_kallsyms()
>   perf tools: Do not create kernel maps in sample__resolve()
>   perf tests: Use arch__co

Re: [PATCH v12 01/22] selftests/x86: Move protection key selftest to arch neutral directory

2018-02-21 Thread Ingo Molnar

* Ram Pai  wrote:

> cc: Dave Hansen 
> cc: Florian Weimer 
> Signed-off-by: Ram Pai 
> ---
>  tools/testing/selftests/vm/Makefile   |1 +
>  tools/testing/selftests/vm/pkey-helpers.h |  223 
>  tools/testing/selftests/vm/protection_keys.c  | 1407 +
>  tools/testing/selftests/x86/Makefile  |2 +-
>  tools/testing/selftests/x86/pkey-helpers.h|  223 
>  tools/testing/selftests/x86/protection_keys.c | 1407 -
>  6 files changed, 1632 insertions(+), 1631 deletions(-)
>  create mode 100644 tools/testing/selftests/vm/pkey-helpers.h
>  create mode 100644 tools/testing/selftests/vm/protection_keys.c
>  delete mode 100644 tools/testing/selftests/x86/pkey-helpers.h
>  delete mode 100644 tools/testing/selftests/x86/protection_keys.c

Acked-by: Ingo Molnar 

Thanks,

Ingo


Re: [PATCH] x86, powerpc : pkey-mprotect must allow pkey-0

2018-03-09 Thread Ingo Molnar

* Ram Pai  wrote:

> Once an address range is associated with an allocated pkey, it cannot be
> reverted back to key-0. There is no valid reason for the above behavior.  On
> the contrary applications need the ability to do so.
> 
> The patch relaxes the restriction.
> 
> Tested on powerpc and x86_64.
> 
> cc: Dave Hansen 
> cc: Michael Ellerman 
> cc: Ingo Molnar 
> Signed-off-by: Ram Pai 
> ---
>  arch/powerpc/include/asm/pkeys.h | 19 ++-
>  arch/x86/include/asm/pkeys.h |  5 +++--
>  2 files changed, 17 insertions(+), 7 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
> index 0409c80..3e8abe4 100644
> --- a/arch/powerpc/include/asm/pkeys.h
> +++ b/arch/powerpc/include/asm/pkeys.h
> @@ -101,10 +101,18 @@ static inline u16 pte_to_pkey_bits(u64 pteflags)
>  
>  static inline bool mm_pkey_is_allocated(struct mm_struct *mm, int pkey)
>  {
> - /* A reserved key is never considered as 'explicitly allocated' */
> - return ((pkey < arch_max_pkey()) &&
> - !__mm_pkey_is_reserved(pkey) &&
> - __mm_pkey_is_allocated(mm, pkey));
> + /* pkey 0 is allocated by default. */
> + if (!pkey)
> +return true;
> +
> + if (pkey < 0 || pkey >= arch_max_pkey())
> +return false;
> +
> + /* reserved keys are never allocated. */
> + if (__mm_pkey_is_reserved(pkey))
> +return false;

Please capitalize in comments consistently, i.e.:

/* Reserved keys are never allocated: */

> +
> + return(__mm_pkey_is_allocated(mm, pkey));

'return' is not a function.
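
Put together, and purely as a sketch of how the function could read with both comments applied: the struct layout, the choice of 32 keys, key 1 being reserved and the bitmap helper are hypothetical stand-ins so the example builds in user space; they are not the powerpc implementation.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the arch helpers used by the patch. */
struct mm_struct {
	unsigned int pkey_alloc_map;	/* one bit per allocated key */
};

static int arch_max_pkey(void)
{
	return 32;
}

static bool __mm_pkey_is_reserved(int pkey)
{
	return pkey == 1;	/* assumption: key 1 is reserved here */
}

static bool __mm_pkey_is_allocated(struct mm_struct *mm, int pkey)
{
	return mm->pkey_alloc_map & (1u << pkey);
}

static inline bool mm_pkey_is_allocated(struct mm_struct *mm, int pkey)
{
	/* Pkey 0 is allocated by default: */
	if (!pkey)
		return true;

	if (pkey < 0 || pkey >= arch_max_pkey())
		return false;

	/* Reserved keys are never allocated: */
	if (__mm_pkey_is_reserved(pkey))
		return false;

	return __mm_pkey_is_allocated(mm, pkey);
}
```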

Thanks,

Ingo


Re: [RFC PATCH 4/6] mm: provide generic compat_sys_readahead() implementation

2018-03-19 Thread Ingo Molnar

* Al Viro  wrote:

> On Sun, Mar 18, 2018 at 06:18:48PM +, Al Viro wrote:
> 
> > I'd done some digging in that area, will find the notes and post.
> 
> OK, found:

Very nice writeup - IMHO this should go into Documentation/!

> OTOH, consider arm.  There we have
>   * r0, r1, r2, r3, [sp,#8], [sp,#12], [sp,#16]... is the sequence
> of objects used to pass arguments
>   * 32bit and less - pick the next available slot
>   * 64bit - skip a slot if we'd already taken an odd number, then use
> the next two slots for lower and upper 32 bits of the argument.
> 
> So our classes take
> simple n-argument:    0 to 6 slots
> WD                    4 slots
> DWW                   4 slots
> WDW                   5 slots
> WWDD                  6 slots
> WDWW                  5 slots
> WWWD                  6 slots
> WWDWW                 6 slots
> WDDW                  7 slots (!)  Also , , !@#!@#!@#!# and other nice
> and well-deserved comments from arch maintainers, some of them even printable:
> /* It would be nice if people remember that not all the world's an i386
>when they introduce new system calls */
> SYSCALL_DEFINE4(sync_file_range2, int, fd, unsigned int, flags,
>  loff_t, offset, loff_t, nbytes)
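
The slot rules quoted above are mechanical enough to write down. A small sketch, assuming a pattern string in which 'W' is an argument of 32 bits or less and 'D' is a 64-bit argument; the function name arm_arg_slots is made up for this example:

```c
#include <stddef.h>

/*
 * Count argument slots under the quoted rules: 'W' takes the next
 * free slot; 'D' first skips a slot if an odd number is already
 * taken, then takes two. Returns -1 for any other character.
 */
static int arm_arg_slots(const char *pattern)
{
	int slots = 0;

	for (size_t i = 0; pattern[i] != '\0'; i++) {
		switch (pattern[i]) {
		case 'W':
			slots += 1;
			break;
		case 'D':
			slots += (slots & 1) + 2;
			break;
		default:
			return -1;
		}
	}
	return slots;
}
```

For "WDW" this gives 5 and for "WDDW" it gives 7, i.e. one slot more than the six the simple n-argument case tops out at.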

Such idiosyncratic platform quirks that have an impact on generic code should
be as self-maintaining as possible: i.e. there should be a build time warning
even on x86 if someone introduces a new, suboptimally packed system call.

Otherwise we'll have such incidents again and again as new system calls get
added.
> [snip the preprocessor horrors - the sketches I've got there are downright obscene]

I still think we should consider creating a generic facility and a tool: which
would immediately and automatically add new system calls to *every*
architecture - or which would initially at least check these syscall ABI
constraints.

I.e. this would start with a new generic kernel facility that warns about
suboptimal new system call argument layouts on every architecture, not just on
the affected ones.

That's a significant undertaking but should be possible to do.

Once such a facility is in place all the existing old mess is still a PITA, but 
should be manageable eventually - as no new mess is added to it.

IMHO that's the only thing that could break the somewhat deadly current dynamic 
of 
system call mappings mess. Complaining about people not knowing about quirks 
won't 
help.

One way to implement this would be to put the argument chain types (string) and 
sizes (int) into a special debug section which isn't included in the final 
kernel 
image but which can be checked at link time.

For example this attempt at creating a new system call:

  SYSCALL_DEFINE3(moron, int, fd, loff_t, offset, size_t, count)

... would translate into something like:

    .name = "moron", .pattern = "WWW", .type = "int",    .size = 4,
    .name = NULL,                      .type = "loff_t", .size = 8,
    .name = NULL,                      .type = "size_t", .size = 4,
    .name = NULL,                      .type = NULL,     .size = 0, /* end of parameter list */

i.e. "WDW". The build-time constraint checker could then warn about:

  # error: System call "moron" uses invalid 'WWW' argument mapping for a 'WDW' sequence
  #        please avoid long-long arguments or use 'SYSCALL_DEFINE3_WDW()' instead
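
A rough sketch of what such a checker could do with those records, in Python
for brevity. The SyscallRecord layout and the check() helper are hypothetical
stand-ins for whatever the debug section would really contain, and the sizes
are assumed to be those of a 32-bit target:

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class SyscallRecord:
    """One decoded entry of the (hypothetical) debug section."""
    name: str
    pattern: str                    # mapping declared via the SYSCALL_DEFINE variant
    params: List[Tuple[str, int]]   # (C type name, size in bytes on a 32-bit target)


def actual_pattern(params: List[Tuple[str, int]]) -> str:
    # Anything wider than 4 bytes is a double-slot argument (D), the rest are W.
    return "".join("D" if size > 4 else "W" for _ctype, size in params)


def check(rec: SyscallRecord) -> Optional[str]:
    """Return an error string if the declared mapping does not match the types."""
    actual = actual_pattern(rec.params)
    if actual == rec.pattern:
        return None
    return (f"error: System call \"{rec.name}\" uses invalid '{rec.pattern}' "
            f"argument mapping for a '{actual}' sequence")


moron = SyscallRecord("moron", "WWW", [("int", 4), ("loff_t", 8), ("size_t", 4)])
print(check(moron))
```

The same record declared with a "WDW" pattern would pass this particular check.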

Each architecture can provide its own syscall parameter checking logic. Both
'stack boundary' and parameter packing rules would be straightforward to
express if we had such a data structure.

Note that this tool could also check for optimum packing, i.e. if the new
system call is defined as:

  SYSCALL_DEFINE3_WDW(moron, int, fd, loff_t, offset, size_t, count)

... it would translate to something like:

    .name = "moron", .pattern = "WDW", .type = "int",    .size = 4,
    .name = NULL,                      .type = "loff_t", .size = 8,
    .name = NULL,                      .type = "size_t", .size = 4,
    .name = NULL,                      .type = NULL,     .size = 0, /* end of parameter list */

where the tool would print out this error:

  # error: System call "moron" uses suboptimal 'WDW' argument mapping instead of 'WWD'
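
One possible way to find the better ordering is brute force over the (at most
six) arguments. The sketch below is Python, assumes the arm slot rules Al
quoted earlier, and uses made-up helper names; it is not how any existing
kernel tool works:

```python
from itertools import permutations


def arm_slots(pattern):
    """Slots used on 32-bit arm: W takes one slot, D an even-aligned pair."""
    used = 0
    for kind in pattern:
        if kind == "D":
            used += used % 2    # skip a slot to reach an even boundary
            used += 2
        else:
            used += 1
    return used


def optimal_orders(pattern):
    """All reorderings of the same arguments that use the fewest slots."""
    candidates = {"".join(p) for p in permutations(pattern)}
    best = min(arm_slots(c) for c in candidates)
    return sorted(c for c in candidates if arm_slots(c) == best)


def check_packing(name, pattern):
    """Return a warning string for a suboptimal pattern, else None."""
    best = optimal_orders(pattern)
    if pattern in best:
        return None
    suggestion = " or ".join(f"'{b}'" for b in best)
    return (f"error: System call \"{name}\" uses suboptimal '{pattern}' "
            f"argument mapping instead of {suggestion}")


print(check_packing("moron", "WDW"))
```

For "WDW" (5 slots) both "DWW" and "WWD" take 4 slots, so either ordering would
silence the warning.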

There would be a whitelist of existing system calls that are already using a
suboptimal argument order - but the warnings/errors would trigger for all new
system calls.

But adding non-straight-mapped system calls would be the exception in any case.

Such tooling could also do other things, such as limit the C types used for
system call defines to a well-chosen set of ABI-safe types, such as:

  3  key_t
  3  uint32_t
  4  aio_context_t
  4  mqd_t
  4  timer_t
 10  clockid_t
 10  gid_t
 10  loff_t
 10  long
 10  old_gid_t
 10  old_uid_t
 10  umode_t
 11  uid_t
 31  pid_t
 34  size_t
 69  unsigned int
130  unsigned long
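
As an illustration only, such a type check could be as simple as a set lookup.
The Python sketch below whitelists just the types that appear in the list
above (a real whitelist would presumably be longer), and rejected_types() is a
made-up name:

```python
# Only the types shown in the list above; treat this set as an assumption,
# not as a complete or agreed-upon whitelist.
ABI_SAFE_TYPES = frozenset({
    "key_t", "uint32_t", "aio_context_t", "mqd_t", "timer_t", "clockid_t",
    "gid_t", "loff_t", "long", "old_gid_t", "old_uid_t", "umode_t", "uid_t",
    "pid_t", "size_t", "unsigned int", "unsigned long",
})


def rejected_types(param_types):
    """Return the parameter types that are not on the whitelist, in order."""
    return [t for t in param_types if t not in ABI_SAFE_TYPES]


print(rejected_types(["pid_t", "long long", "size_t"]))
```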
  

Re: [GIT PULL 00/14] perf/core improvements and fixes

2018-03-19 Thread Ingo Molnar

* Arnaldo Carvalho de Melo  wrote:

> Hi Ingo,
> 
>   Please consider pulling, this has those 31 patches that were
> blocked due to some problems (author not being the first S-o-B, build
> broken on ppc), those issues should all be fixed and then we have 14
> patches more, described in the signed tag.
> 
> Regards,
> 
> - Arnaldo
> 
> Test results at the end of this message, as usual.
> 
> The following changes since commit 10f354a36f9a9aa1b8bffe0abc1cd43822a85bcd:
> 
>   perf test: Fix exit code for record+probe_libc_inet_pton.sh (2018-03-16 13:56:31 -0300)
> 
> are available in the Git repository at:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-4.17-20180319
> 
> for you to fetch changes up to 1cd618838b9703eabe4a75badf433382b12f6bef:
> 
>   perf tests bp_account: Fix build with clang-6 (2018-03-19 13:51:54 -0300)
> 
> 
> perf/core improvements and fixes:
> 
> - Fixes for problems experienced with new gcc 8 warnings that, treated
>   as errors, broke the build, related to snprintf and casting issues.
>   (Arnaldo Carvalho de Melo, Jiri Olsa, Josh Poimboeuf)
> 
> - Fix build of new breakpoint 'perf test' entry with clang < 6, noticed
>   on fedora 25, 26 and 27 (Arnaldo Carvalho de Melo)
> 
> - Workaround problem with symbol resolution in 'perf annotate', using
>   the symbol name already present in the objdump output (Arnaldo Carvalho de Melo)
> 
> - Document 'perf top --ignore-vmlinux' (Arnaldo Carvalho de Melo)
> 
> - Fix out of bounds access on array fd when cnt is 100 in one of the
>   'perf test' entries, detected using 'cpptest' (Colin Ian King)
> 
> - Add support for the forced leader feature, i.e. 'perf report --group'
>   for a group of events not really grouped when scheduled (without using
>   {} to enclose the list of events in the command line) in pipe mode,
>   e.g.:
> 
>   $ perf record -e cycles,instructions -o - kill | perf report --group -i -
> 
> - Use right type to access array elements in 'perf probe' (Masami Hiramatsu)
> 
> - Update POWER9 vendor events (those described in JSON format) (Sukadev Bhattiprolu)
> 
> - Discard head in overwrite_rb_find_range() (Yisheng Xie)
> 
> - Avoid setting 'quiet' to 'true' unnecessarily (Yisheng Xie)
> 
> Signed-off-by: Arnaldo Carvalho de Melo 
> 
> 
> Arnaldo Carvalho de Melo (4):
>   perf annotate: Use asprintf when formatting objdump command line
>   perf top: Document --ignore-vmlinux
>   perf annotate: Use ops->target.name when available for unresolved call targets
>   perf tests bp_account: Fix build with clang-6
> 
> Colin Ian King (1):
>   perf tests: Fix out of bounds access on array fd when cnt is 100
> 
> Jiri Olsa (4):
>   perf record: Synthesize features before events in pipe mode
>   perf report: Support forced leader feature in pipe mode
>   perf tools: Fix snprint warnings for gcc 8
>   perf tools: Fix python extension build for gcc 8
> 
> Josh Poimboeuf (1):
>   objtool, perf: Fix GCC 8 -Wrestrict error
> 
> Masami Hiramatsu (1):
>   perf probe: Use right type to access array elements
> 
> Sukadev Bhattiprolu (1):
>   perf vendor events: Update POWER9 events
> 
> Yisheng Xie (2):
>   perf mmap: Discard head in overwrite_rb_find_range()
>   perf debug: Avoid setting 'quiet' to 'true' unnecessarily
> 
>  tools/lib/str_error_r.c                            |   2 +-
>  tools/perf/Documentation/perf-top.txt              |   3 +
>  tools/perf/builtin-record.c                        |  18 +-
>  tools/perf/builtin-report.c                        |  57 +++--
>  tools/perf/builtin-script.c                        |  22 +-
>  .../perf/pmu-events/arch/powerpc/power9/cache.json |  25 ---
>  .../pmu-events/arch/powerpc/power9/frontend.json   |  10 -
>  .../pmu-events/arch/powerpc/power9/marked.json     |   5 -
>  .../pmu-events/arch/powerpc/power9/memory.json     |   5 -
>  .../perf/pmu-events/arch/powerpc/power9/other.json | 241 ++---
>  .../pmu-events/arch/powerpc/power9/pipeline.json   |  50 ++---
>  tools/perf/pmu-events/arch/powerpc/power9/pmc.json |   5 -
>  .../arch/powerpc/power9/translation.json           |  10 +-
>  tools/perf/tests/attr.c                            |   4 +-
>  tools/perf/tests/bp_account.c                      |  10 +-
>  tools/perf/tests/mem.c                             |   2 +-
>  tools/perf/tests/pmu.c                             |   2 +-
>  tools/perf/util/annotate.c                         |  20 +-
>  tools/perf/util/cgroup.c                           |   2 +-
>  tools/perf/util/debug.c                            |   1 -
>  tools/perf/util/header.c                           |  11 +-
>  tools/perf/util/mmap.c                             |  15 +-
>  tools/perf/util/parse-events.c                     |   4 +-
>  tools/perf/util/pmu.c  

Re: [RFC PATCH 4/6] mm: provide generic compat_sys_readahead() implementation

2018-03-20 Thread Ingo Molnar

* Al Viro  wrote:

> > For example this attempt at creating a new system call:
> > 
> >   SYSCALL_DEFINE3(moron, int, fd, loff_t, offset, size_t, count)
> > 
> > ... would translate into something like:
> > 
> >     .name = "moron", .pattern = "WWW", .type = "int",    .size = 4,
> >     .name = NULL,                      .type = "loff_t", .size = 8,
> >     .name = NULL,                      .type = "size_t", .size = 4,
> >     .name = NULL,                      .type = NULL,     .size = 0, /* end of parameter list */
> > 
> > i.e. "WDW". The build-time constraint checker could then warn about:
> > 
> >   # error: System call "moron" uses invalid 'WWW' argument mapping for a 'WDW' sequence
> >   #        please avoid long-long arguments or use 'SYSCALL_DEFINE3_WDW()' instead
> 
> ... if you do 32bit build.

Yeah - but the checking tool could do a 32-bit sizing of the types and thus the 
checks would work on all arches and on all bitness settings.
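
A small Python sketch of that idea: the pattern is derived from a fixed table
of 32-bit (ILP32) type sizes rather than from whatever the build host or target
uses. The table is deliberately tiny and pattern_32bit() is a made-up name:

```python
# Sizes in bytes under the ILP32 data model used by 32-bit kernels.
# Only a handful of types are listed; extending the table is left open.
ILP32_SIZES = {
    "int": 4, "unsigned int": 4, "long": 4, "unsigned long": 4,
    "size_t": 4, "pid_t": 4, "loff_t": 8,
}


def pattern_32bit(param_types):
    """W/D pattern of a parameter list as a 32-bit target would see it."""
    return "".join("D" if ILP32_SIZES[t] > 4 else "W" for t in param_types)


# The 'moron' example from above, checked independently of the build's bitness.
print(pattern_32bit(["int", "loff_t", "size_t"]))   # WDW
```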

I don't think doing part of this in CPP is a good idea:

 - It won't be able to do the full range of checks

 - Wrappers should IMHO be trivial and open coded as much as possible - not
   hidden inside several layers of macros.

 - There should be a penalty for newly introduced, badly designed system call
   ABIs, while most CPP variants I can think of will just make bad but solvable 
   decisions palatable, AFAICS.

I.e. I think the way out of this would be two steps:

 1) for new system calls: hard-enforce the highest quality at the development
    stage and hard-reject crap. No new 6-parameter system calls or badly ordered
    arguments. The tool would also check new extensions to existing system
    calls, i.e. no more "add a crappy 4th argument to an existing system call
    that works on x86 but hurts MIPS".

 2) for old legacies: cleanly open code all our existing legacies and weird
    wrappers. No new muck will be added to it so the line count does not matter.

... is there anything I'm missing?

Thanks,

Ingo


Re: [RFC] new SYSCALL_DEFINE/COMPAT_SYSCALL_DEFINE wrappers

2018-03-30 Thread Ingo Molnar

* John Paul Adrian Glaubitz  wrote:

> On 03/27/2018 12:40 PM, Linus Torvalds wrote:
> > On Mon, Mar 26, 2018 at 4:37 PM, John Paul Adrian Glaubitz
> >  wrote:
> >>
> >> What about a tarball with a minimal Debian x32 chroot? Then you can
> >> install interesting packages you would like to test yourself.
> > 
> > That probably works fine.
> 
> I just created a fresh Debian x32 unstable chroot using this command:
> 
> $ debootstrap --no-check-gpg --variant=minbase --arch=x32 unstable debian-x32-unstable http://ftp.ports.debian.org/debian-ports
> 
> It can be downloaded from my Debian webspace along with checksum files for
> verification:
> 
> > https://people.debian.org/~glaubitz/chroots/
> 
> Let me know if you run into any issues.

Here's the direct download link:

  $ wget https://people.debian.org/~glaubitz/chroots/debian-x32-unstable.tar.gz

Checksum should be:

  $ sha256sum debian-x32-unstable.tar.gz
  010844bcc76bd1a3b7a20fe47f7067ed8e429a84fa60030a2868626e8fa7ec3b  debian-x32-unstable.tar.gz
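
If sha256sum isn't at hand, the same comparison can be scripted. A minimal
Python sketch using the standard hashlib module; the helper names are made up,
and EXPECTED is simply the checksum quoted above:

```python
import hashlib

# Checksum quoted above for debian-x32-unstable.tar.gz
EXPECTED = "010844bcc76bd1a3b7a20fe47f7067ed8e429a84fa60030a2868626e8fa7ec3b"


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so that large tarballs need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected=EXPECTED):
    """True if the file's SHA-256 matches the expected hex digest."""
    return sha256_of(path) == expected.lower()
```

Usage would be along the lines of verify("debian-x32-unstable.tar.gz") after
the download.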

Seems to work fine here (on a distro kernel) even if I extract all the files as
a non-root user and do:

  ~/s/debian-x32-unstable> fakechroot /usr/sbin/chroot . /usr/bin/dpkg -l  | tail -2

  ERROR: ld.so: object 'libfakechroot.so' from LD_PRELOAD cannot be preloaded (cannot open shared object file): ignored.
  ii  util-linux:x32 2.31.1-0.5   x32  miscellaneous system utilities
  ii  zlib1g:x32 1:1.2.8.dfsg-5   x32  compression library - runtime

So that 'dpkg' instance appears to be running inside the chroot environment and
is listing x32 installed packages.

Although I did get this warning:

  ERROR: ld.so: object 'libfakechroot.so' from LD_PRELOAD cannot be preloaded (cannot open shared object file): ignored.

Even with that warning, is this still a sufficiently complex test of x32
syscall code paths?

BTW., "fakechroot /usr/sbin/chroot ." crashes instead of giving me a bash shell.

Thanks,

Ingo


Re: [PATCH] Extract initrd free logic from arch-specific code.

2018-03-30 Thread Ingo Molnar

* Shea Levy  wrote:

> Now only those architectures that have custom initrd free requirements
> need to define free_initrd_mem.
> 
> Signed-off-by: Shea Levy 

Please put the Kconfig symbol name this patch introduces into the title, so
that people know what to grep for.

> ---
>  arch/alpha/mm/init.c      |  8
>  arch/arc/mm/init.c        |  7 ---
>  arch/arm/Kconfig          |  1 +
>  arch/arm64/Kconfig        |  1 +
>  arch/blackfin/Kconfig     |  1 +
>  arch/c6x/mm/init.c        |  7 ---
>  arch/cris/Kconfig         |  1 +
>  arch/frv/mm/init.c        | 11 ---
>  arch/h8300/mm/init.c      |  7 ---
>  arch/hexagon/Kconfig      |  1 +
>  arch/ia64/Kconfig         |  1 +
>  arch/m32r/Kconfig         |  1 +
>  arch/m32r/mm/init.c       | 11 ---
>  arch/m68k/mm/init.c       |  7 ---
>  arch/metag/Kconfig        |  1 +
>  arch/microblaze/mm/init.c |  7 ---
>  arch/mips/Kconfig         |  1 +
>  arch/mn10300/Kconfig      |  1 +
>  arch/nios2/mm/init.c      |  7 ---
>  arch/openrisc/mm/init.c   |  7 ---
>  arch/parisc/mm/init.c     |  7 ---
>  arch/powerpc/mm/mem.c     |  7 ---
>  arch/riscv/mm/init.c      |  6 --
>  arch/s390/Kconfig         |  1 +
>  arch/score/Kconfig        |  1 +
>  arch/sh/mm/init.c         |  7 ---
>  arch/sparc/Kconfig        |  1 +
>  arch/tile/Kconfig         |  1 +
>  arch/um/kernel/mem.c      |  7 ---
>  arch/unicore32/Kconfig    |  1 +
>  arch/x86/Kconfig          |  1 +
>  arch/xtensa/Kconfig       |  1 +
>  init/initramfs.c          |  7 +++
>  usr/Kconfig               |  4
>  34 files changed, 28 insertions(+), 113 deletions(-)

Please also put it into Documentation/features/.

> diff --git a/usr/Kconfig b/usr/Kconfig
> index 43658b8a975e..7a94f6df39bf 100644
> --- a/usr/Kconfig
> +++ b/usr/Kconfig
> @@ -233,3 +233,7 @@ config INITRAMFS_COMPRESSION
>   default ".lzma" if RD_LZMA
>   default ".bz2"  if RD_BZIP2
>   default ""
> +
> +config HAVE_ARCH_FREE_INITRD_MEM
> + bool
> + default n

Help text would be nice, to tell arch maintainers what the purpose of this
switch is.

Also, a nit, I think this should be named "ARCH_HAS_FREE_INITRD_MEM", which is
the dominant pattern:

triton:~/tip> git grep 'select.*ARCH' arch/x86/Kconfig* | cut -f2 | cut -d_ -f1-2 | sort | uniq -c | sort -n
...
  2 select ARCH_USES
  2 select ARCH_WANTS
  3 select ARCH_MIGHT
  3 select ARCH_WANT
  4 select ARCH_SUPPORTS
  4 select ARCH_USE
 16 select HAVE_ARCH
 23 select ARCH_HAS
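
For reference, roughly the same tally can be computed without the cut/sort/uniq
pipeline. This is only a sketch in Python: prefix_counts() is a made-up helper
that takes Kconfig lines as input, and unlike the pipeline above it reports the
bare prefix without the leading "select":

```python
import re
from collections import Counter

# Matches 'select <SYMBOL>' lines whose symbol contains 'ARCH'.
SELECT_RE = re.compile(r"^\s*select\s+(\S*ARCH\S*)")


def prefix_counts(kconfig_lines):
    """Count the first two '_'-separated components of each selected symbol."""
    counts = Counter()
    for line in kconfig_lines:
        m = SELECT_RE.match(line)
        if m:
            prefix = "_".join(m.group(1).split("_")[:2])
            counts[prefix] += 1
    # Ascending by count, like 'sort -n' on 'uniq -c' output.
    return sorted(counts.items(), key=lambda kv: (kv[1], kv[0]))
```

Feeding it the lines of arch/x86/Kconfig would be expected to give a ranking
similar to the one above, with the exact numbers depending on the tree.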

It also reads nicely in English:

  "arch has free_initrd_mem()"

While the other makes little sense:

  "have arch free_initrd_mem()"

?

Thanks,

Ingo


Re: [PATCH] Extract initrd free logic from arch-specific code.

2018-04-01 Thread Ingo Molnar

* Shea Levy  wrote:

> > Please also put it into Documentation/features/.
> 
> I switched this patch series (the latest revision v6 was just posted) to
> using weak symbols instead of Kconfig. Does it still warrant documentation?

Probably not.

Thanks,

Ingo


Re: [PATCH 5/6] mm, x86: Add ARCH_HAS_ZONE_DEVICE

2017-05-22 Thread Ingo Molnar

* Oliver O'Halloran  wrote:

> Currently ZONE_DEVICE depends on X86_64. This is fine for now, but it
> will get unwieldy as new platforms get ZONE_DEVICE support. Move it
> to an arch selected Kconfig option to save us some trouble in the
> future.
> 
> Cc: x...@kernel.org
> Signed-off-by: Oliver O'Halloran 

Acked-by: Ingo Molnar 

Thanks,

Ingo


Re: [GIT PULL 00/13] perf/core improvements and fixes

2017-09-04 Thread Ingo Molnar

* Arnaldo Carvalho de Melo  wrote:

> Hi Ingo,
> 
>   Please consider pulling,
> 
> - Arnaldo
> 
> Test results at the end of this message, as usual.
> 
> The following changes since commit 1b2f76d77a277bb70d38ad0991ed7f16bbc115a9:
> 
>   Merge tag 'perf-core-for-mingo-4.14-20170829' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core (2017-08-29 23:13:56 +0200)
> 
> are available in the git repository at:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-4.14-20170901
> 
> for you to fetch changes up to eba9fac017617e685d648339e29a1453a30cb065:
> 
>   perf annotate browser: Help for cycling thru hottest instructions with TAB/shift+TAB (2017-09-01 14:55:40 -0300)
> 
> 
> perf/core improvements and fixes:
> 
> - Support syscall name glob matching in 'perf trace' (Arnaldo Carvalho de Melo)
> 
>   e.g.:
> 
>    # perf trace -e pkey_*
>    32.784 (0.006 ms): pkey/16018 pkey_alloc(init_val: DISABLE_WRITE) = -1 EINVAL Invalid argument
>    32.795 (0.004 ms): pkey/16018 pkey_mprotect(start: 0x7f380d0a6000, len: 4096, prot: READ|WRITE, pkey: -1) = 0
>    32.801 (0.002 ms): pkey/16018 pkey_free(pkey: -1) = -1 EINVAL Invalid argument
>    ^C#
> 
> - Do not auto merge counts for explicitly specified events in
>   'perf stat' (Arnaldo Carvalho de Melo)
> 
> - Fix syntax in documentation of .perfconfig intel-pt option (Jack Henschel)
> 
> - Calculate the average cycles of iterations for loops detected by the
>   branch history support in 'perf report' (Jin Yao)
> 
> - Support PERF_SAMPLE_PHYS_ADDR as a sort key "phys_daddr" in the 'script',
>   'mem', 'top' and 'report'. Also add a test entry for it in 'perf test' (Kan Liang)
> 
> - Fix 'Object code reading' 'perf test' entry in PowerPC (Ravi Bangoria)
> 
> - Remove some duplicate Power9 vendor events (described in JSON
>   files) (Sukadev Bhattiprolu)
> 
> - Add help entry in the TUI annotate browser about cycling thru hottest
>   instructions with TAB/shift+TAB (Arnaldo Carvalho de Melo)
> 
> Signed-off-by: Arnaldo Carvalho de Melo 
> 
> 
> Arnaldo Carvalho de Melo (4):
>   perf syscalltbl: Support glob matching on syscall names
>   perf trace: Support syscall name globbing
>   perf stat: Only auto-merge events that are PMU aliases
>   perf annotate browser: Help for cycling thru hottest instructions with TAB/shift+TAB
> 
> Jack Henschel (1):
>   perf intel-pt: Fix syntax in documentation of config option
> 
> Jin Yao (1):
>   perf report: Calculate the average cycles of iterations
> 
> Kan Liang (5):
>   perf tools: Support new sample type for physical address
>   perf sort: Add sort option for physical address
>   perf mem: Support physical address
>   perf script: Support physical address
>   perf test: Add test case for PERF_SAMPLE_PHYS_ADDR
> 
> Ravi Bangoria (1):
>   perf test powerpc: Fix 'Object code reading' test
> 
> Sukadev Bhattiprolu (1):
>   perf vendor events powerpc: Remove duplicate events
> 
>  tools/include/uapi/linux/perf_event.h              |   4 +-
>  tools/perf/Documentation/intel-pt.txt              |   2 +-
>  tools/perf/Documentation/perf-mem.txt              |   4 +
>  tools/perf/Documentation/perf-record.txt           |   5 +-
>  tools/perf/Documentation/perf-report.txt           |   1 +
>  tools/perf/Documentation/perf-script.txt           |   2 +-
>  tools/perf/Documentation/perf-trace.txt            |   2 +-
>  tools/perf/builtin-mem.c                           |  97 -
>  tools/perf/builtin-record.c                        |   2 +
>  tools/perf/builtin-script.c                        |  15 ++-
>  tools/perf/builtin-stat.c                          |   2 +-
>  tools/perf/builtin-trace.c                         |  39 ++-
>  tools/perf/perf.h                                  |   1 +
>  .../pmu-events/arch/powerpc/power9/frontend.json   |   7 +-
>  .../perf/pmu-events/arch/powerpc/power9/other.json | 120 -
>  .../pmu-events/arch/powerpc/power9/pipeline.json   |   7 +-
>  tools/perf/pmu-events/arch/powerpc/power9/pmc.json |   7 +-
>  tools/perf/tests/code-reading.c                    |   5 +
>  tools/perf/tests/sample-parsing.c                  |   6 +-
>  tools/perf/ui/browsers/annotate.c                  |   3 +-
>  tools/perf/ui/browsers/hists.c                     |   8 +-
>  tools/perf/ui/stdio/hist.c                         |  10 +-
>  tools/perf/util/callchain.c                        |  49 -
>  tools/perf/util/callchain.h                        |   9 +-
>  tools/perf/util/event.h                            |   1 +
>  tools/perf/util/evsel.c                            |  19 +++-
>  tools/perf/util/evsel.h                            |   1 +
>  tools/perf/util/hist.c   

Re: [PATCH 0/4] PCI: Cleanup unused stuff

2017-10-07 Thread Ingo Molnar

* Bjorn Helgaas  wrote:

> Sorry for the long cc list.  These are pretty trivial; they just remove
> some unnecessary declarations across several arches.
> 
> ---
> 
> Bjorn Helgaas (4):
>   PCI: Remove redundant pcibios_set_master() declarations
>   PCI: Remove redundant pci_dev, pci_bus, resource declarations
>   PCI: Remove unused declarations
>   alpha/PCI: Make pdev_save_srm_config() static
> 
> 
>  arch/alpha/include/asm/pci.h            |    5 -
>  arch/alpha/kernel/pci.c                 |   11 ++-
>  arch/alpha/kernel/pci_impl.h            |    8
>  arch/cris/include/asm/pci.h             |    9 -
>  arch/frv/include/asm/pci.h              |    4
>  arch/ia64/include/asm/pci.h             |    4
>  arch/mips/include/asm/pci.h             |    4
>  arch/mn10300/include/asm/pci.h          |    4
>  arch/mn10300/unit-asb2305/pci-asb2305.h |    3 ---
>  arch/parisc/include/asm/pci.h           |    8
>  arch/powerpc/include/asm/pci.h          |    2 --
>  arch/sh/include/asm/pci.h               |    4
>  arch/sparc/include/asm/pci_32.h         |    2 --
>  arch/x86/include/asm/pci.h              |    2 --
>  arch/xtensa/include/asm/pci.h           |    2 --
>  15 files changed, 10 insertions(+), 62 deletions(-)

Nice cleanups! For the whole series:

  Reviewed-by: Ingo Molnar 

Thanks,

Ingo



Re: [GIT PULL 00/25] perf/core improvements and fixes

2017-06-21 Thread Ingo Molnar

* Arnaldo Carvalho de Melo  wrote:

> Hi Ingo,
> 
>   Please consider pulling,
> 
> - Arnaldo
> 
> Test results at the end of this message, as usual.
> 
> The following changes since commit 007b811b4041989ec2dc91b9614aa2c41332723e:
> 
>   Merge tag 'perf-core-for-mingo-4.13-20170719' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core (2017-06-20 10:49:08 +0200)
> 
> are available in the git repository at:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-4.13-20170621
> 
> for you to fetch changes up to 701516ae3dec801084bc913d21e03fce15c61a0b:
> 
>   perf script: Fix message because field list option is -F not -f (2017-06-21 11:35:53 -0300)
> 
> 
> perf/core improvements and fixes:
> 
> New features:
> 
> - Add support to measure SMI cost in 'perf stat' (Kan Liang)
> 
> - Add support for unwinding callchains in powerpc with libdw (Paolo Bonzini)
> 
> Fixes:
> 
> - Fix message: cpu list option is -C not -c (Adrian Hunter)
> 
> - Fix 'perf script' message: field list option is -F not -f (Adrian Hunter)
> 
> - Intel PT fixes: (Adrian Hunter)
> 
>   o Fix missing stack clear
>   o Ensure IP is zero when state is INTEL_PT_STATE_NO_IP
>   o Fix last_ip usage
>   o Ensure never to set 'last_ip' when packet 'count' is zero
>   o Clear FUP flag on error
>   o Fix transactions_sample_type
> 
> Infrastructure:
> 
> - Intel PT cleanups/refactorings (Adrian Hunter)
> 
>   o Use FUP always when scanning for an IP
>   o Add missing __fallthrough
>   o Remove redundant initial_skip checks
>   o Allow decoding with branch tracing disabled
>   o Add default config for pass-through branch enable
>   o Add documentation for new config terms
>   o Add decoder support for ptwrite and power event packets
>   o Add reserved byte to CBR packet payload
>   o Add decoder support for CBR events
> 
> - Move find_process() to the only place that uses it, skimming some
>   more fat from util.[ch] (Arnaldo Carvalho de Melo)
> 
> - Do parameter validation earlier on fetch_kernel_version() (Arnaldo Carvalho de Melo)
> 
> - Remove unused _ALL_SOURCE define (Arnaldo Carvalho de Melo)
> 
> - Add sysfs__write_int function (Kan Liang)
> 
> Signed-off-by: Arnaldo Carvalho de Melo 
> 
> 
> Adrian Hunter (19):
>   perf intel-pt: Move decoder error setting into one condition
>   perf intel-pt: Improve sample timestamp
>   perf intel-pt: Fix missing stack clear
>   perf intel-pt: Ensure IP is zero when state is INTEL_PT_STATE_NO_IP
>   perf intel-pt: Fix last_ip usage
>   perf intel-pt: Ensure never to set 'last_ip' when packet 'count' is zero
>   perf intel-pt: Use FUP always when scanning for an IP
>   perf intel-pt: Clear FUP flag on error
>   perf intel-pt: Add missing __fallthrough
>   perf intel-pt: Allow decoding with branch tracing disabled
>   perf intel-pt: Add default config for pass-through branch enable
>   perf intel-pt: Add documentation for new config terms
>   perf intel-pt: Add decoder support for ptwrite and power event packets
>   perf intel-pt: Add reserved byte to CBR packet payload
>   perf intel-pt: Add decoder support for CBR events
>   perf intel-pt: Remove redundant initial_skip checks
>   perf intel-pt: Fix transactions_sample_type
>   perf tools: Fix message because cpu list option is -C not -c
>   perf script: Fix message because field list option is -F not -f
> 
> Arnaldo Carvalho de Melo (3):
>   perf evsel: Adopt find_process()
>   perf tools: Do parameter validation earlier on fetch_kernel_version()
>   perf tools: Remove unused _ALL_SOURCE define
> 
> Kan Liang (2):
>   tools lib api fs: Add sysfs__write_int function
>   perf stat: Add support to measure SMI cost
> 
> Paolo Bonzini (1):
>   perf unwind: Support for powerpc
> 
>  tools/lib/api/fs/fs.c                              |  30 +++
>  tools/lib/api/fs/fs.h                              |   4 +
>  tools/perf/Documentation/intel-pt.txt              |  36 +++
>  tools/perf/Documentation/perf-stat.txt             |  14 +
>  tools/perf/Makefile.config                         |   2 +-
>  tools/perf/arch/powerpc/util/Build                 |   2 +
>  tools/perf/arch/powerpc/util/unwind-libdw.c        |  73 ++
>  tools/perf/arch/x86/util/intel-pt.c                |   5 +
>  tools/perf/builtin-script.c                        |   2 +-
>  tools/perf/builtin-stat.c                          |  49
>  tools/perf/util/evsel.c                            |  39 +++
>  .../perf/util/intel-pt-decoder/intel-pt-decoder.c  | 290 +++--
>  .../perf/util/intel-pt-decoder/intel-pt-decoder.h  |  13 +
>  .../util/intel-pt-decoder/intel-pt-pkt-decoder.c   | 110 +++-
>  .../util/intel-pt-decoder/intel-pt-pkt-decoder.h   |   7 +
>  tools/perf/util/int
