Re: [PATCH 5.10 04/57] net: octeontx2: Make sure the buffer is 128 byte aligned

2021-02-07 Thread Kevin Hao
On Sun, Feb 07, 2021 at 10:20:15AM +0100, Pavel Machek wrote:
> Hi!
> 
> > commit db2805150a0f27c00ad286a29109397a7723adad upstream.
> > 
> > The octeontx2 hardware needs the buffer to be 128 byte aligned.
> > But in the current implementation of napi_alloc_frag(), it can't
> > guarantee the return address is 128 byte aligned even the request size
> > is a multiple of 128 bytes, so we have to request an extra 128 bytes and
> > use the PTR_ALIGN() to make sure that the buffer is aligned correctly.
> > 
> > +++ b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_common.c
> > @@ -473,10 +473,11 @@ dma_addr_t __otx2_alloc_rbuf(struct otx2
> > dma_addr_t iova;
> > u8 *buf;
> >  
> > -   buf = napi_alloc_frag(pool->rbsize);
> > +   buf = napi_alloc_frag(pool->rbsize + OTX2_ALIGN);
> > if (unlikely(!buf))
> > return -ENOMEM;
> >  
> > +   buf = PTR_ALIGN(buf, OTX2_ALIGN);
> 
> So we allocate a buffer, then change it, and then pass modified
> pointer to the page_frag_free(buf); in the error path. That... can't
> be right, right?

It doesn't matter. It will work as long as the address we pass to
page_frag_free()
is in the range buf ~ (buf + rbsize).

> 
> > iova = dma_map_single_attrs(pfvf->dev, buf, pool->rbsize,
> > DMA_FROM_DEVICE, DMA_ATTR_SKIP_CPU_SYNC);
> > if (unlikely(dma_mapping_error(pfvf->dev, iova))) {
> 
> BTW otx2_alloc_rbuf and __otx2_alloc_rbuf should probably return s64,
> as they return negative error code...

It does seem buggy to return dma_addr_t for these two functions, I will cook up
a patch to fix this issue.

Thanks,
Kevin

> 
> Best regards,
>   Pavel
> 
> -- 
> http://www.livejournal.com/~pavelmachek




signature.asc
Description: PGP signature


[PATCH 2/2] tracing/hwlat: Honor the tracing_cpumask

2020-07-30 Thread Kevin Hao
When calculating the cpu mask for the hwlat kernel thread, the wrong
cpu mask is used instead of tracing_cpumask. This renders
tracing/tracing_cpumask useless for the hwlat tracer. Fix it.

Fixes: 0330f7aa8ee6 ("tracing: Have hwlat trace migrate across tracing_cpumask CPUs")
Signed-off-by: Kevin Hao 
---
 kernel/trace/trace_hwlat.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/kernel/trace/trace_hwlat.c b/kernel/trace/trace_hwlat.c
index ddb528a6cd51..17873e5d0353 100644
--- a/kernel/trace/trace_hwlat.c
+++ b/kernel/trace/trace_hwlat.c
@@ -283,6 +283,7 @@ static bool disable_migrate;
 static void move_to_next_cpu(void)
 {
struct cpumask *current_mask = &save_cpumask;
+   struct trace_array *tr = hwlat_trace;
int next_cpu;
 
if (disable_migrate)
@@ -296,7 +297,7 @@ static void move_to_next_cpu(void)
goto disable;
 
get_online_cpus();
-   cpumask_and(current_mask, cpu_online_mask, tracing_buffer_mask);
+   cpumask_and(current_mask, cpu_online_mask, tr->tracing_cpumask);
next_cpu = cpumask_next(smp_processor_id(), current_mask);
put_online_cpus();
 
@@ -372,7 +373,7 @@ static int start_kthread(struct trace_array *tr)
 
/* Just pick the first CPU on first iteration */
get_online_cpus();
-   cpumask_and(current_mask, cpu_online_mask, tracing_buffer_mask);
+   cpumask_and(current_mask, cpu_online_mask, tr->tracing_cpumask);
put_online_cpus();
next_cpu = cpumask_first(current_mask);
 
-- 
2.26.2



[PATCH 1/2] tracing/hwlat: Drop the duplicate assignment in start_kthread()

2020-07-30 Thread Kevin Hao
We have already set 'current_mask' to '&save_cpumask' in its declaration,
so there is no need to assign it again.

Signed-off-by: Kevin Hao 
---
 kernel/trace/trace_hwlat.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/kernel/trace/trace_hwlat.c b/kernel/trace/trace_hwlat.c
index e2be7bb7ef7e..ddb528a6cd51 100644
--- a/kernel/trace/trace_hwlat.c
+++ b/kernel/trace/trace_hwlat.c
@@ -371,7 +371,6 @@ static int start_kthread(struct trace_array *tr)
return 0;
 
/* Just pick the first CPU on first iteration */
-   current_mask = &save_cpumask;
get_online_cpus();
cpumask_and(current_mask, cpu_online_mask, tracing_buffer_mask);
put_online_cpus();
-- 
2.26.2



[PATCH v5.6-rt] mm: slub: Always flush the delayed empty slubs in flush_all()

2020-05-03 Thread Kevin Hao
After commit f0b231101c94 ("mm/SLUB: delay giving back empty slubs to
IRQ enabled regions"), when free_slab() is invoked with IRQs
disabled, the empty slubs are moved to a per-CPU list and freed
later after IRQs are enabled. But the current code checks whether
there really is a cpu slub on a specific cpu before flushing the
delayed empty slubs, which may cause a reference to an already
released kmem_cache in a scenario like the one below:
cpu 0   cpu 1
  kmem_cache_destroy()
flush_all()
 --->IPI   flush_cpu_slab()
 flush_slab()
   deactivate_slab()
 discard_slab()
   free_slab()
 c->page = NULL;
  for_each_online_cpu(cpu)
if (!has_cpu_slab(1, s))
  continue
this skips flushing the delayed
empty slubs released by cpu1
kmem_cache_free(kmem_cache, s)

   kmalloc()
 __slab_alloc()
free_delayed()
__free_slab()
reference to released kmem_cache

Fixes: f0b231101c94 ("mm/SLUB: delay giving back empty slubs to IRQ enabled regions")
Signed-off-by: Kevin Hao 
---
 mm/slub.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/mm/slub.c b/mm/slub.c
index 15c194ff16e6..83b29bf71fd0 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -2402,9 +2402,6 @@ static void flush_all(struct kmem_cache *s)
for_each_online_cpu(cpu) {
struct slub_free_list *f;
 
-   if (!has_cpu_slab(cpu, s))
-   continue;
-
f = &per_cpu(slub_free_list, cpu);
raw_spin_lock_irq(&f->lock);
list_splice_init(&f->list, &tofree);
-- 
2.26.0



Re: [tip:x86/urgent] x86/fpu: Set the xcomp_bv when we fake up a XSAVES area

2017-01-23 Thread Kevin Hao
On Mon, Jan 23, 2017 at 06:38:42PM -0800, Dave Hansen wrote:
> On 01/23/2017 06:09 PM, Kevin Hao wrote:
> > On Mon, Jan 23, 2017 at 06:01:10PM -0800, Dave Hansen wrote:
> >> On 01/23/2017 05:50 PM, Kevin Hao wrote:
> >>> According to the ISA manual, XSAVES also set the XCOMP_BV[62:0]. My code 
> >>> only
> >>> try to be compatible with what the cpu does when excuting XSAVES. The 
> >>> following
> >>> is quoted from 325462-sdm-vol-1-2abcd-3abcd.pdf.
> >>>   The XSAVES instructions sets bit 63 of the XCOMP_BV field of the XSAVE 
> >>> header while writing RFBM[62:0] to
> >>>   XCOMP_BV[62:0]. The XSAVES instruction does not write any part of the 
> >>> XSAVE header other than the XSTATE_BV
> >>>   and XCOMP_BV fields.
> >> What purpose does it serve to make copyin_to_xsaves() set that bit,
> > We try to fake up a memory area which is supposed to be composed by XSAVES
> > instruction. My code is just trying to do what the XSAVES do.
> 
> No. copyin_to_xsaves() copies data into an *existing* XSAVES-formatted
> buffer.  If you want to change what it does, fine, but that's not what
> it does or tries to do today.

No, I didn't change what the function copyin_to_xsaves() does. I just tried to
fix the bug in it.

> 
> >> other than helping to hide bugs?
> > Why do you think it hide the bug? In contrast, I think my patch fixes what 
> > the
> > bug really is. The memory area we fake up is bug, we should fix it there.
> 
> Yu-cheng found the bug.  That bug will probably manifest in other code
> paths than copyin_to_xsaves().  If we did your patch, it would hide the
> bug in those other code paths.

XCOMP_BV[62:0] is supposed to be updated by the XSAVEC/XSAVES instructions.
In theory it should not be touched by software, except in special cases
like what we do in copyin_to_xsaves(). Trying to init XCOMP_BV[62:0] to some
assumed value in fpstate_init() is error prone and just papers over the
real bug.

Thanks,
Kevin


signature.asc
Description: PGP signature


Re: [tip:x86/urgent] x86/fpu: Set the xcomp_bv when we fake up a XSAVES area

2017-01-23 Thread Kevin Hao
On Mon, Jan 23, 2017 at 06:01:10PM -0800, Dave Hansen wrote:
> On 01/23/2017 05:50 PM, Kevin Hao wrote:
> > According to the ISA manual, XSAVES also set the XCOMP_BV[62:0]. My code 
> > only
> > try to be compatible with what the cpu does when excuting XSAVES. The 
> > following
> > is quoted from 325462-sdm-vol-1-2abcd-3abcd.pdf.
> >   The XSAVES instructions sets bit 63 of the XCOMP_BV field of the XSAVE 
> > header while writing RFBM[62:0] to
> >   XCOMP_BV[62:0]. The XSAVES instruction does not write any part of the 
> > XSAVE header other than the XSTATE_BV
> >   and XCOMP_BV fields.
> 
> What purpose does it serve to make copyin_to_xsaves() set that bit,

We try to fake up a memory area which is supposed to be composed by the
XSAVES instruction. My code just tries to do what XSAVES does.

> other than helping to hide bugs?

Why do you think it hides the bug? On the contrary, I think my patch fixes
what the bug really is. The memory area we fake up is the bug; we should fix it there.

Thanks,
Kevin


signature.asc
Description: PGP signature



Re: [tip:x86/urgent] x86/fpu: Set the xcomp_bv when we fake up a XSAVES area

2017-01-23 Thread Kevin Hao
On Mon, Jan 23, 2017 at 04:53:25PM -0800, Dave Hansen wrote:
> >> The fix I am proposing is...
> >>
> >>state->xsave.header.xcomp_bv = XCOMP_BV_COMPACTED_FORMAT |
> >>   xfeatures_mask;
> > 
> > Actually I thought about this change before I made this patch, but I don't 
> > this
> > is the right fix. It is always error prone to init the xcomp_bv to all the
> > supported feature. In case like copyin_to_xsaves(), it is possible that the
> > features which should be set in xcomp_bv do not equal to all the supported
> > features. Please see the following codes in copyin_to_xsaves():
> > /*
> >  * The state that came in from userspace was user-state only.
> >  * Mask all the user states out of 'xfeatures':
> >  */
> > xsave->header.xfeatures &= XFEATURE_MASK_SUPERVISOR;
> > 
> > /*
> >  * Add back in the features that came in from userspace:
> >  */
> > xsave->header.xfeatures |= xfeatures;
> 
> Hi Kevin,
> 
> I think you may be confusing 'xfeatures' with 'xcomp_bv'.  xfeatures
> tells you what features are present in the buffer while xcomp_bv tells
> you what *format* the buffer is in.

According to the ISA manual, XSAVES also sets XCOMP_BV[62:0]. My code only
tries to be compatible with what the cpu does when executing XSAVES. The
following is quoted from 325462-sdm-vol-1-2abcd-3abcd.pdf:
  The XSAVES instruction sets bit 63 of the XCOMP_BV field of the XSAVE header while writing RFBM[62:0] to XCOMP_BV[62:0]. The XSAVES instruction does not write any part of the XSAVE header other than the XSTATE_BV and XCOMP_BV fields.

Thanks,
Kevin

> 
> Userspace never dictates the *format* of the kernel buffer, only the
> contents of each state.  So, it makes sense that the copyin code would
> not (and should not) modify xcomp_bv.
> 
> We ensure that xcomp_bv has all supported states set all the time, or
> we're *supposed* to.





Re: [tip:x86/urgent] x86/fpu: Set the xcomp_bv when we fake up a XSAVES area

2017-01-23 Thread Kevin Hao
On Mon, Jan 23, 2017 at 01:16:40PM -0800, Yu-cheng Yu wrote:
> On Mon, Jan 23, 2017 at 01:10:20PM -0800, Dave Hansen wrote:
> > The code is:
> > 
> > > void fpstate_init(union fpregs_state *state)
> > > {
> > > if (!static_cpu_has(X86_FEATURE_FPU)) {
> > > fpstate_init_soft(&state->soft);
> > > return;
> > > }
> > > 
> > > memset(state, 0, fpu_kernel_xstate_size);
> > > 
> > > /*
> > >  * XRSTORS requires that this bit is set in xcomp_bv, or
> > >  * it will #GP. Make sure it is replaced after the memset().
> > >  */
> > > if (static_cpu_has(X86_FEATURE_XSAVES))
> > > state->xsave.header.xcomp_bv = XCOMP_BV_COMPACTED_FORMAT;
> > 
> > That seems to set it unconditionally.  What am I missing?
> 
> The fix I am proposing is...
> 
>   state->xsave.header.xcomp_bv = XCOMP_BV_COMPACTED_FORMAT |
>  xfeatures_mask;

Actually I thought about this change before I made this patch, but I don't
think it is the right fix. It is always error prone to init xcomp_bv to all
the supported features. In cases like copyin_to_xsaves(), the features which
should be set in xcomp_bv may not equal all the supported features. Please
see the following code in copyin_to_xsaves():
/*
 * The state that came in from userspace was user-state only.
 * Mask all the user states out of 'xfeatures':
 */
xsave->header.xfeatures &= XFEATURE_MASK_SUPERVISOR;

/*
 * Add back in the features that came in from userspace:
 */
xsave->header.xfeatures |= xfeatures;

Thanks,
Kevin





[tip:x86/urgent] x86/fpu: Set the xcomp_bv when we fake up a XSAVES area

2017-01-23 Thread tip-bot for Kevin Hao
Commit-ID:  4c833368f0bf748d4147bf301b1f95bc8eccb3c0
Gitweb: http://git.kernel.org/tip/4c833368f0bf748d4147bf301b1f95bc8eccb3c0
Author: Kevin Hao <haoke...@gmail.com>
AuthorDate: Sun, 22 Jan 2017 16:50:23 +0800
Committer:  Thomas Gleixner <t...@linutronix.de>
CommitDate: Mon, 23 Jan 2017 10:40:18 +0100

x86/fpu: Set the xcomp_bv when we fake up a XSAVES area

I got the following calltrace on an Apollo Lake SoC with a 32-bit kernel:

  WARNING: CPU: 2 PID: 261 at arch/x86/include/asm/fpu/internal.h:363 
fpu__restore+0x1f5/0x260
  [...]
  Hardware name: Intel Corp. Broxton P/NOTEBOOK, BIOS 
APLIRVPA.X64.0138.B35.1608091058 08/09/2016
  Call Trace:
   dump_stack()
   __warn()
   ? fpu__restore()
   warn_slowpath_null()
   fpu__restore()
   __fpu__restore_sig()
   fpu__restore_sig()
   restore_sigcontext.isra.9()
   sys_sigreturn()
   do_int80_syscall_32()
   entry_INT80_32()

The reason is that a #GP occurs when executing XRSTORS. The root cause
is that we forget to set the xcomp_bv when we fake up the XSAVES area
in the copyin_to_xsaves() function.

Signed-off-by: Kevin Hao <haoke...@gmail.com>
Cc: Andy Lutomirski <l...@kernel.org>
Cc: Borislav Petkov <b...@alien8.de>
Cc: Brian Gerst <brge...@gmail.com>
Cc: Dave Hansen <dave.han...@linux.intel.com>
Cc: Denys Vlasenko <dvlas...@redhat.com>
Cc: Fenghua Yu <fenghua...@intel.com>
Cc: H. Peter Anvin <h...@zytor.com>
Cc: Josh Poimboeuf <jpoim...@redhat.com>
Cc: Linus Torvalds <torva...@linux-foundation.org>
Cc: Oleg Nesterov <o...@redhat.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Quentin Casasnovas <quentin.casasno...@oracle.com>
Cc: Rik van Riel <r...@redhat.com>
Cc: Thomas Gleixner <t...@linutronix.de>
Cc: Yu-cheng Yu <yu-cheng...@intel.com>
Link: 
http://lkml.kernel.org/r/1485075023-30161-1-git-send-email-haoke...@gmail.com
Signed-off-by: Ingo Molnar <mi...@kernel.org>
Signed-off-by: Thomas Gleixner <t...@linutronix.de>
---
 arch/x86/kernel/fpu/xstate.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index 1d77704..e287b90 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -1070,6 +1070,7 @@ int copyin_to_xsaves(const void *kbuf, const void __user *ubuf,
 * Add back in the features that came in from userspace:
 */
xsave->header.xfeatures |= xfeatures;
+   xsave->header.xcomp_bv = XCOMP_BV_COMPACTED_FORMAT | xsave->header.xfeatures;
 
return 0;
 }



[tip:x86/urgent] x86/fpu: Set the xcomp_bv when we fake up a XSAVES area

2017-01-23 Thread tip-bot for Kevin Hao
Commit-ID:  5fa356458b5c918bdf8307b070a3d74bc015d910
Gitweb: http://git.kernel.org/tip/5fa356458b5c918bdf8307b070a3d74bc015d910
Author: Kevin Hao <haoke...@gmail.com>
AuthorDate: Sun, 22 Jan 2017 16:50:23 +0800
Committer:  Ingo Molnar <mi...@kernel.org>
CommitDate: Mon, 23 Jan 2017 09:03:03 +0100

x86/fpu: Set the xcomp_bv when we fake up a XSAVES area

I got the following calltrace on an Apollo Lake SoC with a 32-bit kernel:

  WARNING: CPU: 2 PID: 261 at arch/x86/include/asm/fpu/internal.h:363 
fpu__restore+0x1f5/0x260
  [...]
  Hardware name: Intel Corp. Broxton P/NOTEBOOK, BIOS 
APLIRVPA.X64.0138.B35.1608091058 08/09/2016
  Call Trace:
   dump_stack()
   __warn()
   ? fpu__restore()
   warn_slowpath_null()
   fpu__restore()
   __fpu__restore_sig()
   fpu__restore_sig()
   restore_sigcontext.isra.9()
   sys_sigreturn()
   do_int80_syscall_32()
   entry_INT80_32()

The reason is that a #GP occurs when executing XRSTORS. The root cause
is that we forget to set the xcomp_bv when we fake up the XSAVES area
in the copyin_to_xsaves() function.

Signed-off-by: Kevin Hao <haoke...@gmail.com>
Cc: Andy Lutomirski <l...@kernel.org>
Cc: Borislav Petkov <b...@alien8.de>
Cc: Brian Gerst <brge...@gmail.com>
Cc: Dave Hansen <dave.han...@linux.intel.com>
Cc: Denys Vlasenko <dvlas...@redhat.com>
Cc: Fenghua Yu <fenghua...@intel.com>
Cc: H. Peter Anvin <h...@zytor.com>
Cc: Josh Poimboeuf <jpoim...@redhat.com>
Cc: Linus Torvalds <torva...@linux-foundation.org>
Cc: Oleg Nesterov <o...@redhat.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Quentin Casasnovas <quentin.casasno...@oracle.com>
Cc: Rik van Riel <r...@redhat.com>
Cc: Thomas Gleixner <t...@linutronix.de>
Cc: Yu-cheng Yu <yu-cheng...@intel.com>
Link: 
http://lkml.kernel.org/r/1485075023-30161-1-git-send-email-haoke...@gmail.com
Signed-off-by: Ingo Molnar <mi...@kernel.org>
---
 arch/x86/kernel/fpu/xstate.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index 1d77704..e287b90 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -1070,6 +1070,7 @@ int copyin_to_xsaves(const void *kbuf, const void __user *ubuf,
 * Add back in the features that came in from userspace:
 */
xsave->header.xfeatures |= xfeatures;
+   xsave->header.xcomp_bv = XCOMP_BV_COMPACTED_FORMAT | xsave->header.xfeatures;
 
return 0;
 }



[PATCH] x86/fpu: set the xcomp_bv when we fake up a XSAVES area

2017-01-22 Thread Kevin Hao
I got the following calltrace on an Apollo Lake SoC with a 32-bit kernel:
  WARNING: CPU: 2 PID: 261 at arch/x86/include/asm/fpu/internal.h:363 
fpu__restore+0x1f5/0x260
  Modules linked in:
  CPU: 2 PID: 261 Comm: check_hostname. Not tainted 4.10.0-rc4-next-20170120 #90
  Hardware name: Intel Corp. Broxton P/NOTEBOOK, BIOS 
APLIRVPA.X64.0138.B35.1608091058 08/09/2016
  Call Trace:
   dump_stack+0x47/0x5f
   __warn+0xea/0x110
   ? fpu__restore+0x1f5/0x260
   warn_slowpath_null+0x2a/0x30
   fpu__restore+0x1f5/0x260
   __fpu__restore_sig+0x165/0x6b0
   fpu__restore_sig+0x2f/0x50
   restore_sigcontext.isra.9+0xe0/0xf0
   sys_sigreturn+0xaa/0xf0
   do_int80_syscall_32+0x59/0xb0
   entry_INT80_32+0x2a/0x2a
  EIP: 0xb77acc61
  EFLAGS: 0246 CPU: 2
  EAX:  EBX: 0003 ECX: 08151d38 EDX: 
  ESI: bfa9ce20 EDI: 08151d38 EBP: 000c ESP: bfa9cdbc
   DS: 007b ES: 007b FS:  GS: 0033 SS: 007b

The reason is that a #GP occurs when executing XRSTORS. The root cause
is that we forget to set the xcomp_bv when we fake up the XSAVES area
in function copyin_to_xsaves().

Signed-off-by: Kevin Hao <haoke...@gmail.com>
---
 arch/x86/kernel/fpu/xstate.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index 35f7024aace5..2c0df2681481 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -1071,6 +1071,8 @@ int copyin_to_xsaves(const void *kbuf, const void __user *ubuf,
 * Add back in the features that came in from userspace:
 */
xsave->header.xfeatures |= xfeatures;
+   xsave->header.xcomp_bv = XCOMP_BV_COMPACTED_FORMAT |
+xsave->header.xfeatures;
 
return 0;
 }
-- 
2.9.3




Re: [v2,3/6] powerpc: kill mfvtb()

2015-09-07 Thread Kevin Hao
On Mon, Sep 07, 2015 at 07:29:42PM +1000, Michael Ellerman wrote:
> On Mon, 2015-24-08 at 11:20:25 UTC, Kevin Hao wrote:
> > This function is only used by get_vtb(). They are almost the same
> > except the reading from the real register. Move the mfspr() to
> > get_vtb() and kill the function mfvtb(). With this, we can eliminate
> > the use of cpu_has_feature() in very core header file like reg.h.
> > This is a preparation for the use of jump label for cpu_has_feature().
> 
> I don't mind this change. But can you elaborate a bit on the issue with using
> cpu_has_feature() in reg.h? Just so I can understand the problem.
> 
> I assume you ended up in a big mess of includes when you tried to include
> jump_label.h from reg.h?

Yes, "jump_label.h" already includes "reg.h" implicitly:
   from arch/powerpc/include/asm/cache.h:6:0,
   from include/linux/cache.h:5,
   from include/linux/printk.h:8,
   from include/linux/kernel.h:14,
   from include/asm-generic/bug.h:13,
   from arch/powerpc/include/asm/bug.h:127,
   from include/linux/bug.h:4,
   from include/linux/jump_label.h:81,

So including "jump_label.h" from "reg.h" will cause a recursive header
inclusion problem.

Thanks,
Kevin


pgpE7V7gUSsew.pgp
Description: PGP signature


[PATCH v2 1/6] jump_label: make it possible for the archs to invoke jump_label_init() much earlier

2015-08-24 Thread Kevin Hao
Some archs (such as powerpc) want to invoke jump_label_init()
at a much earlier stage. So check static_key_initialized in order to
make sure this function runs only once.

Signed-off-by: Kevin Hao 
---
v2: No change.

 kernel/jump_label.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/kernel/jump_label.c b/kernel/jump_label.c
index f7dd15d537f9..cfc7d7b65432 100644
--- a/kernel/jump_label.c
+++ b/kernel/jump_label.c
@@ -205,6 +205,9 @@ void __init jump_label_init(void)
struct static_key *key = NULL;
struct jump_entry *iter;
 
+   if (static_key_initialized)
+   return;
+
jump_label_lock();
jump_label_sort_entries(iter_start, iter_stop);
 
-- 
2.1.0

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v2 2/6] powerpc: invoke jump_label_init() in a much earlier stage

2015-08-24 Thread Kevin Hao
So we can use static_key for cpu_has_feature() and mmu_has_feature().

Signed-off-by: Kevin Hao 
---
v2: No change.

 arch/powerpc/kernel/setup_32.c | 2 ++
 arch/powerpc/kernel/setup_64.c | 2 ++
 2 files changed, 4 insertions(+)

diff --git a/arch/powerpc/kernel/setup_32.c b/arch/powerpc/kernel/setup_32.c
index bb02e9f6944e..35980a2785ba 100644
--- a/arch/powerpc/kernel/setup_32.c
+++ b/arch/powerpc/kernel/setup_32.c
@@ -113,6 +113,8 @@ notrace void __init machine_init(u64 dt_ptr)
 {
lockdep_init();
 
+   jump_label_init();
+
/* Enable early debugging if any specified (see udbg.h) */
udbg_early_init();
 
diff --git a/arch/powerpc/kernel/setup_64.c b/arch/powerpc/kernel/setup_64.c
index bdcbb716f4d6..f0802a0b4a20 100644
--- a/arch/powerpc/kernel/setup_64.c
+++ b/arch/powerpc/kernel/setup_64.c
@@ -250,6 +250,8 @@ void __init early_setup(unsigned long dt_ptr)
/* Initialize lockdep early or else spinlocks will blow */
lockdep_init();
 
+   jump_label_init();
+
/*  printk is now safe to use --- */
 
/* Enable early debugging if any specified (see udbg.h) */
-- 
2.1.0

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v2 6/6] powerpc: use jump label for mmu_has_feature

2015-08-24 Thread Kevin Hao
The mmu features are fixed once the probe of the mmu features is done.
And the function mmu_has_feature() is used in some hot paths.
Checking the mmu features on every invocation of
mmu_has_feature() seems suboptimal. This tries to reduce the
overhead of that check by using a jump label.

The generated assembly code of the following C program:
if (mmu_has_feature(MMU_FTR_XXX))
xxx()
Before:
lis r9,-16230
lwz r9,12324(r9)
lwz r9,24(r9)
andi.   r10,r9,16
beqlr+

After:
nop if MMU_FTR_XXX is enabled
b xxx   if MMU_FTR_XXX is not enabled

Signed-off-by: Kevin Hao 
---
v2: Use the open-coded definition and initialization for mmu_feat_keys[].

 arch/powerpc/include/asm/mmu.h | 29 +
 arch/powerpc/kernel/cputable.c | 17 +
 arch/powerpc/kernel/setup_32.c |  1 +
 arch/powerpc/kernel/setup_64.c |  1 +
 4 files changed, 48 insertions(+)

diff --git a/arch/powerpc/include/asm/mmu.h b/arch/powerpc/include/asm/mmu.h
index 3d5abfe6ba67..e091de352a75 100644
--- a/arch/powerpc/include/asm/mmu.h
+++ b/arch/powerpc/include/asm/mmu.h
@@ -109,6 +109,34 @@
 DECLARE_PER_CPU(int, next_tlbcam_idx);
 #endif
 
+#ifdef CONFIG_JUMP_LABEL
+#include <linux/jump_label.h>
+
+#define MAX_MMU_FEATURES   (8 * sizeof(((struct cpu_spec 
*)0)->mmu_features))
+
+extern struct static_key_true mmu_feat_keys[MAX_MMU_FEATURES];
+
+extern void mmu_feat_keys_init(void);
+
+static inline int mmu_has_feature(unsigned long feature)
+{
+   int i;
+
+   i = __builtin_ctzl(feature);
+   return static_branch_likely(&mmu_feat_keys[i]);
+}
+
+static inline void mmu_clear_feature(unsigned long feature)
+{
+   int i;
+
+   i = __builtin_ctzl(feature);
+   cur_cpu_spec->mmu_features &= ~feature;
+   static_branch_disable(&mmu_feat_keys[i]);
+}
+#else
+static inline void mmu_feat_keys_init(void) { }
+
 static inline int mmu_has_feature(unsigned long feature)
 {
return (cur_cpu_spec->mmu_features & feature);
@@ -118,6 +146,7 @@ static inline void mmu_clear_feature(unsigned long feature)
 {
cur_cpu_spec->mmu_features &= ~feature;
 }
+#endif
 
 extern unsigned int __start___mmu_ftr_fixup, __stop___mmu_ftr_fixup;
 
diff --git a/arch/powerpc/kernel/cputable.c b/arch/powerpc/kernel/cputable.c
index ea94931c5e70..6b505c98520b 100644
--- a/arch/powerpc/kernel/cputable.c
+++ b/arch/powerpc/kernel/cputable.c
@@ -2214,4 +2214,21 @@ void __init cpu_feat_keys_init(void)
static_branch_disable(&cpu_feat_keys[i]);
}
 }
+
+struct static_key_true mmu_feat_keys[MAX_MMU_FEATURES] = {
+   [0 ... MAX_MMU_FEATURES - 1] = STATIC_KEY_TRUE_INIT
+};
+EXPORT_SYMBOL_GPL(mmu_feat_keys);
+
+void __init mmu_feat_keys_init(void)
+{
+   int i;
+
+   for (i = 0; i < MAX_MMU_FEATURES; i++) {
+   unsigned long f = 1ul << i;
+
+   if (!(cur_cpu_spec->mmu_features & f))
+   static_branch_disable(&mmu_feat_keys[i]);
+   }
+}
 #endif
diff --git a/arch/powerpc/kernel/setup_32.c b/arch/powerpc/kernel/setup_32.c
index 93756175a13c..8acff5a4bc3e 100644
--- a/arch/powerpc/kernel/setup_32.c
+++ b/arch/powerpc/kernel/setup_32.c
@@ -116,6 +116,7 @@ notrace void __init machine_init(u64 dt_ptr)
 
jump_label_init();
cpu_feat_keys_init();
+   mmu_feat_keys_init();
 
/* Enable early debugging if any specified (see udbg.h) */
udbg_early_init();
diff --git a/arch/powerpc/kernel/setup_64.c b/arch/powerpc/kernel/setup_64.c
index 4cf3894d91fa..df6f98f1c46c 100644
--- a/arch/powerpc/kernel/setup_64.c
+++ b/arch/powerpc/kernel/setup_64.c
@@ -252,6 +252,7 @@ void __init early_setup(unsigned long dt_ptr)
 
jump_label_init();
cpu_feat_keys_init();
+   mmu_feat_keys_init();
 
/*  printk is now safe to use --- */
 
-- 
2.1.0

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v2 0/6] powerpc: use jump label for {cpu,mmu}_has_feature()

2015-08-24 Thread Kevin Hao
Hi,

v2:
Drop the following two patches as suggested by Ingo and Peter:
jump_label: no need to acquire the jump_label_mutex in jump_lable_init()
jump_label: introduce DEFINE_STATIC_KEY_{TRUE,FALSE}_ARRAY macros

v1:
I tried to change {cpu,mmu}_has_feature() to use a jump label two years
ago [1], but that code seemed a bit ugly. This is a reimplementation that
moves jump_label_init() much earlier, so the jump label can be used at a very
early stage. Boot tested on p4080ds, t2080rdb and powermac (qemu). This patch
series is against linux-next.

[1] https://lists.ozlabs.org/pipermail/linuxppc-dev/2013-September/111026.html

Kevin Hao (6):
  jump_label: make it possible for the archs to invoke jump_label_init()
much earlier
  powerpc: invoke jump_label_init() in a much earlier stage
  powerpc: kill mfvtb()
  powerpc: move the cpu_has_feature to a separate file
  powerpc: use the jump label for cpu_has_feature
  powerpc: use jump label for mmu_has_feature

 arch/powerpc/include/asm/cacheflush.h   |  1 +
 arch/powerpc/include/asm/cpufeatures.h  | 34 ++
 arch/powerpc/include/asm/cputable.h | 16 +++---
 arch/powerpc/include/asm/cputime.h  |  1 +
 arch/powerpc/include/asm/dbell.h|  1 +
 arch/powerpc/include/asm/dcr-native.h   |  1 +
 arch/powerpc/include/asm/mman.h |  1 +
 arch/powerpc/include/asm/mmu.h  | 29 ++
 arch/powerpc/include/asm/reg.h  |  9 
 arch/powerpc/include/asm/time.h |  3 ++-
 arch/powerpc/include/asm/xor.h  |  1 +
 arch/powerpc/kernel/align.c |  1 +
 arch/powerpc/kernel/cputable.c  | 37 +
 arch/powerpc/kernel/irq.c   |  1 +
 arch/powerpc/kernel/process.c   |  1 +
 arch/powerpc/kernel/setup-common.c  |  1 +
 arch/powerpc/kernel/setup_32.c  |  5 +
 arch/powerpc/kernel/setup_64.c  |  4 
 arch/powerpc/kernel/smp.c   |  1 +
 arch/powerpc/platforms/cell/pervasive.c |  1 +
 arch/powerpc/xmon/ppc-dis.c |  1 +
 kernel/jump_label.c |  3 +++
 22 files changed, 135 insertions(+), 18 deletions(-)
 create mode 100644 arch/powerpc/include/asm/cpufeatures.h

-- 
2.1.0

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v2 5/6] powerpc: use the jump label for cpu_has_feature

2015-08-24 Thread Kevin Hao
The cpu features are fixed once the probe of the cpu features is done.
And the function cpu_has_feature() is used in some hot paths.
Checking the cpu features on every invocation of
cpu_has_feature() seems suboptimal. This tries to reduce the
overhead of that check by using a jump label.

The generated assembly code of the following C program:
if (cpu_has_feature(CPU_FTR_XXX))
xxx()

Before:
lis r9,-16230
lwz r9,12324(r9)
lwz r9,12(r9)
andi.   r10,r9,512
beqlr-

After:
nop if CPU_FTR_XXX is enabled
b xxx   if CPU_FTR_XXX is not enabled

Signed-off-by: Kevin Hao 
---
v2: Use the open-coded definition and initialization for cpu_feat_keys[].

 arch/powerpc/include/asm/cpufeatures.h | 20 
 arch/powerpc/include/asm/cputable.h|  8 
 arch/powerpc/kernel/cputable.c | 20 
 arch/powerpc/kernel/setup_32.c |  1 +
 arch/powerpc/kernel/setup_64.c |  1 +
 5 files changed, 50 insertions(+)

diff --git a/arch/powerpc/include/asm/cpufeatures.h 
b/arch/powerpc/include/asm/cpufeatures.h
index 37650db5044f..405a97fe6ef9 100644
--- a/arch/powerpc/include/asm/cpufeatures.h
+++ b/arch/powerpc/include/asm/cpufeatures.h
@@ -3,6 +3,25 @@
 
 #include <asm/cputable.h>
 
+#ifdef CONFIG_JUMP_LABEL
+#include <linux/jump_label.h>
+
+extern struct static_key_true cpu_feat_keys[MAX_CPU_FEATURES];
+
+static inline int cpu_has_feature(unsigned long feature)
+{
+   int i;
+
+   if (CPU_FTRS_ALWAYS & feature)
+   return 1;
+
+   if (!(CPU_FTRS_POSSIBLE & feature))
+   return 0;
+
+   i = __builtin_ctzl(feature);
+   return static_branch_likely(&cpu_feat_keys[i]);
+}
+#else
 static inline int cpu_has_feature(unsigned long feature)
 {
return (CPU_FTRS_ALWAYS & feature) ||
@@ -10,5 +29,6 @@ static inline int cpu_has_feature(unsigned long feature)
& cur_cpu_spec->cpu_features
& feature);
 }
+#endif
 
 #endif /* __ASM_POWERPC_CPUFEATURE_H */
diff --git a/arch/powerpc/include/asm/cputable.h 
b/arch/powerpc/include/asm/cputable.h
index ae4b6ef341cd..2ebee2894102 100644
--- a/arch/powerpc/include/asm/cputable.h
+++ b/arch/powerpc/include/asm/cputable.h
@@ -114,6 +114,12 @@ extern void do_feature_fixups(unsigned long value, void 
*fixup_start,
 
 extern const char *powerpc_base_platform;
 
+#ifdef CONFIG_JUMP_LABEL
+extern void cpu_feat_keys_init(void);
+#else
+static inline void cpu_feat_keys_init(void) { }
+#endif
+
 /* TLB flush actions. Used as argument to cpu_spec.flush_tlb() hook */
 enum {
TLB_INVAL_SCOPE_GLOBAL = 0, /* invalidate all TLBs */
@@ -124,6 +130,8 @@ enum {
 
 /* CPU kernel features */
 
+#define MAX_CPU_FEATURES   (8 * sizeof(((struct cpu_spec 
*)0)->cpu_features))
+
 /* Retain the 32b definitions all use bottom half of word */
 #define CPU_FTR_COHERENT_ICACHEASM_CONST(0x0001)
 #define CPU_FTR_L2CR   ASM_CONST(0x0002)
diff --git a/arch/powerpc/kernel/cputable.c b/arch/powerpc/kernel/cputable.c
index 7d80bfdfb15e..ea94931c5e70 100644
--- a/arch/powerpc/kernel/cputable.c
+++ b/arch/powerpc/kernel/cputable.c
@@ -15,6 +15,7 @@
 #include 
 #include 
 #include 
+#include <linux/jump_label.h>
 
 #include 
 #include 
@@ -2195,3 +2196,22 @@ struct cpu_spec * __init identify_cpu(unsigned long 
offset, unsigned int pvr)
 
return NULL;
 }
+
+#ifdef CONFIG_JUMP_LABEL
+struct static_key_true cpu_feat_keys[MAX_CPU_FEATURES] = {
+   [0 ... MAX_CPU_FEATURES - 1] = STATIC_KEY_TRUE_INIT
+};
+EXPORT_SYMBOL_GPL(cpu_feat_keys);
+
+void __init cpu_feat_keys_init(void)
+{
+   int i;
+
+   for (i = 0; i < MAX_CPU_FEATURES; i++) {
+   unsigned long f = 1ul << i;
+
+   if (!(cur_cpu_spec->cpu_features & f))
+   static_branch_disable(&cpu_feat_keys[i]);
+   }
+}
+#endif
diff --git a/arch/powerpc/kernel/setup_32.c b/arch/powerpc/kernel/setup_32.c
index f0868f510b3b..93756175a13c 100644
--- a/arch/powerpc/kernel/setup_32.c
+++ b/arch/powerpc/kernel/setup_32.c
@@ -115,6 +115,7 @@ notrace void __init machine_init(u64 dt_ptr)
lockdep_init();
 
jump_label_init();
+   cpu_feat_keys_init();
 
/* Enable early debugging if any specified (see udbg.h) */
udbg_early_init();
diff --git a/arch/powerpc/kernel/setup_64.c b/arch/powerpc/kernel/setup_64.c
index f0802a0b4a20..4cf3894d91fa 100644
--- a/arch/powerpc/kernel/setup_64.c
+++ b/arch/powerpc/kernel/setup_64.c
@@ -251,6 +251,7 @@ void __init early_setup(unsigned long dt_ptr)
lockdep_init();
 
jump_label_init();
+   cpu_feat_keys_init();
 
/*  printk is now safe to use --- */
 
-- 
2.1.0

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH v2 4/6] powerpc: move the cpu_has_feature to a separate file

2015-08-24 Thread Kevin Hao
We plan to use a jump label for cpu_has_feature(). In order to implement
this we need to include linux/jump_label.h in asm/cputable.h.
But asm/cputable.h is such a basic header file for ppc that
it is included by almost all the other header files. Including
linux/jump_label.h there would introduce various recursive inclusions,
and that is very hard to fix. So we choose to move the function
cpu_has_feature() to a separate header file before using the jump label
for it. No functional change.

Signed-off-by: Kevin Hao 
---
v2: No change.

 arch/powerpc/include/asm/cacheflush.h   |  1 +
 arch/powerpc/include/asm/cpufeatures.h  | 14 ++
 arch/powerpc/include/asm/cputable.h |  8 
 arch/powerpc/include/asm/cputime.h  |  1 +
 arch/powerpc/include/asm/dbell.h|  1 +
 arch/powerpc/include/asm/dcr-native.h   |  1 +
 arch/powerpc/include/asm/mman.h |  1 +
 arch/powerpc/include/asm/time.h |  1 +
 arch/powerpc/include/asm/xor.h  |  1 +
 arch/powerpc/kernel/align.c |  1 +
 arch/powerpc/kernel/irq.c   |  1 +
 arch/powerpc/kernel/process.c   |  1 +
 arch/powerpc/kernel/setup-common.c  |  1 +
 arch/powerpc/kernel/setup_32.c  |  1 +
 arch/powerpc/kernel/smp.c   |  1 +
 arch/powerpc/platforms/cell/pervasive.c |  1 +
 arch/powerpc/xmon/ppc-dis.c |  1 +
 17 files changed, 29 insertions(+), 8 deletions(-)
 create mode 100644 arch/powerpc/include/asm/cpufeatures.h

diff --git a/arch/powerpc/include/asm/cacheflush.h 
b/arch/powerpc/include/asm/cacheflush.h
index 6229e6b6037b..3bdcd9231852 100644
--- a/arch/powerpc/include/asm/cacheflush.h
+++ b/arch/powerpc/include/asm/cacheflush.h
@@ -11,6 +11,7 @@
 
 #include <linux/mm.h>
 #include <asm/cputable.h>
+#include <asm/cpufeatures.h>
 
 /*
  * No cache flushing is required when address mappings are changed,
diff --git a/arch/powerpc/include/asm/cpufeatures.h 
b/arch/powerpc/include/asm/cpufeatures.h
new file mode 100644
index ..37650db5044f
--- /dev/null
+++ b/arch/powerpc/include/asm/cpufeatures.h
@@ -0,0 +1,14 @@
+#ifndef __ASM_POWERPC_CPUFEATURES_H
+#define __ASM_POWERPC_CPUFEATURES_H
+
+#include <asm/cputable.h>
+
+static inline int cpu_has_feature(unsigned long feature)
+{
+   return (CPU_FTRS_ALWAYS & feature) ||
+  (CPU_FTRS_POSSIBLE
+   & cur_cpu_spec->cpu_features
+   & feature);
+}
+
+#endif /* __ASM_POWERPC_CPUFEATURE_H */
diff --git a/arch/powerpc/include/asm/cputable.h 
b/arch/powerpc/include/asm/cputable.h
index b118072670fb..ae4b6ef341cd 100644
--- a/arch/powerpc/include/asm/cputable.h
+++ b/arch/powerpc/include/asm/cputable.h
@@ -556,14 +556,6 @@ enum {
 };
 #endif /* __powerpc64__ */
 
-static inline int cpu_has_feature(unsigned long feature)
-{
-   return (CPU_FTRS_ALWAYS & feature) ||
-  (CPU_FTRS_POSSIBLE
-   & cur_cpu_spec->cpu_features
-   & feature);
-}
-
 #define HBP_NUM 1
 
 #endif /* !__ASSEMBLY__ */
diff --git a/arch/powerpc/include/asm/cputime.h 
b/arch/powerpc/include/asm/cputime.h
index e2452550bcb1..b91837865c0e 100644
--- a/arch/powerpc/include/asm/cputime.h
+++ b/arch/powerpc/include/asm/cputime.h
@@ -28,6 +28,7 @@ static inline void setup_cputime_one_jiffy(void) { }
 #include <asm/div64.h>
 #include <asm/time.h>
 #include <asm/param.h>
+#include <asm/cpufeatures.h>
 
 typedef u64 __nocast cputime_t;
 typedef u64 __nocast cputime64_t;
diff --git a/arch/powerpc/include/asm/dbell.h b/arch/powerpc/include/asm/dbell.h
index 5fa6b20eba10..2d9eae338f70 100644
--- a/arch/powerpc/include/asm/dbell.h
+++ b/arch/powerpc/include/asm/dbell.h
@@ -16,6 +16,7 @@
 #include 
 
 #include <asm/ppc-opcode.h>
+#include <asm/cpufeatures.h>
 
 #define PPC_DBELL_MSG_BRDCAST  (0x0400)
 #define PPC_DBELL_TYPE(x)  (((x) & 0xf) << (63-36))
diff --git a/arch/powerpc/include/asm/dcr-native.h 
b/arch/powerpc/include/asm/dcr-native.h
index 4efc11dacb98..0186ba05bfe1 100644
--- a/arch/powerpc/include/asm/dcr-native.h
+++ b/arch/powerpc/include/asm/dcr-native.h
@@ -24,6 +24,7 @@
 
 #include <linux/spinlock.h>
 #include <asm/cputable.h>
+#include <asm/cpufeatures.h>
 
 typedef struct {
unsigned int base;
diff --git a/arch/powerpc/include/asm/mman.h b/arch/powerpc/include/asm/mman.h
index 8565c254151a..74922ad05e6c 100644
--- a/arch/powerpc/include/asm/mman.h
+++ b/arch/powerpc/include/asm/mman.h
@@ -13,6 +13,7 @@
 
 #include <asm/cputable.h>
 #include <linux/mm.h>
+#include <asm/cpufeatures.h>
 
 /*
  * This file is included by linux/mman.h, so we can't use cacl_vm_prot_bits()
diff --git a/arch/powerpc/include/asm/time.h b/arch/powerpc/include/asm/time.h
index 6f69828458fb..fa63005f827f 100644
--- a/arch/powerpc/include/asm/time.h
+++ b/arch/powerpc/include/asm/time.h
@@ -18,6 +18,7 @@
 #include <linux/percpu.h>
 
 #include <asm/processor.h>
+#include <asm/cpufeatures.h>
 
 /* time.c */
 extern unsigned long tb_ticks_per_jiffy;
diff --git a/arch/powerpc/include/asm/xor.h b/arch/powerpc/include/asm/xor.h
index 0abb97f3be10..15ba0d07937f 100644
--- a/arch/powerpc/include/asm/xor.h
+++ b/arch/powerpc/include/asm/xor.h
@@ -23,6 +23,7 @@
 #ifdef CONFIG_ALTIVEC
 
 #include <asm/cputable.h>
+#include <asm/cpufeatures.h>
 
 void xor_altivec_2(u

[PATCH v2 3/6] powerpc: kill mfvtb()

2015-08-24 Thread Kevin Hao
This function is only used by get_vtb(). They are almost the same
except the reading from the real register. Move the mfspr() to
get_vtb() and kill the function mfvtb(). With this, we can eliminate
the use of cpu_has_feature() in very core header file like reg.h.
This is a preparation for the use of jump label for cpu_has_feature().

Signed-off-by: Kevin Hao 
---
v2: No change.

 arch/powerpc/include/asm/reg.h  | 9 -
 arch/powerpc/include/asm/time.h | 2 +-
 2 files changed, 1 insertion(+), 10 deletions(-)

diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h
index aa1cc5f015ee..d0b5f4b63776 100644
--- a/arch/powerpc/include/asm/reg.h
+++ b/arch/powerpc/include/asm/reg.h
@@ -1207,15 +1207,6 @@
 : "r" ((unsigned long)(v)) \
 : "memory")
 
-static inline unsigned long mfvtb (void)
-{
-#ifdef CONFIG_PPC_BOOK3S_64
-   if (cpu_has_feature(CPU_FTR_ARCH_207S))
-   return mfspr(SPRN_VTB);
-#endif
-   return 0;
-}
-
 #ifdef __powerpc64__
 #if defined(CONFIG_PPC_CELL) || defined(CONFIG_PPC_FSL_BOOK3E)
 #define mftb() ({unsigned long rval;   \
diff --git a/arch/powerpc/include/asm/time.h b/arch/powerpc/include/asm/time.h
index 10fc784a2ad4..6f69828458fb 100644
--- a/arch/powerpc/include/asm/time.h
+++ b/arch/powerpc/include/asm/time.h
@@ -106,7 +106,7 @@ static inline u64 get_vtb(void)
 {
 #ifdef CONFIG_PPC_BOOK3S_64
if (cpu_has_feature(CPU_FTR_ARCH_207S))
-   return mfvtb();
+   return mfspr(SPRN_VTB);
 #endif
return 0;
 }
-- 
2.1.0

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v2 5/6] powerpc: use the jump label for cpu_has_feature

2015-08-24 Thread Kevin Hao
The cpu features are fixed once the probe of cpu features are done.
And the function cpu_has_feature() does be used in some hot path.
The checking of the cpu features for each time of invoking of
cpu_has_feature() seems suboptimal. This tries to reduce this
overhead of this check by using jump label.

The generated assemble code of the following c program:
if (cpu_has_feature(CPU_FTR_XXX))
xxx()

Before:
lis r9,-16230
lwz r9,12324(r9)
lwz r9,12(r9)
andi.   r10,r9,512
beqlr-

After:
nop if CPU_FTR_XXX is enabled
b xxx   if CPU_FTR_XXX is not enabled

Signed-off-by: Kevin Hao haoke...@gmail.com
---
v2: Use the open-coded definition and initialization for cpu_feat_keys[].

 arch/powerpc/include/asm/cpufeatures.h | 20 
 arch/powerpc/include/asm/cputable.h|  8 
 arch/powerpc/kernel/cputable.c | 20 
 arch/powerpc/kernel/setup_32.c |  1 +
 arch/powerpc/kernel/setup_64.c |  1 +
 5 files changed, 50 insertions(+)

diff --git a/arch/powerpc/include/asm/cpufeatures.h 
b/arch/powerpc/include/asm/cpufeatures.h
index 37650db5044f..405a97fe6ef9 100644
--- a/arch/powerpc/include/asm/cpufeatures.h
+++ b/arch/powerpc/include/asm/cpufeatures.h
@@ -3,6 +3,25 @@
 
 #include <asm/cputable.h>
 
+#ifdef CONFIG_JUMP_LABEL
+#include <linux/jump_label.h>
+
+extern struct static_key_true cpu_feat_keys[MAX_CPU_FEATURES];
+
+static inline int cpu_has_feature(unsigned long feature)
+{
+   int i;
+
+   if (CPU_FTRS_ALWAYS & feature)
+   return 1;
+
+   if (!(CPU_FTRS_POSSIBLE & feature))
+   return 0;
+
+   i = __builtin_ctzl(feature);
+   return static_branch_likely(&cpu_feat_keys[i]);
+}
+#else
 static inline int cpu_has_feature(unsigned long feature)
 {
return (CPU_FTRS_ALWAYS & feature) ||
@@ -10,5 +29,6 @@ static inline int cpu_has_feature(unsigned long feature)
 & cur_cpu_spec->cpu_features
 & feature);
 }
+#endif
 
 #endif /* __ASM_POWERPC_CPUFEATURE_H */
diff --git a/arch/powerpc/include/asm/cputable.h 
b/arch/powerpc/include/asm/cputable.h
index ae4b6ef341cd..2ebee2894102 100644
--- a/arch/powerpc/include/asm/cputable.h
+++ b/arch/powerpc/include/asm/cputable.h
@@ -114,6 +114,12 @@ extern void do_feature_fixups(unsigned long value, void 
*fixup_start,
 
 extern const char *powerpc_base_platform;
 
+#ifdef CONFIG_JUMP_LABEL
+extern void cpu_feat_keys_init(void);
+#else
+static inline void cpu_feat_keys_init(void) { }
+#endif
+
 /* TLB flush actions. Used as argument to cpu_spec.flush_tlb() hook */
 enum {
TLB_INVAL_SCOPE_GLOBAL = 0, /* invalidate all TLBs */
@@ -124,6 +130,8 @@ enum {
 
 /* CPU kernel features */
 
+#define MAX_CPU_FEATURES   (8 * sizeof(((struct cpu_spec *)0)->cpu_features))
+
 /* Retain the 32b definitions all use bottom half of word */
 #define CPU_FTR_COHERENT_ICACHEASM_CONST(0x0001)
 #define CPU_FTR_L2CR   ASM_CONST(0x0002)
diff --git a/arch/powerpc/kernel/cputable.c b/arch/powerpc/kernel/cputable.c
index 7d80bfdfb15e..ea94931c5e70 100644
--- a/arch/powerpc/kernel/cputable.c
+++ b/arch/powerpc/kernel/cputable.c
@@ -15,6 +15,7 @@
 #include <linux/threads.h>
 #include <linux/init.h>
 #include <linux/export.h>
+#include <linux/jump_label.h>
 
 #include <asm/oprofile_impl.h>
 #include <asm/cputable.h>
@@ -2195,3 +2196,22 @@ struct cpu_spec * __init identify_cpu(unsigned long 
offset, unsigned int pvr)
 
return NULL;
 }
+
+#ifdef CONFIG_JUMP_LABEL
+struct static_key_true cpu_feat_keys[MAX_CPU_FEATURES] = {
+   [0 ... MAX_CPU_FEATURES - 1] = STATIC_KEY_TRUE_INIT
+};
+EXPORT_SYMBOL_GPL(cpu_feat_keys);
+
+void __init cpu_feat_keys_init(void)
+{
+   int i;
+
+   for (i = 0; i < MAX_CPU_FEATURES; i++) {
+   unsigned long f = 1ul << i;
+
+   if (!(cur_cpu_spec->cpu_features & f))
+   static_branch_disable(&cpu_feat_keys[i]);
+   }
+}
+#endif
diff --git a/arch/powerpc/kernel/setup_32.c b/arch/powerpc/kernel/setup_32.c
index f0868f510b3b..93756175a13c 100644
--- a/arch/powerpc/kernel/setup_32.c
+++ b/arch/powerpc/kernel/setup_32.c
@@ -115,6 +115,7 @@ notrace void __init machine_init(u64 dt_ptr)
lockdep_init();
 
jump_label_init();
+   cpu_feat_keys_init();
 
/* Enable early debugging if any specified (see udbg.h) */
udbg_early_init();
diff --git a/arch/powerpc/kernel/setup_64.c b/arch/powerpc/kernel/setup_64.c
index f0802a0b4a20..4cf3894d91fa 100644
--- a/arch/powerpc/kernel/setup_64.c
+++ b/arch/powerpc/kernel/setup_64.c
@@ -251,6 +251,7 @@ void __init early_setup(unsigned long dt_ptr)
lockdep_init();
 
jump_label_init();
+   cpu_feat_keys_init();
 
/*  printk is now safe to use --- */
 
-- 
2.1.0


[PATCH v2 2/6] powerpc: invoke jump_label_init() in a much earlier stage

2015-08-24 Thread Kevin Hao
So we can use static_key for cpu_has_feature() and mmu_has_feature().

Signed-off-by: Kevin Hao haoke...@gmail.com
---
v2: No change.

 arch/powerpc/kernel/setup_32.c | 2 ++
 arch/powerpc/kernel/setup_64.c | 2 ++
 2 files changed, 4 insertions(+)

diff --git a/arch/powerpc/kernel/setup_32.c b/arch/powerpc/kernel/setup_32.c
index bb02e9f6944e..35980a2785ba 100644
--- a/arch/powerpc/kernel/setup_32.c
+++ b/arch/powerpc/kernel/setup_32.c
@@ -113,6 +113,8 @@ notrace void __init machine_init(u64 dt_ptr)
 {
lockdep_init();
 
+   jump_label_init();
+
/* Enable early debugging if any specified (see udbg.h) */
udbg_early_init();
 
diff --git a/arch/powerpc/kernel/setup_64.c b/arch/powerpc/kernel/setup_64.c
index bdcbb716f4d6..f0802a0b4a20 100644
--- a/arch/powerpc/kernel/setup_64.c
+++ b/arch/powerpc/kernel/setup_64.c
@@ -250,6 +250,8 @@ void __init early_setup(unsigned long dt_ptr)
/* Initialize lockdep early or else spinlocks will blow */
lockdep_init();
 
+   jump_label_init();
+
/*  printk is now safe to use --- */
 
/* Enable early debugging if any specified (see udbg.h) */
-- 
2.1.0



[PATCH v2 6/6] powerpc: use jump label for mmu_has_feature

2015-08-24 Thread Kevin Hao
The mmu features are fixed once the probe of mmu features is done.
And the function mmu_has_feature() is used in some hot paths. Checking
the mmu features on every invocation of mmu_has_feature() seems
suboptimal. This patch reduces the overhead of that check by using a
jump label.

The generated assembly code of the following c program:
if (mmu_has_feature(MMU_FTR_XXX))
xxx()
Before:
lis r9,-16230
lwz r9,12324(r9)
lwz r9,24(r9)
andi.   r10,r9,16
beqlr+

After:
nop if MMU_FTR_XXX is enabled
b xxx   if MMU_FTR_XXX is not enabled

Signed-off-by: Kevin Hao haoke...@gmail.com
---
v2: Use the open-coded definition and initialization for mmu_feat_keys[].

 arch/powerpc/include/asm/mmu.h | 29 +
 arch/powerpc/kernel/cputable.c | 17 +
 arch/powerpc/kernel/setup_32.c |  1 +
 arch/powerpc/kernel/setup_64.c |  1 +
 4 files changed, 48 insertions(+)

diff --git a/arch/powerpc/include/asm/mmu.h b/arch/powerpc/include/asm/mmu.h
index 3d5abfe6ba67..e091de352a75 100644
--- a/arch/powerpc/include/asm/mmu.h
+++ b/arch/powerpc/include/asm/mmu.h
@@ -109,6 +109,34 @@
 DECLARE_PER_CPU(int, next_tlbcam_idx);
 #endif
 
+#ifdef CONFIG_JUMP_LABEL
+#include <linux/jump_label.h>
+
+#define MAX_MMU_FEATURES   (8 * sizeof(((struct cpu_spec *)0)->mmu_features))
+
+extern struct static_key_true mmu_feat_keys[MAX_MMU_FEATURES];
+
+extern void mmu_feat_keys_init(void);
+
+static inline int mmu_has_feature(unsigned long feature)
+{
+   int i;
+
+   i = __builtin_ctzl(feature);
+   return static_branch_likely(&mmu_feat_keys[i]);
+}
+
+static inline void mmu_clear_feature(unsigned long feature)
+{
+   int i;
+
+   i = __builtin_ctzl(feature);
+   cur_cpu_spec->mmu_features &= ~feature;
+   static_branch_disable(&mmu_feat_keys[i]);
+}
+#else
+static inline void mmu_feat_keys_init(void) { }
+
 static inline int mmu_has_feature(unsigned long feature)
 {
return (cur_cpu_spec->mmu_features & feature);
@@ -118,6 +146,7 @@ static inline void mmu_clear_feature(unsigned long feature)
 {
cur_cpu_spec->mmu_features &= ~feature;
 }
+#endif
 
 extern unsigned int __start___mmu_ftr_fixup, __stop___mmu_ftr_fixup;
 
diff --git a/arch/powerpc/kernel/cputable.c b/arch/powerpc/kernel/cputable.c
index ea94931c5e70..6b505c98520b 100644
--- a/arch/powerpc/kernel/cputable.c
+++ b/arch/powerpc/kernel/cputable.c
@@ -2214,4 +2214,21 @@ void __init cpu_feat_keys_init(void)
static_branch_disable(&cpu_feat_keys[i]);
}
 }
+
+struct static_key_true mmu_feat_keys[MAX_MMU_FEATURES] = {
+   [0 ... MAX_MMU_FEATURES - 1] = STATIC_KEY_TRUE_INIT
+};
+EXPORT_SYMBOL_GPL(mmu_feat_keys);
+
+void __init mmu_feat_keys_init(void)
+{
+   int i;
+
+   for (i = 0; i < MAX_MMU_FEATURES; i++) {
+   unsigned long f = 1ul << i;
+
+   if (!(cur_cpu_spec->mmu_features & f))
+   static_branch_disable(&mmu_feat_keys[i]);
+   }
+}
 #endif
diff --git a/arch/powerpc/kernel/setup_32.c b/arch/powerpc/kernel/setup_32.c
index 93756175a13c..8acff5a4bc3e 100644
--- a/arch/powerpc/kernel/setup_32.c
+++ b/arch/powerpc/kernel/setup_32.c
@@ -116,6 +116,7 @@ notrace void __init machine_init(u64 dt_ptr)
 
jump_label_init();
cpu_feat_keys_init();
+   mmu_feat_keys_init();
 
/* Enable early debugging if any specified (see udbg.h) */
udbg_early_init();
diff --git a/arch/powerpc/kernel/setup_64.c b/arch/powerpc/kernel/setup_64.c
index 4cf3894d91fa..df6f98f1c46c 100644
--- a/arch/powerpc/kernel/setup_64.c
+++ b/arch/powerpc/kernel/setup_64.c
@@ -252,6 +252,7 @@ void __init early_setup(unsigned long dt_ptr)
 
jump_label_init();
cpu_feat_keys_init();
+   mmu_feat_keys_init();
 
/*  printk is now safe to use --- */
 
-- 
2.1.0



[PATCH v2 1/6] jump_label: make it possible for the archs to invoke jump_label_init() much earlier

2015-08-24 Thread Kevin Hao
Some archs (such as powerpc) want to invoke jump_label_init() at a
much earlier stage, so check static_key_initialized in order to
make sure this function runs only once.

Signed-off-by: Kevin Hao haoke...@gmail.com
---
v2: No change.

 kernel/jump_label.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/kernel/jump_label.c b/kernel/jump_label.c
index f7dd15d537f9..cfc7d7b65432 100644
--- a/kernel/jump_label.c
+++ b/kernel/jump_label.c
@@ -205,6 +205,9 @@ void __init jump_label_init(void)
struct static_key *key = NULL;
struct jump_entry *iter;
 
+   if (static_key_initialized)
+   return;
+
jump_label_lock();
jump_label_sort_entries(iter_start, iter_stop);
 
-- 
2.1.0



Re: [PATCH 3/8] jump_label: introduce DEFINE_STATIC_KEY_{TRUE,FALSE}_ARRAY macros

2015-08-21 Thread Kevin Hao
On Fri, Aug 21, 2015 at 08:40:59AM +0200, Ingo Molnar wrote:
> Please also Cc: peterz and me to the next submission of the series - static 
> key 
> (and jump label) changes go through the locking tree normally, and there's a 
> number of changes pending already for v4.3:

Sure.

Thanks,
Kevin




Re: [PATCH 3/8] jump_label: introduce DEFINE_STATIC_KEY_{TRUE,FALSE}_ARRAY macros

2015-08-21 Thread Kevin Hao
On Fri, Aug 21, 2015 at 08:28:26AM +0200, Ingo Molnar wrote:
> 
> * Kevin Hao  wrote:
> 
> > These are used to define a static_key_{true,false} array.
> > 
> > Signed-off-by: Kevin Hao 
> > ---
> >  include/linux/jump_label.h | 6 ++
> >  1 file changed, 6 insertions(+)
> > 
> > diff --git a/include/linux/jump_label.h b/include/linux/jump_label.h
> > index 7f653e8f6690..5c1d6a49dd6b 100644
> > --- a/include/linux/jump_label.h
> > +++ b/include/linux/jump_label.h
> > @@ -267,6 +267,12 @@ struct static_key_false {
> >  #define DEFINE_STATIC_KEY_FALSE(name)  \
> > struct static_key_false name = STATIC_KEY_FALSE_INIT
> >  
> > +#define DEFINE_STATIC_KEY_TRUE_ARRAY(name, n)  \
> > +   struct static_key_true name[n] = { [0 ... n - 1] = STATIC_KEY_TRUE_INIT 
> > }
> > +
> > +#define DEFINE_STATIC_KEY_FALSE_ARRAY(name, n) \
> > +   struct static_key_false name[n] = { [0 ... n - 1] = 
> > STATIC_KEY_FALSE_INIT }
> 
> I think the define makes the code more obfuscated and less clear, the 
> open-coded 
> initialization is pretty dense and easy to read to begin with.

OK, I will drop this patch and move the initialization of the array to the
corresponding patch.

Thanks,
Kevin






Re: [PATCH 3/8] jump_label: introduce DEFINE_STATIC_KEY_{TRUE,FALSE}_ARRAY macros

2015-08-20 Thread Kevin Hao
On Thu, Aug 20, 2015 at 08:31:58PM +0200, Peter Zijlstra wrote:
> On Thu, Aug 20, 2015 at 08:14:31PM +0800, Kevin Hao wrote:
> > These are used to define a static_key_{true,false} array.
> 
> Yes but why...
> 
> there might have been some clue in the patches you didn't send me, but
> since you didn't send them, I'm left wondering.

Sorry for the confusion. In order to use jump label for the
{cpu,mmu}_has_feature() functions on powerpc, we need to declare an array of
32 or 64 static_key_true (one static_key_true for each cpu or mmu feature).
The following are the two patches which depend on this patch:
  https://lkml.org/lkml/2015/8/20/355
  https://lkml.org/lkml/2015/8/20/356

So far only the DEFINE_STATIC_KEY_TRUE_ARRAY macro is used, but I think it
seems more canonical to define the macros for both true and false keys at the
same time.

Thanks,
Kevin




Re: [PATCH 1/8] jump_label: no need to acquire the jump_label_mutex in jump_lable_init()

2015-08-20 Thread Kevin Hao
On Thu, Aug 20, 2015 at 08:29:03PM +0200, Peter Zijlstra wrote:
> On Thu, Aug 20, 2015 at 08:14:29PM +0800, Kevin Hao wrote:
> > The jump_label_init() run in a very early stage, even before the
> > sched_init(). So there is no chance for concurrent access of the
> > jump label table.
> 
> It also doesn't hurt to have it. Its better to be consistent and
> conservative with locking unless there is a pressing need.

Yes, it does no real harm. IMHO it may cause the confusion that
jump_label_init() could run in two different thread contexts simultaneously.
Anyway, if you guys don't think so, I can drop this patch.

Thanks,
Kevin




[PATCH 0/8] powerpc: use jump label for {cpu,mmu}_has_feature()

2015-08-20 Thread Kevin Hao
Hi,

I tried to change {cpu,mmu}_has_feature() to use jump labels two years
ago [1], but that code seemed a bit ugly. This is a reimplementation that
moves jump_label_init() much earlier so jump labels can be used at a very
early stage. Boot tested on p4080ds, t2080rdb and powermac (qemu). This
patch series is against linux-next.

[1] https://lists.ozlabs.org/pipermail/linuxppc-dev/2013-September/111026.html

Kevin Hao (8):
  jump_label: no need to acquire the jump_label_mutex in
jump_lable_init()
  jump_label: make it possible for the archs to invoke jump_label_init()
much earlier
  jump_label: introduce DEFINE_STATIC_KEY_{TRUE,FALSE}_ARRAY macros
  powerpc: invoke jump_label_init() in a much earlier stage
  powerpc: kill mfvtb()
  powerpc: move the cpu_has_feature to a separate file
  powerpc: use the jump label for cpu_has_feature
  powerpc: use jump label for mmu_has_feature

 arch/powerpc/include/asm/cacheflush.h   |  1 +
 arch/powerpc/include/asm/cpufeatures.h  | 34 +
 arch/powerpc/include/asm/cputable.h | 16 
 arch/powerpc/include/asm/cputime.h  |  1 +
 arch/powerpc/include/asm/dbell.h|  1 +
 arch/powerpc/include/asm/dcr-native.h   |  1 +
 arch/powerpc/include/asm/mman.h |  1 +
 arch/powerpc/include/asm/mmu.h  | 29 
 arch/powerpc/include/asm/reg.h  |  9 -
 arch/powerpc/include/asm/time.h |  3 ++-
 arch/powerpc/include/asm/xor.h  |  1 +
 arch/powerpc/kernel/align.c |  1 +
 arch/powerpc/kernel/cputable.c  | 33 
 arch/powerpc/kernel/irq.c   |  1 +
 arch/powerpc/kernel/process.c   |  1 +
 arch/powerpc/kernel/setup-common.c  |  1 +
 arch/powerpc/kernel/setup_32.c  |  5 +
 arch/powerpc/kernel/setup_64.c  |  4 
 arch/powerpc/kernel/smp.c   |  1 +
 arch/powerpc/platforms/cell/pervasive.c |  1 +
 arch/powerpc/xmon/ppc-dis.c |  1 +
 include/linux/jump_label.h  |  6 ++
 kernel/jump_label.c |  5 +++--
 23 files changed, 137 insertions(+), 20 deletions(-)
 create mode 100644 arch/powerpc/include/asm/cpufeatures.h

-- 
2.1.0



[PATCH 4/8] powerpc: invoke jump_label_init() in a much earlier stage

2015-08-20 Thread Kevin Hao
So we can use static_key for cpu_has_feature() and mmu_has_feature().

Signed-off-by: Kevin Hao 
---
 arch/powerpc/kernel/setup_32.c | 2 ++
 arch/powerpc/kernel/setup_64.c | 2 ++
 2 files changed, 4 insertions(+)

diff --git a/arch/powerpc/kernel/setup_32.c b/arch/powerpc/kernel/setup_32.c
index bb02e9f6944e..35980a2785ba 100644
--- a/arch/powerpc/kernel/setup_32.c
+++ b/arch/powerpc/kernel/setup_32.c
@@ -113,6 +113,8 @@ notrace void __init machine_init(u64 dt_ptr)
 {
lockdep_init();
 
+   jump_label_init();
+
/* Enable early debugging if any specified (see udbg.h) */
udbg_early_init();
 
diff --git a/arch/powerpc/kernel/setup_64.c b/arch/powerpc/kernel/setup_64.c
index bdcbb716f4d6..f0802a0b4a20 100644
--- a/arch/powerpc/kernel/setup_64.c
+++ b/arch/powerpc/kernel/setup_64.c
@@ -250,6 +250,8 @@ void __init early_setup(unsigned long dt_ptr)
/* Initialize lockdep early or else spinlocks will blow */
lockdep_init();
 
+   jump_label_init();
+
/*  printk is now safe to use --- */
 
/* Enable early debugging if any specified (see udbg.h) */
-- 
2.1.0



[PATCH 1/8] jump_label: no need to acquire the jump_label_mutex in jump_lable_init()

2015-08-20 Thread Kevin Hao
jump_label_init() runs at a very early stage, even before
sched_init(), so there is no chance of concurrent access to the
jump label table.

Signed-off-by: Kevin Hao 
---
 kernel/jump_label.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/kernel/jump_label.c b/kernel/jump_label.c
index f7dd15d537f9..df1a7fbe7cd5 100644
--- a/kernel/jump_label.c
+++ b/kernel/jump_label.c
@@ -205,7 +205,6 @@ void __init jump_label_init(void)
struct static_key *key = NULL;
struct jump_entry *iter;
 
-   jump_label_lock();
jump_label_sort_entries(iter_start, iter_stop);
 
for (iter = iter_start; iter < iter_stop; iter++) {
@@ -229,7 +228,6 @@ void __init jump_label_init(void)
 #endif
}
static_key_initialized = true;
-   jump_label_unlock();
 }
 
 #ifdef CONFIG_MODULES
-- 
2.1.0



[PATCH 2/8] jump_label: make it possible for the archs to invoke jump_label_init() much earlier

2015-08-20 Thread Kevin Hao
Some archs (such as powerpc) want to invoke jump_label_init() at a
much earlier stage, so check static_key_initialized in order to
make sure this function runs only once.

Signed-off-by: Kevin Hao 
---
 kernel/jump_label.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/kernel/jump_label.c b/kernel/jump_label.c
index df1a7fbe7cd5..fcae370f8794 100644
--- a/kernel/jump_label.c
+++ b/kernel/jump_label.c
@@ -205,6 +205,9 @@ void __init jump_label_init(void)
struct static_key *key = NULL;
struct jump_entry *iter;
 
+   if (static_key_initialized)
+   return;
+
jump_label_sort_entries(iter_start, iter_stop);
 
for (iter = iter_start; iter < iter_stop; iter++) {
-- 
2.1.0



[PATCH 6/8] powerpc: move the cpu_has_feature to a separate file

2015-08-20 Thread Kevin Hao
We plan to use a jump label for cpu_has_feature(). In order to implement
this we need to include linux/jump_label.h in asm/cputable.h. But
asm/cputable.h is such a basic header file for ppc that it is included by
almost all the other header files. Including linux/jump_label.h there
introduces various recursive inclusions, and fixing that is very hard. So
we choose to move the function cpu_has_feature() to a separate header
file before using the jump label for it. No functional change.

Signed-off-by: Kevin Hao 
---
 arch/powerpc/include/asm/cacheflush.h   |  1 +
 arch/powerpc/include/asm/cpufeatures.h  | 14 ++
 arch/powerpc/include/asm/cputable.h |  8 
 arch/powerpc/include/asm/cputime.h  |  1 +
 arch/powerpc/include/asm/dbell.h|  1 +
 arch/powerpc/include/asm/dcr-native.h   |  1 +
 arch/powerpc/include/asm/mman.h |  1 +
 arch/powerpc/include/asm/time.h |  1 +
 arch/powerpc/include/asm/xor.h  |  1 +
 arch/powerpc/kernel/align.c |  1 +
 arch/powerpc/kernel/irq.c   |  1 +
 arch/powerpc/kernel/process.c   |  1 +
 arch/powerpc/kernel/setup-common.c  |  1 +
 arch/powerpc/kernel/setup_32.c  |  1 +
 arch/powerpc/kernel/smp.c   |  1 +
 arch/powerpc/platforms/cell/pervasive.c |  1 +
 arch/powerpc/xmon/ppc-dis.c |  1 +
 17 files changed, 29 insertions(+), 8 deletions(-)
 create mode 100644 arch/powerpc/include/asm/cpufeatures.h

diff --git a/arch/powerpc/include/asm/cacheflush.h 
b/arch/powerpc/include/asm/cacheflush.h
index 6229e6b6037b..3bdcd9231852 100644
--- a/arch/powerpc/include/asm/cacheflush.h
+++ b/arch/powerpc/include/asm/cacheflush.h
@@ -11,6 +11,7 @@
 
 #include 
 #include 
+#include <asm/cpufeatures.h>
 
 /*
  * No cache flushing is required when address mappings are changed,
diff --git a/arch/powerpc/include/asm/cpufeatures.h 
b/arch/powerpc/include/asm/cpufeatures.h
new file mode 100644
index ..37650db5044f
--- /dev/null
+++ b/arch/powerpc/include/asm/cpufeatures.h
@@ -0,0 +1,14 @@
+#ifndef __ASM_POWERPC_CPUFEATURES_H
+#define __ASM_POWERPC_CPUFEATURES_H
+
+#include <asm/cputable.h>
+
+static inline int cpu_has_feature(unsigned long feature)
+{
+   return (CPU_FTRS_ALWAYS & feature) ||
+  (CPU_FTRS_POSSIBLE
+   & cur_cpu_spec->cpu_features
+   & feature);
+}
+
+#endif /* __ASM_POWERPC_CPUFEATURE_H */
diff --git a/arch/powerpc/include/asm/cputable.h 
b/arch/powerpc/include/asm/cputable.h
index b118072670fb..ae4b6ef341cd 100644
--- a/arch/powerpc/include/asm/cputable.h
+++ b/arch/powerpc/include/asm/cputable.h
@@ -556,14 +556,6 @@ enum {
 };
 #endif /* __powerpc64__ */
 
-static inline int cpu_has_feature(unsigned long feature)
-{
-   return (CPU_FTRS_ALWAYS & feature) ||
-  (CPU_FTRS_POSSIBLE
-   & cur_cpu_spec->cpu_features
-   & feature);
-}
-
 #define HBP_NUM 1
 
 #endif /* !__ASSEMBLY__ */
diff --git a/arch/powerpc/include/asm/cputime.h 
b/arch/powerpc/include/asm/cputime.h
index e2452550bcb1..b91837865c0e 100644
--- a/arch/powerpc/include/asm/cputime.h
+++ b/arch/powerpc/include/asm/cputime.h
@@ -28,6 +28,7 @@ static inline void setup_cputime_one_jiffy(void) { }
 #include 
 #include 
 #include 
+#include <asm/cpufeatures.h>
 
 typedef u64 __nocast cputime_t;
 typedef u64 __nocast cputime64_t;
diff --git a/arch/powerpc/include/asm/dbell.h b/arch/powerpc/include/asm/dbell.h
index 5fa6b20eba10..2d9eae338f70 100644
--- a/arch/powerpc/include/asm/dbell.h
+++ b/arch/powerpc/include/asm/dbell.h
@@ -16,6 +16,7 @@
 #include 
 
 #include 
+#include <asm/cpufeatures.h>
 
 #define PPC_DBELL_MSG_BRDCAST  (0x0400)
 #define PPC_DBELL_TYPE(x)  (((x) & 0xf) << (63-36))
diff --git a/arch/powerpc/include/asm/dcr-native.h 
b/arch/powerpc/include/asm/dcr-native.h
index 4efc11dacb98..0186ba05bfe1 100644
--- a/arch/powerpc/include/asm/dcr-native.h
+++ b/arch/powerpc/include/asm/dcr-native.h
@@ -24,6 +24,7 @@
 
 #include 
 #include 
+#include <asm/cpufeatures.h>
 
 typedef struct {
unsigned int base;
diff --git a/arch/powerpc/include/asm/mman.h b/arch/powerpc/include/asm/mman.h
index 8565c254151a..74922ad05e6c 100644
--- a/arch/powerpc/include/asm/mman.h
+++ b/arch/powerpc/include/asm/mman.h
@@ -13,6 +13,7 @@
 
 #include 
 #include 
+#include <asm/cpufeatures.h>
 
 /*
  * This file is included by linux/mman.h, so we can't use cacl_vm_prot_bits()
diff --git a/arch/powerpc/include/asm/time.h b/arch/powerpc/include/asm/time.h
index 6f69828458fb..fa63005f827f 100644
--- a/arch/powerpc/include/asm/time.h
+++ b/arch/powerpc/include/asm/time.h
@@ -18,6 +18,7 @@
 #include 
 
 #include 
+#include <asm/cpufeatures.h>
 
 /* time.c */
 extern unsigned long tb_ticks_per_jiffy;
diff --git a/arch/powerpc/include/asm/xor.h b/arch/powerpc/include/asm/xor.h
index 0abb97f3be10..15ba0d07937f 100644
--- a/arch/powerpc/include/asm/xor.h
+++ b/arch/powerpc/include/asm/xor.h
@@ -23,6 +23,7 @@
 #ifdef CONFIG_ALTIVEC
 
 #include 
+#include <asm/cpufeatures.h>
 
 void xor_altivec_2(u

[PATCH 8/8] powerpc: use jump label for mmu_has_feature

2015-08-20 Thread Kevin Hao
The mmu features are fixed once the probe of mmu features is done.
And the function mmu_has_feature() is used in some hot paths. Checking
the mmu features on every invocation of mmu_has_feature() seems
suboptimal. This patch reduces the overhead of that check by using a
jump label.

The generated assembly code of the following c program:
if (mmu_has_feature(MMU_FTR_XXX))
xxx()
Before:
lis r9,-16230
lwz r9,12324(r9)
lwz r9,24(r9)
andi.   r10,r9,16
beqlr+

After:
nop if MMU_FTR_XXX is enabled
b xxx   if MMU_FTR_XXX is not enabled

Signed-off-by: Kevin Hao 
---
 arch/powerpc/include/asm/mmu.h | 29 +
 arch/powerpc/kernel/cputable.c | 15 +++
 arch/powerpc/kernel/setup_32.c |  1 +
 arch/powerpc/kernel/setup_64.c |  1 +
 4 files changed, 46 insertions(+)

diff --git a/arch/powerpc/include/asm/mmu.h b/arch/powerpc/include/asm/mmu.h
index 3d5abfe6ba67..e091de352a75 100644
--- a/arch/powerpc/include/asm/mmu.h
+++ b/arch/powerpc/include/asm/mmu.h
@@ -109,6 +109,34 @@
 DECLARE_PER_CPU(int, next_tlbcam_idx);
 #endif
 
+#ifdef CONFIG_JUMP_LABEL
+#include <linux/jump_label.h>
+
+#define MAX_MMU_FEATURES   (8 * sizeof(((struct cpu_spec *)0)->mmu_features))
+
+extern struct static_key_true mmu_feat_keys[MAX_MMU_FEATURES];
+
+extern void mmu_feat_keys_init(void);
+
+static inline int mmu_has_feature(unsigned long feature)
+{
+   int i;
+
+   i = __builtin_ctzl(feature);
+   return static_branch_likely(&mmu_feat_keys[i]);
+}
+
+static inline void mmu_clear_feature(unsigned long feature)
+{
+   int i;
+
+   i = __builtin_ctzl(feature);
+   cur_cpu_spec->mmu_features &= ~feature;
+   static_branch_disable(&mmu_feat_keys[i]);
+}
+#else
+static inline void mmu_feat_keys_init(void) { }
+
 static inline int mmu_has_feature(unsigned long feature)
 {
return (cur_cpu_spec->mmu_features & feature);
@@ -118,6 +146,7 @@ static inline void mmu_clear_feature(unsigned long feature)
 {
cur_cpu_spec->mmu_features &= ~feature;
 }
+#endif
 
 extern unsigned int __start___mmu_ftr_fixup, __stop___mmu_ftr_fixup;
 
diff --git a/arch/powerpc/kernel/cputable.c b/arch/powerpc/kernel/cputable.c
index 7d4fe69a61ed..18a843f139c3 100644
--- a/arch/powerpc/kernel/cputable.c
+++ b/arch/powerpc/kernel/cputable.c
@@ -2212,4 +2212,19 @@ void __init cpu_feat_keys_init(void)
static_branch_disable(&cpu_feat_keys[i]);
}
 }
+
+DEFINE_STATIC_KEY_TRUE_ARRAY(mmu_feat_keys, MAX_MMU_FEATURES);
+EXPORT_SYMBOL_GPL(mmu_feat_keys);
+
+void __init mmu_feat_keys_init(void)
+{
+   int i;
+
+   for (i = 0; i < MAX_MMU_FEATURES; i++) {
+   unsigned long f = 1ul << i;
+
+   if (!(cur_cpu_spec->mmu_features & f))
+   static_branch_disable(&mmu_feat_keys[i]);
+   }
+}
 #endif
diff --git a/arch/powerpc/kernel/setup_32.c b/arch/powerpc/kernel/setup_32.c
index 93756175a13c..8acff5a4bc3e 100644
--- a/arch/powerpc/kernel/setup_32.c
+++ b/arch/powerpc/kernel/setup_32.c
@@ -116,6 +116,7 @@ notrace void __init machine_init(u64 dt_ptr)
 
jump_label_init();
cpu_feat_keys_init();
+   mmu_feat_keys_init();
 
/* Enable early debugging if any specified (see udbg.h) */
udbg_early_init();
diff --git a/arch/powerpc/kernel/setup_64.c b/arch/powerpc/kernel/setup_64.c
index 4cf3894d91fa..df6f98f1c46c 100644
--- a/arch/powerpc/kernel/setup_64.c
+++ b/arch/powerpc/kernel/setup_64.c
@@ -252,6 +252,7 @@ void __init early_setup(unsigned long dt_ptr)
 
jump_label_init();
cpu_feat_keys_init();
+   mmu_feat_keys_init();
 
/*  printk is now safe to use --- */
 
-- 
2.1.0



[PATCH 7/8] powerpc: use the jump label for cpu_has_feature

2015-08-20 Thread Kevin Hao
The cpu features are fixed once the probe of cpu features is done.
And the function cpu_has_feature() is used in some hot paths. Checking
the cpu features on every invocation of cpu_has_feature() seems
suboptimal. This patch reduces the overhead of that check by using a
jump label.

The generated assembly code of the following c program:
if (cpu_has_feature(CPU_FTR_XXX))
xxx()

Before:
lis r9,-16230
lwz r9,12324(r9)
lwz r9,12(r9)
andi.   r10,r9,512
beqlr-

After:
nop if CPU_FTR_XXX is enabled
b xxx   if CPU_FTR_XXX is not enabled

Signed-off-by: Kevin Hao 
---
 arch/powerpc/include/asm/cpufeatures.h | 20 
 arch/powerpc/include/asm/cputable.h|  8 
 arch/powerpc/kernel/cputable.c | 18 ++
 arch/powerpc/kernel/setup_32.c |  1 +
 arch/powerpc/kernel/setup_64.c |  1 +
 5 files changed, 48 insertions(+)

diff --git a/arch/powerpc/include/asm/cpufeatures.h 
b/arch/powerpc/include/asm/cpufeatures.h
index 37650db5044f..405a97fe6ef9 100644
--- a/arch/powerpc/include/asm/cpufeatures.h
+++ b/arch/powerpc/include/asm/cpufeatures.h
@@ -3,6 +3,25 @@
 
 #include <asm/cputable.h>
 
+#ifdef CONFIG_JUMP_LABEL
+#include <linux/jump_label.h>
+
+extern struct static_key_true cpu_feat_keys[MAX_CPU_FEATURES];
+
+static inline int cpu_has_feature(unsigned long feature)
+{
+   int i;
+
+   if (CPU_FTRS_ALWAYS & feature)
+   return 1;
+
+   if (!(CPU_FTRS_POSSIBLE & feature))
+   return 0;
+
+   i = __builtin_ctzl(feature);
+   return static_branch_likely(&cpu_feat_keys[i]);
+}
+#else
 static inline int cpu_has_feature(unsigned long feature)
 {
return (CPU_FTRS_ALWAYS & feature) ||
@@ -10,5 +29,6 @@ static inline int cpu_has_feature(unsigned long feature)
& cur_cpu_spec->cpu_features
& feature);
 }
+#endif
 
 #endif /* __ASM_POWERPC_CPUFEATURE_H */
diff --git a/arch/powerpc/include/asm/cputable.h 
b/arch/powerpc/include/asm/cputable.h
index ae4b6ef341cd..2ebee2894102 100644
--- a/arch/powerpc/include/asm/cputable.h
+++ b/arch/powerpc/include/asm/cputable.h
@@ -114,6 +114,12 @@ extern void do_feature_fixups(unsigned long value, void 
*fixup_start,
 
 extern const char *powerpc_base_platform;
 
+#ifdef CONFIG_JUMP_LABEL
+extern void cpu_feat_keys_init(void);
+#else
+static inline void cpu_feat_keys_init(void) { }
+#endif
+
 /* TLB flush actions. Used as argument to cpu_spec.flush_tlb() hook */
 enum {
TLB_INVAL_SCOPE_GLOBAL = 0, /* invalidate all TLBs */
@@ -124,6 +130,8 @@ enum {
 
 /* CPU kernel features */
 
+#define MAX_CPU_FEATURES   (8 * sizeof(((struct cpu_spec *)0)->cpu_features))
+
 /* Retain the 32b definitions all use bottom half of word */
 #define CPU_FTR_COHERENT_ICACHEASM_CONST(0x0001)
 #define CPU_FTR_L2CR   ASM_CONST(0x0002)
diff --git a/arch/powerpc/kernel/cputable.c b/arch/powerpc/kernel/cputable.c
index 7d80bfdfb15e..7d4fe69a61ed 100644
--- a/arch/powerpc/kernel/cputable.c
+++ b/arch/powerpc/kernel/cputable.c
@@ -15,6 +15,7 @@
 #include <linux/threads.h>
 #include <linux/init.h>
 #include <linux/export.h>
+#include <linux/jump_label.h>
 
 #include <asm/oprofile_impl.h>
 #include <asm/cputable.h>
@@ -2195,3 +2196,20 @@ struct cpu_spec * __init identify_cpu(unsigned long 
offset, unsigned int pvr)
 
return NULL;
 }
+
+#ifdef CONFIG_JUMP_LABEL
+DEFINE_STATIC_KEY_TRUE_ARRAY(cpu_feat_keys, MAX_CPU_FEATURES);
+EXPORT_SYMBOL_GPL(cpu_feat_keys);
+
+void __init cpu_feat_keys_init(void)
+{
+   int i;
+
+   for (i = 0; i < MAX_CPU_FEATURES; i++) {
+   unsigned long f = 1ul << i;
+
+   if (!(cur_cpu_spec->cpu_features & f))
+   static_branch_disable(&cpu_feat_keys[i]);
+   }
+}
+#endif
diff --git a/arch/powerpc/kernel/setup_32.c b/arch/powerpc/kernel/setup_32.c
index f0868f510b3b..93756175a13c 100644
--- a/arch/powerpc/kernel/setup_32.c
+++ b/arch/powerpc/kernel/setup_32.c
@@ -115,6 +115,7 @@ notrace void __init machine_init(u64 dt_ptr)
lockdep_init();
 
jump_label_init();
+   cpu_feat_keys_init();
 
/* Enable early debugging if any specified (see udbg.h) */
udbg_early_init();
diff --git a/arch/powerpc/kernel/setup_64.c b/arch/powerpc/kernel/setup_64.c
index f0802a0b4a20..4cf3894d91fa 100644
--- a/arch/powerpc/kernel/setup_64.c
+++ b/arch/powerpc/kernel/setup_64.c
@@ -251,6 +251,7 @@ void __init early_setup(unsigned long dt_ptr)
lockdep_init();
 
jump_label_init();
+   cpu_feat_keys_init();
 
/*  printk is now safe to use --- */
 
-- 
2.1.0



[PATCH 5/8] powerpc: kill mfvtb()

2015-08-20 Thread Kevin Hao
This function is only used by get_vtb(). They are almost the same
except for the read of the real register. Move the mfspr() into
get_vtb() and kill the function mfvtb(). With this, we can eliminate
the use of cpu_has_feature() in a very core header file like reg.h.
This is a preparation for the use of a jump label for cpu_has_feature().

Signed-off-by: Kevin Hao 
---
 arch/powerpc/include/asm/reg.h  | 9 -
 arch/powerpc/include/asm/time.h | 2 +-
 2 files changed, 1 insertion(+), 10 deletions(-)

diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h
index aa1cc5f015ee..d0b5f4b63776 100644
--- a/arch/powerpc/include/asm/reg.h
+++ b/arch/powerpc/include/asm/reg.h
@@ -1207,15 +1207,6 @@
 : "r" ((unsigned long)(v)) \
 : "memory")
 
-static inline unsigned long mfvtb (void)
-{
-#ifdef CONFIG_PPC_BOOK3S_64
-   if (cpu_has_feature(CPU_FTR_ARCH_207S))
-   return mfspr(SPRN_VTB);
-#endif
-   return 0;
-}
-
 #ifdef __powerpc64__
 #if defined(CONFIG_PPC_CELL) || defined(CONFIG_PPC_FSL_BOOK3E)
 #define mftb() ({unsigned long rval;   \
diff --git a/arch/powerpc/include/asm/time.h b/arch/powerpc/include/asm/time.h
index 10fc784a2ad4..6f69828458fb 100644
--- a/arch/powerpc/include/asm/time.h
+++ b/arch/powerpc/include/asm/time.h
@@ -106,7 +106,7 @@ static inline u64 get_vtb(void)
 {
 #ifdef CONFIG_PPC_BOOK3S_64
if (cpu_has_feature(CPU_FTR_ARCH_207S))
-   return mfvtb();
+   return mfspr(SPRN_VTB);
 #endif
return 0;
 }
-- 
2.1.0



[PATCH 3/8] jump_label: introduce DEFINE_STATIC_KEY_{TRUE,FALSE}_ARRAY macros

2015-08-20 Thread Kevin Hao
These are used to define a static_key_{true,false} array.

Signed-off-by: Kevin Hao 
---
 include/linux/jump_label.h | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/include/linux/jump_label.h b/include/linux/jump_label.h
index 7f653e8f6690..5c1d6a49dd6b 100644
--- a/include/linux/jump_label.h
+++ b/include/linux/jump_label.h
@@ -267,6 +267,12 @@ struct static_key_false {
 #define DEFINE_STATIC_KEY_FALSE(name)  \
struct static_key_false name = STATIC_KEY_FALSE_INIT
 
+#define DEFINE_STATIC_KEY_TRUE_ARRAY(name, n)  \
+   struct static_key_true name[n] = { [0 ... n - 1] = STATIC_KEY_TRUE_INIT }
+
+#define DEFINE_STATIC_KEY_FALSE_ARRAY(name, n) \
+   struct static_key_false name[n] = { [0 ... n - 1] = STATIC_KEY_FALSE_INIT }
+
 #ifdef HAVE_JUMP_LABEL
 
 /*
-- 
2.1.0



[PATCH 8/8] powerpc: use jump label for mmu_has_feature

2015-08-20 Thread Kevin Hao
The mmu features are fixed once the probe of the mmu features is done.
And the function mmu_has_feature() is used in some hot paths.
Checking the mmu features on every invocation of mmu_has_feature()
seems suboptimal. This tries to reduce the overhead of this check
by using a jump label.

The generated assembly code for the following C program:
if (mmu_has_feature(MMU_FTR_XXX))
xxx()
Before:
lis r9,-16230
lwz r9,12324(r9)
lwz r9,24(r9)
andi.   r10,r9,16
beqlr+

After:
nop if MMU_FTR_XXX is enabled
b xxx   if MMU_FTR_XXX is not enabled

Signed-off-by: Kevin Hao haoke...@gmail.com
---
 arch/powerpc/include/asm/mmu.h | 29 +
 arch/powerpc/kernel/cputable.c | 15 +++
 arch/powerpc/kernel/setup_32.c |  1 +
 arch/powerpc/kernel/setup_64.c |  1 +
 4 files changed, 46 insertions(+)

diff --git a/arch/powerpc/include/asm/mmu.h b/arch/powerpc/include/asm/mmu.h
index 3d5abfe6ba67..e091de352a75 100644
--- a/arch/powerpc/include/asm/mmu.h
+++ b/arch/powerpc/include/asm/mmu.h
@@ -109,6 +109,34 @@
 DECLARE_PER_CPU(int, next_tlbcam_idx);
 #endif
 
+#ifdef CONFIG_JUMP_LABEL
+#include <linux/jump_label.h>
+
+#define MAX_MMU_FEATURES   (8 * sizeof(((struct cpu_spec *)0)->mmu_features))
+
+extern struct static_key_true mmu_feat_keys[MAX_MMU_FEATURES];
+
+extern void mmu_feat_keys_init(void);
+
+static inline int mmu_has_feature(unsigned long feature)
+{
+   int i;
+
+   i = __builtin_ctzl(feature);
+   return static_branch_likely(&mmu_feat_keys[i]);
+}
+
+static inline void mmu_clear_feature(unsigned long feature)
+{
+   int i;
+
+   i = __builtin_ctzl(feature);
+   cur_cpu_spec->mmu_features &= ~feature;
+   static_branch_disable(&mmu_feat_keys[i]);
+}
+#else
+static inline void mmu_feat_keys_init(void) { }
+
 static inline int mmu_has_feature(unsigned long feature)
 {
return (cur_cpu_spec->mmu_features & feature);
@@ -118,6 +146,7 @@ static inline void mmu_clear_feature(unsigned long feature)
 {
cur_cpu_spec->mmu_features &= ~feature;
 }
+#endif
 
 extern unsigned int __start___mmu_ftr_fixup, __stop___mmu_ftr_fixup;
 
diff --git a/arch/powerpc/kernel/cputable.c b/arch/powerpc/kernel/cputable.c
index 7d4fe69a61ed..18a843f139c3 100644
--- a/arch/powerpc/kernel/cputable.c
+++ b/arch/powerpc/kernel/cputable.c
@@ -2212,4 +2212,19 @@ void __init cpu_feat_keys_init(void)
static_branch_disable(&cpu_feat_keys[i]);
}
 }
+
+DEFINE_STATIC_KEY_TRUE_ARRAY(mmu_feat_keys, MAX_MMU_FEATURES);
+EXPORT_SYMBOL_GPL(mmu_feat_keys);
+
+void __init mmu_feat_keys_init(void)
+{
+   int i;
+
+   for (i = 0; i < MAX_MMU_FEATURES; i++) {
+   unsigned long f = 1ul << i;
+
+   if (!(cur_cpu_spec->mmu_features & f))
+   static_branch_disable(&mmu_feat_keys[i]);
+   }
+}
 #endif
diff --git a/arch/powerpc/kernel/setup_32.c b/arch/powerpc/kernel/setup_32.c
index 93756175a13c..8acff5a4bc3e 100644
--- a/arch/powerpc/kernel/setup_32.c
+++ b/arch/powerpc/kernel/setup_32.c
@@ -116,6 +116,7 @@ notrace void __init machine_init(u64 dt_ptr)
 
jump_label_init();
cpu_feat_keys_init();
+   mmu_feat_keys_init();
 
/* Enable early debugging if any specified (see udbg.h) */
udbg_early_init();
diff --git a/arch/powerpc/kernel/setup_64.c b/arch/powerpc/kernel/setup_64.c
index 4cf3894d91fa..df6f98f1c46c 100644
--- a/arch/powerpc/kernel/setup_64.c
+++ b/arch/powerpc/kernel/setup_64.c
@@ -252,6 +252,7 @@ void __init early_setup(unsigned long dt_ptr)
 
jump_label_init();
cpu_feat_keys_init();
+   mmu_feat_keys_init();
 
/*  printk is now safe to use --- */
 
-- 
2.1.0


[PATCH 6/8] powerpc: move the cpu_has_feature to a separate file

2015-08-20 Thread Kevin Hao
We plan to use a jump label for cpu_has_feature(). In order to implement
this we need to include linux/jump_label.h in asm/cputable.h.
But asm/cputable.h is such a basic header file for ppc that
it is included by almost all the other header files. Including
linux/jump_label.h there would introduce various recursive inclusions,
which are very hard to fix. So we choose to move the function
cpu_has_feature() to a separate header file before using the jump label
for it. No functional change.

Signed-off-by: Kevin Hao haoke...@gmail.com
---
 arch/powerpc/include/asm/cacheflush.h   |  1 +
 arch/powerpc/include/asm/cpufeatures.h  | 14 ++
 arch/powerpc/include/asm/cputable.h |  8 
 arch/powerpc/include/asm/cputime.h  |  1 +
 arch/powerpc/include/asm/dbell.h|  1 +
 arch/powerpc/include/asm/dcr-native.h   |  1 +
 arch/powerpc/include/asm/mman.h |  1 +
 arch/powerpc/include/asm/time.h |  1 +
 arch/powerpc/include/asm/xor.h  |  1 +
 arch/powerpc/kernel/align.c |  1 +
 arch/powerpc/kernel/irq.c   |  1 +
 arch/powerpc/kernel/process.c   |  1 +
 arch/powerpc/kernel/setup-common.c  |  1 +
 arch/powerpc/kernel/setup_32.c  |  1 +
 arch/powerpc/kernel/smp.c   |  1 +
 arch/powerpc/platforms/cell/pervasive.c |  1 +
 arch/powerpc/xmon/ppc-dis.c |  1 +
 17 files changed, 29 insertions(+), 8 deletions(-)
 create mode 100644 arch/powerpc/include/asm/cpufeatures.h

diff --git a/arch/powerpc/include/asm/cacheflush.h 
b/arch/powerpc/include/asm/cacheflush.h
index 6229e6b6037b..3bdcd9231852 100644
--- a/arch/powerpc/include/asm/cacheflush.h
+++ b/arch/powerpc/include/asm/cacheflush.h
@@ -11,6 +11,7 @@
 
 #include <linux/mm.h>
 #include <asm/cputable.h>
+#include <asm/cpufeatures.h>
 
 /*
  * No cache flushing is required when address mappings are changed,
diff --git a/arch/powerpc/include/asm/cpufeatures.h 
b/arch/powerpc/include/asm/cpufeatures.h
new file mode 100644
index ..37650db5044f
--- /dev/null
+++ b/arch/powerpc/include/asm/cpufeatures.h
@@ -0,0 +1,14 @@
+#ifndef __ASM_POWERPC_CPUFEATURES_H
+#define __ASM_POWERPC_CPUFEATURES_H
+
+#include <asm/cputable.h>
+
+static inline int cpu_has_feature(unsigned long feature)
+{
+   return (CPU_FTRS_ALWAYS & feature) ||
+  (CPU_FTRS_POSSIBLE
+   & cur_cpu_spec->cpu_features
+   & feature);
+}
+
+#endif /* __ASM_POWERPC_CPUFEATURE_H */
diff --git a/arch/powerpc/include/asm/cputable.h 
b/arch/powerpc/include/asm/cputable.h
index b118072670fb..ae4b6ef341cd 100644
--- a/arch/powerpc/include/asm/cputable.h
+++ b/arch/powerpc/include/asm/cputable.h
@@ -556,14 +556,6 @@ enum {
 };
 #endif /* __powerpc64__ */
 
-static inline int cpu_has_feature(unsigned long feature)
-{
-   return (CPU_FTRS_ALWAYS & feature) ||
-  (CPU_FTRS_POSSIBLE
-   & cur_cpu_spec->cpu_features
-   & feature);
-}
-
 #define HBP_NUM 1
 
 #endif /* !__ASSEMBLY__ */
diff --git a/arch/powerpc/include/asm/cputime.h 
b/arch/powerpc/include/asm/cputime.h
index e2452550bcb1..b91837865c0e 100644
--- a/arch/powerpc/include/asm/cputime.h
+++ b/arch/powerpc/include/asm/cputime.h
@@ -28,6 +28,7 @@ static inline void setup_cputime_one_jiffy(void) { }
 #include <asm/div64.h>
 #include <asm/time.h>
 #include <asm/param.h>
+#include <asm/cpufeatures.h>
 
 typedef u64 __nocast cputime_t;
 typedef u64 __nocast cputime64_t;
diff --git a/arch/powerpc/include/asm/dbell.h b/arch/powerpc/include/asm/dbell.h
index 5fa6b20eba10..2d9eae338f70 100644
--- a/arch/powerpc/include/asm/dbell.h
+++ b/arch/powerpc/include/asm/dbell.h
@@ -16,6 +16,7 @@
 #include <linux/threads.h>
 
 #include <asm/ppc-opcode.h>
+#include <asm/cpufeatures.h>
 
 #define PPC_DBELL_MSG_BRDCAST  (0x0400)
 #define PPC_DBELL_TYPE(x)  (((x) & 0xf) << (63-36))
diff --git a/arch/powerpc/include/asm/dcr-native.h 
b/arch/powerpc/include/asm/dcr-native.h
index 4efc11dacb98..0186ba05bfe1 100644
--- a/arch/powerpc/include/asm/dcr-native.h
+++ b/arch/powerpc/include/asm/dcr-native.h
@@ -24,6 +24,7 @@
 
 #include <linux/spinlock.h>
 #include <asm/cputable.h>
+#include <asm/cpufeatures.h>
 
 typedef struct {
unsigned int base;
diff --git a/arch/powerpc/include/asm/mman.h b/arch/powerpc/include/asm/mman.h
index 8565c254151a..74922ad05e6c 100644
--- a/arch/powerpc/include/asm/mman.h
+++ b/arch/powerpc/include/asm/mman.h
@@ -13,6 +13,7 @@
 
 #include <asm/cputable.h>
 #include <linux/mm.h>
+#include <asm/cpufeatures.h>
 
 /*
  * This file is included by linux/mman.h, so we can't use cacl_vm_prot_bits()
diff --git a/arch/powerpc/include/asm/time.h b/arch/powerpc/include/asm/time.h
index 6f69828458fb..fa63005f827f 100644
--- a/arch/powerpc/include/asm/time.h
+++ b/arch/powerpc/include/asm/time.h
@@ -18,6 +18,7 @@
 #include <linux/percpu.h>
 
 #include <asm/processor.h>
+#include <asm/cpufeatures.h>
 
 /* time.c */
 extern unsigned long tb_ticks_per_jiffy;
diff --git a/arch/powerpc/include/asm/xor.h b

[PATCH 4/8] powerpc: invoke jump_label_init() in a much earlier stage

2015-08-20 Thread Kevin Hao
So we can use static_key for cpu_has_feature() and mmu_has_feature().

Signed-off-by: Kevin Hao haoke...@gmail.com
---
 arch/powerpc/kernel/setup_32.c | 2 ++
 arch/powerpc/kernel/setup_64.c | 2 ++
 2 files changed, 4 insertions(+)

diff --git a/arch/powerpc/kernel/setup_32.c b/arch/powerpc/kernel/setup_32.c
index bb02e9f6944e..35980a2785ba 100644
--- a/arch/powerpc/kernel/setup_32.c
+++ b/arch/powerpc/kernel/setup_32.c
@@ -113,6 +113,8 @@ notrace void __init machine_init(u64 dt_ptr)
 {
lockdep_init();
 
+   jump_label_init();
+
/* Enable early debugging if any specified (see udbg.h) */
udbg_early_init();
 
diff --git a/arch/powerpc/kernel/setup_64.c b/arch/powerpc/kernel/setup_64.c
index bdcbb716f4d6..f0802a0b4a20 100644
--- a/arch/powerpc/kernel/setup_64.c
+++ b/arch/powerpc/kernel/setup_64.c
@@ -250,6 +250,8 @@ void __init early_setup(unsigned long dt_ptr)
/* Initialize lockdep early or else spinlocks will blow */
lockdep_init();
 
+   jump_label_init();
+
/*  printk is now safe to use --- */
 
/* Enable early debugging if any specified (see udbg.h) */
-- 
2.1.0



[PATCH 1/8] jump_label: no need to acquire the jump_label_mutex in jump_lable_init()

2015-08-20 Thread Kevin Hao
The jump_label_init() runs at a very early stage, even before
sched_init(), so there is no chance of concurrent access to the
jump label table.

Signed-off-by: Kevin Hao haoke...@gmail.com
---
 kernel/jump_label.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/kernel/jump_label.c b/kernel/jump_label.c
index f7dd15d537f9..df1a7fbe7cd5 100644
--- a/kernel/jump_label.c
+++ b/kernel/jump_label.c
@@ -205,7 +205,6 @@ void __init jump_label_init(void)
struct static_key *key = NULL;
struct jump_entry *iter;
 
-   jump_label_lock();
jump_label_sort_entries(iter_start, iter_stop);
 
for (iter = iter_start; iter < iter_stop; iter++) {
@@ -229,7 +228,6 @@ void __init jump_label_init(void)
 #endif
}
static_key_initialized = true;
-   jump_label_unlock();
 }
 
 #ifdef CONFIG_MODULES
-- 
2.1.0



[PATCH 2/8] jump_label: make it possible for the archs to invoke jump_label_init() much earlier

2015-08-20 Thread Kevin Hao
Some archs (such as powerpc) want to invoke jump_label_init()
at a much earlier stage. So check static_key_initialized in order to
make sure this function runs only once.

Signed-off-by: Kevin Hao haoke...@gmail.com
---
 kernel/jump_label.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/kernel/jump_label.c b/kernel/jump_label.c
index df1a7fbe7cd5..fcae370f8794 100644
--- a/kernel/jump_label.c
+++ b/kernel/jump_label.c
@@ -205,6 +205,9 @@ void __init jump_label_init(void)
struct static_key *key = NULL;
struct jump_entry *iter;
 
+   if (static_key_initialized)
+   return;
+
jump_label_sort_entries(iter_start, iter_stop);
 
for (iter = iter_start; iter < iter_stop; iter++) {
-- 
2.1.0



[PATCH 0/8] powerpc: use jump label for {cpu,mmu}_has_feature()

2015-08-20 Thread Kevin Hao
Hi,

I tried to change {cpu,mmu}_has_feature() to use a jump label two years
ago [1], but that code seemed a bit ugly. This is a reimplementation which moves
jump_label_init() much earlier so the jump label can be used at a very early
stage. Boot tested on p4080ds, t2080rdb and powermac (qemu). This patch series
is against linux-next.

[1] https://lists.ozlabs.org/pipermail/linuxppc-dev/2013-September/111026.html

Kevin Hao (8):
  jump_label: no need to acquire the jump_label_mutex in
jump_lable_init()
  jump_label: make it possible for the archs to invoke jump_label_init()
much earlier
  jump_label: introduce DEFINE_STATIC_KEY_{TRUE,FALSE}_ARRAY macros
  powerpc: invoke jump_label_init() in a much earlier stage
  powerpc: kill mfvtb()
  powerpc: move the cpu_has_feature to a separate file
  powerpc: use the jump label for cpu_has_feature
  powerpc: use jump label for mmu_has_feature

 arch/powerpc/include/asm/cacheflush.h   |  1 +
 arch/powerpc/include/asm/cpufeatures.h  | 34 +
 arch/powerpc/include/asm/cputable.h | 16 
 arch/powerpc/include/asm/cputime.h  |  1 +
 arch/powerpc/include/asm/dbell.h|  1 +
 arch/powerpc/include/asm/dcr-native.h   |  1 +
 arch/powerpc/include/asm/mman.h |  1 +
 arch/powerpc/include/asm/mmu.h  | 29 
 arch/powerpc/include/asm/reg.h  |  9 -
 arch/powerpc/include/asm/time.h |  3 ++-
 arch/powerpc/include/asm/xor.h  |  1 +
 arch/powerpc/kernel/align.c |  1 +
 arch/powerpc/kernel/cputable.c  | 33 
 arch/powerpc/kernel/irq.c   |  1 +
 arch/powerpc/kernel/process.c   |  1 +
 arch/powerpc/kernel/setup-common.c  |  1 +
 arch/powerpc/kernel/setup_32.c  |  5 +
 arch/powerpc/kernel/setup_64.c  |  4 
 arch/powerpc/kernel/smp.c   |  1 +
 arch/powerpc/platforms/cell/pervasive.c |  1 +
 arch/powerpc/xmon/ppc-dis.c |  1 +
 include/linux/jump_label.h  |  6 ++
 kernel/jump_label.c |  5 +++--
 23 files changed, 137 insertions(+), 20 deletions(-)
 create mode 100644 arch/powerpc/include/asm/cpufeatures.h

-- 
2.1.0



Re: [PATCH 1/8] jump_label: no need to acquire the jump_label_mutex in jump_lable_init()

2015-08-20 Thread Kevin Hao
On Thu, Aug 20, 2015 at 08:29:03PM +0200, Peter Zijlstra wrote:
> On Thu, Aug 20, 2015 at 08:14:29PM +0800, Kevin Hao wrote:
> > The jump_label_init() run in a very early stage, even before the
> > sched_init(). So there is no chance for concurrent access of the
> > jump label table.
> 
> It also doesn't hurt to have it. It's better to be consistent and
> conservative with locking unless there is a pressing need.

Yes, it does no real harm. IMHO it may cause confusion that the function
jump_label_init() could run in two different thread contexts simultaneously.
Anyway, if you guys don't think so, I can drop this patch.

Thanks,
Kevin


pgp_4f4fGgWfN.pgp
Description: PGP signature


Re: [PATCH 3/8] jump_label: introduce DEFINE_STATIC_KEY_{TRUE,FALSE}_ARRAY macros

2015-08-20 Thread Kevin Hao
On Thu, Aug 20, 2015 at 08:31:58PM +0200, Peter Zijlstra wrote:
> On Thu, Aug 20, 2015 at 08:14:31PM +0800, Kevin Hao wrote:
> > These are used to define a static_key_{true,false} array.
> 
> Yes but why...
> 
> there might have been some clue in the patches you didn't send me, but
> since you didn't send them, I'm left wondering.

Sorry for the confusion. In order to use a jump label for the
{cpu,mmu}_has_feature() functions on powerpc, we need to declare an array of
32 or 64 static_key_true (one static_key_true for each cpu or mmu feature).
The following are the two patches which depend on this patch:
  https://lkml.org/lkml/2015/8/20/355
  https://lkml.org/lkml/2015/8/20/356

So far only the DEFINE_STATIC_KEY_TRUE_ARRAY macro is used, but I think it
seems more canonical to define the macros for both true and false keys at the
same time.

Thanks,
Kevin


pgpcD5zZ9lU7u.pgp
Description: PGP signature


Re: [PATCH] tty: Remove leftover dependencies on PPC_OF

2015-04-17 Thread Kevin Hao
On Fri, Apr 17, 2015 at 07:58:10AM +0100, Grant Likely wrote:
> On Fri, Apr 17, 2015 at 7:01 AM, Kevin Hao  wrote:
> > On Thu, Apr 16, 2015 at 10:20:59PM -0700, Guenter Roeck wrote:
> >> powerpc qemu runs fail with the current upstream kernel.
> >> Bisect points to commit 52d996270032 ("powerpc: kill PPC_OF").
> >> Unfortunately, that commit did not remove all instances of PPC_OF.
> >> Practical impact is that the serial driver used by powerpc qemu
> >> targets is no longer built into the test kernel.
> >
> > Sorry for the break. This is a dependency issue. The patch 213dce3c17a6
> > ("tty: kconfig: remove the superfluous dependency on PPC_OF") has already
> > been merged into tty-next, but has not propagated to upstream yet. I failed
> > to remind Michael of this when the pull request was sent to Linus.
> 
> This means you're doing it wrong.

Definitely.

> If there are dependencies between
> patches, then they *must* all be in the same tree. You cannot depend
> on the order that branches get merged because it breaks bisecting.

We don't try to depend on the order that branches get merged. We just want to
merge the patch which kills PPC_OF completely only after all the patches for
the specific drivers have been merged. I should have taken responsibility for
making sure of this, but of course I missed something. Yes, putting all the
patches into the same tree would be a better plan. I will make sure of this
next time. :-)

> 
> Or, to put it another way, every branch must compile on it's own
> without needing to be merged into mainline or linux-next first.

This issue doesn't break the build. That is another reason we didn't find it
earlier.

Thanks,
Kevin

> 
> There are two ways to handle this. The easiest is to commit all the
> patches to a single branch. Then there is guaranteed to be no
> problems. However, if the patches really need to be split up between
> several branches, then the way to handle it is to put the prerequisite
> patches into a new clean branch, and then merge that branch into each
> tree that needs the prerequisite. That way every single tree includes
> the required commit without committing it more than once or breaking
> bisectability.
> 
> g.
> 
> >
> > Thanks,
> > Kevin
> >>
> >> Fixes: 52d996270032 ("powerpc: kill PPC_OF")
> >> Cc: Kevin Hao 
> >> Cc: Michael Ellerman 
> >> Signed-off-by: Guenter Roeck 
> >> ---
> >>  drivers/tty/serial/Kconfig | 4 ++--
> >>  1 file changed, 2 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/drivers/tty/serial/Kconfig b/drivers/tty/serial/Kconfig
> >> index d2501f01cd03..77471d3db8d3 100644
> >> --- a/drivers/tty/serial/Kconfig
> >> +++ b/drivers/tty/serial/Kconfig
> >> @@ -835,7 +835,7 @@ config SERIAL_MCF_CONSOLE
> >>
> >>  config SERIAL_PMACZILOG
> >>   tristate "Mac or PowerMac z85c30 ESCC support"
> >> - depends on (M68K && MAC) || (PPC_OF && PPC_PMAC)
> >> + depends on (M68K && MAC) || PPC_PMAC
> >>   select SERIAL_CORE
> >>   help
> >> This driver supports the Zilog z85C30 serial ports found on
> >> @@ -1153,7 +1153,7 @@ config SERIAL_OMAP_CONSOLE
> >>
> >>  config SERIAL_OF_PLATFORM_NWPSERIAL
> >>   tristate "NWP serial port driver"
> >> - depends on PPC_OF && PPC_DCR
> >> + depends on PPC_DCR
> >>   select SERIAL_OF_PLATFORM
> >>   select SERIAL_CORE_CONSOLE
> >>   select SERIAL_CORE
> >> --
> >> 2.1.0
> >>


pgpwg2ECpgNRW.pgp
Description: PGP signature


Re: [PATCH] tty: Remove leftover dependencies on PPC_OF

2015-04-17 Thread Kevin Hao
On Thu, Apr 16, 2015 at 10:20:59PM -0700, Guenter Roeck wrote:
> powerpc qemu runs fail with the current upstream kernel.
> Bisect points to commit 52d996270032 ("powerpc: kill PPC_OF").
> Unfortunately, that commit did not remove all instances of PPC_OF.
> Practical impact is that the serial driver used by powerpc qemu
> targets is no longer built into the test kernel.

Sorry for the break. This is a dependency issue. The patch 213dce3c17a6
("tty: kconfig: remove the superfluous dependency on PPC_OF") has already
been merged into tty-next, but has not propagated upstream yet. I failed
to remind Michael of this when the pull request was sent to Linus.

Thanks,
Kevin
> 
> Fixes: 52d996270032 ("powerpc: kill PPC_OF")
> Cc: Kevin Hao 
> Cc: Michael Ellerman 
> Signed-off-by: Guenter Roeck 
> ---
>  drivers/tty/serial/Kconfig | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/tty/serial/Kconfig b/drivers/tty/serial/Kconfig
> index d2501f01cd03..77471d3db8d3 100644
> --- a/drivers/tty/serial/Kconfig
> +++ b/drivers/tty/serial/Kconfig
> @@ -835,7 +835,7 @@ config SERIAL_MCF_CONSOLE
>  
>  config SERIAL_PMACZILOG
>   tristate "Mac or PowerMac z85c30 ESCC support"
> - depends on (M68K && MAC) || (PPC_OF && PPC_PMAC)
> + depends on (M68K && MAC) || PPC_PMAC
>   select SERIAL_CORE
>   help
> This driver supports the Zilog z85C30 serial ports found on
> @@ -1153,7 +1153,7 @@ config SERIAL_OMAP_CONSOLE
>  
>  config SERIAL_OF_PLATFORM_NWPSERIAL
>   tristate "NWP serial port driver"
> - depends on PPC_OF && PPC_DCR
> + depends on PPC_DCR
>   select SERIAL_OF_PLATFORM
>   select SERIAL_CORE_CONSOLE
>   select SERIAL_CORE
> -- 
> 2.1.0
> 


pgpdzJFqlOrzo.pgp
Description: PGP signature



Re: [PATCH] powerpc/irq: Remove HAVE_IRQ_EXIT_ON_IRQ_STACK feature at powerpc platform

2014-03-28 Thread Kevin Hao
On Sat, Mar 29, 2014 at 08:27:07AM +1100, Benjamin Herrenschmidt wrote:
> On Fri, 2014-03-28 at 16:18 +0800, Kevin Hao wrote:
> 
> > powerpc: Set the correct ksp_limit on ppc32 when switching to irq stack
> >
> 
> Kevin. It looks like it was applied to 3.14 and sent to 3.12 stable but
> not 3.13 ... can you fix that up ?

It was already merged into 3.13 stable since 3.13.6:
  https://lkml.org/lkml/2014/3/4/787

I guess that Dongsheng didn't use the latest 3.13 stable tree.

Thanks,
Kevin


pgp6NL7sUpnne.pgp
Description: PGP signature


Re: [PATCH] powerpc/irq: Remove HAVE_IRQ_EXIT_ON_IRQ_STACK feature at powerpc platform

2014-03-28 Thread Kevin Hao
On Fri, Mar 28, 2014 at 09:00:13AM +0000, dongsheng.w...@freescale.com wrote:
> Thanks Kevin. Your patch works normally. :)
> 
> I am still a bit confused. I think if __do_softirq keeps getting interrupts,
> the hard irq stack will run out, won't it?

No, it won't. Please see the explanation in the following commit log.

commit cc1f027454929924471bea2f362431072e3c71be
Author: Frederic Weisbecker 
Date:   Tue Sep 24 17:17:47 2013 +0200

irq: Optimize softirq stack selection in irq exit

If irq_exit() is called on the arch's specified irq stack,
it should be safe to run softirqs inline under that same
irq stack as it is near empty by the time we call irq_exit().

For example if we use the same stack for both hard and soft irqs here,
the worst case scenario is:
hardirq -> softirq -> hardirq. But then the softirq supersedes the
first hardirq as the stack user since irq_exit() is called in
a mostly empty stack. So the stack merge in this case looks acceptable.

Stack overrun still have a chance to happen if hardirqs have more
opportunities to nest, but then it's another problem to solve.

So lets adapt the irq exit's softirq stack on top of a new Kconfig symbol
that can be defined when irq_exit() runs on the irq stack. That way
we can spare some stack switch on irq processing and all the cache
issues that come along.

Thanks,
Kevin


pgpj9K2FO6AVp.pgp
Description: PGP signature


Re: [PATCH] powerpc/irq: Remove HAVE_IRQ_EXIT_ON_IRQ_STACK feature at powerpc platform

2014-03-28 Thread Kevin Hao
On Fri, Mar 28, 2014 at 03:38:32PM +0800, Dongsheng Wang wrote:
> From: Wang Dongsheng 
> 
> If softirqs use the hardirq stack, we will get a kernel panic when a hard irq
> comes in again while __do_softirq has enabled local irqs to handle the softirq
> actions. So we need to switch the stack to the softirq stack when invoking
> soft irqs.
> 
>  Task--->
>    | Task stack
>    |
>    Interrupt->EXCEPTION->do_IRQ->
>    ^          | Hard irq stack
>    |          |
>    |          irq_exit->__do_softirq->local_irq_enable-->local_irq_disable
>    |                             | Hard irq stack
>    |                             |
>    |                             Interrupt coming again
>    |                             There will be an interrupt nesting
> 
> Trace 1: Trap 900
> 
> Kernel stack overflow in process e8152f40, r1=e8e05ec0
> CPU: 0 PID: 2399 Comm: image_compress/ Not tainted 3.13.0-rc3-03475-g2e3f85b #432
> task: e8152f40 ti: c080a000 task.ti: ef176000
> NIP: c05bec04 LR: c0305590 CTR: 0010
> REGS: e8e05e10 TRAP: 0901   Not tainted  (3.13.0-rc3-03475-g2e3f85b)

Could you double check if you got the following patch applied?

commit 1a18a66446f3f289b05b634f18012424d82aa63a
Author: Kevin Hao 
Date:   Fri Jan 17 12:25:28 2014 +0800

powerpc: Set the correct ksp_limit on ppc32 when switching to irq stack

Guenter Roeck has got the following call trace on a p2020 board:
  Kernel stack overflow in process eb3e5a00, r1=eb79df90
  CPU: 0 PID: 2838 Comm: ssh Not tainted 3.13.0-rc8-juniper-00146-g19eca00 #4
  task: eb3e5a00 ti: c0616000 task.ti: ef44
  NIP: c003a420 LR: c003a410 CTR: c0017518
  REGS: eb79dee0 TRAP: 0901   Not tainted  (3.13.0-rc8-juniper-00146-g19eca00)
  MSR: 00029000 <CE,EE,ME>  CR: 24008444  XER: 
  GPR00: c003a410 eb79df90 eb3e5a00  eb05d900 0001 65d87646 

  GPR08:  020b8000   44008442
  NIP [c003a420] __do_softirq+0x94/0x1ec
  LR [c003a410] __do_softirq+0x84/0x1ec
  Call Trace:
  [eb79df90] [c003a410] __do_softirq+0x84/0x1ec (unreliable)
  [eb79dfe0] [c003a970] irq_exit+0xbc/0xc8
  [eb79dff0] [c000cc1c] call_do_irq+0x24/0x3c
  [ef441f20] [c00046a8] do_IRQ+0x8c/0xf8
  [ef441f40] [c000e7f4] ret_from_except+0x0/0x18
  --- Exception: 501 at 0xfcda524
  LR = 0x10024900
  Instruction dump:
  7c781b78 3b4a 3a73b040 543c0024 3a80 3b3913a0 7ef5bb78 48201bf9
  5463103a 7d3b182e 7e89b92e 7c008146 <3ba0> 7e7e9b78 4814 57fff87f
  Kernel panic - not syncing: kernel stack overflow
  CPU: 0 PID: 2838 Comm: ssh Not tainted 3.13.0-rc8-juniper-00146-g19eca00 #4
  Call Trace:

The reason is that we have used the wrong register to calculate the
ksp_limit in commit cbc9565ee826 (powerpc: Remove ksp_limit on ppc64).
Just fix it.

As suggested by Benjamin Herrenschmidt, also add the C prototype of the
function in the comment in order to avoid such kind of errors in the
future.
    
Cc: sta...@vger.kernel.org # 3.12
Reported-by: Guenter Roeck 
Tested-by: Guenter Roeck 
Signed-off-by: Kevin Hao 
Signed-off-by: Benjamin Herrenschmidt 

Thanks,
Kevin


pgppiuvIWJRql.pgp
Description: PGP signature





Re: [PATCH 9/9] powerpc/pm: support deep sleep feature on T1040

2014-03-20 Thread Kevin Hao
On Tue, Mar 18, 2014 at 06:18:54PM -0500, Scott Wood wrote:
> >  The sequence "write, readback, sync" will guarantee this according to the 
> > manual. 
> 
> If you're referring to the section you quoted above, the manual does not
> say that.  It only talks about when accesses "to memory regions affected
> by the configuration register write" can be safely made.

  To guarantee that the results of any sequence of writes to configuration
  registers are in effect, the final configuration register write should be
  immediately followed by a read of the same register, and that should be
  followed by a SYNC instruction. Then accesses can safely be made to memory
  regions affected by the configuration register write.

According to the above description in the T4240 reference manual (2.6.1,
Accessing CCSR Memory from the Local Processor), the write to the configuration
register taking effect is a prerequisite for memory accesses to the affected
regions.

> 
> > If we just want to
> > order the delay loop after the load, the following sequence should be 
> > enough:
> > store to CCSR_DDR_SDRAM_CFG_2
> > load from CCSR_DDR_SDRAM_CFG_2
> > isync or sync
> > delay loop
> > 
> > Why do we need the 'date dependency' here? According to the e6500 manual, 
> > the
> > instructions can execute out of order, but complete in order. So I am really
> > wondering why we need to order the load and the following delay loop if 
> > there
> > is no intention to order the storage access? 
> 
> The data (not date) dependency means that the twi will not execute until
> after the load instruction returns data (thus, after the device has
> responded to the load).  The twi, being a flow control instruction
> rather than a storage instruction, should be fully ordered by isync.
> 
> From the isync description in the ISA: "Except as described in the
> preceding sentence, the isync instruction may complete before storage
> accesses associated with instructions preceding the
> isync instruction have been performed."
> 
> I don't know if that really applies to loads (as opposed to stores), and
> it probably doesn't apply to guarded loads, but I feel safer leaving the
> twi in.
> 
> As for sync instead of isync, I see nothing to indicate that it would be
> adequate (though it may be that the only reason there needed to be a
> delay loop in the first place was the lack of a readback/sync before
> doing other I/O, in which case this is moot).

OK, so the intention of 'twi, isync' following the load is not to order the
following storage access, but to order the following delay loop instructions,
right? But according to the e6500 manual, the instructions complete in order.
The following is the definition of 'complete':
  Complete—An instruction is eligible to complete after it finishes executing
  and makes its results available for subsequent instructions. Instructions
  must complete in order from the bottom two entries of the
  completion queue (CQ). The completion unit coordinates how instructions (which
  may have executed out of order) affect architected registers to ensure the
  appearance of serial execution. This guarantees that the completed instruction
  and all previous instructions can cause no exceptions. An instruction completes
  when it is retired, that is, deleted from the CQ.

So the following delay loop instructions should never complete before the
load instruction completes. Once the load instruction has completed, the data
has already been committed to the architected register. So we can guarantee
that the load instruction gets its data (i.e. the device has responded to the
load) before the following delay loop instructions complete, so why do we
need the additional 'twi, isync'?  Then for a case like this (which doesn't
need to order the following storage accesses), I think the following sequence
should be enough:
write to configuration register
read back
delay loop

> I mean that I do not understand why it says, "the load on which the
> Branch depends is performed before any loads caused by instructions
> following the isync" rather than "the Load on which the Branch depends
> is performed before any instructions following the isync".

When we talk about 'performed' here, we only mean the effect of the storage
access. The following is the definition of 'performed':
  performed
  A load or instruction fetch by a processor or mechanism (P1) is performed
  with respect to any processor or mechanism (P2) when the value to be
  returned by the load or instruction fetch can no longer be changed by a
  store by P2. A store by P1 is performed with respect to P2 when a load by
  P2 from the location accessed by the store will return the value stored
  (or a value stored subsequently). An instruction cache block invalidation
  by P1 is performed with respect to P2 when an instruction fetch by P2 will
  not be satisfied from the copy of the block that existed in its instruction
  cache when the instruction causing the invalidation was executed.

Re: [PATCH 9/9] powerpc/pm: support deep sleep feature on T1040

2014-03-15 Thread Kevin Hao
On Fri, Mar 14, 2014 at 05:26:27PM -0500, Scott Wood wrote:
> On Thu, 2014-03-13 at 15:46 +0800, Kevin Hao wrote:
> > On Wed, Mar 12, 2014 at 12:43:05PM -0500, Scott Wood wrote:
> > > > Shouldn't we use "readback, sync" here? The following is quoted form 
> > > > t4240RM:
> > > >   To guarantee that the results of any sequence of writes to 
> > > > configuration
> > > >   registers are in effect, the final configuration register write 
> > > > should be
> > > >   immediately followed by a read of the same register, and that should 
> > > > be
> > > >   followed by a SYNC instruction. Then accesses can safely be made to 
> > > > memory
> > > >   regions affected by the configuration register write.
> > > 
> > > I agree that the sync before the readback is probably not necessary,
> > > since transactions to the same address should already be ordered.
> > > 
> > > A sync after the readback helps if you're trying to order the readback
> > > with subsequent memory accesses, though in that case wouldn't a sync
> > > alone (no readback) be adequate?
> > 
> > No, we don't just want to order the subsequent memory access here.
> > The 'write, readback, sync' is the required sequence if we want to make
> > sure that the write to the CCSR register really takes effect.
> > 
> > >  Though maybe not always -- see the
> > > comment near the end of fsl_elbc_write_buf() in
> > > drivers/mtd/nand_fsl_elbc.c.  I guess the readback does more than just
> > > make sure the device has seen the write, ensuring that the device has
> > > finished the transaction to the point of acting on another one.
> > 
> > Agree.
> > 
> > > 
> > > The data dependency plus isync sequence, which is done by the normal I/O
> > > accessors used from C code, orders the readback versus all future
> > > instructions (not just I/O).  The delay loop is not I/O.
> > 
> > According to the PowerISA, the sequence 'load, date dependency, isync' only
> > order the load accesses. 
> 
> The point is to order the delay loop after the load, not to order
> storage versus storage.

I think the point is to make sure that the write to CCSR_DDR_SDRAM_CFG_2
really takes effect before we begin the delay loop. The sequence "write,
readback, sync" will guarantee this according to the manual. If we just want to
order the delay loop after the load, the following sequence should be enough:
store to CCSR_DDR_SDRAM_CFG_2
load from CCSR_DDR_SDRAM_CFG_2
isync or sync
delay loop

Why do we need the 'date dependency' here? According to the e6500 manual, the
instructions can execute out of order, but complete in order. So I am really
wondering why we need to order the load and the following delay loop if there
is no intention to order the storage access? 

> 
> This is a sequence we're already using on all of our I/O loads
> (excluding accesses like in this patch that don't use the standard
> accessors).  I'm confident that it works even if it's not
> architecturally guaranteed.

This sequence is used to order the load and the storage accesses that follow
it, and this is guaranteed by the architecture. But I don't think it is
suitable for a case like this. The following is quoted from the Power ISA:
  Because stores cannot be performed “out-of-order”
  (see Book III), if a Store instruction depends on the
  value returned by a preceding Load instruction
  (because the value returned by the Load is used to
  compute either the effective address specified by the
  Store or the value to be stored), the corresponding
  storage accesses are performed in program order. The
  same applies if whether the Store instruction is
  executed depends on a conditional Branch instruction
  that in turn depends on the value returned by a
  preceding Load instruction.
  
  Because an isync instruction prevents the execution of
  instructions following the isync until instructions
  preceding the isync have completed, if an isync follows a
  conditional Branch instruction that depends on the
  value returned by a preceding Load instruction, the
  load on which the Branch depends is performed before
  any loads caused by instructions following the isync.

I think the above description would guarantee that the load will be performed
before any storage access (both load and store) following the isync in the
following scenario:
lwz r4, 0(r3)
twi 0, r4, 0
isync

>  I'm not sure that there exists a clear
> architectural way of synchronizing non-storage instructions relative to
> storage instructions.

Isn't that what the execution synchronization instructions such as sync,
isync and mtmsr do?

> 
> Given that isync is documented as preventing any execution of
> instructions after the isync until all previous instructions complete,
> it doesn't seem to make sense for the architecture to explicitly talk
> about loads (as opposed to any other instruction) following a load,
> dependent conditional branch, isync sequence.

Sorry, I didn't get what you mean.

> 
> > So if we want to order all the storage access as well
> > as execution synchronization, we should choose sync here.
> 
> Do we need

Re: [PATCH 9/9] powerpc/pm: support deep sleep feature on T1040

2014-03-13 Thread Kevin Hao
On Wed, Mar 12, 2014 at 12:43:05PM -0500, Scott Wood wrote:
> > Shouldn't we use "readback, sync" here? The following is quoted from the
> > t4240RM:
> >   To guarantee that the results of any sequence of writes to configuration
> >   registers are in effect, the final configuration register write should be
> >   immediately followed by a read of the same register, and that should be
> >   followed by a SYNC instruction. Then accesses can safely be made to memory
> >   regions affected by the configuration register write.
> 
> I agree that the sync before the readback is probably not necessary,
> since transactions to the same address should already be ordered.
> 
> A sync after the readback helps if you're trying to order the readback
> with subsequent memory accesses, though in that case wouldn't a sync
> alone (no readback) be adequate?

No, we don't just want to order the subsequent memory accesses here.
'Write, readback, sync' is the required sequence if we want to make
sure that the write to the CCSR register really takes effect.

>  Though maybe not always -- see the
> comment near the end of fsl_elbc_write_buf() in
> drivers/mtd/nand_fsl_elbc.c.  I guess the readback does more than just
> make sure the device has seen the write, ensuring that the device has
> finished the transaction to the point of acting on another one.

Agree.

> 
> The data dependency plus isync sequence, which is done by the normal I/O
> accessors used from C code, orders the readback versus all future
> instructions (not just I/O).  The delay loop is not I/O.

According to the PowerISA, the sequence 'load, data dependency, isync' only
orders the load accesses. So if we want to order all the storage accesses as
well as achieve execution synchronization, we should choose sync here.

Thanks,
Kevin





Re: [PATCH 9/9] powerpc/pm: support deep sleep feature on T1040

2014-03-11 Thread Kevin Hao
On Tue, Mar 11, 2014 at 08:10:24PM -0500, Scott Wood wrote:
> > +   FSL_DIS_ALL_IRQ
> > +
> > +   /*
> > +* Place DDR controller in self refresh mode.
> > +* From here on, DDR can't be access any more.
> > +*/
> > +   lwz r10, 0(r13)
> > +   orisr10, r10, CCSR_DDR_SDRAM_CFG_2_FRC_SR@h
> > +   stw r10, 0(r13)
> > +
> > +   /* can't call udelay() here, so use a macro to delay */
> > +   FSLDELAY(50)
> 
> A timebase loop doesn't require accessing DDR.
> 
> You also probably want to do a "sync, readback, data dependency, isync"
> sequence to make sure that the store has hit CCSR before you begin your
> delay (or is a delay required at all if you do that?).

Shouldn't we use "readback, sync" here? The following is quoted from the t4240RM:
  To guarantee that the results of any sequence of writes to configuration
  registers are in effect, the final configuration register write should be
  immediately followed by a read of the same register, and that should be
  followed by a SYNC instruction. Then accesses can safely be made to memory
  regions affected by the configuration register write.

> > +
> > +   /* Enable SCU15 to trigger on RCPM Concentrator 0 */
> > +   lwz r10, 0(r15)
> > +   orisr10, r10, DCSR_EPU_EPECR15_IC0@h
> > +   stw r10, 0(r15)
> > +
> > +   /* put Core0 in PH15 mode, trigger EPU FSM */
> > +   lwz r10, 0(r12)
> > +   ori r10, r10, CCSR_RCPM_PCPH15SETR_CORE0
> > +   stw r10, 0(r12)
> 
> Shouldn't there be a sync to ensure that the previous I/O happens before
> the final store to enter PH15?

Do we really need a sync here? According to the PowerISA, the above stores
should be performed in program order.
  If two Store instructions or two Load instructions
  specify storage locations that are both Caching
  Inhibited and Guarded, the corresponding storage
  accesses are performed in program order with
  respect to any processor or mechanism.

Thanks,
Kevin





Re: [PATCH v4 0/4] Bugfix for of_match_node ordering

2014-02-20 Thread Kevin Hao
On Thu, Feb 20, 2014 at 03:57:07PM +0530, Sachin Kamat wrote:
> Hi Kevin,
> 
> On 20 February 2014 15:42, Kevin Hao  wrote:
> > On Thu, Feb 20, 2014 at 02:09:08PM +0530, Sachin Kamat wrote:
> >> Hi Grant,
> >>
> >> I observe the following boot failure with today's (next-20140220) 
> >> linux-next
> >> tree on Exynos based boards with the default exynos_defconfig.
> >
> > Does this help?
> 
> The below patch works for me. Thanks for the fix.

Thanks Sachin.

Hi Grant,

I assume you would fold this change into the original patch. But if you want
a separate patch, please let me know.

Thanks,
Kevin




Re: [PATCH v4 0/4] Bugfix for of_match_node ordering

2014-02-20 Thread Kevin Hao
On Thu, Feb 20, 2014 at 02:09:08PM +0530, Sachin Kamat wrote:
> Hi Grant,
> 
> I observe the following boot failure with today's (next-20140220) linux-next
> tree on Exynos based boards with the default exynos_defconfig.

Does this help?

diff --git a/drivers/of/base.c b/drivers/of/base.c
index 8a27fc907ab6..9cc893530b9a 100644
--- a/drivers/of/base.c
+++ b/drivers/of/base.c
@@ -381,12 +381,16 @@ static int __of_device_is_compatible(const struct device_node *device,
 
/* Compatible match has highest priority */
if (compat && compat[0]) {
-   of_property_for_each_string(device, "compatible", prop, cp) {
+   prop = __of_find_property(device, "compatible", NULL);
+   if (!prop)
+   return 0;
+
+   for (cp = of_prop_next_string(prop, NULL); cp;
+   cp = of_prop_next_string(prop, cp), index++) {
if (of_compat_cmp(cp, compat, strlen(compat)) == 0) {
score = INT_MAX/2 - (index << 2);
break;
}
-   index++;
}
if (!score)
return 0;
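For illustration, the index-preserving walk that this fix restores can be
modeled in plain C: a flattened "compatible" property is a sequence of
NUL-separated strings, and the score must reflect how early the match occurs.
This is a hedged userspace sketch; `compat_index()` is our name, not a kernel
API.

```c
#include <string.h>

/* Walk a flattened "compatible" property (NUL-separated strings, as in the
 * flattened device tree) and return the position of the matching string,
 * or -1 if it is absent.  Earlier entries are the more specific ones, so
 * callers can score them higher. */
static int compat_index(const char *prop, int len, const char *compat)
{
    const char *cp = prop;
    int index = 0;

    while (cp < prop + len) {
        if (strcmp(cp, compat) == 0)
            return index;
        cp += strlen(cp) + 1;  /* step over the string and its NUL */
        index++;
    }
    return -1;
}
```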


Thanks,
Kevin





[PATCH v3 2/4] of: reimplement the matching method for __of_match_node()

2014-02-19 Thread Kevin Hao
In the current implementation of __of_match_node(), each given match entry
is compared against all of the node's compatible strings with
of_device_is_compatible().

To achieve multiple compatible strings per node with ordering from
specific to generic, the given matches would have to be ordered from
specific to generic. For most of the drivers this is not true, and an
alphabetical ordering is more sane there.

Therefore, we define the following priority order for the match, and
then scan all the entries to find the best match.
  1. specific compatible && type && name
  2. specific compatible && type
  3. specific compatible && name
  4. specific compatible
  5. general compatible && type && name
  6. general compatible && type
  7. general compatible && name
  8. general compatible
  9. type && name
  10. type
  11. name

This is based on some pseudo-codes provided by Grant Likely.
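As a toy model of the precedence list above: a compatible match found at
position `index` in the node's compatible list can be scored as
`INT_MAX/2 - (index << 2)` (the expression that appears in the follow-up fix
earlier in this thread), so an earlier, more specific entry always outranks a
later, more generic one, and room remains below for the smaller type and name
bonuses. A sketch, not the kernel code:

```c
#include <assert.h>
#include <limits.h>

/* Score a compatible match by its position in the node's "compatible"
 * list: index 0 is the most specific string.  The shift by 2 leaves the
 * two low bits free for the smaller type/name bonuses added on top. */
static int compat_score(int index)
{
    return INT_MAX / 2 - (index << 2);
}
```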

Signed-off-by: Kevin Hao 
[grant.likely: Changed multiplier to 4 which makes more sense]
Signed-off-by: Grant Likely 
---
v3: Also need to bail out when the match entry has a compatible member
but it doesn't match the device node's compatible.
 
v2: Fix the bug where we get the same score for the following two match
entries:
name2 { }

struct of_device_id matches[] = {
{.name = "name2", },
{.name = "name2", .type = "type1", },
{}
};

 drivers/of/base.c | 96 ---
 1 file changed, 77 insertions(+), 19 deletions(-)

diff --git a/drivers/of/base.c b/drivers/of/base.c
index ba195fbce4c6..8f79f006d86f 100644
--- a/drivers/of/base.c
+++ b/drivers/of/base.c
@@ -342,21 +342,28 @@ struct device_node *of_get_cpu_node(int cpu, unsigned int *thread)
 }
 EXPORT_SYMBOL(of_get_cpu_node);
 
-/** Checks if the given "compat" string matches one of the strings in
- * the device's "compatible" property
+/*
+ * Compare with the __of_device_is_compatible, this will return a score for the
+ * matching strings. The smaller value indicates the match for the more specific
+ * compatible string.
  */
-static int __of_device_is_compatible(const struct device_node *device,
-const char *compat)
+static int __of_device_is_compatible_score(const struct device_node *device,
+const char *compat, int *pscore)
 {
const char* cp;
int cplen, l;
+   int score = 0;
 
	cp = __of_get_property(device, "compatible", &cplen);
if (cp == NULL)
return 0;
while (cplen > 0) {
-   if (of_compat_cmp(cp, compat, strlen(compat)) == 0)
+   score++;
+   if (of_compat_cmp(cp, compat, strlen(compat)) == 0) {
+   if (pscore)
+   *pscore = score;
return 1;
+   }
l = strlen(cp) + 1;
cp += l;
cplen -= l;
@@ -368,6 +375,15 @@ static int __of_device_is_compatible(const struct device_node *device,
 /** Checks if the given "compat" string matches one of the strings in
  * the device's "compatible" property
  */
+static int __of_device_is_compatible(const struct device_node *device,
+const char *compat)
+{
+   return __of_device_is_compatible_score(device, compat, NULL);
+}
+
+/** Checks if the given "compat" string matches one of the strings in
+ * the device's "compatible" property
+ */
 int of_device_is_compatible(const struct device_node *device,
const char *compat)
 {
@@ -734,25 +750,55 @@ static
 const struct of_device_id *__of_match_node(const struct of_device_id *matches,
   const struct device_node *node)
 {
+   const struct of_device_id *best_match = NULL;
+   int best_score = 0;
+
if (!matches)
return NULL;
 
while (matches->name[0] || matches->type[0] || matches->compatible[0]) {
-   int match = 1;
-   if (matches->name[0])
-   match &= node->name
-   && !strcmp(matches->name, node->name);
-   if (matches->type[0])
-   match &= node->type
-   && !strcmp(matches->type, node->type);
-   if (matches->compatible[0])
-   match &= __of_device_is_compatible(node,
-  matches->compatible);
-   if (match)
-   return matches;
+   int score = 0;
+
+   /*
+* Matching compatible is better than matching type and name,
+* and the specific compatible is better than the general.
+*/
+   if (matches->compatible[0]) {
+   if (__of_device_is_compatible_score(node

Re: [PATCH v2 2/4] of: reimplement the matching method for __of_match_node()

2014-02-18 Thread Kevin Hao
On Wed, Feb 19, 2014 at 02:21:02PM +0800, Kevin Hao wrote:
> + /*
> +  * Matching compatible is better than matching type and name,
> +  * and the specific compatible is better than the general.
> +  */
> + if (matches->compatible[0] &&
> +  __of_device_is_compatible_score(node,
> + matches->compatible, &score))
> + score = INT_MAX - 4 * score;

It seems that we also need to adjust the above in the same way as we do for
the type and name. The v3 is coming.

Thanks,
Kevin




Re: [PATCH 4/4] of: Add self test for of_match_node()

2014-02-18 Thread Kevin Hao
On Tue, Feb 18, 2014 at 10:31:20PM +, Grant Likely wrote:
> Adds a selftest function for the of_match_node function. of_match_node
> is supposed to handle precedence for the compatible property as well as
> the name and device_type values. This patch adds some test case data and
> a function that makes sure each test node matches against the correct
> entry of an of_device_id table.
> 
> This code was written to verify the new of_match_node() implementation
> that is an earlier part of this series.
> 
> Currently all but one test passes. There is one scenario where the empty
> "b/name2" node is getting matched against an entry without any
> device_type property at all. It is unknown why this is, but it needs to
> be solved before this patch can be committed. (However, this is testing
> the new of_match_table implementation, which still does far better than
> the old implementation which gets the precedence completely wrong.)

We can drop the above paragraph now since all the test cases passed on
the new version of of_match_node().

> 
> Signed-off-by: Grant Likely 
> Cc: Kevin Hau 

s/Hau/Hao/

Thanks,
Kevin




[PATCH v2 2/4] of: reimplement the matching method for __of_match_node()

2014-02-18 Thread Kevin Hao
In the current implementation of __of_match_node(), each given match entry
is compared against all of the node's compatible strings with
of_device_is_compatible().

To achieve multiple compatible strings per node with ordering from
specific to generic, the given matches would have to be ordered from
specific to generic. For most of the drivers this is not true, and an
alphabetical ordering is more sane there.

Therefore, we define the following priority order for the match, and
then scan all the entries to find the best match.
  1. specific compatible && type && name
  2. specific compatible && type
  3. specific compatible && name
  4. specific compatible
  5. general compatible && type && name
  6. general compatible && type
  7. general compatible && name
  8. general compatible
  9. type && name
  10. type
  11. name

This is based on some pseudo-codes provided by Grant Likely.

Signed-off-by: Kevin Hao 
[grant.likely: Changed multiplier to 4 which makes more sense]
Signed-off-by: Grant Likely 
---
v2: Fix the bug where we get the same score for the following two match
entries:
name2 { }

struct of_device_id matches[] = {
{.name = "name2", },
{.name = "name2", .type = "type1", },
{}
};
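To illustrate the tie-break this revision fixes: give type and name their own
small weights on top of the compatible score, so an entry matching both name
and type outranks one matching the name alone. The weights below (2 for type,
1 for name) follow the scheme that was eventually merged, but this standalone
model is ours, not the kernel code:

```c
#include <assert.h>

/* Toy scoring: a tiny bonus per matched field breaks ties between entries
 * that match the same name but differ in type.  Weights are illustrative. */
static int entry_score(int name_matches, int type_matches)
{
    int score = 0;

    if (name_matches)
        score += 1;   /* name is the weakest discriminator */
    if (type_matches)
        score += 2;   /* type outweighs name */
    return score;
}
```

With this, `{.name = "name2", .type = "type1"}` scores higher against a
`name2` node of type `type1` than the name-only entry, so the two changelog
entries above no longer tie.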
 
 drivers/of/base.c | 93 +++
 1 file changed, 74 insertions(+), 19 deletions(-)

diff --git a/drivers/of/base.c b/drivers/of/base.c
index ba195fbce4c6..4085e2af0b7f 100644
--- a/drivers/of/base.c
+++ b/drivers/of/base.c
@@ -342,21 +342,28 @@ struct device_node *of_get_cpu_node(int cpu, unsigned int *thread)
 }
 EXPORT_SYMBOL(of_get_cpu_node);
 
-/** Checks if the given "compat" string matches one of the strings in
- * the device's "compatible" property
+/*
+ * Compare with the __of_device_is_compatible, this will return a score for the
+ * matching strings. The smaller value indicates the match for the more specific
+ * compatible string.
  */
-static int __of_device_is_compatible(const struct device_node *device,
-const char *compat)
+static int __of_device_is_compatible_score(const struct device_node *device,
+const char *compat, int *pscore)
 {
const char* cp;
int cplen, l;
+   int score = 0;
 
	cp = __of_get_property(device, "compatible", &cplen);
if (cp == NULL)
return 0;
while (cplen > 0) {
-   if (of_compat_cmp(cp, compat, strlen(compat)) == 0)
+   score++;
+   if (of_compat_cmp(cp, compat, strlen(compat)) == 0) {
+   if (pscore)
+   *pscore = score;
return 1;
+   }
l = strlen(cp) + 1;
cp += l;
cplen -= l;
@@ -368,6 +375,15 @@ static int __of_device_is_compatible(const struct device_node *device,
 /** Checks if the given "compat" string matches one of the strings in
  * the device's "compatible" property
  */
+static int __of_device_is_compatible(const struct device_node *device,
+const char *compat)
+{
+   return __of_device_is_compatible_score(device, compat, NULL);
+}
+
+/** Checks if the given "compat" string matches one of the strings in
+ * the device's "compatible" property
+ */
 int of_device_is_compatible(const struct device_node *device,
const char *compat)
 {
@@ -734,25 +750,52 @@ static
 const struct of_device_id *__of_match_node(const struct of_device_id *matches,
   const struct device_node *node)
 {
+   const struct of_device_id *best_match = NULL;
+   int best_score = 0;
+
if (!matches)
return NULL;
 
while (matches->name[0] || matches->type[0] || matches->compatible[0]) {
-   int match = 1;
-   if (matches->name[0])
-   match &= node->name
-   && !strcmp(matches->name, node->name);
-   if (matches->type[0])
-   match &= node->type
-   && !strcmp(matches->type, node->type);
-   if (matches->compatible[0])
-   match &= __of_device_is_compatible(node,
-  matches->compatible);
-   if (match)
-   return matches;
+   int score = 0;
+
+   /*
+* Matching compatible is better than matching type and name,
+* and the specific compatible is better than the general.
+ 

[PATCH v2 2/4] of: reimplement the matching method for __of_match_node()

2014-02-18 Thread Kevin Hao
In the current implementation of __of_match_node(), it will compare
each given match entry against all the node's compatible strings
with of_device_is_compatible().

To achieve multiple compatible strings per node with ordering from
specific to generic, this requires given matches to be ordered from
specific to generic. For most of the drivers this is not true and
also an alphabetical ordering is more sane there.

Therefore, we define a following priority order for the match, and
then scan all the entries to find the best match.
  1. specific compatible && type && name
  2. specific compatible && type
  3. specific compatible && name
  4. specific compatible
  5. general compatible && type && name
  6. general compatible && type
  7. general compatible && name
  8. general compatible
  9. type  name
  10. type
  11. name

This is based on some pseudo-codes provided by Grant Likely.

Signed-off-by: Kevin Hao haoke...@gmail.com
[grant.likely: Changed multiplier to 4 which makes more sense]
Signed-off-by: Grant Likely grant.lik...@linaro.org
---
v2: Fix the bug where we get the same score for the following two match
entries:
	name2 { }

	struct of_device_id matches[] = {
		{.name = "name2", },
		{.name = "name2", .type = "type1", },
		{}
	};
 
 drivers/of/base.c | 93 +++
 1 file changed, 74 insertions(+), 19 deletions(-)

diff --git a/drivers/of/base.c b/drivers/of/base.c
index ba195fbce4c6..4085e2af0b7f 100644
--- a/drivers/of/base.c
+++ b/drivers/of/base.c
@@ -342,21 +342,28 @@ struct device_node *of_get_cpu_node(int cpu, unsigned int *thread)
 }
 EXPORT_SYMBOL(of_get_cpu_node);
 
-/** Checks if the given "compat" string matches one of the strings in
- * the device's "compatible" property
+/*
+ * Compare with the __of_device_is_compatible, this will return a score for the
+ * matching strings. The smaller value indicates the match for the more specific
+ * compatible string.
  */
-static int __of_device_is_compatible(const struct device_node *device,
-const char *compat)
+static int __of_device_is_compatible_score(const struct device_node *device,
+const char *compat, int *pscore)
 {
const char* cp;
int cplen, l;
+   int score = 0;
 
	cp = __of_get_property(device, "compatible", &cplen);
if (cp == NULL)
return 0;
	while (cplen > 0) {
-   if (of_compat_cmp(cp, compat, strlen(compat)) == 0)
+   score++;
+   if (of_compat_cmp(cp, compat, strlen(compat)) == 0) {
+   if (pscore)
+   *pscore = score;
return 1;
+   }
l = strlen(cp) + 1;
cp += l;
cplen -= l;
@@ -368,6 +375,15 @@ static int __of_device_is_compatible(const struct device_node *device,
 /** Checks if the given "compat" string matches one of the strings in
  * the device's "compatible" property
  */
+static int __of_device_is_compatible(const struct device_node *device,
+const char *compat)
+{
+   return __of_device_is_compatible_score(device, compat, NULL);
+}
+
+/** Checks if the given "compat" string matches one of the strings in
+ * the device's "compatible" property
+ */
 int of_device_is_compatible(const struct device_node *device,
const char *compat)
 {
@@ -734,25 +750,52 @@ static
 const struct of_device_id *__of_match_node(const struct of_device_id *matches,
   const struct device_node *node)
 {
+   const struct of_device_id *best_match = NULL;
+   int best_score = 0;
+
if (!matches)
return NULL;
 
	while (matches->name[0] || matches->type[0] || matches->compatible[0]) {
-		int match = 1;
-		if (matches->name[0])
-			match &= node->name
-				&& !strcmp(matches->name, node->name);
-		if (matches->type[0])
-			match &= node->type
-				&& !strcmp(matches->type, node->type);
-		if (matches->compatible[0])
-			match &= __of_device_is_compatible(node,
-							   matches->compatible);
-		if (match)
-			return matches;
+		int score = 0;
+
+		/*
+		 * Matching compatible is better than matching type and name,
+		 * and the specific compatible is better than the general.
+		 */
+		if (matches->compatible[0] &&
+		    __of_device_is_compatible_score(node,
+						    matches->compatible, &score))
+			score = INT_MAX - 4 * score;
+
+   /*
+* Matching type

Re: [PATCH 4/4] of: Add self test for of_match_node()

2014-02-18 Thread Kevin Hao
On Tue, Feb 18, 2014 at 10:31:20PM +, Grant Likely wrote:
> Adds a selftest function for the of_match_node function. of_match_node
> is supposed to handle precedence for the compatible property as well as
> the name and device_type values. This patch adds some test case data and
> a function that makes sure each test node matches against the correct
> entry of an of_device_id table.
> 
> This code was written to verify the new of_match_node() implementation
> that is an earlier part of this series.
> 
> Currently all but one test passes. There is one scenario where the empty
> "b/name2" node is getting matched against an entry without any
> device_type property at all. It is unknown why this is, but it needs to
> be solved before this patch can be committed. (However, this is testing
> the new of_match_table implementation, which still does far better than
> the old implementation which gets the precedence completely wrong.)

We can drop the above paragraph now since all the test cases passed on
the new version of of_match_node().

 
> Signed-off-by: Grant Likely grant.lik...@linaro.org
> Cc: Kevin Hau haoke...@gmail.com

s/Hau/Hao/

Thanks,
Kevin


pgpkAgTjm8Qe_.pgp
Description: PGP signature


Re: [PATCH v2 2/4] of: reimplement the matching method for __of_match_node()

2014-02-18 Thread Kevin Hao
On Wed, Feb 19, 2014 at 02:21:02PM +0800, Kevin Hao wrote:
> +	/*
> +	 * Matching compatible is better than matching type and name,
> +	 * and the specific compatible is better than the general.
> +	 */
> +	if (matches->compatible[0] &&
> +	    __of_device_is_compatible_score(node,
> +					    matches->compatible, &score))
> +		score = INT_MAX - 4 * score;

It seems that we also need to adjust the above as we do for the type
and name. The v3 is coming.

Thanks,
Kevin




Re: Kernel stack overflows due to "powerpc: Remove ksp_limit on ppc64" with v3.13-rc8 on ppc32 (P2020)

2014-01-16 Thread Kevin Hao
On Fri, Jan 17, 2014 at 01:58:10PM +1100, Benjamin Herrenschmidt wrote:
> On Fri, 2014-01-17 at 10:20 +0800, Kevin Hao wrote:
> > On Thu, Jan 16, 2014 at 10:05:32AM -0800, Guenter Roeck wrote:
> > > Hi all,
> > > 
> > > I am getting kernel stack overflows with v3.13-rc8 on a system with P2020 
> > > CPU.
> > > The kernel is patched for the target, but I don't think that is related.
> > > Stack overflows are in different areas, but always in calls from 
> > > __do_softirq.
> > > 
> > > Crashes happen reliably either during boot or if I put any kind of load
> > > onto the system.
> > 
> > How about the following fix:
> 
> Wow. I've been staring at that code for 15mn this morning and didn't
> spot it ! Nice catch :-)
> 
> Any chance you can send a version of that patch that adds the C
> prototype of the function in a comment right before the assembly ?

Will do. The patch is coming soon.

Thanks,
Kevin




Re: Kernel stack overflows due to "powerpc: Remove ksp_limit on ppc64" with v3.13-rc8 on ppc32 (P2020)

2014-01-16 Thread Kevin Hao
On Thu, Jan 16, 2014 at 10:05:32AM -0800, Guenter Roeck wrote:
> Hi all,
> 
> I am getting kernel stack overflows with v3.13-rc8 on a system with P2020 CPU.
> The kernel is patched for the target, but I don't think that is related.
> Stack overflows are in different areas, but always in calls from __do_softirq.
> 
> Crashes happen reliably either during boot or if I put any kind of load
> onto the system.

How about the following fix:

diff --git a/arch/powerpc/kernel/misc_32.S b/arch/powerpc/kernel/misc_32.S
index e47d268727a4..52fffe5616b4 100644
--- a/arch/powerpc/kernel/misc_32.S
+++ b/arch/powerpc/kernel/misc_32.S
@@ -61,7 +61,7 @@ _GLOBAL(call_do_irq)
 	mflr	r0
 	stw	r0,4(r1)
 	lwz	r10,THREAD+KSP_LIMIT(r2)
-	addi	r11,r3,THREAD_INFO_GAP
+	addi	r11,r4,THREAD_INFO_GAP
 	stwu	r1,THREAD_SIZE-STACK_FRAME_OVERHEAD(r4)
 	mr	r1,r4
 	stw	r10,8(r1)

Thanks,
Kevin
> 
> Example:
> 
> Kernel stack overflow in process eb3e5a00, r1=eb79df90
> CPU: 0 PID: 2838 Comm: ssh Not tainted 3.13.0-rc8-juniper-00146-g19eca00 #4
> task: eb3e5a00 ti: c0616000 task.ti: ef44
> NIP: c003a420 LR: c003a410 CTR: c0017518
> REGS: eb79dee0 TRAP: 0901   Not tainted (3.13.0-rc8-juniper-00146-g19eca00)
> MSR: 00029000   CR: 24008444  XER: 
> GPR00: c003a410 eb79df90 eb3e5a00  eb05d900 0001 65d87646 
> GPR08:  020b8000   44008442
> NIP [c003a420] __do_softirq+0x94/0x1ec
> LR [c003a410] __do_softirq+0x84/0x1ec
> Call Trace:
> [eb79df90] [c003a410] __do_softirq+0x84/0x1ec (unreliable)
> [eb79dfe0] [c003a970] irq_exit+0xbc/0xc8
> [eb79dff0] [c000cc1c] call_do_irq+0x24/0x3c
> [ef441f20] [c00046a8] do_IRQ+0x8c/0xf8
> [ef441f40] [c000e7f4] ret_from_except+0x0/0x18
> --- Exception: 501 at 0xfcda524
> LR = 0x10024900
> Instruction dump:
> 7c781b78 3b4a 3a73b040 543c0024 3a80 3b3913a0 7ef5bb78 48201bf9
> 5463103a 7d3b182e 7e89b92e 7c008146 <3ba0> 7e7e9b78 4814 57fff87f
> Kernel panic - not syncing: kernel stack overflow
> CPU: 0 PID: 2838 Comm: ssh Not tainted 3.13.0-rc8-juniper-00146-g19eca00 #4
> Call Trace:
> Rebooting in 180 seconds..
> 
> Reverting the following commit fixes the problem.
> 
> cbc9565ee8 "powerpc: Remove ksp_limit on ppc64"
> 
> Should I submit a patch reverting this commit, or is there a better way to fix
> the problem on short notice (given that 3.13 is close) ?
> 
> Thanks,
> Guenter
> ___
> Linuxppc-dev mailing list
> linuxppc-...@lists.ozlabs.org
> https://lists.ozlabs.org/listinfo/linuxppc-dev







Re: [PATCH 1/5] jump_label: factor out the base part of jump_label.h to a separate file

2013-09-01 Thread Kevin Hao
On Fri, Aug 30, 2013 at 06:37:33PM +0200, Radim Krčmář wrote:
> 2013-08-25 15:15+0800, Kevin Hao:
> > We plan to use the jump label in the cpu/mmu feature check on ppc.
> > This will need to include the jump_label.h in several very basic header
> > files of ppc which seems to be included by most of the other head
> > files implicitly or explicitly. But in the current jump_label.h,
> > it also include the "linux/workqueue.h" and this will cause recursive
> > inclusion. In order to fix this, we choose to factor out the base
> > part of jump_label.h to a separate header file and we can include
> > that file instead of jump_label.h to avoid the recursive inclusion.
> > No functional change.
> 
> "linux/workqueue.h" was included because of deferred keys and they are
> split into "linux/jump_label_ratelimit.h" to solve the same problem in
> paravirt ticket spinlock series.
> (still in -next: 851cf6e7 jump_label: Split jumplabel ratelimit)

OK. I will respin a new version based on this patch. Thanks for the
information Radim.

Thanks,
Kevin






[PATCH 2/5] jump_label: also include linux/atomic.h when jump label is enabled

2013-08-25 Thread Kevin Hao
The struct static_key will have an atomic_t type member no matter
whether jump label is enabled or not. We would include linux/atomic.h
when jump label is not enabled. But it also makes sense to include
this header file when jump label is enabled.

Signed-off-by: Kevin Hao 
---
 include/linux/jump_label_base.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/linux/jump_label_base.h b/include/linux/jump_label_base.h
index 20df08f..d5c8f4b 100644
--- a/include/linux/jump_label_base.h
+++ b/include/linux/jump_label_base.h
@@ -5,6 +5,8 @@
 #include <linux/types.h>
 #include <linux/compiler.h>
 
+#include <linux/atomic.h>
+
 #if defined(CC_HAVE_ASM_GOTO) && defined(CONFIG_JUMP_LABEL)
 
 struct static_key {
@@ -77,8 +79,6 @@ extern void jump_label_apply_nops(struct module *mod);
 
 #else  /* !HAVE_JUMP_LABEL */
 
-#include <linux/atomic.h>
-
 struct static_key {
atomic_t enabled;
 };
-- 
1.8.3.1

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 0/5] powerpc: use jump label for cpu/mmu_has_feature

2013-08-25 Thread Kevin Hao
Inspired by Benjamin Herrenschmidt, this patch series tries to reduce the
cpu/mmu feature checking overhead by using jump label. The following is
the difference of the run path of cpu_has_feature between before and after
applying these patches:

	        before                          after
	addis   r10,r2,1                        b xxx
	addi    r9,r10,-2280                    b xxx (This will also be omitted
	ld      r9,0(r9)                               if the feature is not set)
	ld      r9,16(r9)
	rldicl. r8,r9,55,63
	beq     c0037c94

This patch series passed the build test for almost all the defconfigs of ppc.
A few configs are still broken, but the failures are not related to this
change. This also passed allyesconfig for x86. Boot tested on p2020rdb and
p5020ds boards.

Kevin Hao (5):
  jump_label: factor out the base part of jump_label.h to a separate
file
  jump_label: also include linux/atomic.h when jump label is enabled
  powerpc: move the cpu_has_feature to a separate file
  powerpc: use the jump label for cpu_has_feature
  powerpc: use jump label for mmu_has_feature

 arch/powerpc/include/asm/cacheflush.h   |   1 +
 arch/powerpc/include/asm/cpufeatures.h  |  42 ++
 arch/powerpc/include/asm/cputable.h |   8 --
 arch/powerpc/include/asm/cputime.h  |   1 +
 arch/powerpc/include/asm/dbell.h|   1 +
 arch/powerpc/include/asm/dcr-native.h   |   1 +
 arch/powerpc/include/asm/mman.h |   1 +
 arch/powerpc/include/asm/mmu.h  |  19 +
 arch/powerpc/include/asm/time.h |   1 +
 arch/powerpc/kernel/align.c |   1 +
 arch/powerpc/kernel/cputable.c  |  43 ++
 arch/powerpc/kernel/irq.c   |   1 +
 arch/powerpc/kernel/process.c   |   1 +
 arch/powerpc/kernel/setup-common.c  |   1 +
 arch/powerpc/kernel/setup_32.c  |   1 +
 arch/powerpc/kernel/smp.c   |   1 +
 arch/powerpc/oprofile/op_model_rs64.c   |   1 +
 arch/powerpc/platforms/cell/pervasive.c |   1 +
 arch/powerpc/xmon/ppc-dis.c |   1 +
 include/linux/jump_label.h  | 132 +
 include/linux/jump_label_base.h | 142 
 21 files changed, 263 insertions(+), 138 deletions(-)
 create mode 100644 arch/powerpc/include/asm/cpufeatures.h
 create mode 100644 include/linux/jump_label_base.h

-- 
1.8.3.1



[PATCH 1/5] jump_label: factor out the base part of jump_label.h to a separate file

2013-08-25 Thread Kevin Hao
We plan to use the jump label in the cpu/mmu feature check on ppc.
This will need to include the jump_label.h in several very basic header
files of ppc which seem to be included by most of the other header
files implicitly or explicitly. But the current jump_label.h also
includes "linux/workqueue.h" and this will cause recursive inclusion.
In order to fix this, we choose to factor out the base part of
jump_label.h to a separate header file and we can include that file
instead of jump_label.h to avoid the recursive inclusion.
No functional change.

Signed-off-by: Kevin Hao 
---
 include/linux/jump_label.h  | 132 +
 include/linux/jump_label_base.h | 142 
 2 files changed, 144 insertions(+), 130 deletions(-)
 create mode 100644 include/linux/jump_label_base.h

diff --git a/include/linux/jump_label.h b/include/linux/jump_label.h
index 0976fc4..14bae65 100644
--- a/include/linux/jump_label.h
+++ b/include/linux/jump_label.h
@@ -46,20 +46,11 @@
  *
 */
 
-#include <linux/types.h>
-#include <linux/compiler.h>
 #include <linux/workqueue.h>
+#include <linux/jump_label_base.h>
 
-#if defined(CC_HAVE_ASM_GOTO) && defined(CONFIG_JUMP_LABEL)
 
-struct static_key {
-   atomic_t enabled;
-/* Set lsb bit to 1 if branch is default true, 0 ot */
-   struct jump_entry *entries;
-#ifdef CONFIG_MODULES
-   struct static_key_mod *next;
-#endif
-};
+#ifdef HAVE_JUMP_LABEL
 
 struct static_key_deferred {
struct static_key key;
@@ -67,145 +58,26 @@ struct static_key_deferred {
struct delayed_work work;
 };
 
-# include <asm/jump_label.h>
-# define HAVE_JUMP_LABEL
-#endif /* CC_HAVE_ASM_GOTO && CONFIG_JUMP_LABEL */
-
-enum jump_label_type {
-   JUMP_LABEL_DISABLE = 0,
-   JUMP_LABEL_ENABLE,
-};
-
-struct module;
-
-#ifdef HAVE_JUMP_LABEL
-
-#define JUMP_LABEL_TRUE_BRANCH 1UL
-
-static
-inline struct jump_entry *jump_label_get_entries(struct static_key *key)
-{
-   return (struct jump_entry *)((unsigned long)key->entries
-   & ~JUMP_LABEL_TRUE_BRANCH);
-}
-
-static inline bool jump_label_get_branch_default(struct static_key *key)
-{
-   if ((unsigned long)key->entries & JUMP_LABEL_TRUE_BRANCH)
-   return true;
-   return false;
-}
-
-static __always_inline bool static_key_false(struct static_key *key)
-{
-   return arch_static_branch(key);
-}
-
-static __always_inline bool static_key_true(struct static_key *key)
-{
-   return !static_key_false(key);
-}
-
-extern struct jump_entry __start___jump_table[];
-extern struct jump_entry __stop___jump_table[];
-
-extern void jump_label_init(void);
-extern void jump_label_lock(void);
-extern void jump_label_unlock(void);
-extern void arch_jump_label_transform(struct jump_entry *entry,
- enum jump_label_type type);
-extern void arch_jump_label_transform_static(struct jump_entry *entry,
-enum jump_label_type type);
-extern int jump_label_text_reserved(void *start, void *end);
-extern void static_key_slow_inc(struct static_key *key);
-extern void static_key_slow_dec(struct static_key *key);
 extern void static_key_slow_dec_deferred(struct static_key_deferred *key);
-extern void jump_label_apply_nops(struct module *mod);
 extern void
 jump_label_rate_limit(struct static_key_deferred *key, unsigned long rl);
 
-#define STATIC_KEY_INIT_TRUE ((struct static_key) \
-   { .enabled = ATOMIC_INIT(1), .entries = (void *)1 })
-#define STATIC_KEY_INIT_FALSE ((struct static_key) \
-   { .enabled = ATOMIC_INIT(0), .entries = (void *)0 })
-
 #else  /* !HAVE_JUMP_LABEL */
 
-#include <linux/atomic.h>
-
-struct static_key {
-   atomic_t enabled;
-};
-
-static __always_inline void jump_label_init(void)
-{
-}
-
 struct static_key_deferred {
struct static_key  key;
 };
 
-static __always_inline bool static_key_false(struct static_key *key)
-{
-	if (unlikely(atomic_read(&key->enabled)) > 0)
-   return true;
-   return false;
-}
-
-static __always_inline bool static_key_true(struct static_key *key)
-{
-	if (likely(atomic_read(&key->enabled)) > 0)
-   return true;
-   return false;
-}
-
-static inline void static_key_slow_inc(struct static_key *key)
-{
-	atomic_inc(&key->enabled);
-}
-
-static inline void static_key_slow_dec(struct static_key *key)
-{
-	atomic_dec(&key->enabled);
-}
-
 static inline void static_key_slow_dec_deferred(struct static_key_deferred *key)
 {
	static_key_slow_dec(&key->key);
 }
 
-static inline int jump_label_text_reserved(void *start, void *end)
-{
-   return 0;
-}
-
-static inline void jump_label_lock(void) {}
-static inline void jump_label_unlock(void) {}
-
-static inline int jump_label_apply_nops(struct module *mod)
-{
-   return 0;
-}
-
 static inline void
 jump_label_rate_limit(struct static_key_deferred *key,
unsigned long rl)
 {
 }
 
-#define STATIC_KEY_INIT_TRUE ((st
