subject:"\"Kernel broken on processors without performance counters\""

Re: Kernel broken on processors without performance counters

2015-07-24 Thread Jason Baron



On 07/24/2015 08:36 AM, Peter Zijlstra wrote:
> On Fri, Jul 24, 2015 at 12:56:02PM +0200, Peter Zijlstra wrote:
>> On Thu, Jul 23, 2015 at 03:14:13PM -0400, Jason Baron wrote:
 +static enum jump_label_type jump_label_type(struct jump_entry *entry)
 +{
 +  struct static_key *key = static_key_cast(iter->key);
 +  bool true_branch = jump_label_get_branch_default(key);
 +  bool state = static_key_enabled(key);
 +  bool inv = static_key_inv(iter->key);
 +
 +  return (true_branch ^ state) ^ inv;
>>> iiuc...this allows both the old style keys to co-exist with the
>>> new ones. IE state wouldn't be set for the new ones. And inv
>>> shouldn't be set for the old ones. Is that correct?
>> @state is the dynamic variable here, the other two are compile time.
>> @true_branch denotes the default (compile time) value, and @inv denotes
>> the (compile time) branch preference.
> Ha!, so that wasn't entirely correct, it turned out @inv means
> arch_static_branch_jump().
>
> That would let us remove the whole argument to the arch functions.
>
> That said, I generated the logic table for @inv meaning the branch type
> and then found a logic similar to what you outlined:
>
>
> /*
>  * Combine the right initial value (type) with the right branch order
>  * to generate the desired result.
>  *
>  *
>  *  type  likely (1)  unlikely (0)
>  * ---+---+--
>  *|   |
>  *  true (1)  |  ...|...
>  *|NOP  |JMP L
>  *|   | 1: ...
>  *|   L: ...|
>  *| |
>  *| | L: 
>  *| |jmp 1b
>  *|   |
>  * ---+---+--
>  *|   |
>  *  false (0) |  ...|...
>  *|JMP L|NOP
>  *|   | 1: ...
>  *|   L: ...|
>  *| |
>  *| | L: 
>  *| |jmp 1b
>  *|   |
>  * ---+---+--
>  *
>  * The initial value is encoded in the LSB of static_key::entries,
>  * type: 0 = false, 1 = true.
>  *
>  * The branch type is encoded in the LSB of jump_entry::key,
>  * branch: 0 = unlikely, 1 = likely.
>  *
>  * This gives the following logic table:
>  *
>  *enabled typebranchinstuction
>  * -+---
>  *0   0   0   | NOP
>  *0   0   1   | JMP
>  *0   1   0   | NOP
>  *0   1   1   | JMP
>  *
>  *1   0   0   | JMP
>  *1   0   1   | NOP
>  *1   1   0   | JMP
>  *1   1   1   | NOP
>  *
>  */
>
> This gives a map: ins = enabled ^ branch, which shows @type to be
> redundant.
>
> And we can trivially switch over the old static_key_{true,false}() to
> emit the right branch type.
>
> Whcih would mean we could remove the type encoding entirely, but I'll
> leave that be for now, maybe it'll come in handy later or whatnot.
>

I would just remove 'type'. Essentially, before we were storing
the 'branch' with the key. However, in this new model the
'branch' is really associated with the branch location, since the
same key can be used now with different branches.

Thanks,

-Jason

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Kernel broken on processors without performance counters

2015-07-24 Thread Peter Zijlstra

On Fri, Jul 24, 2015 at 12:56:02PM +0200, Peter Zijlstra wrote:
> On Thu, Jul 23, 2015 at 03:14:13PM -0400, Jason Baron wrote:
> > > +static enum jump_label_type jump_label_type(struct jump_entry *entry)
> > > +{
> > > + struct static_key *key = static_key_cast(iter->key);
> > > + bool true_branch = jump_label_get_branch_default(key);
> > > + bool state = static_key_enabled(key);
> > > + bool inv = static_key_inv(iter->key);
> > > +
> > > + return (true_branch ^ state) ^ inv;
> > 
> > iiuc...this allows both the old style keys to co-exist with the
> > new ones. IE state wouldn't be set for the new ones. And inv
> > shouldn't be set for the old ones. Is that correct?
> 
> @state is the dynamic variable here, the other two are compile time.
> @true_branch denotes the default (compile time) value, and @inv denotes
> the (compile time) branch preference.

Ha!, so that wasn't entirely correct, it turned out @inv means
arch_static_branch_jump().

That would let us remove the whole argument to the arch functions.

That said, I generated the logic table for @inv meaning the branch type
and then found a logic similar to what you outlined:


/*
 * Combine the right initial value (type) with the right branch order
 * to generate the desired result.
 *
 *
 *  typelikely (1)  unlikely (0)
 * ---+---+--
 *|   |
 *  true (1)  |...|...
 *|NOP|JMP L
 *| | 1: ...
 *| L: ...|
 *|   |
 *|   | L: 
 *|   |jmp 1b
 *|   |
 * ---+---+--
 *|   |
 *  false (0) |...|...
 *|JMP L  |NOP
 *| | 1: ...
 *| L: ...|
 *|   |
 *|   | L: 
 *|   |jmp 1b
 *|   |
 * ---+---+--
 *
 * The initial value is encoded in the LSB of static_key::entries,
 * type: 0 = false, 1 = true.
 *
 * The branch type is encoded in the LSB of jump_entry::key,
 * branch: 0 = unlikely, 1 = likely.
 *
 * This gives the following logic table:
 *
 *  enabled typebranchinstuction
 * -+---
 *  0   0   0   | NOP
 *  0   0   1   | JMP
 *  0   1   0   | NOP
 *  0   1   1   | JMP
 *
 *  1   0   0   | JMP
 *  1   0   1   | NOP
 *  1   1   0   | JMP
 *  1   1   1   | NOP
 *
 */

This gives a map: ins = enabled ^ branch, which shows @type to be
redundant.

And we can trivially switch over the old static_key_{true,false}() to
emit the right branch type.

Whcih would mean we could remove the type encoding entirely, but I'll
leave that be for now, maybe it'll come in handy later or whatnot.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Kernel broken on processors without performance counters

2015-07-24 Thread Peter Zijlstra

On Thu, Jul 23, 2015 at 03:14:13PM -0400, Jason Baron wrote:
> > +static enum jump_label_type jump_label_type(struct jump_entry *entry)
> > +{
> > +   struct static_key *key = static_key_cast(iter->key);
> > +   bool true_branch = jump_label_get_branch_default(key);
> > +   bool state = static_key_enabled(key);
> > +   bool inv = static_key_inv(iter->key);
> > +
> > +   return (true_branch ^ state) ^ inv;
> 
> iiuc...this allows both the old style keys to co-exist with the
> new ones. IE state wouldn't be set for the new ones. And inv
> shouldn't be set for the old ones. Is that correct?

@state is the dynamic variable here, the other two are compile time.
@true_branch denotes the default (compile time) value, and @inv denotes
the (compile time) branch preference.

So @state will still be set for the new ones, but you're right in that
@inv will not be set for the 'legacy' stuff.

This one little line needs a big comment.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Kernel broken on processors without performance counters

2015-07-24 Thread Peter Zijlstra

On Fri, Jul 24, 2015 at 07:29:05AM +0200, Borislav Petkov wrote:
> On Thu, Jul 23, 2015 at 09:02:14PM +0200, Peter Zijlstra wrote:
> > On Thu, Jul 23, 2015 at 07:54:36PM +0200, Borislav Petkov wrote:
> > > On Thu, Jul 23, 2015 at 07:08:11PM +0200, Peter Zijlstra wrote:
> > > > That would be bad, how can we force it to emit 5 bytes?
> > > 
> > > .byte 0xe9 like we used to do in static_cpu_has_safe().
> > 
> > Like so then?
> > 
> > static __always_inline bool arch_static_branch_jump(struct static_key *key, 
> > bool inv)
> > {
> > unsigned long kval = (unsigned long)key + inv;
> > 
> > asm_volatile_goto("1:"
> > ".byte 0xe9\n\t .long %l[l_yes]\n\t"
> > ".pushsection __jump_table,  \"aw\" \n\t"
> > _ASM_ALIGN "\n\t"
> > _ASM_PTR "1b, %l[l_yes], %c0 \n\t"
> > ".popsection \n\t"
> > : :  "i" (kval) : : l_yes);
> > 
> > return false;
> > l_yes:
> > return true;
> > }
> 
> Yap.
> 
> But, we can do even better and note down what kind of JMP the compiler
> generated and teach __jump_label_transform() to generate the right one.
> Maybe this struct jump_entry would get a flags member or so. This way
> we're optimal.
> 
> Methinks...

Yes, Jason and Steve already have patches to go do that, but I'd really
like to keep that separate for now, this thing is big enough as is.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Kernel broken on processors without performance counters

2015-07-23 Thread Borislav Petkov

On Thu, Jul 23, 2015 at 09:02:14PM +0200, Peter Zijlstra wrote:
> On Thu, Jul 23, 2015 at 07:54:36PM +0200, Borislav Petkov wrote:
> > On Thu, Jul 23, 2015 at 07:08:11PM +0200, Peter Zijlstra wrote:
> > > That would be bad, how can we force it to emit 5 bytes?
> > 
> > .byte 0xe9 like we used to do in static_cpu_has_safe().
> 
> Like so then?
> 
> static __always_inline bool arch_static_branch_jump(struct static_key *key, 
> bool inv)
> {
>   unsigned long kval = (unsigned long)key + inv;
> 
>   asm_volatile_goto("1:"
>   ".byte 0xe9\n\t .long %l[l_yes]\n\t"
>   ".pushsection __jump_table,  \"aw\" \n\t"
>   _ASM_ALIGN "\n\t"
>   _ASM_PTR "1b, %l[l_yes], %c0 \n\t"
>   ".popsection \n\t"
>   : :  "i" (kval) : : l_yes);
> 
>   return false;
> l_yes:
>   return true;
> }

Yap.

But, we can do even better and note down what kind of JMP the compiler
generated and teach __jump_label_transform() to generate the right one.
Maybe this struct jump_entry would get a flags member or so. This way
we're optimal.

Methinks...

-- 
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.
--
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Kernel broken on processors without performance counters

2015-07-23 Thread Jason Baron

On 07/23/2015 10:49 AM, Peter Zijlstra wrote:
> On Thu, Jul 23, 2015 at 04:33:08PM +0200, Peter Zijlstra wrote:
>> Lemme finish this and I'll post it.
> Compile tested on x86_64 only..
>
> Please have a look, I think you said I got some of the logic wrong, I've
> not double checked that.
>
> I'll go write comments and double check things.
>
> ---
>  arch/arm/include/asm/jump_label.h |  22 +++-
>  arch/arm64/include/asm/jump_label.h   |  22 +++-
>  arch/mips/include/asm/jump_label.h|  23 +++-
>  arch/powerpc/include/asm/jump_label.h |  23 +++-
>  arch/s390/include/asm/jump_label.h|  23 +++-
>  arch/sparc/include/asm/jump_label.h   |  38 ++---
>  arch/x86/include/asm/jump_label.h |  24 +++-
>  include/linux/jump_label.h| 101 
> +-
>  kernel/jump_label.c   |  89 +++---
>  9 files changed, 297 insertions(+), 68 deletions(-)
>
> diff --git a/arch/arm/include/asm/jump_label.h 
> b/arch/arm/include/asm/jump_label.h
> index 5f337dc5c108..6c9789b33497 100644
> --- a/arch/arm/include/asm/jump_label.h
> +++ b/arch/arm/include/asm/jump_label.h
> @@ -13,14 +13,32 @@
>  #define JUMP_LABEL_NOP   "nop"
>  #endif
>  
> -static __always_inline bool arch_static_branch(struct static_key *key)
> +static __always_inline bool arch_static_branch(struct static_key *key, bool 
> inv)
>  {
> + unsigned long kval = (unsigned long)key + inv;
> +
>   asm_volatile_goto("1:\n\t"
>JUMP_LABEL_NOP "\n\t"
>".pushsection __jump_table,  \"aw\"\n\t"
>".word 1b, %l[l_yes], %c0\n\t"
>".popsection\n\t"
> -  : :  "i" (key) :  : l_yes);
> +  : :  "i" (kval) :  : l_yes);
> +
> + return false;
> +l_yes:
> + return true;
> +}
> +
> +static __always_inline bool arch_static_branch_jump(struct static_key *key, 
> bool inv)
> +{
> + unsigned long kval = (unsigned long)key + inv;
> +
> + asm_volatile_goto("1:\n\t"
> +  "b %l[l_yes]\n\t"
> +  ".pushsection __jump_table,  \"aw\"\n\t"
> +  ".word 1b, %l[l_yes], %c0\n\t"
> +  ".popsection\n\t"
> +  : :  "i" (kval) :  : l_yes);
>  
>   return false;
>  l_yes:
> diff --git a/arch/arm64/include/asm/jump_label.h 
> b/arch/arm64/include/asm/jump_label.h
> index c0e5165c2f76..e5cda5d75c62 100644
> --- a/arch/arm64/include/asm/jump_label.h
> +++ b/arch/arm64/include/asm/jump_label.h
> @@ -26,14 +26,32 @@
>  
>  #define JUMP_LABEL_NOP_SIZE  AARCH64_INSN_SIZE
>  
> -static __always_inline bool arch_static_branch(struct static_key *key)
> +static __always_inline bool arch_static_branch(struct static_key *key, bool 
> inv)
>  {
> + unsigned long kval = (unsigned long)key + inv;
> +
>   asm goto("1: nop\n\t"
>".pushsection __jump_table,  \"aw\"\n\t"
>".align 3\n\t"
>".quad 1b, %l[l_yes], %c0\n\t"
>".popsection\n\t"
> -  :  :  "i"(key) :  : l_yes);
> +  :  :  "i"(kval) :  : l_yes);
> +
> + return false;
> +l_yes:
> + return true;
> +}
> +
> +static __always_inline bool arch_static_branch_jump(struct static_key *key, 
> bool inv)
> +{
> + unsigned long kval = (unsigned long)key + inv;
> +
> + asm goto("1: b %l[l_yes]\n\t"
> +  ".pushsection __jump_table,  \"aw\"\n\t"
> +  ".align 3\n\t"
> +  ".quad 1b, %l[l_yes], %c0\n\t"
> +  ".popsection\n\t"
> +  :  :  "i"(kval) :  : l_yes);
>  
>   return false;
>  l_yes:
> diff --git a/arch/mips/include/asm/jump_label.h 
> b/arch/mips/include/asm/jump_label.h
> index 608aa57799c8..d9fca6f52a93 100644
> --- a/arch/mips/include/asm/jump_label.h
> +++ b/arch/mips/include/asm/jump_label.h
> @@ -26,14 +26,33 @@
>  #define NOP_INSN "nop"
>  #endif
>  
> -static __always_inline bool arch_static_branch(struct static_key *key)
> +static __always_inline bool arch_static_branch(struct static_key *key, bool 
> inv)
>  {
> + unsigned long kval = (unsigned long)key + inv;
> +
>   asm_volatile_goto("1:\t" NOP_INSN "\n\t"
>   "nop\n\t"
>   ".pushsection __jump_table,  \"aw\"\n\t"
>   WORD_INSN " 1b, %l[l_yes], %0\n\t"
>   ".popsection\n\t"
> - : :  "i" (key) : : l_yes);
> + : :  "i" (kval) : : l_yes);
> +
> + return false;
> +l_yes:
> + return true;
> +}
> +
> +static __always_inline bool arch_static_branch_jump(struct static_key *key, 
> bool inv)
> +{
> + unsigned long kval = (unsigned long)key + inv;
> +
> + asm_volatile_goto("1:\tj %l[l_yes]\n\t"
> + "nop\n\t"
> + ".pushsection __jump_table,  \"aw\"\n\t"
> + WORD_INSN " 1b, %l[l_yes], %0\n\t"
> + ".popsection\n\t"
> + : :  "i" (kval) : : l_yes);
> +
>   return false;
>  l_yes:
>   return true;
> diff

Re: Kernel broken on processors without performance counters

2015-07-23 Thread Peter Zijlstra

On Thu, Jul 23, 2015 at 01:33:36PM -0400, Jason Baron wrote:
> > That would be bad, how can we force it to emit 5 bytes?

> hmmI don't think that's an issue, the patching code can
> detect if its a 2-byte jump - 0xeb, or 5-byte: 0xe9, and do
> the correct no-op. Same going the other way. See the code
> I posted a few mails back. In fact, this gets us to the
> smaller 2-byte no-ops in cases where we are initialized
> to jump.

Ah yes, I looked right over that.

I think that if we want to go do that, it should be a separate patch
though.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Kernel broken on processors without performance counters

2015-07-23 Thread Peter Zijlstra

On Thu, Jul 23, 2015 at 07:54:36PM +0200, Borislav Petkov wrote:
> On Thu, Jul 23, 2015 at 07:08:11PM +0200, Peter Zijlstra wrote:
> > That would be bad, how can we force it to emit 5 bytes?
> 
> .byte 0xe9 like we used to do in static_cpu_has_safe().

Like so then?

static __always_inline bool arch_static_branch_jump(struct static_key *key, 
bool inv)
{
unsigned long kval = (unsigned long)key + inv;

asm_volatile_goto("1:"
".byte 0xe9\n\t .long %l[l_yes]\n\t"
".pushsection __jump_table,  \"aw\" \n\t"
_ASM_ALIGN "\n\t"
_ASM_PTR "1b, %l[l_yes], %c0 \n\t"
".popsection \n\t"
: :  "i" (kval) : : l_yes);

return false;
l_yes:
return true;
}
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Kernel broken on processors without performance counters

2015-07-23 Thread Steven Rostedt

On Thu, 23 Jul 2015 13:33:36 -0400
Jason Baron  wrote:

> On 07/23/2015 01:08 PM, Peter Zijlstra wrote:
> > On Thu, Jul 23, 2015 at 11:34:50AM -0400, Steven Rostedt wrote:
> >> On Thu, 23 Jul 2015 12:42:15 +0200
> >> Peter Zijlstra  wrote:
> >>
> >>> static __always_inline bool arch_static_branch_jump(struct static_key 
> >>> *key, bool inv)
> >>> {
> >>>   if (!inv) {
> >>>   asm_volatile_goto("1:"
> >>>   "jmp %l[l_yes]\n\t"
> >> And what happens when this gets converted to a two byte jump?
> >>
> > That would be bad, how can we force it to emit 5 bytes?
> hmmI don't think that's an issue, the patching code can
> detect if its a 2-byte jump - 0xeb, or 5-byte: 0xe9, and do
> the correct no-op. Same going the other way. See the code
> I posted a few mails back. In fact, this gets us to the
> smaller 2-byte no-ops in cases where we are initialized
> to jump.
>

Ah right, and I already have the code that checks that (from the
original plan).

-- Steve
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Kernel broken on processors without performance counters

2015-07-23 Thread Borislav Petkov

On Thu, Jul 23, 2015 at 07:08:11PM +0200, Peter Zijlstra wrote:
> That would be bad, how can we force it to emit 5 bytes?

.byte 0xe9 like we used to do in static_cpu_has_safe().

Or you can copy the insn sizing from the alternatives macros which I
added recently.

-- 
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.
--
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Kernel broken on processors without performance counters

2015-07-23 Thread Andy Lutomirski

On Jul 23, 2015 10:08 AM, "Peter Zijlstra"  wrote:
>
> On Thu, Jul 23, 2015 at 11:34:50AM -0400, Steven Rostedt wrote:
> > On Thu, 23 Jul 2015 12:42:15 +0200
> > Peter Zijlstra  wrote:
> >
> > > static __always_inline bool arch_static_branch_jump(struct static_key 
> > > *key, bool inv)
> > > {
> > > if (!inv) {
> > > asm_volatile_goto("1:"
> > > "jmp %l[l_yes]\n\t"
> >
> > And what happens when this gets converted to a two byte jump?
> >
>
> That would be bad, how can we force it to emit 5 bytes?

jmp.d32 on newer toolchains IIRC.

--Andy
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Kernel broken on processors without performance counters

2015-07-23 Thread Jason Baron

On 07/23/2015 01:08 PM, Peter Zijlstra wrote:
> On Thu, Jul 23, 2015 at 11:34:50AM -0400, Steven Rostedt wrote:
>> On Thu, 23 Jul 2015 12:42:15 +0200
>> Peter Zijlstra  wrote:
>>
>>> static __always_inline bool arch_static_branch_jump(struct static_key *key, 
>>> bool inv)
>>> {
>>> if (!inv) {
>>> asm_volatile_goto("1:"
>>> "jmp %l[l_yes]\n\t"
>> And what happens when this gets converted to a two byte jump?
>>
> That would be bad, how can we force it to emit 5 bytes?
hmmI don't think that's an issue, the patching code can
detect if its a 2-byte jump - 0xeb, or 5-byte: 0xe9, and do
the correct no-op. Same going the other way. See the code
I posted a few mails back. In fact, this gets us to the
smaller 2-byte no-ops in cases where we are initialized
to jump.

Thanks,

-Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Kernel broken on processors without performance counters

2015-07-23 Thread Steven Rostedt

On Thu, 23 Jul 2015 19:08:11 +0200
Peter Zijlstra  wrote:

> On Thu, Jul 23, 2015 at 11:34:50AM -0400, Steven Rostedt wrote:
> > On Thu, 23 Jul 2015 12:42:15 +0200
> > Peter Zijlstra  wrote:
> > 
> > > static __always_inline bool arch_static_branch_jump(struct static_key 
> > > *key, bool inv)
> > > {
> > >   if (!inv) {
> > >   asm_volatile_goto("1:"
> > >   "jmp %l[l_yes]\n\t"
> > 
> > And what happens when this gets converted to a two byte jump?
> > 
> 
> That would be bad, how can we force it to emit 5 bytes?

No idea, but I could pull out that old code that converted them :-)

The complexity was in the elf parser that was run at kernel compile
time. It was based on the same code that does the work with
record-mcount.c to find all the mcount callers and made the sections
for them. In fact, it wasn't much different, as record-mcount.c will
convert the black listed sections into nops, so they do not bother
calling mcount at all. But those sections were not recorded, as they
were blacklisted anyway (not whitelisted really, as to be a blacklisted
section, it just had to not be in the whitelisted list).

If we got the jmp conversion in, I was going to clean up the code such
that both record-mcount.c and the jmp conversions used the same code
where applicable.

I would probably still convert every jmp to a nop (2 or 5 byte), and
then at boot up convert those back to jmps that are needed.

-- Steve
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Kernel broken on processors without performance counters

2015-07-23 Thread Peter Zijlstra

On Thu, Jul 23, 2015 at 11:34:50AM -0400, Steven Rostedt wrote:
> On Thu, 23 Jul 2015 12:42:15 +0200
> Peter Zijlstra  wrote:
> 
> > static __always_inline bool arch_static_branch_jump(struct static_key *key, 
> > bool inv)
> > {
> > if (!inv) {
> > asm_volatile_goto("1:"
> > "jmp %l[l_yes]\n\t"
> 
> And what happens when this gets converted to a two byte jump?
> 

That would be bad, how can we force it to emit 5 bytes?
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Kernel broken on processors without performance counters

2015-07-23 Thread Steven Rostedt

On Thu, 23 Jul 2015 12:42:15 +0200
Peter Zijlstra  wrote:

> static __always_inline bool arch_static_branch_jump(struct static_key *key, 
> bool inv)
> {
>   if (!inv) {
>   asm_volatile_goto("1:"
>   "jmp %l[l_yes]\n\t"

And what happens when this gets converted to a two byte jump?

-- Steve

>   ".pushsection __jump_table,  \"aw\" \n\t"
>   _ASM_ALIGN "\n\t"
>   _ASM_PTR "1b, %l[l_yes], %c0 \n\t"
>   ".popsection \n\t"
>   : :  "i" (key) : : l_yes);
>   } else {
>   asm_volatile_goto("1:"
>   "jmp %l[l_yes]\n\t"
>   ".pushsection __jump_table_inv,  \"aw\" \n\t"
>   _ASM_ALIGN "\n\t"
>   _ASM_PTR "1b, %l[l_yes], %c0 \n\t"
>   ".popsection \n\t"
>   : :  "i" (key) : : l_yes);
>   }
>   return false;
> l_yes:
>   return true;
> }
> 
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Kernel broken on processors without performance counters

2015-07-23 Thread Peter Zijlstra

On Thu, Jul 23, 2015 at 10:19:52AM -0400, Jason Baron wrote:
> >
> > #define static_branch_likely(x) 
> > \
> > ({  
> > \
> > bool branch;
> > \
> > if (__builtin_types_compatible_p(typeof(x), struct static_key_true))
> > \
> > branch = !arch_static_branch(&(x)->key, false); 
> > \
> > else if (__builtin_types_compatible_p(typeof(x), struct 
> > static_key_false)) \
> > branch = !arch_static_branch_jump(&(x)->key, true); 
> > \
> > else
> > \
> > branch = wrong_branch_error();  
> > \
> > branch; 
> > \
> > })
> >
> > #define static_branch_unlikely(x)   
> > \
> > ({  
> > \
> > bool branch;
> > \
> > if (__builtin_types_compatible_p(typeof(x), struct static_key_true))
> > \
> > branch = arch_static_branch(&(x)->key, true);   
> > \
> > else if (__builtin_types_compatible_p(typeof(x), struct 
> > static_key_false)) \
> > branch = arch_static_branch_jump(&(x)->key, false); 
> > \
> > else
> > \
> > branch = wrong_branch_error();  
> > \
> > branch; 
> > \
> > })
> >
> 
> In 'static_branch_unlikely()', I think  arch_static_branch() and
> arch_static_branch_jump() are reversed.

Yes, you're right. But I think I need a nap before touching this stuff
again :-)

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Kernel broken on processors without performance counters

2015-07-23 Thread Steven Rostedt

On Tue, 21 Jul 2015 12:57:08 -0400
Jason Baron  wrote:


> On 07/21/2015 12:12 PM, Peter Zijlstra wrote:
> > On Tue, Jul 21, 2015 at 08:51:51AM -0700, Andy Lutomirski wrote:
> >> To clarify my (mis-)understanding:
> >>
> >> There are two degrees of freedom in a static_key.  They can start out
> >> true or false, and they can be unlikely or likely.  Are those two
> >> degrees of freedom in fact tied together?
> > Yes, if you start out false, you must be unlikely. If you start out
> > true, you must be likely.
> >
> > We could maybe try and untangle that if there really is a good use case,
> > but this is the current state.
> 
> We could potentially allow all combinations but it requires a more
> complex implementation. Perhaps, by rewriting no-ops to jmps
> during build-time? Steve Rostedt posted an implementation a while
> back to do that, but in part it wasn't merged due to its
> complexity and the fact that it increased the kernel text, which
> I don't think we ever got to the bottom of. Steve was actually

I think that happened because we didn't save enough with the nops to
compensate for the code that it took to implement that change. Perhaps
in the future that will be different as the implementation is a fixed
size, while the usage of jump labels increase.

-- Steve

> trying to reduce the size of the no-ops by first letting the compiler
> pick the 'jmp' instruction size (short jumps of only 2-bytes on
> x86, instead of the hard-coded 5 bytes we currently have).
> However, we could use the same idea here to allow the mixed
> branch label and initial variable state.
> 
> That said, it seems easy enough to reverse the direction of
> the branch in an __init or so when we boot, if required.
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Kernel broken on processors without performance counters

2015-07-23 Thread Peter Zijlstra

On Thu, Jul 23, 2015 at 04:33:08PM +0200, Peter Zijlstra wrote:
> Lemme finish this and I'll post it.

Compile tested on x86_64 only..

Please have a look, I think you said I got some of the logic wrong, I've
not double checked that.

I'll go write comments and double check things.

---
 arch/arm/include/asm/jump_label.h |  22 +++-
 arch/arm64/include/asm/jump_label.h   |  22 +++-
 arch/mips/include/asm/jump_label.h|  23 +++-
 arch/powerpc/include/asm/jump_label.h |  23 +++-
 arch/s390/include/asm/jump_label.h|  23 +++-
 arch/sparc/include/asm/jump_label.h   |  38 ++---
 arch/x86/include/asm/jump_label.h |  24 +++-
 include/linux/jump_label.h| 101 +-
 kernel/jump_label.c   |  89 +++---
 9 files changed, 297 insertions(+), 68 deletions(-)

diff --git a/arch/arm/include/asm/jump_label.h 
b/arch/arm/include/asm/jump_label.h
index 5f337dc5c108..6c9789b33497 100644
--- a/arch/arm/include/asm/jump_label.h
+++ b/arch/arm/include/asm/jump_label.h
@@ -13,14 +13,32 @@
 #define JUMP_LABEL_NOP "nop"
 #endif
 
-static __always_inline bool arch_static_branch(struct static_key *key)
+static __always_inline bool arch_static_branch(struct static_key *key, bool 
inv)
 {
+   unsigned long kval = (unsigned long)key + inv;
+
asm_volatile_goto("1:\n\t"
 JUMP_LABEL_NOP "\n\t"
 ".pushsection __jump_table,  \"aw\"\n\t"
 ".word 1b, %l[l_yes], %c0\n\t"
 ".popsection\n\t"
-: :  "i" (key) :  : l_yes);
+: :  "i" (kval) :  : l_yes);
+
+   return false;
+l_yes:
+   return true;
+}
+
+static __always_inline bool arch_static_branch_jump(struct static_key *key, 
bool inv)
+{
+   unsigned long kval = (unsigned long)key + inv;
+
+   asm_volatile_goto("1:\n\t"
+"b %l[l_yes]\n\t"
+".pushsection __jump_table,  \"aw\"\n\t"
+".word 1b, %l[l_yes], %c0\n\t"
+".popsection\n\t"
+: :  "i" (kval) :  : l_yes);
 
return false;
 l_yes:
diff --git a/arch/arm64/include/asm/jump_label.h 
b/arch/arm64/include/asm/jump_label.h
index c0e5165c2f76..e5cda5d75c62 100644
--- a/arch/arm64/include/asm/jump_label.h
+++ b/arch/arm64/include/asm/jump_label.h
@@ -26,14 +26,32 @@
 
 #define JUMP_LABEL_NOP_SIZEAARCH64_INSN_SIZE
 
-static __always_inline bool arch_static_branch(struct static_key *key)
+static __always_inline bool arch_static_branch(struct static_key *key, bool 
inv)
 {
+   unsigned long kval = (unsigned long)key + inv;
+
asm goto("1: nop\n\t"
 ".pushsection __jump_table,  \"aw\"\n\t"
 ".align 3\n\t"
 ".quad 1b, %l[l_yes], %c0\n\t"
 ".popsection\n\t"
-:  :  "i"(key) :  : l_yes);
+:  :  "i"(kval) :  : l_yes);
+
+   return false;
+l_yes:
+   return true;
+}
+
+static __always_inline bool arch_static_branch_jump(struct static_key *key, 
bool inv)
+{
+   unsigned long kval = (unsigned long)key + inv;
+
+   asm goto("1: b %l[l_yes]\n\t"
+".pushsection __jump_table,  \"aw\"\n\t"
+".align 3\n\t"
+".quad 1b, %l[l_yes], %c0\n\t"
+".popsection\n\t"
+:  :  "i"(kval) :  : l_yes);
 
return false;
 l_yes:
diff --git a/arch/mips/include/asm/jump_label.h 
b/arch/mips/include/asm/jump_label.h
index 608aa57799c8..d9fca6f52a93 100644
--- a/arch/mips/include/asm/jump_label.h
+++ b/arch/mips/include/asm/jump_label.h
@@ -26,14 +26,33 @@
 #define NOP_INSN "nop"
 #endif
 
-static __always_inline bool arch_static_branch(struct static_key *key)
+static __always_inline bool arch_static_branch(struct static_key *key, bool 
inv)
 {
+   unsigned long kval = (unsigned long)key + inv;
+
asm_volatile_goto("1:\t" NOP_INSN "\n\t"
"nop\n\t"
".pushsection __jump_table,  \"aw\"\n\t"
WORD_INSN " 1b, %l[l_yes], %0\n\t"
".popsection\n\t"
-   : :  "i" (key) : : l_yes);
+   : :  "i" (kval) : : l_yes);
+
+   return false;
+l_yes:
+   return true;
+}
+
+static __always_inline bool arch_static_branch_jump(struct static_key *key, 
bool inv)
+{
+   unsigned long kval = (unsigned long)key + inv;
+
+   asm_volatile_goto("1:\tj %l[l_yes]\n\t"
+   "nop\n\t"
+   ".pushsection __jump_table,  \"aw\"\n\t"
+   WORD_INSN " 1b, %l[l_yes], %0\n\t"
+   ".popsection\n\t"
+   : :  "i" (kval) : : l_yes);
+
return false;
 l_yes:
return true;
diff --git a/arch/powerpc/include/asm/jump_label.h 
b/arch/powerpc/include/asm/jump_label.h
index efbf9a322a23..c7cffedb1801 100644
--- a/arch/powerpc/include/asm/jump_label.h
+++ b/arch/powerpc/include/asm/jump_label.h
@@ -18,14 +18,33 @@

Re: Kernel broken on processors without performance counters

2015-07-23 Thread Peter Zijlstra

On Thu, Jul 23, 2015 at 10:19:52AM -0400, Jason Baron wrote:
> > And I think it'll all work. Hmm?
> 
> Cool. This also gives an extra degree of freedom in that it allows keys to
> be arbitrarily mixed with the likely/unlikely branch types. I'm not sure 
> that's
> come up as a use-case, but seems like it would be good. It also implies
> that the LABEL_TYPE_{TRUE,FALSE}, is no longer associated with the key
> b/c a key could be used in both and unlikely() or likely() branch. So that
> would eventually go away, and the 'struct static_key()', I guess could point
> to its relevant entries in both tables. Although, that means an extra
> pointer in the 'struct static_key'. It may be simpler to simply add another
> field to the jump table that specifies if the branch is likely/unlikely,
> and then we are back to one table? IE  arch_static_branch() could add
> that field to the jump table.

Way ahead of you, while implementing the dual section I ran into trouble
and found that it would be far easier to indeed stick it in the
jump_entry.

However, instead of growing the thing, I've used the LSB of the key
field, that's a pointer so it has at least two bits free anyhow.

I've also implemented it for all archs (+- compile failures, I've not
gotten that far).

Lemme finish this and I'll post it.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Kernel broken on processors without performance counters

2015-07-23 Thread Jason Baron

On 07/23/2015 06:42 AM, Peter Zijlstra wrote:
> On Wed, Jul 22, 2015 at 01:06:44PM -0400, Jason Baron wrote:
>> Ok,
>>
>> So we could add all 4 possible initial states, where the
>> branches would be:
>>
>> static_likely_init_true_branch(struct static_likely_init_true_key *key)
>> static_likely_init_false_branch(struct static_likely_init_false_key *key)
>> static_unlikely_init_false_branch(struct static_unlikely_init_false_key *key)
>> static_unlikely_init_true_branch(struct static_unlikely_init_true_key *key)
> I'm sorely tempted to go quote cypress hill here...
>
>
> And I realize part of the problem is that we're wanting to use jump
> labels before we can patch them. But surely we can do better.
>
> extern bool wrong_branch_error(void);
>
> struct static_key_true;
> struct static_key_false;
>
> #define static_branch_likely(x)   
> \
> ({
> \
>   bool branch;
> \
>   if (__builtin_types_compatible_p(typeof(x), struct static_key_true))
> \
>   branch = !arch_static_branch(&(x)->key);
> \
>   else if (__builtin_types_compatible_p(typeof(x), struct 
> static_key_false)) \
>   branch = !arch_static_branch_jump(&(x)->key);   
> \
>   else
> \
>   branch = wrong_branch_error();  
> \
>   branch; 
> \
> })
>
> #define static_branch_unlikely(x) 
> \
> ({
> \
>   bool branch;
> \
>   if (__builtin_types_compatible_p(typeof(x), struct static_key_true))
> \
>   branch = arch_static_branch(&(x)->key); 
> \
>   else if (__builtin_types_compatible_p(typeof(x), struct 
> static_key_false)) \
>   branch = arch_static_branch_jump(&(x)->key);
> \
>   else
> \
>   branch = wrong_branch_error();  
> \
>   branch; 
> \
> })
>
> Can't we make something like that work?
>
> So the immediate problem appears to be the 4 different key inits, which don't
> seem very supportive of this separation:
>
> +#define STATIC_KEY_LIKEY_INIT_TRUE ((struct static_unlikely_init_true_key)   
>  \
> +{ .key.enabled = ATOMIC_INIT(1),\
> +  .key.entries = (void *)JUMP_LABEL_TYPE_TRUE_BRANCH })
>
> +#define STATIC_KEY_LIKEY_INIT_FALSE ((struct static_unlikely_init_false_key) 
>\
> +{ .key.enabled = ATOMIC_INIT(0),\
> +  .key.entries = (void *)JUMP_LABEL_TYPE_TRUE_BRANCH })
>
> +#define STATIC_KEY_UNLIKELY_INIT_TRUE ((struct 
> static_unlikely_init_true_key)\
> +{ .key.enabled = ATOMIC_INIT(1),\
> +  .key.entries = (void *)JUMP_LABEL_TYPE_FALSE_BRANCH })
>
> +#define STATIC_KEY_UNLIKELY_INIT_FALSE ((struct 
> static_unlikely_init_false_key)\
> +{ .key.enabled = ATOMIC_INIT(0),\
> +  .key.entries = (void *)JUMP_LABEL_TYPE_FALSE_BRANCH })
>
>
> But I think we can fix that by using a second __jump_table section, then
> we can augment the LABEL_TYPE_{TRUE,FALSE} thing with the section we
> find the jump_entry in.
>
> Then we can do:
>
> #define STATIC_KEY_TRUE_INIT  (struct static_key_true) { .key = 
> STATIC_KEY_INIT_TRUE,  }
> #define STATIC_KEY_FALSE_INIT (struct static_key_false){ .key = 
> STATIC_KEY_INIT_FALSE, }
>
> And invert the jump_label_type if we're in the second section.
>
> I think we'll need a second argument to the arch_static_branch*()
> functions to indicate which section it needs to go in.
>
>
> static __always_inline bool arch_static_branch(struct static_key *key, bool 
> inv)
> {
>   if (!inv) {
>   asm_volatile_goto("1:"
>   ".byte " __stringify(STATIC_KEY_INIT_NOP) "\n\t"
>   ".pushsection __jump_table,  \"aw\" \n\t"
>   _ASM_ALIGN "\n\t"
>   _ASM_PTR "1b, %l[l_yes], %c0 \n\t"
>   ".popsection \n\t"
>   : :  "i" (key) : : l_yes);
>   } else {
>   asm_volatile_goto("1:"
>   ".byte " __stringify(STATIC_KEY_INIT_NOP) "\n\t"
>   ".pushsection __jump_table_inv,  \"aw\" \n\t"
>   _ASM_ALIGN "\n\t"
>   _ASM_PTR "1b, %l[l_yes], %c0 \n\t"
>   ".popsection \n\t"
>   : :  "i" (k

Re: Kernel broken on processors without performance counters

2015-07-23 Thread Borislav Petkov

On Thu, Jul 23, 2015 at 12:42:15PM +0200, Peter Zijlstra wrote:
> > static_likely_init_true_branch(struct static_likely_init_true_key *key)
> > static_likely_init_false_branch(struct static_likely_init_false_key *key)
> > static_unlikely_init_false_branch(struct static_unlikely_init_false_key 
> > *key)
> > static_unlikely_init_true_branch(struct static_unlikely_init_true_key *key)
> 
> I'm sorely tempted to go quote cypress hill here...

Yah, those are at least too long and nuts.

> And I realize part of the problem is that we're wanting to use jump
> labels before we can patch them. But surely we can do better.
> 
> extern bool wrong_branch_error(void);
> 
> struct static_key_true;
> struct static_key_false;
> 
> #define static_branch_likely(x)   
> \
> ({
> \
>   bool branch;
> \
>   if (__builtin_types_compatible_p(typeof(x), struct static_key_true))
> \
>   branch = !arch_static_branch(&(x)->key);
> \
>   else if (__builtin_types_compatible_p(typeof(x), struct 
> static_key_false)) \
>   branch = !arch_static_branch_jump(&(x)->key);   
> \
>   else
> \
>   branch = wrong_branch_error();  
> \
>   branch; 
> \
> })
> 
> #define static_branch_unlikely(x) 
> \
> ({
> \
>   bool branch;
> \
>   if (__builtin_types_compatible_p(typeof(x), struct static_key_true))
> \
>   branch = arch_static_branch(&(x)->key); 
> \
>   else if (__builtin_types_compatible_p(typeof(x), struct 
> static_key_false)) \
>   branch = arch_static_branch_jump(&(x)->key);
> \
>   else
> \
>   branch = wrong_branch_error();  
> \
>   branch; 
> \
> })
> 
> Can't we make something like that work?
> 
> So the immediate problem appears to be the 4 different key inits, which don't
> seem very supportive of this separation:
> 
> +#define STATIC_KEY_LIKEY_INIT_TRUE ((struct static_unlikely_init_true_key)   
>  \

LIKELY

> +{ .key.enabled = ATOMIC_INIT(1),\
> +  .key.entries = (void *)JUMP_LABEL_TYPE_TRUE_BRANCH })
> 
> +#define STATIC_KEY_LIKEY_INIT_FALSE ((struct static_unlikely_init_false_key) 
>\

Yuck, those struct names are still too long IMO.

> +{ .key.enabled = ATOMIC_INIT(0),\
> +  .key.entries = (void *)JUMP_LABEL_TYPE_TRUE_BRANCH })
> 
> +#define STATIC_KEY_UNLIKELY_INIT_TRUE ((struct 
> static_unlikely_init_true_key)\
> +{ .key.enabled = ATOMIC_INIT(1),\
> +  .key.entries = (void *)JUMP_LABEL_TYPE_FALSE_BRANCH })
> 
> +#define STATIC_KEY_UNLIKELY_INIT_FALSE ((struct 
> static_unlikely_init_false_key)\
> +{ .key.enabled = ATOMIC_INIT(0),\
> +  .key.entries = (void *)JUMP_LABEL_TYPE_FALSE_BRANCH })
> 
> 
> But I think we can fix that by using a second __jump_table section, then
> we can augment the LABEL_TYPE_{TRUE,FALSE} thing with the section we
> find the jump_entry in.
> 
> Then we can do:
> 
> #define STATIC_KEY_TRUE_INIT  (struct static_key_true) { .key = 
> STATIC_KEY_INIT_TRUE,  }
> #define STATIC_KEY_FALSE_INIT (struct static_key_false){ .key = 
> STATIC_KEY_INIT_FALSE, }

Let's abbreviate that "STATIC_KEY" thing too:

SK_TRUE_INIT
SK_FALSE_INIT
...

-- 
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.
--
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Kernel broken on processors without performance counters

2015-07-23 Thread Peter Zijlstra

On Wed, Jul 22, 2015 at 01:06:44PM -0400, Jason Baron wrote:
> Ok,
> 
> So we could add all 4 possible initial states, where the
> branches would be:
> 
> static_likely_init_true_branch(struct static_likely_init_true_key *key)
> static_likely_init_false_branch(struct static_likely_init_false_key *key)
> static_unlikely_init_false_branch(struct static_unlikely_init_false_key *key)
> static_unlikely_init_true_branch(struct static_unlikely_init_true_key *key)

I'm sorely tempted to go quote cypress hill here...


And I realize part of the problem is that we're wanting to use jump
labels before we can patch them. But surely we can do better.

extern bool wrong_branch_error(void);

struct static_key_true;
struct static_key_false;

#define static_branch_likely(x) 
\
({  
\
bool branch;
\
if (__builtin_types_compatible_p(typeof(x), struct static_key_true))
\
branch = !arch_static_branch(&(x)->key);
\
else if (__builtin_types_compatible_p(typeof(x), struct 
static_key_false)) \
branch = !arch_static_branch_jump(&(x)->key);   
\
else
\
branch = wrong_branch_error();  
\
branch; 
\
})

#define static_branch_unlikely(x)   
\
({  
\
bool branch;
\
if (__builtin_types_compatible_p(typeof(x), struct static_key_true))
\
branch = arch_static_branch(&(x)->key); 
\
else if (__builtin_types_compatible_p(typeof(x), struct 
static_key_false)) \
branch = arch_static_branch_jump(&(x)->key);
\
else
\
branch = wrong_branch_error();  
\
branch; 
\
})

Can't we make something like that work?

So the immediate problem appears to be the 4 different key inits, which don't
seem very supportive of this separation:

+#define STATIC_KEY_LIKEY_INIT_TRUE ((struct static_unlikely_init_true_key) 
   \
+{ .key.enabled = ATOMIC_INIT(1),\
+  .key.entries = (void *)JUMP_LABEL_TYPE_TRUE_BRANCH })

+#define STATIC_KEY_LIKEY_INIT_FALSE ((struct static_unlikely_init_false_key)   
 \
+{ .key.enabled = ATOMIC_INIT(0),\
+  .key.entries = (void *)JUMP_LABEL_TYPE_TRUE_BRANCH })

+#define STATIC_KEY_UNLIKELY_INIT_TRUE ((struct static_unlikely_init_true_key)  
  \
+{ .key.enabled = ATOMIC_INIT(1),\
+  .key.entries = (void *)JUMP_LABEL_TYPE_FALSE_BRANCH })

+#define STATIC_KEY_UNLIKELY_INIT_FALSE ((struct 
static_unlikely_init_false_key)\
+{ .key.enabled = ATOMIC_INIT(0),\
+  .key.entries = (void *)JUMP_LABEL_TYPE_FALSE_BRANCH })


But I think we can fix that by using a second __jump_table section, then
we can augment the LABEL_TYPE_{TRUE,FALSE} thing with the section we
find the jump_entry in.

Then we can do:

#define STATIC_KEY_TRUE_INIT  (struct static_key_true) { .key = 
STATIC_KEY_INIT_TRUE,  }
#define STATIC_KEY_FALSE_INIT (struct static_key_false){ .key = 
STATIC_KEY_INIT_FALSE, }

And invert the jump_label_type if we're in the second section.

I think we'll need a second argument to the arch_static_branch*()
functions to indicate which section it needs to go in.


static __always_inline bool arch_static_branch(struct static_key *key, bool inv)
{
if (!inv) {
asm_volatile_goto("1:"
".byte " __stringify(STATIC_KEY_INIT_NOP) "\n\t"
".pushsection __jump_table,  \"aw\" \n\t"
_ASM_ALIGN "\n\t"
_ASM_PTR "1b, %l[l_yes], %c0 \n\t"
".popsection \n\t"
: :  "i" (key) : : l_yes);
} else {
asm_volatile_goto("1:"
".byte " __stringify(STATIC_KEY_INIT_NOP) "\n\t"
".pushsection __jump_table_inv,  \"aw\" \n\t"
_ASM_ALIGN "\n\t"
_ASM_PTR "1b, %l[l_yes], %c0 \n\t"
".popsection \n\t"
: :  "i" (key) : : l_yes);
}
return false;
l_yes:
return true;
}

static __always_inline bool arch_static_branch_jump(struct static_key *key, 
bool inv)
{
if (!inv) {
a

Re: Kernel broken on processors without performance counters

2015-07-22 Thread Mikulas Patocka



On Wed, 22 Jul 2015, Borislav Petkov wrote:

> On Tue, Jul 21, 2015 at 02:50:25PM -0400, Jason Baron wrote:
> > hmmm...so this is a case where need to the default the branch
> > to the out-of-line branch at boot. That is, we can't just enable
> > the out-of-line branch at boot time, b/c it might be too late at
> > that point? IE native_sched_clock() gets called very early?
> 
> Well, even the layout is wrong here. The optimal thing would be to have:
> 
>   NOP
>   rdtsc
> 
> unlikely:
>   /* read jiffies */
> 
> at build time. And then at boot time, patch in the JMP over the NOP on
> !use_tsc boxes. And RDTSC works always, no matter how early.

RDTSC doesn't work on 486...

Mikulas
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Kernel broken on processors without performance counters

2015-07-22 Thread Jason Baron

On 07/22/2015 12:24 AM, Borislav Petkov wrote:
> On Tue, Jul 21, 2015 at 02:50:25PM -0400, Jason Baron wrote:
>> hmmm...so this is a case where need to the default the branch
>> to the out-of-line branch at boot. That is, we can't just enable
>> the out-of-line branch at boot time, b/c it might be too late at
>> that point? IE native_sched_clock() gets called very early?
> Well, even the layout is wrong here. The optimal thing would be to have:
>
>   NOP
>   rdtsc
>
> unlikely:
>   /* read jiffies */
>
> at build time. And then at boot time, patch in the JMP over the NOP on
> !use_tsc boxes. And RDTSC works always, no matter how early.
>
> I'm fairly sure we can do that now with alternatives instead of jump
> labels.
>
> The problem currently is that the 5-byte NOP gets patched in with a JMP
> so we have an unconditional forwards JMP to the RDTSC.
>
> Now I'd put my money on most arches handling NOPs better then
> unconditional JMPs and this is a hot path...
>

Ok,

So we could add all 4 possible initial states, where the
branches would be:

static_likely_init_true_branch(struct static_likely_init_true_key *key)
static_likely_init_false_branch(struct static_likely_init_false_key *key)
static_unlikely_init_false_branch(struct static_unlikely_init_false_key *key)
static_unlikely_init_true_branch(struct static_unlikely_init_true_key *key)

So, this last one means 'inline' the false branch, but start the branch as
true. Which is, I think what you want here.

So this addresses the use-case of 'initial' branch direction doesn't
match the the default branch bias. We could also do this with link
time time patching (and potentially reduce the need for 4 different
types), but the implementation below doesn't seem too bad :) Its based
upon Peter's initial patch. It unfortunately does require some arch
specific bits (so we probably need a 'HAVE_ARCH', wrapper until we
have support.

Now, the native_sched_clock() starts as:

8200bf10 :
8200bf10:   55  push   %rbp
8200bf11:   48 89 e5mov%rsp,%rbp
8200bf14:   eb 30   jmp8200bf46 

8200bf16:   0f 31   rdtsc

And then the '0x3b' gets patch to a 2-byte no-op

Thanks,

-Jason


diff --git a/arch/x86/include/asm/jump_label.h 
b/arch/x86/include/asm/jump_label.h
index a4c1cf7..030cf52 100644
--- a/arch/x86/include/asm/jump_label.h
+++ b/arch/x86/include/asm/jump_label.h
@@ -30,6 +30,20 @@ l_yes:
 return true;
 }
 
+static __always_inline bool arch_static_branch_jump(struct static_key *key)
+{
+asm_volatile_goto("1:"
+"jmp %l[l_yes]\n\t"
+".pushsection __jump_table,  \"aw\" \n\t"
+_ASM_ALIGN "\n\t"
+_ASM_PTR "1b, %l[l_yes], %c0 \n\t"
+".popsection \n\t"
+: :  "i" (key) : : l_yes);
+return false;
+l_yes:
+return true;
+}
+
 #ifdef CONFIG_X86_64
 typedef u64 jump_label_t;
 #else
diff --git a/arch/x86/kernel/jump_label.c b/arch/x86/kernel/jump_label.c
index 26d5a55..d40b19d 100644
--- a/arch/x86/kernel/jump_label.c
+++ b/arch/x86/kernel/jump_label.c
@@ -22,6 +22,10 @@ union jump_code_union {
 char jump;
 int offset;
 } __attribute__((packed));
+struct {
+char jump_short;
+char offset_short;
+} __attribute__((packed));
 };
 
 static void bug_at(unsigned char *ip, int line)
@@ -36,6 +40,8 @@ static void bug_at(unsigned char *ip, int line)
 BUG();
 }
 
+static const unsigned char short_nop[] = { P6_NOP2 };
+
 static void __jump_label_transform(struct jump_entry *entry,
enum jump_label_type type,
void *(*poker)(void *, const void *, size_t),
@@ -44,47 +50,44 @@ static void __jump_label_transform(struct jump_entry *entry,
 union jump_code_union code;
 const unsigned char default_nop[] = { STATIC_KEY_INIT_NOP };
 const unsigned char *ideal_nop = ideal_nops[NOP_ATOMIC5];
+void *instruct = (void *)entry->code;
+unsigned size = JUMP_LABEL_NOP_SIZE;
+unsigned char opcode = *(unsigned char *)instruct;
 
 if (type == JUMP_LABEL_ENABLE) {
-if (init) {
-/*
- * Jump label is enabled for the first time.
- * So we expect a default_nop...
- */
-if (unlikely(memcmp((void *)entry->code, default_nop, 5)
- != 0))
-bug_at((void *)entry->code, __LINE__);
-} else {
-/*
- * ...otherwise expect an ideal_nop. Otherwise
- * something went horribly wrong.
- */
-if (unlikely(memcmp((void *)entry->code, ideal_nop, 5)
- != 0))
-bug_at((void *)entry->code, __LINE__);
-}
+if (opcode == 0xe9 || opcode == 0xeb)
+return;
 
-code.jump = 0xe9;
-code.offset = entry->target -
-(entry->code + JUMP_LABEL_NOP_SIZE);
-} else {
-

Re: Kernel broken on processors without performance counters

2015-07-21 Thread Borislav Petkov

On Tue, Jul 21, 2015 at 02:50:25PM -0400, Jason Baron wrote:
> hmmm...so this is a case where need to the default the branch
> to the out-of-line branch at boot. That is, we can't just enable
> the out-of-line branch at boot time, b/c it might be too late at
> that point? IE native_sched_clock() gets called very early?

Well, even the layout is wrong here. The optimal thing would be to have:

NOP
rdtsc

unlikely:
/* read jiffies */

at build time. And then at boot time, patch in the JMP over the NOP on
!use_tsc boxes. And RDTSC works always, no matter how early.

I'm fairly sure we can do that now with alternatives instead of jump
labels.

The problem currently is that the 5-byte NOP gets patched in with a JMP
so we have an unconditional forwards JMP to the RDTSC.

Now I'd put my money on most arches handling NOPs better then
unconditional JMPs and this is a hot path...

-- 
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.
--
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Kernel broken on processors without performance counters

2015-07-21 Thread Valdis . Kletnieks

On Tue, 21 Jul 2015 12:29:21 -0700, Andy Lutomirski said:

> That's not what I meant.  We do something in the C code that tells the
> build step which way the initial state goes.  At link time, we make
> the initial state actually work like that.  Then, at run time, we can
> still switch it again if needed.

OK, that's something different than what I read the first time around.

I'm thinking that a good adjective would be "brittle", in the face of
recalcitrant GCC releases.  Look at the fun we've had over the years
getting things that *should* be fairly intuitive to work correctly (such
as "inline").

Having said that, if somebody comes up with something that actually
works, I'd be OK with it...


pgpYvkJobctLv.pgp
Description: PGP signature

Re: Kernel broken on processors without performance counters

2015-07-21 Thread Andy Lutomirski

On Tue, Jul 21, 2015 at 12:00 PM,   wrote:
> On Tue, 21 Jul 2015 11:54:30 -0700, Andy Lutomirski said:
>
>> Could this be done at link time, or perhaps when compressing the
>> kernel image, instead of at boot time?
>
> That's only safe to do if the kernel is built for one specific CPU - if it's
> a generic kernel that boots on multiple hardware designs, it will be wrong
> for some systems.
>
> In other words - safe to do if you're building it for *your* hardware. Totally
> unsafe for a distro.

That's not what I meant.  We do something in the C code that tells the
build step which way the initial state goes.  At link time, we make
the initial state actually work like that.  Then, at run time, we can
still switch it again if needed.

--Andy
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Kernel broken on processors without performance counters

2015-07-21 Thread Valdis . Kletnieks

On Tue, 21 Jul 2015 11:54:30 -0700, Andy Lutomirski said:

> Could this be done at link time, or perhaps when compressing the
> kernel image, instead of at boot time?

That's only safe to do if the kernel is built for one specific CPU - if it's
a generic kernel that boots on multiple hardware designs, it will be wrong
for some systems.

In other words - safe to do if you're building it for *your* hardware. Totally
unsafe for a distro.


pgp36OOq64gbJ.pgp
Description: PGP signature

Re: Kernel broken on processors without performance counters

2015-07-21 Thread Andy Lutomirski

On Tue, Jul 21, 2015 at 11:50 AM, Jason Baron  wrote:
>
>
> On 07/21/2015 02:15 PM, Borislav Petkov wrote:
>> On Tue, Jul 21, 2015 at 06:12:15PM +0200, Peter Zijlstra wrote:
>>> Yes, if you start out false, you must be unlikely. If you start out
>>> true, you must be likely.
>>>
>>> We could maybe try and untangle that if there really is a good use case,
>>> but this is the current state.
>>>
>>> The whole reason this happened is because 'false' is like:
>>>
>>>
>>>  ...
>>>  
>>> 1:
>>>  ...
>>>
>>>
>>>
>>> label:
>>>  
>>>  jmp 1b
>>>
>>>
>>> Where the code if out-of-line by default. The enable will rewrite the
>>>  with a jmp label.
>> Btw, native_sched_clock() is kinda botched because of that, see below.
>>
>> I'd want that RDTSC to come first with a NOP preceding it which can
>> become a JMP in case some idiotic CPU can't do RDTSC and needs to use
>> jiffies. Instead, we *unconditionally* jump to RDTSC which is WTF?! We
>> can just as well do a normal unlikely() without the static_key:
>
> hmmm...so this is a case where need to the default the branch
> to the out-of-line branch at boot. That is, we can't just enable
> the out-of-line branch at boot time, b/c it might be too late at
> that point? IE native_sched_clock() gets called very early?
>

Could this be done at link time, or perhaps when compressing the
kernel image, instead of at boot time?

--Andy
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Kernel broken on processors without performance counters

2015-07-21 Thread Jason Baron



On 07/21/2015 02:15 PM, Borislav Petkov wrote:
> On Tue, Jul 21, 2015 at 06:12:15PM +0200, Peter Zijlstra wrote:
>> Yes, if you start out false, you must be unlikely. If you start out
>> true, you must be likely.
>>
>> We could maybe try and untangle that if there really is a good use case,
>> but this is the current state.
>>
>> The whole reason this happened is because 'false' is like:
>>
>>
>>  ...
>>  
>> 1:
>>  ...
>>
>>
>>
>> label:
>>  
>>  jmp 1b
>>
>>
>> Where the code if out-of-line by default. The enable will rewrite the
>>  with a jmp label.
> Btw, native_sched_clock() is kinda botched because of that, see below.
>
> I'd want that RDTSC to come first with a NOP preceding it which can
> become a JMP in case some idiotic CPU can't do RDTSC and needs to use
> jiffies. Instead, we *unconditionally* jump to RDTSC which is WTF?! We
> can just as well do a normal unlikely() without the static_key:

hmmm...so this is a case where need to the default the branch
to the out-of-line branch at boot. That is, we can't just enable
the out-of-line branch at boot time, b/c it might be too late at
that point? IE native_sched_clock() gets called very early?

Thanks,

-Jason

>
>   .globl  native_sched_clock
>   .type   native_sched_clock, @function
> native_sched_clock:
>   pushq   %rbp#
>   movq%rsp, %rbp  #,
> #APP
> # 21 "./arch/x86/include/asm/jump_label.h" 1
>   1:.byte 0x0f,0x1f,0x44,0x00,0
>   .pushsection __jump_table,  "aw" 
>.balign 8 
>.quad 1b, .L122, __use_tsc #,
>   .popsection 
>   
> # 0 "" 2
> #NO_APP
>   movabsq $-429466729600, %rax#, tmp118
>   popq%rbp#
>   imulq   $100, jiffies_64(%rip), %rdx#, jiffies_64, D.28443
>   addq%rdx, %rax  # D.28443, D.28443
>   ret
> .L122:
> #APP
> # 118 "./arch/x86/include/asm/msr.h" 1
>   rdtsc
> # 0 "" 2
>

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Kernel broken on processors without performance counters

2015-07-21 Thread Borislav Petkov

On Tue, Jul 21, 2015 at 06:12:15PM +0200, Peter Zijlstra wrote:
> Yes, if you start out false, you must be unlikely. If you start out
> true, you must be likely.
> 
> We could maybe try and untangle that if there really is a good use case,
> but this is the current state.
> 
> The whole reason this happened is because 'false' is like:
> 
> 
>   ...
>   
> 1:
>   ...
> 
> 
> 
> label:
>   
>   jmp 1b
> 
> 
> Where the code if out-of-line by default. The enable will rewrite the
>  with a jmp label.

Btw, native_sched_clock() is kinda botched because of that, see below.

I'd want that RDTSC to come first with a NOP preceding it which can
become a JMP in case some idiotic CPU can't do RDTSC and needs to use
jiffies. Instead, we *unconditionally* jump to RDTSC which is WTF?! We
can just as well do a normal unlikely() without the static_key:

.globl  native_sched_clock
.type   native_sched_clock, @function
native_sched_clock:
pushq   %rbp#
movq%rsp, %rbp  #,
#APP
# 21 "./arch/x86/include/asm/jump_label.h" 1
1:.byte 0x0f,0x1f,0x44,0x00,0
.pushsection __jump_table,  "aw" 
 .balign 8 
 .quad 1b, .L122, __use_tsc #,
.popsection 

# 0 "" 2
#NO_APP
movabsq $-429466729600, %rax#, tmp118
popq%rbp#
imulq   $100, jiffies_64(%rip), %rdx#, jiffies_64, D.28443
addq%rdx, %rax  # D.28443, D.28443
ret
.L122:
#APP
# 118 "./arch/x86/include/asm/msr.h" 1
rdtsc
# 0 "" 2

-- 
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.
--
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Kernel broken on processors without performance counters

2015-07-21 Thread Jason Baron

On 07/21/2015 12:12 PM, Peter Zijlstra wrote:
> On Tue, Jul 21, 2015 at 08:51:51AM -0700, Andy Lutomirski wrote:
>> To clarify my (mis-)understanding:
>>
>> There are two degrees of freedom in a static_key.  They can start out
>> true or false, and they can be unlikely or likely.  Are those two
>> degrees of freedom in fact tied together?
> Yes, if you start out false, you must be unlikely. If you start out
> true, you must be likely.
>
> We could maybe try and untangle that if there really is a good use case,
> but this is the current state.

We could potentially allow all combinations but it requires a more
complex implementation. Perhaps, by rewriting no-ops to jmps
during build-time? Steve Rostedt posted an implementation a while
back to do that, but in part it wasn't merged due to its
complexity and the fact that it increased the kernel text, which
I don't think we ever got to the bottom of. Steve was actually
trying to reduce the size of the no-ops by first letting the compiler
pick the 'jmp' instruction size (short jumps of only 2-bytes on
x86, instead of the hard-coded 5 bytes we currently have).
However, we could use the same idea here to allow the mixed
branch label and initial variable state.

That said, it seems easy enough to reverse the direction of
the branch in an __init or so when we boot, if required.

> The whole reason this happened is because 'false' is like:
>
>
>   ...
>   
> 1:
>   ...
>
>
>
> label:
>   
>   jmp 1b
>
>
> Where the code if out-of-line by default. The enable will rewrite the
>  with a jmp label.
>
> Of course, if you have code that is on by default, you don't want to pay
> that out-of-line penalty all the time. So the on by default generates:
>
>
>   ...
>   
>   
> label:
>   ...
>
>
> Where, if we disable, we replace the nop with jmp label.
>
> Or rather, that all is the intent, GCC doesn't actually honour hot/cold
> attributes on asm labels very well last time I tried.

I tried this too not that long ago, and didn't seem to make any
difference. Ideally this could allow us to make variations where
'cold' code is moved further out-of-line.

Thanks,

-Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Kernel broken on processors without performance counters

2015-07-21 Thread Peter Zijlstra

On Tue, Jul 21, 2015 at 08:51:51AM -0700, Andy Lutomirski wrote:
> To clarify my (mis-)understanding:
> 
> There are two degrees of freedom in a static_key.  They can start out
> true or false, and they can be unlikely or likely.  Are those two
> degrees of freedom in fact tied together?

Yes, if you start out false, you must be unlikely. If you start out
true, you must be likely.

We could maybe try and untangle that if there really is a good use case,
but this is the current state.

The whole reason this happened is because 'false' is like:

...

1:
...

label:

jmp 1b

Where the code if out-of-line by default. The enable will rewrite the
 with a jmp label.

Of course, if you have code that is on by default, you don't want to pay
that out-of-line penalty all the time. So the on by default generates:

...

label:
...

Where, if we disable, we replace the nop with jmp label.

Or rather, that all is the intent, GCC doesn't actually honour hot/cold
attributes on asm labels very well last time I tried.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Kernel broken on processors without performance counters

2015-07-21 Thread Peter Zijlstra

On Tue, Jul 21, 2015 at 05:49:59PM +0200, Peter Zijlstra wrote:
> On Tue, Jul 21, 2015 at 05:43:27PM +0200, Thomas Gleixner wrote:
> > On Tue, 21 Jul 2015, Peter Zijlstra wrote:
> > 
> > > > -#endif /* _LINUX_JUMP_LABEL_H */
> > > > +static inline void static_key_enable(struct static_key *key)
> > > > +{
> > > > +   int count = static_key_count(key);
> > > > +
> > > > +   WARN_ON_ONCE(count < 0 || count > 1);
> > > > +
> > > > +   if (!count)
> > > > +   static_key_slow_inc(key);
> > > > +}
> > > > +
> > > > +static inline void static_key_disable(struct static_key *key)
> > > > +{
> > > > +   int count = static_key_count(key);
> > > > +
> > > > +   WARN_ON_ONCE(count < 0 || count > 1);
> > > > +
> > > > +   if (count)
> > > > +   static_key_slow_dec(key);
> > > > +}
> > 
> > The functions above are not part of the interface which should be used
> > in code, right?
> 
> They are in fact. They make it easier to deal with the refcount thing
> when all you want is boolean states.

That is, things like sched_feat_keys[] which is an array of static keys,
the split types doesn't work -- an array can after all have only one
type.

In such cases you do have to be very careful to not wreck things.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Kernel broken on processors without performance counters

2015-07-21 Thread Thomas Gleixner

On Tue, 21 Jul 2015, Peter Zijlstra wrote:
> On Tue, Jul 21, 2015 at 05:43:27PM +0200, Thomas Gleixner wrote:
> > > > +#define static_branch_inc(_k)  static_key_slow_inc(&(_k)->key)
> > > > +#define static_branch_dec(_k)  static_key_slow_dec(&(_k)->key)
> > 
> > Inlines please
> > 
> > > > +/*
> > > > + * Normal usage; boolean enable/disable.
> > > > + */
> > > > +
> > > > +#define static_branch_enable(_k)   static_key_enable(&(_k)->key)
> > > > +#define static_branch_disable(_k)  static_key_disable(&(_k)->key)
> > 
> > Ditto
> > 
> > All in all much more understandable than the existing horror.
> 
> They cannot in fact be inlines because we play type tricks. Note how _k
> can be either struct static_likely_key or struct static_unlikely_key.

Indeed. Care to add a comment?

Thanks,

tglx

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Kernel broken on processors without performance counters

2015-07-21 Thread Andy Lutomirski

On Tue, Jul 21, 2015 at 8:49 AM, Peter Zijlstra  wrote:
> On Tue, Jul 21, 2015 at 05:43:27PM +0200, Thomas Gleixner wrote:
>> On Tue, 21 Jul 2015, Peter Zijlstra wrote:
>>
>> > > -#endif   /* _LINUX_JUMP_LABEL_H */
>> > > +static inline void static_key_enable(struct static_key *key)
>> > > +{
>> > > + int count = static_key_count(key);
>> > > +
>> > > + WARN_ON_ONCE(count < 0 || count > 1);
>> > > +
>> > > + if (!count)
>> > > + static_key_slow_inc(key);
>> > > +}
>> > > +
>> > > +static inline void static_key_disable(struct static_key *key)
>> > > +{
>> > > + int count = static_key_count(key);
>> > > +
>> > > + WARN_ON_ONCE(count < 0 || count > 1);
>> > > +
>> > > + if (count)
>> > > + static_key_slow_dec(key);
>> > > +}
>>
>> The functions above are not part of the interface which should be used
>> in code, right?
>
> They are in fact. They make it easier to deal with the refcount thing
> when all you want is boolean states.
>
>> > > +/* 
>> > > --
>> > >  */
>> > > +
>> > > +/*
>> > > + * likely -- default enabled, puts the branch body in-line
>> > > + */
>> > > +
>> > > +struct static_likely_key {
>> > > + struct static_key key;
>> > > +};
>> > > +
>> > > +#define STATIC_LIKELY_KEY_INIT   (struct static_likely_key){ .key = 
>> > > STATIC_KEY_INIT_TRUE, }
>> > > +
>> > > +static inline bool static_likely_branch(struct static_likely_key *key)
>> > > +{
>> > > + return static_key_true(&key->key);
>> > > +}
>> > > +
>> > > +/*
>> > > + * unlikely -- default disabled, puts the branch body out-of-line
>> > > + */
>> > > +
>> > > +struct static_unlikely_key {
>> > > + struct static_key key;
>> > > +};
>> > > +
>> > > +#define STATIC_UNLIKELY_KEY_INIT (struct static_unlikely_key){ .key = 
>> > > STATIC_KEY_INIT_FALSE, }
>> > > +
>> > > +static inline bool static_unlikely_branch(struct static_unlikely_key 
>> > > *key)
>> > > +{
>> > > + return static_key_false(&key->key);
>> > > +}
>> > > +
>> > > +/*
>> > > + * Advanced usage; refcount, branch is enabled when: count != 0
>> > > + */
>> > > +
>> > > +#define static_branch_inc(_k)static_key_slow_inc(&(_k)->key)
>> > > +#define static_branch_dec(_k)static_key_slow_dec(&(_k)->key)
>>
>> Inlines please
>>
>> > > +/*
>> > > + * Normal usage; boolean enable/disable.
>> > > + */
>> > > +
>> > > +#define static_branch_enable(_k) static_key_enable(&(_k)->key)
>> > > +#define static_branch_disable(_k)static_key_disable(&(_k)->key)
>>
>> Ditto
>>
>> All in all much more understandable than the existing horror.
>
> They cannot in fact be inlines because we play type tricks. Note how _k
> can be either struct static_likely_key or struct static_unlikely_key.
>
>

To clarify my (mis-)understanding:

There are two degrees of freedom in a static_key.  They can start out
true or false, and they can be unlikely or likely.  Are those two
degrees of freedom in fact tied together?

--Andy
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Kernel broken on processors without performance counters

2015-07-21 Thread Peter Zijlstra

On Tue, Jul 21, 2015 at 05:43:27PM +0200, Thomas Gleixner wrote:
> On Tue, 21 Jul 2015, Peter Zijlstra wrote:
> 
> > > -#endif   /* _LINUX_JUMP_LABEL_H */
> > > +static inline void static_key_enable(struct static_key *key)
> > > +{
> > > + int count = static_key_count(key);
> > > +
> > > + WARN_ON_ONCE(count < 0 || count > 1);
> > > +
> > > + if (!count)
> > > + static_key_slow_inc(key);
> > > +}
> > > +
> > > +static inline void static_key_disable(struct static_key *key)
> > > +{
> > > + int count = static_key_count(key);
> > > +
> > > + WARN_ON_ONCE(count < 0 || count > 1);
> > > +
> > > + if (count)
> > > + static_key_slow_dec(key);
> > > +}
> 
> The functions above are not part of the interface which should be used
> in code, right?

They are in fact. They make it easier to deal with the refcount thing
when all you want is boolean states.

> > > +/* 
> > > --
> > >  */
> > > +
> > > +/*
> > > + * likely -- default enabled, puts the branch body in-line
> > > + */
> > > +
> > > +struct static_likely_key {
> > > + struct static_key key;
> > > +};
> > > +
> > > +#define STATIC_LIKELY_KEY_INIT   (struct static_likely_key){ .key = 
> > > STATIC_KEY_INIT_TRUE, }
> > > +
> > > +static inline bool static_likely_branch(struct static_likely_key *key)
> > > +{
> > > + return static_key_true(&key->key);
> > > +}
> > > +
> > > +/*
> > > + * unlikely -- default disabled, puts the branch body out-of-line
> > > + */
> > > +
> > > +struct static_unlikely_key {
> > > + struct static_key key;
> > > +};
> > > +
> > > +#define STATIC_UNLIKELY_KEY_INIT (struct static_unlikely_key){ .key = 
> > > STATIC_KEY_INIT_FALSE, }
> > > +
> > > +static inline bool static_unlikely_branch(struct static_unlikely_key 
> > > *key)
> > > +{
> > > + return static_key_false(&key->key);
> > > +}
> > > +
> > > +/*
> > > + * Advanced usage; refcount, branch is enabled when: count != 0
> > > + */
> > > +
> > > +#define static_branch_inc(_k)static_key_slow_inc(&(_k)->key)
> > > +#define static_branch_dec(_k)static_key_slow_dec(&(_k)->key)
> 
> Inlines please
> 
> > > +/*
> > > + * Normal usage; boolean enable/disable.
> > > + */
> > > +
> > > +#define static_branch_enable(_k) static_key_enable(&(_k)->key)
> > > +#define static_branch_disable(_k)static_key_disable(&(_k)->key)
> 
> Ditto
> 
> All in all much more understandable than the existing horror.

They cannot in fact be inlines because we play type tricks. Note how _k
can be either struct static_likely_key or struct static_unlikely_key.


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Kernel broken on processors without performance counters

2015-07-21 Thread Thomas Gleixner

On Tue, 21 Jul 2015, Peter Zijlstra wrote:

> On Fri, Jul 10, 2015 at 04:13:59PM +0200, Peter Zijlstra wrote:
> > On Wed, Jul 08, 2015 at 05:36:43PM -0700, Andy Lutomirski wrote:
> > > >> In what universe is "static_key_false" a reasonable name for a
> > > >> function that returns true if a static key is true?

It might be very well our universe, you just need access to the proper
drugs to feel comfortable with it.

> > > I think the current naming is almost maximally bad.  The naming would
> > > be less critical if it were typesafe, though.
> > 
> > How about something like so on top? It will allow us to slowly migrate
> > existing and new users over to the hopefully saner interface?

> > -#endif /* _LINUX_JUMP_LABEL_H */
> > +static inline void static_key_enable(struct static_key *key)
> > +{
> > +   int count = static_key_count(key);
> > +
> > +   WARN_ON_ONCE(count < 0 || count > 1);
> > +
> > +   if (!count)
> > +   static_key_slow_inc(key);
> > +}
> > +
> > +static inline void static_key_disable(struct static_key *key)
> > +{
> > +   int count = static_key_count(key);
> > +
> > +   WARN_ON_ONCE(count < 0 || count > 1);
> > +
> > +   if (count)
> > +   static_key_slow_dec(key);
> > +}

The functions above are not part of the interface which should be used
in code, right?

To avoid further confusion, can we please move all the existing mess
and the new helpers into jump_label_internals.h and just put the new
interfaces in jump_label.h?

> > +/* 
> > -- 
> > */
> > +
> > +/*
> > + * likely -- default enabled, puts the branch body in-line
> > + */
> > +
> > +struct static_likely_key {
> > +   struct static_key key;
> > +};
> > +
> > +#define STATIC_LIKELY_KEY_INIT (struct static_likely_key){ .key = 
> > STATIC_KEY_INIT_TRUE, }
> > +
> > +static inline bool static_likely_branch(struct static_likely_key *key)
> > +{
> > +   return static_key_true(&key->key);
> > +}
> > +
> > +/*
> > + * unlikely -- default disabled, puts the branch body out-of-line
> > + */
> > +
> > +struct static_unlikely_key {
> > +   struct static_key key;
> > +};
> > +
> > +#define STATIC_UNLIKELY_KEY_INIT (struct static_unlikely_key){ .key = 
> > STATIC_KEY_INIT_FALSE, }
> > +
> > +static inline bool static_unlikely_branch(struct static_unlikely_key *key)
> > +{
> > +   return static_key_false(&key->key);
> > +}
> > +
> > +/*
> > + * Advanced usage; refcount, branch is enabled when: count != 0
> > + */
> > +
> > +#define static_branch_inc(_k)  static_key_slow_inc(&(_k)->key)
> > +#define static_branch_dec(_k)  static_key_slow_dec(&(_k)->key)

Inlines please

> > +/*
> > + * Normal usage; boolean enable/disable.
> > + */
> > +
> > +#define static_branch_enable(_k)   static_key_enable(&(_k)->key)
> > +#define static_branch_disable(_k)  static_key_disable(&(_k)->key)

Ditto

All in all much more understandable than the existing horror.

Thanks,

tglx
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Kernel broken on processors without performance counters

2015-07-21 Thread Peter Zijlstra

On Fri, Jul 10, 2015 at 04:13:59PM +0200, Peter Zijlstra wrote:
> On Wed, Jul 08, 2015 at 05:36:43PM -0700, Andy Lutomirski wrote:
> > >> In what universe is "static_key_false" a reasonable name for a
> > >> function that returns true if a static key is true?
> 
> > I think the current naming is almost maximally bad.  The naming would
> > be less critical if it were typesafe, though.
> 
> How about something like so on top? It will allow us to slowly migrate
> existing and new users over to the hopefully saner interface?

Anybody? (Adding more Cc's)

> ---
>  include/linux/jump_label.h | 67 
> +-
>  kernel/sched/core.c| 18 ++---
>  2 files changed, 74 insertions(+), 11 deletions(-)
> 
> diff --git a/include/linux/jump_label.h b/include/linux/jump_label.h
> index f4de473f226b..98ed3c2ec78d 100644
> --- a/include/linux/jump_label.h
> +++ b/include/linux/jump_label.h
> @@ -213,6 +213,71 @@ static inline bool static_key_enabled(struct static_key 
> *key)
>   return static_key_count(key) > 0;
>  }
>  
> -#endif   /* _LINUX_JUMP_LABEL_H */
> +static inline void static_key_enable(struct static_key *key)
> +{
> + int count = static_key_count(key);
> +
> + WARN_ON_ONCE(count < 0 || count > 1);
> +
> + if (!count)
> + static_key_slow_inc(key);
> +}
> +
> +static inline void static_key_disable(struct static_key *key)
> +{
> + int count = static_key_count(key);
> +
> + WARN_ON_ONCE(count < 0 || count > 1);
> +
> + if (count)
> + static_key_slow_dec(key);
> +}
> +
> +/* 
> -- */
> +
> +/*
> + * likely -- default enabled, puts the branch body in-line
> + */
> +
> +struct static_likely_key {
> + struct static_key key;
> +};
> +
> +#define STATIC_LIKELY_KEY_INIT   (struct static_likely_key){ .key = 
> STATIC_KEY_INIT_TRUE, }
> +
> +static inline bool static_likely_branch(struct static_likely_key *key)
> +{
> + return static_key_true(&key->key);
> +}
> +
> +/*
> + * unlikely -- default disabled, puts the branch body out-of-line
> + */
> +
> +struct static_unlikely_key {
> + struct static_key key;
> +};
> +
> +#define STATIC_UNLIKELY_KEY_INIT (struct static_unlikely_key){ .key = 
> STATIC_KEY_INIT_FALSE, }
> +
> +static inline bool static_unlikely_branch(struct static_unlikely_key *key)
> +{
> + return static_key_false(&key->key);
> +}
> +
> +/*
> + * Advanced usage; refcount, branch is enabled when: count != 0
> + */
> +
> +#define static_branch_inc(_k)static_key_slow_inc(&(_k)->key)
> +#define static_branch_dec(_k)static_key_slow_dec(&(_k)->key)
> +
> +/*
> + * Normal usage; boolean enable/disable.
> + */
> +
> +#define static_branch_enable(_k) static_key_enable(&(_k)->key)
> +#define static_branch_disable(_k)static_key_disable(&(_k)->key)
>  
>  #endif /* __ASSEMBLY__ */
> +#endif   /* _LINUX_JUMP_LABEL_H */
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 78b4bad10081..22ba92297375 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -164,14 +164,12 @@ struct static_key sched_feat_keys[__SCHED_FEAT_NR] = {
>  
>  static void sched_feat_disable(int i)
>  {
> - if (static_key_enabled(&sched_feat_keys[i]))
> - static_key_slow_dec(&sched_feat_keys[i]);
> + static_key_enable(&sched_feat_keys[i]);
>  }
>  
>  static void sched_feat_enable(int i)
>  {
> - if (!static_key_enabled(&sched_feat_keys[i]))
> - static_key_slow_inc(&sched_feat_keys[i]);
> + static_key_disable(&sched_feat_keys[i]);
>  }
>  #else
>  static void sched_feat_disable(int i) { };
> @@ -2318,17 +2316,17 @@ void wake_up_new_task(struct task_struct *p)
>  
>  #ifdef CONFIG_PREEMPT_NOTIFIERS
>  
> -static struct static_key preempt_notifier_key = STATIC_KEY_INIT_FALSE;
> +static struct static_unlikely_key preempt_notifier_key = 
> STATIC_UNLIKELY_KEY_INIT;
>  
>  void preempt_notifier_inc(void)
>  {
> - static_key_slow_inc(&preempt_notifier_key);
> + static_branch_inc(&preempt_notifier_key);
>  }
>  EXPORT_SYMBOL_GPL(preempt_notifier_inc);
>  
>  void preempt_notifier_dec(void)
>  {
> - static_key_slow_dec(&preempt_notifier_key);
> + static_branch_dec(&preempt_notifier_key);
>  }
>  EXPORT_SYMBOL_GPL(preempt_notifier_dec);
>  
> @@ -2338,7 +2336,7 @@ EXPORT_SYMBOL_GPL(preempt_notifier_dec);
>   */
>  void preempt_notifier_register(struct preempt_notifier *notifier)
>  {
> - if (!static_key_false(&preempt_notifier_key))
> + if (!static_unlikely_branch(&preempt_notifier_key))
>   WARN(1, "registering preempt_notifier while notifiers 
> disabled\n");
>  
>   hlist_add_head(¬ifier->link, ¤t->preempt_notifiers);
> @@ -2367,7 +2365,7 @@ static void __fire_sched_in_preempt_notifiers(struct 
> task_struct *curr)
>  
>  static __always_inline void fire_sched_in_preempt_notifiers(struct 
> task_struct *curr)
>  {
> - if (static_

Re: Kernel broken on processors without performance counters

2015-07-14 Thread Mikulas Patocka



On Tue, 14 Jul 2015, Borislav Petkov wrote:

> On Wed, Jul 08, 2015 at 11:17:38AM -0400, Mikulas Patocka wrote:
> > Hi
> > 
> > I found out that the patch a66734297f78707ce39d756b656bfae861d53f62 breaks 
> > the kernel on processors without performance counters, such as AMD K6-3.
> 
> Out of curiosity: are you actually using this piece of computer history
> or you're only booting new kernels for regressions?
> 
> :)
> 
> Thanks.

I use it, but mostly for connecting to other machines. I wanted to test 
some other patch, I compiled a new kernel on K6-3 and found out this 
crash.

Mikulas
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Kernel broken on processors without performance counters

2015-07-14 Thread Borislav Petkov

On Wed, Jul 08, 2015 at 11:17:38AM -0400, Mikulas Patocka wrote:
> Hi
> 
> I found out that the patch a66734297f78707ce39d756b656bfae861d53f62 breaks 
> the kernel on processors without performance counters, such as AMD K6-3.

Out of curiosity: are you actually using this piece of computer history
or you're only booting new kernels for regressions?

:)

Thanks.

-- 
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.
--
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Kernel broken on processors without performance counters

2015-07-10 Thread Jason Baron

On 07/10/2015 10:13 AM, Peter Zijlstra wrote:
> On Wed, Jul 08, 2015 at 05:36:43PM -0700, Andy Lutomirski wrote:
 In what universe is "static_key_false" a reasonable name for a
 function that returns true if a static key is true?
>> I think the current naming is almost maximally bad.  The naming would
>> be less critical if it were typesafe, though.
> How about something like so on top? It will allow us to slowly migrate
> existing and new users over to the hopefully saner interface?
>
> ---
>  include/linux/jump_label.h | 67 
> +-
>  kernel/sched/core.c| 18 ++---
>  2 files changed, 74 insertions(+), 11 deletions(-)
>
> diff --git a/include/linux/jump_label.h b/include/linux/jump_label.h
> index f4de473f226b..98ed3c2ec78d 100644
> --- a/include/linux/jump_label.h
> +++ b/include/linux/jump_label.h
> @@ -213,6 +213,71 @@ static inline bool static_key_enabled(struct static_key 
> *key)
>   return static_key_count(key) > 0;
>  }
>  
> -#endif   /* _LINUX_JUMP_LABEL_H */
> +static inline void static_key_enable(struct static_key *key)
> +{
> + int count = static_key_count(key);
> +
> + WARN_ON_ONCE(count < 0 || count > 1);
> +
> + if (!count)
> + static_key_slow_inc(key);
> +}
> +
> +static inline void static_key_disable(struct static_key *key)
> +{
> + int count = static_key_count(key);
> +
> + WARN_ON_ONCE(count < 0 || count > 1);
> +
> + if (count)
> + static_key_slow_dec(key);
> +}

should those be __static_key_enable()/disable() to indicate that we don't
that we don't want ppl using these directly. Similarly for other 'internal'
functions.

> +
> +/* 
> -- */
> +
> +/*
> + * likely -- default enabled, puts the branch body in-line
> + */
> +
> +struct static_likely_key {
> + struct static_key key;
> +};
> +
> +#define STATIC_LIKELY_KEY_INIT   (struct static_likely_key){ .key = 
> STATIC_KEY_INIT_TRUE, }
> +
> +static inline bool static_likely_branch(struct static_likely_key *key)
> +{
> + return static_key_true(&key->key);
> +}
> +
> +/*
> + * unlikely -- default disabled, puts the branch body out-of-line
> + */
> +
> +struct static_unlikely_key {
> + struct static_key key;
> +};
> +
> +#define STATIC_UNLIKELY_KEY_INIT (struct static_unlikely_key){ .key = 
> STATIC_KEY_INIT_FALSE, }
> +
> +static inline bool static_unlikely_branch(struct static_unlikely_key *key)
> +{
> + return static_key_false(&key->key);
> +}
> +
> +/*
> + * Advanced usage; refcount, branch is enabled when: count != 0
> + */
> +
> +#define static_branch_inc(_k)static_key_slow_inc(&(_k)->key)
> +#define static_branch_dec(_k)static_key_slow_dec(&(_k)->key)
> +

I think of these as operations on the 'static_key' so I still
like 'static_key_inc()/dec()' (removing the 'slow' makes them
different still).

> +/*
> + * Normal usage; boolean enable/disable.
> + */
> +
> +#define static_branch_enable(_k) static_key_enable(&(_k)->key)
> +#define static_branch_disable(_k)static_key_disable(&(_k)->key)
>  

Same here maybe: static_key_set_true()/false()?

Thanks,

-Jason



--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Kernel broken on processors without performance counters

2015-07-10 Thread Peter Zijlstra

On Wed, Jul 08, 2015 at 05:36:43PM -0700, Andy Lutomirski wrote:
> >> In what universe is "static_key_false" a reasonable name for a
> >> function that returns true if a static key is true?

> I think the current naming is almost maximally bad.  The naming would
> be less critical if it were typesafe, though.

How about something like so on top? It will allow us to slowly migrate
existing and new users over to the hopefully saner interface?

---
 include/linux/jump_label.h | 67 +-
 kernel/sched/core.c| 18 ++---
 2 files changed, 74 insertions(+), 11 deletions(-)

diff --git a/include/linux/jump_label.h b/include/linux/jump_label.h
index f4de473f226b..98ed3c2ec78d 100644
--- a/include/linux/jump_label.h
+++ b/include/linux/jump_label.h
@@ -213,6 +213,71 @@ static inline bool static_key_enabled(struct static_key 
*key)
return static_key_count(key) > 0;
 }
 
-#endif /* _LINUX_JUMP_LABEL_H */
+static inline void static_key_enable(struct static_key *key)
+{
+   int count = static_key_count(key);
+
+   WARN_ON_ONCE(count < 0 || count > 1);
+
+   if (!count)
+   static_key_slow_inc(key);
+}
+
+static inline void static_key_disable(struct static_key *key)
+{
+   int count = static_key_count(key);
+
+   WARN_ON_ONCE(count < 0 || count > 1);
+
+   if (count)
+   static_key_slow_dec(key);
+}
+
+/* -- 
*/
+
+/*
+ * likely -- default enabled, puts the branch body in-line
+ */
+
+struct static_likely_key {
+   struct static_key key;
+};
+
+#define STATIC_LIKELY_KEY_INIT (struct static_likely_key){ .key = 
STATIC_KEY_INIT_TRUE, }
+
+static inline bool static_likely_branch(struct static_likely_key *key)
+{
+   return static_key_true(&key->key);
+}
+
+/*
+ * unlikely -- default disabled, puts the branch body out-of-line
+ */
+
+struct static_unlikely_key {
+   struct static_key key;
+};
+
+#define STATIC_UNLIKELY_KEY_INIT (struct static_unlikely_key){ .key = 
STATIC_KEY_INIT_FALSE, }
+
+static inline bool static_unlikely_branch(struct static_unlikely_key *key)
+{
+   return static_key_false(&key->key);
+}
+
+/*
+ * Advanced usage; refcount, branch is enabled when: count != 0
+ */
+
+#define static_branch_inc(_k)  static_key_slow_inc(&(_k)->key)
+#define static_branch_dec(_k)  static_key_slow_dec(&(_k)->key)
+
+/*
+ * Normal usage; boolean enable/disable.
+ */
+
+#define static_branch_enable(_k)   static_key_enable(&(_k)->key)
+#define static_branch_disable(_k)  static_key_disable(&(_k)->key)
 
 #endif /* __ASSEMBLY__ */
+#endif /* _LINUX_JUMP_LABEL_H */
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 78b4bad10081..22ba92297375 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -164,14 +164,12 @@ struct static_key sched_feat_keys[__SCHED_FEAT_NR] = {
 
 static void sched_feat_disable(int i)
 {
-   if (static_key_enabled(&sched_feat_keys[i]))
-   static_key_slow_dec(&sched_feat_keys[i]);
+   static_key_enable(&sched_feat_keys[i]);
 }
 
 static void sched_feat_enable(int i)
 {
-   if (!static_key_enabled(&sched_feat_keys[i]))
-   static_key_slow_inc(&sched_feat_keys[i]);
+   static_key_disable(&sched_feat_keys[i]);
 }
 #else
 static void sched_feat_disable(int i) { };
@@ -2318,17 +2316,17 @@ void wake_up_new_task(struct task_struct *p)
 
 #ifdef CONFIG_PREEMPT_NOTIFIERS
 
-static struct static_key preempt_notifier_key = STATIC_KEY_INIT_FALSE;
+static struct static_unlikely_key preempt_notifier_key = 
STATIC_UNLIKELY_KEY_INIT;
 
 void preempt_notifier_inc(void)
 {
-   static_key_slow_inc(&preempt_notifier_key);
+   static_branch_inc(&preempt_notifier_key);
 }
 EXPORT_SYMBOL_GPL(preempt_notifier_inc);
 
 void preempt_notifier_dec(void)
 {
-   static_key_slow_dec(&preempt_notifier_key);
+   static_branch_dec(&preempt_notifier_key);
 }
 EXPORT_SYMBOL_GPL(preempt_notifier_dec);
 
@@ -2338,7 +2336,7 @@ EXPORT_SYMBOL_GPL(preempt_notifier_dec);
  */
 void preempt_notifier_register(struct preempt_notifier *notifier)
 {
-   if (!static_key_false(&preempt_notifier_key))
+   if (!static_unlikely_branch(&preempt_notifier_key))
WARN(1, "registering preempt_notifier while notifiers 
disabled\n");
 
hlist_add_head(¬ifier->link, ¤t->preempt_notifiers);
@@ -2367,7 +2365,7 @@ static void __fire_sched_in_preempt_notifiers(struct 
task_struct *curr)
 
 static __always_inline void fire_sched_in_preempt_notifiers(struct task_struct 
*curr)
 {
-   if (static_key_false(&preempt_notifier_key))
+   if (static_unlikely_branch(&preempt_notifier_key))
__fire_sched_in_preempt_notifiers(curr);
 }
 
@@ -2385,7 +2383,7 @@ static __always_inline void
 fire_sched_out_preempt_notifiers(struct task_struct *curr,
 struct task_struct *next)
 {
-   if (static_key_false(&preempt_notifier_ke

Re: Kernel broken on processors without performance counters

2015-07-09 Thread Jason Baron

On 07/09/2015 01:11 PM, Peter Zijlstra wrote:
> On Wed, Jul 08, 2015 at 04:04:32PM -0400, Jason Baron wrote:
>> So the 'static_key_false' is really branch is initially false. We had
>> a naming discussion before, but if ppl think its confusing,
>> 'static_key_init_false', or 'static_key_default_false' might be better,
>> or other ideas I agree its confusing.
> Jason, while I have you here on the subject; mind posting that true/fase
> thing again? That makes more sense than the inc/dec thing for some usage
> sites.
>
>

Ok, sounds good. I will try and get to this next week.

Thanks,

-Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Kernel broken on processors without performance counters

2015-07-09 Thread Peter Zijlstra

On Wed, Jul 08, 2015 at 04:04:32PM -0400, Jason Baron wrote:
> So the 'static_key_false' is really branch is initially false. We had
> a naming discussion before, but if ppl think its confusing,
> 'static_key_init_false', or 'static_key_default_false' might be better,
> or other ideas I agree its confusing.

Jason, while I have you here on the subject; mind posting that true/fase
thing again? That makes more sense than the inc/dec thing for some usage
sites.


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Kernel broken on processors without performance counters

2015-07-08 Thread Andy Lutomirski

On Wed, Jul 8, 2015 at 1:04 PM, Jason Baron  wrote:
> On 07/08/2015 01:37 PM, Andy Lutomirski wrote:
>> On Wed, Jul 8, 2015 at 9:07 AM, Peter Zijlstra  wrote:
>>> On Wed, Jul 08, 2015 at 11:17:38AM -0400, Mikulas Patocka wrote:
 Hi

 I found out that the patch a66734297f78707ce39d756b656bfae861d53f62 breaks
 the kernel on processors without performance counters, such as AMD K6-3.
 Reverting the patch fixes the problem.

 The static key rdpmc_always_available somehow gets set (I couldn't really
 find out what is setting it, the function set_attr_rdpmc is not executed),
 cr4_set_bits(X86_CR4_PCE) is executed and that results in a crash on boot
 when attempting to execute init, because the proecssor doesn't support
 that bit in CR4.
>>> Urgh, the static key trainwreck bites again.
>>>
>>> One is not supposed to mix static_key_true() and STATIC_KEY_INIT_FALSE.
>>>
>>> Does this make it go again?
>>>
>>> ---
>>>  arch/x86/include/asm/mmu_context.h | 2 +-
>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/arch/x86/include/asm/mmu_context.h 
>>> b/arch/x86/include/asm/mmu_context.h
>>> index 5e8daee7c5c9..804a3a6030ca 100644
>>> --- a/arch/x86/include/asm/mmu_context.h
>>> +++ b/arch/x86/include/asm/mmu_context.h
>>> @@ -23,7 +23,7 @@ extern struct static_key rdpmc_always_available;
>>>
>>>  static inline void load_mm_cr4(struct mm_struct *mm)
>>>  {
>>> -   if (static_key_true(&rdpmc_always_available) ||
>>> +   if (static_key_false(&rdpmc_always_available) ||
>> In what universe is "static_key_false" a reasonable name for a
>> function that returns true if a static key is true?
>>
>> Can we rename that function?  And could we maybe make static keys type
>> safe?  I.e. there would be a type that starts out true and a type that
>> starts out false.
>
> So the 'static_key_false' is really branch is initially false. We had
> a naming discussion before, but if ppl think its confusing,
> 'static_key_init_false', or 'static_key_default_false' might be better,
> or other ideas I agree its confusing.

I think the current naming is almost maximally bad.  The naming would
be less critical if it were typesafe, though.

>
> In terms of getting the type to match so we don't have these
> mismatches, I think we could introduce 'struct static_key_false'
> and 'struct static_key_true' with proper initializers. However,
> 'static_key_slow_inc()/dec()'  would also have to add the
> true/false modifier.

I think that would be okay.

> Or maybe we do:
>
> struct static_key_false {
> struct static_key key;
> } random_key;
>
> and then the 'static_key_sloc_inc()/dec()' would just take
> a &random_key.key

That would be okay, too.  Or it could be a macro that has the same effect.

>
> If we were to change this, I don't think it would be too hard to
> introduce the new API, convert subsystems over time and then
> drop the old one.

And we might discover a bug or three while doing it :)

--Andy
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Kernel broken on processors without performance counters

2015-07-08 Thread Jason Baron

On 07/08/2015 01:37 PM, Andy Lutomirski wrote:
> On Wed, Jul 8, 2015 at 9:07 AM, Peter Zijlstra  wrote:
>> On Wed, Jul 08, 2015 at 11:17:38AM -0400, Mikulas Patocka wrote:
>>> Hi
>>>
>>> I found out that the patch a66734297f78707ce39d756b656bfae861d53f62 breaks
>>> the kernel on processors without performance counters, such as AMD K6-3.
>>> Reverting the patch fixes the problem.
>>>
>>> The static key rdpmc_always_available somehow gets set (I couldn't really
>>> find out what is setting it, the function set_attr_rdpmc is not executed),
>>> cr4_set_bits(X86_CR4_PCE) is executed and that results in a crash on boot
>>> when attempting to execute init, because the proecssor doesn't support
>>> that bit in CR4.
>> Urgh, the static key trainwreck bites again.
>>
>> One is not supposed to mix static_key_true() and STATIC_KEY_INIT_FALSE.
>>
>> Does this make it go again?
>>
>> ---
>>  arch/x86/include/asm/mmu_context.h | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/arch/x86/include/asm/mmu_context.h 
>> b/arch/x86/include/asm/mmu_context.h
>> index 5e8daee7c5c9..804a3a6030ca 100644
>> --- a/arch/x86/include/asm/mmu_context.h
>> +++ b/arch/x86/include/asm/mmu_context.h
>> @@ -23,7 +23,7 @@ extern struct static_key rdpmc_always_available;
>>
>>  static inline void load_mm_cr4(struct mm_struct *mm)
>>  {
>> -   if (static_key_true(&rdpmc_always_available) ||
>> +   if (static_key_false(&rdpmc_always_available) ||
> In what universe is "static_key_false" a reasonable name for a
> function that returns true if a static key is true?
>
> Can we rename that function?  And could we maybe make static keys type
> safe?  I.e. there would be a type that starts out true and a type that
> starts out false.

So the 'static_key_false' is really branch is initially false. We had
a naming discussion before, but if ppl think its confusing,
'static_key_init_false', or 'static_key_default_false' might be better,
or other ideas I agree its confusing.

In terms of getting the type to match so we don't have these
mismatches, I think we could introduce 'struct static_key_false'
and 'struct static_key_true' with proper initializers. However,
'static_key_slow_inc()/dec()'  would also have to add the
true/false modifier. Or maybe we do:

struct static_key_false {
struct static_key key;
} random_key;

and then the 'static_key_sloc_inc()/dec()' would just take
a &random_key.key

If we were to change this, I don't think it would be too hard to
introduce the new API, convert subsystems over time and then
drop the old one.

Thanks,

-Jason


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Kernel broken on processors without performance counters

2015-07-08 Thread Andy Lutomirski

On Wed, Jul 8, 2015 at 9:07 AM, Peter Zijlstra  wrote:
> On Wed, Jul 08, 2015 at 11:17:38AM -0400, Mikulas Patocka wrote:
>> Hi
>>
>> I found out that the patch a66734297f78707ce39d756b656bfae861d53f62 breaks
>> the kernel on processors without performance counters, such as AMD K6-3.
>> Reverting the patch fixes the problem.
>>
>> The static key rdpmc_always_available somehow gets set (I couldn't really
>> find out what is setting it, the function set_attr_rdpmc is not executed),
>> cr4_set_bits(X86_CR4_PCE) is executed and that results in a crash on boot
>> when attempting to execute init, because the proecssor doesn't support
>> that bit in CR4.
>
> Urgh, the static key trainwreck bites again.
>
> One is not supposed to mix static_key_true() and STATIC_KEY_INIT_FALSE.
>
> Does this make it go again?
>
> ---
>  arch/x86/include/asm/mmu_context.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/x86/include/asm/mmu_context.h 
> b/arch/x86/include/asm/mmu_context.h
> index 5e8daee7c5c9..804a3a6030ca 100644
> --- a/arch/x86/include/asm/mmu_context.h
> +++ b/arch/x86/include/asm/mmu_context.h
> @@ -23,7 +23,7 @@ extern struct static_key rdpmc_always_available;
>
>  static inline void load_mm_cr4(struct mm_struct *mm)
>  {
> -   if (static_key_true(&rdpmc_always_available) ||
> +   if (static_key_false(&rdpmc_always_available) ||

In what universe is "static_key_false" a reasonable name for a
function that returns true if a static key is true?

Can we rename that function?  And could we maybe make static keys type
safe?  I.e. there would be a type that starts out true and a type that
starts out false.

--Andy
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Kernel broken on processors without performance counters

2015-07-08 Thread Mikulas Patocka



On Wed, 8 Jul 2015, Peter Zijlstra wrote:

> On Wed, Jul 08, 2015 at 11:17:38AM -0400, Mikulas Patocka wrote:
> > Hi
> > 
> > I found out that the patch a66734297f78707ce39d756b656bfae861d53f62 breaks 
> > the kernel on processors without performance counters, such as AMD K6-3. 
> > Reverting the patch fixes the problem.
> > 
> > The static key rdpmc_always_available somehow gets set (I couldn't really 
> > find out what is setting it, the function set_attr_rdpmc is not executed), 
> > cr4_set_bits(X86_CR4_PCE) is executed and that results in a crash on boot 
> > when attempting to execute init, because the proecssor doesn't support 
> > that bit in CR4.
> 
> Urgh, the static key trainwreck bites again.
> 
> One is not supposed to mix static_key_true() and STATIC_KEY_INIT_FALSE.
> 
> Does this make it go again?

Yes, this patch fixes the problem.

Mikulas

> ---
>  arch/x86/include/asm/mmu_context.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/x86/include/asm/mmu_context.h 
> b/arch/x86/include/asm/mmu_context.h
> index 5e8daee7c5c9..804a3a6030ca 100644
> --- a/arch/x86/include/asm/mmu_context.h
> +++ b/arch/x86/include/asm/mmu_context.h
> @@ -23,7 +23,7 @@ extern struct static_key rdpmc_always_available;
>  
>  static inline void load_mm_cr4(struct mm_struct *mm)
>  {
> - if (static_key_true(&rdpmc_always_available) ||
> + if (static_key_false(&rdpmc_always_available) ||
>   atomic_read(&mm->context.perf_rdpmc_allowed))
>   cr4_set_bits(X86_CR4_PCE);
>   else
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Kernel broken on processors without performance counters

2015-07-08 Thread Peter Zijlstra

On Wed, Jul 08, 2015 at 11:17:38AM -0400, Mikulas Patocka wrote:
> Hi
> 
> I found out that the patch a66734297f78707ce39d756b656bfae861d53f62 breaks 
> the kernel on processors without performance counters, such as AMD K6-3. 
> Reverting the patch fixes the problem.
> 
> The static key rdpmc_always_available somehow gets set (I couldn't really 
> find out what is setting it, the function set_attr_rdpmc is not executed), 
> cr4_set_bits(X86_CR4_PCE) is executed and that results in a crash on boot 
> when attempting to execute init, because the proecssor doesn't support 
> that bit in CR4.

Urgh, the static key trainwreck bites again.

One is not supposed to mix static_key_true() and STATIC_KEY_INIT_FALSE.

Does this make it go again?

---
 arch/x86/include/asm/mmu_context.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/mmu_context.h 
b/arch/x86/include/asm/mmu_context.h
index 5e8daee7c5c9..804a3a6030ca 100644
--- a/arch/x86/include/asm/mmu_context.h
+++ b/arch/x86/include/asm/mmu_context.h
@@ -23,7 +23,7 @@ extern struct static_key rdpmc_always_available;
 
 static inline void load_mm_cr4(struct mm_struct *mm)
 {
-   if (static_key_true(&rdpmc_always_available) ||
+   if (static_key_false(&rdpmc_always_available) ||
atomic_read(&mm->context.perf_rdpmc_allowed))
cr4_set_bits(X86_CR4_PCE);
else
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Kernel broken on processors without performance counters

2015-07-08 Thread Mikulas Patocka

Hi

I found out that the patch a66734297f78707ce39d756b656bfae861d53f62 breaks 
the kernel on processors without performance counters, such as AMD K6-3. 
Reverting the patch fixes the problem.

The static key rdpmc_always_available somehow gets set (I couldn't really 
find out what is setting it, the function set_attr_rdpmc is not executed), 
cr4_set_bits(X86_CR4_PCE) is executed and that results in a crash on boot 
when attempting to execute init, because the proecssor doesn't support 
that bit in CR4.

Mikulas
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

51 matches

Mail list logo