RE: [PATCH v2 3/3] arch/*/: remove CONFIG_VIRT_TO_BUS

2022-06-30 Thread David Laight
From: Michael Schmitz
> Sent: 29 June 2022 00:09
> 
> Hi Arnd,
> 
> On 29/06/22 09:50, Arnd Bergmann wrote:
> > On Tue, Jun 28, 2022 at 11:03 PM Michael Schmitz  
> > wrote:
> >> On 28/06/22 19:03, Geert Uytterhoeven wrote:
>  The driver allocates bounce buffers using kmalloc if it hits an
>  unaligned data buffer - can such buffers still even happen these days?
> >>> No idea.
> >> Hmmm - I think I'll stick a WARN_ONCE() in there so we know whether this
> >> code path is still being used.
> > kmalloc() guarantees alignment to the next power-of-two size or
> > KMALLOC_MIN_ALIGN, whichever is bigger. On m68k this means it
> > is cacheline aligned.
> 
> And all SCSI buffers are allocated using kmalloc? No way at all for user
> space to pass unaligned data?

I didn't think kmalloc() gave any such guarantee about alignment.
There are cache-line alignment requirements on systems with non-coherent
dma, but otherwise the alignment can be much smaller.

One of the allocators adds a header to each item, IIRC that can
lead to 'unexpected' alignments - especially on m68k.

dma_alloc_coherent() does align to next 'power of 2'.
And sometimes you need (eg) 16k allocates that are 16k aligned.

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)


Re: [RFC PATCH v3 11/12] powerpc: Remove unreachable() from WARN_ON()

2022-06-30 Thread Naveen N. Rao

Christophe Leroy wrote:

Hi Sathvika,

Adding ARM people as they seem to face the same kind of problem (see 
https://patchwork.kernel.org/project/linux-kbuild/patch/20220623014917.199563-33-chenzhong...@huawei.com/)


Le 27/06/2022 à 17:35, Sathvika Vasireddy a écrit :


On 25/06/22 12:16, Christophe Leroy wrote:


Le 24/06/2022 à 20:32, Sathvika Vasireddy a écrit :

objtool is throwing *unannotated intra-function call*
warnings with a few instructions that are marked
unreachable. Remove unreachable() from WARN_ON()
to fix these warnings, as the codegen remains same
with and without unreachable() in WARN_ON().

Did you try the two exemples described in commit 1e688dd2a3d6
("powerpc/bug: Provide better flexibility to WARN_ON/__WARN_FLAGS() with
asm goto") ?

Without your patch:

0640 :
   640:    81 23 00 84 lwz r9,132(r3)
   644:    71 29 40 00 andi.   r9,r9,16384
   648:    40 82 00 0c bne 654 
   64c:    80 63 00 0c lwz r3,12(r3)
   650:    4e 80 00 20 blr
   654:    0f e0 00 00 twui    r0,0

0658 :
   658:    2c 04 00 00 cmpwi   r4,0
   65c:    41 82 00 0c beq 668 
   660:    7c 63 23 96 divwu   r3,r3,r4
   664:    4e 80 00 20 blr
   668:    0f e0 00 00 twui    r0,0
   66c:    38 60 00 00 li  r3,0
   670:    4e 80 00 20 blr


With your patch:

0640 :
   640:    81 23 00 84 lwz r9,132(r3)
   644:    71 29 40 00 andi.   r9,r9,16384
   648:    40 82 00 0c bne 654 
   64c:    80 63 00 0c lwz r3,12(r3)
   650:    4e 80 00 20 blr
   654:    0f e0 00 00 twui    r0,0
   658:    4b ff ff f4 b   64c     <==

065c :
   65c:    2c 04 00 00 cmpwi   r4,0
   660:    41 82 00 0c beq 66c 
   664:    7c 63 23 96 divwu   r3,r3,r4
   668:    4e 80 00 20 blr
   66c:    0f e0 00 00 twui    r0,0
   670:    38 60 00 00 li  r3,0    <==
   674:    4e 80 00 20 blr    <==
   678:    38 60 00 00 li  r3,0
   67c:    4e 80 00 20 blr


The builtin variant of unreachable (__builtin_unreachable()) works.

How about using that instead of unreachable() ?




In fact the problem comes from the macro annotate_unreachable() which is 
called by unreachable() before calling __build_unreachable().


Seems like this macro adds (after the unconditional trap twui) a call to 
an empty function whose address is listed in section .discard.unreachable


 1c78:   00 00 e0 0f twuir0,0
 1c7c:   55 e7 ff 4b bl  3d0 




RELOCATION RECORDS FOR [.discard.unreachable]:
OFFSET   TYPE  VALUE
 R_PPC64_REL32 .text+0x03d0

The problem is that that function has size 0:

03d0 l F .text	 
qdisc_root_sleeping_lock.part.0



And objtool is not prepared for a function with size 0.


annotate_unreachable() seems to have been introduced in commit 
649ea4d5a624f0 ("objtool: Assume unannotated UD2 instructions are dead 
ends").


Objtool considers 'ud2' instruction to be fatal, so BUG() has 
__builtin_unreachable(), rather than unreachable(). See commit 
bfb1a7c91fb775 ("x86/bug: Merge annotate_reachable() into _BUG_FLAGS() 
asm"). For the same reason, __WARN_FLAGS() is annotated with 
_ASM_REACHABLE so that objtool can differentiate warnings from a BUG().


On powerpc, we use trap variants for both and don't have a special 
instruction for a BUG(). As such, for _WARN_FLAGS(), using 
__builtin_unreachable() suffices to achieve optimal code generation from 
the compiler. Objtool would consider subsequent instructions to be 
reachable. For BUG(), we can continue to use unreachable() so that 
objtool can differentiate these from traps used in warnings.




The following changes to objtool seem to fix the problem, most warning 
are gone with that change.


diff --git a/tools/objtool/elf.c b/tools/objtool/elf.c
index 63218f5799c2..37c0a268b7ea 100644
--- a/tools/objtool/elf.c
+++ b/tools/objtool/elf.c
@@ -77,6 +77,8 @@ static int symbol_by_offset(const void *key, const 
struct rb_node *node)


if (*o < s->offset)
return -1;
+   if (*o == s->offset && !s->len)
+   return 0;
if (*o >= s->offset + s->len)
return 1;

@@ -400,7 +402,7 @@ static void elf_add_symbol(struct elf *elf, struct 
symbol *sym)

 * Don't store empty STT_NOTYPE symbols in the rbtree.  They
 * can exist within a function, confusing the sorting.
 */
-   if (!sym->len)
+   if (sym->type == STT_NOTYPE && !sym->len)
rb_erase(&sym->node, &sym->sec->symbol_tree);
  }


Is there a reason to do this, rather than change __WARN_FLAGS() to use 
__builtin_unreachable()? Or, are you seeing an issue with unreachable() 
elsewhere in the kernel?



- Naveen



Re: [PATCH] crypto: vmx - drop unexpected word 'for' in comments

2022-06-30 Thread Herbert Xu
Jiang Jian  wrote:
> there is an unexpected word 'for' in the comments that need to be dropped
> 
> file - drivers/crypto/vmx/ghashp8-ppc.pl
> line - 19
> 
> "# GHASH for for PowerISA v2.07."
> 
> changed to:
> 
> "# GHASH for PowerISA v2.07."
> 
> Signed-off-by: Jiang Jian 
> ---
> drivers/crypto/vmx/ghashp8-ppc.pl | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)

Patch applied.  Thanks.
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: [PATCH] crypto: nx - drop unexpected word "the"

2022-06-30 Thread Herbert Xu
Jiang Jian  wrote:
> there is an unexpected word "the" in the comments that need to be dropped
> 
>>- * The DDE is setup with the the DDE count, byte count, and address of
>>+ * The DDE is setup with the DDE count, byte count, and address of
> 
> Signed-off-by: Jiang Jian 
> ---
> drivers/crypto/nx/nx-common-powernv.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)

Patch applied.  Thanks.
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: [PATCH v2 3/3] arch/*/: remove CONFIG_VIRT_TO_BUS

2022-06-30 Thread Christophe Leroy


Le 30/06/2022 à 10:04, David Laight a écrit :
> From: Michael Schmitz
>> Sent: 29 June 2022 00:09
>>
>> Hi Arnd,
>>
>> On 29/06/22 09:50, Arnd Bergmann wrote:
>>> On Tue, Jun 28, 2022 at 11:03 PM Michael Schmitz  
>>> wrote:
 On 28/06/22 19:03, Geert Uytterhoeven wrote:
>> The driver allocates bounce buffers using kmalloc if it hits an
>> unaligned data buffer - can such buffers still even happen these days?
> No idea.
 Hmmm - I think I'll stick a WARN_ONCE() in there so we know whether this
 code path is still being used.
>>> kmalloc() guarantees alignment to the next power-of-two size or
>>> KMALLOC_MIN_ALIGN, whichever is bigger. On m68k this means it
>>> is cacheline aligned.
>>
>> And all SCSI buffers are allocated using kmalloc? No way at all for user
>> space to pass unaligned data?
> 
> I didn't think kmalloc() gave any such guarantee about alignment.

I does since commit 59bb47985c1d ("mm, sl[aou]b: guarantee natural 
alignment for kmalloc(power-of-two)")

Christophe

> There are cache-line alignment requirements on systems with non-coherent
> dma, but otherwise the alignment can be much smaller.
> 
> One of the allocators adds a header to each item, IIRC that can
> lead to 'unexpected' alignments - especially on m68k.
> 
> dma_alloc_coherent() does align to next 'power of 2'.
> And sometimes you need (eg) 16k allocates that are 16k aligned.
> 
>   David
> 
> -
> Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 
> 1PT, UK
> Registration No: 1397386 (Wales)

Re: [RFC PATCH v3 11/12] powerpc: Remove unreachable() from WARN_ON()

2022-06-30 Thread Christophe Leroy


Le 30/06/2022 à 10:05, Naveen N. Rao a écrit :
> Christophe Leroy wrote:
>> Hi Sathvika,
>>
>> Adding ARM people as they seem to face the same kind of problem (see 
>> https://patchwork.kernel.org/project/linux-kbuild/patch/20220623014917.199563-33-chenzhong...@huawei.com/)
>>  
>>
>>
>> Le 27/06/2022 à 17:35, Sathvika Vasireddy a écrit :
>>>
>>> On 25/06/22 12:16, Christophe Leroy wrote:

 Le 24/06/2022 à 20:32, Sathvika Vasireddy a écrit :
> objtool is throwing *unannotated intra-function call*
> warnings with a few instructions that are marked
> unreachable. Remove unreachable() from WARN_ON()
> to fix these warnings, as the codegen remains same
> with and without unreachable() in WARN_ON().
 Did you try the two exemples described in commit 1e688dd2a3d6
 ("powerpc/bug: Provide better flexibility to WARN_ON/__WARN_FLAGS() 
 with
 asm goto") ?

 Without your patch:

 0640 :
    640:    81 23 00 84 lwz r9,132(r3)
    644:    71 29 40 00 andi.   r9,r9,16384
    648:    40 82 00 0c bne 654 
    64c:    80 63 00 0c lwz r3,12(r3)
    650:    4e 80 00 20 blr
    654:    0f e0 00 00 twui    r0,0

 0658 :
    658:    2c 04 00 00 cmpwi   r4,0
    65c:    41 82 00 0c beq 668 
    660:    7c 63 23 96 divwu   r3,r3,r4
    664:    4e 80 00 20 blr
    668:    0f e0 00 00 twui    r0,0
    66c:    38 60 00 00 li  r3,0
    670:    4e 80 00 20 blr


 With your patch:

 0640 :
    640:    81 23 00 84 lwz r9,132(r3)
    644:    71 29 40 00 andi.   r9,r9,16384
    648:    40 82 00 0c bne 654 
    64c:    80 63 00 0c lwz r3,12(r3)
    650:    4e 80 00 20 blr
    654:    0f e0 00 00 twui    r0,0
    658:    4b ff ff f4 b   64c     <==

 065c :
    65c:    2c 04 00 00 cmpwi   r4,0
    660:    41 82 00 0c beq 66c 
    664:    7c 63 23 96 divwu   r3,r3,r4
    668:    4e 80 00 20 blr
    66c:    0f e0 00 00 twui    r0,0
    670:    38 60 00 00 li  r3,0    <==
    674:    4e 80 00 20 blr    <==
    678:    38 60 00 00 li  r3,0
    67c:    4e 80 00 20 blr

>>> The builtin variant of unreachable (__builtin_unreachable()) works.
>>>
>>> How about using that instead of unreachable() ?
>>>
>>>
>>
>> In fact the problem comes from the macro annotate_unreachable() which 
>> is called by unreachable() before calling __build_unreachable().
>>
>> Seems like this macro adds (after the unconditional trap twui) a call 
>> to an empty function whose address is listed in section 
>> .discard.unreachable
>>
>>  1c78:   00 00 e0 0f twui    r0,0
>>  1c7c:   55 e7 ff 4b bl  3d0 
>> 
>>
>>
>> RELOCATION RECORDS FOR [.discard.unreachable]:
>> OFFSET   TYPE  VALUE
>>  R_PPC64_REL32 .text+0x03d0
>>
>> The problem is that that function has size 0:
>>
>> 03d0 l F .text     
>> qdisc_root_sleeping_lock.part.0
>>
>>
>> And objtool is not prepared for a function with size 0.
> 
> annotate_unreachable() seems to have been introduced in commit 
> 649ea4d5a624f0 ("objtool: Assume unannotated UD2 instructions are dead 
> ends").
> 
> Objtool considers 'ud2' instruction to be fatal, so BUG() has 
> __builtin_unreachable(), rather than unreachable(). See commit 
> bfb1a7c91fb775 ("x86/bug: Merge annotate_reachable() into _BUG_FLAGS() 
> asm"). For the same reason, __WARN_FLAGS() is annotated with 
> _ASM_REACHABLE so that objtool can differentiate warnings from a BUG().
> 
> On powerpc, we use trap variants for both and don't have a special 
> instruction for a BUG(). As such, for _WARN_FLAGS(), using 
> __builtin_unreachable() suffices to achieve optimal code generation from 
> the compiler. Objtool would consider subsequent instructions to be 
> reachable. For BUG(), we can continue to use unreachable() so that 
> objtool can differentiate these from traps used in warnings.

Not sure I understand what you mean.

__WARN_FLAGS() and BUG() both use 'twui' which is unconditionnal trap, 
as such both are the same.

On the other side, WARN_ON() and BUG_ON() use tlbnei which is a 
conditionnel trap.

> 
>>
>> The following changes to objtool seem to fix the problem, most warning 
>> are gone with that change.
>>
>> diff --git a/tools/objtool/elf.c b/tools/objtool/elf.c
>> index 63218f5799c2..37c0a268b7ea 100644
>> --- a/tools/objtool/elf.c
>> +++ b/tools/objtool/elf.c
>> @@ -77,6 +77,8 @@ static int symbol_by_offset(const void *key, const 
>> struct rb_node *node)
>>
>>   if (*o < s->offset)
>>   return -1;
>> +    if (*o == s->offset && !s->len)
>> +    return 0;
>>   if (*o >= s->offset + s->len)
>>   return 1;
>>
>> 

RE: [PATCH v2 3/3] arch/*/: remove CONFIG_VIRT_TO_BUS

2022-06-30 Thread David Laight
From: Christophe Leroy
> Sent: 30 June 2022 10:40
> 
> Le 30/06/2022 à 10:04, David Laight a écrit :
> > From: Michael Schmitz
> >> Sent: 29 June 2022 00:09
> >>
> >> Hi Arnd,
> >>
> >> On 29/06/22 09:50, Arnd Bergmann wrote:
> >>> On Tue, Jun 28, 2022 at 11:03 PM Michael Schmitz  
> >>> wrote:
>  On 28/06/22 19:03, Geert Uytterhoeven wrote:
> >> The driver allocates bounce buffers using kmalloc if it hits an
> >> unaligned data buffer - can such buffers still even happen these days?
> > No idea.
>  Hmmm - I think I'll stick a WARN_ONCE() in there so we know whether this
>  code path is still being used.
> >>> kmalloc() guarantees alignment to the next power-of-two size or
> >>> KMALLOC_MIN_ALIGN, whichever is bigger. On m68k this means it
> >>> is cacheline aligned.
> >>
> >> And all SCSI buffers are allocated using kmalloc? No way at all for user
> >> space to pass unaligned data?
> >
> > I didn't think kmalloc() gave any such guarantee about alignment.
> 
> I does since commit 59bb47985c1d ("mm, sl[aou]b: guarantee natural
> alignment for kmalloc(power-of-two)")

Looks like it is done for 'power-of-two' less than PAGE_SIZE.
This may not help scsi tape writes which could easily be (say) 47 bytes.
I think that only guarantees 2 byte alignment on m68k.
(Although increasing the min-alignment on m68k to 4 (or even 8)
will probably make no measurable difference.)

What happens above PAGE_SIZE?
Any structure with a trailing [] field could easily request
'64k + a_bit' bytes.
You don't really want to extend this to 128k - but I suspect
that is what happens.

David
 

> 
> Christophe
> 
> > There are cache-line alignment requirements on systems with non-coherent
> > dma, but otherwise the alignment can be much smaller.
> >
> > One of the allocators adds a header to each item, IIRC that can
> > lead to 'unexpected' alignments - especially on m68k.
> >
> > dma_alloc_coherent() does align to next 'power of 2'.
> > And sometimes you need (eg) 16k allocates that are 16k aligned.
> >
> > David
> >
> > -
> > Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 
> > 1PT, UK
> > Registration No: 1397386 (Wales)

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)


Re: [RFC PATCH v3 11/12] powerpc: Remove unreachable() from WARN_ON()

2022-06-30 Thread Christophe Leroy




Le 30/06/2022 à 11:58, Christophe Leroy a écrit :



Le 30/06/2022 à 10:05, Naveen N. Rao a écrit :

Christophe Leroy wrote:

Hi Sathvika,

Adding ARM people as they seem to face the same kind of problem (see 
https://patchwork.kernel.org/project/linux-kbuild/patch/20220623014917.199563-33-chenzhong...@huawei.com/) 



Le 27/06/2022 à 17:35, Sathvika Vasireddy a écrit :


On 25/06/22 12:16, Christophe Leroy wrote:


Le 24/06/2022 à 20:32, Sathvika Vasireddy a écrit :

objtool is throwing *unannotated intra-function call*
warnings with a few instructions that are marked
unreachable. Remove unreachable() from WARN_ON()
to fix these warnings, as the codegen remains same
with and without unreachable() in WARN_ON().

Did you try the two exemples described in commit 1e688dd2a3d6
("powerpc/bug: Provide better flexibility to WARN_ON/__WARN_FLAGS() 
with

asm goto") ?

Without your patch:

0640 :
   640:    81 23 00 84 lwz r9,132(r3)
   644:    71 29 40 00 andi.   r9,r9,16384
   648:    40 82 00 0c bne 654 
   64c:    80 63 00 0c lwz r3,12(r3)
   650:    4e 80 00 20 blr
   654:    0f e0 00 00 twui    r0,0

0658 :
   658:    2c 04 00 00 cmpwi   r4,0
   65c:    41 82 00 0c beq 668 
   660:    7c 63 23 96 divwu   r3,r3,r4
   664:    4e 80 00 20 blr
   668:    0f e0 00 00 twui    r0,0
   66c:    38 60 00 00 li  r3,0
   670:    4e 80 00 20 blr


With your patch:

0640 :
   640:    81 23 00 84 lwz r9,132(r3)
   644:    71 29 40 00 andi.   r9,r9,16384
   648:    40 82 00 0c bne 654 
   64c:    80 63 00 0c lwz r3,12(r3)
   650:    4e 80 00 20 blr
   654:    0f e0 00 00 twui    r0,0
   658:    4b ff ff f4 b   64c     <==

065c :
   65c:    2c 04 00 00 cmpwi   r4,0
   660:    41 82 00 0c beq 66c 
   664:    7c 63 23 96 divwu   r3,r3,r4
   668:    4e 80 00 20 blr
   66c:    0f e0 00 00 twui    r0,0
   670:    38 60 00 00 li  r3,0    <==
   674:    4e 80 00 20 blr    <==
   678:    38 60 00 00 li  r3,0
   67c:    4e 80 00 20 blr


The builtin variant of unreachable (__builtin_unreachable()) works.

How about using that instead of unreachable() ?




In fact the problem comes from the macro annotate_unreachable() which 
is called by unreachable() before calling __build_unreachable().


Seems like this macro adds (after the unconditional trap twui) a call 
to an empty function whose address is listed in section 
.discard.unreachable


 1c78:   00 00 e0 0f twui    r0,0
 1c7c:   55 e7 ff 4b bl  3d0 




RELOCATION RECORDS FOR [.discard.unreachable]:
OFFSET   TYPE  VALUE
 R_PPC64_REL32 .text+0x03d0

The problem is that that function has size 0:

03d0 l F .text     
qdisc_root_sleeping_lock.part.0



And objtool is not prepared for a function with size 0.


annotate_unreachable() seems to have been introduced in commit 
649ea4d5a624f0 ("objtool: Assume unannotated UD2 instructions are dead 
ends").


Objtool considers 'ud2' instruction to be fatal, so BUG() has 
__builtin_unreachable(), rather than unreachable(). See commit 
bfb1a7c91fb775 ("x86/bug: Merge annotate_reachable() into _BUG_FLAGS() 
asm"). For the same reason, __WARN_FLAGS() is annotated with 
_ASM_REACHABLE so that objtool can differentiate warnings from a BUG().


On powerpc, we use trap variants for both and don't have a special 
instruction for a BUG(). As such, for _WARN_FLAGS(), using 
__builtin_unreachable() suffices to achieve optimal code generation 
from the compiler. Objtool would consider subsequent instructions to 
be reachable. For BUG(), we can continue to use unreachable() so that 
objtool can differentiate these from traps used in warnings.


Not sure I understand what you mean.

__WARN_FLAGS() and BUG() both use 'twui' which is unconditionnal trap, 
as such both are the same.


On the other side, WARN_ON() and BUG_ON() use tlbnei which is a 
conditionnel trap.






The following changes to objtool seem to fix the problem, most 
warning are gone with that change.


diff --git a/tools/objtool/elf.c b/tools/objtool/elf.c
index 63218f5799c2..37c0a268b7ea 100644
--- a/tools/objtool/elf.c
+++ b/tools/objtool/elf.c
@@ -77,6 +77,8 @@ static int symbol_by_offset(const void *key, const 
struct rb_node *node)


  if (*o < s->offset)
  return -1;
+    if (*o == s->offset && !s->len)
+    return 0;
  if (*o >= s->offset + s->len)
  return 1;

@@ -400,7 +402,7 @@ static void elf_add_symbol(struct elf *elf, 
struct symbol *sym)

   * Don't store empty STT_NOTYPE symbols in the rbtree.  They
   * can exist within a function, confusing the sorting.
   */
-    if (!sym->len)
+    if (sym->type == STT_NOTYPE && !sym->len)
  rb_erase(&sym->node, &sym->sec->symbol_tree);
  }


Is there a reason to do this, rath

Re: [RFC PATCH v3 11/12] powerpc: Remove unreachable() from WARN_ON()

2022-06-30 Thread Naveen N. Rao

Christophe Leroy wrote:



Le 30/06/2022 à 10:05, Naveen N. Rao a écrit :

Christophe Leroy wrote:

The builtin variant of unreachable (__builtin_unreachable()) works.

How about using that instead of unreachable() ?




In fact the problem comes from the macro annotate_unreachable() which 
is called by unreachable() before calling __build_unreachable().


Seems like this macro adds (after the unconditional trap twui) a call 
to an empty function whose address is listed in section 
.discard.unreachable


 1c78:   00 00 e0 0f twui    r0,0
 1c7c:   55 e7 ff 4b bl  3d0 




RELOCATION RECORDS FOR [.discard.unreachable]:
OFFSET   TYPE  VALUE
 R_PPC64_REL32 .text+0x03d0

The problem is that that function has size 0:

03d0 l F .text     
qdisc_root_sleeping_lock.part.0



And objtool is not prepared for a function with size 0.


annotate_unreachable() seems to have been introduced in commit 
649ea4d5a624f0 ("objtool: Assume unannotated UD2 instructions are dead 
ends").


Objtool considers 'ud2' instruction to be fatal, so BUG() has 
__builtin_unreachable(), rather than unreachable(). See commit 
bfb1a7c91fb775 ("x86/bug: Merge annotate_reachable() into _BUG_FLAGS() 
asm"). For the same reason, __WARN_FLAGS() is annotated with 
_ASM_REACHABLE so that objtool can differentiate warnings from a BUG().


On powerpc, we use trap variants for both and don't have a special 
instruction for a BUG(). As such, for _WARN_FLAGS(), using 
__builtin_unreachable() suffices to achieve optimal code generation from 
the compiler. Objtool would consider subsequent instructions to be 
reachable. For BUG(), we can continue to use unreachable() so that 
objtool can differentiate these from traps used in warnings.


Not sure I understand what you mean.

__WARN_FLAGS() and BUG() both use 'twui' which is unconditionnal trap, 
as such both are the same.


On the other side, WARN_ON() and BUG_ON() use tlbnei which is a 
conditionnel trap.


Objtool classifies 'ud2' as INSN_BUG, and 'int3' as INSN_TRAP. In x86 
BUG(), there is no need for an annotation since objtool assumes that 
'ud2' terminates control flow. But, for __WARN_FLAGS(), since 'ud2' is 
used, an explicit annotate_reachable() is needed. That's _reachable_, to 
indicate that the control flow can continue with the next instruction.


On powerpc, we should (eventually) classify all trap variants as 
INSN_TRAP. Even in the absence of that classification today, objtool 
assumes that control flow continues with the next instruction. With your 
work to utilize asm goto for __WARN_FLAGS(), with no extra instructions 
being generated, I think it is appropriate to just use 
__builtin_unreachable() and to not use the annotation.


In any case, we are only hitting this since gcc is generating a 'bl' due 
to that annotation. We are not yet enabling full objtool validation on 
powerpc, so I think we can revisit this at that point.








The following changes to objtool seem to fix the problem, most warning 
are gone with that change.


diff --git a/tools/objtool/elf.c b/tools/objtool/elf.c
index 63218f5799c2..37c0a268b7ea 100644
--- a/tools/objtool/elf.c
+++ b/tools/objtool/elf.c
@@ -77,6 +77,8 @@ static int symbol_by_offset(const void *key, const 
struct rb_node *node)


  if (*o < s->offset)
  return -1;
+    if (*o == s->offset && !s->len)
+    return 0;
  if (*o >= s->offset + s->len)
  return 1;

@@ -400,7 +402,7 @@ static void elf_add_symbol(struct elf *elf, struct 
symbol *sym)

   * Don't store empty STT_NOTYPE symbols in the rbtree.  They
   * can exist within a function, confusing the sorting.
   */
-    if (!sym->len)
+    if (sym->type == STT_NOTYPE && !sym->len)
  rb_erase(&sym->node, &sym->sec->symbol_tree);
  }


Is there a reason to do this, rather than change __WARN_FLAGS() to use 
__builtin_unreachable()? Or, are you seeing an issue with unreachable() 
elsewhere in the kernel?




At the moment I'm trying to understand what the issue is, and explore 
possible fixes. I guess if we tell objtool that after 'twui' subsequent 
instructions are unreachable, then __builtin_unreachable() is enough.


Yes, see my explanation above. Since no 'bl' is emitted with the 
builtin, objtool won't complain, especially for mcount.




I think we should also understand why annotate_unreachable() gives us a 
so bad result and see if it can be changed to something cleaner than a 
'bl' to an empty function that has no instructions.


Indeed. Not really sure. annotate_unreachable() wants to take the 
address of the instruction after the trap. But, in reality, due to use 
of asm goto for __WARN_FLAGS, no instructions would be generated. I 
wonder if that combination causes such code to be emitted.



- Naveen



Re: [PATCH v5.18] powerpc/ftrace: Remove ftrace init tramp once kernel init is complete

2022-06-30 Thread Greg KH
On Mon, Jun 27, 2022 at 11:09:29PM +0530, Naveen N. Rao wrote:
> commit 84ade0a6655bee803d176525ef457175cbf4df22 upstream.
> 
> Stop using the ftrace trampoline for init section once kernel init is
> complete.
> 
> Fixes: 67361cf8071286 ("powerpc/ftrace: Handle large kernel configs")
> Cc: sta...@vger.kernel.org # v4.20+
> Signed-off-by: Naveen N. Rao 
> Signed-off-by: Michael Ellerman 
> Link: 
> https://lore.kernel.org/r/20220516071422.463738-1-naveen.n@linux.vnet.ibm.com
> ---
>  arch/powerpc/include/asm/ftrace.h  |  4 +++-
>  arch/powerpc/kernel/trace/ftrace.c | 15 ---
>  arch/powerpc/mm/mem.c  |  2 ++
>  3 files changed, 17 insertions(+), 4 deletions(-)

All now queued up, thanks.

greg k-h


[next-20220629] kobject warnings during boot

2022-06-30 Thread Sachin Sant
While booting linux-next (5.19.0-rc4-next-20220629) on Power8 non-virtualised
following koject warning is seen:

[0.000184] clocksource: timebase mult[1f4] shift[24] registered
[0.000303] clockevent: decrementer mult[83126e98] shift[32] cpu[0]
[0.000397] [ cut here ]
[0.000478] kobject: '(null)' ((ptrval)): is not initialized, yet 
kobject_get() is being called.
[0.000667] WARNING: CPU: 0 PID: 0 at lib/kobject.c:626 
kobject_get+0x90/0x100
[0.000802] Modules linked in:
[0.000861] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 
5.19.0-rc4-next-20220629 #1
[0.000995] NIP:  c073b0c0 LR: c073b0bc CTR: c00d33a0
[0.001120] REGS: c2a0f890 TRAP: 0700   Not tainted  
(5.19.0-rc4-next-20220629)
[0.001260] MSR:  92021033   CR: 28008822  
XER: 2000
[0.001428] CFAR: c014f3a4 IRQMASK: 1
   GPR00: c073b0bc c2a0fb30 c2a12000 
005c
   GPR04: 7fff c2a0f8f0 c2a0f8e8 

   GPR08: c2826b78  c2566a50 
c28e6bb8
   GPR12: 8800 c2d0 0003 

   GPR16:   0278 
c2a4dfe0
   GPR20: c2a52238 c2a52820 c00d8d60 
c0fe6e18
   GPR24:  c2a0fe90 c0fe6e08 

   GPR28:  c2952e80 c00c3f10 
c2952e80
[0.002681] NIP [c073b0c0] kobject_get+0x90/0x100
[0.002781] LR [c073b0bc] kobject_get+0x8c/0x100
[0.002880] Call Trace:
[0.002925] [c2a0fb30] [c073b0bc] kobject_get+0x8c/0x100 
(unreliable)
[0.003071] [c2a0fba0] [c087e464] device_add+0xf4/0xb00
[0.003194] [c2a0fc80] [c0a7f6e4] of_device_add+0x64/0x80
[0.003321] [c2a0fcb0] [c0a800d0] 
of_platform_device_create_pdata+0xd0/0x1b0
[0.003476] [c2a0fd00] [c201fa44] 
pnv_get_random_long_early+0x240/0x2e4
[0.003623] [c2a0fe20] [c2060c38] random_init+0xc0/0x214
[0.003749] [c2a0fec0] [c20048bc] start_kernel+0x98c/0xbf4
[0.003878] [c2a0ff90] [c000d978] start_here_common+0x1c/0x24
[0.004008] Instruction dump:
[0.004063] 4e800020 6000 6000 6000 7c0802a6 3c62fe65 7fe5fb78 
3863a8e0
[0.004216] f8010080 e89f 4ba14285 6000 <0fe0> 6000 6000 
6000
[0.004372] ---[ end trace  ]---
[0.004456] [ cut here ]
[0.004537] refcount_t: addition on 0; use-after-free.
[0.004645] WARNING: CPU: 0 PID: 0 at lib/refcount.c:25 
refcount_warn_saturate+0x164/0x1f0
[0.004797] Modules linked in:
[0.004853] CPU: 0 PID: 0 Comm: swapper/0 Tainted: GW 
5.19.0-rc4-next-20220629 #1
[0.005010] NIP:  c06e0a54 LR: c06e0a50 CTR: c00d33a0
[0.005134] REGS: c2a0f830 TRAP: 0700   Tainted: GW  
(5.19.0-rc4-next-20220629)

Reverting the following patch helps avoid this warning.

commit f3eac426657d985b97c92fa5f7ae1d43f04721f3
powerpc/powernv: wire up rng during setup_arch

I have attached the dmesg log.

Thanks
- Sachin



dmesg.log
Description: Binary data


Re: [next-20220629] kobject warnings during boot

2022-06-30 Thread Jason A. Donenfeld
Hi Sachin,

On Thu, Jun 30, 2022 at 05:01:57PM +0530, Sachin Sant wrote:
> [0.000478] kobject: '(null)' ((ptrval)): is not initialized, yet 
> kobject_get() is being called.
> [0.002925] [c2a0fb30] [c073b0bc] kobject_get+0x8c/0x100 
> (unreliable)
> [0.003071] [c2a0fba0] [c087e464] device_add+0xf4/0xb00
> [0.003194] [c2a0fc80] [c0a7f6e4] of_device_add+0x64/0x80
> [0.003321] [c2a0fcb0] [c0a800d0] 
> of_platform_device_create_pdata+0xd0/0x1b0
> [0.003476] [c2a0fd00] [c201fa44] 
> pnv_get_random_long_early+0x240/0x2e4
> [0.003623] [c2a0fe20] [c2060c38] random_init+0xc0/0x214

Huh, that's a surprise, because random_init() is being called long after
all of the arch-specific stuff is initialized. This might point to
something else fishy happening during ppc's init? Or maybe the of device
needs a parent that hasn't been initialized yet. Not sure... But see if
the below patch fixes the issue. I don't have a POWER8 machine to try
this on, but if it works for you, I'll build it into a real patch.

Jason

diff --git a/arch/powerpc/platforms/powernv/rng.c 
b/arch/powerpc/platforms/powernv/rng.c
index 463c78c52cc5..bd5ad5f351c2 100644
--- a/arch/powerpc/platforms/powernv/rng.c
+++ b/arch/powerpc/platforms/powernv/rng.c
@@ -176,12 +176,8 @@ static int __init pnv_get_random_long_early(unsigned long 
*v)
NULL) != pnv_get_random_long_early)
return 0;

-   for_each_compatible_node(dn, NULL, "ibm,power-rng") {
-   if (rng_create(dn))
-   continue;
-   /* Create devices for hwrng driver */
-   of_platform_device_create(dn, NULL, NULL);
-   }
+   for_each_compatible_node(dn, NULL, "ibm,power-rng")
+   rng_create(dn);

if (!ppc_md.get_random_seed)
return 0;
@@ -205,10 +201,16 @@ void __init pnv_rng_init(void)

 static int __init pnv_rng_late_init(void)
 {
+   struct device_node *dn;
unsigned long v;
+
/* In case it wasn't called during init for some other reason. */
if (ppc_md.get_random_seed == pnv_get_random_long_early)
pnv_get_random_long_early(&v);
+   if (ppc_md.get_random_seed == powernv_get_random_long) {
+   for_each_compatible_node(dn, NULL, "ibm,power-rng")
+   of_platform_device_create(dn, NULL, NULL);
+   }
return 0;
 }
 machine_subsys_initcall(powernv, pnv_rng_late_init);


[PATCH] powerpc/powernv: delay rng of node creation until later in boot

2022-06-30 Thread Jason A. Donenfeld
The of node for the rng must be created much later in boot. Otherwise it
tries to connect to a parent that doesn't yet exist, resulting on this
splat:

[0.000478] kobject: '(null)' ((ptrval)): is not initialized, yet 
kobject_get() is being called.
[0.002925] [c2a0fb30] [c073b0bc] kobject_get+0x8c/0x100 
(unreliable)
[0.003071] [c2a0fba0] [c087e464] device_add+0xf4/0xb00
[0.003194] [c2a0fc80] [c0a7f6e4] of_device_add+0x64/0x80
[0.003321] [c2a0fcb0] [c0a800d0] 
of_platform_device_create_pdata+0xd0/0x1b0
[0.003476] [c2a0fd00] [c201fa44] 
pnv_get_random_long_early+0x240/0x2e4
[0.003623] [c2a0fe20] [c2060c38] random_init+0xc0/0x214

This patch fixes the issue by doing the of node creation inside of
machine_subsys_initcall.

Fixes: f3eac426657d ("powerpc/powernv: wire up rng during setup_arch")
Cc: sta...@vger.kernel.org
Cc: Michael Ellerman 
Reported-by: Sachin Sant 
Signed-off-by: Jason A. Donenfeld 
---
 arch/powerpc/platforms/powernv/rng.c | 14 --
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/rng.c 
b/arch/powerpc/platforms/powernv/rng.c
index 463c78c52cc5..bd5ad5f351c2 100644
--- a/arch/powerpc/platforms/powernv/rng.c
+++ b/arch/powerpc/platforms/powernv/rng.c
@@ -176,12 +176,8 @@ static int __init pnv_get_random_long_early(unsigned long 
*v)
NULL) != pnv_get_random_long_early)
return 0;
 
-   for_each_compatible_node(dn, NULL, "ibm,power-rng") {
-   if (rng_create(dn))
-   continue;
-   /* Create devices for hwrng driver */
-   of_platform_device_create(dn, NULL, NULL);
-   }
+   for_each_compatible_node(dn, NULL, "ibm,power-rng")
+   rng_create(dn);
 
if (!ppc_md.get_random_seed)
return 0;
@@ -205,10 +201,16 @@ void __init pnv_rng_init(void)
 
 static int __init pnv_rng_late_init(void)
 {
+   struct device_node *dn;
unsigned long v;
+
/* In case it wasn't called during init for some other reason. */
if (ppc_md.get_random_seed == pnv_get_random_long_early)
pnv_get_random_long_early(&v);
+   if (ppc_md.get_random_seed == powernv_get_random_long) {
+   for_each_compatible_node(dn, NULL, "ibm,power-rng")
+   of_platform_device_create(dn, NULL, NULL);
+   }
return 0;
 }
 machine_subsys_initcall(powernv, pnv_rng_late_init);
-- 
2.35.1



Re: [PATCH v4.9] kexec_file: drop weak attribute from arch_kexec_apply_relocations[_add]

2022-06-30 Thread Greg KH
On Tue, Jun 28, 2022 at 09:12:48PM +0530, Naveen N. Rao wrote:
> commit 3e35142ef99fe6b4fe5d834ad43ee13cca10a2dc upstream.
> 
> Since commit d1bcae833b32f1 ("ELF: Don't generate unused section
> symbols") [1], binutils (v2.36+) started dropping section symbols that
> it thought were unused.  This isn't an issue in general, but with
> kexec_file.c, gcc is placing kexec_arch_apply_relocations[_add] into a
> separate .text.unlikely section and the section symbol ".text.unlikely"
> is being dropped. Due to this, recordmcount is unable to find a non-weak
> symbol in .text.unlikely to generate a relocation record against.
> 
> Address this by dropping the weak attribute from these functions.
> Instead, follow the existing pattern of having architectures #define the
> name of the function they want to override in their headers.
> 
> [1] https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=d1bcae833b32f1
> 
> [a...@linux-foundation.org: arch/s390/include/asm/kexec.h needs 
> linux/module.h]
> Link: 
> https://lkml.kernel.org/r/20220519091237.676736-1-naveen.n@linux.vnet.ibm.com
> Signed-off-by: Michael Ellerman 
> Signed-off-by: Naveen N. Rao 
> Cc: "Eric W. Biederman" 
> Cc: 
> Signed-off-by: Andrew Morton 
> ---
>  arch/x86/include/asm/kexec.h |  7 +++
>  include/linux/kexec.h| 26 ++
>  kernel/kexec_file.c  | 18 --
>  3 files changed, 29 insertions(+), 22 deletions(-)
> 

All now queued up, thanks.

greg k-h


[Bug 215389] pagealloc: memory corruption at building glibc-2.33 and running its' testsuite

2022-06-30 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=215389

--- Comment #32 from Erhard F. (erhar...@mailbox.org) ---
(In reply to Michael Ellerman from comment #30)
> It's a bit of a stab in the dark, but can you try turning preempt off?
> 
> ie. CONFIG_PREEMPT_NONE=y
Looks like your intuition was not bad at all. ;) CONFIG_PREEMPT_NONE=y had no
effect but when I disable SMP at all '# CONFIG_SMP is not set' I get no memory
corruption and also no stack overflow issues.

Also no special treatment with Advanced Options or setting THREAD_SHIFT
manually was necessary. The G4 just does fine, albeit with 1 of it's 2 CPUs
only with disabled SMP.

For testing I did 6 of this glibc testsuite builds in a row without getting
issues. With SMP enabled I get memory corruption or stack overflow at the 1st
build allmost all of the time.

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.

Re: [PATCH] powerpc/powernv: delay rng of node creation until later in boot

2022-06-30 Thread Sachin Sant



> On 30-Jun-2022, at 5:46 PM, Jason A. Donenfeld  wrote:
> 
> The of node for the rng must be created much later in boot. Otherwise it
> tries to connect to a parent that doesn't yet exist, resulting on this
> splat:
> 
> [0.000478] kobject: '(null)' ((ptrval)): is not initialized, yet 
> kobject_get() is being called.
> [0.002925] [c2a0fb30] [c073b0bc] kobject_get+0x8c/0x100 
> (unreliable)
> [0.003071] [c2a0fba0] [c087e464] device_add+0xf4/0xb00
> [0.003194] [c2a0fc80] [c0a7f6e4] of_device_add+0x64/0x80
> [0.003321] [c2a0fcb0] [c0a800d0] 
> of_platform_device_create_pdata+0xd0/0x1b0
> [0.003476] [c2a0fd00] [c201fa44] 
> pnv_get_random_long_early+0x240/0x2e4
> [0.003623] [c2a0fe20] [c2060c38] random_init+0xc0/0x214
> 
> This patch fixes the issue by doing the of node creation inside of
> machine_subsys_initcall.
> 
> Fixes: f3eac426657d ("powerpc/powernv: wire up rng during setup_arch")
> Cc: sta...@vger.kernel.org
> Cc: Michael Ellerman 
> Reported-by: Sachin Sant 
> Signed-off-by: Jason A. Donenfeld 
> ---

Thanks Jason for the patch. This fixes the reported problem for me.

Tested-by: Sachin Sant 

- Sachin



[PATCH v3 1/2] powerpc/powernv: rename remaining rng powernv_ functions to pnv_

2022-06-30 Thread Jason A. Donenfeld
The preferred nomenclature is pnv_, not powernv_, but rng.c used
powernv_ for some reason, which isn't consistent with the rest. A recent
commit added a few pnv_ functions to rng.c, making the file a bit of a
mishmash. This commit just replaces the rest of them.

Cc: Michael Ellerman 
Fixes: f3eac426657 ("powerpc/powernv: wire up rng during setup_arch")
Signed-off-by: Jason A. Donenfeld 
---
 arch/powerpc/include/asm/archrandom.h | 10 +++---
 arch/powerpc/kvm/book3s_hv_builtin.c  |  4 +--
 arch/powerpc/platforms/powernv/rng.c  | 44 +--
 drivers/char/hw_random/powernv-rng.c  |  2 +-
 4 files changed, 30 insertions(+), 30 deletions(-)

diff --git a/arch/powerpc/include/asm/archrandom.h 
b/arch/powerpc/include/asm/archrandom.h
index 9a53e29680f4..11d4815841ab 100644
--- a/arch/powerpc/include/asm/archrandom.h
+++ b/arch/powerpc/include/asm/archrandom.h
@@ -38,12 +38,12 @@ static inline bool __must_check 
arch_get_random_seed_int(unsigned int *v)
 #endif /* CONFIG_ARCH_RANDOM */
 
 #ifdef CONFIG_PPC_POWERNV
-int powernv_hwrng_present(void);
-int powernv_get_random_long(unsigned long *v);
-int powernv_get_random_real_mode(unsigned long *v);
+int pnv_hwrng_present(void);
+int pnv_get_random_long(unsigned long *v);
+int pnv_get_random_real_mode(unsigned long *v);
 #else
-static inline int powernv_hwrng_present(void) { return 0; }
-static inline int powernv_get_random_real_mode(unsigned long *v) { return 0; }
+static inline int pnv_hwrng_present(void) { return 0; }
+static inline int pnv_get_random_real_mode(unsigned long *v) { return 0; }
 #endif
 
 #endif /* _ASM_POWERPC_ARCHRANDOM_H */
diff --git a/arch/powerpc/kvm/book3s_hv_builtin.c 
b/arch/powerpc/kvm/book3s_hv_builtin.c
index 88a8f6473c4e..799d40c2ab4f 100644
--- a/arch/powerpc/kvm/book3s_hv_builtin.c
+++ b/arch/powerpc/kvm/book3s_hv_builtin.c
@@ -176,13 +176,13 @@ EXPORT_SYMBOL_GPL(kvmppc_hcall_impl_hv_realmode);
 
 int kvmppc_hwrng_present(void)
 {
-   return powernv_hwrng_present();
+   return pnv_hwrng_present();
 }
 EXPORT_SYMBOL_GPL(kvmppc_hwrng_present);
 
 long kvmppc_rm_h_random(struct kvm_vcpu *vcpu)
 {
-   if (powernv_get_random_real_mode(&vcpu->arch.regs.gpr[4]))
+   if (pnv_get_random_real_mode(&vcpu->arch.regs.gpr[4]))
return H_SUCCESS;
 
return H_HARDWARE;
diff --git a/arch/powerpc/platforms/powernv/rng.c 
b/arch/powerpc/platforms/powernv/rng.c
index bd5ad5f351c2..386b44660e76 100644
--- a/arch/powerpc/platforms/powernv/rng.c
+++ b/arch/powerpc/platforms/powernv/rng.c
@@ -21,24 +21,24 @@
 
 #define DARN_ERR 0xul
 
-struct powernv_rng {
+struct pnv_rng {
void __iomem *regs;
void __iomem *regs_real;
unsigned long mask;
 };
 
-static DEFINE_PER_CPU(struct powernv_rng *, powernv_rng);
+static DEFINE_PER_CPU(struct pnv_rng *, pnv_rng);
 
-int powernv_hwrng_present(void)
+int pnv_hwrng_present(void)
 {
-   struct powernv_rng *rng;
+   struct pnv_rng *rng;
 
-   rng = get_cpu_var(powernv_rng);
+   rng = get_cpu_var(pnv_rng);
put_cpu_var(rng);
return rng != NULL;
 }
 
-static unsigned long rng_whiten(struct powernv_rng *rng, unsigned long val)
+static unsigned long rng_whiten(struct pnv_rng *rng, unsigned long val)
 {
unsigned long parity;
 
@@ -58,18 +58,18 @@ static unsigned long rng_whiten(struct powernv_rng *rng, 
unsigned long val)
return val;
 }
 
-int powernv_get_random_real_mode(unsigned long *v)
+int pnv_get_random_real_mode(unsigned long *v)
 {
-   struct powernv_rng *rng;
+   struct pnv_rng *rng;
 
-   rng = raw_cpu_read(powernv_rng);
+   rng = raw_cpu_read(pnv_rng);
 
*v = rng_whiten(rng, __raw_rm_readq(rng->regs_real));
 
return 1;
 }
 
-static int powernv_get_random_darn(unsigned long *v)
+static int pnv_get_random_darn(unsigned long *v)
 {
unsigned long val;
 
@@ -93,19 +93,19 @@ static int __init initialise_darn(void)
return -ENODEV;
 
for (i = 0; i < 10; i++) {
-   if (powernv_get_random_darn(&val)) {
-   ppc_md.get_random_seed = powernv_get_random_darn;
+   if (pnv_get_random_darn(&val)) {
+   ppc_md.get_random_seed = pnv_get_random_darn;
return 0;
}
}
return -EIO;
 }
 
-int powernv_get_random_long(unsigned long *v)
+int pnv_get_random_long(unsigned long *v)
 {
-   struct powernv_rng *rng;
+   struct pnv_rng *rng;
 
-   rng = get_cpu_var(powernv_rng);
+   rng = get_cpu_var(pnv_rng);
 
*v = rng_whiten(rng, in_be64(rng->regs));
 
@@ -113,9 +113,9 @@ int powernv_get_random_long(unsigned long *v)
 
return 1;
 }
-EXPORT_SYMBOL_GPL(powernv_get_random_long);
+EXPORT_SYMBOL_GPL(pnv_get_random_long);
 
-static __init void rng_init_per_cpu(struct powernv_rng *rng,
+static __init void rng_init_per_cpu(struct pnv_rng *rng,
struct device_node *dn)
 

[PATCH v3 0/2] powerpc rng cleanups

2022-06-30 Thread Jason A. Donenfeld
These are two small cleanups for -next.

I'm sending this v3 because very likely
https://lore.kernel.org/all/20220630121654.1939181-1-ja...@zx2c4.com/
will land first, in which case this needs a small adjustment.

Jason A. Donenfeld (2):
  powerpc/powernv: rename remaining rng powernv_ functions to pnv_
  powerpc/kvm: don't crash on missing rng, and use darn

 arch/powerpc/include/asm/archrandom.h |  7 +--
 arch/powerpc/kvm/book3s_hv_builtin.c  |  7 +--
 arch/powerpc/platforms/powernv/rng.c  | 65 ++-
 drivers/char/hw_random/powernv-rng.c  |  2 +-
 4 files changed, 30 insertions(+), 51 deletions(-)

-- 
2.35.1



[PATCH v3 2/2] powerpc/kvm: don't crash on missing rng, and use darn

2022-06-30 Thread Jason A. Donenfeld
On POWER8 systems that don't have ibm,power-rng available, a guest that
ignores the KVM_CAP_PPC_HWRNG flag and calls H_RANDOM anyway will
dereference a NULL pointer. And on machines with darn instead of
ibm,power-rng, H_RANDOM won't work at all.

This patch kills two birds with one stone, by routing H_RANDOM calls to
ppc_md.get_random_seed, and doing the real mode check inside of it.

Cc: sta...@vger.kernel.org # v4.1+
Cc: Michael Ellerman 
Fixes: e928e9cb3601 ("KVM: PPC: Book3S HV: Add fast real-mode H_RANDOM 
implementation.")
Signed-off-by: Jason A. Donenfeld 
---
 arch/powerpc/include/asm/archrandom.h |  5 
 arch/powerpc/kvm/book3s_hv_builtin.c  |  7 +++---
 arch/powerpc/platforms/powernv/rng.c  | 33 +++
 3 files changed, 12 insertions(+), 33 deletions(-)

diff --git a/arch/powerpc/include/asm/archrandom.h 
b/arch/powerpc/include/asm/archrandom.h
index 11d4815841ab..3af27bb84a3d 100644
--- a/arch/powerpc/include/asm/archrandom.h
+++ b/arch/powerpc/include/asm/archrandom.h
@@ -38,12 +38,7 @@ static inline bool __must_check 
arch_get_random_seed_int(unsigned int *v)
 #endif /* CONFIG_ARCH_RANDOM */
 
 #ifdef CONFIG_PPC_POWERNV
-int pnv_hwrng_present(void);
 int pnv_get_random_long(unsigned long *v);
-int pnv_get_random_real_mode(unsigned long *v);
-#else
-static inline int pnv_hwrng_present(void) { return 0; }
-static inline int pnv_get_random_real_mode(unsigned long *v) { return 0; }
 #endif
 
 #endif /* _ASM_POWERPC_ARCHRANDOM_H */
diff --git a/arch/powerpc/kvm/book3s_hv_builtin.c 
b/arch/powerpc/kvm/book3s_hv_builtin.c
index 799d40c2ab4f..3abaef5f9ac2 100644
--- a/arch/powerpc/kvm/book3s_hv_builtin.c
+++ b/arch/powerpc/kvm/book3s_hv_builtin.c
@@ -19,7 +19,7 @@
 #include 
 #include 
 #include 
-#include 
+#include 
 #include 
 #include 
 #include 
@@ -176,13 +176,14 @@ EXPORT_SYMBOL_GPL(kvmppc_hcall_impl_hv_realmode);
 
 int kvmppc_hwrng_present(void)
 {
-   return pnv_hwrng_present();
+   return ppc_md.get_random_seed != NULL;
 }
 EXPORT_SYMBOL_GPL(kvmppc_hwrng_present);
 
 long kvmppc_rm_h_random(struct kvm_vcpu *vcpu)
 {
-   if (pnv_get_random_real_mode(&vcpu->arch.regs.gpr[4]))
+   if (ppc_md.get_random_seed &&
+   ppc_md.get_random_seed(&vcpu->arch.regs.gpr[4]))
return H_SUCCESS;
 
return H_HARDWARE;
diff --git a/arch/powerpc/platforms/powernv/rng.c 
b/arch/powerpc/platforms/powernv/rng.c
index 386b44660e76..4a48566528c0 100644
--- a/arch/powerpc/platforms/powernv/rng.c
+++ b/arch/powerpc/platforms/powernv/rng.c
@@ -29,15 +29,6 @@ struct pnv_rng {
 
 static DEFINE_PER_CPU(struct pnv_rng *, pnv_rng);
 
-int pnv_hwrng_present(void)
-{
-   struct pnv_rng *rng;
-
-   rng = get_cpu_var(pnv_rng);
-   put_cpu_var(rng);
-   return rng != NULL;
-}
-
 static unsigned long rng_whiten(struct pnv_rng *rng, unsigned long val)
 {
unsigned long parity;
@@ -58,17 +49,6 @@ static unsigned long rng_whiten(struct pnv_rng *rng, 
unsigned long val)
return val;
 }
 
-int pnv_get_random_real_mode(unsigned long *v)
-{
-   struct pnv_rng *rng;
-
-   rng = raw_cpu_read(pnv_rng);
-
-   *v = rng_whiten(rng, __raw_rm_readq(rng->regs_real));
-
-   return 1;
-}
-
 static int pnv_get_random_darn(unsigned long *v)
 {
unsigned long val;
@@ -105,11 +85,14 @@ int pnv_get_random_long(unsigned long *v)
 {
struct pnv_rng *rng;
 
-   rng = get_cpu_var(pnv_rng);
-
-   *v = rng_whiten(rng, in_be64(rng->regs));
-
-   put_cpu_var(rng);
+   if (mfmsr() & MSR_DR) {
+   rng = raw_cpu_read(pnv_rng);
+   *v = rng_whiten(rng, __raw_rm_readq(rng->regs_real));
+   } else {
+   rng = get_cpu_var(pnv_rng);
+   *v = rng_whiten(rng, in_be64(rng->regs));
+   put_cpu_var(rng);
+   }
 
return 1;
 }
-- 
2.35.1



Re: [RFC PATCH v3 11/12] powerpc: Remove unreachable() from WARN_ON()

2022-06-30 Thread Segher Boessenkool
On Thu, Jun 30, 2022 at 04:07:47PM +0530, Naveen N. Rao wrote:
> Objtool classifies 'ud2' as INSN_BUG, and 'int3' as INSN_TRAP. In x86 
> BUG(), there is no need for an annotation since objtool assumes that 
> 'ud2' terminates control flow. But, for __WARN_FLAGS(), since 'ud2' is 
> used, an explicit annotate_reachable() is needed. That's _reachable_, to 
> indicate that the control flow can continue with the next instruction.
> 
> On powerpc, we should (eventually) classify all trap variants as 
> INSN_TRAP. Even in the absence of that classification today, objtool 
> assumes that control flow continues with the next instruction. With your 
> work to utilize asm goto for __WARN_FLAGS(), with no extra instructions 
> being generated, I think it is appropriate to just use 
> __builtin_unreachable() and to not use the annotation.
> 
> In any case, we are only hitting this since gcc is generating a 'bl' due 
> to that annotation. We are not yet enabling full objtool validation on 
> powerpc, so I think we can revisit this at that point.

See also  that asks for a __builtin_trap()
variant that does not terminate control flow ("that is recoverable").


Segher


Re: [PATCH v3 0/2] powerpc rng cleanups

2022-06-30 Thread Sachin Sant



> On 30-Jun-2022, at 7:31 PM, Jason A. Donenfeld  wrote:
> 
> These are two small cleanups for -next.
> 
> I'm sending this v3 because very likely
> https://lore.kernel.org/all/20220630121654.1939181-1-ja...@zx2c4.com/
> will land first, in which case this needs a small adjustment.
> 
> Jason A. Donenfeld (2):
>  powerpc/powernv: rename remaining rng powernv_ functions to pnv_
>  powerpc/kvm: don't crash on missing rng, and use darn
> 
> arch/powerpc/include/asm/archrandom.h |  7 +--
> arch/powerpc/kvm/book3s_hv_builtin.c  |  7 +--
> arch/powerpc/platforms/powernv/rng.c  | 65 ++-
> drivers/char/hw_random/powernv-rng.c  |  2 +-
> 4 files changed, 30 insertions(+), 51 deletions(-)
> 

I tried these 2 patches + previous one (to fix kobject warning) on
top of 5.19.0-rc4-next-20220630 next on a Power8 server. 

5.19.0-rc4-next-20220630 +
powerpc/powernv: delay rng of node creation until later in boot +
powerpc/powernv: rename remaining rng powernv_ functions to pnv_ +
powerpc/kvm: don't crash on missing rng, and use darn

Unfortunately it fails to boot with following crash

[0.00] ftrace: allocated 13 pages with 3 groups
[0.00] trace event string verifier disabled
[0.00] rcu: Hierarchical RCU implementation.
[0.00] rcu: RCU restricting CPUs from NR_CPUS=2048 to nr_cpu_ids=80.
[0.00]  Rude variant of Tasks RCU enabled.
[0.00] rcu: RCU calculated value of scheduler-enlistment delay is 10 
jiffies.
[0.00] rcu: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=80
[0.00] NR_IRQS: 512, nr_irqs: 512, preallocated irqs: 16
[0.00] ICS OPAL backend registered
[0.00] rcu: srcu_init: Setting srcu_struct sizes based on contention.
[0.01] clocksource: timebase: mask: 0x max_cycles: 
0x761537d007, max_idle_ns: 440795202126 ns
[0.000182] clocksource: timebase mult[1f4] shift[24] registered
[0.001905] BUG: Unable to handle kernel data access on read at 
0x34000
[0.002032] Faulting instruction address: 0xc00d7990
[0.002132] Oops: Kernel access of bad area, sig: 7 [#1]
[0.002226] LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA PowerNV
[0.002338] Modules linked in:
[0.002396] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 
5.19.0-rc4-next-20220630-dirty #20
[0.002539] NIP:  c00d7990 LR: c201fa74 CTR: c00d7960
[0.002663] REGS: c2a0fa60 TRAP: 0300   Not tainted  
(5.19.0-rc4-next-20220630-dirty)
[0.002812] MSR:  92001033   CR: 44000228  
XER: 2000
[0.002979] CFAR: c201fa70 DAR: 00034000 DSISR: 0002 
IRQMASK: 1 
[0.002979] GPR00: c201fa74 c2a0fd00 c2a12000 
c2a0fe90 
[0.002979] GPR04: 0001  c0deb578 
006f 
[0.002979] GPR08:  00034000 c7040080 
 
[0.002979] GPR12: c00d7960 c2d0 0003 
 
[0.002979] GPR16:   0278 
c2a4dfe0 
[0.002979] GPR20: c2a52238 c2a52820 c00d7960 
c0fe6c50 
[0.002979] GPR24: 0001 c2a0fe90 c0fe6c40 
c2141ff8 
[0.002979] GPR28: 0800 c70400a0 c2ab0150 
 
[0.004279] NIP [c00d7990] pnv_get_random_long+0x30/0xd0
[0.004390] LR [c201fa74] pnv_get_random_long_early+0x268/0x2d0
[0.004509] Call Trace:
[0.004553] [c2a0fd00] [c201fa4c] 
pnv_get_random_long_early+0x240/0x2d0 (unreliable)
[0.004718] [c2a0fe20] [c2060d5c] random_init+0xc0/0x214
[0.004844] [c2a0fec0] [c20048c0] start_kernel+0x990/0xbf8
[0.004972] [c2a0ff90] [c000d878] start_here_common+0x1c/0x24
[0.005102] Instruction dump:
[0.005156] 3c4c0294 3842a6a0 7c0802a6 6000 7d2000a6 71290010 41820048 
e94d0030 
[0.005309] 3d22ff73 3929fff8 7d4a482a e92a0008 <7d204eea> e8ea0010 7d2803f4 
7ce94a78 
[0.005465] ---[ end trace  ]---
[0.005545] 
[1.005574] Kernel panic - not syncing: Fatal exception
[1.005671] Rebooting in 10 seconds..

Reverting powerpc/kvm: don't crash on missing rng, and use darn helps to boot
the server successfully.

Thanks
-Sachin



Re: [PATCH v3 0/2] powerpc rng cleanups

2022-06-30 Thread Jason A. Donenfeld
Hi Sachin, Michael,

On Thu, Jun 30, 2022 at 6:12 PM Sachin Sant  wrote:
> > On 30-Jun-2022, at 7:31 PM, Jason A. Donenfeld  wrote:
> >
> > These are two small cleanups for -next.
> >
> > I'm sending this v3 because very likely
> > https://lore.kernel.org/all/20220630121654.1939181-1-ja...@zx2c4.com/
> > will land first, in which case this needs a small adjustment.
> >
> > Jason A. Donenfeld (2):
> >  powerpc/powernv: rename remaining rng powernv_ functions to pnv_
> >  powerpc/kvm: don't crash on missing rng, and use darn
> >
> > arch/powerpc/include/asm/archrandom.h |  7 +--
> > arch/powerpc/kvm/book3s_hv_builtin.c  |  7 +--
> > arch/powerpc/platforms/powernv/rng.c  | 65 ++-
> > drivers/char/hw_random/powernv-rng.c  |  2 +-
> > 4 files changed, 30 insertions(+), 51 deletions(-)
> >
>
> I tried these 2 patches + previous one (to fix kobject warning) on
> top of 5.19.0-rc4-next-20220630 next on a Power8 server.
>
> 5.19.0-rc4-next-20220630 +
> powerpc/powernv: delay rng of node creation until later in boot +
> powerpc/powernv: rename remaining rng powernv_ functions to pnv_ +
> powerpc/kvm: don't crash on missing rng, and use darn
>
> Unfortunately it fails to boot with following crash
>
> [0.00] ftrace: allocated 13 pages with 3 groups
> [0.00] trace event string verifier disabled
> [0.00] rcu: Hierarchical RCU implementation.
> [0.00] rcu: RCU restricting CPUs from NR_CPUS=2048 to 
> nr_cpu_ids=80.
> [0.00]  Rude variant of Tasks RCU enabled.
> [0.00] rcu: RCU calculated value of scheduler-enlistment delay is 10 
> jiffies.
> [0.00] rcu: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=80
> [0.00] NR_IRQS: 512, nr_irqs: 512, preallocated irqs: 16
> [0.00] ICS OPAL backend registered
> [0.00] rcu: srcu_init: Setting srcu_struct sizes based on contention.
> [0.01] clocksource: timebase: mask: 0x max_cycles: 
> 0x761537d007, max_idle_ns: 440795202126 ns
> [0.000182] clocksource: timebase mult[1f4] shift[24] registered
> [0.001905] BUG: Unable to handle kernel data access on read at 
> 0x34000
> [0.002032] Faulting instruction address: 0xc00d7990
> [0.002132] Oops: Kernel access of bad area, sig: 7 [#1]
> [0.002226] LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA PowerNV
> [0.002338] Modules linked in:
> [0.002396] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 
> 5.19.0-rc4-next-20220630-dirty #20
> [0.002539] NIP:  c00d7990 LR: c201fa74 CTR: 
> c00d7960
> [0.002663] REGS: c2a0fa60 TRAP: 0300   Not tainted  
> (5.19.0-rc4-next-20220630-dirty)
> [0.002812] MSR:  92001033   CR: 
> 44000228  XER: 2000
> [0.002979] CFAR: c201fa70 DAR: 00034000 DSISR: 0002 
> IRQMASK: 1
> [0.002979] GPR00: c201fa74 c2a0fd00 c2a12000 
> c2a0fe90
> [0.002979] GPR04: 0001  c0deb578 
> 006f
> [0.002979] GPR08:  00034000 c7040080 
> 
> [0.002979] GPR12: c00d7960 c2d0 0003 
> 
> [0.002979] GPR16:   0278 
> c2a4dfe0
> [0.002979] GPR20: c2a52238 c2a52820 c00d7960 
> c0fe6c50
> [0.002979] GPR24: 0001 c2a0fe90 c0fe6c40 
> c2141ff8
> [0.002979] GPR28: 0800 c70400a0 c2ab0150 
> 
> [0.004279] NIP [c00d7990] pnv_get_random_long+0x30/0xd0
> [0.004390] LR [c201fa74] pnv_get_random_long_early+0x268/0x2d0
> [0.004509] Call Trace:
> [0.004553] [c2a0fd00] [c201fa4c] 
> pnv_get_random_long_early+0x240/0x2d0 (unreliable)
> [0.004718] [c2a0fe20] [c2060d5c] random_init+0xc0/0x214
> [0.004844] [c2a0fec0] [c20048c0] start_kernel+0x990/0xbf8
> [0.004972] [c2a0ff90] [c000d878] 
> start_here_common+0x1c/0x24
> [0.005102] Instruction dump:
> [0.005156] 3c4c0294 3842a6a0 7c0802a6 6000 7d2000a6 71290010 41820048 
> e94d0030
> [0.005309] 3d22ff73 3929fff8 7d4a482a e92a0008 <7d204eea> e8ea0010 
> 7d2803f4 7ce94a78
> [0.005465] ---[ end trace  ]---
> [0.005545]
> [1.005574] Kernel panic - not syncing: Fatal exception
> [1.005671] Rebooting in 10 seconds..
>
> Reverting powerpc/kvm: don't crash on missing rng, and use darn 

WARN at crypto/testmgr.c:5774 (next)

2022-06-30 Thread Sachin Sant
Following warning is seen while booting recent -next kernel on IBM Power server.

[1.544420] [ cut here ]
[1.544422] alg: self-tests for rsa-generic (rsa) failed (rc=-22)
[1.544429] WARNING: CPU: 18 PID: 512 at crypto/testmgr.c:5774 
alg_test+0x42c/0x850
[1.544437] Modules linked in:
[1.51] CPU: 18 PID: 512 Comm: cryptomgr_test Not tainted 
5.19.0-rc4-next-20220627 #2
[1.56] NIP:  c06fa76c LR: c06fa768 CTR: c08552e0
[1.58] REGS: c8a27980 TRAP: 0700   Not tainted  
(5.19.0-rc4-next-20220627)
[1.544451] MSR:  80029033   CR: 28008822  
XER: 20040005
[1.544458] CFAR: c0154114 IRQMASK: 0 
[1.544458] GPR00: c06fa768 c8a27c20 c2a8ff00 
0035 
[1.544458] GPR04: 7fff c8a279e0 c8a279d8 
 
[1.544458] GPR08: 7fff  c25c6ff8 
c2947160 
[1.544458] GPR12: 8000 c009afff3f00 c018c6f8 
c70c5180 
[1.544458] GPR16:    
 
[1.544458] GPR20:    
c0f1c230 
[1.544458] GPR24:  ce679080 0400 
 
[1.544458] GPR28: ce679000 000d c2d814a8 
ffea 
[1.544491] NIP [c06fa76c] alg_test+0x42c/0x850
[1.544495] LR [c06fa768] alg_test+0x428/0x850
[1.544499] Call Trace:
[1.544500] [c8a27c20] [c06fa768] alg_test+0x428/0x850 
(unreliable)
[1.544505] [c8a27d90] [c06f8df0] cryptomgr_test+0x40/0x70
[1.544510] [c8a27dc0] [c018c814] kthread+0x124/0x130
[1.544514] [c8a27e10] [c000cdf4] 
ret_from_kernel_thread+0x5c/0x64
[1.544518] Instruction dump:
[1.544520] 409e02e0 3d22002f 892915d1 2f89 409e02d0 3c62fe77 7f25cb78 
7f84e378 
[1.544526] 7fe6fb78 3863ac78 4ba59949 6000 <0fe0> fa2100f8 fa410100 
fa610108 
[1.544532] ---[ end trace  ]—

Git bisect points to the following patch.

# git bisect bad
f145d411a67efacc0731fc3f9c7b2d89fb62523a is the first bad commit
commit f145d411a67efacc0731fc3f9c7b2d89fb62523a
crypto: rsa - implement Chinese Remainder Theorem for faster private key 
operations

Reverting the patch helps avoid this boot time warning.

- Sachin

[Bug 216190] 5.19-rc4 kernel + KASAN fails to boot at very early stage when CONFIG_SMP=y is selected (PowerMac G4 3,6)

2022-06-30 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=216190

--- Comment #2 from Christophe Leroy (christophe.le...@csgroup.eu) ---
Problem is likely due to commit 4291d085b0b0 ("powerpc/32s: Make pte_update()
non atomic on 603 core")

kasan_early_init() calls __set_pte_at(), which calls pte_update() if
CONFIG_SMP, and pte_update() calls mmu_has_feature() since above commit, but
that's too early for calling mmu_has_feature() so mmu_has_feature() tries to
warn using printk(), but that cannot work because the KASAN shadow is not set.

Can you try with the change below ?

diff --git a/arch/powerpc/mm/kasan/init_32.c b/arch/powerpc/mm/kasan/init_32.c
index f3e4d069e0ba..a70828a6d935 100644
--- a/arch/powerpc/mm/kasan/init_32.c
+++ b/arch/powerpc/mm/kasan/init_32.c
@@ -25,7 +25,7 @@ static void __init kasan_populate_pte(pte_t *ptep, pgprot_t
prot)
int i;

for (i = 0; i < PTRS_PER_PTE; i++, ptep++)
-   __set_pte_at(&init_mm, va, ptep, pfn_pte(PHYS_PFN(pa), prot),
0);
+   __set_pte_at(&init_mm, va, ptep, pfn_pte(PHYS_PFN(pa), prot),
1);
 }

 int __init kasan_init_shadow_page_tables(unsigned long k_start, unsigned long
k_end)

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.

Re: [PATCH v2 3/3] arch/*/: remove CONFIG_VIRT_TO_BUS

2022-06-30 Thread Michael Schmitz

Hi Christoph,

On 29/06/22 18:21, Christoph Hellwig wrote:

On Wed, Jun 29, 2022 at 11:09:00AM +1200, Michael Schmitz wrote:

And all SCSI buffers are allocated using kmalloc? No way at all for user
space to pass unaligned data?

Most that you will see actually comes from the page allocator.  But
the block layer has a dma_alignment limit, and when userspace sends
I/O that is not properly aligned it will be bounce buffered before
it it sent to the driver.


That limit is set to L1_CACHE_BYTES on m68k so we're good here.

Thanks,

    Michael




Re: [PATCH v2 3/3] arch/*/: remove CONFIG_VIRT_TO_BUS

2022-06-30 Thread Michael Schmitz

Hi Christoph,

On 29/06/22 18:25, Christoph Hellwig wrote:

On Wed, Jun 29, 2022 at 09:38:00AM +1200, Michael Schmitz wrote:

That's one of the 'liberties' I alluded to. The reason I left these in is
that I'm none too certain what device feature the DMA API uses to decide a
device isn't cache-coherent.

The DMA API does not look at device features at all.  It needs to be
told so by the platform code.  Once an architecture implements the
hooks to support non-coherent DMA all devices are treated as
non-coherent by default unless overriden by the architecture either
globally (using the global dma_default_coherent variable) or per-device
(using the dev->dma_coherent field, usually set by arch_setup_dma_ops).
Haven't got any of that, so non-coherent DMA is all we can use (even 
though some of the RAM used for bounce buffers may actually be coherent 
due to the page table cache bits).



If it's dev->coherent_dma_mask, the way I set
up the device in the a3000 driver should leave the coherent mask unchanged.
For the Zorro drivers, devices are set up to use the same storage to store
normal and coherent masks - something we most likely want to change. I need
to think about the ramifications of that.

No, the coherent mask is slightly misnamed amd not actually related.


Thanks, that had me confused.

Cheers,

    Michael




[PATCH v4 0/5] tpm: Preserve TPM measurement log across kexec (ppc64)

2022-06-30 Thread Stefan Berger
The of-tree subsystem does not currently preserve the IBM vTPM 1.2 and
vTPM 2.0 measurement logs across a kexec on PowerVM and PowerKVM. This
series fixes this for the kexec_file_load() syscall using the flattened
device tree (fdt) to carry the TPM measurement log's buffer across kexec.

   Stefan

v4:
 - Rebased on 2 patches that would otherwise create merge conflicts;
   posting these patches in this series with several tags removed so
   krobot can test the series already
 - Changes to individual patches documented in patch descripitons

v3:
 - Moved TPM Open Firmware related function to 
drivers/char/tpm/eventlog/tpm_of.c

v2:
 - rearranged patches
 - fixed compilation issues for x86

Palmer Dabbelt (1):
  drivers: of: kexec ima: Support 32-bit platforms

Stefan Berger (3):
  tpm: of: Make of-tree specific function commonly available
  of: kexec: Refactor IMA buffer related functions to make them reusable
  tpm/kexec: Duplicate TPM measurement log in of-tree for kexec

Vaibhav Jain (1):
  of: check previous kernel's ima-kexec-buffer against memory bounds

 drivers/char/tpm/eventlog/of.c |  31 +---
 drivers/of/kexec.c | 328 +
 include/linux/kexec.h  |   6 +
 include/linux/of.h |   8 +-
 include/linux/tpm.h|  27 +++
 kernel/kexec_file.c|   6 +
 6 files changed, 340 insertions(+), 66 deletions(-)


base-commit: 03c765b0e3b4cb5063276b086c76f7a612856a9a
-- 
2.35.1



[PATCH v4 2/5] drivers: of: kexec ima: Support 32-bit platforms

2022-06-30 Thread Stefan Berger
From: Palmer Dabbelt 

RISC-V recently added kexec_file() support, which uses enables kexec
IMA.  We're the first 32-bit platform to support this, so we found a
build bug.

[ Several tags removed; for testing by krobot ]
Signed-off-by: Palmer Dabbelt 
---
 drivers/of/kexec.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/of/kexec.c b/drivers/of/kexec.c
index 91b04b04eec4..c4f9b6655a2e 100644
--- a/drivers/of/kexec.c
+++ b/drivers/of/kexec.c
@@ -253,8 +253,8 @@ static int setup_ima_buffer(const struct kimage *image, 
void *fdt,
if (ret)
return -EINVAL;
 
-   pr_debug("IMA buffer at 0x%llx, size = 0x%zx\n",
-image->ima_buffer_addr, image->ima_buffer_size);
+   pr_debug("IMA buffer at 0x%pa, size = 0x%zx\n",
+&image->ima_buffer_addr, image->ima_buffer_size);
 
return 0;
 }
-- 
2.35.1



[PATCH v4 5/5] tpm/kexec: Duplicate TPM measurement log in of-tree for kexec

2022-06-30 Thread Stefan Berger
The memory area of the TPM measurement log is currently not properly
duplicated for carrying it across kexec when an Open Firmware
Devicetree is used. Therefore, the contents of the log get corrupted.
Fix this for the kexec_file_load() syscall by allocating a buffer and
copying the contents of the existing log into it. The new buffer is
preserved across the kexec and a pointer to it is available when the new
kernel is started. To achieve this, store the allocated buffer's address
in the flattened device tree (fdt) under the name linux,tpm-kexec-buffer
and search for this entry early in the kernel startup before the TPM
subsystem starts up. Adjust the pointer in the of-tree stored under
linux,sml-base to point to this buffer holding the preserved log. The TPM
driver can then read the base address from this entry when making the log
available. Invalidate the log by removing 'linux,sml-base' from the
devicetree if anything goes wrong with updating the buffer.

Use subsys_initcall() to call the function to restore the buffer even if
the TPM subsystem or driver are not used. This allows the buffer to be
carried across the next kexec without involvement of the TPM subsystem
and ensures a valid buffer pointed to by the of-tree.

Use the subsys_initcall(), rather than an ealier initcall, since
page_is_ram() in get_kexec_buffer() only starts working at this stage.

Signed-off-by: Stefan Berger 
Cc: Rob Herring 
Cc: Frank Rowand 
Cc: Eric Biederman 

---
v4:
 - Added #include  due to parisc
 - Use phys_addr_t for physical address rather than void *
 - Remove linux,sml-base if the buffer cannot be updated after a kexec
 - Added __init to functions where possible
---
 drivers/of/kexec.c| 214 +-
 include/linux/kexec.h |   6 ++
 include/linux/of.h|   8 +-
 kernel/kexec_file.c   |   6 ++
 4 files changed, 232 insertions(+), 2 deletions(-)

diff --git a/drivers/of/kexec.c b/drivers/of/kexec.c
index 0710703acfb0..6d1eac0e2e3c 100644
--- a/drivers/of/kexec.c
+++ b/drivers/of/kexec.c
@@ -18,6 +18,8 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 
 #define RNG_SEED_SIZE  128
 
@@ -241,7 +243,6 @@ static void remove_ima_buffer(void *fdt, int chosen_node)
remove_buffer(fdt, chosen_node, "linux,ima-kexec-buffer");
 }
 
-#ifdef CONFIG_IMA_KEXEC
 static int setup_buffer(void *fdt, int chosen_node, const char *name,
phys_addr_t addr, size_t size)
 {
@@ -265,6 +266,7 @@ static int setup_buffer(void *fdt, int chosen_node, const 
char *name,
 
 }
 
+#ifdef CONFIG_IMA_KEXEC
 /**
  * setup_ima_buffer - add IMA buffer information to the fdt
  * @image: kexec image being loaded.
@@ -287,6 +289,213 @@ static inline int setup_ima_buffer(const struct kimage 
*image, void *fdt,
 }
 #endif /* CONFIG_IMA_KEXEC */
 
+/**
+ * tpm_get_kexec_buffer - get TPM log buffer from the previous kernel
+ * @phyaddr:   On successful return, set to physical address of buffer
+ * @size:  On successful return, set to the buffer size.
+ *
+ * Return: 0 on success, negative errno on error.
+ */
+static int __init tpm_get_kexec_buffer(phys_addr_t *phyaddr, size_t *size)
+{
+   unsigned long tmp_addr;
+   size_t tmp_size;
+   int ret;
+
+   ret = get_kexec_buffer("linux,tpm-kexec-buffer", &tmp_addr, &tmp_size);
+   if (ret)
+   return ret;
+
+   *phyaddr = (phys_addr_t)tmp_addr;
+   *size = tmp_size;
+
+   return 0;
+}
+
+/**
+ * tpm_of_remove_kexec_buffer - remove the linux,tpm-kexec-buffer node
+ */
+static int __init tpm_of_remove_kexec_buffer(void)
+{
+   struct property *prop;
+
+   prop = of_find_property(of_chosen, "linux,tpm-kexec-buffer", NULL);
+   if (!prop)
+   return -ENOENT;
+
+   return of_remove_property(of_chosen, prop);
+}
+
+/**
+ * remove_tpm_buffer - remove the TPM log buffer property and reservation from 
@fdt
+ *
+ * @fdt: Flattened Device Tree to update
+ * @chosen_node: Offset to the chosen node in the device tree
+ *
+ * The TPM log measurement buffer is of no use to a subsequent kernel, so we 
always
+ * remove it from the device tree.
+ */
+static void remove_tpm_buffer(void *fdt, int chosen_node)
+{
+   if (!IS_ENABLED(CONFIG_PPC64))
+   return;
+
+   remove_buffer(fdt, chosen_node, "linux,tpm-kexec-buffer");
+}
+
+/**
+ * setup_tpm_buffer - add TPM measurement log buffer information to the fdt
+ * @image: kexec image being loaded.
+ * @fdt:   Flattened device tree for the next kernel.
+ * @chosen_node:   Offset to the chosen node.
+ *
+ * Return: 0 on success, or negative errno on error.
+ */
+static int setup_tpm_buffer(const struct kimage *image, void *fdt,
+   int chosen_node)
+{
+   if (!IS_ENABLED(CONFIG_PPC64))
+   return 0;
+
+   return setup_buffer(fdt, chosen_node, "linux,tpm-kexec-buffer",
+   image->tpm_buffer_addr, image->tpm_

[PATCH v4 3/5] tpm: of: Make of-tree specific function commonly available

2022-06-30 Thread Stefan Berger
Simplify tpm_read_log_of() by moving reusable parts of the code into
an inline function that makes it commonly available so it can be
used also for kexec support. Call the new of_tpm_get_sml_parameters()
function from the TPM Open Firmware driver.

Signed-off-by: Stefan Berger 
Cc: Jarkko Sakkinen 
Cc: Jason Gunthorpe 
Cc: Rob Herring 
Cc: Frank Rowand 

---
v4:
 - converted to inline function
---
 drivers/char/tpm/eventlog/of.c | 31 +--
 include/linux/tpm.h| 27 +++
 2 files changed, 32 insertions(+), 26 deletions(-)

diff --git a/drivers/char/tpm/eventlog/of.c b/drivers/char/tpm/eventlog/of.c
index a9ce66d09a75..f9462d19632e 100644
--- a/drivers/char/tpm/eventlog/of.c
+++ b/drivers/char/tpm/eventlog/of.c
@@ -12,6 +12,7 @@
 
 #include 
 #include 
+#include 
 #include 
 
 #include "../tpm.h"
@@ -20,11 +21,10 @@
 int tpm_read_log_of(struct tpm_chip *chip)
 {
struct device_node *np;
-   const u32 *sizep;
-   const u64 *basep;
struct tpm_bios_log *log;
u32 size;
u64 base;
+   int ret;
 
log = &chip->log;
if (chip->dev.parent && chip->dev.parent->of_node)
@@ -35,30 +35,9 @@ int tpm_read_log_of(struct tpm_chip *chip)
if (of_property_read_bool(np, "powered-while-suspended"))
chip->flags |= TPM_CHIP_FLAG_ALWAYS_POWERED;
 
-   sizep = of_get_property(np, "linux,sml-size", NULL);
-   basep = of_get_property(np, "linux,sml-base", NULL);
-   if (sizep == NULL && basep == NULL)
-   return -ENODEV;
-   if (sizep == NULL || basep == NULL)
-   return -EIO;
-
-   /*
-* For both vtpm/tpm, firmware has log addr and log size in big
-* endian format. But in case of vtpm, there is a method called
-* sml-handover which is run during kernel init even before
-* device tree is setup. This sml-handover function takes care
-* of endianness and writes to sml-base and sml-size in little
-* endian format. For this reason, vtpm doesn't need conversion
-* but physical tpm needs the conversion.
-*/
-   if (of_property_match_string(np, "compatible", "IBM,vtpm") < 0 &&
-   of_property_match_string(np, "compatible", "IBM,vtpm20") < 0) {
-   size = be32_to_cpup((__force __be32 *)sizep);
-   base = be64_to_cpup((__force __be64 *)basep);
-   } else {
-   size = *sizep;
-   base = *basep;
-   }
+   ret = of_tpm_get_sml_parameters(np, &base, &size);
+   if (ret < 0)
+   return ret;
 
if (size == 0) {
dev_warn(&chip->dev, "%s: Event log area empty\n", __func__);
diff --git a/include/linux/tpm.h b/include/linux/tpm.h
index dfeb25a0362d..b3dff255bc58 100644
--- a/include/linux/tpm.h
+++ b/include/linux/tpm.h
@@ -460,4 +460,31 @@ static inline struct tpm_chip *tpm_default_chip(void)
return NULL;
 }
 #endif
+
+#ifdef CONFIG_OF
+static inline int of_tpm_get_sml_parameters(struct device_node *np,
+   u64 *base, u32 *size)
+{
+   const u32 *sizep;
+   const u64 *basep;
+
+   sizep = of_get_property(np, "linux,sml-size", NULL);
+   basep = of_get_property(np, "linux,sml-base", NULL);
+   if (sizep == NULL && basep == NULL)
+   return -ENODEV;
+   if (sizep == NULL || basep == NULL)
+   return -EIO;
+
+   if (of_property_match_string(np, "compatible", "IBM,vtpm") < 0 &&
+   of_property_match_string(np, "compatible", "IBM,vtpm20") < 0) {
+   *size = be32_to_cpup((__force __be32 *)sizep);
+   *base = be64_to_cpup((__force __be64 *)basep);
+   } else {
+   *size = *sizep;
+   *base = *basep;
+   }
+   return 0;
+}
+#endif
+
 #endif
-- 
2.35.1



[PATCH v4 1/5] of: check previous kernel's ima-kexec-buffer against memory bounds

2022-06-30 Thread Stefan Berger
From: Vaibhav Jain 

Presently ima_get_kexec_buffer() doesn't check if the previous kernel's
ima-kexec-buffer lies outside the addressable memory range. This can result
in a kernel panic if the new kernel is booted with 'mem=X' arg and the
ima-kexec-buffer was allocated beyond that range by the previous kernel.
The panic is usually of the form below:

$ sudo kexec --initrd initrd vmlinux --append='mem=16G'


 BUG: Unable to handle kernel data access on read at 0xc000c01fff7f
 Faulting instruction address: 0xc0837974
 Oops: Kernel access of bad area, sig: 11 [#1]

 NIP [c0837974] ima_restore_measurement_list+0x94/0x6c0
 LR [c083b55c] ima_load_kexec_buffer+0xac/0x160
 Call Trace:
 [c371fa80] [c083b55c] ima_load_kexec_buffer+0xac/0x160
 [c371fb00] [c20512c4] ima_init+0x80/0x108
 [c371fb70] [c20514dc] init_ima+0x4c/0x120
 [c371fbf0] [c0012240] do_one_initcall+0x60/0x2c0
 [c371fcc0] [c2004ad0] kernel_init_freeable+0x344/0x3ec
 [c371fda0] [c00128a4] kernel_init+0x34/0x1b0
 [c371fe10] [c000ce64] ret_from_kernel_thread+0x5c/0x64
 Instruction dump:
 f92100b8 f92100c0 90e10090 910100a0 4182050c 282a0017 3bc0 40810330
 7c0802a6 fb610198 7c9b2378 f80101d0  2c090001 40820614 e9240010
 ---[ end trace  ]---

Fix this issue by checking returned PFN range of previous kernel's
ima-kexec-buffer with page_is_ram() to ensure correct memory bounds.

Fixes: 467d27824920 ("powerpc: ima: get the kexec buffer passed by the previous 
kernel")
[ Several tags removed; for testing by krobot ]
Signed-off-by: Vaibhav Jain 
Link: https://lore.kernel.org/r/20220531041446.3334259-1-vaib...@linux.ibm.com
---
 drivers/of/kexec.c | 17 +
 1 file changed, 17 insertions(+)

diff --git a/drivers/of/kexec.c b/drivers/of/kexec.c
index 8d374cc552be..91b04b04eec4 100644
--- a/drivers/of/kexec.c
+++ b/drivers/of/kexec.c
@@ -126,6 +126,7 @@ int ima_get_kexec_buffer(void **addr, size_t *size)
 {
int ret, len;
unsigned long tmp_addr;
+   unsigned long start_pfn, end_pfn;
size_t tmp_size;
const void *prop;
 
@@ -140,6 +141,22 @@ int ima_get_kexec_buffer(void **addr, size_t *size)
if (ret)
return ret;
 
+   /* Do some sanity on the returned size for the ima-kexec buffer */
+   if (!tmp_size)
+   return -ENOENT;
+
+   /*
+* Calculate the PFNs for the buffer and ensure
+* they are with in addressable memory.
+*/
+   start_pfn = PHYS_PFN(tmp_addr);
+   end_pfn = PHYS_PFN(tmp_addr + tmp_size - 1);
+   if (!page_is_ram(start_pfn) || !page_is_ram(end_pfn)) {
+   pr_warn("IMA buffer at 0x%lx, size = 0x%zx beyond memory\n",
+   tmp_addr, tmp_size);
+   return -EINVAL;
+   }
+
*addr = __va(tmp_addr);
*size = tmp_size;
 
-- 
2.35.1



[PATCH v4 4/5] of: kexec: Refactor IMA buffer related functions to make them reusable

2022-06-30 Thread Stefan Berger
Refactor IMA buffer related functions to make them reusable for carrying
TPM logs across kexec.

Signed-off-by: Stefan Berger 
Cc: Rob Herring 
Cc: Frank Rowand 
Cc: Mimi Zohar 

---
v4:
 - Move debug output into setup_buffer()
---
 drivers/of/kexec.c | 131 ++---
 1 file changed, 76 insertions(+), 55 deletions(-)

diff --git a/drivers/of/kexec.c b/drivers/of/kexec.c
index c4f9b6655a2e..0710703acfb0 100644
--- a/drivers/of/kexec.c
+++ b/drivers/of/kexec.c
@@ -115,48 +115,59 @@ static int do_get_kexec_buffer(const void *prop, int len, 
unsigned long *addr,
return 0;
 }
 
-/**
- * ima_get_kexec_buffer - get IMA buffer from the previous kernel
- * @addr:  On successful return, set to point to the buffer contents.
- * @size:  On successful return, set to the buffer size.
- *
- * Return: 0 on success, negative errno on error.
- */
-int ima_get_kexec_buffer(void **addr, size_t *size)
+static int get_kexec_buffer(const char *name, unsigned long *addr, size_t 
*size)
 {
int ret, len;
-   unsigned long tmp_addr;
unsigned long start_pfn, end_pfn;
-   size_t tmp_size;
const void *prop;
 
-   if (!IS_ENABLED(CONFIG_HAVE_IMA_KEXEC))
-   return -ENOTSUPP;
-
-   prop = of_get_property(of_chosen, "linux,ima-kexec-buffer", &len);
+   prop = of_get_property(of_chosen, name, &len);
if (!prop)
return -ENOENT;
 
-   ret = do_get_kexec_buffer(prop, len, &tmp_addr, &tmp_size);
+   ret = do_get_kexec_buffer(prop, len, addr, size);
if (ret)
return ret;
 
-   /* Do some sanity on the returned size for the ima-kexec buffer */
-   if (!tmp_size)
+   /* Do some sanity on the returned size for the kexec buffer */
+   if (!*size)
return -ENOENT;
 
/*
 * Calculate the PFNs for the buffer and ensure
 * they are with in addressable memory.
 */
-   start_pfn = PHYS_PFN(tmp_addr);
-   end_pfn = PHYS_PFN(tmp_addr + tmp_size - 1);
+   start_pfn = PHYS_PFN(*addr);
+   end_pfn = PHYS_PFN(*addr + *size - 1);
if (!page_is_ram(start_pfn) || !page_is_ram(end_pfn)) {
-   pr_warn("IMA buffer at 0x%lx, size = 0x%zx beyond memory\n",
-   tmp_addr, tmp_size);
+   pr_warn("%s buffer at 0x%lx, size = 0x%zx beyond memory\n",
+   name, *addr, *size);
return -EINVAL;
}
 
+   return 0;
+}
+
+/**
+ * ima_get_kexec_buffer - get IMA buffer from the previous kernel
+ * @addr:  On successful return, set to point to the buffer contents.
+ * @size:  On successful return, set to the buffer size.
+ *
+ * Return: 0 on success, negative errno on error.
+ */
+int ima_get_kexec_buffer(void **addr, size_t *size)
+{
+   int ret;
+   unsigned long tmp_addr;
+   size_t tmp_size;
+
+   if (!IS_ENABLED(CONFIG_HAVE_IMA_KEXEC))
+   return -ENOTSUPP;
+
+   ret = get_kexec_buffer("linux,ima-kexec-buffer", &tmp_addr, &tmp_size);
+   if (ret)
+   return ret;
+
*addr = __va(tmp_addr);
*size = tmp_size;
 
@@ -191,72 +202,82 @@ int ima_free_kexec_buffer(void)
return memblock_phys_free(addr, size);
 }
 
-/**
- * remove_ima_buffer - remove the IMA buffer property and reservation from @fdt
- *
- * @fdt: Flattened Device Tree to update
- * @chosen_node: Offset to the chosen node in the device tree
- *
- * The IMA measurement buffer is of no use to a subsequent kernel, so we always
- * remove it from the device tree.
- */
-static void remove_ima_buffer(void *fdt, int chosen_node)
+static int remove_buffer(void *fdt, int chosen_node, const char *name)
 {
int ret, len;
unsigned long addr;
size_t size;
const void *prop;
 
-   if (!IS_ENABLED(CONFIG_HAVE_IMA_KEXEC))
-   return;
-
-   prop = fdt_getprop(fdt, chosen_node, "linux,ima-kexec-buffer", &len);
+   prop = fdt_getprop(fdt, chosen_node, name, &len);
if (!prop)
-   return;
+   return -ENOENT;
 
ret = do_get_kexec_buffer(prop, len, &addr, &size);
-   fdt_delprop(fdt, chosen_node, "linux,ima-kexec-buffer");
+   fdt_delprop(fdt, chosen_node, name);
if (ret)
-   return;
+   return ret;
 
ret = fdt_find_and_del_mem_rsv(fdt, addr, size);
if (!ret)
-   pr_debug("Removed old IMA buffer reservation.\n");
+   pr_debug("Remove old %s buffer reserveration", name);
+   return ret;
 }
 
-#ifdef CONFIG_IMA_KEXEC
 /**
- * setup_ima_buffer - add IMA buffer information to the fdt
- * @image: kexec image being loaded.
- * @fdt:   Flattened device tree for the next kernel.
- * @chosen_node:   Offset to the chosen node.
+ * remove_ima_buffer - remove the IMA buffer property and reservation from @fdt
  *
- * Return: 0 on success, or negativ

Re: [PATCH] freescale/fs_enet:fix repeated words in comments

2022-06-30 Thread patchwork-bot+netdevbpf
Hello:

This patch was applied to netdev/net-next.git (master)
by Jakub Kicinski :

On Wed, 29 Jun 2022 20:54:41 +0800 you wrote:
> Delete the redundant word 'a'.
> 
> Signed-off-by: Jilin Yuan 
> ---
>  drivers/net/ethernet/freescale/fs_enet/fs_enet.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Here is the summary with links:
  - freescale/fs_enet:fix repeated words in comments
https://git.kernel.org/netdev/net-next/c/b1769b6eb06b

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html




Re: [PATCH 5.15 00/28] 5.15.52-rc1 review

2022-06-30 Thread Bagas Sanjaya
On Thu, Jun 30, 2022 at 03:46:56PM +0200, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 5.15.52 release.
> There are 28 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
> 

Successfully cross-compiled for arm64 (bcm2711_defconfig, GCC 12.1.0)
and powerpc (ps3_defconfig, GCC 12.1.0).

I get two warnings on powerpc build (vdso related):

  VDSO32L arch/powerpc/kernel/vdso32/vdso32.so.dbg
/lib/gcc/powerpc64-unknown-linux-gnu/12.1.0/../../../../powerpc64-unknown-linux-gnu/bin/ld:
 warning: cannot find entry symbol 
nable-trivial-auto-var-init-zero-knowing-it-will-be-removed-from-clang; not 
setting start address
  VDSO64L arch/powerpc/kernel/vdso64/vdso64.so.dbg
/lib/gcc/powerpc64-unknown-linux-gnu/12.1.0/../../../../powerpc64-unknown-linux-gnu/bin/ld:
 warning: cannot find entry symbol 
nable-trivial-auto-var-init-zero-knowing-it-will-be-removed-from-clang; not 
setting start address

Tested-by: Bagas Sanjaya 

-- 
An old man doll... just what I always wanted! - Clara


Re: [PATCH v4 1/2] PCI/AER: Disable AER service when link is in L2/L3 ready, L2 and L3 state

2022-06-30 Thread Kai-Heng Feng
On Sat, Apr 23, 2022 at 6:26 AM Bjorn Helgaas  wrote:
>
> [+cc Rafael, linux-pm; sorry forgot this last time]
>
> On Fri, Apr 22, 2022 at 05:24:36PM -0500, Bjorn Helgaas wrote:
> > On Fri, Apr 08, 2022 at 11:31:58PM +0800, Kai-Heng Feng wrote:
> > > On Intel Alder Lake platforms, Thunderbolt entering D3cold can cause
> > > some errors reported by AER:
> > > [   30.100211] pcieport :00:1d.0: AER: Uncorrected (Non-Fatal) error 
> > > received: :00:1d.0
> > > [   30.100251] pcieport :00:1d.0: PCIe Bus Error: 
> > > severity=Uncorrected (Non-Fatal), type=Transaction Layer, (Requester ID)
> > > [   30.100256] pcieport :00:1d.0:   device [8086:7ab0] error 
> > > status/mask=0010/4000
> > > [   30.100262] pcieport :00:1d.0:[20] UnsupReq   
> > > (First)
> > > [   30.100267] pcieport :00:1d.0: AER:   TLP Header: 3400 
> > > 0852  
> > > [   30.100372] thunderbolt :0a:00.0: AER: can't recover (no 
> > > error_detected callback)
> > > [   30.100401] xhci_hcd :3e:00.0: AER: can't recover (no 
> > > error_detected callback)
> > > [   30.100427] pcieport :00:1d.0: AER: device recovery failed
> > >
> > > So disable AER service to avoid the noises from turning power rails
> > > on/off when the device is in low power states (D3hot and D3cold), as
> > > PCIe Base Spec 5.0, section 5.2 "Link State Power Management" states
> > > that TLP and DLLP transmission is disabled for a Link in L2/L3 Ready
> > > (D3hot), L2 (D3cold with aux power) and L3 (D3cold).
> >
> > Help me walk through what's happening here, because I'm never very
> > confident about how error reporting works.  I *think* the Unsupported
> > Request error means some request was in progress and was not
> > completed.  I don't think a link going down should by itself cause
> > an Unsupported Request error because there's no *request*.
> >
> > I have a theory about what happened here.  Decoding the TLP Header
> > (from PCIe r6.0, sec 2.2.1.1, 2.2.8.10) gives:
> >
> >   3400 (0011 0100 ...):
> > Fmt   0014 DW header, no data
> > Type   1 0100Msg, Local - Terminate at Receiver
> >
> >   0852 (0800 ... 0101 0010)
> > Requester ID 080000:08.0
> > Message Code 0101 0010   PTM Request

Is there any TLP decoder software available? That will be really
helpful for debugging.

> >
> > From your lspci in bugzilla, 08:00 has PTM enabled.  So my theory is
> > that:
> >
> >   - 08:00.0 sent a PTM Request Message (a Posted Request)
> >   - 00:1d.0 received the PTM Request Message
> >   - The link transitioned to DL_Down
> >   - Per sec 2.9.1, 00:1d.0 discarded the Request and reported an
> > Unsupported Request
> >   - Or, per sec 6.21.3, if 00:1d.0 received a PTM Request when its
> > own PTM Enable was clear, it would also be treated as an
> > Unsupported Request
> >
> > So I suspect we should disable PTM on 08:00.0 before putting it in a
> > low-power state.  If you manually disable PTM on 08:00.0, do these
> > errors stop happening?

Yes, disabling PTM on upstream port can solve the issue.
Thanks for find the root cause!

> >
> > David did something like this [1], but just for Root Ports.  That
> > looks wrong to me because sec 6.21.3 says we should not have PTM
> > enabled in an Upstream Port (i.e., in a downstream device like
> > 08:00.0) unless it is already enabled in the Downstream Port (i.e., in
> > the Root Port 00:1d.0).

So I think it should be like this?
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index cfaf40a540a82..8ba8a0e12946e 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -2717,7 +2717,8 @@ int pci_prepare_to_sleep(struct pci_dev *dev)
 * port to enter a lower-power PM state and the SoC to reach a
 * lower-power idle state as a whole.
 */
-   if (pci_pcie_type(dev) == PCI_EXP_TYPE_ROOT_PORT)
+   if (pci_pcie_type(dev) == PCI_EXP_TYPE_ROOT_PORT ||
+   pci_pcie_type(dev) == PCI_EXP_TYPE_UPSTREAM)
pci_disable_ptm(dev);

pci_enable_wake(dev, target_state, wakeup);
@@ -2775,7 +2776,8 @@ int pci_finish_runtime_suspend(struct pci_dev *dev)
 * port to enter a lower-power PM state and the SoC to reach a
 * lower-power idle state as a whole.
 */
-   if (pci_pcie_type(dev) == PCI_EXP_TYPE_ROOT_PORT)
+   if (pci_pcie_type(dev) == PCI_EXP_TYPE_ROOT_PORT ||
+   pci_pcie_type(dev) == PCI_EXP_TYPE_UPSTREAM)
pci_disable_ptm(dev);

__pci_enable_wake(dev, target_state, pci_dev_run_wake(dev));


> >
> > Nit: can you remove the timestamps from the log?  They add clutter but
> > no useful information.

Sure.

> >
> > [1] https://git.kernel.org/linus/a697f072f5da
> >
> > > Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=215453
> > > Reviewed-by: Mika Westerberg 
> > > Signed-off-by: Kai-Heng Feng 
> > > ---
> > > v4:
> > >  - Explicitly states the spec version.
> > 

[PATCH] powerpc/32s: Fix boot failure with KASAN + SMP + JUMP_LABEL_FEATURE_CHECK_DEBUG

2022-06-30 Thread Christophe Leroy
Since commit 4291d085b0b0 ("powerpc/32s: Make pte_update() non
atomic on 603 core"), pte_update() has been using
mmu_has_feature(MMU_FTR_HPTE_TABLE) to avoid a useless atomic
operation on 603 cores.

When kasan_early_init() sets up the early zero shadow, it uses
__set_pte_at(). On book3s/32, __set_pte_at() calls pte_update()
when CONFIG_SMP is selected in order to ensure the preservation of
_PAGE_HASHPTE in case of concurrent update of the PTE. But that's
too early for mmu_has_feature(), so when
CONFIG_JUMP_LABEL_FEATURE_CHECK_DEBUG is selected, mmu_has_feature()
calls printk(). That's too early to call printk() because KASAN
early zero shadow page is not set up yet. It leads to a deadlock.

However, when kasan_early_init() is called, there is only one CPU
running and no risk of concurrent PTE update. So __set_pte_at() can
be called with the 'percpu' flag. With that flag set, the PTE is
written directly instead of being written via pte_update().

Reported-by: Erhard Furtner 
Fixes: 4291d085b0b0 ("powerpc/32s: Make pte_update() non atomic on 603 core")
Signed-off-by: Christophe Leroy 
---
 arch/powerpc/mm/kasan/init_32.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/mm/kasan/init_32.c b/arch/powerpc/mm/kasan/init_32.c
index f3e4d069e0ba..a70828a6d935 100644
--- a/arch/powerpc/mm/kasan/init_32.c
+++ b/arch/powerpc/mm/kasan/init_32.c
@@ -25,7 +25,7 @@ static void __init kasan_populate_pte(pte_t *ptep, pgprot_t 
prot)
int i;
 
for (i = 0; i < PTRS_PER_PTE; i++, ptep++)
-   __set_pte_at(&init_mm, va, ptep, pfn_pte(PHYS_PFN(pa), prot), 
0);
+   __set_pte_at(&init_mm, va, ptep, pfn_pte(PHYS_PFN(pa), prot), 
1);
 }
 
 int __init kasan_init_shadow_page_tables(unsigned long k_start, unsigned long 
k_end)
-- 
2.36.1



[PATCH] powerpc/32s: Cleanup the mess in __set_pte_at()

2022-06-30 Thread Christophe Leroy
__set_pte_at() handles 3 main cases with #ifdefs plus the 'percpu'
subcase which leads to code duplication.

Rewrite the function using IS_ENABLED() to minimise the total number
of cases and remove duplicated code.

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/include/asm/book3s/32/pgtable.h | 77 
 1 file changed, 31 insertions(+), 46 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/32/pgtable.h 
b/arch/powerpc/include/asm/book3s/32/pgtable.h
index 40041ac713d9..2a0ca1f9a1ff 100644
--- a/arch/powerpc/include/asm/book3s/32/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/32/pgtable.h
@@ -534,58 +534,43 @@ static inline pte_t pte_modify(pte_t pte, pgprot_t 
newprot)
 
 
 /* This low level function performs the actual PTE insertion
- * Setting the PTE depends on the MMU type and other factors. It's
- * an horrible mess that I'm not going to try to clean up now but
- * I'm keeping it in one place rather than spread around
+ * Setting the PTE depends on the MMU type and other factors.
+ *
+ * First case is 32-bit Hash MMU in SMP mode with 32-bit PTEs. We use the
+ * helper pte_update() which does an atomic update. We need to do that
+ * because a concurrent invalidation can clear _PAGE_HASHPTE. If it's a
+ * per-CPU PTE such as a kmap_atomic, we do a simple update preserving
+ * the hash bits instead (ie, same as the non-SMP case)
+ *
+ * Second case is 32-bit with 64-bit PTE.  In this case, we
+ * can just store as long as we do the two halves in the right order
+ * with a barrier in between. This is possible because we take care,
+ * in the hash code, to pre-invalidate if the PTE was already hashed,
+ * which synchronizes us with any concurrent invalidation.
+ * In the percpu case, we also fallback to the simple update preserving
+ * the hash bits
+ *
+ * Third case is 32-bit hash table in UP mode, we need to preserve
+ * the _PAGE_HASHPTE bit since we may not have invalidated the previous
+ * translation in the hash yet (done in a subsequent flush_tlb_xxx())
+ * and see we need to keep track that this PTE needs invalidating
  */
 static inline void __set_pte_at(struct mm_struct *mm, unsigned long addr,
pte_t *ptep, pte_t pte, int percpu)
 {
-#if defined(CONFIG_SMP) && !defined(CONFIG_PTE_64BIT)
-   /* First case is 32-bit Hash MMU in SMP mode with 32-bit PTEs. We use 
the
-* helper pte_update() which does an atomic update. We need to do that
-* because a concurrent invalidation can clear _PAGE_HASHPTE. If it's a
-* per-CPU PTE such as a kmap_atomic, we do a simple update preserving
-* the hash bits instead (ie, same as the non-SMP case)
-*/
-   if (percpu)
-   *ptep = __pte((pte_val(*ptep) & _PAGE_HASHPTE)
- | (pte_val(pte) & ~_PAGE_HASHPTE));
-   else
-   pte_update(mm, addr, ptep, ~_PAGE_HASHPTE, pte_val(pte), 0);
+   if ((!IS_ENABLED(CONFIG_SMP) && !IS_ENABLED(CONFIG_PTE_64BIT)) || 
percpu) {
+   *ptep = __pte((pte_val(*ptep) & _PAGE_HASHPTE) |
+ (pte_val(pte) & ~_PAGE_HASHPTE));
+   } else if (IS_ENABLED(CONFIG_PTE_64BIT)) {
+   if (pte_val(*ptep) & _PAGE_HASHPTE)
+   flush_hash_entry(mm, ptep, addr);
 
-#elif defined(CONFIG_PTE_64BIT)
-   /* Second case is 32-bit with 64-bit PTE.  In this case, we
-* can just store as long as we do the two halves in the right order
-* with a barrier in between. This is possible because we take care,
-* in the hash code, to pre-invalidate if the PTE was already hashed,
-* which synchronizes us with any concurrent invalidation.
-* In the percpu case, we also fallback to the simple update preserving
-* the hash bits
-*/
-   if (percpu) {
-   *ptep = __pte((pte_val(*ptep) & _PAGE_HASHPTE)
- | (pte_val(pte) & ~_PAGE_HASHPTE));
-   return;
+   asm volatile("stw%X0 %2,%0; eieio; stw%X1 %L2,%1" :
+"=m" (*ptep), "=m" (*((unsigned char *)ptep+4)) :
+"r" (pte) : "memory");
+   } else {
+   pte_update(mm, addr, ptep, ~_PAGE_HASHPTE, pte_val(pte), 0);
}
-   if (pte_val(*ptep) & _PAGE_HASHPTE)
-   flush_hash_entry(mm, ptep, addr);
-   __asm__ __volatile__("\
-   stw%X0 %2,%0\n\
-   eieio\n\
-   stw%X1 %L2,%1"
-   : "=m" (*ptep), "=m" (*((unsigned char *)ptep+4))
-   : "r" (pte) : "memory");
-
-#else
-   /* Third case is 32-bit hash table in UP mode, we need to preserve
-* the _PAGE_HASHPTE bit since we may not have invalidated the previous
-* translation in the hash yet (done in a subsequent flush_tlb_xxx())
-* and see we need to keep track that this PTE needs invalidating
-*/
-   *ptep = __pte((pte_val(*ptep) & _PAGE_HASHPTE)
-   

Re: [RFC PATCH v3 11/12] powerpc: Remove unreachable() from WARN_ON()

2022-06-30 Thread Sathvika Vasireddy

Hi Chen,

Thanks for pitching in and providing your inputs :-)

On 01/07/22 07:43, Chen Zhongjin wrote:

Hi everyone,

Hope I'm not too late for this discussion.

I'm not familiar with ppc so it spent me some time to reproduce this. 
But at last I didn't make it.


What I did:

1 checkout to tip/objtool/core

2 apply this patch

3 recover the unreachable() after WARN_ENTRY(..

4 compile on defconfig/allyesconfig

If anyone can point out which file will generate this problem it will 
be perfect.


To reproduce this problem, you may want to apply this patch series on 
top of objtool/core branch of the tip tree, and compile on 
ppc64le_defconfig.


There are a couple of C files that are generating these warnings. One 
such file is: arch/powerpc/kvm/../../../virt/kvm/kvm_main.o which gives
*arch/powerpc/kvm/../../../virt/kvm/kvm_main.o: warning: objtool: 
kvm_mmu_notifier_release+0x6c: unannotated intra-function call* warning.


With unreachable() in __WARN_FLAGS(), we get unannotated intra-function 
call warnings, but without unreachable() like in patch 11/12 or with 
just the builtin variant of unreachable (__builtin_unreachable()) 
instead of unreachable(), we do not get those warnings.



- Sathvika