Re: Reply: Reply: loop nesting in alignment exception and machine check

2019-11-26 Thread Christophe Leroy

Hi,

On 01/11/2019 at 02:57, Wangshaobo (bobo) wrote:

> Hi, Christophe
>
> I am sorry for the slow reply; we ran into some unpredictable problems
> while replaying the issue.
>
> I also want to ask: does this phenomenon (having to use memcpy_toio() when
> copying to an ioremapped address) occur only on powerpc? Does any other
> arch have the same problem? We are trying to find out why this
> phenomenon happens. Our Linux kernel version is 4.4.


It's not a problem ... it's a feature.

I have no idea whether the same kind of issue can happen on other 
arches, sorry.


Christophe
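
For readers following the memcpy() versus memcpy_toio() point: as the
original report describes, the powerpc memcpy() issues dcbz on the
destination, and dcbz on cache-inhibited storage (which an ioremap()
mapping typically is) raises an alignment exception. memcpy_toio() is
meant for such mappings and sticks to ordinary stores. The userspace
sketch below only illustrates that idea. toio_sketch() is a hypothetical
name, not the kernel implementation, and it assumes byte-wide stores are
acceptable for the target.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical userspace stand-in for memcpy_toio().
 * Every store goes through a volatile pointer, so the compiler has to
 * emit one plain store per byte and cannot substitute a call to an
 * optimized memcpy() that might use cache-block instructions.
 */
void toio_sketch(volatile void *dst, const void *src, size_t n)
{
    volatile uint8_t *d = dst;
    const uint8_t *s = src;

    assert(dst != NULL && src != NULL);

    while (n > 0) {
        *d++ = *s++;   /* plain byte store, no cache-line zeroing */
        n--;
    }
}
```

In kernel code the corresponding change is to call
memcpy_toio(ioremap_addr, src, size) where the report used
memcpy(ioremap_addr, src, size).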



> Thanks very much.



Reply: Reply: loop nesting in alignment exception and machine check

2019-10-31 Thread Wangshaobo (bobo)
Hi, Christophe

I am sorry for the slow reply; we ran into some unpredictable problems
while replaying the issue.

I also want to ask: does this phenomenon (having to use memcpy_toio() when
copying to an ioremapped address) occur only on powerpc? Does any other
arch have the same problem? We are trying to find out why this
phenomenon happens. Our Linux kernel version is 4.4.

Thanks very much.

-Original Message-
From: Christophe Leroy [mailto:christophe.le...@c-s.fr]
Sent: 2019-10-31 19:13
To: Wangshaobo (bobo)
Cc: chengjian (D); Libin (Huawei); Xiexiuqi; zhangyi (F)
Subject: Re: Reply: loop nesting in alignment exception and machine check

Hi,

Did you try? Does it work?

Christophe

On 28/10/2019 at 06:57, Wangshaobo (bobo) wrote:
> Hi, Christophe
> 
> Thank you for your quick reply. I will try to use memcpy_toio() instead of 
> memcpy().
> 
> -Original Message-
> From: Christophe Leroy [mailto:christophe.le...@c-s.fr]
> Sent: 2019-10-26 19:20
> To: Wangshaobo (bobo)
> Cc: linux-a...@vger.kernel.org; alist...@popple.id.au; chengjian (D);
> Xiexiuqi; linux-ker...@vger.kernel.org; o...@buserror.net; pau...@samba.org;
> Libin (Huawei); ag...@denx.de; linuxppc-dev@lists.ozlabs.org
> Subject: Re: loop nesting in alignment exception and machine check
> 
> Hi,
> 
> On 26/10/2019 at 09:23, Wangshaobo (bobo) wrote:
>> Hi,
>>
>> I ran into a problem in which an alignment exception raised inside the
>> machine check handler leads to nested, looping exceptions. The trigger
>> background is:
>>
>> problem:
>>
>> machine check or critical interrupt ->…->kbox_write [for recording
>> last words] -> memcpy(ioremap_addr, src, size):_GLOBAL(memcpy)…
>>
>> When we enter memcpy, the instruction ‘dcbz r11,r6’ causes an alignment
>> exception. In this situation r11 holds the ioremap address, which
>> leads to the alignment exception,
> 
> You can't use memcpy() on anything other than memory.
> 
> For an ioremapped area, you have to use memcpy_toio().
> 
> Christophe
> 
>>
>> then the instruction cannot be processed successfully, as we are still in
>> machine check context. In the end it triggers a new machine check inside
>> the interrupt handler, and the nested loop begins.
>>
>> analysis:
>>
>> We have analysed this a lot but still cannot come up with a reasonable
>> explanation. Normally, an alignment exception triggered in machine check
>> context can still be recorded in the kbox once the alignment exception
>> has been handled by its handler function. So how can a machine check be
>> triggered inside the handler function at all? We printed the relevant
>> registers on first entry to the machine check handler and to the
>> alignment exception handler
>> function:
>>
>>    MSR:0x2  MSR:0x0
>>
>>    SRR1:0x2  SRR1:0x21002
>>
>>    But the manual says SRR1 should be set to MSR (0x2). Why did
>> that happen?
>>
>>    Then a branch in the handler function copies SRR1 to
>> MSR, which enables MSR[ME] and MSR[CE], and the system collapses.
>>
>> Conclusion:
>>
>>    1)  Why can the alignment exception not be handled in
>> machine check context?
>>
>>    2)  Besides memcpy, can any other function cause the
>> alignment exception?
>>
>> We can still reproduce it, along the following path:
>>
>>    CPU deadlock->watch log->trigger
>> fiq->kbox_write->memcpy->alignment exception->print last words.
>>
>>    But for problems like the one below, what the kbox printed is empty.
>>
>> --kbox restart:[   10.147594]
>>
>> kbox verify fs magic fail
>>
>> kbox mem mabye destroyed, format it
>>
>> kbox: load OK
>>
>> lock-task: major[249] minor[0]
>>
>> -start show_destroyed_kbox_mem_head
>>
>> :      
>>
>> 0010:      
>>
>> 0020:      
>>
>> 0030:      
>>
>> 0040:      
>>
>> 0050:      
>>
>> 0060:      
>>
>> 0070:      
>>
>> 0080:      
>>
>> 0090:      
>>
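
The SRR1 value 0x21002 reported above can be split into individual MSR
bits. The sketch below assumes the Book E style bit layout used by the
Linux powerpc headers (MSR_CE = 0x20000, MSR_ME = 0x1000, MSR_RI = 0x2).
The thread does not name the core, so the masks are an assumption, and
the helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed Book E MSR masks (the thread does not state the core type). */
#define MSR_CE 0x00020000u /* critical interrupt enable */
#define MSR_ME 0x00001000u /* machine check enable */
#define MSR_RI 0x00000002u /* recoverable interrupt */

/* Under these masks the reported SRR1 value is exactly CE | ME | RI. */
static_assert((MSR_CE | MSR_ME | MSR_RI) == 0x21002u,
              "0x21002 decodes to CE | ME | RI");

/* Bits that become set when SRR1 is copied into the MSR. */
uint32_t msr_bits_gained(uint32_t msr, uint32_t srr1)
{
    return srr1 & ~msr;
}

/* Nonzero if restoring this SRR1 re-enables machine check or
 * critical interrupts. */
int reenables_mc_or_ce(uint32_t srr1)
{
    return (srr1 & (MSR_ME | MSR_CE)) != 0;
}
```

With the reported values, msr_bits_gained(0x0, 0x21002) returns 0x21002,
so copying SRR1 into an MSR of 0x0 turns on all three bits, which matches
the observation that MSR[ME] and MSR[CE] become enabled inside the handler.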