On Wed, Oct 1, 2014 at 5:20 PM, Gilles Chanteperdrix
<[email protected]> wrote:
> On 10/01/2014 11:12 AM, GP Orcullo wrote:
>> On Oct 1, 2014 3:54 PM, "Gilles Chanteperdrix" <
>> [email protected]> wrote:
>>>
>>> On 10/01/2014 01:32 AM, GP Orcullo wrote:
>>>> On Sep 30, 2014 8:16 PM, "Gilles Chanteperdrix" <
>>>> [email protected]> wrote:
>>>>>
>>>>> On 09/30/2014 02:04 PM, GP Orcullo wrote:
>>>>>> On Sep 30, 2014 7:30 PM, "Gilles Chanteperdrix" <
>>>>>> [email protected]> wrote:
>>>>>>>
>>>>>>> On 09/30/2014 07:31 AM, GP Orcullo wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> Running the switchtest for extended periods (>10 mins) causes the
>>>>>>>> machine to lock up.
>>>>>>>>
>>>>>>>> I'm running a modified xeno-regression-test which contains only the
>>>>>>>> following tests:
>>>>>>>>
>>>>>>>> check_alive /usr/lib/xenomai/testsuite/switchtest
>>>>>>>> check_alive /usr/lib/xenomai/testsuite/switchtest -s 1000
>>>>>>>> check_alive /usr/lib/xenomai/testsuite/latency ${1+"$@"}
>>>>>>>>
>>>>>>>> The script is invoked with the following arguments:
>>>>>>>>
>>>>>>>> nohup sudo ./xeno-regression-test -l
>>>>>>>> "/usr/lib/xenomai/testsuite/dohell -m /media/work 36000" -t 2 >
>>>>>>>> /dev/null & top -d0.5
>>>>>>>>
>>>>>>>> The kernel dumps the OOPS information intermittently so it's difficult
>>>>>>>> to diagnose the issue.
>>>>>>>>
>>>>>>>> Attached is the kernel config and the logfile.
>>>>>>>
>>>>>>> Ok, this is an exynos. Sorry, but I have never seen the patch for
>>>>>>> exynos, so I do not know what is inside. You should direct your
>>>>>>> questions to whoever provided you with this support.
>>>>>>
>>>>>> I'm in the process of porting xenomai to run on exynos.
>>>>>>
>>>>>> The ipipe-core-3.8.13-arm-3.patch applies cleanly to the 3.8.13.11 kernel
>>>>>> used by the odroid U3 board.
>>>>>>
>>>>>> Attached is the ipipe patch that I've made.
>>>>>>
>>>>>> I was just wondering what would cause switchtest to fail. The error that I
>>>>>> can see is that the system is running out of memory and I don't know
>>>>>> exactly what is causing this.
>>>>>
>>>>> Certainly not switchtest as it does not do any memory allocation.
>>>>> However, the dohell script has a loop creating a large file and removing
>>>>> it. So, could you try and run the dohell script with an unpatched kernel
>>>>> and see if you have the error?
>>>>>
>>>>
>>>> Running dohell on a patched and unpatched kernel doesn't trigger the lockup.
>>>>
>>>> Running switchtest without dohell works OK.
>>>
>>> Is the problem a lockup, or an OOM?
>>>
>>
>> It's a lockup.
>>
>> The OOM message is the only one that I've captured so far.  Most of the
>> time the kernel doesn't spew any messages before the lockup.
>>
>> The lockups are reproducible, but the error messages are not.
>
> Are you running the tests on the serial console, or with ssh? Do you
> have unlocked context switch enabled? Have you tried enabling some debug
> options?
>

I'm using the serial console to log the kernel messages and ssh to run
the command. Using only the serial console gives the same results.

Is this the unlocked context switch option: "CONFIG_XENO_HW_UNLOCKED_SWITCH=y"?
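
For what it's worth, a quick way to confirm how such options are set is to scan the kernel config text. The snippet below is only a rough sketch: the helper name `config_values` and the sample config are made up for illustration, and only the two option names come from this thread.

```python
import re

def config_values(config_text, options):
    """Return {option: value} for each requested option, None if not set."""
    found = {}
    for line in config_text.splitlines():
        # Enabled options look like CONFIG_FOO=y; disabled ones appear as
        # "# CONFIG_FOO is not set" comments and are skipped here.
        m = re.match(r"^(CONFIG_\w+)=(.*)$", line.strip())
        if m:
            found[m.group(1)] = m.group(2)
    return {opt: found.get(opt) for opt in options}

# Hypothetical sample, not taken from the attached kernel config.
SAMPLE_CONFIG = """\
CONFIG_XENO_HW_UNLOCKED_SWITCH=y
# CONFIG_CPU_IDLE is not set
"""

result = config_values(
    SAMPLE_CONFIG,
    ["CONFIG_XENO_HW_UNLOCKED_SWITCH", "CONFIG_CPU_IDLE"],
)
for name, value in result.items():
    print(name, value if value is not None else "not set")
```

In practice the text would be read from the .config file used for the build rather than from an inline string.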

I will try playing again with the debug options and see if I can get
something useful.

> Also note that xeno-regression-test puts the system under a lot of
> stress, so it may happen that there is no output for some time (several
> minutes). Normally the test should stop by itself if there is no output
> for something like 30 minutes. So, I would recommend not redirecting
> xeno-test output to see if there is any error before the lockup, and
> when you see the lockup, leave the system for 30 minutes to see if it
> does not restart or if xeno-regression-test can exit gracefully.
>

This is a total lockup. There's a heartbeat LED that stops blinking when it occurs.

Attached is one error log that I had captured previously; this one
had CONFIG_CPU_IDLE enabled. I've lost track of which kernel this
trace came from, but maybe the error looks familiar.

> --
>                                                                 Gilles.

-- 
GP Orcullo
-------------- next part --------------
[ 4619.775000] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
[ 4619.775000] [<c0014a94>] (unwind_backtrace+0x0/0xf8) from [<c02fd710>] (panic+0x8c/0x1e4)
[ 4619.775000] [<c02fd710>] (panic+0x8c/0x1e4) from [<c0028fa8>] (do_exit+0x770/0x828)
[ 4619.775000] [<c0028fa8>] (do_exit+0x770/0x828) from [<c00291ac>] (do_group_exit+0x3c/0xb0)
[ 4619.775000] [<c00291ac>] (do_group_exit+0x3c/0xb0) from [<c0032f00>] (get_signal_to_deliver+0x1c4/0x530)
[ 4619.775000] [<c0032f00>] (get_signal_to_deliver+0x1c4/0x530) from [<c001125c>] (do_signal+0x7c/0x480)
[ 4619.775000] [<c001125c>] (do_signal+0x7c/0x480) from [<c0011afc>] (do_work_pending+0x68/0xa8)
[ 4619.775000] [<c0011afc>] (do_work_pending+0x68/0xa8) from [<c000e340>] (work_pending+0xc/0x20)
[ 4619.775000] CPU1: stopping
[ 4619.775000] [<c0014a94>] (unwind_backtrace+0x0/0xf8) from [<c0013438>] (handle_IPI+0x120/0x14c)
[ 4619.775000] [<c0013438>] (handle_IPI+0x120/0x14c) from [<c000855c>] (gic_handle_irq+0x60/0x68)
[ 4619.775000] [<c000855c>] (gic_handle_irq+0x60/0x68) from [<c000df00>] (__irq_svc+0x40/0x70)
[ 4619.775000] Exception stack(0xc2113d88 to 0xc2113dd0)
[ 4619.775000] 3d80:                   e6c60d10 0001ffff 00000001 00797474 e4089008 00797474
[ 4619.775000] 3da0: 00008eb2 e6c60d10 c2113e58 e6b8c025 00000003 c2113f00 ff555a5a c2113dd0
[ 4619.775000] 3dc0: c00d43c8 c00d43f0 60000053 ffffffff
[ 4619.775000] [<c000df00>] (__irq_svc+0x40/0x70) from [<c00d43f0>] (__d_lookup+0x64/0x178)
[ 4619.775000] [<c00d43f0>] (__d_lookup+0x64/0x178) from [<c00c8800>] (lookup_fast+0x140/0x27c)
[ 4619.775000] [<c00c8800>] (lookup_fast+0x140/0x27c) from [<c00c97e8>] (link_path_walk+0x178/0x8a0)
[ 4619.775000] [<c00c97e8>] (link_path_walk+0x178/0x8a0) from [<c00ca64c>] (path_lookupat+0x54/0x774)
[ 4619.775000] [<c00ca64c>] (path_lookupat+0x54/0x774) from [<c00cad8c>] (filename_lookup+0x20/0x60)
[ 4619.775000] [<c00cad8c>] (filename_lookup+0x20/0x60) from [<c00ccda8>] (user_path_at_empty+0x50/0x7c)
[ 4619.775000] [<c00ccda8>] (user_path_at_empty+0x50/0x7c) from [<c00ccde8>] (user_path_at+0x14/0x1c)
[ 4619.775000] [<c00ccde8>] (user_path_at+0x14/0x1c) from [<c00defec>] (sys_lgetxattr+0x30/0x80)
[ 4619.775000] [<c00defec>] (sys_lgetxattr+0x30/0x80) from [<c000e300>] (ret_fast_syscall+0x0/0x30)
[ 4619.775000] CPU3: stopping
[ 4619.775000] [<c0014a94>] (unwind_backtrace+0x0/0xf8) from [<c0013438>] (handle_IPI+0x120/0x14c)
[ 4619.775000] [<c0013438>] (handle_IPI+0x120/0x14c) from [<c000855c>] (gic_handle_irq+0x60/0x68)
[ 4619.775000] [<c000855c>] (gic_handle_irq+0x60/0x68) from [<c000df00>] (__irq_svc+0x40/0x70)
[ 4619.775000] Exception stack(0xe6777ee8 to 0xe6777f30)
[ 4619.775000] 7ee0:                   00000001 a0000053 c00c9244 00000000 e7001480 c23ed000
[ 4619.775000] 7f00: 426905c3 c23e8000 000000d0 c23ed000 426905c7 000201f0 010b3000 e6777f30
[ 4619.775000] 7f20: c00c9244 c00ba378 20000053 ffffffff
[ 4619.775000] [<c000df00>] (__irq_svc+0x40/0x70) from [<c00ba378>] (kmem_cache_alloc+0x6c/0xe4)
[ 4619.775000] [<c00ba378>] (kmem_cache_alloc+0x6c/0xe4) from [<c00c9244>] (getname_flags+0x20/0x118)
[ 4619.775000] [<c00c9244>] (getname_flags+0x20/0x118) from [<c00bf2d4>] (do_sys_open+0xb4/0x174)
[ 4619.775000] [<c00bf2d4>] (do_sys_open+0xb4/0x174) from [<c000e300>] (ret_fast_syscall+0x0/0x30)
[ 4619.775000] CPU2: stopping
[ 4619.775000] [<c0014a94>] (unwind_backtrace+0x0/0xf8) from [<c0013438>] (handle_IPI+0x120/0x14c)
[ 4619.775000] [<c0013438>] (handle_IPI+0x120/0x14c) from [<c000855c>] (gic_handle_irq+0x60/0x68)
[ 4619.775000] [<c000855c>] (gic_handle_irq+0x60/0x68) from [<c000df00>] (__irq_svc+0x40/0x70)
[ 4619.775000] Exception stack(0xe706ff40 to 0xe706ff88)
[ 4619.775000] ff40: e706ff88 3b9aca00 a3d560f0 00000433 a3a76c30 00000433 c14d83f0 00000000
[ 4619.775000] ff60: c045a380 413fc090 c0454f90 00000000 00000018 e706ff88 c0057900 c0255b64
[ 4619.775000] ff80: 60000053 ffffffff
[ 4619.775000] [<c000df00>] (__irq_svc+0x40/0x70) from [<c0255b64>] (cpuidle_wrap_enter+0x48/0x94)
[ 4619.775000] [<c0255b64>] (cpuidle_wrap_enter+0x48/0x94) from [<c025586c>] (cpuidle_enter_state+0x14/0x68)
[ 4619.775000] [<c025586c>] (cpuidle_enter_state+0x14/0x68) from [<c0255954>] (cpuidle_idle_call+0x94/0x100)
[ 4619.775000] [<c0255954>] (cpuidle_idle_call+0x94/0x100) from [<c000f5e0>] (cpu_idle+0x90/0xec)
[ 4619.775000] [<c000f5e0>] (cpu_idle+0x90/0xec) from [<402fa928>] (0x402fa928)
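
A trace captured from a serial console like the one above is easier to read once wrapped lines are rejoined and the frames are grouped per CPU. The following is a rough sketch of one way to do that, not a tool from the Xenomai tree: the helper names `rejoin` and `frames_by_cpu` are invented here, and the sample is a short excerpt of the CPU2 part of the log.

```python
import re

# Kernel timestamp prefix such as "[ 4619.775000] ".
STAMP = re.compile(r"^\[\s*\d+\.\d+\]\s?")

def rejoin(lines):
    """Merge wrapped continuation lines (no timestamp) into the previous record."""
    records = []
    for raw in lines:
        text = raw.rstrip()
        if not text.strip():
            continue  # skip whitespace-only padding lines
        if STAMP.match(text) or not records:
            records.append(STAMP.sub("", text, count=1))
        else:
            records[-1] = records[-1].rstrip() + " " + text.strip()
    return records

def frames_by_cpu(records):
    """Group the first function of each backtrace record by 'CPUn: stopping' section."""
    groups = {}
    current = "panic"  # frames before any 'CPUn: stopping' belong to the panicking CPU
    for rec in records:
        cpu = re.match(r"^(CPU\d+): stopping", rec)
        if cpu:
            current = cpu.group(1)
            continue
        frame = re.match(r"^\[<[0-9a-f]+>\] \((\w+)\+", rec)
        if frame:
            groups.setdefault(current, []).append(frame.group(1))
    return groups

SAMPLE = [
    "[ 4619.775000] CPU2: stopping   ",
    "[ 4619.775000] [<c000df00>] (__irq_svc+0x40/0x70) from [<c0255b64>] ",
    "(cpuidle_wrap_enter+0x48/0x94)   ",
    "[ 4619.775000] [<c0255b64>] (cpuidle_wrap_enter+0x48/0x94) from [<c025586c>] ",
    "(cpuidle_enter_state+0x14/0x68)  ",
]

groups = frames_by_cpu(rejoin(SAMPLE))
print(groups)
```

Grouping this way makes it quicker to see which CPU raised the panic and what the other CPUs were doing when they were stopped by the IPI.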