On Wed, Oct 1, 2014 at 5:20 PM, Gilles Chanteperdrix
<[email protected]> wrote:
> On 10/01/2014 11:12 AM, GP Orcullo wrote:
>> On Oct 1, 2014 3:54 PM, "Gilles Chanteperdrix" <
>> [email protected]> wrote:
>>>
>>> On 10/01/2014 01:32 AM, GP Orcullo wrote:
>>>> On Sep 30, 2014 8:16 PM, "Gilles Chanteperdrix" <
>>>> [email protected]> wrote:
>>>>>
>>>>> On 09/30/2014 02:04 PM, GP Orcullo wrote:
>>>>>> On Sep 30, 2014 7:30 PM, "Gilles Chanteperdrix" <
>>>>>> [email protected]> wrote:
>>>>>>>
>>>>>>> On 09/30/2014 07:31 AM, GP Orcullo wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> Running switchtest for extended periods (>10 mins) causes the
>>>>>>>> machine to lock up.
>>>>>>>>
>>>>>>>> I'm running a modified xeno-regression-test which contains only the
>>>>>>>> following tests:
>>>>>>>>
>>>>>>>> check_alive /usr/lib/xenomai/testsuite/switchtest
>>>>>>>> check_alive /usr/lib/xenomai/testsuite/switchtest -s 1000
>>>>>>>> check_alive /usr/lib/xenomai/testsuite/latency ${1+"$@"}
>>>>>>>>
>>>>>>>> The script is invoked with the following arguments:
>>>>>>>>
>>>>>>>> nohup sudo ./xeno-regression-test -l
>>>>>>>> "/usr/lib/xenomai/testsuite/dohell -m /media/work 36000" -t 2 >
>>>>>>>> /dev/null & top -d0.5
>>>>>>>>
>>>>>>>> The kernel dumps the OOPS information intermittently, so it's
>>>>>>>> difficult to diagnose the issue.
>>>>>>>>
>>>>>>>> Attached is the kernel config and the logfile.
>>>>>>>
>>>>>>> Ok, this is an exynos. Sorry, but I have never seen the patch for
>>>>>>> exynos, so I do not know what is inside. You should direct your
>>>>>>> questions to whoever provided you with this support.
>>>>>>
>>>>>> I'm in the process of porting xenomai to run on exynos.
>>>>>>
>>>>>> The ipipe-core-3.8.13-arm-3.patch applies cleanly to the 3.8.13.11
>>>>>> kernel used by the odroid U3 board.
>>>>>>
>>>>>> Attached is the ipipe patch that I've made.
>>>>>>
>>>>>> I was just wondering what would cause switchtest to fail. The error
>>>>>> that I can see is that the system is running out of memory, and I
>>>>>> don't know exactly what is causing this.
>>>>>
>>>>> Certainly not switchtest as it does not do any memory allocation.
>>>>> However, the dohell script has a loop creating a large file and
>>>>> removing it. So, could you try and run the dohell script with an
>>>>> unpatched kernel and see if you have the error?
>>>>>
>>>>
>>>> Running dohell on a patched and unpatched kernel doesn't trigger the
>>>> lockup.
>>>>
>>>> Running switchtest without dohell works OK.
>>>
>>> Is the problem a lockup, or an OOM?
>>>
>>
>> It's a lockup.
>>
>> The OOM message is the only one that I've captured so far. Most of the
>> time the kernel doesn't spew any messages before the lockup.
>>
>> The lockups are repeatable but generating any error messages isn't.
>
> Are you running the tests on the serial console, or with ssh? Do you
> have unlocked context switch enabled? Have you tried enabling some debug
> options?
>
I'm using the serial console to log the kernel messages and ssh to run
the command. Running everything from the serial console alone gives the
same result.
Is this the unlocked context switch option? "CONFIG_XENO_HW_UNLOCKED_SWITCH=y"
I will try the debug options again and see if I can get something
useful.
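
For reference, a minimal sketch of how the state of these options could be checked in a kernel config file. The helper name check_kconfig is hypothetical, and the two debug option names in the example (CONFIG_XENO_OPT_DEBUG, CONFIG_IPIPE_DEBUG) are assumptions that should be verified against the patched tree; only CONFIG_XENO_HW_UNLOCKED_SWITCH comes from this thread.

```shell
# Report whether selected options are set in a kernel config file.
# Usage: check_kconfig <config-file> <OPTION>...
# (check_kconfig is a hypothetical helper, not part of Xenomai.)
check_kconfig() {
    cfg=$1; shift
    for opt in "$@"; do
        # A set option appears as "OPTION=value"; an unset one is either
        # absent or listed as "# OPTION is not set".
        if line=$(grep -E "^${opt}=" "$cfg"); then
            echo "$line"
        else
            echo "$opt is not set"
        fi
    done
}

# Example (the file name and the two debug option names are assumptions):
# check_kconfig .config CONFIG_XENO_HW_UNLOCKED_SWITCH \
#     CONFIG_XENO_OPT_DEBUG CONFIG_IPIPE_DEBUG
```

The function only reads the file it is given, so it can be pointed at the build tree's .config or at a copy of the config taken from the board.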
> Also note that xeno-regression-test puts the system under a lot of
> stress, so it may happen that there is no output for some time (several
> minutes), normally the test should stop by itself if there is no output
> for something like 30 minutes. So, I would recommend not redirecting
> xeno-test output to see if there is any error before the lockup, and
> when you see the lockup, leave the system for 30 minutes to see if it
> does not restart or if xeno-regression-test can exit gracefully.
>
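
A minimal sketch of running the test without discarding its output, as suggested above: the output stays visible on the terminal and a copy is kept in a file. The helper name run_logged and the log file name are hypothetical; the xeno-regression-test arguments in the example are the ones used earlier in this thread. Note that a file on the target may not be flushed to disk when the machine locks up, so the terminal output is what matters most.

```shell
# Run a command with stdout and stderr shown on the terminal and
# duplicated into a log file.
# Usage: run_logged <log-file> <command> [args...]
# (run_logged is a hypothetical helper, not part of the Xenomai testsuite.)
run_logged() {
    log=$1; shift
    # 2>&1 merges stderr into stdout so that tee captures both.
    "$@" 2>&1 | tee "$log"
}

# Example (log file name is an assumption):
# run_logged xeno-test.log sudo ./xeno-regression-test \
#     -l "/usr/lib/xenomai/testsuite/dohell -m /media/work 36000" -t 2
```

The exit status of run_logged is that of tee, not of the command being run, which is acceptable here because the goal is only to see the last messages before a lockup.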
This is a total lockup. There's a heartbeat LED that dies when it occurs.
Attached is one error log that I captured previously, with
CONFIG_CPU_IDLE enabled. I've lost track of which kernel this trace came
from, but maybe the error looks familiar.
> --
> Gilles.
--
GP Orcullo
-------------- next part --------------
[ 4619.775000] Kernel panic - not syncing: Attempted to kill init!
exitcode=0x0000000b
[ 4619.775000] PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
[ 4619.775000] [<c0014a94>] (unwind_backtrace+0x0/0xf8) from [<c02fd710>]
(panic+0x8c/0x1e4)
[ 4619.775000] [<c02fd710>] (panic+0x8c/0x1e4) from [<c0028fa8>]
(do_exit+0x770/0x828)
[ 4619.775000] [<c0028fa8>] (do_exit+0x770/0x828) from [<c00291ac>]
(do_group_exit+0x3c/0xb0)
[ 4619.775000] [<c00291ac>] (do_group_exit+0x3c/0xb0) from [<c0032f00>]
(get_signal_to_deliver+0x1c4/0x530)
[ 4619.775000] [<c0032f00>] (get_signal_to_deliver+0x1c4/0x530) from
[<c001125c>] (do_signal+0x7c/0x480)
[ 4619.775000] [<c001125c>] (do_signal+0x7c/0x480) from [<c0011afc>]
(do_work_pending+0x68/0xa8)
[ 4619.775000] [<c0011afc>] (do_work_pending+0x68/0xa8) from [<c000e340>]
(work_pending+0xc/0x20)
[ 4619.775000] CPU1: stopping 0 0 S 0.0 0.0 0:00.00 kthreadd
[ 4619.775000] [<c0014a94>] (unwind_backtrace+0x0/0xf8) from [<c0013438>]
(handle_IPI+0x120/0x14c)
[ 4619.775000] [<c0013438>] (handle_IPI+0x120/0x14c) from [<c000855c>]
(gic_handle_irq+0x60/0x68)
[ 4619.775000] [<c000855c>] (gic_handle_irq+0x60/0x68) from [<c000df00>]
(__irq_svc+0x40/0x70)
[ 4619.775000] Exception stack(0xc2113d88 to 0xc2113dd0)00.00 kworker/u:0H
[ 4619.775000] 3d80: e6c60d10 0001ffff 00000001 00797474
e4089008 00797474
[ 4619.775000] 3da0: 00008eb2 e6c60d10 c2113e58 e6b8c025 00000003 c2113f00
ff555a5a c2113dd0
[ 4619.775000] 3dc0: c00d43c8 c00d43f0 60000053 ffffffff00.00 rcu_bh
[ 4619.775000] [<c000df00>] (__irq_svc+0x40/0x70) from [<c00d43f0>]
(__d_lookup+0x64/0x178)
[ 4619.775000] [<c00d43f0>] (__d_lookup+0x64/0x178) from [<c00c8800>]
(lookup_fast+0x140/0x27c)
[ 4619.775000] [<c00c8800>] (lookup_fast+0x140/0x27c) from [<c00c97e8>]
(link_path_walk+0x178/0x8a0)
[ 4619.775000] [<c00c97e8>] (link_path_walk+0x178/0x8a0) from [<c00ca64c>]
(path_lookupat+0x54/0x774)
[ 4619.775000] [<c00ca64c>] (path_lookupat+0x54/0x774) from [<c00cad8c>]
(filename_lookup+0x20/0x60)
[ 4619.775000] [<c00cad8c>] (filename_lookup+0x20/0x60) from [<c00ccda8>]
(user_path_at_empty+0x50/0x7c)
[ 4619.775000] [<c00ccda8>] (user_path_at_empty+0x50/0x7c) from [<c00ccde8>]
(user_path_at+0x14/0x1c)
[ 4619.775000] [<c00ccde8>] (user_path_at+0x14/0x1c) from [<c00defec>]
(sys_lgetxattr+0x30/0x80)
[ 4619.775000] [<c00defec>] (sys_lgetxattr+0x30/0x80) from [<c000e300>]
(ret_fast_syscall+0x0/0x30)
[ 4619.775000] CPU3: stopping
[ 4619.775000] [<c0014a94>] (unwind_backtrace+0x0/0xf8) from [<c0013438>]
(handle_IPI+0x120/0x14c)
[ 4619.775000] [<c0013438>] (handle_IPI+0x120/0x14c) from [<c000855c>]
(gic_handle_irq+0x60/0x68)
[ 4619.775000] [<c000855c>] (gic_handle_irq+0x60/0x68) from [<c000df00>]
(__irq_svc+0x40/0x70)
[ 4619.775000] Exception stack(0xe6777ee8 to 0xe6777f30)
[ 4619.775000] 7ee0: 00000001 a0000053 c00c9244 00000000
e7001480 c23ed000
[ 4619.775000] 7f00: 426905c3 c23e8000 000000d0 c23ed000 426905c7 000201f0
010b3000 e6777f30
[ 4619.775000] 7f20: c00c9244 c00ba378 20000053 ffffffff
[ 4619.775000] [<c000df00>] (__irq_svc+0x40/0x70) from [<c00ba378>]
(kmem_cache_alloc+0x6c/0xe4)
[ 4619.775000] [<c00ba378>] (kmem_cache_alloc+0x6c/0xe4) from [<c00c9244>]
(getname_flags+0x20/0x118)
[ 4619.775000] [<c00c9244>] (getname_flags+0x20/0x118) from [<c00bf2d4>]
(do_sys_open+0xb4/0x174)
[ 4619.775000] [<c00bf2d4>] (do_sys_open+0xb4/0x174) from [<c000e300>]
(ret_fast_syscall+0x0/0x30)
[ 4619.775000] CPU2: stopping
[ 4619.775000] [<c0014a94>] (unwind_backtrace+0x0/0xf8) from [<c0013438>]
(handle_IPI+0x120/0x14c)
[ 4619.775000] [<c0013438>] (handle_IPI+0x120/0x14c) from [<c000855c>]
(gic_handle_irq+0x60/0x68)
[ 4619.775000] [<c000855c>] (gic_handle_irq+0x60/0x68) from [<c000df00>]
(__irq_svc+0x40/0x70)
[ 4619.775000] Exception stack(0xe706ff40 to 0xe706ff88)
[ 4619.775000] ff40: e706ff88 3b9aca00 a3d560f0 00000433 a3a76c30 00000433
c14d83f0 00000000
[ 4619.775000] ff60: c045a380 413fc090 c0454f90 00000000 00000018 e706ff88
c0057900 c0255b64
[ 4619.775000] ff80: 60000053 ffffffff
[ 4619.775000] [<c000df00>] (__irq_svc+0x40/0x70) from [<c0255b64>]
(cpuidle_wrap_enter+0x48/0x94)
[ 4619.775000] [<c0255b64>] (cpuidle_wrap_enter+0x48/0x94) from [<c025586c>]
(cpuidle_enter_state+0x14/0x68)
[ 4619.775000] [<c025586c>] (cpuidle_enter_state+0x14/0x68) from [<c0255954>]
(cpuidle_idle_call+0x94/0x100)
[ 4619.775000] [<c0255954>] (cpuidle_idle_call+0x94/0x100) from [<c000f5e0>]
(cpu_idle+0x90/0xec)
[ 4619.775000] [<c000f5e0>] (cpu_idle+0x90/0xec) from [<402fa928>] (0x402fa928)