apport information

** Tags added: apport-collected
** Description changed:

  When bonnie++ is run in a loop, the system hangs with "rcu_sched:
  self-detected stall on CPU". The time to failure is inconsistent: one
  run took 7 hours, the next more than 2 days.

  Commands to reproduce the failure:

  $ sudo apt-get install bonnie++
  $ mkdir bonnie
  $ while true; do bonnie++ -d bonnie; done &>>bonnie0.log &

  Stack trace:

  [237019.072290] INFO: rcu_sched self-detected stall on CPU { 1} (t=19305216 jiffies g=580389 c=580388 q=84)
  [237019.080901] CPU: 1 PID: 44 Comm: kswapd0 Tainted: GF 3.11.0-6-generic-lpae #12-Ubuntu
  [237019.088879] [<c002bc00>] (unwind_backtrace+0x0/0x138) from [<c0026f1c>] (show_stack+0x10/0x14)
  [237019.096700] [<c0026f1c>] (show_stack+0x10/0x14) from [<c05cbe50>] (dump_stack+0x74/0x90)
  [237019.104051] [<c05cbe50>] (dump_stack+0x74/0x90) from [<c00bf37c>] (rcu_check_callbacks+0x31c/0x798)
  [237019.112262] [<c00bf37c>] (rcu_check_callbacks+0x31c/0x798) from [<c00492a0>] (update_process_times+0x38/0x64)
  [237019.121254] [<c00492a0>] (update_process_times+0x38/0x64) from [<c008cdbc>] (tick_sched_handle+0x54/0x60)
  [237019.129933] [<c008cdbc>] (tick_sched_handle+0x54/0x60) from [<c008d00c>] (tick_sched_timer+0x44/0x74)
  [237019.138300] [<c008d00c>] (tick_sched_timer+0x44/0x74) from [<c005db50>] (__run_hrtimer+0x74/0x1d4)
  [237019.146433] [<c005db50>] (__run_hrtimer+0x74/0x1d4) from [<c005e6f8>] (hrtimer_interrupt+0x10c/0x2c0)
  [237019.154800] [<c005e6f8>] (hrtimer_interrupt+0x10c/0x2c0) from [<c0492e44>] (arch_timer_handler_phys+0x28/0x30)
  [237019.163871] [<c0492e44>] (arch_timer_handler_phys+0x28/0x30) from [<c00b8c2c>] (handle_percpu_devid_irq+0x6c/0x104)
  [237019.173332] [<c00b8c2c>] (handle_percpu_devid_irq+0x6c/0x104) from [<c00b54ec>] (generic_handle_irq+0x20/0x30)
  [237019.182402] [<c00b54ec>] (generic_handle_irq+0x20/0x30) from [<c0023ff4>] (handle_IRQ+0x38/0x94)
  [237019.190378] [<c0023ff4>] (handle_IRQ+0x38/0x94) from [<c0008508>] (gic_handle_irq+0x28/0x5c)
  [237019.198041] [<c0008508>] (gic_handle_irq+0x28/0x5c) from [<c05d1c00>] (__irq_svc+0x40/0x50)
  [237019.205624] Exception stack(0xee2c1c18 to 0xee2c1c60)
  [237019.210238] 1c00: 00000004 00000004
  [237019.217666] 1c20: 00000008 00000001 ee2c1c8c ca208700 ca208700 0996b000 ca208708 00000001
  [237019.225093] 1c40: 00000002 edb31300 00000003 ee2c1c60 c02f54fc c00923c8 200f0013 ffffffff
  [237019.232523] [<c05d1c00>] (__irq_svc+0x40/0x50) from [<c00923c8>] (generic_exec_single+0x6c/0x94)
  [237019.240500] [<c00923c8>] (generic_exec_single+0x6c/0x94) from [<c00924f4>] (smp_call_function_single+0x104/0x198)
  [237019.249805] [<c00924f4>] (smp_call_function_single+0x104/0x198) from [<c0029920>] (broadcast_tlb_mm_a15_erratum+0x7c/0x84)
  [237019.259812] [<c0029920>] (broadcast_tlb_mm_a15_erratum+0x7c/0x84) from [<c0029adc>] (flush_tlb_page+0x74/0xa8)
  [237019.268882] [<c0029adc>] (flush_tlb_page+0x74/0xa8) from [<c011fc8c>] (ptep_clear_flush_young+0x6c/0xd0)
  [237019.277484] [<c011fc8c>] (ptep_clear_flush_young+0x6c/0xd0) from [<c011a60c>] (page_referenced_one+0x64/0x1fc)
  [237019.286554] [<c011a60c>] (page_referenced_one+0x64/0x1fc) from [<c011c034>] (page_referenced+0xf4/0x2e4)
  [237019.295155] [<c011c034>] (page_referenced+0xf4/0x2e4) from [<c00fc410>] (shrink_active_list+0x1f0/0x35c)
  [237019.303756] [<c00fc410>] (shrink_active_list+0x1f0/0x35c) from [<c00fdadc>] (shrink_lruvec+0x32c/0x598)
  [237019.312279] [<c00fdadc>] (shrink_lruvec+0x32c/0x598) from [<c00fddb0>] (shrink_zone+0x68/0x180)
  [237019.320176] [<c00fddb0>] (shrink_zone+0x68/0x180) from [<c00fe430>] (kswapd+0x568/0x9d4)
  [237019.327527] [<c00fe430>] (kswapd+0x568/0x9d4) from [<c005aae0>] (kthread+0xa4/0xb0)
  [237019.334487] [<c005aae0>] (kthread+0xa4/0xb0) from [<c0023198>] (ret_from_fork+0x14/0x3c)

  Setup details:
  Quad-core A15 server nodes on Calxeda Midway hardware.
  The failure has been seen twice, with a DDR setting of DDR3 @ 1600 MT/s.

  $ cat /proc/version_signature
  Ubuntu 3.11.0-12.18-generic-lpae 3.11.3

  The issue was first seen on Ubuntu 3.11.0-6.12-generic-lpae.

  $ cat /etc/issue
  Ubuntu 13.04 \n \l

  Additional debug information is attached.
+ ---
+ Architecture: armhf
+ DistroRelease: Ubuntu 13.04
+ MarkForUpload: True
+ Package: linux (not installed)
+ ProcEnviron:
+  LANGUAGE=en_US:
+  TERM=vt102
+  PATH=(custom, no user)
+  LANG=en_US
+  SHELL=/bin/bash
+ Uname: Linux 3.11.0-12-generic-lpae armv7l
+ UserGroups: adm cdrom dip lpadmin plugdev sambashare sudo

** Attachment added: "HookError_cloud_archive.txt"
   https://bugs.launchpad.net/bugs/1239800/+attachment/3877969/+files/HookError_cloud_archive.txt

--
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1239800

Title:
  Soft lockup when running bonnie++ only at 1600 mt/s

Status in “linux” package in Ubuntu:
  Confirmed

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1239800/+subscriptions

--
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
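[Editor's note] Since the time to failure in this report ranges from 7 hours to over 2 days, it can help to timestamp each completed bonnie++ iteration so the stall time in dmesg can be correlated with the workload log. A minimal wrapper sketch (hypothetical; the `run_loop` function name and log path are not part of the original report):

```shell
#!/bin/sh
# Sketch: run a workload repeatedly, stamping each completed iteration
# with a UTC timestamp so the time to failure is easy to read from the
# log. The loop ends when the workload exits non-zero (e.g. after the
# system recovers from a lockup or the process is killed).
run_loop() {
    log=$1; shift
    i=0
    while "$@" >>"$log" 2>&1; do
        i=$((i + 1))
        printf '%s completed iteration %d\n' "$(date -u +%FT%TZ)" "$i" >>"$log"
    done
}

# For the reproducer in this report one would run, in the background:
#   run_loop bonnie0.log bonnie++ -d bonnie &
```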