------- Comment From dieg...@br.ibm.com 2018-01-23 09:23 EDT-------
Hi Joseph,
I was able to reproduce the problem on the HWE kernel (XENIAL):

-- System

root@tuletapio2-lp3:~# uname -a
Linux tuletapio2-lp3 4.13.13 #1 SMP Tue Jan 23 07:41:39 CST 2018 ppc64le ppc64le ppc64le GNU/Linux

root@tuletapio2-lp3:~# cat /proc/meminfo
HugePages_Total: 2
HugePages_Free: 2
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 16777216 kB

root@tuletapio2-lp3:~# cat /proc/cmdline
BOOT_IMAGE=/boot/vmlinux-4.13.13 root=UUID=728bebfe-83ba-410d-917b-b552edbbb0a3 ro quiet splash default_hugepagesz=16G hugepagesz=16G hugepages=2

-- DUMP

Unable to handle kernel paging request for data at address 0x5deadbeef0000108
Faulting instruction address: 0xc0000000002cf374
Oops: Kernel access of bad area, sig: 11 [#1]
SMP NR_CPUS=2048 NUMA pSeries
Modules linked in:
CPU: 7 PID: 3769 Comm: mem-on-off-test Not tainted 4.13.13 #1
task: c0000007d9400000 task.stack: c0000007d9480000
NIP: c0000000002cf374 LR: c0000000002cf298 CTR: c000000000134ef0
REGS: c0000007d94837d0 TRAP: 0380 Not tainted (4.13.13)
MSR: 800000000280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE> CR: 44422484 XER: 00000000
CFAR: c0000000002cf308 SOFTE: 1
GPR00: c0000000002cf28c c0000007d9483a50 c000000000e6ff00 c000000000fbc580
GPR04: f000000004000000 5deadbeef0000100 5deadbeef0000200 0000000000000002
GPR08: 5deadbeef0000000 0000000000000001 c000000000fbc580 c0000007fb000628
GPR12: 0000000000002200 c00000000fd02a00 0000000000001000 f000000004000000
GPR16: 0000000000000002 0000000000000000 c000000000ea3b00 0000000000001000
GPR20: 00000000ffffdb4f 0000000000000001 c000000fffd44600 c0000007d9483b60
GPR24: 0000000000100000 c000000000f0dad8 c000000000eb2dd0 0000000000000001
GPR28: 0000000000000000 c000000000fc4580 c000000000fd4580 f000000004000000
NIP [c0000000002cf374] dissolve_free_huge_page+0x124/0x230
LR [c0000000002cf298] dissolve_free_huge_page+0x48/0x230
Call Trace:
[c0000007d9483a50] [c0000000002cf28c] dissolve_free_huge_page+0x3c/0x230 (unreliable)
[c0000007d9483a90] [c0000000002cf548] dissolve_free_huge_pages+0xc8/0x150
[c0000007d9483ae0] [c0000000002ee1c8] __offline_pages.constprop.5+0x398/0xa90
[c0000007d9483c30] [c000000000645870] memory_subsys_offline+0x60/0xf0
[c0000007d9483c60] [c000000000623434] device_offline+0xf4/0x130
[c0000007d9483ca0] [c000000000645718] store_mem_state+0x178/0x190
[c0000007d9483ce0] [c00000000061ea34] dev_attr_store+0x34/0x60
[c0000007d9483d00] [c0000000003bdd10] sysfs_kf_write+0x60/0xa0
[c0000007d9483d20] [c0000000003bcaac] kernfs_fop_write+0x16c/0x240
[c0000007d9483d70] [c000000000314cf4] __vfs_write+0x34/0x70
[c0000007d9483d90] [c00000000031673c] vfs_write+0xcc/0x230
[c0000007d9483de0] [c000000000318410] SyS_write+0x60/0x110
[c0000007d9483e30] [c00000000000b860] system_call+0x58/0x6c
Instruction dump:
e90a0030 88ff0007 7fa94000 419e0120 3d005dea e8df0028 e8bf0020 7fe4fb78
6108dbee 7d435378 790807c6 6508f000 <f8c50008> f8a60000 7d094378 61080100
---[ end trace 23e3dda3fe0a58bd ]--

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1724120

Title:
Ubuntu 16.04.3 - call traces occurs when memory-hotplug test is run with 16Gb hugepages configured

Status in The Ubuntu-power-systems project: In Progress
Status in linux package in Ubuntu: In Progress
Status in linux source package in Artful: In Progress

Bug description:

Issue:
Call traces occur when the memory-hotplug script is run with 16Gb hugepages configured.

Environment:
ppc64le PowerVM Lpar

root@ltctuleta-lp1:~# uname -r
4.4.0-34-generic

root@ltctuleta-lp1:~# cat /proc/meminfo | grep -i huge
AnonHugePages: 0 kB
HugePages_Total: 2
HugePages_Free: 2
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 16777216 kB

root@ltctuleta-lp1:~# free -h
              total        used        free      shared  buff/cache   available
Mem:            85G         32G         52G         16M        193M         52G
Swap:           43G          0B         43G

Steps to reproduce:
1 - Download the kernel source and enter the directory tools/testing/selftests/memory-hotplug/
2 - Run the mem-on-off-test.sh script in it.
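The two reproduction steps above could be scripted roughly as follows. This is only a minimal sketch: the function name run_hotplug_selftest, its arguments, and the location of the kernel tree are illustrative assumptions and are not part of the original report. The selftest itself needs root and offlines memory blocks, so on an affected kernel it may trigger the oops.

```shell
#!/bin/bash
# Sketch only: run_hotplug_selftest is a hypothetical helper, not part of
# the kernel tree or of the original bug report.
#   $1 = path to an unpacked kernel source tree (assumed to exist already)
#   $2 = meminfo file to read (defaults to /proc/meminfo)
run_hotplug_selftest() {
    ksrc="$1"
    meminfo="${2:-/proc/meminfo}"
    test_dir="$ksrc/tools/testing/selftests/memory-hotplug"

    # Step 1: the selftest directory must be present in the kernel tree.
    if [ ! -f "$test_dir/mem-on-off-test.sh" ]; then
        echo "mem-on-off-test.sh not found under $test_dir" >&2
        return 1
    fi

    # Record the hugepage configuration before the run, as in the report.
    grep -i huge "$meminfo" || echo "no hugepage lines found in $meminfo"

    # Step 2: run the selftest from its own directory (needs root on a
    # real system; it onlines and offlines memory blocks via sysfs).
    (cd "$test_dir" && bash ./mem-on-off-test.sh)
}
```

It would be called as, for example, run_hotplug_selftest /path/to/linux, where the path is whatever location the kernel source was downloaded to.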
System gives call traces like:

offline_memory_expect_success 639: unexpected fail
online-offline 668
[ 57.552964] Unable to handle kernel paging request for data at address 0x00000028
[ 57.552977] Faulting instruction address: 0xc00000000029bc04
[ 57.552987] Oops: Kernel access of bad area, sig: 11 [#1]
[ 57.552992] SMP NR_CPUS=2048 NUMA pSeries
[ 57.553002] Modules linked in: btrfs xor raid6_pq pseries_rng sunrpc autofs4 ses enclosure nouveau bnx2x i2c_algo_bit ttm drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm vxlan ip6_udp_tunnel ipr udp_tunnel rtc_generic mdio libcrc32c
[ 57.553050] CPU: 44 PID: 6518 Comm: mem-on-off-test Not tainted 4.4.0-34-generic #53-Ubuntu
[ 57.553059] task: c00000072773c8e0 ti: c000000727780000 task.ti: c000000727780000
[ 57.553067] NIP: c00000000029bc04 LR: c00000000029bbdc CTR: c0000000001107f0
[ 57.553076] REGS: c000000727783770 TRAP: 0300 Not tainted (4.4.0-34-generic)
[ 57.553083] MSR: 8000000100009033 <SF,EE,ME,IR,DR,RI,LE> CR: 24242882 XER: 00000002
[ 57.553104] CFAR: c000000000008468 DAR: 0000000000000028 DSISR: 40000000 SOFTE: 1
GPR00: c00000000029bbdc c0000007277839f0 c0000000015b5d00 0000000000000000
GPR04: 000000000029d000 0000000000000800 0000000000000000 f00000000a000001
GPR08: f00000000a700020 0000000000000008 c00000000185e270 c000000e7e000050
GPR12: 0000000000002200 c00000000e6ea200 000000000029d000 0000000022000000
GPR16: 1000000000000000 c0000000015e2200 000000000a700000 0000000000000000
GPR20: 0000000000010000 0000000000000100 0000000000000200 c0000000015f16d0
GPR24: c000000001876510 0000000000000000 0000000000000001 c000000001872a00
GPR28: 000000000029d000 f000000000000000 f00000000a700000 000000000029c000
[ 57.553211] NIP [c00000000029bc04] dissolve_free_huge_pages+0x154/0x220
[ 57.553219] LR [c00000000029bbdc] dissolve_free_huge_pages+0x12c/0x220
[ 57.553226] Call Trace:
[ 57.553231] [c0000007277839f0] [c00000000029bbdc] dissolve_free_huge_pages+0x12c/0x220 (unreliable)
[ 57.553244] [c000000727783a80] [c0000000002dcbc8] __offline_pages.constprop.6+0x3f8/0x900
[ 57.553254] [c000000727783bd0] [c0000000006fbb38] memory_subsys_offline+0xa8/0x110
[ 57.553265] [c000000727783c00] [c0000000006d6424] device_offline+0x104/0x140
[ 57.553274] [c000000727783c40] [c0000000006fba80] store_mem_state+0x180/0x190
[ 57.553283] [c000000727783c80] [c0000000006d1e58] dev_attr_store+0x68/0xa0
[ 57.553293] [c000000727783cc0] [c000000000398110] sysfs_kf_write+0x80/0xb0
[ 57.553302] [c000000727783d00] [c000000000397028] kernfs_fop_write+0x188/0x200
[ 57.553312] [c000000727783d50] [c0000000002e190c] __vfs_write+0x6c/0xe0
[ 57.553321] [c000000727783d90] [c0000000002e2640] vfs_write+0xc0/0x230
[ 57.553329] [c000000727783de0] [c0000000002e367c] SyS_write+0x6c/0x110
[ 57.553339] [c000000727783e30] [c000000000009204] system_call+0x38/0xb4
[ 57.553346] Instruction dump:
[ 57.553351] 7e831836 4bfff991 e91e0028 e8fe0020 7d32e82a f9070008 f8e80000 fabe0020
[ 57.553366] fade0028 79294620 79291764 7d234a14 <e9030028> 3908ffff f9030028 81091458
[ 57.553383] ---[ end trace 617f7bdd75bcfc10 ]---
[ 57.557133] Segmentation fault

The following commit IDs were built into a 4.10.0-37-generic #41 test kernel and verified to fix the problem:

a525108cf1cc14651602d678da38fa627a76a724
e1073d1e7920946ac4776a619cc40668b9e1401b
40692eb5eea209c2dd55857f44b4e1d7206e91d6
e24a1307ba1f99fc62a0bd61d5e87fcfb6d5503d
79cc38ded1e1ac86e69c90f604efadd50b0b3762
4ae279c2c96ab38a78b954d218790a8f6db714e5

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu-power-systems/+bug/1724120/+subscriptions

--
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp