[Kernel-packages] [Bug 1573062] Re: memory_stress_ng failing for Power architecture for 16.04
FYI, I think the patch made it to 4.4 stable as well -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1573062 Title: memory_stress_ng failing for Power architecture for 16.04 Status in linux package in Ubuntu: In Progress Status in linux source package in Xenial: In Progress Status in linux source package in Yakkety: In Progress Bug description: memory_stress_ng, as part of server certification is failing for IBM Power S812LC(TN71-BP012) in bare metal mode. Failing in this case is defined by the test locking up the server in an unrecoverable state which only a reboot will fix. I will be attaching screen and kern logs for the failures and a successful run on 14.04 on the same server. To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1573062/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~kernel-packages More help : https://help.launchpad.net/ListHelp
[Kernel-packages] [Bug 1573062] Re: memory_stress_ng failing for Power architecture for 16.04
Can we build the latest 4.8? Maybe we should wait for 4.8-rc7. I've got all the fixes upstream, the latest being 135e8c9250dd5c8c9aae5984fde6f230d0cbfeaf.
[Kernel-packages] [Bug 1573062] Re: memory_stress_ng failing for Power architecture for 16.04
Can we please get answers to questions 1, 2 and 4 from comment #128? Also, Kalpana has a request for a new kernel build. Thanks.
[Kernel-packages] [Bug 1573062] Re: memory_stress_ng failing for Power architecture for 16.04
Thank you for the excellent summary.

Questions:
1. Can we get the configurations of the machines?
2. Is the first column the number of times the test ran?
3. I see that 4.4.0-31-generic-50-Ubuntu passed on all machines across several runs. Is that true?
4. Did any of the tests result in a system hang? Can we find that out from the summary?

Would it be fair to assume that tests run against 4.4.0-31-generic-50-Ubuntu would pass again, and that we should work from that kernel?

My interest is in 4.4.0-34-generic-53~lp1573062PATCHED. With that kernel I notice that gulpin saw failures in mmapfork, and probably a hang there, for which I posted a scheduler try_to_wake_up fix upstream. binacle generally hits brk stress test failures; it would be interesting to see its machine configuration and why the test failed.

One observation at my end is that we should reboot between runs. I think some tests can kill important tasks in the system, and I am not sure there is any guarantee that the system can carry on operating correctly once all of those tasks have been re-spawned, after an OOM for example.
[Kernel-packages] [Bug 1573062] Re: memory_stress_ng failing for Power architecture for 16.04
I just posted another patch at http://www.mail-archive.com/linux-ker...@vger.kernel.org/msg1219903.html and I am testing it at the moment.
[Kernel-packages] [Bug 1573062] Re: memory_stress_ng failing for Power architecture for 16.04
Thanks, Jeff. I see that the ARM64 run might have failed: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1610320

Do we know the git commit ID that fixed the ARM64 failure in mainline? BTW, could you please share the full machine configuration (threads, RAM and swap) for each of the other architectures you ran this on?
[Kernel-packages] [Bug 1610320] Re: stress-ng memory testing causes Arm64 system to hang
Do we know what change fixed the issue? Is there a commit ID?

--
https://bugs.launchpad.net/bugs/1610320

Title:
  stress-ng memory testing causes Arm64 system to hang

Status in linux package in Ubuntu:
  Confirmed

Bug description:
  Running the certification memory test using stress-ng on an ARM64 system with roughly 64GB of RAM (MAAS shows 63GB).

  The test runs several stress-ng memory related tests. It appears that the system locks up when the bigheap test runs, every time so far (two of two runs have failed). I'm doing a third now to confirm that bigheap is where the lockup occurs.

  I've also run this exact same test on similar and smaller memory amounts on s390x and amd64 without problem. This is also being done to provide data for a similar bug discovered on Power (ppc64le).

  To test this on an arm64 system:

  Install Xenial
  $ add-apt-repository ppa:hardware-certification/public
  $ apt update
  $ apt install canonical-certification-server
  $ /usr/lib/plainbox-provider-checkbox/bin/memory_stress_ng

  The memory_stress_ng script is a wrapper for stress-ng that only calls certain memory tests. See the script for an idea of how it executes the tests.

  This could be the same issue that we're seeing on Power, or it could be a different issue for ARM that looks similar.

  Here's the original Power bug:
  https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1573062

  ProblemType: Bug
  DistroRelease: Ubuntu 16.04
  Package: linux-image-4.4.0-31-generic 4.4.0-31.50
  ProcVersionSignature: User Name 4.4.0-31.50-generic 4.4.13
  Uname: Linux 4.4.0-31-generic aarch64
  AlsaDevices:
   total 0
   crw-rw 1 root audio 116, 1 Aug 5 09:27 seq
   crw-rw 1 root audio 116, 33 Aug 5 09:27 timer
  AplayDevices: Error: [Errno 2] No such file or directory: 'aplay'
  ApportVersion: 2.20.1-0ubuntu2.1
  Architecture: arm64
  ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord'
  AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
  CRDA: N/A
  Date: Fri Aug 5 16:15:51 2016
  IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig'
  Lsusb: Error: command ['lsusb'] failed with exit code 1:
  PciMultimedia:
  ProcEnviron:
   TERM=xterm
   PATH=(custom, no user)
   XDG_RUNTIME_DIR=
   LANG=en_US.UTF-8
   SHELL=/bin/bash
  ProcFB:
  ProcKernelCmdLine: console=ttyS0,9600n8r ro
  RelatedPackageVersions:
   linux-restricted-modules-4.4.0-31-generic N/A
   linux-backports-modules-4.4.0-31-generic N/A
   linux-firmware 1.157.2
  RfKill: Error: [Errno 2] No such file or directory: 'rfkill'
  SourcePackage: linux
  UpgradeStatus: No upgrade log present (probably fresh install)

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1610320/+subscriptions
[Kernel-packages] [Bug 1573062] Re: memory_stress_ng failing for IBM Power S812LC(TN71-BP012) for 16.04
I have not seen anything like these logs in my own testing. They show that we are stuck doing an up_write() on root->rwsem in the anon_vma path, and it looks like we are contending on the rwsem's sem->wait_lock. I don't have a reproduction of this issue; it would be interesting to examine what is causing the heavy contention.
[Kernel-packages] [Bug 1573062] Re: memory_stress_ng failing for IBM Power S812LC(TN71-BP012) for 16.04
I am unable to reproduce the failure either, but on your system, with 32GB of RAM and 128 threads, it seems the test would start 128 hogs, each trying to hog up to 32GB. How much swap do these machines have? Could you post the dmesg output so we can see what failed, along with the logs around it? It looks like the stack stressor failed.
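To put rough numbers on the 128-hogs estimate, here is a minimal sketch. It assumes every hog really does try to allocate the full 32GB, which is only my reading of how the test scales and is not confirmed. The swap size is left as a parameter because it has not been reported yet, and the function names are made up for illustration.

```python
def total_hog_demand_gib(hogs: int, per_hog_gib: float) -> float:
    """Worst-case memory requested if every hog asks for per_hog_gib."""
    return hogs * per_hog_gib


def overcommit_ratio(hogs: int, per_hog_gib: float,
                     ram_gib: float, swap_gib: float = 0.0) -> float:
    """How many times the worst-case demand exceeds RAM plus swap."""
    return total_hog_demand_gib(hogs, per_hog_gib) / (ram_gib + swap_gib)


if __name__ == "__main__":
    # 128 threads and 32GB of RAM come from the message above.
    # swap_gib is unknown, so it is left at the default of 0 here.
    demand = total_hog_demand_gib(128, 32)
    ratio = overcommit_ratio(128, 32, ram_gib=32)
    print(f"worst-case demand: {demand:.0f} GiB")
    print(f"overcommit with no swap: {ratio:.0f}x")
```

If the real swap size turns out to be large, passing it as swap_gib shows how much of that demand could be absorbed before the OOM killer has to step in.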
[Kernel-packages] [Bug 1573062] Re: memory_stress_ng failing for IBM Power S812LC(TN71-BP012) for 16.04
Can I quickly check whether the oom_reaper patches are in the built kernel, or is it just the fix I posted?
[Kernel-packages] [Bug 1573062] Re: memory_stress_ng failing for IBM Power S812LC(TN71-BP012) for 16.04
I have had several more successful runs. I would like to see runs from others as well.
[Kernel-packages] [Bug 1573062] Re: memory_stress_ng failing for IBM Power S812LC(TN71-BP012) for 16.04
At my end, two runs completed successfully. More runs are in progress.
[Kernel-packages] [Bug 1573062] Re: memory_stress_ng failing for IBM Power S812LC(TN71-BP012) for 16.04
I just looked at the directory and I can find just the arm64 kernel. Could you please confirm if I am looking at the right thing and at the right place?
[Kernel-packages] [Bug 1573062] Re: memory_stress_ng failing for IBM Power S812LC(TN71-BP012) for 16.04
Sorry, the 14.04 in comment #61 should be 16.04.
[Kernel-packages] [Bug 1573062] Re: memory_stress_ng failing for IBM Power S812LC(TN71-BP012) for 16.04
I have 14.04 installed with 4.4.0-28 and I can see the following.

In the bad case:
1. OOM'ing of stress-ng-brk is slow, but I can see it making progress: tasks are being scheduled, and there is console output and sysrq output on Ctrl-o h.
2. stress-ng-brk is trying to make progress in OOM, but is heavily contending for what seems to be the lru lock.
3. The network driver is trying to service softirqs and fails to process them because allocation fails, so it does a dump_stack().

In the other case (based on logs and limited testing; kernel 698f415cf5756e320623bdb015a600945743377c for me):
1. OOM proceeds quickly and stress-ng-brk is OOM'd frequently.
2. At some point, when all stress-ng-brk instances seem to have been OOM'd, the test completes.

I think the test relies on all stress-ng-brk instances being OOM'd before it considers itself complete (I could be wrong). In this case progress is slow, but the system is responding. I am going to try the kernel listed here.

We believe the test may need multiple iterations of running to check for consistency across commits. At this point I think the test will progress but needs longer, because progress is slow. I think the expectation is that it should complete faster; the system is not locked up completely as far as I can tell.
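One way to tell the slow case from the fast case in the logs is to measure how often stress-ng-brk gets OOM-killed. The sketch below is only an illustration: it assumes the kernel log carries dmesg-style timestamps and the 4.4-era "Out of memory: Kill process ..." wording, and the sample lines in it are hypothetical, not taken from the attached logs.

```python
import re

# Assumed 4.4-era line format, for example:
# [ 1234.567890] Out of memory: Kill process 4321 (stress-ng-brk) score 12 or sacrifice child
OOM_RE = re.compile(
    r"\[\s*(\d+\.\d+)\]\s+Out of memory: Kill process (\d+) \(([^)]+)\)"
)


def oom_kill_times(lines, comm="stress-ng-brk"):
    """Return timestamps (seconds since boot) of OOM kills of the given task name."""
    times = []
    for line in lines:
        m = OOM_RE.search(line)
        if m and m.group(3) == comm:
            times.append(float(m.group(1)))
    return times


def kills_per_minute(times):
    """Average OOM kill rate between the first and last kill seen."""
    if len(times) < 2:
        return 0.0
    span = times[-1] - times[0]
    if span <= 0:
        return 0.0
    return (len(times) - 1) * 60.0 / span


if __name__ == "__main__":
    # Hypothetical sample lines, for illustration only.
    sample = [
        "[  100.000000] Out of memory: Kill process 4321 (stress-ng-brk) score 12 or sacrifice child",
        "[  130.000000] Out of memory: Kill process 4322 (stress-ng-brk) score 12 or sacrifice child",
        "[  160.000000] Out of memory: Kill process 4323 (stress-ng-brk) score 12 or sacrifice child",
    ]
    print(kills_per_minute(oom_kill_times(sample)))
```

A rate that stays low but above zero would match the "slow but still progressing" picture, while no new kills at all over a long window would point more towards a real hang.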