[Kernel-packages] [Bug 1573062] Re: memory_stress_ng failing for Power architecture for 16.04

2016-09-22 Thread Balbir Singh
FYI, I think the patch made it to 4.4 stable as well.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1573062

Title:
  memory_stress_ng failing for Power architecture for 16.04

Status in linux package in Ubuntu:
  In Progress
Status in linux source package in Xenial:
  In Progress
Status in linux source package in Yakkety:
  In Progress

Bug description:
  memory_stress_ng, as part of server certification is failing for IBM
  Power S812LC(TN71-BP012) in bare metal mode. Failing in this case is
  defined by the test locking up the server in an unrecoverable state
  which only a reboot will fix.

  I will be attaching screen and kern logs for the failures and a
  successful run on 14.04 on the same server.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1573062/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp


[Kernel-packages] [Bug 1573062] Re: memory_stress_ng failing for Power architecture for 16.04

2016-09-16 Thread Balbir Singh
Can we build the latest 4.8? Maybe we should wait for 4.8-rc7. All of my
fixes are now upstream, the latest being commit
135e8c9250dd5c8c9aae5984fde6f230d0cbfeaf.



[Kernel-packages] [Bug 1573062] Re: memory_stress_ng failing for Power architecture for 16.04

2016-09-13 Thread Balbir Singh
Can we please get answers to questions 1, 2 and 4 from comment #128?
Also, Kalpana has requested a new kernel build.

Thanks,



[Kernel-packages] [Bug 1573062] Re: memory_stress_ng failing for Power architecture for 16.04

2016-09-04 Thread Balbir Singh
Thank you for the excellent summary. A few questions:

1. Can we get the configurations of the machines?
2. Is the first column the number of times the test ran?
3. I see that 4.4.0-31-generic-50-Ubuntu passed on all machines across
several runs. Is that true?
4. Did any of the tests result in a system hang? Can we tell from the
summary?

Would it be fair to assume that tests run against
4.4.0-31-generic-50-Ubuntu would pass again, and that we should work
from that kernel?

My interest is in 4.4.0-34-generic-53~lp1573062PATCHED. With that kernel
I notice that gulpin saw failures in mmapfork, and probably a hang
there, for which I posted a scheduler try_to_wake_up fix upstream.
binacle generally fails the brk stress test; it would be interesting to
see its machine configuration and why the test failed.

One observation from my end: we should reboot between runs. Some tests
can kill important tasks in the system, and I am not sure there is any
guarantee that the system carries on operating correctly once those
tasks have been re-spawned, after an OOM for example.
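
One way to check whether a run killed tasks other than the stressors is
to scan the kernel log for OOM victims. The following is a minimal
sketch, not part of the certification suite. It assumes the kernel logs
each victim as "Killed process <pid> (<comm>)", which is the pattern
4.4-era kernels print; the function names and the "stress-ng" prefix
filter are hypothetical choices made for illustration.

```python
import re
from collections import Counter

# Assumption: each OOM victim appears in the log as
# "Killed process <pid> (<comm>) ...". Adjust the pattern if your
# kernel words the message differently.
KILLED = re.compile(r"Killed process (\d+) \(([^)]*)\)")


def oom_victims(log_text):
    """Count OOM kills per command name found in the log text."""
    return Counter(m.group(2) for m in KILLED.finditer(log_text))


def unexpected_victims(log_text, expected_prefix="stress-ng"):
    """Return sorted names of killed tasks that are not stressors."""
    return sorted(
        name for name in oom_victims(log_text)
        if not name.startswith(expected_prefix)
    )
```

Feeding it the contents of kern.log or the output of dmesg after a run
gives a quick indication of whether a reboot is needed: a non-empty
result from unexpected_victims() means something other than a stressor
was killed.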



[Kernel-packages] [Bug 1573062] Re: memory_stress_ng failing for Power architecture for 16.04

2016-08-30 Thread Balbir Singh
I just posted another patch at
http://www.mail-archive.com/linux-ker...@vger.kernel.org/msg1219903.html
and am testing it at the moment.



[Kernel-packages] [Bug 1573062] Re: memory_stress_ng failing for Power architecture for 16.04

2016-08-18 Thread Balbir Singh
Thanks, Jeff. I see that ARM64 may have failed as well:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1610320

Do we know the git commit ID that fixed the ARM64 failure in mainline?

BTW, could you please share the full machine configurations (threads,
RAM and swap) for each of the other architectures you ran this on?
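
The three figures asked for here can be collected in the same form on
every machine. The following is a minimal sketch that assumes a Linux
system exposing /proc/meminfo with MemTotal and SwapTotal reported in
kB; the function names and the output format are hypothetical.

```python
import os


def parse_meminfo(text):
    """Map /proc/meminfo field names to their integer values."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def machine_summary(meminfo_text, threads):
    """Format threads, RAM and swap as a one-line summary."""
    info = parse_meminfo(meminfo_text)
    kb_per_gib = 1024 * 1024
    ram_gib = info.get("MemTotal", 0) / kb_per_gib
    swap_gib = info.get("SwapTotal", 0) / kb_per_gib
    return f"threads={threads} ram={ram_gib:.1f}GiB swap={swap_gib:.1f}GiB"


if __name__ == "__main__" and os.path.exists("/proc/meminfo"):
    with open("/proc/meminfo") as f:
        # os.cpu_count() reports logical CPUs, i.e. hardware threads.
        print(machine_summary(f.read(), os.cpu_count() or 0))
```

Running it on each test machine gives one comparable line per system
that can be pasted into the bug.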



[Kernel-packages] [Bug 1610320] Re: stress-ng memory testing causes Arm64 system to hang

2016-08-18 Thread Balbir Singh
Do we know which change fixed the issue? Is there a commit ID?

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1610320

Title:
  stress-ng memory testing causes Arm64 system to hang

Status in linux package in Ubuntu:
  Confirmed

Bug description:
  Running the certification memory test using stress-ng on an ARM64
  system with 64GB roughly of RAM (MAAS shows 63GB).

  The test runs several stress-ng memory related tests.  It appears that
  the system locks up when the bigheap test runs, every time so far (two
  of two runs have failed).

  I'm doing a third now to confirm that bigheap is where the lockup
  occurs.

  I've also run this exact same test on similar and smaller memory
  amounts on s390x and amd64 without problem.

  This is also being done to provide data for a similar bug discovered
  on Power (ppc64le).

  To test this on an arm64 system:

  Install Xenial
  $ add-apt-repository ppa:hardware-certification/public
  $ apt update
  $ apt install canonical-certification-server
  $ /usr/lib/plainbox-provider-checkbox/bin/memory_stress_ng

  the memory_stress_ng script is a wrapper for stress_ng that only calls
  certain memory tests.  See script for an idea of how it's executing
  the tests.

  This could be the same issue that we're seeing on power, or it could
  be a different issue for ARM that looks similar.  Here's the original
  Power bug:

  https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1573062

  ProblemType: Bug
  DistroRelease: Ubuntu 16.04
  Package: linux-image-4.4.0-31-generic 4.4.0-31.50
  ProcVersionSignature: User Name 4.4.0-31.50-generic 4.4.13
  Uname: Linux 4.4.0-31-generic aarch64
  AlsaDevices:
   total 0
   crw-rw 1 root audio 116,  1 Aug  5 09:27 seq
   crw-rw 1 root audio 116, 33 Aug  5 09:27 timer
  AplayDevices: Error: [Errno 2] No such file or directory: 'aplay'
  ApportVersion: 2.20.1-0ubuntu2.1
  Architecture: arm64
  ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord'
  AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
  CRDA: N/A
  Date: Fri Aug  5 16:15:51 2016
  IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig'
  Lsusb: Error: command ['lsusb'] failed with exit code 1:
  PciMultimedia:
   
  ProcEnviron:
   TERM=xterm
   PATH=(custom, no user)
   XDG_RUNTIME_DIR=
   LANG=en_US.UTF-8
   SHELL=/bin/bash
  ProcFB:
   
  ProcKernelCmdLine: console=ttyS0,9600n8r ro
  RelatedPackageVersions:
   linux-restricted-modules-4.4.0-31-generic N/A
   linux-backports-modules-4.4.0-31-generic  N/A
   linux-firmware                            1.157.2
  RfKill: Error: [Errno 2] No such file or directory: 'rfkill'
  SourcePackage: linux
  UpgradeStatus: No upgrade log present (probably fresh install)

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1610320/+subscriptions



[Kernel-packages] [Bug 1573062] Re: memory_stress_ng failing for IBM Power S812LC(TN71-BP012) for 16.04

2016-08-18 Thread Balbir Singh
These logs show something I have not seen in my own testing: we are
stuck doing an up_write() on root->rwsem in the anon_vma path. It looks
like we are contending on the rwsem's sem->wait_lock. I don't have a
reproduction of this issue; it would be interesting to examine what is
causing the heavy contention.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1573062

Title:
  memory_stress_ng failing for IBM Power S812LC(TN71-BP012) for 16.04

Status in linux package in Ubuntu:
  In Progress
Status in linux source package in Xenial:
  In Progress
Status in linux source package in Yakkety:
  In Progress

Bug description:
  memory_stress_ng, as part of server certification is failing for IBM
  Power S812LC(TN71-BP012) in bare metal mode. Failing in this case is
  defined by the test locking up the server in an unrecoverable state
  which only a reboot will fix.

  I will be attaching screen and kern logs for the failures and a
  successful run on 14.04 on the same server.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1573062/+subscriptions



[Kernel-packages] [Bug 1573062] Re: memory_stress_ng failing for IBM Power S812LC(TN71-BP012) for 16.04

2016-08-17 Thread Balbir Singh
I am unable to reproduce the failure either, but on your system with
32GB and 128 threads it seems the test would start 128 hogs, each
hogging 32GB. How much swap do you have on them? Could you post the
dmesg so we can see what failed and the logs around it? It looks like
the stack stressor failed.
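
To put that figure in perspective, the arithmetic can be written out.
This is a minimal sketch that assumes one hog per thread, each trying to
allocate the full RAM size, as described above. The swap size is not
stated in the thread, so it defaults to zero; the function name is
hypothetical.

```python
def oversubscription(ram_gb, hogs, per_hog_gb, swap_gb=0.0):
    """Ratio of total requested memory to RAM plus swap."""
    demand_gb = hogs * per_hog_gb
    return demand_gb / (ram_gb + swap_gb)


# Figures from the message above: 32GB of RAM, 128 threads,
# one 32GB hog per thread, swap unknown (assumed 0 here).
ratio = oversubscription(ram_gb=32, hogs=128, per_hog_gb=32)
print(f"demand is {ratio:.0f}x available memory")
```

Under these assumptions the hogs ask for 4096GB in total, so heavy OOM
activity is expected regardless of kernel version; swap only lowers the
ratio modestly.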



[Kernel-packages] [Bug 1573062] Re: memory_stress_ng failing for IBM Power S812LC(TN71-BP012) for 16.04

2016-08-15 Thread Balbir Singh
Can I quickly check whether the oom_reaper patches are in the built
kernel, or is it just the fix I posted?



[Kernel-packages] [Bug 1573062] Re: memory_stress_ng failing for IBM Power S812LC(TN71-BP012) for 16.04

2016-08-14 Thread Balbir Singh
I had several more successful runs. I would like to see runs from others
as well.



[Kernel-packages] [Bug 1573062] Re: memory_stress_ng failing for IBM Power S812LC(TN71-BP012) for 16.04

2016-08-11 Thread Balbir Singh
At my end, two runs completed successfully. More runs are in progress.



[Kernel-packages] [Bug 1573062] Re: memory_stress_ng failing for IBM Power S812LC(TN71-BP012) for 16.04

2016-08-10 Thread Balbir Singh
I just looked at the directory and can find only the arm64 kernel.
Could you please confirm that I am looking at the right thing in the
right place?



[Kernel-packages] [Bug 1573062] Re: memory_stress_ng failing for IBM Power S812LC(TN71-BP012) for 16.04

2016-07-06 Thread Balbir Singh
Sorry, the 14.04 should be 16.04 in comment #61.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1573062

Title:
  memory_stress_ng failing for IBM Power S812LC(TN71-BP012) for 16.04

Status in linux package in Ubuntu:
  In Progress
Status in linux source package in Wily:
  In Progress
Status in linux source package in Xenial:
  In Progress
Status in linux source package in Yakkety:
  In Progress

Bug description:
  memory_stress_ng, as part of server certification is failing for IBM
  Power S812LC(TN71-BP012) in bare metal mode. Failing in this case is
  defined by the test locking up the server in an unrecoverable state
  which only a reboot will fix.

  I will be attaching screen and kern logs for the failures and a
  successful run on 14.04 on the same server.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1573062/+subscriptions



[Kernel-packages] [Bug 1573062] Re: memory_stress_ng failing for IBM Power S812LC(TN71-BP012) for 16.04

2016-07-05 Thread Balbir Singh
I have 14.04 installed with 4.4.0-28 and I can see the following.

In the bad case:

1. OOM'ing of stress-ng-brk is slow, but I can see it making progress:
tasks are being scheduled, and there is console output and sysrq output
on Ctrl-o h.
2. stress-ng-brk is trying to make progress in OOM, but is heavily
contending for what seems to be the lru lock.
3. The network driver is trying to service softirqs and fails to process
them because allocation fails, and it does a dump_stack().

In the other case (based on logs and limited testing; kernel
698f415cf5756e320623bdb015a600945743377c for me):

1. OOM proceeds quickly and stress-ng-brk is OOM'd frequently.
2. At some point, when all stress-ng-brk instances seem to have been
OOM'd, the test completes.

I think the test relies on all stress-ng-brk instances being OOM'd
before it considers itself complete (I could be wrong). In this case
progress is slow, but the system is responding. I am going to try the
kernel listed here.

We believe the test may need to be run for multiple iterations to check
that results are consistent across commits.

At this point I think the test will make progress but needs longer,
because progress is slow. I think the expectation is that it should
complete faster; the system is not locked up completely as far as I can
tell.
