[Kernel-packages] [Bug 1670634] Re: blk-mq: possible deadlock on CPU hot(un)plug

2017-10-10 Thread Frank Heimes
** Changed in: linux (Ubuntu)
   Status: Fix Committed => Fix Released

** Changed in: ubuntu-z-systems
   Status: Fix Committed => Fix Released

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1670634

Title:
  blk-mq: possible deadlock on CPU hot(un)plug

Status in Ubuntu on IBM z Systems:
  Fix Released
Status in linux package in Ubuntu:
  Fix Released
Status in linux source package in Xenial:
  Fix Released

Bug description:
  == Comment: #0 - Carsten Jacobi  - 2017-03-07 03:35:31 ==
  I'm evaluating Ubuntu Xenial on z for development purposes. The test system
  is installed in an LPAR with one FCP LUN that is accessible via 4 paths (all
  paths are configured).
  The system hangs regularly when I build packages with "pdebuild" from the
  pbuilder packaging suite.
  The local Linux development team helped me with a pre-analysis that I can
  post here (thanks a lot for that):

  With the default settings and under a certain workload,
  blk_mq seems to get into a presumed "deadlock".
  Possibly this happens on CPU hot(un)plug.

  After the I/O stalled, a dump was pulled manually.
  The following information is from the crash dump pre-analysis.

  $ zgetdump -i dump.0
  General dump info:
    Dump format: elf
    Version: 1
    UTS node name..: mclint
    UTS kernel release.: 4.4.0-65-generic
    UTS kernel version.: #86-Ubuntu SMP Thu Feb 23 17:54:37 UTC 2017
    System arch: s390x (64 bit)
    CPU count (online).: 2
    Dump memory range..: 8192 MB
  Memory map:
     - 0001b831afff (7043 MB)
    0001b831b000 - 0001 (1149 MB)

  Things look similar with the HWE kernel ubuntu16.04-4.8.0-34.36~16.04.1.

        KERNEL: vmlinux.full
      DUMPFILE: dump.0
          CPUS: 2
          DATE: Fri Mar  3 14:31:07 2017
        UPTIME: 02:11:20
  LOAD AVERAGE: 13.00, 12.92, 11.37
         TASKS: 411
      NODENAME: mclint
       RELEASE: 4.4.0-65-generic
       VERSION: #86-Ubuntu SMP Thu Feb 23 17:54:37 UTC 2017
       MACHINE: s390x  (unknown Mhz)
        MEMORY: 7.8 GB
         PANIC: ""
           PID: 0
       COMMAND: "swapper/0"
          TASK: bad528  (1 of 2)  [THREAD_INFO: b78000]
           CPU: 0
         STATE: TASK_RUNNING (ACTIVE)
          INFO: no panic task found

  crash> dev -d
  MAJOR GENDISK NAME REQUEST_QUEUE TOTAL ASYNC SYNC DRV
  ...
  8 1e1d6d800 sda 1e1d51210 0 23151 4294944145 N/A(MQ)
  8 1e4e06800 sdc 2081b180 23148 4294944148 N/A(MQ)
  8 1f07800 sdb 20c75680 23195 4294944101 N/A(MQ)
  8 1e4e06000 sdd 1e4e31210 0 23099 4294944197 N/A(MQ)
  252 1e1d6c800 dm-0 1e1d51b18 9 1 8 N/A(MQ)
  ...

  So both dm-mpath and sd have requests pending in their block multiqueue.
  The large SYNC numbers for the sd devices look strange. They seem to be the
  values shown for ASYNC multiplied by -1 and printed as unsigned 32-bit
  numbers.
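  The arithmetic behind that observation can be checked with a small sketch.
  It assumes the two large-ish numbers on each sd row are the ASYNC and SYNC
  columns and that the counters are 32 bits wide; the helper name as_signed32
  is made up for this illustration and is not part of crash.

```python
def as_signed32(value):
    """Reinterpret an unsigned 32-bit value as a signed 32-bit integer."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


# (ASYNC, SYNC) pairs copied from the sd rows of the "dev -d" output above.
# Assumption: the column assignment is as named here.
rows = {
    "sda": (23151, 4294944145),
    "sdc": (23148, 4294944148),
    "sdb": (23195, 4294944101),
    "sdd": (23099, 4294944197),
}

for name, (async_count, sync_count) in rows.items():
    # If the report's reading is right, the signed SYNC value is -ASYNC.
    print(name, async_count, as_signed32(sync_count))
```

  For every row the signed value of the second number is the negative of the
  first, which is consistent with a counter that wrapped below zero rather
  than with billions of outstanding requests.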

  [0.798256] Linux version 4.4.0-65-generic (buildd@z13-011) (gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.4) ) #86-Ubuntu SMP Thu Feb 23 17:54:37 UTC 2017 (Ubuntu 4.4.0-65.86-generic 4.4.49)
  [0.798262] setup: Linux is running natively in 64-bit mode
  [0.798290] setup: Max memory size: 8192MB
  [0.798298] setup: Reserving 196MB of memory at 7996MB for crashkernel (System RAM: 7996MB)

  [0.836923] Kernel command line: root=/dev/mapper/mclint_vg-root rootflags=subvol=@ crashkernel=196M BOOT_IMAGE=0

  [ 5281.179428] INFO: task xfsaild/dm-11:1604 blocked for more than 120 seconds.
  [ 5281.179437]   Not tainted 4.4.0-65-generic #86-Ubuntu
  [ 5281.179438] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
  [ 5281.179440] xfsaild/dm-11   D 007bcf52 0  1604  2 0x
  [ 5281.179444]        0001e931c230 001a6964 0001e6f9b958 0001e6f9b9d8
                        0001e15795f0 0001e6f9b988 00ce8c00 0001ea805c70
                        0001ea805c00 00ba5ed0 0001e931c1d0 0001e1579b20
                        0001ea805c00 0001e15795f0 0001ea805c00
                        007d3978 007bc9f8 0001e6f9b9d8 0001e6f9ba40
  [ 5281.179454] Call Trace:
  [ 5281.179461] ([<007bc9f8>] __schedule+0x300/0x810)
  [ 5281.179462]  [<007bcf52>] schedule+0x4a/0xb0
  [ 5281.179465]  [<007c02aa>] schedule_timeout+0x232/0x2a8
  [ 5281.179466]  [<007bde50>] wait_for_common+0x110/0x1c8
  [ 5281.179472]  [<0017b602>] flush_work+0x42/0x58
  [ 5281.179564]  [<03ff805e14ba>] xlog_cil_force_lsn+0x7a/0x238 [xfs]
  [ 5281.179589]  [<03ff805dee82>] _xfs_log_force+0x9a/0x2e8 [xfs]
  [ 5281.179615]  [<03ff805df114>] xfs_log_force+0x44/0x100 [xfs]
  [ 5281.179640]

[Kernel-packages] [Bug 1670634] Re: blk-mq: possible deadlock on CPU hot(un)plug

2017-10-10 Thread Launchpad Bug Tracker
This bug was fixed in the package linux - 4.4.0-97.120

---
linux (4.4.0-97.120) xenial; urgency=low

  * linux: 4.4.0-97.120 -proposed tracker (LP: #1718149)

  * blk-mq: possible deadlock on CPU hot(un)plug (LP: #1670634)
    - [Config] s390x -- disable CONFIG_{DM, SCSI}_MQ_DEFAULT

  * Xenial update to 4.4.87 stable release (LP: #1715678)
    - irqchip: mips-gic: SYNC after enabling GIC region
    - i2c: ismt: Don't duplicate the receive length for block reads
    - i2c: ismt: Return EMSGSIZE for block reads with bogus length
    - ceph: fix readpage from fscache
    - cpumask: fix spurious cpumask_of_node() on non-NUMA multi-node configs
    - cpuset: Fix incorrect memory_pressure control file mapping
    - alpha: uapi: Add support for __SANE_USERSPACE_TYPES__
    - CIFS: remove endian related sparse warning
    - wl1251: add a missing spin_lock_init()
    - xfrm: policy: check policy direction value
    - drm/ttm: Fix accounting error when fail to get pages for pool
    - kvm: arm/arm64: Fix race in resetting stage2 PGD
    - kvm: arm/arm64: Force reading uncached stage2 PGD
    - epoll: fix race between ep_poll_callback(POLLFREE) and
      ep_free()/ep_remove()
    - crypto: algif_skcipher - only call put_page on referenced and used pages
    - Linux 4.4.87

  * Xenial update to 4.4.86 stable release (LP: #1715430)
    - scsi: isci: avoid array subscript warning
    - ALSA: au88x0: Fix zero clear of stream->resources
    - btrfs: remove duplicate const specifier
    - i2c: jz4780: drop superfluous init
    - gcov: add support for gcc version >= 6
    - gcov: support GCC 7.1
    - lightnvm: initialize ppa_addr in dev_to_generic_addr()
    - p54: memset(0) whole array
    - lpfc: Fix Device discovery failures during switch reboot test.
    - arm64: mm: abort uaccess retries upon fatal signal
    - x86/io: Add "memory" clobber to insb/insw/insl/outsb/outsw/outsl
    - arm64: fpsimd: Prevent registers leaking across exec
    - scsi: sg: protect accesses to 'reserved' page array
    - scsi: sg: reset 'res_in_use' after unlinking reserved array
    - drm/i915: fix compiler warning in drivers/gpu/drm/i915/intel_uncore.c
    - Linux 4.4.86

  * Xenial update to 4.4.85 stable release (LP: #1714298)
    - af_key: do not use GFP_KERNEL in atomic contexts
    - dccp: purge write queue in dccp_destroy_sock()
    - dccp: defer ccid_hc_tx_delete() at dismantle time
    - ipv4: fix NULL dereference in free_fib_info_rcu()
    - net_sched/sfq: update hierarchical backlog when drop packet
    - ipv4: better IP_MAX_MTU enforcement
    - sctp: fully initialize the IPv6 address in sctp_v6_to_addr()
    - tipc: fix use-after-free
    - ipv6: reset fn->rr_ptr when replacing route
    - ipv6: repair fib6 tree in failure case
    - tcp: when rearming RTO, if RTO time is in past then fire RTO ASAP
    - irda: do not leak initialized list.dev to userspace
    - net: sched: fix NULL pointer dereference when action calls some targets
    - net_sched: fix order of queue length updates in qdisc_replace()
    - mei: me: add broxton pci device ids
    - mei: me: add lewisburg device ids
    - Input: trackpoint - add new trackpoint firmware ID
    - Input: elan_i2c - add ELAN0602 ACPI ID to support Lenovo Yoga310
    - ALSA: core: Fix unexpected error at replacing user TLV
    - ALSA: hda - Add stereo mic quirk for Lenovo G50-70 (17aa:3978)
    - ARCv2: PAE40: Explicitly set MSB counterpart of SLC region ops addresses
    - i2c: designware: Fix system suspend
    - drm: Release driver tracking before making the object available again
    - drm/atomic: If the atomic check fails, return its value first
    - drm: rcar-du: lvds: Fix PLL frequency-related configuration
    - drm: rcar-du: lvds: Rename PLLEN bit to PLLON
    - drm: rcar-du: Fix crash in encoder failure error path
    - drm: rcar-du: Fix display timing controller parameter
    - drm: rcar-du: Fix H/V sync signal polarity configuration
    - tracing: Fix freeing of filter in create_filter() when set_str is false
    - cifs: Fix df output for users with quota limits
    - cifs: return ENAMETOOLONG for overlong names in cifs_open()/cifs_lookup()
    - nfsd: Limit end of page list when decoding NFSv4 WRITE
    - perf/core: Fix group {cpu,task} validation
    - Bluetooth: hidp: fix possible might sleep error in hidp_session_thread
    - Bluetooth: cmtp: fix possible might sleep error in cmtp_session
    - Bluetooth: bnep: fix possible might sleep error in bnep_session
    - binder: use group leader instead of open thread
    - binder: Use wake up hint for synchronous transactions.
    - ANDROID: binder: fix proc->tsk check.
    - iio: imu: adis16480: Fix acceleration scale factor for adis16480
    - iio: hid-sensor-trigger: Fix the race with user space powering up sensors
    - staging: rtl8188eu: add RNX-N150NUB support
    - ASoC: simple-card: don't fail if sysclk setting is not supported
    - ASoC: rsnd: disable SRC.out only when stop timing
    - ASoC: rsnd: avoid 
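
The entry for this bug in the changelog above disables
CONFIG_{DM, SCSI}_MQ_DEFAULT on s390x. A minimal sketch for checking the
resulting runtime defaults on an installed kernel might look like the
following. It assumes that the two options surface as the use_blk_mq module
parameters of scsi_mod and dm_mod under /sys/module; those paths and the
helper name blk_mq_defaults are assumptions for illustration and should be
verified on the target system.

```python
from pathlib import Path

# Assumed sysfs locations of the runtime switches whose defaults the two
# Kconfig options control; verify them on the kernel in question.
PARAMS = {
    "scsi_mod": "module/scsi_mod/parameters/use_blk_mq",
    "dm_mod": "module/dm_mod/parameters/use_blk_mq",
}


def blk_mq_defaults(sysfs_root="/sys"):
    """Return {module: True/False/None}; None means the file is unreadable."""
    result = {}
    for module, rel in PARAMS.items():
        path = Path(sysfs_root) / rel
        try:
            text = path.read_text().strip()
        except OSError:
            # Parameter not present (module not loaded or different kernel).
            result[module] = None
            continue
        # Boolean module parameters are commonly shown as "Y"/"N".
        result[module] = text in ("Y", "1")
    return result


if __name__ == "__main__":
    for module, enabled in blk_mq_defaults().items():
        print(module, enabled)
```

On a kernel carrying the config change one would expect both values to be
False unless blk-mq was enabled explicitly; the sysfs root is a parameter
only so the function can be exercised against a scratch directory.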

[Kernel-packages] [Bug 1670634] Re: blk-mq: possible deadlock on CPU hot(un)plug

2017-09-29 Thread Kleber Sacilotto de Souza
Hi @jacobi,

Thank you very much for verifying the fix!

Kleber

** Tags removed: verification-needed-xenial
** Tags added: verification-done-xenial

Status in Ubuntu on IBM z Systems:
  Fix Committed
Status in linux package in Ubuntu:
  Fix Committed
Status in linux source package in Xenial:
  Fix Committed


[Kernel-packages] [Bug 1670634] Re: blk-mq: possible deadlock on CPU hot(un)plug

2017-09-25 Thread Kleber Sacilotto de Souza
This bug is awaiting verification that the kernel in -proposed solves
the problem. Please test the kernel and update this bug with the
results. If the problem is solved, change the tag
'verification-needed-xenial' to 'verification-done-xenial'. If the
problem still exists, change the tag 'verification-needed-xenial' to
'verification-failed-xenial'.

If verification is not done within 5 working days from today, this fix
will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on
how to enable and use -proposed. Thank you!


** Tags added: verification-needed-xenial

Status in Ubuntu on IBM z Systems:
  Fix Committed
Status in linux package in Ubuntu:
  Fix Committed
Status in linux source package in Xenial:
  Fix Committed


[Kernel-packages] [Bug 1670634] Re: blk-mq: possible deadlock on CPU hot(un)plug

2017-09-19 Thread Frank Heimes
** Changed in: linux (Ubuntu)
   Status: Triaged => Fix Committed

** Changed in: ubuntu-z-systems
   Status: Triaged => Fix Committed

Status in Ubuntu on IBM z Systems:
  Fix Committed
Status in linux package in Ubuntu:
  Fix Committed
Status in linux source package in Xenial:
  Fix Committed


[Kernel-packages] [Bug 1670634] Re: blk-mq: possible deadlock on CPU hot(un)plug

2017-09-19 Thread Kleber Sacilotto de Souza
** Changed in: linux (Ubuntu Xenial)
   Status: New => Fix Committed

Status in Ubuntu on IBM z Systems:
  Triaged
Status in linux package in Ubuntu:
  Triaged
Status in linux source package in Xenial:
  Fix Committed


[Kernel-packages] [Bug 1670634] Re: blk-mq: possible deadlock on CPU hot(un)plug

2017-09-19 Thread Stefan Bader
** Also affects: linux (Ubuntu Xenial)
   Importance: Undecided
   Status: New

Status in Ubuntu on IBM z Systems:
  Triaged
Status in linux package in Ubuntu:
  Triaged
Status in linux source package in Xenial:
  New


[Kernel-packages] [Bug 1670634] Re: blk-mq: possible deadlock on CPU hot(un)plug

2017-09-13 Thread Juerg Haefliger
For the record, the following commit fixes the issue for later kernels
(>4.11). What we're seeing with the 4.4 kernel is most likely a
different issue though.


commit ba74b6f7fcc07355d087af6939712eed4a454821 (refs/bisect/new)
Author: Christoph Hellwig 
Date:   Thu Aug 24 18:07:02 2017 +0200

virtio_pci: fix cpu affinity support

Commit 0b0f9dc5 ("Revert "virtio_pci: use shared interrupts for
virtqueues"") removed the adjustment of the pre_vectors for the virtio
MSI-X vector allocation which was added in commit fb5e31d9 ("virtio:
allow drivers to request IRQ affinity when creating VQs"). This will
lead to an incorrect assignment of MSI-X vectors, and potential
deadlocks when offlining cpus.

Signed-off-by: Christoph Hellwig 
Fixes: 0b0f9dc5 ("Revert "virtio_pci: use shared interrupts for virtqueues")
Reported-by: YASUAKI ISHIMATSU 
Cc: sta...@vger.kernel.org
Signed-off-by: Michael S. Tsirkin 

diff --git a/drivers/virtio/virtio_pci_common.c b/drivers/virtio/virtio_pci_common.c
index 007a4f366086..1c4797e53f68 100644
--- a/drivers/virtio/virtio_pci_common.c
+++ b/drivers/virtio/virtio_pci_common.c
@@ -107,6 +107,7 @@ static int vp_request_msix_vectors(struct virtio_device *vdev, int nvectors,
 {
 	struct virtio_pci_device *vp_dev = to_vp_device(vdev);
 	const char *name = dev_name(&vp_dev->vdev.dev);
+	unsigned flags = PCI_IRQ_MSIX;
 	unsigned i, v;
 	int err = -ENOMEM;
 
@@ -126,10 +127,13 @@ static int vp_request_msix_vectors(struct virtio_device *vdev, int nvectors,
 					GFP_KERNEL))
 			goto error;
 
+	if (desc) {
+		flags |= PCI_IRQ_AFFINITY;
+		desc->pre_vectors++; /* virtio config vector */
+	}
+
 	err = pci_alloc_irq_vectors_affinity(vp_dev->pci_dev, nvectors,
-					     nvectors, PCI_IRQ_MSIX |
-					     (desc ? PCI_IRQ_AFFINITY : 0),
-					     desc);
+					     nvectors, flags, desc);
 	if (err < 0)
 		goto error;
 	vp_dev->msix_enabled = 1;
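
To make the effect of the pre_vectors adjustment easier to picture, here is a toy Python model. It is not kernel code: the function name `spread_vectors` and the round-robin assignment are assumptions made for illustration, and the kernel's real affinity spreading is more elaborate. It only shows how reserving one leading vector (the virtio config vector) shifts which vectors are tied to which CPU.

```python
def spread_vectors(nvectors, ncpus, pre_vectors):
    """Toy model (assumption, not the kernel algorithm).

    Returns {vector_index: cpu}. Vectors below pre_vectors are excluded
    from spreading and map to None; the rest are assigned round-robin.
    """
    assignment = {}
    for vec in range(nvectors):
        if vec < pre_vectors:
            assignment[vec] = None  # reserved, e.g. the virtio config vector
        else:
            assignment[vec] = (vec - pre_vectors) % ncpus
    return assignment


# One config vector plus two virtqueue vectors on a 2-CPU guest.
without_adjustment = spread_vectors(nvectors=3, ncpus=2, pre_vectors=0)
with_adjustment = spread_vectors(nvectors=3, ncpus=2, pre_vectors=1)

print("pre_vectors=0:", without_adjustment)
print("pre_vectors=1:", with_adjustment)
```

In this model, leaving pre_vectors at 0 treats the config vector as if it were a queue vector, so every queue vector ends up one slot off. That matches the "incorrect assignment of MSI-X vectors" described in the commit message, but only in a simplified form.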


[Kernel-packages] [Bug 1670634] Re: blk-mq: possible deadlock on CPU hot(un)plug

2017-09-12 Thread Juerg Haefliger
Ok, we will turn these options off for Xenial 4.4 and bring s390x in
line with the other architectures. Note that we *think* that the reason
that they're enabled for s390x is that we initially received a config
file from IBM that we used as a base.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1670634

Title:
  blk-mq: possible deadlock on CPU hot(un)plug

Status in Ubuntu on IBM z Systems:
  Triaged
Status in linux package in Ubuntu:
  Triaged

Bug description:
  == Comment: #0 - Carsten Jacobi  - 2017-03-07 03:35:31 ==
  I'm evaluating Ubuntu-Xenial on z for development purposes. The test system
  is installed in an LPAR with one FCP-LUN which is accessible by 4 paths (all
  paths are configured).
  The system hangs regularly when I make packages with "pdebuild" using the
  pbuilder packaging suite.
  The local Linux development team helped me out with a pre-analysis that I can
  post here (thanks a lot for that):

  With the default settings and under a certain workload,
  blk_mq seems to get into a presumed "deadlock".
  Possibly this happens on CPU hot(un)plug.

  After the I/O stalled, a dump was pulled manually.
  The following information is from the crash dump pre-analysis.

  $ zgetdump -i dump.0
  General dump info:
    Dump format: elf
    Version: 1
    UTS node name..: mclint
    UTS kernel release.: 4.4.0-65-generic
    UTS kernel version.: #86-Ubuntu SMP Thu Feb 23 17:54:37 UTC 2017
    System arch: s390x (64 bit)
    CPU count (online).: 2
    Dump memory range..: 8192 MB
  Memory map:
     - 0001b831afff (7043 MB)
    0001b831b000 - 0001 (1149 MB)

  Things look similar with HWE kernel ubuntu16.04-4.8.0-34.36~16.04.1.

        KERNEL: vmlinux.full
      DUMPFILE: dump.0
          CPUS: 2
          DATE: Fri Mar  3 14:31:07 2017
        UPTIME: 02:11:20
  LOAD AVERAGE: 13.00, 12.92, 11.37
         TASKS: 411
      NODENAME: mclint
       RELEASE: 4.4.0-65-generic
       VERSION: #86-Ubuntu SMP Thu Feb 23 17:54:37 UTC 2017
       MACHINE: s390x  (unknown Mhz)
        MEMORY: 7.8 GB
         PANIC: ""
           PID: 0
       COMMAND: "swapper/0"
          TASK: bad528  (1 of 2)  [THREAD_INFO: b78000]
           CPU: 0
         STATE: TASK_RUNNING (ACTIVE)
          INFO: no panic task found

  crash> dev -d
  MAJOR  GENDISK    NAME  REQUEST_QUEUE  TOTAL  ASYNC  SYNC        DRV
  ...
  8      1e1d6d800  sda   1e1d51210      0      23151  4294944145  N/A(MQ)
  8      1e4e06800  sdc   2081b18        0      23148  4294944148  N/A(MQ)
  8      1f07800    sdb   20c7568        0      23195  4294944101  N/A(MQ)
  8      1e4e06000  sdd   1e4e31210      0      23099  4294944197  N/A(MQ)
  252    1e1d6c800  dm-0  1e1d51b18      9      1      8           N/A(MQ)
  ...

  So both dm-mpath and sd have requests pending in their block multiqueue.
  The large SYNC values for the sd devices look strange; they seem to be the
  ASYNC values multiplied by -1 and formatted as unsigned 32-bit integers.
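
The wraparound reading of those numbers can be checked with a few lines of Python. This is only a sanity check on the figures quoted from `dev -d`; the helper name `as_unsigned32` is made up for the example, and nothing here inspects the dump itself.

```python
UINT32_MOD = 2 ** 32


def as_unsigned32(value):
    """Reinterpret a (possibly negative) integer as an unsigned 32-bit value."""
    return value % UINT32_MOD


# (ASYNC, SYNC) pairs as printed by "crash> dev -d" for the sd devices.
rows = {
    "sda": (23151, 4294944145),
    "sdc": (23148, 4294944148),
    "sdb": (23195, 4294944101),
    "sdd": (23099, 4294944197),
}

# For each device, does SYNC equal -ASYNC printed as unsigned 32-bit?
matches = {name: as_unsigned32(-a) == s for name, (a, s) in rows.items()}
print(matches)
```

All four sd rows satisfy SYNC = 2^32 - ASYNC, which is also why ASYNC + SYNC wraps to a TOTAL of 0 for those devices.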

  [0.798256] Linux version 4.4.0-65-generic (buildd@z13-011) (gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.4) ) #86-Ubuntu SMP Thu Feb 23 17:54:37 UTC 2017 (Ubuntu 4.4.0-65.86-generic 4.4.49)
  [0.798262] setup: Linux is running natively in 64-bit mode
  [0.798290] setup: Max memory size: 8192MB
  [0.798298] setup: Reserving 196MB of memory at 7996MB for crashkernel (System RAM: 7996MB)

  [0.836923] Kernel command line: root=/dev/mapper/mclint_vg-root rootflags=subvol=@ crashkernel=196M BOOT_IMAGE=0

  [ 5281.179428] INFO: task xfsaild/dm-11:1604 blocked for more than 120 seconds.
  [ 5281.179437]   Not tainted 4.4.0-65-generic #86-Ubuntu
  [ 5281.179438] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
  [ 5281.179440] xfsaild/dm-11   D 007bcf52 0  1604  2 0x
  [ 5281.179444]        0001e931c230 001a6964 0001e6f9b958 0001e6f9b9d8
                        0001e15795f0 0001e6f9b988 00ce8c00 0001ea805c70
                        0001ea805c00 00ba5ed0 0001e931c1d0 0001e1579b20
                        0001ea805c00 0001e15795f0 0001ea805c00
                        007d3978 007bc9f8 0001e6f9b9d8 0001e6f9ba40
  [ 5281.179454] Call Trace:
  [ 5281.179461] ([<007bc9f8>] __schedule+0x300/0x810)
  [ 5281.179462]  [<007bcf52>] schedule+0x4a/0xb0
  [ 5281.179465]  [<007c02aa>] schedule_timeout+0x232/0x2a8
  [ 5281.179466]  [<007bde50>] wait_for_common+0x110/0x1c8
  [ 5281.179472]  [<0017b602>] flush_work+0x42/0x58
  [ 5281.179564]  [<03ff805e14ba>] xlog_cil_force_lsn+0x7a/0x238 [xfs]
  [ 5281.179589]  [<03ff805dee82>] 

[Kernel-packages] [Bug 1670634] Re: blk-mq: possible deadlock on CPU hot(un)plug

2017-09-11 Thread Joseph Salisbury
** Tags removed: kernel-key
** Tags added: kernel-da-key


[Kernel-packages] [Bug 1670634] Re: blk-mq: possible deadlock on CPU hot(un)plug

2017-09-11 Thread Juerg Haefliger
The difference between the Xenial and SUSE kernel is that Xenial has:
  CONFIG_SCSI_MQ_DEFAULT=y
  CONFIG_DM_MQ_DEFAULT=y
but SUSE:
  # CONFIG_SCSI_MQ_DEFAULT is not set
  # CONFIG_DM_MQ_DEFAULT is not set

If I disable blk-mq in the Xenial kernel, the test passes.
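
A small Python sketch of that config comparison follows. The function name `mq_defaults` is hypothetical, and the two sample strings simply repeat the lines quoted above; on an Ubuntu system the real input would typically come from /boot/config-$(uname -r), which is an assumption about the reader's setup.

```python
def mq_defaults(config_text):
    """Report whether the two blk-mq default options are set to 'y'.

    Lines such as '# CONFIG_FOO is not set' count as disabled.
    """
    wanted = ("CONFIG_SCSI_MQ_DEFAULT", "CONFIG_DM_MQ_DEFAULT")
    result = {name: False for name in wanted}
    for line in config_text.splitlines():
        line = line.strip()
        for name in wanted:
            if line == name + "=y":
                result[name] = True
    return result


xenial_cfg = "CONFIG_SCSI_MQ_DEFAULT=y\nCONFIG_DM_MQ_DEFAULT=y\n"
suse_cfg = (
    "# CONFIG_SCSI_MQ_DEFAULT is not set\n"
    "# CONFIG_DM_MQ_DEFAULT is not set\n"
)

print("Xenial:", mq_defaults(xenial_cfg))
print("SUSE:  ", mq_defaults(suse_cfg))
```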

The easiest 'fix' would be to simply disable blk-mq. This can be
accomplished via the kernel command-line parameters
scsi_mod.use_blk_mq=0 and dm_mod.use_blk_mq=0.
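
As a rough way to confirm that a given boot actually carries the workaround, the command line can be checked for both parameters. The sketch below is an illustration only: `blk_mq_disabled` is a made-up helper, it only recognises the literal value 0 used above, and the sample string is the command line from the bug description with the two parameters appended.

```python
def blk_mq_disabled(cmdline):
    """Return, per module, whether <module>.use_blk_mq=0 is on the command line."""
    params = dict(
        token.split("=", 1) for token in cmdline.split() if "=" in token
    )
    return {
        module: params.get(module + ".use_blk_mq") == "0"
        for module in ("scsi_mod", "dm_mod")
    }


original = (
    "root=/dev/mapper/mclint_vg-root rootflags=subvol=@ "
    "crashkernel=196M BOOT_IMAGE=0"
)
with_workaround = original + " scsi_mod.use_blk_mq=0 dm_mod.use_blk_mq=0"

print(blk_mq_disabled(original))
print(blk_mq_disabled(with_workaround))
```

On a running system the string would normally be read from /proc/cmdline.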

I also noticed that s390x is the only architecture where these options
are enabled in the Xenial kernel. Is there a specific requirement for
this?


[Kernel-packages] [Bug 1670634] Re: blk-mq: possible deadlock on CPU hot(un)plug

2017-09-06 Thread Juerg Haefliger
Test passes with 4.13 and 4.12 and SUSE's SLE12 SP2 4.4.21-69-generic kernel.
Test fails with 4.11.


** Changed in: linux (Ubuntu)
 Assignee: (unassigned) => Juerg Haefliger (juergh)


[Kernel-packages] [Bug 1670634] Re: blk-mq: possible deadlock on CPU hot(un)plug

2017-09-04 Thread Juerg Haefliger
I'm able to reproduce the issue locally.


[Kernel-packages] [Bug 1670634] Re: blk-mq: possible deadlock on CPU hot(un)plug

2017-09-02 Thread Juerg Haefliger
Are you running specific CPU (un)plug tests in parallel with pdebuild?
Can you post the contents of /etc/cpuplugd.conf?
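
If no explicit hotplug test is running, one way to see whether cpuplugd is changing the CPU set during a build is to sample the online-CPU list periodically. The sketch below only shows the parsing step for the range-list format used by /sys/devices/system/cpu/online; the helper name `parse_cpu_list` is an assumption for illustration, and the sampling loop is left out.

```python
def parse_cpu_list(text):
    """Parse a CPU range list such as '0-1,3' into a list of CPU numbers."""
    cpus = []
    for part in text.strip().split(","):
        if not part:
            continue  # empty input or stray comma
        if "-" in part:
            first, last = part.split("-", 1)
            cpus.extend(range(int(first), int(last) + 1))
        else:
            cpus.append(int(part))
    return cpus


# Example strings in the sysfs format; on a live system the text would be
# read from /sys/devices/system/cpu/online.
print(parse_cpu_list("0-1\n"))
print(parse_cpu_list("0,2-3"))
```

Comparing successive samples taken while pdebuild runs would show whether CPUs go offline and online around the time the I/O stalls.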


[Kernel-packages] [Bug 1670634] Re: blk-mq: possible deadlock on CPU hot(un)plug

2017-08-29 Thread Juerg Haefliger
Sorry for the late reply, I'm just getting around to looking at this. Yes,
I do have access to the folder now, thanks!

Questions: Is this setup currently working with another distribution and
you're just experiencing issues when running Ubuntu? If so, what's that
other distro and kernel? Also, can you give me a high-level overview of
your storage architecture?

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1670634

Title:
  blk-mq: possible deadlock on CPU hot(un)plug

Status in Ubuntu on IBM z Systems:
  Triaged
Status in linux package in Ubuntu:
  Triaged

Bug description:
  == Comment: #0 - Carsten Jacobi  - 2017-03-07 03:35:31 ==
  I'm evaluating Ubuntu-Xenial on z for development purposes, the test system 
is installed in an LPAR with one FCP-LUN which is accessable by 4 pathes (all 
pathes are configured).
  The system hangs regularly when I make packages with "pdebuild" using the 
pbuilder packaging suit.
  The local Linux development team helped me out with a pre-analysis that I can 
post here (thanks a lot for that):

  With the default settings and under a certain workload,
  blk_mq seems to get into a presumed "deadlock".
  Possibly this happens on CPU hot(un)plug.

  After the I/O stalled, a dump was pulled manually.
  The following information is from the crash dump pre-analysis.

  $ zgetdump -i dump.0
  General dump info:
Dump format: elf
Version: 1
UTS node name..: mclint
UTS kernel release.: 4.4.0-65-generic
UTS kernel version.: #86-Ubuntu SMP Thu Feb 23 17:54:37 UTC 2017
System arch: s390x (64 bit)
CPU count (online).: 2
Dump memory range..: 8192 MB
  Memory map:
 - 0001b831afff (7043 MB)
0001b831b000 - 0001 (1149 MB)

  Things look similarly with HWE kernel ubuntu16.04-4.8.0-34.36~16.04.1.

KERNEL: vmlinux.full
  DUMPFILE: dump.0
  CPUS: 2
  DATE: Fri Mar  3 14:31:07 2017
UPTIME: 02:11:20
  LOAD AVERAGE: 13.00, 12.92, 11.37
 TASKS: 411
  NODENAME: mclint
   RELEASE: 4.4.0-65-generic
   VERSION: #86-Ubuntu SMP Thu Feb 23 17:54:37 UTC 2017
   MACHINE: s390x  (unknown Mhz)
MEMORY: 7.8 GB
 PANIC: ""
   PID: 0
   COMMAND: "swapper/0"
  TASK: bad528  (1 of 2)  [THREAD_INFO: b78000]
   CPU: 0
 STATE: TASK_RUNNING (ACTIVE)
  INFO: no panic task found

  crash> dev -d
  MAJOR GENDISK    NAME  REQUEST_QUEUE  TOTAL  ASYNC  SYNC        DRV
  ...
      8 1e1d6d800  sda   1e1d51210          0  23151  4294944145  N/A(MQ)
      8 1e4e06800  sdc   2081b180              23148  4294944148  N/A(MQ)
      8 1f07800    sdb   20c75680              23195  4294944101  N/A(MQ)
      8 1e4e06000  sdd   1e4e31210          0  23099  4294944197  N/A(MQ)
    252 1e1d6c800  dm-0  1e1d51b18          9      1  8           N/A(MQ)
  ...

  So both dm-mpath and sd have requests pending in their block multiqueue.
  The large SYNC numbers for the sd devices look strange; they seem to be the
ASYNC values multiplied by -1 and printed as unsigned integers.
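
The wraparound guessed at above can be checked numerically. The following Python sketch is not part of the original analysis; it assumes the SYNC column is printed as an unsigned 32-bit integer (the helper name as_unsigned_32 is made up for this illustration) and negates the ASYNC values taken from the dev -d output.

```python
# Assumption: the counter is printed as an unsigned 32-bit value.
MASK_32 = 0xFFFFFFFF


def as_unsigned_32(value: int) -> int:
    """Return value as it would print from an unsigned 32-bit field."""
    return value & MASK_32


# ASYNC counts copied from the dev -d listing in the bug description.
async_counts = {"sda": 23151, "sdc": 23148, "sdb": 23195, "sdd": 23099}

# Negate each ASYNC count and reinterpret it as unsigned 32-bit.
sync_as_printed = {
    name: as_unsigned_32(-count) for name, count in async_counts.items()
}

for name, printed in sync_as_printed.items():
    print(name, async_counts[name], printed)
```

For all four sd devices the computed value equals the number in the SYNC column, which is consistent with the counters having gone negative rather than with billions of requests actually pending.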

  [0.798256] Linux version 4.4.0-65-generic (buildd@z13-011) (gcc version 
5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.4) ) #86-Ubuntu SMP Thu Feb 23 
17:54:37 UTC 2017 (Ubuntu 4.4.0-65.86-generic 4.4.49)
  [0.798262] setup: Linux is running natively in 64-bit mode
  [0.798290] setup: Max memory size: 8192MB
  [0.798298] setup: Reserving 196MB of memory at 7996MB for crashkernel 
(System RAM: 7996MB)

  [0.836923] Kernel command line: root=/dev/mapper/mclint_vg-root
  rootflags=subvol=@ crashkernel=196M BOOT_IMAGE=0

  [ 5281.179428] INFO: task xfsaild/dm-11:1604 blocked for more than 120 
seconds.
  [ 5281.179437]   Not tainted 4.4.0-65-generic #86-Ubuntu
  [ 5281.179438] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables 
this message.
  [ 5281.179440] xfsaild/dm-11   D 007bcf52 0  1604  2 
0x
  [ 5281.179444]0001e931c230 001a6964 0001e6f9b958 
0001e6f9b9d8
0001e15795f0 0001e6f9b988 00ce8c00 
0001ea805c70
0001ea805c00 00ba5ed0 0001e931c1d0 
0001e1579b20
0001ea805c00 0001e15795f0 0001ea805c00 

007d3978 007bc9f8 0001e6f9b9d8 
0001e6f9ba40
  [ 5281.179454] Call Trace:
  [ 5281.179461] ([<007bc9f8>] __schedule+0x300/0x810)
  [ 5281.179462]  [<007bcf52>] schedule+0x4a/0xb0
  [ 5281.179465]  [<007c02aa>] schedule_timeout+0x232/0x2a8
  [ 5281.179466]  [<007bde50>] wait_for_common+0x110/0x1c8
  [ 5281.179472]  [<0017b602>] flush_work+0x42/0x58
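  [ 5281.179564]  [<03ff805e14ba>] xlog_cil_force_lsn+0x7a/0x238 [xfs]
  [ 5281.179589]  [<03ff805dee82>] _xfs_log_force+0x9a/0x2e8 [xfs]
  [ 5281.179615]  [<03ff805df114>] xfs_log_force+0x44/0x100 [xfs]
  [ 5281.179640]  [<03ff805ec668>] xfsaild+0x170/0x798 [xfs]
  [ 5281.179643]  [<0018335a>]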

[Kernel-packages] [Bug 1670634] Re: blk-mq: possible deadlock on CPU hot(un)plug

2017-08-22 Thread Juerg Haefliger
I've tried to take a look at the dump files, but the BOX link
referenced in comment #4 doesn't work for me. Either I don't have access
or it has been removed.

Status in Ubuntu on IBM z Systems:
  Triaged
Status in linux package in Ubuntu:
  Triaged

[Kernel-packages] [Bug 1670634] Re: blk-mq: possible deadlock on CPU hot(un)plug

2017-07-14 Thread Joseph Salisbury
The purpose of testing v4.12-rc7 is to narrow down the last kernel
version that had the hang bug (the bad kernel) and the first kernel
version that did not (the good kernel).  This will allow us to identify
the exact commit that fixes the hang bug.  This can be accomplished by
performing a "Reverse" bisect[0].  Once we know the commit that fixes
the bug, we can SRU it to all the previous Ubuntu releases.
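
The search that a reverse bisect performs can be sketched as a binary search over an ordered list of kernel builds. This is only an illustration in Python: the build names, the hangs() callback, and the set of failing builds are hypothetical, and it assumes that every build after the fix stays good.

```python
from typing import Callable, Sequence


def first_fixed_build(builds: Sequence[str],
                      hangs: Callable[[str], bool]) -> str:
    """Return the first build in which the hang no longer occurs.

    Assumes builds[0] is known bad, builds[-1] is known good, and the
    bug does not come back once it has been fixed.
    """
    lo, hi = 0, len(builds) - 1  # lo: last known bad, hi: first known good
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if hangs(builds[mid]):
            lo = mid  # still hangs, so the fix comes later
        else:
            hi = mid  # already fixed, so the fix is here or earlier
    return builds[hi]


# Hypothetical example data, invented purely for illustration.
builds = [f"build-{i}" for i in range(6)]
still_hanging = {"build-0", "build-1", "build-2", "build-3"}

result = first_fixed_build(builds, lambda b: b in still_hanging)
print(result)
```

Each test halves the remaining range, which is why pinning down one known-bad and one known-good kernel first keeps the number of test builds small.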


Are you not able to test for the hang bug without compiling the DKMS-OpenAFS 
packages?  If so, did they compile okay when you tested v4.12-rc8?


[0] https://wiki.ubuntu.com/Kernel/KernelBisection

Status in Ubuntu on IBM z Systems:
  Triaged
Status in linux package in Ubuntu:
  Triaged

[Kernel-packages] [Bug 1670634] Re: blk-mq: possible deadlock on CPU hot(un)plug

2017-07-10 Thread Joseph Salisbury
Thanks for the update.  Can you test v4.11-rc7?  It is available from:
http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.11-rc7/

We can perform a "Reverse" kernel bisect if we can identify the last bad
kernel version and the first good one.

Status in Ubuntu on IBM z Systems:
  Triaged
Status in linux package in Ubuntu:
  Triaged

[Kernel-packages] [Bug 1670634] Re: blk-mq: possible deadlock on CPU hot(un)plug

2017-07-07 Thread Joseph Salisbury
@jac...@de.ibm.com

Can you confirm whether v4.11-rc8 fixed the bug?  Per comment #24 it
is unclear whether it does.

If we find that it is fixed in the v4.11-rc8 kernel, or any other newer
kernel, we can perform a "Reverse" bisect to identify the commit that
fixes the bug, then backport it to prior releases.

Status in Ubuntu on IBM z Systems:
  Triaged
Status in linux package in Ubuntu:
  Triaged

[Kernel-packages] [Bug 1670634] Re: blk-mq: possible deadlock on CPU hot(un)plug

2017-06-12 Thread Joseph Salisbury
** Tags removed: kernel-da-key
** Tags added: kernel-key

Status in Ubuntu on IBM z Systems:
  Triaged
Status in linux package in Ubuntu:
  Triaged

[Kernel-packages] [Bug 1670634] Re: blk-mq: possible deadlock on CPU hot(un)plug

2017-06-08 Thread Frank Heimes
I copied over the latest 4_11_0-041100rc8 dump from IBM Box to our Canonical 
private file share into my home:
~fheimes/mclint_20170607_kernel_4_11_0-041100rc8_without_openafs.dump.bz2

Status in Ubuntu on IBM z Systems:
  Triaged
Status in linux package in Ubuntu:
  Triaged

[Kernel-packages] [Bug 1670634] Re: blk-mq: possible deadlock on CPU hot(un)plug

2017-06-06 Thread Frank Heimes
** Changed in: ubuntu-z-systems
   Status: New => Triaged

Status in Ubuntu on IBM z Systems:
  Triaged
Status in linux package in Ubuntu:
  Triaged

[Kernel-packages] [Bug 1670634] Re: blk-mq: possible deadlock on CPU hot(un)plug

2017-06-05 Thread Joseph Salisbury
** Tags removed: kernel-key
** Tags added: kernel-da-key

Title:
  blk-mq: possible deadlock on CPU hot(un)plug

Status in Ubuntu on IBM z Systems:
  New
Status in linux package in Ubuntu:
  Triaged

[Kernel-packages] [Bug 1670634] Re: blk-mq: possible deadlock on CPU hot(un)plug

2017-04-26 Thread Joseph Salisbury
@jac...@de.ibm.com  It would be good to know if this bug is already
fixed in the mainline kernel.  Would it be possible for you to test
4.11-rc8?  It can be downloaded from:

http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.11-rc8/

Title:
  blk-mq: possible deadlock on CPU hot(un)plug

Status in Ubuntu on IBM z Systems:
  New
Status in linux package in Ubuntu:
  Triaged

[Kernel-packages] [Bug 1670634] Re: blk-mq: possible deadlock on CPU hot(un)plug

2017-04-24 Thread Joseph Salisbury
Is there an update on this bug?

Title:
  blk-mq: possible deadlock on CPU hot(un)plug

Status in Ubuntu on IBM z Systems:
  New
Status in linux package in Ubuntu:
  Triaged

[Kernel-packages] [Bug 1670634] Re: blk-mq: possible deadlock on CPU hot(un)plug

2017-04-18 Thread Joseph Salisbury
** Changed in: linux (Ubuntu)
   Status: New => Triaged

Title:
  blk-mq: possible deadlock on CPU hot(un)plug

Status in Ubuntu on IBM z Systems:
  New
Status in linux package in Ubuntu:
  Triaged

[Kernel-packages] [Bug 1670634] Re: blk-mq: possible deadlock on CPU hot(un)plug

2017-04-10 Thread Frank Heimes
I copied the latest 4.4.0-72 dump from IBM Box to my home directory on our 
Canonical private file share:
/~fheimes/mclint_20170406_kernel_4_4_0-72_without_openafs.dump.bz2
also reachable via https ...

Title:
  blk-mq: possible deadlock on CPU hot(un)plug

Status in Ubuntu on IBM z Systems:
  New
Status in linux package in Ubuntu:
  New

[Kernel-packages] [Bug 1670634] Re: blk-mq: possible deadlock on CPU hot(un)plug

2017-03-30 Thread Joseph Salisbury
** Changed in: linux (Ubuntu)
   Importance: Undecided => High

** Tags added: kernel-key

Title:
  blk-mq: possible deadlock on CPU hot(un)plug

Status in Ubuntu on IBM z Systems:
  New
Status in linux package in Ubuntu:
  New

[Kernel-packages] [Bug 1670634] Re: blk-mq: possible deadlock on CPU hot(un)plug

2017-03-09 Thread Tim Gardner
An x86-64 fixdep is an artifact of cross compiling.

Title:
  blk-mq: possible deadlock on CPU hot(un)plug

Status in Ubuntu on IBM z Systems:
  New
Status in linux package in Ubuntu:
  New

[Kernel-packages] [Bug 1670634] Re: blk-mq: possible deadlock on CPU hot(un)plug

2017-03-08 Thread Douglas Miller
Carsten, I am currently thinking there are two possibilities here.
Maybe three.

1) The fix I submitted is not in the kernel(s) you are running.
2) The s390 compiler does not produce the necessary code to implicitly convert 
long int to bool.
3) You are hitting a different bug that just happens to look the same.

For #2, a simple compiler test could be done to check what code is
produced when assigning a long int value to a bool (GCC _Bool). If you
want to pursue that let me know. I am not familiar with s390 object
code, so we might need someone to interpret the objdump. As far as I can
tell, the s390 Linux kernel does use GCC _Bool for the data type "bool",
so it would then be a matter of what code the s390 GCC produces in this
case.
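
A minimal sketch of the compiler test suggested for #2. The function names are hypothetical and chosen only for illustration. The idea is that C99 defines conversion to _Bool as a comparison against zero, so a correct compiler must return true even for a long whose low-order bits are all zero, whereas a truncating conversion would return 0. The contrast function assumes an 8-bit char.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Conversion to bool (GCC _Bool): must yield 1 for any nonzero value. */
bool long_to_bool(long value)
{
    bool flag = value;
    return flag;
}

/* Truncating conversion for contrast: keeps only the low-order 8 bits
 * (assuming CHAR_BIT == 8). */
unsigned char long_to_uchar(long value)
{
    return (unsigned char)value;
}

/* Values whose low byte is zero separate the two behaviours. */
void check_bool_conversion(void)
{
    assert(long_to_bool(0x100L) == true);
    assert(long_to_bool(LONG_MIN) == true);
    assert(long_to_bool(0L) == false);
    assert(long_to_uchar(0x100L) == 0);
}
```

Building this on s390x with `gcc -O2 -S` and reading the assembly emitted for `long_to_bool` would show whether the compiler tests the full long value against zero.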

Title:
  blk-mq: possible deadlock on CPU hot(un)plug

Status in Ubuntu on IBM z Systems:
  New
Status in linux package in Ubuntu:
  New

Bug description:
  == Comment: #0 - Carsten Jacobi  - 2017-03-07 03:35:31 ==
  I'm evaluating Ubuntu Xenial on z for development purposes. The test system 
is installed in an LPAR with one FCP LUN that is accessible via 4 paths (all 
paths are configured).
  The system hangs regularly when I build packages with "pdebuild" from the 
pbuilder packaging suite.
  The local Linux development team helped me out with a pre-analysis that I can 
post here (thanks a lot for that):

  With the default settings and under a certain workload,
  blk_mq seems to get into a presumed "deadlock".
  Possibly this happens on CPU hot(un)plug.

  After the I/O stalled, a dump was pulled manually.
  The following information is from the crash dump pre-analysis.

  $ zgetdump -i dump.0
  General dump info:
Dump format: elf
Version: 1
UTS node name..: mclint
UTS kernel release.: 4.4.0-65-generic
UTS kernel version.: #86-Ubuntu SMP Thu Feb 23 17:54:37 UTC 2017
System arch: s390x (64 bit)
CPU count (online).: 2
Dump memory range..: 8192 MB
  Memory map:
 - 0001b831afff (7043 MB)
0001b831b000 - 0001 (1149 MB)

  Things look similarly with HWE kernel ubuntu16.04-4.8.0-34.36~16.04.1.

KERNEL: vmlinux.full
  DUMPFILE: dump.0
  CPUS: 2
  DATE: Fri Mar  3 14:31:07 2017
UPTIME: 02:11:20
  LOAD AVERAGE: 13.00, 12.92, 11.37
 TASKS: 411
  NODENAME: mclint
   RELEASE: 4.4.0-65-generic
   VERSION: #86-Ubuntu SMP Thu Feb 23 17:54:37 UTC 2017
   MACHINE: s390x  (unknown Mhz)
MEMORY: 7.8 GB
 PANIC: ""
   PID: 0
   COMMAND: "swapper/0"
  TASK: bad528  (1 of 2)  [THREAD_INFO: b78000]
   CPU: 0
 STATE: TASK_RUNNING (ACTIVE)
  INFO: no panic task found

  crash> dev -d
  MAJOR GENDISKNAME   REQUEST_QUEUE  TOTAL ASYNC  SYNC   DRV
  ...
  8 1e1d6d800  sda1e1d51210  0 23151 4294944145 
N/A(MQ)
  8 1e4e06800  sdc2081b180 23148 4294944148 
N/A(MQ)
  8 1f07800sdb20c75680 23195 4294944101 
N/A(MQ)
  8 1e4e06000  sdd1e4e31210  0 23099 4294944197 
N/A(MQ)
252 1e1d6c800  dm-0   1e1d51b18  9 1 8 
N/A(MQ)
  ...

  So both dm-mpath and sd have requests pending in their block multiqueue.
  The large numbers of sd look strange and seem to be the unsigned formatting 
of the values shown for async multiplied by -1.

  [0.798256] Linux version 4.4.0-65-generic (buildd@z13-011) (gcc version 
5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.4) ) #86-Ubuntu SMP Thu Feb 23 
17:54:37 UTC 2017 (Ubuntu 4.4.0-65.86-generic 4.4.49)
  [0.798262] setup: Linux is running natively in 64-bit mode
  [0.798290] setup: Max memory size: 8192MB
  [0.798298] setup: Reserving 196MB of memory at 7996MB for crashkernel 
(System RAM: 7996MB)

  [0.836923] Kernel command line: root=/dev/mapper/mclint_vg-root
  rootflags=subvol=@ crashkernel=196M BOOT_IMAGE=0

  [ 5281.179428] INFO: task xfsaild/dm-11:1604 blocked for more than 120 
seconds.
  [ 5281.179437]   Not tainted 4.4.0-65-generic #86-Ubuntu
  [ 5281.179438] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables 
this message.
  [ 5281.179440] xfsaild/dm-11   D 007bcf52 0  1604  2 
0x
  [ 5281.179444]0001e931c230 001a6964 0001e6f9b958 
0001e6f9b9d8
0001e15795f0 0001e6f9b988 00ce8c00 
0001ea805c70
0001ea805c00 00ba5ed0 0001e931c1d0 
0001e1579b20
0001ea805c00 0001e15795f0 0001ea805c00 

007d3978 007bc9f8 0001e6f9b9d8 

[Kernel-packages] [Bug 1670634] Re: blk-mq: possible deadlock on CPU hot(un)plug

2017-03-08 Thread Douglas Miller
I'm confused about the release mechanics, I guess. Looking at the git
repository, I see tag "Ubuntu-4.4.0-65.86" (for example) and that tag
commit does contain the fix. Is it then possible for a kernel labeled
"4.4.0-65-generic #86" to NOT contain that patch? Am I making a gross
assumption that these tags reflect what was released?

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1670634

Title:
  blk-mq: possible deadlock on CPU hot(un)plug

Status in Ubuntu on IBM z Systems:
  New
Status in linux package in Ubuntu:
  New

[Kernel-packages] [Bug 1670634] Re: blk-mq: possible deadlock on CPU hot(un)plug

2017-03-08 Thread Tim Gardner
Due to an emergency CVE rebase, that patch still hasn't made it into the
wild. Here is a test kernel that definitely has the patch:

http://kernel.ubuntu.com/~rtg/lp1670634/

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1670634

Title:
  blk-mq: possible deadlock on CPU hot(un)plug

Status in Ubuntu on IBM z Systems:
  New
Status in linux package in Ubuntu:
  New

[Kernel-packages] [Bug 1670634] Re: blk-mq: possible deadlock on CPU hot(un)plug

2017-03-07 Thread Tim Gardner
Douglas - The patch referred to in LP #1662673 ("percpu-refcount: fix
reference leak during percpu-atomic transition") is in
Ubuntu-4.4.0-65.86 which has yet to be released.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1670634

Title:
  blk-mq: possible deadlock on CPU hot(un)plug

Status in Ubuntu on IBM z Systems:
  New
Status in linux package in Ubuntu:
  New

[Kernel-packages] [Bug 1670634] Re: blk-mq: possible deadlock on CPU hot(un)plug

2017-03-07 Thread Douglas Miller
This looks a lot like the problem in LP #1662673, but that fix is
supposed to be in kernel 4.4.0-65-generic. Might be worth confirming
though. Or perhaps confirming that the fix actually works on the Z
architecture (depends on how the architecture/compiler handles 'bool').

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1670634

Title:
  blk-mq: possible deadlock on CPU hot(un)plug

Status in Ubuntu on IBM z Systems:
  New
Status in linux package in Ubuntu:
  New

[Kernel-packages] [Bug 1670634] Re: blk-mq: possible deadlock on CPU hot(un)plug

2017-03-07 Thread Frank Heimes
** Changed in: ubuntu-z-systems
 Assignee: (unassigned) => Canonical Kernel Team (canonical-kernel-team)

** Changed in: ubuntu-z-systems
   Importance: Undecided => High

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1670634

Title:
  blk-mq: possible deadlock on CPU hot(un)plug

Status in Ubuntu on IBM z Systems:
  New
Status in linux package in Ubuntu:
  New

[Kernel-packages] [Bug 1670634] Re: blk-mq: possible deadlock on CPU hot(un)plug

2017-03-07 Thread Frank Heimes
** Tags removed: bot-comment
** Tags added: s390x

** Also affects: ubuntu-z-systems
   Importance: Undecided
   Status: New

** Package changed: ubuntu => linux (Ubuntu)

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1670634

Title:
  blk-mq: possible deadlock on CPU hot(un)plug

Status in Ubuntu on IBM z Systems:
  New
Status in linux package in Ubuntu:
  New
