Peter,

I got it to crash again, this time with a nice kernel dump. The dump can
be fetched here:

http://www.rmoesbergen.nl/linux-image-3.2.0-34-generic.0.crash.gz

The crash itself looked like this:

Dec 21 11:07:32 ealxs00161 kernel: [63272.392812] sd 4:0:1:1: emc: ALUA failover mode detected
Dec 21 11:07:32 ealxs00161 kernel: [63272.392820] sd 4:0:1:1: emc: at SP B Port 1 (owned, default SP B)
Dec 21 11:07:32 ealxs00161 kernel: [63272.393180] sd 3:0:0:1: emc: ALUA failover mode detected
Dec 21 11:07:32 ealxs00161 kernel: [63272.393187] sd 3:0:0:1: emc: at SP B Port 0 (owned, default SP B)
Dec 21 11:10:36 ealxs00161 kernel: [63455.641431] qla2xxx [0000:07:00.0]-500b:3: LOOP DOWN detected (2 3 0 0).
Dec 21 11:10:52 ealxs00161 multipathd: sdf: remove path (uevent)
Dec 21 11:10:52 ealxs00161 kernel: [63471.548255]  rport-3:0-1: blocked FC remote port time out: removing target and saving binding
Dec 21 11:10:52 ealxs00161 kernel: [63471.676065]  rport-3:0-0: blocked FC remote port time out: removing target and saving binding
Dec 21 11:11:08 ealxs00161 cimserver[2079]: Authentication failed for user=root.
Dec 21 11:11:10 ealxs00161 cimserver[2079]: Authentication failed for user=root.
Dec 21 11:13:28 ealxs00161 kernel: [63627.745648] INFO: task jbd2/dm-1-8:1530 blocked for more than 120 seconds.
Dec 21 11:13:28 ealxs00161 kernel: [63627.746025] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Dec 21 11:13:28 ealxs00161 kernel: [63627.756371] jbd2/dm-1-8     D ffff8803aa11a620     0  1530      2 0x00000000
Dec 21 11:13:28 ealxs00161 kernel: [63627.756380]  ffff880416141ac0 0000000000000046 ffff880416141a60 ffff88042ee137c0
Dec 21 11:13:28 ealxs00161 kernel: [63627.756388]  ffff880416141fd8 ffff880416141fd8 ffff880416141fd8 00000000000137c0
Dec 21 11:13:28 ealxs00161 kernel: [63627.756395]  ffffffff81c0d020 ffff880415ef9700 ffff880416141a90 ffff88042ee14080
Dec 21 11:13:28 ealxs00161 kernel: [63627.756403] Call Trace:
Dec 21 11:13:28 ealxs00161 kernel: [63627.756416]  [<ffffffff81117230>] ? __lock_page+0x70/0x70
Dec 21 11:13:28 ealxs00161 kernel: [63627.756431]  [<ffffffff81659ebf>] schedule+0x3f/0x60
Dec 21 11:13:28 ealxs00161 kernel: [63627.756441]  [<ffffffff81659f6f>] io_schedule+0x8f/0xd0
Dec 21 11:13:28 ealxs00161 kernel: [63627.756451]  [<ffffffff8111723e>] sleep_on_page+0xe/0x20
Dec 21 11:13:28 ealxs00161 kernel: [63627.756460]  [<ffffffff8165a78f>] __wait_on_bit+0x5f/0x90
Dec 21 11:13:28 ealxs00161 kernel: [63627.756470]  [<ffffffff811173a8>] wait_on_page_bit+0x78/0x80
Dec 21 11:13:28 ealxs00161 kernel: [63627.756481]  [<ffffffff8108ad60>] ? autoremove_wake_function+0x40/0x40
Dec 21 11:13:28 ealxs00161 kernel: [63627.756492]  [<ffffffff811174bc>] filemap_fdatawait_range+0x10c/0x1a0
Dec 21 11:13:28 ealxs00161 kernel: [63627.756503]  [<ffffffff8111757b>] filemap_fdatawait+0x2b/0x30
Dec 21 11:13:28 ealxs00161 kernel: [63627.756516]  [<ffffffff81260ea0>] journal_finish_inode_data_buffers+0x70/0x170
Dec 21 11:13:28 ealxs00161 kernel: [63627.756528]  [<ffffffff81261795>] jbd2_journal_commit_transaction+0x665/0x1240
Dec 21 11:13:28 ealxs00161 kernel: [63627.756538]  [<ffffffff8108ad20>] ? add_wait_queue+0x60/0x60
Dec 21 11:13:28 ealxs00161 kernel: [63627.756548]  [<ffffffff8126603b>] kjournald2+0xbb/0x220
Dec 21 11:13:28 ealxs00161 kernel: [63627.756557]  [<ffffffff8108ad20>] ? add_wait_queue+0x60/0x60
Dec 21 11:13:28 ealxs00161 kernel: [63627.756566]  [<ffffffff81265f80>] ? commit_timeout+0x10/0x10
Dec 21 11:13:28 ealxs00161 kernel: [63627.756575]  [<ffffffff8108a27c>] kthread+0x8c/0xa0
Dec 21 11:13:28 ealxs00161 kernel: [63627.756587]  [<ffffffff81666534>] kernel_thread_helper+0x4/0x10
Dec 21 11:13:28 ealxs00161 kernel: [63627.756596]  [<ffffffff8108a1f0>] ? flush_kthread_worker+0xa0/0xa0
Dec 21 11:13:28 ealxs00161 kernel: [63627.756606]  [<ffffffff81666530>] ? gs_change+0x13/0x13
Dec 21 11:13:28 ealxs00161 kernel: [63627.756612] Kernel panic - not syncing: hung_task: blocked tasks
Dec 21 11:13:28 ealxs00161 kernel: [63627.768425] Pid: 66, comm: khungtaskd Tainted: G        W    3.2.0-34-generic #53-Ubuntu
Dec 21 11:13:28 ealxs00161 kernel: [63627.779691] Call Trace:
Dec 21 11:13:28 ealxs00161 kernel: [63627.790147]  [<ffffffff81643128>] panic+0x91/0x1a4
Dec 21 11:13:28 ealxs00161 kernel: [63627.800888]  [<ffffffff810d78f2>] check_hung_task+0xb2/0xc0
Dec 21 11:13:28 ealxs00161 kernel: [63627.811370]  [<ffffffff810d7a1b>] check_hung_uninterruptible_tasks+0x11b/0x140
Dec 21 11:13:28 ealxs00161 kernel: [63627.821998]  [<ffffffff810d7a40>] ? check_hung_uninterruptible_tasks+0x140/0x140
Dec 21 11:13:28 ealxs00161 kernel: [63627.833715]  [<ffffffff810d7a8f>] watchdog+0x4f/0x60
Dec 21 11:13:28 ealxs00161 kernel: [63627.844538]  [<ffffffff8108a27c>] kthread+0x8c/0xa0
Dec 21 11:13:28 ealxs00161 kernel: [63627.855370]  [<ffffffff81666534>] kernel_thread_helper+0x4/0x10
Dec 21 11:13:28 ealxs00161 kernel: [63627.866367]  [<ffffffff8108a1f0>] ? flush_kthread_worker+0xa0/0xa0
Dec 21 11:13:28 ealxs00161 kernel: [63627.877343]  [<ffffffff81666530>] ? gs_change+0x13/0x13

Output of ps xa, just before the crash:

  PID TTY      STAT   TIME COMMAND
    1 ?        Ss     0:02 /sbin/init
    2 ?        S      0:00 [kthreadd]
    3 ?        S      0:01 [ksoftirqd/0]
    6 ?        S      0:01 [migration/0]
    7 ?        S      0:00 [watchdog/0]
    8 ?        S      0:00 [migration/1]
   10 ?        S      0:00 [ksoftirqd/1]
   12 ?        S      0:00 [watchdog/1]
   13 ?        S      0:01 [migration/2]
   15 ?        S      0:00 [ksoftirqd/2]
   16 ?        S      0:00 [watchdog/2]
   17 ?        S      0:00 [migration/3]
   19 ?        S      0:00 [ksoftirqd/3]
   20 ?        S      0:00 [watchdog/3]
   21 ?        S      0:00 [migration/4]
   23 ?        S      0:00 [ksoftirqd/4]
   24 ?        S      0:00 [watchdog/4]
   25 ?        S      0:00 [migration/5]
   27 ?        S      0:00 [ksoftirqd/5]
   28 ?        S      0:00 [watchdog/5]
   29 ?        S      0:00 [migration/6]
   30 ?        S      0:00 [kworker/6:0]
   31 ?        S      0:00 [ksoftirqd/6]
   32 ?        S      0:00 [watchdog/6]
   33 ?        S      0:00 [migration/7]
   35 ?        S      0:00 [ksoftirqd/7]
   36 ?        S      0:00 [watchdog/7]
   37 ?        S      0:00 [migration/8]
   38 ?        S      0:00 [kworker/8:0]
   39 ?        S      0:00 [ksoftirqd/8]
   40 ?        S      0:00 [watchdog/8]
   41 ?        S      0:00 [migration/9]
   42 ?        S      0:00 [kworker/9:0]
   43 ?        S      0:00 [ksoftirqd/9]
   44 ?        S      0:00 [watchdog/9]
   45 ?        S      0:00 [migration/10]
   47 ?        S      0:00 [ksoftirqd/10]
   48 ?        S      0:00 [watchdog/10]
   49 ?        S      0:00 [migration/11]
   51 ?        S      0:00 [ksoftirqd/11]
   52 ?        S      0:00 [watchdog/11]
   53 ?        S<     0:00 [cpuset]
   54 ?        S<     0:00 [khelper]
   55 ?        S      0:00 [kdevtmpfs]
   56 ?        S<     0:00 [netns]
   58 ?        S      0:00 [sync_supers]
   59 ?        S      0:00 [bdi-default]
   60 ?        S<     0:00 [kintegrityd]
   61 ?        S<     0:00 [kblockd]
   62 ?        S<     0:00 [ata_sff]
   63 ?        S      0:00 [khubd]
   64 ?        S<     0:00 [md]
   66 ?        S      0:00 [khungtaskd]
   67 ?        S      0:14 [kswapd0]
   68 ?        SN     0:00 [ksmd]
   69 ?        SN     0:00 [khugepaged]
   70 ?        S      0:00 [fsnotify_mark]
   71 ?        S      0:00 [ecryptfs-kthrea]
   72 ?        S<     0:00 [crypto]
   80 ?        S<     0:00 [kthrotld]
   81 ?        S      0:00 [scsi_eh_0]
   82 ?        S      0:00 [scsi_eh_1]
  104 ?        S<     0:00 [devfreq_wq]
  265 ?        S      0:00 [scsi_eh_2]
  267 ?        S      0:00 [hpsa]
  349 ?        S      0:00 [kworker/6:1]
  352 ?        S      0:00 [kworker/9:1]
  353 ?        S      0:00 [kworker/10:1]
  354 ?        S      0:00 [kworker/4:1]
  357 ?        S<     0:00 [kdmflush]
  365 ?        S      0:00 [jbd2/sda1-8]
  366 ?        S<     0:00 [ext4-dio-unwrit]
  458 ?        S      0:00 upstart-udev-bridge --daemon
  461 ?        Ss     0:00 /sbin/udevd --daemon
  547 ?        S<     0:00 [kmpathd]
  548 ?        S<     0:00 [kmpath_handlerd]
  626 ?        S<     0:00 [edac-poller]
  660 ?        S      0:00 [scsi_eh_3]
  702 ?        S<     0:00 [kpsmoused]
  861 ?        S<     0:00 [qla2xxx_3_dpc]
  862 ?        Ss     0:00 rpcbind -w
  864 ?        S<     0:00 [scsi_wq_3]
  877 ?        S      0:00 [scsi_eh_4]
  879 ?        Ss     0:00 rpc.statd -L
  888 ?        S<     0:00 [rpciod]
  891 ?        S<     0:00 [nfsiod]
  893 ?        S      0:00 upstart-socket-bridge --daemon
  895 ?        S<     0:00 [qla2xxx_4_dpc]
  896 ?        S<     0:00 [scsi_wq_4]
  902 ?        S<     0:00 [bond0]
 1054 ?        S<     0:00 [kdmflush]
 1109 ?        S<     0:00 [kdmflush]
 1490 ?        S      0:06 [jbd2/dm-2-8]
 1491 ?        S<     0:00 [ext4-dio-unwrit]
 1530 ?        D      0:42 [jbd2/dm-1-8]
 1531 ?        S<     0:00 [ext4-dio-unwrit]
 1573 ?        Ss     0:00 /usr/sbin/sshd -D
 1576 ?        Ss     0:00 rpc.idmapd
 1580 ?        Ss     0:00 dbus-daemon --system --fork --activation=upstart
 1603 ?        Sl     0:02 rsyslogd -c5
 1677 tty4     Ss+    0:00 /sbin/getty -8 38400 tty4
 1684 tty5     Ss+    0:00 /sbin/getty -8 38400 tty5
 1693 tty2     Ss+    0:00 /sbin/getty -8 38400 tty2
 1697 tty3     Ss+    0:00 /sbin/getty -8 38400 tty3
 1703 tty6     Ss+    0:00 /sbin/getty -8 38400 tty6
 1710 ?        Ss     0:00 acpid -c /etc/acpi/events -s /var/run/acpid.socket
 1712 ?        Ss     0:00 cron
 1715 ?        Ss     0:00 atd
 1727 ?        S      0:00 /usr/sbin/zabbix_agentd
 1729 ?        Ss     0:20 /usr/sbin/irqbalance
 1733 ?        Ssl    0:00 whoopsie
 1738 ?        Ssl    2:51 /usr/sbin/mysqld
 1745 ?        S      0:35 /usr/sbin/zabbix_agentd
 1746 ?        S      0:10 /usr/sbin/zabbix_agentd
 1747 ?        S      0:10 /usr/sbin/zabbix_agentd
 1748 ?        S      0:11 /usr/sbin/zabbix_agentd
 1749 ?        S      0:12 /usr/sbin/zabbix_agentd
 1750 ?        S      0:11 /usr/sbin/zabbix_agentd
 1751 ?        S      0:01 /usr/sbin/zabbix_agentd
 1919 ?        S      0:00 [kworker/5:2]
 2004 ?        S      0:00 [kworker/8:2]
 2010 ?        S      0:00 [kworker/11:2]
 2011 ?        S      0:00 [kworker/11:3]
 2024 ?        Sl     0:01 /opt/Unisphere/bin/hostagent -f /etc/Unisphere/agent.config
 2046 ?        SLl    0:08 /sbin/multipathd
 2079 ?        SLsl   0:19 /opt/microsoft/scx/bin/scxcimserver
 2177 tty1     Ss+    0:00 /sbin/getty -8 38400 tty1
 2179 ?        S      0:00 [flush-8:0]
 2180 ?        D      1:05 [flush-252:1]
 2181 ?        S      0:00 [flush-252:2]
 2257 ?        Ssl    0:28 /opt/microsoft/scx/bin/scxcimprovagt 0 9 12 root SCXCoreProviderModule
 2422 ?        Ss     0:02 /usr/sbin/ntpd -p /var/run/ntpd.pid -g -u 109:115
 2625 ?        Ss     0:00 sshd: ronaldm [priv]
 2824 ?        S      0:00 sshd: ronaldm@pts/0 
 2825 pts/0    Ss     0:00 -bash
 2924 pts/0    S      0:00 sudo -i
 2929 pts/0    S      0:00 -bash
 3194 ?        Ssl    0:01 /opt/microsoft/scx/bin/scxcimprovagt 0 8 14 scoma SCXUserCoreProviderModule
 3207 ?        S      0:17 [kworker/1:3]
 3614 ?        Ss     0:00 sshd: ronaldm [priv]
 3753 ?        S      0:00 sshd: ronaldm@pts/2 
 3754 pts/2    Ss     0:00 -bash
 3868 pts/2    S      0:00 sudo -i
 3874 pts/2    S      0:00 -bash
 4925 ?        S      0:00 [kworker/7:3]
 5248 ?        S      0:00 [kworker/u:2]
 5251 ?        S      0:00 [kworker/u:3]
 5348 ?        S      0:00 [kworker/10:2]
 5353 ?        S      0:00 [kworker/1:1]
 5361 ?        S      0:00 [kworker/0:1]
 5382 ?        S      0:00 [kworker/3:0]
 5383 ?        S      0:00 [kworker/3:3]
 5384 ?        S      0:00 [kworker/5:3]
 5387 ?        S      0:00 [kworker/0:5]
 5391 ?        S      0:00 [kworker/1:2]
 5691 ?        S      0:00 [kworker/7:4]
 6088 ?        S      0:00 [kworker/1:4]
 6221 ?        S      0:00 [kworker/4:2]
 6260 ?        S      0:00 [kworker/2:0]
 6261 ?        S      0:00 [kworker/2:4]
 6521 pts/0    D+     0:17 bonnie++ -d . -u root
 6655 ?        S      0:00 /sbin/udevd --daemon
 6656 ?        S      0:00 /sbin/udevd --daemon
 6910 ?        S      0:00 [kworker/1:0]
 6915 pts/2    R+     0:00 ps xa
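
For anyone reading along: the tasks in uninterruptible sleep (STAT starting with D) are easy to miss in a listing this long. Below is a rough sketch of a filter for them. The function name and the shortened sample are my own, and it assumes the default `ps xa` column order (PID, TTY, STAT, TIME, COMMAND).

```python
def blocked_tasks(ps_output):
    """Return (pid, command) for every row whose STAT starts with 'D'."""
    blocked = []
    for line in ps_output.splitlines():
        # Assumed column order: PID TTY STAT TIME COMMAND
        fields = line.split(None, 4)
        if len(fields) < 5 or not fields[0].isdigit():
            continue  # header, blank line, or a wrapped continuation line
        pid, _tty, stat, _time, command = fields
        if stat.startswith("D"):
            blocked.append((int(pid), command))
    return blocked


# Shortened sample copied from the listing above.
SAMPLE = """\
  PID TTY      STAT   TIME COMMAND
 1490 ?        S      0:06 [jbd2/dm-2-8]
 1530 ?        D      0:42 [jbd2/dm-1-8]
 2180 ?        D      1:05 [flush-252:1]
 6521 pts/0    D+     0:17 bonnie++ -d . -u root
 6915 pts/2    R+     0:00 ps xa
"""

if __name__ == "__main__":
    for pid, command in blocked_tasks(SAMPLE):
        print(pid, command)
```

On the full listing this picks out PIDs 1530, 2180 and 6521. The first two name dm-1 and 252:1, which is LUN-LOGGING according to the dmsetup output further down.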


Acceptatie - DB01 (root@ealxs00161):~# ps xa | grep multi
 2046 ?        SLl    0:08 /sbin/multipathd
 6917 pts/2    S+     0:00 grep --color=auto multi

Also, just before the crash:

Acceptatie - DB01 (root@ealxs00161):~# multipath -ll
LUN-DATABASE (36006016061e02e003cf1aca4ae07e211) dm-2 DGC,VRAID
size=200G features='1 queue_if_no_path' hwhandler='1 emc' wp=rw
|-+- policy='round-robin 0' prio=130 status=active
| |- 4:0:1:1 sdi 8:128 active ready running
| `- #:#:#:# -   #:#   active faulty running
`-+- policy='round-robin 0' prio=10 status=enabled
  |- 4:0:0:1 sde 8:64  active ready running
  `- #:#:#:# -   #:#   active faulty running
LUN-LOGGING (36006016061e02e000286c1adae07e211) dm-1 DGC,VRAID
size=20G features='0' hwhandler='1 emc' wp=rw
|-+- policy='round-robin 0' prio=130 status=active
| |- 4:0:0:0 sdd 8:48  active ready running
| `- #:#:#:# -   #:#   active faulty running
`-+- policy='round-robin 0' prio=10 status=enabled
  |- 4:0:1:0 sdh 8:112 active ready running
  `- #:#:#:# -   #:#   active faulty running

Output of dmsetup table -v before starting the tests:
Name:              vg-swap
State:             ACTIVE
Read Ahead:        256
Tables present:    LIVE
Open count:        2
Event number:      0
Major, minor:      252, 0
Number of targets: 1
UUID: LVM-BySGZfHLAZg250K7UjTxYBStGjTdkb2CE8b7q7HMxBUtJso72BPYfnAcLpxixYP4

0 3997696 linear 8:2 512

Name:              LUN-DATABASE
State:             ACTIVE
Read Ahead:        256
Tables present:    LIVE
Open count:        1
Event number:      4
Major, minor:      252, 2
Number of targets: 1
UUID: mpath-36006016061e02e003cf1aca4ae07e211

0 419430400 multipath 1 queue_if_no_path 1 emc 2 1 round-robin 0 2 1 8:128 1000 8:32 1000 round-robin 0 2 1 8:64 1000 8:96 1000

Name:              LUN-LOGGING
State:             ACTIVE
Read Ahead:        256
Tables present:    LIVE
Open count:        1
Event number:      4
Major, minor:      252, 1
Number of targets: 1
UUID: mpath-36006016061e02e000286c1adae07e211

0 41943040 multipath 1 queue_if_no_path 1 emc 2 1 round-robin 0 2 1 8:48 1000 8:80 1000 round-robin 0 2 1 8:112 1000 8:16 1000
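
In case the multipath table lines above are hard to read, here is a small sketch that splits one into its parts. It assumes the usual dm-multipath layout (feature count and features, hardware handler count and arguments, number of path groups and initial group, then per group a selector with its arguments followed by the paths). The function and dictionary key names are my own invention, not part of any tool.

```python
def parse_multipath_table(line):
    """Split a 'dmsetup table' multipath line into a nested dict."""
    tok = line.split()
    if tok[2] != "multipath":
        raise ValueError("not a multipath target")
    pos = 3

    def take(n):
        # Consume the next n tokens from the table line.
        nonlocal pos
        chunk = tok[pos:pos + n]
        pos += n
        return chunk

    features = take(int(take(1)[0]))
    hwhandler = take(int(take(1)[0]))
    num_groups, initial_group = (int(x) for x in take(2))
    groups = []
    for _ in range(num_groups):
        selector = take(1)[0]
        selector_args = take(int(take(1)[0]))
        num_paths, num_path_args = (int(x) for x in take(2))
        paths = [(take(1)[0], take(num_path_args)) for _ in range(num_paths)]
        groups.append({"selector": selector,
                       "selector_args": selector_args,
                       "paths": paths})
    return {"start_sector": int(tok[0]),
            "length_sectors": int(tok[1]),
            "features": features,
            "hwhandler": hwhandler,
            "initial_group": initial_group,
            "groups": groups}


# The LUN-LOGGING table line from above, joined onto one line.
LUN_LOGGING = ("0 41943040 multipath 1 queue_if_no_path 1 emc 2 1 "
               "round-robin 0 2 1 8:48 1000 8:80 1000 "
               "round-robin 0 2 1 8:112 1000 8:16 1000")

if __name__ == "__main__":
    table = parse_multipath_table(LUN_LOGGING)
    print(table["features"], table["hwhandler"])
    for group in table["groups"]:
        print(group["selector"], [dev for dev, _args in group["paths"]])
```

For LUN-LOGGING this gives two round-robin groups, 8:48 and 8:80 in the first and 8:112 and 8:16 in the second. The length of 41943040 sectors of 512 bytes works out to the 20G that multipath -ll reports.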


Output of lsscsi -lv before starting the tests:
[2:0:0:0]    storage HP       P420i            3.04  -
  state=running queue_depth=1020 scsi_level=6 type=12 device_blocked=0 timeout=0
  dir: /sys/bus/scsi/devices/2:0:0:0  [/sys/devices/pci0000:00/0000:00:02.2/0000:02:00.0/host2/target2:0:0/2:0:0:0]
[2:0:0:1]    disk    HP       LOGICAL VOLUME   3.04  /dev/sda
  state=running queue_depth=1020 scsi_level=6 type=0 device_blocked=0 timeout=30
  dir: /sys/bus/scsi/devices/2:0:0:1  [/sys/devices/pci0000:00/0000:00:02.2/0000:02:00.0/host2/target2:0:0/2:0:0:1]
[3:0:0:0]    disk    DGC      VRAID            0531  /dev/sdb
  state=running queue_depth=32 scsi_level=5 type=0 device_blocked=0 timeout=30
  dir: /sys/bus/scsi/devices/3:0:0:0  [/sys/devices/pci0000:00/0000:00:01.0/0000:07:00.0/host3/rport-3:0-0/target3:0:0/3:0:0:0]
[3:0:0:1]    disk    DGC      VRAID            0531  /dev/sdc
  state=running queue_depth=32 scsi_level=5 type=0 device_blocked=0 timeout=30
  dir: /sys/bus/scsi/devices/3:0:0:1  [/sys/devices/pci0000:00/0000:00:01.0/0000:07:00.0/host3/rport-3:0-0/target3:0:0/3:0:0:1]
[3:0:1:0]    disk    DGC      VRAID            0531  /dev/sdf
  state=running queue_depth=32 scsi_level=5 type=0 device_blocked=0 timeout=30
  dir: /sys/bus/scsi/devices/3:0:1:0  [/sys/devices/pci0000:00/0000:00:01.0/0000:07:00.0/host3/rport-3:0-1/target3:0:1/3:0:1:0]
[3:0:1:1]    disk    DGC      VRAID            0531  /dev/sdg
  state=running queue_depth=32 scsi_level=5 type=0 device_blocked=0 timeout=30
  dir: /sys/bus/scsi/devices/3:0:1:1  [/sys/devices/pci0000:00/0000:00:01.0/0000:07:00.0/host3/rport-3:0-1/target3:0:1/3:0:1:1]
[4:0:0:0]    disk    DGC      VRAID            0531  /dev/sdd
  state=running queue_depth=32 scsi_level=5 type=0 device_blocked=0 timeout=30
  dir: /sys/bus/scsi/devices/4:0:0:0  [/sys/devices/pci0000:00/0000:00:1c.0/0000:0a:00.0/host4/rport-4:0-0/target4:0:0/4:0:0:0]
[4:0:0:1]    disk    DGC      VRAID            0531  /dev/sde
  state=running queue_depth=32 scsi_level=5 type=0 device_blocked=0 timeout=30
  dir: /sys/bus/scsi/devices/4:0:0:1  [/sys/devices/pci0000:00/0000:00:1c.0/0000:0a:00.0/host4/rport-4:0-0/target4:0:0/4:0:0:1]
[4:0:1:0]    disk    DGC      VRAID            0531  /dev/sdh
  state=running queue_depth=32 scsi_level=5 type=0 device_blocked=0 timeout=30
  dir: /sys/bus/scsi/devices/4:0:1:0  [/sys/devices/pci0000:00/0000:00:1c.0/0000:0a:00.0/host4/rport-4:0-1/target4:0:1/4:0:1:0]
[4:0:1:1]    disk    DGC      VRAID            0531  /dev/sdi
  state=running queue_depth=32 scsi_level=5 type=0 device_blocked=0 timeout=30
  dir: /sys/bus/scsi/devices/4:0:1:1  [/sys/devices/pci0000:00/0000:00:1c.0/0000:0a:00.0/host4/rport-4:0-1/target4:0:1/4:0:1:1]

I hope this helps...

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1032550

Title:
  [multipath]  failed to get sysfs information

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/multipath-tools/+bug/1032550/+subscriptions

