Default Comment by Bridge

** Attachment added: "attach dmesg.out"
   https://bugs.launchpad.net/bugs/1696445/+attachment/4891263/+files/dmesg.out

** Changed in: ubuntu
     Assignee: (unassigned) => Ubuntu on IBM Power Systems Bug Triage (ubuntu-power-triage)

** Package changed: ubuntu => linux (Ubuntu)

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1696445

Title:
  OpenPower: Some multipaths temporarily have only a single path

Status in linux package in Ubuntu:
  New

Bug description:
  ==== State: Open by: nguyenp on 31 May 2017 15:46:14 ====

   Product Name          : OpenPOWER Firmware
   Product Version       : open-power-SMC-P8DTU-V2.00.GA2-20170126-prod
   Product Extra         :      op-build-3782262
   Product Extra         :      hostboot-7fdfb37
   Product Extra         :      occ-e6e194f
   Product Extra         :      skiboot-5.4.2
   Product Extra         :      linux-4.4.24-openpower1-9641b3a
   Product Extra         :      petitboot-v1.4.0-2f8598b
   Product Extra         :      p8dtu-xml-9a8fee2

  Cable configuration:
  ====================
  On this P8-Briggs system, I have 2 Seagate storage enclosures running
  in maximum configuration. Each enclosure holds 84 HDDs, for a total of
  168 HDDs across both enclosures.

  I connected 2 LSI 9300-8e SAS adapters to the 2 Seagate enclosures
  with alternate cabling for redundancy. The connections are shown
  below:

  Note:  Each Seagate enclosure has 2 I/O modules in the back.
         Both I/O modules of an enclosure see the same set of HDDs.

  Cable connection:

  SAS adapter #1:    port1  ----->  Seagate #1-A I/O module
                     port0  ----->  Seagate #2-B I/O module

  SAS adapter #2:    port1  ----->  Seagate #2-A I/O module
                     port0  ----->  Seagate #1-B I/O module
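
  The cabling above implies a target number of multipath maps and paths.
  A minimal sketch of that arithmetic follows. It assumes exactly one path
  per I/O module (two paths per drive); the function name and constants
  are illustrative and are not part of the original report.

```python
# Expected multipath totals for the cabling described in this report.
ENCLOSURES = 2              # two Seagate enclosures (from the report)
DRIVES_PER_ENCLOSURE = 84   # 84 HDDs per enclosure (from the report)
PATHS_PER_DRIVE = 2         # assumption: one path via I/O module A, one via B


def expected_counts(enclosures=ENCLOSURES,
                    drives=DRIVES_PER_ENCLOSURE,
                    paths=PATHS_PER_DRIVE):
    """Return (multipath maps, total sdX paths) for a fully built setup."""
    maps = enclosures * drives
    return maps, maps * paths


maps, paths = expected_counts()
print(maps, paths)  # prints "168 336"
```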

  Ubuntu 16.04.2:
  ===============

  - Running with new kernel Ubuntu 4.8.0-520-generic
  #550~16.04.1+bz154734 from Mauricio Faria De Oliveira.

  Problem Description:
  ====================
  This Briggs system is running the new Ubuntu 4.8.0-520-generic
  #550~16.04.1+bz154734 kernel, which has the fix for the multipath
  problem. Mauricio helped patch the system with this kernel last week to
  fix the multipath io_setup failure in LTCBug154734.

  This week, I scaled my test configuration up to the maximum
  configuration, 2x5U84_Enclosures,_MaxCfg_168HDDs, and hit a different
  issue: some multipaths have only a single path and no redundancy,
  while others have multiple paths and redundancy.

  == Comment: #13 - Paul Nguyen - 2017-06-01 15:19:58 ==

  - I agree with Mauricio that this is a timing problem.

  - I re-ran the test and noticed that it took more than 50 minutes
  after system reboot to discover all disks and build the multipaths
  correctly.

  - Taking this long is going to be a problem.

  - I have gathered all logs and am attaching them to the bug for
  Mauricio to review and confirm.

  - If there is a workaround or fix for a faster probe time, I will
  try it out.

  
  - Below is more information I captured:

  Checkpoint #1:
  ==============
  - System rebooted around 2pm (14:00).

  Checkpoint #2:
  ==============
  - It took several minutes for the first disk to be detected.

  root@smb1p1:~# dmesg -T | grep 'sd 1[78]:' |  grep 'Attached SCSI disk' | head
  [Thu Jun  1 14:06:48 2017] sd 17:0:1:0: [sdb] Attached SCSI disk
  [Thu Jun  1 14:06:51 2017] sd 17:0:2:0: [sdc] Attached SCSI disk
  [Thu Jun  1 14:06:53 2017] sd 17:0:3:0: [sdd] Attached SCSI disk
  [Thu Jun  1 14:06:57 2017] sd 17:0:4:0: [sde] Attached SCSI disk
  [Thu Jun  1 14:07:00 2017] sd 17:0:5:0: [sdf] Attached SCSI disk
  [Thu Jun  1 14:07:03 2017] sd 17:0:6:0: [sdg] Attached SCSI disk
  [Thu Jun  1 14:07:05 2017] sd 17:0:7:0: [sdh] Attached SCSI disk
  [Thu Jun  1 14:07:08 2017] sd 17:0:8:0: [sdi] Attached SCSI disk
  [Thu Jun  1 14:07:11 2017] sd 17:0:9:0: [sdj] Attached SCSI disk
  [Thu Jun  1 14:07:14 2017] sd 17:0:10:0: [sdk] Attached SCSI disk
  root@smb1p1:~# 

  ...

  root@smb1p1:~# multipath -ll|grep dm |wc -l
  103
  root@smb1p1:~# dmesg -T | grep 'sd 1[78]:' |  grep 'Attached SCSI disk' | tail
  [Thu Jun  1 14:18:30 2017] sd 17:0:100:0: [sdcr] Attached SCSI disk
  [Thu Jun  1 14:18:35 2017] sd 17:0:101:0: [sdcs] Attached SCSI disk
  [Thu Jun  1 14:18:40 2017] sd 17:0:102:0: [sdct] Attached SCSI disk
  [Thu Jun  1 14:18:44 2017] sd 17:0:103:0: [sdcu] Attached SCSI disk
  [Thu Jun  1 14:18:54 2017] sd 17:0:105:0: [sdcv] Attached SCSI disk
  [Thu Jun  1 14:18:59 2017] sd 17:0:106:0: [sdcw] Attached SCSI disk
  [Thu Jun  1 14:19:04 2017] sd 17:0:107:0: [sdcx] Attached SCSI disk
  [Thu Jun  1 14:19:09 2017] sd 17:0:108:0: [sdcy] Attached SCSI disk
  [Thu Jun  1 14:19:14 2017] sd 17:0:109:0: [sdcz] Attached SCSI disk
  [Thu Jun  1 14:19:19 2017] sd 17:0:110:0: [sdda] Attached SCSI disk
  root@smb1p1:~# 

  ...

  root@smb1p1:~# multipath -ll|grep dm |wc -l
  126
  root@smb1p1:~# dmesg -T | grep 'sd 1[78]:' |  grep 'Attached SCSI disk' | tail
  [Thu Jun  1 14:20:23 2017] sd 17:0:123:0: [sddn] Attached SCSI disk
  [Thu Jun  1 14:20:28 2017] sd 17:0:124:0: [sddo] Attached SCSI disk
  [Thu Jun  1 14:20:33 2017] sd 17:0:125:0: [sddp] Attached SCSI disk
  [Thu Jun  1 14:20:38 2017] sd 17:0:126:0: [sddq] Attached SCSI disk
  [Thu Jun  1 14:20:44 2017] sd 17:0:127:0: [sddr] Attached SCSI disk
  [Thu Jun  1 14:20:48 2017] sd 17:0:128:0: [sdds] Attached SCSI disk
  [Thu Jun  1 14:20:54 2017] sd 17:0:129:0: [sddt] Attached SCSI disk
  [Thu Jun  1 14:20:59 2017] sd 17:0:130:0: [sddu] Attached SCSI disk
  [Thu Jun  1 14:21:04 2017] sd 17:0:131:0: [sddv] Attached SCSI disk
  [Thu Jun  1 14:21:09 2017] sd 17:0:132:0: [sddw] Attached SCSI disk
  root@smb1p1:~# 

  ...

  root@smb1p1:~# multipath -ll|grep dm |wc -l
  142
  root@smb1p1:~# dmesg -T | grep 'sd 1[78]:' |  grep 'Attached SCSI disk' | tail
  [Thu Jun  1 14:21:54 2017] sd 17:0:141:0: [sdee] Attached SCSI disk
  [Thu Jun  1 14:21:58 2017] sd 17:0:142:0: [sdef] Attached SCSI disk
  [Thu Jun  1 14:22:04 2017] sd 17:0:143:0: [sdeg] Attached SCSI disk
  [Thu Jun  1 14:22:08 2017] sd 17:0:144:0: [sdeh] Attached SCSI disk
  [Thu Jun  1 14:22:14 2017] sd 17:0:145:0: [sdei] Attached SCSI disk
  [Thu Jun  1 14:22:18 2017] sd 17:0:146:0: [sdej] Attached SCSI disk
  [Thu Jun  1 14:22:24 2017] sd 17:0:147:0: [sdek] Attached SCSI disk
  [Thu Jun  1 14:22:29 2017] sd 17:0:148:0: [sdel] Attached SCSI disk
  [Thu Jun  1 14:22:34 2017] sd 17:0:149:0: [sdem] Attached SCSI disk
  [Thu Jun  1 14:22:39 2017] sd 17:0:150:0: [sden] Attached SCSI disk
  root@smb1p1:~# 

  ...

  root@smb1p1:~# multipath -ll|grep dm |wc -l
  163
  root@smb1p1:~# dmesg -T | grep 'sd 1[78]:' |  grep 'Attached SCSI disk' | tail
  [Thu Jun  1 14:23:48 2017] sd 17:0:164:0: [sdfa] Attached SCSI disk
  [Thu Jun  1 14:23:53 2017] sd 17:0:165:0: [sdfb] Attached SCSI disk
  [Thu Jun  1 14:23:58 2017] sd 17:0:166:0: [sdfc] Attached SCSI disk
  [Thu Jun  1 14:24:03 2017] sd 17:0:167:0: [sdfd] Attached SCSI disk
  [Thu Jun  1 14:24:08 2017] sd 17:0:168:0: [sdfe] Attached SCSI disk
  [Thu Jun  1 14:24:13 2017] sd 17:0:169:0: [sdff] Attached SCSI disk
  [Thu Jun  1 14:24:19 2017] sd 17:0:170:0: [sdfg] Attached SCSI disk
  [Thu Jun  1 14:24:23 2017] sd 17:0:171:0: [sdfh] Attached SCSI disk
  [Thu Jun  1 14:24:28 2017] sd 17:0:172:0: [sdfi] Attached SCSI disk
  [Thu Jun  1 14:24:33 2017] sd 17:0:173:0: [sdfj] Attached SCSI disk

  
  ...

  root@smb1p1:~# dmesg -T | grep 'sd 1[78]:' |  grep 'Attached SCSI disk' | tail
  [Thu Jun  1 14:24:03 2017] sd 17:0:167:0: [sdfd] Attached SCSI disk
  [Thu Jun  1 14:24:08 2017] sd 17:0:168:0: [sdfe] Attached SCSI disk
  [Thu Jun  1 14:24:13 2017] sd 17:0:169:0: [sdff] Attached SCSI disk
  [Thu Jun  1 14:24:19 2017] sd 17:0:170:0: [sdfg] Attached SCSI disk
  [Thu Jun  1 14:24:23 2017] sd 17:0:171:0: [sdfh] Attached SCSI disk
  [Thu Jun  1 14:24:28 2017] sd 17:0:172:0: [sdfi] Attached SCSI disk
  [Thu Jun  1 14:24:33 2017] sd 17:0:173:0: [sdfj] Attached SCSI disk
  [Thu Jun  1 14:24:38 2017] sd 17:0:174:0: [sdfk] Attached SCSI disk
  [Thu Jun  1 14:24:43 2017] sd 17:0:175:0: [sdfl] Attached SCSI disk
  [Thu Jun  1 14:24:48 2017] sd 17:0:176:0: [sdfm] Attached SCSI disk
  root@smb1p1:~# 

  
  root@smb1p1:~# date
  Thu Jun  1 14:27:03 CDT 2017
  root@smb1p1:~# multipath -ll | grep -c 'sd[a-z]\+'
  168
  root@smb1p1:~# 
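
  The probe rate in the dmesg excerpts above can be estimated from the
  timestamps. The following is a minimal sketch that assumes the
  `dmesg -T` line format shown in this report and an English (C) locale
  for the day and month names; the function names are hypothetical.

```python
import re
from datetime import datetime

# Matches lines such as:
# [Thu Jun  1 14:06:48 2017] sd 17:0:1:0: [sdb] Attached SCSI disk
LINE_RE = re.compile(
    r"\[(?P<ts>[^\]]+)\] sd (?P<hctl>\d+:\d+:\d+:\d+): "
    r"\[(?P<dev>sd[a-z]+)\] Attached SCSI disk"
)


def parse_attach_events(text):
    """Return a list of (timestamp, H:C:T:L, device) tuples, in log order."""
    events = []
    for line in text.splitlines():
        m = LINE_RE.search(line)
        if not m:
            continue
        # Collapse the double space that pads single-digit days.
        stamp = " ".join(m.group("ts").split())
        when = datetime.strptime(stamp, "%a %b %d %H:%M:%S %Y")
        events.append((when, m.group("hctl"), m.group("dev")))
    return events


def mean_interval_seconds(events):
    """Average gap between consecutive attach events, or None if < 2 events."""
    if len(events) < 2:
        return None
    span = (events[-1][0] - events[0][0]).total_seconds()
    return span / (len(events) - 1)


# Three lines copied from the first excerpt in this report.
SAMPLE = """\
[Thu Jun  1 14:06:48 2017] sd 17:0:1:0: [sdb] Attached SCSI disk
[Thu Jun  1 14:06:51 2017] sd 17:0:2:0: [sdc] Attached SCSI disk
[Thu Jun  1 14:06:53 2017] sd 17:0:3:0: [sdd] Attached SCSI disk
"""

events = parse_attach_events(SAMPLE)
print(len(events), mean_interval_seconds(events))
```

  Feeding it the full output of `dmesg -T` instead of the sample would
  give the average per-disk probe time for the whole boot.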

  
  Checkpoint #3:
  ==============

  - After 34 minutes, the multipath -ll command shows maps with only a
  single path and no redundancy.

  root@smb1p1:~# multipath -ll > multipath.log.06012017.afterReboot
  root@smb1p1:~# cat multipath.log.06012017.afterReboot |more

  35000c50086a3ca97 dm-161 IBM-ESXS,ST10000NM0226 E
  size=9.0T features='0' hwhandler='0' wp=rw
  `-+- policy='round-robin 0' prio=1 status=active
    `- 17:0:170:0 sdfg 130:32  active ready running
  35000c50086bae8bf dm-144 IBM-ESXS,ST10000NM0226 E
  size=9.0T features='0' hwhandler='0' wp=rw
  `-+- policy='round-robin 0' prio=1 status=active
    `- 17:0:152:0 sdep 129:16  active ready running
  35000c50086baa42f dm-143 IBM-ESXS,ST10000NM0226 E
  size=9.0T features='0' hwhandler='0' wp=rw
  `-+- policy='round-robin 0' prio=1 status=active
    `- 17:0:151:0 sdeo 129:0   active ready running
  ...
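
  Maps like the ones above can be picked out of a saved `multipath -ll`
  log automatically. This is a minimal sketch that assumes the output
  layout shown in this report (WWID first on the map line, no
  user-friendly alias); the function names are hypothetical, and the
  second map in the sample is made up for illustration.

```python
import re

# Map header, e.g. "35000c50086a3ca97 dm-161 IBM-ESXS,ST10000NM0226 E"
MAP_RE = re.compile(r"^\s*(?P<wwid>\S+)\s+(?P<dm>dm-\d+)\s")
# Path line, e.g. "`- 17:0:170:0 sdfg 130:32  active ready running"
PATH_RE = re.compile(r"(?P<hctl>\d+:\d+:\d+:\d+)\s+(?P<dev>sd[a-z]+)\s")


def paths_per_map(text):
    """Return {wwid: [sdX, ...]} parsed from `multipath -ll` output."""
    maps = {}
    current = None
    for line in text.splitlines():
        header = MAP_RE.match(line)
        if header:
            current = header.group("wwid")
            maps[current] = []
            continue
        path = PATH_RE.search(line)
        if path and current is not None:
            maps[current].append(path.group("dev"))
    return maps


def single_path_maps(text):
    """Return the WWIDs of maps that currently have exactly one path."""
    return [wwid for wwid, devs in paths_per_map(text).items()
            if len(devs) == 1]


# First map is copied from this report; EXAMPLE_WWID and its two paths
# are hypothetical and only illustrate a redundant map.
SAMPLE = """\
35000c50086a3ca97 dm-161 IBM-ESXS,ST10000NM0226 E
size=9.0T features='0' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=1 status=active
  `- 17:0:170:0 sdfg 130:32  active ready running
EXAMPLE_WWID dm-1 IBM-ESXS,ST10000NM0226 E
size=9.0T features='0' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=1 status=active
  |- 17:0:1:0 sdb 8:16  active ready running
  `- 18:0:1:0 sdzz 8:32  active ready running
"""

print(single_path_maps(SAMPLE))
```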

  
  Checkpoint #4:
  ==============

  - After 43 minutes, the multipath -ll command shows some maps with only
  a single path and no redundancy, and some with multiple paths and
  redundancy.

  
  root@smb1p1:~# date
  Thu Jun  1 14:43:00 CDT 2017
  root@smb1p1:~# multipath -ll | grep -c 'sd[a-z]\+'
  252
  root@smb1p1:~# 

  
  Checkpoint #5:
  ==============

  - After 47 minutes, the multipath -ll command still shows some maps
  with only a single path and no redundancy.

  root@smb1p1:~# dmesg -T | grep 'sd 1[78]:' |  grep 'Attached SCSI disk' | head
  [Thu Jun  1 14:06:48 2017] sd 17:0:1:0: [sdb] Attached SCSI disk
  [Thu Jun  1 14:06:51 2017] sd 17:0:2:0: [sdc] Attached SCSI disk
  [Thu Jun  1 14:06:53 2017] sd 17:0:3:0: [sdd] Attached SCSI disk
  [Thu Jun  1 14:06:57 2017] sd 17:0:4:0: [sde] Attached SCSI disk
  [Thu Jun  1 14:07:00 2017] sd 17:0:5:0: [sdf] Attached SCSI disk
  [Thu Jun  1 14:07:03 2017] sd 17:0:6:0: [sdg] Attached SCSI disk
  [Thu Jun  1 14:07:05 2017] sd 17:0:7:0: [sdh] Attached SCSI disk
  [Thu Jun  1 14:07:08 2017] sd 17:0:8:0: [sdi] Attached SCSI disk
  [Thu Jun  1 14:07:11 2017] sd 17:0:9:0: [sdj] Attached SCSI disk
  [Thu Jun  1 14:07:14 2017] sd 17:0:10:0: [sdk] Attached SCSI disk
  root@smb1p1:~# dmesg -T | grep 'sd 1[78]:' |  grep 'Attached SCSI disk' | tail
  [Thu Jun  1 14:46:15 2017] sd 18:0:112:0: [sdjo] Attached SCSI disk
  [Thu Jun  1 14:46:20 2017] sd 18:0:113:0: [sdjp] Attached SCSI disk
  [Thu Jun  1 14:46:25 2017] sd 18:0:114:0: [sdjq] Attached SCSI disk
  [Thu Jun  1 14:46:31 2017] sd 18:0:115:0: [sdjr] Attached SCSI disk
  [Thu Jun  1 14:46:36 2017] sd 18:0:116:0: [sdjs] Attached SCSI disk
  [Thu Jun  1 14:46:41 2017] sd 18:0:117:0: [sdjt] Attached SCSI disk
  [Thu Jun  1 14:46:46 2017] sd 18:0:118:0: [sdju] Attached SCSI disk
  [Thu Jun  1 14:46:51 2017] sd 18:0:119:0: [sdjv] Attached SCSI disk
  [Thu Jun  1 14:46:56 2017] sd 18:0:120:0: [sdjw] Attached SCSI disk
  [Thu Jun  1 14:47:01 2017] sd 18:0:121:0: [sdjx] Attached SCSI disk
  root@smb1p1:~# 
  root@smb1p1:~# 
  root@smb1p1:~# date
  Thu Jun  1 14:47:20 CDT 2017
  root@smb1p1:~# multipath -ll | grep -c 'sd[a-z]\+'
  288
  root@smb1p1:~# 

  Checkpoint #6:
  ==============

  - 51 minutes after system reboot, all disks appear to be discovered
  and the multipaths are built correctly.

  
  root@smb1p1:~# multipath -ll | grep -c 'sd[a-z]\+'
  336

  root@smb1p1:~# date
  Thu Jun  1 14:52:05 CDT 2017
  root@smb1p1:~# dmesg -T | grep 'sd 1[78]:' |  grep 'Attached SCSI disk' | tail
  [Thu Jun  1 14:50:47 2017] sd 18:0:167:0: [sdlp] Attached SCSI disk
  [Thu Jun  1 14:50:52 2017] sd 18:0:168:0: [sdlq] Attached SCSI disk
  [Thu Jun  1 14:50:57 2017] sd 18:0:169:0: [sdlr] Attached SCSI disk
  [Thu Jun  1 14:51:02 2017] sd 18:0:170:0: [sdls] Attached SCSI disk
  [Thu Jun  1 14:51:07 2017] sd 18:0:171:0: [sdlt] Attached SCSI disk
  [Thu Jun  1 14:51:13 2017] sd 18:0:172:0: [sdlu] Attached SCSI disk
  [Thu Jun  1 14:51:17 2017] sd 18:0:173:0: [sdlv] Attached SCSI disk
  [Thu Jun  1 14:51:22 2017] sd 18:0:174:0: [sdlw] Attached SCSI disk
  [Thu Jun  1 14:51:27 2017] sd 18:0:175:0: [sdlx] Attached SCSI disk
  [Thu Jun  1 14:51:33 2017] sd 18:0:176:0: [sdly] Attached SCSI disk
  root@smb1p1:~#
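
  The manual checkpoints above amount to polling `multipath -ll` until
  the path count reaches the expected 336. A minimal sketch of that loop
  follows. The counting mirrors the `grep -c 'sd[a-z]\+'` used in this
  report; `wait_for_paths` and its parameters are hypothetical, and
  running the real command needs root and multipath-tools installed.

```python
import re
import subprocess
import time

SD_RE = re.compile(r"sd[a-z]+")


def count_path_lines(text):
    # Count lines that mention an sdX device, like grep -c 'sd[a-z]\+'.
    return sum(1 for line in text.splitlines() if SD_RE.search(line))


def read_multipath():
    # Same command as in the report; raises if multipath exits non-zero.
    result = subprocess.run(["multipath", "-ll"], capture_output=True,
                            text=True, check=True)
    return result.stdout


def wait_for_paths(expected, read=read_multipath, interval=60.0,
                   timeout=3600.0, sleep=time.sleep, clock=time.monotonic):
    """Poll until at least `expected` paths show up.

    Returns (path count, seconds waited); raises TimeoutError otherwise.
    `read`, `sleep`, and `clock` are injectable so the loop can be tested
    without real hardware.
    """
    start = clock()
    while True:
        count = count_path_lines(read())
        elapsed = clock() - start
        if count >= expected:
            return count, elapsed
        if elapsed >= timeout:
            raise TimeoutError(
                f"only {count} of {expected} paths after {elapsed:.0f}s")
        sleep(interval)


if __name__ == "__main__":
    # 336 = 168 drives x 2 paths, the final count seen in this report.
    print(wait_for_paths(336))
```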

  == Comment: #24 - Mauricio Faria De Oliveira  - 2017-06-06 11:42:59 ==
  Hi Paul,

  Per your logs, yes, it's the slowness with the SES driver.

  I'll ask Canonical to pick up the fix for 16.10 and 17.04 so that it
  makes it into 16.04.2 and 16.04.3.

  Thanks,
  Mauricio

  == Comment: #26 - Mauricio Faria De Oliveira <mauri...@br.ibm.com> - 2017-06-06 12:06:32 ==
  The patch applies cleanly in the master-next branch of ubuntu-zesty.git
  and ubuntu-yakkety.git.
  Mirroring to Canonical to get an LP bug number, which is required in
  the submission process.

  == Comment: #27 - Mauricio Faria De Oliveira <mauri...@br.ibm.com> - 2017-06-06 12:07:58 ==
  The commit is [1].

  commit 75106523f39751390b5789b36ee1d213b3af1945
  Author: Mauricio Faria de Oliveira <mauri...@linux.vnet.ibm.com>
  Date:   Wed Apr 5 12:18:19 2017 -0300

      scsi: ses: don't get power status of SES device slot on probe

  [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=75106523f39751390b5789b36ee1d213b3af1945

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1696445/+subscriptions

