Public bug reported:

== Comment: #1 - Application Cdeadmin <cdead...@us.ibm.com> - 2016-12-02 
04:55:07 ==
==== State: Open by: tdylla on 01 December 2016 07:24:33 ====

Notice: This Note entry was modified.  2 non-ascii character(s) were
replaced with question marks.

 BMC yl13u2bmc/9.5.57.84
       Gui - ADMIN/admin    ssh - sysadmin/superuser

OS  yl13u2os/9.5.57.85
       ssh - root/Pumpk1ns

root@YL13U2OS:~# ver
cat: /proc/device-tree/openprom/model: No such file or directory
       ver 1.5.4.5 - OS, HTX, Firmware and Machine details

                           OS: GNU/Linux
                   OS Version: Ubuntu 16.04.1 LTS \n \l
               Kernel Version: 4.4.0-47-generic
                  HTX Version: htxubuntu-422
                    Host Name: YL13U2OS
            Machine Serial No: 100CC9A
           Machine Type/Model: 8335-GTB

root@YL13U2OS:~# uname -a
Linux YL13U2OS 4.4.0-47-generic #68-Ubuntu SMP Wed Oct 26 19:38:24 UTC 2016 
ppc64le ppc64le ppc64le GNU/Linux

root@YL13U2OS:~# cat /etc/os-release
NAME="Ubuntu"
VERSION="16.04.1 LTS (Xenial Xerus)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 16.04.1 LTS"
VERSION_ID="16.04"
HOME_URL="http://www.ubuntu.com/"
SUPPORT_URL="http://help.ubuntu.com/"
BUG_REPORT_URL="http://bugs.launchpad.net/ubuntu/"
VERSION_CODENAME=xenial
UBUNTU_CODENAME=xenial

Dasd exercisers fail with a write error.  These have never failed
before.

root@YL13U2OS:~# lsblk -o KNAME,TYPE,SIZE,MODEL,ROTA
KNAME     TYPE   SIZE MODEL                                    ROTA
sda       disk   1.8T ST2000NX0253                                1
sda1      part   1.8T                                             1
sdb       disk   1.8T ST2000NX0253                                1
sdb1      part   1.8T                                             1

Getting HTX errors from yl13u2os.rch.stglabs.ibm.com

######################## Result Starts Here ################################
Currently running ECG/MDT : /usr/lpp/htx//mdt/mdt.whit
===========================

---------------------------------------------------------------------
Device id:/dev/sda1
Timestamp:Dec 1 01:22:57 2016
err=00000001
sev=1
Exerciser Name:hxestorage
Serial No:Not Available
Part No:Not Available
Location:Not Available
FRU Number:Not Available
Device:Not Available
Error Text:rule_1_3 numopers= 1907729 loop= 1322123 blk=0xc08768b0 len=262144 
dir=DOWN min_blkno=0xaea86084 max_blkno=0xe8e080af
BWRC LBA fencepost Detail:
th_num min_lba max_lba status
0 0 2476e9ff R
1 4766ee58 74704057 R
2 74704058 99783457 R
3 c0876ab0 e8e080af R
write error - errno: 1(?)

---------------------------------------------------------------------

---------------------------------------------------------------------
Device id:/dev/sda1
Timestamp:Dec 1 01:22:57 2016
err=00000001
sev=1
Exerciser Name:hxestorage
Serial No:Not Available
Part No:Not Available
Location:Not Available
FRU Number:Not Available
Device:Not Available
Error Text:Hardware Exerciser stopped on error

---------------------------------------------------------------------

---------------------------------------------------------------------
Device id:/dev/sdb1
Timestamp:Dec 1 01:23:08 2016
err=00000001
sev=1
Exerciser Name:hxestorage
Serial No:Not Available
Part No:Not Available
Location:Not Available
FRU Number:Not Available
Device:Not Available
Error Text:rule_1_1 numopers= 1907729 loop= 1394165 blk=0x49e45458 len=262144 
dir=DOWN min_blkno=0x3a38202c max_blkno=0x74704057
BWRC LBA fencepost Detail:
th_num min_lba max_lba status
0 0 247c47ff R
1 49e45658 74704057 R
2 74704058 99d2a657 R
3 c0d344b0 e8e080af R
write error - errno: 1(?)

---------------------------------------------------------------------

---------------------------------------------------------------------
Device id:/dev/sdb1
Timestamp:Dec 1 01:23:08 2016
err=00000001
sev=1
Exerciser Name:hxestorage
Serial No:Not Available
Part No:Not Available
Location:Not Available
FRU Number:Not Available
Device:Not Available
Error Text:Hardware Exerciser stopped on error

---------------------------------------------------------------------

######################### Result Ends Here #################################

System is still running exercisers.  Feel Free to play with the system.
System is available for any debug that is needed.

==== State: Open by: mamukul1 on 01 December 2016 15:41:32 ====

Write() failing with errno 1 for both sda1 and sdb1.
Some errors seen in dmesg as well in same timeframe.

Over to hxestorage to debug further.
#=#=#=#=#=#=#=#=#=#=#=#=#=#=#=#=#=#=#

<Note by preeti, 2016/12/01 23:47:34 seq: 7   rel: 0 action: note>
Both devices are failing with errno set to 1 for the write() system call,
which means "operation not permitted".
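
For reference, the errno value can be decoded with a short Python sketch. It assumes a Linux host, where errno 1 is EPERM; the script is illustrative only and is not part of HTX.

```python
import errno
import os

# errno value reported by hxestorage for the failed write()
reported = 1

name = errno.errorcode[reported]   # symbolic name for the number
message = os.strerror(reported)    # libc's human-readable text

print(f"errno {reported} = {name}: {message}")
```

EPERM on a write() to a raw block device is unusual for a root process, which is why the value is worth confirming before looking further.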

---------------------------------------------------------------------
Device id:/dev/sda1
Timestamp:Dec  1 01:22:57 2016
err=00000001
sev=1
Exerciser Name:hxestorage
Serial No:Not Available
Part No:Not Available
Location:Not Available
FRU Number:Not Available
Device:Not Available
Error Text:rule_1_3 numopers=   1907729 loop=   1322123 blk=0xc08768b0 
len=262144 dir=DOWN min_blkno=0xaea86084 max_blkno=0xe8e080af
BWRC LBA fencepost Detail:
th_num                min_lba                  max_lba          status
0                 0                2476e9ff        R
1          4766ee58                74704057        R
2          74704058                99783457        R
3          c0876ab0                e8e080af        R
write error - errno: 1(?)
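
As a cross-check of the numbers in the sda1 error text, the sketch below converts the write length into blocks and compares the result with thread 3's fencepost. It assumes 512-byte logical blocks, which the log does not state; the variable names are illustrative.

```python
# Values copied from the sda1 error text above.
blk = 0xC08768B0             # starting block of the failed write
length_bytes = 262144        # len= field, in bytes
thread3_min_lba = 0xC0876AB0 # min_lba of thread 3 in the fencepost table

BLOCK_SIZE = 512             # assumption: 512-byte logical blocks

blocks = length_bytes // BLOCK_SIZE
end_exclusive = blk + blocks

print(f"write covers {blocks} blocks: {blk:#x}..{end_exclusive - 1:#x}")
print("ends at thread 3 min_lba:", end_exclusive == thread3_min_lba)
```

Under that assumption the failed write sits directly below thread 3's min_lba, which is consistent with dir=DOWN. The same arithmetic applied to the sdb1 error (blk=0x49e45458, thread 1 min_lba 49e45658) gives the same relationship.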

Below is corresponding data in kernel logs (Not sure if it is related to
error):

Dec  1 01:22:57 YL13U2OS kernel: [50119.193567] EXT4-fs (sda1): VFS: Can't find 
ext4 filesystem
Dec  1 01:22:57 YL13U2OS kernel: [50119.201895] EXT4-fs (sda1): VFS: Can't find 
ext4 filesystem
Dec  1 01:22:57 YL13U2OS kernel: [50119.207728] EXT4-fs (sda1): VFS: Can't find 
ext4 filesystem
Dec  1 01:22:57 YL13U2OS kernel: [50119.234961] squashfs: SQUASHFS error: Can't 
find a SQUASHFS superblock on sda1
Dec  1 01:22:57 YL13U2OS kernel: [50119.249926] FAT-fs (sda1): bogus number of 
FAT structure
Dec  1 01:22:57 YL13U2OS kernel: [50119.250215] FAT-fs (sda1): Can't find a 
valid FAT filesystem
Dec  1 01:22:58 YL13U2OS kernel: [50119.700556] XFS (sda1): Invalid superblock 
magic number
Dec  1 01:22:58 YL13U2OS kernel: [50120.448485] FAT-fs (sda1): bogus number of 
FAT structure
Dec  1 01:22:58 YL13U2OS kernel: [50120.448818] FAT-fs (sda1): Can't find a 
valid FAT filesystem
Dec  1 01:22:59 YL13U2OS kernel: [50120.463705] VFS: Can't find a Minix 
filesystem V1 | V2 | V3 on device sda1.
Dec  1 01:22:59 YL13U2OS kernel: [50120.468236] hfsplus: unable to find HFS+ 
superblock
Dec  1 01:22:59 YL13U2OS kernel: [50120.474019] qnx4: no qnx4 filesystem (no 
root dir).
Dec  1 01:22:59 YL13U2OS kernel: [50120.477931] ufs: You didn't specify the 
type of your ufs filesystem
Dec  1 01:22:59 YL13U2OS kernel: [50120.477931]
Dec  1 01:22:59 YL13U2OS kernel: [50120.477931] mount -t ufs -o 
ufstype=sun|sunx86|44bsd|ufs2|5xbsd|old|hp|nextstep|nextstep-cd|openstep ...
Dec  1 01:22:59 YL13U2OS kernel: [50120.477931]
Dec  1 01:22:59 YL13U2OS kernel: [50120.477931] >>>WARNING<<< Wrong ufstype may 
corrupt your filesystem, default is ufstype=old
Dec  1 01:22:59 YL13U2OS kernel: [50120.481654] ufs: ufs_fill_super(): bad 
magic number
Dec  1 01:22:59 YL13U2OS kernel: [50120.487379] hfs: can't find a HFS 
filesystem on dev sda1

Will transfer to Linux to look further.
<Note by preeti, 2016/12/02 04:35:35 seq: 8   rel: 0 action: assign>

== Comment: #2 - Application Cdeadmin <cdead...@us.ibm.com> - 2016-12-02 
09:55:08 ==
==== State: Open by: tdylla on 02 December 2016 09:53:18 ====

I noticed on a different system that has htxubuntu-424 installed along with a 
patch from defect sw372840 that the sdb exerciser is running just fine.  It 
currently has a cycle count of 2 and current stanza of 5.  The device on this 
other system is exactly the same drive type.
sdb disk 1.8T ST2000NX0253 
 sdb1 part 1.8T

== Comment: #3 - VIPIN K. PARASHAR <vipar...@in.ibm.com> - 2016-12-05 05:43:45 
==
root@YL13U2OS:~# cat /proc/partitions 
major minor  #blocks  name

   1        0      65536 ram0
   1        1      65536 ram1
   1        2      65536 ram2
   1        3      65536 ram3
   1        4      65536 ram4
   1        5      65536 ram5
   1        6      65536 ram6
   1        7      65536 ram7
   1        8      65536 ram8
   1        9      65536 ram9
   1       10      65536 ram10
   1       11      65536 ram11
   1       12      65536 ram12
   1       13      65536 ram13
   1       14      65536 ram14
   1       15      65536 ram15
 259        0 3125616984 nvme0n1
 259        1       7168 nvme0n1p1
 259        2 2999266304 nvme0n1p2
 259        3  126342144 nvme0n1p3
   8        0 1953514584 sda
   8        1 1953513560 sda1
   8       16 1953514584 sdb
   8       17 1953513560 sdb1
  11        0    1048575 sr0
  11        1    1048575 sr1
  11        2    1048575 sr2
  11        3    1048575 sr3
root@YL13U2OS:~# mount
sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime)
proc on /proc type proc (rw,nosuid,nodev,noexec,relatime)
udev on /dev type devtmpfs 
(rw,nosuid,relatime,size=508856128k,nr_inodes=7950877,mode=755)
devpts on /dev/pts type devpts 
(rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000)
tmpfs on /run type tmpfs (rw,nosuid,noexec,relatime,size=107151232k,mode=755)
/dev/nvme0n1p2 on / type ext4 (rw,relatime,errors=remount-ro,data=ordered)
securityfs on /sys/kernel/security type securityfs 
(rw,nosuid,nodev,noexec,relatime)
tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev)
tmpfs on /run/lock type tmpfs (rw,nosuid,nodev,noexec,relatime,size=5120k)
tmpfs on /sys/fs/cgroup type tmpfs (ro,nosuid,nodev,noexec,mode=755)
cgroup on /sys/fs/cgroup/systemd type cgroup 
(rw,nosuid,nodev,noexec,relatime,xattr,release_agent=/lib/systemd/systemd-cgroups-agent,name=systemd)
pstore on /sys/fs/pstore type pstore (rw,nosuid,nodev,noexec,relatime)
cgroup on /sys/fs/cgroup/devices type cgroup 
(rw,nosuid,nodev,noexec,relatime,devices)
cgroup on /sys/fs/cgroup/pids type cgroup (rw,nosuid,nodev,noexec,relatime,pids)
cgroup on /sys/fs/cgroup/freezer type cgroup 
(rw,nosuid,nodev,noexec,relatime,freezer)
cgroup on /sys/fs/cgroup/blkio type cgroup 
(rw,nosuid,nodev,noexec,relatime,blkio)
cgroup on /sys/fs/cgroup/perf_event type cgroup 
(rw,nosuid,nodev,noexec,relatime,perf_event)
cgroup on /sys/fs/cgroup/cpuset type cgroup 
(rw,nosuid,nodev,noexec,relatime,cpuset)
cgroup on /sys/fs/cgroup/memory type cgroup 
(rw,nosuid,nodev,noexec,relatime,memory)
cgroup on /sys/fs/cgroup/hugetlb type cgroup 
(rw,nosuid,nodev,noexec,relatime,hugetlb)
cgroup on /sys/fs/cgroup/net_cls,net_prio type cgroup 
(rw,nosuid,nodev,noexec,relatime,net_cls,net_prio)
cgroup on /sys/fs/cgroup/cpu,cpuacct type cgroup 
(rw,nosuid,nodev,noexec,relatime,cpu,cpuacct)
systemd-1 on /proc/sys/fs/binfmt_misc type autofs 
(rw,relatime,fd=27,pgrp=1,timeout=0,minproto=5,maxproto=5,direct)
mqueue on /dev/mqueue type mqueue (rw,relatime)
debugfs on /sys/kernel/debug type debugfs (rw,relatime)
hugetlbfs on /dev/hugepages type hugetlbfs (rw,relatime)
fusectl on /sys/fs/fuse/connections type fusectl (rw,relatime)
configfs on /sys/kernel/config type configfs (rw,relatime)
lxcfs on /var/lib/lxcfs type fuse.lxcfs 
(rw,nosuid,nodev,relatime,user_id=0,group_id=0,allow_other)
tmpfs on /run/user/0 type tmpfs 
(rw,nosuid,nodev,relatime,size=107151232k,mode=700)
root@YL13U2OS:~# df
Filesystem      1K-blocks    Used  Available Use% Mounted on
udev            508856128       0  508856128   0% /dev
tmpfs           107151232   32832  107118400   1% /run
/dev/nvme0n1p2 2952071944 7906084 2794186164   1% /
tmpfs           535756096       0  535756096   0% /dev/shm
tmpfs                5120       0       5120   0% /run/lock
tmpfs           535756096       0  535756096   0% /sys/fs/cgroup
tmpfs           107151232       0  107151232   0% /run/user/0
root@YL13U2OS:~#


== Comment: #7 - VIPIN K. PARASHAR <vipar...@in.ibm.com> - 2016-12-06 05:43:06 
==
root@YL13U2OS:~# df -T
Filesystem     Type      1K-blocks    Used  Available Use% Mounted on
udev           devtmpfs  508856128       0  508856128   0% /dev
tmpfs          tmpfs     107151232   32832  107118400   1% /run
/dev/nvme0n1p2 ext4     2952071944 7931124 2794161124   1% /
tmpfs          tmpfs     535756096       0  535756096   0% /dev/shm
tmpfs          tmpfs          5120       0       5120   0% /run/lock
tmpfs          tmpfs     535756096       0  535756096   0% /sys/fs/cgroup
tmpfs          tmpfs     107151232       0  107151232   0% /run/user/0
root@YL13U2OS:~#

== Comment: #8 - VIPIN K. PARASHAR <vipar...@in.ibm.com> - 2016-12-06 06:33:48 
==
root@YL13U2OS:~# cat /etc/fstab
# /etc/fstab: static file system information.
#
# Use 'blkid' to print the universally unique identifier for a
# device; this may be used with UUID= as a more robust way to name devices
# that works even if disks are added and removed. See fstab(5).
#
# <file system> <mount point>   <type>  <options>       <dump>  <pass>
# / was on /dev/nvme0n1p2 during installation
UUID=6cddb0e5-477c-4d64-807a-631b2d12dfac /               ext4    
errors=remount-ro 0       1
# swap was on /dev/nvme0n1p3 during installation
UUID=00693a84-74f6-4ded-b82d-6a938880ba8a none            swap    sw            
  0       0

root@YL13U2OS:~# lsblk
NAME        MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda           8:0    1   1.8T  0 disk 
└─sda1        8:1    1   1.8T  0 part
sdb           8:16   1   1.8T  0 disk 
└─sdb1        8:17   1   1.8T  0 part
sr0          11:0    1  1024M  0 rom  
sr1          11:1    1  1024M  0 rom  
sr2          11:2    1  1024M  0 rom  
sr3          11:3    1  1024M  0 rom  
nvme0n1     259:0    0   2.9T  0 disk 
├─nvme0n1p1 259:1    0     7M  0 part
├─nvme0n1p2 259:2    0   2.8T  0 part /
└─nvme0n1p3 259:3    0 120.5G  0 part [SWAP]

root@YL13U2OS:~# lsblk --fs
NAME        FSTYPE LABEL UUID                                 MOUNTPOINT
sda                                                           
└─sda1
sdb                                                           
└─sdb1
sr0                                                           
sr1                                                           
sr2                                                           
sr3                                                           
nvme0n1                                                       
├─nvme0n1p1
├─nvme0n1p2 ext4         6cddb0e5-477c-4d64-807a-631b2d12dfac /
└─nvme0n1p3 swap         00693a84-74f6-4ded-b82d-6a938880ba8a [SWAP]

root@YL13U2OS:~# grep -B 1 '"hxestorage"' /usr/lpp/htx/mdt/mdt
sda1:
        HE_name = "hxestorage"                      * Hardware Exerciser name, 
14 char
--
sdb1:
        HE_name = "hxestorage"                      * Hardware Exerciser name, 
14 char
root@YL13U2OS:~# 
root@YL13U2OS:~# 
root@YL13U2OS:~# grep 'Device id' /tmp/htxerr
Device id:/dev/sda1         
Device id:/dev/sda1         
Device id:/dev/sdb1         
Device id:/dev/sdb1         
root@YL13U2OS:~# 

sda1 and sdb1 are the only disks being exercised, and both have errored out
after a write failure. The nvme0n1 disk is being used by the OS and is thus
not exercised by HTX.

== Comment: #9 - VIPIN K. PARASHAR <vipar...@in.ibm.com> - 2016-12-06 07:52:38 
==
[Thu Dec  1 01:22:57 2016] EXT4-fs (sda1): VFS: Can't find ext4 filesystem
[Thu Dec  1 01:22:57 2016] EXT4-fs (sda1): VFS: Can't find ext4 filesystem
[Thu Dec  1 01:22:57 2016] EXT4-fs (sda1): VFS: Can't find ext4 filesystem
[Thu Dec  1 01:22:57 2016] squashfs: SQUASHFS error: Can't find a SQUASHFS 
superblock on sda1
[Thu Dec  1 01:22:57 2016] FAT-fs (sda1): bogus number of FAT structure
[Thu Dec  1 01:22:57 2016] FAT-fs (sda1): Can't find a valid FAT filesystem
[Thu Dec  1 01:22:57 2016] XFS (sda1): Invalid superblock magic number
[Thu Dec  1 01:22:58 2016] FAT-fs (sda1): bogus number of FAT structure
[Thu Dec  1 01:22:58 2016] FAT-fs (sda1): Can't find a valid FAT filesystem
[Thu Dec  1 01:22:58 2016] VFS: Can't find a Minix filesystem V1 | V2 | V3 on 
device sda1.
[Thu Dec  1 01:22:58 2016] hfsplus: unable to find HFS+ superblock
[Thu Dec  1 01:22:58 2016] qnx4: no qnx4 filesystem (no root dir).
[Thu Dec  1 01:22:58 2016] ufs: You didn't specify the type of your ufs 
filesystem
                           
                           mount -t ufs -o 
ufstype=sun|sunx86|44bsd|ufs2|5xbsd|old|hp|nextstep|nextstep-cd|openstep ...
                           
                           >>>WARNING<<< Wrong ufstype may corrupt your 
filesystem, default is ufstype=old
[Thu Dec  1 01:22:58 2016] ufs: ufs_fill_super(): bad magic number
[Thu Dec  1 01:22:58 2016] hfs: can't find a HFS filesystem on dev sda1
[Thu Dec  1 01:23:08 2016] EXT4-fs (sdb1): VFS: Can't find ext4 filesystem
[Thu Dec  1 01:23:08 2016] EXT4-fs (sdb1): VFS: Can't find ext4 filesystem
[Thu Dec  1 01:23:08 2016] EXT4-fs (sdb1): VFS: Can't find ext4 filesystem
[Thu Dec  1 01:23:08 2016] squashfs: SQUASHFS error: Can't find a SQUASHFS 
superblock on sdb1
[Thu Dec  1 01:23:08 2016] FAT-fs (sdb1): bogus number of FAT structure
[Thu Dec  1 01:23:08 2016] FAT-fs (sdb1): Can't find a valid FAT filesystem
[Thu Dec  1 01:23:08 2016] XFS (sdb1): Invalid superblock magic number
[Thu Dec  1 01:23:10 2016] FAT-fs (sdb1): bogus number of FAT structure
[Thu Dec  1 01:23:10 2016] FAT-fs (sdb1): Can't find a valid FAT filesystem
[Thu Dec  1 01:23:10 2016] VFS: Can't find a Minix filesystem V1 | V2 | V3 on 
device sdb1.
[Thu Dec  1 01:23:10 2016] hfsplus: unable to find HFS+ superblock
[Thu Dec  1 01:23:10 2016] qnx4: no qnx4 filesystem (no root dir).
[Thu Dec  1 01:23:10 2016] ufs: You didn't specify the type of your ufs 
filesystem
                           
                           mount -t ufs -o 
ufstype=sun|sunx86|44bsd|ufs2|5xbsd|old|hp|nextstep|nextstep-cd|openstep ...
                           
                           >>>WARNING<<< Wrong ufstype may corrupt your 
filesystem, default is ufstype=old
[Thu Dec  1 01:23:10 2016] ufs: ufs_fill_super(): bad magic number
[Thu Dec  1 01:23:10 2016] hfs: can't find a HFS filesystem on dev sdb1

Linux failed to detect file systems on the sda1 and sdb1 disks, causing write
failures for the HTX exerciser. Similar failures are also reported for the
nvme disk in the Linux kernel log.
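
The timing overlap between the HTX errors and the kernel messages can be checked with a small sketch. It assumes bracketed-timestamp lines in the format quoted above and uses a few sample lines rather than the full log; the function name `first_seen` is hypothetical.

```python
import re
from datetime import datetime

# A few lines copied from the kernel log quoted above.
SAMPLE_LOG = """\
[Thu Dec  1 01:22:57 2016] EXT4-fs (sda1): VFS: Can't find ext4 filesystem
[Thu Dec  1 01:22:57 2016] XFS (sda1): Invalid superblock magic number
[Thu Dec  1 01:23:08 2016] EXT4-fs (sdb1): VFS: Can't find ext4 filesystem
[Thu Dec  1 01:23:10 2016] FAT-fs (sdb1): Can't find a valid FAT filesystem
"""

# HTX error timestamps taken from the exerciser report.
HTX_ERRORS = {
    "sda1": datetime(2016, 12, 1, 1, 22, 57),
    "sdb1": datetime(2016, 12, 1, 1, 23, 8),
}

# Capture the bracketed timestamp and the device name in parentheses.
LINE_RE = re.compile(r"^\[(?P<ts>[^\]]+)\].*\((?P<dev>sd[a-z]\d+)\)")


def first_seen(log_text):
    """Return the earliest probe-message time for each device."""
    seen = {}
    for line in log_text.splitlines():
        match = LINE_RE.match(line)
        if not match:
            continue
        ts = datetime.strptime(match["ts"], "%a %b %d %H:%M:%S %Y")
        dev = match["dev"]
        if dev not in seen or ts < seen[dev]:
            seen[dev] = ts
    return seen


probe_times = first_seen(SAMPLE_LOG)
for dev, htx_time in HTX_ERRORS.items():
    delta = abs((probe_times[dev] - htx_time).total_seconds())
    print(f"{dev}: first probe message within {delta:.0f}s of HTX error")
```

For both devices the first filesystem-probe message carries the same second as the HTX write error, which supports treating the two as related.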

== Comment: #10 - VIPIN K. PARASHAR <vipar...@in.ibm.com> - 2016-12-06 08:01:35 
==
The Linux errors are being triggered by os-prober. I ran os-prober manually
and the FS failures got logged in the Linux log. So os-prober was invoked
while HTX was running. This caused the write failures for the sda1 and sdb1
disks, along with the nvme disks, and also logged the Linux errors.

== Comment: #11 - VIPIN K. PARASHAR <vipar...@in.ibm.com> - 2016-12-06 08:04:55 
==
What operation was being run while HTX was running when these errors
were seen? Was it apt upgrade or something else?

== Comment: #12 - Application Cdeadmin <cdead...@us.ibm.com> - 2016-12-07 
10:56:09 ==
==== State: MoreInfo by: tdylla on 07 December 2016 10:53:58 ====

HTX was started using htx command line commands.  From then on, the
system was monitored through "System Live Monitor".  No other commands
were executed by a user. This failure happened during an overnight run.
I believe the Ubuntu OS was set up to automatically install security
fixes, which is required.

** Affects: os-prober (Ubuntu)
     Importance: Undecided
     Assignee: Taco Screen team (taco-screen-team)
         Status: New


** Tags: architecture-ppc64le bugnameltc-149477 severity-high 
targetmilestone-inin16042

** Tags added: architecture-ppc64le bugnameltc-149477 severity-high
targetmilestone-inin16042

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1648561

Title:
  htxubuntu SDB dasd exercisers fail

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/os-prober/+bug/1648561/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
