Hi David,

To follow up on this: a fourth drive (out of 12) has now failed, so I have
opted to order the disks below as replacements. I have an ongoing case with
Intel via the supplier and will report back anything useful, but I am going
to avoid the Intel S4600 2TB SSDs for the moment.

1.92TB Samsung SM863a 2.5" Enterprise SSD, SATA3 6Gb/s, 2-bit MLC V-NAND

Regards
Sean Redmond

On Wed, Jan 10, 2018 at 11:08 PM, Sean Redmond <sean.redmo...@gmail.com>
wrote:

> Hi David,
>
> Thanks for your email. They are connected inside a Dell R730XD (2.5 inch,
> 24-disk model) in non-RAID (pass-through) mode via a PERC RAID card.
>
> The Ceph version is Jewel, with kernel 4.13.x on Ubuntu 16.04.
>
> Thanks for your feedback on the HGST disks.
>
> Thanks
>
> On Wed, Jan 10, 2018 at 10:55 PM, David Herselman <d...@syrex.co> wrote:
>
>> Hi Sean,
>>
>>
>>
>> No, Intel’s feedback has been… Pathetic… I have yet to receive anything
>> more than a request to ‘sign’ a non-disclosure agreement to obtain beta
>> firmware. No official answer as to whether one can logically unlock the
>> drives, no answer to my question of whether Intel publish serial numbers
>> anywhere for recalled batches, and no information as to whether firmware
>> updates would address any known issues.
>>
>>
>>
>> This with us being an accredited Intel Gold partner…
>>
>>
>>
>>
>>
>> We’ve returned the lot, having ended up with 9/12 of the drives failing
>> in the same manner. The replacement drives, which had different serial
>> number ranges, also failed. What is very frustrating is that the drives
>> fail in a way that results in unbootable servers, unless one adds
>> ‘rootdelay=240’ to the kernel command line.
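>>
>> For reference, a minimal sketch of how one might add the parameter
>> persistently on a Debian-based system (assuming GRUB2; file locations and
>> the update command may differ elsewhere):
>>
>>   # Append rootdelay=240 to the default kernel command line:
>>   sed -i 's/^GRUB_CMDLINE_LINUX="/&rootdelay=240 /' /etc/default/grub;
>>   update-grub;   # regenerates grub.cfg; takes effect on the next boot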
>>
>>
>>
>>
>>
>> I would be interested to know what platform your drives were in and
>> whether or not they were connected to a RAID module/card.
>>
>>
>>
>> PS: After much searching we’ve decided to order the NVMe conversion kit
>> and have ordered HGST UltraStar SN200 2.5 inch SFF drives with a 3 DWPD
>> rating.
>>
>>
>>
>>
>>
>> Regards
>>
>> David Herselman
>>
>>
>>
>> *From:* Sean Redmond [mailto:sean.redmo...@gmail.com]
>> *Sent:* Thursday, 11 January 2018 12:45 AM
>> *To:* David Herselman <d...@syrex.co>
>> *Cc:* Christian Balzer <ch...@gol.com>; ceph-users@lists.ceph.com
>>
>> *Subject:* Re: [ceph-users] Many concurrent drive failures - How do I
>> activate pgs?
>>
>>
>>
>> Hi,
>>
>>
>>
>> I have a case where 3 out of 12 of these Intel S4600 2TB drives failed
>> within a matter of days of being burn-in tested and placed into
>> production.
>>
>>
>>
>> I am interested to know: did you ever get any further feedback from the
>> vendor on your issue?
>>
>>
>>
>> Thanks
>>
>>
>>
>> On Thu, Dec 21, 2017 at 1:38 PM, David Herselman <d...@syrex.co> wrote:
>>
>> Hi,
>>
>> I assume this can only be a physical manufacturing flaw or a firmware
>> bug? Do Intel publish advisories on recalled equipment? Should others be
>> concerned about using Intel DC S4600 SSD drives? Could this be an
>> electrical issue on the Hot Swap Backplane or BMC firmware issue? Either
>> way, all pure Intel...
>>
>> The hole is only 1.3 GB (339 objects x 4 MB), but it is striped perfectly
>> through the images, so the file systems on them are severely damaged.
>>
>> Is it possible to get Ceph to read in partial data shards? It would
>> provide between 25% and 75% more yield...
>>
>>
>> Is there anything wrong with how we've proceeded thus far? It would be
>> nice to be able to reference examples of using ceph-objectstore-tool, but
>> the documentation is virtually non-existent.
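>>
>> For reference, two invocations that are useful for exploring a stopped
>> OSD's contents (FileStore paths match those used below):
>>
>>   # List the placement groups present on an OSD:
>>   ceph-objectstore-tool --op list-pgs --data-path /var/lib/ceph/osd/ceph-8 \
>>     --journal-path /var/lib/ceph/osd/ceph-8/journal;
>>   # List the objects within a specific placement group:
>>   ceph-objectstore-tool --op list --pgid 7.4s0 \
>>     --data-path /var/lib/ceph/osd/ceph-8 \
>>     --journal-path /var/lib/ceph/osd/ceph-8/journal;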
>>
>> We used another SSD to simulate bringing all the failed SSDs back online,
>> carving it up into equal partitions that stand in for the originals:
>>   # Partition a drive to provide 12 x 140GB partitions, eg:
>>     sdd       8:48   0   1.8T  0 disk
>>     |-sdd1    8:49   0   140G  0 part
>>     |-sdd2    8:50   0   140G  0 part
>>     |-sdd3    8:51   0   140G  0 part
>>     |-sdd4    8:52   0   140G  0 part
>>     |-sdd5    8:53   0   140G  0 part
>>     |-sdd6    8:54   0   140G  0 part
>>     |-sdd7    8:55   0   140G  0 part
>>     |-sdd8    8:56   0   140G  0 part
>>     |-sdd9    8:57   0   140G  0 part
>>     |-sdd10   8:58   0   140G  0 part
>>     |-sdd11   8:59   0   140G  0 part
>>     +-sdd12   8:60   0   140G  0 part
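>>
>>   The partitions themselves can be created along these lines (a sketch
>>   assuming GPT and sgdisk; the device name and sizes are illustrative,
>>   and zapping the disk is destructive):
>>     sgdisk --zap-all /dev/sdd;            # wipe any existing partition table
>>     for N in `seq 1 12`; do
>>       sgdisk --new=$N:0:+140G /dev/sdd;   # partition N in the next free slot
>>     done;
>>     partprobe /dev/sdd;                   # re-read the partition table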
>>
>>
>>   Prerequisites:
>>     ceph osd set noout;
>>     apt-get install uuid-runtime;
>>
>>
>>   for ID in `seq 24 35`; do
>>     UUID=`uuidgen`;
>>     OSD_SECRET=`ceph-authtool --gen-print-key`;
>>     DEVICE=/dev/sdd$((ID-23)); # 24-23 = /dev/sdd1, 35-23 = /dev/sdd12
>>     # Register the OSD ID and its cephx key with the cluster:
>>     echo "{\"cephx_secret\": \"$OSD_SECRET\"}" | ceph osd new $UUID $ID \
>>       -i - -n client.bootstrap-osd -k /var/lib/ceph/bootstrap-osd/ceph.keyring;
>>     # Create, format and mount the OSD's data directory:
>>     mkdir /var/lib/ceph/osd/ceph-$ID;
>>     mkfs.xfs $DEVICE;
>>     mount $DEVICE /var/lib/ceph/osd/ceph-$ID;
>>     ceph-authtool --create-keyring /var/lib/ceph/osd/ceph-$ID/keyring \
>>       --name osd.$ID --add-key $OSD_SECRET;
>>     ceph-osd -i $ID --mkfs --osd-uuid $UUID;
>>     chown -R ceph:ceph /var/lib/ceph/osd/ceph-$ID;
>>     systemctl enable ceph-osd@$ID;
>>     systemctl start ceph-osd@$ID;
>>   done
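>>
>>   Before importing anything, it is worth confirming that the recreated
>>   OSDs registered and came up, e.g.:
>>     ceph osd tree;   # osd.24 through osd.35 should be listed as 'up'
>>     ceph -s;         # overall cluster health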
>>
>>
>> Once up, we imported previous exports of the empty head files into the
>> 'real' OSDs:
>>   kvm5b:
>>     systemctl stop ceph-osd@8;
>>     ceph-objectstore-tool --op import --pgid 7.4s0 \
>>       --data-path /var/lib/ceph/osd/ceph-8 \
>>       --journal-path /var/lib/ceph/osd/ceph-8/journal \
>>       --file /var/lib/vz/template/ssd_recovery/osd8_7.4s0.export;
>>     chown ceph:ceph -R /var/lib/ceph/osd/ceph-8;
>>     systemctl start ceph-osd@8;
>>   kvm5f:
>>     systemctl stop ceph-osd@23;
>>     ceph-objectstore-tool --op import --pgid 7.fs0 \
>>       --data-path /var/lib/ceph/osd/ceph-23 \
>>       --journal-path /var/lib/ceph/osd/ceph-23/journal \
>>       --file /var/lib/vz/template/ssd_recovery/osd23_7.fs0.export;
>>     chown ceph:ceph -R /var/lib/ceph/osd/ceph-23;
>>     systemctl start ceph-osd@23;
>>
>>
>> Bulk import previously exported objects (the loop only echoes the
>> stop/import command pairs, so they can be reviewed before being run):
>>     cd /var/lib/vz/template/ssd_recovery;
>>     for FILE in `ls -1A osd*_*.export | grep -Pv '^osd(8|23)_'`; do
>>       OSD=`echo $FILE | perl -pe 's/^osd(\d+).*/\1/'`;
>>       PGID=`echo $FILE | perl -pe 's/^osd\d+_(.*?).export/\1/g'`;
>>       echo -e "systemctl stop ceph-osd@$OSD\t ceph-objectstore-tool --op import --pgid $PGID --data-path /var/lib/ceph/osd/ceph-$OSD --journal-path /var/lib/ceph/osd/ceph-$OSD/journal --file /var/lib/vz/template/ssd_recovery/osd${OSD}_$PGID.export";
>>     done | sort
>>
>> Sample output (one stop/import command pair per line):
>> systemctl stop ceph-osd@27       ceph-objectstore-tool --op import --pgid 7.4s3 --data-path /var/lib/ceph/osd/ceph-27 --journal-path /var/lib/ceph/osd/ceph-27/journal --file /var/lib/vz/template/ssd_recovery/osd27_7.4s3.export
>> systemctl stop ceph-osd@27       ceph-objectstore-tool --op import --pgid 7.fs5 --data-path /var/lib/ceph/osd/ceph-27 --journal-path /var/lib/ceph/osd/ceph-27/journal --file /var/lib/vz/template/ssd_recovery/osd27_7.fs5.export
>> systemctl stop ceph-osd@30       ceph-objectstore-tool --op import --pgid 7.fs4 --data-path /var/lib/ceph/osd/ceph-30 --journal-path /var/lib/ceph/osd/ceph-30/journal --file /var/lib/vz/template/ssd_recovery/osd30_7.fs4.export
>> systemctl stop ceph-osd@31       ceph-objectstore-tool --op import --pgid 7.4s2 --data-path /var/lib/ceph/osd/ceph-31 --journal-path /var/lib/ceph/osd/ceph-31/journal --file /var/lib/vz/template/ssd_recovery/osd31_7.4s2.export
>> systemctl stop ceph-osd@32       ceph-objectstore-tool --op import --pgid 7.4s4 --data-path /var/lib/ceph/osd/ceph-32 --journal-path /var/lib/ceph/osd/ceph-32/journal --file /var/lib/vz/template/ssd_recovery/osd32_7.4s4.export
>> systemctl stop ceph-osd@32       ceph-objectstore-tool --op import --pgid 7.fs2 --data-path /var/lib/ceph/osd/ceph-32 --journal-path /var/lib/ceph/osd/ceph-32/journal --file /var/lib/vz/template/ssd_recovery/osd32_7.fs2.export
>> systemctl stop ceph-osd@34       ceph-objectstore-tool --op import --pgid 7.4s5 --data-path /var/lib/ceph/osd/ceph-34 --journal-path /var/lib/ceph/osd/ceph-34/journal --file /var/lib/vz/template/ssd_recovery/osd34_7.4s5.export
>> systemctl stop ceph-osd@34       ceph-objectstore-tool --op import --pgid 7.fs1 --data-path /var/lib/ceph/osd/ceph-34 --journal-path /var/lib/ceph/osd/ceph-34/journal --file /var/lib/vz/template/ssd_recovery/osd34_7.fs1.export
>>
>>
>> Reset permissions and then started the OSDs:
>> for OSD in 27 30 31 32 34; do
>>   chown -R ceph:ceph /var/lib/ceph/osd/ceph-$OSD;
>>   systemctl start ceph-osd@$OSD;
>> done
>>
>>
>> Then finally started all the OSDs... Now to hope that Intel have a way of
>> accessing drives that are in a 'disable logical state'.
>>
>>
>>
>> The imports succeed; herewith a link to the output after running an
>> import for placement group 7.4s2 on OSD 31:
>>   https://drive.google.com/open?id=1-Jo1jmrWrGLO2OgflacGPlEf2p32Y4hn
>>
>> Sample snippet:
>>     Write 1#7:fffcd2ec:::rbd_data.4.be8e9974b0dc51.0000000000002869:head#
>>     snapset 0=[]:{}
>>     Write 1#7:fffd4823:::rbd_data.4.ba24ef2ae8944a.000000000000a2b0:head#
>>     snapset 0=[]:{}
>>     Write 1#7:fffd6fb6:::benchmark_data_kvm5b_20945_object14722:head#
>>     snapset 0=[]:{}
>>     Write 1#7:ffffa069:::rbd_data.4.ba24ef2ae8944a.000000000000aea9:head#
>>     snapset 0=[]:{}
>>     Import successful
>>
>>
>> Data does get written; I can tell by the size of the FileStore mount
>> points:
>>   [root@kvm5b ssd_recovery]# df -h | grep -P 'ceph-(27|30|31|32|34)$'
>>   /dev/sdd4       140G  5.2G  135G   4% /var/lib/ceph/osd/ceph-27
>>   /dev/sdd7       140G   14G  127G  10% /var/lib/ceph/osd/ceph-30
>>   /dev/sdd8       140G   14G  127G  10% /var/lib/ceph/osd/ceph-31
>>   /dev/sdd9       140G   22G  119G  16% /var/lib/ceph/osd/ceph-32
>>   /dev/sdd11      140G   22G  119G  16% /var/lib/ceph/osd/ceph-34
>>
>>
>> How do I tell Ceph to read these object shards?
>>
>>
>>
>> PS: It's probably a good idea to reweight the OSDs to 0 before starting
>> again. This should prevent data flowing onto them, provided they are not
>> already isolated by a different device class or other crush selection
>> ruleset. I.e.:
>>   for OSD in `seq 24 35`; do
>>     ceph osd crush reweight osd.$OSD 0;
>>   done
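>>
>> The effect can be confirmed afterwards with, e.g.:
>>   ceph osd df tree;   # the reweighted OSDs should show a CRUSH weight of 0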
>>
>>
>> Regards
>> David Herselman
>>
>> -----Original Message-----
>>
>> From: David Herselman
>> Sent: Thursday, 21 December 2017 3:49 AM
>> To: 'Christian Balzer' <ch...@gol.com>; ceph-users@lists.ceph.com
>> Subject: RE: [ceph-users] Many concurrent drive failures - How do I
>> activate pgs?
>>
>> Hi Christian,
>>
>> Thanks for taking the time. I haven't been contacted by anyone yet, but I
>> managed to get the down placement groups cleared by exporting 7.4s0 and
>> 7.fs0 and then marking them as complete on the surviving OSDs:
>>     kvm5c:
>>       ceph-objectstore-tool --op export --pgid 7.4s0 \
>>         --data-path /var/lib/ceph/osd/ceph-8 \
>>         --journal-path /var/lib/ceph/osd/ceph-8/journal \
>>         --file /var/lib/vz/template/ssd_recovery/osd8_7.4s0.export;
>>       ceph-objectstore-tool --op mark-complete \
>>         --data-path /var/lib/ceph/osd/ceph-8 \
>>         --journal-path /var/lib/ceph/osd/ceph-8/journal --pgid 7.4s0;
>>     kvm5f:
>>       ceph-objectstore-tool --op export --pgid 7.fs0 \
>>         --data-path /var/lib/ceph/osd/ceph-23 \
>>         --journal-path /var/lib/ceph/osd/ceph-23/journal \
>>         --file /var/lib/vz/template/ssd_recovery/osd23_7.fs0.export;
>>       ceph-objectstore-tool --op mark-complete \
>>         --data-path /var/lib/ceph/osd/ceph-23 \
>>         --journal-path /var/lib/ceph/osd/ceph-23/journal --pgid 7.fs0;
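>>
>> After marking the placement groups complete, their state can be verified
>> with, e.g.:
>>       ceph health detail;        # the PGs should no longer be reported down
>>       ceph pg 7.4 query | less;  # per-PG peering and recovery detail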
>>
>> This presumably just punches holes in the RBD images, but at least we can
>> copy them out of that pool and hope that Intel can somehow unlock the
>> drives so that we can then export/import the missing objects.
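>>
>> Copying the images out can be done with the standard rbd tooling, e.g.
>> (the pool and image names here are illustrative):
>>     rbd -p rbd_ssd ls;
>>     rbd export rbd_ssd/vm-100-disk-1 \
>>       /var/lib/vz/template/ssd_recovery/vm-100-disk-1.raw;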
>>
>>
>> To answer your questions though: we have 6 near-identical Intel Wildcat
>> Pass 1U servers, all running Proxmox. Proxmox uses a Debian 9 base with
>> the Ubuntu kernel, to which they apply cherry-picked kernel patches (e.g.
>> Intel NIC driver updates, vhost perf regression and mem-leak fixes, etc.):
>>
>> kvm5a:
>>        Intel R1208WTTGSR System (serial: BQWS55091014)
>>        Intel S2600WTTR Motherboard (serial: BQWL54950385, BIOS ID:
>> SE5C610.86B.01.01.0021.032120170601)
>>        2 x Intel Xeon E5-2640v4 2.4GHz (HT disabled)
>>        24 x Micron 8GB DDR4 2133MHz (24 x 18ASF1G72PZ-2G1B1)
>>        Intel AXX10GBNIA I/O Module
>> kvm5b:
>>        Intel R1208WTTGS System (serial: BQWS53890178)
>>        Intel S2600WTT Motherboard (serial: BQWL52550359, BIOS ID:
>> SE5C610.86B.01.01.0021.032120170601)
>>        2 x Intel Xeon E5-2640v4 2.4GHz (HT enabled)
>>        4 x Micron 64GB DDR4 2400MHz LR-DIMM (4 x 72ASS8G72LZ-2G3B2)
>>        Intel AXX10GBNIA I/O Module
>> kvm5c:
>>        Intel R1208WT2GS System (serial: BQWS50490279)
>>        Intel S2600WT2 Motherboard (serial: BQWL44650203, BIOS ID:
>> SE5C610.86B.01.01.0021.032120170601)
>>        2 x Intel Xeon E5-2640v3 2.6GHz (HT enabled)
>>        4 x Micron 64GB DDR4 2400MHz LR-DIMM (4 x 72ASS8G72LZ-2G3B2)
>>        Intel AXX10GBNIA I/O Module
>> kvm5d:
>>        Intel R1208WTTGSR System (serial: BQWS62291318)
>>        Intel S2600WTTR Motherboard (serial: BQWL61855187, BIOS ID:
>> SE5C610.86B.01.01.0021.032120170601)
>>        2 x Intel Xeon E5-2640v4 2.4GHz (HT enabled)
>>        4 x Micron 64GB DDR4 2400MHz LR-DIMM (4 x 72ASS8G72LZ-2G3B2)
>>        Intel AXX10GBNIA I/O Module
>> kvm5e:
>>        Intel R1208WTTGSR System (serial: BQWS64290162)
>>        Intel S2600WTTR Motherboard (serial: BQWL63953066, BIOS ID:
>> SE5C610.86B.01.01.0021.032120170601)
>>        2 x Intel Xeon E5-2640v4 2.4GHz (HT enabled)
>>        4 x Micron 64GB DDR4 2400MHz LR-DIMM (4 x 72ASS8G72LZ-2G3B2)
>>        Intel AXX10GBNIA I/O Module
>> kvm5f:
>>        Intel R1208WTTGSR System (serial: BQWS71790632)
>>        Intel S2600WTTR Motherboard (serial: BQWL71050622, BIOS ID:
>> SE5C610.86B.01.01.0021.032120170601)
>>        2 x Intel Xeon E5-2640v4 2.4GHz (HT enabled)
>>        4 x Micron 64GB DDR4 2400MHz LR-DIMM (4 x 72ASS8G72LZ-2G3B2)
>>        Intel AXX10GBNIA I/O Module
>>
>> Summary:
>>   * 5b has an Intel S2600WTT, 5c has an Intel S2600WT2, all others have
>> S2600WTTR Motherboards
>>   * 5a has ECC Registered Dual Rank DDR DIMMs, all others have ECC
>> LoadReduced-DIMMs
>>   * 5c has an Intel X540-AT2 10 GbE adapter as the on-board NICs are only
>> 1 GbE
>>
>>
>> Each system has identical discs:
>>   * 2 x 480 GB Intel SSD DC S3610 (SSDSC2BX480G4) - partitioned as
>> software RAID1 OS volume and Ceph FileStore journals (spinners)
>>   * 4 x 2 TB Seagate discs (ST2000NX0243) - Ceph FileStore OSDs (journals
>> in S3610 partitions)
>>   * 2 x 1.9 TB Intel SSD DC S4600 (SSDSC2KG019T7) - Ceph BlueStore OSDs
>> (problematic)
>>
>>
>> Additional information:
>>   * All drives are directly attached to the on-board AHCI SATA
>> controllers, via the standard 2.5 inch drive chassis hot-swap bays.
>>   * We added 12 x 1.9 TB SSD DC S4600 drives last week Thursday, 2 in
>> each system's slots 7 & 8
>>   * Systems have been operating with existing Intel SSD DC S3610 and 2 TB
>> Seagate discs for over a year; we added the most recent node (kvm5f) on the
>> 23rd of November.
>>   * 6 of the 12 Intel SSD DC S4600 drives failed in less than 100 hours.
>>   * They work perfectly until they suddenly stop responding; thereafter
>> they are completely inaccessible, even after physically shutting the
>> server down and powering it back up. Intel's diagnostic tool reports them
>> as 'logically locked'.
>>
>>
>> Drive failures appear random to me:
>>     kvm5a - bay 7 - offline - Model=INTEL SSDSC2KG019T7, FwRev=SCV10100,
>> SerialNo=BTYM739208851P9DGN
>>     kvm5a - bay 8 - online  - Model=INTEL SSDSC2KG019T7, FwRev=SCV10100,
>> SerialNo=PHYM727602TM1P9DGN
>>     kvm5b - bay 7 - offline - Model=INTEL SSDSC2KG019T7, FwRev=SCV10100,
>> SerialNo=PHYM7276031E1P9DGN
>>     kvm5b - bay 8 - online  - Model=INTEL SSDSC2KG019T7, FwRev=SCV10100,
>> SerialNo=BTYM7392087W1P9DGN
>>     kvm5c - bay 7 - offline - Model=INTEL SSDSC2KG019T7, FwRev=SCV10100,
>> SerialNo=BTYM739200ZJ1P9DGN
>>     kvm5c - bay 8 - offline - Model=INTEL SSDSC2KG019T7, FwRev=SCV10100,
>> SerialNo=BTYM7392088B1P9DGN
>>     kvm5d - bay 7 - offline - Model=INTEL SSDSC2KG019T7, FwRev=SCV10100,
>> SerialNo=BTYM738604Y11P9DGN
>>     kvm5d - bay 8 - online  - Model=INTEL SSDSC2KG019T7, FwRev=SCV10100,
>> SerialNo=PHYM727603181P9DGN
>>     kvm5e - bay 7 - online  - Model=INTEL SSDSC2KG019T7, FwRev=SCV10100,
>> SerialNo=BTYM7392013B1P9DGN
>>     kvm5e - bay 8 - offline - Model=INTEL SSDSC2KG019T7, FwRev=SCV10100,
>> SerialNo=BTYM7392087E1P9DGN
>>     kvm5f - bay 7 - online  - Model=INTEL SSDSC2KG019T7, FwRev=SCV10100,
>> SerialNo=BTYM739208721P9DGN
>>     kvm5f - bay 8 - online  - Model=INTEL SSDSC2KG019T7, FwRev=SCV10100,
>> SerialNo=BTYM739208C41P9DGN
>>
>>
>> Intel SSD Data Center Tool reports:
>> C:\isdct>isdct.exe show -intelssd
>>
>> - Intel SSD DC S4600 Series PHYM7276031E1P9DGN -
>>
>> Bootloader : Property not found
>> DevicePath : \\\\.\\PHYSICALDRIVE1
>> DeviceStatus : Selected drive is in a disable logical state.
>> Firmware : SCV10100
>> FirmwareUpdateAvailable : Please contact Intel Customer Support for
>> further assistance at the following website:
>> http://www.intel.com/go/ssdsupport.
>> Index : 0
>> ModelNumber : INTEL SSDSC2KG019T7
>> ProductFamily : Intel SSD DC S4600 Series
>> SerialNumber : PHYM7276031E1P9DGN
>>
>>
>> C:\isdct>isdct show -a -intelssd 0
>>
>> - Intel SSD DC S4600 Series PHYM7276031E1P9DGN -
>>
>> AccessibleMaxAddressSupported : True
>> AggregationThreshold : Selected drive is in a disable logical state.
>> AggregationTime : Selected drive is in a disable logical state.
>> ArbitrationBurst : Selected drive is in a disable logical state.
>> BusType : 11
>> CoalescingDisable : Selected drive is in a disable logical state.
>> ControllerCompatibleIDs : PCI\\VEN_8086&DEV_8C02&REV_05PCI\\VEN_8086&DEV_8C02PCI\\VEN_8086&CC_010601PCI\\VEN_8086&CC_0106PCI\\VEN_8086PCI\\CC_010601PCI\\CC_0106
>> ControllerDescription : @mshdc.inf,%pci\\cc_010601.devicedesc%;Standard SATA AHCI Controller
>> ControllerID : PCI\\VEN_8086&DEV_8C02&SUBSYS_78461462&REV_05\\3&11583659&0&FA
>> ControllerIDEMode : False
>> ControllerManufacturer : @mshdc.inf,%ms-ahci%;Standard SATA AHCI Controller
>> ControllerService : storahci
>> DIPMEnabled : False
>> DIPMSupported : False
>> DevicePath : \\\\.\\PHYSICALDRIVE1
>> DeviceStatus : Selected drive is in a disable logical state.
>> DigitalFenceSupported : False
>> DownloadMicrocodePossible : True
>> DriverDescription : Standard SATA AHCI Controller
>> DriverMajorVersion : 10
>> DriverManufacturer : Standard SATA AHCI Controller
>> DriverMinorVersion : 0
>> DriverVersion : 10.0.16299.98
>> DynamicMMIOEnabled : The selected drive does not support this feature.
>> EnduranceAnalyzer : Selected drive is in a disable logical state.
>> ErrorString : *BAD_CONTEXT_2020 F4
>> Firmware : SCV10100
>> FirmwareUpdateAvailable : Please contact Intel Customer Support for
>> further assistance at the following website:
>> http://www.intel.com/go/ssdsupport.
>> HDD : False
>> HighPriorityWeightArbitration : Selected drive is in a disable logical
>> state.
>> IEEE1667Supported : False
>> IOCompletionQueuesRequested : Selected drive is in a disable logical
>> state.
>> IOSubmissionQueuesRequested : Selected drive is in a disable logical
>> state.
>> Index : 0
>> Intel : True
>> IntelGen3SATA : True
>> IntelNVMe : False
>> InterruptVector : Selected drive is in a disable logical state.
>> IsDualPort : False
>> LatencyTrackingEnabled : Selected drive is in a disable logical state.
>> LowPriorityWeightArbitration : Selected drive is in a disable logical
>> state.
>> Lun : 0
>> MaximumLBA : 3750748847
>> MediumPriorityWeightArbitration : Selected drive is in a disable logical
>> state.
>> ModelNumber : INTEL SSDSC2KG019T7
>> NVMePowerState : Selected drive is in a disable logical state.
>> NativeMaxLBA : Selected drive is in a disable logical state.
>> OEM : Generic
>> OpalState : Selected drive is in a disable logical state.
>> PLITestTimeInterval : Selected drive is in a disable logical state.
>> PNPString : SCSI\\DISK&VEN_INTEL&PROD_SSDSC2KG019T7\\4&2BE6C224&0&010000
>> PathID : 1
>> PhySpeed : Selected drive is in a disable logical state.
>> PhysicalSectorSize : Selected drive is in a disable logical state.
>> PhysicalSize : 1920383410176
>> PowerGovernorAveragePower : Selected drive is in a disable logical state.
>> PowerGovernorBurstPower : Selected drive is in a disable logical state.
>> PowerGovernorMode : Selected drive is in a disable logical state.
>> Product : Youngsville
>> ProductFamily : Intel SSD DC S4600 Series
>> ProductProtocol : ATA
>> ReadErrorRecoveryTimer : Selected drive is in a disable logical state.
>> RemoteSecureEraseSupported : False
>> SCSIPortNumber : 0
>> SMARTEnabled : True
>> SMARTHealthCriticalWarningsConfiguration : Selected drive is in a
>> disable logical state.
>> SMARTSelfTestSupported : True
>> SMBusAddress : Selected drive is in a disable logical state.
>> SSCEnabled : False
>> SanitizeBlockEraseSupported : False
>> SanitizeCryptoScrambleSupported : True
>> SanitizeSupported : True
>> SataGen1 : True
>> SataGen2 : True
>> SataGen3 : True
>> SataNegotiatedSpeed : Unknown
>> SectorSize : 512
>> SecurityEnabled : False
>> SecurityFrozen : False
>> SecurityLocked : False
>> SecuritySupported : False
>> SerialNumber : PHYM7276031E1P9DGN
>> TCGSupported : False
>> TargetID : 0
>> TempThreshold : Selected drive is in a disable logical state.
>> TemperatureLoggingInterval : Selected drive is in a disable logical state.
>> TimeLimitedErrorRecovery : Selected drive is in a disable logical state.
>> TrimSize : 4
>> TrimSupported : True
>> VolatileWriteCacheEnabled : Selected drive is in a disable logical state.
>> WWID : 3959312879584368077
>> WriteAtomicityDisableNormal : Selected drive is in a disable logical
>> state.
>> WriteCacheEnabled : True
>> WriteCacheReorderingStateEnabled : Selected drive is in a disable
>> logical state.
>> WriteCacheState : Selected drive is in a disable logical state.
>> WriteCacheSupported : True
>> WriteErrorRecoveryTimer : Selected drive is in a disable logical state.
>>
>>
>>
>> SMART information is inaccessible and the overall status is failed.
>> Herewith the stats from a partner disc which was still working when the
>> others failed:
>> Device Model:     INTEL SSDSC2KG019T7
>> Serial Number:    PHYM727602TM1P9DGN
>> LU WWN Device Id: 5 5cd2e4 14e1636bb
>> Firmware Version: SCV10100
>> User Capacity:    1,920,383,410,176 bytes [1.92 TB]
>> Sector Sizes:     512 bytes logical, 4096 bytes physical
>> Rotation Rate:    Solid State Device
>> Form Factor:      2.5 inches
>> Device is:        Not in smartctl database [for details use: -P showall]
>> ATA Version is:   ACS-3 T13/2161-D revision 5
>> SATA Version is:  SATA 3.2, 6.0 Gb/s (current: 6.0 Gb/s)
>> Local Time is:    Mon Dec 18 19:33:51 2017 SAST
>> SMART support is: Available - device has SMART capability.
>> SMART support is: Enabled
>>
>> ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
>>   5 Reallocated_Sector_Ct   0x0032   100   100   000    Old_age   Always       -       0
>>   9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       98
>>  12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       3
>> 170 Unknown_Attribute       0x0033   100   100   010    Pre-fail  Always       -       0
>> 171 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       1
>> 172 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       0
>> 174 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       0
>> 175 Program_Fail_Count_Chip 0x0033   100   100   010    Pre-fail  Always       -       17567121432
>> 183 Runtime_Bad_Block       0x0032   100   100   000    Old_age   Always       -       0
>> 184 End-to-End_Error        0x0033   100   100   090    Pre-fail  Always       -       0
>> 187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
>> 190 Airflow_Temperature_Cel 0x0022   077   076   000    Old_age   Always       -       23 (Min/Max 17/29)
>> 192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       0
>> 194 Temperature_Celsius     0x0022   100   100   000    Old_age   Always       -       23
>> 197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
>> 199 UDMA_CRC_Error_Count    0x003e   100   100   000    Old_age   Always       -       0
>> 225 Unknown_SSD_Attribute   0x0032   100   100   000    Old_age   Always       -       14195
>> 226 Unknown_SSD_Attribute   0x0032   100   100   000    Old_age   Always       -       0
>> 227 Unknown_SSD_Attribute   0x0032   100   100   000    Old_age   Always       -       42
>> 228 Power-off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       5905
>> 232 Available_Reservd_Space 0x0033   100   100   010    Pre-fail  Always       -       0
>> 233 Media_Wearout_Indicator 0x0032   100   100   000    Old_age   Always       -       0
>> 234 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       0
>> 241 Total_LBAs_Written      0x0032   100   100   000    Old_age   Always       -       14195
>> 242 Total_LBAs_Read         0x0032   100   100   000    Old_age   Always       -       10422
>> 243 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       41906
>>
>>
>> Media wear out : 0% used
>> LBAs written: 14195
>> Power on hours: <100
>> Power cycle count: once at the factory, once at our offices to check if
>> there was newer firmware (there wasn't) and once when we restarted the node
>> to see if it could then access a failed drive.
>>
>>
>> Regards
>> David Herselman
>>
>>
>> -----Original Message-----
>> From: Christian Balzer [mailto:ch...@gol.com]
>> Sent: Thursday, 21 December 2017 3:24 AM
>> To: ceph-users@lists.ceph.com
>> Cc: David Herselman <d...@syrex.co>
>> Subject: Re: [ceph-users] Many concurrent drive failures - How do I
>> activate pgs?
>>
>> Hello,
>>
>> first off, I don't have anything to add to your conclusions about the
>> current status. There are, however, at least 2 folks here on the ML making
>> a living from Ceph disaster recovery, so I hope you have been contacted
>> already.
>>
>> Now once your data is safe or you have a moment, I and others here would
>> probably be quite interested in some more details, see inline below.
>>
>> On Wed, 20 Dec 2017 22:25:23 +0000 David Herselman wrote:
>>
>> [snip]
>> >
>> > We've happily been running a 6 node cluster with 4 x FileStore HDDs per
>> node (journals on SSD partitions) for over a year and recently upgraded all
>> nodes to Debian 9, Ceph Luminous 12.2.2 and kernel 4.13.8. We ordered 12 x
>> Intel DC S4600 SSDs which arrived last week so we added two per node on
>> Thursday evening and brought them up as BlueStore OSDs. We had proactively
>> updated our existing pools to reference only devices classed as 'hdd', so
>> that we could move select images over to ssd replicated and erasure coded
>> pools.
>> >
>> Could you tell us more about that cluster: the hardware, how the SSDs are
>> connected, and the controller firmware version if applicable?
>>
>> Kernel 4.13.8 suggests that this is a handrolled, upstream kernel.
>> While not necessarily related I'll note that as far as Debian kernels
>> (which are very lightly if at all patched) are concerned, nothing beyond
>> 4.9 has been working to my satisfaction.
>> 4.11 still worked, but 4.12 crash-reboot-looped on all my Supermicro X10
>> machines (quite a varied selection).
>> The current 4.13.13 backport boots on some of those machines, but still
>> throws errors with the EDAC devices, which works fine with 4.9.
>>
>> 4.14 is known to happily destroy data if used with bcache, and even if
>> one doesn't use bcache that should give you pause.
>>
>> > We were pretty diligent and downloaded Intel's Firmware Update Tool and
>> validated that each new drive had the latest available firmware before
>> installing them in the nodes. We did numerous benchmarks on Friday and
>> eventually moved some images over to the new storage pools. Everything was
>> working perfectly and extensive tests on Sunday showed excellent
>> performance. Sunday night one of the new SSDs died and Ceph replicated and
>> redistributed data accordingly, then another failed in the early hours of
>> Monday morning and Ceph did what it needed to.
>> >
>> > We had the two failed drives replaced by 11am and Ceph was up to
>> 2/4918587 objects degraded (0.000%) when a third drive failed. At this
>> point we updated the crush maps for the rbd_ssd and ec_ssd pools and set
>> the device class to 'hdd', to essentially evacuate everything off the SSDs.
>> Other SSDs then failed at 3:22pm, 4:19pm, 5:49pm and 5:50pm. We've
>> ultimately lost half the Intel S4600 drives, which are all completely
>> inaccessible. Our status at 11:42pm Monday night was: 1/1398478 objects
>> unfound (0.000%) and 339/4633062 objects degraded (0.007%).
>> >
>> The relevant logs of when and how those SSDs failed would be interesting.
>> Was the distribution of the failed SSDs random among the cluster?
>> Are you running smartd and did it have something to say?
>>
>> Completely inaccessible sounds a lot like the infamous "self-bricking" of
>> Intel SSDs when they discover something isn't right, or they don't like the
>> color scheme of the server inside (^.^).
>>
>> I'm using quite a lot of Intel SSDs and had only one "fatal" incident.
>> A DC S3700 detected that its powercap had failed, but of course kept
>> working fine. Until a reboot was needed, when it promptly bricked itself:
>> data inaccessible, SMART barely reporting that something was there.
>>
>> So one wonders what caused your SSDs to get their knickers in such a
>> twist.
>> Are the survivors showing any unusual signs in their SMART output?
>>
>> Of course what your vendor/Intel will have to say will also be of
>> interest. ^o^
>>
>> Regards,
>>
>> Christian
>> --
>> Christian Balzer        Network/Systems Engineer
>> ch...@gol.com           Rakuten Communications
>> _______________________________________________
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>>
>>
>
>
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
