> 
> An observation: We are using proxmox (5.4), and it displays the WEAR level as 
> "N/A", which is unfortunate... :-( Tried upgrading to 6: the same, still no 
> wearout in the GUI.
> 
> Here is smartctl output:
> 
>> smartctl 6.6 2016-05-31 r4324 [x86_64-linux-4.15.18-26-pve] (local build)

Old release, probably came with your OS.

>> Device Model:     Micron_5210_MTFDDAK3T8QDE
>> Firmware Version: D2MU404

Make sure you run the very latest public firmware from Micron.  Ask them if 
this is current.

>> Device is:        Not in smartctl database [for details use: -P showall]

There you go.

>> 202 Unknown_SSD_Attribute   0x0030   100   100   001    Old_age   Offline      -       0

SMART is not consistently implemented.  Some manufacturers / models use 
different attribute numbers.  Some, e.g., will present a counter as lifetime 
remaining, others as lifetime consumed.  And the calculations used to present 
lifetime percentages vary among models and may not be the best predictor of 
failure, e.g. right before the drive is going to fail the number may change 
non-linearly.

But it’s better than nothing, and I digress.

Some tools match the label string returned by smartctl instead of or in 
addition to the number.  smartctl will fall back to default label strings for 
certain attributes (which sometimes gives wrong answers) if there isn’t an 
exact match to the actual drive model.
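
To make the label-matching point concrete, here is a rough Python sketch of how 
a tool might look for a wear attribute by name.  It is not Proxmox's actual 
code: the WEAR_LABELS set and both function names are hypothetical, and the 
sample row is the attribute 202 line quoted above.

```python
# Sketch only: WEAR_LABELS is a hypothetical example, not Proxmox's real list.
WEAR_LABELS = {
    "Percent_Lifetime_Remain",
    "Media_Wearout_Indicator",
    "Wear_Leveling_Count",
}

SAMPLE = ("202 Unknown_SSD_Attribute   0x0030   100   100   001    "
          "Old_age   Offline      -       0")


def parse_attribute_row(line):
    """Split one row of the `smartctl -A` attribute table into fields."""
    fields = line.split(None, 9)
    if len(fields) < 10 or not fields[0].isdigit():
        return None  # header, blank line, or other non-attribute text
    return {
        "id": int(fields[0]),
        "name": fields[1],
        "value": int(fields[3]),   # normalized VALUE column
        "raw": fields[9].strip(),  # RAW_VALUE column
    }


def find_wear_row(rows):
    """Return the first row whose label is a known wear label, else None."""
    for row in rows:
        if row["name"] in WEAR_LABELS:
            return row
    return None


if __name__ == "__main__":
    row = parse_attribute_row(SAMPLE)
    # With the generic label, a name-based lookup finds nothing.
    print(find_wear_row([row]))
    # Once the drive database supplies a real label, the same row is found.
    print(find_wear_row([dict(row, name="Percent_Lifetime_Remain")]))
```

A tool that keys on the label alone has nothing to show for a row named 
Unknown_SSD_Attribute, which would be consistent with the "N/A" in the GUI.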

So I bet if you were to add

    "Micron_5210_(EE|MT)FDDAK(1T9|3T8|7T6)QDE|" // tested with Micron_5210_MTFDDAK1T9QDC

to your drivedb.h stanza for the 5100/5200, proxmox might pick it up.
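
Before touching drivedb.h it is cheap to sanity-check the pattern against the 
model string smartctl reports.  A quick Python sketch, assuming smartctl 
matches the expression against the whole model string (hence re.fullmatch) and 
that Python's regex syntax behaves the same as smartctl's for a pattern this 
simple; matches_model is a name made up for the example.

```python
import re

# The 5210 alternative from the proposed line, without the trailing "|"
# that only joins it to the next alternative in drivedb.h.
PATTERN_5210 = r"Micron_5210_(EE|MT)FDDAK(1T9|3T8|7T6)QDE"


def matches_model(pattern, model):
    """True if the whole model string matches the pattern."""
    return re.fullmatch(pattern, model) is not None


if __name__ == "__main__":
    for model in (
        "Micron_5210_MTFDDAK3T8QDE",  # the "Device Model" from the smartctl output
        "Micron_5210_MTFDDAK1T9QDC",  # the string quoted in the comment
        "Micron_5200_MTFDDAK3T8TDD",  # a 5200, covered by a different alternative
    ):
        print(model, matches_model(PATTERN_5210, model))
```

The 3T8QDE model from the smartctl output matches.  The string in the comment 
ends in QDC, so it would not match this pattern as written.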

Your 6.6 drivedb.h may not even have a 5100/5200 entry.  The latest upstream is 
here

https://www.smartmontools.org/browser/trunk/smartmontools/drivedb.h

But beware that IIRC the bits at the beginning of the file changed with 7.0, so 
you won’t be able to inhale the latest file in toto, and it doesn’t actually 
list the 5210.  Someone from Micron submitted an omnibus update that included 
it, but it conflicted with the existing 5100/5200 entry, and I haven’t gotten 
around to submitting a reconciled version.

With the caveat that I’m composing this in my MUA and have not tested it, you 
might try adding the below to your current drivedb.h.  Now that I again have a 
5100 Pro to test against I need to get off my butt and submit a diff upstream 
so it gets into 7.2.


Be careful with the syntax, it’s easy to get the |{,} characters wrong.


  // Reference: https://www.micron.com/resource-details/feec878a-265e-49a7-8086-15137c5f9011
  // TN-FD-34: 5100 SSD SMART Implementation
  {
    "Micron 5100 Pro / 5200 SSDs",
    "(Micron_5100_)?(EE|MT)FDDA[KV](240|480|960|1T9|3T8|7T6)T(BY|CB|CC)|" // Matches both stock and Dell OEM
    "Micron_5210_(EE|MT)FDDAK(1T9|3T8|7T6)QDE|" // tested with Micron_5210_MTFDDAK1T9QDC
    "(Micron_5200_)?MTFDDAK(480|960|1T9|3T8|7T6)TD(C|D|N)", // tested with Micron_5200_MTFDDAK3T8TDD/D1MU505
    "", "",
  //"-v 1,raw48,Raw_Read_Error_Rate "
  //"-v 5,raw48,Reallocated_Block_Count "
  //"-v 9,raw24(raw8),Power_On_Hours "  // raw24(raw8)??
  //"-v 12,raw48,Power_Cycle_Count "
    "-v 170,raw48,Reserved_Block_Pct " // Percentage of remaining reserved blocks available
    "-v 171,raw48,Program_Fail_Count "
    "-v 172,raw48,Erase_Fail_Count "
    "-v 173,raw48,Avg_Block-Erase_Count "
    "-v 174,raw48,Unexpect_Power_Loss_Ct "
  //"-v 180,raw48,Reserved_Block_Count " // absolute count of remaining reserved blocks available
    "-v 183,raw48,SATA_Int_Downshift_Ct " // SATA speed downshift count
  //"-v 184,raw48,Error_Correction_Count "
  //"-v 187,raw48,Reported_Uncorrect " // Number of UECC correction failures
  //"-v 188,raw48,Command_Timeouts "
  //"-v 194,tempminmax,Temperature_Celsius " // 100 - degrees C, wraps: 101 reported as 255
  //"-v 195,raw48,Cumulativ_Corrected_ECC "
  //"-v 196,raw48,Reallocation_Event_Ct "
  //"-v 197,raw48,Current_Pending_Sector " // Use the raw value
  //"-v 198,raw48,Offline_Uncorrectable "  // Use the raw value
  //"-v 199,raw48,UDMA_CRC_Error_Count "   // Use the raw value
    "-v 202,raw48,Percent_Lifetime_Remain " // Remaining endurance, trips at 10%
    "-v 206,raw48,Write_Error_Rate "
    "-v 210,raw48,RAIN_Success_Recovered "  // Total number of NAND pages recovered by RAIN
    "-v 211,raw48,Integ_Scan_Complete_Cnt " // Number of periodic data integrity scans completed
    "-v 212,raw48,Integ_Scan_Folding_Cnt "  // Number of blocks reallocated by integrity scans
    "-v 213,raw48,Integ_Scan_Progress "     // Current is percentage, raw is absolute number of superblocks scanned by the current integrity scan
    "-v 247,raw48,Host_Program_Page_Count "
    "-v 248,raw48,Bckgnd_Program_Page_Cnt"
  },
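
The "-v ID,FORMAT,NAME" strings in the stanza are the same overrides smartctl 
accepts on the command line, so a label can be tried on a single drive before 
editing drivedb.h.  An untested Python sketch that only assembles the command; 
build_smartctl_cmd is a made-up helper and /dev/sdX is a placeholder for the 
real device.

```python
def build_smartctl_cmd(device, overrides):
    """Assemble a `smartctl -A` command line with per-attribute -v overrides.

    overrides is a list of (attribute_id, format, label) tuples.
    """
    cmd = ["smartctl", "-A"]
    for attr_id, fmt, label in overrides:
        cmd += ["-v", f"{attr_id},{fmt},{label}"]
    cmd.append(device)
    return cmd


if __name__ == "__main__":
    # /dev/sdX is a placeholder; substitute the actual device node.
    cmd = build_smartctl_cmd(
        "/dev/sdX",
        [(202, "raw48", "Percent_Lifetime_Remain")],
    )
    # Print rather than execute, so this is safe on a host without smartctl.
    print(" ".join(cmd))
```

If the relabeled attribute looks sane when that command is run by hand, the 
same string should be reasonable to carry into the drivedb.h entry.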
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
