Hi,

We would like to replace the current Seagate ST4000NM0034 HDDs in our Ceph cluster with SSDs. Before doing that, we would like to check the typical usage of the current drives over the past years, so we can select the best SSD (price/performance/endurance) to replace them with.

I am trying to extract this info from the fields "Blocks received from initiator" / "Blocks sent to initiator", as these are the fields smartctl reports for the Seagate disks. But the numbers look strange, so I would like to ask for feedback here.

Three identical nodes with 8 OSDs per node, all 4 TB ST4000NM0034 (filestore) HDDs with SSD-based journals:

root@node1:~# ceph osd crush tree
ID CLASS WEIGHT   TYPE NAME
-1       87.35376 root default
-2       29.11688     host node1
 0   hdd  3.64000         osd.0
 1   hdd  3.64000         osd.1
 2   hdd  3.63689         osd.2
 3   hdd  3.64000         osd.3
12   hdd  3.64000         osd.12
13   hdd  3.64000         osd.13
14   hdd  3.64000         osd.14
15   hdd  3.64000         osd.15
-3       29.12000     host node2
 4   hdd  3.64000         osd.4
 5   hdd  3.64000         osd.5
 6   hdd  3.64000         osd.6
 7   hdd  3.64000         osd.7
16   hdd  3.64000         osd.16
17   hdd  3.64000         osd.17
18   hdd  3.64000         osd.18
19   hdd  3.64000         osd.19
-4       29.11688     host node3
 8   hdd  3.64000         osd.8
 9   hdd  3.64000         osd.9
10   hdd  3.64000         osd.10
11   hdd  3.64000         osd.11
20   hdd  3.64000         osd.20
21   hdd  3.64000         osd.21
22   hdd  3.64000         osd.22
23   hdd  3.63689         osd.23

We are looking at the numbers from smartctl and basing our calculations on this output for each individual OSD:
Vendor (Seagate) cache information
  Blocks sent to initiator = 3783529066
  Blocks received from initiator = 3121186120
  Blocks read from cache and sent to initiator = 545427169
  Number of read and write commands whose size <= segment size = 93877358
  Number of read and write commands whose size > segment size = 2290879
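
For reference, this is roughly how we pull the two counters per drive (just a sketch; the device path and the 512-byte block size below are assumptions on our side, adjust for your setup):

#!/usr/bin/env python3
# Sketch: parse the Seagate vendor cache counters from smartctl output.
import re
import subprocess

def vendor_cache_counters(device):
    """Return (blocks_sent, blocks_received) as printed by 'smartctl -a'."""
    out = subprocess.run(["smartctl", "-a", device],
                         capture_output=True, text=True).stdout
    sent = int(re.search(r"Blocks sent to initiator\s*=\s*(\d+)", out).group(1))
    recv = int(re.search(r"Blocks received from initiator\s*=\s*(\d+)", out).group(1))
    return sent, recv

if __name__ == "__main__":
    sent, recv = vendor_cache_counters("/dev/sda")  # example device
    block_size = 512  # assumption: logical block size of these drives
    print("read  ~ %.2f TB" % (sent * block_size / 1e12))
    print("write ~ %.2f TB" % (recv * block_size / 1e12))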

I created the following spreadsheet:

        aka   blocks sent     blocks received  total blocks
              to initiator    from initiator   calculated       read%   write%
node1
osd0    sda   905060564       1900663448       2805724012       32,26%  67,74%
osd1    sdb   2270442418      3756215880       6026658298       37,67%  62,33%
osd2    sdc   3531938448      3940249192       7472187640       47,27%  52,73%
osd3    sdd   2824808123      3130655416       5955463539       47,43%  52,57%
osd12   sdg   1956722491      1294854032       3251576523       60,18%  39,82%
osd13   sdh   3410188306      1265443936       4675632242       72,94%  27,06%
osd14   sdi   3765454090      3115079112       6880533202       54,73%  45,27%
osd15   sdj   2272246730      2218847264       4491093994       50,59%  49,41%

node2
osd4    sda   3974937107      740853712        4715790819       84,29%  15,71%
osd5    sdb   1181377668      2109150744       3290528412       35,90%  64,10%
osd6    sdc   1903438106      608869008        2512307114       75,76%  24,24%
osd7    sdd   3511170043      724345936        4235515979       82,90%  17,10%
osd16   sdg   2642731906      3981984640       6624716546       39,89%  60,11%
osd17   sdh   3994977805      3703856288       7698834093       51,89%  48,11%
osd18   sdi   3992157229      2096991672       6089148901       65,56%  34,44%
osd19   sdj   279766405       1053039640       1332806045       20,99%  79,01%

node3
osd8    sda   3711322586      234696960        3946019546       94,05%  5,95%
osd9    sdb   1203912715      3132990000       4336902715       27,76%  72,24%
osd10   sdc   912356010       1681434416       2593790426       35,17%  64,83%
osd11   sdd   810488345       2626589896       3437078241       23,58%  76,42%
osd20   sdg   1506879946      2421596680       3928476626       38,36%  61,64%
osd21   sdh   2991526593      7525120          2999051713       99,75%  0,25%
osd22   sdi   29560337        3226114552       3255674889       0,91%   99,09%
osd23   sdj   2019195656      2563506320       4582701976       44,06%  55,94%
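
For clarity, the "total blocks calculated" and the read/write percentages above come from this calculation (a Python equivalent of the spreadsheet formula, shown with the node1/osd0 row):

# Spreadsheet logic, using the node1/osd0 counters from the table above.
blocks_sent = 905060564        # "Blocks sent to initiator"       -> reads
blocks_received = 1900663448   # "Blocks received from initiator" -> writes

total = blocks_sent + blocks_received        # 2805724012
read_pct = 100.0 * blocks_sent / total       # ~32.26%
write_pct = 100.0 * blocks_received / total  # ~67.74%
print("total=%d read=%.2f%% write=%.2f%%" % (total, read_pct, write_pct))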

But as can be seen above, this results in some very strange numbers; for example node3/osd21, node2/osd19, and node3/osd8 look very unlikely.

So, probably we're doing something wrong in our logic here.

Can someone explain what we're doing wrong? And is it possible to obtain stats like these from Ceph directly, i.e. does Ceph keep historical stats like the above?

MJ