Hi,
we are having massive performance issues with our Ceph cluster, and by
now I have run out of ideas on where to look and how to debug further.
See the attachments for the full benchmark logs; relevant excerpts:
HDD pool:
> read: IOPS=7591, BW=119MiB/s (124MB/s)(34.9GiB/300892msec)
SSD pool:
> read: IOPS=2007, BW=31.4MiB/s (32.9MB/s)(9422MiB/300334msec)
Yes, the SSD pool is in fact slower than the HDD pool in both IOPS and
throughput. Of course this is only one particular benchmark scenario,
but at least it gives some numbers rather than just “everything feels
slow”. I’m happy to run different benchmarks if required.
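For example, to take the RBD and VM layers out of the picture entirely,
I could run rados bench directly against the two pools used in the fio
runs below, roughly like this (parameters are only a first guess, I
have not run these yet):

# rados bench -p rbd 60 write --no-cleanup
# rados bench -p rbd 60 rand
# rados -p rbd cleanup
# rados bench -p rbd.ssd 60 write --no-cleanup
# rados bench -p rbd.ssd 60 rand
# rados -p rbd.ssd cleanup

(The rand pass reads the objects left behind by the write pass, hence
the --no-cleanup.) Let me know if different parameters or tools would
be more useful.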
Cluster information:
* 2x10G ethernet to a Cisco Catalyst 3850 on each node
* Debian 11 (bullseye), Linux 5.10
* Ceph 18.2.4 (reef), managed with cephadm
* All but the oldest node serve as VM hypervisors (KVM) as well
* 6 nodes (hostnames given in parentheses, to match against the OSD information below)
- 2x AMD EPYC 7232P, 64G RAM (cirrus, nimbus)
- 1x Intel Xeon E3-1230 v3, 32G RAM (pileus, no VMs)
- 2x Intel Xeon E5-1620 v4, 64G RAM (lenticular, nacreous)
- 1x Intel Xeon Silver 4110, 96G RAM (stratus)
* Cluster usage: mostly RBD, some RGW and CephFS
* OSDs:
- 19 Hitachi Ultrastar 7K4000 2TB
- 1 WD Ultrastar DC HC310 4TB
- 3 Hitachi Deskstar 7k2000 2TB
- 13 Hitachi Ultrastar A7K3000 2TB
- 8 Micron 5210 ION 2TB
- 4 Micron 5300 MAX - Mixed Use 2TB
- 5 Samsung Datacenter PM893 2TB
You can find various information that I deemed useful below. Please ask
if you would like further information.
As I said, I don’t really know where to look and what to do, so I would
really appreciate any pointers on how to debug this and improve
performance.
Of course I have already seen the PG warnings, but I am really not sure
what to adjust in which direction, especially given the apparent
contradiction between the MANY_OBJECTS_PER_PG and POOL_TOO_MANY_PGS
warnings for the default.rgw.ec.data pool: it has far more objects per
PG than the cluster average (27197 vs. 2417), which suggests increasing
pg_num, yet the autoscaler recommends scaling it down from 128 to 32?!
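For reference, if I were simply to follow the autoscaler, I assume
(untested, please correct me if this is wrong) the change would be
applied with

# ceph osd pool set default.rgw.ec.data pg_num 32

whereas addressing the objects-per-PG warning would mean going the
other way, e.g.

# ceph osd pool set default.rgw.ec.data pg_num 256

As far as I understand, pgp_num follows pg_num automatically on recent
releases, but this is exactly the kind of thing I would like a second
opinion on before touching anything.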
Thanks,
Thomas
# ceph -s
  cluster:
    id:     91688ac0-1b4f-43e3-913d-a844338d9325
    health: HEALTH_WARN
            1 pools have many more objects per pg than average
            3 pools have too many placement groups

  services:
    mon: 5 daemons, quorum nimbus,lenticular,stratus,cirrus,nacreous (age 5d)
    mgr: lenticular(active, since 3M), standbys: nimbus, stratus, nacreous, cirrus
    mds: 1/1 daemons up, 1 standby
    osd: 53 osds: 53 up (since 2w), 53 in (since 18M)
    rgw: 5 daemons active (5 hosts, 1 zones)

  data:
    volumes: 1/1 healthy
    pools:   15 pools, 2977 pgs
    objects: 7.20M objects, 15 TiB
    usage:   42 TiB used, 55 TiB / 97 TiB avail
    pgs:     2975 active+clean
             1    active+clean+scrubbing
             1    active+clean+scrubbing+deep

  io:
    client: 6.5 MiB/s rd, 1.3 MiB/s wr, 158 op/s rd, 63 op/s wr
# ceph health detail
HEALTH_WARN 1 pools have many more objects per pg than average; 3 pools have too many placement groups
[WRN] MANY_OBJECTS_PER_PG: 1 pools have many more objects per pg than average
    pool default.rgw.ec.data objects per pg (27197) is more than 11.2524 times cluster average (2417)
[WRN] POOL_TOO_MANY_PGS: 3 pools have too many placement groups
    Pool default.rgw.ec.data has 128 placement groups, should have 32
    Pool templates has 512 placement groups, should have 256
    Pool rbd.ec has 1024 placement groups, should have 256
# ceph df
--- RAW STORAGE ---
CLASS SIZE AVAIL USED RAW USED %RAW USED
hdd 67 TiB 29 TiB 39 TiB 39 TiB 57.58
ssd 30 TiB 26 TiB 3.7 TiB 3.7 TiB 12.37
TOTAL 97 TiB 55 TiB 42 TiB 42 TiB 43.74
--- POOLS ---
POOL                        ID   PGS  STORED   OBJECTS  USED     %USED  MAX AVAIL
templates                    1   512  4.4 TiB    1.27M   13 TiB  37.46    7.4 TiB
.rgw.root                    9    32  1.4 KiB        4  768 KiB      0    7.4 TiB
default.rgw.control         10    32      0 B        8      0 B      0    7.4 TiB
default.rgw.meta            11    32  7.0 KiB       38  5.1 MiB      0    7.4 TiB
default.rgw.log             12    32  3.5 KiB      208  5.7 MiB      0    7.4 TiB
default.rgw.ec.data         14   128  1.3 TiB    3.48M  2.7 TiB  10.95     13 TiB
default.rgw.buckets.index   15    32   57 MiB       53  171 MiB      0    7.4 TiB
default.rgw.buckets.data    16    32  2.5 GiB   17.85k  8.3 GiB   0.04    7.4 TiB
default.rgw.buckets.non-ec  17    32  1.6 KiB        1  197 KiB      0    7.4 TiB
rbd.ssd                     19   512  872 GiB  259.98k  2.6 TiB   9.46    8.1 TiB
.mgr                        20     1  640 MiB      161  1.9 GiB      0    7.4 TiB
cephfs_metadata             21    32  404 MiB   35.40k  1.2 GiB      0    8.1 TiB
cephfs_data                 22    32   42 GiB   28.09k  128 GiB   0.56    7.4 TiB
rbd                         23   512   12 GiB    6.57k   39 GiB   0.17    7.4 TiB
rbd.ec                      24  1024    8 TiB    2.10M   16 TiB  42.48     11 TiB
The pool "templates" is named as such for historical reasons. It is
used as RBD VM image storage.
# ceph osd pool autoscale-status
POOL                          SIZE    TARGET SIZE  RATE                RAW CAPACITY  RATIO   TARGET RATIO  EFFECTIVE RATIO  BIAS  PG_NUM  NEW PG_NUM  AUTOSCALE  BULK
.rgw.root                     256.0k               3.0                 68931G        0.0000                                 1.0       32              off        False
default.rgw.control          0                     3.0                 68931G        0.0000                                 1.0       32              off        False
default.rgw.meta              1740k                3.0                 68931G        0.0000                                 1.0       32              off        False
default.rgw.log               1932k                3.0                 68931G        0.0000                                 1.0       32              off        False
default.rgw.ec.data           1678G                1.6666666269302368  68931G        0.0406                                 1.0      128              warn       False
default.rgw.buckets.index    58292k                3.0                 68931G        0.0000                                 1.0       32              off        False
default.rgw.buckets.data      2828M                3.0                 68931G        0.0001                                 1.0       32              off        False
default.rgw.buckets.non-ec   67159                 3.0                 68931G        0.0000                                 1.0       32              off        False
.mgr                          639.7M               3.0                 68931G        0.0000                                 1.0        1              off        False
cephfs_metadata               405.4M               3.0                 30404G        0.0000                                 4.0       32              off        False
cephfs_data                  43733M                3.0                 68931G        0.0019                                 1.0       32              off        False
templates                     4541G                3.0                 68931G        0.1977                                 1.0      512              warn       True
rbd.ssd                       870.4G               3.0                 30404G        0.0859                                 1.0      512              warn       True
rbd                          13336M                3.0                 68931G        0.0006                                 1.0      512              off        True
rbd.ec                        8401G                2.0                 68931G        0.2438                                 1.0     1024              warn       True
# ceph osd pool ls detail
pool 1 'templates' replicated size 3 min_size 2 crush_rule 0 object_hash
rjenkins pg_num 512 pgp_num 512 autoscale_mode warn last_change 7495052 lfor
0/2976944/7469772 flags hashpspool,selfmanaged_snaps,bulk stripe_width 0
application rbd read_balance_score 1.48
pool 9 '.rgw.root' replicated size 3 min_size 2 crush_rule 0 object_hash
rjenkins pg_num 32 pgp_num 32 autoscale_mode off last_change 7467477 flags
hashpspool stripe_width 0 application rgw read_balance_score 3.38
pool 10 'default.rgw.control' replicated size 3 min_size 2 crush_rule 0
object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode off last_change
7467478 flags hashpspool stripe_width 0 application rgw read_balance_score 3.37
pool 11 'default.rgw.meta' replicated size 3 min_size 2 crush_rule 0
object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode off last_change
7467479 flags hashpspool stripe_width 0 application rgw read_balance_score 3.36
pool 12 'default.rgw.log' replicated size 3 min_size 2 crush_rule 0 object_hash
rjenkins pg_num 32 pgp_num 32 autoscale_mode off last_change 7467480 flags
hashpspool stripe_width 0 application rgw read_balance_score 4.52
pool 14 'default.rgw.ec.data' erasure profile rgw-video size 5 min_size 4
crush_rule 3 object_hash rjenkins pg_num 128 pgp_num 128 autoscale_mode warn
last_change 7494999 lfor 0/5467738/7467533 flags hashpspool stripe_width 12288
application rgw
pool 15 'default.rgw.buckets.index' replicated size 3 min_size 2 crush_rule 0
object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode off last_change
7467482 flags hashpspool stripe_width 0 application rgw read_balance_score 4.52
pool 16 'default.rgw.buckets.data' replicated size 3 min_size 2 crush_rule 0
object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode off last_change
7467483 flags hashpspool stripe_width 0 application rgw read_balance_score 3.39
pool 17 'default.rgw.buckets.non-ec' replicated size 3 min_size 2 crush_rule 0
object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode off last_change
7467484 flags hashpspool stripe_width 0 application rgw read_balance_score 3.38
pool 19 'rbd.ssd' replicated size 3 min_size 2 crush_rule 4 object_hash
rjenkins pg_num 512 pgp_num 512 autoscale_mode warn last_change 7495053 lfor
0/141997/7467030 flags hashpspool,selfmanaged_snaps,bulk stripe_width 0
application rbd read_balance_score 1.43
pool 20 '.mgr' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins
pg_num 1 pgp_num 1 autoscale_mode off last_change 7480977 flags hashpspool
stripe_width 0 pg_num_min 1 application mgr,mgr_devicehealth read_balance_score
37.50
pool 21 'cephfs_metadata' replicated size 3 min_size 2 crush_rule 4 object_hash
rjenkins pg_num 32 pgp_num 32 autoscale_mode off last_change 7467487 lfor
0/0/96365 flags hashpspool stripe_width 0 pg_autoscale_bias 4 pg_num_min 16
recovery_priority 5 application cephfs read_balance_score 1.59
pool 22 'cephfs_data' replicated size 3 min_size 2 crush_rule 0 object_hash
rjenkins pg_num 32 pgp_num 32 autoscale_mode off last_change 7467488 lfor
0/0/96365 flags hashpspool,selfmanaged_snaps stripe_width 0 application cephfs
read_balance_score 4.52
pool 23 'rbd' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins
pg_num 512 pgp_num 512 autoscale_mode off last_change 7467489 lfor 0/0/884114
flags hashpspool,selfmanaged_snaps,bulk stripe_width 0 application rbd
read_balance_score 1.41
pool 24 'rbd.ec' erasure profile rbd size 4 min_size 3 crush_rule 5 object_hash
rjenkins pg_num 1024 pgp_num 1024 autoscale_mode warn last_change 7495054 lfor
0/2366959/2366961 flags hashpspool,ec_overwrites,selfmanaged_snaps,bulk
stripe_width 8192 application rbd
# ceph osd utilization
avg 192.66
stddev 75.9851 (expected baseline 13.7486)
min osd.48 with 88 pgs (0.456762 * mean)
max osd.10 with 500 pgs (2.59524 * mean)
# ceph osd df tree
ID   CLASS  WEIGHT    REWEIGHT  SIZE     RAW USE  DATA     OMAP     META      AVAIL    %USE   VAR   PGS  STATUS  TYPE NAME
-13         95.71759         -   97 TiB   42 TiB   42 TiB  4.2 GiB   204 GiB   55 TiB  43.74  1.00    -          root default
-16         14.15759         -   15 TiB  8.2 TiB  8.2 TiB  713 MiB    29 GiB  6.4 TiB  56.36  1.29    -          host cirrus
  2  hdd     1.76970   1.00000  1.8 TiB  1.1 TiB  1.0 TiB   85 MiB   4.9 GiB  784 GiB  57.94  1.32  223      up  osd.2
  5  hdd     1.76970   1.00000  1.8 TiB  1.0 TiB  1.0 TiB   50 MiB   3.2 GiB  821 GiB  55.95  1.28  223      up  osd.5
  8  hdd     1.76970   1.00000  1.8 TiB  1.0 TiB  1.0 TiB   88 MiB   2.9 GiB  804 GiB  56.86  1.30  229      up  osd.8
 11  hdd     1.76970   1.00000  1.8 TiB  963 GiB  959 GiB  236 MiB   3.6 GiB  900 GiB  51.66  1.18  240      up  osd.11
 13  hdd     1.76970   1.00000  1.8 TiB  1.0 TiB  1.0 TiB   61 MiB   4.7 GiB  823 GiB  55.81  1.28  220      up  osd.13
 15  hdd     1.76970   1.00000  1.8 TiB  1.1 TiB  1.1 TiB   52 MiB   3.1 GiB  762 GiB  59.13  1.35  229      up  osd.15
 18  hdd     1.76970   1.00000  1.8 TiB  1.1 TiB  1.0 TiB   85 MiB   2.7 GiB  787 GiB  57.76  1.32  228      up  osd.18
 19  hdd     1.76970   1.00000  1.8 TiB  1.0 TiB  1.0 TiB   54 MiB   3.4 GiB  824 GiB  55.79  1.28  219      up  osd.19
-18         15.93417         -   16 TiB  6.0 TiB  6.0 TiB  663 MiB    23 GiB   10 TiB  37.15  0.85    -          host lenticular
  0  hdd     1.76970   1.00000  1.8 TiB  1.0 TiB  1.0 TiB   88 MiB   3.2 GiB  796 GiB  57.26  1.31  229      up  osd.0
  4  hdd     1.76970   1.00000  1.8 TiB  1.1 TiB  1.1 TiB   42 MiB   6.0 GiB  753 GiB  59.57  1.36  231      up  osd.4
  6  hdd     1.76970   1.00000  1.8 TiB  1.1 TiB  1.1 TiB   54 MiB   3.5 GiB  758 GiB  59.31  1.36  237      up  osd.6
 10  hdd     3.63869   1.00000  3.6 TiB  1.9 TiB  1.9 TiB  127 MiB   5.9 GiB  1.7 TiB  52.08  1.19  500      up  osd.10
 36  ssd     1.74660   1.00000  1.7 TiB  229 GiB  228 GiB   92 MiB   959 MiB  1.5 TiB  12.79  0.29  103      up  osd.36
 37  ssd     1.74660   1.00000  1.7 TiB  223 GiB  222 GiB   97 MiB  1006 MiB  1.5 TiB  12.49  0.29   98      up  osd.37
 38  ssd     1.74660   1.00000  1.7 TiB  223 GiB  222 GiB   61 MiB   1.0 GiB  1.5 TiB  12.46  0.28   98      up  osd.38
 39  ssd     1.74660   1.00000  1.7 TiB  221 GiB  219 GiB  102 MiB   1.4 GiB  1.5 TiB  12.35  0.28   98      up  osd.39
-17         22.89058         -   23 TiB  9.5 TiB  9.4 TiB  1.1 GiB    45 GiB   14 TiB  40.59  0.93    -          host nacreous
  1  hdd     1.76970   1.00000  1.8 TiB  1.0 TiB  1.0 TiB  124 MiB   6.0 GiB  818 GiB  56.07  1.28  229      up  osd.1
  3  hdd     1.76970   1.00000  1.8 TiB  1.1 TiB  1.1 TiB   54 MiB   3.1 GiB  780 GiB  58.11  1.33  222      up  osd.3
  7  hdd     1.76970   1.00000  1.8 TiB  1.1 TiB  1.1 TiB   48 MiB   3.1 GiB  773 GiB  58.48  1.34  237      up  osd.7
  9  hdd     1.76970   1.00000  1.8 TiB  1.1 TiB  1.1 TiB   50 MiB   4.9 GiB  766 GiB  58.88  1.35  223      up  osd.9
 14  hdd     1.76970   1.00000  1.8 TiB  1.0 TiB  1.0 TiB   93 MiB   3.3 GiB  812 GiB  56.43  1.29  224      up  osd.14
 16  hdd     1.76970   1.00000  1.8 TiB  1.0 TiB  1.0 TiB   43 MiB   5.0 GiB  792 GiB  57.51  1.31  235      up  osd.16
 17  hdd     1.76970   1.00000  1.8 TiB  1.1 TiB  1.1 TiB   68 MiB   4.2 GiB  777 GiB  58.31  1.33  227      up  osd.17
 20  hdd     1.76970   1.00000  1.8 TiB  1.1 TiB  1.1 TiB   66 MiB   5.9 GiB  780 GiB  58.15  1.33  231      up  osd.20
 48  ssd     1.74660   1.00000  1.7 TiB  213 GiB  212 GiB   94 MiB   1.3 GiB  1.5 TiB  11.92  0.27   88      up  osd.48
 49  ssd     1.74660   1.00000  1.7 TiB  215 GiB  212 GiB   94 MiB   2.5 GiB  1.5 TiB  12.01  0.27   90      up  osd.49
 50  ssd     1.74660   1.00000  1.7 TiB  214 GiB  212 GiB   95 MiB   2.1 GiB  1.5 TiB  11.99  0.27   92      up  osd.50
 51  ssd     1.74660   1.00000  1.7 TiB  217 GiB  215 GiB  144 MiB   1.9 GiB  1.5 TiB  12.13  0.28   94      up  osd.51
 52  ssd     1.74660   1.00000  1.7 TiB  214 GiB  212 GiB  144 MiB   2.0 GiB  1.5 TiB  11.95  0.27   94      up  osd.52
-19          6.98639         -  7.0 TiB  895 GiB  889 GiB  345 MiB   5.6 GiB  6.1 TiB  12.51  0.29    -          host nimbus
 44  ssd     1.74660   1.00000  1.7 TiB  227 GiB  224 GiB  148 MiB   2.3 GiB  1.5 TiB  12.67  0.29   98      up  osd.44
 45  ssd     1.74660   1.00000  1.7 TiB  222 GiB  221 GiB   68 MiB  1021 MiB  1.5 TiB  12.39  0.28   93      up  osd.45
 46  ssd     1.74660   1.00000  1.7 TiB  223 GiB  222 GiB   64 MiB   972 MiB  1.5 TiB  12.48  0.29   97      up  osd.46
 47  ssd     1.74660   1.00000  1.7 TiB  224 GiB  222 GiB   64 MiB   1.3 GiB  1.5 TiB  12.52  0.29   95      up  osd.47
-15         21.19368         -   22 TiB  9.3 TiB  9.3 TiB  848 MiB    56 GiB   12 TiB  43.35  0.99    -          host pileus
 12  hdd     1.76970   1.00000  1.8 TiB  1.0 TiB  1.0 TiB   54 MiB   8.7 GiB  790 GiB  57.61  1.32  231      up  osd.12
 21  hdd     1.76970   1.00000  1.8 TiB  1.1 TiB  1.1 TiB   46 MiB   6.1 GiB  765 GiB  58.94  1.35  236      up  osd.21
 22  hdd     1.76970   1.00000  1.8 TiB  1.0 TiB  1.0 TiB   60 MiB   6.2 GiB  810 GiB  56.52  1.29  228      up  osd.22
 23  hdd     1.76970   1.00000  1.8 TiB  1.1 TiB  1.1 TiB   83 MiB   8.0 GiB  779 GiB  58.16  1.33  230      up  osd.23
 24  hdd     1.76970   1.00000  1.8 TiB  1.1 TiB  1.1 TiB   44 MiB   5.5 GiB  777 GiB  58.29  1.33  229      up  osd.24
 25  hdd     1.76970   1.00000  1.8 TiB  1.0 TiB  1.0 TiB   73 MiB   5.6 GiB  799 GiB  57.09  1.31  230      up  osd.25
 26  hdd     1.76970   1.00000  1.8 TiB  1.0 TiB  1.0 TiB   63 MiB   5.4 GiB  797 GiB  57.22  1.31  239      up  osd.26
 27  hdd     1.81940   1.00000  1.8 TiB  1.1 TiB  1.1 TiB   99 MiB   7.1 GiB  720 GiB  61.36  1.40  247      up  osd.27
 40  ssd     1.74660   1.00000  1.7 TiB  220 GiB  219 GiB   66 MiB  1013 MiB  1.5 TiB  12.31  0.28   96      up  osd.40
 41  ssd     1.74660   1.00000  1.7 TiB  221 GiB  220 GiB   63 MiB   981 MiB  1.5 TiB  12.36  0.28   97      up  osd.41
 42  ssd     1.74660   1.00000  1.7 TiB  228 GiB  227 GiB   98 MiB   999 MiB  1.5 TiB  12.73  0.29  101      up  osd.42
 43  ssd     1.74660   1.00000  1.7 TiB  227 GiB  226 GiB   99 MiB   866 MiB  1.5 TiB  12.67  0.29  100      up  osd.43
-14         14.55518         -   15 TiB  8.6 TiB  8.5 TiB  582 MiB    45 GiB  6.0 TiB  59.02  1.35    -          host stratus
 28  hdd     1.81940   1.00000  1.8 TiB  1.1 TiB  1.1 TiB   54 MiB   5.5 GiB  756 GiB  59.41  1.36  239      up  osd.28
 29  hdd     1.81940   1.00000  1.8 TiB  1.1 TiB  1.1 TiB   44 MiB   4.7 GiB  759 GiB  59.27  1.35  230      up  osd.29
 30  hdd     1.81940   1.00000  1.8 TiB  1.1 TiB  1.1 TiB   68 MiB   5.0 GiB  753 GiB  59.57  1.36  232      up  osd.30
 31  hdd     1.81940   1.00000  1.8 TiB  1.1 TiB  1.1 TiB   64 MiB   6.2 GiB  778 GiB  58.26  1.33  229      up  osd.31
 32  hdd     1.81940   1.00000  1.8 TiB  1.0 TiB  1.0 TiB   85 MiB   5.0 GiB  797 GiB  57.24  1.31  226      up  osd.32
 33  hdd     1.81940   1.00000  1.8 TiB  1.1 TiB  1.1 TiB  112 MiB   4.1 GiB  749 GiB  59.81  1.37  235      up  osd.33
 34  hdd     1.81940   1.00000  1.8 TiB  1.1 TiB  1.1 TiB   46 MiB   8.5 GiB  770 GiB  58.66  1.34  238      up  osd.34
 35  hdd     1.81940   1.00000  1.8 TiB  1.1 TiB  1.1 TiB  109 MiB   6.2 GiB  747 GiB  59.92  1.37  244      up  osd.35
                          TOTAL   97 TiB   42 TiB   42 TiB  4.2 GiB   204 GiB   55 TiB  43.74
MIN/MAX VAR: 0.27/1.40  STDDEV: 21.24
# ceph balancer status
{
"active": true,
"last_optimize_duration": "0:00:00.040070",
"last_optimize_started": "Thu Mar 13 14:59:05 2025",
"mode": "upmap",
"no_optimization_needed": true,
"optimize_result": "Unable to find further optimization, or pool(s) pg_num
is decreasing, or distribution is already perfect",
"plans": []
}
--
Fachschaft I/1 Mathematik/Physik/Informatik der RWTH Aachen
Thomas Schneider
Campus Mitte: Augustinerbach 2a, 52062 Aachen, Phone: +49 241 80 94506
Informatikzentrum: Ahornstraße 55, Room 4U17, 52074 Aachen, Phone: +49 241 80 26741
https://www.fsmpi.rwth-aachen.de
# fio --ioengine=libaio --direct=1 --bs=16384 --iodepth=128 --rw=randread \
    --norandommap --size=20G --numjobs=1 --runtime=300 --time_based \
    --name=/dev/rbd/rbd/test
/dev/rbd/rbd/test: (g=0): rw=randread, bs=(R) 16.0KiB-16.0KiB, (W)
16.0KiB-16.0KiB, (T) 16.0KiB-16.0KiB, ioengine=libaio, iodepth=128
fio-3.25
Starting 1 process
Jobs: 1 (f=1): [r(1)][100.0%][r=35.0MiB/s][r=2241 IOPS][eta 00m:00s]
/dev/rbd/rbd/test: (groupid=0, jobs=1): err= 0: pid=3832471: Thu Jan 30
19:55:46 2025
read: IOPS=7591, BW=119MiB/s (124MB/s)(34.9GiB/300892msec)
slat (usec): min=2, max=665, avg= 8.93, stdev= 5.57
clat (usec): min=65, max=5330.5k, avg=16850.13, stdev=184056.88
lat (usec): min=82, max=5330.5k, avg=16859.23, stdev=184056.97
clat percentiles (usec):
| 1.00th=[ 147], 5.00th=[ 180], 10.00th=[ 202],
| 20.00th=[ 237], 30.00th=[ 269], 40.00th=[ 297],
| 50.00th=[ 330], 60.00th=[ 367], 70.00th=[ 420],
| 80.00th=[ 523], 90.00th=[ 1876], 95.00th=[ 6128],
| 99.00th=[ 362808], 99.50th=[1098908], 99.90th=[3506439],
| 99.95th=[4143973], 99.99th=[4731175]
bw ( KiB/s): min= 768, max=478432, per=100.00%, avg=121819.01,
stdev=95962.09, samples=600
iops : min= 48, max=29902, avg=7613.70, stdev=5997.62, samples=600
lat (usec) : 100=0.01%, 250=24.52%, 500=53.85%, 750=8.61%, 1000=1.56%
lat (msec) : 2=1.84%, 4=3.46%, 10=2.34%, 20=1.12%, 50=0.91%
lat (msec) : 100=0.40%, 250=0.31%, 500=0.13%, 750=0.18%, 1000=0.21%
lat (msec) : 2000=0.34%, >=2000=0.20%
cpu : usr=2.33%, sys=10.06%, ctx=1687235, majf=0, minf=523
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.1%
issued rwts: total=2284212,0,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=128
Run status group 0 (all jobs):
READ: bw=119MiB/s (124MB/s), 119MiB/s-119MiB/s (124MB/s-124MB/s), io=34.9GiB
(37.4GB), run=300892-300892msec
# fio --ioengine=libaio --direct=1 --bs=16384 --iodepth=128 --rw=randread \
    --norandommap --size=20G --numjobs=1 --runtime=300 --time_based \
    --name=/dev/rbd/rbd.ssd/test
/dev/rbd/rbd.ssd/test: (g=0): rw=randread, bs=(R) 16.0KiB-16.0KiB, (W)
16.0KiB-16.0KiB, (T) 16.0KiB-16.0KiB, ioengine=libaio, iodepth=128
fio-3.25
Starting 1 process
Jobs: 1 (f=1): [r(1)][100.0%][r=31.3MiB/s][r=2002 IOPS][eta 00m:00s]
/dev/rbd/rbd.ssd/test: (groupid=0, jobs=1): err= 0: pid=3825356: Thu Jan 30
19:49:15 2025
read: IOPS=2007, BW=31.4MiB/s (32.9MB/s)(9422MiB/300334msec)
slat (usec): min=2, max=500, avg=13.50, stdev= 7.03
clat (usec): min=45, max=3395.6k, avg=63732.62, stdev=241344.46
lat (usec): min=136, max=3395.6k, avg=63746.39, stdev=241344.47
clat percentiles (usec):
| 1.00th=[ 204], 5.00th=[ 243], 10.00th=[ 273],
| 20.00th=[ 330], 30.00th=[ 429], 40.00th=[ 1090],
| 50.00th=[ 6390], 60.00th=[ 9110], 70.00th=[ 14091],
| 80.00th=[ 30278], 90.00th=[ 107480], 95.00th=[ 287310],
| 99.00th=[1350566], 99.50th=[1820328], 99.90th=[2701132],
| 99.95th=[2868904], 99.99th=[3204449]
bw ( KiB/s): min= 2560, max=39680, per=100.00%, avg=32371.00,
stdev=3270.26, samples=596
iops : min= 160, max= 2480, avg=2023.19, stdev=204.39, samples=596
lat (usec) : 50=0.01%, 250=6.09%, 500=27.41%, 750=4.50%, 1000=1.83%
lat (msec) : 2=0.50%, 4=2.27%, 10=20.07%, 20=12.33%, 50=9.59%
lat (msec) : 100=4.95%, 250=4.91%, 500=2.41%, 750=0.92%, 1000=0.64%
lat (msec) : 2000=1.18%, >=2000=0.40%
cpu : usr=1.09%, sys=4.10%, ctx=541464, majf=0, minf=524
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.1%
issued rwts: total=603021,0,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=128
Run status group 0 (all jobs):
READ: bw=31.4MiB/s (32.9MB/s), 31.4MiB/s-31.4MiB/s (32.9MB/s-32.9MB/s),
io=9422MiB (9880MB), run=300334-300334msec