[ceph-users] unbalanced OSDs

2023-08-03 Thread Spiros Papageorgiou

Hi all,


I have a Ceph cluster with 3 nodes running version 16.2.9. There are 7
SSD OSDs on each server and one pool that resides on these OSDs.


My OSDs are terribly unbalanced:

ID  CLASS  WEIGHT    REWEIGHT  SIZE     RAW USE  DATA     OMAP     META     AVAIL    %USE   VAR   PGS  STATUS  TYPE NAME
-9         28.42200         -   28 TiB  9.3 TiB  9.2 TiB  161 MiB   26 GiB   19 TiB  32.56  1.09    -          root ssddisks
-2          9.47400         -  9.5 TiB  3.4 TiB  3.4 TiB   66 MiB  9.2 GiB  6.1 TiB  35.52  1.19    -              host px1-ssd
 0    ssd   1.74599   0.85004  1.7 TiB  810 GiB  807 GiB  3.2 MiB  2.3 GiB  978 GiB  45.28  1.51   26      up          osd.0
 5    ssd   0.82999   0.85004  850 GiB  581 GiB  580 GiB   22 MiB  912 MiB  269 GiB  68.38  2.29   19      up          osd.5
 6    ssd   0.82999   1.00000  850 GiB  8.2 GiB  7.8 GiB  9.5 MiB  435 MiB  842 GiB   0.97  0.03    4      up          osd.6
 7    ssd   0.82999   1.00000  850 GiB  294 GiB  293 GiB   26 MiB  591 MiB  556 GiB  34.60  1.16   11      up          osd.7
16    ssd   1.74599   0.85004  1.7 TiB  872 GiB  869 GiB  3.1 MiB  2.3 GiB  916 GiB  48.75  1.63   27      up          osd.16
23    ssd   1.74599   1.00000  1.7 TiB  438 GiB  436 GiB  1.5 MiB  1.7 GiB  1.3 TiB  24.48  0.82   14      up          osd.23
24    ssd   1.74599   1.00000  1.7 TiB  444 GiB  443 GiB  1.6 MiB  1.0 GiB  1.3 TiB  24.81  0.83   17      up          osd.24
-6          9.47400         -  9.5 TiB  2.9 TiB  2.9 TiB   46 MiB  8.1 GiB  6.6 TiB  30.39  1.02    -              host px2-ssd
12    ssd   0.82999   1.00000  850 GiB  154 GiB  154 GiB   21 MiB  368 MiB  696 GiB  18.16  0.61    9      up          osd.12
13    ssd   0.82999   1.00000  850 GiB  144 GiB  143 GiB  527 KiB  469 MiB  706 GiB  16.92  0.57    4      up          osd.13
14    ssd   0.82999   1.00000  850 GiB  149 GiB  149 GiB   16 MiB  299 MiB  700 GiB  17.58  0.59    7      up          osd.14
29    ssd   1.74599   1.00000  1.7 TiB  449 GiB  448 GiB  1.6 MiB  1.4 GiB  1.3 TiB  25.11  0.84   20      up          osd.29
30    ssd   1.74599   0.85004  1.7 TiB  885 GiB  882 GiB  3.1 MiB  2.3 GiB  903 GiB  49.48  1.65   31      up          osd.30
31    ssd   1.74599   1.00000  1.7 TiB  728 GiB  727 GiB  2.6 MiB  1.8 GiB  1.0 TiB  40.74  1.36   22      up          osd.31
32    ssd   1.74599   1.00000  1.7 TiB  438 GiB  437 GiB  1.6 MiB  1.4 GiB  1.3 TiB  24.51  0.82   15      up          osd.32
-4          9.47400         -  9.5 TiB  3.0 TiB  3.0 TiB   49 MiB  8.7 GiB  6.5 TiB  31.78  1.06    -              host px3-ssd
19    ssd   0.82999   1.00000  850 GiB  293 GiB  292 GiB   14 MiB  500 MiB  557 GiB  34.47  1.15    9      up          osd.19
20    ssd   0.82999   1.00000  850 GiB  290 GiB  290 GiB   10 MiB  482 MiB  560 GiB  34.15  1.14   10      up          osd.20
21    ssd   0.82999   1.00000  850 GiB  148 GiB  147 GiB   16 MiB  428 MiB  702 GiB  17.36  0.58    5      up          osd.21
25    ssd   1.74599   1.00000  1.7 TiB  446 GiB  445 GiB  1.8 MiB  1.6 GiB  1.3 TiB  24.96  0.83   19      up          osd.25
26    ssd   1.74599   1.00000  1.7 TiB  739 GiB  737 GiB  2.6 MiB  2.0 GiB  1.0 TiB  41.33  1.38   29      up          osd.26
27    ssd   1.74599   1.00000  1.7 TiB  725 GiB  723 GiB  2.6 MiB  2.1 GiB  1.0 TiB  40.55  1.36   21      up          osd.27
28    ssd   1.74599   1.00000  1.7 TiB  442 GiB  440 GiB  1.6 MiB  1.7 GiB  1.3 TiB  24.72  0.83   17      up          osd.28


I have done a "ceph osd reweight-by-utilization" and "ceph osd 
set-require-min-compat-client luminous". The pool has 32 PGs which were 
set by autoscale_mode, which is on.
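
For reference, the commands involved, plus a couple of checks on the
resulting state, look roughly like this (the pool name "ssdpool" is a
placeholder, not the actual name):

    ceph osd set-require-min-compat-client luminous
    ceph osd reweight-by-utilization
    ceph osd df tree                   # per-OSD usage and PG counts
    ceph osd pool autoscale-status     # what pg_num the autoscaler chose
    ceph osd pool get ssdpool pg_num   # "ssdpool" is a placeholder name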


Why are my OSDs so unbalanced? osd.5 is at 68.38% while osd.6 is at
0.97%. Also, after the reweight-by-utilization, the utilization of osd.5
actually increased...



What am I missing here?
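
For what it's worth, I understand that setting
require-min-compat-client to luminous is also the prerequisite for the
upmap balancer; a minimal sketch of enabling it (not something I have
tried yet) would be:

    ceph balancer status
    ceph balancer mode upmap
    ceph balancer on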


Sp



[ceph-users] Unbalanced OSDs when pg_autoscale enabled

2023-03-16 Thread 郑亮
Hi all,

I have a 9 node cluster running *Pacific 16.2.10*. OSDs live on all 9
nodes, with each one having 4 x 1.8T SSDs and 8 x 10.9T HDDs, for a total
of 108 OSDs. We created three CRUSH roots as follows:

1. The HDDs of all nodes (8 x 9 = 72) form one large CRUSH root, which is
used for the data pool; object storage and CephFS share this root.
2. Three of the four SSDs on each node are used for RBD block storage.
3. The remaining SSD on each node is used for the index pools of CephFS
and object storage.
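
For reference, a rough sketch of how CRUSH roots and device-class rules
like these can be tied to pools (the bucket, rule, and pool names below
are placeholders, not the ones actually used here):

    ceph osd crush add-bucket root-hdd-data root
    ceph osd crush rule create-replicated hdd-data-rule root-hdd-data host hdd
    ceph osd pool set cephfs-data crush_rule hdd-data-rule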

[root@node01 smd]# ceph osd tree
  ID  CLASS  WEIGHT     TYPE NAME                                              STATUS  REWEIGHT  PRI-AFF
 -92         15.71910   root root-1c31624a-ad18-445e-8e42-86b71c1fd76f
-112          1.74657       host node01-fa2cdf3e-7212-4b5f-b62a-3ab1e803547f
  13    ssd   1.74657           osd.13                                             up   1.00000  1.00000
-103          1.74657       host node02-4e232f27-fe4b-4d0e-bd2a-67d5006a0cdd
  34    ssd   1.74657           osd.34                                             up   1.00000  1.00000
-109          1.74657       host node03-3ae63d7a-9f65-4bea-b2ba-ff3fe342753d
  28    ssd   1.74657           osd.28                                             up   1.00000  1.00000
-118          1.74657       host node04-37a3f92a-f6d8-41f9-a774-3069fc2f50b8
  54    ssd   1.74657           osd.54                                             up   1.00000  1.00000
-106          1.74657       host node05-f667fa27-cc13-4b93-ad56-5dc4c31ffd77
  53    ssd   1.74657           osd.53                                             up   1.00000  1.00000
 -91          1.74657       host node06-3808c8f6-8e10-47c7-8456-62c1e0e800ed
  61    ssd   1.74657           osd.61                                             up   1.00000  1.00000
 -97          1.74657       host node07-78216b0d-0999-44e8-905d-8737a5f6f51f
  50    ssd   1.74657           osd.50                                             up   1.00000  1.00000
-115          1.74657       host node08-947bd556-fb06-497d-8f2c-c4a679d2b06f
  86    ssd   1.74657           osd.86                                             up   1.00000  1.00000
-100          1.74657       host node09-d9ae9046-0716-454f-ba0c-b03cf9986ba8
  85    ssd   1.74657           osd.85                                             up   1.00000  1.00000

 -38        785.80701   root root-6041a4dc-7c9a-44ed-999c-a847cca81012
 -85         87.31189       host node01-e5646053-2cf8-4ba5-90d5-bb1a63b1234c
   1    hdd  10.91399           osd.1                                              up   1.00000  1.00000
  22    hdd  10.91399           osd.22                                             up   0.90002  1.00000
  31    hdd  10.91399           osd.31                                             up   1.00000  1.00000
  51    hdd  10.91399           osd.51                                             up   1.00000  1.00000
  60    hdd  10.91399           osd.60                                             up   1.00000  1.00000
  70    hdd  10.91399           osd.70                                             up   1.00000  1.00000
  78    hdd  10.91399           osd.78                                             up   1.00000  1.00000
  96    hdd  10.91399           osd.96                                             up   1.00000  1.00000
 -37         87.31189       host node02-be9925fd-60de-4147-81eb-720d7145715f
   9    hdd  10.91399           osd.9                                              up   1.00000  1.00000
  19    hdd  10.91399           osd.19                                             up   1.00000  1.00000
  29    hdd  10.91399           osd.29                                             up   1.00000  1.00000
  47    hdd  10.91399           osd.47                                             up   1.00000  1.00000
  56    hdd  10.91399           osd.56                                             up   1.00000  1.00000
  65    hdd  10.91399           osd.65                                             up   1.00000  1.00000
  88    hdd  10.91399           osd.88                                             up   1.00000  1.00000