Hi,
The balancer is probably running. Which mode is it in? I changed the
mode to none on our own cluster because it also never finished
rebalancing, and we didn't have a bad PG distribution. Maybe it's
supposed to behave like that; I don't know.
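For reference, a minimal sketch of the commands involved, assuming the balancer mgr module is available on your release (this needs a live cluster and an admin keyring, so treat it as a starting point rather than a recipe):

```shell
# Show whether the balancer is active and which mode it is using.
ceph balancer status

# Stop automatic rebalancing. Turning the balancer off and setting the
# mode to none are two ways to do it; one of them should be enough.
ceph balancer off
ceph balancer mode none
```

If the status output shows the balancer as active, that would be consistent with PGs being remapped again shortly after they reach active+clean.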
Regards
Eugen
Quoting "Kyriazis, George" <george.kyria...@intel.com>:
Hello,
I have a Proxmox ceph cluster with 5 nodes and 3 OSDs each (total 15
OSDs), on a 10G network.
The cluster started small, and I've progressively added OSDs over
time. The problem is that the cluster never rebalances completely.
Backfilling always makes progress, but PGs that were in the
active+clean state jump back into the active+remapped+backfilling
(or active+remapped+backfill_wait) state, to be moved to different
OSDs.
Initially I had a 1G network, and I was holding back on the backfill
settings (osd_max_backfills and osd_recovery_sleep_hdd). In the last
few weeks I upgraded to 10G and set osd_max_backfills = 50 and
osd_recovery_sleep_hdd = 0 (only HDDs, no SSDs). The cluster has been
backfilling for months now with no end in sight.
Is this normal behavior? Is there any setting that I can look at
that will give me an idea as to why PGs are jumping back into
remapped from clean?
Below is output of “ceph osd tree” and “ceph osd df”:
# ceph osd tree
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-1 203.72472 root default
-9 40.01666 host vis-hsw-01
3 hdd 10.91309 osd.3 up 1.00000 1.00000
6 hdd 14.55179 osd.6 up 1.00000 1.00000
10 hdd 14.55179 osd.10 up 1.00000 1.00000
-13 40.01666 host vis-hsw-02
0 hdd 10.91309 osd.0 up 1.00000 1.00000
7 hdd 14.55179 osd.7 up 1.00000 1.00000
11 hdd 14.55179 osd.11 up 1.00000 1.00000
-11 40.01666 host vis-hsw-03
4 hdd 10.91309 osd.4 up 1.00000 1.00000
8 hdd 14.55179 osd.8 up 1.00000 1.00000
12 hdd 14.55179 osd.12 up 1.00000 1.00000
-3 40.01666 host vis-hsw-04
5 hdd 10.91309 osd.5 up 1.00000 1.00000
9 hdd 14.55179 osd.9 up 1.00000 1.00000
13 hdd 14.55179 osd.13 up 1.00000 1.00000
-15 43.65807 host vis-hsw-05
1 hdd 14.55269 osd.1 up 1.00000 1.00000
2 hdd 14.55269 osd.2 up 1.00000 1.00000
14 hdd 14.55269 osd.14 up 1.00000 1.00000
-5 0 host vis-ivb-07
-7 0 host vis-ivb-10
#
# ceph osd df
ID  CLASS  WEIGHT    REWEIGHT  SIZE     RAW USE  DATA     OMAP     META     AVAIL    %USE   VAR   PGS  STATUS
3   hdd    10.91309  1.00000   11 TiB   8.2 TiB  8.2 TiB  552 MiB  25 GiB   2.7 TiB  75.08  1.19  131  up
6   hdd    14.55179  1.00000   15 TiB   9.1 TiB  9.1 TiB  1.2 GiB  30 GiB   5.5 TiB  62.47  0.99  148  up
10  hdd    14.55179  1.00000   15 TiB   8.1 TiB  8.1 TiB  1.5 GiB  20 GiB   6.4 TiB  55.98  0.89  142  up
0   hdd    10.91309  1.00000   11 TiB   7.5 TiB  7.4 TiB  504 MiB  24 GiB   3.5 TiB  68.34  1.09  120  up
7   hdd    14.55179  1.00000   15 TiB   8.7 TiB  8.7 TiB  1.0 GiB  31 GiB   5.8 TiB  60.07  0.95  144  up
11  hdd    14.55179  1.00000   15 TiB   9.4 TiB  9.3 TiB  819 MiB  20 GiB   5.2 TiB  64.31  1.02  147  up
4   hdd    10.91309  1.00000   11 TiB   7.0 TiB  7.0 TiB  284 MiB  25 GiB   3.9 TiB  64.35  1.02  112  up
8   hdd    14.55179  1.00000   15 TiB   9.3 TiB  9.2 TiB  1.8 GiB  29 GiB   5.3 TiB  63.65  1.01  157  up
12  hdd    14.55179  1.00000   15 TiB   8.6 TiB  8.6 TiB  623 MiB  19 GiB   5.9 TiB  59.14  0.94  136  up
5   hdd    10.91309  1.00000   11 TiB   8.6 TiB  8.6 TiB  542 MiB  29 GiB   2.3 TiB  79.01  1.26  134  up
9   hdd    14.55179  1.00000   15 TiB   8.2 TiB  8.2 TiB  707 MiB  27 GiB   6.3 TiB  56.56  0.90  138  up
13  hdd    14.55179  1.00000   15 TiB   8.7 TiB  8.7 TiB  741 MiB  18 GiB   5.8 TiB  59.85  0.95  134  up
1   hdd    14.55269  1.00000   15 TiB   9.8 TiB  9.8 TiB  1.3 GiB  20 GiB   4.8 TiB  67.18  1.07  158  up
2   hdd    14.55269  1.00000   15 TiB   8.7 TiB  8.7 TiB  936 MiB  18 GiB   5.8 TiB  60.04  0.95  148  up
14  hdd    14.55269  1.00000   15 TiB   8.3 TiB  8.3 TiB  673 MiB  18 GiB   6.3 TiB  56.97  0.90  131  up
                     TOTAL     204 TiB  128 TiB  128 TiB  13 GiB   350 GiB  75 TiB   62.95
MIN/MAX VAR: 0.89/1.26  STDDEV: 6.44
#
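As a reading aid for the output above: the VAR column is each OSD's %USE divided by the cluster-wide %USE from the TOTAL row (62.95). A small sketch of that arithmetic, where var_ratio is a hypothetical helper name and not a ceph command:

```shell
# var_ratio OSD_USE CLUSTER_USE
# Prints OSD %USE divided by cluster %USE, rounded to two decimals,
# which is how the VAR column in "ceph osd df" relates to %USE.
var_ratio() {
    awk -v use="$1" -v avg="$2" 'BEGIN { printf "%.2f\n", use / avg }'
}

var_ratio 79.01 62.95   # osd.5, the fullest OSD: prints 1.26
var_ratio 55.98 62.95   # osd.10, the emptiest OSD: prints 0.89
```

These two values match the MIN/MAX VAR line, so the fullest OSD sits about 26% above the cluster average and the emptiest about 11% below it.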
Thank you!
George
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io