David,
So as I look at the logs, it was originally 9.0956 for the 10TB drives and
0.9096 for the 1TB drives.
# zgrep -i weight /var/log/ceph/*.log*gz
/var/log/ceph/ceph.audit.log.4.gz:...cmd=[{"prefix": "osd crush
create-or-move", "id": 4, "weight":9.0956,...
/var/log/ceph/ceph.audit.log.4.gz:...cmd=
I would go with the weight that was originally assigned to them. That way
it is in line with how new osds will be weighted.
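Something along these lines (untested sketch; the osd ids are taken from your
`ceph osd df` output, so double-check them against your cluster) should put
them back:

$ ceph osd crush reweight osd.0 9.09560
$ ceph osd crush reweight osd.3 9.09560
$ ceph osd crush reweight osd.4 9.09560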
On Wed, Jul 19, 2017, 9:17 AM Roger Brown wrote:
> David,
>
> Thank you. I have it currently as...
>
> $ ceph osd df
> ID WEIGHT REWEIGHT SIZE USE AVAIL %USE VAR
David,
Thank you. I have it currently as...
$ ceph osd df
ID WEIGHT REWEIGHT SIZE  USE    AVAIL %USE VAR  PGS
 3 10.0   1.0      9313G 44404M 9270G 0.47 1.00 372
 4 10.0   1.0      9313G 46933M 9268G 0.49 1.06 372
 0 10.0   1.0      9313G 41283M 9273G 0.43 0.93 372
I would recommend sticking with the weight of 9.09560 for the osds as that
is the TiB size of the osds that ceph defaults to, as opposed to the TB size
of the osds. New osds will have their weights based on the TiB value. What
does your `ceph osd df` output look like, just to see where things stand? Hopefully
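As a rough sanity check (plain arithmetic, not ceph output), the crush weight
is just the drive size in TiB, e.g. the 9313G that `ceph osd df` reports
divided by 1024:

$ python3 -c 'print(9313/1024)'
9.0947265625

which is in the same ballpark as the 9.09560 ceph originally assigned (the
9313G figure is rounded).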
Resolution confirmed!
$ ceph -s
cluster:
id: eea7b78c-b138-40fc-9f3e-3d77afb770f0
health: HEALTH_OK
services:
mon: 3 daemons, quorum desktop,mon1,nuc2
mgr: desktop(active), standbys: mon1
osd: 3 osds: 3 up, 3 in
data:
pools: 19 pools, 372 pgs
objects: 5424
Ah, that was the problem!
So I edited the crushmap (
http://docs.ceph.com/docs/master/rados/operations/crush-map/) with a weight
of 10.000 for all three 10TB OSD hosts. The instant result was that all those
pgs with only 2 OSDs now had 3 OSDs while the cluster started
rebalancing the data. I
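For anyone following along, editing the crushmap per those docs is roughly the
following decompile/edit/recompile cycle (the file paths here are just
examples):

$ ceph osd getcrushmap -o /tmp/crushmap.bin
$ crushtool -d /tmp/crushmap.bin -o /tmp/crushmap.txt
(edit /tmp/crushmap.txt and set the weight on each 10TB host bucket to 10.000)
$ crushtool -c /tmp/crushmap.txt -o /tmp/crushmap.new
$ ceph osd setcrushmap -i /tmp/crushmap.new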
ID WEIGHT  TYPE NAME
-5 1.0     host osd1
-6 9.09560 host osd2
-2 9.09560 host osd3
The weight allocated to host "osd1" should presumably be the same as
the other two hosts?
Dump your crushmap and take a good look at it, specifically the
weighting of "osd1".
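Something like either of these should show it (the decompiled text form or the
tree view):

$ ceph osd getcrushmap -o /tmp/cm && crushtool -d /tmp/cm -o /tmp/cm.txt
$ ceph osd tree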
On Wed, Jul 19, 2017
I also tried ceph pg query, but it gave no helpful recommendations for any
of the stuck pgs.
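(For reference, the sort of commands I mean; the pg id here is just a made-up
example, substitute one from the stuck list:)

$ ceph pg dump_stuck undersized
$ ceph pg 0.1a query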
On Tue, Jul 18, 2017 at 7:45 PM Roger Brown wrote:
> Problem:
> I have some pgs with only two OSDs instead of 3 like all the other pgs
> have. This is causing active+undersized+degraded status.
>
> Hist
Problem:
I have some pgs with only two OSDs instead of 3 like all the other pgs
have. This is causing active+undersized+degraded status.
History:
1. I started with 3 hosts, each with 1 OSD process (min_size 2) for a 1TB
drive.
2. Added 3 more hosts, each with 1 OSD process for a 10TB drive.
3. Rem