Are you running a replica size of 4?  If not, these PGs might be erroneously reported as still being on osd.10.
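
A quick way to confirm is to check the pool's replication size, something like the following (the pool name is a placeholder; pool 4's actual name shows up in ceph osd lspools):

    ceph osd lspools
    ceph osd pool get <poolname> size

If size comes back as 3, the four-OSD pg_temp entries below would point at a stale mapping rather than a real fourth replica.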

________________________________

David Turner | Cloud Operations Engineer | StorageCraft Technology Corporation <https://storagecraft.com>
380 Data Drive Suite 300 | Draper | Utah | 84020
Office: 801.871.2760 | Mobile: 385.224.2943


________________________________________
From: ceph-users [ceph-users-boun...@lists.ceph.com] on behalf of Wido den 
Hollander [w...@42on.com]
Sent: Monday, October 24, 2016 2:19 PM
To: ceph-us...@ceph.com
Subject: [ceph-users] All PGs are active+clean, still remapped PGs

Hi,

On a cluster running Hammer 0.94.9 (upgraded from Firefly) I have 29 remapped 
PGs according to the OSDMap, but all PGs are active+clean.

osdmap e111208: 171 osds: 166 up, 166 in; 29 remapped pgs

pgmap v101069070: 6144 pgs, 2 pools, 90122 GB data, 22787 kobjects
   264 TB used, 184 TB / 448 TB avail
       6144 active+clean

The OSDMap shows:

root@mon1:~# ceph osd dump|grep pg_temp
pg_temp 4.39 [160,17,10,8]
pg_temp 4.52 [161,16,10,11]
pg_temp 4.8b [166,29,10,7]
pg_temp 4.b1 [5,162,148,2]
pg_temp 4.168 [95,59,6,2]
pg_temp 4.1ef [22,162,10,5]
pg_temp 4.2c9 [164,95,10,7]
pg_temp 4.330 [165,154,10,8]
pg_temp 4.353 [2,33,18,54]
pg_temp 4.3f8 [88,67,10,7]
pg_temp 4.41a [30,59,10,5]
pg_temp 4.45f [47,156,21,2]
pg_temp 4.486 [138,43,10,7]
pg_temp 4.674 [59,18,7,2]
pg_temp 4.7b8 [164,68,10,11]
pg_temp 4.816 [167,147,57,2]
pg_temp 4.829 [82,45,10,11]
pg_temp 4.843 [141,34,10,6]
pg_temp 4.862 [31,160,138,2]
pg_temp 4.868 [78,67,10,5]
pg_temp 4.9ca [150,68,10,8]
pg_temp 4.a83 [156,83,10,7]
pg_temp 4.a98 [161,94,10,7]
pg_temp 4.b80 [162,88,10,8]
pg_temp 4.d41 [163,52,10,6]
pg_temp 4.d54 [149,140,10,7]
pg_temp 4.e8e [164,78,10,8]
pg_temp 4.f2a [150,68,10,6]
pg_temp 4.ff3 [30,157,10,7]
root@mon1:~#
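
As a side note, a one-liner along these lines counts how many of those pg_temp entries reference a given OSD id (here osd.10); the word match is an assumption that the bare id only appears between the commas and brackets of this output format:

    ceph osd dump | grep pg_temp | grep -cw 10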

So I tried restarting osd.160 and osd.161, but that didn't change the state.

root@mon1:~# ceph pg 4.39 query
{
   "state": "active+clean",
   "snap_trimq": "[]",
   "epoch": 111212,
   "up": [
       160,
       17,
       8
   ],
   "acting": [
       160,
       17,
       8
   ],
   "actingbackfill": [
       "8",
       "17",
       "160"
   ],

osd.10 is involved in all of these PGs, but that OSD is down and out. I tried marking it as down again, but that didn't work.
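
Concretely, the marking amounts to something like:

    ceph osd tree | grep osd.10   # shows the OSD as down/out
    ceph osd down 10              # mark it down again

but since the OSD is already down and out this is effectively a no-op.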

I haven't removed osd.10 from the CRUSHMap yet, since that will trigger a
rather large rebalance.
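
For completeness, removing it would be roughly the usual sequence, which is exactly the step that triggers the rebalance:

    ceph osd crush remove osd.10
    ceph auth del osd.10
    ceph osd rm 10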

This cluster is still running with the Dumpling tunables, though, so that might
be the issue. But before I trigger a very large rebalance I wanted to check
whether there are any insights on this one.
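
For reference, the relevant commands would be roughly:

    ceph osd crush show-tunables       # show the current (Dumpling-era) values
    ceph osd crush tunables firefly    # or a newer profile; triggers data movement

but that second step is exactly the large rebalance being avoided for now.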

Thanks,

Wido
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com