Hello,

Thank you, Bryan. I was planning to upgrade to Hammer or later, but before that I wanted to get the cluster into a healthy state. Do you think it is safe to upgrade now, first to the latest Firefly and then to Hammer?
Regards.

Dimitar Boichev
SysAdmin Team Lead
AXSMarine Sofia
Phone: +359 889 22 55 42
Skype: dimitar.boichev.axsmarine
E-mail: dimitar.boic...@axsmarine.com

From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Stillwell, Bryan
Sent: Tuesday, February 23, 2016 1:51 AM
To: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] osd not removed from crush map after ceph osd crush remove

Dimitar,

I'm not sure why those PGs would be stuck in the stale+active+clean state. Maybe try upgrading to the 0.80.11 release to see if it's a bug that was fixed already? You can use the 'ceph tell osd.* version' command after the upgrade to make sure all OSDs are running the new version. Also since firefly (0.80.x) is near its EOL, you should consider upgrading to hammer (0.94.x).

As for why osd.4 didn't get fully removed, the last command you ran isn't correct. It should be 'ceph osd rm 4'. Trying to remember when to use the CRUSH name (osd.4) versus the OSD number (4) can be a pain.

Bryan

From: ceph-users <ceph-users-boun...@lists.ceph.com> on behalf of Dimitar Boichev <dimitar.boic...@axsmarine.com>
Date: Monday, February 22, 2016 at 1:10 AM
To: Dimitar Boichev <dimitar.boic...@axsmarine.com>, "ceph-users@lists.ceph.com" <ceph-users@lists.ceph.com>
Subject: Re: [ceph-users] osd not removed from crush map after ceph osd crush remove

Anyone?

Regards.

From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Dimitar Boichev
Sent: Thursday, February 18, 2016 5:06 PM
To: ceph-users@lists.ceph.com
Subject: [ceph-users] osd not removed from crush map after ceph osd crush remove

Hello,

I am running a tiny cluster of 2 nodes.
ceph -v
ceph version 0.80.7 (6c0127fcb58008793d3c8b62d925bc91963672a3)

One OSD died, and I added a new OSD (not replacing the old one). After that I wanted to remove the failed OSD completely from the cluster. Here is what I did:

ceph osd reweight osd.4 0.0
ceph osd crush reweight osd.4 0.0
ceph osd out osd.4
ceph osd crush remove osd.4
ceph auth del osd.4
ceph osd rm osd.4

But after the rebalancing I ended up with 155 PGs in the stale+active+clean state.

@storage1:/tmp# ceph -s
    cluster 7a9120b9-df42-4308-b7b1-e1f3d0f1e7b3
     health HEALTH_WARN 155 pgs stale; 155 pgs stuck stale; 1 requests are blocked > 32 sec; nodeep-scrub flag(s) set
     monmap e1: 1 mons at {storage1=192.168.10.3:6789/0}, election epoch 1, quorum 0 storage1
     osdmap e1064: 6 osds: 6 up, 6 in
            flags nodeep-scrub
      pgmap v26760322: 712 pgs, 8 pools, 532 GB data, 155 kobjects
            1209 GB used, 14210 GB / 15419 GB avail
                 155 stale+active+clean
                 557 active+clean
  client io 91925 B/s wr, 5 op/s

I know about the single-monitor problem. I just want to get the cluster back to a healthy state first; then I will add the third storage node and go up to 3 monitors.
The problem is as follows:

@storage1:/tmp# ceph pg map 2.3a
osdmap e1064 pg 2.3a (2.3a) -> up [6] acting [6]

@storage1:/tmp# ceph pg 2.3a query
Error ENOENT: i don't have pgid 2.3a

@storage1:/tmp# ceph health detail
HEALTH_WARN 155 pgs stale; 155 pgs stuck stale; 1 requests are blocked > 32 sec; 1 osds have slow requests; nodeep-scrub flag(s) set
pg 7.2a is stuck stale for 8887559.656879, current state stale+active+clean, last acting [4]
pg 5.28 is stuck stale for 8887559.656886, current state stale+active+clean, last acting [4]
pg 7.2b is stuck stale for 8887559.656889, current state stale+active+clean, last acting [4]
pg 7.2c is stuck stale for 8887559.656892, current state stale+active+clean, last acting [4]
pg 0.2b is stuck stale for 8887559.656893, current state stale+active+clean, last acting [4]
pg 6.2c is stuck stale for 8887559.656894, current state stale+active+clean, last acting [4]
pg 6.2f is stuck stale for 8887559.656893, current state stale+active+clean, last acting [4]
pg 2.2b is stuck stale for 8887559.656896, current state stale+active+clean, last acting [4]
pg 2.25 is stuck stale for 8887559.656896, current state stale+active+clean, last acting [4]
pg 6.20 is stuck stale for 8887559.656898, current state stale+active+clean, last acting [4]
pg 5.21 is stuck stale for 8887559.656898, current state stale+active+clean, last acting [4]
pg 0.24 is stuck stale for 8887559.656904, current state stale+active+clean, last acting [4]
pg 2.21 is stuck stale for 8887559.656904, current state stale+active+clean, last acting [4]
pg 5.27 is stuck stale for 8887559.656906, current state stale+active+clean, last acting [4]
pg 2.23 is stuck stale for 8887559.656908, current state stale+active+clean, last acting [4]
pg 6.26 is stuck stale for 8887559.656909, current state stale+active+clean, last acting [4]
pg 7.27 is stuck stale for 8887559.656913, current state stale+active+clean, last acting [4]
pg 7.18 is stuck stale for 8887559.656914, current state stale+active+clean, last acting [4]
pg 0.1e is stuck stale for 8887559.656914, current state stale+active+clean, last acting [4]
pg 6.18 is stuck stale for 8887559.656919, current state stale+active+clean, last acting [4]
pg 2.1f is stuck stale for 8887559.656919, current state stale+active+clean, last acting [4]
pg 7.1b is stuck stale for 8887559.656922, current state stale+active+clean, last acting [4]
pg 0.1b is stuck stale for 8887559.656919, current state stale+active+clean, last acting [4]
pg 6.1d is stuck stale for 8887559.656925, current state stale+active+clean, last acting [4]
pg 2.18 is stuck stale for 8887559.656920, current state stale+active+clean, last acting [4]
pg 7.1d is stuck stale for 8887559.656926, current state stale+active+clean, last acting [4]
pg 5.1c is stuck stale for 8887559.656921, current state stale+active+clean, last acting [4]
pg 5.1d is stuck stale for 8887559.656920, current state stale+active+clean, last acting [4]
pg 6.11 is stuck stale for 8887559.656922, current state stale+active+clean, last acting [4]
pg 5.13 is stuck stale for 8887559.656919, current state stale+active+clean, last acting [4]
pg 0.16 is stuck stale for 8887559.656924, current state stale+active+clean, last acting [4]
pg 6.10 is stuck stale for 8887559.656928, current state stale+active+clean, last acting [4]
pg 2.17 is stuck stale for 8887559.656927, current state stale+active+clean, last acting [4]
pg 7.12 is stuck stale for 8887559.656932, current state stale+active+clean, last acting [4]
pg 0.12 is stuck stale for 8887559.656929, current state stale+active+clean, last acting [4]
pg 6.14 is stuck stale for 8887559.656935, current state stale+active+clean, last acting [4]
pg 0.11 is stuck stale for 8887559.656932, current state stale+active+clean, last acting [4]
pg 7.16 is stuck stale for 8887559.656936, current state stale+active+clean, last acting [4]
pg 0.10 is stuck stale for 8887559.656936, current state stale+active+clean, last acting [4]
pg 2.d is stuck stale for 8887559.656933, current state stale+active+clean, last acting [4]
pg 6.9 is stuck stale for 8887559.656939, current state stale+active+clean, last acting [4]
pg 7.9 is stuck stale for 8887559.656939, current state stale+active+clean, last acting [4]
pg 0.d is stuck stale for 8887559.656940, current state stale+active+clean, last acting [4]
pg 7.a is stuck stale for 8887559.656944, current state stale+active+clean, last acting [4]
pg 0.c is stuck stale for 8887559.656941, current state stale+active+clean, last acting [4]
pg 2.e is stuck stale for 8887559.656947, current state stale+active+clean, last acting [4]
pg 6.a is stuck stale for 8887559.656953, current state stale+active+clean, last acting [4]
pg 0.b is stuck stale for 8887559.656949, current state stale+active+clean, last acting [4]
pg 2.9 is stuck stale for 8887559.656954, current state stale+active+clean, last acting [4]
pg 5.f is stuck stale for 8887559.656953, current state stale+active+clean, last acting [4]
pg 7.d is stuck stale for 8887559.656958, current state stale+active+clean, last acting [4]
pg 6.f is stuck stale for 8887559.656957, current state stale+active+clean, last acting [4]
pg 3.4 is stuck stale for 8887559.656957, current state stale+active+clean, last acting [4]
pg 5.3 is stuck stale for 8887559.656956, current state stale+active+clean, last acting [4]
pg 2.4 is stuck stale for 8887559.656961, current state stale+active+clean, last acting [4]
pg 6.0 is stuck stale for 8887559.656966, current state stale+active+clean, last acting [4]
pg 3.6 is stuck stale for 8887559.656965, current state stale+active+clean, last acting [4]
pg 3.7 is stuck stale for 8887559.656964, current state stale+active+clean, last acting [4]
pg 2.6 is stuck stale for 8887559.656970, current state stale+active+clean, last acting [4]
pg 0.3 is stuck stale for 8887559.656965, current state stale+active+clean, last acting [4]
pg 5.6 is stuck stale for 8887559.656970, current state stale+active+clean, last acting [4]
pg 7.4 is stuck stale for 8887559.656975, current state stale+active+clean, last acting [4]
pg 3.1 is stuck stale for 8887559.656970, current state stale+active+clean, last acting [4]
pg 6.4 is stuck stale for 8887559.656975, current state stale+active+clean, last acting [4]
pg 5.4 is stuck stale for 8887559.656972, current state stale+active+clean, last acting [4]
pg 2.3 is stuck stale for 8887559.656977, current state stale+active+clean, last acting [4]
pg 5.5 is stuck stale for 8887559.656977, current state stale+active+clean, last acting [4]
pg 3.3 is stuck stale for 8887559.656982, current state stale+active+clean, last acting [4]
pg 5.7a is stuck stale for 8887559.657309, current state stale+active+clean, last acting [4]
pg 6.78 is stuck stale for 8887559.657308, current state stale+active+clean, last acting [4]
pg 5.78 is stuck stale for 8887559.657311, current state stale+active+clean, last acting [4]
pg 5.79 is stuck stale for 8887559.657311, current state stale+active+clean, last acting [4]
pg 6.7c is stuck stale for 8887559.657313, current state stale+active+clean, last acting [4]
pg 7.7e is stuck stale for 8887559.657312, current state stale+active+clean, last acting [4]
pg 6.7e is stuck stale for 8887559.657315, current state stale+active+clean, last acting [4]
pg 7.70 is stuck stale for 8887559.657316, current state stale+active+clean, last acting [4]
pg 6.73 is stuck stale for 8887559.657316, current state stale+active+clean, last acting [4]
pg 5.77 is stuck stale for 8887559.657317, current state stale+active+clean, last acting [4]
pg 5.74 is stuck stale for 8887559.657319, current state stale+active+clean, last acting [4]
pg 5.75 is stuck stale for 8887559.657321, current state stale+active+clean, last acting [4]
pg 7.68 is stuck stale for 8887559.657322, current state stale+active+clean, last acting [4]
pg 6.68 is stuck stale for 8887559.657324, current state stale+active+clean, last acting [4]
pg 7.6b is stuck stale for 8887559.657326, current state stale+active+clean, last acting [4]
pg 6.6d is stuck stale for 8887559.657328, current state stale+active+clean, last acting [4]
pg 5.6e is stuck stale for 8887559.657330, current state stale+active+clean, last acting [4]
pg 6.6c is stuck stale for 8887559.657330, current state stale+active+clean, last acting [4]
pg 7.6f is stuck stale for 8887559.657331, current state stale+active+clean, last acting [4]
pg 7.60 is stuck stale for 8887559.657333, current state stale+active+clean, last acting [4]
pg 6.60 is stuck stale for 8887559.657333, current state stale+active+clean, last acting [4]
pg 7.62 is stuck stale for 8887559.657334, current state stale+active+clean, last acting [4]
pg 6.65 is stuck stale for 8887559.657334, current state stale+active+clean, last acting [4]
pg 7.64 is stuck stale for 8887559.657339, current state stale+active+clean, last acting [4]
pg 5.67 is stuck stale for 8887559.657338, current state stale+active+clean, last acting [4]
pg 7.66 is stuck stale for 8887559.657340, current state stale+active+clean, last acting [4]
pg 6.66 is stuck stale for 8887559.657340, current state stale+active+clean, last acting [4]
pg 7.67 is stuck stale for 8887559.657345, current state stale+active+clean, last acting [4]
pg 6.59 is stuck stale for 8887559.657344, current state stale+active+clean, last acting [4]
pg 7.58 is stuck stale for 8887559.657348, current state stale+active+clean, last acting [4]
pg 6.58 is stuck stale for 8887559.657348, current state stale+active+clean, last acting [4]
pg 7.59 is stuck stale for 8887559.657352, current state stale+active+clean, last acting [4]
pg 6.5b is stuck stale for 8887559.657353, current state stale+active+clean, last acting [4]
pg 5.59 is stuck stale for 8887559.657348, current state stale+active+clean, last acting [4]
pg 6.5a is stuck stale for 8887559.657356, current state stale+active+clean, last acting [4]
pg 5.5e is stuck stale for 8887559.657352, current state stale+active+clean, last acting [4]
pg 6.5d is stuck stale for 8887559.657358, current state stale+active+clean, last acting [4]
pg 6.5f is stuck stale for 8887559.657356, current state stale+active+clean, last acting [4]
pg 7.51 is stuck stale for 8887559.657356, current state stale+active+clean, last acting [4]
pg 7.52 is stuck stale for 8887559.657356, current state stale+active+clean, last acting [4]
pg 7.53 is stuck stale for 8887559.657358, current state stale+active+clean, last acting [4]
pg 6.55 is stuck stale for 8887559.657359, current state stale+active+clean, last acting [4]
pg 7.54 is stuck stale for 8887559.657364, current state stale+active+clean, last acting [4]
pg 6.54 is stuck stale for 8887559.657364, current state stale+active+clean, last acting [4]
pg 6.57 is stuck stale for 8887559.657365, current state stale+active+clean, last acting [4]
pg 7.56 is stuck stale for 8887559.657369, current state stale+active+clean, last acting [4]
pg 5.55 is stuck stale for 8887559.657371, current state stale+active+clean, last acting [4]
pg 7.48 is stuck stale for 8887559.657372, current state stale+active+clean, last acting [4]
pg 6.49 is stuck stale for 8887559.657375, current state stale+active+clean, last acting [4]
pg 5.4a is stuck stale for 8887559.657376, current state stale+active+clean, last acting [4]
pg 6.48 is stuck stale for 8887559.657379, current state stale+active+clean, last acting [4]
pg 7.4a is stuck stale for 8887559.657380, current state stale+active+clean, last acting [4]
pg 6.4a is stuck stale for 8887559.657383, current state stale+active+clean, last acting [4]
pg 6.4d is stuck stale for 8887559.657385, current state stale+active+clean, last acting [4]
pg 7.4d is stuck stale for 8887559.657387, current state stale+active+clean, last acting [4]
pg 6.4c is stuck stale for 8887559.657389, current state stale+active+clean, last acting [4]
pg 6.4e is stuck stale for 8887559.657391, current state stale+active+clean, last acting [4]
pg 5.42 is stuck stale for 8887559.657391, current state stale+active+clean, last acting [4]
pg 6.43 is stuck stale for 8887559.657393, current state stale+active+clean, last acting [4]
pg 5.41 is stuck stale for 8887559.657393, current state stale+active+clean, last acting [4]
pg 5.47 is stuck stale for 8887559.657394, current state stale+active+clean, last acting [4]
pg 7.46 is stuck stale for 8887559.657396, current state stale+active+clean, last acting [4]
pg 6.39 is stuck stale for 8887559.657398, current state stale+active+clean, last acting [4]
pg 5.3a is stuck stale for 8887559.657399, current state stale+active+clean, last acting [4]
pg 2.3e is stuck stale for 8887559.657399, current state stale+active+clean, last acting [4]
pg 0.3c is stuck stale for 8887559.657402, current state stale+active+clean, last acting [4]
pg 7.3c is stuck stale for 8887559.657404, current state stale+active+clean, last acting [4]
pg 7.3d is stuck stale for 8887559.657405, current state stale+active+clean, last acting [4]
pg 0.39 is stuck stale for 8887559.657402, current state stale+active+clean, last acting [4]
pg 5.3c is stuck stale for 8887559.657405, current state stale+active+clean, last acting [4]
pg 2.3a is stuck stale for 8887559.657406, current state stale+active+clean, last acting [4]
pg 0.38 is stuck stale for 8887559.657409, current state stale+active+clean, last acting [4]
pg 2.35 is stuck stale for 8887559.657411, current state stale+active+clean, last acting [4]
pg 0.37 is stuck stale for 8887559.657412, current state stale+active+clean, last acting [4]
pg 5.32 is stuck stale for 8887559.657413, current state stale+active+clean, last acting [4]
pg 2.34 is stuck stale for 8887559.657416, current state stale+active+clean, last acting [4]
pg 0.36 is stuck stale for 8887559.657416, current state stale+active+clean, last acting [4]
pg 7.32 is stuck stale for 8887559.657419, current state stale+active+clean, last acting [4]
pg 6.33 is stuck stale for 8887559.657420, current state stale+active+clean, last acting [4]
pg 0.35 is stuck stale for 8887559.657423, current state stale+active+clean, last acting [4]
pg 6.35 is stuck stale for 8887559.657423, current state stale+active+clean, last acting [4]
pg 5.36 is stuck stale for 8887559.657424, current state stale+active+clean, last acting [4]
pg 2.30 is stuck stale for 8887559.657427, current state stale+active+clean, last acting [4]
pg 5.37 is stuck stale for 8887559.657429, current state stale+active+clean, last acting [4]
pg 7.36 is stuck stale for 8887559.657430, current state stale+active+clean, last acting [4]
pg 6.37 is stuck stale for 8887559.657432, current state stale+active+clean, last acting [4]
pg 6.28 is stuck stale for 8887559.657427, current state stale+active+clean, last acting [4]

It stays that way, and I think the reason is what I found when I downloaded and decompiled the CRUSH map:

@storage1:/tmp# crushtool -d /tmp/crushmap
# begin crush map
tunable choose_local_tries 0
tunable choose_local_fallback_tries 0
tunable choose_total_tries 50
tunable chooseleaf_descend_once 1

# devices
device 0 osd.0
device 1 osd.1
device 2 osd.2
device 3 osd.3
device 4 device4
device 5 osd.5
device 6 osd.6

Is there a way to remove this device 4 (aka osd.4) from here, so that Ceph can make another copy from the other location shown in "ceph pg map 2.3a"?

Regards.
Dimitar Boichev
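A long `ceph health detail` listing like the one in the original message is easier to read once it is summarized. The following is a minimal sketch, assuming the output arrives one entry per line as the tool prints it. The helper name `summarize_stale` is hypothetical and not part of Ceph; it only counts the stuck-stale PGs per "last acting" OSD set. The sample lines are copied from the output quoted in this thread.

```shell
#!/bin/sh
# Hypothetical helper: read `ceph health detail` text on stdin and print
# one line per "last acting" set with the number of stuck-stale PGs.
summarize_stale() {
  awk '/^pg .* is stuck stale / { n[$NF]++ }
       END { for (k in n) print k, n[k] }' | sort
}

# Sample input: the summary line plus three entries from the thread.
summarize_stale <<'EOF'
HEALTH_WARN 155 pgs stale; 155 pgs stuck stale; 1 requests are blocked > 32 sec; 1 osds have slow requests; nodeep-scrub flag(s) set
pg 7.2a is stuck stale for 8887559.656879, current state stale+active+clean, last acting [4]
pg 5.28 is stuck stale for 8887559.656886, current state stale+active+clean, last acting [4]
pg 7.2b is stuck stale for 8887559.656889, current state stale+active+clean, last acting [4]
EOF
```

On a live cluster the same function could be fed directly with `ceph health detail | summarize_stale`. If every stale PG reports the removed OSD as its last acting set, that supports the suspicion that no surviving OSD holds those PGs.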
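Bryan's point about when to use the CRUSH name (osd.4) and when to use the OSD number (4) can be sketched as a small wrapper. This is only an illustration under stated assumptions: `run` and `remove_osd` are hypothetical helper names, the `DRY_RUN` variable is an assumption of this sketch, and the reweight steps from the original message are left out. The four `ceph` commands are the ones discussed in the thread, with the last one using the numeric ID as Bryan suggests. By default the script only prints the commands.

```shell
#!/bin/sh
# Hypothetical helper: print the command by default;
# set DRY_RUN=0 to actually execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Hypothetical helper: remove the OSD with numeric ID $1.
remove_osd() {
  id=$1
  # Accept only a bare number such as 4, not a CRUSH name such as osd.4.
  case "$id" in
    ''|*[!0-9]*)
      echo "usage: remove_osd <numeric-osd-id>" >&2
      return 1
      ;;
  esac
  run ceph osd out "$id"               # OSD number
  run ceph osd crush remove "osd.$id"  # CRUSH name
  run ceph auth del "osd.$id"          # auth entity name
  run ceph osd rm "$id"                # OSD number, per Bryan's reply
}

# Dry run for the failed OSD from this thread.
remove_osd 4
```

Keeping the numeric ID as the only input and deriving `osd.N` inside the function avoids having to remember which form each subcommand expects.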