Hi Wido,

Thanks for the reply. I've dumped the query below.

"recovery_state" doesn't say anything, and there are no missing or unfound
objects. What else could be wrong?

- WP

P.S.: I am already running tunables optimal.

{ "state": "active+remapped",
  "epoch": 6500,
  "up": [7],
  "acting": [7, 3],
  "info": {
    "pgid": "1.fa",
    "last_update": "0'0",
    "last_complete": "0'0",
    "log_tail": "0'0",
    "last_user_version": 0,
    "last_backfill": "MAX",
    "purged_snaps": "[]",
    "history": {
      "epoch_created": 1,
      "last_epoch_started": 6377,
      "last_epoch_clean": 6379,
      "last_epoch_split": 0,
      "same_up_since": 6365,
      "same_interval_since": 6365,
      "same_primary_since": 6348,
      "last_scrub": "0'0",
      "last_scrub_stamp": "2014-01-09 11:37:18.202247",
      "last_deep_scrub": "0'0",
      "last_deep_scrub_stamp": "2014-01-09 11:37:18.202247",
      "last_clean_scrub_stamp": "2014-01-09 11:37:18.202247"},
    "stats": {
      "version": "0'0",
      "reported_seq": "4320",
      "reported_epoch": "6500",
      "state": "active+remapped",
      "last_fresh": "2014-01-10 12:19:46.219163",
      "last_change": "2014-01-10 11:18:53.147842",
      "last_active": "2014-01-10 12:19:46.219163",
      "last_clean": "2014-01-09 22:02:41.243761",
      "last_became_active": "0.000000",
      "last_unstale": "2014-01-10 12:19:46.219163",
      "mapping_epoch": 6351,
      "log_start": "0'0",
      "ondisk_log_start": "0'0",
      "created": 1,
      "last_epoch_clean": 6379,
      "parent": "0.0",
      "parent_split_bits": 0,
      "last_scrub": "0'0",
      "last_scrub_stamp": "2014-01-09 11:37:18.202247",
      "last_deep_scrub": "0'0",
      "last_deep_scrub_stamp": "2014-01-09 11:37:18.202247",
      "last_clean_scrub_stamp": "2014-01-09 11:37:18.202247",
      "log_size": 0,
      "ondisk_log_size": 0,
      "stats_invalid": "0",
      "stat_sum": {
        "num_bytes": 0,
        "num_objects": 0,
        "num_object_clones": 0,
        "num_object_copies": 0,
        "num_objects_missing_on_primary": 0,
        "num_objects_degraded": 0,
        "num_objects_unfound": 0,
        "num_read": 0,
        "num_read_kb": 0,
        "num_write": 0,
        "num_write_kb": 0,
        "num_scrub_errors": 0,
        "num_shallow_scrub_errors": 0,
        "num_deep_scrub_errors": 0,
        "num_objects_recovered": 0,
        "num_bytes_recovered": 0,
        "num_keys_recovered": 0},
      "stat_cat_sum": {},
      "up": [7],
      "acting": [7, 3]},
    "empty": 1,
    "dne": 0,
    "incomplete": 0,
    "last_epoch_started": 6377},
  "recovery_state": [
    { "name": "Started\/Primary\/Active",
      "enter_time": "2014-01-10 11:18:53.147802",
      "might_have_unfound": [],
      "recovery_progress": {
        "backfill_target": -1,
        "waiting_on_backfill": 0,
        "last_backfill_started": "0\/\/0\/\/-1",
        "backfill_info": {
          "begin": "0\/\/0\/\/-1",
          "end": "0\/\/0\/\/-1",
          "objects": []},
        "peer_backfill_info": {
          "begin": "0\/\/0\/\/-1",
          "end": "0\/\/0\/\/-1",
          "objects": []},
        "backfills_in_flight": [],
        "recovering": [],
        "pg_backend": {
          "pull_from_peer": [],
          "pushing": []}},
      "scrub": {
        "scrubber.epoch_start": "4757",
        "scrubber.active": 0,
        "scrubber.block_writes": 0,
        "scrubber.finalizing": 0,
        "scrubber.waiting_on": 0,
        "scrubber.waiting_on_whom": []}},
    { "name": "Started",
      "enter_time": "2014-01-10 11:18:40.137868"}]}

On Fri, Jan 10, 2014 at 12:16 PM, Wido den Hollander <w...@42on.com> wrote:

> On 01/10/2014 05:13 AM, YIP Wai Peng wrote:
>
>> Dear all,
>>
>> I have some pgs that are stuck_unclean, I'm trying to understand why.
>> Hopefully someone can help me shed some light on it.
>>
>> For example, one of them is
>>
>> # ceph pg dump_stuck unclean
>> 1.fa 0 0 0 0 0 0 0 active+remapped 2014-01-10 11:18:53.147842 0'0
>> 6452:4272 [7] [7,3] 0'0 2014-01-09 11:37:18.202247 0'0
>> 2014-01-09 11:37:18.202247
>>
>> My pool 1 looks like this
>>
>> pool 1 'metadata' rep size 2 min_size 1 crush_ruleset 3 object_hash
>> rjenkins pg_num 256 pgp_num 256 last_change 2605 owner 0
>>
>> The rule 3 is
>>
>> rule different_host {
>>         ruleset 3
>>         type replicated
>>         min_size 1
>>         max_size 10
>>         step take default
>>         step chooseleaf firstn 0 type host
>>         step emit
>> }
>>
>> My osd tree looks like
>>
>> # id   weight  type name           up/down  reweight
>> -1     40      root default
>> -7     3       datacenter CR2
>> -5     3       host ceph3
>> 6      1       osd.6               up       1
>> 7      1       osd.7               up       1
>> 8      1       osd.8               up       1
>> <snip>
>> -9     3       datacenter COM1
>> -6     6       room 02-WIRECEN
>> -4     3       host ceph2
>> 3      1       osd.3               up       1
>> 4      1       osd.4               up       1
>> 5      1       osd.5               up       1
>>
>> osd.7 and osd.3 are in different hosts, so the rule is satisfied. Why is
>> it still in the 'remapped' status, and what is it waiting for?
>>
>
> Try:
>
> $ ceph pg 1.fa query
>
> That will tell you the cause of why the PG is stuck.
>
>> - Peng
>>
>> _______________________________________________
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
> --
> Wido den Hollander
> 42on B.V.
>
> Phone: +31 (0)20 700 9902
> Skype: contact42on
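For anyone reading along: the one mismatch visible in the query output is that "up" is [7] while "acting" is [7, 3], and the pool has rep size 2. Below is a minimal sketch that makes the comparison explicit. It assumes only the "state", "up" and "acting" fields from the dump; summarize_pg is a hypothetical helper, not part of any Ceph tool, and the JSON string is a trimmed copy of the dump rather than live output. It does not explain why the up set holds a single OSD, it only shows where the sets differ.

```python
import json

# Trimmed copy of the `ceph pg 1.fa query` output from this thread; only the
# three fields this sketch reads are kept.
PG_QUERY_JSON = '{"state": "active+remapped", "up": [7], "acting": [7, 3]}'

# Replica count taken from the pool line in this thread ("rep size 2").
POOL_SIZE = 2


def summarize_pg(query, pool_size):
    """Compare the up and acting sets of one PG against the pool size.

    Hypothetical helper for illustration; not part of any Ceph tool.
    """
    up = query["up"]
    acting = query["acting"]
    return {
        "state": query["state"],
        "up": up,
        "acting": acting,
        # True when the acting set differs from the up set
        "remapped": up != acting,
        # How many OSDs the up set is missing relative to the pool size
        "up_short_by": max(pool_size - len(up), 0),
        # OSDs that serve the PG but are not in the up set
        "acting_only": [osd for osd in acting if osd not in up],
    }


summary = summarize_pg(json.loads(PG_QUERY_JSON), POOL_SIZE)
print(summary)
```

With the values from the dump, the up set is one OSD short of the pool size and osd.3 appears only in the acting set.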