Re: [ceph-users] trying to understand stuck_unclean

YIP Wai Peng Thu, 09 Jan 2014 20:24:09 -0800

Hi Wido,

Thanks for the reply. I've dumped the query below.


"recovery_state" doesn't say anything useful, and there are no missing or
unfound objects. What else could be wrong?

- WP

P.S. I am already running with optimal tunables.


{ "state": "active+remapped",
  "epoch": 6500,
  "up": [
        7],
  "acting": [
        7,
        3],
  "info": { "pgid": "1.fa",
      "last_update": "0'0",
      "last_complete": "0'0",
      "log_tail": "0'0",
      "last_user_version": 0,
      "last_backfill": "MAX",
      "purged_snaps": "[]",
      "history": { "epoch_created": 1,
          "last_epoch_started": 6377,
          "last_epoch_clean": 6379,
          "last_epoch_split": 0,
          "same_up_since": 6365,
          "same_interval_since": 6365,
          "same_primary_since": 6348,
          "last_scrub": "0'0",
          "last_scrub_stamp": "2014-01-09 11:37:18.202247",
          "last_deep_scrub": "0'0",
          "last_deep_scrub_stamp": "2014-01-09 11:37:18.202247",
          "last_clean_scrub_stamp": "2014-01-09 11:37:18.202247"},
      "stats": { "version": "0'0",
          "reported_seq": "4320",
          "reported_epoch": "6500",
          "state": "active+remapped",
          "last_fresh": "2014-01-10 12:19:46.219163",
          "last_change": "2014-01-10 11:18:53.147842",
          "last_active": "2014-01-10 12:19:46.219163",
          "last_clean": "2014-01-09 22:02:41.243761",
          "last_became_active": "0.000000",
          "last_unstale": "2014-01-10 12:19:46.219163",
          "mapping_epoch": 6351,
          "log_start": "0'0",
          "ondisk_log_start": "0'0",
          "created": 1,
          "last_epoch_clean": 6379,
          "parent": "0.0",
          "parent_split_bits": 0,
          "last_scrub": "0'0",
          "last_scrub_stamp": "2014-01-09 11:37:18.202247",
          "last_deep_scrub": "0'0",
          "last_deep_scrub_stamp": "2014-01-09 11:37:18.202247",
          "last_clean_scrub_stamp": "2014-01-09 11:37:18.202247",
          "log_size": 0,
          "ondisk_log_size": 0,
          "stats_invalid": "0",
          "stat_sum": { "num_bytes": 0,
              "num_objects": 0,
              "num_object_clones": 0,
              "num_object_copies": 0,
              "num_objects_missing_on_primary": 0,
              "num_objects_degraded": 0,
              "num_objects_unfound": 0,
              "num_read": 0,
              "num_read_kb": 0,
              "num_write": 0,
              "num_write_kb": 0,
              "num_scrub_errors": 0,
              "num_shallow_scrub_errors": 0,
              "num_deep_scrub_errors": 0,
              "num_objects_recovered": 0,
              "num_bytes_recovered": 0,
              "num_keys_recovered": 0},
          "stat_cat_sum": {},
          "up": [
                7],
          "acting": [
                7,
                3]},
      "empty": 1,
      "dne": 0,
      "incomplete": 0,
      "last_epoch_started": 6377},
  "recovery_state": [
        { "name": "Started\/Primary\/Active",
          "enter_time": "2014-01-10 11:18:53.147802",
          "might_have_unfound": [],
          "recovery_progress": { "backfill_target": -1,
              "waiting_on_backfill": 0,
              "last_backfill_started": "0\/\/0\/\/-1",
              "backfill_info": { "begin": "0\/\/0\/\/-1",
                  "end": "0\/\/0\/\/-1",
                  "objects": []},
              "peer_backfill_info": { "begin": "0\/\/0\/\/-1",
                  "end": "0\/\/0\/\/-1",
                  "objects": []},
              "backfills_in_flight": [],
              "recovering": [],
              "pg_backend": { "pull_from_peer": [],
                  "pushing": []}},
          "scrub": { "scrubber.epoch_start": "4757",
              "scrubber.active": 0,
              "scrubber.block_writes": 0,
              "scrubber.finalizing": 0,
              "scrubber.waiting_on": 0,
              "scrubber.waiting_on_whom": []}},
        { "name": "Started",
          "enter_time": "2014-01-10 11:18:40.137868"}]}
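
The fields that matter in a dump like this can be pulled out with a short
script. What follows is a minimal sketch under stated assumptions: the helper
name summarize_pg and its pool_size argument are hypothetical, and the sample
is a trimmed copy of the query output above. It relies only on the documented
meaning of "remapped" (the acting set differs from the up set that CRUSH
computed). Here the up set holds one OSD while the pool has rep size 2, which
suggests, but does not prove, that CRUSH is returning only one OSD for this PG.

```python
import json


def summarize_pg(query: dict, pool_size: int) -> dict:
    """Collect the fields of a `ceph pg <pgid> query` dump relevant to a remapped PG.

    Hypothetical helper for illustration; pool_size is the pool's replica count.
    """
    up = query["up"]
    acting = query["acting"]
    stat_sum = query["info"]["stats"]["stat_sum"]
    return {
        "state": query["state"],
        "up": up,
        "acting": acting,
        # "remapped" means the acting set is not the set CRUSH asked for.
        "remapped": up != acting,
        # How many OSDs the up set is short of the pool's replica count.
        "up_short_by": max(pool_size - len(up), 0),
        "missing_on_primary": stat_sum["num_objects_missing_on_primary"],
        "degraded": stat_sum["num_objects_degraded"],
        "unfound": stat_sum["num_objects_unfound"],
    }


# Trimmed copy of the query output for PG 1.fa shown above.
sample_text = """
{ "state": "active+remapped",
  "up": [7],
  "acting": [7, 3],
  "info": { "stats": { "stat_sum": {
      "num_objects_missing_on_primary": 0,
      "num_objects_degraded": 0,
      "num_objects_unfound": 0 } } } }
"""

summary = summarize_pg(json.loads(sample_text), pool_size=2)
for key, value in summary.items():
    print(f"{key}: {value}")
```

With real output, the dictionary passed in would come from parsing the full
JSON printed by the query command rather than from the trimmed sample.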



On Fri, Jan 10, 2014 at 12:16 PM, Wido den Hollander <w...@42on.com> wrote:

> On 01/10/2014 05:13 AM, YIP Wai Peng wrote:
>
>> Dear all,
>>
>> I have some PGs that are stuck_unclean, and I'm trying to understand why.
>> Hopefully someone can help shed some light on it.
>>
>> For example, one of them is
>>
>> # ceph pg dump_stuck unclean
>> 1.fa  0  0  0  0  0  0  0  active+remapped  2014-01-10 11:18:53.147842
>>   0'0  6452:4272  [7]  [7,3]  0'0  2014-01-09 11:37:18.202247
>>   0'0  2014-01-09 11:37:18.202247
>>
>>
>>
>> My pool 1 looks like this
>>
>> pool 1 'metadata' rep size 2 min_size 1 crush_ruleset 3 object_hash
>> rjenkins pg_num 256 pgp_num 256 last_change 2605 owner 0
>>
>>
>> Rule 3 is:
>>
>> rule different_host {
>>          ruleset 3
>>          type replicated
>>          min_size 1
>>          max_size 10
>>          step take default
>>          step chooseleaf firstn 0 type host
>>          step emit
>> }
>>
>>
>> My osd tree looks like
>>
>> # id  weight  type name            up/down  reweight
>> -1    40      root default
>> -7    3       datacenter CR2
>> -5    3       host ceph3
>> 6     1       osd.6               up       1
>> 7     1       osd.7               up       1
>> 8     1       osd.8               up       1
>> <snip>
>> -9    3       datacenter COM1
>> -6    6       room 02-WIRECEN
>> -4    3       host ceph2
>> 3     1       osd.3               up       1
>> 4     1       osd.4               up       1
>> 5     1       osd.5               up       1
>>
>>
>> osd.7 and osd.3 are on different hosts, so the rule is satisfied. Why is
>> the PG still in the 'remapped' state, and what is it waiting for?
>>
>>
> Try:
>
> $ ceph pg 1.fa query
>
> That will tell you why the PG is stuck.
>
>> - Peng
>>
>>
>> _______________________________________________
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>>
>
> --
> Wido den Hollander
> 42on B.V.
>
> Phone: +31 (0)20 700 9902
> Skype: contact42on
>

