I don't know.  I've seen several cases where people had inconsistent pgs
that they couldn't recover from, and they hadn't lost any disks.  The most
common thread between them is min_size=1.  My postulated scenario might not
be the actual path in the code that leads to it, but something does... and
min_size=1 keeps showing up as the common factor.
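
If you want to double-check your own pools for this, something like the
following should list each pool's replication settings (the exact output
varies a bit by release):

    # list every pool with its size/min_size; look for "min_size 1"
    ceph osd pool ls detail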

On Wed, Nov 1, 2017 at 2:30 PM Gregory Farnum <gfar...@redhat.com> wrote:

> On Wed, Nov 1, 2017 at 11:27 AM Denes Dolhay <de...@denkesys.com> wrote:
>
>> Hello,
>> I have a trick question about Mr. Turner's scenario:
>> Let's assume size=2, min_size=1.
>> - We are looking at pg "A" with acting set [1, 2].
>> - osd 1 goes down: OK.
>> - osd 1 comes back up, and backfill of pg "A" from osd 2 to osd 1
>> commences: OK.
>> - osd 2 goes down (and therefore pg "A"'s backfill to osd 1 is incomplete
>> and stops): not OK, but this is the case...
>> --> In this event, why does osd 1 accept IO to pg "A", knowing full well
>> that its data is outdated and will cause an inconsistent state?
>> Wouldn't it be prudent to deny IO to pg "A" until either
>> - osd 2 comes back (so we have a clean osd in the acting set), with
>> backfill to osd 1 then continuing, of course,
>> - or the data in pg "A" is manually marked as lost, and operation then
>> continues from osd 1's (outdated) copy?
>>
>
> It does deny IO in that case. I think David was pointing out that if OSD 2
> is actually dead and gone, you've got data loss despite having only lost
> one OSD.
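>
> (You can list any PGs that are refusing IO in that state with something
> like:
>
>     # show PGs that are stuck inactive, i.e. not serving IO
>     ceph pg dump_stuck inactive
>
> They will typically show as "down" or "incomplete" until osd 2 returns or
> its data is marked lost.)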
> -Greg
>
>
>>
>> Thanks in advance, I'm really curious!
>>
>> Denes.
>>
>>
>>
>> On 11/01/2017 06:33 PM, Mario Giammarco wrote:
>>
>> I read your post, then read the thread you suggested; very
>> interesting.
>> Then I read your post again and understood it better.
>> The most important point is that even with min_size=1, writes are only
>> acknowledged after Ceph has written all size=2 copies.
>> The linked thread says:
>>
>> As David already said, when all OSDs are up and in for a PG Ceph will wait 
>> for ALL OSDs to Ack the write. Writes in RADOS are always synchronous.
>>
>> Only when OSDs go down you need at least min_size OSDs up before writes or 
>> reads are accepted.
>>
>> So if min_size = 2 and size = 3 you need at least 2 OSDs online for I/O to 
>> take place.
>>
>>
>> You then showed me a sequence of events that may happen in some use cases.
>> My use case is quite different: we use Ceph under Proxmox, and the
>> servers have their disks on RAID 5 (I agree that it is better to expose
>> single disks to Ceph, but it is too late for that now).
>> So, thanks to the RAID, it is unlikely that a single Ceph disk fails. If
>> a disk fails, it is probably because the entire server has failed (and we
>> need to provide business availability in that case), so it will never
>> come up again, and therefore your sequence of events should never happen
>> in my situation.
>> What shocked me is that I did not expect to see so many inconsistencies.
>> Thanks,
>> Mario
>>
>>
>> 2017-11-01 16:45 GMT+01:00 David Turner <drakonst...@gmail.com>:
>>
>>> It looks like you're running with size = 2 and min_size = 1 (the
>>> min_size is a guess; the size is based on how many osds belong to your
>>> problem PGs).  Here's some good reading for you:
>>> https://www.spinics.net/lists/ceph-users/msg32895.html
>>>
>>> Basically the gist is that when running with size = 2 you should assume
>>> that data loss is an eventuality and decide that this is ok for your use
>>> case.  This can be mitigated by using min_size = 2, but then your pool will
>>> block while an OSD is down and you'll have to manually go in and change the
>>> min_size temporarily to perform maintenance.
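>>>
>>> As a sketch of that maintenance dance (the pool name "rbd" here is just
>>> a placeholder for whichever pool you actually have):
>>>
>>>     # allow IO with a single replica while the node is down
>>>     ceph osd pool set rbd min_size 1
>>>     # ...do the maintenance and wait for recovery to finish...
>>>     ceph osd pool set rbd min_size 2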
>>>
>>> All it takes for data loss is that an osd on server 1 is marked down and
>>> a write happens to an osd on server 2.  Now the osd on server 2 goes down
>>> before the osd on server 1 has finished backfilling, and the first osd
>>> receives a request to modify data in an object whose current state it
>>> doesn't know.  Tada, you have data loss.
>>>
>>> How likely is this to happen?  Eventually it will.  PG subfolder
>>> splitting (if you're using filestore) will occasionally take long enough
>>> that the osd is marked down while it's still running, and when that
>>> happens it tends to happen all over the cluster for a while.  Other
>>> triggers are anything that causes segfaults in the osds; restarting a
>>> node before all pgs are done backfilling/recovering; the OOM killer;
>>> power outages; etc; etc.
>>>
>>> Why does min_size = 2 prevent this?  Because for a write to be
>>> acknowledged by the cluster, it has to be written to every OSD that is up,
>>> as long as at least min_size of them are available.  This means that every
>>> write is acknowledged by at least 2 osds every time.  If you're running
>>> with size = 2, then both copies of the data need to be online for a write
>>> to happen, so one copy can never have a write that the other does not.  If
>>> you're running with size = 3, then a majority of the OSDs are always
>>> online receiving each write, and they can agree on the correct data to
>>> give to the third when it comes back up.
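>>>
>>> (If you want to see exactly which OSDs a write to a given PG has to
>>> reach, you can map the PG to its up/acting sets; pg 2.6 below is just one
>>> of the inconsistent PGs from your health output:
>>>
>>>     # show the up and acting OSD sets for one placement group
>>>     ceph pg map 2.6
>>>
>>> With size = 2 that will only ever list two OSDs, which is the whole
>>> problem.)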
>>>
>>> On Wed, Nov 1, 2017 at 3:31 AM Mario Giammarco <mgiamma...@gmail.com>
>>> wrote:
>>>
>>>> Sure here it is ceph -s:
>>>>
>>>> cluster:
>>>>    id:     8bc45d9a-ef50-4038-8e1b-1f25ac46c945
>>>>    health: HEALTH_ERR
>>>>            100 scrub errors
>>>>            Possible data damage: 56 pgs inconsistent
>>>>
>>>>  services:
>>>>    mon: 3 daemons, quorum 0,1,pve3
>>>>    mgr: pve3(active)
>>>>    osd: 3 osds: 3 up, 3 in
>>>>
>>>>  data:
>>>>    pools:   1 pools, 256 pgs
>>>>    objects: 269k objects, 1007 GB
>>>>    usage:   2050 GB used, 1386 GB / 3436 GB avail
>>>>    pgs:     200 active+clean
>>>>             56  active+clean+inconsistent
>>>>
>>>> ---
>>>>
>>>> ceph health detail :
>>>>
>>>> PG_DAMAGED Possible data damage: 56 pgs inconsistent
>>>>    pg 2.6 is active+clean+inconsistent, acting [1,0]
>>>>    pg 2.19 is active+clean+inconsistent, acting [1,2]
>>>>    pg 2.1e is active+clean+inconsistent, acting [1,2]
>>>>    pg 2.1f is active+clean+inconsistent, acting [1,2]
>>>>    pg 2.24 is active+clean+inconsistent, acting [0,2]
>>>>    pg 2.25 is active+clean+inconsistent, acting [2,0]
>>>>    pg 2.36 is active+clean+inconsistent, acting [1,0]
>>>>    pg 2.3d is active+clean+inconsistent, acting [1,2]
>>>>    pg 2.4b is active+clean+inconsistent, acting [1,0]
>>>>    pg 2.4c is active+clean+inconsistent, acting [0,2]
>>>>    pg 2.4d is active+clean+inconsistent, acting [1,2]
>>>>    pg 2.4f is active+clean+inconsistent, acting [1,2]
>>>>    pg 2.50 is active+clean+inconsistent, acting [1,2]
>>>>    pg 2.52 is active+clean+inconsistent, acting [1,2]
>>>>    pg 2.56 is active+clean+inconsistent, acting [1,0]
>>>>    pg 2.5b is active+clean+inconsistent, acting [1,2]
>>>>    pg 2.5c is active+clean+inconsistent, acting [1,2]
>>>>    pg 2.5d is active+clean+inconsistent, acting [1,0]
>>>>    pg 2.5f is active+clean+inconsistent, acting [1,2]
>>>>    pg 2.71 is active+clean+inconsistent, acting [0,2]
>>>>    pg 2.75 is active+clean+inconsistent, acting [1,2]
>>>>    pg 2.77 is active+clean+inconsistent, acting [1,2]
>>>>    pg 2.79 is active+clean+inconsistent, acting [1,2]
>>>>    pg 2.7e is active+clean+inconsistent, acting [1,2]
>>>>    pg 2.83 is active+clean+inconsistent, acting [1,0]
>>>>    pg 2.8a is active+clean+inconsistent, acting [1,0]
>>>>    pg 2.92 is active+clean+inconsistent, acting [1,2]
>>>>    pg 2.98 is active+clean+inconsistent, acting [1,0]
>>>>    pg 2.9a is active+clean+inconsistent, acting [1,0]
>>>>    pg 2.9e is active+clean+inconsistent, acting [1,0]
>>>>    pg 2.9f is active+clean+inconsistent, acting [1,2]
>>>>    pg 2.c6 is active+clean+inconsistent, acting [0,2]
>>>>    pg 2.c7 is active+clean+inconsistent, acting [1,0]
>>>>    pg 2.c8 is active+clean+inconsistent, acting [1,2]
>>>>    pg 2.cb is active+clean+inconsistent, acting [1,2]
>>>>    pg 2.cd is active+clean+inconsistent, acting [1,2]
>>>>    pg 2.ce is active+clean+inconsistent, acting [1,2]
>>>>    pg 2.d2 is active+clean+inconsistent, acting [2,1]
>>>>    pg 2.da is active+clean+inconsistent, acting [1,0]
>>>>    pg 2.de is active+clean+inconsistent, acting [1,2]
>>>>    pg 2.e1 is active+clean+inconsistent, acting [1,2]
>>>>    pg 2.e4 is active+clean+inconsistent, acting [1,0]
>>>>    pg 2.e6 is active+clean+inconsistent, acting [0,2]
>>>>    pg 2.e8 is active+clean+inconsistent, acting [1,2]
>>>>    pg 2.ee is active+clean+inconsistent, acting [1,0]
>>>>    pg 2.f9 is active+clean+inconsistent, acting [1,2]
>>>>    pg 2.fa is active+clean+inconsistent, acting [1,0]
>>>>    pg 2.fb is active+clean+inconsistent, acting [1,2]
>>>>    pg 2.fc is active+clean+inconsistent, acting [1,2]
>>>>    pg 2.fe is active+clean+inconsistent, acting [1,0]
>>>>    pg 2.ff is active+clean+inconsistent, acting [1,0]
>>>>
>>>>
>>>> and ceph pg 2.6 query:
>>>>
>>>> {
>>>>    "state": "active+clean+inconsistent",
>>>>    "snap_trimq": "[]",
>>>>    "epoch": 1513,
>>>>    "up": [
>>>>        1,
>>>>        0
>>>>    ],
>>>>    "acting": [
>>>>        1,
>>>>        0
>>>>    ],
>>>>    "actingbackfill": [
>>>>        "0",
>>>>        "1"
>>>>    ],
>>>>    "info": {
>>>>        "pgid": "2.6",
>>>>        "last_update": "1513'89145",
>>>>        "last_complete": "1513'89145",
>>>>        "log_tail": "1503'87586",
>>>>        "last_user_version": 330583,
>>>>        "last_backfill": "MAX",
>>>>        "last_backfill_bitwise": 0,
>>>>        "purged_snaps": [
>>>>            {
>>>>                "start": "1",
>>>>                "length": "178"
>>>>            },
>>>>            {
>>>>                "start": "17a",
>>>>                "length": "3d"
>>>>            },
>>>>            {
>>>>                "start": "1b8",
>>>>                "length": "1"
>>>>            },
>>>>            {
>>>>                "start": "1ba",
>>>>                "length": "1"
>>>>            },
>>>>            {
>>>>                "start": "1bc",
>>>>                "length": "1"
>>>>            },
>>>>            {
>>>>                "start": "1be",
>>>>                "length": "44"
>>>>            },
>>>>            {
>>>>                "start": "205",
>>>>                "length": "12c"
>>>>            },
>>>>            {
>>>>                "start": "332",
>>>>                "length": "1"
>>>>            },
>>>>            {
>>>>                "start": "334",
>>>>                "length": "1"
>>>>            },
>>>>            {
>>>>                "start": "336",
>>>>                "length": "1"
>>>>            },
>>>>            {
>>>>                "start": "338",
>>>>                "length": "1"
>>>>            },
>>>>            {
>>>>                "start": "33a",
>>>>                "length": "1"
>>>>            }
>>>>        ],
>>>>        "history": {
>>>>            "epoch_created": 90,
>>>>            "epoch_pool_created": 90,
>>>>            "last_epoch_started": 1339,
>>>>            "last_interval_started": 1338,
>>>>            "last_epoch_clean": 1339,
>>>>            "last_interval_clean": 1338,
>>>>            "last_epoch_split": 0,
>>>>            "last_epoch_marked_full": 0,
>>>>            "same_up_since": 1338,
>>>>            "same_interval_since": 1338,
>>>>            "same_primary_since": 1338,
>>>>            "last_scrub": "1513'89112",
>>>>            "last_scrub_stamp": "2017-11-01 05:52:21.259654",
>>>>            "last_deep_scrub": "1513'89112",
>>>>            "last_deep_scrub_stamp": "2017-11-01 05:52:21.259654",
>>>>            "last_clean_scrub_stamp": "2017-10-25 04:25:09.830840"
>>>>        },
>>>>        "stats": {
>>>>            "version": "1513'89145",
>>>>            "reported_seq": "422820",
>>>>            "reported_epoch": "1513",
>>>>            "state": "active+clean+inconsistent",
>>>>            "last_fresh": "2017-11-01 08:11:38.411784",
>>>>            "last_change": "2017-11-01 05:52:21.259789",
>>>>            "last_active": "2017-11-01 08:11:38.411784",
>>>>            "last_peered": "2017-11-01 08:11:38.411784",
>>>>            "last_clean": "2017-11-01 08:11:38.411784",
>>>>            "last_became_active": "2017-10-15 20:36:33.644567",
>>>>            "last_became_peered": "2017-10-15 20:36:33.644567",
>>>>            "last_unstale": "2017-11-01 08:11:38.411784",
>>>>            "last_undegraded": "2017-11-01 08:11:38.411784",
>>>>            "last_fullsized": "2017-11-01 08:11:38.411784",
>>>>            "mapping_epoch": 1338,
>>>>            "log_start": "1503'87586",
>>>>            "ondisk_log_start": "1503'87586",
>>>>            "created": 90,
>>>>            "last_epoch_clean": 1339,
>>>>            "parent": "0.0",
>>>>            "parent_split_bits": 0,
>>>>            "last_scrub": "1513'89112",
>>>>            "last_scrub_stamp": "2017-11-01 05:52:21.259654",
>>>>            "last_deep_scrub": "1513'89112",
>>>>            "last_deep_scrub_stamp": "2017-11-01 05:52:21.259654",
>>>>            "last_clean_scrub_stamp": "2017-10-25 04:25:09.830840",
>>>>            "log_size": 1559,
>>>>            "ondisk_log_size": 1559,
>>>>            "stats_invalid": false,
>>>>            "dirty_stats_invalid": false,
>>>>            "omap_stats_invalid": false,
>>>>            "hitset_stats_invalid": false,
>>>>            "hitset_bytes_stats_invalid": false,
>>>>            "pin_stats_invalid": false,
>>>>            "stat_sum": {
>>>>                "num_bytes": 3747886080 <374%20788%206080>,
>>>>                "num_objects": 958,
>>>>                "num_object_clones": 295,
>>>>                "num_object_copies": 1916,
>>>>                "num_objects_missing_on_primary": 0,
>>>>                "num_objects_missing": 0,
>>>>                "num_objects_degraded": 0,
>>>>                "num_objects_misplaced": 0,
>>>>                "num_objects_unfound": 0,
>>>>                "num_objects_dirty": 958,
>>>>                "num_whiteouts": 0,
>>>>                "num_read": 333428,
>>>>                "num_read_kb": 135550185,
>>>>                "num_write": 79221,
>>>>                "num_write_kb": 13441239,
>>>>                "num_scrub_errors": 1,
>>>>                "num_shallow_scrub_errors": 0,
>>>>                "num_deep_scrub_errors": 1,
>>>>                "num_objects_recovered": 245,
>>>>                "num_bytes_recovered": 1012833792,
>>>>                "num_keys_recovered": 6,
>>>>                "num_objects_omap": 0,
>>>>                "num_objects_hit_set_archive": 0,
>>>>                "num_bytes_hit_set_archive": 0,
>>>>                "num_flush": 0,
>>>>                "num_flush_kb": 0,
>>>>                "num_evict": 0,
>>>>                "num_evict_kb": 0,
>>>>                "num_promote": 0,
>>>>                "num_flush_mode_high": 0,
>>>>                "num_flush_mode_low": 0,
>>>>                "num_evict_mode_some": 0,
>>>>                "num_evict_mode_full": 0,
>>>>                "num_objects_pinned": 0,
>>>>                "num_legacy_snapsets": 0
>>>>            },
>>>>            "up": [
>>>>                1,
>>>>                0
>>>>            ],
>>>>            "acting": [
>>>>                1,
>>>>                0
>>>>            ],
>>>>            "blocked_by": [],
>>>>            "up_primary": 1,
>>>>            "acting_primary": 1
>>>>        },
>>>>        "empty": 0,
>>>>        "dne": 0,
>>>>        "incomplete": 0,
>>>>        "last_epoch_started": 1339,
>>>>        "hit_set_history": {
>>>>            "current_last_update": "0'0",
>>>>            "history": []
>>>>        }
>>>>    },
>>>>    "peer_info": [
>>>>        {
>>>>            "peer": "0",
>>>>            "pgid": "2.6",
>>>>            "last_update": "1513'89145",
>>>>            "last_complete": "1513'89145",
>>>>            "log_tail": "1274'68440",
>>>>            "last_user_version": 315687,
>>>>            "last_backfill": "MAX",
>>>>            "last_backfill_bitwise": 0,
>>>>            "purged_snaps": [
>>>>                {
>>>>                    "start": "1",
>>>>                    "length": "178"
>>>>                },
>>>>                {
>>>>                    "start": "17a",
>>>>                    "length": "3d"
>>>>                },
>>>>                {
>>>>                    "start": "1b8",
>>>>                    "length": "1"
>>>>                },
>>>>                {
>>>>                    "start": "1ba",
>>>>                    "length": "1"
>>>>                },
>>>>                {
>>>>                    "start": "1bc",
>>>>                    "length": "1"
>>>>                },
>>>>                {
>>>>                    "start": "1be",
>>>>                    "length": "44"
>>>>                },
>>>>                {
>>>>                    "start": "205",
>>>>                    "length": "82"
>>>>                },
>>>>                {
>>>>                    "start": "288",
>>>>                    "length": "1"
>>>>                },
>>>>                {
>>>>                    "start": "28a",
>>>>                    "length": "1"
>>>>                },
>>>>                {
>>>>                    "start": "28c",
>>>>                    "length": "1"
>>>>                },
>>>>                {
>>>>                    "start": "28e",
>>>>                    "length": "1"
>>>>                },
>>>>                {
>>>>                    "start": "290",
>>>>                    "length": "1"
>>>>                }
>>>>            ],
>>>>            "history": {
>>>>                "epoch_created": 90,
>>>>                "epoch_pool_created": 90,
>>>>                "last_epoch_started": 1339,
>>>>                "last_interval_started": 1338,
>>>>                "last_epoch_clean": 1339,
>>>>                "last_interval_clean": 1338,
>>>>                "last_epoch_split": 0,
>>>>                "last_epoch_marked_full": 0,
>>>>                "same_up_since": 1338,
>>>>                "same_interval_since": 1338,
>>>>                "same_primary_since": 1338,
>>>>                "last_scrub": "1513'89112",
>>>>                "last_scrub_stamp": "2017-11-01 05:52:21.259654",
>>>>                "last_deep_scrub": "1513'89112",
>>>>                "last_deep_scrub_stamp": "2017-11-01 05:52:21.259654",
>>>>                "last_clean_scrub_stamp": "2017-10-25 04:25:09.830840"
>>>>            },
>>>>            "stats": {
>>>>                "version": "1337'71465",
>>>>                "reported_seq": "347015",
>>>>                "reported_epoch": "1338",
>>>>                "state": "active+undersized+degraded",
>>>>                "last_fresh": "2017-10-15 20:35:36.930611",
>>>>                "last_change": "2017-10-15 20:30:35.752042",
>>>>                "last_active": "2017-10-15 20:35:36.930611",
>>>>                "last_peered": "2017-10-15 20:35:36.930611",
>>>>                "last_clean": "2017-10-15 20:30:01.443288",
>>>>                "last_became_active": "2017-10-15 20:30:35.752042",
>>>>                "last_became_peered": "2017-10-15 20:30:35.752042",
>>>>                "last_unstale": "2017-10-15 20:35:36.930611",
>>>>                "last_undegraded": "2017-10-15 20:30:35.749043",
>>>>                "last_fullsized": "2017-10-15 20:30:35.749043",
>>>>                "mapping_epoch": 1338,
>>>>                "log_start": "1274'68440",
>>>>                "ondisk_log_start": "1274'68440",
>>>>                "created": 90,
>>>>                "last_epoch_clean": 1331,
>>>>                "parent": "0.0",
>>>>                "parent_split_bits": 0,
>>>>                "last_scrub": "1294'71370",
>>>>                "last_scrub_stamp": "2017-10-15 09:27:31.756027",
>>>>                "last_deep_scrub": "1284'70813",
>>>>                "last_deep_scrub_stamp": "2017-10-14 06:35:57.556773",
>>>>                "last_clean_scrub_stamp": "2017-10-15 09:27:31.756027",
>>>>                "log_size": 3025,
>>>>                "ondisk_log_size": 3025,
>>>>                "stats_invalid": false,
>>>>                "dirty_stats_invalid": false,
>>>>                "omap_stats_invalid": false,
>>>>                "hitset_stats_invalid": false,
>>>>                "hitset_bytes_stats_invalid": false,
>>>>                "pin_stats_invalid": false,
>>>>                "stat_sum": {
>>>>                    "num_bytes": 3555027456 <355%20502%207456>,
>>>>                    "num_objects": 917,
>>>>                    "num_object_clones": 255,
>>>>                    "num_object_copies": 1834,
>>>>                    "num_objects_missing_on_primary": 0,
>>>>                    "num_objects_missing": 0,
>>>>                    "num_objects_degraded": 917,
>>>>                    "num_objects_misplaced": 0,
>>>>                    "num_objects_unfound": 0,
>>>>                    "num_objects_dirty": 917,
>>>>                    "num_whiteouts": 0,
>>>>                    "num_read": 275095,
>>>>                    "num_read_kb": 111713846,
>>>>                    "num_write": 64324,
>>>>                    "num_write_kb": 11365374,
>>>>                    "num_scrub_errors": 0,
>>>>                    "num_shallow_scrub_errors": 0,
>>>>                    "num_deep_scrub_errors": 0,
>>>>                    "num_objects_recovered": 243,
>>>>                    "num_bytes_recovered": 1008594432,
>>>>                    "num_keys_recovered": 6,
>>>>                    "num_objects_omap": 0,
>>>>                    "num_objects_hit_set_archive": 0,
>>>>                    "num_bytes_hit_set_archive": 0,
>>>>                    "num_flush": 0,
>>>>                    "num_flush_kb": 0,
>>>>                    "num_evict": 0,
>>>>                    "num_evict_kb": 0,
>>>>                    "num_promote": 0,
>>>>                    "num_flush_mode_high": 0,
>>>>                    "num_flush_mode_low": 0,
>>>>                    "num_evict_mode_some": 0,
>>>>                    "num_evict_mode_full": 0,
>>>>                    "num_objects_pinned": 0,
>>>>                    "num_legacy_snapsets": 0
>>>>                },
>>>>                "up": [
>>>>                    1,
>>>>                    0
>>>>                ],
>>>>                "acting": [
>>>>                    1,
>>>>                    0
>>>>                ],
>>>>                "blocked_by": [],
>>>>                "up_primary": 1,
>>>>                "acting_primary": 1
>>>>            },
>>>>            "empty": 0,
>>>>            "dne": 0,
>>>>            "incomplete": 0,
>>>>            "last_epoch_started": 1339,
>>>>            "hit_set_history": {
>>>>                "current_last_update": "0'0",
>>>>                "history": []
>>>>            }
>>>>        }
>>>>    ],
>>>>    "recovery_state": [
>>>>        {
>>>>            "name": "Started/Primary/Active",
>>>>            "enter_time": "2017-10-15 20:36:33.574915",
>>>>            "might_have_unfound": [
>>>>                {
>>>>                    "osd": "0",
>>>>                    "status": "already probed"
>>>>                }
>>>>            ],
>>>>            "recovery_progress": {
>>>>                "backfill_targets": [],
>>>>                "waiting_on_backfill": [],
>>>>                "last_backfill_started": "MIN",
>>>>                "backfill_info": {
>>>>                    "begin": "MIN",
>>>>                    "end": "MIN",
>>>>                    "objects": []
>>>>                },
>>>>                "peer_backfill_info": [],
>>>>                "backfills_in_flight": [],
>>>>                "recovering": [],
>>>>                "pg_backend": {
>>>>                    "pull_from_peer": [],
>>>>                    "pushing": []
>>>>                }
>>>>            },
>>>>            "scrub": {
>>>>                "scrubber.epoch_start": "1338",
>>>>                "scrubber.active": false,
>>>>                "scrubber.state": "INACTIVE",
>>>>                "scrubber.start": "MIN",
>>>>                "scrubber.end": "MIN",
>>>>                "scrubber.subset_last_update": "0'0",
>>>>                "scrubber.deep": false,
>>>>                "scrubber.seed": 0,
>>>>                "scrubber.waiting_on": 0,
>>>>                "scrubber.waiting_on_whom": []
>>>>            }
>>>>        },
>>>>        {
>>>>            "name": "Started",
>>>>            "enter_time": "2017-10-15 20:36:32.592892"
>>>>        }
>>>>    ],
>>>>    "agent_state": {}
>>>> }
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> 2017-10-30 23:30 GMT+01:00 Gregory Farnum <gfar...@redhat.com>:
>>>>
>>>>> You'll need to tell us exactly what error messages you're seeing, what
>>>>> the output of ceph -s is, and the output of pg query for the relevant PGs.
>>>>> There's not a lot of documentation because much of this tooling is
>>>>> new, it's changing quickly, and most people don't have the kinds of
>>>>> problems that turn out to be unrepairable. We should do better about that,
>>>>> though.
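>>>>>
>>>>> (For example, something like the following should show what a scrub
>>>>> actually found for an inconsistent PG; the pg id 2.6 is just a
>>>>> placeholder for one reported by ceph health detail:
>>>>>
>>>>>     rados list-inconsistent-obj 2.6 --format=json-pretty
>>>>>
>>>>> and "ceph pg repair 2.6" then attempts the fix.)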
>>>>> -Greg
>>>>>
>>>>> On Mon, Oct 30, 2017, 11:40 AM Mario Giammarco <mgiamma...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>>  >[Questions to the list]
>>>>>>  >How is it possible that the cluster cannot repair itself with
>>>>>>  >ceph pg repair?
>>>>>>  >Are no good copies remaining?
>>>>>>  >Can it not decide which copy is valid or up-to-date?
>>>>>>  >If so, why not, when there is a checksum and an mtime for everything?
>>>>>>  >In this inconsistent state, which object does the cluster serve
>>>>>>  >when it doesn't know which one is valid?
>>>>>>
>>>>>>
>>>>>> I am asking the same questions.  It seems strange to me that for a
>>>>>> fault-tolerant clustered storage system like Ceph there is no
>>>>>> documentation about this.
>>>>>>
>>>>>> I know I am being pedantic, but please note that saying "to be safe,
>>>>>> use three copies" is not enough, because I am not sure what Ceph
>>>>>> really does when the three copies do not match.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
