Re: [ceph-users] Ceph health error (was: Prioritize recovery over backfilling)

2018-12-20 Thread Paul Emmerich
Oh, I've seen this bug twice on different clusters with Luminous on EC
pools with lots of snapshots in the last few months. Seen it on 12.2.5
and 12.2.10 on CentOS.

It's basically a broken object somewhere that kills an OSD and then
gets recovered to another OSD, which then also dies.
For us the only way to revive these PGs seemed to be a rather radical
one: hard-code the broken object name into the OSD and tell it to
ignore that object. Luckily we didn't care about the object in either
case, and we could even get a copy of it with ceph-objectstore-tool
before doing the ignore step.
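
For reference, a rough sketch of that ceph-objectstore-tool step (the data
path, PG id and object name below are placeholders based on the logs later
in this thread; the OSD has to be stopped first):

  # list the objects in the affected PG shard to find the exact name
  ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-16 \
      --pgid 6.3fs4 --op list
  # save a copy of the suspect object before doing anything destructive
  ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-16 \
      --pgid 6.3fs4 '<object json from the list output>' \
      get-bytes > /tmp/broken-object.bin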

(Which reminds me that I should report this on tracker.ceph.com, but I
don't think I have enough debug logs stored beyond what is already
available on the tracker issues mentioned in this thread; the priority was
to get the cluster back online, and we were only called in after the bug
had already eaten through lots of OSDs and there were inactive PGs.)

Paul

-- 
Paul Emmerich

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90

On Thu, Dec 20, 2018 at 5:10 PM Daniel K  wrote:
>
> Did you ever get anywhere with this?
>
> I have 6 OSDs out of 36 continuously flapping with this error in the logs.
>
> Thanks,
>
> Dan
>
> On Fri, Jun 8, 2018 at 11:10 AM Caspar Smit  wrote:
>>
>> Hi all,
>>
>> Maybe this will help:
>>
>> The issue is with shards 3,4 and 5 of PG 6.3f:
>>
>> LOG's of OSD's 16, 17 & 36 (the ones crashing on startup).
>>
>>
>> [log of OSD.16 snipped; quoted in full in Caspar's original message below]

Re: [ceph-users] Ceph health error (was: Prioritize recovery over backfilling)

2018-12-20 Thread Daniel K
Did you ever get anywhere with this?

I have 6 OSDs out of 36 continuously flapping with this error in the logs.

Thanks,

Dan

On Fri, Jun 8, 2018 at 11:10 AM Caspar Smit  wrote:

> Hi all,
>
> Maybe this will help:
>
> The issue is with shards 3,4 and 5 of PG 6.3f:
>
> LOG's of OSD's 16, 17 & 36 (the ones crashing on startup).
>
>
> [log of OSD.16 snipped; quoted in full in Caspar's original message below]

Re: [ceph-users] Ceph health error (was: Prioritize recovery over backfilling)

2018-06-08 Thread Caspar Smit
Hi all,

Maybe this will help:

The issue is with shards 3, 4 and 5 of PG 6.3f:

Logs of OSDs 16, 17 & 36 (the ones crashing on startup).


*Log OSD.16 (shard 4):*

2018-06-08 08:35:01.727261 7f4c585e3700 -1
bluestore(/var/lib/ceph/osd/ceph-16) _txc_add_transaction error (2) No such
file or directory not handled on operation 30 (op 0, counting from 0)
2018-06-08 08:35:01.727273 7f4c585e3700 -1
bluestore(/var/lib/ceph/osd/ceph-16) ENOENT on clone suggests osd bug
2018-06-08 08:35:01.727274 7f4c585e3700  0
bluestore(/var/lib/ceph/osd/ceph-16)  transaction dump:
{
"ops": [
{
"op_num": 0,
"op_name": "clonerange2",
"collection": "6.3fs4_head",
"src_oid":
"4#6:fc074663:::rbd_data.5.6c1d9574b0dc51.000312db:head#903d0",
"dst_oid":
"4#6:fc074663:::rbd_data.5.6c1d9574b0dc51.000312db:head#",
"src_offset": 950272,
"len": 98304,
"dst_offset": 950272
},
{
"op_num": 1,
"op_name": "remove",
"collection": "6.3fs4_head",
"oid":
"4#6:fc074663:::rbd_data.5.6c1d9574b0dc51.000312db:head#903d0"
},
{
"op_num": 2,
"op_name": "setattrs",
"collection": "6.3fs4_head",
"oid":
"4#6:fc074663:::rbd_data.5.6c1d9574b0dc51.000312db:head#",
"attr_lens": {
"_": 297,
"hinfo_key": 18,
"snapset": 35
}
},
{
"op_num": 3,
"op_name": "clonerange2",
"collection": "6.3fs4_head",
"src_oid":
"4#6:fc074663:::rbd_data.5.6c1d9574b0dc51.000312db:head#903cf",
"dst_oid":
"4#6:fc074663:::rbd_data.5.6c1d9574b0dc51.000312db:head#",
"src_offset": 679936,
"len": 274432,
"dst_offset": 679936
},
{
"op_num": 4,
"op_name": "remove",
"collection": "6.3fs4_head",
"oid":
"4#6:fc074663:::rbd_data.5.6c1d9574b0dc51.000312db:head#903cf"
},
{
"op_num": 5,
"op_name": "setattrs",
"collection": "6.3fs4_head",
"oid":
"4#6:fc074663:::rbd_data.5.6c1d9574b0dc51.000312db:head#",
"attr_lens": {
"_": 297,
"hinfo_key": 18,
"snapset": 35
}
},
{
"op_num": 6,
"op_name": "nop"
},
{
"op_num": 7,
"op_name": "op_omap_rmkeyrange",
"collection": "6.3fs4_head",
"oid": "4#6:fc00head#",
"first": "011124.00590799",
"last": "4294967295.18446744073709551615"
},
{
"op_num": 8,
"op_name": "omap_setkeys",
"collection": "6.3fs4_head",
"oid": "4#6:fc00head#",
"attr_lens": {
"_biginfo": 597,
"_epoch": 4,
"_info": 953,
"can_rollback_to": 12,
"rollback_info_trimmed_to": 12
}
}
]
}

2018-06-08 08:35:01.730584 7f4c585e3700 -1
/home/builder/source/ceph-12.2.2/src/os/bluestore/BlueStore.cc: In function
'void BlueStore::_txc_add_transaction(BlueStore::TransContext*,
ObjectStore::Transaction*)' thread 7f4c585e3700 time 2018-06-08
08:35:01.727379
/home/builder/source/ceph-12.2.2/src/os/bluestore/BlueStore.cc: 9363:
FAILED assert(0 == "unexpected error")

 ceph version 12.2.2 (215dd7151453fae88e6f968c975b6ce309d42dcf) luminous
(stable)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char
const*)+0x102) [0x558e08ba4202]
 2: (BlueStore::_txc_add_transaction(BlueStore::TransContext*,
ObjectStore::Transaction*)+0x15fa) [0x558e08a55c3a]
 3: (BlueStore::queue_transactions(ObjectStore::Sequencer*,
std::vector >&,
boost::intrusive_ptr, ThreadPool::TPHandle*)+0x546)
[0x558e08a572a6]
 4: (ObjectStore::queue_transaction(ObjectStore::Sequencer*,
ObjectStore::Transaction&&, Context*, Context*, Context*,
boost::intrusive_ptr, ThreadPool::TPHandle*)+0x14f)
[0x558e085fa37f]
 5: (OSD::dispatch_context_transaction(PG::RecoveryCtx&, PG*,
ThreadPool::TPHandle*)+0x6c) [0x558e0857db5c]
 6: (OSD::process_peering_events(std::__cxx11::list > const&, ThreadPool::TPHandle&)+0x442) [0x558e085abec2]
 7: (ThreadPool::BatchWorkQueue::_void_process(void*,
ThreadPool::TPHandle&)+0x2c) [0x558e0861a91c]
 8: (ThreadPool::worker(ThreadPool::WorkThread*)+0xeb8) [0x558e08bab3a8]
 9: (ThreadPool::WorkThread::entry()+0x10) [0x558e08bac540]
 10: (()+0x7494) [0x7f4c709ca494]
 11: (clone()+0x3f) [0x7f4c6fa51aff]
 NOTE: a copy of the executable, or `objdump -rdS ` is needed
to interpret this.


*Log OSD.17 (Shard 5):*

2018-06-08 08:35:56.745249 7fe3fa687700 -1
bluestore(/var/lib/ceph/osd/ceph-17) _txc_add_transaction error (2) No such
file or

Re: [ceph-users] Ceph health error (was: Prioritize recovery over backfilling)

2018-06-08 Thread Caspar Smit
Hi all,

I seem to be hitting these tracker issues:

https://tracker.ceph.com/issues/23145
http://tracker.ceph.com/issues/24422

PGs 6.1 and 6.3f are having these issues.

When I list all PGs of a down OSD with:

ceph-objectstore-tool --dry-run --type bluestore --data-path
/var/lib/ceph/osd/ceph-17/ --op list-pgs

There are a lot of 'double' pgids like these (also for other PGs):

6.3fs3
6.3fs5

Is that normal? I would assume different shards of an EC pool would be on
separate OSDs.
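
A quick way to double-check where the shards of that PG are mapped (a
sketch; the pg id is the one from this thread):

  ceph pg map 6.3f

For an EC pool each position in the printed up/acting set corresponds to
one shard (s0, s1, ...), so with a normal CRUSH rule the same OSD should
not appear in more than one position.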

We still have 4 OSDs down and 2 PGs down+remapped, and I can't find any
way to get the crashed OSDs back up.

pg 6.1 is down+remapped, acting
[6,3,2147483647,29,2147483647,2147483647]
pg 6.3f is down+remapped, acting [20,24,2147483647,2147483647,3,28]
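
(2147483647 in those acting sets is the "no OSD" placeholder, i.e. that
shard currently has no acting OSD.) A sketch of how to see what is keeping
a PG down, using the pg id from above:

  ceph pg 6.3f query

The recovery_state section of the output typically lists the peering
blockers, e.g. a down_osds_we_would_probe list and a comment explaining
why the PG cannot go active.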

Kind regards,
Caspar Smit

2018-06-08 8:53 GMT+02:00 Caspar Smit :

> [quoted earlier messages snipped; they appear in full further down the thread]

Re: [ceph-users] Ceph health error (was: Prioritize recovery over backfilling)

2018-06-07 Thread Caspar Smit
Update:

I've unset nodown to let it continue, but now 4 OSDs are down and cannot be
brought up again. Here's what the logfile reads:

2018-06-08 08:35:01.716245 7f4c58de4700  0 log_channel(cluster) log [INF] :
6.e3s0 continuing backfill to osd.37(4) from (10864'911406,11124'921472]
6:c7d71bbd:::rbd_data.5.6c1d9574b0dc51.00bf38b9:head to 11124'921472
2018-06-08 08:35:01.727261 7f4c585e3700 -1
bluestore(/var/lib/ceph/osd/ceph-16) _txc_add_transaction error (2) No such
file or directory not handled on operation 30 (op 0, counting from 0)
2018-06-08 08:35:01.727273 7f4c585e3700 -1
bluestore(/var/lib/ceph/osd/ceph-16) ENOENT on clone suggests osd bug

2018-06-08 08:35:01.730584 7f4c585e3700 -1
/home/builder/source/ceph-12.2.2/src/os/bluestore/BlueStore.cc: In function
'void BlueStore::_txc_add_transaction(BlueStore::TransContext*,
ObjectStore::Transaction*)' thread 7f4c585e3700 time 2018-06-08
08:35:01.727379
/home/builder/source/ceph-12.2.2/src/os/bluestore/BlueStore.cc: 9363:
FAILED assert(0 == "unexpected error")

 ceph version 12.2.2 (215dd7151453fae88e6f968c975b6ce309d42dcf) luminous
(stable)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char
const*)+0x102) [0x558e08ba4202]
 2: (BlueStore::_txc_add_transaction(BlueStore::TransContext*,
ObjectStore::Transaction*)+0x15fa) [0x558e08a55c3a]
 3: (BlueStore::queue_transactions(ObjectStore::Sequencer*,
std::vector >&,
boost::intrusive_ptr, ThreadPool::TPHandle*)+0x546)
[0x558e08a572a6]
 4: (ObjectStore::queue_transaction(ObjectStore::Sequencer*,
ObjectStore::Transaction&&, Context*, Context*, Context*,
boost::intrusive_ptr, ThreadPool::TPHandle*)+0x14f)
[0x558e085fa37f]
 5: (OSD::dispatch_context_transaction(PG::RecoveryCtx&, PG*,
ThreadPool::TPHandle*)+0x6c) [0x558e0857db5c]
 6: (OSD::process_peering_events(std::__cxx11::list > const&, ThreadPool::TPHandle&)+0x442) [0x558e085abec2]
 7: (ThreadPool::BatchWorkQueue::_void_process(void*,
ThreadPool::TPHandle&)+0x2c) [0x558e0861a91c]
 8: (ThreadPool::worker(ThreadPool::WorkThread*)+0xeb8) [0x558e08bab3a8]
 9: (ThreadPool::WorkThread::entry()+0x10) [0x558e08bac540]
 10: (()+0x7494) [0x7f4c709ca494]
 11: (clone()+0x3f) [0x7f4c6fa51aff]
 NOTE: a copy of the executable, or `objdump -rdS ` is needed
to interpret this.

Any help is highly appreciated.

Kind regards,
Caspar Smit


2018-06-08 7:57 GMT+02:00 Caspar Smit :

> [quoted earlier status report and thread snipped; they appear in full further
> down the thread]

Re: [ceph-users] Ceph health error (was: Prioritize recovery over backfilling)

2018-06-07 Thread Caspar Smit
Well, I let it run with the nodown flag set and it looked like it would
finish, BUT it all went wrong somewhere:

This is now the state:

health: HEALTH_ERR
nodown flag(s) set
5602396/94833780 objects misplaced (5.908%)
Reduced data availability: 143 pgs inactive, 142 pgs peering, 7
pgs stale
Degraded data redundancy: 248859/94833780 objects degraded
(0.262%), 194 pgs unclean, 21 pgs degraded, 12 pgs undersized
11 stuck requests are blocked > 4096 sec

pgs: 13.965% pgs not active
 248859/94833780 objects degraded (0.262%)
 5602396/94833780 objects misplaced (5.908%)
 830 active+clean
 75  remapped+peering
 66  peering
 26  active+remapped+backfill_wait
 6   active+undersized+degraded+remapped+backfill_wait
 6   active+recovery_wait+degraded+remapped
 3   active+undersized+degraded+remapped+backfilling
 3   stale+active+undersized+degraded+remapped+backfill_wait
 3   stale+active+remapped+backfill_wait
 2   active+recovery_wait+degraded
 2   active+remapped+backfilling
 1   activating+degraded+remapped
 1   stale+remapped+peering


#ceph health detail shows:

REQUEST_STUCK 11 stuck requests are blocked > 4096 sec
11 ops are blocked > 16777.2 sec
osds 4,7,23,24 have stuck requests > 16777.2 sec


So what happened, and what should I do now?
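
For reference, a sketch of how those stuck requests can be inspected
directly on the OSDs named above (assumes access to the admin sockets on
the OSD hosts; osd.4 is just one of the OSDs from the health output):

  ceph daemon osd.4 dump_blocked_ops
  ceph daemon osd.4 dump_ops_in_flight

The dumped ops show the object, the op type and the event they are
waiting on, which usually points back at the peering/down PGs.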

Thank you very much for any help

Kind regards,
Caspar


2018-06-07 13:33 GMT+02:00 Sage Weil :

> On Wed, 6 Jun 2018, Caspar Smit wrote:
> > Hi all,
> >
> > We have a Luminous 12.2.2 cluster with 3 nodes and I recently added a
> > node to it.
> >
> > osd-max-backfills is at the default 1 so backfilling didn't go very fast
> > but that doesn't matter.
> >
> > Once it started backfilling everything looked ok:
> >
> > ~300 pgs in backfill_wait
> > ~10 pgs backfilling (~number of new osd's)
> >
> > But I noticed the degraded objects increasing a lot. I presume a PG that
> > is in backfill_wait state doesn't accept any new writes anymore? Hence
> > the increasing degraded object count?
> >
> > So far so good, but once in a while I noticed a random OSD flapping (they
> > come back up automatically). This isn't because the disk is saturated but
> > because of a driver/controller/kernel incompatibility which 'hangs' the
> > disk for a short time (scsi abort_task error in syslog). Investigating
> > further I noticed this was already the case before the node expansion.
> >
> > These flapping OSDs result in lots of PG states which are a bit worrying:
> >
> >  109 active+remapped+backfill_wait
> >  80  active+undersized+degraded+remapped+backfill_wait
> >  51  active+recovery_wait+degraded+remapped
> >  41  active+recovery_wait+degraded
> >  27  active+recovery_wait+undersized+degraded+remapped
> >  14  active+undersized+remapped+backfill_wait
> >  4   active+undersized+degraded+remapped+backfilling
> >
> > I think the recovery_wait is more important than the backfill_wait, so I
> > would like to prioritize these, because the recovery_wait was triggered by
> > the flapping OSDs
>
> Just a note: this is fixed in mimic.  Previously, we would choose the
> highest-priority PG to start recovery on at the time, but once recovery
> had started, the appearance of a new PG with a higher priority (e.g.,
> because it finished peering after the others) wouldn't preempt/cancel the
> other PG's recovery, so you would get behavior like the above.
>
> Mimic implements that preemption, so you should not see behavior like
> this.  (If you do, then the function that assigns a priority score to a
> PG needs to be tweaked.)
>
> sage
>
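
On the original question in the subject (prioritizing recovery over
backfilling), the Luminous-era knobs are roughly the following; this is a
sketch, not a fix for the crash discussed above, <pgid> is a placeholder,
and the force-* commands are only present in recent Luminous builds:

  ceph pg force-recovery <pgid>
  ceph pg force-backfill <pgid>
  ceph tell 'osd.*' injectargs '--osd-recovery-max-active 3 --osd-max-backfills 1'

force-recovery/force-backfill bump the priority of the named PGs, while
the injectargs line adjusts how many recovery and backfill operations each
OSD runs in parallel.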
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com