[ceph-users] Ceph benchmark tool (cbt)

2020-12-10 Thread Seena Fallah
Hi all,

I want to benchmark my production cluster with cbt. I read a bit of the
code and saw some things that surprised me: for example, it creates
ceph-osd daemons itself
(https://github.com/ceph/cbt/blob/master/cluster/ceph.py#L373) and can
also shut down the whole cluster
(https://github.com/ceph/cbt/blob/master/cluster/ceph.py#L212).

Is there a configuration that avoids these destructive actions and only
does things that can be reverted, for example tuning read_ahead_kb or
temporarily stopping a few OSDs, without taking the whole cluster down?
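
From my reading of cluster/ceph.py there seems to be a use_existing flag in
the cluster section that skips the daemon creation/shutdown path. Below is a
minimal sketch of the kind of job file I have in mind; the host names,
benchmark values and even the exact field names are only my recollection of
the example configs in the cbt repo, so please correct me if this is wrong:

# Sketch of a cbt job file that (as far as I can tell) reuses the running
# cluster instead of creating/shutting down daemons. Hosts and benchmark
# values below are placeholders, not real settings.
cat > existing-cluster.yaml <<'EOF'
cluster:
  user: ceph
  head: "mon1"
  clients: ["client1"]
  osds: ["osd1", "osd2"]
  mons: ["mon1"]
  use_existing: true        # my understanding: skip OSD creation/shutdown
  conf_file: /etc/ceph/ceph.conf
  iterations: 1
benchmarks:
  radosbench:
    op_size: [4194304]
    write_only: false
    time: 60
    concurrent_ops: [16]
EOF
./cbt.py --archive=/tmp/cbt-results existing-cluster.yaml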

Thanks.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Incomplete PG due to primary OSD crashing during EC backfill - get_hash_info: Mismatch of total_chunk_size 0

2020-12-10 Thread Byrne, Thomas (STFC,RAL,SC)
A few more things of note after more poking with the help of Dan vdS.

1) The object that the backfill is crashing on has an mtime of a few minutes 
before the original primary died this morning, and a 'rados get' gives an 
input/output error. So it looks like a new object that was possibly corrupted 
by the dying primary OSD. I can't see any disk I/O error in any of the PG's OSD 
logs when trying the 'get', but I do see this error in most of the OSDs' logs:

2020-12-10 23:22:31.840 7fc7161e3700  0 osd.4134 pg_epoch: 1162547 pg[11.214s8( 
v 1162547'714924 (1162114'711864,1162547'714924] local-lis/les=1162304/1162305 
n=133402 ec=1069520/992 lis/c 1162304/1125301 les/c/f 1162305/1125302/257760 
1162303/1162304/1162301) 
[2147483647,1708,2099,1346,4309,777,5098,4501,4134,217,4643]p1708(1) r=8 
lpr=1162304 pi=[1125301,1162304)/2 luod=0'0 crt=1162547'714924 active mbc={}] 
get_hash_info: Mismatch of total_chunk_size 0
2020-12-10 23:22:31.840 7fc7161e3700 -1 log_channel(cluster) log [ERR] : 
Corruption detected: object 
11:28447b4a:::962de230-ed6c-44f2-ab02-788c52ea6a82.3210530112.122__multipart_201%2fin5%2fexp_4-05-737%2fprocessed%2fspe%2fsqw_187570.nxspe.2~bgZPo_rC64ZXJWKyTfdn4dIApqLNDPp.22:head
 is missing hash_info

This error was present in the logs of OSDs 1708, 2099, 1346, 4309, 777, 5098, 
4501 and 4134, and absent from 217 and 4643 (possibly because they are unused 
parity shards?). Checking on one of the FileStore OSDs that returned the error 
message, the underlying file is present and is the correct size, at least.

I'm now checking all objects in the PG for corruption (the per-object read 
check is roughly the loop sketched after point 2 below). I'm only 25% through 
the 133385 objects in the PG, but that object is the only corrupted one I've 
seen so far, so hopefully it is an isolated corruption. If so, I can possibly 
try deleting the problematic object and seeing if the backfill can continue.

2) This PG is a mix of FileStore and BlueStore OSDs, all 14.2.9. The original 
primary that died (1466) was FileStore. BlueStore: 1708, 4309, 5098, 4501, 
4134, 4643; FileStore: 2099, 1346, 777, 217.
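
The read-check loop referred to in point 1 is roughly the following. It is 
only a sketch: the pool name is my guess at the RGW EC data pool behind this 
PG, and 'rados ls' with a --pgid filter needs a reasonably recent client:

# Attempt a full read of every object in PG 11.214 and report failures.
# Pool name is an assumption -- substitute the real EC data pool.
pool=default.rgw.buckets.data
rados --pgid 11.214 ls | while read -r obj; do
    rados -p "$pool" get "$obj" /dev/null 2>/dev/null \
        || echo "read failed: $obj"
done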

PG query for reference: https://pastebin.com/ZUUH2mQ6

Cheers,
Tom

> -Original Message-
> From: Byrne, Thomas (STFC,RAL,SC) 
> Sent: 10 December 2020 18:40
> To: 'ceph-users' 
> Subject: [ceph-users] Incomplete PG due to primary OSD crashing during EC
> backfill - get_hash_info: Mismatch of total_chunk_size 0
> 
> Hi all,
> 
> Got an odd issue that I'm not sure how to solve on our Nautilus 14.2.9 EC
> cluster.
> 
> The primary OSD of an EC 8+3 PG died this morning with a very sad disk
> (thousands of pending sectors). After the down out interval a new 'up' primary
> was assigned and the backfill started. Twenty minutes later the acting
> primary (not the new 'up' primary) started crashing with a "get_hash_info:
> Mismatch of total_chunk_size 0" error (see log below)
> 
> This crash always happens at the same object, with different acting primaries,
> and with a different new 'up' primary. I can't see anything in the logs that
> points to a particular OSD being the issue, so I suspect there is a corrupted
> object in the PG that is causing issues, but I'm not sure how to dig into this
> further. The PG is currently active (but degraded), but only whilst 
> nobackfill or
> noout are set (+turning the new OSD off), and if the flags are unset the
> backfill will eventually crash enough OSDs to render the PG incomplete, which
> is not ideal. I would appreciate being able to resolve this so I can go back 
> to
> letting Ceph deal with down OSDs itself :)
> 
> Does anyone have some pointers on how to dig into or resolve this? Happy to
> create a tracker ticket and post more logs if this looks like a bug.
> 
> Thanks,
> Tom
> 
> OSD log with debug_osd=20 (preamble cut from subsequent lines in an
> attempt to improve readability...):
> 
> 2020-12-10 15:14:16.130 7fc0a1575700 10 osd.1708 pg_epoch: 1162259
> pg[11.214s1( v 1162255'714638 (1162110'711564,1162255'714638] local-
> lis/les=1162253/1162254 n=133385 ec=1069520/992 lis/c 1162253/1125301
> les/c/f 1162254/1125302/257760 1162252/1162253/1162253)
> [2449,1708,2099,1346,4309,777,5098,4501,4134,217,4643]/[2147483647,170
> 8,2099,1346,4309,777,5098,4501,4134,217,4643]p1708(1) backfill=[2449(0)]
> r=1 lpr=1162253 pi=[1125301,1162253)/3 rops=1 crt=1162255'714638 lcod
> 1162254'714637 mlcod 1162254'714637
> active+undersized+degraded+remapped+backfilling mbc={}]
> run_recovery_op: starting RecoveryOp(hoid=11:28447b4a:::962de230-ed6c-
> 44f2-ab02-788c52ea6a82.3210530112.122__multipart_201%2fin5%2fexp_4-
> 05-
> 737%2fprocessed%2fspe%2fsqw_187570.nxspe.2~bgZPo_rC64ZXJWKyTfdn4dI
> ApqLNDPp.22:head v=1162125'713150 missing_on=2449(0)
> missing_on_shards=0
> recovery_info=ObjectRecoveryInfo(11:28447b4a:::962de230-ed6c-44f2-ab02-
> 788c52ea6a82.3210530112.122__multipart_201%2fin5%2fexp_4-05-
> 737%2fprocessed%2fspe%
> 
> 2fsqw_187570.nxspe.2~bgZPo_rC64ZXJWKyTfdn4dIApqLNDPp.22:head@1162
> 125'713150, size: 4194304, copy_subset: [], clone_subset: {}, snapset: 

[ceph-users] Re: mgr's stop responding, dropping out of cluster with _check_auth_rotating

2020-12-10 Thread David Orman
Hi Janek,

We realize this; we referenced that issue in our initial email. We do want
the metrics exposed by Ceph internally and would prefer to work towards a
fix upstream. We appreciate the suggestion for a workaround, however!

Again, we're happy to provide whatever information we can that would be of
assistance. If there's some debug setting that is preferred, we are happy
to implement it, as this is currently a test cluster for us to work through
issues such as this one.
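
For example, these are the sorts of settings we would raise by default. This
is only a sketch, and the choice of subsystems (monc/mgr/auth) is our guess
at what is relevant to the _check_auth_rotating messages:

# Raise mgr-side debug logging at runtime; subsystem choice is a guess.
ceph config set mgr debug_monc 20/20
ceph config set mgr debug_mgr 10/10
ceph config set mgr debug_auth 10/10
# Revert to defaults once a failure has been captured:
ceph config rm mgr debug_monc
ceph config rm mgr debug_mgr
ceph config rm mgr debug_auth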

David

On Thu, Dec 10, 2020 at 12:02 PM Janek Bevendorff <
janek.bevendo...@uni-weimar.de> wrote:

> Do you have the prometheus module enabled? Turn that off, it's causing
> issues. I replaced it with another ceph exporter from Github and almost
> forgot about it.
>
> Here's the relevant issue report:
> https://tracker.ceph.com/issues/39264#change-179946
>
> On 10/12/2020 16:43, Welby McRoberts wrote:
> > Hi Folks
> >
> > We've noticed that in a cluster of 21 nodes (5 mgrs & 504 OSDs with
> 24
> > per node) that the mgr's are, after a non specific period of time,
> dropping
> > out of the cluster. The logs only show the following:
> >
> > debug 2020-12-10T02:02:50.409+ 7f1005840700  0 log_channel(cluster)
> log
> > [DBG] : pgmap v14163: 4129 pgs: 4129 active+clean; 10 GiB data, 31 TiB
> > used, 6.3 PiB / 6.3 PiB avail
> > debug 2020-12-10T03:20:59.223+ 7f10624eb700 -1 monclient:
> > _check_auth_rotating possible clock skew, rotating keys expired way too
> > early (before 2020-12-10T02:20:59.226159+)
> > debug 2020-12-10T03:21:00.223+ 7f10624eb700 -1 monclient:
> > _check_auth_rotating possible clock skew, rotating keys expired way too
> > early (before 2020-12-10T02:21:00.226310+)
> >
> > The _check_auth_rotating repeats approximately every second. The
> instances
> > are all syncing their time with NTP and have no issues on that front. A
> > restart of the mgr fixes the issue.
> >
> > It appears that this may be related to
> https://tracker.ceph.com/issues/39264.
> > The suggestion seems to be to disable prometheus metrics, however, this
> > obviously isn't realistic for a production environment where metrics are
> > critical for operations.
> >
> > Please let us know what additional information we can provide to assist
> in
> > resolving this critical issue.
> >
> > Cheers
> > Welby
> > ___
> > ceph-users mailing list -- ceph-users@ceph.io
> > To unsubscribe send an email to ceph-users-le...@ceph.io
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] removing index for non-existent buckets

2020-12-10 Thread Christopher Durham
Hi,
I am using 15.2.7 on CentOS 8.1. I have a number of old buckets that are listed 
with:

# radosgw-admin metadata list bucket.instance

but are not listed with:

# radosgw-admin bucket list

Let's say that one of them is 'old-bucket' and its instance is 
'c100feda-5e16-48a4-b908-7be61aa877ef.123.1'. The bucket/instance is empty:

# radosgw-admin bucket list --bucket old-bucket --bucket-id c100feda-5e16-48a4-b908-7be61aa877ef.123.1
[]

Again, if I list the index pool I see the .dir objects:

# rados -p  ls | grep c100feda-5e16-48a4-b908-7be61aa877ef
.dir.c100feda-5e16-48a4-b908-7be61aa877ef.123.1.N   (an N for each shard)

I THINK I can delete these old indices with the following two commands:

# radosgw-admin bi purge --bucket old-bucket --bucket-id c100feda-5e16-48a4-b908-7be61aa877ef.123.1
# radosgw-admin metadata rm bucket.instance:old-bucket:c100feda-5e16-48a4-b908-7be61aa877ef.123.1

Am I correct? What if I have another zone in a multisite configuration? Will 
this suffice to remove the indices from both sites? I thought I'd better check 
here before doing this and causing myself grief. Thanks!
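
For reference, I also see the 'reshard stale-instances' subcommands in this 
release, which look like they are aimed at exactly this leftover-instance 
case, though the documentation appears to warn against running the rm form in 
a multisite configuration (which is part of why I'm asking):

# radosgw-admin reshard stale-instances list   (read-only: lists stale instances)
# radosgw-admin reshard stale-instances rm     (removes them; reportedly not for multisite)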
-Chris





___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Incomplete PG due to primary OSD crashing during EC backfill - get_hash_info: Mismatch of total_chunk_size 0

2020-12-10 Thread Byrne, Thomas (STFC,RAL,SC)
Hi all,

Got an odd issue that I'm not sure how to solve on our Nautilus 14.2.9 EC 
cluster.

The primary OSD of an EC 8+3 PG died this morning with a very sad disk 
(thousands of pending sectors). After the down out interval a new 'up' primary 
was assigned and the backfill started. Twenty minutes later the acting primary 
(not the new 'up' primary) started crashing with a "get_hash_info: Mismatch of 
total_chunk_size 0" error (see log below).

This crash always happens at the same object, with different acting primaries 
and with a different new 'up' primary. I can't see anything in the logs that 
points to a particular OSD being the issue, so I suspect there is a corrupted 
object in the PG that is causing issues, but I'm not sure how to dig into this 
further. The PG is currently active (but degraded), but only whilst nobackfill 
or noout are set (and the new OSD is kept off); if the flags are unset, the 
backfill will eventually crash enough OSDs to render the PG incomplete, which 
is not ideal. I would appreciate being able to resolve this so I can go back to 
letting Ceph deal with down OSDs itself :)
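
One thing I'm thinking of trying, to dig in further, is dumping the EC 
hash-info attribute for the suspect object directly from one of the shards 
with ceph-objectstore-tool. This is only a sketch: it assumes the attribute 
is stored under the key 'hinfo_key', it needs the OSD stopped first, and the 
object name is abbreviated here (the full name is in the log below):

# On the host holding osd.1708, with the OSD stopped:
systemctl stop ceph-osd@1708
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-1708 \
    --pgid 11.214s1 '<suspect object name>' get-attr hinfo_key | hexdump -C
systemctl start ceph-osd@1708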

Does anyone have some pointers on how to dig into or resolve this? Happy to 
create a tracker ticket and post more logs if this looks like a bug.

Thanks,
Tom

OSD log with debug_osd=20 (preamble cut from subsequent lines in an attempt to 
improve readability...):

2020-12-10 15:14:16.130 7fc0a1575700 10 osd.1708 pg_epoch: 1162259 pg[11.214s1( 
v 1162255'714638 (1162110'711564,1162255'714638] local-lis/les=1162253/1162254 
n=133385 ec=1069520/992 lis/c 1162253/1125301 les/c/f 1162254/1125302/257760 
1162252/1162253/1162253) 
[2449,1708,2099,1346,4309,777,5098,4501,4134,217,4643]/[2147483647,1708,2099,1346,4309,777,5098,4501,4134,217,4643]p1708(1)
 backfill=[2449(0)] r=1 lpr=1162253 pi=[1125301,1162253)/3 rops=1 
crt=1162255'714638 lcod 1162254'714637 mlcod 1162254'714637 
active+undersized+degraded+remapped+backfilling mbc={}] run_recovery_op: 
starting 
RecoveryOp(hoid=11:28447b4a:::962de230-ed6c-44f2-ab02-788c52ea6a82.3210530112.122__multipart_201%2fin5%2fexp_4-05-737%2fprocessed%2fspe%2fsqw_187570.nxspe.2~bgZPo_rC64ZXJWKyTfdn4dIApqLNDPp.22:head
 v=1162125'713150 missing_on=2449(0) missing_on_shards=0 
recovery_info=ObjectRecoveryInfo(11:28447b4a:::962de230-ed6c-44f2-ab02-788c52ea6a82.3210530112.122__multipart_201%2fin5%2fexp_4-05-737%2fprocessed%2fspe%
 2fsqw_187570.nxspe.2~bgZPo_rC64ZXJWKyTfdn4dIApqLNDPp.22:head@1162125'713150, 
size: 4194304, copy_subset: [], clone_subset: {}, snapset: 0=[]:{}) 
recovery_progress=ObjectRecoveryProgress(first, data_recovered_to:0, 
data_complete:false, omap_recovered_to:, omap_complete:true, error:false) obc 
refcount=3 state=IDLE waiting_on_pushes= extent_requested=0,0)
continue_recovery_op: continuing 
RecoveryOp(hoid=11:28447b4a:::962de230-ed6c-44f2-ab02-788c52ea6a82.3210530112.122__multipart_201%2fin5%2fexp_4-05-737%2fprocessed%2fspe%2fsqw_187570.nxspe.2~bgZPo_rC64ZXJWKyTfdn4dIApqLNDPp.22:head
 v=1162125'713150 missing_on=2449(0) missing_on_shards=0 
recovery_info=ObjectRecoveryInfo(11:28447b4a:::962de230-ed6c-44f2-ab02-788c52ea6a82.3210530112.122__multipart_201%2fin5%2fexp_4-05-737%2fprocessed%2fspe%2fsqw_187570.nxspe.2~bgZPo_rC64ZXJWKyTfdn4dIApqLNDPp.22:head@1162125'713150,
 size: 4194304, copy_subset: [], clone_subset: {}, snapset: 0=[]:{}) 
recovery_progress=ObjectRecoveryProgress(first, data_recovered_to:0, 
data_complete:false, omap_recovered_to:, omap_complete:true, error:false) obc 
refcount=4 state=IDLE waiting_on_pushes= extent_requested=0,0)
get_hash_info: Getting attr on 
11:28447b4a:::962de230-ed6c-44f2-ab02-788c52ea6a82.3210530112.122__multipart_201%2fin5%2fexp_4-05-737%2fprocessed%2fspe%2fsqw_187570.nxspe.2~bgZPo_rC64ZXJWKyTfdn4dIApqLNDPp.22:head
get_hash_info: not in cache 
11:28447b4a:::962de230-ed6c-44f2-ab02-788c52ea6a82.3210530112.122__multipart_201%2fin5%2fexp_4-05-737%2fprocessed%2fspe%2fsqw_187570.nxspe.2~bgZPo_rC64ZXJWKyTfdn4dIApqLNDPp.22:head
get_hash_info: found on disk, size 524288
get_hash_info: Mismatch of total_chunk_size 0
2020-12-10 15:14:16.136 7fc0a1575700 -1 
/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/gigantic/release/14.2.9/rpm/el7/BUILD/ceph-14.2.9/src/osd/ECBackend.cc:
 In function 'void ECBackend::continue_recovery_op(ECBackend::RecoveryOp&, 
RecoveryMessages*)' thread 7fc0a1575700 time 2020-12-10 15:14:16.132060
/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/gigantic/release/14.2.9/rpm/el7/BUILD/ceph-14.2.9/src/osd/ECBackend.cc:
 585: FAILED ceph_assert(op.hinfo)

ceph version 14.2.9 (581f22da52345dba46ee232b73b990f06029a2a0) nautilus (stable)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x14a) 
[0x55e1569acf7d]
2: (()+0x4cb145) [0x55e1569ad145]
3: (ECBackend::continue_recovery_op(ECBackend::RecoveryOp&, 

[ceph-users] Re: mgr's stop responding, dropping out of cluster with _check_auth_rotating

2020-12-10 Thread Janek Bevendorff
FYI, this is the ceph-exporter we're using at the moment: 
https://github.com/digitalocean/ceph_exporter


It's not as good, but it mostly does the job. Some of the more specific 
metrics are missing, but the majority are there.
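
In case it helps, the swap was essentially the following. This is only a 
sketch, and the docker invocation is from the exporter's README as I remember 
it, so check the repo for the current flags and ports:

# Turn off the built-in mgr prometheus module:
ceph mgr module disable prometheus
# Run the standalone exporter instead, giving it read access to ceph.conf
# and the client keyring:
docker run -d --name ceph_exporter --net=host \
    -v /etc/ceph:/etc/ceph:ro \
    digitalocean/ceph_exporter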



On 10/12/2020 19:01, Janek Bevendorff wrote:
Do you have the prometheus module enabled? Turn that off, it's causing 
issues. I replaced it with another ceph exporter from Github and 
almost forgot about it.


Here's the relevant issue report: 
https://tracker.ceph.com/issues/39264#change-179946


On 10/12/2020 16:43, Welby McRoberts wrote:

Hi Folks

We've noticed that in a cluster of 21 nodes (5 mgrs & 504 OSDs with 24
per node) that the mgr's are, after a non specific period of time, dropping
out of the cluster. The logs only show the following:

debug 2020-12-10T02:02:50.409+ 7f1005840700  0 log_channel(cluster) log
[DBG] : pgmap v14163: 4129 pgs: 4129 active+clean; 10 GiB data, 31 TiB
used, 6.3 PiB / 6.3 PiB avail
debug 2020-12-10T03:20:59.223+ 7f10624eb700 -1 monclient:
_check_auth_rotating possible clock skew, rotating keys expired way too
early (before 2020-12-10T02:20:59.226159+)
debug 2020-12-10T03:21:00.223+ 7f10624eb700 -1 monclient:
_check_auth_rotating possible clock skew, rotating keys expired way too
early (before 2020-12-10T02:21:00.226310+)

The _check_auth_rotating repeats approximately every second. The instances
are all syncing their time with NTP and have no issues on that front. A
restart of the mgr fixes the issue.

It appears that this may be related to https://tracker.ceph.com/issues/39264.
The suggestion seems to be to disable prometheus metrics, however, this
obviously isn't realistic for a production environment where metrics are
critical for operations.

Please let us know what additional information we can provide to assist in
resolving this critical issue.

Cheers
Welby
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: mgr's stop responding, dropping out of cluster with _check_auth_rotating

2020-12-10 Thread Janek Bevendorff
Do you have the prometheus module enabled? Turn that off, it's causing 
issues. I replaced it with another ceph exporter from Github and almost 
forgot about it.


Here's the relevant issue report: 
https://tracker.ceph.com/issues/39264#change-179946


On 10/12/2020 16:43, Welby McRoberts wrote:

Hi Folks

We've noticed that in a cluster of 21 nodes (5 mgrs & 504 OSDs with 24
per node) that the mgr's are, after a non specific period of time, dropping
out of the cluster. The logs only show the following:

debug 2020-12-10T02:02:50.409+ 7f1005840700  0 log_channel(cluster) log
[DBG] : pgmap v14163: 4129 pgs: 4129 active+clean; 10 GiB data, 31 TiB
used, 6.3 PiB / 6.3 PiB avail
debug 2020-12-10T03:20:59.223+ 7f10624eb700 -1 monclient:
_check_auth_rotating possible clock skew, rotating keys expired way too
early (before 2020-12-10T02:20:59.226159+)
debug 2020-12-10T03:21:00.223+ 7f10624eb700 -1 monclient:
_check_auth_rotating possible clock skew, rotating keys expired way too
early (before 2020-12-10T02:21:00.226310+)

The _check_auth_rotating repeats approximately every second. The instances
are all syncing their time with NTP and have no issues on that front. A
restart of the mgr fixes the issue.

It appears that this may be related to https://tracker.ceph.com/issues/39264.
The suggestion seems to be to disable prometheus metrics, however, this
obviously isn't realistic for a production environment where metrics are
critical for operations.

Please let us know what additional information we can provide to assist in
resolving this critical issue.

Cheers
Welby
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] mgr's stop responding, dropping out of cluster with _check_auth_rotating

2020-12-10 Thread Welby McRoberts
Hi Folks

We've noticed in a cluster of 21 nodes (5 mgrs & 504 OSDs with 24 per node)
that the mgrs are, after a non-specific period of time, dropping out of the
cluster. The logs only show the following:

debug 2020-12-10T02:02:50.409+ 7f1005840700  0 log_channel(cluster) log
[DBG] : pgmap v14163: 4129 pgs: 4129 active+clean; 10 GiB data, 31 TiB
used, 6.3 PiB / 6.3 PiB avail
debug 2020-12-10T03:20:59.223+ 7f10624eb700 -1 monclient:
_check_auth_rotating possible clock skew, rotating keys expired way too
early (before 2020-12-10T02:20:59.226159+)
debug 2020-12-10T03:21:00.223+ 7f10624eb700 -1 monclient:
_check_auth_rotating possible clock skew, rotating keys expired way too
early (before 2020-12-10T02:21:00.226310+)

The _check_auth_rotating message repeats approximately every second. The
instances are all syncing their time with NTP and have no issues on that
front. A restart of the mgr fixes the issue.
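
Concretely, the "restart" is just failing over the active mgr so that a
standby takes over; a rough sketch of what we run (it assumes jq is
available):

# Fail over the currently active mgr; a standby then takes over.
ceph mgr fail "$(ceph mgr dump | jq -r .active_name)"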

It appears that this may be related to https://tracker.ceph.com/issues/39264.
The suggestion seems to be to disable prometheus metrics, however, this
obviously isn't realistic for a production environment where metrics are
critical for operations.

Please let us know what additional information we can provide to assist in
resolving this critical issue.

Cheers
Welby
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io