[ceph-users] Re: High CPU usage by ceph-mgr in 14.2.5

2019-12-19 Thread Mark Nelson
Hi Paul, Thanks for gathering this!  It looks to me like at the very least we should redo the fixed_u_to_string and fixed_to_string functions in common/Formatter.cc.  That alone looks like it's having a pretty significant impact. Mark On 12/19/19 2:09 PM, Paul Mezzanini wrote: Based on

[ceph-users] Re: can run more than one rgw multisite realm on one ceph cluster

2019-12-19 Thread Casey Bodley
On 12/19/19 5:44 AM, tda...@hotmail.com wrote: Hello, I managed to do that 3 months ago with 2 realms as I wanted to connect 2 different openstack environments (object store) and use different zones on the same ceph cluster. Now unfortunately I am not able to recreate the scenario :( as the p

[ceph-users] Re: High CPU usage by ceph-mgr in 14.2.5

2019-12-19 Thread Neha Ojha
We'd like to verify whether the network ping time monitoring feature in 14.2.5 is contributing to this problem. It'd be great if someone could try https://tracker.ceph.com/issues/43364#note-3 and let us know. Thanks, Neha On Thu, Dec 19, 2019 at 8:48 AM Mark Nelson wrote: > > If you can get a wallclock

[ceph-users] RGW bucket stats extremely slow to respond

2019-12-19 Thread David Monschein
Hi all! Reaching out again about this issue since I haven't had much luck. We've been seeing some strange behavior with our object storage cluster. While bucket stats (radosgw-admin bucket stats) normally return in a matter of seconds, we frequently observe it taking almost ten minutes, which is n
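A minimal way to reproduce the timing being described, assuming any existing bucket (the name below is a placeholder):

    # list buckets, then time the stats call for one of them
    radosgw-admin bucket list
    time radosgw-admin bucket stats --bucket=my-bucket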

[ceph-users] Re: High CPU usage by ceph-mgr in 14.2.5

2019-12-19 Thread Mark Nelson
If you can get a wallclock profiler on the mgr process we might be able to figure out specifics of what's taking so much time (i.e. processing pg_summary or something else).  Assuming you have gdb with the python bindings and the ceph debug packages installed, if you (or anyone) could try gdbpmp
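A rough sketch of such a profiling run, assuming the gdbpmp script from https://github.com/markhpc/gdbpmp plus gdb with python bindings and the ceph debug packages; the sample count and output name are arbitrary, and the exact flags should be checked against gdbpmp's own help:

    # attach the wallclock profiler to the active mgr and collect samples
    ./gdbpmp.py -p $(pidof ceph-mgr) -n 1000 -o ceph-mgr.gdbpmp
    # print the collected call tree afterwards
    ./gdbpmp.py -i ceph-mgr.gdbpmp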

[ceph-users] Re: Changing failure domain

2019-12-19 Thread Francois Legrand
Thanks for your advice. I thus created a new replicated rule: { "rule_id": 2, "rule_name": "replicated3over2rooms", "ruleset": 2, "type": 1, "min_size": 3, "max_size": 4, "steps": [ { "op": "take", "item": -1, "it
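For reference, the usual way to add and sanity-check a custom rule like this is to edit the decompiled crush map (a sketch; the rule id and replica count simply mirror the dump above):

    ceph osd getcrushmap -o crushmap.bin
    crushtool -d crushmap.bin -o crushmap.txt
    # edit the rule in crushmap.txt, then recompile and inject it
    crushtool -c crushmap.txt -o crushmap.new
    ceph osd setcrushmap -i crushmap.new
    # verify the rule and test the resulting mappings before pointing a pool at it
    ceph osd crush rule dump replicated3over2rooms
    crushtool -i crushmap.new --test --rule 2 --num-rep 3 --show-mappings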

[ceph-users] Re: radosgw - Etags suffixed with #x0e

2019-12-19 Thread Ingo Reimann
equest method_string='GET', uri='/?prefix=XXX-ZZZ%2Fdata%2FL%2FK', headers={'x-amz-date': '20191219T133746Z', 'Authorization': 'AWS4-HMAC-SHA256 Credential=XXX/20191219/us-east-1/s3/aws4_request,SignedHeaders=host;x

[ceph-users] Re: Pool Max Avail and Ceph Dashboard Pool Useage on Nautilus giving different percentages

2019-12-19 Thread Stephan Mueller
Hi, if "MAX AVAIL" displays the wrong data, the bug is just made more visible through the dashboard, as the calculation is correct. To get the right percentage you have to divide the used space through the total, and the total can only consist of two states used and not used space, so both states

[ceph-users] Re: High CPU usage by ceph-mgr in 14.2.5

2019-12-19 Thread Paul Emmerich
We're also seeing unusually high mgr CPU usage on some setups; the only thing they have in common seems to be > 300 OSDs. Threads using the CPU are "mgr-fin" and "ms_dispatch". Paul -- Paul Emmerich Looking for help with your Ceph cluster? Contact us at https://croit.io croit GmbH Freseniusst
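A quick way to see which mgr threads are burning CPU, assuming the active mgr runs on the local host:

    # -H shows individual threads; names like mgr-fin and ms_dispatch
    # appear in the COMMAND column
    top -H -p $(pidof ceph-mgr)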

[ceph-users] Re: rbd images inaccessible for a longer period of time

2019-12-19 Thread tdados
I don't have a lot of experience with rbd-nbd but I suppose it works the same as rbd. We use xen as hypervisor and sometimes when there is a crash, we need to remove the locks on the volumes when remapping them, as these are dead locks. Now removing the locks will sometimes put a blacklist on these
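Roughly the workflow described above (pool, image, lock id and address are placeholders):

    rbd lock ls mypool/myimage
    rbd lock rm mypool/myimage "<lock-id>" <locker>
    # check whether the client address was blacklisted in the process, and clear it
    ceph osd blacklist ls
    ceph osd blacklist rm <addr>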

[ceph-users] Re: can run more than one rgw multisite realm on one ceph cluster

2019-12-19 Thread tdados
Hello, I managed to do that 3 months ago with 2 realms as I wanted to connect 2 different openstack environments (object store) and use different zones on the same ceph cluster. Now unfortunately I am not able to recreate the scenario :( as the periods are getting mixed or I am doing something w
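A sketch of keeping two realms' periods separate by scoping every command explicitly (realm, zonegroup and zone names are placeholders):

    radosgw-admin realm create --rgw-realm=realm-a --default
    radosgw-admin zonegroup create --rgw-realm=realm-a --rgw-zonegroup=zg-a --master
    radosgw-admin zone create --rgw-realm=realm-a --rgw-zonegroup=zg-a --rgw-zone=zone-a --master
    radosgw-admin period update --commit --rgw-realm=realm-a
    # second realm: same steps, always passing --rgw-realm so the two
    # realms' periods never get mixed up
    radosgw-admin realm create --rgw-realm=realm-b
    radosgw-admin period update --commit --rgw-realm=realm-b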

[ceph-users] Strange behavior for crush buckets of erasure-profile

2019-12-19 Thread tdados
So I wanted to report some strange crush rule / EC profile behaviour regarding radosgw items, which I am not sure is a bug or whether it's supposed to work that way. I am trying to implement the below scenario in my home lab: By default there is a "default" erasure-code-profile with the below settings:
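A sketch of how such a profile is usually inspected and replaced (the k/m values, failure domain and device class below are only examples):

    ceph osd erasure-code-profile get default
    ceph osd erasure-code-profile set myprofile k=2 m=1 crush-failure-domain=host crush-device-class=hdd
    ceph osd pool create mypool 64 64 erasure myprofile
    # see which crush rule the pool creation generated
    ceph osd crush rule ls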

[ceph-users] Re: rbd images inaccessible for a longer period of time

2019-12-19 Thread yveskretzschmar
Addendum: if I try to purge snaps the following happens: rbd snap purge rbd_hdd_1.8tb_01_3t/vm-29009-disk-2 Removing all snapshots: 50% complete...failed. rbd: removing snaps failed: (2) No such file or directory Despite the output, rbd ls -l no longer shows any snapshots. After this the
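To cross-check that state, the remaining snapshots, watchers and locks on the image can be listed (a sketch using the image name from the post):

    rbd snap ls rbd_hdd_1.8tb_01_3t/vm-29009-disk-2
    rbd status rbd_hdd_1.8tb_01_3t/vm-29009-disk-2   # lists clients still watching the image
    rbd lock ls rbd_hdd_1.8tb_01_3t/vm-29009-disk-2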

[ceph-users] rbd images inaccessible for a longer period of time

2019-12-19 Thread yveskretzschmar
Hi, yesterday I had to power off some VMs (proxmox) backed by rbd images for maintenance. After the VMs were off, I tried to create a snapshot, which didn't finish even after half an hour. Because it was maintenance I rebooted all VM nodes and all ceph nodes - nothing changed. Powering on the VM w

[ceph-users] Re: High CPU usage by ceph-mgr in 14.2.5

2019-12-19 Thread Serkan Çoban
+1. 1500 OSDs, mgr is constantly at 100% after upgrading from 14.2.2 to 14.2.5. On Thu, Dec 19, 2019 at 11:06 AM Toby Darling wrote: > > On 18/12/2019 22:40, Bryan Stillwell wrote: > > That's how we noticed it too. Our graphs went silent after the upgrade > > completed. Is your large cluster over 350

[ceph-users] Re: High CPU usage by ceph-mgr in 14.2.5

2019-12-19 Thread Toby Darling
On 18/12/2019 22:40, Bryan Stillwell wrote: That's how we noticed it too.  Our graphs went silent after the upgrade completed.  Is your large cluster over 350 OSDs? A 'me too' on this - graphs have gone quiet, and mgr is using 100% CPU. This happened when we grew our 14.2.5 cluster from 328 to