[ceph-users] Re: Ceph cluster Sync

2021-10-12 Thread Tony Liu
For the PR-DR case, I am using RGW multi-site support to replicate backup images. Tony From: Manuel Holtgrewe Sent: October 12, 2021 11:40 AM To: dhils...@performair.com Cc: mico...@gmail.com; ceph-users Subject: [ceph-users] Re: Ceph cluster Sync To chime in
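A minimal sketch of the usual secondary-zone setup behind that kind of RGW multi-site replication (realm/zonegroup/zone names, endpoints and keys are placeholders, not Tony's actual configuration):

  # on the DR site: pull the realm and current period from the primary
  radosgw-admin realm pull --url=http://primary-rgw:8080 --access-key=<system-key> --secret=<system-secret>
  radosgw-admin period pull --url=http://primary-rgw:8080 --access-key=<system-key> --secret=<system-secret>
  # create the secondary zone in the existing zonegroup and commit the period
  radosgw-admin zone create --rgw-zonegroup=<zonegroup> --rgw-zone=<dr-zone> \
      --endpoints=http://dr-rgw:8080 --access-key=<system-key> --secret=<system-secret>
  radosgw-admin period update --commit
  # after restarting the RGW daemons, check replication progress
  radosgw-admin sync status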

[ceph-users] Re: Broken mon state after (attempted) 16.2.5 -> 16.2.6 upgrade

2021-10-12 Thread Patrick Donnelly
I found the problem, thanks. There is a tracker ticket: https://tracker.ceph.com/issues/52820 On Fri, Oct 8, 2021 at 8:01 AM Jonathan D. Proulx wrote: > > Hi Patrick, > > Yes we had been successfully running on Pacific v16.2.5 > > Thanks for the pointer to the bug, we eventually ended up taking

[ceph-users] Re: Ceph cluster Sync

2021-10-12 Thread Manuel Holtgrewe
To chime in here, there is https://github.com/45Drives/cephgeorep which allows CephFS replication pre-Pacific. There is a mail thread somewhere on the list where a Ceph developer warns about semantic issues of recursive mtime even on Pacific. However, according to 45Drives they have never had a
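A quick way to look at the recursive mtime/ctime data such tools key off is the CephFS directory xattrs; a small sketch (the mount point is a placeholder, and this only illustrates the attribute, not cephgeorep's internals):

  # recursive ctime of a CephFS directory tree
  getfattr -n ceph.dir.rctime /mnt/cephfs/some/dir
  # related recursive statistics
  getfattr -n ceph.dir.rentries /mnt/cephfs/some/dir
  getfattr -n ceph.dir.rbytes /mnt/cephfs/some/dir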

[ceph-users] Re: Where is my free space?

2021-10-12 Thread Gregory Farnum
On Mon, Oct 11, 2021 at 10:22 PM Szabo, Istvan (Agoda) wrote: > > Hi, > > 377TiB is the total cluster size, data pool 4:2 ec, stored 66TiB, how can > the data pool be 60% used??!! Since you have an EC pool, you presumably have a CRUSH rule demanding 6 hosts. Among your seven hosts, 2 of them
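A sketch of how to check what the pool's rule actually demands (pool, rule and profile names are placeholders):

  # which CRUSH rule and EC profile does the data pool use?
  ceph osd pool get <data-pool> crush_rule
  ceph osd pool get <data-pool> erasure_code_profile
  # failure domain and chooseleaf steps of the rule
  ceph osd crush rule dump <rule-name>
  # k, m and crush-failure-domain of the profile
  ceph osd erasure-code-profile get <profile-name>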

[ceph-users] Re: OSD Crashes in 16.2.6

2021-10-12 Thread Zakhar Kirpichenko
Indeed, this is the PVE forum post I saw earlier. /Z On Tue, Oct 12, 2021 at 9:27 PM Marco Pizzolo wrote: > Igor, > > Thanks for the response. One that I found was: > https://forum.proxmox.com/threads/pve-7-0-bug-kernel-null-pointer-dereference-address-00c0-pf-error_code-0x-no-

[ceph-users] Re: ceph full-object read crc != expected on xxx:head

2021-10-12 Thread Gregory Farnum
On Tue, Oct 12, 2021 at 12:52 AM Frank Schilder wrote: > > Is there a way (mimic latest) to find out which PG contains the object that > caused this error: > > 2021-10-11 23:46:19.631006 osd.335 osd.335 192.168.32.87:6838/8605 623 : > cluster [ERR] full-object read crc 0x6c3a7719 != expected 0x

[ceph-users] Re: OSD Crashes in 16.2.6

2021-10-12 Thread Marco Pizzolo
Igor, Thanks for the response. One that I found was: https://forum.proxmox.com/threads/pve-7-0-bug-kernel-null-pointer-dereference-address-00c0-pf-error_code-0x-no-web-access-no-ssh.96598/ In regards to your questions, this is a new cluster deployed at 16.2.6. It currently has l

[ceph-users] Re: OSD Crashes in 16.2.6

2021-10-12 Thread Igor Fedotov
Zakhar, could you please point me to the similar reports on the Proxmox forum? Curious which Ceph release is mentioned there... Thanks, Igor On 10/12/2021 8:53 PM, Zakhar Kirpichenko wrote: Hi, This could be kernel-related, as I've seen similar reports in the Proxmox forum. Specifically, 5.11.x w

[ceph-users] Re: OSD Crashes in 16.2.6

2021-10-12 Thread Igor Fedotov
FYI: the telemetry reports that triggered creation of the above-mentioned ticket indicate kernel v4.18... "utsname_release": "4.18.0-305.10.2.el8_4.x86_64" On 10/12/2021 8:53 PM, Zakhar Kirpichenko wrote: Hi, This could be kernel-related, as I've seen similar reports in the Proxmox forum. Specifically,

[ceph-users] Re: OSD Crashes in 16.2.6

2021-10-12 Thread Zakhar Kirpichenko
Can't say much about kernel 5.4 and ConnectX-6, as we have no experience with this combination. 5.4 + ConnectX-5 works well though :-) / Z On Tue, Oct 12, 2021 at 9:06 PM Marco Pizzolo wrote: > Hi Zakhar, > > Thanks for the quick response. I was coming across some of those Proxmox > forum post

[ceph-users] Re: OSD Crashes in 16.2.6

2021-10-12 Thread Igor Fedotov
Hi Marco, this reminds me of the following ticket: https://tracker.ceph.com/issues/52234 Unfortunately that's all we have so far about that issue. Could you please answer some questions: 1) Is this a new or upgraded cluster? 2) If you upgraded it - what was the previous Ceph version and did y

[ceph-users] Re: OSD Crashes in 16.2.6

2021-10-12 Thread Marco Pizzolo
Hi Zakhar, Thanks for the quick response. I was coming across some of those Proxmox forum posts as well. I'm not sure if going to the 5.4 kernel will create any other challenges for us, as we're using dual-port Mellanox ConnectX-6 200G NICs in the hosts, but it is definitely something we can try

[ceph-users] Re: OSD Crashes in 16.2.6

2021-10-12 Thread Zakhar Kirpichenko
Hi, This could be kernel-related, as I've seen similar reports in the Proxmox forum. Specifically, 5.11.x with Ceph seems to be hitting a kernel NULL pointer dereference. Perhaps a newer kernel would help. If not, I'm running 16.2.6 with kernel 5.4.x without any issues. Best regards, Z On Tue, Oct 12,

[ceph-users] OSD Crashes in 16.2.6

2021-10-12 Thread Marco Pizzolo
Hello everyone, We are seeing instability in Ubuntu 20.04.3 using the HWE kernel and Ceph 16.2.6 w/Podman. We have OSDs that fail after <24 hours and I'm not sure why. Seeing this: ceph crash info 2021-10-12T14:32:49.169552Z_d1ee94f7-1aaa-4221-abeb-68bd56d3c763 { "backtrace": [ "/lib64/libpthr
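For reference, the crash listed above comes from the manager crash module; a minimal sketch of the surrounding commands (the crash ID is the one already shown, everything else is generic):

  # list crashes not yet acknowledged, then show one in full
  ceph crash ls-new
  ceph crash info <crash-id>
  # after triage, archive so RECENT_CRASH health warnings clear
  ceph crash archive <crash-id>
  ceph crash archive-all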

[ceph-users] Re: Ceph cluster Sync

2021-10-12 Thread DHilsbos
Michel; I am neither a Ceph evangelist nor a Ceph expert, but here is my current understanding: Ceph clusters do not have built-in cross-cluster synchronization. That said, there are several things which might meet your needs. 1) If you're just planning your Ceph deployment, then the latest r
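As one concrete example of such per-protocol replication, a minimal sketch of snapshot-based RBD mirroring between two clusters (pool, image and site names are placeholders, and this is only one of the options being referred to):

  # on both clusters: enable per-image mirroring on the pool
  rbd mirror pool enable <pool> image
  # on the primary: create a bootstrap token for the peer
  rbd mirror pool peer bootstrap create --site-name primary <pool> > token
  # on the DR cluster (which runs an rbd-mirror daemon): import the token
  rbd mirror pool peer bootstrap import --site-name dr <pool> token
  # enable snapshot-based mirroring for an image
  rbd mirror image enable <pool>/<image> snapshot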

[ceph-users] Re: is it possible to remove the db+wal from an external device (nvme)

2021-10-12 Thread Szabo, Istvan (Agoda)
One more thing, what I’m doing at the moment: set noout and norebalance; on 1 host stop all OSDs; compact all the OSDs; migrate the DB one by one; start the OSDs one by one. Istvan Szabo Senior Infrastructure Engineer --- Agoda Services Co., Ltd. e: istvan.sz...@agoda.co
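Roughly, that procedure maps to commands like the following (a sketch only; OSD ids, fsid and the migrate target depend on the actual layout, and the flags here are set cluster-wide for simplicity):

  ceph osd set noout
  ceph osd set norebalance
  systemctl stop ceph-osd.target                  # stop all OSDs on the host
  ceph-kvstore-tool bluestore-kv /var/lib/ceph/osd/ceph-<id> compact   # offline compaction per OSD
  ceph-volume lvm migrate --osd-id <id> --osd-fsid <fsid> --from db wal --target <vg/lv>
  systemctl start ceph-osd@<id>                   # bring the OSDs back one by one
  ceph osd unset norebalance
  ceph osd unset noout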

[ceph-users] Re: is it possible to remove the db+wal from an external device (nvme)

2021-10-12 Thread Szabo, Istvan (Agoda)
I have 1 billion objects in the cluster and we are still growing, and we have faced spillovers all over the clusters. After 15-18 spilled-over OSDs (out of the 42-50), the OSDs started to die, flapping. Tried to compact the spilled-over ones manually, but it didn’t help; however the not-spilled-over OSDs

[ceph-users] Re: is it possible to remove the db+wal from an external device (nvme)

2021-10-12 Thread Szabo, Istvan (Agoda)
Omg, I’ve already migrated 24 OSDs in each DC (72 altogether). What should I do then? 12 are left (36 altogether). In my case the slow device is faster in random-write IOPS than the one which is serving it. Istvan Szabo Senior Infrastructure Engineer ---

[ceph-users] Ceph cluster Sync

2021-10-12 Thread Michel Niyoyita
Dear team, I want to build two different clusters: one for the primary site and the second for the DR site. I would like to ask whether these two clusters can communicate (synchronize) with each other so that data written to the PR site is synchronized to the DR site, and if we ever get trouble on the PR site the DR automa

[ceph-users] Announcing go-ceph v0.12.0

2021-10-12 Thread John Mulligan
I'm happy to announce another release of the go-ceph API library. This is a regular release following our every-two-months release cadence. https://github.com/ceph/go-ceph/releases/tag/v0.12.0 Changes include additions to the rbd, rbd admin, and rgw admin packages. More details are available a

[ceph-users] Re: is it possible to remove the db+wal from an external device (nvme)

2021-10-12 Thread Szabo, Istvan (Agoda)
Hi Igor, I’ve attached it here, thank you in advance. Istvan Szabo Senior Infrastructure Engineer --- Agoda Services Co., Ltd. e: istvan.sz...@agoda.com --- From: Igor Fedo

[ceph-users] Re: is it possible to remove the db+wal from an external device (nvme)

2021-10-12 Thread Igor Fedotov
Istvan, So things with migrations are clear at the moment, right? As I mentioned, the migrate command in 15.2.14 has a bug which causes a corrupted OSD if a db->slow migration occurs on a spilled-over OSD. To work around that you might want to migrate slow to db first, or try manual compaction. Please
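A sketch of that workaround direction, i.e. pushing spilled-over RocksDB data back from the main device onto the already-attached DB volume before anything else (ids and LV names are placeholders; the OSD must be stopped):

  ceph-volume lvm migrate --osd-id <id> --osd-fsid <fsid> --from data --target <vg/db-lv>
  # or try offline compaction instead
  ceph-kvstore-tool bluestore-kv /var/lib/ceph/osd/ceph-<id> compact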

[ceph-users] Re: Metrics for object sizes

2021-10-12 Thread Szabo, Istvan (Agoda)
Hi, Just got the chance to have a look, but I see Lua scripting is new in Pacific ☹ I have Octopus 15.2.14; will it be backported, or is there no chance? Istvan Szabo Senior Infrastructure Engineer --- Agoda Services Co., Ltd. e: istvan.sz...@agoda.c

[ceph-users] Re: Where is my free space?

2021-10-12 Thread Szabo, Istvan (Agoda)
I see. I'm using SSDs so it shouldn't be a problem I guess, because the "bluestore_min_alloc_size": "0" is overridden by "bluestore_min_alloc_size_ssd": "4096"? -Original Message- From: Stefan Kooman Sent: Tuesday, October 12, 2021 2:19 PM To: Szabo, Istvan (Agoda) ; cep
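A sketch of how to check what a given OSD sees for these options (the id is a placeholder); keep in mind the effective min_alloc_size is fixed when the OSD is created, so changing the config later only affects newly built OSDs:

  # values the configuration would hand to a newly created OSD
  ceph config get osd.<id> bluestore_min_alloc_size_ssd
  ceph config get osd.<id> bluestore_min_alloc_size_hdd
  # values the running OSD process has loaded
  ceph daemon osd.<id> config show | grep bluestore_min_alloc_size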

[ceph-users] ceph full-object read crc != expected on xxx:head

2021-10-12 Thread Frank Schilder
Is there a way (mimic latest) to find out which PG contains the object that caused this error: 2021-10-11 23:46:19.631006 osd.335 osd.335 192.168.32.87:6838/8605 623 : cluster [ERR] full-object read crc 0x6c3a7719 != expected 0xd27f7a2c on 19:28b9843f:::3b43237.:head In all refere
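A sketch of mapping an object back to its PG with the plain object name (pool and object names are placeholders; the number before the first colon in the logged key, 19 here, is the pool id):

  # translate the pool id to a pool name
  ceph osd lspools
  # map the object name to its PG and acting set
  ceph osd map <pool-name> <object-name>
  # if the PG has been flagged inconsistent by scrub, list the damaged objects
  rados list-inconsistent-obj <pgid> --format=json-pretty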

[ceph-users] Re: is it possible to remove the db+wal from an external device (nvme)

2021-10-12 Thread Igor Fedotov
You mean you ran migrate for these 72 OSDs and all of them aren't starting any more? Or did you just upgrade them to Octopus and are experiencing performance issues? In the latter case, and if you have enough space on the DB device, you might want to try to migrate data from slow to db first. Run fsck (jus
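A minimal sketch of the fsck being referred to (the OSD has to be stopped first; the path and id are placeholders):

  systemctl stop ceph-osd@<id>
  ceph-bluestore-tool fsck --path /var/lib/ceph/osd/ceph-<id>
  # a deep fsck (--deep) is slower but checks object data as well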

[ceph-users] Re: is it possible to remove the db+wal from an external device (nvme)

2021-10-12 Thread Igor Fedotov
Istvan, you're bitten by https://github.com/ceph/ceph/pull/43140 It's not fixed in 15.2.14. This has got a backport to the upcoming Octopus minor release. Please do not use the 'migrate' command from WAL/DB to the slow volume if some data is already present there... Thanks, Igor On 10/12/2021 12:13 P

[ceph-users] Re: Where is my free space?

2021-10-12 Thread Stefan Kooman
On 10/12/21 07:21, Szabo, Istvan (Agoda) wrote: Hi, 377TiB is the total cluster size, data pool 4:2 ec, stored 66TiB, how can the data pool be 60% used??!! Space amplification? It depends, among other things (like object size), on the min_alloc size you use for the OSDs. See this thread [1], an
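A rough worked example of that amplification, purely for illustration and assuming the old 64 KiB bluestore_min_alloc_size_hdd default rather than this cluster's actual setting:

  128 KiB object on EC 4+2  ->  4 data chunks of 32 KiB + 2 parity chunks
  each chunk rounded up to 64 KiB  =>  6 x 64 KiB = 384 KiB raw
  raw/stored = 384/128 = 3.0, versus the nominal 6/4 = 1.5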

[ceph-users] Re: bluefs _allocate unable to allocate

2021-10-12 Thread José H . Freidhof
Hi Igor, the reason why I tested different RocksDB options is that I was having really bad write performance with the default settings (30-60 MB/s) on the cluster... actually I have 200 MB/s read and 180 MB/s write performance now. I don't know which of the two settings are the good ones. Another que

[ceph-users] get_health_metrics reporting slow ops and gw outage

2021-10-12 Thread Szabo, Istvan (Agoda)
Hi, Many of my OSDs are having this issue, which causes 10-15 ms OSD write operation latency and more than 60 ms read operation latency. This causes RGW to wait for operations, and after a while the RGWs just restarted (all of them in my cluster) and were only available again after the slow ops disappeared. I see similar iss
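A sketch of how to see which operations are slow on an affected OSD (the id is a placeholder):

  # cluster-wide summary of OSDs reporting slow ops
  ceph health detail
  # on an affected OSD: in-flight and recent historic ops with per-stage timings
  ceph daemon osd.<id> dump_ops_in_flight
  ceph daemon osd.<id> dump_historic_ops
  # latency counters
  ceph daemon osd.<id> perf dump | grep -A 3 op_latency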

[ceph-users] Re: bluefs _allocate unable to allocate

2021-10-12 Thread José H . Freidhof
Hi Igor, Thanks for checking the logs... but what the hell is going on here? :-) Yes, it's true, I tested and created the OSDs with three different RocksDB options. I cannot understand why the OSDs don't have the same RocksDB option, because I created ALL OSDs anew after setting and testing those settings
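To see which RocksDB options each OSD actually runs with, compared against the central config, something like the following (id is a placeholder):

  # central configuration and any per-daemon overrides
  ceph config dump | grep bluestore_rocksdb_options
  ceph config get osd.<id> bluestore_rocksdb_options
  # what the running OSD process has actually loaded
  ceph daemon osd.<id> config show | grep bluestore_rocksdb_options
  ceph daemon osd.<id> config diff | grep -A 5 bluestore_rocksdb_options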

[ceph-users] Where is my free space?

2021-10-12 Thread Szabo, Istvan (Agoda)
Hi, 377TiB is the total cluster size, data pool 4:2 ec, stored 66TiB, how can the data pool be 60% used??!! Some output: ceph df detail --- RAW STORAGE --- CLASS SIZE AVAIL USED RAW USED %RAW USED nvme 12 TiB 11 TiB 128 MiB 1.2 TiB 9.81 ssd 377 TiB 269 TiB 100

[ceph-users] Re: bluefs _allocate unable to allocate

2021-10-12 Thread José H . Freidhof
Hello Igor "Does single OSD startup (after if's experiencing "unable to allocate) takes 20 mins as well?" A: YES Here the example log of the startup and recovery of a problematic osd. https://paste.ubuntu.com/p/2WVJbg7cBy/ Here the example log of a problematic osd https://paste.ubuntu.com/p/qbB6