[ceph-users] Re: [Suspicious newsletter] RGW: Multiple Site does not sync old data

2021-05-10 Thread Szabo, Istvan (Agoda)
OK, that will be challenging with an 800 million object bucket 😃 But I might give it a try. Istvan Szabo Senior Infrastructure Engineer --- Agoda Services Co., Ltd. e: istvan.sz...@agoda.com --

[ceph-users] Re: radosgw-admin user create takes a long time (with failed to distribute cache message)

2021-05-10 Thread Boris Behrens
Hi Amit, I just pinged the mons from every system and they are all available. On Mon., 10 May 2021 at 21:18, Amit Ghadge wrote: > We have seen slowness due to one of the mgr services being unreachable; > maybe it is different here. You can check the monmap / ceph.conf mon entries and then verify > all nodes a
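For reference, a minimal sketch of cross-checking the monmap against what clients read from ceph.conf (host names below are placeholders, not from the thread):

    # Show the monmap the cluster is actually using
    ceph mon dump
    # Compare against the mon entries clients read from ceph.conf
    grep mon_host /etc/ceph/ceph.conf
    # Verify each mon's messenger ports answer (v2: 3300, v1: 6789)
    nc -zv mon1.example.com 3300
    nc -zv mon1.example.com 6789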

[ceph-users] Re: Write Ops on CephFS Increasing exponentially

2021-05-10 Thread Patrick Donnelly
Hi Kyle, On Thu, May 6, 2021 at 7:56 AM Kyle Dean wrote: > > Hi, hoping someone could help me get to the bottom of this particular issue > I'm having. > > I have ceph octopus installed using ceph-ansible. > > Currently, I have 3 MDS servers running, and one client connected to the > active MDS.
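One way to see where the write ops come from is to watch the MDS counters and client sessions; a rough sketch, assuming the daemon commands run on the active MDS host and that mds.a is a placeholder daemon name:

    # Overall filesystem and MDS activity
    ceph fs status
    # Dump the MDS perf counters (run on the host with the admin socket)
    ceph daemon mds.a perf dump
    # List client sessions to see which client is driving the requests
    ceph daemon mds.a session ls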

[ceph-users] Re: Ceph 16.2.3 issues during upgrade from 15.2.10 with cephadm/lvm list

2021-05-10 Thread David Orman
Hi Sage, We've got 2.0.27 installed. I restarted all the manager pods, just in case, and I have the same behavior afterwards. David On Mon, May 10, 2021 at 6:53 PM Sage Weil wrote: > > The root cause is a bug in conmon. If you can upgrade to >= 2.0.26 > this will also fix the problem. What ve

[ceph-users] Re: Ceph 16.2.3 issues during upgrade from 15.2.10 with cephadm/lvm list

2021-05-10 Thread Sage Weil
The root cause is a bug in conmon. If you can upgrade to >= 2.0.26 this will also fix the problem. What version are you using? The kubic repos currently have 2.0.27. See https://build.opensuse.org/project/show/devel:kubic:libcontainers:stable We'll make sure the next release has the verbosity
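For reference, a quick way to confirm which conmon is actually in use on a host (a sketch, not from the thread):

    # Version of the conmon binary on the host
    conmon --version
    # Podman also reports the conmon it was built against
    podman info | grep -i -A3 conmon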

[ceph-users] Re: Ceph 16.2.3 issues during upgrade from 15.2.10 with cephadm/lvm list

2021-05-10 Thread David Orman
I think I may have found the issue: https://tracker.ceph.com/issues/50526 It seems it may be fixed in: https://github.com/ceph/ceph/pull/41045 I hope this can be prioritized as an urgent fix as it's broken upgrades on clusters of a relatively normal size (14 nodes, 24x OSDs, 2x NVME for DB/WAL w/

[ceph-users] Re: [Suspicious newsletter] RGW: Multiple Site does not sync old data

2021-05-10 Thread Szabo, Istvan (Agoda)
So how is your multisite setup going at the moment? Seems like with this rewrite you’ve moved further than me 😊 Is it working properly now? If yes, what are the steps to make it work? Where is the magic 😊 ? Istvan Szabo Senior Infrastructure Engineer -

[ceph-users] Re: Host crash undetected by ceph health check

2021-05-10 Thread Frank Schilder
I reproduced the problem today by taking down the ceph cluster network interface on a host, cutting off all ceph communication at once. What I observe is that IO gets stuck, but OSDs are not marked down. Instead, operations like the one below get stuck in the MON leader and a MON slow ops warni
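A rough sketch of reproducing and inspecting this (the interface name is a placeholder; the last command has to run on a mon host):

    # On the test host: cut the cluster network
    ip link set ens1f1 down
    # From another node: check whether the OSDs ever get marked down
    ceph osd tree down
    ceph health detail
    # Inspect the operations stuck on the mon (run on that mon's host)
    ceph daemon mon.$(hostname -s) ops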

[ceph-users] Which EC-code for 6 servers?

2021-05-10 Thread Szabo, Istvan (Agoda)
Hi, I am thinking of 2:2 so I can tolerate the loss of 2 hosts, but if I just want to tolerate 1 host loss, which one is better, 3:2 or 4:1? Istvan Szabo Senior Infrastructure Engineer --- Agoda Services Co., Ltd. e: istvan.sz...@agoda.com
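For comparison, the profiles in question could be created roughly like this (profile and pool names are made up); note that with 6 hosts and failure domain host, a 5-chunk profile leaves one host spare for recovery:

    # k=3, m=2: tolerates 2 host failures, 60% usable capacity
    ceph osd erasure-code-profile set ec-3-2 k=3 m=2 crush-failure-domain=host
    # k=4, m=1: tolerates only 1 host failure, 80% usable capacity
    ceph osd erasure-code-profile set ec-4-1 k=4 m=1 crush-failure-domain=host
    # Example pool using one of the profiles
    ceph osd pool create ecpool 128 128 erasure ec-3-2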

[ceph-users] Re: [Suspicious newsletter] Building ceph clusters with 8TB SSD drives?

2021-05-10 Thread Szabo, Istvan (Agoda)
We are using 15TB SSDs in our object store. Istvan Szabo Senior Infrastructure Engineer --- Agoda Services Co., Ltd. e: istvan.sz...@agoda.com --- -Original Message- From: Matt Larson Sent: Sat

[ceph-users] Ceph 16.2.3 issues during upgrade from 15.2.10 with cephadm/lvm list

2021-05-10 Thread David Orman
Hi, We are seeing the mgr attempt to apply our OSD spec on the various hosts, then block. When we investigate, we see the mgr has executed cephadm calls like so, which are blocking: root 1522444 0.0 0.0 102740 23216 ? S 17:32 0:00 \_ /usr/bin/python3 /var/lib/ceph/X/cep
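When the mgr hangs like this, one way to narrow it down is to find the spawned process on the affected host and re-run roughly the same call by hand; a sketch (the exact arguments the mgr passes may differ):

    # Find the cephadm/ceph-volume processes the mgr has spawned
    ps -ef | grep -E 'cephadm|ceph-volume' | grep -v grep
    # Re-run the inventory call manually to see where it blocks
    cephadm ceph-volume -- lvm list --format json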

[ceph-users] Re: Upgrade tips from Luminous to Nautilus?

2021-05-10 Thread Mark Schouten
On Thu, Apr 29, 2021 at 10:58:15AM +0200, Mark Schouten wrote: > We've done our fair share of Ceph cluster upgrades since Hammer, and > have not seen many problems with them. I'm now at the point that I have > to upgrade a rather large cluster running Luminous and I would like to > hear from other

[ceph-users] Re: Building ceph clusters with 8TB SSD drives?

2021-05-10 Thread Erik Lindahl
Hi Matt, Yes, we've experimented a bit with consumer SSDs, and also done some benchmarks. The main reason for SSDs is typically to improve IOPS for small writes, since even HDDs will usually give you quite good aggregated bandwidth as long as you have enough of them - but for high-IOPS usage most
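For anyone evaluating drives for this: the usual quick check is single-threaded 4k sync writes with fio, which approximates WAL/journal behaviour; a sketch (the device path is a placeholder and the test overwrites the device):

    # Destructive: writes directly to the raw device
    fio --name=wal-test --filename=/dev/sdX --ioengine=libaio \
        --direct=1 --sync=1 --rw=write --bs=4k \
        --numjobs=1 --iodepth=1 --runtime=60 --time_based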

[ceph-users] Re: radosgw-admin user create takes a long time (with failed to distribute cache message)

2021-05-10 Thread Boris Behrens
Hi guys, does anyone have any idea? On Wed., 5 May 2021 at 16:16, Boris Behrens wrote: > Hi, > for a couple of days we have been experiencing a strange slowness on some > radosgw-admin operations. > What is the best way to debug this? > > For example, creating a user takes over 20s. > [root@s3db1 ~]#
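For reference, a sketch of timing the call and raising RGW debug levels while it runs (the uid and the daemon socket name are placeholders):

    # Time the slow operation
    time radosgw-admin user create --uid=debuguser --display-name="Debug User"
    # Temporarily raise debug logging on the running RGW daemon
    ceph daemon client.rgw.s3db1 config set debug_rgw 20
    ceph daemon client.rgw.s3db1 config set debug_ms 1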

[ceph-users] Re: Stuck OSD service specification - can't remove

2021-05-10 Thread David Orman
This turns out to be worse than we thought. We attempted another Ceph upgrade (15.2.10->16.2.3) on another cluster, and have run into this again. We're seeing strange behavior with the OSD specifications, which also have a count that is #OSDs + #hosts, so for example, on a 504 OSD cluster (21 nodes
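For anyone comparing numbers, the counts the orchestrator believes it should have can be inspected like this (a sketch):

    # Show OSD services with their running/expected counts
    ceph orch ls --service-type osd
    # Export the full specs the mgr is trying to apply
    ceph orch ls --service-type osd --export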

[ceph-users] Re: [Suspicious newsletter] RGW: Multiple Site does not sync old data

2021-05-10 Thread 特木勒
Hi Istvan: Thanks for your help. After we rewrote all the objects in the buckets, the sync seems to work again. We are using this command to rewrite all the objects in a specific bucket: `radosgw-admin bucket rewrite --bucket=BUCKET_NAME --min-rewrite-size 0` You can try to run this on 1 bucket a
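A hedged sketch of applying the same rewrite to every bucket, assuming jq is installed; note that --min-rewrite-size 0 forces every object to be rewritten, which is a lot of I/O on large buckets:

    # Rewrite all objects in every bucket, one bucket at a time (heavy I/O)
    for b in $(radosgw-admin bucket list | jq -r '.[]'); do
        echo "rewriting $b"
        radosgw-admin bucket rewrite --bucket="$b" --min-rewrite-size 0
    done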

[ceph-users] Re: v16.2.2 Pacific released

2021-05-10 Thread Mike Perez
Hi Norman, Here's the correct link. https://docs.ceph.com/en/latest/install/get-packages/ On Fri, May 7, 2021 at 9:04 PM kefu chai wrote: > > On Sat, May 8, 2021 at 10:42 AM Norman.Kern wrote: > > > > Hi David, > > > > The web page is missing: > > https://docs.ceph.com/en/latest/docs/master/i

[ceph-users] How to deploy ceph with ssd ?

2021-05-10 Thread codignotto
I'm deploying 6 Ceph servers with 128GB of memory each, 12 SSDs of 1 TB per server, and 10Gb network cards connected to 10Gb switch ports. I'm following this documentation: https://docs.ceph.com/en/octopus/cephadm/install/ But I don't know if this is the best way to get the most out of the disks,
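With cephadm and all-flash nodes like these, the simplest starting point is to let the orchestrator consume every empty SSD; a minimal sketch:

    # Preview the devices cephadm considers available
    ceph orch device ls
    # Create OSDs on all empty, unused drives across the cluster
    ceph orch apply osd --all-available-devices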