[ceph-users] Re: Move on cephfs not O(1)?

2020-03-27 Thread Frank Schilder
Thanks a lot! Have a good weekend. = Frank Schilder AIT Risø Campus Bygning 109, rum S14 From: Gregory Farnum Sent: 27 March 2020 21:18:37 To: Jeff Layton Cc: Frank Schilder; Zheng Yan; ceph-users; Luis Henriques Subject: Re: [ceph-users]

[ceph-users] Re: Move on cephfs not O(1)?

2020-03-27 Thread Gregory Farnum
Made a ticket in the Linux Kernel Client tracker: https://tracker.ceph.com/issues/44791 I naively don't think this should be very complicated at all, except that I recall hearing something about locking issues with quota realms in the kclient? But the userspace client definitely doesn't have to do

[ceph-users] Ceph influxDB support versus Telegraf Ceph plugin?

2020-03-27 Thread victorhooi
Hi, I've read that Ceph has some InfluxDB reporting capabilities inbuilt (https://docs.ceph.com/docs/master/mgr/influx/). However, Telegraf, which is the system reporting daemon for InfluxDB, also has a Ceph plugin (https://github.com/influxdata/telegraf/tree/master/plugins/inputs/ceph). Just
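For reference, wiring up the built-in influx module looks roughly like this (a sketch only; the exact mgr/influx/* key names vary a little between releases, so check the module docs for your version, and the hostname shown is made up):

    ceph mgr module enable influx
    ceph config set mgr mgr/influx/hostname influxdb.example.com   # hypothetical InfluxDB host
    ceph config set mgr mgr/influx/database ceph
    ceph config set mgr mgr/influx/username ceph
    ceph config set mgr mgr/influx/password secret
    ceph config set mgr mgr/influx/interval 30                     # seconds between reports

The Telegraf ceph input, as far as I can tell, instead runs as an agent on each node and reads the daemon admin sockets (optionally plus cluster-wide ceph commands), so it needs no mgr module at all.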

[ceph-users] Ceph WAL/DB disks - do they really only use 3GB, or 30Gb, or 300GB

2020-03-27 Thread victorhooi
Hi, I'm using Intel Optane disks to provide WAL/DB capacity for my Ceph cluster (which is part of Proxmox - for VM hosting). I've read that WAL/DB partitions only use either 3GB, or 30GB, or 300GB - due to the way that RocksDB works. Is this true? My current partition for WAL/DB is 145 GB - d
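The 3/30/300 rule of thumb comes from RocksDB's level sizing; with the default options it works out roughly as follows (approximate, ignoring compaction overhead):

    L1 ~= max_bytes_for_level_base         = 256 MB
    L2 ~= 256 MB x 10 (level multiplier)   ~ 2.5 GB
    L3 ~= 2.5 GB x 10                      ~  25 GB
    L4 ~= 25 GB  x 10                      ~ 250 GB

A level is only kept on the fast device if the whole level fits there, so usable DB sizes effectively come in steps of roughly 3, 30 or 300 GB (plus a couple of GB for the WAL). With a 145 GB partition, everything up to L3 fits but L4 does not, so a large part of the partition can sit unused on releases of that era; later releases changed the level placement logic, so treat these numbers as a rule of thumb, not a guarantee.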

[ceph-users] Leave of absence...

2020-03-27 Thread Sage Weil
Hi everyone, I am taking time off from the Ceph project and from Red Hat, starting in April and extending through the US election in November. I will initially be working with an organization focused on voter registration and turnout and combating voter suppression and disinformation campaigns.

[ceph-users] Re: v15.2.0 Octopus released

2020-03-27 Thread Mazzystr
What about the missing dependencies for octopus on el8? (looking at you, ceph-mgr!) On Fri, Mar 27, 2020 at 7:15 AM Sage Weil wrote: > One word of caution: there is one known upgrade issue if you > > - upgrade from luminous to nautilus, and then > - run nautilus for a very short period of t


[ceph-users] Re: Space leak in Bluestore

2020-03-27 Thread vitalif
Update on my issue. It seems it was caused by broken compression, which one of the 14.2.x releases (Ubuntu builds) probably had. My OSD versions were mixed: five OSDs were 14.2.7, one was 14.2.4, and the other 6 were 14.2.8. I moved the same PG several more times. Space usage dropped when the pg was
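For anyone checking whether they are in the same situation, a quick way to spot mixed daemon versions and the effective compression settings (Nautilus-era commands; the pool and OSD names are placeholders):

    ceph versions                                  # summary of running daemon versions
    ceph config show osd.0 | grep compression      # effective compression options on one OSD
    ceph osd pool get <pool> compression_mode      # per-pool override; errors if never set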

[ceph-users] Re: Identify slow ops

2020-03-27 Thread Thomas Schneider
Hi, I have upgraded to 14.2.8 and rebooted all nodes sequentially, including all 3 MON services. However, the slow ops are still displayed with an increasing block time. If this is still a (harmless) bug, then I would like to understand what's causing it. But if this is not a bug, then I would like
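One way to see what the counter actually refers to is to dump the ops from the daemon that reports them; something along these lines (daemon ids are placeholders):

    ceph health detail                             # names the daemon(s) reporting slow ops
    ceph daemon mon.<id> ops                       # on that mon: ops currently in flight
    ceph daemon osd.<n> dump_historic_slow_ops     # on an OSD, if an OSD is the reporter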

[ceph-users] Re: No reply or very slow reply from Prometheus plugin - ceph-mgr 13.2.8 mimic

2020-03-27 Thread Jarett DeAngelis
I’m actually very curious how well this is performing for you as I’ve definitely not seen a deployment this large. How do you use it? > On Mar 27, 2020, at 11:47 AM, shubjero wrote: > > I've reported stability problems with ceph-mgr w/ prometheus plugin > enabled on all versions we ran in produ

[ceph-users] Re: No reply or very slow reply from Prometheus plugin - ceph-mgr 13.2.8 mimic

2020-03-27 Thread shubjero
I've reported stability problems with ceph-mgr w/ prometheus plugin enabled on all versions we ran in production, which were several versions of Luminous and Mimic. Our solution was to disable the prometheus exporter. I am using Zabbix instead. Our cluster is 1404 OSDs in size with about 9PB raw wi
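For anyone wanting to do the same, the module switch itself is short (the zabbix settings below are placeholders, and the zabbix module relies on zabbix_sender being installed on the active mgr host):

    ceph mgr module disable prometheus
    ceph mgr module enable zabbix
    ceph zabbix config-set zabbix_host zabbix.example.com   # hypothetical Zabbix server
    ceph zabbix config-set identifier ceph-cluster          # host name the data is sent as
    ceph zabbix send                                        # push once by hand to test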

[ceph-users] Re: Move on cephfs not O(1)?

2020-03-27 Thread Jeff Layton
On Fri, 2020-03-27 at 07:36 -0400, Jeff Layton wrote: > On Thu, 2020-03-26 at 10:32 -0700, Gregory Farnum wrote: > > On Thu, Mar 26, 2020 at 9:13 AM Frank Schilder wrote: > > > Dear all, > > > > > > yes, this is it, quotas. In the structure A/B/ there was a quota set on > > > A. Hence, B was mov

[ceph-users] Re: v15.2.0 Octopus released

2020-03-27 Thread Sage Weil
One word of caution: there is one known upgrade issue if you - upgrade from luminous to nautilus, and then - run nautilus for a very short period of time (hours), and then - upgrade from nautilus to octopus that prevents OSDs from starting. We have a fix that will be in 15.2.1, but until tha

[ceph-users] Re: BlueStore and checksum

2020-03-27 Thread Nathan Fish
The error is silently corrected and the correct data rewritten to the bad sector. There may be a slight latency increase on the read. The checksumming is implemented at the BlueStore layer, so what you are storing makes no difference. On Fri, Mar 27, 2020 at 5:12 AM Priya Sehgal wrote: > > Hi, > I a
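To add a little detail: the checksum type is a per-OSD BlueStore option, and mismatches found during scrub surface as inconsistent PGs; a rough way to inspect and repair (Nautilus-era commands, PG id is a placeholder):

    ceph config show osd.0 | grep bluestore_csum_type   # default is crc32c
    ceph health detail                                  # lists PGs flagged inconsistent by scrub
    rados list-inconsistent-obj <pgid> --format=json-pretty
    ceph pg repair <pgid>                               # rewrite the bad copy from a good one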

[ceph-users] Re: Help: corrupt pg

2020-03-27 Thread Jake Grimmett
Hi Greg, Yes, this was caused by a chain of events. As a cautionary tale, the main ones were: 1) a minor nautilus release upgrade, followed by a rolling node restart script that mistakenly relied on "ceph -s" for cluster health info, i.e. it didn't wait for the cluster to return to health bef
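For others writing rolling-restart scripts: waiting on PG states rather than a single look at "ceph -s" avoids this; a minimal sketch (condition and sleep interval to taste):

    ceph osd set noout                         # optional: avoid rebalancing during the restart
    systemctl restart ceph-osd.target          # on the node being worked on
    # wait until every PG is active+clean again before moving to the next node
    while ceph pg stat | grep -qE 'degraded|peering|undersized|inactive'; do
        sleep 30
    done
    ceph osd unset noout                       # once the whole rolling restart is finished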

[ceph-users] fast luminous -> nautilus -> octopus upgrade could lead to assertion failure on OSD

2020-03-27 Thread kefu chai
hi folks, if you are upgrading from luminous to octopus, or you plan to do so, please read on. in octopus, osd will crash if it processes an osdmap whose require_osd_release flag is still luminous. this only happens if a cluster upgrades very quickly from luminous to nautilus and to octopus. in
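In practice that means checking, before the final hop, that the flag really was bumped during the nautilus stopover:

    ceph osd dump | grep require_osd_release   # should say nautilus before upgrading to octopus
    ceph osd require-osd-release nautilus      # set it if it still reports luminous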

[ceph-users] Re: samba ceph-vfs and scrubbing interval

2020-03-27 Thread Jeff Layton
On Fri, 2020-03-27 at 12:00 +0100, Marco Savoca wrote: > Hi all, > > i‘m running a 3 node ceph cluster setup with collocated mons and mds > for actually 3 filesystems at home since mimic. I’m planning to > downgrade to one FS and use RBD in the future, but this is another > story. I’m using the cl

[ceph-users] Re: Move on cephfs not O(1)?

2020-03-27 Thread Jeff Layton
On Thu, 2020-03-26 at 10:32 -0700, Gregory Farnum wrote: > On Thu, Mar 26, 2020 at 9:13 AM Frank Schilder wrote: > > Dear all, > > > > yes, this is it, quotas. In the structure A/B/ there was a quota set on A. > > Hence, B was moved out of this zone and this does indeed change mv to be a > > cp
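A small illustration of the behaviour being discussed, with hypothetical paths on a CephFS mount: once a quota is set on A, A becomes a quota realm, and on affected clients a rename that crosses the realm boundary returns EXDEV, so mv falls back to copy+unlink instead of an O(1) rename.

    setfattr -n ceph.quota.max_bytes -v 1099511627776 /mnt/cephfs/A   # 1 TiB quota; A is now a realm
    mv /mnt/cephfs/A/B /mnt/cephfs/C/   # crosses the realm boundary -> data is copied, not relinked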

[ceph-users] samba ceph-vfs and scrubbing interval

2020-03-27 Thread Marco Savoca
Hi all, I'm running a 3-node ceph cluster setup with colocated mons and mds, currently for 3 filesystems, at home since mimic. I'm planning to downgrade to one FS and use RBD in the future, but this is another story. I'm using the cluster as cold storage on spindles with EC-pools for archive purp
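For context, a minimal vfs_ceph share looks roughly like the following (share name, path and cephx user are made up; see the vfs_ceph man page for the full parameter list):

    [archive]
        vfs objects = ceph
        path = /archive
        ceph:config_file = /etc/ceph/ceph.conf
        ceph:user_id = samba
        kernel share modes = no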

[ceph-users] Re: [ceph][nautilus] error initalizing secondary zone

2020-03-27 Thread Ignazio Cassano
I am sorry. The problem was the http_proxy. Ignazio On Fri, 27 Mar 2020 at 11:24, Ignazio Cassano < ignaziocass...@gmail.com> wrote: > Hello, I am trying to initialize the secondary zone by pulling the realm > defined in the primary zone: > > radosgw-admin realm pull --rgw-realm=niv
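For anyone hitting the same error: if the shell exports a proxy, the HTTP request radosgw-admin makes to the master zone can get swallowed by it; bypassing the proxy for that endpoint is enough, e.g.:

    # either exclude the endpoint from the proxy...
    export no_proxy=10.102.184.190
    # ...or strip the proxy variables just for this command
    env -u http_proxy -u https_proxy radosgw-admin realm pull \
        --rgw-realm=nivola --url=http://10.102.184.190:8080 \
        --access-key=access --secret=secret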

[ceph-users] [ceph][nautilus] error initalizing secondary zone

2020-03-27 Thread Ignazio Cassano
Hello, I am trying to initialize the secondary zone by pulling the realm defined in the primary zone: radosgw-admin realm pull --rgw-realm=nivola --url=http://10.102.184.190:8080 --access-key=access --secret=secret The following error appears: request failed: (16) Device or resource busy Could

[ceph-users] Re: Combining erasure coding and replication?

2020-03-27 Thread Simon Oosthoek
On 27/03/2020 09:56, Eugen Block wrote: > Hi, > >> I guess what you are suggesting is something like k+m with m>=k+2, for >> example k=4, m=6. Then, one can distribute 5 shards per DC and sustain >> the loss of an entire DC while still having full access to redundant >> storage. > > that's exactl

[ceph-users] BlueStore and checksum

2020-03-27 Thread Priya Sehgal
Hi, I am trying to find out whether Ceph has a method to detect silent corruption such as bit rot. I came across this text in a book - "Mastering Ceph : Infrastructure Storage Solution with the latest Ceph release" by Nick Fisk - Luminous release of Ceph employs ZFS-like ability to checksum data

[ceph-users] Re: Combining erasure coding and replication?

2020-03-27 Thread Lars Täuber
Hi Brett, I'm far from being an expert, but you may consider rbd-mirroring between EC-pools. Cheers, Lars Am Fri, 27 Mar 2020 06:28:02 + schrieb Brett Randall : > Hi all > > Had a fun time trying to join this list, hopefully you don’t get this message > 3 times! > > On to Ceph… We are l

[ceph-users] Re: Combining erasure coding and replication?

2020-03-27 Thread Eugen Block
Hi, I guess what you are suggesting is something like k+m with m>=k+2, for example k=4, m=6. Then, one can distribute 5 shards per DC and sustain the loss of an entire DC while still having full access to redundant storage. that's exactly what I mean, yes. Now, a long time ago I was in a

[ceph-users] Re: No reply or very slow reply from Prometheus plugin - ceph-mgr 13.2.8 mimic

2020-03-27 Thread Janek Bevendorff
Sorry, I meant MGR of course. MDS are fine for me. But the MGRs were failing constantly due to the prometheus module doing something funny. On 26/03/2020 18:10, Paul Choi wrote: > I won't speculate more into the MDS's stability, but I do wonder about > the same thing. > There is one file served b

[ceph-users] Re: Combining erasure coding and replication?

2020-03-27 Thread Frank Schilder
Dear Eugen, I guess what you are suggesting is something like k+m with m>=k+2, for example k=4, m=6. Then, one can distribute 5 shards per DC and sustain the loss of an entire DC while still having full access to redundant storage. Now, a long time ago I was in a lecture about error-correcting
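A sketch of how that k=4, m=6 layout could be expressed, assuming a CRUSH tree with two buckets of type datacenter (profile name, rule id and the rest are made up; the rule itself goes into a decompiled CRUSH map):

    ceph osd erasure-code-profile set ec-2dc k=4 m=6 crush-failure-domain=host

    rule ec-2dc-rule {
        id 87
        type erasure
        min_size 3
        max_size 10
        step set_chooseleaf_tries 5
        step set_choose_tries 100
        step take default
        step choose indep 2 type datacenter
        step chooseleaf indep 5 type host
        step emit
    }

Losing a whole DC then leaves 5 of the 10 shards, which still meets the usual pool min_size of k+1 = 5, so the pool stays available, though with no margin left; that is where the m >= k+2 condition comes from.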

[ceph-users] Re: How to migrate ceph-xattribs?

2020-03-27 Thread Frank Schilder
> > If automatic migration is not possible, is there at least an efficient way > > to > > *find* everything with special ceph attributes? > > IIRC, you can still see all these attributes by querying for the > "ceph" xattr. Does that not work for you? In case I misunderstand this part of your mes
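In case it helps, a crude way to locate directories that carry a quota (paths are hypothetical, and depending on the client an unset quota reads back as 0 or as a missing attribute):

    find /mnt/cephfs -type d -print0 | while IFS= read -r -d '' d; do
        v=$(getfattr --only-values -n ceph.quota.max_bytes -- "$d" 2>/dev/null)
        [ -n "$v" ] && [ "$v" != "0" ] && printf '%s\t%s\n' "$v" "$d"
    done

The same pattern with ceph.dir.layout should find directories with an explicit layout, since that attribute only exists where one was set.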

[ceph-users] Re: Combining erasure coding and replication?

2020-03-27 Thread Eugen Block
Hi Brett, Our concern with Ceph is the cost of having three replicas. Storage may be cheap but I’d rather not buy ANOTHER 5pb for a third replica if there are ways to do this more efficiently. Site-level redundancy is important to us so we can’t simply create an erasure-coded volume acros