[ceph-users] Combining erasure coding and replication?

2020-03-26 Thread Brett Randall
Hi all. Had a fun time trying to join this list; hopefully you don’t get this message 3 times! On to Ceph… We are looking at setting up our first ever Ceph cluster to replace Gluster as our media asset storage and production system. The Ceph cluster will have 5PB of usable storage. Whether we u
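For illustration only: a minimal Python sketch of how a replicated pool and an erasure-coded pool can coexist in one cluster (e.g. replicated for metadata or hot data, EC for bulk media). It simply drives the ceph CLI; the profile/pool names and PG counts are placeholders, not a recommendation for a 5PB layout.

    # Sketch: one EC pool for bulk data plus one replicated pool, in the same
    # cluster. Assumes the "ceph" CLI and an admin keyring are available.
    import subprocess

    def ceph(*args):
        # Run a ceph CLI command, raising if it returns non-zero.
        subprocess.run(["ceph", *args], check=True)

    # Erasure-code profile: 4 data + 2 coding chunks, host failure domain.
    ceph("osd", "erasure-code-profile", "set", "media-ec-profile",
         "k=4", "m=2", "crush-failure-domain=host")

    # Bulk data pool using the EC profile (PG numbers are placeholders).
    ceph("osd", "pool", "create", "media-data", "256", "256",
         "erasure", "media-ec-profile")

    # Replicated pool for metadata or latency-sensitive data.
    ceph("osd", "pool", "create", "media-meta", "64", "64", "replicated")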

[ceph-users] Ceph rbd mirror and object storage multisite

2020-03-26 Thread Ignazio Cassano
Hello All, I am going to test rbd mirroring and object storage multisite. I would like to know which network is used by rbd-mirror (the Ceph public or cluster network?). Same question for object storage multisite. What about firewalls? What about bandwidth? Our sites are connected with a 1Gbs networ

[ceph-users] Re: How to migrate ceph-xattribs?

2020-03-26 Thread Frank Schilder
Dear Gregory, thanks for your fast reply. I understand the reasons for hiding the attributes and agree with them. For example, migrating data pool settings between different ceph file systems without value mapping as in our case will break everything. Now, the situation is this, consider a mul

[ceph-users] Re: How to migrate ceph-xattribs?

2020-03-26 Thread Gregory Farnum
On Thu, Mar 26, 2020 at 9:24 AM Frank Schilder wrote: > > Dear all, > > we are in the process of migrating a ceph file system from a 2-pool layout > (rep meta+ec data) to the recently recommended 3-pool layout (rep meta, per > primary data, ec data). As part of this, we need to migrate any ceph xa

[ceph-users] Re: Move on cephfs not O(1)?

2020-03-26 Thread Gregory Farnum
On Thu, Mar 26, 2020 at 9:13 AM Frank Schilder wrote: > > Dear all, > > yes, this is it, quotas. In the structure A/B/ there was a quota set on A. > Hence, B was moved out of this zone and this does indeed change mv to be a > cp+rm. > > The obvious follow-up. What is the procedure for properly m

[ceph-users] Re: Help: corrupt pg

2020-03-26 Thread Gregory Farnum
On Wed, Mar 25, 2020 at 5:19 AM Jake Grimmett wrote: > > Dear All, > > We are "in a bit of a pickle"... > > No reply to my message (23/03/2020), subject "OSD: FAILED > ceph_assert(clone_size.count(clone))" > > So I'm presuming it's not possible to recover the crashed OSD. From your later email i

[ceph-users] Re: No reply or very slow reply from Prometheus plugin - ceph-mgr 13.2.8 mimic

2020-03-26 Thread Paul Choi
I won't speculate more into the MDS's stability, but I do wonder about the same thing. There is one file served by the MDS that would cause the ceph-fuse client to hang. It was a file that many people in the company relied on for data updates, so very noticeable. The only fix was to fail over the M

[ceph-users] Re: No reply or very slow reply from Prometheus plugin - ceph-mgr 13.2.8 mimic

2020-03-26 Thread Janek Bevendorff
If there is actually a connection, then it's no wonder our MDS kept crashing. Our Ceph has 9.2PiB of available space at the moment. On 26/03/2020 17:32, Paul Choi wrote: > I can't quite explain what happened, but the Prometheus endpoint > became stable after the free disk space for the largest po

[ceph-users] How to migrate ceph-xattribs?

2020-03-26 Thread Frank Schilder
Dear all, we are in the process of migrating a ceph file system from a 2-pool layout (rep meta+ec data) to the recently recommended 3-pool layout (rep meta, per primary data, ec data). As part of this, we need to migrate any ceph xattribs set on files and directories. As these are no longer disco
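For illustration only: a minimal Python sketch of copying the ceph.dir.layout.pool virtual xattr from one mounted file system to another, remapping pool names on the way. The mount points and the pool-name mapping are assumptions; directories without an explicit layout are skipped.

    # Sketch: copy directory layout xattrs from an old CephFS mount to a new
    # one, remapping the data pool name. Paths and pool names are examples.
    import os

    OLD_ROOT = "/mnt/cephfs-old"   # hypothetical mount of the old fs
    NEW_ROOT = "/mnt/cephfs-new"   # hypothetical mount of the new fs
    POOL_MAP = {b"old-ec-data": b"new-ec-data"}   # old name -> new name

    for dirpath, dirnames, filenames in os.walk(OLD_ROOT):
        try:
            pool = os.getxattr(dirpath, "ceph.dir.layout.pool")
        except OSError:
            continue   # no explicit layout set on this directory
        target = os.path.join(NEW_ROOT, os.path.relpath(dirpath, OLD_ROOT))
        os.setxattr(target, "ceph.dir.layout.pool", POOL_MAP.get(pool, pool))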

[ceph-users] Re: No reply or very slow reply from Prometheus plugin - ceph-mgr 13.2.8 mimic

2020-03-26 Thread Paul Choi
I can't quite explain what happened, but the Prometheus endpoint became stable after the free disk space for the largest pool went substantially lower than 1PB. I wonder if there's some metric that exceeds the maximum size for some int, double, etc? -Paul On Mon, Mar 23, 2020 at 9:50 AM Janek Bev
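Purely as an arithmetic aside (no claim about ceph-mgr internals): a 64-bit double only represents integers exactly up to 2**53, which is roughly 8 PiB when counting bytes, so byte counters in that range do lose precision if they ever pass through a float.

    # 2**53 is the largest integer a double holds exactly; in bytes that is
    # about 8 PiB, the same order of magnitude as the sizes in this thread.
    MAX_EXACT_DOUBLE = 2**53            # 9_007_199_254_740_992
    PIB = 2**50

    print(MAX_EXACT_DOUBLE / PIB)                   # 8.0 PiB
    print(9.2 * PIB > MAX_EXACT_DOUBLE)             # True: 9.2 PiB of bytes
    print(float(2**53 + 1) == float(2**53))         # True: adjacent ints collapse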

[ceph-users] Re: Move on cephfs not O(1)?

2020-03-26 Thread Frank Schilder
Dear all, yes, this is it, quotas. In the structure A/B/ there was a quota set on A. Hence, B was moved out of this zone and this does indeed change mv to be a cp+rm. The obvious follow-up. What is the procedure for properly moving data as an administrator? Do I really need to unset quotas, do
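For illustration only: a small Python sketch that checks whether source and destination of a planned move sit under the same nearest quota root, by reading the ceph.quota.max_bytes virtual xattr on ancestor directories. The mount point is a placeholder and the helper is purely illustrative, not an official procedure.

    # Sketch: find the nearest ancestor with a byte quota for both sides of a
    # planned mv. If they differ, the client can be expected to fall back to
    # copy+unlink instead of an O(1) rename.
    import os

    def nearest_quota_root(path, mount="/mnt/cephfs"):
        p = os.path.abspath(path)
        while p.startswith(mount):
            try:
                if int(os.getxattr(p, "ceph.quota.max_bytes")) > 0:
                    return p
            except OSError:
                pass   # no quota xattr readable on this directory
            if p == mount:
                break
            p = os.path.dirname(p)
        return None

    src = nearest_quota_root("/mnt/cephfs/A/B")
    dst = nearest_quota_root("/mnt/cephfs")       # parent of the destination
    print("same quota zone:", src == dst)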

[ceph-users] Performance characteristics of ‘if-none-match’ on rgw

2020-03-26 Thread akmd
Hello, I am observing non-intuitive results for a performance test using the S3 API to RGW. I am wondering if others have similar experiences or knowledge here. Our application is using the “if-none-match” header on S3-API requests. This header is set by the application if it already has a c
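For illustration only: a boto3 sketch of what such a conditional GET looks like against RGW. The endpoint, bucket and key are placeholders; a 304 Not Modified typically surfaces as a ClientError rather than a normal response.

    # Sketch: conditional GET with If-None-Match against RGW via boto3.
    import boto3
    from botocore.exceptions import ClientError

    s3 = boto3.client("s3", endpoint_url="http://rgw.example.com:8080")

    etag = s3.head_object(Bucket="my-bucket", Key="my-object")["ETag"]

    try:
        obj = s3.get_object(Bucket="my-bucket", Key="my-object", IfNoneMatch=etag)
        print("object changed, got", obj["ContentLength"], "bytes")
    except ClientError as e:
        if e.response["Error"]["Code"] in ("304", "NotModified"):
            print("not modified, cached copy still valid")
        else:
            raise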

[ceph-users] Re: v15.2.0 Octopus released

2020-03-26 Thread Anthony D'Atri
I suspect that, while not the default, we still need to be able to run OSDs created on older releases, or with leveldb explicitly? > >> - nothing provides libleveldb.so.1()(64bit) needed by >> ceph-osd-2:15.2.0-0.el8.x86_64 > > Does the OSD still use leveldb at all? > > - Ken >

[ceph-users] Re: v15.2.0 Octopus released

2020-03-26 Thread Ken Dreyer
On Tue, Mar 24, 2020 at 3:34 PM Mazzystr wrote: > - nothing provides libleveldb.so.1()(64bit) needed by > ceph-osd-2:15.2.0-0.el8.x86_64 Does the OSD still use leveldb at all? - Ken ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe se

[ceph-users] Re: Move on cephfs not O(1)?

2020-03-26 Thread Frank Schilder
Dear all, thanks for the hints. I was afraid I was seeing ghosts. Yes, we have quotas on many directories. I'm not entirely sure if there were different quotas involved. I believe it was set only on the highest level, but now that you mention it ... I will try with quotas again, didn't include this in

[ceph-users] Re: octopus upgrade stuck: Assertion `map->require_osd_release >= ceph_release_t::mimic' failed.

2020-03-26 Thread DHilsbos
This is a little beyond my understanding of Ceph, but let me take a crack at it... I've found that Ceph tends to be fairly logical, mostly. require_osd_release looks like a cluster wide configuration value which controls the minimum required version for an OSD daemon to join the cluster. check_
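For illustration only: a small Python sketch that reads the current flag from the osd dump JSON (field name as printed by recent releases), assuming the ceph CLI and an admin keyring are available.

    # Sketch: report the cluster-wide require_osd_release flag.
    import json
    import subprocess

    out = subprocess.run(["ceph", "osd", "dump", "--format", "json"],
                         check=True, capture_output=True, text=True).stdout
    dump = json.loads(out)
    print("require_osd_release:", dump.get("require_osd_release", "<not present>"))

    # Before starting an Octopus upgrade this should already read "nautilus",
    # e.g. after running: ceph osd require-osd-release nautilus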

[ceph-users] Re: Move on cephfs not O(1)?

2020-03-26 Thread Gregory Farnum
I was wondering that, but at least in the userspace client the quotas will just throw EXDEV or EDQUOT if it would exceed quotas... ...and EXDEV might trigger the kernel to do a copy-and-delete, I guess? Not sure. On Thu, Mar 26, 2020 at 7:18 AM Adam Tygart wrote: > > Is there is a possibility that

[ceph-users] Re: Move on cephfs not O(1)?

2020-03-26 Thread Adam Tygart
Is there a possibility that there was a quota involved? I've seen moves between quota zones cause a copy then delete. -- Adam On Thu, Mar 26, 2020 at 9:14 AM Gregory Farnum wrote: > > On Thu, Mar 26, 2020 at 5:49 AM Frank Schilder wrote: > > > > Some time ago I made a surprising observati

[ceph-users] Re: Move on cephfs not O(1)?

2020-03-26 Thread Gregory Farnum
On Thu, Mar 26, 2020 at 5:49 AM Frank Schilder wrote: > > Some time ago I made a surprising observation. I reorganised a directory > structure and needed to move a folder one level up with a command like > > mv A/B/ B > > B contained something like 9TB in very large files. To my surprise, this >

[ceph-users] Move on cephfs not O(1)?

2020-03-26 Thread Frank Schilder
Some time ago I made a surprising observation. I reorganised a directory structure and needed to move a folder one level up with a command like "mv A/B/ B". B contained something like 9TB in very large files. To my surprise, this command didn't return for a couple of minutes and I started to look

[ceph-users] Re: Using sendfile on Ceph FS results in data stuck in client cache

2020-03-26 Thread Jeff Layton
On Wed, 2020-03-25 at 22:10 +, Mikael Öhman wrote: > Hi Jeff! (also, I'm also sorry for a resend, I did exactly the same with my > message as well!) > > Unfortunately, the answer wasn't that simple, as I am on the latest C7 kernel > as well > uname -r > 3.10.0-1062.1.2.el7.x86_64 > > I did
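For illustration only: a Python sketch of the pattern under discussion — sendfile(2) into a CephFS file followed by an explicit fsync to push the data out of the client cache. The paths are placeholders, and the fsync is shown as the obvious workaround being probed in this thread, not as a statement of what the kernel client should require.

    # Sketch: sendfile-based copy into CephFS, then fsync the destination.
    import os

    SRC = "/tmp/source.bin"            # hypothetical local source file
    DST = "/mnt/cephfs/target.bin"     # hypothetical CephFS destination

    with open(SRC, "rb") as src, open(DST, "wb") as dst:
        size = os.fstat(src.fileno()).st_size
        sent = 0
        while sent < size:
            sent += os.sendfile(dst.fileno(), src.fileno(), sent, size - sent)
        dst.flush()
        os.fsync(dst.fileno())         # flush dirty data out of the client cache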

[ceph-users] Re: 14.2.7 MDS High Virtual Memory

2020-03-26 Thread Andrej Filipčič
On 26/03/2020 11:52, Dave Hall wrote: Hello. I have a cluster with 3 MDSs - 2 active and 1 standby.  Looking at 'top' I see that the 2 active MDSs have relatively low resident memory - less than 3 g, but very high virtual memory - 21 g and 41 g.  Is it safe to just restart these processes one

[ceph-users] Re: Space leak in Bluestore

2020-03-26 Thread vitalif
Hi, The cluster is all-flash (NVMe), so the removal is fast and it's in fact pretty noticeable, even on Prometheus graphs. Also I've logged raw space usage from `ceph -f json df`: 1) before pg rebalance started the space usage was 32724002664448 bytes 2) just before the rebalance finished it
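For illustration only: a minimal Python sketch of sampling raw usage from `ceph -f json df` for this kind of before/after comparison. The field names follow what recent releases print and are read defensively.

    # Sketch: log raw cluster usage from "ceph -f json df".
    import json
    import subprocess

    out = subprocess.run(["ceph", "-f", "json", "df"],
                         check=True, capture_output=True, text=True).stdout
    stats = json.loads(out).get("stats", {})
    used = stats.get("total_used_raw_bytes", stats.get("total_used_bytes"))
    print("raw used bytes:", used)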

[ceph-users] 14.2.7 MDS High Virtual Memory

2020-03-26 Thread Dave Hall
Hello. I have a cluster with 3 MDSs - 2 active and 1 standby.  Looking at 'top' I see that the 2 active MDSs have relatively low resident memory - less than 3 g, but very high virtual memory - 21 g and 41 g.  Is it safe to just restart these processes one at a time to release the virtual memor

[ceph-users] Re: Space leak in Bluestore

2020-03-26 Thread Igor Fedotov
Hi Vitaliy, just as a guess to verify: a while ago I observed a very long removal of a (pretty large) pool. It took several days to complete. The DB was on a spinner, which was one driver of this slow behavior. Another one - the PG removal design, which enumerates up to 30 entries max to fill singl

[ceph-users] Re: Exporting

2020-03-26 Thread Eugen Block
Hi, have you tried to create a snapshot of that image and export the snapshot? Or maybe clone the snapshot first and then export it? Does it also fail? Quoting Rhian Resnick: Evening, We are running into issues exporting a disk image from ceph rbd. When we attempt to export an rbd
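For illustration only: a Python sketch of the suggestion above — snapshot the image, then export the snapshot (optionally protect and clone it first). Pool, image, snapshot and output names are placeholders; the rbd CLI is assumed to be available on the client.

    # Sketch: export a snapshot of the image instead of the live image.
    import subprocess

    POOL, IMAGE, SNAP = "rbd", "vm-disk-01", "export-snap"

    def rbd(*args):
        subprocess.run(["rbd", *args], check=True)

    rbd("snap", "create", f"{POOL}/{IMAGE}@{SNAP}")
    rbd("export", f"{POOL}/{IMAGE}@{SNAP}", "/backup/vm-disk-01.img")

    # Optional variant: protect the snapshot and export a clone of it instead.
    # rbd("snap", "protect", f"{POOL}/{IMAGE}@{SNAP}")
    # rbd("clone", f"{POOL}/{IMAGE}@{SNAP}", f"{POOL}/{IMAGE}-clone")
    # rbd("export", f"{POOL}/{IMAGE}-clone", "/backup/vm-disk-01-clone.img")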