Re: [ceph-users] clients failing to advance oldest client/flush tid

2017-10-09 Thread Nigel Williams
On 9 October 2017 at 19:21, Jake Grimmett wrote: > HEALTH_WARN 9 clients failing to advance oldest client/flush tid; > 1 MDSs report slow requests; 1 MDSs behind on trimming On a proof-of-concept 12.2.1 cluster (few random files added, 30 OSDs, default Ceph settings) I can get the above error by
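
For reference, a minimal way to see which client sessions are behind the warning; the MDS daemon name below is a placeholder, not from the thread:

    # show the per-client detail behind the HEALTH_WARN
    ceph health detail
    # dump client sessions on the active MDS via its admin socket
    # (run on the MDS host; replace <name> with the daemon's id)
    ceph daemon mds.<name> session ls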

Re: [ceph-users] installing specific version of ceph-common

2017-10-09 Thread Ben Hines
Just encountered this same problem with 11.2.0. "yum install ceph-common-11.2.0 libradosstriper1-11.2.0 librgw2-11.2.0" did the trick. Thanks! It would be nice if it were easier to install older, non-current versions of Ceph; perhaps there is a way to fix the dependencies so that yum can figure it
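
A sketch of how to check which builds the repository actually carries before pinning a version; package names follow the thread, and the older build must still be present in the repo:

    # show every available build of a package, not just the newest
    yum --showduplicates list ceph-common
    # install a specific version plus its matching libraries (as in the thread)
    yum install ceph-common-11.2.0 libradosstriper1-11.2.0 librgw2-11.2.0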

[ceph-users] Unable to restrict a CephFS client to a subdirectory

2017-10-09 Thread Shawfeng Dong
Dear all, I am trying to follow the instructions at: http://docs.ceph.com/docs/master/cephfs/client-auth/ to restrict a client to a subdirectory of the Ceph filesystem, but always get an error. We are running the latest stable release of Ceph (v12.2.1) on CentOS 7 servers. The user 'hydra' has the f
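
On Luminous the usual shortcut is "ceph fs authorize", which generates the MDS/OSD caps for you; a rough sketch, assuming the filesystem is named cephfs and the target directory is /hydra (both placeholders, not confirmed by the thread):

    # grant client.hydra rw access restricted to the /hydra subtree
    ceph fs authorize cephfs client.hydra /hydra rw
    # inspect the caps that were generated
    ceph auth get client.hydra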

Re: [ceph-users] killing ceph-disk [was Re: ceph-volume: migration and disk partition support]

2017-10-09 Thread Christian Balzer
Hello, (pet peeve alert) On Mon, 9 Oct 2017 15:09:29 + (UTC) Sage Weil wrote: > To put this in context, the goal here is to kill ceph-disk in mimic. > > One proposal is to make it so new OSDs can *only* be deployed with LVM, > and old OSDs with the ceph-disk GPT partitions would be start

Re: [ceph-users] Snapshot space

2017-10-09 Thread Jason Dillaman
OK. I read your "rbd du" results as saying that the clone image "e01f31e94a65cf7e786972b915e07364-1" wrote approximately 10GB of data, then snapshot "d-1" was created (so the space is associated w/ the snapshot), before another 616MB was written against the "HEAD" revision of the image. If you delet
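
A sketch of the kind of report being discussed; "rbd du" breaks usage down per snapshot plus the HEAD revision (pool and image names taken from the thread, otherwise placeholders):

    # per-snapshot and HEAD usage for one image
    rbd du cvm/e01f31e94a65cf7e786972b915e07364-1
    # the same report for every image in the pool
    rbd du -p cvm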

Re: [ceph-users] Bareos and libradosstriper works only for 4M stripe_unit size

2017-10-09 Thread Gregory Farnum
Well, just from a quick skim, libradosstriper.h has a function rados_striper_set_object_layout_object_size(rados_striper_t striper, unsigned int object_size) and libradosstriper.hpp has one in RadosStriper set_object_layout_object_size(unsigned int object_size); So I imagine you specify it

Re: [ceph-users] Snapshot space

2017-10-09 Thread Josy
Sorry for the confusion. >> e01f31e94a65cf7e786972b915e07364-1 is a clone image created from the parent "ostemplates/windows-std-2k8r2-x64-20171004@snap_windows-std-2k8r2-x64-20171004" [cephuser@ceph-las-admin-a1 ceph-cluster]$ rbd info cvm/e01f31e94a65cf7e786972b915e07364-1 rbd image 'e01f

Re: [ceph-users] rgw resharding operation seemingly won't end

2017-10-09 Thread Yehuda Sadeh-Weinraub
On Mon, Oct 9, 2017 at 1:59 PM, Ryan Leimenstoll wrote: > Hi all, > > We recently upgraded to Ceph 12.2.1 (Luminous) from 12.2.0 however are now > seeing issues running radosgw. Specifically, it appears an automatically > triggered resharding operation won’t end, despite the jobs being cancelled

Re: [ceph-users] Snapshot space

2017-10-09 Thread Jason Dillaman
If the clone has written 10GB of data, yes, the clone should show 10GB. I am not sure what you are referring to when you say "clone" since you only included a single image in your response. The clone is the image chained from a parent image snapshot. Is "e01f31e94a65cf7e786972b915e07364-1@d-1" the

Re: [ceph-users] Snapshot space

2017-10-09 Thread Josy
Thank you for your response! If the cloned VM had written around 10GB of data, wouldn't the clone also show that much space? Below is a list of the original image, the clone and new snapshots along with their sizes. The clone is still only a few hundred megabytes, while the snapshot shows

[ceph-users] rgw resharding operation seemingly won't end

2017-10-09 Thread Ryan Leimenstoll
Hi all, We recently upgraded to Ceph 12.2.1 (Luminous) from 12.2.0; however, we are now seeing issues running radosgw. Specifically, it appears an automatically triggered resharding operation won’t end, despite the jobs being cancelled (radosgw-admin reshard cancel). I have also disabled dynamic sh
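
The commands referenced in the thread, plus the option for switching dynamic resharding off, look roughly like this (bucket name is a placeholder):

    # show resharding jobs that are queued or running
    radosgw-admin reshard list
    # cancel the job for one bucket
    radosgw-admin reshard cancel --bucket=<bucket>
    # disable dynamic resharding via ceph.conf (in the rgw client section)
    #   rgw_dynamic_resharding = false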

Re: [ceph-users] Ceph mirrors

2017-10-09 Thread Ken Dreyer
On Thu, Oct 5, 2017 at 1:35 PM, Stefan Kooman wrote: > Hi, > > Sorry for empty mail, that shouldn't have happened. I would like to > address the following. Currently the repository list for debian- > packages contain _only_ the latest package version. In case of a > (urgent) need to downgrade you
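
On the Debian/Ubuntu side, a pinned downgrade would look roughly like this, assuming the repository still carried the older build (which is exactly the problem being raised); the version string is a placeholder:

    # list the versions apt can actually see for a package
    apt-cache madison ceph-common
    # install an explicit version (only works if the repo still ships it)
    apt-get install ceph-common=<exact-version>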

Re: [ceph-users] PGs get placed in the same datacenter (Trying to make a hybrid NVMe/HDD pool with 6 servers, 2 in each datacenter)

2017-10-09 Thread Peter Linder
I was able to get this working with the crushmap in my last post! I now have the intended behavior together with the change of primary affinity on the slow HDDs. Very happy; performance is excellent. One thing was a little weird, though: I had to manually change the weight of each hostgroup so that
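
For context, the two knobs being described, with placeholder IDs and weights rather than the poster's actual values:

    # stop a slow OSD from being chosen as primary (0 = never, 1 = default)
    ceph osd primary-affinity osd.12 0
    # manually adjust the CRUSH weight of a bucket or OSD
    ceph osd crush reweight <bucket-or-osd-name> 3.5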

Re: [ceph-users] clients failing to advance oldest client/flush tid

2017-10-09 Thread John Spray
On Mon, Oct 9, 2017 at 5:52 PM, Jake Grimmett wrote: > Hi John, > > Many thanks for getting back to me. > > Yes, I did see the "experimental" label on snapshots... > > After reading other posts, I got the impression that cephfs snapshots > might be OK; provided you used a single active MDS and the

Re: [ceph-users] clients failing to advance oldest client/flush tid

2017-10-09 Thread Jake Grimmett
Hi John, Many thanks for getting back to me. Yes, I did see the "experimental" label on snapshots... After reading other posts, I got the impression that cephfs snapshots might be OK; provided you used a single active MDS and the latest ceph fuse client, both of which we have. Anyhow as you pre
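
For reference, snapshots are off by default on 12.2.x and have to be enabled per filesystem; a rough sketch, assuming the filesystem is named cephfs and the mountpoint below is a placeholder:

    # enable the (still experimental) snapshot feature
    ceph fs set cephfs allow_new_snaps true --yes-i-really-mean-it
    # snapshots are then taken by creating a directory under .snap
    mkdir /mnt/cephfs/some/dir/.snap/mysnap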

Re: [ceph-users] Ceph cache pool full

2017-10-09 Thread Gregory Farnum
On Fri, Oct 6, 2017 at 2:22 PM Shawfeng Dong wrote: > Here is a quick update. I found that a CephFS client process was accessing > the big 1TB file, which I think had a lock on the file, preventing the > flushing of objects to the underlying data pool. Once I killed that > process, objects starte
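
The manual flush the thread is circling around looks roughly like this (pool name and byte limit are placeholders):

    # force the cache tier to flush and evict everything it can
    rados -p <cache-pool> cache-flush-evict-all
    # give the tier a size target so the agent flushes on its own
    ceph osd pool set <cache-pool> target_max_bytes 1099511627776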

Re: [ceph-users] clients failing to advance oldest client/flush tid

2017-10-09 Thread John Spray
On Mon, Oct 9, 2017 at 9:21 AM, Jake Grimmett wrote: > Dear All, > > We have a new cluster based on v12.2.1 > > After three days of copying 300TB data into cephfs, > we have started getting the following Health errors: > > # ceph health > HEALTH_WARN 9 clients failing to advance oldest client/flus

[ceph-users] killing ceph-disk [was Re: ceph-volume: migration and disk partition support]

2017-10-09 Thread Sage Weil
To put this in context, the goal here is to kill ceph-disk in mimic. One proposal is to make it so new OSDs can *only* be deployed with LVM, and old OSDs with the ceph-disk GPT partitions would be started via ceph-volume support that can only start (but not deploy new) OSDs in that style. Is
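
For readers who have not used it yet, the ceph-volume LVM path being proposed looks roughly like this (device name is a placeholder):

    # create a new BlueStore OSD backed by LVM on a whole device
    ceph-volume lvm create --data /dev/sdb
    # show what ceph-volume knows about existing LVM-based OSDs
    ceph-volume lvm list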

Re: [ceph-users] Snapshot space

2017-10-09 Thread Jason Dillaman
No -- it means that your clone had written approximately 10GB of space within the image before you created the first snapshot. If the "fast-diff" feature is enabled, note that it only calculates usage in object size chunks (defaults to 4MB) -- which means that even writing 1 byte to a 4MB object wo
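
A quick way to sanity-check the granularity and the resulting numbers (image name is a placeholder):

    # "order 22" in the output means 4 MB objects, i.e. usage is counted in 4 MB steps
    rbd info cvm/<image>
    # usage per snapshot; with fast-diff this is an object-count estimate, so a
    # 10 GB figure means roughly 10 GB / 4 MB = 2560 objects were touched
    rbd du cvm/<image>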

Re: [ceph-users] cephfs: how to repair damaged mds rank?

2017-10-09 Thread Daniel Baumann
Hi John, On 10/09/2017 10:47 AM, John Spray wrote: > When a rank is "damaged", that means the MDS rank is blocked from > starting because Ceph thinks the on-disk metadata is damaged -- no > amount of restarting things will help. thanks. > The place to start with the investigation is to find the

Re: [ceph-users] cephfs: how to repair damaged mds rank?

2017-10-09 Thread John Spray
On Mon, Oct 9, 2017 at 8:17 AM, Daniel Baumann wrote: > Hi all, > > we have a Ceph Cluster (12.2.1) with 9 MDS ranks in multi-mds mode. > > "out of the blue", rank 6 is marked as damaged (and all other MDS are in > state up:resolve) and I can't bring the FS up again. > > 'ceph -s' says: > [...] >
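
For reference, once the underlying metadata problem has been investigated and dealt with, the rank is put back in service by clearing the damaged flag; a sketch for rank 6 as in the thread, not a substitute for the investigation John describes:

    # confirm which rank is flagged as damaged
    ceph mds stat
    # after the on-disk metadata issue is resolved, allow a standby to take rank 6 again
    ceph mds repaired 6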

[ceph-users] clients failing to advance oldest client/flush tid

2017-10-09 Thread Jake Grimmett
Dear All, We have a new cluster based on v12.2.1 After three days of copying 300TB data into cephfs, we have started getting the following Health errors: # ceph health HEALTH_WARN 9 clients failing to advance oldest client/flush tid; 1 MDSs report slow requests; 1 MDSs behind on trimming ceph-m

Re: [ceph-users] cephfs: how to repair damaged mds rank?

2017-10-09 Thread Daniel Baumann
On 10/09/2017 09:17 AM, Daniel Baumann wrote: > The relevant portion from the ceph-mds log (when starting mds9 which > should then take up rank 6; I'm happy to provide any logs): I've turned up the logging (see attachment)... Could it be that we hit this bug here? http://tracker.ceph.com/issues/17
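
Raising MDS verbosity, for anyone wanting to reproduce logs like the attached ones; the daemon name is a placeholder:

    # bump debug logging on a running MDS via its admin socket
    ceph daemon mds.<name> config set debug_mds 20
    ceph daemon mds.<name> config set debug_ms 1
    # or persistently in ceph.conf under [mds]:
    #   debug mds = 20
    #   debug ms = 1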

[ceph-users] cephfs: how to repair damaged mds rank?

2017-10-09 Thread Daniel Baumann
Hi all, we have a Ceph Cluster (12.2.1) with 9 MDS ranks in multi-mds mode. "out of the blue", rank 6 is marked as damaged (and all other MDS are in state up:resolve) and I can't bring the FS up again. 'ceph -s' says: [...] 1 filesystem is degraded 1 mds daemon damaged