Re: [ceph-users] Write operation to cephFS mount hangs

2018-07-31 Thread Gregory Farnum
On Tue, Jul 31, 2018 at 7:46 PM Bödefeld Sabine wrote: > Hello, we have a Ceph 10.2.10 cluster on VMs with Ubuntu 16.04, using Xen as the hypervisor. We use CephFS and the clients use ceph-fuse to access the files. Some of the ceph-fuse clients hang on write operations to the

Re: [ceph-users] Force cephfs delayed deletion

2018-07-31 Thread Yan, Zheng
On Wed, Aug 1, 2018 at 6:43 AM Kamble, Nitin A wrote: > Hi John, I am running ceph Luminous 12.2.1 release on the storage nodes with v4.4.114 kernel on the cephfs clients. 3 client nodes are running 3 instances of a test program. The test program is doing this repeatedly in

Re: [ceph-users] OMAP warning ( again )

2018-07-31 Thread Brad Hubbard
Search the cluster log for 'Large omap object found' for more details. On Wed, Aug 1, 2018 at 3:50 AM, Brent Kennedy wrote: > Upgraded from 12.2.5 to 12.2.6, got a “1 large omap objects” warning message, then upgraded to 12.2.7 and the message went away. I just added four OSDs to balance
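A minimal sketch of acting on that advice, assuming a default Luminous install (the cluster log path and the health-detail output are assumptions, not from the thread):

    # Show which pool the large omap object lives in (12.2.6+ reports it in health detail)
    ceph health detail

    # Search the cluster log on a monitor host for the offending object
    grep -i 'large omap object' /var/log/ceph/ceph.log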

Re: [ceph-users] Mgr cephx caps to run `ceph fs status`?

2018-07-31 Thread Linh Vu
Thanks John, that works! Also works with multiple commands, e.g. I granted my user access to both `ceph fs status` and `ceph status`: mgr 'allow command "fs status", allow command "status"'
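For reference, a sketch of applying such caps with ceph auth caps (the client name client.monitoring and the mon cap are illustrative assumptions, not from the thread):

    # Grant a non-admin client permission to run only these two mgr commands
    ceph auth caps client.monitoring \
        mon 'allow r' \
        mgr 'allow command "fs status", allow command "status"'

    # Verify the resulting caps
    ceph auth get client.monitoring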

Re: [ceph-users] Force cephfs delayed deletion

2018-07-31 Thread Kamble, Nitin A
Hi John, I am running the ceph Luminous 12.2.1 release on the storage nodes, with the v4.4.114 kernel on the cephfs clients. 3 client nodes are running 3 instances of a test program. The test program is doing this repeatedly in a loop: * sequentially write a 256GB file on cephfs * delete the
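A rough shell equivalent of that loop, assuming /mnt/cephfs is the client mount point (the file name and dd parameters are illustrative only, not the poster's actual test program):

    # Repeatedly write a 256 GB sequential file and delete it again
    while true; do
        dd if=/dev/zero of=/mnt/cephfs/testfile bs=1M count=262144
        rm /mnt/cephfs/testfile
    done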

Re: [ceph-users] Whole cluster flapping

2018-07-31 Thread Brent Kennedy
I have had this happen during large data movements. It stopped happening after I went to 10Gb, though (from 1Gb). What I had done was inject a setting (and adjust the configs) to give more time before an OSD was marked down: osd heartbeat grace = 200, mon osd down out interval = 900. For
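A sketch of one way to apply those two settings, at runtime and persistently (the values are from the message; the injectargs approach and the ceph.conf section are assumptions about the setup):

    # Runtime injection, effective until the daemons restart
    ceph tell osd.* injectargs '--osd-heartbeat-grace 200'
    ceph tell mon.* injectargs '--mon-osd-down-out-interval 900'

    # Persistent equivalent in /etc/ceph/ceph.conf
    # [global]
    #     osd heartbeat grace = 200
    #     mon osd down out interval = 900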

[ceph-users] Hiring: Ceph community manager

2018-07-31 Thread Rich Bowen
Hi, folks, The Open Source and Standards (OSAS) group at Red Hat is hiring a Ceph Community Manager. If you're interested, check out the job listing here: https://us-redhat.icims.com/jobs/64407/ceph-community-manager/job If you'd like to talk to someone about what's involved in being a

[ceph-users] OMAP warning ( again )

2018-07-31 Thread Brent Kennedy
Upgraded from 12.2.5 to 12.2.6, got a "1 large omap objects" warning message, then upgraded to 12.2.7 and the message went away. I just added four OSDs to balance out the cluster ( we had some servers with fewer drives in them; jbod config ) and now the "1 large omap objects" warning message is

Re: [ceph-users] CephFS Snapshots in Mimic

2018-07-31 Thread Ken Dreyer
On Tue, Jul 31, 2018 at 9:23 AM, Kenneth Waegeman wrote: > Thanks David and John, That sounds logical now. When I did read "To make a snapshot on directory “/1/2/3/”, the client invokes “mkdir” on “/1/2/3/.snap” directory (http://docs.ceph.com/docs/master/dev/cephfs-snapshots/)" it didn't

Re: [ceph-users] CephFS Snapshots in Mimic

2018-07-31 Thread Kenneth Waegeman
Thanks David and John, that sounds logical now. When I read "To make a snapshot on directory “/1/2/3/”, the client invokes “mkdir” on “/1/2/3/.snap” directory (http://docs.ceph.com/docs/master/dev/cephfs-snapshots/)", it didn't come to mind that I should immediately create a subdirectory there.
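In other words, the snapshot itself is just a named directory you mkdir under .snap; a minimal sketch, assuming the filesystem is mounted at /mnt/cephfs and snapshots are enabled:

    # Create a snapshot of /1/2/3 by making a named directory under .snap
    mkdir /mnt/cephfs/1/2/3/.snap/my-snapshot

    # List and remove snapshots the same way
    ls /mnt/cephfs/1/2/3/.snap
    rmdir /mnt/cephfs/1/2/3/.snap/my-snapshot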

[ceph-users] RBD mirroring replicated and erasure coded pools

2018-07-31 Thread Ilja Slepnev
Hi, is it possible to establish RBD mirroring between replicated and erasure coded pools? I'm trying to set up replication as described on http://docs.ceph.com/docs/master/rbd/rbd-mirroring/ without success. Ceph 12.2.5 Luminous. root@local:~# rbd --cluster local mirror pool enable rbd-2 pool
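For context, the documented pool-mode setup looks roughly like this (pool and cluster names follow the prompt above; whether an erasure-coded data pool can participate is exactly the open question here):

    # On each cluster, enable pool-mode mirroring on the pool
    rbd --cluster local  mirror pool enable rbd-2 pool
    rbd --cluster remote mirror pool enable rbd-2 pool

    # Register the peer and check mirroring status
    rbd --cluster local mirror pool peer add rbd-2 client.remote@remote
    rbd --cluster local mirror pool status rbd-2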

Re: [ceph-users] CephFS Snapshots in Mimic

2018-07-31 Thread John Spray
On Tue, Jul 31, 2018 at 3:45 PM Kenneth Waegeman wrote: > Hi all, I updated an existing Luminous cluster to Mimic 13.2.1. All daemons were updated, so I did ceph osd require-osd-release mimic, so everything seems up to date. I want to try the snapshots in Mimic, since this should be

Re: [ceph-users] CephFS Snapshots in Mimic

2018-07-31 Thread David Disseldorp
Hi Kenneth, On Tue, 31 Jul 2018 16:44:36 +0200, Kenneth Waegeman wrote: > Hi all, I updated an existing Luminous cluster to Mimic 13.2.1. All daemons were updated, so I did ceph osd require-osd-release mimic, so everything seems up to date. I want to try the snapshots in Mimic,

Re: [ceph-users] Write operation to cephFS mount hangs

2018-07-31 Thread Bödefeld Sabine
Hello Eugen, yes, all of the clients use the same credentials for authentication. I’ve mounted the cephFS on about 10 VMs and it works on only about 4 of them. We have used this setup before, but on Ubuntu 14.04 with ceph 0.94.1, ceph-deploy 1.5.35 and ceph-fuse 0.80.11. In dmesg there is

[ceph-users] CephFS Snapshots in Mimic

2018-07-31 Thread Kenneth Waegeman
Hi all, I updated an existing Luminous cluster to Mimic 13.2.1. All daemons were updated, so I did ceph osd require-osd-release mimic, so everything seems up to date. I want to try the snapshots in Mimic, since this should be stable, so I ran: [root@osd2801 alleee]# ceph fs set cephfs
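The command above is cut off in the archive; for reference, enabling new snapshots on a filesystem is documented as something along these lines (the filesystem name cephfs is taken from the prompt, the rest is an assumption about what was being run):

    ceph fs set cephfs allow_new_snaps true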

Re: [ceph-users] Whole cluster flapping

2018-07-31 Thread Webert de Souza Lima
The pool deletion might have triggered a lot of IO operations on the disks, and the OSD processes might be too busy to respond to heartbeats, so the mons mark them as down due to no response. Also check the OSD logs to see if they are actually crashing and restarting, and check disk IO usage (e.g. with iostat).
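A minimal sketch of those checks on an OSD host (the OSD id, log path and grep patterns are assumptions about a default install):

    # Disk utilisation, refreshed every 5 seconds
    iostat -x 5

    # Look for crashes/restarts and busy-heartbeat messages for a given OSD
    grep -i -e 'abort' -e 'heartbeat_map' /var/log/ceph/ceph-osd.12.log
    systemctl status ceph-osd@12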

Re: [ceph-users] Write operation to cephFS mount hangs

2018-07-31 Thread Eugen Block
Hi, Some of the ceph-fuse clients hang on write operations to the cephFS. Do all the clients use the same credentials for authentication? Have you tried to mount the filesystem with the same credentials as your VMs do and then tried to create files? Has it worked before or is this a new
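A sketch of such a test mount with ceph-fuse (the client name, keyring path and mount point are assumptions):

    # Mount with the same client credentials the VMs use, then try a write
    ceph-fuse -n client.vmuser -k /etc/ceph/ceph.client.vmuser.keyring /mnt/cephfs-test
    touch /mnt/cephfs-test/write-test && echo 'write ok'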

[ceph-users] Write operation to cephFS mount hangs

2018-07-31 Thread Bödefeld Sabine
Hello, we have a Ceph 10.2.10 cluster on VMs with Ubuntu 16.04, using Xen as the hypervisor. We use CephFS and the clients use ceph-fuse to access the files. Some of the ceph-fuse clients hang on write operations to the cephFS. On copying a file to the cephFS, the file is created but it's

Re: [ceph-users] Intermittent client reconnect delay following node fail

2018-07-31 Thread John Spray
On Tue, Jul 31, 2018 at 12:33 AM William Lawton wrote: > Hi. We have recently set up our first ceph cluster (4 nodes), but our node failure tests have revealed an intermittent problem. When we take down a node (i.e. by powering it off), most of the time all clients reconnect to the

Re: [ceph-users] Mgr cephx caps to run `ceph fs status`?

2018-07-31 Thread John Spray
On Tue, Jul 31, 2018 at 3:36 AM Linh Vu wrote: > Hi all, I want a non-admin client to be able to run `ceph fs status`, either via the ceph CLI or a python script. Adding `mgr "allow *"` to this client's cephx caps works, but I'd like to be more specific if possible. I can't find the

Re: [ceph-users] Mimi Telegraf plugin on Luminous

2018-07-31 Thread Denny Fuchs
hi, found the issue: the problem was wrong syntax on the Telegraf side, so Telegraf did not create the socket, and because there was no socket I got the "no such file ..." error. I later tried udp/tcp, but I hadn't read correctly that Telegraf itself creates the necessary input :-D Now it works :-) cu
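For anyone hitting the same error: the "necessary input" is a socket listener in the Telegraf configuration, roughly like the following (the address, data format and config path are assumptions matching a UDP setup):

    # Telegraf side: create the listening socket the ceph mgr module sends to
    cat > /etc/telegraf/telegraf.d/ceph-listener.conf <<'EOF'
    [[inputs.socket_listener]]
      service_address = "udp://:8094"
      data_format = "influx"
    EOF
    systemctl restart telegraf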

Re: [ceph-users] Mimi Telegraf plugin on Luminous

2018-07-31 Thread Wido den Hollander
On 07/31/2018 09:38 AM, Denny Fuchs wrote: > hi, I am trying to get the Telegraf plugin from Mimic running on Luminous (Debian Stretch). I copied the files from Git into /usr/lib/ceph/mgr/telegraf, enabled the plugin and get: 2018-07-31 09:25:46.501858 7f496cfc9700 -1

[ceph-users] Whole cluster flapping

2018-07-31 Thread CUZA Frédéric
Hi Everyone, I just upgraded our cluster to Luminous 12.2.7 and deleted a quite large pool that we had (120 TB). Our cluster is made of 14 nodes, each composed of 12 OSDs (1 HDD -> 1 OSD); we have SSDs for the journals. After I deleted the large pool, my cluster started flapping on all OSDs.
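A common stop-gap while heavy deletion I/O settles is to temporarily stop OSDs from being marked down or out; this is a general mitigation sketch, not advice given in this thread:

    # Temporarily suppress mark-down/mark-out while the cluster churns
    ceph osd set nodown
    ceph osd set noout

    # ...once the backlog clears:
    ceph osd unset nodown
    ceph osd unset noout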

[ceph-users] Mimi Telegraf plugin on Luminous

2018-07-31 Thread Denny Fuchs
hi, I am trying to get the Telegraf plugin from Mimic running on Luminous (Debian Stretch). I copied the files from Git into /usr/lib/ceph/mgr/telegraf, enabled the plugin, and get: 2018-07-31 09:25:46.501858 7f496cfc9700 -1 log_channel(cluster) log [ERR] : Unhandled exception from module
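For reference, the enable step looks roughly like this once the module files are in place on every mgr host (using the short hostname as the mgr unit id is an assumption):

    # After copying the Mimic telegraf module into /usr/lib/ceph/mgr/telegraf
    systemctl restart ceph-mgr@$(hostname -s)
    ceph mgr module enable telegraf
    ceph mgr module ls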

Re: [ceph-users] Self shutdown of 1 whole system: Oops, it did it again (not yet anymore)

2018-07-31 Thread Nicolas Huillard
Hi all, The latest hint I received (thanks!) was to replace some failing hardware. Before that, I updated the BIOS, which included a CPU microcode fix for Meltdown/Spectre and probably other things. Last time I had checked, the vendor didn't have that fix yet. Since this update, no CATERR

Re: [ceph-users] Enable daemonperf - no stats selected by filters

2018-07-31 Thread Marc Roos
Luminous 12.2.7
[@c01 ~]# rpm -qa | grep ceph-
ceph-mon-12.2.7-0.el7.x86_64
ceph-selinux-12.2.7-0.el7.x86_64
ceph-osd-12.2.7-0.el7.x86_64
ceph-mgr-12.2.7-0.el7.x86_64
ceph-12.2.7-0.el7.x86_64
ceph-common-12.2.7-0.el7.x86_64
ceph-mds-12.2.7-0.el7.x86_64
ceph-radosgw-12.2.7-0.el7.x86_64