Re: [ceph-users] CephFS with cache-tier kernel-mount client unable to write (Nautilus)

2020-01-22 Thread Hayashida, Mami
Thanks, Ilya. I just tried modifying the osd cap for client.testuser by getting rid of "tag cephfs data=cephfs_test" part and confirmed this key does work (i.e. lets the CephFS client read/write). It now reads: [client.testuser] key = XXXZZZ caps mds = "allow rw" caps mon = "allow r" caps os
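For reference, a client's caps can be updated in place with `ceph auth caps`. A minimal sketch of the approach discussed here, assuming the data pool and its cache-tier pool are called cephfs_data and cephfs_cache (the real pool names are not shown in full above):

    ceph auth caps client.testuser \
        mds 'allow rw' \
        mon 'allow r' \
        osd 'allow rw pool=cephfs_data, allow rw pool=cephfs_cache'

Naming the cache pool explicitly sidesteps the "tag cephfs data=..." restriction, which apparently did not cover writes landing in the cache-tier pool.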

Re: [ceph-users] CephFS with cache-tier kernel-mount client unable to write (Nautilus)

2020-01-21 Thread Ilya Dryomov
On Tue, Jan 21, 2020 at 7:51 PM Hayashida, Mami wrote: > > Ilya, > > Thank you for your suggestions! > > `dmesg` (on the client node) only had `libceph: mon0 10.33.70.222:6789 socket > error on write`. No further detail. But using the admin key (client.admin) > for mounting CephFS solved my pro

Re: [ceph-users] CephFS with cache-tier kernel-mount client unable to write (Nautilus)

2020-01-21 Thread Hayashida, Mami
Ilya, Thank you for your suggestions! `dmesg` (on the client node) only had `libceph: mon0 10.33.70.222:6789 socket error on write`. No further detail. But using the admin key (client.admin) for mounting CephFS solved my problem. I was able to write successfully! :-) $ sudo mount -t ceph 10.33
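The kernel mount in question would look roughly like this; only the monitor address appears in the message, so the mount point and secret-file path below are assumptions:

    sudo mount -t ceph 10.33.70.222:6789:/ /mnt/cephfs \
        -o name=admin,secretfile=/etc/ceph/admin.secret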

Re: [ceph-users] CephFS with cache-tier kernel-mount client unable to write (Nautilus)

2020-01-21 Thread Ilya Dryomov
On Tue, Jan 21, 2020 at 6:02 PM Hayashida, Mami wrote: > > I am trying to set up a CephFS with a Cache Tier (for data) on a mini test > cluster, but a kernel-mount CephFS client is unable to write. Cache tier > setup alone seems to be working fine (I tested it with `rados put` and `osd > map`

[ceph-users] CephFS with cache-tier kernel-mount client unable to write (Nautilus)

2020-01-21 Thread Hayashida, Mami
I am trying to set up a CephFS with a Cache Tier (for data) on a mini test cluster, but a kernel-mount CephFS client is unable to write. Cache tier setup alone seems to be working fine (I tested it with `rados put` and `osd map` commands to verify on which OSDs the objects are placed) and setting
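For context, a writeback cache tier is placed in front of a CephFS data pool along these lines (pool names here are placeholders, not the ones from this test cluster):

    ceph osd tier add cephfs_data cephfs_cache
    ceph osd tier cache-mode cephfs_cache writeback
    ceph osd tier set-overlay cephfs_data cephfs_cache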

Re: [ceph-users] cephfs kernel client io performance decreases extremely

2019-12-31 Thread ste...@bit.nl
Quoting renjianxinlover (renjianxinlo...@163.com): > hi, Stefan > could you please provide further guidance? https://docs.ceph.com/docs/master/cephfs/troubleshooting/#slow-requests-mds Do a "dump ops in flight" to see what's going on on the MDS. https://docs.ceph.com/docs/master/cephfs/trou
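The "dump ops in flight" check mentioned above is an admin-socket command run on the host of the active MDS; replace <name> with whatever `ceph fs status` reports as the active daemon:

    ceph daemon mds.<name> dump_ops_in_flight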

Re: [ceph-users] cephfs kernel client io performance decreases extremely

2019-12-29 Thread renjianxinlover
Hi, Stefan, could you please provide further guidance? Brs | | renjianxinlover | | renjianxinlo...@163.com | (signature customized by NetEase Mail Master) On 12/28/2019 21:44, renjianxinlover wrote: Sorry, what I said was fuzzy before. Currently, my mds is running with certain osds at the same node, in which an SSD drive serves as ca

Re: [ceph-users] cephfs kernel client io performance decreases extremely

2019-12-28 Thread renjianxinlover
Sorry, what I said was fuzzy before. Currently, my mds is running with certain osds at the same node, in which an SSD drive serves as the cache device. | | renjianxinlover | | renjianxinlo...@163.com | (signature customized by NetEase Mail Master) On 12/28/2019 15:49, Stefan Kooman wrote: Quoting renjianxinlover (renjianxinlo...@163.com): Hi

Re: [ceph-users] cephfs kernel client io performance decreases extremely

2019-12-27 Thread Stefan Kooman
Quoting renjianxinlover (renjianxinlo...@163.com): > Hi, Nathan, thanks for your quick reply! > command 'ceph status' outputs warning including about ten clients failing to > respond to cache pressure; > in addition, in mds node, 'iostat -x 1' shows drive io usage of mds within > five seconds as f

Re: [ceph-users] cephfs kernel client io performance decreases extremely

2019-12-27 Thread renjianxinlover
Hi, Nathan, thanks for your quick reply! command 'ceph status' outputs warning including about ten clients failing to respond to cache pressure; in addition, in mds node, 'iostat -x 1' shows drive io usage of mds within five seconds as follows, Device: rrqm/s wrqm/s r/s w/s rkB

Re: [ceph-users] cephfs kernel client io performance decreases extremely

2019-12-26 Thread Nathan Fish
I would start by viewing "ceph status", drive IO with: "iostat -x 1 /dev/sd{a..z}" and the CPU/RAM usage of the active MDS. If "ceph status" warns that the MDS cache is oversized, that may be an easy fix. On Thu, Dec 26, 2019 at 7:33 AM renjianxinlover wrote: > hello, >recently, after de
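If `ceph status` does warn that the MDS cache is oversized, the usual knob is mds_cache_memory_limit. A sketch with an arbitrary 8 GiB example value; `ceph config set` assumes Mimic or newer, older clusters set the option in ceph.conf instead:

    ceph config set mds mds_cache_memory_limit 8589934592    # 8 GiB, example value only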

[ceph-users] cephfs kernel client io performance decreases extremely

2019-12-26 Thread renjianxinlover
hello, recently, after deleting some fs data in a small-scale ceph cluster, some clients' IO performance became bad, especially latency. For example, opening a tiny text file with vim may consume nearly twenty seconds. I am not clear about how to diagnose the cause; could anyone give some

Re: [ceph-users] CephFS "denied reconnect attempt" after updating Ceph

2019-12-11 Thread William Edwards
It seems like this has been fixed with client kernel version 4.19.0-0.bpo.5-amd64. -- Regards, William Edwards Tuxis Internet Engineering - Original message - From: William Edwards (wedwa...@tuxis.nl) Date: 08/13/19 16:46 To: ceph-users@lists.ceph.com Subject: [ceph-users] CephFS

[ceph-users] Cephfs metadata fix tool

2019-12-07 Thread Robert LeBlanc
Our Jewel cluster is exhibiting some similar issues to the one in this thread [0] and it was indicated that a tool would need to be written to fix that kind of corruption. Has the tool been written? How would I go about repairing these 16EB directories that won't delete? Thank you, Robert LeBlanc [0]

Re: [ceph-users] CephFS kernel module lockups in Ubuntu linux-image-5.0.0-32-generic?

2019-11-06 Thread Simon Oosthoek
I finally took the time to report the bug: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1851470 On 29/10/2019 10:44, Simon Oosthoek wrote: > On 24/10/2019 16:23, Christopher Wieringa wrote: >> Hello all, >> >>   >> >> I’ve been using the Ceph kernel modules in Ubuntu to load a CephFS >> f

Re: [ceph-users] cephfs 1 large omap objects

2019-10-30 Thread Patrick Donnelly
On Wed, Oct 30, 2019 at 9:28 AM Jake Grimmett wrote: > > Hi Zheng, > > Many thanks for your helpful post, I've done the following: > > 1) set the threshold to 1024 * 1024: > > # ceph config set osd \ > osd_deep_scrub_large_omap_object_key_threshold 1048576 > > 2) deep scrubbed all of the pgs on th

Re: [ceph-users] cephfs 1 large omap objects

2019-10-30 Thread Jake Grimmett
Hi Zheng, Many thanks for your helpful post, I've done the following: 1) set the threshold to 1024 * 1024: # ceph config set osd \ osd_deep_scrub_large_omap_object_key_threshold 1048576 2) deep scrubbed all of the pgs on the two OSD that reported "Large omap object found." - these were all in p
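For anyone repeating step 2, the PGs hosted on a reporting OSD can be listed and then deep-scrubbed individually; the OSD and PG ids below are placeholders:

    ceph pg ls-by-osd <osd-id>
    ceph pg deep-scrub <pgid>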

Re: [ceph-users] CephFS client hanging and cache issues

2019-10-30 Thread Bob Farrell
Thanks a lot and sorry for the spam, I should have checked! We are on 18.04, kernel is currently upgrading so if you don't hear back from me then it is fixed. Thanks for the amazing support! On Wed, 30 Oct 2019, 09:54 Lars Täuber, wrote: > Hi. > > Sounds like you use kernel clients with kerne

Re: [ceph-users] CephFS client hanging and cache issues

2019-10-30 Thread Lars Täuber
Hi. Sounds like you use kernel clients with kernels from canonical/ubuntu. Two kernels have a bug: 4.15.0-66 and 5.0.0-32 Updated kernels are said to have fixes. Older kernels also work: 4.15.0-65 and 5.0.0-31 Lars Wed, 30 Oct 2019 09:42:16 + Bob Farrell ==> ceph-users : > Hi. We are ex

Re: [ceph-users] CephFS client hanging and cache issues

2019-10-30 Thread Paul Emmerich
Kernel bug due to a bad backport, see recent posts here. Paul -- Paul Emmerich Looking for help with your Ceph cluster? Contact us at https://croit.io croit GmbH Freseniusstr. 31h 81247 München www.croit.io Tel: +49 89 1896585 90 On Wed, Oct 30, 2019 at 10:42 AM Bob Farrell wrote: > > Hi. W

[ceph-users] CephFS client hanging and cache issues

2019-10-30 Thread Bob Farrell
Hi. We are experiencing a CephFS client issue on one of our servers. ceph version 14.2.0 (3a54b2b6d167d4a2a19e003a705696d4fe619afc) nautilus (stable) Trying to access, `umount`, or `umount -f` a mounted CephFS volumes causes my shell to hang indefinitely. After a reboot I can remount the volumes

Re: [ceph-users] cephfs 1 large omap objects

2019-10-29 Thread Yan, Zheng
see https://tracker.ceph.com/issues/42515. just ignore the warning for now On Mon, Oct 7, 2019 at 7:50 AM Nigel Williams wrote: > > Out of the blue this popped up (on an otherwise healthy cluster): > > HEALTH_WARN 1 large omap objects > LARGE_OMAP_OBJECTS 1 large omap objects > 1 large objec

Re: [ceph-users] CephFS kernel module lockups in Ubuntu linux-image-5.0.0-32-generic?

2019-10-29 Thread Simon Oosthoek
On 24/10/2019 16:23, Christopher Wieringa wrote: > Hello all, > >   > > I’ve been using the Ceph kernel modules in Ubuntu to load a CephFS > filesystem quite successfully for several months.  Yesterday, I went > through a round of updates on my Ubuntu 18.04 machines, which loaded > linux-image-5.

[ceph-users] CephFS Ganesha NFS for VMWare

2019-10-29 Thread Glen Baars
Hello Ceph Users, I am trialing CephFS / Ganesha NFS for VMWare usage. We are on Mimic / Centos 7.7 / 130 x 12TB 7200rpm OSDs / 13 hosts / 3 replica. So far the read performance has been great. The write performance ( NFS sync ) hasn't been great. We use a lot of 64KB NFS read / writes and the

Re: [ceph-users] cephfs 1 large omap objects

2019-10-28 Thread Jake Grimmett
Hi Paul, Nigel, I'm also seeing "HEALTH_WARN 6 large omap objects" warnings with cephfs after upgrading to 14.2.4: The affected osd's are used (only) by the metadata pool: POOL ID STORED OBJECTS USED %USED MAX AVAIL mds_ssd 1 64 GiB 1.74M 65 GiB 4.47 466 GiB See below for more log de

Re: [ceph-users] CephFS kernel module lockups in Ubuntu linux-image-5.0.0-32-generic?

2019-10-24 Thread Ilya Dryomov
On Thu, Oct 24, 2019 at 5:45 PM Paul Emmerich wrote: > > Could it be related to the broken backport as described in > https://tracker.ceph.com/issues/40102 ? > > (It did affect 4.19, not sure about 5.0) It does, I have just updated the linked ticket to reflect that. Thanks, Ilya

Re: [ceph-users] CephFS kernel module lockups in Ubuntu linux-image-5.0.0-32-generic?

2019-10-24 Thread Sasha Litvak
Also, search for this topic on the list. Ubuntu Disco with the most recent kernel 5.0.0-32 seems to be unstable. On Thu, Oct 24, 2019 at 10:45 AM Paul Emmerich wrote: > Could it be related to the broken backport as described in > https://tracker.ceph.com/issues/40102 ? > > (It did affect 4.19, not

Re: [ceph-users] CephFS kernel module lockups in Ubuntu linux-image-5.0.0-32-generic?

2019-10-24 Thread Paul Emmerich
Could it be related to the broken backport as described in https://tracker.ceph.com/issues/40102 ? (It did affect 4.19, not sure about 5.0) Paul -- Paul Emmerich Looking for help with your Ceph cluster? Contact us at https://croit.io croit GmbH Freseniusstr. 31h 81247 München www.croit.io Tel

[ceph-users] CephFS kernel module lockups in Ubuntu linux-image-5.0.0-32-generic?

2019-10-24 Thread Christopher Wieringa
Hello all, I've been using the Ceph kernel modules in Ubuntu to load a CephFS filesystem quite successfully for several months. Yesterday, I went through a round of updates on my Ubuntu 18.04 machines, which loaded linux-image-5.0.0-32-generic as the kernel. I'm noticing that while the kernel

Re: [ceph-users] cephfs 1 large omap objects

2019-10-08 Thread Paul Emmerich
Hi, the default for this warning changed recently (see other similar threads on the mailing list), it was 2 million before 14.2.3. I don't think the new default of 200k is a good choice, so increasing it is a reasonable work-around. Paul -- Paul Emmerich Looking for help with your Ceph cluste

Re: [ceph-users] cephfs 1 large omap objects

2019-10-06 Thread Nigel Williams
I've adjusted the threshold: ceph config set osd osd_deep_scrub_large_omap_object_key_threshold 35 A colleague suggested that this will take effect on the next deep-scrub. Is the default of 200,000 too small? Will this be adjusted in future releases or is it meant to be adjusted in some use-ca

Re: [ceph-users] cephfs 1 large omap objects

2019-10-06 Thread Nigel Williams
I followed some other suggested steps, and have this: root@cnx-17:/var/log/ceph# zcat ceph-osd.178.log.?.gz|fgrep Large 2019-10-02 13:28:39.412 7f482ab1c700 0 log_channel(cluster) log [WRN] : Large omap object found. Object: 2:654134d2:::mds0_openfiles.0:head Key count: 306331 Size (bytes): 13993

[ceph-users] cephfs 1 large omap objects

2019-10-06 Thread Nigel Williams
Out of the blue this popped up (on an otherwise healthy cluster): HEALTH_WARN 1 large omap objects LARGE_OMAP_OBJECTS 1 large omap objects 1 large objects found in pool 'cephfs_metadata' Search the cluster log for 'Large omap object found' for more details. "Search the cluster log" is som

Re: [ceph-users] cephfs performance issue MDSs report slow requests and osd memory usage

2019-09-24 Thread Robert LeBlanc
On Tue, Sep 24, 2019 at 4:33 AM Thomas <74cmo...@gmail.com> wrote: > > Hi, > > I'm experiencing the same issue with this setting in ceph.conf: > osd op queue = wpq > osd op queue cut off = high > > Furthermore I cannot read any old data in the relevant pool that is > serving CephFS.

Re: [ceph-users] cephfs performance issue MDSs report slow requests and osd memory usage

2019-09-24 Thread Thomas
Hi, I'm experiencing the same issue with these settings in ceph.conf: osd op queue = wpq; osd op queue cut off = high. Furthermore, I cannot read any old data in the relevant pool that is serving CephFS. However, I can write new data and read this new data. Regards Thomas On 24.09.20
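For reference, the central-config equivalent of those ceph.conf lines is below; as far as I know both options are only read at OSD start-up, so a restart of the OSDs is needed either way:

    ceph config set osd osd_op_queue wpq
    ceph config set osd osd_op_queue_cut_off high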

Re: [ceph-users] cephfs performance issue MDSs report slow requests and osd memory usage

2019-09-24 Thread Yoann Moulin
Hello, >> I have a Ceph Nautilus Cluster 14.2.1 for cephfs only on 40x 1.8T SAS disk >> (no SSD) in 20 servers. >> >> I often get "MDSs report slow requests" and plenty of "[WRN] 3 slow >> requests, 0 included below; oldest blocked for > 60281.199503 secs" >> >> After a few investigations, I saw

Re: [ceph-users] cephfs performance issue MDSs report slow requests and osd memory usage

2019-09-23 Thread Robert LeBlanc
On Thu, Sep 19, 2019 at 2:36 AM Yoann Moulin wrote: > > Hello, > > I have a Ceph Nautilus Cluster 14.2.1 for cephfs only on 40x 1.8T SAS disk > (no SSD) in 20 servers. > > > cluster: > > id: 778234df-5784-4021-b983-0ee1814891be > > health: HEALTH_WARN > > 2 MDSs report s

[ceph-users] cephfs performance issue MDSs report slow requests and osd memory usage

2019-09-19 Thread Yoann Moulin
Hello, I have a Ceph Nautilus Cluster 14.2.1 for cephfs only on 40x 1.8T SAS disk (no SSD) in 20 servers. > cluster: > id: 778234df-5784-4021-b983-0ee1814891be > health: HEALTH_WARN > 2 MDSs report slow requests > > services: > mon: 3 daemons, quorum icadmin006,

Re: [ceph-users] CephFS deletion performance

2019-09-18 Thread Hector Martin
On 17/09/2019 17.46, Yan, Zheng wrote: > when a snapshoted directory is deleted, mds moves the directory into > to stray directory. You have 57k strays, each time mds have a cache > miss for stray, mds needs to load a stray dirfrag. This is very > inefficient because a stray dirfrag contains lots
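The stray count referred to here can be watched via the MDS perf counters on the active MDS (the daemon name is a placeholder):

    ceph daemon mds.<name> perf dump | grep -i stray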

Re: [ceph-users] cephfs: apache locks up after parallel reloads on multiple nodes

2019-09-17 Thread Gregory Farnum
On Tue, Sep 17, 2019 at 8:12 AM Sander Smeenk wrote: > > Quoting Paul Emmerich (paul.emmer...@croit.io): > > > Yeah, CephFS is much closer to POSIX semantics for a filesystem than > > NFS. There's an experimental relaxed mode called LazyIO but I'm not > > sure if it's applicable here. > > Out of c

Re: [ceph-users] cephfs: apache locks up after parallel reloads on multiple nodes

2019-09-17 Thread Sander Smeenk
Quoting Paul Emmerich (paul.emmer...@croit.io): > Yeah, CephFS is much closer to POSIX semantics for a filesystem than > NFS. There's an experimental relaxed mode called LazyIO but I'm not > sure if it's applicable here. Out of curiosity, how would CephFS being more POSIX compliant cause this muc

Re: [ceph-users] CephFS deletion performance

2019-09-17 Thread Yan, Zheng
On Sat, Sep 14, 2019 at 8:57 PM Hector Martin wrote: > > On 13/09/2019 16.25, Hector Martin wrote: > > Is this expected for CephFS? I know data deletions are asynchronous, but > > not being able to delete metadata/directories without an undue impact on > > the whole filesystem performance is somew

Re: [ceph-users] CephFS deletion performance

2019-09-14 Thread Hector Martin
On 13/09/2019 16.25, Hector Martin wrote: > Is this expected for CephFS? I know data deletions are asynchronous, but > not being able to delete metadata/directories without an undue impact on > the whole filesystem performance is somewhat problematic. I think I'm getting a feeling for who the cu

Re: [ceph-users] CephFS client-side load issues for write-/delete-heavy workloads

2019-09-13 Thread Janek Bevendorff
Here's some more information on this issue. I found the MDS host not to have any load issues, but other clients who have the FS mounted cannot execute statfs/fstatfs on the mount, since the call never returns while my rsync job is running. Other syscalls like fstat work without problems. Thus,

[ceph-users] CephFS client-side load issues for write-/delete-heavy workloads

2019-09-13 Thread Janek Bevendorff
Hi, There have been various stability issues with the MDS that I reported a while ago and most of them have been addressed and fixes will be available in upcoming patch releases. However, there also seem to be problems on the client side, which I have not reported so far. Note: This report i

[ceph-users] CephFS deletion performance

2019-09-13 Thread Hector Martin
We have a cluster running CephFS with metadata on SSDs and data split between SSDs and HDDs (main pool is on HDDs, some subtrees are on an SSD pool). We're seeing quite poor deletion performance, especially for directories. It seems that previously empty directories are often deleted quickly,

Re: [ceph-users] cephfs: apache locks up after parallel reloads on multiple nodes

2019-09-12 Thread jesper
Thursday, 12 September 2019, 17.16 +0200 from Paul Emmerich : >Yeah, CephFS is much closer to POSIX semantics for a filesystem than >NFS. There's an experimental relaxed mode called LazyIO but I'm not >sure if it's applicable here. > >You can debug this by dumping slow requests from the MDS se

Re: [ceph-users] cephfs: apache locks up after parallel reloads on multiple nodes

2019-09-12 Thread Paul Emmerich
Yeah, CephFS is much closer to POSIX semantics for a filesystem than NFS. There's an experimental relaxed mode called LazyIO but I'm not sure if it's applicable here. You can debug this by dumping slow requests from the MDS servers via the admin socket Paul -- Paul Emmerich Looking for help w
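Concretely, the admin-socket commands being suggested would be along these lines, run on the active MDS host:

    ceph daemon mds.<name> dump_ops_in_flight    # in-flight requests, including stuck ones, with the originating client
    ceph daemon mds.<name> session ls            # list client sessions to map client ids back to hosts/mounts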

[ceph-users] cephfs: apache locks up after parallel reloads on multiple nodes

2019-09-12 Thread Stefan Kooman
Dear list, We recently switched the shared storage for our linux shared hosting platforms from "nfs" to "cephfs". Performance improvement are noticeable. It all works fine, however, there is one peculiar thing: when Apache reloads after a logrotate of the "error" logs all but one node will hang fo

Re: [ceph-users] cephfs-snapshots causing mds failover, hangs

2019-08-27 Thread thoralf schulze
hi Zheng, On 8/26/19 3:31 PM, Yan, Zheng wrote: […] > change code to : […] we can happily confirm that this resolves the issue. thank you _very_ much & with kind regards, t. signature.asc Description: OpenPGP digital signature ___ ceph-users mailin

Re: [ceph-users] cephfs full, 2/3 Raw capacity used

2019-08-26 Thread Mark Nelson
On 8/26/19 7:39 AM, Wido den Hollander wrote: On 8/26/19 1:35 PM, Simon Oosthoek wrote: On 26-08-19 13:25, Simon Oosthoek wrote: On 26-08-19 13:11, Wido den Hollander wrote: The reweight might actually cause even more confusion for the balancer. The balancer uses upmap mode and that re-allo

Re: [ceph-users] cephfs-snapshots causing mds failover, hangs

2019-08-26 Thread Yan, Zheng
On Mon, Aug 26, 2019 at 9:25 PM thoralf schulze wrote: > > hi Zheng - > > On 8/26/19 2:55 PM, Yan, Zheng wrote: > > I tracked down the bug > > https://tracker.ceph.com/issues/41434 > > wow, that was quick - thank you for investigating. we are looking > forward for the fix :-) > > in the meantime,

Re: [ceph-users] cephfs-snapshots causing mds failover, hangs

2019-08-26 Thread thoralf schulze
hi Zheng - On 8/26/19 2:55 PM, Yan, Zheng wrote: > I tracked down the bug > https://tracker.ceph.com/issues/41434 wow, that was quick - thank you for investigating. we are looking forward for the fix :-) in the meantime, is there anything we can do to prevent q == p->second.end() from happening?

Re: [ceph-users] cephfs-snapshots causing mds failover, hangs

2019-08-26 Thread Yan, Zheng
On Mon, Aug 26, 2019 at 6:57 PM thoralf schulze wrote: > > hi Zheng, > > On 8/21/19 4:32 AM, Yan, Zheng wrote: > > Please enable debug mds (debug_mds=10), and try reproducing it again. > > please find the logs at > https://www.user.tu-berlin.de/thoralf.schulze/ceph-debug.tar.xz . > > we managed to

Re: [ceph-users] cephfs full, 2/3 Raw capacity used

2019-08-26 Thread Wido den Hollander
On 8/26/19 1:35 PM, Simon Oosthoek wrote: > On 26-08-19 13:25, Simon Oosthoek wrote: >> On 26-08-19 13:11, Wido den Hollander wrote: >> >>> >>> The reweight might actually cause even more confusion for the balancer. >>> The balancer uses upmap mode and that re-allocates PGs to different OSDs >>

Re: [ceph-users] cephfs full, 2/3 Raw capacity used

2019-08-26 Thread Simon Oosthoek
On 26-08-19 13:25, Simon Oosthoek wrote: On 26-08-19 13:11, Wido den Hollander wrote: The reweight might actually cause even more confusion for the balancer. The balancer uses upmap mode and that re-allocates PGs to different OSDs if needed. Looking at the output send earlier I have some repl

Re: [ceph-users] cephfs full, 2/3 Raw capacity used

2019-08-26 Thread Simon Oosthoek
On 26-08-19 13:11, Wido den Hollander wrote: The reweight might actually cause even more confusion for the balancer. The balancer uses upmap mode and that re-allocates PGs to different OSDs if needed. Looking at the output send earlier I have some replies. See below. Looking at this outpu

Re: [ceph-users] cephfs full, 2/3 Raw capacity used

2019-08-26 Thread Wido den Hollander
/Simon > >> >> Regards >> >> >> -Original Message- >> From: ceph-users On behalf of Simon >> Oosthoek >> Sent: Monday, 26 August 2019 11:52 >> To: Dan van der Ster >> CC: ceph-users >> Subject: Re: [ceph

Re: [ceph-users] cephfs-snapshots causing mds failover, hangs

2019-08-26 Thread thoralf schulze
hi Zheng, On 8/21/19 4:32 AM, Yan, Zheng wrote: > Please enable debug mds (debug_mds=10), and try reproducing it again. please find the logs at https://www.user.tu-berlin.de/thoralf.schulze/ceph-debug.tar.xz . we managed to reproduce the issue as a worst case scenario: before snapshotting, juju-
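For anyone reproducing this, the debug level Zheng asked for can be raised and later reverted without restarting the daemons; a sketch (1/5 is the usual default for debug_mds):

    ceph config set mds debug_mds 10     # or: ceph daemon mds.<name> config set debug_mds 10
    # ... reproduce the snapshot-triggered failover and collect the MDS logs ...
    ceph config set mds debug_mds 1/5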

Re: [ceph-users] cephfs full, 2/3 Raw capacity used

2019-08-26 Thread Paul Emmerich
> > > > -Original Message- > > From: ceph-users On behalf of Simon > > Oosthoek > > Sent: Monday, 26 August 2019 11:52 > > To: Dan van der Ster > > CC: ceph-users > > Subject: Re: [ceph-users] cephfs full, 2/3 Raw capacity used

Re: [ceph-users] cephfs full, 2/3 Raw capacity used

2019-08-26 Thread Simon Oosthoek
/Simon Regards -Original Message- From: ceph-users On behalf of Simon Oosthoek Sent: Monday, 26 August 2019 11:52 To: Dan van der Ster CC: ceph-users Subject: Re: [ceph-users] cephfs full, 2/3 Raw capacity used On 26-08-19 11:37, Dan van der Ster wrote: Thanks. The version and balancer c

Re: [ceph-users] cephfs full, 2/3 Raw capacity used

2019-08-26 Thread EDH - Manuel Rios Fernandez
der Ster CC: ceph-users Subject: Re: [ceph-users] cephfs full, 2/3 Raw capacity used On 26-08-19 11:37, Dan van der Ster wrote: > Thanks. The version and balancer config look good. > > So you can try `ceph osd reweight osd.10 0.8` to see if it helps to > get you out of this. I'

Re: [ceph-users] cephfs full, 2/3 Raw capacity used

2019-08-26 Thread Simon Oosthoek
On 26-08-19 11:37, Dan van der Ster wrote: Thanks. The version and balancer config look good. So you can try `ceph osd reweight osd.10 0.8` to see if it helps to get you out of this. I've done this and the next fullest 3 osds. This will take some time to recover, I'll let you know when it's d

Re: [ceph-users] cephfs full, 2/3 Raw capacity used

2019-08-26 Thread Dan van der Ster
Thanks. The version and balancer config look good. So you can try `ceph osd reweight osd.10 0.8` to see if it helps to get you out of this. -- dan On Mon, Aug 26, 2019 at 11:35 AM Simon Oosthoek wrote: > > On 26-08-19 11:16, Dan van der Ster wrote: > > Hi, > > > > Which version of ceph are you

Re: [ceph-users] cephfs full, 2/3 Raw capacity used

2019-08-26 Thread Simon Oosthoek
On 26-08-19 11:16, Dan van der Ster wrote: Hi, Which version of ceph are you using? Which balancer mode? Nautilus (14.2.2), balancer is in upmap mode. The balancer score isn't a percent-error or anything humanly usable. `ceph osd df tree` can better show you exactly which osds are over/under

Re: [ceph-users] cephfs full, 2/3 Raw capacity used

2019-08-26 Thread Dan van der Ster
Hi, Which version of ceph are you using? Which balancer mode? The balancer score isn't a percent-error or anything humanly usable. `ceph osd df tree` can better show you exactly which osds are over/under utilized and by how much. You might be able to manually fix things by using `ceph osd reweigh

[ceph-users] cephfs full, 2/3 Raw capacity used

2019-08-26 Thread Simon Oosthoek
Hi all, we're building up our experience with our ceph cluster before we take it into production. I've now tried to fill up the cluster with cephfs, which we plan to use for about 95% of all data on the cluster. The cephfs pools are full when the cluster reports 67% raw capacity used. There
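A few read-only commands that help to see where the imbalance sits in a situation like this:

    ceph df detail                         # per-pool STORED vs. MAX AVAIL
    ceph osd df tree                       # per-OSD utilisation and PG counts
    ceph osd dump | grep -i full_ratio     # nearfull/backfillfull/full thresholds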

Re: [ceph-users] cephfs-snapshots causing mds failover, hangs

2019-08-21 Thread thoralf schulze
hi zheng, On 8/21/19 4:32 AM, Yan, Zheng wrote: > Please enable debug mds (debug_mds=10), and try reproducing it again. we will get back with the logs on monday. thank you & with kind regards, t. signature.asc Description: OpenPGP digital signature

Re: [ceph-users] cephfs-snapshots causing mds failover, hangs

2019-08-20 Thread Yan, Zheng
On Tue, Aug 20, 2019 at 9:43 PM thoralf schulze wrote: > > hi there, > > we are struggling with the creation of cephfs-snapshots: doing so > reproducible causes a failover of our metadata servers. afterwards, the > demoted mds servers won't be available as standby servers and the mds > daemons on

[ceph-users] cephfs-snapshots causing mds failover, hangs

2019-08-20 Thread thoralf schulze
hi there, we are struggling with the creation of cephfs-snapshots: doing so reproducible causes a failover of our metadata servers. afterwards, the demoted mds servers won't be available as standby servers and the mds daemons on these machines have to be manually restarted. more often than we wish

Re: [ceph-users] CephFS meltdown fallout: mds assert failure, kernel oopses

2019-08-15 Thread Jeff Layton
On Thu, 2019-08-15 at 16:45 +0900, Hector Martin wrote: > On 15/08/2019 03.40, Jeff Layton wrote: > > On Wed, 2019-08-14 at 19:29 +0200, Ilya Dryomov wrote: > > > Jeff, the oops seems to be a NULL dereference in ceph_lock_message(). > > > Please take a look. > > > > > > > (sorry for duplicate mai

Re: [ceph-users] CephFS meltdown fallout: mds assert failure, kernel oopses

2019-08-15 Thread Hector Martin
On 15/08/2019 03.40, Jeff Layton wrote: On Wed, 2019-08-14 at 19:29 +0200, Ilya Dryomov wrote: Jeff, the oops seems to be a NULL dereference in ceph_lock_message(). Please take a look. (sorry for duplicate mail -- the other one ended up in moderation) Thanks Ilya, That function is pretty st

Re: [ceph-users] CephFS meltdown fallout: mds assert failure, kernel oopses

2019-08-14 Thread Jeff Layton
On Wed, 2019-08-14 at 19:29 +0200, Ilya Dryomov wrote: > On Tue, Aug 13, 2019 at 1:06 PM Hector Martin wrote: > > I just had a minor CephFS meltdown caused by underprovisioned RAM on the > > MDS servers. This is a CephFS with two ranks; I manually failed over the > > first rank and the new MDS ser

Re: [ceph-users] CephFS meltdown fallout: mds assert failure, kernel oopses

2019-08-14 Thread Ilya Dryomov
On Tue, Aug 13, 2019 at 1:06 PM Hector Martin wrote: > > I just had a minor CephFS meltdown caused by underprovisioned RAM on the > MDS servers. This is a CephFS with two ranks; I manually failed over the > first rank and the new MDS server ran out of RAM in the rejoin phase > (ceph-mds didn't get

Re: [ceph-users] Cephfs cannot mount with kernel client

2019-08-14 Thread Serkan Çoban
Hi, just double checked the stack trace and I can confirm it is same as in tracker. compaction also worked for me, I can now mount cephfs without problems. Thanks for help, Serkan On Tue, Aug 13, 2019 at 6:44 PM Ilya Dryomov wrote: > > On Tue, Aug 13, 2019 at 4:30 PM Serkan Çoban wrote: > > > >

Re: [ceph-users] Cephfs cannot mount with kernel client

2019-08-13 Thread Ilya Dryomov
On Tue, Aug 13, 2019 at 4:30 PM Serkan Çoban wrote: > > I am out of office right now, but I am pretty sure it was the same > stack trace as in tracker. > I will confirm tomorrow. > Any workarounds? Compaction # echo 1 >/proc/sys/vm/compact_memory might help if the memory in question is moveable

[ceph-users] CephFS "denied reconnect attempt" after updating Ceph

2019-08-13 Thread William Edwards
Hello, I've been using CephFS for quite a while now, and am very happy with it. However, I'm experiencing an issue that's quite hard to debug. On almost every server where CephFS is mounted, the CephFS mount becomes unusable after updating Ceph (has happened 3 times now, after Ceph update). W

Re: [ceph-users] Cephfs cannot mount with kernel client

2019-08-13 Thread Serkan Çoban
I am out of office right now, but I am pretty sure it was the same stack trace as in tracker. I will confirm tomorrow. Any workarounds? On Tue, Aug 13, 2019 at 5:16 PM Ilya Dryomov wrote: > > On Tue, Aug 13, 2019 at 3:57 PM Serkan Çoban wrote: > > > > I checked /var/log/messages and see there ar

Re: [ceph-users] Cephfs cannot mount with kernel client

2019-08-13 Thread Ilya Dryomov
On Tue, Aug 13, 2019 at 3:57 PM Serkan Çoban wrote: > > I checked /var/log/messages and see there are page allocation > failures. But I don't understand why? > The client has 768GB memory and most of it is not used, cluster has > 1500 OSDs. Do I need to increase vm.min_free_kbytes? It is set to 1GB

Re: [ceph-users] Cephfs cannot mount with kernel client

2019-08-13 Thread Serkan Çoban
I checked /var/log/messages and see there are page allocation failures, but I don't understand why. The client has 768GB memory and most of it is not used; the cluster has 1500 OSDs. Do I need to increase vm.min_free_kbytes? It is set to 1GB now. Also, huge pages are disabled on the clients. Thanks, Serkan On
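To inspect the knobs mentioned here, and to trigger the compaction suggested in Ilya's reply:

    sysctl vm.min_free_kbytes                          # reported as ~1 GB here
    cat /sys/kernel/mm/transparent_hugepage/enabled    # hugepages reported disabled on the clients
    echo 1 > /proc/sys/vm/compact_memory               # manual compaction of free memory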

Re: [ceph-users] Cephfs cannot mount with kernel client

2019-08-13 Thread Ilya Dryomov
On Tue, Aug 13, 2019 at 12:36 PM Serkan Çoban wrote: > > Hi, > > Just installed nautilus 14.2.2 and setup cephfs on it. OS is all centos 7.6. > From a client I can mount the cephfs with ceph-fuse, but I cannot > mount with ceph kernel client. > It gives "mount error 110 connection timeout" and I c

[ceph-users] Cephfs cannot mount with kernel client

2019-08-13 Thread Serkan Çoban
Hi, Just installed nautilus 14.2.2 and setup cephfs on it. OS is all centos 7.6. From a client I can mount the cephfs with ceph-fuse, but I cannot mount with ceph kernel client. It gives "mount error 110 connection timeout" and I can see "libceph: corrupt full osdmap (-12) epoch 2759 off 656" in

[ceph-users] CephFS meltdown fallout: mds assert failure, kernel oopses

2019-08-13 Thread Hector Martin
I just had a minor CephFS meltdown caused by underprovisioned RAM on the MDS servers. This is a CephFS with two ranks; I manually failed over the first rank and the new MDS server ran out of RAM in the rejoin phase (ceph-mds didn't get OOM-killed, but I think things slowed down enough due to sw

Re: [ceph-users] CephFS snapshot for backup & disaster recovery

2019-08-08 Thread Alexandre DERUMIER
ce between two snapshots? I think it's on the roadmap for the next Ceph version. - Original Mail - From: "Eitan Mosenkis" To: "Vitaliy Filippov" Cc: "ceph-users" Sent: Monday, 5 August 2019 18:43:00 Subject: Re: [ceph-users] CephFS snapshot for backup &

Re: [ceph-users] CephFS snapshot for backup & disaster recovery

2019-08-05 Thread Eitan Mosenkis
I'm using it for a NAS to make backups from the other machines on my home network. Since everything is in one location, I want to keep a copy offsite for disaster recovery. Running Ceph across the internet is not recommended and is also very expensive compared to just storing snapshots. On Sun, Au

Re: [ceph-users] CephFS snapshot for backup & disaster recovery

2019-08-05 Thread Lars Marowsky-Bree
On 2019-08-04T13:27:00, Eitan Mosenkis wrote: > I'm running a single-host Ceph cluster for CephFS and I'd like to keep > backups in Amazon S3 for disaster recovery. Is there a simple way to > extract a CephFS snapshot as a single file and/or to create a file that > represents the incremental diff

Re: [ceph-users] CephFS Recovery/Internals Questions

2019-08-04 Thread Gregory Farnum
On Fri, Aug 2, 2019 at 12:13 AM Pierre Dittes wrote: > > Hi, > we had some major up with our CephFS. Long story short..no Journal backup > and journal was truncated. > Now..I still see a metadata pool with all objects and datapool is fine, from > what I know neither was corrupted. Last mount

Re: [ceph-users] CephFS snapshot for backup & disaster recovery

2019-08-04 Thread Виталий Филиппов
Afaik no. What's the idea of running a single-host cephfs cluster? On 4 August 2019 13:27:00 GMT+03:00, Eitan Mosenkis wrote: >I'm running a single-host Ceph cluster for CephFS and I'd like to keep >backups in Amazon S3 for disaster recovery. Is there a simple way to >extract a CephFS snapshot a

[ceph-users] CephFS snapshot for backup & disaster recovery

2019-08-04 Thread Eitan Mosenkis
I'm running a single-host Ceph cluster for CephFS and I'd like to keep backups in Amazon S3 for disaster recovery. Is there a simple way to extract a CephFS snapshot as a single file and/or to create a file that represents the incremental difference between two snapshots? __

[ceph-users] CephFS Recovery/Internals Questions

2019-08-02 Thread Pierre Dittes
Hi, we had some major up with our CephFS. Long story short..no Journal backup and journal was truncated. Now..I still see a metadata pool with all objects and datapool is fine, from what I know neither was corrupted. Last mount attempt showed a blank FS though. What are the proper steps now to

Re: [ceph-users] cephfs quota setfattr permission denied

2019-07-31 Thread Mattia Belluco
Hi Nathan, Indeed that was the reason. With your hint I was able to find the relevant documentation: https://docs.ceph.com/docs/master/cephfs/client-auth/ that is completely absent from: https://docs.ceph.com/docs/master/cephfs/quota/#configuration I will send a pull request to include the lin

Re: [ceph-users] cephfs quota setfattr permission denied

2019-07-31 Thread Nathan Fish
The client key which is used to mount the FS needs the 'p' permission to set xattrs. eg: ceph fs authorize cephfs client.foo / rwsp That might be your problem. On Wed, Jul 31, 2019 at 5:43 AM Mattia Belluco wrote: > > Dear ceph users, > > We have been recently trying to use the two quota attrib

[ceph-users] cephfs quota setfattr permission denied

2019-07-31 Thread Mattia Belluco
Dear ceph users, We have been recently trying to use the two quota attributes: - ceph.quota.max_files - ceph.quota.max_bytes to prepare for quota enforcing. While the idea is quite straightforward we found out we cannot set any additional file attribute (we tried with the directory pinning, too
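For reference, the two attributes are set with setfattr on a directory inside the mount; the path and values below are only examples, and the client key needs the 'p' flag as noted in the reply above:

    setfattr -n ceph.quota.max_bytes -v 107374182400 /mnt/cephfs/somedir    # 100 GiB
    setfattr -n ceph.quota.max_files -v 100000 /mnt/cephfs/somedir
    getfattr -n ceph.quota.max_bytes /mnt/cephfs/somedir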

Re: [ceph-users] cephfs snapshot scripting questions

2019-07-19 Thread Frank Schilder
t: 17 July 2019 02:44:02 To: ceph-users@lists.ceph.com Subject: [ceph-users] cephfs snapshot scripting questions Greetings. Before I reinvent the wheel has anyone written a script to maintain X number of snapshots on a cephfs file system that can be run through cron? I am aware of the cephfs-sna

Re: [ceph-users] cephfs snapshot scripting questions

2019-07-17 Thread Marc Roos
NAPBAK") -ne 0 ] then if [ $(snapshotexists $SNAPBAK $SNNAME) -eq 0 ] then snapshotremove $SNAPBAK $SNNAME fi createsnapshot $SNAPBAK $SNNAME fi -Original Message- From: Robert Ruge [mailto:robert.r...@deakin.edu.au] Sent: woensdag 17 juli 2019 2:44 To: ceph-users@lists.

[ceph-users] cephfs snapshot scripting questions

2019-07-16 Thread Robert Ruge
Greetings. Before I reinvent the wheel has anyone written a script to maintain X number of snapshots on a cephfs file system that can be run through cron? I am aware of the cephfs-snap code but just wondering if there are any other options out there. On a related note which of these options wou
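A minimal sketch of the kind of cron-driven rotation being asked about, relying only on the fact that CephFS snapshots are created and removed with mkdir/rmdir in a .snap directory; the path and retention count are placeholders:

    #!/bin/bash
    # keep the last KEEP snapshots of one CephFS directory
    FSROOT=/mnt/cephfs/data
    KEEP=7
    SNAPDIR="$FSROOT/.snap"

    # create a new snapshot named after the current timestamp
    mkdir "$SNAPDIR/auto-$(date +%Y%m%d-%H%M%S)"

    # prune the oldest script-created snapshots beyond KEEP (GNU head)
    ls -1d "$SNAPDIR"/auto-* 2>/dev/null | sort | head -n -"$KEEP" | while read -r snap; do
        rmdir "$snap"
    done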

[ceph-users] cephfs size

2019-07-03 Thread ST Wong (ITSC)
Hi, Mounted a CephFS through kernel module or FUSE. Both work except when we do a "df -h", the "Avail" value shown is the MAX AVAIL of the data pool in "ceph df". I'm expecting it should match with max_bytes of the data pool. Rbd mount doesn't have similar observation. Is this normal? Thanks a
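If I remember right this is expected: statfs reports the data pool's MAX AVAIL unless a quota covers the mount root, in which case both the kernel client (4.17+) and ceph-fuse report the quota instead. A sketch, assuming a subdirectory /volumes/foo is what gets mounted (names and sizes are examples):

    setfattr -n ceph.quota.max_bytes -v 1099511627776 /mnt/cephfs/volumes/foo    # 1 TiB
    sudo mount -t ceph mon1:6789:/volumes/foo /mnt/foo -o name=foo,secretfile=/etc/ceph/foo.secret
    df -h /mnt/foo    # should now show the 1 TiB quota rather than the pool's MAX AVAIL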

Re: [ceph-users] CephFS : Kernel/Fuse technical differences

2019-06-25 Thread Robert LeBlanc
There may also be more memory copying involved instead of just passing pointers around as well, but I'm not 100% sure. Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 On Mon, Jun 24, 2019 at 10:28 AM Jeff Layton wrote: > On Mon, 2019-06-24 at 15

Re: [ceph-users] CephFS : Kernel/Fuse technical differences

2019-06-24 Thread Jeff Layton
On Mon, 2019-06-24 at 15:51 +0200, Hervé Ballans wrote: > Hi everyone, > > We successfully use Ceph here for several years now, and since recently, > CephFS. > > From the same CephFS server, I notice a big difference between a fuse > mount and a kernel mount (10 times faster for kernel mount).

[ceph-users] CephFS : Kernel/Fuse technical differences

2019-06-24 Thread Hervé Ballans
Hi everyone, We successfully use Ceph here for several years now, and since recently, CephFS. From the same CephFS server, I notice a big difference between a fuse mount and a kernel mount (10 times faster for kernel mount). It makes sense to me (an additional fuse library versus a direct ac
