Re: [ceph-users] Unexpected "out" OSD behaviour

2019-12-23 Thread Oliver Freyermuth
seem to be effective. Cheers, Oliver. On 22.12.19 at 23:50, Oliver Freyermuth wrote: > Dear Jonas, > > On 22.12.19 at 23:40, Jonas Jelten wrote: >> hi! >> >> I've also noticed that behavior and have submitted a patch some time ago >> that should fix (2):

Re: [ceph-users] Unexpected "out" OSD behaviour

2019-12-22 Thread Oliver Freyermuth
of the "out" OSDs and see what happens). Cheers and many thanks, Oliver > > Cheers > -- Jonas > > > On 22/12/2019 19.48, Oliver Freyermuth wrote: >> Dear Cephers, >> >> I realized the following behaviour only recently: >> >>

[ceph-users] Unexpected "out" OSD behaviour

2019-12-22 Thread Oliver Freyermuth
Dear Cephers, I realized the following behaviour only recently: 1. Marking an OSD "out" sets the weight to zero and allows data to be migrated away (as long as it is up), i.e. it is still considered a "source" and nothing goes to a degraded state (so far, everything expected). 2. Restarting

Re: [ceph-users] dashboard hangs

2019-11-22 Thread Oliver Freyermuth
Hi, On 2019-11-20 15:55, thoralf schulze wrote: hi, we were able to track this down to the auto balancer: disabling the auto balancer and cleaning out old (and probably not very meaningful) upmap entries via ceph osd rm-pg-upmap-items brought back stable mgr daemons and a usable dashboard.
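A minimal sketch of the commands involved (the PG id is a placeholder, not taken from the thread):
  ceph balancer off                      # disable the automatic balancer
  ceph osd dump | grep pg_upmap_items    # list existing upmap entries
  ceph osd rm-pg-upmap-items 2.1a        # drop the upmap entry for one PG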

Re: [ceph-users] Can't create erasure coded pools with k+m greater than hosts?

2019-10-24 Thread Oliver Freyermuth
On 2019-10-24 09:46, Janne Johansson wrote: (Slightly abbreviated) On Thu, 24 Oct 2019 at 09:24, Frank Schilder <fr...@dtu.dk> wrote: What I learned is the following: 1) Avoid this work-around (too few hosts for the EC rule) at all costs. 2) Do not use EC 2+1. It does not

Re: [ceph-users] POOL_TARGET_SIZE_BYTES_OVERCOMMITTED

2019-09-25 Thread Oliver Freyermuth
5952G 0.0302 0.0700 1.0 32 on | rbd 1856G 3.0 5952G 0.9359 0.9200 1.0 256 on Cheers, Oliver. On 12.09.19 at 23:34, Oliver Freyermuth wrote: Dear Cephalopodians, I can confirm the same

Re: [ceph-users] eu.ceph.com mirror out of sync?

2019-09-24 Thread Oliver Freyermuth
Dear Wido, On 2019-09-24 08:53, Wido den Hollander wrote: On 9/17/19 11:01 PM, Oliver Freyermuth wrote: Dear Cephalopodians, I realized just now that: https://eu.ceph.com/rpm-nautilus/el7/x86_64/ still holds only releases up to 14.2.2, and nothing is to be seen of 14.2.3 or 14.2.4, while

Re: [ceph-users] eu.ceph.com mirror out of sync?

2019-09-23 Thread Oliver Freyermuth
tthew. (au.ceph.com maintainer) On 24/9/19 6:48 am, David Majchrzak, ODERLAND Webbhotell AB wrote: Hi, I'll have a look at the status of se.ceph.com tomorrow morning; it's maintained by us. Kind Regards, David On Mon, 2019-09-23 at 22:41 +0200, Oliver Freyermuth wrote: Hi together, the EU mirror

Re: [ceph-users] eu.ceph.com mirror out of sync?

2019-09-23 Thread Oliver Freyermuth
geographically, this only leaves Sweden and UK. Sweden at se.ceph.com does not load for me, but UK indeed seems fine. Should people in the EU use that mirror, or should we all just use download.ceph.com instead of something geographically close-by? Cheers, Oliver On 2019-09-17 23:01, Oliver

Re: [ceph-users] OSD's keep crasching after clusterreboot

2019-09-23 Thread Oliver Freyermuth
afterwards, though. So this probably means we are not affected by the upgrade bug - still, I would sleep better if somebody could confirm how to detect this bug and - if you are affected - how to edit the pool to fix it. Cheers, Oliver On 2019-09-17 21:23, Oliver Freyermuth wrote: Hi

[ceph-users] eu.ceph.com mirror out of sync?

2019-09-17 Thread Oliver Freyermuth
Dear Cephalopodians, I realized just now that: https://eu.ceph.com/rpm-nautilus/el7/x86_64/ still holds only releases up to 14.2.2, and nothing is to be seen of 14.2.3 or 14.2.4, while the main repository at: https://download.ceph.com/rpm-nautilus/el7/x86_64/ looks as expected. Is this

Re: [ceph-users] OSD's keep crasching after clusterreboot

2019-09-17 Thread Oliver Freyermuth
Hi together, it seems the issue described by Ansgar was reported and closed here as being fixed for newly created pools in post-Luminous releases: https://tracker.ceph.com/issues/41336 However, it is unclear to me: - How to find out if an EC cephfs you have created in Luminous is actually

Re: [ceph-users] Ceph RBD Mirroring

2019-09-14 Thread Oliver Freyermuth
p+replaying". Thanks and all the best, Oliver > > On Fri, Sep 13, 2019 at 12:44 PM Oliver Freyermuth > wrote: >> >> On 13.09.19 at 18:38, Jason Dillaman wrote: >>> On Fri, Sep 13, 2019 at 11:30 AM Oliver Freyermuth >>> wrote: >>>>

Re: [ceph-users] Ceph RBD Mirroring

2019-09-13 Thread Oliver Freyermuth
On 13.09.19 at 18:38, Jason Dillaman wrote: On Fri, Sep 13, 2019 at 11:30 AM Oliver Freyermuth wrote: On 13.09.19 at 17:18, Jason Dillaman wrote: On Fri, Sep 13, 2019 at 10:41 AM Oliver Freyermuth wrote: On 13.09.19 at 16:30, Jason Dillaman wrote: On Fri, Sep 13, 2019 at 10:17 AM

Re: [ceph-users] Ceph RBD Mirroring

2019-09-13 Thread Oliver Freyermuth
On 13.09.19 at 17:18, Jason Dillaman wrote: On Fri, Sep 13, 2019 at 10:41 AM Oliver Freyermuth wrote: On 13.09.19 at 16:30, Jason Dillaman wrote: On Fri, Sep 13, 2019 at 10:17 AM Jason Dillaman wrote: On Fri, Sep 13, 2019 at 10:02 AM Oliver Freyermuth wrote: Dear Jason, thanks

Re: [ceph-users] Ceph RBD Mirroring

2019-09-13 Thread Oliver Freyermuth
On 13.09.19 at 16:30, Jason Dillaman wrote: On Fri, Sep 13, 2019 at 10:17 AM Jason Dillaman wrote: On Fri, Sep 13, 2019 at 10:02 AM Oliver Freyermuth wrote: Dear Jason, thanks for the very detailed explanation! This was very instructive. Sadly, the watchers look correct - see details

Re: [ceph-users] Ceph RBD Mirroring

2019-09-13 Thread Oliver Freyermuth
On 13.09.19 at 16:17, Jason Dillaman wrote: On Fri, Sep 13, 2019 at 10:02 AM Oliver Freyermuth wrote: Dear Jason, thanks for the very detailed explanation! This was very instructive. Sadly, the watchers look correct - see details inline. On 13.09.19 at 15:02, Jason Dillaman wrote: On Thu

Re: [ceph-users] Ceph RBD Mirroring

2019-09-13 Thread Oliver Freyermuth
Dear Jason, thanks for the very detailed explanation! This was very instructive. Sadly, the watchers look correct - see details inline. On 13.09.19 at 15:02, Jason Dillaman wrote: On Thu, Sep 12, 2019 at 9:55 PM Oliver Freyermuth wrote: Dear Jason, thanks for taking care and developing

Re: [ceph-users] Ceph RBD Mirroring

2019-09-12 Thread Oliver Freyermuth
Any idea on this (or how I can extract more information)? I fear keeping high-level debug logs active for ~24h is not feasible. Cheers, Oliver On 2019-09-11 19:14, Jason Dillaman wrote: > On Wed, Sep 11, 2019 at 12:57 PM Oliver Freyermuth > wrote: >> >> Dear Jaso

Re: [ceph-users] POOL_TARGET_SIZE_BYTES_OVERCOMMITTED

2019-09-12 Thread Oliver Freyermuth
Dear Cephalopodians, I can confirm the same problem described by Joe Ryner in 14.2.2. I'm also getting (in a small test setup): # ceph health detail HEALTH_WARN 1 subtrees have overcommitted pool target_size_bytes; 1 subtrees have
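For reference, a sketch of how the pg_autoscaler targets behind this warning can be inspected and adjusted (the pool name is a placeholder):
  ceph osd pool autoscale-status                   # review SIZE, TARGET SIZE and RATIO per pool
  ceph osd pool set <pool> target_size_bytes 0     # clear an absolute target
  ceph osd pool set <pool> target_size_ratio 0.9   # or express the expectation as a ratio instead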

Re: [ceph-users] Ceph RBD Mirroring

2019-09-11 Thread Oliver Freyermuth
for me to figure out what could be the problem - do you see what I did wrong? Cheers and thanks again, Oliver On 2019-09-10 23:17, Oliver Freyermuth wrote: Dear Jason, On 2019-09-10 23:04, Jason Dillaman wrote: On Tue, Sep 10, 2019 at 2:08 PM Oliver Freyermuth wrote: Dear Jason

Re: [ceph-users] Ceph RBD Mirroring

2019-09-10 Thread Oliver Freyermuth
Dear Jason, On 2019-09-10 23:04, Jason Dillaman wrote: > On Tue, Sep 10, 2019 at 2:08 PM Oliver Freyermuth > wrote: >> >> Dear Jason, >> >> On 2019-09-10 18:50, Jason Dillaman wrote: >>> On Tue, Sep 10, 2019 at 12:25 PM Oliver Freyermuth >>> w

Re: [ceph-users] Ceph RBD Mirroring

2019-09-10 Thread Oliver Freyermuth
Dear Jason, On 2019-09-10 18:50, Jason Dillaman wrote: > On Tue, Sep 10, 2019 at 12:25 PM Oliver Freyermuth > wrote: >> >> Dear Cephalopodians, >> >> I have two questions about RBD mirroring. >> >> 1) I cannot get it to work - my setup is: >>

[ceph-users] Ceph RBD Mirroring

2019-09-10 Thread Oliver Freyermuth
Dear Cephalopodians, I have two questions about RBD mirroring. 1) I cannot get it to work - my setup is: - One cluster holding the live RBD volumes and snapshots, in pool "rbd", cluster name "ceph", running latest Mimic. I ran "rbd mirror pool enable rbd pool" on that cluster
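A rough sketch of a pool-mode mirroring setup as described above (the backup cluster name and client name are assumptions, not taken from the thread):
  rbd mirror pool enable rbd pool --cluster ceph          # on the primary cluster
  rbd mirror pool enable rbd pool --cluster backup        # on the target cluster
  rbd mirror pool peer add rbd client.admin@ceph --cluster backup
  rbd mirror pool status rbd --verbose --cluster backup   # images should show "up+replaying" once rbd-mirror runs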

Re: [ceph-users] Urgent Help Needed (regarding rbd cache)

2019-08-01 Thread Oliver Freyermuth
Hi together, On 01.08.19 at 08:45, Janne Johansson wrote: On Thu, 1 Aug 2019 at 07:31, Muhammad Junaid <junaid.fsd...@gmail.com> wrote: Your email has cleared many things to me. Let me repeat my understanding. Every critical data write (like Oracle/any other DB) will be done

Re: [ceph-users] Fix scrub error in bluestore.

2019-06-06 Thread Oliver Freyermuth
Hi Alfredo, you may want to check the SMART data for the disk. I also had such a case recently (see http://lists.ceph.com/pipermail/ceph-users-ceph.com/2019-May/035117.html for the thread), and the disk had one unreadable sector which was pending reallocation. Triggering "ceph pg repair" for
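A minimal sketch of the checks and the repair (device and PG id are placeholders):
  smartctl -a /dev/sdX | grep -i -E 'reallocat|pending'   # look for pending / reallocated sectors
  rados list-inconsistent-obj 2.1a --format=json-pretty   # identify the inconsistent object
  ceph pg repair 2.1a                                     # rewrite the bad copy from a good replica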

Re: [ceph-users] Object read error - enough copies available

2019-05-31 Thread Oliver Freyermuth
Hi, On 31.05.19 at 12:07, Burkhard Linke wrote: > Hi, > > > see my post in the recent 'CephFS object mapping.' thread. It describes the > necessary commands to look up a file based on its rados object name. many thanks! I somehow missed the important part in that thread earlier and only got
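As a sketch of the approach discussed in that thread (pool name and mount point are assumptions): the data-pool object name starts with the file's inode number in hex, so the file can be located by inode:
  ceph osd map <data-pool> 10002954ea6.007c           # which PG/OSDs hold the object named in the log
  find /cephfs -inum $(printf '%d' 0x10002954ea6)     # map the hex inode prefix back to a path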

Re: [ceph-users] Object read error - enough copies available

2019-05-30 Thread Oliver Freyermuth
On 30.05.19 at 17:00, Oliver Freyermuth wrote: > Dear Cephalopodians, > > I found the messages: > 2019-05-30 16:08:51.656363 [ERR] Error -5 reading object > 2:0979ae43:::10002954ea6.007c:head > 2019-05-30 16:08:51.760660 [WRN] Error(s) ignored

[ceph-users] Object read error - enough copies available

2019-05-30 Thread Oliver Freyermuth
Dear Cephalopodians, I found the messages: 2019-05-30 16:08:51.656363 [ERR] Error -5 reading object 2:0979ae43:::10002954ea6.007c:head 2019-05-30 16:08:51.760660 [WRN] Error(s) ignored for 2:0979ae43:::10002954ea6.007c:head enough copies available just now in our logs (Mimic

Re: [ceph-users] Balancer: uneven OSDs

2019-05-29 Thread Oliver Freyermuth
.327 7f40cd3e8700 4 mgr get_config get_config key: > mgr/balancer/max_misplaced > 2019-05-29 17:06:54.327 7f40cd3e8700 4 mgr[balancer] Mode upmap, max > misplaced 0.50 > 2019-05-29 17:06:54.327 7f40cd3e8700 4 mgr[balancer] do_upmap > 2019-05-29 17:06:54.327 7f40cd3e8700 4 mg

Re: [ceph-users] Balancer: uneven OSDs

2019-05-29 Thread Oliver Freyermuth
"num": 3 > } > ], > "osd": [ > { > "features": "0x3ffddff8ffacfffb", > "release": "luminous", > "num": 7 > } > ], > "client": [ > { > "features": "0x3ffddff8ffacfffb",

Re: [ceph-users] Balancer: uneven OSDs

2019-05-29 Thread Oliver Freyermuth
Hi Tarek, what's the output of "ceph balancer status"? In case you are using "upmap" mode, you must make sure to have a min-client-compat-level of at least Luminous: http://docs.ceph.com/docs/mimic/rados/operations/upmap/ Of course, please be aware that your clients must be recent enough
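A minimal sketch of the checks mentioned above:
  ceph balancer status                               # mode should be "upmap" and the balancer active
  ceph features                                      # verify all clients report luminous or newer
  ceph osd set-require-min-compat-client luminous    # required before upmap-based balancing takes effect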

Re: [ceph-users] Quotas with Mimic (CephFS-FUSE) clients in a Luminous Cluster

2019-05-28 Thread Oliver Freyermuth
On 28.05.19 at 03:24, Yan, Zheng wrote: On Mon, May 27, 2019 at 6:54 PM Oliver Freyermuth wrote: On 27.05.19 at 12:48, Oliver Freyermuth wrote: On 27.05.19 at 11:57, Dan van der Ster wrote: On Mon, May 27, 2019 at 11:54 AM Oliver Freyermuth wrote: Dear Dan, thanks for the quick reply

Re: [ceph-users] Quotas with Mimic (CephFS-FUSE) clients in a Luminous Cluster

2019-05-27 Thread Oliver Freyermuth
On 27.05.19 at 12:48, Oliver Freyermuth wrote: On 27.05.19 at 11:57, Dan van der Ster wrote: On Mon, May 27, 2019 at 11:54 AM Oliver Freyermuth wrote: Dear Dan, thanks for the quick reply! On 27.05.19 at 11:44, Dan van der Ster wrote: Hi Oliver, We saw the same issue after upgrading

Re: [ceph-users] Quotas with Mimic (CephFS-FUSE) clients in a Luminous Cluster

2019-05-27 Thread Oliver Freyermuth
On 27.05.19 at 11:57, Dan van der Ster wrote: On Mon, May 27, 2019 at 11:54 AM Oliver Freyermuth wrote: Dear Dan, thanks for the quick reply! On 27.05.19 at 11:44, Dan van der Ster wrote: Hi Oliver, We saw the same issue after upgrading to mimic. IIRC we could make the max_bytes xattr

Re: [ceph-users] Quotas with Mimic (CephFS-FUSE) clients in a Luminous Cluster

2019-05-27 Thread Oliver Freyermuth
nd worst case could survive until then without quota enforcement, but it's a really strange and unexpected incompatibility. Cheers, Oliver Does that work? -- dan On Mon, May 27, 2019 at 11:36 AM Oliver Freyermuth wrote: Dear Cephalopodians, in the process of migrating a cluster fr

[ceph-users] Quotas with Mimic (CephFS-FUSE) clients in a Luminous Cluster

2019-05-27 Thread Oliver Freyermuth
Dear Cephalopodians, in the process of migrating a cluster from Luminous (12.2.12) to Mimic (13.2.5), we have upgraded the FUSE clients first (we took the chance during a time of low activity), thinking that this should not cause any issues. All MDS+MON+OSDs are still on Luminous, 12.2.12.
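For context, a sketch of how CephFS quotas are set and read via extended attributes (path and value are placeholders):
  setfattr -n ceph.quota.max_bytes -v 500000000000 /cephfs/some_folder   # ~500 GB quota on that subtree
  getfattr -n ceph.quota.max_bytes /cephfs/some_folder                   # what the client should see and enforce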

Re: [ceph-users] Inodes on /cephfs

2019-05-01 Thread Oliver Freyermuth
is not information to be monitored. What do you think? Cheers, Oliver > > > -- Yury > > On Wed, May 01, 2019 at 01:23:57AM +0200, Oliver Freyermuth wrote: >> On 01.05.19 at 00:51, Patrick Donnelly wrote: >>> On Tue, Apr 30, 2019 at 8:01 AM Oliver Freye

Re: [ceph-users] Inodes on /cephfs

2019-04-30 Thread Oliver Freyermuth
On 01.05.19 at 00:51, Patrick Donnelly wrote: > On Tue, Apr 30, 2019 at 8:01 AM Oliver Freyermuth > wrote: >> >> Dear Cephalopodians, >> >> we have a classic libvirtd / KVM based virtualization cluster using Ceph-RBD >> (librbd) as backend and sharin

[ceph-users] Inodes on /cephfs

2019-04-30 Thread Oliver Freyermuth
Dear Cephalopodians, we have a classic libvirtd / KVM based virtualization cluster using Ceph-RBD (librbd) as backend and sharing the libvirtd configuration between the nodes via CephFS (all on Mimic). To share the libvirtd configuration between the nodes, we have symlinked some folders from

[ceph-users] Some ceph config parameters default values

2019-02-16 Thread Oliver Freyermuth
Dear Cephalopodians, in some recent threads on this list, I have read about the "knobs": pglog_hardlimit (false by default, available at least with 12.2.11 and 13.2.5) bdev_enable_discard (false by default, advanced option, no description) bdev_async_discard (false by default,
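A sketch of how these defaults can be inspected on a running OSD; to my understanding the pglog limit is enabled as an OSD flag rather than a plain config option (per the 12.2.11 release notes):
  ceph daemon osd.0 config get bdev_enable_discard
  ceph daemon osd.0 config get bdev_async_discard
  ceph osd set pglog_hardlimit    # only once all OSDs run >= 12.2.11 / 13.2.5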

Re: [ceph-users] read-only mounts of RBD images on multiple nodes for parallel reads

2019-01-17 Thread Oliver Freyermuth
Hi, first of: I'm probably not the expert you are waiting for, but we are using CephFS for HPC / HTC (storing datafiles), and make use of containers for all jobs (up to ~2000 running in parallel). We also use RBD, but for our virtualization infrastructure. While I'm always one of the first

Re: [ceph-users] Invalid RBD object maps of snapshots on Mimic

2019-01-12 Thread Oliver Freyermuth
On 10.01.19 at 16:53, Jason Dillaman wrote: > On Thu, Jan 10, 2019 at 10:50 AM Oliver Freyermuth > wrote: >> >> Dear Jason and list, >> >> On 10.01.19 at 16:28, Jason Dillaman wrote: >>> On Thu, Jan 10, 2019 at 4:01 AM Oliver Freyermuth >>> w

Re: [ceph-users] Invalid RBD object maps of snapshots on Mimic

2019-01-10 Thread Oliver Freyermuth
Dear Jason and list, On 10.01.19 at 16:28, Jason Dillaman wrote: On Thu, Jan 10, 2019 at 4:01 AM Oliver Freyermuth wrote: Dear Cephalopodians, I performed several consistency checks now: - Exporting an RBD snapshot before and after the object map rebuilding. - Exporting a backup as raw

Re: [ceph-users] Invalid RBD object maps of snapshots on Mimic

2019-01-10 Thread Oliver Freyermuth
nderstanding correct? Then the underlying issue would still be a bug, but (as it seems) a harmless one. I'll let you know if it happens again to some of our snapshots, and if so, if it only happens to newly created ones... Cheers, Oliver On 10.01.19 at 01:18, Oliver Freyermuth wrote:

[ceph-users] Invalid RBD object maps of snapshots on Mimic

2019-01-09 Thread Oliver Freyermuth
Dear Cephalopodians, inspired by http://lists.ceph.com/pipermail/ceph-users-ceph.com/2019-January/032092.html I did a check of the object-maps of our RBD volumes and snapshots. We are running 13.2.1 on the cluster I am talking about, all hosts (OSDs, MONs, RBD client nodes) still on CentOS
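A minimal sketch of such a check and the fix (image and snapshot names are placeholders):
  rbd info rbd/volume@snapname                  # the "flags" line shows "object map invalid" if affected
  rbd object-map rebuild rbd/volume@snapname    # rebuild the object map for that snapshot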

Re: [ceph-users] RBD snapshot atomicity guarantees?

2018-12-18 Thread Oliver Freyermuth
On 18.12.18 at 11:48, Hector Martin wrote: > On 18/12/2018 18:28, Oliver Freyermuth wrote: >> We have yet to observe these hangs; we are running this with ~5 VMs with ~10 >> disks for about half a year now with daily snapshots. But all of these VMs >> have very "

Re: [ceph-users] RBD snapshot atomicity guarantees?

2018-12-18 Thread Oliver Freyermuth
Dear Hector, we are using the very same approach on CentOS 7 (freeze + thaw), but preceded by an fstrim. With virtio-scsi, using fstrim propagates the discards from within the VM to Ceph RBD (if qemu is configured accordingly), and a lot of space is saved. We have yet to observe these hangs,
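A sketch of that sequence when driven from the hypervisor via the qemu guest agent (domain, pool and snapshot names are placeholders):
  virsh domfstrim vm01                             # propagate discards from the guest to RBD first
  virsh domfsfreeze vm01                           # quiesce guest filesystems
  rbd snap create rbd/vm01-disk@daily-2018-12-18   # take the snapshot
  virsh domfsthaw vm01                             # resume I/O in the guest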

Re: [ceph-users] [Warning: Forged Email] Ceph 10.2.11 - Status not working

2018-12-17 Thread Oliver Freyermuth
That's kind of unrelated to Ceph, but since you wrote two mails already, and I believe it is caused by the mailing list software for ceph-users... Your original mail distributed via the list ("[ceph-users] Ceph 10.2.11 - Status not working") did *not* have the forged-warning. Only the

Re: [ceph-users] Upgrade to Luminous (mon+osd)

2018-12-03 Thread Oliver Freyermuth
There's also an additional issue which made us activate CEPH_AUTO_RESTART_ON_UPGRADE=yes (and, of course, not have automatic updates of Ceph): when using compression, e.g. with Snappy, it seems that already-running OSDs which try to dlopen() the snappy library for some version upgrades become
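For reference, a sketch of where that setting lives with the RPM packaging (path as on CentOS/SUSE-style systems; an assumption, not quoted from the thread):
  # /etc/sysconfig/ceph
  CEPH_AUTO_RESTART_ON_UPGRADE=yes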

Re: [ceph-users] Customized Crush location hooks in Mimic

2018-11-30 Thread Oliver Freyermuth
ving the ceph buckets manually to the other rack / datacenter. Thanks for the explanation! Cheers, Oliver > -Greg > On Fri, Nov 30, 2018 at 6:46 AM Oliver Freyermuth > <freyerm...@physik.uni-bonn.de> wrote: > > Dear Cephalopodians, > > sorry for

Re: [ceph-users] Customized Crush location hooks in Mimic

2018-11-30 Thread Oliver Freyermuth
ght 3.6824 at location {datacenter=FTD,host=osd001,root=default} So the request to move to datacenter=FTD arrives at the mon, but no action is taken, and the OSD is left in FTD_1. Cheers, Oliver On 30.11.

Re: [ceph-users] Customized Crush location hooks in Mimic

2018-11-30 Thread Oliver Freyermuth
ove itself into datacenter=FTD. But that does not happen... Any idea what I am missing? Cheers, Oliver On 30.11.18 at 11:44, Oliver Freyermuth wrote: Dear Cephalopodians, I'm probably missing something obvious, but I am at a loss here on how to actually make use of a customized crush loc

[ceph-users] Customized Crush location hooks in Mimic

2018-11-30 Thread Oliver Freyermuth
Dear Cephalopodians, I'm probably missing something obvious, but I am at a loss here on how to actually make use of a customized crush location hook. I'm currently on "ceph version 13.2.1" on CentOS 7 (i.e. the last version before the upgrade-preventing bugs). Here's what I did: 1. Write a
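A minimal sketch of such a setup (script path and the echoed location are assumptions; the hook must print the desired CRUSH location on stdout):
  # ceph.conf
  [osd]
  crush location hook = /usr/local/bin/ceph-crush-location

  # /usr/local/bin/ceph-crush-location (executable)
  #!/bin/sh
  echo "host=$(hostname -s) datacenter=FTD root=default"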

Re: [ceph-users] ceph df space usage confusion - balancing needed?

2018-10-26 Thread Oliver Freyermuth
43TiB 27.81 1.00 173 >> 139 mf1hdd 8.91019 1.0 8.91TiB 2.48TiB 6.43TiB 27.84 1.00 173 >> 140 mf1hdd 8.91019 1.0 8.91TiB 2.48TiB 6.43TiB 27

Re: [ceph-users] ceph df space usage confusion - balancing needed?

2018-10-26 Thread Oliver Freyermuth

Re: [ceph-users] ceph df space usage confusion - balancing needed?

2018-10-20 Thread Oliver Freyermuth
21:26, Janne Johansson wrote: > Ok, can't say "why" then, I'd reweigh them somewhat to even it out, > 1.22 vs. 0.74 in variance is a lot, so either a balancer plugin for > the MGRs, a script or just a few manual tweaks might be in order. > > On Sat, 20 Oct 2018 at 21:0

Re: [ceph-users] ceph df space usage confusion - balancing needed?

2018-10-20 Thread Oliver Freyermuth
n them, so RAW space >>> is what it says, how much free space there is. Then the avail and >>> %USED on per-pool stats will take replication into account, it can >>> tell how much data you may write into that particular pool, given that >>> pools replication or

Re: [ceph-users] ceph df space usage confusion - balancing needed?

2018-10-20 Thread Oliver Freyermuth
replication or EC settings. > > On Sat, 20 Oct 2018 at 19:09, Oliver Freyermuth wrote: >> >> Dear Cephalopodians, >> >> like many others, I'm also a bit confused by "ceph df" output >> in a pretty straightforward configuration. >>

[ceph-users] ceph df space usage confusion - balancing needed?

2018-10-20 Thread Oliver Freyermuth
Dear Cephalopodians, like many others, I'm also a bit confused by "ceph df" output in a pretty straightforward configuration. We have a CephFS (12.2.7) running, with a 4+2 EC profile. I get: # ceph df GLOBAL: SIZE

Re: [ceph-users] backup ceph

2018-09-21 Thread Oliver Freyermuth
ows to grow / shrink the cluster more easily as needed ;-). All the best, Oliver > Thanks again for your help. > Best Regards, > /ST Wong > > -Original Message- > From: Oliver Freyermuth > Sent: Thursday, September 20, 2018 2:10 AM > To: ST Wong (ITSC) > C

Re: [ceph-users] backup ceph

2018-09-19 Thread Oliver Freyermuth
t, Oliver > > Thanks again. > /st wong > > -Original Message- > From: Oliver Freyermuth > Sent: Wednesday, September 19, 2018 5:28 PM > To: ST Wong (ITSC) > Cc: Peter Wienemann ; ceph-users@lists.ceph.com > Subject: Re: [ceph-users] backup cep

Re: [ceph-users] backup ceph

2018-09-19 Thread Oliver Freyermuth
course). > Btw, is this one (https://benji-backup.me/) the Benji you're referring to? > Thanks a lot. Exactly :-). Cheers, Oliver > > > > Cheers, > /ST Wong > > > > -Original Message- > From: Oliver Freyermuth > Sent: Tuesday, S

Re: [ceph-users] CephFS Quota and ACL support

2018-08-28 Thread Oliver Freyermuth
On 28.08.18 at 07:14, Yan, Zheng wrote: > On Mon, Aug 27, 2018 at 10:53 AM Oliver Freyermuth > wrote: >> >> Thanks for the replies. >> >> On 27.08.18 at 19:25, Patrick Donnelly wrote: >>> On Mon, Aug 27, 2018 at 12:51 AM, Oliver Freyermuth >>> wr

Re: [ceph-users] CephFS Quota and ACL support

2018-08-27 Thread Oliver Freyermuth
Thanks for the replies. On 27.08.18 at 19:25, Patrick Donnelly wrote: > On Mon, Aug 27, 2018 at 12:51 AM, Oliver Freyermuth > wrote: >> These features are critical for us, so right now we use the Fuse client. My >> hope is CentOS 8 will use a recent enough kernel >>

[ceph-users] CephFS Quota and ACL support

2018-08-27 Thread Oliver Freyermuth
Dear Cephalopodians, sorry if this is the wrong place to ask - but does somebody know if the recently added quota support in the kernel client, and the ACL support, are going to be backported to RHEL 7 / CentOS 7 kernels? Or can someone redirect me to the correct place to ask? We don't have a

Re: [ceph-users] how can time machine know difference between cephfs fuse and kernel client?

2018-08-17 Thread Oliver Freyermuth
Hi, completely different idea: Have you tried to export the "time capsule" storage via AFP (using netatalk) instead of Samba? We are also planning to offer something like this for our users (in the mid-term future), but my feeling was that compatibility with netatalk / AFP would be better

Re: [ceph-users] HELP! --> CLUSER DOWN (was "v13.2.1 Mimic released")

2018-07-30 Thread Oliver Freyermuth
Hi together, for all others on this list, it might also be helpful to know which setups are likely affected. Does this only occur for Filestore disks, i.e. if ceph-volume has taken over taking care of these? Does it happen on every RHEL 7.5 system? We're still on 13.2.0 here and

Re: [ceph-users] Self shutdown of 1 whole system (Derbian stretch/Ceph 12.2.7/bluestore)

2018-07-23 Thread Oliver Freyermuth
On 23.07.2018 at 14:59, Nicolas Huillard wrote: > On Monday, 23 July 2018 at 12:40 +0200, Oliver Freyermuth wrote: >> On 23.07.2018 at 11:18, Nicolas Huillard wrote: >>> On Monday, 23 July 2018 at 18:23 +1000, Brad Hubbard wrote: >>>> Ceph doesn't shut dow

Re: [ceph-users] "CPU CATERR Fault" Was: Self shutdown of 1 whole system (Derbian stretch/Ceph 12.2.7/bluestore)

2018-07-23 Thread Oliver Freyermuth
On 23.07.2018 at 11:39, Nicolas Huillard wrote: > On Monday, 23 July 2018 at 10:28 +0200, Caspar Smit wrote: >> Do you have any hardware watchdog running in the system? A watchdog >> could >> trigger a powerdown if it meets some value. Any event logs from the >> chassis >> itself? > > Nice

Re: [ceph-users] Self shutdown of 1 whole system (Derbian stretch/Ceph 12.2.7/bluestore)

2018-07-23 Thread Oliver Freyermuth
On 23.07.2018 at 11:18, Nicolas Huillard wrote: > On Monday, 23 July 2018 at 18:23 +1000, Brad Hubbard wrote: >> Ceph doesn't shut down systems, as in kill or reboot the box, if that's >> what you're saying? > > That's the first part of what I was saying, yes. I was pretty sure Ceph > doesn't

Re: [ceph-users] Self shutdown of 1 whole system (Derbian stretch/Ceph 12.2.7/bluestore)

2018-07-21 Thread Oliver Freyermuth
Since all services are running on these machines - are you by any chance running low on memory? Do you have a monitoring of this? We observe some strange issues with our servers if they run for a long while, and with high memory pressure (more memory is ordered...). Then, it seems our

Re: [ceph-users] JBOD question

2018-07-20 Thread Oliver Freyermuth
Hi Satish, that really depends completely on your controller. For what it's worth: we have AVAGO MegaRAID controllers (9361 series). They can be switched to a "JBOD personality". After doing so and reinitializing (powercycling), the cards change PCI-ID and run a different firmware, optimized

Re: [ceph-users] Crush Rules with multiple Device Classes

2018-07-19 Thread Oliver Freyermuth
> *From:* ceph-users on behal

Re: [ceph-users] Crush Rules with multiple Device Classes

2018-07-19 Thread Oliver Freyermuth
On 19.07.2018 at 05:57, Konstantin Shalygin wrote: >> Now my first question is: >> 1) Is there a way to specify "take default class (ssd or nvme)"? >> Then we could just do this for the migration period, and at some point >> remove "ssd". >> >> If multi-device-class in a crush rule is not

[ceph-users] Crush Rules with multiple Device Classes

2018-07-18 Thread Oliver Freyermuth
Dear Cephalopodians, we use an SSD-only pool to store the metadata of our CephFS. In the future, we will add a few NVMEs, and in the long-term view, replace the existing SSDs by NVMEs, too. Thinking this through, I came up with three questions which I do not find answered in the docs (yet).
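For background, a sketch of how a replicated rule restricted to one device class is created and assigned (rule and pool names are placeholders); as far as I know, a take step selects a single class, which is what makes the ssd-plus-nvme migration question interesting:
  ceph osd crush rule create-replicated ssd-rule default host ssd
  ceph osd pool set cephfs_metadata crush_rule ssd-rule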

Re: [ceph-users] v12.2.7 Luminous released

2018-07-18 Thread Oliver Freyermuth
On 18.07.2018 at 16:20, Sage Weil wrote: > On Wed, 18 Jul 2018, Oliver Freyermuth wrote: >> On 18.07.2018 at 14:20, Sage Weil wrote: >>> On Wed, 18 Jul 2018, Linh Vu wrote: >>>> Thanks for all your hard work in putting out the fixes so quickly! :) >>

Re: [ceph-users] v12.2.7 Luminous released

2018-07-18 Thread Oliver Freyermuth
On 18.07.2018 at 14:20, Sage Weil wrote: > On Wed, 18 Jul 2018, Linh Vu wrote: >> Thanks for all your hard work in putting out the fixes so quickly! :) >> >> We have a cluster on 12.2.5 with Bluestore and EC pool but for CephFS, >> not RGW. In the release notes, it says RGW is a risk especially

Re: [ceph-users] v12.2.7 Luminous released

2018-07-18 Thread Oliver Freyermuth
Also many thanks from my side! On 18.07.2018 at 03:04, Linh Vu wrote: > Thanks for all your hard work in putting out the fixes so quickly! :) > > We have a cluster on 12.2.5 with Bluestore and EC pool but for CephFS, not > RGW. In the release notes, it says RGW is a risk especially the

Re: [ceph-users] mds daemon damaged

2018-07-13 Thread Oliver Freyermuth
contents for months which have been fixed in 12.2.6, but given this situation, we'd rather live with that a bit longer and hold off on the update... > > Thanks for pointing that out though, it seems like almost the exact same > situation > > On 2018-07-12 18:23, Oliver F

Re: [ceph-users] mds daemon damaged

2018-07-12 Thread Oliver Freyermuth
Hi, all this sounds an awful lot like: http://lists.ceph.com/pipermail/ceph-users-ceph.com/2018-July/027992.html In that case, things started with an update to 12.2.6. Which version are you running? Cheers, Oliver On 12.07.2018 at 23:30, Kevin wrote: > Sorry for the long posting but trying

Re: [ceph-users] Bug? Ceph-volume /var/lib/ceph/osd permissions

2018-06-02 Thread Oliver Freyermuth
On 02.06.2018 at 12:35, Marc Roos wrote: > > o+w? I don't think that is necessary, no? I also wondered about that, but it seems safe - it's only a tmpfs, with the sticky bit set - and all files within have -rw---., as you can check. Also, on our systems, we have drwxr-x---. for /var/lib/ceph,

Re: [ceph-users] Should ceph-volume lvm prepare not be backwards compitable with ceph-disk?

2018-06-02 Thread Oliver Freyermuth
On 02.06.2018 at 11:44, Marc Roos wrote: > > > ceph-disk does not require bootstrap-osd/ceph.keyring and ceph-volume > does I believe that's expected when you use "prepare". For ceph-volume, "prepare" already bootstraps the OSD and fetches a fresh OSD id, for which it needs the keyring.
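A minimal sketch (device path is a placeholder): since "prepare" talks to the cluster to allocate the OSD id, the bootstrap keyring must be in place first:
  ls /var/lib/ceph/bootstrap-osd/ceph.keyring   # required by ceph-volume lvm prepare
  ceph-volume lvm prepare --data /dev/sdb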

Re: [ceph-users] Bug? ceph-volume zap not working

2018-06-02 Thread Oliver Freyermuth
The command mapping from ceph-disk to ceph-volume is certainly not 1:1. What we ended up using is: ceph-volume lvm zap /dev/sda --destroy This takes care of destroying PVs and LVs (as the documentation says). Cheers, Oliver On 02.06.2018 at 12:16, Marc Roos wrote: > > I guess zap

Re: [ceph-users] Ceph-fuse getting stuck with "currently failed to authpin local pins"

2018-05-31 Thread Oliver Freyermuth
On 01.06.2018 at 02:59, Yan, Zheng wrote: > On Wed, May 30, 2018 at 5:17 PM, Oliver Freyermuth > wrote: >> On 30.05.2018 at 10:37, Yan, Zheng wrote: >>> On Wed, May 30, 2018 at 3:04 PM, Oliver Freyermuth >>> wrote: >>>> Hi, >>>>

Re: [ceph-users] Ceph-fuse getting stuck with "currently failed to authpin local pins"

2018-05-30 Thread Oliver Freyermuth
On 30.05.2018 at 10:37, Yan, Zheng wrote: > On Wed, May 30, 2018 at 3:04 PM, Oliver Freyermuth > wrote: >> Hi, >> >> in our case, there's only a single active MDS >> (+1 standby-replay + 1 standby). >> We also get the health warning in case it happens.

Re: [ceph-users] Ceph-fuse getting stuck with "currently failed to authpin local pins"

2018-05-30 Thread Oliver Freyermuth
>> From: ceph-users on behalf of Yan, Zheng >> >> Sent: Tuesday, 29 May 2018 9:53:43 PM >> To: Oliver Freyermuth >> Cc: Ceph Users; Peter Wienemann >> Subject: Re: [ceph-users] Ceph-fuse getting stuck with "currently failed to >> au

Re: [ceph-users] Ceph-fuse getting stuck with "currently failed to authpin local pins"

2018-05-29 Thread Oliver Freyermuth
> *From:* ceph-users on behalf of Oliver > Freyermuth > *Sent:* Tuesday, 29 May 2018 7:29:06 AM > *To:* Paul Emmerich > *Cc:* Ceph Users; Peter Wienemann > *Subject:*

Re: [ceph-users] Ceph-fuse getting stuck with "currently failed to authpin local pins"

2018-05-28 Thread Oliver Freyermuth
and the user in question who complained was accessing files in parallel via NFS and ceph-fuse), but I don't have a clear indication of that. Cheers, Oliver > > Paul > > 2018-05-28 16:38 GMT+02:00 Oliver Freyermuth <freyerm...@physik.uni-bonn.de>: > > De

Re: [ceph-users] CephFS "move" operation

2018-05-25 Thread Oliver Freyermuth
On 25.05.2018 at 15:39, Sage Weil wrote: > On Fri, 25 May 2018, Oliver Freyermuth wrote: >> Dear Ric, >> >> I played around a bit - the common denominator seems to be: moving it >> within a directory subtree below a directory for which max_bytes / >> max_fil

Re: [ceph-users] CephFS "move" operation

2018-05-25 Thread Oliver Freyermuth
On 25.05.2018 at 15:26, Luis Henriques wrote: > Oliver Freyermuth <freyerm...@physik.uni-bonn.de> writes: > >> Mhhhm... that's funny, I checked an mv with an strace now. I get:

Re: [ceph-users] CephFS "move" operation

2018-05-25 Thread Oliver Freyermuth
oo' and 'stat /cephfs/some_folder'? > (Maybe also the same with 'stat -f'.) > > Thanks! > sage > > > On Fri, 25 May 2018, Ric Wheeler wrote: >> That seems to be the issue - we need to understand why rename sees them as >> different. >> >> Ric >> >

Re: [ceph-users] CephFS "move" operation

2018-05-25 Thread Oliver Freyermuth
, rename() returns EXDEV. Cheers, Oliver On 25.05.2018 at 15:18, Ric Wheeler wrote: > That seems to be the issue - we need to understand why rename sees them as > different. > > Ric > > > On Fri, May 25, 2018, 9:15 AM Oliver Freyermuth > <freyerm...@physik.

Re: [ceph-users] CephFS "move" operation

2018-05-25 Thread Oliver Freyermuth
ks at is confused, that might explain it. > > Ric > > > On Fri, May 25, 2018, 9:04 AM Oliver Freyermuth > <freyerm...@physik.uni-bonn.de> wrote: > > On 25.05.2018 at 14:57, Ric Wheeler wrote: > > Is t

Re: [ceph-users] CephFS "move" operation

2018-05-25 Thread Oliver Freyermuth
AM John Spray <jsp...@redhat.com> wrote: > > On Fri, May 25, 2018 at 1:10 PM, Oliver Freyermuth > <freyerm...@physik.uni-bonn.de> > wrote: > > Dear Cephalopodians,

Re: [ceph-users] CephFS "move" operation

2018-05-25 Thread Oliver Freyermuth
On 25.05.2018 at 14:50, John Spray wrote: > On Fri, May 25, 2018 at 1:10 PM, Oliver Freyermuth > <freyerm...@physik.uni-bonn.de> wrote: >> Dear Cephalopodians, >> >> I was wondering why a simple "mv" is taking extraordinarily long on CephFS >> and mu

[ceph-users] CephFS "move" operation

2018-05-25 Thread Oliver Freyermuth
Dear Cephalopodians, I was wondering why a simple "mv" is taking extraordinarily long on CephFS and must note that, at least with the fuse-client (12.2.5) and when moving a file from one directory to another, the file appears to be copied first (byte by byte, traffic going through the client?)

Re: [ceph-users] Nfs-ganesha 2.6 packages in ceph repo

2018-05-16 Thread Oliver Freyermuth
> happy to share some insights. Any tuning you would recommend? > > Thanks, > > On Wed, May 16, 2018 at 4:14 PM, Oliver Freyermuth > <freyerm...@physik.uni-bonn.de> wrote: > > Hi David, > > did you alr

Re: [ceph-users] Nfs-ganesha 2.6 packages in ceph repo

2018-05-16 Thread Oliver Freyermuth
Hi David, did you already manage to check your librados2 version and pin down the issue? Cheers, Oliver On 11.05.2018 at 17:15, Oliver Freyermuth wrote: > Hi David, > > On 11.05.2018 at 16:55, David C wrote: >> Hi Oliver >> >> Thanks for

Re: [ceph-users] Nfs-ganesha 2.6 packages in ceph repo

2018-05-11 Thread Oliver Freyermuth
Any idea why it's giving me this error? > > Thanks, > > On Fri, May 11, 2018 at 2:17 AM, Oliver Freyermuth > <freyerm...@physik.uni-bonn.de> wrote: > > Hi David, > > for what it's worth, we are running with
