Re: [ceph-users] MDS hangs in "heartbeat_map" deadlock

2019-01-16 Thread Stefan Kooman
Hi Patrick, Quoting Stefan Kooman (ste...@bit.nl): > Quoting Stefan Kooman (ste...@bit.nl): > > Quoting Patrick Donnelly (pdonn...@redhat.com): > > > Thanks for the detailed notes. It looks like the MDS is stuck > > > somewhere it's not even outputting any log messages. If possible, it'd > > > be
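A minimal sketch of the kind of diagnostics usually requested in this situation, assuming an MDS daemon named mds.a on the local host (the daemon name and debug level are examples, not taken from the thread):

    # dump in-flight and recent slow requests on the stuck MDS
    ceph daemon mds.a ops
    ceph daemon mds.a dump_historic_ops
    # raise MDS logging ahead of the next hang
    ceph daemon mds.a config set debug_mds 20
    # if the daemon is wedged and logging nothing, thread backtraces help
    gdb -p $(pidof ceph-mds) -batch -ex 'thread apply all bt'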

[ceph-users] Ceph tech talk tomorrow: NooBaa data platform for distributed hybrid clouds

2019-01-16 Thread Sage Weil
Hi everyone, First, this is a reminder that there is a Tech Talk tomorrow from Guy Margalit about NooBaa, a multi-cloud object data services platform: Jan 17 at 19:00 UTC https://bluejeans.com/908675367 Why, you might ask? There is a lot of interest among many Ceph developers and vendors to

[ceph-users] Difference between OSD lost vs rm

2019-01-16 Thread Subhachandra Chandra
Hello, What is the difference between marking an OSD "lost" vs removing it with "rm" in terms of cluster recovery? What is the next step after marking an OSD "lost" and the cluster finishes recovering? Do you then "rm" it? Thanks Chandra
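For reference, a rough sketch of the two operations being compared, assuming a dead osd.12 (the ID is only an example):

    # declare the data on the OSD unrecoverable so recovery can proceed
    ceph osd lost 12 --yes-i-really-mean-it
    # afterwards remove it from the CRUSH map, auth database and OSD map
    ceph osd crush remove osd.12
    ceph auth del osd.12
    ceph osd rm 12
    # on Luminous and newer, "ceph osd purge 12 --yes-i-really-mean-it"
    # collapses the last three steps into one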

Re: [ceph-users] Offsite replication scenario

2019-01-16 Thread Wido den Hollander
On 1/16/19 8:08 PM, Anthony Verevkin wrote: > I would definitely see huge value in going to 3 MONs here (and btw 2 on-site > MGR and 2 on-site MDS) > However 350Kbps is quite low and MONs may be latency sensitive, so I suggest > you do heavy QoS if you want to use that link for ANYTHING else.

Re: [ceph-users] Offsite replication scenario

2019-01-16 Thread Brian Topping
> On Jan 16, 2019, at 12:08 PM, Anthony Verevkin wrote: > > I would definitely see huge value in going to 3 MONs here (and btw 2 on-site > MGR and 2 on-site MDS) > However 350Kbps is quite low and MONs may be latency sensitive, so I suggest > you do heavy QoS if you want to use that link for

Re: [ceph-users] slow requests and high i/o / read rate on bluestore osds after upgrade 12.2.8 -> 12.2.10

2019-01-16 Thread Stefan Priebe - Profihost AG
I reverted the whole cluster back to 12.2.8 - recovery speed had also dropped from 300-400MB/s to 20MB/s on 12.2.10. So something is really broken. Greets, Stefan On 16.01.19 at 16:00, Stefan Priebe - Profihost AG wrote: > This is not the case with 12.2.8 - it happens with 12.2.9 as well. After >

[ceph-users] Fw: Re: Why does "df" on a cephfs not report same free space as "rados df" ?

2019-01-16 Thread David Young
Forgot to reply to the list! ‐‐‐ Original Message ‐‐‐ On Thursday, January 17, 2019 8:32 AM, David Young wrote: > Thanks David, > > "ceph osd df" looks like this: > > - > root@node1:~# ceph osd df > ID CLASS WEIGHT REWEIGHT SIZE USE AVAIL %USE VAR PGS > 9 hdd

Re: [ceph-users] Offsite replication scenario

2019-01-16 Thread Anthony Verevkin
I would definitely see huge value in going to 3 MONs here (and btw 2 on-site MGR and 2 on-site MDS) However 350Kbps is quite low and MONs may be latency sensitive, so I suggest you do heavy QoS if you want to use that link for ANYTHING else. If you do so, make sure your clients are only listing
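As an illustration of the "heavy QoS" suggestion, a minimal tc/HTB sketch that reserves most of the link for Ceph; the interface eth0, the rates and the assumption that mon traffic uses the default port 6789 are examples, not a tested configuration:

    tc qdisc add dev eth0 root handle 1: htb default 20
    tc class add dev eth0 parent 1: classid 1:10 htb rate 250kbit ceil 350kbit
    tc class add dev eth0 parent 1: classid 1:20 htb rate 50kbit ceil 350kbit
    # steer Ceph mon traffic into the priority class
    tc filter add dev eth0 parent 1: protocol ip u32 match ip dport 6789 0xffff flowid 1:10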

Re: [ceph-users] cephfs kernel client instability

2019-01-16 Thread Ilya Dryomov
On Wed, Jan 16, 2019 at 7:12 PM Andras Pataki wrote: > > Hi Ilya/Kjetil, > > I've done some debugging and tcpdump-ing to see what the interaction > between the kernel client and the mon looks like. Indeed - > CEPH_MSG_MAX_FRONT defined as 16Mb seems low for the default mon > messages for our

[ceph-users] Fixing a broken bucket index in RGW

2019-01-16 Thread Bryan Stillwell
I'm looking for some help in fixing a bucket index on a Luminous (12.2.8) cluster running on FileStore. First some background on how I believe the bucket index became broken. Last month we had a PG in our .rgw.buckets.index pool become inconsistent: 2018-12-11 09:12:17.743983 osd.1879 osd.1879
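For anyone following along, the usual starting points for this kind of repair look roughly like this; the PG id and bucket name are placeholders, and --fix should be used with care:

    # see what scrub found in the inconsistent PG, then repair it
    rados list-inconsistent-obj <pgid> --format=json-pretty
    ceph pg repair <pgid>
    # check (and optionally rebuild) the bucket index itself
    radosgw-admin bucket check --bucket=<bucket>
    radosgw-admin bucket check --bucket=<bucket> --check-objects --fix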

Re: [ceph-users] Why does "df" on a cephfs not report same free space as "rados df" ?

2019-01-16 Thread David C
On Wed, 16 Jan 2019, 02:20 David Young wrote: > Hi folks, > > My ceph cluster is used exclusively for cephfs, as follows: > > --- > root@node1:~# grep ceph /etc/fstab > node2:6789:/ /ceph ceph > auto,_netdev,name=admin,secretfile=/root/ceph.admin.secret > root@node1:~# > --- > > "rados df" shows me the
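A quick way to see where the two numbers come from, using only standard commands and the /ceph mount point from the fstab above:

    # raw usage per pool, before replication is taken into account
    rados df
    # per-pool view including MAX AVAIL, which already factors in
    # replication/EC overhead and the fullest OSD
    ceph df detail
    # what the kernel client reports for the mounted filesystem
    df -h /ceph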

Re: [ceph-users] cephfs kernel client instability

2019-01-16 Thread Andras Pataki
Hi Ilya/Kjetil, I've done some debugging and tcpdump-ing to see what the interaction between the kernel client and the mon looks like. Indeed - CEPH_MSG_MAX_FRONT defined as 16Mb seems low for the default mon messages for our cluster (with osd_mon_messages_max at 100). We have about 3500

[ceph-users] Ceph Nautilus Release T-shirt Design

2019-01-16 Thread Mike Perez
Hey everyone, We're getting close to the release of Ceph Nautilus, and I wanted to start the discussion of our next shirt! It looks like in the past we've used common works from Wikipedia pages. https://en.wikipedia.org/wiki/Nautilus I thought it would be fun to see who in our community would

Re: [ceph-users] dropping python 2 for nautilus... go/no-go

2019-01-16 Thread Mike Perez
I spoke with Doug Hellmann who has been championing the goal inside of OpenStack [1]. According to Doug all major services in OpenStack should be supporting python 3.5 and 3.6. They have a goal in their current cycle, set for 2019-04-10 [2], to make python 3 the default in tests [3]. [1] -

Re: [ceph-users] dropping python 2 for nautilus... go/no-go

2019-01-16 Thread Marc Roos
I have python 2 in rhel7/centos7 [@c04 ~]# python -V Python 2.7.5 [@c04 ~]# cat /etc/redhat-release CentOS Linux release 7.6.1810 (Core) -Original Message- From: c...@jack.fr.eu.org [mailto:c...@jack.fr.eu.org] Sent: 16 January 2019 16:55 To: ceph-users@lists.ceph.com Subject: Re:
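The matching check for python3, which is what the proposal would require; this only assumes python3 is on the default path if installed:

    python3 -V 2>/dev/null || echo "python3 not installed"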

Re: [ceph-users] dropping python 2 for nautilus... go/no-go

2019-01-16 Thread ceph
Hi, My 2 cents: - do drop python2 support - do not drop python2 support unexpectedly, aka do a deprecation phase. People should already know that python2 is dead. That is not enough, though, to remove it "by surprise". Regards, On 01/16/2019 04:45 PM, Sage Weil wrote: > Hi everyone, > > This

Re: [ceph-users] Filestore OSD on CephFS?

2019-01-16 Thread Kenneth Van Alstyne
I’d actually rather it not be an extra cluster, but can the destination pool name be different? If not, I have conflicting image names in the “rbd” pool on either side. Thanks, -- Kenneth Van Alstyne Systems Architect Knight Point Systems, LLC Service-Disabled Veteran-Owned Business 1775

[ceph-users] dropping python 2 for nautilus... go/no-go

2019-01-16 Thread Sage Weil
Hi everyone, This has come up several times before, but we need to make a final decision. Alfredo has a PR prepared that drops Python 2 support entirely in master, which will mean nautilus is Python 3 only. All of our distro targets (el7, bionic, xenial) include python 3, so that isn't an

Re: [ceph-users] Filestore OSD on CephFS?

2019-01-16 Thread Robert Sander
On 16.01.19 16:03, Kenneth Van Alstyne wrote: > To be clear, I know the question comes across as ludicrous.  It *seems* > like this is going to work okay for the light workload use case that I > have in mind — I just didn’t want to risk impacting the underlying > cluster too much or hit any other

Re: [ceph-users] slow requests and high i/o / read rate on bluestore osds after upgrade 12.2.8 -> 12.2.10

2019-01-16 Thread Mark Nelson
Hi Stefan, 12.2.9 included the pg hard limit patches and the osd_memory_autotuning patches. While at first I was wondering if this was autotuning, it sounds like it may be more related to the pg hard limit. I'm not terribly familiar with those patches though so some of the other members

Re: [ceph-users] Filestore OSD on CephFS?

2019-01-16 Thread Kenneth Van Alstyne
Burkhard: Thank you, this is literally what I was looking for. A VM with RBD images attached was my first choice (and what we do for a test and integration lab today), but I am trying to give as much space as possible to the underlying cluster without having to frequently add/remove OSDs and

Re: [ceph-users] slow requests and high i/o / read rate on bluestore osds after upgrade 12.2.8 -> 12.2.10

2019-01-16 Thread Stefan Priebe - Profihost AG
This is not the case with 12.2.8 - it happens with 12.2.9 as well. After boot all pgs are instantly active - no inactive pgs, at least none noticeable in ceph -s. With 12.2.9 or 12.2.10 or even current upstream/luminous it takes minutes until all pgs are active again. Greets, Stefan On 16.01.19 at

Re: [ceph-users] Filestore OSD on CephFS?

2019-01-16 Thread Kenneth Van Alstyne
Marc: To clarify, there will be no direct client workload (which is what I mean by “active production workload”), but rather RBD images from a remote cluster, brought in either via RBD export/import or as an RBD mirror destination. Obviously the best solution is dedicated hardware, but I don’t
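A hedged sketch of the export/import path with a different destination pool, which avoids the name collision in the "rbd" pool; the pool, image and cluster names are examples, and whether rbd-mirror itself can target a differently named pool is a separate question:

    # stream an image from the source cluster into a different pool on the destination
    rbd export rbd/vm-disk-1 - | rbd --cluster backup import - backuppool/vm-disk-1
    # incremental follow-ups can use export-diff / import-diff against a common snapshot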

Re: [ceph-users] Filestore OSD on CephFS?

2019-01-16 Thread Burkhard Linke
Hi, just some comments: CephFS has an overhead for accessing files (capabilities round trip to MDS for first access, cap cache management, limited number of concurrent caps depending on MDS cache size...), so using the cephfs filesystem as storage for a filestore OSD will add some extra

Re: [ceph-users] slow requests and high i/o / read rate on bluestore osds after upgrade 12.2.8 -> 12.2.10

2019-01-16 Thread Stefan Priebe - Profihost AG
Hello, while digging into this further I saw that it takes ages until all pgs are active. After starting the OSD 3% of all pgs are inactive and it takes minutes until they're active. The log of the OSD is full of: 2019-01-16 15:19:13.568527 7fecbf7da700 0 osd.33 pg_epoch: 1318479 pg[5.563( v
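For readers trying to reproduce this, a few standard commands to see which PGs stay inactive and why, with osd.33 and pg 5.563 taken from the log line above:

    # list PGs stuck in a non-active state
    ceph pg dump_stuck inactive
    # why is a particular PG not active yet (peering, activating, ...)?
    ceph pg 5.563 query
    # slow/historic ops on the restarted OSD
    ceph daemon osd.33 dump_historic_ops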

Re: [ceph-users] Filestore OSD on CephFS?

2019-01-16 Thread Marc Roos
How can there be a "catastrophic reason" if you have "no active, production workload"...? Do as you please. I am also using 1x replication for temp and tests. But if you have only one osd why use ceph? Choose the correct 'tool' for the job. -Original Message- From: Kenneth Van

[ceph-users] Filestore OSD on CephFS?

2019-01-16 Thread Kenneth Van Alstyne
Disclaimer: Even I will admit that I know this is going to sound like a silly/crazy/insane question, but I have a reason for wanting to do this and asking the question. It’s also worth noting that no active, production workload will be used on this “cluster”, so I’m worried more about data

[ceph-users] Kubernetes won't mount image with rbd-nbd

2019-01-16 Thread Hammad Abdullah
Hi, I’m trying to run a Gerrit deployment on Kubernetes. The deployment fails because it can’t mount a Ceph image with rbd-nbd. I tried to manually mount the image with "sudo rbd-nbd map rbd/xgrid-rep-test -m 172.31.141.8:6789,172.31.141.9:6789,172.31.141.10:6789 --keyfile keyfile" and it
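For comparison with what the kubelet does, the manual path looks roughly like this; the image, monitors and keyfile are taken from the message above, the mount point is an example:

    sudo rbd-nbd map rbd/xgrid-rep-test -m 172.31.141.8:6789,172.31.141.9:6789,172.31.141.10:6789 --keyfile keyfile
    # on success this prints a device such as /dev/nbd0
    sudo mount /dev/nbd0 /mnt/test
    # if the map fails, the kernel log usually says why
    dmesg | tail
    sudo rbd-nbd unmap /dev/nbd0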

Re: [ceph-users] /var/lib/ceph/mon/ceph-{node}/store.db on mon nodes

2019-01-16 Thread Wido den Hollander
On 1/16/19 11:19 AM, Brian Topping wrote: > Thanks guys! This does leave me a little worried that I only have one mon at > the moment based on reasons in my previous emails in the list (physical limit > of two nodes at the moment). Going to have to get more creative! > My advice: Do

Re: [ceph-users] cephfs kernel client instability

2019-01-16 Thread Ilya Dryomov
On Wed, Jan 16, 2019 at 1:27 AM Kjetil Joergensen wrote: > > Hi, > > you could try reducing "osd map message max", some code paths that end up as > -EIO (kernel: libceph: mon1 *** io error) is exceeding > include/linux/ceph/libceph.h:CEPH_MSG_MAX_{FRONT,MIDDLE,DATA}_LEN. > > This "worked for
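A minimal sketch of the workaround being discussed: reduce how many osdmaps the mons pack into a single message so it stays under the kernel client's front-length limit. The value 10 is only an example and needs to reach the mons (ceph.conf plus a restart, or injectargs):

    # ceph.conf on the mons
    [global]
        osd map message max = 10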

Re: [ceph-users] /var/lib/ceph/mon/ceph-{node}/store.db on mon nodes

2019-01-16 Thread Brian Topping
Thanks guys! This does leave me a little worried that I only have one mon at the moment based on reasons in my previous emails in the list (physical limit of two nodes at the moment). Going to have to get more creative! Sent from my iPhone > On Jan 16, 2019, at 02:56, Wido den Hollander

Re: [ceph-users] /var/lib/ceph/mon/ceph-{node}/store.db on mon nodes

2019-01-16 Thread Wido den Hollander
On 1/16/19 10:36 AM, Matthew Vernon wrote: > Hi, > > On 16/01/2019 09:02, Brian Topping wrote: > >> I’m looking at writes to a fragile SSD on a mon node, >> /var/lib/ceph/mon/ceph-{node}/store.db is the big offender at the >> moment. >> Is it required to be on a physical disk or can it be in

Re: [ceph-users] Suggestions/experiences with mixed disk sizes and models from 4TB - 14TB

2019-01-16 Thread Ashley Merrick
Have had some good experiences with the ST1NM0156-2AA111. Also running with EC but using RBD for slow storage for VMs; have had good performance for what I'd expect from 10TB drives and EC. I would definitely say getting helium drives helps vs. standard air-filled ones when you get to the 8TB+ drives.

Re: [ceph-users] /var/lib/ceph/mon/ceph-{node}/store.db on mon nodes

2019-01-16 Thread Matthew Vernon
Hi, On 16/01/2019 09:02, Brian Topping wrote: > I’m looking at writes to a fragile SSD on a mon node, > /var/lib/ceph/mon/ceph-{node}/store.db is the big offender at the > moment. > Is it required to be on a physical disk or can it be in tempfs? One > of the log files has paxos strings, so I’m

[ceph-users] Suggestions/experiences with mixed disk sizes and models from 4TB - 14TB

2019-01-16 Thread Götz Reinicke
Dear Ceph users, I’d like to get some feedback for the following thought: Currently I run some 24*4TB bluestore OSD nodes. The main focus is on storage space over IOPS. We use erasure code and cephfs, and things look good right now. The „but“ is, I do need more disk space and don’t have so
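One thing worth checking before mixing sizes: CRUSH weights (and therefore data and request load) scale with capacity, so a 12TB OSD will see roughly three times the traffic of a 4TB one. The weights are easy to inspect, and reweighting is only needed if you deliberately want to cap the larger drives (osd.42 and 7.0 are example values):

    # per-OSD CRUSH weight, utilisation and PG count, grouped by host
    ceph osd df tree
    # optionally cap a large OSD below its capacity-based weight
    ceph osd crush reweight osd.42 7.0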

[ceph-users] /var/lib/ceph/mon/ceph-{node}/store.db on mon nodes

2019-01-16 Thread Brian Topping
I’m looking at writes to a fragile SSD on a mon node; /var/lib/ceph/mon/ceph-{node}/store.db is the big offender at the moment. Is it required to be on a physical disk or can it be in tmpfs? One of the log files has paxos strings, so I’m guessing it has to be on disk for a panic recovery? Are
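Two quick ways to quantify the problem before deciding where store.db should live; the device name sda is an assumption:

    # current size of the mon store
    du -sh /var/lib/ceph/mon/ceph-{node}/store.db
    # actual write load hitting the SSD
    iostat -x 5 sda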

Re: [ceph-users] Recommendations for sharing a file system to a heterogeneous client network?

2019-01-16 Thread Marc Roos
I opened a thread recently here asking about what can generally be accepted as 'ceph overhead' when using the file system. I wonder if the performance loss I see on a cephfs 1x replication pool compared to native performance is really supposed to be this much: 5.6x to 2x slower than native disk performance
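For the comparison to be meaningful, the same fio job has to be run once against the cephfs mount and once against a local disk; a minimal example, with the directory, block size and size chosen arbitrarily:

    fio --name=seqwrite --directory=/ceph --rw=write --bs=4M --size=4G \
        --ioengine=libaio --direct=1 --numjobs=1 --group_reporting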

Re: [ceph-users] slow requests and high i/o / read rate on bluestore osds after upgrade 12.2.8 -> 12.2.10

2019-01-16 Thread Stefan Priebe - Profihost AG
Hi, no, OK, it was not. The bug is still present. It was only working because the osdmap was so far behind that it started backfill instead of recovery. So it happens only in the recovery case. Greets, Stefan On 15.01.19 at 16:02, Stefan Priebe - Profihost AG wrote: > > On 15.01.19 at 12:45