Re: [ceph-users] centos 7.6 kernel panic caused by osd

2019-01-11 Thread Ilya Dryomov
On Fri, Jan 11, 2019 at 1:38 AM Brad Hubbard wrote: > > On Fri, Jan 11, 2019 at 9:57 AM Jason Dillaman wrote: > > > > I think Ilya recently looked into a bug that can occur when > > CONFIG_HARDENED_USERCOPY is enabled and the IO's TCP message goes > > through the loopback interface (i.e. co-locat
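
For context, a quick way to check whether hardened usercopy is compiled into the running kernel; the config file path below is the usual CentOS/RHEL location and is an assumption, not something quoted from the thread:

  # inspect the running kernel's build configuration
  grep CONFIG_HARDENED_USERCOPY /boot/config-$(uname -r)
  # "=y" means the hardened usercopy checks are built in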

Re: [ceph-users] `ceph-bluestore-tool bluefs-bdev-expand` corrupts OSDs

2019-01-11 Thread Hector Martin
Sorry for the late reply. Here's what I did this time around. osd.0 and osd.1 should be identical, except osd.0 was recreated (that's the first one that failed) and I'm trying to expand osd.1 from its original size. # ceph-bluestore-tool show-label --path /var/lib/ceph/osd/ceph-0 | grep size
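
For reference, the label check and the expansion step under discussion look roughly like this; the OSD paths are the ones from the thread and the sequence is a sketch, not a verified recovery procedure:

  # compare the recorded sizes of the two OSDs before touching anything
  ceph-bluestore-tool show-label --path /var/lib/ceph/osd/ceph-0 | grep size
  ceph-bluestore-tool show-label --path /var/lib/ceph/osd/ceph-1 | grep size

  # grow BlueFS onto the enlarged device (the step that corrupted osd.1 in this report)
  ceph-bluestore-tool bluefs-bdev-expand --path /var/lib/ceph/osd/ceph-1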

Re: [ceph-users] Encryption questions

2019-01-11 Thread Sergio A. de Carvalho Jr.
Thanks for the answers, guys! Am I right to assume msgr2 (http://docs.ceph.com/docs/mimic/dev/msgr2/) will provide encryption between Ceph daemons as well as between clients and daemons? Does anybody know if it will be available in Nautilus? On Fri, Jan 11, 2019 at 8:10 AM Tobias Florek wrote:
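
msgr2 did end up shipping with Nautilus. As a rough sketch only (the secure-mode option names are taken from the msgr2 design notes and should be verified against the actual release), enabling it and requesting on-wire encryption looks like:

  # switch the monitors to the new protocol (Nautilus and later)
  ceph mon enable-msgr2

  # ask for the encrypted ("secure") mode on cluster, service and client connections
  ceph config set global ms_cluster_mode secure
  ceph config set global ms_service_mode secure
  ceph config set global ms_client_mode secure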

Re: [ceph-users] centos 7.6 kernel panic caused by osd

2019-01-11 Thread Brad Hubbard
Haha, in the email thread he says CentOS but the bug is opened against RHEL :P Is it worth recommending a fix in skb_can_coalesce() upstream so other modules don't hit this? On Fri, Jan 11, 2019 at 7:39 PM Ilya Dryomov wrote: > > On Fri, Jan 11, 2019 at 1:38 AM Brad Hubbard wrote: > > > > On Fr

Re: [ceph-users] centos 7.6 kernel panic caused by osd

2019-01-11 Thread Rom Freiman
Same kernel :) On Fri, Jan 11, 2019, 12:49 Brad Hubbard wrote: > Haha, in the email thread he says CentOS but the bug is opened against > RHEL :P > > Is it worth recommending a fix in skb_can_coalesce() upstream so other > modules don't hit this? > > On Fri, Jan 11, 2019 at 7:39 PM Ilya Dryomov

Re: [ceph-users] `ceph-bluestore-tool bluefs-bdev-expand` corrupts OSDs

2019-01-11 Thread Igor Fedotov
Hi Hector, I just realized that you're trying to expand the main (and exclusive) device, which isn't supported in Mimic. Here is the bluestore_tool complaint (pretty confusing, and it doesn't prevent the partial expansion) while expanding: expanding dev 1 from 0x1df2eb0 to 0x3a38120 Ca

Re: [ceph-users] Migrate/convert replicated pool to EC?

2019-01-11 Thread Garr
Hello again, re-reading my message I realized I need to point out an important detail about my use case. The pool I need to migrate is an object-storage one: it is the destination of an OpenStack Swift deployment. Do you think that, in this case, the procedure below would be the correct one to use?
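
For anyone following along, creating the destination EC pool itself is the easy part; the profile values and pool name below are purely illustrative, and moving the Swift data into it is the step the thread is actually debating:

  # define an erasure-code profile (example values, tune k/m to your failure domains)
  ceph osd erasure-code-profile set swift-ec k=4 m=2 crush-failure-domain=host

  # create the destination pool using that profile (name and PG count are placeholders)
  ceph osd pool create swift.buckets.data.ec 128 128 erasure swift-ec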

Re: [ceph-users] centos 7.6 kernel panic caused by osd

2019-01-11 Thread Ilya Dryomov
On Fri, Jan 11, 2019 at 11:58 AM Rom Freiman wrote: > > Same kernel :) Rom, can you update your CentOS ticket with the link to the Ceph BZ? Thanks, Ilya

Re: [ceph-users] RBD mirroring feat not supported

2019-01-11 Thread Jason Dillaman
krbd doesn't yet support several RBD features, including journaling [1]. The only current way to use object-map, fast-diff, deep-flatten, and/or journaling features against a block device is to use "rbd device map --device-type nbd " (or use a TCMU loopback device to create an librbd-backed SCSI bl
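
As a concrete sketch of the two steps Jason describes (pool and image names are placeholders):

  # enable the extra features on an existing image
  rbd feature enable mypool/myimage object-map fast-diff journaling

  # map it through the librbd-backed NBD driver instead of krbd
  rbd device map --device-type nbd mypool/myimage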

Re: [ceph-users] `ceph-bluestore-tool bluefs-bdev-expand` corrupts OSDs

2019-01-11 Thread Hector Martin
Hi Igor, On 11/01/2019 20:16, Igor Fedotov wrote: In short - we're planning to support main device expansion for Nautilus+ and to introduce better error handling for the case in Mimic and Luminous. Nautilus PR has been merged, M & L PRs are pending review at the moment: Got it. No problem then

Re: [ceph-users] centos 7.6 kernel panic caused by osd

2019-01-11 Thread Rom Freiman
Done. On Fri, Jan 11, 2019 at 3:36 PM Ilya Dryomov wrote: > On Fri, Jan 11, 2019 at 11:58 AM Rom Freiman wrote: > > > > Same kernel :) > > Rom, can you update your CentOS ticket with the link to the Ceph BZ? > > Thanks, > > Ilya > ___

[ceph-users] Problems enabling automatic balancer

2019-01-11 Thread Massimo Sgaravatto
I am trying to enable the automatic balancer in my Luminous ceph cluster, following the documentation at: http://docs.ceph.com/docs/luminous/mgr/balancer/ [root@ceph-mon-01 ~]# ceph balancer status { "active": true, "plans": [], "mode": "crush-compat" } After having issued the comma
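
For comparison, the usual Luminous sequence for turning the balancer on is roughly the following (module and mode names as in the linked documentation):

  ceph mgr module enable balancer
  ceph balancer mode crush-compat
  ceph balancer on
  ceph balancer status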

Re: [ceph-users] osdmaps not being cleaned up in 12.2.8

2019-01-11 Thread Bryan Stillwell
That thread looks like the right one. So far I haven't needed to restart the OSDs for the churn trick to work. I bet you're right that something thinks it still needs one of the old osdmaps on your cluster. Last night our cluster finished another round of expansions and we're seeing up to 49
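
A quick way to see how many old maps an OSD is still holding (osd.0 is just an example) is to compare the oldest and newest epochs it reports via its admin socket:

  # a large gap between the two epochs suggests old osdmaps aren't being trimmed
  ceph daemon osd.0 status | grep -E 'oldest_map|newest_map'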

Re: [ceph-users] osdmaps not being cleaned up in 12.2.8

2019-01-11 Thread Bryan Stillwell
I've created the following bug report to address this issue: http://tracker.ceph.com/issues/37875 Bryan

Re: [ceph-users] Problems enabling automatic balancer

2019-01-11 Thread Massimo Sgaravatto
I think I found the problem myself (for the time being I am debugging the issue on a testbed): [root@c-mon-01 ceph]# ceph osd crush weight-set create-compat Error EPERM: crush map contains one or more bucket(s) that are not straw2 So I issued: [root@c-mon-01 ceph]# ceph osd crush set-all-straw-b
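
The truncated command above is presumably the straw-to-straw2 conversion; a sketch of the full sequence (note that converting bucket types can trigger some data movement):

  # convert all legacy straw buckets to straw2
  ceph osd crush set-all-straw-buckets-to-straw2

  # then retry creating the compat weight set for the balancer
  ceph osd crush weight-set create-compat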

Re: [ceph-users] ceph-mgr fails to restart after upgrade to mimic

2019-01-11 Thread Randall Smith
I was going through the permissions on the various keys in the cluster and I think the admin capabilities look a little weird (see below). Could this be causing the ceph-mgr problems when it starts? [client.admin] key = [redacted] auid = 0 caps mds = "allow" caps mgr =
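
For comparison, a stock client.admin key normally carries 'allow *' for every daemon type; if the mgr cap really is missing or empty, something along these lines (a sketch, verify the exact caps for your release) would restore it:

  # show what the key currently has
  ceph auth get client.admin

  # reset the caps to the usual full-access set
  ceph auth caps client.admin mds 'allow *' mgr 'allow *' mon 'allow *' osd 'allow *'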

[ceph-users] RBD Mirror Proxy Support?

2019-01-11 Thread Kenneth Van Alstyne
Hello all (and maybe this would be better suited for the ceph devel mailing list): I’d like to use RBD mirroring between two sites (to each other), but I have the following limitations: - The clusters use the same name (“ceph”) - The clusters share IP address space on a private, non-routed storag
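
Setting the routing question aside, the cluster-name clash is usually worked around by giving the remote cluster's config and keyring a distinct local name; a rough sketch with a hypothetical "site-b" name and placeholder pool/client names:

  # keep the remote cluster's files under a different local name, e.g.
  #   /etc/ceph/site-b.conf
  #   /etc/ceph/site-b.client.rbd-mirror.keyring

  # enable per-image mirroring on the pool and register the peer
  rbd mirror pool enable rbd image
  rbd mirror pool peer add rbd client.rbd-mirror@site-b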

[ceph-users] Ceph Meetups

2019-01-11 Thread Jason Van der Schyff
Hi All, We wanted to let everyone know about a couple of meetups that are happening in the near future relating to Ceph and it was suggested we send it out to the list. First of all, in Dallas on January 15th with details here: https://www.meetup.com/Object-Storage-Craft-Beer-Dallas/events/257

Re: [ceph-users] centos 7.6 kernel panic caused by osd

2019-01-11 Thread Brad Hubbard
On Fri, Jan 11, 2019 at 8:58 PM Rom Freiman wrote: > > Same kernel :) Not exactly the point I had in mind, but sure ;) > > > On Fri, Jan 11, 2019, 12:49 Brad Hubbard wrote: >> >> Haha, in the email thread he says CentOS but the bug is opened against RHEL >> :P >> >> Is it worth recommending a

[ceph-users] OSDs busy reading from Bluestore partition while bringing up nodes.

2019-01-11 Thread Subhachandra Chandra
Hi, We have a cluster with 9 hosts and 540 HDDs using Bluestore and containerized OSDs running luminous 12.2.4. While trying to add new nodes, the cluster collapsed as it could not keep up with establishing enough tcp connections. We fixed sysctl to be able to handle more connections and also
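
The message doesn't say which sysctls were changed; purely as an illustration, the knobs that usually matter when hundreds of OSDs reconnect at once are along these lines:

  # illustrative values only - the thread does not specify what was actually tuned
  sysctl -w net.core.somaxconn=4096
  sysctl -w net.ipv4.tcp_max_syn_backlog=8192
  sysctl -w net.core.netdev_max_backlog=8192
  sysctl -w fs.file-max=1000000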

[ceph-users] Offsite replication scenario

2019-01-11 Thread Brian Topping
Hi all, I have a simple two-node Ceph cluster that I’m comfortable with the care and feeding of. Both nodes are in a single rack and captured in the attached dump; it has two nodes, only one mon, and all pools of size 2. Due to physical limitations, the primary location can’t move past two nodes at th

[ceph-users] Boot volume on OSD device

2019-01-11 Thread Brian Topping
Question about OSD sizes: I have two cluster nodes, each with 4x 800GiB SLC SSD using BlueStore. They boot from SATADOM so the OSDs are data-only, but the MLC SATADOM have terrible reliability and the SLC are way overpriced for this application. Can I carve off 64GiB from one of the four dri
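
One way to do what's being asked, sketched with hypothetical device names: partition one SSD so a small slice holds the OS and hand the remainder to ceph-volume as the OSD data device.

  # example layout on /dev/sda (names and sizes are placeholders)
  #   /dev/sda1  64GiB  OS / boot
  #   /dev/sda2  rest   OSD data
  ceph-volume lvm create --data /dev/sda2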

Re: [ceph-users] OSDs busy reading from Bluestore partition while bringing up nodes.

2019-01-11 Thread Paul Emmerich
This seems like a case of accumulating lots of new osd maps. What might help is also setting the noup and nodown flags and waiting for the OSDs to start up. Use the "status" daemon command to check the current OSD state even if it can't come up in the cluster map due to noup (it also somewhere shows
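
A condensed version of the suggestion, with osd.0 standing in for any affected daemon:

  # keep the cluster map quiet while the OSDs chew through the backlog of maps
  ceph osd set noup
  ceph osd set nodown

  # watch an individual OSD via its admin socket rather than the cluster map
  ceph daemon osd.0 status

  # once the OSDs report ready, let them rejoin
  ceph osd unset noup
  ceph osd unset nodown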