Re: [ceph-users] Ceph health error (was: Prioritize recovery over backfilling)

2018-06-07 Thread Caspar Smit
Well I let it run with the nodown flag set and it looked like it would finish, BUT it all went wrong somewhere. This is now the state: health: HEALTH_ERR nodown flag(s) set 5602396/94833780 objects misplaced (5.908%) Reduced data availability: 143 pgs inactive,

Re: [ceph-users] rbd map hangs

2018-06-07 Thread Tracy Reed
On Thu, Jun 07, 2018 at 09:30:23AM PDT, Jason Dillaman spake thusly: > I think what Ilya is saying is that it's a very old RHEL 7-based > kernel (RHEL 7.1?). For example, the current RHEL 7.5 kernel includes > numerous improvements that have been backported from the current > upstream kernel. Ah,

Re: [ceph-users] Adding additional disks to the production cluster without performance impacts on the existing

2018-06-07 Thread Pardhiv Karri
Hi John, We recently added a lot of nodes to our ceph clusters. To mitigate a lot of problems (we are using the tree algorithm) we added an empty node to the crushmap first and then added OSDs with zero weight, made sure the ceph health was OK, and then started ramping up each OSD. I created a script to
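
A minimal sketch of that zero-weight ramp-up, assuming placeholder names (osd.42, host node05) and step sizes; the original script was not posted:

  # add the new OSD to the crush map with zero weight
  ceph osd crush add osd.42 0 host=node05
  # ramp the weight up in small steps, waiting for HEALTH_OK in between
  for w in 0.2 0.4 0.6 0.8 1.0; do
      ceph osd crush reweight osd.42 $w
      while ! ceph health | grep -q HEALTH_OK; do sleep 60; done
  done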

Re: [ceph-users] Adding additional disks to the production cluster without performance impacts on the existing

2018-06-07 Thread Paul Emmerich
Hi, the "osd_recovery_sleep_hdd/ssd" options are way better to fine-tune the impact of a backfill operation in this case. Paul 2018-06-07 20:55 GMT+02:00 David Turner : > A recommendation for adding disks with minimal impact is to add them with > a crush weight of 0 (configurable in the

Re: [ceph-users] cannot add new OSDs in mimic

2018-06-07 Thread Michael Kuriger
Yes, I followed the procedure. Also, I'm not able to create new OSD's at all in mimic, even on a newly deployed osd server. Same error. Even if I pass the --id {id} parameter to the ceph-volume command, it still uses the first available ID and not the one I specify. Mike Kuriger Sr. Unix

Re: [ceph-users] cannot add new OSDs in mimic

2018-06-07 Thread Vasu Kulkarni
It is actually documented in replacing osd case, http://docs.ceph.com/docs/master/rados/operations/add-or-rm-osds/#replacing-an-osd, I hope you followed that procedure? On Thu, Jun 7, 2018 at 1:11 PM, Michael Kuriger wrote: > Do you mean: > ceph osd destroy {ID} --yes-i-really-mean-it > > Mike
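
For context, the replacement flow on that page boils down to something like the following (OSD id 12 and /dev/sdx are placeholders); whether ceph-volume actually honours the requested id is what this thread is chasing:

  # free the id for re-use without removing it from the crush map
  ceph osd destroy 12 --yes-i-really-mean-it
  ceph-volume lvm zap /dev/sdx
  # recreate the OSD, asking for the old id back
  ceph-volume lvm create --osd-id 12 --data /dev/sdx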

Re: [ceph-users] cannot add new OSDs in mimic

2018-06-07 Thread Gerhard W. Recher
Michael, this apparently changed in mimic to "--yes-i-really-really-mean-it" :( Gerhard W. Recher net4sec UG (haftungsbeschränkt) Leitenweg 6 86929 Penzing +49 171 4802507 On 07.06.2018 at 22:11, Michael Kuriger wrote: Do you mean: ceph osd destroy {ID} --yes-i-really-mean-it Mike Kuriger

Re: [ceph-users] cannot add new OSDs in mimic

2018-06-07 Thread Michael Kuriger
Do you mean: ceph osd destroy {ID} --yes-i-really-mean-it Mike Kuriger -Original Message- From: Vasu Kulkarni [mailto:vakul...@redhat.com] Sent: Thursday, June 07, 2018 12:28 PM To: Michael Kuriger Cc: ceph-users Subject: Re: [ceph-users] cannot add new OSDs in mimic There is a osd

Re: [ceph-users] cannot add new OSDs in mimic

2018-06-07 Thread Vasu Kulkarni
There is an osd destroy command, but it is not documented; did you run that as well? On Thu, Jun 7, 2018 at 12:21 PM, Michael Kuriger wrote: > CEPH team, > Is there a solution yet for adding OSDs in mimic - specifically re-using old > IDs? I was looking over this BUG report - >

[ceph-users] cannot add new OSDs in mimic

2018-06-07 Thread Michael Kuriger
CEPH team, Is there a solution yet for adding OSDs in mimic - specifically re-using old IDs? I was looking over this BUG report - https://tracker.ceph.com/issues/24423 and my issue is similar. I removed a bunch of OSD's after upgrading to mimic and I'm not able to re-add them using the new

Re: [ceph-users] ceph-volume: failed to activate some bluestore osds

2018-06-07 Thread Alfredo Deza
On Thu, Jun 7, 2018 at 3:04 PM, Dan van der Ster wrote: > On Thu, Jun 7, 2018 at 8:58 PM Alfredo Deza wrote: >> >> On Thu, Jun 7, 2018 at 2:45 PM, Dan van der Ster wrote: >> > On Thu, Jun 7, 2018 at 6:58 PM Alfredo Deza wrote: >> >> >> >> On Thu, Jun 7, 2018 at 12:09 PM, Sage Weil wrote: >>

Re: [ceph-users] ceph-volume: failed to activate some bluestore osds

2018-06-07 Thread Dan van der Ster
On Thu, Jun 7, 2018 at 8:58 PM Alfredo Deza wrote: > > On Thu, Jun 7, 2018 at 2:45 PM, Dan van der Ster wrote: > > On Thu, Jun 7, 2018 at 6:58 PM Alfredo Deza wrote: > >> > >> On Thu, Jun 7, 2018 at 12:09 PM, Sage Weil wrote: > >> > On Thu, 7 Jun 2018, Dan van der Ster wrote: > >> >> On Thu,

Re: [ceph-users] ceph-volume: failed to activate some bluestore osds

2018-06-07 Thread Alfredo Deza
On Thu, Jun 7, 2018 at 2:45 PM, Dan van der Ster wrote: > On Thu, Jun 7, 2018 at 6:58 PM Alfredo Deza wrote: >> >> On Thu, Jun 7, 2018 at 12:09 PM, Sage Weil wrote: >> > On Thu, 7 Jun 2018, Dan van der Ster wrote: >> >> On Thu, Jun 7, 2018 at 5:36 PM Dan van der Ster >> >> wrote: >> >> > >>

Re: [ceph-users] Adding additional disks to the production cluster without performance impacts on the existing

2018-06-07 Thread David Turner
A recommendation for adding disks with minimal impact is to add them with a crush weight of 0 (configurable in the ceph.conf file) and then increasing their weight in small increments until you get to the desired OSD weight. That way you're never moving too much data at once and can stop at any
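
The ceph.conf knob being referred to here is presumably osd_crush_initial_weight; a minimal example:

  [osd]
  osd crush initial weight = 0

New OSDs then join the crush map with weight 0 and can be brought up gradually with ceph osd crush reweight.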

Re: [ceph-users] ceph-volume: failed to activate some bluestore osds

2018-06-07 Thread Dan van der Ster
On Thu, Jun 7, 2018 at 6:58 PM Alfredo Deza wrote: > > On Thu, Jun 7, 2018 at 12:09 PM, Sage Weil wrote: > > On Thu, 7 Jun 2018, Dan van der Ster wrote: > >> On Thu, Jun 7, 2018 at 5:36 PM Dan van der Ster > >> wrote: > >> > > >> > On Thu, Jun 7, 2018 at 5:34 PM Sage Weil wrote: > >> > > > >>

Re: [ceph-users] ceph-volume: failed to activate some bluestore osds

2018-06-07 Thread Dan van der Ster
On Thu, Jun 7, 2018 at 6:33 PM Sage Weil wrote: > > On Thu, 7 Jun 2018, Dan van der Ster wrote: > > > > Wait, we found something!!! > > > > > > > > In the 1st 4k on the block we found the block.db pointing at the wrong > > > > device (/dev/sdc1 instead of /dev/sdaa1) > > > > > > > > 0130 6b

Re: [ceph-users] rbd map hangs

2018-06-07 Thread Ilya Dryomov
On Thu, Jun 7, 2018 at 6:30 PM, Jason Dillaman wrote: > On Thu, Jun 7, 2018 at 12:13 PM, Tracy Reed wrote: >> On Thu, Jun 07, 2018 at 08:40:50AM PDT, Ilya Dryomov spake thusly: >>> > Kernel is Linux cpu04.mydomain.com 3.10.0-229.20.1.el7.x86_64 #1 SMP Tue >>> > Nov 3 19:10:07 UTC 2015 x86_64

Re: [ceph-users] Openstack VMs with Ceph EC pools

2018-06-07 Thread Jason Dillaman
On Thu, Jun 7, 2018 at 12:54 PM, Pardhiv Karri wrote: > Thank you, Andrew and Jason for replying. > > Jason, > Do you have a sample ceph config file that you can share which works with > RBD and EC pools? Yup -- see below from my previous email. > Thanks, > Pardhiv Karri > > On Thu, Jun 7, 2018

[ceph-users] Mimic (13.2.0) Release Notes Bug on CephFS Snapshot Upgrades

2018-06-07 Thread Patrick Donnelly
There was a bug [1] in the release notes [2] which had incorrect commands for upgrading the snapshot format of an existing CephFS file system which has had snapshots enabled at some point. The correction is here [3]: diff --git a/doc/releases/mimic.rst b/doc/releases/mimic.rst index

Re: [ceph-users] Openstack VMs with Ceph EC pools

2018-06-07 Thread Pardhiv Karri
Thank you, Andrew and Jason for replying. Jason, Do you have a sample ceph config file that you can share which works with RBD and EC pools? Thanks, Pardhiv Karri On Thu, Jun 7, 2018 at 9:08 AM, Jason Dillaman wrote: > On Thu, Jun 7, 2018 at 11:54 AM, Andrew Denton > wrote: > > On Wed,

Re: [ceph-users] pool has many more objects per pg than average

2018-06-07 Thread Brett Chancellor
The error will go away once you start storing data in the other pools. Or, you could simply silence the message with mon_pg_warn_max_object_skew = 0 On Thu, Jun 7, 2018 at 10:48 AM, Torin Woltjer wrote: > I have a ceph cluster and status shows this error: pool libvirt-pool has > many more
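
A sketch of the ceph.conf change; [global] is used here because, depending on the release, the warning is generated by the mons or the mgr:

  [global]
  mon pg warn max object skew = 0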

Re: [ceph-users] ceph-volume: failed to activate some bluestore osds

2018-06-07 Thread Sage Weil
On Thu, 7 Jun 2018, Dan van der Ster wrote: > > > Wait, we found something!!! > > > > > > In the 1st 4k on the block we found the block.db pointing at the wrong > > > device (/dev/sdc1 instead of /dev/sdaa1) > > > > > > 0130 6b 35 79 2b 67 3d 3d 0d 00 00 00 70 61 74 68 5f > > >

Re: [ceph-users] rbd map hangs

2018-06-07 Thread Sergey Malinin
http://elrepo.org/tiki/kernel-ml provides 4.17 > On 7.06.2018, at 19:13, Tracy Reed wrote: > > It's what's shipping with CentOS/RHEL 7 and probably what the vast > majority of people are using aside from perhaps the Ubuntu LTS people.
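
For completeness, pulling in the mainline kernel from ELRepo on CentOS 7 looks roughly like this, assuming the elrepo-release repository package is already installed:

  yum --enablerepo=elrepo-kernel install kernel-ml
  # then update the grub default entry and reboot into the new kernel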

Re: [ceph-users] ceph-volume: failed to activate some bluestore osds

2018-06-07 Thread Sage Weil
On Thu, 7 Jun 2018, Dan van der Ster wrote: > On Thu, Jun 7, 2018 at 6:09 PM Sage Weil wrote: > > > > On Thu, 7 Jun 2018, Dan van der Ster wrote: > > > On Thu, Jun 7, 2018 at 5:36 PM Dan van der Ster > > > wrote: > > > > > > > > On Thu, Jun 7, 2018 at 5:34 PM Sage Weil wrote: > > > > > > > > >

Re: [ceph-users] rbd map hangs

2018-06-07 Thread Jason Dillaman
On Thu, Jun 7, 2018 at 12:13 PM, Tracy Reed wrote: > On Thu, Jun 07, 2018 at 08:40:50AM PDT, Ilya Dryomov spake thusly: >> > Kernel is Linux cpu04.mydomain.com 3.10.0-229.20.1.el7.x86_64 #1 SMP Tue >> > Nov 3 19:10:07 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux >> >> This is a *very* old kernel. > >

Re: [ceph-users] ceph-volume: failed to activate some bluestore osds

2018-06-07 Thread Dan van der Ster
On Thu, Jun 7, 2018 at 6:09 PM Sage Weil wrote: > > On Thu, 7 Jun 2018, Dan van der Ster wrote: > > On Thu, Jun 7, 2018 at 5:36 PM Dan van der Ster wrote: > > > > > > On Thu, Jun 7, 2018 at 5:34 PM Sage Weil wrote: > > > > > > > > On Thu, 7 Jun 2018, Dan van der Ster wrote: > > > > > On Thu,

Re: [ceph-users] rbd map hangs

2018-06-07 Thread Tracy Reed
On Thu, Jun 07, 2018 at 08:40:50AM PDT, Ilya Dryomov spake thusly: > > Kernel is Linux cpu04.mydomain.com 3.10.0-229.20.1.el7.x86_64 #1 SMP Tue > > Nov 3 19:10:07 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux > > This is a *very* old kernel. It's what's shipping with CentOS/RHEL 7 and probably what

Re: [ceph-users] ceph-volume: failed to activate some bluestore osds

2018-06-07 Thread Dan van der Ster
On Thu, Jun 7, 2018 at 6:01 PM Dan van der Ster wrote: > > On Thu, Jun 7, 2018 at 5:36 PM Dan van der Ster wrote: > > > > On Thu, Jun 7, 2018 at 5:34 PM Sage Weil wrote: > > > > > > On Thu, 7 Jun 2018, Dan van der Ster wrote: > > > > On Thu, Jun 7, 2018 at 4:41 PM Sage Weil wrote: > > > > > >

Re: [ceph-users] ceph-volume: failed to activate some bluestore osds

2018-06-07 Thread Sage Weil
On Thu, 7 Jun 2018, Dan van der Ster wrote: > On Thu, Jun 7, 2018 at 5:36 PM Dan van der Ster wrote: > > > > On Thu, Jun 7, 2018 at 5:34 PM Sage Weil wrote: > > > > > > On Thu, 7 Jun 2018, Dan van der Ster wrote: > > > > On Thu, Jun 7, 2018 at 4:41 PM Sage Weil wrote: > > > > > > > > > > On

Re: [ceph-users] Openstack VMs with Ceph EC pools

2018-06-07 Thread Jason Dillaman
On Thu, Jun 7, 2018 at 11:54 AM, Andrew Denton wrote: > On Wed, 2018-06-06 at 17:02 -0700, Pardhiv Karri wrote: >> Hi, >> >> Is anyone using Openstack with Ceph Erasure Coding pools as it now >> supports RBD in Luminous. If so, hows the performance? > > I attempted it, but couldn't figure out

Re: [ceph-users] ceph-volume: failed to activate some bluestore osds

2018-06-07 Thread Dan van der Ster
On Thu, Jun 7, 2018 at 5:36 PM Dan van der Ster wrote: > > On Thu, Jun 7, 2018 at 5:34 PM Sage Weil wrote: > > > > On Thu, 7 Jun 2018, Dan van der Ster wrote: > > > On Thu, Jun 7, 2018 at 4:41 PM Sage Weil wrote: > > > > > > > > On Thu, 7 Jun 2018, Dan van der Ster wrote: > > > > > On Thu, Jun

Re: [ceph-users] Openstack VMs with Ceph EC pools

2018-06-07 Thread Andrew Denton
On Wed, 2018-06-06 at 17:02 -0700, Pardhiv Karri wrote: > Hi, > > Is anyone using Openstack with Ceph Erasure Coding pools as it now > supports RBD in Luminous. If so, hows the performance? I attempted it, but couldn't figure out how to get Cinder to specify the data pool. You can't just point
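
The usual approach (and presumably what Jason's sample config shows) is to keep image metadata in a replicated pool and send the data objects to the EC pool via the client config; a sketch with placeholder pool and client names:

  # EC pools need overwrites enabled before RBD can use them as a data pool
  ceph osd pool set ecpool allow_ec_overwrites true

  # ceph.conf section used by cinder-volume
  [client.cinder]
  rbd default data pool = ecpool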

Re: [ceph-users] rbd map hangs

2018-06-07 Thread Ilya Dryomov
On Thu, Jun 7, 2018 at 4:33 PM, Tracy Reed wrote: > On Thu, Jun 07, 2018 at 02:05:31AM PDT, Ilya Dryomov spake thusly: >> > find /sys/kernel/debug/ceph -type f -print -exec cat {} \; >> >> Can you paste the entire output of that command? >> >> Which kernel are you running on the client box? > >

Re: [ceph-users] ceph-volume: failed to activate some bluestore osds

2018-06-07 Thread Dan van der Ster
On Thu, Jun 7, 2018 at 5:34 PM Sage Weil wrote: > > On Thu, 7 Jun 2018, Dan van der Ster wrote: > > On Thu, Jun 7, 2018 at 4:41 PM Sage Weil wrote: > > > > > > On Thu, 7 Jun 2018, Dan van der Ster wrote: > > > > On Thu, Jun 7, 2018 at 4:33 PM Sage Weil wrote: > > > > > > > > > > On Thu, 7 Jun

Re: [ceph-users] ceph-volume: failed to activate some bluestore osds

2018-06-07 Thread Sage Weil
On Thu, 7 Jun 2018, Dan van der Ster wrote: > On Thu, Jun 7, 2018 at 4:41 PM Sage Weil wrote: > > > > On Thu, 7 Jun 2018, Dan van der Ster wrote: > > > On Thu, Jun 7, 2018 at 4:33 PM Sage Weil wrote: > > > > > > > > On Thu, 7 Jun 2018, Dan van der Ster wrote: > > > > > Hi all, > > > > > > > > >

Re: [ceph-users] ceph-volume: failed to activate some bluestore osds

2018-06-07 Thread Dan van der Ster
On Thu, Jun 7, 2018 at 5:16 PM Alfredo Deza wrote: > > On Thu, Jun 7, 2018 at 10:54 AM, Dan van der Ster wrote: > > On Thu, Jun 7, 2018 at 4:41 PM Sage Weil wrote: > >> > >> On Thu, 7 Jun 2018, Dan van der Ster wrote: > >> > On Thu, Jun 7, 2018 at 4:33 PM Sage Weil wrote: > >> > > > >> > > On

Re: [ceph-users] ceph-volume: failed to activate some bluestore osds

2018-06-07 Thread Alfredo Deza
On Thu, Jun 7, 2018 at 10:54 AM, Dan van der Ster wrote: > On Thu, Jun 7, 2018 at 4:41 PM Sage Weil wrote: >> >> On Thu, 7 Jun 2018, Dan van der Ster wrote: >> > On Thu, Jun 7, 2018 at 4:33 PM Sage Weil wrote: >> > > >> > > On Thu, 7 Jun 2018, Dan van der Ster wrote: >> > > > Hi all, >> > > >

[ceph-users] slow MDS requests [Solved]

2018-06-07 Thread Alfredo Daniel Rezinovsky
I had a lot of "slow MDS request" warnings (MDS, not OSD) when writing a lot of small files to cephfs. The problem was I/O getting stuck when flushing XFS buffers. I had the same problem on other (non-ceph) XFS systems when asking for a lot of inode operations (rm -rf for example). The solution was to

Re: [ceph-users] ceph-volume: failed to activate some bluestore osds

2018-06-07 Thread Dan van der Ster
On Thu, Jun 7, 2018 at 4:41 PM Sage Weil wrote: > > On Thu, 7 Jun 2018, Dan van der Ster wrote: > > On Thu, Jun 7, 2018 at 4:33 PM Sage Weil wrote: > > > > > > On Thu, 7 Jun 2018, Dan van der Ster wrote: > > > > Hi all, > > > > > > > > We have an intermittent issue where bluestore osds sometimes

[ceph-users] pool has many more objects per pg than average

2018-06-07 Thread Torin Woltjer
I have a ceph cluster and status shows this error: pool libvirt-pool has many more objects per pg than average (too few pgs?) This pool has the most stored in it currently, by a large margin. The other pools are underutilized currently, but are purposed to take a role much greater than

Re: [ceph-users] ceph-volume: failed to activate some bluestore osds

2018-06-07 Thread Alfredo Deza
On Thu, Jun 7, 2018 at 10:40 AM, Dan van der Ster wrote: > On Thu, Jun 7, 2018 at 4:31 PM Alfredo Deza wrote: >> >> On Thu, Jun 7, 2018 at 10:23 AM, Dan van der Ster >> wrote: >> > Hi all, >> > >> > We have an intermittent issue where bluestore osds sometimes fail to >> > start after a reboot.

Re: [ceph-users] ceph-volume: failed to activate some bluestore osds

2018-06-07 Thread Sage Weil
On Thu, 7 Jun 2018, Dan van der Ster wrote: > On Thu, Jun 7, 2018 at 4:33 PM Sage Weil wrote: > > > > On Thu, 7 Jun 2018, Dan van der Ster wrote: > > > Hi all, > > > > > > We have an intermittent issue where bluestore osds sometimes fail to > > > start after a reboot. > > > The osds all fail the

Re: [ceph-users] ceph-volume: failed to activate some bluestore osds

2018-06-07 Thread Dan van der Ster
On Thu, Jun 7, 2018 at 4:31 PM Alfredo Deza wrote: > > On Thu, Jun 7, 2018 at 10:23 AM, Dan van der Ster wrote: > > Hi all, > > > > We have an intermittent issue where bluestore osds sometimes fail to > > start after a reboot. > > The osds all fail the same way [see 2], failing to open the

Re: [ceph-users] ceph-volume: failed to activate some bluestore osds

2018-06-07 Thread Dan van der Ster
On Thu, Jun 7, 2018 at 4:33 PM Sage Weil wrote: > > On Thu, 7 Jun 2018, Dan van der Ster wrote: > > Hi all, > > > > We have an intermittent issue where bluestore osds sometimes fail to > > start after a reboot. > > The osds all fail the same way [see 2], failing to open the superblock. > > One

Re: [ceph-users] rbd map hangs

2018-06-07 Thread Tracy Reed
On Thu, Jun 07, 2018 at 02:05:31AM PDT, Ilya Dryomov spake thusly: > > find /sys/kernel/debug/ceph -type f -print -exec cat {} \; > > Can you paste the entire output of that command? > > Which kernel are you running on the client box? Kernel is Linux cpu04.mydomain.com

Re: [ceph-users] ceph-volume: failed to activate some bluestore osds

2018-06-07 Thread Sage Weil
On Thu, 7 Jun 2018, Dan van der Ster wrote: > Hi all, > > We have an intermittent issue where bluestore osds sometimes fail to > start after a reboot. > The osds all fail the same way [see 2], failing to open the superblock. > On one particular host, there are 24 osds and 4 SSDs partitioned for

Re: [ceph-users] ceph-volume: failed to activate some bluestore osds

2018-06-07 Thread Alfredo Deza
On Thu, Jun 7, 2018 at 10:23 AM, Dan van der Ster wrote: > Hi all, > > We have an intermittent issue where bluestore osds sometimes fail to > start after a reboot. > The osds all fail the same way [see 2], failing to open the superblock. > On one particular host, there are 24 osds and 4 SSDs

[ceph-users] ceph-volume: failed to activate some bluestore osds

2018-06-07 Thread Dan van der Ster
Hi all, We have an intermittent issue where bluestore osds sometimes fail to start after a reboot. The osds all fail the same way [see 2], failing to open the superblock. On one particular host, there are 24 osds and 4 SSDs partitioned for the block.db's. The affected non-starting OSDs all have
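
One way to see what each OSD's block symlink and on-disk label actually point at (the OSD id is a placeholder); the label dumped by ceph-bluestore-tool should include the recorded block.db path:

  ls -l /var/lib/ceph/osd/ceph-12/
  ceph-bluestore-tool show-label --dev /var/lib/ceph/osd/ceph-12/block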

Re: [ceph-users] I/O hangs when one of three nodes is down

2018-06-07 Thread Grigori Frolov
Thank you, Burkhard. There are really 3 active MDSs, so this is a misconfiguration. I will try a standby one. Kind regards, Grigori. From: ceph-users on behalf of Burkhard Linke Sent: June 7, 2018 18:59 To: ceph-users@lists.ceph.com Subject: Re:

Re: [ceph-users] I/O hangs when one of three nodes is down

2018-06-07 Thread Burkhard Linke
Hi, On 06/07/2018 02:52 PM, Фролов Григорий wrote: Hello. Could you please help me troubleshoot the issue. I have 3 nodes in a cluster. *snipsnap* root@testk8s2:~# ceph -s cluster 0bcc00ec-731a-4734-8d76-599f70f06209 health HEALTH_ERR 80 pgs degraded

Re: [ceph-users] I/O hangs when one of three nodes is down

2018-06-07 Thread Grigori Frolov
root@testk8s1:~# ceph osd pool ls detail pool 0 'rbd' replicated size 3 min_size 2 crush_ruleset 0 object_hash rjenkins pg_num 64 pgp_num 64 last_change 1 flags hashpspool stripe_width 0 pool 1 'cephfs_data' replicated size 3 min_size 2 crush_ruleset 0 object_hash rjenkins pg_num 8 pgp_num 8

Re: [ceph-users] FAILED assert(p != recovery_info.ss.clone_snaps.end())

2018-06-07 Thread Nick Fisk
So I've recompiled a 12.2.5 ceph-osd binary with the fix included in https://github.com/ceph/ceph/pull/22396 The OSD has restarted as expected and the PG is now active+clean ..success so far. What's the best method to clean up the stray snapshot on OSD.46? I'm guessing using the

Re: [ceph-users] Adding cluster network to running cluster

2018-06-07 Thread mj
On 06/07/2018 01:45 PM, Wido den Hollander wrote: Removing cluster network is enough. After the restart the OSDs will not publish a cluster network in the OSDMap anymore. You can keep the public network in ceph.conf and can even remove that after you removed the 10.10.x.x addresses from the
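
A minimal before/after sketch of the ceph.conf change being discussed (the /16 masks are assumptions; only the subnets come from this thread):

  # before
  [global]
  public network  = 192.168.0.0/16
  cluster network = 10.10.0.0/16

  # after (restart the OSDs one by one afterwards)
  [global]
  public network  = 192.168.0.0/16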

[ceph-users] I/O hangs when one of three nodes is down

2018-06-07 Thread Фролов Григорий
Hello. Could you please help me troubleshoot the issue. I have 3 nodes in a cluster. ID WEIGHT TYPE NAME UP/DOWN REWEIGHT PRIMARY-AFFINITY -1 0.02637 root default -2 0.00879 host testk8s3 0 0.00879 osd.0 up 1.0 1.0 -3 0.00879 host testk8s1

Re: [ceph-users] Adding cluster network to running cluster

2018-06-07 Thread Wido den Hollander
On 06/07/2018 01:39 PM, mj wrote: > Hi, > > Please allow me to ask one more question: > > We currently have a seperated network: cluster on 10.10.x.x and public > on 192.168.x.x. > > I would like to migrate all network to 192.168.x.x setup, which would > give us 2*10G. > > Is simply

Re: [ceph-users] Prioritize recovery over backfilling

2018-06-07 Thread Sage Weil
On Wed, 6 Jun 2018, Caspar Smit wrote: > Hi all, > > We have a Luminous 12.2.2 cluster with 3 nodes and I recently added a node > to it. > > osd-max-backfills is at the default 1 so backfilling didn't go very fast > but that doesn't matter. > > Once it started backfilling everything looked ok:

Re: [ceph-users] Adding cluster network to running cluster

2018-06-07 Thread Paul Emmerich
We also build almost all of our clusters with a single Ceph network. 2x10 Gbit/s is almost never the bottleneck. Paul 2018-06-07 11:05 GMT+02:00 Wido den Hollander : > > > On 06/07/2018 10:56 AM, Kevin Olbrich wrote: > > Realy? > > > > I always thought that splitting the replication network

Re: [ceph-users] Prioritize recovery over backfilling

2018-06-07 Thread Piotr Dałek
On 18-06-07 12:43 PM, Caspar Smit wrote: Hi Piotr, Thanks for your answer! I've set nodown and now it doesn't mark any OSD's as down anymore :) Any tip when everything is recovered/backfilled and unsetting the nodown flag? When all pgs are reported as active+clean (any scrubbing/deep

Re: [ceph-users] Prioritize recovery over backfilling

2018-06-07 Thread Caspar Smit
Hi Piotr, Thanks for your answer! I've set nodown and now it doesn't mark any OSD's as down anymore :) Any tips for when everything is recovered/backfilled and I unset the nodown flag? Should I shut down all activity to the ceph cluster before that moment? If I unset the nodown flag and suddenly a lot of
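
A rough sketch of the check-then-unset sequence being discussed:

  # wait until all pgs report active+clean
  ceph -s
  ceph pg stat
  # then drop the flag; any OSD that is genuinely dead will now be marked down
  ceph osd unset nodown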

Re: [ceph-users] Adding cluster network to running cluster

2018-06-07 Thread Wido den Hollander
On 06/07/2018 10:56 AM, Kevin Olbrich wrote: > Really? > > I always thought that splitting the replication network is best practice. > Keeping everything in the same IPv6 network is much easier. > No, there is no big benefit unless your use case (which 99% isn't) asks for it. Keep it simple,

Re: [ceph-users] rbd map hangs

2018-06-07 Thread Ilya Dryomov
On Thu, Jun 7, 2018 at 5:12 AM, Tracy Reed wrote: > > Hello all! I'm running luminous with old style non-bluestore OSDs. ceph > 10.2.9 clients though, haven't been able to upgrade those yet. > > Occasionally I have access to rbds hang on the client such as right now. > I tried to dd a VM image

Re: [ceph-users] Adding cluster network to running cluster

2018-06-07 Thread Kevin Olbrich
Really? I always thought that splitting the replication network is best practice. Keeping everything in the same IPv6 network is much easier. Thank you. Kevin 2018-06-07 10:44 GMT+02:00 Wido den Hollander : > > > On 06/07/2018 09:46 AM, Kevin Olbrich wrote: > > Hi! > > > > When we installed

Re: [ceph-users] Adding cluster network to running cluster

2018-06-07 Thread Wido den Hollander
On 06/07/2018 09:46 AM, Kevin Olbrich wrote: > Hi! > > When we installed our new luminous cluster, we had issues with the > cluster network (setup of mon's failed). > We moved on with a single network setup. > > Now I would like to set the cluster network again but the cluster is in > use (4

Re: [ceph-users] Adding cluster network to running cluster

2018-06-07 Thread Burkhard Linke
Hi, I may be wrong, but AFAIK the cluster network is only used to bind the corresponding functionality to the correct network interface. There's no check for a common CIDR range or something similar in CEPH. As long as the traffic is routeable from the current public network and the new

Re: [ceph-users] Adding cluster network to running cluster

2018-06-07 Thread David Turner
2 Things about this. First, Mons do not communicate over the cluster network. Only OSD daemons send any traffic over that network. Mons, MDS, RGW, MGR, etc all communicate over the public network. OSDs communicate with all clients, Mons, etc on the public network and with each other on the

Re: [ceph-users] Update to Mimic with prior Snapshots leads to MDS damaged metadata

2018-06-07 Thread Yan, Zheng
On Thu, Jun 7, 2018 at 2:44 PM, Tobias Florek wrote: > Hi! > > Thank you for your help! The cluster is running healthily for a day now. > > Regarding the problem, I just checked in the release notes [1] and on > docs.ceph.com and did not find the right invocation after an upgrade. > Maybe that

[ceph-users] Debian GPG key for Luminous

2018-06-07 Thread Steffen Winther Sørensen
Community, Where would I find the GPG release key for Debian Luminous? I’m getting: W: GPG error: http://download.ceph.com/debian-luminous stretch InRelease: The following signatures couldn't be verified because the public key is not available: NO_PUBKEY E84AC2C0460F3994 when attempting
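
The release key is published at https://download.ceph.com/keys/release.asc (which should match the key id the warning complains about); fetching and adding it looks like:

  wget -q -O- 'https://download.ceph.com/keys/release.asc' | sudo apt-key add -
  apt-get update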

Re: [ceph-users] rbd map hangs

2018-06-07 Thread ceph
Just a guess: do you have an inconsistent MTU across your network? I have had your issue before, when the OSDs and client were using jumbo frames but the MONs did not (or something like that). On 06/07/2018 05:12 AM, Tracy Reed wrote: > > Hello all! I'm running luminous with old style non-bluestore OSDs. ceph >
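
A quick way to check for an MTU mismatch between two hosts (the address is a placeholder; 8972 bytes is 9000 minus the IP and ICMP headers):

  # non-fragmentable jumbo-sized ping from an OSD/client host towards the mon
  ping -M do -s 8972 -c 3 192.168.0.1
  # if this fails while a plain ping works, the path does not carry jumbo frames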

Re: [ceph-users] mimic cephfs snapshot in active/standby mds env

2018-06-07 Thread Yan, Zheng
On Thu, Jun 7, 2018 at 10:04 AM, Brady Deetz wrote: > I've seen several mentions of stable snapshots in Mimic for cephfs in > multi-active mds environments. I'm currently running active/standby in > 12.2.5 with no snapshops. If I upgrade to Mimic, is there any concern with > snapshots in an

Re: [ceph-users] Prioritize recovery over backfilling

2018-06-07 Thread Piotr Dałek
On 18-06-06 09:29 PM, Caspar Smit wrote: Hi all, We have a Luminous 12.2.2 cluster with 3 nodes and I recently added a node to it. osd-max-backfills is at the default 1 so backfilling didn't go very fast but that doesn't matter. Once it started backfilling everything looked ok: ~300 pgs

Re: [ceph-users] Update to Mimic with prior Snapshots leads to MDS damaged metadata

2018-06-07 Thread Tobias Florek
Hi! Thank you for your help! The cluster is running healthily for a day now. Regarding the problem, I just checked in the release notes [1] and on docs.ceph.com and did not find the right invocation after an upgrade. Maybe that ought to be fixed. >> [upgrade from luminous to mimic with prior

Re: [ceph-users] Stop scrubbing

2018-06-07 Thread Wido den Hollander
On 06/06/2018 08:32 PM, Joe Comeau wrote: > When I am upgrading from filestore to bluestore > or any other server maintenance for a short time > (ie high I/O while rebuilding) >   > ceph osd set noout > ceph osd set noscrub > ceph osd set nodeep-scrub >   > when finished >   > ceph osd unset
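
The matching unset sequence once the maintenance is finished:

  ceph osd unset noout
  ceph osd unset noscrub
  ceph osd unset nodeep-scrub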