[ceph-users] Fwd: ceph-mon leader - election via CLI

2019-03-25 Thread M Ranga Swami Reddy
Hello - We have seen that the ceph-mon election takes time or causes issues when a mon leader goes down or during maintenance. So in this case - especially during maintenance - it should be possible to set the ceph-mon leader via the CLI before doing maintenance on a ceph-mon node. This functionality is very much u
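
As of the releases discussed here there is no command that picks the mon leader directly, but the current leader can at least be identified before maintenance; a rough sketch, assuming an admin keyring is available on the node:

  # which mon currently leads the quorum
  $ ceph quorum_status --format json-pretty | grep quorum_leader_name
  # ranks, quorum membership and election epoch
  $ ceph mon stat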

Re: [ceph-users] Access cephfs from second public network

2019-03-25 Thread Andres Rojas Guerrero
Hi, from what we're seeing, although Ceph permits adding further public networks, it seems the implementation is not fully operational (Mimic), and we need to route all traffic from the clients to the OSDs through the first network. For now we leave this approach aside and use only one public network to acc
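
For reference, ceph.conf does accept a comma-separated list of public networks; clients on the second subnet then still need a route to the addresses the OSDs actually bind to. A minimal sketch with placeholder subnets:

  [global]
  # first and second public network; MON/OSD daemons bind to addresses within these ranges
  public network = 10.0.1.0/24, 192.168.50.0/24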

Re: [ceph-users] Ceph MDS laggy

2019-03-25 Thread Mark Schouten
On Mon, Jan 21, 2019 at 10:17:31AM +0800, Yan, Zheng wrote: > It's http://tracker.ceph.com/issues/37977. Thanks for your help. > I think I've hit this bug. Ceph MDS using 100% CPU and reporting as laggy and being kicked out. I'm not sure though if this fix is currently in a released version of L

[ceph-users] 1/3 mon not working after upgrade to Nautilus

2019-03-25 Thread Clausen , Jörn
Hi! I just tried upgrading my test cluster from Mimic (13.2.5) to Nautilus (14.2.0), and everything looked fine - until I activated msgr2. At that moment, one of my three MONs (the then active one) fell out of the quorum and refused to join back. The two other MONs seem to work fine. ceph-mon
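
For anyone hitting the same thing, the usual Nautilus sequence is to enable msgr2 only once all mons run 14.2.0 and then confirm every mon advertises both address types; a short sketch, assuming default ports:

  $ ceph mon enable-msgr2   # only after all mons are on Nautilus
  $ ceph mon dump           # each mon should show a v2 (3300) and a v1 (6789) address
  $ ceph -s                 # quorum should list all three mons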

Re: [ceph-users] Ceph MDS laggy

2019-03-25 Thread Yan, Zheng
On Mon, Mar 25, 2019 at 6:36 PM Mark Schouten wrote: > > On Mon, Jan 21, 2019 at 10:17:31AM +0800, Yan, Zheng wrote: > > It's http://tracker.ceph.com/issues/37977. Thanks for your help. > > > > I think I've hit this bug. Ceph MDS using 100% CPU and reporting as > laggy and being kicked out. I'm n

Re: [ceph-users] Ceph MDS laggy

2019-03-25 Thread Mark Schouten
On Mon, Mar 25, 2019 at 07:13:20PM +0800, Yan, Zheng wrote: > Yes, the fix is in 12.2.11. Great, thanks. -- Mark Schouten | Tuxis Internet Engineering KvK: 61527076 | http://www.tuxis.nl/ T: 0318 200208 | i...@tuxis.nl
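
A quick way to confirm that the running daemons actually carry the 12.2.11 fix, sketched for Luminous or later:

  $ ceph versions                          # version breakdown per daemon type
  $ sudo ceph daemon mds.<name> version    # <name> is a placeholder for the local MDS id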

Re: [ceph-users] 1/3 mon not working after upgrade to Nautilus

2019-03-25 Thread Clausen , Jörn
Hi again! > moment, one of my three MONs (the then active one) fell out of the quorum "Active one" is of course nonsense; I confused it with the MGRs, which are running okay, btw, on the same three hosts. I reverted the MON back to a snapshot (vSphere) taken before the upgrade, repeated the upgrade, and ende

Re: [ceph-users] 1/3 mon not working after upgrade to Nautilus

2019-03-25 Thread Brian Topping
Did you check port access from other nodes? My guess is a forgotten firewall re-emerged on that node after reboot. Sent from my iPhone > On Mar 25, 2019, at 07:26, Clausen, Jörn wrote: > > Hi again! > >> moment, one of my three MONs (the then active one) fell out of the > > "active one" i
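
A rough sketch of that check from a peer node, using the default mon ports (6789 for msgr v1, 3300 for msgr v2 on Nautilus) and assuming firewalld:

  $ nc -zv <mon-host> 6789
  $ nc -zv <mon-host> 3300
  # on the mon node itself
  $ sudo firewall-cmd --list-all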

Re: [ceph-users] 1/3 mon not working after upgrade to Nautilus

2019-03-25 Thread Clausen , Jörn
Hi! On 25.03.2019 at 15:07, Brian Topping wrote: > Did you check port access from other nodes? My guess is a forgotten firewall re-emerged on that node after reboot. I am pretty sure it's not the firewall. To be extra sure, I switched it off for testing. I found this in the mon logs: On t
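
If the existing log lines are not conclusive, the mon's debug level can be raised temporarily through its admin socket; a sketch, assuming the mon id equals the short hostname:

  $ sudo ceph daemon mon.$(hostname -s) config set debug_mon 10/10
  $ sudo ceph daemon mon.$(hostname -s) mon_status   # rank, state and election epoch as this mon sees them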

Re: [ceph-users] OSD stuck in booting state

2019-03-25 Thread PHARABOT Vincent
Hello folks, can nobody give me a hint? The communication and auth with the mon is OK: 2019-03-25 14:16:25.342 7fa3af260700 1 -- 10.8.33.158:6789/0 <== osd.0 10.8.33.183:6800/293177 184 auth(proto 2 2 bytes epoch 0) v1 32+0+0 (2260890001 0 0) 0x559759ffd680 con 0x55975548700 0 2019-03-25
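
Besides the messenger log, the OSD's own view is worth checking via its admin socket; a small sketch, assuming the default socket location:

  $ sudo ceph daemon osd.0 status       # "state" should move from booting to active
  $ ceph osd dump | grep '^osd\.0 '     # is it marked up/in, and do the advertised addresses match?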

[ceph-users] Ceph will be at SUSECON 2019!

2019-03-25 Thread Mike Perez
Hi all, I'm excited to announce that we will have a booth at SUSECON 2-4 2019! https://www.susecon.com Thank you to SUSE for providing the Ceph Foundation space to allow developers and users of Ceph to showcase demos, and an area to meet for questions and discussions. If you can join the confer

[ceph-users] scrub errors

2019-03-25 Thread solarflow99
I noticed my cluster has scrub errors but the deep-scrub command doesn't show any errors. Is there any way to know what it takes to fix it? # ceph health detail HEALTH_ERR 1 pgs inconsistent; 47 scrub errors pg 10.2a is active+clean+inconsistent, acting [41,38,8] 47 scrub errors # zgrep 10.2a

Re: [ceph-users] scrub errors

2019-03-25 Thread Brad Hubbard
It would help to know what version you are running but, to begin with, could you post the output of the following? $ sudo ceph pg 10.2a query $ sudo rados list-inconsistent-obj 10.2a --format=json-pretty Also, have a read of http://docs.ceph.com/docs/mimic/rados/troubleshooting/troubleshooting-pg
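
If list-inconsistent-obj pins down which replica is bad, the usual follow-up is a repair; a hedged sketch, noting that on Hammer-era releases repair can copy the primary's (possibly bad) object over the replicas, so read the linked troubleshooting page first:

  $ ceph pg repair 10.2a
  $ ceph -w | grep 10.2a    # watch for the repair / deep-scrub result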

Re: [ceph-users] scrub errors

2019-03-25 Thread solarflow99
Hi, thanks. It's still using Hammer. Here's the output from the pg query; the last command you gave doesn't work at all, must be too old. # ceph pg 10.2a query { "state": "active+clean+inconsistent", "snap_trimq": "[]", "epoch": 23265, "up": [ 41, 38, 8

Re: [ceph-users] scrub errors

2019-03-25 Thread Brad Hubbard
Hammer is no longer supported. What's the status of osds 7 and 17? On Tue, Mar 26, 2019 at 8:56 AM solarflow99 wrote: > > hi, thanks. Its still using Hammer. Here's the output from the pg query, > the last command you gave doesn't work at all but be too old. > > > # ceph pg 10.2a query > { >
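
For reference, a quick way to answer that on a Hammer-era cluster, sketched:

  $ ceph osd tree | grep -E 'osd\.(7|17) '   # up/down state and weight
  $ ceph osd find 7                          # host and CRUSH location of osd.7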

Re: [ceph-users] scrub errors

2019-03-25 Thread solarflow99
Yes, I know it's old. I intend to have it replaced but that's a few months away and I was hoping to get past this. The other OSDs appear to be OK, I see them up and in; why, do you see something wrong? On Mon, Mar 25, 2019 at 4:00 PM Brad Hubbard wrote: > Hammer is no longer supported. > > What's t

Re: [ceph-users] v14.2.0 Nautilus released

2019-03-25 Thread Frank Yu
Forgive me, it's my mistake. On Sat, Mar 23, 2019 at 4:28 PM Frank Yu wrote: > Hi guys, > > I have tried to set up a cluster with this version, and I found the mgr > prometheus metrics have changed a lot compared with version 13.2.x, > e.g. there are no ceph_mds_* related metrics, or there is so
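
For anyone comparing the two releases, the mgr prometheus output can be diffed directly; a small sketch, assuming the module is enabled and listening on the default port, with <mgr-host> as a placeholder:

  $ ceph mgr module enable prometheus
  $ curl -s http://<mgr-host>:9283/metrics | grep '^ceph_mds' | sort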