Why do you have an MDS active? I'd suggest getting rid of that, at least until you have everything else working.
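
If this is a sysvinit-style install (the prompts further down look like RHEL/CentOS), stopping the daemon should be enough for now. The mds id here is only a guess based on your mon name, so adjust it to match your ceph.conf:

service ceph stop mds.essperf3
ceph mds stat                        # check what the mdsmap reports afterwards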

I see you've set nodown on the OSDs. Did you have problems with the OSDs flapping? Do the OSDs have broken connectivity between themselves? Is some kind of firewall interfering here?
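
An easy sanity check is to look up each OSD's addresses and make sure every OSD host can reach every other one on the cluster (private) network. The grep pattern and placeholder address below are just illustrative:

ceph osd dump | grep '^osd\.'        # lists each OSD's public and cluster addresses
ping -c 3 <peer-cluster-ip>          # run from each OSD host to each of the others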

I've seen odd issues when the OSDs have broken private networking: you'll get one OSD marking all the others down. Adding this to my config helped:

[mon]
# require at least 2 OSDs to report a peer down before the mon marks it down
mon osd min down reporters = 2
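
You can also inject it into the running mon without a restart; with your single mon named essperf3 that would be something along the lines of:

ceph tell mon.essperf3 injectargs '--mon_osd_min_down_reporters 2'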


On 8/1/2014 5:41 PM, Bruce McFarland wrote:

Hello,

I've run out of ideas and assume I've overlooked something very basic. I've created 2 ceph clusters in the last 2 weeks with different OSD HW and private network fabrics -- 1GE and 10GE. I have never been able to get the OSDs to come up to the 'active+clean' state. I have followed your online documentation, and at this point the only thing I don't think I've done is modify the CRUSH map (although I have been looking into that). These are new clusters with no data and only 1 HDD and 1 SSD per OSD (24 2.5GHz cores with 64GB RAM).
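
So far I've only been reading about the CRUSH map rather than editing it; as I understand it, inspecting it goes roughly like this (file names arbitrary):

ceph osd tree                        # shows how the OSDs are grouped under hosts
ceph osd getcrushmap -o crush.bin    # dump the compiled map
crushtool -d crush.bin -o crush.txt  # decompile it to readable text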

Since the disks are being recycled, is there something I need to flag to let ceph just create its mappings but not scrub for data compatibility? I've tried setting the noscrub flag to no effect.

I also have constant OSD flapping. I've set nodown, but I assume that is just masking a problem that is still occurring.
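
For completeness, those flags were set with the standard commands, i.e. something like:

ceph osd set nodown
ceph osd set noscrub
ceph osd unset nodown                # to clear a flag again later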

Besides never reaching the 'active+clean' state, ceph-mon always crashes after being left running overnight. The OSDs all eventually fill /root with ceph logs, so I regularly have to bring everything down, delete the logs, and restart.
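
In case it points at something: I assume the log location can be pinned in ceph.conf with something like the lines below, which are just the stock default path rather than anything specific to my setup:

[global]
log file = /var/log/ceph/$cluster-$name.log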

I have all sorts of output available: the ceph.conf; OSD boot output with 'debug osd = 20' and 'debug ms = 1'; 'ceph -w' output; and pretty much everything else from the debug/monitoring suggestions in the online docs and 2 weeks of Google searches through blogs, mailing lists, etc.
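
Roughly what the OSD debug settings look like in my ceph.conf, in case the exact form matters (section placement assumed):

[osd]
debug osd = 20
debug ms = 1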

[root@essperf3 Ceph]# ceph -v
ceph version 0.80.1 (a38fe1169b6d2ac98b427334c12d7cf81f809b74)
[root@essperf3 Ceph]# ceph -s
    cluster 4b3ffe60-73f4-4512-b7da-b04e4775dd73
     health HEALTH_WARN 96 pgs incomplete; 106 pgs peering; 202 pgs stuck inactive; 202 pgs stuck unclean; nodown,noscrub flag(s) set
     monmap e1: 1 mons at {essperf3=209.243.160.35:6789/0}, election epoch 1, quorum 0 essperf3
     mdsmap e43: 1/1/1 up {0=essperf3=up:creating}
     osdmap e752: 3 osds: 3 up, 3 in
            flags nodown,noscrub
      pgmap v1476: 202 pgs, 4 pools, 0 bytes data, 0 objects
            134 MB used, 1158 GB / 1158 GB avail
                 106 creating+peering
                  96 creating+incomplete
[root@essperf3 Ceph]#

Suggestions?

Thanks,

Bruce



_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
