Re: [ceph-users] Running ceph in Deis/Docker
Hi,

This is a follow-up to my previous question: when the last monitor in a Ceph monitor set is down, what is the proper way to bring the monitor set back up? On one hand, we could try not to let this happen, but on the other hand, as Murphy's Law states, I am sure it will happen sooner or later.

Thanks
- JC

On 16/12/14 12:14 pm, Christian Balzer wrote:

Hello,

your subject is misleading, as this is not really related to Deis/Docker.

Find the very recent "Is mon initial members used after the first quorum?" thread in this ML. In short, list all 3 of your mons in the "mon initial members" setting.

And yes, rebooting everything at the same time can be "fun". I managed to get into a situation similar to yours once (though that cluster really had only one mon), and it eventually took a hard reset to fix things.

Christian

On Tue, 16 Dec 2014 08:52:15 +0800 Jimmy Chu wrote:

Hi,

I installed Ceph on 3 nodes, with one monitor and one OSD running on each node. After rebooting them all at once (I see now that this may have been a bad move), the Ceph monitors refuse to connect to each other. When I run

    ceph mon getmap -o /etc/ceph/monmap

or even

    ceph -s

it only shows the following:

Dec 14 16:38:44 deis-1 sh[933]: 2014-12-14 08:38:44.265419 7f5cec71f700 0 -- :/121 >> 10.132.183.191:6789/0 pipe(0x7f5ce40296a0 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7f5ce4029930).fault
Dec 14 16:38:44 deis-1 sh[933]: 2014-12-14 08:38:44.265419 7f5cec71f700 0 -- :/121 >> 10.132.183.192:6789/0 pipe(0x7f5ce40296a0 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7f5ce4029930).fault
Dec 14 16:38:50 deis-1 sh[933]: 2014-12-14 08:38:50.267398 7f5cec71f700 0 -- :/121 >> 10.132.183.190:6789/0 pipe(0x7f5cd40030e0 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7f5cd4003370).fault
...keep repeating...

So no quorum is formed, and the Ceph admin socket file is not there to connect to. What should my next step be to recover the storage?
This is my /etc/ceph/ceph.conf file:

[global]
fsid = cc368515-9dc6-48e2-9526-58ac4cbb3ec9
mon initial members = deis-3
auth cluster required = cephx
auth service required = cephx
auth client required = cephx
osd pool default size = 3
osd pool default min_size = 1
osd pool default pg_num = 128
osd pool default pgp_num = 128
osd recovery delay start = 15
log file = /dev/stdout

[mon.deis-3]
host = deis-3
mon addr = 10.132.183.190:6789

[mon.deis-1]
host = deis-1
mon addr = 10.132.183.191:6789

[mon.deis-2]
host = deis-2
mon addr = 10.132.183.192:6789

[client.radosgw.gateway]
host = deis-store-gateway
keyring = /etc/ceph/ceph.client.radosgw.keyring
rgw socket path = /var/run/ceph/ceph.radosgw.gateway.fastcgi.sock
log file = /dev/stdout

Thank you.
- Jimmy Chu

_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
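Following Christian's suggestion to list all three mons in the initial members setting, the [global] section of the conf above might be adjusted along these lines (a sketch only; `mon host` is added here as a common companion setting, with the addresses taken from the per-mon sections):

```ini
[global]
fsid = cc368515-9dc6-48e2-9526-58ac4cbb3ec9
mon initial members = deis-1, deis-2, deis-3
mon host = 10.132.183.191, 10.132.183.192, 10.132.183.190
```

With all three mons listed, any surviving monitor knows the full set of peers to contact when forming a quorum after a cold start, rather than only deis-3.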
[ceph-users] Running ceph in Deis/Docker
Hi,

I am running a 3-node Deis cluster with Ceph as the underlying FS, i.e. Ceph running inside Docker containers on three separate servers. I rebooted all three nodes (almost at once). After the reboot, the Ceph monitors refuse to connect to each other. Symptoms are:

- no quorum is formed
- the Ceph admin socket file does not exist
- only the following appears in the Ceph log:

Dec 14 16:38:44 deis-1 sh[933]: 2014-12-14 08:38:44.265419 7f5cec71f700 0 -- :/121 >> 10.132.183.191:6789/0 pipe(0x7f5ce40296a0 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7f5ce4029930).fault
Dec 14 16:38:44 deis-1 sh[933]: 2014-12-14 08:38:44.265419 7f5cec71f700 0 -- :/121 >> 10.132.183.192:6789/0 pipe(0x7f5ce40296a0 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7f5ce4029930).fault
Dec 14 16:38:50 deis-1 sh[933]: 2014-12-14 08:38:50.267398 7f5cec71f700 0 -- :/121 >> 10.132.183.190:6789/0 pipe(0x7f5cd40030e0 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7f5cd4003370).fault
...keep repeating...

This is my /etc/ceph/ceph.conf file:

[global]
fsid = cc368515-9dc6-48e2-9526-58ac4cbb3ec9
mon initial members = deis-3
auth cluster required = cephx
auth service required = cephx
auth client required = cephx
osd pool default size = 3
osd pool default min_size = 1
osd pool default pg_num = 128
osd pool default pgp_num = 128
osd recovery delay start = 15
log file = /dev/stdout

[mon.deis-3]
host = deis-3
mon addr = 10.132.183.190:6789

[mon.deis-1]
host = deis-1
mon addr = 10.132.183.191:6789

[mon.deis-2]
host = deis-2
mon addr = 10.132.183.192:6789

[client.radosgw.gateway]
host = deis-store-gateway
keyring = /etc/ceph/ceph.client.radosgw.keyring
rgw socket path = /var/run/ceph/ceph.radosgw.gateway.fastcgi.sock
log file = /dev/stdout

IP tables of the Docker host:

core@deis-3 ~ $ sudo iptables --list
Chain INPUT (policy DROP)
target          prot opt source           destination
Firewall-INPUT  all  --  anywhere         anywhere

Chain FORWARD (policy DROP)
target          prot opt source           destination
ACCEPT          tcp  --  anywhere         172.17.0.2       tcp dpt:http
ACCEPT          tcp  --  anywhere         172.17.0.2       tcp dpt:https
ACCEPT          tcp  --  anywhere         172.17.0.2       tcp dpt:
ACCEPT          all  --  anywhere         anywhere         ctstate RELATED,ESTABLISHED
ACCEPT          all  --  anywhere         anywhere
ACCEPT          all  --  anywhere         anywhere
Firewall-INPUT  all  --  anywhere         anywhere

Chain OUTPUT (policy ACCEPT)
target          prot opt source           destination

Chain Firewall-INPUT (2 references)
target          prot opt source           destination
ACCEPT          all  --  anywhere         anywhere
ACCEPT          icmp --  anywhere         anywhere         icmp echo-reply
ACCEPT          icmp --  anywhere         anywhere         icmp destination-unreachable
ACCEPT          icmp --  anywhere         anywhere         icmp time-exceeded
ACCEPT          icmp --  anywhere         anywhere         icmp echo-request
ACCEPT          all  --  anywhere         anywhere         ctstate RELATED,ESTABLISHED
ACCEPT          all  --  10.132.183.190   anywhere
ACCEPT          all  --  10.132.183.192   anywhere
ACCEPT          all  --  10.132.183.191   anywhere
ACCEPT          all  --  anywhere         anywhere
ACCEPT          tcp  --  anywhere         anywhere         ctstate NEW multiport dports ssh,,http,https
LOG             all  --  anywhere         anywhere         LOG level warning
REJECT          all  --  anywhere         anywhere         reject-with icmp-host-prohibited

All private IPs are ping-able from within the ceph monitor container. What could I do next to troubleshoot this issue?

Thanks a lot!
- Jimmy Chu
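Since ICMP ping succeeding does not prove the monitor TCP port is reachable, and the repeating `.fault` lines point at failing TCP connections to port 6789, one next step could be to probe that port directly from inside each container. A minimal sketch in Python, using the monitor addresses from the conf above (`check_tcp` is just an illustrative helper, not a Ceph tool):

```python
import socket

def check_tcp(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Monitor addresses taken from the ceph.conf quoted above.
mons = {
    "deis-3": "10.132.183.190",
    "deis-1": "10.132.183.191",
    "deis-2": "10.132.183.192",
}

for name, ip in mons.items():
    state = "reachable" if check_tcp(ip, 6789) else "UNREACHABLE"
    print(f"{name} ({ip}:6789): {state}")
```

The same check can be done with `nc -z <ip> 6789` from inside each container; if the port is unreachable while ping works, the Docker port publishing or the FORWARD chain rules above are the likely suspects.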