Re: [ceph-users] 1 particular ceph-mon never joins on 0.67.2
Hi James,

Yes, all configured using the interfaces file. Only two interfaces, eth0 and eth1:

auto eth0
iface eth0 inet dhcp

auto eth1
iface eth1 inet dhcp

I took a single node and rebooted it several times, and it really was about 50/50 whether or not the OSDs showed up under 'localhost' or "n0".

I tried a few different things last night with no luck. I modified when ceph-all starts by writing different "start on" values to /etc/init/ceph-all.override. I was grasping at straws a bit, as I just kept adding (and'ing) events, hoping to find something that worked. I tried:

start on (local-filesystems and net-device-up IFACE=eth0)
start on (local-filesystems and net-device-up IFACE=eth0 and net-device-up IFACE=eth1)
start on (local-filesystems and net-device-up IFACE=eth0 and net-device-up IFACE=eth1 and started network-services)

Oddly, the last one seemed to work at first. When I added "started network-services" to the list, the OSDs came up correctly each time! But the monitor never started. If I started it directly with "start ceph-mon id=n0", it came up fine, but not during boot. I spent a couple of hours trying to debug *that* before I gave up and switched to static hostnames. =/ I had even thrown "--verbose" on the kernel command line so I could see all the upstart events happening, but didn't see anything obvious.

So now I'm back to the stock upstart scripts, using static hostnames, and I don't have any issues with OSDs moving in the crushmap, or any new problems with the monitors.

Sage, I do think I still saw a weird issue with my third mon not starting (same as the original email -- even now with static hostnames), but it was late, and I lost access to the cluster right about then and haven't regained it. I'll double-check that when I get access again and hopefully will find that problem has gone away too.

- Travis
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
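[For anyone hitting the same hostname race: rather than hunting for the right "start on" event combination, an override could also wait in a pre-start script until DHCP has replaced the placeholder hostname. This is a hypothetical, untested sketch of an /etc/init/ceph-all.override -- nothing in this thread confirms it, and the 30-second timeout is an arbitrary choice:]

```shell
# Hypothetical /etc/init/ceph-all.override -- untested sketch.
# Delay the ceph jobs until DHCP has set the real hostname, so
# OSDs don't register under 'localhost' in the crushmap.
start on local-filesystems
pre-start script
    for i in $(seq 1 30); do
        [ "$(hostname -s)" != "localhost" ] && exit 0
        sleep 1
    done
end script
```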
Re: [ceph-users] 1 particular ceph-mon never joins on 0.67.2
On 26/08/13 19:31, Sage Weil wrote:
>>> I'm wondering what kind of delay, or additional "start-on"
>>> logic I can add to the upstart script to work around this.
> Hmm, this is beyond my upstart-fu, unfortunately. This has come up
> before, actually. Previously we would wait for any interface to
> come up and then start, but that broke with multi-nic machines, and
> I ended up just making things start in runlevel [2345].
>
> James, do you know what should be done to make the job wait for
> *all* network interfaces to be up? Is that even the right solution
> here?

This is actually really tricky; runlevel [2345] should cover most use cases, as it ensures that interfaces configured in /etc/network/interfaces are up before the jobs start. But it sounds like that might not be the case here.

Travis - in your example, are all network interfaces configured using /etc/network/interfaces?

--
James Page
Ubuntu and Debian Developer
james.p...@ubuntu.com
jamesp...@debian.org
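[For reference -- this is an assumption about Ubuntu's ifupdown/upstart integration, not something confirmed in this thread: Ubuntu emits a static-network-up upstart event once every "auto" interface listed in /etc/network/interfaces has come up, which suggests an override along these lines:]

```shell
# Hypothetical override -- assumes Ubuntu's 'static-network-up'
# event, emitted once all 'auto' interfaces in
# /etc/network/interfaces are up. Note this would not by itself
# fix a DHCP-assigned hostname arriving late.
start on (local-filesystems and static-network-up)
```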
Re: [ceph-users] 1 particular ceph-mon never joins on 0.67.2
Cool. So far I have tried:

start on (local-filesystems and net-device-up IFACE=eth0)
start on (local-filesystems and net-device-up IFACE=eth0 and net-device-up IFACE=eth1)

About to try:

start on (local-filesystems and net-device-up IFACE=eth0 and net-device-up IFACE=eth1 and started network-services)

The "local-filesystems" + network device is billed as an alternative to runlevel if you need to do something *after* networking... No luck so far. I'll keep trying things out.

On Mon, Aug 26, 2013 at 2:31 PM, Sage Weil wrote:
> On Mon, 26 Aug 2013, Travis Rhoden wrote:
> > Hi Sage,
> >
> > Thanks for the response. I noticed that as well, and suspected hostname/DHCP/DNS shenanigans. What's weird is that all nodes are identically configured. I also have monitors running on n0 and n12, and they come up fine, every time.
> >
> > Here's the mon_host line from ceph.conf:
> >
> > mon_initial_members = n0, n12, n24
> > mon_host = 10.0.1.0,10.0.1.12,10.0.1.24
> >
> > just to test /etc/hosts and name resolution...
> >
> > root@n24:~# getent hosts n24
> > 10.0.1.24 n24
> > root@n24:~# hostname -s
> > n24
> >
> > The only loopback device in /etc/hosts is "127.0.0.1 localhost", so that should be fine.
> >
> > Upon rebooting this node, I've had the monitor come up okay once, maybe out of 12 tries. So it appears to be some kind of race... No clue what is going on. If I stop and start the monitor (or restart), it doesn't appear to change anything.
> >
> > However, on the topic of races, I'm having one other more pressing issue. Each OSD host is having its hostname assigned via DHCP. Until that assignment is made (during init), the hostname is "localhost", and then it switches over to "n<X>", for some node number. The issue I am seeing is that there is a race between this hostname assignment and the Ceph upstart scripts, such that sometimes ceph-osd starts while the hostname is still 'localhost'. This then causes the osd location to change in the crushmap, which is going to be a very bad thing. =) When rebooting all my nodes at once (there are several dozen), about 50% move from being under n<X> to localhost. Restarting all the ceph-osd jobs moves them back (because the hostname is defined).
> >
> > I'm wondering what kind of delay, or additional "start-on" logic I can add to the upstart script to work around this.
>
> Hmm, this is beyond my upstart-fu, unfortunately. This has come up before, actually. Previously we would wait for any interface to come up and then start, but that broke with multi-nic machines, and I ended up just making things start in runlevel [2345].
>
> James, do you know what should be done to make the job wait for *all* network interfaces to be up? Is that even the right solution here?
>
> sage
>
> > On Fri, Aug 23, 2013 at 4:47 PM, Sage Weil wrote:
> > Hi Travis,
> >
> > On Fri, 23 Aug 2013, Travis Rhoden wrote:
> > > Hey folks,
> > >
> > > I've just done a brand new install of 0.67.2 on a cluster of Calxeda nodes.
> > >
> > > I have one particular monitor that never joins the quorum when I restart the node. Looks to me like it has something to do with the "create-keys" task, which never seems to finish:
> > >
> > > root 1240 1 4 13:03 ? 00:00:02 /usr/bin/ceph-mon --cluster=ceph -i n24 -f
> > > root 1244 1 0 13:03 ? 00:00:00 /usr/bin/python /usr/sbin/ceph-create-keys --cluster=ceph -i n24
> > >
> > > I don't see that task on my other monitors. Additionally, that task is periodically querying the monitor status:
> > >
> > > root 1240 1 2 13:03 ? 00:00:02 /usr/bin/ceph-mon --cluster=ceph -i n24 -f
> > > root 1244 1 0 13:03 ? 00:00:00 /usr/bin/python /usr/sbin/ceph-create-keys --cluster=ceph -i n24
> > > root 1982 1244 15 13:04 ? 00:00:00 /usr/bin/python /usr/bin/ceph --cluster=ceph --admin-daemon=/var/run/ceph/ceph-mon.n24.asok mon_status
> > >
> > > Checking that status myself, I see:
> > >
> > > # ceph --cluster=ceph --admin-daemon=/var/run/ceph/ceph-mon.n24.asok mon_status
> > > { "name": "n24",
> > >   "rank": 2,
> > >   "state": "probing",
> > >   "election_epoch": 0,
> > >   "quorum": [],
> > >   "outside_quorum": [
> > >     "n24"],
> > >   "extra_probe_peers": [],
> > >   "sync_provider": [],
> > >   "monmap": { "epoch": 2,
> > >     "fsid": "f0b0d4ec-1ac3-4b24-9eab-c19760ce4682",
> > >     "modified": "2013-08-23 12:55:34.374650",
> > >     "created":
Re: [ceph-users] 1 particular ceph-mon never joins on 0.67.2
On Mon, 26 Aug 2013, Travis Rhoden wrote:
> Hi Sage,
>
> Thanks for the response. I noticed that as well, and suspected hostname/DHCP/DNS shenanigans. What's weird is that all nodes are identically configured. I also have monitors running on n0 and n12, and they come up fine, every time.
>
> Here's the mon_host line from ceph.conf:
>
> mon_initial_members = n0, n12, n24
> mon_host = 10.0.1.0,10.0.1.12,10.0.1.24
>
> just to test /etc/hosts and name resolution...
>
> root@n24:~# getent hosts n24
> 10.0.1.24 n24
> root@n24:~# hostname -s
> n24
>
> The only loopback device in /etc/hosts is "127.0.0.1 localhost", so that should be fine.
>
> Upon rebooting this node, I've had the monitor come up okay once, maybe out of 12 tries. So it appears to be some kind of race... No clue what is going on. If I stop and start the monitor (or restart), it doesn't appear to change anything.
>
> However, on the topic of races, I'm having one other more pressing issue. Each OSD host is having its hostname assigned via DHCP. Until that assignment is made (during init), the hostname is "localhost", and then it switches over to "n<X>", for some node number. The issue I am seeing is that there is a race between this hostname assignment and the Ceph upstart scripts, such that sometimes ceph-osd starts while the hostname is still 'localhost'. This then causes the osd location to change in the crushmap, which is going to be a very bad thing. =) When rebooting all my nodes at once (there are several dozen), about 50% move from being under n<X> to localhost. Restarting all the ceph-osd jobs moves them back (because the hostname is defined).
>
> I'm wondering what kind of delay, or additional "start-on" logic I can add to the upstart script to work around this.

Hmm, this is beyond my upstart-fu, unfortunately. This has come up before, actually. Previously we would wait for any interface to come up and then start, but that broke with multi-nic machines, and I ended up just making things start in runlevel [2345].

James, do you know what should be done to make the job wait for *all* network interfaces to be up? Is that even the right solution here?

sage

> On Fri, Aug 23, 2013 at 4:47 PM, Sage Weil wrote:
> Hi Travis,
>
> On Fri, 23 Aug 2013, Travis Rhoden wrote:
> > Hey folks,
> >
> > I've just done a brand new install of 0.67.2 on a cluster of Calxeda nodes.
> >
> > I have one particular monitor that never joins the quorum when I restart the node. Looks to me like it has something to do with the "create-keys" task, which never seems to finish:
> >
> > root 1240 1 4 13:03 ? 00:00:02 /usr/bin/ceph-mon --cluster=ceph -i n24 -f
> > root 1244 1 0 13:03 ? 00:00:00 /usr/bin/python /usr/sbin/ceph-create-keys --cluster=ceph -i n24
> >
> > I don't see that task on my other monitors. Additionally, that task is periodically querying the monitor status:
> >
> > root 1240 1 2 13:03 ? 00:00:02 /usr/bin/ceph-mon --cluster=ceph -i n24 -f
> > root 1244 1 0 13:03 ? 00:00:00 /usr/bin/python /usr/sbin/ceph-create-keys --cluster=ceph -i n24
> > root 1982 1244 15 13:04 ? 00:00:00 /usr/bin/python /usr/bin/ceph --cluster=ceph --admin-daemon=/var/run/ceph/ceph-mon.n24.asok mon_status
> >
> > Checking that status myself, I see:
> >
> > # ceph --cluster=ceph --admin-daemon=/var/run/ceph/ceph-mon.n24.asok mon_status
> > { "name": "n24",
> >   "rank": 2,
> >   "state": "probing",
> >   "election_epoch": 0,
> >   "quorum": [],
> >   "outside_quorum": [
> >     "n24"],
> >   "extra_probe_peers": [],
> >   "sync_provider": [],
> >   "monmap": { "epoch": 2,
> >     "fsid": "f0b0d4ec-1ac3-4b24-9eab-c19760ce4682",
> >     "modified": "2013-08-23 12:55:34.374650",
> >     "created": "0.00",
> >     "mons": [
> >       { "rank": 0,
> >         "name": "n0",
> >         "addr": "10.0.1.0:6789\/0"},
> >       { "rank": 1,
> >         "name": "n12",
> >         "addr": "10.0.1.12:6789\/0"},
> >       { "rank": 2,
> >         "name": "n24",
> >         "addr": "0.0.0.0:6810\/0"}]}}
>
> This is the problem. I can't remember exactly what causes this, though. Can you verify the host in ceph.conf mon_host line matches the ip that is configured on the machine, and that the /etc/hosts on the machine doesn't have a loopback address on it.
>
> Thanks!
> sage
Re: [ceph-users] 1 particular ceph-mon never joins on 0.67.2
Hi Sage,

Thanks for the response. I noticed that as well, and suspected hostname/DHCP/DNS shenanigans. What's weird is that all nodes are identically configured. I also have monitors running on n0 and n12, and they come up fine, every time.

Here's the mon_host line from ceph.conf:

mon_initial_members = n0, n12, n24
mon_host = 10.0.1.0,10.0.1.12,10.0.1.24

just to test /etc/hosts and name resolution...

root@n24:~# getent hosts n24
10.0.1.24 n24
root@n24:~# hostname -s
n24

The only loopback device in /etc/hosts is "127.0.0.1 localhost", so that should be fine.

Upon rebooting this node, I've had the monitor come up okay once, maybe out of 12 tries. So it appears to be some kind of race... No clue what is going on. If I stop and start the monitor (or restart), it doesn't appear to change anything.

However, on the topic of races, I'm having one other more pressing issue. Each OSD host is having its hostname assigned via DHCP. Until that assignment is made (during init), the hostname is "localhost", and then it switches over to "n<X>", for some node number. The issue I am seeing is that there is a race between this hostname assignment and the Ceph upstart scripts, such that sometimes ceph-osd starts while the hostname is still 'localhost'. This then causes the osd location to change in the crushmap, which is going to be a very bad thing. =) When rebooting all my nodes at once (there are several dozen), about 50% move from being under n<X> to localhost. Restarting all the ceph-osd jobs moves them back (because the hostname is defined).

I'm wondering what kind of delay, or additional "start-on" logic I can add to the upstart script to work around this.

On Fri, Aug 23, 2013 at 4:47 PM, Sage Weil wrote:
> Hi Travis,
>
> On Fri, 23 Aug 2013, Travis Rhoden wrote:
> > Hey folks,
> >
> > I've just done a brand new install of 0.67.2 on a cluster of Calxeda nodes.
> >
> > I have one particular monitor that never joins the quorum when I restart the node. Looks to me like it has something to do with the "create-keys" task, which never seems to finish:
> >
> > root 1240 1 4 13:03 ? 00:00:02 /usr/bin/ceph-mon --cluster=ceph -i n24 -f
> > root 1244 1 0 13:03 ? 00:00:00 /usr/bin/python /usr/sbin/ceph-create-keys --cluster=ceph -i n24
> >
> > I don't see that task on my other monitors. Additionally, that task is periodically querying the monitor status:
> >
> > root 1240 1 2 13:03 ? 00:00:02 /usr/bin/ceph-mon --cluster=ceph -i n24 -f
> > root 1244 1 0 13:03 ? 00:00:00 /usr/bin/python /usr/sbin/ceph-create-keys --cluster=ceph -i n24
> > root 1982 1244 15 13:04 ? 00:00:00 /usr/bin/python /usr/bin/ceph --cluster=ceph --admin-daemon=/var/run/ceph/ceph-mon.n24.asok mon_status
> >
> > Checking that status myself, I see:
> >
> > # ceph --cluster=ceph --admin-daemon=/var/run/ceph/ceph-mon.n24.asok mon_status
> > { "name": "n24",
> >   "rank": 2,
> >   "state": "probing",
> >   "election_epoch": 0,
> >   "quorum": [],
> >   "outside_quorum": [
> >     "n24"],
> >   "extra_probe_peers": [],
> >   "sync_provider": [],
> >   "monmap": { "epoch": 2,
> >     "fsid": "f0b0d4ec-1ac3-4b24-9eab-c19760ce4682",
> >     "modified": "2013-08-23 12:55:34.374650",
> >     "created": "0.00",
> >     "mons": [
> >       { "rank": 0,
> >         "name": "n0",
> >         "addr": "10.0.1.0:6789\/0"},
> >       { "rank": 1,
> >         "name": "n12",
> >         "addr": "10.0.1.12:6789\/0"},
> >       { "rank": 2,
> >         "name": "n24",
> >         "addr": "0.0.0.0:6810\/0"}]}}
>
> This is the problem. I can't remember exactly what causes this, though. Can you verify the host in ceph.conf mon_host line matches the ip that is configured on the machine, and that the /etc/hosts on the machine doesn't have a loopback address on it.
>
> Thanks!
> sage
>
> > Any ideas what is going on here? I don't see anything useful in /var/log/ceph/ceph-mon.n24.log
> >
> > Thanks,
> >
> > - Travis
Re: [ceph-users] 1 particular ceph-mon never joins on 0.67.2
Hi Travis,

On Fri, 23 Aug 2013, Travis Rhoden wrote:
> Hey folks,
>
> I've just done a brand new install of 0.67.2 on a cluster of Calxeda nodes.
>
> I have one particular monitor that never joins the quorum when I restart the node. Looks to me like it has something to do with the "create-keys" task, which never seems to finish:
>
> root 1240 1 4 13:03 ? 00:00:02 /usr/bin/ceph-mon --cluster=ceph -i n24 -f
> root 1244 1 0 13:03 ? 00:00:00 /usr/bin/python /usr/sbin/ceph-create-keys --cluster=ceph -i n24
>
> I don't see that task on my other monitors. Additionally, that task is periodically querying the monitor status:
>
> root 1240 1 2 13:03 ? 00:00:02 /usr/bin/ceph-mon --cluster=ceph -i n24 -f
> root 1244 1 0 13:03 ? 00:00:00 /usr/bin/python /usr/sbin/ceph-create-keys --cluster=ceph -i n24
> root 1982 1244 15 13:04 ? 00:00:00 /usr/bin/python /usr/bin/ceph --cluster=ceph --admin-daemon=/var/run/ceph/ceph-mon.n24.asok mon_status
>
> Checking that status myself, I see:
>
> # ceph --cluster=ceph --admin-daemon=/var/run/ceph/ceph-mon.n24.asok mon_status
> { "name": "n24",
>   "rank": 2,
>   "state": "probing",
>   "election_epoch": 0,
>   "quorum": [],
>   "outside_quorum": [
>     "n24"],
>   "extra_probe_peers": [],
>   "sync_provider": [],
>   "monmap": { "epoch": 2,
>     "fsid": "f0b0d4ec-1ac3-4b24-9eab-c19760ce4682",
>     "modified": "2013-08-23 12:55:34.374650",
>     "created": "0.00",
>     "mons": [
>       { "rank": 0,
>         "name": "n0",
>         "addr": "10.0.1.0:6789\/0"},
>       { "rank": 1,
>         "name": "n12",
>         "addr": "10.0.1.12:6789\/0"},
>       { "rank": 2,
>         "name": "n24",
>         "addr": "0.0.0.0:6810\/0"}]}}

This is the problem. I can't remember exactly what causes this, though. Can you verify the host in ceph.conf mon_host line matches the ip that is configured on the machine, and that the /etc/hosts on the machine doesn't have a loopback address on it.

Thanks!
sage

> Any ideas what is going on here? I don't see anything useful in /var/log/ceph/ceph-mon.n24.log
>
> Thanks,
>
> - Travis
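[The symptom Sage points at -- a mon advertising "0.0.0.0" in the monmap while stuck "probing" -- is mechanical to spot in the admin-socket output. A small sketch of such a check (the function name and messages are mine, not part of ceph), given the JSON that mon_status prints:]

```python
import json

def mon_status_problems(mon_status_json):
    """Return a list of problems found in the output of
    'ceph --admin-daemon /var/run/ceph/ceph-mon.<id>.asok mon_status'.
    An empty list means nothing obviously wrong was detected."""
    status = json.loads(mon_status_json)
    problems = []
    # Stuck in the probing state with an empty quorum, as n24 was.
    if status.get("state") == "probing" and not status.get("quorum"):
        problems.append("%s is probing and outside quorum" % status["name"])
    for mon in status.get("monmap", {}).get("mons", []):
        # A mon advertising 0.0.0.0 never resolved its own address --
        # typically a ceph.conf mon_host / local-IP mismatch.
        if mon["addr"].startswith("0.0.0.0"):
            problems.append("mon.%s has no bound address (0.0.0.0)" % mon["name"])
    return problems
```

Fed the mon_status output from this thread, it would flag both n24's probing state and its 0.0.0.0 address.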
[ceph-users] 1 particular ceph-mon never joins on 0.67.2
Hey folks,

I've just done a brand new install of 0.67.2 on a cluster of Calxeda nodes.

I have one particular monitor that never joins the quorum when I restart the node. Looks to me like it has something to do with the "create-keys" task, which never seems to finish:

root 1240 1 4 13:03 ? 00:00:02 /usr/bin/ceph-mon --cluster=ceph -i n24 -f
root 1244 1 0 13:03 ? 00:00:00 /usr/bin/python /usr/sbin/ceph-create-keys --cluster=ceph -i n24

I don't see that task on my other monitors. Additionally, that task is periodically querying the monitor status:

root 1240 1 2 13:03 ? 00:00:02 /usr/bin/ceph-mon --cluster=ceph -i n24 -f
root 1244 1 0 13:03 ? 00:00:00 /usr/bin/python /usr/sbin/ceph-create-keys --cluster=ceph -i n24
root 1982 1244 15 13:04 ? 00:00:00 /usr/bin/python /usr/bin/ceph --cluster=ceph --admin-daemon=/var/run/ceph/ceph-mon.n24.asok mon_status

Checking that status myself, I see:

# ceph --cluster=ceph --admin-daemon=/var/run/ceph/ceph-mon.n24.asok mon_status
{ "name": "n24",
  "rank": 2,
  "state": "probing",
  "election_epoch": 0,
  "quorum": [],
  "outside_quorum": [
    "n24"],
  "extra_probe_peers": [],
  "sync_provider": [],
  "monmap": { "epoch": 2,
    "fsid": "f0b0d4ec-1ac3-4b24-9eab-c19760ce4682",
    "modified": "2013-08-23 12:55:34.374650",
    "created": "0.00",
    "mons": [
      { "rank": 0,
        "name": "n0",
        "addr": "10.0.1.0:6789\/0"},
      { "rank": 1,
        "name": "n12",
        "addr": "10.0.1.12:6789\/0"},
      { "rank": 2,
        "name": "n24",
        "addr": "0.0.0.0:6810\/0"}]}}

Any ideas what is going on here? I don't see anything useful in /var/log/ceph/ceph-mon.n24.log

Thanks,

- Travis
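[A follow-up diagnostic for the 0.0.0.0 case above -- my own sketch, not something from this thread: when a mon advertises 0.0.0.0, one quick test is whether the address listed for that node in mon_host can actually be bound locally at the time the mon starts. If the IP is not yet configured on any interface, the bind fails:]

```python
import socket

def ip_is_local(ip):
    """Return True if 'ip' is assigned to an interface on this host,
    tested by trying to bind an ephemeral UDP socket to it."""
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        s.bind((ip, 0))  # port 0 = any free port; raises if ip is not local
        return True
    except OSError:
        return False
    finally:
        s.close()
```

Run on n24, `ip_is_local("10.0.1.24")` should return True once networking is fully up; if it returns False during early boot, that is consistent with the mon falling back to 0.0.0.0.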