adminsocket
What is an adminsocket used for? Would librbd use one in normal operation?

Thanks

James
--
To unsubscribe from this list: send the line unsubscribe ceph-devel in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: adminsocket
On 11/22/2013 06:42 PM, James Harper wrote:
> What is an adminsocket used for? Would librbd use one in normal operation?

It's a way to send administrative and informational commands directly to a Ceph entity (usually a daemon, but sometimes a client). Almost all the Ceph entities create one: osd, mon, mds, client. It's not really part of normal operation, but you can find stats there, and often things like status, the version of the software, etc.

--
Dan Mick, Filesystem Engineering
Inktank Storage, Inc. http://inktank.com
Ceph docs: http://ceph.com/docs
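For illustration, here is a minimal sketch of talking to such a socket directly. It rests on an assumption about the wire format that should be checked against your Ceph release: the client sends the command followed by a NUL byte, and the daemon answers with a 4-byte big-endian length followed by that many bytes of payload. The helper names (`recv_exact`, `exchange`, `admin_command`) are made up for this example. Older releases accept plain-text commands such as `perf dump` or `config show`; newer ones may expect a JSON-encoded command instead.

```python
import socket
import struct


def recv_exact(sock, n):
    """Read exactly n bytes from sock, or raise if the peer closes early."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("admin socket closed before the full reply arrived")
        buf += chunk
    return buf


def exchange(sock, command):
    """Send one NUL-terminated command and return the length-prefixed reply.

    Assumed framing: 4-byte big-endian length, then the payload.
    """
    sock.sendall(command.encode("utf-8") + b"\0")
    (length,) = struct.unpack(">I", recv_exact(sock, 4))
    return recv_exact(sock, length).decode("utf-8")


def admin_command(asok_path, command):
    """Connect to a daemon's admin socket (a UNIX domain socket) and run one command."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(asok_path)
        return exchange(sock, command)
```

A call such as `admin_command("/var/run/ceph/ceph-mon.2.asok", "perf dump")` would then return the daemon's reply as a string, provided the daemon is running and the caller may open the socket. The `ceph --admin-daemon <path> <command>` CLI is the usual way to do the same thing.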
RE: adminsocket
> On 11/22/2013 06:42 PM, James Harper wrote:
>> What is an adminsocket used for? Would librbd use one in normal operation?
>
> It's a way to send administrative and informational commands directly to a Ceph entity (usually a daemon, but sometimes a client). Almost all the ceph entities create one...osd, mon, mds, client. It's not really normal operation, but you can find stats there, and often things like status, version of the software, etc.

Ok. So if I commented out anything that uses it (just the Ceph context for perf counters, by the looks of it) for the purposes of a win32 build of librbd, it should still be functional.

Thanks

James
Re: ceph-mon not starting - AdminSocketConfigObs::init: error: AdminSocket::create_shutdown_pipe error: (38) Function not implemented
On 04.10.2012 15:38, Smart Weblications GmbH - Florian Wiessner wrote:
> Hi,
>
> I have a Ceph cluster with 2 OSDs and 3 mons. One of the monitors does not start anymore:
>
> 2012-10-04 13:36:29.501178 7f7e123f9780 -1 asok(0x14ac000) AdminSocketConfigObs::init: error: AdminSocket::create_shutdown_pipe error: (38) Function not implemented
> 2012-10-04 13:36:29.535018 7f7e123f9780 1 mon.2@-1(probing) e1 init fsid 5b59811a-d235-488f-9b9b-953db7e5028b
> 2012-10-04 13:36:29.541171 7f7e123f9780 -1 mon/Paxos.cc: In function 'bool Paxos::is_consistent()' thread 7f7e123f9780 time 2012-10-04 13:36:29.536744
> mon/Paxos.cc: 1031: FAILED assert(consistent || (slurping == 1))
>
> ceph version 0.48.1argonaut (commit:a7ad701b9bd479f20429f19e6fea7373ca6bba7c)
> 1: /usr/bin/ceph-mon() [0x488a67]
> 2: (Monitor::init()+0xc5a) [0x476f4a]
> 3: (main()+0x2789) [0x45c3b9]
> 4: (__libc_start_main()+0xfd) [0x7f7e10929c8d]
> 5: /usr/bin/ceph-mon() [0x459a49]
> NOTE: a copy of the executable, or `objdump -rdS executable` is needed to interpret this.
>
> --- begin dump of recent events ---
> -20 2012-10-04 13:36:29.443083 7f7e123f9780 5 asok(0x14ac000) register_command perfcounters_dump hook 0x14a0010
> -19 2012-10-04 13:36:29.443578 7f7e123f9780 5 asok(0x14ac000) register_command 1 hook 0x14a0010
> -18 2012-10-04 13:36:29.443600 7f7e123f9780 5 asok(0x14ac000) register_command perf dump hook 0x14a0010
> -17 2012-10-04 13:36:29.443627 7f7e123f9780 5 asok(0x14ac000) register_command perfcounters_schema hook 0x14a0010
> -16 2012-10-04 13:36:29.443637 7f7e123f9780 5 asok(0x14ac000) register_command 2 hook 0x14a0010
> -15 2012-10-04 13:36:29.443644 7f7e123f9780 5 asok(0x14ac000) register_command perf schema hook 0x14a0010
> -14 2012-10-04 13:36:29.443651 7f7e123f9780 5 asok(0x14ac000) register_command config show hook 0x14a0010
> -13 2012-10-04 13:36:29.443658 7f7e123f9780 5 asok(0x14ac000) register_command config set hook 0x14a0010
> -12 2012-10-04 13:36:29.443665 7f7e123f9780 5 asok(0x14ac000) register_command log flush hook 0x14a0010
> -11 2012-10-04 13:36:29.443671 7f7e123f9780 5 asok(0x14ac000) register_command log dump hook 0x14a0010
> -10 2012-10-04 13:36:29.443678 7f7e123f9780 5 asok(0x14ac000) register_command log reopen hook 0x14a0010
> -9 2012-10-04 13:36:29.453381 7f7e123f9780 1 store(/data/ceph_backend/mon) mount
> -8 2012-10-04 13:36:29.454581 7f7e123f9780 0 ceph version 0.48.1argonaut (commit:a7ad701b9bd479f20429f19e6fea7373ca6bba7c), process ceph-mon, pid 3643
> -7 2012-10-04 13:36:29.455363 7f7e123f9780 1 -- 10.0.0.11:6789/0 accepter.bind my_inst.addr is 10.0.0.11:6789/0 need_addr=0
> -6 2012-10-04 13:36:29.469799 7f7e123f9780 1 finished global_init_daemonize
> -5 2012-10-04 13:36:29.500601 7f7e123f9780 5 asok(0x14ac000) init /var/run/ceph/ceph-mon.2.asok
> -4 2012-10-04 13:36:29.501178 7f7e123f9780 -1 asok(0x14ac000) AdminSocketConfigObs::init: error: AdminSocket::create_shutdown_pipe error: (38) Function not implemented
> -3 2012-10-04 13:36:29.502014 7f7e123f9780 1 -- 10.0.0.11:6789/0 messenger.start
> -2 2012-10-04 13:36:29.502392 7f7e123f9780 1 -- 10.0.0.11:6789/0 accepter.start
> -1 2012-10-04 13:36:29.535018 7f7e123f9780 1 mon.2@-1(probing) e1 init fsid 5b59811a-d235-488f-9b9b-953db7e5028b
> 0 2012-10-04 13:36:29.541171 7f7e123f9780 -1 mon/Paxos.cc: In function 'bool Paxos::is_consistent()' thread 7f7e123f9780 time 2012-10-04 13:36:29.536744
> mon/Paxos.cc: 1031: FAILED assert(consistent || (slurping == 1))
>
> ceph version 0.48.1argonaut (commit:a7ad701b9bd479f20429f19e6fea7373ca6bba7c)
> 1: /usr/bin/ceph-mon() [0x488a67]
> 2: (Monitor::init()+0xc5a) [0x476f4a]
> 3: (main()+0x2789) [0x45c3b9]
> 4: (__libc_start_main()+0xfd) [0x7f7e10929c8d]
> 5: /usr/bin/ceph-mon() [0x459a49]
> NOTE: a copy of the executable, or `objdump -rdS executable` is needed to interpret this.
> --- end dump of recent events ---
>
> 2012-10-04 13:36:29.568387 7f7e123f9780 -1 *** Caught signal (Aborted) **
> in thread 7f7e123f9780
>
> ceph version 0.48.1argonaut (commit:a7ad701b9bd479f20429f19e6fea7373ca6bba7c)
> 1: /usr/bin/ceph-mon() [0x520c49]
> 2: (()+0xeff0) [0x7f7e11a9aff0]
> 3: (gsignal()+0x35) [0x7f7e1093d1b5]
> 4: (abort()+0x180) [0x7f7e1093ffc0]
> 5: (__gnu_cxx::__verbose_terminate_handler()+0x115) [0x7f7e111d1dc5]
> 6: (()+0xcb166) [0x7f7e111d0166]
> 7: (()+0xcb193) [0x7f7e111d0193]
> 8: (()+0xcb28e) [0x7f7e111d028e]
> 9: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x793) [0x574023]
> 10: /usr/bin/ceph-mon() [0x488a67]
> 11: (Monitor::init()+0xc5a) [0x476f4a]
> 12: (main()+0x2789) [0x45c3b9]
> 13: (__libc_start_main()+0xfd) [0x7f7e10929c8d]
> 14: /usr/bin/ceph-mon() [0x459a49]
> NOTE: a copy of the executable, or `objdump -rdS executable` is needed to interpret this.
>
> --- begin dump of recent events ---
> 0 2012-10-04 13:36:29.568387 7f7e123f9780 -1 *** Caught signal (Aborted) ** in thread
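As an aside, the `register_command` entries in a dump like this show which commands the daemon's admin socket offers. A small sketch that pulls them out of a log follows; the regular expression is an assumption based only on the line format visible in this log, and `registered_commands` is a name invented for the example.

```python
import re

# Matches entries such as:
#   asok(0x14ac000) register_command perf dump hook 0x14a0010
# The command may contain spaces, so capture lazily up to " hook 0x...".
REGISTER_RE = re.compile(
    r"asok\(0x[0-9a-f]+\) register_command (.+?) hook 0x[0-9a-f]+"
)


def registered_commands(log_text):
    """Return the admin socket command names found in log_text, in log order."""
    return REGISTER_RE.findall(log_text)
```

Run against the dump above, this would list names such as `perfcounters_dump`, `perf dump`, `config show` and `log reopen`, including the numeric aliases `1` and `2`.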
Re: ceph-mon not starting - AdminSocketConfigObs::init: error: AdminSocket::create_shutdown_pipe error: (38) Function not implemented
On Fri, 5 Oct 2012, Joao Eduardo Luis wrote:
> On 10/05/2012 01:24 PM, Smart Weblications GmbH - Florian Wiessner wrote:
>> On 04.10.2012 15:38, Smart Weblications GmbH - Florian Wiessner wrote:
>>> Hi,
>>>
>>> I have a Ceph cluster with 2 OSDs and 3 mons. One of the monitors does not start anymore:
>>>
>>> 2012-10-04 13:36:29.501178 7f7e123f9780 -1 asok(0x14ac000) AdminSocketConfigObs::init: error: AdminSocket::create_shutdown_pipe error: (38) Function not implemented
>>> 2012-10-04 13:36:29.535018 7f7e123f9780 1 mon.2@-1(probing) e1 init fsid 5b59811a-d235-488f-9b9b-953db7e5028b
>>> 2012-10-04 13:36:29.541171 7f7e123f9780 -1 mon/Paxos.cc: In function 'bool Paxos::is_consistent()' thread 7f7e123f9780 time 2012-10-04 13:36:29.536744
>>> mon/Paxos.cc: 1031: FAILED assert(consistent || (slurping == 1))

This assertion means the monitor was killed or failed either during slurping (while catching up with the other monitors) or while performing some kind of update, so it ended up in an inconsistent state. The monitor is supposed to record that it is slurping, and may therefore be temporarily inconsistent, by writing a 'slurping' file with '1' in it in the paxos subdirectory(ies), so some bug triggered this.

A simple workaround is to do

  echo 1 > $mondata/osdmap/slurping
  echo 1 > $mondata/pgmap/slurping
  echo 1 > $mondata/monmap/slurping
  echo 1 > $mondata/logm/slurping
  echo 1 > $mondata/auth/slurping

and it will go through the recovery steps. It would be helpful if you could tar up a copy of the mon directory first, though, along with any log files on that host, so we can try to figure out what went wrong.

Thanks!
sage

> I, for one, don't know what is advised in this kind of situation for a production cluster (or anything slightly more critical than a test cluster).
>
> If it were me, given that you have 3 monitors in total, and assuming the other 2 monitors are fine, up and running, and with a *formed quorum* (ceph -s should let you know about that), I would simply start that monitor off with a fresh store. It should slurp its way back into the quorum. It could take some time if you have a huge monitor store, but everything should work. And even if it doesn't, the worst thing that could happen is that you'd end up with the same two monitors that are already running and a third that does not.
>
> However, maybe you should wait for input from someone with more experience dealing with real usage scenarios.
>
> -Joao
>
>>> ceph version 0.48.1argonaut (commit:a7ad701b9bd479f20429f19e6fea7373ca6bba7c)
>>> 1: /usr/bin/ceph-mon() [0x488a67]
>>> 2: (Monitor::init()+0xc5a) [0x476f4a]
>>> 3: (main()+0x2789) [0x45c3b9]
>>> 4: (__libc_start_main()+0xfd) [0x7f7e10929c8d]
>>> 5: /usr/bin/ceph-mon() [0x459a49]
>>> NOTE: a copy of the executable, or `objdump -rdS executable` is needed to interpret this.
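The workaround above (back up the mon directory, then write a '1' into each paxos subdirectory's 'slurping' file) could be scripted roughly as follows. This is only a sketch under stated assumptions: the function names are invented, the monitor is assumed to be stopped, the directory layout is the one implied by the paths above (argonaut-era store), and the archive path must lie outside the monitor data directory.

```python
import tarfile
from pathlib import Path

# Paxos service subdirectories named in the workaround above.
PAXOS_SERVICES = ("osdmap", "pgmap", "monmap", "logm", "auth")


def backup_mon_dir(mon_data, archive_path):
    """Tar up the monitor data directory before modifying anything in it.

    archive_path should be outside mon_data so the archive does not include itself.
    """
    with tarfile.open(archive_path, "w:gz") as tar:
        tar.add(mon_data, arcname=Path(mon_data).name)
    return archive_path


def mark_slurping(mon_data, services=PAXOS_SERVICES):
    """Write '1' to <mon_data>/<service>/slurping, like `echo 1 > .../slurping`.

    Service directories that do not exist are skipped rather than created.
    Returns the list of marker files written.
    """
    written = []
    for name in services:
        service_dir = Path(mon_data) / name
        if not service_dir.is_dir():
            continue
        marker = service_dir / "slurping"
        marker.write_text("1\n")
        written.append(marker)
    return written
```

Taking the backup first matters here because the marker files change how the monitor treats its store on the next start, and the copy is what allows the original failure to be investigated afterwards.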
Re: ceph-mon not starting - AdminSocketConfigObs::init: error: AdminSocket::create_shutdown_pipe error: (38) Function not implemented
On 05.10.2012 17:24, Sage Weil wrote:
> On Fri, 5 Oct 2012, Joao Eduardo Luis wrote:
>> On 10/05/2012 01:24 PM, Smart Weblications GmbH - Florian Wiessner wrote:
>>> On 04.10.2012 15:38, Smart Weblications GmbH - Florian Wiessner wrote:
>>>> Hi,
>>>>
>>>> I have a Ceph cluster with 2 OSDs and 3 mons. One of the monitors does not start anymore:
>>>>
>>>> 2012-10-04 13:36:29.501178 7f7e123f9780 -1 asok(0x14ac000) AdminSocketConfigObs::init: error: AdminSocket::create_shutdown_pipe error: (38) Function not implemented
>>>> 2012-10-04 13:36:29.535018 7f7e123f9780 1 mon.2@-1(probing) e1 init fsid 5b59811a-d235-488f-9b9b-953db7e5028b
>>>> 2012-10-04 13:36:29.541171 7f7e123f9780 -1 mon/Paxos.cc: In function 'bool Paxos::is_consistent()' thread 7f7e123f9780 time 2012-10-04 13:36:29.536744
>>>> mon/Paxos.cc: 1031: FAILED assert(consistent || (slurping == 1))
>
> This assertion means the monitor was killed or failed either during slurping (while catching up with the other monitors) or while performing some kind of update. So it ended up in an inconsistent state. The monitor is supposed to take note of when it is slurping and may be temporarily inconsistent by writing a 'slurping' file with '1' in it in the paxos subdirectory(ies), so some bug triggered this.
>
> A simple workaround is to do
>
>   echo 1 > $mondata/osdmap/slurping
>   echo 1 > $mondata/pgmap/slurping
>   echo 1 > $mondata/monmap/slurping
>   echo 1 > $mondata/logm/slurping
>   echo 1 > $mondata/auth/slurping
>
> and it will go through the recovery steps. It would be helpful if you could tar up a copy of the mon directory first, though, along with any log files on that host, so we can try to figure out what went wrong.

Unfortunately, I deleted the logs for the monitor, as I did not see anything special except this assertion. I'll send the mon directory directly to Sage in a separate mail.

--
Kind regards,

Florian Wiessner

Smart Weblications GmbH
Martinsberger Str. 1
D-95119 Naila

Phone: +49 9282 9638 200
Fax: +49 9282 9638 205
24/7: +49 900 144 000 00 - 0,99 EUR/Min*
http://www.smart-weblications.de

--
Registered office: Naila
Managing director: Florian Wiessner
Commercial register no.: HRB 3840, Amtsgericht Hof
*from German landlines; prices from mobile networks may differ
Re: ceph-mon not starting - AdminSocketConfigObs::init: error: AdminSocket::create_shutdown_pipe error: (38) Function not implemented
On 05.10.2012 17:24, Sage Weil wrote:
> On Fri, 5 Oct 2012, Joao Eduardo Luis wrote:
>> On 10/05/2012 01:24 PM, Smart Weblications GmbH - Florian Wiessner wrote:
>>> On 04.10.2012 15:38, Smart Weblications GmbH - Florian Wiessner wrote:
>>>> Hi,
>>>>
>>>> I have a Ceph cluster with 2 OSDs and 3 mons. One of the monitors does not start anymore:
>>>>
>>>> 2012-10-04 13:36:29.501178 7f7e123f9780 -1 asok(0x14ac000) AdminSocketConfigObs::init: error: AdminSocket::create_shutdown_pipe error: (38) Function not implemented
>>>> 2012-10-04 13:36:29.535018 7f7e123f9780 1 mon.2@-1(probing) e1 init fsid 5b59811a-d235-488f-9b9b-953db7e5028b
>>>> 2012-10-04 13:36:29.541171 7f7e123f9780 -1 mon/Paxos.cc: In function 'bool Paxos::is_consistent()' thread 7f7e123f9780 time 2012-10-04 13:36:29.536744
>>>> mon/Paxos.cc: 1031: FAILED assert(consistent || (slurping == 1))
>
> This assertion means the monitor was killed or failed either during slurping (while catching up with the other monitors) or while performing some kind of update. So it ended up in an inconsistent state. The monitor is supposed to take note of when it is slurping and may be temporarily inconsistent by writing a 'slurping' file with '1' in it in the paxos subdirectory(ies), so some bug triggered this.
>
> A simple workaround is to do
>
>   echo 1 > $mondata/osdmap/slurping
>   echo 1 > $mondata/pgmap/slurping
>   echo 1 > $mondata/monmap/slurping
>   echo 1 > $mondata/logm/slurping
>   echo 1 > $mondata/auth/slurping

OK, this fixed it; all 3 mons are running again.

> and it will go through the recovery steps. It would be helpful if you could tar up a copy of the mon directory first, though, along with any log files on that host, so we can try to figure out what went wrong.

Thank you very much.

--
Kind regards,

Florian Wiessner
Re: ceph-mon not starting - AdminSocketConfigObs::init: error: AdminSocket::create_shutdown_pipe error: (38) Function not implemented
On 10/05/2012 04:56 PM, Smart Weblications GmbH - Florian Wiessner wrote:
> On 05.10.2012 17:24, Sage Weil wrote:
>> On Fri, 5 Oct 2012, Joao Eduardo Luis wrote:
>>> On 10/05/2012 01:24 PM, Smart Weblications GmbH - Florian Wiessner wrote:
>>>> On 04.10.2012 15:38, Smart Weblications GmbH - Florian Wiessner wrote:
>>>>> Hi,
>>>>>
>>>>> I have a Ceph cluster with 2 OSDs and 3 mons. One of the monitors does not start anymore:
>>>>>
>>>>> 2012-10-04 13:36:29.501178 7f7e123f9780 -1 asok(0x14ac000) AdminSocketConfigObs::init: error: AdminSocket::create_shutdown_pipe error: (38) Function not implemented
>>>>> 2012-10-04 13:36:29.535018 7f7e123f9780 1 mon.2@-1(probing) e1 init fsid 5b59811a-d235-488f-9b9b-953db7e5028b
>>>>> 2012-10-04 13:36:29.541171 7f7e123f9780 -1 mon/Paxos.cc: In function 'bool Paxos::is_consistent()' thread 7f7e123f9780 time 2012-10-04 13:36:29.536744
>>>>> mon/Paxos.cc: 1031: FAILED assert(consistent || (slurping == 1))
>>
>> This assertion means the monitor was killed or failed either during slurping (while catching up with the other monitors) or while performing some kind of update. So it ended up in an inconsistent state. The monitor is supposed to take note of when it is slurping and may be temporarily inconsistent by writing a 'slurping' file with '1' in it in the paxos subdirectory(ies), so some bug triggered this.
>>
>> A simple workaround is to do
>>
>>   echo 1 > $mondata/osdmap/slurping
>>   echo 1 > $mondata/pgmap/slurping
>>   echo 1 > $mondata/monmap/slurping
>>   echo 1 > $mondata/logm/slurping
>>   echo 1 > $mondata/auth/slurping
>>
>> and it will go through the recovery steps. It would be helpful if you could tar up a copy of the mon directory first, though, along with any log files on that host, so we can try to figure out what went wrong.
>
> Unfortunately, I deleted the logs for the monitor, as I did not see anything special except this assertion. I'll send the mon directory directly to Sage in a separate mail.

Just following up on this: do you remember why this monitor went down initially (the time before you were unable to start it)? Did it fail? Was it killed? Were you upgrading it from a version prior to argonaut?

-Joao