Hi Everyone,
Thanks for your input on this. I know it's been a long time but I just
wanted to report back that this issue has been resolved.
We added two more monitors which happened to be on Ubuntu 14.04
(rather than 12.04) and these had no issues. So we upgraded every host
to 14.04.
Since the OS update we have not had any Monitor crashes. It's now been
over two months and the Mons have been stable.
Thanks again,
Richard

On 17 October 2015 at 07:26, Richard Bade <hitr...@gmail.com> wrote:
> Ok, debugging increased
> ceph tell mon.[abc] injectargs --debug-mon 20
> ceph tell mon.[abc] injectargs --debug-ms 1
>
> Regards,
> Richard
>
> On 17 October 2015 at 01:38, Sage Weil <s...@newdream.net> wrote:
>>
>> This doesn't look familiar.  Are you able to enable a higher log level so
>> that if it happens again we'll have more info?
>>
>> debug mon = 20
>> debug ms = 1
>>
>> Thanks!
>> sage
>>
>> On Fri, 16 Oct 2015, Dan van der Ster wrote:
>>
>> > Hmm, that's strange. I didn't see anything in the tracker that looks
>> > related. Hopefully an expert can chime in...
>> >
>> > Cheers, Dan
>> >
>> > On Fri, Oct 16, 2015 at 1:38 PM, Richard Bade <hitr...@gmail.com> wrote:
>> > > Thanks for your quick response Dan, but no. All the ceph-mon.*.log
>> > > files are empty.
>> > > I did track this down in syslog though, in case it helps:
>> > > ceph-mon: 2015-10-16 21:25:00.117115 7f4c9f458700 -1 *** Caught signal (Segmentation fault) **
>> > >  in thread 7f4c9f458700
>> > >
>> > >  ceph version 0.94.3 (95cefea9fd9ab740263bf8bb4796fd864d9afe2b)
>> > >  1: /usr/bin/ceph-mon() [0x928b05]
>> > >  2: (()+0xfcb0) [0x7f4ca50e0cb0]
>> > >  3: (get_str_map_key(std::map<std::string, std::string, std::less<std::string>, std::allocator<std::pair<std::string const, std::string> > > const&, std::string const&, std::string const*)+0x37) [0x87d8e7]
>> > >  4: (LogMonitor::update_from_paxos(bool*)+0x801) [0x6846e1]
>> > >  5: (PaxosService::refresh(bool*)+0x3c6) [0x5dc326]
>> > >  6: (Monitor::refresh_from_paxos(bool*)+0x36b) [0x588aab]
>> > >  7: (Paxos::do_refresh()+0x4c) [0x5c465c]
>> > >  8: (Paxos::handle_commit(MMonPaxos*)+0x243) [0x5cb2d3]
>> > >  9: (Paxos::dispatch(PaxosServiceMessage*)+0x22b) [0x5d3fbb]
>> > >  10: (Monitor::dispatch(MonSession*, Message*, bool)+0x864) [0x5ab0d4]
>> > >  11: (Monitor::_ms_dispatch(Message*)+0x2c9) [0x5a8a19]
>> > >  12: (Monitor::ms_dispatch(Message*)+0x32) [0x5c3952]
>> > >  13: (Messenger::ms_deliver_dispatch(Message*)+0x77) [0x8ac987]
>> > >  14: (DispatchQueue::entry()+0x44a) [0x8a9b2a]
>> > >  15: (DispatchQueue::DispatchThread::entry()+0xd) [0x79e4ad]
>> > >  16: (()+0x7e9a) [0x7f4ca50d8e9a]
>> > >  17: (clone()+0x6d) [0x7f4ca3dca38d]
>> > >  NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
>> > >
>> > > Regards,
>> > > Richard
>> > >
>> > > On 17 October 2015 at 00:33, Dan van der Ster <d...@vanderster.com> wrote:
>> > >>
>> > >> Hi,
>> > >> Is there a backtrace in /var/log/ceph/ceph-mon.*.log ?
>> > >> Cheers, Dan
>> > >>
>> > >> On Fri, Oct 16, 2015 at 12:46 PM, Richard Bade <hitr...@gmail.com> wrote:
>> > >> > Hi Everyone,
>> > >> > I upgraded our cluster to Hammer 0.94.3 a couple of days ago and
>> > >> > today we've had one monitor crash twice and another one once. We
>> > >> > have 3 monitors total and have been running Firefly 0.80.10 for
>> > >> > quite some time without any monitor issues.
>> > >> > When the monitor crashes it leaves a core file and a crash file in
>> > >> > /var/crash.
>> > >> > I can't find anything that's obviously the same issue by googling.
>> > >> > Has anyone seen anything like this?
>> > >> > Any suggestions? What other info would be useful to help track down
>> > >> > the issue?
>> > >> >
>> > >> > Regards,
>> > >> > Richard
>> > >> >
>> > >> > _______________________________________________
>> > >> > ceph-users mailing list
>> > >> > ceph-users@lists.ceph.com
>> > >> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>> > >> >
>> > >
>> > >
>> >
>> >
>
>
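
A side note on reading backtraces out of syslog: as far as I know, rsyslog by default replaces control characters in a message with '#' plus a three-digit octal code, so every newline in a monitor backtrace is written as #012 and the whole trace lands on one line. Below is a minimal Python sketch for turning such a line back into one frame per line. The function name expand_syslog_escapes and the sample string are only illustrative, and the sketch assumes that only control characters were escaped this way.

```python
import re

# Matches '#' followed by exactly three octal digits, e.g. "#012".
_ESCAPE = re.compile(r"#([0-7]{3})")


def expand_syslog_escapes(line: str) -> str:
    """Turn '#NNN' octal escapes for control characters back into the characters."""

    def decode(match: "re.Match[str]") -> str:
        code = int(match.group(1), 8)
        # Assumption: only control characters (below 32) were escaped.
        # Anything else, such as a literal "#123", is left untouched.
        return chr(code) if code < 32 else match.group(0)

    return _ESCAPE.sub(decode, line)


if __name__ == "__main__":
    # Shortened sample in the same shape as a Ceph monitor crash line in syslog.
    sample = (
        "*** Caught signal (Segmentation fault) **#012 in thread 7f4c9f458700"
        "#012#012 ceph version 0.94.3#012 1: /usr/bin/ceph-mon() [0x928b05]"
    )
    for row in expand_syslog_escapes(sample).splitlines():
        print(row)
```

Piping the relevant syslog line through something like this before posting makes the frames much easier to compare against other reports.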
