Re: DEBUG - analysing core dumps

2011-05-26 Thread Damien Fleuriot
On 26 May 2011 09:51, Damien Fleuriot wrote:
>
>
> On 5/25/11 7:10 PM, Garrett Cooper wrote:
>> On Wed, May 25, 2011 at 9:36 AM, Damien Fleuriot wrote:
>>> Hello list,
>>>
>>>
>>>
>>> We've got these boxes at work running FreeBSD 8.1-STABLE amd64 and
>>> serving as firewalls and openvpn gateways.
>>>
>>> We use CARP interfaces to provide an active-passive fault tolerant system.
>>>
>>>
>>> Today, we received a nagios alert from the master box saying its
>>> rsyslogd process had crashed.
>>>
>>> I logged on to it and tried to relaunch it, to no avail:
>>> pid 2303 (rsyslogd), uid 0: exited on signal 11 (core dumped)
>>>
>>>
>>>
>>>
>>> I would like advice on how to debug the output from the core dump.
>>>
>>> This is what I get from gdb:
>>>
>>> # gdb
>>> GNU gdb 6.1.1 [FreeBSD]
>>> Copyright 2004 Free Software Foundation, Inc.
>>> GDB is free software, covered by the GNU General Public License, and you are
>>> welcome to change it and/or distribute copies of it under certain
>>> conditions.
>>> Type "show copying" to see the conditions.
>>> There is absolutely no warranty for GDB.  Type "show warranty" for details.
>>> This GDB was configured as "amd64-marcel-freebsd".
>>> (gdb) core rsyslogd.core
>>> Core was generated by `rsyslogd'.
>>> Program terminated with signal 11, Segmentation fault.
>>> #0  0x004258ec in ?? ()
>>>
>>>
>>>
>>>
>>> Sadly, getting a backtrace with "bt" gives me more lines with "??",
>>> which is totally not helpful:
>>> [SNIP]
>>> #13 0x7f1f9d70 in ?? ()
>>> #14 0x in ?? ()
>>> #15 0x6f70732f7261762f in ?? ()
>>> #16 0x6c737973722f6c6f in ?? ()
>>> #17 0x5f6e70766f2f676f in ?? ()
>>> #18 0x746174732e676f6c in ?? ()
>>> #19 0x0065 in ?? ()
>>> #20 0x in ?? ()
>>> [SNIP]
>>>
>>> I am not sure what steps I should follow to get more information?
>>>
>>>
>>>
>>> Also, I believe that often, core dumps with signal 11 = RAM problems and
>>> I would like a confirmation here.
>>>
>>> I am concerned because rsyslogd is the only process that crashes in this
>>> way, even after I rebooted the firewall.
>>
>> Rebuild and reinstall rsyslogd with debug symbols and see if you
>> can get a reasonable stack trace. Before that, to narrow down the
>> problem section of code, try running it under ktrace/kdump or truss
>> and see if it's trying to open/read a file and failing.
>> Thanks,
>> -Garrett
>
>
>
>
> Thanks everyone for your answers, I'll recompile with DEBUG and obtain a
> new core dump.
>
> I'll also investigate the possibility of corrupted spool files and post
> the resolution here :)
>
>
> --
> dfl
>


Turns out that after rebuilding rsyslog4-relp with -DWITH_DEBUG, the
new daemon works just fine and no longer dies with signal 11.
Odd, but it solves my problem.

I will upgrade it on all the other boxes then.

Thanks for the help, guys.

--
dfl
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"


Re: DEBUG - analysing core dumps

2011-05-26 Thread Damien Fleuriot


On 5/25/11 7:10 PM, Garrett Cooper wrote:
> On Wed, May 25, 2011 at 9:36 AM, Damien Fleuriot wrote:
>> Hello list,
>>
>>
>>
>> We've got these boxes at work running FreeBSD 8.1-STABLE amd64 and
>> serving as firewalls and openvpn gateways.
>>
>> We use CARP interfaces to provide an active-passive fault tolerant system.
>>
>>
>> Today, we received a nagios alert from the master box saying its
>> rsyslogd process had crashed.
>>
>> I logged on to it and tried to relaunch it, to no avail:
>> pid 2303 (rsyslogd), uid 0: exited on signal 11 (core dumped)
>>
>>
>>
>>
>> I would like advice on how to debug the output from the core dump.
>>
>> This is what I get from gdb:
>>
>> # gdb
>> GNU gdb 6.1.1 [FreeBSD]
>> Copyright 2004 Free Software Foundation, Inc.
>> GDB is free software, covered by the GNU General Public License, and you are
>> welcome to change it and/or distribute copies of it under certain
>> conditions.
>> Type "show copying" to see the conditions.
>> There is absolutely no warranty for GDB.  Type "show warranty" for details.
>> This GDB was configured as "amd64-marcel-freebsd".
>> (gdb) core rsyslogd.core
>> Core was generated by `rsyslogd'.
>> Program terminated with signal 11, Segmentation fault.
>> #0  0x004258ec in ?? ()
>>
>>
>>
>>
>> Sadly, getting a backtrace with "bt" gives me more lines with "??",
>> which is totally not helpful:
>> [SNIP]
>> #13 0x7f1f9d70 in ?? ()
>> #14 0x in ?? ()
>> #15 0x6f70732f7261762f in ?? ()
>> #16 0x6c737973722f6c6f in ?? ()
>> #17 0x5f6e70766f2f676f in ?? ()
>> #18 0x746174732e676f6c in ?? ()
>> #19 0x0065 in ?? ()
>> #20 0x in ?? ()
>> [SNIP]
>>
>> I am not sure what steps I should follow to get more information?
>>
>>
>>
>> Also, I believe that often, core dumps with signal 11 = RAM problems and
>> I would like a confirmation here.
>>
>> I am concerned because rsyslogd is the only process that crashes in this
>> way, even after I rebooted the firewall.
> 
> Rebuild and reinstall rsyslogd with debug symbols and see if you
> can get a reasonable stack trace. Before that, to narrow down the
> problem section of code, try running it under ktrace/kdump or truss
> and see if it's trying to open/read a file and failing.
> Thanks,
> -Garrett




Thanks everyone for your answers, I'll recompile with DEBUG and obtain a
new core dump.

I'll also investigate the possibility of corrupted spool files and post
the resolution here :)


--
dfl


Re: DEBUG - analysing core dumps

2011-05-25 Thread Daniel Hartmeier
On Wed, May 25, 2011 at 06:36:49PM +0200, Damien Fleuriot wrote:

> I am not sure what steps I should follow to get more information?

Rebuild the port with debug information, as in

  # cd /usr/ports/sysutils/rsyslog4
  # WITH_DEBUG=1 make package

And install that on the target host. Then repeat the gdb backtrace.

If you are using disk spool files, try removing them before starting
rsyslogd; I've seen it die like that due to corrupted spool files.
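
One cautious way to follow the suggestion above is to move the spool
files aside instead of deleting them, so they stay available for later
inspection. The following is only a minimal sketch: it assumes rsyslogd
is stopped first, and set_aside_spool is a hypothetical helper, not part
of rsyslog. The thread does not state the spool directory, so both paths
are supplied by the caller.

```python
import shutil
from pathlib import Path


def set_aside_spool(spool_dir, backup_dir):
    """Move every regular file from spool_dir into backup_dir.

    Subdirectories are left untouched. Returns the moved file names
    in sorted order. Both directory arguments are caller-supplied
    assumptions; nothing here is specific to rsyslog itself.
    """
    spool = Path(spool_dir)
    backup = Path(backup_dir)
    # Create the backup directory if it does not exist yet.
    backup.mkdir(parents=True, exist_ok=True)

    moved = []
    for entry in sorted(spool.iterdir()):
        if entry.is_file():
            shutil.move(str(entry), str(backup / entry.name))
            moved.append(entry.name)
    return moved
```

Calling set_aside_spool(spool, backup) with the daemon stopped and then
starting rsyslogd again shows whether the crash follows the spool files;
if it does not, the files can simply be moved back.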

HTH,
Daniel


Re: DEBUG - analysing core dumps

2011-05-25 Thread Patrick Lamaiziere
On Wed, 25 May 2011 18:36:49 +0200,
Damien Fleuriot wrote:

Hello,

> Sadly, getting a backtrace with "bt" gives me more lines with "??",
> which is totally not helpful:
> [SNIP]
> #13 0x7f1f9d70 in ?? ()
> #14 0x in ?? ()
> #15 0x6f70732f7261762f in ?? ()
> #16 0x6c737973722f6c6f in ?? ()
> #17 0x5f6e70766f2f676f in ?? ()
> #18 0x746174732e676f6c in ?? ()
> #19 0x0065 in ?? ()
> #20 0x in ?? ()
> [SNIP]
> 
> I am not sure what steps I should follow to get more information?

You have to build the binary with debug symbols included.

The rsyslog port provides an option for this. Did you see this notice in
the port's makefile?

"
# XXX: 5.5.6+ seem to crash frequently with low-mid load 
# on FreeBSD, temporailiy enable debugging by default.
# Now we can send gdb backtraces into the list:
# rsyslog-users 
OPTIONS=DEBUG   "Enable debugging"  on
"

Good luck...

Regards.


RE: DEBUG - analysing core dumps

2011-05-25 Thread Andrew Duane
Damien Fleuriot wrote:
> Hello list,
> 
> 
> 
> We've got these boxes at work running FreeBSD 8.1-STABLE amd64 and
> serving as firewalls and openvpn gateways.
> 
> We use CARP interfaces to provide an active-passive fault tolerant
> system. 
> 
> 
> Today, we received a nagios alert from the master box saying its
> rsyslogd process had crashed.
> 
> I logged on to it and tried to relaunch it, to no avail:
> pid 2303 (rsyslogd), uid 0: exited on signal 11 (core dumped)
> 
> 
> 
> 
> I would like advice on how to debug the output from the core dump.
> 
> This is what I get from gdb:
> 
> # gdb
> GNU gdb 6.1.1 [FreeBSD]
> Copyright 2004 Free Software Foundation, Inc.
> GDB is free software, covered by the GNU General Public License, and
> you are welcome to change it and/or distribute copies of it under
> certain conditions.
> Type "show copying" to see the conditions.
> There is absolutely no warranty for GDB.  Type "show warranty" for
> details. This GDB was configured as "amd64-marcel-freebsd".
> (gdb) core rsyslogd.core
> Core was generated by `rsyslogd'.
> Program terminated with signal 11, Segmentation fault.
> #0  0x004258ec in ?? ()
> 
> 
> Sadly, getting a backtrace with "bt" gives me more lines with "??",
> which is totally not helpful:
> [SNIP]
> #13 0x7f1f9d70 in ?? ()
> #14 0x in ?? ()
> #15 0x6f70732f7261762f in ?? ()
> #16 0x6c737973722f6c6f in ?? ()
> #17 0x5f6e70766f2f676f in ?? ()
> #18 0x746174732e676f6c in ?? ()
> #19 0x0065 in ?? ()
> #20 0x in ?? ()
> [SNIP]
> 
> I am not sure what steps I should follow to get more information?
> 
> 
> 
> Also, I believe that often, core dumps with signal 11 = RAM problems
> and I would like a confirmation here.
> 
> I am concerned because rsyslogd is the only process that crashes in
> this way, even after I rebooted the firewall.
> 
> Thanks for your input :)

For what it's worth, the addresses shown in frames 15, 16, 17, and 18 are ASCII:

ops/rav/
lsysr/lo
_npvo/go
tats.gol
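
The strings above read backwards because amd64 is little-endian: the
lowest byte of each 8-byte word sits first in memory. A small sketch of
the decoding, using the frame values exactly as they appear in the
backtrace (frames 15 to 19); the names FRAMES and decode_frames are
illustrative only:

```python
# Frame "addresses" 15-19 from the backtrace, as printed by gdb.
FRAMES = [
    0x6f70732f7261762f,
    0x6c737973722f6c6f,
    0x5f6e70766f2f676f,
    0x746174732e676f6c,
    0x0065,
]


def decode_frames(words):
    """Treat each value as an 8-byte little-endian word of stack memory
    and return the ASCII string up to the first NUL byte."""
    raw = b"".join(word.to_bytes(8, "little") for word in words)
    return raw.split(b"\x00", 1)[0].decode("ascii")


print(decode_frames(FRAMES))  # prints /var/spool/rsyslog/ovpn_log.state
```

So what gdb shows as return addresses is really string data on the
stack, apparently the path of a file under /var/spool/rsyslog.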

/Andrew


DEBUG - analysing core dumps

2011-05-25 Thread Damien Fleuriot
Hello list,



We've got these boxes at work running FreeBSD 8.1-STABLE amd64 and
serving as firewalls and openvpn gateways.

We use CARP interfaces to provide an active-passive fault tolerant system.


Today, we received a nagios alert from the master box saying its
rsyslogd process had crashed.

I logged on to it and tried to relaunch it, to no avail:
pid 2303 (rsyslogd), uid 0: exited on signal 11 (core dumped)




I would like advice on how to debug the output from the core dump.

This is what I get from gdb:

# gdb
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain
conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd".
(gdb) core rsyslogd.core
Core was generated by `rsyslogd'.
Program terminated with signal 11, Segmentation fault.
#0  0x004258ec in ?? ()




Sadly, getting a backtrace with "bt" gives me more lines with "??",
which is totally not helpful:
[SNIP]
#13 0x7f1f9d70 in ?? ()
#14 0x in ?? ()
#15 0x6f70732f7261762f in ?? ()
#16 0x6c737973722f6c6f in ?? ()
#17 0x5f6e70766f2f676f in ?? ()
#18 0x746174732e676f6c in ?? ()
#19 0x0065 in ?? ()
#20 0x in ?? ()
[SNIP]

I am not sure what steps I should follow to get more information?



Also, I believe that often, core dumps with signal 11 = RAM problems and
I would like a confirmation here.

I am concerned because rsyslogd is the only process that crashes in this
way, even after I rebooted the firewall.



Thanks for your input :)


--
dfl