Bernard,
I have also had some crashes lately with 2.2.1, 2.2.2 and earlier. After some
investigation we think we have pinpointed the problem to the links section
and absolute links. I changed them to relative links and my problems were gone.
I still have to see what is wrong, but for the moment I have no time. These are
links to directories that are mounted over NFS and on which root has no write
permission. In that case we get a lot of these errors:
{{{
*** glibc detected *** malloc(): memory corruption: 0x0818e530 ***
*** glibc detected *** corrupted double-linked list: 0x081969a0 ***
*** glibc detected *** free(): invalid pointer: 0x0818e338 ***
}}}
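For anyone who wants to try the same workaround, the relative target for a link can be worked out from the absolute one. Below is a minimal Python sketch of that calculation, not cfengine code; the helper name `relative_link_target` and the example paths are made up for illustration, and it only manipulates path strings without touching the filesystem.

```python
import os.path


def relative_link_target(link_path, target_path):
    """Return target_path expressed relative to the directory holding link_path.

    Hypothetical helper for illustration only: it works on path strings and
    never reads or modifies the filesystem.
    """
    link_dir = os.path.dirname(os.path.abspath(link_path))
    return os.path.relpath(os.path.abspath(target_path), start=link_dir)


if __name__ == "__main__":
    # Invented example paths; substitute your own link and NFS mount point.
    print(relative_link_target("/data/projects/current", "/nfs/export/projects/v2"))
```

The result is the string you would use as the link target instead of the absolute path.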
Mark Burgess wrote:
> Thanks Bernard,
>
> it could be a spurious operating system thing.
> Good luck
>
> M
>
> bernardchan wrote:
>> Hi Mark,
>>
>> Yes, I have checked from SVN that the code was not changed for many
>> revisions, so I also suspected there might be something specific to
>> those installations and our cfengine configuration. What is strange
>> is that I have not experienced similar crashes on our other machines with
>> identical configuration (and software version, by the way),
>> since all the disks were dd'ed from the same source. I raised it
>> because I was unsure whether there might be some behaviour I don't
>> know of that may lead to those crashes, for instance whether the locale,
>> charset or a special filename pattern may have anything to do with it.
>> In Case 1 the SIGABRT occurred while recursively copying a directory to
>> another directory with mixed regular files and symbolic links. In Case 2
>> the SIGSEGV occurred while processing the "processes" section, apparently
>> while checking the output of ps auxw.
>>
>> Yes, they have been repeatable on many of our systems, but only recently,
>> not some time ago. So probably some of our recent file changes triggered
>> them. I recall seeing some spurious SIGSEGVs on a few other systems, but
>> they were not always repeatable there.
>>
>> I have briefly compiled a version of 2.2.2 and put it in a temporary
>> directory on some of the systems where crashes were seen with 2.2.1. Up to
>> now a few runs have gone through without crashing, but it has not yet
>> seen enough real usage. At this point I cannot tell for sure whether
>> those problems may pop up again once I replace the systemwide version
>> (2.2.1) with 2.2.2 and make some file changes which require copying and
>> restarting processes.
>>
>> I will try to migrate more machines to 2.2.2 and check whether the issue
>> goes away.
>>
>> Thank you again for your reply.
>>
>> Regards,
>> Bernard Chan.
>>
>> On Sun, 07 Oct 2007 10:25:41 +0200, Mark Burgess wrote
>>> Hi Bernard,
>>>
>>> thanks for this information. This is a little unusual. In fact this
>>> is not a SEG fault but an abort signal, which is software generated.
>>> It comes from file operations, which is code that has not changed
>>> for several years. This makes me suspect that there could be some
>>> site-specific reason for this.
>>>
>>> Does this happen regularly/repeatably? On the same host, or different
>>> ones? Would you be willing to try compiling 2.2.2 to see if there are
>>> any differences?
>>>
>>> thanks
>>> Mark
>>>
>>> Bernard Chan wrote:
>>>> Hello,
>>>>
>>>> I experienced various instances of segfaults on some cfengine
>>>> installations.
>>>> The following shows the two cases I have encountered so far:
>>>>
>>>> Compiler: gcc 3.4.4
>>>> Version: cfengine 2.2.1
>>>> Linux (Distribution: AsteriskNow)
>>>>
>>>> CASE 1
>>>>
>>>> (gdb) run -D forceUpdate
>>>> Starting program: /usr/local/sbin/cfagent -D forceUpdate
>>>> Detaching after fork from child process 4679.
>>>> *** glibc detected *** free(): invalid pointer: 0x081772c8 ***
>>>>
>>>> Program received signal SIGABRT, Aborted.
>>>> 0xb7f7f410 in ?? ()
>>>> (gdb) back
>>>> #0 0xb7f7f410 in ?? ()
>>>> #1 0xbff88560 in ?? ()
>>>> #2 0x00000006 in ?? ()
>>>> #3 0x00001244 in ?? ()
>>>> #4 0xb7c3b275 in raise () from /lib/tls/libc.so.6
>>>> #5 0xb7c3ca59 in abort () from /lib/tls/libc.so.6
>>>> #6 0xb7c6f19a in __fsetlocking () from /lib/tls/libc.so.6
>>>> #7 0xb7c750a7 in malloc_usable_size () from /lib/tls/libc.so.6
>>>> #8 0xb7c75abb in free () from /lib/tls/libc.so.6
>>>> #9 0xb7c97e08 in closedir () from /lib/tls/libc.so.6
>>>> #10 0x0805fe75 in cfclosedir (dirh=0xb7d29e40) at image.c:1086
>>>> #11 0x080a162e in RecursiveImage (ip=0x81602f0,
>>>> from=0xbff92950 "/mnt/asterisksetup", to=0xbff90950
>>>> "/etc/asterisk_bak",
>>>> maxrecurse=-99) at expand-image.c:234
>>>> #12 0x08052c25 in MakeImages () at do.c:2548
>>>> #13 0x0804de24 in DoTree (passes=3, info=0x80a7afa "Main Tree")
>>>> at cfagent.c:1328
>>>> #14 0x0804ea5f in main (argc=3, argv=0xbff94aa4) at cfagent.c:180
>>>>
>>>> CASE 2
>>>>
>>>>
>>>> (gdb) run -q -D forceUpdate
>>>> Starting program: /usr/local/sbin/cfagent -q -D forceUpdate
>>>> Detaching after fork from child process 6206.
>>>>
>>>> Detaching after fork from child process 6207.
>>>> Detaching after fork from child process 6208.
>>>> Detaching after fork from child process 6209.
>>>> Detaching after fork from child process 6210.
>>>> Detaching after fork from child process 6211.
>>>>
>>>> Program received signal SIGSEGV, Segmentation fault.
>>>> 0xb7c0efee in free () from /lib/tls/libc.so.6
>>>> (gdb) back
>>>> #0 0xb7c0efee in free () from /lib/tls/libc.so.6
>>>> #1 0xb7c10701 in malloc () from /lib/tls/libc.so.6
>>>> #2 0x0806237d in AppendItem (liststart=0xbfd1b608,
>>>> itemstring=0x816d130 "root 362 0.0 0.0 0 0 ?
>>>> S< 10:05 0:00 [cifsoplockd]", classes=0x8199b18 "") at item.c:349
>>>> #3 0x080624fd in CopyList (dest=0xbfd1b608, source=0x8193030) at
>>>> item.c:210
>>>> #4 0x0805d513 in LoadProcessTable (procdata=0xbfd1b748,
>>>> psopts=0x80b15c1 "auxw") at process.c:78
>>>> #5 0x0805302d in CheckProcesses () at do.c:2678
>>>> #6 0x0804ddc9 in DoTree (passes=3, info=0x80a7afa "Main Tree")
>>>> at cfagent.c:1348
>>>> #7 0x0804ea5f in main (argc=4, argv=0xbfd1b834) at cfagent.c:180
>>>>
>>>>
>>>> Thanks for creating cfengine.
>>>>
>>>> Regards,
>>>> Bernard Chan.
>>>>
>>>> _______________________________________________
>>>> Bug-cfengine mailing list
>>>> [email protected]
>>>> https://cfengine.org/mailman/listinfo/bug-cfengine
>>> --
>>> Mark Burgess
>>>
>>> Professor of Network and System Administration
>>> Oslo University College
>>>
>>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>> Work: +47 22453272 Email: [EMAIL PROTECTED]
>>> Fax : +47 22453205 WWW : http://www.iu.hio.no/~mark
>>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>
>>
>> --
>> PowerAll Networks Ltd (http://www.powerallnetworks.com)
>>
>
> _______________________________________________
> Cfengine mailing list
> [EMAIL PROTECTED]
> https://orwell.sara.nl/mailman/listinfo/cfengine
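On Mark's point in the quoted thread that Case 1 is an abort rather than a segfault: the two can be told apart from the exit status of the crashed process. Here is a rough Python sketch, assuming a POSIX system; the helper name `exit_signal_name` is invented for illustration, and the child processes only imitate the two kinds of crash, they do not reproduce the cfagent bug.

```python
import signal
import subprocess
import sys


def exit_signal_name(python_snippet):
    """Run python_snippet in a child interpreter and name the signal that killed it.

    Returns None if the child exited normally. Hypothetical helper for
    illustration; relies on POSIX semantics for negative return codes.
    """
    result = subprocess.run(
        [sys.executable, "-c", python_snippet],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    if result.returncode < 0:
        # On POSIX a negative return code means "killed by signal -returncode".
        return signal.Signals(-result.returncode).name
    return None


if __name__ == "__main__":
    # abort() is raised by software, as in the Case 1 backtrace (raise -> abort).
    print(exit_signal_name("import os; os.abort()"))
    # Sending SIGSEGV to ourselves stands in for a real invalid memory access.
    print(exit_signal_name("import os, signal; os.kill(os.getpid(), signal.SIGSEGV)"))
```

A wrapper like this around a failing run would show at a glance which of the two cases a given crash belongs to.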
--
********************************************************************
* *
* Bas van der Vlies e-mail: [EMAIL PROTECTED] *
* SARA - Academic Computing Services phone: +31 20 592 8012 *
* Kruislaan 415 fax: +31 20 6683167 *
* 1098 SJ Amsterdam *
* *
********************************************************************