Thanks for taking the time to do this. The failure seems to occur at the same spot in the two reports. It happens during startup, so nothing significant is going on at that point.
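One way to check the "same spot" observation is to compare the last flight-recorder record in each report. What follows is only a minimal sketch and not part of corosync: it assumes each report's corosync-fplay output has been saved as plain text in the "rec=[N] Log Message=..." form quoted further down this thread, and the names last_record and same_spot are hypothetical helpers.

```python
import re

# One record as printed in the corosync-fplay output quoted in this thread,
# e.g. "rec=[40] Log Message=info: pcmk_startup: Service: 9".
# A message runs until the next record, the "Finishing replay" line, or end of text.
RECORD = re.compile(
    r"rec=\[(\d+)\] Log Message=(.*?)(?=rec=\[|Finishing replay|\Z)", re.S
)


def last_record(fplay_text):
    """Return (record_number, message) for the final record, or None if there is none."""
    matches = RECORD.findall(fplay_text)
    if not matches:
        return None
    number, message = matches[-1]
    # Collapse wrapped lines and repeated whitespace inside the message.
    return int(number), " ".join(message.split())


def same_spot(report_a, report_b):
    """True if both reports end on the same log message (record numbers are ignored)."""
    a, b = last_record(report_a), last_record(report_b)
    return a is not None and b is not None and a[1] == b[1]
```

Only the message text is compared here, on the assumption that two nodes may log a different number of records before reaching the same point.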
I did run the clusterepo build for f12 with pacemaker with no segfault. I am downloading CentOS at the moment to give it a shot; it could be distro-specific. Both reports are on CentOS (32-bit and 64-bit).

Regards
-steve

On 05/25/2010 01:52 PM, Simpson, John R wrote:
>
> [r...@node01 ~]# corosync-fplay
> Starting replay: head [1241] tail [0]
> rec=[1] Log Message=Corosync Cluster Engine ('1.2.2'): started and ready to provide service.
> rec=[2] Log Message=Corosync built-in features: nss rdma
> rec=[3] Log Message=Successfully read main configuration file '/etc/corosync/corosync.conf'.
> rec=[4] Log Message=Token Timeout (5000 ms) retransmit timeout (247 ms)
> rec=[5] Log Message=token hold (187 ms) retransmits before loss (20 retrans)
> rec=[6] Log Message=join (1000 ms) send_join (0 ms) consensus (7500 ms) merge (200 ms)
> rec=[7] Log Message=downcheck (1000 ms) fail to recv const (50 msgs)
> rec=[8] Log Message=seqno unchanged const (30 rotations) Maximum network MTU 1402
> rec=[9] Log Message=window size per rotation (50 messages) maximum messages per rotation (20 messages)
> rec=[10] Log Message=send threads (0 threads)
> rec=[11] Log Message=RRP token expired timeout (247 ms)
> rec=[12] Log Message=RRP token problem counter (2000 ms)
> rec=[13] Log Message=RRP threshold (10 problem count)
> rec=[14] Log Message=RRP mode set to none.
> rec=[15] Log Message=heartbeat_failures_allowed (0)
> rec=[16] Log Message=max_network_delay (50 ms)
> rec=[17] Log Message=HeartBeat is Disabled. To enable set heartbeat_failures_allowed > 0
> rec=[18] Log Message=Initializing transport (UDP/IP).
> rec=[19] Log Message=Initializing transmit/receive security: libtomcrypt SOBER128/SHA1HMAC (mode 0).
> rec=[20] Log Message=you are using ipc api v2
> rec=[21] Log Message=Receive multicast socket recv buffer size (262142 bytes).
> rec=[22] Log Message=Transmit multicast socket send buffer size (262142 bytes).
> rec=[23] Log Message=The network interface [172.16.0.147] is now up.
> rec=[24] Log Message=Created or loaded sequence id 13448.172.16.0.147 for this ring.
> rec=[25] Log Message=info: process_ais_conf: Reading configure
> rec=[26] Log Message=info: config_find_init: Local handle: 2013064636357672962 for logging
> rec=[27] Log Message=info: config_find_next: Processing additional logging options...
> rec=[28] Log Message=info: get_config_opt: Found 'on' for option: debug
> rec=[29] Log Message=info: get_config_opt: Defaulting to 'off' for option: to_file
> rec=[30] Log Message=info: get_config_opt: Found 'yes' for option: to_syslog
> rec=[31] Log Message=info: get_config_opt: Defaulting to 'daemon' for option: syslog_facility
> rec=[32] Log Message=info: config_find_init: Local handle: 4730966301143465987 for service
> rec=[33] Log Message=info: config_find_next: Processing additional service options...
> rec=[34] Log Message=info: get_config_opt: Defaulting to 'pcmk' for option: clustername
> rec=[35] Log Message=info: get_config_opt: Defaulting to 'no' for option: use_logd
> rec=[36] Log Message=info: get_config_opt: Defaulting to 'no' for option: use_mgmtd
> rec=[37] Log Message=info: pcmk_startup: CRM: Initialized
> rec=[38] Log Message=Logging: Initialized pcmk_startup
> rec=[39] Log Message=info: pcmk_startup: Maximum core file size is: 4294967295
> rec=[40] Log Message=info: pcmk_startup: Service: 9
> Finishing replay: records found [40]
>
> John Simpson
> Senior Software Engineer, I. T. Engineering and Operations
>
>> -----Original Message-----
>> From: Steven Dake [mailto:sd...@redhat.com]
>> Sent: Tuesday, May 25, 2010 4:46 PM
>> To: Simpson, John R
>> Cc: openais@lists.linux-foundation.org
>> Subject: Re: [Openais] FW: [Linux-HA] Problem with last Pacemaker and
>> corosync releases available for RHEL5 ?
>>
>> On 05/25/2010 12:49 PM, Simpson, John R wrote:
>>> Steve,
>>>
>>> Unfortunately I've downgraded to an earlier version that
>>> doesn't have the issue.
>>> I have an hb_report from the problem version
>>> if that would help.
>>>
>>> John
>>>
>>> John Simpson
>>> Senior Software Engineer, I. T. Engineering and Operations
>>>
>>>
>>>> -----Original Message-----
>>>> From: Steven Dake [mailto:sd...@redhat.com]
>>>> Sent: Tuesday, May 25, 2010 2:53 PM
>>>> To: Simpson, John R
>>>> Cc: openais@lists.linux-foundation.org
>>>> Subject: Re: [Openais] FW: [Linux-HA] Problem with last Pacemaker and
>>>> corosync releases available for RHEL5 ?
>>>>
>>>> On 05/25/2010 06:30 AM, Simpson, John R wrote:
>>>>> Greetings all,
>>>>>
>>>>> I'm new to the OpenAIS/Corosync list and am moving this
>>>>> discussion here at the request of Andrew Beekhof. A few people
>>>>> on the Linux-HA / Pacemaker mailing list, myself included, have
>>>>> been getting segmentation faults from corosync after the latest
>>>>> update.
>>>>>
>>>>> My systems are CentOS 5.5 32-bit running under VMware ESXi 3.5.
>>>>>
>>>>> Please let me know if there is any additional information that
>>>>> I need to provide, or if this is a known issue.
>>>>>
>>>>> Thanks,
>>>>
>>>> Could you provide a corosync-fplay output please?
>>>>
>>>> Thanks
>>>> -steve
>>>>
>>>>>
>>>>> John
>>>>>
>>>>>
>>>>> [r...@node01 ~]# uname -a
>>>>> Linux node01 2.6.18-194.3.1.el5 #1 SMP Thu May 13 13:09:10 EDT 2010 i686 i686 i386 GNU/Linux
>>>>> corosynclib-1.2.2-1.1.el5
>>>>> corosync-1.2.2-1.1.el5
>>>>> corosynclib-devel-1.2.2-1.1.el5
>>>>> pacemaker-libs-devel-1.0.8-6.1.el5
>>>>> pacemaker-libs-1.0.8-6.1.el5
>>>>> pacemaker-1.0.8-6.1.el5
>>>>> centos-release-notes-5.5-0
>>>>> centos-release-5-5.el5.centos
>>>>>
>>>>> John Simpson
>>>>> Senior Software Engineer, I. T.
>>>>> Engineering and Operations
>>>>>
>>>>>
>>>>> -----Original Message-----
>>>>> From: linux-ha-boun...@lists.linux-ha.org [mailto:linux-ha-boun...@lists.linux-ha.org] On Behalf Of Andrew Beekhof
>>>>> Sent: Tuesday, May 25, 2010 4:31 AM
>>>>> To: General Linux-HA mailing list
>>>>> Subject: Re: [Linux-HA] Pb with last Pacemaker and corosync releases available for RHEL5 ?
>>>>>
>>>>> On Mon, May 24, 2010 at 11:05 PM, Simpson, John R
>>>>> <john_simp...@reyrey.com> wrote:
>>>>>> Andrew,
>>>>>>
>>>>>> I have the same problem and have a core file. Is there somewhere you'd like me to send it (24M)?
>>>>>
>>>>> Core files are only useful on the machine that generated them.
>>>>>
>>>>> Could you send this stack trace to the openais list please?
>>>>> Thats the best place to report corosync/openais related issues.
>>>>>
>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> John
>>>>>>
>>>>>> [r...@node01 ~]# file /var/lib/corosync/core.2998
>>>>>> /var/lib/corosync/core.2998: ELF 32-bit LSB core file Intel 80386, version 1 (SYSV), SVR4-style, from 'corosync'
>>>>>>
>>>>>> [r...@node01 ~]# gdb /usr/sbin/corosync /var/lib/corosync/core.2998
>>>>>> GNU gdb (GDB) Red Hat Enterprise Linux (7.0.1-23.el5)
>>>>>> Copyright (C) 2009 Free Software Foundation, Inc.
>>>>>> License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
>>>>>> This is free software: you are free to change and redistribute it.
>>>>>> There is NO WARRANTY, to the extent permitted by law. Type "show copying"
>>>>>> and "show warranty" for details.
>>>>>> This GDB was configured as "i386-redhat-linux-gnu".
>>>>>> For bug reporting instructions, please see:
>>>>>> <http://www.gnu.org/software/gdb/bugs/>...
>>>>>> Reading symbols from /usr/sbin/corosync...(no debugging symbols found)...done.
>>>>>> [New Thread 3001]
>>>>>> [New Thread 3000]
>>>>>> [New Thread 2998]
>>>>>>
>>>>>> warning: .dynamic section for "/usr/lib/libnssutil3.so" is not at the expected address
>>>>>>
>>>>>> warning: difference appears to be caused by prelink, adjusting expectations
>>>>>>
>>>>>> warning: .dynamic section for "/usr/lib/libplds4.so" is not at the expected address
>>>>>>
>>>>>> warning: difference appears to be caused by prelink, adjusting expectations
>>>>>>
>>>>>> warning: .dynamic section for "/usr/lib/libplc4.so" is not at the expected address
>>>>>>
>>>>>> warning: difference appears to be caused by prelink, adjusting expectations
>>>>>>
>>>>>> warning: .dynamic section for "/usr/lib/libxml2.so.2" is not at the expected address
>>>>>>
>>>>>> warning: difference appears to be caused by prelink, adjusting expectations
>>>>>>
>>>>>> warning: .dynamic section for "/lib/libpam.so.0" is not at the expected address
>>>>>>
>>>>>> warning: difference appears to be caused by prelink, adjusting expectations
>>>>>>
>>>>>> warning: .dynamic section for "/lib/libglib-2.0.so.0" is not at the expected address
>>>>>>
>>>>>> warning: difference appears to be caused by prelink, adjusting expectations
>>>>>> Reading symbols from /usr/lib/libtotem_pg.so.4...(no debugging symbols found)...done.
>>>>>> Loaded symbols for /usr/lib/libtotem_pg.so.4
>>>>>> Reading symbols from /usr/lib/liblogsys.so.4...(no debugging symbols found)...done.
>>>>>> Loaded symbols for /usr/lib/liblogsys.so.4
>>>>>> Reading symbols from /usr/lib/libcoroipcs.so.4...(no debugging symbols found)...done.
>>>>>> Loaded symbols for /usr/lib/libcoroipcs.so.4
>>>>>> Reading symbols from /lib/librt.so.1...(no debugging symbols found)...done.
>>>>>> Loaded symbols for /lib/librt.so.1
>>>>>> Reading symbols from /lib/libpthread.so.0...(no debugging symbols found)...done.
>>>>>> Loaded symbols for /lib/libpthread.so.0
>>>>>> Reading symbols from /lib/libdl.so.2...(no debugging symbols found)...done.
>>>>>> Loaded symbols for /lib/libdl.so.2
>>>>>> Reading symbols from /lib/libc.so.6...(no debugging symbols found)...done.
>>>>>> Loaded symbols for /lib/libc.so.6
>>>>>> Reading symbols from /usr/lib/libssl3.so...(no debugging symbols found)...done.
>>>>>> Loaded symbols for /usr/lib/libssl3.so
>>>>>> Reading symbols from /usr/lib/libsmime3.so...(no debugging symbols found)...done.
>>>>>> Loaded symbols for /usr/lib/libsmime3.so
>>>>>> Reading symbols from /usr/lib/libnss3.so...(no debugging symbols found)...done.
>>>>>> Loaded symbols for /usr/lib/libnss3.so
>>>>>> Reading symbols from /usr/lib/libnssutil3.so...(no debugging symbols found)...done.
>>>>>> Loaded symbols for /usr/lib/libnssutil3.so
>>>>>> Reading symbols from /usr/lib/libplds4.so...(no debugging symbols found)...done.
>>>>>> Loaded symbols for /usr/lib/libplds4.so
>>>>>> Reading symbols from /usr/lib/libplc4.so...(no debugging symbols found)...done.
>>>>>> Loaded symbols for /usr/lib/libplc4.so
>>>>>> Reading symbols from /usr/lib/libnspr4.so...(no debugging symbols found)...done.
>>>>>> Loaded symbols for /usr/lib/libnspr4.so
>>>>>> Reading symbols from /usr/lib/librdmacm.so.1...(no debugging symbols found)...done.
>>>>>> Loaded symbols for /usr/lib/librdmacm.so.1
>>>>>> Reading symbols from /usr/lib/libibverbs.so.1...(no debugging symbols found)...done.
>>>>>> Loaded symbols for /usr/lib/libibverbs.so.1
>>>>>> Reading symbols from /lib/ld-linux.so.2...(no debugging symbols found)...done.
>>>>>> Loaded symbols for /lib/ld-linux.so.2
>>>>>> Reading symbols from /usr/lib/libz.so.1...(no debugging symbols found)...done.
>>>>>> Loaded symbols for /usr/lib/libz.so.1
>>>>>> Reading symbols from /usr/libexec/lcrso/objdb.lcrso...(no debugging symbols found)...done.
>>>>>> Loaded symbols for /usr/libexec/lcrso/objdb.lcrso
>>>>>> Reading symbols from /usr/libexec/lcrso/coroparse.lcrso...(no debugging symbols found)...done.
>>>>>> Loaded symbols for /usr/libexec/lcrso/coroparse.lcrso
>>>>>> Reading symbols from /usr/libexec/lcrso/pacemaker.lcrso...(no debugging symbols found)...done.
>>>>>> Loaded symbols for /usr/libexec/lcrso/pacemaker.lcrso
>>>>>> Reading symbols from /usr/lib/libplumb.so.2...(no debugging symbols found)...done.
>>>>>> Loaded symbols for /usr/lib/libplumb.so.2
>>>>>> Reading symbols from /usr/lib/libpils.so.2...(no debugging symbols found)...done.
>>>>>> Loaded symbols for /usr/lib/libpils.so.2
>>>>>> Reading symbols from /usr/lib/libbz2.so.1...(no debugging symbols found)...done.
>>>>>> Loaded symbols for /usr/lib/libbz2.so.1
>>>>>> Reading symbols from /usr/lib/libxslt.so.1...(no debugging symbols found)...done.
>>>>>> Loaded symbols for /usr/lib/libxslt.so.1
>>>>>> Reading symbols from /usr/lib/libxml2.so.2...(no debugging symbols found)...done.
>>>>>> Loaded symbols for /usr/lib/libxml2.so.2
>>>>>> Reading symbols from /lib/libuuid.so.1...(no debugging symbols found)...done.
>>>>>> Loaded symbols for /lib/libuuid.so.1
>>>>>> Reading symbols from /lib/libpam.so.0...(no debugging symbols found)...done.
>>>>>> Loaded symbols for /lib/libpam.so.0
>>>>>> Reading symbols from /lib/libglib-2.0.so.0...(no debugging symbols found)...done.
>>>>>> Loaded symbols for /lib/libglib-2.0.so.0
>>>>>> Reading symbols from /usr/lib/libltdl.so.3...(no debugging symbols found)...done.
>>>>>> Loaded symbols for /usr/lib/libltdl.so.3
>>>>>> Reading symbols from /lib/libm.so.6...(no debugging symbols found)...done.
>>>>>> Loaded symbols for /lib/libm.so.6
>>>>>> Reading symbols from /lib/libaudit.so.0...(no debugging symbols found)...done.
>>>>>> Loaded symbols for /lib/libaudit.so.0
>>>>>> Reading symbols from /lib/libnss_files.so.2...(no debugging symbols found)...done.
>>>>>> Loaded symbols for /lib/libnss_files.so.2
>>>>>> Core was generated by `corosync'.
>>>>>> Program terminated with signal 11, Segmentation fault.
>>>>>> #0 0x00d351ab in strlen () from /lib/libc.so.6
>>>>>>
>>>>>> (gdb) where
>>>>>> #0 0x00d351ab in strlen () from /lib/libc.so.6
>>>>>> #1 0x00b0d52b in ?? () from /usr/lib/liblogsys.so.4
>>>>>> #2 0x00856832 in start_thread () from /lib/libpthread.so.0
>>>>>> #3 0x00d96e0e in clone () from /lib/libc.so.6
>>>>>>
>>>>>>
>>>>>> John Simpson
>>>>>> Senior Software Engineer, I. T. Engineering and Operations
>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: linux-ha-boun...@lists.linux-ha.org [mailto:linux-ha-boun...@lists.linux-ha.org] On Behalf Of Andrew Beekhof
>>>>>>> Sent: Friday, May 21, 2010 7:01 AM
>>>>>>> To: General Linux-HA mailing list
>>>>>>> Subject: Re: [Linux-HA] Pb with last Pacemaker and corosync releases
>>>>>>> available for RHEL5 ?
>>>>>>>
>>>>>>> is there a core file in /var/lib/corosync?
>>>>>>>
>>>>>>> On Fri, May 21, 2010 at 11:57 AM, Alain.Moulle <alain.mou...@bull.net> wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> FYI , it was working fine with :
>>>>>>>> corosync-1.2.1-1.el5
>>>>>>>> corosynclib-1.2.1-1.el5
>>>>>>>> pacemaker-1.0.8-6.el5
>>>>>>>> pacemaker-libs-1.0.8-6.el5
>>>>>>>>
>>>>>>>> then I update to :
>>>>>>>> corosync-1.2.2-1.1.el5
>>>>>>>> corosynclib-1.2.2-1.1.el5
>>>>>>>> pacemaker-1.0.8-6.1.el5
>>>>>>>> pacemaker-libs-1.0.8-6.1.el5
>>>>>>>>
>>>>>>>> and the /etc/init.d/corosync start fails on all nodes with only these messages:
>>>>>>>> Starting Corosync Cluster Engine (corosync): [FAILED]
>>>>>>>> and as it is very short in message, the file is joined ...
>>>>>>>>
>>>>>>>> Regards
>>>>>>>> Alain
>>>>>>>>
>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> Linux-HA mailing list
>>>>>>>> linux...@lists.linux-ha.org
>>>>>>>> http://lists.linux-ha.org/mailman/listinfo/linux-ha
>>>>>>>> See also: http://linux-ha.org/ReportingProblems
>>>>>>>>
>>>>> _______________________________________________
>>>>> Openais mailing list
>>>>> Openais@lists.linux-foundation.org
>>>>> https://lists.linux-foundation.org/mailman/listinfo/openais
>>>
>>
>>
>> Even after a downgrade, corosync-fplay will produce output from the last
>> segfault. Please give it a run and post results.
>>
>> Regards
>> -steve