On Wed, 2019-02-20 at 14:03 +0000, Edwin Török wrote:
>
> On 20/02/2019 12:44, Jan Pokorný wrote:
> > On 19/02/19 16:41 +0000, Edwin Török wrote:
> > > Also noticed this:
> > > [ 5390.361861] crmd[12620]: segfault at 0 ip 00007f221c5e03b1 sp 00007ffcf9cf9d88 error 4 in libc-2.17.so[7f221c554000+1c2000]
> > > [ 5390.361918] Code: b8 00 00 00 04 00 00 00 74 07 48 8d 05 f8 f2 0d 00 c3 0f 1f 80 00 00 00 00 48 31 c0 89 f9 83 e1 3f 66 0f ef c0 83 f9 30 77 19 <f3> 0f 6f 0f 66 0f 74 c1 66 0f d7 d0 85 d2 75 7a 48 89 f8 48 83 e0
> >
> > By any chance, is this an unmodified pacemaker package as
> > obtainable from some public repo together with debug symbols?
>
> I haven't modified pacemaker, here are the versions:
>
> rpm -q pacemaker
> pacemaker-1.1.19-8.el7.x86_64
> rpm -q glibc
> glibc-2.17-260.el7_6.3.x86_64
>
> 0x00007f221c5e03b1 - 0x7f221c554000 = 0x8c3b1
> addr2line -fie /lib64/libc.so.6 0x8c3b1
> __GI_strlen
> :?
>
> Feb 19 16:22:04 host-10 crmd[12620]: notice: Additional logging available in /var/log/cluster/corosync.log
> Feb 19 16:22:05 host-10 crmd[12620]: notice: Connecting to cluster infrastructure: corosync
> Feb 19 16:29:50 host-10 crmd[12620]: error: Could not join the CPG group 'crmd': 6
> Feb 19 16:29:50 host-10 kernel: crmd[12620]: segfault at 0 ip 00007f221c5e03b1 sp 00007ffcf9cf9d88 error 4 in libc-2.17.so[7f221c554000+1c2000]
> Feb 19 16:38:28 host-10 pacemakerd[12614]: error: Managed process 12620 (crmd) dumped core
> Feb 19 16:38:28 host-10 pacemakerd[12614]: error: The crmd process (12620) terminated with signal 11 (core=1)
>
> I found a core file in /var/lib/pacemaker/cores
> (gdb) bt
> #0 0x00007f221c5e03b1 in __strlen_sse2 () from /lib64/libc.so.6
> #1 0x00007f221c5e00be in strdup () from /lib64/libc.so.6
> #2 0x00007f221f1a05cd in election_init (name=name@entry=0x0, uname=0x0, period_ms=period_ms@entry=60000, cb=cb@entry=0x55ea42cb2790 <election_timeout_popped>) at election.c:78

The current code asserts that uname is non-NULL, so this segfault in
strdup() won't happen, but of course a failed assertion is still a crash.

> #3 0x000055ea42cb3d4c in do_ha_control (action=4, cause=<optimized out>, cur_state=<optimized out>, current_input=<optimized out>, msg_data=0x55ea4464fec0) at control.c:139
> #4 0x000055ea42cb0524 in s_crmd_fsa_actions (fsa_data=fsa_data@entry=0x55ea4464fec0) at fsa.c:305
> #5 0x000055ea42cb216a in s_crmd_fsa (cause=cause@entry=C_STARTUP) at fsa.c:237
> #6 0x000055ea42cad707 in crmd_init () at main.c:173
> #7 0x000055ea42cad510 in main (argc=1, argv=0x7ffcf9cfa078) at main.c:122
>
> g
>
> Best regards,
> --Edwin
>
> _______________________________________________
> Users mailing list: Users@clusterlabs.org
> https://lists.clusterlabs.org/mailman/listinfo/users
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
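
P.S. To illustrate the difference between the segfault in the backtrace, an assertion, and a graceful failure, here is a minimal sketch. It is not Pacemaker's actual election code: the type election_sketch_t, the functions election_sketch_new(), election_sketch_new_strict() and election_sketch_free(), and the fallback name "election" are all hypothetical names made up for this example. The only thing taken from the backtrace is the failure mode itself, a NULL uname reaching strdup().

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L  /* exposes strdup() under strict -std= modes */
#endif
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an election object; NOT Pacemaker's real struct. */
typedef struct {
    char *name;
    char *uname;
} election_sketch_t;

/* Release a sketch object; accepts NULL like free() does. */
void election_sketch_free(election_sketch_t *e)
{
    if (e != NULL) {
        free(e->name);
        free(e->uname);
        free(e);
    }
}

/*
 * Guarded variant: strdup(NULL) is undefined behaviour (in the backtrace it
 * faulted inside strlen), so a NULL uname is rejected before any copy is made.
 * The caller gets NULL back and can log an error and shut down cleanly.
 */
election_sketch_t *election_sketch_new(const char *name, const char *uname)
{
    election_sketch_t *e;

    if (uname == NULL) {
        return NULL;
    }
    e = calloc(1, sizeof(*e));
    if (e == NULL) {
        return NULL;
    }
    e->uname = strdup(uname);
    /* "election" is an arbitrary fallback chosen for this sketch only. */
    e->name = strdup(name != NULL ? name : "election");
    if (e->uname == NULL || e->name == NULL) {
        election_sketch_free(e);
        return NULL;
    }
    return e;
}

/*
 * Assert-based variant: a NULL uname aborts the process (unless NDEBUG is
 * defined). That replaces the segfault with an abort, which is still a crash.
 */
election_sketch_t *election_sketch_new_strict(const char *name, const char *uname)
{
    assert(uname != NULL);
    return election_sketch_new(name, uname);
}
```

With the guarded variant, election_sketch_new("crmd", NULL) returns NULL instead of crashing, so the caller decides what to do when the local node name is not known (for example because joining the CPG group failed, as in the log above). Whether returning an error or asserting is the better choice in the real code is a separate design question.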