Launchpad has imported 4 comments from the remote bug at https://bugzilla.redhat.com/show_bug.cgi?id=586752.
If you reply to an imported comment from within Launchpad, your comment will be sent to the remote bug automatically. Read more about Launchpad's inter-bugtracker facilities at https://help.launchpad.net/InterBugTracking.

------------------------------------------------------------------------
On 2010-04-28T10:12:15+00:00 Oliver wrote:

Created attachment 409748
Andrew Beekhof's patch to fix this issue

Description of problem:
dlm_controld.pcmk segfaults on startup if the network uses VLAN, bonding, or bridging and corosync/pacemaker is invoked too early.

Version-Release number of selected component (if applicable):
Bug and patch tested on the 3.0.7 Ubuntu Lucid packages.

How reproducible:
Configure any of the above on top of the raw interface and start corosync before the network settles.

Additional info:
The issue is discussed here:
http://oss.clusterlabs.org/pipermail/pacemaker/2010-April/005954.html

Andrew Beekhof <[email protected]> posted the attached patch that fixes this issue.

gdb output is:

Core was generated by `dlm_controld.pcmk -q 0'.
Program terminated with signal 11, Segmentation fault.
#0  __strlen_sse2 () at ../sysdeps/x86_64/multiarch/../strlen.S:31
        in ../sysdeps/x86_64/multiarch/../strlen.S
#0  __strlen_sse2 () at ../sysdeps/x86_64/multiarch/../strlen.S:31
#1  0x00007f499565cd46 in *__GI___strdup (s=0x0) at strdup.c:42
#2  0x0000000000403f0c in dlm_process_node (key=<value optimized out>, value=0x1864a30, user_data=0x62a4f8) at /usr/src/packages/redhat-cluster/3.0.7/redhat-cluster-3.0.7/group/dlm_controld/pacemaker.c:136
#3  0x00007f4995cdbd73 in IA__g_hash_table_foreach (hash_table=0x1866050, func=0x403e40 <dlm_process_node>, user_data=0x62a4f8) at /build/buildd/glib2.0-2.24.0/glib/ghash.c:1325
#4  0x0000000000403c9e in update_cluster () at /usr/src/packages/redhat-cluster/3.0.7/redhat-cluster-3.0.7/group/dlm_controld/pacemaker.c:82
#5  0x0000000000415a4a in loop () at /usr/src/packages/redhat-cluster/3.0.7/redhat-cluster-3.0.7/group/dlm_controld/main.c:986
#6  0x000000000041659c in main (argc=<value optimized out>, argv=<value optimized out>) at /usr/src/packages/redhat-cluster/3.0.7/redhat-cluster-3.0.7/group/dlm_controld/main.c:1295

hth,
Oliver

Reply at: https://bugs.launchpad.net/ubuntu/+source/redhat-cluster/+bug/571612/comments/0

------------------------------------------------------------------------
On 2010-04-28T12:08:13+00:00 Andrew wrote:

Patch fa24b46 resolving this issue has been committed in cluster.git:
http://git.fedorahosted.org/git/?p=cluster.git;a=commitdiff;h=fa24b460c51aa0c47d0842703feea8bca0ed66b7

Essentially, the dlm was trying to create a configfs entry for a node with no address. This led to a NULL pointer being dereferenced and the dlm crashing (frame #1 of the backtrace shows strdup being called with s=0x0). The above-mentioned patch now checks for a valid address before continuing.

Reply at: https://bugs.launchpad.net/ubuntu/+source/redhat-cluster/+bug/571612/comments/1

------------------------------------------------------------------------
On 2010-04-29T13:22:44+00:00 Andrew wrote:

Sorry, set the wrong status.
Reply at: https://bugs.launchpad.net/ubuntu/+source/redhat-cluster/+bug/571612/comments/3

------------------------------------------------------------------------
On 2010-07-30T11:29:34+00:00 Bug wrote:

This bug appears to have been reported against 'rawhide' during the Fedora 14 development cycle.
Changing version to '14'.

More information and reason for this action is here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Reply at: https://bugs.launchpad.net/ubuntu/+source/redhat-cluster/+bug/571612/comments/4


** Changed in: redhatcluster
       Status: Unknown => Fix Released

** Changed in: redhatcluster
   Importance: Unknown => Medium

-- 
You received this bug notification because you are a member of Ubuntu High Availability Team, which is subscribed to redhat-cluster in Ubuntu.
https://bugs.launchpad.net/bugs/571612

Title:
  dlm_controld.pcmk segfault

Status in Red Hat Cluster:
  Fix Released
Status in redhat-cluster package in Ubuntu:
  Invalid

Bug description:
  Anyone who uses link aggregation (me), bridging, or VLANs is affected because of the time required to bring up the network after reboot. Corosync comes up before the network is ready, and dlm segfaults.

  This has been fixed upstream, and the fix is included in Maverick+.

  Upstream bug report and patch [1]. Patch committed upstream [2]. Discussion about the issue [3].

  [1]: https://bugzilla.redhat.com/show_bug.cgi?id=586752
  [2]: http://git.fedorahosted.org/git/?p=cluster.git;a=commitdiff;h=fa24b460c51aa0c47d0842703feea8bca0ed66b7
  [3]: http://oss.clusterlabs.org/pipermail/pacemaker/2010-April/005954.html
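
For readers who have not opened the upstream commit: the kind of check Andrew describes above (verify that a node has an address before copying it) might look roughly like the sketch below. This is an illustration only, not the code from commit fa24b46. The struct and function names (example_node, example_copy_node_addr) are hypothetical and do not come from redhat-cluster or Pacemaker.

```c
#define _POSIX_C_SOURCE 200809L  /* for strdup() */

#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a cluster peer; not the real Pacemaker struct. */
struct example_node {
    unsigned int id;
    const char *addr;  /* NULL while the peer has no known address */
};

/*
 * Return a heap-allocated copy of the node's address, or NULL if the
 * node has no usable address. The caller frees the result.
 */
char *example_copy_node_addr(const struct example_node *node)
{
    assert(node != NULL);

    /* Guard: strdup(NULL) is undefined behaviour; in the backtrace it
     * ends up in strlen() with s=0x0 and raises SIGSEGV. */
    if (node->addr == NULL || node->addr[0] == '\0') {
        return NULL;  /* caller can skip this node instead of crashing */
    }

    return strdup(node->addr);
}
```

With a guard like this, a caller that iterates over all peers can skip any node for which the function returns NULL and go on to the next one, rather than passing a NULL pointer down to strdup.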

