[Bug 1341496] Re: corosync hangs inside libqb
utopic has seen the end of its life and is no longer receiving any updates. Marking the utopic task for this ticket as "Won't Fix". ** Changed in: libqb (Ubuntu Utopic) Status: In Progress => Won't Fix -- You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. https://bugs.launchpad.net/bugs/1341496 Title: corosync hangs inside libqb To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/libqb/+bug/1341496/+subscriptions -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 1341496] Re: corosync hangs inside libqb
** Branch linked: lp:ubuntu/trusty-proposed/libqb
[Bug 1341496] Re: corosync hangs inside libqb
This bug was fixed in the package libqb - 0.16.0.real-1ubuntu4

---
libqb (0.16.0.real-1ubuntu4) trusty; urgency=medium

  [ Billy Olsen ]
  * debian/patches/ringbuffer-reclaim-fix.patch: infinite loop when
    attempting to reclaim space in the ringbuffer fails. (LP: #1341496)

 -- Billy Olsen  Tue, 28 Apr 2015 12:03:15 -0500

** Changed in: libqb (Ubuntu Trusty)
       Status: Fix Committed => Fix Released
[Bug 1341496] Re: corosync hangs inside libqb
Marking verification-done for trusty based on reports of no longer seeing the corosync 100% cpu issue after applying this update.

** Tags removed: verification-needed
** Tags added: verification-done
[Bug 1341496] Re: corosync hangs inside libqb
I set the development task to Fix Released based on the comments that this works with 0.17, which is in Vivid.
[Bug 1341496] Re: corosync hangs inside libqb
Hello Tomasz, or anyone else affected,

Accepted libqb into trusty-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/libqb/0.16.0.real-1ubuntu4 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

** Changed in: libqb (Ubuntu)
       Status: Confirmed => Fix Released

** Changed in: libqb (Ubuntu Trusty)
       Status: In Progress => Fix Committed

** Tags added: verification-needed
[Bug 1341496] Re: corosync hangs inside libqb
Sponsored for Trusty.
[Bug 1341496] Re: corosync hangs inside libqb
** Also affects: libqb (Ubuntu Utopic)
   Importance: Undecided
       Status: New

** Also affects: libqb (Ubuntu Trusty)
   Importance: Undecided
       Status: New

** Changed in: libqb (Ubuntu Trusty)
   Importance: Undecided => Medium

** Changed in: libqb (Ubuntu Utopic)
   Importance: Undecided => Medium

** Changed in: libqb (Ubuntu Trusty)
       Status: New => In Progress

** Changed in: libqb (Ubuntu Utopic)
       Status: New => In Progress
[Bug 1341496] Re: corosync hangs inside libqb
Attaching debdiff for SRU proposal. The package is the same between trusty and utopic, with the commits already part of the 0.17.1 release in vivid. Please let me know if there's any additional information necessary.

** Description changed:

  $ lsb_release -rd
  Description: Ubuntu 14.04 LTS
  Release: 14.04

  $ apt-cache policy libqb0
  libqb0:
-   Installed: 0.16.0.real-1ubuntu3
-   Candidate: 0.16.0.real-1ubuntu3
-   Version table:
-  *** 0.16.0.real-1ubuntu3 0
-         500 http://archive.ubuntu.com/ubuntu/ trusty/main amd64 Packages
-         100 /var/lib/dpkg/status
+   Installed: 0.16.0.real-1ubuntu3
+   Candidate: 0.16.0.real-1ubuntu3
+   Version table:
+  *** 0.16.0.real-1ubuntu3 0
+         500 http://archive.ubuntu.com/ubuntu/ trusty/main amd64 Packages
+         100 /var/lib/dpkg/status

  Corosync sometimes hangs inside libqb. I've looked at a hung process
  with gdb, and I think I've found the problem. The problem is the loop
  here:

  https://github.com/ClusterLabs/libqb/blob/v0.16.0/lib/ringbuffer.c#L451

  This was fixed in 0.17.0, see:

  https://github.com/ClusterLabs/libqb/blob/v0.17.0/lib/ringbuffer.c#L451

  I think bumping to 0.17.0 should fix this (at least in backports?
  Please?)
+
+ --
+ [Impact]
+
+  * libqb does not currently handle ring buffer alloc errors properly.
+    The result is that corosync frequently ends up in an infinite loop
+    (consuming 100% cpu) as it continuously tries and fails to allocate
+    space from the ringbuffer, due to erroneous logic when an attempt to
+    reclaim space fails. This patch ensures that when the reclaim fails
+    the libqb library gracefully errors out and allows corosync to
+    proceed with execution.
+
+  * This is fixed by cherry-picking the following 2 commits:
+    - https://github.com/ClusterLabs/libqb/commit/00082df49f045053d03bba7713bfff35d2448459
+    - https://github.com/ClusterLabs/libqb/commit/47c690dbbc75957ac2354844b8fbf0a9f4791a87
+
+ [Test Case]
+
+ There is a test case in comment #2.
+ A simple test case that recreated the problem for me (I used juju to
+ replicate):
+
+ 1. Deploy a 2 node percona-cluster w/ corosync and pacemaker.
+ 2. Scale the number of units from 2 to 5 nodes.
+ 3. Observe that one of the corosync instances reaches 100% cpu usage
+    and hangs.
+
+ e.g.
+ juju bootstrap
+ # install percona-cluster
+ juju deploy -n 2 cs:trusty/percona-cluster
+ juju deploy cs:trusty/hacluster
+
+ # configure corosync to use unicast for communication
+ juju set hacluster corosync_transport=udpu
+
+ # configure the virtual ip for corosync
+ juju set percona-cluster vip=
+
+ # cause juju to configure the corosync/pacemaker configuration with percona-cluster.
+ juju add-relation percona-cluster hacluster
+
+ # wait for juju debug-log to go quiet.
+ # then expand the cluster by 3 nodes.
+ juju add-unit -n 3 percona-cluster
+
+
+ [Regression Potential]
+
+  * As a result of the changes, this may cause a blackbox log entry to
+    be dropped or it may cause a ring to be discarded and a new ring to
+    be created.
+
+    - If a log entry is dropped, some information may be missing from
+      the blackbox used later for analysis. However, upstream has
+      determined that missing a log entry is preferable to hanging the
+      corosync process.
+
+    - Rings are discarded as part of the normal corosync communication
+      process, and corosync already knows how to properly handle this
+      situation, so the risk is small in this area.

** Patch added: "lp1341496.debdiff"
   https://bugs.launchpad.net/ubuntu/+source/libqb/+bug/1341496/+attachment/4386861/+files/lp1341496.debdiff
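For readers following the [Impact] description, the failure mode can be sketched in C as below. This is a toy model and not the libqb source: `toy_rb`, `toy_reclaim`, `toy_alloc_spinning`, `toy_alloc_checked`, the `reclaim_broken` flag, the `spin_limit` guard and the choice of `EAGAIN` are all assumptions made for illustration. It only shows the general shape of "retry until there is space" versus "give up when reclaim cannot make progress".

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical toy ring buffer; NOT the libqb data structure. */
struct toy_rb {
    size_t capacity;    /* total bytes in the buffer */
    size_t used;        /* bytes held by chunks not yet reclaimed */
    size_t chunk_size;  /* fixed chunk size, to keep the model simple */
    int reclaim_broken; /* simulates a reclaim that cannot free anything */
};

static size_t toy_space_free(const struct toy_rb *rb)
{
    return rb->capacity - rb->used;
}

/* Drop the oldest chunk. Returns 0 on success, -1 if nothing was freed. */
static int toy_reclaim(struct toy_rb *rb)
{
    if (rb->reclaim_broken || rb->used < rb->chunk_size) {
        return -1;
    }
    rb->used -= rb->chunk_size;
    return 0;
}

/*
 * Problematic shape: the reclaim result is ignored, so a failing reclaim
 * makes the loop spin forever. spin_limit exists only so that this demo
 * terminates; the function returns how many iterations were burned.
 */
static unsigned long toy_alloc_spinning(struct toy_rb *rb, size_t len,
                                        unsigned long spin_limit)
{
    unsigned long spins = 0;

    while (toy_space_free(rb) < len && spins < spin_limit) {
        (void)toy_reclaim(rb); /* failure silently ignored */
        spins++;
    }
    return spins;
}

/*
 * Defensive shape: stop and report an error as soon as reclaim cannot
 * make progress, so the caller can carry on instead of hanging.
 */
static int toy_alloc_checked(struct toy_rb *rb, size_t len)
{
    assert(rb != NULL);
    if (len > rb->capacity) {
        errno = EINVAL;
        return -1;
    }
    while (toy_space_free(rb) < len) {
        if (toy_reclaim(rb) != 0) {
            errno = EAGAIN; /* assumed error code for this sketch */
            return -1;
        }
    }
    rb->used += len;
    return 0;
}
```

In this model the cost of the defensive variant is that the caller's write may be dropped, which matches the trade-off described under [Regression Potential].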
[Bug 1341496] Re: corosync hangs inside libqb
It doesn't appear to be in backports; however, there is a backported version for Trusty in this PPA: https://launchpad.net/~claudiu-popescu/+archive/ubuntu/ppa/+packages
[Bug 1341496] Re: corosync hangs inside libqb
Hi guys, I didn't find libqb0 in trusty-backports. Can anyone show me the link, or is it still in progress? Thanks.
[Bug 1341496] Re: corosync hangs inside libqb
Thanks @patrickdk, your backported package resolved this issue for me.
[Bug 1341496] Re: corosync hangs inside libqb
I just backported the vivid 0.17.0 version to trusty. It runs without issues, and seems to have corrected the problems.
[Bug 1341496] Re: corosync hangs inside libqb
Just a bump request for this bug, so hopefully it doesn't sting others who wish to run high availability services on Ubuntu!
[Bug 1341496] Re: corosync hangs inside libqb
Is there a deadline for this fix? Maybe as a backport of 0.17.0? Regards.
[Bug 1341496] Re: corosync hangs inside libqb
** Changed in: libqb (Ubuntu)
     Assignee: (unassigned) => Kick In (kick-d)
[Bug 1341496] Re: corosync hangs inside libqb
Status changed to 'Confirmed' because the bug affects multiple users.

** Changed in: libqb (Ubuntu)
       Status: New => Confirmed
[Bug 1341496] Re: corosync hangs inside libqb
How I was able to reproduce the bug:

1. Install and configure PostgreSQL with streaming replication.
2. Install and configure pacemaker and corosync (controlling the postgres cluster).
3. When both servers are operational, with node attributes something like:

   * Node psql1:
       + master-pgsql : 1000
       + pgsql-data-status : LATEST
       + pgsql-master-baseline : F490
       + pgsql-status : PRI
   * Node psql2:
       + master-pgsql : 100
       + pgsql-data-status : STREAMING|SYNC
       + pgsql-status : HS:sync
       + pgsql-xlog-loc: F6030D60

   reboot one of the servers, slave or master.
4. Run corosync-cmapctl; it will freeze and never return.

I was able to reproduce this every single time with libqb 0.16.0. With libqb 0.17.0 I was not able to reproduce this scenario.
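When scripting the last reproduction step, it can help to bound how long the command may run, so that a hang is reported rather than blocking the test. The following is a minimal sketch of that idea in C, assuming a POSIX system; `run_with_timeout` and `RUN_TIMED_OUT` are hypothetical names introduced here and are not part of corosync or libqb.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

#define RUN_TIMED_OUT (-2) /* hypothetical sentinel: child was killed */

/*
 * Run argv[0] with the given arguments and wait at most timeout_sec
 * seconds. Returns the child's exit status, -1 on error or abnormal
 * exit, or RUN_TIMED_OUT if the child had to be killed.
 */
static int run_with_timeout(char *const argv[], int timeout_sec)
{
    const struct timespec tick = { 0, 100L * 1000L * 1000L }; /* 100 ms */
    int status;
    pid_t pid;

    assert(argv != NULL && argv[0] != NULL);

    pid = fork();
    if (pid < 0) {
        return -1;
    }
    if (pid == 0) {
        execvp(argv[0], argv);
        _exit(127); /* exec failed */
    }

    /* Poll for completion instead of blocking in waitpid(). */
    for (long waited_ms = 0; waited_ms < timeout_sec * 1000L; waited_ms += 100) {
        pid_t done = waitpid(pid, &status, WNOHANG);
        if (done == pid) {
            return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
        }
        if (done < 0) {
            return -1;
        }
        nanosleep(&tick, NULL);
    }

    /* Deadline passed: treat the command as hung and clean it up. */
    kill(pid, SIGKILL);
    waitpid(pid, &status, 0);
    return RUN_TIMED_OUT;
}
```

Called with an argument vector of `{"corosync-cmapctl", NULL}` and a timeout of a few seconds, a result of `RUN_TIMED_OUT` would suggest the freeze described above, while a normal exit status would suggest the command returned. The timeout value is a judgment call and depends on the cluster.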
[Bug 1341496] Re: corosync hangs inside libqb
I had the same issue and eventually I ended up with: https://launchpad.net/~claudiu-popescu/+archive/ubuntu/ppa/+packages

I installed it on a testing cluster and it is working ok for now. I strongly advise against using this directly in production, since it is my first library built for Ubuntu. Maybe someone will make an official release soon, since v0.16.0 is not usable in production environments.