[ https://issues.apache.org/jira/browse/QPID-7991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16256223#comment-16256223 ]
Chris Richardson commented on QPID-7991:
----------------------------------------

Just a note for posterity: the segfault under discussion did not seem to be triggered when creating the routes with the current version of qpid-route (which has recently changed to use the Broker::create API rather than the Link::Bridge approach, which code comments suggest should be deprecated; see changes under https://issues.apache.org/jira/browse/QPID-7876). However, it DID (prior to Alan's submitted fix) rear its ugly head when the route was created with the supposedly identical call from the C++ broker management library I authored at https://github.com/fourceu/fourc-qpid-manager, and I have not yet been able to determine the exact cause. Since this fix appears to remedy the issue in either case, I will abandon the investigation unless additional issues arise.

> Segfault in broker while processing active bridges
> --------------------------------------------------
>
>                 Key: QPID-7991
>                 URL: https://issues.apache.org/jira/browse/QPID-7991
>             Project: Qpid
>          Issue Type: Bug
>          Components: C++ Broker
>    Affects Versions: qpid-cpp-1.36.0, qpid-cpp-1.37.0
>         Environment: Ubuntu 17.10 x86_64, gcc 7.
>            Reporter: Chris Richardson
>            Assignee: Alan Conway
>            Priority: Critical
>             Fix For: qpid-cpp-1.37.0
>
>         Attachments: segfault stack trace, segfault-fix.patch,
> segfault-repoduce.tar.gz, std_remove_if_with_smart_ptr.cpp
>
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> Segfault occurs on a background thread within about 5-10 seconds of broker
> startup at src/qpid/broker/Link.cpp:465. [^segfault stack trace] attached,
> frames #3 and #5 are of particular relevance.
> The unchecked Bridge::shared_ptr derived from the iterator is null and the
> invocation of bridge->closed() triggers the segfault. Adding a simple null
> check (as per attached [^segfault-fix.patch]) fixes the segfault but not the
> underlying reason for the null pointer.
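To illustrate the null-check workaround described above, here is a minimal, self-contained sketch. It is not the broker's code: the Bridge struct, its isClosed and stale members, and the closeAll and makeBridgesWithNullTail helpers are all hypothetical. It also shows one way a null shared_ptr can end up in a container, namely calling std::remove_if without a following erase. The attachment name std_remove_if_with_smart_ptr.cpp hints at that pattern, but whether it is the actual root cause in Link.cpp is an assumption here, not something this report establishes.

```cpp
#include <algorithm>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for the broker's Bridge class; only closed() is modelled.
struct Bridge {
    bool isClosed = false;
    bool stale = false;  // illustrative flag used as the removal criterion
    void closed() { isClosed = true; }
};

using BridgePtr = std::shared_ptr<Bridge>;

// Calls closed() on every non-null bridge and returns the number of null
// entries skipped. Without the check, bridge->closed() on a null pointer
// is undefined behaviour (typically a segfault).
std::size_t closeAll(const std::vector<BridgePtr>& bridges) {
    std::size_t skipped = 0;
    for (const BridgePtr& bridge : bridges) {
        if (!bridge) {
            ++skipped;
            continue;
        }
        bridge->closed();
    }
    return skipped;
}

// Builds [live, stale, live] and runs std::remove_if WITHOUT erase.
// remove_if move-assigns the surviving elements towards the front, so the
// last slot is left holding a moved-from (empty) shared_ptr.
std::vector<BridgePtr> makeBridgesWithNullTail() {
    std::vector<BridgePtr> bridges{std::make_shared<Bridge>(),
                                   std::make_shared<Bridge>(),
                                   std::make_shared<Bridge>()};
    bridges[1]->stale = true;
    auto newEnd = std::remove_if(bridges.begin(), bridges.end(),
                                 [](const BridgePtr& b) { return b->stale; });
    (void)newEnd;  // bridges.erase(newEnd, bridges.end()) deliberately omitted
    return bridges;
}
```

Calling closeAll(makeBridgesWithNullTail()) closes the two surviving bridges and skips the one empty slot. The guard only hides the symptom, as the description says; in this sketch the underlying fix would be to erase the tail returned by std::remove_if.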
> The segfault appears to be related to how a second broker (henceforth
> "broker1") is configured; this is the one to which the links are established.
> Without broker1, the "segfaulting broker" (aka "broker2") does not do its
> thing. It may be that broker1 returns invalid data to broker2, but this is not
> in the scope of this bug report, which focuses on the segfault.
> h2. Reproduce
> Unfortunately the steps to arrive at this situation are not clear, so the
> reproduction is a bit hacky: the data directory, config file and some certs
> for the two brokers are attached as a tarball in the hope that they can be
> arranged in such a way as to provide a reproduction in lieu of a purely
> step-based procedure.
> Steps to reproduce:
> * Temporarily add a DNS alias to the local machine of "octopussy" (necessary
> due to cert config and durable link config in broker2's data store)
> * Extract the attached [^segfault-repoduce.tar.gz] to an empty directory
> (assumed to be cwd)
> * Start broker1 with "qpidd --config broker1/qpidd.conf"
> * In another shell with the same cwd, start broker2 with "qpidd --config
> broker2/qpidd.conf"
> * Observe segfault in broker2 after 5-10 seconds.