[ 
https://issues.apache.org/jira/browse/QPID-7991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16256973#comment-16256973
 ] 

Alan Conway commented on QPID-7991:
-----------------------------------

The bug would not be triggered if all the detached bridges were already at the
end of the vector; then std::remove_if wouldn't bother moving anything, it would
simply return an iterator to the dead zone and everything would work fine.
Possibly your tool does things in a slightly different order or with different
timing, so that it causes mixed batches of detached and active bridges to be
processed where the Python tool did not. E.g. (speculating) the Python tool might
be more synchronous than necessary, while your tool issues management commands
in batches or something like that.
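
To illustrate the point about ordering, here is a minimal standalone sketch. It is not the broker code: Bridge, BridgePtr, makeBridges and nullsInTail are hypothetical names, and std::shared_ptr (C++11) is assumed in place of whatever smart pointer Link.cpp uses. It counts the null pointers left in the dead zone after std::remove_if. The standard only promises that those tail elements are valid but unspecified; with the usual move-based implementations a mixed input leaves moved-from (null) pointers there, while an input with all detached entries already at the end leaves them untouched.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for the broker's bridge type (not the real qpid class).
struct Bridge {
    explicit Bridge(bool d) : detached(d) {}
    bool detached;
};
typedef std::shared_ptr<Bridge> BridgePtr;

// Builds a vector from a pattern such as "DADA" (D = detached, A = active).
std::vector<BridgePtr> makeBridges(const char* pattern) {
    assert(pattern != nullptr);
    std::vector<BridgePtr> v;
    for (const char* p = pattern; *p; ++p)
        v.push_back(std::make_shared<Bridge>(*p == 'D'));
    return v;
}

// Runs std::remove_if over the vector and counts the null pointers left in
// the tail [returned iterator, end), i.e. the "dead zone".
std::size_t nullsInTail(std::vector<BridgePtr>& v) {
    std::vector<BridgePtr>::iterator tail = std::remove_if(
        v.begin(), v.end(),
        [](const BridgePtr& b) { return b->detached; });
    return static_cast<std::size_t>(std::count_if(
        tail, v.end(),
        [](const BridgePtr& b) { return !b; }));
}
```

With a mixed pattern such as "DADA", nullsInTail is expected to report at least one null on common standard library implementations, so calling a member function on every element of the tail would dereference a null pointer. With "AADD" nothing is moved and the tail still holds the detached bridges. If the tail has to be visited, an algorithm that swaps rather than moves (std::partition or std::stable_partition) keeps every element intact; that is offered only as one possible approach, not as the fix applied for this issue.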

> Segfault in broker while processing active bridges
> --------------------------------------------------
>
>                 Key: QPID-7991
>                 URL: https://issues.apache.org/jira/browse/QPID-7991
>             Project: Qpid
>          Issue Type: Bug
>          Components: C++ Broker
>    Affects Versions: qpid-cpp-1.36.0, qpid-cpp-1.37.0
>         Environment: Ubuntu 17.10 x86_64, gcc 7.
>            Reporter: Chris Richardson
>            Assignee: Alan Conway
>            Priority: Critical
>             Fix For: qpid-cpp-1.37.0
>
>         Attachments: segfault stack trace, segfault-fix.patch, 
> segfault-repoduce.tar.gz, std_remove_if_with_smart_ptr.cpp
>
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> Segfault occurs on a background thread within about 5-10 seconds of broker 
> startup at src/qpid/broker/Link.cpp:465. [^segfault stack trace] attached, 
> frames #3 and #5 are of particular relevance.
> The unchecked Bridge::shared_ptr derived from the iterator is null and the 
> invocation of bridge->closed() triggers the segfault. Adding a simple null 
> check (as per attached [^segfault-fix.patch]) fixes the segfault but not the 
> underlying reason for the null pointer. 
> The segfault appears to be related to how a second broker (henceforth 
> "broker1") is configured; this is the one to which the links are established. 
> Without broker1, the "segfaulting broker" (aka "broker2") does not do its 
> thing. It may be that broker1 returns invalid data to broker2 but this is not 
> in the scope of this bug report, which focuses on the segfault. 
> h2. Reproduce
> Unfortunately the steps to arrive at this situation are not clear, so the 
> reproduction is a bit hacky - the data directory, config file and some certs 
> for the two brokers are attached as a tarball in the hope that they can be 
> arranged in such a way as to provide a reproduction in lieu of a purely 
> step-based procedure.
> Steps to reproduce:
> * Temporarily add a DNS alias to the local machine of "octopussy" (necessary 
> due to cert config and durable link config in broker2's data store)
> * Extract the attached [^segfault-repoduce.tar.gz] to an empty directory 
> (assumed to be cwd)
> * Start broker1 with "qpidd --config broker1/qpidd.conf"
> * In another shell with the same cwd, start broker2 with "qpidd --config 
> broker2/qpidd.conf"
> * Observe segfault in broker2 after 5-10 seconds.



