[ 
https://issues.apache.org/jira/browse/QPID-7991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Richardson updated QPID-7991:
-----------------------------------
    Description: 
A segfault occurs on a background thread within about 5-10 seconds of broker 
startup at src/qpid/broker/Link.cpp:465. The [^segfault stack trace] is 
attached; frames #3 and #5 are of particular relevance.
The unchecked Bridge::shared_ptr derived from the iterator is null and the 
invocation of bridge->closed() triggers the segfault. Adding a simple null 
check (as per attached patch) fixes the segfault but not the underlying reason 
for the null pointer. 
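
For illustration, the kind of null check described above might look like the 
following sketch. It is not the code from Link.cpp or the attached patch: the 
Bridge struct, the BridgePtr alias and closeActiveBridges() are hypothetical 
stand-ins, and only the guard before bridge->closed() reflects the description.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for the broker's bridge type; NOT the real
// qpid::broker::Bridge. It only records whether closed() was called.
struct Bridge {
    bool wasClosed;
    Bridge() : wasClosed(false) {}
    void closed() { wasClosed = true; }
};

typedef std::shared_ptr<Bridge> BridgePtr;

// Calls closed() on every non-null entry and returns the number of null
// entries that were skipped. The name and signature are assumptions made
// for this sketch.
std::size_t closeActiveBridges(const std::vector<BridgePtr>& active) {
    std::size_t skipped = 0;
    for (std::vector<BridgePtr>::const_iterator i = active.begin();
         i != active.end(); ++i) {
        const BridgePtr& bridge = *i;
        if (!bridge) {
            // Guard: calling a member through a null shared_ptr is
            // undefined behaviour and typically segfaults.
            ++skipped;
            continue;
        }
        bridge->closed();
    }
    return skipped;
}
```

Returning the count of skipped entries is a choice made for this sketch only; 
it makes the unexpected nulls visible (for example in a log line) instead of 
silently hiding them, since the guard does not explain why they are null.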

The segfault appears to be related to how a second broker (henceforth 
"broker1") is configured; this is the one to which the links are established. 
Without broker1 running, the segfaulting broker (henceforth "broker2") does not 
crash. It may be that broker1 returns invalid data to broker2, but that is 
outside the scope of this bug report, which focuses on the segfault.

h2. Reproduce
Unfortunately the steps to arrive at this situation are not clear, so the 
reproduction is a bit hacky: the data directory, config file and some certs for 
the two brokers are attached as a tarball, in the hope that they can be 
arranged in such a way as to provide a reproduction in lieu of a purely 
step-based procedure.
Steps to reproduce:
* Temporarily add a DNS alias to the local machine of "octopussy" (necessary 
due to cert config and durable link config in broker2's data store)
* Unpack the attached [^segfault-repoduce.tar.gz] to an empty directory 
(assumed to be cwd)
* Start broker1 with "qpidd --config broker1/qpidd.conf"
* In another shell with the same cwd, start broker2 with "qpidd --config 
broker2/qpidd.conf"
* Observe segfault in broker2 after 5-10 seconds.


  was:
A segfault occurs on a background thread within about 5-10 seconds of broker 
startup at src/qpid/broker/Link.cpp:465. The [^segfault stack trace] is 
attached; frames #3 and #5 are of particular relevance.
The unchecked Bridge::shared_ptr derived from the iterator is null and the 
invocation of bridge->closed() triggers the segfault. Adding a simple null 
check (as per attached patch) fixes the segfault but not the underlying reason 
for the null pointer. 

The segfault appears to be related to how a second broker (henceforth 
"broker1") is configured; this is the one to which the links are established. 
Without broker1 running, the segfaulting broker (henceforth "broker2") does not 
crash. It may be that broker1 returns invalid data to broker2, but that is 
outside the scope of this bug report, which focuses on the segfault.

h2. Reproduce
Unfortunately the steps to arrive at this situation are not clear, so the 
reproduction is a bit hacky: the data directory, config file and some certs for 
the two brokers are attached as a tarball, in the hope that they can be 
arranged in such a way as to provide a reproduction in lieu of a purely 
step-based procedure.
Steps to reproduce:
* Temporarily add a DNS alias to the local machine of "octopussy" (necessary 
due to cert config and durable link config in broker2's data store)
* Unpack the attached tarball to an empty directory (assumed to be cwd)
* Start broker1 with "qpidd --config broker1/qpidd.conf"
* In another shell with the same cwd, start broker2 with "qpidd --config 
broker2/qpidd.conf"
* Observe segfault in broker2 after 5-10 seconds.



> Segfault in broker while processing active bridges
> --------------------------------------------------
>
>                 Key: QPID-7991
>                 URL: https://issues.apache.org/jira/browse/QPID-7991
>             Project: Qpid
>          Issue Type: Bug
>          Components: C++ Broker
>    Affects Versions: qpid-cpp-1.37.0
>         Environment: Ubuntu 17.10 x86_64, gcc 7.
>            Reporter: Chris Richardson
>            Priority: Critical
>         Attachments: segfault stack trace, segfault-repoduce.tar.gz
>
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> A segfault occurs on a background thread within about 5-10 seconds of broker 
> startup at src/qpid/broker/Link.cpp:465. The [^segfault stack trace] is 
> attached; frames #3 and #5 are of particular relevance.
> The unchecked Bridge::shared_ptr derived from the iterator is null and the 
> invocation of bridge->closed() triggers the segfault. Adding a simple null 
> check (as per attached patch) fixes the segfault but not the underlying 
> reason for the null pointer. 
> The segfault appears to be related to how a second broker (henceforth 
> "broker1") is configured; this is the one to which the links are established. 
> Without broker1 running, the segfaulting broker (henceforth "broker2") does 
> not crash. It may be that broker1 returns invalid data to broker2, but that is 
> outside the scope of this bug report, which focuses on the segfault.
> h2. Reproduce
> Unfortunately the steps to arrive at this situation are not clear, so the 
> reproduction is a bit hacky: the data directory, config file and some certs 
> for the two brokers are attached as a tarball, in the hope that they can be 
> arranged in such a way as to provide a reproduction in lieu of a purely 
> step-based procedure.
> Steps to reproduce:
> * Temporarily add a DNS alias to the local machine of "octopussy" (necessary 
> due to cert config and durable link config in broker2's data store)
> * Unpack the attached [^segfault-repoduce.tar.gz] to an empty directory 
> (assumed to be cwd)
> * Start broker1 with "qpidd --config broker1/qpidd.conf"
> * In another shell with the same cwd, start broker2 with "qpidd --config 
> broker2/qpidd.conf"
> * Observe segfault in broker2 after 5-10 seconds.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
