[jira] [Commented] (DISPATCH-1086) Dispatch Router sporadically goes into a state where TLS connections to the auth service fail

2018-09-20 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/DISPATCH-1086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16621707#comment-16621707
 ] 

ASF subversion and git services commented on DISPATCH-1086:
---

Commit 194747dcd4ddf6973a6f5f85e893f769706fa47d in qpid-dispatch's branch 
refs/heads/master from Gordon Sim
[ https://git-wip-us.apache.org/repos/asf?p=qpid-dispatch.git;h=194747d ]

DISPATCH-1086: use a pn_ssl_domain_t instance per auth service connection
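
The commit title describes an ownership change: each connection to the auth service gets its own `pn_ssl_domain_t` instance. The sketch below models only that ownership pattern. `Domain` and `AuthServiceConnection` are hypothetical stand-ins, not Proton or Dispatch types, and nothing here is taken from the actual patch.

```python
class Domain:
    """Hypothetical stand-in for an SSL domain (TLS configuration and context)."""

    def __init__(self) -> None:
        self.freed = False

    def free(self) -> None:
        # Mark this domain as released; only its owner should call this.
        self.freed = True


class AuthServiceConnection:
    """Hypothetical connection that owns exactly one Domain for its lifetime."""

    def __init__(self) -> None:
        # Created for this connection only, never handed to another connection.
        self.domain = Domain()

    def close(self) -> None:
        # Releasing the domain cannot affect any other connection,
        # because no other connection holds a reference to it.
        self.domain.free()


first = AuthServiceConnection()
second = AuthServiceConnection()
first.close()
print(first.domain.freed, second.domain.freed)
```

With one instance per connection, closing `first` leaves the state used by `second` untouched, whichever thread each connection is processed on.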


> Dispatch Router sporadically goes into a state where TLS connections to the 
> auth service fail
> -
>
> Key: DISPATCH-1086
> URL: https://issues.apache.org/jira/browse/DISPATCH-1086
> Project: Qpid Dispatch
> Issue Type: Bug
> Affects Versions: 1.1.0
> Reporter: Keith Wall
> Priority: Major
>
> Whilst running performance tests against Enmasse, we periodically see a
> problem where Dispatch Router (1.1.0) goes into a state where it fails to
> form TLS connections to the authservice. When this occurs, the router needs
> to be restarted to restore service. There does not seem to be a pattern to
> when the issue occurs, but in all cases where it has been seen, the test case
> included tens or hundreds of concurrently formed connections.
> The following message is written to the log:
>
> {noformat}
> 2018-07-06 10:38:45.543519 + AUTHSERVICE (warning) Cannot initialise SSL
> {noformat}
> Unfortunately, turning up the router logging (using the following command)
> reveals no more useful information. A Proton improvement JIRA was raised to
> include the diagnostics from OpenSSL.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@qpid.apache.org
For additional commands, e-mail: dev-h...@qpid.apache.org



[jira] [Commented] (DISPATCH-1086) Dispatch Router sporadically goes into a state where TLS connections to the auth service fail

2018-09-19 Thread Carsten Lohmann (JIRA)


[ 
https://issues.apache.org/jira/browse/DISPATCH-1086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16620303#comment-16620303
 ] 

Carsten Lohmann commented on DISPATCH-1086:
---

Here are backtraces from 3 separate core dumps:

1:
{noformat}
#0  0x7f4a0578f4fd in pn_list_get (list=0x1424900, index=24140)
    at /build/qpid-proton-src/c/src/core/object/list.c:42
#1  0x7f4a0579866f in pni_session_bound (ssn=<optimized out>)
    at /build/qpid-proton-src/c/src/core/engine.c:1021
#2  0x7f4a0579892b in pn_connection_bound (
    connection=connection@entry=0x14244a0)
    at /build/qpid-proton-src/c/src/core/engine.c:157
#3  0x7f4a0579e268 in pn_transport_bind (transport=0xf1f760,
    connection=0x14244a0) at /build/qpid-proton-src/c/src/core/transport.c:706
#4  0x7f4a05797d35 in batch_next (batch=0x1461330)
    at /build/qpid-proton-src/c/src/core/connection_driver.c:41
#5  0x7f4a0557a6f1 in pconnection_batch_next ()
    at /build/qpid-proton-src/c/src/proactor/epoll.c:948
#6  0x7f4a05a0c2fb in thread_run (arg=arg@entry=0xf0bf60)
    at /build/qpid-dispatch-src/src/server.c:976
#7  0x7f4a05a0c590 in qd_server_run (qd=<optimized out>)
    at /build/qpid-dispatch-src/src/server.c:1247
#8  0x0040182c in main_process (
    config_path=0x7fff809c4965 "/tmp/qdrouterd.conf",
    python_pkgdir=<optimized out>, fd=2)
    at /build/qpid-dispatch-src/router/src/main.c:112
#9  0x00401589 in main (argc=3, argv=0x7fff809c3ba8)
    at /build/qpid-dispatch-src/router/src/main.c:360
{noformat}
2:
{noformat}
#0  ssl_cert_dup (cert=0x0) at ssl/ssl_cert.c:89
#1  0x7fda9b487b98 in SSL_new (ctx=0x1fb8a40) at ssl/ssl_lib.c:716
#2  0x7fda9c95b0c8 in init_ssl_socket (transport=0x7fda8014ec60,
    ssl=ssl@entry=0x7fda8004dc70)
    at /build/qpid-proton-src/c/src/ssl/openssl.c:1235
#3  0x7fda9c95bc18 in init_ssl_socket (ssl=0x7fda8004dc70,
    transport=0x7fda8014ec60)
    at /build/qpid-proton-src/c/src/ssl/openssl.c:1232
#4  process_input_ssl (transport=0x7fda8014ec60, layer=0,
    input_data=0x7fda80163f90 "\220\a", available=0)
    at /build/qpid-proton-src/c/src/ssl/openssl.c:963
#5  0x7fda9c95317a in transport_consume (
    transport=transport@entry=0x7fda8014ec60)
    at /build/qpid-proton-src/c/src/core/transport.c:1821
#6  0x7fda9c954eda in pn_transport_close_tail (transport=0x7fda8014ec60)
    at /build/qpid-proton-src/c/src/core/transport.c:2972
#7  0x7fda9c94cdea in pn_connection_driver_read_close (
    d=d@entry=0x7fda800e8968)
    at /build/qpid-proton-src/c/src/core/connection_driver.c:114
#8  0x7fda9c72f5f8 in pconnection_process (pc=pc@entry=0x7fda800e83c0,
    events=events@entry=0, timeout=timeout@entry=false,
    topup=topup@entry=true, is_io_2=is_io_2@entry=false)
    at /build/qpid-proton-src/c/src/proactor/epoll.c:1230
#9  0x7fda9c72f754 in pconnection_batch_next ()
    at /build/qpid-proton-src/c/src/proactor/epoll.c:953
#10 0x7fda9cbc12fb in thread_run (arg=0x1bd0f60)
    at /build/qpid-dispatch-src/src/server.c:976
#11 0x7fda9c512594 in start_thread (arg=<optimized out>)
    at pthread_create.c:463
#12 0x7fda9b7bbe6f in clone ()
    at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
{noformat}
3:
{noformat}
#0  0x7f761a027858 in EVP_PKEY_up_ref (pkey=0x1a0bf8b480166)
    at crypto/evp/p_lib.c:160
#1  0x7f761a3898a4 in ssl_cert_dup (cert=0x7f761b85a6a0 )
    at ssl/ssl_cert.c:99
#2  0x7f761a394b98 in SSL_new (ctx=0x7f761ba77940 )
    at ssl/ssl_lib.c:716
#3  0x7f761b8680c8 in init_ssl_socket (transport=0xcbede0, ssl=0xc92d70)
    at /build/qpid-proton-src/c/src/ssl/openssl.c:1235
#4  0x7f761b869464 in init_ssl_socket (ssl=<optimized out>,
    transport=<optimized out>)
    at /build/qpid-proton-src/c/src/ssl/openssl.c:1232
#5  pn_ssl_init (ssl0=<optimized out>, domain=<optimized out>,
    session_id=session_id@entry=0x0)
    at /build/qpid-proton-src/c/src/ssl/openssl.c:822
#6  0x7f761bab09ef in qdr_handle_authentication_service_connection_event (
    e=e@entry=0xc78900) at /build/qpid-dispatch-src/src/remote_sasl.c:623
#7  0x7f761bacda2c in handle (qd_server=qd_server@entry=0x85bf60,
    e=e@entry=0xc78900, pn_conn=pn_conn@entry=0xc35ba0, ctx=ctx@entry=0x0)
    at /build/qpid-dispatch-src/src/server.c:864
#8  0x7f761bace2e4 in thread_run (arg=arg@entry=0x85bf60)
    at /build/qpid-dispatch-src/src/server.c:973
#9  0x7f761bace590 in qd_server_run (qd=<optimized out>)
    at /build/qpid-dispatch-src/src/server.c:1247
#10 0x0040182c in main_process (
    config_path=0x7ffecc430965 "/tmp/qdrouterd.conf",
    python_pkgdir=<optimized out>, fd=2)
    at /build/qpid-dispatch-src/router/src/main.c:112
#11 0x00401589 in main (argc=3, argv=0x7ffecc42e848)
    at /build/qpid-dispatch-src/router/src/main.c:360
{noformat}
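
To compare crash sites across the dumps, the function names can be pulled out of the gdb output. This is a minimal sketch: `frame_functions` and `shared_frames` are hypothetical helpers, and the regular expression assumes the `#N  [0xADDR in] function (` frame layout seen above. The two sample strings are the top frames of dumps 2 and 3, copied from the backtraces.

```python
import re
from typing import List

# A frame line starts with "#N", optionally followed by "0xADDR in ",
# then the function name and an opening parenthesis.
FRAME_RE = re.compile(r"^#(\d+)\s+(?:0x[0-9a-fA-F]+ in )?([A-Za-z_]\w*) \(")


def frame_functions(backtrace: str) -> List[str]:
    """Return the function names of a gdb backtrace, innermost frame first."""
    names = []
    for line in backtrace.splitlines():
        match = FRAME_RE.match(line)
        if match:  # continuation lines (arguments, "at file:line") are skipped
            names.append(match.group(2))
    return names


def shared_frames(a: List[str], b: List[str]) -> List[str]:
    """Return the functions of `a` that also appear in `b`, in `a`'s order."""
    in_b = set(b)
    return [name for name in a if name in in_b]


DUMP_2_TOP = """
#0  ssl_cert_dup (cert=0x0) at ssl/ssl_cert.c:89
#1  0x7fda9b487b98 in SSL_new (ctx=0x1fb8a40) at ssl/ssl_lib.c:716
#2  0x7fda9c95b0c8 in init_ssl_socket (transport=0x7fda8014ec60,
    ssl=ssl@entry=0x7fda8004dc70)
    at /build/qpid-proton-src/c/src/ssl/openssl.c:1235
"""

DUMP_3_TOP = """
#0  0x7f761a027858 in EVP_PKEY_up_ref (pkey=0x1a0bf8b480166)
    at crypto/evp/p_lib.c:160
#1  0x7f761a3898a4 in ssl_cert_dup (cert=0x7f761b85a6a0 )
    at ssl/ssl_cert.c:99
#2  0x7f761a394b98 in SSL_new (ctx=0x7f761ba77940 )
    at ssl/ssl_lib.c:716
#3  0x7f761b8680c8 in init_ssl_socket (transport=0xcbede0, ssl=0xc92d70)
"""

print(shared_frames(frame_functions(DUMP_2_TOP), frame_functions(DUMP_3_TOP)))
```

Applied to these excerpts, dumps 2 and 3 both pass through `init_ssl_socket`, `SSL_new` and `ssl_cert_dup`, whereas dump 1 crashes on a different path (`pn_transport_bind`).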
 


[jira] [Commented] (DISPATCH-1086) Dispatch Router sporadically goes into a state where TLS connections to the auth service fail

2018-09-19 Thread Carsten Lohmann (JIRA)


[ 
https://issues.apache.org/jira/browse/DISPATCH-1086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16620236#comment-16620236
 ] 

Carsten Lohmann commented on DISPATCH-1086:
---

(yes, sent via mail)




[jira] [Commented] (DISPATCH-1086) Dispatch Router sporadically goes into a state where TLS connections to the auth service fail

2018-09-18 Thread Gordon Sim (JIRA)


[ 
https://issues.apache.org/jira/browse/DISPATCH-1086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16619367#comment-16619367
 ] 

Gordon Sim commented on DISPATCH-1086:
--

[~calohmn] you don't have a core dump by any chance?




[jira] [Commented] (DISPATCH-1086) Dispatch Router sporadically goes into a state where TLS connections to the auth service fail

2018-09-17 Thread Carsten Lohmann (JIRA)


[ 
https://issues.apache.org/jira/browse/DISPATCH-1086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16617739#comment-16617739
 ] 

Carsten Lohmann commented on DISPATCH-1086:
---

We also got this issue.

Using the additional log information provided by PROTON-1886 (with PN_TRACE_DRV 
set to 1), there was this log output:
{noformat}
[0x7feae80d4f30]:Client SSL socket created.
[0x7feaf0142120]:Client SSL socket created.
[0x7feae80d4f30]:Read 26 bytes from SSL socket for app
[0x7feae80d4f30]:SSL socket freed.
[0x7feaf0142120]:Read 26 bytes from SSL socket for app
[0x7feaf0142120]:SSL socket freed.
[0x7feaec0c4e60]:SSL socket setup failure.
[0x7feaec0c4e60]:error:140BA0E4:SSL routines:SSL_new:ssl ctx has no default ssl version
2018-09-17 15:48:49.713040 + AUTHSERVICE (warning) Cannot initialise SSL
[0x7feaec0c4e60]:SSL socket setup failure.
[0x7feaec0c4e60]:error:140BA0E4:SSL routines:SSL_new:ssl ctx has no default ssl version
[0x7feae805ea00]:SSL socket setup failure.
[0x7feae805ea00]:error:140BA0E4:SSL routines:SSL_new:ssl ctx has no default ssl version
2018-09-17 15:48:49.713156 + AUTHSERVICE (warning) Cannot initialise SSL
[0x7feae805ea00]:SSL socket setup failure.
[0x7feae805ea00]:error:140BA0E4:SSL routines:SSL_new:ssl ctx has no default ssl version
[0x7feae8116100]:SSL socket setup failure.
[0x7feae8116100]:error:140BA0E4:SSL routines:SSL_new:ssl ctx has no default ssl version
2018-09-17 15:48:49.758598 + AUTHSERVICE (warning) Cannot initialise SSL
[0x7feae8116100]:SSL socket setup failure.
[0x7feae8116100]:error:140BA0E4:SSL routines:SSL_new:ssl ctx has no default ssl version
[0x7feae806adf0]:SSL socket setup failure.
[0x7feae806adf0]:error:140BA0E4:SSL routines:SSL_new:ssl ctx has no default ssl version
2018-09-17 15:48:49.762368 + AUTHSERVICE (warning) Cannot initialise SSL
[0x7feae806adf0]:SSL socket setup failure.
[0x7feae806adf0]:error:140BA0E4:SSL routines:SSL_new:ssl ctx has no default ssl version
{noformat}
After the last line above, the router exited with a segmentation fault.
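
The hex code in the trace lines can be unpacked to check which OpenSSL call reported the failure. This is a minimal sketch, assuming the OpenSSL 1.0/1.1 error packing (library in the top 8 bits, function in the next 12, reason in the low 12); OpenSSL 3.x packs error codes differently. `decode_openssl_error` and `parse_trace_line` are illustrative names, not part of Proton or OpenSSL.

```python
import re
from typing import NamedTuple, Optional, Tuple


class OpenSSLError(NamedTuple):
    lib: int
    func: int
    reason: int


def decode_openssl_error(code: int) -> OpenSSLError:
    """Split a packed OpenSSL 1.x error code into (lib, func, reason)."""
    return OpenSSLError(
        lib=(code >> 24) & 0xFF,
        func=(code >> 12) & 0xFFF,
        reason=code & 0xFFF,
    )


# Matches "[0xTRANSPORT]:error:HEXCODE:library:function:reason text"
TRACE_RE = re.compile(
    r"^\[(?P<transport>0x[0-9a-fA-F]+)\]:error:(?P<code>[0-9A-Fa-f]+):"
    r"(?P<lib>[^:]*):(?P<func>[^:]*):(?P<reason>.*)$"
)


def parse_trace_line(line: str) -> Optional[Tuple[str, OpenSSLError, str]]:
    """Return (transport pointer, decoded code, reason text), or None if the
    line is not an OpenSSL error line."""
    match = TRACE_RE.match(line.strip())
    if match is None:
        return None
    code = int(match.group("code"), 16)
    return match.group("transport"), decode_openssl_error(code), match.group("reason")


line = ("[0x7feaec0c4e60]:error:140BA0E4:SSL routines:SSL_new:"
        "ssl ctx has no default ssl version")
print(parse_trace_line(line))
```

For `140BA0E4` this gives library 20, function 186 and reason 228, which is consistent with the text on the same line: the error is raised by `SSL_new` in the SSL library.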




[jira] [Commented] (DISPATCH-1086) Dispatch Router sporadically goes into a state where TLS connections to the auth service fail

2018-08-03 Thread Keith Wall (JIRA)


[ 
https://issues.apache.org/jira/browse/DISPATCH-1086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16568067#comment-16568067
 ] 

Keith Wall commented on DISPATCH-1086:
--

I've spent considerable time trying to reproduce this error, so far without 
success, using both 1.2.0 and the 1.1.0 derivative 
(enmasseproject/qdrouterd-base:1.1.0_proton-0.23.0-http-DISPATCH-1034-1039) 
that was in use at the time. Unfortunately, the original test environment that 
brought out this issue (many times) is not currently available.

I've raised a PR against Proton (PROTON-1886) to make the contents of the SSL 
error queue available via the Proton transport tracer.  Hopefully this will 
shed more light.
