[jira] [Commented] (DISPATCH-1086) Dispatch Router sporadically goes into a state where TLS connections to the auth service fail
[ https://issues.apache.org/jira/browse/DISPATCH-1086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16621707#comment-16621707 ]

ASF subversion and git services commented on DISPATCH-1086:
---

Commit 194747dcd4ddf6973a6f5f85e893f769706fa47d in qpid-dispatch's branch refs/heads/master from Gordon Sim
[ https://git-wip-us.apache.org/repos/asf?p=qpid-dispatch.git;h=194747d ]

DISPATCH-1086: use a pn_ssl_domain_t instance per auth service connection

> Dispatch Router sporadically goes into a state where TLS connections to the auth service fail
> -
>
> Key: DISPATCH-1086
> URL: https://issues.apache.org/jira/browse/DISPATCH-1086
> Project: Qpid Dispatch
> Issue Type: Bug
> Affects Versions: 1.1.0
> Reporter: Keith Wall
> Priority: Major
>
> Whilst running performance tests against Enmasse, we periodically see a
> problem where Dispatch Router (1.1.0) goes into a state where it fails to form
> TLS connections to the authservice. When this occurs, the router needs to be
> restarted to restore service. There does not seem to be a pattern to when the
> issue occurs, but in all cases where it has been seen, the test case included
> tens or hundreds of concurrently formed connections.
> The following message is written to the log:
>
> {noformat}
> 2018-07-06 10:38:45.543519 + AUTHSERVICE (warning) Cannot initialise SSL
> {noformat}
> Unfortunately, turning up the router logging (using the following command)
> reveals no more useful information. This Proton improvement JIRA was raised to
> include the diagnostics from OpenSSL.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
-
To unsubscribe, e-mail: dev-unsubscr...@qpid.apache.org
For additional commands, e-mail: dev-h...@qpid.apache.org
[jira] [Commented] (DISPATCH-1086) Dispatch Router sporadically goes into a state where TLS connections to the auth service fail
[ https://issues.apache.org/jira/browse/DISPATCH-1086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16620303#comment-16620303 ]

Carsten Lohmann commented on DISPATCH-1086:
---

Here are backtraces from 3 separate core dumps:

1:
{noformat}
#0 0x7f4a0578f4fd in pn_list_get (list=0x1424900, index=24140) at /build/qpid-proton-src/c/src/core/object/list.c:42
#1 0x7f4a0579866f in pni_session_bound (ssn=) at /build/qpid-proton-src/c/src/core/engine.c:1021
#2 0x7f4a0579892b in pn_connection_bound (connection=connection@entry=0x14244a0) at /build/qpid-proton-src/c/src/core/engine.c:157
#3 0x7f4a0579e268 in pn_transport_bind (transport=0xf1f760, connection=0x14244a0) at /build/qpid-proton-src/c/src/core/transport.c:706
#4 0x7f4a05797d35 in batch_next (batch=0x1461330) at /build/qpid-proton-src/c/src/core/connection_driver.c:41
#5 0x7f4a0557a6f1 in pconnection_batch_next () at /build/qpid-proton-src/c/src/proactor/epoll.c:948
#6 0x7f4a05a0c2fb in thread_run (arg=arg@entry=0xf0bf60) at /build/qpid-dispatch-src/src/server.c:976
#7 0x7f4a05a0c590 in qd_server_run (qd=) at /build/qpid-dispatch-src/src/server.c:1247
#8 0x0040182c in main_process (config_path=0x7fff809c4965 "/tmp/qdrouterd.conf", python_pkgdir=, fd=2) at /build/qpid-dispatch-src/router/src/main.c:112
#9 0x00401589 in main (argc=3, argv=0x7fff809c3ba8) at /build/qpid-dispatch-src/router/src/main.c:360
{noformat}

2:
{noformat}
#0 ssl_cert_dup (cert=0x0) at ssl/ssl_cert.c:89
#1 0x7fda9b487b98 in SSL_new (ctx=0x1fb8a40) at ssl/ssl_lib.c:716
#2 0x7fda9c95b0c8 in init_ssl_socket (transport=0x7fda8014ec60, ssl=ssl@entry=0x7fda8004dc70) at /build/qpid-proton-src/c/src/ssl/openssl.c:1235
#3 0x7fda9c95bc18 in init_ssl_socket (ssl=0x7fda8004dc70, transport=0x7fda8014ec60) at /build/qpid-proton-src/c/src/ssl/openssl.c:1232
#4 process_input_ssl (transport=0x7fda8014ec60, layer=0, input_data=0x7fda80163f90 "\220\a", available=0) at /build/qpid-proton-src/c/src/ssl/openssl.c:963
#5 0x7fda9c95317a in transport_consume (transport=transport@entry=0x7fda8014ec60) at /build/qpid-proton-src/c/src/core/transport.c:1821
#6 0x7fda9c954eda in pn_transport_close_tail (transport=0x7fda8014ec60) at /build/qpid-proton-src/c/src/core/transport.c:2972
#7 0x7fda9c94cdea in pn_connection_driver_read_close (d=d@entry=0x7fda800e8968) at /build/qpid-proton-src/c/src/core/connection_driver.c:114
#8 0x7fda9c72f5f8 in pconnection_process (pc=pc@entry=0x7fda800e83c0, events=events@entry=0, timeout=timeout@entry=false, topup=topup@entry=true, is_io_2=is_io_2@entry=false) at /build/qpid-proton-src/c/src/proactor/epoll.c:1230
#9 0x7fda9c72f754 in pconnection_batch_next () at /build/qpid-proton-src/c/src/proactor/epoll.c:953
#10 0x7fda9cbc12fb in thread_run (arg=0x1bd0f60) at /build/qpid-dispatch-src/src/server.c:976
#11 0x7fda9c512594 in start_thread (arg=) at pthread_create.c:463
#12 0x7fda9b7bbe6f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
{noformat}

3:
{noformat}
#0 0x7f761a027858 in EVP_PKEY_up_ref (pkey=0x1a0bf8b480166) at crypto/evp/p_lib.c:160
#1 0x7f761a3898a4 in ssl_cert_dup (cert=0x7f761b85a6a0 ) at ssl/ssl_cert.c:99
#2 0x7f761a394b98 in SSL_new (ctx=0x7f761ba77940 ) at ssl/ssl_lib.c:716
#3 0x7f761b8680c8 in init_ssl_socket (transport=0xcbede0, ssl=0xc92d70) at /build/qpid-proton-src/c/src/ssl/openssl.c:1235
#4 0x7f761b869464 in init_ssl_socket (ssl=, transport=) at /build/qpid-proton-src/c/src/ssl/openssl.c:1232
#5 pn_ssl_init (ssl0=, domain=, session_id=session_id@entry=0x0) at /build/qpid-proton-src/c/src/ssl/openssl.c:822
#6 0x7f761bab09ef in qdr_handle_authentication_service_connection_event (e=e@entry=0xc78900) at /build/qpid-dispatch-src/src/remote_sasl.c:623
#7 0x7f761bacda2c in handle (qd_server=qd_server@entry=0x85bf60, e=e@entry=0xc78900, pn_conn=pn_conn@entry=0xc35ba0, ctx=ctx@entry=0x0) at /build/qpid-dispatch-src/src/server.c:864
#8 0x7f761bace2e4 in thread_run (arg=arg@entry=0x85bf60) at /build/qpid-dispatch-src/src/server.c:973
#9 0x7f761bace590 in qd_server_run (qd=) at /build/qpid-dispatch-src/src/server.c:1247
#10 0x0040182c in main_process (config_path=0x7ffecc430965 "/tmp/qdrouterd.conf", python_pkgdir=, fd=2) at /build/qpid-dispatch-src/router/src/main.c:112
#11 0x00401589 in main (argc=3, argv=0x7ffecc42e848) at /build/qpid-dispatch-src/router/src/main.c:360
{noformat}
[jira] [Commented] (DISPATCH-1086) Dispatch Router sporadically goes into a state where TLS connections to the auth service fail
[ https://issues.apache.org/jira/browse/DISPATCH-1086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16620236#comment-16620236 ]

Carsten Lohmann commented on DISPATCH-1086:
---

(yes, sent via mail)
[jira] [Commented] (DISPATCH-1086) Dispatch Router sporadically goes into a state where TLS connections to the auth service fail
[ https://issues.apache.org/jira/browse/DISPATCH-1086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16619367#comment-16619367 ]

Gordon Sim commented on DISPATCH-1086:
--

[~calohmn] you don't have a core dump by any chance?
[jira] [Commented] (DISPATCH-1086) Dispatch Router sporadically goes into a state where TLS connections to the auth service fail
[ https://issues.apache.org/jira/browse/DISPATCH-1086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16617739#comment-16617739 ]

Carsten Lohmann commented on DISPATCH-1086:
---

We also got this issue.

Using the additional log information provided by PROTON-1886 (with PN_TRACE_DRV set to 1), there was this log output:

{noformat}
[0x7feae80d4f30]:Client SSL socket created.
[0x7feaf0142120]:Client SSL socket created.
[0x7feae80d4f30]:Read 26 bytes from SSL socket for app
[0x7feae80d4f30]:SSL socket freed.
[0x7feaf0142120]:Read 26 bytes from SSL socket for app
[0x7feaf0142120]:SSL socket freed.
[0x7feaec0c4e60]:SSL socket setup failure.
[0x7feaec0c4e60]:error:140BA0E4:SSL routines:SSL_new:ssl ctx has no default ssl version
2018-09-17 15:48:49.713040 + AUTHSERVICE (warning) Cannot initialise SSL
[0x7feaec0c4e60]:SSL socket setup failure.
[0x7feaec0c4e60]:error:140BA0E4:SSL routines:SSL_new:ssl ctx has no default ssl version
[0x7feae805ea00]:SSL socket setup failure.
[0x7feae805ea00]:error:140BA0E4:SSL routines:SSL_new:ssl ctx has no default ssl version
2018-09-17 15:48:49.713156 + AUTHSERVICE (warning) Cannot initialise SSL
[0x7feae805ea00]:SSL socket setup failure.
[0x7feae805ea00]:error:140BA0E4:SSL routines:SSL_new:ssl ctx has no default ssl version
[0x7feae8116100]:SSL socket setup failure.
[0x7feae8116100]:error:140BA0E4:SSL routines:SSL_new:ssl ctx has no default ssl version
2018-09-17 15:48:49.758598 + AUTHSERVICE (warning) Cannot initialise SSL
[0x7feae8116100]:SSL socket setup failure.
[0x7feae8116100]:error:140BA0E4:SSL routines:SSL_new:ssl ctx has no default ssl version
[0x7feae806adf0]:SSL socket setup failure.
[0x7feae806adf0]:error:140BA0E4:SSL routines:SSL_new:ssl ctx has no default ssl version
2018-09-17 15:48:49.762368 + AUTHSERVICE (warning) Cannot initialise SSL
[0x7feae806adf0]:SSL socket setup failure.
[0x7feae806adf0]:error:140BA0E4:SSL routines:SSL_new:ssl ctx has no default ssl version
{noformat}

After the last line above, the router exited with a seg fault.
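For readers unfamiliar with the error:140BA0E4 string in the log above, here is a minimal sketch of how such a code splits into its library, function and reason fields. It assumes the packed layout used by OpenSSL 1.0.x and 1.1.x (8 bits of library, 12 bits of function, 12 bits of reason); OpenSSL 3.0 changed this layout, so the sketch does not apply there. The type ssl_err_fields and the functions decode_ssl_error and parse_ssl_error are hypothetical helpers written for this illustration, not part of OpenSSL or Proton.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper type: the three fields packed into an
 * OpenSSL 1.0.x/1.1.x error code (assumed layout, see lead-in). */
typedef struct {
    unsigned lib;     /* library number, bits 24-31 */
    unsigned func;    /* function code,  bits 12-23 */
    unsigned reason;  /* reason code,    bits 0-11  */
} ssl_err_fields;

/* Split a packed error code into its fields. */
ssl_err_fields decode_ssl_error(uint32_t code)
{
    ssl_err_fields f;
    f.lib    = (code >> 24) & 0xFFu;
    f.func   = (code >> 12) & 0xFFFu;
    f.reason = code & 0xFFFu;
    return f;
}

/* Parse the hex string that follows "error:" in the log line.
 * Returns 0 on success, -1 if the text is not a 32-bit hex number. */
int parse_ssl_error(const char *hex, ssl_err_fields *out)
{
    char *end = NULL;
    unsigned long v;

    assert(hex != NULL && out != NULL);
    errno = 0;
    v = strtoul(hex, &end, 16);
    if (errno != 0 || end == hex || *end != '\0' || v > 0xFFFFFFFFul)
        return -1;
    *out = decode_ssl_error((uint32_t)v);
    return 0;
}
```

Calling parse_ssl_error("140BA0E4", &f) gives lib 20, func 186 and reason 228. In the OpenSSL 1.1.x headers these should correspond to ERR_LIB_SSL, SSL_F_SSL_NEW and SSL_R_SSL_CTX_HAS_NO_DEFAULT_SSL_VERSION, which matches the text form "SSL routines:SSL_new:ssl ctx has no default ssl version" printed in the log.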
[jira] [Commented] (DISPATCH-1086) Dispatch Router sporadically goes into a state where TLS connections to the auth service fail
[ https://issues.apache.org/jira/browse/DISPATCH-1086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16568067#comment-16568067 ]

Keith Wall commented on DISPATCH-1086:
--

I've spent considerable time trying to reproduce this error, so far without success, using both 1.2.0 and the 1.1.0 derivative (enmasseproject/qdrouterd-base:1.1.0_proton-0.23.0-http-DISPATCH-1034-1039) that was in use at the time. Unfortunately, the original test environment that brought out this issue (many times) is not currently available.

I've raised a PR against Proton (PROTON-1886) to make the contents of the SSL error queue available via the Proton transport tracer. Hopefully this will shed more light.