[jira] [Resolved] (PROTON-2790) Improve session flow control

2024-11-05 Thread Clifford Jansen (Jira)


 [ 
https://issues.apache.org/jira/browse/PROTON-2790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Clifford Jansen resolved PROTON-2790.
-
Fix Version/s: proton-c-0.40.0
   Resolution: Fixed

merged: 60ab050b

> Improve session flow control
> 
>
> Key: PROTON-2790
> URL: https://issues.apache.org/jira/browse/PROTON-2790
> Project: Qpid Proton
>  Issue Type: Improvement
>  Components: proton-c
>Affects Versions: proton-c-0.39.0
>Reporter: Clifford Jansen
>Assignee: Clifford Jansen
>Priority: Major
> Fix For: proton-c-0.40.0
>
>
> Current flow control replenishment for the session incoming window only 
> occurs when the window reaches 0.  This minimizes flow frames on the wire but 
> introduces a stall in transfer processing.
> Switching to using a low watermark for the session incoming window would 
> allow the application to choose a preferred trade off between transfer stalls 
> and FLOW frames.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: dev-unsubscr...@qpid.apache.org
For additional commands, e-mail: dev-h...@qpid.apache.org



[jira] [Commented] (PROTON-2859) Improve performance of pn_buffer_t defrag

2024-10-24 Thread Clifford Jansen (Jira)


[ 
https://issues.apache.org/jira/browse/PROTON-2859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17892349#comment-17892349
 ] 

Clifford Jansen commented on PROTON-2859:
-

cj-sender.c used with test-drain.c from PROTON-2857

 

> Improve performance of pn_buffer_t defrag
> -
>
> Key: PROTON-2859
> URL: https://issues.apache.org/jira/browse/PROTON-2859
> Project: Qpid Proton
>  Issue Type: Improvement
>  Components: proton-c
>Affects Versions: proton-c-0.39.0
>Reporter: Clifford Jansen
>Assignee: Clifford Jansen
>Priority: Major
> Attachments: cj-sender.c
>
>
> Currently the only optimization in defrag is a check in rotate to skip 
> memory copies if the rotation amount is zero.  Otherwise, the full capacity 
> is rotated one byte at a time, even if there is only one byte of content.
> Propose to check if the data in the buffer is currently contiguous and only 
> move actual content via memmove.






[jira] [Updated] (PROTON-2859) Improve performance of pn_buffer_t defrag

2024-10-23 Thread Clifford Jansen (Jira)


 [ 
https://issues.apache.org/jira/browse/PROTON-2859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Clifford Jansen updated PROTON-2859:

Attachment: cj-sender.c

> Improve performance of pn_buffer_t defrag
> -
>
> Key: PROTON-2859
> URL: https://issues.apache.org/jira/browse/PROTON-2859
> Project: Qpid Proton
>  Issue Type: Improvement
>  Components: proton-c
>Affects Versions: proton-c-0.39.0
>Reporter: Clifford Jansen
>Assignee: Clifford Jansen
>Priority: Major
> Attachments: cj-sender.c
>
>
> Currently the only optimization in defrag is a check in rotate to skip 
> memory copies if the rotation amount is zero.  Otherwise, the full capacity 
> is rotated one byte at a time, even if there is only one byte of content.
> Propose to check if the data in the buffer is currently contiguous and only 
> move actual content via memmove.






[jira] [Created] (PROTON-2857) Improve performance of session flow control for senders.

2024-10-23 Thread Clifford Jansen (Jira)
Clifford Jansen created PROTON-2857:
---

 Summary: Improve performance of session flow control for senders.
 Key: PROTON-2857
 URL: https://issues.apache.org/jira/browse/PROTON-2857
 Project: Qpid Proton
  Issue Type: Improvement
  Components: proton-c
Affects Versions: proton-c-0.39.0
Reporter: Clifford Jansen
Assignee: Clifford Jansen
 Attachments: test-drain.c, test-sender.c

The current interaction of pn_link_send and transfer frame generation results 
in many needless buffer rotate calls that are costly.

The attached test programs (courtesy of kgiusti) shine a bright light on the 
problem.  In this case a single 40MB message results in 76 buffer rotates and 
5GB of individual 8 bit byte moves that are all busy work.






[jira] [Created] (PROTON-2859) Improve performance of pn_buffer_t defrag

2024-10-23 Thread Clifford Jansen (Jira)
Clifford Jansen created PROTON-2859:
---

 Summary: Improve performance of pn_buffer_t defrag
 Key: PROTON-2859
 URL: https://issues.apache.org/jira/browse/PROTON-2859
 Project: Qpid Proton
  Issue Type: Improvement
  Components: proton-c
Affects Versions: proton-c-0.39.0
Reporter: Clifford Jansen
Assignee: Clifford Jansen


Currently the only optimization in defrag is a check in rotate to skip 
memory copies if the rotation amount is zero.  Otherwise, the full capacity is 
rotated one byte at a time, even if there is only one byte of content.

Propose to check if the data in the buffer is currently contiguous and only 
move actual content via memmove.
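
A minimal sketch of the proposed fast path, for illustration only.  The
ring_t type, its field names and the ring_defrag_fast() helper are
hypothetical and are not Proton's actual pn_buffer_t layout or API.  The
idea is that when the content does not wrap around the end of the storage,
a single memmove of just the content bytes is enough, and the byte-by-byte
rotate of the full capacity is only needed in the wrapped case.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical ring buffer; names are illustrative, not pn_buffer_t's. */
typedef struct {
    char  *bytes;     /* backing storage */
    size_t capacity;  /* total bytes of storage */
    size_t start;     /* index of the first content byte */
    size_t size;      /* number of content bytes */
} ring_t;

/* Returns 1 if the content does not wrap past the end of the storage. */
static int ring_is_contiguous(const ring_t *r) {
    return r->start + r->size <= r->capacity;
}

/* Moves contiguous content to offset 0 with one memmove.
 * Returns 1 if the fast path handled the buffer, or 0 if the content
 * wraps and a full rotate is still required by the caller. */
static int ring_defrag_fast(ring_t *r) {
    if (r->size == 0) {
        r->start = 0;           /* nothing to move */
        return 1;
    }
    if (!ring_is_contiguous(r))
        return 0;               /* wrapped: fall back to rotate */
    if (r->start != 0) {
        memmove(r->bytes, r->bytes + r->start, r->size);
        r->start = 0;
    }
    return 1;
}
```

With one byte of content in a large buffer, this moves one byte instead of
rotating the whole capacity.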






[jira] [Created] (PROTON-2858) Improve scheduling fairness for outgoing streaming messages

2024-10-23 Thread Clifford Jansen (Jira)
Clifford Jansen created PROTON-2858:
---

 Summary: Improve scheduling fairness for outgoing streaming 
messages
 Key: PROTON-2858
 URL: https://issues.apache.org/jira/browse/PROTON-2858
 Project: Qpid Proton
  Issue Type: Improvement
  Components: proton-c
Affects Versions: proton-c-0.39.0
Reporter: Clifford Jansen
Assignee: Clifford Jansen


PROTON-2857 takes special action in the case of an outgoing streaming link 
delivery during pn_link_send().  At that point, we know that the delivery is 
the current one for the link and the last for that link that may be on the 
tpwork queue with message data to send.

It could be possible to continually refill and not fully drain the delivery in 
pni_process_tpwork_sender().

A simple check of whether bytes have been sent on the wire since the last 
pn_link_send, and further whether the delivery is on the tpwork queue, can 
bypass this problem by moving the delivery to the back of the queue, allowing 
other links to progress.






[jira] [Created] (PROTON-2856) Provide TLS support for intermediate CA certificates as trust anchors in OpenSSL

2024-10-16 Thread Clifford Jansen (Jira)
Clifford Jansen created PROTON-2856:
---

 Summary: Provide TLS support for intermediate CA certificates as 
trust anchors in OpenSSL
 Key: PROTON-2856
 URL: https://issues.apache.org/jira/browse/PROTON-2856
 Project: Qpid Proton
  Issue Type: Improvement
  Components: proton-c
Affects Versions: proton-c-0.39.0
 Environment: Proton-C built with OpenSSL
Reporter: Clifford Jansen
Assignee: Clifford Jansen


The current implementation of TLS in Proton-C uses the default certificate 
verification algorithms provided by the OpenSSL library.

This has the effect of making it difficult to use intermediate CA certificates 
in Proton-C to provide finer-grained security envelopes for use, for example, by 
different organizational units in an organization or to differentiate subnets 
in cloud environments.  Currently an intermediate CA, by default, cannot be 
used to anchor a subtree of a parent root CA because the root CA must also be 
in the trust store, at which point the whole tree flowing from the root CA 
becomes trusted.

This behavior goes against current user expectations and industry norms.  See

  https://github.com/golang/go/issues/24685#issuecomment-1058119312

This makes it difficult for Proton-C users to use certificate chain tooling 
that they already have in place.

This JIRA proposes to set the X509_V_FLAG_PARTIAL_CHAIN flag when verifying 
peer certificates in OpenSSL.

An additional advantage is a shortened verification sequence.

After this change, existing trust stores for use with Proton-C that contain 
self-signed root certificates will continue to verify the whole subordinate 
trees of leaf certificates that flow from those roots.  Users will now be able 
to create new trust stores that limit trust to subtrees anchored to 
intermediate CA certificates.






[jira] [Commented] (PROTON-2594) Use of HSM for crypto operations with the private key of a TLS certificate

2024-08-12 Thread Clifford Jansen (Jira)


[ 
https://issues.apache.org/jira/browse/PROTON-2594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17873030#comment-17873030
 ] 

Clifford Jansen commented on PROTON-2594:
-

Sorry for the delay in responding.  I feel the suggested patch is useful and 
clear in its goal and implementation.   Many thanks for your submission.

+1 for using the provider api.

I would like to comment on the pull request, however I am having difficulty 
running a simple C test program.  No doubt it is due to my lack of familiarity 
with the standard, as well as with the layers of tooling to simulate HSM in 
software for testing.  From getting this program to run I hope to better 
understand the implications of the patch for installed package requirements, 
documentation changes, CI issues, and other differences (e.g. user password 
prompts?).

I have tried using the pkcs11-provider-qpid-proton-bug-reproduction project as 
a template to initialize softhsm, populate data with pkcs11-tool along with the 
OPENSSL_CONF and SOFTHSM2_CONF.  I cannot yet get it to run using Fedora 40, 
which should be new enough to work with your patch.

I have also tried your suggested C++ program, but the setup stage hangs at the 
"openssl storeutl", no matter what pin/password I supply, before even 
exercising your code.  A further indication that I get tripped up merely taking 
baby steps with pkcs11.

I have attached the C program I am trying to use (pn2594.c).  It simply makes 
one client and one server connection.  It allows you to specify each argument 
to the OpenSSL domain setup routines for each side.  For example, if run with 
these arguments from qpid-proton/cpp/testdata/certs you can run with mutual TLS 
(two ways), server side TLS, or no TLS:

/path/to/pn2594 amqps "client-certificate.pem" "client-private-key.pem" 
"client-password" "ca-certificate.pem" "server-certificate.pem" 
"server-private-key.pem" "server-password" "ca-certificate.pem"

  /path/to/pn2594 amqps "client-certificate.pem" 
"client-private-key-no-password.pem" "" "ca-certificate.pem" 
"server-certificate.pem" "server-private-key.pem" "server-password" 
"ca-certificate.pem"

  /path/to/pn2594 amqps "" "" "" "ca-certificate.pem" "server-certificate.pem" 
"server-private-key.pem" "server-password" ""

  /path/to/pn2594 amqp "" "" "" "" "" "" "" ""

I am trying to replace the first two examples "client private key" and "client 
password" with a pkcs11 URI and PIN, i.e.

  pkcs11-tool --module=/usr/lib64/libsofthsm2.so --token-label clitest --pin 
tclientpw --label test --id  --write-object 
/r4/amqp/p/pkcs11/cj/cjcerts/cj-client-private-key-no-password.pem --type 
privkey --usage-sign

   pn2594 amqps "client-certificate.pem" "pkcs11:token=clitest;id=%44%44" 
"tclientpw" "ca-certificate.pem" "server-certificate.pem" 
"server-private-key.pem" "server-password" "ca-certificate.pem"

I would appreciate it if you could confirm that you can run this test with your pkcs11 
patch and get it to work in the way you think it should be run (i.e. not 
"fixing" my command usage or config files).

Step by step commands (or a captured terminal session) to reproduce would be 
appreciated.  Preferably starting with an empty softhsm, initializing it, 
creating/loading the slot+token.

Hopefully from this exercise I can help you get the patch integrated.

Thanks.

 

> Use of HSM for crypto operations with the private key of a TLS certificate
> ---
>
> Key: PROTON-2594
> URL: https://issues.apache.org/jira/browse/PROTON-2594
> Project: Qpid Proton
>  Issue Type: New Feature
>  Components: cpp-binding, proton-c
>Reporter: Franz Hollerer
>Priority: Major
> Attachments: pn2594.c
>
>
> We use a Hardware Security Module with PKCS#11 Interface (to be more 
> specific: OP-TEE) as key store. This key store holds the public and private 
> key for a TLS certificate for the purpose of client authentication.
> Is there a way to instruct proton-qpid to use the HSM for cryptographic 
> operations with the private key?






[jira] [Updated] (PROTON-2594) Use of HSM for crypto operations with the private key of a TLS certificate

2024-08-12 Thread Clifford Jansen (Jira)


 [ 
https://issues.apache.org/jira/browse/PROTON-2594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Clifford Jansen updated PROTON-2594:

Attachment: pn2594.c

> Use of HSM for crypto operations with the private key of a TLS certificate
> ---
>
> Key: PROTON-2594
> URL: https://issues.apache.org/jira/browse/PROTON-2594
> Project: Qpid Proton
>  Issue Type: New Feature
>  Components: cpp-binding, proton-c
>Reporter: Franz Hollerer
>Priority: Major
> Attachments: pn2594.c
>
>
> We use a Hardware Security Module with PKCS#11 Interface (to be more 
> specific: OP-TEE) as key store. This key store holds the public and private 
> key for a TLS certificate for the purpose of client authentication.
> Is there a way to instruct proton-qpid to use the HSM for cryptographic 
> operations with the private key?






[jira] [Created] (PROTON-2834) Container stop delayed by canceled work_queue task.

2024-06-27 Thread Clifford Jansen (Jira)
Clifford Jansen created PROTON-2834:
---

 Summary: Container stop delayed by canceled work_queue task.
 Key: PROTON-2834
 URL: https://issues.apache.org/jira/browse/PROTON-2834
 Project: Qpid Proton
  Issue Type: Bug
  Components: cpp-binding
Affects Versions: proton-c-0.39.0
Reporter: Clifford Jansen
 Attachments: tc1.cpp

Canceling work using the work_handle does not remove the canceled item nor 
adjust the next proactor timeout forward if necessary.

This prevents the container from stopping until the last scheduled work has 
reached its deadline, even if canceled.

My first attempt at a fix fell short.  I believe a proper fix requires a 
combination of checking for a shortened timer in cancel and some sort of 
reaping of canceled work in cancel() or run_timer_jobs() or both.






[jira] [Commented] (PROTON-2832) Use-after free tsan error in epoll.c::post_event

2024-06-18 Thread Clifford Jansen (Jira)


[ 
https://issues.apache.org/jira/browse/PROTON-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17855982#comment-17855982
 ] 

Clifford Jansen commented on PROTON-2832:
-

Presumably there is a race between the last incoming epoll IO event (on the 
poller thread) and the call to stop_polling() (on the worker thread).  I 
suspect you need a raw connection wake (on its own) activating the worker, 
otherwise there would be no competing epoll activity "armed".

One solution is to adopt the current_arm + shutdown() behaviour of the AMQP 
connection.  I'm not sure what the equivalent state machine solution would be, 
but presumably this extra state will be relevant to an IOCP or (future) 
io_uring implementation too.

 

> Use-after free tsan error in epoll.c::post_event
> 
>
> Key: PROTON-2832
> URL: https://issues.apache.org/jira/browse/PROTON-2832
> Project: Qpid Proton
>  Issue Type: Bug
>  Components: proton-c
>Affects Versions: proton-c-0.40.0
>Reporter: Ken Giusti
>Priority: Major
>
> Hit only once in github CI running in ubuntu 24.04 container.
> proton main git ref:  813f87eef9e44682948e390b1d20a68ff283bad1
> Link to github log:
> [https://github.com/skupperproject/skupper-router/actions/runs/9550471254/job/26322660920?pr=1524#step:10:5749]
>  
>  
> {quote}59: E RuntimeError: Process 7654 (name=HTTP1EventChannel) error: 
> returned error code 66 {quote}
> {quote} 
> 59: E skrouterd -c HTTP1EventChannel.conf -I 
> /home/runner/work/skupper-router/skupper-router/skupper-router/python 
>  
> 59: E 
> /home/runner/work/skupper-router/skupper-router/skupper-router/build/tests/system_test.dir/tests/system_tests_http1_adaptor/Http1AdaptorEventChannelTest/setUpClass/HTTP1EventChannel-17.cmd
>  
>  
> 59: E  
>  
> 59: E == 
>  
> 59: E WARNING: ThreadSanitizer: heap-use-after-free (pid=7654) 
>  
> 59: E Write of size 1 at 0x726800030c91 by thread T4 (mutexes: write M0, 
> write M1): 
>  
> 59: E #0 post_event ../c/src/proactor/epoll.c:2349 
> (libqpid-proton-proactor.so.1+0x137f8) (BuildId: 
> 158eaa565e8d209417b7751d724f3f73f8099121) 
>  
> 59: E #1 poller_do_epoll ../c/src/proactor/epoll.c:2617 
> (libqpid-proton-proactor.so.1+0x137f8) 
>  
> 59: E #2 next_event_batch ../c/src/proactor/epoll.c:2501 
> (libqpid-proton-proactor.so.1+0x137f8) 
>  
> 59: E #3 pn_proactor_wait ../c/src/proactor/epoll.c:2740 
> (libqpid-proton-proactor.so.1+0x16265) (BuildId: 
> 158eaa565e8d209417b7751d724f3f73f8099121) 
>  
> 59: E #4 proactor_thread ../src/server.c:168 (skrouterd+0x130421) (BuildId: 
> 3a2755d79ab408265526faf0567b497811b59975) 
>  
> 59: E #5 _thread_init ../src/posix/threading.c:207 (skrouterd+0xc8441) 
> (BuildId: 3a2755d79ab408265526faf0567b497811b59975) 
>  
> 59: E  
>  
> 59: E Previous write of size 8 at 0x726800030c90 by thread T6: 
>  
> 59: E #0 free 
> ../../../../src/libsanitizer/tsan/tsan_interceptors_posix.cpp:724 
> (libtsan.so.2+0x5747c) (BuildId: 64c1e8de04b11a7d960abd7e45f94f3b277b7779) 
>  
> 59: E #1 praw_connection_cleanup ../c/src/proactor/epoll_raw_connection.c:171 
> (libqpid-proton-proactor.so.1+0x8de4) (BuildId: 
> 158eaa565e8d209417b7751d724f3f73f8099121) 
>  
> 59: E #2 praw_connection_cleanup ../c/src/proactor/epoll_raw_connection.c:157 
> (libqpid-proton-proactor.so.1+0x8de4) 
>  
> 59: E #3 pni_raw_connection_done ../c/src/proactor/epoll_raw_connection.c:496 
> (libqpid-proton-proactor.so.1+0x174b9) (BuildId: 
> 158eaa565e8d209417b7751d724f3f73f8099121) 
>  
> 59: E #4 pn_proactor_done ../c/src/proactor/epoll.c:2762 
> (libqpid-proton-proactor.so.1+0x174b9) 
>  
> 59: E #5 proactor_thread ../src/server.c:200 (skrouterd+0x1304d8) (BuildId: 
> 3a2755d79ab408265526faf0567b497811b59975) 
>  
> 59: E #6 _thread_init ../src/posix/threading.c:207 (skrouterd+0xc8441) 
> (BuildId: 3a2755d79ab408265526faf0567b497811b59975) 
>  
> 59: E  
>  
> 59: E Mutex M0 (0x726400030a50) created at: 
>  
> 59: E #0 pthread_mutex_init 
> ../../../../src/libsanitizer/tsan/tsan_interceptors_posix.cpp:1315 
> (libtsan.so.2+0x58bfd) (BuildId: 64c1e8de04b11a7d960abd7e45f94f3b277b7779) 
>  
> 59: E #1 pmutex_init ../c/src/proactor/epoll-internal.h:336 
> (libqpid-proton-proactor.so.1+0xcfb3) (BuildId: 
> 158eaa565e8d209417b7751d724f3f73f8099121) 
>  
> 59: E #2 pn_proactor ../c/src/proactor/epoll.c:1991 
> (libqpid-proton-proactor.so.1+0xcfb3) 
>  
> 59: E #3 qd_server ../src/server.c:219 (skrouterd+0x13b739) (BuildId: 
> 3a2755d79ab408265526faf0567b497811b59975) 
>  
> 59: E #4 qd_dispatch_prepare ../src/dispatch.c:343 (skrouterd+0xb11cd) 
> (BuildId: 3a2755d79ab408265526faf0567b497811b59975) 
>  
> 59: E #5   (libffi.so.8+0x7b15) (BuildId: 
> c9149b6e99105aa4321ddd4a10ee4b90de7b7d49) 
>  
> 59: E #6 main_process ../router/src/main.c:101 (skrouterd+0x13c57c) (BuildId:

[jira] [Created] (PROTON-2818) Move epoll proactor connection logic to a task thread

2024-04-22 Thread Clifford Jansen (Jira)
Clifford Jansen created PROTON-2818:
---

 Summary: Move epoll proactor connection logic to a task thread
 Key: PROTON-2818
 URL: https://issues.apache.org/jira/browse/PROTON-2818
 Project: Qpid Proton
  Issue Type: Bug
  Components: proton-c
Affects Versions: proton-c-0.39.0
Reporter: Clifford Jansen
Assignee: Clifford Jansen


See PROTON-2812.  Implement the first described mitigation.






[jira] [Commented] (PROTON-2812) Epoll proactor blocks thread during DNS lookups in getaddrinfo

2024-04-15 Thread Clifford Jansen (Jira)


[ 
https://issues.apache.org/jira/browse/PROTON-2812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17837355#comment-17837355
 ] 

Clifford Jansen commented on PROTON-2812:
-

An additional possible mitigation (with thanks to astitcher):

Since the epoll proactor knows when the getaddrinfo calls are needed and also 
when they are completed, it could regulate a maximum concurrent number of 
threads committed to servicing such calls.
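
One way to picture this regulation is a small admission gate, sketched
below.  It is an illustration under stated assumptions, not Proton code:
the lookup_gate_t type and both functions are hypothetical, and the caller
is assumed to hold whatever lock protects the gate.  A lookup starts only
while fewer than max_active are in flight; otherwise it is counted as
waiting and started when an earlier lookup completes.

```c
#include <stddef.h>

/* Hypothetical admission gate for blocking getaddrinfo() calls.
 * The caller is assumed to serialize access (e.g. under a mutex). */
typedef struct {
    size_t max_active;  /* upper bound on concurrent lookups */
    size_t active;      /* lookups currently occupying a thread */
    size_t waiting;     /* lookups deferred until a slot frees up */
} lookup_gate_t;

/* Returns 1 if the lookup may start now.  Otherwise records it as
 * waiting and returns 0; the caller defers the getaddrinfo() call. */
static int lookup_gate_begin(lookup_gate_t *g) {
    if (g->active < g->max_active) {
        g->active++;
        return 1;
    }
    g->waiting++;
    return 0;
}

/* Called when a lookup completes.  Returns 1 if a deferred lookup
 * should now be started (the slot is handed straight to it, so the
 * active count is unchanged), or 0 if the slot is simply released. */
static int lookup_gate_end(lookup_gate_t *g) {
    if (g->waiting > 0) {
        g->waiting--;
        return 1;
    }
    g->active--;
    return 0;
}
```

The remaining task threads stay available for normal event processing
however many lookups are pending.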

 

> Epoll proactor blocks thread during DNS lookups in getaddrinfo
> --
>
> Key: PROTON-2812
> URL: https://issues.apache.org/jira/browse/PROTON-2812
> Project: Qpid Proton
>  Issue Type: Bug
>  Components: proton-c
>Affects Versions: proton-c-0.39.0
>Reporter: Clifford Jansen
>Assignee: Clifford Jansen
>Priority: Major
> Attachments: mitigate01.diff
>
>
> The epoll proactor uses getaddrinfo() to resolve network addresses for 
> inbound and outbound AMQP and raw connections.  These connect and listener 
> calls are thread safe so may be called from any thread and the expectation is 
> that they initiate the action without blocking.
> Solutions could entail:
> 1) using a dedicated DNS thread pool that multiplexes N serialized (blocking) 
> getaddrinfo calls over the pool (e.g. getaddrinfo_a or self managed like 
> libuv)
> 2) use some custom library that scales DNS requests without blocking
> 3) write the simplest custom proactor library that does #2.






[jira] [Commented] (PROTON-2812) Epoll proactor blocks thread during DNS lookups in getaddrinfo

2024-04-09 Thread Clifford Jansen (Jira)


[ 
https://issues.apache.org/jira/browse/PROTON-2812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17835482#comment-17835482
 ] 

Clifford Jansen commented on PROTON-2812:
-

A short term mitigation could be to defer the getaddrinfo call for outgoing 
connections to run on a proactor task thread provided by the application.  The 
outgoing connection call will not block in this case.

If there are sufficient task threads compared to the number of 
blocked DNS calls during runtime, the performance impact may be greatly reduced.

See the attached mitigate01.diff

> Epoll proactor blocks thread during DNS lookups in getaddrinfo
> --
>
> Key: PROTON-2812
> URL: https://issues.apache.org/jira/browse/PROTON-2812
> Project: Qpid Proton
>  Issue Type: Bug
>  Components: proton-c
>Affects Versions: proton-c-0.39.0
>Reporter: Clifford Jansen
>Assignee: Clifford Jansen
>Priority: Major
> Attachments: mitigate01.diff
>
>
> The epoll proactor uses getaddrinfo() to resolve network addresses for 
> inbound and outbound AMQP and raw connections.  These connect and listener 
> calls are thread safe so may be called from any thread and the expectation is 
> that they initiate the action without blocking.
> Solutions could entail:
> 1) using a dedicated DNS thread pool that multiplexes N serialized (blocking) 
> getaddrinfo calls over the pool (e.g. getaddrinfo_a or self managed like 
> libuv)
> 2) use some custom library that scales DNS requests without blocking
> 3) write the simplest custom proactor library that does #2.






[jira] [Updated] (PROTON-2812) Epoll proactor blocks thread during DNS lookups in getaddrinfo

2024-04-09 Thread Clifford Jansen (Jira)


 [ 
https://issues.apache.org/jira/browse/PROTON-2812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Clifford Jansen updated PROTON-2812:

Attachment: mitigate01.diff

> Epoll proactor blocks thread during DNS lookups in getaddrinfo
> --
>
> Key: PROTON-2812
> URL: https://issues.apache.org/jira/browse/PROTON-2812
> Project: Qpid Proton
>  Issue Type: Bug
>  Components: proton-c
>Affects Versions: proton-c-0.39.0
>Reporter: Clifford Jansen
>Assignee: Clifford Jansen
>Priority: Major
> Attachments: mitigate01.diff
>
>
> The epoll proactor uses getaddrinfo() to resolve network addresses for 
> inbound and outbound AMQP and raw connections.  These connect and listener 
> calls are thread safe so may be called from any thread and the expectation is 
> that they initiate the action without blocking.
> Solutions could entail:
> 1) using a dedicated DNS thread pool that multiplexes N serialized (blocking) 
> getaddrinfo calls over the pool (e.g. getaddrinfo_a or self managed like 
> libuv)
> 2) use some custom library that scales DNS requests without blocking
> 3) write the simplest custom proactor library that does #2.






[jira] [Created] (PROTON-2812) Epoll proactor blocks thread during DNS lookups in getaddrinfo

2024-04-09 Thread Clifford Jansen (Jira)
Clifford Jansen created PROTON-2812:
---

 Summary: Epoll proactor blocks thread during DNS lookups in 
getaddrinfo
 Key: PROTON-2812
 URL: https://issues.apache.org/jira/browse/PROTON-2812
 Project: Qpid Proton
  Issue Type: Bug
  Components: proton-c
Affects Versions: proton-c-0.39.0
Reporter: Clifford Jansen
Assignee: Clifford Jansen


The epoll proactor uses getaddrinfo() to resolve network addresses for inbound 
and outbound AMQP and raw connections.  These connect and listener calls are 
thread safe so may be called from any thread and the expectation is that they 
initiate the action without blocking.

Solutions could entail:

1) using a dedicated DNS thread pool that multiplexes N serialized (blocking) 
getaddrinfo calls over the pool (e.g. getaddrinfo_a or self managed like libuv)

2) use some custom library that scales DNS requests without blocking

3) write the simplest custom proactor library that does #2.






[jira] [Commented] (PROTON-2792) [cpp] Segmentation fault in container::impl::run_timer_jobs

2024-02-08 Thread Clifford Jansen (Jira)


[ 
https://issues.apache.org/jira/browse/PROTON-2792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17815749#comment-17815749
 ] 

Clifford Jansen commented on PROTON-2792:
-

My previous comment is obviously irrelevant to the stated problem.  Please 
ignore.

> [cpp] Segmentation fault in container::impl::run_timer_jobs
> ---
>
> Key: PROTON-2792
> URL: https://issues.apache.org/jira/browse/PROTON-2792
> Project: Qpid Proton
>  Issue Type: Bug
>  Components: cpp-binding
>Affects Versions: proton-c-0.38.0
>Reporter: Martin Zlomek
>Priority: Major
>
> PROTON-2438 introduced a race condition in 
> [reading|https://github.com/DreamPearl/qpid-proton/blob/8142e3cecd9f668992e76a5448afc09fd7b1030a/cpp/src/proactor_container_impl.cpp#L545]
>  / 
> [writing|https://github.com/DreamPearl/qpid-proton/blob/8142e3cecd9f668992e76a5448afc09fd7b1030a/cpp/src/proactor_container_impl.cpp#L547]
>  {{is_active_}} in 
> [{{run_timer_jobs()}}|https://github.com/DreamPearl/qpid-proton/blob/8142e3cecd9f668992e76a5448afc09fd7b1030a/cpp/src/proactor_container_impl.cpp#L498]
>  while modifying it in 
> [{{schedule()}}|https://github.com/DreamPearl/qpid-proton/blob/8142e3cecd9f668992e76a5448afc09fd7b1030a/cpp/src/proactor_container_impl.cpp#L455]
>  or 
> [{{cancel()}}|https://github.com/DreamPearl/qpid-proton/blob/8142e3cecd9f668992e76a5448afc09fd7b1030a/cpp/src/proactor_container_impl.cpp#L473]
>  at the same time.






[jira] [Commented] (PROTON-2792) [cpp] Segmentation fault in container::impl::run_timer_jobs

2024-02-08 Thread Clifford Jansen (Jira)


[ 
https://issues.apache.org/jira/browse/PROTON-2792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17815733#comment-17815733
 ] 

Clifford Jansen commented on PROTON-2792:
-

run_timer_jobs() is only called from the PN_PROACTOR_TIMEOUT callback.  The 
proactor only allows one such callback at a time.  There should be no competing 
thread to GUARD against.

Is this JIRA based on an actual runtime failure?  If so, do you have a stack 
trace?

> [cpp] Segmentation fault in container::impl::run_timer_jobs
> ---
>
> Key: PROTON-2792
> URL: https://issues.apache.org/jira/browse/PROTON-2792
> Project: Qpid Proton
>  Issue Type: Bug
>  Components: cpp-binding
>Affects Versions: proton-c-0.38.0
>Reporter: Martin Zlomek
>Priority: Major
>
> PROTON-2438 introduced a race condition in 
> [reading|https://github.com/DreamPearl/qpid-proton/blob/8142e3cecd9f668992e76a5448afc09fd7b1030a/cpp/src/proactor_container_impl.cpp#L545]
>  / 
> [writing|https://github.com/DreamPearl/qpid-proton/blob/8142e3cecd9f668992e76a5448afc09fd7b1030a/cpp/src/proactor_container_impl.cpp#L547]
>  {{is_active_}} in 
> [{{run_timer_jobs()}}|https://github.com/DreamPearl/qpid-proton/blob/8142e3cecd9f668992e76a5448afc09fd7b1030a/cpp/src/proactor_container_impl.cpp#L498]
>  while modifying it in 
> [{{schedule()}}|https://github.com/DreamPearl/qpid-proton/blob/8142e3cecd9f668992e76a5448afc09fd7b1030a/cpp/src/proactor_container_impl.cpp#L455]
>  or 
> [{{cancel()}}|https://github.com/DreamPearl/qpid-proton/blob/8142e3cecd9f668992e76a5448afc09fd7b1030a/cpp/src/proactor_container_impl.cpp#L473]
>  at the same time.






[jira] [Created] (PROTON-2791) Add MSG_MORE performance boost to raw connections

2024-01-12 Thread Clifford Jansen (Jira)
Clifford Jansen created PROTON-2791:
---

 Summary: Add MSG_MORE performance boost to raw connections
 Key: PROTON-2791
 URL: https://issues.apache.org/jira/browse/PROTON-2791
 Project: Qpid Proton
  Issue Type: Improvement
  Components: proton-c
Affects Versions: proton-c-0.39.0
Reporter: Clifford Jansen
Assignee: Clifford Jansen


When multiple buffers are staged for writing, the use of the MSG_MORE send() 
flag for all but the last buffer can result in significant speed improvement.
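
The idea above can be sketched with plain sockets on Linux.  This is only an 
illustration, not the Proton implementation: send_staged() is a hypothetical 
helper, and MSG_MORE is a Linux-specific send() flag hinting that more data 
will follow, so the kernel may coalesce the buffers into fewer segments.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not Proton API): send `count` staged buffers on `fd`,
 * setting MSG_MORE on every buffer except the last one.
 * Returns the total number of bytes sent, or -1 on error. */
static ssize_t send_staged(int fd, const char *const bufs[],
                           const size_t lens[], size_t count)
{
    ssize_t total = 0;
    for (size_t i = 0; i < count; i++) {
        int flags = MSG_NOSIGNAL;          /* report EPIPE instead of raising SIGPIPE */
        if (i + 1 < count)
            flags |= MSG_MORE;             /* more buffers follow this one */
        size_t off = 0;
        while (off < lens[i]) {            /* handle short writes */
            ssize_t n = send(fd, bufs[i] + off, lens[i] - off, flags);
            if (n < 0)
                return -1;
            off += (size_t)n;
        }
        total += (ssize_t)lens[i];
    }
    return total;
}
```

The last buffer is sent without MSG_MORE so that the kernel is free to push 
the accumulated data out immediately.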






[jira] [Created] (PROTON-2790) Improve session flow control

2024-01-12 Thread Clifford Jansen (Jira)
Clifford Jansen created PROTON-2790:
---

 Summary: Improve session flow control
 Key: PROTON-2790
 URL: https://issues.apache.org/jira/browse/PROTON-2790
 Project: Qpid Proton
  Issue Type: Improvement
  Components: proton-c
Affects Versions: proton-c-0.39.0
Reporter: Clifford Jansen
Assignee: Clifford Jansen


Current flow control replenishment for the session incoming window only occurs 
when the window reaches 0.  This minimizes flow frames on the wire but 
introduces a stall in transfer processing.

Switching to using a low watermark for the session incoming window would allow 
the application to choose a preferred trade off between transfer stalls and 
FLOW frames.
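
The trade off can be illustrated with a toy model.  It is a sketch under 
simplifying assumptions, not Proton code: the window is counted in transfer 
frames, a FLOW frame restores it to full capacity instantly, and a "stall" is 
counted each time the window reaches 0.  simulate_window() and flow_stats are 
made-up names.

```c
#include <assert.h>

typedef struct {
    unsigned flow_frames;   /* FLOW frames the receiver had to send */
    unsigned stalls;        /* times the window hit 0 (sender must wait) */
} flow_stats;

/* Toy model: requires capacity > 0 and low_watermark < capacity.
 * low_watermark == 0 models replenishing only when the window is empty. */
static flow_stats simulate_window(unsigned capacity, unsigned low_watermark,
                                  unsigned transfers)
{
    flow_stats s = { 0, 0 };
    unsigned window = capacity;
    for (unsigned i = 0; i < transfers; i++) {
        window--;                        /* one transfer uses one window slot */
        if (window == 0)
            s.stalls++;                  /* nothing more can arrive until FLOW */
        if (window <= low_watermark) {   /* replenish at or below the watermark */
            window = capacity;
            s.flow_frames++;
        }
    }
    return s;
}
```

In this model a capacity of 10 with watermark 0 gives 10 FLOW frames and 10 
stalls per 100 transfers, while watermark 5 gives 20 FLOW frames and no stalls.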






[jira] [Closed] (PROTON-2545) raw connection: client disconnect is ignored if no read buffers are available.

2023-10-02 Thread Clifford Jansen (Jira)


 [ 
https://issues.apache.org/jira/browse/PROTON-2545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Clifford Jansen closed PROTON-2545.
---
Resolution: Won't Fix

The opposite approach was taken; see PROTON-2748.  If this needs revisiting, it 
would be best to open a new Jira that references these older issues and adds 
new information on why the decision needs refining.

> raw connection: client disconnect is ignored if no read buffers are available.
> --
>
> Key: PROTON-2545
> URL: https://issues.apache.org/jira/browse/PROTON-2545
> Project: Qpid Proton
>  Issue Type: Bug
>  Components: proton-c
>Affects Versions: proton-c-0.37.0
>Reporter: Ken Giusti
>Assignee: Clifford Jansen
>Priority: Major
>
> Refer to [https://github.com/skupperproject/skupper-router/issues/477]
> TL;DR - if a client closes its TCP connection (full drop - not half close), 
> the proactor cannot post a PN_RAW_CONNECTION_DISCONNECTED event unless read 
> buffers have been provided to the raw connection.






[jira] [Resolved] (PROTON-2763) Two final disconnect events possible from a raw connection

2023-10-02 Thread Clifford Jansen (Jira)


 [ 
https://issues.apache.org/jira/browse/PROTON-2763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Clifford Jansen resolved PROTON-2763.
-
Fix Version/s: proton-c-0.40.0
   Resolution: Fixed

> Two final disconnect events possible from a raw connection
> --
>
> Key: PROTON-2763
> URL: https://issues.apache.org/jira/browse/PROTON-2763
> Project: Qpid Proton
>  Issue Type: Bug
>  Components: proton-c
>Affects Versions: proton-c-0.39.0
>Reporter: Clifford Jansen
>Assignee: Clifford Jansen
>Priority: Major
> Fix For: proton-c-0.40.0
>
>
> In writing a new threaderciser for raw connections the following scenario can 
> result in a state machine mixup and second disconnect.
> If a pn_raw_connection_wake() occurs around the time that the first 
> disconnect event is being consumed the task may be added to the global ready 
> list for processing.  The batch done() processing will (correctly) defer the 
> task cleanup until the task is next scheduled via the ready list.  However 
> the raw connection forgets that it has already done the disconnect and 
> restarts the state machine at the first disconnect stage, resulting in the 
> second disconnect event.






[jira] [Resolved] (PROTON-2764) Zombie raw connections

2023-10-02 Thread Clifford Jansen (Jira)


 [ 
https://issues.apache.org/jira/browse/PROTON-2764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Clifford Jansen resolved PROTON-2764.
-
Fix Version/s: proton-c-0.40.0
   Resolution: Fixed

> Zombie raw connections
> --
>
> Key: PROTON-2764
> URL: https://issues.apache.org/jira/browse/PROTON-2764
> Project: Qpid Proton
>  Issue Type: Bug
>  Components: proton-c
>Affects Versions: proton-c-0.39.0
>Reporter: Clifford Jansen
>Assignee: Clifford Jansen
>Priority: Major
> Fix For: proton-c-0.40.0
>
>
> In writing a new threaderciser for raw connections the following scenario can 
> result in raw connections that are never scheduled.
> If a pn_listener_raw_accept() fails due to a temporary fdlimit shortage or 
> simultaneous close of the listener by another thread, the new raw connection 
> is correctly set to an error state but is never scheduled for processing.  
> The state machine is never advanced and the raw connection resources are not 
> cleaned up.  This also causes the PN_PROACTOR_INACTIVE event to be blocked.






[jira] [Created] (PROTON-2764) Zombie raw connections

2023-09-13 Thread Clifford Jansen (Jira)
Clifford Jansen created PROTON-2764:
---

 Summary: Zombie raw connections
 Key: PROTON-2764
 URL: https://issues.apache.org/jira/browse/PROTON-2764
 Project: Qpid Proton
  Issue Type: Bug
  Components: proton-c
Affects Versions: proton-c-0.39.0
Reporter: Clifford Jansen
Assignee: Clifford Jansen


In writing a new threaderciser for raw connections the following scenario can 
result in raw connections that are never scheduled.

If a pn_listener_raw_accept() fails due to a temporary fdlimit shortage or 
simultaneous close of the listener by another thread, the new raw connection is 
correctly set to an error state but is never scheduled for processing.  The 
state machine is never advanced and the raw connection resources are not 
cleaned up.  This also causes the PN_PROACTOR_INACTIVE event to be blocked.






[jira] [Created] (PROTON-2763) Two final disconnect events possible from a raw connection

2023-09-13 Thread Clifford Jansen (Jira)
Clifford Jansen created PROTON-2763:
---

 Summary: Two final disconnect events possible from a raw connection
 Key: PROTON-2763
 URL: https://issues.apache.org/jira/browse/PROTON-2763
 Project: Qpid Proton
  Issue Type: Bug
  Components: proton-c
Affects Versions: proton-c-0.39.0
Reporter: Clifford Jansen
Assignee: Clifford Jansen


In writing a new threaderciser for raw connections the following scenario can 
result in a state machine mixup and second disconnect.

If a pn_raw_connection_wake() occurs around the time that the first disconnect 
event is being consumed the task may be added to the global ready list for 
processing.  The batch done() processing will (correctly) defer the task 
cleanup until the task is next scheduled via the ready list.  However the raw 
connection forgets that it has already done the disconnect and restarts the 
state machine at the first disconnect stage, resulting in the second disconnect 
event.
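
The scenario can be pictured with a toy model.  This is not the Proton state 
machine and not the actual fix; toy_conn, toy_schedule() and toy_run() are 
hypothetical names used only to show why remembering the first disconnect 
matters when the task is scheduled a second time.

```c
#include <assert.h>
#include <stdbool.h>

typedef struct {
    bool closing;               /* the connection has started to disconnect */
    bool disconnect_emitted;    /* remembered across scheduling passes */
    unsigned disconnect_events; /* events delivered to the application */
} toy_conn;

/* One scheduling pass of the toy task.  With `remember` false the model
 * forgets earlier passes and re-runs the disconnect stage every time. */
static void toy_schedule(toy_conn *c, bool remember)
{
    if (!c->closing)
        return;
    if (remember && c->disconnect_emitted)
        return;                 /* only the deferred cleanup remains */
    c->disconnect_emitted = true;
    c->disconnect_events++;
}

/* Two passes: the one that delivers the disconnect, then a later one
 * caused by a wake that raced with it. */
static unsigned toy_run(bool remember)
{
    toy_conn c = { true, false, 0 };
    toy_schedule(&c, remember);
    toy_schedule(&c, remember);
    return c.disconnect_events;
}
```

Here toy_run(false) delivers two disconnect events and toy_run(true) delivers 
one.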






[jira] [Commented] (PROTON-2748) Raw connections do not always complete close operations

2023-08-14 Thread Clifford Jansen (Jira)


[ 
https://issues.apache.org/jira/browse/PROTON-2748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17754426#comment-17754426
 ] 

Clifford Jansen commented on PROTON-2748:
-

Related issues:

  https://issues.apache.org/jira/browse/PROTON-2545

  https://issues.apache.org/jira/browse/PROTON-2680

> Raw connections do not always complete close operations
> ---
>
> Key: PROTON-2748
> URL: https://issues.apache.org/jira/browse/PROTON-2748
> Project: Qpid Proton
>  Issue Type: Bug
>  Components: proton-c
>Affects Versions: proton-c-0.39.0
> Environment: linux epoll
>Reporter: Clifford Jansen
>Assignee: Clifford Jansen
>Priority: Major
> Attachments: pn2748.patch
>
>
> The Proton raw_connection_t currently requires cooperation from the 
> application layer to complete a close.  There is a baked in assumption that 
> the application will always eventually provide a read buffer.  A second 
> assumption is that the peer (not necessarily a Proton raw connection) will 
> detect a read close on its side, and do a graceful close of its write side 
> "soon".
> These incorrect assumptions can leave the raw connection in a hung state 
> waiting for non-existent wind up activity by the application or peer, 
> respectively.






[jira] [Commented] (PROTON-2545) raw connection: client disconnect is ignored if no read buffers are available.

2023-08-14 Thread Clifford Jansen (Jira)


[ 
https://issues.apache.org/jira/browse/PROTON-2545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17754425#comment-17754425
 ] 

Clifford Jansen commented on PROTON-2545:
-

See the opposite reasoning in

  https://issues.apache.org/jira/browse/PROTON-2748

It looks at detecting/initiating/completing the shutdown and cleanup of socket 
resources from a wider perspective and is perhaps the better place to 
discuss/resolve this issue.

> raw connection: client disconnect is ignored if no read buffers are available.
> --
>
> Key: PROTON-2545
> URL: https://issues.apache.org/jira/browse/PROTON-2545
> Project: Qpid Proton
>  Issue Type: Bug
>  Components: proton-c
>Affects Versions: proton-c-0.37.0
>Reporter: Ken Giusti
>Assignee: Clifford Jansen
>Priority: Major
>
> Refer to [https://github.com/skupperproject/skupper-router/issues/477]
> TL;DR - if a client closes its TCP connection (full drop - not half close), 
> the proactor cannot post a PN_RAW_CONNECTION_DISCONNECTED event unless read 
> buffers have been provided to the raw connection.






[jira] [Commented] (PROTON-2680) [proton-c] PN_RAW_CONNECTION_DISCONNECTED event does not show up when client is disconnected

2023-08-14 Thread Clifford Jansen (Jira)


[ 
https://issues.apache.org/jira/browse/PROTON-2680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17754421#comment-17754421
 ] 

Clifford Jansen commented on PROTON-2680:
-

The test case should be retested against the code changes in

  https://issues.apache.org/jira/browse/PROTON-2748

once they have been finalized, approved and checked in.

Ultimately, it should be noted that the killed curl process does:

  connect to router
  send http request bytes on socket
  [ kill ]
  OS closes socket -> FIN

and nothing else except an ack to a FIN if the router ever sends one, or an RST 
if the router sends data.

From the router's perspective, this is identical to some other client which 
does:

  connect to router
  send http request bytes on socket
  wait some time
  half close socket (write side) -> FIN
  wait a long long time for the http response from the router

The latter is completely valid and should not result in a DISCONNECT.

The two are indistinguishable on the wire (or loopback).
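
The valid half close case can be sketched as follows.  This is an illustration 
only: a local AF_UNIX socket pair stands in for the TCP connection (so no real 
FIN segment is involved), and half_close_demo() is a made-up name.  The 
"router" end sees end of input after the request, yet its reply is still 
delivered to the "client".

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns 1 if the reply reaches the client after its half close,
 * 0 if it does not, and -1 on an unexpected socket error. */
static int half_close_demo(void)
{
    int sv[2];
    char buf[32];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return -1;
    int client = sv[0], router = sv[1];
    int result = -1;

    /* client: send the request, then close only the write side */
    if (send(client, "GET /api", 8, 0) == 8 &&
        shutdown(client, SHUT_WR) == 0 &&
        /* router: reads the request, then sees end of input (read close) */
        recv(router, buf, sizeof buf, 0) == 8 &&
        recv(router, buf, sizeof buf, 0) == 0 &&
        /* router: its write side is still open, so it can reply */
        send(router, "200 OK", 6, 0) == 6) {
        ssize_t n = recv(client, buf, sizeof buf, 0);
        result = (n == 6 && memcmp(buf, "200 OK", 6) == 0) ? 1 : 0;
    }
    close(client);
    close(router);
    return result;
}
```

Up to the end of input, the router end observes exactly what it would observe 
if the client had been killed after sending its request.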

> [proton-c] PN_RAW_CONNECTION_DISCONNECTED event does not show up when client 
> is disconnected 
> -
>
> Key: PROTON-2680
> URL: https://issues.apache.org/jira/browse/PROTON-2680
> Project: Qpid Proton
>  Issue Type: Bug
>  Components: proton-c
>Reporter: Ganesh Murthy
>Assignee: Clifford Jansen
>Priority: Major
>
> Steps to reproduce
> Start the skupper-router with the following config -
> {noformat}
> router {
> mode: standalone
> }
> listener {
> host: 0.0.0.0
> port: amqp
> authenticatePeer: no
> saslMechanisms: ANONYMOUS
> }
> tcpConnector {
> name: echo-1
> host: 10.108.50.177
> port: 9090
> address: echo
> }
> tcpConnector {
> name: echo-2
> host: 10.108.50.177
> port: 9090
> address: echo
> }
> tcpListener {
> host: 0.0.0.0
> port: 9000
> address: echo
> }  
> log {
>     module: DEFAULT
>     enable: trace+
>     outputFile: tcp.log
> } {noformat}
>  
> Note that the ip address in the host field of the tcpConnector is bogus.
> Now connect a curl client to the tcpListener port  -
> {noformat}
> curl http://localhost:9000/api {noformat}
>  
> The curl client will hang. Terminate the curl client and look in the tcp.log 
> for logged proton events - the PN_RAW_CONNECTION_DISCONNECTED event will be 
> missing on connection C2
> Here is the full log of the relevant client connection
>  
> {noformat}
> 2023-02-01 16:51:57.069705 -0500 ROUTER_CORE (info) [C2] Connection Opened: 
> dir=in host=127.0.0.1:35348 encrypted=no auth=no user= 
> container_id=TcpAdaptor props={:"qd.adaptor"="tcp"}
> 2023-02-01 16:51:57.069793 -0500 ROUTER_CORE (trace) Core action 
> 'connection_opened'
> 2023-02-01 16:51:57.069986 -0500 TCP_ADAPTOR (info) [C2] 
> PN_RAW_CONNECTION_CONNECTED Listener ingress accepted to 0.0.0.0:9000 from 
> 127.0.0.1:35348 (global_id=127.0.0.1:35348)
> 2023-02-01 16:51:57.070015 -0500 ROUTER_CORE (trace) Core action 
> 'link_first_attach'
> 2023-02-01 16:51:57.070098 -0500 TCP_ADAPTOR (debug) [C2] 
> PN_RAW_CONNECTION_NEED_WRITE_BUFFERS listener
> 2023-02-01 16:51:57.070148 -0500 TCP_ADAPTOR (debug) [C2] 
> PN_RAW_CONNECTION_NEED_READ_BUFFERS listener
> 2023-02-01 16:51:57.070171 -0500 ROUTER_CORE (info) [C2][L4] Link attached: 
> dir=out source={(dyn) expire:link} target={ expire:link}
> 2023-02-01 16:51:57.070222 -0500 TCP_ADAPTOR (debug) [C2] 
> qdr_tcp_activate_CT: call pn_raw_connection_wake()
> 2023-02-01 16:51:57.070246 -0500 ROUTER_CORE (trace) Core action 
> 'link_first_attach'
> 2023-02-01 16:51:57.070273 -0500 TCP_ADAPTOR (debug) [C2][L4] (listener 
> outgoing) qdr_tcp_second_attach
> 2023-02-01 16:51:57.070347 -0500 DEFAULT (trace) Parse tree search for 'echo'
> 2023-02-01 16:51:57.070376 -0500 TCP_ADAPTOR (trace) [C2][L5] handle_incoming 
> qdr_tcp_second_attach for listener connection. read_closed:F, flow_enabled:F
> 2023-02-01 16:51:57.070404 -0500 DEFAULT (trace) Parse tree match not found
> 2023-02-01 16:51:57.070425 -0500 TCP_ADAPTOR (debug) [C2][L5] Waiting for 
> credit before initiating listener ingress stream message, returning
> 2023-02-01 16:51:57.070456 -0500 TCP_ADAPTOR (debug) [C2][L4] 
> qdr_tcp_get_credit: NOOP
> 2023-02-01 16:51:57.070517 -0500 TCP_ADAPTOR (trace) Listener 
> tcpListener/0.0.0.0:9000 (0.0.0.0:9000) service address echo consumer count 
> updates: local=1 in-process=0 remote=0
> 2023-02-01 16:51:57.070553 -0500 ROUTER_CORE (info) [C2][L5] Link attached: 
> dir=in source={ expire:link} target={echo expire:link}
> 2023-02-01 16:51:57.070583 -0500 ROUTER_CORE (trace) Core action 
> 'add_tcp_connection'
> 2023-02-01 16:51:57.070606 -0500 TCP_ADAPTOR (debug) [C2] 
> PN_RAW_CONNECTION_WAKE listener
> 2023-02-01 16:51:57.070646 -0500 TCP_A

[jira] [Commented] (PROTON-2748) Raw connections do not always complete close operations

2023-08-10 Thread Clifford Jansen (Jira)


[ 
https://issues.apache.org/jira/browse/PROTON-2748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17752818#comment-17752818
 ] 

Clifford Jansen commented on PROTON-2748:
-

This is the proposed behaviour for normal operation and asynchronous network 
errors.

In normal operation the application may suspend/resume activity on a raw 
connection by withholding/supplying read and write buffers as desired.  In the 
absence of network errors, pending input bytes will be available for read 
before the CLOSED_READ event and pending output bytes will be sent before a 
CLOSED_WRITE event.

READ and CLOSED_READ activity will not be polled/requested of the OS by the raw 
connection in the absence of read buffers.  If there are no queued output 
buffers for writing, the raw connection will be suspended until a future 
pn_raw_connection_wake() or network error.

Async disconnect (RST) is always immediately detected and leads to DISCONNECTED 
state and subsequent resource cleanup including close of the underlying socket 
without further blocking.

pn_raw_connection_close() results in progression to DISCONNECTED state without 
blocking (including resource cleanup).  In particular, no acknowledgment of the 
close operation is required or expected from the peer and the TCP connection is 
cleaned up by the operating system according to its configured SO_LINGER policy.
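
For reference, SO_LINGER is an ordinary socket-level option.  The sketch below 
shows how it is set and read back on a plain POSIX socket; set_linger() and 
get_linger() are hypothetical helpers, and nothing here is Proton API or a 
statement about how Proton handles the underlying socket.

```c
#include <assert.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: set the linger policy on `fd`.
 * enabled != 0 makes close() wait up to `seconds` for unsent data. */
static int set_linger(int fd, int enabled, int seconds)
{
    struct linger lg;
    lg.l_onoff = enabled;
    lg.l_linger = seconds;
    return setsockopt(fd, SOL_SOCKET, SO_LINGER, &lg, sizeof lg);
}

/* Hypothetical helper: read the current linger policy of `fd`. */
static int get_linger(int fd, struct linger *out)
{
    socklen_t len = sizeof *out;
    return getsockopt(fd, SOL_SOCKET, SO_LINGER, out, &len);
}
```

With lingering disabled (the usual default), close() returns promptly and the 
operating system finishes the connection teardown in the background.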

> Raw connections do not always complete close operations
> ---
>
> Key: PROTON-2748
> URL: https://issues.apache.org/jira/browse/PROTON-2748
> Project: Qpid Proton
>  Issue Type: Bug
>  Components: proton-c
>Affects Versions: proton-c-0.39.0
> Environment: linux epoll
>Reporter: Clifford Jansen
>Assignee: Clifford Jansen
>Priority: Major
> Attachments: pn2748.patch
>
>
> The Proton raw_connection_t currently requires cooperation from the 
> application layer to complete a close.  There is a baked in assumption that 
> the application will always eventually provide a read buffer.  A second 
> assumption is that the peer (not necessarily a Proton raw connection) will 
> detect a read close on its side, and do a graceful close of its write side 
> "soon".
> These incorrect assumptions can leave the raw connection in a hung state 
> waiting for non-existent wind up activity by the application or peer, 
> respectively.






[jira] [Commented] (PROTON-2748) Raw connections do not always complete close operations

2023-07-05 Thread Clifford Jansen (Jira)


[ 
https://issues.apache.org/jira/browse/PROTON-2748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17740290#comment-17740290
 ] 

Clifford Jansen commented on PROTON-2748:
-

test case in pn2748.patch

> Raw connections do not always complete close operations
> ---
>
> Key: PROTON-2748
> URL: https://issues.apache.org/jira/browse/PROTON-2748
> Project: Qpid Proton
>  Issue Type: Bug
>  Components: proton-c
>Affects Versions: proton-c-0.39.0
> Environment: linux epoll
>Reporter: Clifford Jansen
>Assignee: Clifford Jansen
>Priority: Major
> Attachments: pn2748.patch
>
>
> The Proton raw_connection_t currently requires cooperation from the 
> application layer to complete a close.  There is a baked in assumption that 
> the application will always eventually provide a read buffer.  A second 
> assumption is that the peer (not necessarily a Proton raw connection) will 
> detect a read close on its side, and do a graceful close of its write side 
> "soon".
> These incorrect assumptions can leave the raw connection in a hung state 
> waiting for non-existent wind up activity by the application or peer, 
> respectively.






[jira] [Updated] (PROTON-2748) Raw connections do not always complete close operations

2023-07-05 Thread Clifford Jansen (Jira)


 [ 
https://issues.apache.org/jira/browse/PROTON-2748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Clifford Jansen updated PROTON-2748:

Attachment: pn2748.patch

> Raw connections do not always complete close operations
> ---
>
> Key: PROTON-2748
> URL: https://issues.apache.org/jira/browse/PROTON-2748
> Project: Qpid Proton
>  Issue Type: Bug
>  Components: proton-c
>Affects Versions: proton-c-0.39.0
> Environment: linux epoll
>Reporter: Clifford Jansen
>Assignee: Clifford Jansen
>Priority: Major
> Attachments: pn2748.patch
>
>
> The Proton raw_connection_t currently requires cooperation from the 
> application layer to complete a close.  There is a baked in assumption that 
> the application will always eventually provide a read buffer.  A second 
> assumption is that the peer (not necessarily a Proton raw connection) will 
> detect a read close on its side, and do a graceful close of its write side 
> "soon".
> These incorrect assumptions can leave the raw connection in a hung state 
> waiting for non-existent wind up activity by the application or peer, 
> respectively.






[jira] [Created] (PROTON-2748) Raw connections do not always complete close operations

2023-07-05 Thread Clifford Jansen (Jira)
Clifford Jansen created PROTON-2748:
---

 Summary: Raw connections do not always complete close operations
 Key: PROTON-2748
 URL: https://issues.apache.org/jira/browse/PROTON-2748
 Project: Qpid Proton
  Issue Type: Bug
  Components: proton-c
Affects Versions: proton-c-0.39.0
 Environment: linux epoll
Reporter: Clifford Jansen
Assignee: Clifford Jansen


The Proton raw_connection_t currently requires cooperation from the application 
layer to complete a close.  There is a baked in assumption that the application 
will always eventually provide a read buffer.  A second assumption is that the 
peer (not necessarily a Proton raw connection) will detect a read close on its 
side, and do a graceful close of its write side "soon".

These incorrect assumptions can leave the raw connection in a hung state 
waiting for non-existent wind up activity by the application or peer, 
respectively.






[jira] [Created] (PROTON-2747) Switch to OpenSSL for TLS support on Windows

2023-06-28 Thread Clifford Jansen (Jira)
Clifford Jansen created PROTON-2747:
---

 Summary: Switch to OpenSSL for TLS support on Windows
 Key: PROTON-2747
 URL: https://issues.apache.org/jira/browse/PROTON-2747
 Project: Qpid Proton
  Issue Type: Improvement
  Components: proton-c
Affects Versions: proton-c-future
 Environment: Windows
Reporter: Clifford Jansen
Assignee: Clifford Jansen


Proton-C performance has received considerable attention in the last few years, 
resulting in significant performance boosts.  Further improvements are planned 
and some of these are expected to require non-trivial plumbing changes to 
the IO subsystem including TLS support.  Currently a lot of this plumbing has 
twinned implementations for Windows and Posix.

Given that today the use and adoption of open source software is actively 
supported by Microsoft, including integrated build tool chains, it makes sense 
to simplify the Proton code for future enhancements and long term maintenance.

This JIRA tracks ongoing implementation work for the switch from Schannel 
libraries (native Windows) to OpenSSL.






[jira] [Updated] (PROTON-2736) TLS OpenSSL library: hang with large application data frames

2023-05-29 Thread Clifford Jansen (Jira)


 [ 
https://issues.apache.org/jira/browse/PROTON-2736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Clifford Jansen updated PROTON-2736:

Fix Version/s: proton-c-0.39.0

> TLS OpenSSL library: hang with large application data frames
> 
>
> Key: PROTON-2736
> URL: https://issues.apache.org/jira/browse/PROTON-2736
> Project: Qpid Proton
>  Issue Type: Bug
>  Components: proton-c
>Affects Versions: proton-c-0.38.0
>Reporter: Clifford Jansen
>Assignee: Clifford Jansen
>Priority: Major
> Fix For: proton-c-0.39.0
>
>
> OpenSSL maintains a buffer large enough for the largest possible TLS protocol 
> record + 1K.  The Proton TLS decrypt loop is unaware of record boundaries and 
> repeatedly adds encrypted bytes at one end and takes out decrypted bytes at 
> the other, stopping when there is no more to decrypt or no more application 
> buffer space to move decrypted content into.
> It also tests if there are remaining decrypted bytes available should the 
> application provide additional buffers.  This test can fail in the case that 
> the OpenSSL buffer is completely filled with:
>  handshake record > 1K followed by
>  partial max sized application data record
> The SSL_peek operation will not see any application data and Proton 
> "remembers" the full buffer without allowing that the handshake record has 
> been processed and the buffer is no longer full.






[jira] [Resolved] (PROTON-2736) TLS OpenSSL library: hang with large application data frames

2023-05-29 Thread Clifford Jansen (Jira)


 [ 
https://issues.apache.org/jira/browse/PROTON-2736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Clifford Jansen resolved PROTON-2736.
-
Resolution: Fixed

> TLS OpenSSL library: hang with large application data frames
> 
>
> Key: PROTON-2736
> URL: https://issues.apache.org/jira/browse/PROTON-2736
> Project: Qpid Proton
>  Issue Type: Bug
>  Components: proton-c
>Affects Versions: proton-c-0.38.0
>Reporter: Clifford Jansen
>Assignee: Clifford Jansen
>Priority: Major
> Fix For: proton-c-0.39.0
>
>
> OpenSSL maintains a buffer large enough for the largest possible TLS protocol 
> record + 1K.  The Proton TLS decrypt loop is unaware of record boundaries and 
> repeatedly adds encrypted bytes at one end and takes out decrypted bytes at 
> the other, stopping when there is no more to decrypt or no more application 
> buffer space to move decrypted content into.
> It also tests if there are remaining decrypted bytes available should the 
> application provide additional buffers.  This test can fail in the case that 
> the OpenSSL buffer is completely filled with:
>  handshake record > 1K followed by
>  partial max sized application data record
> The SSL_peek operation will not see any application data and Proton 
> "remembers" the full buffer without allowing that the handshake record has 
> been processed and the buffer is no longer full.






[jira] [Created] (PROTON-2736) TLS OpenSSL library: hang with large application data frames

2023-05-14 Thread Clifford Jansen (Jira)
Clifford Jansen created PROTON-2736:
---

 Summary: TLS OpenSSL library: hang with large application data frames
 Key: PROTON-2736
 URL: https://issues.apache.org/jira/browse/PROTON-2736
 Project: Qpid Proton
  Issue Type: Bug
  Components: proton-c
Affects Versions: proton-c-0.38.0
Reporter: Clifford Jansen
Assignee: Clifford Jansen


OpenSSL maintains a buffer large enough for the largest possible TLS protocol 
record + 1K.  The Proton TLS decrypt loop is unaware of record boundaries and 
repeatedly adds encrypted bytes at one end and takes out decrypted bytes at the 
other, stopping when there is no more to decrypt or no more application buffer 
space to move decrypted content into.

It also tests if there are remaining decrypted bytes available should the 
application provide additional buffers.  This test can fail in the case that 
the OpenSSL buffer is completely filled with:

 handshake record > 1K followed by
 partial max sized application data record

The SSL_peek operation will not see any application data and Proton "remembers" 
the full buffer without allowing that the handshake record has been processed 
and the buffer is no longer full.






[jira] [Created] (PROTON-2725) epoll spin locks disabled

2023-05-06 Thread Clifford Jansen (Jira)
Clifford Jansen created PROTON-2725:
---

 Summary: epoll spin locks disabled
 Key: PROTON-2725
 URL: https://issues.apache.org/jira/browse/PROTON-2725
 Project: Qpid Proton
  Issue Type: Bug
  Components: proton-c
Affects Versions: proton-c-0.38.0
Reporter: Clifford Jansen
Assignee: Clifford Jansen


PROTON-2346 has the unfortunate effect of never enabling adaptive spin locks 
even on platforms that support them.

 

PTHREAD_MUTEX_ADAPTIVE_NP is an enumeration constant, not a preprocessor macro, 
so the #ifdef test for it fails even when the platform provides it.
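
The effect is easy to reproduce in isolation.  In this sketch 
DEMO_MUTEX_ADAPTIVE is a made-up stand-in for an option that a platform header 
exposes as an enumerator; the preprocessor runs before enumerators exist, so 
the #ifdef test is false even though the name is usable in ordinary code.

```c
#include <assert.h>

/* Stand-in for a platform header that defines an option as an enumerator
 * rather than as a macro (the name is invented for this example). */
enum { DEMO_MUTEX_ADAPTIVE = 3 };

/* The preprocessor knows nothing about enumerators, so this takes #else. */
#ifdef DEMO_MUTEX_ADAPTIVE
#define DEMO_SEEN_BY_IFDEF 1
#else
#define DEMO_SEEN_BY_IFDEF 0
#endif

/* What the preprocessor test concluded. */
static int demo_seen_by_ifdef(void) { return DEMO_SEEN_BY_IFDEF; }

/* The enumerator is nevertheless available to the compiler proper. */
static int demo_value(void) { return DEMO_MUTEX_ADAPTIVE; }
```

Detecting such an option therefore generally needs a compile-time probe done 
by the build system rather than a preprocessor conditional.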






[jira] [Resolved] (PROTON-2673) Proactor hangs if pn_raw_connection_wake() is called with outstanding connection attempt

2023-04-11 Thread Clifford Jansen (Jira)


 [ 
https://issues.apache.org/jira/browse/PROTON-2673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Clifford Jansen resolved PROTON-2673.
-
Fix Version/s: proton-c-0.39.0
   Resolution: Fixed

> Proactor hangs if pn_raw_connection_wake() is called with outstanding 
> connection attempt
> 
>
> Key: PROTON-2673
> URL: https://issues.apache.org/jira/browse/PROTON-2673
> Project: Qpid Proton
>  Issue Type: Bug
>  Components: proton-c
>Affects Versions: proton-c-0.38.0, proton-c-0.39.0
>Reporter: Ken Giusti
>Assignee: Clifford Jansen
>Priority: Major
> Fix For: proton-c-0.39.0
>
> Attachments: raw_wake.c
>
>
> If pn_raw_connection_wake() is called on a raw connection that is attempting 
> to connect, no further proactor events are generated and the proactor hangs.
> Important observations:
>  * This only occurs {_}if there is no server available at the target address 
> for the connection{_}. If a server is present then the PN_RAW_CONNECTION_WAKE 
> and PN_RAW_CONNECTION_CONNECTED events arrive properly (in that order).
>  * The host address is "localhost" - using "127.0.0.1" or "::1" instead 
> works. localhost on my machine maps to both "127.0.0.1" and "::1"
>  * Extra bonus: if you move the call to pn_raw_connection_wake() to _before_ 
> the call to pn_proactor_raw_connect() a crash occurs
> See attached reproducer.






[jira] [Created] (PROTON-2699) Turn off proactor fdlimit test by default

2023-03-29 Thread Clifford Jansen (Jira)
Clifford Jansen created PROTON-2699:
---

 Summary: Turn off proactor fdlimit test by default
 Key: PROTON-2699
 URL: https://issues.apache.org/jira/browse/PROTON-2699
 Project: Qpid Proton
  Issue Type: Bug
  Components: proton-c
Affects Versions: proton-c-0.38.0
Reporter: Clifford Jansen
Assignee: Clifford Jansen
 Fix For: proton-c-0.39.0


The fdlimit test has had many tweaks over the years yet remains sensitive to 
changes in OS versions, Python versions, parallelism of the test, system 
resources... i.e. it is flaky.  Keep it around but off by default.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: dev-unsubscr...@qpid.apache.org
For additional commands, e-mail: dev-h...@qpid.apache.org



[jira] [Commented] (PROTON-2673) Proactor hangs if pn_raw_connection_wake() is called with outstanding connection attempt

2023-03-24 Thread Clifford Jansen (Jira)


[ 
https://issues.apache.org/jira/browse/PROTON-2673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17704485#comment-17704485
 ] 

Clifford Jansen commented on PROTON-2673:
-

PN_RAW_CONNECTION_WAKE can now be the first event ahead of a successful 
connected event.

 

The doc for pn_raw_connection_wake() has also been updated to clarify when it 
has defined results.  The restrictions could be looser, along the lines of 
AMQP connections, but that would require extra locking unhelpful for the normal 
use case.  If this is too restrictive for the application, this could be 
revisited.

> Proactor hangs if pn_raw_connection_wake() is called with outstanding 
> connection attempt
> 
>
> Key: PROTON-2673
> URL: https://issues.apache.org/jira/browse/PROTON-2673
> Project: Qpid Proton
>  Issue Type: Bug
>  Components: proton-c
>Affects Versions: proton-c-0.38.0, proton-c-0.39.0
>Reporter: Ken Giusti
>Assignee: Clifford Jansen
>Priority: Major
> Attachments: raw_wake.c
>
>
> If pn_raw_connection_wake() is called on a raw connection that is attempting 
> to connect, no further proactor events are generated and the proactor hangs.
> Important observations:
>  * This only occurs _if there is no server available at the target address 
> for the connection_. If a server is present then the PN_RAW_CONNECTION_WAKE 
> and PN_RAW_CONNECTION_CONNECTED events arrive properly (in that order).
>  * The host address is "localhost" - using "127.0.0.1" or "::1" instead 
> works. localhost on my machine maps to both "127.0.0.1" and "::1"
>  * Extra bonus: if you move the call to pn_raw_connection_wake() to _before_ 
> the call to pn_proactor_raw_connect() a crash occurs
> See attached reproducer.






[jira] [Created] (PROTON-2695) Epoll proactor raw connections hang on incomplete batches

2023-03-23 Thread Clifford Jansen (Jira)
Clifford Jansen created PROTON-2695:
---

 Summary: Epoll proactor raw connections hang on incomplete batches
 Key: PROTON-2695
 URL: https://issues.apache.org/jira/browse/PROTON-2695
 Project: Qpid Proton
  Issue Type: Bug
  Components: proton-c
Affects Versions: proton-c-0.38.0
Reporter: Clifford Jansen
Assignee: Clifford Jansen


If an application returns a batch before draining all available events from it, 
the internal state machine may not have completed the steps needed to determine 
the correct polling events of interest, leaving the associated task in a hung 
state.

This is particularly relevant for the Catch2 test harness using the proactor.






[jira] [Created] (PROTON-2658) Proton TLS library - buffer leak on cleanup

2022-12-05 Thread Clifford Jansen (Jira)
Clifford Jansen created PROTON-2658:
---

 Summary: Proton TLS library - buffer leak on cleanup
 Key: PROTON-2658
 URL: https://issues.apache.org/jira/browse/PROTON-2658
 Project: Qpid Proton
  Issue Type: Bug
  Components: proton-c
Affects Versions: proton-c-0.38.0
Reporter: Clifford Jansen
Assignee: Clifford Jansen
 Fix For: proton-c-0.39.0


pn_tls_stop() should make all staged buffers retrievable on subsequent buffer 
get operations.






[jira] [Resolved] (PROTON-2643) SSL connection hanging

2022-11-24 Thread Clifford Jansen (Jira)


 [ 
https://issues.apache.org/jira/browse/PROTON-2643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Clifford Jansen resolved PROTON-2643.
-
Fix Version/s: proton-c-0.39.0
 Assignee: Clifford Jansen
   Resolution: Fixed

> SSL connection hanging
> --
>
> Key: PROTON-2643
> URL: https://issues.apache.org/jira/browse/PROTON-2643
> Project: Qpid Proton
>  Issue Type: Bug
>Affects Versions: proton-c-0.37.0
> Environment: Qpid-proton 0.37 with epoll proactor and openssl 1.0.2k 
> running on centos7
>Reporter: Fredrik Hallenberg
>Assignee: Clifford Jansen
>Priority: Major
> Fix For: proton-c-0.39.0
>
> Attachments: ssl-issue-3.zip
>
>
> With a CA bundle of a certain size the SSL/TLS connection process hangs. This 
> is 100% repeatable. The process stops before reaching verification callback, 
> it seems there is an issue with reading from the BIO sockets. I can only 
> repeat it with certain CA bundles, it seems they have to contain >100 
> certificates but I have not found an obvious pattern. It does happen with my 
> current system bundle (/etc/ssl/certs/ca-bundle.crt). 
> I enclose an example with appropriate keys and bundles, the code is based on 
> the cpp ssl example in the proton release. See the readme file on how to run 
> it. Basically it will build a proton server from the example code and connect 
> to it using openssl s_client. There is a good and a bad bundle included. The 
> good one has a few less certificates than the big one but is otherwise the 
> same. If using the bad bundle the connection process will stop after a few 
> ssl read/writes. With the good bundle it proceeds as expected.
>  






[jira] [Commented] (PROTON-2643) SSL connection hanging

2022-11-24 Thread Clifford Jansen (Jira)


[ 
https://issues.apache.org/jira/browse/PROTON-2643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17638383#comment-17638383
 ] 

Clifford Jansen commented on PROTON-2643:
-

This looks to me like an OpenSSL bug.  The server CertificateRequest 
(constructed from the ca-bad.pem example) plus the rest of the server's first 
response is just a bit larger than 17K, which happens to be the buffer size of 
the BIO.  There have been several bugs fixed over the years relating to hangs 
on the BIO, but I could not find an exact match to this case.  It appears fixed 
in OpenSSL 1.1 and above, so perhaps it was fixed accidentally as part of some 
other BIO hang bug.

One workaround is to trim the CA list to get the overall server's response 
below 17K (by removing unnecessary certs from the CA database).  It is also 
possible that increasing the CA list with dummy entries might also work (since 
the CertificateRequest size can be up to 64K and there are presumably tests for 
that edge case).

Another workaround is to have the Proton code poke the OpenSSL session instance 
during the handshake phase to get it to "notice" opportunities to replenish the 
BIO buffer.  I would normally be reluctant to add code like this but it has 
tiny overhead and, purely by coincidence, makes the operation slightly more 
similar to the new Proton TLS library for raw connections.  This may result in 
reducing other bug variations between the two.

> SSL connection hanging
> --
>
> Key: PROTON-2643
> URL: https://issues.apache.org/jira/browse/PROTON-2643
> Project: Qpid Proton
>  Issue Type: Bug
>Affects Versions: proton-c-0.37.0
> Environment: Qpid-proton 0.37 with epoll proactor and openssl 1.0.2k 
> running on centos7
>Reporter: Fredrik Hallenberg
>Priority: Major
> Attachments: ssl-issue-3.zip
>
>
> With a CA bundle of a certain size the SSL/TLS connection process hangs. This 
> is 100% repeatable. The process stops before reaching verification callback, 
> it seems there is an issue with reading from the BIO sockets. I can only 
> repeat it with certain CA bundles, it seems they have to contain >100 
> certificates but I have not found an obvious pattern. It does happen with my 
> current system bundle (/etc/ssl/certs/ca-bundle.crt). 
> I enclose an example with appropriate keys and bundles, the code is based on 
> the cpp ssl example in the proton release. See the readme file on how to run 
> it. Basically it will build a proton server from the example code and connect 
> to it using openssl s_client. There is a good and a bad bundle included. The 
> good one has a few less certificates than the big one but is otherwise the 
> same. If using the bad bundle the connection process will stop after a few 
> ssl read/writes. With the good bundle it proceeds as expected.
>  






[jira] [Resolved] (PROTON-2647) Fix FLOW event processing in send-abort example.

2022-11-07 Thread Clifford Jansen (Jira)


 [ 
https://issues.apache.org/jira/browse/PROTON-2647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Clifford Jansen resolved PROTON-2647.
-
Resolution: Fixed

> Fix FLOW event processing in send-abort example.
> 
>
> Key: PROTON-2647
> URL: https://issues.apache.org/jira/browse/PROTON-2647
> Project: Qpid Proton
>  Issue Type: Bug
>  Components: proton-c
>Affects Versions: proton-c-0.37.0
>Reporter: Clifford Jansen
>Assignee: Clifford Jansen
>Priority: Major
> Fix For: proton-c-0.38.0
>
>
> The current send-abort example program relies on a cadence of FLOW events, 
> some self generated and some originating from the peer.  This cadence can be 
> disrupted by the timing of frames at each peer.  It can also be disrupted 
> by additional self generated FLOW frames in the case of smaller 
> max-frame-size configurations which may be chunked between event batches.
> The program can be made deterministic by not counting FLOW events but by 
> checking the actual state change that may be expected with a FLOW event.
>  






[jira] [Commented] (PROTON-2647) Fix FLOW event processing in send-abort example.

2022-11-06 Thread Clifford Jansen (Jira)


[ 
https://issues.apache.org/jira/browse/PROTON-2647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17629629#comment-17629629
 ] 

Clifford Jansen commented on PROTON-2647:
-

This bug went unnoticed until the default max frame size was recently changed.

> Fix FLOW event processing in send-abort example.
> 
>
> Key: PROTON-2647
> URL: https://issues.apache.org/jira/browse/PROTON-2647
> Project: Qpid Proton
>  Issue Type: Bug
>  Components: proton-c
>Affects Versions: proton-c-0.37.0
>Reporter: Clifford Jansen
>Assignee: Clifford Jansen
>Priority: Major
> Fix For: proton-c-0.38.0
>
>
> The current send-abort example program relies on a cadence of FLOW events, 
> some self generated and some originating from the peer.  This cadence can be 
> disrupted by the timing of frames at each peer.  It can also be disrupted 
> by additional self generated FLOW frames in the case of smaller 
> max-frame-size configurations which may be chunked between event batches.
> The program can be made deterministic by not counting FLOW events but by 
> checking the actual state change that may be expected with a FLOW event.
>  






[jira] [Created] (PROTON-2647) Fix FLOW event processing in send-abort example.

2022-11-06 Thread Clifford Jansen (Jira)
Clifford Jansen created PROTON-2647:
---

 Summary: Fix FLOW event processing in send-abort example.
 Key: PROTON-2647
 URL: https://issues.apache.org/jira/browse/PROTON-2647
 Project: Qpid Proton
  Issue Type: Bug
  Components: proton-c
Affects Versions: proton-c-0.37.0
Reporter: Clifford Jansen
Assignee: Clifford Jansen
 Fix For: proton-c-0.38.0


The current send-abort example program relies on a cadence of FLOW events, some 
self generated and some originating from the peer.  This cadence can be 
disrupted by the timing of frames at each peer.  It can also be disrupted by 
additional self generated FLOW frames in the case of smaller max-frame-size 
configurations which may be chunked between event batches.

The program can be made deterministic by not counting FLOW events but by 
checking the actual state change that may be expected with a FLOW event.
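
The idea of reacting to state rather than to the number of FLOW events can be 
sketched as follows.  This is only an illustrative model, not the send-abort 
example itself: sketch_sender_t and sketch_on_flow() are hypothetical names, 
and the credit is passed in by hand.  In a real program the state would 
presumably be read from the link (for instance with pn_link_credit()).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified sender state; not a Proton type. */
typedef struct {
  int credit;  /* credit currently available for sending */
  int sent;    /* messages sent so far */
  int target;  /* total messages to send */
} sketch_sender_t;

/* Called for every FLOW-like event, however many arrive and in whatever
 * order.  The decision to send depends only on the current state (credit
 * and progress), never on how many events have been seen.
 * Returns the number of messages sent by this call. */
int sketch_on_flow(sketch_sender_t *s, int new_credit) {
  assert(s != NULL && new_credit >= 0);
  s->credit += new_credit;
  int n = 0;
  while (s->credit > 0 && s->sent < s->target) {
    s->credit--;
    s->sent++;
    n++;
  }
  return n;
}
```

Extra events that carry no new credit are harmless here, which is what makes 
the outcome independent of frame timing and chunking.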

 






[jira] [Resolved] (PROTON-2586) TLS OpenSSL library: incomplete decryption/encryption of staged buffers

2022-10-31 Thread Clifford Jansen (Jira)


 [ 
https://issues.apache.org/jira/browse/PROTON-2586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Clifford Jansen resolved PROTON-2586.
-
Resolution: Fixed

> TLS OpenSSL library: incomplete decryption/encryption of staged buffers
> ---
>
> Key: PROTON-2586
> URL: https://issues.apache.org/jira/browse/PROTON-2586
> Project: Qpid Proton
>  Issue Type: Bug
>  Components: proton-c
>Affects Versions: proton-c-0.37.0
>Reporter: Clifford Jansen
>Assignee: Clifford Jansen
>Priority: Major
>
> OpenSSL processes TLS records one at a time.  It does its conversion work in 
> buffers just larger than a maximum sized TLS record (16K).  When processing 
> large sized input and output buffers in a single pn_tls_process() call, the 
> Proton TLS library has to loop inserting unprocessed data into the small 
> OpenSSL buffer and extract the encrypted/decrypted data into the output 
> buffer and free space for the next iteration.  The code currently can exit 
> the loop prematurely.






[jira] [Resolved] (PROTON-2535) TLS library - false indication of user data in OpenSSL

2022-10-31 Thread Clifford Jansen (Jira)


 [ 
https://issues.apache.org/jira/browse/PROTON-2535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Clifford Jansen resolved PROTON-2535.
-
Resolution: Fixed

> TLS library - false indication of user data in OpenSSL
> --
>
> Key: PROTON-2535
> URL: https://issues.apache.org/jira/browse/PROTON-2535
> Project: Qpid Proton
>  Issue Type: Bug
>  Components: proton-c
>Affects Versions: proton-c-0.37.0
> Environment: OpenSSL
>Reporter: Clifford Jansen
>Assignee: Clifford Jansen
>Priority: Major
> Fix For: proton-c-0.38.0
>
>
> pn_tls_need_decrypt_output_buffers can falsely indicate the availability of 
> user data.  For example if there is a handshake failure, BIO_pending can 
> indicate the presence of bytes but BIO_read will return -1 and the 
> appropriate error.
> An application may be fooled into providing a decrypt output buffer that 
> won't immediately be returned after the next pn_tls_process() step, since 
> no bytes will be read into it.






[jira] [Commented] (PROTON-2471) Run raw connection examples during proton-c examples test

2022-10-31 Thread Clifford Jansen (Jira)


[ 
https://issues.apache.org/jira/browse/PROTON-2471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17626967#comment-17626967
 ] 

Clifford Jansen commented on PROTON-2471:
-

My best suggestion is to see if pn_raw_connection() returns a non-NULL value.

 

If this is insufficient, we may have to add an equivalent to pn_ssl_present( 
void );

> Run raw connection examples during proton-c examples test
> -
>
> Key: PROTON-2471
> URL: https://issues.apache.org/jira/browse/PROTON-2471
> Project: Qpid Proton
>  Issue Type: Test
>  Components: examples, proton-c
>Affects Versions: proton-c-0.36.0, proton-c-0.37.0
>Reporter: Jiri Daněk
>Assignee: Jiri Daněk
>Priority: Major
> Fix For: proton-c-future
>
>







[jira] [Resolved] (PROTON-2622) TLS OpenSSL library: ensure capacity values match given capacity

2022-10-31 Thread Clifford Jansen (Jira)


 [ 
https://issues.apache.org/jira/browse/PROTON-2622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Clifford Jansen resolved PROTON-2622.
-
Resolution: Fixed

> TLS OpenSSL library: ensure capacity values match given capacity
> 
>
> Key: PROTON-2622
> URL: https://issues.apache.org/jira/browse/PROTON-2622
> Project: Qpid Proton
>  Issue Type: Wish
>  Components: proton-c
>Affects Versions: proton-c-0.38.0
>Reporter: Ken Giusti
>Assignee: Clifford Jansen
>Priority: Major
>
>  pn_tls_get_encrypt/decrypt_input_buffer_capacity() unconditionally return 
> the number of empty buffer slots.
> However pn_tls_give_encrypt/decrypt_input_buffers() checks the state of the 
> tls session and can take zero buffers even though get capacity returned > 0.
> In this case the application will have to "unwind" any buffer 
> allocation/setup work it did expecting there was capacity available.






[jira] [Created] (PROTON-2642) add tests for buffer capacity

2022-10-31 Thread Clifford Jansen (Jira)
Clifford Jansen created PROTON-2642:
---

 Summary: add tests for buffer capacity
 Key: PROTON-2642
 URL: https://issues.apache.org/jira/browse/PROTON-2642
 Project: Qpid Proton
  Issue Type: Improvement
  Components: proton-c
Affects Versions: proton-c-0.37.0
Reporter: Clifford Jansen
Assignee: Clifford Jansen


Add test for correct buffer capacity, specifically for

 

  https://issues.apache.org/jira/browse/PROTON-2622






[jira] [Resolved] (PROTON-2641) use consistent socket io cals in epoll proactor

2022-10-31 Thread Clifford Jansen (Jira)


 [ 
https://issues.apache.org/jira/browse/PROTON-2641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Clifford Jansen resolved PROTON-2641.
-
Resolution: Fixed

actual fix is in 93960f1e2129cf98200bdb2ab31e9ad868f71f61

> use consistent socket io cals in epoll proactor
> ---
>
> Key: PROTON-2641
> URL: https://issues.apache.org/jira/browse/PROTON-2641
> Project: Qpid Proton
>  Issue Type: Improvement
>  Components: proton-c
>Affects Versions: proton-c-0.37.0
>Reporter: Clifford Jansen
>Assignee: Clifford Jansen
>Priority: Minor
> Fix For: proton-c-0.38.0
>
>
> Epoll proactor currently uses send/read for IO.  For consistency it should 
> use write/read or send/recv.  The latter allows the kernel to skip code 
> handling the generic to specific transition and is the more performant option 
> (even if rarely measurable).






[jira] [Created] (PROTON-2641) use consistent socket io cals in epoll proactor

2022-10-31 Thread Clifford Jansen (Jira)
Clifford Jansen created PROTON-2641:
---

 Summary: use consistent socket io cals in epoll proactor
 Key: PROTON-2641
 URL: https://issues.apache.org/jira/browse/PROTON-2641
 Project: Qpid Proton
  Issue Type: Improvement
  Components: proton-c
Affects Versions: proton-c-0.37.0
Reporter: Clifford Jansen
Assignee: Clifford Jansen
 Fix For: proton-c-0.38.0


Epoll proactor currently uses send/read for IO.  For consistency it should use 
write/read or send/recv.  The latter allows the kernel to skip code handling 
the generic to specific transition and is the more performant option (even if 
rarely measurable).
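
For illustration, a minimal sketch of the consistent send/recv pairing on a 
socket.  It uses a local socketpair() rather than anything from the epoll 
proactor, and echo_once() is a hypothetical helper name, so treat it as a 
standalone demonstration of the two calls and not as the proactor code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Write msg to one end of a connected socket pair with send() and read it
 * back from the other end with recv(), i.e. the socket-specific calls are
 * used on both sides.  Returns the number of bytes received, or -1. */
ssize_t echo_once(const char *msg, size_t len, char *reply, size_t cap) {
  assert(msg != NULL && reply != NULL);
  int sv[2];
  if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0) return -1;
  ssize_t got = -1;
  if (send(sv[0], msg, len, 0) >= 0) {
    got = recv(sv[1], reply, cap, 0);
  }
  close(sv[0]);
  close(sv[1]);
  return got;
}
```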






[jira] [Resolved] (PROTON-2640) Set a reasonable default maximum frame size

2022-10-31 Thread Clifford Jansen (Jira)


 [ 
https://issues.apache.org/jira/browse/PROTON-2640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Clifford Jansen resolved PROTON-2640.
-
Resolution: Fixed

> Set a reasonable default maximum frame size
> ---
>
> Key: PROTON-2640
> URL: https://issues.apache.org/jira/browse/PROTON-2640
> Project: Qpid Proton
>  Issue Type: Improvement
>  Components: cpp-binding, proton-c, python-binding
>Affects Versions: proton-c-0.38.0
>Reporter: Clifford Jansen
>Assignee: Clifford Jansen
>Priority: Major
> Fix For: proton-c-0.38.0
>
>
> The default is currently MAXINT.
>  
> Instrumenting using quiver shows 32k is a reasonable tradeoff of reduced 
> latency between transmissions and additional byte overhead for large messages.






[jira] [Created] (PROTON-2640) Set a reasonable default maximum frame size

2022-10-31 Thread Clifford Jansen (Jira)
Clifford Jansen created PROTON-2640:
---

 Summary: Set a reasonable default maximum frame size
 Key: PROTON-2640
 URL: https://issues.apache.org/jira/browse/PROTON-2640
 Project: Qpid Proton
  Issue Type: Improvement
  Components: cpp-binding, proton-c, python-binding
Affects Versions: proton-c-0.38.0
Reporter: Clifford Jansen
Assignee: Clifford Jansen
 Fix For: proton-c-0.38.0


The default is currently MAXINT.

 

Instrumenting using quiver shows 32k is a reasonable tradeoff of reduced 
latency between transmissions and additional byte overhead for large messages.
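
To make the byte-overhead side of that tradeoff concrete, here is a rough 
sketch of how many frames a large message needs for a given maximum frame 
size.  sketch_frame_count() is a hypothetical helper, and the per-frame 
overhead is a caller-supplied assumption (the fixed AMQP frame header is 8 
bytes; the transfer performative adds more and varies).

```c
#include <assert.h>
#include <stdint.h>

/* Number of frames needed to carry payload_len bytes when every frame can
 * hold (max_frame_size - per_frame_overhead) bytes of payload.  Smaller
 * frames mean more frames, hence more total header bytes on the wire. */
uint64_t sketch_frame_count(uint64_t payload_len, uint32_t max_frame_size,
                            uint32_t per_frame_overhead) {
  assert(max_frame_size > per_frame_overhead);
  uint64_t room = (uint64_t)max_frame_size - per_frame_overhead;
  return (payload_len + room - 1) / room;  /* ceiling division */
}
```

With a 32768 byte frame and an assumed 8 byte overhead, a 1000000 byte 
payload works out to 31 frames under this model.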






[jira] [Commented] (PROTON-2639) write flush capability for libuv and Windows

2022-10-31 Thread Clifford Jansen (Jira)


[ 
https://issues.apache.org/jira/browse/PROTON-2639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17626701#comment-17626701
 ] 

Clifford Jansen commented on PROTON-2639:
-

See https://issues.apache.org/jira/browse/PROTON-2633

> write flush capability for libuv and Windows
> 
>
> Key: PROTON-2639
> URL: https://issues.apache.org/jira/browse/PROTON-2639
> Project: Qpid Proton
>  Issue Type: Improvement
>  Components: proton-c
>Affects Versions: proton-c-0.38.0
>Reporter: Clifford Jansen
>Assignee: Clifford Jansen
>Priority: Minor
>
> There is a version implemented for the epoll proactor. Track here the pending 
> work for the other proactors.






[jira] [Resolved] (PROTON-2633) Proactor: allow early writes to reduce latency

2022-10-31 Thread Clifford Jansen (Jira)


 [ 
https://issues.apache.org/jira/browse/PROTON-2633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Clifford Jansen resolved PROTON-2633.
-
Resolution: Fixed

> Proactor: allow early writes to reduce latency
> --
>
> Key: PROTON-2633
> URL: https://issues.apache.org/jira/browse/PROTON-2633
> Project: Qpid Proton
>  Issue Type: Improvement
>  Components: proton-c
>Affects Versions: proton-c-0.37.0
>Reporter: Clifford Jansen
>Assignee: Clifford Jansen
>Priority: Major
>
> A new API call to instruct the proactor implementation to extract pending 
> output from the Proton engine and immediately deliver what it can to the 
> operating system for transmission to the peer.






[jira] [Created] (PROTON-2639) write flush capability for libuv and Windows

2022-10-31 Thread Clifford Jansen (Jira)
Clifford Jansen created PROTON-2639:
---

 Summary: write flush capability for libuv and Windows
 Key: PROTON-2639
 URL: https://issues.apache.org/jira/browse/PROTON-2639
 Project: Qpid Proton
  Issue Type: Improvement
  Components: proton-c
Affects Versions: proton-c-0.38.0
Reporter: Clifford Jansen
Assignee: Clifford Jansen


There is a version implemented for the epoll proactor. Track here the pending 
work for the other proactors.






[jira] [Created] (PROTON-2633) Proactor: allow early writes to reduce latency

2022-10-26 Thread Clifford Jansen (Jira)
Clifford Jansen created PROTON-2633:
---

 Summary: Proactor: allow early writes to reduce latency
 Key: PROTON-2633
 URL: https://issues.apache.org/jira/browse/PROTON-2633
 Project: Qpid Proton
  Issue Type: Improvement
  Components: proton-c
Affects Versions: proton-c-0.37.0
Reporter: Clifford Jansen
Assignee: Clifford Jansen


A new API call to instruct the proactor implementation to extract pending 
output from the Proton engine and immediately deliver what it can to the 
operating system for transmission to the peer.






[jira] [Resolved] (PROTON-2613) TLS OpenSSL library: write channel not fully configured.

2022-09-15 Thread Clifford Jansen (Jira)


 [ 
https://issues.apache.org/jira/browse/PROTON-2613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Clifford Jansen resolved PROTON-2613.
-
Fix Version/s: proton-c-0.38.0
   Resolution: Fixed

> TLS OpenSSL library: write channel not fully configured.
> 
>
> Key: PROTON-2613
> URL: https://issues.apache.org/jira/browse/PROTON-2613
> Project: Qpid Proton
>  Issue Type: Bug
>  Components: proton-c
>Affects Versions: proton-c-0.37.0
>Reporter: Clifford Jansen
>Assignee: Clifford Jansen
>Priority: Major
> Fix For: proton-c-0.38.0
>
>
> The library code assumes that write operations provide more detail on partial 
> writes than just "try again later".  There is a configuration option that 
> makes the low level SSL write operations more like BIO and Posix write 
> semantics.






[jira] [Resolved] (PROTON-2612) TLS OpenSSL library: uninitialized raw buffer size for output buffers

2022-09-15 Thread Clifford Jansen (Jira)


 [ 
https://issues.apache.org/jira/browse/PROTON-2612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Clifford Jansen resolved PROTON-2612.
-
Fix Version/s: proton-c-0.38.0
   Resolution: Fixed

> TLS OpenSSL library: uninitialized raw buffer size for output buffers
> -
>
> Key: PROTON-2612
> URL: https://issues.apache.org/jira/browse/PROTON-2612
> Project: Qpid Proton
>  Issue Type: Bug
>  Components: proton-c
>Affects Versions: proton-c-0.37.0
>Reporter: Clifford Jansen
>Assignee: Clifford Jansen
>Priority: Major
> Fix For: proton-c-0.38.0
>
>
> For TLS library output buffers (used for reading into), the size must be set 
> to zero regardless of its value when provided by the application... but is 
> not.  This prevents the full capacity of the buffers from being used.






[jira] [Created] (PROTON-2613) TLS OpenSSL library: write channel not fully configured.

2022-09-15 Thread Clifford Jansen (Jira)
Clifford Jansen created PROTON-2613:
---

 Summary: TLS OpenSSL library: write channel not fully configured.
 Key: PROTON-2613
 URL: https://issues.apache.org/jira/browse/PROTON-2613
 Project: Qpid Proton
  Issue Type: Bug
  Components: proton-c
Affects Versions: proton-c-0.37.0
Reporter: Clifford Jansen
Assignee: Clifford Jansen


The library code assumes that write operations provide more detail on partial 
writes than just "try again later".  There is a configuration option that makes 
the low level SSL write operations more like BIO and Posix write semantics.






[jira] [Created] (PROTON-2612) TLS OpenSSL library: uninitialized raw buffer size for output buffers

2022-09-15 Thread Clifford Jansen (Jira)
Clifford Jansen created PROTON-2612:
---

 Summary: TLS OpenSSL library: uninitialized raw buffer size for 
output buffers
 Key: PROTON-2612
 URL: https://issues.apache.org/jira/browse/PROTON-2612
 Project: Qpid Proton
  Issue Type: Bug
  Components: proton-c
Affects Versions: proton-c-0.37.0
Reporter: Clifford Jansen
Assignee: Clifford Jansen


For TLS library output buffers (used for reading into), the size must be set to 
zero regardless of its value when provided by the application... but is not.  
This prevents the full capacity of the buffers from being used.
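
A small sketch of why a stale size matters.  The struct below is a 
hypothetical stand-in with capacity/size/offset fields, loosely in the spirit 
of a raw buffer descriptor; it is not the library's type, and the helper names 
are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical buffer descriptor for this sketch only. */
typedef struct {
  char *bytes;
  uint32_t capacity;  /* total bytes available at bytes */
  uint32_t size;      /* bytes of valid data currently held */
  uint32_t offset;    /* start of valid data within bytes */
} sketch_buffer_t;

/* Room left for writing, as a consumer of the descriptor would compute it. */
uint32_t sketch_room(const sketch_buffer_t *b) {
  assert(b != NULL && b->offset + b->size <= b->capacity);
  return b->capacity - b->offset - b->size;
}

/* Prepare a buffer that will be read into: whatever size the application
 * left in the descriptor is ignored and reset to zero. */
void sketch_prepare_output(sketch_buffer_t *b) {
  assert(b != NULL);
  b->size = 0;
}
```

If the application hands over a 64 byte buffer whose size field still says 
40, only 24 bytes appear usable until the size is reset.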






[jira] [Created] (PROTON-2586) TLS OpenSSL library: incomplete decryption/encryption of staged buffers

2022-08-02 Thread Clifford Jansen (Jira)
Clifford Jansen created PROTON-2586:
---

 Summary: TLS OpenSSL library: incomplete decryption/encryption of 
staged buffers
 Key: PROTON-2586
 URL: https://issues.apache.org/jira/browse/PROTON-2586
 Project: Qpid Proton
  Issue Type: Bug
  Components: proton-c
Affects Versions: proton-c-0.37.0
Reporter: Clifford Jansen
Assignee: Clifford Jansen


OpenSSL processes TLS records one at a time.  It does its conversion work in 
buffers just larger than a maximum sized TLS record (16K).  When processing 
large sized input and output buffers in a single pn_tls_process() call, the 
Proton TLS library has to loop inserting unprocessed data into the small 
OpenSSL buffer and extract the encrypted/decrypted data into the output buffer 
and free space for the next iteration.  The code currently can exit the loop 
prematurely.
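
The shape of such a loop can be sketched without OpenSSL.  In this 
illustration a fixed-size staging buffer stands in for the small OpenSSL work 
buffer and a trivial XOR stands in for encryption/decryption; process_all() 
and transform() are hypothetical names, not Proton or OpenSSL functions.  The 
point is the exit condition: keep going until the input is exhausted or the 
output is full, not after the first chunk.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define STAGING_SIZE (16 * 1024)  /* stand-in for the small work buffer */

/* Hypothetical stand-in for the cipher: XOR each byte with a fixed value. */
static void transform(unsigned char *buf, size_t len) {
  for (size_t i = 0; i < len; i++) buf[i] ^= 0x5a;
}

/* Convert as much of in as fits into out, one staging chunk per iteration.
 * Returns the number of bytes converted. */
size_t process_all(const unsigned char *in, size_t in_len,
                   unsigned char *out, size_t out_cap) {
  assert(in != NULL && out != NULL);
  unsigned char staging[STAGING_SIZE];
  size_t done = 0;
  while (done < in_len && done < out_cap) {
    size_t chunk = in_len - done;
    if (chunk > STAGING_SIZE) chunk = STAGING_SIZE;
    if (chunk > out_cap - done) chunk = out_cap - done;
    memcpy(staging, in + done, chunk);   /* insert unprocessed data */
    transform(staging, chunk);           /* convert in the small buffer */
    memcpy(out + done, staging, chunk);  /* extract, freeing the staging area */
    done += chunk;
  }
  return done;
}
```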






[jira] [Commented] (PROTON-2543) Crash in epoll.c resched_pop_front

2022-06-01 Thread Clifford Jansen (Jira)


[ 
https://issues.apache.org/jira/browse/PROTON-2543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17545037#comment-17545037
 ] 

Clifford Jansen commented on PROTON-2543:
-

Thank you for the update.

I will keep this open a bit longer and see if I can't get lucky on reproducing 
it myself with a few tweaks to my existing soak tests.

If you can answer a subset of the questions I asked earlier, whatever is quick 
and easy, that may help me zero in on the bug.

Thanks.

> Crash in epoll.c resched_pop_front
> --
>
> Key: PROTON-2543
> URL: https://issues.apache.org/jira/browse/PROTON-2543
> Project: Qpid Proton
>  Issue Type: Bug
>  Components: proton-c
>Reporter: Fredrik Hallenberg
>Assignee: Clifford Jansen
>Priority: Major
> Attachments: qpid-epoll-crash.patch
>
>
> During stress testing it is fairly easy to reproduce a segfault in 
> resched_pop_front. Using gdb it is easy to see that polled_resched_front can 
> be zero when entering this function which causes the value to wrap and then a 
> crash in later calls.
> polled_resched_front is not checked when calling this function in one 
> instance. The trivial fix of checking this value, seen in the attached patch, 
> seems to work.
> Tested with Qpid Proton C++ 0.37.
>  






[jira] [Commented] (PROTON-2543) Crash in epoll.c resched_pop_front

2022-05-30 Thread Clifford Jansen (Jira)


[ 
https://issues.apache.org/jira/browse/PROTON-2543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17544161#comment-17544161
 ] 

Clifford Jansen commented on PROTON-2543:
-

I don't know if you have had any time to try to gather further information 
about the crash that you are seeing.

It would certainly help me to be of greater assistance if you could provide 
more details about the environment where you see the crash:

* cpu hardware type and model
* OS and version
* compiler (gcc/clang/other)
* Number of concurrent threads servicing proactor event batches
* Number of active proactors in failing process (usually 1)
* Running on bare hardware, VM, container
* crash occurs during main operation or on shutdown (or both)
* Types of connections and listeners
** All outgoing connections
** All incoming connections and listeners
** Mix of both (describe)
** Mainly/only pn_raw_connection_t or pn_connection_t connections.
** connections are over a network/virtual network/loopback

If you are having difficulty reproducing the crash in debug mode, perhaps I 
could provide an instrumented version of epoll.c that could give us recent 
proactor history and help debug the problem.

Also, if you could provide a debugger dump of the failing pn_proactor_t at time 
of one of your crashes, that might help me think of other things to explore.

Thank you for any information you can provide.







[jira] [Commented] (PROTON-2543) Crash in epoll.c resched_pop_front

2022-05-24 Thread Clifford Jansen (Jira)


[ 
https://issues.apache.org/jira/browse/PROTON-2543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17541594#comment-17541594
 ] 

Clifford Jansen commented on PROTON-2543:
-

https://rr-project.org/

The related package name is "rr" on Fedora and Ubuntu.

If you can catch the failure in rr, you can reproduce exactly the run that 
failed and multi threaded bugs can be debugged more easily (you can move 
backwards and forwards in time in the debugger).

However, you may find that your reproducer fails easily outside of rr but 
stubbornly refuses to do so with rr in the mix.







[jira] [Commented] (PROTON-2543) Crash in epoll.c resched_pop_front

2022-05-24 Thread Clifford Jansen (Jira)


[ 
https://issues.apache.org/jira/browse/PROTON-2543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17541496#comment-17541496
 ] 

Clifford Jansen commented on PROTON-2543:
-

Thank you for the bug report and suggested patch.

Unfortunately your suggested fix targets the symptom you are seeing but not the 
underlying problem.

It should never be possible that p->resched_cutoff is non-null while 
p->polled_resched_count is zero, so your code should have no effect.  Yet we 
know it does.

The patch allows the proactor to keep running even though one of its critical 
scheduling lists is in an undefined state.  This could lead to crashes or hangs 
even further removed from the actual problem.

Have you tried running your reproducer with a "Debug" CMake build?  There are 
several asserts in the code that might catch the broken list earlier or point 
us closer to a good place to look.

Alternatively, can your reproducer be pared down and shared in this JIRA?

Otherwise, is it possible for you to trigger the bug using rr?  In the crash 
analysis it should be possible to check for the point at which the list loses 
its integrity from the most recent poller_do_epoll() to a subsequent 
resched_pop_front().
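
To illustrate the two points above (the counter wrap from the report, and the 
invariant that a Debug build could assert on), here is a small sketch.  It is 
not the epoll.c code: demo_sched_t and its two fields are hypothetical 
stand-ins for the resched_cutoff and polled_resched_count fields mentioned in 
this comment, and their exact meaning in the proactor is not reproduced.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified scheduler state; not the real pn_proactor_t. */
typedef struct {
  const void *cutoff;   /* stand-in for p->resched_cutoff */
  size_t polled_count;  /* stand-in for p->polled_resched_count */
} demo_sched_t;

/* Decrementing an unsigned zero wraps to SIZE_MAX instead of failing. */
static size_t unchecked_decrement(size_t count) { return count - 1; }

/* The invariant stated above: a non-null cutoff implies a nonzero count. */
static bool demo_invariant_holds(const demo_sched_t *s) {
  assert(s);
  return s->cutoff == NULL || s->polled_count > 0;
}
```

Checking demo_invariant_holds() at the points where the list is modified would 
flag the broken state when it is created, rather than at the later pop where 
the wrap becomes visible.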







[jira] [Commented] (PROTON-1870) better logging for ssl

2022-05-12 Thread Clifford Jansen (Jira)


[ 
https://issues.apache.org/jira/browse/PROTON-1870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17536142#comment-17536142
 ] 

Clifford Jansen commented on PROTON-1870:
-

Commit dfebbe8c provides TLS error alert feedback to the peer, as provided by 
default in the OpenSSL library.  It doesn't necessarily address the lack of 
detail of error messages on either side.

 

> better logging for ssl
> --
>
> Key: PROTON-1870
> URL: https://issues.apache.org/jira/browse/PROTON-1870
> Project: Qpid Proton
>  Issue Type: Improvement
>  Components: python-binding
>Affects Versions: proton-0.9.1, proton-c-0.31.0
>Reporter: Gordon Sim
>Priority: Major
>  Labels: logging, tls, usability
>
> Would be nice to have better logging for ssl connections, particularly where 
> they fail, e.g. the SNI used, the CA the peer cert is signed with, etc.






[jira] [Created] (PROTON-2535) TLS library - false indication of user data in OpenSSL

2022-04-19 Thread Clifford Jansen (Jira)
Clifford Jansen created PROTON-2535:
---

 Summary: TLS library - false indication of user data in OpenSSL
 Key: PROTON-2535
 URL: https://issues.apache.org/jira/browse/PROTON-2535
 Project: Qpid Proton
  Issue Type: Bug
  Components: proton-c
Affects Versions: proton-c-0.37.0
 Environment: OpenSSL
Reporter: Clifford Jansen
Assignee: Clifford Jansen


pn_tls_need_decrypt_output_buffers can falsely indicate the availability of 
user data.  For example if there is a handshake failure, BIO_pending can 
indicate the presence of bytes but BIO_read will return -1 and the appropriate 
error.

An application may be fooled into providing a decrypt output buffer that won't 
be immediately returned after the next pn_tls_process() step, since no bytes 
will be read into it.
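
The mismatch can be sketched without OpenSSL.  Everything below is a 
hypothetical model: demo_tls_t, demo_pending() and demo_read() merely play the 
roles of the session, BIO_pending and BIO_read, and the "strict" check is one 
possible way to avoid the false indication, not necessarily how the library 
addresses it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a TLS session's decrypt side; not OpenSSL. */
typedef struct {
  size_t pending;         /* bytes the lower layer reports as buffered */
  bool handshake_failed;  /* when set, buffered bytes are not user data */
} demo_tls_t;

/* Plays the role of a "bytes pending" query. */
static size_t demo_pending(const demo_tls_t *t) { return t->pending; }

/* Plays the role of a read: returns bytes copied, or -1 on error. */
static int demo_read(demo_tls_t *t, unsigned char *buf, size_t cap) {
  assert(t && buf);
  if (t->handshake_failed) return -1;
  size_t n = t->pending < cap ? t->pending : cap;
  memset(buf, 'x', n);
  t->pending -= n;
  return (int)n;
}

/* Naive answer to "should the caller supply an output buffer?" */
static bool naive_need_output(const demo_tls_t *t) {
  return demo_pending(t) > 0;
}

/* Stricter answer: pending bytes only count while the session is healthy. */
static bool strict_need_output(const demo_tls_t *t) {
  return !t->handshake_failed && demo_pending(t) > 0;
}
```

With a failed handshake the naive check asks for a buffer even though the read 
that follows returns -1 and leaves the buffer empty.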






[jira] [Commented] (PROTON-2522) Intermittent C fdlimit test failures

2022-03-20 Thread Clifford Jansen (Jira)


[ 
https://issues.apache.org/jira/browse/PROTON-2522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17509626#comment-17509626
 ] 

Clifford Jansen commented on PROTON-2522:
-

Preliminary investigation indicates that increasing the sleep time between 
steps in the test makes the error go away.  A more robust test mechanism is 
obviously preferable to just increasing pause times and making the tests run 
slower.

> Intermittent C fdlimit test failures
> 
>
> Key: PROTON-2522
> URL: https://issues.apache.org/jira/browse/PROTON-2522
> Project: Qpid Proton
>  Issue Type: Bug
>  Components: proton-c
>Affects Versions: proton-c-0.36.0, proton-c-0.37.0
> Environment: Specifics unknown.
> On some hardware, fails with Python 3.10 but not 3.9.
> Also seen on other hardware with Python 3.6.
> But also seen 
>Reporter: Clifford Jansen
>Assignee: Clifford Jansen
>Priority: Major
>
> The CTest: c-fdlimit-tests fails in some environments with output containing:
>  
>   /usr/lib64/python3.10/subprocess.py:1067: ResourceWarning: subprocess 27520 
> is still running
>  
> and
>  
>   self.assertNotEqual(sender.poll(), 0)
>     AssertionError: 0 == 0
>  
> First reported by Roddie Kieley and Gordon Sim.






[jira] [Created] (PROTON-2522) Intermittent C fdlimit test failures

2022-03-20 Thread Clifford Jansen (Jira)
Clifford Jansen created PROTON-2522:
---

 Summary: Intermittent C fdlimit test failures
 Key: PROTON-2522
 URL: https://issues.apache.org/jira/browse/PROTON-2522
 Project: Qpid Proton
  Issue Type: Bug
  Components: proton-c
Affects Versions: proton-c-0.36.0, proton-c-0.37.0
 Environment: Specifics unknown.

On some hardware, fails with Python 3.10 but not 3.9.

Also seen on other hardware with Python 3.6.

But also seen 
Reporter: Clifford Jansen
Assignee: Clifford Jansen


The CTest: c-fdlimit-tests fails in some environments with output containing:

  /usr/lib64/python3.10/subprocess.py:1067: ResourceWarning: subprocess 27520 is still running

and

  self.assertNotEqual(sender.poll(), 0)
    AssertionError: 0 == 0

First reported by Roddie Kieley and Gordon Sim.






[jira] [Resolved] (PROTON-2519) TLS library: null pointer reference

2022-03-14 Thread Clifford Jansen (Jira)


 [ 
https://issues.apache.org/jira/browse/PROTON-2519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Clifford Jansen resolved PROTON-2519.
-
Resolution: Fixed

> TLS library: null pointer reference
> --
>
> Key: PROTON-2519
> URL: https://issues.apache.org/jira/browse/PROTON-2519
> Project: Qpid Proton
>  Issue Type: Bug
>  Components: proton-c
>Affects Versions: proton-c-0.37.0
>Reporter: Clifford Jansen
>Assignee: Clifford Jansen
>Priority: Major
>
> Thanks to Coverity:
>   *** CID 376597:  Null pointer dereferences  (FORWARD_NULL)
>   /qpid-proton/c/src/tls/openssl.c: 2283 in pn_tls_config_set_alpn_protocols()
>  






[jira] [Created] (PROTON-2519) TLS library: null pointer reference

2022-03-14 Thread Clifford Jansen (Jira)
Clifford Jansen created PROTON-2519:
---

 Summary: TLS library: null pointer reference
 Key: PROTON-2519
 URL: https://issues.apache.org/jira/browse/PROTON-2519
 Project: Qpid Proton
  Issue Type: Bug
  Components: proton-c
Affects Versions: proton-c-0.37.0
Reporter: Clifford Jansen
Assignee: Clifford Jansen


Thanks to Coverity:

  *** CID 376597:  Null pointer dereferences  (FORWARD_NULL)
  /qpid-proton/c/src/tls/openssl.c: 2283 in pn_tls_config_set_alpn_protocols()

 






[jira] [Resolved] (PROTON-2512) Proton raw TLS library does not build on aarch64 Ubuntu in Travis CI

2022-03-14 Thread Clifford Jansen (Jira)


 [ 
https://issues.apache.org/jira/browse/PROTON-2512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Clifford Jansen resolved PROTON-2512.
-
Resolution: Fixed

> Proton raw TLS library does not build on aarch64 Ubuntu in Travis CI
> 
>
> Key: PROTON-2512
> URL: https://issues.apache.org/jira/browse/PROTON-2512
> Project: Qpid Proton
>  Issue Type: Bug
>  Components: proton-c
>Affects Versions: proton-c-0.37.0
>Reporter: Jiri Daněk
>Priority: Major
>
> https://app.travis-ci.com/github/jiridanek/skupper-router/jobs/562112202#L605
> {noformat}
> cmake .. 
> -DCMAKE_INSTALL_PREFIX=/home/travis/build/jiridanek/skupper-router/install 
> -DCMAKE_BUILD_TYPE=RelWithDebInfo -DBUILD_BINDINGS=python -DBUILD_TLS=ON
> {noformat}
> [...]
> {noformat}
> [  8%] Building C object 
> c/CMakeFiles/qpid-proton-proactor-objects.dir/src/proactor/epoll_raw_connection.c.o
> /home/travis/build/jiridanek/skupper-router/qpid-proton/c/src/tls/openssl.c:1465:22:
>  error: unused function 'size_min' [-Werror,-Wunused-function]
> static inline size_t size_min(uint32_t a, uint32_t b) {
>  ^
> 1 error generated.
> {noformat}






[jira] [Created] (PROTON-2517) The new C codec can misinterpret pn_data_t values resulting in unintended wire data.

2022-03-14 Thread Clifford Jansen (Jira)
Clifford Jansen created PROTON-2517:
---

 Summary: The new C codec can misinterpret pn_data_t values 
resulting in unintended wire data.
 Key: PROTON-2517
 URL: https://issues.apache.org/jira/browse/PROTON-2517
 Project: Qpid Proton
  Issue Type: Bug
  Components: proton-c
Affects Versions: proton-c-0.37.0
Reporter: Clifford Jansen
Assignee: Clifford Jansen


See the C++ frame trace from

  https://issues.redhat.com/browse/ENTMQCL-3278

The zero length array is printed instead of a null because the test in 
emit_multiple() from emitters.h fails to set the current node of the pn_data_t 
to the first node.  The test

  if (pn_data_type(data) == PN_ARRAY) { //...

fails and the array processing logic is bypassed, including the lines

     switch (pn_data_get_array(data)) {
        case 0:
          pni_emitter_writef8(emitter, PNE_NULL);
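
The effect of testing the type before the cursor is positioned can be shown 
with a toy model.  The demo_* types and functions below are hypothetical and 
are not the Proton codec API; they only assume, as the description above 
implies, that asking for the type with no current node does not report an 
array.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum { DEMO_INVALID = -1, DEMO_NULL, DEMO_ARRAY } demo_type_t;

/* Hypothetical cursor over a flat list of nodes; not pn_data_t. */
typedef struct {
  demo_type_t nodes[4];
  int count;
  int current;  /* -1 means "before the first node" */
} demo_data_t;

static void demo_rewind(demo_data_t *d) { d->current = -1; }

static bool demo_next(demo_data_t *d) {
  if (d->current + 1 >= d->count) return false;
  d->current++;
  return true;
}

/* With no current node the type is reported as invalid. */
static demo_type_t demo_type(const demo_data_t *d) {
  return d->current < 0 ? DEMO_INVALID : d->nodes[d->current];
}

/* Returns true when the array-handling branch would be taken. */
static bool takes_array_branch(demo_data_t *d, bool position_first) {
  assert(d);
  demo_rewind(d);
  if (position_first) demo_next(d);  /* move to the first node */
  return demo_type(d) == DEMO_ARRAY;
}
```

Only the variant that first moves to the first node reaches the array logic, 
and with it the branch that would emit a null for an empty array.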

 






[jira] [Created] (PROTON-2509) python-integration-test errors with tsan and asan runtime checks

2022-02-28 Thread Clifford Jansen (Jira)
Clifford Jansen created PROTON-2509:
---

 Summary: python-integration-test errors with tsan and asan runtime 
checks
 Key: PROTON-2509
 URL: https://issues.apache.org/jira/browse/PROTON-2509
 Project: Qpid Proton
  Issue Type: Bug
  Components: python-binding
Affects Versions: proton-c-0.36.0
 Environment: Fedora release 34.
Reporter: Clifford Jansen
 Fix For: proton-c-0.38.0


Build with

   -DRUNTIME_CHECK=asan       (or tsan)

and test with

  ctest -V -R python-integration-test






[jira] [Resolved] (PROTON-2484) epoll proactor memory use after free

2022-02-23 Thread Clifford Jansen (Jira)


 [ 
https://issues.apache.org/jira/browse/PROTON-2484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Clifford Jansen resolved PROTON-2484.
-
Resolution: Fixed

> epoll proactor memory use after free
> 
>
> Key: PROTON-2484
> URL: https://issues.apache.org/jira/browse/PROTON-2484
> Project: Qpid Proton
>  Issue Type: Bug
>  Components: proton-c
>Affects Versions: proton-c-0.36.0
>Reporter: Clifford Jansen
>Assignee: Clifford Jansen
>Priority: Major
> Fix For: proton-c-0.37.0
>
>
> ASAN correctly notes use of task memory after task deletion, notably when 
> reading the task's pointer to the proactor.  This value can be saved at a 
> time when the task is known to still exist.






[jira] [Resolved] (PROTON-2483) TSAN reported potential deadlock in epoll proactor when run via Qpid Dispatch router.

2022-02-23 Thread Clifford Jansen (Jira)


 [ 
https://issues.apache.org/jira/browse/PROTON-2483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Clifford Jansen resolved PROTON-2483.
-
Resolution: Fixed

> TSAN reported potential deadlock in epoll proactor when run via Qpid Dispatch 
> router.
> -
>
> Key: PROTON-2483
> URL: https://issues.apache.org/jira/browse/PROTON-2483
> Project: Qpid Proton
>  Issue Type: Bug
>  Components: proton-c
>Affects Versions: proton-c-0.36.0
> Environment: linux epoll
>Reporter: Clifford Jansen
>Assignee: Clifford Jansen
>Priority: Major
> Fix For: proton-c-0.37.0
>
> Attachments: tsan_out.txt
>
>
> The traces are incomplete but the 4 way thread tangle can be inferred as 
> follows:
>   A: pn_proactor_set_timeout()   (p->task.mutex + tm->task.mutex)
>   B: pni_timer_manager_process() (tm->task.mutex + tm->deletion_mutex)
>   C: pni_connection_timeout()    (tm->deletion_mutex + pc1->task.mutex)
>   D: proactor_remove()           (pc1->task.mutex + p->task.mutex)
> While this particular trace is a false positive (D occurs after all other 
> threads have been joined and there are no competing threads to complete the 
> circle), the lock ordering is clearly asking for eventual trouble.
> The proactor set_timeout and cancel_timeout API calls do not need to hold the 
> proactor task lock while interacting with the timer manager, but do so as a 
> convenience to prevent collisions between simultaneous sets/cancels.  A 
> separate lock can achieve that purpose, stopping A from participating in the 
> potential deadlock.
>  






[jira] [Resolved] (PROTON-2362) c-threaderciser timed out on 32-core machine.

2022-02-23 Thread Clifford Jansen (Jira)


 [ 
https://issues.apache.org/jira/browse/PROTON-2362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Clifford Jansen resolved PROTON-2362.
-
Fix Version/s: proton-c-0.37.0
   Resolution: Fixed

> c-threaderciser timed out on 32-core machine.
> -
>
> Key: PROTON-2362
> URL: https://issues.apache.org/jira/browse/PROTON-2362
> Project: Qpid Proton
>  Issue Type: Bug
>  Components: proton-c
>Affects Versions: proton-c-0.33.0, proton-c-0.35.0, proton-c-0.34.0, 
> proton-c-0.36.0, proton-c-0.37.0
>Reporter: michael goulish
>Assignee: Clifford Jansen
>Priority: Major
> Fix For: proton-c-0.37.0
>
> Attachments: tsan_tr1.txt, tsan_tr2.txt, tsan_tr3.txt
>
>
> Using recent master – maybe 3 days old or so – I just ran Proton's ctest, 
> after turning on THREADERCISER.  I ran it on a box with 32 physical cores, 64 
> threads.
>  
> Test number 6 – c-threaderciser – failed with timeout after 1500 seconds.
> ( 1.5e18 femtoseconds. )
>  






[jira] [Resolved] (PROTON-2497) General TLS library for Proton C

2022-02-23 Thread Clifford Jansen (Jira)


 [ 
https://issues.apache.org/jira/browse/PROTON-2497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Clifford Jansen resolved PROTON-2497.
-
Resolution: Implemented

Initial implementation.  API is unsettled.  Does not build by default.  The 
CMake flag

  -DBUILD_TLS=ON

is required to build it.

> General TLS library for Proton C
> 
>
> Key: PROTON-2497
> URL: https://issues.apache.org/jira/browse/PROTON-2497
> Project: Qpid Proton
>  Issue Type: New Feature
>  Components: proton-c
>Affects Versions: proton-c-0.36.0
>Reporter: Clifford Jansen
>Assignee: Clifford Jansen
>Priority: Major
> Fix For: proton-c-0.37.0
>
>
> The current TLS functionality for Proton (see "c/include/proton/ssl.h") is 
> tightly coupled to AMQP connections and does not allow TLS sessions for 
> arbitrary content including Proton raw connections.
> A more generalized API is proposed that works with arrays of pn_raw_buffer_t 
> content.  As it matures it could serve as the TLS engine for AMQP connections 
> as well.






[jira] [Created] (PROTON-2500) Proactor memory leak on aborted shutdown

2022-02-14 Thread Clifford Jansen (Jira)
Clifford Jansen created PROTON-2500:
---

 Summary: Proactor memory leak on aborted shutdown
 Key: PROTON-2500
 URL: https://issues.apache.org/jira/browse/PROTON-2500
 Project: Qpid Proton
  Issue Type: Bug
  Components: proton-c
Affects Versions: proton-c-0.36.0
 Environment: Linux epoll: yes.
libuv: no.
Windows IOCP: TBD.
Reporter: Clifford Jansen
Assignee: Clifford Jansen
 Attachments: ptest.diff

If pn_proactor_free is called while closes from a pn_proactor_disconnect are 
still pending, some reference counts remain positive and memory leaks occur.

See test case.






[jira] [Created] (PROTON-2497) General TLS library for Proton C

2022-02-09 Thread Clifford Jansen (Jira)
Clifford Jansen created PROTON-2497:
---

 Summary: General TLS library for Proton C
 Key: PROTON-2497
 URL: https://issues.apache.org/jira/browse/PROTON-2497
 Project: Qpid Proton
  Issue Type: New Feature
  Components: proton-c
Affects Versions: proton-c-0.36.0
Reporter: Clifford Jansen
Assignee: Clifford Jansen


The current TLS functionality for Proton (see "c/include/proton/ssl.h") is 
tightly coupled to AMQP connections and does not allow TLS sessions for 
arbitrary content including Proton raw connections.

A more generalized API is proposed that works with arrays of pn_raw_buffer_t 
content.  As it matures it could serve as the TLS engine for AMQP connections 
as well.






[jira] [Created] (PROTON-2484) epoll proactor memory use after free

2022-01-13 Thread Clifford Jansen (Jira)
Clifford Jansen created PROTON-2484:
---

 Summary: epoll proactor memory use after free
 Key: PROTON-2484
 URL: https://issues.apache.org/jira/browse/PROTON-2484
 Project: Qpid Proton
  Issue Type: Bug
  Components: proton-c
Affects Versions: proton-c-0.36.0
Reporter: Clifford Jansen
Assignee: Clifford Jansen


ASAN correctly notes use of task memory after task deletion, notably when 
reading the task's pointer to the proactor.  This value can be saved at a time 
when the task is known to still exist.
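
The general pattern is sketched below with hypothetical types (demo_task_t and 
demo_proactor_t are not the epoll proactor structures): copy the pointer out of 
the task while the task is alive, then use only the saved copy once the task 
has been freed.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified types; not the real proactor structures. */
typedef struct demo_proactor { int wakeups; } demo_proactor_t;
typedef struct demo_task { demo_proactor_t *proactor; } demo_task_t;

static demo_task_t *demo_task_new(demo_proactor_t *p) {
  demo_task_t *t = malloc(sizeof *t);
  if (t) t->proactor = p;
  return t;
}

/* Free the task, then notify its proactor through a saved pointer. */
static void demo_task_finish(demo_task_t *t) {
  assert(t);
  demo_proactor_t *p = t->proactor;  /* saved while t is still valid */
  free(t);
  p->wakeups++;                      /* t itself is never touched again */
}
```

Reading t->proactor after the free() call instead would be the kind of access 
ASAN reports.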



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: dev-unsubscr...@qpid.apache.org
For additional commands, e-mail: dev-h...@qpid.apache.org



[jira] [Updated] (PROTON-2483) TSAN reported potential deadlock in epoll proactor when run via Qpid Dispatch router.

2022-01-12 Thread Clifford Jansen (Jira)


 [ 
https://issues.apache.org/jira/browse/PROTON-2483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Clifford Jansen updated PROTON-2483:

Attachment: tsan_out.txt

> TSAN reported potential deadlock in epoll proactor when run via Qpid Dispatch 
> router.
> -
>
> Key: PROTON-2483
> URL: https://issues.apache.org/jira/browse/PROTON-2483
> Project: Qpid Proton
>  Issue Type: Bug
>  Components: proton-c
>Affects Versions: proton-c-0.36.0
> Environment: linux epoll
>Reporter: Clifford Jansen
>Assignee: Clifford Jansen
>Priority: Major
> Attachments: tsan_out.txt
>
>
> The traces are incomplete but the 4 way thread tangle can be inferred as 
> follows:
>   A: pn_proactor_set_timeout()   (p->task.mutex + tm->task.mutex)
>   B: pni_timer_manager_process() (tm->task.mutex + tm->deletion_mutex)
>   C: pni_connection_timeout()    (tm->deletion_mutex + pc1->task.mutex)
>   D: proactor_remove()           (pc1->task.mutex + p->task.mutex)
> While this particular trace is a false positive (D occurs after all other 
> threads have been joined and there are no competing threads to complete the 
> circle), the lock ordering is clearly asking for eventual trouble.
> The proactor set_timeout and cancel_timeout API calls do not need to hold the 
> proactor task lock while interacting with the timer manager, but do so as a 
> convenience to prevent collisions between simultaneous sets/cancels.  A 
> separate lock can achieve that purpose, stopping A from participating in the 
> potential deadlock.
>  






[jira] [Created] (PROTON-2483) TSAN reported potential deadlock in epoll proactor when run via Qpid Dispatch router.

2022-01-12 Thread Clifford Jansen (Jira)
Clifford Jansen created PROTON-2483:
---

 Summary: TSAN reported potential deadlock in epoll proactor when 
run via Qpid Dispatch router.
 Key: PROTON-2483
 URL: https://issues.apache.org/jira/browse/PROTON-2483
 Project: Qpid Proton
  Issue Type: Bug
  Components: proton-c
Affects Versions: proton-c-0.36.0
 Environment: linux epoll
Reporter: Clifford Jansen
Assignee: Clifford Jansen


The traces are incomplete but the 4 way thread tangle can be inferred as 
follows:

  A: pn_proactor_set_timeout()   (p->task.mutex + tm->task.mutex)
  B: pni_timer_manager_process() (tm->task.mutex + tm->deletion_mutex)
  C: pni_connection_timeout()    (tm->deletion_mutex + pc1->task.mutex)
  D: proactor_remove()           (pc1->task.mutex + p->task.mutex)

While this particular trace is a false positive (D occurs after all other 
threads have been joined and there are no competing threads to complete the 
circle), the lock ordering is clearly asking for eventual trouble.

The proactor set_timeout and cancel_timeout API calls do not need to hold the 
proactor task lock while interacting with the timer manager, but do so as a 
convenience to prevent collisions between simultaneous sets/cancels.  A 
separate lock can achieve that purpose, stopping A from participating in the 
potential deadlock.
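
The reasoning can be checked mechanically by treating each pair above as a 
lock-order edge and looking for a cycle.  This is only a sketch: it assumes the 
first mutex in each pair is held while the second is acquired, and the enum 
names and helper functions are hypothetical, not part of the proactor.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

enum { P_TASK, TM_TASK, TM_DELETION, PC1_TASK, N_LOCKS };

/* edge[a][b]: some thread acquires lock b while holding lock a. */
typedef struct { bool edge[N_LOCKS][N_LOCKS]; } lock_graph_t;

static void graph_clear(lock_graph_t *g) { memset(g, 0, sizeof *g); }

static void add_order(lock_graph_t *g, int held, int wanted) {
  assert(held >= 0 && held < N_LOCKS && wanted >= 0 && wanted < N_LOCKS);
  g->edge[held][wanted] = true;
}

/* Depth-first search; state: 0 unvisited, 1 on the stack, 2 finished. */
static bool visit(const lock_graph_t *g, int n, int *state) {
  state[n] = 1;
  for (int m = 0; m < N_LOCKS; m++) {
    if (!g->edge[n][m]) continue;
    if (state[m] == 1) return true;  /* back edge closes a cycle */
    if (state[m] == 0 && visit(g, m, state)) return true;
  }
  state[n] = 2;
  return false;
}

/* True if the recorded lock orders admit a circular wait. */
static bool has_cycle(const lock_graph_t *g) {
  int state[N_LOCKS] = {0};
  for (int n = 0; n < N_LOCKS; n++)
    if (state[n] == 0 && visit(g, n, state)) return true;
  return false;
}
```

Recording the four orders A to D yields a cycle; dropping A's edge, which is 
what a separate lock for set_timeout/cancel_timeout would achieve, leaves none.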

 






[jira] [Resolved] (PROTON-2436) TSAN race in epoll.c post_event with raw connection

2022-01-11 Thread Clifford Jansen (Jira)


 [ 
https://issues.apache.org/jira/browse/PROTON-2436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Clifford Jansen resolved PROTON-2436.
-
Fix Version/s: proton-c-0.37.0
   Resolution: Fixed

Make the ownership of scheduled I/O events, as opposed to task-processed I/O 
events, consistent across AMQP connections, listeners, and raw connections.

> TSAN race in epoll.c post_event with raw connection
> ---
>
> Key: PROTON-2436
> URL: https://issues.apache.org/jira/browse/PROTON-2436
> Project: Qpid Proton
>  Issue Type: Bug
>  Components: proton-c
>Affects Versions: proton-c-0.36.0
>Reporter: Ken Giusti
>Assignee: Clifford Jansen
>Priority: Major
> Fix For: proton-c-0.37.0
>
>
> today's github CI run of dispatch+proton main kicked up a tsan error in 
> proton I've never seen before:
> https://github.com/apache/qpid-dispatch/runs/3700836319?check_suite_focus=true#step:27:2142
>  
> {noformat}
> 70: WARNING: ThreadSanitizer: data race (pid=3075)
> 70:   Write of size 4 at 0x7b68dd38 by main thread (mutexes: write M257):
> 70: #0 post_event 
> /home/runner/work/qpid-dispatch/qpid-dispatch/qpid-proton/c/src/proactor/epoll.c:2304
>  (libqpid-proton-proactor.so.1+0x14108)
> 70: #1 poller_do_epoll 
> /home/runner/work/qpid-dispatch/qpid-dispatch/qpid-proton/c/src/proactor/epoll.c:2534
>  (libqpid-proton-proactor.so.1+0x14108)
> 70: #2 next_event_batch 
> /home/runner/work/qpid-dispatch/qpid-dispatch/qpid-proton/c/src/proactor/epoll.c:2438
>  (libqpid-proton-proactor.so.1+0x14108)
> 70: #3 pn_proactor_wait 
> /home/runner/work/qpid-dispatch/qpid-dispatch/qpid-proton/c/src/proactor/epoll.c:2650
>  (libqpid-proton-proactor.so.1+0x14622)
> 70: #4 thread_run 
> /home/runner/work/qpid-dispatch/qpid-dispatch/qpid-dispatch/src/server.c:1118 
> (qdrouterd+0x4d83a9)
> 70: #5 qd_server_run 
> /home/runner/work/qpid-dispatch/qpid-dispatch/qpid-dispatch/src/server.c:1527 
> (qdrouterd+0x4d904c)
> 70: #6 main_process 
> /home/runner/work/qpid-dispatch/qpid-dispatch/qpid-dispatch/router/src/main.c:115
>  (qdrouterd+0x426cdc)
> 70: #7 main 
> /home/runner/work/qpid-dispatch/qpid-dispatch/qpid-dispatch/router/src/main.c:369
>  (qdrouterd+0x42623c)
> 70: 
> 70:   Previous read of size 4 at 0x7b68dd38 by thread T3 (mutexes: write 
> M499):
> 70: #0 pni_raw_connection_process 
> /home/runner/work/qpid-dispatch/qpid-dispatch/qpid-proton/c/src/proactor/epoll_raw_connection.c:355
>  (libqpid-proton-proactor.so.1+0x108ec)
> 70: #1 process 
> /home/runner/work/qpid-dispatch/qpid-dispatch/qpid-proton/c/src/proactor/epoll.c:2230
>  (libqpid-proton-proactor.so.1+0x108ec)
> 70: #2 next_event_batch 
> /home/runner/work/qpid-dispatch/qpid-dispatch/qpid-proton/c/src/proactor/epoll.c:2419
>  (libqpid-proton-proactor.so.1+0x108ec)
> 70: #3 pn_proactor_wait 
> /home/runner/work/qpid-dispatch/qpid-dispatch/qpid-proton/c/src/proactor/epoll.c:2650
>  (libqpid-proton-proactor.so.1+0x14622)
> 70: #4 thread_run 
> /home/runner/work/qpid-dispatch/qpid-dispatch/qpid-dispatch/src/server.c:1118 
> (qdrouterd+0x4d83a9)
> 70: #5 _thread_init 
> /home/runner/work/qpid-dispatch/qpid-dispatch/qpid-dispatch/src/posix/threading.c:172
>  (qdrouterd+0x47fe2d)
> 70: 
> 70:   Location is heap block of size 1536 at 0x7b68d800 allocated by main 
> thread:
> 70: #0 calloc  (libtsan.so.0+0x32b3e)
> 70: #1 pn_raw_connection 
> /home/runner/work/qpid-dispatch/qpid-dispatch/qpid-proton/c/src/proactor/epoll_raw_connection.c:168
>  (libqpid-proton-proactor.so.1+0xdf82)
> 70: #2 _do_reconnect 
> /home/runner/work/qpid-dispatch/qpid-dispatch/qpid-dispatch/src/adaptors/http1/http1_server.c:451
>  (qdrouterd+0x43da47)
> 70: #3 qd_timer_visit 
> /home/runner/work/qpid-dispatch/qpid-dispatch/qpid-dispatch/src/timer.c:316 
> (qdrouterd+0x4daddf)
> 70: #4 handle 
> /home/runner/work/qpid-dispatch/qpid-dispatch/qpid-dispatch/src/server.c:1018 
> (qdrouterd+0x4d60d6)
> 70: #5 thread_run 
> /home/runner/work/qpid-dispatch/qpid-dispatch/qpid-dispatch/src/server.c:1133 
> (qdrouterd+0x4d84e7)
> 70: #6 qd_server_run 
> /home/runner/work/qpid-dispatch/qpid-dispatch/qpid-dispatch/src/server.c:1527 
> (qdrouterd+0x4d904c)
> 70: #7 main_process 
> /home/runner/work/qpid-dispatch/qpid-dispatch/qpid-dispatch/router/src/main.c:115
>  (qdrouterd+0x426cdc)
> 70: #8 main 
> /home/runner/work/qpid-dispatch/qpid-dispatch/qpid-dispatch/router/src/main.c:369
>  (qdrouterd+0x42623c)
> 70: 
> 70:   Mutex M257 (0x7b640003aa20) created at:
> 70: #0 pthread_mutex_init  (libtsan.so.0+0x49603)
> 70: #1 pmutex_init 
> /home/runner/work/qpid-dispatch/qpid-dispatch/qpid-proton/c/src/proactor/epoll-internal.h:323
>  (libqpid-proton-proactor.so.1+0xd52c)
> 70: #2 pn_proactor 
> /home

[jira] [Commented] (PROTON-2362) c-threaderciser timed out on 32-core machine.

2021-11-21 Thread Clifford Jansen (Jira)


[ 
https://issues.apache.org/jira/browse/PROTON-2362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17447116#comment-17447116
 ] 

Clifford Jansen commented on PROTON-2362:
-

I have a less thunderous 8-core (16-thread) machine.

If I run the threaderciser under tsan with ambitious pthread counts (> 100), I 
can provoke three separate thread traces with helpful debugging information. 
The tsan_trX.txt files are attached.

> c-threaderciser timed out on 32-core machine.
> -
>
> Key: PROTON-2362
> URL: https://issues.apache.org/jira/browse/PROTON-2362
> Project: Qpid Proton
>  Issue Type: Bug
>  Components: proton-c
>Affects Versions: proton-c-0.33.0, proton-c-0.34.0
>Reporter: michael goulish
>Priority: Major
> Attachments: tsan_tr1.txt, tsan_tr2.txt, tsan_tr3.txt
>
>
> Using recent master – maybe 3 days old or so – I just ran Proton's ctest, 
> after turning on THREADERCISER.  I ran it on a box with 32 physical cores, 64 
> threads.
>  
> Test number 6 – c-threaderciser – failed with timeout after 1500 seconds.
> ( 1.5e18 femtoseconds. )
>  






[jira] [Updated] (PROTON-2362) c-threaderciser timed out on 32-core machine.

2021-11-21 Thread Clifford Jansen (Jira)


 [ 
https://issues.apache.org/jira/browse/PROTON-2362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Clifford Jansen updated PROTON-2362:

Attachment: tsan_tr3.txt
tsan_tr2.txt
tsan_tr1.txt

> c-threaderciser timed out on 32-core machine.
> -
>
> Key: PROTON-2362
> URL: https://issues.apache.org/jira/browse/PROTON-2362
> Project: Qpid Proton
>  Issue Type: Bug
>  Components: proton-c
>Affects Versions: proton-c-0.33.0, proton-c-0.34.0
>Reporter: michael goulish
>Priority: Major
> Attachments: tsan_tr1.txt, tsan_tr2.txt, tsan_tr3.txt
>
>
> Using recent master – maybe 3 days old or so – I just ran Proton's ctest, 
> after turning on THREADERCISER.  I ran it on a box with 32 physical cores, 64 
> threads.
>  
> Test number 6 – c-threaderciser – failed with timeout after 1500 seconds.
> ( 1.5e18 femtoseconds. )
>  






[jira] [Closed] (PROTON-2432) Proton crashes because of a concurrency failure in collector->pool

2021-10-29 Thread Clifford Jansen (Jira)


 [ 
https://issues.apache.org/jira/browse/PROTON-2432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Clifford Jansen closed PROTON-2432.
---
Resolution: Not A Bug

> Proton crashes because of a concurrency failure in collector->pool
> --
>
> Key: PROTON-2432
> URL: https://issues.apache.org/jira/browse/PROTON-2432
> Project: Qpid Proton
>  Issue Type: Bug
>  Components: proton-c
>Affects Versions: proton-c-0.32.0
> Environment: RHEL 7 
>Reporter: Jesse Hulsizer
>Priority: Major
> Attachments: proton-2432.patch
>
>
> While running our application tests, our application crashes with many 
> different backtraces that look similar to this...
> {noformat}
> #0  0x in ?? ()
> #1  0x7fc777579198 in pn_class_incref () from 
> /usr/lib64/libqpid-proton.so.11
> #2  0x7fc777587d8a in pn_collector_put () from 
> /usr/lib64/libqpid-proton.so.11
> #3  0x7fc7775887ea in ?? () from /usr/lib64/libqpid-proton.so.11
> #4  0x7fc777588c7b in pn_transport_pending () from 
> /usr/lib64/libqpid-proton.so.11
> #5  0x7fc777588d9e in pn_transport_pop () from 
> /usr/lib64/libqpid-proton.so.11
> #6  0x7fc777599298 in ?? () from /usr/lib64/libqpid-proton.so.11
> #7  0x7fc77759a784 in ?? () from /usr/lib64/libqpid-proton.so.11
> #8  0x7fc7773236f0 in proton::container::impl::thread() () from 
> /usr/lib64/libqpid-proton-cpp.so.12
> #9  0x7fc7760b2470 in ?? () from /usr/lib64/libstdc++.so.6
> #10 0x7fc776309aa1 in start_thread () from /lib64/libpthread.so.0
> #11 0x7fc7758b6bdd in clone () from /lib64/libc.so.6{noformat}
> Using gdb to probe one of the backtraces shows that the collector->pool size 
> is -1... (seen here as 18446744073709551615)
> {noformat}
> (gdb) p *collector
> $1 = {pool = 0x7fa7182de180, head = 0x7fa7182de250, tail = 0x7fa7182b8b90, 
> prev = 0x7fa7182ea010, freed = false}
> (gdb) p collector->pool
> $2 = (pn_list_t *) 0x7fa7182de180
> (gdb) p *collector->pool
> $3 = {clazz = 0x7fa74eb7c000, capacity = 16, size = 18446744073709551615, 
> elements = 0x7fa7182de1b0}{noformat}
> The proton code was marked up with print statements which show that two 
> threads were accessing the collector->pool data structure at the same time...
> {noformat} 
>  7b070700: pn_list_pop index 0 list->0x7fec401e0b70 value->0x7fec3c728a10
>  4700:pn_list_add index 1 size 2list->0x7fec401e0b70 value->0x7fec402095b0
>  7b070700: pn_list_pop size 1 list->0x7fec401e0b70
>  4700: pn_list_pop size 1 list->0x7fec401e0b70
>  7b070700: pn_list_pop index 0 list->0x7fec401e0b70 value->0x7fec3c728a10
>  4700: pn_list_pop index 0 list->0x7fec401e0b70 
> value->0x7fec3c728a10{noformat}
> The hex number on the far left is the thread id. As can be seen in the last 
> two lines, two threads are popping from the collector->pool simultaneously. 
> This produces the -1 size seen above.






[jira] [Commented] (PROTON-2422) Proton will sometimes fail to send empty frame if the idle timeout ratio between peers is greater than 2.

2021-10-29 Thread Clifford Jansen (Jira)


[ 
https://issues.apache.org/jira/browse/PROTON-2422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17435827#comment-17435827
 ] 

Clifford Jansen commented on PROTON-2422:
-

Thank you for the reproducer.

This is caused by the wrong sort compare function being used.

The existing code assumes that the Proton class of the timer_deadline is used 
to determine the ordering between objects. The compare function actually used 
is from a separate class specified at list creation.

The fix is to decouple the class definition from the items in the list. The 
timer_deadline class now has no instantiated Proton objects but does provide a 
"static" compare function called by the pn_list on the list items to be sorted.
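
As a rough illustration of the ordering problem (not the actual Proton code), 
here is a minimal C++ sketch. The timer_deadline struct, its fields, and the 
by_deadline / by_address / heap_top names are hypothetical stand-ins. It only 
shows how a compare function that looks at object addresses can put a stale, 
later deadline ahead of a newer, earlier one, while a compare on the deadline 
value does not.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <queue>
#include <vector>

// Hypothetical stand-in for a timer deadline record; not Proton's definition.
struct timer_deadline {
    std::uint64_t list_deadline;  // absolute deadline value
    bool resequenced;             // true for the rolled-back replacement
};

// Intended ordering: the earliest deadline comes out of the heap first.
struct by_deadline {
    bool operator()(const timer_deadline* a, const timer_deadline* b) const {
        return a->list_deadline > b->list_deadline;
    }
};

// Accidental ordering: the lowest object address comes out first.
struct by_address {
    bool operator()(const timer_deadline* a, const timer_deadline* b) const {
        return std::greater<const timer_deadline*>()(a, b);
    }
};

// Builds a heap with the given compare function and returns its first item.
template <typename Compare>
const timer_deadline* heap_top(const std::vector<const timer_deadline*>& items) {
    std::priority_queue<const timer_deadline*,
                        std::vector<const timer_deadline*>, Compare> heap;
    for (const timer_deadline* td : items) heap.push(td);
    return heap.empty() ? nullptr : heap.top();
}
```

With an old record at a lower address holding the later deadline, 
heap_top<by_address> returns the old record, whereas heap_top<by_deadline> 
returns the resequenced one with the earlier deadline.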

> Proton will sometimes fail to send empty frame if the idle timeout ratio 
> between peers is greater than 2. 
> --
>
> Key: PROTON-2422
> URL: https://issues.apache.org/jira/browse/PROTON-2422
> Project: Qpid Proton
>  Issue Type: Bug
>  Components: cpp-binding, proton-c
>Affects Versions: proton-c-0.33.0
> Environment: RHEL 8
>Reporter: Jesse Hulsizer
>Assignee: Clifford Jansen
>Priority: Minor
> Attachments: instrument.patch, reproducer.cpp
>
>
> When a connection is made to a proton listener with both sides having 
> different idle timeout intervals, the epoll_timer can fail to trigger, 
> resulting in no empty frames being sent, and the connection dropped with a 
> 'amqp:resource-limit-exceeded: local-idle-timeout expired' exception.
> Instrumentation of the proton library showed that when an epoll timer 
> deadline was rolled back and the timer resequenced due to the peer idle 
> timeout being shorter than the local, the new timer is pushed on the timer 
> manager heap incorrectly. The timer deadline object should be pushed on the 
> timer heap in order by deadline, but in fact the timer is pushed on the head 
> by timer deadline object address. This causes the invalidated timer to be 
> first on the list, and the proactor timer set incorrectly. When enough time 
> has elapsed, the remote peer will close the connection due to inactivity. 
> Note that if the newly created resequenced timer deadline object has an 
> address lower than the old invalidated timer deadline object, proton will 
> work correctly.
> I've attached a reproducer as well as a patch for the instrumentation. 
> Annotated proton logging from the reproducer is below.
> This issue does not occur prior to 0.33.0
> {code:java}
> [builder@SE-RHEL8-ITCM-TEST-01 qpid-proton-idle-timeout-repo $ ] 
> PN_LOG='frame info+' ./a.out 
> listening on 9030
> # The initial connection
> [0x7fdb3c001be0]: SASL:FRAME: -> SASL
> [0x7fdb44002620]: SASL:FRAME: <- SASL
> [0x7fdb44002620]: SASL:FRAME: -> SASL
> [0x7fdb44002620]: AMQP:FRAME:0 -> @sasl-mechanisms(64) 
> [sasl-server-mechanisms=@PN_SYMBOL[:ANONYMOUS]]
> [0x7fdb3c001be0]: SASL:FRAME: <- SASL
> [0x7fdb3c001be0]: AMQP:FRAME:0 <- @sasl-mechanisms(64) 
> [sasl-server-mechanisms=@PN_SYMBOL[:ANONYMOUS]]
> [0x7fdb4ca21e20]:EVENT: INFO:In pni_timer_set - timer* 0x7fdb3c008570, 
> deadline 5189836829, proactor_timer* 0x10B9660
> [0x7fdb4ca21e20]:EVENT: INFO:Start of timer heap dump
> [0x7fdb4ca21e20]:EVENT: INFO:Stop of timer heap dump
> [0x7fdb4ca21e20]:EVENT: INFO:Start of timer heap dump post
> [0x7fdb4ca21e20]:EVENT: INFO:Heap position 0: td=0x0167fdb3c0085b0, 
> td->list_deadline=5189836829, td->timer=0x7fdb3c008570, 
> td->resequenced=false
> [0x7fdb4ca21e20]:EVENT: INFO:Stop of timer heap dump
> [0x7fdb3c001be0]: AMQP:FRAME:0 -> @sasl-init(65) [mechanism=:ANONYMOUS, 
> initial-response=b"anonymous@SE-RHEL8-ITCM-TEST-01"]
> [0x7fdb44002620]: AMQP:FRAME:0 <- @sasl-init(65) [mechanism=:ANONYMOUS, 
> initial-response=b"anonymous@SE-RHEL8-ITCM-TEST-01"]
> [0x7fdb44002620]: SASL: INFO:Authenticated user: anonymous for anonymous with 
> mechanism ANONYMOUS
> [0x7fdb44002620]: AMQP:FRAME:0 -> @sasl-outcome(68) [code=0]
> [0x7fdb3c001be0]: AMQP:FRAME:0 <- @sasl-outcome(68) [code=0]
> [0x7fdb4ca21e20]:EVENT: INFO:In pni_timer_set - timer* 0x7fdb3c008570, 
> deadline 5189836829, proactor_timer* 0x10B9660
> [0x7fdb4ca21e20]:EVENT: INFO:Start of timer heap dump
> [0x7fdb4ca21e20]:EVENT: INFO:Heap position 0: td=0x0167fdb3c0085b0, 
> td->list_deadline=5189836829, td->timer=0x7fdb3c008570, 
> td->resequenced=false
> [0x7fdb4ca21e20]:EVENT: INFO:Stop of timer heap dump
> [0x7fdb3c001be0]: AMQP:FRAME: -> AMQP
> [0x7fdb3c001be0]: AMQP:FRAME:0 -> @open(16) 
> [container-id="cf87e911-f46b-471a-a664-e34de8a57b6b", hostname="127.0.0.1", 
> channel-max=32767, idle-time-out=2]
> [0x7fdb44002620]: AMQP:FRAME: <- AMQP
> [0x7fdb44002620]: AMQP:FRAME:0 <- @open(16) 
> [container-id="cf87e911-f46b-

[jira] [Commented] (PROTON-2432) Proton crashes because of a concurrency failure in collector->pool

2021-09-22 Thread Clifford Jansen (Jira)


[ 
https://issues.apache.org/jira/browse/PROTON-2432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17418723#comment-17418723
 ] 

Clifford Jansen commented on PROTON-2432:
-

Further to Robbie's excellent response:

See also the "Thread-safety" note in messaging_handler.hpp. Useful examples 
working with work queues can be found in cpp/examples including broker.cpp and 
the multithreaded clients.

An alternative to proton::work_queue for achieving thread safety in Proton is 
to use connection::wake() paired with on_connection_wake(), with your own 
locking mechanism managing your own work queue concept. This ensures that 
active use of the connection only happens in the dedicated thread that 
receives the connection callbacks.
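
A minimal sketch of such an application-side queue, written without any Proton 
headers so it stands alone. The pending_work class and its add()/drain() 
methods are hypothetical names, not part of the Proton API. The assumption is 
that other threads call add() and then connection::wake(), and that drain() is 
called only from on_connection_wake().

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <mutex>
#include <utility>

// Hypothetical application-side work queue; not part of the Proton API.
class pending_work {
  public:
    // Safe to call from any thread. A real application would follow this
    // with connection::wake() to request an on_connection_wake() callback.
    void add(std::function<void()> fn) {
        std::lock_guard<std::mutex> guard(lock_);
        items_.push_back(std::move(fn));
    }

    // Meant to be called only from the thread running the connection's
    // callbacks. Returns the number of work items that were run.
    std::size_t drain() {
        std::deque<std::function<void()>> batch;
        {
            std::lock_guard<std::mutex> guard(lock_);
            batch.swap(items_);  // take everything queued so far
        }
        for (auto& fn : batch) fn();  // run outside the lock
        return batch.size();
    }

  private:
    std::mutex lock_;
    std::deque<std::function<void()>> items_;
};
```

The queued functions are run outside the lock so that a work item can itself 
call add() without deadlocking.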

One frequent "gotcha" is inadvertent use of the connection or its sub-objects 
(senders/receivers/deliveries) from another thread. Destructors and copy 
constructors are the usual problem. A good strategy is to get a smart pointer 
to the Proton object while in the callback and stash it until a future safe 
callback where the application is ready to release it, and do so via 
smart_ptr::reset(). That way the destructor is called exactly when you want it, 
and any unnoticed copies of the shared ptr in another thread will make no 
surprise calls into the Proton engine.
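
One possible reading of that strategy, sketched with std::shared_ptr and a 
stand-in object instead of a real Proton sender or receiver. The engine_object 
and handler_state names are hypothetical. The sketch assumes the stashed 
pointer is the sole owner, so reset() runs the destructor at the point the 
application chooses.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical stand-in for a Proton object whose destructor must only
// run in the connection's callback thread.
struct engine_object {
    explicit engine_object(int* released) : released_(released) {}
    ~engine_object() { ++*released_; }  // records that destruction happened
    int* released_;
};

// Hypothetical per-connection state owned by the application's handler.
class handler_state {
  public:
    // Called from a connection callback: keep the object alive for later.
    void stash(std::shared_ptr<engine_object> obj) { held_ = std::move(obj); }

    // Called from a later callback on the same thread: drop the reference
    // here, so the destructor runs now if this is the last owner.
    void release() { held_.reset(); }

    bool holding() const { return static_cast<bool>(held_); }

  private:
    std::shared_ptr<engine_object> held_;
};
```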

 

> Proton crashes because of a concurrency failure in collector->pool
> --
>
> Key: PROTON-2432
> URL: https://issues.apache.org/jira/browse/PROTON-2432
> Project: Qpid Proton
>  Issue Type: Bug
>  Components: proton-c
>Affects Versions: proton-c-0.32.0
> Environment: RHEL 7 
>Reporter: Jesse Hulsizer
>Priority: Major
> Attachments: proton-2432.patch
>
>
> While running our application tests, our application crashes with many 
> different backtraces that look similar to this...
> {noformat}
> #0  0x in ?? ()
> #1  0x7fc777579198 in pn_class_incref () from 
> /usr/lib64/libqpid-proton.so.11
> #2  0x7fc777587d8a in pn_collector_put () from 
> /usr/lib64/libqpid-proton.so.11
> #3  0x7fc7775887ea in ?? () from /usr/lib64/libqpid-proton.so.11
> #4  0x7fc777588c7b in pn_transport_pending () from 
> /usr/lib64/libqpid-proton.so.11
> #5  0x7fc777588d9e in pn_transport_pop () from 
> /usr/lib64/libqpid-proton.so.11
> #6  0x7fc777599298 in ?? () from /usr/lib64/libqpid-proton.so.11
> #7  0x7fc77759a784 in ?? () from /usr/lib64/libqpid-proton.so.11
> #8  0x7fc7773236f0 in proton::container::impl::thread() () from 
> /usr/lib64/libqpid-proton-cpp.so.12
> #9  0x7fc7760b2470 in ?? () from /usr/lib64/libstdc++.so.6
> #10 0x7fc776309aa1 in start_thread () from /lib64/libpthread.so.0
> #11 0x7fc7758b6bdd in clone () from /lib64/libc.so.6{noformat}
> Using gdb to probe one of the backtraces shows that the collector->pool size 
> is -1... (seen here as 18446744073709551615)
> {noformat}
> (gdb) p *collector
> $1 = {pool = 0x7fa7182de180, head = 0x7fa7182de250, tail = 0x7fa7182b8b90, 
> prev = 0x7fa7182ea010, freed = false}
> (gdb) p collector->pool
> $2 = (pn_list_t *) 0x7fa7182de180
> (gdb) p *collector->pool
> $3 = {clazz = 0x7fa74eb7c000, capacity = 16, size = 18446744073709551615, 
> elements = 0x7fa7182de1b0}{noformat}
> The proton code was marked up with print statements which show that two 
> threads were accessing the collector->pool data structure at the same time...
> {noformat} 
>  7b070700: pn_list_pop index 0 list->0x7fec401e0b70 value->0x7fec3c728a10
>  4700:pn_list_add index 1 size 2list->0x7fec401e0b70 value->0x7fec402095b0
>  7b070700: pn_list_pop size 1 list->0x7fec401e0b70
>  4700: pn_list_pop size 1 list->0x7fec401e0b70
>  7b070700: pn_list_pop index 0 list->0x7fec401e0b70 value->0x7fec3c728a10
>  4700: pn_list_pop index 0 list->0x7fec401e0b70 
> value->0x7fec3c728a10{noformat}
> The hex number on the far left is the thread id. As can be seen in the last 
> two lines, two threads are popping from the collector->pool simultaneously. 
> This produces the -1 size seen above.






[jira] [Commented] (PROTON-2411) Simultaneous idle timeout sequencing errors

2021-08-03 Thread Clifford Jansen (Jira)


[ 
https://issues.apache.org/jira/browse/PROTON-2411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17392573#comment-17392573
 ] 

Clifford Jansen commented on PROTON-2411:
-

p2411_0.diff is a patch that can be applied to Proton 0.34 to help debug this 
issue.

Instead of aborting if an AMQP connection is seen to set an earlier heartbeat 
timeout more than once, it prints a detailed diagnostic and continues to run.

The problem is supposed to be very rare, and this change could introduce some 
new runaway problem. So if there are more than a handful of such sequencing 
errors on a single connection, the connection is terminated and the process 
can continue to run, perhaps to reconnect as for any other temporary network 
failure (or to continue listening in the case of the router).

To collect the error information Proton clients should be started with

    PN_LOG=ERROR+

in their process environment, or any other setting that includes ERROR level 
logging.

Similarly, the router configuration should allow "error+" logging levels.

The log messages will contain either

    "timer sequence error" or "timer multi sequence errors"

If you use the patch and find examples of these errors in the logs, please add 
a representative sample to the JIRA.

> Simultaneous idle timeout sequencing errors
> ---
>
> Key: PROTON-2411
> URL: https://issues.apache.org/jira/browse/PROTON-2411
> Project: Qpid Proton
>  Issue Type: Bug
>  Components: proton-c
>Affects Versions: proton-c-0.34.0
>Reporter: Jaap Wiggelinkhuizen
>Priority: Critical
> Attachments: p2411_0.diff
>
>
> In our mission critical software we use Qpid proton 0.34.0 in our C++-client 
> software together with the Qpid dispatch router 1.16.0. We updated to these 
> versions not so long ago; before that we used proton 0.25.0 and dispatch 
> 1.3.0. Our application runs on several VMs with a router on each VM. All 
> clients connect to the local router only and the routers connect to each 
> other in a hub-and-spoke pattern. In both the client configuration and the 
> router configuration we have configured an idle timeout of 30 seconds.
> On July 4th we were confronted with an incident in production where a lot of 
> our client processes reported problems regarding the idle timeouts. These 
> client processes were already running stable for more than 3 weeks. The 
> problem appeared in two flavors:
>  # Transport error “error: amqp:resource-limit-exceeded: local-idle-timeout 
> expired”
>  # epoll proactor failure in epoll_timer.c:263: “idle timeout sequencing 
> error”
> On each VM at least 3 processes showed one of these problems in a total time 
> window of less than a minute. We haven’t found any cause in the underlying 
> hardware, hypervisor, network or operating system until now.
> Although we don’t know the root cause of the problems, we can solve the first 
> situation by using the proper reconnect settings (by mistake we handled 
> on_transport_error() as a fatal situation and will correct that so that only 
> on_transport_close() will be handled as fatal). However the second situation 
> is more odd because it results in an abort within proton itself. The comments 
> in epoll_timer.c explain that this error occurs when a connection timer is 
> moved backwards a second time. We don’t understand how this can happen 
> suddenly.
>  
> Last Sunday the problem occurred again on two more production sites where our 
> software had been operational for just over 3 weeks. And again it happened on 
> all VMs within a short timeframe. It's interesting that so far it has only 
> occurred on Sunday mornings. Maybe it has something to do with how long the 
> software has been running and the fact that on Sunday mornings there is less 
> messaging traffic, i.e. more heartbeats?
>  
> Unfortunately we haven't been able to reproduce the issue at our test 
> facilities and hence can not provide a reproducer.






[jira] [Updated] (PROTON-2411) Simultaneous idle timeout sequencing errors

2021-08-03 Thread Clifford Jansen (Jira)


 [ 
https://issues.apache.org/jira/browse/PROTON-2411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Clifford Jansen updated PROTON-2411:

Attachment: p2411_0.diff

> Simultaneous idle timeout sequencing errors
> ---
>
> Key: PROTON-2411
> URL: https://issues.apache.org/jira/browse/PROTON-2411
> Project: Qpid Proton
>  Issue Type: Bug
>  Components: proton-c
>Affects Versions: proton-c-0.34.0
>Reporter: Jaap Wiggelinkhuizen
>Priority: Critical
> Attachments: p2411_0.diff
>
>
> In our mission critical software we use Qpid proton 0.34.0 in our C++-client 
> software together with the Qpid dispatch router 1.16.0. We updated to these 
> versions not so long ago; before that we used proton 0.25.0 and dispatch 
> 1.3.0. Our application runs on several VMs with a router on each VM. All 
> clients connect to the local router only and the routers connect to each 
> other in a hub-and-spoke pattern. In both the client configuration and the 
> router configuration we have configured an idle timeout of 30 seconds.
> On July 4th we were confronted with an incident in production where a lot of 
> our client processes reported problems regarding the idle timeouts. These 
> client processes were already running stable for more than 3 weeks. The 
> problem appeared in two flavors:
>  # Transport error “error: amqp:resource-limit-exceeded: local-idle-timeout 
> expired”
>  # epoll proactor failure in epoll_timer.c:263: “idle timeout sequencing 
> error”
> On each VM at least 3 processes showed one of these problems in a total time 
> window of less than a minute. We haven’t found any cause in the underlying 
> hardware, hypervisor, network or operating system until now.
> Although we don’t know the root cause of the problems, we can solve the first 
> situation by using the proper reconnect settings (by mistake we handled 
> on_transport_error() as a fatal situation and will correct that so that only 
> on_transport_close() will be handled as fatal). However the second situation 
> is more odd because it results in an abort within proton itself. The comments 
> in epoll_timer.c explain that this error occurs when a connection timer is 
> moved backwards a second time. We don’t understand how this can happen 
> suddenly.
>  
> Last Sunday the problem occurred again on two more production sites where our 
> software had been operational for just over 3 weeks. And again it happened on 
> all VMs within a short timeframe. It's interesting that so far it has only 
> occurred on Sunday mornings. Maybe it has something to do with how long the 
> software has been running and the fact that on Sunday mornings there is less 
> messaging traffic, i.e. more heartbeats?
>  
> Unfortunately we haven't been able to reproduce the issue at our test 
> facilities and hence can not provide a reproducer.






[jira] [Commented] (PROTON-2403) libuv based proactor test errors

2021-06-26 Thread Clifford Jansen (Jira)


[ 
https://issues.apache.org/jira/browse/PROTON-2403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17370125#comment-17370125
 ] 

Clifford Jansen commented on PROTON-2403:
-

pn2403_0.diff works on Fedora 34 and libuv-1.41.0-1.

 

Doc for read_start() indicates a change in behaviour in V1.38, but the 
specified change does not explain why the proactor code was working with 
versions earlier than 1.38.

 

TBD whether this is a general fix or whether version-dependent code needs to 
be created.

> libuv based proactor test errors
> 
>
> Key: PROTON-2403
> URL: https://issues.apache.org/jira/browse/PROTON-2403
> Project: Qpid Proton
>  Issue Type: Bug
>  Components: proton-c
>Affects Versions: proton-c-0.34.0
> Environment: Builds using the libuv proactor.
>Reporter: Clifford Jansen
>Assignee: Clifford Jansen
>Priority: Major
> Attachments: pn2403_0.diff
>
>
> New test failures are seen with recent versions of libuv.  At least starting 
> with version 1.41 of libuv and perhaps earlier.






[jira] [Updated] (PROTON-2403) libuv based proactor test errors

2021-06-26 Thread Clifford Jansen (Jira)


 [ 
https://issues.apache.org/jira/browse/PROTON-2403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Clifford Jansen updated PROTON-2403:

Attachment: pn2403_0.diff

> libuv based proactor test errors
> 
>
> Key: PROTON-2403
> URL: https://issues.apache.org/jira/browse/PROTON-2403
> Project: Qpid Proton
>  Issue Type: Bug
>  Components: proton-c
>Affects Versions: proton-c-0.34.0
> Environment: Builds using the libuv proactor.
>Reporter: Clifford Jansen
>Assignee: Clifford Jansen
>Priority: Major
> Attachments: pn2403_0.diff
>
>
> New test failures are seen with recent versions of libuv.  At least starting 
> with version 1.41 of libuv and perhaps earlier.






[jira] [Created] (PROTON-2403) libuv based proactor test errors

2021-06-26 Thread Clifford Jansen (Jira)
Clifford Jansen created PROTON-2403:
---

 Summary: libuv based proactor test errors
 Key: PROTON-2403
 URL: https://issues.apache.org/jira/browse/PROTON-2403
 Project: Qpid Proton
  Issue Type: Bug
  Components: proton-c
Affects Versions: proton-c-0.34.0
 Environment: Builds using the libuv proactor.
Reporter: Clifford Jansen
Assignee: Clifford Jansen


New test failures are seen with recent versions of libuv.  At least starting 
with version 1.41 of libuv and perhaps earlier.






[jira] [Updated] (PROTON-2397) Update default client TLS defaults for verifying outbound connections to AMQP servers.

2021-06-18 Thread Clifford Jansen (Jira)


 [ 
https://issues.apache.org/jira/browse/PROTON-2397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Clifford Jansen updated PROTON-2397:

Description: 
Proton C and its associated bindings do not have consistent default client side 
TLS configuration. Proton libraries will be changed on a per-language/binding 
basis so that all clients verify the server's certificate and identifying name 
by default, i.e. to use PN_SSL_VERIFY_PEER_NAME unless the application takes 
steps to change the desired level of authentication.

This default behaviour is required for the Proton libraries to be compliant 
with the TLS specification 1.3 (RFC 8446). Such compliance is obviously highly 
desirable now and will become mandatory in the future.

C++ applications will not be affected (this is the existing default).

C, Python, Ruby and Go applications that fully configure their client 
connections are also unaffected.

Python programs that use MESSAGING_CONNECT_FILE (or the connect.json 
equivalent) are unaffected.

Proton applications that do not make outbound connections are unaffected.

All other applications may run into stricter verification policies that cause 
previously successful TLS negotiations to now fail. These applications will 
need to either:

- explicitly downgrade the verification mechanism of outgoing connections to 
the old default (PN_SSL_ANONYMOUS_PEER)

- update server certificates and/or client trusted root CA's as required to 
work in the full PN_SSL_VERIFY_PEER_NAME verification mode.

> Update default client TLS defaults for verifying outbound connections to AMQP 
> servers.
> --
>
> Key: PROTON-2397
> URL: https://issues.apache.org/jira/browse/PROTON-2397
> Project: Qpid Proton
>  Issue Type: Improvement
>  Components: cpp-binding, go-binding, proton-c, python-binding, 
> ruby-binding
>Affects Versions: proton-c-0.34.0
>Reporter: Clifford Jansen
>Assignee: Clifford Jansen
>Priority: Major
> Fix For: proton-c-0.35.0
>
>
> Proton C and its associated bindings do not have consistent default client 
> side TLS configuration. Proton libraries will be changed on a 
> per-language/binding basis so that all clients verify the server's 
> certificate and identifying name by default, i.e. to use 
> PN_SSL_VERIFY_PEER_NAME unless the application takes steps to change the 
> desired level of authentication.
> This default behaviour is required for the Proton libraries to be compliant 
> with the TLS specification 1.3 (RFC 8446). Such compliance is obviously 
> highly desirable now and will become mandatory in the future.
> C++ applications will not be affected (this is the existing default).
> C, Python, Ruby and Go applications that fully configure their client 
> connections are also unaffected.
> Python programs that use MESSAGING_CONNECT_FILE (or the connect.json 
> equivalent) are unaffected.
> Proton applications that do not make outbound connections are unaffected.
> All other applications may run into stricter verification policies that cause 
> previously successful TLS negotiations to now fail. These applications will 
> need to either:
> - explicitly downgrade the verification mechanism of outgoing connections to 
> the old default (PN_SSL_ANONYMOUS_PEER)
> - update server certificates and/or client trusted root CA's as required to 
> work in the full PN_SSL_VERIFY_PEER_NAME verification mode.






[jira] [Updated] (PROTON-2397) Update default client TLS defaults for verifying outbound connections to AMQP servers.

2021-06-18 Thread Clifford Jansen (Jira)


 [ 
https://issues.apache.org/jira/browse/PROTON-2397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Clifford Jansen updated PROTON-2397:

Environment: (was: Proton C and its associated bindings do not have 
consistent default client side TLS configuration.  Proton libraries will be 
changed on a per-language/binding basis so that all clients verify the server's 
certificate and identifying name by default, i.e. to use 
PN_SSL_VERIFY_PEER_NAME unless the application takes steps to change the 
desired level of authentication.

This default behaviour is required for the Proton libraries to be compliant 
with the TLS specification 1.3 (RFC 8446).  Such compliance is obviously highly 
desirable now and will become mandatory in the future.

C++ applications will not be affected (this is the existing default).

C, Python, Ruby and Go applications that fully configure their client 
connections are also unaffected.

Python programs that use MESSAGING_CONNECT_FILE (or the connect.json 
equivalent) are unaffected.

Proton applications that do not make outbound connections are unaffected.

All other applications may run into stricter verification policies that cause 
previously successful TLS negotiations to now fail.  These applications will 
need to either:

  - explicitly downgrade the verification mechanism of outgoing connections to 
the old default (PN_SSL_ANONYMOUS_PEER)

  - update server certificates and/or client trusted root CA's as required to 
work in the full PN_SSL_VERIFY_PEER_NAME verification mode.
)

> Update default client TLS defaults for verifying outbound connections to AMQP 
> servers.
> --
>
> Key: PROTON-2397
> URL: https://issues.apache.org/jira/browse/PROTON-2397
> Project: Qpid Proton
>  Issue Type: Improvement
>  Components: cpp-binding, go-binding, proton-c, python-binding, 
> ruby-binding
>Affects Versions: proton-c-0.34.0
>Reporter: Clifford Jansen
>Assignee: Clifford Jansen
>Priority: Major
> Fix For: proton-c-0.35.0
>
>






