Re: resumption problem
On Fri, Mar 27, 2020 at 10:10:16PM +, Jeremy Harris wrote: > >> A simple code addition to avoid that call in the client case sounds > >> in order. > > Testing, it appears to work - I get resumption and not that error. > And the Exim testsuite shows no regressions, at least on my laptop > (which is Fedora 31, with 1.1.1d). On a Fedora 31 system I also don't see those directives in the system openssl.cnf or includes. Mind you, closer inspection of the code suggests that in the config file also "RequestCAPath" and "ClientCAPath" would result in setting the bidirectional CA list. But I don't find those either. > >>> Another possibility is that your system-wide openssl.cnf file has a > >>> "RequestCAFile" or "ClientCAFile" setting. > >> > >> Neither appears to be present in /etc/pki/tls/openssl.cnf > > > > And neither has any ".include" directives? So my best guess is that you were testing with approximately a stock 1.1.1 that predates 1.1.1a, modulo security fixes. Otherwise, it is unclear how the client CA list (server -> client) ended up being sent from client -> server. -- Viktor.
Re: resumption problem
On 27/03/2020 21:52, Viktor Dukhovni wrote: > On Fri, Mar 27, 2020 at 09:25:28PM +, Jeremy Harris wrote: > >>> If the distro started with 1.1.1 and only backported security fixes, you >>> could be running an OpenSSL version with the unintentional bidirectional >>> setting. >> >> .. either this, or even an unpatched basic 1.1.1 . >> >> A simple code addition to avoid that call in the client case sounds >> in order. Testing, it appears to work - I get resumption and not that error. And the Exim testsuite shows no regressions, at least on my laptop (which is Fedora 31, with 1.1.1d). >>> Another possibility is that your system-wide openssl.cnf file has a >>> "RequestCAFile" or "ClientCAFile" setting. >> >> Neither appears to be present in /etc/pki/tls/openssl.cnf > > And neither has any ".include" directives? One, but that file doesn't have either string, either. -- Cheers, Jeremy
Re: resumption problem
On Fri, Mar 27, 2020 at 09:25:28PM +, Jeremy Harris wrote: > > If the distro started with 1.1.1 and only backported security fixes, you > > could be running an OpenSSL version with the unintentional bidirectional > > setting. > > .. either this, or even an unpatched basic 1.1.1 . > > A simple code addition to avoid that call in the client case sounds > in order. Would the above likely explain the error I'm getting? You could explicitly set the client CA list to an empty stack, as a final step in initializing the SSL_CTX: SSL_CTX_set0_CA_list(ctx, NULL); > > Another possibility is that your system-wide openssl.cnf file has a > > "RequestCAFile" or "ClientCAFile" setting. > > Neither appears to be present in /etc/pki/tls/openssl.cnf And neither has any ".include" directives? -- Viktor.
Re: resumption problem
On 27/03/2020 21:07, Viktor Dukhovni wrote: > That function should only affect the server -> client direction. > Briefly, in OpenSSL 1.1.1 it affected both the client and server > directions, but this was fixed in OpenSSL 1.1.1a. If Centos is following the same pattern in 8 as they did in 7, they do list the letter when there is one; I have a 7 system claiming "1.0.2k-fips". So: > If the distro started with 1.1.1 and only backported security fixes, you > could be running an OpenSSL version with the unintentional bidirectional > setting. .. either this, or even an unpatched basic 1.1.1 . A simple code addition to avoid that call in the client case sounds in order. Would the above likely explain the error I'm getting? > Another possibility is that your system-wide openssl.cnf file has a > "RequestCAFile" or "ClientCAFile" setting. Neither appears to be present in /etc/pki/tls/openssl.cnf -- Cheers, Jeremy
Re: resumption problem
On Fri, Mar 27, 2020 at 08:20:55PM +, Jeremy Harris wrote: > > Right, you're running out of space by trying to send too many > > CA names. It is better to have this fail, so you can figure > > what is trying to dump your entire trusted CA list (of names) > > to the server, than to actually have that silently "work". > > It's /etc/pki/tls/certs/ca-bundle.crt (on Centos 8) and > it's being loaded using SSL_CTX_set_client_CA_list() > into the client, supposedly to be the acceptable CAs > for the cert sent by the server. That function should only affect the server -> client direction. Briefly, in OpenSSL 1.1.1 it affected both the client and server directions, but this was fixed in OpenSSL 1.1.1a. If the distro started with 1.1.1 and only backported security fixes, you could be running an OpenSSL version with the unintentional bidirectional setting. Another possibility is that your system-wide openssl.cnf file has a "RequestCAFile" or "ClientCAFile" setting. Both unfortunately set not only the server->client direction, but also the client->server direction. This may be a bug (oversight), the second of these should not touch the bidirectional list. "ssl/ssl_conf.c" line 500: static int cmd_RequestCAFile(SSL_CONF_CTX *cctx, const char *value) { if (cctx->canames == NULL) cctx->canames = sk_X509_NAME_new_null(); if (cctx->canames == NULL) return 0; return SSL_add_file_cert_subjects_to_stack(cctx->canames, value); } static int cmd_ClientCAFile(SSL_CONF_CTX *cctx, const char *value) { return cmd_RequestCAFile(cctx, value); } "ssl/ssl_conf.c" line 883: int SSL_CONF_CTX_finish(SSL_CONF_CTX *cctx) { ... if (cctx->canames) { if (cctx->ssl) SSL_set0_CA_list(cctx->ssl, cctx->canames); else if (cctx->ctx) SSL_CTX_set0_CA_list(cctx->ctx, cctx->canames); else sk_X509_NAME_pop_free(cctx->canames, X509_NAME_free); cctx->canames = NULL; } return 1; } Can you confirm whether any such directive is present in your system-wide openssl.cnf (possibly via .include)? > > Perhaps your "openssl.cnf" (Linux distro mistake?) causes the > > damage for all applications even without explicit code to that end > > in Exim? Or you're calling something to make it happen. > > The Exim config is saying to use that file, so I do have a workaround > point there. It is actually not easy to avoid using the system-wide config. It is loaded by default during library initialization, and you need custom initialization calls to avoid loading it. > However, the question is: what is [1] the proper fix? We want to > accept "the usual, pretty big, set of well-known CAs". Is that an > incorrect way of telling OpenSSL that list? It makes some sense to send a CA list to clients, but ideally one that is not too large. In Postfix I recommend a small or no CAfile (from which the CAlist is initialized), and the bulk of the trust store configured via a CApath (which is not used to send CA names to clients). Most clients (real MTAs, not random Java apps) handle an empty CA list just fine, and use the (at most) one client cert they have. > And how come it only fails under resumption and when a client-cert is > also used? Here, I am not sure, but perhaps this is because the extension is only valid with TLS 1.3, and on resumption we already know the server protocol version. On initial handshakes the protocol might be TLS 1.2, and perhaps as a result the list is not sent??? > 1] We also need to code to be compatible across a wide range > of OpenSSL versions. Minimising #ifdeffery is a factor. Understood, I'm in the same boat with Postfix. -- Viktor.
Re: resumption problem
On 26/03/2020 00:58, Viktor Dukhovni wrote: > On Thu, Mar 26, 2020 at 12:40:08AM +, Jeremy Harris wrote: > >> Looks like I'm wrong, from the behaviour. >> >> It's the second of the possible places, and "i" is 129. >> It appears to be failing the WPACKET_sub_allocate_bytes_u16() >> call. %rsi before the call, which I think should be >> the "namelen" arg, has value 172. > > Right, you're running out of space by trying to send too many > CA names. It is better to have this fail, so you can figure > what is trying to dump your entire trusted CA list (of names) > to the server, than to actually have that silently "work". It's /etc/pki/tls/certs/ca-bundle.crt (on Centos 8) and it's being loaded using SSL_CTX_set_client_CA_list() into the client, supposedly to be the acceptable CAs for the cert sent by the server. The load call doesn't have a return value to check. The list presented had 133 entries. > Perhaps your > "openssl.cnf" (Linux distro mistake?) causes the damage > for all applications even without explicit code to that > end in Exim? Or you're calling something to make it happen. The Exim config is saying to use that file, so I do have a workaround point there. However, the question is: what is [1] the proper fix? We want to accept "the usual, pretty big, set of well-known CAs". Is that an incorrect way of telling OpenSSL that list? And how come it only fails under resumption and when a client-cert is also used? 1] We also need to code to be compatible across a wide range of OpenSSL versions. Minimising #ifdeffery is a factor. -- Cheers, Jeremy
Re: Certificate subject match validation
On Fri, Mar 27, 2020 at 07:38:35PM +0200, George-Theodor Serbana wrote: > I am writing a SSL/TLS client (using Boost.Beast but underlying it's using > OpenSSL) and although I have set on the SSL context the 'verify_peer' flag, > there is no verification to prove the server presents an X509 which > contains in the Subject Alternative Names the hostname of that server. > > As this is probably the dumbest type of attack someone could do (using a > valid certificate with another domain name), I am thinking I'm doing > something wrong. But from the documentation, I saw that using "verify_peer" > should perform all the verifications... It verifies the trust chain. To also verify the peer name, you need to specify the peer name via: SSL_set1_host() > Now if not even this simple check is being done, how about expiration of > the certificate, revocation status and other checks? Should they be > performed manually as well? No, that's what VERIFY_PEER is for. > For now I am using X509_VERIFY_PARAM_set1_host with SSL_CTX_set1_param to > do this specific check. That's the slightly less convenient legacy API from OpenSSL 1.0.2. In 1.1.0 and later, you can use SSL_set1_host() (and in some cases also SSL_add1_host()). See the SSL_set1_host(3) manpage for details. -- Viktor.
Certificate subject match validation
I am writing a SSL/TLS client (using Boost.Beast but underlying it's using OpenSSL) and although I have set on the SSL context the 'verify_peer' flag, there is no verification to prove the server presents an X509 which contains in the Subject Alternative Names the hostname of that server. As this is probably the dumbest type of attack someone could do (using a valid certificate with another domain name), I am thinking I'm doing something wrong. But from the documentation, I saw that using "verify_peer" should perform all the verifications... Now if not even this simple check is being done, how about expiration of the certificate, revocation status and other checks? Should they be performed manually as well? For now I am using X509_VERIFY_PARAM_set1_host with SSL_CTX_set1_param to do this specific check. Best regards, Theodor
Re: Explicit thread cleanup in OpenSSL 1.1.1 possible?
Hello Michael, Am 27.03.20 um 15:46 schrieb Michael Wojcik: As a workaround, what about first making a JNI call to a trivial shared object that does an explicit dlopen of the OpenSSL shared object? The JVM wouldn't know about that load, and its subsequent unload of the shared object wouldn't remove it from the process address space, because there would still be a reference to it. thanks for the suggestion. This sounds similar to the trick that is already built into OpenSSL. Through other channels I found now two other possible solutions for the problem: 1) Use JNI_OnUnload() to call OPENSSL_cleanup() JNI_OnUnload() is called when the JVM unloads the JNI library. OPENSSL_cleanup() cleans up everything and no further per-thread cleanup happens after unloading the JNI shared library. 2) Use a C++ static object in the JNI library where the destructor calls OPENSSL_cleanup() When debugging the problem I came across this neat trick on stackoverflow: https://stackoverflow.com/a/11394263 class ResourceHolder { public: ResourceHolder() { // at start } ~ResourceHolder() { // at exit } }; ResourceHolder theHolder; I think that putting a call to OPENSSL_cleanup() into the destructor of the ResourceHolder class would also fix the problem, and it would be Java-agnostic. Approach 1) does fix the crash for me. If approach 2) works as well, I will use that one. -- Stephan
Re: Explicit thread cleanup in OpenSSL 1.1.1 possible?
As a workaround, what about first making a JNI call to a trivial shared object that does an explicit dlopen of the OpenSSL shared object? The JVM wouldn't know about that load, and its subsequent unload of the shared object wouldn't remove it from the process address space, because there would still be a reference to it.
Explicit thread cleanup in OpenSSL 1.1.1 possible?
Hi, with OpenSSL 1.1.1 it is possible to turn off the automatic cleanup with an atexit() handler by passing the flag OPENSSL_INIT_NO_ATEXIT to OPENSSL_init_crypto(). Is it possible to configure this also at the thread level, so that no automatic thread cleanup occurs, with the option to do an explicit per-thread cleanup? I looked at documentation and source code of OpenSSL 1.1.1 and this seems not to be possible, but I wanted to ask nevertheless. Background: We are embedding OpenSSL in a Java JNI library, and one particular JVM (IBM J9 on z/OS) loads our Java class and the JNI library in a dedicated thread. At the end of the program the JVM then unloads the the Java class and the JNI library before the thread terminates, and when the OpenSSL thread cleanup functions are called the JVM crashes because the shared library is gone from the address space. This happens because the pinning of the OpenSSL shared library does not work on z/OS. -- Stephan