Hi All,

I'm very interested in this thread. I have two patches I'd like to toss into the ring as an incentive for more discussion on this subject. They both go the openssl ENGINE + engine_pkcs11 route that Dan has described. They are generally applicable and are being used to integrate with the PKCS#11 interface offered by nCipher's keyAuthority product[1].

Disclosure: I am working on these patches under contract for nCipher PLC.

Dan/Sander, what do you think needs to be done to accommodate the implicit "one context per process" requirement in the pkcs11 standard? Should this (IMV important) use case be allowed to sway the way mod_ssl does things?

The basic idea behind the attached patches is to leverage the openssl ENGINE api and the engine_pkcs11 implementation, from the opensc project, in order to delegate key use and management to our pkcs11 implementation. By adopting pkcs11 as our keyAuthority <=> apache boundary we are hoping to create a change set for mod_ssl that is generally useful. In this there is a slight departure from what Dan has described: SSLCertificateFile and SSLCertificateKeyFile become opaque to apache. All the semantics are in the ENGINE.

The scenario above gives 3 specific problems with apache/mod_ssl. I think they are general mod_ssl/engine issues rather than specific to one vendor, but please put me straight if you don't agree!

1. mod_ssl is married to PEM files. This blows for HSMs because we end up with grotty "fake" keys with magic values embedded in the keys' bignums.

2. The pkcs11 standard is not particularly friendly to multi-process applications (apache pre-fork & worker are affected by this). It requires that there be exactly one application library context per process [p 17 PKCS #11 v2.2 6.6.1].

3. The (elegant) asn.1 caching solution to password entry used by ssl_pphrase_Handle is, currently, not compatible with keyAuthority/HSM managed server keys.

I suspect I'm not the only person who would prefer to _not_ export generally usable "fake" keys for the benefit of applications that rely on being able to obtain a serialized representation of a key.

2. & 3. in combination are particularly nasty. 2., I believe, means we need to ensure a distinct openssl context in _each_ process.
This was the only reliable way to ensure a distinct engine instance (along with any DSOs it pulls in) for each process. 3. effectively requires that all engine implementations have an answer for cross-process key import that does not require manual intervention in each process. Note that there _are_ cases where HSM based keys do not require manual intervention on import and hence the engine's duties are not so onerous. The keyAuthority hardware endpoint is one such case.

[PATCH PR-20364][BUG 42687]

The first patch ignores 2. It disables asn.1 caching of the server key. It allows the configuration phase to proceed pretty much as normal except that SSLCertificateKeyFile and SSLCertificateFile are treated as opaque identifiers. It makes use of the LOAD_CERT_CTRL ENGINE_ctrl method defined by the opensc/engine_pkcs11 project to load the certificate and ENGINE_load_private_key for the server key.

The report: http://issues.apache.org/bugzilla/show_bug.cgi?id=42687
The patch: http://issues.apache.org/bugzilla/attachment.cgi?id=20364

Provided I run my apache with -DONE_PROCESS and use my "fixed" version of opensc's engine_pkcs11, I can quite happily establish ssl sessions based on keys that are never exposed to the apache host.

[PATCH PR-20365][BUG 42688]

The second patch incorporates most of the changes in the first. The addition is that it deals with 2. well enough for me to use the standard worker mpm without issue. The way it deals with 2. is crude. The validation of SSLCryptoDevice is allowed to proceed as normal. If it succeeds, the engine, and all openssl state that was needed to successfully load the engine, is torn down. Each child process is then required to create and configure its own distinct openssl ctx and engine handle. The mutex and session cache mechanisms are unchanged.

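A minimal sketch of the "validate in the parent, tear down, then re-initialize in each child" flow described above. It is an illustration only, not code from either patch: crypto_ctx, ctx_create, ctx_destroy, validate_then_teardown, child_init and run_children are hypothetical stand-ins, and the stubs do not call openssl or any engine.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for the per-process crypto state (openssl
 * context plus engine handle). Not a real openssl or mod_ssl type. */
typedef struct {
    pid_t owner;      /* process that created this context */
    int   engine_ok;  /* non-zero once the stub "engine" is loaded */
} crypto_ctx;

/* Stub: create a context owned by the calling process. */
static crypto_ctx *ctx_create(void)
{
    crypto_ctx *c = malloc(sizeof *c);
    if (c == NULL)
        return NULL;
    c->owner = getpid();
    c->engine_ok = 1;
    return c;
}

/* Stub: release the context and everything it pulled in. */
static void ctx_destroy(crypto_ctx *c)
{
    free(c);
}

/* Parent side: load, confirm it works, then tear everything down so
 * that no crypto state is inherited across fork(). Returns 1 on success. */
static int validate_then_teardown(void)
{
    crypto_ctx *c = ctx_create();
    int ok = (c != NULL && c->engine_ok);
    ctx_destroy(c);
    return ok;
}

/* Child side: every child builds its own context. Returns 1 if the
 * context really is owned by the calling process. */
static int child_init(void)
{
    crypto_ctx *c = ctx_create();
    int ok = (c != NULL && c->owner == getpid());
    ctx_destroy(c);
    return ok;
}

/* Fork n children one after another; return how many of them
 * initialised their own context successfully. */
static int run_children(int n)
{
    int good = 0;
    assert(n >= 0);
    for (int i = 0; i < n; i++) {
        pid_t pid = fork();
        if (pid < 0)
            break;                          /* fork failed: stop early */
        if (pid == 0)
            _exit(child_init() ? 0 : 1);    /* child reports via exit code */
        int status = 0;
        if (waitpid(pid, &status, 0) == pid &&
            WIFEXITED(status) && WEXITSTATUS(status) == 0)
            good++;
    }
    return good;
}
```

In this sketch the parent would call validate_then_teardown() once during configuration and only then call run_children(n). The point is that no context created before fork() survives into a child.
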
The report: http://issues.apache.org/bugzilla/show_bug.cgi?id=42688
The patch: http://issues.apache.org/bugzilla/attachment.cgi?id=20365

I'm still not sure what to do about "SSL_init_app_data2_idx" but ignoring it "works on my machine".

The initial pass on PR-20365 completely disabled validation of SSLCryptoDevice. I had a complete nightmare trying to "shutdown" openssl both reliably and completely. The pre-flight config system makes this a little harder and I was unsure how to get a context pointer passed into ssl_cleanup_pre_config. I remain unconvinced of the value of the validation step and it certainly adds to the complexity. Things "break" early enough if SSLCryptoDevice is spelled incorrectly for it to be obvious in the log files.

I feel disabling the asn.1 caching of keys and certs, at least for some set of configurable "ENGINES", is reasonable: with any HSM, there is going to be infrastructure on the host, alongside apache, that is better suited to dealing with the authentication and cross-process use of keys.

I'm beginning to think that, even in the conventional PEM based approach, it would be better all round if ssl_engine_pphrase.c was killed off in favor of a purely ENGINE based approach that required the ENGINE to completely deal with user auth.

Does anyone have any thoughts on how issues 1., 2. and 3. above can or should be dealt with?

Regards,

Robin Bryce

[1] nCipher's keyAuthority delivers centralized cryptographic key management and automated key distribution to security applications deployed across large numbers of network-attached end-points. For more information please see this link: http://www.ncipher.com/key_management/9/keyauthority