Thanks for the response.  I was wondering if anyone was out there :)

In the end, I may need to do encryption without taking advantage of hardware 
acceleration, but I was hoping to do some benchmark tests with and without 
hardware acceleration to see what the difference in measured performance would 
actually be.  Our project is in an embedded system, with a small ARM core 
running Linux, which has a lot of other jobs to do beyond encrypting data.  The 
CPU utilization looked like it was at 99% when we ran pure software encryption 
using openssl standalone.  We obviously are not running full throttle all of 
the time, so we may still be able to keep up.  I also still need to 
characterize the block size of data that needs to be transmitted.  The hardware 
acceleration doesn't buy much for small block sizes.  In fact, because of all 
of the system overhead on context switches, it can actually be slower than pure 
software encryption for small block sizes.  However as block sizes increase, 
there is an inflection point, where hardware acceleration really starts to kick 
in.  In the end, the value of hardware encryption will depend on block sizes, 
which will depend on customer needs that have not yet been well defined, but 
I'd love to have some headroom if we end up requiring large block sizes.
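
As a rough way to reason about that inflection point, here is a minimal sketch
of a cost model.  It assumes software time grows linearly with block size and
hardware time is a fixed per-request overhead plus a smaller linear term.  The
function name and every number in the usage note are placeholders for
illustration, not measurements from our board.

```python
def crossover_block_size(sw_rate, hw_rate, hw_overhead):
    """Estimate the block size (bytes) above which hardware wins.

    Assumed model (not measured):
        t_sw(n) = n / sw_rate
        t_hw(n) = hw_overhead + n / hw_rate
    sw_rate and hw_rate are in bytes/second, hw_overhead in seconds.
    Returns None when the hardware is never faster under this model.
    """
    if hw_rate <= sw_rate:
        return None
    # Solve n / sw_rate = hw_overhead + n / hw_rate for n.
    return hw_overhead / (1.0 / sw_rate - 1.0 / hw_rate)
```

For example, crossover_block_size(10e6, 40e6, 100e-6) models a 10 MB/s software
path, a 40 MB/s hardware path, and 100 microseconds of overhead per request.
Real values would have to come from wall-clock benchmarks on the target.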

I still have not given up on the idea of trying to get hardware encryption 
working.  I know openssl has the capability to use the hardware encryption in 
the Atmel chip, when I use it alone.  I just need to figure out the right hooks 
to get stunnel to configure openssl correctly.  I'd love to hear from anyone 
who is using hardware encryption in stunnel, even if you only send the 
configuration file you are using.  Just looking for some examples.  I have only 
seen one example of a cryptography engine in stunnel, in the default 
stunnel.conf file, but it was for a Microsoft CryptoAPI engine.  I'd love to 
see someone who used a Linux based cryptodev device.  Has anyone out there 
tried this?

Thanks,
Tamar

From: Eric Eberhard [mailto:[email protected]]
Sent: Friday, January 18, 2019 6:23 PM
To: Tamar Pedersen <[email protected]>; [email protected]
Subject: RE: [stunnel-users] How can stunnel use openssl HW cryptodev encryption

I'll give you two pieces of advice that almost everyone on the list won't agree 
with :)

1) Make a static openssl and static stunnel and locate them someplace apart 
from the standard locations (/usr/local/company_name/lib is what I use).  This 
means if anyone messes with openssl or stunnel you won't be affected - and it 
will always work as it is static - and part of your application - the user does 
not even need your libraries.  I have a personal distaste for dynamic linking 
... mostly because I have a lot of customers that update a lot of things 
(including openssl - a lot - and stunnel sometimes) ... and then wonder why 
things stop working.  A 10 year old openssl and stunnel - all static - will 
still run and work fine past all updates and user messing around.  I choose 
when I want to update openssl and stunnel (meaning I look to see if there is 
something new I need or want).  As a result I missed the keep alive and poodle 
bugs - I did not update until after both were fixed.

2) Forget hardware implementation - geez - modern computers are so darn fast 
that I cannot imagine you really need that level of "speed up" versus the grief 
you are handling.  I have customers that exchange millions (4+) of XML documents a 
day, all through stunnel, all through inetd (also not efficient supposedly - 
just reliable and always works and needs no management) - and have no problems. 
 I am using IBM p Series (AIX) and these machines even at the low level are 
fast ... but I also use some SCO and Linux and certainly with lesser volume 
they are fine as well.

3) OK - 3 is really - use inetd, so much easier and always works (assuming you 
have Unix).  If inetd crashes Unix crashes so ... see number 2 for reasons :)

Of course, these ideas won't help much if you don't have a Unix variation or if 
you are really that tight on performance (although if you are I'd suggest 
hardware upgrades!).

Good luck with your project,

Eric







From: stunnel-users [mailto:[email protected]] On Behalf Of 
Tamar Pedersen
Sent: Wednesday, January 16, 2019 1:06 PM
To: [email protected]
Subject: [stunnel-users] How can stunnel use openssl HW cryptodev encryption

Hello,
I am evaluating stunnel, to see if it is a viable solution for providing 
encryption in a system that contains an Atmel processor which includes a HW 
accelerated encryption block.  I am just ramping up on stunnel, and figured I 
should capture what I have done so far.  My questions will come towards the end 
of my email.

My research indicates that stunnel incorporates openssl.  I have been able to 
use openssl independently, to access the cryptodev HW encryption engine, in the 
Linux kernel module located in /lib/modules/4.14.79/extra/cryptodev.ko.  When 
openssl is run without accessing the cryptodev engine (cryptodev module not 
loaded), I get the pure SW encryption implementation provided by default in 
openssl.  When I run benchmark speed tests using openssl, using SW encryption, 
I see the following results:

# time -v openssl speed -evp aes-128-cbc
Doing aes-128-cbc for 3s on 16 size blocks: 1689887 aes-128-cbc's in 2.95s
Doing aes-128-cbc for 3s on 64 size blocks: 568389 aes-128-cbc's in 2.95s
Doing aes-128-cbc for 3s on 256 size blocks: 151550 aes-128-cbc's in 2.96s
Doing aes-128-cbc for 3s on 1024 size blocks: 38599 aes-128-cbc's in 2.96s
Doing aes-128-cbc for 3s on 8192 size blocks: 4845 aes-128-cbc's in 2.95s
OpenSSL 1.0.2p-fips  14 Aug 2018
built on: reproducible build, date unspecified
options:bn(64,32) rc4(ptr,char) des(idx,cisc,16,long) aes(partial) idea(int) 
blowfish(ptr)
compiler: arm-laird-linux-gnueabi-gcc -I. -I.. -I../include  -fPIC 
-DOPENSSL_PIC -DZLIB_SHARED -DZLIB -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN 
-DHAVE_DLFCN_H -D_LARGEFILE_SOURCE -D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64 
 -O3  -DHAVE_CRYPTODEV -DUSE_CRYPTODEV_DIGESTS -Wall -DOPENSSL_BN_ASM_MONT 
-DOPENSSL_BN_ASM_GF2m 
-I/home/sii/wb50n_space2_legacy_6.0.0.x/wb/buildroot/output/wb50n_space2_legacy/host/arm-buildroot-linux-gnueabi/sysroot/usr/local/ssl/fips-2.0/include
 -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DAES_ASM -DBSAES_ASM -DGHASH_ASM
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-128-cbc       9165.49k    12331.15k    13107.03k    13353.17k    13454.32k
        Command being timed: "openssl speed -evp aes-128-cbc"
        User time (seconds): 14.81
        System time (seconds): 0.10
        Percent of CPU this job got: 99%
        Elapsed (wall clock) time (h:mm:ss or m:ss): 0m 15.06s
        Average shared text size (kbytes): 0
        Average unshared data size (kbytes): 0
        Average stack size (kbytes): 0
        Average total size (kbytes): 0
        Maximum resident set size (kbytes): 13376
        Average resident set size (kbytes): 0
        Major (requiring I/O) page faults: 0
        Minor (reclaiming a frame) page faults: 145
        Voluntary context switches: 0
        Involuntary context switches: 721
        Swaps: 0
        File system inputs: 0
        File system outputs: 0
        Socket messages sent: 0
        Socket messages received: 0
        Signals delivered: 0
        Page size (bytes): 4096
        Exit status: 0
#

When I load the cryptodev module, and take advantage of the accelerated 
hardware encryption the benchmark tests are significantly faster.  Here is what 
those results look like.

# modprobe cryptodev
# time -v openssl speed -evp aes-128-cbc
Doing aes-128-cbc for 3s on 16 size blocks: 44163 aes-128-cbc's in 0.12s
Doing aes-128-cbc for 3s on 64 size blocks: 31345 aes-128-cbc's in 0.15s
Doing aes-128-cbc for 3s on 256 size blocks: 18923 aes-128-cbc's in 0.11s
Doing aes-128-cbc for 3s on 1024 size blocks: 13847 aes-128-cbc's in 0.13s
Doing aes-128-cbc for 3s on 8192 size blocks: 8427 aes-128-cbc's in 0.06s
OpenSSL 1.0.2p-fips  14 Aug 2018
built on: reproducible build, date unspecified
options:bn(64,32) rc4(ptr,char) des(idx,cisc,16,long) aes(partial) idea(int) 
blowfish(ptr)
compiler: arm-laird-linux-gnueabi-gcc -I. -I.. -I../include  -fPIC 
-DOPENSSL_PIC -DZLIB_SHARED -DZLIB -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN 
-DHAVE_DLFCN_H -D_LARGEFILE_SOURCE -D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64 
 -O3  -DHAVE_CRYPTODEV -DUSE_CRYPTODEV_DIGESTS -Wall -DOPENSSL_BN_ASM_MONT 
-DOPENSSL_BN_ASM_GF2m 
-I/home/sii/wb50n_space2_legacy_6.0.0.x/wb/buildroot/output/wb50n_space2_legacy/host/arm-buildroot-linux-gnueabi/sysroot/usr/local/ssl/fips-2.0/include
 -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DAES_ASM -DBSAES_ASM -DGHASH_ASM
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-128-cbc       5888.40k    13373.87k    44038.98k   109071.75k  1150566.40k
        Command being timed: "openssl speed -evp aes-128-cbc"
        User time (seconds): 0.59
        System time (seconds): 8.72
        Percent of CPU this job got: 61%
        Elapsed (wall clock) time (h:mm:ss or m:ss): 0m 15.11s
        Average shared text size (kbytes): 0
        Average unshared data size (kbytes): 0
        Average stack size (kbytes): 0
        Average total size (kbytes): 0
        Maximum resident set size (kbytes): 13792
        Average resident set size (kbytes): 0
        Major (requiring I/O) page faults: 0
        Minor (reclaiming a frame) page faults: 144
        Voluntary context switches: 41154
        Involuntary context switches: 3321
        Swaps: 0
        File system inputs: 0
        File system outputs: 0
        Socket messages sent: 0
        Socket messages received: 0
        Signals delivered: 0
        Page size (bytes): 4096
        Exit status: 0
#

As can be seen in the results, the time openssl reports for each aes-128-cbc 
test dropped from around 2.95 s to around 0.10 s.  Also of interest is that 
the context switches are significantly higher when running hardware 
encryption, because of interrupts and overhead to use the hardware engine.  I 
can also look at /proc/interrupts and see significant increases in atmel-aes 
interrupt counts when using the cryptodev HW acceleration encryption engine.  
This gives a good indication that the cryptodev module is in use, and is doing 
encryption.
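
One caveat on reading these numbers: by default "openssl speed" divides by CPU
user time, not wall-clock time (the -elapsed option switches it to real time).
With cryptodev most of the time is spent in the kernel, so a figure such as
1150566.40k reflects 0.06 s of user time rather than the 3 s the test ran.  The
sketch below recomputes throughput from the operation counts above, assuming
each test ran for its nominal 3 s of wall-clock time.  The constant and
function names are my own.

```python
# Operation counts copied from the two "openssl speed" runs above.
BLOCK_SIZES = [16, 64, 256, 1024, 8192]
SW_OPS = [1689887, 568389, 151550, 38599, 4845]
HW_OPS = [44163, 31345, 18923, 13847, 8427]
WALL_SECONDS = 3.0  # assumption: every test ran its nominal 3 s


def wall_throughput(ops, size, seconds=WALL_SECONDS):
    """Bytes processed per second of wall-clock time."""
    return ops * size / seconds


def compare():
    """Return (block size, software B/s, hardware B/s) for each block size."""
    return [
        (size, wall_throughput(sw, size), wall_throughput(hw, size))
        for size, sw, hw in zip(BLOCK_SIZES, SW_OPS, HW_OPS)
    ]


if __name__ == "__main__":
    for size, sw, hw in compare():
        winner = "HW" if hw > sw else "SW"
        print(f"{size:5d} bytes  SW {sw / 1e3:10.1f} kB/s  "
              f"HW {hw / 1e3:10.1f} kB/s  faster: {winner}")
```

Under that assumption, software stays ahead up to 1024-byte blocks and hardware
pulls ahead only at 8192 bytes, which fits the small-block overhead seen in the
context switch counts.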

I would like to try to figure out how to allow stunnel to take advantage of the 
cryptodev HW acceleration encryption engine available in openssl.  I have made 
some attempts, but so far, I have not been able to determine if stunnel is 
successfully using the cryptodev engine.  Here is what I have done with 
stunnel.  I already have a client and server successfully communicating with 
each other using stunnel.  To verify this I used the "nc" utility to send 
characters back and forth between two different machines.  The stunnel.conf 
file, on the server, is out of the box.  I'm interested in encrypting on the 
client side.  Here is my current client.conf file, in /etc/stunnel:

# cat client.conf
debug = 7
output = /tmp/stunnel-server.log
pid = /tmp/stunnel.pid

engine = cryptodev

[test]
verify = 1
client = yes
accept = 127.0.0.1:2000
connect = 192.168.0.220:30000
CAfile = /etc/stunnel/certificate.crt
engineNum = 1

#

I am attempting to set up the cryptodev to be the configured engine for the 
client.  I am able to start stunnel, using client.conf, as follows:

# stunnel /etc/stunnel/client.conf
#

If I do a "ps" command to display processes, I can see that stunnel is running 
in the background.  At this point, I can use "nc" to send data, as follows:

# nc 127.0.0.1 2000 < /tmp/long_file.txt

I am able to see the text from long_file.txt on the server, which is also 
running nc.  The problem is that I don't see interrupts increasing in 
/proc/interrupts, which leaves me wondering if I have not configured stunnel 
correctly to use the cryptodev engine.  If I try to remove the cryptodev module 
at this point, while stunnel is running, I receive a message that it is in use, 
as follows:

# modprobe -r cryptodev
modprobe: FATAL: Module cryptodev is in use.
#

If I kill the stunnel process, I am able to successfully remove the cryptodev 
module, which seems to suggest stunnel is the process using the cryptodev 
module.  Also, once I have removed the cryptodev module, I can't restart 
stunnel.  Instead, I get the following errors back:

# stunnel /etc/stunnel/client.conf
[.] stunnel 5.44 on arm-buildroot-linux-gnueabi platform
[.] Compiled/running with OpenSSL 1.0.2p-fips  14 Aug 2018
[.] Threading:FORK Sockets:POLL,IPv6 TLS:ENGINE,FIPS,OCSP,PSK,SNI
[ ] errno: (*__errno_location ())
[.] Reading configuration from file /etc/stunnel/client.conf
[.] UTF-8 byte order mark not detected
[ ] Enabling support for engine "cryptodev"
[!] error queue: 2606A074: error:2606A074:engine routines:ENGINE_by_id:no such 
engine
[!] error queue: 260B6084: error:260B6084:engine routines:DYNAMIC_LOAD:dso not 
found
[!] error queue: 25070067: error:25070067:DSO support routines:DSO_load:could 
not load the shared library
[!] ENGINE_by_id: 25066067: error:25066067:DSO support 
routines:DLFCN_LOAD:could not load the shared library
[!] /etc/stunnel/client.conf:5: "engine = cryptodev": Failed to open the engine
#

Again, this suggests stunnel is trying to use cryptodev.  I just don't know how 
to prove I am taking advantage of the HW encryption acceleration engine.  I 
never see interrupts updating in /proc/interrupts when using nc, while stunnel 
is running.
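
To make the interrupt check less error-prone than comparing /proc/interrupts by
eye, a small script can take a count before and after the nc transfer.  This is
only a sketch: it assumes the usual /proc/interrupts layout (a header row of
CPU names, then one counter per CPU after the IRQ label) and that the line of
interest contains the text "atmel-aes".  The function names are my own.

```python
def irq_count(interrupts_text, name="atmel-aes"):
    """Sum the per-CPU counters of every IRQ line that mentions `name`."""
    lines = interrupts_text.splitlines()
    if not lines:
        return 0
    n_cpus = len(lines[0].split())  # header row: CPU0 CPU1 ...
    total = 0
    for line in lines[1:]:
        if name not in line:
            continue
        fields = line.split()
        # fields[0] is the IRQ label (e.g. "16:"); the next n_cpus fields
        # are the per-CPU interrupt counters.
        total += sum(int(f) for f in fields[1:1 + n_cpus])
    return total


def read_irq_count(path="/proc/interrupts", name="atmel-aes"):
    """Read the live counter from the running kernel."""
    with open(path) as f:
        return irq_count(f.read(), name)
```

Calling read_irq_count() once before and once after sending long_file.txt
through the tunnel gives a delta.  A delta of zero would suggest the AES block
was not used for that transfer.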

So, here are my questions:


1.)    Does it look like I have things set up correctly in client.conf, to use 
the cryptodev engine?

2.)    If client.conf is correct, how can I prove that stunnel is using the 
cryptodev engine, since I don't see the expected interrupts?

One idea is that the cryptodev module might not support the type of encryption 
being requested by the certificate, so openssl falls back to the pure SW 
encryption implementation.   I know the Atmel chip in question supports the 
following:

# openssl engine -t -c
(cryptodev) cryptodev engine
[RSA, DSA, DH, DES-CBC, DES-EDE3-CBC, AES-128-CBC, AES-192-CBC, AES-256-CBC, 
MD5, SHA1, SHA256, SHA384, SHA512]
      [ available ]
(dynamic) Dynamic engine loading support
      [ unavailable ]
#

I was able to decode the contents of the certificate, and it says it is 
sha256WithRSAEncryption.  My engine supports SHA256 and RSA, but does it 
support combining, like SHA256WithRSA?  I'm not sure.  I'll keep chasing that 
one.
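
A related point to check: sha256WithRSAEncryption only describes how the
certificate was signed.  The bulk encryption of the tunnel is decided by the
TLS cipher suite the two ends negotiate, and the engine list above offers only
CBC-mode ciphers.  If the connection settles on an AES-GCM suite, the cryptodev
AES block would not be used for the data.  The sketch below is a heuristic of my
own that maps OpenSSL-style suite names to a bulk cipher and checks it against
the engine list.  The name parsing is an assumption based on the OpenSSL 1.0.2
naming convention, where CBC suites carry no explicit mode token.

```python
# Bulk ciphers the engine advertised in "openssl engine -t -c" above.
ENGINE_CIPHERS = {
    "DES-CBC", "DES-EDE3-CBC",
    "AES-128-CBC", "AES-192-CBC", "AES-256-CBC",
}


def bulk_cipher(suite):
    """Guess the bulk cipher from an OpenSSL cipher-suite name (heuristic)."""
    tokens = suite.upper().split("-")
    for bits in ("128", "256"):
        if "AES" + bits in tokens:
            if "GCM" in tokens:
                mode = "GCM"
            elif "CCM" in tokens or "CCM8" in tokens:
                mode = "CCM"
            else:
                mode = "CBC"  # e.g. AES128-SHA, ECDHE-RSA-AES256-SHA384
            return "AES-%s-%s" % (bits, mode)
    if "CBC3" in tokens:  # e.g. DES-CBC3-SHA
        return "DES-EDE3-CBC"
    return None  # unknown to this heuristic


def engine_can_accelerate(suite):
    """True when the suite's bulk cipher is in the engine's list."""
    return bulk_cipher(suite) in ENGINE_CIPHERS
```

The suite name to feed in would come from whatever the connection actually
negotiates, which the stunnel log at debug = 7 should report.  If it turns out
to be a GCM suite, restricting the list with the stunnel "ciphers" option to a
CBC suite such as AES128-SHA would be one experiment to try.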

Thanks for any guidance on how to use the cryptodev in stunnel.

Regards,
Tamar


_______________________________________________
stunnel-users mailing list
[email protected]
https://www.stunnel.org/cgi-bin/mailman/listinfo/stunnel-users
