Hi Grant,

> Hey Emeric,
>
> Thank you very much for the information. Hopefully the s_server + qat issue
> could be addressed soon.
>
> Regards,
>
> Grant

Intel's engineers told me that the bug is related to the PRF and asked me to recompile the engine using '--disable_qat_prf'. With that change I can run some tests with the qat engine, but I'm facing stability issues:

[root@centos haproxy]# /usr/local/ssl/bin/openssl speed -engine qat -elapsed -async_jobs 8 rsa2048
[WARNING][e_qat.c:1531:bind_qat()] QAT Warnings enabled.
engine "qat" set.
You have chosen to measure elapsed time instead of user CPU time.
Doing 2048 bit private rsa's for 10s: 13442 2048 bit private RSA's in 10.01s
Doing 2048 bit public rsa's for 10s: 290503 2048 bit public RSA's in 10.00s
OpenSSL 1.1.0e 16 Feb 2017
built on: reproducible build, date unspecified
options:bn(64,64) rc4(16x,int) des(int) aes(partial) idea(int) blowfish(ptr)
compiler: gcc -DDSO_DLFCN -DHAVE_DLFCN_H -DNDEBUG -DOPENSSL_THREADS -DOPENSSL_NO_STATIC_ENGINE -DOPENSSL_PIC -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_MONT5 -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DRC4_ASM -DMD5_ASM -DAES_ASM -DVPAES_ASM -DBSAES_ASM -DGHASH_ASM -DECP_NISTZ256_ASM -DPADLOCK_ASM -DPOLY1305_ASM -DOPENSSLDIR="\"/usr/local/ssl/ssl\"" -DENGINESDIR="\"/usr/local/ssl/lib/engines-1.1\"" -Wa,--noexecstack
                  sign    verify    sign/s verify/s
rsa 2048 bits 0.000745s 0.000034s   1342.9  29050.3

A benchmark using haproxy with the qat engine stalls at ~450 connections/sec.

After I stop the injection, the haproxy process keeps consuming CPU while doing nothing (top shows ~50% of one core, mainly in user). Here is the trace:

[root@centos ~]# strace -p 27085
Process 27085 attached
epoll_wait(3, {}, 200, 1000) = 0
epoll_wait(3, {}, 200, 1000) = 0
epoll_wait(3, {}, 200, 1000) = 0
epoll_wait(3, {}, 200, 1000) = 0
epoll_wait(3, {}, 200, 1000) = 0

epoll_wait wakes up once per second, which seems normal.

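For comparison, here is a small sketch, unrelated to haproxy's actual code, of how a level-triggered epoll loop behaves. With nothing ready, each wait returns empty after its timeout, like the trace above. In the opposite case, where an fd is reported readable and nobody ever drains it, the wait returns immediately every time. The socketpair, the function names and the short timeouts are my own choices for illustration; this only reproduces the syscall pattern and is not a diagnosis.

```python
import select
import socket
import time


def count_wakeups(ep, duration_s, timeout_s):
    """Poll repeatedly for duration_s seconds; return (wakeups, events_seen)."""
    wakeups = 0
    events_seen = 0
    deadline = time.monotonic() + duration_s
    while time.monotonic() < deadline:
        events = ep.poll(timeout_s)  # level-triggered by default
        wakeups += 1
        events_seen += len(events)
        # Deliberately never read from the fd, so readiness is never cleared.
    return wakeups, events_seen


def demo():
    a, b = socket.socketpair()
    ep = select.epoll()  # Linux only
    ep.register(a.fileno(), select.EPOLLIN)
    try:
        # Nothing to read: every poll sleeps for the full timeout.
        idle = count_wakeups(ep, duration_s=0.3, timeout_s=0.1)
        # Make 'a' readable and leave the byte unread.
        b.send(b"x")
        busy = count_wakeups(ep, duration_s=0.3, timeout_s=0.1)
    finally:
        ep.close()
        a.close()
        b.close()
    return idle, busy


if __name__ == "__main__":
    idle, busy = demo()
    print("idle (wakeups, events):", idle)
    print("busy (wakeups, events):", busy)
```

In the idle phase the loop wakes only a handful of times and sees no events. In the busy phase it spins and sees the same event on every wakeup.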
If I keep injecting while reusing the same key (session resumption, no RSA computation), I observe ~1500 connections/sec.

But after I stop the injection, the process consumes 156% of CPU while doing nothing (core 1: 20% in user and 80% in system; core 2: 76% in user). Here is the trace:

epoll_wait(3, {{EPOLLIN|EPOLLRDHUP, {u32=57, u64=57}}, {EPOLLIN|EPOLLRDHUP, {u32=56, u64=56}}, {EPOLLIN|EPOLLRDHUP, {u32=55, u64=55}}, {EPOLLIN|EPOLLRDHUP, {u32=54, u64=54}}, {EPOLLIN|EPOLLRDHUP, {u32=53, u64=53}}, {EPOLLIN|EPOLLRDHUP, {u32=52, u64=52}}, {EPOLLIN|EPOLLRDHUP, {u32=51, u64=51}}, {EPOLLIN|EPOLLRDHUP, {u32=50, u64=50}}, {EPOLLIN|EPOLLRDHUP, {u32=49, u64=49}}, {EPOLLIN|EPOLLRDHUP, {u32=48, u64=48}}, {EPOLLIN|EPOLLRDHUP, {u32=47, u64=47}}, {EPOLLIN|EPOLLRDHUP, {u32=45, u64=45}}, {EPOLLIN|EPOLLRDHUP, {u32=42, u64=42}}, {EPOLLIN|EPOLLRDHUP, {u32=44, u64=44}}, {EPOLLIN|EPOLLRDHUP, {u32=43, u64=43}}}, 200, 1000) = 15
epoll_wait(3, {{EPOLLIN|EPOLLRDHUP, {u32=57, u64=57}}, {EPOLLIN|EPOLLRDHUP, {u32=56, u64=56}}, {EPOLLIN|EPOLLRDHUP, {u32=55, u64=55}}, {EPOLLIN|EPOLLRDHUP, {u32=54, u64=54}}, {EPOLLIN|EPOLLRDHUP, {u32=53, u64=53}}, {EPOLLIN|EPOLLRDHUP, {u32=52, u64=52}}, {EPOLLIN|EPOLLRDHUP, {u32=51, u64=51}}, {EPOLLIN|EPOLLRDHUP, {u32=50, u64=50}}, {EPOLLIN|EPOLLRDHUP, {u32=49, u64=49}}, {EPOLLIN|EPOLLRDHUP, {u32=48, u64=48}}, {EPOLLIN|EPOLLRDHUP, {u32=47, u64=47}}, {EPOLLIN|EPOLLRDHUP, {u32=45, u64=45}}, {EPOLLIN|EPOLLRDHUP, {u32=42, u64=42}}, {EPOLLIN|EPOLLRDHUP, {u32=44, u64=44}}, {EPOLLIN|EPOLLRDHUP, {u32=43, u64=43}}}, 200, 1000) = 15
epoll_wait(3, {{EPOLLIN|EPOLLRDHUP, {u32=57, u64=57}}, {EPOLLIN|EPOLLRDHUP, {u32=56, u64=56}}, {EPOLLIN|EPOLLRDHUP, {u32=55, u64=55}}, {EPOLLIN|EPOLLRDHUP, {u32=54, u64=54}}, {EPOLLIN|EPOLLRDHUP, {u32=53, u64=53}}, {EPOLLIN|EPOLLRDHUP, {u32=52, u64=52}}, {EPOLLIN|EPOLLRDHUP, {u32=51, u64=51}}, {EPOLLIN|EPOLLRDHUP, {u32=50, u64=50}}, {EPOLLIN|EPOLLRDHUP, {u32=49, u64=49}}, {EPOLLIN|EPOLLRDHUP, {u32=48, u64=48}}, {EPOLLIN|EPOLLRDHUP, {u32=47, u64=47}}, {EPOLLIN|EPOLLRDHUP, {u32=45, u64=45}}, {EPOLLIN|EPOLLRDHUP, {u32=42, u64=42}}, {EPOLLIN|EPOLLRDHUP, {u32=44, u64=44}}, {EPOLLIN|EPOLLRDHUP, {u32=43, u64=43}}}, 200, 1000) = 15
epoll_wait(3, {{EPOLLIN|EPOLLRDHUP, {u32=57, u64=57}}, {EPOLLIN|EPOLLRDHUP, {u32=56, u64=56}}, {EPOLLIN|EPOLLRDHUP, {u32=55, u64=55}}, {EPOLLIN|EPOLLRDHUP, {u32=54, u64=54}}, {EPOLLIN|EPOLLRDHUP, {u32=53, u64=53}}, {EPOLLIN|EPOLLRDHUP, {u32=52, u64=52}}, {EPOLLIN|EPOLLRDHUP, {u32=51, u64=51}}, {EPOLLIN|EPOLLRDHUP, {u32=50, u64=50}}, {EPOLLIN|EPOLLRDHUP, {u32=49, u64=49}}, {EPOLLIN|EPOLLRDHUP, {u32=48, u64=48}}, {EPOLLIN|EPOLLRDHUP, {u32=47, u64=47}}, {EPOLLIN|EPOLLRDHUP, {u32=45, u64=45}}, {EPOLLIN|EPOLLRDHUP, {u32=42, u64=42}}, {EPOLLIN|EPOLLRDHUP, {u32=44, u64=44}}, {EPOLLIN|EPOLLRDHUP, {u32=43, u64=43}}}, 200, 1000) = 15

epoll_wait wakes up in a very fast loop, returning the same 15 events each time.

Once this point is reached, restarting the injection sometimes crashes haproxy with a segfault.

Here is my haproxy config:

global
    tune.ssl.default-dh-param 2048
    ssl-engine qat
    ssl-async

listen gg
    mode http
    bind 0.0.0.0:9443 ssl crt /root/2048.pem ciphers AES
    redirect location /

R,
Emeric