Hello Zoltan Borok-Nagy, Wenzhe Zhou, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/19157

to look at the new patch set (#2).

Change subject: IMPALA-11674: Fix timeout detection for TSSLSocket
......................................................................

IMPALA-11674: Fix timeout detection for TSSLSocket

Functions IsPeekTimeoutTException() and IsReadTimeoutTException() in
be/src/rpc/thrift-util.cc make assumptions about the implementation of
read(), peek(), write() and write_partial() in TSocket.cpp and
TSSLSocket.cpp. The functions read() and peek() in TSSLSocket.cpp were
changed in Thrift versions 0.11.0 and 0.16.0 to throw different
exceptions on timeout. This causes IsPeekTimeoutTException() and
IsReadTimeoutTException() to return the wrong value after a Thrift
upgrade, which in turn causes TAcceptQueueServer::Peek() to rethrow the
exception to its caller, TAcceptQueueServer::run(). That caller then
closes the connection, ignoring the idle_session_timeout query option.

The issue was reproducible through the following scenario:

1. From the local development environment, start the Impala cluster
   with SSL enabled and idle_client_poll_period_s set to 5 seconds.

   export CERT_DIR="$IMPALA_HOME/be/src/testutil"

   export SSL_ARGS="--ssl_client_ca_certificate=$CERT_DIR/server-cert.pem --ssl_server_certificate=$CERT_DIR/server-cert.pem --ssl_private_key=$CERT_DIR/server-key.pem --hostname=localhost"

   ./bin/start-impala-cluster.py --state_store_args="$SSL_ARGS" \
     --catalogd_args="$SSL_ARGS" \
     --impalad_args="$SSL_ARGS --idle_client_poll_period_s=5"

2. Run impala-shell with a higher idle_session_timeout query option.

   impala-shell.sh --ssl -Q idle_session_timeout=100

3. Run a simple query like "show databases" and rerun it after 15
   seconds have passed.

The second query run will fail with the following error message in
impala-shell:

[localhost:21050] default> show databases;
Caught exception TLS/SSL connection has been closed (EOF) (_ssl.c:1829), type=<class 'ssl.SSLZeroReturnError'> in CloseSession.
Warning: close session RPC failed: TLS/SSL connection has been closed (EOF) (_ssl.c:1829), <class 'ssl.SSLZeroReturnError'>

This patch fixes the expected error messages in IsReadTimeoutTException
and IsPeekTimeoutTException so that they correctly detect timeout errors
from TSSLSocket. Additionally, this patch fixes a typo in
NEW_THRIFT_VERSION_MSG.

Testing:
- Redo the scenario manually and confirm that the second query run
  completes without error.
- Add test_thrift_socket.py to begin verifying the
  IsPeekTimeoutTException function.

Change-Id: I6ad168a1c96d751a3c50d924e6ecaf6404e589ab
---
M be/src/rpc/TAcceptQueueServer.cpp
M be/src/rpc/thrift-util.cc
A tests/custom_cluster/test_thrift_socket.py
3 files changed, 116 insertions(+), 9 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/57/19157/2
--
To view, visit http://gerrit.cloudera.org:8080/19157
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I6ad168a1c96d751a3c50d924e6ecaf6404e589ab
Gerrit-Change-Number: 19157
Gerrit-PatchSet: 2
Gerrit-Owner: Riza Suminto <riza.sumi...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Riza Suminto <riza.sumi...@cloudera.com>
Gerrit-Reviewer: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <borokna...@cloudera.com>