Hello Zoltan Borok-Nagy, Wenzhe Zhou, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/19157

to look at the new patch set (#2).

Change subject: IMPALA-11674: Fix timeout detection for TSSLSocket
......................................................................

IMPALA-11674: Fix timeout detection for TSSLSocket

Functions IsPeekTimeoutTException() and IsReadTimeoutTException() in
be/src/rpc/thrift-util.cc make assumptions about the implementation of
read(), peek(), write() and write_partial() in TSocket.cpp and
TSSLSocket.cpp. The functions read() and peek() in TSSLSocket.cpp were
changed in versions 0.11.0 and 0.16.0 to throw a different exception on
timeout. This causes IsPeekTimeoutTException() and
IsReadTimeoutTException() to return the wrong value after a Thrift
upgrade, which in turn causes TAcceptQueueServer::Peek() to rethrow the
exception to its caller, TAcceptQueueServer::run(), which then closes
the connection, ignoring the idle_session_timeout query option.
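
The failure mode above can be illustrated with a minimal sketch. It is
written in Python rather than the C++ of thrift-util.cc, and everything
in it is an assumption made for illustration: TransportError, the type
constants, the pattern lists, and above all the message strings are
hypothetical placeholders, not the actual Thrift exception types or
messages. It only shows why a detector that matches on exception type
and message text stops recognizing a timeout once the library starts
reporting it differently.

```python
# Hypothetical stand-ins for transport exception type codes.
TIMED_OUT = "TIMED_OUT"
INTERNAL_ERROR = "INTERNAL_ERROR"


class TransportError(Exception):
    """Hypothetical stand-in for a Thrift transport exception."""

    def __init__(self, kind, message):
        super().__init__(message)
        self.kind = kind
        self.message = message


# Placeholder (type, message fragment) pairs. The real strings live in
# Thrift's socket sources and are intentionally NOT reproduced here.
OLD_PATTERNS = [(TIMED_OUT, "plain socket timed out")]
NEW_PATTERNS = OLD_PATTERNS + [(INTERNAL_ERROR, "ssl read would block")]


def is_timeout(err, patterns):
    """Return True if err matches any known (type, fragment) pair."""
    return any(err.kind == kind and fragment in err.message
               for kind, fragment in patterns)


def handle_peek_error(err, patterns):
    """Keep the connection on a recognized timeout, otherwise close it."""
    return "keep-open" if is_timeout(err, patterns) else "close"
```

With only the old patterns, a timeout reported in the new shape is
treated as a fatal error and the connection is closed; adding the
matching pattern restores the intended behavior.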

The issue was reproducible through the following scenario:

1. From the local development environment, start the Impala cluster with
SSL enabled and idle_client_poll_period_s set to 5 seconds.

export CERT_DIR="$IMPALA_HOME/be/src/testutil"
export SSL_ARGS="--ssl_client_ca_certificate=$CERT_DIR/server-cert.pem
  --ssl_server_certificate=$CERT_DIR/server-cert.pem
  --ssl_private_key=$CERT_DIR/server-key.pem
  --hostname=localhost"
./bin/start-impala-cluster.py --state_store_args="$SSL_ARGS" \
  --catalogd_args="$SSL_ARGS" \
  --impalad_args="$SSL_ARGS --idle_client_poll_period_s=5"

2. Run impala-shell with idle_session_timeout set higher than the poll
   period:

impala-shell.sh --ssl -Q idle_session_timeout=100

3. Run a simple query like "show databases" and rerun it after 15
   seconds have passed.

The second query run will fail with the following error message in impala-shell:
[localhost:21050] default> show databases;
Caught exception TLS/SSL connection has been closed (EOF) (_ssl.c:1829), 
type=<class 'ssl.SSLZeroReturnError'> in CloseSession.
Warning: close session RPC failed: TLS/SSL connection has been closed (EOF) 
(_ssl.c:1829), <class 'ssl.SSLZeroReturnError'>
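
The scenario relies on the client staying idle for longer than the poll
period (15 seconds against 5) but far less than the session timeout
(100 seconds). The sketch below is a simplified model of such a poll
loop using only Python's standard socket module; it is not Impala's
actual logic. The function name wait_for_request and its parameters
poll_period_s and idle_timeout_s are hypothetical and only loosely
mirror the two settings. The point it makes is that a peek timeout is an
expected event that should lead to another poll, not to closing the
connection.

```python
import socket
import time


def wait_for_request(conn, poll_period_s, idle_timeout_s):
    """Peek for data every poll_period_s; give up after idle_timeout_s.

    Returns a (status, polls) tuple, where polls counts peek timeouts.
    """
    idle_since = time.monotonic()
    polls = 0
    conn.settimeout(poll_period_s)
    while True:
        try:
            # MSG_PEEK inspects pending data without consuming it.
            data = conn.recv(1, socket.MSG_PEEK)
        except socket.timeout:
            # A peek timeout only means "nothing arrived yet".
            polls += 1
            if time.monotonic() - idle_since >= idle_timeout_s:
                return "idle-timeout", polls
            continue
        # An empty result means the peer closed the connection.
        return ("data" if data else "closed"), polls
```

If the timeout branch fails to recognize the exception, the loop exits
on the first quiet poll period, which matches the symptom seen in
impala-shell above.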

This patch fixes the expected error messages in IsReadTimeoutTException
and IsPeekTimeoutTException to correctly detect timeout errors from
TSSLSocket. It also fixes a typo in NEW_THRIFT_VERSION_MSG.

Testing:
- Redo the scenario manually and confirm that the second query run
  completes without error.
- Add test_thrift_socket.py to begin verifying the
  IsPeekTimeoutTException function.

Change-Id: I6ad168a1c96d751a3c50d924e6ecaf6404e589ab
---
M be/src/rpc/TAcceptQueueServer.cpp
M be/src/rpc/thrift-util.cc
A tests/custom_cluster/test_thrift_socket.py
3 files changed, 116 insertions(+), 9 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/57/19157/2
--
To view, visit http://gerrit.cloudera.org:8080/19157
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I6ad168a1c96d751a3c50d924e6ecaf6404e589ab
Gerrit-Change-Number: 19157
Gerrit-PatchSet: 2
Gerrit-Owner: Riza Suminto <riza.sumi...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Riza Suminto <riza.sumi...@cloudera.com>
Gerrit-Reviewer: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <borokna...@cloudera.com>
