Juan Yu has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/3343

Change subject: IMPALA-3575: Add retry to backend connection request and rpc timeout
......................................................................

IMPALA-3575: Add retry to backend connection request and rpc timeout

Impala doesn't set socket send/recv timeouts, so RPC calls will wait
forever for data. In extreme cases, such as a network failure or a
kernel panic on the destination host, the sender gets no response and
the RPC call hangs. A query hang is hard to detect. If the hang happens
in ExecRemoteFragment() or CancelPlanFragments(), the query cannot be
cancelled unless you restart the coordinator.

Added send/recv timeouts to all RPC calls to avoid query hangs. Also
fixed a bug where the reporting thread did not quit even after the
query was cancelled.
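
For illustration only, the general idea of a receive timeout plus a bounded
number of attempts can be sketched with plain Python sockets. This is not the
Impala backend code from this change (which is C++ and goes through the client
cache); start_silent_server() and call_with_timeout() are hypothetical names,
and the timeout and attempt counts are arbitrary. The silent server stands in
for a peer that accepts a connection and then never answers.

```python
import socket
import threading
import time


def start_silent_server():
    """Listen on a free local port, accept connections, and never reply."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind(("127.0.0.1", 0))
    srv.listen(5)
    held = []  # keep accepted sockets open so the client never sees EOF

    def accept_loop():
        while True:
            try:
                conn, _ = srv.accept()
            except OSError:
                return  # listening socket was closed
            held.append(conn)

    threading.Thread(target=accept_loop, daemon=True).start()
    return srv, srv.getsockname()[1]


def call_with_timeout(host, port, payload, timeout_s=0.2, max_attempts=3):
    """Send payload and wait for a reply, bounded by a timeout per attempt.

    Returns (reply_bytes, attempts_used). Raises TimeoutError when every
    attempt timed out, instead of blocking forever in recv().
    """
    for attempt in range(1, max_attempts + 1):
        try:
            with socket.create_connection((host, port), timeout=timeout_s) as sock:
                sock.settimeout(timeout_s)  # applies to both send and recv
                sock.sendall(payload)
                return sock.recv(4096), attempt
        except socket.timeout:
            continue  # reconnect and try again
    raise TimeoutError("no reply after %d attempts" % max_attempts)
```

Against the silent server, call_with_timeout("127.0.0.1", port, b"ping")
raises TimeoutError after roughly timeout_s * max_attempts seconds rather than
hanging. Resending a request after a timeout is only safe when the call is
idempotent; that is an assumption of this sketch, not a statement about which
Impala RPCs are retried.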

Besides the new EE test, I used the following iptables rules to
inject network failures and make sure RPC calls never hang:
1. Block network traffic on a port completely
  iptables -A INPUT -p tcp -m tcp --dport 22002 -j DROP
2. Randomly drop 5% of TCP packets to slow down the network
  iptables -A INPUT -p tcp -m tcp --dport 22000 -m statistic --mode random --probability 0.05 -j DROP

Change-Id: Id6723cfe58df6217f4a9cdd12facd320cbc24964
---
M be/src/runtime/client-cache.cc
M be/src/runtime/client-cache.h
M be/src/runtime/exec-env.cc
M be/src/runtime/plan-fragment-executor.cc
M be/src/runtime/plan-fragment-executor.h
M be/src/service/fragment-exec-state.cc
M be/src/statestore/statestore.cc
M common/thrift/generate_error_codes.py
8 files changed, 99 insertions(+), 29 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala refs/changes/43/3343/2
-- 
To view, visit http://gerrit.cloudera.org:8080/3343
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Id6723cfe58df6217f4a9cdd12facd320cbc24964
Gerrit-PatchSet: 2
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Juan Yu <j...@cloudera.com>
Gerrit-Reviewer: Alan Choi <a...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Juan Yu <j...@cloudera.com>
