Tim Armstrong has uploaded a new patch set (#2).

Change subject: IMPALA-3633: cancel fragment if coordinator is gone
......................................................................

IMPALA-3633: cancel fragment if coordinator is gone

The bug is that return_val.status is an optional field, so setting the
status without setting the corresponding __isset flag is equivalent to
returning Status::OK(). As a result, a fragment reporting its status was
not notified when the coordinator had gone away. If a cancel RPC was
lost, we could therefore be left with zombie fragments that had no
coordinator and kept running until completion.

Testing:
I couldn't see a way to reproduce this reliably with our existing test
setup, since it requires some RPCs to be dropped to get into this state.
I tested manually by commenting out CancelRemoteFragments(), starting a
long-running query, then cancelling it. Before the patch, perf top
showed that the fragments continued to execute the query. After the
patch, the fragments stopped executing quickly.

Change-Id: I62ab6f4df7c0ee60c6aa6291513f9f0cbfac3fe7
---
M be/src/service/impala-server.cc
1 file changed, 4 insertions(+), 7 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala refs/changes/38/3238/2
--
To view, visit http://gerrit.cloudera.org:8080/3238
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I62ab6f4df7c0ee60c6aa6291513f9f0cbfac3fe7
Gerrit-PatchSet: 2
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Tim Armstrong <[email protected]>
Gerrit-Reviewer: Sailesh Mukil <[email protected]>
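
For readers unfamiliar with the optional-field pitfall described in the commit message, here is a minimal sketch of it. The types TStatus and TResult and the helper ReceivedStatus() are hypothetical stand-ins written for illustration; they are not Impala's real definitions and not the code changed by this patch. The sketch only assumes the usual shape of Thrift-generated C++ structs, where an optional field has an __isset flag and a __set_<field>() setter, and where unset optional fields are skipped on serialization.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a Thrift-generated status struct.
struct TStatus {
  int status_code = 0;  // 0 is treated as OK in this sketch
  std::string error_msg;
};

// Mirrors the per-field "is set" flags that Thrift generates.
struct TResultIsSet {
  bool status = false;
};

// Hypothetical stand-in for an RPC result with an optional status field.
struct TResult {
  TStatus status;
  TResultIsSet __isset;

  // Generated-style setter: assigns the field and marks it as set.
  void __set_status(const TStatus& val) {
    status = val;
    __isset.status = true;
  }
};

// Models what the receiver observes: an optional field whose flag is
// not set is not written to the wire, so the receiver sees the default.
TStatus ReceivedStatus(const TResult& sent) {
  return sent.__isset.status ? sent.status : TStatus();
}
```

Under these assumptions, assigning `result.status = err;` directly leaves `__isset.status` false, so the receiver sees a default (OK) status, while `result.__set_status(err);` delivers the error as intended.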
