Joe McDonnell has posted comments on this change.

Change subject: IMPALA-5386: Fix ReopenCachedHdfsFileHandle failure case
......................................................................


Patch Set 2:

> Did you look into the conditions which triggered the failure to
> begin with? Is there any way to trigger a similar error locally with
> a debug action or stress flag? It would be good to add a test for
> this case.

I know the sequence of events:
1. File is deleted using hdfs command line
2. Run a query over the table that has the deleted file
3. ScanRange::Open succeeds (!!)
4. ScanRange::Read calls hdfsRead, which fails. It then destroys the file
handle and tries to reopen it, but the reopen also fails. The ScanRange's file
handle reference is now invalid, yet still non-null.
5. Query is aborted, leading to ScanRange::Cancel
6. ScanRange::Cancel calls ScanRange::Close, which sees that the file handle 
reference is non-null and tries to release it. The release fails, because the 
file handle reference is invalid.
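
As an illustration only, here is a minimal C++ sketch of the invariant that
steps 4-6 violate: after a failed reopen, the handle reference should be null
so that Close has nothing to release. All names below (FakeFileHandle,
FakeHandleCache, FakeScanRange) are hypothetical stand-ins, not Impala's actual
classes, and this is not necessarily how the patch itself fixes the problem.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for a cached HDFS file handle (not Impala's class).
struct FakeFileHandle {
  std::string path;
};

// Hypothetical cache that counts how many handles are currently checked out.
class FakeHandleCache {
 public:
  // Returns a handle, or null if the (simulated) file does not exist.
  std::unique_ptr<FakeFileHandle> Open(const std::string& path, bool file_exists) {
    if (!file_exists) return nullptr;
    ++outstanding_;
    return std::make_unique<FakeFileHandle>(FakeFileHandle{path});
  }

  // Releasing a null handle is a bug, mirroring the failure in step 6.
  void Release(std::unique_ptr<FakeFileHandle> handle) {
    assert(handle != nullptr);
    --outstanding_;
  }

  // Destroys the old handle and tries to open a fresh one. On failure the
  // caller's reference is left null, so it is never invalid-but-non-null.
  bool Reopen(std::unique_ptr<FakeFileHandle>* handle, bool file_exists) {
    const std::string path = (*handle)->path;
    Release(std::move(*handle));  // *handle is null after the move
    *handle = Open(path, file_exists);
    return *handle != nullptr;
  }

  int outstanding() const { return outstanding_; }

 private:
  int outstanding_ = 0;
};

// Hypothetical scan range that owns at most one handle.
class FakeScanRange {
 public:
  explicit FakeScanRange(FakeHandleCache* cache) : cache_(cache) {}

  bool Open(const std::string& path) {
    handle_ = cache_->Open(path, /*file_exists=*/true);
    return handle_ != nullptr;
  }

  // Simulates a read that fails because the file was deleted, followed by a
  // reopen that also fails.
  bool ReadAfterDelete() { return cache_->Reopen(&handle_, /*file_exists=*/false); }

  // Only releases when a handle is actually held.
  void Close() {
    if (handle_ != nullptr) cache_->Release(std::move(handle_));
  }

  bool has_handle() const { return handle_ != nullptr; }

 private:
  FakeHandleCache* cache_;
  std::unique_ptr<FakeFileHandle> handle_;
};
```

With this shape, calling Close after a failed reopen is a no-op rather than a
second release of a handle that has already been destroyed.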

The problem with reproducing this on normal HDFS is that when a file is
deleted, the subsequent Open during the query in step 2 fails, so the query
never even has a file handle. If the Open succeeds, the handle refers to a
local file, and POSIX guarantees that an open file stays readable even after
it is deleted, so the Read does not fail either. I have tried modifying the
code and the test to produce this sequence, but it is difficult to get this
particular combination. I'm looking at what it would take.
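
The POSIX behavior mentioned above can be seen in a small standalone C++
sketch. It uses only standard POSIX calls (mkstemp, write, unlink, access,
lseek, read, close). The function name ReadAfterUnlink and the temporary path
template are made up for illustration and have nothing to do with Impala's
code.

```cpp
#include <stdlib.h>  // mkstemp
#include <unistd.h>  // write, unlink, access, lseek, read, close

#include <string>

// Writes `data` to a temporary file, deletes the file by name, and then reads
// the contents back through the descriptor that is still open.
// Returns an empty string if any step fails.
std::string ReadAfterUnlink(const std::string& data) {
  char path[] = "/tmp/unlink_demo_XXXXXX";  // hypothetical temp-file template
  int fd = mkstemp(path);
  if (fd < 0) return "";

  std::string result;
  const bool wrote =
      write(fd, data.data(), data.size()) == static_cast<ssize_t>(data.size());
  const bool unlinked = unlink(path) == 0;

  // After unlink the name is gone (access fails), but the open descriptor
  // still refers to the file's data.
  if (wrote && unlinked && access(path, F_OK) != 0 &&
      lseek(fd, 0, SEEK_SET) == 0) {
    result.resize(data.size());
    const ssize_t n = read(fd, &result[0], result.size());
    result.resize(n > 0 ? static_cast<size_t>(n) : 0);
  }
  close(fd);
  return result;
}
```

This is why deleting a local file out from under an already-open handle does
not make a later read fail, which is part of what makes the sequence hard to
reproduce.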

-- 
To view, visit http://gerrit.cloudera.org:8080/7020
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Iee982fa5e964f6c8969b2eb7e5f3eca89e793b3a
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Joe McDonnell <joemcdonn...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <joemcdonn...@cloudera.com>
Gerrit-Reviewer: Michael Ho <k...@cloudera.com>
Gerrit-HasComments: No
