Joe McDonnell has posted comments on this change.

Change subject: IMPALA-5386: Fix ReopenCachedHdfsFileHandle failure case
......................................................................

Patch Set 2:

> Did you look into the conditions which triggered the failure to
> begin with ? Is there any way to trigger similar error locally with
> debug action or stress flag ? It would be good to add a test for
> this case.

I know the sequence of events:
1. File is deleted using hdfs command line
2. Run a query over the table that has the deleted file
3. ScanRange::Open succeeds (!!)
4. ScanRange::Read tries hdfsRead and fails, destroys the file handle, and reopening the file handle fails. The ScanRange's file handle reference is now invalid, but it is also non-null.
5. Query is aborted, leading to ScanRange::Cancel
6. ScanRange::Cancel calls ScanRange::Close, which sees that the file handle reference is non-null and tries to release it. The release fails, because the file handle reference is invalid.

The problem with reproducing this on normal Hdfs is that when a file is deleted, the subsequent Open in #2 fails, so the query never even has a file handle. If the Open succeeded, it is a file handle to a local file, so POSIX guarantees that the file stays around.

I have tried modifying the code and modifying the test to produce this sequence, but it is difficult to get this particular combination. I'm looking at what it would take.

--
To view, visit http://gerrit.cloudera.org:8080/7020
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Iee982fa5e964f6c8969b2eb7e5f3eca89e793b3a
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Joe McDonnell <joemcdonn...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <joemcdonn...@cloudera.com>
Gerrit-Reviewer: Michael Ho <k...@cloudera.com>
Gerrit-HasComments: No
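
For illustration only, steps 3 to 6 of the sequence in the comment above can be modeled with a small stand-alone sketch. Everything here is hypothetical: HandleCache, this toy ScanRange, kNoHandle, and RunSequence are made-up names, not Impala's real classes, and handles are plain integer ids. The clear_on_reopen_failure flag shows one plausible remedy (dropping the reference when the reopen fails); that is an assumption about the general shape of a fix, not a description of this patch.

```cpp
#include <cassert>
#include <set>

// Hypothetical sentinel meaning "no file handle held".
constexpr int kNoHandle = 0;

// Hypothetical model of a file handle cache; handles are integer ids.
class HandleCache {
 public:
  int Open() {
    int id = next_id_++;
    live_.insert(id);
    return id;
  }
  // Destroys the old handle, then tries to open a new one.
  // On failure, *handle still holds the destroyed (stale) id.
  bool Reopen(int* handle, bool file_exists) {
    live_.erase(*handle);
    if (!file_exists) return false;
    *handle = Open();
    return true;
  }
  // Returns false when asked to release a handle the cache does not track.
  bool Release(int handle) { return live_.erase(handle) == 1; }

 private:
  std::set<int> live_;
  int next_id_ = 1;
};

// Hypothetical scan range that holds one handle reference.
class ScanRange {
 public:
  ScanRange(HandleCache* cache, bool clear_on_reopen_failure)
      : cache_(cache), clear_on_reopen_failure_(clear_on_reopen_failure) {}

  void Open() { handle_ = cache_->Open(); }

  // Models step 4: the read fails and the reopen fails too.
  bool ReadAfterDelete() {
    if (!cache_->Reopen(&handle_, /*file_exists=*/false)) {
      if (clear_on_reopen_failure_) handle_ = kNoHandle;
      return false;
    }
    return true;
  }

  // Models step 6: returns true if there was nothing to release
  // or the release succeeded.
  bool Close() {
    if (handle_ == kNoHandle) return true;
    bool ok = cache_->Release(handle_);
    handle_ = kNoHandle;
    return ok;
  }

 private:
  HandleCache* cache_;
  bool clear_on_reopen_failure_;
  int handle_ = kNoHandle;
};

// Runs the modeled steps 3-6 and reports whether Close was clean.
bool RunSequence(bool clear_on_reopen_failure) {
  HandleCache cache;
  ScanRange range(&cache, clear_on_reopen_failure);
  range.Open();                            // step 3
  bool read_ok = range.ReadAfterDelete();  // step 4
  assert(!read_ok);
  (void)read_ok;
  return range.Close();                    // steps 5-6
}
```

In this model, RunSequence(false) ends with a failed release because the reference is stale but non-null, while RunSequence(true) closes cleanly because the reference was cleared when the reopen failed.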