Ted Yu created HBASE-20668: ------------------------------ Summary: Exception from FileSystem operation in finally block of ExportSnapshot#doWork may hide exception from FileUtil.copy call Key: HBASE-20668 URL: https://issues.apache.org/jira/browse/HBASE-20668 Project: HBase Issue Type: Bug Reporter: Ted Yu Assignee: Ted Yu
I was debugging the following error [~romil.choksi] saw during testing ExportSnapshot : {code} 2018-06-01 02:40:52,363|INFO|MainThread|machine.py:167 - run()||GUID=1cacb7bc-f7cc-4710-82e0-4a4513f0c1f9|2018-06-01 02:40:52,358 ERROR [main] util.AbstractHBaseTool: Error running command-line tool 2018-06-01 02:40:52,363|INFO|MainThread|machine.py:167 - run()||GUID=1cacb7bc-f7cc-4710-82e0-4a4513f0c1f9|java.io.FileNotFoundException: Directory/File does not exist /apps/ hbase/data/.hbase-snapshot/.tmp/snapshot_table_334546 2018-06-01 02:40:52,364|INFO|MainThread|machine.py:167 - run()||GUID=1cacb7bc-f7cc-4710-82e0-4a4513f0c1f9|at org.apache.hadoop.hdfs.server.namenode.FSDirectory. checkOwner(FSDirectory.java:1777) 2018-06-01 02:40:52,364|INFO|MainThread|machine.py:167 - run()||GUID=1cacb7bc-f7cc-4710-82e0-4a4513f0c1f9|at org.apache.hadoop.hdfs.server.namenode.FSDirAttrOp. setOwner(FSDirAttrOp.java:82) {code} Here is corresponding code (with extra log added): {code} try { LOG.info("Copy Snapshot Manifest from " + snapshotDir + " to " + initialOutputSnapshotDir); boolean ret = FileUtil.copy(inputFs, snapshotDir, outputFs, initialOutputSnapshotDir, false, false, conf); LOG.info("return val = " + ret); } catch (IOException e) { LOG.warn("Failed to copy the snapshot directory: from=" + snapshotDir + " to=" + initialOutputSnapshotDir, e); throw new ExportSnapshotException("Failed to copy the snapshot directory: from=" + snapshotDir + " to=" + initialOutputSnapshotDir, e); } finally { if (filesUser != null || filesGroup != null) { LOG.warn((filesUser == null ? "" : "Change the owner of " + needSetOwnerDir + " to " + filesUser) + (filesGroup == null ? "" : ", Change the group of " + needSetOwnerDir + " to " + filesGroup)); setOwner(outputFs, needSetOwnerDir, filesUser, filesGroup, true); } {code} "return val = " was not seen in rerun of the test. This is what the additional log revealed: {code} 2018-06-01 09:22:54,247|INFO|MainThread|machine.py:167 - run()||GUID=3961d249-9981-429d-81a8-39c7df53cf58|2018-06-01 09:22:54,241 WARN [main] snapshot.ExportSnapshot: Failed to copy the snapshot directory: from=hdfs://ns1/apps/hbase/data/.hbase-snapshot/snapshot_table_157842 to=hdfs://ns3/apps/hbase/data/.hbase-snapshot/.tmp/snapshot_table_157842 2018-06-01 09:22:54,248|INFO|MainThread|machine.py:167 - run()||GUID=3961d249-9981-429d-81a8-39c7df53cf58|org.apache.hadoop.security.AccessControlException: Permission denied: user=hbase, access=WRITE, inode="/apps/hbase/data/.hbase-snapshot/.tmp":hrt_qa:hadoop:dr----x-wT 2018-06-01 09:22:54,248|INFO|MainThread|machine.py:167 - run()||GUID=3961d249-9981-429d-81a8-39c7df53cf58|at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker. check(FSPermissionChecker.java:399) 2018-06-01 09:22:54,249|INFO|MainThread|machine.py:167 - run()||GUID=3961d249-9981-429d-81a8-39c7df53cf58|at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker. checkPermission(FSPermissionChecker.java:255) {code} It turned out that the exception from {{setOwner}} call in the finally block eclipsed the real exception. -- This message was sent by Atlassian JIRA (v7.6.3#76005)