[jira] [Commented] (HDFS-6193) HftpFileSystem open should throw FileNotFoundException for non-existing paths
[ https://issues.apache.org/jira/browse/HDFS-6193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14524797#comment-14524797 ]

Hadoop QA commented on HDFS-6193:
---------------------------------

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch | 0m 0s | The patch command could not apply the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12643130/HDFS-6193-branch-2.4.v02.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / f1a152c |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10609/console |

This message was automatically generated.

HftpFileSystem open should throw FileNotFoundException for non-existing paths
-----------------------------------------------------------------------------

Key: HDFS-6193
URL: https://issues.apache.org/jira/browse/HDFS-6193
Project: Hadoop HDFS
Issue Type: Bug
Affects Versions: 2.4.0
Reporter: Gera Shegalov
Assignee: Gera Shegalov
Attachments: HDFS-6193-branch-2.4.0.v01.patch, HDFS-6193-branch-2.4.v02.patch

WebHdfsFileSystem.open and HftpFileSystem.open incorrectly handle non-existing paths.
- 'open' does not really open anything, i.e., it does not contact the server, and therefore cannot discover FileNotFound; the failure is deferred until the next read. This is counterintuitive and not how the local FS or HDFS work. In POSIX you get ENOENT on open. [LzoInputFormat.getSplits|https://github.com/kevinweil/elephant-bird/blob/master/core/src/main/java/com/twitter/elephantbird/mapreduce/input/LzoInputFormat.java] is an example of code that is broken because of this.
- On the server side, FileDataServlet incorrectly sends SC_BAD_REQUEST instead of SC_NOT_FOUND for non-existing paths.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
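The behaviour described in the issue can be illustrated with a minimal, self-contained Java sketch. It deliberately does not use the Hadoop API: `Fs`, `lazy`, `eager`, and `whereItFails` are hypothetical names, and an in-memory map stands in for the server. It only shows where a FileNotFoundException surfaces when the existence check is deferred to the first read, compared with a client that checks on open.

```java
import java.io.ByteArrayInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.Collections;
import java.util.Map;

// Hypothetical model, not the Hadoop API: a map from path to file contents
// plays the role of the remote server.
class OpenSemanticsDemo {

    /** Minimal stand-in for a file system client. */
    interface Fs {
        InputStream open(String path) throws IOException;
    }

    /** Lazy client: open() never consults the store; the lookup happens on the first read. */
    static Fs lazy(Map<String, byte[]> store) {
        return path -> new InputStream() {
            private InputStream delegate;

            @Override
            public int read() throws IOException {
                if (delegate == null) {
                    byte[] data = store.get(path);
                    if (data == null) {
                        throw new FileNotFoundException(path);
                    }
                    delegate = new ByteArrayInputStream(data);
                }
                return delegate.read();
            }
        };
    }

    /** Eager client: open() checks existence, so a missing path fails immediately. */
    static Fs eager(Map<String, byte[]> store) {
        return path -> {
            byte[] data = store.get(path);
            if (data == null) {
                throw new FileNotFoundException(path);
            }
            return new ByteArrayInputStream(data);
        };
    }

    /** Returns "open", "read", or "none" depending on where FileNotFoundException surfaces. */
    static String whereItFails(Fs fs, String path) {
        try (InputStream in = fs.open(path)) {
            try {
                in.read();
                return "none";
            } catch (FileNotFoundException e) {
                return "read";
            }
        } catch (FileNotFoundException e) {
            return "open";
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Map<String, byte[]> store = Collections.singletonMap("/data/a", new byte[] {1});
        System.out.println(whereItFails(lazy(store), "/missing"));  // prints "read"
        System.out.println(whereItFails(eager(store), "/missing")); // prints "open"
    }
}
```

Code that opens a path inside a try/catch for FileNotFoundException and reads only later works with the eager variant, but with the lazy variant the exception escapes from an unrelated place.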
[jira] [Commented] (HDFS-6193) HftpFileSystem open should throw FileNotFoundException for non-existing paths
[ https://issues.apache.org/jira/browse/HDFS-6193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14524781#comment-14524781 ]

Hadoop QA commented on HDFS-6193:
---------------------------------

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch | 0m 0s | The patch command could not apply the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12643130/HDFS-6193-branch-2.4.v02.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / f1a152c |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10603/console |

This message was automatically generated.
[jira] [Commented] (HDFS-6193) HftpFileSystem open should throw FileNotFoundException for non-existing paths
[ https://issues.apache.org/jira/browse/HDFS-6193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13992986#comment-13992986 ]

Haohui Mai commented on HDFS-6193:
----------------------------------

I don't think this is a blocker, since hftp / hsftp have been deprecated and superseded by webhdfs. It looks to me that the performance impact is still up for debate (the same fix has been applied to webhdfs in HDFS-6143; see the comments there for the details).

I'm moving it out to unblock 2.4.1. Feel free to move it back if you think it is essential for the release.

Affects Versions: 2.4.0
Priority: Blocker
Attachments: HDFS-6193-branch-2.4.0.v01.patch, HDFS-6193-branch-2.4.v02.patch
[jira] [Commented] (HDFS-6193) HftpFileSystem open should throw FileNotFoundException for non-existing paths
[ https://issues.apache.org/jira/browse/HDFS-6193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13990007#comment-13990007 ]

Tsuyoshi OZAWA commented on HDFS-6193:
--------------------------------------

Let's wait for review by HDFS experts.

Affects Versions: 2.4.0
Priority: Blocker
Attachments: HDFS-6193-branch-2.4.0.v01.patch, HDFS-6193-branch-2.4.v02.patch
[jira] [Commented] (HDFS-6193) HftpFileSystem open should throw FileNotFoundException for non-existing paths
[ https://issues.apache.org/jira/browse/HDFS-6193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13988028#comment-13988028 ]

Tsuyoshi OZAWA commented on HDFS-6193:
--------------------------------------

Is HftpFileSystem missing from trunk now? Please correct me if I'm wrong.

Affects Versions: 2.4.0
Priority: Blocker
Attachments: HDFS-6193-branch-2.4.0.v01.patch
[jira] [Commented] (HDFS-6193) HftpFileSystem open should throw FileNotFoundException for non-existing paths
[ https://issues.apache.org/jira/browse/HDFS-6193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13988058#comment-13988058 ]

Gera Shegalov commented on HDFS-6193:
-------------------------------------

Hi [~ozawa], yeah, Hftp was recently kicked out of trunk with HDFS-5570.

Affects Versions: 2.4.0
Priority: Blocker
Attachments: HDFS-6193-branch-2.4.0.v01.patch
[jira] [Commented] (HDFS-6193) HftpFileSystem open should throw FileNotFoundException for non-existing paths
[ https://issues.apache.org/jira/browse/HDFS-6193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13988084#comment-13988084 ]

Tsuyoshi OZAWA commented on HDFS-6193:
--------------------------------------

Thanks for the pointer, [~jira.shegalov]! I can now apply your patch against branch-2.4.0. However, compilation errors occur with the patch.

In HftpFileSystem, RangeHeaderInputStream cannot call the super constructor as follows:
{code}
  static class RangeHeaderInputStream extends ByteRangeInputStream {
    RangeHeaderInputStream(RangeHeaderUrlOpener o, RangeHeaderUrlOpener r)
        throws IOException {
      super(o, r, true);
    }
{code}

In FileDataServlet, the method ExceptionHandler.toHttpStatus is missing:
{code}
  response.sendError(ExceptionHandler.toHttpStatus(e),
      StringUtils.stringifyException(e));
{code}

Can you check them? Thanks!

Affects Versions: 2.4.0
Priority: Blocker
Attachments: HDFS-6193-branch-2.4.0.v01.patch
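For illustration only: the issue asks the server side to answer SC_NOT_FOUND rather than SC_BAD_REQUEST for a missing path, and the servlet snippet above delegates that choice to a status-mapping helper. A mapping of that kind could look like the sketch below. `StatusMappingSketch` and its `toHttpStatus` are hypothetical stand-ins and make no claim about what the real ExceptionHandler.toHttpStatus does; the fall-through to 400 for every other exception is an assumption made to keep the example small.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.net.HttpURLConnection;

// Hypothetical helper; this is NOT the ExceptionHandler class referenced by the patch.
class StatusMappingSketch {

    /** Maps an exception to the HTTP status a servlet might send back. */
    static int toHttpStatus(Exception e) {
        if (e instanceof FileNotFoundException) {
            return HttpURLConnection.HTTP_NOT_FOUND; // 404: the path does not exist
        }
        // Assumption: everything else keeps the generic 400 response.
        return HttpURLConnection.HTTP_BAD_REQUEST;
    }

    public static void main(String[] args) {
        System.out.println(toHttpStatus(new FileNotFoundException("/no/such/file"))); // prints 404
        System.out.println(toHttpStatus(new IOException("malformed request")));       // prints 400
    }
}
```

With a distinct status for missing paths, a client can translate the response back into a FileNotFoundException instead of a generic I/O failure.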
[jira] [Commented] (HDFS-6193) HftpFileSystem open should throw FileNotFoundException for non-existing paths
[ https://issues.apache.org/jira/browse/HDFS-6193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13988171#comment-13988171 ]

Gera Shegalov commented on HDFS-6193:
-------------------------------------

Will upload a fixed version shortly.

Affects Versions: 2.4.0
Priority: Blocker
Attachments: HDFS-6193-branch-2.4.0.v01.patch
[jira] [Commented] (HDFS-6193) HftpFileSystem open should throw FileNotFoundException for non-existing paths
[ https://issues.apache.org/jira/browse/HDFS-6193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13988327#comment-13988327 ]

Hadoop QA commented on HDFS-6193:
---------------------------------

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12643130/HDFS-6193-branch-2.4.v02.patch
  against trunk revision .

{color:red}-1 patch{color}. The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6799//console

This message is automatically generated.

Affects Versions: 2.4.0
Priority: Blocker
Attachments: HDFS-6193-branch-2.4.0.v01.patch, HDFS-6193-branch-2.4.v02.patch
[jira] [Commented] (HDFS-6193) HftpFileSystem open should throw FileNotFoundException for non-existing paths
[ https://issues.apache.org/jira/browse/HDFS-6193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13988413#comment-13988413 ]

Tsuyoshi OZAWA commented on HDFS-6193:
--------------------------------------

Thank you for updating! +1 for the patch (non-binding).
* Compilation works correctly.
* Confirmed that WebHdfsFileSystem.open() and HftpFileSystem.open() throw FileNotFoundException when files are missing. Test cases cover it.

Affects Versions: 2.4.0
Priority: Blocker
Attachments: HDFS-6193-branch-2.4.0.v01.patch, HDFS-6193-branch-2.4.v02.patch
[jira] [Commented] (HDFS-6193) HftpFileSystem open should throw FileNotFoundException for non-existing paths
[ https://issues.apache.org/jira/browse/HDFS-6193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13961490#comment-13961490 ]

Steve Loughran commented on HDFS-6193:
--------------------------------------

Linking to HADOOP-9361 and FS semantics.

Failing on the open if a file is not found is a core expectation of filesystems. We could optimise any of the web filesystems by not doing that open (e.g., S3, s3n, swift) and waiting for the first seek. But we don't because things expect missing files to not be there.

Interesting that FileSystemContractBaseTest doesn't catch this.

Affects Versions: 2.3.0
Priority: Blocker
[jira] [Commented] (HDFS-6193) HftpFileSystem open should throw FileNotFoundException for non-existing paths
[ https://issues.apache.org/jira/browse/HDFS-6193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13961611#comment-13961611 ]

Gera Shegalov commented on HDFS-6193:
-------------------------------------

[~ste...@apache.org], thanks for following up.

bq. Interesting that FileSystemContractBaseTest doesn't catch this
FileSystemContractBaseTest does not have a test for {{open}} on a non-existing path. Neither did {{TestHftpFileSystem}}. {{TestWebHdfsFileSystemContract.testOpenNonExistFile}} had an incorrect implementation that relied on {{read}} to fail.

bq. We could optimise any of the web filesystems by not doing that open (e.g., S3, s3n, swift) and waiting for the first seek. But we don't because things expect missing files to not be there.
Note that a seek for WebHdfs/Hftp is a client-only operation as well. Deferring the real open to a stream operation is misleading because the application presumes an open stream when issuing a stream operation.

Affects Versions: 2.3.0
Priority: Blocker
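The point about a test that relies on {{read}} to fail can be shown with a small stand-alone Java sketch. None of the names below come from the Hadoop test suite: `Opener`, `DEFERRED`, `looseCheck`, and `strictCheck` are hypothetical, and `DEFERRED` simply models a client whose open() always succeeds and whose first read reports the missing file.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical stand-ins; none of these names come from the Hadoop test suite.
class OpenContractCheckSketch {

    interface Opener {
        InputStream open(String path) throws IOException;
    }

    /** Client that defers the existence check to the first read. */
    static final Opener DEFERRED = path -> new InputStream() {
        @Override
        public int read() throws IOException {
            throw new FileNotFoundException(path);
        }
    };

    /** Loose check: passes if either open() or a later read() reports the missing file. */
    static boolean looseCheck(Opener fs) {
        try (InputStream in = fs.open("/no/such/file")) {
            in.read();
            return false;
        } catch (FileNotFoundException e) {
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    /** Strict check: passes only if open() itself reports the missing file. */
    static boolean strictCheck(Opener fs) {
        try {
            fs.open("/no/such/file").close();
            return false;
        } catch (FileNotFoundException e) {
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("loose passes on deferred client:  " + looseCheck(DEFERRED));  // true
        System.out.println("strict passes on deferred client: " + strictCheck(DEFERRED)); // false
    }
}
```

The loose check passes against the deferred client and so hides the problem; only the strict check, which never touches the stream, pins the failure to open().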
[jira] [Commented] (HDFS-6193) HftpFileSystem open should throw FileNotFoundException for non-existing paths
[ https://issues.apache.org/jira/browse/HDFS-6193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13961646#comment-13961646 ]

Hadoop QA commented on HDFS-6193:
---------------------------------

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12638942/HDFS-6193-branch-2.4.0.v01.patch
  against trunk revision .

{color:red}-1 patch{color}. The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6598//console

This message is automatically generated.

Affects Versions: 2.4.0
Priority: Blocker
Attachments: HDFS-6193-branch-2.4.0.v01.patch