[ 
https://issues.apache.org/jira/browse/HDFS-6193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13961611#comment-13961611
 ] 

Gera Shegalov commented on HDFS-6193:
-------------------------------------

[~ste...@apache.org], thanks for following up. 

bq. Interesting that FileSystemContractBaseTest doesn't catch this

FileSystemContractBaseTest does not have a test for {{open}} on a 
non-existing path. Neither did {{TestHftpFileSystem}}. 
{{TestWebHdfsFileSystemContract.testOpenNonExistFile}} had an incorrect 
implementation that relied on {{read}} to fail.
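
To illustrate the kind of check that was missing, here is a minimal sketch. It is not the actual Hadoop test code: {{Opener}} and {{OpenContractCheck}} are hypothetical names standing in for {{FileSystem.open}} and a contract test, so that the example compiles without Hadoop on the classpath. The point is that the check passes only when {{open}} itself throws, and it never issues a {{read}}.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical stand-in for FileSystem.open; not a Hadoop interface.
interface Opener {
    InputStream open(String path) throws IOException;
}

final class OpenContractCheck {
    private OpenContractCheck() {}

    // Returns true only if open() itself throws FileNotFoundException.
    // No read() is issued, so an implementation that defers the failure
    // to the first read does not pass this check.
    static boolean openFailsFast(Opener fs, String path) {
        try (InputStream in = fs.open(path)) {
            return false; // open succeeded on a missing path
        } catch (FileNotFoundException expected) {
            return true;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

For example, {{OpenContractCheck.openFailsFast(java.io.FileInputStream::new, "/no/such/file")}} should return true, because the local FS reports the missing file on open, while an opener that hands back a stream without checking the path would return false.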

bq. We could optimise any of the web filesystems by not doing that open (e.g., 
S3, s3n, swift) and waiting for the first seek. But we don't because things 
expect missing files to not be there.

Note that a seek for WebHdfs/Hftp is a client-only operation as well. Deferring 
the real open to a stream operation is misleading because the application 
presumes that the stream is already open when it issues a stream operation.
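
A rough sketch of the difference, under stated assumptions: {{FakeServer}}, {{LazyStream}} and {{Clients}} are hypothetical names, and the in-memory map stands in for the remote HTTP endpoint. This is not the WebHdfs/Hftp implementation, only a model of where the FileNotFoundException surfaces in each design.

```java
import java.io.ByteArrayInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.util.Map;

// Hypothetical in-memory "server"; stands in for the remote endpoint.
final class FakeServer {
    private final Map<String, byte[]> files;

    FakeServer(Map<String, byte[]> files) { this.files = files; }

    // Models a request for the file contents starting at the given offset.
    InputStream connect(String path, long offset) throws IOException {
        byte[] data = files.get(path);
        if (data == null) throw new FileNotFoundException(path);
        InputStream in = new ByteArrayInputStream(data);
        in.skip(offset);
        return in;
    }
}

// Deferred open: the server is contacted only on the first read.
final class LazyStream extends InputStream {
    private final FakeServer server;
    private final String path;
    private InputStream in; // null until the first read
    private long pos;

    LazyStream(FakeServer server, String path) {
        this.server = server;
        this.path = path;
    }

    // Client-only: records the position, never contacts the server.
    void seek(long newPos) throws IOException {
        if (in != null) { in.close(); in = null; }
        pos = newPos;
    }

    @Override public int read() throws IOException {
        if (in == null) in = server.connect(path, pos); // a missing file is discovered here
        int b = in.read();
        if (b >= 0) pos++;
        return b;
    }

    @Override public void close() throws IOException {
        if (in != null) in.close();
    }
}

final class Clients {
    private Clients() {}

    // Deferred design: always succeeds, even for a missing path.
    static LazyStream openLazy(FakeServer s, String path) {
        return new LazyStream(s, path);
    }

    // Eager design: contacts the server, so a missing path fails on open.
    static InputStream openEager(FakeServer s, String path) throws IOException {
        return s.connect(path, 0);
    }
}
```

With the deferred design, both {{openLazy}} and {{seek}} succeed on a missing path, and the application sees the FileNotFoundException only from {{read}}, at a point where it already believes it holds an open stream.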




> HftpFileSystem open should throw FileNotFoundException for non-existing paths
> -----------------------------------------------------------------------------
>
>                 Key: HDFS-6193
>                 URL: https://issues.apache.org/jira/browse/HDFS-6193
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 2.3.0
>            Reporter: Gera Shegalov
>            Assignee: Gera Shegalov
>            Priority: Blocker
>
> WebHdfsFileSystem.open and HftpFileSystem.open incorrectly handle 
> non-existing paths. 
> - 'open' does not really open anything, i.e., it does not contact the 
> server, and therefore cannot discover FileNotFound; the error is deferred 
> until the next read. This is counterintuitive and not how the local FS or 
> HDFS works. In POSIX you get ENOENT on open. 
> [LzoInputFormat.getSplits|https://github.com/kevinweil/elephant-bird/blob/master/core/src/main/java/com/twitter/elephantbird/mapreduce/input/LzoInputFormat.java]
>  is an example of the code that's broken because of this.
> - On the server side, FileDataServlet incorrectly sends SC_BAD_REQUEST 
> instead of SC_NOT_FOUND for non-existing paths.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
