[ https://issues.apache.org/jira/browse/HADOOP-2025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14168475#comment-14168475 ]
jay vyas commented on HADOOP-2025:
----------------------------------

Thanks for this. It's true: this is not in the purview of the file system itself. It's really up to the Hadoop *solution* (i.e. the thing that a vendor is giving to a client) to solve this problem, not the file system.

In the upstream we have *Bigtop*, which aims to fill that gap in a free and open way for the community, and which produces idioms for others to follow when setting up and maintaining a distributed, Hadoop-based big data product.

To solve this problem, we have:

- a file-system-agnostic provisioner, https://github.com/apache/bigtop/blob/master/bigtop-packages/src/common/bigtop-utils/provision.groovy
- a universal JSON file format for defining the file system's schema, https://github.com/apache/bigtop/blob/master/bigtop-packages/src/common/hadoop/init-hcfs.json

Hopefully those artifacts can help people who need to solve this problem in a way that is FS-agnostic and maintainable.

> Instantiating a FileSystem object should guarantee the existence of the
> working directory
> -----------------------------------------------------------------------------------------
>
>                 Key: HADOOP-2025
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2025
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: fs
>    Affects Versions: 0.14.1
>            Reporter: Sameer Paranjpye
>            Assignee: Chris Douglas
>         Attachments: 2025-1.patch, 2025.patch
>
>
> Issues like HADOOP-1891 and HADOOP-1916 illustrate the need for this behavior.
> In HADOOP-1916 the problem is that the default working directory for a user
> on HDFS '/user/<username>' does not exist. This results in the command
> 'hadoop dfs -copyFromLocal foo .' creating a *file* called /user/<username>
> and copying the contents of the file 'foo' into this file.
> HADOOP-1891 is basically the same problem. The problem that Olga observed was
> that copying a file to '.' on HDFS when her 'home directory' did not exist
> resulted in the creation of a file with the path as her home directory. The
> problem is incorrectly filed as a bug in the Path class. The behavior of Path
> is correct, as Doug points out, it is perfectly reasonable for Path(".") to
> convert to an empty path. When this empty path is resolved in HDFS or any
> other filesystem the resolution to '/user/<username>' is also correct (at
> least for HDFS). The problem IMO is that the existence of the working
> directory is not guaranteed.
> When I log in to a machine my default working directory is '/home/sameerp'
> and filesystem operations that I execute with relative paths all work
> correctly because this directory exists. My home directory lives on a filer,
> in the event of it being unmountable the default working directory I get is
> '/' which also is guaranteed to exist.
> In the context of Hadoop, instantiating a FileSystem object is the analogue
> of logging in and should result in a working directory whose existence has
> been validated. In the case of HDFS this should be '/user/<username>' or '/'
> if the directory does not exist.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
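
For readers who want to see the fallback rule from the issue description in concrete form, here is a minimal sketch. It is only an illustration under stated assumptions: it uses `java.nio.file` against the local file system as a stand-in, not Hadoop's `FileSystem` class, and the class name `WorkingDirectoryResolver` and the method `resolveWorkingDirectory` are hypothetical names that do not come from the attached patches. The idea is simply "use the preferred working directory if it exists as a directory, otherwise fall back to the root".

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class WorkingDirectoryResolver {

    /**
     * Hypothetical helper: returns the preferred working directory when it
     * exists and is a directory, otherwise returns the fallback (the
     * analogue of '/' in the issue description).
     */
    public static Path resolveWorkingDirectory(Path preferred, Path fallback) {
        return Files.isDirectory(preferred) ? preferred : fallback;
    }

    public static void main(String[] args) throws IOException {
        // A temporary directory plays the role of the file system root.
        Path root = Files.createTempDirectory("fs-root");
        // Stand-in for '/user/<username>'; "alice" is an arbitrary example name.
        Path home = root.resolve("user").resolve("alice");

        // The home directory does not exist yet, so the root is chosen.
        System.out.println(resolveWorkingDirectory(home, root));

        // Once the home directory exists, it becomes the working directory.
        Files.createDirectories(home);
        System.out.println(resolveWorkingDirectory(home, root));
    }
}
```

With a validated working directory, a relative destination such as '.' always resolves to an existing directory, which is the property the issue asks for. Whether the check belongs in the file system or in a provisioning step is the question the comment above addresses.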