[ 
https://issues.apache.org/jira/browse/HADOOP-7360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13049467#comment-13049467
 ] 

Matt Foley commented on HADOOP-7360:
------------------------------------

TestPathData:

Multiple places: The preferred way to express "new Path(testDir+"/d1")" is "new 
Path(testDir, "d1")".  This avoids hard-coding the path delimiter for a local 
filesystem path.

initialize():
* Consider "fs.createNewFile(Path)" instead of "fs.create(Path).close()"
* If you put the "setWorkingDirectory" command near the beginning of the 
method, wouldn't all the creates and mkdirs get simpler?

I'm concerned that testDir is a relative path.  Can you convert it to an 
absolute path as early in the process as possible, before anyone might have set 
a non-default working directory?

Why did you remove testWithFsAndPath()?  It is not a null test.

getParent() should be named testGetParent().

testRelativeGlobBack(): see previous concern about making "testDir" absolute 
before interacting with second and later setWorkingDirectory() invocations.  
This might give you testDir/testDir/d1 ?

PathData:

Agree with your elimination of PathData(FileSystem, Path, FileStatus),
but suggest also eliminating PathData(FileSystem, String, FileStatus)
and PathData(FileSystem, FileStatus).
Basically, the status should be looked up in the FS, not set as an argument.
Should do so at the end of method PathData(FileSystem, String).
In comment for PathData(FileSystem, String), remove "Convenience" adjective.  
It becomes the primary ctor.

setStat(boolean ignoreFNF) is counter-intuitive.  Most people seeing 
"setStat(boolean)" will assume the status is being set to the argument.  Please 
call it lookupStatus(boolean ignoreFNF).  Or could use two methods, both with 
zero arguments, called setStat() and setStatIgnoreFNF().

Suggest letting lookupStatus(ignoreFNF) return the stat value instead of void.  
Then refreshStatus just becomes "return lookupStatus(false);".

getChecksumFile(): Why is this method needed (returns PathData), vs just using 
Fs.getChecksumFile() (returns Path)?

getParent(): Several concerns:
* Why is this method needed instead of just using Path.getParent()?  There are 
currently no callers of this new method.
* Why is it important to preserve the relative-ness of the parent path?  Trying 
to do so exposes you to serious issues because the current working directory 
may have changed since PathData#string was stored, which means the value 
returned by this method could be meaningless if string was relative.  I think I 
would go so far as to say this can't work.  I would much rather let 
Path.getParent() do what it was carefully implemented to do.
* The same issue applies to the next two chunks.

This brings up the issue of why PathData saves the value of #string?
This just begs for such mis-use.  I've opened HADOOP-7393 for this.


> FsShell does not preserve relative paths with globs
> ---------------------------------------------------
>
>                 Key: HADOOP-7360
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7360
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: fs
>    Affects Versions: 0.23.0
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>         Attachments: HADOOP-7360.patch
>
>
> FsShell currently preserves relative paths that do not contain globs.  
> Unfortunately the method {{fs.globStatus()}} is fully qualifying all returned 
> paths.  This is causing inconsistent display of paths.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

Reply via email to