shanyu zhao created HADOOP-9774:
-----------------------------------

             Summary: RawLocalFileSystem.listStatus() returns absolute paths 
when input path is relative on Windows
                 Key: HADOOP-9774
                 URL: https://issues.apache.org/jira/browse/HADOOP-9774
             Project: Hadoop Common
          Issue Type: Bug
          Components: fs
    Affects Versions: 0.23.9, 0.23.8, 0.23.7, 0.23.6, 0.23.5
            Reporter: shanyu zhao


On Windows, when RawLocalFileSystem.listStatus() is used to enumerate a 
relative path (one without a drive spec), e.g., "file:///mydata", the resulting 
paths are absolute, e.g., ["file://E:/mydata/t1.txt", "file://E:/mydata/t2.txt"...].
Enumerating an absolute path, e.g., "file://E:/mydata", gives the same results.

This breaks some Hive unit tests that use the local file system to simulate 
HDFS. Because those tests simulate HDFS, they strip the drive spec from their 
paths. After listStatus() the paths come back absolute, and Hive fails to find 
them in its MapReduce job.

You'll see the following exception:
[junit] java.io.IOException: cannot find dir = pfile:/E:/GitHub/hive-monarch/build/ql/test/data/warehouse/src/kv1.txt in pathToPartitionInfo: [pfile:/GitHub/hive-monarch/build/ql/test/data/warehouse/src]
[junit]         at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getPartitionDescFromPathRecursively(HiveFileFormatUtils.java:298)
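The lookup failure above can be illustrated with a small standalone sketch. It is not Hive's implementation: the class PartitionLookupSketch and its methods are hypothetical, and the paths are shortened versions of the ones in the exception. It only assumes that the lookup walks up the parents of a path until one of them matches a key in the map, which is enough to show why a drive-qualified path never matches a drive-less key.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a lookup keyed by partition directory.
// NOT Hive's code; it only mimics "walk up the parents until a key matches".
class PartitionLookupSketch {

    // Returns the value stored for the closest ancestor of `path`,
    // or null when no ancestor is a key in `map`.
    static String findByAncestor(Map<String, String> map, String path) {
        String current = path;
        while (current != null) {
            String hit = map.get(current);
            if (hit != null) {
                return hit;
            }
            current = parentOf(current);
        }
        return null;
    }

    // Strips the last path component; returns null once nothing is left.
    static String parentOf(String path) {
        int slash = path.lastIndexOf('/');
        return slash <= 0 ? null : path.substring(0, slash);
    }

    public static void main(String[] args) {
        Map<String, String> pathToPartitionInfo = new HashMap<>();
        // The test registers the partition directory without a drive spec.
        pathToPartitionInfo.put("pfile:/warehouse/src", "partition info for src");

        // Drive-less file path: its parent matches the registered key.
        System.out.println(findByAncestor(pathToPartitionInfo,
                "pfile:/warehouse/src/kv1.txt"));

        // Drive-qualified file path, as returned after the behavior change:
        // no ancestor equals the drive-less key, so the lookup yields null.
        System.out.println(findByAncestor(pathToPartitionInfo,
                "pfile:/E:/warehouse/src/kv1.txt"));
    }
}
```

Every ancestor of the second path still carries "E:", so a string comparison against the drive-less key cannot succeed at any level.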


This problem was introduced by HADOOP-8962.

Prior to the fix for HADOOP-8962 (merged in 0.23.5), the resulting paths were 
relative when the parent path was relative, e.g., 
["file:///mydata/t1.txt", "file:///mydata/t2.txt"...]

This behavior change is a side effect of the fix in HADOOP-8962, not an 
intended change. The resulting behavior, although legitimate from a functional 
point of view, breaks consistency from the caller's point of view. When the 
caller uses a relative path (without a drive spec) to call listStatus(), the 
resulting paths should be relative. Therefore, I think this should be fixed.
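As a rough sketch of the expected behavior (not the actual patch, and not RawLocalFileSystem code), the child paths could be built from the parent path exactly as the caller supplied it. The class RelativeListingSketch and the method childPaths are hypothetical names used only for illustration, and paths are handled as plain strings.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of "children inherit the caller's parent path".
class RelativeListingSketch {

    // Joins each child name onto the parent as the caller wrote it, so a
    // drive-less parent yields drive-less children and a drive-qualified
    // parent yields drive-qualified children.
    static List<String> childPaths(String parent, List<String> childNames) {
        String base = parent.endsWith("/") ? parent : parent + "/";
        List<String> result = new ArrayList<>();
        for (String name : childNames) {
            result.add(base + name);
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("t1.txt", "t2.txt");

        // Drive-less parent: children stay drive-less.
        System.out.println(childPaths("file:///mydata", names));

        // Drive-qualified parent: children keep the drive spec.
        System.out.println(childPaths("file://E:/mydata", names));
    }
}
```

With this approach the caller gets back paths in the same form it passed in, which is the consistency the report asks for.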
