[ https://issues.apache.org/jira/browse/HADOOP-9774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13752687#comment-13752687 ]
shanyu zhao commented on HADOOP-9774:
-------------------------------------

Thank you, Ivan. I was able to run all unit tests on hadoop trunk on Linux, and I didn't observe any negative impact from this patch. Would you please commit this patch?

> RawLocalFileSystem.listStatus() return absolute paths when input path is relative on Windows
> --------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-9774
>                 URL: https://issues.apache.org/jira/browse/HADOOP-9774
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 3.0.0, 2.1.0-beta
>            Reporter: shanyu zhao
>            Assignee: shanyu zhao
>         Attachments: HADOOP-9774-2.patch, HADOOP-9774-3.patch,
>                      HADOOP-9774-4.patch, HADOOP-9774-5.patch, HADOOP-9774.patch
>
>
> On Windows, when using RawLocalFileSystem.listStatus() to enumerate a
> relative path (without drive spec), e.g., "file:///mydata", the resulting
> paths become absolute paths, e.g., ["file://E:/mydata/t1.txt",
> "file://E:/mydata/t2.txt"...].
> Note that if we use it to enumerate an absolute path, e.g.,
> "file://E:/mydata", then we get the same results as above.
> This breaks some Hive unit tests, which use the local file system to
> simulate HDFS when testing and therefore remove the drive spec. After
> listStatus() changes the path to an absolute path, Hive fails to find the
> path in its map reduce job.
> You'll see the following exception:
> [junit] java.io.IOException: cannot find dir = pfile:/E:/GitHub/hive-monarch/build/ql/test/data/warehouse/src/kv1.txt in pathToPartitionInfo: [pfile:/GitHub/hive-monarch/build/ql/test/data/warehouse/src]
> [junit]     at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getPartitionDescFromPathRecursively(HiveFileFormatUtils.java:298)
> This problem was introduced by this JIRA:
> HADOOP-8962
> Prior to the fix for HADOOP-8962 (merged in 0.23.5), the resulting paths
> were relative paths if the parent paths were relative, e.g.,
> ["file:///mydata/t1.txt", "file:///mydata/t2.txt"...]
> This behavior change is a side effect of the fix in HADOOP-8962, not an
> intended change. The resulting behavior, although legitimate from a
> functional point of view, breaks consistency from the caller's point of
> view. When the caller uses a relative path (without drive spec) to do
> listStatus(), the resulting paths should be relative. Therefore, I think
> this should be fixed.
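
The consistency rule requested in the description (children of a path without a drive spec should themselves carry no drive spec) can be illustrated with a small standalone sketch. This is not the HADOOP-9774 patch and it does not call any Hadoop API; the class `RelativeListing` and the method `childPaths` are hypothetical names introduced only for illustration, and the child names are passed in rather than read from disk.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, not part of Hadoop: illustrates the expected path shape only. */
final class RelativeListing {

    /**
     * Joins each child name onto the parent string exactly as the caller
     * supplied it. The parent is never resolved against the current drive
     * or working directory, so a parent without a drive spec produces
     * children without a drive spec.
     */
    static List<String> childPaths(String parent, String[] names) {
        // Avoid a doubled separator when the parent already ends with one.
        String base = parent.endsWith("/") ? parent : parent + "/";
        List<String> result = new ArrayList<>();
        for (String name : names) {
            result.add(base + name);
        }
        return result;
    }

    public static void main(String[] args) {
        String[] names = {"t1.txt", "t2.txt"};
        // Parent without a drive spec: children stay without a drive spec.
        System.out.println(childPaths("file:///mydata", names));
        // Parent with a drive spec: children keep the drive spec.
        System.out.println(childPaths("file://E:/mydata", names));
    }
}
```

Under this rule a caller such as the Hive test can look up a returned path using the same prefix it passed in, which is the lookup that fails in the exception quoted above.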