[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Nauroth updated MAPREDUCE-4983:
-------------------------------------

    Attachment: MAPREDUCE-4983-branch-trunk-win.1.patch

The attached patch changes the test code so that it's not platform-specific.  I 
tested the patch on Mac and Windows.  This patch is intended for 
branch-trunk-win (not trunk).

Here is a summary of the kinds of changes that are in this patch:

{code}
   private static void delete(File dir) throws IOException {
-    Path p = new Path("file://"+dir.getAbsolutePath());
     Configuration conf = new Configuration();
-    FileSystem fs = p.getFileSystem(conf);
+    FileSystem fs = FileSystem.getLocal(conf);
+    Path p = fs.makeQualified(new Path(dir.getAbsolutePath()));
     fs.delete(p, true);
   }
{code}

The prior code for path construction fails on Windows due to the drive spec and 
backslashes.  Using {{FileSystem#makeQualified}} against the local file system 
works cross-platform.

{code}
-    assertTrue(environment.get("CLASSPATH").startsWith("$PWD:"));
+    assertTrue(environment.get("CLASSPATH").startsWith(
+      ApplicationConstants.Environment.PWD.$() + File.pathSeparator));
{code}

On Windows, an environment variable shows up in the classpath as %VAR% instead 
of $VAR.  On branch-trunk-win, we have already changed 
{{ApplicationConstants#Environment#$}} to return the correct thing depending on 
platform, so I'm reusing that here.

{code}
-      yarnAppClasspath = yarnAppClasspath.replaceAll(",\\s*", ":").trim();
+      yarnAppClasspath = yarnAppClasspath.replaceAll(",\\s*", 
File.pathSeparator)
+        .trim();
{code}

On Windows, classpath entries are separated by ';' instead of ':'.  Using 
{{File#pathSeparator}} handles this correctly cross-platform.

{code}
-    assertSame("MAPREDUCE_JOB_USER_CLASSPATH_FIRST set, but not taking 
effect!",
-      
env_str.indexOf("$PWD:job.jar/job.jar:job.jar/classes/:job.jar/lib/*:$PWD/*"), 
0);
+    String expectedClasspath = StringUtils.join(File.pathSeparator,
+      Arrays.asList(ApplicationConstants.Environment.PWD.$(), 
"job.jar/job.jar",
+        "job.jar/classes/", "job.jar/lib/*",
+        ApplicationConstants.Environment.PWD.$() + "/*"));
+    assertTrue("MAPREDUCE_JOB_USER_CLASSPATH_FIRST set, but not taking 
effect!",
+      env_str.startsWith(expectedClasspath));
{code}

This is a combination of the prior issues: handling environment variables and 
classpath entry separator in a way that works cross-platform.

{code}
-  private static String TEST_ROOT_DIR = new File(System.getProperty(
-           "test.build.data", "/tmp")).getAbsolutePath() + "/mapPahseprogress";
+  private static String TEST_ROOT_DIR;
+  static {
+    String root = new File(System.getProperty("test.build.data", "/tmp"))
+      .getAbsolutePath();
+    TEST_ROOT_DIR = new Path(root, "mapPahseprogress").toString();
+  }
{code}

The old code would generate a path with a mix of backslashes and forward 
slashes.  Passing through {{Path#toString}} handles this correctly.

{code}
-        new Path(mrCluster.getTestWorkDir().getAbsolutePath(), 
"random-output");
+        new Path("/tmp/" + getClass().getSimpleName(), "random-output");
{code}

The old code would attempt to use a path on HDFS with a drive spec.  HDFS would 
reject this, because it considers ':' invalid in a path.  See prior discussion 
in HDFS-4470, HADOOP-8487, and HDFS-4260 for discussion and justification for 
switching to a path of the form /tmp/<test name>.  Note that this does not 
change any paths used on the local file system.  This only changes paths used 
for creating files in HDFS.

{code}
       // Check lengths of the files
-      Assert.assertEquals(1, localFs.getFileStatus(files[1]).getLen());
-      Assert.assertTrue(localFs.getFileStatus(files[2]).getLen() > 1);
+      Map<String, Path> filesMap = pathsToMap(files);
+      Assert.assertTrue(filesMap.containsKey("distributed.first.symlink"));
+      Assert.assertEquals(1, localFs.getFileStatus(
+        filesMap.get("distributed.first.symlink")).getLen());
+      Assert.assertTrue(filesMap.containsKey("distributed.second.jar"));
+      Assert.assertTrue(localFs.getFileStatus(
+        filesMap.get("distributed.second.jar")).getLen() > 1);
{code}

The old code assumed that the directory listing would come back in a specific 
order.  The order can be different on Windows.  Additionally, Windows has an 
additional jar used to bundle a potentially long classpath into a jar manifest. 
 The new code creates a mapping based on name and then interrogates the map, so 
there is no assumption of order.

With this patch, we still have a failure in 
{{TestMRJobs#testDistributedCache}}, but it's a separate issue due to 
mishandling of symlinks in the distributed cache.  I'll file a separate jira 
for that.

                
> multiple MapReduce tests fail on Windows due to platform-specific assumptions 
> in test code
> ------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-4983
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4983
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: test
>    Affects Versions: trunk-win
>            Reporter: Chris Nauroth
>            Assignee: Chris Nauroth
>         Attachments: MAPREDUCE-4983-branch-trunk-win.1.patch
>
>
> Multiple MapReduce tests have code that makes platform-specific assumptions 
> which do not hold true on Windows.  This includes assumptions about file path 
> manipulation, the path separator used between classpath elements, environment 
> variable syntax, and order of files returned from a directory listing of the 
> local file system.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Reply via email to