[ https://issues.apache.org/jira/browse/MAPREDUCE-6846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15860382#comment-15860382 ]
Chris Trezzo commented on MAPREDUCE-6846: ----------------------------------------- bq. I was under the impression that if the wildcard mapped to only one file then we would not convey this as a wildcard through to the staging directory but instead remap it to the one entry that it globbed to (i.e.: as if the user had specified the one path directly rather than a glob to that one path). True, once it is in the staging dir it will not look like a wildcard. That being said, there is a second part to the feature. I will attempt to explain my current understanding: See {{JobResourceUploader#uploadLibJars}}: {code:java} private void uploadLibJars(Configuration conf, Collection<String> libjars, Path submitJobDir, FsPermission mapredSysPerms, short submitReplication) throws IOException { Path libjarsDir = JobSubmissionFiles.getJobDistCacheLibjars(submitJobDir); if (!libjars.isEmpty()) { FileSystem.mkdirs(jtFs, libjarsDir, mapredSysPerms); for (String tmpjars : libjars) { Path tmp = new Path(tmpjars); Path newPath = copyRemoteFiles(libjarsDir, tmp, conf, submitReplication); // Add each file to the classpath DistributedCache.addFileToClassPath( new Path(newPath.toUri().getPath()), conf, jtFs, !useWildcard); } if (useWildcard) { // Add the whole directory to the cache Path libJarsDirWildcard = jtFs.makeQualified(new Path(libjarsDir, DistributedCache.WILDCARD)); DistributedCache.addCacheFile(libJarsDirWildcard.toUri(), conf); } } } {code} {{useWildcard}} is set by the {{mapreduce.client.libjars.wildcard}} config parameter. If this is set to true, then we add the files individually to the classpath (i.e. {{mapreduce.job.classpath.files}}), but then we glob them all together when adding them to the distributed cache (i.e. {{mapreduce.job.cache.files}}). At that point, we would loose the fragment name because the LocalResource objects submitted to YARN are created based off of those paths. > Fragments specified for libjar paths are not handled correctly > -------------------------------------------------------------- > > Key: MAPREDUCE-6846 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6846 > Project: Hadoop Map/Reduce > Issue Type: Bug > Affects Versions: 2.7.3, 3.0.0-alpha2 > Reporter: Chris Trezzo > Assignee: Chris Trezzo > Priority: Minor > > If a user specifies a fragment for a libjars path via generic options parser, > the client crashes with a FileNotFoundException: > {noformat} > java.io.FileNotFoundException: File file:/home/mapred/test.txt#testFrag.txt > does not exist > at > org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:638) > at > org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:864) > at > org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:628) > at > org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:442) > at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:363) > at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:314) > at > org.apache.hadoop.mapreduce.JobResourceUploader.copyRemoteFiles(JobResourceUploader.java:387) > at > org.apache.hadoop.mapreduce.JobResourceUploader.uploadLibJars(JobResourceUploader.java:154) > at > org.apache.hadoop.mapreduce.JobResourceUploader.uploadResources(JobResourceUploader.java:105) > at > org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:102) > at > org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:197) > at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1344) > at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1341) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1892) > at org.apache.hadoop.mapreduce.Job.submit(Job.java:1341) > at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1362) > at > org.apache.hadoop.examples.QuasiMonteCarlo.estimatePi(QuasiMonteCarlo.java:306) > at > org.apache.hadoop.examples.QuasiMonteCarlo.run(QuasiMonteCarlo.java:359) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76) > at > org.apache.hadoop.examples.QuasiMonteCarlo.main(QuasiMonteCarlo.java:367) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71) > at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144) > at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:74) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at org.apache.hadoop.util.RunJar.run(RunJar.java:239) > at org.apache.hadoop.util.RunJar.main(RunJar.java:153) > {noformat} > This is actually inconsistent with the behavior for files and archives. Here > is a table showing the current behavior for each type of path and resource: > | || Qualified path (i.e. file://home/mapred/test.txt#frag.txt) || Absolute > path (i.e. /home/mapred/test.txt#frag.txt) || Relative path (i.e. > test.txt#frag.txt) || > || -libjars | FileNotFound | FileNotFound|FileNotFound| > || -files | (/) | (/) | (/) | > || -archives | (/) | (/) | (/) | -- This message was sent by Atlassian JIRA (v6.3.15#6346) --------------------------------------------------------------------- To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org