[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15941256#comment-15941256
 ] 

Chris Trezzo commented on MAPREDUCE-6846:
-----------------------------------------

Thanks for the review [~templedf]!

Quick question:

bq. It seems odd to create pathURI and then do nothing with it that you 
couldn't do with tmpURI until the end.

Can we actually use tmpURI in this case? It seems that the URIs/paths we 
pass to DistributedCache#addFileToClassPath and 
DistributedCache#addCacheFile need to match, so that the symlink is resolved 
correctly in MRApps#addToClasspathIfNotJar for libjars that are not jars.

My understanding is that we need to use the path returned by copyRemoteFiles() 
for DistributedCache#addCacheFile; otherwise the resource will not be found 
during localization. Because of this, we also need pathURI, so that the 
paths match and we honor user-supplied fragments. I can move the 
addFileToClassPath call to the top, but I would still need pathURI. Is this 
what you had in mind?
{code}
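Path newPath =
    copyRemoteFiles(libjarsDir, tmp, conf, submitReplication);
try {
  // Rebuild the URI from the uploaded path, carrying over the
  // user-supplied fragment (if any) from tmpURI.
  URI pathURI = getPathURI(newPath, tmpURI.getFragment());
  // Classpath entry uses the same path that is later added as a cache file.
  DistributedCache.addFileToClassPath(new Path(pathURI.getPath()), conf,
      jtFs, false);
  if (!foundFragment) {
    foundFragment = pathURI.getFragment() != null;
  }
  libjarURIs.add(pathURI);
}
// (catch clause omitted from this excerpt)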
{code}
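
For reference, the reason both the uploaded path and the original fragment are needed can be sketched with plain java.net.URI. This is only an illustration, not the patch itself: the class name, the withFragment helper, and the hdfs://namenode.example:8020 staging location are hypothetical, and the real getPathURI may be implemented differently.

```java
import java.net.URI;
import java.net.URISyntaxException;

class LibjarFragmentSketch {

  // Hypothetical stand-in for getPathURI: keeps the scheme, authority and
  // path of the uploaded file and attaches the user-supplied fragment.
  static URI withFragment(URI uploaded, String fragment) {
    try {
      return new URI(uploaded.getScheme(), uploaded.getAuthority(),
          uploaded.getPath(), null, fragment);
    } catch (URISyntaxException e) {
      throw new IllegalArgumentException("Cannot build URI for " + uploaded, e);
    }
  }

  public static void main(String[] args) {
    // What the user passed on the command line (assumed example value).
    URI userUri = URI.create("file:///home/mapred/test.txt#testFrag.txt");
    // Where the file ended up after the upload (assumed example value).
    URI uploaded =
        URI.create("hdfs://namenode.example:8020/staging/libjars/test.txt");

    URI pathUri = withFragment(uploaded, userUri.getFragment());

    // The path matches the uploaded file; the fragment is the user's.
    System.out.println(pathUri.getPath());     // /staging/libjars/test.txt
    System.out.println(pathUri.getFragment()); // testFrag.txt
    System.out.println(pathUri);
    // hdfs://namenode.example:8020/staging/libjars/test.txt#testFrag.txt
  }
}
```

In this sketch, the path half of the result is what would go to the classpath call and the full URI (with fragment) is what would be recorded for the cache file, so the two stay in agreement.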

Please let me know if I am missing something! Thanks!

> Fragments specified for libjar paths are not handled correctly
> --------------------------------------------------------------
>
>                 Key: MAPREDUCE-6846
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6846
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>    Affects Versions: 2.6.0, 2.7.3, 3.0.0-alpha2
>            Reporter: Chris Trezzo
>            Assignee: Chris Trezzo
>            Priority: Minor
>         Attachments: MAPREDUCE-6846-trunk.001.patch, 
> MAPREDUCE-6846-trunk.002.patch
>
>
> If a user specifies a fragment for a libjars path via the generic options 
> parser, the client crashes with a FileNotFoundException:
> {noformat}
> java.io.FileNotFoundException: File file:/home/mapred/test.txt#testFrag.txt does not exist
>       at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:638)
>       at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:864)
>       at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:628)
>       at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:442)
>       at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:363)
>       at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:314)
>       at org.apache.hadoop.mapreduce.JobResourceUploader.copyRemoteFiles(JobResourceUploader.java:387)
>       at org.apache.hadoop.mapreduce.JobResourceUploader.uploadLibJars(JobResourceUploader.java:154)
>       at org.apache.hadoop.mapreduce.JobResourceUploader.uploadResources(JobResourceUploader.java:105)
>       at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:102)
>       at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:197)
>       at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1344)
>       at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1341)
>       at java.security.AccessController.doPrivileged(Native Method)
>       at javax.security.auth.Subject.doAs(Subject.java:422)
>       at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1892)
>       at org.apache.hadoop.mapreduce.Job.submit(Job.java:1341)
>       at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1362)
>       at org.apache.hadoop.examples.QuasiMonteCarlo.estimatePi(QuasiMonteCarlo.java:306)
>       at org.apache.hadoop.examples.QuasiMonteCarlo.run(QuasiMonteCarlo.java:359)
>       at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
>       at org.apache.hadoop.examples.QuasiMonteCarlo.main(QuasiMonteCarlo.java:367)
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>       at java.lang.reflect.Method.invoke(Method.java:498)
>       at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71)
>       at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
>       at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:74)
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>       at java.lang.reflect.Method.invoke(Method.java:498)
>       at org.apache.hadoop.util.RunJar.run(RunJar.java:239)
>       at org.apache.hadoop.util.RunJar.main(RunJar.java:153)
> {noformat}
> This is inconsistent with the behavior for files and archives. The table 
> below shows the current behavior for each type of path and resource:
> | || Qualified path (e.g. file:///home/mapred/test.txt#frag.txt) || Absolute path (e.g. /home/mapred/test.txt#frag.txt) || Relative path (e.g. test.txt#frag.txt) ||
> || -libjars | FileNotFound | FileNotFound | FileNotFound |
> || -files | (/) | (/) | (/) |
> || -archives | (/) | (/) | (/) |
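
As a rough illustration of the three path forms in the table, the sketch below parses each one with java.net.URI and prints its parts. The class and method names are hypothetical and this is not Hadoop code. It only shows that the fragment can be separated from the path in all three cases, whereas the file name in the exception message above still contains "#testFrag.txt".

```java
import java.net.URI;

class FragmentParsingSketch {

  // Splits a user-supplied path string into scheme, path and fragment.
  static String describe(String raw) {
    URI uri = URI.create(raw);
    return "scheme=" + uri.getScheme()
        + " path=" + uri.getPath()
        + " fragment=" + uri.getFragment();
  }

  public static void main(String[] args) {
    // Qualified path
    System.out.println(describe("file:///home/mapred/test.txt#frag.txt"));
    // scheme=file path=/home/mapred/test.txt fragment=frag.txt

    // Absolute path
    System.out.println(describe("/home/mapred/test.txt#frag.txt"));
    // scheme=null path=/home/mapred/test.txt fragment=frag.txt

    // Relative path
    System.out.println(describe("test.txt#frag.txt"));
    // scheme=null path=test.txt fragment=frag.txt
  }
}
```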



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

