[jira] [Commented] (PIG-3815) Hadoop bug causes to pig to fail silently with jar cache
[ https://issues.apache.org/jira/browse/PIG-3815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13940867#comment-13940867 ]

Aniket Mokashi commented on PIG-3815:

Thanks [~cheolsoo], I committed it to trunk.

> Hadoop bug causes to pig to fail silently with jar cache
>
> Key: PIG-3815
> URL: https://issues.apache.org/jira/browse/PIG-3815
> Project: Pig
> Issue Type: Bug
> Affects Versions: 0.13.0
> Reporter: Aniket Mokashi
> Assignee: Aniket Mokashi
> Fix For: 0.13.0
>
> Attachments: PIG-3815-1.patch, PIG-3815-2.patch, PIG-3815-3.patch, PIG-3815.patch
>
> Pig uses the DistributedCache.addFileToClassPath API, which puts jars in the
> distributed cache configuration. This API uses : to separate the list of files
> to be put on the classpath via the distributed cache. If fs.default.name has a
> port number in it, the tokenization logic in Hadoop fails when retrieving the
> list of cache filenames in the backend.

-- This message was sent by Atlassian JIRA (v6.2#6252)
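The failure mode in the issue description can be reproduced outside Hadoop. The following is a minimal sketch using only the Java standard library; it does not call DistributedCache, and the class name, method names, and the `namenode:8020` address are hypothetical stand-ins chosen for illustration. It joins entries with `:` and splits them again, the way a naive tokenizer would.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustration only: mimics joining and splitting a classpath list with ':'.
// It does not call Hadoop; the class and method names are hypothetical.
class ClasspathSeparatorDemo {

    // Join entries with ':' the way a Unix-style path list is built.
    static String join(List<String> entries) {
        return String.join(":", entries);
    }

    // Split the list back on ':' as a naive tokenizer would.
    static List<String> split(String joined) {
        return new ArrayList<>(Arrays.asList(joined.split(":")));
    }

    public static void main(String[] args) {
        // Plain paths survive the round trip.
        List<String> plain = Arrays.asList("/tmp/a.jar", "/tmp/b.jar");
        System.out.println(split(join(plain)));

        // Fully qualified URIs carry extra ':' characters (after the scheme
        // and before the port), so each entry is cut into three pieces.
        List<String> qualified = Arrays.asList(
                "hdfs://namenode:8020/tmp/a.jar",
                "hdfs://namenode:8020/tmp/b.jar");
        System.out.println(split(join(qualified)));
    }
}
```

With the qualified entries, the two jars come back as six fragments (`hdfs`, `//namenode`, `8020/tmp/a.jar`, and so on), none of which names a usable file.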
[jira] [Commented] (PIG-3815) Hadoop bug causes to pig to fail silently with jar cache
[ https://issues.apache.org/jira/browse/PIG-3815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13940600#comment-13940600 ]

Cheolsoo Park commented on PIG-3815:

Ha, that looks a lot better to me. +1.
[jira] [Commented] (PIG-3815) Hadoop bug causes to pig to fail silently with jar cache
[ https://issues.apache.org/jira/browse/PIG-3815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13940196#comment-13940196 ]

Aniket Mokashi commented on PIG-3815:

I just realized that there is a better way to refactor this code. Can someone review the patch attached?
[jira] [Commented] (PIG-3815) Hadoop bug causes to pig to fail silently with jar cache
[ https://issues.apache.org/jira/browse/PIG-3815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13939825#comment-13939825 ]

Aniket Mokashi commented on PIG-3815:

I have committed PIG-3815-2.patch to trunk! Thanks everyone for your comments.
[jira] [Commented] (PIG-3815) Hadoop bug causes to pig to fail silently with jar cache
[ https://issues.apache.org/jira/browse/PIG-3815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13939815#comment-13939815 ]

Rohini Palaniswamy commented on PIG-3815:

[~julienledem], the path is qualified only for use in addCacheFile(), which sets mapred.cache.files, where the qualified URI is required. conf.set("mapred.job.classpath.files") uses just the file path, with the scheme and port removed.
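The distinction drawn in this comment can be sketched with the standard library alone. In the example below, `java.net.URI` stands in for Hadoop's `Path.toUri()`; the class name, method names, and the `namenode:8020` address are hypothetical, and the comma separator for the cache-file list is an assumption made for the illustration rather than something stated in this thread.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

// Illustration only: plain java.net.URI stands in for Hadoop's Path.toUri().
// Class and method names are hypothetical.
class ClasspathPathOnlyDemo {

    // Keep only the path component, dropping scheme, host, and port.
    static String pathOnly(String location) {
        return URI.create(location).getPath();
    }

    // Classpath-style value: path-only entries joined with ':'.
    static String classpathValue(List<String> locations) {
        List<String> paths = new ArrayList<>();
        for (String location : locations) {
            paths.add(pathOnly(location));
        }
        return String.join(":", paths);
    }

    // Cache-file-style value: fully qualified URIs, joined with ','
    // (assumed separator), so the ':' inside each URI is harmless.
    static String cacheFilesValue(List<String> locations) {
        return String.join(",", locations);
    }

    public static void main(String[] args) {
        List<String> jars = List.of(
                "hdfs://namenode:8020/tmp/a.jar",
                "hdfs://namenode:8020/tmp/b.jar");
        System.out.println(classpathValue(jars));
        System.out.println(cacheFilesValue(jars));
    }
}
```

Because the classpath-style value no longer contains a scheme or port, splitting it on `:` yields exactly one token per jar.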
[jira] [Commented] (PIG-3815) Hadoop bug causes to pig to fail silently with jar cache
[ https://issues.apache.org/jira/browse/PIG-3815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13939749#comment-13939749 ]

Julien Le Dem commented on PIG-3815:

[~rohini] in the code you quoted, don't you think it is putting the port back in the following line?
{noformat}
URI uri = fs.makeQualified(file).toUri();
{noformat}
[jira] [Commented] (PIG-3815) Hadoop bug causes to pig to fail silently with jar cache
[ https://issues.apache.org/jira/browse/PIG-3815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13939750#comment-13939750 ]

Rohini Palaniswamy commented on PIG-3815:

Yes. It was fixed 3 years ago. I am not sure which version of Hadoop you are using that hits this issue. But since we still support 0.20, there is no harm in doing .toUri().getPath() in Pig as well. +1.

Since the issue is not with Hadoop 1.0, when checking in this patch please update your comment "// PIG-3815 In hadoop 1.0, addFileToClassPath uses : as separator" to say hadoop 0.20.
[jira] [Commented] (PIG-3815) Hadoop bug causes to pig to fail silently with jar cache
[ https://issues.apache.org/jira/browse/PIG-3815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13939728#comment-13939728 ]

Aniket Mokashi commented on PIG-3815:

Thanks for your comments, [~rohini]. I was not aware of the limitations on HDFS streams; I have attached a patch (PIG-3815-2.patch) to fix those problems.

Hadoop JIRA: https://issues.apache.org/jira/browse/MAPREDUCE-2361
It looks like this was fixed in http://svn.apache.org/viewvc?view=revision&revision=1077790
[jira] [Commented] (PIG-3815) Hadoop bug causes to pig to fail silently with jar cache
[ https://issues.apache.org/jira/browse/PIG-3815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13939618#comment-13939618 ]

Aniket Mokashi commented on PIG-3815:

Thanks for the review, [~julienledem] and [~cheolsoo]. I have attached a revised patch and committed it to trunk!
[jira] [Commented] (PIG-3815) Hadoop bug causes to pig to fail silently with jar cache
[ https://issues.apache.org/jira/browse/PIG-3815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13939606#comment-13939606 ]

Julien Le Dem commented on PIG-3815:

Same comment as item 1 from Cheolsoo. Otherwise, this looks good to me.
[jira] [Commented] (PIG-3815) Hadoop bug causes to pig to fail silently with jar cache
[ https://issues.apache.org/jira/browse/PIG-3815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13939354#comment-13939354 ]

Cheolsoo Park commented on PIG-3815:

# Can you delete this? It's unused.
{code}
+import org.codehaus.plexus.util.IOUtil;
{code}
# Do you mind fixing JobControlCompiler.java#L1700 too? Looks like we can use IOUtils.closeQuietly() here too.
{code}
OutputStream os = fs.create(dst);
try {
    IOUtils.copyBytes(url.openStream(), os, 4096, true);
} finally {
    // IOUtils can not close both the input and the output properly in a finally
    // as we can get an exception in between opening the stream and calling the method
    os.close();
}
{code}
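The quiet-close pattern suggested in the review comment above might look roughly like the following. This is a standard-library sketch, not the code from the patch: `QuietCloseDemo`, `closeQuietly`, and `copy` are hypothetical names, and the helper only imitates what a library `closeQuietly()` method is generally understood to do (close a stream and swallow any `IOException`).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Closeable;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

// Illustration only: a hand-rolled quiet-close helper and a copy loop that
// releases both streams in a finally block. All names are hypothetical.
class QuietCloseDemo {

    // Close the resource if present and ignore any IOException from close().
    static void closeQuietly(Closeable resource) {
        if (resource == null) {
            return;
        }
        try {
            resource.close();
        } catch (IOException ignored) {
            // Deliberately swallowed: a close failure must not mask
            // an exception thrown by the copy itself.
        }
    }

    // Copy all bytes from in to out and return the number of bytes copied.
    // Both streams are closed whether or not the copy succeeds.
    static long copy(InputStream in, OutputStream out) throws IOException {
        try {
            byte[] buffer = new byte[4096];
            long total = 0;
            int read;
            while ((read = in.read(buffer)) != -1) {
                out.write(buffer, 0, read);
                total += read;
            }
            return total;
        } finally {
            closeQuietly(in);
            closeQuietly(out);
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "jar bytes".getBytes(StandardCharsets.UTF_8);
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        System.out.println(copy(new ByteArrayInputStream(data), sink));
    }
}
```

One trade-off of this pattern is that swallowing a close error on an output stream can hide a failed final flush, so it fits best where a failed copy is already reported through the exception from the copy loop.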