[ https://issues.apache.org/jira/browse/YARN-1219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13784708#comment-13784708 ]

Omkar Vinit Joshi commented on YARN-1219:
-----------------------------------------

bq. I didn't see anywhere in the code that treats the ".tmp" file differently. If 
you know, please let me know. If the original author only used a suffix to make 
sure the name is different from the original file name, it doesn't seem worth it 
to add unnecessary and error-prone rename operations just to keep the temporary 
file name suffix.
No, we are not adding new rename operations, just moving them around, from 
unpack() to here. Ideally that rename code should have been here in the first 
place. I remember we had a bug to remove that .tmp file, but I think this is 
fine; we can go ahead with this patch, as it will not break anything else.
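
For reference, here is a minimal sketch of that idea (hypothetical names, not the actual patch): strip the ".tmp" suffix and rename the download back to its real name before handing it to unpack(), so the suffix-based gzip detection in FileUtil.unTar() still works.

{code}
// Hypothetical sketch only: rename the ".tmp" download copy back to its
// real name before unpacking, so "...tar.gz" detection keeps working.
import java.io.File;
import java.io.IOException;

public class RenameBeforeUnpackSketch {
  static File renameBeforeUnpack(File tmpCopy) throws IOException {
    String name = tmpCopy.getName();
    if (!name.endsWith(".tmp")) {
      return tmpCopy; // nothing to strip
    }
    File finalCopy = new File(tmpCopy.getParentFile(),
        name.substring(0, name.length() - ".tmp".length()));
    if (!tmpCopy.renameTo(finalCopy)) {
      throw new IOException("Failed to rename " + tmpCopy + " to " + finalCopy);
    }
    return finalCopy; // pass this file to unpack()/FileUtil.unTar()
  }
}
{code}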

> FSDownload changes file suffix making FileUtil.unTar() throw exception
> ----------------------------------------------------------------------
>
>                 Key: YARN-1219
>                 URL: https://issues.apache.org/jira/browse/YARN-1219
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 3.0.0, 2.1.1-beta, 2.1.2-beta
>            Reporter: shanyu zhao
>            Assignee: shanyu zhao
>             Fix For: 2.1.2-beta
>
>         Attachments: YARN-1219.patch
>
>
> While running a Hive join operation on YARN, I saw the exception described 
> below. This is caused by FSDownload copying the file into a temp file and 
> changing the suffix to ".tmp" before unpacking it. In unpack(), it uses 
> FileUtil.unTar(), which determines whether the file is gzipped by looking at 
> the file suffix:
> {code}
> boolean gzipped = inFile.toString().endsWith("gz");
> {code}
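> For illustration (assuming a resource named archive.tar.gz, which is not from the original report), the renamed temp copy no longer satisfies that check, so the gzip-compressed bytes go straight to the tar parser:
> {code}
> String original = "archive.tar.gz";
> String tmpCopy  = original + ".tmp";            // name FSDownload gives the temp copy
> boolean gzOriginal = original.endsWith("gz");   // true  -> unTar gunzips first
> boolean gzTmpCopy  = tmpCopy.endsWith("gz");    // false -> raw gzip bytes hit the tar parser
> {code}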
> To fix this problem, we can remove the ".tmp" suffix from the temp file name.
> Here is the detailed exception:
> org.apache.commons.compress.archivers.tar.TarArchiveInputStream.getNextTarEntry(TarArchiveInputStream.java:240)
>       at org.apache.hadoop.fs.FileUtil.unTarUsingJava(FileUtil.java:676)
>       at org.apache.hadoop.fs.FileUtil.unTar(FileUtil.java:625)
>       at org.apache.hadoop.yarn.util.FSDownload.unpack(FSDownload.java:203)
>       at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:287)
>       at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:50)
>       at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>       at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>       at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>       at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>       at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
>       at java.lang.Thread.run(Thread.java:722)


