I would agree with removing it from the default build for now.

I only used thrift because that's what we were using for all of the
RPC at the time.  I'd rather that we just settle on one RPC to rule
them all, and I will change the code accordingly.


On Aug 28, 2009, at 3:04 PM, Steve Gao wrote:

Thanks, Brian. Would you tell me the filename of the code snippet?

--- On Fri, 8/28/09, Brian Bockelman <bbock...@cse.unl.edu> wrote:

From: Brian Bockelman <bbock...@cse.unl.edu>
Subject: Re: [Help] Why does "java.util.zip.ZipOutputStream" need to use /tmp?
To: common-user@hadoop.apache.org
Date: Friday, August 28, 2009, 2:37 PM

Actually, poking the code, it seems that the streaming package does set this value:

String tmp = jobConf_.get("stream.tmpdir"); //, "/tmp/${user.name}/"

Try setting stream.tmpdir to a different directory maybe?
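
For example (an untested sketch: the -jobconf flag and the contrib jar path are from memory for 0.18-era streaming, so check them against your install):

    # sketch only; the streaming jar location varies by install
    hadoop jar $HADOOP_HOME/contrib/streaming/hadoop-0.18.3-streaming.jar \
        -jobconf stream.tmpdir=/path/to/large/local/disk/tmp \
        -input ... -output ... -mapper ... -reducer ...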

Brian

On Aug 28, 2009, at 1:31 PM, Steve Gao wrote:

Thanks a lot, Brian. It seems to be a design flaw in hadoop that it cannot manage (or pass in) the temp directory of "java.util.zip". Can we create a JIRA ticket for this?

--- On Fri, 8/28/09, Brian Bockelman <bbock...@cse.unl.edu> wrote:

From: Brian Bockelman <bbock...@cse.unl.edu>
Subject: Re: [Help] Why does "java.util.zip.ZipOutputStream" need to use /tmp?
To:
Cc: common-user@hadoop.apache.org
Date: Friday, August 28, 2009, 2:27 PM

Hey Steve,

Correct, java.util.zip.* does not necessarily respect hadoop settings.
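
The zip classes never pick a directory on their own; they write to whatever stream the caller hands them, so the location is chosen by the calling code (streaming's JarBuilder, in your trace below). A minimal sketch of that behavior; the file and entry names here are made up for illustration:

    import java.io.File;
    import java.io.FileOutputStream;
    import java.util.zip.ZipEntry;
    import java.util.zip.ZipOutputStream;

    public class ZipTmpSketch {
        public static void main(String[] args) throws Exception {
            // The caller picks the directory; java.util.zip never reads
            // hadoop.tmp.dir. java.io.tmpdir defaults to /tmp on Linux.
            File jar = new File(System.getProperty("java.io.tmpdir"),
                                "example-job.jar"); // hypothetical name
            ZipOutputStream zos = new ZipOutputStream(new FileOutputStream(jar));
            zos.putNextEntry(new ZipEntry("dummy.txt")); // hypothetical entry
            zos.write("hello".getBytes());
            zos.closeEntry(); // the frame where "No space left" surfaces
            zos.close();
        }
    }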

Try setting TMPDIR in the environment to point at your large local disk. It might respect that, if Java decides to act like a unix utility.

http://en.wikipedia.org/wiki/TMPDIR
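
TMPDIR is a shell convention, though; the knob Java itself reads is the java.io.tmpdir system property. An untested sketch, assuming the stock bin/hadoop script forwards HADOOP_OPTS to the client JVM:

    # assumption: bin/hadoop passes HADOOP_OPTS to the JVM it launches
    export TMPDIR=/path/to/large/local/disk/tmp
    export HADOOP_OPTS="-Djava.io.tmpdir=/path/to/large/local/disk/tmp"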

Brian

On Aug 28, 2009, at 1:19 PM, Steve Gao wrote:

Would someone give us a hint? Thanks.
Why does "java.util.zip.ZipOutputStream" need to use /tmp?

The hadoop version is 0.18.3. Recently we got an "out of space" error from "java.util.zip.ZipOutputStream". We found that /tmp was full, and after cleaning /tmp the problem was solved.

However, why does hadoop need to use /tmp at all? We had already configured hadoop's tmp dir to a large local disk in hadoop-site.xml:

<property>
    <name>hadoop.tmp.dir</name>
    <value> ... some large local disk ... </value>
</property>


Could it be because java.util.zip.ZipOutputStream uses /tmp even though we configured hadoop.tmp.dir to point at a large local disk?

The error log is here FYI:

java.io.IOException: No space left on device
    at java.io.FileOutputStream.write(Native Method)
    at java.util.zip.ZipOutputStream.writeInt(ZipOutputStream.java:445)
    at java.util.zip.ZipOutputStream.writeEXT(ZipOutputStream.java:362)
    at java.util.zip.ZipOutputStream.closeEntry(ZipOutputStream.java:220)
    at java.util.zip.ZipOutputStream.finish(ZipOutputStream.java:301)
    at java.util.zip.DeflaterOutputStream.close(DeflaterOutputStream.java:146)
    at java.util.zip.ZipOutputStream.close(ZipOutputStream.java:321)
    at org.apache.hadoop.streaming.JarBuilder.merge(JarBuilder.java:79)
    at org.apache.hadoop.streaming.StreamJob.packageJobJar(StreamJob.java:628)
    at org.apache.hadoop.streaming.StreamJob.setJobConf(StreamJob.java:843)
    at org.apache.hadoop.streaming.StreamJob.go(StreamJob.java:110)
    at org.apache.hadoop.streaming.HadoopStreaming.main(HadoopStreaming.java:33)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:155)
    at org.apache.hadoop.mapred.JobShell.run(JobShell.java:194)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
    at org.apache.hadoop.mapred.JobShell.main(JobShell.java:220)
Executing Hadoop job failure