I'm running some MapReduce jobs, and some of them hit OutOfMemory errors. I also
find that the raw input data itself got corrupted and became zero bytes, which is
very strange to me. I haven't looked into it in much detail, but I just want to
check quickly with someone who has seen this before. I'm running 0.18.3.
Thanks
I have about 24k gz files (about 550GB total) on HDFS and a really simple
Java program to convert them into sequence files. If the script's
setInputPaths takes a Path[] of all 24k files, it will get an OutOfMemory
error at about 35% map complete. If I make the script process 2k files
per job and run 12 jobs consecutively, then it goes through all files
fine. The cluster I'm using has about 67 nodes. Each node has 16GB
memory, max 7 map, and max 2 reduce.
The map task is really simple.
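A minimal sketch of the batching approach described above (not the poster's actual code: the driver class name and the /data/gz and /data/seq paths are assumptions), using the 0.18-era mapred API. JobClient.runJob blocks, so the 12 jobs run one after another:

import java.io.IOException;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;

public class GzToSeqDriver {
  public static void main(String[] args) throws IOException {
    JobConf base = new JobConf(GzToSeqDriver.class);
    FileSystem fs = FileSystem.get(base);
    // Hypothetical input directory holding the ~24k .gz files.
    FileStatus[] all = fs.listStatus(new Path("/data/gz"));
    int batchSize = 2000;  // process 2k files per job, as described above
    for (int start = 0, batch = 0; start < all.length; start += batchSize, batch++) {
      int end = Math.min(start + batchSize, all.length);
      Path[] inputs = new Path[end - start];
      for (int i = start; i < end; i++) {
        inputs[i - start] = all[i].getPath();
      }
      JobConf conf = new JobConf(GzToSeqDriver.class);
      conf.setJobName("gz-to-seq-" + batch);
      FileInputFormat.setInputPaths(conf, inputs);  // only this batch's 2k paths
      FileOutputFormat.setOutputPath(conf, new Path("/data/seq/batch-" + batch));
      // mapper, key/value classes and SequenceFileOutputFormat setup omitted
      JobClient.runJob(conf);  // blocks until the job finishes
    }
  }
}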
On Feb 24, 2009, at 4:03 PM, bzheng wrote:
2009-02-23 14:27:50,902 INFO org.apache.hadoop.mapred.TaskTracker:
java.lang.OutOfMemoryError: Java heap space
That tells you that your TaskTracker is running out of memory, not
your reduce tasks.
I think you are hitting
. One interesting thing though: if we do use gz
files, the out of memory issue occurs within a few minutes.
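For anyone following along: the daemon heap (which the TaskTracker uses) and the per-task child JVM heap are set independently, so raising a child -Xmx will not help a TaskTracker that is itself running out of memory. A minimal sketch reusing the two settings that appear later in this digest, with placeholder values:

==hadoop-env.sh==
# Heap, in MB, for the Hadoop daemons (TaskTracker, JobTracker, DataNode, ...).
export HADOOP_HEAPSIZE=2048

==hadoop-site.xml==
<property>
  <!-- Heap for each spawned map/reduce child JVM; independent of HADOOP_HEAPSIZE. -->
  <name>mapred.child.java.opts</name>
  <value>-Xmx1024m</value>
</property>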
of which is holding a bit of
native memory.)
- Gordon @ IA
Hello,
I have work machines with 32GB of memory and have allocated 16GB to the heap size:
==hadoop-env.sh==
export HADOOP_HEAPSIZE=16384
==hadoop-site.xml==
<property>
  <name>mapred.child.java.opts</name>
  <value>-Xmx16384m</value>
</property>
The same code runs fine when not being run through Hadoop, but it fails
when run inside a map task.
Hey Saptarshi,
Watch the running child process while using ps, top, or Ganglia
monitoring. Does the map task actually use 16GB of memory, or is the
memory not getting set properly?
Brian
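One way to answer that question from inside the job itself (my own sketch, not something from this thread): have the mapper log the child JVM's configured opts and its maximum heap, then check the task logs in the web UI.

import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

public class HeapCheckMapper extends MapReduceBase
    implements Mapper<LongWritable, Text, LongWritable, Text> {

  public void configure(JobConf job) {
    // Printed to the task's stdout, visible in the task logs via the web UI.
    System.out.println("mapred.child.java.opts = " + job.get("mapred.child.java.opts"));
    System.out.println("Runtime.maxMemory() = " + Runtime.getRuntime().maxMemory()
        + " bytes");  // should be close to 16GB if -Xmx16384m took effect
  }

  public void map(LongWritable key, Text value,
                  OutputCollector<LongWritable, Text> output, Reporter reporter)
      throws IOException {
    output.collect(key, value);  // identity map; the check happens in configure()
  }
}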
On Sun, Dec 28, 2008 at 4:33 PM, Brian Bockelman bbock...@cse.unl.edu wrote:
Hey Saptarshi,
Watch the running child process while using ps, top, or Ganglia
monitoring. Does the map task actually use 16GB of memory, or is the memory
not getting set properly?
Brian
I haven't figured out how
Caught it in action.
Running ps -e -o 'vsz pid ruser args' |sort -nr|head -5
on a machine where the map task was running
04812 16962 sguha /home/godhuli/custom/jdk1.6.0_11/jre/bin/java
Thanks for the information. It helps a lot.
On Sat, Dec 6, 2008 at 11:54 AM, Arun C Murthy [EMAIL PROTECTED] wrote:
On Dec 6, 2008, at 11:40 AM, charles du wrote:
I used the default value, which I believe is 1000 MB. My cluster has about
30 machines. Each machine is configured to run up to 5 tasks. We run hourly and
daily jobs on the cluster. When the OOM happened, I was running a job with 1500
- 1600
On Wed, Nov 19, 2008 at 8:40 PM, Palleti, Pallavi [EMAIL PROTECTED] wrote:
Hi all,
We are using hadoop-0.17.2 for some time now. Since yesterday, we have
been seeing the jobTracker failing to respond with an OutOfMemory Error very
frequently. Things are going fine after restarting it, but the problem
recurs after a while. Below is the exception that we are seeing in the
jobtracker logs. Can someone please suggest what is going wrong?
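Not something stated in this thread, but for context on the knobs involved: besides the daemon heap (HADOOP_HEAPSIZE, shown earlier in this digest), the JobTracker keeps a window of recently completed jobs in memory, bounded per user by mapred.jobtracker.completeuserjobs.maximum; check that your Hadoop version carries this property. A hedged sketch with a placeholder value, not a recommendation for this cluster:

==hadoop-site.xml==
<property>
  <!-- Completed jobs retained in JobTracker memory per user; smaller values
       reduce JobTracker heap pressure. -->
  <name>mapred.jobtracker.completeuserjobs.maximum</name>
  <value>25</value>
</property>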
. Yoon
Sent: Friday, September 19, 2008 10:35 AM
To: core-user@hadoop.apache.org; [EMAIL PROTECTED]; [EMAIL PROTECTED]
Subject: Re: OutOfMemory Error
The key is of the form ID :DenseVector Representation in mahout with
I guess the vector size is too large, so it'll need a distributed vector
architecture (or 2d partitioning strategies) for large-scale matrix
operations. The Hama team is investigating these problem areas.
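A rough back-of-the-envelope (with a made-up dimension, since the actual vector size is not given in the thread) for why a large dense vector key can exhaust the heap during serialization:

public class DenseVectorSizeEstimate {
  public static void main(String[] args) {
    // Hypothetical dimension; the real vector size is not stated in the thread.
    long dimension = 10000000L;   // 10 million entries
    long bytesPerEntry = 8L;      // one double per entry in a dense representation
    long bytesPerKey = dimension * bytesPerEntry;
    // Roughly 80 MB of raw payload per key. DataOutputBuffer grows its backing
    // byte array as the record is written, so serializing keys of this size can
    // exhaust a modest task heap after only a few records.
    System.out.println("Approximate serialized size per key: "
        + (bytesPerKey / (1024L * 1024L)) + " MB");
  }
}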
Palleti [EMAIL PROTECTED] wrote:
Hi all,
I am getting an outofmemory error as shown below when I ran map-red on a huge
amount of data:
java.lang.OutOfMemoryError: Java heap space
at org.apache.hadoop.io.DataOutputBuffer$Buffer.write(DataOutputBuffer.java:52)
at org.apache.hadoop.io.DataOutputBuffer.write