Raw files become zero bytes when MapReduce job hits OutOfMemory error

2009-04-13 Thread javateck javateck
I'm running some MapReduce jobs, and some of them hit OutOfMemory errors. I find that the raw data itself also gets corrupted and becomes zero bytes, which is very strange to me. I haven't looked into it in detail yet, but I just wanted to check quickly with someone who has seen this. I'm running 0.18.3. Thanks.

Re: OutOfMemory error processing large amounts of gz files

2009-03-02 Thread bzheng
into sequence files. If the script's setInputPaths takes a Path[] of all 24k files, it will get an OutOfMemory error at about 35% map complete. If I make the script process 2k files per job and run 12 jobs consecutively, then it goes through all files fine. The cluster I'm using has about 67 nodes
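
A minimal sketch of the batching workaround described above, assuming the old org.apache.hadoop.mapred API of the 0.18 era (the class name, output layout, and the omitted mapper/output-format setup are hypothetical placeholders, not code from the thread): each job is handed only a slice of the 24k input paths via FileInputFormat.setInputPaths, and the jobs run one after another.

import java.io.IOException;
import java.util.Arrays;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;

public class BatchedGzToSeq {
  // Run one job per slice of the input paths instead of handing all 24k
  // files to a single job, mirroring the 2k-files-per-job workaround.
  public static void runInBatches(Path[] allFiles, int batchSize, Path outRoot)
      throws IOException {
    for (int start = 0, batch = 0; start < allFiles.length; start += batchSize, batch++) {
      int end = Math.min(start + batchSize, allFiles.length);
      JobConf conf = new JobConf(BatchedGzToSeq.class);
      conf.setJobName("gz-to-seq-batch-" + batch);
      // Only the current slice of paths is given to this job.
      FileInputFormat.setInputPaths(conf, Arrays.copyOfRange(allFiles, start, end));
      FileOutputFormat.setOutputPath(conf, new Path(outRoot, "batch-" + batch));
      // Mapper, output format, and compression settings for the actual
      // gz -> SequenceFile conversion would be configured here.
      JobClient.runJob(conf);  // blocks until this batch finishes
    }
  }
}

JobClient.runJob blocks, so the batches run consecutively, as in the 12-jobs-in-a-row workaround described in the thread.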

Re: OutOfMemory error processing large amounts of gz files

2009-03-02 Thread Runping Qi
an OutOfMemory error at about 35% map complete. If I make the script process 2k files per job and run 12 jobs consecutively, then it goes through all files fine. The cluster I'm using has about 67 nodes. Each node has 16GB memory, max 7 map slots, and max 2 reduce slots. The map task is really simple

Re: OutOfMemory error processing large amounts of gz files

2009-02-26 Thread Arun C Murthy
On Feb 24, 2009, at 4:03 PM, bzheng wrote: 2009-02-23 14:27:50,902 INFO org.apache.hadoop.mapred.TaskTracker: java.lang.OutOfMemoryError: Java heap space That tells you that your TaskTracker is running out of memory, not your reduce tasks. I think you are hitting
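
To make the distinction above concrete, a hedged sketch of where the two heaps are configured (the values are illustrative, not settings quoted from the thread): the TaskTracker daemon gets its heap from hadoop-env.sh, while the spawned map/reduce task JVMs get theirs from mapred.child.java.opts.

==hadoop-env.sh==
# Heap (in MB) for the Hadoop daemons themselves (TaskTracker, JobTracker,
# NameNode, DataNode) -- this is what matters when the TaskTracker process
# is the one throwing OutOfMemoryError.
export HADOOP_HEAPSIZE=2000

==hadoop-site.xml==
<!-- Only affects the child JVMs spawned for map/reduce tasks. -->
<property>
  <name>mapred.child.java.opts</name>
  <value>-Xmx1024m</value>
</property>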

Re: OutOfMemory error processing large amounts of gz files

2009-02-26 Thread bzheng
. One interesting thing though: if we do use gz files, the out of memory issue occurs within a few minutes. -- View this message in context: http://www.nabble.com/OutOfMemory-error-processing-large-amounts-of-gz-files-tp22193552p22231249.html Sent from the Hadoop core-user mailing list archive

Re: OutOfMemory error processing large amounts of gz files

2009-02-25 Thread Tom White
about 24k gz files (about 550GB total) on hdfs and has a really simple java program to convert them into sequence files. If the script's setInputPaths takes a Path[] of all 24k files, it will get an OutOfMemory error at about 35% map complete. If I make the script process 2k files per job

Re: OutOfMemory error processing large amounts of gz files

2009-02-25 Thread bzheng
550GB total) on hdfs and has a really simple java program to convert them into sequence files. If the script's setInputPaths takes a Path[] of all 24k files, it will get an OutOfMemory error at about 35% map complete. If I make the script process 2k files per job and run 12 jobs consecutively

Re: OutOfMemory error processing large amounts of gz files

2009-02-24 Thread Gordon Mohr
of which is holding a bit of native memory.) - Gordon @ IA bzheng wrote: I have about 24k gz files (about 550GB total) on hdfs and have a really simple java program to convert them into sequence files. If the script's setInputPaths takes a Path[] of all 24k files, it will get an OutOfMemory error

OutOfMemory Error, in spite of large amounts provided

2008-12-28 Thread Saptarshi Guha
Hello, I have work machines with 32GB and allocated 16GB to the heap size. ==hadoop-env.sh== export HADOOP_HEAPSIZE=16384 ==hadoop-site.xml== <property> <name>mapred.child.java.opts</name> <value>-Xmx16384m</value> </property> The same code runs when not being run through Hadoop, but it fails when in a

Re: OutOfMemory Error, in spite of large amounts provided

2008-12-28 Thread Brian Bockelman
Hey Saptarshi, Watch the running child process while using ps, top, or Ganglia monitoring. Does the map task actually use 16GB of memory, or is the memory not getting set properly? Brian On Dec 28, 2008, at 3:00 PM, Saptarshi Guha wrote: Hello, I have work machines with 32GB and
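
One way to answer that question from inside the job itself (a hypothetical snippet to drop into the mapper, not code from the thread): have each task log the heap its child JVM actually received.

// Logs the child JVM's maximum heap so the task logs show whether the
// -Xmx from mapred.child.java.opts actually took effect.
long maxHeapMb = Runtime.getRuntime().maxMemory() / (1024L * 1024L);
System.err.println("child JVM max heap = " + maxHeapMb + " MB");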

Re: OutOfMemory Error, in spite of large amounts provided

2008-12-28 Thread Saptarshi Guha
On Sun, Dec 28, 2008 at 4:33 PM, Brian Bockelman bbock...@cse.unl.edu wrote: Hey Saptarshi, Watch the running child process while using ps, top, or Ganglia monitoring. Does the map task actually use 16GB of memory, or is the memory not getting set properly? Brian I haven't figured out how

Re: OutOfMemory Error, in spite of large amounts provided

2008-12-28 Thread Saptarshi Guha
Caught it in action. Running ps -e -o 'vsz pid ruser args' |sort -nr|head -5 on a machine where the map task was running: 04812 16962 sguha /home/godhuli/custom/jdk1.6.0_11/jre/bin/java

Re: OutOfMemory Error, in spite of large amounts provided

2008-12-28 Thread Amareshwari Sriramadasu
Saptarshi Guha wrote: Caught it in action. Running ps -e -o 'vsz pid ruser args' |sort -nr|head -5 on a machine where the map task was running: 04812 16962 sguha /home/godhuli/custom/jdk1.6.0_11/jre/bin/java

Re: JobTracker Failing to respond with OutOfMemory error

2008-12-07 Thread charles du
Thanks for the information. It helps a lot. On Sat, Dec 6, 2008 at 11:54 AM, Arun C Murthy [EMAIL PROTECTED] wrote: On Dec 6, 2008, at 11:40 AM, charles du wrote: I used the default value, which I believe is 1000 MB. My cluster has about 30 machines. Each machine is configured to run up to

Re: JobTracker Failing to respond with OutOfMemory error

2008-12-06 Thread Arun C Murthy
PROTECTED] wrote: Hi all, we have been using hadoop-0.17.2 for some time now. Since yesterday, we have been seeing the JobTracker failing to respond with an OutOfMemory error very frequently. Things go fine after restarting it, but the problem recurs after a while. Below

Re: JobTracker Failing to respond with OutOfMemory error

2008-12-06 Thread charles du
yesterday, we have been seeing the JobTracker failing to respond with an OutOfMemory error very frequently. Things go fine after restarting it, but the problem recurs after a while. Below is the exception that we are seeing in the jobtracker logs. Can someone please suggest what is going

Re: JobTracker Failing to respond with OutOfMemory error

2008-12-06 Thread Arun C Murthy
On Dec 6, 2008, at 11:40 AM, charles du wrote: I used the default value, which I believe is 1000 MB. My cluster has about 30 machines. Each machine is configured to run up to 5 tasks. We run hourly, daily jobs on the cluster. When OOM happened, I was running a job with 1500 - 1600
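
If the JobTracker's roughly 1000 MB default heap is the limit for jobs of that size, a minimal sketch of raising it for that one daemon (assuming the per-daemon *_OPTS hooks that hadoop-env.sh shipped with in the 0.17 line; the value is illustrative, not from the thread):

==hadoop-env.sh==
# Raises the heap for the JobTracker only; the other daemons keep the
# HADOOP_HEAPSIZE default. The trailing reference preserves any existing opts.
export HADOOP_JOBTRACKER_OPTS="-Xmx3000m $HADOOP_JOBTRACKER_OPTS"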

Re: JobTracker Failing to respond with OutOfMemory error

2008-12-05 Thread charles du
, Pallavi [EMAIL PROTECTED] wrote: Hi all, we have been using hadoop-0.17.2 for some time now. Since yesterday, we have been seeing the JobTracker failing to respond with an OutOfMemory error very frequently. Things go fine after restarting it, but the problem recurs after a while. Below

Re: JobTracker Failing to respond with OutOfMemory error

2008-12-05 Thread charles du
. On Wed, Nov 19, 2008 at 8:40 PM, Palleti, Pallavi [EMAIL PROTECTED] wrote: Hi all, we have been using hadoop-0.17.2 for some time now. Since yesterday, we have been seeing the JobTracker failing to respond with an OutOfMemory error very frequently. Things go fine after restarting

JobTracker Failing to respond with OutOfMemory error

2008-11-19 Thread Palleti, Pallavi
Hi all, we have been using hadoop-0.17.2 for some time now. Since yesterday, we have been seeing the JobTracker failing to respond with an OutOfMemory error very frequently. Things go fine after restarting it, but the problem recurs after a while. Below is the exception that we are seeing

Re: OutOfMemory Error

2008-09-19 Thread Edward J. Yoon
. Yoon Sent: Friday, September 19, 2008 10:35 AM To: core-user@hadoop.apache.org; [EMAIL PROTECTED]; [EMAIL PROTECTED] Subject: Re: OutOfMemory Error The key is of the form ID : DenseVector representation in Mahout; I guess the vector size is too large, so it'll need a distributed vector

Re: OutOfMemory Error

2008-09-18 Thread Edward J. Yoon
Palleti [EMAIL PROTECTED] wrote: Hi all, I am getting an OutOfMemory error as shown below when I run map-red on a huge amount of data: java.lang.OutOfMemoryError: Java heap space at org.apache.hadoop.io.DataOutputBuffer$Buffer.write(DataOutputBuffer.java:52

RE: OutOfMemory Error

2008-09-18 Thread Palleti, Pallavi
] Subject: Re: OutOfMemory Error The key is of the form ID : DenseVector representation in Mahout; I guess the vector size is too large, so it'll need a distributed vector architecture (or 2D partitioning strategies) for large-scale matrix operations. The Hama team is investigating these problem areas
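
A rough sketch of the blocked/partitioned idea mentioned above, written with plain Hadoop writables rather than Mahout or Hama types (the class, key format, and block size are hypothetical, purely to illustrate the principle): split each large vector into fixed-size blocks keyed by id:blockIndex, so no single emitted key/value has to hold the whole vector in the serialization buffer.

import java.io.IOException;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.OutputCollector;

public class BlockedVectorEmit {
  static final int BLOCK = 4096;  // entries per emitted block (illustrative)

  // Instead of serializing one huge vector as a single record (the pattern
  // behind the DataOutputBuffer OutOfMemoryError quoted above), emit
  // fixed-size blocks keyed by "id:blockIndex" so each record stays small.
  public static void emitBlocks(String id, double[] vector,
                                OutputCollector<Text, Text> out) throws IOException {
    for (int b = 0; b * BLOCK < vector.length; b++) {
      int from = b * BLOCK;
      int to = Math.min(from + BLOCK, vector.length);
      StringBuilder sb = new StringBuilder();
      for (int i = from; i < to; i++) {
        if (i > from) sb.append(',');
        sb.append(vector[i]);
      }
      out.collect(new Text(id + ":" + b), new Text(sb.toString()));
    }
  }
}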

OutOfMemory Error

2008-09-17 Thread Pallavi Palleti
Hi all, I am getting an OutOfMemory error as shown below when I run map-red on a huge amount of data: java.lang.OutOfMemoryError: Java heap space at org.apache.hadoop.io.DataOutputBuffer$Buffer.write(DataOutputBuffer.java:52) at org.apache.hadoop.io.DataOutputBuffer.write