Re: OOMs in java hadoop tests?

2010-04-30 Thread Doug Cutting

Scott Carey wrote:

How much longer do the tests take with that setting?


It doesn't seem markedly longer.


Also, if destroying the JVM makes it work, then there is probably some sort of 
leak in certain tests


Yes, something seems to be leaking.


I'm fairly sure that running out of file descriptors has a slightly different 
message.


With toy programs I can't reproduce an OOM with descriptor exhaustion, 
but I know I've seen it before.


Doug


Re: OOMs in java hadoop tests?

2010-04-29 Thread Scott Carey
On Apr 29, 2010, at 4:54 PM, Doug Cutting wrote:

> Scott Carey wrote:
>> Yes, I see the same thing.
>> 
>> Adding maxmemory="128m"
>> to the junit ant task fixed it for me.
> 
> I wonder if what we're actually seeing is an exhaustion of file 
> descriptors due to some leak.  As I recall, this also shows up as an OOM.
> 
> Another way to fix this, rather than adding maxmemory, is to remove 
> forkmode="once", so that a new JVM is forked per test.  That also fixes 
> it for me.
> 

How much longer do the tests take with that setting?  Generating a new JVM for 
each unit test will increase the run time.

Also, if destroying the JVM makes it work, then there is probably some sort of 
leak in certain tests -- static variables or something.  We could force the JVM 
to create a heap dump on exit and poke around with the heap dump to see.

I'm fairly sure that running out of file descriptors has a slightly different 
message.


> Doug



Re: OOMs in java hadoop tests?

2010-04-29 Thread Doug Cutting

Scott Carey wrote:

Yes, I see the same thing.

Adding maxmemory="128m"
to the junit ant task fixed it for me.


I wonder if what we're actually seeing is an exhaustion of file 
descriptors due to some leak.  As I recall, this also shows up as an OOM.


Another way to fix this, rather than adding maxmemory, is to remove 
forkmode="once", so that a new JVM is forked per test.  That also fixes 
it for me.


Doug


Re: OOMs in java hadoop tests?

2010-04-03 Thread Scott Carey
Yes, I see the same thing.

Adding maxmemory="128m"
to the junit ant task fixed it for me.  Alternatively we may be able to turn 
down some of the Hadoop memory usage parameters such as sort and in memory fs 
space.

I see some other errors:
RAT fails, with:
Unapproved licenses:
  
/Users/scottc/Documents/workspace/avro-trunk/build/avro-src-1.4.0-SNAPSHOT/lang/ruby/test/test_datafile.rb

Java unit tests fail mysteriously for me after TestUtf8 -- no errors are logged 
and no tests report a failure in the output.  It might be an environment issue 
on my machine however.

On Apr 3, 2010, at 6:09 PM, Jeff Hodges wrote:

> Anyone else seeing OutOfMemory errors when trying to run the hadoop tests?
> 
> Specifically:
> 
> 
>[junit] Running org.apache.avro.mapred.TestWordCountGeneric
>[junit] Apr 3, 2010 5:57:56 PM org.apache.hadoop.metrics.jvm.JvmMetrics 
> init
>[junit] INFO: Initializing JVM Metrics with
> processName=JobTracker, sessionId=
>[junit] Apr 3, 2010 5:57:56 PM org.apache.hadoop.mapred.JobClient
> configureCommandLineOptions
>[junit] WARNING: Use GenericOptionsParser for parsing the
> arguments. Applications should implement Tool for the same.
>[junit] Apr 3, 2010 5:57:57 PM org.apache.hadoop.mapred.JobClient
> configureCommandLineOptions
>[junit] WARNING: No job jar file set.  User classes may not be
> found. See JobConf(Class) or JobConf#setJar(String).
>[junit] Apr 3, 2010 5:57:57 PM
> org.apache.hadoop.mapred.FileInputFormat listStatus
>[junit] INFO: Total input paths to process : 1
>[junit] Apr 3, 2010 5:57:57 PM org.apache.hadoop.mapred.JobClient
> monitorAndPrintJob
>[junit] INFO: Running job: job_local_0001
>[junit] Apr 3, 2010 5:57:57 PM
> org.apache.hadoop.mapred.FileInputFormat listStatus
>[junit] INFO: Total input paths to process : 1
>[junit] Apr 3, 2010 5:57:57 PM org.apache.hadoop.mapred.MapTask 
> runOldMapper
>[junit] INFO: numReduceTasks: 1
>[junit] Apr 3, 2010 5:57:57 PM
> org.apache.hadoop.mapred.MapTask$MapOutputBuffer 
>[junit] INFO: io.sort.mb = 100
>[junit] Apr 3, 2010 5:57:57 PM
> org.apache.hadoop.mapred.LocalJobRunner$Job run
>[junit] WARNING: job_local_0001
>[junit] java.lang.OutOfMemoryError: Java heap space
>[junit]at
> org.apache.hadoop.mapred.MapTask$MapOutputBuffer.(MapTask.java:781)
>[junit]at 
> org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:350)
>[junit]at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
>[junit]at
> org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
>[junit] Apr 3, 2010 5:57:58 PM org.apache.hadoop.mapred.JobClient
> monitorAndPrintJob
>[junit] INFO:  map 0% reduce 0%
>[junit] Apr 3, 2010 5:57:58 PM org.apache.hadoop.mapred.JobClient
> monitorAndPrintJob
>[junit] INFO: Job complete: job_local_0001
>[junit] Apr 3, 2010 5:57:58 PM org.apache.hadoop.mapred.Counters log
>[junit] INFO: Counters: 0
>[junit] Tests run: 1, Failures: 0, Errors: 1, Time elapsed: 2.29 sec
>[junit] Error:
>[junit] Apr 3, 2010 5:57:56 PM org.apache.hadoop.metrics.jvm.JvmMetrics 
> init
>[junit] INFO: Initializing JVM Metrics with
> processName=JobTracker, sessionId=
>[junit] Apr 3, 2010 5:57:56 PM org.apache.hadoop.mapred.JobClient
> configureCommandLineOptions
>[junit] WARNING: Use GenericOptionsParser for parsing the
> arguments. Applications should implement Tool for the same.
>[junit] Apr 3, 2010 5:57:57 PM org.apache.hadoop.mapred.JobClient
> configureCommandLineOptions
>[junit] WARNING: No job jar file set.  User classes may not be
> found. See JobConf(Class) or JobConf#setJar(String).
>[junit] Apr 3, 2010 5:57:57 PM
> org.apache.hadoop.mapred.FileInputFormat listStatus
>[junit] INFO: Total input paths to process : 1
>[junit] Apr 3, 2010 5:57:57 PM org.apache.hadoop.mapred.JobClient
> monitorAndPrintJob
>[junit] INFO: Running job: job_local_0001
>[junit] Apr 3, 2010 5:57:57 PM
> org.apache.hadoop.mapred.FileInputFormat listStatus
>[junit] INFO: Total input paths to process : 1
>[junit] Apr 3, 2010 5:57:57 PM org.apache.hadoop.mapred.MapTask 
> runOldMapper
>[junit] INFO: numReduceTasks: 1
>[junit] Apr 3, 2010 5:57:57 PM
> org.apache.hadoop.mapred.MapTask$MapOutputBuffer 
>[junit] INFO: io.sort.mb = 100
>[junit] Apr 3, 2010 5:57:57 PM
> org.apache.hadoop.mapred.LocalJobRunner$Job run
>[junit] WARNING: job_local_0001
>[junit] java.lang.OutOfMemoryError: Java heap space
>[junit]at
> org.apache.hadoop.mapred.MapTask$MapOutputBuffer.(MapTask.java:781)
>[junit]at 
> org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:350)
>[junit]at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
>[junit]at
> org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
>[junit] Apr 3, 2010 5:57:58 PM org.apache.hadoop.mapred.JobClient
> monitorAndPrintJob
>[junit] INFO:  ma

OOMs in java hadoop tests?

2010-04-03 Thread Jeff Hodges
Anyone else seeing OutOfMemory errors when trying to run the hadoop tests?

Specifically:


[junit] Running org.apache.avro.mapred.TestWordCountGeneric
[junit] Apr 3, 2010 5:57:56 PM org.apache.hadoop.metrics.jvm.JvmMetrics init
[junit] INFO: Initializing JVM Metrics with
processName=JobTracker, sessionId=
[junit] Apr 3, 2010 5:57:56 PM org.apache.hadoop.mapred.JobClient
configureCommandLineOptions
[junit] WARNING: Use GenericOptionsParser for parsing the
arguments. Applications should implement Tool for the same.
[junit] Apr 3, 2010 5:57:57 PM org.apache.hadoop.mapred.JobClient
configureCommandLineOptions
[junit] WARNING: No job jar file set.  User classes may not be
found. See JobConf(Class) or JobConf#setJar(String).
[junit] Apr 3, 2010 5:57:57 PM
org.apache.hadoop.mapred.FileInputFormat listStatus
[junit] INFO: Total input paths to process : 1
[junit] Apr 3, 2010 5:57:57 PM org.apache.hadoop.mapred.JobClient
monitorAndPrintJob
[junit] INFO: Running job: job_local_0001
[junit] Apr 3, 2010 5:57:57 PM
org.apache.hadoop.mapred.FileInputFormat listStatus
[junit] INFO: Total input paths to process : 1
[junit] Apr 3, 2010 5:57:57 PM org.apache.hadoop.mapred.MapTask runOldMapper
[junit] INFO: numReduceTasks: 1
[junit] Apr 3, 2010 5:57:57 PM
org.apache.hadoop.mapred.MapTask$MapOutputBuffer 
[junit] INFO: io.sort.mb = 100
[junit] Apr 3, 2010 5:57:57 PM
org.apache.hadoop.mapred.LocalJobRunner$Job run
[junit] WARNING: job_local_0001
[junit] java.lang.OutOfMemoryError: Java heap space
[junit] at
org.apache.hadoop.mapred.MapTask$MapOutputBuffer.(MapTask.java:781)
[junit] at 
org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:350)
[junit] at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
[junit] at
org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
[junit] Apr 3, 2010 5:57:58 PM org.apache.hadoop.mapred.JobClient
monitorAndPrintJob
[junit] INFO:  map 0% reduce 0%
[junit] Apr 3, 2010 5:57:58 PM org.apache.hadoop.mapred.JobClient
monitorAndPrintJob
[junit] INFO: Job complete: job_local_0001
[junit] Apr 3, 2010 5:57:58 PM org.apache.hadoop.mapred.Counters log
[junit] INFO: Counters: 0
[junit] Tests run: 1, Failures: 0, Errors: 1, Time elapsed: 2.29 sec
[junit] Error:
[junit] Apr 3, 2010 5:57:56 PM org.apache.hadoop.metrics.jvm.JvmMetrics init
[junit] INFO: Initializing JVM Metrics with
processName=JobTracker, sessionId=
[junit] Apr 3, 2010 5:57:56 PM org.apache.hadoop.mapred.JobClient
configureCommandLineOptions
[junit] WARNING: Use GenericOptionsParser for parsing the
arguments. Applications should implement Tool for the same.
[junit] Apr 3, 2010 5:57:57 PM org.apache.hadoop.mapred.JobClient
configureCommandLineOptions
[junit] WARNING: No job jar file set.  User classes may not be
found. See JobConf(Class) or JobConf#setJar(String).
[junit] Apr 3, 2010 5:57:57 PM
org.apache.hadoop.mapred.FileInputFormat listStatus
[junit] INFO: Total input paths to process : 1
[junit] Apr 3, 2010 5:57:57 PM org.apache.hadoop.mapred.JobClient
monitorAndPrintJob
[junit] INFO: Running job: job_local_0001
[junit] Apr 3, 2010 5:57:57 PM
org.apache.hadoop.mapred.FileInputFormat listStatus
[junit] INFO: Total input paths to process : 1
[junit] Apr 3, 2010 5:57:57 PM org.apache.hadoop.mapred.MapTask runOldMapper
[junit] INFO: numReduceTasks: 1
[junit] Apr 3, 2010 5:57:57 PM
org.apache.hadoop.mapred.MapTask$MapOutputBuffer 
[junit] INFO: io.sort.mb = 100
[junit] Apr 3, 2010 5:57:57 PM
org.apache.hadoop.mapred.LocalJobRunner$Job run
[junit] WARNING: job_local_0001
[junit] java.lang.OutOfMemoryError: Java heap space
[junit] at
org.apache.hadoop.mapred.MapTask$MapOutputBuffer.(MapTask.java:781)
[junit] at 
org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:350)
[junit] at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
[junit] at
org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
[junit] Apr 3, 2010 5:57:58 PM org.apache.hadoop.mapred.JobClient
monitorAndPrintJob
[junit] INFO:  map 0% reduce 0%
[junit] Apr 3, 2010 5:57:58 PM org.apache.hadoop.mapred.JobClient
monitorAndPrintJob
[junit] INFO: Job complete: job_local_0001
[junit] Apr 3, 2010 5:57:58 PM org.apache.hadoop.mapred.Counters log
[junit] INFO: Counters: 0
[junit]
[junit] TEST org.apache.avro.mapred.TestWordCountGeneric FAILED
[junit] Running org.apache.avro.mapred.TestWordCountSpecific
[junit] Apr 3, 2010 5:57:58 PM org.apache.hadoop.metrics.jvm.JvmMetrics init
[junit] INFO: Cannot initialize JVM Metrics with
processName=JobTracker, sessionId= - already initialized
[junit] Apr 3, 2010 5:57:58 PM org.apache.hadoop.mapred.JobClient
configureCommandLineOptions
[junit] WARNING: Use GenericOptionsParser for parsing the
arg