shahab mehmandoust wrote:
I'm trying to write a daemon that periodically wakes up and runs map/reduce
jobs, but I've had little luck. I've tried different approaches (including
Cascading) and I keep arriving at the exception below:
java.lang.OutOfMemoryError: Java heap space
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.<init>(MapTask.java:359)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:185)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:157)
java.io.IOException: Job failed!
    at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1062)
    at com.txing.mapred.watcher.DirWatcherImpl2.runMapReduce(DirWatcherImpl2.java:29)
    at com.txing.mapred.watcher.DirWatcher.run(DirWatcher.java:52)
    at java.lang.Thread.run(Thread.java:637)
I've set the mapred.child.java.opts property to larger and larger heap
sizes, but it makes no difference.
Setting this property has no effect with LocalJobRunner, because
LocalJobRunner doesn't spawn a child JVM for the tasks. The whole job
is run by a single thread inside the JVM that submitted it.
LocalJobRunner is mostly used for debugging and testing.
Thanks
Amareshwari
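To make the point above concrete: with the local runner, the heap that matters is that of the daemon's own JVM (the -Xmx given to the java command that starts it), not mapred.child.java.opts. The stack trace fails in the MapOutputBuffer constructor, which as far as I know is where the map-side sort buffer sized by io.sort.mb (100 MB by default in releases of this era) is allocated. The following is only a rough sketch using the standard Runtime API. LocalHeapCheck, bufferFits and the 100 MB figure are illustrative assumptions, not part of Hadoop.

```java
// Hypothetical helper, not a Hadoop class: reports how much heap the
// current JVM can still allocate and compares it with an assumed
// io.sort.mb value.
final class LocalHeapCheck {

    /** Bytes this JVM may still allocate: max heap minus heap currently in use. */
    static long availableHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        long used = rt.totalMemory() - rt.freeMemory();
        return rt.maxMemory() - used;
    }

    /**
     * True if a buffer of sortMb megabytes is no larger than availableBytes.
     * This is only a lower bound; a real job needs heap for much more
     * than the sort buffer.
     */
    static boolean bufferFits(int sortMb, long availableBytes) {
        long needed = (long) sortMb << 20; // megabytes to bytes
        return needed <= availableBytes;
    }

    public static void main(String[] args) {
        int sortMb = 100; // assumed io.sort.mb value, adjust to your configuration
        long available = availableHeapBytes();
        System.out.println("Available heap (MB): " + (available >> 20));
        System.out.println("Sort buffer of " + sortMb + " MB fits: "
                + bufferFits(sortMb, available));
    }
}
```

If the check fails, the two obvious options are to start the daemon with a larger -Xmx or to lower io.sort.mb in the job configuration.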
Furthermore, I get warnings like this:
WARN | No job jar file set. User classes may not be found. See JobConf(Class) or JobConf#setJar(String). | JobClient.java:637 | org.apache.hadoop.mapred.JobClient | Thread-0 |
WARN | job_local_1 | LocalJobRunner.java:234 | org.apache.hadoop.mapred.LocalJobRunner | Thread-15 |
Do I have to submit jar files to Hadoop? Can't I daemonize this?
Thanks,
Shahab
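On the daemon question, a long-running process that submits a job on a timer can be sketched with the standard ScheduledExecutorService. This is a minimal illustration and not the poster's DirWatcher code. PeriodicJobDaemon and its methods are hypothetical names, and the Callable stands in for whatever actually calls JobClient.runJob. Every run is wrapped in a catch of Throwable, because a scheduled task that throws has its later executions suppressed, so an OutOfMemoryError or "Job failed!" would otherwise silently stop the daemon.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of a daemon that runs a job periodically.
final class PeriodicJobDaemon {
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();
    private final AtomicInteger failures = new AtomicInteger();

    /** Runs the job immediately, then again after each delay. */
    void start(Callable<?> job, long delay, TimeUnit unit) {
        scheduler.scheduleWithFixedDelay(() -> {
            try {
                job.call(); // e.g. a wrapper around JobClient.runJob(conf)
            } catch (Throwable t) {
                // Swallow and count the failure so the next run still happens.
                failures.incrementAndGet();
                System.err.println("Job run failed: " + t);
            }
        }, 0, delay, unit);
    }

    /** Number of runs that ended with an exception or error. */
    int failureCount() {
        return failures.get();
    }

    /** Stops scheduling new runs and waits briefly for a running job to end. */
    void stop() throws InterruptedException {
        scheduler.shutdown();
        scheduler.awaitTermination(10, TimeUnit.SECONDS);
    }
}
```

As for the "No job jar file set" warning, the message itself names the remedy: build the configuration with JobConf(Class) or call JobConf#setJar(String). With the local runner the user classes are loaded from the daemon's own classpath, so the warning matters mainly once the job is submitted to a real cluster.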