>> my task logs I see the message:
>> "attempt to override final parameter: mapred.child.ulimit; Ignoring."
>> which doesn't exactly inspire confidence that I'm on the right path.
>
> Chances are the param has been marked final in the task tracker's running
> config, which will prevent you from overriding the value with a
> job-specific configuration.

Do you have any idea how one unmarks such a thing? Do I just need to edit
the configuration file for the task tracker?
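[Editor's note: a sketch of what the finalized property likely looks like in the task tracker's config file, assuming a pre-0.21 Hadoop layout where this lives in mapred-site.xml (older releases use hadoop-site.xml); the value shown is only an example:]

```xml
<!-- mapred-site.xml on each TaskTracker node (filename is an assumption;
     older Hadoop releases use hadoop-site.xml instead) -->
<property>
  <name>mapred.child.ulimit</name>
  <!-- virtual memory limit for child task processes, in kilobytes;
       example value, not a recommendation -->
  <value>2097152</value>
  <!-- this is what produces "attempt to override final parameter ...
       Ignoring." — delete this element (or set it to false) to allow
       per-job overrides, then restart the TaskTracker -->
  <final>true</final>
</property>
```

Once the `final` marker is removed and the TaskTracker restarted, a job should be able to override the value per-job (e.g. streaming jobs can pass it on the command line with `-jobconf`, or `-D` on later releases).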
> Depending upon how many tasks per node, that may not be enough. Streaming
> jobs eat a crapton (I'm pretty sure that is an SI unit) of memory.

Is there any particular reason for the excessive memory use? I realize this
is Java, but it's just sloshing data down to my processes...

> If you are hitting 2gb+, that means you can probably run 3 tasks max
> without swapping. [Don't forget to count the size of the task tracker JVM,
> the streaming.jar JVM, etc., and be cognizant of the fact that JVM mem
> size != Java heap size.]

I'm seeing the failures even when I run a single job. But, obviously, I
don't want to schedule more than 3 tasks on a node since they won't have
enough memory. How does one change the number of map slots per node? I'm a
Hadoop configuration newbie (which is why I was originally excited about
the Cloudera EC2 scripts...)

-Chris
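[Editor's note: for the "map slots per node" question, the relevant knobs on Hadoop of this era are the per-TaskTracker slot maximums, set in the same config file and picked up on TaskTracker restart; the values below are illustrative only:]

```xml
<!-- mapred-site.xml (or hadoop-site.xml on older releases) on each
     TaskTracker node; requires a TaskTracker restart to take effect -->
<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <!-- at most 3 concurrent map tasks on this node (example value,
       chosen to match the ~3-tasks-without-swapping estimate above) -->
  <value>3</value>
</property>
<property>
  <name>mapred.tasktracker.reduce.tasks.maximum</name>
  <!-- reduce slots are budgeted separately; example value -->
  <value>1</value>
</property>
```

Note that these are node-level settings read by the TaskTracker at startup, not job-level parameters, so they cannot be changed per-job the way `mapred.child.ulimit` can.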