Re: process limits for streaming jar
On Fri, Jun 27, 2008 at 08:57, Chris Anderson <[EMAIL PROTECTED]> wrote:

> The problem is that when there are a large number of map tasks to
> complete, Hadoop doesn't seem to obey the map.tasks.maximum. Instead,
> it is spawning 8 map tasks per tasktracker (even when I change the
> mapred.tasktracker.map.tasks.maximum in hadoop-site.xml to 2, on the
> master). The cluster was booted with the setting at 8. Do I need to
> change hadoop-site.xml on all the slaves, and restart the task
> trackers, in order to make the limit apply? That seems unlikely - I'd
> really like to manage this parameter on a per-job level.

Yes, mapred.tasktracker.map.tasks.maximum is configured per tasktracker on startup. It can't be configured per job because it's not a job-scope parameter (if there are multiple concurrent jobs, they have to share the task limit).

rick
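For reference, applying that limit cluster-wide would look something like the sketch below. It assumes a stock Hadoop tarball layout (bin/hadoop-daemons.sh, conf/slaves) and a placeholder install path, so treat it as an illustration rather than a recipe:

    # Assumption: the lowered mapred.tasktracker.map.tasks.maximum is already in
    # the master's conf/hadoop-site.xml; push it to every slave, then restart the
    # tasktrackers so they pick up the new limit at startup.
    for host in $(cat conf/slaves); do
      scp conf/hadoop-site.xml "$host":/path/to/hadoop/conf/
    done
    bin/hadoop-daemons.sh stop tasktracker    # runs the command on every slave
    bin/hadoop-daemons.sh start tasktracker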
Re: process limits for streaming jar
Having experimented some more, I've found that the simple solution is to limit the resource usage by limiting the # of map tasks and the memory they are allowed to consume. I'm specifying the constraints on the command line like this:

    -jobconf mapred.tasktracker.map.tasks.maximum=2 mapred.child.ulimit=1048576

The configuration parameters seem to take; in the job.xml available from the web console, I see these lines:

    mapred.child.ulimit                   1048576
    mapred.tasktracker.map.tasks.maximum  2

The problem is that when there are a large number of map tasks to complete, Hadoop doesn't seem to obey the map.tasks.maximum. Instead, it is spawning 8 map tasks per tasktracker (even when I change the mapred.tasktracker.map.tasks.maximum in hadoop-site.xml to 2, on the master). The cluster was booted with the setting at 8. Do I need to change hadoop-site.xml on all the slaves, and restart the task trackers, in order to make the limit apply? That seems unlikely - I'd really like to manage this parameter on a per-job level.

Thanks for any input!

Chris

--
Chris Anderson
http://jchris.mfdz.com
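For context, the whole submission would look roughly like the sketch below; the jar path, input/output paths, and mapper script are placeholders, and only the -jobconf values come from the mail (as the reply above points out, the tasks.maximum setting is not actually job-scope):

    # Hypothetical streaming invocation; adjust paths and script names to your setup.
    bin/hadoop jar contrib/streaming/hadoop-*-streaming.jar \
      -input  /logs/raw \
      -output /logs/parsed \
      -mapper parse.rb \
      -file   parse.rb \
      -jobconf mapred.child.ulimit=1048576 \
      -jobconf mapred.tasktracker.map.tasks.maximum=2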
Re: process limits for streaming jar
Allen Wittenauer wrote:
> This is essentially what we're doing via torque (and therefore hod).

As we intend to move away from HOD (and thus Torque) to the Hadoop resource manager (HADOOP-3421) and scheduler (HADOOP-3412) interfaces, we need to move this resource management functionality into Hadoop.

> As I said in HADOOP-3280, I'm still convinced that having Hadoop set
> these types of restrictions directly is the wrong approach. It is almost
> always going to be better to use the OS controls that are specific to
> the installation. Enabling OS-specific features means that, at a
> maximum, hadoop should likely be calling a script rather than doing
> ulimits or having the equivalent of #ifdef code everywhere. [After all,
> what if I want to use Solaris-specific features like projects or
> privileges?] But I suspect most of the tunables that people will care
> about can likely be managed at the OS level before hadoop is even
> involved.

Please see HADOOP-3581 (Prevent memory intensive user tasks from taking down nodes). That issue is aimed at a general solution for putting aggregate memory limits on tasks and on any subprocesses those tasks might launch, and your comments don't seem to be in line with what I proposed there.

Having Hadoop just call a script rather than doing ulimits itself does look nice, but with ulimits alone Hadoop cannot have complete control over what the tasks do. For example, as stated on the JIRA, runaway tasks that fork themselves repeatedly could wreak havoc, disturbing not only the Hadoop daemons but possibly bringing down the nodes themselves. The current solution, HADOOP-3280, doesn't prevent this: it only limits the memory usable by a single process, not by its subprocesses. Nor will ulimits via limits.conf suffice - as far as I've checked, they just limit the vmem usable per process, per user. Not to sound like I'm imitating Torque too much, but even Torque resorts to controlling the process tree itself rather than depending directly on OS-specific tools, with specific code for each platform.

We need a consensus on all of this, and I might be missing something too. Can you please comment on HADOOP-3581?

Thanks,
-Vinod
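To make the per-process vs. per-tree distinction concrete, the aggregate check HADOOP-3581 argues for amounts to something like the shell sketch below (the script, the default limit, and the kill at the end are all illustrative; the actual proposal implements this inside the tasktracker):

    #!/bin/sh
    # Hypothetical monitor: flag a task whose whole process tree exceeds a vmem
    # budget. A per-process ulimit never trips if each forked child individually
    # stays under the limit, which is exactly the gap described above.
    TASK_PID=$1
    LIMIT_KB=${2:-1048576}

    tree_pids() {                 # print $1 and all of its descendants
      echo "$1"
      for kid in $(ps -eo pid,ppid | awk -v p="$1" '$2 == p {print $1}'); do
        tree_pids "$kid"
      done
    }

    total_kb=0
    for pid in $(tree_pids "$TASK_PID"); do
      vsz=$(ps -o vsz= -p "$pid" 2>/dev/null)   # per-process virtual memory, kB
      total_kb=$((total_kb + ${vsz:-0}))
    done

    if [ "$total_kb" -gt "$LIMIT_KB" ]; then
      echo "process tree of $TASK_PID uses ${total_kb} kB, over ${LIMIT_KB} kB" >&2
      kill -TERM $(tree_pids "$TASK_PID")       # what to do here is part of the debate
    fi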
Re: process limits for streaming jar
On 6/26/08 10:14 AM, "Joydeep Sen Sarma" <[EMAIL PROTECTED]> wrote:

> However - in our environment - we spawn all streaming tasks through a
> wrapper program (with the wrapper being defaulted across all users). We
> can control resource uses from this wrapper (and also do some level of
> job control).

This is essentially what we're doing via torque (and therefore hod).

> This does require some minor code changes and would be happy to contrib
> - but the approach doesn't quite fit in well with the approach in 2765
> that passes each one of these resource limits as a hadoop config param.

As I said in HADOOP-3280, I'm still convinced that having Hadoop set these types of restrictions directly is the wrong approach. It is almost always going to be better to use the OS controls that are specific to the installation.

Enabling OS-specific features means that, at a maximum, hadoop should likely be calling a script rather than doing ulimits or having the equivalent of #ifdef code everywhere. [After all, what if I want to use Solaris-specific features like projects or privileges?]

But I suspect most of the tunables that people will care about can likely be managed at the OS level before hadoop is even involved.
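A rough illustration of the "call a script" idea: Hadoop would hand the real task command line to a site-owned launcher like the hypothetical one below, and a Solaris admin could replace the ulimits with projects/newtask without touching Hadoop itself (the name, the limits, and the nice level are all made up):

    #!/bin/bash
    # task-launch.sh <command> [args...]
    # Site-specific resource policy lives here, not in Hadoop config params.
    ulimit -v 1048576      # per-process virtual memory cap, in kB
    ulimit -u 64           # cap on the number of processes the task's user may have
    exec nice -n 10 "$@"   # hand off to the real task at lower priority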
RE: process limits for streaming jar
Memory limits were handled in this jira:

https://issues.apache.org/jira/browse/HADOOP-2765

However - in our environment - we spawn all streaming tasks through a wrapper program (with the wrapper being defaulted across all users). We can control resource uses from this wrapper (and also do some level of job control).

This does require some minor code changes and would be happy to contrib - but the approach doesn't quite fit in well with the approach in 2765 that passes each one of these resource limits as a hadoop config param.

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Chris Anderson
Sent: Wednesday, June 25, 2008 7:00 PM
To: core-user@hadoop.apache.org
Subject: process limits for streaming jar

Hi there,

I'm running some streaming jobs on ec2 (ruby parsing scripts) and in my most recent test I managed to spike the load on my large instances to 25 or so. As a result, I lost communication with one instance. I think I took down sshd. Whoops.

My question is, has anyone got strategies for managing resources used by the processes spawned by streaming jar? Ideally I'd like to run my ruby scripts under nice. I can hack something together with wrappers, but I'm thinking there might be a configuration option to handle this within Streaming jar.

Thanks for any suggestions!

--
Chris Anderson
http://jchris.mfdz.com
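Without code changes, the user-level version of that wrapper approach is a small script shipped with the job; a hypothetical sketch, combining a resource cap with the kind of crude job control mentioned above:

    #!/bin/bash
    # limits.sh - illustrative per-task wrapper: a memory cap plus a wall-clock
    # timeout. Numbers and names are placeholders.
    ulimit -v 1048576                     # bound each process's virtual memory (kB)
    nice -n 10 "$@" &                     # run the real streaming command
    task=$!
    for i in $(seq 3600); do              # poll for up to an hour
      kill -0 "$task" 2>/dev/null || break
      sleep 1
    done
    kill -0 "$task" 2>/dev/null && kill -TERM "$task"   # give up on runaway tasks
    wait "$task"
    exit $?

A job could then ship the wrapper and name it explicitly, along the lines of -file limits.sh -mapper 'limits.sh ruby parse.rb'; the code change described above is what would make such a wrapper the default for all users instead of opt-in.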
process limits for streaming jar
Hi there,

I'm running some streaming jobs on ec2 (ruby parsing scripts) and in my most recent test I managed to spike the load on my large instances to 25 or so. As a result, I lost communication with one instance. I think I took down sshd. Whoops.

My question is, has anyone got strategies for managing resources used by the processes spawned by streaming jar? Ideally I'd like to run my ruby scripts under nice. I can hack something together with wrappers, but I'm thinking there might be a configuration option to handle this within Streaming jar.

Thanks for any suggestions!

--
Chris Anderson
http://jchris.mfdz.com
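The wrapper hack mentioned here can be as small as a shipped one-liner; a sketch, with hypothetical file names, and with the caveat that how streaming tokenizes a multi-word -mapper may vary by version:

    #!/bin/sh
    # nice-ruby.sh - run the shipped ruby script at lower priority
    exec nice -n 10 ruby "$@"

used with something like -file nice-ruby.sh -file parse.rb -mapper 'nice-ruby.sh parse.rb'.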