On 10.5.2012 15:29, Robert Evans wrote:
> Yes, adding more resources to the scheduling request would be the
> ideal solution to the problem. But sadly that is not a trivial change.
Best would be to have custom resources.
Example: in the node config, define that this node has:
1 x GPU
1 x Gbit network [...]
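For context: at this point YARN schedules memory only, so the above is a
proposal. Much later, the Hadoop 3.x line did add pluggable resource types
in roughly this spirit; a sketch in that style, with illustrative resource
names (property names follow the later resource-types feature, shown here in
shorthand rather than full XML):

    # resource-types.xml (cluster-wide): declare the extra resource types
    yarn.resource-types = gpu,gbit-net

    # on the node itself: what this node actually has
    yarn.nodemanager.resource-type.gpu = 1
    yarn.nodemanager.resource-type.gbit-net = 1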
On May 10, 2012, at 1:27 PM, Arun wrote:
For terasort you want to fill up your entire cluster with maps/reduces as fast
as you can to get the best performance.
Just play with #slots.
Arun
On May 9, 2012, at 12:36 PM, Jeffrey Buell wrote:
> Not to speak for Radim, but what I'm trying to achieve is performance at
> least as good as 0.20 for all cases.
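For context: in 0.20/MRv1 the slot counts Arun refers to are fixed per
tasktracker, set in mapred-site.xml on each node (values illustrative):

    mapred.tasktracker.map.tasks.maximum = 8
    mapred.tasktracker.reduce.tasks.maximum = 4

Raising or lowering these changes how many map/reduce tasks a node runs
concurrently, which is exactly the knob terasort tuning plays with.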
Yes, adding more resources to the scheduling request would be the ideal
solution to the problem. But sadly that is not a trivial change. The initial
solution I suggested is an ugly hack, and will not work for the cases you have
suggested. If you feel that this is important work, please feel [...]
On 05/10/2012 07:56 AM, Radim Kolar wrote:
> We've been against these 'features' since they lead to very bad
> behaviour across the cluster with multiple apps/users etc.
It's not a new feature, it's an extension of the existing resource
scheduling, which works well enough only for RAM. There are two other
resources - CPU cores and network IO - which need [...]
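For what it's worth, CPU did later join memory in the YARN resource model;
a minimal sketch against the later Hadoop 2 yarn-client API, where a
container request carries memory plus virtual cores (network IO, as noted
above, still has no first-class representation):

    import org.apache.hadoop.yarn.api.records.Priority;
    import org.apache.hadoop.yarn.api.records.Resource;
    import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest;

    public class MultiResourceRequest {
        public static ContainerRequest twoGigFourCores() {
            // 2048 MB of memory plus 4 virtual cores; network bandwidth
            // cannot be expressed in this resource model.
            Resource capability = Resource.newInstance(2048, 4);
            return new ContainerRequest(capability,
                null /* any node */, null /* any rack */,
                Priority.newInstance(0));
        }
    }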
Not to speak for Radim, but what I'm trying to achieve is performance at least
as good as 0.20 for all cases. That is, no regressions. For something as
simple as terasort, I don't think that is possible without being able to
specify the max number of mappers/reducers per node. As it is, I see [...]
We've been against these 'features' since they lead to very bad behaviour
across the cluster with multiple apps/users etc.
What is your use-case, i.e. what are you trying to achieve with this?
thanks,
Arun
On May 3, 2012, at 5:59 AM, Radim Kolar wrote:
> if plugin system for AM is overkill, something simpler can be made like: [...]
Either plugins or configuration options would be possible to do. The real
question is what is the use case for this, and is that use case big enough to
warrant it being part of core Hadoop? I have seen a few situations where this
perhaps makes sense, but most of those are because the resource [...]
if plugin system for AM is overkill, something simpler can be made like:
maximum number of mappers per node
maximum number of reducers per node
maximum percentage of non data local tasks
maximum percentage of rack local tasks
and set this in job properties.
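If such knobs existed, they would presumably be set like any other job
property; a sketch with made-up property names (none of these exist in stock
Hadoop - they only mirror the wish list above):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.mapreduce.Job;

    public class PerNodeLimitsExample {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // Hypothetical properties -- not real Hadoop knobs.
            conf.setInt("mapreduce.job.max.maps.per.node", 1);
            conf.setInt("mapreduce.job.max.reduces.per.node", 2);
            conf.setFloat("mapreduce.job.max.nonlocal.task.percent", 10f);
            conf.setFloat("mapreduce.job.max.racklocal.task.percent", 30f);
            Job job = Job.getInstance(conf, "per-node-limits-demo");
        }
    }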
On 27.4.2012 17:30, Robert Evans wrote:
> Radim,
> You would want to modify the application master for this, and it is
> likely to be a bit of a hack because the RM scheduler itself is not
> really designed for this.
What about doing something like this:
in the job JAR there will be a loadable plugin [...]
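A sketch of the sort of hook that could mean - an entirely hypothetical
interface, nothing like it exists in the MapReduce AM:

    // Hypothetical plugin the MR ApplicationMaster could load from the job
    // JAR to veto task placement. Illustrative only -- not a real Hadoop API.
    public interface ContainerPlacementPolicy {
        /** May another map task of this job be placed on the given host? */
        boolean allowMapOnHost(String host, int mapsAlreadyOnHost);
        /** May another reduce task of this job be placed on the given host? */
        boolean allowReduceOnHost(String host, int reducesAlreadyOnHost);
    }

    // Example policy: at most one mapper per node, reducers unconstrained.
    class OneMapperPerNode implements ContainerPlacementPolicy {
        public boolean allowMapOnHost(String host, int mapsAlreadyOnHost) {
            return mapsAlreadyOnHost == 0;
        }
        public boolean allowReduceOnHost(String host, int reducesAlreadyOnHost) {
            return true;
        }
    }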
Radim,
You would want to modify the application master for this, and it is likely to
be a bit of a hack because the RM scheduler itself is not really designed for
this. You can have the AM return containers to the RM that it does not want
(which it does already in some cases), because they are [...]
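A minimal sketch of that give-back hack, assuming the later Hadoop 2
AMRMClient API (the 0.23-era AM speaks the RM protocol directly, but the
idea is the same); launchMapTask is a hypothetical helper:

    import java.util.HashSet;
    import java.util.List;
    import java.util.Set;
    import org.apache.hadoop.yarn.api.records.Container;
    import org.apache.hadoop.yarn.client.api.AMRMClient;

    public class OneMapperPerNodeHack {
        private final AMRMClient<AMRMClient.ContainerRequest> amRmClient;
        // Hosts that already run one of this job's mappers.
        private final Set<String> hostsWithMapper = new HashSet<String>();

        public OneMapperPerNodeHack(AMRMClient<AMRMClient.ContainerRequest> client) {
            this.amRmClient = client;
        }

        /** Called with each batch of containers the RM has allocated to us. */
        public void onContainersAllocated(List<Container> containers) {
            for (Container c : containers) {
                String host = c.getNodeId().getHost();
                if (hostsWithMapper.add(host)) {
                    launchMapTask(c);   // hypothetical launch helper
                } else {
                    // Host already has one of our mappers: hand it back.
                    amRmClient.releaseAssignedContainer(c.getId());
                }
            }
        }

        private void launchMapTask(Container c) { /* ... */ }
    }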
> Do you want it so that the node can run only 1 mapper at a time but
> the node can also run multiple reducers at the same time?
I currently want it per-application. A node can run more mappers/reducers
from other jobs at the same time.
> Do you want it so that this particular application will only use [...]
I do not believe that there is currently a way to do this.
Do you want it so that the node can run only 1 mapper at a time but the node
can also run multiple reducers at the same time? Do you want it so that this
particular application will only use that node for a single mapper, but other
applications [...]
Is it possible to configure in YARN a scheduling requirement to not run more
than 1 mapper per node?
It's not because of memory requirements. Who decides this kind of
scheduling - the ResourceManager or the ApplicationMaster?