I found the problem.  CPU scheduling is not enabled by default for the
capacity scheduler: its default resource calculator only considers memory.
The following needs to be set in capacity-scheduler.xml to enable it:

  <property>
    <name>yarn.scheduler.capacity.resource-calculator</name>
    <value>org.apache.hadoop.yarn.util.resource.DominantResourceCalculator</value>
  </property>

With the above set, containers can be allocated multiple vcores.  However,
cgroups should also be enabled on the NodeManagers.  Otherwise the vcore
allocation is not enforced, and CPU usage is best effort and unpredictable.
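
In case it helps anyone else debugging this, here is a rough Java sketch
that checks whether a capacity-scheduler.xml sets the property above.  It
is only an illustration: the class and method names are made up, it uses
the JDK XML parser rather than Hadoop's Configuration class, and it returns
the first matching property instead of applying Hadoop's override rules.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper; not part of Hadoop or Twill.
final class ResourceCalculatorCheck {
    static final String KEY = "yarn.scheduler.capacity.resource-calculator";
    static final String DOMINANT =
        "org.apache.hadoop.yarn.util.resource.DominantResourceCalculator";

    // Returns the trimmed value of the first property with the given name,
    // or null if the file does not define it.
    static String readProperty(InputStream xml, String name) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().parse(xml);
            NodeList props = doc.getElementsByTagName("property");
            for (int i = 0; i < props.getLength(); i++) {
                Element p = (Element) props.item(i);
                NodeList names = p.getElementsByTagName("name");
                NodeList values = p.getElementsByTagName("value");
                if (names.getLength() == 0 || values.getLength() == 0) {
                    continue; // skip malformed entries
                }
                if (name.equals(names.item(0).getTextContent().trim())) {
                    return values.item(0).getTextContent().trim();
                }
            }
            return null;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse configuration", e);
        }
    }

    // True only if the DominantResourceCalculator is explicitly configured.
    static boolean cpuSchedulingEnabled(String xmlText) {
        InputStream in =
            new ByteArrayInputStream(xmlText.getBytes(StandardCharsets.UTF_8));
        return DOMINANT.equals(readProperty(in, KEY));
    }
}
```

Passing the contents of capacity-scheduler.xml to cpuSchedulingEnabled()
returns false when the property is missing, which is the situation I was
in.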

I'm not sure if this already exists, but it would be nice to be able to
configure Twill to throw an exception or fail if an application does not
get all of its requested resources, or if you specify a number of cores or
an amount of memory and the configured scheduler does not support that
resource type.
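
To make the request concrete, here is a minimal Java sketch of the kind of
check I have in mind.  Everything in it is hypothetical (Capability,
InsufficientGrantException and verify() are names I made up, not Twill or
YARN APIs).  It assumes the granted container may be larger than the
request, so it only fails when something comes back smaller.

```java
// Hypothetical sketch; none of these types exist in Twill or YARN.
final class ResourceGrantCheck {

    // Minimal stand-in for a requested or granted container size.
    static final class Capability {
        final int memoryMb;
        final int vcores;

        Capability(int memoryMb, int vcores) {
            this.memoryMb = memoryMb;
            this.vcores = vcores;
        }

        @Override
        public String toString() {
            return "<memory:" + memoryMb + ", vCores:" + vcores + ">";
        }
    }

    // Raised when the granted container is smaller than the request.
    static final class InsufficientGrantException extends RuntimeException {
        InsufficientGrantException(String message) {
            super(message);
        }
    }

    // Fails fast if the granted container has fewer vcores or less memory
    // than was requested for the named runnable.
    static void verify(String runnable, Capability requested, Capability granted) {
        if (granted.vcores < requested.vcores
                || granted.memoryMb < requested.memoryMb) {
            throw new InsufficientGrantException(
                "Runnable " + runnable + " requested " + requested
                    + " but was granted " + granted);
        }
    }
}
```

With the numbers from my logs, verify("FluoWorker", new Capability(1024, 2),
new Capability(1024, 1)) would throw instead of silently starting the
runnable with 1 core.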


On Fri, May 1, 2015 at 12:09 PM Mike Walch <[email protected]> wrote:

> Hi Terence,
>
> I don't see anything out of the ordinary in the RM log.  The RM logs that
> it's returning a container with 1 core, but there are no errors or warnings.
>
> After doing a little more research, this may be caused by not having
> cgroups enabled in YARN.  I am working on enabling them now and will let
> everyone know if that fixes the issue.
>
> -Mike
>
> On Fri, May 1, 2015 at 11:45 AM Terence Yim <[email protected]> wrote:
>
>> Hi Mike,
>>
>> Does the RM log show any hints?
>>
>> Terence
>>
>> > On May 1, 2015, at 7:35 AM, Mike Walch <[email protected]> wrote:
>> >
>> > I am using the CapacityScheduler.  My Hadoop configuration files can be
>> > viewed at: https://github.com/fluo-io/fluo-dev/tree/master/conf/hadoop
>> >
>> > While this could be a configuration issue, I don't think it's a lack of
>> > resources as my ResourceManager has several vcores remaining after
>> > starting my YARN app.
>> >
>> >> On Thu, Apr 30, 2015 at 7:57 PM Poorna Chandra <[email protected]> wrote:
>> >>
>> >> Hi Mike,
>> >>
>> >> What YARN scheduler are you using?
>> >>
>> >> Poorna.
>> >>
>> >>
>> >>> On Thu, Apr 30, 2015 at 12:28 PM, Mike Walch <[email protected]> wrote:
>> >>>
>> >>> I am trying to start a Twill Runnable with 2 cores in YARN.  When I set
>> >>> the number of virtual cores to 2 in my ResourceSpecification, the
>> >>> container that is started in YARN ends up having only 1 core (according
>> >>> to its logs and the NodeManager).  It looks like Twill is asking for 2
>> >>> cores in the ApplicationMaster but YARN only returns a container with 1
>> >>> core.  Therefore, I am not sure if this is a Twill or YARN problem.  I am
>> >>> running Twill 0.5 and Hadoop 2.6.0.  In my yarn-site.xml, I am not
>> >>> changing any of the configuration for virtual cores from the default.
>> >>> Any ideas of what could be causing this?
>> >>>
>> >>> Below are the logs from my Twill ApplicationMaster:
>> >>>
>> >>> 17:56:00.610 [ApplicationMasterService] INFO
>> >>> o.a.t.i.a.ApplicationMasterService - Request 1 container with capability
>> >>> <memory:1024, vCores:2> for runnable FluoWorker
>> >>>
>> >>> 17:56:02.616 [ApplicationMasterService] INFO
>> >>> o.a.t.i.a.ApplicationMasterService - Starting runnable FluoWorker with
>> >>> RunnableProcessLauncher{container=org.apache.twill.internal.yarn.Hadoop21YarnContainerInfo@74cff77f}
>> >>>
>> >>> I added the logging below to my Twill 0.5 branch which shows that the
>> >>> container returned by YARN only has 1 core even though the request was
>> >>> for 2:
>> >>>
>> >>> 17:56:02.616 [ApplicationMasterService] INFO
>> >>> o.a.t.i.a.ApplicationMasterService -
>> >>> processLauncher.getContainerInfo().getVirtualCores() = 1
>> >>>
>> >>> 17:56:02.617 [ApplicationMasterService] INFO
>> >>> o.a.t.i.a.ApplicationMasterService -
>> >>> provisionRequest.getRuntimeSpec().getResourceSpecification().getVirtualCores() = 2
>> >>
>>
>
