Re: JStorm CGroup

Boyang(Jerry) Peng Tue, 19 Jan 2016 11:56:59 -0800

Hello Everyone,
Currently at Yahoo, we want to enable the Resource Aware Scheduler we built to 
have cgroup support. The CGroup code that is part of JStorm looks good and 
perhaps we can modify it slightly so that the Resource Aware Scheduler can 
interact with it. What I would like to do is modify the CGroup code that 
already exists in JStorm so that it can start JVM workers that are limited to 
the amount of resources the Resource Aware Scheduler has allocated for each 
worker, and then move it to Storm. I would like to have a discussion (especially 
with people that worked on JStorm) about how we can integrate support for the 
Resource Aware Scheduler into the existing CGroups code. Also, I know the folks 
at Alibaba are working on converting supervisor.clj to Java, which is tied to 
launching workers and in the future would include CGroups. What is the status of 
that?
Best,
Boyang Jerry Peng

    On Thursday, January 14, 2016 9:25 AM, Bobby Evans 
<[email protected]> wrote:

 I would love to see true support for mesos, YARN, openstack, etc. added, but I 
also see stand alone mode offering a lot more flexibility, especially in the 
area of scheduling, than a two level scheduler can currently offer.  It is on 
my roadmap to look into after the JStorm migration (just started), Resource 
Aware Scheduling (almost done; needs testing and better isolation), and adding 
in automatic elasticity around topology-specified SLAs (working with a few 
researchers around some prototypes in this area).

To be able to support running on other cluster technologies in a proper way we 
need to provide pluggability in a few different places.
First we need a way for a scheduler/cluster to request topology specific 
dedicated resources, and for nimbus to provision, manage, monitor, and ideally 
resize (for elasticity) those resources.  With security and resource aware 
scheduling, we need these external requests to be on a per-topology basis, not 
bolted on like they are now.  This would also necessitate the schedulers being 
updated so that they could take advantage of these new APIs requesting external 
resources either when a topology explicitly asks to be on a given external 
resource, or optionally when dedicated resources are no longer available and 
the topology has specified the proper configurations/credentials to allow it to 
run using those external resources.

That handles scheduling, but there are some additional features that storm 
offers which other systems don't yet offer, and many never will.  For example 
the storm blob store API is similar to the dist cache in YARN, but with it we 
can do in-place replacement without relaunching.  We also favor fast fail, and I 
don't think all of these types of clusters will, nor should, offer the process 
monitoring and re-spawning needed for it.  As such we would need some sort of a 
supervisor that would also run under YARN/mesos, etc. to provide this extra 
functionality.  I have not totally thought about all of what it would need from 
a pluggability standpoint to make that work.  There is also the logviewer, which 
does more than just logs, so we would need some pluggable way to be able to 
point people to where their logs/artifacts are, and to monitor the resource 
usage of the logs (perhaps that part should move off to the supervisor). All of 
that seems like a lot more work compared to providing a pluggable interface in 
the supervisor that would allow for it to provision, manage, monitor, and again 
possibly resize, local workers.  In fact I see a lot of potential overlap 
between the two of them and the pluggability that would be needed in the 
supervisor for running on mesos, YARN, etc.

- Bobby 

    On Thursday, January 14, 2016 12:39 AM, Erik Weathers 
<[email protected]> wrote:

 Perhaps rather than just bolting on "cgroup support", we could instead open
a dialogue about having Mesos support be a core feature of Storm.

The current integration is a bit unwieldy & hackish at the moment, arising
from the conflicting natures of Mesos and Storm w.r.t. scheduling of
resources.  i.e., Storm assumes you have existing "slots" for running
workers on, whereas Mesos is more dynamic, requiring frameworks that run on
top of it to tell Mesos just how many resources (CPUs, Memory, etc.) are
needed by the framework's tasks.

One example of an issue with Storm-on-Mesos:  the Storm logviewer is
completely busted when you are using Mesos.  I filed a ticket with a
description of the issue and proposed modifications to allow it to function:

  - https://issues.apache.org/jira/browse/STORM-1342

Furthermore, there are fundamental behaviors in Storm that don't mesh well
with Mesos:

  - the interfaces of INimbus (allSlotsAvailableForScheduling(),
  assignSlots(), getForcedScheduler(), etc.) make it difficult to create an
  ideal Mesos integration framework, since they don't allow the Mesos
  integration code to *really* know what's going on from the Nimbus's
  perspective. e.g.,
      - knowing which topologies & how many workers need to be scheduled at
      any given moment.
      - since the integration code cannot know what is actually needed to
      be run when it receives offers from Mesos, it just hoards those offers,
      leading to resource starvation in the Mesos cluster.
  - the "fallback" behavior of allowing the topology to settle for having
  fewer worker processes than requested should be disable-able.  For carefully
  tuned topologies it is quite bad to run on fewer than the expected number of
  worker processes.
      - also, this behavior endangers the idea of having the Mesos
      integration code *only* hoard Mesos offers after a successful round-trip
      through the allSlotsAvailableForScheduling() polling calls (i.e., only
      hoard when we know there are pending topologies).  It's dangerous because
      while we wait for another call to allSlotsAvailableForScheduling(), the
      Nimbus may have decided that it's okie dokie to use fewer than the
      requested number of worker processes.
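
The offer-hoarding concern in the list above can be illustrated with a minimal
sketch. It assumes the integration code could learn how many workers Nimbus
still needs to schedule, which is exactly the information the current INimbus
interfaces do not expose. All names here (Offer, OfferBuffer, update_pending,
on_offer) are hypothetical and are not part of the Mesos or Storm APIs.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Offer:
    """Hypothetical stand-in for a resource offer from the cluster manager."""
    offer_id: str
    cpus: float
    mem_mb: int


@dataclass
class OfferBuffer:
    """Holds offers only while topologies are known to be waiting for workers."""
    pending_workers: int = 0
    held: List[Offer] = field(default_factory=list)

    def update_pending(self, pending_workers: int) -> List[Offer]:
        """Record how many workers still need slots.

        Returns the offers that should be handed back to the cluster
        because nothing is waiting to be scheduled any more.
        """
        self.pending_workers = pending_workers
        if pending_workers == 0:
            released, self.held = self.held, []
            return released
        return []

    def on_offer(self, offer: Offer) -> bool:
        """Return True if the offer is held, False if it should be declined."""
        if self.pending_workers > 0:
            self.held.append(offer)
            return True
        return False
```

With no pending workers, every offer is declined immediately instead of being
hoarded. The race described in the last bullet still applies: if the pending
count is stale, offers may be held for workers that Nimbus has already decided
to do without.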

I'm sure there are other issues that I can conjure up, but those are the
major ones that came to mind instantly.  I'm happy to explain more about
this, since I realize the above bulleted info may lack context.

I wish I knew something about how Twitter's new Heron project addresses the
concerns above since it comes with Mesos support out-of-the-box, but it's
unclear at this point what they're doing until they open source it.

Thanks!

- Erik

On Wed, Jan 13, 2016 at 6:27 PM, 刘键(Basti Liu) <[email protected]>
wrote:

> Hi Bobby & Jerry,
>
> Yes, JStorm implements generic cgroup support, but only CPU control is
> enabled when starting a worker.
>
> Regards
> Basti
>
> -----Original Message-----
> From: Bobby Evans [mailto:[email protected]]
> Sent: Wednesday, January 13, 2016 11:14 PM
> To: [email protected]
> Subject: Re: JStorm CGroup
>
> Jerry,
> I think most of the code you are going to want to look at is here
> https://github.com/apache/storm/blob/jstorm-import/jstorm-core/src/main/java/com/alibaba/jstorm/daemon/supervisor/CgroupManager.java
> The back end for most of it seems to come from
>
>
> https://github.com/apache/storm/tree/jstorm-import/jstorm-core/src/main/java/com/alibaba/jstorm/container
>
> Which looks like it implements a somewhat generic cgroup support.
>  - Bobby
>
>    On Wednesday, January 13, 2016 1:34 AM, 刘键(Basti Liu) <
> [email protected]> wrote:
>
>
>  Hi Jerry,
>
> Currently, JStorm supports controlling the upper limit of CPU time for a
> worker via cpu.cfs_period_us & cpu.cfs_quota_us in cgroup.
> e.g. cpu.cfs_period_us=100000, cpu.cfs_quota_us=3*100000. Cgroup will
> limit the corresponding process to occupy at most 300% CPU (3 cores).
>
> Regards
> Basti
>
> -----Original Message-----
> From: Jerry Peng [mailto:[email protected]]
> Sent: Wednesday, January 13, 2016 1:57 PM
> To: [email protected]
> Subject: JStorm CGroup
>
> Hello everyone,
>
> This question is directed more towards the people that worked on JStorm.
> If I recall correctly, JStorm offers some sort of resource isolation through
> CGroups.  What kind of support does JStorm offer for resource isolation?
> Can someone elaborate on this feature in JStorm?
>
> Best,
>
> Jerry
>
>
>
>
>
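
The CPU limit described in the quoted thread (cpu.cfs_period_us and
cpu.cfs_quota_us) comes down to simple arithmetic: the quota is the number of
cores times the period. The sketch below only computes the values that would be
written to those two cgroup files; it does not touch the filesystem. The
function names are hypothetical and are not taken from the JStorm code.

```python
CFS_PERIOD_US = 100_000  # period used in the example from the thread


def cfs_quota_us(cores: float, period_us: int = CFS_PERIOD_US) -> int:
    """CPU time (microseconds per period) that caps a cgroup at `cores` cores."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    return int(round(cores * period_us))


def cpu_limit_percent(quota_us: int, period_us: int) -> float:
    """Express a quota/period pair as a percentage of a single core."""
    return 100.0 * quota_us / period_us


def cgroup_cpu_settings(cores: float, period_us: int = CFS_PERIOD_US) -> dict:
    """Map each cgroup CPU control file name to the value that would be written."""
    return {
        "cpu.cfs_period_us": str(period_us),
        "cpu.cfs_quota_us": str(cfs_quota_us(cores, period_us)),
    }
```

For the example in the thread, cgroup_cpu_settings(3) gives a period of 100000
and a quota of 300000, which corresponds to at most 300% CPU (3 cores).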
