Perhaps rather than just bolting on "cgroup support", we could instead open
a dialogue about having Mesos support be a core feature of Storm.

The current integration is a bit unwieldy & hackish, arising
from the conflicting natures of Mesos and Storm w.r.t. scheduling of
resources.  That is, Storm assumes you have existing "slots" for running
workers on, whereas Mesos is more dynamic, requiring frameworks that run on
top of it to tell Mesos just how many resources (CPUs, memory, etc.) are
needed by the framework's tasks.

One example of an issue with Storm-on-Mesos:  the Storm logviewer is
completely busted when you are using Mesos.  I filed a ticket with a
description of the issue and proposed modifications to allow it to function:

   - https://issues.apache.org/jira/browse/STORM-1342

Furthermore, there are fundamental behaviors in Storm that don't mesh well
with Mesos:

   - the interfaces of INimbus (allSlotsAvailableForScheduling(),
   assignSlots(), getForcedScheduler(), etc.) make it difficult to create an
   ideal Mesos integration framework, since they don't allow the Mesos
   integration code to *really* know what's going on from the Nimbus's
   perspective. e.g.,
      - knowing which topologies & how many workers need to be scheduled at
      any given moment.
      - since the integration code cannot know what is actually needed to
      be run when it receives offers from Mesos, it just hoards those offers,
      leading to resource starvation in the Mesos cluster.
   - the "fallback" behavior of allowing the topology to settle for having
   fewer worker processes than requested should be disable-able.  For carefully
   tuned topologies it is quite bad to run on fewer than the expected number of
   worker processes.
      - also, this behavior endangers the idea of having the Mesos
      integration code *only* hoard Mesos offers after a successful round-trip
      through the allSlotsAvailableForScheduling() polling calls (i.e., only
      hoard when we know there are pending topologies).  It's dangerous because
      while we wait for another call to allSlotsAvailableForScheduling(), the
      Nimbus may have decided that it's okie dokie to use fewer than the
      requested number of worker processes.
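
To make the "only hoard when we know there are pending topologies" idea a bit
more concrete, here is a rough Python sketch of the policy.  It is not the
actual storm-mesos code and does not use the real Mesos or Storm APIs: Offer,
OfferPolicy, on_polling_round_trip() and on_offer() are all hypothetical names
made up for illustration, and the "workers needed" count is assumed to be
something the Nimbus could expose to the integration code.

```python
from dataclasses import dataclass, field


@dataclass
class Offer:
    """Hypothetical stand-in for a Mesos resource offer."""
    offer_id: str
    cpus: float
    mem_mb: int


@dataclass
class OfferPolicy:
    """Toy policy: hold offers only while topologies are known to be pending."""
    pending_workers: int = 0  # assumed to be reported by the Nimbus
    held: list = field(default_factory=list)
    declined: list = field(default_factory=list)

    def on_polling_round_trip(self, workers_needed: int) -> None:
        # Hypothetical hook: called after a scheduling poll tells us how many
        # workers still need slots. Zero means nothing is pending.
        self.pending_workers = workers_needed
        if workers_needed == 0:
            # Nothing to schedule, so hand every held offer back.
            self.declined.extend(self.held)
            self.held.clear()

    def on_offer(self, offer: Offer) -> None:
        # Keep the offer only if work is known to be waiting; otherwise
        # decline it right away so other frameworks are not starved.
        if self.pending_workers > 0:
            self.held.append(offer)
        else:
            self.declined.append(offer)


policy = OfferPolicy()
policy.on_offer(Offer("o1", cpus=4.0, mem_mb=8192))  # nothing pending: declined
policy.on_polling_round_trip(workers_needed=2)
policy.on_offer(Offer("o2", cpus=4.0, mem_mb=8192))  # work pending: held
policy.on_polling_round_trip(workers_needed=0)       # o2 is released again
```

The weak spot is exactly the one described above: between two polls, the
Nimbus may settle for fewer workers, so pending_workers can be stale while
offers are still being held.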

I'm sure there are other issues that I can conjure up, but those are the
major ones that came to mind instantly.  I'm happy to explain more about
this, since I realize the above bulleted info may lack context.

I wish I knew something about how Twitter's new Heron project addresses the
concerns above since it comes with Mesos support out-of-the-box, but it's
unclear at this point what they're doing until they open source it.

Thanks!

- Erik

On Wed, Jan 13, 2016 at 6:27 PM, 刘键(Basti Liu) <[email protected]>
wrote:

> Hi Bobby & Jerry,
>
> Yes, JStorm implements generic cgroup support. But only CPU control
> is enabled when starting a worker.
>
> Regards
> Basti
>
> -----Original Message-----
> From: Bobby Evans [mailto:[email protected]]
> Sent: Wednesday, January 13, 2016 11:14 PM
> To: [email protected]
> Subject: Re: JStorm CGroup
>
> Jerry,
> I think most of the code you are going to want to look at is here
> https://github.com/apache/storm/blob/jstorm-import/jstorm-core/src/main/java/com/alibaba/jstorm/daemon/supervisor/CgroupManager.java
> The back end for most of it seems to come from
>
>
> https://github.com/apache/storm/tree/jstorm-import/jstorm-core/src/main/java/com/alibaba/jstorm/container
>
> Which looks like it implements a somewhat generic cgroup support.
>  - Bobby
>
>     On Wednesday, January 13, 2016 1:34 AM, 刘键(Basti Liu) <
> [email protected]> wrote:
>
>
>  Hi Jerry,
>
> Currently, JStorm supports controlling the upper limit of CPU time for a
> worker via cpu.cfs_period_us & cpu.cfs_quota_us in the cgroup.
> e.g. cpu.cfs_period_us=100000, cpu.cfs_quota_us=3*100000. The cgroup will
> limit the corresponding process to at most 300% CPU (3 cores).
>
> Regards
> Basti
>
> -----Original Message-----
> From: Jerry Peng [mailto:[email protected]]
> Sent: Wednesday, January 13, 2016 1:57 PM
> To: [email protected]
> Subject: JStorm CGroup
>
> Hello everyone,
>
> This question is directed more towards the people that worked on JStorm.
> If I recall correctly JStorm offers some sort of resource isolation through
> CGroups.  What kind of support does JStorm offer for resource isolation?
> Can someone elaborate on this feature in JStorm?
>
> Best,
>
> Jerry
>
>
>
>
>
