Re: [proposal] Module extension: hooks

Niklas Nielsen Tue, 25 Nov 2014 15:37:30 -0800

On 22 November 2014 at 11:02, Alex Rukletsov <a...@mesosphere.io> wrote:


> I like the idea of a cluster event stream very much! With this feature
> implemented, we will be able to gather cluster health status, various
> statistics for on-the-fly or offline analysis. AFAIK, the only way to
> gather some stats now is to parse master and slave logs. As it looks to
> me, the event stream will make Mesos more friendly to its users: framework
> writers, SREs and so on.
>
> Though an event stream may use hooks internally, I would still distinguish
> between these features. While an event stream is a read-only interface with
> performance overhead close to 0, hooks can modify cluster state (e.g.
> passing TaskInfo messages) and may significantly impact performance or add
> to master / slave complexity, as Ben pointed out in (B).
>
> > (B) I assume this also means that there is a side-effect inducing
> > "action" that is performed, in addition to the transformation. I
> > wouldn't be able to do any expensive or asynchronous work through
> > these, unless we made them return Futures. At which point, we would
> > need some additional semantics (e.g. ordering), and we'd be adding
> > complexity to the Master.
>
> I would propose to make hooks synchronous in order to keep the code simple.
> To protect against performance issues introduced by heavy hooks, we can
> enable/disable hooks at compile time (similar to assert), i.e. if the
> code is compiled without -DENABLE_HOOKS, no code related to hook support
> will land in the object files and modules with a hook payload will not be
> loaded. Thus we delegate responsibility for heavy hooks to the Mesos users
> who use them for their workflows.
>

I agree on running hooks inline. If you don't have any hook modules loaded
or selected, they won't introduce any significant overhead beyond the call
to the hook manager (which in turn figures out that there aren't any
applicable hooks to run). So I am not sure I understand "heavy" hooks in
that context.
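
To make the overhead point concrete, here is a rough sketch of an inline hook
manager. It is not the actual patch: TaskInfo is a stand-in struct rather than
the real protobuf, and the Hook, HookManager and LabelHook shapes are
assumptions made up for illustration only.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the TaskInfo protobuf; only labels are modeled.
struct TaskInfo {
  std::string name;
  std::vector<std::string> labels;
};

// Hook interface, reduced to a single entry point for brevity.
class Hook {
public:
  virtual ~Hook() = default;
  virtual TaskInfo preMasterLaunchTask(const TaskInfo& task) = 0;
};

// Hypothetical manager: runs every registered hook inline, in order.
class HookManager {
public:
  void add(std::unique_ptr<Hook> hook) { hooks_.push_back(std::move(hook)); }

  TaskInfo preMasterLaunchTask(TaskInfo task) const {
    // With no hooks loaded the loop body never runs, so the cost is one
    // function call plus handing the task back to the caller.
    for (const auto& hook : hooks_) {
      task = hook->preMasterLaunchTask(task);
    }
    return task;
  }

private:
  std::vector<std::unique_ptr<Hook>> hooks_;
};

// Example hook that attaches a label to the task.
class LabelHook : public Hook {
public:
  TaskInfo preMasterLaunchTask(const TaskInfo& task) override {
    TaskInfo result = task;
    result.labels.push_back("audited");
    return result;
  }
};
```

With an empty manager the task comes back unchanged. Once a LabelHook is
added, the same call returns the task with one extra label, so any cost is
paid only by deployments that actually load hooks.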


>
>
> On Sat, Nov 22, 2014 at 1:56 AM, Niklas Nielsen <nik...@mesosphere.io>
> wrote:
>
> > First off, thanks for all the comments!
> > I really appreciate it and am excited about where we get with this
> > effort.
> > Let me see if I can answer your questions (best-effort inlined).
> >
> > On 21 November 2014 01:11, Tom Arnfeld <t...@duedil.com> wrote:
> >
> > > This all sounds really great, and opens up some interesting
> > > opportunities for automated service discovery (well, the announcement
> > > side) for a cluster, which is what we've been looking into for a while.
> > >
> > > Correct me if I'm wrong, but would it be possible to make use of the
> > > master log to achieve an event stream? I'm not entirely sure what's
> > > stored in the shared master transaction log, but I assume some state
> > > about tasks etc.? If there were to be a stream of events, it'd be great
> > > to support rewinding and replaying for some period of time to better
> > > allow for HA stream consumers.
> > >
> >
> > See comment below. But yes - service discovery systems could definitely
> > leverage hooks.
> >
> >
> > >
> > >
> > >
> > >
> > > Either way, hooks would be a welcomed feature for us!
> > >
> > >
> > > --
> > >
> > >
> > > Tom Arnfeld
> > >
> > > Developer // DueDil
> > >
> > >
> > >
> > >
> > >
> > > (+44) 7525940046
> > >
> > > 25 Christopher Street, London, EC2A 2BS
> > >
> > > On Fri, Nov 21, 2014 at 6:44 AM, Vinod Kone <vinodk...@gmail.com>
> wrote:
> > >
> > > > Good points Ben.
> > > > Also, I've been recently thinking about an events endpoint (not to be
> > > > confused with the Event/Call API) that could stream all kinds of
> > > > events happening in the cluster (master events, allocator events, gc
> > > > events, slave events, containerizer events etc). In fact this could
> > > > probably be exposed by libprocess very easily. I was mainly thinking
> > > > about this in terms of auditing. Having such an endpoint would allow
> > > > external tooling to "hook" into that endpoint and consume the event
> > > > stream. The tooling could then perform arbitrary actions *without
> > > > interfering* with mesos control flow. I think such an architecture
> > > > would be powerful because it is generic and non-invasive. Have you
> > > > considered that approach?
> > >
> >
> > Ben, Vinod: A cluster event stream sounds like an awesome idea!
> > I have previously hacked together post-mortem log analysis to determine
> > workload profiles. That could be done online (!)
> > That aside, our use-case involves hanging meta-data off the task with
> > labels which we cannot do with an event stream alone.
> > The metadata we need is produced by a 3rd party security infrastructure
> > which we invoke and use when setting up the executor environment in the
> > slave.
> > We actually only need the pre hook / filter mechanism to do this, but
> > wanted to come up with a generalized solution.
> >
> > In my mind, the ideas of hooks and event streams are not mutually
> > exclusive.
> > The event stream could use all the insertion points of hooks (and
> > vice-versa).
> >
> >
> >
> > > > On Thu, Nov 20, 2014 at 10:24 PM, Benjamin Mahler <
> > > benjamin.mah...@gmail.com
> > > >> wrote:
> > > >> Thanks for sending this Nik!
> > > >>
> > > >> The general idea of hooks sounds good. I think the question for
> > > >> hooks is about which extensibility points make sense, and I think
> > > >> we'll have to assess that with the introduction of each hook.
> > > >>
> > > >> (1) Is the idea behind hooks about actions, as you initially
> > > >> mentioned? Or is it about data transformation, which is what is
> > > >> shown in the API example? Or both?
> > >
> >
> > Both.
> >
> > To Tom's point: service discovery systems with hooks could both 1) be
> > notified when tasks are launched in a push-like fashion and 2) read from
> > and alter the task info (for example with labels).
> >
> > We wanted to aim for flexibility. Similar to web server hooks, they can
> > purposely change the behavior of request handling.
> > If it cannot interact with or influence the task sequence, it isn't a hook
> > but rather a probe (similar to DTrace probes).
> >
> >
> > > >>
> > > >> (2) Is external tooling meant to describe hooks? Or is it meant to
> > > >> describe external tools that can leverage the hooks? This part is a
> > > >> bit fuzzy to me.
> > > >>
> > >
> >
> > Hooks are defined by us and implementations can be provided by module
> > writers.
> >
> > This is similar to DTrace probes, where kernel developers chose
> > interesting insertion points - some specific, others generic (where
> > filters can be applied).
> >
> >
> >
> > > >> (3) Is instrumentation meant to allow us to gain visibility into
> > > >> things like performance? If so, hooks might not be the most
> > > >> maintainable approach for that. Ideally we could add instrumentation
> > > >> into libprocess. Are there other forms of instrumentation in mind?
> > >
> >
> > Instrumentation in libprocess is one thing (being able to analyze
> > bandwidth/latency and message throughput/distribution - which would be
> > pretty awesome).
> > There should be plenty of non-libprocess code which gives insight into
> > the task/status update life-cycle.
> >
> > Hooks would allow local aggregation of high-frequency events where you
> > want to run user-defined code.
> >
> >
> > > >>
> > > >> Let's take the hook example you showed:
> > > >>
> > > >>  // Performs an action and/or transforms the TaskInfo.
> > > >>  virtual TaskInfo preMasterLaunchTask(const TaskInfo& task) = 0;
> > > >>  virtual TaskInfo postMasterLaunchTask(const TaskInfo& task) = 0;
> > > >>  virtual TaskInfo preSlaveLaunchTask(const TaskInfo& task) = 0;
> > > >>  virtual TaskInfo postSlaveLaunchTask(const TaskInfo& task) = 0;
> > > >>
> > > >> Comment mine. This interface suggests synchronous transformation of
> > > >> TaskInfo objects:
> > > >>
> > > >> (A) A transformation of TaskInfo seems a bit surprising to me, how
> > > >> can one do this generically? Is the idea that this would be
> > > >> customized per framework within the hook? How would one
> > > >> differentiate the frameworks? Via role? This part seems fuzzy to me.
> > >
> >
> > That was an oversimplified API. The arguments could/should match the
> > parameters passed to Master::launchTask(), for example. The hook runs in
> > the caller's thread and context, so we can share state with the calling
> > environment.
> > The return value could be a tuple with all incoming parameter types,
> > given that these usually are const.
> >
> >
> > >
> > >
> > >>
> > > >> (B) I assume this also means that there is a side-effect inducing
> > > >> "action" that is performed, in addition to the transformation. I
> > > >> wouldn't be able to do any expensive or asynchronous work through
> > > >> these, unless we made them return Futures. At which point, we would
> > > >> need some additional semantics (e.g. ordering), and we'd be adding
> > > >> complexity to the Master.
> > >
> >
> > Maybe only entry points, so that they are effectively "before" filters,
> > make sense (to avoid the complexity of post actions being executed in
> > arbitrary places and/or on scope exit, which could be one of many places
> > and hard to reason about).
> >
> >
> > > >>
> > > >> (C) What differentiates pre and post in this case? Sending the
> > > >> message? Let's consider that these are responsible for performing
> > > >> "actions". Then differentiating pre and post seems a bit arbitrary,
> > > >> since the sending of a message is asynchronous. This means that the
> > > >> "action" occurs after the message has been handed to libprocess, but
> > > >> not before it is sent to the socket, not before it is sent over the
> > > >> wire, not before it is received by the slave, etc. Seems like an odd
> > > >> distinction, no?
> > >
> >
> > See comment above.
> >
> >
> > > >>
> > > >> Looking forward to hearing more, thanks Nik!
> > > >>
> > > >> FYI I'm about to go on vacation, so I'm going to be slow at email. :)
> > > >>
> > > >> On Thu, Nov 20, 2014 at 10:07 AM, Dominic Hamon <
> > > dha...@twopensource.com>
> > > >> wrote:
> > > >>
> > > >> > Do you have specific use cases in mind? Ie, specific actions that
> > > >> > might take place pre and post launch?
> > > >> >
> > > >> > On Thu, Nov 20, 2014 at 9:37 AM, Niklas Nielsen <
> > nik...@mesosphere.io
> > > >
> > > >> > wrote:
> > > >> >
> > > >> > > Hi everyone,
> > > >> > >
> > > >> > >
> > > >> > > As a part of our current sprint at Mesosphere, we are striving to
> > > >> > > work on and land an extension to the modules subsystem which we
> > > >> > > (per https://issues.apache.org/jira/browse/MESOS-2060) have
> > > >> > > referred to as ‘hooks’. We wanted to give some background to this
> > > >> > > feature and will be asking for input to the proposal.
> > > >> > >
> > > >> > > The term is inspired by Apache Web Server hooks (
> > > >> > > http://httpd.apache.org/docs/2.2/developer/hooks.html) which allow
> > > >> > > modules to tie into the request processing life-cycle. It is
> > > >> > > different from the existing modules capability, in that the usual
> > > >> > > request processing remains untouched (and isn’t replaced fully as
> > > >> > > a regular module would do).
> > > >> > >
> > > >> > > In our case, we are interested in being able to tie into the
> > > >> > > life-cycle of tasks to run pre and post-actions during task launch
> > > >> > > in the master and slave processes. In general, it adds capability
> > > >> > > for all sorts of external tooling and instrumentation.
> > > >> > > The main idea is to enable modules to register themselves as hook
> > > >> > > providers. For example through a new flag:
> > > >> > > --hooks="module_name1, module_name2, ..."
> > > >> > >
> > > >> > > A new ‘HookManager’ will query each module and get an object back
> > > >> > > of type ‘Hooks’ which has virtual member functions which point to
> > > >> > > the desired callbacks in the module.
> > > >> > >
> > > >> > >
> > > >> > > For example,
> > > >> > >
> > > >> > > class Hooks {
> > > >> > > public:
> > > >> > >  virtual TaskInfo preMasterLaunchTask(TaskInfo task) = 0;
> > > >> > >  virtual TaskInfo postMasterLaunchTask(TaskInfo task) = 0;
> > > >> > >  virtual TaskInfo preSlaveLaunchTask(TaskInfo task) = 0;
> > > >> > >  virtual TaskInfo postSlaveLaunchTask(TaskInfo task) = 0;
> > > >> > >  // ...
> > > >> > > };
> > > >> > >
> > > >> > > An example of the call site in Mesos could be:
> > > >> > >
> > > >> > > Master::launchTask(..., TaskInfo task, ...)
> > > >> > > {
> > > >> > >  task = HookManager::preMasterLaunchTask(task);
> > > >> > >  ...
> > > >> > >  task = HookManager::postMasterLaunchTask(task);
> > > >> > > }
> > > >> > >
> > > >> > > We are not tied at all to how the hooks will be named, where they
> > > >> > > live (they could potentially live in Master, Slave, Allocator, ...
> > > >> > > subclasses), whether they return Try, Option, or Result to
> > > >> > > indicate failure, and so on.
> > > >> > >
> > > >> > >
> > > >> > >
> > > >> > > Introducing the hook functionality is similar to what we’ve done
> > > >> > > in the past with Isolators for the MesosContainerizer that enables
> > > >> > > people to provide new functionality for launching containers. In
> > > >> > > that same way, we want people to be able to provide new
> > > >> > > functionality with respect to launching tasks without changing the
> > > >> > > existing task flow.
> > > >> > >
> > > >> > >
> > > >> > > We’d love to get people’s feedback so we can move forward!
> > > >> > >
> > > >> > >
> > > >> > > Thanks,
> > > >> > > Niklas
> > > >> > >
> > > >> >
> > > >> >
> > > >> >
> > > >> > --
> > > >> > Dominic Hamon | @mrdo | Twitter
> > > >> > *There are no bad ideas; only good ideas that go horribly wrong.*
> > > >> >
> > > >>
> > >
> >
> > Let's keep the discussion going :-)
> >
>
