+1

On Fri, Sep 5, 2014 at 2:25 PM, Kostas Tzoumas <ktzou...@apache.org> wrote:

> +1 for refactoring using Akka, the arguments are overwhelming.
>
>
> On Fri, Sep 5, 2014 at 2:04 PM, Robert Metzger <rmetz...@apache.org>
> wrote:
>
> > I agree with using Akka for RPC. It is ASF 2.0 licensed, seems to have a
> > big community [1] and users [2] that depend on the system.
> >
> > The YARN client is also using the old RPC service. I would like to
> rewrite
> > it with Akka once we have added it into the other parts of the system, to
> > learn it.
> >
> >
> > [1] https://github.com/akka/akka/pulls
> > [2]
> > http://doc.akka.io/docs/akka/2.0.4/additional/companies-using-akka.html
> >
> >
> >
> > On Fri, Sep 5, 2014 at 1:34 PM, Stephan Ewen <se...@apache.org> wrote:
> >
> > > This proposes to refactor the RPC service and the coordination between
> > > Client, JobManager, and TaskManager to use the Akka actor library.
> > >
> > > Even though Akka is written in Scala, it offers a Java interface and we
> > can
> > > use Akka completely from Java.
> > >
> > > Below are a list of arguments why this would help the system:
> > >
> > >
> > > Problems with the current RPC service:
> > > --------------------------------------------------------
> > >
> > >   - No asynchronous calls with callbacks. This is the reason why
> several
> > > parts of the runtime poll the status, introducing unnecessary latency.
> > >
> > >   - No exception forwarding (many exceptions are simply swallowed),
> > making
> > > debugging and operation in flaky environments very hard
> > >
> > >   - Limited number of handler threads. The RPC can only handle a fix
> > number
> > > of concurrent requests, forcing you to maintain separate thread pools
> to
> > > delegate actions to
> > >
> > >   - No support for primitive data types (or boxed primitives) as
> > arguments,
> > > everything has to be a specially serializable type
> > >
> > >   - Problematic threading model. The RPC continuously spawns and
> > terminates
> > > threads
> > >
> > >
> > >
> > > Benefits of switching to the Akka actor model:
> > >
> > >
> >
> -------------------------------------------------------------------------------
> > >
> > >   - Akka solves all of the above issues out of the box
> > >
> > >   - The supervisor model allows you to do failure detection of actors.
> > That
> > > provides a unified way of detecting and handling failures (missing
> > > heartbeats, failed calls, ...)
> > >
> > >   - Akka has tools to make stateful actors persistent and restart them
> on
> > > other machines in cases of failure. That would greatly help in
> > implementing
> > > "master fail-over", which will become important
> > >
> > >   - You can define many "call targets" (actors). Tasks (on
> taskmanagers)
> > > can directly call their ExecutionVertex on the JobManager, rather than
> > > calling the JobManager, creating a Runnable that looks up the execution
> > > vertex, and so on...
> > >
> > >   - The actor model's approach to queue actions on an actor and run the
> > one
> > > after another makes the concurrency model of the state machine very
> > simple
> > > and robust
> > >
> > >   - We "outsource" our own concerns about maintaining and improving
> that
> > > part of the system
> > >
> > > Greetings,
> > > Stephan
> > >
> >
>

Reply via email to