Thanks for the proposal, Gour! Interesting thought.
I think it makes sense. As YARN is maturing, long-lived services
becoming a primitive is a natural progression. Slider is likely at the
forefront of building such a primitive on YARN (thanks to a lot of great
planning/design from Steve).
This would definitely be an interesting conversation to have
with YARN (if the other podling members are of the same mindset). How
it plays out would require a bit of planning/coordination
from the Hadoop PMC side.
Now, there is the other half of Slider: the app-packages. My gut
reaction is that YARN would have no interest in owning/maintaining
these. This is a bit concerning to me because Slider on its own really
isn't that exciting. It's the app-packages that make it so enticing --
build a zip, install it to your cluster, and suddenly users can start
dynamically creating clusters (HBase, Accumulo, Storm, etc). I would be
strongly opposed to any plan to merge Slider into YARN/Hadoop without a
clear path forward on where the app-packages would live. This is
extremely important to me.
I'd love to see where this conversation can go.
- Josh
Gour Saha wrote:
Slider community,
The YARN team is discussing in YARN-4692<https://issues.apache.org/jira/browse/YARN-4692>
how to add "first class services" directly to YARN. Some of the names in the
discussion document should be familiar: that's because Slider is essentially the original
long-lived application in YARN.
With YARN-4692<https://issues.apache.org/jira/browse/YARN-4692>, it is apparent that the
Apache Hadoop YARN community is working towards providing direct support for long-lived
services. I think we need to look at that proposal and ask, "where and how does Slider
relate to this?"
Apache Slider (incubating) has been in the business of creating and managing
long-running services in YARN for a couple of years. Today it is being used in
production YARN clusters across several companies (big and small). Several
production-grade applications (data and non-data) are available as sample
packages. A good number of them have been contributed by interested parties
like Lucidworks contributing a Solr Slider Application Package and DataTorrent
contributing a Kafka Slider Application Package.
Slider has been pretty good at taking existing applications and turning them into long-lived
services in YARN. YARN offers the core scheduling, execution and failure reporting functions;
Slider takes that and adds: advanced container placement (history, anti-affinity, escalation
policies), configuration, dynamic binding, monitoring, failure handling, and an API for
clients. It's also driven a lot of the
YARN-896<https://issues.apache.org/jira/browse/YARN-896> "long-lived services"
development: long-lived failure resilience, the YARN registry, container-preservation over YARN
restarts. Big chunks of that code actually came from the Slider team. This was always a goal of
the work even in its Hoya predecessor: show that YARN can be used to host applications like
HBase, and identify where it can be improved.
What does it mean for Slider if YARN starts doing this directly?
Slider provides a lot of the basic functionality for long-running services
proposed in YARN-4692. It is a universal YARN app-master and lets
application owners focus on their application's functionality, while it handles
the internals of orchestrating services on YARN.
Which means: we have an opportunity here to contribute the core of Slider into
YARN itself, and, with it in YARN, use it as the basis for the full TODO-list
of YARN-4692.
The YARN team gets the stable codebase that's evolved over the past few years:
something to deploy applications in a YARN cluster. What does Slider get? We'd
get to be the foundation for long-lived YARN services with the new work on top.
Would this work? What's wrong with the idea? How do we do it if we want to go
with it?
I would like to call upon the community to weigh in with their thoughts and opinions
on this topic.
-Gour