Your proposed allocation approach makes a lot of sense. I think it will
solve a large number of use cases. Thanks for giving an overview of the
different frameworks. I wonder if they got too focused on the simple use
case...

Have you looked at Llama to see whether we could extend it for our needs?
It's Apache-licensed and probably has at least a start on a bunch of things
we're trying to do.

https://github.com/cloudera/llama

--
Jacques Nadeau
CTO and Co-Founder, Dremio

On Tue, Mar 22, 2016 at 7:42 PM, Paul Rogers <prog...@maprtech.com> wrote:

> Hi Jacques,
>
> I’m thinking of “semi-static” allocation at first: spin up a cluster of
> Drill-bits, after which the user can add or remove nodes while the cluster
> runs. (The add part is easy; the remove part is a bit tricky since we don’t
> yet have a way to gracefully shut down a Drill-bit.) Once we get the basics
> to work, we can incrementally try out dynamic allocation. For example,
> someone could whip up a script that watches load and uses the proposed YARN
> client app to adjust resources, as in the sketch below. Later, we can fold
> dynamic load management into the solution once we’re sure what folks want.
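>
> Just to illustrate (everything here is hypothetical: the “drill-on-yarn.sh”
> client app and its “resize” verb don’t exist yet, and the thresholds are
> made up), such a script could be little more than a loop:
>
>   // Hypothetical load-watcher sketch. Polls the Linux load average and
>   // asks a (not yet built) client app to resize the cluster.
>   import java.nio.file.Files;
>   import java.nio.file.Paths;
>
>   public class LoadWatcher {
>     public static void main(String[] args) throws Exception {
>       while (true) {
>         double load = Double.parseDouble(Files.readAllLines(
>             Paths.get("/proc/loadavg")).get(0).split(" ")[0]);
>         int target = load > 8.0 ? 10 : 5; // crude thresholds, for show
>         new ProcessBuilder("drill-on-yarn.sh", "resize",
>             Integer.toString(target)).inheritIO().start().waitFor();
>         Thread.sleep(60_000); // re-check once a minute
>       }
>     }
>   }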
>
> I did look at Slider, Twill, Kitten and REEF. Kitten is too basic. I had
> great hopes for Slider. But it turns out that Slider and Twill have each
> built an elaborate framework to isolate us from YARN. The Slider framework
> (written in Python) seems harder to understand than YARN itself; at the
> very least, one has to be an expert in YARN to understand what all that
> Python code does. And just looking at the class count in the Twill Javadoc
> was overwhelming. Slider and Twill have to solve the general case. If we
> build our own Java solution, we only have to solve the Drill case, which
> is likely much simpler.
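>
> To give a feel for the scale, a bare-bones Drill ApplicationMaster on the
> stock YARN client API (Hadoop 2.x) might start out like the sketch below.
> Container launch and failure handling are omitted, and the resource sizing
> is made up:
>
>   import java.util.List;
>   import org.apache.hadoop.yarn.api.records.*;
>   import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest;
>   import org.apache.hadoop.yarn.client.api.async.AMRMClientAsync;
>   import org.apache.hadoop.yarn.conf.YarnConfiguration;
>
>   public class DrillAmSketch implements AMRMClientAsync.CallbackHandler {
>     public static void main(String[] args) throws Exception {
>       AMRMClientAsync<ContainerRequest> rm =
>           AMRMClientAsync.createAMRMClientAsync(1000, new DrillAmSketch());
>       rm.init(new YarnConfiguration());
>       rm.start();
>       rm.registerApplicationMaster("", 0, "");
>       // One container request per desired Drill-bit.
>       Resource size = Resource.newInstance(8192, 4); // example sizing
>       for (int i = 0; i < Integer.parseInt(args[0]); i++) {
>         rm.addContainerRequest(new ContainerRequest(
>             size, null, null, Priority.newInstance(0)));
>       }
>       Thread.sleep(Long.MAX_VALUE); // a real AM tracks state, unregisters
>     }
>
>     public void onContainersAllocated(List<Container> containers) {
>       // Launch a Drill-bit in each container via NMClient (omitted).
>     }
>     public void onContainersCompleted(List<ContainerStatus> statuses) {}
>     public void onShutdownRequest() {}
>     public void onNodesUpdated(List<NodeReport> nodes) {}
>     public void onError(Throwable e) {}
>     public float getProgress() { return 0f; }
>   }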
>
> A bespoke solution would seem to offer some other advantages. It lets us
> do things like integrate ZK monitoring so we can learn of zombie Drill-bits
> (processes that haven’t exited but are no longer sending heartbeats). We
> can also gather metrics and historical data about the cluster as a whole.
> We can try out different cluster topologies (run Drill-bits on x of y nodes
> on a rack, say). And we can eventually do the dynamic load management we
> discussed earlier.
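>
> For instance, the zombie check could hang a Curator watch on Drill’s ZK
> registry and cross-check removals against what YARN says is running. A
> rough sketch (the ZK address is a placeholder, “/drill/drillbits1” is just
> Drill’s default registry path, and the cross-check itself is omitted):
>
>   import org.apache.curator.framework.CuratorFramework;
>   import org.apache.curator.framework.CuratorFrameworkFactory;
>   import org.apache.curator.framework.recipes.cache.PathChildrenCache;
>   import org.apache.curator.framework.recipes.cache.PathChildrenCacheEvent;
>   import org.apache.curator.retry.ExponentialBackoffRetry;
>
>   public class ZombieWatcher {
>     public static void main(String[] args) throws Exception {
>       CuratorFramework zk = CuratorFrameworkFactory.newClient(
>           "zk1:2181", new ExponentialBackoffRetry(1000, 3));
>       zk.start();
>       // Each live Drill-bit holds an ephemeral znode under this path.
>       PathChildrenCache cache =
>           new PathChildrenCache(zk, "/drill/drillbits1", true);
>       cache.getListenable().addListener((client, event) -> {
>         if (event.getType() == PathChildrenCacheEvent.Type.CHILD_REMOVED) {
>           // Registration gone: if YARN still reports the container as
>           // running, the Drill-bit is a zombie and its container should
>           // be killed (AM cross-check omitted here).
>           System.out.println("Lost Drill-bit: " + event.getData().getPath());
>         }
>       });
>       cache.start();
>       Thread.sleep(Long.MAX_VALUE);
>     }
>   }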
>
> But first, I look forward to hearing what others have tried and what we’ve
> learned about how people want to use Drill in a production YARN cluster.
>
> Thanks,
>
> - Paul
>
>
> > On Mar 22, 2016, at 5:45 PM, Jacques Nadeau <jacq...@dremio.com> wrote:
> >
> > This is great news, welcome!
> >
> > What are you thinking in regards to static versus dynamic resource
> > allocation? We have some conversations going regarding workload
> > management but they are still early so it seems like starting with
> > user-controlled allocation makes sense initially.
> >
> > Also, have you spent much time evaluating whether one of the existing
> > YARN frameworks such as Slider would be useful? Does anyone on the list
> > have any feedback on the relative merits of these technologies?
> >
> > Again, glad to see someone picking this up.
> >
> > Jacques
> >
> >
> > --
> > Jacques Nadeau
> > CTO and Co-Founder, Dremio
> >
> > On Tue, Mar 22, 2016 at 4:58 PM, Paul Rogers <prog...@maprtech.com> wrote:
> >
> >> Hi All,
> >>
> >> I’m a new member of the Drill Team here at MapR. We’d like to take a
> >> look at running Drill on YARN for production customers. JIRA suggests
> >> some early work may have been done (DRILL-142
> >> <https://issues.apache.org/jira/browse/DRILL-142>, DRILL-1170
> >> <https://issues.apache.org/jira/browse/DRILL-1170>, DRILL-3675
> >> <https://issues.apache.org/jira/browse/DRILL-3675>).
> >>
> >> YARN is a complex beast and the Drill community is large and growing.
> >> So, a good place to start is to ask whether anyone has already done work
> >> on integrating Drill with YARN (see DRILL-142), or has thought about
> >> what might be needed.
> >>
> >> DRILL-1170 (YARN support for Drill) seems a good place to gather
> >> requirements, designs and so on. I’ve posted a “starter set” of
> >> requirements to spur discussion.
> >>
> >> Thanks,
> >>
> >> - Paul
> >>
> >>
>
>
