Hi all,
    The design doc can be reached at:
    https://github.com/KylinOLAP/Kylin/tree/inverted-index/docs/JobEngine

Best Regards
Zhou QianHao





On 1/16/15, 10:40 PM, "周千昊" <[email protected]> wrote:

>Sorry for missing the link.
>Here is the related jira ticket
>https://issues.apache.org/jira/browse/KYLIN-533
>
>On Fri Jan 16 2015 at 7:52:29 PM Luke Han <[email protected]> wrote:
>
>> Qianhao, could you please paste JIRA link here also?
>>
>> Thanks.
>>
>> 2015-01-16 11:12 GMT+08:00 Zhou, Qianhao <[email protected]>:
>>
>> > Hi, all
>> >    I have created a JIRA ticket for this task.
>> >    And I am working on it right now.
>> >
>> > Best Regards
>> > Zhou QianHao
>> >
>> >
>> >
>> >
>> >
>> > On 1/16/15, 11:00 AM, "Li Yang" <[email protected]> wrote:
>> >
>> > >> Still worth considering an existing tool. The simplest code is the
>> > >> code you don’t maintain. :)
>> > >
>> > >Exactly, we try to maintain as little code as possible for the job
>> > >manager (engine is an old, bad name, as we rely on external jobs to do
>> > >the heavy lifting). The previous version depended on Quartz, and it took
>> > >a lot of code to fit what we need into the Quartz bottle. Quartz in our
>> > >case is like a gorilla holding a banana: all we need is the banana, but
>> > >we had to bring the gorilla home. That's why the refactoring comes, and
>> > >on top of JDK ExecutorService, we expect only a few hundred lines of
>> > >code for the new implementation.
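The "few hundred lines on top of JDK ExecutorService" idea could be sketched roughly as below. This is only an illustrative sketch, not Kylin's actual code; the class and method names (SimpleJobRunner, submit, etc.) are hypothetical:

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: a tiny job manager on plain JDK ExecutorService.
// Each job is an ordered list of steps; the heavy lifting (e.g. an
// external MapReduce run) happens inside each step.
class SimpleJobRunner {
    enum Status { PENDING, RUNNING, SUCCEED, ERROR }

    private final ExecutorService pool = Executors.newSingleThreadExecutor();
    private volatile Status status = Status.PENDING;

    Status getStatus() { return status; }

    // Submit the steps as one sequential job; the returned Future
    // resolves to the job's final status.
    Future<Status> submit(List<Runnable> steps) {
        return pool.submit(() -> {
            status = Status.RUNNING;
            try {
                for (Runnable step : steps) {
                    step.run(); // e.g. launch an external job and wait for it
                }
                status = Status.SUCCEED;
            } catch (RuntimeException e) {
                status = Status.ERROR; // surface the failure to status pollers
            }
            return status;
        });
    }

    void shutdown() { pool.shutdown(); }
}
```

No scheduling library is involved: step ordering falls out of the single-threaded executor, and status is just a volatile field polled by callers.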
>> > >
>> > >On Thu, Jan 15, 2015 at 8:47 PM, Luke Han <[email protected]> wrote:
>> > >
>> > >> maybe renaming this module to "builder" would make more sense :-)
>> > >>
>> > >> 2015-01-15 8:09 GMT+08:00 Henry Saputra <[email protected]>:
>> > >>
>> > >> > I believe the job engine here is a cube builder, the component
>> > >> > that manages submissions to the different distributed platforms
>> > >> > (MR, Flink, Spark) that actually execute the jobs on different
>> > >> > machines. Its primary function is to manage the "job" submission
>> > >> > and act as a reverse proxy for status, scheduling, and metadata
>> > >> > access for the operations.
>> > >> >
>> > >> > I worked on something similar to this in my previous role =)
>> > >> >
>> > >> > - Henry
>> > >> >
>> > >> > On Wed, Jan 14, 2015 at 9:40 AM, Julian Hyde <[email protected]>
>> > >> > wrote:
>> > >> > > Still worth considering an existing tool. The simplest code is
>> > >> > > the code you don’t maintain. :)
>> > >> > >
>> > >> > > On Jan 14, 2015, at 2:57 AM, Li Yang <[email protected]> wrote:
>> > >> > >
>> > >> > >> Sorry I'm late, just a recap.
>> > >> > >>
>> > >> > >> The "Job Engine" here only manages long-running tasks'
>> > >> > >> lifecycle and dependencies. It oversees task sequences (e.g. a
>> > >> > >> cube build is made up of several MapReduce jobs) and allows the
>> > >> > >> user to start/stop/pause/resume them.
>> > >> > >>
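The start/stop/pause/resume above does not require a workflow engine; one hedged sketch (all names here are illustrative, not the actual Kylin implementation) is to check a flag between steps, so control actions take effect at step boundaries while each step remains an external job:

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

// Illustrative sketch: a job made of ordered steps that can be stopped
// between steps and resumed later from where it left off.
class SteppedJob {
    enum State { READY, RUNNING, STOPPED, FINISHED }

    private final AtomicBoolean stopRequested = new AtomicBoolean(false);
    private volatile State state = State.READY;
    private int nextStep = 0; // remembered so a later run() resumes here

    State getState() { return state; }

    void requestStop() { stopRequested.set(true); }

    // Run (or resume) the remaining steps in order.
    State run(List<Runnable> steps) {
        state = State.RUNNING;
        while (nextStep < steps.size()) {
            if (stopRequested.getAndSet(false)) {
                state = State.STOPPED; // resume later by calling run() again
                return state;
            }
            steps.get(nextStep).run(); // e.g. one MapReduce stage of a cube build
            nextStep++;
        }
        state = State.FINISHED;
        return state;
    }
}
```

Pause/resume works the same way as stop/resume here; the point is only that step boundaries are the natural control points when the steps themselves run elsewhere.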
>> > >> > >> It does not do scheduling or fancy workflow; that's why many
>> > >> > >> existing products like Quartz or Oozie are overkill. We want to
>> > >> > >> keep Kylin's overall architecture simple and easy to deploy and
>> > >> > >> debug.
>> > >> > >>
>> > >> > >> The purpose of this refactoring is to separate the manager role
>> > >> > >> and the worker role, which the previous implementation mixed up.
>> > >> > >> Once done, replacing a worker shall become easy. We will be free
>> > >> > >> to explore other cube-building workers, like the Flink and Spark
>> > >> > >> ones mentioned.
>> > >> > >>
>> > >> > >> Cheers
>> > >> > >> Yang
>> > >> > >>
>> > >> > >> On Wed, Jan 14, 2015 at 10:08 AM, Zhou, Qianhao
>> > >> > >> <[email protected]> wrote:
>> > >> > >>
>> > >> > >>> Thanks, Ted, for the advice.
>> > >> > >>> I think the right way to do this is to take more options into
>> > >> > >>> consideration and then make a decision. Whichever solution is
>> > >> > >>> used, we are going to learn something that will benefit us
>> > >> > >>> sooner or later.
>> > >> > >>>
>> > >> > >>> Best Regards
>> > >> > >>> Zhou QianHao
>> > >> > >>>
>> > >> > >>>
>> > >> > >>>
>> > >> > >>>
>> > >> > >>>
>> > >> > >>> On 1/14/15, 12:37 AM, "Ted Dunning" <[email protected]> wrote:
>> > >> > >>>
>> > >> > >>>> OK.
>> > >> > >>>>
>> > >> > >>>> On Tue, Jan 13, 2015 at 10:30 AM, 周千昊 <[email protected]>
>> > >> > >>>> wrote:
>> > >> > >>>>
>> > >> > >>>>> As I mentioned, we don't want an extra dependency, because
>> > >> > >>>>> that will make the deployment more complex. As for Aurora,
>> > >> > >>>>> the users would have an extra installation step, whereas so
>> > >> > >>>>> far Kylin only needs a war package and a Hadoop cluster.
>> > >> > >>>>> On Tue Jan 13 2015 at 10:26:50 PM Ted Dunning
>> > >> > >>>>> <[email protected]> wrote:
>> > >> > >>>>>
>> > >> > >>>>>> I understand you want to write your own job engine.  But why
>> > >> > >>>>>> not use one that already exists?
>> > >> > >>>>>>
>> > >> > >>>>>> Given that you mention Quartz, it sounds like Aurora might
>> > >> > >>>>>> be a good fit.  Why not use it?
>> > >> > >>>>>>
>> > >> > >>>>>>
>> > >> > >>>>>>
>> > >> > >>>>>> On Tue, Jan 13, 2015 at 3:34 AM, Zhou, Qianhao
>> > >> > >>>>>> <[email protected]> wrote:
>> > >> > >>>>>>
>> > >> > >>>>>>> What we want is:
>> > >> > >>>>>>>
>> > >> > >>>>>>> 1. A lightweight job engine that is easy to start, stop,
>> > >> > >>>>>>>    and check jobs with. Most of the heavyweight work is
>> > >> > >>>>>>>    map-reduce, which already runs on the cluster, so we
>> > >> > >>>>>>>    don't need the job engine itself to run on a cluster.
>> > >> > >>>>>>>
>> > >> > >>>>>>> 2. Kylin already has a job engine based on Quartz; however,
>> > >> > >>>>>>>    only a very small part of its functionality is used, so
>> > >> > >>>>>>>    we can easily replace it with the standard Java API.
>> > >> > >>>>>>>    Thus there will be no extra dependency, which means
>> > >> > >>>>>>>    easier deployment.
>> > >> > >>>>>>>
>> > >> > >>>>>>> Currently a very simple job engine implementation will meet
>> > >> > >>>>>>> Kylin's needs, so I think at this point just keeping it
>> > >> > >>>>>>> simple would be the better choice.
>> > >> > >>>>>>>
>> > >> > >>>>>>>
>> > >> > >>>>>>> Best Regards
>> > >> > >>>>>>> Zhou QianHao
>> > >> > >>>>>>>
>> > >> > >>>>>>>
>> > >> > >>>>>>>
>> > >> > >>>>>>>
>> > >> > >>>>>>>
>> > >> > >>>>>>> On 1/13/15, 4:43 PM, "Ted Dunning" <[email protected]>
>> > >> > >>>>>>> wrote:
>> > >> > >>>>>>>
>> > >> > >>>>>>>> So why are the following systems unsuitable?
>> > >> > >>>>>>>>
>> > >> > >>>>>>>> - mesos + (aurora or chronos)
>> > >> > >>>>>>>> - spark
>> > >> > >>>>>>>> - yarn
>> > >> > >>>>>>>> - drill's drillbits
>> > >> > >>>>>>>>
>> > >> > >>>>>>>> These options do different things.  I know that.  I am not
>> > >> > >>>>>>>> entirely clear on what you want, however, so I present
>> > >> > >>>>>>>> these different options so that you can tell me better
>> > >> > >>>>>>>> what you want.
>> > >> > >>>>>>>>
>> > >> > >>>>>>>> Mesos provides very flexible job scheduling.  With Aurora,
>> > >> > >>>>>>>> it has support for handling long-running and periodic
>> > >> > >>>>>>>> jobs.  With Chronos, it has the equivalent of a
>> > >> > >>>>>>>> cluster-level cron.
>> > >> > >>>>>>>>
>> > >> > >>>>>>>> Spark provides the ability for a program to spawn lots of
>> > >> > >>>>>>>> parallel execution.  This is different than what most
>> > >> > >>>>>>>> people mean by job scheduling, but in conjunction with a
>> > >> > >>>>>>>> queuing system combined with Spark streaming, you can get
>> > >> > >>>>>>>> remarkably close to a job scheduler.
>> > >> > >>>>>>>>
>> > >> > >>>>>>>> Yarn can run jobs, but has no capabilities to schedule
>> > >> > >>>>>>>> recurring jobs.  It can adjudicate the allocation of
>> > >> > >>>>>>>> cluster resources.  This is different from what either
>> > >> > >>>>>>>> Spark or Mesos does.
>> > >> > >>>>>>>>
>> > >> > >>>>>>>> Drill's drillbits do scheduling of queries across a
>> > >> > >>>>>>>> parallel execution environment.  It currently has no user
>> > >> > >>>>>>>> impersonation, but does do an interesting job of
>> > >> > >>>>>>>> scheduling parts of parallel queries.
>> > >> > >>>>>>>>
>> > >> > >>>>>>>> Each of these could be considered like a job scheduler.
>> > >> > >>>>>>>> Only a very few are likely to be what you are talking
>> > >> > >>>>>>>> about.
>> > >> > >>>>>>>>
>> > >> > >>>>>>>> Which is it?
>> > >> > >>>>>>>>
>> > >> > >>>>>>>>
>> > >> > >>>>>>>>
>> > >> > >>>>>>>
>> > >> > >>>>>>
>> > >> > >>>>>
>> > >> > >>>
>> > >> > >>>
>> > >> > >
>> > >> >
>> > >>
>> >
>> >
>>
