Stability is one concern; another is the difficulty of configuration and
deployment.

Configuration is always a pain. I do not want to restart HMaster many
times to get things right. A standalone service would be better.

For deployment, as chenheng said above, we usually do not deploy YARN
along with the HBase cluster, which means we need to use an external YARN
cluster to run the job. That cluster may not be under our control, so it
may be missing the HBase jars or have the wrong version of them (it seems
YARN's timeline server depends on HBase?). This could cause the job to
fail with ClassNotFound or some other strange exception.

Thanks.

2016-09-23 11:47 GMT+08:00 Ted Yu <yuzhih...@gmail.com>:

> You mean standalone service which runs Procedure V2 ?
> Not sure how much work is involved.
>
> Is this concerning the stability of Master where backup / restore
> procedures run ?
>
> To my understanding, errors in one procedure are isolated, not having
> adverse impact on Master's stability.
>
> On Thu, Sep 22, 2016 at 8:32 PM, 张铎 <palomino...@gmail.com> wrote:
>
> > So what about a standalone service other than master? You can use your
> > own procedure store in that service?
> >
> > 2016-09-23 11:28 GMT+08:00 Ted Yu <yuzhih...@gmail.com>:
> >
> > > An earlier implementation was client driven.
> > >
> > > But with that approach, it is hard to resume if there is error midway.
> > > Using Procedure V2 makes the backup / restore more robust.
> > >
> > > Another consideration is for security. It is hard to enforce security
> > > (to be implemented) for client driven actions.
> > >
> > > Cheers
> > >
> > > > On Sep 22, 2016, at 8:15 PM, Andrew Purtell <andrew.purt...@gmail.com> wrote:
> > > >
> > > > No, this misses Matteo's finer point, which is "shelling out" from
> > > > the master directly to run MR is a first. Why not drive this with a
> > > > utility derived from Tool?
> > > >
> > > > On Sep 22, 2016, at 7:57 PM, Vladimir Rodionov <vladrodio...@gmail.com> wrote:
> > > >
> > > > >>>> In our production cluster, it is a common case we just have HDFS
> > > > >>>> and HBase deployed.
> > > > >>>> If our Master/RS depend on MR framework (especially some features
> > > > >>>> we have not used at all), it introduced another cost for maintain.
> > > > >>>> I don't think it is a good idea.
> > > >>
> > > > >> So , you are not backup users in this case. Many our customers have
> > > > >> full stack deployed and want see backup to be a standard feature.
> > > > >> Besides this, nothing will happen in your cluster if you won't be
> > > > >> doing backups.
> > > > >>
> > > > >> This discussion (we do not want see M/R dependency) goes to nowhere.
> > > > >> We asked already, at least twice, to suggest another framework
> > > > >> (other than M/R) for bulk data copy with *conversion*. Still waiting
> > > > >> for suggestions.
> > > >>
> > > >> -Vlad
> > > >>
> > > >>
> > > >>
> > > >>
> > > > >>> On Thu, Sep 22, 2016 at 7:49 PM, Ted Yu <yuzhih...@gmail.com> wrote:
> > > >>>
> > > > >>> If MR framework is not deployed in the cluster, hbase still
> > > > >>> functions normally (post merge).
> > > >>>
> > > > >>> In terms of build time dependency, we have long been depending on
> > > > >>> mapreduce. Take a look at ExportSnapshot.
> > > >>>
> > > >>> Cheers
> > > >>>
> > > > >>> On Thu, Sep 22, 2016 at 7:42 PM, Heng Chen <heng.chen.1...@gmail.com> wrote:
> > > >>>
> > > > >>>> In our production cluster, it is a common case we just have HDFS
> > > > >>>> and HBase deployed.
> > > > >>>> If our Master/RS depend on MR framework (especially some features
> > > > >>>> we have not used at all), it introduced another cost for maintain.
> > > > >>>> I don't think it is a good idea.
> > > >>>>
> > > >>>> 2016-09-23 10:28 GMT+08:00 张铎 <palomino...@gmail.com>:
> > > > >>>>> To be specific, for example, our nice Backup/Restore feature, if
> > > > >>>>> we think this is not a core feature of HBase, then we could make
> > > > >>>>> it depend on MR, and start a standalone BackupManager instance
> > > > >>>>> that submits MR jobs to do periodical maintenance job. And if we
> > > > >>>>> think this is a core feature that everyone should use it, then
> > > > >>>>> we'd better implement it without MR dependency, like DLS.
> > > >>>>>
> > > >>>>> Thanks.
> > > >>>>>
> > > >>>>> 2016-09-23 10:11 GMT+08:00 张铎 <palomino...@gmail.com>:
> > > >>>>>
> > > > >>>>>> I'm -1 on let master or rs launch MR jobs. It is OK that some of
> > > > >>>>>> our features depend on MR but I think the bottom line is that we
> > > > >>>>>> should launch the jobs from outside manually or by other
> > > > >>>>>> services.
> > > >>>>>>
> > > >>>>>> 2016-09-23 9:47 GMT+08:00 Andrew Purtell <
> > andrew.purt...@gmail.com
> > > >:
> > > >>>>>>
> > > > >>>>>>> Ok, got it. Well "shelling out" is on the line I think, so a
> > > > >>>>>>> fair question.
> > > > >>>>>>>
> > > > >>>>>>> Can this be driven by a utility derived from Tool like our
> > > > >>>>>>> other MR apps? The issue is needing the AccessController to
> > > > >>>>>>> decide if allowed? But nothing prevents the user from running
> > > > >>>>>>> the job manually/independently, right?
> > > >>>>>>>
> > > > >>>>>>>> On Sep 22, 2016, at 3:44 PM, Matteo Bertozzi <theo.berto...@gmail.com> wrote:
> > > >>>>>>>>
> > > > >>>>>>>> just a remark. my query was not about tools using MR
> > > > >>>>>>>> (everyone i think is ok with those).
> > > > >>>>>>>> the topic was about: "are we ok with running MR jobs from
> > > > >>>>>>>> Master and RSs code?" since this will be the first time we
> > > > >>>>>>>> do this
> > > >>>>>>>>
> > > >>>>>>>> Matteo
> > > >>>>>>>>
> > > >>>>>>>>
> > > > >>>>>>>>> On Thu, Sep 22, 2016 at 2:49 PM, Devaraj Das <d...@hortonworks.com> wrote:
> > > >>>>>>>>>
> > > > >>>>>>>>> Very much agree; for tools like ExportSnapshot / Backup /
> > > > >>>>>>>>> Restore, it's fine to be dependent on MR. MR is the right
> > > > >>>>>>>>> framework for such. We should also do compactions using MR
> > > > >>>>>>>>> (just saying :) )
> > > >>>>>>>>> ________________________________________
> > > >>>>>>>>> From: Ted Yu <yuzhih...@gmail.com>
> > > >>>>>>>>> Sent: Thursday, September 22, 2016 2:00 PM
> > > >>>>>>>>> To: dev@hbase.apache.org
> > > >>>>>>>>> Subject: Re: [DISCUSSION] MR jobs started by Master or RS
> > > >>>>>>>>>
> > > > >>>>>>>>> I agree - backup / restore is in the same category as
> > > > >>>>>>>>> import / export.
> > > >>>>>>>>>
> > > > >>>>>>>>> On Thu, Sep 22, 2016 at 1:58 PM, Andrew Purtell <andrew.purt...@gmail.com> wrote:
> > > >>>>>>>>>
> > > > >>>>>>>>>> Backup is extra tooling around core in my opinion. Like
> > > > >>>>>>>>>> import or export. Or the optional MOB tool. It's fine.
> > > >>>>>>>>>>
> > > > >>>>>>>>>>> On Sep 22, 2016, at 1:50 PM, Matteo Bertozzi <mberto...@apache.org> wrote:
> > > >>>>>>>>>>>
> > > > >>>>>>>>>>> What's the latest opinion around running MR jobs from
> > > > >>>>>>>>>>> hbase (Master or RS)?
> > > >>>>>>>>>>>
> > > > >>>>>>>>>>> I remember in the past that there was discussion about
> > > > >>>>>>>>>>> not having MR has direct dependency of hbase.
> > > >>>>>>>>>>>
> > > > >>>>>>>>>>> I think some of discussion where around MOB that had a
> > > > >>>>>>>>>>> MR job to compact, that later was transformed in a
> > > > >>>>>>>>>>> non-MR job to be merged, I think we had a similar
> > > > >>>>>>>>>>> discussion for log split/replay.
> > > >>>>>>>>>>>
> > > > >>>>>>>>>>> the latest is the new Backup feature (HBASE-7912), that
> > > > >>>>>>>>>>> runs a MR job from the master to copy data or restore
> > > > >>>>>>>>>>> data.
> > > > >>>>>>>>>>> (backup is also "not really core" as in.. if you don't
> > > > >>>>>>>>>>> use backup you'll not end up running MR jobs, but this
> > > > >>>>>>>>>>> was probably true for MOB as in "if you don't enable
> > > > >>>>>>>>>>> MOB you don't need MR")
> > > >>>>>>>>>>>
> > > > >>>>>>>>>>> any thoughts? do we a rule that says "we don't want to
> > > > >>>>>>>>>>> have hbase run MR jobs, only tool started manually by
> > > > >>>>>>>>>>> the user can do that". or can we start adding MR calls
> > > > >>>>>>>>>>> around without problems?
> > > >>>
> > >
> >
>
