Agreed, this would be interesting to contemplate.

On Sep 22, 2016, at 8:03 PM, Vladimir Rodionov <vladrodio...@gmail.com> wrote:
>>> No, never.
>
> No need for M/R here, just a simple compaction-server colocated with RS on
> a same node.
> You save a lot on GC in RS. Ideally, it can be IO "nice" in Linux (by
> setting IO priority). But offtopic, of course :)
>
> -Vlad
>
> On Thu, Sep 22, 2016 at 7:57 PM, Vladimir Rodionov <vladrodio...@gmail.com>
> wrote:
>
>>>> And if MR not deployed, Backup/Restore feature could not be used, right?
>>
>> Yes.
>>
>> On Thu, Sep 22, 2016 at 7:53 PM, Heng Chen <heng.chen.1...@gmail.com>
>> wrote:
>>
>>> {quote}
>>> If MR framework is not deployed in the cluster, hbase still functions
>>> normally (post merge).
>>> {quote}
>>>
>>> If MR is not strong dependency for Master/RS, it is OK for me.
>>> And if MR not deployed, Backup/Restore feature could not be used, right?
>>>
>>> 2016-09-23 10:49 GMT+08:00 Ted Yu <yuzhih...@gmail.com>:
>>>> If MR framework is not deployed in the cluster, hbase still functions
>>>> normally (post merge).
>>>>
>>>> In terms of build time dependency, we have long been depending on
>>>> mapreduce. Take a look at ExportSnapshot.
>>>>
>>>> Cheers
>>>>
>>>> On Thu, Sep 22, 2016 at 7:42 PM, Heng Chen <heng.chen.1...@gmail.com> wrote:
>>>>
>>>>> In our production cluster, it is a common case we just have HDFS and
>>>>> HBase deployed.
>>>>> If our Master/RS depend on MR framework (especially some features we
>>>>> have not used at all), it introduced another cost for maintain. I
>>>>> don't think it is a good idea.
>>>>>
>>>>> 2016-09-23 10:28 GMT+08:00 张铎 <palomino...@gmail.com>:
>>>>>> To be specific, for example, our nice Backup/Restore feature, if we think
>>>>>> this is not a core feature of HBase, then we could make it depend on MR,
>>>>>> and start a standalone BackupManager instance that submits MR jobs to do
>>>>>> periodical maintenance job. And if we think this is a core feature that
>>>>>> everyone should use it, then we'd better implement it without MR
>>>>>> dependency, like DLS.
>>>>>>
>>>>>> Thanks.
>>>>>>
>>>>>> 2016-09-23 10:11 GMT+08:00 张铎 <palomino...@gmail.com>:
>>>>>>
>>>>>>> I'm -1 on let master or rs launch MR jobs. It is OK that some of our
>>>>>>> features depend on MR but I think the bottom line is that we should launch
>>>>>>> the jobs from outside manually or by other services.
>>>>>>>
>>>>>>> 2016-09-23 9:47 GMT+08:00 Andrew Purtell <andrew.purt...@gmail.com>:
>>>>>>>
>>>>>>>> Ok, got it. Well "shelling out" is on the line I think, so a fair
>>>>>>>> question.
>>>>>>>>
>>>>>>>> Can this be driven by a utility derived from Tool like our other MR apps?
>>>>>>>> The issue is needing the AccessController to decide if allowed? But nothing
>>>>>>>> prevents the user from running the job manually/independently, right?
>>>>>>>>
>>>>>>>>> On Sep 22, 2016, at 3:44 PM, Matteo Bertozzi <theo.berto...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>> just a remark. my query was not about tools using MR (everyone i think is
>>>>>>>>> ok with those).
>>>>>>>>> the topic was about: "are we ok with running MR jobs from Master and RSs
>>>>>>>>> code?" since this will be the first time we do this
>>>>>>>>>
>>>>>>>>> Matteo
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> On Thu, Sep 22, 2016 at 2:49 PM, Devaraj Das <d...@hortonworks.com> wrote:
>>>>>>>>>>
>>>>>>>>>> Very much agree; for tools like ExportSnapshot / Backup / Restore, it's
>>>>>>>>>> fine to be dependent on MR. MR is the right framework for such. We should
>>>>>>>>>> also do compactions using MR (just saying :) )
>>>>>>>>>> ________________________________________
>>>>>>>>>> From: Ted Yu <yuzhih...@gmail.com>
>>>>>>>>>> Sent: Thursday, September 22, 2016 2:00 PM
>>>>>>>>>> To: dev@hbase.apache.org
>>>>>>>>>> Subject: Re: [DISCUSSION] MR jobs started by Master or RS
>>>>>>>>>>
>>>>>>>>>> I agree - backup / restore is in the same category as import / export.
>>>>>>>>>>
>>>>>>>>>> On Thu, Sep 22, 2016 at 1:58 PM, Andrew Purtell <andrew.purt...@gmail.com>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> Backup is extra tooling around core in my opinion. Like import or export.
>>>>>>>>>>> Or the optional MOB tool. It's fine.
>>>>>>>>>>>
>>>>>>>>>>>> On Sep 22, 2016, at 1:50 PM, Matteo Bertozzi <mberto...@apache.org> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> What's the latest opinion around running MR jobs from hbase (Master or RS)?
>>>>>>>>>>>>
>>>>>>>>>>>> I remember in the past that there was discussion about not having MR as a
>>>>>>>>>>>> direct dependency of hbase.
>>>>>>>>>>>>
>>>>>>>>>>>> I think some of the discussions were around MOB that had a MR job to compact,
>>>>>>>>>>>> that later was transformed in a non-MR job to be merged, I think we had a
>>>>>>>>>>>> similar discussion for log split/replay.
>>>>>>>>>>>>
>>>>>>>>>>>> the latest is the new Backup feature (HBASE-7912), that runs a MR job from
>>>>>>>>>>>> the master to copy data or restore data.
>>>>>>>>>>>> (backup is also "not really core" as in.. if you don't use backup you'll
>>>>>>>>>>>> not end up running MR jobs, but this was probably true for MOB as in "if
>>>>>>>>>>>> you don't enable MOB you don't need MR")
>>>>>>>>>>>>
>>>>>>>>>>>> any thoughts? do we have a rule that says "we don't want to have hbase run MR
>>>>>>>>>>>> jobs, only tool started manually by the user can do that". or can we start
>>>>>>>>>>>> adding MR calls around without problems?