>> No, never. No need for M/R here, just a simple compaction-server colocated with RS on the same node. You save a lot on GC in RS. Ideally, it can be IO "nice" in Linux (by setting IO priority). But off-topic, of course :)
-Vlad

On Thu, Sep 22, 2016 at 7:57 PM, Vladimir Rodionov <vladrodio...@gmail.com> wrote:

> >> And if MR not deployed, Backup/Restore feature could not be used, right?
>
> Yes.
>
> On Thu, Sep 22, 2016 at 7:53 PM, Heng Chen <heng.chen.1...@gmail.com> wrote:
>
>> {quote}
>> If MR framework is not deployed in the cluster, hbase still functions
>> normally (post merge).
>> {quote}
>>
>> If MR is not strong dependency for Master/RS, it is OK for me.
>> And if MR not deployed, Backup/Restore feature could not be used, right?
>>
>> 2016-09-23 10:49 GMT+08:00 Ted Yu <yuzhih...@gmail.com>:
>> > If MR framework is not deployed in the cluster, hbase still functions
>> > normally (post merge).
>> >
>> > In terms of build time dependency, we have long been depending on
>> > mapreduce. Take a look at ExportSnapshot.
>> >
>> > Cheers
>> >
>> > On Thu, Sep 22, 2016 at 7:42 PM, Heng Chen <heng.chen.1...@gmail.com> wrote:
>> >
>> >> In our production cluster, it is a common case we just have HDFS and
>> >> HBase deployed.
>> >> If our Master/RS depend on MR framework (especially some features we
>> >> have not used at all), it introduced another cost for maintain. I
>> >> don't think it is a good idea.
>> >>
>> >> 2016-09-23 10:28 GMT+08:00 张铎 <palomino...@gmail.com>:
>> >> > To be specific, for example, our nice Backup/Restore feature, if we think
>> >> > this is not a core feature of HBase, then we could make it depend on MR,
>> >> > and start a standalone BackupManager instance that submits MR jobs to do
>> >> > periodical maintenance job. And if we think this is a core feature that
>> >> > everyone should use it, then we'd better implement it without MR
>> >> > dependency, like DLS.
>> >> >
>> >> > Thanks.
>> >> >
>> >> > 2016-09-23 10:11 GMT+08:00 张铎 <palomino...@gmail.com>:
>> >> >
>> >> >> I'm -1 on let master or rs launch MR jobs. It is OK that some of our
>> >> >> features depend on MR but I think the bottom line is that we should launch
>> >> >> the jobs from outside manually or by other services.
>> >> >>
>> >> >> 2016-09-23 9:47 GMT+08:00 Andrew Purtell <andrew.purt...@gmail.com>:
>> >> >>
>> >> >>> Ok, got it. Well "shelling out" is on the line I think, so a fair
>> >> >>> question.
>> >> >>>
>> >> >>> Can this be driven by a utility derived from Tool like our other MR apps?
>> >> >>> The issue is needing the AccessController to decide if allowed? But nothing
>> >> >>> prevents the user from running the job manually/independently, right?
>> >> >>>
>> >> >>> > On Sep 22, 2016, at 3:44 PM, Matteo Bertozzi <theo.berto...@gmail.com> wrote:
>> >> >>> >
>> >> >>> > just a remark. my query was not about tools using MR (everyone i think is
>> >> >>> > ok with those).
>> >> >>> > the topic was about: "are we ok with running MR jobs from Master and RSs
>> >> >>> > code?" since this will be the first time we do this
>> >> >>> >
>> >> >>> > Matteo
>> >> >>> >
>> >> >>> >
>> >> >>> >> On Thu, Sep 22, 2016 at 2:49 PM, Devaraj Das <d...@hortonworks.com> wrote:
>> >> >>> >>
>> >> >>> >> Very much agree; for tools like ExportSnapshot / Backup / Restore, it's
>> >> >>> >> fine to be dependent on MR. MR is the right framework for such. We should
>> >> >>> >> also do compactions using MR (just saying :) )
>> >> >>> >> ________________________________________
>> >> >>> >> From: Ted Yu <yuzhih...@gmail.com>
>> >> >>> >> Sent: Thursday, September 22, 2016 2:00 PM
>> >> >>> >> To: dev@hbase.apache.org
>> >> >>> >> Subject: Re: [DISCUSSION] MR jobs started by Master or RS
>> >> >>> >>
>> >> >>> >> I agree - backup / restore is in the same category as import / export.
>> >> >>> >>
>> >> >>> >> On Thu, Sep 22, 2016 at 1:58 PM, Andrew Purtell <andrew.purt...@gmail.com> wrote:
>> >> >>> >>
>> >> >>> >>> Backup is extra tooling around core in my opinion. Like import or export.
>> >> >>> >>> Or the optional MOB tool. It's fine.
>> >> >>> >>>
>> >> >>> >>>> On Sep 22, 2016, at 1:50 PM, Matteo Bertozzi <mberto...@apache.org> wrote:
>> >> >>> >>>>
>> >> >>> >>>> What's the latest opinion around running MR jobs from hbase (Master or RS)?
>> >> >>> >>>>
>> >> >>> >>>> I remember in the past that there was discussion about not having MR has
>> >> >>> >>>> direct dependency of hbase.
>> >> >>> >>>>
>> >> >>> >>>> I think some of discussion where around MOB that had a MR job to compact,
>> >> >>> >>>> that later was transformed in a non-MR job to be merged, I think we had a
>> >> >>> >>>> similar discussion for log split/replay.
>> >> >>> >>>>
>> >> >>> >>>> the latest is the new Backup feature (HBASE-7912), that runs a MR job from
>> >> >>> >>>> the master to copy data or restore data.
>> >> >>> >>>> (backup is also "not really core" as in.. if you don't use backup you'll
>> >> >>> >>>> not end up running MR jobs, but this was probably true for MOB as in "if
>> >> >>> >>>> you don't enable MOB you don't need MR")
>> >> >>> >>>>
>> >> >>> >>>> any thoughts? do we a rule that says "we don't want to have hbase run MR
>> >> >>> >>>> jobs, only tool started manually by the user can do that". or can we start
>> >> >>> >>>> adding MR calls around without problems?