If you guys have already implemented the feature in the MR way and the
patch is ready to land on master, I'm -0 on it as I do not want to
block development progress.
But I strongly suggest we revisit the design later and see if we
can separate the logic from HMaster as much as
2016-09-23 12:38 GMT+08:00 Devaraj Das :
> Guys, first off apologies for bringing in the topic of MR-based
> compactions.. But I was thinking more about the SpliceMachine approach of
> managing compactions in Spark where apparently they saw a lot of benefits.
> Apologies for
Just wanted to add one argument for doing this in a Master way:
Client-based backup/restore is very hard (if possible at all) to make fully
fault tolerant. If the client fails abruptly halfway, some system data will be
broken and the cluster will never return to its original state. We disable, for
example
All the better, Vlad!
On Thu, Sep 22, 2016 at 9:53 PM -0700, "Vladimir Rodionov" wrote:
>> If in the future, we find better ways of doing this without using MR, we
can certainly consider that
Our framework for distributed operations
>> If in the future, we find better ways of doing this without using MR, we
can certainly consider that
Our framework for distributed operations is abstract and allows
different implementations. MR is just one implementation we provide.
-Vlad
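For concreteness, a rough sketch of what such a pluggable framework could look like; the interface, class, and property names below are hypothetical, not the actual backup code:

    import org.apache.hadoop.conf.Configuration;

    // Hypothetical abstraction: backup code programs against the interface;
    // the MR-backed implementation is only one plugin among possible others.
    interface BackupCopyTask {
      // Copy source files to the target dir; how the work is distributed
      // (MR job, local threads, Spark, ...) is up to the implementation.
      int copy(Configuration conf, String[] srcFiles, String targetDir) throws Exception;
    }

    final class BackupCopyTasks {
      // Resolved reflectively from configuration, so a cluster that never
      // runs a backup never loads any MR class.
      static BackupCopyTask create(Configuration conf) throws ReflectiveOperationException {
        String impl = conf.get("hbase.backup.copy.task.class",  // hypothetical key
            "org.example.MapReduceBackupCopyTask");             // hypothetical default
        return (BackupCopyTask) Class.forName(impl).getDeclaredConstructor().newInstance();
      }
    }

The point is only that MR sits behind an interface resolved at call time, never as a hard-wired dependency of Master/RS.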
On Thu, Sep 22, 2016 at 9:38 PM, Devaraj Das
Guys, first off apologies for bringing in the topic of MR-based compactions..
But I was thinking more about the SpliceMachine approach of managing
compactions in Spark where apparently they saw a lot of benefits. Apologies for
giving you that sore throat Andrew; I really didn't mean to :-)
So
stack created HBASE-16689:
----------------------------------
Summary: Durability == ASYNC_WAL means no SYNC
Key: HBASE-16689
URL: https://issues.apache.org/jira/browse/HBASE-16689
Project: HBase
Issue Type: Bug
Stability is one thing, and another thing is the difficulty of
configuration and deployment.
For configuration, it is always a pain. I do not want to restart HMaster
many times to get things right. A standalone service would be better.
For deployment, as chenheng said above, usually we do not
You mean a standalone service which runs Procedure V2?
Not sure how much work is involved.
Is this concerning the stability of the Master where backup / restore
procedures run?
To my understanding, errors in one procedure are isolated and do not have an
adverse impact on the Master's stability.
On Thu, Sep
So what about a standalone service other than the master? You could use your
own procedure store in that service.
2016-09-23 11:28 GMT+08:00 Ted Yu :
> An earlier implementation was client driven.
>
> But with that approach, it is hard to resume if there is an error midway.
> Using
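To make the "own procedure store" idea concrete, here is a rough sketch of the shape such a service could take. Every type below is illustrative, not an actual HBase class; the real Procedure V2 machinery is considerably more involved:

    import java.util.ArrayDeque;
    import java.util.Queue;

    // Hypothetical standalone operations service with a durable store of its own.
    final class StandaloneOpsService {
      interface ProcStore {                 // durable state, e.g. WAL files on HDFS
        void recoverLease();                // fence out a crashed predecessor
        void persist(long procId, int step);
        Iterable<long[]> loadUnfinished();  // {procId, lastCompletedStep} pairs
      }

      private final ProcStore store;
      private final Queue<Runnable> work = new ArrayDeque<>();

      StandaloneOpsService(ProcStore store) {
        this.store = store;
      }

      void start() {
        store.recoverLease();
        // Resume procedures that were mid-flight when the service (or the
        // client that submitted them) died -- the resume-after-error property
        // a purely client-driven backup cannot offer.
        for (long[] p : store.loadUnfinished()) {
          work.add(() -> resume(p[0], (int) p[1]));
        }
      }

      private void resume(long procId, int lastCompletedStep) {
        // Re-execute the remaining steps, persisting each one before running it.
      }
    }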
Agreed, this would be interesting to contemplate.
On Sep 22, 2016, at 8:03 PM, Vladimir Rodionov wrote:
>>> No, never.
>
> No need for M/R here, just a simple compaction-server colocated with the RS on
> the same node.
> You save a lot on GC in the RS. Ideally, it can be IO
No, this misses Matteo's finer point, which is that "shelling out" from the
master directly to run MR is a first. Why not drive this with a utility derived
from Tool?
On Sep 22, 2016, at 7:57 PM, Vladimir Rodionov wrote:
>>> In our production cluster, it is a common case
>> If MR is not a strong dependency for Master/RS, it is OK for me.
There is no strong MR dependency for Master/RS. They will function as
usual until you try a backup; the backup will fail, but the Master won't.
-Vlad
On Thu, Sep 22, 2016 at 8:03 PM, Vladimir Rodionov
wrote:
> >> No,
>> No, never.
No need for M/R here, just a simple compaction-server colocated with the RS on
the same node.
You save a lot on GC in the RS. Ideally, it can be made IO "nice" in Linux (by
setting IO priority). But off-topic, of course :)
-Vlad
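For what it's worth, the IO-priority part is easy to sketch: a colocated worker could be started under the idle IO scheduling class via util-linux ionice. The worker class name is made up for illustration:

    import java.io.IOException;

    // Launch a (hypothetical) compaction worker under Linux's idle IO class,
    // so its disk traffic yields to the colocated RegionServer.
    final class NiceCompactionLauncher {
      static Process launch() throws IOException {
        return new ProcessBuilder(
            "ionice", "-c", "3",  // class 3 = idle: runs only when the disk is otherwise free
            "java", "-Xmx4g", "org.example.CompactionWorker")
            .inheritIO()
            .start();
      }
    }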
On Thu, Sep 22, 2016 at 7:57 PM, Vladimir Rodionov
>> In our production cluster, it is a common case that we just have HDFS and
>> HBase deployed.
>> If our Master/RS depend on the MR framework (especially for some features we
>> have not used at all), it introduces another maintenance cost. I
>> don't think it is a good idea.
So, you are not backup
If the MR framework is not deployed in the cluster, hbase still functions
normally (post merge).
In terms of build-time dependency, we have long depended on
mapreduce. Take a look at ExportSnapshot.
Cheers
On Thu, Sep 22, 2016 at 7:42 PM, Heng Chen wrote:
> In our
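ExportSnapshot is an ordinary Hadoop Tool, so it can also be driven programmatically, not just from the shell; a small example (snapshot name and destination are placeholders):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.snapshot.ExportSnapshot;
    import org.apache.hadoop.util.ToolRunner;

    public final class ExportSnapshotExample {
      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        // Same as: hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot \
        //   -snapshot my_snapshot -copy-to hdfs://backup-cluster:8020/hbase -mappers 16
        int rc = ToolRunner.run(conf, new ExportSnapshot(), new String[] {
            "-snapshot", "my_snapshot",                      // placeholder
            "-copy-to", "hdfs://backup-cluster:8020/hbase",  // placeholder
            "-mappers", "16"                                 // parallelism of the MR copy job
        });
        System.exit(rc);
      }
    }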
In our production cluster, it is a common case that we just have HDFS and
HBase deployed.
If our Master/RS depend on the MR framework (especially for some features we
have not used at all), it introduces another maintenance cost. I
don't think it is a good idea.
2016-09-23 10:28 GMT+08:00 张铎
I'm -1 on letting the master or RS launch MR jobs. It is OK that some of our
features depend on MR, but I think the bottom line is that we should launch
the jobs from outside, manually or by other services.
2016-09-23 9:47 GMT+08:00 Andrew Purtell :
> Ok, got it. Well "shelling
(Back with a sore throat.)
Also, for what it is worth, it may well be that the attempt to bolt
containers-as-executors onto YARN is too little too late, and coordination of
container-based services and applications (such as distributed map-reduce
workflows or, more likely, Spark) will be handled by
> We should also do compactions using MR (just saying :)
No, never. It's not a good idea to wed any of our core functionality to
something that evolves independently, that some of us don't have commit rights
on (and never will), and that has varying degrees of utility depending on the
deployment. Like JM says
Ok, got it. Well "shelling out" is on the line I think, so a fair question.
Can this be driven by a utility derived from Tool, like our other MR apps? The
issue is needing the AccessController to decide if it's allowed? But nothing
prevents the user from running the job manually/independently, right?
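Something like the sketch below is presumably what a "utility derived from Tool" would look like; the driver class and its argument handling are hypothetical:

    import org.apache.hadoop.conf.Configured;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.util.Tool;
    import org.apache.hadoop.util.ToolRunner;

    // A standalone driver, like our other MR apps: the user launches it from
    // the command line; nothing in Master/RS shells out to MR.
    public class BackupDriver extends Configured implements Tool {
      @Override
      public int run(String[] args) throws Exception {
        // Parse e.g. "create full <table>" and submit the MR-backed copy job.
        // Server-side AccessController checks still apply to every RPC it makes.
        System.out.println("would run backup with args: " + String.join(" ", args));
        return 0;
      }

      public static void main(String[] args) throws Exception {
        System.exit(ToolRunner.run(HBaseConfiguration.create(), new BackupDriver(), args));
      }
    }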
Once you are in the game of coordinating large-scale tasks with
distribution, fault tolerance, etc., then short of implementing a similar
framework inside HBase, MR will be the way to go. Things like exporting
snapshots, distcp, or backups (which use these) must use such a
framework.
The issue about
Not practical to do those tools without MR, JM. We should be using the right
framework for the use cases at hand. MR fits this really well.
JM, when you say "if we can do without MR, then, why not?", do you have a
framework in mind that performs/scales as well as MR? Curious.
Matteo, the Master won't spawn the job unless someone actually wants to use the
backup/restore. So I'd argue we still don't have a 'hard' dependency - it's
still much like the other tools that you consider as being outside the core.
From: Matteo Bertozzi
Well, I'm just not using those features ;) But was hoping for the MOBs ;)
My point is, if we can do it without MR, then, why not? :)
2016-09-22 19:25 GMT-04:00 Vladimir Rodionov :
> Forgot WALPlayer :)
>
> -Vlad
>
> On Thu, Sep 22, 2016 at 4:21 PM, Vladimir Rodionov
Forgot WALPlayer :)
-Vlad
On Thu, Sep 22, 2016 at 4:21 PM, Vladimir Rodionov
wrote:
> >> and backups too, but don't want to bother having to install and configure
> >> YARN just for that, as well as removing resources from HBase to give it to
>
> Any suggestions
>> and backups too, but don't want to bother having to install and configure
>> YARN just for that, as well as removing resources from HBase to give it to
Any suggestions on how to do a bulk data move with transformation from/to an
HBase cluster w/o MapReduce?
Opposition to M/R does not make sense.
My 2¢: I have a strong preference for NOT having a dependency on MR
anywhere :( I run my HBase cluster without YARN. Just HBase and HDFS. I like
all the features that we built. Would love to be able to use MOBs and
backups too, but don't want to bother having to install and configure YARN
just for
Just a remark: my query was not about tools using MR (everyone, I think, is
OK with those).
The topic was about: "are we OK with running MR jobs from Master and RS
code?", since this will be the first time we do this.
Matteo
On Thu, Sep 22, 2016 at 2:49 PM, Devaraj Das
Very much agree; for tools like ExportSnapshot / Backup / Restore, it's fine to
be dependent on MR. MR is the right framework for such. We should also do
compactions using MR (just saying :) )
From: Ted Yu
Sent: Thursday, September
[
https://issues.apache.org/jira/browse/HBASE-16687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Andrew Purtell resolved HBASE-16687.
Resolution: Duplicate
Assignee: (was: Andrew Purtell)
Fix Version/s:
I agree - backup / restore is in the same category as import / export.
On Thu, Sep 22, 2016 at 1:58 PM, Andrew Purtell
wrote:
> Backup is extra tooling around core in my opinion. Like import or export.
> Or the optional MOB tool. It's fine.
>
> > On Sep 22, 2016, at
Backup is extra tooling around core in my opinion. Like import or export. Or
the optional MOB tool. It's fine.
> On Sep 22, 2016, at 1:50 PM, Matteo Bertozzi wrote:
>
> What's the latest opinion around running MR jobs from hbase (Master or RS)?
>
> I remember in the
I would be -1 on a requirement for MR for something core to HBase.
> On Sep 22, 2016, at 1:50 PM, Matteo Bertozzi wrote:
>
> What's the latest opinion around running MR jobs from hbase (Master or RS)?
>
> I remember in the past that there was discussion about not having MR
What's the latest opinion around running MR jobs from hbase (Master or RS)?
I remember in the past that there was discussion about not having MR as a
direct dependency of hbase.
I think some of the discussion was around MOB, which had an MR job to compact
that was later transformed into a non-MR job to
Matteo Bertozzi created HBASE-16688:
----------------------------------
Summary: Split TestMasterFailoverWithProcedures
Key: HBASE-16688
URL: https://issues.apache.org/jira/browse/HBASE-16688
Project: HBase
Issue Type: Bug
Andrew Purtell created HBASE-16687:
----------------------------------
Summary: Remove MaxPermSize from surefire/failsafe command line
Key: HBASE-16687
URL: https://issues.apache.org/jira/browse/HBASE-16687
Project: HBase
Guang Yang created HBASE-16686:
----------------------------------
Summary: Add latency metrics for REST
Key: HBASE-16686
URL: https://issues.apache.org/jira/browse/HBASE-16686
Project: HBase
Issue Type: New Feature
Ted Yu created HBASE-16685:
----------------------------------
Summary: Revisit execution of SnapshotCopy in
MapReduceBackupCopyService
Key: HBASE-16685
URL: https://issues.apache.org/jira/browse/HBASE-16685
Project: HBase
Issue
Haohui Mai created HBASE-16684:
----------------------------------
Summary: The get() requests do not see locally buffered put()
requests when autoflush is disabled
Key: HBASE-16684
URL: https://issues.apache.org/jira/browse/HBASE-16684
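A small example of the reported behavior, assuming a running cluster and an existing table 't' with family 'f'; whether this is a bug or expected client semantics is what the issue is about:

    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.*;
    import org.apache.hadoop.hbase.util.Bytes;

    public class BufferedVisibility {
      public static void main(String[] args) throws Exception {
        try (Connection conn = ConnectionFactory.createConnection();
             BufferedMutator mutator = conn.getBufferedMutator(TableName.valueOf("t"));
             Table table = conn.getTable(TableName.valueOf("t"))) {
          byte[] row = Bytes.toBytes("r1");
          mutator.mutate(new Put(row)
              .addColumn(Bytes.toBytes("f"), Bytes.toBytes("q"), Bytes.toBytes("v")));
          // The put above sits in the client-side buffer: this get goes to the
          // RegionServer and comes back empty.
          System.out.println("before flush: " + table.get(new Get(row)).isEmpty()); // true
          mutator.flush();
          System.out.println("after flush:  " + table.get(new Get(row)).isEmpty()); // false
        }
      }
    }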
I'd like to see the docs proposed on HBASE-16574 integrated into our
project's documentation prior to merge.
On Thu, Sep 22, 2016 at 9:02 AM, Ted Yu wrote:
> This feature can be marked experimental due to some limitations such as
> security.
>
> Your previous round of
Ted Yu created HBASE-16683:
----------------------------------
Summary: Address review comments for backup / restore feature
Key: HBASE-16683
URL: https://issues.apache.org/jira/browse/HBASE-16683
Project: HBase
Issue Type: Bug
This feature can be marked experimental due to some limitations such as
security.
Your previous round of comments has been addressed.
The command-line tool has gone through:
HBASE-16620 Fix backup command-line tool usability issues
HBASE-16655 hbase backup describe with incorrect backup id results
On Wed, Sep 21, 2016 at 7:43 AM, Ted Yu wrote:
> Are there more (review) comments ?
>
>
Are outstanding comments addressed?
I don't see an answer to my 'is this experimental / will it be marked
experimental' question.
I ran into some issues trying to use the feature and
Build status: Successful
If successful, the website and docs have been generated. To update the live
site, follow the instructions below. If failed, skip to the bottom of this
email.
Use the following commands to download the patch and apply it to a clean branch
based on origin/asf-site. If
Appy created HBASE-16682:
Summary: Fix Shell tests failure. NoClassDefFoundError for MiniKdc
Key: HBASE-16682
URL: https://issues.apache.org/jira/browse/HBASE-16682
Project: HBase
Issue Type: Bug
Appy created HBASE-16681:
Summary: Fix flaky TestReplicationSourceManagerZkImpl
Key: HBASE-16681
URL: https://issues.apache.org/jira/browse/HBASE-16681
Project: HBase
Issue Type: Bug