Re: Hbase Meetup -- Bangalore on 19th Jan
Unfortunately, we had to postpone the above event, as we could not get enough traction to reach the minimum number of submissions. You can continue to submit; once we have enough submissions, we will organise the event. Will keep you posted. --- Mallikarjun On Sun, Dec 17, 2023 at 7:10 PM Mallikarjun wrote: > We are organising hbase-meetup in Bangalore, India on 19th Jan 2024. Reach > out to me if you want to attend. I will arrange for entry passes. > > CALL FOR SPEAKERS is open > https://sessionize.com/apache-hbase-meetup-bangalore/ > > Date: 19th Jan 2024 > Venue: Flipkart Office, Bangalore. > > --- > Mallikarjun >
Hbase Meetup -- Bangalore on 19th Jan
We are organising an HBase meetup in Bangalore, India on 19th Jan 2024. Reach out to me if you want to attend; I will arrange for entry passes.

CALL FOR SPEAKERS is open: https://sessionize.com/apache-hbase-meetup-bangalore/

Date: 19th Jan 2024
Venue: Flipkart Office, Bangalore.

---
Mallikarjun
Re: [NOTICE] Going to cut 3.0.0-alpha-4
It would be good to include these 2 features:

Parallel Backups: https://issues.apache.org/jira/browse/HBASE-26034
RSGroup Support for Backups: https://issues.apache.org/jira/browse/HBASE-26322

CC: Bryan

---
Mallikarjun

On Sat, May 20, 2023 at 7:13 PM 张铎(Duo Zhang) wrote: > HBASE-27109 has been merged, which is the last feature I want to land in > the 3.0.0 release. > > The road map is to make 3.0.0-alpha-4, which is the last alpha release for > 3.0.0, and then cut branch-3, focus on stabilizing and make two beta > releases, and finally cut branch-3.0 and make the final 3.0 release. > > Shout if you still have other things that want to land into 3.0.0 release, > as once we cut branch-3, we will hold on the committing of big features so > we can soon make the branch stable enough, and prepare the final 3.0.0 > release. > > Thanks. >
Re: [DISCUSS] Kubernetes Orchestration for ZK, HDFS, and HBase
Hi Nick, I agree with your thought that there is an increasing reliance on Kubernetes, more so for complex workloads like hbase deployments, because of the unavailability of reliable automation frameworks outside of k8s. But I have a slightly different view of how to achieve it. When I was exploring the possibilities, such as Kustomize, Helm, or an operator, I found it can get pretty complex to write extensible deployment manifests (for different kinds of deployments) with tools like Kustomize or Helm. Here is our attempt to containerise hbase with an operator --> https://github.com/flipkart-incubator/hbase-k8s-operator --- Mallikarjun On Mon, Mar 13, 2023 at 3:58 PM Nick Dimiduk wrote: > Heya team, > > Over here at $dayjob, we have an increasing reliance on Kubernetes for > both development and production workloads. Our tools are maturing and > we're hoping that they might be of interest to the wider community. > I'd like to see if there's community interest in receiving some/any of > them as a contribution. I think we'll also need a plan from ASF Infra > that makes kubernetes available to us as a project. > We have implemented a basic stack of tools for orchestrating ZK + HDFS > + HBase on Kubernetes. We use this for running a small local dev > cluster via MiniKube/KIND ; for ITBLL on smallish distributed clusters > in a public cloud ; and in production for running clusters of ~100 > Data Nodes/Region Servers in a public cloud. There was an earlier > discussion about using our donation of test hardware for running more > thorough tests in our CI, but one of the limiting factors is full > cluster deployment. I hope that the community might be interested in > receiving this tooling as a foundation for more rigorous correctness > and maybe even performance tests in the open. Furthermore, perhaps the > wider community has interest in an Apache licensed cluster > orchestration tool for other uses.
> > Now for some details: The implementation is built on Kustomize, so > it's fundamentally transparent resource specification with yaml > patches for composability; this is in contrast to a solution using > templates with defined capabilities and interfaces. There is no > operator ; it's all coordinated via init/bootstrap containers, shell > scripts, shared volumes for state, &c. For now. > Such a donation will amount to a code drop, which will have its > challenges. I'm motivated via internal processes to carve it into > smaller pieces, and I think that will benefit community review as > well. Perhaps this approach could be used to make the contribution via > a feature branch. > > Is there community interest in adding such a capability to our > maintained responsibilities? I'd hope that we have several volunteers > to work with me through the contribution process, and who are > reasonably confident that they'll be able to help maintain such a > capability going forward. We'll also need someone who can work with > Infra to get us access to Kubernetes cluster(s), via whatever means. > > What do you think? > > Thanks, > Nick & the HBase team at Apple
Re: Branching for 2.6 code line (branch-2.6)
On hbase-backup, we have been using it in production for more than 1 year. I can vouch for it being stable enough to be in a release version, so that more people can use it and polish it further. On Sun, Oct 16, 2022, 11:25 PM Andrew Purtell wrote: > My understanding is some folks evaluating and polishing TLS for their > production are also considering hbase-backup in the same way, which is why > I linked them together. If that is incorrect then they both are still worth > considering in my opinion but would have a more tenuous link. > > Where we are with hbase-backup is it should probably be ported to where > more people would be inclined to evaluate it, in order for it to make more > progress. A new minor releasing line would fit. On the other hand if it is > too unpolished then the experience would be poor. > > > > On Oct 16, 2022, at 5:35 AM, 张铎 wrote: > > > > I believe the second one is still ongoing? > > > > Andrew Purtell 于2022年10月14日周五 05:37写道: > >> > >> We will begin releasing activity for the 2.6 code line and as a > >> prerequisite to that we shall need to make a new branch branch-2.6 from > >> branch-2. > >> > >> Before we do that let's make sure all commits for the key features of > 2.6 > >> are settled in branch-2 before the branching point. Those key features > are: > >> - mTLS RPC > >> - hbase-backup backport > >> > >> -- > >> Best regards, > >> Andrew >
Blog Posts on "How we solved for Hbase MultiTenancy"
Multi Tenancy was needed because of the diverse use cases at our org, so we took hbase and made customisations on top of 2.1 to get better isolation guarantees. I have written up those customisations in the following two-part article.

Part I --> https://blog.flipkart.tech/hbase-multi-tenancy-part-i-37cad340c0fa
Part II --> https://blog.flipkart.tech/hbase-multi-tenancy-part-ii-79488c19b03d
Talk: https://www.youtube.com/watch?v=ttGI9Ma7Xos&t=26s&ab_channel=LifeAtFlipkart

Some of those customisations can be contributed back upstream with modifications. Happy to hear thoughts.

---
Mallikarjun
[jira] [Created] (HBASE-27238) Backport Backup/Restore to 2.x
Mallikarjun created HBASE-27238: --- Summary: Backport Backup/Restore to 2.x Key: HBASE-27238 URL: https://issues.apache.org/jira/browse/HBASE-27238 Project: HBase Issue Type: New Feature Components: backport, backup&restore Reporter: Mallikarjun Assignee: Mallikarjun -- This message was sent by Atlassian Jira (v8.20.10#820010)
Re: [DISCUSS] Time for 3.0.0-alpha-3
Thanks Duo I have responded to the comments. On Fri, Jun 17, 2022, 6:07 PM 张铎(Duo Zhang) wrote: > I posted some comments on the issues. PTAL. > > Thanks. > > Mallikarjun 于2022年6月17日周五 17:03写道: > > > These 2 patches would be a good addition. They still need to be reviewed. > > > > HBASE-26322 <https://issues.apache.org/jira/browse/HBASE-26322>: > > https://github.com/apache/hbase/pull/4544 > > HBASE-26034 <https://issues.apache.org/jira/browse/HBASE-26034>: > > https://github.com/apache/hbase/pull/4545 > > > > --- > > Mallikarjun > > > > > > On Thu, Jun 16, 2022 at 7:38 PM 张铎(Duo Zhang) > > wrote: > > > > > And also bump this one, feel free to reply here if you want to land > > > something in the final 3.0.0 release. > > > > > > Thanks. > > > > > > Andrew Purtell 于2022年6月13日周一 02:04写道: > > > > > > > +1, thanks for your efforts to push this forward. > > > > > > > > On Sun, Jun 12, 2022 at 7:55 AM 张铎(Duo Zhang) > > > > > wrote: > > > > > > > > > Will roll 3.0.0-alpha-3 in the next week. > > > > > > > > > > And 3.0.0-alpha-4 will be our last alpha release, after which we > will > > > cut > > > > > branch-3 and enter the feature freeze state. > > > > > > > > > > If you want something to be released in 3.0.0 please shout, and > let's > > > see > > > > > whether we can land it before the alpha4 release. For me, I would > > like > > > to > > > > > finish HBASE-27109 and HBASE-27110 before alpha4. > > > > > > > > > > Thanks. > > > > > > > > > > > > > > > > > -- > > > > Best regards, > > > > Andrew > > > > > > > > Unrest, ignorance distilled, nihilistic imbeciles - > > > > It's what we’ve earned > > > > Welcome, apocalypse, what’s taken you so long? > > > > Bring us the fitting end that we’ve been counting on > > > >- A23, Welcome, Apocalypse > > > > > > > > > >
Re: [DISCUSS] Time for 3.0.0-alpha-3
These 2 patches would be a good addition. They still need to be reviewed. HBASE-26322 <https://issues.apache.org/jira/browse/HBASE-26322>: https://github.com/apache/hbase/pull/4544 HBASE-26034 <https://issues.apache.org/jira/browse/HBASE-26034>: https://github.com/apache/hbase/pull/4545 --- Mallikarjun On Thu, Jun 16, 2022 at 7:38 PM 张铎(Duo Zhang) wrote: > And also bump this one, feel free to reply here if you want to land > something in the final 3.0.0 release. > > Thanks. > > Andrew Purtell 于2022年6月13日周一 02:04写道: > > > +1, thanks for your efforts to push this forward. > > > > On Sun, Jun 12, 2022 at 7:55 AM 张铎(Duo Zhang) > > wrote: > > > > > Will roll 3.0.0-alpha-3 in the next week. > > > > > > And 3.0.0-alpha-4 will be our last alpha release, after which we will > cut > > > branch-3 and enter the feature freeze state. > > > > > > If you want something to be released in 3.0.0 please shout, and let's > see > > > whether we can land it before the alpha4 release. For me, I would like > to > > > finish HBASE-27109 and HBASE-27110 before alpha4. > > > > > > Thanks. > > > > > > > > > -- > > Best regards, > > Andrew > > > > Unrest, ignorance distilled, nihilistic imbeciles - > > It's what we’ve earned > > Welcome, apocalypse, what’s taken you so long? > > Bring us the fitting end that we’ve been counting on > >- A23, Welcome, Apocalypse > > >
Re: [VOTE] First release candidate for hbase 3.0.0-alpha-2 is available for download
Thanks. Got the point. --- Mallikarjun On Tue, Dec 14, 2021 at 7:43 PM 张铎(Duo Zhang) wrote: > We could still add new big features before the last alpha-4 release. > > After alpha-4 is out, we will cut a branch-3 and then start to stabilize > it. For branch-3, we should treat it like branch-2. And after beta1 and > beta2, we will cut branch-3.0, to make the final 3.0.0 release. > > Thanks. > > Mallikarjun 于2021年12月14日周二 10:14写道: > > > Not sure if we can add a new feature at this time of the release cycle. > > > > Since native backup/restore will be part of hbase release for the first > > time, these 2 features would be worthwhile to be considered to be part of > > the release. > > > > 1. Add rsgroup support for Backup > > Details: https://issues.apache.org/jira/browse/HBASE-26322 > > Patch: https://github.com/apache/hbase/pull/3726 > > > > 2. Add support to take parallel backups > > Details: https://issues.apache.org/jira/browse/HBASE-26034 > > Patch: https://github.com/apache/hbase/pull/3766 > > > > > > > > --- > > Mallikarjun > > > > > > On Tue, Dec 14, 2021 at 5:45 AM 张铎(Duo Zhang) > > wrote: > > > > > Thanks Josh! > > > > > > Will make a new RC1 soon. > > > > > > Josh Elser 于2021年12月14日周二 04:57写道: > > > > > > > -1 (binding) > > > > > > > > Log4j2 CVE mitigation is ineffective due an incorrect `export` in > > > > bin/hbase-config.sh. Appears that HBASE-26557 tried to add the > > > > mitigation to HBASE_OPTS but added spaces around either side of the > > > > equals sign, e.g. `export HBASE_OPTS = ".."`, which is invalid > syntax. 
> > > > > > > > > > > > > > > > $ ./bin/start-hbase.sh > > > > > > > > > > /Users/jelser/hbase300alpha2rc0/hbase300/hbase-3.0.0-alpha-2/bin/hbase-config.sh: > > > > > > > > line 167: export: `=': not a valid identifier > > > > > > > > > > /Users/jelser/hbase300alpha2rc0/hbase300/hbase-3.0.0-alpha-2/bin/hbase-config.sh: > > > > > > > > line 167: export: ` -Dlog4j2.formatMsgNoLookups=true': not a valid > > > > identifier > > > > > > > > > > /Users/jelser/hbase300alpha2rc0/hbase300/hbase-3.0.0-alpha-2/bin/hbase-config.sh: > > > > > > > > line 167: export: `=': not a valid identifier > > > > > > > > > > /Users/jelser/hbase300alpha2rc0/hbase300/hbase-3.0.0-alpha-2/bin/hbase-config.sh: > > > > > > > > line 167: export: ` -Dlog4j2.formatMsgNoLookups=true': not a valid > > > > identifier > > > > > > > > > > > > More naively, and just in plain bash: > > > > > > > > bash-5.1$ export FOO = "$FOO bar" > > > > bash: export: `=': not a valid identifier > > > > bash: export: ` bar': not a valid identifier > > > > bash-5.1$ echo $FOO > > > > > > > > > > > > I'll post a PR to fix after sending this. > > > > > > > > The good: > > > > * xsums and sigs were OK > > > > * Was able to run most unit tests locally > > > > * Was able to launch using bin tarball > > > > * Everything else looks great so far > > > > > > > > - Josh > > > > > > > > On 12/11/21 11:34 AM, Duo Zhang wrote: > > > > > Please vote on this Apache hbase release candidate, > > > > > hbase-3.0.0-alpha-2RC0 > > > > > > > > > > The VOTE will remain open for at least 72 hours. > > > > > > > > > > [ ] +1 Release this package as Apache hbase 3.0.0-alpha-2 > > > > > [ ] -1 Do not release this package because ... 
> > > > > > > > > > The tag to be voted on is 3.0.0-alpha-2RC0: > > > > > > > > > >https://github.com/apache/hbase/tree/3.0.0-alpha-2RC0 > > > > > > > > > > This tag currently points to git reference > > > > > > > > > >8bca21b47d7c809a0940aea8ed12dd4d2af12432 > > > > > > > > > > The release files, including signatures, digests, as well as > > CHANGES.md > > > > > and RELEASENOTES.md included in this RC can be found at: > > > > > > > > > >https://dist.apache.org/repos/dist/dev/hbase/3.0.0-alpha-2RC0/ > > > >
Re: [VOTE] First release candidate for hbase 3.0.0-alpha-2 is available for download
I am not sure if we can add a new feature at this point in the release cycle. But since native backup/restore will be part of an hbase release for the first time, these 2 features would be worth considering for the release. 1. Add rsgroup support for Backup Details: https://issues.apache.org/jira/browse/HBASE-26322 Patch: https://github.com/apache/hbase/pull/3726 2. Add support to take parallel backups Details: https://issues.apache.org/jira/browse/HBASE-26034 Patch: https://github.com/apache/hbase/pull/3766 --- Mallikarjun On Tue, Dec 14, 2021 at 5:45 AM 张铎(Duo Zhang) wrote: > Thanks Josh! > > Will make a new RC1 soon. > > Josh Elser 于2021年12月14日周二 04:57写道: > > > -1 (binding) > > > > Log4j2 CVE mitigation is ineffective due an incorrect `export` in > > bin/hbase-config.sh. Appears that HBASE-26557 tried to add the > > mitigation to HBASE_OPTS but added spaces around either side of the > > equals sign, e.g. `export HBASE_OPTS = ".."`, which is invalid syntax. > > > > > > > > $ ./bin/start-hbase.sh > > > /Users/jelser/hbase300alpha2rc0/hbase300/hbase-3.0.0-alpha-2/bin/hbase-config.sh: > > > > line 167: export: `=': not a valid identifier > > > /Users/jelser/hbase300alpha2rc0/hbase300/hbase-3.0.0-alpha-2/bin/hbase-config.sh: > > > > line 167: export: ` -Dlog4j2.formatMsgNoLookups=true': not a valid > > identifier > > > /Users/jelser/hbase300alpha2rc0/hbase300/hbase-3.0.0-alpha-2/bin/hbase-config.sh: > > > > line 167: export: `=': not a valid identifier > > > /Users/jelser/hbase300alpha2rc0/hbase300/hbase-3.0.0-alpha-2/bin/hbase-config.sh: > > > > line 167: export: ` -Dlog4j2.formatMsgNoLookups=true': not a valid > > identifier > > > > > > More naively, and just in plain bash: > > > > bash-5.1$ export FOO = "$FOO bar" > > bash: export: `=': not a valid identifier > > bash: export: ` bar': not a valid identifier > > bash-5.1$ echo $FOO > > > > > > I'll post a PR to fix after sending this.
> > > > The good: > > * xsums and sigs were OK > > * Was able to run most unit tests locally > > * Was able to launch using bin tarball > > * Everything else looks great so far > > > > - Josh > > > > On 12/11/21 11:34 AM, Duo Zhang wrote: > > > Please vote on this Apache hbase release candidate, > > > hbase-3.0.0-alpha-2RC0 > > > > > > The VOTE will remain open for at least 72 hours. > > > > > > [ ] +1 Release this package as Apache hbase 3.0.0-alpha-2 > > > [ ] -1 Do not release this package because ... > > > > > > The tag to be voted on is 3.0.0-alpha-2RC0: > > > > > >https://github.com/apache/hbase/tree/3.0.0-alpha-2RC0 > > > > > > This tag currently points to git reference > > > > > >8bca21b47d7c809a0940aea8ed12dd4d2af12432 > > > > > > The release files, including signatures, digests, as well as CHANGES.md > > > and RELEASENOTES.md included in this RC can be found at: > > > > > >https://dist.apache.org/repos/dist/dev/hbase/3.0.0-alpha-2RC0/ > > > > > > Maven artifacts are available in a staging repository at: > > > > > > > > https://repository.apache.org/content/repositories/orgapachehbase-1472/ > > > > > > Artifacts were signed with the 9AD2AE49 key which can be found in: > > > > > >https://downloads.apache.org/hbase/KEYS > > > > > > 3.0.0-alpha-2 is the second alpha release for our 3.0.0 major release > > line. > > > HBase 3.0.0 includes the following big feature/changes: > > >Synchronous Replication > > >OpenTelemetry Tracing > > >Distributed MOB Compaction > > >Backup and Restore > > >Move RSGroup balancer to core > > >Reimplement sync client on async client > > >CPEPs on shaded proto > > >Move the logging framework from log4j to log4j2 > > > > > > 3.0.0-alpha-2 contains a critical security fix for addressing the > log4j2 > > > CVE-2021-44228. All users who already use 3.0.0-alpha-1 should upgrade > > > to 3.0.0-alpha-2 ASAP. > > > > > > Notice that this is not a production ready release. 
It is used to let > our > > > users try and test the new major release, to get feedback before the > > final > > > GA release is out. > > > So please do NOT use it in production. Just try it and report back > > > everything you find unusual. > > > > > > And this time we will not include CHANGES.md and RELEASENOTE.md > > > in our source code, you can find it on the download site. For getting > > these > > > two files for old releases, please go to > > > > > >https://archive.apache.org/dist/hbase/ > > > > > > To learn more about Apache hbase, please see > > > > > >http://hbase.apache.org/ > > > > > > Thanks, > > > Your HBase Release Manager > > > > > >
Re: Need code review for a couple of Backup/Restore related Pull Requests
Bump. --- Mallikarjun On Tue, Nov 2, 2021 at 4:46 PM Mallikarjun wrote: > Hi, > > I have a couple of pull requests on Backup/Restore. Appreciate if someone > could have a look at them. > > https://github.com/apache/hbase/pull/3766 > https://github.com/apache/hbase/pull/3726 > > I have put down the context in respective Jiras. Please ask if any clarity > is needed. > > --- > Mallikarjun >
Need code review for a couple of Backup/Restore related Pull Requests
Hi,

I have a couple of pull requests on Backup/Restore. I would appreciate it if someone could have a look at them.

https://github.com/apache/hbase/pull/3766
https://github.com/apache/hbase/pull/3726

I have put down the context in the respective Jiras. Please ask if any clarification is needed.

---
Mallikarjun
[jira] [Created] (HBASE-26346) Design support for rsgroup data isolation
Mallikarjun created HBASE-26346: --- Summary: Design support for rsgroup data isolation Key: HBASE-26346 URL: https://issues.apache.org/jira/browse/HBASE-26346 Project: HBase Issue Type: New Feature Components: rsgroup Reporter: Mallikarjun Assignee: Mallikarjun TODO -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HBASE-26343) Extend RSGroup to support data isolation to achieve true multitenancy in Hbase
Mallikarjun created HBASE-26343: --- Summary: Extend RSGroup to support data isolation to achieve true multitenancy in Hbase Key: HBASE-26343 URL: https://issues.apache.org/jira/browse/HBASE-26343 Project: HBase Issue Type: Umbrella Components: rsgroup Reporter: Mallikarjun RSGroups currently provide isolation only at the serving layer, not at the data layer. There is a need to provide data isolation between rsgroups to achieve true multitenancy in hbase, allowing individual rsgroups to be scaled independently as needed. Some of the aspects to be covered in this umbrella project are # Provide data isolation between different RSGroups # Add balancer support to understand this construct during various balancing activities # Extend support to various ancillary services such as export snapshot, cluster replication, etc -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HBASE-26322) Add rsgroup support for Backup
Mallikarjun created HBASE-26322: --- Summary: Add rsgroup support for Backup Key: HBASE-26322 URL: https://issues.apache.org/jira/browse/HBASE-26322 Project: HBase Issue Type: Improvement Components: backup&restore Affects Versions: 3.0.0-alpha-1 Reporter: Mallikarjun Assignee: Mallikarjun Fix For: 3.0.0-alpha-2 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HBASE-26279) Merger of backup:system table with hbase:meta table
Mallikarjun created HBASE-26279: --- Summary: Merger of backup:system table with hbase:meta table Key: HBASE-26279 URL: https://issues.apache.org/jira/browse/HBASE-26279 Project: HBase Issue Type: Improvement Components: backup&restore Reporter: Mallikarjun Assignee: Mallikarjun To Be filled -- This message was sent by Atlassian Jira (v8.3.4#803005)
Re: Stuck Serial replication -- Need suggestions on recovery
Thanks for answering the queries. --- Mallikarjun On Wed, Aug 18, 2021 at 9:32 AM 张铎(Duo Zhang) wrote: > In hbase, the mvcc write number and wal sequence id are the same > thing, so when we just want to bump the mvcc number, we will not write > an actual WAL but the sequence id will be increased. > > And I think it will be good to have a MR job to replicate WAL serially. > > Mallikarjun 于2021年8月18日周三 上午10:50写道: > > > > Inline Reply > > > > On Wed, Aug 18, 2021 at 8:06 AM 张铎(Duo Zhang) > wrote: > > > > > Mallikarjun 于2021年8月18日周三 上午10:19写道: > > > > > > > > Thanks for the response @Duo > > > > > > > > Inline reply > > > > > > > > On Wed, Aug 18, 2021 at 7:37 AM 张铎(Duo Zhang) > > > > wrote: > > > > > > > > > This is the isRangeFinished method > > > > > > > > > > private boolean isRangeFinished(long endBarrier, String > > > > > encodedRegionName) throws IOException { > > > > > long pushedSeqId; > > > > > try { > > > > > pushedSeqId = storage.getLastSequenceId(encodedRegionName, > > > peerId); > > > > > } catch (ReplicationException e) { > > > > > throw new IOException( > > > > > "Failed to get pushed sequence id for " + > encodedRegionName + > > > > > ", peer " + peerId, e); > > > > > } > > > > > // endBarrier is the open sequence number. When opening a > region, > > > > > the open sequence number will > > > > > // be set to the old max sequence id plus one, so here we need > to > > > > > minus one. > > > > > return pushedSeqId >= endBarrier - 1; > > > > > } > > > > > > > > > > So for this region > > > > > > > > > > rs-9 24c765b42253f96b550831d83e99cc9e 18775105 18762209 [17776286, > +++ > > > > > 18762210, 18775053, 18775079, 18775104, -- 18775119] > > > > > > > > > > We have already finished the first range [17776286, 18762210), but > > > > > then we jump directly to range [18775053, 18775079), so the problem > > > > > here is where is the [18762210, 18775053)... > > > > > > > > > > > > > These sequence ID's are present in WAL's which are not cleaned up. 
> (in > > > > OLDWALs) > > > > > > > > Related Question: Is it allowed to have gaps in sequence IDs in WAL's > > > for a > > > > single region? > > > > Example: for Region: *24c765b42253f96b550831d83e99cc9e*, if > sequence ID > > > > *18775105* is present, can I expect *18775106 *is mandatory to be > > > present? > > > > or there can be gaps. > > > Inside a range it is allowed to have gaps, but when reopening a > > > region, we need to make sure there are no gaps otherwise the > > > replication will be stuck. > > > > > > > Just curious to know, what are some scenarios which can lead to gaps? > From > > my small number of experiments It was consecutive in nature, I did not > find > > such gap scenarios. > > > > > > > > > > > > > > > > > > > > > > And on the fix, you can clear the range information for the given > > > > > regions in meta table, and then restart the clusters, I think the > > > > > replication could continue. > > > > > > > > > > > > > > If you mean removing some barriers so that replication is unblocked. > > > > Doesn't it lead to *out of order events *replicated end up in > > > > corrupting data? > > > Yes, the replication will be out of order. But this is the easier way > > > to recover the replication. > > > If you still want to obtain the order, then you need to find out the > > > root cause of my question, where is the WAL for the missing ranges. Is > > > it because we have already replicated the data but do not mark the > > > range as finished, or we just lose the WAL data for the range? > > > > > > > Current scenario occurred in Active Passive cluster setup and replication > > is stuck on the Passive side. So I won't be able to answer following > > question > > > > we have already replicated the data > > > > > > But the following comment can help me find data from
Re: Stuck Serial replication -- Need suggestions on recovery
Inline Reply On Wed, Aug 18, 2021 at 8:06 AM 张铎(Duo Zhang) wrote: > Mallikarjun 于2021年8月18日周三 上午10:19写道: > > > > Thanks for the response @Duo > > > > Inline reply > > > > On Wed, Aug 18, 2021 at 7:37 AM 张铎(Duo Zhang) > wrote: > > > > > This is the isRangeFinished method > > > > > > private boolean isRangeFinished(long endBarrier, String > > > encodedRegionName) throws IOException { > > > long pushedSeqId; > > > try { > > > pushedSeqId = storage.getLastSequenceId(encodedRegionName, > peerId); > > > } catch (ReplicationException e) { > > > throw new IOException( > > > "Failed to get pushed sequence id for " + encodedRegionName + > > > ", peer " + peerId, e); > > > } > > > // endBarrier is the open sequence number. When opening a region, > > > the open sequence number will > > > // be set to the old max sequence id plus one, so here we need to > > > minus one. > > > return pushedSeqId >= endBarrier - 1; > > > } > > > > > > So for this region > > > > > > rs-9 24c765b42253f96b550831d83e99cc9e 18775105 18762209 [17776286, +++ > > > 18762210, 18775053, 18775079, 18775104, -- 18775119] > > > > > > We have already finished the first range [17776286, 18762210), but > > > then we jump directly to range [18775053, 18775079), so the problem > > > here is where is the [18762210, 18775053)... > > > > > > > These sequence ID's are present in WAL's which are not cleaned up. (in > > OLDWALs) > > > > Related Question: Is it allowed to have gaps in sequence IDs in WAL's > for a > > single region? > > Example: for Region: *24c765b42253f96b550831d83e99cc9e*, if sequence ID > > *18775105* is present, can I expect *18775106 *is mandatory to be > present? > > or there can be gaps. > Inside a range it is allowed to have gaps, but when reopening a > region, we need to make sure there are no gaps otherwise the > replication will be stuck. > Just curious to know, what are some scenarios which can lead to gaps? 
From my small number of experiments, it was consecutive in nature; I did not find such gap scenarios. > > > > > > > > > > And on the fix, you can clear the range information for the given > > > regions in meta table, and then restart the clusters, I think the > > > replication could continue. > > > > > > > > If you mean removing some barriers so that replication is unblocked. > > Doesn't it lead to *out of order events *replicated end up in > > corrupting data? > Yes, the replication will be out of order. But this is the easier way > to recover the replication. > If you still want to obtain the order, then you need to find out the > root cause of my question, where is the WAL for the missing ranges. Is > it because we have already replicated the data but do not mark the > range as finished, or we just lose the WAL data for the range? > The current scenario occurred in an Active-Passive cluster setup, and replication is stuck on the Passive side, so I won't be able to answer the following question: we have already replicated the data But the following comment can help me check, from the oldWALs, whether the last sequence id is present before region movement. but when reopening a region, we need to make sure there are no gaps > Alternatively, do you think it is a good idea to write a job similar to *RecoveredReplicationSource* which ensures serial replication to the other cluster, outside of the hbase cluster?
> > > > Col 1: Region server on which region is trying to replicate > > > > Col 2: Region trying to replicate but stuck > > > > Col 3: SequenceID which is being replicated and stuck because > previous > > > > range is not finished > > > > Col 4: Checkpoint in zk until which sequence id is already > replicated to > > > > peer > > > > Col 5: Replication barriers for that region. This is a list of open > > > > sequence IDs on region movement. (+++ means where *checkpoint* > belongs,
Re: Stuck Serial replication -- Need suggestions on recovery
Thanks for the response @Duo Inline reply On Wed, Aug 18, 2021 at 7:37 AM 张铎(Duo Zhang) wrote: > This is the isRangeFinished method > > private boolean isRangeFinished(long endBarrier, String > encodedRegionName) throws IOException { > long pushedSeqId; > try { > pushedSeqId = storage.getLastSequenceId(encodedRegionName, peerId); > } catch (ReplicationException e) { > throw new IOException( > "Failed to get pushed sequence id for " + encodedRegionName + > ", peer " + peerId, e); > } > // endBarrier is the open sequence number. When opening a region, > the open sequence number will > // be set to the old max sequence id plus one, so here we need to > minus one. > return pushedSeqId >= endBarrier - 1; > } > > So for this region > > rs-9 24c765b42253f96b550831d83e99cc9e 18775105 18762209 [17776286, +++ > 18762210, 18775053, 18775079, 18775104, -- 18775119] > > We have already finished the first range [17776286, 18762210), but > then we jump directly to range [18775053, 18775079), so the problem > here is where is the [18762210, 18775053)... > These sequence IDs are present in WALs which have not been cleaned up (in OLDWALs). Related Question: Is it allowed to have gaps in sequence IDs in WALs for a single region? Example: for Region: *24c765b42253f96b550831d83e99cc9e*, if sequence ID *18775105* is present, can I expect that *18775106* must also be present, or can there be gaps? > > And on the fix, you can clear the range information for the given > regions in meta table, and then restart the clusters, I think the > replication could continue. > > If you mean removing some barriers so that replication is unblocked, doesn't it lead to *out of order events* being replicated, ending up corrupting data? > Mallikarjun 于2021年8月17日周二 下午3:04写道: > > > > I have got into the following scenario. I won't go into details of how I > > got here, since I am not able to reliably reproduce this scenario thus > far.
> > (Typically happens when some rs goes down because of hardware issues) > > > > Let me explain to you the following details. > > Col 1: Region server on which region is trying to replicate > > Col 2: Region trying to replicate but stuck > > Col 3: SequenceID which is being replicated and stuck because previous > > range is not finished > > Col 4: Checkpoint in zk until which sequence id is already replicated to > > peer > > Col 5: Replication barriers for that region. This is a list of open > > sequence IDs on region movement. (+++ means where *checkpoint* belongs, > --- > > is where *to replicate seqid* belongs) > > > > There are in total 53 regions and 10 regionservers > > > > RegionServer Region Trying to replicate sequenceID Replicated until > Current > > Barriers > > rs-9 24c765b42253f96b550831d83e99cc9e 18775105 18762209 [17776286, +++ > > 18762210, 18775053, 18775079, 18775104, -- 18775119] > > rs-5 b4144bfe75c5826710ec54849741b038 189154192 189091221 [184183678, +++ > > 189117430, 189154191, -- 189154327] > > rs-8 deb6fee3380e7b9db9826cb5f27f8a59 189099509 189036510 [180662218, +++ > > 189062798, 189099508, -- 189099587] > > rs-8 3338fd34ae7ba06a7eccd89048fa83ce 189078951 189077722 [184170310, +++ > > 189078876, 189078950, -- 189104780, 189141509, 189141545, 189141595] > > rs-6 1af22c68b9212971ab2570e14b7b0dc2 183301002 183265047 [180239864, +++ > > 183265048, 183270357, 183277363, 183300886, 183301001, -- 183301062] > > rs-10 1af22c68b9212971ab2570e14b7b0dc2 183301063 183265047 [180239864, > +++ > > 183265048, 183270357, 183277363, 183300886, 183301001, 183301062 --] > > rs-6 4b9e98c7eca7a24c74136de1aa8aeab0 189027036 189022619 [189022618, +++ > > 189027035, 189085155, 189085241, 189085290] > > rs-4 e45ba292df95edbdf884e2ec50cf5f16 189099081 189062191 [184126535, +++ > > 189098947, 189099080, -- 189099226] > > rs-4 83e65729dcad644738a0a3cee994e2df 189012454 189012365 [184103269, +++ > > 189012453, -- 189012538, 189074967, 189075016, 189075294, 189075349] 
> > rs-10 83e65729dcad644738a0a3cee994e2df 189012539 189012365 [184103269, > +++ > > 189012453, 189012538, -- 189074967, 189075016, 189075294, 189075349] > > rs-3 11fca95de4878782af53371a25cf44d0 189121426 189058129 [180684344, +++ > > 189084916, 189121283, 189121425, -- 189121602] > > rs-3 b9db001578e127740d7e0e186e4fbab6 189145458 189081436 [184175242, +++ > > 189083026, 189145417, 189145457, -- 189145562, 189145723, 189145781] > > rs-2 262ca9ff7b878f32c451fac3eb430a88 189128535 189065879 [184159187, +++ > > 189091684, 189128534, -- 189128708] >
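[Editor's note] To make the quoted check concrete, here is a minimal, self-contained sketch of the range-finished test discussed in this thread, plugged with the rs-9 numbers above. The class name is hypothetical; the real logic lives inside HBase's serial replication code, not in a standalone class like this.

```java
// Standalone sketch of the serial-replication range check quoted above.
// Numbers come from the rs-9 row in this thread.
public class RangeCheckSketch {
    // endBarrier is the open sequence number of the next range, so the last
    // sequence id belonging to the current range is endBarrier - 1.
    static boolean isRangeFinished(long pushedSeqId, long endBarrier) {
        return pushedSeqId >= endBarrier - 1;
    }

    public static void main(String[] args) {
        long pushedSeqId = 18762209L; // checkpoint in zk (Col 4)
        long[] barriers = {17776286L, 18762210L, 18775053L, 18775079L, 18775104L, 18775119L};
        // First range [17776286, 18762210) is finished: 18762209 >= 18762210 - 1.
        System.out.println(isRangeFinished(pushedSeqId, barriers[1])); // true
        // Second range [18762210, 18775053) is NOT finished, which is what
        // blocks sequence id 18775105 (Col 3) from replicating.
        System.out.println(isRangeFinished(pushedSeqId, barriers[2])); // false
    }
}
```

This is why clearing barrier ranges only helps if the missing [18762210, 18775053) edits are genuinely accounted for; the check itself is a simple comparison against the checkpoint.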
Stuck Serial replication -- Need suggestions on recovery
I have got into the following scenario. I won't go into details of how I got here, since I am not able to reliably reproduce this scenario thus far. (Typically happens when some rs goes down because of hardware issues) Let me explain to you the following details. Col 1: Region server on which region is trying to replicate Col 2: Region trying to replicate but stuck Col 3: SequenceID which is being replicated and stuck because previous range is not finished Col 4: Checkpoint in zk until which sequence id is already replicated to peer Col 5: Replication barriers for that region. This is a list of open sequence IDs on region movement. (+++ means where *checkpoint* belongs, --- is where *to replicate seqid* belongs) There are in total 53 regions and 10 regionservers RegionServer Region Trying to replicate sequenceID Replicated until Current Barriers rs-9 24c765b42253f96b550831d83e99cc9e 18775105 18762209 [17776286, +++ 18762210, 18775053, 18775079, 18775104, -- 18775119] rs-5 b4144bfe75c5826710ec54849741b038 189154192 189091221 [184183678, +++ 189117430, 189154191, -- 189154327] rs-8 deb6fee3380e7b9db9826cb5f27f8a59 189099509 189036510 [180662218, +++ 189062798, 189099508, -- 189099587] rs-8 3338fd34ae7ba06a7eccd89048fa83ce 189078951 189077722 [184170310, +++ 189078876, 189078950, -- 189104780, 189141509, 189141545, 189141595] rs-6 1af22c68b9212971ab2570e14b7b0dc2 183301002 183265047 [180239864, +++ 183265048, 183270357, 183277363, 183300886, 183301001, -- 183301062] rs-10 1af22c68b9212971ab2570e14b7b0dc2 183301063 183265047 [180239864, +++ 183265048, 183270357, 183277363, 183300886, 183301001, 183301062 --] rs-6 4b9e98c7eca7a24c74136de1aa8aeab0 189027036 189022619 [189022618, +++ 189027035, 189085155, 189085241, 189085290] rs-4 e45ba292df95edbdf884e2ec50cf5f16 189099081 189062191 [184126535, +++ 189098947, 189099080, -- 189099226] rs-4 83e65729dcad644738a0a3cee994e2df 189012454 189012365 [184103269, +++ 189012453, -- 189012538, 189074967, 189075016, 189075294, 
189075349] rs-10 83e65729dcad644738a0a3cee994e2df 189012539 189012365 [184103269, +++ 189012453, 189012538, -- 189074967, 189075016, 189075294, 189075349] rs-3 11fca95de4878782af53371a25cf44d0 189121426 189058129 [180684344, +++ 189084916, 189121283, 189121425, -- 189121602] rs-3 b9db001578e127740d7e0e186e4fbab6 189145458 189081436 [184175242, +++ 189083026, 189145417, 189145457, -- 189145562, 189145723, 189145781] rs-2 262ca9ff7b878f32c451fac3eb430a88 189128535 189065879 [184159187, +++ 189091684, 189128534, -- 189128708] rs-2 03a1eb906a344944aad727dbb8210cfc 172392082 172390331 [167737983, +++ 172392081, -- 172400093, 172446121, 172446172] rs-10 ae2726c7b4eeec3f93336d71e80145a4 189027430 189026939 [184119428, +++ 189027429, -- 189053118, 189089933, 189089995, 189090059] rs-10 770ba4f4568fff803e6df340b2ffe486 189034144 189032879 [184127026, +++ 189034143, --189048295, 189059834, 189096413, 189096513, 189096548, 189096606] rs-1 5846f4ce8acdd5aabf325c847d18c729 18793501 18780639 [18778549, +++ 18784783, 18793471, 18793484, 18793500 --] rs-1 5846f4ce8acdd5aabf325c847d18c729 18793472 18780639 [18778549, +++ 18784783, 18793471, --- 18793484, 18793500] rs-1 fabd3ea591d5f20a86a26f8767d34f63 189028498 189024357 [184116531, +++ 189025318, 189028497, --- 189051176, 189087488, 189087737, 189087850] rs-1 335d855c5005343719ea73bcb7dcb269 189064849 189037338 [184130122, +++ 189064848, --- 189101485, 189101698, 189101774] My question is, how do I recover from here? Any suggestions. Only thought is that I have to replay by writing some MR jobs / some scripts to read and replay selectively and update checkpoints. --- Mallikarjun
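[Editor's note] The +++ and -- positions in the table above can be derived mechanically from the barrier list. A small sketch (hypothetical helper, not HBase code) that locates which barrier range a given sequence id falls into:

```java
import java.util.Arrays;

// Illustrates the table's +++/-- markers: given the sorted barrier list
// (open sequence numbers) for a region, find the index i of the range
// [barriers[i], barriers[i+1]) that a sequence id falls into.
public class BarrierLocator {
    static int rangeIndex(long[] barriers, long seqId) {
        int pos = Arrays.binarySearch(barriers, seqId);
        if (pos >= 0) {
            return pos; // seqId is exactly an open sequence number: it starts range pos
        }
        int insertionPoint = -pos - 1; // index of first barrier greater than seqId
        return Math.max(0, insertionPoint - 1);
    }

    public static void main(String[] args) {
        // Barriers of region 24c765b42253f96b550831d83e99cc9e (rs-9 row).
        long[] barriers = {17776286L, 18762210L, 18775053L, 18775079L, 18775104L, 18775119L};
        // Checkpoint (Col 4) 18762209 sits in range 0 -> the +++ marker.
        System.out.println(rangeIndex(barriers, 18762209L)); // 0
        // Stuck sequence id (Col 3) 18775105 sits in range 4 -> the -- marker.
        System.out.println(rangeIndex(barriers, 18775105L)); // 4
        // Serial replication cannot ship range 4 until ranges 0..3 are finished.
    }
}
```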
[jira] [Created] (HBASE-26203) Minor cleanups to reduce checkstyle warnings on backup code
Mallikarjun created HBASE-26203: --- Summary: Minor cleanups to reduce checkstyle warnings on backup code Key: HBASE-26203 URL: https://issues.apache.org/jira/browse/HBASE-26203 Project: HBase Issue Type: Improvement Components: backup&restore Affects Versions: 3.0.0-alpha-2 Reporter: Mallikarjun Assignee: Mallikarjun Fix For: 3.0.0-alpha-2 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HBASE-26202) Review deprecated WALProcedureStore usage in Backup
Mallikarjun created HBASE-26202: --- Summary: Review deprecated WALProcedureStore usage in Backup Key: HBASE-26202 URL: https://issues.apache.org/jira/browse/HBASE-26202 Project: HBase Issue Type: Improvement Components: backup&restore Affects Versions: 3.0.0-alpha-2 Reporter: Mallikarjun Assignee: Mallikarjun Fix For: 3.0.0-alpha-2 `WALProcedureStore` stands deprecated. Review its usage in Backup/Restore -- This message was sent by Atlassian Jira (v8.3.4#803005)
Re: [DISCUSS] Hbase Backup design changes
Thanks Duo. --- Mallikarjun On Sun, Jul 25, 2021 at 7:32 PM 张铎(Duo Zhang) wrote: > Replied on jira. Please give more details about what you are doing in the > PR... > > Thanks. > > Mallikarjun 于2021年7月25日周日 上午10:48写道: > > > > Can someone review this pull request? > > https://github.com/apache/hbase/pull/3359 > > > > This change changes meta information for backup, if not part of hbase > > 3.0.0. It might have a lot of additional work to be put into executing > the > > above mentioned plan. > > > > --- > > Mallikarjun > > > > > > On Thu, Feb 11, 2021 at 5:36 PM Mallikarjun > > wrote: > > > > > Slight modification to previous version --> https://ibb.co/Nttx3J1 > > > > > > --- > > > Mallikarjun > > > > > > > > > On Thu, Feb 11, 2021 at 8:12 AM Mallikarjun > > > wrote: > > > > > >> Inline Reply > > >> > > >> On Wed, Feb 3, 2021 at 6:44 AM Sean Busbey wrote: > > >> > > >>> Hi Mallikarjun, > > >>> > > >>> Those goals sound worthwhile. > > >>> > > >>> Do you have a flow chart similar to the one you posted for the > current > > >>> system but for the proposed solution? > > >>> > > >> > > >> This is what I am thinking --> https://ibb.co/KmH6Cwv > > >> > > >> > > >>> > > >>> How much will we need to change our existing test coverage to > accommodate > > >>> the proposed solution? > > >>> > > >> > > >> Of the 38 tests, it looks like we might have to change a couple only. > > >> Will have to add more tests to cover parallel backup scenarios. > > >> > > >> > > >>> > > >>> How much will we need to update the existing reference guide section? > > >>> > > >>> > > >> Probably nothing. Interface as such will not change. > > >> > > >> > > >>> > > >>> On Sun, Jan 31, 2021, 04:59 Mallikarjun > > >>> wrote: > > >>> > > >>> > Bringing up this thread. > > >>> > > > >>> > On Mon, Jan 25, 2021, 3:38 PM Viraj Jasani > wrote: > > >>> > > > >>> > > Thanks, the image is visible now. 
> > >>> > > > > >>> > > > Since I wanted to open this for discussion, did not consider > > >>> placing it > > >>> > > in > > >>> > > *hbase/dev_support/design-docs*. > > >>> > > > > >>> > > Definitely, only after we come to concrete conclusion with the > > >>> reviewer, > > >>> > we > > >>> > > should open up a PR. Until then this thread is anyways up for > > >>> discussion. > > >>> > > > > >>> > > > > >>> > > On Mon, 25 Jan 2021 at 1:58 PM, Mallikarjun < > > >>> mallik.v.ar...@gmail.com> > > >>> > > wrote: > > >>> > > > > >>> > > > Hope this link works --> https://ibb.co/hYjRpgP > > >>> > > > > > >>> > > > Inline reply > > >>> > > > On Mon, Jan 25, 2021 at 1:16 PM Viraj Jasani < > vjas...@apache.org> > > >>> > wrote: > > >>> > > > > > >>> > > > > Hi, > > >>> > > > > > > >>> > > > > Still not available :) > > >>> > > > > The attachments don’t work on mailing lists. You can try > > >>> uploading > > >>> > the > > >>> > > > > attachment on some public hosting site and provide the url > to the > > >>> > same > > >>> > > > > here. > > >>> > > > > > > >>> > > > > Since I am not aware of the contents, I cannot confirm right > > >>> away but > > >>> > > if > > >>> > > > > the reviewer feels we should have the attachment on our > github > > >>> repo: > > >>> > > > > hbase/dev-support/design-docs , good to upload the content > there > > >>> > later. > > >>> > > > For > > >>
Re: [DISCUSS] Hbase Backup design changes
Can someone review this pull request? https://github.com/apache/hbase/pull/3359 This change modifies the backup meta information, so if it is not part of hbase 3.0.0, it might take a lot of additional work to execute the above mentioned plan. --- Mallikarjun On Thu, Feb 11, 2021 at 5:36 PM Mallikarjun wrote: > Slight modification to previous version --> https://ibb.co/Nttx3J1 > > --- > Mallikarjun > > > On Thu, Feb 11, 2021 at 8:12 AM Mallikarjun > wrote: > >> Inline Reply >> >> On Wed, Feb 3, 2021 at 6:44 AM Sean Busbey wrote: >> >>> Hi Mallikarjun, >>> >>> Those goals sound worthwhile. >>> >>> Do you have a flow chart similar to the one you posted for the current >>> system but for the proposed solution? >>> >> >> This is what I am thinking --> https://ibb.co/KmH6Cwv >> >> >>> >>> How much will we need to change our existing test coverage to accommodate >>> the proposed solution? >>> >> >> Of the 38 tests, it looks like we might have to change a couple only. >> Will have to add more tests to cover parallel backup scenarios. >> >> >>> >>> How much will we need to update the existing reference guide section? >>> >>> >> Probably nothing. Interface as such will not change. >> >> >>> >>> On Sun, Jan 31, 2021, 04:59 Mallikarjun >>> wrote: >>> >>> > Bringing up this thread. >>> > >>> > On Mon, Jan 25, 2021, 3:38 PM Viraj Jasani wrote: >>> > >>> > > Thanks, the image is visible now.
>>> > > >>> > > >>> > > On Mon, 25 Jan 2021 at 1:58 PM, Mallikarjun < >>> mallik.v.ar...@gmail.com> >>> > > wrote: >>> > > >>> > > > Hope this link works --> https://ibb.co/hYjRpgP >>> > > > >>> > > > Inline reply >>> > > > On Mon, Jan 25, 2021 at 1:16 PM Viraj Jasani >>> > wrote: >>> > > > >>> > > > > Hi, >>> > > > > >>> > > > > Still not available :) >>> > > > > The attachments don’t work on mailing lists. You can try >>> uploading >>> > the >>> > > > > attachment on some public hosting site and provide the url to the >>> > same >>> > > > > here. >>> > > > > >>> > > > > Since I am not aware of the contents, I cannot confirm right >>> away but >>> > > if >>> > > > > the reviewer feels we should have the attachment on our github >>> repo: >>> > > > > hbase/dev-support/design-docs , good to upload the content there >>> > later. >>> > > > For >>> > > > > instance, pdf file can contain existing design and new design >>> > diagrams >>> > > > and >>> > > > > talk about pros and cons etc once we have things finalized. >>> > > > > >>> > > > > >>> > > > Since I wanted to open this for discussion, did not consider >>> placing it >>> > > in >>> > > > *hbase/dev_support/design-docs*. >>> > > > >>> > > > >>> > > > > >>> > > > > On Mon, 25 Jan 2021 at 12:13 PM, Mallikarjun < >>> > mallik.v.ar...@gmail.com >>> > > > >>> > > > > wrote: >>> > > > > >>> > > > > > Attached as image. Please let me know if it is availabe now. >>> > > > > > >>> > > > > > --- >>> > > > > > Mallikarjun >>> > > > > > >>> > > > > > >>> > > > > > On Mon, Jan 25, 2021 at 10:32 AM Sean Busbey < >>> bus...@apache.org> >>> > > > wrote: >>> > > > > > >>> > > > > >> Hi! >&g
[jira] [Created] (HBASE-26034) Add support to take multiple parallel backup
Mallikarjun created HBASE-26034: --- Summary: Add support to take multiple parallel backup Key: HBASE-26034 URL: https://issues.apache.org/jira/browse/HBASE-26034 Project: HBase Issue Type: Improvement Components: backup&restore Affects Versions: 3.0.0-alpha-2 Reporter: Mallikarjun Assignee: Mallikarjun Fix For: 3.0.0-alpha-2 -- This message was sent by Atlassian Jira (v8.3.4#803005)
Re: Time for 3.0.0 release
For multi tenancy with favoured nodes, timeline looks unreasonable for 3.0. Can it be part of later 3.x releases? Or should it wait for 4.0? On Fri, May 21, 2021, 7:30 PM 张铎(Duo Zhang) wrote: > We already have the below big feature/changes for 3.0.0. > > Synchronous Replication > OpenTelemetry Tracing > Distributed MOB Compaction > Backup and Restore > Move RSGroup balancer to core > Reimplement sync client on async client > CPEPs on shaded proto > > There are also some ongoing works which target 3.0.0. > > Splittable meta > Move balancer code to hbase-balancer > Compaction offload > Replication offload > > Since now, we do not even have enough new features to cut a minor release > for 2.x, I think it is time to cut the 3.x release line now, and I think we > already have enough new features for a new major release. > > Here I plan to cut a branch-3 at the end of June and make our first > 3.0.0-alpha1 release, and finally make the 3.0.0 release by the end of > 2021. So if any of the above work can not be done before the end of June, > they will be moved out to 4.0.0. > > Thoughts? Thanks. >
[jira] [Created] (HBASE-25891) Simplify backup table to be able to maintain it
Mallikarjun created HBASE-25891: --- Summary: Simplify backup table to be able to maintain it Key: HBASE-25891 URL: https://issues.apache.org/jira/browse/HBASE-25891 Project: HBase Issue Type: Improvement Components: backup&restore Reporter: Mallikarjun Assignee: Mallikarjun More details will be added soon -- This message was sent by Atlassian Jira (v8.3.4#803005)
Re: [DISCUSS] Next 2.x releases
Inline On Wed, May 5, 2021 at 4:51 AM Sean Busbey wrote: > My understanding is that backup work is not ready for inclusion in 2.x. > > The talk of removing it from the master branch and proposed adoption of the > feature through more involvement from some community members were not so > long ago. > > I am working on backup changes as per the discussion in email with subject *[DISCUSS] Hbase Backup design changes* Removing it entirely may not be a good idea, as we are using it in production and I see a good value addition with this feature instead of starting fresh. Maybe in a month or two, I will be able to close it. > On Tue, May 4, 2021, 15:49 Andrew Purtell wrote: > > > Correct me if I am mistaken but backup tests failing on master in > precommit > > is common enough to warrant ignoring them as unrelated. Is it not fully > > baked yet? > > > > +1 for backport of tracing. If we do the backport to branch-2 that would > be > > one new compelling reason for a 2.5.0 release. > > > > > > On Tue, May 4, 2021 at 9:34 AM Nick Dimiduk wrote: > > > > > On Fri, Apr 30, 2021 at 5:40 PM 张铎(Duo Zhang) > > > wrote: > > > > > > > Does anyone have interest in backporting HBASE-22120 to branch-2? > > > > > > > > > > Yes, I think there would be interest in getting your tracing effort > onto > > > branch-2 ; there's quite a few "watchers" on that Jira. > > > > > > What about the backup work? Has it stabilized enough for backport? > > > > > > Conversely, if we don't have a killer new feature for 2.5, does that > mean > > > it's time for 3.0? > > > > > > Andrew Purtell 于2021年5月1日周六 上午5:46写道: > > > > > > > > > Inline > > > > > > > > > > On Fri, Apr 30, 2021 at 2:11 PM Nick Dimiduk > > > > wrote: > > > > > > > > > > > Heya, > > > > > > > > > > > > I'd like to have a planning discussion around our 2.x releases. > 2.4 > > > > seems > > > > > > to be humming along nicely, I think we won't have a need for 2.3 > > much > > > > > > longer. 
Likewise, it seems like it should be time to start > planning > > > > 2.5, > > > > > > but when I look at the issues unique to that branch [0], I don't > > see > > > > the > > > > > > big new feature that justifies the new minor release. Rather, I > > see a > > > > > > number of items that should be backported to a 2.4.x. Do we have > a > > > big > > > > > new > > > > > > feature in the works? Should we consider backporting something > from > > > > > master? > > > > > > Or maybe there's enough minor changes on 2.5 to justify the > > > release... > > > > > but > > > > > > if so, which of them motivate users to upgrade? > > > > > > > > > > > > So, to summarize: > > > > > > > > > > > > - How much longer do we want to run 2.3? > > > > > > - What issues in the below query can be backported to 2.4? Any > > > > > volunteers? > > > > > > > > > > > > > > > > Thanks for starting this discussion, Nick. > > > > > > > > > > Looking at that report, issues like HBASE-24305, HBASE-25793, or > > > > > HBASE-25458 that clean up or reduce interfaces or refactor/move > > classes > > > > are > > > > > probably excluded from a patch release by our guidelines. > Conversely, > > > > they > > > > > would not provide much value if backported. > > > > > > > > > > I also agree that the motivation for 2.5 here is thin as of now. > > > > Refactors, > > > > > interface improvements, or deprecation removals will be > nice-to-haves > > > > when > > > > > there is a 2.5 someday. > > > > > > > > > > All the others in the report are either operational improvements > that > > > > would > > > > > be nice to backport or bug fixes that should be backported. > > > > > > > > > > There might be case by case issues with compatibility during the > pick > > > > > backs, but we can deal with them one at a time. > > > > > > > > > > If you are looking for a volunteer to do this, as 2.4 RM I am game. > > > > > > > > > > > > > > > - What's the big new feature that motivates 2.5? 
> > > > > > > > > > > > Thanks, > > > > > > Nick > > > > > > > > > > > > [0]: > > > > > > > > > > > > > > > > > > > > > > > > > > > https://issues.apache.org/jira/issues/?jql=project%20%3D%20HBASE%20AND%20resolution%20%3D%20Fixed%20AND%20fixVersion%20%3D%202.5.0%20AND%20fixVersion%20not%20in%20(2.4.0%2C%202.4.1%2C%202.4.2%2C%202.4.3)%20ORDER%20BY%20priority%20DESC%2C%20updated%20DESC > > > > > > > > > > > > > > > > > > > > > -- > > > > > Best regards, > > > > > Andrew > > > > > > > > > > Words like orphans lost among the crosstalk, meaning torn from > > truth's > > > > > decrepit hands > > > > >- A23, Crosstalk > > > > > > > > > > > > > > > > > > -- > > Best regards, > > Andrew > > > > Words like orphans lost among the crosstalk, meaning torn from truth's > > decrepit hands > >- A23, Crosstalk > > >
Re: DISCUSS: Remove hbase-backup from master?
Sure. I have assigned it to myself. Will look into it. Last time I checked, I did not find any failed tests and it was not hadoop3 --- Mallikarjun On Fri, May 14, 2021 at 2:34 AM Nick Dimiduk wrote: > Hi Mallikarjun, > > I just saw a bunch of backup tests fail on an unrelated PR build. I filed > HBASE-25888 and uploaded some logs. I have the full test-logs.zip file, but > it's too big to upload to jira. I linked it from the Jira, but the archive > will disappear when the PR is eventually closed. I would ping you from > Jira, but I didn't find any Jira user that seemed likely to be your > account. Would you mind taking a look? > > Thanks, > Nick > > On Wed, Dec 23, 2020 at 8:20 PM Mallikarjun > wrote: > > > Yea. I have noticed that. > > > > I have some patches ready and have to add unit tests. Will start raising > in > > a couple of weeks time. > > --- > > Mallikarjun > > > > > > On Thu, Dec 24, 2020 at 7:48 AM 张铎(Duo Zhang) > > wrote: > > > > > The UTs in the backup module are easy to fail with NPE, I've seen this > in > > > several pre commit results. > > > > > > Any progress here? > > > > > > mallik.v.ar...@gmail.com 于2020年12月3日周四 > > > 上午9:58写道: > > > > > > > On Tue, Dec 1, 2020 at 11:26 PM Stack wrote: > > > > > > > > > On Tue, Dec 1, 2020 at 6:38 AM mallik.v.ar...@gmail.com < > > > > > mallik.v.ar...@gmail.com> wrote: > > > > > > > > > > > On Tue, Dec 1, 2020 at 7:34 PM Sean Busbey > > > wrote: > > > > > > > > > > > > > One reason for moving it out of core is that at some point we > > will > > > > have > > > > > > > hbase 3 releases. Right now hbase 3 targeted things are in the > > > master > > > > > > > branch. If the feature is not sufficiently ready to be in a > > release > > > > > then > > > > > > > when the time for HBase 3 releases comes we'll have to do work > to > > > > pull > > > > > it > > > > > > > out of the release branch. 
If the feature needs to stay in the > > core > > > > > repo > > > > > > > that will necessitate that hbase 3 have a branch distinct from > > the > > > > > master > > > > > > > branch (which may or may not happen anyways). At that point we > > risk > > > > > > having > > > > > > > to do the work to remove the feature from a release branch > again > > > come > > > > > > hbase > > > > > > > 4. > > > > > > > > > > > > > > > > > > > > > I think a lot of this is moot if you'd like to start providing > > > > patches > > > > > > for > > > > > > > the things you've needed to use it. If there are gaps that you > > > think > > > > > can > > > > > > > trip folks up maybe we could label it "experimental" to provide > > > > better > > > > > > > context for others. > > > > > > > > > > > > > > > > > > > I will start putting effort in maintaining this feature. > > > > > > > > > > > > > > > > > FYI, seems like backup has a bunch of flakey tests. Might be worth > > > > looking > > > > > at. > > > > > > > > > > > > > Any reference I can get. They seem fine when I run tests. > > > > > > > > > > > > > Thanks, > > > > > S > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Tue, Dec 1, 2020, 07:48 mallik.v.ar...@gmail.com < > > > > > > > mallik.v.ar...@gmail.com> wrote: > > > > > > > > > > > > > > > On Tue, Dec 1, 2020 at 12:14 PM Stack > > wrote: > > > > > > > > > > > > > > > > > On Mon, Nov 30, 2020 at 9:30 PM mallik.v.ar...@gmail.com < > > > > > > > > > mallik.v.ar...@gmail.com> wrote: > > > > > > > > > > > > > > > > > > > Inline > > > > > > > > > > > > >
[jira] [Created] (HBASE-25870) Search only ancestors instead of entire history instead for a particular backup
Mallikarjun created HBASE-25870: --- Summary: Search only ancestors instead of entire history instead for a particular backup Key: HBASE-25870 URL: https://issues.apache.org/jira/browse/HBASE-25870 Project: HBase Issue Type: Bug Components: backup&restore Reporter: Mallikarjun Assignee: Mallikarjun -- This message was sent by Atlassian Jira (v8.3.4#803005)
Re: [SURVEY] The current usage of favor node balancer across the community
Inline reply On Tue, Apr 27, 2021 at 1:03 AM Stack wrote: > On Mon, Apr 26, 2021 at 12:30 PM Stack wrote: > > > On Mon, Apr 26, 2021 at 8:10 AM Mallikarjun > > wrote: > > > >> We use FavoredStochasticBalancer, which by description says the same > thing > >> as FavoredNodeLoadBalancer. Ignoring that fact, problem appears to be > >> > >> > > > > Other concerns: > > > > * Hard-coded triplet of nodes that will inevitably rot as machines come > > and go (Are there tools for remediation?) > It doesn't really rot, if you think of it with the balancer responsible for assigning regions: 1. Every time a region is assigned to a particular regionserver, the balancer reassigns this triplet, so there is no scope for rot (the same logic applies to the WAL as well). (On compaction, HDFS blocks are pulled back if any spill over.) 2. We use hostnames only (so "come and go" does not mean new nodes, but the same hostnames). A couple of outstanding problems, though: 1. We couldn't increase the replication factor beyond 3, which has been fine so far for our use cases, but we have thought about fixing this. 2. The balancer doesn't understand the favored nodes construct, so a perfect balance of favored nodes among the rsgroup datanodes isn't possible; some variance, like a 10-20% difference, is expected. > > * A workaround for a facility that belongs in the NN > Probably; you can argue it both ways. HBase is the owner of the data and has the authority to dictate where a particular region replica sits. Benefits such as data locality staying close to 1 and rack awareness are well aligned to this strategy, and so on. Moreover, HDFS provides data pinning precisely for clients to make use of, doesn't it? > > * Opaque in operation > We haven't yet looked at wrapping these operations with metrics, so that they would no longer be opaque; see also the reasons mentioned in the point above. > > * My understanding was that the feature was never finished; in > particular > > the balancer wasn't properly wired-up (Happy to be incorrect here).
> > > > > One more concern was that the feature was dead/unused. You seem to refute > this notion of mine. > S > We have been using this for more than a year with hbase 2.1 in highly critical workloads for our company. And several years with hbase 1.2 as well with backporting rsgroup from master at that time. (2017-18 ish) And it has been very smooth operationally in hbase 2.1 > > > > > > > >> Going a step back. > >> > >> Did we ever consider giving a thought towards truely multi-tenant hbase? > >> > > > > Always. > > > > > >> Where each rsgroup has a group of datanodes and namespace tables data > >> created under that particular rsgroup would sit on those datanodes only? > >> We > >> have attempted to do that and have largely been very successful running > >> clusters of hundreds of terabytes with hundreds of > >> regionservers(datanodes) > >> per cluster. > >> > >> > > So isolation of load by node? (I believe this is where the rsgroup > feature > > came from originally; the desire for a deploy like you describe above. > > IIUC, its what Thiru and crew run). > > > > > > > >> 1. We use a modified version of RSGroupBasedFavoredNodeLoadBalancer > >> contributed by Thiruvel Thirumoolan --> > >> https://issues.apache.org/jira/browse/HBASE-15533 > >> > >> On each balance operation, while the region is moved around (or while > >> creating table), favored nodes are assigned based on the rsgroup that > >> region is pinned to. And hence data is pinned to those datanodes only > >> (Pinning favored nodes is best effort from the hdfs side, but there are > >> only a few exception scenarios where data will be spilled over and they > >> recover after a major compaction). > >> > >> > > Sounds like you have studied this deploy in operation. Write it up? Blog > > post on hbase.apache.org? > > > Definitely will write up. > > > > > >> 2. 
We have introduced several balancer cost functions to restore things > to > >> normalcy (multi tenancy with fn pinning) such as when a node is dead, or > >> when fn's are imbalanced within the same rsgroup, etc. > >> > >> 3. We had diverse workloads under the same cluster and WAL isolation > >> became > >> a requirement and we went ahead with similar philosophy mentioned in > line > >> 1. Where WAL's are created with FN pinning so that they are tied to > >> datanodes belonging to the same rsgroup. Some
Re: [SURVEY] The current usage of favor node balancer across the community
We use FavoredStochasticBalancer, which by description says the same thing as FavoredNodeLoadBalancer. Ignoring that fact, the problem appears to be: > favor node balancer is a problem, as it stores the favor node information > in hbase:meta. > Going a step back: did we ever consider truly multi-tenant hbase, where each rsgroup has a group of datanodes, and data of namespace tables created under that particular rsgroup would sit on those datanodes only? We have attempted to do that and have largely been very successful running clusters of hundreds of terabytes with hundreds of regionservers (datanodes) per cluster. 1. We use a modified version of RSGroupBasedFavoredNodeLoadBalancer contributed by Thiruvel Thirumoolan --> https://issues.apache.org/jira/browse/HBASE-15533 On each balance operation, while the region is moved around (or while creating a table), favored nodes are assigned based on the rsgroup that the region is pinned to, and hence data is pinned to those datanodes only. (Pinning favored nodes is best effort on the hdfs side, but there are only a few exception scenarios where data will be spilled over, and they recover after a major compaction.) 2. We have introduced several balancer cost functions to restore things to normalcy (multi tenancy with fn pinning), such as when a node is dead, or when fn's are imbalanced within the same rsgroup, etc. 3. We had diverse workloads under the same cluster and WAL isolation became a requirement, so we went ahead with the same philosophy mentioned in point 1, where WALs are created with FN pinning so that they are tied to datanodes belonging to the same rsgroup. Some discussion seems to have happened here --> https://issues.apache.org/jira/browse/HBASE-21641 There are several other enhancements we have worked on, such as rsgroup aware export snapshot, rsgroup aware regionmover, rsgroup aware cluster replication, etc. For the above use cases, we would need the fn information in hbase:meta.
If the use case seems to be a fit for how we would want hbase to be taken forward as one of the supported use cases, we are willing to contribute our changes back to the community. (I was planning to initiate this discussion anyway.) To strengthen the above use case, here is what one of our multi-tenant clusters looks like: RSGroups (Tenants): 21 (with tenant isolation) Regionservers: 275 Regions hosted: 6k Tables hosted: 87 Capacity: 250 TB (100 TB used) --- Mallikarjun On Mon, Apr 26, 2021 at 9:15 AM 张铎(Duo Zhang) wrote: > As you all know, we always want to reduce the size of the hbase-server > module. This time we want to separate the balancer related code to another > sub module. > > The design doc: > > https://docs.google.com/document/d/1T7WSgcQBJTtbJIjqi8sZYLxD2Z7JbIHx4TJaKKdkBbE/edit# > > You can see the bottom of the design doc, favor node balancer is a problem, > as it stores the favor node information in hbase:meta. Stack mentioned that > the feature is already dead, maybe we could just purge it from our code > base. > > So here we want to know if there are still some users in the community who > still use favor node balancer. Please share your experience and whether you > still want to use it. > > Thanks. >
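[Editor's note] The rsgroup-pinned favored-node assignment described in this thread can be sketched as follows. All class and method names here are hypothetical illustrations, not the actual RSGroupBasedFavoredNodeLoadBalancer API; the real balancer also weighs load, racks, and dead nodes.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch: pick a favored-node triplet for a region from the
// datanodes of that region's rsgroup, so HDFS block placement stays inside
// the tenant's nodes. Hypothetical names, not HBase's actual balancer API.
public class RsGroupFavoredNodesSketch {
    // Three favored nodes, matching the default HDFS replication factor
    // (this is also why the thread notes replication factor > 3 was a problem).
    static final int FAVORED_NODES_NUM = 3;

    // Spread regions across the group's hosts round-robin by region index.
    static List<String> pickFavoredNodes(List<String> groupHosts, int regionIndex) {
        if (groupHosts.size() < FAVORED_NODES_NUM) {
            throw new IllegalArgumentException("rsgroup has too few hosts");
        }
        List<String> picked = new ArrayList<>();
        for (int i = 0; i < FAVORED_NODES_NUM; i++) {
            picked.add(groupHosts.get((regionIndex + i) % groupHosts.size()));
        }
        return picked;
    }

    public static void main(String[] args) {
        List<String> tenantHosts = List.of("dn-1", "dn-2", "dn-3", "dn-4");
        System.out.println(pickFavoredNodes(tenantHosts, 0)); // [dn-1, dn-2, dn-3]
        System.out.println(pickFavoredNodes(tenantHosts, 3)); // [dn-4, dn-1, dn-2]
    }
}
```

Because hostnames are reused when nodes are replaced (as noted above), triplets chosen this way do not rot: every rebalance re-derives them from the current rsgroup membership.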
[jira] [Created] (HBASE-25784) Support for Parallel Backups enabling multi tenancy
Mallikarjun created HBASE-25784:
---
Summary: Support for Parallel Backups enabling multi tenancy
Key: HBASE-25784
URL: https://issues.apache.org/jira/browse/HBASE-25784
Project: HBase
Issue Type: Improvement
Components: backport
Reporter: Mallikarjun
Assignee: Mallikarjun

Proposed Design.

!https://i.ibb.co/vVV1BTs/Backup-Activity-Diagram.png|width=322,height=414!

-- This message was sent by Atlassian Jira (v8.3.4#803005)
Re: [DISCUSS] Hbase Backup design changes
Slight modification to previous version --> https://ibb.co/Nttx3J1 --- Mallikarjun On Thu, Feb 11, 2021 at 8:12 AM Mallikarjun wrote: > Inline Reply > > On Wed, Feb 3, 2021 at 6:44 AM Sean Busbey wrote: > >> Hi Mallikarjun, >> >> Those goals sound worthwhile. >> >> Do you have a flow chart similar to the one you posted for the current >> system but for the proposed solution? >> > > This is what I am thinking --> https://ibb.co/KmH6Cwv > > >> >> How much will we need to change our existing test coverage to accommodate >> the proposed solution? >> > > Of the 38 tests, it looks like we might have to change a couple only. > Will have to add more tests to cover parallel backup scenarios. > > >> >> How much will we need to update the existing reference guide section? >> >> > Probably nothing. Interface as such will not change. > > >> >> On Sun, Jan 31, 2021, 04:59 Mallikarjun wrote: >> >> > Bringing up this thread. >> > >> > On Mon, Jan 25, 2021, 3:38 PM Viraj Jasani wrote: >> > >> > > Thanks, the image is visible now. >> > > >> > > > Since I wanted to open this for discussion, did not consider >> placing it >> > > in >> > > *hbase/dev_support/design-docs*. >> > > >> > > Definitely, only after we come to concrete conclusion with the >> reviewer, >> > we >> > > should open up a PR. Until then this thread is anyways up for >> discussion. >> > > >> > > >> > > On Mon, 25 Jan 2021 at 1:58 PM, Mallikarjun > > >> > > wrote: >> > > >> > > > Hope this link works --> https://ibb.co/hYjRpgP >> > > > >> > > > Inline reply >> > > > On Mon, Jan 25, 2021 at 1:16 PM Viraj Jasani >> > wrote: >> > > > >> > > > > Hi, >> > > > > >> > > > > Still not available :) >> > > > > The attachments don’t work on mailing lists. You can try uploading >> > the >> > > > > attachment on some public hosting site and provide the url to the >> > same >> > > > > here. 
>> > > > > >> > > > > Since I am not aware of the contents, I cannot confirm right away >> but >> > > if >> > > > > the reviewer feels we should have the attachment on our github >> repo: >> > > > > hbase/dev-support/design-docs , good to upload the content there >> > later. >> > > > For >> > > > > instance, pdf file can contain existing design and new design >> > diagrams >> > > > and >> > > > > talk about pros and cons etc once we have things finalized. >> > > > > >> > > > > >> > > > Since I wanted to open this for discussion, did not consider >> placing it >> > > in >> > > > *hbase/dev_support/design-docs*. >> > > > >> > > > >> > > > > >> > > > > On Mon, 25 Jan 2021 at 12:13 PM, Mallikarjun < >> > mallik.v.ar...@gmail.com >> > > > >> > > > > wrote: >> > > > > >> > > > > > Attached as image. Please let me know if it is availabe now. >> > > > > > >> > > > > > --- >> > > > > > Mallikarjun >> > > > > > >> > > > > > >> > > > > > On Mon, Jan 25, 2021 at 10:32 AM Sean Busbey > > >> > > > wrote: >> > > > > > >> > > > > >> Hi! >> > > > > >> >> > > > > >> Thanks for the write up. unfortunately, your image for the >> > existing >> > > > > >> design didn't come through. Could you post it to some host and >> > link >> > > it >> > > > > >> here? >> > > > > >> >> > > > > >> On Sun, Jan 24, 2021 at 3:12 AM Mallikarjun < >> > > mallik.v.ar...@gmail.com >> > > > > >> > > > > >> wrote: >> > > > > >> > >> > > > > >> > Existing Design: >> > > > > >> > >> > > > > >> > >> > > > > >> > >> > > > > >> &
Re: [DISCUSS] Hbase Backup design changes
Inline Reply On Wed, Feb 3, 2021 at 6:44 AM Sean Busbey wrote: > Hi Mallikarjun, > > Those goals sound worthwhile. > > Do you have a flow chart similar to the one you posted for the current > system but for the proposed solution? > This is what I am thinking --> https://ibb.co/KmH6Cwv > > How much will we need to change our existing test coverage to accommodate > the proposed solution? > Of the 38 tests, it looks like we might have to change a couple only. Will have to add more tests to cover parallel backup scenarios. > > How much will we need to update the existing reference guide section? > > Probably nothing. Interface as such will not change. > > On Sun, Jan 31, 2021, 04:59 Mallikarjun wrote: > > > Bringing up this thread. > > > > On Mon, Jan 25, 2021, 3:38 PM Viraj Jasani wrote: > > > > > Thanks, the image is visible now. > > > > > > > Since I wanted to open this for discussion, did not consider placing > it > > > in > > > *hbase/dev_support/design-docs*. > > > > > > Definitely, only after we come to concrete conclusion with the > reviewer, > > we > > > should open up a PR. Until then this thread is anyways up for > discussion. > > > > > > > > > On Mon, 25 Jan 2021 at 1:58 PM, Mallikarjun > > > wrote: > > > > > > > Hope this link works --> https://ibb.co/hYjRpgP > > > > > > > > Inline reply > > > > On Mon, Jan 25, 2021 at 1:16 PM Viraj Jasani > > wrote: > > > > > > > > > Hi, > > > > > > > > > > Still not available :) > > > > > The attachments don’t work on mailing lists. You can try uploading > > the > > > > > attachment on some public hosting site and provide the url to the > > same > > > > > here. > > > > > > > > > > Since I am not aware of the contents, I cannot confirm right away > but > > > if > > > > > the reviewer feels we should have the attachment on our github > repo: > > > > > hbase/dev-support/design-docs , good to upload the content there > > later. 
> > > > For > > > > > instance, pdf file can contain existing design and new design > > diagrams > > > > and > > > > > talk about pros and cons etc once we have things finalized. > > > > > > > > > > > > > > Since I wanted to open this for discussion, did not consider placing > it > > > in > > > > *hbase/dev_support/design-docs*. > > > > > > > > > > > > > > > > > > On Mon, 25 Jan 2021 at 12:13 PM, Mallikarjun < > > mallik.v.ar...@gmail.com > > > > > > > > > wrote: > > > > > > > > > > > Attached as image. Please let me know if it is availabe now. > > > > > > > > > > > > --- > > > > > > Mallikarjun > > > > > > > > > > > > > > > > > > On Mon, Jan 25, 2021 at 10:32 AM Sean Busbey > > > > wrote: > > > > > > > > > > > >> Hi! > > > > > >> > > > > > >> Thanks for the write up. unfortunately, your image for the > > existing > > > > > >> design didn't come through. Could you post it to some host and > > link > > > it > > > > > >> here? > > > > > >> > > > > > >> On Sun, Jan 24, 2021 at 3:12 AM Mallikarjun < > > > mallik.v.ar...@gmail.com > > > > > > > > > > >> wrote: > > > > > >> > > > > > > >> > Existing Design: > > > > > >> > > > > > > >> > > > > > > >> > > > > > > >> > Problem 1: > > > > > >> > > > > > > >> > With this design, Incremental and Full backup can't be run in > > > > parallel > > > > > >> and leading to degraded RPO's in case Full backup is of longer > > > > duration > > > > > esp > > > > > >> for large tables. > > > > > >> > > > > > > >> > Example: > > > > > >> > Expectation: Say you have a big table with 10 TB and your RPO > is > > > 60 > > > > > >> minutes and you are allowed to shi
Re: [DISCUSS] Hbase Backup design changes
Hi Sean, I will get back with the design changes and the answers to above questions in a few days time. --- Mallikarjun On Wed, Feb 3, 2021 at 6:44 AM Sean Busbey wrote: > Hi Mallikarjun, > > Those goals sound worthwhile. > > Do you have a flow chart similar to the one you posted for the current > system but for the proposed solution? > > How much will we need to change our existing test coverage to accommodate > the proposed solution? > > How much will we need to update the existing reference guide section? > > > On Sun, Jan 31, 2021, 04:59 Mallikarjun wrote: > > > Bringing up this thread. > > > > On Mon, Jan 25, 2021, 3:38 PM Viraj Jasani wrote: > > > > > Thanks, the image is visible now. > > > > > > > Since I wanted to open this for discussion, did not consider placing > it > > > in > > > *hbase/dev_support/design-docs*. > > > > > > Definitely, only after we come to concrete conclusion with the > reviewer, > > we > > > should open up a PR. Until then this thread is anyways up for > discussion. > > > > > > > > > On Mon, 25 Jan 2021 at 1:58 PM, Mallikarjun > > > wrote: > > > > > > > Hope this link works --> https://ibb.co/hYjRpgP > > > > > > > > Inline reply > > > > On Mon, Jan 25, 2021 at 1:16 PM Viraj Jasani > > wrote: > > > > > > > > > Hi, > > > > > > > > > > Still not available :) > > > > > The attachments don’t work on mailing lists. You can try uploading > > the > > > > > attachment on some public hosting site and provide the url to the > > same > > > > > here. > > > > > > > > > > Since I am not aware of the contents, I cannot confirm right away > but > > > if > > > > > the reviewer feels we should have the attachment on our github > repo: > > > > > hbase/dev-support/design-docs , good to upload the content there > > later. > > > > For > > > > > instance, pdf file can contain existing design and new design > > diagrams > > > > and > > > > > talk about pros and cons etc once we have things finalized. 
> > > > > > > > > > > > > > Since I wanted to open this for discussion, did not consider placing > it > > > in > > > > *hbase/dev_support/design-docs*. > > > > > > > > > > > > > > > > > > On Mon, 25 Jan 2021 at 12:13 PM, Mallikarjun < > > mallik.v.ar...@gmail.com > > > > > > > > > wrote: > > > > > > > > > > > Attached as image. Please let me know if it is availabe now. > > > > > > > > > > > > --- > > > > > > Mallikarjun > > > > > > > > > > > > > > > > > > On Mon, Jan 25, 2021 at 10:32 AM Sean Busbey > > > > wrote: > > > > > > > > > > > >> Hi! > > > > > >> > > > > > >> Thanks for the write up. unfortunately, your image for the > > existing > > > > > >> design didn't come through. Could you post it to some host and > > link > > > it > > > > > >> here? > > > > > >> > > > > > >> On Sun, Jan 24, 2021 at 3:12 AM Mallikarjun < > > > mallik.v.ar...@gmail.com > > > > > > > > > > >> wrote: > > > > > >> > > > > > > >> > Existing Design: > > > > > >> > > > > > > >> > > > > > > >> > > > > > > >> > Problem 1: > > > > > >> > > > > > > >> > With this design, Incremental and Full backup can't be run in > > > > parallel > > > > > >> and leading to degraded RPO's in case Full backup is of longer > > > > duration > > > > > esp > > > > > >> for large tables. > > > > > >> > > > > > > >> > Example: > > > > > >> > Expectation: Say you have a big table with 10 TB and your RPO > is > > > 60 > > > > > >> minutes and you are allowed to ship the remote backup with 800 > > Mbps. > > > > And > > > > > >> you are allowed to take Full Backups once
Re: [DISCUSS] Hbase Backup design changes
Bringing up this thread. On Mon, Jan 25, 2021, 3:38 PM Viraj Jasani wrote: > Thanks, the image is visible now. > > > Since I wanted to open this for discussion, did not consider placing it > in > *hbase/dev_support/design-docs*. > > Definitely, only after we come to concrete conclusion with the reviewer, we > should open up a PR. Until then this thread is anyways up for discussion. > > > On Mon, 25 Jan 2021 at 1:58 PM, Mallikarjun > wrote: > > > Hope this link works --> https://ibb.co/hYjRpgP > > > > Inline reply > > On Mon, Jan 25, 2021 at 1:16 PM Viraj Jasani wrote: > > > > > Hi, > > > > > > Still not available :) > > > The attachments don’t work on mailing lists. You can try uploading the > > > attachment on some public hosting site and provide the url to the same > > > here. > > > > > > Since I am not aware of the contents, I cannot confirm right away but > if > > > the reviewer feels we should have the attachment on our github repo: > > > hbase/dev-support/design-docs , good to upload the content there later. > > For > > > instance, pdf file can contain existing design and new design diagrams > > and > > > talk about pros and cons etc once we have things finalized. > > > > > > > > Since I wanted to open this for discussion, did not consider placing it > in > > *hbase/dev_support/design-docs*. > > > > > > > > > > On Mon, 25 Jan 2021 at 12:13 PM, Mallikarjun > > > > wrote: > > > > > > > Attached as image. Please let me know if it is availabe now. > > > > > > > > --- > > > > Mallikarjun > > > > > > > > > > > > On Mon, Jan 25, 2021 at 10:32 AM Sean Busbey > > wrote: > > > > > > > >> Hi! > > > >> > > > >> Thanks for the write up. unfortunately, your image for the existing > > > >> design didn't come through. Could you post it to some host and link > it > > > >> here? 
> > > >> > > > >> On Sun, Jan 24, 2021 at 3:12 AM Mallikarjun < > mallik.v.ar...@gmail.com > > > > > > >> wrote: > > > >> > > > > >> > Existing Design: > > > >> > > > > >> > > > > >> > > > > >> > Problem 1: > > > >> > > > > >> > With this design, Incremental and Full backup can't be run in > > parallel > > > >> and leading to degraded RPO's in case Full backup is of longer > > duration > > > esp > > > >> for large tables. > > > >> > > > > >> > Example: > > > >> > Expectation: Say you have a big table with 10 TB and your RPO is > 60 > > > >> minutes and you are allowed to ship the remote backup with 800 Mbps. > > And > > > >> you are allowed to take Full Backups once in a week and rest of them > > > should > > > >> be incremental backups > > > >> > > > > >> > Shortcoming: With the above design, one can't run parallel backups > > and > > > >> whenever there is a full backup running (which takes roughly 25 > hours) > > > you > > > >> are not allowed to take incremental backups and that would be a > breach > > > in > > > >> your RPO. > > > >> > > > > >> > Proposed Solution: Barring some critical sections such as > modifying > > > >> state of the backup on meta tables, others can happen parallelly. > > > Leaving > > > >> incremental backups to be able to run based on older successful > full / > > > >> incremental backups and completion time of backup should be used > > > instead of > > > >> start time of backup for ordering. I have not worked on the full > > > redesign, > > > >> and will be doing so if this proposal seems acceptable for the > > > community. > > > >> > > > > >> > Problem 2: > > > >> > > > > >> > With one backup at a time, it fails easily for a multi-tenant > > system. > > > >> This poses following problems > > > >> > > > > >> > Admins will not be able to achieve required RPO's for their tables > > > >> because of dependence on other tenants present in the system. 
As one > > > tenant > > > >> doesn't have control over other tenants' table sizes and hence the > > > duration > > > >> of the backup > > > >> > Management overhead of setting up a right sequence to achieve > > required > > > >> RPO's for different tenants could be very hard. > > > >> > > > > >> > Proposed Solution: Same as previous proposal > > > >> > > > > >> > Problem 3: > > > >> > > > > >> > Incremental backup works on WAL's and > > > >> org.apache.hadoop.hbase.backup.master.BackupLogCleaner ensures that > > > WAL's > > > >> are never cleaned up until the next backup (Full / Incremental) is > > > taken. > > > >> This poses following problem > > > >> > > > > >> > WAL's can grow unbounded in case there are transient problems like > > > >> backup site facing issues or anything else until next backup > scheduled > > > goes > > > >> successful > > > >> > > > > >> > Proposed Solution: I can't think of anything better, but I see > this > > > can > > > >> be a potential problem. Also, one can force full backup if required > > WAL > > > >> files are missing for whatever other reasons not necessarily > mentioned > > > >> above. > > > >> > > > > >> > --- > > > >> > Mallikarjun > > > >> > > > > > > > > > >
Re: [DISCUSS] Hbase Backup design changes
Hope this link works --> https://ibb.co/hYjRpgP Inline reply On Mon, Jan 25, 2021 at 1:16 PM Viraj Jasani wrote: > Hi, > > Still not available :) > The attachments don’t work on mailing lists. You can try uploading the > attachment on some public hosting site and provide the url to the same > here. > > Since I am not aware of the contents, I cannot confirm right away but if > the reviewer feels we should have the attachment on our github repo: > hbase/dev-support/design-docs , good to upload the content there later. For > instance, pdf file can contain existing design and new design diagrams and > talk about pros and cons etc once we have things finalized. > > Since I wanted to open this for discussion, did not consider placing it in *hbase/dev_support/design-docs*. > > On Mon, 25 Jan 2021 at 12:13 PM, Mallikarjun > wrote: > > > Attached as image. Please let me know if it is availabe now. > > > > --- > > Mallikarjun > > > > > > On Mon, Jan 25, 2021 at 10:32 AM Sean Busbey wrote: > > > >> Hi! > >> > >> Thanks for the write up. unfortunately, your image for the existing > >> design didn't come through. Could you post it to some host and link it > >> here? > >> > >> On Sun, Jan 24, 2021 at 3:12 AM Mallikarjun > >> wrote: > >> > > >> > Existing Design: > >> > > >> > > >> > > >> > Problem 1: > >> > > >> > With this design, Incremental and Full backup can't be run in parallel > >> and leading to degraded RPO's in case Full backup is of longer duration > esp > >> for large tables. > >> > > >> > Example: > >> > Expectation: Say you have a big table with 10 TB and your RPO is 60 > >> minutes and you are allowed to ship the remote backup with 800 Mbps. 
And > >> you are allowed to take Full Backups once in a week and rest of them > should > >> be incremental backups > >> > > >> > Shortcoming: With the above design, one can't run parallel backups and > >> whenever there is a full backup running (which takes roughly 25 hours) > you > >> are not allowed to take incremental backups and that would be a breach > in > >> your RPO. > >> > > >> > Proposed Solution: Barring some critical sections such as modifying > >> state of the backup on meta tables, others can happen parallelly. > Leaving > >> incremental backups to be able to run based on older successful full / > >> incremental backups and completion time of backup should be used > instead of > >> start time of backup for ordering. I have not worked on the full > redesign, > >> and will be doing so if this proposal seems acceptable for the > community. > >> > > >> > Problem 2: > >> > > >> > With one backup at a time, it fails easily for a multi-tenant system. > >> This poses following problems > >> > > >> > Admins will not be able to achieve required RPO's for their tables > >> because of dependence on other tenants present in the system. As one > tenant > >> doesn't have control over other tenants' table sizes and hence the > duration > >> of the backup > >> > Management overhead of setting up a right sequence to achieve required > >> RPO's for different tenants could be very hard. > >> > > >> > Proposed Solution: Same as previous proposal > >> > > >> > Problem 3: > >> > > >> > Incremental backup works on WAL's and > >> org.apache.hadoop.hbase.backup.master.BackupLogCleaner ensures that > WAL's > >> are never cleaned up until the next backup (Full / Incremental) is > taken. 
> >> This poses following problem > >> > > >> > WAL's can grow unbounded in case there are transient problems like > >> backup site facing issues or anything else until next backup scheduled > goes > >> successful > >> > > >> > Proposed Solution: I can't think of anything better, but I see this > can > >> be a potential problem. Also, one can force full backup if required WAL > >> files are missing for whatever other reasons not necessarily mentioned > >> above. > >> > > >> > --- > >> > Mallikarjun > >> > > >
Re: [DISCUSS] Hbase Backup design changes
Attached as image. Please let me know if it is availabe now. --- Mallikarjun On Mon, Jan 25, 2021 at 10:32 AM Sean Busbey wrote: > Hi! > > Thanks for the write up. unfortunately, your image for the existing > design didn't come through. Could you post it to some host and link it > here? > > On Sun, Jan 24, 2021 at 3:12 AM Mallikarjun > wrote: > > > > Existing Design: > > > > > > > > Problem 1: > > > > With this design, Incremental and Full backup can't be run in parallel > and leading to degraded RPO's in case Full backup is of longer duration esp > for large tables. > > > > Example: > > Expectation: Say you have a big table with 10 TB and your RPO is 60 > minutes and you are allowed to ship the remote backup with 800 Mbps. And > you are allowed to take Full Backups once in a week and rest of them should > be incremental backups > > > > Shortcoming: With the above design, one can't run parallel backups and > whenever there is a full backup running (which takes roughly 25 hours) you > are not allowed to take incremental backups and that would be a breach in > your RPO. > > > > Proposed Solution: Barring some critical sections such as modifying > state of the backup on meta tables, others can happen parallelly. Leaving > incremental backups to be able to run based on older successful full / > incremental backups and completion time of backup should be used instead of > start time of backup for ordering. I have not worked on the full redesign, > and will be doing so if this proposal seems acceptable for the community. > > > > Problem 2: > > > > With one backup at a time, it fails easily for a multi-tenant system. > This poses following problems > > > > Admins will not be able to achieve required RPO's for their tables > because of dependence on other tenants present in the system. 
As one tenant > doesn't have control over other tenants' table sizes and hence the duration > of the backup > > Management overhead of setting up a right sequence to achieve required > RPO's for different tenants could be very hard. > > > > Proposed Solution: Same as previous proposal > > > > Problem 3: > > > > Incremental backup works on WAL's and > org.apache.hadoop.hbase.backup.master.BackupLogCleaner ensures that WAL's > are never cleaned up until the next backup (Full / Incremental) is taken. > This poses following problem > > > > WAL's can grow unbounded in case there are transient problems like > backup site facing issues or anything else until next backup scheduled goes > successful > > > > Proposed Solution: I can't think of anything better, but I see this can > be a potential problem. Also, one can force full backup if required WAL > files are missing for whatever other reasons not necessarily mentioned > above. > > > > --- > > Mallikarjun >
[DISCUSS] Hbase Backup design changes
*Existing Design:* [image: image.png]

*Problem 1:*

With this design, incremental and full backups can't run in parallel, leading to degraded RPOs when a full backup runs long, especially for large tables.

Example:
Expectation: say you have a big table of 10 TB, your RPO is 60 minutes, you are allowed to ship the remote backup at 800 Mbps, and you are allowed to take full backups once a week, with the rest being incremental backups.

Shortcoming: with the above design you can't run parallel backups, so while a full backup is running (which takes roughly 25 hours) you cannot take incremental backups, and that would be a breach of your RPO.

*Proposed Solution:* Barring some critical sections, such as modifying the state of the backup in the meta tables, the rest can happen in parallel. Incremental backups should be able to run based on older successful full or incremental backups, and the completion time of a backup, rather than its start time, should be used for ordering. I have not worked on the full redesign, and will do so if this proposal seems acceptable to the community.

*Problem 2:*

With one backup at a time, the system fails easily for multi-tenant use. This poses the following problems:

- Admins will not be able to achieve the required RPOs for their tables because of dependence on the other tenants in the system, as one tenant has no control over other tenants' table sizes and hence over backup durations.
- The management overhead of setting up the right sequence to achieve the required RPOs for different tenants could be very hard.

*Proposed Solution:* Same as the previous proposal.

*Problem 3:*

Incremental backup works on WALs, and org.apache.hadoop.hbase.backup.master.BackupLogCleaner ensures that WALs are never cleaned up until the next backup (full / incremental) is taken.
This poses the following problem:

- WALs can grow unbounded when there are transient problems, such as the backup site facing issues or anything else, until the next scheduled backup succeeds.

*Proposed Solution:* I can't think of anything better, but I see this can be a potential problem. Also, one can force a full backup if the required WAL files are missing for whatever other reason, not necessarily those mentioned above.

--- Mallikarjun
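The "roughly 25 hours" figure in Problem 1 follows from simple arithmetic, sketched below (assuming decimal units for TB/Mbps and a fully saturated link; the real duration in the thread presumably also includes snapshot/export overhead):

```python
# Back-of-the-envelope check of the full-backup duration in Problem 1:
# shipping a 10 TB table at the allowed 800 Mbps.
table_bytes = 10 * 10**12      # 10 TB, decimal units
link_bps = 800 * 10**6         # 800 Mbps
hours = table_bytes * 8 / link_bps / 3600
# ~27.8 hours at line rate, i.e. "roughly 25 hours" and far above a
# 60-minute RPO if incremental backups are blocked meanwhile.
assert 24 < hours < 30
```

This is why the proposal decouples incremental backups from an in-flight full backup: a day-long full backup would otherwise block every incremental in that window.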
[jira] [Resolved] (HBASE-14414) HBase Backup/Restore Phase 3
[ https://issues.apache.org/jira/browse/HBASE-14414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mallikarjun resolved HBASE-14414.
---
Resolution: Done

> HBase Backup/Restore Phase 3
>
> Key: HBASE-14414
> URL: https://issues.apache.org/jira/browse/HBASE-14414
> Project: HBase
> Issue Type: New Feature
> Affects Versions: 3.0.0-alpha-1
> Reporter: Vladimir Rodionov
> Assignee: Mallikarjun
> Priority: Major
> Labels: backup
>
> Umbrella ticket for Phase 3 of development

-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (HBASE-18892) B&R testing
[ https://issues.apache.org/jira/browse/HBASE-18892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mallikarjun resolved HBASE-18892.
---
Resolution: Resolved

> B&R testing
> ---
>
> Key: HBASE-18892
> URL: https://issues.apache.org/jira/browse/HBASE-18892
> Project: HBase
> Issue Type: Umbrella
> Reporter: Vladimir Rodionov
> Assignee: Vladimir Rodionov
> Priority: Major
>
> Backup & Restore testing umbrella, for all bugs discovered

-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HBASE-25501) Backup not using parameters such as bandwidth, workers, etc while exporting snapshot
Mallikarjun created HBASE-25501:
---
Summary: Backup not using parameters such as bandwidth, workers, etc while exporting snapshot
Key: HBASE-25501
URL: https://issues.apache.org/jira/browse/HBASE-25501
Project: HBase
Issue Type: Bug
Reporter: Mallikarjun
Assignee: Mallikarjun

-- This message was sent by Atlassian Jira (v8.3.4#803005)
Re: DISCUSS: Remove hbase-backup from master?
Yea. I have noticed that. I have some patches ready and have to add unit tests. Will start raising in a couple of weeks time. --- Mallikarjun On Thu, Dec 24, 2020 at 7:48 AM 张铎(Duo Zhang) wrote: > The UTs in the backup module are easy to fail with NPE, I've seen this in > several pre commit results. > > Any progress here? > > mallik.v.ar...@gmail.com 于2020年12月3日周四 > 上午9:58写道: > > > On Tue, Dec 1, 2020 at 11:26 PM Stack wrote: > > > > > On Tue, Dec 1, 2020 at 6:38 AM mallik.v.ar...@gmail.com < > > > mallik.v.ar...@gmail.com> wrote: > > > > > > > On Tue, Dec 1, 2020 at 7:34 PM Sean Busbey > wrote: > > > > > > > > > One reason for moving it out of core is that at some point we will > > have > > > > > hbase 3 releases. Right now hbase 3 targeted things are in the > master > > > > > branch. If the feature is not sufficiently ready to be in a release > > > then > > > > > when the time for HBase 3 releases comes we'll have to do work to > > pull > > > it > > > > > out of the release branch. If the feature needs to stay in the core > > > repo > > > > > that will necessitate that hbase 3 have a branch distinct from the > > > master > > > > > branch (which may or may not happen anyways). At that point we risk > > > > having > > > > > to do the work to remove the feature from a release branch again > come > > > > hbase > > > > > 4. > > > > > > > > > > > > > > > I think a lot of this is moot if you'd like to start providing > > patches > > > > for > > > > > the things you've needed to use it. If there are gaps that you > think > > > can > > > > > trip folks up maybe we could label it "experimental" to provide > > better > > > > > context for others. > > > > > > > > > > > > > I will start putting effort in maintaining this feature. > > > > > > > > > > > FYI, seems like backup has a bunch of flakey tests. Might be worth > > looking > > > at. > > > > > > > Any reference I can get. They seem fine when I run tests. 
> > > > > > > Thanks, > > > S > > > > > > > > > > > > > > > > > > > > > > > > > > On Tue, Dec 1, 2020, 07:48 mallik.v.ar...@gmail.com < > > > > > mallik.v.ar...@gmail.com> wrote: > > > > > > > > > > > On Tue, Dec 1, 2020 at 12:14 PM Stack wrote: > > > > > > > > > > > > > On Mon, Nov 30, 2020 at 9:30 PM mallik.v.ar...@gmail.com < > > > > > > > mallik.v.ar...@gmail.com> wrote: > > > > > > > > > > > > > > > Inline > > > > > > > > > > > > > > > > On Tue, Dec 1, 2020, 10:10 AM Andrew Purtell < > > > > > andrew.purt...@gmail.com > > > > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > We are allowed to debate if this should be in the tree or > > not. > > > > > Given > > > > > > > the > > > > > > > > > lack of interest, as evidenced by incomplete state, lack of > > > > > release, > > > > > > > and > > > > > > > > > lack of contribution, it is more than fair to discuss > > removal. > > > > > > > > > > > > > > > > > > Here is my take: First of all, it is not released. There is > > no > > > > > > implied > > > > > > > > > roadmap or support. Second, there do not seem to be any > > active > > > > > > > > maintainers > > > > > > > > > or volunteers as such. Third, unless someone shows up with > > more > > > > > > patches > > > > > > > > for > > > > > > > > > it there will be no polish or maturing, there can be no > > > > > expectations > > > > > > of > > > > > > > > > further improvement. > > > > > > > > > > > > > > > > > > That said, this is open source. New code contribution will > > > change > > > > >
[jira] [Created] (HBASE-24931) Candidate Generator helper Action method ignoring 0th index region
Mallikarjun created HBASE-24931:
---
Summary: Candidate Generator helper Action method ignoring 0th index region
Key: HBASE-24931
URL: https://issues.apache.org/jira/browse/HBASE-24931
Project: HBase
Issue Type: Bug
Components: Balancer
Reporter: Mallikarjun
Assignee: Mallikarjun

Balancer candidate generators such as `LocalityBasedCandidateGenerator`, `RegionReplicaCandidateGenerator`, `RegionReplicaRackCandidateGenerator`, etc. use the helper method `getAction` to generate an action, and it ignores the 0th index of `fromRegion` and `toRegion`.

{code:java}
protected BaseLoadBalancer.Cluster.Action getAction(int fromServer, int fromRegion,
    int toServer, int toRegion) {
  if (fromServer < 0 || toServer < 0) {
    return BaseLoadBalancer.Cluster.NullAction;
  }
  if (fromRegion > 0 && toRegion > 0) {
    return new BaseLoadBalancer.Cluster.SwapRegionsAction(fromServer, fromRegion,
      toServer, toRegion);
  } else if (fromRegion > 0) {
    return new BaseLoadBalancer.Cluster.MoveRegionAction(fromRegion, fromServer, toServer);
  } else if (toRegion > 0) {
    return new BaseLoadBalancer.Cluster.MoveRegionAction(toRegion, toServer, fromServer);
  } else {
    return BaseLoadBalancer.Cluster.NullAction;
  }
}
{code}

Is this unintentional, or is there some particular reason?

-- This message was sent by Atlassian Jira (v8.3.4#803005)
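The suspected off-by-one can be demonstrated outside HBase with a minimal Python transcription of the same branch logic. This is a sketch, not the real HBase classes: `get_action` and the tuple "actions" are illustrative, and the `strict_gt` flag simply toggles between the reported `> 0` comparison and a `>= 0` alternative.

```python
# Transcription of getAction's branch structure to show that a region
# at index 0 never produces an action under the `> 0` checks.
def get_action(from_server, from_region, to_server, to_region, strict_gt=True):
    valid = (lambda r: r > 0) if strict_gt else (lambda r: r >= 0)
    if from_server < 0 or to_server < 0:
        return ("null",)
    if valid(from_region) and valid(to_region):
        return ("swap", from_server, from_region, to_server, to_region)
    elif valid(from_region):
        return ("move", from_region, from_server, to_server)
    elif valid(to_region):
        return ("move", to_region, to_server, from_server)
    return ("null",)

# Region index 0 chosen as the move candidate (toRegion = -1 meaning "none"):
# the strict `> 0` comparison silently drops the action...
assert get_action(1, 0, 2, -1) == ("null",)
# ...whereas a `>= 0` comparison would move region 0 as expected.
assert get_action(1, 0, 2, -1, strict_gt=False) == ("move", 0, 1, 2)
```

Note that with `>= 0` the sentinel value -1 still falls through to the null action, so only the treatment of index 0 changes in this sketch.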
[jira] [Created] (HBASE-24157) RSGroup Aware Export Snapshot
Mallikarjun created HBASE-24157:
---
Summary: RSGroup Aware Export Snapshot
Key: HBASE-24157
URL: https://issues.apache.org/jira/browse/HBASE-24157
Project: HBase
Issue Type: Improvement
Components: snapshots
Reporter: Mallikarjun

Exporting a snapshot to a destination cluster that is RSGroup aware can lead to data spilling outside the destination RSGroup. This improvement aims to make export snapshot aware of the destination RSGroup's nodes and place the data only on those boxes.

-- This message was sent by Atlassian Jira (v8.3.4#803005)