date:20200513

[jira] [Created] (HBASE-24363) Fix failed ut TestAssignmentManagerMetrics for branch-2.2

2020-05-13 Thread Guanghao Zhang (Jira)

Guanghao Zhang created HBASE-24363:
--

 Summary: Fix failed ut TestAssignmentManagerMetrics for branch-2.2
 Key: HBASE-24363
 URL: https://issues.apache.org/jira/browse/HBASE-24363
 Project: HBase
  Issue Type: Bug
Affects Versions: 2.2.4
Reporter: Guanghao Zhang






--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Reopened] (HBASE-24080) [flakey test] TestRegionReplicaFailover.testSecondaryRegionKill fails.

2020-05-13 Thread Guanghao Zhang (Jira)



 [ 
https://issues.apache.org/jira/browse/HBASE-24080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang reopened HBASE-24080:


Reopening to backport this to branch-2.2.

> [flakey test] TestRegionReplicaFailover.testSecondaryRegionKill fails.
> --
>
> Key: HBASE-24080
> URL: https://issues.apache.org/jira/browse/HBASE-24080
> Project: HBase
>  Issue Type: Test
>  Components: read replicas
>Affects Versions: 3.0.0-alpha-1, 2.3.0
>Reporter: Huaxiang Sun
>Assignee: Huaxiang Sun
>Priority: Major
> Fix For: 3.0.0-alpha-1, 2.3.0
>
>
> Ran into the following error locally:
> {code:java}
> ---
> Test set: org.apache.hadoop.hbase.regionserver.TestRegionReplicaFailover
> ---
> Tests run: 6, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 97.391 s <<< FAILURE! - in org.apache.hadoop.hbase.regionserver.TestRegionReplicaFailover
> org.apache.hadoop.hbase.regionserver.TestRegionReplicaFailover.testSecondaryRegionKill  Time elapsed: 28.682 s  <<< FAILURE!
> java.lang.AssertionError: Failed verification of row :0
>         at org.junit.Assert.fail(Assert.java:89)
>         at org.junit.Assert.assertTrue(Assert.java:42)
>         at org.apache.hadoop.hbase.HBaseTestingUtility.verifyNumericRows(HBaseTestingUtility.java:2407)
>         at org.apache.hadoop.hbase.regionserver.TestRegionReplicaFailover.testSecondaryRegionKill(TestRegionReplicaFailover.java:240)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:498)
>         at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>         at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>         at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>         at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>         at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>         at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>         at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:61)
>         at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
>         at org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
>         at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
>         at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
>         at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
>         at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
>         at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
>         at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
>         at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
>         at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
>         at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:288)
>         at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:282)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>         at java.lang.Thread.run(Thread.java:748)
> {code}




[jira] [Created] (HBASE-24364) [Chaos Monkey] Invalid data block encoding in ChangeEncodingAction

2020-05-13 Thread Yi Mei (Jira)

Yi Mei created HBASE-24364:
--

 Summary: [Chaos Monkey] Invalid data block encoding in 
ChangeEncodingAction
 Key: HBASE-24364
 URL: https://issues.apache.org/jira/browse/HBASE-24364
 Project: HBase
  Issue Type: Bug
Reporter: Yi Mei


I found the following exception when I ran ITBLL:
{code:java}
2020-05-12 11:43:14,201 WARN  [ChaosMonkey] policies.Policy: Exception performing action:
java.lang.IllegalArgumentException: There is no data block encoder for given id '6'
        at org.apache.hadoop.hbase.io.encoding.DataBlockEncoding.getEncodingById(DataBlockEncoding.java:168)
        at org.apache.hadoop.hbase.chaos.actions.ChangeEncodingAction.lambda$perform$0(ChangeEncodingAction.java:50)
        at org.apache.hadoop.hbase.chaos.actions.Action.modifyAllTableColumns(Action.java:356)
        at org.apache.hadoop.hbase.chaos.actions.ChangeEncodingAction.perform(ChangeEncodingAction.java:48)
        at org.apache.hadoop.hbase.chaos.policies.PeriodicRandomActionPolicy.runOneIteration(PeriodicRandomActionPolicy.java:59)
        at org.apache.hadoop.hbase.chaos.policies.PeriodicPolicy.run(PeriodicPolicy.java:41)
        at java.lang.Thread.run(Thread.java:748)
{code}
This happens because PREFIX_TREE (id 6) has been removed from DataBlockEncoding:
{code:java}
/** Disable data block encoding. */
NONE(0, null),
// id 1 is reserved for the BITSET algorithm to be added later
PREFIX(2, "org.apache.hadoop.hbase.io.encoding.PrefixKeyDeltaEncoder"),
DIFF(3, "org.apache.hadoop.hbase.io.encoding.DiffKeyDeltaEncoder"),
FAST_DIFF(4, "org.apache.hadoop.hbase.io.encoding.FastDiffDeltaEncoder"),
// id 5 is reserved for the COPY_KEY algorithm for benchmarking
// COPY_KEY(5, "org.apache.hadoop.hbase.io.encoding.CopyKeyDataBlockEncoder"),
// PREFIX_TREE(6, "org.apache.hadoop.hbase.codec.prefixtree.PrefixTreeCodec"),
ROW_INDEX_V1(7, "org.apache.hadoop.hbase.io.encoding.RowIndexCodecV1");
{code}
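
One way to avoid hitting a removed id is to choose among the enum constants that actually exist instead of drawing a numeric id. The following is only a minimal sketch under that assumption: `Encoding` is a hypothetical stand-in for DataBlockEncoding (the ids are copied from the snippet above), and `pickRandom` is an illustrative helper, not the code in ChangeEncodingAction.

```java
import java.util.Random;

class RandomEncodingSketch {
  // Hypothetical stand-in for DataBlockEncoding; ids copied from the snippet above.
  enum Encoding {
    NONE(0), PREFIX(2), DIFF(3), FAST_DIFF(4), ROW_INDEX_V1(7);

    final int id;

    Encoding(int id) {
      this.id = id;
    }

    // Mirrors the failure mode in the stack trace: unknown ids are rejected.
    static Encoding byId(int id) {
      for (Encoding e : values()) {
        if (e.id == id) {
          return e;
        }
      }
      throw new IllegalArgumentException(
          "There is no data block encoder for given id '" + id + "'");
    }
  }

  // Pick uniformly among the constants that exist, so gaps in the id
  // space (1, 5 and 6 here) can never be selected.
  static Encoding pickRandom(Random random) {
    Encoding[] all = Encoding.values();
    return all[random.nextInt(all.length)];
  }

  public static void main(String[] args) {
    Random random = new Random();
    for (int i = 0; i < 5; i++) {
      System.out.println(pickRandom(random));
    }
  }
}
```

Indexing into `values()` keeps the selection correct even if more ids are retired later, at the cost of no longer exercising the lookup-by-id path.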




[jira] [Resolved] (HBASE-24309) Avoid introducing log4j and slf4j-log4j dependencies for modules other than hbase-assembly

2020-05-13 Thread Duo Zhang (Jira)



 [ 
https://issues.apache.org/jira/browse/HBASE-24309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang resolved HBASE-24309.
---
Hadoop Flags: Reviewed
  Resolution: Fixed

Pushed to branch-2.3+.

Thanks [~stack] for reviewing.

> Avoid introducing log4j and slf4j-log4j dependencies for modules other than 
> hbase-assembly
> --
>
> Key: HBASE-24309
> URL: https://issues.apache.org/jira/browse/HBASE-24309
> Project: HBase
>  Issue Type: Sub-task
>  Components: logging, pom
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 3.0.0-alpha-1, 2.3.0
>
>
> In general, a library should not force downstream users to use a 
> specific logging framework, and this is why there is a slf4j library.
> For HBase, since we also publish the testing-util module, which depends on 
> almost all other submodules, we should not introduce logging dependencies other 
> than slf4j-api in these modules. We should only add log4j dependencies in 
> hbase-assembly and ship it with our binary distribution.
> This is also important for switching to log4j2.




[jira] [Resolved] (HBASE-24155) When running the tests, a tremendous number of connections are put into TIME_WAIT.

2020-05-13 Thread Mark Robert Miller (Jira)



 [ 
https://issues.apache.org/jira/browse/HBASE-24155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Robert Miller resolved HBASE-24155.

Resolution: Information Provided

Man, it took me a long time to finally see a lot of what was going on here.

It mostly seems to be HDFS short-circuit read socket polling management, which 
you can call me not a fan of, with defaults that you can call me an 
anti-fan of. Couple that with some HBase fail-and-fast-retry behavior, especially 
in snapshotting or splitting, and, well, the number of potential 
sockets (many without a proper TCP lifecycle) is just one part of the 
resulting fun.

> When running the tests, a tremendous number of connections are put into 
> TIME_WAIT.
> --
>
> Key: HBASE-24155
> URL: https://issues.apache.org/jira/browse/HBASE-24155
> Project: HBase
>  Issue Type: Test
>  Components: test
>Reporter: Mark Robert Miller
>Priority: Major
>
> When you run the test suite and monitor the number of connections in 
> TIME_WAIT, it appears that a very large number of connections do not end up 
> with a proper connection close lifecycle or perhaps proper reuse.
> Given that connections can stay in TIME_WAIT for 1-4 minutes depending on OS/Env, 
> running the tests faster or with more tests in parallel increases the 
> TIME_WAIT connection buildup. Some tests spin up a very, very large number of 
> connections, and if the wrong ones run at the same time, this can also greatly 
> increase the number of connections put into TIME_WAIT. This can have a 
> dramatic effect on performance (as it can take longer to create a new 
> connection) or cause tests to flat-out fail or time out.
> In my experience, a much, much smaller number of connections in a test suite 
> would end up in TIME_WAIT when connection handling is all correct.
> Notes to come in comments below.




[jira] [Reopened] (HBASE-24327) TestMasterShutdown#testMasterShutdownBeforeStartingAnyRegionServer can fail with retries exhausted on an admin call.

2020-05-13 Thread Viraj Jasani (Jira)



 [ 
https://issues.apache.org/jira/browse/HBASE-24327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani reopened HBASE-24327:
--

> TestMasterShutdown#testMasterShutdownBeforeStartingAnyRegionServer can fail 
> with retries exhausted on an admin call.
> 
>
> Key: HBASE-24327
> URL: https://issues.apache.org/jira/browse/HBASE-24327
> Project: HBase
>  Issue Type: Test
>  Components: test
>Reporter: Mark Robert Miller
>Assignee: Viraj Jasani
>Priority: Major
> Fix For: 3.0.0-alpha-1, 2.3.0
>
>





[jira] [Created] (HBASE-24365) MetricsTableWrapperAggregateImpl runnable fails due to exception and never runs

2020-05-13 Thread ramkrishna.s.vasudevan (Jira)

ramkrishna.s.vasudevan created HBASE-24365:
--

 Summary: MetricsTableWrapperAggregateImpl runnable fails due to 
exception and never runs
 Key: HBASE-24365
 URL: https://issues.apache.org/jira/browse/HBASE-24365
 Project: HBase
  Issue Type: Bug
  Components: metrics
Affects Versions: 2.2.4, 2.1.9, 3.0.0-alpha-1
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan


MetricsTableWrapperAggregateImpl has a thread that periodically updates the 
values.
It seems that once the region server is online and meta is assigned, if 
there are no store files, the Optional values read here
{code}
mt.maxStoreFileAge = Math.max(mt.maxStoreFileAge, store.getMaxStoreFileAge().getAsLong());
mt.minStoreFileAge = Math.min(mt.minStoreFileAge, store.getMinStoreFileAge().getAsLong());
{code}
are empty, so getAsLong() throws NoSuchElementException and the thread dies. 
The values are then never updated again for as long as the RS is alive. 
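
A guarded update along these lines would avoid the exception. This is a minimal sketch, not the actual patch: `MetricsTableValues` and `update` are hypothetical names, the initial field values are assumptions, and it presumes the store getters return `java.util.OptionalLong`, as the `getAsLong()` calls above suggest.

```java
import java.util.OptionalLong;

class StoreFileAgeSketch {
  // Hypothetical holder for the aggregated values; field names follow the snippet above.
  static class MetricsTableValues {
    long maxStoreFileAge = 0;
    long minStoreFileAge = Long.MAX_VALUE;
  }

  // Fold one store's ages into the aggregate, skipping values that are absent
  // instead of calling getAsLong() on an empty OptionalLong.
  static void update(MetricsTableValues mt, OptionalLong maxAge, OptionalLong minAge) {
    maxAge.ifPresent(v -> mt.maxStoreFileAge = Math.max(mt.maxStoreFileAge, v));
    minAge.ifPresent(v -> mt.minStoreFileAge = Math.min(mt.minStoreFileAge, v));
  }

  public static void main(String[] args) {
    MetricsTableValues mt = new MetricsTableValues();
    // A store with no files: nothing is thrown and the aggregate is unchanged.
    update(mt, OptionalLong.empty(), OptionalLong.empty());
    update(mt, OptionalLong.of(500L), OptionalLong.of(100L));
    System.out.println(mt.maxStoreFileAge + " " + mt.minStoreFileAge); // prints "500 100"
  }
}
```

Independently of this guard, catching unexpected exceptions inside the periodic runnable would keep one bad iteration from ending all later updates.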




Re: Recent experience with Chaos Monkey?

2020-05-13 Thread Nick Dimiduk

To follow up, I've needed to apply these two patches to get my local
environment running.

https://issues.apache.org/jira/browse/HBASE-24360
https://issues.apache.org/jira/browse/HBASE-24361

On Tue, May 12, 2020 at 11:52 AM Nick Dimiduk  wrote:

> Thanks Zach.
>
> > It actually performs even worse in this case in my experience since
> Chaos monkey can consider the failure mechanism to have failed (and
> eventually times out) because the process is too quick to recover (or the
> recovery fails because the process is already running). The only way I was
> able to get it to run was to disable the process that automatically
> restarts killed processes in my system.
>
> Interesting observation.
>
> > This brings up a discussion on whether the ITBLL (or whatever process)
> should even continue if either a killing or recovering action failed.
> I would argue that invalidates the entire test, but it might not be obvious
> it failed unless you were watching the logs as it went.
>
> I'm coming to a similar conclusion -- failure in the orchestration layer
> should invalidate the test.
>
> On Thu, May 7, 2020 at 5:27 PM Zach York 
> wrote:
>
>> I should note that I was using HBase 2.2.3 to test.
>>
>> On Thu, May 7, 2020 at 5:26 PM Zach York 
>> wrote:
>>
>> > I recently ran ITBLL with Chaos monkey[1] against a real HBase
>> > installation (EMR). I initially tried to run it locally, but couldn't
>> get
>> > it working and eventually gave up.
>> >
>> > > So I'm curious if this matches others' experience running the monkey.
>> For
>> > example, do you have an environment more resilient than mine, one where
>> an
>> > external actor is restarting downed processes without the monkey
>> action's
>> > involvement?
>> >
>> > It actually performs even worse in this case in my experience since
>> Chaos
>> > monkey can consider the failure mechanism to have failed (and eventually
>> > times out)
>> > because the process is too quick to recover (or the recovery fails
>> because
>> > the process is already running). The only way I was able to get it to
>> run
>> > was to disable
>> > the process that automatically restarts killed processes in my system.
>> >
>> > One other thing I hit was the validation for a suspended process was
>> > incorrect so if chaos monkey tried to suspend the process the run would
>> > fail. I'll put up a JIRA for that.
>> >
>> > This brings up a discussion on whether the ITBLL (or whatever process)
>> > should even continue if either a killing or recovering action failed. I
>> > would argue that invalidates the entire test,
>> > but it might not be obvious it failed unless you were watching the logs
>> as
>> > it went.
>> >
>> > Thanks,
>> > Zach
>> >
>> >
>> > [1] sudo -u hbase hbase
>> > org.apache.hadoop.hbase.test.IntegrationTestBigLinkedList -m
>> serverKilling
>> > loop 4 2 100 ${RANDOM} 10
>> >
>> > On Thu, May 7, 2020 at 5:05 PM Nick Dimiduk 
>> wrote:
>> >
>> >> Hello,
>> >>
>> >> Does anyone have recent experience running Chaos Monkey? Are you
>> running
>> >> against an external cluster, or one of the other modes? What monkey
>> >> factory
>> >> are you using? Any property overrides? A non-default ClusterManager?
>> >>
>> >> I'm trying to run ITBLL with chaos against branch-2.3 and I'm not
>> having
>> >> much luck. My environment is an "external" cluster, 4 racks of 4 hosts
>> >> each, the relatively simple "serverKilling" factory with
>> >> `rolling.batch.suspend.rs.ratio = 0.0`. So, randomly kill various hosts
>> >> on various schedules, plus some balancer play mixed in; no process
>> >> suspension.
>> >>
>> >> Running for any length of time (~30 minutes) the chaos monkey
>> eventually
>> >> terminates between a majority and all of the hosts in the cluster. My
>> logs
>> >> are peppered with warnings such as the below. There are other
>> variants. As
>> >> far as I can tell, actions are intended to cause some harm and then
>> >> restore
>> >> state after themselves. In practice, the harm is successful but
>> >> restoration
>> >> rarely succeeds. Mostly these actions are "safeguarded" by this 60-sec
>> >> timeout. The result is a methodical termination of the cluster.
>> >>
>> >> So I'm curious if this matches others' experience running the monkey.
>> For
>> >> example, do you have an environment more resilient than mine, one
>> where an
>> >> external actor is restarting downed processes without the monkey
>> action's
>> >> involvement? Is the monkey designed to run only in such an environment?
>> >> These timeouts are configurable; are you cranking them way up?
>> >>
>> >> Any input you have would be greatly appreciated. This is my last major
>> >> action item blocking initial 2.3.0 release candidates.
>> >>
>> >> Thanks,
>> >> Nick
>> >>
>> >> 20/05/05 21:19:29 WARN policies.Policy: Exception occurred during
>> >> performing action: java.io.IOException: did timeout 6ms waiting for
>> >> region server to start: host-a.example.com
>> >> at
>> >>
>> >>
>> org.apac

[jira] [Resolved] (HBASE-24318) Create-release scripts fixes and enhancements

2020-05-13 Thread Matthew Foley (Jira)



 [ 
https://issues.apache.org/jira/browse/HBASE-24318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthew Foley resolved HBASE-24318.
---
Fix Version/s: 3.0.0-alpha-1
   Resolution: Fixed

> Create-release scripts fixes and enhancements
> -
>
> Key: HBASE-24318
> URL: https://issues.apache.org/jira/browse/HBASE-24318
> Project: HBase
>  Issue Type: Improvement
>  Components: create-release
>Affects Versions: 3.0.0-alpha-1, 2.3.0
> Environment: Linux Docker container, or Mac OS X without Docker
>Reporter: Matthew Foley
>Assignee: Matthew Foley
>Priority: Major
> Fix For: 3.0.0-alpha-1
>
>
> The create-release tools are a set of scripts that promote best practice in 
> making Apache releases, by automating most of the steps and running them 
> under Docker for reliability. However, the current state of the scripts has 
> many bugs and ambiguities.  
> The proposed PR cleans up the code and clarifies usage. It enables:
> - Clear statement of the four steps, which are now called `tag`, 
> `publish-dist`, `publish-snapshot`, and `publish-release` (the latter two 
> being mutually exclusive alternatives).
> - Ability to do the three tag-dist-release steps with a single command, or do 
> any of the steps singly. (Running singly had bugs and unfulfilled 
> dependencies before.)
> - Ability to do a reliable and useful "dry run" of all steps or each step, 
> and chain together the tag step with publish steps in a dry run.
> - Ability to run any or all steps correctly in Docker or outside of Docker, 
> on Linux and Mac.
> - Cleaned up all `shellcheck` errors in the scripts, and removed ambiguities 
> and redundancies in the many environment variables used.
> In addition, the changes move the code toward being more general / less 
> HBase-specific, so it can be run on any Apache project (while still 
> accommodating HBase-specific features regarding how sub-projects are named 
> and organized in Jira and release repos). In future I propose to take it 
> further along that path, and move create-release into Yetus (recognizing that 
> this create-release code has been passed between a couple other projects 
> already).
> These changes have NO IMPACT on HBase functionality.




[ANNOUNCE] New HBase committer Wei-Chiu Chuang

2020-05-13 Thread Sean Busbey

Folks,

On behalf of the Apache HBase PMC I am pleased to announce that Wei-Chiu
Chuang has accepted the PMC's invitation to become a committer on the
project.

We appreciate all of the great contributions Wei-Chiu has made to the
community thus far and we look forward to his continued involvement.

Allow me to be the first to congratulate Wei-Chiu on his new role!

thanks,
busbey

Re: [ANNOUNCE] New HBase committer Wei-Chiu Chuang

2020-05-13 Thread Andrew Purtell

Congratulations and welcome Wei-Chiu!


-- 
Best regards,
Andrew

Words like orphans lost among the crosstalk, meaning torn from truth's
decrepit hands
   - A23, Crosstalk

Re: [ANNOUNCE] New HBase committer Wei-Chiu Chuang

2020-05-13 Thread Bharath Vissapragada

Congrats, Wei-Chiu.

PR linking broken...again

2020-05-13 Thread Bharath Vissapragada

Seems like an infra issue across multiple projects. From what I can tell,
email notifications for new PRs have also not been working for the past two
days.

Please make sure to link the PRs manually until it is fixed and add
specific reviewers if you'd like to get your PR noticed (or send out an
email if urgent).

Just FYI.

[jira] [Reopened] (HBASE-24190) Case-sensitive use of configuration parameter hbase.security.authentication

2020-05-13 Thread Nick Dimiduk (Jira)



 [ 
https://issues.apache.org/jira/browse/HBASE-24190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nick Dimiduk reopened HBASE-24190:
--

The commits applied do not conform to the project requirements for including a 
Jira ticket and matching between the commit title and jira summary. Responsible 
committer, please revert and reapply everywhere. Thanks.

> Case-sensitive use of configuration parameter hbase.security.authentication
> ---
>
> Key: HBASE-24190
> URL: https://issues.apache.org/jira/browse/HBASE-24190
> Project: HBase
>  Issue Type: Bug
>Reporter: Yuanliang Zhang
>Assignee: Rushabh Shah
>Priority: Major
> Fix For: 3.0.0-alpha-1, 2.3.0, 1.7.0, 2.1.10, 1.4.14, 2.2.5
>
>
> In HBASE-20586 (https://issues.apache.org/jira/browse/HBASE-20586),
> commit [https://github.com/apache/hbase/commit/cd61bcc0],
> the code added to
> [SyncTable.java|https://github.com/apache/hbase/commit/cd61bcc0#diff-d1b79635f33483bf6226609e91fd1cc3]
> checks *hbase.security.authentication* case-sensitively, so setting it to
> “KERBEROS” does not take effect.
>  
> {code:java}
> private void initCredentialsForHBase(String zookeeper, Job job) throws IOException {
>   Configuration peerConf =
>       HBaseConfiguration.createClusterConf(job.getConfiguration(), zookeeper);
>   if (peerConf.get("hbase.security.authentication").equals("kerberos")) {
>     TableMapReduceUtil.initCredentialsForCluster(job, peerConf);
>   }
> }
> {code}
>  
> However, in the current code base, all other uses of *hbase.security.authentication* 
> are case-insensitive, for example in *MasterFileSystem.java*:
>  
> {code:java}
> public MasterFileSystem(Configuration conf) throws IOException{   
>   ...   
>   this.isSecurityEnabled = 
> "kerberos".equalsIgnoreCase(conf.get("hbase.security.authentication"));  
>   ... 
> }
> {code}
>  
> The doc in the GitHub repo is also misleading, since it gives the values in upper case:
> {quote}As a distributed database, HBase must be able to authenticate users 
> and HBase services across an untrusted network. Clients and HBase services 
> are treated equivalently in terms of authentication (and this is the only 
> time we will draw such a distinction).
> There are currently three modes of authentication which are supported by 
> HBase today via the configuration property {{hbase.security.authentication}}
> {{1.SIMPLE}}
> {{2.KERBEROS}}
> {{3.TOKEN}}
> {quote}
> Users may misconfigure the parameter because of this case-sensitivity problem.
> *How To Fix*
> Use the *equalsIgnoreCase* API consistently wherever 
> *hbase.security.authentication* is read, or make the requirement clear in the docs.
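
The case-insensitive comparison proposed in the issue can be sketched as below. `AuthModeSketch` and `isKerberos` are hypothetical names used only for illustration; the comparison itself is the same `equalsIgnoreCase` idiom shown in the MasterFileSystem excerpt.

```java
class AuthModeSketch {
  // Returns true when the configured value names Kerberos, regardless of letter case.
  // Calling equalsIgnoreCase on the literal also tolerates a null (unset) value,
  // whereas value.equals("kerberos") would throw a NullPointerException.
  static boolean isKerberos(String configuredValue) {
    return "kerberos".equalsIgnoreCase(configuredValue);
  }

  public static void main(String[] args) {
    System.out.println(isKerberos("kerberos")); // true
    System.out.println(isKerberos("KERBEROS")); // true
    System.out.println(isKerberos("simple"));   // false
    System.out.println(isKerberos(null));       // false
  }
}
```

Routing every read of the property through one helper like this would keep the behavior consistent across call sites.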




Re: [ANNOUNCE] New HBase committer Wei-Chiu Chuang

2020-05-13 Thread Zach York

Congratulations Wei-Chiu!

[jira] [Resolved] (HBASE-24093) Exclude H2 from the build workers pool

2020-05-13 Thread Nick Dimiduk (Jira)



 [ 
https://issues.apache.org/jira/browse/HBASE-24093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nick Dimiduk resolved HBASE-24093.
--
Resolution: Won't Fix

I think the issues we were trying to side-step were resolved by Infra 
increasing the jenkins worker heap allocation.

> Exclude H2 from the build workers pool
> --
>
> Key: HBASE-24093
> URL: https://issues.apache.org/jira/browse/HBASE-24093
> Project: HBase
>  Issue Type: Task
>  Components: build
>Reporter: Nick Dimiduk
>Priority: Major
>
> Tracking INFRA-20025, H2 keeps coming up as impacted. Let's exclude it from 
> our build workers while Infra investigates the hardware.




Re: [ANNOUNCE] New HBase committer Wei-Chiu Chuang

2020-05-13 Thread Rushabh Shah

Congratulations Wei-Chiu !!


Rushabh Shah

   - Software Engineering SMTS | Salesforce
   -
  - Mobile: 213 422 9052



Re: [ANNOUNCE] New HBase committer Wei-Chiu Chuang

2020-05-13 Thread Nick Dimiduk

Thank you Wei-Chiu for your contributions! Looking forward to continuing to
work together :)

Re: [ANNOUNCE] New HBase committer Wei-Chiu Chuang

2020-05-13 Thread Esteban Gutierrez

Congratulations Wei-Chiu! and welcome aboard!

Esteban.

--
Cloudera, Inc.



[jira] [Created] (HBASE-24366) Document how to move WebUI access log entries to a separate log file

2020-05-13 Thread Nick Dimiduk (Jira)

Nick Dimiduk created HBASE-24366:


 Summary: Document how to move WebUI access log entries to a 
separate log file
 Key: HBASE-24366
 URL: https://issues.apache.org/jira/browse/HBASE-24366
 Project: HBase
  Issue Type: Task
  Components: master, regionserver
Affects Versions: 2.3.0
Reporter: Nick Dimiduk


I've noticed that after a recent commit, we now have WebUI access log lines 
going into our service log file. The log entries go to a logger called 
{{http.requests.regionserver}}, and after the preamble of timestamp, log level, 
and logger, they appear to conform to the 
[CLF|https://en.wikipedia.org/wiki/Common_Log_Format] specification. Tools 
designed for parsing HTTP logs usually expect just the CLF entries, with no 
preprocessing needed.

We should document how to configure the service to log these entries into a 
separate log file.
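
As a starting point, such a configuration might look like the following log4j 1.x properties fragment. It is a sketch under several assumptions: the appender name `AccessRFA`, the `access.log` file name, and the rollover limits are illustrative; `hbase.log.dir` is assumed to be set as in the stock log4j.properties; and `http.requests` is used as the parent of the `http.requests.regionserver` logger named above.

```properties
# Route WebUI access entries to their own appender (assumed logger hierarchy: http.requests.*)
log4j.logger.http.requests=INFO, AccessRFA
# Do not also pass these entries up to the root logger and the service log
log4j.additivity.http.requests=false

# Dedicated rolling file for access entries; name and limits are illustrative
log4j.appender.AccessRFA=org.apache.log4j.RollingFileAppender
log4j.appender.AccessRFA.File=${hbase.log.dir}/access.log
log4j.appender.AccessRFA.MaxFileSize=200MB
log4j.appender.AccessRFA.MaxBackupIndex=10
log4j.appender.AccessRFA.layout=org.apache.log4j.PatternLayout
# Emit only the message so each line is a bare CLF entry
log4j.appender.AccessRFA.layout.ConversionPattern=%m%n
```

Using `%m%n` as the pattern drops the timestamp, level, and logger preamble, which is what lets standard HTTP log tools read the file without preprocessing.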




[DISCUSS] Separate web access logs

2020-05-13 Thread Nick Dimiduk

Heya,

Looks like since HBASE-24310 we have nicely formatted logs of access to our
WebUI endpoints, following a standard web server log format. I think it'll
be a common requirement for environments to process these logs following
standard processes for web servers, so I think we should document how to
separate them out to their own dedicated file, thus HBASE-24366. However,
I'm wondering if we should make the separate file the default
configuration. Are there different answers to this question for 2.x vs. 3.0
releases?

Maybe you're a lurking operator who has opinions about this? Please speak
up! :D

Thanks,
Nick

Re: [DISCUSS] Separate web access logs

2020-05-13 Thread Nick Dimiduk

Having read through a region server log with this feature enabled, looking
to diagnose another issue, I'm going to change my tune. I think having
these messages in the standard server log is too noisy and makes the log even
harder to read. In my case, it looks like automation is pinging the /jmx
endpoint on a regular basis, and we're seeing those interactions here.

So, I propose we change the default configuration to ship these logs to a
dedicated access.log.

On Wed, May 13, 2020 at 2:41 PM Nick Dimiduk  wrote:

> Heya,
>
> Looks like since HBASE-24310 we have nicely formatted logs of access to
> our WebUI endpoints, following a standard web server log format. I think
> it'll be a common requirement for environments to process these logs
> following standard processes for web servers, so I think we should document
> how to separate them out to their own dedicated file, thus HBASE-24366.
> However, I'm wondering if we should make the separate file as the default
> configuration. Are there different answers to this question for 2.x vs. 3.0
> releases?
>
> Maybe you're a lurking operator who has opinions about this? Please speak
> up! :D
>
> Thanks,
> Nick
>

[jira] [Created] (HBASE-24367) ScheduledChore log elapsed timespan in a human-friendly format

2020-05-13 Thread Nick Dimiduk (Jira)

Nick Dimiduk created HBASE-24367:


 Summary: ScheduledChore log elapsed timespan in a human-friendly 
format
 Key: HBASE-24367
 URL: https://issues.apache.org/jira/browse/HBASE-24367
 Project: HBase
  Issue Type: Task
  Components: master, regionserver
Affects Versions: 2.3.0
Reporter: Nick Dimiduk


I noticed this in a log line,

{noformat}
2020-04-23 18:31:14,183 INFO org.apache.hadoop.hbase.ScheduledChore: 
host-a.example.com,16000,1587577999888-ClusterStatusChore average execution 
time: 68488258 ns.
{noformat}

I'm not sure if there's a case when elapsed time in nanoseconds is meaningful 
for these background chores, but we could do a little work before printing the 
number and time unit to truncate precision down to something a little more 
intuitive for operators. This number purports to be an average, so a high level 
of precision isn't necessarily meaningful.

Separately, or while we're here, if we think an operator really cares about the 
performance of this chore, we should print a histogram of elapsed times, rather 
than an opaque average.
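
As a rough illustration of the truncation suggested above, the sketch below picks the coarsest unit in which the value is still at least 1 and prints one decimal place. The class and method names are hypothetical and this is not the ScheduledChore code; the unit labels and the single decimal place are arbitrary choices.

```java
import java.util.Locale;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper for illustration only; not part of HBase. */
final class ChoreDurationSketch {
  private ChoreDurationSketch() {}

  /** Renders a nanosecond duration in the coarsest unit that keeps the value >= 1. */
  static String humanReadable(long nanos) {
    if (nanos < 0) {
      throw new IllegalArgumentException("negative duration: " + nanos);
    }
    if (nanos < TimeUnit.MICROSECONDS.toNanos(1)) {
      return nanos + " ns";
    }
    if (nanos < TimeUnit.MILLISECONDS.toNanos(1)) {
      return String.format(Locale.ROOT, "%.1f us", nanos / 1_000.0);
    }
    if (nanos < TimeUnit.SECONDS.toNanos(1)) {
      return String.format(Locale.ROOT, "%.1f ms", nanos / 1_000_000.0);
    }
    return String.format(Locale.ROOT, "%.1f sec", nanos / 1_000_000_000.0);
  }

  public static void main(String[] args) {
    // The average from the log line quoted above; prints "68.5 ms".
    System.out.println(humanReadable(68_488_258L));
  }
}
```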




Re: [ANNOUNCE] New HBase committer Wei-Chiu Chuang

2020-05-13 Thread Hui Fei

Congratulations Wei-Chiu!

Sean Busbey  wrote on Thu, May 14, 2020 at 3:10 AM:

> Folks,
>
> On behalf of the Apache HBase PMC I am pleased to announce that Wei-Chiu
> Chuang has accepted the PMC's invitation to become a committer on the
> project.
>
> We appreciate all of the great contributions Wei-Chiu has made to the
> community thus far and we look forward to his continued involvement.
>
> Allow me to be the first to congratulate Wei-Chiu on his new role!
>
> thanks,
> busbey
>

Re: [ANNOUNCE] New HBase committer Wei-Chiu Chuang

2020-05-13 Thread Tak-Lon (Stephen) Wu

Congrats Wei-Chiu !

-Stephen

On Wed, May 13, 2020 at 6:05 PM Hui Fei  wrote:

> Congratulations Wei-Chiu!
>
> Sean Busbey  wrote on Thu, May 14, 2020 at 3:10 AM:
>
> > Folks,
> >
> > On behalf of the Apache HBase PMC I am pleased to announce that Wei-Chiu
> > Chuang has accepted the PMC's invitation to become a committer on the
> > project.
> >
> > We appreciate all of the great contributions Wei-Chiu has made to the
> > community thus far and we look forward to his continued involvement.
> >
> > Allow me to be the first to congratulate Wei-Chiu on his new role!
> >
> > thanks,
> > busbey
> >
>
-- 
Sent from Gmail Mobile

Re: [DISCUSS] Separate web access logs

2020-05-13 Thread Duo Zhang

For now, should the default config just disable the access log?

Not sure if it is useful to output these access logs; for me, I do not care
too much about access to the status pages...

Nick Dimiduk  wrote on Thu, May 14, 2020 at 6:20 AM:

> Having read through a region server log with this feature enabled, looking
> to diagnose another issue, I'm going to change my tone. I think having
> these messages in the standard server log is too noisy, makes them even
> harder to read. In my case, it looks like automation is pinging the /jmx
> endpoint on a regular basis, and we're seeing those interaction here.
>
> So, I propose we change the default configuration to ship these logs to a
> dedicated access.log.
>
> On Wed, May 13, 2020 at 2:41 PM Nick Dimiduk  wrote:
>
> > Heya,
> >
> > Looks like since HBASE-24310 we have nicely formatted logs of access to
> > our WebUI endpoints, following a standard web server log format. I think
> > it'll be a common requirement for environments to process these logs
> > following standard processes for web servers, so I think we should document
> > how to separate them out to their own dedicated file, thus HBASE-24366.
> > However, I'm wondering if we should make the separate file as the default
> > configuration. Are there different answers to this question for 2.x vs. 3.0
> > releases?
> >
> > Maybe you're a lurking operator who has opinions about this? Please speak
> > up! :D
> >
> > Thanks,
> > Nick
> >
>

[jira] [Resolved] (HBASE-24333) Backport HBASE-24304 "Separate a hbase-asyncfs module" to branch-2.x

2020-05-13 Thread Duo Zhang (Jira)



 [ 
https://issues.apache.org/jira/browse/HBASE-24333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang resolved HBASE-24333.
---
Resolution: Fixed

> Backport HBASE-24304 "Separate a hbase-asyncfs module" to branch-2.x
> 
>
> Key: HBASE-24333
> URL: https://issues.apache.org/jira/browse/HBASE-24333
> Project: HBase
>  Issue Type: Sub-task
>  Components: build, pom
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 2.3.0
>
>





[jira] [Resolved] (HBASE-24080) [flakey test] TestRegionReplicaFailover.testSecondaryRegionKill fails.

2020-05-13 Thread Guanghao Zhang (Jira)



 [ 
https://issues.apache.org/jira/browse/HBASE-24080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang resolved HBASE-24080.

Resolution: Fixed

> [flakey test] TestRegionReplicaFailover.testSecondaryRegionKill fails.
> --
>
> Key: HBASE-24080
> URL: https://issues.apache.org/jira/browse/HBASE-24080
> Project: HBase
>  Issue Type: Test
>  Components: read replicas
>Affects Versions: 3.0.0-alpha-1, 2.3.0
>Reporter: Huaxiang Sun
>Assignee: Huaxiang Sun
>Priority: Major
> Fix For: 3.0.0-alpha-1, 2.3.0, 2.2.5
>
>
> Run into the following error locally:
> {code:java}
> ---
> Test set: org.apache.hadoop.hbase.regionserver.TestRegionReplicaFailover
> ---
> Tests run: 6, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 97.391 s <<< FAILURE! - in org.apache.hadoop.hbase.regionserver.TestRegionReplicaFailover
> org.apache.hadoop.hbase.regionserver.TestRegionReplicaFailover.testSecondaryRegionKill  Time elapsed: 28.682 s  <<< FAILURE!
> java.lang.AssertionError: Failed verification of row :0
>         at org.junit.Assert.fail(Assert.java:89)
>         at org.junit.Assert.assertTrue(Assert.java:42)
>         at org.apache.hadoop.hbase.HBaseTestingUtility.verifyNumericRows(HBaseTestingUtility.java:2407)
>         at org.apache.hadoop.hbase.regionserver.TestRegionReplicaFailover.testSecondaryRegionKill(TestRegionReplicaFailover.java:240)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:498)
>         at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>         at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>         at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>         at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>         at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>         at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>         at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:61)
>         at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
>         at org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
>         at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
>         at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
>         at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
>         at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
>         at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
>         at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
>         at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
>         at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
>         at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:288)
>         at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:282)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>         at java.lang.Thread.run(Thread.java:748)
> {code}




[jira] [Resolved] (HBASE-24363) Fix failed ut TestAssignmentManagerMetrics for branch-2.2

2020-05-13 Thread Guanghao Zhang (Jira)



 [ 
https://issues.apache.org/jira/browse/HBASE-24363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang resolved HBASE-24363.

Fix Version/s: 2.2.5
   Resolution: Fixed

Pushed to branch-2.2. Thanks [~meiyi] for reviewing.

> Fix failed ut TestAssignmentManagerMetrics for branch-2.2
> -
>
> Key: HBASE-24363
> URL: https://issues.apache.org/jira/browse/HBASE-24363
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.2.4
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
>Priority: Major
> Fix For: 2.2.5
>
>





[jira] [Created] (HBASE-24368) Let HBCKSCP clear 'Unknown Servers', even if RegionStateNode has RegionLocation == null

2020-05-13 Thread Michael Stack (Jira)

Michael Stack created HBASE-24368:
-

 Summary: Let HBCKSCP clear 'Unknown Servers', even if 
RegionStateNode has RegionLocation == null
 Key: HBASE-24368
 URL: https://issues.apache.org/jira/browse/HBASE-24368
 Project: HBase
  Issue Type: Bug
  Components: hbck2
Affects Versions: 2.3.0
Reporter: Michael Stack


This is an incidental issue noticed while in a hole trying to fix up a cluster. The 
'obvious' remediation didn't work. This issue is about addressing that.

HBASE-23594 added a filtering of Regions on the crashed server to handle the 
case where an Assign may be concurrent to the ServerCrashProcedure. To avoid 
double assign, the SCP will skip assign if the RegionStateNode RegionLocation 
is not that of the crashed server.

This is good.

Where it is an obstacle is when a Region is stuck in OPENING state, it 
references an 'Unknown Server' -- a server no longer tracked by the Master -- 
and there is no assign currently in flight. In this case, scheduling a 
ServerCrashProcedure to clean up the reference to the Unknown Server and to get 
the Region reassigned skips out when RegionStateNode in Master has a 
RegionLocation that does not match that of the ServerCrashProcedure, even when 
it is set to null (we set the RegionLocation to null when we fail an assign as 
we might if the server no longer is part of the cluster).

For background, cluster had a RIT. The RIT was a Region failing to open because 
of a missing Reference (Another issue). The Region open would fail with a 
FileNotFoundException. The master would attempt assign and then would fail when 
it went to confirm OPEN, logging the complaint about FNFE asking for operator 
intervention in master logs.

This state was in place for weeks on this particular cluster (a dev cluster not 
under close observation). The cluster had been restarted once or twice so the 
server the Region had once been on was no longer 'known' but it still had an 
entry in the hbase:meta table as last location assigned (The now 'Unknown 
Server').

To fix, I went about the task in the wrong order. I bypassed the long-running 
stuck procedure to terminate it and cleanup 'Procedures and Locks'. Mistake. 
Now there was no longer an assign Procedure for this Region. But I now had a 
Region in OPENING state with a reference to an unknown server with an in-memory 
RegionStateNode whose RegionLocation was null (set null on each failed assign). 
Running catalogjanitor_run and hbck_chore_report had the unknown server show in 
the 'HBCK Report' in the 'Unknown Servers' list. Attempts at assign fail 
because Region is in OPENING state -- you can't assign a Region in OPENING 
state. Scheduling an HBCKSCP via hbck2 scheduleRecoveries always generated the 
below in the logs.

{code}
org.apache.hadoop.hbase.master.procedure.ServerCrashProcedure: pid=157217, 
state=RUNNABLE:SERVER_CRASH_ASSIGN, locked=true; HBCKServerCrashProcedure 
server=unknown_server.example.com,16020,1587577972683, splitWal=true, 
meta=false found a region state=OPENING, location=null, table=bobby_analytics, 
region=1501ea3bd822c1a3e4e6216ea48733bd which is no longer on us 
unknown_server.example.com,16020,1587577972683, give up assigning...
{code}

My workaround was setting the region state to CLOSED with hbck2 and then doing an 
assign with hbck2. At this point I noticed the FNFE. It would be easier if the 
HBCKSCP worked.
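
The filtering described above, and one possible reading of the relaxation this issue asks for, can be sketched as a pair of predicates. This is a simplified model, not the ServerCrashProcedure code: the class and method names are hypothetical, and plain strings stand in for server names.

```java
import java.util.Objects;

/** Hypothetical model for illustration only; not the actual procedure code. */
final class ScpAssignFilterSketch {
  private ScpAssignFilterSketch() {}

  /** Behaviour as described: assign only if the region's location is the crashed server. */
  static boolean shouldAssignStrict(String regionLocation, String crashedServer) {
    return Objects.equals(regionLocation, crashedServer);
  }

  /** Relaxed variant (an assumption about the fix): a null location no longer blocks the assign. */
  static boolean shouldAssignLenient(String regionLocation, String crashedServer) {
    return regionLocation == null || Objects.equals(regionLocation, crashedServer);
  }

  public static void main(String[] args) {
    String unknownServer = "unknown_server.example.com,16020,1587577972683";
    // The stuck Region from this report: state=OPENING, location=null.
    System.out.println("strict:  " + shouldAssignStrict(null, unknownServer));
    System.out.println("lenient: " + shouldAssignLenient(null, unknownServer));
  }
}
```

In this model the relaxed predicate still skips a Region whose location is some other, non-null server, so the double-assign guard from HBASE-23594 stays in place for that case.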




[jira] [Created] (HBASE-24369) Add a new subsection under Hbck Overlaps section

2020-05-13 Thread Huaxiang Sun (Jira)

Huaxiang Sun created HBASE-24369:


 Summary: Add a new subsection under Hbck Overlaps section
 Key: HBASE-24369
 URL: https://issues.apache.org/jira/browse/HBASE-24369
 Project: HBase
  Issue Type: Improvement
  Components: master
Affects Versions: 2.3.0
Reporter: Huaxiang Sun
Assignee: Huaxiang Sun


Right now, the Master's hbck page lists pairs of overlapping regions. 

Running HBCK2 -fixMeta will merge these overlapping regions. However, if one of 
the regions is a newly merged child region and GC has not kicked in yet, the 
overlap is not fixable (the merge will fail).

Users get confused as to why overlaps remain after running HBCK2 -fixMeta. To 
avoid this confusion, I propose a new subsection saying that the master is doing 
GC on these regions and that they cannot be fixed/merged at this moment; users 
should wait until these regions are ready (no longer showing up in this 
subsection) before starting the fix. 




Re: [ANNOUNCE] New HBase committer Wei-Chiu Chuang

2020-05-13 Thread Viraj Jasani

Congratulations Wei-Chiu !!

On 2020/05/13 19:12:38, Sean Busbey  wrote: 
> Folks,
> 
> On behalf of the Apache HBase PMC I am pleased to announce that Wei-Chiu
> Chuang has accepted the PMC's invitation to become a committer on the
> project.
> 
> We appreciate all of the great contributions Wei-Chiu has made to the
> community thus far and we look forward to his continued involvement.
> 
> Allow me to be the first to congratulate Wei-Chiu on his new role!
> 
> thanks,
> busbey
>

[jira] [Created] (HBASE-24370) Avoid aggressive MergeRegion and GCMultipleMergedRegionsProcedure

2020-05-13 Thread Huaxiang Sun (Jira)

Huaxiang Sun created HBASE-24370:


 Summary: Avoid aggressive MergeRegion and 
GCMultipleMergedRegionsProcedure 
 Key: HBASE-24370
 URL: https://issues.apache.org/jira/browse/HBASE-24370
 Project: HBase
  Issue Type: Bug
  Components: master
Reporter: Huaxiang Sun
Assignee: Huaxiang Sun


In prepareMergeRegion 
([https://github.com/apache/hbase/blob/a40a0322a73add68d9cb0579abacdd6a2e41e8fb/hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/MergeTableRegionsProcedure.java#L478]), 
it checks if one of the merged parent regions is a merged child 
region that has not been GCed. If it is ready to GC, it will kick off a 
GCMultipleMergedRegionsProcedure and also start the MergeRegionProcedure. There 
is a race condition here. If MergeRegionProcedure finishes first, it will 
delete the meta row for the merged child region. Then 
GCMultipleMergedRegionsProcedure runs, and because of the newly added check, it 
thinks GC has been done and won't schedule GCRegionProcedure to clean up those 
merged parent regions. The end result is that these merged parent regions are 
left as orphans on the filesystem.

 

[https://github.com/apache/hbase/blob/a40a0322a73add68d9cb0579abacdd6a2e41e8fb/hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/GCMultipleMergedRegionsProcedure.java#L105]

 

The proposed solution is to avoid being so aggressive: if it needs to kick off 
GCMultipleMergedRegionsProcedure, then abort MergeRegionProcedure, and the user 
can try MergeRegionProcedure again later.





[jira] [Resolved] (HBASE-24164) Retain the ReadRequests and WriteRequests of region on web UI after alter table

2020-05-13 Thread Michael Stack (Jira)



 [ 
https://issues.apache.org/jira/browse/HBASE-24164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Stack resolved HBASE-24164.
---
Fix Version/s: 2.3.0
   3.0.0-alpha-1
 Hadoop Flags: Reviewed
   Resolution: Fixed

Merged to branch-2.3+. Thanks for the patch [~filtertip]. I tried to bring the 
patch back to branch-2.2 but it would not apply. Mind putting up another PR for 
2.2? Can reopen this JIRA or make a sub-issue and ping me for commit. Thanks 
for the patience.

> Retain the ReadRequests and WriteRequests of region on web UI after alter 
> table
> ---
>
> Key: HBASE-24164
> URL: https://issues.apache.org/jira/browse/HBASE-24164
> Project: HBase
>  Issue Type: Improvement
>  Components: metrics
>Reporter: Zheng Wang
>Assignee: Zheng Wang
>Priority: Major
> Fix For: 3.0.0-alpha-1, 2.3.0
>
>
> When we alter a table, all of its regions do an RS self move, and the 
> ReadRequests and WriteRequests are cleared. But they are very useful 
> metrics; my proposal is to keep them in RegionServerAccounting on close if it 
> is an RS self move, and recover them on open.




Row size analyzer in HBase

2020-05-13 Thread Sukumar Maddineni

Hello everyone,

Is there any existing tool which we can use to understand the size of the
rows in a table? For example, we want to know the p90 and max row size of the
rows in a given table, to understand the usage pattern and see how much room
we have before we run into large rows.

I was thinking of something similar to RowCounter, with a reducer to
consolidate the info.
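
To make the consolidation step concrete, here is a minimal sketch of what such a reducer could compute once per-row sizes in bytes have been collected. It is not an existing HBase tool: the class and method names are hypothetical, the sample sizes are made up, and gathering the sizes (for example from a table scan) is outside the sketch. It uses the nearest-rank definition of a percentile, so the max is simply the 100th percentile.

```java
import java.util.Arrays;

/** Hypothetical helper for illustration only; not part of HBase. */
final class RowSizeStatsSketch {
  private RowSizeStatsSketch() {}

  /** Nearest-rank percentile: the smallest sample with at least percent% of samples <= it. */
  static long percentile(long[] rowSizes, int percent) {
    if (rowSizes.length == 0) {
      throw new IllegalArgumentException("no samples");
    }
    if (percent < 1 || percent > 100) {
      throw new IllegalArgumentException("percent must be in [1, 100]: " + percent);
    }
    long[] sorted = rowSizes.clone();
    Arrays.sort(sorted);
    // Integer ceiling of percent * n / 100, to avoid floating-point rounding.
    int rank = (int) (((long) percent * sorted.length + 99) / 100);
    return sorted[rank - 1];
  }

  public static void main(String[] args) {
    // Made-up row sizes in bytes.
    long[] sizes = {120, 80, 4096, 300, 64, 150, 90, 1_048_576, 200, 110};
    System.out.println("p90 = " + percentile(sizes, 90));
    System.out.println("max = " + percentile(sizes, 100));
  }
}
```

Sorting every sample works for a sketch; for a large table a histogram or a streaming quantile estimate would likely be needed instead.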


-- 
Sukumar

Re: [ANNOUNCE] New HBase committer Wei-Chiu Chuang

2020-05-13 Thread ramkrishna vasudevan

Congratulations Wei-Chiu !!!

Regards
Ram

On Thu, May 14, 2020 at 10:55 AM Viraj Jasani  wrote:

> Congratulations Wei-Chiu !!
>
> On 2020/05/13 19:12:38, Sean Busbey  wrote:
> > Folks,
> >
> > On behalf of the Apache HBase PMC I am pleased to announce that Wei-Chiu
> > Chuang has accepted the PMC's invitation to become a committer on the
> > project.
> >
> > We appreciate all of the great contributions Wei-Chiu has made to the
> > community thus far and we look forward to his continued involvement.
> >
> > Allow me to be the first to congratulate Wei-Chiu on his new role!
> >
> > thanks,
> > busbey
> >
>

RE: [ANNOUNCE] New HBase committer Wei-Chiu Chuang

2020-05-13 Thread Pankaj kr

Congratulations Wei-Chiu Chuang.!! 

- Pankaj

-Original Message-
From: Sean Busbey [mailto:bus...@apache.org] 
Sent: 14 May 2020 00:43
To: dev ; Hbase-User 
Subject: [ANNOUNCE] New HBase committer Wei-Chiu Chuang

Folks,

On behalf of the Apache HBase PMC I am pleased to announce that Wei-Chiu Chuang 
has accepted the PMC's invitation to become a committer on the project.

We appreciate all of the great contributions Wei-Chiu has made to the community 
thus far and we look forward to his continued involvement.

Allow me to be the first to congratulate Wei-Chiu on his new role!

thanks,
busbey

Re: [ANNOUNCE] New HBase committer Wei-Chiu Chuang

2020-05-13 Thread 李响

Congratulations Wei-Chiu!

On Thu, May 14, 2020 at 9:05 AM Hui Fei  wrote:

> Congratulations Wei-Chiu!
>
> Sean Busbey  wrote on Thu, May 14, 2020 at 3:10 AM:
>
> > Folks,
> >
> > On behalf of the Apache HBase PMC I am pleased to announce that Wei-Chiu
> > Chuang has accepted the PMC's invitation to become a committer on the
> > project.
> >
> > We appreciate all of the great contributions Wei-Chiu has made to the
> > community thus far and we look forward to his continued involvement.
> >
> > Allow me to be the first to congratulate Wei-Chiu on his new role!
> >
> > thanks,
> > busbey
> >
>


-- 

   李响 Xiang Li

cellphone: +86-136-8113-8972
e-mail: wate...@gmail.com
