Re: want to join in slack

2018-09-29 Thread Reid Chan
Sent, please check.


--

Best regards,
R.C




From: Jaze Lee 
Sent: 29 September 2018 17:53
To: dev@hbase.apache.org
Subject: want to join in slack

Hello,
My Slack email is tianq...@unitedstack.com. Could you help me join the
HBase Slack?
Thanks a lot.

--
谦谦君子


[jira] [Created] (HBASE-21260) The whole balancer plans might be aborted if there are more than one plans to move the same region

2018-09-29 Thread Xiaolin Ha (JIRA)
Xiaolin Ha created HBASE-21260:
--

 Summary: The whole balancer plans might be aborted if there are 
more than one plans to move the same region 
 Key: HBASE-21260
 URL: https://issues.apache.org/jira/browse/HBASE-21260
 Project: HBase
  Issue Type: Bug
  Components: Balancer, master
Affects Versions: 2.0.0, 2.1.0
Reporter: Xiaolin Ha
Assignee: Xiaolin Ha


In SimpleLoadBalancer, plans are first generated from the average number of 
regions per server for a table. Each server is randomly assigned either 
floor(average) or ceiling(average) regions (if the average is not an integer). 
Afterwards, however, the balanceOverall method might generate new plans for 
some regions of the table in order to balance server loads across the whole 
cluster. As a result, a single call of balance can produce more than one plan 
to move the same region. 
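
To make the overlap concrete, here is a minimal sketch of how one could detect regions that are targeted by more than one plan in a single list. It is not HBase code: the class name, the method name, and the representation of a plan as a {region, source, destination} string array are assumptions made only for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only; not part of HBase.
class DuplicatePlanSketch {
  // Each plan is {region, source, destination}.
  // Returns the regions that appear in more than one plan, in first-seen order.
  static List<String> regionsWithMultiplePlans(List<String[]> plans) {
    Map<String, Integer> counts = new LinkedHashMap<>();
    for (String[] plan : plans) {
      counts.merge(plan[0], 1, Integer::sum);
    }
    List<String> duplicated = new ArrayList<>();
    for (Map.Entry<String, Integer> entry : counts.entrySet()) {
      if (entry.getValue() > 1) {
        duplicated.add(entry.getKey());
      }
    }
    return duplicated;
  }
}
```

A check like this only reports the overlap; what to do with the overlapping plans (drop one, merge them, or run them in sequence) is a separate decision.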

Currently, branch-2 uses async procedures to execute balancer plans. But 
concurrent moves of the same region cause the balance method to fail, and once 
one plan encounters an exception, none of the subsequent plans are executed.
We have encountered this problem in practice; the logs are as follows:

{color:#205081}2018-09-26,12:12:38,224 INFO [master/c4-hadoop-tst-ct15:52900.Chore.1] org.apache.hadoop.hbase.master.HMaster: Balancer plans size is 3757, the balance interval is 79 ms, and the max number regions in transition is 25
2018-09-26,12:12:38,224 INFO [master/c4-hadoop-tst-ct15:52900.Chore.1] org.apache.hadoop.hbase.master.HMaster: balance hri=1588230740, source=c4-hadoop-tst-st99.bj,52900,1537522783781, destination=c4-hadoop-tst-st28.bj,52900,1537520009497
2018-09-26,12:12:38,325 INFO [master/c4-hadoop-tst-ct15:52900.Chore.1] org.apache.hadoop.hbase.master.HMaster: balance hri=1588230740, source=c4-hadoop-tst-st99.bj,52900,1537522783781, destination=c4-hadoop-tst-st29.bj,52900,1537522784188
2018-09-26,12:12:38,325 INFO [PEWorker-16] org.apache.hadoop.hbase.master.procedure.MasterProcedureScheduler: pid=119197, state=RUNNABLE:REGION_STATE_TRANSITION_CLOSE; TransitRegionStateProcedure table=hbase:meta, region=1588230740, REOPEN/MOVE checking lock on 1588230740
2018-09-26,12:12:38,325 ERROR [master/c4-hadoop-tst-ct15:52900.Chore.1] org.apache.hadoop.hbase.master.balancer.BalancerChore: Failed to balance.
org.apache.hadoop.hbase.HBaseIOException: rit=OPEN, location=c4-hadoop-tst-st99.bj,52900,1537522783781, table=hbase:meta, region=1588230740 is currently in transition
    at org.apache.hadoop.hbase.master.assignment.AssignmentManager.preTransitCheck(AssignmentManager.java:536)
    at org.apache.hadoop.hbase.master.assignment.AssignmentManager.createMoveRegionProcedure(AssignmentManager.java:592)
    at org.apache.hadoop.hbase.master.assignment.AssignmentManager.moveAsync(AssignmentManager.java:609)
    at org.apache.hadoop.hbase.master.HMaster.balance(HMaster.java:1707)
    at org.apache.hadoop.hbase.master.HMaster.balance(HMaster.java:1622)
    at org.apache.hadoop.hbase.master.balancer.BalancerChore.chore(BalancerChore.java:49)
    at org.apache.hadoop.hbase.ScheduledChore.run(ScheduledChore.java:186)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
    at org.apache.hadoop.hbase.JitterScheduledThreadPoolExecutorImpl$JitteredRunnableScheduledFuture.run(JitterScheduledThreadPoolExecutorImpl.java:111)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745){color}

This is a serious problem because it often occurs when new RSs start or old 
RSs fail over. What's more, there is no effective way to bring the cluster 
back into balance.

But the solution to this problem may be simple. We can catch exceptions when 
executing a plan and then just skip that plan, so that a failed plan does not 
affect later plans in the list. Subsequent calls of balance can pick up the 
failed and skipped plans.
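
The skip-on-failure idea can be sketched as below. This is a minimal, self-contained illustration under stated assumptions, not the HBase implementation: `Plan`, `Mover`, and `executePlans` are hypothetical names standing in for the region plan and the master's move call.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for illustration only; not the HBase API.
class Plan {
  final String region;
  final String destination;

  Plan(String region, String destination) {
    this.region = region;
    this.destination = destination;
  }
}

interface Mover {
  // Throws if the region cannot be moved right now, e.g. it is already in transition.
  void move(Plan plan) throws Exception;
}

class BalanceSketch {
  // Tries every plan; a failing plan is recorded and skipped instead of aborting the rest.
  static List<Plan> executePlans(List<Plan> plans, Mover mover) {
    List<Plan> skipped = new ArrayList<>();
    for (Plan plan : plans) {
      try {
        mover.move(plan);
      } catch (Exception e) {
        // Skip this plan; a later balance run can generate it again if still needed.
        skipped.add(plan);
      }
    }
    return skipped;
  }
}
```

Returning the skipped plans keeps the failure visible to the caller, which could log them instead of silently dropping them.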



 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Reopened] (HBASE-21207) Add client side sorting functionality in master web UI for table and region server details.

2018-09-29 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu reopened HBASE-21207:


> Add client side sorting functionality in master web UI for table and region 
> server details.
> ---
>
> Key: HBASE-21207
> URL: https://issues.apache.org/jira/browse/HBASE-21207
> Project: HBase
>  Issue Type: Improvement
>  Components: master, monitoring, UI, Usability
>Reporter: Archana Katiyar
>Assignee: Archana Katiyar
>Priority: Minor
> Fix For: 3.0.0, 1.5.0, 2.2.0, 1.4.8
>
> Attachments: 14926e82-b929-11e8-8bdd-4ce4621f1118.png, 
> 21207.branch-1.addendum.patch, 2724afd8-b929-11e8-8171-8b5b2ba3084e.png, 
> HBASE-21207-branch-1.patch, HBASE-21207-branch-1.v1.patch, 
> HBASE-21207-branch-2.v1.patch, HBASE-21207.patch, HBASE-21207.patch, 
> HBASE-21207.v1.patch, edc5c812-b928-11e8-87e2-ce6396629bbc.png
>
>
> In the Master UI, we can see region server details such as requests per second 
> and number of regions. Similarly, for tables we can see online regions and 
> offline regions.
> It will help ops people identify hot spotting if we provide sort 
> functionality in the UI.





[jira] [Created] (HBASE-21259) [amv2] Revived deadservers; recreated serverstatenode

2018-09-29 Thread stack (JIRA)
stack created HBASE-21259:
-

 Summary: [amv2] Revived deadservers; recreated serverstatenode
 Key: HBASE-21259
 URL: https://issues.apache.org/jira/browse/HBASE-21259
 Project: HBase
  Issue Type: Bug
  Components: amv2
Affects Versions: 2.1.0
Reporter: stack
Assignee: stack
 Fix For: 2.2.0, 2.1.1, 2.0.3


On startup, I see servers being revived; i.e. their serverstatenode is getting 
marked online even though it's just been processed by ServerCrashProcedure. It 
looks like this (in a patched server that reports whenever a serverstatenode 
is created):

{code}
2018-09-29 03:45:40,963 INFO org.apache.hadoop.hbase.procedure2.ProcedureExecutor: Finished pid=3982597, state=SUCCESS; ServerCrashProcedure server=vb1442.halxg.cloudera.com,22101,1536675314426, splitWal=true, meta=false in 1.0130sec
...

2018-09-29 03:45:43,733 INFO org.apache.hadoop.hbase.master.assignment.RegionStates: CREATING! vb1442.halxg.cloudera.com,22101,1536675314426
java.lang.RuntimeException: WHERE AM I?
    at org.apache.hadoop.hbase.master.assignment.RegionStates.getOrCreateServer(RegionStates.java:1116)
    at org.apache.hadoop.hbase.master.assignment.RegionStates.addRegionToServer(RegionStates.java:1143)
    at org.apache.hadoop.hbase.master.assignment.AssignmentManager.markRegionAsClosing(AssignmentManager.java:1464)
    at org.apache.hadoop.hbase.master.assignment.UnassignProcedure.updateTransition(UnassignProcedure.java:200)
    at org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure.execute(RegionTransitionProcedure.java:369)
    at org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure.execute(RegionTransitionProcedure.java:97)
    at org.apache.hadoop.hbase.procedure2.Procedure.doExecute(Procedure.java:953)
    at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1716)
    at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1494)
    at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$900(ProcedureExecutor.java:75)
    at org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:2022)

{code}

See how we've just finished an SCP, which will have removed the 
serverstatenode... but then we come across an unassign that references the 
server that was just processed. The unassign will attempt to update the 
serverstatenode, and in doing so we create one if one is not present. We 
shouldn't be creating one.

I think I see this a lot because I am scheduling unassigns with hbck2. The 
servers crash and then come up, with SCPs doing cleanup of the old server while 
unassign procedures are still in the procedure executor queue waiting to be 
processed. But this could happen at any time on a cluster should an unassign 
happen to get scheduled near an SCP.
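
The difference between "get or create" and a lookup that never creates can be sketched as follows. This is an illustration only, with hypothetical names (`ServerNodeRegistry`, `getOrCreate`, `getIfPresent`); it is not the actual RegionStates code, and whether a lookup-only path is the right fix is not settled in this thread.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical registry for illustration; not the actual RegionStates class.
class ServerNodeRegistry {
  private final Map<String, Object> nodes = new ConcurrentHashMap<>();

  // For paths where a server legitimately comes online: creates the node if missing.
  Object getOrCreate(String serverName) {
    return nodes.computeIfAbsent(serverName, k -> new Object());
  }

  // For region transitions: never creates a node, so a server that crash
  // processing has already cleaned up stays removed. Returns null if absent.
  Object getIfPresent(String serverName) {
    return nodes.get(serverName);
  }

  // Called once crash processing for the server has finished.
  void remove(String serverName) {
    nodes.remove(serverName);
  }

  boolean contains(String serverName) {
    return nodes.containsKey(serverName);
  }
}
```

With this split, an unassign that runs after the node was removed sees `null` and has to handle the missing server explicitly, rather than reviving it as a side effect.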





[jira] [Reopened] (HBASE-21257) misspelled words.[occured -> occurred]

2018-09-29 Thread zhanggangxue (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhanggangxue reopened HBASE-21257:
--

> misspelled words.[occured -> occurred]
> --
>
> Key: HBASE-21257
> URL: https://issues.apache.org/jira/browse/HBASE-21257
> Project: HBase
>  Issue Type: Bug
>Reporter: zhanggangxue
>Priority: Trivial
>  Labels: occured, occurred, typo
> Attachments: 0001-misspelled-words.-occured-occurred.patch
>
>
> I found some spelling errors on the master branch. 
> See [misspelled words|https://github.com/apache/hbase/pull/91]





[jira] [Resolved] (HBASE-21257) misspelled words.[occured -> occurred]

2018-09-29 Thread zhanggangxue (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhanggangxue resolved HBASE-21257.
--
Resolution: Fixed

> misspelled words.[occured -> occurred]
> --
>
> Key: HBASE-21257
> URL: https://issues.apache.org/jira/browse/HBASE-21257
> Project: HBase
>  Issue Type: Bug
>Reporter: zhanggangxue
>Priority: Trivial
>  Labels: occured, occurred, typo
> Attachments: 0001-misspelled-words.-occured-occurred.patch
>
>
> I found some spelling errors on the master branch. 
> See [misspelled words|https://github.com/apache/hbase/pull/91]





[jira] [Created] (HBASE-21258) Add resetting of flags for RS Group pre/post hooks in TestRSGroups

2018-09-29 Thread Ted Yu (JIRA)
Ted Yu created HBASE-21258:
--

 Summary: Add resetting of flags for RS Group pre/post hooks in 
TestRSGroups
 Key: HBASE-21258
 URL: https://issues.apache.org/jira/browse/HBASE-21258
 Project: HBase
  Issue Type: Test
Reporter: Ted Yu
Assignee: Ted Yu


Over in HBASE-20627, [~xucang] reminded me that the resetting of flags for RS 
Group pre/post hooks in TestRSGroups was missing.

This issue is to add the resetting of these flags before each subtest starts.
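
A reset of this kind might look like the sketch below. The class, field, and method names are hypothetical and are not taken from TestRSGroups; in a JUnit test, `resetFlags()` would typically be called from a method that runs before each test.

```java
// Hypothetical recorder of which hooks fired; names are assumptions for illustration.
class HookRecorder {
  boolean preMoveCalled;
  boolean postMoveCalled;

  void preMove() {
    preMoveCalled = true;
  }

  void postMove() {
    postMoveCalled = true;
  }

  // Clears every flag so a hook fired in one subtest cannot satisfy
  // an assertion in the next one.
  void resetFlags() {
    preMoveCalled = false;
    postMoveCalled = false;
  }
}
```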





[jira] [Created] (HBASE-21257) misspelled words.[occured -> occurred]

2018-09-29 Thread zhanggangxue (JIRA)
zhanggangxue created HBASE-21257:


 Summary: misspelled words.[occured -> occurred]
 Key: HBASE-21257
 URL: https://issues.apache.org/jira/browse/HBASE-21257
 Project: HBase
  Issue Type: Bug
Reporter: zhanggangxue
 Attachments: 0001-misspelled-words.-occured-occurred.patch

I found some spelling errors on the master branch. 
See [misspelled words.[occured -> occurred] #91|https://github.com/apache/hbase/pull/91]





[jira] [Created] (HBASE-21256) Improve IntegrationTestBigLinkedList for testing huge data

2018-09-29 Thread Zephyr Guo (JIRA)
Zephyr Guo created HBASE-21256:
--

 Summary: Improve IntegrationTestBigLinkedList for testing huge data
 Key: HBASE-21256
 URL: https://issues.apache.org/jira/browse/HBASE-21256
 Project: HBase
  Issue Type: Improvement
  Components: integration tests
Affects Versions: 3.0.0
Reporter: Zephyr Guo
Assignee: Zephyr Guo
 Fix For: 3.0.0
 Attachments: ITBLL-1.png, ITBLL-2.png

Recently, I used ITBLL to test some features at our company and encountered 
the following problems:

1. The Generator is too slow at the generating stage; the root cause is 
SecureRandom. There is a global lock in SecureRandom (see the following 
picture). Using Random instead of SecureRandom speeds up this stage (up 500% 
with 20 mappers). SecureRandom was introduced by HBASE-13382, but for 
generating random bytes here, in my opinion, it is equivalent to Random.

!ITBLL-1.png!

2. VerifyReducer spends 14% of its CPU on the format method. This is caused by 
creating the keyString variable. However, keyString is never used if the test 
result is correct (which is the case most of the time). Simply delaying the 
creation of keyString can yield a huge performance boost in the verifying 
stage.

!ITBLL-2.png!

3. Argument checking is needed, because there are constraints between the 
arguments. If we break these constraints, we cannot get a correct circular 
list.

4. Make the big family value size configurable.
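
The first two points can be illustrated with a small sketch. It is not the ITBLL code: the class name, the method names, and the `linkIsValid` flag are assumptions, and it simply shows (a) filling keys from `java.util.Random`, which is not cryptographically secure and is assumed to be acceptable for test data, and (b) building the formatted key string only on the failure path.

```java
import java.util.Random;

// Hypothetical sketch for illustration; not the IntegrationTestBigLinkedList code.
class ItbllSketch {
  // Fills a key with pseudo-random bytes from the given generator.
  static byte[] randomKey(Random random, int length) {
    byte[] key = new byte[length];
    random.nextBytes(key);
    return key;
  }

  // Hex-formats a key; assumed here to be the costly step.
  static String format(byte[] key) {
    StringBuilder sb = new StringBuilder(key.length * 2);
    for (byte b : key) {
      sb.append(String.format("%02x", b));
    }
    return sb.toString();
  }

  // Returns null when the link is valid; the key string is built only on failure.
  static String verify(byte[] key, boolean linkIsValid) {
    if (linkIsValid) {
      return null; // common case: no formatting work at all
    }
    return "Bad link for key " + format(key);
  }
}
```

The design choice in `verify` is simply to move the expensive call inside the branch that needs it, so the common successful path pays nothing for it.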





want to join in slack

2018-09-29 Thread Jaze Lee
Hello,
My Slack email is tianq...@unitedstack.com. Could you help me join the
HBase Slack?
Thanks a lot.

-- 
谦谦君子


[jira] [Created] (HBASE-21255) Refactor TablePermission into three classes (Global, Namespace, Table)

2018-09-29 Thread Reid Chan (JIRA)
Reid Chan created HBASE-21255:
-

 Summary: Refactor TablePermission into three classes (Global, 
Namespace, Table)
 Key: HBASE-21255
 URL: https://issues.apache.org/jira/browse/HBASE-21255
 Project: HBase
  Issue Type: Improvement
Reporter: Reid Chan
Assignee: Reid Chan


A TODO in {{TablePermission.java}}
{code}
  //TODO refactor this class
  //we need to refacting this into three classes (Global, Table, Namespace)
{code}
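
One possible shape for such a split is sketched below. All class names, fields, and the `Scope` enum are hypothetical and chosen only to illustrate the three scopes named in the TODO; the issue does not specify the actual design.

```java
// Hypothetical shape for illustration; names and fields are assumptions.
abstract class PermissionSketch {
  enum Scope { GLOBAL, NAMESPACE, TABLE }

  abstract Scope scope();
}

// Applies cluster-wide, so it carries no namespace or table.
class GlobalPermissionSketch extends PermissionSketch {
  @Override
  Scope scope() {
    return Scope.GLOBAL;
  }
}

// Applies to everything inside one namespace.
class NamespacePermissionSketch extends PermissionSketch {
  final String namespace;

  NamespacePermissionSketch(String namespace) {
    this.namespace = namespace;
  }

  @Override
  Scope scope() {
    return Scope.NAMESPACE;
  }
}

// Applies to a single table.
class TablePermissionSketch extends PermissionSketch {
  final String table;

  TablePermissionSketch(String table) {
    this.table = table;
  }

  @Override
  Scope scope() {
    return Scope.TABLE;
  }
}
```

Giving each scope its own class means a namespace-level permission no longer has to carry unused table fields, which is presumably the motivation behind the TODO.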


