[jira] [Commented] (HBASE-20938) Set version to 2.1.1-SNAPSHOT for branch-2.1

2018-07-25 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16555260#comment-16555260
 ] 

Hadoop QA commented on HBASE-20938:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  4s{color} 
| {color:red} HBASE-20938 does not apply to master. Rebase required? Wrong 
Branch? See https://yetus.apache.org/documentation/0.7.0/precommit-patchnames 
for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | HBASE-20938 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12933006/HBASE-20938.patch |
| Console output | 
https://builds.apache.org/job/PreCommit-HBASE-Build/13781/console |
| Powered by | Apache Yetus 0.7.0   http://yetus.apache.org |


This message was automatically generated.



> Set version to 2.1.1-SNAPSHOT for branch-2.1
> 
>
> Key: HBASE-20938
> URL: https://issues.apache.org/jira/browse/HBASE-20938
> Project: HBase
>  Issue Type: Sub-task
>  Components: build
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 2.1.1
>
> Attachments: HBASE-20938.patch
>
>






[jira] [Commented] (HBASE-20919) meta region can't be re-onlined when restarting cluster if opening rsgroup

2018-07-25 Thread chenyang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16555265#comment-16555265
 ] 

chenyang commented on HBASE-20919:
--

Hi, [~elserj]. Thanks for your suggestions.

Q: "What about failing fast here, and having the caller decide how to handle the 
retry logic? AssignmentManager should already have logic to do this."

A: Fast failing is a better solution: AssignmentManager catches 
HBaseIOException and re-adds the regions to the pending assignment queue, as the 
code below from the processAssignmentPlans() method shows:

 
{code:java}
try {
  acceptPlan(regions, balancer.retainAssignment(retainMap, servers));
} catch (HBaseIOException e) {
  LOG.warn("unable to retain assignment", e);
  addToPendingAssignment(regions, retainMap.keySet());
}
//or
try {
  acceptPlan(regions, balancer.roundRobinAssignment(hris, servers));
} catch (HBaseIOException e) {
  LOG.warn("unable to round-robin assignment", e);
  addToPendingAssignment(regions, hris);
}{code}
I will submit a new patch which implements fast failing.
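For reference, a minimal sketch of what the fast-fail side could look like in 
RSGroupBasedLoadBalancer (assumed shape, not the actual patch; isOnline() stands 
for the balancer's initialization flag):
{code:java}
// Sketch only: fail fast with HBaseIOException instead of an NPE, so the
// AssignmentManager catch block above can re-queue the regions.
private void checkInitializedState() throws HBaseIOException {
  if (!isOnline()) { // assumed initialization flag
    throw new HBaseIOException("RSGroupBasedLoadBalancer has not been initialized");
  }
}

@Override
public Map<ServerName, List<RegionInfo>> roundRobinAssignment(
    List<RegionInfo> regions, List<ServerName> servers) throws HBaseIOException {
  checkInitializedState(); // fail fast before touching rsgroup state
  // ... existing assignment logic ...
}
{code}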

 

Q: "RSGroupLoadBalancer doesn't get initialized until after hbase:meta gets 
assigned, but hbase:meta can't be assigned until the RSGroupLoadBalancer is 
initialized so we soft-lock. "

A: I debugged the initialization of rsgroup and tested some cases. The 
initialization process runs in an independent thread. So far I have not found a 
soft-lock, but I think the risk is still there.

Q: "This is hard because, while I don't disagree with Stack's comment about 
StochasticLB to RSGroupLB, the Master using the LoadBalancer before it was 
initialized is bad"

A: According to my tests, it works to initialize the balancers before calling 
startServiceThreads (which starts the ProcedureExecutor) in HMaster's 
finishActiveMasterInitialization method. But I cannot be sure it is OK for 
other cases; it needs more testing. So I think modifying 
RSGroupBasedLoadBalancer carries lower risk. I will re-submit the patch that 
initializes the balancers before calling startServiceThreads, for reference only.

Q: "Do you have more logs you can share? "

A: I will provide the full logs and steps along with the new patch. Because 
testing this case requires starting, stopping, and restarting the whole master, 
I don't know how to provide unit tests. Do you or anyone else have suggestions?

> meta region can't be re-onlined when restarting cluster if opening rsgroup
> --
>
> Key: HBASE-20919
> URL: https://issues.apache.org/jira/browse/HBASE-20919
> Project: HBase
>  Issue Type: Bug
>  Components: Balancer, master, rsgroup
>Affects Versions: 2.0.1
>Reporter: chenyang
>Priority: Major
> Attachments: HBASE-20919-branch-2.0-01.patch, bug2.png, 
> hbase-hbase-master-bjpg-rs4730.yz02.log.test
>
>
> If you enable rsgroup, hbase-site.xml contains the configuration below.
> {code:xml}
> <property>
>   <name>hbase.coprocessor.master.classes</name>
>   <value>org.apache.hadoop.hbase.rsgroup.RSGroupAdminEndpoint</value>
> </property>
> <property>
>   <name>hbase.master.loadbalancer.class</name>
>   <value>org.apache.hadoop.hbase.rsgroup.RSGroupBasedLoadBalancer</value>
> </property>
> {code}
> And you shut down the whole HBase cluster in this way:
>  # first shut down the region servers one by one
>  # shut down the master
> Then you restart the whole cluster in this way:
>  # start the master
>  # start the region servers
> The hbase:meta region cannot be re-onlined and rsgroup cannot be 
> initialized successfully.
>  master logs:
> {code:java}
> 2018-07-12 18:27:08,775 INFO 
> [org.apache.hadoop.hbase.rsgroup.RSGroupInfoManagerImpl$RSGroupStartupWorker-bjpg-rs4730.yz02,16000,1531389637409]
>  rsgroup.RSGro
> upInfoManagerImpl$RSGroupStartupWorker: Waiting for catalog tables to come 
> online
> 2018-07-12 18:27:08,876 INFO 
> [org.apache.hadoop.hbase.rsgroup.RSGroupInfoManagerImpl$RSGroupStartupWorker-bjpg-rs4730.yz02,16000,1531389637409]
>  zookeeper.Met
> aTableLocator: Failed verification of hbase:meta,,1 at 
> address=bjpg-rs4732.yz02,60020,1531388712053, 
> exception=org.apache.hadoop.hbase.NotServingRegionExcepti
> on: hbase:meta,,1 is not online on bjpg-rs4732.yz02,60020,1531389727928
> at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionByEncodedName(HRegionServer.java:3249)
> at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:3226)
> at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.getRegion(RSRpcServices.java:1414)
> at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.getRegionInfo(RSRpcServices.java:1729)
> at 
> org.apache.hadoop.hbase.shaded.protobuf.generated.AdminProtos$AdminService$2.callBlockingMethod(AdminProtos.java:28286)
> at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:409)
> at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:130)
> at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)

[jira] [Comment Edited] (HBASE-20919) meta region can't be re-onlined when restarting cluster if opening rsgroup

2018-07-25 Thread chenyang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16555265#comment-16555265
 ] 

chenyang edited comment on HBASE-20919 at 7/25/18 7:05 AM:
---

Hi, [~elserj]. Thanks for your suggestions.

Q: "What about failing fast here, and having the caller decide how to handle the 
retry logic? AssignmentManager should already have logic to do this."

A: Fast failing is a better solution: AssignmentManager catches 
HBaseIOException and re-adds the regions to the pending assignment queue, as the 
code below from the processAssignmentPlans() method shows:

 
{code:java}
try {
  acceptPlan(regions, balancer.retainAssignment(retainMap, servers));
} catch (HBaseIOException e) {
  LOG.warn("unable to retain assignment", e);
  addToPendingAssignment(regions, retainMap.keySet());
}
//or
try {
  acceptPlan(regions, balancer.roundRobinAssignment(hris, servers));
} catch (HBaseIOException e) {
  LOG.warn("unable to round-robin assignment", e);
  addToPendingAssignment(regions, hris);
}{code}
I will submit a new patch which implements fast failing.

 

Q: "RSGroupLoadBalancer doesn't get initialized until after hbase:meta gets 
assigned, but hbase:meta can't be assigned until the RSGroupLoadBalancer is 
initialized so we soft-lock. "

A: I debugged the initialization of rsgroup and tested some cases. The 
initialization process runs in an independent thread. So far I have not found a 
soft-lock, but I think the risk is still there.

Q: "This is hard because, while I don't disagree with Stack's comment about 
StochasticLB to RSGroupLB, the Master using the LoadBalancer before it was 
initialized is bad"

A: According to my tests, it works to initialize the balancers before calling 
startServiceThreads (which starts the ProcedureExecutor) in HMaster's 
finishActiveMasterInitialization method. But I cannot be sure it is OK for 
other cases; it needs more testing. So I think modifying 
RSGroupBasedLoadBalancer carries lower risk. I will re-submit the patch that 
initializes the balancers before calling startServiceThreads, for reference only.

Q: "Do you have more logs you can share? "

A: I will provide the full logs and steps along with the new patch. Because 
testing this case requires starting, stopping, and restarting the whole cluster, 
I don't know how to provide unit tests. Do you or anyone else have suggestions?


was (Author: hb-cy):
Hi, [~elserj]. Thanks for your suggestions.

Q: "What about failing fast here, and having the caller decide how to handle the 
retry logic? AssignmentManager should already have logic to do this."

A: Fast failing is a better solution: AssignmentManager catches 
HBaseIOException and re-adds the regions to the pending assignment queue, as the 
code below from the processAssignmentPlans() method shows:

 
{code:java}
try {
  acceptPlan(regions, balancer.retainAssignment(retainMap, servers));
} catch (HBaseIOException e) {
  LOG.warn("unable to retain assignment", e);
  addToPendingAssignment(regions, retainMap.keySet());
}
//or
try {
  acceptPlan(regions, balancer.roundRobinAssignment(hris, servers));
} catch (HBaseIOException e) {
  LOG.warn("unable to round-robin assignment", e);
  addToPendingAssignment(regions, hris);
}{code}
I will submit a new patch which implements fast failing.

 

Q: "RSGroupLoadBalancer doesn't get initialized until after hbase:meta gets 
assigned, but hbase:meta can't be assigned until the RSGroupLoadBalancer is 
initialized so we soft-lock. "

A: I debugged the initialization of rsgroup and tested some cases. The 
initialization process runs in an independent thread. So far I have not found a 
soft-lock, but I think the risk is still there.

Q: "This is hard because, while I don't disagree with Stack's comment about 
StochasticLB to RSGroupLB, the Master using the LoadBalancer before it was 
initialized is bad"

A: According to my tests, it works to initialize the balancers before calling 
startServiceThreads (which starts the ProcedureExecutor) in HMaster's 
finishActiveMasterInitialization method. But I cannot be sure it is OK for 
other cases; it needs more testing. So I think modifying 
RSGroupBasedLoadBalancer carries lower risk. I will re-submit the patch that 
initializes the balancers before calling startServiceThreads, for reference only.

Q: "Do you have more logs you can share? "

A: I will provide the full logs and steps along with the new patch. Because 
testing this case requires starting, stopping, and restarting the whole master, 
I don't know how to provide unit tests. Do you or anyone else have suggestions?

> meta region can't be re-onlined when restarting cluster if opening rsgroup
> --
>
> Key: HBASE-20919
> URL: https://issues.apache.org/jira/browse/HBASE-20919
> Project: HBase
>  Issue Type: Bug
>  Components: Balancer, master, rsgroup
>Affects Versions: 2.0.1
>  

[jira] [Updated] (HBASE-20938) Set version to 2.1.1-SNAPSHOT for branch-2.1

2018-07-25 Thread Duo Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang updated HBASE-20938:
--
Attachment: HBASE-20938-branch-2.1.patch

> Set version to 2.1.1-SNAPSHOT for branch-2.1
> 
>
> Key: HBASE-20938
> URL: https://issues.apache.org/jira/browse/HBASE-20938
> Project: HBase
>  Issue Type: Sub-task
>  Components: build
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 2.1.1
>
> Attachments: HBASE-20938-branch-2.1.patch
>
>






[jira] [Commented] (HBASE-20919) meta region can't be re-onlined when restarting cluster if opening rsgroup

2018-07-25 Thread chenyang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16555316#comment-16555316
 ] 

chenyang commented on HBASE-20919:
--

Submitted HBASE-20919-branch-2.0-02.patch, which implements fast failing, along 
with logs.

Testing steps (with one master and one RS):

1: start master without 02.patch

2: start rs without 02.patch 

    now, the cluster works fine

3: stop rs

4: stop master

5: restart master

6: restart rs

    now the hbase:meta region cannot be assigned successfully.

7: stop rs

8: stop master

9: apply 02.patch to branch-2.0, recompile the hbase-rsgroup module, and replace 
hbase-rsgroup-2.0.2-SNAPSHOT.jar with the new version that includes 02.patch

10: restart master

11: restart rs

now the hbase:meta region can be assigned successfully and the cluster works fine.

 

Logs:

hbase-hbase-master-bjpg-rs4729.yz02.log.no_02patch covers steps 1 to 8.

In the log file, you can see that RSGroupInfoManagerImpl$RSGroupStartupWorker 
kept trying to check whether the meta region was online, but failed every time.

 
{code:java}
2018-07-25 12:15:15,064 INFO 
[org.apache.hadoop.hbase.rsgroup.RSGroupInfoManagerImpl$RSGroupStartupWorker-bjpg-rs4729.yz02,16000,1532491114549]
 zookeeper.Me
taTableLocator: Failed verification of hbase:meta,,1 at 
address=bjpg-rs4736.yz02,16020,1532490935452, 
exception=org.apache.hadoop.hbase.NotServingRegionExcep
tion: hbase:meta,,1 is not online on bjpg-rs4736.yz02,16020,1532491949108
at 
org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionByEncodedName(HRegionServer.java:3246)
at 
org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:3223)
at 
org.apache.hadoop.hbase.regionserver.RSRpcServices.getRegion(RSRpcServices.java:1414)
at 
org.apache.hadoop.hbase.regionserver.RSRpcServices.getRegionInfo(RSRpcServices.java:1729)
at 
org.apache.hadoop.hbase.shaded.protobuf.generated.AdminProtos$AdminService$2.callBlockingMethod(AdminProtos.java:28286)
at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:409)
at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:130)
at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)
{code}
hbase-hbase-master-bjpg-rs4729.yz02.log.with_02patch covers steps 9 to 10.

In the log file, you can see that the hbase:meta region was finally assigned 
successfully after failing several times.
{code:java}
2018-07-25 14:27:12,356 WARN [master/bjpg-rs4729:16000] 
rsgroup.RSGroupBasedLoadBalancer: RSGroupBasedLoadBalancer has not been 
initialized
org.apache.hadoop.hbase.HBaseIOException: RSGroupBasedLoadBalancer has not been 
initialized
at 
org.apache.hadoop.hbase.rsgroup.RSGroupBasedLoadBalancer.checkInitializedState(RSGroupBasedLoadBalancer.java:480)
at 
org.apache.hadoop.hbase.rsgroup.RSGroupBasedLoadBalancer.roundRobinAssignment(RSGroupBasedLoadBalancer.java:161){code}

> meta region can't be re-onlined when restarting cluster if opening rsgroup
> --
>
> Key: HBASE-20919
> URL: https://issues.apache.org/jira/browse/HBASE-20919
> Project: HBase
>  Issue Type: Bug
>  Components: Balancer, master, rsgroup
>Affects Versions: 2.0.1
>Reporter: chenyang
>Priority: Major
> Attachments: HBASE-20919-branch-2.0-01.patch, bug2.png, 
> hbase-hbase-master-bjpg-rs4730.yz02.log.test
>
>
> If you enable rsgroup, hbase-site.xml contains the configuration below.
> {code:xml}
> <property>
>   <name>hbase.coprocessor.master.classes</name>
>   <value>org.apache.hadoop.hbase.rsgroup.RSGroupAdminEndpoint</value>
> </property>
> <property>
>   <name>hbase.master.loadbalancer.class</name>
>   <value>org.apache.hadoop.hbase.rsgroup.RSGroupBasedLoadBalancer</value>
> </property>
> {code}
> And you shut down the whole HBase cluster in this way:
>  # first shut down the region servers one by one
>  # shut down the master
> Then you restart the whole cluster in this way:
>  # start the master
>  # start the region servers
> The hbase:meta region cannot be re-onlined and rsgroup cannot be 
> initialized successfully.
>  master logs:
> {code:java}
> 2018-07-12 18:27:08,775 INFO 
> [org.apache.hadoop.hbase.rsgroup.RSGroupInfoManagerImpl$RSGroupStartupWorker-bjpg-rs4730.yz02,16000,1531389637409]
>  rsgroup.RSGro
> upInfoManagerImpl$RSGroupStartupWorker: Waiting for catalog tables to come 
> online
> 2018-07-12 18:27:08,876 INFO 
> [org.apache.hadoop.hbase.rsgroup.RSGroupInfoManagerImpl$RSGroupStartupWorker-bjpg-rs4730.yz02,16000,1531389637409]
>  zookeeper.Met
> aTableLocator: Failed verification of hbase:meta,,1 at 
> address=bjpg-rs4732.yz02,60020,1531388712053, 
> exception=org.apache.hadoop.hbase.NotServingRegionExcepti
> on: hbase:meta,,1 is not online on bjpg-rs4732.yz02,60020,1531389727928
> at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionByEncodedName(HRegionServer.java:3249)
> at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:3226)

[jira] [Updated] (HBASE-20919) meta region can't be re-onlined when restarting cluster if opening rsgroup

2018-07-25 Thread chenyang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chenyang updated HBASE-20919:
-
Attachment: hbase-hbase-master-bjpg-rs4729.yz02.with_02patch.log
hbase-hbase-master-bjpg-rs4729.yz02.no_02patch.log
HBASE-20919-branch-2.0-02.patch

> meta region can't be re-onlined when restarting cluster if opening rsgroup
> --
>
> Key: HBASE-20919
> URL: https://issues.apache.org/jira/browse/HBASE-20919
> Project: HBase
>  Issue Type: Bug
>  Components: Balancer, master, rsgroup
>Affects Versions: 2.0.1
>Reporter: chenyang
>Priority: Major
> Attachments: HBASE-20919-branch-2.0-01.patch, 
> HBASE-20919-branch-2.0-02.patch, bug2.png, 
> hbase-hbase-master-bjpg-rs4729.yz02.no_02patch.log, 
> hbase-hbase-master-bjpg-rs4729.yz02.with_02patch.log, 
> hbase-hbase-master-bjpg-rs4730.yz02.log.test
>
>
> If you enable rsgroup, hbase-site.xml contains the configuration below.
> {code:xml}
> <property>
>   <name>hbase.coprocessor.master.classes</name>
>   <value>org.apache.hadoop.hbase.rsgroup.RSGroupAdminEndpoint</value>
> </property>
> <property>
>   <name>hbase.master.loadbalancer.class</name>
>   <value>org.apache.hadoop.hbase.rsgroup.RSGroupBasedLoadBalancer</value>
> </property>
> {code}
> And you shut down the whole HBase cluster in this way:
>  # first shut down the region servers one by one
>  # shut down the master
> Then you restart the whole cluster in this way:
>  # start the master
>  # start the region servers
> The hbase:meta region cannot be re-onlined and rsgroup cannot be 
> initialized successfully.
>  master logs:
> {code:java}
> 2018-07-12 18:27:08,775 INFO 
> [org.apache.hadoop.hbase.rsgroup.RSGroupInfoManagerImpl$RSGroupStartupWorker-bjpg-rs4730.yz02,16000,1531389637409]
>  rsgroup.RSGro
> upInfoManagerImpl$RSGroupStartupWorker: Waiting for catalog tables to come 
> online
> 2018-07-12 18:27:08,876 INFO 
> [org.apache.hadoop.hbase.rsgroup.RSGroupInfoManagerImpl$RSGroupStartupWorker-bjpg-rs4730.yz02,16000,1531389637409]
>  zookeeper.Met
> aTableLocator: Failed verification of hbase:meta,,1 at 
> address=bjpg-rs4732.yz02,60020,1531388712053, 
> exception=org.apache.hadoop.hbase.NotServingRegionExcepti
> on: hbase:meta,,1 is not online on bjpg-rs4732.yz02,60020,1531389727928
> at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionByEncodedName(HRegionServer.java:3249)
> at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:3226)
> at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.getRegion(RSRpcServices.java:1414)
> at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.getRegionInfo(RSRpcServices.java:1729)
> at 
> org.apache.hadoop.hbase.shaded.protobuf.generated.AdminProtos$AdminService$2.callBlockingMethod(AdminProtos.java:28286)
> at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:409)
> at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:130)
> at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)
> at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304)
> {code}
> The logs show that the hbase:meta region is not online and rsgroup keeps 
> retrying to initialize.
>   
>  But why is the hbase:meta region not online?
>  The info-level logs and jstack did not have enough information, so I added 
> some debug logs in the test source code. Then I checked the master's and 
> region server's logs and found that the meta region assign procedure, which 
> held the meta region lock, never completed and never released the lock, so 
> recoverMetaProcedure could not be executed. 
>   
>  Why did the first procedure not complete and not release the meta region lock?
>  In the test logs, I found that when the AssignmentManager assigned the region, 
> it needed to call the rsgroup balancer, which had not been completely 
> initialized, so it threw an NPE. As a result, the procedure never completed 
> and never released the lock.
> {code:java}
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hbase.rsgroup.RSGroupBasedLoadBalancer.generateGroupMaps(RSGroupBasedLoadBalancer.java:262)
> at 
> org.apache.hadoop.hbase.rsgroup.RSGroupBasedLoadBalancer.roundRobinAssignment(RSGroupBasedLoadBalancer.java:162)
> at 
> org.apache.hadoop.hbase.master.assignment.AssignmentManager.processAssignmentPlans(AssignmentManager.java:1864)
> at 
> org.apache.hadoop.hbase.master.assignment.AssignmentManager.processAssignQueue(AssignmentManager.java:1809)
> at 
> org.apache.hadoop.hbase.master.assignment.AssignmentManager.access$400(AssignmentManager.java:113)
> at 
> org.apache.hadoop.hbase.master.assignment.AssignmentManager$2.run(AssignmentManager.java:1693)
> {code}
> !bug2.png!
> As shown in the figure bug2.png in the attachments, when we shut down 
> the last region server, the master submits a ServerCrashProcedure. In the 
> procedure, it w

[jira] [Updated] (HBASE-20938) Set version to 2.1.1-SNAPSHOT for branch-2.1

2018-07-25 Thread Duo Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang updated HBASE-20938:
--
Attachment: (was: HBASE-20938.patch)

> Set version to 2.1.1-SNAPSHOT for branch-2.1
> 
>
> Key: HBASE-20938
> URL: https://issues.apache.org/jira/browse/HBASE-20938
> Project: HBase
>  Issue Type: Sub-task
>  Components: build
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 2.1.1
>
> Attachments: HBASE-20938-branch-2.1.patch
>
>






[jira] [Commented] (HBASE-20885) Remove entry for RPC quota from hbase:quota when RPC quota is removed.

2018-07-25 Thread Sakthi (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16555339#comment-16555339
 ] 

Sakthi commented on HBASE-20885:


In GlobalQuotaSettingsImpl#merge:
{code:java}
validateQuotaTarget(other);

// Propagate the Throttle
QuotaProtos.Throttle.Builder throttleBuilder =
    (throttleProto == null ? null : throttleProto.toBuilder());
if (other instanceof ThrottleSettings) {
  if (throttleBuilder == null) {
    throttleBuilder = QuotaProtos.Throttle.newBuilder();
  }
  ThrottleSettings otherThrottle = (ThrottleSettings) other;
  if (otherThrottle.proto.hasType()) {
    QuotaProtos.ThrottleRequest otherProto = otherThrottle.proto;
    if (otherProto.hasTimedQuota()) {
      if (otherProto.hasTimedQuota()) {
        validateTimedQuota(otherProto.getTimedQuota());
      }

      switch (otherProto.getType()) {
        ...
      }
    } else {
      throttleBuilder = clearThrottleBuilder(throttleBuilder);
    }
  } else {
    throttleBuilder = clearThrottleBuilder(throttleBuilder);
  }
}
{code}

The following change solves the issue:
{code:java}
-   throttleBuilder = clearThrottleBuilder(throttleBuilder);
+   throttleBuilder = null;
{code}
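
As a minimal standalone model (hypothetical names, not the real HBase classes) 
of why {{null}} differs from a cleared builder here: a cleared-but-still-present 
throttle round-trips as an empty message (the {{PBUF\x12\x00}} row shown in the 
description below), while a {{null}} builder lets the merged settings carry no 
throttle at all, so the hbase:quota row can be deleted:
{code:java}
public class QuotaMergeSketch {
  static final class Throttle {} // stand-in for QuotaProtos.Throttle

  static Throttle merge(Throttle current, boolean removeThrottle) {
    if (removeThrottle) {
      // return new Throttle(); // buggy variant: empty-but-present throttle,
      //                        // persisted as a lingering PBUF\x12\x00 row
      return null;              // fixed variant: no throttle at all
    }
    return current;
  }

  static boolean shouldDeleteQuotaRow(Throttle merged) {
    return merged == null; // only a truly absent throttle deletes the row
  }

  public static void main(String[] args) {
    System.out.println(shouldDeleteQuotaRow(merge(new Throttle(), true))); // true
  }
}
{code}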

> Remove entry for RPC quota from hbase:quota when RPC quota is removed.
> --
>
> Key: HBASE-20885
> URL: https://issues.apache.org/jira/browse/HBASE-20885
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Sakthi
>Assignee: Sakthi
>Priority: Minor
>
> When a RPC quota is removed (using LIMIT => 'NONE'), the entry from 
> hbase:quota table is not completely removed. For e.g. see below:
> {noformat}
> hbase(main):005:0> create 't2','cf1'
> Created table t2
> Took 0.8000 seconds
> => Hbase::Table - t2
> hbase(main):006:0> set_quota TYPE => THROTTLE, TABLE => 't2', LIMIT => 
> '10M/sec'
> Took 0.1024 seconds
> hbase(main):007:0> list_quotas
> OWNER  QUOTAS
>  TABLE => t2   TYPE => THROTTLE, THROTTLE_TYPE => 
> REQUEST_SIZE, LIMIT => 10M/sec, SCOPE => MACHINE
> 1 row(s)
> Took 0.0622 seconds
> hbase(main):008:0> scan 'hbase:quota'
> ROWCOLUMN+CELL
>  t.t2  column=q:s, timestamp=1531513014463, 
> value=PBUF\x12\x0B\x12\x09\x08\x04\x10\x80\x80\x80
>\x05 \x02
> 1 row(s)
> Took 0.0453 seconds
> hbase(main):009:0> set_quota TYPE => THROTTLE, TABLE => 't2', LIMIT => 'NONE'
> Took 0.0097 seconds
> hbase(main):010:0> list_quotas
> OWNER  QUOTAS
> 0 row(s)
> Took 0.0338 seconds
> hbase(main):011:0> scan 'hbase:quota'
> ROWCOLUMN+CELL
>  t.t2  column=q:s, timestamp=1531513039505, 
> value=PBUF\x12\x00
> 1 row(s)
> Took 0.0066 seconds
> {noformat}





[jira] [Commented] (HBASE-20885) Remove entry for RPC quota from hbase:quota when RPC quota is removed.

2018-07-25 Thread Sakthi (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16555343#comment-16555343
 ] 

Sakthi commented on HBASE-20885:


Testing done: 
 # The way of reproducing the issue mentioned in the description no longer 
produces it.
 # Test case added: TestQuotaAdmin#testSetGetRemoveRPCQuota

> Remove entry for RPC quota from hbase:quota when RPC quota is removed.
> --
>
> Key: HBASE-20885
> URL: https://issues.apache.org/jira/browse/HBASE-20885
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Sakthi
>Assignee: Sakthi
>Priority: Minor
>
> When a RPC quota is removed (using LIMIT => 'NONE'), the entry from 
> hbase:quota table is not completely removed. For e.g. see below:
> {noformat}
> hbase(main):005:0> create 't2','cf1'
> Created table t2
> Took 0.8000 seconds
> => Hbase::Table - t2
> hbase(main):006:0> set_quota TYPE => THROTTLE, TABLE => 't2', LIMIT => 
> '10M/sec'
> Took 0.1024 seconds
> hbase(main):007:0> list_quotas
> OWNER  QUOTAS
>  TABLE => t2   TYPE => THROTTLE, THROTTLE_TYPE => 
> REQUEST_SIZE, LIMIT => 10M/sec, SCOPE => MACHINE
> 1 row(s)
> Took 0.0622 seconds
> hbase(main):008:0> scan 'hbase:quota'
> ROWCOLUMN+CELL
>  t.t2  column=q:s, timestamp=1531513014463, 
> value=PBUF\x12\x0B\x12\x09\x08\x04\x10\x80\x80\x80
>\x05 \x02
> 1 row(s)
> Took 0.0453 seconds
> hbase(main):009:0> set_quota TYPE => THROTTLE, TABLE => 't2', LIMIT => 'NONE'
> Took 0.0097 seconds
> hbase(main):010:0> list_quotas
> OWNER  QUOTAS
> 0 row(s)
> Took 0.0338 seconds
> hbase(main):011:0> scan 'hbase:quota'
> ROWCOLUMN+CELL
>  t.t2  column=q:s, timestamp=1531513039505, 
> value=PBUF\x12\x00
> 1 row(s)
> Took 0.0066 seconds
> {noformat}





[jira] [Updated] (HBASE-18822) Create table for peer cluster automatically when creating table in source cluster of using namespace replication.

2018-07-25 Thread Zheng Hu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-18822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zheng Hu updated HBASE-18822:
-
Fix Version/s: 2.2.0
   3.0.0

> Create table for peer cluster automatically when creating table in source 
> cluster of using namespace replication.
> -
>
> Key: HBASE-18822
> URL: https://issues.apache.org/jira/browse/HBASE-18822
> Project: HBase
>  Issue Type: Improvement
>  Components: Replication
>Affects Versions: 2.0.0-alpha-2
>Reporter: Zheng Hu
>Assignee: Zheng Hu
>Priority: Major
> Fix For: 3.0.0, 2.2.0
>
>
> In our clusters that use namespace replication, we always forget to create the 
> table in the peer cluster, which leads to replication getting stuck. 
> We have implemented this feature in our cluster: create the table in the peer 
> cluster automatically when creating a table in the source cluster that uses 
> namespace replication.
>  
> I'm not sure whether someone else needs this feature, so I'm creating an issue 
> here for discussion.





[jira] [Comment Edited] (HBASE-20919) meta region can't be re-onlined when restarting cluster if opening rsgroup

2018-07-25 Thread chenyang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16555316#comment-16555316
 ] 

chenyang edited comment on HBASE-20919 at 7/25/18 7:44 AM:
---

Submitted HBASE-20919-branch-2.0-02.patch, which implements fast failing, along 
with logs.

Testing steps (with one master and one RS):

1: start master without 02.patch

2: start rs without 02.patch 

    now, the cluster works fine

3: stop rs

4: stop master

5: restart master

6: restart rs

    now the hbase:meta region cannot be assigned successfully.

7: stop rs

8: stop master

9: apply 02.patch to branch-2.0, recompile the hbase-rsgroup module, and replace 
hbase-rsgroup-2.0.2-SNAPSHOT.jar with the new version that includes 02.patch

10: restart master

11: restart rs

now the hbase:meta region can be assigned successfully and the cluster works fine.

 

Logs:

hbase-hbase-master-bjpg-rs4729.yz02.no_02patch.log covers steps 1 to 8.

In the log file, you can see that RSGroupInfoManagerImpl$RSGroupStartupWorker 
kept trying to check whether the meta region was online, but failed every time.

 
{code:java}
2018-07-25 12:15:15,064 INFO 
[org.apache.hadoop.hbase.rsgroup.RSGroupInfoManagerImpl$RSGroupStartupWorker-bjpg-rs4729.yz02,16000,1532491114549]
 zookeeper.Me
taTableLocator: Failed verification of hbase:meta,,1 at 
address=bjpg-rs4736.yz02,16020,1532490935452, 
exception=org.apache.hadoop.hbase.NotServingRegionExcep
tion: hbase:meta,,1 is not online on bjpg-rs4736.yz02,16020,1532491949108
at 
org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionByEncodedName(HRegionServer.java:3246)
at 
org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:3223)
at 
org.apache.hadoop.hbase.regionserver.RSRpcServices.getRegion(RSRpcServices.java:1414)
at 
org.apache.hadoop.hbase.regionserver.RSRpcServices.getRegionInfo(RSRpcServices.java:1729)
at 
org.apache.hadoop.hbase.shaded.protobuf.generated.AdminProtos$AdminService$2.callBlockingMethod(AdminProtos.java:28286)
at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:409)
at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:130)
at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)
{code}
hbase-hbase-master-bjpg-rs4729.yz02.with_02patch.log covers steps 9 to 10.

In the log file, you can see that the hbase:meta region was finally assigned 
successfully after failing several times.
{code:java}
2018-07-25 14:27:12,356 WARN [master/bjpg-rs4729:16000] 
rsgroup.RSGroupBasedLoadBalancer: RSGroupBasedLoadBalancer has not been 
initialized
org.apache.hadoop.hbase.HBaseIOException: RSGroupBasedLoadBalancer has not been 
initialized
at 
org.apache.hadoop.hbase.rsgroup.RSGroupBasedLoadBalancer.checkInitializedState(RSGroupBasedLoadBalancer.java:480)
at 
org.apache.hadoop.hbase.rsgroup.RSGroupBasedLoadBalancer.roundRobinAssignment(RSGroupBasedLoadBalancer.java:161){code}


was (Author: hb-cy):
Submitted HBASE-20919-branch-2.0-02.patch, which implements fast failing, along 
with logs.

Testing steps (with one master and one RS):

1: start master without 02.patch

2: start rs without 02.patch 

    now, the cluster works fine

3: stop rs

4: stop master

5: restart master

6: restart rs

    now the hbase:meta region cannot be assigned successfully.

7: stop rs

8: stop master

9: apply 02.patch to branch-2.0, recompile the hbase-rsgroup module, and replace 
hbase-rsgroup-2.0.2-SNAPSHOT.jar with the new version that includes 02.patch

10: restart master

11: restart rs

now the hbase:meta region can be assigned successfully and the cluster works fine.

 

Logs:

hbase-hbase-master-bjpg-rs4729.yz02.log.no_02patch covers steps 1 to 8.

In the log file, you can see that RSGroupInfoManagerImpl$RSGroupStartupWorker 
kept trying to check whether the meta region was online, but failed every time.

 
{code:java}
2018-07-25 12:15:15,064 INFO 
[org.apache.hadoop.hbase.rsgroup.RSGroupInfoManagerImpl$RSGroupStartupWorker-bjpg-rs4729.yz02,16000,1532491114549]
 zookeeper.Me
taTableLocator: Failed verification of hbase:meta,,1 at 
address=bjpg-rs4736.yz02,16020,1532490935452, 
exception=org.apache.hadoop.hbase.NotServingRegionExcep
tion: hbase:meta,,1 is not online on bjpg-rs4736.yz02,16020,1532491949108
at 
org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionByEncodedName(HRegionServer.java:3246)
at 
org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:3223)
at 
org.apache.hadoop.hbase.regionserver.RSRpcServices.getRegion(RSRpcServices.java:1414)
at 
org.apache.hadoop.hbase.regionserver.RSRpcServices.getRegionInfo(RSRpcServices.java:1729)
at 
org.apache.hadoop.hbase.shaded.protobuf.generated.AdminProtos$AdminService$2.callBlockingMethod(AdminProtos.java:28286)
at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:409)
at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:130)
at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)

[jira] [Comment Edited] (HBASE-20919) meta region can't be re-onlined when restarting cluster if opening rsgroup

2018-07-25 Thread chenyang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16555316#comment-16555316
 ] 

chenyang edited comment on HBASE-20919 at 7/25/18 7:46 AM:
---

Submitted HBASE-20919-branch-2.0-02.patch, which implements fast failing, along 
with logs.

Testing steps (with one master and one RS):

1: start master without 02.patch

2: start rs without 02.patch 

    now, the cluster works fine

3: stop rs

4: stop master

5: restart master

6: restart rs

    now the hbase:meta region cannot be assigned successfully.

7: stop rs

8: stop master

9: apply 02.patch to branch-2.0, recompile the hbase-rsgroup module, and replace 
hbase-rsgroup-2.0.2-SNAPSHOT.jar with the new version that includes 02.patch

10: restart master

11: restart rs

now the hbase:meta region can be assigned successfully and the cluster works fine.

Logs:

hbase-hbase-master-bjpg-rs4729.yz02.no_02patch.log covers steps 1 to 8.

In the log file, you can see that RSGroupInfoManagerImpl$RSGroupStartupWorker 
kept trying to check whether the meta region was online, but failed every time.
{code:java}
2018-07-25 12:15:15,064 INFO 
[org.apache.hadoop.hbase.rsgroup.RSGroupInfoManagerImpl$RSGroupStartupWorker-bjpg-rs4729.yz02,16000,1532491114549]
 zookeeper.Me
taTableLocator: Failed verification of hbase:meta,,1 at 
address=bjpg-rs4736.yz02,16020,1532490935452, 
exception=org.apache.hadoop.hbase.NotServingRegionExcep
tion: hbase:meta,,1 is not online on bjpg-rs4736.yz02,16020,1532491949108
at 
org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionByEncodedName(HRegionServer.java:3246)
at 
org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:3223)
at 
org.apache.hadoop.hbase.regionserver.RSRpcServices.getRegion(RSRpcServices.java:1414)
at 
org.apache.hadoop.hbase.regionserver.RSRpcServices.getRegionInfo(RSRpcServices.java:1729)
at 
org.apache.hadoop.hbase.shaded.protobuf.generated.AdminProtos$AdminService$2.callBlockingMethod(AdminProtos.java:28286)
at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:409)
at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:130)
at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)
{code}
hbase-hbase-master-bjpg-rs4729.yz02.with_02patch.log covers steps 9 to 10.

In the log file, you can see that the hbase:meta region was finally assigned 
successfully after failing several times.
{code:java}
2018-07-25 14:27:12,356 WARN [master/bjpg-rs4729:16000] 
rsgroup.RSGroupBasedLoadBalancer: RSGroupBasedLoadBalancer has not been 
initialized
org.apache.hadoop.hbase.HBaseIOException: RSGroupBasedLoadBalancer has not been 
initialized
at 
org.apache.hadoop.hbase.rsgroup.RSGroupBasedLoadBalancer.checkInitializedState(RSGroupBasedLoadBalancer.java:480)
at 
org.apache.hadoop.hbase.rsgroup.RSGroupBasedLoadBalancer.roundRobinAssignment(RSGroupBasedLoadBalancer.java:161){code}


was (Author: hb-cy):
Submitted HBASE-20919-branch-2.0-02.patch, which implements fast failing, along 
with logs.

Testing steps (with one master and one RS):

1: start master without 02.patch

2: start rs without 02.patch 

    now, the cluster works fine

3: stop rs

4: stop master

5: restart master

6: restart rs

    now the hbase:meta region cannot be assigned successfully.

7: stop rs

8: stop master

9: apply 02.patch to branch-2.0, recompile the hbase-rsgroup module, and replace 
hbase-rsgroup-2.0.2-SNAPSHOT.jar with the new version that includes 02.patch

10: restart master

11: restart rs

now the hbase:meta region can be assigned successfully and the cluster works fine.

 

Logs:

hbase-hbase-master-bjpg-rs4729.yz02.no_02patch.log covers steps 1 to 8.

In the log file, you can see that RSGroupInfoManagerImpl$RSGroupStartupWorker 
kept trying to check whether the meta region was online, but failed every time.

 
{code:java}
2018-07-25 12:15:15,064 INFO 
[org.apache.hadoop.hbase.rsgroup.RSGroupInfoManagerImpl$RSGroupStartupWorker-bjpg-rs4729.yz02,16000,1532491114549]
 zookeeper.Me
taTableLocator: Failed verification of hbase:meta,,1 at 
address=bjpg-rs4736.yz02,16020,1532490935452, 
exception=org.apache.hadoop.hbase.NotServingRegionExcep
tion: hbase:meta,,1 is not online on bjpg-rs4736.yz02,16020,1532491949108
at 
org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionByEncodedName(HRegionServer.java:3246)
at 
org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:3223)
at 
org.apache.hadoop.hbase.regionserver.RSRpcServices.getRegion(RSRpcServices.java:1414)
at 
org.apache.hadoop.hbase.regionserver.RSRpcServices.getRegionInfo(RSRpcServices.java:1729)
at 
org.apache.hadoop.hbase.shaded.protobuf.generated.AdminProtos$AdminService$2.callBlockingMethod(AdminProtos.java:28286)
at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:409)
at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:130)
at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)

[jira] [Created] (HBASE-20939) There will be race when we call suspendIfNotReady and then throw ProcedureSuspendedException

2018-07-25 Thread Duo Zhang (JIRA)
Duo Zhang created HBASE-20939:
-

 Summary: There will be race when we call suspendIfNotReady and 
then throw ProcedureSuspendedException
 Key: HBASE-20939
 URL: https://issues.apache.org/jira/browse/HBASE-20939
 Project: HBase
  Issue Type: Sub-task
Reporter: Duo Zhang


This is a very typical usage pattern in our procedure implementation. For 
example, in AssignProcedure, we call AM.queueAssign and then suspend ourselves 
to wait until the AM finishes processing our assign request.

But there could be races. Think of this:
1. We call suspendIfNotReady on an event, and it returns true, so we need to wait.
2. The event is woken up, and the procedure is added back to the 
scheduler.
3. A worker picks up the procedure and finishes it.
4. We finally throw ProcedureSuspendedException, and the ProcedureExecutor 
suspends us and stores the state in the procedure store.

So we have a half-done procedure in the procedure store forever... This may 
cause an assertion failure when loading procedures. And maybe the worker cannot 
finish the procedure, as when suspending we need to restore some state, for 
example, add something to RootProcedureState. Either way, it will lead to an 
assertion failure or other unexpected errors.

And this cannot be fixed by simply adding a lock in the procedure, as most of 
the work is done in the ProcedureExecutor after we throw 
ProcedureSuspendedException.
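
As a standalone illustration (plain Java, not HBase code), this sketch forces 
the bad interleaving with latches: the procedure decides to suspend, the worker 
finishes it in the window before the suspension is recorded, and we end up with 
a procedure that is both finished and persisted as suspended:
{code:java}
import java.util.concurrent.CountDownLatch;

public class SuspendRaceSketch {
  static volatile boolean suspended = false;
  static volatile boolean finished = false;

  public static void main(String[] args) throws Exception {
    CountDownLatch decidedToSuspend = new CountDownLatch(1);
    CountDownLatch workerDone = new CountDownLatch(1);

    Thread procedure = new Thread(() -> {
      // Step 1: suspendIfNotReady returned true, so we plan to suspend.
      decidedToSuspend.countDown();
      await(workerDone); // steps 2+3 sneak into this window
      // Step 4: we "throw ProcedureSuspendedException"; the executor now
      // persists a suspended state for an already-finished procedure.
      suspended = true;
    });

    Thread worker = new Thread(() -> {
      await(decidedToSuspend);
      // Steps 2+3: the event is woken, the procedure is re-queued and run.
      finished = true;
      workerDone.countDown();
    });

    procedure.start();
    worker.start();
    procedure.join();
    worker.join();
    // Both flags end up true: the half-done state described above.
    System.out.println("finished=" + finished + ", suspended=" + suspended);
  }

  static void await(CountDownLatch latch) {
    try {
      latch.await();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
  }
}
{code}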





[jira] [Updated] (HBASE-18822) Create table for peer cluster automatically when creating table in source cluster of using namespace replication.

2018-07-25 Thread Zheng Hu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-18822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zheng Hu updated HBASE-18822:
-
Attachment: HBASE-18822.v1.patch

> Create table for peer cluster automatically when creating table in source 
> cluster of using namespace replication.
> -
>
> Key: HBASE-18822
> URL: https://issues.apache.org/jira/browse/HBASE-18822
> Project: HBase
>  Issue Type: Improvement
>  Components: Replication
>Affects Versions: 2.0.0-alpha-2
>Reporter: Zheng Hu
>Assignee: Zheng Hu
>Priority: Major
> Fix For: 3.0.0, 2.2.0
>
> Attachments: HBASE-18822.v1.patch
>
>
> In our clusters that use namespace replication, we always forget to create the 
> table in the peer cluster, which leads to replication getting stuck. 
> We have implemented this feature in our cluster: create the table in the peer 
> cluster automatically when creating a table in the source cluster that uses 
> namespace replication.
>  
> I'm not sure whether someone else needs this feature, so I'm creating an issue 
> here for discussion.





[jira] [Updated] (HBASE-18822) Create table for peer cluster automatically when creating table in source cluster of using namespace replication.

2018-07-25 Thread Zheng Hu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-18822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zheng Hu updated HBASE-18822:
-
Status: Patch Available  (was: Open)

> Create table for peer cluster automatically when creating table in source 
> cluster of using namespace replication.
> -
>
> Key: HBASE-18822
> URL: https://issues.apache.org/jira/browse/HBASE-18822
> Project: HBase
>  Issue Type: Improvement
>  Components: Replication
>Affects Versions: 2.0.0-alpha-2
>Reporter: Zheng Hu
>Assignee: Zheng Hu
>Priority: Major
> Fix For: 3.0.0, 2.2.0
>
> Attachments: HBASE-18822.v1.patch
>
>
> In our clusters that use namespace replication, we always forget to create the 
> table in the peer cluster, which leads to replication getting stuck. 
> We have implemented this feature in our cluster: create the table in the peer 
> cluster automatically when creating a table in the source cluster that uses 
> namespace replication.
>  
> I'm not sure whether someone else needs this feature, so I'm creating an issue 
> here for discussion.





[jira] [Commented] (HBASE-20716) Unsafe access cleanup

2018-07-25 Thread Anoop Sam John (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16555409#comment-16555409
 ] 

Anoop Sam John commented on HBASE-20716:


Ya, please handle BBUtils also. That is the main class getting used in our code 
paths now.

> Unsafe access cleanup
> -
>
> Key: HBASE-20716
> URL: https://issues.apache.org/jira/browse/HBASE-20716
> Project: HBase
>  Issue Type: Sub-task
>  Components: Performance
>Reporter: stack
>Assignee: Sahil Aggarwal
>Priority: Critical
>  Labels: beginner
> Attachments: HBASE-20716.master.001.patch, 
> HBASE-20716.master.002.patch, HBASE-20716.master.003.patch, Screen Shot 
> 2018-06-26 at 11.37.49 AM.png
>
>
> We have two means of getting at unsafe; UnsafeAccess and then internal to the 
> Bytes class. They are effectively doing the same thing. We should have one 
> avenue to Unsafe only.
> Many of our paths to Unsafe via UnsafeAccess traverse flags to check if 
> access is available, if it is aligned and the order in which words are 
> written on the machine. Each check costs -- especially if done millions of 
> times a second -- and on occasion adds bloat in hot code paths. The unsafe 
> access inside Bytes checks on startup what the machine is capable of and 
> then does a static assign of the appropriate class-to-use from there on out. 
> UnsafeAccess does not do this, running the checks every time. It would be good 
> to have the Bytes behavior pervasive.
> The benefit of one access to Unsafe only is plain. The benefits we gain 
> removing checks will be harder to measure though should be plain when you 
> disassemble a hot-path; in a (very) rare case, the saved byte codes could be 
> the difference between inlining or not.
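
A standalone sketch (hypothetical names, not the HBase classes) of the 
Bytes-style pattern described above: probe platform capabilities once in a 
static initializer and bind a single implementation, so hot paths pay no 
per-call checks:
{code:java}
import java.nio.ByteOrder;

public final class LexicalComparers {
  interface Comparer {
    int compare(byte[] a, byte[] b);
  }

  // Chosen once at class-load time, like the static assign in Bytes.
  static final Comparer BEST = pickBest();

  private static Comparer pickBest() {
    // Real code would also probe sun.misc.Unsafe availability and alignment;
    // this sketch only inspects byte order and always falls back to pure Java.
    boolean littleEndian = ByteOrder.nativeOrder() == ByteOrder.LITTLE_ENDIAN;
    return littleEndian ? new PureJavaComparer() : new PureJavaComparer();
  }

  static final class PureJavaComparer implements Comparer {
    @Override
    public int compare(byte[] a, byte[] b) {
      int n = Math.min(a.length, b.length);
      for (int i = 0; i < n; i++) {
        int cmp = (a[i] & 0xff) - (b[i] & 0xff);
        if (cmp != 0) {
          return cmp;
        }
      }
      return a.length - b.length;
    }
  }

  public static void main(String[] args) {
    // Negative result: "abc" sorts before "abd"; no capability flags are
    // re-checked on this call.
    System.out.println(BEST.compare("abc".getBytes(), "abd".getBytes()));
  }
}
{code}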





[jira] [Commented] (HBASE-20935) HStore.removeCompactedfiles should log incase it unable to delete a file

2018-07-25 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16555412#comment-16555412
 ] 

Hadoop QA commented on HBASE-20935:
---

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
11s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:orange}-0{color} | {color:orange} test4tests {color} | {color:orange}  
0m  0s{color} | {color:orange} The patch doesn't appear to include any new or 
modified tests. Please justify why no new tests are needed for this patch. Also 
please list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
10s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
45s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
11s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
34s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
19s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
31s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
44s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
10m 13s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
30s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}119m  
5s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
23s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}161m 36s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-20935 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12933004/HBASE-20935.patch |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  shadedjars  
hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux d2b7fc6bd3e2 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 
08:52:28 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | master / e44f506694 |
| maven | version: Apache Maven 3.5.4 
(1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| Default Java | 1.8.0_171 |
| findbugs | v3.1.0-RC3 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HBASE-Build/13779/testReport/ |
| Max. process+thread count | 4739 (vs. ulimit of 1) |
| modules | C: hbase-server U: hbase-server |
| Console output | 
https://builds.apache.org/job/PreCommit-HBASE-Build/13779/console |

[jira] [Commented] (HBASE-18822) Create table for peer cluster automatically when creating table in source cluster of using namespace replication.

2018-07-25 Thread Ashish Singhi (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-18822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16555415#comment-16555415
 ] 

Ashish Singhi commented on HBASE-18822:
---

Looked at the patch at a high level.
This will create all the tables in the peer cluster after enabling it. Maybe 
some users are not interested in replicating every table's data, but we will 
still end up creating the table in all the peers.
Can we instead add a boolean to the DDL APIs, false by default, and when true 
replicate the same operation in the peer clusters? We did it the same way at my 
previous employer. WDYT?
//CC [~pankaj2461]

> Create table for peer cluster automatically when creating table in source 
> cluster of using namespace replication.
> -
>
> Key: HBASE-18822
> URL: https://issues.apache.org/jira/browse/HBASE-18822
> Project: HBase
>  Issue Type: Improvement
>  Components: Replication
>Affects Versions: 2.0.0-alpha-2
>Reporter: Zheng Hu
>Assignee: Zheng Hu
>Priority: Major
> Fix For: 3.0.0, 2.2.0
>
> Attachments: HBASE-18822.v1.patch
>
>
> In our clusters that use namespace replication, we always forget to create the 
> table in the peer cluster, which leads to replication getting stuck. 
> We have implemented this feature in our cluster: create the table in the peer 
> cluster automatically when creating a table in the source cluster that uses 
> namespace replication.
>  
> I'm not sure whether someone else needs this feature, so I'm creating an issue 
> here for discussion.





[jira] [Commented] (HBASE-18822) Create table for peer cluster automatically when creating table in source cluster of using namespace replication.

2018-07-25 Thread Zheng Hu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-18822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16555451#comment-16555451
 ] 

Zheng Hu commented on HBASE-18822:
--

bq. This will create all the tables in the peer cluster after enabling it. May 
be some user are not interested in all the tables data replication but still we 
will end up in creating the table in all the peers.  
[~ashish singhi], not all tables. Only when the replication peer config contains 
the table will the DDL of that table be replicated to the peer cluster. For 
example, if we have a peer with namespace ns, then the DDL of all tables under 
namespace ns will be replicated to the peer cluster.
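
A small sketch (assumed names, not the actual RSGroup/replication API) of the 
matching rule described here: a table's DDL replicates to a peer only if the 
peer config contains the table's namespace or the table itself:
{code:java}
import java.util.Collections;
import java.util.Set;

public class PeerDdlMatchSketch {
  // Returns true when the peer config names the table's namespace or the table.
  static boolean shouldReplicateDdl(Set<String> peerNamespaces,
      Set<String> peerTables, String namespace, String table) {
    return peerNamespaces.contains(namespace) || peerTables.contains(table);
  }

  public static void main(String[] args) {
    Set<String> namespaces = Collections.singleton("ns");
    Set<String> tables = Collections.emptySet();
    // A table under namespace ns replicates its DDL; others do not.
    System.out.println(shouldReplicateDdl(namespaces, tables, "ns", "ns:t1"));       // true
    System.out.println(shouldReplicateDdl(namespaces, tables, "other", "other:t2")); // false
  }
}
{code}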

> Create table for peer cluster automatically when creating table in source 
> cluster of using namespace replication.
> -
>
> Key: HBASE-18822
> URL: https://issues.apache.org/jira/browse/HBASE-18822
> Project: HBase
>  Issue Type: Improvement
>  Components: Replication
>Affects Versions: 2.0.0-alpha-2
>Reporter: Zheng Hu
>Assignee: Zheng Hu
>Priority: Major
> Fix For: 3.0.0, 2.2.0
>
> Attachments: HBASE-18822.v1.patch
>
>
> In our clusters that use namespace replication, we always forget to create the 
> table in the peer cluster, which leads to replication getting stuck. 
> We have implemented this feature in our cluster: create the table in the peer 
> cluster automatically when creating a table in the source cluster that uses 
> namespace replication.
>  
> I'm not sure whether someone else needs this feature, so I'm creating an issue 
> here for discussion.





[jira] [Commented] (HBASE-20649) Validate HFiles do not have PREFIX_TREE DataBlockEncoding

2018-07-25 Thread Balazs Meszaros (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16555345#comment-16555345
 ] 

Balazs Meszaros commented on HBASE-20649:
-

[~busbey], is there anything missing from my patch?

> Validate HFiles do not have PREFIX_TREE DataBlockEncoding
> -
>
> Key: HBASE-20649
> URL: https://issues.apache.org/jira/browse/HBASE-20649
> Project: HBase
>  Issue Type: New Feature
>Reporter: Peter Somogyi
>Assignee: Balazs Meszaros
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: HBASE-20649.master.001.patch, 
> HBASE-20649.master.002.patch, HBASE-20649.master.003.patch, 
> HBASE-20649.master.004.patch, HBASE-20649.master.005.patch, 
> HBASE-20649.master.006.patch
>
>
> HBASE-20592 adds a tool to check column families on the cluster do not have 
> PREFIX_TREE encoding.
> Since it is possible that the DataBlockEncoding was already changed but the 
> HFiles have not been rewritten yet, we need a tool that can verify the content 
> of HFiles in the cluster.





[jira] [Commented] (HBASE-20873) Update doc for Endpoint-based Export

2018-07-25 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16555351#comment-16555351
 ] 

Hudson commented on HBASE-20873:


Results for branch branch-2
[build #1024 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1024/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1024//General_Nightly_Build_Report/]




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1024//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- Something went wrong running this stage, please [check relevant console 
output|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1024//console].


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> Update doc for Endpoint-based Export
> 
>
> Key: HBASE-20873
> URL: https://issues.apache.org/jira/browse/HBASE-20873
> Project: HBase
>  Issue Type: Improvement
>  Components: documentation
>Affects Versions: 2.0.0
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
>Priority: Minor
> Fix For: 3.0.0, 2.0.2, 2.2.0, 2.1.1
>
> Attachments: HBASE-20873.master.001.patch, 
> HBASE-20873.master.002.patch
>
>
> The current documentation on the usage is a little vague. I'd like to take a 
> stab at expanding it, based on my experience.





[jira] [Updated] (HBASE-20921) Possible NPE in ReopenTableRegionsProcedure

2018-07-25 Thread Allan Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allan Yang updated HBASE-20921:
---
Attachment: HBASE-20921.branch-2.0.002.patch

> Possible NPE in ReopenTableRegionsProcedure
> ---
>
> Key: HBASE-20921
> URL: https://issues.apache.org/jira/browse/HBASE-20921
> Project: HBase
>  Issue Type: Sub-task
>  Components: amv2
>Affects Versions: 3.0.0, 2.1.0, 2.0.2
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Major
> Attachments: HBASE-20921.branch-2.0.001.patch, 
> HBASE-20921.branch-2.0.002.patch
>
>
> After HBASE-20752, we issue a ReopenTableRegionsProcedure in 
> ModifyTableProcedure to ensure all regions are reopened.
> But ModifyTableProcedure and ReopenTableRegionsProcedure do not hold the 
> lock (why?), so there is a chance that while ModifyTableProcedure is 
> executing, a merge/split procedure can be executed at the same time.
> So, when ReopenTableRegionsProcedure reaches the state of 
> "REOPEN_TABLE_REGIONS_CONFIRM_REOPENED", some of the persisted regions to 
> check no longer exist, and an NPE is thrown.
> {code}
> 2018-07-18 01:38:57,528 INFO  [PEWorker-9] 
> procedure2.ProcedureExecutor(1246): Finished pid=6110, state=SUCCESS; 
> MergeTableRegionsProcedure table=IntegrationTestBigLinkedList, 
> regions=[845d286231eb01b7
> 1aeaa17b0e30058d, 4a46ab0918c99cada72d5336ad83a828], forcibly=false in 
> 10.8610sec
> 2018-07-18 01:38:57,530 ERROR [PEWorker-8] 
> procedure2.ProcedureExecutor(1478): CODE-BUG: Uncaught runtime exception: 
> pid=5974, ppid=5973, state=RUNNABLE:REOPEN_TABLE_REGIONS_CONFIRM_REOPENED; 
> ReopenTab
> leRegionsProcedure table=IntegrationTestBigLinkedList
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hbase.master.assignment.RegionStates.checkReopened(RegionStates.java:651)
> at 
> java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
> at 
> java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1382)
> at 
> java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
> at 
> java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
> at 
> java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
> at 
> java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
> at 
> java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499)
> at 
> org.apache.hadoop.hbase.master.procedure.ReopenTableRegionsProcedure.executeFromState(ReopenTableRegionsProcedure.java:102)
> at 
> org.apache.hadoop.hbase.master.procedure.ReopenTableRegionsProcedure.executeFromState(ReopenTableRegionsProcedure.java:45)
> at 
> org.apache.hadoop.hbase.procedure2.StateMachineProcedure.execute(StateMachineProcedure.java:184)
> at 
> org.apache.hadoop.hbase.procedure2.Procedure.doExecute(Procedure.java:850)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1453)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1221)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$800(ProcedureExecutor.java:75)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1741)
> {code}
> I think we need to renew the region list of the table at the 
> "REOPEN_TABLE_REGIONS_CONFIRM_REOPENED" state. The regions which were 
> merged or split do not need to be checked, since we can be sure that they 
> were opened after we made the change to the table descriptor.
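A minimal sketch of that renewal idea (hypothetical helper; the committed patch 
may be shaped differently): before calling checkReopened, drop regions whose 
RegionStateNode has disappeared, i.e. regions merged or split away while the 
procedure was suspended.
{code:java}
import java.util.List;
import java.util.Objects;
import java.util.stream.Collectors;
import org.apache.hadoop.hbase.client.RegionInfo;
import org.apache.hadoop.hbase.master.assignment.RegionStates;

final class ReopenConfirmHelper {
  /** Keeps only regions that still have a live RegionStateNode. */
  static List<RegionInfo> liveRegionsOnly(RegionStates states, List<RegionInfo> persisted) {
    return persisted.stream()
        .filter(ri -> Objects.nonNull(states.getRegionStateNode(ri)))
        .collect(Collectors.toList());
  }
}
{code}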



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20933) multiple splits may result into forever uncleaned split region

2018-07-25 Thread Vishal Khandelwal (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vishal Khandelwal updated HBASE-20933:
--
Description: 
In case of multiple subsequent splits with an open handle on an old reference 
file, the result may be a split region which can never be cleaned up.

 So here are two issues.
 # A region is getting split even when it has a reference to its parent.
 # A region is going offline/into archive mode even when there are references 
pending in the store.

*Repro Steps*
 # Region split (P).
 # Before the major compaction that follows the split starts, open a handle on 
a store file of a new region (DA & DB).
 # Let compaction complete on DA (here compaction will not clear the reference 
store files, as one is held open).
 # Split the new region (DA) again (shouldSplit will return true because 
compaction removes the compacted files and references from the in-memory list 
even before it does the file cleanup).
 # Now CatalogJanitor will not remove this region as it has store references, 
and majorCompaction/CompactedHFilesDischarger will not do the cleanup as it 
looks only at online regions.
 # After the above steps, region DA, which is offline, will stay among the 
split regions forever and never get cleaned up.

We found that the catalog janitor is also unable to clean regions which are 
offline (split parents), because the daughter still holds references to its 
parent that are not getting cleaned up. This causes many store files to go 
uncleaned, consuming extra space in the local index store, and leaves many 
lingering split regions.

A unit test reproducing the scenario has been attached.

The fix can be in CompactedHFilesDischarger or CatalogJanitor to handle such 
cases.

  was:
In case of multiple subsequent splits with an open handle on an old reference 
file, the result may be a split region which can never be cleaned up.

 So here are two issues.
 # A region is getting split even when it has a reference to its parent.
 # A region is going offline/into archive mode even when there are references 
pending in the store.

*Repro Steps*
 # Region split (P).
 # Before the major compaction that follows the split starts, open a handle on 
a store file of a new region (DA & DB).
 # Let compaction complete on DA (here compaction will not clear the reference 
store files, as one is held open).
 # Split the new region (DA) again (shouldSplit will return true because 
compaction removes the compacted files and references from the in-memory list 
even before it does the file cleanup).
 # Now CatalogJanitor will not remove this region as it has store references, 
and majorCompaction/CompactedHFilesDischarger will not do the cleanup as it 
looks only at online regions.
 # After the above steps, region DA, which is offline, will stay among the 
split regions forever and never get cleaned up.

We found that the catalog janitor is also unable to clean regions which are 
offline (split parents), because the daughter still holds references to its 
parent that are not getting cleaned up. This causes many store files to go 
uncleaned, consuming extra space in the local index store, and leaves many 
lingering split regions.

A unit test reproducing the scenario has been attached.

The fix can be in CompactedHFilesDischarger to look into offline regions, or 
to stop the split in such cases.


> multiple splits may result into forever uncleaned split region
> --
>
> Key: HBASE-20933
> URL: https://issues.apache.org/jira/browse/HBASE-20933
> Project: HBase
>  Issue Type: Bug
>Reporter: Vishal Khandelwal
>Assignee: Vishal Khandelwal
>Priority: Major
> Attachments: Test123.java
>
>
> In case of multiple subsequent splits with an open handle on an old reference 
> file, the result may be a split region which can never be cleaned up.
>  So here are two issues.
>  # A region is getting split even when it has a reference to its parent.
>  # A region is going offline/into archive mode even when there are references 
> pending in the store.
> *Repro Steps*
>  # Region split (P).
>  # Before the major compaction that follows the split starts, open a handle on 
> a store file of a new region (DA & DB).
>  # Let compaction complete on DA (here compaction will not clear the reference 
> store files, as one is held open).
>  # Split the new region (DA) again (shouldSplit will return true because 
> compaction removes the compacted files and references from the in-memory 
> list even before it does the file cleanup).
>  # Now CatalogJanitor will not remove this region as it has store references, 
> and majorCompaction/CompactedHFilesDischarger will not do the cleanup as it 
> looks only at online regions.
>  # After the above steps, region DA, which is offline, will stay among the 
> split regions forever and never get cleaned up.
> We found that the catalog janitor is also unable to clean regions which are 
> offline (split parents), because the daughter still holds references to its 
> parent that are not getting cleaned up. This causes many store files to go 
> uncleaned

[jira] [Updated] (HBASE-20933) multiple splits may result into forever uncleaned split region

2018-07-25 Thread Vishal Khandelwal (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vishal Khandelwal updated HBASE-20933:
--
Description: 
In case of multiple subsequent splits with an open handle on an old reference 
file, the result may be a split region which can never be cleaned up.

 So here are two issues.
 # A region is getting split even when it has a reference to its parent.
 # A region is going offline/into archive mode even when there are references 
pending in the store.

*Repro Steps*
 # Region split (P).
 # Before the major compaction that follows the split starts, open a handle on 
a store file of a new region (DA & DB).
 # Let compaction complete on DA (here compaction will not clear the reference 
store files, as one is held open).
 # Split the new region (DA) again (shouldSplit will return true because 
compaction removes the compacted files and references from the in-memory list 
even before it does the file cleanup).
 # Now CatalogJanitor will not remove this region as it has store references, 
and majorCompaction/CompactedHFilesDischarger will not do the cleanup as it 
looks only at online regions.
 # After the above steps, region DA, which is offline, will stay among the 
split regions forever and never get cleaned up.

We found that the catalog janitor is also unable to clean regions which are 
offline (split parents), because the daughter still holds references to its 
parent that are not getting cleaned up. This causes many store files to go 
uncleaned, consuming extra space in the local index store, and leaves many 
lingering split regions.

A unit test reproducing the scenario has been attached.

The fix can be in CompactedHFilesDischarger or CatalogJanitor to handle such 
cases. Even if such regions exist which are offline and split, they should be 
able to clean themselves up.

  was:
In case of multiple subsequent splits with an open handle on an old reference 
file, the result may be a split region which can never be cleaned up.

 So here are two issues.
 # A region is getting split even when it has a reference to its parent.
 # A region is going offline/into archive mode even when there are references 
pending in the store.

*Repro Steps*
 # Region split (P).
 # Before the major compaction that follows the split starts, open a handle on 
a store file of a new region (DA & DB).
 # Let compaction complete on DA (here compaction will not clear the reference 
store files, as one is held open).
 # Split the new region (DA) again (shouldSplit will return true because 
compaction removes the compacted files and references from the in-memory list 
even before it does the file cleanup).
 # Now CatalogJanitor will not remove this region as it has store references, 
and majorCompaction/CompactedHFilesDischarger will not do the cleanup as it 
looks only at online regions.
 # After the above steps, region DA, which is offline, will stay among the 
split regions forever and never get cleaned up.

We found that the catalog janitor is also unable to clean regions which are 
offline (split parents), because the daughter still holds references to its 
parent that are not getting cleaned up. This causes many store files to go 
uncleaned, consuming extra space in the local index store, and leaves many 
lingering split regions.

A unit test reproducing the scenario has been attached.

The fix can be in CompactedHFilesDischarger or CatalogJanitor to handle such 
cases.


> multiple splits may result into forever uncleaned split region
> --
>
> Key: HBASE-20933
> URL: https://issues.apache.org/jira/browse/HBASE-20933
> Project: HBase
>  Issue Type: Bug
>Reporter: Vishal Khandelwal
>Assignee: Vishal Khandelwal
>Priority: Major
> Attachments: Test123.java
>
>
> In case of multiple subsequent splits with an open handle on an old reference 
> file, the result may be a split region which can never be cleaned up.
>  So here are two issues.
>  # A region is getting split even when it has a reference to its parent.
>  # A region is going offline/into archive mode even when there are references 
> pending in the store.
> *Repro Steps*
>  # Region split (P).
>  # Before the major compaction that follows the split starts, open a handle on 
> a store file of a new region (DA & DB).
>  # Let compaction complete on DA (here compaction will not clear the reference 
> store files, as one is held open).
>  # Split the new region (DA) again (shouldSplit will return true because 
> compaction removes the compacted files and references from the in-memory 
> list even before it does the file cleanup).
>  # Now CatalogJanitor will not remove this region as it has store references, 
> and majorCompaction/CompactedHFilesDischarger will not do the cleanup as it 
> looks only at online regions.
>  # After the above steps, region DA, which is offline, will stay among the 
> split regions forever and never get cleaned up.
> We found that the catalog janitor is also unable to clean regions which are 
> offline (split parents), because the daughter still holds references to its 
> parent

[jira] [Commented] (HBASE-20931) [branch-1] Add -Dhttps.protocols=TLSv1.2 to Maven command line in make_rc.sh

2018-07-25 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16555488#comment-16555488
 ] 

Hudson commented on HBASE-20931:


Results for branch branch-1
[build #392 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/392/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(x) {color:red}-1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/392//General_Nightly_Build_Report/]


(x) {color:red}-1 jdk7 checks{color}
-- For more information [see jdk7 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/392//JDK7_Nightly_Build_Report/]


(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/392//JDK8_Nightly_Build_Report_(Hadoop2)/]




(x) {color:red}-1 source release artifact{color}
-- See build output for details.


> [branch-1] Add -Dhttps.protocols=TLSv1.2 to Maven command line in make_rc.sh
> 
>
> Key: HBASE-20931
> URL: https://issues.apache.org/jira/browse/HBASE-20931
> Project: HBase
>  Issue Type: Task
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
>Priority: Trivial
> Fix For: 1.5.0, 1.2.7, 1.3.3, 1.4.6
>
> Attachments: HBASE-20931-branch-1.patch
>
>
> As of June 2018 the insecure TLS 1.0 and 1.1 protocols are no longer 
> supported for SSL connections to Maven Central and perhaps other public Maven 
> repositories. The branch-1 builds which require Java 7, of which the latest 
> public release was 7u80, need to add {{-Dhttps.protocols=TLSv1.2}} to the 
> Maven command line in order to avoid artifact retrieval problems during 
> builds.
> We especially need this in make_rc.sh which starts up with an empty local 
> Maven cache. 
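For reference, the flag just rides along on the normal Maven invocation; an 
illustrative command line (not the exact make_rc.sh content):
{code}
mvn -Dhttps.protocols=TLSv1.2 clean install
{code}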



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20940) HStore.cansplit should not allow split to happen if it has references

2018-07-25 Thread Vishal Khandelwal (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vishal Khandelwal updated HBASE-20940:
--
Summary: HStore.cansplit should not allow split to happen if it has 
references  (was: HStore.cansplit should not be allow split to happen if it has 
references)

> HStore.cansplit should not allow split to happen if it has references
> -
>
> Key: HBASE-20940
> URL: https://issues.apache.org/jira/browse/HBASE-20940
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.3.2
>Reporter: Vishal Khandelwal
>Assignee: Vishal Khandelwal
>Priority: Major
>
> When a split happens and immediately another split happens, it may result in 
> a split of a region which still has references to its parent. More details 
> about the scenario can be found in HBASE-20933.
> HStore.hasReferences should check the store files on the filesystem rather 
> than in-memory objects.
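A sketch of what a filesystem-backed check could look like; the class and 
method below are assumptions for illustration, built on the existing 
HRegionFileSystem#getStoreFiles and StoreFileInfo#isReference:
{code:java}
import java.io.IOException;
import java.util.Collection;
import org.apache.hadoop.hbase.regionserver.HRegionFileSystem;
import org.apache.hadoop.hbase.regionserver.StoreFileInfo;

final class ReferenceCheck {
  /** True if any store file of the family on the filesystem is a reference. */
  static boolean hasReferencesOnFs(HRegionFileSystem regionFs, String family)
      throws IOException {
    Collection<StoreFileInfo> infos = regionFs.getStoreFiles(family);
    if (infos == null) {
      return false;
    }
    for (StoreFileInfo info : infos) {
      if (info.isReference()) {
        return true; // a daughter still points at its parent; do not split
      }
    }
    return false;
  }
}
{code}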



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20930) MetaScanner.metaScan should use passed variable for meta table name rather than TableName.META_TABLE_NAME

2018-07-25 Thread Vishal Khandelwal (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vishal Khandelwal updated HBASE-20930:
--
Fix Version/s: 1.3.3
Affects Version/s: 1.3.3
   Attachment: HBASE-20935.branch-1.3.patch
   Status: Patch Available  (was: Open)

> MetaScanner.metaScan should use passed variable for meta table name rather 
> than TableName.META_TABLE_NAME
> -
>
> Key: HBASE-20930
> URL: https://issues.apache.org/jira/browse/HBASE-20930
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 1.3.3
>Reporter: Vishal Khandelwal
>Assignee: Vishal Khandelwal
>Priority: Minor
> Fix For: 1.3.3
>
> Attachments: HBASE-20935.branch-1.3.patch
>
>
> In MetaScanner.metaScan,
>  try (Table metaTable = new HTable(TableName.META_TABLE_NAME, connection, 
> null)) {
> should be changed so that the internal HTable is opened against the 
> metaTableName passed in via
> metaScan(connection, visitor, userTableName, null, Integer.MAX_VALUE, 
> metaTableName)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20873) Update doc for Endpoint-based Export

2018-07-25 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16555491#comment-16555491
 ] 

Hudson commented on HBASE-20873:


Results for branch branch-2.1
[build #102 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/102/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(x) {color:red}-1 general checks{color}
-- Something went wrong running this stage, please [check relevant console 
output|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/102//console].




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- Something went wrong running this stage, please [check relevant console 
output|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/102//console].


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/102//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> Update doc for Endpoint-based Export
> 
>
> Key: HBASE-20873
> URL: https://issues.apache.org/jira/browse/HBASE-20873
> Project: HBase
>  Issue Type: Improvement
>  Components: documentation
>Affects Versions: 2.0.0
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
>Priority: Minor
> Fix For: 3.0.0, 2.0.2, 2.2.0, 2.1.1
>
> Attachments: HBASE-20873.master.001.patch, 
> HBASE-20873.master.002.patch
>
>
> The current documentation on the usage is a little vague. I'd like to take a 
> stab at expanding it, based on my experience.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20873) Update doc for Endpoint-based Export

2018-07-25 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1616#comment-1616
 ] 

Hudson commented on HBASE-20873:


Results for branch branch-2.0
[build #590 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/590/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(x) {color:red}-1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/590//General_Nightly_Build_Report/]




(/) {color:green}+1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/590//JDK8_Nightly_Build_Report_(Hadoop2)/]


(/) {color:green}+1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/590//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


> Update doc for Endpoint-based Export
> 
>
> Key: HBASE-20873
> URL: https://issues.apache.org/jira/browse/HBASE-20873
> Project: HBase
>  Issue Type: Improvement
>  Components: documentation
>Affects Versions: 2.0.0
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
>Priority: Minor
> Fix For: 3.0.0, 2.0.2, 2.2.0, 2.1.1
>
> Attachments: HBASE-20873.master.001.patch, 
> HBASE-20873.master.002.patch
>
>
> The current documentation on the usage is a little vague. I'd like to take a 
> stab at expanding it, based on my experience.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20939) There will be race when we call suspendIfNotReady and then throw ProcedureSuspendedException

2018-07-25 Thread Duo Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16555401#comment-16555401
 ] 

Duo Zhang commented on HBASE-20939:
---

A possible solution is to add a callable to ProcedureSuspendedException, and in 
ProcedureExecutor, execute it under a lock to avoid races.
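Roughly, the idea could look like this hypothetical variant of 
ProcedureSuspendedException (names and shape are assumptions, not the committed 
design):
{code:java}
// The suspend action travels inside the exception; the executor runs it
// while holding its own lock, so a wake-up cannot race with the suspend.
public class SuspendWithActionException extends Exception {
  private final Runnable suspendAction;

  public SuspendWithActionException(Runnable suspendAction) {
    this.suspendAction = suspendAction;
  }

  public Runnable getSuspendAction() {
    return suspendAction;
  }
}

// In the executor's catch block, under the executor-side lock:
//   synchronized (schedLock) { e.getSuspendAction().run(); }
{code}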

> There will be race when we call suspendIfNotReady and then throw 
> ProcedureSuspendedException
> 
>
> Key: HBASE-20939
> URL: https://issues.apache.org/jira/browse/HBASE-20939
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Duo Zhang
>Priority: Major
>
> This is a very typical usage pattern in our procedure implementation; for 
> example, in AssignProcedure we will call AM.queueAssign and then suspend 
> ourselves to wait until the AM finishes processing our assign request.
> But there could be races. Think of this:
> 1. We call suspendIfNotReady on an event, and it returns true, so we need to 
> wait.
> 2. The event is woken up, and the procedure is added back to the 
> scheduler.
> 3. A worker picks up the procedure and finishes it.
> 4. We finally throw ProcedureSuspendedException, and the ProcedureExecutor 
> suspends us and stores the state in the procedure store.
> So we have a half-done procedure in the procedure store forever... This may 
> cause an assertion failure when loading procedures. And maybe the worker 
> cannot finish the procedure, as when suspending we need to restore some 
> state, for example, add something to RootProcedureState. Either way, it will 
> still lead to assertion failures or other unexpected errors.
> And this cannot be fixed by simply adding a lock in the procedure, as most of 
> the work is done in the ProcedureExecutor after we throw 
> ProcedureSuspendedException.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20716) Unsafe access cleanup

2018-07-25 Thread Sahil Aggarwal (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16555407#comment-16555407
 ] 

Sahil Aggarwal commented on HBASE-20716:


I intend to finish it all in this task itself. To me, ByteBufferUtils seems to 
be the only other class where we do this check-and-dispatch. Will changing 
ByteBufferUtils too be all that is needed for this task?

> Unsafe access cleanup
> -
>
> Key: HBASE-20716
> URL: https://issues.apache.org/jira/browse/HBASE-20716
> Project: HBase
>  Issue Type: Sub-task
>  Components: Performance
>Reporter: stack
>Assignee: Sahil Aggarwal
>Priority: Critical
>  Labels: beginner
> Attachments: HBASE-20716.master.001.patch, 
> HBASE-20716.master.002.patch, HBASE-20716.master.003.patch, Screen Shot 
> 2018-06-26 at 11.37.49 AM.png
>
>
> We have two means of getting at Unsafe: UnsafeAccess, and then internal to the 
> Bytes class. They are effectively doing the same thing. We should have one 
> avenue to Unsafe only.
> Many of our paths to Unsafe via UnsafeAccess traverse flags to check if 
> access is available, if it is aligned, and the order in which words are 
> written on the machine. Each check costs -- especially if done millions of 
> times a second -- and on occasion adds bloat in hot code paths. The unsafe 
> access inside Bytes checks on startup what the machine is capable of and 
> then does a static assign of the appropriate class-to-use from there on out. 
> UnsafeAccess does not do this; it runs the checks every time. It would be 
> good to have the Bytes behavior pervasive.
> The benefit of a single access path to Unsafe is plain. The benefit we gain 
> from removing checks will be harder to measure, though it should be plain 
> when you disassemble a hot path; in a (very) rare case, the saved byte codes 
> could be the difference between inlining or not.
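The "check once at startup, then statically assign" pattern the description 
credits to Bytes looks roughly like this generic sketch (not the actual Bytes 
internals):
{code:java}
import java.nio.ByteOrder;

final class ByteOps {
  interface Comparer {
    int compare(byte[] a, byte[] b);
  }

  // Decided once at class-load time; callers go through BEST and never
  // re-test availability, alignment, or byte order per call.
  static final Comparer BEST = pickBest();

  private static Comparer pickBest() {
    if (unsafeUsable() && ByteOrder.nativeOrder() == ByteOrder.LITTLE_ENDIAN) {
      return new WordComparer();
    }
    return new PlainComparer();
  }

  private static boolean unsafeUsable() {
    try {
      Class.forName("sun.misc.Unsafe");
      return true;
    } catch (Throwable t) {
      return false;
    }
  }

  /** Stand-in for a word-at-a-time Unsafe comparer. */
  static final class WordComparer implements Comparer {
    public int compare(byte[] a, byte[] b) {
      return new PlainComparer().compare(a, b); // real impl would read longs via Unsafe
    }
  }

  static final class PlainComparer implements Comparer {
    public int compare(byte[] a, byte[] b) {
      int n = Math.min(a.length, b.length);
      for (int i = 0; i < n; i++) {
        int d = (a[i] & 0xff) - (b[i] & 0xff);
        if (d != 0) {
          return d;
        }
      }
      return a.length - b.length;
    }
  }
}
{code}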



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20938) Set version to 2.1.1-SNAPSHOT for branch-2.1

2018-07-25 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1671#comment-1671
 ] 

Hadoop QA commented on HBASE-20938:
---

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
11s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:orange}-0{color} | {color:orange} test4tests {color} | {color:orange}  
0m  0s{color} | {color:orange} The patch doesn't appear to include any new or 
modified tests. Please justify why no new tests are needed for this patch. Also 
please list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} branch-2.1 Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
23s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
24s{color} | {color:green} branch-2.1 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 20m 
10s{color} | {color:green} branch-2.1 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 
10s{color} | {color:green} branch-2.1 passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  3m 
50s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  9m 
56s{color} | {color:green} branch-2.1 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  
8s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  3m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 20m  
0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 20m  
0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 
19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m 
50s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  3m 
48s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green}  
8m  6s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 
or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 10m 
15s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}179m 
54s{color} | {color:green} root in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 15m 
35s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}291m 40s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:42ca976 |
| JIRA Issue | HBASE-20938 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12933010/HBASE-20938-branch-2.1.patch
 |
| Optional Tests |  asflicense  javac  javadoc  unit  shadedjars  hadoopcheck  
xml  compile  checkstyle  |
| uname | Linux ccebe9840213 3.13.0-143-generic #192-Ubuntu SMP Tue Feb 27 
10:45:36 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | branch-2.1 / 833657c46d |
| maven | version: Apache Maven 3.5.4 
(1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| Default Java | 1.8.0_171 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HBASE-Build/13782/testReport/ |
| Max. process+thread count | 4513 (vs. ulimit of 1) |
| modules | C: hbase-checkstyle hbase-build-support 
hbase-build-support/hbase-error-prone hbase-annotations . hbase-archet

[jira] [Commented] (HBASE-18822) Create table for peer cluster automatically when creating table in source cluster of using namespace replication.

2018-07-25 Thread Zheng Hu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-18822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1677#comment-1677
 ] 

Zheng Hu commented on HBASE-18822:
--

bq. What will happen when a user doesn't explicitly specify any namespace or 
table in the replication peer config?
It depends on the peer's replicateAllUserTables flag: if true, we will 
replicate to the peer; if false, we won't. You can see the 
ReplicationUtils#contains method: 
https://github.com/apache/hbase/blob/master/hbase-replication/src/main/java/org/apache/hadoop/hbase/replication/ReplicationUtils.java#L160

We use it to decide whether we need to replicate DDL or not :-)
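As a rough shape of that decision (a simplified sketch; the linked 
ReplicationUtils#contains holds the real logic, including exclude-namespace 
handling):
{code:java}
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.replication.ReplicationPeerConfig;

final class DdlReplicationCheck {
  /** Simplified: should DDL for this table be replicated to the peer? */
  static boolean shouldReplicateDdl(ReplicationPeerConfig peerConfig, TableName table) {
    if (peerConfig.replicateAllUserTables()) {
      // Replicate-all mode: everything goes unless explicitly excluded.
      return peerConfig.getExcludeTableCFsMap() == null
          || !peerConfig.getExcludeTableCFsMap().containsKey(table);
    }
    // Otherwise only explicitly listed namespaces/tables are replicated.
    return (peerConfig.getNamespaces() != null
            && peerConfig.getNamespaces().contains(table.getNamespaceAsString()))
        || (peerConfig.getTableCFsMap() != null
            && peerConfig.getTableCFsMap().containsKey(table));
  }
}
{code}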

> Create table for peer cluster automatically when creating table in source 
> cluster of using namespace replication.
> -
>
> Key: HBASE-18822
> URL: https://issues.apache.org/jira/browse/HBASE-18822
> Project: HBase
>  Issue Type: Improvement
>  Components: Replication
>Affects Versions: 2.0.0-alpha-2
>Reporter: Zheng Hu
>Assignee: Zheng Hu
>Priority: Major
> Fix For: 3.0.0, 2.2.0
>
> Attachments: HBASE-18822.v1.patch
>
>
> In our clusters that use namespace replication, we always forget to create 
> the table in the peer cluster, which leads to replication getting stuck.
> We have implemented this feature in our cluster: create the table in the 
> peer cluster automatically when creating a table in the source cluster under 
> namespace replication.
>  
> I'm not sure if someone else needs this feature, so I'm creating an issue 
> here for discussion.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20939) There will be race when we call suspendIfNotReady and then throw ProcedureSuspendedException

2018-07-25 Thread Duo Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16555463#comment-16555463
 ] 

Duo Zhang commented on HBASE-20939:
---

Another solution is for the ProcedureExecutor to guarantee that a procedure 
cannot be executed concurrently. I think this will make it easier for 
developers to implement a procedure.

The implementation may be to introduce an IdLock in ProcedureExecutor - keyed 
on the procedure id - to prevent concurrent executions of a procedure.
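A minimal sketch of that idea using the existing 
org.apache.hadoop.hbase.util.IdLock (the worker shape and execProcedure call 
are assumptions):
{code:java}
import java.io.IOException;
import org.apache.hadoop.hbase.procedure2.Procedure;
import org.apache.hadoop.hbase.util.IdLock;

final class PerProcedureExecution {
  private final IdLock procExecutionLock = new IdLock();

  // A worker takes the per-procedure lock before executing, so a procedure
  // being suspended cannot run on another worker at the same time.
  void executeExclusively(Procedure<?> proc) throws IOException {
    IdLock.Entry entry = procExecutionLock.getLockEntry(proc.getProcId());
    try {
      execProcedure(proc);
    } finally {
      procExecutionLock.releaseLockEntry(entry);
    }
  }

  private void execProcedure(Procedure<?> proc) {
    // Stand-in for ProcedureExecutor's real execution path.
  }
}
{code}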

> There will be race when we call suspendIfNotReady and then throw 
> ProcedureSuspendedException
> 
>
> Key: HBASE-20939
> URL: https://issues.apache.org/jira/browse/HBASE-20939
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Duo Zhang
>Priority: Major
>
> This is a very typical usage pattern in our procedure implementation; for 
> example, in AssignProcedure we will call AM.queueAssign and then suspend 
> ourselves to wait until the AM finishes processing our assign request.
> But there could be races. Think of this:
> 1. We call suspendIfNotReady on an event, and it returns true, so we need to 
> wait.
> 2. The event is woken up, and the procedure is added back to the 
> scheduler.
> 3. A worker picks up the procedure and finishes it.
> 4. We finally throw ProcedureSuspendedException, and the ProcedureExecutor 
> suspends us and stores the state in the procedure store.
> So we have a half-done procedure in the procedure store forever... This may 
> cause an assertion failure when loading procedures. And maybe the worker 
> cannot finish the procedure, as when suspending we need to restore some 
> state, for example, add something to RootProcedureState. Either way, it will 
> still lead to assertion failures or other unexpected errors.
> And this cannot be fixed by simply adding a lock in the procedure, as most of 
> the work is done in the ProcedureExecutor after we throw 
> ProcedureSuspendedException.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-18822) Create table for peer cluster automatically when creating table in source cluster of using namespace replication.

2018-07-25 Thread Ashish Singhi (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-18822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16555467#comment-16555467
 ] 

Ashish Singhi commented on HBASE-18822:
---

What will happen when a user doesn't explicitly specify any namespace or table 
in the replication peer config?

Thanks.

> Create table for peer cluster automatically when creating table in source 
> cluster of using namespace replication.
> -
>
> Key: HBASE-18822
> URL: https://issues.apache.org/jira/browse/HBASE-18822
> Project: HBase
>  Issue Type: Improvement
>  Components: Replication
>Affects Versions: 2.0.0-alpha-2
>Reporter: Zheng Hu
>Assignee: Zheng Hu
>Priority: Major
> Fix For: 3.0.0, 2.2.0
>
> Attachments: HBASE-18822.v1.patch
>
>
> In our clusters that use namespace replication, we always forget to create 
> the table in the peer cluster, which leads to replication getting stuck.
> We have implemented this feature in our cluster: create the table in the 
> peer cluster automatically when creating a table in the source cluster under 
> namespace replication.
>  
> I'm not sure if someone else needs this feature, so I'm creating an issue 
> here for discussion.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20921) Possible NPE in ReopenTableRegionsProcedure

2018-07-25 Thread Allan Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allan Yang updated HBASE-20921:
---
Attachment: (was: HBASE-20921.branch-2.0.002.patch)

> Possible NPE in ReopenTableRegionsProcedure
> ---
>
> Key: HBASE-20921
> URL: https://issues.apache.org/jira/browse/HBASE-20921
> Project: HBase
>  Issue Type: Sub-task
>  Components: amv2
>Affects Versions: 3.0.0, 2.1.0, 2.0.2
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Major
> Attachments: HBASE-20921.branch-2.0.001.patch, 
> HBASE-20921.branch-2.0.002.patch
>
>
> After HBASE-20752, we issue a ReopenTableRegionsProcedure in 
> ModifyTableProcedure to ensure all regions are reopened.
> But ModifyTableProcedure and ReopenTableRegionsProcedure do not hold the 
> lock (why?), so there is a chance that while ModifyTableProcedure is 
> executing, a merge/split procedure can be executed at the same time.
> So, when ReopenTableRegionsProcedure reaches the state of 
> "REOPEN_TABLE_REGIONS_CONFIRM_REOPENED", some of the persisted regions to 
> check no longer exist, and an NPE is thrown.
> {code}
> 2018-07-18 01:38:57,528 INFO  [PEWorker-9] 
> procedure2.ProcedureExecutor(1246): Finished pid=6110, state=SUCCESS; 
> MergeTableRegionsProcedure table=IntegrationTestBigLinkedList, 
> regions=[845d286231eb01b7
> 1aeaa17b0e30058d, 4a46ab0918c99cada72d5336ad83a828], forcibly=false in 
> 10.8610sec
> 2018-07-18 01:38:57,530 ERROR [PEWorker-8] 
> procedure2.ProcedureExecutor(1478): CODE-BUG: Uncaught runtime exception: 
> pid=5974, ppid=5973, state=RUNNABLE:REOPEN_TABLE_REGIONS_CONFIRM_REOPENED; 
> ReopenTab
> leRegionsProcedure table=IntegrationTestBigLinkedList
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hbase.master.assignment.RegionStates.checkReopened(RegionStates.java:651)
> at 
> java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
> at 
> java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1382)
> at 
> java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
> at 
> java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
> at 
> java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
> at 
> java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
> at 
> java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499)
> at 
> org.apache.hadoop.hbase.master.procedure.ReopenTableRegionsProcedure.executeFromState(ReopenTableRegionsProcedure.java:102)
> at 
> org.apache.hadoop.hbase.master.procedure.ReopenTableRegionsProcedure.executeFromState(ReopenTableRegionsProcedure.java:45)
> at 
> org.apache.hadoop.hbase.procedure2.StateMachineProcedure.execute(StateMachineProcedure.java:184)
> at 
> org.apache.hadoop.hbase.procedure2.Procedure.doExecute(Procedure.java:850)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1453)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1221)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$800(ProcedureExecutor.java:75)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1741)
> {code}
> I think we need to renew the region list of the table at the 
> "REOPEN_TABLE_REGIONS_CONFIRM_REOPENED" state. The regions which were 
> merged or split do not need to be checked, since we can be sure that they 
> were opened after we made the change to the table descriptor.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20935) HStore.removeCompactedfiles should log incase it unable to delete a file

2018-07-25 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1693#comment-1693
 ] 

Ted Yu commented on HBASE-20935:


lgtm

nit:
{code}
2608  }
2609  else {
{code}
Put the above two on the same line.

> HStore.removeCompactedfiles should log incase it unable to delete a file
> 
>
> Key: HBASE-20935
> URL: https://issues.apache.org/jira/browse/HBASE-20935
> Project: HBase
>  Issue Type: Improvement
>Reporter: Vishal Khandelwal
>Assignee: Vishal Khandelwal
>Priority: Minor
> Fix For: 1.3.3
>
> Attachments: HBASE-20935.branch-1.3.patch, HBASE-20935.patch
>
>
> if (r != null && r.isCompactedAway() && !r.isReferencedInReads())
> If the above check fails, then there will be some files which are compacted 
> but not getting cleaned up. It is good to log this, as it helps in debugging 
> the issue. It would let us know why a file is not getting cleaned: either a 
> reference is pending, or compactedAway is not set.
> This will help debug issues like:
>  # HBASE-20933
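A sketch of the extra logging, assuming the surrounding removeCompactedfiles 
loop (the filesToRemove/file/LOG names are illustrative, not the committed 
patch):
{code:java}
if (r != null && r.isCompactedAway() && !r.isReferencedInReads()) {
  filesToRemove.add(file);
} else {
  // Record why the compacted file is skipped, so lingering files
  // (e.g. HBASE-20933) can be traced from the region server log.
  LOG.info("Can't archive compacted file " + file.getPath()
      + ": isCompactedAway=" + (r != null && r.isCompactedAway())
      + ", isReferencedInReads=" + (r != null && r.isReferencedInReads()));
}
{code}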



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HBASE-20940) HStore.cansplit should not be allow split to happen if it has references

2018-07-25 Thread Vishal Khandelwal (JIRA)
Vishal Khandelwal created HBASE-20940:
-

 Summary: HStore.cansplit should not be allow split to happen if it 
has references
 Key: HBASE-20940
 URL: https://issues.apache.org/jira/browse/HBASE-20940
 Project: HBase
  Issue Type: Bug
Affects Versions: 1.3.2
Reporter: Vishal Khandelwal
Assignee: Vishal Khandelwal


When a split happens and immediately another split happens, it may result in a 
split of a region which still has references to its parent. More details about 
the scenario can be found in HBASE-20933.

HStore.hasReferences should check the store files on the filesystem rather 
than in-memory objects.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-18822) Create table for peer cluster automatically when creating table in source cluster of using namespace replication.

2018-07-25 Thread Ashish Singhi (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-18822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16555605#comment-16555605
 ] 

Ashish Singhi commented on HBASE-18822:
---

{quote}It depends on the peer's replicateAllUserTables flag: if true, we will 
replicate to the peer; if false, we won't.
{quote}
Exactly; by default it is true, which means the schemas of all tables will be 
synced to the peer cluster.

> Create table for peer cluster automatically when creating table in source 
> cluster of using namespace replication.
> -
>
> Key: HBASE-18822
> URL: https://issues.apache.org/jira/browse/HBASE-18822
> Project: HBase
>  Issue Type: Improvement
>  Components: Replication
>Affects Versions: 2.0.0-alpha-2
>Reporter: Zheng Hu
>Assignee: Zheng Hu
>Priority: Major
> Fix For: 3.0.0, 2.2.0
>
> Attachments: HBASE-18822.v1.patch
>
>
> In our clusters that use namespace replication, we always forget to create 
> the table in the peer cluster, which leads to replication getting stuck.
> We have implemented this feature in our cluster: create the table in the 
> peer cluster automatically when creating a table in the source cluster under 
> namespace replication.
>  
> I'm not sure if someone else needs this feature, so I'm creating an issue 
> here for discussion.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20867) RS may get killed while master restarts

2018-07-25 Thread Allan Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allan Yang updated HBASE-20867:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> RS may get killed while master restarts
> ---
>
> Key: HBASE-20867
> URL: https://issues.apache.org/jira/browse/HBASE-20867
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 3.0.0, 2.1.0, 2.0.1
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Major
> Fix For: 3.0.0, 2.0.2, 2.1.1
>
> Attachments: HBASE-20867.branch-2.0.001.patch, 
> HBASE-20867.branch-2.0.002.patch, HBASE-20867.branch-2.0.003.patch, 
> HBASE-20867.branch-2.0.004.patch, HBASE-20867.branch-2.0.005.patch, 
> HBASE-20867.branch-2.0.006.patch
>
>
> If the master is dispatching an RPC call to an RS when aborting, a connection 
> exception may be thrown by the RPC layer (an IOException with a "Connection 
> closed" message in this case). The RSProcedureDispatcher will regard it as an 
> un-retryable exception and pass it to UnassignProcedure.remoteCallFailed, 
> which will expire the RS.
> Actually, the RS is perfectly healthy; only the master is restarting.
> I think we should deal with those kinds of connection exceptions in 
> RSProcedureDispatcher and retry the RPC call.
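A sketch of the retry idea (the predicate below is an assumption for 
illustration, not the committed RSProcedureDispatcher change):
{code:java}
import java.io.IOException;
import java.net.ConnectException;

final class RetryCheck {
  /**
   * Treat connection-level failures (e.g. the master tearing down its
   * sockets while restarting) as retryable instead of expiring the RS.
   */
  static boolean isRetryableConnectionException(IOException e) {
    if (e instanceof ConnectException) {
      return true;
    }
    String msg = e.getMessage();
    return msg != null && msg.contains("Connection closed");
  }
}
{code}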



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20873) Update doc for Endpoint-based Export

2018-07-25 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16555615#comment-16555615
 ] 

Hudson commented on HBASE-20873:


Results for branch master
[build #408 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/master/408/]: (x) 
*{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/408//General_Nightly_Build_Report/]




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/408//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- Something went wrong running this stage, please [check relevant console 
output|https://builds.apache.org/job/HBase%20Nightly/job/master/408//console].


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> Update doc for Endpoint-based Export
> 
>
> Key: HBASE-20873
> URL: https://issues.apache.org/jira/browse/HBASE-20873
> Project: HBase
>  Issue Type: Improvement
>  Components: documentation
>Affects Versions: 2.0.0
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
>Priority: Minor
> Fix For: 3.0.0, 2.0.2, 2.2.0, 2.1.1
>
> Attachments: HBASE-20873.master.001.patch, 
> HBASE-20873.master.002.patch
>
>
> The current documentation on the usage is a little vague. I'd like to take a 
> stab at expanding it, based on my experience.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20930) MetaScanner.metaScan should use passed variable for meta table name rather than TableName.META_TABLE_NAME

2018-07-25 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1607#comment-1607
 ] 

Hadoop QA commented on HBASE-20930:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
16s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
1s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:orange}-0{color} | {color:orange} test4tests {color} | {color:orange}  
0m  0s{color} | {color:orange} The patch doesn't appear to include any new or 
modified tests. Please justify why no new tests are needed for this patch. Also 
please list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} branch-1.3 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
45s{color} | {color:green} branch-1.3 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
35s{color} | {color:green} branch-1.3 passed with JDK v1.8.0_172 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
36s{color} | {color:green} branch-1.3 passed with JDK v1.7.0_181 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
18s{color} | {color:green} branch-1.3 passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  2m 
31s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
28s{color} | {color:green} branch-1.3 passed with JDK v1.8.0_172 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
38s{color} | {color:green} branch-1.3 passed with JDK v1.7.0_181 {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
34s{color} | {color:green} the patch passed with JDK v1.8.0_172 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed with JDK v1.7.0_181 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  2m 
28s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green}  
6m 28s{color} | {color:green} Patch does not cause any errors with Hadoop 2.4.1 
2.5.2 2.6.5 2.7.4. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
29s{color} | {color:green} the patch passed with JDK v1.8.0_172 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed with JDK v1.7.0_181 {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 96m 31s{color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
20s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}119m 28s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.regionserver.TestAtomicOperation |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:53dba69 |
| JIRA Issue | HBASE-20930 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12933018/HBASE-20935.branch-1.3.patch
 |
| Optional Tests |  asfli

[jira] [Resolved] (HBASE-20746) Release 2.1.0

2018-07-25 Thread Duo Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang resolved HBASE-20746.
---
Resolution: Fixed

> Release 2.1.0
> -
>
> Key: HBASE-20746
> URL: https://issues.apache.org/jira/browse/HBASE-20746
> Project: HBase
>  Issue Type: Umbrella
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
>
> After HBASE-20708 I do not think we will have unresolvable problems for the 
> 2.1.0 release any more. So let's create an issue to track the release process.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20919) meta region can't be re-onlined when restarting cluster if opening rsgroup

2018-07-25 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16555706#comment-16555706
 ] 

Ted Yu commented on HBASE-20919:


{code}
481 LOG.info("waiting for balancer to be initialized, 
checkTimes:{}", checkTimes);
{code}
The log can be at DEBUG level.
{code}
485   } catch (InterruptedException e) {
{code}
Please restore interrupt state in the catch block.
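i.e. something like this sketch:
{code}
} catch (InterruptedException e) {
  // Restore the interrupt state so callers up the stack still see it.
  Thread.currentThread().interrupt();
}
{code}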

> meta region can't be re-onlined when restarting cluster if opening rsgroup
> --
>
> Key: HBASE-20919
> URL: https://issues.apache.org/jira/browse/HBASE-20919
> Project: HBase
>  Issue Type: Bug
>  Components: Balancer, master, rsgroup
>Affects Versions: 2.0.1
>Reporter: chenyang
>Priority: Major
> Attachments: HBASE-20919-branch-2.0-01.patch, 
> HBASE-20919-branch-2.0-02.patch, bug2.png, 
> hbase-hbase-master-bjpg-rs4729.yz02.no_02patch.log, 
> hbase-hbase-master-bjpg-rs4729.yz02.with_02patch.log, 
> hbase-hbase-master-bjpg-rs4730.yz02.log.test
>
>
> If you enable rsgroup, hbase-site.xml contains the below configuration.
> {code:java}
> <property>
>   <name>hbase.coprocessor.master.classes</name>
>   <value>org.apache.hadoop.hbase.rsgroup.RSGroupAdminEndpoint</value>
> </property>
> <property>
>   <name>hbase.master.loadbalancer.class</name>
>   <value>org.apache.hadoop.hbase.rsgroup.RSGroupBasedLoadBalancer</value>
> </property>
> {code}
> And you shut down the whole HBase cluster in this way:
>  # first, shut down the region servers one by one
>  # then shut down the master
> Then you restart the whole cluster in this way:
>  # start the master
>  # start the region servers
> The hbase:meta region cannot be re-onlined and the rsgroup cannot be 
> initialized successfully.
>  master logs:
> {code:java}
> 2018-07-12 18:27:08,775 INFO 
> [org.apache.hadoop.hbase.rsgroup.RSGroupInfoManagerImpl$RSGroupStartupWorker-bjpg-rs4730.yz02,16000,1531389637409]
>  rsgroup.RSGro
> upInfoManagerImpl$RSGroupStartupWorker: Waiting for catalog tables to come 
> online
> 2018-07-12 18:27:08,876 INFO 
> [org.apache.hadoop.hbase.rsgroup.RSGroupInfoManagerImpl$RSGroupStartupWorker-bjpg-rs4730.yz02,16000,1531389637409]
>  zookeeper.Met
> aTableLocator: Failed verification of hbase:meta,,1 at 
> address=bjpg-rs4732.yz02,60020,1531388712053, 
> exception=org.apache.hadoop.hbase.NotServingRegionExcepti
> on: hbase:meta,,1 is not online on bjpg-rs4732.yz02,60020,1531389727928
> at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionByEncodedName(HRegionServer.java:3249)
> at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:3226)
> at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.getRegion(RSRpcServices.java:1414)
> at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.getRegionInfo(RSRpcServices.java:1729)
> at 
> org.apache.hadoop.hbase.shaded.protobuf.generated.AdminProtos$AdminService$2.callBlockingMethod(AdminProtos.java:28286)
> at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:409)
> at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:130)
> at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)
> at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304)
> {code}
> The logs show that the hbase:meta region is not online and rsgroup keeps 
> retrying to initialize.
>   
>  But why is the hbase:meta region not online?
>  The info-level logs and jstack did not have enough information, so I added 
> some debug logs to the test source code. Then I checked the master's and 
> region servers' logs, and found that the meta region assign procedure which 
> held the meta region lock never completed and never released the lock, so 
> recoverMetaProcedure could not be executed. 
>   
>  Why did the first procedure not complete and not release the meta region lock?
>  In the test logs, I found that when the assignmentManager assigned the 
> region, it needed to call the rsgroup balancer, which had not been completely 
> initialized, so it threw an NPE. As a result, the procedure never completed 
> and never released the lock.
> {code:java}
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hbase.rsgroup.RSGroupBasedLoadBalancer.generateGroupMaps(RSGroupBasedLoadBalancer.java:262)
> at 
> org.apache.hadoop.hbase.rsgroup.RSGroupBasedLoadBalancer.roundRobinAssignment(RSGroupBasedLoadBalancer.java:162)
> at 
> org.apache.hadoop.hbase.master.assignment.AssignmentManager.processAssignmentPlans(AssignmentManager.java:1864)
> at 
> org.apache.hadoop.hbase.master.assignment.AssignmentManager.processAssignQueue(AssignmentManager.java:1809)
> at 
> org.apache.hadoop.hbase.master.assignment.AssignmentManager.access$400(AssignmentManager.java:113)
> at 
> org.apache.hadoop.hbase.master.assignment.AssignmentManager$2.run(AssignmentManager.java:1693)
> {code}
> !bug2.png!
> As shown in the figure named bug2.png listed in

[jira] [Commented] (HBASE-20873) Update doc for Endpoint-based Export

2018-07-25 Thread Wei-Chiu Chuang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1621#comment-1621
 ] 

Wei-Chiu Chuang commented on HBASE-20873:
-

Thanks [~chia7712] and [~liuml07]!

> Update doc for Endpoint-based Export
> 
>
> Key: HBASE-20873
> URL: https://issues.apache.org/jira/browse/HBASE-20873
> Project: HBase
>  Issue Type: Improvement
>  Components: documentation
>Affects Versions: 2.0.0
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
>Priority: Minor
> Fix For: 3.0.0, 2.0.2, 2.2.0, 2.1.1
>
> Attachments: HBASE-20873.master.001.patch, 
> HBASE-20873.master.002.patch
>
>
> The current documentation on the usage is a little vague. I'd like to take a 
> stab at expanding it, based on my experience.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20749) Upgrade our use of checkstyle to 8.6+

2018-07-25 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16555724#comment-16555724
 ] 

Hudson commented on HBASE-20749:


Results for branch HBASE-20749
[build #3 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-20749/3/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(x) {color:red}-1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-20749/3//General_Nightly_Build_Report/]




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- Something went wrong running this stage, please [check relevant console 
output|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-20749/3//console].


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- Something went wrong running this stage, please [check relevant console 
output|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-20749/3//console].


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> Upgrade our use of checkstyle to 8.6+
> -
>
> Key: HBASE-20749
> URL: https://issues.apache.org/jira/browse/HBASE-20749
> Project: HBase
>  Issue Type: Improvement
>  Components: build, community
>Reporter: Sean Busbey
>Assignee: Mike Drob
>Priority: Minor
> Attachments: HBASE-20749.master.001.patch
>
>
> We should upgrade our checkstyle version to 8.6 or later so we can use the 
> "match violation message to this regex" feature for suppression. That will 
> allow us to make sure we don't regress on HTrace v3 vs v4 APIs (came up in 
> HBASE-20332).
> We're currently blocked on upgrading to 8.3+ by [checkstyle 
> #5279|https://github.com/checkstyle/checkstyle/issues/5279], a regression 
> that flags our use of both the "separate import groups" and "put static 
> imports over here" configs as an error.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20649) Validate HFiles do not have PREFIX_TREE DataBlockEncoding

2018-07-25 Thread Sean Busbey (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Busbey updated HBASE-20649:

   Resolution: Fixed
Fix Version/s: 2.2.0
 Release Note: 

Users who previously made use of prefix tree encoding can now check that 
their existing HFiles no longer contain data that uses it, via an additional 
pre-upgrade check command.

```
hbase pre-upgrade validate-hfile
```

Please see the "HFile Content validation" section of the ref guide's coverage 
of the pre-upgrade validator tool for usage details.
   Status: Resolved  (was: Patch Available)

nope, patch is great as is. Thanks for the reminder, I've pushed this to master 
and branch-2 now.

Please feel free to edit the release note.

> Validate HFiles do not have PREFIX_TREE DataBlockEncoding
> -
>
> Key: HBASE-20649
> URL: https://issues.apache.org/jira/browse/HBASE-20649
> Project: HBase
>  Issue Type: New Feature
>  Components: Operability, tooling
>Reporter: Peter Somogyi
>Assignee: Balazs Meszaros
>Priority: Minor
> Fix For: 3.0.0, 2.2.0
>
> Attachments: HBASE-20649.master.001.patch, 
> HBASE-20649.master.002.patch, HBASE-20649.master.003.patch, 
> HBASE-20649.master.004.patch, HBASE-20649.master.005.patch, 
> HBASE-20649.master.006.patch
>
>
> HBASE-20592 adds a tool to check column families on the cluster do not have 
> PREFIX_TREE encoding.
> Since it is possible that DataBlockEncoding was already changed but HFiles 
> are not rewritten yet we would need a tool that can verify the content of 
> hfiles in the cluster.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (HBASE-20919) meta region can't be re-onlined when restarting cluster if opening rsgroup

2018-07-25 Thread chenyang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16555778#comment-16555778
 ] 

chenyang edited comment on HBASE-20919 at 7/25/18 2:46 PM:
---

Hi, Ted Yu,

HBASE-20919-branch-2.0-01.patch is deprecated.

HBASE-20919-branch-2.0-02.patch offers a better solution which does not block 
the current thread.

Please review HBASE-20919-branch-2.0-02.patch, thank you.


was (Author: hb-cy):
Hi, Ted Yu,

HBASE-20919-branch-2.0-01.patch is deprecated.

HBASE-20919-branch-2.0-02.patch offers a better solution which does not block 
the current thread.

Please review HBASE-20919-branch-2.0-02.patch, thank you.

> meta region can't be re-onlined when restarting cluster if opening rsgroup
> --
>
> Key: HBASE-20919
> URL: https://issues.apache.org/jira/browse/HBASE-20919
> Project: HBase
>  Issue Type: Bug
>  Components: Balancer, master, rsgroup
>Affects Versions: 2.0.1
>Reporter: chenyang
>Priority: Major
> Attachments: HBASE-20919-branch-2.0-01.patch, 
> HBASE-20919-branch-2.0-02.patch, bug2.png, 
> hbase-hbase-master-bjpg-rs4729.yz02.no_02patch.log, 
> hbase-hbase-master-bjpg-rs4729.yz02.with_02patch.log, 
> hbase-hbase-master-bjpg-rs4730.yz02.log.test
>
>
> If you enable rsgroup, hbase-site.xml contains the configuration below.
> {code:java}
> 
>   hbase.coprocessor.master.classes
>   org.apache.hadoop.hbase.rsgroup.RSGroupAdminEndpoint
> 
> 
>   hbase.master.loadbalancer.class
>  org.apache.hadoop.hbase.rsgroup.RSGroupBasedLoadBalancer
> 
> {code}
> And you shut down the whole HBase cluster in this way:
>  # first shut down the region servers one by one
>  # then shut down the master
> Then you restart the whole cluster in this way:
>  # start the master
>  # start the region servers
> The hbase:meta region cannot be re-onlined and the rsgroup cannot be 
> initialized successfully.
>  master logs:
> {code:java}
> 2018-07-12 18:27:08,775 INFO 
> [org.apache.hadoop.hbase.rsgroup.RSGroupInfoManagerImpl$RSGroupStartupWorker-bjpg-rs4730.yz02,16000,1531389637409]
>  rsgroup.RSGro
> upInfoManagerImpl$RSGroupStartupWorker: Waiting for catalog tables to come 
> online
> 2018-07-12 18:27:08,876 INFO 
> [org.apache.hadoop.hbase.rsgroup.RSGroupInfoManagerImpl$RSGroupStartupWorker-bjpg-rs4730.yz02,16000,1531389637409]
>  zookeeper.Met
> aTableLocator: Failed verification of hbase:meta,,1 at 
> address=bjpg-rs4732.yz02,60020,1531388712053, 
> exception=org.apache.hadoop.hbase.NotServingRegionExcepti
> on: hbase:meta,,1 is not online on bjpg-rs4732.yz02,60020,1531389727928
> at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionByEncodedName(HRegionServer.java:3249)
> at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:3226)
> at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.getRegion(RSRpcServices.java:1414)
> at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.getRegionInfo(RSRpcServices.java:1729)
> at 
> org.apache.hadoop.hbase.shaded.protobuf.generated.AdminProtos$AdminService$2.callBlockingMethod(AdminProtos.java:28286)
> at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:409)
> at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:130)
> at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)
> at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304)
> {code}
> The logs show that the hbase:meta region is not online and that rsgroup 
> keeps retrying to initialize.
>   
>  But why is the hbase:meta region not online?
>  The info-level logs and jstack did not have enough information, so I added 
> some debug logs to the test source code. Then I checked the master's and the 
> region servers' logs, and found that the meta region assign procedure, which 
> held the meta region lock, never completed and never released the lock, so 
> recoverMetaProcedure could not be executed. 
>   
>  Why did the first procedure never complete and never release the meta 
> region lock?
>  In the test logs, I found that when the AssignmentManager assigned the 
> region, it needed to call the rsgroup balancer, which had not been fully 
> initialized, so it threw an NPE. As a result, the procedure never completed 
> and never released the lock.
> {code:java}
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hbase.rsgroup.RSGroupBasedLoadBalancer.generateGroupMaps(RSGroupBasedLoadBalancer.java:262)
> at 
> org.apache.hadoop.hbase.rsgroup.RSGroupBasedLoadBalancer.roundRobinAssignment(RSGroupBasedLoadBalancer.java:162)
> at 
> org.apache.hadoop.hbase.master.assignment.AssignmentManager.processAssignmentPlans(AssignmentManager.java:1864)
> at 
> org.apache.hadoop.hbase.master.assignment.AssignmentManager.processAssignQueue(AssignmentManager.java:1809)
> at 
> org.apache.hadoop.hbase.master.as

[jira] [Commented] (HBASE-18822) Create table for peer cluster automatically when creating table in source cluster of using namespace replication.

2018-07-25 Thread Zheng Hu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-18822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16555800#comment-16555800
 ] 

Zheng Hu commented on HBASE-18822:
--

[~ashish singhi], it's the correct behaviour, because if we don't sync the 
table to the peer cluster, then replication will be stuck.

> Create table for peer cluster automatically when creating table in source 
> cluster of using namespace replication.
> -
>
> Key: HBASE-18822
> URL: https://issues.apache.org/jira/browse/HBASE-18822
> Project: HBase
>  Issue Type: Improvement
>  Components: Replication
>Affects Versions: 2.0.0-alpha-2
>Reporter: Zheng Hu
>Assignee: Zheng Hu
>Priority: Major
> Fix For: 3.0.0, 2.2.0
>
> Attachments: HBASE-18822.v1.patch
>
>
> In our cluster, which uses namespace replication, we always forget to create 
> the table in the peer cluster, which leads to replication getting stuck.
> We have implemented this feature in our cluster: create the table in the 
> peer cluster automatically when creating a table in the source cluster under 
> namespace replication.
>  
> I'm not sure if someone else needs this feature, so I created an issue here 
> for discussion.
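For readers skimming the thread, here is a minimal sketch of what "create the 
table on the peer automatically" could look like using only public client 
APIs. The class and method names are mine, not the patch's:
{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.TableDescriptor;

public class MirrorTableToPeer {
  // Hypothetical helper: when a table is created on the source cluster, reuse
  // its descriptor to create the same table on the peer cluster if missing.
  public static void createOnPeer(TableDescriptor desc, Configuration peerConf)
      throws Exception {
    try (Connection peer = ConnectionFactory.createConnection(peerConf);
         Admin admin = peer.getAdmin()) {
      if (!admin.tableExists(desc.getTableName())) {
        admin.createTable(desc);
      }
    }
  }
}
{code}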



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20867) RS may get killed while master restarts

2018-07-25 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16555885#comment-16555885
 ] 

Hudson commented on HBASE-20867:


Results for branch branch-2.0
[build #591 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/591/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/591//General_Nightly_Build_Report/]




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/591//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/591//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


> RS may get killed while master restarts
> ---
>
> Key: HBASE-20867
> URL: https://issues.apache.org/jira/browse/HBASE-20867
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 3.0.0, 2.1.0, 2.0.1
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Major
> Fix For: 3.0.0, 2.0.2, 2.1.1
>
> Attachments: HBASE-20867.branch-2.0.001.patch, 
> HBASE-20867.branch-2.0.002.patch, HBASE-20867.branch-2.0.003.patch, 
> HBASE-20867.branch-2.0.004.patch, HBASE-20867.branch-2.0.005.patch, 
> HBASE-20867.branch-2.0.006.patch
>
>
> If the master is dispatching an RPC call to an RS when aborting, a connection 
> exception may be thrown by the RPC layer (an IOException with a "Connection 
> closed" message in this case). The RSProcedureDispatcher will regard it as an 
> un-retryable exception and pass it to UnassignProcedure.remoteCallFailed, 
> which will expire the RS.
> Actually, the RS is perfectly healthy; only the master is restarting.
> I think we should handle those kinds of connection exceptions in 
> RSProcedureDispatcher and retry the RPC call
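To make the last suggestion concrete, here is a tiny sketch of the kind of 
classification being proposed; this is an assumption on my part, not the 
committed patch:
{code:java}
import java.io.IOException;
import java.net.ConnectException;

public class RetryableErrors {
  // Hypothetical check: treat connection-level failures as retryable in the
  // dispatcher instead of expiring a perfectly healthy region server.
  static boolean isRetryableConnectionException(IOException e) {
    return e instanceof ConnectException
        || (e.getMessage() != null && e.getMessage().contains("Connection closed"));
  }
}
{code}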



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20846) Restore procedure locks when master restarts

2018-07-25 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16555616#comment-16555616
 ] 

Hudson commented on HBASE-20846:


Results for branch master
[build #408 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/master/408/]: (x) 
*{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/408//General_Nightly_Build_Report/]




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/408//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- Something went wrong running this stage, please [check relevant console 
output|https://builds.apache.org/job/HBase%20Nightly/job/master/408//console].


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> Restore procedure locks when master restarts
> 
>
> Key: HBASE-20846
> URL: https://issues.apache.org/jira/browse/HBASE-20846
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.1.0
>Reporter: Allan Yang
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 3.0.0, 2.2.0, 2.1.1
>
> Attachments: HBASE-20846-v1.patch, HBASE-20846-v2.patch, 
> HBASE-20846-v3.patch, HBASE-20846-v4.patch, HBASE-20846-v4.patch, 
> HBASE-20846-v4.patch, HBASE-20846-v5.patch, HBASE-20846-v6.patch, 
> HBASE-20846.branch-2.0.002.patch, HBASE-20846.branch-2.0.patch, 
> HBASE-20846.patch
>
>
> Found this one while investigating a ModifyTableProcedure that got stuck 
> while a MoveRegionProcedure was going on after a master restart.
> Though this issue can be solved by HBASE-20752, I discovered something else.
> Before a MoveRegionProcedure can execute, it will hold the table's shared 
> lock. So, when an UnassignProcedure was spawned, it would not check the 
> table's shared lock, since it is sure that its parent (MoveRegionProcedure) 
> has acquired the table's lock.
> {code:java}
> // If there is parent procedure, it would have already taken xlock, so no need
> // to take shared lock here. Otherwise, take shared lock.
> if (!procedure.hasParent()
>     && waitTableQueueSharedLock(procedure, table) == null) {
>   return true;
> }
> {code}
> But it is not the case when the master is restarted. The child procedure 
> (UnassignProcedure) will be executed first after the restart. Though it has 
> a parent (MoveRegionProcedure), apparently the parent didn't hold the 
> table's lock.
> So, since it began to execute without holding the table's shared lock, a 
> ModifyTableProcedure could acquire the table's exclusive lock and execute at 
> the same time, which is not possible if the master was not restarted.
> This would cause a stuck procedure before HBASE-20752. But since HBASE-20752 
> is fixed, I wrote a simple UT to repro this case.
> I think we don't have to check the parent for the table's shared lock. It is 
> a shared lock, right? I think we can acquire it every time we need it.
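A sketch of what the last paragraph suggests, i.e. dropping the hasParent() 
shortcut and always taking the shared lock (my reading of the comment, not 
necessarily the committed change):
{code:java}
// Always take the table's shared lock, whether or not the procedure has a
// parent; shared locks can simply be acquired again when needed. (Sketch only.)
if (waitTableQueueSharedLock(procedure, table) == null) {
  return true;
}
{code}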



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20928) Rewrite calculation of midpoint in binarySearch functions to prevent overflow

2018-07-25 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16555614#comment-16555614
 ] 

Hudson commented on HBASE-20928:


Results for branch master
[build #408 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/master/408/]: (x) 
*{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/408//General_Nightly_Build_Report/]




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/408//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- Something went wrong running this stage, please [check relevant console 
output|https://builds.apache.org/job/HBase%20Nightly/job/master/408//console].


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> Rewrite calculation of midpoint in binarySearch functions to prevent overflow
> -
>
> Key: HBASE-20928
> URL: https://issues.apache.org/jira/browse/HBASE-20928
> Project: HBase
>  Issue Type: Bug
>  Components: io
>Reporter: saurabh singh
>Assignee: saurabh singh
>Priority: Minor
> Fix For: 2.2.0
>
> Attachments: HBASE-20928-addendum.patch, 
> HBASE-20928-fix-binarySearch-v5.patch, HBASE-20928-fix-binarySearch-v5.patch
>
>
> There are a couple of issues in the function:
>  * The {{>>>}} operator would mess up the values if {{low}} + {{high}} ends 
> up being negative. This shouldn't happen, but I don't see anything to prevent 
> it from happening.
>  * The code fails around boundary values of {{low}} and {{high}}. This is a 
> well-known binary search catch. 
> [https://ai.googleblog.com/2006/06/extra-extra-read-all-about-it-nearly.html]
>  
> Most of the code should already be covered by tests. I would have liked to 
> add a test that actually fails without the fix, but given these are private 
> methods, I am not sure of the best place to add the test. Suggestions?
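A small self-contained illustration of the first bullet (not HBase's actual 
code): with genuinely negative inputs, the unsigned shift of the sum produces 
a bogus index, while the subtraction form stays correct:
{code:java}
public class MidpointDemo {
  // (low + high) >>> 1 is fine while the mathematical sum stays in [0, 2^32),
  // but if low + high is truly negative the unsigned shift yields a huge value.
  static int midShift(int low, int high) { return (low + high) >>> 1; }

  // Overflow- and sign-safe rewrite: high - low never overflows when low <= high.
  static int midSafe(int low, int high) { return low + ((high - low) >>> 1); }

  public static void main(String[] args) {
    System.out.println(midShift(-10, -2)); // 2147483642 -- nonsense
    System.out.println(midSafe(-10, -2));  // -6 -- the correct midpoint
  }
}
{code}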



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20899) Add Hadoop KMS dependency and basic HDFS at-rest encryption tests

2018-07-25 Thread Wei-Chiu Chuang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HBASE-20899:

Attachment: HBASE-20899.master.001.patch

> Add Hadoop KMS dependency and basic HDFS at-rest encryption tests
> -
>
> Key: HBASE-20899
> URL: https://issues.apache.org/jira/browse/HBASE-20899
> Project: HBase
>  Issue Type: Sub-task
>  Components: encryption
>Affects Versions: 2.0.0
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
>Priority: Major
> Attachments: HBASE-20899.master.001.patch
>
>
> We should start by adding hadoop-kms dependency in HBase test scope, and add 
> basic HDFS at-rest encryption tests using the hadoop-kms dependency.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (HBASE-11715) HBase should provide a tool to compare 2 remote tables.

2018-07-25 Thread Divesh Jain (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-11715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16555943#comment-16555943
 ] 

Divesh Jain edited comment on HBASE-11715 at 7/25/18 4:52 PM:
--

How about introducing a new tool that uses the following approach:

1) Takes an HBase snapshot of the table on both clusters.

2) Launches a MapReduce job to calculate the total number of row keys, the min 
key, the max key and a hash sum on each cluster.

3) Compares the above values across the two clusters to figure out whether 
the data is the same.

4) Deletes the HBase snapshots taken in step 1.

Since only a few bytes' worth of data are transferred, the tool will perform 
far faster than VerifyReplicatedData. We can also look at extending the 
functionality of VerifyReplicatedData to operate in a fast mode using the 
above logic.

 

_The tool can also be made configurable enough to accept start and end row 
keys, enable/disable raw scans, accept a start and end time range, etc._

 

[~apurtell]
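To make the proposal concrete, here is a single-process sketch of the 
per-cluster digest (the proposal would run this as a MapReduce job on each 
cluster; the class name and the row-key-only hash are my simplifications):
{code:java}
import java.util.Arrays;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class TableDigest {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    try (Connection conn = ConnectionFactory.createConnection(conf);
         Table table = conn.getTable(TableName.valueOf(args[0]));
         ResultScanner scanner = table.getScanner(new Scan())) {
      long rowCount = 0, hashSum = 0;
      byte[] minKey = null, maxKey = null;
      for (Result r : scanner) {
        byte[] row = r.getRow();
        if (minKey == null) minKey = row; // scanner yields rows in sorted order
        maxKey = row;
        rowCount++;
        hashSum += Arrays.hashCode(row);  // the real tool would hash full cells
      }
      if (rowCount == 0) {
        System.out.println("rows=0 (empty table)");
      } else {
        System.out.printf("rows=%d min=%s max=%s hashSum=%d%n", rowCount,
            Bytes.toStringBinary(minKey), Bytes.toStringBinary(maxKey), hashSum);
      }
    }
  }
}
{code}
Running this against the table on each cluster and comparing the four printed 
values gives the cheap equality check described above.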


was (Author: jdivesh):
How about introducing a new tool that uses the following approach:

1) Takes an HBase snapshot of the table on both clusters.

2) Launches a MapReduce job to calculate the total number of row keys, the min 
key, the max key and a hash sum on each cluster.

3) Compares the above values across the two clusters to figure out whether 
the data is the same.

4) Deletes the HBase snapshots taken in step 1.

Since only a few bytes' worth of data are transferred, the tool will perform 
far faster than VerifyReplicatedData. We can also look at extending the 
functionality of VerifyReplicatedData to operate in a fast mode using the 
above logic.

* The tool can also be made configurable enough to accept start and end row 
keys, enable/disable raw scans, accept a start and end time range, etc.

> HBase should provide a tool to compare 2 remote tables.
> ---
>
> Key: HBASE-11715
> URL: https://issues.apache.org/jira/browse/HBASE-11715
> Project: HBase
>  Issue Type: New Feature
>  Components: util
>Reporter: Jean-Marc Spaggiari
>Priority: Major
>
> As discussed on the mailing list, when a table is copied to another cluster 
> and needs to be validated against the first one, only VerifyReplication can 
> be used. However, this can take very long since the data needs to be copied 
> again.
> We should provide an easier and faster way to compare the tables. 
> One option is to calculate hashes per range. The user can define a number of 
> buckets, then we split the table into this number of buckets and calculate a 
> hash for each (like the partitioner is already doing). We can also optionally 
> calculate an overall CRC to further reduce hash collisions. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (HBASE-11715) HBase should provide a tool to compare 2 remote tables.

2018-07-25 Thread Divesh Jain (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-11715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16555943#comment-16555943
 ] 

Divesh Jain edited comment on HBASE-11715 at 7/25/18 4:46 PM:
--

How about introducing a new tool that uses the following approach:

1) Takes an HBase snapshot of the table on both clusters.

2) Launches a MapReduce job to calculate the total number of row keys, the min 
key, the max key and a hash sum on each cluster.

3) Compares the above values across the two clusters to figure out whether 
the data is the same.

4) Deletes the HBase snapshots taken in step 1.

Since only a few bytes' worth of data are transferred, the tool will perform 
far faster than VerifyReplicatedData. We can also look at extending the 
functionality of VerifyReplicatedData to operate in a fast mode using the 
above logic.

* The tool can also be made configurable enough to accept start and end row 
keys, enable/disable raw scans, accept a start and end time range, etc.



> HBase should provide a tool to compare 2 remote tables.
> ---
>
> Key: HBASE-11715
> URL: https://issues.apache.org/jira/browse/HBASE-11715
> Project: HBase
>  Issue Type: New Feature
>  Components: util
>Reporter: Jean-Marc Spaggiari
>Priority: Major
>
> As discussed on the mailing list, when a table is copied to another cluster 
> and needs to be validated against the first one, only VerifyReplication can 
> be used. However, this can take very long since the data needs to be copied 
> again.
> We should provide an easier and faster way to compare the tables. 
> One option is to calculate hashes per range. The user can define a number of 
> buckets, then we split the table into this number of buckets and calculate a 
> hash for each (like the partitioner is already doing). We can also optionally 
> calculate an overall CRC to further reduce hash collisions. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (HBASE-20934) Create an hbase-connectors repository; commit new kafka connect here

2018-07-25 Thread Sean Busbey (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16555680#comment-16555680
 ] 

Sean Busbey edited comment on HBASE-20934 at 7/25/18 1:41 PM:
--

mapreduce stuff too? That would require us to clean up how our tools load 
implementations that are specific to an execution framework.


was (Author: busbey):
mapreduce stuff too?

> Create an hbase-connectors repository; commit new kafka connect here
> 
>
> Key: HBASE-20934
> URL: https://issues.apache.org/jira/browse/HBASE-20934
> Project: HBase
>  Issue Type: Bug
>  Components: kafka, REST, spark, Thrift
>Affects Versions: 2.1.0
>Reporter: stack
>Assignee: stack
>Priority: Critical
> Fix For: 2.2.0
>
>
> Create a new repository at hbase.apache.org and commit the new kafka proxy 
> here (HBASE-15320). Make sure it plays nicely with hbase core, making use of 
> public APIs only (it does, as best as I can see... but saying this anyway).
> Once the kafka proxy is working, as a subissue, move REST and Thrift over... 
> Spark too.
> This might be better done for an hbase3 target. I filed it against hbase2.2 
> for now.
> See discussion up on dev list, "[DISCUSS] Kafka Connection, HBASE-15320", 
> https://s.apache.org/RQcC.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20938) Set version to 2.1.1-SNAPSHOT for branch-2.1

2018-07-25 Thread Duo Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang updated HBASE-20938:
--
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Pushed to branch-2.1.

> Set version to 2.1.1-SNAPSHOT for branch-2.1
> 
>
> Key: HBASE-20938
> URL: https://issues.apache.org/jira/browse/HBASE-20938
> Project: HBase
>  Issue Type: Sub-task
>  Components: build
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 2.1.1
>
> Attachments: HBASE-20938-branch-2.1.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20934) Create an hbase-connectors repository; commit new kafka connect here

2018-07-25 Thread Sean Busbey (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16555680#comment-16555680
 ] 

Sean Busbey commented on HBASE-20934:
-

mapreduce stuff too?

> Create an hbase-connectors repository; commit new kafka connect here
> 
>
> Key: HBASE-20934
> URL: https://issues.apache.org/jira/browse/HBASE-20934
> Project: HBase
>  Issue Type: Bug
>  Components: kafka, REST, spark, Thrift
>Affects Versions: 2.1.0
>Reporter: stack
>Assignee: stack
>Priority: Critical
> Fix For: 2.2.0
>
>
> Create a new repository at hbase.apache.org and commit the new kafka proxy 
> here (HBASE-15320). Make sure it plays nicely with hbase core, making use of 
> public APIs only (it does, as best as I can see... but saying this anyway).
> Once the kafka proxy is working, as a subissue, move REST and Thrift over... 
> Spark too.
> This might be better done for an hbase3 target. I filed it against hbase2.2 
> for now.
> See discussion up on dev list, "[DISCUSS] Kafka Connection, HBASE-15320", 
> https://s.apache.org/RQcC.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-18822) Create table for peer cluster automatically when creating table in source cluster of using namespace replication.

2018-07-25 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-18822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16555716#comment-16555716
 ] 

Hadoop QA commented on HBASE-18822:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
25s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
16s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
34s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  3m 
23s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
48s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
20s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m 
20s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
11s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
16s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  3m 
29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  3m 
29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  3m 
29s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m  
1s{color} | {color:red} hbase-server: The patch generated 1 new + 12 unchanged 
- 0 fixed = 13 total (was 12) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
19s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green}  
9m 42s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 
or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} hbaseprotoc {color} | {color:green}  
1m 39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
11s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
33s{color} | {color:green} hbase-protocol-shaded in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
42s{color} | {color:green} hbase-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
35s{color} | {color:green} hbase-replication in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}225m  8s{color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  1m 
33s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}286m  4s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.client.TestMobRestoreSnapshotFromClient |
|   | hadoop.hbase.client.TestMobSnapshotCloneIndependence |
|   | hadoop.hbase.

[jira] [Updated] (HBASE-20649) Validate HFiles do not have PREFIX_TREE DataBlockEncoding

2018-07-25 Thread Sean Busbey (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Busbey updated HBASE-20649:

Component/s: tooling
 Operability

> Validate HFiles do not have PREFIX_TREE DataBlockEncoding
> -
>
> Key: HBASE-20649
> URL: https://issues.apache.org/jira/browse/HBASE-20649
> Project: HBase
>  Issue Type: New Feature
>  Components: Operability, tooling
>Reporter: Peter Somogyi
>Assignee: Balazs Meszaros
>Priority: Minor
> Fix For: 3.0.0, 2.2.0
>
> Attachments: HBASE-20649.master.001.patch, 
> HBASE-20649.master.002.patch, HBASE-20649.master.003.patch, 
> HBASE-20649.master.004.patch, HBASE-20649.master.005.patch, 
> HBASE-20649.master.006.patch
>
>
> HBASE-20592 adds a tool to check column families on the cluster do not have 
> PREFIX_TREE encoding.
> Since it is possible that DataBlockEncoding was already changed but HFiles 
> are not rewritten yet we would need a tool that can verify the content of 
> hfiles in the cluster.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20886) [Auth] Support keytab login in hbase client

2018-07-25 Thread Reid Chan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Reid Chan updated HBASE-20886:
--
Attachment: HBASE-20886.master.005.patch

> [Auth] Support keytab login in hbase client
> ---
>
> Key: HBASE-20886
> URL: https://issues.apache.org/jira/browse/HBASE-20886
> Project: HBase
>  Issue Type: Improvement
>  Components: asyncclient, Client, security
>Reporter: Reid Chan
>Assignee: Reid Chan
>Priority: Critical
> Attachments: HBASE-20886.master.001.patch, 
> HBASE-20886.master.002.patch, HBASE-20886.master.003.patch, 
> HBASE-20886.master.004.patch, HBASE-20886.master.005.patch
>
>
> There are lots of questions on the user mailing list and the Slack channel 
> about how to connect to a kerberized HBase cluster through the hbase-client 
> API.
> {{hbase.client.keytab.file}} and {{hbase.client.keytab.principal}} already 
> exist in the code base, but they are only used in {{Canary}}.
> This issue is to make use of the two configs to support client-side 
> keytab-based login. After this issue is resolved, hbase-client should 
> connect directly to a kerberized cluster without any code changes, as long 
> as {{hbase.client.keytab.file}} and {{hbase.client.keytab.principal}} are 
> specified.
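For context, here is the boilerplate clients currently have to write by hand, 
which this issue should make unnecessary; the principal and keytab path are 
placeholders:
{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.security.UserGroupInformation;

public class KerberosClient {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    conf.set("hadoop.security.authentication", "kerberos");
    UserGroupInformation.setConfiguration(conf);
    // Explicit keytab login -- the step this issue aims to fold into the
    // hbase.client.keytab.file / hbase.client.keytab.principal configs.
    UserGroupInformation.loginUserFromKeytab("hbase-client@EXAMPLE.COM",
        "/etc/security/keytabs/client.keytab");
    try (Connection conn = ConnectionFactory.createConnection(conf)) {
      System.out.println("Connected as " + UserGroupInformation.getLoginUser());
    }
  }
}
{code}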



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20919) meta region can't be re-onlined when restarting cluster if opening rsgroup

2018-07-25 Thread chenyang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16555778#comment-16555778
 ] 

chenyang commented on HBASE-20919:
--

Hi, Ted Yu,

HBASE-20919-branch-2.0-01.patch is deprecated.

HBASE-20919-branch-2.0-02.patch offers a better solution which does not block 
the current thread.

Please review HBASE-20919-branch-2.0-02.patch, thank you.

> meta region can't be re-onlined when restarting cluster if opening rsgroup
> --
>
> Key: HBASE-20919
> URL: https://issues.apache.org/jira/browse/HBASE-20919
> Project: HBase
>  Issue Type: Bug
>  Components: Balancer, master, rsgroup
>Affects Versions: 2.0.1
>Reporter: chenyang
>Priority: Major
> Attachments: HBASE-20919-branch-2.0-01.patch, 
> HBASE-20919-branch-2.0-02.patch, bug2.png, 
> hbase-hbase-master-bjpg-rs4729.yz02.no_02patch.log, 
> hbase-hbase-master-bjpg-rs4729.yz02.with_02patch.log, 
> hbase-hbase-master-bjpg-rs4730.yz02.log.test
>
>
> If you enable rsgroup, hbase-site.xml contains the configuration below.
> {code:java}
> 
>   hbase.coprocessor.master.classes
>   org.apache.hadoop.hbase.rsgroup.RSGroupAdminEndpoint
> 
> 
>   hbase.master.loadbalancer.class
>  org.apache.hadoop.hbase.rsgroup.RSGroupBasedLoadBalancer
> 
> {code}
> And you shut down the whole HBase cluster in this way:
>  # first shut down the region servers one by one
>  # then shut down the master
> Then you restart the whole cluster in this way:
>  # start the master
>  # start the region servers
> The hbase:meta region cannot be re-onlined and the rsgroup cannot be 
> initialized successfully.
>  master logs:
> {code:java}
> 2018-07-12 18:27:08,775 INFO 
> [org.apache.hadoop.hbase.rsgroup.RSGroupInfoManagerImpl$RSGroupStartupWorker-bjpg-rs4730.yz02,16000,1531389637409]
>  rsgroup.RSGro
> upInfoManagerImpl$RSGroupStartupWorker: Waiting for catalog tables to come 
> online
> 2018-07-12 18:27:08,876 INFO 
> [org.apache.hadoop.hbase.rsgroup.RSGroupInfoManagerImpl$RSGroupStartupWorker-bjpg-rs4730.yz02,16000,1531389637409]
>  zookeeper.Met
> aTableLocator: Failed verification of hbase:meta,,1 at 
> address=bjpg-rs4732.yz02,60020,1531388712053, 
> exception=org.apache.hadoop.hbase.NotServingRegionExcepti
> on: hbase:meta,,1 is not online on bjpg-rs4732.yz02,60020,1531389727928
> at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionByEncodedName(HRegionServer.java:3249)
> at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:3226)
> at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.getRegion(RSRpcServices.java:1414)
> at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.getRegionInfo(RSRpcServices.java:1729)
> at 
> org.apache.hadoop.hbase.shaded.protobuf.generated.AdminProtos$AdminService$2.callBlockingMethod(AdminProtos.java:28286)
> at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:409)
> at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:130)
> at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)
> at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304)
> {code}
> The logs show that the hbase:meta region is not online and that rsgroup 
> keeps retrying to initialize.
>   
>  But why is the hbase:meta region not online?
>  The info-level logs and jstack did not have enough information, so I added 
> some debug logs to the test source code. Then I checked the master's and the 
> region servers' logs, and found that the meta region assign procedure, which 
> held the meta region lock, never completed and never released the lock, so 
> recoverMetaProcedure could not be executed. 
>   
>  Why did the first procedure never complete and never release the meta 
> region lock?
>  In the test logs, I found that when the AssignmentManager assigned the 
> region, it needed to call the rsgroup balancer, which had not been fully 
> initialized, so it threw an NPE. As a result, the procedure never completed 
> and never released the lock.
> {code:java}
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hbase.rsgroup.RSGroupBasedLoadBalancer.generateGroupMaps(RSGroupBasedLoadBalancer.java:262)
> at 
> org.apache.hadoop.hbase.rsgroup.RSGroupBasedLoadBalancer.roundRobinAssignment(RSGroupBasedLoadBalancer.java:162)
> at 
> org.apache.hadoop.hbase.master.assignment.AssignmentManager.processAssignmentPlans(AssignmentManager.java:1864)
> at 
> org.apache.hadoop.hbase.master.assignment.AssignmentManager.processAssignQueue(AssignmentManager.java:1809)
> at 
> org.apache.hadoop.hbase.master.assignment.AssignmentManager.access$400(AssignmentManager.java:113)
> at 
> org.apache.hadoop.hbase.master.assignment.AssignmentManager$2.run(AssignmentManager.java:1693)
> {code}
> !bug2.png!
> As shown in the figure named bug2.png listed in attachments, when we shutdown 
> the last

[jira] [Commented] (HBASE-18822) Create table for peer cluster automatically when creating table in source cluster of using namespace replication.

2018-07-25 Thread Ashish Singhi (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-18822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16555816#comment-16555816
 ] 

Ashish Singhi commented on HBASE-18822:
---

I get that point, [~openinx]. My only concern is that unless a user explicitly 
specifies a table for replication, all the tables will be synced to the peer 
cluster, which IMO is not correct. If others here think otherwise, I'm fine 
with it too. I don't want to be a blocker.

> Create table for peer cluster automatically when creating table in source 
> cluster of using namespace replication.
> -
>
> Key: HBASE-18822
> URL: https://issues.apache.org/jira/browse/HBASE-18822
> Project: HBase
>  Issue Type: Improvement
>  Components: Replication
>Affects Versions: 2.0.0-alpha-2
>Reporter: Zheng Hu
>Assignee: Zheng Hu
>Priority: Major
> Fix For: 3.0.0, 2.2.0
>
> Attachments: HBASE-18822.v1.patch
>
>
> In our cluster, which uses namespace replication, we always forget to create 
> the table in the peer cluster, which leads to replication getting stuck.
> We have implemented this feature in our cluster: create the table in the 
> peer cluster automatically when creating a table in the source cluster under 
> namespace replication.
>  
> I'm not sure if someone else needs this feature, so I created an issue here 
> for discussion.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-18822) Create table for peer cluster automatically when creating table in source cluster of using namespace replication.

2018-07-25 Thread Zheng Hu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-18822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zheng Hu updated HBASE-18822:
-
Attachment: HBASE-18822.v1.patch

> Create table for peer cluster automatically when creating table in source 
> cluster of using namespace replication.
> -
>
> Key: HBASE-18822
> URL: https://issues.apache.org/jira/browse/HBASE-18822
> Project: HBase
>  Issue Type: Improvement
>  Components: Replication
>Affects Versions: 2.0.0-alpha-2
>Reporter: Zheng Hu
>Assignee: Zheng Hu
>Priority: Major
> Fix For: 3.0.0, 2.2.0
>
> Attachments: HBASE-18822.v1.patch, HBASE-18822.v1.patch
>
>
> In our cluster, which uses namespace replication, we always forget to create 
> the table in the peer cluster, which leads to replication getting stuck.
> We have implemented this feature in our cluster: create the table in the 
> peer cluster automatically when creating a table in the source cluster under 
> namespace replication.
>  
> I'm not sure if someone else needs this feature, so I created an issue here 
> for discussion.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20782) Fix duplication of TestServletFilter.access

2018-07-25 Thread Xu Cang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xu Cang updated HBASE-20782:

Attachment: HBASE-20782.master.003.patch

> Fix duplication of TestServletFilter.access
> ---
>
> Key: HBASE-20782
> URL: https://issues.apache.org/jira/browse/HBASE-20782
> Project: HBase
>  Issue Type: Improvement
>Reporter: Jan Hentschel
>Assignee: Xu Cang
>Priority: Minor
> Attachments: HBASE-20782.master.001.patch, 
> HBASE-20782.master.002.patch, HBASE-20782.master.003.patch, 
> HBASE-20782.master.003.patch
>
>
> The {{access}} method in {{TestServletFilter}} is duplicated in 
> {{TestPathFilter}}. The method should be moved into a common place.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20928) Rewrite calculation of midpoint in binarySearch functions to prevent overflow

2018-07-25 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-20928:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> Rewrite calculation of midpoint in binarySearch functions to prevent overflow
> -
>
> Key: HBASE-20928
> URL: https://issues.apache.org/jira/browse/HBASE-20928
> Project: HBase
>  Issue Type: Bug
>  Components: io
>Reporter: saurabh singh
>Assignee: saurabh singh
>Priority: Minor
> Fix For: 2.2.0
>
> Attachments: HBASE-20928-addendum.patch, 
> HBASE-20928-fix-binarySearch-v5.patch, HBASE-20928-fix-binarySearch-v5.patch
>
>
> There are a couple of issues in the function:
>  * The {{>>>}} operator would mess up the values if {{low}} + {{high}} ends 
> up being negative. This shouldn't happen, but I don't see anything to prevent 
> it from happening.
>  * The code fails around boundary values of {{low}} and {{high}}. This is a 
> well-known binary search catch. 
> [https://ai.googleblog.com/2006/06/extra-extra-read-all-about-it-nearly.html]
>  
> Most of the code should already be covered by tests. I would have liked to 
> add a test that actually fails without the fix, but given these are private 
> methods, I am not sure of the best place to add the test. Suggestions?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-18822) Create table for peer cluster automatically when creating table in source cluster of using namespace replication.

2018-07-25 Thread Xu Cang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-18822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16555870#comment-16555870
 ] 

Xu Cang commented on HBASE-18822:
-

[~openinx], thanks for sharing this patch.

One question: it seems this patch makes a best effort to sync tables but does 
not guarantee they are all synced. Especially when I read 
#syncSchemaModificationToPeer, I realized it only modifies the table when the 
column family count is different. So there is still some work users need to 
do to verify that the table schema is synced.

> Create table for peer cluster automatically when creating table in source 
> cluster of using namespace replication.
> -
>
> Key: HBASE-18822
> URL: https://issues.apache.org/jira/browse/HBASE-18822
> Project: HBase
>  Issue Type: Improvement
>  Components: Replication
>Affects Versions: 2.0.0-alpha-2
>Reporter: Zheng Hu
>Assignee: Zheng Hu
>Priority: Major
> Fix For: 3.0.0, 2.2.0
>
> Attachments: HBASE-18822.v1.patch, HBASE-18822.v1.patch
>
>
> In our cluster, which uses namespace replication, we always forget to create 
> the table in the peer cluster, which leads to replication getting stuck.
> We have implemented this feature in our cluster: create the table in the 
> peer cluster automatically when creating a table in the source cluster under 
> namespace replication.
>  
> I'm not sure if someone else needs this feature, so I created an issue here 
> for discussion.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20919) meta region can't be re-onlined when restarting cluster if opening rsgroup

2018-07-25 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16555904#comment-16555904
 ] 

Ted Yu commented on HBASE-20919:


Checked HBASE-20919-branch-2.0-02.patch, which seems fine.

Triggered QA run:
https://builds.apache.org/job/PreCommit-HBASE-Build/13789/

> meta region can't be re-onlined when restarting cluster if opening rsgroup
> --
>
> Key: HBASE-20919
> URL: https://issues.apache.org/jira/browse/HBASE-20919
> Project: HBase
>  Issue Type: Bug
>  Components: Balancer, master, rsgroup
>Affects Versions: 2.0.1
>Reporter: chenyang
>Priority: Major
> Attachments: HBASE-20919-branch-2.0-01.patch, 
> HBASE-20919-branch-2.0-02.patch, bug2.png, 
> hbase-hbase-master-bjpg-rs4729.yz02.no_02patch.log, 
> hbase-hbase-master-bjpg-rs4729.yz02.with_02patch.log, 
> hbase-hbase-master-bjpg-rs4730.yz02.log.test
>
>
> If you enable rsgroup, hbase-site.xml contains the configuration below.
> {code:java}
> 
>   hbase.coprocessor.master.classes
>   org.apache.hadoop.hbase.rsgroup.RSGroupAdminEndpoint
> 
> 
>   hbase.master.loadbalancer.class
>  org.apache.hadoop.hbase.rsgroup.RSGroupBasedLoadBalancer
> 
> {code}
> And you shut down the whole HBase cluster in this way:
>  # first shut down the region servers one by one
>  # then shut down the master
> Then you restart the whole cluster in this way:
>  # start the master
>  # start the region servers
> The hbase:meta region cannot be re-onlined and the rsgroup cannot be 
> initialized successfully.
>  master logs:
> {code:java}
> 2018-07-12 18:27:08,775 INFO 
> [org.apache.hadoop.hbase.rsgroup.RSGroupInfoManagerImpl$RSGroupStartupWorker-bjpg-rs4730.yz02,16000,1531389637409]
>  rsgroup.RSGro
> upInfoManagerImpl$RSGroupStartupWorker: Waiting for catalog tables to come 
> online
> 2018-07-12 18:27:08,876 INFO 
> [org.apache.hadoop.hbase.rsgroup.RSGroupInfoManagerImpl$RSGroupStartupWorker-bjpg-rs4730.yz02,16000,1531389637409]
>  zookeeper.Met
> aTableLocator: Failed verification of hbase:meta,,1 at 
> address=bjpg-rs4732.yz02,60020,1531388712053, 
> exception=org.apache.hadoop.hbase.NotServingRegionExcepti
> on: hbase:meta,,1 is not online on bjpg-rs4732.yz02,60020,1531389727928
> at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionByEncodedName(HRegionServer.java:3249)
> at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:3226)
> at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.getRegion(RSRpcServices.java:1414)
> at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.getRegionInfo(RSRpcServices.java:1729)
> at 
> org.apache.hadoop.hbase.shaded.protobuf.generated.AdminProtos$AdminService$2.callBlockingMethod(AdminProtos.java:28286)
> at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:409)
> at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:130)
> at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)
> at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304)
> {code}
> The logs show that the hbase:meta region is not online and that rsgroup 
> keeps retrying to initialize.
>   
>  But why is the hbase:meta region not online?
>  The info-level logs and jstack did not have enough information, so I added 
> some debug logs to the test source code. Then I checked the master's and the 
> region servers' logs, and found that the meta region assign procedure, which 
> held the meta region lock, never completed and never released the lock, so 
> recoverMetaProcedure could not be executed. 
>   
>  Why did the first procedure never complete and never release the meta 
> region lock?
>  In the test logs, I found that when the AssignmentManager assigned the 
> region, it needed to call the rsgroup balancer, which had not been fully 
> initialized, so it threw an NPE. As a result, the procedure never completed 
> and never released the lock.
> {code:java}
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hbase.rsgroup.RSGroupBasedLoadBalancer.generateGroupMaps(RSGroupBasedLoadBalancer.java:262)
> at 
> org.apache.hadoop.hbase.rsgroup.RSGroupBasedLoadBalancer.roundRobinAssignment(RSGroupBasedLoadBalancer.java:162)
> at 
> org.apache.hadoop.hbase.master.assignment.AssignmentManager.processAssignmentPlans(AssignmentManager.java:1864)
> at 
> org.apache.hadoop.hbase.master.assignment.AssignmentManager.processAssignQueue(AssignmentManager.java:1809)
> at 
> org.apache.hadoop.hbase.master.assignment.AssignmentManager.access$400(AssignmentManager.java:113)
> at 
> org.apache.hadoop.hbase.master.assignment.AssignmentManager$2.run(AssignmentManager.java:1693)
> {code}
> !bug2.png!
> As shown in the figure named bug2.png listed in the attachments, when we 
> shut down the last region server, the master submits a ServerCrashProcedure. 
> In the procedure

[jira] [Commented] (HBASE-20919) meta region can't be re-onlined when restarting cluster if opening rsgroup

2018-07-25 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16555922#comment-16555922
 ] 

Ted Yu commented on HBASE-20919:


{code}
2018-07-25 14:27:12,356 WARN  [master/bjpg-rs4729:16000] 
assignment.AssignmentManager: unable to round-robin assignment
org.apache.hadoop.hbase.HBaseIOException: RSGroupBasedLoadBalancer has not been 
initialized
  at 
org.apache.hadoop.hbase.rsgroup.RSGroupBasedLoadBalancer.checkInitializedState(RSGroupBasedLoadBalancer.java:480)
{code}
Ultimately, RSGroupBasedLoadBalancer will be initialized.
Shouldn't the above log be at DEBUG level, since there is nothing required 
from the operator?
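For context, the guard behind that stack trace presumably looks something 
like the sketch below; this is a reconstruction from the log line, not the 
actual source:
{code:java}
// Hypothetical body of RSGroupBasedLoadBalancer#checkInitializedState;
// the real implementation may differ.
private void checkInitializedState() throws HBaseIOException {
  if (!isOnline()) { // rsgroup metadata not yet loaded during startup
    throw new HBaseIOException("RSGroupBasedLoadBalancer has not been initialized");
  }
}
{code}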

> meta region can't be re-onlined when restarting cluster if opening rsgroup
> --
>
> Key: HBASE-20919
> URL: https://issues.apache.org/jira/browse/HBASE-20919
> Project: HBase
>  Issue Type: Bug
>  Components: Balancer, master, rsgroup
>Affects Versions: 2.0.1
>Reporter: chenyang
>Priority: Major
> Attachments: HBASE-20919-branch-2.0-01.patch, 
> HBASE-20919-branch-2.0-02.patch, bug2.png, 
> hbase-hbase-master-bjpg-rs4729.yz02.no_02patch.log, 
> hbase-hbase-master-bjpg-rs4729.yz02.with_02patch.log, 
> hbase-hbase-master-bjpg-rs4730.yz02.log.test
>
>
> If you enable rsgroup, hbase-site.xml contains the configuration below.
> {code:java}
> 
>   hbase.coprocessor.master.classes
>   org.apache.hadoop.hbase.rsgroup.RSGroupAdminEndpoint
> 
> 
>   hbase.master.loadbalancer.class
>  org.apache.hadoop.hbase.rsgroup.RSGroupBasedLoadBalancer
> 
> {code}
> And you shut down the whole HBase cluster in this way:
>  # first shut down the region servers one by one
>  # then shut down the master
> Then you restart the whole cluster in this way:
>  # start the master
>  # start the region servers
> The hbase:meta region cannot be re-onlined and the rsgroup cannot be 
> initialized successfully.
>  master logs:
> {code:java}
> 2018-07-12 18:27:08,775 INFO 
> [org.apache.hadoop.hbase.rsgroup.RSGroupInfoManagerImpl$RSGroupStartupWorker-bjpg-rs4730.yz02,16000,1531389637409]
>  rsgroup.RSGro
> upInfoManagerImpl$RSGroupStartupWorker: Waiting for catalog tables to come 
> online
> 2018-07-12 18:27:08,876 INFO 
> [org.apache.hadoop.hbase.rsgroup.RSGroupInfoManagerImpl$RSGroupStartupWorker-bjpg-rs4730.yz02,16000,1531389637409]
>  zookeeper.Met
> aTableLocator: Failed verification of hbase:meta,,1 at 
> address=bjpg-rs4732.yz02,60020,1531388712053, 
> exception=org.apache.hadoop.hbase.NotServingRegionExcepti
> on: hbase:meta,,1 is not online on bjpg-rs4732.yz02,60020,1531389727928
> at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionByEncodedName(HRegionServer.java:3249)
> at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:3226)
> at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.getRegion(RSRpcServices.java:1414)
> at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.getRegionInfo(RSRpcServices.java:1729)
> at 
> org.apache.hadoop.hbase.shaded.protobuf.generated.AdminProtos$AdminService$2.callBlockingMethod(AdminProtos.java:28286)
> at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:409)
> at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:130)
> at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)
> at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304)
> {code}
> The logs show that the hbase:meta region is not online and that rsgroup 
> keeps retrying to initialize.
>   
>  But why is the hbase:meta region not online?
>  The info-level logs and jstack did not have enough information, so I added 
> some debug logs to the test source code. Then I checked the master's and the 
> region servers' logs, and found that the meta region assign procedure, which 
> held the meta region lock, never completed and never released the lock, so 
> recoverMetaProcedure could not be executed. 
>   
>  Why did the first procedure never complete and never release the meta 
> region lock?
>  In the test logs, I found that when the AssignmentManager assigned the 
> region, it needed to call the rsgroup balancer, which had not been fully 
> initialized, so it threw an NPE. As a result, the procedure never completed 
> and never released the lock.
> {code:java}
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hbase.rsgroup.RSGroupBasedLoadBalancer.generateGroupMaps(RSGroupBasedLoadBalancer.java:262)
> at 
> org.apache.hadoop.hbase.rsgroup.RSGroupBasedLoadBalancer.roundRobinAssignment(RSGroupBasedLoadBalancer.java:162)
> at 
> org.apache.hadoop.hbase.master.assignment.AssignmentManager.processAssignmentPlans(AssignmentManager.java:1864)
> at 
> org.apache.hadoop.hbase.master.assignment.AssignmentManager.processAssignQueue(AssignmentManager.java:1809)
> at 
> org.apache.hadoop.hbase.master.
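
For reference, a minimal sketch of the fail-fast guard this stack trace points at; 
isOnline() is assumed to report whether the rsgroup metadata has been loaded, and 
the exact branch-2.0 code may differ:

{code:java}
// Hedged sketch: throw a checked HBaseIOException instead of letting an
// uninitialized balancer NPE inside generateGroupMaps(), so the caller can
// catch the exception and re-queue the assignment.
private void checkInitializedState() throws HBaseIOException {
  if (!isOnline()) {
    throw new HBaseIOException("RSGroupBasedLoadBalancer has not been initialized");
  }
}
{code}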

[jira] [Updated] (HBASE-19263) Hbase BaseLoadBalancer

2018-07-25 Thread Xu Cang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-19263?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xu Cang updated HBASE-19263:

Description: 
With the HBase master acting as a regionserver, BaseLoadBalancer throws an 
ArrayIndexOutOfBoundsException because the region count includes the HBase 
master. However, the HBase master cannot host any regions unless configured to 
in hbase-site.xml.

java.lang.ArrayIndexOutOfBoundsException: 2
 at 
org.apache.hadoop.hbase.master.balancer.BaseLoadBalancer$Cluster.getLocalityOfRegion(BaseLoadBalancer.java:843)
 at 
org.apache.hadoop.hbase.master.balancer.StochasticLoadBalancer$LocalityCostFunction.cost(StochasticLoadBalancer.java:1076)
 at 
org.apache.hadoop.hbase.master.balancer.StochasticLoadBalancer.computeCost(StochasticLoadBalancer.java:413)
 at 
org.apache.hadoop.hbase.master.balancer.StochasticLoadBalancer.balanceCluster(StochasticLoadBalancer.java:280)
 at org.apache.hadoop.hbase.master.HMaster.balance(HMaster.java:1341)
 at 
org.apache.hadoop.hbase.master.balancer.BalancerChore.chore(BalancerChore.java:48)
 at org.apache.hadoop.hbase.ScheduledChore.run(ScheduledChore.java:185)
 at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
 at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
 at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
 at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
 at 
org.apache.hadoop.hbase.JitterScheduledThreadPoolExecutorImpl$JitteredRunnableScheduledFuture.run(JitterScheduledThreadPoolExecutorImpl.java:110)
 at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
 at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
 at java.lang.Thread.run(Thread.java:745)

  was:
hbase master as a regionserver,BaseLoadBalancer throw Exception 
ArrayIndexOutOfBoundsException because of  count of region is contain hbase 
master.However,hbase master can not resign any region except hbase-site.xml 
config.


java.lang.ArrayIndexOutOfBoundsException: 2
at 
org.apache.hadoop.hbase.master.balancer.BaseLoadBalancer$Cluster.getLocalityOfRegion(BaseLoadBalancer.java:843)
at 
org.apache.hadoop.hbase.master.balancer.StochasticLoadBalancer$LocalityCostFunction.cost(StochasticLoadBalancer.java:1076)
at 
org.apache.hadoop.hbase.master.balancer.StochasticLoadBalancer.computeCost(StochasticLoadBalancer.java:413)
at 
org.apache.hadoop.hbase.master.balancer.StochasticLoadBalancer.balanceCluster(StochasticLoadBalancer.java:280)
at org.apache.hadoop.hbase.master.HMaster.balance(HMaster.java:1341)
at 
org.apache.hadoop.hbase.master.balancer.BalancerChore.chore(BalancerChore.java:48)
at org.apache.hadoop.hbase.ScheduledChore.run(ScheduledChore.java:185)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
at 
org.apache.hadoop.hbase.JitterScheduledThreadPoolExecutorImpl$JitteredRunnableScheduledFuture.run(JitterScheduledThreadPoolExecutorImpl.java:110)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)


> Hbase BaseLoadBalancer
> --
>
> Key: HBASE-19263
> URL: https://issues.apache.org/jira/browse/HBASE-19263
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.2.6
> Environment: hbase 1.2.6
>Reporter: SuperbDong
>Priority: Major
>
> With the HBase master acting as a regionserver, BaseLoadBalancer throws an 
> ArrayIndexOutOfBoundsException because the region count includes the HBase 
> master. However, the HBase master cannot host any regions unless configured to 
> in hbase-site.xml.
> java.lang.ArrayIndexOutOfBoundsException: 2
>  at 
> org.apache.hadoop.hbase.master.balancer.BaseLoadBalancer$Cluster.getLocalityOfRegion(BaseLoadBalancer.java:843)
>  at 
> org.apache.hadoop.hbase.master.balancer.StochasticLoadBalancer$LocalityCostFunction.cost(StochasticLoadBalancer.java:1076)
>  at 
> org.apache.hadoop.hbase.master.balancer.StochasticLoadBalancer.computeCost(StochasticLoadBalancer.java:413)
>  at 
> org.apache.hadoop.hbase.master.balancer.StochasticLoadBalancer.balanceCluster(StochasticLoadBalancer.java:280)
>  at org.apache.hadoop.hbase.master.HMaster.balance(HMaster.java:1341)
>  at 
> org.apache.hadoop.hbase.master.bala
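
A hedged sketch of the kind of defensive check this report suggests; the 
localityPerServer field below is a hypothetical stand-in for the actual 
BaseLoadBalancer$Cluster locality arrays, not the real field layout:

{code:java}
// Hypothetical guard: if the server index (e.g. a master counted among region
// hosts) falls outside the locality arrays, report zero locality instead of
// throwing ArrayIndexOutOfBoundsException.
float getLocalityOfRegionSafe(int region, int server) {
  if (server < 0 || server >= localityPerServer.length) {
    return 0.0f;
  }
  return localityPerServer[server][region];
}
{code}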

[jira] [Commented] (HBASE-20899) Add Hadoop KMS dependency and basic HDFS at-rest encryption tests

2018-07-25 Thread Wei-Chiu Chuang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16555935#comment-16555935
 ] 

Wei-Chiu Chuang commented on HBASE-20899:
-

I thought this was affected by HBASE-20538, but running against the latest JDK8 
(8u181) on Mac, the test passed. Submitting for precommit check.

> Add Hadoop KMS dependency and basic HDFS at-rest encryption tests
> -
>
> Key: HBASE-20899
> URL: https://issues.apache.org/jira/browse/HBASE-20899
> Project: HBase
>  Issue Type: Sub-task
>  Components: encryption
>Affects Versions: 2.0.0
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
>Priority: Major
> Attachments: HBASE-20899.master.001.patch
>
>
> We should start by adding the hadoop-kms dependency in HBase test scope, and add 
> basic HDFS at-rest encryption tests using it.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20899) Add Hadoop KMS dependency and basic HDFS at-rest encryption tests

2018-07-25 Thread Wei-Chiu Chuang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HBASE-20899:

Status: Patch Available  (was: Open)

> Add Hadoop KMS dependency and basic HDFS at-rest encryption tests
> -
>
> Key: HBASE-20899
> URL: https://issues.apache.org/jira/browse/HBASE-20899
> Project: HBase
>  Issue Type: Sub-task
>  Components: encryption
>Affects Versions: 2.0.0
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
>Priority: Major
> Attachments: HBASE-20899.master.001.patch
>
>
> We should start by adding the hadoop-kms dependency in HBase test scope, and add 
> basic HDFS at-rest encryption tests using it.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20867) RS may get killed while master restarts

2018-07-25 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16555964#comment-16555964
 ] 

Hudson commented on HBASE-20867:


Results for branch branch-2
[build #1025 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1025/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1025//General_Nightly_Build_Report/]




(/) {color:green}+1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1025//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1025//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> RS may get killed while master restarts
> ---
>
> Key: HBASE-20867
> URL: https://issues.apache.org/jira/browse/HBASE-20867
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 3.0.0, 2.1.0, 2.0.1
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Major
> Fix For: 3.0.0, 2.0.2, 2.1.1
>
> Attachments: HBASE-20867.branch-2.0.001.patch, 
> HBASE-20867.branch-2.0.002.patch, HBASE-20867.branch-2.0.003.patch, 
> HBASE-20867.branch-2.0.004.patch, HBASE-20867.branch-2.0.005.patch, 
> HBASE-20867.branch-2.0.006.patch
>
>
> If the master is dispatching an RPC call to an RS when aborting, a connection 
> exception may be thrown by the RPC layer (an IOException with a "Connection 
> closed" message in this case). The RSProcedureDispatcher will regard it as an 
> un-retryable exception and pass it to UnassignProcedure.remoteCallFailed, 
> which will expire the RS.
> Actually, the RS is perfectly healthy; only the master is restarting.
> I think we should handle those kinds of connection exceptions in 
> RSProcedureDispatcher and retry the RPC call.
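
A minimal sketch of the proposed handling, assuming a hypothetical helper inside 
RSProcedureDispatcher; the committed fix may classify exceptions by type rather 
than by message:

{code:java}
// Hedged sketch: treat "Connection closed" IOExceptions as retryable instead
// of expiring the region server, since the RS itself is healthy and only the
// master is restarting.
private boolean isRetryableConnectionException(IOException e) {
  // Hypothetical message-based check for illustration only.
  return e.getMessage() != null && e.getMessage().contains("Connection closed");
}
{code}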



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20846) Restore procedure locks when master restarts

2018-07-25 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16555963#comment-16555963
 ] 

Hudson commented on HBASE-20846:


Results for branch branch-2
[build #1025 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1025/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1025//General_Nightly_Build_Report/]




(/) {color:green}+1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1025//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1025//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> Restore procedure locks when master restarts
> 
>
> Key: HBASE-20846
> URL: https://issues.apache.org/jira/browse/HBASE-20846
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.1.0
>Reporter: Allan Yang
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 3.0.0, 2.2.0, 2.1.1
>
> Attachments: HBASE-20846-v1.patch, HBASE-20846-v2.patch, 
> HBASE-20846-v3.patch, HBASE-20846-v4.patch, HBASE-20846-v4.patch, 
> HBASE-20846-v4.patch, HBASE-20846-v5.patch, HBASE-20846-v6.patch, 
> HBASE-20846.branch-2.0.002.patch, HBASE-20846.branch-2.0.patch, 
> HBASE-20846.patch
>
>
> Found this one when investigating a ModifyTableProcedure that got stuck while 
> a MoveRegionProcedure was going on after a master restart.
> This issue can be solved by HBASE-20752, but I discovered something 
> else.
> Before a MoveRegionProcedure can execute, it holds the table's shared 
> lock. So, when an UnassignProcedure is spawned, it does not check the 
> table's shared lock, since it is sure that its parent (MoveRegionProcedure) has 
> acquired the table's lock.
> {code:java}
>   // If there is a parent procedure, it would have already taken the xlock,
>   // so no need to take the shared lock here. Otherwise, take the shared lock.
>   if (!procedure.hasParent()
>       && waitTableQueueSharedLock(procedure, table) == null) {
>     return true;
>   }
> {code}
> But that is not the case when the master is restarted. The child 
> procedure (UnassignProcedure) is executed first after the restart. Though it 
> has a parent (MoveRegionProcedure), the parent apparently did not hold the 
> table's lock.
> So, since it began executing without holding the table's shared lock, a 
> ModifyTableProcedure could acquire the table's exclusive lock and execute at the 
> same time, which is not possible if the master was not restarted.
> This would cause a stall before HBASE-20752, but since HBASE-20752 is fixed, 
> I wrote a simple UT to repro this case.
> I think we don't have to check the parent for the table's shared lock. It is a 
> shared lock, right? I think we can acquire it every time we need it.
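
A minimal sketch of the simplification proposed above: drop the hasParent() 
shortcut and always wait for the shared lock (names follow the snippet in the 
description):

{code:java}
// Hedged sketch: always take the table's shared lock, even when the procedure
// has a parent, so a child restored after a master restart cannot race a
// ModifyTableProcedure holding the exclusive lock.
if (waitTableQueueSharedLock(procedure, table) == null) {
  return true;
}
{code}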



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20934) Create an hbase-connectors repository; commit new kafka connect here

2018-07-25 Thread stack (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-20934:
--
Component/s: mapreduce

> Create an hbase-connectors repository; commit new kafka connect here
> 
>
> Key: HBASE-20934
> URL: https://issues.apache.org/jira/browse/HBASE-20934
> Project: HBase
>  Issue Type: Bug
>  Components: kafka, mapreduce, REST, spark, Thrift
>Affects Versions: 2.1.0
>Reporter: stack
>Assignee: stack
>Priority: Critical
> Fix For: 2.2.0
>
>
> Create a new repository at hbase.apache.org and commit the new kafka proxy 
> here (HBASE-15320). Make sure it plays nicely with hbase core making use of 
> public-apis only (It does as best as I can see... but saying this anyways).
> Once the kafka proxy is working, as a subissue, move REST and Thrift over... 
> Spark too.
> This might be better done for an hbase3 target. I filed it against hbase2.2 
> for now.
> See discussion up on dev list, "[DISCUSS] Kafka Connection, HBASE-15320", 
> https://s.apache.org/RQcC.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20886) [Auth] Support keytab login in hbase client

2018-07-25 Thread Reid Chan (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16555985#comment-16555985
 ] 

Reid Chan commented on HBASE-20886:
---

v5 addresses the following:
* bq. Could you expand this check to include the principal of the current user 
with krb credentials against the specified principal in the configuration?
* bq. update the javadoc for AuthUtil
* bq. make AuthUtil IA.Private in 3.0
* bq. mark AuthUtil as deprecated in any earlier release lines
* update both ConnectionFactory class javadocs and the ["Client-side Configuration for Secure Operation" section of the ref guide|http://hbase.apache.org/book.html#_client_side_configuration_for_secure_operation]

I'll also update the release note later.

> [Auth] Support keytab login in hbase client
> ---
>
> Key: HBASE-20886
> URL: https://issues.apache.org/jira/browse/HBASE-20886
> Project: HBase
>  Issue Type: Improvement
>  Components: asyncclient, Client, security
>Reporter: Reid Chan
>Assignee: Reid Chan
>Priority: Critical
> Attachments: HBASE-20886.master.001.patch, 
> HBASE-20886.master.002.patch, HBASE-20886.master.003.patch, 
> HBASE-20886.master.004.patch, HBASE-20886.master.005.patch
>
>
> There are lots of questions on the user mailing list and the Slack channel about 
> how to connect to a kerberized HBase cluster through the hbase-client API.
> {{hbase.client.keytab.file}} and {{hbase.client.keytab.principal}} already 
> exist in the code base, but they are only used in {{Canary}}.
> This issue is to make use of these two configs to support client-side keytab-based 
> login. Once this issue is resolved, hbase-client should connect directly to a 
> kerberized cluster without changing any code, as long as 
> {{hbase.client.keytab.file}} and {{hbase.client.keytab.principal}} are 
> specified.
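
As a rough sketch of the intended client-side usage once this is resolved (the 
keytab path and principal below are placeholders; only the two config keys come 
from the description):

{code:java}
// Hedged sketch: with keytab login supported in hbase-client, setting the two
// existing keys should be enough to reach a kerberized cluster.
Configuration conf = HBaseConfiguration.create();
conf.set("hbase.client.keytab.file", "/etc/security/keytabs/hbase.client.keytab");
conf.set("hbase.client.keytab.principal", "hbase-client/_HOST@EXAMPLE.COM");
Connection connection = ConnectionFactory.createConnection(conf);
{code}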



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20886) [Auth] Support keytab login in hbase client

2018-07-25 Thread Reid Chan (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16555997#comment-16555997
 ] 

Reid Chan commented on HBASE-20886:
---

bq. One final thought: ...
Multiple-credentials issues usually happen on the server side, e.g. the HBase 
Thrift server. But I doubt the client side should handle this; I think it can be 
left to the client's application...




> [Auth] Support keytab login in hbase client
> ---
>
> Key: HBASE-20886
> URL: https://issues.apache.org/jira/browse/HBASE-20886
> Project: HBase
>  Issue Type: Improvement
>  Components: asyncclient, Client, security
>Reporter: Reid Chan
>Assignee: Reid Chan
>Priority: Critical
> Attachments: HBASE-20886.master.001.patch, 
> HBASE-20886.master.002.patch, HBASE-20886.master.003.patch, 
> HBASE-20886.master.004.patch, HBASE-20886.master.005.patch
>
>
> There are lots of questions on the user mailing list and the Slack channel about 
> how to connect to a kerberized HBase cluster through the hbase-client API.
> {{hbase.client.keytab.file}} and {{hbase.client.keytab.principal}} already 
> exist in the code base, but they are only used in {{Canary}}.
> This issue is to make use of these two configs to support client-side keytab-based 
> login. Once this issue is resolved, hbase-client should connect directly to a 
> kerberized cluster without changing any code, as long as 
> {{hbase.client.keytab.file}} and {{hbase.client.keytab.principal}} are 
> specified.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20925) Canary test to expose results per table/ per region to result file

2018-07-25 Thread Xu Cang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xu Cang updated HBASE-20925:

Description: 
Canary test to expose results per table/ per region to a result file.

In the result file, it should provide a summary of the canary test and show 
which regions have read or write failures.

Also, include some stats regarding the test, such as "region read count: 500, 
region read success count: 499".

  was:Canary test to expose results per table.

Summary: Canary test to expose results per table/ per region to result 
file  (was: Canary test to expose results per table)

> Canary test to expose results per table/ per region to result file
> --
>
> Key: HBASE-20925
> URL: https://issues.apache.org/jira/browse/HBASE-20925
> Project: HBase
>  Issue Type: Improvement
>  Components: canary
>Reporter: Xu Cang
>Assignee: Xu Cang
>Priority: Major
>
> Canary test to expose results per table/ per region to a result file.
> In the result file, it should provide a summary of the canary test and show 
> which regions have read or write failures.
> Also, include some stats regarding the test, such as "region read count: 
> 500, region read success count: 499".



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20925) Canary test to expose results per table/ per region to result file

2018-07-25 Thread Xu Cang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xu Cang updated HBASE-20925:

Description: 
This change will provide a new command-line argument, 
"-verboseTestResultFilePath", and Canary will write structured, easy-to-parse, 
easy-to-read output to that file.

In the result file, it should provide a summary of the canary test and show 
which regions have read or write failures.

It should also include some stats regarding the test, such as "region read count: 
500, region read success count: 499".

(Previously, the Canary test wrote some of the above information to the log file, 
mixed with other debugging information, and the format varied between tests, 
which made it hard to parse.)

  was:
Canary test to expose results per table/ per region to a result file.

In the result file, it should provide a summary of the canary test and shows 
which region has read or write failures.

Also, include some stats regarding the test. Such as, "region read count: 500, 
region read success count: 499"


> Canary test to expose results per table/ per region to result file
> --
>
> Key: HBASE-20925
> URL: https://issues.apache.org/jira/browse/HBASE-20925
> Project: HBase
>  Issue Type: Improvement
>  Components: canary
>Reporter: Xu Cang
>Assignee: Xu Cang
>Priority: Major
>
> This change will provide a new command-line argument, 
> "-verboseTestResultFilePath", and Canary will write structured, easy-to-parse, 
> easy-to-read output to that file.
> In the result file, it should provide a summary of the canary test and show 
> which regions have read or write failures.
> It should also include some stats regarding the test, such as "region read count: 
> 500, region read success count: 499".
>  
> (Previously, the Canary test wrote some of the above information to the log 
> file, mixed with other debugging information, and the format varied between 
> tests, which made it hard to parse.)
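
A hedged sketch of driving the proposed option programmatically; the flag name 
comes from the description above, while the Tool/ToolRunner wiring is the usual 
way Canary is launched and may differ in the final patch:

{code:java}
// Hypothetical invocation of the new option; equivalent to passing
// -verboseTestResultFilePath on the `hbase canary` command line.
int exitCode = ToolRunner.run(HBaseConfiguration.create(), new Canary(),
    new String[] { "-verboseTestResultFilePath", "/tmp/canary-results.txt" });
{code}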



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20895) NPE in RpcServer#readAndProcess

2018-07-25 Thread Andrew Purtell (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-20895:
---
Attachment: HBASE-20895-branch-1.patch

> NPE in RpcServer#readAndProcess
> ---
>
> Key: HBASE-20895
> URL: https://issues.apache.org/jira/browse/HBASE-20895
> Project: HBase
>  Issue Type: Bug
>  Components: rpc
>Affects Versions: 1.3.2
>Reporter: Andrew Purtell
>Assignee: Monani Mihir
>Priority: Major
> Fix For: 1.5.0, 1.3.3, 1.4.7
>
> Attachments: HBASE-20895-branch-1.patch, HBASE-20895-branch-1.patch
>
>
> {noformat}
> 2018-07-10 16:25:55,005 DEBUG [.sfdc.net,port=60020] ipc.RpcServer - 
> RpcServer.listener,port=60020: Caught exception while reading:
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hbase.ipc.RpcServer$Connection.readAndProcess(RpcServer.java:1761)
> at 
> org.apache.hadoop.hbase.ipc.RpcServer$Listener.doRead(RpcServer.java:949)
> at 
> org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.doRunLoop(RpcServer.java:730)
> at 
> org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.run(RpcServer.java:706)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> This looks like it could be a use-after-close problem if there is concurrent 
> access to a Connection.
> In process() we might store a null back to the 'data' field.
> Meanwhile in readAndProcess() we have a case where we might be blocked on a 
> channel read, and after coming back from the read we go to use 'data' 
> after a null has been written back, leading to an NPE.
> {quote}count = channelRead(channel, data);
>  1761 ---> if (count >= 0 && *data.remaining()* == 0)
>  \{ process(); }{quote}
> Whether an NPE happens or not depends on the timing of the store 
> back to 'data' in another thread, the use of 'data' in this thread, and whether 
> or not the JVM has optimized away a reload of 'data' (it's not declared 
> volatile).
> We should do a null check here just to be defensive. We should also look at 
> whether concurrent access to the Connection is happening and intended. The 
> above is just a theory. We should also look at other execution sequences that 
> could lead to 'data' being null in this location. At a glance I didn't find 
> one, but the store to 'data' happens behind conditionals, so it is possible.
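
A minimal sketch of the defensive null check suggested above (the names come 
from the quoted branch-1 code; a real fix may also address the concurrent 
access itself):

{code:java}
// Hedged sketch: re-read 'data' once into a local and null-check it, so a
// concurrent process() that nulls the field cannot trigger an NPE here.
int count = channelRead(channel, data);
ByteBuffer localData = this.data;
if (count >= 0 && localData != null && localData.remaining() == 0) {
  process();
}
{code}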



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20815) In TestServerCrashProcedure collect and assert on submitted and failed counts for ServerCrashProcedure

2018-07-25 Thread Umesh Agashe (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Umesh Agashe updated HBASE-20815:
-
Attachment: HBASE-20815.master.002.patch

> In TestServerCrashProcedure collect and assert on submitted and failed counts 
> for ServerCrashProcedure
> --
>
> Key: HBASE-20815
> URL: https://issues.apache.org/jira/browse/HBASE-20815
> Project: HBase
>  Issue Type: Bug
>  Components: amv2
>Reporter: Umesh Agashe
>Assignee: Xu Cang
>Priority: Minor
> Attachments: HBASE-20815.master.001.patch, 
> HBASE-20815.master.002.patch, HBASE-20815.master.002.patch
>
>
> We need to collect and possibly assert on number of procedures submitted and 
> failed for ServerCrashProcedures.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20815) In TestServerCrashProcedure collect and assert on submitted and failed counts for ServerCrashProcedure

2018-07-25 Thread Umesh Agashe (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16556075#comment-16556075
 ] 

Umesh Agashe commented on HBASE-20815:
--

Unit test for patch 002 failed. Failure doesn't look related to the changes. 
Retrying.

> In TestServerCrashProcedure collect and assert on submitted and failed counts 
> for ServerCrashProcedure
> --
>
> Key: HBASE-20815
> URL: https://issues.apache.org/jira/browse/HBASE-20815
> Project: HBase
>  Issue Type: Bug
>  Components: amv2
>Reporter: Umesh Agashe
>Assignee: Xu Cang
>Priority: Minor
> Attachments: HBASE-20815.master.001.patch, 
> HBASE-20815.master.002.patch, HBASE-20815.master.002.patch
>
>
> We need to collect and possibly assert on number of procedures submitted and 
> failed for ServerCrashProcedures.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Reopened] (HBASE-20893) Data loss if splitting region while ServerCrashProcedure executing

2018-07-25 Thread stack (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack reopened HBASE-20893:
---

Reopening to look at these logs I see running this patch on a cluster (It's great 
that it detected recovered.edits... but it looks like the patch causes us to hit 
CODE-BUG... though we seem to be ok... minimally, it will freak out an operator):

{code}

2018-07-25 06:46:56,692 ERROR [PEWorker-3] 
assignment.SplitTableRegionProcedure: Error trying to split region 
2cb977a87bc6bdf90ef7fc71320d7b50 in the table IntegrationTestBigLinkedList (in 
state=SPLIT_TABLE_REGIONS_CHECK_CLOSED_REGIONS)
java.io.IOException: Recovered.edits are found in Region: {ENCODED => 
2cb977a87bc6bdf90ef7fc71320d7b50, NAME => 
'IntegrationTestBigLinkedList,z\xAA;\xC7M\x1Bf8\x85\xB5\x07\xD5\x9B#\xCD\xCC,1531911202047.2cb977a87bc6bdf90ef7fc71320d7b50.',
 STARTKEY => 'z\xAA;\xC7M\x1Bf8\x85\xB5\x07\xD5\x9B#\xCD\xCC', ENDKEY => 
'{\x8D\xF2?'}, abort split to prevent data loss
  at 
org.apache.hadoop.hbase.master.assignment.SplitTableRegionProcedure.checkClosedRegion(SplitTableRegionProcedure.java:151)
  at 
org.apache.hadoop.hbase.master.assignment.SplitTableRegionProcedure.executeFromState(SplitTableRegionProcedure.java:259)
  at 
org.apache.hadoop.hbase.master.assignment.SplitTableRegionProcedure.executeFromState(SplitTableRegionProcedure.java:92)
  at 
org.apache.hadoop.hbase.procedure2.StateMachineProcedure.execute(StateMachineProcedure.java:184)
  at org.apache.hadoop.hbase.procedure2.Procedure.doExecute(Procedure.java:850)
  at 
org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1472)
  at 
org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1240)

   at 
org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$800(ProcedureExecutor.java:75)
  at 
org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1760)

 2018-07-25 06:46:56,934 INFO  [PEWorker-3] 
procedure.MasterProcedureScheduler: pid=4106, ppid=4105, state=SUCCESS; 
UnassignProcedure table=IntegrationTestBigLinkedList, 
region=2cb977a87bc6bdf90ef7fc71320d7b50, 
server=ve0540.halxg.cloudera.com,16020,1532501580658 checking lock on 
2cb977a87bc6bdf90ef7fc71320d7b50

2018-07-25 06:46:56,934 ERROR [PEWorker-3] procedure2.ProcedureExecutor: 
CODE-BUG: Uncaught runtime exception for pid=4106, ppid=4105, state=SUCCESS; 
UnassignProcedure table=IntegrationTestBigLinkedList, 
region=2cb977a87bc6bdf90ef7fc71320d7b50, 
server=ve0540.halxg.cloudera.com,16020,1532501580658

   java.lang.UnsupportedOperationException: 
Unhandled state REGION_TRANSITION_FINISH; there is no rollback for assignment 
unless we cancel the operation by dropping/disabling the table
  at 
org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure.rollback(RegionTransitionProcedure.java:412)
  at 
org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure.rollback(RegionTransitionProcedure.java:95)

  at 
org.apache.hadoop.hbase.procedure2.Procedure.doRollback(Procedure.java:864)
  at 
org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1372)
  at 
org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1328)

at 
org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1197)
  at 
org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$800(ProcedureExecutor.java:75)

   at 
org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1760)
2018-07-25 06:46:57,088 ERROR [PEWorker-3] procedure2.ProcedureExecutor: 
CODE-BUG: Uncaught runtime exception for pid=4106, ppid=4105, state=SUCCESS; 
UnassignProcedure table=IntegrationTestBigLinkedList, 
region=2cb977a87bc6bdf90ef7fc71320d7b50, 
server=ve0540.halxg.cloudera.com,16020,1532501580658

   java.lang.UnsupportedOperationException: 
Unha

[jira] [Commented] (HBASE-18822) Create table for peer cluster automatically when creating table in source cluster of using namespace replication.

2018-07-25 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-18822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16556097#comment-16556097
 ] 

Hadoop QA commented on HBASE-18822:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
9s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
12s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
53s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  3m 
32s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
57s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
38s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m 
15s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
9s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
14s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  3m 
26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  3m 
26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  3m 
26s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m  
9s{color} | {color:red} hbase-server: The patch generated 1 new + 12 unchanged 
- 0 fixed = 13 total (was 12) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
29s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green}  
9m 59s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 
or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} hbaseprotoc {color} | {color:green}  
1m 37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
8s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
31s{color} | {color:green} hbase-protocol-shaded in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
22s{color} | {color:green} hbase-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
21s{color} | {color:green} hbase-replication in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}123m 
41s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  1m 
19s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}184m 20s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-18822 |
| JIRA Patch URL | 
https://issues.a

[jira] [Created] (HBASE-20941) Cre

2018-07-25 Thread Umesh Agashe (JIRA)
Umesh Agashe created HBASE-20941:


 Summary: Cre
 Key: HBASE-20941
 URL: https://issues.apache.org/jira/browse/HBASE-20941
 Project: HBase
  Issue Type: Sub-task
Reporter: Umesh Agashe
Assignee: Umesh Agashe


Create HbckService in master and implement the following methods:
 # purgeProcedure/s(): some procedures do not support abort at every step. When 
these procedures get stuck, they can neither be aborted nor make further 
progress. The corrective action is to purge these procedures from the ProcWAL. 
Provide an option to purge sub-procedures as well.
 # setTable/RegionState(): if the table/region state is inconsistent with the 
actions/procedures working on them, manipulating their states in meta sometimes 
fixes things.
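
A hedged sketch of what the service surface could look like; the signatures 
below are assumptions derived from the two bullets, not the committed API:

{code:java}
// Hypothetical shape of the master-side HbckService described above.
public interface HbckService {
  /** Purge a stuck procedure (and optionally its sub-procedures) from the ProcWAL. */
  void purgeProcedure(long procId, boolean purgeSubProcedures) throws IOException;
  /** Force a table state in meta when it is inconsistent with running procedures. */
  void setTableState(TableName tableName, TableState.State state) throws IOException;
  /** Force a region state in meta, for the same kind of inconsistency. */
  void setRegionState(byte[] encodedRegionName, RegionState.State state) throws IOException;
}
{code}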



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20941) Create and implement HbckService in master

2018-07-25 Thread Umesh Agashe (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Umesh Agashe updated HBASE-20941:
-
Summary: Create and implement HbckService in master  (was: Cre)

> Create and implement HbckService in master
> --
>
> Key: HBASE-20941
> URL: https://issues.apache.org/jira/browse/HBASE-20941
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Umesh Agashe
>Assignee: Umesh Agashe
>Priority: Major
>
> Create HbckService in master and implement the following methods:
>  # purgeProcedure/s(): some procedures do not support abort at every step. 
> When these procedures get stuck, they can neither be aborted nor make further 
> progress. The corrective action is to purge these procedures from the ProcWAL. 
> Provide an option to purge sub-procedures as well.
>  # setTable/RegionState(): if the table/region state is inconsistent with the 
> actions/procedures working on them, manipulating their states in meta 
> sometimes fixes things.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-18477) Umbrella JIRA for HBase Read Replica clusters

2018-07-25 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-18477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16556202#comment-16556202
 ] 

Hudson commented on HBASE-18477:


Results for branch HBASE-18477
[build #275 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-18477/275/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-18477/275//General_Nightly_Build_Report/]




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-18477/275//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-18477/275//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(x) {color:red}-1 client integration test{color}
--Failed when running client tests on top of Hadoop 2. [see log for 
details|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-18477/275//artifact/output-integration/hadoop-2.log].
 (note that this means we didn't run on Hadoop 3)


> Umbrella JIRA for HBase Read Replica clusters
> -
>
> Key: HBASE-18477
> URL: https://issues.apache.org/jira/browse/HBASE-18477
> Project: HBase
>  Issue Type: New Feature
>Reporter: Zach York
>Assignee: Zach York
>Priority: Major
> Attachments: HBase Read-Replica Clusters Scope doc.docx, HBase 
> Read-Replica Clusters Scope doc.pdf, HBase Read-Replica Clusters Scope 
> doc_v2.docx, HBase Read-Replica Clusters Scope doc_v2.pdf
>
>
> Recently, changes (such as HBASE-17437) have unblocked HBase to run with a 
> root directory external to the cluster (such as in Amazon S3). This means 
> that the data is stored outside of the cluster and can be accessible after 
> the cluster has been terminated. One use case that is often asked about is 
> pointing multiple clusters to one root directory (sharing the data) to have 
> read resiliency in the case of a cluster failure.
>  
> This JIRA is an umbrella JIRA to contain all the tasks necessary to create a 
> read-replica HBase cluster that is pointed at the same root directory.
>  
> This requires making the Read-Replica cluster read-only (no metadata 
> operations or data operations).
> Separating the hbase:meta table for each cluster (otherwise HBase gets 
> confused by multiple clusters trying to update the meta table with their IP 
> addresses).
> Adding refresh functionality for the meta table to ensure new metadata is 
> picked up on the read replica cluster.
> Adding refresh functionality for HFiles for a given table to ensure new data 
> is picked up on the read replica cluster.
>  
> This can be used with any existing cluster that is backed by an external 
> filesystem.
>  
> Please note that this feature is still quite manual (with the potential for 
> automation later).
>  
> More information on this particular feature can be found here: 
> https://aws.amazon.com/blogs/big-data/setting-up-read-replica-clusters-with-hbase-on-amazon-s3/



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20930) MetaScanner.metaScan should use passed variable for meta table name rather than TableName.META_TABLE_NAME

2018-07-25 Thread Josh Elser (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16556236#comment-16556236
 ] 

Josh Elser commented on HBASE-20930:


[~vishk], looks like you attached the wrong file.

> MetaScanner.metaScan should use passed variable for meta table name rather 
> than TableName.META_TABLE_NAME
> -
>
> Key: HBASE-20930
> URL: https://issues.apache.org/jira/browse/HBASE-20930
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 1.3.3
>Reporter: Vishal Khandelwal
>Assignee: Vishal Khandelwal
>Priority: Minor
> Fix For: 1.3.3
>
> Attachments: HBASE-20935.branch-1.3.patch
>
>
> In MetaScanner.metaScan,
>  try (Table metaTable = new HTable(TableName.META_TABLE_NAME, connection, 
> null)) {
> should be changed so that the call uses the passed meta table name, i.e.
> metaScan(connection, visitor, userTableName, null, Integer.MAX_VALUE, 
> metaTableName)
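
A minimal sketch of the proposed change inside MetaScanner.metaScan, assuming a 
metaTableName parameter is threaded through as the description suggests:

{code:java}
// Hedged sketch: open the meta table named by the caller instead of the
// hard-coded TableName.META_TABLE_NAME constant.
try (Table metaTable = new HTable(metaTableName, connection, null)) {
  // existing scan logic, unchanged
}
{code}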



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HBASE-20942) Make RpcServer trace log length configurable

2018-07-25 Thread Mike Drob (JIRA)
Mike Drob created HBASE-20942:
-

 Summary: Make RpcServer trace log length configurable
 Key: HBASE-20942
 URL: https://issues.apache.org/jira/browse/HBASE-20942
 Project: HBase
  Issue Type: Task
Reporter: Esteban Gutierrez


We truncate RpcServer output to 1000 characters for trace logging. It would be 
better if that value were configurable.

Esteban mentioned this to me earlier, so I'm crediting him as the reporter.

cc: [~elserj]
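
A hedged sketch of such a knob; the property name and default below are 
illustrative assumptions, not a committed config key:

{code:java}
// Hypothetical configuration point for the trace-log truncation length.
static final String TRACE_LOG_MAX_LENGTH_KEY = "hbase.ipc.trace.log.max.length";
static final int DEFAULT_TRACE_LOG_MAX_LENGTH = 1000; // the current hard-coded value
int traceLogMaxLength = conf.getInt(TRACE_LOG_MAX_LENGTH_KEY, DEFAULT_TRACE_LOG_MAX_LENGTH);
{code}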



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20932) Effective MemStoreSize::hashCode()

2018-07-25 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16556243#comment-16556243
 ] 

Hadoop QA commented on HBASE-20932:
---

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
12s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:orange}-0{color} | {color:orange} test4tests {color} | {color:orange}  
0m  0s{color} | {color:orange} The patch doesn't appear to include any new or 
modified tests. Please justify why no new tests are needed for this patch. Also 
please list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
 8s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
45s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 9s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
35s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
8s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
31s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 8s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
31s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
10m 14s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
30s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}122m 
15s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
22s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}163m 41s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-20932 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12932970/HBASE-20932.001.patch 
|
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  shadedjars  
hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux c92b0bc8a255 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 
08:52:28 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | master / 1913164970 |
| maven | version: Apache Maven 3.5.4 
(1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| Default Java | 1.8.0_171 |
| findbugs | v3.1.0-RC3 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HBASE-Build/13792/testReport/ |
| Max. process+thread count | 4955 (vs. ulimit of 1) |
| modules | C: hbase-server U: hbase-server |
| Console output | 
https://builds.apache.org/job/PreCommit-HBASE-Build/13792/c

[jira] [Commented] (HBASE-20899) Add Hadoop KMS dependency and basic HDFS at-rest encryption tests

2018-07-25 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16556276#comment-16556276
 ] 

Hadoop QA commented on HBASE-20899:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
20s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
14s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
35s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  6m 
55s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
56s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
25s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: . {color} 
|
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
1s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  3m 
40s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
16s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  8m 
15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  8m 
15s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  2m  
5s{color} | {color:red} root: The patch generated 2 new + 0 unchanged - 0 fixed 
= 2 total (was 0) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
3s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
13s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:red}-1{color} | {color:red} hadoopcheck {color} | {color:red}  6m 
11s{color} | {color:red} The patch causes 10 errors with Hadoop v3.0.0. {color} 
|
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: . {color} 
|
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  3m  
3s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}216m 14s{color} 
| {color:red} root in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
45s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}273m 37s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-20899 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12933065/HBASE-20899.master.001.patch
 |
| Optional Tests |  asflicense  javac  javadoc  unit  shadedjars  hadoopcheck  
xml  compile  findbugs  hbaseanti  checkstyle  |
| uname | Linux 639b94b627b7 4.4.0-130-generic #156-Ubuntu SMP Thu Jun 14 
08:53:28 UTC 2018 

[jira] [Commented] (HBASE-20932) Effective MemStoreSize::hashCode()

2018-07-25 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16556298#comment-16556298
 ] 

Ted Yu commented on HBASE-20932:


Mingliang:

Can you generate the patch with a proper header?

You can find an example on HBASE-20928.
That way, your identity will appear as the author in the commit log.

> Effective MemStoreSize::hashCode() 
> ---
>
> Key: HBASE-20932
> URL: https://issues.apache.org/jira/browse/HBASE-20932
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.2
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
>Priority: Major
> Attachments: HBASE-20932.001.patch
>
>
> After HBASE-20411 we have
> {code:java|title=MemStoreSize::hashCode()}
>   @Override
>   public int hashCode() {
> long h = 31 * this.dataSize;
> h = h + 31 * this.heapSize;
> h = h + 31 * this.offHeapSize;
> return (int) h;
>   }
>  {code}
> This is not an effective {{hashCode()}} implementation: each field is multiplied 
> by the same factor, so the result collapses to 31 * (dataSize + heapSize + 
> offHeapSize) and any permutation of the three sizes yields the same hash. 
> Instead we can use:
> {code:java|title=MemStoreSize::hashCode()}
>   @Override
>   public int hashCode() {
> long h = this.dataSize;
> h = h * 31 + this.heapSize;
> h = h * 31 + this.offHeapSize;
> return (int) h;
>   }
>  {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20899) Add Hadoop KMS dependency and basic HDFS at-rest encryption tests

2018-07-25 Thread Wei-Chiu Chuang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16556301#comment-16556301
 ] 

Wei-Chiu Chuang commented on HBASE-20899:
-

I forgot that in Hadoop 3, KMS switched from Tomcat to Jetty and therefore the 
package changed. Will update with a new patch.

> Add Hadoop KMS dependency and basic HDFS at-rest encryption tests
> -
>
> Key: HBASE-20899
> URL: https://issues.apache.org/jira/browse/HBASE-20899
> Project: HBase
>  Issue Type: Sub-task
>  Components: encryption
>Affects Versions: 2.0.0
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
>Priority: Major
> Attachments: HBASE-20899.master.001.patch
>
>
> We should start by adding the hadoop-kms dependency in HBase test scope, and add 
> basic HDFS at-rest encryption tests using it.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20932) Effective MemStoreSize::hashCode()

2018-07-25 Thread Mingliang Liu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HBASE-20932:
--
Attachment: HBASE-20932.002.patch

> Effective MemStoreSize::hashCode() 
> ---
>
> Key: HBASE-20932
> URL: https://issues.apache.org/jira/browse/HBASE-20932
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.2
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
>Priority: Major
> Attachments: HBASE-20932.001.patch, HBASE-20932.002.patch
>
>
> After HBASE-20411 we have
> {code:java|title=MemStoreSize::hashCode()}
>   @Override
>   public int hashCode() {
> long h = 31 * this.dataSize;
> h = h + 31 * this.heapSize;
> h = h + 31 * this.offHeapSize;
> return (int) h;
>   }
>  {code}
> This is not an effective {{hashCode()}} implementation. Instead we can use:
> {code:java|title=MemStoreSize::hashCode()}
>   @Override
>   public int hashCode() {
> long h = this.dataSize;
> h = h * 31 + this.heapSize;
> h = h * 31 + this.offHeapSize;
> return (int) h;
>   }
>  {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20932) Effective MemStoreSize::hashCode()

2018-07-25 Thread Mingliang Liu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16556303#comment-16556303
 ] 

Mingliang Liu commented on HBASE-20932:
---

Thanks [~yuzhih...@gmail.com]. I have generated the same patch, now with the 
proper meta info, as [^HBASE-20932.002.patch].

> Effective MemStoreSize::hashCode() 
> ---
>
> Key: HBASE-20932
> URL: https://issues.apache.org/jira/browse/HBASE-20932
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.2
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
>Priority: Major
> Attachments: HBASE-20932.001.patch, HBASE-20932.002.patch
>
>
> After HBASE-20411 we have
> {code:java|title=MemStoreSize::hashCode()}
>   @Override
>   public int hashCode() {
> long h = 31 * this.dataSize;
> h = h + 31 * this.heapSize;
> h = h + 31 * this.offHeapSize;
> return (int) h;
>   }
>  {code}
> This is not an effective {{hashCode()}} implementation. Instead we can use:
> {code:java|title=MemStoreSize::hashCode()}
>   @Override
>   public int hashCode() {
> long h = this.dataSize;
> h = h * 31 + this.heapSize;
> h = h * 31 + this.offHeapSize;
> return (int) h;
>   }
>  {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20932) Effective MemStoreSize::hashCode()

2018-07-25 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-20932:
---
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 2.2.0
   Status: Resolved  (was: Patch Available)

Thanks for the patch, Mingliang

> Effective MemStoreSize::hashCode() 
> ---
>
> Key: HBASE-20932
> URL: https://issues.apache.org/jira/browse/HBASE-20932
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.2
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
>Priority: Major
> Fix For: 2.2.0
>
> Attachments: HBASE-20932.001.patch, HBASE-20932.002.patch
>
>
> After HBASE-20411 we have
> {code:java|title=MemStoreSize::hashCode()}
>   @Override
>   public int hashCode() {
> long h = 31 * this.dataSize;
> h = h + 31 * this.heapSize;
> h = h + 31 * this.offHeapSize;
> return (int) h;
>   }
>  {code}
> This is not an effective {{hashCode()}} implementation. Instead we can use:
> {code:java|title=MemStoreSize::hashCode()}
>   @Override
>   public int hashCode() {
> long h = this.dataSize;
> h = h * 31 + this.heapSize;
> h = h * 31 + this.offHeapSize;
> return (int) h;
>   }
>  {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20867) RS may get killed while master restarts

2018-07-25 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16556316#comment-16556316
 ] 

Hudson commented on HBASE-20867:


Results for branch branch-2.1
[build #103 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/103/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(x) {color:red}-1 general checks{color}
-- Something went wrong running this stage, please [check relevant console 
output|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/103//console].




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- Something went wrong running this stage, please [check relevant console 
output|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/103//console].


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/103//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.




> RS may get killed while master restarts
> ---
>
> Key: HBASE-20867
> URL: https://issues.apache.org/jira/browse/HBASE-20867
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 3.0.0, 2.1.0, 2.0.1
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Major
> Fix For: 3.0.0, 2.0.2, 2.1.1
>
> Attachments: HBASE-20867.branch-2.0.001.patch, 
> HBASE-20867.branch-2.0.002.patch, HBASE-20867.branch-2.0.003.patch, 
> HBASE-20867.branch-2.0.004.patch, HBASE-20867.branch-2.0.005.patch, 
> HBASE-20867.branch-2.0.006.patch
>
>
> If the master is dispatching an RPC call to an RS while aborting, a connection 
> exception may be thrown by the RPC layer (an IOException with a "Connection 
> closed" message in this case). The RSProcedureDispatcher will regard it as an 
> un-retryable exception and pass it to UnassignProcedure.remoteCallFailed, 
> which will expire the RS.
> Actually, the RS is perfectly healthy; only the master is restarting.
> I think we should deal with those kinds of connection exceptions in 
> RSProcedureDispatcher and retry the RPC call.
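
A rough illustration of the proposed classification (hypothetical method and 
message check, not the actual HBase code; the real fix would live in 
RSProcedureDispatcher):

{code:java|title=RetrySketch.java (illustrative only)}
import java.io.IOException;

public class RetrySketch {

  // Hypothetical: treat connection-level failures as retryable rather than
  // fatal, so a restarting master does not expire a healthy RS.
  static boolean isRetryableConnectionError(IOException e) {
    String msg = e.getMessage();
    return msg != null && msg.contains("Connection closed");
  }

  public static void main(String[] args) {
    IOException e = new IOException("Connection closed");
    System.out.println(isRetryableConnectionError(e)
        ? "retry the RPC" : "expire the RS");
  }
}
{code}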



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20846) Restore procedure locks when master restarts

2018-07-25 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16556315#comment-16556315
 ] 

Hudson commented on HBASE-20846:


Results for branch branch-2.1
[build #103 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/103/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(x) {color:red}-1 general checks{color}
-- Something went wrong running this stage, please [check relevant console 
output|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/103//console].




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- Something went wrong running this stage, please [check relevant console 
output|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/103//console].


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/103//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.




> Restore procedure locks when master restarts
> 
>
> Key: HBASE-20846
> URL: https://issues.apache.org/jira/browse/HBASE-20846
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.1.0
>Reporter: Allan Yang
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 3.0.0, 2.2.0, 2.1.1
>
> Attachments: HBASE-20846-v1.patch, HBASE-20846-v2.patch, 
> HBASE-20846-v3.patch, HBASE-20846-v4.patch, HBASE-20846-v4.patch, 
> HBASE-20846-v4.patch, HBASE-20846-v5.patch, HBASE-20846-v6.patch, 
> HBASE-20846.branch-2.0.002.patch, HBASE-20846.branch-2.0.patch, 
> HBASE-20846.patch
>
>
> Found this one when investigating a ModifyTableProcedure that got stuck while 
> a MoveRegionProcedure was going on after a master restart.
> Though this issue can be solved by HBASE-20752, I discovered something else.
> Before a MoveRegionProcedure can execute, it will hold the table's shared 
> lock. So, when an UnassignProcedure is spawned, it will not check the 
> table's shared lock, since it is sure that its parent (MoveRegionProcedure) 
> has acquired the table's lock.
> {code:java}
> // If there is parent procedure, it would have already taken xlock, so no need
> // to take shared lock here. Otherwise, take shared lock.
> if (!procedure.hasParent()
>     && waitTableQueueSharedLock(procedure, table) == null) {
>   return true;
> }
> {code}
> But this is not the case when the Master is restarted. The child procedure 
> (UnassignProcedure) will be executed first after the restart. Though it has a 
> parent (MoveRegionProcedure), the parent apparently does not hold the 
> table's lock.
> So it begins to execute without holding the table's shared lock, and a 
> ModifyTableProcedure can acquire the table's exclusive lock and execute at 
> the same time, which would not be possible if the master had not restarted.
> This caused a hang before HBASE-20752. Since HBASE-20752 has been fixed, I 
> wrote a simple UT to reproduce this case.
> I think we don't have to check the parent for the table's shared lock. It is 
> a shared lock, right? I think we can acquire it every time we need it, as 
> sketched below.
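
An illustrative sketch of that proposal (not the committed patch; it simply 
drops the parent check from the fragment quoted above, so the shared lock is 
always taken):

{code:java}
// Sketch: always take the table's shared lock, whether or not the procedure
// has a parent, so a restarted child procedure cannot run unlocked.
if (waitTableQueueSharedLock(procedure, table) == null) {
  return true;
}
{code}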



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20815) In TestServerCrashProcedure collect and assert on submitted and failed counts for ServerCrashProcedure

2018-07-25 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16556322#comment-16556322
 ] 

Hadoop QA commented on HBASE-20815:
---

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
10s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
45s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
47s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 9s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
32s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
12s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
30s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
34s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
10m  6s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
31s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}122m 
22s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
21s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}163m 30s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-20815 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12933080/HBASE-20815.master.002.patch
 |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  shadedjars  
hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux 134628383ae4 3.13.0-143-generic #192-Ubuntu SMP Tue Feb 27 
10:45:36 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | master / 1913164970 |
| maven | version: Apache Maven 3.5.4 
(1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| Default Java | 1.8.0_171 |
| findbugs | v3.1.0-RC3 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HBASE-Build/13794/testReport/ |
| Max. process+thread count | 4735 (vs. ulimit of 1) |
| modules | C: hbase-server U: hbase-server |
| Console output | 
https://builds.apache.org/job/PreCommit-HBASE-Build/13794/console |
| Powered by | Apache Yetus 0.7.0   http://yetus.apache.org |


This message was automatically generated.



> In TestServer

[jira] [Commented] (HBASE-20932) Effective MemStoreSize::hashCode()

2018-07-25 Thread Mingliang Liu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16556324#comment-16556324
 ] 

Mingliang Liu commented on HBASE-20932:
---

Thanks [~yuzhih...@gmail.com] for the prompt review and commit!

> Effective MemStoreSize::hashCode() 
> ---
>
> Key: HBASE-20932
> URL: https://issues.apache.org/jira/browse/HBASE-20932
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.2
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
>Priority: Major
> Fix For: 2.2.0
>
> Attachments: HBASE-20932.001.patch, HBASE-20932.002.patch
>
>
> After HBASE-20411 we have
> {code:java|title=MemStoreSize::hashCode()}
>   @Override
>   public int hashCode() {
> long h = 31 * this.dataSize;
> h = h + 31 * this.heapSize;
> h = h + 31 * this.offHeapSize;
> return (int) h;
>   }
>  {code}
> This is not an effective {{hashCode()}} implementation. Instead we can use:
> {code:java|title=MemStoreSize::hashCode()}
>   @Override
>   public int hashCode() {
> long h = this.dataSize;
> h = h * 31 + this.heapSize;
> h = h * 31 + this.offHeapSize;
> return (int) h;
>   }
>  {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

