from:"Guanghao Zhang"

[jira] [Created] (HBASE-23633) Find a way to handle the corrupt recovered hfiles

2020-01-03 Thread Guanghao Zhang (Jira)

Guanghao Zhang created HBASE-23633:
--

 Summary: Find a way to handle the corrupt recovered hfiles
 Key: HBASE-23633
 URL: https://issues.apache.org/jira/browse/HBASE-23633
 Project: HBase
  Issue Type: Umbrella
Reporter: Guanghao Zhang


Copy the comment from PR review.

 

If the file is a corrupt HFile, an exception will be thrown here, which will 
cause the region to fail to open.
Maybe we can add a new parameter to control whether to skip the exception, 
similar to recovered edits, which has the parameter 
"hbase.hregion.edits.replay.skip.errors".

 

Regions that can't be opened because of detached References or corrupt hfiles 
are a fact of life. We need to work on this issue. This will be a new variant of 
the problem, i.e. bad recovered hfiles.

On adding a config to ignore bad files and just open: that's a bit dangerous, as 
per @infraio, since it could mean silent data loss.
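
The trade-off discussed above can be illustrated with a minimal, HBase-independent sketch. Everything in it is an assumption made for illustration: the class `RecoveredFileLoader`, the key `example.recovered.hfiles.skip.errors` (named only by analogy with "hbase.hregion.edits.replay.skip.errors") and the `isCorrupt` predicate are hypothetical and are not part of HBase.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

public class RecoveredFileLoader {
  // Hypothetical key, named by analogy with hbase.hregion.edits.replay.skip.errors.
  static final String SKIP_ERRORS_KEY = "example.recovered.hfiles.skip.errors";

  /** Returns the files that were skipped; throws if a file is corrupt and skipping is off. */
  static List<String> load(List<String> files, Predicate<String> isCorrupt,
      Map<String, String> conf) throws IOException {
    boolean skipErrors = Boolean.parseBoolean(conf.getOrDefault(SKIP_ERRORS_KEY, "false"));
    List<String> skipped = new ArrayList<>();
    for (String file : files) {
      if (!isCorrupt.test(file)) {
        continue; // a real implementation would load the file here
      }
      if (!skipErrors) {
        // Default behaviour: fail, so the region does not open with missing data.
        throw new IOException("Corrupt recovered file: " + file);
      }
      // Skipping means data loss, so report it loudly instead of silently.
      System.err.println("WARN skipping corrupt recovered file " + file);
      skipped.add(file);
    }
    return skipped;
  }

  public static void main(String[] args) throws IOException {
    List<String> files = List.of("a.hfile", "bad.hfile");
    Predicate<String> isCorrupt = f -> f.startsWith("bad");
    System.out.println(load(files, isCorrupt, Map.of(SKIP_ERRORS_KEY, "true")));
  }
}
```

Keeping the default at "fail" and returning the list of skipped files is one way to address the silent data loss concern: the caller can at least log or archive what was dropped.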



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Resolved] (HBASE-23286) Improve MTTR: Split WAL to HFile

2020-01-03 Thread Guanghao Zhang (Jira)



 [ 
https://issues.apache.org/jira/browse/HBASE-23286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang resolved HBASE-23286.

Resolution: Fixed

Pushed to branch-2 and master. Thanks all for reviewing. I also opened two 
follow-up issues.

> Improve MTTR: Split WAL to HFile
> 
>
> Key: HBASE-23286
> URL: https://issues.apache.org/jira/browse/HBASE-23286
> Project: HBase
>  Issue Type: Improvement
>  Components: MTTR
>Affects Versions: 3.0.0, 2.3.0
>    Reporter: Guanghao Zhang
>    Assignee: Guanghao Zhang
>Priority: Major
> Fix For: 3.0.0, 2.3.0
>
>
> After HBASE-20724, the compaction event marker is no longer used during 
> failover. So our new proposal is to split the WAL to HFiles to improve MTTR. It has 3 
> steps:
>  # Read the WAL and write HFiles to the region’s column family’s recovered.hfiles 
> directory.
>  # Open the region.
>  # Bulk load the recovered.hfiles for every column family.
> The design doc is attached as a Google doc. Any suggestions are welcome.
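
As a rough illustration of step 1 above, the sketch below buckets simplified WAL edits by their region and column family directory and keeps the rows in each bucket sorted. It is a toy model under stated assumptions: the `Edit` class, the `groupByRecoveredDir` method and the `region/family/recovered.hfiles` path string are hypothetical stand-ins, not HBase types or the actual on-disk layout, and steps 2 and 3 are only printed.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class SplitWalSketch {
  /** Simplified WAL edit; the field names are illustrative, not HBase types. */
  static final class Edit {
    final String region, family, row, value;
    Edit(String region, String family, String row, String value) {
      this.region = region; this.family = family; this.row = row; this.value = value;
    }
  }

  /** Step 1: bucket edits per region/family directory, rows kept sorted within a bucket. */
  static Map<String, TreeMap<String, String>> groupByRecoveredDir(List<Edit> wal) {
    Map<String, TreeMap<String, String>> out = new TreeMap<>();
    for (Edit e : wal) {
      String dir = e.region + "/" + e.family + "/recovered.hfiles";
      // In this toy model a later edit to the same row replaces the earlier one.
      out.computeIfAbsent(dir, d -> new TreeMap<>()).put(e.row, e.value);
    }
    return out;
  }

  public static void main(String[] args) {
    List<Edit> wal = List.of(
        new Edit("r1", "cf", "row2", "b"),
        new Edit("r1", "cf", "row1", "a"),
        new Edit("r2", "cf", "row9", "z"));
    // Steps 2 and 3 (open the region, bulk load each directory) are only printed here.
    groupByRecoveredDir(wal).forEach((dir, rows) ->
        System.out.println("bulk load " + dir + " <- " + rows));
  }
}
```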




[jira] [Reopened] (HBASE-23175) Yarn unable to acquire delegation token for HBase Spark jobs

2020-01-03 Thread Guanghao Zhang (Jira)



 [ 
https://issues.apache.org/jira/browse/HBASE-23175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang reopened HBASE-23175:


Forgot to push to branch-2.2. Reopening.

> Yarn unable to acquire delegation token for HBase Spark jobs
> 
>
> Key: HBASE-23175
> URL: https://issues.apache.org/jira/browse/HBASE-23175
> Project: HBase
>  Issue Type: Bug
>  Components: security, spark
>Affects Versions: 2.0.0
>Reporter: Ankit Singhal
>Assignee: Ankit Singhal
>Priority: Major
> Fix For: 3.0.0, 2.3.0, 2.1.8, 2.2.3
>
> Attachments: HBASE-23175.master.001.patch
>
>
> Spark relies on the TokenUtil.obtainToken(conf) API, which was removed in 
> HBase-2.0. This has been fixed in SPARK-26432 to use the new API, but that fix is 
> planned for Spark-3.0, so we need the fix in HBase until they release it 
> and we upgrade.
> {code}
> 18/03/20 20:39:07 ERROR ApplicationMaster: User class threw exception: 
> org.apache.hadoop.hbase.HBaseIOException: 
> com.google.protobuf.ServiceException: Error calling method 
> hbase.pb.AuthenticationService.GetAuthenticationToken
> org.apache.hadoop.hbase.HBaseIOException: 
> com.google.protobuf.ServiceException: Error calling method 
> hbase.pb.AuthenticationService.GetAuthenticationToken
> at 
> org.apache.hadoop.hbase.shaded.protobuf.ProtobufUtil.makeIOExceptionOfException(ProtobufUtil.java:360)
> at 
> org.apache.hadoop.hbase.shaded.protobuf.ProtobufUtil.handleRemoteException(ProtobufUtil.java:346)
> at 
> org.apache.hadoop.hbase.security.token.TokenUtil.obtainToken(TokenUtil.java:86)
> at 
> org.apache.hadoop.hbase.security.token.TokenUtil$1.run(TokenUtil.java:121)
> at 
> org.apache.hadoop.hbase.security.token.TokenUtil$1.run(TokenUtil.java:118)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1682)
> at 
> org.apache.hadoop.hbase.security.User$SecureHadoopUser.runAs(User.java:313)
> at 
> org.apache.hadoop.hbase.security.token.TokenUtil.obtainToken(TokenUtil.java:118)
> at 
> org.apache.hadoop.hbase.security.token.TokenUtil.addTokenForJob(TokenUtil.java:272)
> at 
> org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.initCredentials(TableMapReduceUtil.java:533)
> at 
> org.apache.hadoop.hbase.spark.HBaseContext.<init>(HBaseContext.scala:73)
> at 
> org.apache.hadoop.hbase.spark.JavaHBaseContext.<init>(JavaHBaseContext.scala:46)
> at 
> org.apache.hadoop.hbase.spark.example.hbasecontext.JavaHBaseBulkDeleteExample.main(JavaHBaseBulkDeleteExample.java:64)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.apache.spark.deploy.yarn.ApplicationMaster$$anon$4.run(ApplicationMaster.scala:706)
> Caused by: com.google.protobuf.ServiceException: Error calling method 
> hbase.pb.AuthenticationService.GetAuthenticationToken
> at 
> org.apache.hadoop.hbase.client.SyncCoprocessorRpcChannel.callBlockingMethod(SyncCoprocessorRpcChannel.java:71)
> at 
> org.apache.hadoop.hbase.protobuf.generated.AuthenticationProtos$AuthenticationService$BlockingStub.getAuthenticationToken(AuthenticationProtos.java:4512)
> at 
> org.apache.hadoop.hbase.security.token.TokenUtil.obtainToken(TokenUtil.java:81)
> ... 17 more
> {code}




[jira] [Resolved] (HBASE-23175) Yarn unable to acquire delegation token for HBase Spark jobs

2020-01-03 Thread Guanghao Zhang (Jira)



 [ 
https://issues.apache.org/jira/browse/HBASE-23175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang resolved HBASE-23175.

Resolution: Fixed

Pushed to branch-2.2.

> Yarn unable to acquire delegation token for HBase Spark jobs
> 
>
> Key: HBASE-23175
> URL: https://issues.apache.org/jira/browse/HBASE-23175
> Project: HBase
>  Issue Type: Bug
>  Components: security, spark
>Affects Versions: 2.0.0
>Reporter: Ankit Singhal
>Assignee: Ankit Singhal
>Priority: Major
> Fix For: 3.0.0, 2.3.0, 2.2.3, 2.1.8
>
> Attachments: HBASE-23175.master.001.patch
>
>




[jira] [Created] (HBASE-23637) Generate CHANGES.md and RELEASENOTES.md for 2.2.3

2020-01-03 Thread Guanghao Zhang (Jira)

Guanghao Zhang created HBASE-23637:
--

 Summary: Generate CHANGES.md and RELEASENOTES.md for 2.2.3
 Key: HBASE-23637
 URL: https://issues.apache.org/jira/browse/HBASE-23637
 Project: HBase
  Issue Type: Sub-task
Reporter: Guanghao Zhang







[jira] [Resolved] (HBASE-23553) Snapshot referenced data files are deleted in some case

2020-01-03 Thread Guanghao Zhang (Jira)



 [ 
https://issues.apache.org/jira/browse/HBASE-23553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang resolved HBASE-23553.

Fix Version/s: 2.2.3
   2.3.0
   3.0.0
   Resolution: Fixed

> Snapshot referenced data files are deleted in some case
> ---
>
> Key: HBASE-23553
> URL: https://issues.apache.org/jira/browse/HBASE-23553
> Project: HBase
>  Issue Type: Bug
>Reporter: Yi Mei
>Assignee: Yi Mei
>Priority: Major
> Fix For: 3.0.0, 2.3.0, 2.2.3
>
>
> We scanned a snapshot in our cluster and got the following exception:
> {code:java}
> java.io.IOException: java.io.IOException: java.io.FileNotFoundException: 
> Unable to open link: org.apache.hadoop.hbase.io.HFileLink 
> locations=[hdfs://tjwqsrv-galaxy98/hbase/tjwqsrv-galaxy98/data/default/galaxy_online_fds_object_table/06dd90d8540b56343859b63a6134450c/A/4a6cf05f419a9f61059cb05a962f,
>  
> hdfs://tjwqsrv-galaxy98/hbase/tjwqsrv-galaxy98/.tmp/data/default/galaxy_online_fds_object_table/06dd90d8540b56343859b63a6134450c/A/4a6cf05f419a9f61059cb05a962f,
>  
> hdfs://tjwqsrv-galaxy98/hbase/tjwqsrv-galaxy98/archive/data/default/galaxy_online_fds_object_table/06dd90d8540b56343859b63a6134450c/A/4a6cf05f419a9f61059cb05a962f]
>  
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.initializeRegionStores(HRegion.java:867)
>  
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.initializeRegionInternals(HRegion.java:778)
>  at org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:749) 
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:5306) 
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:5271) 
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:5243) 
> at 
> org.apache.hadoop.hbase.client.ClientSideRegionScanner.<init>(ClientSideRegionScanner.java:72)
>  
> at 
> org.apache.hadoop.hbase.mapreduce.TableSnapshotInputFormatImpl$RecordReader.initialize(TableSnapshotInputFormatImpl.java:239)
>  
> at 
> org.apache.hadoop.hbase.mapreduce.TableSnapshotInputFormat$TableSnapshotRegionRecordReader.initialize(TableSnapshotInputFormat.java:150)
>  
> at 
> org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.initialize(MapTask.java:552)
>  at {code}
> I checked the namenode logs and found that this file was deleted by the HBase 
> cleaner although a snapshot still referenced it.
>  




[jira] [Created] (HBASE-23638) Set version to 2.2.3 in branch-2.2 for first RC of 2.2.3

2020-01-03 Thread Guanghao Zhang (Jira)

Guanghao Zhang created HBASE-23638:
--

 Summary: Set version to 2.2.3 in branch-2.2 for first RC of 2.2.3
 Key: HBASE-23638
 URL: https://issues.apache.org/jira/browse/HBASE-23638
 Project: HBase
  Issue Type: Sub-task
Reporter: Guanghao Zhang







[jira] [Created] (HBASE-23655) Fix flaky TestRSGroupsKillRS: should wait the SCP to finish

2020-01-07 Thread Guanghao Zhang (Jira)

Guanghao Zhang created HBASE-23655:
--

 Summary: Fix flaky TestRSGroupsKillRS: should wait the SCP to 
finish
 Key: HBASE-23655
 URL: https://issues.apache.org/jira/browse/HBASE-23655
 Project: HBase
  Issue Type: Bug
Affects Versions: 2.2.2
Reporter: Guanghao Zhang







[jira] [Created] (HBASE-23659) BaseLoadBalancer#wouldLowerAvailability should consider region replicas

2020-01-08 Thread Guanghao Zhang (Jira)

Guanghao Zhang created HBASE-23659:
--

 Summary: BaseLoadBalancer#wouldLowerAvailability should consider 
region replicas
 Key: HBASE-23659
 URL: https://issues.apache.org/jira/browse/HBASE-23659
 Project: HBase
  Issue Type: Bug
Reporter: Guanghao Zhang


Found this issue when trying to fix the flaky unit test TestRegionReplicaSplit. 
It may fail with:

java.lang.AssertionError: Splitted regions should not be assigned to same 
region server.

See 
[https://builds.apache.org/job/HBase-Flaky-Tests/job/master/5227/testReport/junit/org.apache.hadoop.hbase.master.assignment/TestRegionReplicaSplit/testRegionReplicaSplitRegionAssignment/].

 

 




[jira] [Created] (HBASE-23658) Fix flaky TestSnapshotFromMaster

2020-01-07 Thread Guanghao Zhang (Jira)

Guanghao Zhang created HBASE-23658:
--

 Summary: Fix flaky TestSnapshotFromMaster
 Key: HBASE-23658
 URL: https://issues.apache.org/jira/browse/HBASE-23658
 Project: HBase
  Issue Type: Bug
Reporter: Guanghao Zhang


testAsyncSnapshotWillNotBlockSnapshotHFileCleaner is flaky.  The assert may 
fail.
{code:java}
assertTrue(master.getSnapshotManager().isTakingAnySnapshot());
future.get(); // in branch-2.2, here is Thread.sleep
assertFalse(master.getSnapshotManager().isTakingAnySnapshot());
{code}
See 
[https://builds.apache.org/job/HBase-Flaky-Tests/job/master/5227/testReport/junit/org.apache.hadoop.hbase.master.cleaner/TestSnapshotFromMaster/testAsyncSnapshotWillNotBlockSnapshotHFileCleaner/]

 

[https://builds.apache.org/view/H-L/view/HBase/job/HBase-Find-Flaky-Tests/job/branch-2.2/lastSuccessfulBuild/artifact/dashboard.html]




[jira] [Created] (HBASE-23964) Set version to 2.2.4 in branch-2.2 for first RC of 2.2.4

2020-03-10 Thread Guanghao Zhang (Jira)

Guanghao Zhang created HBASE-23964:
--

 Summary: Set version to 2.2.4 in branch-2.2 for first RC of 2.2.4
 Key: HBASE-23964
 URL: https://issues.apache.org/jira/browse/HBASE-23964
 Project: HBase
  Issue Type: Sub-task
Reporter: Guanghao Zhang







[jira] [Resolved] (HBASE-23953) SimpleBalancer bug when second pass to fill up to min

2020-03-10 Thread Guanghao Zhang (Jira)



 [ 
https://issues.apache.org/jira/browse/HBASE-23953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang resolved HBASE-23953.

Fix Version/s: (was: 2.2.5)
   2.2.4
   2.3.0
   3.0.0
   Resolution: Fixed

Pushed to branch-2.2+. Thanks [~niuyulin] for contributing.

> SimpleBalancer bug when second pass to fill up to min
> -
>
> Key: HBASE-23953
> URL: https://issues.apache.org/jira/browse/HBASE-23953
> Project: HBase
>  Issue Type: Bug
>  Components: Balancer
>Affects Versions: 2.2.0
>Reporter: niuyulin
>Assignee: niuyulin
>Priority: Major
> Fix For: 3.0.0, 2.3.0, 2.2.4
>
>





[jira] [Resolved] (HBASE-23964) Set version to 2.2.4 in branch-2.2 for first RC of 2.2.4

2020-03-11 Thread Guanghao Zhang (Jira)



 [ 
https://issues.apache.org/jira/browse/HBASE-23964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang resolved HBASE-23964.

Resolution: Fixed

> Set version to 2.2.4 in branch-2.2 for first RC of 2.2.4
> 
>
> Key: HBASE-23964
> URL: https://issues.apache.org/jira/browse/HBASE-23964
> Project: HBase
>  Issue Type: Sub-task
>    Reporter: Guanghao Zhang
>        Assignee: Guanghao Zhang
>Priority: Major
> Fix For: 2.2.4
>
>





[jira] [Resolved] (HBASE-23965) Generate CHANGES.md and RELEASENOTES.md for 2.2.4

2020-03-11 Thread Guanghao Zhang (Jira)



 [ 
https://issues.apache.org/jira/browse/HBASE-23965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang resolved HBASE-23965.

Fix Version/s: 2.2.4
   Resolution: Fixed

Thanks [~meiyi] for reviewing. Pushed to branch-2.2.

> Generate CHANGES.md and RELEASENOTES.md for 2.2.4
> -
>
> Key: HBASE-23965
> URL: https://issues.apache.org/jira/browse/HBASE-23965
> Project: HBase
>  Issue Type: Sub-task
>    Reporter: Guanghao Zhang
>        Assignee: Guanghao Zhang
>Priority: Major
> Fix For: 2.2.4
>
>





[jira] [Created] (HBASE-23965) Generate CHANGES.md and RELEASENOTES.md for 2.2.4

2020-03-10 Thread Guanghao Zhang (Jira)

Guanghao Zhang created HBASE-23965:
--

 Summary: Generate CHANGES.md and RELEASENOTES.md for 2.2.4
 Key: HBASE-23965
 URL: https://issues.apache.org/jira/browse/HBASE-23965
 Project: HBase
  Issue Type: Sub-task
Reporter: Guanghao Zhang







[jira] [Resolved] (HBASE-23684) NPE HFilesOutputSink

2020-03-09 Thread Guanghao Zhang (Jira)



 [ 
https://issues.apache.org/jira/browse/HBASE-23684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang resolved HBASE-23684.

Resolution: Duplicate

Resolved as duplicate.

> NPE HFilesOutputSink
> 
>
> Key: HBASE-23684
> URL: https://issues.apache.org/jira/browse/HBASE-23684
> Project: HBase
>  Issue Type: Bug
>  Components: MTTR, wal
>Affects Versions: 2.3.0
>Reporter: Michael Stack
>    Assignee: Guanghao Zhang
>Priority: Critical
> Fix For: 3.0.0, 2.3.0
>
>
> Enabling the new split to hfiles feature, HBASE-23286, running branch-2 tip, 
> I see this out on RegionServers:
> {code}
>  2020-01-13 17:37:08,204 INFO org.apache.hadoop.hbase.wal.OutputSink: 3 split 
> writer threads finished
>  2020-01-13 17:37:08,233 INFO org.apache.hadoop.hbase.wal.WALSplitter: 
> Processed 1007 edits across 0 regions cost 284 ms; edits skipped=76; 
> WAL=hdfs://nameservice1/hbase/genie/WALs/hbasedn101.example.org,16020,1578934806382-splitting/hbasedn101.example.org%2C16020%2C1578934806382.1578937008832,
>  size=128.5 M, length=134708720, corrupted=false, progress failed=true
>  2020-01-13 17:37:08,234 WARN 
> org.apache.hadoop.hbase.regionserver.SplitLogWorker: log splitting of 
> WALs/hbasedn101.example.org,16020,1578934806382-splitting/hbasedn101.example.org%2C16020%2C1578934806382.1578937008832
>  failed, returning error
>  java.io.IOException: java.lang.NullPointerException
>  at 
> org.apache.hadoop.hbase.wal.BoundedRecoveredHFilesOutputSink.writeRemainingEntryBuffers(BoundedRecoveredHFilesOutputSink.java:173)
>  at 
> org.apache.hadoop.hbase.wal.BoundedRecoveredHFilesOutputSink.close(BoundedRecoveredHFilesOutputSink.java:140)
>  at 
> org.apache.hadoop.hbase.wal.WALSplitter.splitLogFile(WALSplitter.java:339)
>  at 
> org.apache.hadoop.hbase.wal.WALSplitter.splitLogFile(WALSplitter.java:181)
>  at 
> org.apache.hadoop.hbase.regionserver.SplitLogWorker.splitLog(SplitLogWorker.java:105)
>  at 
> org.apache.hadoop.hbase.regionserver.SplitLogWorker.lambda$new$0(SplitLogWorker.java:84)
>  at 
> org.apache.hadoop.hbase.regionserver.handler.WALSplitterHandler.process(WALSplitterHandler.java:70)
>  at 
> org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:104)
>  at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>  at 
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>  at java.base/java.lang.Thread.run(Thread.java:834)
>  Caused by: java.lang.NullPointerException
>  at 
> org.apache.hadoop.hbase.wal.BoundedRecoveredHFilesOutputSink.configContextForNonMetaWriter(BoundedRecoveredHFilesOutputSink.java:225)
>  at 
> org.apache.hadoop.hbase.wal.BoundedRecoveredHFilesOutputSink.createRecoveredHFileWriter(BoundedRecoveredHFilesOutputSink.java:213)
>  at 
> org.apache.hadoop.hbase.wal.BoundedRecoveredHFilesOutputSink.append(BoundedRecoveredHFilesOutputSink.java:117)
>  at 
> org.apache.hadoop.hbase.wal.BoundedRecoveredHFilesOutputSink.lambda$writeRemainingEntryBuffers$3(BoundedRecoveredHFilesOutputSink.java:155)
>  at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
>  at 
> java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
>  at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
> {code}
> It is a bit odd because log says there were zero regions. Not sure what that 
> was about.




[jira] [Resolved] (HBASE-23895) STUCK Region-In-Transition when failed to insert procedure to procedure store

2020-03-07 Thread Guanghao Zhang (Jira)



 [ 
https://issues.apache.org/jira/browse/HBASE-23895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang resolved HBASE-23895.

Resolution: Fixed

Pushed to master and branch-2. Thanks [~zhangduo] and [~stack] for reviewing.

> STUCK Region-In-Transition when failed to insert procedure to procedure store
> -
>
> Key: HBASE-23895
> URL: https://issues.apache.org/jira/browse/HBASE-23895
> Project: HBase
>  Issue Type: Bug
>  Components: proc-v2, RegionProcedureStore
>        Reporter: Guanghao Zhang
>    Assignee: Guanghao Zhang
>Priority: Major
> Fix For: 3.0.0, 2.3.0
>
> Attachments: suggestion.patch
>
>
> When moving a region, a TRSP is generated first and set on 
> the region state node. But if submitting the TRSP fails, the procedure is never 
> unset and the region gets stuck in RIT.
> hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/AssignmentManager.java
> {code:java}
> public Future<byte[]> moveAsync(RegionPlan regionPlan) throws HBaseIOException {
>   TransitRegionStateProcedure proc =
>     createMoveRegionProcedure(regionPlan.getRegionInfo(), regionPlan.getDestination());
>   return ProcedureSyncWait.submitProcedure(master.getMasterProcedureExecutor(), proc);
> }
>
> public TransitRegionStateProcedure createMoveRegionProcedure(RegionInfo regionInfo,
>     ServerName targetServer) throws HBaseIOException {
>   RegionStateNode regionNode = this.regionStates.getRegionStateNode(regionInfo);
>   if (regionNode == null) {
>     throw new UnknownRegionException("No RegionStateNode found for " +
>       regionInfo.getEncodedName() + "(Closed/Deleted?)");
>   }
>   TransitRegionStateProcedure proc;
>   regionNode.lock();
>   try {
>     preTransitCheck(regionNode, STATES_EXPECTED_ON_UNASSIGN_OR_MOVE);
>     regionNode.checkOnline();
>     proc = TransitRegionStateProcedure.move(getProcedureEnvironment(), regionInfo, targetServer);
>     regionNode.setProcedure(proc);
>   } finally {
>     regionNode.unlock();
>   }
>   return proc;
> }
> {code}
> hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/RegionStateNode.java
> {code:java}
>   public void setProcedure(TransitRegionStateProcedure proc) {
> assert this.procedure == null;
> this.procedure = proc;
> ritMap.put(regionInfo, this);
>   }
>   public void unsetProcedure(TransitRegionStateProcedure proc) {
> assert this.procedure == proc;
> this.procedure = null;
> ritMap.remove(regionInfo, this);
>   } 
> {code}
> {code:java}
> 2020-02-26,13:45:21,344 ERROR 
> [RpcServer.default.RWQ.Fifo.read.handler=437,queue=5,port=21500] 
> org.apache.hadoop.hbase.ipc.RpcServer: Unexpected throwable object
> java.io.UncheckedIOException: 
> org.apache.hadoop.hbase.exceptions.TimeoutIOException: Timed out waiting for 
> lock for row: \x00\x00\x00\x00\x00\x0B\xAB\xD2 in region 
> 9731aea823e7f83264b14713ae486fb7
> at 
> org.apache.hadoop.hbase.procedure2.store.region.RegionProcedureStore.update(RegionProcedureStore.java:588)
> at 
> org.apache.hadoop.hbase.procedure2.store.region.RegionProcedureStore.insert(RegionProcedureStore.java:545)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.submitProcedure(ProcedureExecutor.java:1042)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.submitProcedure(ProcedureExecutor.java:860)
> at 
> org.apache.hadoop.hbase.master.procedure.ProcedureSyncWait.submitProcedure(ProcedureSyncWait.java:123)
> at 
> org.apache.hadoop.hbase.master.assignment.AssignmentManager.moveAsync(AssignmentManager.java:657)
> at 
> org.apache.hadoop.hbase.master.HMaster.executeRegionPlansWithThrottling(HMaster.java:1793)
> at org.apache.hadoop.hbase.master.HMaster.balance(HMaster.java:1761)
> at 
> org.apache.hadoop.hbase.master.MasterRpcServices.balance(MasterRpcServices.java:654)
> at 
> org.apache.hadoop.hbase.shaded.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java)
> at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:374)
> at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:135)
> at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:352)
> at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Han

[jira] [Resolved] (HBASE-23944) The method setClusterLoad of SimpleLoadBalancer is incorrect when balance by table

2020-03-07 Thread Guanghao Zhang (Jira)



 [ 
https://issues.apache.org/jira/browse/HBASE-23944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang resolved HBASE-23944.

Resolution: Fixed

> The method setClusterLoad of SimpleLoadBalancer is incorrect when balance by 
> table 
> ---
>
> Key: HBASE-23944
> URL: https://issues.apache.org/jira/browse/HBASE-23944
> Project: HBase
>  Issue Type: Bug
>  Components: Balancer
>Affects Versions: 2.2.2
>Reporter: niuyulin
>Assignee: niuyulin
>Priority: Major
> Fix For: 3.0.0, 2.3.0, 2.2.4, 2.1.10
>
>
> Currently, if the clusterLoad parameter is by table, for example:
> {code:java}
> table1=>
>      server1=>[table1,region1]
>      server2=>[]
> table2=>
>     server1=>[table2,region1]
>     server2=>[]
> {code}
> then, the member variable serverLoadList is:
> {code:java}
> [{server1, load 1}{server2, load 0}{server1, load 1} {server2, load 0}]
> {code}
> As a result, the cluster is considered balanced in the method overallNeedsBalance.
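
The effect described in this issue can be reproduced with a small standalone sketch. It assumes a deliberately simplified balance rule (a cluster looks balanced when every load lies between the floor and the ceiling of the average, with no slop factor), which is not the actual overallNeedsBalance code; the class `ClusterLoadSketch` and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ClusterLoadSketch {
  /** Simplified rule (assumption): balanced if every load is within [floor(avg), ceil(avg)]. */
  static boolean looksBalanced(List<Integer> loads) {
    double avg = loads.stream().mapToInt(Integer::intValue).average().orElse(0);
    int floor = (int) Math.floor(avg);
    int ceil = (int) Math.ceil(avg);
    return loads.stream().allMatch(l -> l >= floor && l <= ceil);
  }

  /** Sums the per-table region counts so that each server appears exactly once. */
  static Map<String, Integer> mergeByServer(Map<String, Map<String, Integer>> byTable) {
    Map<String, Integer> merged = new LinkedHashMap<>();
    for (Map<String, Integer> perServer : byTable.values()) {
      perServer.forEach((server, n) -> merged.merge(server, n, Integer::sum));
    }
    return merged;
  }

  public static void main(String[] args) {
    Map<String, Map<String, Integer>> byTable = new LinkedHashMap<>();
    byTable.put("table1", Map.of("server1", 1, "server2", 0));
    byTable.put("table2", Map.of("server1", 1, "server2", 0));

    // One entry per (table, server) pair, as in the serverLoadList from the issue.
    List<Integer> flattened = new ArrayList<>();
    byTable.values().forEach(m -> flattened.addAll(m.values()));
    System.out.println(looksBalanced(flattened));
    // One entry per server, with the loads of all tables summed.
    System.out.println(looksBalanced(new ArrayList<>(mergeByServer(byTable).values())));
  }
}
```

With the per-table entries the average is 0.5, so loads of 0 and 1 both pass the check. After merging, server1 carries 2 regions against an average of 1, and the same check reports an imbalance.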




[jira] [Resolved] (HBASE-23739) BoundedRecoveredHFilesOutputSink should read the table descriptor directly

2020-03-07 Thread Guanghao Zhang (Jira)



 [ 
https://issues.apache.org/jira/browse/HBASE-23739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang resolved HBASE-23739.

Fix Version/s: 2.3.0
   3.0.0
   Resolution: Fixed

Pushed to branch-2+.

> BoundedRecoveredHFilesOutputSink should read the table descriptor directly
> --
>
> Key: HBASE-23739
> URL: https://issues.apache.org/jira/browse/HBASE-23739
> Project: HBase
>  Issue Type: Sub-task
>    Reporter: Guanghao Zhang
>        Assignee: Guanghao Zhang
>Priority: Major
> Fix For: 3.0.0, 2.3.0
>
>
> Read from meta or filesystem?




[jira] [Resolved] (HBASE-24021) Fail fast when bulkLoadHFiles method catch some IOException

2020-04-02 Thread Guanghao Zhang (Jira)



 [ 
https://issues.apache.org/jira/browse/HBASE-24021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang resolved HBASE-24021.

Fix Version/s: 2.2.5
   2.4.0
   2.3.0
   3.0.0
   Resolution: Fixed

Pushed to branch-2.2+.

> Fail fast when bulkLoadHFiles method catch some IOException
> ---
>
> Key: HBASE-24021
> URL: https://issues.apache.org/jira/browse/HBASE-24021
> Project: HBase
>  Issue Type: Improvement
>  Components: HFile, regionserver
>Reporter: niuyulin
>Assignee: niuyulin
>Priority: Major
> Fix For: 3.0.0, 2.3.0, 2.4.0, 2.2.5
>
>
> In a production environment, we usually bulk load a huge number of HFiles. It is 
> reasonable to fail fast when any IOException occurs.
>  
> hbase/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
> {code:java}
> public Map<byte[], List<Path>> bulkLoadHFiles(Collection<Pair<byte[], String>> familyPaths,
>     boolean assignSeqId, BulkLoadListener bulkLoadListener,
>     boolean copyFile, List<String> clusterIds, boolean replicate) throws IOException {
>   ..
>   try {
>     this.writeRequestsCount.increment();
>     // There possibly was a split that happened between when the split keys
>     // were gathered and before the HRegion's write lock was taken.  We need
>     // to validate the HFile region before attempting to bulk load all of them
>     List<IOException> ioes = new ArrayList<>();
>     List<Pair<byte[], String>> failures = new ArrayList<>();
>     for (Pair<byte[], String> p : familyPaths) {
>       byte[] familyName = p.getFirst();
>       String path = p.getSecond();
>       HStore store = getStore(familyName);
>       if (store == null) {
>         IOException ioe = new org.apache.hadoop.hbase.DoNotRetryIOException(
>           "No such column family " + Bytes.toStringBinary(familyName));
>         ioes.add(ioe);
>       } else {
>         try {
>           store.assertBulkLoadHFileOk(new Path(path));
>         } catch (WrongRegionException wre) {
>           // recoverable (file doesn't fit in region)
>           failures.add(p);
>         } catch (IOException ioe) {
>           // unrecoverable (hdfs problem)
>           ioes.add(ioe);
>         }
>       }
>     }
>     // validation failed because of some sort of IO problem.
>     if (ioes.size() != 0) {
>       IOException e = MultipleIOException.createIOException(ioes);
>       LOG.error("There were one or more IO errors when checking if the bulk load is ok.", e);
>       throw e;
>     }
> {code}
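
For contrast with the collect-then-throw loop quoted above, here is a minimal standalone sketch of a fail-fast variant: the first unrecoverable IOException aborts validation, while recoverable failures are still collected. It is only an illustration of the idea, not the committed patch; `FailFastSketch`, `Validator` and `RecoverableException` (standing in for WrongRegionException) are hypothetical names.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class FailFastSketch {
  /** Hypothetical stand-in for a per-file check such as assertBulkLoadHFileOk. */
  interface Validator {
    void check(String path) throws IOException;
  }

  /** Hypothetical stand-in for WrongRegionException: a recoverable failure. */
  static class RecoverableException extends IOException {
    RecoverableException(String message) {
      super(message);
    }
  }

  /** Returns the recoverable failures; rethrows the first unrecoverable IOException at once. */
  static List<String> validate(List<String> paths, Validator validator) throws IOException {
    List<String> failures = new ArrayList<>();
    for (String path : paths) {
      try {
        validator.check(path);
      } catch (RecoverableException e) {
        failures.add(path); // recoverable: remember it and keep going
      } catch (IOException e) {
        throw e; // unrecoverable: stop without checking the remaining files
      }
    }
    return failures;
  }

  public static void main(String[] args) {
    int[] checks = {0};
    Validator validator = path -> {
      checks[0]++;
      if (path.equals("b")) {
        throw new IOException("hdfs problem on " + path);
      }
    };
    try {
      validate(List.of("a", "b", "c"), validator);
    } catch (IOException e) {
      System.out.println("failed after " + checks[0] + " checks: " + e.getMessage());
    }
  }
}
```

In the example run, file "c" is never checked, which is the saving that matters when the list holds a huge number of HFiles.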




[jira] [Resolved] (HBASE-24037) Add ut for root dir and wal root dir are different

2020-03-24 Thread Guanghao Zhang (Jira)



 [ 
https://issues.apache.org/jira/browse/HBASE-24037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang resolved HBASE-24037.

Fix Version/s: 2.4.0
   2.3.0
   3.0.0
   Resolution: Fixed

Pushed to branch-2.3+.

> Add ut for root dir and wal root dir are different
> --
>
> Key: HBASE-24037
> URL: https://issues.apache.org/jira/browse/HBASE-24037
> Project: HBase
>  Issue Type: Sub-task
>    Reporter: Guanghao Zhang
>        Assignee: Guanghao Zhang
>Priority: Major
> Fix For: 3.0.0, 2.3.0, 2.4.0
>
>





[jira] [Resolved] (HBASE-23949) refactor loadBalancer implements for rsgroup balance by table to achieve overallbalanced

2020-03-24 Thread Guanghao Zhang (Jira)



 [ 
https://issues.apache.org/jira/browse/HBASE-23949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang resolved HBASE-23949.

Fix Version/s: 2.2.5
   2.4.0
   2.3.0
   3.0.0
   Resolution: Fixed

Pushed to branch-2.2+. Thanks [~niuyulin] for contributing.

> refactor  loadBalancer implements for rsgroup balance by table to  achieve 
> overallbalanced
> --
>
> Key: HBASE-23949
> URL: https://issues.apache.org/jira/browse/HBASE-23949
> Project: HBase
>  Issue Type: Bug
>  Components: rsgroup
>Affects Versions: 2.2.0
>Reporter: niuyulin
>Assignee: niuyulin
>Priority: Major
> Fix For: 3.0.0, 2.3.0, 2.4.0, 2.2.5
>
>
> Currently an overall balance cannot be achieved when the rsgroup balancer is 
> used and balance-by-table is on, because balancing each table actually uses a 
> cluster load that contains only that one table's load.
> We should use a cluster load that contains the load of all tables in the 
> rsgroup to balance overall.
>  
>  hbase/hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
> {code:java}
>   public boolean balance(boolean force) throws IOException {
>     ..
>     boolean isByTable =
>       getConfiguration().getBoolean("hbase.master.loadbalance.bytable", false);
>     Map<TableName, Map<ServerName, List<RegionInfo>>> assignments =
>       this.assignmentManager.getRegionStates()
>         .getAssignmentsForBalancer(tableStateManager,
>           this.serverManager.getOnlineServersList(), isByTable);
>     for (Map<ServerName, List<RegionInfo>> serverMap : assignments.values()) {
>       serverMap.keySet().removeAll(this.serverManager.getDrainingServersList());
>     }
>     // Give the balancer the current cluster state.
>     this.balancer.setClusterMetrics(getClusterMetricsWithoutCoprocessor());
>     this.balancer.setClusterLoad(assignments);
>     List<RegionPlan> plans = new ArrayList<>();
>     for (Entry<TableName, Map<ServerName, List<RegionInfo>>> e :
>         assignments.entrySet()) {
>       List<RegionPlan> partialPlans =
>         this.balancer.balanceCluster(e.getKey(), e.getValue());
>       if (partialPlans != null) {
>         plans.addAll(partialPlans);
>       }
>     }
> {code}
> Proposed refactoring:
>  # Add a method 'balanceTable' to the LoadBalancer interface.
>  # SimpleLoadBalancer and StochasticLoadBalancer implement the real 
> 'balanceTable'; 'balanceTable' is not supported in BaseLoadBalancer and 
> RSGroupBasedLoadBalancer.
>  # RSGroupBasedLoadBalancer invokes balanceCluster and passes the 
> GroupClusterLoad to the internal balancer, group by group.
>  # The internal balancer's balanceCluster invokes 'balanceTable'.
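
To illustrate why a per-table view of the load can hide an imbalance, here is a minimal, self-contained sketch. It is not HBase code: the class GroupLoadSketch, its methods, and the table/server/region names are all hypothetical, and plain strings stand in for TableName, ServerName and RegionInfo.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical model: table -> server -> regions. All names are illustrative only.
final class GroupLoadSketch {

  /** Regions per server when only one table's assignments are visible. */
  static Map<String, Integer> perTableLoad(
      Map<String, Map<String, List<String>>> assignments, String table) {
    Map<String, Integer> load = new TreeMap<>();
    Map<String, List<String>> serverMap =
        assignments.getOrDefault(table, Collections.<String, List<String>>emptyMap());
    for (Map.Entry<String, List<String>> e : serverMap.entrySet()) {
      load.merge(e.getKey(), e.getValue().size(), Integer::sum);
    }
    return load;
  }

  /** Regions per server summed over every table in the group. */
  static Map<String, Integer> groupLoad(Map<String, Map<String, List<String>>> assignments) {
    Map<String, Integer> load = new TreeMap<>();
    for (Map<String, List<String>> serverMap : assignments.values()) {
      for (Map.Entry<String, List<String>> e : serverMap.entrySet()) {
        load.merge(e.getKey(), e.getValue().size(), Integer::sum);
      }
    }
    return load;
  }

  /** Two tables on two servers: t1 is spread evenly, t2 sits entirely on rs1. */
  static Map<String, Map<String, List<String>>> example() {
    Map<String, List<String>> t1 = new HashMap<>();
    t1.put("rs1", Arrays.asList("t1-r1", "t1-r2"));
    t1.put("rs2", Arrays.asList("t1-r3", "t1-r4"));
    Map<String, List<String>> t2 = new HashMap<>();
    t2.put("rs1", Arrays.asList("t2-r1", "t2-r2", "t2-r3", "t2-r4"));
    t2.put("rs2", Collections.<String>emptyList());
    Map<String, Map<String, List<String>>> assignments = new HashMap<>();
    assignments.put("t1", t1);
    assignments.put("t2", t2);
    return assignments;
  }

  public static void main(String[] args) {
    Map<String, Map<String, List<String>>> assignments = example();
    System.out.println(perTableLoad(assignments, "t1")); // {rs1=2, rs2=2}
    System.out.println(groupLoad(assignments));          // {rs1=6, rs2=2}
  }
}
```

Seen through t1 alone the two servers look balanced, while the group-wide view shows rs1 carrying three times the regions of rs2. That group-wide view is what the description asks the balancer to work from.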



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Created] (HBASE-24363) Fix failed ut TestAssignmentManagerMetrics for branch-2.2

2020-05-13 Thread Guanghao Zhang (Jira)

Guanghao Zhang created HBASE-24363:
--

 Summary: Fix failed ut TestAssignmentManagerMetrics for branch-2.2
 Key: HBASE-24363
 URL: https://issues.apache.org/jira/browse/HBASE-24363
 Project: HBase
  Issue Type: Bug
Affects Versions: 2.2.4
Reporter: Guanghao Zhang






--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Resolved] (HBASE-24165) maxPoolSize is logged incorrectly in ByteBufferPool

2020-05-16 Thread Guanghao Zhang (Jira)



 [ 
https://issues.apache.org/jira/browse/HBASE-24165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang resolved HBASE-24165.

Resolution: Fixed

> maxPoolSize is logged incorrectly in ByteBufferPool
> ---
>
> Key: HBASE-24165
> URL: https://issues.apache.org/jira/browse/HBASE-24165
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.2.4
>Reporter: Pankaj Kumar
>Assignee: Pankaj Kumar
>Priority: Minor
> Fix For: 2.2.5
>
>
> In ByteBufferPool _maxPoolSize_ is converted into byte format,
> https://github.com/apache/hbase/blob/a521a80c4b9a8b0749c368d1ff66fea2ed2d77a2/hbase-common/src/main/java/org/apache/hadoop/hbase/io/ByteBufferPool.java#L85
>  
> Currently maxPoolSize is logged as below,
> 2020-04-10 14:20:56,000 INFO  [Time-limited test] io.ByteBufferPool(83): 
> Created with bufferSize=64 KB and maxPoolSize=320 B



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Resolved] (HBASE-24381) The Size metrics in Master Webui is wrong if the size is 0

2020-05-17 Thread Guanghao Zhang (Jira)



 [ 
https://issues.apache.org/jira/browse/HBASE-24381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang resolved HBASE-24381.

Fix Version/s: 2.2.5
   2.3.0
   3.0.0-alpha-1
   Resolution: Fixed

Pushed to branch-2.2+. Thanks [~DeanZ] for contributing.

> The Size metrics in Master Webui is wrong if the size is 0
> --
>
> Key: HBASE-24381
> URL: https://issues.apache.org/jira/browse/HBASE-24381
> Project: HBase
>  Issue Type: Bug
>  Components: UI
>Affects Versions: 2.2.4
>Reporter: Baiqiang Zhao
>Assignee: Baiqiang Zhao
>Priority: Major
> Fix For: 3.0.0-alpha-1, 2.3.0, 2.2.5
>
> Attachments: master-webui-size-wrong.png
>
>
> As shown in the attachment, there are no storefiles on the last RS, but its 
> StoreFile Size is shown as being as large as that of the previous RS.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Resolved] (HBASE-24080) [flakey test] TestRegionReplicaFailover.testSecondaryRegionKill fails.

2020-05-13 Thread Guanghao Zhang (Jira)



 [ 
https://issues.apache.org/jira/browse/HBASE-24080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang resolved HBASE-24080.

Resolution: Fixed

> [flakey test] TestRegionReplicaFailover.testSecondaryRegionKill fails.
> --
>
> Key: HBASE-24080
> URL: https://issues.apache.org/jira/browse/HBASE-24080
> Project: HBase
>  Issue Type: Test
>  Components: read replicas
>Affects Versions: 3.0.0-alpha-1, 2.3.0
>Reporter: Huaxiang Sun
>Assignee: Huaxiang Sun
>Priority: Major
> Fix For: 3.0.0-alpha-1, 2.3.0, 2.2.5
>
>
> Ran into the following error locally:
> {code:java}
> ---
> Test set: org.apache.hadoop.hbase.regionserver.TestRegionReplicaFailover
> ---
> Tests run: 6, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 97.391 s <<< 
> FAILURE! - in org.apache.hadoop.hbase.regionserver.TestRegionReplicaFailover
> org.apache.hadoop.hbase.regionserver.TestRegionReplicaFailover.testSecondaryRegionKill
>   Time elapsed: 28.682 s  <<< FAILURE!
> java.lang.AssertionError: Failed verification of row :0
>         at org.junit.Assert.fail(Assert.java:89)
>         at org.junit.Assert.assertTrue(Assert.java:42)
>         at 
> org.apache.hadoop.hbase.HBaseTestingUtility.verifyNumericRows(HBaseTestingUtility.java:2407)
>         at 
> org.apache.hadoop.hbase.regionserver.TestRegionReplicaFailover.testSecondaryRegionKill(TestRegionReplicaFailover.java:240)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>         at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:498)
>         at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>         at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>         at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>         at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>         at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>         at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>         at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:61)
>         at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
>         at 
> org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
>         at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
>         at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
>         at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
>         at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
>         at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
>         at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
>         at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
>         at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
>         at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:288)
>         at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:282)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>         at java.lang.Thread.run(Thread.java:748) {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Resolved] (HBASE-24363) Fix failed ut TestAssignmentManagerMetrics for branch-2.2

2020-05-13 Thread Guanghao Zhang (Jira)



 [ 
https://issues.apache.org/jira/browse/HBASE-24363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang resolved HBASE-24363.

Fix Version/s: 2.2.5
   Resolution: Fixed

Pushed to branch-2.2. Thanks [~meiyi] for reviewing.

> Fix failed ut TestAssignmentManagerMetrics for branch-2.2
> -
>
> Key: HBASE-24363
> URL: https://issues.apache.org/jira/browse/HBASE-24363
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.2.4
>        Reporter: Guanghao Zhang
>    Assignee: Guanghao Zhang
>Priority: Major
> Fix For: 2.2.5
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Resolved] (HBASE-24201) Fix CI builds on branch-2.2

2020-05-14 Thread Guanghao Zhang (Jira)



 [ 
https://issues.apache.org/jira/browse/HBASE-24201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang resolved HBASE-24201.

Resolution: Invalid

> Fix CI builds on branch-2.2
> ---
>
> Key: HBASE-24201
> URL: https://issues.apache.org/jira/browse/HBASE-24201
> Project: HBase
>  Issue Type: Task
>  Components: build
>Affects Versions: 2.2.5
>Reporter: Nick Dimiduk
>    Assignee: Guanghao Zhang
>Priority: Major
>
> From a recent [PR 
> build|https://builds.apache.org/blue/organizations/jenkins/HBase-PreCommit-GitHub-PR/detail/PR-1532/1/pipeline/]
> {noformat}
> [2020-04-16T18:43:21.548Z] Setting up ruby2.3 (2.3.3-1+deb9u7) ...
> [2020-04-16T18:43:21.548Z] Setting up ruby2.3-dev:amd64 (2.3.3-1+deb9u7) ...
> [2020-04-16T18:43:21.548Z] Setting up ruby-dev:amd64 (1:2.3.3) ...
> [2020-04-16T18:43:21.548Z] Setting up ruby (1:2.3.3) ...
> [2020-04-16T18:43:22.261Z] Processing triggers for libc-bin (2.24-11+deb9u3) 
> ...
> [2020-04-16T18:43:22.975Z] Successfully installed rake-13.0.1
> [2020-04-16T18:43:22.975Z] Building native extensions.  This could take a 
> while...
> [2020-04-16T18:43:25.277Z] ERROR:  Error installing rubocop:
> [2020-04-16T18:43:25.277Z]rubocop requires Ruby version >= 2.4.0.
> {noformat}
> Looks like the Dockerfile on branch-2.2 has bit-rot. I suspect package 
> versions are partially pinned or not pinned at all: the rubocop version has 
> incremented but the ruby version has not.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Resolved] (HBASE-20289) Comparator for NormalizationPlan breaks comparator's convention

2020-05-18 Thread Guanghao Zhang (Jira)



 [ 
https://issues.apache.org/jira/browse/HBASE-20289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang resolved HBASE-20289.

Fix Version/s: 2.2.5
   Resolution: Fixed

Pushed to branch-2.2. Thanks [~twyuki] for contributing.

> Comparator for NormalizationPlan breaks comparator's convention
> ---
>
> Key: HBASE-20289
> URL: https://issues.apache.org/jira/browse/HBASE-20289
> Project: HBase
>  Issue Type: Bug
>  Components: master
>Reporter: Yuki Tawara
>Assignee: Yuki Tawara
>Priority: Minor
> Fix For: 3.0.0-alpha-1, 2.3.0, 2.2.5
>
> Attachments: HBASE-20289.master.001.patch
>
>
> Comparator must meet the condition: sign(comparator(plan1, plan2)) = - 
> sign(comparator(plan2, plan1)).
> The current implementation breaks the above condition.
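
The sign condition quoted above can be checked mechanically. The following is a minimal sketch only: Plan, BROKEN and FIXED are hypothetical stand-ins, not the NormalizationPlan comparator from the patch. BROKEN shows one common way such a comparator goes wrong (returning early on the left argument without looking at the right one).

```java
import java.util.Comparator;

final class ComparatorContractSketch {

  // Hypothetical stand-in for a plan; only "is it a split plan?" is modeled.
  static final class Plan {
    final boolean split;

    Plan(boolean split) {
      this.split = split;
    }
  }

  // Asymmetric: two split plans compare as -1 in both directions.
  static final Comparator<Plan> BROKEN = (p1, p2) -> {
    if (p1.split) {
      return -1;
    }
    if (p2.split) {
      return 1;
    }
    return 0;
  };

  // Symmetric: split plans sort first, and plans of the same kind compare as 0.
  static final Comparator<Plan> FIXED = (p1, p2) -> Boolean.compare(p2.split, p1.split);

  /** Checks sign(compare(a, b)) == -sign(compare(b, a)) for one pair. */
  static boolean isAntisymmetric(Comparator<Plan> c, Plan a, Plan b) {
    return Integer.signum(c.compare(a, b)) == -Integer.signum(c.compare(b, a));
  }

  public static void main(String[] args) {
    Plan s1 = new Plan(true);
    Plan s2 = new Plan(true);
    System.out.println(isAntisymmetric(BROKEN, s1, s2)); // false
    System.out.println(isAntisymmetric(FIXED, s1, s2));  // true
  }
}
```

A comparator that fails this check can make sorting results depend on input order, and some sort implementations may reject it at runtime.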



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Reopened] (HBASE-20289) Comparator for NormalizationPlan breaks comparator's convention

2020-05-18 Thread Guanghao Zhang (Jira)



 [ 
https://issues.apache.org/jira/browse/HBASE-20289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang reopened HBASE-20289:


Reopen for backport to branch-2.2.

> Comparator for NormalizationPlan breaks comparator's convention
> ---
>
> Key: HBASE-20289
> URL: https://issues.apache.org/jira/browse/HBASE-20289
> Project: HBase
>  Issue Type: Bug
>  Components: master
>Reporter: Yuki Tawara
>Assignee: Yuki Tawara
>Priority: Minor
> Fix For: 3.0.0-alpha-1, 2.3.0
>
> Attachments: HBASE-20289.master.001.patch
>
>
> Comparator must meet the condition: sign(comparator(plan1, plan2)) = - 
> sign(comparator(plan2, plan1)).
> The current implementation breaks the above condition.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Reopened] (HBASE-24165) maxPoolSize is logged incorrectly in ByteBufferPool

2020-05-14 Thread Guanghao Zhang (Jira)



 [ 
https://issues.apache.org/jira/browse/HBASE-24165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang reopened HBASE-24165:


Reopen as this introduces a findbugs warning.
|Result of integer multiplication cast to long in new 
org.apache.hadoop.hbase.io.ByteBufferPool(int, int, boolean) At 
ByteBufferPool.java:[line 84]|
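
For readers unfamiliar with this class of warning, here is a minimal sketch of what it guards against. It assumes the flagged expression multiplies two int values and only then widens the result to long; the class and method names below are hypothetical and are not taken from ByteBufferPool.

```java
final class IntMultiplyCastSketch {

  // The multiplication happens in int arithmetic and can overflow
  // before the result is widened to long.
  static long multiplyAsInt(int a, int b) {
    return a * b;
  }

  // Widening one operand first makes the whole multiplication use long arithmetic.
  static long multiplyAsLong(int a, int b) {
    return (long) a * b;
  }

  public static void main(String[] args) {
    int bufferSize = 64 * 1024; // 65536
    int count = 64 * 1024;      // 65536
    System.out.println(multiplyAsInt(bufferSize, count));  // 0
    System.out.println(multiplyAsLong(bufferSize, count)); // 4294967296
  }
}
```

The usual remedy is to cast one operand to long before multiplying, as multiplyAsLong does.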

> maxPoolSize is logged incorrectly in ByteBufferPool
> ---
>
> Key: HBASE-24165
> URL: https://issues.apache.org/jira/browse/HBASE-24165
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.2.4
>Reporter: Pankaj Kumar
>Assignee: Pankaj Kumar
>Priority: Minor
> Fix For: 2.2.5
>
>
> In ByteBufferPool _maxPoolSize_ is converted into byte format,
> https://github.com/apache/hbase/blob/a521a80c4b9a8b0749c368d1ff66fea2ed2d77a2/hbase-common/src/main/java/org/apache/hadoop/hbase/io/ByteBufferPool.java#L85
>  
> Currently maxPoolSize is logged as below,
> 2020-04-10 14:20:56,000 INFO  [Time-limited test] io.ByteBufferPool(83): 
> Created with bufferSize=64 KB and maxPoolSize=320 B



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Reopened] (HBASE-24080) [flakey test] TestRegionReplicaFailover.testSecondaryRegionKill fails.

2020-05-13 Thread Guanghao Zhang (Jira)



 [ 
https://issues.apache.org/jira/browse/HBASE-24080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang reopened HBASE-24080:


Reopen to backport this to branch-2.2.

> [flakey test] TestRegionReplicaFailover.testSecondaryRegionKill fails.
> --
>
> Key: HBASE-24080
> URL: https://issues.apache.org/jira/browse/HBASE-24080
> Project: HBase
>  Issue Type: Test
>  Components: read replicas
>Affects Versions: 3.0.0-alpha-1, 2.3.0
>Reporter: Huaxiang Sun
>Assignee: Huaxiang Sun
>Priority: Major
> Fix For: 3.0.0-alpha-1, 2.3.0
>
>
> Ran into the following error locally:
> {code:java}
> ---
> Test set: org.apache.hadoop.hbase.regionserver.TestRegionReplicaFailover
> ---
> Tests run: 6, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 97.391 s <<< 
> FAILURE! - in org.apache.hadoop.hbase.regionserver.TestRegionReplicaFailover
> org.apache.hadoop.hbase.regionserver.TestRegionReplicaFailover.testSecondaryRegionKill
>   Time elapsed: 28.682 s  <<< FAILURE!
> java.lang.AssertionError: Failed verification of row :0
>         at org.junit.Assert.fail(Assert.java:89)
>         at org.junit.Assert.assertTrue(Assert.java:42)
>         at 
> org.apache.hadoop.hbase.HBaseTestingUtility.verifyNumericRows(HBaseTestingUtility.java:2407)
>         at 
> org.apache.hadoop.hbase.regionserver.TestRegionReplicaFailover.testSecondaryRegionKill(TestRegionReplicaFailover.java:240)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>         at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:498)
>         at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>         at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>         at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>         at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>         at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>         at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>         at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:61)
>         at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
>         at 
> org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
>         at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
>         at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
>         at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
>         at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
>         at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
>         at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
>         at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
>         at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
>         at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:288)
>         at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:282)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>         at java.lang.Thread.run(Thread.java:748) {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Created] (HBASE-24022) Set version as 2.2.5-SNAPSHOT in branch-2.2

2020-03-20 Thread Guanghao Zhang (Jira)

Guanghao Zhang created HBASE-24022:
--

 Summary: Set version as 2.2.5-SNAPSHOT in branch-2.2
 Key: HBASE-24022
 URL: https://issues.apache.org/jira/browse/HBASE-24022
 Project: HBase
  Issue Type: Sub-task
Reporter: Guanghao Zhang






--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Created] (HBASE-24023) Add 2.2.4 to download page

2020-03-20 Thread Guanghao Zhang (Jira)

Guanghao Zhang created HBASE-24023:
--

 Summary: Add 2.2.4 to download page
 Key: HBASE-24023
 URL: https://issues.apache.org/jira/browse/HBASE-24023
 Project: HBase
  Issue Type: Sub-task
Reporter: Guanghao Zhang






--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Resolved] (HBASE-24022) Set version as 2.2.5-SNAPSHOT in branch-2.2

2020-03-20 Thread Guanghao Zhang (Jira)



 [ 
https://issues.apache.org/jira/browse/HBASE-24022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang resolved HBASE-24022.

Fix Version/s: 2.2.5
 Assignee: Guanghao Zhang
   Resolution: Fixed

> Set version as 2.2.5-SNAPSHOT in branch-2.2
> ---
>
> Key: HBASE-24022
> URL: https://issues.apache.org/jira/browse/HBASE-24022
> Project: HBase
>  Issue Type: Sub-task
>    Reporter: Guanghao Zhang
>        Assignee: Guanghao Zhang
>Priority: Major
> Fix For: 2.2.5
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Resolved] (HBASE-24023) Add 2.2.4 to download page

2020-03-20 Thread Guanghao Zhang (Jira)



 [ 
https://issues.apache.org/jira/browse/HBASE-24023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang resolved HBASE-24023.

Resolution: Fixed

> Add 2.2.4 to download page
> --
>
> Key: HBASE-24023
> URL: https://issues.apache.org/jira/browse/HBASE-24023
> Project: HBase
>  Issue Type: Sub-task
>    Reporter: Guanghao Zhang
>        Assignee: Guanghao Zhang
>Priority: Major
> Fix For: 3.0.0
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Resolved] (HBASE-23922) Release 2.2.4

2020-03-20 Thread Guanghao Zhang (Jira)



 [ 
https://issues.apache.org/jira/browse/HBASE-23922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang resolved HBASE-23922.

Fix Version/s: 2.2.4
   Resolution: Fixed

> Release 2.2.4
> -
>
> Key: HBASE-23922
> URL: https://issues.apache.org/jira/browse/HBASE-23922
> Project: HBase
>  Issue Type: Umbrella
>    Reporter: Guanghao Zhang
>        Assignee: Guanghao Zhang
>Priority: Major
> Fix For: 2.2.4
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Created] (HBASE-24033) Add ut for loading the corrupt recovered hfiles

2020-03-22 Thread Guanghao Zhang (Jira)

Guanghao Zhang created HBASE-24033:
--

 Summary: Add ut for loading the corrupt recovered hfiles
 Key: HBASE-24033
 URL: https://issues.apache.org/jira/browse/HBASE-24033
 Project: HBase
  Issue Type: Sub-task
Reporter: Guanghao Zhang






--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Resolved] (HBASE-24033) Add ut for loading the corrupt recovered hfiles

2020-03-22 Thread Guanghao Zhang (Jira)



 [ 
https://issues.apache.org/jira/browse/HBASE-24033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang resolved HBASE-24033.

Resolution: Fixed

Pushed to branch-2.3+. Thanks [~zhangduo] for reviewing.

> Add ut for loading the corrupt recovered hfiles
> ---
>
> Key: HBASE-24033
> URL: https://issues.apache.org/jira/browse/HBASE-24033
> Project: HBase
>  Issue Type: Sub-task
>    Reporter: Guanghao Zhang
>        Assignee: Guanghao Zhang
>Priority: Major
> Fix For: 3.0.0, 2.3.0, 2.4.0
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Resolved] (HBASE-23633) Find a way to handle the corrupt recovered hfiles

2020-03-22 Thread Guanghao Zhang (Jira)



 [ 
https://issues.apache.org/jira/browse/HBASE-23633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang resolved HBASE-23633.

Resolution: Fixed

> Find a way to handle the corrupt recovered hfiles
> -
>
> Key: HBASE-23633
> URL: https://issues.apache.org/jira/browse/HBASE-23633
> Project: HBase
>  Issue Type: Bug
>  Components: MTTR, wal
>Affects Versions: 3.0.0, 2.3.0
>    Reporter: Guanghao Zhang
>Assignee: Pankaj Kumar
>Priority: Critical
> Fix For: 3.0.0, 2.3.0, 2.4.0
>
>
> Copy the comment from PR review.
>  
> If the file is a corrupt HFile, an exception will be thrown here, which will 
> cause the region to fail to open.
> Maybe we can add a new parameter to control whether to skip the exception, 
> similar to recover edits which has a parameter 
> "hbase.hregion.edits.replay.skip.errors";
>  
> Regions that can't be opened because of detached References or corrupt hfiles 
> are a fact of life. We need to work on this issue. This will be a new variant 
> of the problem, i.e. bad recovered hfiles.
> On adding a config to ignore bad files and just open, that's a bit dangerous 
> as per @infraio, as it could mean silent data loss.
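
The trade-off discussed above can be sketched as follows. This is a hypothetical illustration, not the fix that was committed: the skipErrors flag, the Result type and the ".corrupt" suffix check are all assumptions made for the example, and a real check would try to open the HFile instead.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class SkipCorruptFilesSketch {

  /** Hypothetical outcome: which files were loaded and which were skipped as corrupt. */
  static final class Result {
    final List<String> loaded = new ArrayList<>();
    final List<String> skipped = new ArrayList<>();
  }

  // Hypothetical corruption check, driven only by the file name for this sketch.
  static void validate(String file) throws IOException {
    if (file.endsWith(".corrupt")) {
      throw new IOException("Corrupt recovered hfile: " + file);
    }
  }

  /**
   * With skipErrors=false the first corrupt file aborts the load (the region fails to open).
   * With skipErrors=true corrupt files are recorded and skipped, which risks silent data loss
   * unless the skipped list is surfaced to an operator.
   */
  static Result load(List<String> files, boolean skipErrors) throws IOException {
    Result result = new Result();
    for (String file : files) {
      try {
        validate(file);
        result.loaded.add(file);
      } catch (IOException e) {
        if (!skipErrors) {
          throw e;
        }
        result.skipped.add(file);
      }
    }
    return result;
  }

  public static void main(String[] args) throws IOException {
    Result r = load(Arrays.asList("a.hfile", "b.corrupt"), true);
    System.out.println(r.loaded + " " + r.skipped); // [a.hfile] [b.corrupt]
  }
}
```

Keeping the skipped files in an explicit list, rather than only logging them, is one way to make the loss visible.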



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Resolved] (HBASE-23741) Data loss when WAL split to HFile enabled

2020-03-23 Thread Guanghao Zhang (Jira)



 [ 
https://issues.apache.org/jira/browse/HBASE-23741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang resolved HBASE-23741.

Resolution: Fixed

Pushed to branch-2.3+. Thanks [~zhangduo] for reviewing.

> Data loss when WAL split to HFile enabled
> -
>
> Key: HBASE-23741
> URL: https://issues.apache.org/jira/browse/HBASE-23741
> Project: HBase
>  Issue Type: Bug
>  Components: MTTR
>Affects Versions: 3.0.0, 2.3.0
>Reporter: Pankaj Kumar
>    Assignee: Guanghao Zhang
>Priority: Blocker
> Fix For: 3.0.0, 2.3.0, 2.4.0
>
>
> Very simple steps as below,
> 1. Create table with 1 region
> 2. Insert 1 record 
> 3. Flush the table 
> 4. Scan table and observe timestamp of the inserted row
> 5. Insert same row key with same timestamp as previously inserted but with 
> different value
> 6. Kill -9 RS where table region is online
> 7. Start RS
> Scan the table and check the result; the latest cell must be returned.
> Thanks [~sreenivasulureddy] for finding this issue.
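
The expectation in the steps above ("latest cell must be returned") can be modeled in a few lines. This is a simplified sketch, not HBase code: it assumes that two cells with the same row and timestamp are ordered by a write sequence number, and the Cell class and its sequenceId field are hypothetical names used only in this model.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

final class SameTimestampSketch {

  // Hypothetical cell model: sequenceId grows with each write.
  static final class Cell {
    final long timestamp;
    final long sequenceId;
    final String value;

    Cell(long timestamp, long sequenceId, String value) {
      this.timestamp = timestamp;
      this.sequenceId = sequenceId;
      this.value = value;
    }
  }

  // Newest timestamp first; ties are broken by the higher sequence id.
  static final Comparator<Cell> NEWEST_FIRST = (a, b) -> {
    int byTimestamp = Long.compare(b.timestamp, a.timestamp);
    return byTimestamp != 0 ? byTimestamp : Long.compare(b.sequenceId, a.sequenceId);
  };

  /** Returns the version a reader should see among cells of the same row and column. */
  static Cell visible(List<Cell> versions) {
    return Collections.min(versions, NEWEST_FIRST);
  }

  public static void main(String[] args) {
    Cell flushed = new Cell(100L, 1L, "v1");   // step 2, flushed in step 3
    Cell recovered = new Cell(100L, 2L, "v2"); // step 5, recovered after the kill
    System.out.println(visible(Arrays.asList(flushed, recovered)).value); // v2
  }
}
```

If the recovery path drops or resets the ordering information for the recovered cell, the older value can win the tie, which matches the symptom reported here.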



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Created] (HBASE-24037) Add ut for root dir and wal root dir are different

2020-03-23 Thread Guanghao Zhang (Jira)

Guanghao Zhang created HBASE-24037:
--

 Summary: Add ut for root dir and wal root dir are different
 Key: HBASE-24037
 URL: https://issues.apache.org/jira/browse/HBASE-24037
 Project: HBase
  Issue Type: Sub-task
Reporter: Guanghao Zhang






--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Created] (HBASE-24344) Release 2.2.5

2020-05-07 Thread Guanghao Zhang (Jira)

Guanghao Zhang created HBASE-24344:
--

 Summary: Release 2.2.5
 Key: HBASE-24344
 URL: https://issues.apache.org/jira/browse/HBASE-24344
 Project: HBase
  Issue Type: Umbrella
Reporter: Guanghao Zhang






--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Created] (HBASE-24411) Set version to 2.2.5 in branch-2.2 for first RC of 2.2.5

2020-05-21 Thread Guanghao Zhang (Jira)

Guanghao Zhang created HBASE-24411:
--

 Summary: Set version to 2.2.5 in branch-2.2 for first RC of 2.2.5
 Key: HBASE-24411
 URL: https://issues.apache.org/jira/browse/HBASE-24411
 Project: HBase
  Issue Type: Sub-task
Affects Versions: 2.2.5
Reporter: Guanghao Zhang






--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Created] (HBASE-24410) Generate CHANGES.md and RELEASENOTES.md for 2.2.5

2020-05-21 Thread Guanghao Zhang (Jira)

Guanghao Zhang created HBASE-24410:
--

 Summary: Generate CHANGES.md and RELEASENOTES.md for 2.2.5
 Key: HBASE-24410
 URL: https://issues.apache.org/jira/browse/HBASE-24410
 Project: HBase
  Issue Type: Sub-task
Affects Versions: 2.2.5
Reporter: Guanghao Zhang






--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Resolved] (HBASE-24410) Generate CHANGES.md and RELEASENOTES.md for 2.2.5

2020-05-21 Thread Guanghao Zhang (Jira)



 [ 
https://issues.apache.org/jira/browse/HBASE-24410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang resolved HBASE-24410.

Fix Version/s: 2.2.5
   Resolution: Fixed

> Generate CHANGES.md and RELEASENOTES.md for 2.2.5
> -
>
> Key: HBASE-24410
> URL: https://issues.apache.org/jira/browse/HBASE-24410
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.2.5
>        Reporter: Guanghao Zhang
>Priority: Major
> Fix For: 2.2.5
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Reopened] (HBASE-23771) [Flakey Tests] Test TestSplitTransactionOnCluster Again

2020-05-22 Thread Guanghao Zhang (Jira)



 [ 
https://issues.apache.org/jira/browse/HBASE-23771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang reopened HBASE-23771:


Reopen for backport to branch-2.2.

> [Flakey Tests] Test TestSplitTransactionOnCluster Again
> ---
>
> Key: HBASE-23771
> URL: https://issues.apache.org/jira/browse/HBASE-23771
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Michael Stack
>Assignee: Michael Stack
>Priority: Major
> Fix For: 3.0.0-alpha-1, 2.3.0
>
> Attachments: 
> 0001-HBASE-23771-Flakey-Tests-Test-TestSplitTransactionOn.patch, Screen Shot 
> 2020-01-31 at 8.37.13 AM.png
>
>
> Parent fix had the test failures in GCE go from 35% to 4%. Let me see if I 
> can clear the remaining failures.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Resolved] (HBASE-23771) [Flakey Tests] Test TestSplitTransactionOnCluster Again

2020-05-22 Thread Guanghao Zhang (Jira)



 [ 
https://issues.apache.org/jira/browse/HBASE-23771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang resolved HBASE-23771.

Fix Version/s: 2.2.6
   Resolution: Fixed

Pushed to branch-2.2.

> [Flakey Tests] Test TestSplitTransactionOnCluster Again
> ---
>
> Key: HBASE-23771
> URL: https://issues.apache.org/jira/browse/HBASE-23771
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Michael Stack
>Assignee: Michael Stack
>Priority: Major
> Fix For: 3.0.0-alpha-1, 2.3.0, 2.2.6
>
> Attachments: 
> 0001-HBASE-23771-Flakey-Tests-Test-TestSplitTransactionOn.patch, Screen Shot 
> 2020-01-31 at 8.37.13 AM.png
>
>
> Parent fix had the test failures in GCE go from 35% to 4%. Let me see if I 
> can clear the remaining failures.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Reopened] (HBASE-24410) Generate CHANGES.md and RELEASENOTES.md for 2.2.5

2020-05-21 Thread Guanghao Zhang (Jira)



 [ 
https://issues.apache.org/jira/browse/HBASE-24410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang reopened HBASE-24410:


> Generate CHANGES.md and RELEASENOTES.md for 2.2.5
> -
>
> Key: HBASE-24410
> URL: https://issues.apache.org/jira/browse/HBASE-24410
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.2.5
>        Reporter: Guanghao Zhang
>    Assignee: Guanghao Zhang
>Priority: Major
> Fix For: 2.2.5
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Reopened] (HBASE-24115) Relocate test-only REST "client" from src/ to test/ and mark Private

2020-05-21 Thread Guanghao Zhang (Jira)



 [ 
https://issues.apache.org/jira/browse/HBASE-24115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang reopened HBASE-24115:


Reopen to add release note.

> Relocate test-only REST "client" from src/ to test/ and mark Private
> 
>
> Key: HBASE-24115
> URL: https://issues.apache.org/jira/browse/HBASE-24115
> Project: HBase
>  Issue Type: Test
>  Components: REST, security
>Reporter: Andrew Kyle Purtell
>Assignee: Andrew Kyle Purtell
>Priority: Major
> Fix For: 3.0.0-alpha-1, 2.3.0, 1.3.7, 1.7.0, 2.4.0, 2.1.10, 
> 1.4.14, 2.2.5
>
>
> Relocate test-only REST "client" from src/ to test/ and annotate as Private. 
> The classes o.a.h.h.rest.Remote* were developed to facilitate REST unit tests 
> and incorrectly committed to src/ . 
> Although this "breaks" compatibility by moving public classes to test jar and 
> marking them private, no attention has been paid to these classes with 
> respect to performance, convenience, or security. Consensus from various 
> discussions over the years is to move them to test/ as was intent of the 
> original committer, but misplaced by the same individual. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Resolved] (HBASE-24115) Relocate test-only REST "client" from src/ to test/ and mark Private

2020-05-21 Thread Guanghao Zhang (Jira)



 [ 
https://issues.apache.org/jira/browse/HBASE-24115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang resolved HBASE-24115.

Resolution: Fixed

> Relocate test-only REST "client" from src/ to test/ and mark Private
> 
>
> Key: HBASE-24115
> URL: https://issues.apache.org/jira/browse/HBASE-24115
> Project: HBase
>  Issue Type: Test
>  Components: REST, security
>Reporter: Andrew Kyle Purtell
>Assignee: Andrew Kyle Purtell
>Priority: Major
> Fix For: 3.0.0-alpha-1, 2.3.0, 1.3.7, 1.7.0, 2.4.0, 2.1.10, 
> 1.4.14, 2.2.5
>
>
> Relocate test-only REST "client" from src/ to test/ and annotate as Private. 
> The classes o.a.h.h.rest.Remote* were developed to facilitate REST unit tests 
> and incorrectly committed to src/ . 
> Although this "breaks" compatibility by moving public classes to test jar and 
> marking them private, no attention has been paid to these classes with 
> respect to performance, convenience, or security. Consensus from various 
> discussions over the years is to move them to test/ as was intent of the 
> original committer, but misplaced by the same individual. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Resolved] (HBASE-24980) Fix dead links in HBase book

2020-09-03 Thread Guanghao Zhang (Jira)



 [ 
https://issues.apache.org/jira/browse/HBASE-24980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang resolved HBASE-24980.

Fix Version/s: (was: 2.3.2)
   3.0.0-alpha-1
   Resolution: Fixed

Pushed to master branch. Thanks [~echohlne] for contributing.

> Fix dead links in HBase book
> 
>
> Key: HBASE-24980
> URL: https://issues.apache.org/jira/browse/HBASE-24980
> Project: HBase
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 2.3.0
>Reporter: echohlne
>Assignee: echohlne
>Priority: Major
> Fix For: 3.0.0-alpha-1
>
>
> 1. 
> -[https://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/io/file/tfile/TFile.html]-
>  => 
> [https://hadoop.apache.org/docs/current/api/org/apache/hadoop/io/file/tfile/TFile.html]
> 2. -[https://vimeo.com/26804675]- => 
> [https://www.youtube.com/watch?v=DdGKAorSSZ0]
> 3. 
> -[http://www.cloudera.com/videos/hw10_video_how_stumbleupon_built_and_advertising_platform_using_hbase_and_hadoop]-
>  is no longer valid and cannot be found on any other website, so just remove it.
> 4. 
> -[https://hadoop.apache.org/core/docs/stable/api/org/apache/hadoop/metrics/package-summary.html]-
>  => 
> [https://hadoop.apache.org/docs/stable/api/org/apache/hadoop/metrics2/package-summary.html]




[jira] [Resolved] (HBASE-24656) [Flakey Tests] branch-2 TestMasterNoCluster.testStopDuringStart

2020-09-03 Thread Guanghao Zhang (Jira)



 [ 
https://issues.apache.org/jira/browse/HBASE-24656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang resolved HBASE-24656.

Fix Version/s: 2.2.6
   Resolution: Fixed

Cherry-picked to branch-2.2.

> [Flakey Tests] branch-2 TestMasterNoCluster.testStopDuringStart
> ---
>
> Key: HBASE-24656
> URL: https://issues.apache.org/jira/browse/HBASE-24656
> Project: HBase
>  Issue Type: Bug
>Reporter: Michael Stack
>Assignee: Michael Stack
>Priority: Major
> Fix For: 2.2.6, 2.3.0
>
>
> org.apache.hadoop.hbase.master.TestMasterNoCluster.testStopDuringStart is 
> (only) flakey on branch-2 currently. Fails here:
> Error Message
> KeeperErrorCode = Directory not empty for /hbase/backup-masters
> Stacktrace
> org.apache.zookeeper.KeeperException$NotEmptyException: KeeperErrorCode = 
> Directory not empty for /hbase/backup-masters
>   at 
> org.apache.hadoop.hbase.master.TestMasterNoCluster.tearDown(TestMasterNoCluster.java:121)
> I can see the zk events in teardown as we purge children as part of cleanup. 
> I can also see that the backup master registers later. Other than that, the log is 
> opaque on why the teardown is failing. This is just cleanup, so I am adding a 
> retry to see if that helps.
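
The retry mentioned above can be sketched generically. This is a minimal illustration and not the actual change made to TestMasterNoCluster: the class `TeardownRetrySketch`, the `Action` interface, and `runWithRetries` are hypothetical names, and the attempt limit is an assumed parameter.

```java
class TeardownRetrySketch {
    /** Hypothetical stand-in for a cleanup step that may fail transiently. */
    interface Action {
        void run() throws Exception;
    }

    /**
     * Runs the action until it succeeds or maxAttempts is reached.
     * Returns the number of the attempt that succeeded; otherwise throws
     * IllegalStateException wrapping the last failure.
     */
    static int runWithRetries(Action action, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                action.run();
                return attempt;
            } catch (Exception e) {
                last = e; // remember the failure and try again
            }
        }
        throw new IllegalStateException(
            "cleanup failed after " + maxAttempts + " attempts", last);
    }
}
```

A real test would probably also pause between attempts so that late registrations, such as the backup master noted above, have time to settle.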




[jira] [Reopened] (HBASE-24656) [Flakey Tests] branch-2 TestMasterNoCluster.testStopDuringStart

2020-09-03 Thread Guanghao Zhang (Jira)



 [ 
https://issues.apache.org/jira/browse/HBASE-24656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang reopened HBASE-24656:


Reopen for branch-2.2.

> [Flakey Tests] branch-2 TestMasterNoCluster.testStopDuringStart
> ---
>
> Key: HBASE-24656
> URL: https://issues.apache.org/jira/browse/HBASE-24656
> Project: HBase
>  Issue Type: Bug
>Reporter: Michael Stack
>Assignee: Michael Stack
>Priority: Major
> Fix For: 2.3.0
>
>
> org.apache.hadoop.hbase.master.TestMasterNoCluster.testStopDuringStart is 
> (only) flakey on branch-2 currently. Fails here:
> Error Message
> KeeperErrorCode = Directory not empty for /hbase/backup-masters
> Stacktrace
> org.apache.zookeeper.KeeperException$NotEmptyException: KeeperErrorCode = 
> Directory not empty for /hbase/backup-masters
>   at 
> org.apache.hadoop.hbase.master.TestMasterNoCluster.tearDown(TestMasterNoCluster.java:121)
> I can see the zk events in teardown as we purge children as part of cleanup. 
> I can also see that the backup master registers later. Other than that, the log is 
> opaque on why the teardown is failing. This is just cleanup, so I am adding a 
> retry to see if that helps.




[jira] [Resolved] (HBASE-24831) Avoid invoke Counter using reflection in SnapshotInputFormat

2020-09-02 Thread Guanghao Zhang (Jira)



 [ 
https://issues.apache.org/jira/browse/HBASE-24831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang resolved HBASE-24831.

Fix Version/s: 2.3.2
   2.4.0
   3.0.0-alpha-1
   Resolution: Fixed

Pushed to branch-2.3+. Thanks [~chenyechao] for contributing.

> Avoid invoke Counter using reflection  in SnapshotInputFormat
> -
>
> Key: HBASE-24831
> URL: https://issues.apache.org/jira/browse/HBASE-24831
> Project: HBase
>  Issue Type: Improvement
>Reporter: Yechao Chen
>Assignee: Yechao Chen
>Priority: Major
>  Labels: Performance, mapreduce, snapshot
> Fix For: 3.0.0-alpha-1, 2.4.0, 2.3.2
>
>
> In TableRecordReaderImpl we invoke the Counter increment by reflection.
> This is called from nextKeyValue() in TableSnapshotInputFormat.
> A reflective invoke is much slower than a normal method call,
> so we can avoid it to improve read performance.
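
To illustrate the difference described above, here is a small sketch that increments a counter once through `java.lang.reflect.Method` and once through a direct call. The `Counter` class is a hypothetical stand-in, not Hadoop's `Counter`, and the sketch only shows that both paths have the same effect; it does not measure performance.

```java
import java.lang.reflect.Method;

class CounterReflectionSketch {
    /** Hypothetical stand-in for a MapReduce-style counter. */
    public static class Counter {
        private long value;

        public void increment(long n) {
            value += n;
        }

        public long getValue() {
            return value;
        }
    }

    /** Looks the method up by name and invokes it reflectively on every call. */
    static void incrementViaReflection(Counter counter, long n) {
        try {
            Method m = counter.getClass().getMethod("increment", long.class);
            m.invoke(counter, n);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("reflective increment failed", e);
        }
    }

    /** Plain method call: no lookup, no argument boxing. */
    static void incrementDirectly(Counter counter, long n) {
        counter.increment(n);
    }
}
```

The reflective path pays for a method lookup and argument boxing on each record, which is the kind of per-row overhead the issue aims to avoid.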




[jira] [Resolved] (HBASE-24973) Remove read point parameter in method StoreFlush#performFlush and StoreFlush#createScanner

2020-09-02 Thread Guanghao Zhang (Jira)



 [ 
https://issues.apache.org/jira/browse/HBASE-24973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang resolved HBASE-24973.

Fix Version/s: 2.4.0
   3.0.0-alpha-1
   Resolution: Fixed

Pushed to branch-2+. Thanks [~yuqi] for contributing.

> Remove read point parameter in method StoreFlush#performFlush and 
> StoreFlush#createScanner
> --
>
> Key: HBASE-24973
> URL: https://issues.apache.org/jira/browse/HBASE-24973
> Project: HBase
>  Issue Type: Improvement
>Reporter: yuqi
>Assignee: yuqi
>Priority: Minor
> Fix For: 3.0.0-alpha-1, 2.4.0
>
>
> Currently, the read point parameter in method StoreFlush#performFlush is unused 
> and can be safely removed.
> Method StoreFlush#createScanner can then drop this parameter as well.
> See below: 
> {code:java}
>   /**
>    * Performs memstore flush, writing data from scanner into sink.
>    * @param scanner Scanner to get data from.
>    * @param sink Sink to write data to. Could be StoreFile.Writer.
>    * @param smallestReadPoint Smallest read point used for the flush.
>    * @param throughputController A controller to avoid flush too fast
>    */
>   protected void performFlush(InternalScanner scanner, CellSink sink,
>       long smallestReadPoint, ThroughputController throughputController)
>       throws IOException
> {code}
> The parameter smallestReadPoint is not used in this method. Once 
> `smallestReadPoint` is removed, the inner method `createScanner` can drop this 
> unnecessary parameter too.




[jira] [Resolved] (HBASE-24760) Add a config hbase.rsgroup.fallback.enable for RSGroup fallback feature

2020-08-30 Thread Guanghao Zhang (Jira)



 [ 
https://issues.apache.org/jira/browse/HBASE-24760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang resolved HBASE-24760.

Fix Version/s: 2.4.0
   Resolution: Fixed

 

Pushed to branch-2+. Thanks [~Ddupg] for contributing.

> Add a config hbase.rsgroup.fallback.enable for RSGroup fallback feature
> ---
>
> Key: HBASE-24760
> URL: https://issues.apache.org/jira/browse/HBASE-24760
> Project: HBase
>  Issue Type: New Feature
>  Components: rsgroup
>Affects Versions: 3.0.0-alpha-1
>Reporter: Sun Xin
>Assignee: Sun Xin
>Priority: Major
> Fix For: 3.0.0-alpha-1, 2.4.0
>
>
> In HBASE-22738 we allowed tables to fall back to specific rsgroups if there are no 
> online servers in the table's rsgroup.
> -But for system tables, if there is no specified fallback rsgroup or the 
> servers in the fallback rsgroup have all gone down, it is necessary to allow 
> system tables to fall back to any rsgroup in order to keep them available at all 
> times.-
> For availability, the design of rsgroup fallback was refactored, and in the end only 
> one config property, `hbase.rsgroup.fallback.enable`, was introduced. It allows all 
> tables, whether or not they are system tables, to fall back to the default rsgroup first, 
> and then to any group if there are no online servers in the default rsgroup.
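
The fallback order described above can be sketched as follows. This is an illustration only, not HBase's rsgroup code: the class and method names are hypothetical, the group name "default" is an assumption, and the alphabetical choice among the remaining groups is just a way to keep the example deterministic.

```java
import java.util.Map;
import java.util.Optional;
import java.util.TreeMap;

class RSGroupFallbackSketch {
    /** Assumed name of the default rsgroup. */
    static final String DEFAULT_GROUP = "default";

    /**
     * Picks the group that should host a table's regions.
     * onlineServers maps a group name to its number of online servers.
     */
    static Optional<String> chooseGroup(String tableGroup,
                                        Map<String, Integer> onlineServers,
                                        boolean fallbackEnabled) {
        // 1. The table's own group, if it still has online servers.
        if (onlineServers.getOrDefault(tableGroup, 0) > 0) {
            return Optional.of(tableGroup);
        }
        // Without the fallback switch there is nowhere else to go.
        if (!fallbackEnabled) {
            return Optional.empty();
        }
        // 2. The default group first.
        if (onlineServers.getOrDefault(DEFAULT_GROUP, 0) > 0) {
            return Optional.of(DEFAULT_GROUP);
        }
        // 3. Any other group with online servers (sorted for determinism).
        for (Map.Entry<String, Integer> e : new TreeMap<>(onlineServers).entrySet()) {
            if (e.getValue() > 0) {
                return Optional.of(e.getKey());
            }
        }
        return Optional.empty();
    }
}
```

In this sketch the `fallbackEnabled` flag plays the role that `hbase.rsgroup.fallback.enable` plays in the description.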




[jira] [Resolved] (HBASE-24913) Refactor TestJMXConnectorServer

2020-08-30 Thread Guanghao Zhang (Jira)



 [ 
https://issues.apache.org/jira/browse/HBASE-24913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang resolved HBASE-24913.

Fix Version/s: 2.3.2
   2.2.7
   Resolution: Fixed

Pushed to branch-2.2+. Thanks [~Ddupg] for contributing.

> Refactor TestJMXConnectorServer
> ---
>
> Key: HBASE-24913
> URL: https://issues.apache.org/jira/browse/HBASE-24913
> Project: HBase
>  Issue Type: Improvement
>  Components: test
>Affects Versions: 3.0.0-alpha-1
>Reporter: Sun Xin
>Assignee: Sun Xin
>Priority: Major
> Fix For: 3.0.0-alpha-1, 2.2.7, 2.3.2
>
>
> Two optimization points for TestJMXConnectorServer in this issue:
>  # Just run cluster once, not once per test case.
>  # Use random free port to run ConnectorServer, avoid specifying a fixed port.
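
One common way to obtain a random free port, as the second point suggests, is to bind a `ServerSocket` to port 0 and read back the port the operating system assigned. The sketch below is an assumption about how this could be done and is not taken from the actual test; `FreePortSketch` and `findFreePort` are hypothetical names.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.ServerSocket;

class FreePortSketch {
    /**
     * Asks the OS for an ephemeral port by binding to port 0, then releases it.
     * Another process could grab the port before the caller binds to it again,
     * so callers should be ready to retry.
     */
    static int findFreePort() {
        try (ServerSocket socket = new ServerSocket(0)) {
            return socket.getLocalPort();
        } catch (IOException e) {
            throw new UncheckedIOException("could not allocate a free port", e);
        }
    }
}
```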




[jira] [Created] (HBASE-24998) Introduce a ReplicationSourceOverallController interface and decouple ReplicationSourceManager and ReplicationSource

2020-09-08 Thread Guanghao Zhang (Jira)

Guanghao Zhang created HBASE-24998:
--

 Summary: Introduce a  ReplicationSourceOverallController interface 
and decouple ReplicationSourceManager and ReplicationSource
 Key: HBASE-24998
 URL: https://issues.apache.org/jira/browse/HBASE-24998
 Project: HBase
  Issue Type: Sub-task
Reporter: Guanghao Zhang







[jira] [Created] (HBASE-25035) Add 2.2.6 to download page

2020-09-16 Thread Guanghao Zhang (Jira)

Guanghao Zhang created HBASE-25035:
--

 Summary: Add 2.2.6 to download page
 Key: HBASE-25035
 URL: https://issues.apache.org/jira/browse/HBASE-25035
 Project: HBase
  Issue Type: Sub-task
Reporter: Guanghao Zhang







[jira] [Created] (HBASE-25036) Set version as 2.2.7-SNAPSHOT in branch-2.2

2020-09-16 Thread Guanghao Zhang (Jira)

Guanghao Zhang created HBASE-25036:
--

 Summary: Set version as 2.2.7-SNAPSHOT in branch-2.2
 Key: HBASE-25036
 URL: https://issues.apache.org/jira/browse/HBASE-25036
 Project: HBase
  Issue Type: Sub-task
Reporter: Guanghao Zhang







[jira] [Resolved] (HBASE-25014) ScheduledChore is never triggered when initalDelay > 1.5*period

2020-09-16 Thread Guanghao Zhang (Jira)



 [ 
https://issues.apache.org/jira/browse/HBASE-25014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang resolved HBASE-25014.

Fix Version/s: 2.2.7
   2.4.0
   2.3.3
   Resolution: Fixed

> ScheduledChore is never triggered when initalDelay > 1.5*period
> ---
>
> Key: HBASE-25014
> URL: https://issues.apache.org/jira/browse/HBASE-25014
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 3.0.0-alpha-1, 2.2.3, 2.2.4, 2.2.5
>Reporter: Sun Xin
>Assignee: Sun Xin
>Priority: Major
> Fix For: 3.0.0-alpha-1, 2.3.3, 2.4.0, 2.2.7
>
>
> In our recent tests, ScheduledChore is never triggered when initialDelay > 
> 1.5 * period.
> The cause of the bug is the following:
> The trigger time for a ScheduledChore must be within an acceptable time window 
> of 1.5 * period. See 
> [here|https://github.com/apache/hbase/blob/e5ca9adc54f9f580f85d21d38217afa97aa79d68/hbase-common/src/main/java/org/apache/hadoop/hbase/ScheduledChore.java#L234]
> timeOfLastRun and timeOfThisRun are two variables that record two adjacent 
> trigger times. [The first initialization of 
> timeOfThisRun|https://github.com/apache/hbase/blob/e5ca9adc54f9f580f85d21d38217afa97aa79d68/hbase-common/src/main/java/org/apache/hadoop/hbase/ScheduledChore.java#L273]
>  happens when the ScheduledChore is created, so it is not a real trigger time.
> If we set initialDelay > 1.5 * period, then after initialDelay the first trigger of the 
> chore has already exceeded the allowed window. We then [cancel the chore 
> and schedule it 
> again|https://github.com/apache/hbase/blob/e5ca9adc54f9f580f85d21d38217afa97aa79d68/hbase-common/src/main/java/org/apache/hadoop/hbase/ChoreService.java#L176].
> So it is stuck in a loop when initialDelay > 1.5 * period:
> 1. init timeOfThisRun at the wrong time.
> 2. wait initialDelay.
> 3. the chore triggers, but has exceeded the allowed window.
> 4. cancel the chore and schedule it again.
> 5. go to step 1.
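
The loop above can be reproduced with a toy model. This is a deliberately simplified sketch and not the ScheduledChore or ChoreService code: `ChoreWindowSketch`, `isOnTime`, and `successfulRuns` are hypothetical names, time is a plain counter in arbitrary units, and only the upper bound of the window (1.5 * period) is modelled.

```java
class ChoreWindowSketch {
    /** A trigger is on time if it lands within 1.5 * period of the previous one. */
    static boolean isOnTime(long timeOfLastRun, long timeOfThisRun, long period) {
        return (timeOfThisRun - timeOfLastRun) <= (long) (1.5 * period);
    }

    /**
     * Simulates scheduling attempts and counts how many of them actually
     * run the chore.
     */
    static int successfulRuns(long initialDelay, long period, int attempts) {
        long now = 0;
        int runs = 0;
        for (int i = 0; i < attempts; i++) {
            long timeOfThisRun = now;          // step 1: set when the chore is (re)scheduled
            now += initialDelay;               // step 2: wait initialDelay
            long timeOfLastRun = timeOfThisRun;
            timeOfThisRun = now;               // step 3: the chore triggers
            if (isOnTime(timeOfLastRun, timeOfThisRun, period)) {
                runs++;
            }
            // step 4: otherwise the chore is cancelled and rescheduled,
            // which takes us back to step 1 on the next iteration
        }
        return runs;
    }
}
```

With a period of 1000, an initialDelay of 1000 runs on every attempt in this model, while an initialDelay of 2000 never runs because every trigger falls outside the 1500 window.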
>  




[jira] [Resolved] (HBASE-25009) Hbck chore logs wrong message when loading regions from RS report

2020-09-16 Thread Guanghao Zhang (Jira)



 [ 
https://issues.apache.org/jira/browse/HBASE-25009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang resolved HBASE-25009.

Fix Version/s: 2.2.7
   2.4.0
   2.3.3
   3.0.0-alpha-1
   Resolution: Fixed

Pushed to branch-2.2+. Thanks [~arshad.mohammad] for contributing.

> Hbck chore logs wrong message when loading regions from RS report
> -
>
> Key: HBASE-25009
> URL: https://issues.apache.org/jira/browse/HBASE-25009
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 3.0.0-alpha-1, 2.3.1
>Reporter: Mohammad Arshad
>Assignee: Mohammad Arshad
>Priority: Minor
> Fix For: 3.0.0-alpha-1, 2.3.3, 2.4.0, 2.2.7
>
>
> {code:java}
> LOG.info("Loaded {} regions from {} regionservers' reports and found {} orphan regions",
>     numRegions, rsReports.size(), orphanRegionsOnFS.size());
> {code}
> In the log message above, orphanRegionsOnFS.size() should be replaced with 
> orphanRegionsOnRS.size(), as the regions are loaded from the RS, not from the FS.




[jira] [Resolved] (HBASE-25012) HBASE-24359 causes replication missed log of some RemoteException

2020-09-16 Thread Guanghao Zhang (Jira)



 [ 
https://issues.apache.org/jira/browse/HBASE-25012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang resolved HBASE-25012.

Fix Version/s: 2.4.0
   2.3.3
   Resolution: Fixed

Pushed to branch-2.3+. Thanks [~Ddupg] for contributing.

> HBASE-24359 causes replication missed log of some RemoteException
> -
>
> Key: HBASE-25012
> URL: https://issues.apache.org/jira/browse/HBASE-25012
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Affects Versions: 3.0.0-alpha-1, 2.3.0, 2.3.1
>Reporter: Sun Xin
>Assignee: Sun Xin
>Priority: Major
> Fix For: 3.0.0-alpha-1, 2.3.3, 2.4.0
>
> Attachments: image-2020-09-11-14-30-27-898.png
>
>
> HBASE-24359 broke the exception-handling logic. In branch-2, it even 
> causes the logs of some RemoteExceptions to be missed.
> [File 
> changed|[https://github.com/apache/hbase/pull/1855/files#diff-1e3f171b19474698601a0752b618af0eL435]]
>  in branch2.
> !image-2020-09-11-14-30-27-898.png!




[jira] [Resolved] (HBASE-24982) Disassemble the method replicateWALEntry from AdminService to a new interface ReplicationServerService

2020-09-09 Thread Guanghao Zhang (Jira)



 [ 
https://issues.apache.org/jira/browse/HBASE-24982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang resolved HBASE-24982.

Resolution: Fixed

Merged. Thanks [~Ddupg] for contributing.

> Disassemble the method replicateWALEntry from AdminService to a new interface 
> ReplicationServerService
> --
>
> Key: HBASE-24982
> URL: https://issues.apache.org/jira/browse/HBASE-24982
> Project: HBase
>  Issue Type: Sub-task
>  Components: Replication
>Reporter: Sun Xin
>Assignee: Sun Xin
>Priority: Major
>





[jira] [Resolved] (HBASE-25177) Try create table with 100 regions for branch-2.2 nightly job's hadoop integration test

2020-10-15 Thread Guanghao Zhang (Jira)



 [ 
https://issues.apache.org/jira/browse/HBASE-25177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang resolved HBASE-25177.

Resolution: Won't Fix

> Try create table with 100 regions for branch-2.2 nightly job's hadoop 
> integration test
> --
>
> Key: HBASE-25177
> URL: https://issues.apache.org/jira/browse/HBASE-25177
> Project: HBase
>  Issue Type: Bug
>        Reporter: Guanghao Zhang
>    Assignee: Guanghao Zhang
>Priority: Major
>
> It still fails now.
> [https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2.2/88/execution/node/171/log/]
>  
> [https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2.2/88//artifact/output-integration/hadoop-2.log]
>  
> It failed when creating the table with 1000 regions, not when importing the example TSV 
> to HDFS.




[jira] [Created] (HBASE-25177) Try create table with 100 regions for branch-2.2 nightly job's hadoop integration test

2020-10-11 Thread Guanghao Zhang (Jira)

Guanghao Zhang created HBASE-25177:
--

 Summary: Try create table with 100 regions for branch-2.2 nightly 
job's hadoop integration test
 Key: HBASE-25177
 URL: https://issues.apache.org/jira/browse/HBASE-25177
 Project: HBase
  Issue Type: Bug
Reporter: Guanghao Zhang


It still fails now.

[https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2.2/88/execution/node/171/log/]

 

[https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2.2/88//artifact/output-integration/hadoop-2.log]

 

It failed when creating the table with 1000 regions, not when importing the example TSV 
to HDFS.




[jira] [Created] (HBASE-25178) Fix the LICENSE error when branch-2.2 build with hadoop 3.3.0

2020-10-11 Thread Guanghao Zhang (Jira)

Guanghao Zhang created HBASE-25178:
--

 Summary: Fix the LICENSE error when branch-2.2 build with hadoop 
3.3.0 
 Key: HBASE-25178
 URL: https://issues.apache.org/jira/browse/HBASE-25178
 Project: HBase
  Issue Type: Bug
Affects Versions: 2.2.6
Reporter: Guanghao Zhang


See 
[https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2.2/88/execution/node/163/log/]

 

It fails when running "mvn clean install -DskipTests -DHBasePatchProcess 
-Dhadoop-three.version=3.3.0 -Dhadoop.profile=3.0".




[jira] [Resolved] (HBASE-25172) No need timelineservice for branch-2.2 nightly job's hadoop integration test

2020-10-11 Thread Guanghao Zhang (Jira)



 [ 
https://issues.apache.org/jira/browse/HBASE-25172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang resolved HBASE-25172.

Resolution: Fixed

> No need timelineservice for branch-2.2 nightly job's hadoop integration test
> 
>
> Key: HBASE-25172
> URL: https://issues.apache.org/jira/browse/HBASE-25172
> Project: HBase
>  Issue Type: Bug
>    Reporter: Guanghao Zhang
>        Assignee: Guanghao Zhang
>Priority: Major
> Fix For: 2.2.7
>
>
> [https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2.2/86/execution/node/171/log/]
>  
>  
> /home/jenkins/jenkins-home/workspace/HBase_HBase_Nightly_branch-2.2/component/dev-support/hbase_nightly_pseudo-distributed-test.sh
>  --single-process --working-dir output-integration/hadoop-2 
> --hbase-client-install hbase-client hbase-install hadoop-2/bin/hadoop 
> {color:#ff}hadoop-2/share/hadoop/yarn/timelineservice{color} 
> hadoop-2/share/hadoop/yarn/test/hadoop-yarn-server-tests-2.8.5-tests.jar 
> hadoop-2/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.8.5-tests.jar
>  hadoop-2/bin/mapred
>   
> branch-2.2 still uses hadoop 2.8.5, and hadoop 2.8.5 doesn't have 
> timelineservice. On branch-2.2, dev-support/hbase_nightly_pseudo-distributed-test.sh does not 
> take this timelineservice argument into account and only handles 5 parameters. But 
> branch-2.3+ uses hadoop 2.10.x, so the script there handles 6 parameters.
>  
> For hadoop-3, the timelineservice is not used either. See 
> [https://github.com/apache/hbase/blob/master/dev-support/hbase_nightly_pseudo-distributed-test.sh#L286]
>  
>  
>  




[jira] [Resolved] (HBASE-25178) Remove the hadoop 3.3.0 personality hadoopcheck for branch-2.2/branch-2.3

2020-10-12 Thread Guanghao Zhang (Jira)



 [ 
https://issues.apache.org/jira/browse/HBASE-25178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang resolved HBASE-25178.

Resolution: Duplicate

Already fixed by HBASE-25144.

> Remove the hadoop 3.3.0 personality hadoopcheck for branch-2.2/branch-2.3
> -
>
> Key: HBASE-25178
> URL: https://issues.apache.org/jira/browse/HBASE-25178
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.2.6
>        Reporter: Guanghao Zhang
>    Assignee: Guanghao Zhang
>Priority: Major
>
> For branch-2.2, see 
> [https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2.2/88/execution/node/163/log/]
>  It fails when running "mvn clean install -DskipTests -DHBasePatchProcess 
> -Dhadoop-three.version=3.3.0 -Dhadoop.profile=3.0".
>  
> For branch-2.3, see HBASE-23834. HBase failed to start on hadoop 3.3.0 
> because of a jetty problem.




[jira] [Created] (HBASE-25200) Try enlarge the flaky test timeout for branch-2.2

2020-10-18 Thread Guanghao Zhang (Jira)

Guanghao Zhang created HBASE-25200:
--

 Summary: Try enlarge the flaky test timeout for branch-2.2
 Key: HBASE-25200
 URL: https://issues.apache.org/jira/browse/HBASE-25200
 Project: HBase
  Issue Type: Bug
Reporter: Guanghao Zhang


There are now too many flaky tests to run, and the flaky test job cannot 
finish. As a result, these tests are marked as flaky again.




[jira] [Resolved] (HBASE-25204) Nightly job failed as the name of jdk and maven changed

2020-10-20 Thread Guanghao Zhang (Jira)



 [ 
https://issues.apache.org/jira/browse/HBASE-25204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang resolved HBASE-25204.

Fix Version/s: 2.2.7
   1.4.14
   2.4.0
   1.7.0
   2.3.3
   3.0.0-alpha-1
   Resolution: Fixed

Pushed to all active branches. Thanks [~zhangduo] for reviewing.

> Nightly job failed as  the name of jdk and maven changed
> 
>
> Key: HBASE-25204
> URL: https://issues.apache.org/jira/browse/HBASE-25204
> Project: HBase
>  Issue Type: Bug
>    Reporter: Guanghao Zhang
>        Assignee: Guanghao Zhang
>Priority: Major
> Fix For: 3.0.0-alpha-1, 2.3.3, 1.7.0, 2.4.0, 1.4.14, 2.2.7
>
>
> See 
> [https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2.3/85/console]
> [https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2.2/103/console]
>  
> org.codehaus.groovy.control.MultipleCompilationErrorsException: startup failed:
> WorkflowScript: 508: Tool type "maven" does not have an install of "Maven (latest)" configured - did you mean "maven_latest"? @ line 508, column 19.
>    maven 'Maven (latest)'
>    ^
> WorkflowScript: 510: Tool type "jdk" does not have an install of "JDK 1.8 (latest)" configured - did you mean "jdk_1.8_latest"? @ line 510, column 17.
>    jdk "JDK 1.8 (latest)"
>  




[jira] [Created] (HBASE-25172) No need timelineservice for branch-2.2 nightly job's hadoop integration test

2020-10-10 Thread Guanghao Zhang (Jira)

Guanghao Zhang created HBASE-25172:
--

 Summary: No need timelineservice for branch-2.2 nightly job's 
hadoop integration test
 Key: HBASE-25172
 URL: https://issues.apache.org/jira/browse/HBASE-25172
 Project: HBase
  Issue Type: Bug
Reporter: Guanghao Zhang


[https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2.2/86/execution/node/171/log/]

 
/home/jenkins/jenkins-home/workspace/HBase_HBase_Nightly_branch-2.2/component/dev-support/hbase_nightly_pseudo-distributed-test.sh
 --single-process --working-dir output-integration/hadoop-2 
--hbase-client-install hbase-client hbase-install hadoop-2/bin/hadoop 
hadoop-2/share/hadoop/yarn/timelineservice 
hadoop-2/share/hadoop/yarn/test/hadoop-yarn-server-tests-2.8.5-tests.jar 
hadoop-2/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.8.5-tests.jar
 hadoop-2/bin/mapred
 

branch-2.2 still uses hadoop 2.8.5, which doesn't have timelineservice. 
dev-support/hbase_nightly_pseudo-distributed-test.sh does not take this into account.

 




[jira] [Created] (HBASE-25204) Nightly job failed as the name of jdk and maven changed

2020-10-19 Thread Guanghao Zhang (Jira)

Guanghao Zhang created HBASE-25204:
--

 Summary: Nightly job failed as  the name of jdk and maven changed
 Key: HBASE-25204
 URL: https://issues.apache.org/jira/browse/HBASE-25204
 Project: HBase
  Issue Type: Bug
Reporter: Guanghao Zhang


See 
[https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2.3/85/console]
[https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2.2/103/console]
 
org.codehaus.groovy.control.MultipleCompilationErrorsException: startup failed:
WorkflowScript: 508: Tool type "maven" does not have an install of "Maven (latest)" configured - did you mean "maven_latest"? @ line 508, column 19.
   maven 'Maven (latest)'
   ^
WorkflowScript: 510: Tool type "jdk" does not have an install of "JDK 1.8 (latest)" configured - did you mean "jdk_1.8_latest"? @ line 510, column 17.
   jdk "JDK 1.8 (latest)"
 




[jira] [Resolved] (HBASE-23987) NettyRpcClientConfigHelper will not share event loop by default which is incorrect

2020-08-26 Thread Guanghao Zhang (Jira)



 [ 
https://issues.apache.org/jira/browse/HBASE-23987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang resolved HBASE-23987.

Fix Version/s: 2.2.6
   Resolution: Fixed

> NettyRpcClientConfigHelper will not share event loop by default which is 
> incorrect
> --
>
> Key: HBASE-23987
> URL: https://issues.apache.org/jira/browse/HBASE-23987
> Project: HBase
>  Issue Type: Bug
>  Components: Client, rpc
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 3.0.0-alpha-1, 2.2.6, 2.3.0
>
>





[jira] [Resolved] (HBASE-24870) Ignore TestAsyncTableRSCrashPublish

2020-08-26 Thread Guanghao Zhang (Jira)



 [ 
https://issues.apache.org/jira/browse/HBASE-24870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang resolved HBASE-24870.

Fix Version/s: 2.2.6
   Resolution: Fixed

> Ignore TestAsyncTableRSCrashPublish
> ---
>
> Key: HBASE-24870
> URL: https://issues.apache.org/jira/browse/HBASE-24870
> Project: HBase
>  Issue Type: Sub-task
>    Reporter: Guanghao Zhang
>        Assignee: Guanghao Zhang
>Priority: Major
> Fix For: 2.2.6
>
>
> [ERROR] Failures: 
> [ERROR] TestAsyncTableRSCrashPublish.test:94 Waiting timed out after [60,000] 
> msec
>  
> I have hit this failure many times when running with runAllTests, and other developers have hit 
> it too when voting on RCs. Let's ignore this test for now and re-enable it after the parent 
> issue is resolved.




[jira] [Resolved] (HBASE-24897) RegionReplicaFlushHandler should handle NoServerForRegionException to avoid aborting RegionServer

2020-08-26 Thread Guanghao Zhang (Jira)



 [ 
https://issues.apache.org/jira/browse/HBASE-24897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang resolved HBASE-24897.

Fix Version/s: 2.2.6
   Resolution: Fixed

> RegionReplicaFlushHandler should handle NoServerForRegionException to avoid 
> aborting RegionServer
> -
>
> Key: HBASE-24897
> URL: https://issues.apache.org/jira/browse/HBASE-24897
> Project: HBase
>  Issue Type: Bug
>        Reporter: Guanghao Zhang
>    Assignee: Guanghao Zhang
>Priority: Major
> Fix For: 2.2.6
>
>
> While debugging the flaky test TestRegionReplicaReplicationEndpoint, I found that the RS aborted 
> because the RegionReplicaFlushHandler flush failed. When creating a new table with 
> region replicas, the assignment order may be:
>  # assign the 0002 replica region and trigger a primary region flush.
>  # assign the 0001 replica region and trigger a primary region flush.
>  # assign the primary region.
> But the primary region flush may fail because the primary region is not open 
> yet, so it may abort the RS.
>  
> {code:java}
> 2020-08-18 16:56:30,041 INFO 
> [RS_OPEN_REGION-regionserver/hao-OptiPlex-7050:0-0] 
> handler.AssignRegionHandler(141): Opened 
> testRegionReplicaReplicationIgnoresDisabledTables_drop_false_disabledReplication_false,,1597740978463_0002.66e9757a05fbae7623cfea3369fc8354.
> 2020-08-18 16:56:30,558 INFO 
> [RS_OPEN_REGION-regionserver/hao-OptiPlex-7050:0-0] 
> handler.AssignRegionHandler(141): Opened 
> testRegionReplicaReplicationIgnoresDisabledTables_drop_false_disabledReplication_false,,1597740978463_0001.22ff45423b0f1f0e93794f673449d140.
> 2020-08-18 16:56:31,192 INFO 
> [RS_OPEN_REGION-regionserver/hao-OptiPlex-7050:0-0] 
> handler.AssignRegionHandler(141): Opened 
> testRegionReplicaReplicationIgnoresDisabledTables_drop_false_disabledReplication_false,,1597740978463.901f9cd06bbf27ef7c2d70b5af725cd2.
> 2020-08-18 16:58:53,857 ERROR 
> [RS_REGION_REPLICA_FLUSH_OPS-regionserver/hao-OptiPlex-7050:0-0] 
> helpers.MarkerIgnoringBase(159): * ABORTING region server 
> hao-optiplex-7050,36368,1597740961432: ServerAborting because an exception 
> was thrown *
> org.apache.hadoop.hbase.client.NoServerForRegionException: No server address 
> listed in hbase:meta for region 
> testRegionReplicaReplicationWithReplicas_10,,1597741128945.0f541dc1a7ca64797c4cf054adb9edfb.
>  containing row 
>   at 
> org.apache.hadoop.hbase.client.ConnectionImplementation.locateRegionInMeta(ConnectionImplementation.java:926)
>   at 
> org.apache.hadoop.hbase.client.ConnectionImplementation.locateRegion(ConnectionImplementation.java:784)
>   at 
> org.apache.hadoop.hbase.client.ConnectionUtils$ShortCircuitingClusterConnection.locateRegion(ConnectionUtils.java:140)
>   at 
> org.apache.hadoop.hbase.client.RegionAdminServiceCallable.getRegionLocations(RegionAdminServiceCallable.java:147)
>   at 
> org.apache.hadoop.hbase.client.RegionAdminServiceCallable.getLocation(RegionAdminServiceCallable.java:98)
>   at 
> org.apache.hadoop.hbase.client.RegionAdminServiceCallable.prepare(RegionAdminServiceCallable.java:84)
>   at 
> org.apache.hadoop.hbase.client.FlushRegionCallable.prepare(FlushRegionCallable.java:62)
>   at 
> org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.callWithRetries(RpcRetryingCallerImpl.java:105)
>   at 
> org.apache.hadoop.hbase.regionserver.handler.RegionReplicaFlushHandler.triggerFlushInPrimaryRegion(RegionReplicaFlushHandler.java:129)
>   at 
> org.apache.hadoop.hbase.regionserver.handler.RegionReplicaFlushHandler.process(RegionReplicaFlushHandler.java:78)
>   at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:104)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> {code}
> I think the fix should be to assign the primary region first when the region 
> replica feature is enabled. I will check the implementation of region replica.
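
The issue title suggests a second option: treat the "no server for region" case as non-fatal inside the handler instead of letting it abort the RegionServer. The sketch below illustrates that idea only. It is not the committed fix, and every name in it is hypothetical, including the local `NoServerForRegionException`, which merely stands in for the HBase class of the same name.

```java
class ReplicaFlushSketch {
    /** Local stand-in for the HBase exception seen in the stack trace above. */
    static class NoServerForRegionException extends RuntimeException {
        NoServerForRegionException(String message) {
            super(message);
        }
    }

    /** Hypothetical hook that asks the primary region to flush. */
    interface FlushTrigger {
        void triggerFlushInPrimaryRegion();
    }

    enum Outcome { FLUSHED, SKIPPED_PRIMARY_NOT_ASSIGNED, ABORT }

    /** Decides what the handler should do with the result of the flush request. */
    static Outcome process(FlushTrigger trigger) {
        try {
            trigger.triggerFlushInPrimaryRegion();
            return Outcome.FLUSHED;
        } catch (NoServerForRegionException e) {
            // The primary may simply not be assigned yet: do not abort the server.
            return Outcome.SKIPPED_PRIMARY_NOT_ASSIGNED;
        } catch (RuntimeException e) {
            // Any other failure keeps the existing behaviour in this sketch.
            return Outcome.ABORT;
        }
    }
}
```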
>  




[jira] [Resolved] (HBASE-24881) Fix flaky TestMasterAbortAndRSGotKilled for branch-2.2

2020-08-26 Thread Guanghao Zhang (Jira)



 [ 
https://issues.apache.org/jira/browse/HBASE-24881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang resolved HBASE-24881.

Fix Version/s: 2.2.6
   Resolution: Fixed

> Fix flaky TestMasterAbortAndRSGotKilled for branch-2.2
> --
>
> Key: HBASE-24881
> URL: https://issues.apache.org/jira/browse/HBASE-24881
> Project: HBase
>  Issue Type: Sub-task
>    Reporter: Guanghao Zhang
>        Assignee: Guanghao Zhang
>Priority: Major
> Fix For: 2.2.6
>
>
> I hit this problem on branch-2.2 too. It happens because of the 
> DelayCloseCP. The order of events is:
>  # Close the region. But because of the DelayCloseCP, it will close after 10 
> seconds.
>  # Finish the UT and shut down the cluster.
>  # Shut down the master.
>  # Shut down the RS, which calls the waitOnAllRegionsToClose method. But 
> abortRequested is false at this point.
>  # Closing the region fails because the master is down and reporting to the 
> master returns an error. The RegionServer then aborts and sets abortRequested 
> to true.
>  # waitOnAllRegionsToClose hangs because the set of online regions never 
> becomes empty.
>  
> waitOnAllRegionsToClose(final boolean abort) already considers the abort case, 
> but the problem is that abortRequested is false when the method is called. I 
> think the fix should be to keep checking abortRequested inside the 
> waitOnAllRegionsToClose method.
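
The proposed fix can be sketched as follows. This is a minimal, self-contained illustration and not the HBase code: the class RegionCloseWaitSketch, its fields, and the pollMillis parameter are hypothetical stand-ins. The only point is that the wait loop re-reads abortRequested on every iteration instead of relying on the value passed in when the method was called.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for the region server shutdown path; not HBase code.
class RegionCloseWaitSketch {
  // Set by another thread when the server decides to abort.
  final AtomicBoolean abortRequested = new AtomicBoolean(false);
  // Names of regions that are still online.
  final Set<String> onlineRegions = ConcurrentHashMap.newKeySet();

  /**
   * Waits until no regions are online.
   * Returns true if all regions closed, false if the wait ended early
   * because an abort was requested (or the thread was interrupted).
   */
  boolean waitOnAllRegionsToClose(boolean abort, long pollMillis) {
    while (!onlineRegions.isEmpty()) {
      // Check the live flag on each pass, not only the argument captured
      // at call time, so an abort requested later still ends the wait.
      if (abort || abortRequested.get()) {
        return false;
      }
      try {
        Thread.sleep(pollMillis);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return false;
      }
    }
    return true;
  }
}
```

With this shape, an abort that is requested after the wait has started (step 5 in the list above) still lets the loop exit, even though the abort argument was false at call time.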




[jira] [Reopened] (HBASE-24689) Generate CHANGES.md and RELEASENOTES.md for 2.2.6

2020-08-26 Thread Guanghao Zhang (Jira)



 [ 
https://issues.apache.org/jira/browse/HBASE-24689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang reopened HBASE-24689:


> Generate CHANGES.md and RELEASENOTES.md for 2.2.6
> -
>
> Key: HBASE-24689
> URL: https://issues.apache.org/jira/browse/HBASE-24689
> Project: HBase
>  Issue Type: Sub-task
>    Reporter: Guanghao Zhang
>        Assignee: Guanghao Zhang
>Priority: Major
> Fix For: 2.2.6
>
>





[jira] [Reopened] (HBASE-23814) Add null checks and logging to misc set of tests

2020-08-25 Thread Guanghao Zhang (Jira)



 [ 
https://issues.apache.org/jira/browse/HBASE-23814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang reopened HBASE-23814:


Reopen for cherry-pick to branch-2.2.

> Add null checks and logging to misc set of tests
> 
>
> Key: HBASE-23814
> URL: https://issues.apache.org/jira/browse/HBASE-23814
> Project: HBase
>  Issue Type: Test
>Reporter: Michael Stack
>Assignee: Michael Stack
>Priority: Trivial
> Fix For: 3.0.0-alpha-1, 2.3.0
>
>
> I've been studying unit tests of late. A few are failing but then the output 
> is missing a detail or shutdown complains of NPE because startup didn't 
> succeed.
> Here are super minor items I've been carrying around that I'd like to land. 
> They do  not change the function of tests (there is an attempt at a fix of 
> TestLogsCleaner).
> * TestFullLogReconstruction log the server we've chosen to expire and then 
> note where we start counting rows
> * TestAsyncTableScanException use a define for row counts; count 100 instead 
> of 1000 and see if it helps
> * TestRawAsyncTableLimitedScanWithFilter check connection was made before 
> closing it in tearDown
> * TestLogsCleaner use single mod time. Make it for sure less than now in case 
> test runs all in the same millisecond (would cause test fail)
> * TestReplicationBase test table is non-null before closing in tearDown




[jira] [Resolved] (HBASE-23814) Add null checks and logging to misc set of tests

2020-08-25 Thread Guanghao Zhang (Jira)



 [ 
https://issues.apache.org/jira/browse/HBASE-23814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang resolved HBASE-23814.

Fix Version/s: 2.2.6
   Resolution: Fixed

> Add null checks and logging to misc set of tests
> 
>
> Key: HBASE-23814
> URL: https://issues.apache.org/jira/browse/HBASE-23814
> Project: HBase
>  Issue Type: Test
>Reporter: Michael Stack
>Assignee: Michael Stack
>Priority: Trivial
> Fix For: 3.0.0-alpha-1, 2.2.6, 2.3.0
>
>
> I've been studying unit tests of late. A few are failing but then the output 
> is missing a detail or shutdown complains of NPE because startup didn't 
> succeed.
> Here are super minor items I've been carrying around that I'd like to land. 
> They do  not change the function of tests (there is an attempt at a fix of 
> TestLogsCleaner).
> * TestFullLogReconstruction log the server we've chosen to expire and then 
> note where we start counting rows
> * TestAsyncTableScanException use a define for row counts; count 100 instead 
> of 1000 and see if it helps
> * TestRawAsyncTableLimitedScanWithFilter check connection was made before 
> closing it in tearDown
> * TestLogsCleaner use single mod time. Make it for sure less than now in case 
> test runs all in the same millisecond (would cause test fail)
> * TestReplicationBase test table is non-null before closing in tearDown




[jira] [Resolved] (HBASE-24928) balanceRSGroup should skip generating balance plan for disabled table and splitParent region

2020-08-26 Thread Guanghao Zhang (Jira)



 [ 
https://issues.apache.org/jira/browse/HBASE-24928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang resolved HBASE-24928.

Fix Version/s: 2.3.2
   2.2.6
   Resolution: Fixed

> balanceRSGroup should skip generating balance plan for disabled table and 
> splitParent region
> 
>
> Key: HBASE-24928
> URL: https://issues.apache.org/jira/browse/HBASE-24928
> Project: HBase
>  Issue Type: Improvement
>  Components: Balancer
>Reporter: niuyulin
>Assignee: niuyulin
>Priority: Major
> Fix For: 3.0.0-alpha-1, 2.2.6, 2.3.2
>
>
> Currently we generate balance plans for disabled tables, which is useless:
> {code:java}
> 2020-08-20,20:47:54,702 WARN 
> [RpcServer.default.RWQ.Fifo.read.handler=310,queue=6,port=22500] 
> org.apache.hadoop.hbase.master.HMaster: Failed balance plan: 
> hri=aa325467924edc865ab2ef6d82f9e2a7, 
> source=tj1-hadoop-staging-st02.kscn,22600,1572403947348, destination=, just 
> skip it
> org.apache.hadoop.hbase.client.DoNotRetryRegionException: Unexpected state 
> for rit=CLOSED, location=tj1-hadoop-staging-st02.kscn,22600,1572403947348, 
> table=galaxysds:sds_staging_258z, region=aa325467924edc865ab2ef6d82f9e2a7
> at 
> org.apache.hadoop.hbase.master.assignment.AssignmentManager.preTransitCheck(AssignmentManager.java:580)
> at 
> org.apache.hadoop.hbase.master.assignment.AssignmentManager.createMoveRegionProcedure(AssignmentManager.java:635)
> at 
> org.apache.hadoop.hbase.master.assignment.AssignmentManager.moveAsync(AssignmentManager.java:652)
> at 
> org.apache.hadoop.hbase.master.HMaster.executeRegionPlansWithThrottling(HMaster.java:1776)
> at 
> org.apache.hadoop.hbase.rsgroup.RSGroupAdminServer.balanceRSGroup(RSGroupAdminServer.java:486)
> at 
> org.apache.hadoop.hbase.rsgroup.RSGroupAdminEndpoint$RSGroupAdminServiceImpl.balanceRSGroup(RSGroupAdminEndpoint.java:293)
> at 
> org.apache.hadoop.hbase.protobuf.generated.RSGroupAdminProtos$RSGroupAdminService.callMethod(RSGroupAdminProtos.java:13890)
> at 
> org.apache.hadoop.hbase.master.MasterRpcServices.execMasterService(MasterRpcServices.java:908)
> at 
> org.apache.hadoop.hbase.shaded.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java)
> at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:413)
> at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:135)
> at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:338)
> at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:318)
> {code}
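
The idea in the issue title (skip disabled tables and split parent regions before generating plans) can be sketched as below. This is only an illustration under stated assumptions: the Region class, its fields, and regionsToBalance are hypothetical and are not the RSGroupAdminServer or RegionInfo API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; names do not correspond to real HBase classes.
class BalancePlanFilterSketch {
  // Simplified region descriptor used only for this example.
  static final class Region {
    final String table;
    final String encodedName;
    final boolean splitParent;

    Region(String table, String encodedName, boolean splitParent) {
      this.table = table;
      this.encodedName = encodedName;
      this.splitParent = splitParent;
    }
  }

  /** Returns only the regions for which a balance plan is worth generating. */
  static List<Region> regionsToBalance(List<Region> candidates, Set<String> disabledTables) {
    List<Region> result = new ArrayList<>();
    for (Region r : candidates) {
      if (disabledTables.contains(r.table)) {
        continue; // regions of a disabled table are closed and cannot be moved
      }
      if (r.splitParent) {
        continue; // a split parent is no longer served, so there is nothing to move
      }
      result.add(r);
    }
    return result;
  }
}
```

Filtering before plan generation would avoid the "Unexpected state for rit=CLOSED" warning shown in the log above, since no move would be attempted for those regions.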




[jira] [Resolved] (HBASE-24948) Reduce the resource of TestReplicationBase

2020-08-25 Thread Guanghao Zhang (Jira)



 [ 
https://issues.apache.org/jira/browse/HBASE-24948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang resolved HBASE-24948.

Fix Version/s: 2.2.6
 Assignee: Guanghao Zhang
   Resolution: Fixed

> Reduce the resource of  TestReplicationBase
> ---
>
> Key: HBASE-24948
> URL: https://issues.apache.org/jira/browse/HBASE-24948
> Project: HBase
>  Issue Type: Sub-task
>    Reporter: Guanghao Zhang
>        Assignee: Guanghao Zhang
>Priority: Major
> Fix For: 2.2.6
>
>





[jira] [Resolved] (HBASE-24946) Remove the metrics assert in TestClusterRestartFailover

2020-08-25 Thread Guanghao Zhang (Jira)



 [ 
https://issues.apache.org/jira/browse/HBASE-24946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang resolved HBASE-24946.

Fix Version/s: 2.2.6
   Resolution: Fixed

> Remove the metrics assert in TestClusterRestartFailover
> ---
>
> Key: HBASE-24946
> URL: https://issues.apache.org/jira/browse/HBASE-24946
> Project: HBase
>  Issue Type: Sub-task
>    Reporter: Guanghao Zhang
>Priority: Major
> Fix For: 2.2.6
>
>
> MetricsMasterSource masterSource = 
> UTIL.getHBaseCluster().getMaster().getMasterMetrics()
>  .getMetricsSource();
> metricsHelper.assertCounter(MetricsMasterSource.SERVER_CRASH_METRIC_PREFIX+"SubmittedCount",
>  4, masterSource);
>  
> Introduced by HBASE-24199, but it is flaky now because this unit test restarts 
> all clusters. Meanwhile, this metric is already tested by TestMasterMetrics. I 
> plan to remove this assert for branch-2.2.




[jira] [Created] (HBASE-24948) Reduce the resource of TestReplicationBase

2020-08-24 Thread Guanghao Zhang (Jira)

Guanghao Zhang created HBASE-24948:
--

 Summary: Reduce the resource of  TestReplicationBase
 Key: HBASE-24948
 URL: https://issues.apache.org/jira/browse/HBASE-24948
 Project: HBase
  Issue Type: Sub-task
Reporter: Guanghao Zhang







[jira] [Created] (HBASE-24895) Speed up TestFromClientSide3 by reduce the table regions number

2020-08-17 Thread Guanghao Zhang (Jira)

Guanghao Zhang created HBASE-24895:
--

 Summary: Speed up TestFromClientSide3 by reduce the table regions 
number
 Key: HBASE-24895
 URL: https://issues.apache.org/jira/browse/HBASE-24895
 Project: HBase
  Issue Type: Sub-task
Reporter: Guanghao Zhang


[https://ci-hadoop.apache.org/job/HBase/job/HBase-Flaky-Tests/job/branch-2.2/52/testReport/junit/org.apache.hadoop.hbase.client/TestFromClientSide3//]

 
|[testHTableExistsMethodMultipleRegionsMultipleGets|https://ci-hadoop.apache.org/job/HBase/job/HBase-Flaky-Tests/job/branch-2.2/52/testReport/junit/org.apache.hadoop.hbase.client/TestFromClientSide3//testHTableExistsMethodMultipleRegionsMultipleGets]|2
 min 58 sec|Regression|
|[testHTableExistsMethodMultipleRegionsSingleGet|https://ci-hadoop.apache.org/job/HBase/job/HBase-Flaky-Tests/job/branch-2.2/52/testReport/junit/org.apache.hadoop.hbase.client/TestFromClientSide3//testHTableExistsMethodMultipleRegionsSingleGet]|4
 min 20 sec|Passed|

 

These tests take too much time and time out.




[jira] [Resolved] (HBASE-22548) Split TestAdmin1

2020-08-17 Thread Guanghao Zhang (Jira)



 [ 
https://issues.apache.org/jira/browse/HBASE-22548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang resolved HBASE-22548.

Fix Version/s: 2.2.6
   Resolution: Fixed

> Split TestAdmin1
> 
>
> Key: HBASE-22548
> URL: https://issues.apache.org/jira/browse/HBASE-22548
> Project: HBase
>  Issue Type: Test
>  Components: Admin, test
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 3.0.0-alpha-1, 2.2.6, 2.3.0
>
> Attachments: HBASE-22548-branch-2-v1.patch, HBASE-22548-branch-2.patch
>
>
> It is too large and easy to timeout.




[jira] [Created] (HBASE-24897) Fix flaky test TestRegionReplicaReplicationEndpoint

2020-08-18 Thread Guanghao Zhang (Jira)

Guanghao Zhang created HBASE-24897:
--

 Summary: Fix flaky test TestRegionReplicaReplicationEndpoint
 Key: HBASE-24897
 URL: https://issues.apache.org/jira/browse/HBASE-24897
 Project: HBase
  Issue Type: Bug
Reporter: Guanghao Zhang


While debugging this unit test, I found that the RS aborted because the 
RegionReplicaFlushHandler flush failed. When creating a new table with region 
replicas, the assign order may be:
 # assign 0002 replica region and trigger primary region flush.
 # assign 0001 replica region and trigger primary region flush.
 # assign primary region.

But the primary region flush may fail because the primary region is not opened 
yet. So it may abort the RS.

 
{code:java}
2020-08-18 16:56:30,041 INFO 
[RS_OPEN_REGION-regionserver/hao-OptiPlex-7050:0-0] 
handler.AssignRegionHandler(141): Opened 
testRegionReplicaReplicationIgnoresDisabledTables_drop_false_disabledReplication_false,,1597740978463_0002.66e9757a05fbae7623cfea3369fc8354.
2020-08-18 16:56:30,558 INFO 
[RS_OPEN_REGION-regionserver/hao-OptiPlex-7050:0-0] 
handler.AssignRegionHandler(141): Opened 
testRegionReplicaReplicationIgnoresDisabledTables_drop_false_disabledReplication_false,,1597740978463_0001.22ff45423b0f1f0e93794f673449d140.
2020-08-18 16:56:31,192 INFO 
[RS_OPEN_REGION-regionserver/hao-OptiPlex-7050:0-0] 
handler.AssignRegionHandler(141): Opened 
testRegionReplicaReplicationIgnoresDisabledTables_drop_false_disabledReplication_false,,1597740978463.901f9cd06bbf27ef7c2d70b5af725cd2.


2020-08-18 16:58:53,857 ERROR 
[RS_REGION_REPLICA_FLUSH_OPS-regionserver/hao-OptiPlex-7050:0-0] 
helpers.MarkerIgnoringBase(159): * ABORTING region server 
hao-optiplex-7050,36368,1597740961432: ServerAborting because an exception was 
thrown *
org.apache.hadoop.hbase.client.NoServerForRegionException: No server address 
listed in hbase:meta for region 
testRegionReplicaReplicationWithReplicas_10,,1597741128945.0f541dc1a7ca64797c4cf054adb9edfb.
 containing row 
  at 
org.apache.hadoop.hbase.client.ConnectionImplementation.locateRegionInMeta(ConnectionImplementation.java:926)
  at 
org.apache.hadoop.hbase.client.ConnectionImplementation.locateRegion(ConnectionImplementation.java:784)
  at 
org.apache.hadoop.hbase.client.ConnectionUtils$ShortCircuitingClusterConnection.locateRegion(ConnectionUtils.java:140)
  at 
org.apache.hadoop.hbase.client.RegionAdminServiceCallable.getRegionLocations(RegionAdminServiceCallable.java:147)
  at 
org.apache.hadoop.hbase.client.RegionAdminServiceCallable.getLocation(RegionAdminServiceCallable.java:98)
  at 
org.apache.hadoop.hbase.client.RegionAdminServiceCallable.prepare(RegionAdminServiceCallable.java:84)
  at 
org.apache.hadoop.hbase.client.FlushRegionCallable.prepare(FlushRegionCallable.java:62)
  at 
org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.callWithRetries(RpcRetryingCallerImpl.java:105)
  at 
org.apache.hadoop.hbase.regionserver.handler.RegionReplicaFlushHandler.triggerFlushInPrimaryRegion(RegionReplicaFlushHandler.java:129)
  at 
org.apache.hadoop.hbase.regionserver.handler.RegionReplicaFlushHandler.process(RegionReplicaFlushHandler.java:78)
  at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:104)
  at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
  at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
  at java.lang.Thread.run(Thread.java:748)
{code}
I think the fix should be to assign the primary region first when the region 
replica feature is enabled. Will check the implementation of region replica.
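
One way to picture the "primary first" idea is to order the regions by replica id before assigning them, relying on the primary replica having id 0. The sketch below is an assumption-laden illustration only: the Replica class and assignmentOrder are hypothetical and do not reflect how the HBase assignment code is structured, which the issue says still needs to be checked.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch; not the HBase assignment code.
class PrimaryFirstAssignSketch {
  // Simplified replica descriptor; replicaId 0 denotes the primary region.
  static final class Replica {
    final String region;
    final int replicaId;

    Replica(String region, int replicaId) {
      this.region = region;
      this.replicaId = replicaId;
    }
  }

  /** Returns a copy of the input ordered so that the primary (id 0) is assigned first. */
  static List<Replica> assignmentOrder(List<Replica> replicas) {
    List<Replica> ordered = new ArrayList<>(replicas);
    ordered.sort(Comparator.comparingInt((Replica r) -> r.replicaId));
    return ordered;
  }
}
```

With the order from the log above (0002, 0001, primary), this ordering would put the primary region first, so a flush triggered by a secondary replica would find the primary already scheduled for opening.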

 




[jira] [Created] (HBASE-24907) Turn off the balancer when test region admin api

2020-08-19 Thread Guanghao Zhang (Jira)

Guanghao Zhang created HBASE-24907:
--

 Summary: Turn off the balancer when test region admin api
 Key: HBASE-24907
 URL: https://issues.apache.org/jira/browse/HBASE-24907
 Project: HBase
  Issue Type: Sub-task
Reporter: Guanghao Zhang


For the region admin API, we test move/split/merge/assign/unassign and check 
whether the region location is right. But the balancer may move regions to other 
places and break the UT. So turn off the balancer for TestAsyncRegionAdminApi.




[jira] [Created] (HBASE-24904) Split TestAsyncTableAdminApi and TestSnapshotTemporaryDirectoryWithRegionReplicas

2020-08-19 Thread Guanghao Zhang (Jira)

Guanghao Zhang created HBASE-24904:
--

 Summary: Split TestAsyncTableAdminApi and 
TestSnapshotTemporaryDirectoryWithRegionReplicas
 Key: HBASE-24904
 URL: https://issues.apache.org/jira/browse/HBASE-24904
 Project: HBase
  Issue Type: Sub-task
Reporter: Guanghao Zhang


See 
[https://ci-hadoop.apache.org/job/HBase/job/HBase-Flaky-Tests/job/branch-2.2/42/testReport/org.apache.hadoop.hbase.client/TestAsyncTableAdminApi/]

[https://ci-hadoop.apache.org/job/HBase/job/HBase-Flaky-Tests/job/branch-2.2/61/testReport/junit/org.apache.hadoop.hbase.client/TestSnapshotTemporaryDirectoryWithRegionReplicas//]

 

These UTs are flaky because they take too much time, more than 780 seconds. 




[jira] [Created] (HBASE-24906) Enlarge the wait time in TestReplicationEndpoint#testInterClusterReplication

2020-08-19 Thread Guanghao Zhang (Jira)

Guanghao Zhang created HBASE-24906:
--

 Summary: Enlarge the wait time in 
TestReplicationEndpoint#testInterClusterReplication
 Key: HBASE-24906
 URL: https://issues.apache.org/jira/browse/HBASE-24906
 Project: HBase
  Issue Type: Sub-task
Reporter: Guanghao Zhang


Failed many times, but the failure details differ: the number of replicated 
entries is different each time. This means replication is working and just needs 
more time to replicate all 2500 entries.
h3. Error Message

Waiting timed out after [30,000] msec Failed to replicate all edits, expected = 
2500 replicated = 2499

 
h3. Error Message

Waiting timed out after [30,000] msec Failed to replicate all edits, expected = 
2500 replicated = 2481

 
h3. Error Message

Waiting timed out after [30,000] msec Failed to replicate all edits, expected = 
2500 replicated = 2491
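
The kind of wait these error messages come from can be sketched as a polling loop with a timeout. The helper below is a hypothetical, self-contained illustration (WaitForSketch and waitFor are not the HBase test utilities); it shows why enlarging the timeout helps when the condition is steadily making progress but has not yet completed.

```java
import java.util.function.BooleanSupplier;

// Hypothetical polling helper; not the HBase Waiter class.
class WaitForSketch {
  /**
   * Polls the condition until it holds or the timeout elapses.
   * Returns true if the condition became true in time, false otherwise.
   */
  static boolean waitFor(long timeoutMillis, long intervalMillis, BooleanSupplier condition) {
    long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
    while (true) {
      if (condition.getAsBoolean()) {
        return true;
      }
      if (System.nanoTime() - deadline >= 0) {
        return false; // timed out; the caller decides how to report it
      }
      try {
        Thread.sleep(intervalMillis);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return false;
      }
    }
  }
}
```

A test that stops at 2499 or 2481 of 2500 entries after 30,000 msec is a case where the condition was close to being met, so a larger timeoutMillis is a reasonable change.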




[jira] [Resolved] (HBASE-24895) Speed up TestFromClientSide3 by reduce the table regions number

2020-08-19 Thread Guanghao Zhang (Jira)



 [ 
https://issues.apache.org/jira/browse/HBASE-24895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang resolved HBASE-24895.

Fix Version/s: 2.2.6
   Resolution: Fixed

> Speed up TestFromClientSide3 by reduce the table regions number
> ---
>
> Key: HBASE-24895
> URL: https://issues.apache.org/jira/browse/HBASE-24895
> Project: HBase
>  Issue Type: Sub-task
>    Reporter: Guanghao Zhang
>        Assignee: Guanghao Zhang
>Priority: Major
> Fix For: 2.2.6
>
>
> [https://ci-hadoop.apache.org/job/HBase/job/HBase-Flaky-Tests/job/branch-2.2/52/testReport/junit/org.apache.hadoop.hbase.client/TestFromClientSide3//]
>  
> |[testHTableExistsMethodMultipleRegionsMultipleGets|https://ci-hadoop.apache.org/job/HBase/job/HBase-Flaky-Tests/job/branch-2.2/52/testReport/junit/org.apache.hadoop.hbase.client/TestFromClientSide3//testHTableExistsMethodMultipleRegionsMultipleGets]|2
>  min 58 sec|Regression|
> |[testHTableExistsMethodMultipleRegionsSingleGet|https://ci-hadoop.apache.org/job/HBase/job/HBase-Flaky-Tests/job/branch-2.2/52/testReport/junit/org.apache.hadoop.hbase.client/TestFromClientSide3//testHTableExistsMethodMultipleRegionsSingleGet]|4
>  min 20 sec|Passed|
>  
> These tests take too much time and time out.




[jira] [Created] (HBASE-24912) Enlarge MemstoreFlusherChore/CompactionChecker period for unit test

2020-08-19 Thread Guanghao Zhang (Jira)

Guanghao Zhang created HBASE-24912:
--

 Summary: Enlarge MemstoreFlusherChore/CompactionChecker period for 
unit test
 Key: HBASE-24912
 URL: https://issues.apache.org/jira/browse/HBASE-24912
 Project: HBase
  Issue Type: Improvement
Reporter: Guanghao Zhang


Too many debug logs are printed when running unit tests now.

 

2020-08-19 01:20:59,899 DEBUG [regionserver/asf909:0.Chore.1] 
hbase.ScheduledChore(192): CompactionChecker execution time: 0 ms.
2020-08-19 01:20:59,899 DEBUG [regionserver/asf909:0.Chore.1] 
hbase.ScheduledChore(192): MemstoreFlusherChore execution time: 0 ms.
2020-08-19 01:20:59,900 DEBUG [regionserver/asf909:0.Chore.1] 
hbase.ScheduledChore(192): CompactionChecker execution time: 0 ms.
2020-08-19 01:20:59,900 DEBUG [regionserver/asf909:0.Chore.1] 
hbase.ScheduledChore(192): MemstoreFlusherChore execution time: 0 ms.
2020-08-19 01:20:59,905 DEBUG [regionserver/asf909:0.Chore.1] 
hbase.ScheduledChore(192): MemstoreFlusherChore execution time: 0 ms.
2020-08-19 01:20:59,905 DEBUG [regionserver/asf909:0.Chore.1] 
hbase.ScheduledChore(192): CompactionChecker execution time: 0 ms.
2020-08-19 01:21:00,001 DEBUG [regionserver/asf909:0.Chore.1] 
hbase.ScheduledChore(192): CompactionChecker execution time: 0 ms.




[jira] [Resolved] (HBASE-24907) Turn off the balancer when test region admin api

2020-08-19 Thread Guanghao Zhang (Jira)



 [ 
https://issues.apache.org/jira/browse/HBASE-24907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang resolved HBASE-24907.

Fix Version/s: 2.2.6
   Resolution: Fixed

> Turn off the balancer when test region admin api
> 
>
> Key: HBASE-24907
> URL: https://issues.apache.org/jira/browse/HBASE-24907
> Project: HBase
>  Issue Type: Sub-task
>    Reporter: Guanghao Zhang
>Priority: Major
> Fix For: 2.2.6
>
>
> For the region admin API, we test move/split/merge/assign/unassign and check 
> whether the region location is right. But the balancer may move regions to other 
> places and break the UT. So turn off the balancer for TestAsyncRegionAdminApi.




[jira] [Resolved] (HBASE-24912) Enlarge MemstoreFlusherChore/CompactionChecker period for unit test

2020-08-21 Thread Guanghao Zhang (Jira)



 [ 
https://issues.apache.org/jira/browse/HBASE-24912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang resolved HBASE-24912.

Fix Version/s: 2.3.2
   2.2.6
   2.4.0
   3.0.0-alpha-1
   Resolution: Fixed

Pushed to branch-2.2+. Thanks [~stack] for reviewing.

> Enlarge MemstoreFlusherChore/CompactionChecker period for unit test
> ---
>
> Key: HBASE-24912
> URL: https://issues.apache.org/jira/browse/HBASE-24912
> Project: HBase
>  Issue Type: Improvement
>    Reporter: Guanghao Zhang
>Priority: Major
> Fix For: 3.0.0-alpha-1, 2.4.0, 2.2.6, 2.3.2
>
>
> Too many debug logs are printed when running unit tests now.
>  
> 2020-08-19 01:20:59,899 DEBUG [regionserver/asf909:0.Chore.1] 
> hbase.ScheduledChore(192): CompactionChecker execution time: 0 ms.
> 2020-08-19 01:20:59,899 DEBUG [regionserver/asf909:0.Chore.1] 
> hbase.ScheduledChore(192): MemstoreFlusherChore execution time: 0 ms.
> 2020-08-19 01:20:59,900 DEBUG [regionserver/asf909:0.Chore.1] 
> hbase.ScheduledChore(192): CompactionChecker execution time: 0 ms.
> 2020-08-19 01:20:59,900 DEBUG [regionserver/asf909:0.Chore.1] 
> hbase.ScheduledChore(192): MemstoreFlusherChore execution time: 0 ms.
> 2020-08-19 01:20:59,905 DEBUG [regionserver/asf909:0.Chore.1] 
> hbase.ScheduledChore(192): MemstoreFlusherChore execution time: 0 ms.
> 2020-08-19 01:20:59,905 DEBUG [regionserver/asf909:0.Chore.1] 
> hbase.ScheduledChore(192): CompactionChecker execution time: 0 ms.
> 2020-08-19 01:21:00,001 DEBUG [regionserver/asf909:0.Chore.1] 
> hbase.ScheduledChore(192): CompactionChecker execution time: 0 ms.




[jira] [Resolved] (HBASE-24689) Generate CHANGES.md and RELEASENOTES.md for 2.2.6

2020-08-26 Thread Guanghao Zhang (Jira)



 [ 
https://issues.apache.org/jira/browse/HBASE-24689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang resolved HBASE-24689.

Resolution: Fixed

> Generate CHANGES.md and RELEASENOTES.md for 2.2.6
> -
>
> Key: HBASE-24689
> URL: https://issues.apache.org/jira/browse/HBASE-24689
> Project: HBase
>  Issue Type: Sub-task
>    Reporter: Guanghao Zhang
>        Assignee: Guanghao Zhang
>Priority: Major
> Fix For: 2.2.6
>
>





[jira] [Resolved] (HBASE-24906) Enlarge the wait time in TestReplicationEndpoint/TestMetaWithReplicasBasic

2020-08-23 Thread Guanghao Zhang (Jira)



 [ 
https://issues.apache.org/jira/browse/HBASE-24906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang resolved HBASE-24906.

Fix Version/s: 2.2.6
   Resolution: Fixed

> Enlarge the wait time in TestReplicationEndpoint/TestMetaWithReplicasBasic
> --
>
> Key: HBASE-24906
> URL: https://issues.apache.org/jira/browse/HBASE-24906
> Project: HBase
>  Issue Type: Sub-task
>    Reporter: Guanghao Zhang
>        Assignee: Guanghao Zhang
>Priority: Major
> Fix For: 2.2.6
>
>
> Failed many times, but the failure details differ: the number of replicated 
> entries is different each time. This means replication is working and just needs 
> more time to replicate all 2500 entries.
> h3. Error Message
> Waiting timed out after [30,000] msec Failed to replicate all edits, expected 
> = 2500 replicated = 2499
>  
> h3. Error Message
> Waiting timed out after [30,000] msec Failed to replicate all edits, expected 
> = 2500 replicated = 2481
>  
> h3. Error Message
> Waiting timed out after [30,000] msec Failed to replicate all edits, expected 
> = 2500 replicated = 2491




[jira] [Resolved] (HBASE-24904) Speed up some unit tests

2020-08-23 Thread Guanghao Zhang (Jira)



 [ 
https://issues.apache.org/jira/browse/HBASE-24904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang resolved HBASE-24904.

Fix Version/s: 2.2.6
   Resolution: Fixed

> Speed up some unit tests
> 
>
> Key: HBASE-24904
> URL: https://issues.apache.org/jira/browse/HBASE-24904
> Project: HBase
>  Issue Type: Sub-task
>    Reporter: Guanghao Zhang
>        Assignee: Guanghao Zhang
>Priority: Major
> Fix For: 2.2.6
>
>
> See 
> [https://ci-hadoop.apache.org/job/HBase/job/HBase-Flaky-Tests/job/branch-2.2/42/testReport/org.apache.hadoop.hbase.client/TestAsyncTableAdminApi/]
> [https://ci-hadoop.apache.org/job/HBase/job/HBase-Flaky-Tests/job/branch-2.2/61/testReport/junit/org.apache.hadoop.hbase.client/TestSnapshotTemporaryDirectoryWithRegionReplicas//]
>  
> These UTs are flaky because they take too much time, more than 780 
> seconds. 
>  
> Split TestAsyncTableAdminApi/TestAdminShell/TestLoadIncrementalHFiles
>  
>  Reduce region numbers in 
> TestSnapshotTemporaryDirectoryWithRegionReplicas/TestRegionReplicaFailover/TestSCP*




[jira] [Resolved] (HBASE-24052) Add debug+fix to TestMasterShutdown

2020-08-24 Thread Guanghao Zhang (Jira)



 [ 
https://issues.apache.org/jira/browse/HBASE-24052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang resolved HBASE-24052.

Fix Version/s: 2.2.6
   Resolution: Fixed

Pushed to branch-2.2.

> Add debug+fix to TestMasterShutdown
> ---
>
> Key: HBASE-24052
> URL: https://issues.apache.org/jira/browse/HBASE-24052
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Michael Stack
>Assignee: Michael Stack
>Priority: Trivial
> Fix For: 3.0.0-alpha-1, 2.2.6, 2.3.0
>
> Attachments: 
> 0001-HBASE-24052-Add-debug-to-TestMasterShutdown.addendum.patch, 
> 0001-HBASE-24052-Add-debug-to-TestMasterShutdown.addendum2.patch, 
> 0001-HBASE-24052-Add-debug-to-TestMasterShutdown.patch
>
>
> Temporarily add debug to TestMasterShutdown overnight to learn more about a 
> test failure not reproducible locally.




[jira] [Reopened] (HBASE-24052) Add debug+fix to TestMasterShutdown

2020-08-24 Thread Guanghao Zhang (Jira)



 [ 
https://issues.apache.org/jira/browse/HBASE-24052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang reopened HBASE-24052:


Reopen for cherry-pick to branch-2.2.

> Add debug+fix to TestMasterShutdown
> ---
>
> Key: HBASE-24052
> URL: https://issues.apache.org/jira/browse/HBASE-24052
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Michael Stack
>Assignee: Michael Stack
>Priority: Trivial
> Fix For: 3.0.0-alpha-1, 2.3.0
>
> Attachments: 
> 0001-HBASE-24052-Add-debug-to-TestMasterShutdown.addendum.patch, 
> 0001-HBASE-24052-Add-debug-to-TestMasterShutdown.addendum2.patch, 
> 0001-HBASE-24052-Add-debug-to-TestMasterShutdown.patch
>
>
> Temporarily add debug to TestMasterShutdown overnight to learn more about a 
> test failure not reproducible locally.



