[jira] [Commented] (HBASE-20538) Upgrade our hadoop-two.version to 2.7.7 and 3.0.3

2018-07-27 Thread Mike Drob (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16560630#comment-16560630
 ] 

Mike Drob commented on HBASE-20538:
---

Ah, ok, great, thanks.

+1

> Upgrade our hadoop-two.version to 2.7.7 and 3.0.3
> -
>
> Key: HBASE-20538
> URL: https://issues.apache.org/jira/browse/HBASE-20538
> Project: HBase
>  Issue Type: Bug
>  Components: java, security
>Reporter: stack
>Assignee: Duo Zhang
>Priority: Critical
> Fix For: 3.0.0, 2.0.2, 2.1.1
>
> Attachments: 
> 0001-HBASE-20538-TestSaslFanOutOneBlockAsyncDFSOutput-DISABLE.patch, 
> HBASE-20538-v1.patch, HBASE-20538.patch
>
>
> Infra must have updated our JDK over the weekend. The test 
> TestSaslFanOutOneBlockAsyncDFSOutput has failed consistently since then.
> [~gabor.bota] already ran into it over in HDFS-13494 and provided helpful 
> pointers to the cause.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20538) Upgrade our hadoop-two.version to 2.7.7 and 3.0.3

2018-07-27 Thread Duo Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16560629#comment-16560629
 ] 

Duo Zhang commented on HBASE-20538:
---

See the linked issue, HBASE-20937.

> Upgrade our hadoop-two.version to 2.7.7 and 3.0.3
> -
>
> Key: HBASE-20538
> URL: https://issues.apache.org/jira/browse/HBASE-20538
> Project: HBase
>  Issue Type: Bug
>  Components: java, security
>Reporter: stack
>Assignee: Duo Zhang
>Priority: Critical
> Fix For: 3.0.0, 2.0.2, 2.1.1
>
> Attachments: 
> 0001-HBASE-20538-TestSaslFanOutOneBlockAsyncDFSOutput-DISABLE.patch, 
> HBASE-20538-v1.patch, HBASE-20538.patch
>
>
> Infra must have updated our JDK over the weekend. The test 
> TestSaslFanOutOneBlockAsyncDFSOutput has failed consistently since then.
> [~gabor.bota] already ran into it over in HDFS-13494 and provided helpful 
> pointers to the cause.





[jira] [Updated] (HBASE-20925) Canary test to expose table availability rate

2018-07-27 Thread Xu Cang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xu Cang updated HBASE-20925:

Description: 
Canary test to expose table availability rate.

It prints the table availability rate (read success rate per table), as in the 
sample below.

2018-07-27 17:11:06,823 INFO [CanaryMonitor-1532736665083] tool.Canary:
2018-07-27 17:11:06,824 INFO [CanaryMonitor-1532736665083] tool.Canary: === Summary: ===
2018-07-27 17:11:06,824 INFO [CanaryMonitor-1532736665083] tool.Canary: Read success rate for table : MyTable is: 1.0 .
2018-07-27 17:11:06,824 INFO [CanaryMonitor-1532736665083] tool.Canary: Read success rate for table : mytable3 is: 0.9
2018-07-27 17:11:06,824 INFO [CanaryMonitor-1532736665083] tool.Canary: Read success rate for table : mytable2 is: 0.8
2018-07-27 17:11:06,824 INFO [CanaryMonitor-1532736665083] tool.Canary: Read success rate for table : mytable4 is: 1.0
2018-07-27 17:11:06,824 INFO [CanaryMonitor-1532736665083] tool.Canary: ===END==

  was:Canary test to expose table availability rate.


> Canary test to expose table availability rate 
> --
>
> Key: HBASE-20925
> URL: https://issues.apache.org/jira/browse/HBASE-20925
> Project: HBase
>  Issue Type: Improvement
>  Components: canary
>Affects Versions: 3.0.0, 2.0.0, 1.4.6
>Reporter: Xu Cang
>Assignee: Xu Cang
>Priority: Major
>  Labels: Canary
> Attachments: HBASE-20925.master.001.patch, 
> HBASE-20925.master.002.patch
>
>
> Canary test to expose table availability rate.
>
> It prints the table availability rate (read success rate per table), as in the 
> sample below.
>
> 2018-07-27 17:11:06,823 INFO [CanaryMonitor-1532736665083] tool.Canary:
> 2018-07-27 17:11:06,824 INFO [CanaryMonitor-1532736665083] tool.Canary: === Summary: ===
> 2018-07-27 17:11:06,824 INFO [CanaryMonitor-1532736665083] tool.Canary: Read success rate for table : MyTable is: 1.0 .
> 2018-07-27 17:11:06,824 INFO [CanaryMonitor-1532736665083] tool.Canary: Read success rate for table : mytable3 is: 0.9
> 2018-07-27 17:11:06,824 INFO [CanaryMonitor-1532736665083] tool.Canary: Read success rate for table : mytable2 is: 0.8
> 2018-07-27 17:11:06,824 INFO [CanaryMonitor-1532736665083] tool.Canary: Read success rate for table : mytable4 is: 1.0
> 2018-07-27 17:11:06,824 INFO [CanaryMonitor-1532736665083] tool.Canary: ===END==
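
The HBASE-20925 description reports one read success rate per table. A minimal sketch of how such a rate could be tallied is given here. It assumes a hypothetical TableReadStats helper; the class and method names are illustrative and are not taken from the Canary tool or the attached patches.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper for illustration only; not the Canary implementation.
final class TableReadStats {
  // Per table: index 0 counts successful reads, index 1 counts failed reads.
  private final Map<String, long[]> counts = new TreeMap<>();

  // Record the outcome of one read against the given table.
  void record(String table, boolean success) {
    long[] c = counts.computeIfAbsent(table, t -> new long[2]);
    c[success ? 0 : 1]++;
  }

  // Fraction of reads that succeeded; NaN when no reads were recorded,
  // because the rate is undefined in that case.
  double successRate(String table) {
    long[] c = counts.get(table);
    if (c == null || c[0] + c[1] == 0) {
      return Double.NaN;
    }
    return (double) c[0] / (c[0] + c[1]);
  }

  // Formats a line shaped like the sample log message in the description.
  String summaryLine(String table) {
    return "Read success rate for table : " + table + " is: " + successRate(table);
  }
}
```

With nine successful reads and one failed read recorded for a table, successRate returns 0.9, which matches the shape of the mytable3 line in the sample.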





[jira] [Commented] (HBASE-19369) HBase Should use Builder Pattern to Create Log Files while using WAL on Erasure Coding

2018-07-27 Thread Mike Drob (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16560613#comment-16560613
 ] 

Mike Drob commented on HBASE-19369:
---

Pushed to branch-2.1+. There was a conflict in branch-2.0 that I will look at 
on Monday.

fyi [~stack]

> HBase Should use Builder Pattern to Create Log Files while using WAL on 
> Erasure Coding
> --
>
> Key: HBASE-19369
> URL: https://issues.apache.org/jira/browse/HBASE-19369
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.0.0
>Reporter: Alex Leblang
>Assignee: Mike Drob
>Priority: Major
> Fix For: 3.0.0, 2.2.0, 2.1.1
>
> Attachments: HBASE-19369.master.001.patch, 
> HBASE-19369.master.002.patch, HBASE-19369.master.003.patch, 
> HBASE-19369.master.004.patch, HBASE-19369.v10.patch, HBASE-19369.v11.patch, 
> HBASE-19369.v12.patch, HBASE-19369.v13.patch, HBASE-19369.v5.patch, 
> HBASE-19369.v6.patch, HBASE-19369.v7.patch, HBASE-19369.v8.patch, 
> HBASE-19369.v9.patch
>
>
> Right now an HBase instance using the WAL won't function properly in an 
> Erasure Coded environment. We should change the following line to use the 
> hdfs.DistributedFileSystem builder pattern 
> https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/ProtobufLogWriter.java#L92





[jira] [Updated] (HBASE-19369) HBase Should use Builder Pattern to Create Log Files while using WAL on Erasure Coding

2018-07-27 Thread Mike Drob (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-19369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob updated HBASE-19369:
--
Fix Version/s: 2.1.1

> HBase Should use Builder Pattern to Create Log Files while using WAL on 
> Erasure Coding
> --
>
> Key: HBASE-19369
> URL: https://issues.apache.org/jira/browse/HBASE-19369
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.0.0
>Reporter: Alex Leblang
>Assignee: Mike Drob
>Priority: Major
> Fix For: 3.0.0, 2.2.0, 2.1.1
>
> Attachments: HBASE-19369.master.001.patch, 
> HBASE-19369.master.002.patch, HBASE-19369.master.003.patch, 
> HBASE-19369.master.004.patch, HBASE-19369.v10.patch, HBASE-19369.v11.patch, 
> HBASE-19369.v12.patch, HBASE-19369.v13.patch, HBASE-19369.v5.patch, 
> HBASE-19369.v6.patch, HBASE-19369.v7.patch, HBASE-19369.v8.patch, 
> HBASE-19369.v9.patch
>
>
> Right now an HBase instance using the WAL won't function properly in an 
> Erasure Coded environment. We should change the following line to use the 
> hdfs.DistributedFileSystem builder pattern 
> https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/ProtobufLogWriter.java#L92





[jira] [Updated] (HBASE-20925) Canary test to expose table availability rate

2018-07-27 Thread Xu Cang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xu Cang updated HBASE-20925:

Attachment: HBASE-20925.master.002.patch

> Canary test to expose table availability rate 
> --
>
> Key: HBASE-20925
> URL: https://issues.apache.org/jira/browse/HBASE-20925
> Project: HBase
>  Issue Type: Improvement
>  Components: canary
>Affects Versions: 3.0.0, 2.0.0, 1.4.6
>Reporter: Xu Cang
>Assignee: Xu Cang
>Priority: Major
>  Labels: Canary
> Attachments: HBASE-20925.master.001.patch, 
> HBASE-20925.master.002.patch
>
>
> Canary test to expose table availability rate.





[jira] [Commented] (HBASE-20538) Upgrade our hadoop-two.version to 2.7.7 and 3.0.3

2018-07-27 Thread Mike Drob (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16560597#comment-16560597
 ] 

Mike Drob commented on HBASE-20538:
---

Do we need to update anything in the ref guide?

> Upgrade our hadoop-two.version to 2.7.7 and 3.0.3
> -
>
> Key: HBASE-20538
> URL: https://issues.apache.org/jira/browse/HBASE-20538
> Project: HBase
>  Issue Type: Bug
>  Components: java, security
>Reporter: stack
>Assignee: Duo Zhang
>Priority: Critical
> Fix For: 3.0.0, 2.0.2, 2.1.1
>
> Attachments: 
> 0001-HBASE-20538-TestSaslFanOutOneBlockAsyncDFSOutput-DISABLE.patch, 
> HBASE-20538-v1.patch, HBASE-20538.patch
>
>
> Infra must have updated our JDK over the weekend. The test 
> TestSaslFanOutOneBlockAsyncDFSOutput has failed consistently since then.
> [~gabor.bota] already ran into it over in HDFS-13494 and provided helpful 
> pointers to the cause.





[jira] [Commented] (HBASE-20950) Helper method to configure secure DFS cluster for tests

2018-07-27 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16560582#comment-16560582
 ] 

Hadoop QA commented on HBASE-20950:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
11s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 5 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
23s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
59s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  3m  
1s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
58s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
40s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m  
2s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
11s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
14s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
 4s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  2m 
57s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m 
14s{color} | {color:red} hbase-server: The patch generated 3 new + 5 unchanged 
- 1 fixed = 8 total (was 6) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
43s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
10m 41s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
6s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}124m 
45s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  2m 20s{color} 
| {color:red} hbase-thrift in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
54s{color} | {color:green} hbase-endpoint in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  1m 
 2s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}183m 42s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.thrift.TestThriftSpnegoHttpServer |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-20950 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12933451/HBASE-20950.master.002.patch
 |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  shadedjars  
hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux 53b74c794f95 3.13.0-143-generic #192-Ubuntu SMP Tue Feb 27 
10:45:36 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality 

[jira] [Updated] (HBASE-17633) Update unflushed sequence id in SequenceIdAccounting after flush with the minimum sequence id in memstore

2018-07-27 Thread Duo Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-17633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang updated HBASE-17633:
--
Status: Open  (was: Patch Available)

> Update unflushed sequence id in SequenceIdAccounting after flush with the 
> minimum sequence id in memstore
> -
>
> Key: HBASE-17633
> URL: https://issues.apache.org/jira/browse/HBASE-17633
> Project: HBase
>  Issue Type: Improvement
>  Components: wal
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
> Attachments: HBASE-17633-v1.patch, HBASE-17633.patch
>
>
> The tracking work is currently done by SequenceIdAccounting, and it is a 
> little tricky when dealing with flush. We have to remove the mapping for the 
> given stores of a region from lowestUnflushedSequenceIds so that there is room 
> to store the new lowest unflushed sequence id after the flush. But we still 
> need to keep the old sequence ids in another map, because those values are 
> used when reporting to master to prevent data loss (think of the scenario 
> where we report the new lowest unflushed sequence id to master and then crash 
> before actually flushing the data to disk).
> While reviewing HBASE-17407, I found that for CompactingMemStore we have to 
> record the minimum sequence id in the memstore. We could just update the 
> mappings in SequenceIdAccounting using these values after flush. This means 
> we do not need to update the lowest unflushed sequence id in 
> SequenceIdAccounting, do not need to make space for the new lowest unflushed 
> sequence id in startCacheFlush, and do not need the extra map to store the 
> old mappings.
> This could simplify our logic a lot. But it is a fundamental change, so I 
> need some time to implement it, especially to modify the tests, and some time 
> to check whether I have missed something.
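
The proposal in the HBASE-17633 description can be sketched roughly as follows. This is only an illustration under stated assumptions: LowestUnflushedTracker, its methods, and the NO_SEQ_ID sentinel are hypothetical names, regions and stores are identified by plain strings, and appends are assumed to arrive in increasing sequence id order. It is not the SequenceIdAccounting code.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch; not the real SequenceIdAccounting class.
final class LowestUnflushedTracker {
  // Assumed sentinel meaning "nothing unflushed".
  static final long NO_SEQ_ID = -1L;

  // region -> (store -> lowest unflushed sequence id)
  private final Map<String, Map<String, Long>> lowestUnflushed = new ConcurrentHashMap<>();

  // On append, remember only the first id seen per store. Under the assumption
  // that ids increase, the first id is the lowest unflushed one.
  void onAppend(String region, String store, long seqId) {
    lowestUnflushed
        .computeIfAbsent(region, r -> new ConcurrentHashMap<>())
        .putIfAbsent(store, seqId);
  }

  // Called only after the flush has reached disk. The mapping is replaced with
  // the minimum sequence id still in the memstore, so no slot has to be freed
  // when the flush starts and no second map of old values is kept.
  void onFlushCompleted(String region, String store, long minSeqIdInMemstore) {
    Map<String, Long> stores = lowestUnflushed.get(region);
    if (stores == null) {
      return;
    }
    if (minSeqIdInMemstore == NO_SEQ_ID) {
      stores.remove(store); // memstore is empty after the flush
    } else {
      stores.put(store, minSeqIdInMemstore);
    }
  }

  // Value that would be reported for the region; until a flush completes it
  // still reflects the pre-flush ids.
  long lowestForRegion(String region) {
    Map<String, Long> stores = lowestUnflushed.get(region);
    if (stores == null) {
      return NO_SEQ_ID;
    }
    long min = Long.MAX_VALUE;
    for (long v : stores.values()) {
      min = Math.min(min, v);
    }
    return min == Long.MAX_VALUE ? NO_SEQ_ID : min;
  }
}
```

Because the mapping changes only in onFlushCompleted, a crash between the start of a flush and its completion leaves the old, lower id as the reported value, which is the data loss scenario the description is concerned with.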





[jira] [Commented] (HBASE-20538) Upgrade our hadoop-two.version to 2.7.7 and 3.0.3

2018-07-27 Thread Duo Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16560577#comment-16560577
 ] 

Duo Zhang commented on HBASE-20538:
---

Ping [~stack] [~busbey].

> Upgrade our hadoop-two.version to 2.7.7 and 3.0.3
> -
>
> Key: HBASE-20538
> URL: https://issues.apache.org/jira/browse/HBASE-20538
> Project: HBase
>  Issue Type: Bug
>  Components: java, security
>Reporter: stack
>Assignee: Duo Zhang
>Priority: Critical
> Fix For: 3.0.0, 2.0.2, 2.1.1
>
> Attachments: 
> 0001-HBASE-20538-TestSaslFanOutOneBlockAsyncDFSOutput-DISABLE.patch, 
> HBASE-20538-v1.patch, HBASE-20538.patch
>
>
> Infra must have updated our JDK over the weekend. The test 
> TestSaslFanOutOneBlockAsyncDFSOutput has failed consistently since then.
> [~gabor.bota] already ran into it over in HDFS-13494 and provided helpful 
> pointers to the cause.





[jira] [Commented] (HBASE-19369) HBase Should use Builder Pattern to Create Log Files while using WAL on Erasure Coding

2018-07-27 Thread Duo Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16560575#comment-16560575
 ] 

Duo Zhang commented on HBASE-19369:
---

+1.

> HBase Should use Builder Pattern to Create Log Files while using WAL on 
> Erasure Coding
> --
>
> Key: HBASE-19369
> URL: https://issues.apache.org/jira/browse/HBASE-19369
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.0.0
>Reporter: Alex Leblang
>Assignee: Mike Drob
>Priority: Major
> Fix For: 3.0.0, 2.2.0
>
> Attachments: HBASE-19369.master.001.patch, 
> HBASE-19369.master.002.patch, HBASE-19369.master.003.patch, 
> HBASE-19369.master.004.patch, HBASE-19369.v10.patch, HBASE-19369.v11.patch, 
> HBASE-19369.v12.patch, HBASE-19369.v13.patch, HBASE-19369.v5.patch, 
> HBASE-19369.v6.patch, HBASE-19369.v7.patch, HBASE-19369.v8.patch, 
> HBASE-19369.v9.patch
>
>
> Right now an HBase instance using the WAL won't function properly in an 
> Erasure Coded environment. We should change the following line to use the 
> hdfs.DistributedFileSystem builder pattern 
> https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/ProtobufLogWriter.java#L92





[jira] [Commented] (HBASE-20925) Canary test to expose table availability rate

2018-07-27 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16560574#comment-16560574
 ] 

Hadoop QA commented on HBASE-20925:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
11s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:orange}-0{color} | {color:orange} test4tests {color} | {color:orange}  
0m  0s{color} | {color:orange} The patch doesn't appear to include any new or 
modified tests. Please justify why no new tests are needed for this patch. Also 
please list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
 2s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
51s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
13s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
46s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
22s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
35s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
51s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m 
10s{color} | {color:red} hbase-server: The patch generated 7 new + 42 unchanged 
- 2 fixed = 49 total (was 44) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
44s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
10m 26s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
32s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}124m  
2s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
24s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}167m 33s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-20925 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12933450/HBASE-20925.master.001.patch
 |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  shadedjars  
hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux f4d842cceb7c 3.13.0-143-generic #192-Ubuntu SMP Tue Feb 27 
10:45:36 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build@2/component/dev-support/hbase-personality.sh
 |
| git revision | master / cf481d3b51 |
| maven | version: Apache Maven 3.5.4 
(1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| Default Java | 1.8.0_171 |
| findbugs | v3.1.0-RC3 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HBASE-Build/13837/artifact/patchprocess/diff-checkstyle-hbase-server.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HBASE-Build/13837/testReport/ |

[jira] [Commented] (HBASE-20895) NPE in RpcServer#readAndProcess

2018-07-27 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16560569#comment-16560569
 ] 

Hudson commented on HBASE-20895:


FAILURE: Integrated in Jenkins build HBase-1.3-IT #444 (See 
[https://builds.apache.org/job/HBase-1.3-IT/444/])
HBASE-20895 NPE in RpcServer#readAndProcess (apurtell: rev 
16138513cc1bb62f0019613f17e0141582f25fff)
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/RpcServer.java


> NPE in RpcServer#readAndProcess
> ---
>
> Key: HBASE-20895
> URL: https://issues.apache.org/jira/browse/HBASE-20895
> Project: HBase
>  Issue Type: Bug
>  Components: rpc
>Affects Versions: 1.3.2
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
>Priority: Major
> Fix For: 1.5.0, 1.2.7, 1.3.3, 1.4.7
>
> Attachments: HBASE-20895-branch-1.patch, HBASE-20895-branch-1.patch
>
>
> {noformat}
> 2018-07-10 16:25:55,005 DEBUG [.sfdc.net,port=60020] ipc.RpcServer - 
> RpcServer.listener,port=60020: Caught exception while reading:
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hbase.ipc.RpcServer$Connection.readAndProcess(RpcServer.java:1761)
> at 
> org.apache.hadoop.hbase.ipc.RpcServer$Listener.doRead(RpcServer.java:949)
> at 
> org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.doRunLoop(RpcServer.java:730)
> at 
> org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.run(RpcServer.java:706)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> This looks like it could be a use-after-close problem if there is concurrent 
> access to a Connection.
> In process() we might store a null back to the 'data' field.
> Meanwhile, in readAndProcess() we might be blocked on a channel read and then, 
> after coming back from the read, go on to use 'data' after a null has been 
> written back, leading to an NPE.
> {quote}count = channelRead(channel, data);
>  1761 ---> if (count >= 0 && *data.remaining()* == 0)
>  \{ process(); }{quote}
> Whether an NPE happens depends on the timing of the store back to 'data' in 
> another thread relative to the use of 'data' in this thread, and on whether 
> the JVM has optimized away a reload of 'data' (it is not declared volatile).
> We should do a null check here just to be defensive. We should also look at 
> whether concurrent access to the Connection is happening and intended. The 
> above is just a theory. We should also look at other execution sequences that 
> could lead to 'data' being null in this location. At a glance I didn't find 
> one, but the store to 'data' happens behind conditionals, so it is possible.
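
The defensive null check suggested in the HBASE-20895 description could look roughly like the sketch below. It is not the committed patch: ReadGuardSketch, setData, and onReadReturned are hypothetical names, and the channel read is left out so that only the guard is shown. The field is read once into a local variable so that the null check and the later use see the same reference.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch of the guard; not the RpcServer.Connection code.
final class ReadGuardSketch {
  // Stands in for the 'data' field that another code path may set to null.
  private ByteBuffer data;
  int processedCount;

  void setData(ByteBuffer buffer) {
    data = buffer;
  }

  // Models the step right after the channel read returns 'count'.
  // Returns true when a complete buffer was handed to process().
  boolean onReadReturned(int count) {
    ByteBuffer local = data; // single read of the field
    if (local == null) {
      return false; // cleared elsewhere; skip instead of throwing an NPE
    }
    if (count >= 0 && local.remaining() == 0) {
      process();
      return true;
    }
    return false;
  }

  // Mirrors the description: process() may store a null back to 'data'.
  private void process() {
    processedCount++;
    data = null;
  }
}
```

The guard only avoids the exception. Whether concurrent access to the Connection happens at all is a separate question that the description leaves open.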





[jira] [Commented] (HBASE-20895) NPE in RpcServer#readAndProcess

2018-07-27 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16560564#comment-16560564
 ] 

Hudson commented on HBASE-20895:


SUCCESS: Integrated in Jenkins build HBase-1.2-IT #1141 (See 
[https://builds.apache.org/job/HBase-1.2-IT/1141/])
HBASE-20895 NPE in RpcServer#readAndProcess (apurtell: rev 
b3fd08e39e99703e13e970854eb3c1622b9476ae)
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/RpcServer.java


> NPE in RpcServer#readAndProcess
> ---
>
> Key: HBASE-20895
> URL: https://issues.apache.org/jira/browse/HBASE-20895
> Project: HBase
>  Issue Type: Bug
>  Components: rpc
>Affects Versions: 1.3.2
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
>Priority: Major
> Fix For: 1.5.0, 1.2.7, 1.3.3, 1.4.7
>
> Attachments: HBASE-20895-branch-1.patch, HBASE-20895-branch-1.patch
>
>
> {noformat}
> 2018-07-10 16:25:55,005 DEBUG [.sfdc.net,port=60020] ipc.RpcServer - 
> RpcServer.listener,port=60020: Caught exception while reading:
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hbase.ipc.RpcServer$Connection.readAndProcess(RpcServer.java:1761)
> at 
> org.apache.hadoop.hbase.ipc.RpcServer$Listener.doRead(RpcServer.java:949)
> at 
> org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.doRunLoop(RpcServer.java:730)
> at 
> org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.run(RpcServer.java:706)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> This looks like it could be a use-after-close problem if there is concurrent 
> access to a Connection.
> In process() we might store a null back to the 'data' field.
> Meanwhile, in readAndProcess() we might be blocked on a channel read and then, 
> after coming back from the read, go on to use 'data' after a null has been 
> written back, leading to an NPE.
> {quote}count = channelRead(channel, data);
>  1761 ---> if (count >= 0 && *data.remaining()* == 0)
>  \{ process(); }{quote}
> Whether an NPE happens depends on the timing of the store back to 'data' in 
> another thread relative to the use of 'data' in this thread, and on whether 
> the JVM has optimized away a reload of 'data' (it is not declared volatile).
> We should do a null check here just to be defensive. We should also look at 
> whether concurrent access to the Connection is happening and intended. The 
> above is just a theory. We should also look at other execution sequences that 
> could lead to 'data' being null in this location. At a glance I didn't find 
> one, but the store to 'data' happens behind conditionals, so it is possible.





[jira] [Commented] (HBASE-20950) Helper method to configure secure DFS cluster for tests

2018-07-27 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16560558#comment-16560558
 ] 

Hadoop QA commented on HBASE-20950:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 19s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 5 new or modified test files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 25s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 37s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 55s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 46s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 21s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 38s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  4s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 17s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  2m 53s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m  3s{color} | {color:red} hbase-server: The patch generated 3 new + 5 unchanged - 1 fixed = 8 total (was 6) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 22s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green}  9m 50s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  3s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}159m 56s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  3m  5s{color} | {color:red} hbase-thrift in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  4m 25s{color} | {color:green} hbase-endpoint in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  1m 10s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}217m 43s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.thrift.TestThriftSpnegoHttpServer |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-20950 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12933443/HBASE-20950.master.001.patch |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  shadedjars  hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux e21a938c7658 4.4.0-130-generic #156-Ubuntu SMP Thu Jun 14 08:53:28 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 

[jira] [Updated] (HBASE-20895) NPE in RpcServer#readAndProcess

2018-07-27 Thread Andrew Purtell (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-20895:
---
   Resolution: Fixed
Fix Version/s: 1.2.7
   Status: Resolved  (was: Patch Available)

> NPE in RpcServer#readAndProcess
> ---
>
> Key: HBASE-20895
> URL: https://issues.apache.org/jira/browse/HBASE-20895
> Project: HBase
>  Issue Type: Bug
>  Components: rpc
>Affects Versions: 1.3.2
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
>Priority: Major
> Fix For: 1.5.0, 1.2.7, 1.3.3, 1.4.7
>
> Attachments: HBASE-20895-branch-1.patch, HBASE-20895-branch-1.patch
>
>
> {noformat}
> 2018-07-10 16:25:55,005 DEBUG [.sfdc.net,port=60020] ipc.RpcServer - 
> RpcServer.listener,port=60020: Caught exception while reading:
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hbase.ipc.RpcServer$Connection.readAndProcess(RpcServer.java:1761)
> at 
> org.apache.hadoop.hbase.ipc.RpcServer$Listener.doRead(RpcServer.java:949)
> at 
> org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.doRunLoop(RpcServer.java:730)
> at 
> org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.run(RpcServer.java:706)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> This looks like it could be a use-after-close problem if there is concurrent 
> access to a Connection.
> In process() we might store a null back to the 'data' field.
> Meanwhile, in readAndProcess() we might be blocked on a channel read; after 
> returning from the read we go on to use 'data' after a null has been written 
> back, leading to an NPE.
> {quote}count = channelRead(channel, data);
>  1761 ---> if (count >= 0 && *data.remaining()* == 0)
>  \{ process(); }{quote}
> Whether an NPE happens depends on the timing of the store back to 'data' in 
> another thread relative to the use of 'data' in this thread, and on whether 
> the JVM has optimized away a reload of 'data' (it is not declared volatile).
> We should do a null check here just to be defensive. We should also look at 
> whether concurrent access to the Connection is happening and intended. The 
> above is just a theory. We should also look at other execution sequences that 
> could lead to 'data' being null in this location. At a glance I didn't find 
> one, but the store to 'data' happens behind conditionals, so it is possible.
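
A minimal Java sketch of the defensive null check suggested in the description. It is not the committed patch: ConnectionSketch, clearData() and readyToProcess() are hypothetical names used only for illustration. The idea is to read the non-volatile field once into a local variable, so the null check and the later use see the same reference.

```java
import java.nio.ByteBuffer;

// Hypothetical stand-in for a connection whose 'data' buffer may be
// set to null by another code path (for example, process() or close()).
final class ConnectionSketch {
    // Not volatile, mirroring the field described in the report.
    private ByteBuffer data;

    ConnectionSketch(ByteBuffer initial) {
        this.data = initial;
    }

    // Simulates a code path that stores null back to the field.
    void clearData() {
        data = null;
    }

    // Returns true only when a buffer is present and completely filled.
    boolean readyToProcess(int count) {
        ByteBuffer local = data; // single read of the field
        if (local == null) {
            return false;        // defensive: skip instead of throwing an NPE
        }
        return count >= 0 && local.remaining() == 0;
    }
}
```

The local copy only prevents the NullPointerException at this call site. It does not make concurrent access to the connection safe, so the question of whether such access is intended still stands.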





[jira] [Commented] (HBASE-20895) NPE in RpcServer#readAndProcess

2018-07-27 Thread Andrew Purtell (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16560557#comment-16560557
 ] 

Andrew Purtell commented on HBASE-20895:


Thanks for reporting the problem [~mmonani]

> NPE in RpcServer#readAndProcess
> ---
>
> Key: HBASE-20895
> URL: https://issues.apache.org/jira/browse/HBASE-20895
> Project: HBase
>  Issue Type: Bug
>  Components: rpc
>Affects Versions: 1.3.2
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
>Priority: Major
> Fix For: 1.5.0, 1.2.7, 1.3.3, 1.4.7
>
> Attachments: HBASE-20895-branch-1.patch, HBASE-20895-branch-1.patch
>
>
> {noformat}
> 2018-07-10 16:25:55,005 DEBUG [.sfdc.net,port=60020] ipc.RpcServer - 
> RpcServer.listener,port=60020: Caught exception while reading:
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hbase.ipc.RpcServer$Connection.readAndProcess(RpcServer.java:1761)
> at 
> org.apache.hadoop.hbase.ipc.RpcServer$Listener.doRead(RpcServer.java:949)
> at 
> org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.doRunLoop(RpcServer.java:730)
> at 
> org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.run(RpcServer.java:706)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> This looks like it could be a use-after-close problem if there is concurrent 
> access to a Connection.
> In process() we might store a null back to the 'data' field.
> Meanwhile, in readAndProcess() we might be blocked on a channel read; after 
> returning from the read we go on to use 'data' after a null has been written 
> back, leading to an NPE.
> {quote}count = channelRead(channel, data);
>  1761 ---> if (count >= 0 && *data.remaining()* == 0)
>  \{ process(); }{quote}
> Whether an NPE happens depends on the timing of the store back to 'data' in 
> another thread relative to the use of 'data' in this thread, and on whether 
> the JVM has optimized away a reload of 'data' (it is not declared volatile).
> We should do a null check here just to be defensive. We should also look at 
> whether concurrent access to the Connection is happening and intended. The 
> above is just a theory. We should also look at other execution sequences that 
> could lead to 'data' being null in this location. At a glance I didn't find 
> one, but the store to 'data' happens behind conditionals, so it is possible.





[jira] [Commented] (HBASE-20893) Data loss if splitting region while ServerCrashProcedure executing

2018-07-27 Thread Allan Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16560546#comment-16560546
 ] 

Allan Yang commented on HBASE-20893:


[~stack], yeah, your addendum does work more nicely, I will +1 on this addendum.

> Data loss if splitting region while ServerCrashProcedure executing
> --
>
> Key: HBASE-20893
> URL: https://issues.apache.org/jira/browse/HBASE-20893
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 3.0.0, 2.1.0, 2.0.1
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Major
> Fix For: 3.0.0, 2.0.2, 2.2.0, 2.1.1
>
> Attachments: HBASE-20893-branch-2.0.addendum.patch, 
> HBASE-20893.branch-2.0.001.patch, HBASE-20893.branch-2.0.002.patch, 
> HBASE-20893.branch-2.0.003.patch, HBASE-20893.branch-2.0.004.patch, 
> HBASE-20893.branch-2.0.005.patch
>
>
> Similar case as HBASE-20878.





[jira] [Updated] (HBASE-20925) Canary test to expose table availability rate

2018-07-27 Thread Xu Cang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xu Cang updated HBASE-20925:

Affects Version/s: 3.0.0
   2.0.0
   1.4.6

> Canary test to expose table availability rate 
> --
>
> Key: HBASE-20925
> URL: https://issues.apache.org/jira/browse/HBASE-20925
> Project: HBase
>  Issue Type: Improvement
>  Components: canary
>Affects Versions: 3.0.0, 2.0.0, 1.4.6
>Reporter: Xu Cang
>Assignee: Xu Cang
>Priority: Major
>  Labels: Canary
> Attachments: HBASE-20925.master.001.patch
>
>
> Canary test to expose table availability rate.





[jira] [Commented] (HBASE-20950) Helper method to configure secure DFS cluster for tests

2018-07-27 Thread Wei-Chiu Chuang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16560514#comment-16560514
 ] 

Wei-Chiu Chuang commented on HBASE-20950:
-

Attached patch v002 to address Ted's review comments.

> Helper method to configure secure DFS cluster for tests
> ---
>
> Key: HBASE-20950
> URL: https://issues.apache.org/jira/browse/HBASE-20950
> Project: HBase
>  Issue Type: Sub-task
>  Components: test
>Affects Versions: 3.0.0
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
>Priority: Major
> Attachments: HBASE-20950.master.001.patch, 
> HBASE-20950.master.002.patch
>
>
> There is quite some boilerplate code for configuring a secure HDFS cluster 
> for tests. The code is repeated in a number of test files within HBase code 
> base. Convert the boilerplate code into a helper method to avoid duplication 
> and lower maintenance effort.
> SecureTestCluster#setHdfsSecuredConfiguration
> TestSecureExport#setUpClusterKdc
> TestThriftSpnegoHttpServer#addSecurityConfigurations
> TestSaslFanOutOneBlockAsyncDFSOutput#setHdfsSecuredConfiguration
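
A rough sketch of the kind of shared helper the description asks for, written against a plain Map so it runs without Hadoop on the classpath. SecureDfsConfigSketch and setSecuredHdfsConfiguration are hypothetical names, not the method added by the patch. The property names are recalled from Hadoop's DFS configuration keys; the set of properties the real helper configures is not shown in this thread and is an assumption here.

```java
import java.util.Map;

// Hypothetical helper that gathers the repeated "secure HDFS" settings
// in one place so individual tests do not have to copy them.
final class SecureDfsConfigSketch {
    private SecureDfsConfigSketch() {
    }

    // Applies an assumed subset of Kerberos-related HDFS settings to conf.
    static void setSecuredHdfsConfiguration(Map<String, String> conf,
                                            String principal,
                                            String keytabPath) {
        conf.put("dfs.namenode.kerberos.principal", principal);
        conf.put("dfs.namenode.keytab.file", keytabPath);
        conf.put("dfs.datanode.kerberos.principal", principal);
        conf.put("dfs.datanode.keytab.file", keytabPath);
        conf.put("dfs.block.access.token.enable", "true");
    }
}
```

Each of the four tests listed above would then call the one helper instead of repeating the property assignments, which is the duplication the issue aims to remove.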





[jira] [Commented] (HBASE-20528) Revise collections copying from iteration to built-in function

2018-07-27 Thread Hua-Yi Ho (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16560512#comment-16560512
 ] 

Hua-Yi Ho commented on HBASE-20528:
---

I have fixed both of them.

> Revise collections copying from iteration to built-in function
> --
>
> Key: HBASE-20528
> URL: https://issues.apache.org/jira/browse/HBASE-20528
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 3.0.0
>Reporter: Hua-Yi Ho
>Assignee: Hua-Yi Ho
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: HBASE-20528.master.001.patch
>
>
> Some collection code in the files StochasticLoadBalancer.java, 
> AbstractHBaseTool.java, HFileInputFormat.java, Result.java, and 
> WalPlayer.java uses loops to copy the entire contents of a collection. These 
> loops can be replaced with Collections.addAll and Arrays.copyOf.
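
The replacement described above can be illustrated with a small before/after sketch. CopySketch and its method names are made up for this example and do not come from the patch; only Collections.addAll and Arrays.copyOf are the standard JDK calls named in the description.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class CopySketch {
    private CopySketch() {
    }

    // Before: element-by-element loop into a list.
    static List<String> copyWithLoop(String[] src) {
        List<String> out = new ArrayList<>();
        for (String s : src) {
            out.add(s);
        }
        return out;
    }

    // After: one call to Collections.addAll.
    static List<String> copyWithAddAll(String[] src) {
        List<String> out = new ArrayList<>(src.length);
        Collections.addAll(out, src);
        return out;
    }

    // Before: manual array copy.
    static byte[] copyArrayWithLoop(byte[] src) {
        byte[] out = new byte[src.length];
        for (int i = 0; i < src.length; i++) {
            out[i] = src[i];
        }
        return out;
    }

    // After: Arrays.copyOf allocates and copies in one step.
    static byte[] copyArray(byte[] src) {
        return Arrays.copyOf(src, src.length);
    }
}
```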





[jira] [Updated] (HBASE-20950) Helper method to configure secure DFS cluster for tests

2018-07-27 Thread Wei-Chiu Chuang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HBASE-20950:

Attachment: HBASE-20950.master.002.patch

> Helper method to configure secure DFS cluster for tests
> ---
>
> Key: HBASE-20950
> URL: https://issues.apache.org/jira/browse/HBASE-20950
> Project: HBase
>  Issue Type: Sub-task
>  Components: test
>Affects Versions: 3.0.0
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
>Priority: Major
> Attachments: HBASE-20950.master.001.patch, 
> HBASE-20950.master.002.patch
>
>
> There is quite some boilerplate code for configuring a secure HDFS cluster 
> for tests. The code is repeated in a number of test files within HBase code 
> base. Convert the boilerplate code into a helper method to avoid duplication 
> and lower maintenance effort.
> SecureTestCluster#setHdfsSecuredConfiguration
> TestSecureExport#setUpClusterKdc
> TestThriftSpnegoHttpServer#addSecurityConfigurations
> TestSaslFanOutOneBlockAsyncDFSOutput#setHdfsSecuredConfiguration





[jira] [Commented] (HBASE-20925) Canary test to expose table availability rate

2018-07-27 Thread Xu Cang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16560510#comment-16560510
 ] 

Xu Cang commented on HBASE-20925:
-

Example output:

username@username-wsl:~/Downloads/hbase-master-github/hbase$ bin/hbase 
org.apache.hadoop.hbase.tool.Canary -regionMode -tableAvailabilitySummary
2018-07-27 17:11:04,644 INFO [main] tool.Canary: Number of execution threads 16
2018-07-27 17:11:04,673 WARN [main] util.NativeCodeLoader: Unable to load 
native-hadoop library for your platform... using builtin-java classes where 
applicable
2018-07-27 17:11:04,912 INFO [main] zookeeper.ReadOnlyZKClient: Connect 
0x1e730495 to localhost:2181 with session timeout=9ms, retries 30, retry 
interval 1000ms, keepAlive=6ms
2018-07-27 17:11:04,920 INFO [ReadOnlyZKClient-localhost:2181@0x1e730495] 
zookeeper.ZooKeeper: Client 
environment:zookeeper.version=3.4.10-39d3a4f269333c922ed3db283be479f9deacaa0f, 
built on 03/23/2017 10:13 GMT
2018-07-27 17:11:04,920 INFO [ReadOnlyZKClient-localhost:2181@0x1e730495] 
zookeeper.ZooKeeper: Client environment:host.name=username-wsl
2018-07-27 17:11:04,920 INFO [ReadOnlyZKClient-localhost:2181@0x1e730495] 
zookeeper.ZooKeeper: Client environment:java.version=1.8.0_171
2018-07-27 17:11:04,920 INFO [ReadOnlyZKClient-localhost:2181@0x1e730495] 
zookeeper.ZooKeeper: Client environment:java.vendor=Oracle Corporation
2018-07-27 17:11:04,920 INFO [ReadOnlyZKClient-localhost:2181@0x1e730495] 
zookeeper.ZooKeeper: Client 
environment:java.home=/usr/lib/jvm/java-8-openjdk-amd64/jre
2018-07-27 17:11:04,920 INFO [ReadOnlyZKClient-localhost:2181@0x1e730495] 
zookeeper.ZooKeeper: 
ion-api-1.1.0.Final.jar:/home/username/Downloads/hbase-master-github/hbase/hbase-examples/target/hbase-examples-3.0.0-SNAPSHOT.jar:/home/username/Downloads/hbase-master-github/hbase/hbase-endpoint/target/hbase-endpoint-3.0.0-SNAPSHOT.jar:/home/username/.m2/repository/org/apache/curator/curator-framework/4.0.0/curator-framework-4.0.0.jar:/home/username/.m2/repository/org/apache/curator/curator-client/4.0.0/curator-client-4.0.0.jar:/home/username/.m2/repository/org/apache/curator/curator-recipes/4.0.0/curator-recipes-4.0.0.jar:/home/username/Downloads/hbase-master-github/hbase/hbase-zookeeper/target/hbase-zookeeper-3.0.0-SNAPSHOT.jar:/home/username/Downloads/hbase-master-github/hbase/hbase-rsgroup/target/hbase-rsgroup-3.0.0-SNAPSHOT.jar:/home/username/Downloads/hbase-master-github/hbase/hbase-annotations/target/hbase-annotations-3.0.0-SNAPSHOT-tests.jar:/home/username/.m2/repository/org/apache/yetus/audience-annotations/0.5.0/audience-annotations-0.5.0.jar:/home/username/.m2/repository/junit/junit/4.12/junit-4.12.jar
2018-07-27 17:11:04,920 INFO [ReadOnlyZKClient-localhost:2181@0x1e730495] 
zookeeper.ZooKeeper: Client 
environment:java.library.path=/usr/java/packages/lib/amd64:/usr/lib/x86_64-linux-gnu/jni:/lib/x86_64-linux-gnu:/usr/lib/x86_64-linux-gnu:/usr/lib/jni:/lib:/usr/lib
2018-07-27 17:11:04,920 INFO [ReadOnlyZKClient-localhost:2181@0x1e730495] 
zookeeper.ZooKeeper: Client environment:java.io.tmpdir=/tmp
2018-07-27 17:11:04,920 INFO [ReadOnlyZKClient-localhost:2181@0x1e730495] 
zookeeper.ZooKeeper: Client environment:java.compiler=
2018-07-27 17:11:04,920 INFO [ReadOnlyZKClient-localhost:2181@0x1e730495] 
zookeeper.ZooKeeper: Client environment:os.name=Linux
2018-07-27 17:11:04,920 INFO [ReadOnlyZKClient-localhost:2181@0x1e730495] 
zookeeper.ZooKeeper: Client environment:os.arch=amd64
2018-07-27 17:11:04,921 INFO [ReadOnlyZKClient-localhost:2181@0x1e730495] 
zookeeper.ZooKeeper: Client environment:os.version=4.4.0-127-generic
2018-07-27 17:11:04,921 INFO [ReadOnlyZKClient-localhost:2181@0x1e730495] 
zookeeper.ZooKeeper: Client environment:user.name=username
2018-07-27 17:11:04,921 INFO [ReadOnlyZKClient-localhost:2181@0x1e730495] 
zookeeper.ZooKeeper: Client environment:user.home=/home/username
2018-07-27 17:11:04,921 INFO [ReadOnlyZKClient-localhost:2181@0x1e730495] 
zookeeper.ZooKeeper: Client 
environment:user.dir=/home/username/Downloads/hbase-master-github/hbase
2018-07-27 17:11:04,922 INFO [ReadOnlyZKClient-localhost:2181@0x1e730495] 
zookeeper.ZooKeeper: Initiating client connection, connectString=localhost:2181 
sessionTimeout=9 
watcher=org.apache.hadoop.hbase.zookeeper.ReadOnlyZKClient$$Lambda$12/1936622116@521189fe
2018-07-27 17:11:04,933 INFO 
[ReadOnlyZKClient-localhost:2181@0x1e730495-SendThread(localhost:2181)] 
zookeeper.ClientCnxn: Opening socket connection to server 
localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown 
error)
2018-07-27 17:11:04,940 INFO 
[ReadOnlyZKClient-localhost:2181@0x1e730495-SendThread(localhost:2181)] 
zookeeper.ClientCnxn: Socket connection established to 
localhost/127.0.0.1:2181, initiating session
2018-07-27 17:11:04,955 INFO 
[ReadOnlyZKClient-localhost:2181@0x1e730495-SendThread(localhost:2181)] 
zookeeper.ClientCnxn: Session 

[jira] [Updated] (HBASE-20925) Canary test to expose table availability rate

2018-07-27 Thread Xu Cang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xu Cang updated HBASE-20925:

Attachment: HBASE-20925.master.001.patch
Status: Patch Available  (was: Open)

> Canary test to expose table availability rate 
> --
>
> Key: HBASE-20925
> URL: https://issues.apache.org/jira/browse/HBASE-20925
> Project: HBase
>  Issue Type: Improvement
>  Components: canary
>Reporter: Xu Cang
>Assignee: Xu Cang
>Priority: Major
> Attachments: HBASE-20925.master.001.patch
>
>
> Canary test to expose table availability rate.





[jira] [Updated] (HBASE-20925) Canary test to expose table availability rate

2018-07-27 Thread Xu Cang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xu Cang updated HBASE-20925:

Description: Canary test to expose table availability rate.  (was: This 
change will provide a new command-line argument, "-verboseTestResultFilePath", 
and Canary will write structured, easy-to-parse, easy-to-read strings to the 
specified file.

The result file should provide a summary of the canary test and show which 
regions have read or write failures.

It should also include some stats about the test, such as "region read count: 
500, region read success count: 499".

(Previously, the Canary test sent some of the above information to the log 
file, mixed with other debugging information. The format also varies between 
tests, which makes it hard to parse.))
Summary: Canary test to expose table availability rate   (was: Canary 
test to expose results per table/ per region to result file)
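
For illustration only, an availability rate can be derived from counts like the ones quoted in the earlier description (500 reads, 499 successes). AvailabilitySketch and availabilityPercent are hypothetical names, and this is not how the Canary patch itself computes or prints its summary.

```java
final class AvailabilitySketch {
    private AvailabilitySketch() {
    }

    // Returns the share of successful reads as a percentage of all reads.
    static double availabilityPercent(long successCount, long totalCount) {
        if (totalCount <= 0 || successCount < 0 || successCount > totalCount) {
            throw new IllegalArgumentException(
                "expected 0 <= successCount <= totalCount and totalCount > 0");
        }
        return 100.0 * successCount / totalCount;
    }
}
```

With the counts above, availabilityPercent(499, 500) comes out at roughly 99.8 percent.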

> Canary test to expose table availability rate 
> --
>
> Key: HBASE-20925
> URL: https://issues.apache.org/jira/browse/HBASE-20925
> Project: HBase
>  Issue Type: Improvement
>  Components: canary
>Reporter: Xu Cang
>Assignee: Xu Cang
>Priority: Major
>
> Canary test to expose table availability rate.





[jira] [Commented] (HBASE-20894) Move BucketCache from java serialization to protobuf

2018-07-27 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16560487#comment-16560487
 ] 

Hadoop QA commented on HBASE-20894:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 11s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 14s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 39s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 57s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 29s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  5m  0s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 52s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 43s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 15s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m  8s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  2m 51s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  1m 52s{color} | {color:red} hbase-server generated 1 new + 187 unchanged - 1 fixed = 188 total (was 188) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m 13s{color} | {color:red} hbase-server: The patch generated 1 new + 45 unchanged - 8 fixed = 46 total (was 53) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 40s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 10m 22s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} hbaseprotoc {color} | {color:green}  1m  9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 39s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 32s{color} | {color:green} hbase-protocol-shaded in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}108m 43s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 37s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}162m 28s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-20894 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12933428/HBASE-20894.master.003.patch |
| Optional Tests |  asflicense  cc  unit  hbaseprotoc  javac  javadoc  findbugs  shadedjars  hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux b69cef275e83 

[jira] [Commented] (HBASE-20950) Helper method to configure secure DFS cluster for tests

2018-07-27 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16560483#comment-16560483
 ] 

Ted Yu commented on HBASE-20950:


If the test normally passes, there is no need for this change.

> Helper method to configure secure DFS cluster for tests
> ---
>
> Key: HBASE-20950
> URL: https://issues.apache.org/jira/browse/HBASE-20950
> Project: HBase
>  Issue Type: Sub-task
>  Components: test
>Affects Versions: 3.0.0
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
>Priority: Major
> Attachments: HBASE-20950.master.001.patch
>
>
> There is quite some boilerplate code for configuring a secure HDFS cluster 
> for tests. The code is repeated in a number of test files within HBase code 
> base. Convert the boilerplate code into a helper method to avoid duplication 
> and lower maintenance effort.
> SecureTestCluster#setHdfsSecuredConfiguration
> TestSecureExport#setUpClusterKdc
> TestThriftSpnegoHttpServer#addSecurityConfigurations
> TestSaslFanOutOneBlockAsyncDFSOutput#setHdfsSecuredConfiguration





[jira] [Commented] (HBASE-20950) Helper method to configure secure DFS cluster for tests

2018-07-27 Thread Wei-Chiu Chuang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16560467#comment-16560467
 ] 

Wei-Chiu Chuang commented on HBASE-20950:
-

Thanks for the review, [~yuzhih...@gmail.com]. This particular change is 
actually unrelated to the main issue found in this jira, but while testing I 
found the CLUSTER object could throw NPE if the test fails unexpectedly. I can 
restore this change for sure.
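
The guard being discussed can be sketched as follows. This is an illustration under assumptions, not the code in the patch: TeardownSketch and its nested Cluster interface are hypothetical stand-ins for the test's CLUSTER object, which may still be null if setup failed before the cluster was created.

```java
final class TeardownSketch {
    // Hypothetical minimal view of a test cluster.
    interface Cluster {
        void shutdown();

        void join();
    }

    private TeardownSketch() {
    }

    // Shuts the cluster down only if it was actually created.
    // Returns true when a shutdown was performed.
    static boolean tearDown(Cluster cluster) {
        if (cluster != null) {
            cluster.shutdown();
            cluster.join(); // inside the guard, so a null cluster cannot cause an NPE
            return true;
        }
        return false;
    }
}
```

Keeping join() inside the null check means a failure during setup surfaces as the original test failure and is not masked by a NullPointerException thrown during teardown.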

> Helper method to configure secure DFS cluster for tests
> ---
>
> Key: HBASE-20950
> URL: https://issues.apache.org/jira/browse/HBASE-20950
> Project: HBase
>  Issue Type: Sub-task
>  Components: test
>Affects Versions: 3.0.0
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
>Priority: Major
> Attachments: HBASE-20950.master.001.patch
>
>
> There is quite some boilerplate code for configuring a secure HDFS cluster 
> for tests. The code is repeated in a number of test files within HBase code 
> base. Convert the boilerplate code into a helper method to avoid duplication 
> and lower maintenance effort.
> SecureTestCluster#setHdfsSecuredConfiguration
> TestSecureExport#setUpClusterKdc
> TestThriftSpnegoHttpServer#addSecurityConfigurations
> TestSaslFanOutOneBlockAsyncDFSOutput#setHdfsSecuredConfiguration





[jira] [Commented] (HBASE-19320) document the mysterious direct memory leak in hbase

2018-07-27 Thread huaxiang sun (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16560458#comment-16560458
 ] 

huaxiang sun commented on HBASE-19320:
--

I am going to commit the patch, thanks [~stack].

> document the mysterious direct memory leak in hbase 
> 
>
> Key: HBASE-19320
> URL: https://issues.apache.org/jira/browse/HBASE-19320
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 1.2.6, 2.0.0
>Reporter: huaxiang sun
>Assignee: huaxiang sun
>Priority: Major
> Attachments: HBASE-19320-master-v001.patch, Screen Shot 2017-11-21 at 
> 4.43.36 PM.png, Screen Shot 2017-11-21 at 4.44.22 PM.png
>
>
> Recently we ran into a direct memory leak, which took some time to trace and 
> debug. After discussing it internally with our [~saint@gmail.com], we 
> think we have some findings and want to share them with the community.
> Basically, it is the issue described in 
> http://www.evanjones.ca/java-bytebuffer-leak.html and it happened to one of 
> our hbase clusters.
> Creating the jira first; more details will be filled in later.
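
As background, the linked article describes how the JDK copies a heap ByteBuffer into a temporary direct buffer when it is written to a channel, and how those temporary buffers are cached per thread and can grow to the size of the largest write. One commonly suggested mitigation is to write large heap buffers in bounded chunks. The sketch below shows that idea only; ChunkedWriteSketch and writeInChunks are hypothetical names, it assumes a blocking channel, and it is not taken from the HBase patch or documentation.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.WritableByteChannel;

final class ChunkedWriteSketch {
    private ChunkedWriteSketch() {
    }

    // Writes all remaining bytes of src to ch, at most chunkSize bytes per
    // write call, so no single write needs a larger temporary buffer.
    // Assumes a blocking channel; a non-blocking one could return 0 forever.
    static long writeInChunks(WritableByteChannel ch, ByteBuffer src, int chunkSize)
            throws IOException {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        long total = 0;
        while (src.hasRemaining()) {
            // View over the same bytes, limited to the next chunk.
            ByteBuffer chunk = src.duplicate();
            int len = Math.min(chunkSize, src.remaining());
            chunk.limit(chunk.position() + len);
            int written = ch.write(chunk);
            // Advance the source by however much was actually written.
            src.position(src.position() + written);
            total += written;
        }
        return total;
    }
}
```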





[jira] [Commented] (HBASE-20939) There will be race when we call suspendIfNotReady and then throw ProcedureSuspendedException

2018-07-27 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16560453#comment-16560453
 ] 

Hudson commented on HBASE-20939:


Results for branch branch-2.0
[build #600 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/600/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/600//General_Nightly_Build_Report/]




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/600//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/600//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


> There will be race when we call suspendIfNotReady and then throw 
> ProcedureSuspendedException
> 
>
> Key: HBASE-20939
> URL: https://issues.apache.org/jira/browse/HBASE-20939
> Project: HBase
>  Issue Type: Sub-task
>  Components: amv2
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Critical
> Fix For: 3.0.0, 2.0.1, 2.2.0, 2.1.1
>
> Attachments: HBASE-20939.patch, HBASE-20939.patch
>
>
> This is a very typical usage in our procedure implementation. For example, in 
> AssignProcedure, we call AM.queueAssign and then suspend ourselves to 
> wait until the AM finishes processing our assign request.
> But there could be races. Think of this:
> 1. We call suspendIfNotReady on an event, and it returns true, so we need to 
> wait.
> 2. The event is woken up, and the procedure is added back to the 
> scheduler.
> 3. A worker picks up the procedure and finishes it.
> 4. We finally throw ProcedureSuspendedException, and the ProcedureExecutor 
> suspends us and stores the state in the procedure store.
> So we have a half-done procedure in the procedure store forever... This may 
> cause an assertion failure when loading procedures. Or maybe the worker cannot 
> finish the procedure, because when suspending we need to restore some state, 
> for example, add something to RootProcedureState. Either way, it will still 
> lead to assertion failures or other unexpected errors.
> And this cannot be solved by simply adding a lock in the procedure, as most 
> of the work is done in the ProcedureExecutor after we throw 
> ProcedureSuspendedException.
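
The interleaving in steps 1 to 4 can be replayed deterministically with a toy model. This is only a sketch to make the ordering problem concrete: ToyStore, State and replayRace are hypothetical names, and nothing here is the real ProcedureExecutor or procedure store code.

```java
import java.util.HashMap;
import java.util.Map;

public class SuspendRaceSketch {
  public enum State { RUNNABLE, WAITING, FINISHED }

  /** Toy stand-in for the procedure store: procId -> last persisted state. */
  static final class ToyStore {
    final Map<Long, State> persisted = new HashMap<>();

    void update(long procId, State state) {
      persisted.put(procId, state);
    }
  }

  /** Replays steps 1-4 in order on one thread and returns the last persisted state. */
  public static State replayRace() {
    ToyStore store = new ToyStore();
    long procId = 1L;
    boolean eventReady = false;

    // Step 1: the event is not ready, so the procedure decides it must wait.
    boolean mustWait = !eventReady;

    // Step 2: before the exception is thrown, the event is woken up and the
    // procedure goes back to the scheduler.
    eventReady = true;

    // Step 3: another worker runs the procedure to completion and persists that.
    store.update(procId, State.FINISHED);

    // Step 4: the first worker finally throws, and the executor persists the
    // suspended state on top of the finished one.
    if (mustWait) {
      store.update(procId, State.WAITING);
    }
    return store.persisted.get(procId);
  }

  public static void main(String[] args) {
    // Prints the stale state left behind for a procedure that already finished.
    System.out.println("persisted state after race: " + replayRace());
  }
}
```

In this model the store ends up holding a suspended record for a procedure that has already finished, which matches the half-done procedure described above. A lock inside the procedure would not cover step 4, since that write happens in the executor.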





[jira] [Commented] (HBASE-20950) Helper method to configure secure DFS cluster for tests

2018-07-27 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16560452#comment-16560452
 ] 

Ted Yu commented on HBASE-20950:


Looks good overall.
{code}
+  CLUSTER.join();
 }
-CLUSTER.join();
{code}
I am a bit curious: why move the join call outside the if?

> Helper method to configure secure DFS cluster for tests
> ---
>
> Key: HBASE-20950
> URL: https://issues.apache.org/jira/browse/HBASE-20950
> Project: HBase
>  Issue Type: Sub-task
>  Components: test
>Affects Versions: 3.0.0
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
>Priority: Major
> Attachments: HBASE-20950.master.001.patch
>
>
> There is quite some boilerplate code for configuring a secure HDFS cluster 
> for tests. The code is repeated in a number of test files within HBase code 
> base. Convert the boilerplate code into a helper method to avoid duplication 
> and lower maintenance effort.
> SecureTestCluster#setHdfsSecuredConfiguration
> TestSecureExport#setUpClusterKdc
> TestThriftSpnegoHttpServer#addSecurityConfigurations
> TestSaslFanOutOneBlockAsyncDFSOutput#setHdfsSecuredConfiguration
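
A rough idea of what such a shared helper might look like is sketched below. It is not the attached patch: a plain Map stands in for the Hadoop Configuration object so the example is self-contained, the class and method names are hypothetical, and the exact set of keys the tests need is an assumption on my part (the keys themselves are standard HDFS property names).

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class SecureHdfsConfSketch {
  /**
   * Applies the security-related HDFS settings that the listed tests each set by hand.
   * The Map is a stand-in for org.apache.hadoop.conf.Configuration.
   */
  public static Map<String, String> setSecuredConfiguration(
      Map<String, String> conf, String servicePrincipal, String spnegoPrincipal,
      String keytabPath) {
    conf.put("dfs.block.access.token.enable", "true");
    conf.put("dfs.namenode.kerberos.principal", servicePrincipal);
    conf.put("dfs.namenode.keytab.file", keytabPath);
    conf.put("dfs.datanode.kerberos.principal", servicePrincipal);
    conf.put("dfs.datanode.keytab.file", keytabPath);
    conf.put("dfs.web.authentication.kerberos.principal", spnegoPrincipal);
    conf.put("dfs.http.policy", "HTTPS_ONLY");
    return conf;
  }

  public static void main(String[] args) {
    Map<String, String> conf = setSecuredConfiguration(
        new LinkedHashMap<>(), "hbase/localhost@EXAMPLE.COM",
        "HTTP/localhost@EXAMPLE.COM", "/tmp/test.keytab");
    conf.forEach((k, v) -> System.out.println(k + " = " + v));
  }
}
```

Each test would then call the one helper with its own principal and keytab instead of repeating the individual settings.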





[jira] [Updated] (HBASE-20950) Helper method to configure secure DFS cluster for tests

2018-07-27 Thread Wei-Chiu Chuang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HBASE-20950:

Affects Version/s: 3.0.0

> Helper method to configure secure DFS cluster for tests
> ---
>
> Key: HBASE-20950
> URL: https://issues.apache.org/jira/browse/HBASE-20950
> Project: HBase
>  Issue Type: Sub-task
>  Components: test
>Affects Versions: 3.0.0
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
>Priority: Major
> Attachments: HBASE-20950.master.001.patch
>
>
> There is quite some boilerplate code for configuring a secure HDFS cluster 
> for tests. The code is repeated in a number of test files within HBase code 
> base. Convert the boilerplate code into a helper method to avoid duplication 
> and lower maintenance effort.
> SecureTestCluster#setHdfsSecuredConfiguration
> TestSecureExport#setUpClusterKdc
> TestThriftSpnegoHttpServer#addSecurityConfigurations
> TestSaslFanOutOneBlockAsyncDFSOutput#setHdfsSecuredConfiguration





[jira] [Updated] (HBASE-20950) Helper method to configure secure DFS cluster for tests

2018-07-27 Thread Wei-Chiu Chuang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HBASE-20950:

Status: Patch Available  (was: Open)

> Helper method to configure secure DFS cluster for tests
> ---
>
> Key: HBASE-20950
> URL: https://issues.apache.org/jira/browse/HBASE-20950
> Project: HBase
>  Issue Type: Sub-task
>  Components: test
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
>Priority: Major
> Attachments: HBASE-20950.master.001.patch
>
>
> There is quite some boilerplate code for configuring a secure HDFS cluster 
> for tests. The code is repeated in a number of test files within HBase code 
> base. Convert the boilerplate code into a helper method to avoid duplication 
> and lower maintenance effort.
> SecureTestCluster#setHdfsSecuredConfiguration
> TestSecureExport#setUpClusterKdc
> TestThriftSpnegoHttpServer#addSecurityConfigurations
> TestSaslFanOutOneBlockAsyncDFSOutput#setHdfsSecuredConfiguration





[jira] [Updated] (HBASE-20950) Helper method to configure secure DFS cluster for tests

2018-07-27 Thread Wei-Chiu Chuang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HBASE-20950:

Attachment: HBASE-20950.master.001.patch

> Helper method to configure secure DFS cluster for tests
> ---
>
> Key: HBASE-20950
> URL: https://issues.apache.org/jira/browse/HBASE-20950
> Project: HBase
>  Issue Type: Sub-task
>  Components: test
>Affects Versions: 3.0.0
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
>Priority: Major
> Attachments: HBASE-20950.master.001.patch
>
>
> There is quite some boilerplate code for configuring a secure HDFS cluster 
> for tests. The code is repeated in a number of test files within HBase code 
> base. Convert the boilerplate code into a helper method to avoid duplication 
> and lower maintenance effort.
> SecureTestCluster#setHdfsSecuredConfiguration
> TestSecureExport#setUpClusterKdc
> TestThriftSpnegoHttpServer#addSecurityConfigurations
> TestSaslFanOutOneBlockAsyncDFSOutput#setHdfsSecuredConfiguration





[jira] [Commented] (HBASE-20939) There will be race when we call suspendIfNotReady and then throw ProcedureSuspendedException

2018-07-27 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16560410#comment-16560410
 ] 

Hudson commented on HBASE-20939:


Results for branch branch-2.1
[build #112 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/112/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/112//General_Nightly_Build_Report/]




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/112//JDK8_Nightly_Build_Report_(Hadoop2)/]


(/) {color:green}+1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/112//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> There will be race when we call suspendIfNotReady and then throw 
> ProcedureSuspendedException
> 
>
> Key: HBASE-20939
> URL: https://issues.apache.org/jira/browse/HBASE-20939
> Project: HBase
>  Issue Type: Sub-task
>  Components: amv2
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Critical
> Fix For: 3.0.0, 2.0.1, 2.2.0, 2.1.1
>
> Attachments: HBASE-20939.patch, HBASE-20939.patch
>
>
> This is a very typical usage in our procedure implementation. For example, in 
> AssignProcedure, we call AM.queueAssign and then suspend ourselves to 
> wait until the AM finishes processing our assign request.
> But there could be races. Think of this:
> 1. We call suspendIfNotReady on an event, and it returns true, so we need to 
> wait.
> 2. The event is woken up, and the procedure is added back to the 
> scheduler.
> 3. A worker picks up the procedure and finishes it.
> 4. We finally throw ProcedureSuspendedException, and the ProcedureExecutor 
> suspends us and stores the state in the procedure store.
> So we have a half-done procedure in the procedure store forever... This may 
> cause an assertion failure when loading procedures. Or maybe the worker cannot 
> finish the procedure, because when suspending we need to restore some state, 
> for example, add something to RootProcedureState. Either way, it will still 
> lead to assertion failures or other unexpected errors.
> And this cannot be solved by simply adding a lock in the procedure, as most 
> of the work is done in the ProcedureExecutor after we throw 
> ProcedureSuspendedException.





[jira] [Commented] (HBASE-19369) HBase Should use Builder Pattern to Create Log Files while using WAL on Erasure Coding

2018-07-27 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16560403#comment-16560403
 ] 

Hadoop QA commented on HBASE-19369:
---

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 17s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 22s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 58s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 41s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 39s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 31s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 17s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  1s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 15s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  2m 45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 19s{color} | {color:green} The patch hbase-common passed checkstyle {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 14s{color} | {color:green} The patch hbase-procedure passed checkstyle {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m  2s{color} | {color:green} hbase-server: The patch generated 0 new + 0 unchanged - 2 fixed = 0 total (was 2) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 31s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 10m 25s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 58s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 35s{color} | {color:green} hbase-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 43s{color} | {color:green} hbase-procedure in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}177m 28s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  1m  9s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}232m 27s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-19369 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12933404/HBASE-19369.v11.patch |
| Optional Tests |  asflicense  javac  

[jira] [Commented] (HBASE-19369) HBase Should use Builder Pattern to Create Log Files while using WAL on Erasure Coding

2018-07-27 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16560348#comment-16560348
 ] 

Hadoop QA commented on HBASE-19369:
---

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 12s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 13s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 55s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 42s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 49s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 39s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 23s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  4s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 15s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  2m 42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 24s{color} | {color:green} The patch hbase-common passed checkstyle {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 14s{color} | {color:green} The patch hbase-procedure passed checkstyle {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 10s{color} | {color:green} hbase-server: The patch generated 0 new + 0 unchanged - 2 fixed = 0 total (was 2) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 41s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 10m 17s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 59s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 22s{color} | {color:green} hbase-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 36s{color} | {color:green} hbase-procedure in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}111m 15s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  1m  3s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}166m 18s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-19369 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12933409/HBASE-19369.v12.patch |
| Optional Tests |  asflicense  javac  

[jira] [Comment Edited] (HBASE-20893) Data loss if splitting region while ServerCrashProcedure executing

2018-07-27 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16559861#comment-16559861
 ] 

stack edited comment on HBASE-20893 at 7/27/18 9:10 PM:


[~allan163] Yeah, support for rollback at the split procedure level was added, 
but rollback invokes rollback of subprocedures, and this does not work if the 
subprocedure is unassign/assign. A hack was added to do async assign/unassign 
up in split/merge, but this runs out-of-band, not as part of the rollback, until 
rollback gets more love and is finished.

Yeah, unsupported exceptions and CODE-BUG, as well as crashed-out procedures, 
are scary.

bq. But. since a exception is thrown, the decrease for stateCount never happen.

Let's fix that in a new issue?

Do you have a problem with this patch? It avoids CODE-BUG and skips use of 
rollback with its hack async assign/unassign. It is also less *violent* than 
what was here previously: it just re-runs a step rather than flipping to (dodgy) 
rollback and waiting on new procedure scheduling. Your unit tests now conclude 
with successful merges and splits, where before they finished at rollback, not 
with successful split/merge procedure completion.

I'm in here because my long-running tests are failing and I thought this was 
the cause... (Now I don't think it is, but we should clean up the mess it makes.)

Thanks.




was (Author: stack):
[~allan163] Yeah, support for rollback at the split procedure level was added 
but rollback invokes rolling back of subprocedures and this does not work if 
subprocedure is unassign/assign. A hack was added to do async assign/unassign 
which run out-of-band as part of rollback until rollback got love and finish.

Unsupported exceptions and CODE-BUG as well as crashed out procedure are scary.

bq. But. since a exception is thrown, the decrease for stateCount never happen.

Lets fix, in a new issue?

Do you have a problem with this patch? It avoids CODE-BUG and skips use of 
rollback with the hack async assign/unassign. It also is less *violent* than 
what was here previous just re-running a step rather than flipping to (dodgy) 
rollback waiting on new procedure scheduling. 

I'm in here because my long-running tests are failing and this looks to be the 
cause.



> Data loss if splitting region while ServerCrashProcedure executing
> --
>
> Key: HBASE-20893
> URL: https://issues.apache.org/jira/browse/HBASE-20893
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 3.0.0, 2.1.0, 2.0.1
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Major
> Fix For: 3.0.0, 2.0.2, 2.2.0, 2.1.1
>
> Attachments: HBASE-20893-branch-2.0.addendum.patch, 
> HBASE-20893.branch-2.0.001.patch, HBASE-20893.branch-2.0.002.patch, 
> HBASE-20893.branch-2.0.003.patch, HBASE-20893.branch-2.0.004.patch, 
> HBASE-20893.branch-2.0.005.patch
>
>
> Similar case as HBASE-20878.





[jira] [Commented] (HBASE-20894) Move BucketCache from java serialization to protobuf

2018-07-27 Thread Mike Drob (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16560323#comment-16560323
 ] 

Mike Drob commented on HBASE-20894:
---

Addressed the checkstyle issues.

I was looking at this more closely, and I'm not sure how the serializer cache 
actually has any meaning on startup, to be honest. The ram cache is cleared in 
{{disableCache}} before we store the backing map anyway, so I think the 
deserializer references go away there as well.

> Move BucketCache from java serialization to protobuf
> 
>
> Key: HBASE-20894
> URL: https://issues.apache.org/jira/browse/HBASE-20894
> Project: HBase
>  Issue Type: Task
>  Components: BucketCache
>Affects Versions: 2.0.0
>Reporter: Mike Drob
>Assignee: Mike Drob
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HBASE-20894.WIP-2.patch, HBASE-20894.WIP.patch, 
> HBASE-20894.master.001.patch, HBASE-20894.master.002.patch, 
> HBASE-20894.master.003.patch
>
>
> We should use a better serialization format instead of Java Serialization for 
> the BucketCache entry persistence.
> Suggested by Chris McCown, who does not appear to have a JIRA account.





[jira] [Updated] (HBASE-20894) Move BucketCache from java serialization to protobuf

2018-07-27 Thread Mike Drob (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob updated HBASE-20894:
--
Attachment: HBASE-20894.master.003.patch

> Move BucketCache from java serialization to protobuf
> 
>
> Key: HBASE-20894
> URL: https://issues.apache.org/jira/browse/HBASE-20894
> Project: HBase
>  Issue Type: Task
>  Components: BucketCache
>Affects Versions: 2.0.0
>Reporter: Mike Drob
>Assignee: Mike Drob
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HBASE-20894.WIP-2.patch, HBASE-20894.WIP.patch, 
> HBASE-20894.master.001.patch, HBASE-20894.master.002.patch, 
> HBASE-20894.master.003.patch
>
>
> We should use a better serialization format instead of Java Serialization for 
> the BucketCache entry persistence.
> Suggested by Chris McCown, who does not appear to have a JIRA account.





[jira] [Commented] (HBASE-20952) Re-visit the WAL API

2018-07-27 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16560315#comment-16560315
 ] 

stack commented on HBASE-20952:
---

Deferred would be good to support again; i.e. fire the edit but the client 
doesn't wait on it. In the background we sync edits on a configurable 
periodicity. We used to have this but the disruptor messed it up.
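
To make the deferred idea concrete, here is a minimal sketch under my own assumptions. DeferredSyncSketch and its methods are hypothetical names, the "sync" is just an in-memory counter standing in for a real durable sync, and none of this is the existing WAL code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class DeferredSyncSketch implements AutoCloseable {
  private final List<String> unsynced = new ArrayList<>();
  private long syncedCount = 0;
  private final ScheduledExecutorService timer =
      Executors.newSingleThreadScheduledExecutor(r -> {
        Thread t = new Thread(r, "deferred-sync");
        t.setDaemon(true);
        return t;
      });

  /** Starts a background task that syncs pending edits every periodMillis. */
  public DeferredSyncSketch(long periodMillis) {
    timer.scheduleWithFixedDelay(this::sync, periodMillis, periodMillis,
        TimeUnit.MILLISECONDS);
  }

  /** Queues the edit and returns immediately; the caller does not wait for a sync. */
  public synchronized void appendDeferred(String edit) {
    unsynced.add(edit);
  }

  /** Marks all pending edits as synced and returns how many there were. */
  public synchronized int sync() {
    int n = unsynced.size();
    syncedCount += n;
    unsynced.clear();
    return n;
  }

  public synchronized int pendingCount() {
    return unsynced.size();
  }

  public synchronized long syncedCount() {
    return syncedCount;
  }

  /** Stops the background task and syncs whatever is still pending. */
  @Override
  public void close() {
    timer.shutdownNow();
    sync();
  }
}
```

The trade-off is the usual one: edits appended since the last periodic sync can be lost on a crash, which is why the period would need to be configurable.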

Don't forget read/recovery API.

Currently WALs are grouped by server and/or by multiwal policy/count. You might 
want to add metadata on WALs to support queries by server or wal-group. Cool 
would be an API that allowed adding arbitrary metadata about a WAL on the fly. 
Imagine being able to mark regions seen as WAL metadata and then being able to 
query before replay, what regions have edits in a particular WAL. Other 
metadata might be oldest/newest sequenceid and/or timestamp.
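
A sketch of the kind of metadata query I have in mind follows. All names are hypothetical and it only keeps the metadata in memory; where a real implementation would persist it is an open question.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class WalMetadataSketch {
  /** Per-WAL metadata: regions seen plus oldest/newest sequence id. */
  private static final class Meta {
    final Set<String> regions = new TreeSet<>();
    long oldestSeqId = Long.MAX_VALUE;
    long newestSeqId = Long.MIN_VALUE;
  }

  private final Map<String, Meta> byWal = new HashMap<>();

  /** Called on append: notes that the given WAL holds an edit for the region. */
  public void recordEdit(String walName, String region, long seqId) {
    Meta m = byWal.computeIfAbsent(walName, k -> new Meta());
    m.regions.add(region);
    m.oldestSeqId = Math.min(m.oldestSeqId, seqId);
    m.newestSeqId = Math.max(m.newestSeqId, seqId);
  }

  /** Query before replay: which regions have edits in this WAL? */
  public Set<String> regionsWithEdits(String walName) {
    Meta m = byWal.get(walName);
    return m == null
        ? Collections.<String>emptySet()
        : Collections.unmodifiableSet(m.regions);
  }

  /** Returns the oldest sequence id seen in the WAL, or -1 if the WAL is unknown. */
  public long oldestSequenceId(String walName) {
    Meta m = byWal.get(walName);
    return m == null ? -1L : m.oldestSeqId;
  }

  /** Returns the newest sequence id seen in the WAL, or -1 if the WAL is unknown. */
  public long newestSequenceId(String walName) {
    Meta m = byWal.get(walName);
    return m == null ? -1L : m.newestSeqId;
  }
}
```

With something like this, recovery could skip WALs that hold no edits for the regions being replayed.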

> Re-visit the WAL API
> 
>
> Key: HBASE-20952
> URL: https://issues.apache.org/jira/browse/HBASE-20952
> Project: HBase
>  Issue Type: Sub-task
>  Components: wal
>Reporter: Josh Elser
>Priority: Major
>
> Take a step back from the current WAL implementations and think about what an 
> HBase WAL API should look like. What are the primitive calls that we require 
> to guarantee durability of writes with a high degree of performance?
> The API needs to take the current implementations into consideration. We 
> should also have a mind for what is happening in the Ratis LogService (but 
> the LogService should not dictate what HBase's WAL API looks like; see 
> RATIS-272).
> Other "systems" inside of HBase that use WALs are replication and 
> backup. Replication has the use-case for "tail"'ing the WAL, which we 
> should provide via our new API. Backup doesn't do anything fancy (IIRC). We 
> should make sure all consumers are generally going to be OK with the API we 
> create.
> The API may be "OK" (or OK in part). We also need to consider other methods 
> which were "bolted" on, such as {{AbstractFSWAL}} and 
> {{WALFileLengthProvider}}. Other corners of "WAL use" (like the 
> {{WALSplitter}}) should also be looked at to use WAL-APIs only.
> We also need to make sure that adequate interface audience and stability 
> annotations are chosen.





[jira] [Commented] (HBASE-18477) Umbrella JIRA for HBase Read Replica clusters

2018-07-27 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-18477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16560305#comment-16560305
 ] 

Hudson commented on HBASE-18477:


Results for branch HBASE-18477
[build #277 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-18477/277/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-18477/277//General_Nightly_Build_Report/]




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-18477/277//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-18477/277//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(x) {color:red}-1 client integration test{color}
--Failed when running client tests on top of Hadoop 2. [see log for 
details|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-18477/277//artifact/output-integration/hadoop-2.log].
 (note that this means we didn't run on Hadoop 3)


> Umbrella JIRA for HBase Read Replica clusters
> -
>
> Key: HBASE-18477
> URL: https://issues.apache.org/jira/browse/HBASE-18477
> Project: HBase
>  Issue Type: New Feature
>Reporter: Zach York
>Assignee: Zach York
>Priority: Major
> Attachments: HBase Read-Replica Clusters Scope doc.docx, HBase 
> Read-Replica Clusters Scope doc.pdf, HBase Read-Replica Clusters Scope 
> doc_v2.docx, HBase Read-Replica Clusters Scope doc_v2.pdf
>
>
> Recently, changes (such as HBASE-17437) have unblocked HBase to run with a 
> root directory external to the cluster (such as in Amazon S3). This means 
> that the data is stored outside of the cluster and can be accessible after 
> the cluster has been terminated. One use case that is often asked about is 
> pointing multiple clusters to one root directory (sharing the data) to have 
> read resiliency in the case of a cluster failure.
>  
> This JIRA is an umbrella JIRA to contain all the tasks necessary to create a 
> read-replica HBase cluster that is pointed at the same root directory.
>  
> This requires making the Read-Replica cluster Read-Only (no metadata 
> operation or data operations).
> Separating the hbase:meta table for each cluster (Otherwise HBase gets 
> confused with multiple clusters trying to update the meta table with their ip 
> addresses)
> Adding refresh functionality for the meta table to ensure new metadata is 
> picked up on the read replica cluster.
> Adding refresh functionality for HFiles for a given table to ensure new data 
> is picked up on the read replica cluster.
>  
> This can be used with any existing cluster that is backed by an external 
> filesystem.
>  
> Please note that this feature is still quite manual (with the potential for 
> automation later).
>  
> More information on this particular feature can be found here: 
> https://aws.amazon.com/blogs/big-data/setting-up-read-replica-clusters-with-hbase-on-amazon-s3/





[jira] [Commented] (HBASE-20749) Upgrade our use of checkstyle to 8.6+

2018-07-27 Thread Mike Drob (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16560247#comment-16560247
 ] 

Mike Drob commented on HBASE-20749:
---

Pushed an update to the branch to use 8.11 instead of 8.12, which isn't 
released yet, and rebased. Failed to update the commit message, but will get 
that next time/before pushing to master.

> Upgrade our use of checkstyle to 8.6+
> -
>
> Key: HBASE-20749
> URL: https://issues.apache.org/jira/browse/HBASE-20749
> Project: HBase
>  Issue Type: Improvement
>  Components: build, community
>Reporter: Sean Busbey
>Assignee: Mike Drob
>Priority: Minor
> Attachments: HBASE-20749.master.001.patch
>
>
> We should upgrade our checkstyle version to 8.6 or later so we can use the 
> "match violation message to this regex" feature for suppression. That will 
> allow us to make sure we don't regress on HTrace v3 vs v4 APIs (came up in 
> HBASE-20332).
> We're currently blocked on upgrading to 8.3+ by [checkstyle 
> #5279|https://github.com/checkstyle/checkstyle/issues/5279], a regression 
> that flags our use of both the "separate import groups" and "put static 
> imports over here" configs as an error.





[jira] [Commented] (HBASE-20939) There will be race when we call suspendIfNotReady and then throw ProcedureSuspendedException

2018-07-27 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16560232#comment-16560232
 ] 

Hudson commented on HBASE-20939:


Results for branch branch-2
[build #1034 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1034/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1034//General_Nightly_Build_Report/]




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1034//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1034//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> There will be race when we call suspendIfNotReady and then throw 
> ProcedureSuspendedException
> 
>
> Key: HBASE-20939
> URL: https://issues.apache.org/jira/browse/HBASE-20939
> Project: HBase
>  Issue Type: Sub-task
>  Components: amv2
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Critical
> Fix For: 3.0.0, 2.0.1, 2.2.0, 2.1.1
>
> Attachments: HBASE-20939.patch, HBASE-20939.patch
>
>
> This is very typical usage in our procedure implementation. For example, in 
> AssignProcedure we call AM.queueAssign and then suspend ourselves to 
> wait until the AM finishes processing our assign request.
> But there could be races. Think of this:
> 1. We call suspendIfNotReady on an event, and it returns true, so we need to 
> wait.
> 2. The event is woken up, and the procedure is added back to the 
> scheduler.
> 3. A worker picks up the procedure and finishes it.
> 4. We finally throw ProcedureSuspendedException, and the ProcedureExecutor 
> suspends us and stores the state in the procedure store.
> So we have a half-done procedure in the procedure store forever... This may 
> cause assertion failures when loading procedures. And maybe the worker cannot 
> finish the procedure, as when suspending we need to restore some state, for 
> example, add something to RootProcedureState. But either way, it will still 
> lead to assertion failures or other unexpected errors.
> And this cannot be fixed by simply adding a lock in the procedure, as most of 
> the work is done in the ProcedureExecutor after we throw 
> ProcedureSuspendedException.
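
The interleaving in steps 1-4 can be replayed deterministically in a small sketch. Everything below is a hypothetical stand-in (ToyEvent, the map used as a "store", the state strings); none of it is HBase's ProcedureExecutor or event code. It only shows how recording the suspension after the wake-up has already been handled overwrites the finished state with a stale one.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins for illustration only; these are not HBase classes.
final class ToyEvent {
  private boolean ready = false;
  private final Deque<Long> waiters = new ArrayDeque<>();

  // Returns true when the caller has been queued and must suspend.
  synchronized boolean suspendIfNotReady(long procId) {
    if (ready) {
      return false;
    }
    waiters.add(procId);
    return true;
  }

  // Marks the event ready and hands all waiters back to the scheduler.
  synchronized void wake(Deque<Long> scheduler) {
    ready = true;
    scheduler.addAll(waiters);
    waiters.clear();
  }
}

final class SuspendRaceSketch {
  // Replays steps 1-4 from the description in one thread, in the racy order.
  static Map<Long, String> replayRacyInterleaving() {
    Map<Long, String> store = new HashMap<>(); // stands in for the procedure store
    Deque<Long> scheduler = new ArrayDeque<>();
    ToyEvent event = new ToyEvent();
    long procId = 1L;
    store.put(procId, "RUNNABLE");

    // Step 1: the procedure asks to wait on the event.
    boolean mustSuspend = event.suspendIfNotReady(procId);

    // Step 2: before the suspension is recorded, the event fires.
    event.wake(scheduler);

    // Step 3: another worker picks the procedure up and finishes it.
    Long picked = scheduler.poll();
    if (picked != null) {
      store.put(picked, "SUCCESS");
    }

    // Step 4: the original worker finally records the suspension,
    // overwriting the finished state with a stale one.
    if (mustSuspend) {
      store.put(procId, "WAITING");
    }
    return store;
  }

  public static void main(String[] args) {
    System.out.println(replayRacyInterleaving()); // prints {1=WAITING}
  }
}
```

Note that a lock inside ToyEvent does not help here: steps 1 and 4 are separate actions, and the wake-up slips in between them, which matches the point in the description that a lock in the procedure alone is not enough.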





[jira] [Comment Edited] (HBASE-20749) Upgrade our use of checkstyle to 8.6+

2018-07-27 Thread Mike Drob (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16560228#comment-16560228
 ] 

Mike Drob edited comment on HBASE-20749 at 7/27/18 8:00 PM:


[~busbey] - from 
https://builds.apache.org/job/HBase%20Nightly/job/HBASE-20749/5//artifact/output-general/maven-patch-checkstyle-root.txt

{noformat}
[WARNING] The requested profile "test-patch" could not be activated because it 
does not exist.
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-checkstyle-plugin:3.0.0:checkstyle (default-cli) 
on project hbase: Execution default-cli of goal 
org.apache.maven.plugins:maven-checkstyle-plugin:3.0.0:checkstyle failed: 
Plugin org.apache.maven.plugins:maven-checkstyle-plugin:3.0.0 or one of its 
dependencies could not be resolved: Could not find artifact 
com.puppycrawl.tools:checkstyle:jar:8.12 in Nexus 
(http://repository.apache.org/snapshots) -> [Help 1]
{noformat}

This probably affects other nightly build stuff too?

Edit: I misread the source of the error; it looks like not finding profile 
"test-patch" is not fatal.



> Upgrade our use of checkstyle to 8.6+
> -
>
> Key: HBASE-20749
> URL: https://issues.apache.org/jira/browse/HBASE-20749
> Project: HBase
>  Issue Type: Improvement
>  Components: build, community
>Reporter: Sean Busbey
>Assignee: Mike Drob
>Priority: Minor
> Attachments: HBASE-20749.master.001.patch
>
>
> We should upgrade our checkstyle version to 8.6 or later so we can use the 
> "match violation message to this regex" feature for suppression. That will 
> allow us to make sure we don't regress on HTrace v3 vs v4 APIs (came up in 
> HBASE-20332).
> We're currently blocked on upgrading to 8.3+ by [checkstyle 
> #5279|https://github.com/checkstyle/checkstyle/issues/5279], a regression 
> that flags our use of both the "separate import groups" and "put static 
> imports over here" configs as an error.





[jira] [Commented] (HBASE-20749) Upgrade our use of checkstyle to 8.6+

2018-07-27 Thread Mike Drob (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16560228#comment-16560228
 ] 

Mike Drob commented on HBASE-20749:
---

[~busbey] - from 
https://builds.apache.org/job/HBase%20Nightly/job/HBASE-20749/5//artifact/output-general/maven-patch-checkstyle-root.txt

{noformat}
[WARNING] The requested profile "test-patch" could not be activated because it 
does not exist.
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-checkstyle-plugin:3.0.0:checkstyle (default-cli) 
on project hbase: Execution default-cli of goal 
org.apache.maven.plugins:maven-checkstyle-plugin:3.0.0:checkstyle failed: 
Plugin org.apache.maven.plugins:maven-checkstyle-plugin:3.0.0 or one of its 
dependencies could not be resolved: Could not find artifact 
com.puppycrawl.tools:checkstyle:jar:8.12 in Nexus 
(http://repository.apache.org/snapshots) -> [Help 1]
{noformat}

This probably affects other nightly build stuff too?

> Upgrade our use of checkstyle to 8.6+
> -
>
> Key: HBASE-20749
> URL: https://issues.apache.org/jira/browse/HBASE-20749
> Project: HBase
>  Issue Type: Improvement
>  Components: build, community
>Reporter: Sean Busbey
>Assignee: Mike Drob
>Priority: Minor
> Attachments: HBASE-20749.master.001.patch
>
>
> We should upgrade our checkstyle version to 8.6 or later so we can use the 
> "match violation message to this regex" feature for suppression. That will 
> allow us to make sure we don't regress on HTrace v3 vs v4 APIs (came up in 
> HBASE-20332).
> We're currently blocked on upgrading to 8.3+ by [checkstyle 
> #5279|https://github.com/checkstyle/checkstyle/issues/5279], a regression 
> that flags our use of both the "separate import groups" and "put static 
> imports over here" configs as an error.





[jira] [Commented] (HBASE-20528) Revise collections copying from iteration to built-in function

2018-07-27 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16560206#comment-16560206
 ] 

Hadoop QA commented on HBASE-20528:
---

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
10s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:orange}-0{color} | {color:orange} test4tests {color} | {color:orange}  
0m  0s{color} | {color:orange} The patch doesn't appear to include any new or 
modified tests. Please justify why no new tests are needed for this patch. Also 
please list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
16s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
59s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  3m 
26s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 
25s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
35s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
42s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
31s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
14s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  3m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  3m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 
22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
40s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
11m 22s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m 
22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
28s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
38s{color} | {color:green} hbase-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m  
3s{color} | {color:green} hbase-client in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}111m 
59s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 13m 
41s{color} | {color:green} hbase-mapreduce in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  1m 
34s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}190m 51s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-20528 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12933394/HBASE-20528.master.001.patch
 |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  shadedjars  
hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | 

[jira] [Commented] (HBASE-20952) Re-visit the WAL API

2018-07-27 Thread Mike Drob (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16560174#comment-16560174
 ] 

Mike Drob commented on HBASE-20952:
---

This would have been better if I could think of everything at once instead of 
posting tweet-sized comments, but we might also have to revisit some of the 
StreamCapability enforcement stuff for this too...

> Re-visit the WAL API
> 
>
> Key: HBASE-20952
> URL: https://issues.apache.org/jira/browse/HBASE-20952
> Project: HBase
>  Issue Type: Sub-task
>  Components: wal
>Reporter: Josh Elser
>Priority: Major
>
> Take a step back from the current WAL implementations and think about what an 
> HBase WAL API should look like. What are the primitive calls that we require 
> to guarantee durability of writes with a high degree of performance?
> The API needs to take the current implementations into consideration. We 
> should also have a mind for what is happening in the Ratis LogService (but 
> the LogService should not dictate what HBase's WAL API looks like RATIS-272).
> Other "systems" inside of HBase that use WALs are replication and 
> backup and restore. Replication has the use-case for "tail"'ing the WAL which 
> we should provide via our new API. Backup and restore doesn't do anything 
> fancy (IIRC). We should make sure all consumers are generally going to be OK 
> with the API we create.
> The API may be "OK" (or OK in a part). We need to also consider other methods 
> which were "bolted" on such as {{AbstractFSWAL}} and 
> {{WALFileLengthProvider}}. Other corners of "WAL use" (like the 
> {{WALSplitter}}) should also be looked at to use WAL-APIs only.
> We also need to make sure that adequate interface audience and stability 
> annotations are chosen.





[jira] [Commented] (HBASE-20952) Re-visit the WAL API

2018-07-27 Thread Josh Elser (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16560169#comment-16560169
 ] 

Josh Elser commented on HBASE-20952:


{quote}We also have ProcWAL for use by the Pv2 state machine
{quote}
Oh! That was something else I wanted to mention here. [~stack] had mentioned in 
HBASE-18152 switching Pv2 over to FSHLog instead of the custom WAL 
implementation, since there are some concurrency problems in that implementation 
(and we know FSHLog is good). That may also impact consumers.

> Re-visit the WAL API
> 
>
> Key: HBASE-20952
> URL: https://issues.apache.org/jira/browse/HBASE-20952
> Project: HBase
>  Issue Type: Sub-task
>  Components: wal
>Reporter: Josh Elser
>Priority: Major
>
> Take a step back from the current WAL implementations and think about what an 
> HBase WAL API should look like. What are the primitive calls that we require 
> to guarantee durability of writes with a high degree of performance?
> The API needs to take the current implementations into consideration. We 
> should also have a mind for what is happening in the Ratis LogService (but 
> the LogService should not dictate what HBase's WAL API looks like RATIS-272).
> Other "systems" inside of HBase that use WALs are replication and 
> backup and restore. Replication has the use-case for "tail"'ing the WAL which 
> we should provide via our new API. Backup and restore doesn't do anything 
> fancy (IIRC). We should make sure all consumers are generally going to be OK 
> with the API we create.
> The API may be "OK" (or OK in a part). We need to also consider other methods 
> which were "bolted" on such as {{AbstractFSWAL}} and 
> {{WALFileLengthProvider}}. Other corners of "WAL use" (like the 
> {{WALSplitter}}) should also be looked at to use WAL-APIs only.
> We also need to make sure that adequate interface audience and stability 
> annotations are chosen.





[jira] [Commented] (HBASE-20952) Re-visit the WAL API

2018-07-27 Thread Mike Drob (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16560163#comment-16560163
 ] 

Mike Drob commented on HBASE-20952:
---

We also have ProcWAL for use by the Pv2 state machine

> Re-visit the WAL API
> 
>
> Key: HBASE-20952
> URL: https://issues.apache.org/jira/browse/HBASE-20952
> Project: HBase
>  Issue Type: Sub-task
>  Components: wal
>Reporter: Josh Elser
>Priority: Major
>
> Take a step back from the current WAL implementations and think about what an 
> HBase WAL API should look like. What are the primitive calls that we require 
> to guarantee durability of writes with a high degree of performance?
> The API needs to take the current implementations into consideration. We 
> should also have a mind for what is happening in the Ratis LogService (but 
> the LogService should not dictate what HBase's WAL API looks like RATIS-272).
> Other "systems" inside of HBase that use WALs are replication and 
> backup and restore. Replication has the use-case for "tail"'ing the WAL which 
> we should provide via our new API. Backup and restore doesn't do anything 
> fancy (IIRC). We should make sure all consumers are generally going to be OK 
> with the API we create.
> The API may be "OK" (or OK in a part). We need to also consider other methods 
> which were "bolted" on such as {{AbstractFSWAL}} and 
> {{WALFileLengthProvider}}. Other corners of "WAL use" (like the 
> {{WALSplitter}}) should also be looked at to use WAL-APIs only.
> We also need to make sure that adequate interface audience and stability 
> annotations are chosen.





[jira] [Commented] (HBASE-20257) hbase-spark should not depend on com.google.code.findbugs.jsr305

2018-07-27 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16560161#comment-16560161
 ] 

Ted Yu commented on HBASE-20257:


Ping [~busbey]

> hbase-spark should not depend on com.google.code.findbugs.jsr305
> 
>
> Key: HBASE-20257
> URL: https://issues.apache.org/jira/browse/HBASE-20257
> Project: HBase
>  Issue Type: Task
>  Components: build, spark
>Affects Versions: 3.0.0
>Reporter: Ted Yu
>Assignee: Artem Ervits
>Priority: Minor
>  Labels: beginner
> Attachments: HBASE-20257.v01.patch, HBASE-20257.v02.patch, 
> HBASE-20257.v03.patch, HBASE-20257.v04.patch, HBASE-20257.v05.patch
>
>
> The following can be observed in the build output of master branch:
> {code}
> [WARNING] Rule 0: org.apache.maven.plugins.enforcer.BannedDependencies failed 
> with message:
> We don't allow the JSR305 jar from the Findbugs project, see HBASE-16321.
> Found Banned Dependency: com.google.code.findbugs:jsr305:jar:1.3.9
> Use 'mvn dependency:tree' to locate the source of the banned dependencies.
> {code}
> Here is related snippet from hbase-spark/pom.xml:
> {code}
> <dependency>
>   <groupId>com.google.code.findbugs</groupId>
>   <artifactId>jsr305</artifactId>
> {code}
> Dependency on jsr305 should be dropped.





[jira] [Commented] (HBASE-20952) Re-visit the WAL API

2018-07-27 Thread Josh Elser (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16560160#comment-16560160
 ] 

Josh Elser commented on HBASE-20952:


Talked through this a little bit today with [~an...@apache.org], trying to 
break it down into consumable pieces.
 * Look at prior art (e.g. Apache Kafka and DistributedLog) to come up with 
ideal API (e.g. RATIS-272)
 * Go through each "system" in HBase to document how they use WALs, and try to 
find an "ideal" HBase API
 ** Replication
 ** Backup and restore
 ** write path
 ** .. other?

Other tenets we called out:
 * Asynchronous API is desired
 * Easily supporting group-commit/multiple writers being batched into one "sync"
 * Must provide "tail" (hard part will be implementing this for 
FSHLog/AsyncFSWAL – maybe two interfaces would be good, an extension that 
supports tailing?)

Some other things here:
 * Look at WALEdit/WALKey as Stack asked

Once we have these pieces written down, I think we should be able to start 
finding a middle ground between it all.
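
To make the tenets above a bit more concrete, here is a minimal sketch of the "two interfaces" idea: a base interface with an asynchronous sync that can cover many appends, plus an optional extension for tailing. All names (WalSketch, TailableWalSketch, InMemoryWalSketch, readAfter) are made up for illustration, nothing here is an existing HBase or Ratis interface, and the in-memory implementation only stands in for real storage.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

// Hypothetical names throughout; nothing here is an existing HBase interface.
interface WalSketch {
  // Queues an entry and returns its sequence id; the entry is not yet durable.
  long append(byte[] entry);

  // Completes once every entry appended so far is durable, so one sync
  // can cover many appends (group commit).
  CompletableFuture<Long> sync();
}

// Optional extension for consumers such as replication that need to "tail".
interface TailableWalSketch extends WalSketch {
  // Returns durable entries whose sequence id is greater than afterSeqId.
  List<byte[]> readAfter(long afterSeqId);
}

final class InMemoryWalSketch implements TailableWalSketch {
  private final List<byte[]> entries = new ArrayList<>();
  private long syncedUpTo = 0; // highest durable sequence id

  @Override
  public synchronized long append(byte[] entry) {
    entries.add(entry.clone());
    return entries.size(); // sequence ids start at 1
  }

  @Override
  public synchronized CompletableFuture<Long> sync() {
    // A real implementation would flush to storage asynchronously; here the
    // "flush" is immediate, so the returned future is already complete.
    syncedUpTo = entries.size();
    return CompletableFuture.completedFuture(syncedUpTo);
  }

  @Override
  public synchronized List<byte[]> readAfter(long afterSeqId) {
    int from = (int) Math.min(Math.max(afterSeqId, 0), syncedUpTo);
    return new ArrayList<>(entries.subList(from, (int) syncedUpTo));
  }
}

final class WalSketchDemo {
  public static void main(String[] args) {
    InMemoryWalSketch wal = new InMemoryWalSketch();
    wal.append("a".getBytes(StandardCharsets.UTF_8));
    wal.append("b".getBytes(StandardCharsets.UTF_8));
    long durable = wal.sync().join();                 // one sync covers both appends
    wal.append("c".getBytes(StandardCharsets.UTF_8)); // appended but not yet synced
    // prints "2 durable, tail sees 2"
    System.out.println(durable + " durable, tail sees " + wal.readAfter(0).size());
  }
}
```

Keeping tailing in a separate interface means an implementation that cannot support it cheaply only has to implement the base one; whether that split is the right one is exactly the open question above.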

> Re-visit the WAL API
> 
>
> Key: HBASE-20952
> URL: https://issues.apache.org/jira/browse/HBASE-20952
> Project: HBase
>  Issue Type: Sub-task
>  Components: wal
>Reporter: Josh Elser
>Priority: Major
>
> Take a step back from the current WAL implementations and think about what an 
> HBase WAL API should look like. What are the primitive calls that we require 
> to guarantee durability of writes with a high degree of performance?
> The API needs to take the current implementations into consideration. We 
> should also have a mind for what is happening in the Ratis LogService (but 
> the LogService should not dictate what HBase's WAL API looks like RATIS-272).
> Other "systems" inside of HBase that use WALs are replication and 
> backup and restore. Replication has the use-case for "tail"'ing the WAL which 
> we should provide via our new API. Backup and restore doesn't do anything 
> fancy (IIRC). We should make sure all consumers are generally going to be OK 
> with the API we create.
> The API may be "OK" (or OK in a part). We need to also consider other methods 
> which were "bolted" on such as {{AbstractFSWAL}} and 
> {{WALFileLengthProvider}}. Other corners of "WAL use" (like the 
> {{WALSplitter}}) should also be looked at to use WAL-APIs only.
> We also need to make sure that adequate interface audience and stability 
> annotations are chosen.





[jira] [Commented] (HBASE-19369) HBase Should use Builder Pattern to Create Log Files while using WAL on Erasure Coding

2018-07-27 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16560150#comment-16560150
 ] 

Ted Yu commented on HBASE-19369:


+1

Thanks for the super fast turnaround.

> HBase Should use Builder Pattern to Create Log Files while using WAL on 
> Erasure Coding
> --
>
> Key: HBASE-19369
> URL: https://issues.apache.org/jira/browse/HBASE-19369
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.0.0
>Reporter: Alex Leblang
>Assignee: Mike Drob
>Priority: Major
> Fix For: 3.0.0, 2.2.0
>
> Attachments: HBASE-19369.master.001.patch, 
> HBASE-19369.master.002.patch, HBASE-19369.master.003.patch, 
> HBASE-19369.master.004.patch, HBASE-19369.v10.patch, HBASE-19369.v11.patch, 
> HBASE-19369.v12.patch, HBASE-19369.v13.patch, HBASE-19369.v5.patch, 
> HBASE-19369.v6.patch, HBASE-19369.v7.patch, HBASE-19369.v8.patch, 
> HBASE-19369.v9.patch
>
>
> Right now an HBase instance using the WAL won't function properly in an 
> Erasure Coded environment. We should change the following line to use the 
> hdfs.DistributedFileSystem builder pattern 
> https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/ProtobufLogWriter.java#L92





[jira] [Updated] (HBASE-19369) HBase Should use Builder Pattern to Create Log Files while using WAL on Erasure Coding

2018-07-27 Thread Mike Drob (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-19369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob updated HBASE-19369:
--
Attachment: HBASE-19369.v13.patch

> HBase Should use Builder Pattern to Create Log Files while using WAL on 
> Erasure Coding
> --
>
> Key: HBASE-19369
> URL: https://issues.apache.org/jira/browse/HBASE-19369
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.0.0
>Reporter: Alex Leblang
>Assignee: Mike Drob
>Priority: Major
> Fix For: 3.0.0, 2.2.0
>
> Attachments: HBASE-19369.master.001.patch, 
> HBASE-19369.master.002.patch, HBASE-19369.master.003.patch, 
> HBASE-19369.master.004.patch, HBASE-19369.v10.patch, HBASE-19369.v11.patch, 
> HBASE-19369.v12.patch, HBASE-19369.v13.patch, HBASE-19369.v5.patch, 
> HBASE-19369.v6.patch, HBASE-19369.v7.patch, HBASE-19369.v8.patch, 
> HBASE-19369.v9.patch
>
>
> Right now an HBase instance using the WAL won't function properly in an 
> Erasure Coded environment. We should change the following line to use the 
> hdfs.DistributedFileSystem builder pattern 
> https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/ProtobufLogWriter.java#L92





[jira] [Commented] (HBASE-19369) HBase Should use Builder Pattern to Create Log Files while using WAL on Erasure Coding

2018-07-27 Thread Mike Drob (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16560141#comment-16560141
 ] 

Mike Drob commented on HBASE-19369:
---

v13: address Ted's additional feedback.

> HBase Should use Builder Pattern to Create Log Files while using WAL on 
> Erasure Coding
> --
>
> Key: HBASE-19369
> URL: https://issues.apache.org/jira/browse/HBASE-19369
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.0.0
>Reporter: Alex Leblang
>Assignee: Mike Drob
>Priority: Major
> Fix For: 3.0.0, 2.2.0
>
> Attachments: HBASE-19369.master.001.patch, 
> HBASE-19369.master.002.patch, HBASE-19369.master.003.patch, 
> HBASE-19369.master.004.patch, HBASE-19369.v10.patch, HBASE-19369.v11.patch, 
> HBASE-19369.v12.patch, HBASE-19369.v5.patch, HBASE-19369.v6.patch, 
> HBASE-19369.v7.patch, HBASE-19369.v8.patch, HBASE-19369.v9.patch
>
>
> Right now an HBase instance using the WAL won't function properly in an 
> Erasure Coded environment. We should change the following line to use the 
> hdfs.DistributedFileSystem builder pattern 
> https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/ProtobufLogWriter.java#L92





[jira] [Commented] (HBASE-19369) HBase Should use Builder Pattern to Create Log Files while using WAL on Erasure Coding

2018-07-27 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16560110#comment-16560110
 ] 

Ted Yu commented on HBASE-19369:


Looks good overall.
{code}
+} catch (NoSuchMethodException e) {
+  LOG.debug("Could not find method on builder; will use old DFS API 
for file creation", e);
{code}
Since the absence of the method(s) is logged at DEBUG level, there is no need 
to include the full stack trace of the exception. Exception message should be 
good enough.
{code}
+  } else {
+return fs.createNonRecursive(path, overwritable, bufferSize, 
replication, blockSize, null);
{code}
nit: the 'else' is not needed.
{code}
+public class TestHBaseOnEC {
{code}
There might be more tests added for hadoop with EC enabled. This test is for 
WAL.
Please consider adding WAL to the class name.
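
For the first two points, a minimal sketch of the shape being suggested, under stated assumptions: OptionalMethodLookup and its methods are hypothetical, java.util.logging stands in for the project's logger (Level.FINE playing the role of DEBUG), and String.isEmpty is just a convenient method to look up. It is not the code from the patch.

```java
import java.lang.reflect.Method;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical helper for illustration; it is not the code from the patch.
final class OptionalMethodLookup {
  private static final Logger LOG = Logger.getLogger(OptionalMethodLookup.class.getName());

  // Returns the named no-arg public method, or null when the class lacks it.
  static Method findOrNull(Class<?> type, String name) {
    try {
      return type.getMethod(name);
    } catch (NoSuchMethodException e) {
      // Expected on older versions: log the message only, not the stack trace.
      LOG.log(Level.FINE, "Could not find method {0}; will use the fallback: {1}",
          new Object[] {name, e.getMessage()});
      return null;
    }
  }

  // Early return instead of if/else: the fallback path needs no 'else' branch.
  static String describe(Class<?> type, String name) {
    Method m = findOrNull(type, name);
    if (m != null) {
      return "new API: " + m.getName();
    }
    return "old API";
  }

  public static void main(String[] args) {
    System.out.println(describe(String.class, "isEmpty"));          // prints "new API: isEmpty"
    System.out.println(describe(String.class, "noSuchMethodHere")); // prints "old API"
  }
}
```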

> HBase Should use Builder Pattern to Create Log Files while using WAL on 
> Erasure Coding
> --
>
> Key: HBASE-19369
> URL: https://issues.apache.org/jira/browse/HBASE-19369
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.0.0
>Reporter: Alex Leblang
>Assignee: Mike Drob
>Priority: Major
> Fix For: 3.0.0, 2.2.0
>
> Attachments: HBASE-19369.master.001.patch, 
> HBASE-19369.master.002.patch, HBASE-19369.master.003.patch, 
> HBASE-19369.master.004.patch, HBASE-19369.v10.patch, HBASE-19369.v11.patch, 
> HBASE-19369.v12.patch, HBASE-19369.v5.patch, HBASE-19369.v6.patch, 
> HBASE-19369.v7.patch, HBASE-19369.v8.patch, HBASE-19369.v9.patch
>
>
> Right now an HBase instance using the WAL won't function properly in an 
> Erasure Coded environment. We should change the following line to use the 
> hdfs.DistributedFileSystem builder pattern 
> https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/ProtobufLogWriter.java#L92





[jira] [Updated] (HBASE-19369) HBase Should use Builder Pattern to Create Log Files while using WAL on Erasure Coding

2018-07-27 Thread Mike Drob (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-19369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob updated HBASE-19369:
--
Attachment: HBASE-19369.v12.patch

> HBase Should use Builder Pattern to Create Log Files while using WAL on 
> Erasure Coding
> --
>
> Key: HBASE-19369
> URL: https://issues.apache.org/jira/browse/HBASE-19369
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.0.0
>Reporter: Alex Leblang
>Assignee: Mike Drob
>Priority: Major
> Fix For: 3.0.0, 2.2.0
>
> Attachments: HBASE-19369.master.001.patch, 
> HBASE-19369.master.002.patch, HBASE-19369.master.003.patch, 
> HBASE-19369.master.004.patch, HBASE-19369.v10.patch, HBASE-19369.v11.patch, 
> HBASE-19369.v12.patch, HBASE-19369.v5.patch, HBASE-19369.v6.patch, 
> HBASE-19369.v7.patch, HBASE-19369.v8.patch, HBASE-19369.v9.patch
>
>
> Right now an HBase instance using the WAL won't function properly in an 
> Erasure Coded environment. We should change the following line to use the 
> hdfs.DistributedFileSystem builder pattern 
> https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/ProtobufLogWriter.java#L92





[jira] [Commented] (HBASE-19369) HBase Should use Builder Pattern to Create Log Files while using WAL on Erasure Coding

2018-07-27 Thread Mike Drob (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16560098#comment-16560098
 ] 

Mike Drob commented on HBASE-19369:
---

v12: address [~tedyu]'s feedback about log levels as well, missed it earlier. 
Thank you, Ted; anything else catch your eye here?

> HBase Should use Builder Pattern to Create Log Files while using WAL on 
> Erasure Coding
> --
>
> Key: HBASE-19369
> URL: https://issues.apache.org/jira/browse/HBASE-19369
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.0.0
>Reporter: Alex Leblang
>Assignee: Mike Drob
>Priority: Major
> Fix For: 3.0.0, 2.2.0
>
> Attachments: HBASE-19369.master.001.patch, 
> HBASE-19369.master.002.patch, HBASE-19369.master.003.patch, 
> HBASE-19369.master.004.patch, HBASE-19369.v10.patch, HBASE-19369.v11.patch, 
> HBASE-19369.v12.patch, HBASE-19369.v5.patch, HBASE-19369.v6.patch, 
> HBASE-19369.v7.patch, HBASE-19369.v8.patch, HBASE-19369.v9.patch
>
>
> Right now an HBase instance using the WAL won't function properly in an 
> Erasure Coded environment. We should change the following line to use the 
> hdfs.DistributedFileSystem builder pattern 
> https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/ProtobufLogWriter.java#L92





[jira] [Commented] (HBASE-20885) Remove entry for RPC quota from hbase:quota when RPC quota is removed.

2018-07-27 Thread Sakthi (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16560095#comment-16560095
 ] 

Sakthi commented on HBASE-20885:


[~elserj], FYI. I have yet to figure out why this happens, but while looking 
into it I saw a few issues (HBASE-20813 & HBASE-20885) and decided to work on 
these first. Once these two issues are solved, HBASE-20705 (the parent issue) 
is resolved as well; i.e., making sure that removal of a table/namespace also 
removes its RPC quotas, and that the empty row is removed automatically after 
an RPC quota is removed, seemed to resolve HBASE-20705. I tested that myself.
{noformat}
hbase(main):001:0> create 't2','cf1'
Created table t2
Took 1.0745 seconds
=> Hbase::Table - t2
hbase(main):002:0> set_quota TYPE => THROTTLE, TABLE => 't2', LIMIT => '10M/sec'
Took 0.0328 seconds
hbase(main):003:0> list_quotas
OWNER          QUOTAS
 TABLE => t2   TYPE => THROTTLE, THROTTLE_TYPE => REQUEST_SIZE, LIMIT => 10M/sec, SCOPE => MACHINE
1 row(s)
Took 0.1078 seconds
hbase(main):004:0> scan 'hbase:quota'
ROW    COLUMN+CELL
 t.t2  column=q:s, timestamp=1532714834471, value=PBUF\x12\x0B\x12\x09\x08\x04\x10\x80\x80\x80\x05 \x02
1 row(s)
Took 0.0301 seconds
hbase(main):005:0> set_quota TYPE => THROTTLE, TABLE => 't2', LIMIT => 'NONE'
Took 0.0066 seconds
hbase(main):006:0> list_quotas
OWNER          QUOTAS
0 row(s)
Took 0.0248 seconds
hbase(main):007:0> scan 'hbase:quota'
ROW    COLUMN+CELL
 t.t2  column=q:s, timestamp=1532714855300, value=PBUF\x12\x00
1 row(s)
Took 0.0034 seconds
hbase(main):008:0> set_quota TYPE => SPACE, TABLE => 't2', LIMIT => '1G', POLICY => NO_WRITES
Took 0.0145 seconds
hbase(main):009:0> list_quotas
OWNER          QUOTAS
 TABLE => t2   TYPE => SPACE, TABLE => t2, LIMIT => 1073741824, VIOLATION_POLICY => NO_WRITES
1 row(s)
Took 0.0340 seconds
hbase(main):010:0> scan 'hbase:quota'
ROW    COLUMN+CELL
 t.t2  column=q:s, timestamp=1532714870451, value=PBUF\x12\x00\x1A\x08\x08\x80\x80\x80\x80\x04\x10\x03
1 row(s)
Took 0.0045 seconds
hbase(main):011:0> set_quota TYPE => SPACE, TABLE => 't2', LIMIT => 'NONE', POLICY => NO_WRITES
Took 0.0075 seconds
hbase(main):012:0> list_quotas
OWNER          QUOTAS
 TABLE => t2   TYPE => SPACE, TABLE => t2, REMOVE => true
1 row(s)
Took 0.0353 seconds
hbase(main):013:0> scan 'hbase:quota'
ROW    COLUMN+CELL
 t.t2  column=q:s, timestamp=1532714887189, value=PBUF\x12\x00\x1A\x02\x18\x01
1 row(s)
{noformat}
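
For anyone reading the raw cell values above: they look like protobuf bytes 
after the PBUF prefix. A minimal sketch, assuming standard protobuf base-128 
varint encoding, of decoding the limit bytes seen in the scans. The class and 
method names are hypothetical and this is not HBase code; how the surrounding 
tag bytes map to quota fields is my reading of the output, not something 
checked against the .proto definitions.

```java
final class VarintSketch {
  /**
   * Decodes one base-128 varint starting at offset. Each byte carries 7 payload
   * bits, least-significant group first; the high bit marks "more bytes follow".
   */
  static long decodeVarint(int[] bytes, int offset) {
    long result = 0L;
    int shift = 0;
    for (int i = offset; i < bytes.length && shift < 64; i++) {
      int b = bytes[i] & 0xFF;
      result |= (long) (b & 0x7F) << shift;
      if ((b & 0x80) == 0) {
        return result;
      }
      shift += 7;
    }
    throw new IllegalArgumentException("truncated or oversized varint");
  }

  public static void main(String[] args) {
    // \x80\x80\x80\x05 from the throttle row (the 10M/sec limit, in bytes)
    System.out.println(decodeVarint(new int[] {0x80, 0x80, 0x80, 0x05}, 0));
    // \x80\x80\x80\x80\x04 from the space quota row (the 1G limit, in bytes)
    System.out.println(decodeVarint(new int[] {0x80, 0x80, 0x80, 0x80, 0x04}, 0));
  }
}
```

The two values decode to 10485760 and 1073741824, the latter matching the 
LIMIT that list_quotas prints. The leftover value PBUF\x12\x00 would then be 
a zero-length embedded message, which is consistent with an "empty" row 
remaining after the throttle is removed.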

> Remove entry for RPC quota from hbase:quota when RPC quota is removed.
> --
>
> Key: HBASE-20885
> URL: https://issues.apache.org/jira/browse/HBASE-20885
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Sakthi
>Assignee: Sakthi
>Priority: Minor
> Attachments: hbase-20885.master.001.patch, 
> hbase-20885.master.002.patch, hbase-20885.master.003.patch
>
>
> When an RPC quota is removed (using LIMIT => 'NONE'), the entry in the 
> hbase:quota table is not completely removed. For example, see below:
> {noformat}
> hbase(main):005:0> create 't2','cf1'
> Created table t2
> Took 0.8000 seconds
> => Hbase::Table - t2
> hbase(main):006:0> set_quota TYPE => THROTTLE, TABLE => 't2', LIMIT => '10M/sec'
> Took 0.1024 seconds
> hbase(main):007:0> list_quotas
> OWNER          QUOTAS
>  TABLE => t2   TYPE => THROTTLE, THROTTLE_TYPE => REQUEST_SIZE, LIMIT => 10M/sec, SCOPE => MACHINE
> 1 row(s)
> Took 0.0622 seconds
> hbase(main):008:0> scan 'hbase:quota'
> ROW    COLUMN+CELL
>  t.t2  column=q:s, timestamp=1531513014463, value=PBUF\x12\x0B\x12\x09\x08\x04\x10\x80\x80\x80\x05 \x02
> 1 row(s)
> Took 0.0453 seconds
> hbase(main):009:0> set_quota TYPE => THROTTLE, TABLE => 't2', LIMIT => 'NONE'
> Took 0.0097 seconds
> hbase(main):010:0> list_quotas
> OWNER          QUOTAS
> 0 row(s)
> Took 0.0338 seconds
> hbase(main):011:0> scan 'hbase:quota'
> ROW    COLUMN+CELL
>  t.t2  column=q:s, timestamp=1531513039505, value=PBUF\x12\x00
> 1 row(s)
> Took 0.0066 seconds
> {noformat}





[jira] [Updated] (HBASE-20966) RestoreTool#getTableInfoPath should look for completed snapshot only

2018-07-27 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-20966:
---
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Thanks for the review, Vlad.

> RestoreTool#getTableInfoPath should look for completed snapshot only
> 
>
> Key: HBASE-20966
> URL: https://issues.apache.org/jira/browse/HBASE-20966
> Project: HBase
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: 20966.v1.txt
>
>
> [~gubjanos] reported seeing the following error when running a backup / 
> restore test on Azure:
> {code}
> 2018-07-25 17:03:56,661|INFO|MainThread|machine.py:167 - run()||GUID=e7de7672-ebfd-402d-8f1f-68e7e8444cb1|org.apache.hadoop.hbase.snapshot.CorruptedSnapshotException: Couldn't read snapshot info from:wasb://hbase3-m30wub1711kond-115...@humbtesting8wua.blob.core.windows.net/user/hbase/backup_loc/backup_1532538064246/default/table_fnfawii1za/.hbase-snapshot/.tmp/.snapshotinfo
> 2018-07-25 17:03:56,661|INFO|MainThread|machine.py:167 - run()||GUID=e7de7672-ebfd-402d-8f1f-68e7e8444cb1|at org.apache.hadoop.hbase.snapshot.SnapshotDescriptionUtils.readSnapshotInfo(SnapshotDescriptionUtils.java:328)
> 2018-07-25 17:03:56,661|INFO|MainThread|machine.py:167 - run()||GUID=e7de7672-ebfd-402d-8f1f-68e7e8444cb1|at org.apache.hadoop.hbase.backup.util.RestoreServerUtil.getTableDesc(RestoreServerUtil.java:237)
> 2018-07-25 17:03:56,662|INFO|MainThread|machine.py:167 - run()||GUID=e7de7672-ebfd-402d-8f1f-68e7e8444cb1|at org.apache.hadoop.hbase.backup.util.RestoreServerUtil.restoreTableAndCreate(RestoreServerUtil.java:351)
> 2018-07-25 17:03:56,662|INFO|MainThread|machine.py:167 - run()||GUID=e7de7672-ebfd-402d-8f1f-68e7e8444cb1|at org.apache.hadoop.hbase.backup.util.RestoreServerUtil.fullRestoreTable(RestoreServerUtil.java:186)
> {code}
> Here is the related code in the master branch:
> {code}
>   Path getTableInfoPath(TableName tableName) throws IOException {
>     Path tableSnapShotPath = getTableSnapshotPath(backupRootPath, tableName, backupId);
>     Path tableInfoPath = null;
>     // can't build the path directly as the timestamp values are different
>     FileStatus[] snapshots = fs.listStatus(tableSnapShotPath);
> {code}
> In the above code, we don't exclude incomplete snapshots, leading to an 
> exception later when reading the snapshot info.
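
To illustrate the idea in the description: a minimal sketch, assuming the 
in-progress snapshot lives under a child directory named .tmp (as the failing 
path in the log suggests), of skipping that entry when choosing among the 
listed children. It works on plain names rather than FileStatus objects so it 
is self-contained; the class and method names are hypothetical and this is not 
the attached patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class CompletedSnapshotFilterSketch {
  // Working directory name taken from the path in the stack trace above.
  static final String TMP_DIR_NAME = ".tmp";

  /** Returns the child names that are not the in-progress working directory. */
  static List<String> completedOnly(List<String> childNames) {
    List<String> completed = new ArrayList<>();
    for (String name : childNames) {
      if (!TMP_DIR_NAME.equals(name)) {
        completed.add(name);
      }
    }
    return completed;
  }

  public static void main(String[] args) {
    // "snapshot_1532538064246_default_t1" is a made-up directory name.
    System.out.println(completedOnly(Arrays.asList(".tmp", "snapshot_1532538064246_default_t1")));
  }
}
```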





[jira] [Commented] (HBASE-20966) RestoreTool#getTableInfoPath should look for completed snapshot only

2018-07-27 Thread Vladimir Rodionov (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16560089#comment-16560089
 ] 

Vladimir Rodionov commented on HBASE-20966:
---

lgtm.

> RestoreTool#getTableInfoPath should look for completed snapshot only
> 
>
> Key: HBASE-20966
> URL: https://issues.apache.org/jira/browse/HBASE-20966
> Project: HBase
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Major
> Attachments: 20966.v1.txt
>
>
> [~gubjanos] reported seeing the following error when running a backup / 
> restore test on Azure:
> {code}
> 2018-07-25 17:03:56,661|INFO|MainThread|machine.py:167 - run()||GUID=e7de7672-ebfd-402d-8f1f-68e7e8444cb1|org.apache.hadoop.hbase.snapshot.CorruptedSnapshotException: Couldn't read snapshot info from:wasb://hbase3-m30wub1711kond-115...@humbtesting8wua.blob.core.windows.net/user/hbase/backup_loc/backup_1532538064246/default/table_fnfawii1za/.hbase-snapshot/.tmp/.snapshotinfo
> 2018-07-25 17:03:56,661|INFO|MainThread|machine.py:167 - run()||GUID=e7de7672-ebfd-402d-8f1f-68e7e8444cb1|at org.apache.hadoop.hbase.snapshot.SnapshotDescriptionUtils.readSnapshotInfo(SnapshotDescriptionUtils.java:328)
> 2018-07-25 17:03:56,661|INFO|MainThread|machine.py:167 - run()||GUID=e7de7672-ebfd-402d-8f1f-68e7e8444cb1|at org.apache.hadoop.hbase.backup.util.RestoreServerUtil.getTableDesc(RestoreServerUtil.java:237)
> 2018-07-25 17:03:56,662|INFO|MainThread|machine.py:167 - run()||GUID=e7de7672-ebfd-402d-8f1f-68e7e8444cb1|at org.apache.hadoop.hbase.backup.util.RestoreServerUtil.restoreTableAndCreate(RestoreServerUtil.java:351)
> 2018-07-25 17:03:56,662|INFO|MainThread|machine.py:167 - run()||GUID=e7de7672-ebfd-402d-8f1f-68e7e8444cb1|at org.apache.hadoop.hbase.backup.util.RestoreServerUtil.fullRestoreTable(RestoreServerUtil.java:186)
> {code}
> Here is the related code in the master branch:
> {code}
>   Path getTableInfoPath(TableName tableName) throws IOException {
>     Path tableSnapShotPath = getTableSnapshotPath(backupRootPath, tableName, backupId);
>     Path tableInfoPath = null;
>     // can't build the path directly as the timestamp values are different
>     FileStatus[] snapshots = fs.listStatus(tableSnapShotPath);
> {code}
> In the above code, we don't exclude incomplete snapshots, leading to an 
> exception later when reading the snapshot info.





[jira] [Updated] (HBASE-19369) HBase Should use Builder Pattern to Create Log Files while using WAL on Erasure Coding

2018-07-27 Thread Mike Drob (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-19369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob updated HBASE-19369:
--
Attachment: HBASE-19369.v11.patch

> HBase Should use Builder Pattern to Create Log Files while using WAL on 
> Erasure Coding
> --
>
> Key: HBASE-19369
> URL: https://issues.apache.org/jira/browse/HBASE-19369
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.0.0
>Reporter: Alex Leblang
>Assignee: Mike Drob
>Priority: Major
> Fix For: 3.0.0, 2.2.0
>
> Attachments: HBASE-19369.master.001.patch, 
> HBASE-19369.master.002.patch, HBASE-19369.master.003.patch, 
> HBASE-19369.master.004.patch, HBASE-19369.v10.patch, HBASE-19369.v11.patch, 
> HBASE-19369.v5.patch, HBASE-19369.v6.patch, HBASE-19369.v7.patch, 
> HBASE-19369.v8.patch, HBASE-19369.v9.patch
>
>
> Right now an HBase instance using the WAL won't function properly in an 
> Erasure Coded environment. We should change the following line to use the 
> hdfs.DistributedFileSystem builder pattern 
> https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/ProtobufLogWriter.java#L92





[jira] [Commented] (HBASE-19369) HBase Should use Builder Pattern to Create Log Files while using WAL on Erasure Coding

2018-07-27 Thread Mike Drob (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16560085#comment-16560085
 ] 

Mike Drob commented on HBASE-19369:
---

bq. I think the approach is fine. For the test, I think we could make use of 
org.junit.Assume to ignore it if we do not have EC support.
good idea, will do this.

bq. CommonFSUtils will not use the builder pattern even if the HFileSystem is 
just a wrapper of DistributedFileSystem
yea, i just ran into this when trying to do more tests. I don't think it's an 
issue that we have to fix right now though, and the test should fail if we 
start using HFileSystem for WALs somewhere because the cluster won't work. I'll 
add an explicit configuration to enforce stream capabilities checks and add to 
the javadoc that HFileSystem is not supported for those methods.

v11 attached.

> HBase Should use Builder Pattern to Create Log Files while using WAL on 
> Erasure Coding
> --
>
> Key: HBASE-19369
> URL: https://issues.apache.org/jira/browse/HBASE-19369
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.0.0
>Reporter: Alex Leblang
>Assignee: Mike Drob
>Priority: Major
> Fix For: 3.0.0, 2.2.0
>
> Attachments: HBASE-19369.master.001.patch, 
> HBASE-19369.master.002.patch, HBASE-19369.master.003.patch, 
> HBASE-19369.master.004.patch, HBASE-19369.v10.patch, HBASE-19369.v11.patch, 
> HBASE-19369.v5.patch, HBASE-19369.v6.patch, HBASE-19369.v7.patch, 
> HBASE-19369.v8.patch, HBASE-19369.v9.patch
>
>
> Right now an HBase instance using the WAL won't function properly in an 
> Erasure Coded environment. We should change the following line to use the 
> hdfs.DistributedFileSystem builder pattern 
> https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/ProtobufLogWriter.java#L92





[jira] [Commented] (HBASE-19369) HBase Should use Builder Pattern to Create Log Files while using WAL on Erasure Coding

2018-07-27 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16560083#comment-16560083
 ] 

Hadoop QA commented on HBASE-19369:
---

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 12s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 11s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 33s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  5m  9s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  3m 27s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  7m 20s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  7m  6s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 45s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 37s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  5m 13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  5m 13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 24s{color} | {color:green} The patch hbase-common passed checkstyle {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 13s{color} | {color:green} The patch hbase-procedure passed checkstyle {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m  8s{color} | {color:green} hbase-server: The patch generated 0 new + 0 unchanged - 2 fixed = 0 total (was 2) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 32s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 10m  1s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  0s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 21s{color} | {color:green} hbase-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 37s{color} | {color:green} hbase-procedure in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}120m 50s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  3m 53s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}200m 49s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-19369 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12933365/HBASE-19369.v10.patch |
| Optional Tests |  asflicense  javac  

[jira] [Comment Edited] (HBASE-20893) Data loss if splitting region while ServerCrashProcedure executing

2018-07-27 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16559861#comment-16559861
 ] 

stack edited comment on HBASE-20893 at 7/27/18 6:01 PM:


[~allan163] Yeah, support for rollback at the split procedure level was added, 
but rollback invokes rolling back of subprocedures, and this does not work if 
the subprocedure is unassign/assign. A hack was added to do async 
assign/unassign, which runs out-of-band as part of rollback, until rollback 
gets some love and is finished.

Unsupported exceptions and CODE-BUG, as well as crashed-out procedures, are 
scary.

bq. But. since a exception is thrown, the decrease for stateCount never happen.

Let's fix that in a new issue?

Do you have a problem with this patch? It avoids CODE-BUG and skips use of 
rollback with the hack async assign/unassign. It is also less *violent* than 
what was here previously: it just re-runs a step rather than flipping to 
(dodgy) rollback and waiting on new procedure scheduling.

I'm in here because my long-running tests are failing and this looks to be the 
cause.




was (Author: stack):
[~allan163] Yeah, support for rollback at the split procedure level was added 
but rollback invokes rolling back of subprocedures and this does not work if 
subprocedure is unassign.

Yeah, unsupported and CODE-BUG as well as crashed out procedure are scary.

bq. But. since a exception is thrown, the decrease for stateCount never happen.

Lets fix.

Do you have a problem with this patch? It avoids CODE-BUG and skips use of 
rollback. I'm in here because my long-running tests are failing and this looks 
to be the cause.



> Data loss if splitting region while ServerCrashProcedure executing
> --
>
> Key: HBASE-20893
> URL: https://issues.apache.org/jira/browse/HBASE-20893
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 3.0.0, 2.1.0, 2.0.1
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Major
> Fix For: 3.0.0, 2.0.2, 2.2.0, 2.1.1
>
> Attachments: HBASE-20893-branch-2.0.addendum.patch, 
> HBASE-20893.branch-2.0.001.patch, HBASE-20893.branch-2.0.002.patch, 
> HBASE-20893.branch-2.0.003.patch, HBASE-20893.branch-2.0.004.patch, 
> HBASE-20893.branch-2.0.005.patch
>
>
> Similar case as HBASE-20878.





[jira] [Commented] (HBASE-20734) Colocate recovered edits directory with hbase.wal.dir

2018-07-27 Thread Andrew Purtell (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16560069#comment-16560069
 ] 

Andrew Purtell commented on HBASE-20734:


I don't like the idea of doing an extra check at every region open either, but 
consider:
 * If the change would stymie a rolling upgrade then it can't go in
 * If a change on a patch or minor requires a separate tool or script to run 
before upgrade, then I think the change breaks our compatibility guidelines and 
could not go in
 * One round trip to the NN is not _that_ expensive

So unless this fix is to be committed only to master, for future HBase 3, then 
I think we need to be backwards compatible by checking for both the expected 
location after this fix and the prior expected location.

If we had some kind of feature flag facility for region metadata, then after 
doing this check once we could set a flag and then avoid doing the filesystem 
check if the flag is set. That kind of change might be possible to introduce 
into new minor releases.
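
A rough sketch of the "check both locations" idea, assuming the new location 
is preferred and the prior one is the fallback. It uses java.nio paths and an 
injected existence check instead of the Hadoop FileSystem API so it stays 
self-contained; the class name, method name, and the example paths are 
hypothetical.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;
import java.util.function.Predicate;

final class RecoveredEditsLocatorSketch {
  /**
   * Looks in the post-fix location first and falls back to the pre-fix one,
   * so regions written by an older version are still found after an upgrade.
   */
  static Optional<Path> locate(Path newLocation, Path oldLocation, Predicate<Path> exists) {
    if (exists.test(newLocation)) {
      return Optional.of(newLocation);
    }
    if (exists.test(oldLocation)) {
      return Optional.of(oldLocation);
    }
    return Optional.empty();
  }

  public static void main(String[] args) {
    // Hypothetical directory layout, for illustration only.
    Path underWalDir = Paths.get("waldir", "region1", "recovered.edits");
    Path underRootDir = Paths.get("rootdir", "region1", "recovered.edits");
    System.out.println(locate(underWalDir, underRootDir, p -> Files.exists(p)));
  }
}
```

The fallback is the extra existence check mentioned above; a region-level 
feature flag, if one existed, would let the second test be skipped.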

> Colocate recovered edits directory with hbase.wal.dir
> -
>
> Key: HBASE-20734
> URL: https://issues.apache.org/jira/browse/HBASE-20734
> Project: HBase
>  Issue Type: Improvement
>  Components: MTTR, Recovery, wal
>Reporter: Ted Yu
>Assignee: Zach York
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HBASE-20734.branch-1.001.patch
>
>
> During investigation of HBASE-20723, I realized that we wouldn't get the best 
> performance for recovered edits when hbase.wal.dir is configured to be on 
> different (fast) media than the hbase rootdir, since the recovered edits 
> directory is currently under rootdir.
> Such a setup may not result in fast recovery when there is a region server 
> failover.
> This issue is to find a proper (hopefully backward compatible) way of 
> colocating the recovered edits directory with hbase.wal.dir.





[jira] [Commented] (HBASE-20749) Upgrade our use of checkstyle to 8.6+

2018-07-27 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16560067#comment-16560067
 ] 

Hudson commented on HBASE-20749:


Results for branch HBASE-20749
[build #5 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-20749/5/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(x) {color:red}-1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-20749/5//General_Nightly_Build_Report/]




(/) {color:green}+1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-20749/5//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-20749/5//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> Upgrade our use of checkstyle to 8.6+
> -
>
> Key: HBASE-20749
> URL: https://issues.apache.org/jira/browse/HBASE-20749
> Project: HBase
>  Issue Type: Improvement
>  Components: build, community
>Reporter: Sean Busbey
>Assignee: Mike Drob
>Priority: Minor
> Attachments: HBASE-20749.master.001.patch
>
>
> We should upgrade our checkstyle version to 8.6 or later so we can use the 
> "match violation message to this regex" feature for suppression. That will 
> allow us to make sure we don't regress on HTrace v3 vs v4 APIs (came up in 
> HBASE-20332).
> We're currently blocked on upgrading to 8.3+ by [checkstyle 
> #5279|https://github.com/checkstyle/checkstyle/issues/5279], a regression 
> that flags our use of both the "separate import groups" and "put static 
> imports over here" configs as an error.





[jira] [Commented] (HBASE-20583) SplitLogWorker should handle FileNotFoundException when split a wal

2018-07-27 Thread Andrew Purtell (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16560064#comment-16560064
 ] 

Andrew Purtell commented on HBASE-20583:


+1 [~pankaj2461], are you interested in doing that?

> SplitLogWorker should handle FileNotFoundException when split a wal
> ---
>
> Key: HBASE-20583
> URL: https://issues.apache.org/jira/browse/HBASE-20583
> Project: HBase
>  Issue Type: Bug
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
>Priority: Major
> Fix For: 2.0.1
>
> Attachments: HBASE-20583.master.001.patch, 
> HBASE-20583.master.001.patch
>
>
> When a split task is finished, the master deletes the WAL first, then removes 
> the task's zk node. So if the master crashes after deleting the WAL, the zk 
> task node may be left on zk. When the master resubmits this task, the task 
> will fail with a FileNotFoundException.
> We already handle FileNotFoundException in WALSplitter, but not in 
> SplitLogWorker.
>  
> {code:java}
>   try {
>     in = getReader(path, reporter);
>   } catch (EOFException e) {
>     if (length <= 0) {
>       // TODO should we ignore an empty, not-last log file if skip.errors
>       // is false? Either way, the caller should decide what to do. E.g.
>       // ignore if this is the last log in sequence.
>       // TODO is this scenario still possible if the log has been
>       // recovered (i.e. closed)
>       LOG.warn("Could not open {} for reading. File is empty", path, e);
>     }
>     // EOFException being ignored
>     return null;
>   }
> } catch (IOException e) {
>   if (e instanceof FileNotFoundException) {
>     // A wal file may not exist anymore. Nothing can be recovered so move on
>     LOG.warn("File {} does not exist anymore", path, e);
>     return null;
>   }
> }{code}
> {code:java}
> // Here fs.getFileStatus may throw FileNotFoundException, too. We should
> // handle this exception as WALSplitter.getReader does.
> try {
>   if (!WALSplitter.splitLogFile(walDir, fs.getFileStatus(new Path(walDir, filename)),
>       fs, conf, p, sequenceIdChecker,
>       server.getCoordinatedStateManager().getSplitLogWorkerCoordination(), factory)) {
>     return Status.PREEMPTED;
>   }
> } 
> {code}
>  
>  
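
For discussion, a self-contained sketch of the handling the description asks 
for, assuming a missing WAL means an earlier attempt already finished and 
deleted it, so the resubmitted task can be reported as done instead of failed. 
It uses java.io instead of the HBase/Hadoop classes; the class name, the 
Status values, and the file name are hypothetical, and this is not the 
attached patch.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;

final class SplitTaskSketch {
  enum Status { DONE, ERR }

  static Status splitOne(File wal) {
    try (InputStream in = new FileInputStream(wal)) {
      in.read(); // stand-in for the real split work
      return Status.DONE;
    } catch (FileNotFoundException e) {
      // The WAL is already gone: nothing left to recover, so do not fail the task.
      return Status.DONE;
    } catch (IOException e) {
      return Status.ERR;
    }
  }

  public static void main(String[] args) {
    System.out.println(splitOne(new File("no-such-wal-file")));
  }
}
```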





[jira] [Comment Edited] (HBASE-20895) NPE in RpcServer#readAndProcess

2018-07-27 Thread Andrew Purtell (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16560014#comment-16560014
 ] 

Andrew Purtell edited comment on HBASE-20895 at 7/27/18 5:34 PM:
-

[~mihir6692] indicates the latest patch fixes the issue as observed in the test 
environment and scenario which first identified it, and the precommit failure 
is a VM crash, so looks environmental in nature. 

I plan to commit this trivial NPE fix today unless there are objections.


was (Author: apurtell):
[~mihir6692] indicates the latest patch fixes the issue and the precommit 
failure is a VM crash, so looks environmental in nature. 

I plan to commit this trivial NPE fix today unless objection.

> NPE in RpcServer#readAndProcess
> ---
>
> Key: HBASE-20895
> URL: https://issues.apache.org/jira/browse/HBASE-20895
> Project: HBase
>  Issue Type: Bug
>  Components: rpc
>Affects Versions: 1.3.2
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
>Priority: Major
> Fix For: 1.5.0, 1.3.3, 1.4.7
>
> Attachments: HBASE-20895-branch-1.patch, HBASE-20895-branch-1.patch
>
>
> {noformat}
> 2018-07-10 16:25:55,005 DEBUG [.sfdc.net,port=60020] ipc.RpcServer - 
> RpcServer.listener,port=60020: Caught exception while reading:
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hbase.ipc.RpcServer$Connection.readAndProcess(RpcServer.java:1761)
> at 
> org.apache.hadoop.hbase.ipc.RpcServer$Listener.doRead(RpcServer.java:949)
> at 
> org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.doRunLoop(RpcServer.java:730)
> at 
> org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.run(RpcServer.java:706)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> This looks like it could be a use after close problem if there is concurrent 
> access to a Connection.
> In process() we might store a null back to the 'data' field.
> Meanwhile in readAndProcess() we have a case where we might be blocked on a 
> channel read and then after coming back from the read we go to use 'data' 
> after a null has been written back, leading to a NPE.
> {quote}count = channelRead(channel, data);
>  1761 ---> if (count >= 0 && *data.remaining()* == 0)
>  \{ process(); }{quote}
> Whether an NPE happens or not is going to depend on the timing of the store 
> back to 'data' in another thread and the use of 'data' in this thread, and 
> whether or not the JVM has optimized away a reload of 'data' (it's not 
> declared volatile).
> We should do a null check here just to be defensive. We should also look at 
> whether concurrent access to the Connection is happening and intended. The 
> above is just a theory. We should also look at other execution sequences that 
> could lead to 'data' being null in this location. At a glance I didn't find 
> one but the store to 'data' happens behind conditionals so it is possible. 
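
A minimal sketch of the defensive check described above, assuming the field is 
copied into a local once so that the null test and the remaining() call see 
the same reference. The class and method names are hypothetical; this is not 
RpcServer code and not necessarily what the attached patch does.

```java
import java.nio.ByteBuffer;

final class ConnectionSketch {
  // May be set to null by another code path, as theorized for 'data' above.
  private ByteBuffer data;

  void setData(ByteBuffer buffer) {
    this.data = buffer;
  }

  /** True only when the read succeeded and a non-null buffer has been filled. */
  boolean readyToProcess(int count) {
    ByteBuffer local = data; // single read of the field
    return count >= 0 && local != null && local.remaining() == 0;
  }

  public static void main(String[] args) {
    ConnectionSketch c = new ConnectionSketch();
    System.out.println(c.readyToProcess(0)); // null buffer: no NPE, just false
    c.setData(ByteBuffer.allocate(0));
    System.out.println(c.readyToProcess(0));
  }
}
```

Testing the field directly and then calling data.remaining() would read it 
twice, which leaves a window for the null store in between.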





[jira] [Assigned] (HBASE-18920) Operation.toString() is counterintuitive

2018-07-27 Thread Hua-Yi Ho (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-18920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hua-Yi Ho reassigned HBASE-18920:
-

Assignee: Hua-Yi Ho

> Operation.toString() is counterintuitive
> 
>
> Key: HBASE-18920
> URL: https://issues.apache.org/jira/browse/HBASE-18920
> Project: HBase
>  Issue Type: Bug
>Reporter: James Taylor
>Assignee: Hua-Yi Ho
>Priority: Major
>
> When debugging, you often need to know which columns are projected into the 
> Scan. The Operation.toString() silently truncates what you see to the first 
> 5. Typically, if you truncate a value, you should end it with "..." so that 
> the user knows there are actually more (instead of wasting time thinking 
> they're not projecting everything that needs to be projected). Also, a limit 
> of 5 is way too small of a default.
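
A small sketch of the behavior requested above, assuming a plain list of 
column names: show up to a configurable number of entries and append "..." 
when anything was cut off. The class and method names are hypothetical and 
unrelated to the actual Operation.toString() implementation.

```java
import java.util.Arrays;
import java.util.List;

final class TruncatingToStringSketch {
  /** Renders at most maxCols entries and marks any truncation with "...". */
  static String describe(List<String> columns, int maxCols) {
    StringBuilder sb = new StringBuilder("[");
    int shown = Math.min(columns.size(), maxCols);
    for (int i = 0; i < shown; i++) {
      if (i > 0) {
        sb.append(", ");
      }
      sb.append(columns.get(i));
    }
    if (columns.size() > shown) {
      if (shown > 0) {
        sb.append(", ");
      }
      sb.append("...");
    }
    return sb.append("]").toString();
  }

  public static void main(String[] args) {
    System.out.println(describe(Arrays.asList("cf:a", "cf:b", "cf:c"), 2));
  }
}
```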





[jira] [Assigned] (HBASE-18920) Operation.toString() is counterintuitive

2018-07-27 Thread Hua-Yi Ho (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-18920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hua-Yi Ho reassigned HBASE-18920:
-

Assignee: (was: Hua-Yi Ho)

> Operation.toString() is counterintuitive
> 
>
> Key: HBASE-18920
> URL: https://issues.apache.org/jira/browse/HBASE-18920
> Project: HBase
>  Issue Type: Bug
>Reporter: James Taylor
>Priority: Major
>
> When debugging, you often need to know which columns are projected into the 
> Scan. The Operation.toString() silently truncates what you see to the first 
> 5. Typically, if you truncate a value, you should end it with "..." so that 
> the user knows there are actually more (instead of wasting time thinking 
> they're not projecting everything that needs to be projected). Also, a limit 
> of 5 is way too small of a default.





[jira] [Commented] (HBASE-20883) HMaster Read / Write Requests Per Sec across RegionServers, currently only Total Requests Per Sec

2018-07-27 Thread Hari Sekhon (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16560029#comment-16560029
 ] 

Hari Sekhon commented on HBASE-20883:
-

I probably won't get round to this any time soon, as I solved it quickly by 
writing an external tool before raising this ticket and am on to the next 
thing. I merely raised this as a future improvement to do at some point.

Request breakdowns by region and by regionserver are available along with a 
selection of other HBase tools in my PyTools github repo:

[https://github.com/HariSekhon/pytools]

 

> HMaster Read / Write Requests Per Sec across RegionServers, currently only 
> Total Requests Per Sec 
> --
>
> Key: HBASE-20883
> URL: https://issues.apache.org/jira/browse/HBASE-20883
> Project: HBase
>  Issue Type: Improvement
>  Components: Admin, master, metrics, monitoring, UI, Usability
>Affects Versions: 1.1.2
>Reporter: Hari Sekhon
>Priority: Major
>
> HMaster currently shows Requests Per Second per RegionServer under HMaster 
> UI's /master-status page -> Region Servers -> Base Stats section in the Web 
> UI.
> Please add Reads Per Second and Writes Per Second per RegionServer alongside 
> this in the HMaster UI, and also expose the Read/Write/Total requests per sec 
> information in the HMaster JMX API.
> This will make it easier to find read or write hotspotting on HBase as a 
> combined total will minimize and mask differences between RegionServers. For 
> example, we do 30,000 reads/sec but only 900 writes/sec to each RegionServer, 
> so write skew will be masked as it won't show enough significant difference 
> in the much larger combined Total Requests Per Second stat.
> For now I've written a Python tool to calculate this info from RegionServers 
> JMX read/write/total request counts but since HMaster is collecting this info 
> anyway it shouldn't be a big change to improve it to also show Reads / Writes 
> Per Sec as well as Total.
> Find my tools for more granular Read/Write Requests Per Sec Per Regionserver 
> and also Per Region at my [PyTools github 
> repo|https://github.com/harisekhon/pytools] along with a selection of other 
> HBase tools I've used for performance debugging over the years.
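
A small worked example of the masking effect described above, as a hedged Java sketch. It is not the reporter's Python tool and does not call any HBase or JMX API; the class RequestRates and its methods are hypothetical. The 30,000 reads/sec and 900 writes/sec figures come from the description, while the second server's 1,800 writes/sec is an assumed value chosen to create write skew:

```java
final class RequestRates {
  // Per-second rate from two samples of a monotonically increasing counter.
  static double perSecond(long earlier, long later, long elapsedMillis) {
    if (elapsedMillis <= 0) {
      throw new IllegalArgumentException("elapsedMillis must be positive");
    }
    return (later - earlier) * 1000.0 / elapsedMillis;
  }

  // Relative difference between two rates, measured against the smaller one.
  static double relativeDiff(double a, double b) {
    return Math.abs(a - b) / Math.min(a, b);
  }

  public static void main(String[] args) {
    double reads = 30_000;    // reads/sec on both servers (from the description)
    double writesA = 900;     // writes/sec on server A (from the description)
    double writesB = 1_800;   // writes/sec on server B (assumed, for illustration)
    System.out.println("write skew: " + relativeDiff(writesA, writesB));
    System.out.println("total skew: " + relativeDiff(reads + writesA, reads + writesB));
  }
}
```

Server B takes twice the writes of server A, yet the combined totals (30,900 vs 31,800) differ by only about 3%, which is why a Total Requests Per Second column alone hides the hotspot.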





[jira] [Assigned] (HBASE-20895) NPE in RpcServer#readAndProcess

2018-07-27 Thread Andrew Purtell (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell reassigned HBASE-20895:
--

Assignee: Andrew Purtell  (was: Monani Mihir)

> NPE in RpcServer#readAndProcess
> ---
>
> Key: HBASE-20895
> URL: https://issues.apache.org/jira/browse/HBASE-20895
> Project: HBase
>  Issue Type: Bug
>  Components: rpc
>Affects Versions: 1.3.2
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
>Priority: Major
> Fix For: 1.5.0, 1.3.3, 1.4.7
>
> Attachments: HBASE-20895-branch-1.patch, HBASE-20895-branch-1.patch
>
>
> {noformat}
> 2018-07-10 16:25:55,005 DEBUG [.sfdc.net,port=60020] ipc.RpcServer - RpcServer.listener,port=60020: Caught exception while reading:
> java.lang.NullPointerException
>     at org.apache.hadoop.hbase.ipc.RpcServer$Connection.readAndProcess(RpcServer.java:1761)
>     at org.apache.hadoop.hbase.ipc.RpcServer$Listener.doRead(RpcServer.java:949)
>     at org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.doRunLoop(RpcServer.java:730)
>     at org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.run(RpcServer.java:706)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>     at java.lang.Thread.run(Thread.java:745)
> {noformat}
> This looks like it could be a use after close problem if there is concurrent 
> access to a Connection.
> In process() we might store a null back to the 'data' field.
> Meanwhile in readAndProcess() we have a case where we might be blocked on a 
> channel read and then after coming back from the read we go to use 'data' 
> after a null has been written back, leading to an NPE.
> {quote}count = channelRead(channel, data);
>  1761 ---> if (count >= 0 && *data.remaining()* == 0)
>  { process(); }{quote}
> Whether an NPE happens or not is going to depend on the timing of the store 
> back to 'data' in another thread and use of 'data' in this thread and whether 
> or not the JVM has optimized away a reload of 'data' (it's not declared 
> volatile).
> We should do a null check here just to be defensive. We should also look at 
> whether concurrent access to the Connection is happening and intended. The 
> above is just a theory. We should also look at other execution sequences that 
> could lead to 'data' being null in this location. At a glance I didn't find 
> one but the store to 'data' happens behind conditionals so it is possible. 
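
The defensive null check suggested above could look roughly like the following. This is a hedged sketch, not the attached patch and not RpcServer code: ConnectionSketch, setData and readyToProcess are hypothetical names, and declaring the field volatile is this sketch's own choice. The key point is that the field is read once into a local, so a concurrent store of null cannot land between the check and the use:

```java
import java.nio.ByteBuffer;

final class ConnectionSketch {
  // Stand-in for the 'data' field discussed above; another thread may set it
  // to null. Marking it volatile is an assumption made for this sketch.
  private volatile ByteBuffer data;

  void setData(ByteBuffer buffer) {
    data = buffer;
  }

  // Returns true only when a read succeeded and the buffer is completely full.
  boolean readyToProcess(int count) {
    ByteBuffer local = data;  // single read of the shared field
    if (local == null) {
      return false;           // buffer was cleared elsewhere; nothing to process
    }
    return count >= 0 && local.remaining() == 0;
  }
}
```

A check like this only prevents the exception; whether concurrent access to the Connection is expected at all still needs to be answered separately, as the description says.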





[jira] [Commented] (HBASE-20895) NPE in RpcServer#readAndProcess

2018-07-27 Thread Andrew Purtell (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16560014#comment-16560014
 ] 

Andrew Purtell commented on HBASE-20895:


[~mihir6692] indicates the latest patch fixes the issue, and the precommit 
failure is a VM crash, so it looks environmental in nature. 

I plan to commit this trivial NPE fix today unless there are objections.

> NPE in RpcServer#readAndProcess
> ---
>
> Key: HBASE-20895
> URL: https://issues.apache.org/jira/browse/HBASE-20895
> Project: HBase
>  Issue Type: Bug
>  Components: rpc
>Affects Versions: 1.3.2
>Reporter: Andrew Purtell
>Assignee: Monani Mihir
>Priority: Major
> Fix For: 1.5.0, 1.3.3, 1.4.7
>
> Attachments: HBASE-20895-branch-1.patch, HBASE-20895-branch-1.patch
>
>
> {noformat}
> 2018-07-10 16:25:55,005 DEBUG [.sfdc.net,port=60020] ipc.RpcServer - RpcServer.listener,port=60020: Caught exception while reading:
> java.lang.NullPointerException
>     at org.apache.hadoop.hbase.ipc.RpcServer$Connection.readAndProcess(RpcServer.java:1761)
>     at org.apache.hadoop.hbase.ipc.RpcServer$Listener.doRead(RpcServer.java:949)
>     at org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.doRunLoop(RpcServer.java:730)
>     at org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.run(RpcServer.java:706)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>     at java.lang.Thread.run(Thread.java:745)
> {noformat}
> This looks like it could be a use after close problem if there is concurrent 
> access to a Connection.
> In process() we might store a null back to the 'data' field.
> Meanwhile in readAndProcess() we have a case where we might be blocked on a 
> channel read and then after coming back from the read we go to use 'data' 
> after a null has been written back, leading to an NPE.
> {quote}count = channelRead(channel, data);
>  1761 ---> if (count >= 0 && *data.remaining()* == 0)
>  { process(); }{quote}
> Whether an NPE happens or not is going to depend on the timing of the store 
> back to 'data' in another thread and use of 'data' in this thread and whether 
> or not the JVM has optimized away a reload of 'data' (it's not declared 
> volatile).
> We should do a null check here just to be defensive. We should also look at 
> whether concurrent access to the Connection is happening and intended. The 
> above is just a theory. We should also look at other execution sequences that 
> could lead to 'data' being null in this location. At a glance I didn't find 
> one but the store to 'data' happens behind conditionals so it is possible. 





[jira] [Commented] (HBASE-20966) RestoreTool#getTableInfoPath should look for completed snapshot only

2018-07-27 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16559990#comment-16559990
 ] 

Hadoop QA commented on HBASE-20966:
---

| +1 overall |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 13s | Docker mode activated. |
| 0 | patch | 0m 1s | The patch file was not named according to hbase's naming conventions. Please see https://yetus.apache.org/documentation/0.7.0/precommit-patchnames for instructions. |
|| || || || Prechecks ||
| +1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| -0 | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
|| || || || master Compile Tests ||
| +1 | mvninstall | 4m 58s | master passed |
| +1 | compile | 0m 27s | master passed |
| +1 | checkstyle | 0m 14s | master passed |
| +1 | shadedjars | 4m 37s | branch has no errors when building our shaded downstream artifacts. |
| +1 | findbugs | 0m 36s | master passed |
| +1 | javadoc | 0m 15s | master passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 4m 58s | the patch passed |
| +1 | compile | 0m 28s | the patch passed |
| +1 | javac | 0m 28s | the patch passed |
| +1 | checkstyle | 0m 14s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedjars | 5m 15s | patch has no errors when building our shaded downstream artifacts. |
| +1 | hadoopcheck | 11m 48s | Patch does not cause any errors with Hadoop 2.7.4 or 3.0.0. |
| +1 | findbugs | 0m 55s | the patch passed |
| +1 | javadoc | 0m 20s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 12m 58s | hbase-backup in the patch passed. |
| +1 | asflicense | 0m 16s | The patch does not generate ASF License warnings. |
| | | 48m 56s | |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-20966 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12933386/20966.v1.txt |
| Optional Tests | asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile |
| uname | Linux 4a5c5281c8ee 3.13.0-143-generic #192-Ubuntu SMP Tue Feb 27 10:45:36 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build@2/component/dev-support/hbase-personality.sh |
| git revision | master / 7178a98258 |
| maven | version: Apache Maven 3.5.4 (1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| Default Java | 1.8.0_171 |
| findbugs | v3.1.0-RC3 |
|  Test Results | 

[jira] [Updated] (HBASE-20528) Revise collections copying from iteration to built-in function

2018-07-27 Thread Hua-Yi Ho (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hua-Yi Ho updated HBASE-20528:
--
Attachment: (was: 0001-HBASE-20528.v0.patch)

> Revise collections copying from iteration to built-in function
> --
>
> Key: HBASE-20528
> URL: https://issues.apache.org/jira/browse/HBASE-20528
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 3.0.0
>Reporter: Hua-Yi Ho
>Assignee: Hua-Yi Ho
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: HBASE-20528.master.001.patch
>
>
> Some collection code in the files
> StochasticLoadBalancer.java, AbstractHBaseTool.java, HFileInputFormat.java, 
> Result.java, and WalPlayer.java uses iteration to copy all the data in 
> collections. The iterations can be replaced by Collections.addAll and 
> Arrays.copyOf.
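
For illustration, here is the kind of rewrite the description refers to, as a self-contained Java sketch. The class CopyExamples and its methods are hypothetical and are not taken from the patch or from the listed HBase files; only Collections.addAll and Arrays.copyOf are the JDK calls named in the issue:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class CopyExamples {
  // Before: element-by-element copy with an explicit loop.
  static List<String> copyWithLoop(String[] src) {
    List<String> dst = new ArrayList<>();
    for (String s : src) {
      dst.add(s);
    }
    return dst;
  }

  // After: the same copy expressed with Collections.addAll.
  static List<String> copyWithAddAll(String[] src) {
    List<String> dst = new ArrayList<>(src.length);
    Collections.addAll(dst, src);
    return dst;
  }

  // Array-to-array copy with Arrays.copyOf instead of an index loop.
  static byte[] copyArray(byte[] src) {
    return Arrays.copyOf(src, src.length);
  }
}
```

Both list variants produce equal lists; the built-in form is shorter and leaves less room for off-by-one mistakes.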





[jira] [Updated] (HBASE-20528) Revise collections copying from iteration to built-in function

2018-07-27 Thread Hua-Yi Ho (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hua-Yi Ho updated HBASE-20528:
--
Attachment: HBASE-20528.master.001.patch

> Revise collections copying from iteration to built-in function
> --
>
> Key: HBASE-20528
> URL: https://issues.apache.org/jira/browse/HBASE-20528
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 3.0.0
>Reporter: Hua-Yi Ho
>Assignee: Hua-Yi Ho
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: HBASE-20528.master.001.patch
>
>
> Some collection code in the files
> StochasticLoadBalancer.java, AbstractHBaseTool.java, HFileInputFormat.java, 
> Result.java, and WalPlayer.java uses iteration to copy all the data in 
> collections. The iterations can be replaced by Collections.addAll and 
> Arrays.copyOf.





[jira] [Commented] (HBASE-20886) [Auth] Support keytab login in hbase client

2018-07-27 Thread Sean Busbey (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16559976#comment-16559976
 ] 

Sean Busbey commented on HBASE-20886:
-

+1

nit: maybe we call this "new feature" instead of "improvement". could see it 
going either way.

> [Auth] Support keytab login in hbase client
> ---
>
> Key: HBASE-20886
> URL: https://issues.apache.org/jira/browse/HBASE-20886
> Project: HBase
>  Issue Type: Improvement
>  Components: asyncclient, Client, security
>Reporter: Reid Chan
>Assignee: Reid Chan
>Priority: Critical
> Attachments: HBASE-20886.master.001.patch, 
> HBASE-20886.master.002.patch, HBASE-20886.master.003.patch, 
> HBASE-20886.master.004.patch, HBASE-20886.master.005.patch, 
> HBASE-20886.master.006.patch, HBASE-20886.master.007.patch, 
> HBASE-20886.master.008.patch
>
>
> There are lots of questions on the user mailing list and Slack channel about 
> how to connect to a kerberized hbase cluster through the hbase-client API.
> {{hbase.client.keytab.file}} and {{hbase.client.keytab.principal}} already 
> exist in the code base, but they are only used in {{Canary}}.
> This issue is to make use of these two configs to support client-side 
> keytab-based login. After this issue is resolved, hbase-client should connect 
> directly to a kerberized cluster without any code changes as long as 
> {{hbase.client.keytab.file}} and {{hbase.client.keytab.principal}} are 
> specified.





[jira] [Updated] (HBASE-20528) Revise collections copying from iteration to built-in function

2018-07-27 Thread Hua-Yi Ho (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hua-Yi Ho updated HBASE-20528:
--
Attachment: 0001-HBASE-20528.v0.patch

> Revise collections copying from iteration to built-in function
> --
>
> Key: HBASE-20528
> URL: https://issues.apache.org/jira/browse/HBASE-20528
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 3.0.0
>Reporter: Hua-Yi Ho
>Assignee: Hua-Yi Ho
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: 0001-HBASE-20528.v0.patch
>
>
> Some collection code in the files
> StochasticLoadBalancer.java, AbstractHBaseTool.java, HFileInputFormat.java, 
> Result.java, and WalPlayer.java uses iteration to copy all the data in 
> collections. The iterations can be replaced by Collections.addAll and 
> Arrays.copyOf.





[jira] [Updated] (HBASE-20528) Revise collections copying from iteration to built-in function

2018-07-27 Thread Hua-Yi Ho (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hua-Yi Ho updated HBASE-20528:
--
Attachment: (was: 
0001-HBASE-20528-Revise-collections-copying-from-iteratio.patch)

> Revise collections copying from iteration to built-in function
> --
>
> Key: HBASE-20528
> URL: https://issues.apache.org/jira/browse/HBASE-20528
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 3.0.0
>Reporter: Hua-Yi Ho
>Assignee: Hua-Yi Ho
>Priority: Minor
> Fix For: 3.0.0
>
>
> Some collection code in the files
> StochasticLoadBalancer.java, AbstractHBaseTool.java, HFileInputFormat.java, 
> Result.java, and WalPlayer.java uses iteration to copy all the data in 
> collections. The iterations can be replaced by Collections.addAll and 
> Arrays.copyOf.





[jira] [Updated] (HBASE-20528) Revise collections copying from iteration to built-in function

2018-07-27 Thread Hua-Yi Ho (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hua-Yi Ho updated HBASE-20528:
--
Attachment: 0001-HBASE-20528-Revise-collections-copying-from-iteratio.patch

> Revise collections copying from iteration to built-in function
> --
>
> Key: HBASE-20528
> URL: https://issues.apache.org/jira/browse/HBASE-20528
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 3.0.0
>Reporter: Hua-Yi Ho
>Assignee: Hua-Yi Ho
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: 
> 0001-HBASE-20528-Revise-collections-copying-from-iteratio.patch
>
>
> Some collection code in the files
> StochasticLoadBalancer.java, AbstractHBaseTool.java, HFileInputFormat.java, 
> Result.java, and WalPlayer.java uses iteration to copy all the data in 
> collections. The iterations can be replaced by Collections.addAll and 
> Arrays.copyOf.





[jira] [Updated] (HBASE-20528) Revise collections copying from iteration to built-in function

2018-07-27 Thread Hua-Yi Ho (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hua-Yi Ho updated HBASE-20528:
--
Attachment: (was: 
0001-Revise-collections-copying-to-built-in-function.patch)

> Revise collections copying from iteration to built-in function
> --
>
> Key: HBASE-20528
> URL: https://issues.apache.org/jira/browse/HBASE-20528
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 3.0.0
>Reporter: Hua-Yi Ho
>Assignee: Hua-Yi Ho
>Priority: Minor
> Fix For: 3.0.0
>
>
> Some collection code in the files
> StochasticLoadBalancer.java, AbstractHBaseTool.java, HFileInputFormat.java, 
> Result.java, and WalPlayer.java uses iteration to copy all the data in 
> collections. The iterations can be replaced by Collections.addAll and 
> Arrays.copyOf.





[jira] [Updated] (HBASE-20966) RestoreTool#getTableInfoPath should look for completed snapshot only

2018-07-27 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-20966:
---
Attachment: 20966.v1.txt

> RestoreTool#getTableInfoPath should look for completed snapshot only
> 
>
> Key: HBASE-20966
> URL: https://issues.apache.org/jira/browse/HBASE-20966
> Project: HBase
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Major
> Attachments: 20966.v1.txt
>
>
> [~gubjanos] reported seeing the following error when running backup / restore 
> test on Azure:
> {code}
> 2018-07-25 17:03:56,661|INFO|MainThread|machine.py:167 - run()||GUID=e7de7672-ebfd-402d-8f1f-68e7e8444cb1|org.apache.hadoop.hbase.snapshot.CorruptedSnapshotException: Couldn't read snapshot info from:wasb://hbase3-m30wub1711kond-115...@humbtesting8wua.blob.core.windows.net/user/hbase/backup_loc/backup_1532538064246/default/table_fnfawii1za/.hbase-snapshot/.tmp/.snapshotinfo
> 2018-07-25 17:03:56,661|INFO|MainThread|machine.py:167 - run()||GUID=e7de7672-ebfd-402d-8f1f-68e7e8444cb1|at org.apache.hadoop.hbase.snapshot.SnapshotDescriptionUtils.readSnapshotInfo(SnapshotDescriptionUtils.java:328)
> 2018-07-25 17:03:56,661|INFO|MainThread|machine.py:167 - run()||GUID=e7de7672-ebfd-402d-8f1f-68e7e8444cb1|at org.apache.hadoop.hbase.backup.util.RestoreServerUtil.getTableDesc(RestoreServerUtil.java:237)
> 2018-07-25 17:03:56,662|INFO|MainThread|machine.py:167 - run()||GUID=e7de7672-ebfd-402d-8f1f-68e7e8444cb1|at org.apache.hadoop.hbase.backup.util.RestoreServerUtil.restoreTableAndCreate(RestoreServerUtil.java:351)
> 2018-07-25 17:03:56,662|INFO|MainThread|machine.py:167 - run()||GUID=e7de7672-ebfd-402d-8f1f-68e7e8444cb1|at org.apache.hadoop.hbase.backup.util.RestoreServerUtil.fullRestoreTable(RestoreServerUtil.java:186)
> {code}
> Here is related code in master branch:
> {code}
>   Path getTableInfoPath(TableName tableName) throws IOException {
>     Path tableSnapShotPath = getTableSnapshotPath(backupRootPath, tableName, backupId);
>     Path tableInfoPath = null;
>     // can't build the path directly as the timestamp values are different
>     FileStatus[] snapshots = fs.listStatus(tableSnapShotPath);
> {code}
> In the above code, we don't exclude incomplete snapshots, leading to an exception 
> later when reading snapshot info.
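
One way to picture the exclusion, as a hedged sketch in plain Java rather than the attached patch. It works on directory names as strings so it runs without Hadoop on the classpath; SnapshotDirFilter and completedOnly are hypothetical names, and treating the ".tmp" entry seen in the failing path as the marker of an incomplete snapshot is an assumption drawn from the log above:

```java
import java.util.ArrayList;
import java.util.List;

final class SnapshotDirFilter {
  // Directory name that appears in the failing path above; assumed here to
  // hold snapshots that are still in progress.
  static final String TMP_DIR = ".tmp";

  // Keeps only the entries that are not the in-progress working directory.
  static List<String> completedOnly(List<String> childNames) {
    List<String> kept = new ArrayList<>();
    for (String name : childNames) {
      if (!TMP_DIR.equals(name)) {
        kept.add(name);
      }
    }
    return kept;
  }
}
```

In the real code the same predicate would presumably be applied to the FileStatus entries returned by fs.listStatus before any snapshot info is read.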





[jira] [Updated] (HBASE-20966) RestoreTool#getTableInfoPath should look for completed snapshot only

2018-07-27 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-20966:
---
Status: Patch Available  (was: Open)

> RestoreTool#getTableInfoPath should look for completed snapshot only
> 
>
> Key: HBASE-20966
> URL: https://issues.apache.org/jira/browse/HBASE-20966
> Project: HBase
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Major
> Attachments: 20966.v1.txt
>
>
> [~gubjanos] reported seeing the following error when running backup / restore 
> test on Azure:
> {code}
> 2018-07-25 17:03:56,661|INFO|MainThread|machine.py:167 - run()||GUID=e7de7672-ebfd-402d-8f1f-68e7e8444cb1|org.apache.hadoop.hbase.snapshot.CorruptedSnapshotException: Couldn't read snapshot info from:wasb://hbase3-m30wub1711kond-115...@humbtesting8wua.blob.core.windows.net/user/hbase/backup_loc/backup_1532538064246/default/table_fnfawii1za/.hbase-snapshot/.tmp/.snapshotinfo
> 2018-07-25 17:03:56,661|INFO|MainThread|machine.py:167 - run()||GUID=e7de7672-ebfd-402d-8f1f-68e7e8444cb1|at org.apache.hadoop.hbase.snapshot.SnapshotDescriptionUtils.readSnapshotInfo(SnapshotDescriptionUtils.java:328)
> 2018-07-25 17:03:56,661|INFO|MainThread|machine.py:167 - run()||GUID=e7de7672-ebfd-402d-8f1f-68e7e8444cb1|at org.apache.hadoop.hbase.backup.util.RestoreServerUtil.getTableDesc(RestoreServerUtil.java:237)
> 2018-07-25 17:03:56,662|INFO|MainThread|machine.py:167 - run()||GUID=e7de7672-ebfd-402d-8f1f-68e7e8444cb1|at org.apache.hadoop.hbase.backup.util.RestoreServerUtil.restoreTableAndCreate(RestoreServerUtil.java:351)
> 2018-07-25 17:03:56,662|INFO|MainThread|machine.py:167 - run()||GUID=e7de7672-ebfd-402d-8f1f-68e7e8444cb1|at org.apache.hadoop.hbase.backup.util.RestoreServerUtil.fullRestoreTable(RestoreServerUtil.java:186)
> {code}
> Here is related code in master branch:
> {code}
>   Path getTableInfoPath(TableName tableName) throws IOException {
>     Path tableSnapShotPath = getTableSnapshotPath(backupRootPath, tableName, backupId);
>     Path tableInfoPath = null;
>     // can't build the path directly as the timestamp values are different
>     FileStatus[] snapshots = fs.listStatus(tableSnapShotPath);
> {code}
> In the above code, we don't exclude incomplete snapshots, leading to an exception 
> later when reading snapshot info.





[jira] [Created] (HBASE-20966) RestoreTool#getTableInfoPath should look for completed snapshot only

2018-07-27 Thread Ted Yu (JIRA)
Ted Yu created HBASE-20966:
--

 Summary: RestoreTool#getTableInfoPath should look for completed 
snapshot only
 Key: HBASE-20966
 URL: https://issues.apache.org/jira/browse/HBASE-20966
 Project: HBase
  Issue Type: Bug
Reporter: Ted Yu
Assignee: Ted Yu


[~gubjanos] reported seeing the following error when running backup / restore 
test on Azure:
{code}
2018-07-25 17:03:56,661|INFO|MainThread|machine.py:167 - run()||GUID=e7de7672-ebfd-402d-8f1f-68e7e8444cb1|org.apache.hadoop.hbase.snapshot.CorruptedSnapshotException: Couldn't read snapshot info from:wasb://hbase3-m30wub1711kond-115...@humbtesting8wua.blob.core.windows.net/user/hbase/backup_loc/backup_1532538064246/default/table_fnfawii1za/.hbase-snapshot/.tmp/.snapshotinfo
2018-07-25 17:03:56,661|INFO|MainThread|machine.py:167 - run()||GUID=e7de7672-ebfd-402d-8f1f-68e7e8444cb1|at org.apache.hadoop.hbase.snapshot.SnapshotDescriptionUtils.readSnapshotInfo(SnapshotDescriptionUtils.java:328)
2018-07-25 17:03:56,661|INFO|MainThread|machine.py:167 - run()||GUID=e7de7672-ebfd-402d-8f1f-68e7e8444cb1|at org.apache.hadoop.hbase.backup.util.RestoreServerUtil.getTableDesc(RestoreServerUtil.java:237)
2018-07-25 17:03:56,662|INFO|MainThread|machine.py:167 - run()||GUID=e7de7672-ebfd-402d-8f1f-68e7e8444cb1|at org.apache.hadoop.hbase.backup.util.RestoreServerUtil.restoreTableAndCreate(RestoreServerUtil.java:351)
2018-07-25 17:03:56,662|INFO|MainThread|machine.py:167 - run()||GUID=e7de7672-ebfd-402d-8f1f-68e7e8444cb1|at org.apache.hadoop.hbase.backup.util.RestoreServerUtil.fullRestoreTable(RestoreServerUtil.java:186)
{code}
Here is related code in master branch:
{code}
  Path getTableInfoPath(TableName tableName) throws IOException {
    Path tableSnapShotPath = getTableSnapshotPath(backupRootPath, tableName, backupId);
    Path tableInfoPath = null;

    // can't build the path directly as the timestamp values are different
    FileStatus[] snapshots = fs.listStatus(tableSnapShotPath);
{code}
In the above code, we don't exclude incomplete snapshots, leading to an exception 
later when reading snapshot info.





[jira] [Commented] (HBASE-20893) Data loss if splitting region while ServerCrashProcedure executing

2018-07-27 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16559861#comment-16559861
 ] 

stack commented on HBASE-20893:
---

[~allan163] Yeah, support for rollback at the split procedure level was added 
but rollback invokes rolling back of subprocedures and this does not work if 
subprocedure is unassign.

Yeah, unsupported and CODE-BUG as well as crashed out procedure are scary.

bq. But. since a exception is thrown, the decrease for stateCount never happen.

Let's fix.

Do you have a problem with this patch? It avoids CODE-BUG and skips use of 
rollback. I'm in here because my long-running tests are failing and this looks 
to be the cause.



> Data loss if splitting region while ServerCrashProcedure executing
> --
>
> Key: HBASE-20893
> URL: https://issues.apache.org/jira/browse/HBASE-20893
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 3.0.0, 2.1.0, 2.0.1
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Major
> Fix For: 3.0.0, 2.0.2, 2.2.0, 2.1.1
>
> Attachments: HBASE-20893-branch-2.0.addendum.patch, 
> HBASE-20893.branch-2.0.001.patch, HBASE-20893.branch-2.0.002.patch, 
> HBASE-20893.branch-2.0.003.patch, HBASE-20893.branch-2.0.004.patch, 
> HBASE-20893.branch-2.0.005.patch
>
>
> Similar case as HBASE-20878.





[jira] [Commented] (HBASE-19369) HBase Should use Builder Pattern to Create Log Files while using WAL on Erasure Coding

2018-07-27 Thread Duo Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16559852#comment-16559852
 ] 

Duo Zhang commented on HBASE-19369:
---

As the builder was introduced in hadoop-2.9, introducing a hadoop3-compact 
module does not make sense...

I think the approach is fine. For the test, I think we could make use of 
org.junit.Assume to ignore it if we do not have EC support.

Another possible problem is that we sometimes use HFileSystem instead of the 
original FileSystem instance in our code. Will CommonFSUtils then skip the 
builder pattern even if the HFileSystem is just a wrapper of 
DistributedFileSystem? For WAL I think it is fine as we will not use 
HFileSystem, but is it also safe in other places?

Thanks.

> HBase Should use Builder Pattern to Create Log Files while using WAL on 
> Erasure Coding
> --
>
> Key: HBASE-19369
> URL: https://issues.apache.org/jira/browse/HBASE-19369
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.0.0
>Reporter: Alex Leblang
>Assignee: Mike Drob
>Priority: Major
> Fix For: 3.0.0, 2.2.0
>
> Attachments: HBASE-19369.master.001.patch, 
> HBASE-19369.master.002.patch, HBASE-19369.master.003.patch, 
> HBASE-19369.master.004.patch, HBASE-19369.v10.patch, HBASE-19369.v5.patch, 
> HBASE-19369.v6.patch, HBASE-19369.v7.patch, HBASE-19369.v8.patch, 
> HBASE-19369.v9.patch
>
>
> Right now an HBase instance using the WAL won't function properly in an 
> Erasure Coded environment. We should change the following line to use the 
> hdfs.DistributedFileSystem builder pattern 
> https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/ProtobufLogWriter.java#L92





[jira] [Comment Edited] (HBASE-19369) HBase Should use Builder Pattern to Create Log Files while using WAL on Erasure Coding

2018-07-27 Thread Mike Drob (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16559848#comment-16559848
 ] 

Mike Drob edited comment on HBASE-19369 at 7/27/18 2:37 PM:


v10: addresses checkstyle/whitespace issues, adds another simple test.

[~Apache9] - can you review? I think this one is finally in a good state.


was (Author: mdrob):
v10: addresses checkstyle/whitespace issues, adds another simple test.

[~Apache9] - can you review?

> HBase Should use Builder Pattern to Create Log Files while using WAL on 
> Erasure Coding
> --
>
> Key: HBASE-19369
> URL: https://issues.apache.org/jira/browse/HBASE-19369
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.0.0
>Reporter: Alex Leblang
>Assignee: Mike Drob
>Priority: Major
> Fix For: 3.0.0, 2.2.0
>
> Attachments: HBASE-19369.master.001.patch, 
> HBASE-19369.master.002.patch, HBASE-19369.master.003.patch, 
> HBASE-19369.master.004.patch, HBASE-19369.v10.patch, HBASE-19369.v5.patch, 
> HBASE-19369.v6.patch, HBASE-19369.v7.patch, HBASE-19369.v8.patch, 
> HBASE-19369.v9.patch
>
>
> Right now an HBase instance using the WAL won't function properly in an 
> Erasure Coded environment. We should change the following line to use the 
> hdfs.DistributedFileSystem builder pattern 
> https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/ProtobufLogWriter.java#L92





[jira] [Updated] (HBASE-19369) HBase Should use Builder Pattern to Create Log Files while using WAL on Erasure Coding

2018-07-27 Thread Mike Drob (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-19369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob updated HBASE-19369:
--
Attachment: HBASE-19369.v10.patch

> HBase Should use Builder Pattern to Create Log Files while using WAL on 
> Erasure Coding
> --
>
> Key: HBASE-19369
> URL: https://issues.apache.org/jira/browse/HBASE-19369
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.0.0
>Reporter: Alex Leblang
>Assignee: Mike Drob
>Priority: Major
> Fix For: 3.0.0, 2.2.0
>
> Attachments: HBASE-19369.master.001.patch, 
> HBASE-19369.master.002.patch, HBASE-19369.master.003.patch, 
> HBASE-19369.master.004.patch, HBASE-19369.v10.patch, HBASE-19369.v5.patch, 
> HBASE-19369.v6.patch, HBASE-19369.v7.patch, HBASE-19369.v8.patch, 
> HBASE-19369.v9.patch
>
>
> Right now an HBase instance using the WAL won't function properly in an 
> Erasure Coded environment. We should change the following line to use the 
> hdfs.DistributedFileSystem builder pattern 
> https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/ProtobufLogWriter.java#L92





[jira] [Commented] (HBASE-19369) HBase Should use Builder Pattern to Create Log Files while using WAL on Erasure Coding

2018-07-27 Thread Mike Drob (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16559848#comment-16559848
 ] 

Mike Drob commented on HBASE-19369:
---

v10: addresses checkstyle/whitespace issues, adds another simple test.

[~Apache9] - can you review?

> HBase Should use Builder Pattern to Create Log Files while using WAL on 
> Erasure Coding
> --
>
> Key: HBASE-19369
> URL: https://issues.apache.org/jira/browse/HBASE-19369
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.0.0
>Reporter: Alex Leblang
>Assignee: Mike Drob
>Priority: Major
> Fix For: 3.0.0, 2.2.0
>
> Attachments: HBASE-19369.master.001.patch, 
> HBASE-19369.master.002.patch, HBASE-19369.master.003.patch, 
> HBASE-19369.master.004.patch, HBASE-19369.v10.patch, HBASE-19369.v5.patch, 
> HBASE-19369.v6.patch, HBASE-19369.v7.patch, HBASE-19369.v8.patch, 
> HBASE-19369.v9.patch
>
>
> Right now an HBase instance using the WAL won't function properly in an 
> Erasure Coded environment. We should change the following line to use the 
> hdfs.DistributedFileSystem builder pattern 
> https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/ProtobufLogWriter.java#L92





[jira] [Resolved] (HBASE-20949) Split/Merge table can be executed concurrently with DisableTableProcedure

2018-07-27 Thread Duo Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang resolved HBASE-20949.
---
Resolution: Duplicate

Fixed by HBASE-20939.

> Split/Merge table can be executed concurrently with DisableTableProcedure
> -
>
> Key: HBASE-20949
> URL: https://issues.apache.org/jira/browse/HBASE-20949
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Duo Zhang
>Priority: Major
> Attachments: HBASE-20949-debug.patch
>
>
> The top flaky tests on the dashboard are all because of this.
> TestRestoreSnapshotFromClient
> TestSimpleRegionNormalizerOnCluster
> Theoretically this should not happen, need to dig more.





[jira] [Updated] (HBASE-20939) There will be race when we call suspendIfNotReady and then throw ProcedureSuspendedException

2018-07-27 Thread Duo Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang updated HBASE-20939:
--
  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

Pushed to branch-2.0+.

Thanks [~stack] for reviewing.

> There will be race when we call suspendIfNotReady and then throw 
> ProcedureSuspendedException
> 
>
> Key: HBASE-20939
> URL: https://issues.apache.org/jira/browse/HBASE-20939
> Project: HBase
>  Issue Type: Sub-task
>  Components: amv2
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Critical
> Fix For: 3.0.0, 2.2.0, 2.1.1, 2.0.1
>
> Attachments: HBASE-20939.patch, HBASE-20939.patch
>
>
> This is a very typical usage in our procedure implementations. For example, in 
> AssignProcedure, we call AM.queueAssign and then suspend ourselves to 
> wait until the AM finishes processing our assign request.
> But there could be races. Think of this:
> 1. We call suspendIfNotReady on an event, and it returns true, so we need to 
> wait.
> 2. The event is woken up, and the procedure is added back to the 
> scheduler.
> 3. A worker picks up the procedure and finishes it.
> 4. We finally throw ProcedureSuspendedException, and the ProcedureExecutor 
> suspends us and stores the state in the procedure store.
> So we have a half-done procedure in the procedure store forever... This may 
> cause assertion failures when loading procedures. And maybe the worker cannot 
> finish the procedure, as when suspending we need to restore some state, for 
> example, add something to RootProcedureState. But either way, it will still 
> lead to assertion failures or other unexpected errors.
> And this cannot be fixed by simply adding a lock in the procedure, as most 
> of the work is done in the ProcedureExecutor after we throw 
> ProcedureSuspendedException.
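
The four steps in the description can be replayed deterministically in a small sketch. This is an illustration under stated assumptions, not the HBase procedure framework: the class SuspendRaceSketch, its State enum, and the map standing in for the procedure store are all hypothetical, and the interleaving is forced by running the steps sequentially in the problematic order.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class SuspendRaceSketch {
    enum State { RUNNABLE, WAITING, FINISHED }

    // Replays the problematic ordering in a single thread and returns the
    // final contents of the (hypothetical) procedure store.
    static Map<Long, State> replay() {
        Map<Long, State> store = new HashMap<>();   // stand-in for the procedure store
        List<Long> scheduler = new ArrayList<>();   // stand-in for the scheduler queue
        long pid = 1L;
        store.put(pid, State.RUNNABLE);

        // Step 1: the event is not ready, so the procedure decides it must wait.
        boolean mustWait = true;

        // Step 2: before the suspension is recorded, the event is woken up and
        // the procedure is put back on the scheduler.
        scheduler.add(pid);

        // Step 3: another worker picks the procedure up and finishes it.
        long picked = scheduler.remove(0);
        store.put(picked, State.FINISHED);

        // Step 4: the original worker only now handles the suspension and
        // persists WAITING, overwriting the finished state.
        if (mustWait) {
            store.put(pid, State.WAITING);
        }
        return store;
    }
}
```

The store ends up recording the procedure as WAITING even though it has already finished, which is the "half-done procedure" the description warns about. The sketch shows only the symptom; it does not attempt to reproduce the fix that was committed.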





[jira] [Commented] (HBASE-20949) Split/Merge table can be executed concurrently with DisableTableProcedure

2018-07-27 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16559761#comment-16559761
 ] 

Hudson commented on HBASE-20949:


Results for branch master
[build #410 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/master/410/]: (x) 
*{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/410//General_Nightly_Build_Report/]




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/410//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/410//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> Split/Merge table can be executed concurrently with DisableTableProcedure
> -
>
> Key: HBASE-20949
> URL: https://issues.apache.org/jira/browse/HBASE-20949
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Duo Zhang
>Priority: Major
> Attachments: HBASE-20949-debug.patch
>
>
> The top flaky tests on the dashboard are all because of this.
> TestRestoreSnapshotFromClient
> TestSimpleRegionNormalizerOnCluster
> Theoretically this should not happen, need to dig more.





[jira] [Commented] (HBASE-20927) RSGroupAdminEndpoint doesn't handle clearing dead servers if they are not processed yet.

2018-07-27 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16559762#comment-16559762
 ] 

Hudson commented on HBASE-20927:


Results for branch master
[build #410 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/master/410/]: (x) 
*{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/410//General_Nightly_Build_Report/]




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/410//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/410//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> RSGroupAdminEndpoint doesn't handle clearing dead servers if they are not 
> processed yet.
> 
>
> Key: HBASE-20927
> URL: https://issues.apache.org/jira/browse/HBASE-20927
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.1.0
>Reporter: Sergey Soldatov
>Assignee: Sergey Soldatov
>Priority: Major
> Fix For: 3.0.0, 2.2.0
>
> Attachments: HBASE-20927-master.patch, HBASE-20927.master.002.patch
>
>
> Admin.clearDeadServers is supposed to return the list of servers that were 
> not cleared. But if RSGroupAdminEndpoint is set, the ConstraintException is 
> thrown:
> {noformat}
> Caused by: 
> org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(org.apache.hadoop.hbase.constraint.ConstraintException):
>  org.apache.hadoop.hbase.constraint.ConstraintException: The set of servers 
> to remove cannot be null or empty.
>   at 
> org.apache.hadoop.hbase.rsgroup.RSGroupAdminServer.removeServers(RSGroupAdminServer.java:573)
>   at 
> org.apache.hadoop.hbase.rsgroup.RSGroupAdminEndpoint.postClearDeadServers(RSGroupAdminEndpoint.java:519)
>   at 
> org.apache.hadoop.hbase.master.MasterCoprocessorHost$133.call(MasterCoprocessorHost.java:1607)
>   at 
> org.apache.hadoop.hbase.master.MasterCoprocessorHost$133.call(MasterCoprocessorHost.java:1604)
>   at 
> org.apache.hadoop.hbase.coprocessor.CoprocessorHost$ObserverOperationWithoutResult.callObserver(CoprocessorHost.java:540)
>   at 
> org.apache.hadoop.hbase.coprocessor.CoprocessorHost.execOperation(CoprocessorHost.java:614)
>   at 
> org.apache.hadoop.hbase.master.MasterCoprocessorHost.postClearDeadServers(MasterCoprocessorHost.java:1604)
>   at 
> org.apache.hadoop.hbase.master.MasterRpcServices.clearDeadServers(MasterRpcServices.java:2231)
>   at 
> org.apache.hadoop.hbase.shaded.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java)
>   at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:409)
>   at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:131)
>   at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)
>   at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304)
> {noformat}
> That happens because in postClearDeadServers it calls 
> groupAdminServer.removeServers(clearedServer) even if the clearedServer is 
> empty.
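
The failure mode described above can be sketched with a simple guard. This is a minimal illustration, not the patch attached to the issue: ClearDeadServersSketch and both methods are hypothetical stand-ins that use plain strings for server names, and the precondition only mirrors the message seen in the stack trace.

```java
import java.util.Set;

class ClearDeadServersSketch {
    // Stand-in for a removal call that rejects null or empty input, mirroring
    // the message in the stack trace above.
    static void removeServers(Set<String> servers) {
        if (servers == null || servers.isEmpty()) {
            throw new IllegalArgumentException(
                "The set of servers to remove cannot be null or empty.");
        }
        // Actual removal would happen here.
    }

    // Stand-in for the post hook: skip the removal call when nothing was cleared.
    // Returns true if removeServers was invoked.
    static boolean postClearDeadServers(Set<String> clearedServers) {
        if (clearedServers == null || clearedServers.isEmpty()) {
            return false;
        }
        removeServers(clearedServers);
        return true;
    }
}
```

With the guard in place, an empty result from clearing dead servers no longer reaches the precondition check, so the caller can still receive the list of servers that were not cleared.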





[jira] [Commented] (HBASE-20921) Possible NPE in ReopenTableRegionsProcedure

2018-07-27 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16559763#comment-16559763
 ] 

Hudson commented on HBASE-20921:


Results for branch master
[build #410 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/master/410/]: (x) 
*{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/410//General_Nightly_Build_Report/]




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/410//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/410//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> Possible NPE in ReopenTableRegionsProcedure
> ---
>
> Key: HBASE-20921
> URL: https://issues.apache.org/jira/browse/HBASE-20921
> Project: HBase
>  Issue Type: Sub-task
>  Components: amv2
>Affects Versions: 3.0.0, 2.1.0, 2.0.2
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Major
> Fix For: 3.0.0, 2.0.2, 2.1.1
>
> Attachments: HBASE-20921.branch-2.0.001.patch, 
> HBASE-20921.branch-2.0.002.patch
>
>
> After HBASE-20752, we issue a ReopenTableRegionsProcedure in 
> ModifyTableProcedure to ensure all regions are reopened.
> But ModifyTableProcedure and ReopenTableRegionsProcedure do not hold the 
> lock (why?), so there is a chance that while ModifyTableProcedure is 
> executing, a merge/split procedure is executed at the same time.
> So, when ReopenTableRegionsProcedure reaches the 
> "REOPEN_TABLE_REGIONS_CONFIRM_REOPENED" state, some of the persisted regions 
> to check no longer exist, and an NPE is thrown.
> {code}
> 2018-07-18 01:38:57,528 INFO  [PEWorker-9] procedure2.ProcedureExecutor(1246): Finished pid=6110, state=SUCCESS; MergeTableRegionsProcedure table=IntegrationTestBigLinkedList, regions=[845d286231eb01b71aeaa17b0e30058d, 4a46ab0918c99cada72d5336ad83a828], forcibly=false in 10.8610sec
> 2018-07-18 01:38:57,530 ERROR [PEWorker-8] procedure2.ProcedureExecutor(1478): CODE-BUG: Uncaught runtime exception: pid=5974, ppid=5973, state=RUNNABLE:REOPEN_TABLE_REGIONS_CONFIRM_REOPENED; ReopenTableRegionsProcedure table=IntegrationTestBigLinkedList
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hbase.master.assignment.RegionStates.checkReopened(RegionStates.java:651)
> at 
> java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
> at 
> java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1382)
> at 
> java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
> at 
> java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
> at 
> java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
> at 
> java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
> at 
> java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499)
> at 
> org.apache.hadoop.hbase.master.procedure.ReopenTableRegionsProcedure.executeFromState(ReopenTableRegionsProcedure.java:102)
> at 
> org.apache.hadoop.hbase.master.procedure.ReopenTableRegionsProcedure.executeFromState(ReopenTableRegionsProcedure.java:45)
> at 
> org.apache.hadoop.hbase.procedure2.StateMachineProcedure.execute(StateMachineProcedure.java:184)
> at 
> org.apache.hadoop.hbase.procedure2.Procedure.doExecute(Procedure.java:850)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1453)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1221)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$800(ProcedureExecutor.java:75)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1741)
> {code}
> I think we need to refresh the region list of the table at the 
> "REOPEN_TABLE_REGIONS_CONFIRM_REOPENED" state. Regions that have been 
> merged or split do not need to be checked, since we can be sure that they 
> were opened after we changed the table descriptor.
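
The proposal above (skip regions that no longer exist instead of dereferencing them) can be sketched as follows. This is an illustration only: ReopenCheckSketch, the use of region names as strings, and the map of "already reopened" flags are hypothetical simplifications, not the RegionStates API.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class ReopenCheckSketch {
    // persistedRegions: region names recorded when the procedure started.
    // reopenedByRegion: current in-memory view; a region that was merged or
    // split away since then has no entry here.
    // Returns the regions that still exist and have not been reopened yet.
    static List<String> stillNeedReopen(List<String> persistedRegions,
                                        Map<String, Boolean> reopenedByRegion) {
        return persistedRegions.stream()
            // Skip regions that vanished due to a concurrent merge/split,
            // which is where the unguarded lookup would hit a null.
            .filter(reopenedByRegion::containsKey)
            .filter(region -> !reopenedByRegion.get(region))
            .collect(Collectors.toList());
    }
}
```

A region missing from the current view is treated as needing no further check, on the reasoning given in the description that its replacement regions were opened after the table descriptor changed.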





[jira] [Commented] (HBASE-20939) There will be race when we call suspendIfNotReady and then throw ProcedureSuspendedException

2018-07-27 Thread Duo Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16559749#comment-16559749
 ] 

Duo Zhang commented on HBASE-20939:
---

Seems to have worked. TestRestoreSnapshotFromClient and 
TestSimpleRegionNormalizerOnCluster both passed in the three most recent runs.

Let me commit to branches other than master.

> There will be race when we call suspendIfNotReady and then throw 
> ProcedureSuspendedException
> 
>
> Key: HBASE-20939
> URL: https://issues.apache.org/jira/browse/HBASE-20939
> Project: HBase
>  Issue Type: Sub-task
>  Components: amv2
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Critical
> Fix For: 3.0.0, 2.0.1, 2.2.0, 2.1.1
>
> Attachments: HBASE-20939.patch, HBASE-20939.patch
>
>
> This is a very typical usage in our procedure implementations. For example, in 
> AssignProcedure, we call AM.queueAssign and then suspend ourselves to 
> wait until the AM finishes processing our assign request.
> But there could be races. Think of this:
> 1. We call suspendIfNotReady on an event, and it returns true, so we need to 
> wait.
> 2. The event is woken up, and the procedure is added back to the 
> scheduler.
> 3. A worker picks up the procedure and finishes it.
> 4. We finally throw ProcedureSuspendedException, and the ProcedureExecutor 
> suspends us and stores the state in the procedure store.
> So we have a half-done procedure in the procedure store forever... This may 
> cause assertion failures when loading procedures. And maybe the worker cannot 
> finish the procedure, as when suspending we need to restore some state, for 
> example, add something to RootProcedureState. But either way, it will still 
> lead to assertion failures or other unexpected errors.
> And this cannot be fixed by simply adding a lock in the procedure, as most 
> of the work is done in the ProcedureExecutor after we throw 
> ProcedureSuspendedException.





[jira] [Commented] (HBASE-20886) [Auth] Support keytab login in hbase client

2018-07-27 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16559665#comment-16559665
 ] 

Hadoop QA commented on HBASE-20886:
---

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
11s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
23s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
46s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
13s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 
14s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} refguide {color} | {color:blue}  5m 
16s{color} | {color:blue} branch has no errors when building the reference 
guide. See footer for rendered docs, which you should manually inspect. {color} 
|
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
35s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: . {color} 
|
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
36s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  3m 
39s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
12s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  7m 
18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 
17s{color} | {color:green} root: The patch generated 0 new + 24 unchanged - 2 
fixed = 24 total (was 26) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:blue}0{color} | {color:blue} refguide {color} | {color:blue}  5m  
3s{color} | {color:blue} patch has no errors when building the reference guide. 
See footer for rendered docs, which you should manually inspect. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
32s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
10m 10s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.7.4 or 3.0.0. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: . {color} 
|
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  3m 
42s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}182m 
17s{color} | {color:green} root in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  1m 
47s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}259m  1s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | 

[jira] [Comment Edited] (HBASE-20945) HBase JMX - timestamp of last Major Compaction (started, completed successfully)

2018-07-27 Thread Hari Sekhon (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16559548#comment-16559548
 ] 

Hari Sekhon edited comment on HBASE-20945 at 7/27/18 11:34 AM:
---

Yes but the JMX API already exposes the following information per region in 
RegionServer JMX:
 * compactionsCompletedCount
 * numBytesCompactedCount
 * numFilesCompactedCount

and 12 other metrics per region next to them.

Adding one or two more for timestamp of last compaction started and completed 
seems to be in line with what has already been done and would go from 15 to 17 
metrics per region, so shouldn't break the bank.

For HMaster you probably only want a summary JMX field per table similar to 
what is shown in the UI under table.jsp for whether a table is currently 
compacting:
{code:java}
Table Attributes
Attribute Name  Value   Description
Enabled         true    Is the table enabled
Compaction      MINOR   Is the table compacting{code}
Adding a field next to Compaction showing Time of Last Compaction is in line 
with what has already been done.


was (Author: harisekhon):
Yes but the JMX API already exposes the following information per region in 
RegionServer JMX:
 * compactionsCompletedCount
 * numBytesCompactedCount
 * numFilesCompactedCount

and 12 other metrics per region next to them.

Adding one or two more for fields for timestamp of last compaction started and 
completed seems to be in line with what has already been done.

For HMaster you probably only want a summary JMX field per table similar to 
what is shown in the UI under table.jsp for whether a table is currently 
compacting:
{code:java}
Table Attributes
Attribute Name  Value   Description
Enabled         true    Is the table enabled
Compaction      MINOR   Is the table compacting{code}
Adding a field next to Compaction showing Time of Last Compaction is in line 
with what has already been done.

> HBase JMX - timestamp of last Major Compaction (started, completed 
> successfully)
> 
>
> Key: HBASE-20945
> URL: https://issues.apache.org/jira/browse/HBASE-20945
> Project: HBase
>  Issue Type: Improvement
>  Components: API, Compaction, master, metrics, monitoring, 
> regionserver, tooling
>Reporter: Hari Sekhon
>Priority: Major
>
> Request that the timestamp of the last major compaction be stored in JMX API 
> available at /jmx.
> Major Compactions may be disabled to better control scheduling to trigger off 
> peak (this is an old school recommendation), but there is a risk that the 
> major compaction doesn't happen in that case. Also people may trigger major 
> compactions manually and it's hard to see that (I've looked at graphs of 
> storefile counts where it's not obvious but I can infer it from spikes in 
> compaction queue length). Storing the last timestamps would allow all sorts 
> of scripting checks against the API much more simply than trying to infer it 
> from changes in graphs. Also with recent changes to allow compactions to be 
> cancelled in HBASE-6028, the queue length doesn't tell the whole story as the 
> compaction may not have happened if it got cancelled, so the compaction queue 
> spike will be there even though major compaction did not in fact 
> happen/complete.
> Since major compactions may take hours and can also now be cancelled in the 
> latest versions of HBase, we need a few different fields added to JMX:
>  * HBase Master JMX:
>  ** timestamp that last major compaction was triggered, either manually via 
> major_compact command or via schedule
>  ** timestamp that last major compaction completed successfully (since 
> timestamp above could have been started and then later cancelled manually if 
> load was too high)
>  * HBase Regionserver JMX:
>  ** timestamp per region that last major compaction was triggered (there are 
> already compactionsCompletedCount, numBytesCompactedCount and 
> numFilesCompactedCount so it makes sense to add this next to those for each 
> region)
>  ** timestamp per region that last major compaction completed successfully
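
The kind of scripted check the request has in mind could look like the sketch below. It assumes the two requested timestamps are available as epoch milliseconds; the class CompactionCheckSketch, the Status values, and the parameter names are hypothetical, since the metrics do not exist yet and their names are not specified in the request.

```java
import java.time.Duration;

class CompactionCheckSketch {
    enum Status { OK, OVERDUE, STARTED_NOT_COMPLETED, NEVER_COMPLETED }

    // lastStartedMs / lastCompletedMs: hypothetical epoch-millisecond values
    // for "last major compaction triggered" and "last major compaction
    // completed successfully"; 0 means no value has been recorded.
    static Status check(long lastStartedMs, long lastCompletedMs,
                        long nowMs, Duration maxAge) {
        if (lastCompletedMs <= 0) {
            return Status.NEVER_COMPLETED;
        }
        if (nowMs - lastCompletedMs > maxAge.toMillis()) {
            return Status.OVERDUE;
        }
        // A start newer than the last completion means a compaction is still
        // running or was cancelled before it finished.
        if (lastStartedMs > lastCompletedMs) {
            return Status.STARTED_NOT_COMPLETED;
        }
        return Status.OK;
    }
}
```

Comparing the two timestamps is what distinguishes a compaction that was triggered but cancelled from one that completed, which the compaction queue length alone cannot show.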





[jira] [Commented] (HBASE-20921) Possible NPE in ReopenTableRegionsProcedure

2018-07-27 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16559591#comment-16559591
 ] 

Hudson commented on HBASE-20921:


Results for branch branch-2
[build #1032 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1032/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1032//General_Nightly_Build_Report/]




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1032//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1032//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> Possible NPE in ReopenTableRegionsProcedure
> ---
>
> Key: HBASE-20921
> URL: https://issues.apache.org/jira/browse/HBASE-20921
> Project: HBase
>  Issue Type: Sub-task
>  Components: amv2
>Affects Versions: 3.0.0, 2.1.0, 2.0.2
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Major
> Fix For: 3.0.0, 2.0.2, 2.1.1
>
> Attachments: HBASE-20921.branch-2.0.001.patch, 
> HBASE-20921.branch-2.0.002.patch
>
>
> After HBASE-20752, we issue a ReopenTableRegionsProcedure in 
> ModifyTableProcedure to ensure all regions are reopened.
> But ModifyTableProcedure and ReopenTableRegionsProcedure do not hold the 
> lock (why?), so there is a chance that while ModifyTableProcedure is 
> executing, a merge/split procedure is executed at the same time.
> So, when ReopenTableRegionsProcedure reaches the 
> "REOPEN_TABLE_REGIONS_CONFIRM_REOPENED" state, some of the persisted regions 
> to check no longer exist, and an NPE is thrown.
> {code}
> 2018-07-18 01:38:57,528 INFO  [PEWorker-9] procedure2.ProcedureExecutor(1246): Finished pid=6110, state=SUCCESS; MergeTableRegionsProcedure table=IntegrationTestBigLinkedList, regions=[845d286231eb01b71aeaa17b0e30058d, 4a46ab0918c99cada72d5336ad83a828], forcibly=false in 10.8610sec
> 2018-07-18 01:38:57,530 ERROR [PEWorker-8] procedure2.ProcedureExecutor(1478): CODE-BUG: Uncaught runtime exception: pid=5974, ppid=5973, state=RUNNABLE:REOPEN_TABLE_REGIONS_CONFIRM_REOPENED; ReopenTableRegionsProcedure table=IntegrationTestBigLinkedList
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hbase.master.assignment.RegionStates.checkReopened(RegionStates.java:651)
> at 
> java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
> at 
> java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1382)
> at 
> java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
> at 
> java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
> at 
> java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
> at 
> java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
> at 
> java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499)
> at 
> org.apache.hadoop.hbase.master.procedure.ReopenTableRegionsProcedure.executeFromState(ReopenTableRegionsProcedure.java:102)
> at 
> org.apache.hadoop.hbase.master.procedure.ReopenTableRegionsProcedure.executeFromState(ReopenTableRegionsProcedure.java:45)
> at 
> org.apache.hadoop.hbase.procedure2.StateMachineProcedure.execute(StateMachineProcedure.java:184)
> at 
> org.apache.hadoop.hbase.procedure2.Procedure.doExecute(Procedure.java:850)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1453)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1221)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$800(ProcedureExecutor.java:75)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1741)
> {code}
> I think we need to refresh the region list of the table at the 
> "REOPEN_TABLE_REGIONS_CONFIRM_REOPENED" state. Regions that have been 
> merged or split do not need to be checked, since we can be sure that they 
> were opened after we changed the table descriptor.
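
The proposal in the description can be illustrated with a minimal, self-contained sketch. It is not HBase code: the class ReopenCheckSketch, the RegionNode type, and the stillToReopen method are hypothetical names invented for illustration, and the real RegionStates API may look quite different. The sketch only shows the idea of skipping persisted regions that no longer exist (for example, because they were merged or split) instead of dereferencing a null entry.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; none of these names come from HBase.
final class ReopenCheckSketch {

    /** Stand-in for the master's in-memory view of a single region. */
    static final class RegionNode {
        final String name;
        final boolean reopened;

        RegionNode(String name, boolean reopened) {
            this.name = name;
            this.reopened = reopened;
        }
    }

    /**
     * Returns the persisted regions that still have to be reopened.
     * Regions missing from the live map (assumed to have been merged or
     * split away) are skipped rather than causing a NullPointerException.
     */
    static List<String> stillToReopen(List<String> persisted,
                                      Map<String, RegionNode> live) {
        List<String> pending = new ArrayList<>();
        for (String name : persisted) {
            RegionNode node = live.get(name);
            if (node == null) {
                // The region no longer exists; nothing to confirm.
                continue;
            }
            if (!node.reopened) {
                pending.add(name);
            }
        }
        return pending;
    }

    public static void main(String[] args) {
        Map<String, RegionNode> live = new HashMap<>();
        live.put("r1", new RegionNode("r1", true));
        live.put("r2", new RegionNode("r2", false));
        // "r3" was persisted earlier but has since been merged away.
        List<String> persisted = Arrays.asList("r1", "r2", "r3");
        System.out.println(stillToReopen(persisted, live));
    }
}
```

Whether skipping a missing region is safe depends on the assumption stated in the description, namely that regions created by a merge or split are opened with the updated table descriptor.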





[jira] [Commented] (HBASE-20921) Possible NPE in ReopenTableRegionsProcedure

2018-07-27 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16559574#comment-16559574
 ] 

Hudson commented on HBASE-20921:


Results for branch branch-2.1
[build #110 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/110/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/110//General_Nightly_Build_Report/]




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/110//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/110//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> Possible NPE in ReopenTableRegionsProcedure
> ---
>
> Key: HBASE-20921
> URL: https://issues.apache.org/jira/browse/HBASE-20921
> Project: HBase
>  Issue Type: Sub-task
>  Components: amv2
>Affects Versions: 3.0.0, 2.1.0, 2.0.2
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Major
> Fix For: 3.0.0, 2.0.2, 2.1.1
>
> Attachments: HBASE-20921.branch-2.0.001.patch, 
> HBASE-20921.branch-2.0.002.patch
>
>
> After HBASE-20752, we issue a ReopenTableRegionsProcedure in 
> ModifyTableProcedure to ensure all regions are reopened.
> But ModifyTableProcedure and ReopenTableRegionsProcedure do not hold the 
> lock (why?), so while ModifyTableProcedure is executing, a merge/split 
> procedure can run at the same time.
> So when ReopenTableRegionsProcedure reaches the 
> "REOPEN_TABLE_REGIONS_CONFIRM_REOPENED" state, some of the persisted regions 
> to check no longer exist, and an NPE is thrown.
> {code}
> 2018-07-18 01:38:57,528 INFO  [PEWorker-9] procedure2.ProcedureExecutor(1246): Finished pid=6110, state=SUCCESS; MergeTableRegionsProcedure table=IntegrationTestBigLinkedList, regions=[845d286231eb01b71aeaa17b0e30058d, 4a46ab0918c99cada72d5336ad83a828], forcibly=false in 10.8610sec
> 2018-07-18 01:38:57,530 ERROR [PEWorker-8] procedure2.ProcedureExecutor(1478): CODE-BUG: Uncaught runtime exception: pid=5974, ppid=5973, state=RUNNABLE:REOPEN_TABLE_REGIONS_CONFIRM_REOPENED; ReopenTableRegionsProcedure table=IntegrationTestBigLinkedList
> java.lang.NullPointerException
>     at org.apache.hadoop.hbase.master.assignment.RegionStates.checkReopened(RegionStates.java:651)
>     at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
>     at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1382)
>     at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
>     at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
>     at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
>     at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
>     at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499)
>     at org.apache.hadoop.hbase.master.procedure.ReopenTableRegionsProcedure.executeFromState(ReopenTableRegionsProcedure.java:102)
>     at org.apache.hadoop.hbase.master.procedure.ReopenTableRegionsProcedure.executeFromState(ReopenTableRegionsProcedure.java:45)
>     at org.apache.hadoop.hbase.procedure2.StateMachineProcedure.execute(StateMachineProcedure.java:184)
>     at org.apache.hadoop.hbase.procedure2.Procedure.doExecute(Procedure.java:850)
>     at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1453)
>     at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1221)
>     at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$800(ProcedureExecutor.java:75)
>     at org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1741)
> {code}
> I think we need to refresh the region list of the table at the 
> "REOPEN_TABLE_REGIONS_CONFIRM_REOPENED" state. Regions that have been 
> merged or split do not need to be checked, since we can be sure that they 
> were opened after we changed the table descriptor.




