[jira] [Commented] (HBASE-20877) Hbase-1.2.0 OldWals age getting filled and not purged by Hmaster

2018-07-12 Thread Reid Chan (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16541233#comment-16541233
 ] 

Reid Chan commented on HBASE-20877:
---

I'm not familiar with Solr, but it makes much more sense to use the current WALs 
under {{/hbase/WALs}} rather than the already-cleaned WALs under 
{{/hbase/oldWALs}}. At least, you can start {{hdfs rm}} from the oldest subdir 
under {{/hbase/oldWALs}} if you have any worries.
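
For instance, a minimal sketch of that manual cleanup (the subdir name is just a 
placeholder; list the directory first and pick the oldest entries yourself):
{noformat}
# see what is sitting under oldWALs and how old it is
hdfs dfs -ls /hbase/oldWALs
# remove one of the oldest subdirs, bypassing the HDFS trash
hdfs dfs -rm -r -skipTrash /hbase/oldWALs/<oldest-subdir>
{noformat}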

Pardon me if I'm wrong.

Or try this way:
{quote}apply that patch manually
{quote}
 

> Hbase-1.2.0 OldWals age getting filled and not purged by Hmaster
> 
>
> Key: HBASE-20877
> URL: https://issues.apache.org/jira/browse/HBASE-20877
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Affects Versions: 1.2.0
>Reporter: Manjeet Singh
>Priority: Major
>
> HBase version 1.2.0: oldWALs are getting filled up, as shown below
> 7.2 K 21.5 K /hbase/.hbase-snapshot
> 0 0 /hbase/.tmp
> 0 0 /hbase/MasterProcWALs
> 18.3 G 60.2 G /hbase/WALs
> 28.7 G 86.1 G /hbase/archive
> 0 0 /hbase/corrupt
> 1.7 T 5.2 T /hbase/data
> 42 126 /hbase/hbase.id
> 7 21 /hbase/hbase.version
> 7.2 T 21.6 T /hbase/oldWALs
>  
> It's not getting purged by HMaster, although oldWALs are supposed to be cleaned 
> by a master background chore. HBASE-20352 (for the 1.x versions) was created to 
> speed up cleaning of oldWALs, but in our case it's not happening.
> hbase.master.logcleaner.ttl is 1 minute



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20877) Hbase-1.2.0 OldWals age getting filled and not purged by Hmaster

2018-07-12 Thread Reid Chan (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16541236#comment-16541236
 ] 

Reid Chan commented on HBASE-20877:
---

Actual data are stored in the {{/hbase/data}} directory.

 

WALs are used for replay when a regionserver crashes, and oldWALs are no longer 
needed for replay; that's why they are being cleaned. The reason {{oldWALs}} grows 
so huge is that its cleaning speed is much slower than its producing speed, and 
this is the background for HBASE-18309.
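
For reference, the relevant knobs look like the sketch below; the TTL property is 
standard, while the cleaner thread-count property is what HBASE-18309 introduced, 
so treat its name as an assumption to verify against your version:
{code:xml}
<!-- hbase-site.xml: keep archived WALs this long (milliseconds) -->
<property>
  <name>hbase.master.logcleaner.ttl</name>
  <value>600000</value>
</property>
<!-- assumed from HBASE-18309: more threads cleaning oldWALs -->
<property>
  <name>hbase.oldwals.cleaner.thread.size</name>
  <value>4</value>
</property>
{code}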

> Hbase-1.2.0 OldWals age getting filled and not purged by Hmaster
> 
>
> Key: HBASE-20877
> URL: https://issues.apache.org/jira/browse/HBASE-20877
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Affects Versions: 1.2.0
>Reporter: Manjeet Singh
>Priority: Major
>
> HBase version 1.2.0: oldWALs are getting filled up, as shown below
> 7.2 K 21.5 K /hbase/.hbase-snapshot
> 0 0 /hbase/.tmp
> 0 0 /hbase/MasterProcWALs
> 18.3 G 60.2 G /hbase/WALs
> 28.7 G 86.1 G /hbase/archive
> 0 0 /hbase/corrupt
> 1.7 T 5.2 T /hbase/data
> 42 126 /hbase/hbase.id
> 7 21 /hbase/hbase.version
> 7.2 T 21.6 T /hbase/oldWALs
>  
> It's not getting purged by HMaster, although oldWALs are supposed to be cleaned 
> by a master background chore. HBASE-20352 (for the 1.x versions) was created to 
> speed up cleaning of oldWALs, but in our case it's not happening.
> hbase.master.logcleaner.ttl is 1 minute



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20846) Table's shared lock is not held by sub-procedures after master restart

2018-07-12 Thread Duo Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang updated HBASE-20846:
--
Status: Open  (was: Patch Available)

> Table's shared lock is not held by sub-procedures after master restart
> --
>
> Key: HBASE-20846
> URL: https://issues.apache.org/jira/browse/HBASE-20846
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.1.0
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Major
> Fix For: 3.0.0, 2.0.2, 2.1.1
>
> Attachments: HBASE-20846.branch-2.0.002.patch, 
> HBASE-20846.branch-2.0.patch
>
>
> Found this one when investigating a ModifyTableProcedure that got stuck while 
> there was a MoveRegionProcedure going on after master restart.
> Though this issue can be solved by HBASE-20752, I discovered something else.
> Before a MoveRegionProcedure can execute, it will hold the table's shared 
> lock. So, when an UnassignProcedure is spawned, it will not check the 
> table's shared lock, since it is sure that its parent (MoveRegionProcedure) has 
> acquired the table's lock.
> {code:java}
> // If there is parent procedure, it would have already taken xlock, so no need to take
> // shared lock here. Otherwise, take shared lock.
> if (!procedure.hasParent()
>     && waitTableQueueSharedLock(procedure, table) == null) {
>   return true;
> }
> {code}
> But it is not the case when the master is restarted. The child 
> procedure (UnassignProcedure) will be executed first after restart. Though it 
> has a parent (MoveRegionProcedure), apparently the parent didn't hold the 
> table's lock.
> So it began to execute without holding the table's shared lock, and a 
> ModifyTableProcedure could acquire the table's exclusive lock and execute at the 
> same time, which is not possible if the master was not restarted.
> This would cause a stall before HBASE-20752. But since HBASE-20752 is fixed, 
> I wrote a simple UT to reproduce this case.
> I think we don't have to check the parent for the table's shared lock. It is a 
> shared lock, right? I think we can acquire it every time we need it.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20865) CreateTableProcedure is stuck in retry loop in CREATE_TABLE_WRITE_FS_LAYOUT state

2018-07-12 Thread Duo Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16541237#comment-16541237
 ] 

Duo Zhang commented on HBASE-20865:
---

So the problem is that createFsLayout will fail if there are half-written 
dirs/files? The patch seems fine.

+1.
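
For context, the gist of making that step retry-safe is wiping any half-written 
table dir before writing again; a minimal sketch of the idea (not the actual 
patch; {{rootDir}}, {{conf}} and {{tableName}} are assumed to be in scope):
{code:java}
// Sketch: make CREATE_TABLE_WRITE_FS_LAYOUT idempotent by removing any
// half-written table dir left behind by a previous failed attempt.
Path tableDir = FSUtils.getTableDir(rootDir, tableName);
FileSystem fs = tableDir.getFileSystem(conf);
if (fs.exists(tableDir) && !fs.delete(tableDir, true)) {
  throw new IOException("Couldn't delete half-written dir " + tableDir);
}
// ...then write the table descriptor and region dirs from scratch
{code}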

> CreateTableProcedure is stuck in retry loop in CREATE_TABLE_WRITE_FS_LAYOUT 
> state
> -
>
> Key: HBASE-20865
> URL: https://issues.apache.org/jira/browse/HBASE-20865
> Project: HBase
>  Issue Type: Bug
>  Components: amv2
>Reporter: Toshihiro Suzuki
>Assignee: Toshihiro Suzuki
>Priority: Major
> Attachments: HBASE-20865.master.001.patch
>
>
> Similar to HBASE-20616, CreateTableProcedure gets stuck in retry loop in 
> CREATE_TABLE_WRITE_FS_LAYOUT state when writing HDFS fails.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20866) HBase 1.x scan performance degradation compared to 0.98 version

2018-07-12 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16541247#comment-16541247
 ] 

Hadoop QA commented on HBASE-20866:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
16s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} branch-1.3 Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
24s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
36s{color} | {color:green} branch-1.3 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
50s{color} | {color:green} branch-1.3 passed with JDK v1.8.0_172 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
55s{color} | {color:green} branch-1.3 passed with JDK v1.7.0_181 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
48s{color} | {color:green} branch-1.3 passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  2m 
32s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
46s{color} | {color:green} branch-1.3 passed with JDK v1.8.0_172 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
59s{color} | {color:green} branch-1.3 passed with JDK v1.7.0_181 {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
12s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
47s{color} | {color:green} the patch passed with JDK v1.8.0_172 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed with JDK v1.7.0_181 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
29s{color} | {color:green} hbase-client: The patch generated 0 new + 52 
unchanged - 1 fixed = 52 total (was 53) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m 
19s{color} | {color:red} hbase-server: The patch generated 3 new + 314 
unchanged - 3 fixed = 317 total (was 317) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  2m 
31s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green}  
6m 15s{color} | {color:green} Patch does not cause any errors with Hadoop 2.4.1 
2.5.2 2.6.5 2.7.4. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
41s{color} | {color:green} the patch passed with JDK v1.8.0_172 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed with JDK v1.7.0_181 {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  1m 
52s{color} | {color:green} hbase-client in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 87m 45s{color} 
| {color:red} h

[jira] [Commented] (HBASE-20865) CreateTableProcedure is stuck in retry loop in CREATE_TABLE_WRITE_FS_LAYOUT state

2018-07-12 Thread Toshihiro Suzuki (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16541253#comment-16541253
 ] 

Toshihiro Suzuki commented on HBASE-20865:
--

{quote}
So the problem is that createFsLayout will fail if there are half-written 
dirs/files?
{quote}
Yes. [~Apache9]

> CreateTableProcedure is stuck in retry loop in CREATE_TABLE_WRITE_FS_LAYOUT 
> state
> -
>
> Key: HBASE-20865
> URL: https://issues.apache.org/jira/browse/HBASE-20865
> Project: HBase
>  Issue Type: Bug
>  Components: amv2
>Reporter: Toshihiro Suzuki
>Assignee: Toshihiro Suzuki
>Priority: Major
> Attachments: HBASE-20865.master.001.patch
>
>
> Similar to HBASE-20616, CreateTableProcedure gets stuck in retry loop in 
> CREATE_TABLE_WRITE_FS_LAYOUT state when writing HDFS fails.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-19572) RegionMover should use the configured default port number and not the one from HConstants

2018-07-12 Thread Reid Chan (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16541259#comment-16541259
 ] 

Reid Chan commented on HBASE-19572:
---

I just think your patch could do much better if you got to know the 
{{AbstractHBaseTool}} framework better, especially the usage of {{conf}}; 
there is redundant code.
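
To illustrate the pattern being referred to, a hedged sketch of an 
{{AbstractHBaseTool}} subclass (the class name and option handling are 
hypothetical; the point is that the inherited {{conf}} is already initialized by 
the framework):
{code:java}
public class MyRegionTool extends AbstractHBaseTool {
  @Override
  protected void addOptions() {
    // register command-line options here
  }

  @Override
  protected void processOptions(CommandLine cmd) {
    // read the parsed options here
  }

  @Override
  protected int doWork() throws Exception {
    // Reuse the framework-initialized 'conf' instead of building a new
    // Configuration, so hbase-site.xml values are picked up for free.
    int port = conf.getInt(HConstants.REGIONSERVER_PORT,
        HConstants.DEFAULT_REGIONSERVER_PORT);
    // ...tool logic using 'port'...
    return EXIT_SUCCESS;
  }
}
{code}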

 

 

> RegionMover should use the configured default port number and not the one 
> from HConstants
> -
>
> Key: HBASE-19572
> URL: https://issues.apache.org/jira/browse/HBASE-19572
> Project: HBase
>  Issue Type: Bug
>Reporter: Esteban Gutierrez
>Assignee: Toshihiro Suzuki
>Priority: Major
> Attachments: HBASE-19572.master.001.patch, 
> HBASE-19572.master.001.patch, HBASE-19572.master.003.patch, 
> HBASE-19572.master.004.patch, HBASE-19572.master.004.patch, 
> HBASE-19572.master.005.patch
>
>
> The issue I ran into HBASE-19499 was due RegionMover not using the port used 
> by {{hbase-site.xml}}. The tool should use the value used in the 
> configuration before falling back to the hardcoded value 
> {{HConstants.DEFAULT_REGIONSERVER_PORT}}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (HBASE-19572) RegionMover should use the configured default port number and not the one from HConstants

2018-07-12 Thread Reid Chan (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16541259#comment-16541259
 ] 

Reid Chan edited comment on HBASE-19572 at 7/12/18 7:40 AM:


I just think your patch could do much better if you got to know the 
{{AbstractHBaseTool}} framework better, especially the usage of {{conf}}; 
there is redundant code.


was (Author: reidchan):
I just think your patch could do much better if you got to know the 
{{AbstractHBaseTool}} framework better, especially the usage of {{conf}}; 
there is redundant code.

 

 

> RegionMover should use the configured default port number and not the one 
> from HConstants
> -
>
> Key: HBASE-19572
> URL: https://issues.apache.org/jira/browse/HBASE-19572
> Project: HBase
>  Issue Type: Bug
>Reporter: Esteban Gutierrez
>Assignee: Toshihiro Suzuki
>Priority: Major
> Attachments: HBASE-19572.master.001.patch, 
> HBASE-19572.master.001.patch, HBASE-19572.master.003.patch, 
> HBASE-19572.master.004.patch, HBASE-19572.master.004.patch, 
> HBASE-19572.master.005.patch
>
>
> The issue I ran into HBASE-19499 was due RegionMover not using the port used 
> by {{hbase-site.xml}}. The tool should use the value used in the 
> configuration before falling back to the hardcoded value 
> {{HConstants.DEFAULT_REGIONSERVER_PORT}}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-19572) RegionMover should use the configured default port number and not the one from HConstants

2018-07-12 Thread Reid Chan (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16541280#comment-16541280
 ] 

Reid Chan commented on HBASE-19572:
---

I will commit it tomorrow if there are no other objections.

> RegionMover should use the configured default port number and not the one 
> from HConstants
> -
>
> Key: HBASE-19572
> URL: https://issues.apache.org/jira/browse/HBASE-19572
> Project: HBase
>  Issue Type: Bug
>Reporter: Esteban Gutierrez
>Assignee: Toshihiro Suzuki
>Priority: Major
> Attachments: HBASE-19572.master.001.patch, 
> HBASE-19572.master.001.patch, HBASE-19572.master.003.patch, 
> HBASE-19572.master.004.patch, HBASE-19572.master.004.patch, 
> HBASE-19572.master.005.patch
>
>
> The issue I ran into HBASE-19499 was due RegionMover not using the port used 
> by {{hbase-site.xml}}. The tool should use the value used in the 
> configuration before falling back to the hardcoded value 
> {{HConstants.DEFAULT_REGIONSERVER_PORT}}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-19572) RegionMover should use the configured default port number and not the one from HConstants

2018-07-12 Thread Reid Chan (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16541276#comment-16541276
 ] 

Reid Chan commented on HBASE-19572:
---

+1.
But it's still good to go; we can do some refactoring in another JIRA.

> RegionMover should use the configured default port number and not the one 
> from HConstants
> -
>
> Key: HBASE-19572
> URL: https://issues.apache.org/jira/browse/HBASE-19572
> Project: HBase
>  Issue Type: Bug
>Reporter: Esteban Gutierrez
>Assignee: Toshihiro Suzuki
>Priority: Major
> Attachments: HBASE-19572.master.001.patch, 
> HBASE-19572.master.001.patch, HBASE-19572.master.003.patch, 
> HBASE-19572.master.004.patch, HBASE-19572.master.004.patch, 
> HBASE-19572.master.005.patch
>
>
> The issue I ran into HBASE-19499 was due RegionMover not using the port used 
> by {{hbase-site.xml}}. The tool should use the value used in the 
> configuration before falling back to the hardcoded value 
> {{HConstants.DEFAULT_REGIONSERVER_PORT}}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20846) Table's shared lock is not held by sub-procedures after master restart

2018-07-12 Thread Duo Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16541364#comment-16541364
 ] 

Duo Zhang commented on HBASE-20846:
---

OK, found another problem. I want to call acquireLock to restore the lock 
state, but this does not work when the master restarts. For most procedures, we 
will call env.waitInitialized to wait until the master has been initialized. This 
is reasonable. But we need to finish procedure executor initialization before 
loading meta, which means we need to restore the procedure locks before the 
master has been initialized, hence a deadlock...

A proper way to fix this is to split the lock acquisition into two stages, and 
put the waitInitialized call into the pre-check stage. But the logic will be more 
complicated, as it is possible that we haven't passed the pre-check but we 
have already held the lock...
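
To make the two-stage idea concrete, a minimal sketch (the extra 
{{restoringOnRestart}} flag and the split itself are hypothetical; 
{{waitInitialized}} and {{waitTableSharedLock}} mirror existing calls):
{code:java}
// Stage 1 (pre-check) may block on master initialization; stage 2 only
// takes the lock, so restart-time lock restore can run stage 2 alone.
protected LockState acquireLock(MasterProcedureEnv env, boolean restoringOnRestart) {
  if (!restoringOnRestart && env.waitInitialized(this)) {
    // Pre-check stage: normal execution waits for master init.
    return LockState.LOCK_EVENT_WAIT;
  }
  // Lock stage: safe to replay before the master is fully initialized.
  return env.getProcedureScheduler().waitTableSharedLock(this, getTableName())
      ? LockState.LOCK_EVENT_WAIT
      : LockState.LOCK_ACQUIRED;
}
{code}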

> Table's shared lock is not held by sub-procedures after master restart
> --
>
> Key: HBASE-20846
> URL: https://issues.apache.org/jira/browse/HBASE-20846
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.1.0
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Major
> Fix For: 3.0.0, 2.0.2, 2.1.1
>
> Attachments: HBASE-20846.branch-2.0.002.patch, 
> HBASE-20846.branch-2.0.patch
>
>
> Found this one when investigating a ModifyTableProcedure that got stuck while 
> there was a MoveRegionProcedure going on after master restart.
> Though this issue can be solved by HBASE-20752, I discovered something else.
> Before a MoveRegionProcedure can execute, it will hold the table's shared 
> lock. So, when an UnassignProcedure is spawned, it will not check the 
> table's shared lock, since it is sure that its parent (MoveRegionProcedure) has 
> acquired the table's lock.
> {code:java}
> // If there is parent procedure, it would have already taken xlock, so no need to take
> // shared lock here. Otherwise, take shared lock.
> if (!procedure.hasParent()
>     && waitTableQueueSharedLock(procedure, table) == null) {
>   return true;
> }
> {code}
> But it is not the case when the master is restarted. The child 
> procedure (UnassignProcedure) will be executed first after restart. Though it 
> has a parent (MoveRegionProcedure), apparently the parent didn't hold the 
> table's lock.
> So it began to execute without holding the table's shared lock, and a 
> ModifyTableProcedure could acquire the table's exclusive lock and execute at the 
> same time, which is not possible if the master was not restarted.
> This would cause a stall before HBASE-20752. But since HBASE-20752 is fixed, 
> I wrote a simple UT to reproduce this case.
> I think we don't have to check the parent for the table's shared lock. It is a 
> shared lock, right? I think we can acquire it every time we need it.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-19572) RegionMover should use the configured default port number and not the one from HConstants

2018-07-12 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16541376#comment-16541376
 ] 

Hadoop QA commented on HBASE-19572:
---

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
20s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
20s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
45s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
10s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
35s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
10s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
30s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
34s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
10m 55s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
29s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}117m 
10s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
21s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}159m 48s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-19572 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12931279/HBASE-19572.master.005.patch
 |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  shadedjars  
hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux e31282403935 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 
08:52:28 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | master / c55acb0b03 |
| maven | version: Apache Maven 3.5.4 
(1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| Default Java | 1.8.0_171 |
| findbugs | v3.1.0-RC3 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HBASE-Build/13605/testReport/ |
| Max. process+thread count | 4899 (vs. ulimit of 1) |
| modules | C: hbase-server U: hbase-server |
| Console output | 
https://builds.apache.org/job/PreCommit-HBASE-Build/13605/console |
| Powered by | Apache Yetus 0.7.0   http://yetus.apache.org |


This message was automatically generated.



> RegionMover s

[jira] [Updated] (HBASE-20853) Polish "Add defaults to Table Interface so Implementors don't have to"

2018-07-12 Thread Balazs Meszaros (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Balazs Meszaros updated HBASE-20853:

Attachment: HBASE-20853.master.002.patch

> Polish "Add defaults to Table Interface so Implementors don't have to"
> --
>
> Key: HBASE-20853
> URL: https://issues.apache.org/jira/browse/HBASE-20853
> Project: HBase
>  Issue Type: Sub-task
>  Components: API
>Reporter: stack
>Assignee: Balazs Meszaros
>Priority: Major
>  Labels: beginner, beginners
> Fix For: 3.0.0, 2.0.2, 2.1.1
>
> Attachments: HBASE-20853.master.001.patch, 
> HBASE-20853.master.002.patch
>
>
> This issue is to address feedback that came in after commit on the parent 
> (FYI [~chia7712]). See tail of parent issue and amendment attached to parent 
> adding better defaults to the Table Interface.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20860) Merged region's RIT state may not be cleaned after master restart

2018-07-12 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16541441#comment-16541441
 ] 

Hudson commented on HBASE-20860:


Results for branch branch-2.1
[build #51 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/51/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/51//General_Nightly_Build_Report/]




(/) {color:green}+1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/51//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/51//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> Merged region's RIT state may not be cleaned after master restart
> -
>
> Key: HBASE-20860
> URL: https://issues.apache.org/jira/browse/HBASE-20860
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 3.0.0, 2.1.0, 2.0.1
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Major
> Fix For: 3.0.0, 2.0.2, 2.1.1
>
> Attachments: HBASE-20860.branch-2.0.002.patch, 
> HBASE-20860.branch-2.0.003.patch, HBASE-20860.branch-2.0.004.patch, 
> HBASE-20860.branch-2.0.005.patch, HBASE-20860.branch-2.0.patch
>
>
> In MergeTableRegionsProcedure, we issue UnassignProcedures to offline the 
> regions to merge. But if we restart the master just after 
> MergeTableRegionsProcedure finishes these two UnassignProcedures and before it 
> can delete their meta entries, the new master will find these two regions are 
> CLOSED but no procedures are attached to them. They will be regarded as RIT 
> regions, and nobody will clean the RIT state for them later.
> A quick way to resolve this stuck situation in the production env is 
> restarting the master again, since the meta entries are deleted in 
> MergeTableRegionsProcedure. Here, I offer a fix for this problem.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20864) RS was killed due to master thought the region should be on a already dead server

2018-07-12 Thread Duo Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16541451#comment-16541451
 ] 

Duo Zhang commented on HBASE-20864:
---

I think the problem is here
{noformat}
2018-07-09 20:03:38,716 INFO  [PEWorker-9] assignment.RegionStateStore: 
pid=2308 updating hbase:meta row=7a5b2c7b4b1edaba7f90b45f3e536293, 
regionState=OPENING
2018-07-09 20:03:38,716 INFO  [PEWorker-8] assignment.RegionStateStore: 
pid=2309 updating hbase:meta row=7e9317c9b32e95b2e6482ef4a7145078, 
regionState=OPENING, regionLocation=e010125049164.bja,60020,1531136465378
2018-07-09 20:03:38,716 INFO  [PEWorker-3] assignment.RegionStateStore: 
pid=2305 updating hbase:meta row=4423e4182457c5b573729be4682cc3a3, 
regionState=OPENING
2018-07-09 20:03:38,716 INFO  [PEWorker-15] assignment.RegionStateStore: 
pid=2306 updating hbase:meta row=fc5a65649a2462683a380f9f833151c3, 
regionState=OPENING, regionLocation=e010125048016.bja,60020,1531137190779
2018-07-09 20:03:38,716 INFO  [PEWorker-16] assignment.RegionStateStore: 
pid=2307 updating hbase:meta row=30d22d10f12cee0ed3603a447ee710e2, 
regionState=OPENING, regionLocation=e010125048016.bja,60020,1531137190779
2018-07-09 20:03:38,716 INFO  [PEWorker-1] assignment.RegionStateStore: 
pid=2304 updating hbase:meta row=58cd377b1c46faf98c3a5ee61b4c97fa, 
regionState=OPENING
{noformat}

You can see that, for 4423e4182457c5b573729be4682cc3a3, there is no 
regionLocation information. Actually, there are two fields in the meta table 
which record the location of the region, and OPENING and OPEN write different 
fields. When loading meta, we use the location written at OPENING.

Need to dig more into why we do not have a regionLocation here; it should not 
happen.
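
For anyone wanting to check which of the two fields a given meta row carries, a 
hedged sketch using the client API (assuming {{info:sn}} is what OPENING writes 
and {{info:server}} is what OPEN writes; {{conf}} and the {{regionName}} row key 
are assumed to be in scope):
{code:java}
try (Connection conn = ConnectionFactory.createConnection(conf);
     Table meta = conn.getTable(TableName.META_TABLE_NAME)) {
  Result r = meta.get(new Get(regionName));
  // location written at OPENING (assumed qualifier: info:sn)
  byte[] openingLoc = r.getValue(HConstants.CATALOG_FAMILY,
      HConstants.SERVERNAME_QUALIFIER);
  // location written at OPEN (qualifier: info:server)
  byte[] openLoc = r.getValue(HConstants.CATALOG_FAMILY,
      HConstants.SERVER_QUALIFIER);
  System.out.println("OPENING wrote: " + Bytes.toString(openingLoc)
      + ", OPEN wrote: " + Bytes.toString(openLoc));
}
{code}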

> RS was killed due to master thought the region should be on a already dead 
> server
> -
>
> Key: HBASE-20864
> URL: https://issues.apache.org/jira/browse/HBASE-20864
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.0.0
>Reporter: Allan Yang
>Priority: Major
> Attachments: log.zip
>
>
> When I was running ITBLL with our internal 2.0.0 version (with 2.0.1 
> backported and with two other issues: HBASE-20706, HBASE-20752), I found two 
> of my RSs killed by the master, since the master had a different region state 
> from those RSs. It is very strange that the master thought these regions 
> should be on an already dead server. There might be a serious bug, but I 
> haven't found it yet. Here is the process:
> 1. e010125048153.bja,60020,1531137365840 is crashed, and clearly 
> 4423e4182457c5b573729be4682cc3a3 was assigned to 
> e010125049164.bja,60020,1531136465378 during ServerCrashProcedure
> {code:java}
> 2018-07-09 20:03:32,443 INFO  [PEWorker-10] procedure.ServerCrashProcedure: 
> Start pid=2303, state=RUNNABLE:SERVER_CRASH_START; ServerCrashProcedure 
> server=e010125048153.bja,60020,1531137365840, splitWal=true, meta=false
> 2018-07-09 20:03:39,220 DEBUG 
> [RpcServer.default.FPBQ.Fifo.handler=294,queue=24,port=6] 
> assignment.RegionTransitionProcedure: Received report OPENED seqId=16021, 
> pid=2305, ppid=2303, state=RUNNABLE:REGION_TRANSITION_DISPATCH; 
> AssignProcedure table=IntegrationTestBigLinkedList, 
> region=4423e4182457c5b573729be4682cc3a3; rit=OPENING, 
> location=e010125049164.bja,60020,1531136465378
> 2018-07-09 20:03:39,220 INFO  [PEWorker-13] assignment.RegionStateStore: 
> pid=2305 updating hbase:meta row=4423e4182457c5b573729be4682cc3a3, 
> regionState=OPEN, openSeqNum=16021, 
> regionLocation=e010125049164.bja,60020,1531136465378
> 2018-07-09 20:03:43,190 INFO  [PEWorker-12] procedure2.ProcedureExecutor: 
> Finished pid=2303, state=SUCCESS; ServerCrashProcedure 
> server=e010125048153.bja,60020,1531137365840, splitWal=true, meta=false in 
> 10.7490sec
> {code}
> 2. A modify table happened later, and 4423e4182457c5b573729be4682cc3a3 was 
> reopened on e010125049164.bja,60020,1531136465378
> {code:java}
> 2018-07-09 20:04:39,929 DEBUG 
> [RpcServer.default.FPBQ.Fifo.handler=295,queue=25,port=6] 
> assignment.RegionTransitionProcedure: Received report OPENED seqId=16024, 
> pid=2351, ppid=2314, state=RUNNABLE:REGION_TRANSITION_DISPATCH; 
> AssignProcedure table=IntegrationTestBigLinkedList, 
> region=4423e4182457c5b573729be4682cc3a3, 
> target=e010125049164.bja,60020,1531136465378; rit=OPENING, 
> location=e010125049164.bja,60020,1531136465378
> 2018-07-09 20:04:40,554 INFO  [PEWorker-6] assignment.RegionStateStore: 
> pid=2351 updating hbase:meta row=4423e4182457c5b573729be4682cc3a3, 
> regionState=OPEN, openSeqNum=16024, 
> regionLocation=e010125049164.bja,60020,1531136465378
> {code}
> 3. The active master was killed and the backup master took over, but when 
> loading the meta entries, it clearly showed 4423e4182457c5b573729be4682cc3a3 is on 

[jira] [Commented] (HBASE-20838) Include hbase-server in precommit test if CommonFSUtils is changed

2018-07-12 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16541442#comment-16541442
 ] 

Hudson commented on HBASE-20838:


Results for branch branch-2.1
[build #51 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/51/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/51//General_Nightly_Build_Report/]




(/) {color:green}+1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/51//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/51//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> Include hbase-server in precommit test if CommonFSUtils is changed
> --
>
> Key: HBASE-20838
> URL: https://issues.apache.org/jira/browse/HBASE-20838
> Project: HBase
>  Issue Type: Test
>Reporter: Yu Li
>Assignee: Yu Li
>Priority: Major
> Fix For: 3.0.0, 2.2.0, 2.1.1
>
> Attachments: HBASE-20838.patch, HBASE-20838.patch, 
> HBASE-20838.v2.patch
>
>
> -As per 
> [discussed|https://issues.apache.org/jira/browse/HBASE-20691?focusedCommentId=16517662&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16517662]
>  in HBASE-20691, since the setStoragePolicy code is in CommonFSUtils, the 
> test should be in TestCommonFSUtils-
> We don't want to introduce a dependency on hadoop-hdfs in hbase-common, so we 
> decided to leave the setStoragePolicy related tests in TestCommonFSUtils. 
> Instead, we will change the personality script to include hbase-server in the 
> unit tests if any change is made against {{CommonFSUtils}}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20864) RS was killed due to master thought the region should be on a already dead server

2018-07-12 Thread Duo Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16541455#comment-16541455
 ] 

Duo Zhang commented on HBASE-20864:
---

Have you tried the patch in HBASE-20792? [~allan163].

> RS was killed due to master thought the region should be on a already dead 
> server
> -
>
> Key: HBASE-20864
> URL: https://issues.apache.org/jira/browse/HBASE-20864
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.0.0
>Reporter: Allan Yang
>Priority: Major
> Attachments: log.zip
>
>
> When I was running ITBLL with our internal 2.0.0 version (with 2.0.1 
> backported and with two other issues: HBASE-20706, HBASE-20752), I found two 
> of my RSs killed by the master, since the master had a different region state 
> from those RSs. It is very strange that the master thought these regions 
> should be on an already dead server. There might be a serious bug, but I 
> haven't found it yet. Here is the process:
> 1. e010125048153.bja,60020,1531137365840 is crashed, and clearly 
> 4423e4182457c5b573729be4682cc3a3 was assigned to 
> e010125049164.bja,60020,1531136465378 during ServerCrashProcedure
> {code:java}
> 2018-07-09 20:03:32,443 INFO  [PEWorker-10] procedure.ServerCrashProcedure: 
> Start pid=2303, state=RUNNABLE:SERVER_CRASH_START; ServerCrashProcedure 
> server=e010125048153.bja,60020,1531137365840, splitWal=true, meta=false
> 2018-07-09 20:03:39,220 DEBUG 
> [RpcServer.default.FPBQ.Fifo.handler=294,queue=24,port=6] 
> assignment.RegionTransitionProcedure: Received report OPENED seqId=16021, 
> pid=2305, ppid=2303, state=RUNNABLE:REGION_TRANSITION_DISPATCH; 
> AssignProcedure table=IntegrationTestBigLinkedList, 
> region=4423e4182457c5b573729be4682cc3a3; rit=OPENING, 
> location=e010125049164.bja,60020,1531136465378
> 2018-07-09 20:03:39,220 INFO  [PEWorker-13] assignment.RegionStateStore: 
> pid=2305 updating hbase:meta row=4423e4182457c5b573729be4682cc3a3, 
> regionState=OPEN, openSeqNum=16021, 
> regionLocation=e010125049164.bja,60020,1531136465378
> 2018-07-09 20:03:43,190 INFO  [PEWorker-12] procedure2.ProcedureExecutor: 
> Finished pid=2303, state=SUCCESS; ServerCrashProcedure 
> server=e010125048153.bja,60020,1531137365840, splitWal=true, meta=false in 
> 10.7490sec
> {code}
> 2. A modify table happened later, and 4423e4182457c5b573729be4682cc3a3 was 
> reopened on e010125049164.bja,60020,1531136465378
> {code:java}
> 2018-07-09 20:04:39,929 DEBUG 
> [RpcServer.default.FPBQ.Fifo.handler=295,queue=25,port=6] 
> assignment.RegionTransitionProcedure: Received report OPENED seqId=16024, 
> pid=2351, ppid=2314, state=RUNNABLE:REGION_TRANSITION_DISPATCH; 
> AssignProcedure table=IntegrationTestBigLinkedList, 
> region=4423e4182457c5b573729be4682cc3a3, 
> target=e010125049164.bja,60020,1531136465378; rit=OPENING, 
> location=e010125049164.bja,60020,1531136465378
> 2018-07-09 20:04:40,554 INFO  [PEWorker-6] assignment.RegionStateStore: 
> pid=2351 updating hbase:meta row=4423e4182457c5b573729be4682cc3a3, 
> regionState=OPEN, openSeqNum=16024, 
> regionLocation=e010125049164.bja,60020,1531136465378
> {code}
> 3. The active master was killed and the backup master took over, but when 
> loading the meta entries, it clearly showed 4423e4182457c5b573729be4682cc3a3 is 
> on the previous dead server e010125048153.bja,60020,1531137365840. That is 
> very, very strange!!!
> {code:java}
> 2018-07-09 20:06:17,985 INFO  [master/e010125048016:6] 
> assignment.RegionStateStore: Load hbase:meta entry 
> region=4423e4182457c5b573729be4682cc3a3, regionState=OPEN, 
> lastHost=e010125049164.bja,60020,1531136465378, 
> regionLocation=e010125048153.bja,60020,1531137365840, openSeqNum=16024
> {code}
> 4. the rs was killed
> {code:java}
> 2018-07-09 20:06:20,265 WARN  
> [RpcServer.default.FPBQ.Fifo.handler=297,queue=27,port=6] 
> assignment.AssignmentManager: Killing e010125049164.bja,60020,1531136465378: 
> rit=OPEN, location=e010125048153.bja,60020,1531137365840, 
> table=IntegrationTestBigLinkedList, 
> region=4423e4182457c5b573729be4682cc3a3 reported OPEN on 
> server=e010125049164.bja,60020,1531136465378 but state has otherwise.
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20838) Include hbase-server in precommit test if CommonFSUtils is changed

2018-07-12 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16541453#comment-16541453
 ] 

Hudson commented on HBASE-20838:


Results for branch branch-2
[build #973 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/973/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/973//General_Nightly_Build_Report/]




(/) {color:green}+1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/973//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/973//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> Include hbase-server in precommit test if CommonFSUtils is changed
> --
>
> Key: HBASE-20838
> URL: https://issues.apache.org/jira/browse/HBASE-20838
> Project: HBase
>  Issue Type: Test
>Reporter: Yu Li
>Assignee: Yu Li
>Priority: Major
> Fix For: 3.0.0, 2.2.0, 2.1.1
>
> Attachments: HBASE-20838.patch, HBASE-20838.patch, 
> HBASE-20838.v2.patch
>
>
> -As per 
> [discussed|https://issues.apache.org/jira/browse/HBASE-20691?focusedCommentId=16517662&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16517662]
>  in HBASE-20691, since the setStoragePolicy code is in CommonFSUtils, the 
> test should be in TestCommonFSUtils-
> We don't want to introduce a dependency on hadoop-hdfs in hbase-common, so we 
> decided to leave the setStoragePolicy related tests in TestCommonFSUtils. 
> Instead, we will change the personality script to include hbase-server in the 
> unit tests if any change is made against {{CommonFSUtils}}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20697) Can't cache All region locations of the specify table by calling table.getRegionLocator().getAllRegionLocations()

2018-07-12 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16541462#comment-16541462
 ] 

Hudson commented on HBASE-20697:


Results for branch branch-2.0
[build #539 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/539/]: 
(/) *{color:green}+1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/539//General_Nightly_Build_Report/]




(/) {color:green}+1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/539//JDK8_Nightly_Build_Report_(Hadoop2)/]


(/) {color:green}+1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/539//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


> Can't cache All region locations of the specify table by calling 
> table.getRegionLocator().getAllRegionLocations()
> -
>
> Key: HBASE-20697
> URL: https://issues.apache.org/jira/browse/HBASE-20697
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.3.1, 1.2.6, 2.0.1
>Reporter: zhaoyuan
>Assignee: zhaoyuan
>Priority: Major
> Fix For: 3.0.0, 1.5.0, 1.4.6, 2.0.2, 2.2.0, 2.1.1
>
> Attachments: HBASE-20697.branch-1.2.001.patch, 
> HBASE-20697.branch-1.2.002.patch, HBASE-20697.branch-1.2.003.patch, 
> HBASE-20697.branch-1.2.004.patch, HBASE-20697.branch-1.addendum.patch, 
> HBASE-20697.master.001.patch, HBASE-20697.master.002.patch, 
> HBASE-20697.master.002.patch, HBASE-20697.master.003.patch
>
>
> When we upgrade and restart a new version of an application which reads and 
> writes to HBase, we get some operation timeouts. The timeouts are expected, 
> because when the application restarts it does not hold any region location 
> cache and has to communicate with zk and the meta regionserver to get region 
> locations.
> We want to avoid these timeouts, so we do warmup work, and as far as I am 
> concerned, the method table.getRegionLocator().getAllRegionLocations() will 
> fetch all region locations and cache them. However, it didn't work well. 
> There were still a lot of timeouts, so it confused me. 
> I dug into the source code and found something below
> {code:java}
> // code placeholder
> public List<HRegionLocation> getAllRegionLocations() throws IOException {
>   TableName tableName = getName();
>   NavigableMap<HRegionInfo, ServerName> locations =
>       MetaScanner.allTableRegions(this.connection, tableName);
>   ArrayList<HRegionLocation> regions = new ArrayList<>(locations.size());
>   for (Entry<HRegionInfo, ServerName> entry : locations.entrySet()) {
>     regions.add(new HRegionLocation(entry.getKey(), entry.getValue()));
>   }
>   if (regions.size() > 0) {
>     connection.cacheLocation(tableName, new RegionLocations(regions));
>   }
>   return regions;
> }
> In MetaCache
> public void cacheLocation(final TableName tableName, final RegionLocations locations) {
>   byte[] startKey = locations.getRegionLocation().getRegionInfo().getStartKey();
>   ConcurrentMap<byte[], RegionLocations> tableLocations = getTableLocations(tableName);
>   RegionLocations oldLocation = tableLocations.putIfAbsent(startKey, locations);
>   boolean isNewCacheEntry = (oldLocation == null);
>   if (isNewCacheEntry) {
>     if (LOG.isTraceEnabled()) {
>       LOG.trace("Cached location: " + locations);
>     }
>     addToCachedServers(locations);
>     return;
>   }
> {code}
> It will collect all regions into one RegionLocations object and only cache 
> the first not-null region location, and then when we put or get to HBase, we 
> do getCachedLocation() 
> {code:java}
> // code placeholder
> public RegionLocations getCachedLocation(final TableName tableName, final byte[] row) {
>   ConcurrentNavigableMap<byte[], RegionLocations> tableLocations =
>       getTableLocations(tableName);
>   Entry<byte[], RegionLocations> e = tableLocations.floorEntry(row);
>   if (e == null) {
>     if (metrics != null) metrics.incrMetaCacheMiss();
>     return null;
>   }
>   RegionLocations possibleRegion = e.getValue();
>   // make sure that the end key is greater than the row we're looking
>   // for, otherwise the row actually belongs in the next region, not
>   // this one. the exception case is when the endkey is
>   // HConstants.EMPTY_END_ROW, signifying that the region we're
>   // checking is actually the last region in the table.
>   byte[] endKey = possibleRegion.getRegionLocation().getRegionInfo().getEndKey();
>   if (Bytes.equals(endKey, HConstants.EMPTY_END_ROW) ||
>       getRowComparator(tableName).compareRows(
>           endKey, 0, endKey

[jira] [Commented] (HBASE-20860) Merged region's RIT state may not be cleaned after master restart

2018-07-12 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16541461#comment-16541461
 ] 

Hudson commented on HBASE-20860:


Results for branch branch-2.0
[build #539 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/539/]: 
(/) *{color:green}+1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/539//General_Nightly_Build_Report/]




(/) {color:green}+1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/539//JDK8_Nightly_Build_Report_(Hadoop2)/]


(/) {color:green}+1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/539//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


> Merged region's RIT state may not be cleaned after master restart
> -
>
> Key: HBASE-20860
> URL: https://issues.apache.org/jira/browse/HBASE-20860
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 3.0.0, 2.1.0, 2.0.1
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Major
> Fix For: 3.0.0, 2.0.2, 2.1.1
>
> Attachments: HBASE-20860.branch-2.0.002.patch, 
> HBASE-20860.branch-2.0.003.patch, HBASE-20860.branch-2.0.004.patch, 
> HBASE-20860.branch-2.0.005.patch, HBASE-20860.branch-2.0.patch
>
>
> In MergeTableRegionsProcedure, we issue UnassignProcedures to offline the 
> regions to merge. But if we restart the master just after 
> MergeTableRegionsProcedure finishes these two UnassignProcedures and before it 
> can delete their meta entries, the new master will find these two regions are 
> CLOSED but no procedures are attached to them. They will be regarded as RIT 
> regions, and nobody will clean the RIT state for them later.
> A quick way to resolve this stuck situation in the production env is 
> restarting the master again, since the meta entries are deleted in 
> MergeTableRegionsProcedure. Here, I offer a fix for this problem.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-19572) RegionMover should use the configured default port number and not the one from HConstants

2018-07-12 Thread Reid Chan (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16541439#comment-16541439
 ] 

Reid Chan commented on HBASE-19572:
---

Could you provide a patch for branch-2 as well? ping [~brfrn169]

> RegionMover should use the configured default port number and not the one 
> from HConstants
> -
>
> Key: HBASE-19572
> URL: https://issues.apache.org/jira/browse/HBASE-19572
> Project: HBase
>  Issue Type: Bug
>Reporter: Esteban Gutierrez
>Assignee: Toshihiro Suzuki
>Priority: Major
> Attachments: HBASE-19572.master.001.patch, 
> HBASE-19572.master.001.patch, HBASE-19572.master.003.patch, 
> HBASE-19572.master.004.patch, HBASE-19572.master.004.patch, 
> HBASE-19572.master.005.patch
>
>
> The issue I ran into HBASE-19499 was due RegionMover not using the port used 
> by {{hbase-site.xml}}. The tool should use the value used in the 
> configuration before falling back to the hardcoded value 
> {{HConstants.DEFAULT_REGIONSERVER_PORT}}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20697) Can't cache All region locations of the specify table by calling table.getRegionLocator().getAllRegionLocations()

2018-07-12 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16541463#comment-16541463
 ] 

Hudson commented on HBASE-20697:


Results for branch branch-1
[build #378 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/378/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(x) {color:red}-1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/378//General_Nightly_Build_Report/]


(x) {color:red}-1 jdk7 checks{color}
-- For more information [see jdk7 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/378//JDK7_Nightly_Build_Report/]


(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/378//JDK8_Nightly_Build_Report_(Hadoop2)/]




(x) {color:red}-1 source release artifact{color}
-- See build output for details.


> Can't cache All region locations of the specify table by calling 
> table.getRegionLocator().getAllRegionLocations()
> -
>
> Key: HBASE-20697
> URL: https://issues.apache.org/jira/browse/HBASE-20697
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.3.1, 1.2.6, 2.0.1
>Reporter: zhaoyuan
>Assignee: zhaoyuan
>Priority: Major
> Fix For: 3.0.0, 1.5.0, 1.4.6, 2.0.2, 2.2.0, 2.1.1
>
> Attachments: HBASE-20697.branch-1.2.001.patch, 
> HBASE-20697.branch-1.2.002.patch, HBASE-20697.branch-1.2.003.patch, 
> HBASE-20697.branch-1.2.004.patch, HBASE-20697.branch-1.addendum.patch, 
> HBASE-20697.master.001.patch, HBASE-20697.master.002.patch, 
> HBASE-20697.master.002.patch, HBASE-20697.master.003.patch
>
>
> When we upgrade and restart a new version of an application which reads and 
> writes to HBase, we get some operation timeouts. The timeouts are expected, 
> because when the application restarts it does not hold any region location 
> cache and has to communicate with zk and the meta regionserver to get region 
> locations.
> We want to avoid these timeouts, so we do warmup work, and as far as I am 
> concerned, the method table.getRegionLocator().getAllRegionLocations() will 
> fetch all region locations and cache them. However, it didn't work well. 
> There were still a lot of timeouts, so it confused me. 
> I dug into the source code and found something below
> {code:java}
> // code placeholder
> public List<HRegionLocation> getAllRegionLocations() throws IOException {
>   TableName tableName = getName();
>   NavigableMap<HRegionInfo, ServerName> locations =
>       MetaScanner.allTableRegions(this.connection, tableName);
>   ArrayList<HRegionLocation> regions = new ArrayList<>(locations.size());
>   for (Entry<HRegionInfo, ServerName> entry : locations.entrySet()) {
>     regions.add(new HRegionLocation(entry.getKey(), entry.getValue()));
>   }
>   if (regions.size() > 0) {
>     connection.cacheLocation(tableName, new RegionLocations(regions));
>   }
>   return regions;
> }
> In MetaCache
> public void cacheLocation(final TableName tableName, final RegionLocations locations) {
>   byte[] startKey = locations.getRegionLocation().getRegionInfo().getStartKey();
>   ConcurrentMap<byte[], RegionLocations> tableLocations = getTableLocations(tableName);
>   RegionLocations oldLocation = tableLocations.putIfAbsent(startKey, locations);
>   boolean isNewCacheEntry = (oldLocation == null);
>   if (isNewCacheEntry) {
>     if (LOG.isTraceEnabled()) {
>       LOG.trace("Cached location: " + locations);
>     }
>     addToCachedServers(locations);
>     return;
>   }
> {code}
> It will collect all regions into one RegionLocations object and only cache 
> the first not-null region location, and then when we put or get to HBase, we 
> do getCachedLocation() 
> {code:java}
> public RegionLocations getCachedLocation(final TableName tableName,
>     final byte[] row) {
>   ConcurrentNavigableMap<byte[], RegionLocations> tableLocations =
>       getTableLocations(tableName);
>   Entry<byte[], RegionLocations> e = tableLocations.floorEntry(row);
>   if (e == null) {
>     if (metrics != null) metrics.incrMetaCacheMiss();
>     return null;
>   }
>   RegionLocations possibleRegion = e.getValue();
>   // make sure that the end key is greater than the row we're looking
>   // for, otherwise the row actually belongs in the next region, not
>   // this one. the exception case is when the endkey is
>   // HConstants.EMPTY_END_ROW, signifying that the region we're
>   // checking is actually the last region in the table.
>   byte[] endKey =
>       possibleRegion.getRegionLocation().getRegionInfo().getEndKey();
>   if (Bytes.equals(endKey, HConstants.EMPTY_END_ROW) ||
>       getRowComparator(tableName).compareRows(
>           endKey, 0, endKey.length, row, 0, row.length) > 0) {
>     if (me
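
The quoted method is cut off by the archive, but the gist is visible: getCachedLocation() does a floorEntry lookup by start key, while getAllRegionLocations() cached only one entry for the whole table. As a minimal hedged sketch (HBase 1.x client API; illustrative, not the attached patch), the warmup works if every region gets its own cache entry:
{code:java}
// Hedged sketch: cache one RegionLocations entry per region start key,
// so tableLocations.floorEntry(row) can resolve every row.
public List<HRegionLocation> getAllRegionLocations() throws IOException {
  TableName tableName = getName();
  NavigableMap<HRegionInfo, ServerName> locations =
      MetaScanner.allTableRegions(this.connection, tableName);
  ArrayList<HRegionLocation> regions = new ArrayList<>(locations.size());
  for (Entry<HRegionInfo, ServerName> entry : locations.entrySet()) {
    HRegionLocation location =
        new HRegionLocation(entry.getKey(), entry.getValue());
    regions.add(location);
    // one cache entry per region, instead of one entry for the whole table
    connection.cacheLocation(tableName, new RegionLocations(location));
  }
  return regions;
}
{code}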

[jira] [Commented] (HBASE-20853) Polish "Add defaults to Table Interface so Implementors don't have to"

2018-07-12 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16541467#comment-16541467
 ] 

Hadoop QA commented on HBASE-20853:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
13s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
54s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
35s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
31s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
30s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
52s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
21s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
32s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
10m  2s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
20s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m  
2s{color} | {color:green} hbase-client in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
10s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 37m 19s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-20853 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12931303/HBASE-20853.master.002.patch
 |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  shadedjars  
hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux 9344944f6af5 3.13.0-143-generic #192-Ubuntu SMP Tue Feb 27 
10:45:36 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | master / 3fc23fe930 |
| maven | version: Apache Maven 3.5.4 
(1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| Default Java | 1.8.0_171 |
| findbugs | v3.1.0-RC3 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HBASE-Build/13606/testReport/ |
| Max. process+thread count | 260 (vs. ulimit of 1) |
| modules | C: hbase-client U: hbase-client |
| Console output | 
https://builds.apache.org/job/PreCommit-HBASE-Build/13606/console |

[jira] [Commented] (HBASE-20864) RS was killed due to master thought the region should be on a already dead server

2018-07-12 Thread Allan Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16541558#comment-16541558
 ] 

Allan Yang commented on HBASE-20864:


We don't have HBASE-20792 in our branch. But if I understand correctly, 
HBASE-20792 only affects read-only tables?

> RS was killed due to master thought the region should be on a already dead 
> server
> -
>
> Key: HBASE-20864
> URL: https://issues.apache.org/jira/browse/HBASE-20864
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.0.0
>Reporter: Allan Yang
>Priority: Major
> Attachments: log.zip
>
>
> When I was running ITBLL with our internal 2.0.0 version (with 2.0.1 
> backported plus two other issues: HBASE-20706 and HBASE-20752), I found two 
> of my RSes killed by the master because the master had a different region 
> state from those RSes. It is very strange that the master thought these 
> regions should be on an already dead server. There might be a serious bug, 
> but I haven't found it yet. Here is the process:
> 1. e010125048153.bja,60020,1531137365840 crashed, and clearly 
> 4423e4182457c5b573729be4682cc3a3 was assigned to 
> e010125049164.bja,60020,1531136465378 during ServerCrashProcedure
> {code:java}
> 2018-07-09 20:03:32,443 INFO  [PEWorker-10] procedure.ServerCrashProcedure: 
> Start pid=2303, state=RUNNABLE:SERVER_CRASH_START; ServerCrashProcedure 
> server=e010125048153.bja,60020,1531137365840, splitWal=true, meta=false
> 2018-07-09 20:03:39,220 DEBUG 
> [RpcServer.default.FPBQ.Fifo.handler=294,queue=24,port=6] 
> assignment.RegionTransitionProcedure: Received report OPENED seqId=16021, 
> pid=2305, ppid=2303, state=RUNNABLE:REGION_TRANSITION_DISPATCH; 
> AssignProcedure table=IntegrationTestBigLinkedList, 
> region=4423e4182457c5b573729be4682cc3a3; rit=OPENING, 
> location=e010125049164.bja,60020,1531136465378
> 2018-07-09 20:03:39,220 INFO  [PEWorker-13] assignment.RegionStateStore: 
> pid=2305 updating hbase:meta row=4423e4182457c5b573729be4682cc3a3, 
> regionState=OPEN, openSeqNum=16021, 
> regionLocation=e010125049164.bja,60020,1531136465378
> 2018-07-09 20:03:43,190 INFO  [PEWorker-12] procedure2.ProcedureExecutor: 
> Finished pid=2303, state=SUCCESS; ServerCrashProcedure 
> server=e010125048153.bja,60020,1531137365840, splitWal=true, meta=false in 
> 10.7490sec
> {code}
> 2. A modify-table operation happened later; 4423e4182457c5b573729be4682cc3a3 
> was reopened on e010125049164.bja,60020,1531136465378
> {code:java}
> 2018-07-09 20:04:39,929 DEBUG 
> [RpcServer.default.FPBQ.Fifo.handler=295,queue=25,port=6] 
> assignment.RegionTransitionProcedure: Received report OPENED seqId=16024, 
> pid=2351, ppid=2314, state=RUNNABLE:REGION_TRANSITION_DISPATCH; 
> AssignProcedure table=IntegrationTestBigLinkedList, 
> region=4423e4182457c5b573729be4682cc3a3, 
> target=e010125049164.bja,60020,1531136465378; rit=OPENING, 
> location=e010125049164.bja,60020,1531136465378
> 2018-07-09 20:04:40,554 INFO  [PEWorker-6] assignment.RegionStateStore: 
> pid=2351 updating hbase:meta row=4423e4182457c5b573729be4682cc3a3, 
> regionState=OPEN, openSeqNum=16024, 
> regionLocation=e010125049164.bja,60020,1531136465378
> {code}
> 3. The active master was killed and the backup master took over, but when 
> loading meta entries, it clearly showed 4423e4182457c5b573729be4682cc3a3 on 
> the previous dead server e010125048153.bja,60020,1531137365840. That is 
> very, very strange!
> {code:java}
> 2018-07-09 20:06:17,985 INFO  [master/e010125048016:6] 
> assignment.RegionStateStore: Load hbase:meta entry 
> region=4423e4182457c5b573729be4682cc3a3, regionState=OPEN, 
> lastHost=e010125049164.bja,60020,1531136465378, 
> regionLocation=e010125048153.bja,60020,1531137365840, openSeqNum=16024
> {code}
> 4. The RS was killed:
> {code:java}
> 2018-07-09 20:06:20,265 WARN  
> [RpcServer.default.FPBQ.Fifo.handler=297,queue=27,port=6] 
> assignment.AssignmentManager: Killing e010125049164.bja,60020,1531136465378: 
> rit=OPEN, location=e010125048153.bja,60020,1531137365840, 
> table=IntegrationTestBigLinkedList, 
> region=4423e4182457c5b573729be4682cc3a3reported OPEN on 
> server=e010125049164.bja,60020,1531136465378 but state has otherwise.
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HBASE-20878) Data loss if merging regions while ServerCrashProcedure executing

2018-07-12 Thread Allan Yang (JIRA)
Allan Yang created HBASE-20878:
--

 Summary: Data loss if merging regions while ServerCrashProcedure 
executing
 Key: HBASE-20878
 URL: https://issues.apache.org/jira/browse/HBASE-20878
 Project: HBase
  Issue Type: Bug
  Components: amv2
Affects Versions: 2.0.1, 3.0.0, 2.1.0
Reporter: Allan Yang
Assignee: Allan Yang


In MergeTableRegionsProcedure, we close the regions to merge using 
UnassignProcedures. But if the RS these regions are on crashes, a 
ServerCrashProcedure will execute at the same time. The UnassignProcedures 
will be blocked until all logs are split. But since these regions are closed 
for merging, they won't be opened again, so the recovered.edits in the 
region dirs won't be replayed and data will be lost.
I provided a test to reproduce this case. I seriously suspect the split 
region procedure has the same kind of problem. I will check later.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20878) Data loss if merging regions while ServerCrashProcedure executing

2018-07-12 Thread Allan Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allan Yang updated HBASE-20878:
---
Status: Patch Available  (was: Open)

> Data loss if merging regions while ServerCrashProcedure executing
> -
>
> Key: HBASE-20878
> URL: https://issues.apache.org/jira/browse/HBASE-20878
> Project: HBase
>  Issue Type: Bug
>  Components: amv2
>Affects Versions: 2.0.1, 3.0.0, 2.1.0
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Critical
>
> In MergeTableRegionsProcedure, we close the regions to merge using 
> UnassignProcedures. But if the RS these regions are on crashes, a 
> ServerCrashProcedure will execute at the same time. The UnassignProcedures 
> will be blocked until all logs are split. But since these regions are closed 
> for merging, they won't be opened again, so the recovered.edits in the 
> region dirs won't be replayed and data will be lost.
> I provided a test to reproduce this case. I seriously suspect the split 
> region procedure has the same kind of problem. I will check later.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20878) Data loss if merging regions while ServerCrashProcedure executing

2018-07-12 Thread Allan Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allan Yang updated HBASE-20878:
---
Attachment: HBASE-20878.branch-2.0.001.patch

> Data loss if merging regions while ServerCrashProcedure executing
> -
>
> Key: HBASE-20878
> URL: https://issues.apache.org/jira/browse/HBASE-20878
> Project: HBase
>  Issue Type: Bug
>  Components: amv2
>Affects Versions: 3.0.0, 2.1.0, 2.0.1
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Critical
> Attachments: HBASE-20878.branch-2.0.001.patch
>
>
> In MergeTableRegionsProcedure, we close the regions to merge using 
> UnassignProcedures. But if the RS these regions are on crashes, a 
> ServerCrashProcedure will execute at the same time. The UnassignProcedures 
> will be blocked until all logs are split. But since these regions are closed 
> for merging, they won't be opened again, so the recovered.edits in the 
> region dirs won't be replayed and data will be lost.
> I provided a test to reproduce this case. I seriously suspect the split 
> region procedure has the same kind of problem. I will check later.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HBASE-20864) RS was killed due to master thought the region should be on a already dead server

2018-07-12 Thread Allan Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allan Yang reassigned HBASE-20864:
--

Assignee: Allan Yang

> RS was killed due to master thought the region should be on a already dead 
> server
> -
>
> Key: HBASE-20864
> URL: https://issues.apache.org/jira/browse/HBASE-20864
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.0.0
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Major
> Attachments: log.zip
>
>
> When I was running ITBLL with our internal 2.0.0 version (with 2.0.1 
> backported plus two other issues: HBASE-20706 and HBASE-20752), I found two 
> of my RSes killed by the master because the master had a different region 
> state from those RSes. It is very strange that the master thought these 
> regions should be on an already dead server. There might be a serious bug, 
> but I haven't found it yet. Here is the process:
> 1. e010125048153.bja,60020,1531137365840 crashed, and clearly 
> 4423e4182457c5b573729be4682cc3a3 was assigned to 
> e010125049164.bja,60020,1531136465378 during ServerCrashProcedure
> {code:java}
> 2018-07-09 20:03:32,443 INFO  [PEWorker-10] procedure.ServerCrashProcedure: 
> Start pid=2303, state=RUNNABLE:SERVER_CRASH_START; ServerCrashProcedure 
> server=e010125048153.bja,60020,1531137365840, splitWal=true, meta=false
> 2018-07-09 20:03:39,220 DEBUG 
> [RpcServer.default.FPBQ.Fifo.handler=294,queue=24,port=6] 
> assignment.RegionTransitionProcedure: Received report OPENED seqId=16021, 
> pid=2305, ppid=2303, state=RUNNABLE:REGION_TRANSITION_DISPATCH; 
> AssignProcedure table=IntegrationTestBigLinkedList, 
> region=4423e4182457c5b573729be4682cc3a3; rit=OPENING, 
> location=e010125049164.bja,60020,1531136465378
> 2018-07-09 20:03:39,220 INFO  [PEWorker-13] assignment.RegionStateStore: 
> pid=2305 updating hbase:meta row=4423e4182457c5b573729be4682cc3a3, 
> regionState=OPEN, openSeqNum=16021, 
> regionLocation=e010125049164.bja,60020,1531136465378
> 2018-07-09 20:03:43,190 INFO  [PEWorker-12] procedure2.ProcedureExecutor: 
> Finished pid=2303, state=SUCCESS; ServerCrashProcedure 
> server=e010125048153.bja,60020,1531137365840, splitWal=true, meta=false in 
> 10.7490sec
> {code}
> 2. A modify-table operation happened later; 4423e4182457c5b573729be4682cc3a3 
> was reopened on e010125049164.bja,60020,1531136465378
> {code:java}
> 2018-07-09 20:04:39,929 DEBUG 
> [RpcServer.default.FPBQ.Fifo.handler=295,queue=25,port=6] 
> assignment.RegionTransitionProcedure: Received report OPENED seqId=16024, 
> pid=2351, ppid=2314, state=RUNNABLE:REGION_TRANSITION_DISPATCH; 
> AssignProcedure table=IntegrationTestBigLinkedList, 
> region=4423e4182457c5b573729be4682cc3a3, 
> target=e010125049164.bja,60020,1531136465378; rit=OPENING, 
> location=e010125049164.bja,60020,1531136465378
> 2018-07-09 20:04:40,554 INFO  [PEWorker-6] assignment.RegionStateStore: 
> pid=2351 updating hbase:meta row=4423e4182457c5b573729be4682cc3a3, 
> regionState=OPEN, openSeqNum=16024, 
> regionLocation=e010125049164.bja,60020,1531136465378
> {code}
> 3. The active master was killed and the backup master took over, but when 
> loading meta entries, it clearly showed 4423e4182457c5b573729be4682cc3a3 on 
> the previous dead server e010125048153.bja,60020,1531137365840. That is 
> very, very strange!
> {code:java}
> 2018-07-09 20:06:17,985 INFO  [master/e010125048016:6] 
> assignment.RegionStateStore: Load hbase:meta entry 
> region=4423e4182457c5b573729be4682cc3a3, regionState=OPEN, 
> lastHost=e010125049164.bja,60020,1531136465378, 
> regionLocation=e010125048153.bja,60020,1531137365840, openSeqNum=16024
> {code}
> 4. The RS was killed:
> {code:java}
> 2018-07-09 20:06:20,265 WARN  
> [RpcServer.default.FPBQ.Fifo.handler=297,queue=27,port=6] 
> assignment.AssignmentManager: Killing e010125049164.bja,60020,1531136465378: 
> rit=OPEN, location=e010125048153.bja,60020,1531137365840, 
> table=IntegrationTestBigLinkedList, 
> region=4423e4182457c5b573729be4682cc3a3reported OPEN on 
> server=e010125049164.bja,60020,1531136465378 but state has otherwise.
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20878) Data loss if merging regions while ServerCrashProcedure executing

2018-07-12 Thread Allan Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allan Yang updated HBASE-20878:
---
Description: 
In MergeTableRegionsProcedure, we close the regions to merge using 
UnassignProcedures. But if the RS these regions are on crashes, a 
ServerCrashProcedure will execute at the same time. The UnassignProcedures 
will be blocked until all logs are split. But since these regions are closed 
for merging, they won't be opened again, so the recovered.edits in the 
region dirs won't be replayed and data will be lost.
I provided a test to reproduce this case. I seriously suspect the split 
region procedure has the same kind of problem. I will check later.

  was:
In MergeTableRegionsProcedure, we close the regions to merge using 
UnassignProcedures. But if the RS these regions are on crashes, a 
ServerCrashProcedure will execute at the same time. The UnassignProcedures 
will be blocks until all logs are split. But since these regions are closed 
for merging, they won't be opened again, so the recovered.edits in the 
region dirs won't be replayed and data will be lost.
I provided a test to reproduce this case. I seriously suspect the split 
region procedure has the same kind of problem. I will check later.


> Data loss if merging regions while ServerCrashProcedure executing
> -
>
> Key: HBASE-20878
> URL: https://issues.apache.org/jira/browse/HBASE-20878
> Project: HBase
>  Issue Type: Bug
>  Components: amv2
>Affects Versions: 3.0.0, 2.1.0, 2.0.1
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Critical
> Attachments: HBASE-20878.branch-2.0.001.patch
>
>
> In MergeTableRegionsProcedure, we close the regions to merge using 
> UnassignProcedures. But if the RS these regions are on crashes, a 
> ServerCrashProcedure will execute at the same time. The UnassignProcedures 
> will be blocked until all logs are split. But since these regions are closed 
> for merging, they won't be opened again, so the recovered.edits in the 
> region dirs won't be replayed and data will be lost.
> I provided a test to reproduce this case. I seriously suspect the split 
> region procedure has the same kind of problem. I will check later.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20864) RS was killed due to master thought the region should be on a already dead server

2018-07-12 Thread Duo Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16541612#comment-16541612
 ] 

Duo Zhang commented on HBASE-20864:
---

Not really. The problem for read-only tables is fixed by HBASE-20817; 
HBASE-20792 is a different problem.

> RS was killed due to master thought the region should be on a already dead 
> server
> -
>
> Key: HBASE-20864
> URL: https://issues.apache.org/jira/browse/HBASE-20864
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.0.0
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Major
> Attachments: log.zip
>
>
> When I was running ITBLL with our internal 2.0.0 version (with 2.0.1 
> backported plus two other issues: HBASE-20706 and HBASE-20752), I found two 
> of my RSes killed by the master because the master had a different region 
> state from those RSes. It is very strange that the master thought these 
> regions should be on an already dead server. There might be a serious bug, 
> but I haven't found it yet. Here is the process:
> 1. e010125048153.bja,60020,1531137365840 crashed, and clearly 
> 4423e4182457c5b573729be4682cc3a3 was assigned to 
> e010125049164.bja,60020,1531136465378 during ServerCrashProcedure
> {code:java}
> 2018-07-09 20:03:32,443 INFO  [PEWorker-10] procedure.ServerCrashProcedure: 
> Start pid=2303, state=RUNNABLE:SERVER_CRASH_START; ServerCrashProcedure 
> server=e010125048153.bja,60020,1531137365840, splitWal=true, meta=false
> 2018-07-09 20:03:39,220 DEBUG 
> [RpcServer.default.FPBQ.Fifo.handler=294,queue=24,port=6] 
> assignment.RegionTransitionProcedure: Received report OPENED seqId=16021, 
> pid=2305, ppid=2303, state=RUNNABLE:REGION_TRANSITION_DISPATCH; 
> AssignProcedure table=IntegrationTestBigLinkedList, 
> region=4423e4182457c5b573729be4682cc3a3; rit=OPENING, 
> location=e010125049164.bja,60020,1531136465378
> 2018-07-09 20:03:39,220 INFO  [PEWorker-13] assignment.RegionStateStore: 
> pid=2305 updating hbase:meta row=4423e4182457c5b573729be4682cc3a3, 
> regionState=OPEN, openSeqNum=16021, 
> regionLocation=e010125049164.bja,60020,1531136465378
> 2018-07-09 20:03:43,190 INFO  [PEWorker-12] procedure2.ProcedureExecutor: 
> Finished pid=2303, state=SUCCESS; ServerCrashProcedure 
> server=e010125048153.bja,60020,1531137365840, splitWal=true, meta=false in 
> 10.7490sec
> {code}
> 2. A modify-table operation happened later; 4423e4182457c5b573729be4682cc3a3 
> was reopened on e010125049164.bja,60020,1531136465378
> {code:java}
> 2018-07-09 20:04:39,929 DEBUG 
> [RpcServer.default.FPBQ.Fifo.handler=295,queue=25,port=6] 
> assignment.RegionTransitionProcedure: Received report OPENED seqId=16024, 
> pid=2351, ppid=2314, state=RUNNABLE:REGION_TRANSITION_DISPATCH; 
> AssignProcedure table=IntegrationTestBigLinkedList, 
> region=4423e4182457c5b573729be4682cc3a3, 
> target=e010125049164.bja,60020,1531136465378; rit=OPENING, 
> location=e010125049164.bja,60020,1531136465378
> 2018-07-09 20:04:40,554 INFO  [PEWorker-6] assignment.RegionStateStore: 
> pid=2351 updating hbase:meta row=4423e4182457c5b573729be4682cc3a3, 
> regionState=OPEN, openSeqNum=16024, 
> regionLocation=e010125049164.bja,60020,1531136465378
> {code}
> 3. The active master was killed and the backup master took over, but when 
> loading meta entries, it clearly showed 4423e4182457c5b573729be4682cc3a3 on 
> the previous dead server e010125048153.bja,60020,1531137365840. That is 
> very, very strange!
> {code:java}
> 2018-07-09 20:06:17,985 INFO  [master/e010125048016:6] 
> assignment.RegionStateStore: Load hbase:meta entry 
> region=4423e4182457c5b573729be4682cc3a3, regionState=OPEN, 
> lastHost=e010125049164.bja,60020,1531136465378, 
> regionLocation=e010125048153.bja,60020,1531137365840, openSeqNum=16024
> {code}
> 4. The RS was killed:
> {code:java}
> 2018-07-09 20:06:20,265 WARN  
> [RpcServer.default.FPBQ.Fifo.handler=297,queue=27,port=6] 
> assignment.AssignmentManager: Killing e010125049164.bja,60020,1531136465378: 
> rit=OPEN, location=e010125048153.bja,60020,1531137365840, 
> table=IntegrationTestBigLinkedList, 
> region=4423e4182457c5b573729be4682cc3a3reported OPEN on 
> server=e010125049164.bja,60020,1531136465378 but state has otherwise.
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20864) RS was killed due to master thought the region should be on a already dead server

2018-07-12 Thread Duo Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16541617#comment-16541617
 ] 

Duo Zhang commented on HBASE-20864:
---

See this comment

https://issues.apache.org/jira/browse/HBASE-20792?focusedCommentId=16524696&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16524696

> RS was killed due to master thought the region should be on a already dead 
> server
> -
>
> Key: HBASE-20864
> URL: https://issues.apache.org/jira/browse/HBASE-20864
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.0.0
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Major
> Attachments: log.zip
>
>
> When I was running ITBLL with our internal 2.0.0 version (with 2.0.1 
> backported plus two other issues: HBASE-20706 and HBASE-20752), I found two 
> of my RSes killed by the master because the master had a different region 
> state from those RSes. It is very strange that the master thought these 
> regions should be on an already dead server. There might be a serious bug, 
> but I haven't found it yet. Here is the process:
> 1. e010125048153.bja,60020,1531137365840 crashed, and clearly 
> 4423e4182457c5b573729be4682cc3a3 was assigned to 
> e010125049164.bja,60020,1531136465378 during ServerCrashProcedure
> {code:java}
> 2018-07-09 20:03:32,443 INFO  [PEWorker-10] procedure.ServerCrashProcedure: 
> Start pid=2303, state=RUNNABLE:SERVER_CRASH_START; ServerCrashProcedure 
> server=e010125048153.bja,60020,1531137365840, splitWal=true, meta=false
> 2018-07-09 20:03:39,220 DEBUG 
> [RpcServer.default.FPBQ.Fifo.handler=294,queue=24,port=6] 
> assignment.RegionTransitionProcedure: Received report OPENED seqId=16021, 
> pid=2305, ppid=2303, state=RUNNABLE:REGION_TRANSITION_DISPATCH; 
> AssignProcedure table=IntegrationTestBigLinkedList, 
> region=4423e4182457c5b573729be4682cc3a3; rit=OPENING, 
> location=e010125049164.bja,60020,1531136465378
> 2018-07-09 20:03:39,220 INFO  [PEWorker-13] assignment.RegionStateStore: 
> pid=2305 updating hbase:meta row=4423e4182457c5b573729be4682cc3a3, 
> regionState=OPEN, openSeqNum=16021, 
> regionLocation=e010125049164.bja,60020,1531136465378
> 2018-07-09 20:03:43,190 INFO  [PEWorker-12] procedure2.ProcedureExecutor: 
> Finished pid=2303, state=SUCCESS; ServerCrashProcedure 
> server=e010125048153.bja,60020,1531137365840, splitWal=true, meta=false in 
> 10.7490sec
> {code}
> 2. A modify-table operation happened later; 4423e4182457c5b573729be4682cc3a3 
> was reopened on e010125049164.bja,60020,1531136465378
> {code:java}
> 2018-07-09 20:04:39,929 DEBUG 
> [RpcServer.default.FPBQ.Fifo.handler=295,queue=25,port=6] 
> assignment.RegionTransitionProcedure: Received report OPENED seqId=16024, 
> pid=2351, ppid=2314, state=RUNNABLE:REGION_TRANSITION_DISPATCH; 
> AssignProcedure table=IntegrationTestBigLinkedList, 
> region=4423e4182457c5b573729be4682cc3a3, 
> target=e010125049164.bja,60020,1531136465378; rit=OPENING, 
> location=e010125049164.bja,60020,1531136465378
> 2018-07-09 20:04:40,554 INFO  [PEWorker-6] assignment.RegionStateStore: 
> pid=2351 updating hbase:meta row=4423e4182457c5b573729be4682cc3a3, 
> regionState=OPEN, openSeqNum=16024, 
> regionLocation=e010125049164.bja,60020,1531136465378
> {code}
> 3. The active master was killed and the backup master took over, but when 
> loading meta entries, it clearly showed 4423e4182457c5b573729be4682cc3a3 on 
> the previous dead server e010125048153.bja,60020,1531137365840. That is 
> very, very strange!
> {code:java}
> 2018-07-09 20:06:17,985 INFO  [master/e010125048016:6] 
> assignment.RegionStateStore: Load hbase:meta entry 
> region=4423e4182457c5b573729be4682cc3a3, regionState=OPEN, 
> lastHost=e010125049164.bja,60020,1531136465378, 
> regionLocation=e010125048153.bja,60020,1531137365840, openSeqNum=16024
> {code}
> 4. The RS was killed:
> {code:java}
> 2018-07-09 20:06:20,265 WARN  
> [RpcServer.default.FPBQ.Fifo.handler=297,queue=27,port=6] 
> assignment.AssignmentManager: Killing e010125049164.bja,60020,1531136465378: 
> rit=OPEN, location=e010125048153.bja,60020,1531137365840, 
> table=IntegrationTestBigLinkedList, 
> region=4423e4182457c5b573729be4682cc3a3reported OPEN on 
> server=e010125049164.bja,60020,1531136465378 but state has otherwise.
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20878) Data loss if merging regions while ServerCrashProcedure executing

2018-07-12 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16541622#comment-16541622
 ] 

stack commented on HBASE-20878:
---

Makes sense. Good find. This has been a problem with all versions of merge? The 
child region should look for recovered edits in parent regions on open?

> Data loss if merging regions while ServerCrashProcedure executing
> -
>
> Key: HBASE-20878
> URL: https://issues.apache.org/jira/browse/HBASE-20878
> Project: HBase
>  Issue Type: Bug
>  Components: amv2
>Affects Versions: 3.0.0, 2.1.0, 2.0.1
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Critical
> Attachments: HBASE-20878.branch-2.0.001.patch
>
>
> In MergeTableRegionsProcedure, we close the regions to merge using 
> UnassignProcedures. But if the RS these regions are on crashes, a 
> ServerCrashProcedure will execute at the same time. The UnassignProcedures 
> will be blocked until all logs are split. But since these regions are closed 
> for merging, they won't be opened again, so the recovered.edits in the 
> region dirs won't be replayed and data will be lost.
> I provided a test to reproduce this case. I seriously suspect the split 
> region procedure has the same kind of problem. I will check later.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20878) Data loss if merging regions while ServerCrashProcedure executing

2018-07-12 Thread Duo Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16541630#comment-16541630
 ] 

Duo Zhang commented on HBASE-20878:
---

Yes, I think this is a problem. We could add a check in MTRP: if there are 
recovered.edits for any of the regions we want to merge, we just fail the 
MTRP and bring the regions online again. Or we could add something like a 
ReopenTableRegionsProcedure, jump back to an earlier state, and try again 
until we can close the regions normally.
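
A rough, hedged sketch of the first option (the plumbing and placement are illustrative, not the attached patch; WALSplitter.getSplitEditFilesSorted is the existing helper that lists recovered.edits files):
{code:java}
// Illustrative only: before merging, fail fast if any region to merge still
// has recovered.edits on disk that have not been replayed.
private void checkNoRecoveredEdits(MasterProcedureEnv env, RegionInfo region)
    throws IOException {
  MasterFileSystem mfs = env.getMasterServices().getMasterFileSystem();
  FileSystem fs = mfs.getFileSystem();
  Path tableDir = FSUtils.getTableDir(mfs.getRootDir(), region.getTable());
  Path regionDir = new Path(tableDir, region.getEncodedName());
  NavigableSet<Path> edits = WALSplitter.getSplitEditFilesSorted(fs, regionDir);
  if (edits != null && !edits.isEmpty()) {
    throw new IOException("Region " + region.getEncodedName()
        + " still has unreplayed recovered.edits; aborting the merge");
  }
}
{code}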

> Data loss if merging regions while ServerCrashProcedure executing
> -
>
> Key: HBASE-20878
> URL: https://issues.apache.org/jira/browse/HBASE-20878
> Project: HBase
>  Issue Type: Bug
>  Components: amv2
>Affects Versions: 3.0.0, 2.1.0, 2.0.1
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Critical
> Attachments: HBASE-20878.branch-2.0.001.patch
>
>
> In MergeTableRegionsProcedure, we close the regions to merge using 
> UnassignProcedures. But if the RS these regions are on crashes, a 
> ServerCrashProcedure will execute at the same time. The UnassignProcedures 
> will be blocked until all logs are split. But since these regions are closed 
> for merging, they won't be opened again, so the recovered.edits in the 
> region dirs won't be replayed and data will be lost.
> I provided a test to reproduce this case. I seriously suspect the split 
> region procedure has the same kind of problem. I will check later.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20866) HBase 1.x scan performance degradation compared to 0.98 version

2018-07-12 Thread Vikas Vishwakarma (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16541632#comment-16541632
 ] 

Vikas Vishwakarma commented on HBASE-20866:
---

These test failures and checkstyle warnings in the server unit tests don't 
look related to my change. I will check, run the server tests locally, and 
then resubmit the patch and ReviewBoard link.

> HBase 1.x scan performance degradation compared to 0.98 version
> ---
>
> Key: HBASE-20866
> URL: https://issues.apache.org/jira/browse/HBASE-20866
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.3.2
>Reporter: Vikas Vishwakarma
>Assignee: Vikas Vishwakarma
>Priority: Critical
> Fix For: 1.5.0, 1.2.7, 1.3.3, 1.4.6
>
> Attachments: HBASE-20866.branch-1.3.001.patch, 
> HBASE-20866.branch-1.3.002.patch, HBASE-20866.branch-1.3.003.patch
>
>
> Internally, while testing 1.3 as part of migrating from 0.98 to 1.3, we 
> observed degradation in scan performance for Phoenix queries varying from a 
> few tens of percent up to 200%, depending on the query being executed. We 
> also tried a simple native HBase scan and saw up to 40% degradation in 
> performance when the number of column qualifiers is high (40-50+).
> To identify the root cause of the performance difference between 0.98 and 
> 1.3, we carried out a lot of experiments with profiling and git-bisect 
> iterations. However, we were not able to identify any single source of the 
> scan performance degradation; it looked like an accumulated degradation of 
> 5-10% over various enhancements and refactorings.
> We identified a few major enhancements, such as partialResult handling, 
> ScannerContext with heartbeat processing, time/size limiting, and the RPC 
> refactoring, that could each have contributed a small degradation which, 
> put together, could lead to the large overall degradation.
> One of the changes is 
> [HBASE-11544|https://jira.apache.org/jira/browse/HBASE-11544], which 
> implements partialResult handling. In ClientScanner.java, the results 
> received from the server are cached on the client side by converting the 
> result array into an ArrayList. This function gets called in a loop, once 
> per batch of rows in the scan result; for tens of millions of rows scanned, 
> it can be called on the order of millions of times.
> In almost all cases (except for handling partial results, etc.), we are 
> just taking resultsFromServer, converting it into an ArrayList 
> resultsToAddToCache in addResultsToList(..), and then iterating over the 
> list again and adding it to the cache in loadCache(..), as in the code path 
> below:
> In ClientScanner → loadCache(..) → getResultsToAddToCache(..) → 
> addResultsToList(..) →
> {code:java}
> loadCache() {
> ...
>   List<Result> resultsToAddToCache =
>       getResultsToAddToCache(values, callable.isHeartbeatMessage());
> ...
>   for (Result rs : resultsToAddToCache) {
>     rs = filterLoadedCell(rs);
>     cache.add(rs);
> ...
>   }
> }
> getResultsToAddToCache(..) {
> ..
>   final boolean isBatchSet = scan != null && scan.getBatch() > 0;
>   final boolean allowPartials = scan != null &&
>       scan.getAllowPartialResults();
> ..
>   if (allowPartials || isBatchSet) {
>     addResultsToList(resultsToAddToCache, resultsFromServer, 0,
>         (null == resultsFromServer ? 0 : resultsFromServer.length));
>     return resultsToAddToCache;
>   }
> ...
> }
> private void addResultsToList(List<Result> outputList, Result[] inputArray,
>     int start, int end) {
>   if (inputArray == null || start < 0 || end > inputArray.length) return;
>   for (int i = start; i < end; i++) {
>     outputList.add(inputArray[i]);
>   }
> }{code}
>  
> It looks like we can avoid the result-array-to-ArrayList conversion 
> (resultsFromServer --> resultsToAddToCache) for the first case, which is 
> also the most frequent one, and instead directly take the values array 
> returned by the callable and add it to the cache without converting it into 
> an ArrayList.
> I have hoisted both flags, allowPartials and isBatchSet, into loadCache(), 
> and I add values directly to the scanner cache when the condition above 
> passes, instead of converting them into an ArrayList by calling 
> getResultsToAddToCache(). For example:
> {code:java}
> protected void loadCache() throws IOException {
>   Result[] values = null;
> ..
>   final boolean isBatchSet = scan != null && scan.getBatch() > 0;
>   final boolean allowPartials = scan != null && scan.getAllowPartialResults();
> ..
>   for (;;) {
>     try {
>       values = call(callable, caller, scannerTimeout);
> ..
>     } catch (DoNotRetryIOException | NeedUnmanagedConnectionException e) {
> ..
>     }
>     if (allowPartials || isBatchSet) {  // DIRECTLY COPY values 
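
The quoted snippet is truncated in the archive. A hedged sketch of how that branch could continue under the description's idea (illustrative, not the exact patch): add the server's results straight to the scanner cache, skipping the intermediate ArrayList:
{code:java}
// Hedged continuation sketch: in the common case, copy the Result[] from the
// server directly into the client-side cache, with no intermediate list.
if (allowPartials || isBatchSet) {
  if (values != null) {
    for (Result rs : values) {
      rs = filterLoadedCell(rs);  // may drop cells already delivered
      if (rs != null) {
        cache.add(rs);
      }
    }
  }
} else {
  // partial-result stitching still goes through the old path
  for (Result rs : getResultsToAddToCache(values, callable.isHeartbeatMessage())) {
    rs = filterLoadedCell(rs);
    if (rs != null) {
      cache.add(rs);
    }
  }
}
{code}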

[jira] [Updated] (HBASE-20865) CreateTableProcedure is stuck in retry loop in CREATE_TABLE_WRITE_FS_LAYOUT state

2018-07-12 Thread Duo Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang updated HBASE-20865:
--
Fix Version/s: 2.1.1
   2.2.0
   2.0.2
   3.0.0

> CreateTableProcedure is stuck in retry loop in CREATE_TABLE_WRITE_FS_LAYOUT 
> state
> -
>
> Key: HBASE-20865
> URL: https://issues.apache.org/jira/browse/HBASE-20865
> Project: HBase
>  Issue Type: Bug
>  Components: amv2
>Reporter: Toshihiro Suzuki
>Assignee: Toshihiro Suzuki
>Priority: Major
> Fix For: 3.0.0, 2.0.2, 2.2.0, 2.1.1
>
> Attachments: HBASE-20865.master.001.patch
>
>
> Similar to HBASE-20616, CreateTableProcedure gets stuck in a retry loop in 
> the CREATE_TABLE_WRITE_FS_LAYOUT state when writing to HDFS fails.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20616) TruncateTableProcedure is stuck in retry loop in TRUNCATE_TABLE_CREATE_FS_LAYOUT state

2018-07-12 Thread Duo Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16541641#comment-16541641
 ] 

Duo Zhang commented on HBASE-20616:
---

This one has not been committed to branch-2.0?

> TruncateTableProcedure is stuck in retry loop in 
> TRUNCATE_TABLE_CREATE_FS_LAYOUT state
> --
>
> Key: HBASE-20616
> URL: https://issues.apache.org/jira/browse/HBASE-20616
> Project: HBase
>  Issue Type: Bug
>  Components: amv2
> Environment: HDP-2.5.3
>Reporter: Toshihiro Suzuki
>Assignee: Toshihiro Suzuki
>Priority: Major
> Fix For: 2.1.0
>
> Attachments: 20616.master.004.patch, HBASE-20616.master.001.patch, 
> HBASE-20616.master.002.patch, HBASE-20616.master.003.patch, 
> HBASE-20616.master.004.patch
>
>
> At first, TruncateTableProcedure failed to write some files to HDFS in the 
> TRUNCATE_TABLE_CREATE_FS_LAYOUT state for some reason.
> {code:java}
> 2018-05-15 08:00:25,346 WARN  [ProcedureExecutorThread-8] 
> procedure.TruncateTableProcedure: Retriable error trying to truncate 
> table=<table name>: state=TRUNCATE_TABLE_CREATE_FS_LAYOUT
> java.io.IOException: java.util.concurrent.ExecutionException: 
> org.apache.hadoop.ipc.RemoteException(java.io.IOException): File 
> /apps/hbase/data/.tmp/data/<namespace>/<table>/<region>/.regioninfo could 
> only be replicated to 0 nodes instead of minReplication (=1). There are 
> <number of DNs> datanode(s) running and no node(s) are excluded in this 
> operation.
> ...
> {code}
> But at this time, it seemed that some of the files had been written to HDFS 
> successfully.
> And then, TruncateTableProcedure was stuck in a retry loop in the 
> TRUNCATE_TABLE_CREATE_FS_LAYOUT state. At this point, the following log 
> messages were shown repeatedly in the master log:
> {code:java}
> 2018-05-15 08:00:25,463 WARN  [ProcedureExecutorThread-8] 
> procedure.TruncateTableProcedure: Retriable error trying to truncate 
> table=<table name>: state=TRUNCATE_TABLE_CREATE_FS_LAYOUT
> java.io.IOException: java.util.concurrent.ExecutionException: 
> java.io.IOException: The specified region already exists on disk: 
> hdfs://<namenode>/apps/hbase/data/.tmp/data/<namespace>/<table>/<region>
> ...
> {code}
> It seems this is because TruncateTableProcedure tried to re-write the files 
> that had already been written successfully in the first try.
> I think we need to delete all the files and directories that were written 
> successfully in the previous try before retrying the 
> TRUNCATE_TABLE_CREATE_FS_LAYOUT state.
> Actually, this issue was observed on HDP-2.5.3, but I think upstream has 
> the same issue. Also, it looks to me like CreateTableProcedure has a 
> similar issue.
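
A minimal sketch of that cleanup idea, hedged (the placement inside the procedure is illustrative; FSUtils.getTableDir and HConstants.HBASE_TEMP_DIRECTORY are the existing helpers):
{code:java}
// Illustrative only: make the FS-layout step idempotent by deleting any
// partial output from a previous attempt before writing again.
Path tempDir = new Path(mfs.getRootDir(), HConstants.HBASE_TEMP_DIRECTORY);
Path tempTableDir = FSUtils.getTableDir(tempDir, tableName);
FileSystem fs = mfs.getFileSystem();
if (fs.exists(tempTableDir) && !fs.delete(tempTableDir, true)) {
  throw new IOException("Couldn't delete partial layout in " + tempTableDir
      + " before retrying");
}
// ...then re-create the region dirs under tempTableDir as before
{code}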



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20847) The parent procedure of RegionTransitionProcedure may not have the table lock

2018-07-12 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16541646#comment-16541646
 ] 

Hudson commented on HBASE-20847:


Results for branch master
[build #394 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/master/394/]: (x) 
*{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/394//General_Nightly_Build_Report/]




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/394//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/394//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> The parent procedure of RegionTransitionProcedure may not have the table lock
> -
>
> Key: HBASE-20847
> URL: https://issues.apache.org/jira/browse/HBASE-20847
> Project: HBase
>  Issue Type: Sub-task
>  Components: proc-v2, Region Assignment
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 3.0.0, 2.2.0, 2.1.1
>
> Attachments: HBASE-20847-addendum.patch, 
> HBASE-20847-branch-2.0-v1.patch, HBASE-20847-branch-2.0.patch, 
> HBASE-20847-v1.patch, HBASE-20847-v2.patch, HBASE-20847-v3.patch, 
> HBASE-20847.patch
>
>
> For example, SCP can also schedule AssignProcedure and obviously it will not 
> hold the table lock.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20859) Backup and incremental load could fail in secure clusters

2018-07-12 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16541648#comment-16541648
 ] 

Hudson commented on HBASE-20859:


Results for branch master
[build #394 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/master/394/]: (x) 
*{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/394//General_Nightly_Build_Report/]




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/394//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/394//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> Backup and incremental load could fail in secure clusters
> -
>
> Key: HBASE-20859
> URL: https://issues.apache.org/jira/browse/HBASE-20859
> Project: HBase
>  Issue Type: Bug
>  Components: backup&restore
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HBASE-20859.master.001.patch, 
> HBASE-20859.master.002.patch, HBASE-20859.master.003.patch
>
>
> HBase backup and incremental load use 
> HConstants.DEFAULT_TEMPORARY_HDFS_DIRECTORY as their temporary path.
> HConstants.DEFAULT_TEMPORARY_HDFS_DIRECTORY uses the Java runtime user name 
> to generate a temporary path on HDFS. This is a wrong assumption in a 
> secure cluster, where the Kerberos principal name can differ from the 
> system user name.
> {code:java}
> public static final String DEFAULT_TEMPORARY_HDFS_DIRECTORY = "/user/"
>   + System.getProperty("user.name") + "/hbase-staging";
> {code}
> This constant is used in BackupUtils.java and HFileOutputFormat2.java.
> In such cases, you will not be able to write files to the temporary 
> location on HDFS due to permission errors, and therefore operations such as 
> backup will fail.
> This bug is similar in nature to HDFS-12485.
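
One hedged direction for a fix (not necessarily the committed patch): derive the staging directory from the Kerberos-aware UserGroupInformation instead of the user.name system property:
{code:java}
// Hedged sketch: UserGroupInformation.getCurrentUser() returns the Kerberos
// short name in a secure cluster, matching the HDFS home directory owner.
import java.io.IOException;
import org.apache.hadoop.security.UserGroupInformation;

public final class StagingDir {
  static String defaultTemporaryHdfsDirectory() throws IOException {
    String user = UserGroupInformation.getCurrentUser().getShortUserName();
    return "/user/" + user + "/hbase-staging";
  }
}
{code}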



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20865) CreateTableProcedure is stuck in retry loop in CREATE_TABLE_WRITE_FS_LAYOUT state

2018-07-12 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16541651#comment-16541651
 ] 

Hudson commented on HBASE-20865:


Results for branch master
[build #394 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/master/394/]: (x) 
*{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/394//General_Nightly_Build_Report/]




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/394//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/394//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> CreateTableProcedure is stuck in retry loop in CREATE_TABLE_WRITE_FS_LAYOUT 
> state
> -
>
> Key: HBASE-20865
> URL: https://issues.apache.org/jira/browse/HBASE-20865
> Project: HBase
>  Issue Type: Bug
>  Components: amv2
>Reporter: Toshihiro Suzuki
>Assignee: Toshihiro Suzuki
>Priority: Major
> Fix For: 3.0.0, 2.0.2, 2.2.0, 2.1.1
>
> Attachments: HBASE-20865.master.001.patch
>
>
> Similar to HBASE-20616, CreateTableProcedure gets stuck in a retry loop in 
> the CREATE_TABLE_WRITE_FS_LAYOUT state when writing to HDFS fails.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20838) Include hbase-server in precommit test if CommonFSUtils is changed

2018-07-12 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16541650#comment-16541650
 ] 

Hudson commented on HBASE-20838:


Results for branch master
[build #394 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/master/394/]: (x) 
*{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/394//General_Nightly_Build_Report/]




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/394//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/394//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> Include hbase-server in precommit test if CommonFSUtils is changed
> --
>
> Key: HBASE-20838
> URL: https://issues.apache.org/jira/browse/HBASE-20838
> Project: HBase
>  Issue Type: Test
>Reporter: Yu Li
>Assignee: Yu Li
>Priority: Major
> Fix For: 3.0.0, 2.2.0, 2.1.1
>
> Attachments: HBASE-20838.patch, HBASE-20838.patch, 
> HBASE-20838.v2.patch
>
>
> -As per the 
> [discussion|https://issues.apache.org/jira/browse/HBASE-20691?focusedCommentId=16517662&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16517662]
>  in HBASE-20691, since the setStoragePolicy code is in CommonFSUtils, the 
> test should be in TestCommonFSUtils-
> We don't want to introduce a dependency on hadoop-hdfs in hbase-common, so 
> we decided to leave the setStoragePolicy-related tests in TestCommonFSUtils. 
> Instead, we will change the personality script to include hbase-server in 
> the unit tests if any change is made to {{CommonFSUtils}}.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20617) Upgrade/remove jetty-jsp

2018-07-12 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16541647#comment-16541647
 ] 

Hudson commented on HBASE-20617:


Results for branch master
[build #394 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/master/394/]: (x) 
*{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/394//General_Nightly_Build_Report/]




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/394//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/394//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> Upgrade/remove jetty-jsp
> 
>
> Key: HBASE-20617
> URL: https://issues.apache.org/jira/browse/HBASE-20617
> Project: HBase
>  Issue Type: Improvement
>Reporter: Sakthi
>Assignee: Sakthi
>Priority: Minor
> Fix For: 3.0.0, 2.2.0
>
> Attachments: hbase-20617.master.001.patch
>
>
> jetty-jsp was removed after the jetty-9.2.x line, and we use the 9.2 
> version. Research so far suggests that apache-jsp from jetty-9.4.x might be 
> of interest to us (JettyJspServlet.class lives in apache-jsp). Still to 
> figure out the situation with jetty-9.3.x.
> Filing this to track it.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20860) Merged region's RIT state may not be cleaned after master restart

2018-07-12 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16541649#comment-16541649
 ] 

Hudson commented on HBASE-20860:


Results for branch master
[build #394 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/master/394/]: (x) 
*{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/394//General_Nightly_Build_Report/]




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/394//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/394//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> Merged region's RIT state may not be cleaned after master restart
> -
>
> Key: HBASE-20860
> URL: https://issues.apache.org/jira/browse/HBASE-20860
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 3.0.0, 2.1.0, 2.0.1
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Major
> Fix For: 3.0.0, 2.0.2, 2.1.1
>
> Attachments: HBASE-20860.branch-2.0.002.patch, 
> HBASE-20860.branch-2.0.003.patch, HBASE-20860.branch-2.0.004.patch, 
> HBASE-20860.branch-2.0.005.patch, HBASE-20860.branch-2.0.patch
>
>
> In MergeTableRegionsProcedure, we issue UnassignProcedures to offline the 
> regions to merge. But if we restart the master just after 
> MergeTableRegionsProcedure has finished these two UnassignProcedures and 
> before it can delete their meta entries, the new master will find these two 
> regions CLOSED with no procedures attached to them. They will be regarded 
> as RIT regions, and nobody will clean up their RIT state later.
> A quick way to resolve this stuck situation in a production environment is 
> to restart the master again, since the meta entries are deleted in 
> MergeTableRegionsProcedure. Here, I offer a fix for this problem.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20870) Wrong HBase root dir in ITBLL's Search Tool

2018-07-12 Thread Mike Drob (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16541710#comment-16541710
 ] 

Mike Drob commented on HBASE-20870:
---

It wasn't committed because for the tests that we were doing, it didn't seem 
necessary. Maybe Stack and I had something in our environment that made this 
line unnecessary. Do you think using that approach would work for your case? It 
seems like it will be less surprising to future maintainers, and will pick up 
configurations from the original conf which may be important.

> Wrong HBase root dir in ITBLL's Search Tool
> ---
>
> Key: HBASE-20870
> URL: https://issues.apache.org/jira/browse/HBASE-20870
> Project: HBase
>  Issue Type: Bug
>  Components: integration tests
>Affects Versions: 3.0.0, 2.1.0, 2.0.1
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Minor
> Attachments: HBASE-20870.branch-2.0.001.patch
>
>
> When using IntegrationTestBigLinkedList's Search tool, it always fails 
> because it tries to read WALs from the wrong HBase root dir. It turned out 
> that when IntegrationTestingUtility is initialized in 
> IntegrationTestBigLinkedList, its super class HBaseTestingUtility changes 
> hbase.rootdir to a random local dir. That is fine when HBaseTestingUtility 
> backs the minicluster, but for integration tests run on distributed 
> clusters we should change it back.
> Here is the error info:
> {code:java}
> 2018-07-11 16:35:49,679 DEBUG [main] hbase.HBaseCommonTestingUtility: Setting 
> hbase.rootdir to 
> /home/hadoop/target/test-data/deb67611-2737-4696-abe9-32a7783df7bb
> 2018-07-11 16:35:50,736 ERROR [main] util.AbstractHBaseTool: Error running 
> command-line tool java.io.FileNotFoundException: File 
> file:/home/hadoop/target/test-data/deb67611-2737-4696-abe9-32a7783df7bb/WALs 
> does not exist
> at 
> org.apache.hadoop.fs.RawLocalFileSystem.listStatus(RawLocalFileSystem.java:431)
> at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1517)
> {code}
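
A hedged sketch of the "change it back" idea (illustrative; the attached patch may do this differently): after the testing utility is initialized for a distributed run, restore hbase.rootdir from the original configuration:
{code:java}
// Illustrative only: put back the cluster's real root dir, which
// HBaseTestingUtility replaced with a random local path.
Configuration original = HBaseConfiguration.create();  // re-reads hbase-site.xml
String realRootDir = original.get(HConstants.HBASE_DIR);
if (realRootDir != null) {
  util.getConfiguration().set(HConstants.HBASE_DIR, realRootDir);
}
{code}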



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20616) TruncateTableProcedure is stuck in retry loop in TRUNCATE_TABLE_CREATE_FS_LAYOUT state

2018-07-12 Thread Toshihiro Suzuki (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16541717#comment-16541717
 ] 

Toshihiro Suzuki commented on HBASE-20616:
--

[~Apache9] No, looks like it hasn't been committed to branch-2.0.

> TruncateTableProcedure is stuck in retry loop in 
> TRUNCATE_TABLE_CREATE_FS_LAYOUT state
> --
>
> Key: HBASE-20616
> URL: https://issues.apache.org/jira/browse/HBASE-20616
> Project: HBase
>  Issue Type: Bug
>  Components: amv2
> Environment: HDP-2.5.3
>Reporter: Toshihiro Suzuki
>Assignee: Toshihiro Suzuki
>Priority: Major
> Fix For: 2.1.0
>
> Attachments: 20616.master.004.patch, HBASE-20616.master.001.patch, 
> HBASE-20616.master.002.patch, HBASE-20616.master.003.patch, 
> HBASE-20616.master.004.patch
>
>
> At first, TruncateTableProcedure failed to write some files to HDFS in the 
> TRUNCATE_TABLE_CREATE_FS_LAYOUT state for some reason.
> {code:java}
> 2018-05-15 08:00:25,346 WARN  [ProcedureExecutorThread-8] 
> procedure.TruncateTableProcedure: Retriable error trying to truncate 
> table=: state=TRUNCATE_TABLE_CREATE_FS_LAYOUT
> java.io.IOException: java.util.concurrent.ExecutionException: 
> org.apache.hadoop.ipc.RemoteException(java.io.IOException): File 
> /apps/hbase/data/.tmp/data.regioninfo could 
> only be replicated to 0 nodes instead of minReplication (=1).  There are <number of DNs> datanode(s) running and no node(s) are excluded in this 
> operation.
> ...
> {code}
> But at this time, it seemed like writing some files to HDFS had been successful.
> And then, TruncateTableProcedure was stuck in retry loop in 
> TRUNCATE_TABLE_CREATE_FS_LAYOUT state. At this point, the following log 
> messages were shown repeatedly in the master log:
> {code:java}
> 2018-05-15 08:00:25,463 WARN  [ProcedureExecutorThread-8] 
> procedure.TruncateTableProcedure: Retriable error trying to truncate 
> table=: state=TRUNCATE_TABLE_CREATE_FS_LAYOUT
> java.io.IOException: java.util.concurrent.ExecutionException: 
> java.io.IOException: The specified region already exists on disk: 
> hdfs:///apps/hbase/data/.tmp/data///
> ...
> {code}
> It seems like this is because TruncateTableProcedure tried to write the files 
> that were written successfully in the first try.
> I think we need to delete all the files and directories that are written 
> successfully in the previous try before retrying the 
> TRUNCATE_TABLE_CREATE_FS_LAYOUT state.
> Actually, this issue was observed in HDP-2.5.3, but I think upstream has 
> the same issue. Also, it looks to me like CreateTableProcedure has a similar 
> issue.
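
A minimal sketch of the cleanup the description calls for, assuming the standard FSUtils/HConstants helpers; this illustrates the idea and is not the committed patch:
{code:java}
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HConstants;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.util.FSUtils;

// Hedged sketch: before retrying TRUNCATE_TABLE_CREATE_FS_LAYOUT, remove any
// partially written table layout under the temp dir so the retry starts clean.
static void wipeTempTableDir(Configuration conf, Path rootDir, TableName table)
    throws IOException {
  Path tempTableDir =
      FSUtils.getTableDir(new Path(rootDir, HConstants.HBASE_TEMP_DIRECTORY), table);
  FileSystem fs = tempTableDir.getFileSystem(conf);
  if (fs.exists(tempTableDir) && !fs.delete(tempTableDir, true)) {
    throw new IOException("Couldn't delete " + tempTableDir + " before retrying");
  }
}
{code}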



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20616) TruncateTableProcedure is stuck in retry loop in TRUNCATE_TABLE_CREATE_FS_LAYOUT state

2018-07-12 Thread Toshihiro Suzuki (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16541730#comment-16541730
 ] 

Toshihiro Suzuki commented on HBASE-20616:
--

It seems like we can apply the last patch in this Jira to branch-2.0 without 
any changes.

> TruncateTableProcedure is stuck in retry loop in 
> TRUNCATE_TABLE_CREATE_FS_LAYOUT state
> --
>
> Key: HBASE-20616
> URL: https://issues.apache.org/jira/browse/HBASE-20616
> Project: HBase
>  Issue Type: Bug
>  Components: amv2
> Environment: HDP-2.5.3
>Reporter: Toshihiro Suzuki
>Assignee: Toshihiro Suzuki
>Priority: Major
> Fix For: 2.1.0
>
> Attachments: 20616.master.004.patch, HBASE-20616.master.001.patch, 
> HBASE-20616.master.002.patch, HBASE-20616.master.003.patch, 
> HBASE-20616.master.004.patch
>
>
> At first, TruncateTableProcedure failed to write some files to HDFS in 
> TRUNCATE_TABLE_CREATE_FS_LAYOUT state for some reason.
> {code:java}
> 2018-05-15 08:00:25,346 WARN  [ProcedureExecutorThread-8] 
> procedure.TruncateTableProcedure: Retriable error trying to truncate 
> table=: state=TRUNCATE_TABLE_CREATE_FS_LAYOUT
> java.io.IOException: java.util.concurrent.ExecutionException: 
> org.apache.hadoop.ipc.RemoteException(java.io.IOException): File 
> /apps/hbase/data/.tmp/data.regioninfo could 
> only be replicated to 0 nodes instead of minReplication (=1).  There are <number of DNs> datanode(s) running and no node(s) are excluded in this 
> operation.
> ...
> {code}
> But at this time, it seemed like writing some files to HDFS had been successful.
> And then, TruncateTableProcedure was stuck in retry loop in 
> TRUNCATE_TABLE_CREATE_FS_LAYOUT state. At this point, the following log 
> messages were shown repeatedly in the master log:
> {code:java}
> 2018-05-15 08:00:25,463 WARN  [ProcedureExecutorThread-8] 
> procedure.TruncateTableProcedure: Retriable error trying to truncate 
> table=: state=TRUNCATE_TABLE_CREATE_FS_LAYOUT
> java.io.IOException: java.util.concurrent.ExecutionException: 
> java.io.IOException: The specified region already exists on disk: 
> hdfs:///apps/hbase/data/.tmp/data///
> ...
> {code}
> It seems like this is because TruncateTableProcedure tried to write the files 
> that were written successfully in the first try.
> I think we need to delete all the files and directories that are written 
> successfully in the previous try before retrying the 
> TRUNCATE_TABLE_CREATE_FS_LAYOUT state.
> Actually, this issue was observed in HDP-2.5.3, but I think upstream has 
> the same issue. Also, it looks to me like CreateTableProcedure has a similar 
> issue.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-19572) RegionMover should use the configured default port number and not the one from HConstants

2018-07-12 Thread Toshihiro Suzuki (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16541731#comment-16541731
 ] 

Toshihiro Suzuki commented on HBASE-19572:
--

 [~reidchan] It seems like we can apply the last patch in this Jira to 
branch-2.0 without any changes.

> RegionMover should use the configured default port number and not the one 
> from HConstants
> -
>
> Key: HBASE-19572
> URL: https://issues.apache.org/jira/browse/HBASE-19572
> Project: HBase
>  Issue Type: Bug
>Reporter: Esteban Gutierrez
>Assignee: Toshihiro Suzuki
>Priority: Major
> Attachments: HBASE-19572.master.001.patch, 
> HBASE-19572.master.001.patch, HBASE-19572.master.003.patch, 
> HBASE-19572.master.004.patch, HBASE-19572.master.004.patch, 
> HBASE-19572.master.005.patch
>
>
> The issue I ran into HBASE-19499 was due RegionMover not using the port used 
> by {{hbase-site.xml}}. The tool should use the value used in the 
> configuration before falling back to the hardcoded value 
> {{HConstants.DEFAULT_REGIONSERVER_PORT}}
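
For illustration, the lookup described above is a one-line read of the configuration; a minimal sketch, assuming the standard HConstants names:
{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HConstants;

// Read the port configured in hbase-site.xml first, and fall back to the
// hardcoded default only when no value is set.
Configuration conf = HBaseConfiguration.create();
int port = conf.getInt(HConstants.REGIONSERVER_PORT,
    HConstants.DEFAULT_REGIONSERVER_PORT);
{code}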



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20877) Hbase-1.2.0 OldWals age getting filled and not purged by Hmaster

2018-07-12 Thread Sean Busbey (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16541785#comment-16541785
 ] 

Sean Busbey commented on HBASE-20877:
-

please don't rm WALs without understanding why they're present. The most likely 
reason for them still being around is that replication has not yet managed to 
successfully send them to your indexer. Please follow the steps from my email 
to user@hbase:

https://lists.apache.org/thread.html/221e18a4a861ff6736cb17036ce17f410027046fd7f00fb80bfd11f1@%3Cuser.hbase.apache.org%3E
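
To gauge how far back the backlog goes before touching anything, a minimal inspect-only sketch using the plain Hadoop FileSystem API (the path assumes the default layout; adjust for your hbase.rootdir):
{code:java}
import java.util.Arrays;
import java.util.Comparator;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// List /hbase/oldWALs oldest-first. This only inspects; as noted above, don't
// rm anything until you know why replication is still holding the files.
Configuration conf = new Configuration();
FileSystem fs = FileSystem.get(conf);
FileStatus[] entries = fs.listStatus(new Path("/hbase/oldWALs"));
Arrays.sort(entries, Comparator.comparingLong(FileStatus::getModificationTime));
for (FileStatus st : entries) {
  System.out.println(st.getModificationTime() + "\t" + st.getPath());
}
{code}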

> Hbase-1.2.0 OldWals age getting filled and not purged by Hmaster
> 
>
> Key: HBASE-20877
> URL: https://issues.apache.org/jira/browse/HBASE-20877
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Affects Versions: 1.2.0
>Reporter: Manjeet Singh
>Priority: Major
>
> Hbase version 1.2.0 OldWals are getting filled and showing as below
> 7.2 K 21.5 K /hbase/.hbase-snapshot
> 0 0 /hbase/.tmp
> 0 0 /hbase/MasterProcWALs
> 18.3 G 60.2 G /hbase/WALs
> 28.7 G 86.1 G /hbase/archive
> 0 0 /hbase/corrupt
> 1.7 T 5.2 T /hbase/data
> 42 126 /hbase/hbase.id
> 7 21 /hbase/hbase.version
> 7.2 T 21.6 T /hbase/oldWALs
>  
> It's not getting purged by HMaster. oldWALs are supposed to be cleaned by a 
> master background chore, and HBASE-20352 (for the 1.x versions) was created to 
> speed up cleaning oldWALs, but in our case it's not happening.
> hbase.master.logcleaner.ttl is 1 minute



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20878) Data loss if merging regions while ServerCrashProcedure executing

2018-07-12 Thread Sean Busbey (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Busbey updated HBASE-20878:

Fix Version/s: 2.1.1
   2.0.2
   3.0.0

> Data loss if merging regions while ServerCrashProcedure executing
> -
>
> Key: HBASE-20878
> URL: https://issues.apache.org/jira/browse/HBASE-20878
> Project: HBase
>  Issue Type: Bug
>  Components: amv2
>Affects Versions: 3.0.0, 2.1.0, 2.0.1
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Critical
> Fix For: 3.0.0, 2.0.2, 2.1.1
>
> Attachments: HBASE-20878.branch-2.0.001.patch
>
>
> In MergeTableRegionsProcedure, we close the regions to merge using 
> UnassignProcedure. But if the RS these regions are on crashes, a 
> ServerCrashProcedure will execute at the same time. UnassignProcedures will 
> be blocked until all logs are split. But since these regions are closed for 
> merging, the regions won't open again, the recovered.edits in the region dir 
> won't be replayed, and thus data will be lost.
> I provided a test to reproduce this case. I suspect the split region procedure 
> also has this kind of problem. I will check later.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20870) Wrong HBase root dir in ITBLL's Search Tool

2018-07-12 Thread Allan Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16541806#comment-16541806
 ] 

Allan Yang commented on HBASE-20870:


You can refer to the error info above. Without the patch, the root dir will be 
set to a random local dir, and ITBLL's search tool will fail.

> Wrong HBase root dir in ITBLL's Search Tool
> ---
>
> Key: HBASE-20870
> URL: https://issues.apache.org/jira/browse/HBASE-20870
> Project: HBase
>  Issue Type: Bug
>  Components: integration tests
>Affects Versions: 3.0.0, 2.1.0, 2.0.1
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Minor
> Attachments: HBASE-20870.branch-2.0.001.patch
>
>
> When using IntegrationTestBigLinkedList's Search tool, it always fails since 
> it tries to read WALs in the wrong HBase root dir. It turned out that when 
> initializing IntegrationTestingUtility in IntegrationTestBigLinkedList, its 
> super class HBaseTestingUtility changes hbase.rootdir to a random local 
> dir. That is not wrong in itself, since HBaseTestingUtility is mostly used by the 
> minicluster, but for IntegrationTests run on distributed clusters we should change it 
> back.
>  Here is the error info.
> {code:java}
> 2018-07-11 16:35:49,679 DEBUG [main] hbase.HBaseCommonTestingUtility: Setting 
> hbase.rootdir to 
> /home/hadoop/target/test-data/deb67611-2737-4696-abe9-32a7783df7bb
> 2018-07-11 16:35:50,736 ERROR [main] util.AbstractHBaseTool: Error running 
> command-line tool java.io.FileNotFoundException: File 
> file:/home/hadoop/target/test-data/deb67611-2737-4696-abe9-32a7783df7bb/WALs 
> does not exist
> at 
> org.apache.hadoop.fs.RawLocalFileSystem.listStatus(RawLocalFileSystem.java:431)
> at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1517)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20878) Data loss if merging regions while ServerCrashProcedure executing

2018-07-12 Thread Allan Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16541804#comment-16541804
 ] 

Allan Yang commented on HBASE-20878:


[~Apache9], you can see my patch; my solution is to check whether the closed 
region was on a dead server and, if it was, abort the merge procedure. I think 
this can work.
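
A rough sketch of the guard described in that comment; the surrounding procedure plumbing is elided and the accessor names here are assumptions, so treat the attached patch as authoritative:
{code:java}
import java.io.IOException;
import org.apache.hadoop.hbase.ServerName;
import org.apache.hadoop.hbase.master.ServerManager;

// Hedged sketch: if the RS that held the closed region is dead, its
// recovered.edits still need replaying, so fail the merge rather than
// proceed and lose those edits.
static void failMergeIfServerDead(ServerManager serverManager, ServerName sn)
    throws IOException {
  if (sn != null && serverManager.isServerDead(sn)) {
    throw new IOException("Region was closed on dead server " + sn
        + "; aborting merge");
  }
}
{code}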

> Data loss if merging regions while ServerCrashProcedure executing
> -
>
> Key: HBASE-20878
> URL: https://issues.apache.org/jira/browse/HBASE-20878
> Project: HBase
>  Issue Type: Bug
>  Components: amv2
>Affects Versions: 3.0.0, 2.1.0, 2.0.1
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Critical
> Attachments: HBASE-20878.branch-2.0.001.patch
>
>
> In MergeTableRegionsProcedure, we close the regions to merge using 
> UnassignProcedure. But if the RS these regions are on crashes, a 
> ServerCrashProcedure will execute at the same time. UnassignProcedures will 
> be blocked until all logs are split. But since these regions are closed for 
> merging, the regions won't open again, the recovered.edits in the region dir 
> won't be replayed, and thus data will be lost.
> I provided a test to reproduce this case. I suspect the split region procedure 
> also has this kind of problem. I will check later.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20870) Wrong HBase root dir in ITBLL's Search Tool

2018-07-12 Thread Allan Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16541815#comment-16541815
 ] 

Allan Yang commented on HBASE-20870:


{quote}It wasn't committed because it didn't seem necessary for the tests we 
were doing.
{quote}
[~mdrob], have you tried 'hbase IntegrationTestBigLinkedList search'? The HBase 
root dir only matters for the Search tool; it won't affect other parts like 
Generate or Verify.

> Wrong HBase root dir in ITBLL's Search Tool
> ---
>
> Key: HBASE-20870
> URL: https://issues.apache.org/jira/browse/HBASE-20870
> Project: HBase
>  Issue Type: Bug
>  Components: integration tests
>Affects Versions: 3.0.0, 2.1.0, 2.0.1
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Minor
> Attachments: HBASE-20870.branch-2.0.001.patch
>
>
> When using IntegrationTestBigLinkedList's Search tool, it always fails since 
> it tries to read WALs in the wrong HBase root dir. It turned out that when 
> initializing IntegrationTestingUtility in IntegrationTestBigLinkedList, its 
> super class HBaseTestingUtility changes hbase.rootdir to a random local 
> dir. That is not wrong in itself, since HBaseTestingUtility is mostly used by the 
> minicluster, but for IntegrationTests run on distributed clusters we should change it 
> back.
>  Here is the error info.
> {code:java}
> 2018-07-11 16:35:49,679 DEBUG [main] hbase.HBaseCommonTestingUtility: Setting 
> hbase.rootdir to 
> /home/hadoop/target/test-data/deb67611-2737-4696-abe9-32a7783df7bb
> 2018-07-11 16:35:50,736 ERROR [main] util.AbstractHBaseTool: Error running 
> command-line tool java.io.FileNotFoundException: File 
> file:/home/hadoop/target/test-data/deb67611-2737-4696-abe9-32a7783df7bb/WALs 
> does not exist
> at 
> org.apache.hadoop.fs.RawLocalFileSystem.listStatus(RawLocalFileSystem.java:431)
> at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1517)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20870) Wrong HBase root dir in ITBLL's Search Tool

2018-07-12 Thread Mike Drob (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16541830#comment-16541830
 ] 

Mike Drob commented on HBASE-20870:
---

Ah, thanks for the clarification! I don't have a cluster set up for itbll right 
now, so I don't want my theoretical musings to hold you up. I would be +1 on a 
patch that saves the original root dir and then resets it, and I would strongly 
prefer that approach to the current one.
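
A minimal sketch of that save-and-restore approach, assuming the standard HConstants and IntegrationTestingUtility classes (untested, for illustration only):
{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HConstants;
import org.apache.hadoop.hbase.IntegrationTestingUtility;

// Remember the deployed hbase.rootdir before the testing utility rewrites it
// to a random local dir, then set it back so the Search tool reads WALs from
// the real cluster root.
Configuration conf = HBaseConfiguration.create();
String originalRootDir = conf.get(HConstants.HBASE_DIR);
IntegrationTestingUtility util = new IntegrationTestingUtility(conf);
if (originalRootDir != null) {
  util.getConfiguration().set(HConstants.HBASE_DIR, originalRootDir);
}
{code}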

> Wrong HBase root dir in ITBLL's Search Tool
> ---
>
> Key: HBASE-20870
> URL: https://issues.apache.org/jira/browse/HBASE-20870
> Project: HBase
>  Issue Type: Bug
>  Components: integration tests
>Affects Versions: 3.0.0, 2.1.0, 2.0.1
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Minor
> Attachments: HBASE-20870.branch-2.0.001.patch
>
>
> When using IntegrationTestBigLinkedList's Search tool, it always fails since 
> it tries to read WALs in the wrong HBase root dir. It turned out that when 
> initializing IntegrationTestingUtility in IntegrationTestBigLinkedList, its 
> super class HBaseTestingUtility changes hbase.rootdir to a random local 
> dir. That is not wrong in itself, since HBaseTestingUtility is mostly used by the 
> minicluster, but for IntegrationTests run on distributed clusters we should change it 
> back.
>  Here is the error info.
> {code:java}
> 2018-07-11 16:35:49,679 DEBUG [main] hbase.HBaseCommonTestingUtility: Setting 
> hbase.rootdir to 
> /home/hadoop/target/test-data/deb67611-2737-4696-abe9-32a7783df7bb
> 2018-07-11 16:35:50,736 ERROR [main] util.AbstractHBaseTool: Error running 
> command-line tool java.io.FileNotFoundException: File 
> file:/home/hadoop/target/test-data/deb67611-2737-4696-abe9-32a7783df7bb/WALs 
> does not exist
> at 
> org.apache.hadoop.fs.RawLocalFileSystem.listStatus(RawLocalFileSystem.java:431)
> at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1517)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HBASE-20879) Compacting memstore config should handle lower case

2018-07-12 Thread Ted Yu (JIRA)
Ted Yu created HBASE-20879:
--

 Summary: Compacting memstore config should handle lower case
 Key: HBASE-20879
 URL: https://issues.apache.org/jira/browse/HBASE-20879
 Project: HBase
  Issue Type: Bug
Affects Versions: 2.0.1
Reporter: Tushar Sharma
Assignee: Ted Yu


Tushar reported seeing the following in the region server log when entering 
'basic' for the compacting memstore type:
{code}
2018-07-10 19:43:45,944 ERROR [RS_OPEN_REGION-regionserver/c01s22:16020-0] 
handler.OpenRegionHandler: Failed open of 
region=usertable,user6379,1531182972304.69abd81a44e9cc3ef9e150709f4f69ab., 
starting to roll back the global memstore size.
java.io.IOException: java.lang.IllegalArgumentException: No enum constant 
org.apache.hadoop.hbase.MemoryCompactionPolicy.basic
at 
org.apache.hadoop.hbase.regionserver.HRegion.initializeStores(HRegion.java:1035)
at 
org.apache.hadoop.hbase.regionserver.HRegion.initializeRegionInternals(HRegion.java:900)
at 
org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:872)
at 
org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7048)
at 
org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7006)
at 
org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:6977)
at 
org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:6933)
at 
org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:6884)
at 
org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:284)
at 
org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:109)
at 
org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:104)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.IllegalArgumentException: No enum constant 
org.apache.hadoop.hbase.MemoryCompactionPolicy.basic
at java.lang.Enum.valueOf(Enum.java:238)
at 
org.apache.hadoop.hbase.MemoryCompactionPolicy.valueOf(MemoryCompactionPolicy.java:26)
at 
org.apache.hadoop.hbase.regionserver.HStore.getMemstore(HStore.java:331)
at org.apache.hadoop.hbase.regionserver.HStore.<init>(HStore.java:271)
at 
org.apache.hadoop.hbase.regionserver.HRegion.instantiateHStore(HRegion.java:5531)
at org.apache.hadoop.hbase.regionserver.HRegion$1.call(HRegion.java:999)
at org.apache.hadoop.hbase.regionserver.HRegion$1.call(HRegion.java:996)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
... 3 more
2018-07-10 19:43:45,944 ERROR [RS_OPEN_REGION-regionserver/c01s22:16020-1] 
handler.OpenRegionHandler: Failed open of 
region=temp,,1530511278693.0be48eedc68b9358aa475946d00571f1., starting to roll 
back the global memstore size.
java.io.IOException: java.lang.IllegalArgumentException: No enum constant 
org.apache.hadoop.hbase.MemoryCompactionPolicy.basic
at 
org.apache.hadoop.hbase.regionserver.HRegion.initializeStores(HRegion.java:1035)
at 
org.apache.hadoop.hbase.regionserver.HRegion.initializeRegionInternals(HRegion.java:900)
at 
org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:872)
at 
org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7048)
at 
org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7006)
at 
org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:6977)
at 
org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:6933)
at 
org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:6884)
at 
org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:284)
at 
org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:109)
at 
org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:104)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.IllegalArgumentException: No enum constant 
org.apache.hadoop.hbase.MemoryCompactionPolicy.basic
at java.lang.Enum.valueOf(Enum.java:238)
at 
org.apache.hadoop.hbase.MemoryCompactionPolicy.valueOf(MemoryCompactionPolicy.java:26)
at 
org.apache.hadoop.hbase.regionserver.HSt

[jira] [Commented] (HBASE-20879) Compacting memstore config should handle lower case

2018-07-12 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16541857#comment-16541857
 ] 

Ted Yu commented on HBASE-20879:


Without the fix, the test fails with:
{code}
testReplayWorksThoughLotsOfFlushing(org.apache.hadoop.hbase.regionserver.TestRecoveredEdits)
  Time elapsed: 0.958 sec  <<< ERROR!
java.io.IOException: java.lang.IllegalArgumentException: No enum constant 
org.apache.hadoop.hbase.MemoryCompactionPolicy.none
at 
org.apache.hadoop.hbase.regionserver.TestRecoveredEdits.testReplayWorksWithMemoryCompactionPolicy(TestRecoveredEdits.java:127)
at 
org.apache.hadoop.hbase.regionserver.TestRecoveredEdits.testReplayWorksThoughLotsOfFlushing(TestRecoveredEdits.java:84)
Caused by: java.lang.IllegalArgumentException: No enum constant 
org.apache.hadoop.hbase.MemoryCompactionPolicy.none
{code}
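
For illustration, the case-insensitive parsing the fix implies could look like the following; the config key string is an assumption based on the 2.x naming, not a quote of the patch:
{code:java}
import java.util.Locale;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.MemoryCompactionPolicy;

// Hedged sketch: upper-case the configured value before Enum.valueOf so that
// 'basic', 'eager', or 'none' are accepted as well as the upper-case forms.
static MemoryCompactionPolicy parseCompactionPolicy(Configuration conf) {
  String value = conf.get("hbase.hregion.compacting.memstore.type", "NONE");
  return MemoryCompactionPolicy.valueOf(value.toUpperCase(Locale.ROOT));
}
{code}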

> Compacting memstore config should handle lower case
> ---
>
> Key: HBASE-20879
> URL: https://issues.apache.org/jira/browse/HBASE-20879
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.1
>Reporter: Tushar Sharma
>Assignee: Ted Yu
>Priority: Major
>
> Tushar reported seeing the following in the region server log when entering 
> 'basic' for the compacting memstore type:
> {code}
> 2018-07-10 19:43:45,944 ERROR [RS_OPEN_REGION-regionserver/c01s22:16020-0] 
> handler.OpenRegionHandler: Failed open of 
> region=usertable,user6379,1531182972304.69abd81a44e9cc3ef9e150709f4f69ab., 
> starting to roll back the global memstore size.
> java.io.IOException: java.lang.IllegalArgumentException: No enum constant 
> org.apache.hadoop.hbase.MemoryCompactionPolicy.basic
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.initializeStores(HRegion.java:1035)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.initializeRegionInternals(HRegion.java:900)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:872)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7048)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7006)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:6977)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:6933)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:6884)
> at 
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:284)
> at 
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:109)
> at 
> org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:104)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.IllegalArgumentException: No enum constant 
> org.apache.hadoop.hbase.MemoryCompactionPolicy.basic
> at java.lang.Enum.valueOf(Enum.java:238)
> at 
> org.apache.hadoop.hbase.MemoryCompactionPolicy.valueOf(MemoryCompactionPolicy.java:26)
> at 
> org.apache.hadoop.hbase.regionserver.HStore.getMemstore(HStore.java:331)
> at org.apache.hadoop.hbase.regionserver.HStore.<init>(HStore.java:271)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.instantiateHStore(HRegion.java:5531)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion$1.call(HRegion.java:999)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion$1.call(HRegion.java:996)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> ... 3 more
> 2018-07-10 19:43:45,944 ERROR [RS_OPEN_REGION-regionserver/c01s22:16020-1] 
> handler.OpenRegionHandler: Failed open of 
> region=temp,,1530511278693.0be48eedc68b9358aa475946d00571f1., starting to 
> roll back the global memstore size.
> java.io.IOException: java.lang.IllegalArgumentException: No enum constant 
> org.apache.hadoop.hbase.MemoryCompactionPolicy.basic
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.initializeStores(HRegion.java:1035)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.initializeRegionInternals(HRegion.java:900)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:872)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7048)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7006)
> at 
> org.apache

[jira] [Updated] (HBASE-20879) Compacting memstore config should handle lower case

2018-07-12 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-20879:
---
Status: Patch Available  (was: Open)

> Compacting memstore config should handle lower case
> ---
>
> Key: HBASE-20879
> URL: https://issues.apache.org/jira/browse/HBASE-20879
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.1
>Reporter: Tushar Sharma
>Assignee: Ted Yu
>Priority: Major
> Attachments: 20879.v2.txt
>
>
> Tushar reported seeing the following in the region server log when entering 
> 'basic' for the compacting memstore type:
> {code}
> 2018-07-10 19:43:45,944 ERROR [RS_OPEN_REGION-regionserver/c01s22:16020-0] 
> handler.OpenRegionHandler: Failed open of 
> region=usertable,user6379,1531182972304.69abd81a44e9cc3ef9e150709f4f69ab., 
> starting to roll back the global memstore size.
> java.io.IOException: java.lang.IllegalArgumentException: No enum constant 
> org.apache.hadoop.hbase.MemoryCompactionPolicy.basic
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.initializeStores(HRegion.java:1035)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.initializeRegionInternals(HRegion.java:900)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:872)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7048)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7006)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:6977)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:6933)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:6884)
> at 
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:284)
> at 
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:109)
> at 
> org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:104)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.IllegalArgumentException: No enum constant 
> org.apache.hadoop.hbase.MemoryCompactionPolicy.basic
> at java.lang.Enum.valueOf(Enum.java:238)
> at 
> org.apache.hadoop.hbase.MemoryCompactionPolicy.valueOf(MemoryCompactionPolicy.java:26)
> at 
> org.apache.hadoop.hbase.regionserver.HStore.getMemstore(HStore.java:331)
> at org.apache.hadoop.hbase.regionserver.HStore.<init>(HStore.java:271)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.instantiateHStore(HRegion.java:5531)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion$1.call(HRegion.java:999)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion$1.call(HRegion.java:996)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> ... 3 more
> 2018-07-10 19:43:45,944 ERROR [RS_OPEN_REGION-regionserver/c01s22:16020-1] 
> handler.OpenRegionHandler: Failed open of 
> region=temp,,1530511278693.0be48eedc68b9358aa475946d00571f1., starting to 
> roll back the global memstore size.
> java.io.IOException: java.lang.IllegalArgumentException: No enum constant 
> org.apache.hadoop.hbase.MemoryCompactionPolicy.basic
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.initializeStores(HRegion.java:1035)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.initializeRegionInternals(HRegion.java:900)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:872)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7048)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7006)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:6977)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:6933)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:6884)
> at 
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:284)
> at 
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:109)
> at 
> org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:104)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>

[jira] [Updated] (HBASE-20879) Compacting memstore config should handle lower case

2018-07-12 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-20879:
---
Attachment: 20879.v2.txt

> Compacting memstore config should handle lower case
> ---
>
> Key: HBASE-20879
> URL: https://issues.apache.org/jira/browse/HBASE-20879
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.1
>Reporter: Tushar Sharma
>Assignee: Ted Yu
>Priority: Major
> Attachments: 20879.v2.txt
>
>
> Tushar reported seeing the following in the region server log when entering 
> 'basic' for the compacting memstore type:
> {code}
> 2018-07-10 19:43:45,944 ERROR [RS_OPEN_REGION-regionserver/c01s22:16020-0] 
> handler.OpenRegionHandler: Failed open of 
> region=usertable,user6379,1531182972304.69abd81a44e9cc3ef9e150709f4f69ab., 
> starting to roll back the global memstore size.
> java.io.IOException: java.lang.IllegalArgumentException: No enum constant 
> org.apache.hadoop.hbase.MemoryCompactionPolicy.basic
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.initializeStores(HRegion.java:1035)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.initializeRegionInternals(HRegion.java:900)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:872)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7048)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7006)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:6977)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:6933)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:6884)
> at 
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:284)
> at 
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:109)
> at 
> org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:104)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.IllegalArgumentException: No enum constant 
> org.apache.hadoop.hbase.MemoryCompactionPolicy.basic
> at java.lang.Enum.valueOf(Enum.java:238)
> at 
> org.apache.hadoop.hbase.MemoryCompactionPolicy.valueOf(MemoryCompactionPolicy.java:26)
> at 
> org.apache.hadoop.hbase.regionserver.HStore.getMemstore(HStore.java:331)
> at org.apache.hadoop.hbase.regionserver.HStore.<init>(HStore.java:271)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.instantiateHStore(HRegion.java:5531)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion$1.call(HRegion.java:999)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion$1.call(HRegion.java:996)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> ... 3 more
> 2018-07-10 19:43:45,944 ERROR [RS_OPEN_REGION-regionserver/c01s22:16020-1] 
> handler.OpenRegionHandler: Failed open of 
> region=temp,,1530511278693.0be48eedc68b9358aa475946d00571f1., starting to 
> roll back the global memstore size.
> java.io.IOException: java.lang.IllegalArgumentException: No enum constant 
> org.apache.hadoop.hbase.MemoryCompactionPolicy.basic
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.initializeStores(HRegion.java:1035)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.initializeRegionInternals(HRegion.java:900)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:872)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7048)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7006)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:6977)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:6933)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:6884)
> at 
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:284)
> at 
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:109)
> at 
> org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:104)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at 
> java.

[jira] [Commented] (HBASE-20878) Data loss if merging regions while ServerCrashProcedure executing

2018-07-12 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16541880#comment-16541880
 ] 

Hadoop QA commented on HBASE-20878:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
14s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} branch-2.0 Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
21s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  3m 
40s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 13m 
48s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
17s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
 6s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m  
0s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
39s{color} | {color:green} branch-2.0 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
13s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  3m 
34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 13m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 13m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 13m 
45s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m  
8s{color} | {color:red} hbase-server: The patch generated 2 new + 4 unchanged - 
0 fixed = 6 total (was 4) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
11s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
11m 46s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.6.5 2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} hbaseprotoc {color} | {color:green}  
1m  6s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
45s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
33s{color} | {color:green} hbase-protocol-shaded in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}121m 47s{color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
40s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}193m 27s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.snapshot.TestFlushSnapshotFromClient |
|   | hadoop.hbase.namespace.TestNamespaceAuditor |
|   | hadoop.hbase.TestSequenceIdMonotonicallyIncreasing |
|   | hadoop.hbase.master.TestAssignmentListener |
|   | hadoop.hbase.TestSplitMerge |
|   | hadoop.hbase.master.assignment.TestMasterAbortWhileMergingTable |
|   | hadoop.hbase.master.assignment.TestMergeTableRegionsProcedure |
|   | hadoop.

[jira] [Updated] (HBASE-20586) SyncTable tool: Add support for cross-realm remote clusters

2018-07-12 Thread Sean Busbey (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Busbey updated HBASE-20586:

Status: Patch Available  (was: Open)

> SyncTable tool: Add support for cross-realm remote clusters
> ---
>
> Key: HBASE-20586
> URL: https://issues.apache.org/jira/browse/HBASE-20586
> Project: HBase
>  Issue Type: Improvement
>  Components: mapreduce, Replication
>Reporter: Wellington Chevreuil
>Assignee: Wellington Chevreuil
>Priority: Minor
> Attachments: HBASE-20586.master.001.patch
>
>
> One possible scenario for HashTable/SyncTable is to synchronize different 
> clusters, for instance when replication has been enabled but data already 
> existed, or when replication issues have caused long lags in the 
> replication.
> For secured clusters under different Kerberos realms (with cross-realm trust 
> properly set), though, the current SyncTable version would fail to authenticate 
> with the remote cluster when trying to read HashTable outputs (when 
> *sourcehashdir* is remote) and also when trying to read table data on the 
> remote cluster (when *sourcezkcluster* is remote).
> The hdfs error would look like this:
> {noformat}
> INFO mapreduce.Job: Task Id : attempt_1524358175778_105392_m_00_0, Status 
> : FAILED
> Error: java.io.IOException: Failed on local exception: java.io.IOException: 
> org.apache.hadoop.security.AccessControlException: Client cannot authenticate 
> via:[TOKEN, KERBEROS]; Host Details : local host is: "local-host/1.1.1.1"; 
> destination host is: "remote-nn":8020;
>         at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:772)
>         at org.apache.hadoop.ipc.Client.call(Client.java:1506)
>         at org.apache.hadoop.ipc.Client.call(Client.java:1439)
>         at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
>         at com.sun.proxy.$Proxy13.getBlockLocations(Unknown Source)
>         at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getBlockLocations(ClientNamenodeProtocolTranslatorPB.java:256)
> ...
>         at 
> org.apache.hadoop.hbase.mapreduce.HashTable$TableHash.readPropertiesFile(HashTable.java:144)
>         at 
> org.apache.hadoop.hbase.mapreduce.HashTable$TableHash.read(HashTable.java:105)
>         at 
> org.apache.hadoop.hbase.mapreduce.SyncTable$SyncMapper.setup(SyncTable.java:188)
> at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142)
> ...
> Caused by: java.io.IOException: 
> org.apache.hadoop.security.AccessControlException: Client cannot authenticate 
> via:[TOKEN, KERBEROS]{noformat}
> The above can be sorted if the SyncTable job acquires a DT for the remote NN. 
> Once hdfs related authentication is done, it's also necessary to authenticate 
> against remote HBase, as the below error would arise:
> {noformat}
> INFO mapreduce.Job: Task Id : attempt_1524358175778_172414_m_00_0, Status 
> : FAILED
> Error: org.apache.hadoop.hbase.client.RetriesExhaustedException: Can't get 
> the location
> at 
> org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:326)
> ...
> at org.apache.hadoop.hbase.client.HTable.getScanner(HTable.java:867)
> at 
> org.apache.hadoop.hbase.mapreduce.SyncTable$SyncMapper.syncRange(SyncTable.java:331)
> ...
> Caused by: java.io.IOException: Could not set up IO Streams to 
> remote-rs-host/1.1.1.2:60020
> at 
> org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.setupIOstreams(RpcClientImpl.java:786)
> ...
> Caused by: java.lang.RuntimeException: SASL authentication failed. The most 
> likely cause is missing or invalid credentials. Consider 'kinit'.
> ...
> Caused by: GSSException: No valid credentials provided (Mechanism level: 
> Failed to find any Kerberos tgt)
> ...{noformat}
> The above would need additional authentication logic against the remote hbase 
> cluster.
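
A hedged sketch of the delegation-token acquisition the description mentions, using Hadoop's standard TokenCache helper at job-setup time (the helper name addRemoteNnToken is hypothetical):
{code:java}
import java.io.IOException;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.security.TokenCache;

// Obtain an HDFS delegation token for the remote NameNode that hosts the
// HashTable output, so map tasks can read it across realms.
static void addRemoteNnToken(Job job, Path remoteHashDir) throws IOException {
  TokenCache.obtainTokensForNamenodes(job.getCredentials(),
      new Path[] { remoteHashDir }, job.getConfiguration());
}
{code}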



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20586) SyncTable tool: Add support for cross-realm remote clusters

2018-07-12 Thread Sean Busbey (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Busbey updated HBASE-20586:

Affects Version/s: 1.2.0
   2.0.0

> SyncTable tool: Add support for cross-realm remote clusters
> ---
>
> Key: HBASE-20586
> URL: https://issues.apache.org/jira/browse/HBASE-20586
> Project: HBase
>  Issue Type: Improvement
>  Components: mapreduce, Replication
>Affects Versions: 1.2.0, 2.0.0
>Reporter: Wellington Chevreuil
>Assignee: Wellington Chevreuil
>Priority: Minor
> Attachments: HBASE-20586.master.001.patch
>
>
> One possible scenario for HashTable/SyncTable is to synchronize different 
> clusters, for instance when replication has been enabled but data already 
> existed, or when replication issues have caused long lags in the 
> replication.
> For secured clusters under different Kerberos realms (with cross-realm trust 
> properly set), though, the current SyncTable version would fail to authenticate 
> with the remote cluster when trying to read HashTable outputs (when 
> *sourcehashdir* is remote) and also when trying to read table data on the 
> remote cluster (when *sourcezkcluster* is remote).
> The hdfs error would look like this:
> {noformat}
> INFO mapreduce.Job: Task Id : attempt_1524358175778_105392_m_00_0, Status 
> : FAILED
> Error: java.io.IOException: Failed on local exception: java.io.IOException: 
> org.apache.hadoop.security.AccessControlException: Client cannot authenticate 
> via:[TOKEN, KERBEROS]; Host Details : local host is: "local-host/1.1.1.1"; 
> destination host is: "remote-nn":8020;
>         at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:772)
>         at org.apache.hadoop.ipc.Client.call(Client.java:1506)
>         at org.apache.hadoop.ipc.Client.call(Client.java:1439)
>         at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
>         at com.sun.proxy.$Proxy13.getBlockLocations(Unknown Source)
>         at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getBlockLocations(ClientNamenodeProtocolTranslatorPB.java:256)
> ...
>         at 
> org.apache.hadoop.hbase.mapreduce.HashTable$TableHash.readPropertiesFile(HashTable.java:144)
>         at 
> org.apache.hadoop.hbase.mapreduce.HashTable$TableHash.read(HashTable.java:105)
>         at 
> org.apache.hadoop.hbase.mapreduce.SyncTable$SyncMapper.setup(SyncTable.java:188)
> at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142)
> ...
> Caused by: java.io.IOException: 
> org.apache.hadoop.security.AccessControlException: Client cannot authenticate 
> via:[TOKEN, KERBEROS]{noformat}
> The above can be sorted if the SyncTable job acquires a DT for the remote NN. 
> Once hdfs related authentication is done, it's also necessary to authenticate 
> against remote HBase, as the below error would arise:
> {noformat}
> INFO mapreduce.Job: Task Id : attempt_1524358175778_172414_m_00_0, Status 
> : FAILED
> Error: org.apache.hadoop.hbase.client.RetriesExhaustedException: Can't get 
> the location
> at 
> org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:326)
> ...
> at org.apache.hadoop.hbase.client.HTable.getScanner(HTable.java:867)
> at 
> org.apache.hadoop.hbase.mapreduce.SyncTable$SyncMapper.syncRange(SyncTable.java:331)
> ...
> Caused by: java.io.IOException: Could not set up IO Streams to 
> remote-rs-host/1.1.1.2:60020
> at 
> org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.setupIOstreams(RpcClientImpl.java:786)
> ...
> Caused by: java.lang.RuntimeException: SASL authentication failed. The most 
> likely cause is missing or invalid credentials. Consider 'kinit'.
> ...
> Caused by: GSSException: No valid credentials provided (Mechanism level: 
> Failed to find any Kerberos tgt)
> ...{noformat}
> The above would need additional authentication logic against the remote hbase 
> cluster.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20586) SyncTable tool: Add support for cross-realm remote clusters

2018-07-12 Thread Sean Busbey (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16541887#comment-16541887
 ] 

Sean Busbey commented on HBASE-20586:
-

I'm not sure it's possible to get a unit test going for this, especially if 
MiniKDC can't do it. If we did this as an integration test outside of Maven that 
manually runs two different MiniKDCs, could we do it?

> SyncTable tool: Add support for cross-realm remote clusters
> ---
>
> Key: HBASE-20586
> URL: https://issues.apache.org/jira/browse/HBASE-20586
> Project: HBase
>  Issue Type: Improvement
>  Components: mapreduce, Replication
>Affects Versions: 1.2.0, 2.0.0
>Reporter: Wellington Chevreuil
>Assignee: Wellington Chevreuil
>Priority: Minor
> Attachments: HBASE-20586.master.001.patch
>
>
> One possible scenario for HashTable/SyncTable is to synchronize different 
> clusters, for instance when replication has been enabled but data already 
> existed, or when replication issues have caused long lags in the 
> replication.
> For secured clusters under different Kerberos realms (with cross-realm trust 
> properly set), though, the current SyncTable version would fail to authenticate 
> with the remote cluster when trying to read HashTable outputs (when 
> *sourcehashdir* is remote) and also when trying to read table data on the 
> remote cluster (when *sourcezkcluster* is remote).
> The hdfs error would look like this:
> {noformat}
> INFO mapreduce.Job: Task Id : attempt_1524358175778_105392_m_00_0, Status 
> : FAILED
> Error: java.io.IOException: Failed on local exception: java.io.IOException: 
> org.apache.hadoop.security.AccessControlException: Client cannot authenticate 
> via:[TOKEN, KERBEROS]; Host Details : local host is: "local-host/1.1.1.1"; 
> destination host is: "remote-nn":8020;
>         at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:772)
>         at org.apache.hadoop.ipc.Client.call(Client.java:1506)
>         at org.apache.hadoop.ipc.Client.call(Client.java:1439)
>         at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
>         at com.sun.proxy.$Proxy13.getBlockLocations(Unknown Source)
>         at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getBlockLocations(ClientNamenodeProtocolTranslatorPB.java:256)
> ...
>         at 
> org.apache.hadoop.hbase.mapreduce.HashTable$TableHash.readPropertiesFile(HashTable.java:144)
>         at 
> org.apache.hadoop.hbase.mapreduce.HashTable$TableHash.read(HashTable.java:105)
>         at 
> org.apache.hadoop.hbase.mapreduce.SyncTable$SyncMapper.setup(SyncTable.java:188)
> at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142)
> ...
> Caused by: java.io.IOException: 
> org.apache.hadoop.security.AccessControlException: Client cannot authenticate 
> via:[TOKEN, KERBEROS]{noformat}
> The above can be sorted if the SyncTable job acquires a DT for the remote NN. 
> Once hdfs related authentication is done, it's also necessary to authenticate 
> against remote HBase, as the below error would arise:
> {noformat}
> INFO mapreduce.Job: Task Id : attempt_1524358175778_172414_m_00_0, Status 
> : FAILED
> Error: org.apache.hadoop.hbase.client.RetriesExhaustedException: Can't get 
> the location
> at 
> org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:326)
> ...
> at org.apache.hadoop.hbase.client.HTable.getScanner(HTable.java:867)
> at 
> org.apache.hadoop.hbase.mapreduce.SyncTable$SyncMapper.syncRange(SyncTable.java:331)
> ...
> Caused by: java.io.IOException: Could not set up IO Streams to 
> remote-rs-host/1.1.1.2:60020
> at 
> org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.setupIOstreams(RpcClientImpl.java:786)
> ...
> Caused by: java.lang.RuntimeException: SASL authentication failed. The most 
> likely cause is missing or invalid credentials. Consider 'kinit'.
> ...
> Caused by: GSSException: No valid credentials provided (Mechanism level: 
> Failed to find any Kerberos tgt)
> ...{noformat}
> The above would need additional authentication logic against the remote hbase 
> cluster.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Reopened] (HBASE-20305) Add option to SyncTable that skip deletes on target cluster

2018-07-12 Thread Sean Busbey (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Busbey reopened HBASE-20305:
-

Anyone opposed to this getting pulled back into earlier release lines? It seems 
like a solid low risk addition.

> Add option to SyncTable that skip deletes on target cluster
> ---
>
> Key: HBASE-20305
> URL: https://issues.apache.org/jira/browse/HBASE-20305
> Project: HBase
>  Issue Type: Improvement
>  Components: mapreduce
>Affects Versions: 2.0.0-alpha-4
>Reporter: Wellington Chevreuil
>Assignee: Wellington Chevreuil
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: 0001-HBASE-20305.master.001.patch, 
> HBASE-20305.master.002.patch
>
>
> We had a situation where two clusters with active-active replication got out 
> of sync, but both had data that should be kept. The tables in question never 
> have data deleted, but ingestion had happened on the two different clusters, 
> and some rows had even been updated.
> In this scenario, a cell that is present in one of the clusters should 
> not be deleted, but replayed on the other. Also, for cells with the same 
> identifier but different values, the most recent value should be kept. 
> The current version of SyncTable would not be applicable here, because it would 
> simply copy the whole state from source to target, losing any additional 
> rows that might exist only in the target, as well as cell values that got a more 
> recent update. This could be solved by adding an option to skip deletes for 
> SyncTable. This way, the additional cells not present on the source would still 
> be kept. For cells with the same identifier but different values, it would just 
> perform a Put with the cell version from the source, but client scans would still 
> fetch the most recent timestamp.
> I'm attaching a patch with this additional option shortly. Please share your 
> thoughts.
>  
>  
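
A minimal sketch of the guard such an option implies in the sync mapper; the doDeletes flag name is an assumption for illustration:
{code:java}
import org.apache.hadoop.hbase.client.Delete;
import org.apache.hadoop.hbase.client.Mutation;

// When the (assumed) doDeletes option is off, Deletes computed from the diff
// are skipped instead of being applied to the target table.
static boolean shouldApply(Mutation m, boolean doDeletes) {
  return doDeletes || !(m instanceof Delete);
}
{code}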



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20305) Add option to SyncTable that skip deletes on target cluster

2018-07-12 Thread Sean Busbey (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16541892#comment-16541892
 ] 

Sean Busbey commented on HBASE-20305:
-

Tentatively adding to the scope of the next minor releases. If I don't hear an 
objection, I'll backport this later this week.

> Add option to SyncTable that skip deletes on target cluster
> ---
>
> Key: HBASE-20305
> URL: https://issues.apache.org/jira/browse/HBASE-20305
> Project: HBase
>  Issue Type: Improvement
>  Components: mapreduce
>Affects Versions: 2.0.0-alpha-4
>Reporter: Wellington Chevreuil
>Assignee: Wellington Chevreuil
>Priority: Minor
> Fix For: 3.0.0, 1.5.0, 2.2.0
>
> Attachments: 0001-HBASE-20305.master.001.patch, 
> HBASE-20305.master.002.patch
>
>
> We had a situation where two clusters with active-active replication got out 
> of sync, but both had data that should be kept. The tables in question never 
> have data deleted, but ingestion had happened on the two different clusters, 
> and some rows had even been updated.
> In this scenario, a cell that is present in one of the clusters should 
> not be deleted, but replayed on the other. Also, for cells with the same 
> identifier but different values, the most recent value should be kept. 
> The current version of SyncTable would not be applicable here, because it would 
> simply copy the whole state from source to target, losing any additional 
> rows that might exist only in the target, as well as cell values that got a more 
> recent update. This could be solved by adding an option to skip deletes for 
> SyncTable. This way, the additional cells not present on the source would still 
> be kept. For cells with the same identifier but different values, it would just 
> perform a Put with the cell version from the source, but client scans would still 
> fetch the most recent timestamp.
> I'm attaching a patch with this additional option shortly. Please share your 
> thoughts.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20305) Add option to SyncTable that skip deletes on target cluster

2018-07-12 Thread Sean Busbey (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Busbey updated HBASE-20305:

Fix Version/s: 2.2.0
   1.5.0

> Add option to SyncTable that skip deletes on target cluster
> ---
>
> Key: HBASE-20305
> URL: https://issues.apache.org/jira/browse/HBASE-20305
> Project: HBase
>  Issue Type: Improvement
>  Components: mapreduce
>Affects Versions: 2.0.0-alpha-4
>Reporter: Wellington Chevreuil
>Assignee: Wellington Chevreuil
>Priority: Minor
> Fix For: 3.0.0, 1.5.0, 2.2.0
>
> Attachments: 0001-HBASE-20305.master.001.patch, 
> HBASE-20305.master.002.patch
>
>
> We had a situation where two clusters with active-active replication got out 
> of sync, but both had data that should be kept. The tables in question never 
> have data deleted, but ingestion had happened on the two different clusters, 
> and some rows had even been updated.
> In this scenario, a cell that is present in one of the clusters should 
> not be deleted, but replayed on the other. Also, for cells with the same 
> identifier but different values, the most recent value should be kept. 
> The current version of SyncTable would not be applicable here, because it would 
> simply copy the whole state from source to target, thus losing any additional 
> rows that might exist only in the target, as well as cell values that got a more 
> recent update. This could be solved by adding an option to skip deletes for 
> SyncTable. This way, the additional cells not present on the source would still 
> be kept. For cells with the same identifier but different values, it would just 
> perform a Put for the cell version from the source, but client scans would still 
> fetch the most recent timestamp.
> I'm attaching a patch with this additional option shortly. Please share your 
> thoughts.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20244) NoSuchMethodException when retrieving private method decryptEncryptedDataEncryptionKey from DFSClient

2018-07-12 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-20244:
---
Release Note: 
HDFS-12574 made an incompatible change to HdfsKMSUtil, giving the method a 
different signature.
This issue uses reflection to try the known method signatures in order to work 
with Hadoop releases both with and without HDFS-12574.

> NoSuchMethodException when retrieving private method 
> decryptEncryptedDataEncryptionKey from DFSClient
> -
>
> Key: HBASE-20244
> URL: https://issues.apache.org/jira/browse/HBASE-20244
> Project: HBase
>  Issue Type: Sub-task
>  Components: wal
>Affects Versions: 2.0.0, 2.0.1
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Blocker
> Fix For: 3.0.0, 2.1.0, 2.0.2, 2.2.0
>
> Attachments: 20244.v1.txt, 20244.v1.txt, 20244.v1.txt, 
> HBASE-20244-v1.patch, HBASE-20244.patch
>
>
> I was running unit tests against the Hadoop 3.0.1 RC and saw the following in the 
> test output:
> {code}
> ERROR [RS-EventLoopGroup-3-3] 
> asyncfs.FanOutOneBlockAsyncDFSOutputSaslHelper(267): Couldn't properly 
> initialize access to HDFS internals. Please update  your WAL Provider to not 
> make use of the 'asyncfs' provider. See HBASE-16110 for more information.
> java.lang.NoSuchMethodException: 
> org.apache.hadoop.hdfs.DFSClient.decryptEncryptedDataEncryptionKey(org.apache.hadoop.fs.FileEncryptionInfo)
>   at java.lang.Class.getDeclaredMethod(Class.java:2130)
>   at 
> org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutputSaslHelper.createTransparentCryptoHelper(FanOutOneBlockAsyncDFSOutputSaslHelper.java:232)
>   at 
> org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutputSaslHelper.<clinit>(FanOutOneBlockAsyncDFSOutputSaslHelper.java:262)
>   at 
> org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutputHelper.initialize(FanOutOneBlockAsyncDFSOutputHelper.java:661)
>   at 
> org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutputHelper.access$300(FanOutOneBlockAsyncDFSOutputHelper.java:118)
>   at 
> org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutputHelper$13.operationComplete(FanOutOneBlockAsyncDFSOutputHelper.java:720)
>   at 
> org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutputHelper$13.operationComplete(FanOutOneBlockAsyncDFSOutputHelper.java:715)
>   at 
> org.apache.hbase.thirdparty.io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:507)
>   at 
> org.apache.hbase.thirdparty.io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:500)
>   at 
> org.apache.hbase.thirdparty.io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:479)
>   at 
> org.apache.hbase.thirdparty.io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:420)
>   at 
> org.apache.hbase.thirdparty.io.netty.util.concurrent.DefaultPromise.trySuccess(DefaultPromise.java:104)
>   at 
> org.apache.hbase.thirdparty.io.netty.channel.DefaultChannelPromise.trySuccess(DefaultChannelPromise.java:82)
>   at 
> org.apache.hbase.thirdparty.io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.fulfillConnectPromise(AbstractNioChannel.java:306)
>   at 
> org.apache.hbase.thirdparty.io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:341)
>   at 
> org.apache.hbase.thirdparty.io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:633)
>   at 
> org.apache.hbase.thirdparty.io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:580)
> {code}
> The private method was moved by HDFS-12574 to HdfsKMSUtil with a different 
> signature.
> To accommodate the above method movement, it seems we need to call the 
> following method of DFSClient:
> {code}
>   public KeyProvider getKeyProvider() throws IOException {
> {code}
> Since the new decryptEncryptedDataEncryptionKey method has this signature:
> {code}
>   static KeyVersion decryptEncryptedDataEncryptionKey(FileEncryptionInfo
> feInfo, KeyProvider keyProvider) throws IOException {
> {code}
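A hedged sketch of that reflection dance; the helper class name is made up, and the two signatures are the ones quoted above:

{code:java}
import java.lang.reflect.Method;
import org.apache.hadoop.fs.FileEncryptionInfo;

// Hypothetical helper: prefer the pre-HDFS-12574 DFSClient method, then
// fall back to the HdfsKMSUtil signature introduced by HDFS-12574.
final class DecryptEdekMethodFinder {
  static Method find() throws ClassNotFoundException, NoSuchMethodException {
    Class<?> dfsClient = Class.forName("org.apache.hadoop.hdfs.DFSClient");
    try {
      // Hadoop without HDFS-12574: private instance method on DFSClient.
      Method m = dfsClient.getDeclaredMethod(
          "decryptEncryptedDataEncryptionKey", FileEncryptionInfo.class);
      m.setAccessible(true);
      return m;
    } catch (NoSuchMethodException e) {
      // Hadoop with HDFS-12574: static method moved to HdfsKMSUtil.
      Class<?> kmsUtil = Class.forName("org.apache.hadoop.hdfs.HdfsKMSUtil");
      Class<?> keyProvider = Class.forName("org.apache.hadoop.crypto.key.KeyProvider");
      Method m = kmsUtil.getDeclaredMethod(
          "decryptEncryptedDataEncryptionKey", FileEncryptionInfo.class, keyProvider);
      m.setAccessible(true);
      return m;
    }
  }
}
{code}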



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HBASE-20880) Fix for warning It would fail on the following input in hbase-spark

2018-07-12 Thread Artem Ervits (JIRA)
Artem Ervits created HBASE-20880:


 Summary: Fix for warning It would fail on the following input in 
hbase-spark
 Key: HBASE-20880
 URL: https://issues.apache.org/jira/browse/HBASE-20880
 Project: HBase
  Issue Type: Bug
 Environment: {code:java}
Maven home: /opt/apache-maven-3.5.3
Java version: 1.8.0_172, vendor: Oracle Corporation
Java home: /Library/Java/JavaVirtualMachines/jdk1.8.0_172.jdk/Contents/Home/jre
Default locale: en_US, platform encoding: UTF-8
OS name: "mac os x", version: "10.13.5", arch: "x86_64", family: "mac"{code}
last commit: 3fc23fe930aa93e8755cf2bd478bd9907f719fd2
Reporter: Artem Ervits
Assignee: Artem Ervits


Compiling the hbase-spark module returns a warning:
{code:java}
[WARNING] 
/.../hbase/hbase-spark/src/test/scala/org/apache/hadoop/hbase/spark/TableOutputFormatSuite.scala:117:
 warning: match may not be exhaustive.
[WARNING] It would fail on the following input: Failure((x: Throwable forSome x 
not in Exception))
[WARNING] Try {
[WARNING] ^
{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20880) Fix for warning It would fail on the following input in hbase-spark

2018-07-12 Thread Artem Ervits (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artem Ervits updated HBASE-20880:
-
Attachment: HBASE-20880.v01.patch

> Fix for warning It would fail on the following input in hbase-spark
> ---
>
> Key: HBASE-20880
> URL: https://issues.apache.org/jira/browse/HBASE-20880
> Project: HBase
>  Issue Type: Bug
> Environment: {code:java}
> Maven home: /opt/apache-maven-3.5.3
> Java version: 1.8.0_172, vendor: Oracle Corporation
> Java home: 
> /Library/Java/JavaVirtualMachines/jdk1.8.0_172.jdk/Contents/Home/jre
> Default locale: en_US, platform encoding: UTF-8
> OS name: "mac os x", version: "10.13.5", arch: "x86_64", family: "mac"{code}
> last commit: 3fc23fe930aa93e8755cf2bd478bd9907f719fd2
>Reporter: Artem Ervits
>Assignee: Artem Ervits
>Priority: Minor
> Attachments: HBASE-20880.v01.patch
>
>
> Compiling the hbase-spark module returns a warning:
> {code:java}
> [WARNING] 
> /.../hbase/hbase-spark/src/test/scala/org/apache/hadoop/hbase/spark/TableOutputFormatSuite.scala:117:
>  warning: match may not be exhaustive.
> [WARNING] It would fail on the following input: Failure((x: Throwable forSome 
> x not in Exception))
> [WARNING] Try {
> [WARNING] ^
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20880) Fix for warning It would fail on the following input in hbase-spark

2018-07-12 Thread Artem Ervits (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artem Ervits updated HBASE-20880:
-
Status: Patch Available  (was: Open)

> Fix for warning It would fail on the following input in hbase-spark
> ---
>
> Key: HBASE-20880
> URL: https://issues.apache.org/jira/browse/HBASE-20880
> Project: HBase
>  Issue Type: Bug
> Environment: {code:java}
> Maven home: /opt/apache-maven-3.5.3
> Java version: 1.8.0_172, vendor: Oracle Corporation
> Java home: 
> /Library/Java/JavaVirtualMachines/jdk1.8.0_172.jdk/Contents/Home/jre
> Default locale: en_US, platform encoding: UTF-8
> OS name: "mac os x", version: "10.13.5", arch: "x86_64", family: "mac"{code}
> last commit: 3fc23fe930aa93e8755cf2bd478bd9907f719fd2
>Reporter: Artem Ervits
>Assignee: Artem Ervits
>Priority: Minor
> Attachments: HBASE-20880.v01.patch
>
>
> Compiling the hbase-spark module returns a warning:
> {code:java}
> [WARNING] 
> /.../hbase/hbase-spark/src/test/scala/org/apache/hadoop/hbase/spark/TableOutputFormatSuite.scala:117:
>  warning: match may not be exhaustive.
> [WARNING] It would fail on the following input: Failure((x: Throwable forSome 
> x not in Exception))
> [WARNING] Try {
> [WARNING] ^
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20586) SyncTable tool: Add support for cross-realm remote clusters

2018-07-12 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16541971#comment-16541971
 ] 

Hadoop QA commented on HBASE-20586:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
29s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 
11s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
42s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
22s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  5m 
30s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
45s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
19s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  5m 
39s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
13m 20s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
56s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
17s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 17m  
2s{color} | {color:green} hbase-mapreduce in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
13s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 59m 11s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-20586 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12923490/HBASE-20586.master.001.patch
 |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  shadedjars  
hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux ff479255ab0e 4.4.0-130-generic #156-Ubuntu SMP Thu Jun 14 
08:53:28 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | master / 3fc23fe930 |
| maven | version: Apache Maven 3.5.4 
(1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| Default Java | 1.8.0_171 |
| findbugs | v3.1.0-RC3 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HBASE-Build/13609/testReport/ |
| Max. process+thread count | 4712 (vs. ulimit of 1) |
| modules | C: hbase-mapreduce U: hbase-mapreduce |
| Console output | 
https://builds.apache.org/job/PreCommit-HBASE-Build/13609/

[jira] [Assigned] (HBASE-20876) Improve docs style in HConstants

2018-07-12 Thread Wei-Chiu Chuang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang reassigned HBASE-20876:
---

Assignee: Wei-Chiu Chuang

> Improve docs style in HConstants
> 
>
> Key: HBASE-20876
> URL: https://issues.apache.org/jira/browse/HBASE-20876
> Project: HBase
>  Issue Type: Improvement
>Reporter: Reid Chan
>Assignee: Wei-Chiu Chuang
>Priority: Minor
>  Labels: beginner, beginners, newbie
>
> In {{HConstants}}, there's a docs snippet:
> {code}
>  /** Don't use it! This'll get you the wrong path in a secure cluster.
>   * Use FileSystem.getHomeDirectory() or
>   * "/user/" + UserGroupInformation.getCurrentUser().getShortUserName()  */
> {code}
> It's an ugly style.
> Let's improve these docs with the following:
> {code}
> /**
>  * Description
>  */
> {code}
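Applied to the snippet above, the cleaned-up comment would look roughly like this (same content, only the comment layout fixed):

{code:java}
/**
 * Don't use it! This'll get you the wrong path in a secure cluster.
 * Use FileSystem.getHomeDirectory() or
 * "/user/" + UserGroupInformation.getCurrentUser().getShortUserName()
 */
{code}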



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20876) Improve docs style in HConstants

2018-07-12 Thread Wei-Chiu Chuang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16541991#comment-16541991
 ] 

Wei-Chiu Chuang commented on HBASE-20876:
-

Thanks Reid for catching the issue!

I'll assign it to myself and work on it.

> Improve docs style in HConstants
> 
>
> Key: HBASE-20876
> URL: https://issues.apache.org/jira/browse/HBASE-20876
> Project: HBase
>  Issue Type: Improvement
>Reporter: Reid Chan
>Assignee: Wei-Chiu Chuang
>Priority: Minor
>  Labels: beginner, beginners, newbie
>
> In {{HConstants}}, there's a docs snippet:
> {code}
>  /** Don't use it! This'll get you the wrong path in a secure cluster.
>   * Use FileSystem.getHomeDirectory() or
>   * "/user/" + UserGroupInformation.getCurrentUser().getShortUserName()  */
> {code}
> It's an ugly style.
> Let's improve these docs with the following:
> {code}
> /**
>  * Description
>  */
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20880) Fix for warning It would fail on the following input in hbase-spark

2018-07-12 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16541996#comment-16541996
 ] 

Ted Yu commented on HBASE-20880:


Compiled with the patch applied; the warning disappeared.

> Fix for warning It would fail on the following input in hbase-spark
> ---
>
> Key: HBASE-20880
> URL: https://issues.apache.org/jira/browse/HBASE-20880
> Project: HBase
>  Issue Type: Bug
> Environment: {code:java}
> Maven home: /opt/apache-maven-3.5.3
> Java version: 1.8.0_172, vendor: Oracle Corporation
> Java home: 
> /Library/Java/JavaVirtualMachines/jdk1.8.0_172.jdk/Contents/Home/jre
> Default locale: en_US, platform encoding: UTF-8
> OS name: "mac os x", version: "10.13.5", arch: "x86_64", family: "mac"{code}
> last commit: 3fc23fe930aa93e8755cf2bd478bd9907f719fd2
>Reporter: Artem Ervits
>Assignee: Artem Ervits
>Priority: Minor
> Attachments: HBASE-20880.v01.patch
>
>
> Compiling the hbase-spark module returns a warning:
> {code:java}
> [WARNING] 
> /.../hbase/hbase-spark/src/test/scala/org/apache/hadoop/hbase/spark/TableOutputFormatSuite.scala:117:
>  warning: match may not be exhaustive.
> [WARNING] It would fail on the following input: Failure((x: Throwable forSome 
> x not in Exception))
> [WARNING] Try {
> [WARNING] ^
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20880) Fix for warning It would fail on the following input in hbase-spark

2018-07-12 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16542000#comment-16542000
 ] 

Hadoop QA commented on HBASE-20880:
---

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
13s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
44s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
4s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} scaladoc {color} | {color:green}  0m 
37s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} scalac {color} | {color:green}  1m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} scaladoc {color} | {color:green}  0m 
37s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  4m 
39s{color} | {color:green} hbase-spark in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
 9s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 13m 30s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-20880 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12931356/HBASE-20880.v01.patch 
|
| Optional Tests |  asflicense  scalac  scaladoc  unit  compile  |
| uname | Linux c99db4bf1619 3.13.0-143-generic #192-Ubuntu SMP Tue Feb 27 
10:45:36 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | master / 3fc23fe930 |
| maven | version: Apache Maven 3.5.4 
(1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HBASE-Build/13610/testReport/ |
| Max. process+thread count | 889 (vs. ulimit of 1) |
| modules | C: hbase-spark U: hbase-spark |
| Console output | 
https://builds.apache.org/job/PreCommit-HBASE-Build/13610/console |
| Powered by | Apache Yetus 0.7.0   http://yetus.apache.org |


This message was automatically generated.



> Fix for warning It would fail on the following input in hbase-spark
> ---
>
> Key: HBASE-20880
> URL: https://issues.apache.org/jira/browse/HBASE-20880
> Project: HBase
>  Issue Type: Bug
> Environment: {code:java}
> Maven home: /opt/apache-maven-3.5.3
> Java version: 1.8.0_172, vendor: Oracle Corporation
> Java home: 
> /Library/Java/JavaVirtualMachines/jdk1.8.0_172.jdk/Contents/Home/jre
> Default locale: en_US, platform encoding: UTF-8
> OS name: "mac os x", version: "10.13.5", arch: "x86_64", family: "mac"{code}
> last commit: 3fc23fe930aa93e8755cf2bd478bd9907f719fd2
>Reporter: Artem Ervits
>Assignee: Artem Ervits
>Priority: Minor
> Attachments: HBASE-20880.v01.patch
>
>
> Compiling the hbase-spark module returns a warning:
> {code:java}
> [WARNING] 
> /.../hbase/hbase-spark/src/test/scala/org/apache/hadoop/hbase/spark/TableOutputFormatSuite.scala:117:
>  warning: match may not be exhaustive.
> [WARNING] It would fail on the following input: Failure((x: Throwable forSome 
> x not in Exception))
> [WARNING] Try {
> [WARNING] ^
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20879) Compacting memstore config should handle lower case

2018-07-12 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16542128#comment-16542128
 ] 

Hadoop QA commented on HBASE-20879:
---

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
26s{color} | {color:blue} Docker mode activated. {color} |
| {color:blue}0{color} | {color:blue} patch {color} | {color:blue}  0m  
1s{color} | {color:blue} The patch file was not named according to hbase's 
naming conventions. Please see 
https://yetus.apache.org/documentation/0.7.0/precommit-patchnames for 
instructions. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
50s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
46s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 5s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
20s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
11s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
31s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m  
0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  2m  
0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 8s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
37s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green}  
9m 49s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 
or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
30s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}179m 
20s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
19s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}220m  6s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-20879 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12931348/20879.v2.txt |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  shadedjars  
hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux 79c5a10db1a3 4.4.0-130-generic #156-Ubuntu SMP Thu Jun 14 
08:53:28 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | master / 3fc23fe930 |
| maven | version: Apache Maven 3.5.4 
(1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| Default Java | 1.8.0_171 |
| findbugs | v3.1.0-RC3 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HBASE-Build/13608/testReport/ |
| Max. process+thread count | 4765 (vs. ulimit of 1) |

[jira] [Commented] (HBASE-18477) Umbrella JIRA for HBase Read Replica clusters

2018-07-12 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-18477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16542141#comment-16542141
 ] 

Hudson commented on HBASE-18477:


Results for branch HBASE-18477
[build #262 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-18477/262/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-18477/262//General_Nightly_Build_Report/]




(/) {color:green}+1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-18477/262//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-18477/262//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(x) {color:red}-1 client integration test{color}
-- Failed when running client tests on top of Hadoop 2. [see log for 
details|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-18477/262//artifact/output-integration/hadoop-2.log].
 (note that this means we didn't run on Hadoop 3)


> Umbrella JIRA for HBase Read Replica clusters
> -
>
> Key: HBASE-18477
> URL: https://issues.apache.org/jira/browse/HBASE-18477
> Project: HBase
>  Issue Type: New Feature
>Reporter: Zach York
>Assignee: Zach York
>Priority: Major
> Attachments: HBase Read-Replica Clusters Scope doc.docx, HBase 
> Read-Replica Clusters Scope doc.pdf, HBase Read-Replica Clusters Scope 
> doc_v2.docx, HBase Read-Replica Clusters Scope doc_v2.pdf
>
>
> Recently, changes (such as HBASE-17437) have made it possible for HBase to run with a 
> root directory external to the cluster (such as in Amazon S3). This means 
> that the data is stored outside of the cluster and remains accessible after 
> the cluster has been terminated. One use case that is often asked about is 
> pointing multiple clusters at one root directory (sharing the data) to get 
> read resiliency in the case of a cluster failure.
>  
> This JIRA is an umbrella JIRA to contain all the tasks necessary to create a 
> read-replica HBase cluster that is pointed at the same root directory.
>  
> This requires making the Read-Replica cluster Read-Only (no metadata 
> or data operations).
> Separating the hbase:meta table for each cluster (otherwise HBase gets 
> confused with multiple clusters trying to update the meta table with their IP 
> addresses).
> Adding refresh functionality for the meta table to ensure new metadata is 
> picked up on the read replica cluster.
> Adding refresh functionality for HFiles for a given table to ensure new data 
> is picked up on the read replica cluster.
>  
> This can be used with any existing cluster that is backed by an external 
> filesystem.
>  
> Please note that this feature is still quite manual (with the potential for 
> automation later).
>  
> More information on this particular feature can be found here: 
> https://aws.amazon.com/blogs/big-data/setting-up-read-replica-clusters-with-hbase-on-amazon-s3/



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20880) Fix for warning It would fail on the following input in hbase-spark

2018-07-12 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-20880:
---
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Thanks for the patch, Artem

> Fix for warning It would fail on the following input in hbase-spark
> ---
>
> Key: HBASE-20880
> URL: https://issues.apache.org/jira/browse/HBASE-20880
> Project: HBase
>  Issue Type: Bug
> Environment: {code:java}
> Maven home: /opt/apache-maven-3.5.3
> Java version: 1.8.0_172, vendor: Oracle Corporation
> Java home: 
> /Library/Java/JavaVirtualMachines/jdk1.8.0_172.jdk/Contents/Home/jre
> Default locale: en_US, platform encoding: UTF-8
> OS name: "mac os x", version: "10.13.5", arch: "x86_64", family: "mac"{code}
> last commit: 3fc23fe930aa93e8755cf2bd478bd9907f719fd2
>Reporter: Artem Ervits
>Assignee: Artem Ervits
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: HBASE-20880.v01.patch
>
>
> Compiling the hbase-spark module returns a warning:
> {code:java}
> [WARNING] 
> /.../hbase/hbase-spark/src/test/scala/org/apache/hadoop/hbase/spark/TableOutputFormatSuite.scala:117:
>  warning: match may not be exhaustive.
> [WARNING] It would fail on the following input: Failure((x: Throwable forSome 
> x not in Exception))
> [WARNING] Try {
> [WARNING] ^
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20558) Backport HBASE-17854 to branch-1

2018-07-12 Thread Tak Lon (Stephen) Wu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16542145#comment-16542145
 ] 

Tak Lon (Stephen) Wu commented on HBASE-20558:
--

[~zyork], any comments?

> Backport HBASE-17854 to branch-1
> 
>
> Key: HBASE-20558
> URL: https://issues.apache.org/jira/browse/HBASE-20558
> Project: HBase
>  Issue Type: Sub-task
>  Components: HFile
>Affects Versions: 1.4.4, 1.4.5
>Reporter: Tak Lon (Stephen) Wu
>Assignee: Tak Lon (Stephen) Wu
>Priority: Major
> Attachments: HBASE-20558.branch-1.001.patch
>
>
> As part of HBASE-20555, HBASE-17854 is the third patch that is needed for 
> backporting HBASE-18083



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20837) Make IDE configuration for import order match that in our checkstyle module

2018-07-12 Thread Tak Lon (Stephen) Wu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16542143#comment-16542143
 ] 

Tak Lon (Stephen) Wu commented on HBASE-20837:
--

[~zyork] [~busbey] any comments on these patches? I didn't add the group for 
java and javax because I found that Eclipse and IntelliJ have opposite 
default orders (Eclipse's order is java, javax, while IntelliJ's is the other 
way around).

> Make IDE configuration for import order match that in our checkstyle module
> ---
>
> Key: HBASE-20837
> URL: https://issues.apache.org/jira/browse/HBASE-20837
> Project: HBase
>  Issue Type: Improvement
>  Components: community
>Affects Versions: 3.0.0, 2.0.1, 1.4.5
>Reporter: Tak Lon (Stephen) Wu
>Assignee: Tak Lon (Stephen) Wu
>Priority: Minor
> Fix For: 3.0.0, 1.5.0, 2.2.0
>
> Attachments: HBASE-20837.branch-1.001.patch, 
> HBASE-20837.branch-1.002.patch, HBASE-20837.branch-1.003.patch, 
> HBASE-20837.branch-2.001.patch, HBASE-20837.branch-2.002.patch, 
> HBASE-20837.branch-2.003.patch, HBASE-20837.master.001.patch, 
> HBASE-20837.master.002.patch, HBASE-20837.master.003.patch, IDEA import 
> layout.png, hbase-intellij-formatter.xml
>
>
> While working on the HBASE-20557 contribution, we figured out that the checkstyle 
> build target (ImportOrder's `groups`, 
> [http://checkstyle.sourceforge.net/config_imports.html]) was different from 
> the formatters of the supported development IDEs (e.g. IntelliJ and Eclipse). We will 
> provide a fix here to sync 
> [dev-support/hbase_eclipse_formatter.xml|https://github.com/apache/hbase/blob/master/dev-support/hbase_eclipse_formatter.xml]
>  with 
> [hbase/checkstyle.xml|https://github.com/apache/hbase/blob/master/hbase-checkstyle/src/main/resources/hbase/checkstyle.xml].
> The changes on master might need to be backported to branch-1 and branch-2 as 
> well.
> Before this change, this is what checkstyle is expecting for import order
>  
> {code:java}
> import com.google.common.annotations.VisibleForTesting;
> import java.io.IOException;
> import java.util.ArrayList;
> import java.util.List;
> import java.util.Map;
> import org.apache.commons.logging.Log;
> import org.apache.commons.logging.LogFactory;
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.hbase.classification.InterfaceAudience;
> import org.apache.hadoop.hbase.conf.ConfigurationObserver;{code}
>  
> And the proposed import order with the respect to HBASE-19262 and HBASE-19552 
> should be
>  
>    !IDEA import layout.png!



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20734) Colocate recovered edits directory with hbase.wal.dir

2018-07-12 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16542158#comment-16542158
 ] 

Ted Yu commented on HBASE-20734:


{code}
+  private FileSystem walFS;
{code}
Maybe put the above field next to {{HRegionFileSystem fs}}. Please also add a 
comment on the role of the above field.
{code}
+  public HRegionFileSystem getRegionWALFileSystem() throws IOException {
+return new HRegionFileSystem(conf, getWalFileSystem(),
{code}
Can the HRegionFileSystem instance be buffered?
{code}
+  public Path getWALRegionDir() throws IOException {
+if (regionDir == null) {
{code}
Looks like naming the regionDir as {{walRegionDir}} would be better.

I don't seem to find the code where, if the recovered edits dir is not found under 
walFs, we fall back to looking under the region dir.
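A minimal sketch of the caching being asked about, lazy-initializing the WAL filesystem and the per-region WAL dir once and reusing them; the class, field, and directory layout below are assumptions for illustration, not the patch:

{code:java}
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.util.FSUtils;

// Illustrative sketch only: cache the WAL filesystem and the per-region
// WAL dir so repeated lookups don't recreate them. Names are hypothetical.
class WalDirCache {
  private final Configuration conf;
  private FileSystem walFS;      // filesystem backing hbase.wal.dir
  private Path walRegionDir;     // per-region dir under the WAL root

  WalDirCache(Configuration conf) {
    this.conf = conf;
  }

  FileSystem getWalFileSystem() throws IOException {
    if (walFS == null) {
      walFS = FSUtils.getWALFileSystem(conf);  // resolves hbase.wal.dir
    }
    return walFS;
  }

  Path getWALRegionDir(String tableName, String encodedRegionName) throws IOException {
    if (walRegionDir == null) {
      // Layout assumption: <walRootDir>/<table>/<encoded-region-name>
      walRegionDir = new Path(new Path(FSUtils.getWALRootDir(conf), tableName),
          encodedRegionName);
    }
    return walRegionDir;
  }
}
{code}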

> Colocate recovered edits directory with hbase.wal.dir
> -
>
> Key: HBASE-20734
> URL: https://issues.apache.org/jira/browse/HBASE-20734
> Project: HBase
>  Issue Type: Improvement
>  Components: MTTR, Recovery, wal
>Reporter: Ted Yu
>Assignee: Zach York
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HBASE-20734.branch-1.001.patch
>
>
> During the investigation of HBASE-20723, I realized that we wouldn't get the best 
> performance w.r.t. recovered edits when hbase.wal.dir is configured to be on 
> different (fast) media than the hbase rootdir, since the recovered edits 
> directory is currently under the rootdir.
> Such a setup may not result in fast recovery when there is a region server 
> failover.
> This issue is to find a proper (hopefully backward-compatible) way of 
> colocating the recovered edits directory with hbase.wal.dir.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20697) Can't cache All region locations of the specify table by calling table.getRegionLocator().getAllRegionLocations()

2018-07-12 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16542160#comment-16542160
 ] 

Hudson commented on HBASE-20697:


Results for branch branch-1.4
[build #383 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/383/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(x) {color:red}-1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/383//General_Nightly_Build_Report/]


(x) {color:red}-1 jdk7 checks{color}
-- For more information [see jdk7 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/383//JDK7_Nightly_Build_Report/]


(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/383//JDK8_Nightly_Build_Report_(Hadoop2)/]




(x) {color:red}-1 source release artifact{color}
-- See build output for details.


> Can't cache All region locations of the specify table by calling 
> table.getRegionLocator().getAllRegionLocations()
> -
>
> Key: HBASE-20697
> URL: https://issues.apache.org/jira/browse/HBASE-20697
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.3.1, 1.2.6, 2.0.1
>Reporter: zhaoyuan
>Assignee: zhaoyuan
>Priority: Major
> Fix For: 3.0.0, 1.5.0, 1.4.6, 2.0.2, 2.2.0, 2.1.1
>
> Attachments: HBASE-20697.branch-1.2.001.patch, 
> HBASE-20697.branch-1.2.002.patch, HBASE-20697.branch-1.2.003.patch, 
> HBASE-20697.branch-1.2.004.patch, HBASE-20697.branch-1.addendum.patch, 
> HBASE-20697.master.001.patch, HBASE-20697.master.002.patch, 
> HBASE-20697.master.002.patch, HBASE-20697.master.003.patch
>
>
> When we upgrade and restart a new version of an application which reads and 
> writes to HBase, we get some operation timeouts. The timeouts are expected 
> because when the application restarts, it does not hold any region location 
> cache and has to communicate with zk and the meta regionserver to get region 
> locations.
> We want to avoid these timeouts, so we do warmup work, and as far as I am 
> concerned, the method table.getRegionLocator().getAllRegionLocations() will 
> fetch all region locations and cache them. However, it didn't work well. 
> There are still a lot of timeouts, so it confused me. 
> I dug into the source code and found something below:
> {code:java}
> public List<HRegionLocation> getAllRegionLocations() throws IOException {
>   TableName tableName = getName();
>   NavigableMap<HRegionInfo, ServerName> locations =
>       MetaScanner.allTableRegions(this.connection, tableName);
>   ArrayList<HRegionLocation> regions = new ArrayList<>(locations.size());
>   for (Entry<HRegionInfo, ServerName> entry : locations.entrySet()) {
>     regions.add(new HRegionLocation(entry.getKey(), entry.getValue()));
>   }
>   if (regions.size() > 0) {
>     connection.cacheLocation(tableName, new RegionLocations(regions));
>   }
>   return regions;
> }
>
> // In MetaCache
> public void cacheLocation(final TableName tableName, final RegionLocations locations) {
>   byte[] startKey = locations.getRegionLocation().getRegionInfo().getStartKey();
>   ConcurrentMap<byte[], RegionLocations> tableLocations = getTableLocations(tableName);
>   RegionLocations oldLocation = tableLocations.putIfAbsent(startKey, locations);
>   boolean isNewCacheEntry = (oldLocation == null);
>   if (isNewCacheEntry) {
>     if (LOG.isTraceEnabled()) {
>       LOG.trace("Cached location: " + locations);
>     }
>     addToCachedServers(locations);
>     return;
>   }
> {code}
> It will collect all regions into one RegionLocations object, but cacheLocation() 
> only keys that entry by the first region's start key. Then, when we put or get 
> to HBase, we call getCachedLocation():
> {code:java}
> public RegionLocations getCachedLocation(final TableName tableName, final
>     byte[] row) {
>   ConcurrentNavigableMap<byte[], RegionLocations> tableLocations =
>       getTableLocations(tableName);
>   Entry<byte[], RegionLocations> e = tableLocations.floorEntry(row);
>   if (e == null) {
>     if (metrics != null) metrics.incrMetaCacheMiss();
>     return null;
>   }
>   RegionLocations possibleRegion = e.getValue();
>   // make sure that the end key is greater than the row we're looking
>   // for, otherwise the row actually belongs in the next region, not
>   // this one. the exception case is when the endkey is
>   // HConstants.EMPTY_END_ROW, signifying that the region we're
>   // checking is actually the last region in the table.
>   byte[] endKey =
>       possibleRegion.getRegionLocation().getRegionInfo().getEndKey();
>   if (Bytes.equals(endKey, HConstants.EMPTY_END_ROW) ||
>       getRowComparator(tableName).compareRows(
>           endKey, 0, endKey.length, row, 0, row.length) > 0) {
> 
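A sketch of one possible fix, assuming it lives in the same getAllRegionLocations() quoted above (an illustration, not the committed patch): cache each region's location individually so every region gets its own cache entry under its own start key.

{code:java}
// Hypothetical sketch: cache one RegionLocations entry per region rather
// than one combined entry keyed only by the first region's start key.
public List<HRegionLocation> getAllRegionLocations() throws IOException {
  TableName tableName = getName();
  NavigableMap<HRegionInfo, ServerName> locations =
      MetaScanner.allTableRegions(this.connection, tableName);
  ArrayList<HRegionLocation> regions = new ArrayList<>(locations.size());
  for (Entry<HRegionInfo, ServerName> entry : locations.entrySet()) {
    HRegionLocation location = new HRegionLocation(entry.getKey(), entry.getValue());
    regions.add(location);
    // Each region is cached under its own start key, so getCachedLocation()
    // can answer for rows in any region, not just the first one.
    connection.cacheLocation(tableName, new RegionLocations(location));
  }
  return regions;
}
{code}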

[jira] [Commented] (HBASE-20651) Master, prevents hbck or shell command to reassign the split parent region

2018-07-12 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16542159#comment-16542159
 ] 

Hudson commented on HBASE-20651:


Results for branch branch-1.4
[build #383 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/383/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(x) {color:red}-1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/383//General_Nightly_Build_Report/]


(x) {color:red}-1 jdk7 checks{color}
-- For more information [see jdk7 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/383//JDK7_Nightly_Build_Report/]


(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/383//JDK8_Nightly_Build_Report_(Hadoop2)/]




(x) {color:red}-1 source release artifact{color}
-- See build output for details.


> Master, prevents hbck or shell command to reassign the split parent region
> --
>
> Key: HBASE-20651
> URL: https://issues.apache.org/jira/browse/HBASE-20651
> Project: HBase
>  Issue Type: Improvement
>  Components: master
>Affects Versions: 1.2.6
>Reporter: huaxiang sun
>Assignee: huaxiang sun
>Priority: Minor
> Fix For: 1.5.0, 1.2.7, 1.3.3, 1.4.6
>
> Attachments: HBASE-20651-branch-1-v001.patch, 
> HBASE-20651-branch-1-v002.patch, HBASE-20651-branch-1-v003.patch
>
>
> We are seeing that hbck brings back the split parent region and this causes 
> region inconsistency. More details will be filled in as the reproduction is 
> still ongoing. We might need to do something in hbck or the master to prevent 
> this from happening.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20734) Colocate recovered edits directory with hbase.wal.dir

2018-07-12 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16542168#comment-16542168
 ] 

Ted Yu commented on HBASE-20734:


I ran the unit tests touched by the patch.
There were a few test failures, e.g.:
{code}
testRecoveredEditsReplayCompaction(org.apache.hadoop.hbase.regionserver.TestHRegion)
  Time elapsed: 0.146 sec  <<< FAILURE!
java.lang.AssertionError: expected:<1> but was:<4>
at 
org.apache.hadoop.hbase.regionserver.TestHRegion.testRecoveredEditsReplayCompaction(TestHRegion.java:969)
at 
org.apache.hadoop.hbase.regionserver.TestHRegion.testRecoveredEditsReplayCompaction(TestHRegion.java:876)
{code}

> Colocate recovered edits directory with hbase.wal.dir
> -
>
> Key: HBASE-20734
> URL: https://issues.apache.org/jira/browse/HBASE-20734
> Project: HBase
>  Issue Type: Improvement
>  Components: MTTR, Recovery, wal
>Reporter: Ted Yu
>Assignee: Zach York
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HBASE-20734.branch-1.001.patch
>
>
> During the investigation of HBASE-20723, I realized that we wouldn't get the best 
> performance w.r.t. recovered edits when hbase.wal.dir is configured to be on 
> different (fast) media than the hbase rootdir, since the recovered edits 
> directory is currently under the rootdir.
> Such a setup may not result in fast recovery when there is a region server 
> failover.
> This issue is to find a proper (hopefully backward-compatible) way of 
> colocating the recovered edits directory with hbase.wal.dir.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (HBASE-20872) Cause: java.lang.RuntimeException: Failed construction of Master: class org.apache.hadoop.hbase.master.HMasterUncompilable source code - package org.apache.hbase.thirdp

2018-07-12 Thread Artem Ervits (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artem Ervits resolved HBASE-20872.
--
Resolution: Cannot Reproduce

I was able to reproduce the issue a few times but then it magically started 
working.

> Cause: java.lang.RuntimeException: Failed construction of Master: class 
> org.apache.hadoop.hbase.master.HMasterUncompilable source code - package 
> org.apache.hbase.thirdparty.io.netty.channel does not exist
> 
>
> Key: HBASE-20872
> URL: https://issues.apache.org/jira/browse/HBASE-20872
> Project: HBase
>  Issue Type: Bug
>Reporter: Artem Ervits
>Priority: Major
>
> running
> {code:java}
> mvn clean test{code}
> on hbase-spark fails with
> {code:java}
> Cause: java.lang.RuntimeException: Failed construction of Master: class 
> org.apache.hadoop.hbase.master.HMasterUncompilable source code - package 
> org.apache.hbase.thirdparty.io.netty.channel does not exist
> at 
> org.apache.hadoop.hbase.util.JVMClusterUtil.createMasterThread(JVMClusterUtil.java:136)
> at 
> org.apache.hadoop.hbase.LocalHBaseCluster.addMaster(LocalHBaseCluster.java:212)
> at 
> org.apache.hadoop.hbase.LocalHBaseCluster.<init>(LocalHBaseCluster.java:159)
> at org.apache.hadoop.hbase.MiniHBaseCluster.init(MiniHBaseCluster.java:250)
> at org.apache.hadoop.hbase.MiniHBaseCluster.<init>(MiniHBaseCluster.java:121)
> at 
> org.apache.hadoop.hbase.HBaseTestingUtility.startMiniHBaseCluster(HBaseTestingUtility.java:1042)
> at 
> org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:988)
> at 
> org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:859)
> at 
> org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:853)
> at 
> org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:782)
> ...
> Cause: java.lang.ExceptionInInitializerError:
> at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.setupNetty(HRegionServer.java:688)
> at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.<init>(HRegionServer.java:547)
> at org.apache.hadoop.hbase.master.HMaster.<init>(HMaster.java:486)
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
> at 
> org.apache.hadoop.hbase.util.JVMClusterUtil.createMasterThread(JVMClusterUtil.java:131)
> at 
> org.apache.hadoop.hbase.LocalHBaseCluster.addMaster(LocalHBaseCluster.java:212)
> at 
> org.apache.hadoop.hbase.LocalHBaseCluster.<init>(LocalHBaseCluster.java:159)
> ...
> Cause: java.lang.RuntimeException: Uncompilable source code - package 
> org.apache.hbase.thirdparty.io.netty.channel does not exist
> at 
> org.apache.hadoop.hbase.util.NettyEventLoopGroupConfig.<clinit>(NettyEventLoopGroupConfig.java:20)
> at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.setupNetty(HRegionServer.java:688)
> at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.<init>(HRegionServer.java:547)
> at org.apache.hadoop.hbase.master.HMaster.<init>(HMaster.java:486)
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
> at 
> org.apache.hadoop.hbase.util.JVMClusterUtil.createMasterThread(JVMClusterUtil.java:131)
> at 
> org.apache.hadoop.hbase.LocalHBaseCluster.addMaster(LocalHBaseCluster.java:212){code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20651) Master, prevents hbck or shell command to reassign the split parent region

2018-07-12 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16542241#comment-16542241
 ] 

Hudson commented on HBASE-20651:


Results for branch branch-1.2
[build #394 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.2/394/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(x) {color:red}-1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.2/394//General_Nightly_Build_Report/]


(x) {color:red}-1 jdk7 checks{color}
-- For more information [see jdk7 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.2/394//JDK7_Nightly_Build_Report/]


(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.2/394//JDK8_Nightly_Build_Report_(Hadoop2)/]




(x) {color:red}-1 source release artifact{color}
-- See build output for details.


> Master, prevents hbck or shell command to reassign the split parent region
> --
>
> Key: HBASE-20651
> URL: https://issues.apache.org/jira/browse/HBASE-20651
> Project: HBase
>  Issue Type: Improvement
>  Components: master
>Affects Versions: 1.2.6
>Reporter: huaxiang sun
>Assignee: huaxiang sun
>Priority: Minor
> Fix For: 1.5.0, 1.2.7, 1.3.3, 1.4.6
>
> Attachments: HBASE-20651-branch-1-v001.patch, 
> HBASE-20651-branch-1-v002.patch, HBASE-20651-branch-1-v003.patch
>
>
> We are seeing that hbck brings back the split parent region and this causes 
> region inconsistency. More details will be filled in as the reproduction is 
> still ongoing. We might need to do something in hbck or the master to prevent 
> this from happening.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20704) Sometimes some compacted storefiles are not archived on region close

2018-07-12 Thread Andrew Purtell (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16542245#comment-16542245
 ] 

Andrew Purtell commented on HBASE-20704:


bq.  add a map to keep track of the new readers created when a stream scanner 
is created and close those on region close 

That does sound like the right thing to do. How difficult would it be? Leaving 
unclosed streams around, expecting eventual GC to call a finalizer that cleans 
things up, is both a leak, technically, and a source of elevated GC pause 
latency. 

> Sometimes some compacted storefiles are not archived on region close
> 
>
> Key: HBASE-20704
> URL: https://issues.apache.org/jira/browse/HBASE-20704
> Project: HBase
>  Issue Type: Bug
>  Components: Compaction
>Affects Versions: 3.0.0, 1.3.0, 1.4.0, 1.5.0, 2.0.0
>Reporter: Francis Liu
>Assignee: Francis Liu
>Priority: Critical
> Attachments: HBASE-20704.001.patch, HBASE-20704.002.patch, 
> HBASE-20704.003.patch
>
>
> During region close, compacted files which have not yet been archived by the 
> discharger are archived as part of the region closing process. It is 
> important that these files are wholly archived to ensure data consistency, i.e. 
> a storefile containing delete tombstones can be archived while older 
> storefiles containing cells that were supposed to be deleted are left 
> unarchived, thereby undeleting those cells. 
> On region close a compacted storefile is skipped from archiving if it has 
> read references (i.e. open scanners). This behavior is correct for when the 
> discharger chore runs, but on region close consistency is of course more 
> important, so we should add a special case to ignore any references on the 
> storefile and go ahead and archive it. 
> The attached patch contains a unit test that reproduces the problem and the 
> proposed fix.
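A rough sketch of that special case, with hypothetical method names (the real store/file APIs may differ): on close, references are ignored so every compacted-away file gets archived.

{code:java}
import java.io.IOException;
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import org.apache.hadoop.hbase.regionserver.StoreFile;

// Hypothetical sketch: when the region is closing, archive compacted-away
// storefiles even if they still have read references (open scanners),
// because leaving any of them behind can "undelete" older cells.
private void archiveCompactedFiles(Collection<StoreFile> compactedFiles,
    boolean regionClosing) throws IOException {
  List<StoreFile> toArchive = new ArrayList<>();
  for (StoreFile file : compactedFiles) {
    // The discharger chore skips files with live references; on close we do not.
    if (regionClosing || !file.isReferencedInReads()) {
      toArchive.add(file);
    }
  }
  // Hand the whole batch to the archiver so the set is archived together.
  removeCompactedfiles(toArchive);
}
{code}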



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20305) Add option to SyncTable that skip deletes on target cluster

2018-07-12 Thread Andrew Purtell (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16542248#comment-16542248
 ] 

Andrew Purtell commented on HBASE-20305:


No concerns here

> Add option to SyncTable that skip deletes on target cluster
> ---
>
> Key: HBASE-20305
> URL: https://issues.apache.org/jira/browse/HBASE-20305
> Project: HBase
>  Issue Type: Improvement
>  Components: mapreduce
>Affects Versions: 2.0.0-alpha-4
>Reporter: Wellington Chevreuil
>Assignee: Wellington Chevreuil
>Priority: Minor
> Fix For: 3.0.0, 1.5.0, 2.2.0
>
> Attachments: 0001-HBASE-20305.master.001.patch, 
> HBASE-20305.master.002.patch
>
>
> We had a situation where two clusters with active-active replication got out 
> of sync, but both had data that should be kept. The tables in question never 
> have data deleted, but ingestion had happened on the two different clusters, 
> and some rows had even been updated.
> In this scenario, a cell that is present on only one of the clusters should 
> not be deleted, but replayed on the other. Also, for cells with the same 
> identifier but different values, the most recent value should be kept. 
> The current version of SyncTable would not be applicable here, because it 
> would simply copy the whole state from source to target, thus losing any 
> additional rows that exist only in the target, as well as cell values that 
> got a more recent update. This could be solved by adding an option to skip 
> deletes for SyncTable. This way, the additional cells not present on the 
> source would still be kept. For cells with the same identifier but different 
> values, it would just perform a Put for the cell version from the source, 
> and client scans would still fetch the value with the most recent timestamp.
> I'm attaching a patch with this additional option shortly. Please share your 
> thoughts.
>  
>  
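> To make the semantics concrete, here is a minimal sketch of the per-cell 
> decision such an option implies (the flag name doDeletes and all class and 
> method names here are illustrative assumptions, not the actual patch):
> {code:java}
> import java.io.IOException;
> import org.apache.hadoop.hbase.Cell;
> import org.apache.hadoop.hbase.CellUtil;
> import org.apache.hadoop.hbase.client.Delete;
> import org.apache.hadoop.hbase.client.Put;
> import org.apache.hadoop.hbase.client.Table;
> 
> // Sketch of the per-cell sync decision with a skip-deletes option.
> public class SkipDeletesSketch {
>   static void syncCell(Cell sourceCell, Cell targetCell, Table target,
>       boolean doDeletes) throws IOException {
>     if (sourceCell == null) {
>       // Cell exists only on the target: issue a Delete only when enabled.
>       if (doDeletes) {
>         Delete d = new Delete(CellUtil.cloneRow(targetCell));
>         d.addColumn(CellUtil.cloneFamily(targetCell),
>             CellUtil.cloneQualifier(targetCell), targetCell.getTimestamp());
>         target.delete(d);
>       }
>       // With doDeletes == false, the extra target cell is simply kept.
>     } else if (targetCell == null
>         || !CellUtil.matchingValue(sourceCell, targetCell)) {
>       // Cell missing or different on the target: Put the source version;
>       // scans will still return whichever cell has the newest timestamp.
>       Put p = new Put(CellUtil.cloneRow(sourceCell));
>       p.add(sourceCell);
>       target.put(p);
>     }
>   }
> }
> {code}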



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (HBASE-20704) Sometimes some compacted storefiles are not archived on region close

2018-07-12 Thread Andrew Purtell (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16542245#comment-16542245
 ] 

Andrew Purtell edited comment on HBASE-20704 at 7/12/18 10:07 PM:
--

bq.  add a map to keep track of the new readers created when a stream scanner 
is created and close those on region close 

That does sound like the right thing to do. How difficult would it be? Leaving 
unclosed streams around, expecting eventual GC to call a finalizer that cleans 
things up, is both a leak, technically, and a source of elevated GC pause 
latency. OTOH, we wouldn't expect a high rate of region close, and this 
wouldn't happen every time, so the occurrence probability is low. Not a 
must-do IMHO, but good if we can.


was (Author: apurtell):
bq.  add a map to keep track of the new readers created when a stream scanner 
is created and close those on region close 

That does sound like the right thing to do. How difficult would it be? Leaving 
unclosed streams around, expecting eventual GC to call a finalizer that cleans 
things up, is both a leak, technically, and a source of elevated GC pause 
latency. 

> Sometimes some compacted storefiles are not archived on region close
> 
>
> Key: HBASE-20704
> URL: https://issues.apache.org/jira/browse/HBASE-20704
> Project: HBase
>  Issue Type: Bug
>  Components: Compaction
>Affects Versions: 3.0.0, 1.3.0, 1.4.0, 1.5.0, 2.0.0
>Reporter: Francis Liu
>Assignee: Francis Liu
>Priority: Critical
> Attachments: HBASE-20704.001.patch, HBASE-20704.002.patch, 
> HBASE-20704.003.patch
>
>
> During region close, compacted files which have not yet been archived by the 
> discharger are archived as part of the region closing process. It is 
> important that these files are wholly archived to ensure data consistency; 
> i.e., a storefile containing delete tombstones could be archived while older 
> storefiles containing cells that were supposed to be deleted are left 
> unarchived, thereby undeleting those cells. 
> On region close, a compacted storefile is skipped from archiving if it has 
> read references (i.e., open scanners). This behavior is correct when the 
> discharger chore runs, but on region close consistency is of course more 
> important, so we should add a special case to ignore any references on the 
> storefile and go ahead and archive it. 
> The attached patch contains a unit test that reproduces the problem and the 
> proposed fix.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20734) Colocate recovered edits directory with hbase.wal.dir

2018-07-12 Thread Zach York (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16542259#comment-16542259
 ] 

Zach York commented on HBASE-20734:
---

Thanks for reviewing [~yuzhih...@gmail.com]. I am trying to rebase on master, 
but there are a ton of conflicts. I'll hopefully get a new patch up early next 
week, as I will likely have to redo a lot of the changes on top of the master 
branch. I'll also put it up on Review Board.

> Colocate recovered edits directory with hbase.wal.dir
> -
>
> Key: HBASE-20734
> URL: https://issues.apache.org/jira/browse/HBASE-20734
> Project: HBase
>  Issue Type: Improvement
>  Components: MTTR, Recovery, wal
>Reporter: Ted Yu
>Assignee: Zach York
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HBASE-20734.branch-1.001.patch
>
>
> During investigation of HBASE-20723, I realized that we wouldn't get the best 
> performance when hbase.wal.dir is configured to be on different (fast) media 
> than the hbase rootdir with respect to recovered edits, since the recovered 
> edits directory is currently under rootdir.
> Such a setup may not result in fast recovery when there is a region server 
> failover.
> This issue is to find a proper (hopefully backward compatible) way of 
> colocating the recovered edits directory with hbase.wal.dir.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20651) Master, prevents hbck or shell command to reassign the split parent region

2018-07-12 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16542283#comment-16542283
 ] 

Hudson commented on HBASE-20651:


Results for branch branch-1.3
[build #390 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.3/390/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(x) {color:red}-1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.3/390//General_Nightly_Build_Report/]


(x) {color:red}-1 jdk7 checks{color}
-- For more information [see jdk7 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.3/390//JDK7_Nightly_Build_Report/]


(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.3/390//JDK8_Nightly_Build_Report_(Hadoop2)/]




(x) {color:red}-1 source release artifact{color}
-- See build output for details.


> Master, prevents hbck or shell command to reassign the split parent region
> --
>
> Key: HBASE-20651
> URL: https://issues.apache.org/jira/browse/HBASE-20651
> Project: HBase
>  Issue Type: Improvement
>  Components: master
>Affects Versions: 1.2.6
>Reporter: huaxiang sun
>Assignee: huaxiang sun
>Priority: Minor
> Fix For: 1.5.0, 1.2.7, 1.3.3, 1.4.6
>
> Attachments: HBASE-20651-branch-1-v001.patch, 
> HBASE-20651-branch-1-v002.patch, HBASE-20651-branch-1-v003.patch
>
>
> We are seeing that hbck brings back the split parent region, and this causes 
> region inconsistency. More details will be filled in as reproduction is 
> still ongoing. We might need to do something in hbck or the master to 
> prevent this from happening.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20053) Remove .cmake file extension from .gitignore

2018-07-12 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-20053:
---
Labels: build  (was: )

> Remove .cmake file extension from .gitignore
> 
>
> Key: HBASE-20053
> URL: https://issues.apache.org/jira/browse/HBASE-20053
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Ted Yu
>Priority: Minor
>  Labels: build
>
> There are .cmake files under hbase-native-client/cmake/ which are under 
> source control.
> The .cmake extension should be taken out of hbase-native-client/.gitignore



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20878) Data loss if merging regions while ServerCrashProcedure executing

2018-07-12 Thread Duo Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16542361#comment-16542361
 ] 

Duo Zhang commented on HBASE-20878:
---

The patch contains some testing code? And I still think we should check the 
region itself, not the server. It is possible that the region was closed 
normally and then the RS crashed before we do the check.

> Data loss if merging regions while ServerCrashProcedure executing
> -
>
> Key: HBASE-20878
> URL: https://issues.apache.org/jira/browse/HBASE-20878
> Project: HBase
>  Issue Type: Bug
>  Components: amv2
>Affects Versions: 3.0.0, 2.1.0, 2.0.1
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Critical
> Fix For: 3.0.0, 2.0.2, 2.1.1
>
> Attachments: HBASE-20878.branch-2.0.001.patch
>
>
> In MergeTableRegionsProcedure, we close the regions to merge using 
> UnassignProcedure. But if the RS these regions are on has crashed, a 
> ServerCrashProcedure will execute at the same time. UnassignProcedures will 
> be blocked until all logs are split. But since these regions are closed for 
> merging, the regions won't be opened again, so the recovered.edits in the 
> region dirs won't be replayed and data will be lost.
> I provided a test to reproduce this case. I suspect the split region 
> procedure also has this kind of problem. I will check later.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20878) Data loss if merging regions while ServerCrashProcedure executing

2018-07-12 Thread Allan Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allan Yang updated HBASE-20878:
---
Attachment: HBASE-20878.branch-2.0.002.patch

> Data loss if merging regions while ServerCrashProcedure executing
> -
>
> Key: HBASE-20878
> URL: https://issues.apache.org/jira/browse/HBASE-20878
> Project: HBase
>  Issue Type: Bug
>  Components: amv2
>Affects Versions: 3.0.0, 2.1.0, 2.0.1
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Critical
> Fix For: 3.0.0, 2.0.2, 2.1.1
>
> Attachments: HBASE-20878.branch-2.0.001.patch, 
> HBASE-20878.branch-2.0.002.patch
>
>
> In MergeTableRegionsProcedure, we close the regions to merge using 
> UnassignProcedure. But if the RS these regions are on has crashed, a 
> ServerCrashProcedure will execute at the same time. UnassignProcedures will 
> be blocked until all logs are split. But since these regions are closed for 
> merging, the regions won't be opened again, so the recovered.edits in the 
> region dirs won't be replayed and data will be lost.
> I provided a test to reproduce this case. I suspect the split region 
> procedure also has this kind of problem. I will check later.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20616) TruncateTableProcedure is stuck in retry loop in TRUNCATE_TABLE_CREATE_FS_LAYOUT state

2018-07-12 Thread Duo Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16542385#comment-16542385
 ] 

Duo Zhang commented on HBASE-20616:
---

Please open a sub-task to backport the fix to branch-2.0. This is a bug fix, so 
it should be included in branch-2.0, especially since it has already been 
backported to branch-1.

Thanks.

> TruncateTableProcedure is stuck in retry loop in 
> TRUNCATE_TABLE_CREATE_FS_LAYOUT state
> --
>
> Key: HBASE-20616
> URL: https://issues.apache.org/jira/browse/HBASE-20616
> Project: HBase
>  Issue Type: Bug
>  Components: amv2
> Environment: HDP-2.5.3
>Reporter: Toshihiro Suzuki
>Assignee: Toshihiro Suzuki
>Priority: Major
> Fix For: 2.1.0
>
> Attachments: 20616.master.004.patch, HBASE-20616.master.001.patch, 
> HBASE-20616.master.002.patch, HBASE-20616.master.003.patch, 
> HBASE-20616.master.004.patch
>
>
> At first, TruncateTableProcedure failed to write some files to HDFS in 
> TRUNCATE_TABLE_CREATE_FS_LAYOUT state for some reason.
> {code:java}
> 2018-05-15 08:00:25,346 WARN  [ProcedureExecutorThread-8] 
> procedure.TruncateTableProcedure: Retriable error trying to truncate 
> table=<table>: state=TRUNCATE_TABLE_CREATE_FS_LAYOUT
> java.io.IOException: java.util.concurrent.ExecutionException: 
> org.apache.hadoop.ipc.RemoteException(java.io.IOException): File 
> /apps/hbase/data/.tmp/data/<namespace>/<table>/<region>/.regioninfo could 
> only be replicated to 0 nodes instead of minReplication (=1). There are 
> <number of DNs> datanode(s) running and no node(s) are excluded in this 
> operation.
> ...
> {code}
> But at this time, seemed like writing some files to HDFS was successful.
> And then, TruncateTableProcedure was stuck in retry loop in 
> TRUNCATE_TABLE_CREATE_FS_LAYOUT state. At this point, the following log 
> messages were shown repeatedly in the master log:
> {code:java}
> 2018-05-15 08:00:25,463 WARN  [ProcedureExecutorThread-8] 
> procedure.TruncateTableProcedure: Retriable error trying to truncate 
> table=<table>: state=TRUNCATE_TABLE_CREATE_FS_LAYOUT
> java.io.IOException: java.util.concurrent.ExecutionException: 
> java.io.IOException: The specified region already exists on disk: 
> hdfs://<namenode>/apps/hbase/data/.tmp/data/<namespace>/<table>/<region>
> ...
> {code}
> It seems like this is because TruncateTableProcedure tried to rewrite the 
> files that had already been written successfully in the first try.
> I think we need to delete all the files and directories that are written 
> successfully in the previous try before retrying the 
> TRUNCATE_TABLE_CREATE_FS_LAYOUT state.
> Actually, this issue was observed in HDP-2.5.3, but I think the upstream has 
> the same issue. Also, it looks to me that CreateTableProcedure has a similar 
> issue.
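> For example, a minimal sketch of such a cleanup step (class and method names 
> are assumptions, not the actual procedure code):
> {code:java}
> import java.io.IOException;
> import org.apache.hadoop.fs.FileSystem;
> import org.apache.hadoop.fs.Path;
> 
> // Sketch: before re-running the create-FS-layout step, remove any partial
> // output left by a previous attempt so the retry starts from a clean
> // staging directory and cannot hit "region already exists on disk".
> final class FsLayoutRetryHelper {
>   private FsLayoutRetryHelper() {}
> 
>   static void cleanupBeforeRetry(FileSystem fs, Path tempTableDir)
>       throws IOException {
>     if (fs.exists(tempTableDir) && !fs.delete(tempTableDir, true)) {
>       throw new IOException("Couldn't delete partial layout at " + tempTableDir);
>     }
>   }
> }
> {code}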



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20865) CreateTableProcedure is stuck in retry loop in CREATE_TABLE_WRITE_FS_LAYOUT state

2018-07-12 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16542388#comment-16542388
 ] 

Ted Yu commented on HBASE-20865:


[~Apache9]:
Can you help integrate this into the 2.x.y branches?

> CreateTableProcedure is stuck in retry loop in CREATE_TABLE_WRITE_FS_LAYOUT 
> state
> -
>
> Key: HBASE-20865
> URL: https://issues.apache.org/jira/browse/HBASE-20865
> Project: HBase
>  Issue Type: Bug
>  Components: amv2
>Reporter: Toshihiro Suzuki
>Assignee: Toshihiro Suzuki
>Priority: Major
> Fix For: 3.0.0, 2.0.2, 2.2.0, 2.1.1
>
> Attachments: HBASE-20865.master.001.patch
>
>
> Similar to HBASE-20616, CreateTableProcedure gets stuck in a retry loop in 
> the CREATE_TABLE_WRITE_FS_LAYOUT state when writing to HDFS fails.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20878) Data loss if merging regions while ServerCrashProcedure executing

2018-07-12 Thread Allan Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16542395#comment-16542395
 ] 

Allan Yang commented on HBASE-20878:


Uploaded a new patch with the testing code deleted.
I think checking the server is OK. It is like a safety fence. If we close the 
regions and the RS the regions were on is still online, we can safely proceed 
to the next state. The chance that the RS crashes just before we do this check 
and the region close is very, very small, and if it happens, aborting the 
merge is still OK to do.
On the other hand, checking the region itself, like checking for 
recovered.edits, is a bit of a hack I think. What if we bring DLR (distributed 
log replay) back and there are no recovered.edits?
Still, it is open to discussion. Another question is: should we just abort the 
merge, or should we retry, just like you said above, [~Apache9]?

> Data loss if merging regions while ServerCrashProcedure executing
> -
>
> Key: HBASE-20878
> URL: https://issues.apache.org/jira/browse/HBASE-20878
> Project: HBase
>  Issue Type: Bug
>  Components: amv2
>Affects Versions: 3.0.0, 2.1.0, 2.0.1
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Critical
> Fix For: 3.0.0, 2.0.2, 2.1.1
>
> Attachments: HBASE-20878.branch-2.0.001.patch, 
> HBASE-20878.branch-2.0.002.patch
>
>
> In MergeTableRegionsProcedure, we close the regions to merge using 
> UnassignProcedure. But if the RS these regions are on has crashed, a 
> ServerCrashProcedure will execute at the same time. UnassignProcedures will 
> be blocked until all logs are split. But since these regions are closed for 
> merging, the regions won't be opened again, so the recovered.edits in the 
> region dirs won't be replayed and data will be lost.
> I provided a test to reproduce this case. I suspect the split region 
> procedure also has this kind of problem. I will check later.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20878) Data loss if merging regions while ServerCrashProcedure executing

2018-07-12 Thread Duo Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16542403#comment-16542403
 ] 

Duo Zhang commented on HBASE-20878:
---

{quote}
What if we bring DLR (distributed log replay) back and there are no 
recovered.edits?
{quote}
That's a problem for whoever wants to bring it back.

Anyway, the problem here is that the region has not been closed normally, and I 
do not think a crashed RS is a good condition for testing that. Besides checking 
the recovered.edits, maybe we could check the region state? IIRC, we will not 
update meta in SCP to move the region to CLOSED state. The assumption here is 
that a region in CLOSED state must have been closed normally. Maybe we could 
introduce a state called ABNORMALLY_CLOSED, which indicates that the region 
will be processed by SCP.

For now, I prefer checking recovered.edits over checking for a crashed RS. 
Skimming the code again: you use lastHost to determine whether the region has 
been on a crashed RS.
{code}
// notice that, the lastHost will only be updated when a region is 
successfully CLOSED through
// UnassignProcedure, so do not use it for critical condition as the data 
maybe stale and unsync
// with the data in meta.
private volatile ServerName lastHost = null;
{code}

I added that comment when resolving HBASE-20792. lastHost should not be used 
for critical conditions...
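
A minimal sketch of such a recovered.edits check (assuming the standard 
"recovered.edits" directory name under the region dir; not tied to any 
particular patch):
{code:java}
import java.io.IOException;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Sketch: treat a region as "not cleanly closed" if its directory still
// contains a non-empty recovered.edits subdirectory, and abort/retry the
// merge in that case.
final class RecoveredEditsCheck {
  private RecoveredEditsCheck() {}

  static boolean hasRecoveredEdits(FileSystem fs, Path regionDir)
      throws IOException {
    Path editsDir = new Path(regionDir, "recovered.edits");
    return fs.exists(editsDir) && fs.listStatus(editsDir).length > 0;
  }
}
{code}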



> Data loss if merging regions while ServerCrashProcedure executing
> -
>
> Key: HBASE-20878
> URL: https://issues.apache.org/jira/browse/HBASE-20878
> Project: HBase
>  Issue Type: Bug
>  Components: amv2
>Affects Versions: 3.0.0, 2.1.0, 2.0.1
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Critical
> Fix For: 3.0.0, 2.0.2, 2.1.1
>
> Attachments: HBASE-20878.branch-2.0.001.patch, 
> HBASE-20878.branch-2.0.002.patch
>
>
> In MergeTableRegionsProcedure, we close the regions to merge using 
> UnassignProcedure. But if the RS these regions are on has crashed, a 
> ServerCrashProcedure will execute at the same time. UnassignProcedures will 
> be blocked until all logs are split. But since these regions are closed for 
> merging, the regions won't be opened again, so the recovered.edits in the 
> region dirs won't be replayed and data will be lost.
> I provided a test to reproduce this case. I suspect the split region 
> procedure also has this kind of problem. I will check later.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20866) HBase 1.x scan performance degradation compared to 0.98 version

2018-07-12 Thread Vikas Vishwakarma (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16542421#comment-16542421
 ] 

Vikas Vishwakarma commented on HBASE-20866:
---

Added a Review Board link here: [https://reviews.apache.org/r/67902/]

[~apurtell] [~reidchan] [~mdrob]

 

> HBase 1.x scan performance degradation compared to 0.98 version
> ---
>
> Key: HBASE-20866
> URL: https://issues.apache.org/jira/browse/HBASE-20866
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.3.2
>Reporter: Vikas Vishwakarma
>Assignee: Vikas Vishwakarma
>Priority: Critical
> Fix For: 1.5.0, 1.2.7, 1.3.3, 1.4.6
>
> Attachments: HBASE-20866.branch-1.3.001.patch, 
> HBASE-20866.branch-1.3.002.patch, HBASE-20866.branch-1.3.003.patch
>
>
> Internally, while testing 1.3 as part of our migration from 0.98 to 1.3, we 
> observed degradation in scan performance for Phoenix queries varying from a 
> few tens of percent up to 200%, depending on the query being executed. We 
> tried a simple native HBase scan and there too we saw up to 40% degradation 
> in performance when the number of column qualifiers is high (40-50+).
> To identify the root cause of the performance difference between 0.98 and 
> 1.3 we carried out a lot of experiments with profiling and git bisect 
> iterations; however, we were not able to identify any single source of scan 
> performance degradation, and it looks like this is an accumulated 
> degradation of 5-10% over various enhancements and refactorings.
> We identified a few major enhancements, like partialResult handling, 
> ScannerContext with heartbeat processing, time/size limiting, and RPC 
> refactoring, that could each have contributed a small degradation in 
> performance which, put together, could lead to a large overall degradation.
> One of the changes is 
> [HBASE-11544|https://jira.apache.org/jira/browse/HBASE-11544], which 
> implements partialResult handling. In ClientScanner.java the results 
> received from the server are cached on the client side by converting the 
> result array into an ArrayList. This function gets called in a loop, 
> depending on the number of rows in the scan result. For example, for tens 
> of millions of rows scanned, this can be called on the order of millions of 
> times.
> In almost all cases (99% of the time, except for handling partial results, 
> etc.) we are just taking resultsFromServer, converting it into an ArrayList 
> resultsToAddToCache in addResultsToList(..), and then iterating over the 
> list again and adding it to the cache in loadCache(..), as given in the 
> code path below:
> In ClientScanner → loadCache(..) → getResultsToAddToCache(..) → 
> addResultsToList(..) →
> {code:java}
> loadCache() {
> ...
>  List<Result> resultsToAddToCache =
>  getResultsToAddToCache(values, callable.isHeartbeatMessage());
> ...
> …
>    for (Result rs : resultsToAddToCache) {
>  rs = filterLoadedCell(rs);
>  cache.add(rs);
> ...
>    }
> }
> getResultsToAddToCache(..) {
> ..
>    final boolean isBatchSet = scan != null && scan.getBatch() > 0;
>    final boolean allowPartials = scan != null && 
> scan.getAllowPartialResults();
> ..
>    if (allowPartials || isBatchSet) {
>  addResultsToList(resultsToAddToCache, resultsFromServer, 0,
>    (null == resultsFromServer ? 0 : resultsFromServer.length));
>  return resultsToAddToCache;
>    }
> ...
> }
> private void addResultsToList(List<Result> outputList, Result[] inputArray, 
> int start, int end) {
>    if (inputArray == null || start < 0 || end > inputArray.length) return;
>    for (int i = start; i < end; i++) {
>  outputList.add(inputArray[i]);
>    }
>  }{code}
>  
> It looks like we can avoid the result-array-to-ArrayList conversion 
> (resultsFromServer --> resultsToAddToCache) for the first case, which is 
> also the most frequent case, and instead directly take the values array 
> returned by the callable and add it to the cache without converting it into 
> an ArrayList.
> I have taken both of these flags, allowPartials and isBatchSet, out into 
> loadCache(), and I directly add values to the scanner cache if the above 
> condition passes, instead of converting them into an ArrayList by calling 
> getResultsToAddToCache(). For example:
> {code:java}
> protected void loadCache() throws IOException {
> Result[] values = null;
> ..
> final boolean isBatchSet = scan != null && scan.getBatch() > 0;
> final boolean allowPartials = scan != null && scan.getAllowPartialResults();
> ..
> for (;;) {
> try {
> values = call(callable, caller, scannerTimeout);
> ..
> } catch (DoNotRetryIOException | NeedUnmanagedConnectionException e) {
> ..
> }
> if (allowPartials || isBatchSet) {  // DIRECTLY COPY values TO CACHE
> if (values != null) {
> for (int v = 0; v < values.length; v++) {
> Result rs = values[v];
> cache.add(rs);

[jira] [Updated] (HBASE-20866) HBase 1.x scan performance degradation compared to 0.98 version

2018-07-12 Thread Vikas Vishwakarma (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikas Vishwakarma updated HBASE-20866:
--
Attachment: (was: HBASE-20866.branch-1.3.003.patch)

> HBase 1.x scan performance degradation compared to 0.98 version
> ---
>
> Key: HBASE-20866
> URL: https://issues.apache.org/jira/browse/HBASE-20866
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.3.2
>Reporter: Vikas Vishwakarma
>Assignee: Vikas Vishwakarma
>Priority: Critical
> Fix For: 1.5.0, 1.2.7, 1.3.3, 1.4.6
>
> Attachments: HBASE-20866.branch-1.3.001.patch, 
> HBASE-20866.branch-1.3.002.patch
>
>
> Internally, while testing 1.3 as part of our migration from 0.98 to 1.3, we 
> observed degradation in scan performance for Phoenix queries varying from a 
> few tens of percent up to 200%, depending on the query being executed. We 
> tried a simple native HBase scan and there too we saw up to 40% degradation 
> in performance when the number of column qualifiers is high (40-50+).
> To identify the root cause of the performance difference between 0.98 and 
> 1.3 we carried out a lot of experiments with profiling and git bisect 
> iterations; however, we were not able to identify any single source of scan 
> performance degradation, and it looks like this is an accumulated 
> degradation of 5-10% over various enhancements and refactorings.
> We identified a few major enhancements, like partialResult handling, 
> ScannerContext with heartbeat processing, time/size limiting, and RPC 
> refactoring, that could each have contributed a small degradation in 
> performance which, put together, could lead to a large overall degradation.
> One of the changes is 
> [HBASE-11544|https://jira.apache.org/jira/browse/HBASE-11544], which 
> implements partialResult handling. In ClientScanner.java the results 
> received from the server are cached on the client side by converting the 
> result array into an ArrayList. This function gets called in a loop, 
> depending on the number of rows in the scan result. For example, for tens 
> of millions of rows scanned, this can be called on the order of millions of 
> times.
> In almost all cases (99% of the time, except for handling partial results, 
> etc.) we are just taking resultsFromServer, converting it into an ArrayList 
> resultsToAddToCache in addResultsToList(..), and then iterating over the 
> list again and adding it to the cache in loadCache(..), as given in the 
> code path below:
> In ClientScanner → loadCache(..) → getResultsToAddToCache(..) → 
> addResultsToList(..) →
> {code:java}
> loadCache() {
> ...
>  List<Result> resultsToAddToCache =
>  getResultsToAddToCache(values, callable.isHeartbeatMessage());
> ...
> …
>    for (Result rs : resultsToAddToCache) {
>  rs = filterLoadedCell(rs);
>  cache.add(rs);
> ...
>    }
> }
> getResultsToAddToCache(..) {
> ..
>    final boolean isBatchSet = scan != null && scan.getBatch() > 0;
>    final boolean allowPartials = scan != null && 
> scan.getAllowPartialResults();
> ..
>    if (allowPartials || isBatchSet) {
>  addResultsToList(resultsToAddToCache, resultsFromServer, 0,
>    (null == resultsFromServer ? 0 : resultsFromServer.length));
>  return resultsToAddToCache;
>    }
> ...
> }
> private void addResultsToList(List<Result> outputList, Result[] inputArray, 
> int start, int end) {
>    if (inputArray == null || start < 0 || end > inputArray.length) return;
>    for (int i = start; i < end; i++) {
>  outputList.add(inputArray[i]);
>    }
>  }{code}
>  
> It looks like we can avoid the result-array-to-ArrayList conversion 
> (resultsFromServer --> resultsToAddToCache) for the first case, which is 
> also the most frequent case, and instead directly take the values array 
> returned by the callable and add it to the cache without converting it into 
> an ArrayList.
> I have taken both of these flags, allowPartials and isBatchSet, out into 
> loadCache(), and I directly add values to the scanner cache if the above 
> condition passes, instead of converting them into an ArrayList by calling 
> getResultsToAddToCache(). For example:
> {code:java}
> protected void loadCache() throws IOException {
> Result[] values = null;
> ..
> final boolean isBatchSet = scan != null && scan.getBatch() > 0;
> final boolean allowPartials = scan != null && scan.getAllowPartialResults();
> ..
> for (;;) {
> try {
> values = call(callable, caller, scannerTimeout);
> ..
> } catch (DoNotRetryIOException | NeedUnmanagedConnectionException e) {
> ..
> }
> if (allowPartials || isBatchSet) {  // DIRECTLY COPY values TO CACHE
> if (values != null) {
> for (int v = 0; v < values.length; v++) {
> Result rs = values[v];
> cache.add(rs);
> ...
> } else { // DO ALL THE REGULAR PARTIAL RESULT HANDLING ..
> List<Result> resultsToAddToCache =
> getResultsToAddToCache(values, callable.isHeartbeatMessage());

[jira] [Updated] (HBASE-20866) HBase 1.x scan performance degradation compared to 0.98 version

2018-07-12 Thread Vikas Vishwakarma (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikas Vishwakarma updated HBASE-20866:
--
Attachment: HBASE-20866.branch-1.3.003.patch

> HBase 1.x scan performance degradation compared to 0.98 version
> ---
>
> Key: HBASE-20866
> URL: https://issues.apache.org/jira/browse/HBASE-20866
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.3.2
>Reporter: Vikas Vishwakarma
>Assignee: Vikas Vishwakarma
>Priority: Critical
> Fix For: 1.5.0, 1.2.7, 1.3.3, 1.4.6
>
> Attachments: HBASE-20866.branch-1.3.001.patch, 
> HBASE-20866.branch-1.3.002.patch, HBASE-20866.branch-1.3.003.patch
>
>
> Internally, while testing 1.3 as part of our migration from 0.98 to 1.3, we 
> observed degradation in scan performance for Phoenix queries varying from a 
> few tens of percent up to 200%, depending on the query being executed. We 
> tried a simple native HBase scan and there too we saw up to 40% degradation 
> in performance when the number of column qualifiers is high (40-50+).
> To identify the root cause of the performance difference between 0.98 and 
> 1.3 we carried out a lot of experiments with profiling and git bisect 
> iterations; however, we were not able to identify any single source of scan 
> performance degradation, and it looks like this is an accumulated 
> degradation of 5-10% over various enhancements and refactorings.
> We identified a few major enhancements, like partialResult handling, 
> ScannerContext with heartbeat processing, time/size limiting, and RPC 
> refactoring, that could each have contributed a small degradation in 
> performance which, put together, could lead to a large overall degradation.
> One of the changes is 
> [HBASE-11544|https://jira.apache.org/jira/browse/HBASE-11544], which 
> implements partialResult handling. In ClientScanner.java the results 
> received from the server are cached on the client side by converting the 
> result array into an ArrayList. This function gets called in a loop, 
> depending on the number of rows in the scan result. For example, for tens 
> of millions of rows scanned, this can be called on the order of millions of 
> times.
> In almost all cases (99% of the time, except for handling partial results, 
> etc.) we are just taking resultsFromServer, converting it into an ArrayList 
> resultsToAddToCache in addResultsToList(..), and then iterating over the 
> list again and adding it to the cache in loadCache(..), as given in the 
> code path below:
> In ClientScanner → loadCache(..) → getResultsToAddToCache(..) → 
> addResultsToList(..) →
> {code:java}
> loadCache() {
> ...
>  List<Result> resultsToAddToCache =
>  getResultsToAddToCache(values, callable.isHeartbeatMessage());
> ...
> …
>    for (Result rs : resultsToAddToCache) {
>  rs = filterLoadedCell(rs);
>  cache.add(rs);
> ...
>    }
> }
> getResultsToAddToCache(..) {
> ..
>    final boolean isBatchSet = scan != null && scan.getBatch() > 0;
>    final boolean allowPartials = scan != null && 
> scan.getAllowPartialResults();
> ..
>    if (allowPartials || isBatchSet) {
>  addResultsToList(resultsToAddToCache, resultsFromServer, 0,
>    (null == resultsFromServer ? 0 : resultsFromServer.length));
>  return resultsToAddToCache;
>    }
> ...
> }
> private void addResultsToList(List<Result> outputList, Result[] inputArray, 
> int start, int end) {
>    if (inputArray == null || start < 0 || end > inputArray.length) return;
>    for (int i = start; i < end; i++) {
>  outputList.add(inputArray[i]);
>    }
>  }{code}
>  
> It looks like we can avoid the result-array-to-ArrayList conversion 
> (resultsFromServer --> resultsToAddToCache) for the first case, which is 
> also the most frequent case, and instead directly take the values array 
> returned by the callable and add it to the cache without converting it into 
> an ArrayList.
> I have taken both of these flags, allowPartials and isBatchSet, out into 
> loadCache(), and I directly add values to the scanner cache if the above 
> condition passes, instead of converting them into an ArrayList by calling 
> getResultsToAddToCache(). For example:
> {code:java}
> protected void loadCache() throws IOException {
> Result[] values = null;
> ..
> final boolean isBatchSet = scan != null && scan.getBatch() > 0;
> final boolean allowPartials = scan != null && scan.getAllowPartialResults();
> ..
> for (;;) {
> try {
> values = call(callable, caller, scannerTimeout);
> ..
> } catch (DoNotRetryIOException | NeedUnmanagedConnectionException e) {
> ..
> }
> if (allowPartials || isBatchSet) {  // DIRECTLY COPY values TO CACHE
> if (values != null) {
> for (int v = 0; v < values.length; v++) {
> Result rs = values[v];
> cache.add(rs);
> ...
> } else { // DO ALL THE REGULAR PARTIAL RESULT HANDLING ..
> List<Result> resultsToAddToCache =
> getResultsToAddToCache(values, callable.isHeartbeatMessage());

[jira] [Updated] (HBASE-20878) Data loss if merging regions while ServerCrashProcedure executing

2018-07-12 Thread Allan Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allan Yang updated HBASE-20878:
---
Issue Type: Sub-task  (was: Bug)
Parent: HBASE-20828

> Data loss if merging regions while ServerCrashProcedure executing
> -
>
> Key: HBASE-20878
> URL: https://issues.apache.org/jira/browse/HBASE-20878
> Project: HBase
>  Issue Type: Sub-task
>  Components: amv2
>Affects Versions: 3.0.0, 2.1.0, 2.0.1
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Critical
> Fix For: 3.0.0, 2.0.2, 2.1.1
>
> Attachments: HBASE-20878.branch-2.0.001.patch, 
> HBASE-20878.branch-2.0.002.patch
>
>
> In MergeTableRegionsProcedure, we close the regions to merge using 
> UnassignProcedure. But if the RS these regions are on has crashed, a 
> ServerCrashProcedure will execute at the same time. UnassignProcedures will 
> be blocked until all logs are split. But since these regions are closed for 
> merging, the regions won't be opened again, so the recovered.edits in the 
> region dirs won't be replayed and data will be lost.
> I provided a test to reproduce this case. I suspect the split region 
> procedure also has this kind of problem. I will check later.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HBASE-20881) Introduce a region transition procedure to handle all the state transition for a region

2018-07-12 Thread Duo Zhang (JIRA)
Duo Zhang created HBASE-20881:
-

 Summary: Introduce a region transition procedure to handle all the 
state transition for a region
 Key: HBASE-20881
 URL: https://issues.apache.org/jira/browse/HBASE-20881
 Project: HBase
  Issue Type: Sub-task
Reporter: Duo Zhang


Now we have an AssignProcedure, an UnassignProcedure, and also a 
MoveRegionProcedure which schedules an AssignProcedure and an 
UnassignProcedure to move a region. This makes the logic a bit complicated, 
as MRP is not a RIT, so SCP can not interrupt it directly...



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

