[jira] [Created] (HBASE-19893) restore_snapshot is broken in master branch

2018-01-29 Thread Toshihiro Suzuki (JIRA)
Toshihiro Suzuki created HBASE-19893:


 Summary: restore_snapshot is broken in master branch
 Key: HBASE-19893
 URL: https://issues.apache.org/jira/browse/HBASE-19893
 Project: HBase
  Issue Type: Bug
Reporter: Toshihiro Suzuki


When I was investigating HBASE-19850, I found that restore_snapshot didn't 
work in the master branch.

 

Steps to reproduce are as follows:

1. Create a table

{code}

create "test", "cf"

{code}

2. Load data (2000 rows) into the table

{code}

(0...2000).each{|i| put "test", "row#{i}", "cf:col", "val"}

{code}

3. Split the table

{code}

split "test"

{code}

4. Take a snapshot

{code}

snapshot "test", "snap"

{code}

5. Load more data (2000 rows) into the table and split the table again

{code}

(2000...4000).each{|i| put "test", "row#{i}", "cf:col", "val"}

split "test"

{code}

6. Restore the table from the snapshot 

{code}

disable "test"
restore_snapshot "snap"
enable "test"

{code}

7. Scan the table

{code}

scan "test"

{code}

However, this scan returns only 244 rows (it should return 2000 rows), as 
shown below:

{code}

hbase(main):038:0> scan "test"
ROW COLUMN+CELL
 row78 column=cf:col, timestamp=1517298307049, value=val



row999 column=cf:col, timestamp=1517298307608, value=val
244 row(s)
Took 0.1500 seconds

{code}

 

Also, the restored table should have 2 online regions, but it has 3.
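For reference, the same restore-and-check sequence can be driven through the 
Java client API. Below is a minimal sketch assuming a 2.0-era client; the 
class name and counting loop are illustrative, not part of the original report:

{code:java}
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.*;

public class RestoreSnapshotCheck {
  public static void main(String[] args) throws Exception {
    TableName table = TableName.valueOf("test");
    try (Connection conn = ConnectionFactory.createConnection(HBaseConfiguration.create());
         Admin admin = conn.getAdmin()) {
      // Same sequence as the shell repro: disable, restore, enable.
      admin.disableTable(table);
      admin.restoreSnapshot("snap");
      admin.enableTable(table);
      // The snapshot held 2000 rows in 2 regions; anything else indicates the bug.
      long rows = 0;
      try (Table t = conn.getTable(table);
           ResultScanner scanner = t.getScanner(new Scan())) {
        for (Result r : scanner) {
          rows++;
        }
      }
      System.out.println("rows=" + rows + ", regions=" + admin.getRegions(table).size());
    }
  }
}
{code}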

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-19892) Checking 'patch attach' and yetus 0.7.0 and move to Yetus 0.7.0

2018-01-29 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344595#comment-16344595
 ] 

Hadoop QA commented on HBASE-19892:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
12s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
1s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
26s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
17s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  9m 
24s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
17s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
48s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
19m 48s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.6.5 2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
17s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
21s{color} | {color:green} hbase-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
 9s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 37m 21s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:eee3b01 |
| JIRA Issue | HBASE-19892 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12908272/HBASE-19892.master.002.patch
 |
| Optional Tests |  asflicense  javac  javadoc  unit  shadedjars  hadoopcheck  
xml  compile  |
| uname | Linux 254110142b1a 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 
14:43:09 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | master / 9b8d7e0aef |
| maven | version: Apache Maven 3.5.2 
(138edd61fd100ec658bfa2d307c43b76940a5d7d; 2017-10-18T07:58:13Z) |
| Default Java | 1.8.0_151 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HBASE-Build/11243/testReport/ |
| Max. process+thread count | 348 (vs. ulimit of 1000) |
| modules | C: hbase-common U: hbase-common |
| Console output | 
https://builds.apache.org/job/PreCommit-HBASE-Build/11243/console |
| Powered by | Apache Yetus 0.7.0   http://yetus.apache.org |


This message was automatically generated.



> Checking 'patch attach' and yetus 0.7.0 and move to Yetus 0.7.0
> ---
>
> Key: HBASE-19892
> URL: https://issues.apache.org/jira/browse/HBASE-19892
> Project: HBase
>  Issue Type: Bug
>Reporter: stack
>Assignee: stack
>Priority: Major
> Fix For: 1.5.0, 2.0.0-beta-2, 1.4.2
>
> Attachments: HBASE-19892.master

[jira] [Commented] (HBASE-19892) Checking 'patch attach' and yetus 0.7.0 and move to Yetus 0.7.0

2018-01-29 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344597#comment-16344597
 ] 

Hadoop QA commented on HBASE-19892:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
10s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
54s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
17s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  9m 
52s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
17s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  4m 
28s{color} | {color:red} root in the patch failed. {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
2s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  6m 
 9s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
23m 28s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.6.5 2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
17s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
18s{color} | {color:green} hbase-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
 8s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 41m 34s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:eee3b01 |
| JIRA Issue | HBASE-19892 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12908271/HBASE-19892.master.001.patch
 |
| Optional Tests |  asflicense  javac  javadoc  unit  shadedjars  hadoopcheck  
xml  compile  |
| uname | Linux 3d63e4b9f0be 3.13.0-133-generic #182-Ubuntu SMP Tue Sep 19 
15:49:21 UTC 2017 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | master / 9b8d7e0aef |
| maven | version: Apache Maven 3.5.2 
(138edd61fd100ec658bfa2d307c43b76940a5d7d; 2017-10-18T07:58:13Z) |
| Default Java | 1.8.0_151 |
| mvninstall | 
https://builds.apache.org/job/PreCommit-HBASE-Build/11242/artifact/patchprocess/patch-mvninstall-root.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HBASE-Build/11242/testReport/ |
| Max. process+thread count | 262 (vs. ulimit of 1000) |
| modules | C: hbase-common U: hbase-common |
| Console output | 
https://builds.apache.org/job/PreCommit-HBASE-Build/11242/console |
| Powered by | Apache Yetus 0.7.0   http://yetus.apache.org |


This message was automatically generated.



> Checking 'patch attach' and yetus 0.7.0 and move to Yetus 0.7.0
> ---
>
> Key: HBASE-19892
> URL: https://issues.apache.org/jira/browse/HBASE-19892
> Project: HBase
>  Issue Type: Bug
>Reporter: stack
>Assignee: stack

[jira] [Comment Edited] (HBASE-19887) Do not overwrite the surefire junit listener property in the pom of sub modules

2018-01-29 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344590#comment-16344590
 ] 

stack edited comment on HBASE-19887 at 1/30/18 7:01 AM:


Patch looks beautiful. +1. It's the way it should have been.


was (Author: stack):
Patch looks beautiful. +1.

> Do not overwrite the surefire junit listener property in the pom of sub 
> modules
> ---
>
> Key: HBASE-19887
> URL: https://issues.apache.org/jira/browse/HBASE-19887
> Project: HBase
>  Issue Type: Sub-task
>  Components: build
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
> Attachments: HBASE-19887.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-19887) Do not overwrite the surefire junit listener property in the pom of sub modules

2018-01-29 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344590#comment-16344590
 ] 

stack commented on HBASE-19887:
---

Patch looks beautiful. +1.

> Do not overwrite the surefire junit listener property in the pom of sub 
> modules
> ---
>
> Key: HBASE-19887
> URL: https://issues.apache.org/jira/browse/HBASE-19887
> Project: HBase
>  Issue Type: Sub-task
>  Components: build
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
> Attachments: HBASE-19887.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-19887) Do not overwrite the surefire junit listener property in the pom of sub modules

2018-01-29 Thread Duo Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344588#comment-16344588
 ] 

Duo Zhang commented on HBASE-19887:
---

The ConnectionCountResourceAnalyzer has been removed since HBASE-13252.

So I think for now we can just remove it...

Mind taking a look at the patch sir? [~stack]

Thanks.

> Do not overwrite the surefire junit listener property in the pom of sub 
> modules
> ---
>
> Key: HBASE-19887
> URL: https://issues.apache.org/jira/browse/HBASE-19887
> Project: HBase
>  Issue Type: Sub-task
>  Components: build
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
> Attachments: HBASE-19887.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (HBASE-17852) Add Fault tolerance to HBASE-14417 (Support bulk loaded files in incremental backup)

2018-01-29 Thread Appy (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344578#comment-16344578
 ] 

Appy edited comment on HBASE-17852 at 1/30/18 6:53 AM:
---

bq. My intent was not to squash your opinions, but to avoid being blocked if 
you were not interested/busy as seemed might be the case.
That's reasonable. Sorry for the delay on my part.
Leaving a comment saying the same, and that post-hoc reviews would be welcome, 
would have avoided the whole situation.
The only thing that put me off was finding it out myself, followed by a certain 
tone in certain comments.

[~elserj] i believe you. Everyone makes mistakes from time-to-time, i'm certain 
i must have done too. Always happy with "acknowledge, learn, and move past 
them" way. All's good (between us two).



was (Author: appy):
bq. My intent was not to squash your opinions, but to avoid being blocked if 
you were not interested/busy as seemed might be the case.
That's reasonable. Sorry for the delay on my part. 

[~elserj] i believe you. Everyone makes mistakes from time-to-time, i'm certain 
i must have done too. Always happy with "acknowledge, learn, and move past 
them" way. All's good (between us two).


> Add Fault tolerance to HBASE-14417 (Support bulk loaded files in incremental 
> backup)
> 
>
> Key: HBASE-17852
> URL: https://issues.apache.org/jira/browse/HBASE-17852
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HBASE-17852-v10.patch, screenshot-1.png
>
>
> Design approach: rollback-via-snapshot, implemented in this ticket:
> # Before a backup create/delete/merge starts, we take a snapshot of the 
> backup meta-table (backup system table). This procedure is lightweight 
> because the meta table is small and usually fits in a single region.
> # When an operation fails on the server side, we handle the failure by 
> cleaning up partial data in the backup destination, followed by restoring 
> the backup meta-table from the snapshot.
> # When an operation fails on the client side (abnormal termination, for 
> example), the next time the user tries a create/merge/delete they will see 
> an error message saying the system is in an inconsistent state and repair 
> is required; they will need to run the backup repair tool.
> # To avoid multiple writers to the backup system table (the backup client 
> and BackupObservers), we introduce a small table ONLY to keep the listing 
> of bulk-loaded files. All backup observers work only with this new table. 
> The reason: in case of a failure during backup create/delete/merge/restore, 
> when the system performs an automatic rollback, some data written by backup 
> observers during the failed operation could be lost. This is what we try to 
> avoid.
> # The second table keeps only bulk-load-related references. We do not care 
> about the consistency of this table, because bulk load is an idempotent 
> operation and can be repeated after a failure. Partially written data in 
> the second table does not affect the BackupHFileCleaner plugin, because 
> this data (the list of bulk-loaded files) corresponds to files which have 
> not yet been loaded successfully and, hence, are not visible to the system.
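As a rough illustration of the flow above: a hedged sketch against the public 
snapshot Admin API. The snapshot name, meta-table handle, and BackupOp hook 
are placeholders, not the actual backup implementation:

{code:java}
import java.io.IOException;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;

public class RollbackViaSnapshot {
  // Placeholder for a backup create/delete/merge operation.
  interface BackupOp { void run() throws IOException; }

  static void runWithRollback(Connection conn, TableName backupMeta, BackupOp op)
      throws IOException {
    String rollbackSnapshot = "backup_meta_rollback";
    try (Admin admin = conn.getAdmin()) {
      // 1. Snapshot the backup meta table first; cheap because the table is small.
      admin.snapshot(rollbackSnapshot, backupMeta);
      try {
        op.run();
      } catch (IOException serverSideFailure) {
        // 2. On a server-side failure, restore the meta table from the snapshot.
        admin.disableTable(backupMeta);
        admin.restoreSnapshot(rollbackSnapshot);
        admin.enableTable(backupMeta);
        throw serverSideFailure;
      } finally {
        admin.deleteSnapshot(rollbackSnapshot);
      }
    }
  }
}
{code}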



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-19892) Checking 'patch attach' and yetus 0.7.0 and move to Yetus 0.7.0

2018-01-29 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344585#comment-16344585
 ] 

Hudson commented on HBASE-19892:


SUCCESS: Integrated in Jenkins build HBase-1.3-IT #345 (See 
[https://builds.apache.org/job/HBase-1.3-IT/345/])
HBASE-19892 Checking patch attach and yetus 0.7.0 and move to Yetus (stack: rev 
85b4615a1f5c88e3442ee47644a7a163cbbd3dfa)
* (edit) dev-support/Jenkinsfile


> Checking 'patch attach' and yetus 0.7.0 and move to Yetus 0.7.0
> ---
>
> Key: HBASE-19892
> URL: https://issues.apache.org/jira/browse/HBASE-19892
> Project: HBase
>  Issue Type: Bug
>Reporter: stack
>Assignee: stack
>Priority: Major
> Fix For: 1.5.0, 2.0.0-beta-2, 1.4.2
>
> Attachments: HBASE-19892.master.001.patch, 
> HBASE-19892.master.002.patch, HBASE-19892.master.003.patch
>
>
> _Yetus-0.7.0 has a fix for the changed Jira behavior that made it so we 
> weren't picking up the latest attached patch. Check it works and if it does 
> move over to yetus 0.7.0_



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-19887) Do not overwrite the surefire junit listener property in the pom of sub modules

2018-01-29 Thread Duo Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344584#comment-16344584
 ] 

Duo Zhang commented on HBASE-19887:
---

Thanks sir. Let me check the code on branch-1.3.

> Do not overwrite the surefire junit listener property in the pom of sub 
> modules
> ---
>
> Key: HBASE-19887
> URL: https://issues.apache.org/jira/browse/HBASE-19887
> Project: HBase
>  Issue Type: Sub-task
>  Components: build
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
> Attachments: HBASE-19887.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-19887) Do not overwrite the surefire junit listener property in the pom of sub modules

2018-01-29 Thread Duo Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344582#comment-16344582
 ] 

Duo Zhang commented on HBASE-19887:
---

Remove ServerResourceCheckerJUnitListener since it is useless for now. I think 
we can add more checks to ResourceCheckerJUnitListener directly.

Remove the surefire plugin definition in sub modules where it is not necessary, 
so that they do not overwrite the default listener config.

> Do not overwrite the surefire junit listener property in the pom of sub 
> modules
> ---
>
> Key: HBASE-19887
> URL: https://issues.apache.org/jira/browse/HBASE-19887
> Project: HBase
>  Issue Type: Sub-task
>  Components: build
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
> Attachments: HBASE-19887.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-19887) Do not overwrite the surefire junit listener property in the pom of sub modules

2018-01-29 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344580#comment-16344580
 ] 

stack commented on HBASE-19887:
---

Yes go for it.

These resource checker / server resource checker listeners are useful utilities added long ago.

In branch-1.3 at least there is a difference.
{code:java}
static class ConnectionCountResourceAnalyzer extends ResourceChecker.ResourceAnalyzer {
  @Override
  public int getVal(Phase phase) {
    return HConnectionTestingUtility.getConnectionCount();
  }
}

@Override
protected void addResourceAnalyzer(ResourceChecker rc) {
  rc.addResourceAnalyzer(new ConnectionCountResourceAnalyzer());
}
{code}
Does our new ClassRule do what these listeners used to do? It doesn't seem to. 
Will I add it into 
./hbase-common/src/test/java/org/apache/hadoop/hbase/HBaseClassTestRuleChecker.java?

Thanks [~Apache9]
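
For reference, the ClassRule mentioned here is wired into tests following this 
convention; a sketch only, with a placeholder test class rather than a real 
HBase test:

{code:java}
import org.apache.hadoop.hbase.HBaseClassTestRule;
import org.apache.hadoop.hbase.testclassification.SmallTests;
import org.junit.ClassRule;
import org.junit.Test;
import org.junit.experimental.categories.Category;

@Category(SmallTests.class)
public class TestSomething {
  // Every HBase test class of this era declares the rule like this; it enforces
  // per-class conventions (e.g. timeouts) at run time.
  @ClassRule
  public static final HBaseClassTestRule CLASS_RULE =
      HBaseClassTestRule.forClass(TestSomething.class);

  @Test
  public void testNothing() {
    // placeholder test body
  }
}
{code}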

> Do not overwrite the surefire junit listener property in the pom of sub 
> modules
> ---
>
> Key: HBASE-19887
> URL: https://issues.apache.org/jira/browse/HBASE-19887
> Project: HBase
>  Issue Type: Sub-task
>  Components: build
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
> Attachments: HBASE-19887.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-19887) Do not overwrite the surefire junit listener property in the pom of sub modules

2018-01-29 Thread Duo Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang updated HBASE-19887:
--
Attachment: (was: HBASE-19887.patch)

> Do not overwrite the surefire junit listener property in the pom of sub 
> modules
> ---
>
> Key: HBASE-19887
> URL: https://issues.apache.org/jira/browse/HBASE-19887
> Project: HBase
>  Issue Type: Sub-task
>  Components: build
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
> Attachments: HBASE-19887.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-19892) Checking 'patch attach' and yetus 0.7.0 and move to Yetus 0.7.0

2018-01-29 Thread Chia-Ping Tsai (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344579#comment-16344579
 ] 

Chia-Ping Tsai commented on HBASE-19892:


Yay! We get rid of the JIRA noise about the deletion of patches.

> Checking 'patch attach' and yetus 0.7.0 and move to Yetus 0.7.0
> ---
>
> Key: HBASE-19892
> URL: https://issues.apache.org/jira/browse/HBASE-19892
> Project: HBase
>  Issue Type: Bug
>Reporter: stack
>Assignee: stack
>Priority: Major
> Fix For: 1.5.0, 2.0.0-beta-2, 1.4.2
>
> Attachments: HBASE-19892.master.001.patch, 
> HBASE-19892.master.002.patch, HBASE-19892.master.003.patch
>
>
> _Yetus-0.7.0 has a fix for the changed Jira behavior that made it so we 
> weren't picking up the latest attached patch. Check it works and if it does 
> move over to yetus 0.7.0_



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-19887) Do not overwrite the surefire junit listener property in the pom of sub modules

2018-01-29 Thread Duo Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang updated HBASE-19887:
--
Attachment: HBASE-19887.patch

> Do not overwrite the surefire junit listener property in the pom of sub 
> modules
> ---
>
> Key: HBASE-19887
> URL: https://issues.apache.org/jira/browse/HBASE-19887
> Project: HBase
>  Issue Type: Sub-task
>  Components: build
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
> Attachments: HBASE-19887.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-19887) Do not overwrite the surefire junit listener property in the pom of sub modules

2018-01-29 Thread Duo Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang updated HBASE-19887:
--
Summary: Do not overwrite the surefire junit listener property in the pom 
of sub modules  (was: Find out why the HBaseClassTestRuleCheck does not work in 
pre commit)

> Do not overwrite the surefire junit listener property in the pom of sub 
> modules
> ---
>
> Key: HBASE-19887
> URL: https://issues.apache.org/jira/browse/HBASE-19887
> Project: HBase
>  Issue Type: Sub-task
>  Components: build
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
> Attachments: HBASE-19887.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-17852) Add Fault tolerance to HBASE-14417 (Support bulk loaded files in incremental backup)

2018-01-29 Thread Appy (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344578#comment-16344578
 ] 

Appy commented on HBASE-17852:
--

bq. My intent was not to squash your opinions, but to avoid being blocked if 
you were not interested/busy as seemed might be the case.
That's reasonable. Sorry for the delay on my part. 

[~elserj] i believe you. Everyone makes mistakes from time-to-time, i'm certain 
i must have done too. Always happy with "acknowledge, learn, and move past 
them" way. All's good (between us two).


> Add Fault tolerance to HBASE-14417 (Support bulk loaded files in incremental 
> backup)
> 
>
> Key: HBASE-17852
> URL: https://issues.apache.org/jira/browse/HBASE-17852
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HBASE-17852-v10.patch, screenshot-1.png
>
>
> Design approach: rollback-via-snapshot, implemented in this ticket:
> # Before a backup create/delete/merge starts, we take a snapshot of the 
> backup meta-table (backup system table). This procedure is lightweight 
> because the meta table is small and usually fits in a single region.
> # When an operation fails on the server side, we handle the failure by 
> cleaning up partial data in the backup destination, followed by restoring 
> the backup meta-table from the snapshot.
> # When an operation fails on the client side (abnormal termination, for 
> example), the next time the user tries a create/merge/delete they will see 
> an error message saying the system is in an inconsistent state and repair 
> is required; they will need to run the backup repair tool.
> # To avoid multiple writers to the backup system table (the backup client 
> and BackupObservers), we introduce a small table ONLY to keep the listing 
> of bulk-loaded files. All backup observers work only with this new table. 
> The reason: in case of a failure during backup create/delete/merge/restore, 
> when the system performs an automatic rollback, some data written by backup 
> observers during the failed operation could be lost. This is what we try to 
> avoid.
> # The second table keeps only bulk-load-related references. We do not care 
> about the consistency of this table, because bulk load is an idempotent 
> operation and can be repeated after a failure. Partially written data in 
> the second table does not affect the BackupHFileCleaner plugin, because 
> this data (the list of bulk-loaded files) corresponds to files which have 
> not yet been loaded successfully and, hence, are not visible to the system.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-19892) Checking 'patch attach' and yetus 0.7.0 and move to Yetus 0.7.0

2018-01-29 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-19892:
--
Release Note: Moved our internal yetus reference from 0.6.0 to 0.7.0. 
Concurrently, I changed hadoopqa to run with 0.7.0 (by editing the config in 
jenkins).  (was: Moved our internal yetus reference from 0.6.0 to 0.7.0.)

> Checking 'patch attach' and yetus 0.7.0 and move to Yetus 0.7.0
> ---
>
> Key: HBASE-19892
> URL: https://issues.apache.org/jira/browse/HBASE-19892
> Project: HBase
>  Issue Type: Bug
>Reporter: stack
>Assignee: stack
>Priority: Major
> Fix For: 1.5.0, 2.0.0-beta-2, 1.4.2
>
> Attachments: HBASE-19892.master.001.patch, 
> HBASE-19892.master.002.patch, HBASE-19892.master.003.patch
>
>
> _Yetus-0.7.0 has a fix for the changed Jira behavior that made it so we 
> weren't picking up the latest attached patch. Check it works and if it does 
> move over to yetus 0.7.0_



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-19892) Checking 'patch attach' and yetus 0.7.0 and move to Yetus 0.7.0

2018-01-29 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-19892:
--
Summary: Checking 'patch attach' and yetus 0.7.0 and move to Yetus 0.7.0  
(was: Checking patch attach and yetus 0.7.0 and move to Yetus 0.7.0)

> Checking 'patch attach' and yetus 0.7.0 and move to Yetus 0.7.0
> ---
>
> Key: HBASE-19892
> URL: https://issues.apache.org/jira/browse/HBASE-19892
> Project: HBase
>  Issue Type: Bug
>Reporter: stack
>Assignee: stack
>Priority: Major
> Fix For: 1.5.0, 2.0.0-beta-2, 1.4.2
>
> Attachments: HBASE-19892.master.001.patch, 
> HBASE-19892.master.002.patch, HBASE-19892.master.003.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-19892) Checking 'patch attach' and yetus 0.7.0 and move to Yetus 0.7.0

2018-01-29 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-19892:
--
Description: _Yetus-0.7.0 has a fix for the changed Jira behavior that made 
it so we weren't picking up the latest attached patch. Check it works and if it 
does move over to yetus 0.7.0_

> Checking 'patch attach' and yetus 0.7.0 and move to Yetus 0.7.0
> ---
>
> Key: HBASE-19892
> URL: https://issues.apache.org/jira/browse/HBASE-19892
> Project: HBase
>  Issue Type: Bug
>Reporter: stack
>Assignee: stack
>Priority: Major
> Fix For: 1.5.0, 2.0.0-beta-2, 1.4.2
>
> Attachments: HBASE-19892.master.001.patch, 
> HBASE-19892.master.002.patch, HBASE-19892.master.003.patch
>
>
> _Yetus-0.7.0 has a fix for the changed Jira behavior that made it so we 
> weren't picking up the latest attached patch. Check it works and if it does 
> move over to yetus 0.7.0_



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-19892) Checking patch attach and yetus 0.7.0 and move to Yetus 0.7.0

2018-01-29 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-19892:
--
       Resolution: Fixed
         Assignee: stack
    Fix Version/s: 1.4.2
                   2.0.0-beta-2
                   1.5.0
     Release Note: Moved our internal yetus reference from 0.6.0 to 0.7.0.
           Status: Resolved  (was: Patch Available)

Did the following on master, branch-2, branch-1, branch-1.4, and branch-1.3:

 

commit 85b4615a1f5c88e3442ee47644a7a163cbbd3dfa
Author: Michael Stack 
Date: Mon Jan 29 22:34:31 2018 -0800

HBASE-19892 Checking patch attach and yetus 0.7.0 and move to Yetus 0.7.0

One-liner that ups our yetus version from 0.6.0 to 0.7.0.

diff --git a/dev-support/Jenkinsfile b/dev-support/Jenkinsfile
index b0d6724bf0..d6da2b8637 100644
--- a/dev-support/Jenkinsfile
+++ b/dev-support/Jenkinsfile
@@ -33,7 +33,7 @@ pipeline {
 TOOLS = "${env.WORKSPACE}/tools"
 // where we check out to across stages
 BASEDIR = "${env.WORKSPACE}/component"
- YETUS_RELEASE = '0.6.0'
+ YETUS_RELEASE = '0.7.0'
 PROJECT = 'hbase'
 PROJECT_PERSONALITY = 
'https://raw.githubusercontent.com/apache/hbase/master/dev-support/hbase-personality.sh'
 // This section of the docs tells folks not to use the javadoc tag. older 
branches have our old version of the check for said tag.

> Checking patch attach and yetus 0.7.0 and move to Yetus 0.7.0
> -
>
> Key: HBASE-19892
> URL: https://issues.apache.org/jira/browse/HBASE-19892
> Project: HBase
>  Issue Type: Bug
>Reporter: stack
>Assignee: stack
>Priority: Major
> Fix For: 1.5.0, 2.0.0-beta-2, 1.4.2
>
> Attachments: HBASE-19892.master.001.patch, 
> HBASE-19892.master.002.patch, HBASE-19892.master.003.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-19680) BufferedMutatorImpl#mutate should wait the result from AP in order to throw the failed mutations

2018-01-29 Thread Chia-Ping Tsai (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344571#comment-16344571
 ] 

Chia-Ping Tsai commented on HBASE-19680:


{quote}+1 on commit then.
{quote}
Thanks for the reviews. Let me do the performance testing before committing, 
just in case a concurrency issue exists.

> BufferedMutatorImpl#mutate should wait the result from AP in order to throw 
> the failed mutations
> 
>
> Key: HBASE-19680
> URL: https://issues.apache.org/jira/browse/HBASE-19680
> Project: HBase
>  Issue Type: Improvement
>Reporter: Chia-Ping Tsai
>Assignee: Chia-Ping Tsai
>Priority: Major
> Fix For: 2.0.0-beta-2
>
> Attachments: HBASE-19680.v0.patch, HBASE-19680.v1.patch
>
>
> Currently, BMI#mutate doesn't wait for the result from the AP, so the errors 
> are stored in the AP. The only way to return the errors to the user is to 
> call flush and catch the exception. That is non-intuitive.
> I feel BMI#mutate should wait for the result. That is to say, the user can 
> parse the exception thrown by BM#mutate to get the failed mutations. Also, 
> we can remove the global error from the AP.
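A minimal sketch of the current behavior described above, assuming the 
2.0-era client API; the table, family, and class names are illustrative:

{code:java}
import java.io.IOException;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.BufferedMutator;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException;
import org.apache.hadoop.hbase.util.Bytes;

public class BufferedMutatorFlushDemo {
  static void demo(Connection conn) throws IOException {
    try (BufferedMutator mutator = conn.getBufferedMutator(TableName.valueOf("test"))) {
      Put put = new Put(Bytes.toBytes("row1"));
      put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("col"), Bytes.toBytes("val"));
      mutator.mutate(put); // buffered; any error stays inside the AsyncProcess
      try {
        mutator.flush(); // only here do the failed mutations surface
      } catch (RetriesExhaustedWithDetailsException e) {
        for (int i = 0; i < e.getNumExceptions(); i++) {
          System.err.println("failed row " + Bytes.toString(e.getRow(i).getRow())
              + ": " + e.getCause(i));
        }
      }
    }
  }
}
{code}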



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-19892) Checking patch attach and yetus 0.7.0 and move to Yetus 0.7.0

2018-01-29 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-19892:
--
Summary: Checking patch attach and yetus 0.7.0 and move to Yetus 0.7.0  
(was: Checking patch attach and yetus 0.7.0; ignore!)

> Checking patch attach and yetus 0.7.0 and move to Yetus 0.7.0
> -
>
> Key: HBASE-19892
> URL: https://issues.apache.org/jira/browse/HBASE-19892
> Project: HBase
>  Issue Type: Bug
>Reporter: stack
>Priority: Major
> Attachments: HBASE-19892.master.001.patch, 
> HBASE-19892.master.002.patch, HBASE-19892.master.003.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-19892) Checking patch attach and yetus 0.7.0 and move to Yetus 0.7.0

2018-01-29 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344568#comment-16344568
 ] 

stack commented on HBASE-19892:
---

Seems to be doing the right thing. Let me change our yetus version in the 
Jenkinsfile to 0.7.0 as part of this issue.

> Checking patch attach and yetus 0.7.0 and move to Yetus 0.7.0
> -
>
> Key: HBASE-19892
> URL: https://issues.apache.org/jira/browse/HBASE-19892
> Project: HBase
>  Issue Type: Bug
>Reporter: stack
>Priority: Major
> Attachments: HBASE-19892.master.001.patch, 
> HBASE-19892.master.002.patch, HBASE-19892.master.003.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-19892) Checking patch attach and yetus 0.7.0; ignore!

2018-01-29 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-19892:
--
Attachment: HBASE-19892.master.003.patch

> Checking patch attach and yetus 0.7.0; ignore!
> --
>
> Key: HBASE-19892
> URL: https://issues.apache.org/jira/browse/HBASE-19892
> Project: HBase
>  Issue Type: Bug
>Reporter: stack
>Priority: Major
> Attachments: HBASE-19892.master.001.patch, 
> HBASE-19892.master.002.patch, HBASE-19892.master.003.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-19892) Checking patch attach and yetus 0.7.0; ignore!

2018-01-29 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-19892:
--
Attachment: HBASE-19892.master.002.patch

> Checking patch attach and yetus 0.7.0; ignore!
> --
>
> Key: HBASE-19892
> URL: https://issues.apache.org/jira/browse/HBASE-19892
> Project: HBase
>  Issue Type: Bug
>Reporter: stack
>Priority: Major
> Attachments: HBASE-19892.master.001.patch, 
> HBASE-19892.master.002.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-19892) Checking patch attach and yetus 0.7.0; ignore!

2018-01-29 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-19892:
--
Attachment: HBASE-19892.master.001.patch

> Checking patch attach and yetus 0.7.0; ignore!
> --
>
> Key: HBASE-19892
> URL: https://issues.apache.org/jira/browse/HBASE-19892
> Project: HBase
>  Issue Type: Bug
>Reporter: stack
>Priority: Major
> Attachments: HBASE-19892.master.001.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-19892) Checking patch attach and yetus 0.7.0; ignore!

2018-01-29 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-19892:
--
Status: Patch Available  (was: Open)

.001 non-change in the hbase-common pom.xml

> Checking patch attach and yetus 0.7.0; ignore!
> --
>
> Key: HBASE-19892
> URL: https://issues.apache.org/jira/browse/HBASE-19892
> Project: HBase
>  Issue Type: Bug
>Reporter: stack
>Priority: Major
> Attachments: HBASE-19892.master.001.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HBASE-19892) Checking patch attach and yetus 0.7.0; ignore!

2018-01-29 Thread stack (JIRA)
stack created HBASE-19892:
-

 Summary: Checking patch attach and yetus 0.7.0; ignore!
 Key: HBASE-19892
 URL: https://issues.apache.org/jira/browse/HBASE-19892
 Project: HBase
  Issue Type: Bug
Reporter: stack






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (HBASE-19891) Up nightly test run timeout from 6 hours to 8

2018-01-29 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack resolved HBASE-19891.
---
       Resolution: Fixed
         Assignee: stack
    Fix Version/s: 2.0.0-beta-2

.001 is what I pushed to master and branch-2.

> Up nightly test run timeout from 6 hours to 8
> -
>
> Key: HBASE-19891
> URL: https://issues.apache.org/jira/browse/HBASE-19891
> Project: HBase
>  Issue Type: Sub-task
>Reporter: stack
>Assignee: stack
>Priority: Major
> Fix For: 2.0.0-beta-2
>
> Attachments: HBASE-19891.master.001.patch
>
>
> Yesterday, a nightly run for hbase2 passed all unit tests against hadoop2. 
> Hadoop3 tests got cut off at the 6 hour mark, our maximum total run time. 
> This is crazy but for now, just up the max time from 6 to 8 hours to see if 
> we can get a good build in. Can work on breaking this down in subsequent 
> issues. To be clear, the nightly 2.0 runs full test suite against hadoop2 and 
> then hadoop3... this is why it takes a while.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-19891) Up nightly test run timeout from 6 hours to 8

2018-01-29 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-19891:
--
Attachment: HBASE-19891.master.001.patch

> Up nightly test run timeout from 6 hours to 8
> -
>
> Key: HBASE-19891
> URL: https://issues.apache.org/jira/browse/HBASE-19891
> Project: HBase
>  Issue Type: Sub-task
>Reporter: stack
>Priority: Major
> Attachments: HBASE-19891.master.001.patch
>
>
> Yesterday, a nightly run for hbase2 passed all unit tests against hadoop2. 
> Hadoop3 tests got cut off at the 6 hour mark, our maximum total run time. 
> This is crazy but for now, just up the max time from 6 to 8 hours to see if 
> we can get a good build in. Can work on breaking this down in subsequent 
> issues. To be clear, the nightly 2.0 runs full test suite against hadoop2 and 
> then hadoop3... this is why it takes a while.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-19891) Up nightly test run timeout from 6 hours to 8

2018-01-29 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-19891:
--
Description: Yesterday, a nightly run for hbase2 passed all unit tests 
against hadoop2. Hadoop3 tests got cut off at the 6 hour mark, our maximum 
total run time. This is crazy but for now, just up the max time from 6 to 8 
hours to see if we can get a good build in. Can work on breaking this down in 
subsequent issues. To be clear, the nightly 2.0 runs full test suite against 
hadoop2 and then hadoop3... this is why it takes a while.  (was: Yesterday, a 
nightly run for hbase2 passed all unit tests against hadoop2. Hadoop3 tests got 
cut off at the 6 hour mark, our maximum total run time. This is crazy but for 
now, just up the max time from 6 to 8 hours to see if we can get a good build 
in. Can work on breaking this down in subsequent issues.)

> Up nightly test run timeout from 6 hours to 8
> -
>
> Key: HBASE-19891
> URL: https://issues.apache.org/jira/browse/HBASE-19891
> Project: HBase
>  Issue Type: Sub-task
>Reporter: stack
>Priority: Major
>
> Yesterday, a nightly run for hbase2 passed all unit tests against hadoop2. 
> Hadoop3 tests got cut off at the 6 hour mark, our maximum total run time. 
> This is crazy but for now, just up the max time from 6 to 8 hours to see if 
> we can get a good build in. Can work on breaking this down in subsequent 
> issues. To be clear, the nightly 2.0 runs full test suite against hadoop2 and 
> then hadoop3... this is why it takes a while.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-19887) Find out why the HBaseClassTestRuleCheck does not work in pre commit

2018-01-29 Thread Duo Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344554#comment-16344554
 ] 

Duo Zhang commented on HBASE-19887:
---

OK, the problem is the same as with hbase-common in HBASE-19873: we define a 
listener in hbase-server/pom.xml and overwrite the default listener from the 
parent pom.xml.

{noformat}
<property>
  <name>listener</name>
  <value>org.apache.hadoop.hbase.ServerResourceCheckerJUnitListener</value>
</property>
{noformat}

Looking at the code, there is no difference between 
ServerResourceCheckerJUnitListener and ResourceCheckerJUnitListener?

{code}
/**
 * Monitor the resources. use by the tests All resources in
 * {@link ResourceCheckerJUnitListener} plus the number of connection.
 */
public class ServerResourceCheckerJUnitListener extends ResourceCheckerJUnitListener {
}
{code}

Although the comment says we have an extra connection-count check, it seems we 
do not.

In general I prefer that we set the listener in the parent pom.xml. What do you 
think, [~stack]?

Thanks.
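
For background, the surefire listener property names a JUnit RunListener 
class. Below is a minimal sketch of such a listener, assuming JUnit 4; it is 
an illustration of the mechanism only, not the actual 
ResourceCheckerJUnitListener:

{code:java}
import org.junit.runner.Description;
import org.junit.runner.notification.RunListener;

// Samples one resource (here, the live thread count) before and after each
// test and reports a leak. The real listener tracks several resources.
public class ThreadCountListener extends RunListener {
  private int before;

  @Override
  public void testStarted(Description description) {
    before = Thread.activeCount();
  }

  @Override
  public void testFinished(Description description) {
    int after = Thread.activeCount();
    if (after > before) {
      System.err.println(description + " leaked " + (after - before) + " thread(s)");
    }
  }
}
{code}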

> Find out why the HBaseClassTestRuleCheck does not work in pre commit
> 
>
> Key: HBASE-19887
> URL: https://issues.apache.org/jira/browse/HBASE-19887
> Project: HBase
>  Issue Type: Sub-task
>  Components: build
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
> Attachments: HBASE-19887.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HBASE-19891) Up nightly test run timeout from 6 hours to 8

2018-01-29 Thread stack (JIRA)
stack created HBASE-19891:
-

 Summary: Up nightly test run timeout from 6 hours to 8
 Key: HBASE-19891
 URL: https://issues.apache.org/jira/browse/HBASE-19891
 Project: HBase
  Issue Type: Sub-task
Reporter: stack


Yesterday, a nightly run for hbase2 passed all unit tests against hadoop2. 
Hadoop3 tests got cut off at the 6 hour mark, our maximum total run time. This 
is crazy but for now, just up the max time from 6 to 8 hours to see if we can 
get a good build in. Can work on breaking this down in subsequent issues.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-19887) Find out why the HBaseClassTestRuleCheck does not work in pre commit

2018-01-29 Thread Duo Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344550#comment-16344550
 ] 

Duo Zhang commented on HBASE-19887:
---

So it works...

Let me check why.

> Find out why the HBaseClassTestRuleCheck does not work in pre commit
> 
>
> Key: HBASE-19887
> URL: https://issues.apache.org/jira/browse/HBASE-19887
> Project: HBase
>  Issue Type: Sub-task
>  Components: build
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
> Attachments: HBASE-19887.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-19887) Find out why the HBaseClassTestRuleCheck does not work in pre commit

2018-01-29 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344547#comment-16344547
 ] 

Hadoop QA commented on HBASE-19887:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
10s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
26s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
17s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
22s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  5m 
 5s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
16s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
16s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
23s{color} | {color:red} hbase-common: The patch generated 2 new + 0 unchanged 
- 0 fixed = 2 total (was 0) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
33s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
18m 14s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.6.5 2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
16s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  2m  5s{color} 
| {color:red} hbase-common in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
 9s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 36m  7s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.TestRuleChecker |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:eee3b01 |
| JIRA Issue | HBASE-19887 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12908264/HBASE-19887.patch |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  shadedjars  
hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux b18a4f89a7cb 3.13.0-133-generic #182-Ubuntu SMP Tue Sep 19 
15:49:21 UTC 2017 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | master / 34c6c99041 |
| maven | version: Apache Maven 3.5.2 
(138edd61fd100ec658bfa2d307c43b76940a5d7d; 2017-10-18T07:58:13Z) |
| Default Java | 1.8.0_151 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HBASE-Build/11241/artifact/patchprocess/diff-checkstyle-hbase-common.txt
 |
| unit | 
https://builds.apache.org/job/PreCommit-HBASE-Build/11241/artifact/patchprocess/patch-unit-hbase-common.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HBASE-Build/11241/testReport/ |
| modules | C: hbase-common U: hbase-common |
| Console output | 
https://builds.apache.org/job/PreCommit-HBASE-Build/11241/console |

[jira] [Updated] (HBASE-19868) TestCoprocessorWhitelistMasterObserver is flakey

2018-01-29 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-19868:
--
Issue Type: Sub-task  (was: Bug)
Parent: HBASE-19147

> TestCoprocessorWhitelistMasterObserver is flakey
> 
>
> Key: HBASE-19868
> URL: https://issues.apache.org/jira/browse/HBASE-19868
> Project: HBase
>  Issue Type: Sub-task
>  Components: flakey, test
>Affects Versions: 2.0.0-beta-1
>Reporter: Peter Somogyi
>Assignee: Peter Somogyi
>Priority: Major
> Fix For: 2.0.0-beta-2
>
> Attachments: HBASE-19868.branch-2.001.patch
>
>
> TestCoprocessorWhitelistMasterObserver is failing 33% of the time. In the 
> logs it looks like the failure is related to Master initialization.
> Following log is from 
> [https://builds.apache.org/job/HBase%20Nightly/job/branch-2/203] 
> {noformat}
> 2018-01-26 02:36:36,686 WARN [M:0;1f0c4777c1ba:35049] 
> master.TableNamespaceManager(307): Caught exception in initializing namespace 
> table manager
> org.apache.hadoop.hbase.DoNotRetryIOException: hconnection-0x18cd2ac8 closed
> at 
> org.apache.hadoop.hbase.client.ConnectionImplementation.locateRegion(ConnectionImplementation.java:722)
> at 
> org.apache.hadoop.hbase.client.ConnectionUtils$ShortCircuitingClusterConnection.locateRegion(ConnectionUtils.java:131)
> at 
> org.apache.hadoop.hbase.client.ConnectionImplementation.locateRegion(ConnectionImplementation.java:714)
> at 
> org.apache.hadoop.hbase.client.ConnectionUtils$ShortCircuitingClusterConnection.locateRegion(ConnectionUtils.java:131)
> at 
> org.apache.hadoop.hbase.client.ConnectionImplementation.locateRegion(ConnectionImplementation.java:684)
> at 
> org.apache.hadoop.hbase.client.ConnectionUtils$ShortCircuitingClusterConnection.locateRegion(ConnectionUtils.java:131)
> at 
> org.apache.hadoop.hbase.client.ConnectionImplementation.getRegionLocation(ConnectionImplementation.java:562)
> at 
> org.apache.hadoop.hbase.client.ConnectionUtils$ShortCircuitingClusterConnection.getRegionLocation(ConnectionUtils.java:131)
> at 
> org.apache.hadoop.hbase.client.HRegionLocator.getRegionLocation(HRegionLocator.java:73)
> at 
> org.apache.hadoop.hbase.client.RegionServerCallable.prepare(RegionServerCallable.java:223)
> at 
> org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.callWithRetries(RpcRetryingCallerImpl.java:105)
> at org.apache.hadoop.hbase.client.HTable.get(HTable.java:388)
> at org.apache.hadoop.hbase.client.HTable.get(HTable.java:362)
> at 
> org.apache.hadoop.hbase.master.TableNamespaceManager.get(TableNamespaceManager.java:141)
> at 
> org.apache.hadoop.hbase.master.TableNamespaceManager.isTableAvailableAndInitialized(TableNamespaceManager.java:281)
> at 
> org.apache.hadoop.hbase.master.TableNamespaceManager.start(TableNamespaceManager.java:103)
> at 
> org.apache.hadoop.hbase.master.ClusterSchemaServiceImpl.doStart(ClusterSchemaServiceImpl.java:62)
> at 
> org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.startAsync(AbstractService.java:226)
> at 
> org.apache.hadoop.hbase.master.HMaster.initClusterSchemaService(HMaster.java:1059)
> at 
> org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:921)
> at 
> org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2034)
> at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:553)
> at java.lang.Thread.run(Thread.java:748)
> 2018-01-26 02:36:36,691 ERROR [M:0;1f0c4777c1ba:35049] 
> helpers.MarkerIgnoringBase(159): Failed to become active master
> java.lang.IllegalStateException: Expected the service 
> ClusterSchemaServiceImpl [FAILED] to be RUNNING, but the service has FAILED
> at 
> org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.checkCurrentState(AbstractService.java:345)
> at 
> org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.awaitRunning(AbstractService.java:291)
> at 
> org.apache.hadoop.hbase.master.HMaster.initClusterSchemaService(HMaster.java:1061)
> at 
> org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:921)
> at 
> org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2034)
> at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:553)
> at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.hadoop.hbase.DoNotRetryIOException: 
> hconnection-0x18cd2ac8 closed
> at 
> org.apache.hadoop.hbase.client.ConnectionImplementation.locateRegion(ConnectionImplementation.java:722)
> at 
> org.apache.hadoop.hbase.client.ConnectionUtils$ShortCircuitingClusterConnection.locateRegion(ConnectionUtils.java:131)
> at 
> org.apache.hadoop.hbase.client.ConnectionImplementation.locateRegion(ConnectionImplementation.java:714)
> at 
> org.apache.hadoop.hbase.clien

[jira] [Commented] (HBASE-19868) TestCoprocessorWhitelistMasterObserver is flakey

2018-01-29 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344545#comment-16344545
 ] 

stack commented on HBASE-19868:
---

Oh, flakey 2.0 just came in. It has a few of these failures in the top-list. 
That's good because these have more logs: 
[https://builds.apache.org/job/HBASE-Flaky-Tests-branch2.0/983/]. I'm still a 
bit stuck though; our thread-dumper seems to be missing in this test run.

> TestCoprocessorWhitelistMasterObserver is flakey
> 
>
> Key: HBASE-19868
> URL: https://issues.apache.org/jira/browse/HBASE-19868
> Project: HBase
>  Issue Type: Bug
>  Components: flakey, test
>Affects Versions: 2.0.0-beta-1
>Reporter: Peter Somogyi
>Assignee: Peter Somogyi
>Priority: Major
> Fix For: 2.0.0-beta-2
>
> Attachments: HBASE-19868.branch-2.001.patch
>
>
> TestCoprocessorWhitelistMasterObserver is failing 33% of the time. In the 
> logs it looks like the failure is related to Master initialization.
> Following log is from 
> [https://builds.apache.org/job/HBase%20Nightly/job/branch-2/203] 
> {noformat}
> 2018-01-26 02:36:36,686 WARN [M:0;1f0c4777c1ba:35049] 
> master.TableNamespaceManager(307): Caught exception in initializing namespace 
> table manager
> org.apache.hadoop.hbase.DoNotRetryIOException: hconnection-0x18cd2ac8 closed
> at 
> org.apache.hadoop.hbase.client.ConnectionImplementation.locateRegion(ConnectionImplementation.java:722)
> at 
> org.apache.hadoop.hbase.client.ConnectionUtils$ShortCircuitingClusterConnection.locateRegion(ConnectionUtils.java:131)
> at 
> org.apache.hadoop.hbase.client.ConnectionImplementation.locateRegion(ConnectionImplementation.java:714)
> at 
> org.apache.hadoop.hbase.client.ConnectionUtils$ShortCircuitingClusterConnection.locateRegion(ConnectionUtils.java:131)
> at 
> org.apache.hadoop.hbase.client.ConnectionImplementation.locateRegion(ConnectionImplementation.java:684)
> at 
> org.apache.hadoop.hbase.client.ConnectionUtils$ShortCircuitingClusterConnection.locateRegion(ConnectionUtils.java:131)
> at 
> org.apache.hadoop.hbase.client.ConnectionImplementation.getRegionLocation(ConnectionImplementation.java:562)
> at 
> org.apache.hadoop.hbase.client.ConnectionUtils$ShortCircuitingClusterConnection.getRegionLocation(ConnectionUtils.java:131)
> at 
> org.apache.hadoop.hbase.client.HRegionLocator.getRegionLocation(HRegionLocator.java:73)
> at 
> org.apache.hadoop.hbase.client.RegionServerCallable.prepare(RegionServerCallable.java:223)
> at 
> org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.callWithRetries(RpcRetryingCallerImpl.java:105)
> at org.apache.hadoop.hbase.client.HTable.get(HTable.java:388)
> at org.apache.hadoop.hbase.client.HTable.get(HTable.java:362)
> at 
> org.apache.hadoop.hbase.master.TableNamespaceManager.get(TableNamespaceManager.java:141)
> at 
> org.apache.hadoop.hbase.master.TableNamespaceManager.isTableAvailableAndInitialized(TableNamespaceManager.java:281)
> at 
> org.apache.hadoop.hbase.master.TableNamespaceManager.start(TableNamespaceManager.java:103)
> at 
> org.apache.hadoop.hbase.master.ClusterSchemaServiceImpl.doStart(ClusterSchemaServiceImpl.java:62)
> at 
> org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.startAsync(AbstractService.java:226)
> at 
> org.apache.hadoop.hbase.master.HMaster.initClusterSchemaService(HMaster.java:1059)
> at 
> org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:921)
> at 
> org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2034)
> at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:553)
> at java.lang.Thread.run(Thread.java:748)
> 2018-01-26 02:36:36,691 ERROR [M:0;1f0c4777c1ba:35049] 
> helpers.MarkerIgnoringBase(159): Failed to become active master
> java.lang.IllegalStateException: Expected the service 
> ClusterSchemaServiceImpl [FAILED] to be RUNNING, but the service has FAILED
> at 
> org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.checkCurrentState(AbstractService.java:345)
> at 
> org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.awaitRunning(AbstractService.java:291)
> at 
> org.apache.hadoop.hbase.master.HMaster.initClusterSchemaService(HMaster.java:1061)
> at 
> org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:921)
> at 
> org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2034)
> at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:553)
> at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.hadoop.hbase.DoNotRetryIOException: 
> hconnection-0x18cd2ac8 closed
> at 
> org.apache.hadoop.hbase.client.ConnectionImplementation.locateRegion(ConnectionImplementation.java:722)
> a

[jira] [Commented] (HBASE-19147) All branch-2 unit tests pass

2018-01-29 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344529#comment-16344529
 ] 

stack commented on HBASE-19147:
---

We got a pass on a nightly for hadoop2. Failed in hadoop3: 
https://builds.apache.org/view/H-L/view/HBase/job/HBase%20Nightly/job/branch-2/

> All branch-2 unit tests pass
> 
>
> Key: HBASE-19147
> URL: https://issues.apache.org/jira/browse/HBASE-19147
> Project: HBase
>  Issue Type: Bug
>  Components: test
>Reporter: stack
>Priority: Blocker
> Fix For: 2.0.0-beta-2
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HBASE-19868) TestCoprocessorWhitelistMasterObserver is flakey

2018-01-29 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack reassigned HBASE-19868:
-

Assignee: Peter Somogyi

> TestCoprocessorWhitelistMasterObserver is flakey
> 
>
> Key: HBASE-19868
> URL: https://issues.apache.org/jira/browse/HBASE-19868
> Project: HBase
>  Issue Type: Bug
>  Components: flakey, test
>Affects Versions: 2.0.0-beta-1
>Reporter: Peter Somogyi
>Assignee: Peter Somogyi
>Priority: Major
> Fix For: 2.0.0-beta-2
>
> Attachments: HBASE-19868.branch-2.001.patch
>
>
> TestCoprocessorWhitelistMasterObserver is failing 33% of the time. In the 
> logs it looks like the failure is related to Master initialization.
> Following log is from 
> [https://builds.apache.org/job/HBase%20Nightly/job/branch-2/203] 
> {noformat}
> 2018-01-26 02:36:36,686 WARN [M:0;1f0c4777c1ba:35049] 
> master.TableNamespaceManager(307): Caught exception in initializing namespace 
> table manager
> org.apache.hadoop.hbase.DoNotRetryIOException: hconnection-0x18cd2ac8 closed
> at 
> org.apache.hadoop.hbase.client.ConnectionImplementation.locateRegion(ConnectionImplementation.java:722)
> at 
> org.apache.hadoop.hbase.client.ConnectionUtils$ShortCircuitingClusterConnection.locateRegion(ConnectionUtils.java:131)
> at 
> org.apache.hadoop.hbase.client.ConnectionImplementation.locateRegion(ConnectionImplementation.java:714)
> at 
> org.apache.hadoop.hbase.client.ConnectionUtils$ShortCircuitingClusterConnection.locateRegion(ConnectionUtils.java:131)
> at 
> org.apache.hadoop.hbase.client.ConnectionImplementation.locateRegion(ConnectionImplementation.java:684)
> at 
> org.apache.hadoop.hbase.client.ConnectionUtils$ShortCircuitingClusterConnection.locateRegion(ConnectionUtils.java:131)
> at 
> org.apache.hadoop.hbase.client.ConnectionImplementation.getRegionLocation(ConnectionImplementation.java:562)
> at 
> org.apache.hadoop.hbase.client.ConnectionUtils$ShortCircuitingClusterConnection.getRegionLocation(ConnectionUtils.java:131)
> at 
> org.apache.hadoop.hbase.client.HRegionLocator.getRegionLocation(HRegionLocator.java:73)
> at 
> org.apache.hadoop.hbase.client.RegionServerCallable.prepare(RegionServerCallable.java:223)
> at 
> org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.callWithRetries(RpcRetryingCallerImpl.java:105)
> at org.apache.hadoop.hbase.client.HTable.get(HTable.java:388)
> at org.apache.hadoop.hbase.client.HTable.get(HTable.java:362)
> at 
> org.apache.hadoop.hbase.master.TableNamespaceManager.get(TableNamespaceManager.java:141)
> at 
> org.apache.hadoop.hbase.master.TableNamespaceManager.isTableAvailableAndInitialized(TableNamespaceManager.java:281)
> at 
> org.apache.hadoop.hbase.master.TableNamespaceManager.start(TableNamespaceManager.java:103)
> at 
> org.apache.hadoop.hbase.master.ClusterSchemaServiceImpl.doStart(ClusterSchemaServiceImpl.java:62)
> at 
> org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.startAsync(AbstractService.java:226)
> at 
> org.apache.hadoop.hbase.master.HMaster.initClusterSchemaService(HMaster.java:1059)
> at 
> org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:921)
> at 
> org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2034)
> at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:553)
> at java.lang.Thread.run(Thread.java:748)
> 2018-01-26 02:36:36,691 ERROR [M:0;1f0c4777c1ba:35049] 
> helpers.MarkerIgnoringBase(159): Failed to become active master
> java.lang.IllegalStateException: Expected the service 
> ClusterSchemaServiceImpl [FAILED] to be RUNNING, but the service has FAILED
> at 
> org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.checkCurrentState(AbstractService.java:345)
> at 
> org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.awaitRunning(AbstractService.java:291)
> at 
> org.apache.hadoop.hbase.master.HMaster.initClusterSchemaService(HMaster.java:1061)
> at 
> org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:921)
> at 
> org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2034)
> at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:553)
> at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.hadoop.hbase.DoNotRetryIOException: 
> hconnection-0x18cd2ac8 closed
> at 
> org.apache.hadoop.hbase.client.ConnectionImplementation.locateRegion(ConnectionImplementation.java:722)
> at 
> org.apache.hadoop.hbase.client.ConnectionUtils$ShortCircuitingClusterConnection.locateRegion(ConnectionUtils.java:131)
> at 
> org.apache.hadoop.hbase.client.ConnectionImplementation.locateRegion(ConnectionImplementation.java:714)
> at 
> org.apache.hadoop.hbase.client.ConnectionUtils$ShortCircuitingCl

[jira] [Commented] (HBASE-19868) TestCoprocessorWhitelistMasterObserver is flakey

2018-01-29 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344524#comment-16344524
 ] 

stack commented on HBASE-19868:
---

2.001 just ups retries from 1 to 5. The test runs in about the same time 
locally, at least. Pushed to master and branch-2 to see.
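
A rough sketch of the shape of that change, for readers following along 
(illustrative only; see the attached HBASE-19868.branch-2.001.patch for the 
real diff, and UTIL here is a hypothetical HBaseTestingUtility):

{code}
// Sketch, not the actual patch: bump client retries in the test's cluster
// configuration so that one slow attempt during Master startup does not fail
// the whole run. UTIL is a hypothetical HBaseTestingUtility instance.
import org.apache.hadoop.conf.Configuration;

Configuration conf = UTIL.getConfiguration();
conf.setInt("hbase.client.retries.number", 5);  // was 1
{code}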

> TestCoprocessorWhitelistMasterObserver is flakey
> 
>
> Key: HBASE-19868
> URL: https://issues.apache.org/jira/browse/HBASE-19868
> Project: HBase
>  Issue Type: Bug
>  Components: flakey, test
>Affects Versions: 2.0.0-beta-1
>Reporter: Peter Somogyi
>Priority: Major
> Fix For: 2.0.0-beta-2
>
> Attachments: HBASE-19868.branch-2.001.patch
>
>
> TestCoprocessorWhitelistMasterObserver is failing 33% of the time. In the 
> logs it looks like the failure is related to Master initialization.
> Following log is from 
> [https://builds.apache.org/job/HBase%20Nightly/job/branch-2/203] 
> {noformat}
> 2018-01-26 02:36:36,686 WARN [M:0;1f0c4777c1ba:35049] 
> master.TableNamespaceManager(307): Caught exception in initializing namespace 
> table manager
> org.apache.hadoop.hbase.DoNotRetryIOException: hconnection-0x18cd2ac8 closed
> at 
> org.apache.hadoop.hbase.client.ConnectionImplementation.locateRegion(ConnectionImplementation.java:722)
> at 
> org.apache.hadoop.hbase.client.ConnectionUtils$ShortCircuitingClusterConnection.locateRegion(ConnectionUtils.java:131)
> at 
> org.apache.hadoop.hbase.client.ConnectionImplementation.locateRegion(ConnectionImplementation.java:714)
> at 
> org.apache.hadoop.hbase.client.ConnectionUtils$ShortCircuitingClusterConnection.locateRegion(ConnectionUtils.java:131)
> at 
> org.apache.hadoop.hbase.client.ConnectionImplementation.locateRegion(ConnectionImplementation.java:684)
> at 
> org.apache.hadoop.hbase.client.ConnectionUtils$ShortCircuitingClusterConnection.locateRegion(ConnectionUtils.java:131)
> at 
> org.apache.hadoop.hbase.client.ConnectionImplementation.getRegionLocation(ConnectionImplementation.java:562)
> at 
> org.apache.hadoop.hbase.client.ConnectionUtils$ShortCircuitingClusterConnection.getRegionLocation(ConnectionUtils.java:131)
> at 
> org.apache.hadoop.hbase.client.HRegionLocator.getRegionLocation(HRegionLocator.java:73)
> at 
> org.apache.hadoop.hbase.client.RegionServerCallable.prepare(RegionServerCallable.java:223)
> at 
> org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.callWithRetries(RpcRetryingCallerImpl.java:105)
> at org.apache.hadoop.hbase.client.HTable.get(HTable.java:388)
> at org.apache.hadoop.hbase.client.HTable.get(HTable.java:362)
> at 
> org.apache.hadoop.hbase.master.TableNamespaceManager.get(TableNamespaceManager.java:141)
> at 
> org.apache.hadoop.hbase.master.TableNamespaceManager.isTableAvailableAndInitialized(TableNamespaceManager.java:281)
> at 
> org.apache.hadoop.hbase.master.TableNamespaceManager.start(TableNamespaceManager.java:103)
> at 
> org.apache.hadoop.hbase.master.ClusterSchemaServiceImpl.doStart(ClusterSchemaServiceImpl.java:62)
> at 
> org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.startAsync(AbstractService.java:226)
> at 
> org.apache.hadoop.hbase.master.HMaster.initClusterSchemaService(HMaster.java:1059)
> at 
> org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:921)
> at 
> org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2034)
> at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:553)
> at java.lang.Thread.run(Thread.java:748)
> 2018-01-26 02:36:36,691 ERROR [M:0;1f0c4777c1ba:35049] 
> helpers.MarkerIgnoringBase(159): Failed to become active master
> java.lang.IllegalStateException: Expected the service 
> ClusterSchemaServiceImpl [FAILED] to be RUNNING, but the service has FAILED
> at 
> org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.checkCurrentState(AbstractService.java:345)
> at 
> org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.awaitRunning(AbstractService.java:291)
> at 
> org.apache.hadoop.hbase.master.HMaster.initClusterSchemaService(HMaster.java:1061)
> at 
> org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:921)
> at 
> org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2034)
> at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:553)
> at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.hadoop.hbase.DoNotRetryIOException: 
> hconnection-0x18cd2ac8 closed
> at 
> org.apache.hadoop.hbase.client.ConnectionImplementation.locateRegion(ConnectionImplementation.java:722)
> at 
> org.apache.hadoop.hbase.client.ConnectionUtils$ShortCircuitingClusterConnection.locateRegion(ConnectionUtils.java:131)
> at 
> org.apache.hadoop.hbase.client.ConnectionImplementation.locateRegion

[jira] [Updated] (HBASE-19868) TestCoprocessorWhitelistMasterObserver is flakey

2018-01-29 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-19868:
--
Attachment: HBASE-19868.branch-2.001.patch

> TestCoprocessorWhitelistMasterObserver is flakey
> 
>
> Key: HBASE-19868
> URL: https://issues.apache.org/jira/browse/HBASE-19868
> Project: HBase
>  Issue Type: Bug
>  Components: flakey, test
>Affects Versions: 2.0.0-beta-1
>Reporter: Peter Somogyi
>Priority: Major
> Fix For: 2.0.0-beta-2
>
> Attachments: HBASE-19868.branch-2.001.patch
>
>
> TestCoprocessorWhitelistMasterObserver is failing 33% of the time. In the 
> logs it looks like the failure is related to Master initialization.
> Following log is from 
> [https://builds.apache.org/job/HBase%20Nightly/job/branch-2/203] 
> {noformat}
> 2018-01-26 02:36:36,686 WARN [M:0;1f0c4777c1ba:35049] 
> master.TableNamespaceManager(307): Caught exception in initializing namespace 
> table manager
> org.apache.hadoop.hbase.DoNotRetryIOException: hconnection-0x18cd2ac8 closed
> at 
> org.apache.hadoop.hbase.client.ConnectionImplementation.locateRegion(ConnectionImplementation.java:722)
> at 
> org.apache.hadoop.hbase.client.ConnectionUtils$ShortCircuitingClusterConnection.locateRegion(ConnectionUtils.java:131)
> at 
> org.apache.hadoop.hbase.client.ConnectionImplementation.locateRegion(ConnectionImplementation.java:714)
> at 
> org.apache.hadoop.hbase.client.ConnectionUtils$ShortCircuitingClusterConnection.locateRegion(ConnectionUtils.java:131)
> at 
> org.apache.hadoop.hbase.client.ConnectionImplementation.locateRegion(ConnectionImplementation.java:684)
> at 
> org.apache.hadoop.hbase.client.ConnectionUtils$ShortCircuitingClusterConnection.locateRegion(ConnectionUtils.java:131)
> at 
> org.apache.hadoop.hbase.client.ConnectionImplementation.getRegionLocation(ConnectionImplementation.java:562)
> at 
> org.apache.hadoop.hbase.client.ConnectionUtils$ShortCircuitingClusterConnection.getRegionLocation(ConnectionUtils.java:131)
> at 
> org.apache.hadoop.hbase.client.HRegionLocator.getRegionLocation(HRegionLocator.java:73)
> at 
> org.apache.hadoop.hbase.client.RegionServerCallable.prepare(RegionServerCallable.java:223)
> at 
> org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.callWithRetries(RpcRetryingCallerImpl.java:105)
> at org.apache.hadoop.hbase.client.HTable.get(HTable.java:388)
> at org.apache.hadoop.hbase.client.HTable.get(HTable.java:362)
> at 
> org.apache.hadoop.hbase.master.TableNamespaceManager.get(TableNamespaceManager.java:141)
> at 
> org.apache.hadoop.hbase.master.TableNamespaceManager.isTableAvailableAndInitialized(TableNamespaceManager.java:281)
> at 
> org.apache.hadoop.hbase.master.TableNamespaceManager.start(TableNamespaceManager.java:103)
> at 
> org.apache.hadoop.hbase.master.ClusterSchemaServiceImpl.doStart(ClusterSchemaServiceImpl.java:62)
> at 
> org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.startAsync(AbstractService.java:226)
> at 
> org.apache.hadoop.hbase.master.HMaster.initClusterSchemaService(HMaster.java:1059)
> at 
> org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:921)
> at 
> org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2034)
> at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:553)
> at java.lang.Thread.run(Thread.java:748)
> 2018-01-26 02:36:36,691 ERROR [M:0;1f0c4777c1ba:35049] 
> helpers.MarkerIgnoringBase(159): Failed to become active master
> java.lang.IllegalStateException: Expected the service 
> ClusterSchemaServiceImpl [FAILED] to be RUNNING, but the service has FAILED
> at 
> org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.checkCurrentState(AbstractService.java:345)
> at 
> org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.awaitRunning(AbstractService.java:291)
> at 
> org.apache.hadoop.hbase.master.HMaster.initClusterSchemaService(HMaster.java:1061)
> at 
> org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:921)
> at 
> org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2034)
> at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:553)
> at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.hadoop.hbase.DoNotRetryIOException: 
> hconnection-0x18cd2ac8 closed
> at 
> org.apache.hadoop.hbase.client.ConnectionImplementation.locateRegion(ConnectionImplementation.java:722)
> at 
> org.apache.hadoop.hbase.client.ConnectionUtils$ShortCircuitingClusterConnection.locateRegion(ConnectionUtils.java:131)
> at 
> org.apache.hadoop.hbase.client.ConnectionImplementation.locateRegion(ConnectionImplementation.java:714)
> at 
> org.apache.hadoop.hbase.client.ConnectionUtils$ShortCircuitingClusterConnection.locateReg

[jira] [Commented] (HBASE-13153) Bulk Loaded HFile Replication

2018-01-29 Thread Ashish Singhi (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344518#comment-16344518
 ] 

Ashish Singhi commented on HBASE-13153:
---

Hi [~anoop.hbase],
I will try to post a patch for the document update this coming weekend. On 
weekdays I am a little busy with my paid job.

bq. Many bugs were resolved in bulk load and all such fixes should be in 1.3+ 
versions.
AFAIK all those bugs are fixed in 1.3+ versions. I usually keep an eye on it.

bq. The mentioned potential issues also seem out of date.
Which ones are you pointing at?

> Bulk Loaded HFile Replication
> -
>
> Key: HBASE-13153
> URL: https://issues.apache.org/jira/browse/HBASE-13153
> Project: HBase
>  Issue Type: New Feature
>  Components: Replication
>Reporter: sunhaitao
>Assignee: Ashish Singhi
>Priority: Major
> Fix For: 2.0.0, 1.3.0
>
> Attachments: HBASE-13153-branch-1-v20.patch, 
> HBASE-13153-branch-1-v21.patch, HBASE-13153-v1.patch, HBASE-13153-v10.patch, 
> HBASE-13153-v11.patch, HBASE-13153-v12.patch, HBASE-13153-v13.patch, 
> HBASE-13153-v14.patch, HBASE-13153-v15.patch, HBASE-13153-v16.patch, 
> HBASE-13153-v17.patch, HBASE-13153-v18.patch, HBASE-13153-v19.patch, 
> HBASE-13153-v2.patch, HBASE-13153-v20.patch, HBASE-13153-v21.patch, 
> HBASE-13153-v3.patch, HBASE-13153-v4.patch, HBASE-13153-v5.patch, 
> HBASE-13153-v6.patch, HBASE-13153-v7.patch, HBASE-13153-v8.patch, 
> HBASE-13153-v9.patch, HBASE-13153.patch, HBase Bulk Load 
> Replication-v1-1.pdf, HBase Bulk Load Replication-v2.pdf, HBase Bulk Load 
> Replication-v3.pdf, HBase Bulk Load Replication.pdf, HDFS_HA_Solution.PNG
>
>
> Currently we plan to use the HBase Replication feature to deal with a 
> disaster tolerance scenario. But we encounter an issue: we use bulkload very 
> frequently, and because bulkload bypasses the write path it will not generate 
> WAL, so the data will not be replicated to the backup cluster. It's 
> inappropriate to bulkload twice, on both the active cluster and the backup 
> cluster. So I advise making some modification to the bulkload feature to 
> enable bulkload to both the active cluster and the backup cluster.
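
For the doc update, a hedged sketch of the source-cluster settings this 
feature involves (to the best of my knowledge these are the key names as 
committed; the values are placeholders and should be verified against the 
shipped hbase-default.xml):

{code}
// Sketch only: enabling replication of bulk-loaded HFiles on the source
// cluster. Verify key names against the shipped defaults before documenting.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

Configuration conf = HBaseConfiguration.create();
conf.setBoolean("hbase.replication.bulkload.enabled", true);
conf.set("hbase.replication.cluster.id", "source-cluster-1");  // unique per cluster
{code}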



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-19887) Find out why the HBaseClassTestRuleCheck does not work in pre commit

2018-01-29 Thread Duo Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang updated HBASE-19887:
--
Assignee: Duo Zhang
  Status: Patch Available  (was: Open)

> Find out why the HBaseClassTestRuleCheck does not work in pre commit
> 
>
> Key: HBASE-19887
> URL: https://issues.apache.org/jira/browse/HBASE-19887
> Project: HBase
>  Issue Type: Sub-task
>  Components: build
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
> Attachments: HBASE-19887.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-19887) Find out why the HBaseClassTestRuleCheck does not work in pre commit

2018-01-29 Thread Duo Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang updated HBASE-19887:
--
Attachment: HBASE-19887.patch

> Find out why the HBaseClassTestRuleCheck does not work in pre commit
> 
>
> Key: HBASE-19887
> URL: https://issues.apache.org/jira/browse/HBASE-19887
> Project: HBase
>  Issue Type: Sub-task
>  Components: build
>Reporter: Duo Zhang
>Priority: Major
> Attachments: HBASE-19887.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-13153) Bulk Loaded HFile Replication

2018-01-29 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344508#comment-16344508
 ] 

Anoop Sam John commented on HBASE-13153:


This feature as such is not described in the book? If so, we have to add it. 
Yes, we have to update the section, removing the replication-related 
limitation. Many bugs were resolved in bulk load and all such fixes should be 
in 1.3+ versions. The mentioned potential issues also seem out of date.

> Bulk Loaded HFile Replication
> -
>
> Key: HBASE-13153
> URL: https://issues.apache.org/jira/browse/HBASE-13153
> Project: HBase
>  Issue Type: New Feature
>  Components: Replication
>Reporter: sunhaitao
>Assignee: Ashish Singhi
>Priority: Major
> Fix For: 2.0.0, 1.3.0
>
> Attachments: HBASE-13153-branch-1-v20.patch, 
> HBASE-13153-branch-1-v21.patch, HBASE-13153-v1.patch, HBASE-13153-v10.patch, 
> HBASE-13153-v11.patch, HBASE-13153-v12.patch, HBASE-13153-v13.patch, 
> HBASE-13153-v14.patch, HBASE-13153-v15.patch, HBASE-13153-v16.patch, 
> HBASE-13153-v17.patch, HBASE-13153-v18.patch, HBASE-13153-v19.patch, 
> HBASE-13153-v2.patch, HBASE-13153-v20.patch, HBASE-13153-v21.patch, 
> HBASE-13153-v3.patch, HBASE-13153-v4.patch, HBASE-13153-v5.patch, 
> HBASE-13153-v6.patch, HBASE-13153-v7.patch, HBASE-13153-v8.patch, 
> HBASE-13153-v9.patch, HBASE-13153.patch, HBase Bulk Load 
> Replication-v1-1.pdf, HBase Bulk Load Replication-v2.pdf, HBase Bulk Load 
> Replication-v3.pdf, HBase Bulk Load Replication.pdf, HDFS_HA_Solution.PNG
>
>
> Currently we plan to use the HBase Replication feature to deal with a 
> disaster tolerance scenario. But we encounter an issue: we use bulkload very 
> frequently, and because bulkload bypasses the write path it will not generate 
> WAL, so the data will not be replicated to the backup cluster. It's 
> inappropriate to bulkload twice, on both the active cluster and the backup 
> cluster. So I advise making some modification to the bulkload feature to 
> enable bulkload to both the active cluster and the backup cluster.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-19868) TestCoprocessorWhitelistMasterObserver is flakey

2018-01-29 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344504#comment-16344504
 ] 

stack commented on HBASE-19868:
---

[~psomogyi] How'd you extract the exception? Going through the nightlies, I was 
unable to get a good log. The exception seems to come after the log run at 
02:36:36,686 (the log here ends at 
[https://builds.apache.org/job/HBase%20Nightly/job/branch-2/203/testReport/junit/org.apache.hadoop.hbase.security.access/TestCoprocessorWhitelistMasterObserver/org_apache_hadoop_hbase_security_access_TestCoprocessorWhitelistMasterObserver/]
 at 2018-01-26 02:36:36,440).

I can't make the test fail locally, which I'm guessing is what you are finding 
too.

(I was looking to see why we are seemingly not archiving the surefire test run 
output. It looks like the flags are in place, but I can't find the raw 
emissions. Need to dig more.)

So, the odd thing about this test is conf.setInt("hbase.client.retries.number", 
1) (I think). If stuff is slow around startup, we'll fail our one attempt. It's 
happening in ClusterSchemaServiceImpl, which is cast as a Guava Service and is 
started async. Are we stuck in isTableAvailableAndInitialized, failing our one 
attempt over and over? I can't tell. Or is it that the Master fails to come up 
and then we are just stuck in mini-cluster startup trying to scan a meta that 
is not coming? Let me try some debug and up the retries to 5. The test should 
still fail fast.
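
For reference, a minimal sketch of the startAsync()/awaitRunning() pattern 
that the trace shows; SchemaService below is a hypothetical stand-in, not the 
actual ClusterSchemaServiceImpl:

{code}
import org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService;

// Hypothetical stand-in for ClusterSchemaServiceImpl. doStart() is where the
// TableNamespaceManager.start()-style work happens; if that work throws
// (e.g. "hconnection closed"), the service lands in the FAILED state.
class SchemaService extends AbstractService {
  @Override
  protected void doStart() {
    try {
      // namespace-table initialization would go here
      notifyStarted();
    } catch (RuntimeException e) {
      notifyFailed(e);
    }
  }

  @Override
  protected void doStop() {
    notifyStopped();
  }
}

// HMaster-style caller: awaitRunning() throws the IllegalStateException seen
// in the log ("Expected the service ... to be RUNNING") when doStart() failed.
SchemaService service = new SchemaService();
service.startAsync();
service.awaitRunning();
{code}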

 

> TestCoprocessorWhitelistMasterObserver is flakey
> 
>
> Key: HBASE-19868
> URL: https://issues.apache.org/jira/browse/HBASE-19868
> Project: HBase
>  Issue Type: Bug
>  Components: flakey, test
>Affects Versions: 2.0.0-beta-1
>Reporter: Peter Somogyi
>Priority: Major
> Fix For: 2.0.0-beta-2
>
>
> TestCoprocessorWhitelistMasterObserver is failing 33% of the time. In the 
> logs it looks like the failure is related to Master initialization.
> Following log is from 
> [https://builds.apache.org/job/HBase%20Nightly/job/branch-2/203] 
> {noformat}
> 2018-01-26 02:36:36,686 WARN [M:0;1f0c4777c1ba:35049] 
> master.TableNamespaceManager(307): Caught exception in initializing namespace 
> table manager
> org.apache.hadoop.hbase.DoNotRetryIOException: hconnection-0x18cd2ac8 closed
> at 
> org.apache.hadoop.hbase.client.ConnectionImplementation.locateRegion(ConnectionImplementation.java:722)
> at 
> org.apache.hadoop.hbase.client.ConnectionUtils$ShortCircuitingClusterConnection.locateRegion(ConnectionUtils.java:131)
> at 
> org.apache.hadoop.hbase.client.ConnectionImplementation.locateRegion(ConnectionImplementation.java:714)
> at 
> org.apache.hadoop.hbase.client.ConnectionUtils$ShortCircuitingClusterConnection.locateRegion(ConnectionUtils.java:131)
> at 
> org.apache.hadoop.hbase.client.ConnectionImplementation.locateRegion(ConnectionImplementation.java:684)
> at 
> org.apache.hadoop.hbase.client.ConnectionUtils$ShortCircuitingClusterConnection.locateRegion(ConnectionUtils.java:131)
> at 
> org.apache.hadoop.hbase.client.ConnectionImplementation.getRegionLocation(ConnectionImplementation.java:562)
> at 
> org.apache.hadoop.hbase.client.ConnectionUtils$ShortCircuitingClusterConnection.getRegionLocation(ConnectionUtils.java:131)
> at 
> org.apache.hadoop.hbase.client.HRegionLocator.getRegionLocation(HRegionLocator.java:73)
> at 
> org.apache.hadoop.hbase.client.RegionServerCallable.prepare(RegionServerCallable.java:223)
> at 
> org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.callWithRetries(RpcRetryingCallerImpl.java:105)
> at org.apache.hadoop.hbase.client.HTable.get(HTable.java:388)
> at org.apache.hadoop.hbase.client.HTable.get(HTable.java:362)
> at 
> org.apache.hadoop.hbase.master.TableNamespaceManager.get(TableNamespaceManager.java:141)
> at 
> org.apache.hadoop.hbase.master.TableNamespaceManager.isTableAvailableAndInitialized(TableNamespaceManager.java:281)
> at 
> org.apache.hadoop.hbase.master.TableNamespaceManager.start(TableNamespaceManager.java:103)
> at 
> org.apache.hadoop.hbase.master.ClusterSchemaServiceImpl.doStart(ClusterSchemaServiceImpl.java:62)
> at 
> org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.startAsync(AbstractService.java:226)
> at 
> org.apache.hadoop.hbase.master.HMaster.initClusterSchemaService(HMaster.java:1059)
> at 
> org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:921)
> at 
> org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2034)
> at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:553)
> at java.lang.Thread.run(Thread.java:748)
> 2018-01-26 02:36:36,691 ERROR [M:0;1f0c4777c1ba:35049] 
> helpers.MarkerIgnoringBase(159): Failed to become active master
> java.lang.IllegalStateException: Expected the service 
> ClusterSch

[jira] [Commented] (HBASE-19824) SingleColumnValueFilter returns wrong result when used in shell command

2018-01-29 Thread Reid Chan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344456#comment-16344456
 ] 

Reid Chan commented on HBASE-19824:
---

Got it! Thanks Chia-Ping, Ted.

> SingleColumnValueFilter returns wrong result when used in shell command
> ---
>
> Key: HBASE-19824
> URL: https://issues.apache.org/jira/browse/HBASE-19824
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha-4
>Reporter: Ted Yu
>Assignee: Reid Chan
>Priority: Major
>
> There are two cells in table t1:
> {code}
> ROW COLUMN+CELL
>  r1 column=f1:a1, 
> timestamp=1516313683984, value=a2
>  r1 column=f1:b1, 
> timestamp=1516313700744, value=b2
> {code}
> When SingleColumnValueFilter is used in a shell command, no filtering was 
> done (see the sketch after the quoted output):
> {code}
> hbase(main):022:0> scan 't1', {FILTER => "SingleColumnValueFilter('f1', 'a1', 
> =, 'binary:a2')"}
> ROW COLUMN+CELL
>  r1 column=f1:a1, 
> timestamp=1516313683984, value=a2
>  r1 column=f1:b1, 
> timestamp=1516313700744, value=b2
> {code}
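
The row-level semantics explain the output above: the row's f1:a1 value is 
exactly 'a2', so the filter matches and the whole row, including the unrelated 
cell f1:b1, is emitted. A minimal Java sketch of the same filter follows; the 
setFilterIfMissing call is the knob that also excludes rows lacking the column 
entirely:

{code}
// Java-client equivalent of the shell filter above (sketch). When the named
// column matches, SingleColumnValueFilter emits the entire row, which is why
// both f1:a1 and f1:b1 appear in the scan result.
import org.apache.hadoop.hbase.CompareOperator;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.filter.BinaryComparator;
import org.apache.hadoop.hbase.filter.SingleColumnValueFilter;
import org.apache.hadoop.hbase.util.Bytes;

SingleColumnValueFilter f = new SingleColumnValueFilter(
    Bytes.toBytes("f1"), Bytes.toBytes("a1"),
    CompareOperator.EQUAL, new BinaryComparator(Bytes.toBytes("a2")));
f.setFilterIfMissing(true);  // otherwise rows without f1:a1 pass through too
Scan scan = new Scan().setFilter(f);
{code}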



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-19811) Fix findbugs and error-prone warnings in hbase-server (branch-2)

2018-01-29 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1635#comment-1635
 ] 

Hudson commented on HBASE-19811:


FAILURE: Integrated in Jenkins build HBase-Trunk_matrix #4493 (See 
[https://builds.apache.org/job/HBase-Trunk_matrix/4493/])
HBASE-19811 Fix findbugs and error-prone warnings in hbase-server (stack: rev 
34c6c99041d5f4a217363667b090fb1b5beb7abe)
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/procedure/TestZKProcedure.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreFlusher.java
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegion.java
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/zookeeper/TestZooKeeperACL.java


> Fix findbugs and error-prone warnings in hbase-server (branch-2)
> 
>
> Key: HBASE-19811
> URL: https://issues.apache.org/jira/browse/HBASE-19811
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.0.0-beta-1
>Reporter: Peter Somogyi
>Assignee: Peter Somogyi
>Priority: Major
> Fix For: 2.0.0-beta-2
>
> Attachments: 0HBASE-19811.branch-2.ADDENDUM.2.patch, 
> 1-HBASE-19811.branch-2.002.patch, HBASE-19811.branch-2.001.patch, 
> HBASE-19811.branch-2.001.patch, HBASE-19811.branch-2.002.patch, 
> HBASE-19811.branch-2.ADDENDUM.patch, HBASE-19811.master.002.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-17852) Add Fault tolerance to HBASE-14417 (Support bulk loaded files in incremental backup)

2018-01-29 Thread Josh Elser (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1631#comment-1631
 ] 

Josh Elser commented on HBASE-17852:


bq. I'm not messing with you, Appy. Check the push logs/comments on the other 
JIRA issue. I swear to you that I did not push this until after I heard back 
from you. My guess is that this is due to me using git-am or cherry-picking a 
commit from a local branch.

My apologies, Appy. I am wrong. I apparently got impatient and pushed this 
because there was silence after the Dec 6th mention and the Jan 12th re-ping. 
My intent was not to squash your opinions, but to avoid being blocked if you 
were not interested or were busy, as might have been the case.

If you have since changed your mind about the reduced patch hitting master, my 
offer to revert stands. My apologies again for arguing with you while in the 
wrong.

> Add Fault tolerance to HBASE-14417 (Support bulk loaded files in incremental 
> backup)
> 
>
> Key: HBASE-17852
> URL: https://issues.apache.org/jira/browse/HBASE-17852
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HBASE-17852-v10.patch, screenshot-1.png
>
>
> The rollback-via-snapshot design approach implemented in this ticket (see the 
> sketch after this list):
> # Before a backup create/delete/merge starts, we take a snapshot of the 
> backup meta-table (backup system table). This procedure is lightweight 
> because the meta table is small; it usually fits in a single region.
> # When an operation fails on the server side, we handle the failure by 
> cleaning up partial data in the backup destination, followed by restoring the 
> backup meta-table from the snapshot.
> # When an operation fails on the client side (abnormal termination, for 
> example), the next time the user tries create/merge/delete they will see an 
> error message that the system is in an inconsistent state and repair is 
> required; they will need to run the backup repair tool.
> # To avoid multiple writers to the backup system table (the backup client and 
> BackupObservers), we introduce a small table ONLY to keep the listing of bulk 
> loaded files. All backup observers will work only with this new table. The 
> reason: in case of a failure during backup create/delete/merge/restore, when 
> the system performs automatic rollback, some data written by backup observers 
> during the failed operation may be lost. This is what we try to avoid.
> # The second table keeps only bulk load related references. We do not care 
> about the consistency of this table, because bulk load is an idempotent 
> operation and can be repeated after failure. Partially written data in the 
> second table does not affect the BackupHFileCleaner plugin, because this data 
> (the list of bulk loaded files) corresponds to files which have not yet been 
> loaded successfully and, hence, are not visible to the system.
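
A rough pseudocode sketch of the flow in the list above (illustrative only; 
every identifier here is hypothetical, not the patch's API):

{code}
// Rollback-via-snapshot, sketched. All names are hypothetical.
void runBackupOperation(BackupOp op) throws IOException {
  String snapshot = "backup-meta-" + System.currentTimeMillis();
  snapshotBackupMetaTable(snapshot);     // cheap: the meta table is small
  try {
    op.execute();
  } catch (IOException serverSideFailure) {
    cleanupPartialData(op);              // remove partial data in the destination
    restoreBackupMetaTable(snapshot);    // roll the system table back
    throw serverSideFailure;
  }
  // A client-side crash leaves the state inconsistent; the next invocation
  // detects this and asks the user to run the backup repair tool.
}
{code}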



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HBASE-19890) Canary usage should document hbase.canary.sink.class config

2018-01-29 Thread Ted Yu (JIRA)
Ted Yu created HBASE-19890:
--

 Summary: Canary usage should document hbase.canary.sink.class 
config
 Key: HBASE-19890
 URL: https://issues.apache.org/jira/browse/HBASE-19890
 Project: HBase
  Issue Type: Bug
Reporter: Ted Yu


Canary#main uses the config hbase.canary.sink.class to instantiate the Sink 
class.

The Sink instance affects the creation of the Monitor.

In the refguide for Canary, hbase.canary.sink.class is not mentioned.
We should document this config.
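
For illustration, a minimal sketch of setting the key from client code; the 
sink class name below is a placeholder, not a class shipped with HBase:

{code}
// Hedged sketch only: pointing Canary at a custom sink via configuration.
// "org.example.MyCanarySink" is hypothetical; it would have to implement the
// Sink interface that Canary#main instantiates reflectively from this key.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

Configuration conf = HBaseConfiguration.create();
conf.set("hbase.canary.sink.class", "org.example.MyCanarySink");
{code}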



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Issue Comment Deleted] (HBASE-17852) Add Fault tolerance to HBASE-14417 (Support bulk loaded files in incremental backup)

2018-01-29 Thread Vladimir Rodionov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov updated HBASE-17852:
--
Comment: was deleted

(was: Hmm, did I insult someone savagely, [~elserj]?  )

> Add Fault tolerance to HBASE-14417 (Support bulk loaded files in incremental 
> backup)
> 
>
> Key: HBASE-17852
> URL: https://issues.apache.org/jira/browse/HBASE-17852
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HBASE-17852-v10.patch, screenshot-1.png
>
>
> The rollback-via-snapshot design approach implemented in this ticket:
> # Before a backup create/delete/merge starts, we take a snapshot of the 
> backup meta-table (backup system table). This procedure is lightweight 
> because the meta table is small; it usually fits in a single region.
> # When an operation fails on the server side, we handle the failure by 
> cleaning up partial data in the backup destination, followed by restoring the 
> backup meta-table from the snapshot.
> # When an operation fails on the client side (abnormal termination, for 
> example), the next time the user tries create/merge/delete they will see an 
> error message that the system is in an inconsistent state and repair is 
> required; they will need to run the backup repair tool.
> # To avoid multiple writers to the backup system table (the backup client and 
> BackupObservers), we introduce a small table ONLY to keep the listing of bulk 
> loaded files. All backup observers will work only with this new table. The 
> reason: in case of a failure during backup create/delete/merge/restore, when 
> the system performs automatic rollback, some data written by backup observers 
> during the failed operation may be lost. This is what we try to avoid.
> # The second table keeps only bulk load related references. We do not care 
> about the consistency of this table, because bulk load is an idempotent 
> operation and can be repeated after failure. Partially written data in the 
> second table does not affect the BackupHFileCleaner plugin, because this data 
> (the list of bulk loaded files) corresponds to files which have not yet been 
> loaded successfully and, hence, are not visible to the system.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-19863) java.lang.IllegalStateException: isDelete failed when SingleColumnValueFilter is used

2018-01-29 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344426#comment-16344426
 ] 

Hadoop QA commented on HBASE-19863:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
11s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
59s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
48s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
13s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  6m 
 6s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
28s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
46s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
18m 25s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.6.5 2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
28s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}105m 41s{color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
19s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}143m 49s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.TestZooKeeper |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:eee3b01 |
| JIRA Issue | HBASE-19863 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12908234/HBASE-19863-branch1.patch
 |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  shadedjars  
hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux a53b47ac1993 3.13.0-133-generic #182-Ubuntu SMP Tue Sep 19 
15:49:21 UTC 2017 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | master / 34c6c99041 |
| maven | version: Apache Maven 3.5.2 
(138edd61fd100ec658bfa2d307c43b76940a5d7d; 2017-10-18T07:58:13Z) |
| Default Java | 1.8.0_151 |
| unit | 
https://builds.apache.org/job/PreCommit-HBASE-Build/11240/artifact/patchprocess/patch-unit-hbase-server.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HBASE-Build/11240/testReport/ |
| modules | C: hbase-server U: hbase-server |
| Console output | 
https://builds.apache.org/job/PreCommit-HBASE-Build/11240/console |

[jira] [Commented] (HBASE-17852) Add Fault tolerance to HBASE-14417 (Support bulk loaded files in incremental backup)

2018-01-29 Thread Vladimir Rodionov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344425#comment-16344425
 ] 

Vladimir Rodionov commented on HBASE-17852:
---

Hmm, did I insult someone savagely, [~elserj]?  

> Add Fault tolerance to HBASE-14417 (Support bulk loaded files in incremental 
> backup)
> 
>
> Key: HBASE-17852
> URL: https://issues.apache.org/jira/browse/HBASE-17852
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HBASE-17852-v10.patch, screenshot-1.png
>
>
> The rollback-via-snapshot design approach implemented in this ticket:
> # Before a backup create/delete/merge starts, we take a snapshot of the 
> backup meta-table (backup system table). This procedure is lightweight 
> because the meta table is small; it usually fits in a single region.
> # When an operation fails on the server side, we handle the failure by 
> cleaning up partial data in the backup destination, followed by restoring the 
> backup meta-table from the snapshot.
> # When an operation fails on the client side (abnormal termination, for 
> example), the next time the user tries create/merge/delete they will see an 
> error message that the system is in an inconsistent state and repair is 
> required; they will need to run the backup repair tool.
> # To avoid multiple writers to the backup system table (the backup client and 
> BackupObservers), we introduce a small table ONLY to keep the listing of bulk 
> loaded files. All backup observers will work only with this new table. The 
> reason: in case of a failure during backup create/delete/merge/restore, when 
> the system performs automatic rollback, some data written by backup observers 
> during the failed operation may be lost. This is what we try to avoid.
> # The second table keeps only bulk load related references. We do not care 
> about the consistency of this table, because bulk load is an idempotent 
> operation and can be repeated after failure. Partially written data in the 
> second table does not affect the BackupHFileCleaner plugin, because this data 
> (the list of bulk loaded files) corresponds to files which have not yet been 
> loaded successfully and, hence, are not visible to the system.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Issue Comment Deleted] (HBASE-17852) Add Fault tolerance to HBASE-14417 (Support bulk loaded files in incremental backup)

2018-01-29 Thread Vladimir Rodionov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov updated HBASE-17852:
--
Comment: was deleted

(was: {quote}
that's precisely the reason why I can't trust you.
{quote}

You can start a discussion about trust and respect in the HBase community, and 
I assure you I have a lot to say about it.
)

> Add Fault tolerance to HBASE-14417 (Support bulk loaded files in incremental 
> backup)
> 
>
> Key: HBASE-17852
> URL: https://issues.apache.org/jira/browse/HBASE-17852
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HBASE-17852-v10.patch, screenshot-1.png
>
>
> The rollback-via-snapshot design approach implemented in this ticket:
> # Before a backup create/delete/merge starts, we take a snapshot of the 
> backup meta-table (backup system table). This procedure is lightweight 
> because the meta table is small; it usually fits in a single region.
> # When an operation fails on the server side, we handle the failure by 
> cleaning up partial data in the backup destination, followed by restoring the 
> backup meta-table from the snapshot.
> # When an operation fails on the client side (abnormal termination, for 
> example), the next time the user tries create/merge/delete they will see an 
> error message that the system is in an inconsistent state and repair is 
> required; they will need to run the backup repair tool.
> # To avoid multiple writers to the backup system table (the backup client and 
> BackupObservers), we introduce a small table ONLY to keep the listing of bulk 
> loaded files. All backup observers will work only with this new table. The 
> reason: in case of a failure during backup create/delete/merge/restore, when 
> the system performs automatic rollback, some data written by backup observers 
> during the failed operation may be lost. This is what we try to avoid.
> # The second table keeps only bulk load related references. We do not care 
> about the consistency of this table, because bulk load is an idempotent 
> operation and can be repeated after failure. Partially written data in the 
> second table does not affect the BackupHFileCleaner plugin, because this data 
> (the list of bulk loaded files) corresponds to files which have not yet been 
> loaded successfully and, hence, are not visible to the system.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-17852) Add Fault tolerance to HBASE-14417 (Support bulk loaded files in incremental backup)

2018-01-29 Thread Josh Elser (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344416#comment-16344416
 ] 

Josh Elser commented on HBASE-17852:


[~vrodionov] that's also plenty of shit-slinging from you too on the matter. 
Thanks.

> Add Fault tolerance to HBASE-14417 (Support bulk loaded files in incremental 
> backup)
> 
>
> Key: HBASE-17852
> URL: https://issues.apache.org/jira/browse/HBASE-17852
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HBASE-17852-v10.patch, screenshot-1.png
>
>
> The rollback-via-snapshot design approach implemented in this ticket:
> # Before a backup create/delete/merge starts, we take a snapshot of the 
> backup meta-table (backup system table). This procedure is lightweight 
> because the meta table is small; it usually fits in a single region.
> # When an operation fails on the server side, we handle the failure by 
> cleaning up partial data in the backup destination, followed by restoring the 
> backup meta-table from the snapshot.
> # When an operation fails on the client side (abnormal termination, for 
> example), the next time the user tries create/merge/delete they will see an 
> error message that the system is in an inconsistent state and repair is 
> required; they will need to run the backup repair tool.
> # To avoid multiple writers to the backup system table (the backup client and 
> BackupObservers), we introduce a small table ONLY to keep the listing of bulk 
> loaded files. All backup observers will work only with this new table. The 
> reason: in case of a failure during backup create/delete/merge/restore, when 
> the system performs automatic rollback, some data written by backup observers 
> during the failed operation may be lost. This is what we try to avoid.
> # The second table keeps only bulk load related references. We do not care 
> about the consistency of this table, because bulk load is an idempotent 
> operation and can be repeated after failure. Partially written data in the 
> second table does not affect the BackupHFileCleaner plugin, because this data 
> (the list of bulk loaded files) corresponds to files which have not yet been 
> loaded successfully and, hence, are not visible to the system.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-17852) Add Fault tolerance to HBASE-14417 (Support bulk loaded files in incremental backup)

2018-01-29 Thread Josh Elser (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344407#comment-16344407
 ] 

Josh Elser commented on HBASE-17852:


Also, again, if you want this reverted, please say so.

> Add Fault tolerance to HBASE-14417 (Support bulk loaded files in incremental 
> backup)
> 
>
> Key: HBASE-17852
> URL: https://issues.apache.org/jira/browse/HBASE-17852
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HBASE-17852-v10.patch, screenshot-1.png
>
>
> The rollback-via-snapshot design approach implemented in this ticket:
> # Before a backup create/delete/merge starts, we take a snapshot of the 
> backup meta-table (backup system table). This procedure is lightweight 
> because the meta table is small; it usually fits in a single region.
> # When an operation fails on the server side, we handle the failure by 
> cleaning up partial data in the backup destination, followed by restoring the 
> backup meta-table from the snapshot.
> # When an operation fails on the client side (abnormal termination, for 
> example), the next time the user tries create/merge/delete they will see an 
> error message that the system is in an inconsistent state and repair is 
> required; they will need to run the backup repair tool.
> # To avoid multiple writers to the backup system table (the backup client and 
> BackupObservers), we introduce a small table ONLY to keep the listing of bulk 
> loaded files. All backup observers will work only with this new table. The 
> reason: in case of a failure during backup create/delete/merge/restore, when 
> the system performs automatic rollback, some data written by backup observers 
> during the failed operation may be lost. This is what we try to avoid.
> # The second table keeps only bulk load related references. We do not care 
> about the consistency of this table, because bulk load is an idempotent 
> operation and can be repeated after failure. Partially written data in the 
> second table does not affect the BackupHFileCleaner plugin, because this data 
> (the list of bulk loaded files) corresponds to files which have not yet been 
> loaded successfully and, hence, are not visible to the system.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-17852) Add Fault tolerance to HBASE-14417 (Support bulk loaded files in incremental backup)

2018-01-29 Thread Josh Elser (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344404#comment-16344404
 ] 

Josh Elser commented on HBASE-17852:


bq. I did say it was okay to go in master, but that's like 4 days after the 
commit - 2018-01-16T12:46:19-0800

I'm not messing with you, [~appy]. Check the push logs/comments on the other 
JIRA issue. I swear to you that I did not push this until after I heard back 
from you. My guess is that this is due to me using git-am or cherry-picking a 
commit from a local branch.

> Add Fault tolerance to HBASE-14417 (Support bulk loaded files in incremental 
> backup)
> 
>
> Key: HBASE-17852
> URL: https://issues.apache.org/jira/browse/HBASE-17852
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HBASE-17852-v10.patch, screenshot-1.png
>
>
> The rollback-via-snapshot design approach implemented in this ticket:
> # Before a backup create/delete/merge starts, we take a snapshot of the 
> backup meta-table (backup system table). This procedure is lightweight 
> because the meta table is small; it usually fits in a single region.
> # When an operation fails on the server side, we handle the failure by 
> cleaning up partial data in the backup destination, followed by restoring the 
> backup meta-table from the snapshot.
> # When an operation fails on the client side (abnormal termination, for 
> example), the next time the user tries create/merge/delete they will see an 
> error message that the system is in an inconsistent state and repair is 
> required; they will need to run the backup repair tool.
> # To avoid multiple writers to the backup system table (the backup client and 
> BackupObservers), we introduce a small table ONLY to keep the listing of bulk 
> loaded files. All backup observers will work only with this new table. The 
> reason: in case of a failure during backup create/delete/merge/restore, when 
> the system performs automatic rollback, some data written by backup observers 
> during the failed operation may be lost. This is what we try to avoid.
> # The second table keeps only bulk load related references. We do not care 
> about the consistency of this table, because bulk load is an idempotent 
> operation and can be repeated after failure. Partially written data in the 
> second table does not affect the BackupHFileCleaner plugin, because this data 
> (the list of bulk loaded files) corresponds to files which have not yet been 
> loaded successfully and, hence, are not visible to the system.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-19803) False positive for the HBASE-Find-Flaky-Tests job

2018-01-29 Thread Appy (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344384#comment-16344384
 ] 

Appy commented on HBASE-19803:
--

Yeah, it's an infra issue. I'm not even able to access the site 
https://builds.apache.org/

> False positive for the HBASE-Find-Flaky-Tests job
> -
>
> Key: HBASE-19803
> URL: https://issues.apache.org/jira/browse/HBASE-19803
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Duo Zhang
>Priority: Major
> Attachments: 2018-01-24T17-45-37_000-jvmRun1.dumpstream, 
> HBASE-19803.master.001.patch
>
>
> It reports two hangs for TestAsyncTableGetMultiThreaded, but I checked the 
> surefire output
> https://builds.apache.org/job/HBASE-Flaky-Tests/24830/artifact/hbase-server/target/surefire-reports/org.apache.hadoop.hbase.client.TestAsyncTableGetMultiThreaded-output.txt
> This one was likely killed in the middle of the run, within 20 seconds.
> https://builds.apache.org/job/HBASE-Flaky-Tests/24852/artifact/hbase-server/target/surefire-reports/org.apache.hadoop.hbase.client.TestAsyncTableGetMultiThreaded-output.txt
> This one was also killed, within about 1 minute.
> The test is declared as LargeTests, so the time limit should be 10 minutes. It 
> seems the JVM may crash during the mvn test run; we then kill all the running 
> tests and may mark some of them as hung, which leads to the false positive.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-17852) Add Fault tolerance to HBASE-14417 (Support bulk loaded files in incremental backup)

2018-01-29 Thread Vladimir Rodionov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344380#comment-16344380
 ] 

Vladimir Rodionov commented on HBASE-17852:
---

{quote}
that's precisely the reason why I can't trust you.
{quote}

You can start a discussion about trust and respect in the HBase community, and 
I assure you I have a lot to say about it.


> Add Fault tolerance to HBASE-14417 (Support bulk loaded files in incremental 
> backup)
> 
>
> Key: HBASE-17852
> URL: https://issues.apache.org/jira/browse/HBASE-17852
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HBASE-17852-v10.patch, screenshot-1.png
>
>
> Design approach: rollback-via-snapshot, implemented in this ticket:
> # Before a backup create/delete/merge starts, we take a snapshot of the backup 
> meta-table (backup system table). This procedure is lightweight because the 
> meta-table is small and usually fits in a single region.
> # When an operation fails on the server side, we handle the failure by cleaning 
> up partial data in the backup destination and then restoring the backup 
> meta-table from the snapshot.
> # When an operation fails on the client side (abnormal termination, for 
> example), the next time the user tries a create/merge/delete they will see an 
> error message that the system is in an inconsistent state and repair is 
> required; they will need to run the backup repair tool.
> # To avoid multiple writers to the backup system table (the backup client and 
> BackupObservers), we introduce a small table ONLY to keep the listing of bulk 
> loaded files. All backup observers work only with this new table. The reason: 
> in case of a failure during backup create/delete/merge/restore, when the 
> system performs an automatic rollback, some data written by backup observers 
> during the failed operation could be lost. This is what we try to avoid.
> # The second table keeps only bulk-load-related references. We do not care 
> about the consistency of this table, because bulk load is an idempotent 
> operation and can be repeated after a failure. Partially written data in the 
> second table does not affect the BackupHFileCleaner plugin, because this data 
> (the list of bulk loaded files) corresponds to files which have not yet been 
> loaded successfully and, hence, are not visible to the system.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-19803) False positive for the HBASE-Find-Flaky-Tests job

2018-01-29 Thread Duo Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344372#comment-16344372
 ] 

Duo Zhang commented on HBASE-19803:
---

Seems the infra sucks... I added the Hadoop label but the two new builds still 
failed due to disconnects from the build machine...

> False positive for the HBASE-Find-Flaky-Tests job
> -
>
> Key: HBASE-19803
> URL: https://issues.apache.org/jira/browse/HBASE-19803
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Duo Zhang
>Priority: Major
> Attachments: 2018-01-24T17-45-37_000-jvmRun1.dumpstream, 
> HBASE-19803.master.001.patch
>
>
> It reports two hangs for TestAsyncTableGetMultiThreaded, but I checked the 
> surefire output
> https://builds.apache.org/job/HBASE-Flaky-Tests/24830/artifact/hbase-server/target/surefire-reports/org.apache.hadoop.hbase.client.TestAsyncTableGetMultiThreaded-output.txt
> This one was likely killed in the middle of the run, within 20 seconds.
> https://builds.apache.org/job/HBASE-Flaky-Tests/24852/artifact/hbase-server/target/surefire-reports/org.apache.hadoop.hbase.client.TestAsyncTableGetMultiThreaded-output.txt
> This one was also killed, within about 1 minute.
> The test is declared as LargeTests, so the time limit should be 10 minutes. It 
> seems the JVM may crash during the mvn test run; we then kill all the running 
> tests and may mark some of them as hung, which leads to the false positive.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-19876) The exception happening in converting pb mutation to hbase.mutation messes up the CellScanner

2018-01-29 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344363#comment-16344363
 ] 

Anoop Sam John commented on HBASE-19876:


An Append or Increment Mutation may have N Cells in it (for the same row). Even 
if one cell is malformed, we have to fail the whole op. Yeah, that makes sense. 
Need to verify once though. Thanks Chia.

> The exception happening in converting pb mutation to hbase.mutation messes up 
> the CellScanner
> -
>
> Key: HBASE-19876
> URL: https://issues.apache.org/jira/browse/HBASE-19876
> Project: HBase
>  Issue Type: Bug
>Reporter: Chia-Ping Tsai
>Assignee: Chia-Ping Tsai
>Priority: Critical
> Fix For: 1.3.2, 1.5.0, 1.2.7, 2.0.0-beta-2, 1.4.2
>
>
> {code:java}
> 2018-01-27 22:51:43,794 INFO  [hconnection-0x3291b443-shared-pool11-t6] 
> client.AsyncRequestFutureImpl(778): id=5, table=testQuotaStatusFromMaster3, 
> attempt=6/16 failed=20ops, last 
> exception=org.apache.hadoop.hbase.client.WrongRowIOException: 
> org.apache.hadoop.hbase.client.WrongRowIOException: The row in xxx doesn't 
> match the original one aaa
>   at org.apache.hadoop.hbase.client.Mutation.add(Mutation.java:776)
>   at org.apache.hadoop.hbase.client.Put.add(Put.java:282)
>   at 
> org.apache.hadoop.hbase.shaded.protobuf.ProtobufUtil.toPut(ProtobufUtil.java:642)
>   at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.doBatchOp(RSRpcServices.java:952)
>   at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.doNonAtomicRegionMutation(RSRpcServices.java:896)
>   at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.multi(RSRpcServices.java:2591)
>   at 
> org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:41560)
>   at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:404)
>   at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:130)
>   at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)
>   at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304){code}
> I noticed this bug when testing the table space quota.
> When the RS converts a pb mutation to an hbase.mutation, a quota exception or 
> cell exception may be thrown.
> {code}
> for (ClientProtos.Action action : mutations) {
>   MutationProto m = action.getMutation();
>   Mutation mutation;
>   if (m.getMutateType() == MutationType.PUT) {
>     mutation = ProtobufUtil.toPut(m, cells);
>     batchContainsPuts = true;
>   } else {
>     mutation = ProtobufUtil.toDelete(m, cells);
>     batchContainsDelete = true;
>   }
>   mutationActionMap.put(mutation, action);
>   mArray[i++] = mutation;
>   checkCellSizeLimit(region, mutation);
>   // Check if a space quota disallows this mutation
>   spaceQuotaEnforcement.getPolicyEnforcement(region).check(mutation);
>   quota.addMutation(mutation);
> }
> {code}
> The RS catches the exception, but it does not have the CellScanner skip the 
> cells of the failed mutation.
> {code:java}
> } catch (IOException ie) {
>   if (atomic) {
> throw ie;
>   }
>   for (Action mutation : mutations) {
> builder.addResultOrException(getResultOrException(ie, 
> mutation.getIndex()));
>   }
> }
> {code}
> The bug results in WrongRowIOException for the remaining mutations, since 
> they end up referring to the wrong cells.
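To make the failure mode concrete, here is a rough sketch of the direction 
discussed here (not the committed patch): when converting one pb mutation 
fails, consume the cells that belonged to it so the shared CellScanner stays 
aligned for the remaining actions. The class and method names are illustrative 
only:

{code:java}
import java.io.IOException;
import java.util.List;
import org.apache.hadoop.hbase.CellScanner;
import org.apache.hadoop.hbase.client.Mutation;
import org.apache.hadoop.hbase.shaded.protobuf.ProtobufUtil;
import org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos;
import org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos.MutationProto;
import org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos.MutationProto.MutationType;

public class SkipFailedCellsSketch {
  static void convertBatch(List<ClientProtos.Action> actions, CellScanner cells,
      boolean atomic) throws IOException {
    for (ClientProtos.Action action : actions) {
      MutationProto m = action.getMutation();
      try {
        Mutation mutation = (m.getMutateType() == MutationType.PUT)
            ? ProtobufUtil.toPut(m, cells)
            : ProtobufUtil.toDelete(m, cells);
        // ... apply the mutation and record a success for this action ...
      } catch (IOException e) {
        if (atomic) {
          throw e;
        }
        // Keep the shared scanner aligned: skip the cells of the failed
        // mutation so the next action reads its own cells. (A real fix must
        // also account for cells already consumed before the exception.)
        for (int i = 0; i < m.getAssociatedCellCount(); i++) {
          if (!cells.advance()) {
            break;
          }
        }
        // ... record a per-action ResultOrException instead of failing the batch ...
      }
    }
  }
}
{code}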



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-17852) Add Fault tolerance to HBASE-14417 (Support bulk loaded files in incremental backup)

2018-01-29 Thread Appy (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344359#comment-16344359
 ] 

Appy commented on HBASE-17852:
--

If your mental radar doesn't go off in loud red alarms between the time of 
choosing the second approach and getting someone to commit it, that's precisely 
the reason why I can't trust you.


> Add Fault tolerance to HBASE-14417 (Support bulk loaded files in incremental 
> backup)
> 
>
> Key: HBASE-17852
> URL: https://issues.apache.org/jira/browse/HBASE-17852
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HBASE-17852-v10.patch, screenshot-1.png
>
>
> Design approach: rollback-via-snapshot, implemented in this ticket:
> # Before a backup create/delete/merge starts, we take a snapshot of the backup 
> meta-table (backup system table). This procedure is lightweight because the 
> meta-table is small and usually fits in a single region.
> # When an operation fails on the server side, we handle the failure by cleaning 
> up partial data in the backup destination and then restoring the backup 
> meta-table from the snapshot.
> # When an operation fails on the client side (abnormal termination, for 
> example), the next time the user tries a create/merge/delete they will see an 
> error message that the system is in an inconsistent state and repair is 
> required; they will need to run the backup repair tool.
> # To avoid multiple writers to the backup system table (the backup client and 
> BackupObservers), we introduce a small table ONLY to keep the listing of bulk 
> loaded files. All backup observers work only with this new table. The reason: 
> in case of a failure during backup create/delete/merge/restore, when the 
> system performs an automatic rollback, some data written by backup observers 
> during the failed operation could be lost. This is what we try to avoid.
> # The second table keeps only bulk-load-related references. We do not care 
> about the consistency of this table, because bulk load is an idempotent 
> operation and can be repeated after a failure. Partially written data in the 
> second table does not affect the BackupHFileCleaner plugin, because this data 
> (the list of bulk loaded files) corresponds to files which have not yet been 
> loaded successfully and, hence, are not visible to the system.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-19803) False positive for the HBASE-Find-Flaky-Tests job

2018-01-29 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344357#comment-16344357
 ] 

stack commented on HBASE-19803:
---

{quote}[~appy] The flakey test finder job is hung?
{quote}
I was wondering why it wasn't moving today...

> False positive for the HBASE-Find-Flaky-Tests job
> -
>
> Key: HBASE-19803
> URL: https://issues.apache.org/jira/browse/HBASE-19803
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Duo Zhang
>Priority: Major
> Attachments: 2018-01-24T17-45-37_000-jvmRun1.dumpstream, 
> HBASE-19803.master.001.patch
>
>
> It reports two hangs for TestAsyncTableGetMultiThreaded, but I checked the 
> surefire output
> https://builds.apache.org/job/HBASE-Flaky-Tests/24830/artifact/hbase-server/target/surefire-reports/org.apache.hadoop.hbase.client.TestAsyncTableGetMultiThreaded-output.txt
> This one was likely killed in the middle of the run, within 20 seconds.
> https://builds.apache.org/job/HBASE-Flaky-Tests/24852/artifact/hbase-server/target/surefire-reports/org.apache.hadoop.hbase.client.TestAsyncTableGetMultiThreaded-output.txt
> This one was also killed, within about 1 minute.
> The test is declared as LargeTests, so the time limit should be 10 minutes. It 
> seems the JVM may crash during the mvn test run; we then kill all the running 
> tests and may mark some of them as hung, which leads to the false positive.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-19840) Flakey TestMetaWithReplicas

2018-01-29 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344342#comment-16344342
 ] 

Hudson commented on HBASE-19840:


FAILURE: Integrated in Jenkins build HBase-Trunk_matrix #4492 (See 
[https://builds.apache.org/job/HBase-Trunk_matrix/4492/])
HBASE-19840 Flakey TestMetaWithReplicas (stack: rev 
4f547b3817e01a1f98c965a502775de481e6ca96)
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestMetaWithReplicas.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/AssignmentManager.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterMetaBootstrap.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java
* (edit) hbase-common/src/main/java/org/apache/hadoop/hbase/util/HasThread.java
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestMasterNoCluster.java
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
HBASE-19840 Flakey TestMetaWithReplicas; ADDENDUM to fix Checkstyle (stack: rev 
0b9a0dc9519d511908efd28caf2cf010e3a1ff79)
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterMetaBootstrap.java


> Flakey TestMetaWithReplicas
> ---
>
> Key: HBASE-19840
> URL: https://issues.apache.org/jira/browse/HBASE-19840
> Project: HBase
>  Issue Type: Sub-task
>  Components: flakey, test
>Reporter: stack
>Assignee: stack
>Priority: Major
> Fix For: 2.0.0-beta-2
>
> Attachments: HBASE-19840.master.001.patch, 
> HBASE-19840.master.001.patch
>
>
> Failing about 15% of the time, in testShutdownHandling: 
> [https://builds.apache.org/view/H-L/view/HBase/job/HBase-Find-Flaky-Tests-branch2.0/lastSuccessfulBuild/artifact/dashboard.html]
>  
> Adding some debug. It's hard to follow what is going on in this test.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-17852) Add Fault tolerance to HBASE-14417 (Support bulk loaded files in incremental backup)

2018-01-29 Thread Vladimir Rodionov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344341#comment-16344341
 ] 

Vladimir Rodionov commented on HBASE-17852:
---

{quote}
I did say it was okay to go in master, but that's like 4 days after the commit 
- 2018-01-16T12:46:19-0800
{quote}

OK, there was an issue found during QA testing - HBASE-19568. It turned out 
that HBASE-17852 fixes that issue. Let's say I had two options:
 # Find out which part of HBASE-17852 fixes the issue and create a smaller, 
HBASE-19568-specific patch
 # Apply the HBASE-17852 patch directly (with some of the refactoring stripped 
out)

So I chose the latter. Reason: time, time, time. We can revert HBASE-19568 if 
there are that many objections.


> Add Fault tolerance to HBASE-14417 (Support bulk loaded files in incremental 
> backup)
> 
>
> Key: HBASE-17852
> URL: https://issues.apache.org/jira/browse/HBASE-17852
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HBASE-17852-v10.patch, screenshot-1.png
>
>
> Design approach: rollback-via-snapshot, implemented in this ticket:
> # Before a backup create/delete/merge starts, we take a snapshot of the backup 
> meta-table (backup system table). This procedure is lightweight because the 
> meta-table is small and usually fits in a single region.
> # When an operation fails on the server side, we handle the failure by cleaning 
> up partial data in the backup destination and then restoring the backup 
> meta-table from the snapshot.
> # When an operation fails on the client side (abnormal termination, for 
> example), the next time the user tries a create/merge/delete they will see an 
> error message that the system is in an inconsistent state and repair is 
> required; they will need to run the backup repair tool.
> # To avoid multiple writers to the backup system table (the backup client and 
> BackupObservers), we introduce a small table ONLY to keep the listing of bulk 
> loaded files. All backup observers work only with this new table. The reason: 
> in case of a failure during backup create/delete/merge/restore, when the 
> system performs an automatic rollback, some data written by backup observers 
> during the failed operation could be lost. This is what we try to avoid.
> # The second table keeps only bulk-load-related references. We do not care 
> about the consistency of this table, because bulk load is an idempotent 
> operation and can be repeated after a failure. Partially written data in the 
> second table does not affect the BackupHFileCleaner plugin, because this data 
> (the list of bulk loaded files) corresponds to files which have not yet been 
> loaded successfully and, hence, are not visible to the system.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (HBASE-17852) Add Fault tolerance to HBASE-14417 (Support bulk loaded files in incremental backup)

2018-01-29 Thread Appy (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344327#comment-16344327
 ] 

Appy edited comment on HBASE-17852 at 1/30/18 1:13 AM:
---

Commit date is Jan 12th
{noformat}
commit a5601c8eac6bfcac7d869574547f505d44e49065
Author: Vladimir Rodionov 
AuthorDate: Wed Jan 10 16:26:09 2018 -0800
Commit: Josh Elser 
CommitDate: Fri Jan 12 13:13:17 2018 -0500

HBASE-19568: Restore of HBase table using incremental backup doesn't 
restore rows from an earlier incremental backup

Signed-off-by: Josh Elser 
{noformat}

I did say it was okay to go in master, but that's like 4 days after the commit 
- 2018-01-16T12:46:19-0800

{color:red}Edit{color}
Btw, for anyone wishing to cross-check the diffs:
Diff on this jira that wasn't approved (until the 16th): 
https://reviews.apache.org/r/63155/diff/5/
Diff on HBASE-19568 which was committed on 12th: 
https://issues.apache.org/jira/secure/attachment/12905579/HBASE-19568-v4.patch


was (Author: appy):
Commit date is Jan 12th
{noformat}
commit a5601c8eac6bfcac7d869574547f505d44e49065
Author: Vladimir Rodionov 
AuthorDate: Wed Jan 10 16:26:09 2018 -0800
Commit: Josh Elser 
CommitDate: Fri Jan 12 13:13:17 2018 -0500

HBASE-19568: Restore of HBase table using incremental backup doesn't 
restore rows from an earlier incremental backup

Signed-off-by: Josh Elser 
{noformat}

I did say it was okay to go in master, but that's like 4 days after the commit 
- 2018-01-16T12:46:19-0800

> Add Fault tolerance to HBASE-14417 (Support bulk loaded files in incremental 
> backup)
> 
>
> Key: HBASE-17852
> URL: https://issues.apache.org/jira/browse/HBASE-17852
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HBASE-17852-v10.patch, screenshot-1.png
>
>
> Design approach: rollback-via-snapshot, implemented in this ticket:
> # Before a backup create/delete/merge starts, we take a snapshot of the backup 
> meta-table (backup system table). This procedure is lightweight because the 
> meta-table is small and usually fits in a single region.
> # When an operation fails on the server side, we handle the failure by cleaning 
> up partial data in the backup destination and then restoring the backup 
> meta-table from the snapshot.
> # When an operation fails on the client side (abnormal termination, for 
> example), the next time the user tries a create/merge/delete they will see an 
> error message that the system is in an inconsistent state and repair is 
> required; they will need to run the backup repair tool.
> # To avoid multiple writers to the backup system table (the backup client and 
> BackupObservers), we introduce a small table ONLY to keep the listing of bulk 
> loaded files. All backup observers work only with this new table. The reason: 
> in case of a failure during backup create/delete/merge/restore, when the 
> system performs an automatic rollback, some data written by backup observers 
> during the failed operation could be lost. This is what we try to avoid.
> # The second table keeps only bulk-load-related references. We do not care 
> about the consistency of this table, because bulk load is an idempotent 
> operation and can be repeated after a failure. Partially written data in the 
> second table does not affect the BackupHFileCleaner plugin, because this data 
> (the list of bulk loaded files) corresponds to files which have not yet been 
> loaded successfully and, hence, are not visible to the system.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Reopened] (HBASE-19872) hbase1.3.1 regionserver Crash (bucketcache)

2018-01-29 Thread Duo Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang reopened HBASE-19872:
---

> hbase1.3.1 regionserver Crash (bucketcache)
> --
>
> Key: HBASE-19872
> URL: https://issues.apache.org/jira/browse/HBASE-19872
> Project: HBase
>  Issue Type: Bug
>  Components: BucketCache
>Affects Versions: 1.3.1
>Reporter: gehaijiang
>Priority: Major
>
> An HBase 1.3.1 regionserver crashes when bucketcache is configured.
> Error log:
> FATAL [RpcServer.FifoWFPBQ.default.handler=42,queue=2,port=16020] 
> regionserver.RSRpcServices: Run out of memory; RSRpcServices will abort 
> itself immediately
>  
> hbase-env.sh:
> export HBASE_REGIONSERVER_OPTS="-XX:+UseG1GC -XX:+UnlockExperimentalVMOptions 
> -Xmx24g -Xms24g -XX:MetaspaceSize=256M -XX:MaxMetaspaceSize=512M 
> -XX:MaxGCPauseMillis=100 -XX:G1NewSizePercent=5 -XX:ConcGCThreads=4 
> -XX:ParallelGCThreads=16 -XX:-ResizePLAB -XX:+ParallelRefProcEnabled 
> -XX:InitiatingHeapOccupancyPercent=65 -XX:G1HeapRegionSize=32M 
> -XX:G1MixedGCCountTarget=64 -XX:G1OldCSetRegionThresholdPercent=5 
> -XX:MaxTenuringThreshold=1 -XX:MaxDirectMemorySize=28g 
> -XX:ReservedCodeCacheSize=512M -XX:+DisableExplicitGC 
> -Xloggc:${HBASE_LOG_DIR}/regionserver.gc.log"
>  
> hbase-site.xml:
> <property>
>   <name>hbase.bucketcache.combinedcache.enabled</name>
>   <value>true</value>
> </property>
> <property>
>   <name>hbase.bucketcache.ioengine</name>
>   <value>offheap</value>
> </property>
> <property>
>   <name>hbase.bucketcache.size</name>
>   <value>25600</value>
> </property>
> <property>
>   <name>hbase.bucketcache.writer.queuelength</name>
>   <value>64</value>
> </property>
> <property>
>   <name>hbase.bucketcache.writer.threads</name>
>   <value>3</value>
> </property>
> <property>
>   <name>hfile.block.cache.size</name>
>   <value>0.3</value>
> </property>
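One plausible reading of the numbers above (an inference from the 
configuration, not a statement from the thread): the off-heap bucket cache is 
sized at 25600 MB, i.e. 25 GB, while -XX:MaxDirectMemorySize is 28g, leaving 
only about 3 GB of direct memory for everything else that allocates off-heap 
(RPC buffers, the DFS client, and so on). Under sustained load that remainder 
can be exhausted, which would produce exactly the "Run out of memory; 
RSRpcServices will abort itself immediately" FATAL quoted above and is 
consistent with the issue being resolved as 'Not A Problem'.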



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (HBASE-19872) hbase1.3.1 regionserver Crash (bucketcache)

2018-01-29 Thread Duo Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang resolved HBASE-19872.
---
Resolution: Not A Problem

> hbase1.3.1 regionserver Crash (bucketcache)
> --
>
> Key: HBASE-19872
> URL: https://issues.apache.org/jira/browse/HBASE-19872
> Project: HBase
>  Issue Type: Bug
>  Components: BucketCache
>Affects Versions: 1.3.1
>Reporter: gehaijiang
>Priority: Major
>
> An HBase 1.3.1 regionserver crashes when bucketcache is configured.
> Error log:
> FATAL [RpcServer.FifoWFPBQ.default.handler=42,queue=2,port=16020] 
> regionserver.RSRpcServices: Run out of memory; RSRpcServices will abort 
> itself immediately
>  
> hbase-env.sh:
> export HBASE_REGIONSERVER_OPTS="-XX:+UseG1GC -XX:+UnlockExperimentalVMOptions 
> -Xmx24g -Xms24g -XX:MetaspaceSize=256M -XX:MaxMetaspaceSize=512M 
> -XX:MaxGCPauseMillis=100 -XX:G1NewSizePercent=5 -XX:ConcGCThreads=4 
> -XX:ParallelGCThreads=16 -XX:-ResizePLAB -XX:+ParallelRefProcEnabled 
> -XX:InitiatingHeapOccupancyPercent=65 -XX:G1HeapRegionSize=32M 
> -XX:G1MixedGCCountTarget=64 -XX:G1OldCSetRegionThresholdPercent=5 
> -XX:MaxTenuringThreshold=1 -XX:MaxDirectMemorySize=28g 
> -XX:ReservedCodeCacheSize=512M -XX:+DisableExplicitGC 
> -Xloggc:${HBASE_LOG_DIR}/regionserver.gc.log"
>  
> hbase-site.xml:
> <property>
>   <name>hbase.bucketcache.combinedcache.enabled</name>
>   <value>true</value>
> </property>
> <property>
>   <name>hbase.bucketcache.ioengine</name>
>   <value>offheap</value>
> </property>
> <property>
>   <name>hbase.bucketcache.size</name>
>   <value>25600</value>
> </property>
> <property>
>   <name>hbase.bucketcache.writer.queuelength</name>
>   <value>64</value>
> </property>
> <property>
>   <name>hbase.bucketcache.writer.threads</name>
>   <value>3</value>
> </property>
> <property>
>   <name>hfile.block.cache.size</name>
>   <value>0.3</value>
> </property>



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Reopened] (HBASE-19874) Regionserver handle is not deleted(bucketcache)

2018-01-29 Thread Duo Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang reopened HBASE-19874:
---

> Regionserver handle is not deleted(bucketcache)  
> -
>
> Key: HBASE-19874
> URL: https://issues.apache.org/jira/browse/HBASE-19874
> Project: HBase
>  Issue Type: Bug
>  Components: BucketCache
>Affects Versions: 1.3.1
>Reporter: gehaijiang
>Priority: Major
>
> HBase is configured with bucketcache. In our production environment, handles 
> to deleted files are not released. The number of deleted-but-still-open files 
> has reached several hundred and keeps growing, and memory keeps growing as 
> well.
> $ ll|grep delete
> lr-x-- 1 data data 64 Jan 28 14:28 1048 -> 
> /block4/hadoop/dfs/data/current/BP-1101579887-10.50.64.23-1497104043858/current/finalized/subdir247/subdir216/blk_3036141819
>  (deleted)
> lr-x-- 1 data data 64 Jan 28 14:28 1050 -> 
> /block4/hadoop/dfs/data/current/BP-1101579887-10.50.64.23-1497104043858/current/finalized/subdir247/subdir216/blk_3036141819_1962457009.meta
>  (deleted)
> lr-x-- 1 data data 64 Jan 28 14:28 1078 -> 
> /block5/hadoop/dfs/data/current/BP-1101579887-10.50.64.23-1497104043858/current/finalized/subdir247/subdir62/blk_3036102314
>  (deleted)
> lr-x-- 1 data data 64 Jan 28 14:28 1079 -> 
> /block7/hadoop/dfs/data/current/BP-1101579887-10.50.64.23-1497104043858/current/finalized/subdir247/subdir95/blk_3036110832
>  (deleted)
> lr-x-- 1 data data 64 Jan 28 14:28 1091 -> 
> /block3/hadoop/dfs/data/current/BP-1101579887-10.50.64.23-1497104043858/current/finalized/subdir245/subdir53/blk_3035968976_1962284166.meta
>  (deleted)
> lr-x-- 1 data data 64 Jan 28 14:28 1092 -> 
> /block9/hadoop/dfs/data/current/BP-1101579887-10.50.64.23-1497104043858/current/finalized/subdir247/subdir97/blk_3036111332
>  (deleted)
> lr-x-- 1 data data 64 Jan 28 14:28 1093 -> 
> /block9/hadoop/dfs/data/current/BP-1101579887-10.50.64.23-1497104043858/current/finalized/subdir247/subdir97/blk_3036111332_1962426522.meta
>  (deleted)
> lr-x-- 1 data data 64 Jan 28 14:28 1096 -> 
> /block5/hadoop/dfs/data/current/BP-1101579887-10.50.64.23-1497104043858/current/finalized/subdir247/subdir62/blk_3036102314_1962417504.meta
>  (deleted)
> lr-x-- 1 data data 64 Jan 28 14:28 1100 -> 
> /block7/hadoop/dfs/data/current/BP-1101579887-10.50.64.23-1497104043858/current/finalized/subdir247/subdir95/blk_3036110832_1962426022.meta
>  (deleted)
> lr-x-- 1 data data 64 Jan 28 14:28 1101 -> 
> /block10/hadoop/dfs/data/current/BP-1101579887-10.50.64.23-1497104043858/current/finalized/subdir245/subdir13/blk_3035958550
>  (deleted)
> lr-x-- 1 data data 64 Jan 28 14:28 1102 -> 
> /block10/hadoop/dfs/data/current/BP-1101579887-10.50.64.23-1497104043858/current/finalized/subdir245/subdir13/blk_3035958550_1962273740.meta
>  (deleted)
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (HBASE-19874) Regionserver handle is not deleted(bucketcache)

2018-01-29 Thread Duo Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang resolved HBASE-19874.
---
  Resolution: Not A Problem
Hadoop Flags:   (was: Reviewed)

> Regionserver handle is not deleted(bucketcache)  
> -
>
> Key: HBASE-19874
> URL: https://issues.apache.org/jira/browse/HBASE-19874
> Project: HBase
>  Issue Type: Bug
>  Components: BucketCache
>Affects Versions: 1.3.1
>Reporter: gehaijiang
>Priority: Major
>
> HBase is configured with bucketcache. In our production environment, handles 
> to deleted files are not released. The number of deleted-but-still-open files 
> has reached several hundred and keeps growing, and memory keeps growing as 
> well.
> $ ll|grep delete
> lr-x-- 1 data data 64 Jan 28 14:28 1048 -> 
> /block4/hadoop/dfs/data/current/BP-1101579887-10.50.64.23-1497104043858/current/finalized/subdir247/subdir216/blk_3036141819
>  (deleted)
> lr-x-- 1 data data 64 Jan 28 14:28 1050 -> 
> /block4/hadoop/dfs/data/current/BP-1101579887-10.50.64.23-1497104043858/current/finalized/subdir247/subdir216/blk_3036141819_1962457009.meta
>  (deleted)
> lr-x-- 1 data data 64 Jan 28 14:28 1078 -> 
> /block5/hadoop/dfs/data/current/BP-1101579887-10.50.64.23-1497104043858/current/finalized/subdir247/subdir62/blk_3036102314
>  (deleted)
> lr-x-- 1 data data 64 Jan 28 14:28 1079 -> 
> /block7/hadoop/dfs/data/current/BP-1101579887-10.50.64.23-1497104043858/current/finalized/subdir247/subdir95/blk_3036110832
>  (deleted)
> lr-x-- 1 data data 64 Jan 28 14:28 1091 -> 
> /block3/hadoop/dfs/data/current/BP-1101579887-10.50.64.23-1497104043858/current/finalized/subdir245/subdir53/blk_3035968976_1962284166.meta
>  (deleted)
> lr-x-- 1 data data 64 Jan 28 14:28 1092 -> 
> /block9/hadoop/dfs/data/current/BP-1101579887-10.50.64.23-1497104043858/current/finalized/subdir247/subdir97/blk_3036111332
>  (deleted)
> lr-x-- 1 data data 64 Jan 28 14:28 1093 -> 
> /block9/hadoop/dfs/data/current/BP-1101579887-10.50.64.23-1497104043858/current/finalized/subdir247/subdir97/blk_3036111332_1962426522.meta
>  (deleted)
> lr-x-- 1 data data 64 Jan 28 14:28 1096 -> 
> /block5/hadoop/dfs/data/current/BP-1101579887-10.50.64.23-1497104043858/current/finalized/subdir247/subdir62/blk_3036102314_1962417504.meta
>  (deleted)
> lr-x-- 1 data data 64 Jan 28 14:28 1100 -> 
> /block7/hadoop/dfs/data/current/BP-1101579887-10.50.64.23-1497104043858/current/finalized/subdir247/subdir95/blk_3036110832_1962426022.meta
>  (deleted)
> lr-x-- 1 data data 64 Jan 28 14:28 1101 -> 
> /block10/hadoop/dfs/data/current/BP-1101579887-10.50.64.23-1497104043858/current/finalized/subdir245/subdir13/blk_3035958550
>  (deleted)
> lr-x-- 1 data data 64 Jan 28 14:28 1102 -> 
> /block10/hadoop/dfs/data/current/BP-1101579887-10.50.64.23-1497104043858/current/finalized/subdir245/subdir13/blk_3035958550_1962273740.meta
>  (deleted)
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (HBASE-19874) Regionserver handle is not deleted(bucketcache)

2018-01-29 Thread gehaijiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

gehaijiang resolved HBASE-19874.

  Resolution: Done
Hadoop Flags: Reviewed
Release Note: hdfs problem

> Regionserver handle is not deleted(bucketcache)  
> -
>
> Key: HBASE-19874
> URL: https://issues.apache.org/jira/browse/HBASE-19874
> Project: HBase
>  Issue Type: Bug
>  Components: BucketCache
>Affects Versions: 1.3.1
>Reporter: gehaijiang
>Priority: Major
>
> HBase is configured with bucketcache. In our production environment, handles 
> to deleted files are not released. The number of deleted-but-still-open files 
> has reached several hundred and keeps growing, and memory keeps growing as 
> well.
> $ ll|grep delete
> lr-x-- 1 data data 64 Jan 28 14:28 1048 -> 
> /block4/hadoop/dfs/data/current/BP-1101579887-10.50.64.23-1497104043858/current/finalized/subdir247/subdir216/blk_3036141819
>  (deleted)
> lr-x-- 1 data data 64 Jan 28 14:28 1050 -> 
> /block4/hadoop/dfs/data/current/BP-1101579887-10.50.64.23-1497104043858/current/finalized/subdir247/subdir216/blk_3036141819_1962457009.meta
>  (deleted)
> lr-x-- 1 data data 64 Jan 28 14:28 1078 -> 
> /block5/hadoop/dfs/data/current/BP-1101579887-10.50.64.23-1497104043858/current/finalized/subdir247/subdir62/blk_3036102314
>  (deleted)
> lr-x-- 1 data data 64 Jan 28 14:28 1079 -> 
> /block7/hadoop/dfs/data/current/BP-1101579887-10.50.64.23-1497104043858/current/finalized/subdir247/subdir95/blk_3036110832
>  (deleted)
> lr-x-- 1 data data 64 Jan 28 14:28 1091 -> 
> /block3/hadoop/dfs/data/current/BP-1101579887-10.50.64.23-1497104043858/current/finalized/subdir245/subdir53/blk_3035968976_1962284166.meta
>  (deleted)
> lr-x-- 1 data data 64 Jan 28 14:28 1092 -> 
> /block9/hadoop/dfs/data/current/BP-1101579887-10.50.64.23-1497104043858/current/finalized/subdir247/subdir97/blk_3036111332
>  (deleted)
> lr-x-- 1 data data 64 Jan 28 14:28 1093 -> 
> /block9/hadoop/dfs/data/current/BP-1101579887-10.50.64.23-1497104043858/current/finalized/subdir247/subdir97/blk_3036111332_1962426522.meta
>  (deleted)
> lr-x-- 1 data data 64 Jan 28 14:28 1096 -> 
> /block5/hadoop/dfs/data/current/BP-1101579887-10.50.64.23-1497104043858/current/finalized/subdir247/subdir62/blk_3036102314_1962417504.meta
>  (deleted)
> lr-x-- 1 data data 64 Jan 28 14:28 1100 -> 
> /block7/hadoop/dfs/data/current/BP-1101579887-10.50.64.23-1497104043858/current/finalized/subdir247/subdir95/blk_3036110832_1962426022.meta
>  (deleted)
> lr-x-- 1 data data 64 Jan 28 14:28 1101 -> 
> /block10/hadoop/dfs/data/current/BP-1101579887-10.50.64.23-1497104043858/current/finalized/subdir245/subdir13/blk_3035958550
>  (deleted)
> lr-x-- 1 data data 64 Jan 28 14:28 1102 -> 
> /block10/hadoop/dfs/data/current/BP-1101579887-10.50.64.23-1497104043858/current/finalized/subdir245/subdir13/blk_3035958550_1962273740.meta
>  (deleted)
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-17852) Add Fault tolerance to HBASE-14417 (Support bulk loaded files in incremental backup)

2018-01-29 Thread Appy (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344327#comment-16344327
 ] 

Appy commented on HBASE-17852:
--

Commit date is Jan 12th
{noformat}
commit a5601c8eac6bfcac7d869574547f505d44e49065
Author: Vladimir Rodionov 
AuthorDate: Wed Jan 10 16:26:09 2018 -0800
Commit: Josh Elser 
CommitDate: Fri Jan 12 13:13:17 2018 -0500

HBASE-19568: Restore of HBase table using incremental backup doesn't 
restore rows from an earlier incremental backup

Signed-off-by: Josh Elser 
{noformat}

I did say it was okay to go in master, but that's like 4 days after the commit 
- 2018-01-16T12:46:19-0800

> Add Fault tolerance to HBASE-14417 (Support bulk loaded files in incremental 
> backup)
> 
>
> Key: HBASE-17852
> URL: https://issues.apache.org/jira/browse/HBASE-17852
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HBASE-17852-v10.patch, screenshot-1.png
>
>
> Design approach: rollback-via-snapshot, implemented in this ticket:
> # Before a backup create/delete/merge starts, we take a snapshot of the backup 
> meta-table (backup system table). This procedure is lightweight because the 
> meta-table is small and usually fits in a single region.
> # When an operation fails on the server side, we handle the failure by cleaning 
> up partial data in the backup destination and then restoring the backup 
> meta-table from the snapshot.
> # When an operation fails on the client side (abnormal termination, for 
> example), the next time the user tries a create/merge/delete they will see an 
> error message that the system is in an inconsistent state and repair is 
> required; they will need to run the backup repair tool.
> # To avoid multiple writers to the backup system table (the backup client and 
> BackupObservers), we introduce a small table ONLY to keep the listing of bulk 
> loaded files. All backup observers work only with this new table. The reason: 
> in case of a failure during backup create/delete/merge/restore, when the 
> system performs an automatic rollback, some data written by backup observers 
> during the failed operation could be lost. This is what we try to avoid.
> # The second table keeps only bulk-load-related references. We do not care 
> about the consistency of this table, because bulk load is an idempotent 
> operation and can be repeated after a failure. Partially written data in the 
> second table does not affect the BackupHFileCleaner plugin, because this data 
> (the list of bulk loaded files) corresponds to files which have not yet been 
> loaded successfully and, hence, are not visible to the system.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (HBASE-19872) hbase1.3.1 regionserver Crash (bucketcache)

2018-01-29 Thread gehaijiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

gehaijiang resolved HBASE-19872.

Resolution: Done

> hbase1.3.1 regionserver Crash (bucketcache)
> --
>
> Key: HBASE-19872
> URL: https://issues.apache.org/jira/browse/HBASE-19872
> Project: HBase
>  Issue Type: Bug
>  Components: BucketCache
>Affects Versions: 1.3.1
>Reporter: gehaijiang
>Priority: Major
>
> An HBase 1.3.1 regionserver crashes when bucketcache is configured.
> Error log:
> FATAL [RpcServer.FifoWFPBQ.default.handler=42,queue=2,port=16020] 
> regionserver.RSRpcServices: Run out of memory; RSRpcServices will abort 
> itself immediately
>  
> hbase-env.sh:
> export HBASE_REGIONSERVER_OPTS="-XX:+UseG1GC -XX:+UnlockExperimentalVMOptions 
> -Xmx24g -Xms24g -XX:MetaspaceSize=256M -XX:MaxMetaspaceSize=512M 
> -XX:MaxGCPauseMillis=100 -XX:G1NewSizePercent=5 -XX:ConcGCThreads=4 
> -XX:ParallelGCThreads=16 -XX:-ResizePLAB -XX:+ParallelRefProcEnabled 
> -XX:InitiatingHeapOccupancyPercent=65 -XX:G1HeapRegionSize=32M 
> -XX:G1MixedGCCountTarget=64 -XX:G1OldCSetRegionThresholdPercent=5 
> -XX:MaxTenuringThreshold=1 -XX:MaxDirectMemorySize=28g 
> -XX:ReservedCodeCacheSize=512M -XX:+DisableExplicitGC 
> -Xloggc:${HBASE_LOG_DIR}/regionserver.gc.log"
>  
> hbase-site.xml:
> <property>
>   <name>hbase.bucketcache.combinedcache.enabled</name>
>   <value>true</value>
> </property>
> <property>
>   <name>hbase.bucketcache.ioengine</name>
>   <value>offheap</value>
> </property>
> <property>
>   <name>hbase.bucketcache.size</name>
>   <value>25600</value>
> </property>
> <property>
>   <name>hbase.bucketcache.writer.queuelength</name>
>   <value>64</value>
> </property>
> <property>
>   <name>hbase.bucketcache.writer.threads</name>
>   <value>3</value>
> </property>
> <property>
>   <name>hfile.block.cache.size</name>
>   <value>0.3</value>
> </property>



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-19874) Regionserver handle is not deleted(bucketcache)

2018-01-29 Thread gehaijiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344324#comment-16344324
 ] 

gehaijiang commented on HBASE-19874:


This was a problem with HDFS short-circuit reads being enabled.
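For background (a hedged note, not from the thread itself): with HDFS 
short-circuit reads enabled (dfs.client.read.shortcircuit=true), the 
regionserver reads block files through file descriptors cached in the DFS 
client's ShortCircuitCache. If HDFS deletes a block (for example, after a 
compaction removes an HFile) while a descriptor is still cached, the process 
keeps a handle to a "(deleted)" file until the cache entry expires, which 
matches the listing in the description below.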

> Regionserver handle is not deleted(bucketcache)  
> -
>
> Key: HBASE-19874
> URL: https://issues.apache.org/jira/browse/HBASE-19874
> Project: HBase
>  Issue Type: Bug
>  Components: BucketCache
>Affects Versions: 1.3.1
>Reporter: gehaijiang
>Priority: Major
>
> HBase is configured with bucketcache. In our production environment, handles 
> to deleted files are not released. The number of deleted-but-still-open files 
> has reached several hundred and keeps growing, and memory keeps growing as 
> well.
> $ ll|grep delete
> lr-x-- 1 data data 64 Jan 28 14:28 1048 -> 
> /block4/hadoop/dfs/data/current/BP-1101579887-10.50.64.23-1497104043858/current/finalized/subdir247/subdir216/blk_3036141819
>  (deleted)
> lr-x-- 1 data data 64 Jan 28 14:28 1050 -> 
> /block4/hadoop/dfs/data/current/BP-1101579887-10.50.64.23-1497104043858/current/finalized/subdir247/subdir216/blk_3036141819_1962457009.meta
>  (deleted)
> lr-x-- 1 data data 64 Jan 28 14:28 1078 -> 
> /block5/hadoop/dfs/data/current/BP-1101579887-10.50.64.23-1497104043858/current/finalized/subdir247/subdir62/blk_3036102314
>  (deleted)
> lr-x-- 1 data data 64 Jan 28 14:28 1079 -> 
> /block7/hadoop/dfs/data/current/BP-1101579887-10.50.64.23-1497104043858/current/finalized/subdir247/subdir95/blk_3036110832
>  (deleted)
> lr-x-- 1 data data 64 Jan 28 14:28 1091 -> 
> /block3/hadoop/dfs/data/current/BP-1101579887-10.50.64.23-1497104043858/current/finalized/subdir245/subdir53/blk_3035968976_1962284166.meta
>  (deleted)
> lr-x-- 1 data data 64 Jan 28 14:28 1092 -> 
> /block9/hadoop/dfs/data/current/BP-1101579887-10.50.64.23-1497104043858/current/finalized/subdir247/subdir97/blk_3036111332
>  (deleted)
> lr-x-- 1 data data 64 Jan 28 14:28 1093 -> 
> /block9/hadoop/dfs/data/current/BP-1101579887-10.50.64.23-1497104043858/current/finalized/subdir247/subdir97/blk_3036111332_1962426522.meta
>  (deleted)
> lr-x-- 1 data data 64 Jan 28 14:28 1096 -> 
> /block5/hadoop/dfs/data/current/BP-1101579887-10.50.64.23-1497104043858/current/finalized/subdir247/subdir62/blk_3036102314_1962417504.meta
>  (deleted)
> lr-x-- 1 data data 64 Jan 28 14:28 1100 -> 
> /block7/hadoop/dfs/data/current/BP-1101579887-10.50.64.23-1497104043858/current/finalized/subdir247/subdir95/blk_3036110832_1962426022.meta
>  (deleted)
> lr-x-- 1 data data 64 Jan 28 14:28 1101 -> 
> /block10/hadoop/dfs/data/current/BP-1101579887-10.50.64.23-1497104043858/current/finalized/subdir245/subdir13/blk_3035958550
>  (deleted)
> lr-x-- 1 data data 64 Jan 28 14:28 1102 -> 
> /block10/hadoop/dfs/data/current/BP-1101579887-10.50.64.23-1497104043858/current/finalized/subdir245/subdir13/blk_3035958550_1962273740.meta
>  (deleted)
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-19872) hbase1.3.1 regionserver Crash (bucketcache)

2018-01-29 Thread gehaijiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344319#comment-16344319
 ] 

gehaijiang commented on HBASE-19872:


Thanks, understood.

> hbase1.3.1 regionserver Crash (bucketcache)
> --
>
> Key: HBASE-19872
> URL: https://issues.apache.org/jira/browse/HBASE-19872
> Project: HBase
>  Issue Type: Bug
>  Components: BucketCache
>Affects Versions: 1.3.1
>Reporter: gehaijiang
>Priority: Major
>
> An HBase 1.3.1 regionserver crashes when bucketcache is configured.
> Error log:
> FATAL [RpcServer.FifoWFPBQ.default.handler=42,queue=2,port=16020] 
> regionserver.RSRpcServices: Run out of memory; RSRpcServices will abort 
> itself immediately
>  
> hbase-env.sh:
> export HBASE_REGIONSERVER_OPTS="-XX:+UseG1GC -XX:+UnlockExperimentalVMOptions 
> -Xmx24g -Xms24g -XX:MetaspaceSize=256M -XX:MaxMetaspaceSize=512M 
> -XX:MaxGCPauseMillis=100 -XX:G1NewSizePercent=5 -XX:ConcGCThreads=4 
> -XX:ParallelGCThreads=16 -XX:-ResizePLAB -XX:+ParallelRefProcEnabled 
> -XX:InitiatingHeapOccupancyPercent=65 -XX:G1HeapRegionSize=32M 
> -XX:G1MixedGCCountTarget=64 -XX:G1OldCSetRegionThresholdPercent=5 
> -XX:MaxTenuringThreshold=1 -XX:MaxDirectMemorySize=28g 
> -XX:ReservedCodeCacheSize=512M -XX:+DisableExplicitGC 
> -Xloggc:${HBASE_LOG_DIR}/regionserver.gc.log"
>  
> hbase-site.xml:
> <property>
>   <name>hbase.bucketcache.combinedcache.enabled</name>
>   <value>true</value>
> </property>
> <property>
>   <name>hbase.bucketcache.ioengine</name>
>   <value>offheap</value>
> </property>
> <property>
>   <name>hbase.bucketcache.size</name>
>   <value>25600</value>
> </property>
> <property>
>   <name>hbase.bucketcache.writer.queuelength</name>
>   <value>64</value>
> </property>
> <property>
>   <name>hbase.bucketcache.writer.threads</name>
>   <value>3</value>
> </property>
> <property>
>   <name>hfile.block.cache.size</name>
>   <value>0.3</value>
> </property>



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-17852) Add Fault tolerance to HBASE-14417 (Support bulk loaded files in incremental backup)

2018-01-29 Thread Appy (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344314#comment-16344314
 ] 

Appy commented on HBASE-17852:
--

Okay, f**k it, I really don't want to waste any more of my time fighting some 
fight. It's obvious from events what happened here, and that it shouldn't have 
- it makes me very sad and angry.
I leave its further handling to the PMC.
At the very least, someone lost my basic trust and respect.




> Add Fault tolerance to HBASE-14417 (Support bulk loaded files in incremental 
> backup)
> 
>
> Key: HBASE-17852
> URL: https://issues.apache.org/jira/browse/HBASE-17852
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HBASE-17852-v10.patch, screenshot-1.png
>
>
> Design approach: rollback-via-snapshot, implemented in this ticket:
> # Before a backup create/delete/merge starts, we take a snapshot of the backup 
> meta-table (backup system table). This procedure is lightweight because the 
> meta-table is small and usually fits in a single region.
> # When an operation fails on the server side, we handle the failure by cleaning 
> up partial data in the backup destination and then restoring the backup 
> meta-table from the snapshot.
> # When an operation fails on the client side (abnormal termination, for 
> example), the next time the user tries a create/merge/delete they will see an 
> error message that the system is in an inconsistent state and repair is 
> required; they will need to run the backup repair tool.
> # To avoid multiple writers to the backup system table (the backup client and 
> BackupObservers), we introduce a small table ONLY to keep the listing of bulk 
> loaded files. All backup observers work only with this new table. The reason: 
> in case of a failure during backup create/delete/merge/restore, when the 
> system performs an automatic rollback, some data written by backup observers 
> during the failed operation could be lost. This is what we try to avoid.
> # The second table keeps only bulk-load-related references. We do not care 
> about the consistency of this table, because bulk load is an idempotent 
> operation and can be repeated after a failure. Partially written data in the 
> second table does not affect the BackupHFileCleaner plugin, because this data 
> (the list of bulk loaded files) corresponds to files which have not yet been 
> loaded successfully and, hence, are not visible to the system.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-19803) False positive for the HBASE-Find-Flaky-Tests job

2018-01-29 Thread Duo Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344311#comment-16344311
 ] 

Duo Zhang commented on HBASE-19803:
---

I've changed the label from 'ubuntu' to 'ubuntu||Hadoop' so it can start a new 
build...

> False positive for the HBASE-Find-Flaky-Tests job
> -
>
> Key: HBASE-19803
> URL: https://issues.apache.org/jira/browse/HBASE-19803
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Duo Zhang
>Priority: Major
> Attachments: 2018-01-24T17-45-37_000-jvmRun1.dumpstream, 
> HBASE-19803.master.001.patch
>
>
> It reports two hangs for TestAsyncTableGetMultiThreaded, but I checked the 
> surefire output
> https://builds.apache.org/job/HBASE-Flaky-Tests/24830/artifact/hbase-server/target/surefire-reports/org.apache.hadoop.hbase.client.TestAsyncTableGetMultiThreaded-output.txt
> This one was likely killed in the middle of the run, within 20 seconds.
> https://builds.apache.org/job/HBASE-Flaky-Tests/24852/artifact/hbase-server/target/surefire-reports/org.apache.hadoop.hbase.client.TestAsyncTableGetMultiThreaded-output.txt
> This one was also killed, within about 1 minute.
> The test is declared as LargeTests, so the time limit should be 10 minutes. It 
> seems the JVM may crash during the mvn test run; we then kill all the running 
> tests and may mark some of them as hung, which leads to the false positive.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-17852) Add Fault tolerance to HBASE-14417 (Support bulk loaded files in incremental backup)

2018-01-29 Thread Josh Elser (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344310#comment-16344310
 ] 

Josh Elser commented on HBASE-17852:


bq. I'll can't believe that because I can't believe that..

[~appy], truly, boss, if you weren't giving your blessing on the fix going 
into master, say so and I'll revert it when I'm next at a computer. I was 
operating under the assumption that we had time to address the design and not 
look a gift contribution (horse) in the mouth.

The rest of this is a product of some heavy-handedness about the busted Yetus 
after the JIRA upgrade.

I'm not trying to tell you that something other than what you think happened, 
happened. I'm trying to express that I thought you were OK with this plan 
against master (not branch-2).

> Add Fault tolerance to HBASE-14417 (Support bulk loaded files in incremental 
> backup)
> 
>
> Key: HBASE-17852
> URL: https://issues.apache.org/jira/browse/HBASE-17852
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HBASE-17852-v10.patch, screenshot-1.png
>
>
> Design approach: rollback-via-snapshot, implemented in this ticket:
> # Before a backup create/delete/merge starts, we take a snapshot of the backup 
> meta-table (backup system table). This procedure is lightweight because the 
> meta-table is small and usually fits in a single region.
> # When an operation fails on the server side, we handle the failure by cleaning 
> up partial data in the backup destination and then restoring the backup 
> meta-table from the snapshot.
> # When an operation fails on the client side (abnormal termination, for 
> example), the next time the user tries a create/merge/delete they will see an 
> error message that the system is in an inconsistent state and repair is 
> required; they will need to run the backup repair tool.
> # To avoid multiple writers to the backup system table (the backup client and 
> BackupObservers), we introduce a small table ONLY to keep the listing of bulk 
> loaded files. All backup observers work only with this new table. The reason: 
> in case of a failure during backup create/delete/merge/restore, when the 
> system performs an automatic rollback, some data written by backup observers 
> during the failed operation could be lost. This is what we try to avoid.
> # The second table keeps only bulk-load-related references. We do not care 
> about the consistency of this table, because bulk load is an idempotent 
> operation and can be repeated after a failure. Partially written data in the 
> second table does not affect the BackupHFileCleaner plugin, because this data 
> (the list of bulk loaded files) corresponds to files which have not yet been 
> loaded successfully and, hence, are not visible to the system.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-17852) Add Fault tolerance to HBASE-14417 (Support bulk loaded files in incremental backup)

2018-01-29 Thread Vladimir Rodionov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344307#comment-16344307
 ] 

Vladimir Rodionov commented on HBASE-17852:
---

{quote}
Wasn't the said patch objected against committing by multiple members of the 
community?
{quote}

Calm down, [~appy]. We are not doing anything criminal here. The result of 
these two patches is what you agreed to personally:

https://issues.apache.org/jira/browse/HBASE-17852?focusedCommentId=16327774&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16327774

> Add Fault tolerance to HBASE-14417 (Support bulk loaded files in incremental 
> backup)
> 
>
> Key: HBASE-17852
> URL: https://issues.apache.org/jira/browse/HBASE-17852
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HBASE-17852-v10.patch, screenshot-1.png
>
>
> Design approach: rollback-via-snapshot, implemented in this ticket:
> # Before a backup create/delete/merge starts, we take a snapshot of the backup 
> meta-table (backup system table). This procedure is lightweight because the 
> meta-table is small and usually fits in a single region.
> # When an operation fails on the server side, we handle the failure by cleaning 
> up partial data in the backup destination and then restoring the backup 
> meta-table from the snapshot.
> # When an operation fails on the client side (abnormal termination, for 
> example), the next time the user tries a create/merge/delete they will see an 
> error message that the system is in an inconsistent state and repair is 
> required; they will need to run the backup repair tool.
> # To avoid multiple writers to the backup system table (the backup client and 
> BackupObservers), we introduce a small table ONLY to keep the listing of bulk 
> loaded files. All backup observers work only with this new table. The reason: 
> in case of a failure during backup create/delete/merge/restore, when the 
> system performs an automatic rollback, some data written by backup observers 
> during the failed operation could be lost. This is what we try to avoid.
> # The second table keeps only bulk-load-related references. We do not care 
> about the consistency of this table, because bulk load is an idempotent 
> operation and can be repeated after a failure. Partially written data in the 
> second table does not affect the BackupHFileCleaner plugin, because this data 
> (the list of bulk loaded files) corresponds to files which have not yet been 
> loaded successfully and, hence, are not visible to the system.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-19803) False positive for the HBASE-Find-Flaky-Tests job

2018-01-29 Thread Duo Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344305#comment-16344305
 ] 

Duo Zhang commented on HBASE-19803:
---

[~appy] The flakey test finder job is hung?

https://builds.apache.org/job/HBASE-Find-Flaky-Tests/

The last build cannot start...

> False positive for the HBASE-Find-Flaky-Tests job
> -
>
> Key: HBASE-19803
> URL: https://issues.apache.org/jira/browse/HBASE-19803
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Duo Zhang
>Priority: Major
> Attachments: 2018-01-24T17-45-37_000-jvmRun1.dumpstream, 
> HBASE-19803.master.001.patch
>
>
> It reports two hangs for TestAsyncTableGetMultiThreaded, but I checked the 
> surefire output
> https://builds.apache.org/job/HBASE-Flaky-Tests/24830/artifact/hbase-server/target/surefire-reports/org.apache.hadoop.hbase.client.TestAsyncTableGetMultiThreaded-output.txt
> This one was likely killed in the middle of the run, within 20 seconds.
> https://builds.apache.org/job/HBASE-Flaky-Tests/24852/artifact/hbase-server/target/surefire-reports/org.apache.hadoop.hbase.client.TestAsyncTableGetMultiThreaded-output.txt
> This one was also killed within about 1 minute.
> The test is declared as LargeTests so the time limit should be 10 minutes. It 
> seems that the jvm may crash during the mvn test run; we then kill all the 
> running tests and may mark some of them as hung, which leads to the false 
> positive.
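A minimal sketch of the fix this suggests: count a killed run as a hang only when it actually consumed its category's time budget. The names are assumptions (the real job is a Jenkins script, not this code):

{code:java}
// Illustrative heuristic only, with assumed names.
public final class HangClassifier {
  /**
   * A LargeTests case killed after 20 seconds or 1 minute was collateral
   * damage from a crashed forked JVM, not a hang; only a run that used up
   * its category budget should be reported as hung.
   */
  public static boolean isRealHang(long elapsedMs, long categoryTimeoutMs) {
    return elapsedMs >= categoryTimeoutMs;
  }
}
{code}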



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-19873) Add a CategoryBasedTimeout ClassRule for all UTs

2018-01-29 Thread Duo Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344304#comment-16344304
 ] 

Duo Zhang commented on HBASE-19873:
---

Thanks [~stack].

> Add a CategoryBasedTimeout ClassRule for all UTs
> 
>
> Key: HBASE-19873
> URL: https://issues.apache.org/jira/browse/HBASE-19873
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 2.0.0-beta-2
>
> Attachments: HBASE-19873-branch-2-v2.patch
>
>
> So that our test can timeout as expected without making the surefire plugin 
> kill other tests.
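A generic JUnit 4 sketch of such a class-level timeout; HBase's actual CategoryBasedTimeout derives the limit from the test's @Category annotation, which this analogue hard-codes:

{code:java}
// Sketch only: a plain JUnit ClassRule timeout with the LargeTests budget.
import java.util.concurrent.TimeUnit;
import org.junit.ClassRule;
import org.junit.Test;
import org.junit.rules.Timeout;

public class TestSomethingLarge {
  @ClassRule // fails this class from inside the JVM instead of a fork-wide kill
  public static final Timeout CLASS_TIMEOUT = new Timeout(10, TimeUnit.MINUTES);

  @Test
  public void testWork() {
    // test body; a hang here now times out this class only
  }
}
{code}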



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-17852) Add Fault tolerance to HBASE-14417 (Support bulk loaded files in incremental backup)

2018-01-29 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344302#comment-16344302
 ] 

stack commented on HBASE-17852:
---

{quote}The majority of this code (but not all) went into master in HBASE-19568 
btw.
{quote}
The majority of 'HBASE-17852 Add Fault tolerance to HBASE-14417 (Support bulk 
loaded files in incremental backup)', a contentious issue, went into another 
commit named 'HBASE-19568 Restore of HBase table using incremental backup 
doesn't restore rows from an earlier incremental backup' with no outline of 
what made it and what did not, and no changeset explanation. There is no 
release note. The two JIRAs are not even linked.
{quote}Nope, it turned out that this patch (HBASE-17852) also fixes the issue 
raised in HBASE-19568, that is why it was committed (with refactoring code 
stripped down). No conspiracy here.   
{quote}
But hang on, now the patch here on 'fault tolerance' fixes issues over in the 
'restore rows' issue, -HBASE-19568?-

I can see how [~appy] might arrive at his assessment.

On the 'declarations', the first offers options free of context or explanation.

This one I find interesting:

 # Use procedure framework:  Short answer - no. I will wait until procv2 
becomes more mature and robust. I do not want to build new feature on a 
foundation of a new feature. Too risky in my opinion. NO

when we are talking about an hbase3 (possibly) feature and when there is no 
alternative.

Anyway, keeping it short.

 

> Add Fault tolerance to HBASE-14417 (Support bulk loaded files in incremental 
> backup)
> 
>
> Key: HBASE-17852
> URL: https://issues.apache.org/jira/browse/HBASE-17852
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HBASE-17852-v10.patch, screenshot-1.png
>
>
> Design approach rollback-via-snapshot implemented in this ticket:
> # Before backup create/delete/merge starts we take a snapshot of the backup 
> meta-table (backup system table). This procedure is lightweight because meta 
> table is small, usually should fit a single region.
> # When operation fails on a server side, we handle this failure by cleaning 
> up partial data in backup destination, followed by restoring backup 
> meta-table from a snapshot. 
> # When operation fails on a client side (abnormal termination, for example), 
> next time user will try create/merge/delete he(she) will see error message, 
> that system is in inconsistent state and repair is required, he(she) will 
> need to run backup repair tool.
> # To avoid multiple writers to the backup system table (backup client and 
> BackupObserver's) we introduce small table ONLY to keep listing of bulk 
> loaded files. All backup observers will work only with this new tables. The 
> reason: in case of a failure during backup create/delete/merge/restore, when 
> system performs automatic rollback, some data written by backup observers 
> during failed operation may be lost. This is what we try to avoid.
> # Second table keeps only bulk load related references. We do not care about 
> consistency of this table, because bulk load is idempotent operation and can 
> be repeated after failure. Partially written data in second table does not 
> affect on BackupHFileCleaner plugin, because this data (list of bulk loaded 
> files) correspond to a files which have not been loaded yet successfully and, 
> hence - are not visible to the system 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-17852) Add Fault tolerance to HBASE-14417 (Support bulk loaded files in incremental backup)

2018-01-29 Thread Vladimir Rodionov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344287#comment-16344287
 ] 

Vladimir Rodionov commented on HBASE-17852:
---

{quote}

he had a random urge to delete all previous 9 patches from this jira

{quote}

No conspiracy here as well. I was not able to submit patch v10 due to some 
Apache Jira issues and had to remove all previous patches to be able to submit 
v10. 

> Add Fault tolerance to HBASE-14417 (Support bulk loaded files in incremental 
> backup)
> 
>
> Key: HBASE-17852
> URL: https://issues.apache.org/jira/browse/HBASE-17852
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HBASE-17852-v10.patch, screenshot-1.png
>
>
> Design approach rollback-via-snapshot implemented in this ticket:
> # Before backup create/delete/merge starts we take a snapshot of the backup 
> meta-table (backup system table). This procedure is lightweight because meta 
> table is small, usually should fit a single region.
> # When operation fails on a server side, we handle this failure by cleaning 
> up partial data in backup destination, followed by restoring backup 
> meta-table from a snapshot. 
> # When operation fails on a client side (abnormal termination, for example), 
> next time user will try create/merge/delete he(she) will see error message, 
> that system is in inconsistent state and repair is required, he(she) will 
> need to run backup repair tool.
> # To avoid multiple writers to the backup system table (backup client and 
> BackupObserver's) we introduce small table ONLY to keep listing of bulk 
> loaded files. All backup observers will work only with this new tables. The 
> reason: in case of a failure during backup create/delete/merge/restore, when 
> system performs automatic rollback, some data written by backup observers 
> during failed operation may be lost. This is what we try to avoid.
> # Second table keeps only bulk load related references. We do not care about 
> consistency of this table, because bulk load is idempotent operation and can 
> be repeated after failure. Partially written data in second table does not 
> affect on BackupHFileCleaner plugin, because this data (list of bulk loaded 
> files) correspond to a files which have not been loaded yet successfully and, 
> hence - are not visible to the system 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-17852) Add Fault tolerance to HBASE-14417 (Support bulk loaded files in incremental backup)

2018-01-29 Thread Appy (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344272#comment-16344272
 ] 

Appy commented on HBASE-17852:
--

bq. There was nothing malicious intended to happen here
I can't believe that, because I can't believe that:
- he started fixing the other jira from a clean slate and somehow mysteriously 
ended up with the exact same diff as was here, which we all were against.
- he had a random urge to delete all previous 9 patches from this jira, but not 
from phase1 jira HBASE-14030 or phase2 jira HBASE-14123, which both have like 
40 patches each


> Add Fault tolerance to HBASE-14417 (Support bulk loaded files in incremental 
> backup)
> 
>
> Key: HBASE-17852
> URL: https://issues.apache.org/jira/browse/HBASE-17852
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HBASE-17852-v10.patch, screenshot-1.png
>
>
> Design approach rollback-via-snapshot implemented in this ticket:
> # Before backup create/delete/merge starts we take a snapshot of the backup 
> meta-table (backup system table). This procedure is lightweight because meta 
> table is small, usually should fit a single region.
> # When operation fails on a server side, we handle this failure by cleaning 
> up partial data in backup destination, followed by restoring backup 
> meta-table from a snapshot. 
> # When operation fails on a client side (abnormal termination, for example), 
> next time user will try create/merge/delete he(she) will see error message, 
> that system is in inconsistent state and repair is required, he(she) will 
> need to run backup repair tool.
> # To avoid multiple writers to the backup system table (backup client and 
> BackupObserver's) we introduce small table ONLY to keep listing of bulk 
> loaded files. All backup observers will work only with this new tables. The 
> reason: in case of a failure during backup create/delete/merge/restore, when 
> system performs automatic rollback, some data written by backup observers 
> during failed operation may be lost. This is what we try to avoid.
> # Second table keeps only bulk load related references. We do not care about 
> consistency of this table, because bulk load is idempotent operation and can 
> be repeated after failure. Partially written data in second table does not 
> affect on BackupHFileCleaner plugin, because this data (list of bulk loaded 
> files) correspond to a files which have not been loaded yet successfully and, 
> hence - are not visible to the system 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-17852) Add Fault tolerance to HBASE-14417 (Support bulk loaded files in incremental backup)

2018-01-29 Thread Appy (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344258#comment-16344258
 ] 

Appy commented on HBASE-17852:
--

bq. Nope, it turned out that this patch (HBASE-17852) also fixes the issue 
raised in HBASE-19568, that is why it was committed (with refactoring code 
stripped down).
Not a justification!
Did you not use the patch in this jira to fix HBASE-19568?
Wasn't committing the said patch objected to by multiple members of the 
community?
Did you bring to the attention of anyone who raised the objections 
(me/stack/andrew/[~mdrob]) the fact that you were committing these changes?

bq. No conspiracy here.  Besides this, I thought that we have agreed on pushing 
this to the master branch and continue working on critical changes after 
that? 
You really think that'd work? People can match timestamps; you committed 4 days 
before I even replied back!


> Add Fault tolerance to HBASE-14417 (Support bulk loaded files in incremental 
> backup)
> 
>
> Key: HBASE-17852
> URL: https://issues.apache.org/jira/browse/HBASE-17852
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HBASE-17852-v10.patch, screenshot-1.png
>
>
> Design approach rollback-via-snapshot implemented in this ticket:
> # Before backup create/delete/merge starts we take a snapshot of the backup 
> meta-table (backup system table). This procedure is lightweight because meta 
> table is small, usually should fit a single region.
> # When operation fails on a server side, we handle this failure by cleaning 
> up partial data in backup destination, followed by restoring backup 
> meta-table from a snapshot. 
> # When operation fails on a client side (abnormal termination, for example), 
> next time user will try create/merge/delete he(she) will see error message, 
> that system is in inconsistent state and repair is required, he(she) will 
> need to run backup repair tool.
> # To avoid multiple writers to the backup system table (backup client and 
> BackupObserver's) we introduce small table ONLY to keep listing of bulk 
> loaded files. All backup observers will work only with this new tables. The 
> reason: in case of a failure during backup create/delete/merge/restore, when 
> system performs automatic rollback, some data written by backup observers 
> during failed operation may be lost. This is what we try to avoid.
> # Second table keeps only bulk load related references. We do not care about 
> consistency of this table, because bulk load is idempotent operation and can 
> be repeated after failure. Partially written data in second table does not 
> affect on BackupHFileCleaner plugin, because this data (list of bulk loaded 
> files) correspond to a files which have not been loaded yet successfully and, 
> hence - are not visible to the system 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-19863) java.lang.IllegalStateException: isDelete failed when SingleColumnValueFilter is used

2018-01-29 Thread Sergey Soldatov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Soldatov updated HBASE-19863:

Status: Patch Available  (was: Open)

A WIP patch to deal with this issue. If, during the attempt to skip to the next 
column, we hit the case where the store scanner set was changed and the new head 
is behind the cell we are looking for, we just return false to force a reseek in 
seekOrSkipToNextColumn. Not sure yet whether it covers all possible scenarios.
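A minimal sketch of that guard, assuming a Comparator<Cell> and the new heap head are at hand; the names are illustrative, not the actual patch:

{code:java}
// Sketch only: decide whether the skip must be abandoned in favor of a reseek.
import java.util.Comparator;
import org.apache.hadoop.hbase.Cell;

public final class SkipGuard {
  /**
   * True when the store scanner set changed mid-skip and the new heap head
   * sorts before the cell we are skipping toward; the caller
   * (seekOrSkipToNextColumn) should then fall back to an explicit reseek.
   */
  public static boolean mustReseek(Comparator<Cell> comparator, Cell newHead, Cell target) {
    return newHead != null && comparator.compare(newHead, target) < 0;
  }
}
{code}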

> java.lang.IllegalStateException: isDelete failed when SingleColumnValueFilter 
> is used
> -
>
> Key: HBASE-19863
> URL: https://issues.apache.org/jira/browse/HBASE-19863
> Project: HBase
>  Issue Type: Bug
>  Components: Filters
>Affects Versions: 1.4.1
>Reporter: Sergey Soldatov
>Assignee: Sergey Soldatov
>Priority: Major
> Attachments: HBASE-19863-branch1.patch, HBASE-19863-test.patch
>
>
> Under some circumstances scan with SingleColumnValueFilter may fail with an 
> exception
> {noformat} 
> java.lang.IllegalStateException: isDelete failed: deleteBuffer=C3, 
> qualifier=C2, timestamp=1516433595543, comparison result: 1 
> at 
> org.apache.hadoop.hbase.regionserver.ScanDeleteTracker.isDeleted(ScanDeleteTracker.java:149)
>   at 
> org.apache.hadoop.hbase.regionserver.ScanQueryMatcher.match(ScanQueryMatcher.java:386)
>   at 
> org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:545)
>   at 
> org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:147)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.populateResult(HRegion.java:5876)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextInternal(HRegion.java:6027)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextRaw(HRegion.java:5814)
>   at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.scan(RSRpcServices.java:2552)
>   at 
> org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:32385)
>   at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2150)
>   at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:112)
>   at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:187)
>   at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:167)
> {noformat}
> Conditions:
> table T with a single column family 0 that uses ROWCOL bloom filter 
> (important)  and column qualifiers C1,C2,C3,C4,C5. 
> When we fill the table, for every row we put a deleted cell for C3.
> The table has a single region with two HStores:
> A: start row: 0, stop row: 99 
> B: start row: 10 stop row: 99
> B has newer versions of rows 10-99. Store files have several blocks each 
> (important). 
> Store A is the result of major compaction,  so it doesn't have any deleted 
> cells (important).
> So, we are running a scan like:
> {noformat}
> scan 'T', { COLUMNS => ['0:C3','0:C5'], FILTER => "SingleColumnValueFilter 
> ('0','C5',=,'binary:whatever')"}
> {noformat}  
> How the scan performs:
> First, we iterate A for rows 0 and 1 without any problems. 
> Next, we start to iterate A for row 10, so we read the first cell and set the 
> hfs scanner to A:
> 10:0/C1/0/Put/x, but find that we have a newer version of the cell in B: 
> 10:0/C1/1/Put/x, 
> so we make B our current store scanner. Since we are looking for 
> particular columns 
> C3 and C5, we perform the optimization StoreScanner.seekOrSkipToNextColumn 
> which 
> would run reseek for all store scanners.
> For store A the following magic would happen in requestSeek:
>   1. the bloom filter check passesGeneralBloomFilter would set haveToSeek to 
> false because row 10 doesn't have the C3 qualifier in store A.  
>   2. Since we don't have to seek, we just create a fake row 
> 10:0/C3/OLDEST_TIMESTAMP/Maximum, an optimization that is quite important for 
> us and is commented with:
> {noformat}
>  // Multi-column Bloom filter optimization.
> // Create a fake key/value, so that this scanner only bubbles up to the 
> top
> // of the KeyValueHeap in StoreScanner after we scanned this row/column in
> // all other store files. The query matcher will then just skip this fake
> // key/value and the store scanner will progress to the next column. This
> // is obviously not a "real real" seek, but unlike the fake KV earlier in
> // this method, we want this to be propagated to ScanQueryMatcher.
> {noformat}
> 
> For store B we would set it to fake 10:0/C3/createFirstOnRowColTS()/Maximum 
> to skip C3 entirely. 
> After that we start searching for qualifier C5 using seekOrSkipToNextColumn, 
> which first runs trySkipToNextColumn:
> {n

[jira] [Updated] (HBASE-19863) java.lang.IllegalStateException: isDelete failed when SingleColumnValueFilter is used

2018-01-29 Thread Sergey Soldatov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Soldatov updated HBASE-19863:

Attachment: HBASE-19863-branch1.patch

> java.lang.IllegalStateException: isDelete failed when SingleColumnValueFilter 
> is used
> -
>
> Key: HBASE-19863
> URL: https://issues.apache.org/jira/browse/HBASE-19863
> Project: HBase
>  Issue Type: Bug
>  Components: Filters
>Affects Versions: 1.4.1
>Reporter: Sergey Soldatov
>Assignee: Sergey Soldatov
>Priority: Major
> Attachments: HBASE-19863-branch1.patch, HBASE-19863-test.patch
>
>
> Under some circumstances scan with SingleColumnValueFilter may fail with an 
> exception
> {noformat} 
> java.lang.IllegalStateException: isDelete failed: deleteBuffer=C3, 
> qualifier=C2, timestamp=1516433595543, comparison result: 1 
> at 
> org.apache.hadoop.hbase.regionserver.ScanDeleteTracker.isDeleted(ScanDeleteTracker.java:149)
>   at 
> org.apache.hadoop.hbase.regionserver.ScanQueryMatcher.match(ScanQueryMatcher.java:386)
>   at 
> org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:545)
>   at 
> org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:147)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.populateResult(HRegion.java:5876)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextInternal(HRegion.java:6027)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextRaw(HRegion.java:5814)
>   at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.scan(RSRpcServices.java:2552)
>   at 
> org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:32385)
>   at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2150)
>   at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:112)
>   at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:187)
>   at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:167)
> {noformat}
> Conditions:
> table T with a single column family 0 that uses ROWCOL bloom filter 
> (important)  and column qualifiers C1,C2,C3,C4,C5. 
> When we fill the table, for every row we put a deleted cell for C3.
> The table has a single region with two HStores:
> A: start row: 0, stop row: 99 
> B: start row: 10 stop row: 99
> B has newer versions of rows 10-99. Store files have several blocks each 
> (important). 
> Store A is the result of major compaction,  so it doesn't have any deleted 
> cells (important).
> So, we are running a scan like:
> {noformat}
> scan 'T', { COLUMNS => ['0:C3','0:C5'], FILTER => "SingleColumnValueFilter 
> ('0','C5',=,'binary:whatever')"}
> {noformat}  
> How the scan performs:
> First, we iterate A for rows 0 and 1 without any problems. 
> Next, we start to iterate A for row 10, so we read the first cell and set the 
> hfs scanner to A:
> 10:0/C1/0/Put/x, but find that we have a newer version of the cell in B: 
> 10:0/C1/1/Put/x, 
> so we make B our current store scanner. Since we are looking for 
> particular columns 
> C3 and C5, we perform the optimization StoreScanner.seekOrSkipToNextColumn 
> which 
> would run reseek for all store scanners.
> For store A the following magic would happen in requestSeek:
>   1. the bloom filter check passesGeneralBloomFilter would set haveToSeek to 
> false because row 10 doesn't have the C3 qualifier in store A.  
>   2. Since we don't have to seek, we just create a fake row 
> 10:0/C3/OLDEST_TIMESTAMP/Maximum, an optimization that is quite important for 
> us and is commented with:
> {noformat}
>  // Multi-column Bloom filter optimization.
> // Create a fake key/value, so that this scanner only bubbles up to the 
> top
> // of the KeyValueHeap in StoreScanner after we scanned this row/column in
> // all other store files. The query matcher will then just skip this fake
> // key/value and the store scanner will progress to the next column. This
> // is obviously not a "real real" seek, but unlike the fake KV earlier in
> // this method, we want this to be propagated to ScanQueryMatcher.
> {noformat}
> 
> For store B we would set it to fake 10:0/C3/createFirstOnRowColTS()/Maximum 
> to skip C3 entirely. 
> After that we start searching for qualifier C5 using seekOrSkipToNextColumn, 
> which first runs trySkipToNextColumn:
> {noformat}
>   protected boolean trySkipToNextColumn(Cell cell) throws IOException {
> Cell nextCell = null;
> do {
>   Cell nextIndexedKey = getNextIndexedKey();
>   if (nextIndexedKey != null && nextIndexedKey != 
> KeyValueScanner.NO_NEXT_INDEXED_KEY
>   && matcher.compareKeyFo

[jira] [Commented] (HBASE-17852) Add Fault tolerance to HBASE-14417 (Support bulk loaded files in incremental backup)

2018-01-29 Thread Josh Elser (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344231#comment-16344231
 ] 

Josh Elser commented on HBASE-17852:


{quote}
HBASE-19568 had basically everything that was objected to in the reviews here; why 
wasn't it brought to the attention of the people who raised objections? The 
title/reason of that jira doesn't matter.
I see it as a really sly move - going behind the community and committing changes 
which were heavily objected to, by using a separate jira.
{quote}

[~appy], let's take a step back, please. I called this out to your attention -- 
I was under the impression, based on your earlier comment 
([here|https://issues.apache.org/jira/browse/HBASE-17852?focusedCommentId=16327774&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16327774]),
 that you were OK with this implementation landing in master as-is.

HBASE-19568 was used to commit to master (with what I thought was your 
blessing) while we continue to use this JIRA issue to flesh out the design, because 
of all of the discussion that has happened. If I misunderstood you or poorly 
asked you the question, let's take that over to HBASE-19568 and get a revert in 
place. There was nothing malicious intended to happen here.

> Add Fault tolerance to HBASE-14417 (Support bulk loaded files in incremental 
> backup)
> 
>
> Key: HBASE-17852
> URL: https://issues.apache.org/jira/browse/HBASE-17852
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HBASE-17852-v10.patch, screenshot-1.png
>
>
> Design approach rollback-via-snapshot implemented in this ticket:
> # Before backup create/delete/merge starts we take a snapshot of the backup 
> meta-table (backup system table). This procedure is lightweight because meta 
> table is small, usually should fit a single region.
> # When operation fails on a server side, we handle this failure by cleaning 
> up partial data in backup destination, followed by restoring backup 
> meta-table from a snapshot. 
> # When operation fails on a client side (abnormal termination, for example), 
> next time user will try create/merge/delete he(she) will see error message, 
> that system is in inconsistent state and repair is required, he(she) will 
> need to run backup repair tool.
> # To avoid multiple writers to the backup system table (backup client and 
> BackupObserver's) we introduce small table ONLY to keep listing of bulk 
> loaded files. All backup observers will work only with this new tables. The 
> reason: in case of a failure during backup create/delete/merge/restore, when 
> system performs automatic rollback, some data written by backup observers 
> during failed operation may be lost. This is what we try to avoid.
> # Second table keeps only bulk load related references. We do not care about 
> consistency of this table, because bulk load is idempotent operation and can 
> be repeated after failure. Partially written data in second table does not 
> affect on BackupHFileCleaner plugin, because this data (list of bulk loaded 
> files) correspond to a files which have not been loaded yet successfully and, 
> hence - are not visible to the system 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (HBASE-17852) Add Fault tolerance to HBASE-14417 (Support bulk loaded files in incremental backup)

2018-01-29 Thread Vladimir Rodionov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344228#comment-16344228
 ] 

Vladimir Rodionov edited comment on HBASE-17852 at 1/30/18 12:07 AM:
-

Nope, it turned out that this patch (HBASE-17852) also fixes the issue raised 
in HBASE-19568, that is why it was committed (with refactoring code stripped 
down). No conspiracy here.  Besides this, I thought that we have agreed on 
pushing this to the master branch and continue working on critical changes 
after that? 


was (Author: vrodionov):
Nope, it turned out that this patch (HBASE-17852) also fixes the issue raised 
in HBASE-19568, that is why it was committed (with refactoring code stripped 
down). No conspiracy here.  Besides this, I thought that we agreed on pushing 
this to the master branch and continue working on critical changes after 
that? 

> Add Fault tolerance to HBASE-14417 (Support bulk loaded files in incremental 
> backup)
> 
>
> Key: HBASE-17852
> URL: https://issues.apache.org/jira/browse/HBASE-17852
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HBASE-17852-v10.patch, screenshot-1.png
>
>
> Design approach rollback-via-snapshot implemented in this ticket:
> # Before backup create/delete/merge starts we take a snapshot of the backup 
> meta-table (backup system table). This procedure is lightweight because meta 
> table is small, usually should fit a single region.
> # When operation fails on a server side, we handle this failure by cleaning 
> up partial data in backup destination, followed by restoring backup 
> meta-table from a snapshot. 
> # When operation fails on a client side (abnormal termination, for example), 
> next time user will try create/merge/delete he(she) will see error message, 
> that system is in inconsistent state and repair is required, he(she) will 
> need to run backup repair tool.
> # To avoid multiple writers to the backup system table (backup client and 
> BackupObserver's) we introduce small table ONLY to keep listing of bulk 
> loaded files. All backup observers will work only with this new tables. The 
> reason: in case of a failure during backup create/delete/merge/restore, when 
> system performs automatic rollback, some data written by backup observers 
> during failed operation may be lost. This is what we try to avoid.
> # Second table keeps only bulk load related references. We do not care about 
> consistency of this table, because bulk load is idempotent operation and can 
> be repeated after failure. Partially written data in second table does not 
> affect on BackupHFileCleaner plugin, because this data (list of bulk loaded 
> files) correspond to a files which have not been loaded yet successfully and, 
> hence - are not visible to the system 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (HBASE-17852) Add Fault tolerance to HBASE-14417 (Support bulk loaded files in incremental backup)

2018-01-29 Thread Vladimir Rodionov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344228#comment-16344228
 ] 

Vladimir Rodionov edited comment on HBASE-17852 at 1/30/18 12:05 AM:
-

Nope, it turned out that this patch (HBASE-17852) also fixes the issue raised 
in HBASE-19568, that is why it was committed (with refactoring code stripped 
down). No conspiracy here.  Besides this, I thought that we agreed on pushing 
this to the master branch and continue working on critical changes after 
that? 


was (Author: vrodionov):
Nope, it turned out that this patch (HBASE-17852) also fixes the issue raised 
in HBASE-19568, that is why it was committed (with refactoring code stripped 
down). No conspiracy here.   

> Add Fault tolerance to HBASE-14417 (Support bulk loaded files in incremental 
> backup)
> 
>
> Key: HBASE-17852
> URL: https://issues.apache.org/jira/browse/HBASE-17852
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HBASE-17852-v10.patch, screenshot-1.png
>
>
> Design approach rollback-via-snapshot implemented in this ticket:
> # Before backup create/delete/merge starts we take a snapshot of the backup 
> meta-table (backup system table). This procedure is lightweight because meta 
> table is small, usually should fit a single region.
> # When operation fails on a server side, we handle this failure by cleaning 
> up partial data in backup destination, followed by restoring backup 
> meta-table from a snapshot. 
> # When operation fails on a client side (abnormal termination, for example), 
> next time user will try create/merge/delete he(she) will see error message, 
> that system is in inconsistent state and repair is required, he(she) will 
> need to run backup repair tool.
> # To avoid multiple writers to the backup system table (backup client and 
> BackupObserver's) we introduce small table ONLY to keep listing of bulk 
> loaded files. All backup observers will work only with this new tables. The 
> reason: in case of a failure during backup create/delete/merge/restore, when 
> system performs automatic rollback, some data written by backup observers 
> during failed operation may be lost. This is what we try to avoid.
> # Second table keeps only bulk load related references. We do not care about 
> consistency of this table, because bulk load is idempotent operation and can 
> be repeated after failure. Partially written data in second table does not 
> affect on BackupHFileCleaner plugin, because this data (list of bulk loaded 
> files) correspond to a files which have not been loaded yet successfully and, 
> hence - are not visible to the system 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-17852) Add Fault tolerance to HBASE-14417 (Support bulk loaded files in incremental backup)

2018-01-29 Thread Vladimir Rodionov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344228#comment-16344228
 ] 

Vladimir Rodionov commented on HBASE-17852:
---

Nope, it turned out that this patch (HBASE-17852) also fixes the issue raised 
in HBASE-19568, that is why it was committed (with refactoring code stripped 
down). No conspiracy here.   

> Add Fault tolerance to HBASE-14417 (Support bulk loaded files in incremental 
> backup)
> 
>
> Key: HBASE-17852
> URL: https://issues.apache.org/jira/browse/HBASE-17852
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HBASE-17852-v10.patch, screenshot-1.png
>
>
> Design approach rollback-via-snapshot implemented in this ticket:
> # Before backup create/delete/merge starts we take a snapshot of the backup 
> meta-table (backup system table). This procedure is lightweight because meta 
> table is small, usually should fit a single region.
> # When operation fails on a server side, we handle this failure by cleaning 
> up partial data in backup destination, followed by restoring backup 
> meta-table from a snapshot. 
> # When operation fails on a client side (abnormal termination, for example), 
> next time user will try create/merge/delete he(she) will see error message, 
> that system is in inconsistent state and repair is required, he(she) will 
> need to run backup repair tool.
> # To avoid multiple writers to the backup system table (backup client and 
> BackupObserver's) we introduce small table ONLY to keep listing of bulk 
> loaded files. All backup observers will work only with this new tables. The 
> reason: in case of a failure during backup create/delete/merge/restore, when 
> system performs automatic rollback, some data written by backup observers 
> during failed operation may be lost. This is what we try to avoid.
> # Second table keeps only bulk load related references. We do not care about 
> consistency of this table, because bulk load is idempotent operation and can 
> be repeated after failure. Partially written data in second table does not 
> affect on BackupHFileCleaner plugin, because this data (list of bulk loaded 
> files) correspond to a files which have not been loaded yet successfully and, 
> hence - are not visible to the system 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-19725) Build fails, unable to read hbase/checkstyle-suppressions.xml "invalid distance too far back"

2018-01-29 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344222#comment-16344222
 ] 

stack commented on HBASE-19725:
---

This is fixed by -HBASE-19780-

Resolved

I'll not apply the change below, which would undo a workaround that skips 
checkstyle as part of the site build... it does no harm (checkstyle is in place when 
we do the initial build when we make a release candidate).
{code:java}
diff --git a/dev-support/make_rc.sh b/dev-support/make_rc.sh
index f067ee9..1424249 100755
--- a/dev-support/make_rc.sh
+++ b/dev-support/make_rc.sh
@@ -78,8 +78,7 @@ function build_bin {
MAVEN_OPTS="${mvnopts}" ${mvn} clean install -DskipTests \
-Papache-release -Prelease \
-Dmaven.repo.local=${output_dir}/repository
- MAVEN_OPTS="${mvnopts}" ${mvn} install -DskipTests \
- -Dcheckstyle.skip=true site assembly:single \
+ MAVEN_OPTS="${mvnopts}" ${mvn} install -DskipTests assembly:single \
-Papache-release -Prelease \
-Dmaven.repo.local=${output_dir}/repository
mv ./hbase-assembly/target/hbase-*.tar.gz "${output_dir}"
{code}
 

> Build fails, unable to read hbase/checkstyle-suppressions.xml "invalid 
> distance too far back"
> -
>
> Key: HBASE-19725
> URL: https://issues.apache.org/jira/browse/HBASE-19725
> Project: HBase
>  Issue Type: Sub-task
>Reporter: stack
>Assignee: stack
>Priority: Blocker
> Fix For: 2.0.0-beta-2
>
>
> Build is failing on me (trying to cut a beta-1 RC on branch-2). It is the first 
> time we use the jars made by hbase-checkstyle in the hbase-error-prone 
> module, under the 'build support' module, when running the 'site' target. It is 
> trying to make the checkstyle report.
> I see that we find the right jar to read:
> [DEBUG] The resource 'hbase/checkstyle-suppressions.xml' was found as 
> jar:file:/home/stack/rc/hbase-2.0.0-beta-1.20180107T061305Z/repository/org/apache/hbase/hbase-checkstyle/2.0.0-beta-1/hbase-checkstyle-2.0.0-beta-1.jar!/hbase/checkstyle-suppressions.xml.
> But then it thinks the jar is corrupt: 'ZipException: invalid distance too far 
> back'.
> Here is mvn output:
> 12667058 [ERROR] Failed to execute goal 
> org.apache.maven.plugins:maven-checkstyle-plugin:2.17:check (checkstyle) on 
> project hbase-error-prone: Failed during checkstyle execution: 
> Unable to process suppressions file location: 
> hbase/checkstyle-suppressions.xml: Cannot create file-based resource:invalid 
> distance too far back -> [Help 1]
> 12667059 org.apache.maven.lifecycle.LifecycleExecutionException: Failed to 
> execute goal org.apache.maven.plugins:maven-checkstyle-plugin:2.17:check 
> (checkstyle) on project hbase-error-prone: Failed during checkstyle 
> execution
> I'm running this command:
> mvn -X install -DskipTests site assembly:single -Papache-release -Prelease 
> -Dmaven.repo.local=//home/stack/rc/hbase-2.0.0-beta-1.20180107T061305Z/repository
> Apache Maven 3.3.9 (bb52d8502b132ec0a5a3f4c09453c07478323dc5; 
> 2015-11-10T08:41:47-08:00) 
> Java version: 1.8.0_151, vendor: Oracle Corporation



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (HBASE-19725) Build fails, unable to read hbase/checkstyle-suppressions.xml "invalid distance too far back"

2018-01-29 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack resolved HBASE-19725.
---
Resolution: Not A Problem

> Build fails, unable to read hbase/checkstyle-suppressions.xml "invalid 
> distance too far back"
> -
>
> Key: HBASE-19725
> URL: https://issues.apache.org/jira/browse/HBASE-19725
> Project: HBase
>  Issue Type: Sub-task
>Reporter: stack
>Assignee: stack
>Priority: Blocker
> Fix For: 2.0.0-beta-2
>
>
> Build is failing on me (trying to cut a beta-1 RC on branch-2). It is the first 
> time we use the jars made by hbase-checkstyle in the hbase-error-prone 
> module, under the 'build support' module, when running the 'site' target. It is 
> trying to make the checkstyle report.
> I see that we find the right jar to read:
> [DEBUG] The resource 'hbase/checkstyle-suppressions.xml' was found as 
> jar:file:/home/stack/rc/hbase-2.0.0-beta-1.20180107T061305Z/repository/org/apache/hbase/hbase-checkstyle/2.0.0-beta-1/hbase-checkstyle-2.0.0-beta-1.jar!/hbase/checkstyle-suppressions.xml.
> But then it thinks the jar is corrupt: 'ZipException: invalid distance too far 
> back'.
> Here is mvn output:
> 12667058 [ERROR] Failed to execute goal 
> org.apache.maven.plugins:maven-checkstyle-plugin:2.17:check (checkstyle) on 
> project hbase-error-prone: Failed during checkstyle execution: 
> Unable to process suppressions file location: 
> hbase/checkstyle-suppressions.xml: Cannot create file-based resource:invalid 
> distance too far back -> [Help 1]
> 12667059 org.apache.maven.lifecycle.LifecycleExecutionException: Failed to 
> execute goal org.apache.maven.plugins:maven-checkstyle-plugin:2.17:check 
> (checkstyle) on project hbase-error-prone: Failed during checkstyle 
> execution
> I'm running this command:
> mvn -X install -DskipTests site assembly:single -Papache-release -Prelease 
> -Dmaven.repo.local=//home/stack/rc/hbase-2.0.0-beta-1.20180107T061305Z/repository
> Apache Maven 3.3.9 (bb52d8502b132ec0a5a3f4c09453c07478323dc5; 
> 2015-11-10T08:41:47-08:00) 
> Java version: 1.8.0_151, vendor: Oracle Corporation



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-17852) Add Fault tolerance to HBASE-14417 (Support bulk loaded files in incremental backup)

2018-01-29 Thread Appy (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344219#comment-16344219
 ] 

Appy commented on HBASE-17852:
--

Forget all the design discussion, that's not important anymore.

HBASE-19568 had basically everything that was objected to in the reviews here; why 
wasn't it brought to the attention of the people who raised objections? The 
title/reason of that jira doesn't matter.
I see it as a really sly move - going behind the community and committing changes 
which were heavily objected to, by using a separate jira.

Ping reviewers of other jira: [~elserj] [~tedyu] 
Ping [~stack] [~apurtell]

> Add Fault tolerance to HBASE-14417 (Support bulk loaded files in incremental 
> backup)
> 
>
> Key: HBASE-17852
> URL: https://issues.apache.org/jira/browse/HBASE-17852
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HBASE-17852-v10.patch, screenshot-1.png
>
>
> Design approach rollback-via-snapshot implemented in this ticket:
> # Before backup create/delete/merge starts we take a snapshot of the backup 
> meta-table (backup system table). This procedure is lightweight because meta 
> table is small, usually should fit a single region.
> # When operation fails on a server side, we handle this failure by cleaning 
> up partial data in backup destination, followed by restoring backup 
> meta-table from a snapshot. 
> # When operation fails on a client side (abnormal termination, for example), 
> next time user will try create/merge/delete he(she) will see error message, 
> that system is in inconsistent state and repair is required, he(she) will 
> need to run backup repair tool.
> # To avoid multiple writers to the backup system table (backup client and 
> BackupObserver's) we introduce small table ONLY to keep listing of bulk 
> loaded files. All backup observers will work only with this new tables. The 
> reason: in case of a failure during backup create/delete/merge/restore, when 
> system performs automatic rollback, some data written by backup observers 
> during failed operation may be lost. This is what we try to avoid.
> # Second table keeps only bulk load related references. We do not care about 
> consistency of this table, because bulk load is idempotent operation and can 
> be repeated after failure. Partially written data in second table does not 
> affect on BackupHFileCleaner plugin, because this data (list of bulk loaded 
> files) correspond to a files which have not been loaded yet successfully and, 
> hence - are not visible to the system 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (HBASE-17852) Add Fault tolerance to HBASE-14417 (Support bulk loaded files in incremental backup)

2018-01-29 Thread Appy (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344160#comment-16344160
 ] 

Appy edited comment on HBASE-17852 at 1/29/18 11:10 PM:


I see only patch v10 in the attached files, and all it's doing is changing the 
name of BackupSystemTable to BackupMetaTable. It's far from what the title says - 
"Add Fault Tolerance". What am I missing?

{color:red}Edit: {color} *Please never delete attachments which formed the 
basis of earlier discussions in a jira*


was (Author: appy):
I see only patch v10 in the attached files, and all it's doing is changing the 
name of BackupSystemTable to BackupMetaTable. It's far from what the title says - 
"Add Fault Tolerance". What am I missing?

*Please never delete attachments which formed the basis of earlier discussions 
in a jira*

> Add Fault tolerance to HBASE-14417 (Support bulk loaded files in incremental 
> backup)
> 
>
> Key: HBASE-17852
> URL: https://issues.apache.org/jira/browse/HBASE-17852
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HBASE-17852-v10.patch, screenshot-1.png
>
>
> Design approach rollback-via-snapshot implemented in this ticket:
> # Before backup create/delete/merge starts we take a snapshot of the backup 
> meta-table (backup system table). This procedure is lightweight because meta 
> table is small, usually should fit a single region.
> # When operation fails on a server side, we handle this failure by cleaning 
> up partial data in backup destination, followed by restoring backup 
> meta-table from a snapshot. 
> # When operation fails on a client side (abnormal termination, for example), 
> next time user will try create/merge/delete he(she) will see error message, 
> that system is in inconsistent state and repair is required, he(she) will 
> need to run backup repair tool.
> # To avoid multiple writers to the backup system table (backup client and 
> BackupObserver's) we introduce small table ONLY to keep listing of bulk 
> loaded files. All backup observers will work only with this new tables. The 
> reason: in case of a failure during backup create/delete/merge/restore, when 
> system performs automatic rollback, some data written by backup observers 
> during failed operation may be lost. This is what we try to avoid.
> # Second table keeps only bulk load related references. We do not care about 
> consistency of this table, because bulk load is idempotent operation and can 
> be repeated after failure. Partially written data in second table does not 
> affect on BackupHFileCleaner plugin, because this data (list of bulk loaded 
> files) correspond to a files which have not been loaded yet successfully and, 
> hence - are not visible to the system 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (HBASE-17852) Add Fault tolerance to HBASE-14417 (Support bulk loaded files in incremental backup)

2018-01-29 Thread Appy (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344160#comment-16344160
 ] 

Appy edited comment on HBASE-17852 at 1/29/18 11:09 PM:


I see only patch v10 in the attached files, and all it's doing is changing the 
name of BackupSystemTable to BackupMetaTable. It's far from what the title says - 
"Add Fault Tolerance". What am I missing?

*Please never delete attachments which formed the basis of earlier discussions 
in a jira*


was (Author: appy):
I see only patch v10 in the attached files, and all it's doing is changing the 
name of BackupSystemTable to BackupMetaTable. It's far from what the title says - 
"Add Fault Tolerance". What am I missing?

> Add Fault tolerance to HBASE-14417 (Support bulk loaded files in incremental 
> backup)
> 
>
> Key: HBASE-17852
> URL: https://issues.apache.org/jira/browse/HBASE-17852
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HBASE-17852-v10.patch, screenshot-1.png
>
>
> Design approach rollback-via-snapshot implemented in this ticket:
> # Before backup create/delete/merge starts we take a snapshot of the backup 
> meta-table (backup system table). This procedure is lightweight because meta 
> table is small, usually should fit a single region.
> # When operation fails on a server side, we handle this failure by cleaning 
> up partial data in backup destination, followed by restoring backup 
> meta-table from a snapshot. 
> # When operation fails on a client side (abnormal termination, for example), 
> next time user will try create/merge/delete he(she) will see error message, 
> that system is in inconsistent state and repair is required, he(she) will 
> need to run backup repair tool.
> # To avoid multiple writers to the backup system table (backup client and 
> BackupObserver's) we introduce small table ONLY to keep listing of bulk 
> loaded files. All backup observers will work only with this new tables. The 
> reason: in case of a failure during backup create/delete/merge/restore, when 
> system performs automatic rollback, some data written by backup observers 
> during failed operation may be lost. This is what we try to avoid.
> # Second table keeps only bulk load related references. We do not care about 
> consistency of this table, because bulk load is idempotent operation and can 
> be repeated after failure. Partially written data in second table does not 
> affect on BackupHFileCleaner plugin, because this data (list of bulk loaded 
> files) correspond to a files which have not been loaded yet successfully and, 
> hence - are not visible to the system 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-17852) Add Fault tolerance to HBASE-14417 (Support bulk loaded files in incremental backup)

2018-01-29 Thread Vladimir Rodionov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344163#comment-16344163
 ] 

Vladimir Rodionov commented on HBASE-17852:
---

I will quote myself

{quote}

I will rebase the patch to the current master. The majority of this code (but not 
all) went into master in HBASE-19568 btw.

{quote}

> Add Fault tolerance to HBASE-14417 (Support bulk loaded files in incremental 
> backup)
> 
>
> Key: HBASE-17852
> URL: https://issues.apache.org/jira/browse/HBASE-17852
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HBASE-17852-v10.patch, screenshot-1.png
>
>
> Design approach rollback-via-snapshot implemented in this ticket:
> # Before backup create/delete/merge starts we take a snapshot of the backup 
> meta-table (backup system table). This procedure is lightweight because meta 
> table is small, usually should fit a single region.
> # When operation fails on a server side, we handle this failure by cleaning 
> up partial data in backup destination, followed by restoring backup 
> meta-table from a snapshot. 
> # When operation fails on a client side (abnormal termination, for example), 
> next time user will try create/merge/delete he(she) will see error message, 
> that system is in inconsistent state and repair is required, he(she) will 
> need to run backup repair tool.
> # To avoid multiple writers to the backup system table (backup client and 
> BackupObserver's) we introduce small table ONLY to keep listing of bulk 
> loaded files. All backup observers will work only with this new tables. The 
> reason: in case of a failure during backup create/delete/merge/restore, when 
> system performs automatic rollback, some data written by backup observers 
> during failed operation may be lost. This is what we try to avoid.
> # Second table keeps only bulk load related references. We do not care about 
> consistency of this table, because bulk load is idempotent operation and can 
> be repeated after failure. Partially written data in second table does not 
> affect on BackupHFileCleaner plugin, because this data (list of bulk loaded 
> files) correspond to a files which have not been loaded yet successfully and, 
> hence - are not visible to the system 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-17852) Add Fault tolerance to HBASE-14417 (Support bulk loaded files in incremental backup)

2018-01-29 Thread Appy (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344160#comment-16344160
 ] 

Appy commented on HBASE-17852:
--

I see only patch v10 in the attached files, and all it's doing is changing the 
name of BackupSystemTable to BackupMetaTable. It's far from what the title says - 
"Add Fault Tolerance". What am I missing?

> Add Fault tolerance to HBASE-14417 (Support bulk loaded files in incremental 
> backup)
> 
>
> Key: HBASE-17852
> URL: https://issues.apache.org/jira/browse/HBASE-17852
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HBASE-17852-v10.patch, screenshot-1.png
>
>
> Design approach rollback-via-snapshot implemented in this ticket:
> # Before backup create/delete/merge starts we take a snapshot of the backup 
> meta-table (backup system table). This procedure is lightweight because meta 
> table is small, usually should fit a single region.
> # When operation fails on a server side, we handle this failure by cleaning 
> up partial data in backup destination, followed by restoring backup 
> meta-table from a snapshot. 
> # When operation fails on a client side (abnormal termination, for example), 
> next time user will try create/merge/delete he(she) will see error message, 
> that system is in inconsistent state and repair is required, he(she) will 
> need to run backup repair tool.
> # To avoid multiple writers to the backup system table (backup client and 
> BackupObserver's) we introduce small table ONLY to keep listing of bulk 
> loaded files. All backup observers will work only with this new tables. The 
> reason: in case of a failure during backup create/delete/merge/restore, when 
> system performs automatic rollback, some data written by backup observers 
> during failed operation may be lost. This is what we try to avoid.
> # Second table keeps only bulk load related references. We do not care about 
> consistency of this table, because bulk load is idempotent operation and can 
> be repeated after failure. Partially written data in second table does not 
> affect on BackupHFileCleaner plugin, because this data (list of bulk loaded 
> files) correspond to a files which have not been loaded yet successfully and, 
> hence - are not visible to the system 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-19889) Revert Workaround: Purge User API building from branch-2 so can make a beta-1

2018-01-29 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-19889:
--
Fix Version/s: (was: 2.0.0-beta-2)

> Revert Workaround: Purge User API building from branch-2 so can make a beta-1
> -
>
> Key: HBASE-19889
> URL: https://issues.apache.org/jira/browse/HBASE-19889
> Project: HBase
>  Issue Type: Sub-task
>  Components: site
>Reporter: stack
>Assignee: stack
>Priority: Major
>
> Root fix looks to be  -HBASE-19780-
> Let me try it



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-19663) site build fails complaining "javadoc: error - class file for javax.annotation.meta.TypeQualifierNickname not found"

2018-01-29 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-19663:
--
Fix Version/s: (was: 2.0.0)
   2.0.0-beta-2

> site build fails complaining "javadoc: error - class file for 
> javax.annotation.meta.TypeQualifierNickname not found"
> 
>
> Key: HBASE-19663
> URL: https://issues.apache.org/jira/browse/HBASE-19663
> Project: HBase
>  Issue Type: Bug
>  Components: site
>Reporter: stack
>Assignee: stack
>Priority: Blocker
> Fix For: 2.0.0-beta-2
>
> Attachments: script.sh
>
>
> Cryptic failure trying to build beta-1 RC. Fails like this:
> {code}
> [INFO] BUILD FAILURE
> [INFO] 
> 
> [INFO] Total time: 03:54 min
> [INFO] Finished at: 2017-12-29T01:13:15-08:00
> [INFO] Final Memory: 381M/9165M
> [INFO] 
> 
> [ERROR] Failed to execute goal 
> org.apache.maven.plugins:maven-site-plugin:3.4:site (default-site) on project 
> hbase: Error generating maven-javadoc-plugin:2.10.3:aggregate:
> [ERROR] Exit code: 1 - warning: unknown enum constant When.ALWAYS
> [ERROR] reason: class file for javax.annotation.meta.When not found
> [ERROR] warning: unknown enum constant When.UNKNOWN
> [ERROR] warning: unknown enum constant When.MAYBE
> [ERROR] 
> /home/stack/hbase.git/hbase-common/src/main/java/org/apache/hadoop/hbase/CellUtil.java:762:
>  warning - Tag @link: malformed: "#matchingRows(Cell, byte[]))"
> [ERROR] 
> /home/stack/hbase.git/hbase-common/src/main/java/org/apache/hadoop/hbase/CellUtil.java:762:
>  warning - Tag @link: reference not found: #matchingRows(Cell, byte[]))
> [ERROR] 
> /home/stack/hbase.git/hbase-common/src/main/java/org/apache/hadoop/hbase/CellUtil.java:762:
>  warning - Tag @link: reference not found: #matchingRows(Cell, byte[]))
> [ERROR] javadoc: warning - Class javax.annotation.Nonnull not found.
> [ERROR] javadoc: error - class file for 
> javax.annotation.meta.TypeQualifierNickname not found
> [ERROR]
> [ERROR] Command line was: /home/stack/bin/jdk1.8.0_151/jre/../bin/javadoc 
> -J-Xmx2G @options @packages
> [ERROR]
> [ERROR] Refer to the generated Javadoc files in 
> '/home/stack/hbase.git/target/site/apidocs' dir.
> [ERROR] -> [Help 1]
> [ERROR]
> [ERROR] To see the full stack trace of the errors, re-run Maven with the -e 
> switch.
> [ERROR] Re-run Maven using the -X switch to enable full debug logging.
> [ERROR]
> [ERROR] For more information about the errors and possible solutions, please 
> read the following articles:
> [ERROR] [Help 1] 
> http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
> {code}
> javax.annotation.meta.TypeQualifierNickname comes from jsr305, but we don't 
> include that anywhere according to mvn dependency.
> Happens building the User API both test and main.
> Excluding these lines gets us passing again:
> {code}
>   3511   <doclet>
>   3512     org.apache.yetus.audience.tools.IncludePublicAnnotationsStandardDoclet
>   3513   </doclet>
>   3514   <docletArtifact>
>   3515     <groupId>org.apache.yetus</groupId>
>   3516     <artifactId>audience-annotations</artifactId>
>   3517     <version>${audience-annotations.version}</version>
>   3518   </docletArtifact>
> + 3519   <useStandardDocletOptions>true</useStandardDocletOptions>
> {code}
> Tried upgrading to a newer mvn site (ours is three years old) but that hit a 
> different set of problems.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

