[jira] [Commented] (HBASE-21281) Update bouncycastle dependency.

2018-10-22 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16660126#comment-16660126
 ] 

Hudson commented on HBASE-21281:


Results for branch branch-2.0
[build #998 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/998/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/998//General_Nightly_Build_Report/]




(/) {color:green}+1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/998//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/998//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


> Update bouncycastle dependency.
> ---
>
> Key: HBASE-21281
> URL: https://issues.apache.org/jira/browse/HBASE-21281
> Project: HBase
>  Issue Type: Task
>  Components: dependencies, test
>Reporter: Josh Elser
>Assignee: Josh Elser
>Priority: Major
> Fix For: 3.0.0, 2.2.0
>
> Attachments: 21281.addendum.patch, 21281.addendum2.patch, 
> HBASE-21281.001.branch-2.0.patch
>
>
> Looks like we still depend on bcprov-jdk16 for some x509 certificate 
> generation in our tests. Bouncycastle has moved beyond this in 1.47, changing 
> the artifact names.
> [http://www.bouncycastle.org/wiki/display/JA1/Porting+from+earlier+BC+releases+to+1.47+and+later]
> There are some API changes too, but it looks like we don't use any of these.
> It seems like we also have vestiges in the POMs from when we were depending 
> on a specific BC version that came in from Hadoop. We now have a 
> KeyStoreTestUtil class in HBase, which makes me think we can also clean up 
> some dependencies.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20973) ArrayIndexOutOfBoundsException when rolling back procedure

2018-10-22 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16660116#comment-16660116
 ] 

Hadoop QA commented on HBASE-20973:
---

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 17s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:orange}-0{color} | {color:orange} test4tests {color} | {color:orange} 0m 0s{color} | {color:orange} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} branch-2.0 Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 16s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 24s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 34s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 45s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 4m 27s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 0s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 53s{color} | {color:green} branch-2.0 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 15s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 3s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 4m 20s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 9m 55s{color} | {color:green} Patch does not cause any errors with Hadoop 2.6.5 2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 51s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 34s{color} | {color:green} hbase-procedure in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}116m 55s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 38s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}164m 35s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:6f01af0 |
| JIRA Issue | HBASE-20973 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12945141/HBASE-20973.branch-2.0.002.patch |
| Optional Tests |  dupname  asflicense  javac  javadoc  unit  findbugs  shadedjars  hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux 9efa62300afb 3.13.0-143-generic #192-Ubuntu SMP Tue Feb 27 10:45:36 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
| 

[jira] [Reopened] (HBASE-21354) Procedure may be deleted improperly during master restarts resulting in 'Corrupt'

2018-10-22 Thread Duo Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang reopened HBASE-21354:
---

Compile error on master.

> Procedure may be deleted improperly during master restarts resulting in 
> 'Corrupt'
> -
>
> Key: HBASE-21354
> URL: https://issues.apache.org/jira/browse/HBASE-21354
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.1.0, 2.0.2
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Major
> Fix For: 3.0.0, 2.2.0, 2.1.1, 2.0.3
>
> Attachments: HBASE-21354.branch-2.0.001.patch, 
> HBASE-21354.branch-2.0.002.patch, HBASE-21354.branch-2.0.003.patch, 
> HBASE-21354.branch-2.0.004.patch
>
>
> Good news! [~stack], [~Apache9], I may have found the root cause of the mysterious 
> ‘Corrupted procedure’ errors and of procedures disappearing after master 
> restarts (seen during ITBLL).
> During a master restart, we load procedures from the logs and 
> build the 'holdingCleanupTracker' according to each log's tracker. We may mark 
> a procedure in the oldest log as deleted if one log doesn't contain the 
> procedure. This is inappropriate, since a log will not contain any info about a 
> procedure that was not updated while that log was active. We should delete the 
> procedure only if it is not in the global tracker, which has the whole 
> picture.
> {code}
> trackerNode = tracker.lookupClosestNode(trackerNode, procId);
> if (trackerNode == null || !trackerNode.contains(procId) ||
>   trackerNode.isModified(procId)) {
>   // the procedure was removed or modified
>   node.delete(procId);
> }
> {code}
> A test case in the patch (testProcedureShouldNotCleanOnLoad) shows clearly how the 
> corruption happens.
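
To make the description above concrete, here is a minimal sketch of the two rules. It is not the code from the patch: each tracker is reduced to plain sets of procedure ids, and the class and method names (`CleanupTrackerSketch`, `holdingPerLog`, `holdingGlobal`) are hypothetical. It assumes the newer log's tracker is partial, i.e. it only knows the ids touched while that log was active.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical, simplified model: trackers are plain sets of procedure ids.
final class CleanupTrackerSketch {

  // Rule from the quoted snippet: drop an id from the oldest log's holding set
  // when a newer log's tracker does not contain it, or has modified it.
  static Set<Long> holdingPerLog(Set<Long> oldestLog, Set<Long> contained, Set<Long> modified) {
    Set<Long> holding = new HashSet<>(oldestLog);
    holding.removeIf(id -> !contained.contains(id) || modified.contains(id));
    return holding;
  }

  // Rule suggested in the description: drop an id only when a newer log has
  // modified it, or when the global tracker no longer knows it (it was deleted).
  static Set<Long> holdingGlobal(Set<Long> oldestLog, Set<Long> modified, Set<Long> globalTracker) {
    Set<Long> holding = new HashSet<>(oldestLog);
    holding.removeIf(id -> modified.contains(id) || !globalTracker.contains(id));
    return holding;
  }
}
```

With an oldest log holding ids 1, 2 and 3, a newer partial tracker that only touched id 2, and a global tracker containing 1 and 2, the first rule leaves nothing held, so the oldest log looks safe to delete even though it carries the only state of procedure 1. The second rule keeps id 1.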





[jira] [Commented] (HBASE-21354) Procedure may be deleted improperly during master restarts resulting in 'Corrupt'

2018-10-22 Thread Duo Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16660111#comment-16660111
 ] 

Duo Zhang commented on HBASE-21354:
---

I mean you need to mention that we may fail to persist the store tracker for a 
proc WAL file; this is the root cause here... But I still haven't seen this in 
the comments...






[jira] [Commented] (HBASE-20973) ArrayIndexOutOfBoundsException when rolling back procedure

2018-10-22 Thread Duo Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16660099#comment-16660099
 ] 

Duo Zhang commented on HBASE-20973:
---

You can commit it for now, as it solves the problem. We still need to find out 
why the max size limit does not work here, maybe in another issue.

Thanks.

> ArrayIndexOutOfBoundsException when rolling back procedure
> --
>
> Key: HBASE-20973
> URL: https://issues.apache.org/jira/browse/HBASE-20973
> Project: HBase
>  Issue Type: Sub-task
>  Components: amv2
>Affects Versions: 2.1.0, 2.0.1
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Critical
> Fix For: 3.0.0, 2.2.0, 2.1.1, 2.0.3
>
> Attachments: HBASE-20973.branch-2.0.001.patch, 
> HBASE-20973.branch-2.0.002.patch
>
>
> Found this one while investigating HBASE-20921. After the root 
> procedure (ModifyTableProcedure in this case) rolled back, an 
> ArrayIndexOutOfBoundsException was thrown:
> {code}
> 2018-07-18 01:39:10,241 ERROR [PEWorker-8] procedure2.ProcedureExecutor(159): CODE-BUG: Uncaught runtime exception for pid=5973, state=FAILED:MODIFY_TABLE_REOPEN_ALL_REGIONS, exception=java.lang.NullPointerException via CODE-BUG: Uncaught runtime exception: pid=5974, ppid=5973, state=RUNNABLE:REOPEN_TABLE_REGIONS_CONFIRM_REOPENED; ReopenTableRegionsProcedure table=IntegrationTestBigLinkedList:java.lang.NullPointerException; ModifyTableProcedure table=IntegrationTestBigLinkedList
> java.lang.UnsupportedOperationException: unhandled state=MODIFY_TABLE_REOPEN_ALL_REGIONS
> at org.apache.hadoop.hbase.master.procedure.ModifyTableProcedure.rollbackState(ModifyTableProcedure.java:147)
> at org.apache.hadoop.hbase.master.procedure.ModifyTableProcedure.rollbackState(ModifyTableProcedure.java:50)
> at org.apache.hadoop.hbase.procedure2.StateMachineProcedure.rollback(StateMachineProcedure.java:203)
> at org.apache.hadoop.hbase.procedure2.Procedure.doRollback(Procedure.java:864)
> at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1353)
> at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1309)
> at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1178)
> at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$800(ProcedureExecutor.java:75)
> at org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1741)
> 2018-07-18 01:39:10,243 WARN  [PEWorker-8] procedure2.ProcedureExecutor(1756): Worker terminating UNNATURALLY null
> java.lang.ArrayIndexOutOfBoundsException: 1
> at org.apache.hadoop.hbase.procedure2.store.ProcedureStoreTracker$BitSetNode.updateState(ProcedureStoreTracker.java:405)
> at org.apache.hadoop.hbase.procedure2.store.ProcedureStoreTracker$BitSetNode.delete(ProcedureStoreTracker.java:178)
> at org.apache.hadoop.hbase.procedure2.store.ProcedureStoreTracker.delete(ProcedureStoreTracker.java:513)
> at org.apache.hadoop.hbase.procedure2.store.ProcedureStoreTracker.delete(ProcedureStoreTracker.java:505)
> at org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.updateStoreTracker(WALProcedureStore.java:741)
> at org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.pushData(WALProcedureStore.java:691)
> at org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.delete(WALProcedureStore.java:603)
> at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1387)
> at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1309)
> at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1178)
> at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$800(ProcedureExecutor.java:75)
> at org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1741)
> {code}
> This is a very serious condition. After this exception was thrown, the exclusive 
> lock held by ModifyTableProcedure was never released, and all procedures 
> against this table were blocked until the master restarted. Since the 
> lock info for the procedure is not restored on restart, the other procedures could 
> then proceed. It is quite embarrassing that a bug saved us... (this bug will be fixed 
> in HBASE-20846)
> I tried to reproduce this one using the test case in HBASE-20921, but I just 
> couldn't.
> An easy way to resolve this is to add a try/catch, making sure that no matter what 
> happens, the table's 

[jira] [Commented] (HBASE-21354) Procedure may be deleted improperly during master restarts resulting in 'Corrupt'

2018-10-22 Thread Xu Cang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16660095#comment-16660095
 ] 

Xu Cang commented on HBASE-21354:
-

Just noticed that this branch-2 patch got pushed to master and causes a compile 
issue: the #restart signature does not match.

[~allan163] Thanks.






[jira] [Commented] (HBASE-20952) Re-visit the WAL API

2018-10-22 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16660073#comment-16660073
 ] 

Hudson commented on HBASE-20952:


Results for branch HBASE-20952
[build #26 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-20952/26/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-20952/26//General_Nightly_Build_Report/]




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-20952/26//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-20952/26//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> Re-visit the WAL API
> 
>
> Key: HBASE-20952
> URL: https://issues.apache.org/jira/browse/HBASE-20952
> Project: HBase
>  Issue Type: Improvement
>  Components: wal
>Reporter: Josh Elser
>Priority: Major
> Attachments: 20952.v1.txt
>
>
> Take a step back from the current WAL implementations and think about what an 
> HBase WAL API should look like. What are the primitive calls that we require 
> to guarantee durability of writes with a high degree of performance?
> The API needs to take the current implementations into consideration. We 
> should also have a mind for what is happening in the Ratis LogService (but 
> the LogService should not dictate what HBase's WAL API looks like; see RATIS-272).
> Other "systems" inside of HBase that use WALs are replication and 
> backup. Replication has the use-case for "tail"'ing the WAL, which we 
> should provide via our new API. Backup doesn't do anything fancy (IIRC). We 
> should make sure all consumers are generally going to be OK with the API we 
> create.
> The API may be "OK" (or OK in part). We also need to consider other methods 
> which were "bolted" on, such as {{AbstractFSWAL}} and 
> {{WALFileLengthProvider}}. Other corners of "WAL use" (like the 
> {{WALSplitter}}) should also be looked at to use WAL APIs only.
> We also need to make sure that adequate interface audience and stability 
> annotations are chosen.





[jira] [Updated] (HBASE-21356) bulkLoadHFile API should ensure that rs has the source hfile's write permission

2018-10-22 Thread Zheng Hu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zheng Hu updated HBASE-21356:
-
  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

> bulkLoadHFile API should ensure that rs has the source hfile's write 
> permission
> ---
>
> Key: HBASE-21356
> URL: https://issues.apache.org/jira/browse/HBASE-21356
> Project: HBase
>  Issue Type: Bug
>Reporter: Zheng Hu
>Assignee: Zheng Hu
>Priority: Major
> Fix For: 3.0.0, 2.2.0, 2.0.3, 2.1.2
>
> Attachments: HBASE-21356.v1.patch
>
>
> If the RS bulk loads an HFile but has no write permission on it, we can read & 
> compact the HFile, but after the compaction finishes the HFile will be 
> moved to the archive directory, where the HFileCleaner won't have permission to 
> delete it, so the HFile will be kept in HDFS forever. 
> We need to check the file's write permission when running bulkLoadHFile on the 
> server side and reject the request if there is no write permission.
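
The proposed server-side check could look roughly like the sketch below. This is an illustration only, not the attached patch: it uses `java.nio.file` on a local path as a stand-in for HDFS, and the class and method names are hypothetical. A real implementation would have to ask the Hadoop file system for the permission instead.

```java
import java.io.UncheckedIOException;
import java.nio.file.AccessDeniedException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch: reject a bulk load up front when the source file is not writable.
final class BulkLoadPermissionSketch {

  // Local-filesystem stand-in for the HDFS permission check described above.
  // Note that Files.isWritable also returns false for a file that does not exist.
  static void checkWritable(Path hfile) {
    if (!Files.isWritable(hfile)) {
      throw new UncheckedIOException(new AccessDeniedException(
          hfile.toString(), null, "no write permission on source file, rejecting bulk load"));
    }
  }
}
```

Failing at load time keeps the problem visible to the caller, instead of leaving an undeletable file in the archive directory after compaction.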





[jira] [Commented] (HBASE-21360) Disable printing of stack-trace in shell for quotas

2018-10-22 Thread Sakthi (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16660067#comment-16660067
 ] 

Sakthi commented on HBASE-21360:


Can this be changed to an umbrella?

> Disable printing of stack-trace in shell for quotas
> ---
>
> Key: HBASE-21360
> URL: https://issues.apache.org/jira/browse/HBASE-21360
> Project: HBase
>  Issue Type: Improvement
>  Components: shell
>Reporter: Sakthi
>Assignee: Sakthi
>Priority: Minor
>
> Filing this jira to track sub-jiras that can be potentially created under 
> this umbrella. Description of each jira can be found in the respective jiras.





[jira] [Created] (HBASE-21362) Disable printing of stack-trace in shell when quotas are violated

2018-10-22 Thread Sakthi (JIRA)
Sakthi created HBASE-21362:
--

 Summary: Disable printing of stack-trace in shell when quotas are 
violated
 Key: HBASE-21362
 URL: https://issues.apache.org/jira/browse/HBASE-21362
 Project: HBase
  Issue Type: Sub-task
  Components: shell
Reporter: Sakthi
Assignee: Sakthi


When quotas are violated, a 'Quota violated on #table. Due to #ViolationPolicy, 
#Action is not allowed' message should suffice. The current trace in shell 
looks like this:
{noformat}
hbase(main):009:0> put 't2','r1','cf1:c','val'
ERROR: org.apache.hadoop.hbase.quotas.SpaceLimitingException: NO_WRITES Puts are disallowed due to a space quota.
at org.apache.hadoop.hbase.quotas.policies.NoWritesViolationPolicyEnforcement.check(NoWritesViolationPolicyEnforcement.java:46)
at org.apache.hadoop.hbase.regionserver.RSRpcServices.mutate(RSRpcServices.java:2779)
at org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:42000)
at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:409)
at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:130)
at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)
at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304)

{noformat}
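
One way to get from the trace above to a single-line message is to keep only the text before the first stack frame. The sketch below shows that idea in Java. The HBase shell itself is JRuby, so this is not how the shell would implement it, and `ConciseErrorSketch` and `withoutStackTrace` are hypothetical names.

```java
// Hypothetical sketch: keep the error message, drop the stack frames that follow it.
final class ConciseErrorSketch {

  static String withoutStackTrace(String errorText) {
    StringBuilder kept = new StringBuilder();
    for (String line : errorText.split("\\R")) {
      String trimmed = line.trim();
      if (trimmed.isEmpty()) {
        continue;
      }
      // Stop at the first stack frame or "Caused by:" line; everything before it is the message.
      if (trimmed.equals("at") || trimmed.startsWith("at ") || trimmed.startsWith("Caused by:")) {
        break;
      }
      if (kept.length() > 0) {
        kept.append(' ');
      }
      kept.append(trimmed);
    }
    return kept.toString();
  }
}
```

The heuristic is deliberately crude: a message line that itself starts with "at " would be cut off, so a real fix would more likely rely on the exception type than on the text.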





[jira] [Commented] (HBASE-21356) bulkLoadHFile API should ensure that rs has the source hfile's write permission

2018-10-22 Thread Zheng Hu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16660066#comment-16660066
 ] 

Zheng Hu commented on HBASE-21356:
--

Pushed to all branches. Thanks [~uagashe]  for reviewing. 






[jira] [Created] (HBASE-21361) Disable printing of stack-trace in shell when quotas are not enabled

2018-10-22 Thread Sakthi (JIRA)
Sakthi created HBASE-21361:
--

 Summary: Disable printing of stack-trace in shell when quotas are 
not enabled
 Key: HBASE-21361
 URL: https://issues.apache.org/jira/browse/HBASE-21361
 Project: HBase
  Issue Type: Sub-task
  Components: shell
Reporter: Sakthi
Assignee: Sakthi


When a user tries to run 'set_quota' while quota support is not enabled, a 'Quota 
support not enabled' message should suffice. The current trace looks like this:
 {noformat}
hbase(main):009:0> set_quota TYPE => THROTTLE, TABLE => 't2', LIMIT => '10M/sec'
ERROR: org.apache.hadoop.hbase.DoNotRetryIOException: java.lang.UnsupportedOperationException: quota support disabled
at org.apache.hadoop.hbase.quotas.MasterQuotaManager.checkQuotaSupport(MasterQuotaManager.java:442)
at org.apache.hadoop.hbase.quotas.MasterQuotaManager.setQuota(MasterQuotaManager.java:124)
at org.apache.hadoop.hbase.master.MasterRpcServices.setQuota(MasterRpcServices.java:1593)
at org.apache.hadoop.hbase.shaded.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java)
at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:413)
at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:130)
at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)
at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304)
Caused by: java.lang.UnsupportedOperationException: quota support disabled
... 8 more
{noformat}





[jira] [Commented] (HBASE-21270) [amv2] Let go of Procedure entity lock on CODE-BUG or UnsupportedOperationException

2018-10-22 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16660059#comment-16660059
 ] 

stack commented on HBASE-21270:
---

Undoing fix version. I've not run into this lately, not since I suppressed the 
throw of UnsupportedOperationException out of AP/UP rollback. SCP can throw an 
UnsupportedOperationException but it doesn't do the same damage. Will keep an 
eye on this but moving out of 2.1.1.

> [amv2] Let go of Procedure entity lock on CODE-BUG or 
> UnsupportedOperationException
> ---
>
> Key: HBASE-21270
> URL: https://issues.apache.org/jira/browse/HBASE-21270
> Project: HBase
>  Issue Type: Bug
>  Components: amv2
>Reporter: stack
>Priority: Major
> Attachments: HBASE-21270.branch-2.1.001.patch, 
> HBASE-21270.branch-2.1.002.patch
>
>
> Small patch but helps when we run into ugly bugs... Let go of the entity lock.
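
As a rough illustration of "let go of the entity lock", the sketch below releases a lock in a `finally` block even when the step throws an unexpected runtime exception. It is not taken from the attached patches: a `ReentrantLock` stands in for the procedure entity lock, and `EntityLockSketch` and `runStep` are hypothetical names.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch: a ReentrantLock stands in for the procedure's entity lock.
final class EntityLockSketch {

  // Runs one step under the lock; returns false if the step failed with a runtime exception.
  static boolean runStep(Runnable step, ReentrantLock entityLock) {
    entityLock.lock();
    try {
      step.run();
      return true;
    } catch (RuntimeException e) {
      // A CODE-BUG style failure, e.g. UnsupportedOperationException from a rollback.
      return false;
    } finally {
      // Always release, so other work on the same entity is not blocked forever.
      entityLock.unlock();
    }
  }
}
```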





[jira] [Commented] (HBASE-21301) Heatmap for key access patterns

2018-10-22 Thread Archana Katiyar (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16660058#comment-16660058
 ] 

Archana Katiyar commented on HBASE-21301:
-

Sure [~apurtell], thanks.

I was working on the *logic to generate a UID for a given string*; here the problem 
statement is twofold:
 * We should be able to lookup uid, given the string.
 * Also, we should be able to lookup string back, given the uid.

So, we will have to store two rows in the table corresponding to one uid for 
both side lookups.

Generating the uid can be done by using incrementColumnValue (in HTable), but 
saving the uid against the string equivalent (and vice versa) will require some 
sort of lock because HBase doesn't guarantee atomicity of operations across 
rows. I am thinking of using a ZooKeeper lock here.

Following is the algorithm that I am working on -

For given string (like table name)
 # Check if there is an entry in the table with given string as key
 # If yes, then use the value as uid.
 # If not, then follow next set of steps -
 # take a lock at zookeeper level for the given string.
 # Check again if there is an entry in the table with given string as key; this 
is important as someone might have taken the lock, finished the work and 
eventually released the lock between steps #1 and #4. If there is an entry 
present, simply use that value as uid. If no entry present then move to next 
steps -
 # get uid by HTable#incrementColumnValue
 # store string equivalent as row key and uid as value 
 # store uid as row key and string equivalent as value
 # release zookeeper lock

One more thing to take care of here is a system crash/shutdown between #7 and #8, 
i.e. the string to uid relationship is stored but the uid to string relationship 
could not be stored. To handle this case, we have to check for and create the 
missing row entry, if needed, in #2 above.
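
The steps above can be sketched as follows. This is an in-memory illustration, not HBase code: two maps stand in for the forward and reverse rows, an `AtomicLong` stands in for HTable#incrementColumnValue, and a per-string `ReentrantLock` stands in for the ZooKeeper lock. All class and method names are hypothetical.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical in-memory stand-in for the uid table and the ZooKeeper lock.
final class UidAllocatorSketch {
  private final Map<String, Long> nameToUid = new ConcurrentHashMap<>(); // forward row
  private final Map<Long, String> uidToName = new ConcurrentHashMap<>(); // reverse row
  private final AtomicLong counter = new AtomicLong();                   // incrementColumnValue
  private final Map<String, ReentrantLock> locks = new ConcurrentHashMap<>();

  long uidFor(String name) {
    Long uid = nameToUid.get(name);            // steps 1-2: fast path without a lock
    if (uid != null) {
      uidToName.putIfAbsent(uid, name);        // repair a reverse row lost to a crash
      return uid;
    }
    ReentrantLock lock = locks.computeIfAbsent(name, k -> new ReentrantLock());
    lock.lock();                               // step 4
    try {
      uid = nameToUid.get(name);               // step 5: re-check under the lock
      if (uid == null) {
        uid = counter.incrementAndGet();       // step 6
        nameToUid.put(name, uid);              // step 7
      }
      uidToName.putIfAbsent(uid, name);        // step 8
      return uid;
    } finally {
      lock.unlock();                           // step 9
    }
  }

  String nameFor(long uid) {
    return uidToName.get(uid);
  }
}
```

The second lookup under the lock is what keeps two callers racing on the same string from allocating two different uids.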

Alternatively, we can do something along the following lines to avoid taking the 
lock:
 * For table names, assign and store the uid during table creation.
 * For metric names, an admin can do it while creating the table after cluster 
creation. One caveat is that whenever we add new metrics (anything other than 
readcount/writecount), the admin has to run a script again. This process 
can lead to errors.
 * For region_uid and block_uid as well, an admin can do it at the beginning.
 * For string values like region names and block names, it is expected that 
only a single server will try to create the corresponding uid, so taking the lock 
can be avoided.

 

I would prefer the locking approach because it puts all the logic in one 
place and reduces the chance of error, since it doesn't require any 
manual intervention.

> Heatmap for key access patterns
> ---
>
> Key: HBASE-21301
> URL: https://issues.apache.org/jira/browse/HBASE-21301
> Project: HBase
>  Issue Type: Improvement
>Reporter: Archana Katiyar
>Assignee: Archana Katiyar
>Priority: Major
> Fix For: 3.0.0, 1.5.0, 2.2.0
>
>
> Google recently released a beta feature for Cloud Bigtable which presents a 
> heat map of the keyspace. *Given how hotspotting comes up now and again here, 
> this is a good idea for giving HBase ops a tool to be proactive about it.* 
> >>>
> Additionally, we are announcing the beta version of Key Visualizer, a 
> visualization tool for Cloud Bigtable key access patterns. Key Visualizer 
> helps debug performance issues due to unbalanced access patterns across the 
> key space, or single rows that are too large or receiving too much read or 
> write activity. With Key Visualizer, you get a heat map visualization of 
> access patterns over time, along with the ability to zoom into specific key 
> or time ranges, or select a specific row to find the full row key ID that's 
> responsible for a hotspot. Key Visualizer is automatically enabled for Cloud 
> Bigtable clusters with sufficient data or activity, and does not affect Cloud 
> Bigtable cluster performance. 
> <<<
> From 
> [https://cloudplatform.googleblog.com/2018/07/on-gcp-your-database-your-way.html]
> (Copied this description from the write-up by [~apurtell], thanks Andrew.)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-21270) [amv2] Let go of Procedure entity lock on CODE-BUG or UnsupportedOperationException

2018-10-22 Thread stack (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-21270:
--
Fix Version/s: (was: 2.0.3)
   (was: 2.1.1)
   (was: 2.2.0)

> [amv2] Let go of Procedure entity lock on CODE-BUG or 
> UnsupportedOperationException
> ---
>
> Key: HBASE-21270
> URL: https://issues.apache.org/jira/browse/HBASE-21270
> Project: HBase
>  Issue Type: Bug
>  Components: amv2
>Reporter: stack
>Priority: Major
> Attachments: HBASE-21270.branch-2.1.001.patch, 
> HBASE-21270.branch-2.1.002.patch
>
>
> Small patch but helps when we run into ugly bugs... Let go of the entity lock.





[jira] [Created] (HBASE-21360) Disable printing of stack-trace in shell for quotas

2018-10-22 Thread Sakthi (JIRA)
Sakthi created HBASE-21360:
--

 Summary: Disable printing of stack-trace in shell for quotas
 Key: HBASE-21360
 URL: https://issues.apache.org/jira/browse/HBASE-21360
 Project: HBase
  Issue Type: Improvement
  Components: shell
Reporter: Sakthi
Assignee: Sakthi


Filing this jira to track sub-jiras that can be potentially created under this 
umbrella. Description of each jira can be found in the respective jiras.





[jira] [Updated] (HBASE-21321) Backport HBASE-21278 to branch-2.1 and branch-2.0

2018-10-22 Thread stack (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-21321:
--
  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

Pushed to branch-2.0 and branch-2.1. Thanks for review [~Apache9]

> Backport HBASE-21278 to branch-2.1 and branch-2.0
> -
>
> Key: HBASE-21321
> URL: https://issues.apache.org/jira/browse/HBASE-21321
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Duo Zhang
>Assignee: stack
>Priority: Critical
> Fix For: 2.1.1, 2.0.3
>
> Attachments: HBASE-21321.branch-2.1.001.patch
>
>
> As the assign/unassign part is a bit different.





[jira] [Updated] (HBASE-21349) Cluster is going down but CatalogJanitor and Normalizer try to run and fail noisely

2018-10-22 Thread stack (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-21349:
--
Fix Version/s: 2.1.2
   2.0.3
   2.2.0
   3.0.0

> Cluster is going down but CatalogJanitor and Normalizer try to run and fail 
> noisely
> ---
>
> Key: HBASE-21349
> URL: https://issues.apache.org/jira/browse/HBASE-21349
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.1.0
>Reporter: stack
>Assignee: Xu Cang
>Priority: Minor
> Fix For: 3.0.0, 2.2.0, 2.0.3, 2.1.2
>
> Attachments: HBASE-22349.master.001.patch
>
>
> Shutting down can take a while. Meanwhile the catalog janitor and/or 
> normalizer (etc?) try to run and, when they can't, they fail noisily. Looks bad:
> {code}
> 2018-10-19 21:23:24,962 INFO org.apache.hadoop.hbase.master.ServerManager: 
> Cluster shutdown set; vc1205.halxg.cloudera.com,22101,1539991730711 expired; 
> onlineServers=51
> 2018-10-19 21:25:54,502 WARN org.apache.hadoop.hbase.master.CatalogJanitor: 
> Failed scan of catalog table
> java.io.IOException: connection is closed
> at 
> org.apache.hadoop.hbase.MetaTableAccessor.getMetaHTable(MetaTableAccessor.java:267)
> at 
> org.apache.hadoop.hbase.MetaTableAccessor.scanMeta(MetaTableAccessor.java:763)
> at 
> org.apache.hadoop.hbase.MetaTableAccessor.scanMeta(MetaTableAccessor.java:734)
> at 
> org.apache.hadoop.hbase.MetaTableAccessor.scanMeta(MetaTableAccessor.java:684)
> at 
> org.apache.hadoop.hbase.MetaTableAccessor.scanMetaForTableRegions(MetaTableAccessor.java:679)
> at 
> org.apache.hadoop.hbase.master.CatalogJanitor.getMergedRegionsAndSplitParents(CatalogJanitor.java:185)
> at 
> org.apache.hadoop.hbase.master.CatalogJanitor.getMergedRegionsAndSplitParents(CatalogJanitor.java:137)
> at 
> org.apache.hadoop.hbase.master.CatalogJanitor.scan(CatalogJanitor.java:243)
> at 
> org.apache.hadoop.hbase.master.CatalogJanitor.chore(CatalogJanitor.java:116)
> at org.apache.hadoop.hbase.ScheduledChore.run(ScheduledChore.java:186)
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
> at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
> at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
> at 
> org.apache.hadoop.hbase.JitterScheduledThreadPoolExecutorImpl$JitteredRunnableScheduledFuture.run(JitterScheduledThreadPoolExecutorImpl.java:111)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> 2018-10-19 21:25:54,507 ERROR 
> org.apache.hadoop.hbase.master.normalizer.RegionNormalizerChore: Failed to 
> normalize regions.
> java.io.IOException: connection is closed
> at 
> org.apache.hadoop.hbase.MetaTableAccessor.getMetaHTable(MetaTableAccessor.java:267)
> at 
> org.apache.hadoop.hbase.MetaTableAccessor.scanMeta(MetaTableAccessor.java:763)
> at 
> org.apache.hadoop.hbase.MetaTableAccessor.scanMeta(MetaTableAccessor.java:734)
> at 
> org.apache.hadoop.hbase.MetaTableAccessor.scanMeta(MetaTableAccessor.java:690)
> at 
> org.apache.hadoop.hbase.MetaTableAccessor.fullScanTables(MetaTableAccessor.java:240)
> at 
> org.apache.hadoop.hbase.master.TableStateManager.getTablesInStates(TableStateManager.java:189)
> at 
> org.apache.hadoop.hbase.master.HMaster.normalizeRegions(HMaster.java:1718)
> at 
> org.apache.hadoop.hbase.master.normalizer.RegionNormalizerChore.chore(RegionNormalizerChore.java:48)
> at org.apache.hadoop.hbase.ScheduledChore.run(ScheduledChore.java:186)
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
> at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
> at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
> at 
> org.apache.hadoop.hbase.JitterScheduledThreadPoolExecutorImpl$JitteredRunnableScheduledFuture.run(JitterScheduledThreadPoolExecutorImpl.java:111)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at 
> 

[jira] [Updated] (HBASE-20973) ArrayIndexOutOfBoundsException when rolling back procedure

2018-10-22 Thread stack (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-20973:
--
Fix Version/s: 2.0.3
   2.1.1
   2.2.0
   3.0.0

> ArrayIndexOutOfBoundsException when rolling back procedure
> --
>
> Key: HBASE-20973
> URL: https://issues.apache.org/jira/browse/HBASE-20973
> Project: HBase
>  Issue Type: Sub-task
>  Components: amv2
>Affects Versions: 2.1.0, 2.0.1
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Critical
> Fix For: 3.0.0, 2.2.0, 2.1.1, 2.0.3
>
> Attachments: HBASE-20973.branch-2.0.001.patch, 
> HBASE-20973.branch-2.0.002.patch
>
>
> Found this one while investigating HBASE-20921. After the root 
> procedure (ModifyTableProcedure in this case) rolled back, an 
> ArrayIndexOutOfBoundsException was thrown
> {code}
> 2018-07-18 01:39:10,241 ERROR [PEWorker-8] procedure2.ProcedureExecutor(159): 
> CODE-BUG: Uncaught runtime exception for pid=5973, 
> state=FAILED:MODIFY_TABLE_REOPEN_ALL_REGIONS, 
> exception=java.lang.NullPointerException via CODE-BUG: Uncaught runtime 
> exception: pid=5974, ppid=5973, 
> state=RUNNABLE:REOPEN_TABLE_REGIONS_CONFIRM_REOPENED; 
> ReopenTableRegionsProcedure 
> table=IntegrationTestBigLinkedList:java.lang.NullPointerException; 
> ModifyTableProcedure table=IntegrationTestBigLinkedList
> java.lang.UnsupportedOperationException: unhandled 
> state=MODIFY_TABLE_REOPEN_ALL_REGIONS
> at 
> org.apache.hadoop.hbase.master.procedure.ModifyTableProcedure.rollbackState(ModifyTableProcedure.java:147)
> at 
> org.apache.hadoop.hbase.master.procedure.ModifyTableProcedure.rollbackState(ModifyTableProcedure.java:50)
> at 
> org.apache.hadoop.hbase.procedure2.StateMachineProcedure.rollback(StateMachineProcedure.java:203)
> at 
> org.apache.hadoop.hbase.procedure2.Procedure.doRollback(Procedure.java:864)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1353)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1309)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1178)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$800(ProcedureExecutor.java:75)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1741)
> 2018-07-18 01:39:10,243 WARN  [PEWorker-8] 
> procedure2.ProcedureExecutor(1756): Worker terminating UNNATURALLY null
> java.lang.ArrayIndexOutOfBoundsException: 1
> at 
> org.apache.hadoop.hbase.procedure2.store.ProcedureStoreTracker$BitSetNode.updateState(ProcedureStoreTracker.java:405)
> at 
> org.apache.hadoop.hbase.procedure2.store.ProcedureStoreTracker$BitSetNode.delete(ProcedureStoreTracker.java:178)
> at 
> org.apache.hadoop.hbase.procedure2.store.ProcedureStoreTracker.delete(ProcedureStoreTracker.java:513)
> at 
> org.apache.hadoop.hbase.procedure2.store.ProcedureStoreTracker.delete(ProcedureStoreTracker.java:505)
> at 
> org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.updateStoreTracker(WALProcedureStore.java:741)
> at 
> org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.pushData(WALProcedureStore.java:691)
> at 
> org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.delete(WALProcedureStore.java:603)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1387)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1309)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1178)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$800(ProcedureExecutor.java:75)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1741)
> {code}
> This is a very serious condition. After this exception was thrown, the 
> exclusive lock held by ModifyTableProcedure was never released, and all 
> procedures against this table were blocked until the master restarted. Since 
> the lock info for the procedure is not restored on restart, the other 
> procedures could proceed again; it is quite embarrassing that a bug saved 
> us... (this bug will be fixed in HBASE-20846)
> I tried to reproduce this one using the test case in HBASE-20921 but I just 
> can't reproduce it.
> An easy way to resolve this is to add a try/catch, making sure that no 
> matter what happens, the table's exclusive lock is always released.





[jira] [Updated] (HBASE-21338) [balancer] If balancer is an ill-fit for cluster size, it gives little indication

2018-10-22 Thread stack (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-21338:
--
Fix Version/s: 2.1.2
   2.0.3
   2.2.0
   3.0.0

> [balancer] If balancer is an ill-fit for cluster size, it gives little 
> indication
> -
>
> Key: HBASE-21338
> URL: https://issues.apache.org/jira/browse/HBASE-21338
> Project: HBase
>  Issue Type: Sub-task
>  Components: Balancer, Operability
>Reporter: stack
>Assignee: Xu Cang
>Priority: Major
> Fix For: 3.0.0, 2.2.0, 2.0.3, 2.1.2
>
> Attachments: HBASE-21338.master.001.patch
>
>
> See parent issue. Running balancer on a cluster where the max steps was way 
> inadequate, the balancer gave little to no indication that it was 
> ill-configured. In fact, it only logged its starting and then that there was 
> nothing to do though the cluster was obviously out-of-whack.
> Ideally the balancer would complain when say the maxSteps limit is a small 
> fraction of what the cluster's calculated max steps are, or it would notice 
> that the balancer is making little progress on an imbalanced cluster and 
> shout. Can we set balancer configs w/o having to restart Master?
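
The first complaint suggested in the description above (maxSteps being a small 
fraction of the calculated max steps) could be expressed as a simple ratio 
check. A sketch under stated assumptions: {{BalancerStepCheck}}, 
{{isCapTooSmall}} and the {{minFraction}} threshold are hypothetical and do 
not correspond to any existing balancer API or configuration key.

```java
// Hypothetical helper; names and the threshold are illustrative only.
final class BalancerStepCheck {
  private BalancerStepCheck() {}

  // True when the configured cap covers less than minFraction of the steps
  // calculated for this cluster size, i.e. the balancer should complain.
  static boolean isCapTooSmall(long configuredMaxSteps, long calculatedMaxSteps,
      double minFraction) {
    if (calculatedMaxSteps <= 0) {
      return false; // nothing to compare against
    }
    return (double) configuredMaxSteps / calculatedMaxSteps < minFraction;
  }
}
```

A caller would log a warning when the check returns true instead of only 
logging that there was nothing to do.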





[jira] [Commented] (HBASE-21338) [balancer] If balancer is an ill-fit for cluster size, it gives little indication

2018-10-22 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16660045#comment-16660045
 ] 

Hadoop QA commented on HBASE-21338:
---

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
25s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:orange}-0{color} | {color:orange} test4tests {color} | {color:orange}  
0m  0s{color} | {color:orange} The patch doesn't appear to include any new or 
modified tests. Please justify why no new tests are needed for this patch. Also 
please list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
21s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
48s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
13s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
24s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
14s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
37s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
26s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
11m 12s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
30s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}212m 
40s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
25s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}256m 13s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-21338 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12945124/HBASE-21338.master.001.patch
 |
| Optional Tests |  dupname  asflicense  javac  javadoc  unit  findbugs  
shadedjars  hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux 38c3ba328c02 3.13.0-144-generic #193-Ubuntu SMP Thu Mar 15 
17:03:53 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | master / 931156f66b |
| maven | version: Apache Maven 3.5.4 
(1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC3 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HBASE-Build/14811/testReport/ |
| Max. process+thread count | 4691 (vs. ulimit of 1) |
| modules | C: hbase-server U: hbase-server |
| Console output | 

[jira] [Commented] (HBASE-20973) ArrayIndexOutOfBoundsException when rolling back procedure

2018-10-22 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16660044#comment-16660044
 ] 

stack commented on HBASE-20973:
---

+1 from me. The test failure seems unrelated. +1 for branch-2.1 and branch-2.0.

> ArrayIndexOutOfBoundsException when rolling back procedure
> --
>
> Key: HBASE-20973
> URL: https://issues.apache.org/jira/browse/HBASE-20973
> Project: HBase
>  Issue Type: Sub-task
>  Components: amv2
>Affects Versions: 2.1.0, 2.0.1
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Critical
> Attachments: HBASE-20973.branch-2.0.001.patch, 
> HBASE-20973.branch-2.0.002.patch
>
>
> Found this one while investigating HBASE-20921. After the root 
> procedure (ModifyTableProcedure in this case) rolled back, an 
> ArrayIndexOutOfBoundsException was thrown
> {code}
> 2018-07-18 01:39:10,241 ERROR [PEWorker-8] procedure2.ProcedureExecutor(159): 
> CODE-BUG: Uncaught runtime exception for pid=5973, 
> state=FAILED:MODIFY_TABLE_REOPEN_ALL_REGIONS, 
> exception=java.lang.NullPointerException via CODE-BUG: Uncaught runtime 
> exception: pid=5974, ppid=5973, 
> state=RUNNABLE:REOPEN_TABLE_REGIONS_CONFIRM_REOPENED; 
> ReopenTableRegionsProcedure 
> table=IntegrationTestBigLinkedList:java.lang.NullPointerException; 
> ModifyTableProcedure table=IntegrationTestBigLinkedList
> java.lang.UnsupportedOperationException: unhandled 
> state=MODIFY_TABLE_REOPEN_ALL_REGIONS
> at 
> org.apache.hadoop.hbase.master.procedure.ModifyTableProcedure.rollbackState(ModifyTableProcedure.java:147)
> at 
> org.apache.hadoop.hbase.master.procedure.ModifyTableProcedure.rollbackState(ModifyTableProcedure.java:50)
> at 
> org.apache.hadoop.hbase.procedure2.StateMachineProcedure.rollback(StateMachineProcedure.java:203)
> at 
> org.apache.hadoop.hbase.procedure2.Procedure.doRollback(Procedure.java:864)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1353)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1309)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1178)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$800(ProcedureExecutor.java:75)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1741)
> 2018-07-18 01:39:10,243 WARN  [PEWorker-8] 
> procedure2.ProcedureExecutor(1756): Worker terminating UNNATURALLY null
> java.lang.ArrayIndexOutOfBoundsException: 1
> at 
> org.apache.hadoop.hbase.procedure2.store.ProcedureStoreTracker$BitSetNode.updateState(ProcedureStoreTracker.java:405)
> at 
> org.apache.hadoop.hbase.procedure2.store.ProcedureStoreTracker$BitSetNode.delete(ProcedureStoreTracker.java:178)
> at 
> org.apache.hadoop.hbase.procedure2.store.ProcedureStoreTracker.delete(ProcedureStoreTracker.java:513)
> at 
> org.apache.hadoop.hbase.procedure2.store.ProcedureStoreTracker.delete(ProcedureStoreTracker.java:505)
> at 
> org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.updateStoreTracker(WALProcedureStore.java:741)
> at 
> org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.pushData(WALProcedureStore.java:691)
> at 
> org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.delete(WALProcedureStore.java:603)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1387)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1309)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1178)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$800(ProcedureExecutor.java:75)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1741)
> {code}
> This is a very serious condition. After this exception was thrown, the 
> exclusive lock held by ModifyTableProcedure was never released, and all 
> procedures against this table were blocked until the master restarted. Since 
> the lock info for the procedure is not restored on restart, the other 
> procedures could proceed again; it is quite embarrassing that a bug saved 
> us... (this bug will be fixed in HBASE-20846)
> I tried to reproduce this one using the test case in HBASE-20921 but I just 
> can't reproduce it.
> An easy way to resolve this is to add a try/catch, making sure that no 
> matter what happens, the table's exclusive lock is always released.





[jira] [Commented] (HBASE-21349) Cluster is going down but CatalogJanitor and Normalizer try to run and fail noisely

2018-10-22 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16660040#comment-16660040
 ] 

stack commented on HBASE-21349:
---

+1

> Cluster is going down but CatalogJanitor and Normalizer try to run and fail 
> noisely
> ---
>
> Key: HBASE-21349
> URL: https://issues.apache.org/jira/browse/HBASE-21349
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.1.0
>Reporter: stack
>Assignee: Xu Cang
>Priority: Minor
> Attachments: HBASE-22349.master.001.patch
>
>
> Shutting down can take a while. Meanwhile the catalog janitor and/or 
> normalizer (etc?) try to run and, when they can't, they fail noisily. Looks bad:
> {code}
> 2018-10-19 21:23:24,962 INFO org.apache.hadoop.hbase.master.ServerManager: 
> Cluster shutdown set; vc1205.halxg.cloudera.com,22101,1539991730711 expired; 
> onlineServers=51
> 2018-10-19 21:25:54,502 WARN org.apache.hadoop.hbase.master.CatalogJanitor: 
> Failed scan of catalog table
> java.io.IOException: connection is closed
> at 
> org.apache.hadoop.hbase.MetaTableAccessor.getMetaHTable(MetaTableAccessor.java:267)
> at 
> org.apache.hadoop.hbase.MetaTableAccessor.scanMeta(MetaTableAccessor.java:763)
> at 
> org.apache.hadoop.hbase.MetaTableAccessor.scanMeta(MetaTableAccessor.java:734)
> at 
> org.apache.hadoop.hbase.MetaTableAccessor.scanMeta(MetaTableAccessor.java:684)
> at 
> org.apache.hadoop.hbase.MetaTableAccessor.scanMetaForTableRegions(MetaTableAccessor.java:679)
> at 
> org.apache.hadoop.hbase.master.CatalogJanitor.getMergedRegionsAndSplitParents(CatalogJanitor.java:185)
> at 
> org.apache.hadoop.hbase.master.CatalogJanitor.getMergedRegionsAndSplitParents(CatalogJanitor.java:137)
> at 
> org.apache.hadoop.hbase.master.CatalogJanitor.scan(CatalogJanitor.java:243)
> at 
> org.apache.hadoop.hbase.master.CatalogJanitor.chore(CatalogJanitor.java:116)
> at org.apache.hadoop.hbase.ScheduledChore.run(ScheduledChore.java:186)
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
> at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
> at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
> at 
> org.apache.hadoop.hbase.JitterScheduledThreadPoolExecutorImpl$JitteredRunnableScheduledFuture.run(JitterScheduledThreadPoolExecutorImpl.java:111)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> 2018-10-19 21:25:54,507 ERROR 
> org.apache.hadoop.hbase.master.normalizer.RegionNormalizerChore: Failed to 
> normalize regions.
> java.io.IOException: connection is closed
> at 
> org.apache.hadoop.hbase.MetaTableAccessor.getMetaHTable(MetaTableAccessor.java:267)
> at 
> org.apache.hadoop.hbase.MetaTableAccessor.scanMeta(MetaTableAccessor.java:763)
> at 
> org.apache.hadoop.hbase.MetaTableAccessor.scanMeta(MetaTableAccessor.java:734)
> at 
> org.apache.hadoop.hbase.MetaTableAccessor.scanMeta(MetaTableAccessor.java:690)
> at 
> org.apache.hadoop.hbase.MetaTableAccessor.fullScanTables(MetaTableAccessor.java:240)
> at 
> org.apache.hadoop.hbase.master.TableStateManager.getTablesInStates(TableStateManager.java:189)
> at 
> org.apache.hadoop.hbase.master.HMaster.normalizeRegions(HMaster.java:1718)
> at 
> org.apache.hadoop.hbase.master.normalizer.RegionNormalizerChore.chore(RegionNormalizerChore.java:48)
> at org.apache.hadoop.hbase.ScheduledChore.run(ScheduledChore.java:186)
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
> at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
> at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
> at 
> org.apache.hadoop.hbase.JitterScheduledThreadPoolExecutorImpl$JitteredRunnableScheduledFuture.run(JitterScheduledThreadPoolExecutorImpl.java:111)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> {code}
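
One way to avoid the noisy failures quoted above is for a periodic chore to 
check a shutdown flag before touching the meta table. This is only a sketch of 
that idea, not the attached patch: {{ShutdownAwareChore}} and its fields are 
hypothetical, and the flag is a plain {{AtomicBoolean}} rather than the 
master's actual shutdown state.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for a periodic master chore.
final class ShutdownAwareChore {
  private final AtomicBoolean clusterShuttingDown;
  private final Runnable work;
  private int runs = 0;

  ShutdownAwareChore(AtomicBoolean clusterShuttingDown, Runnable work) {
    this.clusterShuttingDown = clusterShuttingDown;
    this.work = work;
  }

  // Returns true if the work ran, false if it was skipped.
  boolean chore() {
    if (clusterShuttingDown.get()) {
      return false; // skip quietly instead of failing with a stack trace
    }
    work.run();
    runs++;
    return true;
  }

  int runs() {
    return runs;
  }
}
```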




[jira] [Comment Edited] (HBASE-21246) Introduce WALIdentity interface

2018-10-22 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16659867#comment-16659867
 ] 

Ted Yu edited comment on HBASE-21246 at 10/23/18 3:33 AM:
--

bq. Why do we have a String constructor anyways?

One example usage for the ctor taking String is in {{refreshSources}} method of 
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceManager.java
 :
{code}
for (SortedSet<String> walsByGroup : walsByIdRecoveredQueues.get(queueId).values()) {
  walsByGroup.forEach(wal -> src.enqueueLog(this.walProvider.createWALIdentity(wal)));
}
{code}
The {{wal}} variable above is the String representation of WAL.

In handling failed replication queue(s), for each queue id there is a Set of 
WALs represented as Strings (the persisted form in ZooKeeper).
So the {{walProvider.createWALIdentity}} call deserializes the String form 
of a WAL name into a WALIdentity.
Here is the corresponding code from the current master branch:
{code}
  walsByGroup.forEach(wal -> src.enqueueLog(new Path(wal)));
{code}
createWALIdentity() ends up calling the FSWALIdentity ctor accepting String 
when the WAL is backed by hdfs. For other WAL providers, a different 
WALIdentity instance would be created.
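
To illustrate the String-to-WALIdentity step described above, here is a 
simplified sketch. Only {{getName}}, {{createWALIdentity}} and the 
String-accepting FSWALIdentity ctor are mentioned in this thread; every other 
detail (the {{WALProvider}} shape shown here, {{StreamWALIdentity}}, and 
returning the last path component as the name) is an assumption for 
illustration and may differ from the patch.

```java
// Hypothetical shapes; not copied from the HBASE-20952 branch.
interface WALIdentity {
  String getName();
}

// Filesystem-backed identity built from the persisted String form.
final class FSWALIdentity implements WALIdentity {
  private final String path;

  FSWALIdentity(String path) {
    this.path = path;
  }

  @Override
  public String getName() {
    // Assumption: the name is the last path component.
    int slash = path.lastIndexOf('/');
    return slash < 0 ? path : path.substring(slash + 1);
  }
}

// Identity for a WAL backed by a log stream (hypothetical).
final class StreamWALIdentity implements WALIdentity {
  private final String streamName;

  StreamWALIdentity(String streamName) {
    this.streamName = streamName;
  }

  @Override
  public String getName() {
    return streamName;
  }
}

// Each provider decides which WALIdentity a persisted String maps to.
interface WALProvider {
  WALIdentity createWALIdentity(String wal);
}
```

A filesystem-backed provider could then be as small as 
{{WALProvider fs = FSWALIdentity::new;}}, while a stream-backed one would 
return a {{StreamWALIdentity}}.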



was (Author: yuzhih...@gmail.com):
bq. Why do we have a String constructor anyways?

One example usage for the ctor taking String is in refreshSources of 
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceManager.java
 :
{code}
for (SortedSet walsByGroup : 
walsByIdRecoveredQueues.get(queueId).values()) {
  walsByGroup.forEach(wal -> 
src.enqueueLog(this.walProvider.createWALIdentity(wal)));
{code}
createWALIdentity() ends up calling FSWALIdentity ctor accepting String (when 
WAL is backed by hdfs).

> Introduce WALIdentity interface
> ---
>
> Key: HBASE-21246
> URL: https://issues.apache.org/jira/browse/HBASE-21246
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Major
> Fix For: HBASE-20952
>
> Attachments: 21246.003.patch, 21246.HBASE-20952.001.patch, 
> 21246.HBASE-20952.002.patch, 21246.HBASE-20952.004.patch, 
> 21246.HBASE-20952.005.patch, 21246.HBASE-20952.007.patch, 
> 21246.HBASE-20952.008.patch
>
>
> We are introducing WALIdentity interface so that the WAL representation can 
> be decoupled from distributed filesystem.
> The interface provides a getName method whose return value can represent 
> the filename in a distributed filesystem environment or the name of the 
> stream when the WAL is backed by a log stream.





[jira] [Commented] (HBASE-21224) Handle compaction queue duplication

2018-10-22 Thread Xu Cang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16660002#comment-16660002
 ] 

Xu Cang commented on HBASE-21224:
-

Mind taking a look at the patch? [~allan163] thanks.

I think the unit test failures are not related to the patch.

> Handle compaction queue duplication
> ---
>
> Key: HBASE-21224
> URL: https://issues.apache.org/jira/browse/HBASE-21224
> Project: HBase
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Xu Cang
>Assignee: Xu Cang
>Priority: Minor
> Attachments: HBASE-21224-master.001.patch, 
> HBASE-21224-master.002.patch, HBASE-21224-master.003.patch
>
>
> Mentioned by [~allan163] that we may want to handle compaction queue 
> duplication in this Jira https://issues.apache.org/jira/browse/HBASE-18451 
> Creating this item for further assessment and discussion.





[jira] [Comment Edited] (HBASE-21246) Introduce WALIdentity interface

2018-10-22 Thread Josh Elser (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16659991#comment-16659991
 ] 

Josh Elser edited comment on HBASE-21246 at 10/23/18 3:03 AM:
--

{quote}
> Why do we have a String constructor anyways?
createWALIdentity() ends up calling FSWALIdentity ctor accepting String (when 
WAL is backed by hdfs).
{quote}
I don't understand why you're telling me this. I could look at the wal-refactor 
code if I wanted to know who called it. I am asking you why this HDFS-based 
implementation for a WAL is using anything other than a {{Path}} to refer to a 
WAL.

The question (worry) in-between the lines is: do we have references to WALs 
other than by Path? Looking at {{master}}, the original code from 
ReplicationSourceManager was using Path.


was (Author: elserj):
{quote}{quote}Why do we have a String constructor anyways?
{quote}
createWALIdentity() ends up calling FSWALIdentity ctor accepting String (when 
WAL is backed by hdfs).
{quote}
I don't understand why you're telling me this. I could look at the wal-refactor 
code if I wanted to know who called it. I am asking you why this HDFS-based 
implementation for a WAL is using anything other than a {{Path}} to refer to a 
WAL.

The question (worry) in-between the lines is: do we have references to WALs 
other than by Path? Looking at {{master}}, the original code from 
ReplicationSourceManager was using Path.

> Introduce WALIdentity interface
> ---
>
> Key: HBASE-21246
> URL: https://issues.apache.org/jira/browse/HBASE-21246
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Major
> Fix For: HBASE-20952
>
> Attachments: 21246.003.patch, 21246.HBASE-20952.001.patch, 
> 21246.HBASE-20952.002.patch, 21246.HBASE-20952.004.patch, 
> 21246.HBASE-20952.005.patch, 21246.HBASE-20952.007.patch, 
> 21246.HBASE-20952.008.patch
>
>
> We are introducing the WALIdentity interface so that the WAL representation 
> can be decoupled from the distributed filesystem.
> The interface provides a getName method whose return value can represent the 
> filename in a distributed filesystem environment or the name of the stream 
> when the WAL is backed by a log stream.
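
As a rough illustration of the description and the Path-versus-String question above, here is a minimal sketch of what such an interface could look like. Only the names WALIdentity, getName and FSWALIdentity come from the thread. StreamWALIdentity, the constructors, the getPath accessor, and the use of java.nio.file.Path as a stand-in for Hadoop's Path are assumptions made for illustration; this is not the code from the attached patches.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical sketch only; not the code from the HBASE-21246 patches.
class WALIdentityDemo {

  interface WALIdentity {
    // File name for a filesystem-backed WAL, or stream name for a log-stream-backed WAL.
    String getName();
  }

  // Filesystem-backed identity. java.nio.file.Path stands in for Hadoop's Path here.
  static final class FSWALIdentity implements WALIdentity {
    private final Path path;

    FSWALIdentity(Path path) {
      this.path = path;
    }

    Path getPath() {
      return path;
    }

    @Override
    public String getName() {
      return path.getFileName().toString();
    }
  }

  // Assumed stream-backed identity that has no filesystem path at all.
  static final class StreamWALIdentity implements WALIdentity {
    private final String streamName;

    StreamWALIdentity(String streamName) {
      this.streamName = streamName;
    }

    @Override
    public String getName() {
      return streamName;
    }
  }

  public static void main(String[] args) {
    WALIdentity fs = new FSWALIdentity(Paths.get("/hbase/WALs/rs1/wal.1540263780000"));
    WALIdentity stream = new StreamWALIdentity("rs1-wal-stream");
    System.out.println(fs.getName());
    System.out.println(stream.getName());
  }
}
```

In this sketch the filesystem implementation is constructed only from a Path, which is one way to avoid the String constructor questioned in the comment; whether that fits the wal-refactor branch is not something the thread settles.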



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20973) ArrayIndexOutOfBoundsException when rolling back procedure

2018-10-22 Thread Allan Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16659992#comment-16659992
 ] 

Allan Yang commented on HBASE-20973:


Can I have a +1 on this one? [~stack],[~Apache9]?
And does it need to go to branch-2 and master? Do you have other plans for these 
branches, [~Apache9]?

> ArrayIndexOutOfBoundsException when rolling back procedure
> --
>
> Key: HBASE-20973
> URL: https://issues.apache.org/jira/browse/HBASE-20973
> Project: HBase
>  Issue Type: Sub-task
>  Components: amv2
>Affects Versions: 2.1.0, 2.0.1
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Critical
> Attachments: HBASE-20973.branch-2.0.001.patch, 
> HBASE-20973.branch-2.0.002.patch
>
>
> Found this one while investigating HBASE-20921. After the root 
> procedure (ModifyTableProcedure in this case) rolled back, an 
> ArrayIndexOutOfBoundsException was thrown:
> {code}
> 2018-07-18 01:39:10,241 ERROR [PEWorker-8] procedure2.ProcedureExecutor(159): CODE-BUG: Uncaught runtime exception for pid=5973, state=FAILED:MODIFY_TABLE_REOPEN_ALL_REGIONS, exception=java.lang.NullPointerException via CODE-BUG: Uncaught runtime exception: pid=5974, ppid=5973, state=RUNNABLE:REOPEN_TABLE_REGIONS_CONFIRM_REOPENED; ReopenTableRegionsProcedure table=IntegrationTestBigLinkedList:java.lang.NullPointerException; ModifyTableProcedure table=IntegrationTestBigLinkedList
> java.lang.UnsupportedOperationException: unhandled state=MODIFY_TABLE_REOPEN_ALL_REGIONS
>     at org.apache.hadoop.hbase.master.procedure.ModifyTableProcedure.rollbackState(ModifyTableProcedure.java:147)
>     at org.apache.hadoop.hbase.master.procedure.ModifyTableProcedure.rollbackState(ModifyTableProcedure.java:50)
>     at org.apache.hadoop.hbase.procedure2.StateMachineProcedure.rollback(StateMachineProcedure.java:203)
>     at org.apache.hadoop.hbase.procedure2.Procedure.doRollback(Procedure.java:864)
>     at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1353)
>     at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1309)
>     at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1178)
>     at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$800(ProcedureExecutor.java:75)
>     at org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1741)
> 2018-07-18 01:39:10,243 WARN  [PEWorker-8] procedure2.ProcedureExecutor(1756): Worker terminating UNNATURALLY null
> java.lang.ArrayIndexOutOfBoundsException: 1
>     at org.apache.hadoop.hbase.procedure2.store.ProcedureStoreTracker$BitSetNode.updateState(ProcedureStoreTracker.java:405)
>     at org.apache.hadoop.hbase.procedure2.store.ProcedureStoreTracker$BitSetNode.delete(ProcedureStoreTracker.java:178)
>     at org.apache.hadoop.hbase.procedure2.store.ProcedureStoreTracker.delete(ProcedureStoreTracker.java:513)
>     at org.apache.hadoop.hbase.procedure2.store.ProcedureStoreTracker.delete(ProcedureStoreTracker.java:505)
>     at org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.updateStoreTracker(WALProcedureStore.java:741)
>     at org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.pushData(WALProcedureStore.java:691)
>     at org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.delete(WALProcedureStore.java:603)
>     at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1387)
>     at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1309)
>     at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1178)
>     at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$800(ProcedureExecutor.java:75)
>     at org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1741)
> {code}
> This is a very serious condition. After this exception was thrown, the 
> exclusive lock held by ModifyTableProcedure was never released, so all 
> procedures against this table were blocked until the master restarted. Since 
> the lock info for the procedure is not restored on restart, the other 
> procedures could then proceed. It is quite embarrassing that a bug saved 
> us... (that bug will be fixed in HBASE-20846).
> I tried to reproduce this using the test case in HBASE-20921, but could not.
> An easy way to resolve this is to add a try/catch, making sure that no matter 
> what happens, the table's exclusive lock is always released.
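
The suggestion at the end of the description can be sketched as follows. This is a minimal illustration, not the attached patch: a java.util.concurrent.locks.ReentrantLock stands in for the table's exclusive lock (the real procedure lock is a different mechanism), and rollbackWithLock is a hypothetical helper. The sketch releases the lock in a finally clause, which is one way to get the "always released" guarantee the description asks for.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch; the names below are not HBase APIs.
class RollbackLockDemo {
  // Stand-in for the table's exclusive lock.
  static final ReentrantLock tableLock = new ReentrantLock();

  // Runs a rollback step and releases the lock even if the step throws.
  // Returns true if the step completed, false if it threw a runtime exception.
  static boolean rollbackWithLock(Runnable rollbackStep) {
    tableLock.lock();
    try {
      rollbackStep.run();
      return true;
    } catch (RuntimeException e) {
      System.out.println("rollback failed: " + e);
      return false;
    } finally {
      // Always executed, so a failure cannot leave the table locked.
      tableLock.unlock();
    }
  }

  public static void main(String[] args) {
    boolean completed = rollbackWithLock(() -> {
      throw new ArrayIndexOutOfBoundsException(1);
    });
    System.out.println("completed=" + completed + ", lockHeld=" + tableLock.isLocked());
  }
}
```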



--
This 

[jira] [Commented] (HBASE-21246) Introduce WALIdentity interface

2018-10-22 Thread Josh Elser (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16659991#comment-16659991
 ] 

Josh Elser commented on HBASE-21246:


{quote}{quote}Why do we have a String constructor anyways?
{quote}
createWALIdentity() ends up calling FSWALIdentity ctor accepting String (when 
WAL is backed by hdfs).
{quote}
I don't understand why you're telling me this. I could look at the wal-refactor 
code if I wanted to know who called it. I am asking you why this HDFS-based 
implementation for a WAL is using anything other than a {{Path}} to refer to a 
WAL.

The question (worry) in-between the lines is: do we have references to WALs 
other than by Path? Looking at {{master}}, the original code from 
ReplicationSourceManager was using Path.

> Introduce WALIdentity interface
> ---
>
> Key: HBASE-21246
> URL: https://issues.apache.org/jira/browse/HBASE-21246
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Major
> Fix For: HBASE-20952
>
> Attachments: 21246.003.patch, 21246.HBASE-20952.001.patch, 
> 21246.HBASE-20952.002.patch, 21246.HBASE-20952.004.patch, 
> 21246.HBASE-20952.005.patch, 21246.HBASE-20952.007.patch, 
> 21246.HBASE-20952.008.patch
>
>
> We are introducing the WALIdentity interface so that the WAL representation 
> can be decoupled from the distributed filesystem.
> The interface provides a getName method whose return value can represent the 
> filename in a distributed filesystem environment or the name of the stream 
> when the WAL is backed by a log stream.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20973) ArrayIndexOutOfBoundsException when rolling back procedure

2018-10-22 Thread Allan Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allan Yang updated HBASE-20973:
---
Attachment: HBASE-20973.branch-2.0.002.patch

> ArrayIndexOutOfBoundsException when rolling back procedure
> --
>
> Key: HBASE-20973
> URL: https://issues.apache.org/jira/browse/HBASE-20973
> Project: HBase
>  Issue Type: Sub-task
>  Components: amv2
>Affects Versions: 2.1.0, 2.0.1
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Critical
> Attachments: HBASE-20973.branch-2.0.001.patch, 
> HBASE-20973.branch-2.0.002.patch
>
>
> Found this one while investigating HBASE-20921. After the root 
> procedure (ModifyTableProcedure in this case) rolled back, an 
> ArrayIndexOutOfBoundsException was thrown:
> {code}
> 2018-07-18 01:39:10,241 ERROR [PEWorker-8] procedure2.ProcedureExecutor(159): CODE-BUG: Uncaught runtime exception for pid=5973, state=FAILED:MODIFY_TABLE_REOPEN_ALL_REGIONS, exception=java.lang.NullPointerException via CODE-BUG: Uncaught runtime exception: pid=5974, ppid=5973, state=RUNNABLE:REOPEN_TABLE_REGIONS_CONFIRM_REOPENED; ReopenTableRegionsProcedure table=IntegrationTestBigLinkedList:java.lang.NullPointerException; ModifyTableProcedure table=IntegrationTestBigLinkedList
> java.lang.UnsupportedOperationException: unhandled state=MODIFY_TABLE_REOPEN_ALL_REGIONS
>     at org.apache.hadoop.hbase.master.procedure.ModifyTableProcedure.rollbackState(ModifyTableProcedure.java:147)
>     at org.apache.hadoop.hbase.master.procedure.ModifyTableProcedure.rollbackState(ModifyTableProcedure.java:50)
>     at org.apache.hadoop.hbase.procedure2.StateMachineProcedure.rollback(StateMachineProcedure.java:203)
>     at org.apache.hadoop.hbase.procedure2.Procedure.doRollback(Procedure.java:864)
>     at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1353)
>     at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1309)
>     at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1178)
>     at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$800(ProcedureExecutor.java:75)
>     at org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1741)
> 2018-07-18 01:39:10,243 WARN  [PEWorker-8] procedure2.ProcedureExecutor(1756): Worker terminating UNNATURALLY null
> java.lang.ArrayIndexOutOfBoundsException: 1
>     at org.apache.hadoop.hbase.procedure2.store.ProcedureStoreTracker$BitSetNode.updateState(ProcedureStoreTracker.java:405)
>     at org.apache.hadoop.hbase.procedure2.store.ProcedureStoreTracker$BitSetNode.delete(ProcedureStoreTracker.java:178)
>     at org.apache.hadoop.hbase.procedure2.store.ProcedureStoreTracker.delete(ProcedureStoreTracker.java:513)
>     at org.apache.hadoop.hbase.procedure2.store.ProcedureStoreTracker.delete(ProcedureStoreTracker.java:505)
>     at org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.updateStoreTracker(WALProcedureStore.java:741)
>     at org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.pushData(WALProcedureStore.java:691)
>     at org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.delete(WALProcedureStore.java:603)
>     at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1387)
>     at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1309)
>     at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1178)
>     at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$800(ProcedureExecutor.java:75)
>     at org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1741)
> {code}
> This is a very serious condition. After this exception was thrown, the 
> exclusive lock held by ModifyTableProcedure was never released, so all 
> procedures against this table were blocked until the master restarted. Since 
> the lock info for the procedure is not restored on restart, the other 
> procedures could then proceed. It is quite embarrassing that a bug saved 
> us... (that bug will be fixed in HBASE-20846).
> I tried to reproduce this using the test case in HBASE-20921, but could not.
> An easy way to resolve this is to add a try/catch, making sure that no matter 
> what happens, the table's exclusive lock is always released.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20973) ArrayIndexOutOfBoundsException when rolling back procedure

2018-10-22 Thread Allan Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allan Yang updated HBASE-20973:
---
Attachment: (was: HBASE-20973.branch-2.0.002.patch)

> ArrayIndexOutOfBoundsException when rolling back procedure
> --
>
> Key: HBASE-20973
> URL: https://issues.apache.org/jira/browse/HBASE-20973
> Project: HBase
>  Issue Type: Sub-task
>  Components: amv2
>Affects Versions: 2.1.0, 2.0.1
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Critical
> Attachments: HBASE-20973.branch-2.0.001.patch, 
> HBASE-20973.branch-2.0.002.patch
>
>
> Found this one while investigating HBASE-20921. After the root 
> procedure (ModifyTableProcedure in this case) rolled back, an 
> ArrayIndexOutOfBoundsException was thrown:
> {code}
> 2018-07-18 01:39:10,241 ERROR [PEWorker-8] procedure2.ProcedureExecutor(159): CODE-BUG: Uncaught runtime exception for pid=5973, state=FAILED:MODIFY_TABLE_REOPEN_ALL_REGIONS, exception=java.lang.NullPointerException via CODE-BUG: Uncaught runtime exception: pid=5974, ppid=5973, state=RUNNABLE:REOPEN_TABLE_REGIONS_CONFIRM_REOPENED; ReopenTableRegionsProcedure table=IntegrationTestBigLinkedList:java.lang.NullPointerException; ModifyTableProcedure table=IntegrationTestBigLinkedList
> java.lang.UnsupportedOperationException: unhandled state=MODIFY_TABLE_REOPEN_ALL_REGIONS
>     at org.apache.hadoop.hbase.master.procedure.ModifyTableProcedure.rollbackState(ModifyTableProcedure.java:147)
>     at org.apache.hadoop.hbase.master.procedure.ModifyTableProcedure.rollbackState(ModifyTableProcedure.java:50)
>     at org.apache.hadoop.hbase.procedure2.StateMachineProcedure.rollback(StateMachineProcedure.java:203)
>     at org.apache.hadoop.hbase.procedure2.Procedure.doRollback(Procedure.java:864)
>     at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1353)
>     at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1309)
>     at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1178)
>     at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$800(ProcedureExecutor.java:75)
>     at org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1741)
> 2018-07-18 01:39:10,243 WARN  [PEWorker-8] procedure2.ProcedureExecutor(1756): Worker terminating UNNATURALLY null
> java.lang.ArrayIndexOutOfBoundsException: 1
>     at org.apache.hadoop.hbase.procedure2.store.ProcedureStoreTracker$BitSetNode.updateState(ProcedureStoreTracker.java:405)
>     at org.apache.hadoop.hbase.procedure2.store.ProcedureStoreTracker$BitSetNode.delete(ProcedureStoreTracker.java:178)
>     at org.apache.hadoop.hbase.procedure2.store.ProcedureStoreTracker.delete(ProcedureStoreTracker.java:513)
>     at org.apache.hadoop.hbase.procedure2.store.ProcedureStoreTracker.delete(ProcedureStoreTracker.java:505)
>     at org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.updateStoreTracker(WALProcedureStore.java:741)
>     at org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.pushData(WALProcedureStore.java:691)
>     at org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.delete(WALProcedureStore.java:603)
>     at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1387)
>     at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1309)
>     at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1178)
>     at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$800(ProcedureExecutor.java:75)
>     at org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1741)
> {code}
> This is a very serious condition. After this exception was thrown, the 
> exclusive lock held by ModifyTableProcedure was never released, so all 
> procedures against this table were blocked until the master restarted. Since 
> the lock info for the procedure is not restored on restart, the other 
> procedures could then proceed. It is quite embarrassing that a bug saved 
> us... (that bug will be fixed in HBASE-20846).
> I tried to reproduce this using the test case in HBASE-20921, but could not.
> An easy way to resolve this is to add a try/catch, making sure that no matter 
> what happens, the table's exclusive lock is always released.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21354) Procedure may be deleted improperly during master restarts resulting in 'Corrupt'

2018-10-22 Thread Allan Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16659985#comment-16659985
 ] 

Allan Yang commented on HBASE-21354:


Pushed to branch-2.0+, thanks for reviewing![~stack],[~Apache9]

> Procedure may be deleted improperly during master restarts resulting in 
> 'Corrupt'
> -
>
> Key: HBASE-21354
> URL: https://issues.apache.org/jira/browse/HBASE-21354
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.1.0, 2.0.2
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Major
> Fix For: 3.0.0, 2.2.0, 2.1.1, 2.0.3
>
> Attachments: HBASE-21354.branch-2.0.001.patch, 
> HBASE-21354.branch-2.0.002.patch, HBASE-21354.branch-2.0.003.patch, 
> HBASE-21354.branch-2.0.004.patch
>
>
> Good news! [~stack], [~Apache9], I may have found the root cause of the 
> mysterious ‘Corrupted procedure’ errors and of procedures disappearing after 
> master restarts (seen during ITBLL).
> During a master restart, we load procedures from the logs and build the 
> 'holdingCleanupTracker' according to each log's tracker. We may mark a 
> procedure in the oldest log as deleted if one log doesn't contain the 
> procedure. This is inappropriate, since a log will not contain any info about 
> the procedure if the procedure was not updated while that log was being 
> written. We can delete the procedure only if it is not in the global tracker, 
> which has the whole picture.
> {code}
> trackerNode = tracker.lookupClosestNode(trackerNode, procId);
> if (trackerNode == null || !trackerNode.contains(procId) ||
>   trackerNode.isModified(procId)) {
>   // the procedure was removed or modified
>   node.delete(procId);
> }
> {code}
> A test case (testProcedureShouldNotCleanOnLoad) in the patch shows clearly 
> how the corruption happened.
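
To make the reasoning in the description concrete, here is a deliberately simplified sketch. It models each tracker as a plain set of procedure ids, which is an assumption for illustration: the real ProcedureStoreTracker uses bitmap nodes and also considers isModified, both of which are ignored here. The helper names and the sample ids are hypothetical.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical, simplified model; not the ProcedureStoreTracker implementation.
class TrackerCleanupDemo {
  // Returns the procedure ids from 'candidates' that 'tracker' does not contain.
  static Set<Long> missingFrom(Set<Long> candidates, Set<Long> tracker) {
    Set<Long> missing = new TreeSet<>(candidates);
    missing.removeAll(tracker);
    return missing;
  }

  static Set<Long> ids(Long... values) {
    return new TreeSet<>(Arrays.asList(values));
  }

  public static void main(String[] args) {
    Set<Long> oldestLog = ids(1L, 2L, 3L); // procedures recorded in the oldest log
    Set<Long> newerLog = ids(2L);          // only procedure 2 was updated later
    Set<Long> global = ids(1L, 2L);        // procedure 3 really was deleted

    // Per-log check (the behavior described as the bug): procedure 1 is live
    // but is treated as deleted because the newer log never mentions it.
    System.out.println("per-log check deletes: " + missingFrom(oldestLog, newerLog));

    // Global check (the rule proposed in the description): only procedure 3 goes.
    System.out.println("global check deletes:  " + missingFrom(oldestLog, global));
  }
}
```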



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-21354) Procedure may be deleted improperly during master restarts resulting in 'Corrupt'

2018-10-22 Thread Allan Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allan Yang updated HBASE-21354:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> Procedure may be deleted improperly during master restarts resulting in 
> 'Corrupt'
> -
>
> Key: HBASE-21354
> URL: https://issues.apache.org/jira/browse/HBASE-21354
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.1.0, 2.0.2
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Major
> Fix For: 3.0.0, 2.2.0, 2.1.1, 2.0.3
>
> Attachments: HBASE-21354.branch-2.0.001.patch, 
> HBASE-21354.branch-2.0.002.patch, HBASE-21354.branch-2.0.003.patch, 
> HBASE-21354.branch-2.0.004.patch
>
>
> Good news! [~stack], [~Apache9], I may have found the root cause of the 
> mysterious ‘Corrupted procedure’ errors and of procedures disappearing after 
> master restarts (seen during ITBLL).
> During a master restart, we load procedures from the logs and build the 
> 'holdingCleanupTracker' according to each log's tracker. We may mark a 
> procedure in the oldest log as deleted if one log doesn't contain the 
> procedure. This is inappropriate, since a log will not contain any info about 
> the procedure if the procedure was not updated while that log was being 
> written. We can delete the procedure only if it is not in the global tracker, 
> which has the whole picture.
> {code}
> trackerNode = tracker.lookupClosestNode(trackerNode, procId);
> if (trackerNode == null || !trackerNode.contains(procId) ||
>   trackerNode.isModified(procId)) {
>   // the procedure was removed or modified
>   node.delete(procId);
> }
> {code}
> A test case (testProcedureShouldNotCleanOnLoad) in the patch shows clearly 
> how the corruption happened.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21246) Introduce WALIdentity interface

2018-10-22 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16659978#comment-16659978
 ] 

Ted Yu commented on HBASE-21246:


I reran the failed tests with patch v8 and they passed.
{code}
[ERROR] Crashed tests:
[ERROR] org.apache.hadoop.hbase.master.TestSplitLogManager
[ERROR] ExecutionException The forked VM terminated without properly saying goodbye. VM crash or System.exit called?
[ERROR] Command was /bin/sh -c cd /testptch/hbase/hbase-server && /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java -enableassertions -Dhbase.build.id=2018-10-22T21:39:24Z -Xmx2800m -Djava.security.egd=file:/dev/./urandom -Djava.net.preferIPv4Stack=true -Djava.awt.headless=true -jar /testptch/hbase/hbase-server/target/surefire/surefirebooter6128037881631299296.jar /testptch/hbase/hbase-server/target/surefire 2018-10-22T21-40-02_474-jvmRun4 surefire7294898340571802335tmp surefire_16957884523968105888244tmp
[ERROR] Error occurred in starting fork, check output in log
[ERROR] Process Exit Code: 1
{code}
It seems the Jenkins environment had an issue.

> Introduce WALIdentity interface
> ---
>
> Key: HBASE-21246
> URL: https://issues.apache.org/jira/browse/HBASE-21246
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Major
> Fix For: HBASE-20952
>
> Attachments: 21246.003.patch, 21246.HBASE-20952.001.patch, 
> 21246.HBASE-20952.002.patch, 21246.HBASE-20952.004.patch, 
> 21246.HBASE-20952.005.patch, 21246.HBASE-20952.007.patch, 
> 21246.HBASE-20952.008.patch
>
>
> We are introducing the WALIdentity interface so that the WAL representation 
> can be decoupled from the distributed filesystem.
> The interface provides a getName method whose return value can represent the 
> filename in a distributed filesystem environment or the name of the stream 
> when the WAL is backed by a log stream.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21201) Support to run VerifyReplication MR tool without peerid

2018-10-22 Thread Toshihiro Suzuki (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16659975#comment-16659975
 ] 

Toshihiro Suzuki commented on HBASE-21201:
--

[~sujit] Sounds good. Do you want to work on this?

> Support to run VerifyReplication MR tool without peerid
> ---
>
> Key: HBASE-21201
> URL: https://issues.apache.org/jira/browse/HBASE-21201
> Project: HBase
>  Issue Type: Brainstorming
>  Components: hbase-operator-tools
>Affects Versions: 3.0.0, 2.2.0
>Reporter: Sujit P
>Priority: Major
>
> In some use cases, HBase clients write to tables in separate clusters 
> (probably in different datacenters) for redundancy. As an 
> administrator/application architect, I would like to find out whether the 
> tables in both clusters are in the same state (cell by cell). One of the 
> tools that is readily available is VerifyRep, which is part of replication.
> However, it requires a peerId to be set up on at least one of the involved 
> clusters. A peerId is unnecessary in this scenario and could cause unintended 
> consequences, as the clusters aren't really replication peers, nor do we want 
> them to be.
> Looking at the code:
> The tool only attempts to get the clusterKey, which is essentially the 
> ZooKeeper quorum URL.
>  
> {code:java}
> //VerifyReplication.java
> private static Pair 
> getPeerQuorumConfig(final Configuration conf, String peerId)
> .
> .
> return Pair.newPair(peerConfig,
>         ReplicationUtils.getPeerClusterConfiguration(peerConfig, conf));
> //ReplicationUtils.java
> public static Configuration getPeerClusterConfiguration(ReplicationPeerConfig 
> peerConfig, Configuration baseConf) throws ReplicationException {
> Configuration otherConf;
> try {
> otherConf = HBaseConfiguration.createClusterConf(baseConf, 
> peerConfig.getClusterKey());{code}
>  
>  
> So I would like to propose updating the tool to accept the remote cluster's 
> ZooKeeper quorum as an argument (e.g. --peerQuorumAddress 
> clusterBzk1,clusterBzk2,clusterBzk3:2181/hbase-secure ) and use it 
> without depending on a replication peerId, similar to peerFSAddress. There 
> are certain advantages in doing so:
>  * Reduce development/maintenance effort by avoiding a separate tool for the 
> above scenario
>  * Allow the tool to be useful for other scenarios as well, such as 
>  ** validating backups in a remote cluster (HBASE-19106)
>  ** comparing cloned tableA and original tableA in the same/remote cluster in 
> case of user error, before restoring a snapshot to the original table, to find 
> the records that need to be added or are invalid/missing, etc.
>  ** allowing backup operators who are not HBase admins (and who shouldn't be 
> adding the peerId) to run the tool, since currently only an HBase superuser 
> can add a peerId, for reasons discussed in HBASE-21163.
> Please post your comments
> Thanks
> cc: [~clayb], [~brfrn169] , [~vrodionov] , [~rashidaligee]
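
As a rough sketch of the proposal above, the following shows how a value in the form used in the example (comma-separated hosts, then `:port`, then the znode parent) could be split into its parts. The class and method names are hypothetical and this is not VerifyReplication code; the --peerQuorumAddress option itself is only a proposal in this issue, and the accepted syntax here is an assumption based solely on the example value.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical parser for the value proposed for --peerQuorumAddress.
class PeerQuorumAddressDemo {

  static final class PeerQuorumAddress {
    final List<String> hosts;
    final int port;
    final String znodeParent;

    PeerQuorumAddress(List<String> hosts, int port, String znodeParent) {
      this.hosts = hosts;
      this.port = port;
      this.znodeParent = znodeParent;
    }
  }

  // Expects "host1,host2,...:port/znodeParent", as in the example in the description.
  static PeerQuorumAddress parse(String value) {
    int slash = value.indexOf('/');
    if (slash < 0) {
      throw new IllegalArgumentException("missing znode parent: " + value);
    }
    String hostsAndPort = value.substring(0, slash);
    String znodeParent = value.substring(slash);
    int colon = hostsAndPort.lastIndexOf(':');
    if (colon < 0) {
      throw new IllegalArgumentException("missing port: " + value);
    }
    int port = Integer.parseInt(hostsAndPort.substring(colon + 1));
    List<String> hosts = Arrays.asList(hostsAndPort.substring(0, colon).split(","));
    return new PeerQuorumAddress(hosts, port, znodeParent);
  }

  public static void main(String[] args) {
    PeerQuorumAddress a = parse("clusterBzk1,clusterBzk2,clusterBzk3:2181/hbase-secure");
    System.out.println(a.hosts + " port=" + a.port + " znode=" + a.znodeParent);
  }
}
```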



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21246) Introduce WALIdentity interface

2018-10-22 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16659960#comment-16659960
 ] 

Hadoop QA commented on HBASE-21246:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 25s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
|| || || || {color:brown} HBASE-20952 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 32s{color} | {color:green} HBASE-20952 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 58s{color} | {color:green} HBASE-20952 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 11s{color} | {color:green} HBASE-20952 passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m  1s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  5s{color} | {color:green} HBASE-20952 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 39s{color} | {color:green} HBASE-20952 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m  7s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  0s{color} | {color:red} The patch has 5 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 10s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green}  9m 56s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 27s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}283m 55s{color} | {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 25s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}327m  2s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.client.TestCloneSnapshotFromClientWithRegionReplicas |
|   | hadoop.hbase.client.TestFromClientSide |
|   | hadoop.hbase.client.TestMobCloneSnapshotFromClient |
|   | hadoop.hbase.client.TestMobRestoreSnapshotFromClient |
|   | hadoop.hbase.client.TestSnapshotTemporaryDirectoryWithRegionReplicas |
|   | hadoop.hbase.replication.TestReplicationSmallTests |
|   | hadoop.hbase.client.TestFromClientSideWithCoprocessor |
|   | hadoop.hbase.client.TestCloneSnapshotFromClient |
|   | hadoop.hbase.client.TestRestoreSnapshotFromClientWithRegionReplicas |
|   | hadoop.hbase.client.TestFromClientSide3 |
|   | hadoop.hbase.replication.TestReplicationKillSlaveRS |
|   | hadoop.hbase.replication.TestReplicationSmallTestsSync |
|   | hadoop.hbase.client.replication.TestReplicationAdminWithClusters |
|   | hadoop.hbase.client.TestRestoreSnapshotFromClient |
|   | hadoop.hbase.client.TestSnapshotDFSTemporaryDirectory |
|   | hadoop.hbase.master.procedure.TestProcedurePriority |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| 

[jira] [Commented] (HBASE-21237) Use CompatRemoteProcedureResolver to dispatch open/close region requests to RS

2018-10-22 Thread Allan Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16659958#comment-16659958
 ] 

Allan Yang commented on HBASE-21237:


{quote}
Of note, I see us doing about 90 assigns a second with batching enabled whereas 
w/o, with this patch applied, we do 40. FYI.
{quote}
This is because CompatRemoteProcedureResolver sends the requests one by one 
in a single thread. Maybe we can make it multi-threaded.
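
To illustrate the single-thread versus multi-thread point above, here is a minimal sketch of fanning requests out over a small thread pool. It is not the HBase implementation: ParallelDispatchSketch, dispatchAll and the sender function are hypothetical names used only for illustration, and a real change would also have to consider ordering and failure handling on the RS side.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Hypothetical sketch: none of these names are taken from HBase.
final class ParallelDispatchSketch {

  // Submits every request to a fixed-size pool, then collects replies in request order.
  static <Q, R> List<R> dispatchAll(List<Q> requests, Function<Q, R> sender, int threads) {
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    try {
      List<Future<R>> pending = new ArrayList<>();
      for (Q request : requests) {
        Callable<R> task = () -> sender.apply(request);
        pending.add(pool.submit(task));
      }
      List<R> replies = new ArrayList<>();
      for (Future<R> future : pending) {
        replies.add(future.get()); // waits for this particular request to finish
      }
      return replies;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while dispatching", e);
    } catch (ExecutionException e) {
      throw new IllegalStateException("a request failed", e.getCause());
    } finally {
      pool.shutdown(); // lets already-submitted tasks finish, accepts no new ones
    }
  }

  public static void main(String[] args) {
    List<String> regions = Arrays.asList("region-a", "region-b", "region-c");
    List<String> replies = dispatchAll(regions, r -> "opened " + r, 2);
    System.out.println(replies);
  }
}
```

Replies are gathered in submission order, so the caller sees the same ordering as the one-by-one loop even though the sends overlap.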

> Use CompatRemoteProcedureResolver to dispatch open/close region requests to RS
> --
>
> Key: HBASE-21237
> URL: https://issues.apache.org/jira/browse/HBASE-21237
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.1.0, 2.0.2
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Major
> Fix For: 2.0.3
>
> Attachments: HBASE-21237.branch-2.0.001.patch
>
>
> As discussed in HBASE-21217, in branch-2.0 and branch-2.1 we should use 
> CompatRemoteProcedureResolver instead of ExecuteProceduresRemoteCall to 
> dispatch region open/close requests to the RS, because 
> ExecuteProceduresRemoteCall groups all the open/close operations into one 
> call and executes them sequentially on the target RS. If one operation fails, 
> all the operations are marked as failed. In fact, some of the operations 
> (like open region) are already executing in the open region handler thread, 
> but the master thinks these operations failed and reassigns the regions to 
> another RS. So when the previous RS reports to the master that the region is 
> online, the master kills the RS since it has already assigned the region to 
> another RS.
> For branch-2.2+, HBASE-21217 will fix this issue.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21301) Heatmap for key access patterns

2018-10-22 Thread Andrew Purtell (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16659955#comment-16659955
 ] 

Andrew Purtell commented on HBASE-21301:


If the table uid will have 3 bytes, I suggest you use 4 bytes for the region 
name uid. The relationship of region to table is many to one. 
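
As a rough illustration of the sizing argument above, the sketch below computes how many distinct ids fit in a fixed-width unsigned field and encodes an id into that width. UidWidthSketch and its methods are hypothetical and are not part of HBase or of the proposed patch.

```java
// Hypothetical helper, not HBase code: capacity and encoding of fixed-width unsigned ids.
final class UidWidthSketch {

  // Number of distinct ids that fit in an unsigned field of the given width (1 to 7 bytes).
  static long capacity(int widthBytes) {
    if (widthBytes < 1 || widthBytes > 7) {
      throw new IllegalArgumentException("width must be 1..7 bytes");
    }
    return 1L << (8 * widthBytes);
  }

  // Big-endian encoding of id into exactly widthBytes bytes.
  static byte[] encode(long id, int widthBytes) {
    if (id < 0 || id >= capacity(widthBytes)) {
      throw new IllegalArgumentException("id does not fit in " + widthBytes + " bytes");
    }
    byte[] out = new byte[widthBytes];
    for (int i = widthBytes - 1; i >= 0; i--) {
      out[i] = (byte) (id & 0xFF);
      id >>>= 8;
    }
    return out;
  }

  public static void main(String[] args) {
    System.out.println("3 bytes: " + capacity(3) + " ids"); // 16777216
    System.out.println("4 bytes: " + capacity(4) + " ids"); // 4294967296
  }
}
```

Since every table owns many regions, the region id space is the one that fills up first, which is why the wider field goes to the region name uid.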

> Heatmap for key access patterns
> ---
>
> Key: HBASE-21301
> URL: https://issues.apache.org/jira/browse/HBASE-21301
> Project: HBase
>  Issue Type: Improvement
>Reporter: Archana Katiyar
>Assignee: Archana Katiyar
>Priority: Major
> Fix For: 3.0.0, 1.5.0, 2.2.0
>
>
> Google recently released a beta feature for Cloud Bigtable which presents a 
> heat map of the keyspace. *Given how hotspotting comes up now and again here, 
> this is a good idea for giving HBase ops a tool to be proactive about it.* 
> >>>
> Additionally, we are announcing the beta version of Key Visualizer, a 
> visualization tool for Cloud Bigtable key access patterns. Key Visualizer 
> helps debug performance issues due to unbalanced access patterns across the 
> key space, or single rows that are too large or receiving too much read or 
> write activity. With Key Visualizer, you get a heat map visualization of 
> access patterns over time, along with the ability to zoom into specific key 
> or time ranges, or select a specific row to find the full row key ID that's 
> responsible for a hotspot. Key Visualizer is automatically enabled for Cloud 
> Bigtable clusters with sufficient data or activity, and does not affect Cloud 
> Bigtable cluster performance. 
> <<<
> From 
> [https://cloudplatform.googleblog.com/2018/07/on-gcp-your-database-your-way.html]
> (Copied this description from the write-up by [~apurtell], thanks Andrew.)





[jira] [Commented] (HBASE-21354) Procedure may be deleted improperly during master restarts resulting in 'Corrupt'

2018-10-22 Thread Allan Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16659933#comment-16659933
 ] 

Allan Yang commented on HBASE-21354:


{quote}
These transitions are critical so they should be INFO-level. Most folks run 
INFO-level, so if there is an accounting issue around Master Proc WALs, they'll 
miss critical info. We do INFO level for regionserver WALs.
{quote}
OK, I see, will change this on commit.

> Procedure may be deleted improperly during master restarts resulting in 
> 'Corrupt'
> -
>
> Key: HBASE-21354
> URL: https://issues.apache.org/jira/browse/HBASE-21354
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.1.0, 2.0.2
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Major
> Fix For: 3.0.0, 2.2.0, 2.1.1, 2.0.3
>
> Attachments: HBASE-21354.branch-2.0.001.patch, 
> HBASE-21354.branch-2.0.002.patch, HBASE-21354.branch-2.0.003.patch, 
> HBASE-21354.branch-2.0.004.patch
>
>
> Good news! [~stack], [~Apache9], I may have found the root cause of the 
> mysterious ‘Corrupted procedure’ errors and of procedures disappearing after 
> master restarts (seen during ITBLL).
> During a master restart, we load procedures from the logs and build the 
> 'holdingCleanupTracker' from each log's tracker. We may mark a procedure in 
> the oldest log as deleted if one log doesn't contain the procedure. This is 
> inappropriate, since a log will not contain any info about a procedure that 
> was not updated while that log was being written. We can delete the procedure 
> only if it is not in the global tracker, which has the whole picture.
> {code}
> trackerNode = tracker.lookupClosestNode(trackerNode, procId);
> if (trackerNode == null || !trackerNode.contains(procId) ||
>   trackerNode.isModified(procId)) {
>   // the procedure was removed or modified
>   node.delete(procId);
> }
> {code}
> A test case (testProcedureShouldNotCleanOnLoad) in the patch shows clearly 
> how the corruption happens.
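
The difference between the two deletion rules described above can be sketched with plain sets. This is a simplified model, not the actual ProcedureStoreTracker code: CleanupTrackerSketch and both method names are hypothetical, and each tracker is reduced to the set of procedure ids it mentions.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Simplified, hypothetical model: a tracker is just the set of procIds it knows about.
final class CleanupTrackerSketch {

  // Problematic rule: treat the procedure as deleted as soon as any one log lacks it.
  static boolean deletableByPerLogRule(long procId, List<Set<Long>> perLogTrackers) {
    for (Set<Long> logTracker : perLogTrackers) {
      if (!logTracker.contains(procId)) {
        return true; // wrong when the procedure simply was not updated in this log
      }
    }
    return false;
  }

  // Rule from the description: deletable only when the global tracker no longer has it.
  static boolean deletableByGlobalRule(long procId, Set<Long> globalTracker) {
    return !globalTracker.contains(procId);
  }

  public static void main(String[] args) {
    Set<Long> oldestLog = new HashSet<>(Arrays.asList(1L, 2L));
    Set<Long> newerLog = new HashSet<>(Arrays.asList(2L)); // proc 1 was idle here
    Set<Long> global = new HashSet<>(Arrays.asList(1L, 2L));

    System.out.println(deletableByPerLogRule(1L, Arrays.asList(oldestLog, newerLog)));
    System.out.println(deletableByGlobalRule(1L, global));
  }
}
```

In this model procedure 1 is still live, yet the per-log rule would drop it because the newer log never mentions it; the global rule keeps it.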





[jira] [Commented] (HBASE-21321) Backport HBASE-21278 to branch-2.1 and branch-2.0

2018-10-22 Thread Duo Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16659928#comment-16659928
 ] 

Duo Zhang commented on HBASE-21321:
---

+1.

> Backport HBASE-21278 to branch-2.1 and branch-2.0
> -
>
> Key: HBASE-21321
> URL: https://issues.apache.org/jira/browse/HBASE-21321
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Duo Zhang
>Assignee: stack
>Priority: Critical
> Fix For: 2.1.1, 2.0.3
>
> Attachments: HBASE-21321.branch-2.1.001.patch
>
>
> As the assign\unassign part is a bit different.





[jira] [Commented] (HBASE-21352) Polish the rollback implementation in ProcedureExecutor

2018-10-22 Thread Duo Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16659920#comment-16659920
 ] 

Duo Zhang commented on HBASE-21352:
---

Removed 2.1.1 and 2.0.3 from the fix versions; we can open a backport issue later.

> Polish the rollback implementation in ProcedureExecutor
> ---
>
> Key: HBASE-21352
> URL: https://issues.apache.org/jira/browse/HBASE-21352
> Project: HBase
>  Issue Type: Sub-task
>  Components: amv2, proc-v2
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 3.0.0, 2.2.0
>
> Attachments: HBASE-21352.patch
>
>
> Need to confirm that the way we make sure that there is one PE worker which 
> rolls back the whole procedure stack is correct. The current comment is not 
> enough.





[jira] [Commented] (HBASE-21352) Polish the rollback implementation in ProcedureExecutor

2018-10-22 Thread Duo Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16659919#comment-16659919
 ] 

Duo Zhang commented on HBASE-21352:
---

It is fine not to include this in 2.1.1; rollback is rare if we make most 
things work properly. And since there is a basic HBCK2, there is a way to fix 
things up.

> Polish the rollback implementation in ProcedureExecutor
> ---
>
> Key: HBASE-21352
> URL: https://issues.apache.org/jira/browse/HBASE-21352
> Project: HBase
>  Issue Type: Sub-task
>  Components: amv2, proc-v2
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 3.0.0, 2.2.0
>
> Attachments: HBASE-21352.patch
>
>
> Need to confirm that the way we make sure that there is one PE worker which 
> rolls back the whole procedure stack is correct. The current comment is not 
> enough.





[jira] [Updated] (HBASE-21352) Polish the rollback implementation in ProcedureExecutor

2018-10-22 Thread Duo Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang updated HBASE-21352:
--
Fix Version/s: (was: 2.0.3)
   (was: 2.1.1)

> Polish the rollback implementation in ProcedureExecutor
> ---
>
> Key: HBASE-21352
> URL: https://issues.apache.org/jira/browse/HBASE-21352
> Project: HBase
>  Issue Type: Sub-task
>  Components: amv2, proc-v2
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 3.0.0, 2.2.0
>
> Attachments: HBASE-21352.patch
>
>
> Need to confirm that the way we make sure that there is one PE worker which 
> rolls back the whole procedure stack is correct. The current comment is not 
> enough.





[jira] [Commented] (HBASE-21237) Use CompatRemoteProcedureResolver to dispatch open/close region requests to RS

2018-10-22 Thread Duo Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16659916#comment-16659916
 ] 

Duo Zhang commented on HBASE-21237:
---

Oh this should be a blocker for 2.1...

> Use CompatRemoteProcedureResolver to dispatch open/close region requests to RS
> --
>
> Key: HBASE-21237
> URL: https://issues.apache.org/jira/browse/HBASE-21237
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.1.0, 2.0.2
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Major
> Fix For: 2.0.3
>
> Attachments: HBASE-21237.branch-2.0.001.patch
>
>
> As discussed in HBASE-21217, in branch-2.0 and branch-2.1 we should use 
> CompatRemoteProcedureResolver instead of ExecuteProceduresRemoteCall to 
> dispatch region open/close requests to the RS, because 
> ExecuteProceduresRemoteCall groups all the open/close operations into one 
> call and executes them sequentially on the target RS. If one operation fails, 
> all the operations are marked as failed. In fact, some of the operations 
> (like open region) are already executing in the open region handler thread, 
> but the master thinks these operations failed and reassigns the regions to 
> another RS. So when the previous RS reports to the master that the region is 
> online, the master kills the RS since it has already assigned the region to 
> another RS.
> For branch-2.2+, HBASE-21217 will fix this issue.





[jira] [Commented] (HBASE-21338) [balancer] If balancer is an ill-fit for cluster size, it gives little indication

2018-10-22 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16659901#comment-16659901
 ] 

stack commented on HBASE-21338:
---

+1 on patch. Looks helpful.

> [balancer] If balancer is an ill-fit for cluster size, it gives little 
> indication
> -
>
> Key: HBASE-21338
> URL: https://issues.apache.org/jira/browse/HBASE-21338
> Project: HBase
>  Issue Type: Sub-task
>  Components: Balancer, Operability
>Reporter: stack
>Assignee: Xu Cang
>Priority: Major
> Attachments: HBASE-21338.master.001.patch
>
>
> See parent issue. Running balancer on a cluster where the max steps was way 
> inadequate, the balancer gave little to no indication that it was 
> ill-configured. In fact, it only logged its starting and then that there was 
> nothing to do though the cluster was obviously out-of-whack.
> Ideally the balancer would complain when say the maxSteps limit is a small 
> fraction of what the cluster's calculated max steps are, or it would notice 
> that the balancer is making little progress on an imbalanced cluster and 
> shout. Can we set balancer configs w/o having to restart Master?
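
The kind of complaint asked for in the description could look roughly like the check below. This is only a sketch under assumptions: BalancerStepCheckSketch, the method name and the minFraction threshold are hypothetical and are not taken from the StochasticLoadBalancer source or from the attached patch.

```java
// Hypothetical sketch of a "maxSteps is far too small" check; not HBase code.
final class BalancerStepCheckSketch {

  // True when the configured limit is below minFraction of what the cluster would need.
  static boolean maxStepsLooksInadequate(
      long configuredMaxSteps, long calculatedMaxSteps, double minFraction) {
    return calculatedMaxSteps > 0
        && configuredMaxSteps < calculatedMaxSteps * minFraction;
  }

  public static void main(String[] args) {
    long configured = 1_000_000L;    // assumed configured limit
    long calculated = 100_000_000L;  // assumed cluster-derived step count
    if (maxStepsLooksInadequate(configured, calculated, 0.1)) {
      System.out.println("WARN maxSteps=" + configured
          + " is a small fraction of calculated max steps=" + calculated);
    }
  }
}
```

A check like this would run once per balancer invocation and log at WARN, so an operator sees the mismatch instead of a bare "nothing to do".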





[jira] [Updated] (HBASE-21338) [balancer] If balancer is an ill-fit for cluster size, it gives little indication

2018-10-22 Thread stack (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-21338:
--
Component/s: Operability

> [balancer] If balancer is an ill-fit for cluster size, it gives little 
> indication
> -
>
> Key: HBASE-21338
> URL: https://issues.apache.org/jira/browse/HBASE-21338
> Project: HBase
>  Issue Type: Sub-task
>  Components: Balancer, Operability
>Reporter: stack
>Assignee: Xu Cang
>Priority: Major
> Attachments: HBASE-21338.master.001.patch
>
>
> See parent issue. Running balancer on a cluster where the max steps was way 
> inadequate, the balancer gave little to no indication that it was 
> ill-configured. In fact, it only logged its starting and then that there was 
> nothing to do though the cluster was obviously out-of-whack.
> Ideally the balancer would complain when say the maxSteps limit is a small 
> fraction of what the cluster's calculated max steps are, or it would notice 
> that the balancer is making little progress on an imbalanced cluster and 
> shout. Can we set balancer configs w/o having to restart Master?





[jira] [Commented] (HBASE-21338) [balancer] If balancer is an ill-fit for cluster size, it gives little indication

2018-10-22 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16659900#comment-16659900
 ] 

stack commented on HBASE-21338:
---

Oh. Excellent [~xucang] Thank you for the reminder. I see it here in 
http://hbase.apache.org/book.html#dyn_config after your prompting.

bq. Also, to overcome above mentioned issue, just need to set 
"hbase.master.balancer.stochastic.runMaxSteps" to true.

I should have tried it. It looks weird though. I wonder if it works.



> [balancer] If balancer is an ill-fit for cluster size, it gives little 
> indication
> -
>
> Key: HBASE-21338
> URL: https://issues.apache.org/jira/browse/HBASE-21338
> Project: HBase
>  Issue Type: Sub-task
>  Components: Balancer
>Reporter: stack
>Assignee: Xu Cang
>Priority: Major
> Attachments: HBASE-21338.master.001.patch
>
>
> See parent issue. Running balancer on a cluster where the max steps was way 
> inadequate, the balancer gave little to no indication that it was 
> ill-configured. In fact, it only logged its starting and then that there was 
> nothing to do though the cluster was obviously out-of-whack.
> Ideally the balancer would complain when say the maxSteps limit is a small 
> fraction of what the cluster's calculated max steps are, or it would notice 
> that the balancer is making little progress on an imbalanced cluster and 
> shout. Can we set balancer configs w/o having to restart Master?





[jira] [Commented] (HBASE-21359) Fix build problem against Hadoop 2.8.5

2018-10-22 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16659884#comment-16659884
 ] 

Hadoop QA commented on HBASE-21359:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  1m 21s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:orange}-0{color} | {color:orange} test4tests {color} | {color:orange}  0m  0s{color} | {color:orange} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} branch-1 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m  0s{color} | {color:green} branch-1 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 17s{color} | {color:green} branch-1 passed with JDK v1.8.0_181 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m  7s{color} | {color:green} branch-1 passed with JDK v1.7.0_191 {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} xml {color} | {color:red}  0m  0s{color} | {color:red} The patch has 1 ill-formed XML file(s). {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m  7s{color} | {color:green} the patch passed with JDK v1.8.0_181 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m  7s{color} | {color:green} the patch passed with JDK v1.7.0_191 {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m  8s{color} | {color:green} hbase-resource-bundle in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 13s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 12m 26s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| XML | Parsing Error(s): |
|   | hbase-resource-bundle/src/main/resources/supplemental-models.xml |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:61288f8 |
| JIRA Issue | HBASE-21359 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12945125/HBASE-21359-branch-1.patch |
| Optional Tests |  dupname  asflicense  javac  javadoc  unit  xml  |
| uname | Linux 1ec394bd6c30 4.4.0-133-generic #159-Ubuntu SMP Fri Aug 10 07:31:43 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
| git revision | branch-1 / f73d619 |
| maven | version: Apache Maven 3.0.5 |
| Default Java | 1.7.0_191 |
| Multi-JDK versions |  /usr/lib/jvm/java-8-openjdk-amd64:1.8.0_181 /usr/lib/jvm/java-7-openjdk-amd64:1.7.0_191 |
| xml | https://builds.apache.org/job/PreCommit-HBASE-Build/14812/artifact/patchprocess/xml.txt |
|  Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/14812/testReport/ |
| Max. process+thread count | 67 (vs. ulimit of 1) |
| modules | C: hbase-resource-bundle U: hbase-resource-bundle |
| Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/14812/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> Fix build problem against Hadoop 2.8.5
> --
>
> Key: HBASE-21359
> URL: https://issues.apache.org/jira/browse/HBASE-21359
> Project: HBase
>  Issue Type: Bug
>  Components: build
>Affects Versions: 1.4.8
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
>Priority: Trivial
> Fix For: 1.5.0, 1.4.9
>
> Attachments: HBASE-21359-branch-1.patch
>
>
> 1.4.8 build fails against Hadoop 2.8.5. The fix is an easy change to 
> supplemental-models.xml. 





[jira] [Commented] (HBASE-21355) HStore's storeSize is calculated repeatedly which causing the confusing region split

2018-10-22 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16659880#comment-16659880
 ] 

Hudson commented on HBASE-21355:


Results for branch branch-1.3
[build #513 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.3/513/]: (x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.3/513//General_Nightly_Build_Report/]


(x) {color:red}-1 jdk7 checks{color}
-- For more information [see jdk7 report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.3/513//JDK7_Nightly_Build_Report/]


(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.3/513//JDK8_Nightly_Build_Report_(Hadoop2)/]




(/) {color:green}+1 source release artifact{color}
-- See build output for details.


> HStore's storeSize is calculated repeatedly which causing the confusing 
> region split 
> -
>
> Key: HBASE-21355
> URL: https://issues.apache.org/jira/browse/HBASE-21355
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Reporter: Zheng Hu
>Assignee: Zheng Hu
>Priority: Blocker
> Fix For: 3.0.0, 1.5.0, 1.3.3, 2.2.0, 2.1.1, 2.0.3, 1.4.9
>
> Attachments: HBASE-21355.addendum.patch, HBASE-21355.addendum.patch, 
> HBASE-21355.branch-1.patch, HBASE-21355.v1.patch
>
>
> When testing branch-2's write performance in our internal cluster, we 
> found that regions were being split inexplicably.
> We use the default ConstantSizeRegionSplitPolicy and 
> hbase.hregion.max.filesize=40G, but a region would be split even when its 
> size in bytes was less than 40G (only ~6G). 
> Checking the code, I found that the following path accumulates the 
> store's storeSize to a very large value, because the path never resets it:
> {code}
> RsRpcServices#getRegionInfo
>   -> HRegion#isMergeable
>     -> HRegion#hasReferences
>       -> HStore#hasReferences
>         -> HStore#openStoreFiles
> {code}
> BTW, we seem to have forgotten to maintain the read replica's storeSize when 
> refreshing the store files.
> One note: I moved the storeSize calculation out of the loadStoreFiles() 
> method, because the secondary read replica's refreshStoreFiles() also 
> uses loadStoreFiles() to refresh its store files and updates the storeSize in 
> completeCompaction(..) at the end (just like compaction), so there is no need 
> to calculate the storeSize twice.
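
The accumulation problem described above can be reproduced in miniature. The class below is a hypothetical stand-in, not HStore: it only contrasts a counter that is added to on every open with one that is recomputed from the current file list.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for a store; sizes are in arbitrary units (think "GB").
final class StoreSizeSketch {
  private final long[] fileSizes;
  private final AtomicLong storeSize = new AtomicLong();

  StoreSizeSketch(long... fileSizes) {
    this.fileSizes = fileSizes.clone();
  }

  // Buggy pattern: every call adds all file sizes again without resetting the counter.
  long openAccumulating() {
    for (long size : fileSizes) {
      storeSize.addAndGet(size);
    }
    return storeSize.get();
  }

  // Safer pattern: recompute the total from the file list and overwrite the counter.
  long openRecomputing() {
    long total = 0;
    for (long size : fileSizes) {
      total += size;
    }
    storeSize.set(total);
    return total;
  }

  public static void main(String[] args) {
    StoreSizeSketch store = new StoreSizeSketch(2, 4); // a 6-unit store
    long reported = 0;
    for (int call = 0; call < 7; call++) {
      reported = store.openAccumulating();
    }
    System.out.println("after 7 accumulating calls: " + reported);
    System.out.println("after recomputing: " + store.openRecomputing());
  }
}
```

With a ~6G store, seven passes through the accumulating path would report about 42G, which is enough to cross a 40G split threshold even though the data on disk never grew.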





[jira] [Updated] (HBASE-21349) Cluster is going down but CatalogJanitor and Normalizer try to run and fail noisely

2018-10-22 Thread Xu Cang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xu Cang updated HBASE-21349:

Attachment: HBASE-22349.master.001.patch
Status: Patch Available  (was: Open)

> Cluster is going down but CatalogJanitor and Normalizer try to run and fail 
> noisely
> ---
>
> Key: HBASE-21349
> URL: https://issues.apache.org/jira/browse/HBASE-21349
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.1.0
>Reporter: stack
>Assignee: Xu Cang
>Priority: Minor
> Attachments: HBASE-22349.master.001.patch
>
>
> Shutting down can take a while. In the meantime, the catalog janitor and/or 
> normalizer (etc.?) try to run, and when they can't, they fail noisily. Looks bad:
> {code}
> 2018-10-19 21:23:24,962 INFO org.apache.hadoop.hbase.master.ServerManager: Cluster shutdown set; vc1205.halxg.cloudera.com,22101,1539991730711 expired; onlineServers=51
> 2018-10-19 21:25:54,502 WARN org.apache.hadoop.hbase.master.CatalogJanitor: Failed scan of catalog table
> java.io.IOException: connection is closed
> at org.apache.hadoop.hbase.MetaTableAccessor.getMetaHTable(MetaTableAccessor.java:267)
> at org.apache.hadoop.hbase.MetaTableAccessor.scanMeta(MetaTableAccessor.java:763)
> at org.apache.hadoop.hbase.MetaTableAccessor.scanMeta(MetaTableAccessor.java:734)
> at org.apache.hadoop.hbase.MetaTableAccessor.scanMeta(MetaTableAccessor.java:684)
> at org.apache.hadoop.hbase.MetaTableAccessor.scanMetaForTableRegions(MetaTableAccessor.java:679)
> at org.apache.hadoop.hbase.master.CatalogJanitor.getMergedRegionsAndSplitParents(CatalogJanitor.java:185)
> at org.apache.hadoop.hbase.master.CatalogJanitor.getMergedRegionsAndSplitParents(CatalogJanitor.java:137)
> at org.apache.hadoop.hbase.master.CatalogJanitor.scan(CatalogJanitor.java:243)
> at org.apache.hadoop.hbase.master.CatalogJanitor.chore(CatalogJanitor.java:116)
> at org.apache.hadoop.hbase.ScheduledChore.run(ScheduledChore.java:186)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
> at org.apache.hadoop.hbase.JitterScheduledThreadPoolExecutorImpl$JitteredRunnableScheduledFuture.run(JitterScheduledThreadPoolExecutorImpl.java:111)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> 2018-10-19 21:25:54,507 ERROR org.apache.hadoop.hbase.master.normalizer.RegionNormalizerChore: Failed to normalize regions.
> java.io.IOException: connection is closed
> at org.apache.hadoop.hbase.MetaTableAccessor.getMetaHTable(MetaTableAccessor.java:267)
> at org.apache.hadoop.hbase.MetaTableAccessor.scanMeta(MetaTableAccessor.java:763)
> at org.apache.hadoop.hbase.MetaTableAccessor.scanMeta(MetaTableAccessor.java:734)
> at org.apache.hadoop.hbase.MetaTableAccessor.scanMeta(MetaTableAccessor.java:690)
> at org.apache.hadoop.hbase.MetaTableAccessor.fullScanTables(MetaTableAccessor.java:240)
> at org.apache.hadoop.hbase.master.TableStateManager.getTablesInStates(TableStateManager.java:189)
> at org.apache.hadoop.hbase.master.HMaster.normalizeRegions(HMaster.java:1718)
> at org.apache.hadoop.hbase.master.normalizer.RegionNormalizerChore.chore(RegionNormalizerChore.java:48)
> at org.apache.hadoop.hbase.ScheduledChore.run(ScheduledChore.java:186)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
> at org.apache.hadoop.hbase.JitterScheduledThreadPoolExecutorImpl$JitteredRunnableScheduledFuture.run(JitterScheduledThreadPoolExecutorImpl.java:111)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at 

[jira] [Commented] (HBASE-21321) Backport HBASE-21278 to branch-2.1 and branch-2.0

2018-10-22 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16659871#comment-16659871
 ] 

Hadoop QA commented on HBASE-21321:
---

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 10s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 12 new or modified test files. {color} |
|| || || || {color:brown} branch-2.1 Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m  2s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 45s{color} | {color:green} branch-2.1 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 13s{color} | {color:green} branch-2.1 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 27s{color} | {color:green} branch-2.1 passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  3m 39s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 39s{color} | {color:green} branch-2.1 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 48s{color} | {color:green} branch-2.1 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 15s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m  7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  2m 10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 13s{color} | {color:green} hbase-procedure: The patch generated 0 new + 17 unchanged - 1 fixed = 17 total (was 18) {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 10s{color} | {color:green} The patch passed checkstyle in hbase-server {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  3m 36s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green}  8m 46s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 50s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m 19s{color} | {color:green} hbase-procedure in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}123m 26s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 39s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}169m 44s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:42ca976 |
| JIRA Issue | HBASE-21321 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12945105/HBASE-21321.branch-2.1.001.patch
 |
| Optional Tests |  dupname  asflicense  javac  javadoc  unit  findbugs  
shadedjars  hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux 3f89dd9940c0 3.13.0-143-generic #192-Ubuntu SMP Tue Feb 27 
10:45:36 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 

[jira] [Commented] (HBASE-21246) Introduce WALIdentity interface

2018-10-22 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16659867#comment-16659867
 ] 

Ted Yu commented on HBASE-21246:


bq. Why do we have a String constructor anyways?

One example usage for the ctor taking String is in refreshSources of 
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceManager.java
 :
{code}
for (SortedSet walsByGroup : walsByIdRecoveredQueues.get(queueId).values()) {
  walsByGroup.forEach(wal -> src.enqueueLog(this.walProvider.createWALIdentity(wal)));
}
{code}
createWALIdentity() ends up calling the FSWALIdentity constructor that accepts 
a String (when the WAL is backed by HDFS).
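
For readers who have not opened the patch, here is a minimal sketch of the call 
chain described above. Only the names WALIdentity, FSWALIdentity, getName and 
createWALIdentity come from this issue; the signatures, the SketchWALProvider 
class and the enqueueAll helper are assumptions made for illustration and are 
not the code in the patch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

// Assumed shape of the interface; the issue only states that it provides getName().
interface WALIdentity {
  String getName();
}

// Filesystem-backed identity built from the String form of a WAL name.
final class FSWALIdentity implements WALIdentity {
  private final String name;

  FSWALIdentity(String name) {
    this.name = name;
  }

  @Override
  public String getName() {
    return name;
  }
}

// Hypothetical stand-in for the provider; the real WALProvider does much more.
final class SketchWALProvider {
  WALIdentity createWALIdentity(String walName) {
    return new FSWALIdentity(walName);
  }
}

final class WalIdentityCtorSketch {
  // Mirrors the quoted loop: each String WAL name is turned into a WALIdentity
  // before it is enqueued. Returns the enqueued names in order.
  static List<String> enqueueAll(SortedSet<String> walsByGroup) {
    SketchWALProvider provider = new SketchWALProvider();
    List<WALIdentity> queue = new ArrayList<>();
    walsByGroup.forEach(wal -> queue.add(provider.createWALIdentity(wal)));
    List<String> names = new ArrayList<>();
    for (WALIdentity id : queue) {
      names.add(id.getName());
    }
    return names;
  }

  public static void main(String[] args) {
    SortedSet<String> wals = new TreeSet<>();
    wals.add("wal.2");
    wals.add("wal.1");
    System.out.println(enqueueAll(wals));
  }
}
```

The point of the sketch is only that callers which hold WAL names as Strings 
need some String-accepting entry point to obtain a WALIdentity.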

> Introduce WALIdentity interface
> ---
>
> Key: HBASE-21246
> URL: https://issues.apache.org/jira/browse/HBASE-21246
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Major
> Fix For: HBASE-20952
>
> Attachments: 21246.003.patch, 21246.HBASE-20952.001.patch, 
> 21246.HBASE-20952.002.patch, 21246.HBASE-20952.004.patch, 
> 21246.HBASE-20952.005.patch, 21246.HBASE-20952.007.patch, 
> 21246.HBASE-20952.008.patch
>
>
> We are introducing WALIdentity interface so that the WAL representation can 
> be decoupled from distributed filesystem.
> The interface provides getName method whose return value can represent 
> filename in distributed filesystem environment or, the name of the stream 
> when the WAL is backed by log stream.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HBASE-21349) Cluster is going down but CatalogJanitor and Normalizer try to run and fail noisely

2018-10-22 Thread Xu Cang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xu Cang reassigned HBASE-21349:
---

Assignee: Xu Cang

> Cluster is going down but CatalogJanitor and Normalizer try to run and fail 
> noisely
> ---
>
> Key: HBASE-21349
> URL: https://issues.apache.org/jira/browse/HBASE-21349
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.1.0
>Reporter: stack
>Assignee: Xu Cang
>Priority: Minor
>
> Shutting down can take a while. Meanwhile, the catalog janitor and/or normalizer 
> (etc.?) try to run, and when they can't, they fail noisily. Looks bad:
> {code}
> 2018-10-19 21:23:24,962 INFO org.apache.hadoop.hbase.master.ServerManager: 
> Cluster shutdown set; vc1205.halxg.cloudera.com,22101,1539991730711 expired; 
> onlineServers=51
> 2018-10-19 21:25:54,502 WARN org.apache.hadoop.hbase.master.CatalogJanitor: 
> Failed scan of catalog table
> java.io.IOException: connection is closed
> at 
> org.apache.hadoop.hbase.MetaTableAccessor.getMetaHTable(MetaTableAccessor.java:267)
> at 
> org.apache.hadoop.hbase.MetaTableAccessor.scanMeta(MetaTableAccessor.java:763)
> at 
> org.apache.hadoop.hbase.MetaTableAccessor.scanMeta(MetaTableAccessor.java:734)
> at 
> org.apache.hadoop.hbase.MetaTableAccessor.scanMeta(MetaTableAccessor.java:684)
> at 
> org.apache.hadoop.hbase.MetaTableAccessor.scanMetaForTableRegions(MetaTableAccessor.java:679)
> at 
> org.apache.hadoop.hbase.master.CatalogJanitor.getMergedRegionsAndSplitParents(CatalogJanitor.java:185)
> at 
> org.apache.hadoop.hbase.master.CatalogJanitor.getMergedRegionsAndSplitParents(CatalogJanitor.java:137)
> at 
> org.apache.hadoop.hbase.master.CatalogJanitor.scan(CatalogJanitor.java:243)
> at 
> org.apache.hadoop.hbase.master.CatalogJanitor.chore(CatalogJanitor.java:116)
> at org.apache.hadoop.hbase.ScheduledChore.run(ScheduledChore.java:186)
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
> at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
> at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
> at 
> org.apache.hadoop.hbase.JitterScheduledThreadPoolExecutorImpl$JitteredRunnableScheduledFuture.run(JitterScheduledThreadPoolExecutorImpl.java:111)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> 2018-10-19 21:25:54,507 ERROR 
> org.apache.hadoop.hbase.master.normalizer.RegionNormalizerChore: Failed to 
> normalize regions.
> java.io.IOException: connection is closed
> at 
> org.apache.hadoop.hbase.MetaTableAccessor.getMetaHTable(MetaTableAccessor.java:267)
> at 
> org.apache.hadoop.hbase.MetaTableAccessor.scanMeta(MetaTableAccessor.java:763)
> at 
> org.apache.hadoop.hbase.MetaTableAccessor.scanMeta(MetaTableAccessor.java:734)
> at 
> org.apache.hadoop.hbase.MetaTableAccessor.scanMeta(MetaTableAccessor.java:690)
> at 
> org.apache.hadoop.hbase.MetaTableAccessor.fullScanTables(MetaTableAccessor.java:240)
> at 
> org.apache.hadoop.hbase.master.TableStateManager.getTablesInStates(TableStateManager.java:189)
> at 
> org.apache.hadoop.hbase.master.HMaster.normalizeRegions(HMaster.java:1718)
> at 
> org.apache.hadoop.hbase.master.normalizer.RegionNormalizerChore.chore(RegionNormalizerChore.java:48)
> at org.apache.hadoop.hbase.ScheduledChore.run(ScheduledChore.java:186)
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
> at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
> at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
> at 
> org.apache.hadoop.hbase.JitterScheduledThreadPoolExecutorImpl$JitteredRunnableScheduledFuture.run(JitterScheduledThreadPoolExecutorImpl.java:111)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> {code}
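
One possible direction, sketched here under stated assumptions and not taken 
from any patch on this issue: have a periodic chore check a "cluster is 
shutting down" flag and skip quietly instead of attempting the meta scan. The 
class ShutdownAwareChore and its methods are hypothetical and do not use the 
real ScheduledChore API.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a periodic master chore.
final class ShutdownAwareChore {
  private final AtomicBoolean clusterShuttingDown;
  private final Runnable work;
  private final AtomicInteger skipped = new AtomicInteger();

  ShutdownAwareChore(AtomicBoolean clusterShuttingDown, Runnable work) {
    this.clusterShuttingDown = clusterShuttingDown;
    this.work = work;
  }

  // Returns true if the work ran, false if it was skipped because shutdown is in progress.
  boolean runOnce() {
    if (clusterShuttingDown.get()) {
      // Skip without a stack trace; a single low-level log line would go here.
      skipped.incrementAndGet();
      return false;
    }
    work.run();
    return true;
  }

  int skippedRuns() {
    return skipped.get();
  }
}

final class ShutdownAwareChoreDemo {
  public static void main(String[] args) {
    AtomicBoolean shuttingDown = new AtomicBoolean(false);
    ShutdownAwareChore chore =
        new ShutdownAwareChore(shuttingDown, () -> System.out.println("scanning catalog"));
    chore.runOnce();        // runs the work
    shuttingDown.set(true);
    chore.runOnce();        // skipped quietly
    System.out.println("skipped runs: " + chore.skippedRuns());
  }
}
```

Whether the check belongs in each chore or in the scheduler that runs them is 
a design choice this sketch does not settle.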





[jira] [Updated] (HBASE-21359) Fix build problem against Hadoop 2.8.5

2018-10-22 Thread Andrew Purtell (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-21359:
---
Attachment: HBASE-21359-branch-1.patch

> Fix build problem against Hadoop 2.8.5
> --
>
> Key: HBASE-21359
> URL: https://issues.apache.org/jira/browse/HBASE-21359
> Project: HBase
>  Issue Type: Bug
>  Components: build
>Affects Versions: 1.4.8
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
>Priority: Trivial
> Fix For: 1.5.0, 1.4.9
>
> Attachments: HBASE-21359-branch-1.patch
>
>
> 1.4.8 build fails against Hadoop 2.8.5. The fix is an easy change to 
> supplemental-models.xml. 





[jira] [Updated] (HBASE-21359) Fix build problem against Hadoop 2.8.5

2018-10-22 Thread Andrew Purtell (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-21359:
---
Status: Patch Available  (was: Open)

> Fix build problem against Hadoop 2.8.5
> --
>
> Key: HBASE-21359
> URL: https://issues.apache.org/jira/browse/HBASE-21359
> Project: HBase
>  Issue Type: Bug
>  Components: build
>Affects Versions: 1.4.8
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
>Priority: Trivial
> Fix For: 1.5.0, 1.4.9
>
> Attachments: HBASE-21359-branch-1.patch
>
>
> 1.4.8 build fails against Hadoop 2.8.5. The fix is an easy change to 
> supplemental-models.xml. 





[jira] [Commented] (HBASE-21073) "Maintenance mode" master

2018-10-22 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16659848#comment-16659848
 ] 

stack commented on HBASE-21073:
---

That TestRSGroups is flakey. I can't make it fail locally. Fails all the time 
on GCE flakey build. Need to dig in. Shouldn't prevent commit here I'd say.

> "Maintenance mode" master
> -
>
> Key: HBASE-21073
> URL: https://issues.apache.org/jira/browse/HBASE-21073
> Project: HBase
>  Issue Type: Sub-task
>  Components: amv2, hbck2, master
>Reporter: stack
>Assignee: Mike Drob
>Priority: Major
> Attachments: HBASE-21073.branch-2.001.patch, 
> HBASE-21073.branch-2.1.001.patch, HBASE-21073.branch-2.1.002.patch, 
> HBASE-21073.master.001.patch, HBASE-21073.master.002.patch, 
> HBASE-21073.master.003.patch, HBASE-21073.master.004.patch, 
> HBASE-21073.master.005.patch, HBASE-21073.master.006.patch, 
> HBASE-21073.master.007.patch, HBASE-21073.master.008.patch, 
> HBASE-21073.master.009.patch, HBASE-21073.master.010.patch, 
> HBASE-21073.master.011.patch
>
>
> Make it so we can bring up a Master in "maintenance mode". This means parsing the 
> master WAL procs but not taking on regionservers. It would be in a state 
> where "repair" Procedures could run; e.g. a Procedure that could recover meta 
> by looking for meta WALs, splitting them, dropping recovered.edits, and even 
> making it so meta is readable. See parent issue for why needed (disaster 
> recovery).





[jira] [Created] (HBASE-21359) Fix build problem against Hadoop 2.8.5

2018-10-22 Thread Andrew Purtell (JIRA)
Andrew Purtell created HBASE-21359:
--

 Summary: Fix build problem against Hadoop 2.8.5
 Key: HBASE-21359
 URL: https://issues.apache.org/jira/browse/HBASE-21359
 Project: HBase
  Issue Type: Bug
  Components: build
Affects Versions: 1.4.8
Reporter: Andrew Purtell
Assignee: Andrew Purtell
 Fix For: 1.5.0, 1.4.9


1.4.8 build fails against Hadoop 2.8.5. The fix is an easy change to 
supplemental-models.xml. 





[jira] [Commented] (HBASE-21246) Introduce WALIdentity interface

2018-10-22 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16659842#comment-16659842
 ] 

Ted Yu commented on HBASE-21246:


bq. is the 'design' the thread attached to the document or do we read the body 
of the doc to find the design?

The design is outlined starting from the second paragraph of section 3, in the 
'Write Path' subsection.

The thread beside the document provides background on how the design decision 
was reached.

bq. Has it been integrated

The current form of the WALIdentity design integrates comments from both the 
design doc and this JIRA.

bq. is the design ongoing?

I would say that, if there is no objection to the current form, we can consider 
it the consensus.
I left the thread on the design doc open to give reviewers more time to evaluate.

> Introduce WALIdentity interface
> ---
>
> Key: HBASE-21246
> URL: https://issues.apache.org/jira/browse/HBASE-21246
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Major
> Fix For: HBASE-20952
>
> Attachments: 21246.003.patch, 21246.HBASE-20952.001.patch, 
> 21246.HBASE-20952.002.patch, 21246.HBASE-20952.004.patch, 
> 21246.HBASE-20952.005.patch, 21246.HBASE-20952.007.patch, 
> 21246.HBASE-20952.008.patch
>
>
> We are introducing WALIdentity interface so that the WAL representation can 
> be decoupled from distributed filesystem.
> The interface provides getName method whose return value can represent 
> filename in distributed filesystem environment or, the name of the stream 
> when the WAL is backed by log stream.





[jira] [Commented] (HBASE-21246) Introduce WALIdentity interface

2018-10-22 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16659826#comment-16659826
 ] 

stack commented on HBASE-21246:
---

The link works now. Thanks.

So, exactly my point, is the 'design' the thread attached to the document or do 
we read the body of the doc to find the design? The discussion in the thread is 
involved. Has it been integrated/resolved/discounted? It doesn't look so. So, 
is the design ongoing?



> Introduce WALIdentity interface
> ---
>
> Key: HBASE-21246
> URL: https://issues.apache.org/jira/browse/HBASE-21246
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Major
> Fix For: HBASE-20952
>
> Attachments: 21246.003.patch, 21246.HBASE-20952.001.patch, 
> 21246.HBASE-20952.002.patch, 21246.HBASE-20952.004.patch, 
> 21246.HBASE-20952.005.patch, 21246.HBASE-20952.007.patch, 
> 21246.HBASE-20952.008.patch
>
>
> We are introducing WALIdentity interface so that the WAL representation can 
> be decoupled from distributed filesystem.
> The interface provides getName method whose return value can represent 
> filename in distributed filesystem environment or, the name of the stream 
> when the WAL is backed by log stream.





[jira] [Comment Edited] (HBASE-21338) [balancer] If balancer is an ill-fit for cluster size, it gives little indication

2018-10-22 Thread Xu Cang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16659815#comment-16659815
 ] 

Xu Cang edited comment on HBASE-21338 at 10/22/18 11:30 PM:


Config for this class is auto-reloadable. 
[https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/master/balancer/StochasticLoadBalancer.java#L162]
 

 

Also, to overcome the above-mentioned issue, you just need to set 
"hbase.master.balancer.stochastic.runMaxSteps" to true.

I added some logging, including a WARN message when computedMaxSteps is greater 
than maxSteps.
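
To make the behaviour concrete, here is a rough sketch of the check. Only the 
names computedMaxSteps, maxSteps and runMaxSteps come from this comment; the 
formula for the computed step count, the class name and the message text are 
assumptions for illustration and may not match StochasticLoadBalancer or the 
attached patch.

```java
import java.util.Optional;

final class BalancerStepsSketch {
  // Assumed formula for the cluster-derived step count; the real balancer may differ.
  static long computedMaxSteps(int numRegions, int stepsPerRegion, int numServers) {
    return (long) numRegions * stepsPerRegion * numServers;
  }

  // With runMaxSteps=true the cap never truncates the computed value;
  // otherwise the configured maxSteps acts as an upper bound.
  static long effectiveMaxSteps(long computed, long maxSteps, boolean runMaxSteps) {
    return runMaxSteps ? Math.max(maxSteps, computed) : Math.min(maxSteps, computed);
  }

  // Produces a WARN-style message when the configured cap cuts the run short.
  static Optional<String> warnIfCapped(long computed, long maxSteps, boolean runMaxSteps) {
    if (!runMaxSteps && computed > maxSteps) {
      return Optional.of(String.format(
          "computedMaxSteps=%d is greater than maxSteps=%d; balancer may stop early. "
              + "Consider setting hbase.master.balancer.stochastic.runMaxSteps=true",
          computed, maxSteps));
    }
    return Optional.empty();
  }

  public static void main(String[] args) {
    long computed = computedMaxSteps(10000, 800, 50);
    System.out.println(effectiveMaxSteps(computed, 1_000_000L, false));
    warnIfCapped(computed, 1_000_000L, false).ifPresent(System.out::println);
  }
}
```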

 


was (Author: xucang):
Config for this class is auto-reloadable. 
[https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/master/balancer/StochasticLoadBalancer.java#L162]
 

 

Also, to overcome above mentioned issue, just need to set  
"hbase.master.balancer.stochastic.runMaxSteps" to true.

I added some logging including a WARN message to code when computedMaxSteps  is 
greater than maxSteps.

 

> [balancer] If balancer is an ill-fit for cluster size, it gives little 
> indication
> -
>
> Key: HBASE-21338
> URL: https://issues.apache.org/jira/browse/HBASE-21338
> Project: HBase
>  Issue Type: Sub-task
>  Components: Balancer
>Reporter: stack
>Assignee: Xu Cang
>Priority: Major
> Attachments: HBASE-21338.master.001.patch
>
>
> See parent issue. Running balancer on a cluster where the max steps was way 
> inadequate, the balancer gave little to no indication that it was 
> ill-configured. In fact, it only logged its starting and then that there was 
> nothing to do though the cluster was obviously out-of-whack.
> Ideally the balancer would complain when say the maxSteps limit is a small 
> fraction of what the cluster's calculated max steps are, or it would notice 
> that the balancer is making little progress on an imbalanced cluster and 
> shout. Can we set balancer configs w/o having to restart Master?





[jira] [Assigned] (HBASE-21338) [balancer] If balancer is an ill-fit for cluster size, it gives little indication

2018-10-22 Thread Xu Cang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xu Cang reassigned HBASE-21338:
---

Assignee: Xu Cang

> [balancer] If balancer is an ill-fit for cluster size, it gives little 
> indication
> -
>
> Key: HBASE-21338
> URL: https://issues.apache.org/jira/browse/HBASE-21338
> Project: HBase
>  Issue Type: Sub-task
>  Components: Balancer
>Reporter: stack
>Assignee: Xu Cang
>Priority: Major
> Attachments: HBASE-21338.master.001.patch
>
>
> See parent issue. Running balancer on a cluster where the max steps was way 
> inadequate, the balancer gave little to no indication that it was 
> ill-configured. In fact, it only logged its starting and then that there was 
> nothing to do though the cluster was obviously out-of-whack.
> Ideally the balancer would complain when say the maxSteps limit is a small 
> fraction of what the cluster's calculated max steps are, or it would notice 
> that the balancer is making little progress on an imbalanced cluster and 
> shout. Can we set balancer configs w/o having to restart Master?





[jira] [Updated] (HBASE-21338) [balancer] If balancer is an ill-fit for cluster size, it gives little indication

2018-10-22 Thread Xu Cang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xu Cang updated HBASE-21338:

Attachment: HBASE-21338.master.001.patch
Status: Patch Available  (was: Open)

> [balancer] If balancer is an ill-fit for cluster size, it gives little 
> indication
> -
>
> Key: HBASE-21338
> URL: https://issues.apache.org/jira/browse/HBASE-21338
> Project: HBase
>  Issue Type: Sub-task
>  Components: Balancer
>Reporter: stack
>Priority: Major
> Attachments: HBASE-21338.master.001.patch
>
>
> See parent issue. Running balancer on a cluster where the max steps was way 
> inadequate, the balancer gave little to no indication that it was 
> ill-configured. In fact, it only logged its starting and then that there was 
> nothing to do though the cluster was obviously out-of-whack.
> Ideally the balancer would complain when say the maxSteps limit is a small 
> fraction of what the cluster's calculated max steps are, or it would notice 
> that the balancer is making little progress on an imbalanced cluster and 
> shout. Can we set balancer configs w/o having to restart Master?





[jira] [Commented] (HBASE-21338) [balancer] If balancer is an ill-fit for cluster size, it gives little indication

2018-10-22 Thread Xu Cang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16659815#comment-16659815
 ] 

Xu Cang commented on HBASE-21338:
-

Config for this class is auto-reloadable. 
[https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/master/balancer/StochasticLoadBalancer.java#L162]
 

 

Also, to overcome mentioned issue, just need to set  
"hbase.master.balancer.stochastic.runMaxSteps" to true.

I added some logging including a WARN message to code when computedMaxSteps  is 
greater than maxSteps.

 

> [balancer] If balancer is an ill-fit for cluster size, it gives little 
> indication
> -
>
> Key: HBASE-21338
> URL: https://issues.apache.org/jira/browse/HBASE-21338
> Project: HBase
>  Issue Type: Sub-task
>  Components: Balancer
>Reporter: stack
>Priority: Major
>
> See parent issue. Running balancer on a cluster where the max steps was way 
> inadequate, the balancer gave little to no indication that it was 
> ill-configured. In fact, it only logged its starting and then that there was 
> nothing to do though the cluster was obviously out-of-whack.
> Ideally the balancer would complain when say the maxSteps limit is a small 
> fraction of what the cluster's calculated max steps are, or it would notice 
> that the balancer is making little progress on an imbalanced cluster and 
> shout. Can we set balancer configs w/o having to restart Master?





[jira] [Comment Edited] (HBASE-21338) [balancer] If balancer is an ill-fit for cluster size, it gives little indication

2018-10-22 Thread Xu Cang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16659815#comment-16659815
 ] 

Xu Cang edited comment on HBASE-21338 at 10/22/18 11:26 PM:


Config for this class is auto-reloadable. 
[https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/master/balancer/StochasticLoadBalancer.java#L162]
 

 

Also, to overcome above mentioned issue, just need to set  
"hbase.master.balancer.stochastic.runMaxSteps" to true.

I added some logging including a WARN message to code when computedMaxSteps  is 
greater than maxSteps.

 


was (Author: xucang):
Config for this class is auto-reloadable. 
[https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/master/balancer/StochasticLoadBalancer.java#L162]
 

 

Also, to overcome mentioned issue, just need to set  
"hbase.master.balancer.stochastic.runMaxSteps" to true.

I added some logging including a WARN message to code when computedMaxSteps  is 
greater than maxSteps.

 

> [balancer] If balancer is an ill-fit for cluster size, it gives little 
> indication
> -
>
> Key: HBASE-21338
> URL: https://issues.apache.org/jira/browse/HBASE-21338
> Project: HBase
>  Issue Type: Sub-task
>  Components: Balancer
>Reporter: stack
>Priority: Major
>
> See parent issue. Running balancer on a cluster where the max steps was way 
> inadequate, the balancer gave little to no indication that it was 
> ill-configured. In fact, it only logged its starting and then that there was 
> nothing to do though the cluster was obviously out-of-whack.
> Ideally the balancer would complain when say the maxSteps limit is a small 
> fraction of what the cluster's calculated max steps are, or it would notice 
> that the balancer is making little progress on an imbalanced cluster and 
> shout. Can we set balancer configs w/o having to restart Master?





[jira] [Commented] (HBASE-21336) Simplify the implementation of WALProcedureMap

2018-10-22 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16659773#comment-16659773
 ] 

Hudson commented on HBASE-21336:


Results for branch branch-2.0
[build #997 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/997/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/997//General_Nightly_Build_Report/]




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/997//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/997//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


> Simplify the implementation of WALProcedureMap
> --
>
> Key: HBASE-21336
> URL: https://issues.apache.org/jira/browse/HBASE-21336
> Project: HBase
>  Issue Type: Sub-task
>  Components: proc-v2
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 3.0.0, 2.2.0, 2.1.1, 2.0.3
>
> Attachments: HBASE-21336-v1.patch, HBASE-21336-v2.patch, 
> HBASE-21336-v3.patch, HBASE-21336-v4.patch, HBASE-21336.patch
>
>
> I do not think we need to implement the logic from such a low level, i.e, 
> building complicated linked list by hand, which makes it really hard to 
> understand.
> Let me try to implement it with existing data structures...





[jira] [Commented] (HBASE-21293) [2.0] Update bouncycastle dependency.

2018-10-22 Thread Josh Elser (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16659736#comment-16659736
 ] 

Josh Elser commented on HBASE-21293:


Threw the addendums from the original issue onto branch-2.0.

> [2.0] Update bouncycastle dependency.
> -
>
> Key: HBASE-21293
> URL: https://issues.apache.org/jira/browse/HBASE-21293
> Project: HBase
>  Issue Type: Task
>  Components: dependencies, test
>Reporter: Josh Elser
>Assignee: Josh Elser
>Priority: Major
> Fix For: 2.0.3
>
> Attachments: HBASE-21281.001.branch-2.0.patch
>
>
> Looks like we still depend on bcprov-jdk16 for some x509 certificate 
> generation in our tests. Bouncycastle has moved beyond this in 1.47, changing 
> the artifact names.
> [http://www.bouncycastle.org/wiki/display/JA1/Porting+from+earlier+BC+releases+to+1.47+and+later]
> There are some API changes too, but it looks like we don't use any of these.
> It seems like we also have vestiges in the POMs from when we were depending 
> on a specific BC version that came in from Hadoop. We now have a 
> KeyStoreTestUtil class in HBase, which makes me think we can also clean up 
> some dependencies.





[jira] [Commented] (HBASE-21073) "Maintenance mode" master

2018-10-22 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16659738#comment-16659738
 ] 

Hadoop QA commented on HBASE-21073:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
14s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 1s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} branch-2.1 Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m  
5s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 
33s{color} | {color:green} branch-2.1 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 22m 
41s{color} | {color:green} branch-2.1 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  4m 
 9s{color} | {color:green} branch-2.1 passed {color} |
| {color:blue}0{color} | {color:blue} refguide {color} | {color:blue}  6m 
36s{color} | {color:blue} branch has no errors when building the reference 
guide. See footer for rendered docs, which you should manually inspect. {color} 
|
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
11s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: . {color} 
|
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
12s{color} | {color:green} branch-2.1 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  3m 
52s{color} | {color:green} branch-2.1 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
13s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 23m  
5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 23m  
5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  3m 
57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:blue}0{color} | {color:blue} refguide {color} | {color:blue}  5m 
14s{color} | {color:blue} patch has no errors when building the reference 
guide. See footer for rendered docs, which you should manually inspect. {color} 
|
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  3m 
42s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green}  
8m 53s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 
or 3.0.0. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: . {color} 
|
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  3m 
14s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}198m 26s{color} 
| {color:red} root in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  1m 
 7s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}308m 20s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.rsgroup.TestRSGroups |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce 

[jira] [Resolved] (HBASE-21294) [2.1] Update bouncycastle dependency.

2018-10-22 Thread Josh Elser (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh Elser resolved HBASE-21294.

  Resolution: Fixed
Hadoop Flags: Reviewed

Just put the original change and the addendums from branch-2/master onto 
branch-2.1.

> [2.1] Update bouncycastle dependency.
> -
>
> Key: HBASE-21294
> URL: https://issues.apache.org/jira/browse/HBASE-21294
> Project: HBase
>  Issue Type: Task
>  Components: dependencies, test
>Reporter: Josh Elser
>Assignee: Josh Elser
>Priority: Major
> Fix For: 2.1.1
>
> Attachments: HBASE-21281.001.branch-2.0.patch
>
>
> Looks like we still depend on bcprov-jdk16 for some x509 certificate 
> generation in our tests. Bouncycastle has moved beyond this in 1.47, changing 
> the artifact names.
> [http://www.bouncycastle.org/wiki/display/JA1/Porting+from+earlier+BC+releases+to+1.47+and+later]
> There are some API changes too, but it looks like we don't use any of these.
> It seems like we also have vestiges in the POMs from when we were depending 
> on a specific BC version that came in from Hadoop. We now have a 
> KeyStoreTestUtil class in HBase, which makes me think we can also clean up 
> some dependencies.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21246) Introduce WALIdentity interface

2018-10-22 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16659691#comment-16659691
 ] 

Ted Yu commented on HBASE-21246:


I have updated the link to point to the actual thread on the design doc.

bq. Is there a 'design' for WI?
Quoting my response from the discussion (for people who didn't have time to 
read the full thread):
{code}
From my experiments over HBASE-21246, I think it is preferable to have 
WALIdentity (please see the patch attached on HBASE-21246).
In hbase, we have the following related pairs:
HTable <-> TableName
HRegion <-> HRegionInfo

The pair of WAL instance and WALIdentity is similar.

* WALIdentity is lightweight - the classes referencing a WAL by WALIdentity 
wouldn't hold on to many resources
* WALProvider handles deserialization from the String form of a WAL name to 
WALIdentity
{code}
bq. Where is the 'outline'?

There is a code snippet for the paragraph that the discussion thread 
centered on.
Please also see the paragraph that follows the code snippet.
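
For readers following along without the patch, the analogy above (a heavy WAL instance paired with a lightweight identity, and a provider that turns a String back into that identity) can be sketched roughly as follows. Only the names WALIdentity and getName come from this issue; SimpleWALIdentity and WALIdentityFactory are hypothetical names made up for illustration and are not taken from the attached patch.

```java
import java.util.Objects;

// Shape assumed from the issue description: an identity that exposes a name.
interface WALIdentity {
  String getName();
}

// Hypothetical lightweight implementation: it holds only the name, no stream
// or file handle, so callers that keep a reference to it retain little memory.
final class SimpleWALIdentity implements WALIdentity {
  private final String name;

  SimpleWALIdentity(String name) {
    this.name = Objects.requireNonNull(name, "name");
  }

  @Override
  public String getName() {
    return name;
  }

  @Override
  public boolean equals(Object other) {
    return other instanceof SimpleWALIdentity
        && ((SimpleWALIdentity) other).name.equals(name);
  }

  @Override
  public int hashCode() {
    return name.hashCode();
  }
}

// Hypothetical stand-in for the part of a WAL provider that deserializes
// the String form of a WAL name into an identity.
interface WALIdentityFactory {
  WALIdentity fromString(String walName);
}
```

A provider-specific factory could then be as small as `WALIdentityFactory f = SimpleWALIdentity::new;`, which keeps the String-to-identity step behind the provider rather than in every caller.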

> Introduce WALIdentity interface
> ---
>
> Key: HBASE-21246
> URL: https://issues.apache.org/jira/browse/HBASE-21246
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Major
> Fix For: HBASE-20952
>
> Attachments: 21246.003.patch, 21246.HBASE-20952.001.patch, 
> 21246.HBASE-20952.002.patch, 21246.HBASE-20952.004.patch, 
> 21246.HBASE-20952.005.patch, 21246.HBASE-20952.007.patch, 
> 21246.HBASE-20952.008.patch
>
>
> We are introducing WALIdentity interface so that the WAL representation can 
> be decoupled from distributed filesystem.
> The interface provides getName method whose return value can represent 
> filename in distributed filesystem environment or, the name of the stream 
> when the WAL is backed by log stream.





[jira] [Commented] (HBASE-21247) Custom WAL Provider cannot be specified by configuration whose value is outside the enums in Providers

2018-10-22 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16659674#comment-16659674
 ] 

Hadoop QA commented on HBASE-21247:
---

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 14s{color} | {color:blue} Docker mode activated. {color} |
| {color:blue}0{color} | {color:blue} patch {color} | {color:blue}  0m  2s{color} | {color:blue} The patch file was not named according to hbase's naming conventions. Please see https://yetus.apache.org/documentation/0.8.0/precommit-patchnames for instructions. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 2 new or modified test files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m  3s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m  2s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 20s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 42s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 10s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 37s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  1s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 43s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 11m 29s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  3s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 31s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}143m 10s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 25s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}189m  3s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-21247 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12945071/21247.v11.txt |
| Optional Tests |  dupname  asflicense  javac  javadoc  unit  findbugs  shadedjars  hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux d937d9a33736 3.13.0-143-generic #192-Ubuntu SMP Tue Feb 27 10:45:36 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
| git revision | master / 931156f66b |
| maven | version: Apache Maven 3.5.4 (1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC3 |
| Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/14808/testReport/ |
| Max. process+thread count | 4701 (vs. ulimit of 1) |
| 

[jira] [Commented] (HBASE-21246) Introduce WALIdentity interface

2018-10-22 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16659672#comment-16659672
 ] 

stack commented on HBASE-21246:
---

bq. See this thread on the doc:
https://docs.google.com/document/d/1o552MkKq9wF3BXY2nVcsCXBAImUH6r132Cxv9WHL3D8/edit


This link just takes me to the head of the doc (Usually docs have numbered 
sections or titles and you just cite either as pointers).

When I asked 'Where is WALIdentity in the design doc', I was imprecise. I did 
not mean 'where in the doc' but where it is with regard to stages of 
development. Is there a 'design' for WI? Last time I read the doc, it still 
seemed up in the air. I'm wondering if this state has changed.

bq. What is presented in the patch on this JIRA follows what is outlined in the 
design doc.

Where is the 'outline'?



> Introduce WALIdentity interface
> ---
>
> Key: HBASE-21246
> URL: https://issues.apache.org/jira/browse/HBASE-21246
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Major
> Fix For: HBASE-20952
>
> Attachments: 21246.003.patch, 21246.HBASE-20952.001.patch, 
> 21246.HBASE-20952.002.patch, 21246.HBASE-20952.004.patch, 
> 21246.HBASE-20952.005.patch, 21246.HBASE-20952.007.patch, 
> 21246.HBASE-20952.008.patch
>
>
> We are introducing WALIdentity interface so that the WAL representation can 
> be decoupled from distributed filesystem.
> The interface provides getName method whose return value can represent 
> filename in distributed filesystem environment or, the name of the stream 
> when the WAL is backed by log stream.





[jira] [Comment Edited] (HBASE-21246) Introduce WALIdentity interface

2018-10-22 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16659666#comment-16659666
 ] 

Ted Yu edited comment on HBASE-21246 at 10/22/18 9:03 PM:
--

Stack's questions seem to be at a higher level. So allow me to respond to them 
first.
bq. Where is WALIdentity in the design doc?

See this thread on the doc:
https://docs.google.com/document/d/1o552MkKq9wF3BXY2nVcsCXBAImUH6r132Cxv9WHL3D8/edit?disco=COODUIA

bq. Designed or are we trying variations? Or a particular variation?

What is presented in the patch on this JIRA follows what is outlined in the 
design doc.
w.r.t. the concrete fields / methods of WALIdentity (and implementing classes), 
we're open to discussion and will respect the community's consensus.


was (Author: yuzhih...@gmail.com):
Stack's questions seem to be at a higher level. So allow me to respond to them 
first.
bq. Where is WALIdentity in the design doc?

See this thread on the doc:
https://docs.google.com/document/d/1o552MkKq9wF3BXY2nVcsCXBAImUH6r132Cxv9WHL3D8/edit

bq. Designed or are we trying variations? Or a particular variation?

What is presented in the patch on this JIRA follows what is outlined in the 
design doc.
w.r.t. the concrete fields / methods of WALIdentity (and implementing classes), 
we're open to discussion and will respect the community's consensus.

> Introduce WALIdentity interface
> ---
>
> Key: HBASE-21246
> URL: https://issues.apache.org/jira/browse/HBASE-21246
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Major
> Fix For: HBASE-20952
>
> Attachments: 21246.003.patch, 21246.HBASE-20952.001.patch, 
> 21246.HBASE-20952.002.patch, 21246.HBASE-20952.004.patch, 
> 21246.HBASE-20952.005.patch, 21246.HBASE-20952.007.patch, 
> 21246.HBASE-20952.008.patch
>
>
> We are introducing WALIdentity interface so that the WAL representation can 
> be decoupled from distributed filesystem.
> The interface provides getName method whose return value can represent 
> filename in distributed filesystem environment or, the name of the stream 
> when the WAL is backed by log stream.





[jira] [Updated] (HBASE-21321) Backport HBASE-21278 to branch-2.1 and branch-2.0

2018-10-22 Thread stack (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-21321:
--
Attachment: HBASE-21321.branch-2.1.001.patch

> Backport HBASE-21278 to branch-2.1 and branch-2.0
> -
>
> Key: HBASE-21321
> URL: https://issues.apache.org/jira/browse/HBASE-21321
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Duo Zhang
>Assignee: stack
>Priority: Critical
> Fix For: 2.1.1, 2.0.3
>
> Attachments: HBASE-21321.branch-2.1.001.patch
>
>
> As the assign\unassign part is a bit different.





[jira] [Updated] (HBASE-21321) Backport HBASE-21278 to branch-2.1 and branch-2.0

2018-10-22 Thread stack (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-21321:
--
Status: Patch Available  (was: Open)

Let me see what tests are broken by this backport.

> Backport HBASE-21278 to branch-2.1 and branch-2.0
> -
>
> Key: HBASE-21321
> URL: https://issues.apache.org/jira/browse/HBASE-21321
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Duo Zhang
>Assignee: stack
>Priority: Critical
> Fix For: 2.1.1, 2.0.3
>
> Attachments: HBASE-21321.branch-2.1.001.patch
>
>
> As the assign\unassign part is a bit different.





[jira] [Commented] (HBASE-21246) Introduce WALIdentity interface

2018-10-22 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16659666#comment-16659666
 ] 

Ted Yu commented on HBASE-21246:


Stack's questions seem to be at a higher level. So allow me to respond to them 
first.
bq. Where is WALIdentity in the design doc?

See this thread on the doc:
https://docs.google.com/document/d/1o552MkKq9wF3BXY2nVcsCXBAImUH6r132Cxv9WHL3D8/edit

bq. Designed or are we trying variations? Or a particular variation?

What is presented in the patch on this JIRA follows what is outlined in the 
design doc.
w.r.t. the concrete fields / methods of WALIdentity (and implementing classes), 
we're open to discussion and will respect the community's consensus.

> Introduce WALIdentity interface
> ---
>
> Key: HBASE-21246
> URL: https://issues.apache.org/jira/browse/HBASE-21246
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Major
> Fix For: HBASE-20952
>
> Attachments: 21246.003.patch, 21246.HBASE-20952.001.patch, 
> 21246.HBASE-20952.002.patch, 21246.HBASE-20952.004.patch, 
> 21246.HBASE-20952.005.patch, 21246.HBASE-20952.007.patch, 
> 21246.HBASE-20952.008.patch
>
>
> We are introducing WALIdentity interface so that the WAL representation can 
> be decoupled from distributed filesystem.
> The interface provides getName method whose return value can represent 
> filename in distributed filesystem environment or, the name of the stream 
> when the WAL is backed by log stream.





[jira] [Commented] (HBASE-21355) HStore's storeSize is calculated repeatedly which causing the confusing region split

2018-10-22 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16659657#comment-16659657
 ] 

Hudson commented on HBASE-21355:


Results for branch branch-1.4
[build #519 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/519/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(x) {color:red}-1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/519//General_Nightly_Build_Report/]


(x) {color:red}-1 jdk7 checks{color}
-- For more information [see jdk7 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/519//JDK7_Nightly_Build_Report/]


(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/519//JDK8_Nightly_Build_Report_(Hadoop2)/]




(/) {color:green}+1 source release artifact{color}
-- See build output for details.


> HStore's storeSize is calculated repeatedly which causing the confusing 
> region split 
> -
>
> Key: HBASE-21355
> URL: https://issues.apache.org/jira/browse/HBASE-21355
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Reporter: Zheng Hu
>Assignee: Zheng Hu
>Priority: Blocker
> Fix For: 3.0.0, 1.5.0, 1.3.3, 2.2.0, 2.1.1, 2.0.3, 1.4.9
>
> Attachments: HBASE-21355.addendum.patch, HBASE-21355.addendum.patch, 
> HBASE-21355.branch-1.patch, HBASE-21355.v1.patch
>
>
> When testing branch-2's write performance in our internal cluster, we 
> found that regions were being split inexplicably. 
> We use the default ConstantSizeRegionSplitPolicy with 
> hbase.hregion.max.filesize=40G, but a region would be split even when its 
> size in bytes was far less than 40G (only ~6G). 
> After checking the code, I found that the following path accumulates the 
> store's storeSize to a very large value, because the path never resets it:
> {code}
> RsRpcServices#getRegionInfo
>   -> HRegion#isMergeable
>     -> HRegion#hasReferences
>       -> HStore#hasReferences
>         -> HStore#openStoreFiles
> {code}
> BTW, we seem to have forgotten to maintain the read replica's storeSize when 
> refreshing the store files.
> A note on the patch: I moved the storeSize calculation out of the 
> loadStoreFiles() method, because the secondary read replica's 
> refreshStoreFiles() also uses loadStoreFiles() to refresh its store files 
> and updates the storeSize in completeCompaction(..) at the end (just like 
> compaction), so there is no need to calculate the storeSize twice.
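
The accumulation problem described above can be illustrated with a small stand-alone sketch. This is not HStore code: StoreSizeSketch and both method names are hypothetical, and the list of longs merely stands in for the sizes of a store's files. It only shows why re-running a load path that adds to a running total without a reset inflates the size seen by the split policy.

```java
import java.util.List;

// Hypothetical illustration of the bug pattern, not the actual HStore logic.
final class StoreSizeSketch {
  // Running total, playing the role of the store's storeSize.
  private long storeSize;

  // Buggy pattern: every call adds all file sizes again, with no reset,
  // so calling it repeatedly (e.g. from a frequently invoked RPC path)
  // makes the total grow without any new data being written.
  void loadAccumulating(List<Long> fileSizes) {
    for (long size : fileSizes) {
      storeSize += size;
    }
  }

  // Safer pattern: recompute the total from scratch on each load.
  void loadRecomputing(List<Long> fileSizes) {
    long total = 0;
    for (long size : fileSizes) {
      total += size;
    }
    storeSize = total;
  }

  long getStoreSize() {
    return storeSize;
  }
}
```

With file sizes 1, 2 and 3, two calls to loadAccumulating leave the total at 12, while two calls to loadRecomputing leave it at 6. The patch itself takes a different route (moving the calculation out of loadStoreFiles()), so treat this only as a picture of the symptom.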





[jira] [Commented] (HBASE-21356) bulkLoadHFile API should ensure that rs has the source hfile's write permission

2018-10-22 Thread Umesh Agashe (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16659643#comment-16659643
 ] 

Umesh Agashe commented on HBASE-21356:
--

+1, lgtm. A few comments left on RB.

> bulkLoadHFile API should ensure that rs has the source hfile's write 
> permission
> ---
>
> Key: HBASE-21356
> URL: https://issues.apache.org/jira/browse/HBASE-21356
> Project: HBase
>  Issue Type: Bug
>Reporter: Zheng Hu
>Assignee: Zheng Hu
>Priority: Major
> Fix For: 3.0.0, 2.2.0, 2.0.3, 2.1.2
>
> Attachments: HBASE-21356.v1.patch
>
>
> If the RS bulk loads an HFile but has no write permission on it, we can read 
> and compact the HFile, but after the compaction finishes the HFile will be 
> moved to the archive directory, where the HFileCleaner won't have permission 
> to delete it, so the HFile will be kept in HDFS forever. 
> We need to check the file's write permission when running bulkLoadHFile on 
> the server side, and reject the request if there is no write permission.
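
The proposed server-side check can be sketched as a simple gate in front of the load. This is an assumption-laden illustration: it uses java.nio.file on a local filesystem as a stand-in for HDFS, and BulkLoadPermissionSketch and shouldAccept are hypothetical names, not the API touched by the attached patch.

```java
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical gate: accept a bulk load only if the server process can write
// to the source file, so the file can later be archived and cleaned up.
final class BulkLoadPermissionSketch {
  private BulkLoadPermissionSketch() {
  }

  // Returns true when the file exists and is writable by this process.
  // A missing file is reported as not writable and is therefore rejected.
  static boolean shouldAccept(Path sourceHFile) {
    return Files.isWritable(sourceHFile);
  }
}
```

In a real patch the check would go through the Hadoop FileSystem API and would most likely surface as an exception back to the client rather than a boolean; the boolean form is used here only to keep the sketch short.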





[jira] [Updated] (HBASE-21278) Do not rollback successful sub procedures when rolling back a procedure

2018-10-22 Thread stack (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-21278:
--
Release Note: 
For the sub procedures which are successfully finished, do not do rollback. 
This is a change in rollback behavior.

State changes which are done by sub procedures should be handled by parent 
procedures when rolling back. For example, when rolling back a 
MergeTableProcedure, we will schedule new procedures to bring the offline 
regions online instead of rolling back the original procedures which off-lined 
the regions (in fact these procedures can not be rolled back...).
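
As a rough picture of the behavior described in this release note, the sketch below walks a parent's sub procedures during rollback, skips the ones that already finished successfully, and records a compensating action for them instead. All names here (RollbackSketch, SubProcedure, rollbackParent, the action strings) are hypothetical; this is not the ProcedureExecutor implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of "do not roll back successful sub procedures".
final class RollbackSketch {
  enum State { RUNNABLE, SUCCESS, ROLLEDBACK }

  static final class SubProcedure {
    final String name;
    State state;

    SubProcedure(String name, State state) {
      this.name = name;
      this.state = state;
    }
  }

  private RollbackSketch() {
  }

  // Returns the actions the parent takes while rolling back. Finished sub
  // procedures keep their SUCCESS state; the parent compensates for their
  // effects (e.g. by scheduling a new procedure) instead of undoing them.
  static List<String> rollbackParent(List<SubProcedure> children) {
    List<String> actions = new ArrayList<>();
    for (SubProcedure child : children) {
      if (child.state == State.SUCCESS) {
        actions.add("compensate:" + child.name);
      } else {
        child.state = State.ROLLEDBACK;
        actions.add("rollback:" + child.name);
      }
    }
    return actions;
  }
}
```

In the merge example from the note, the procedures that took the regions offline would fall into the "compensate" branch: the parent schedules new procedures to bring the regions back online.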

> Do not rollback successful sub procedures when rolling back a procedure
> ---
>
> Key: HBASE-21278
> URL: https://issues.apache.org/jira/browse/HBASE-21278
> Project: HBase
>  Issue Type: Sub-task
>  Components: proc-v2
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Critical
> Fix For: 3.0.0, 2.2.0
>
> Attachments: HBASE-21278-v1.patch, HBASE-21278.patch, 
> org.apache.hadoop.hbase.master.assignment.TestMergeTableRegionsProcedure-output.txt
>
>
> https://builds.apache.org/job/HBase-Flaky-Tests/job/master/1235/artifact/hbase-server/target/surefire-reports/org.apache.hadoop.hbase.master.assignment.TestMergeTableRegionsProcedure-output.txt/*view*/
> I think the problem is
> {noformat}
> 2018-10-08 03:44:30,315 INFO  [PEWorker-1] procedure.MasterProcedureScheduler(689): pid=43, ppid=42, state=SUCCESS, hasLock=false; TransitRegionStateProcedure table=testRollbackAndDoubleExecution, region=9bac7c539ac0cff6dc5706ed375a3bfb, UNASSIGN checking lock on 9bac7c539ac0cff6dc5706ed375a3bfb
> 2018-10-08 03:44:30,320 ERROR [PEWorker-1] helpers.MarkerIgnoringBase(159): CODE-BUG: Uncaught runtime exception for pid=43, ppid=42, state=SUCCESS, hasLock=true; TransitRegionStateProcedure table=testRollbackAndDoubleExecution, region=9bac7c539ac0cff6dc5706ed375a3bfb, UNASSIGN
> java.lang.UnsupportedOperationException
>   at org.apache.hadoop.hbase.master.assignment.TransitRegionStateProcedure.rollbackState(TransitRegionStateProcedure.java:458)
>   at org.apache.hadoop.hbase.master.assignment.TransitRegionStateProcedure.rollbackState(TransitRegionStateProcedure.java:97)
>   at org.apache.hadoop.hbase.procedure2.StateMachineProcedure.rollback(StateMachineProcedure.java:208)
>   at org.apache.hadoop.hbase.procedure2.Procedure.doRollback(Procedure.java:957)
>   at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1605)
>   at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1567)
>   at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1446)
>   at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$900(ProcedureExecutor.java:76)
> {noformat}
> Typically there is no rollback for TRSP. Need to dig more.





[jira] [Commented] (HBASE-21333) [amv2] large cluster startup is slow

2018-10-22 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16659630#comment-16659630
 ] 

stack commented on HBASE-21333:
---

HBASE-21237 removed the batching of open/close RPCs in branch-2.0 and 
branch-2.1 because batching doesn't handle errors properly. Testing with 
HBASE-21237 temporarily backed out, we do about 90 assigns a second.

> [amv2] large cluster startup is slow
> 
>
> Key: HBASE-21333
> URL: https://issues.apache.org/jira/browse/HBASE-21333
> Project: HBase
>  Issue Type: Sub-task
>  Components: amv2
>Affects Versions: 2.1.0
>Reporter: stack
>Priority: Major
> Attachments: 2.1.1.129578.alloc.svg, 2.1.1.129578.cpu.svg, 
> 2.1.1.129578.lock.svg
>
>
> Testing startup of cluster with 500+ nodes and .5M regions takes a few hours.
> This is a 2.1.x cluster with batching disabled.
> Looking at what the Master is doing, it's mostly just parsing regionserver 
> reports.
> Stats to follow.





[jira] [Commented] (HBASE-21246) Introduce WALIdentity interface

2018-10-22 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16659628#comment-16659628
 ] 

stack commented on HBASE-21246:
---

Where is WALIdentity in the design doc? Designed or are we trying variations? 
Or a particular variation?

> Introduce WALIdentity interface
> ---
>
> Key: HBASE-21246
> URL: https://issues.apache.org/jira/browse/HBASE-21246
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Major
> Fix For: HBASE-20952
>
> Attachments: 21246.003.patch, 21246.HBASE-20952.001.patch, 
> 21246.HBASE-20952.002.patch, 21246.HBASE-20952.004.patch, 
> 21246.HBASE-20952.005.patch, 21246.HBASE-20952.007.patch, 
> 21246.HBASE-20952.008.patch
>
>
> We are introducing WALIdentity interface so that the WAL representation can 
> be decoupled from distributed filesystem.
> The interface provides getName method whose return value can represent 
> filename in distributed filesystem environment or, the name of the stream 
> when the WAL is backed by log stream.





[jira] [Commented] (HBASE-21246) Introduce WALIdentity interface

2018-10-22 Thread Josh Elser (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16659626#comment-16659626
 ] 

Josh Elser commented on HBASE-21246:


{quote}Waiting for more review.
{quote}
I don't think there's enough here to give a meaningful review to.

I can't find it now (only looked quickly), but you may recall that Stack's 
stated concern about using a WALIdentity was having {{NoSuchLogException}} (or 
similar) littered all over the place. How do we create/use readers and writers? 
Where are we passing around {{WALIdentity}} instances, and is this a major 
concern? I'd expect the major codepaths that use WALs to be stubbed out to 
show how they'd work with the changes you're proposing (even if they are 
not functional at runtime).

One more concrete thing:
{code:java}
+  public FSWALIdentity(String name) {
+    this.path = new Path(name);
+    this.name = path.getName();
+  }{code}
Why do we have a {{String}} constructor anyways? We would always have a Path 
when working with HDFS, no? Do the other providers need to give their own 
implementation of WALIdentity?
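
To make the question concrete, here is a minimal sketch of what a Path-only constructor could look like. It is an illustration under stated assumptions: java.nio.file.Path stands in for org.apache.hadoop.fs.Path so the snippet compiles without Hadoop on the classpath, and FsWalIdentitySketch is a hypothetical name, not the FSWALIdentity class from the patch.

```java
import java.nio.file.Path;

// Hypothetical filesystem-backed identity that is only ever built from a path.
// java.nio.file.Path is a stand-in for org.apache.hadoop.fs.Path here.
final class FsWalIdentitySketch {
  private final Path path;
  private final String name;

  FsWalIdentitySketch(Path path) {
    // getFileName() returns null for a root path, which has no last element.
    Path fileName = path.getFileName();
    if (fileName == null) {
      throw new IllegalArgumentException("Path has no file name: " + path);
    }
    this.path = path;
    this.name = fileName.toString();
  }

  Path getPath() {
    return path;
  }

  // The last path element, which is what a filesystem-backed WAL would report.
  String getName() {
    return name;
  }
}
```

Callers that already hold a path pass it straight through, and any String parsing stays with whichever provider owns the String form.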

> Introduce WALIdentity interface
> ---
>
> Key: HBASE-21246
> URL: https://issues.apache.org/jira/browse/HBASE-21246
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Major
> Fix For: HBASE-20952
>
> Attachments: 21246.003.patch, 21246.HBASE-20952.001.patch, 
> 21246.HBASE-20952.002.patch, 21246.HBASE-20952.004.patch, 
> 21246.HBASE-20952.005.patch, 21246.HBASE-20952.007.patch, 
> 21246.HBASE-20952.008.patch
>
>
> We are introducing WALIdentity interface so that the WAL representation can 
> be decoupled from distributed filesystem.
> The interface provides getName method whose return value can represent 
> filename in distributed filesystem environment or, the name of the stream 
> when the WAL is backed by log stream.





[jira] [Commented] (HBASE-21237) Use CompatRemoteProcedureResolver to dispatch open/close region requests to RS

2018-10-22 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16659623#comment-16659623
 ] 

stack commented on HBASE-21237:
---

Of note, I see us doing about 90 assigns a second with batching enabled whereas 
w/o, with this patch applied, we do 40. FYI.

> Use CompatRemoteProcedureResolver to dispatch open/close region requests to RS
> --
>
> Key: HBASE-21237
> URL: https://issues.apache.org/jira/browse/HBASE-21237
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.1.0, 2.0.2
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Major
> Fix For: 2.0.3
>
> Attachments: HBASE-21237.branch-2.0.001.patch
>
>
> As discussed in HBASE-21217, in branch-2.0 and branch-2.1 we should use 
> CompatRemoteProcedureResolver instead of ExecuteProceduresRemoteCall to 
> dispatch region open/close requests to the RS, since 
> ExecuteProceduresRemoteCall groups all the open/close operations into one 
> call and executes them sequentially on the target RS. If one operation fails, 
> all the operations are marked as failed. In fact, some of the operations 
> (like open region) are already executing in the open region handler thread, 
> but the master thinks these operations failed and reassigns the regions to 
> another RS. So when the previous RS reports to the master that the region is 
> online, the master kills that RS since it has already assigned the region to 
> another RS.
> For branch-2.2+, HBASE-21217 will fix this issue.
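
The difference between the two dispatch styles described above can be pictured with a toy model. Nothing below is HBase code: DispatchSketch and its methods are hypothetical, region names are plain strings, and the predicate simply says whether an individual open would succeed.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical contrast between grouped and per-operation result reporting.
final class DispatchSketch {
  private DispatchSketch() {
  }

  // Grouped call: every operation is attempted, but a single failure causes
  // the whole batch to be reported as failed, including the ones that worked.
  static Map<String, Boolean> dispatchGrouped(List<String> regions, Predicate<String> opens) {
    boolean allSucceeded = true;
    for (String region : regions) {
      if (!opens.test(region)) {
        allSucceeded = false;
      }
    }
    Map<String, Boolean> reported = new LinkedHashMap<>();
    for (String region : regions) {
      reported.put(region, allSucceeded);
    }
    return reported;
  }

  // Per-operation dispatch: each region's result is reported on its own.
  static Map<String, Boolean> dispatchIndividually(List<String> regions, Predicate<String> opens) {
    Map<String, Boolean> reported = new LinkedHashMap<>();
    for (String region : regions) {
      reported.put(region, opens.test(region));
    }
    return reported;
  }
}
```

In the grouped variant a region that actually opened is still reported as failed, which is the situation that leads the master to reassign it elsewhere and later kill the original RS.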





[jira] [Commented] (HBASE-21246) Introduce WALIdentity interface

2018-10-22 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16659601#comment-16659601
 ] 

Ted Yu commented on HBASE-21246:


Addressed above comment in patch v8.

Waiting for more review.

> Introduce WALIdentity interface
> ---
>
> Key: HBASE-21246
> URL: https://issues.apache.org/jira/browse/HBASE-21246
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Major
> Fix For: HBASE-20952
>
> Attachments: 21246.003.patch, 21246.HBASE-20952.001.patch, 
> 21246.HBASE-20952.002.patch, 21246.HBASE-20952.004.patch, 
> 21246.HBASE-20952.005.patch, 21246.HBASE-20952.007.patch, 
> 21246.HBASE-20952.008.patch
>
>
> We are introducing WALIdentity interface so that the WAL representation can 
> be decoupled from distributed filesystem.
> The interface provides getName method whose return value can represent 
> filename in distributed filesystem environment or, the name of the stream 
> when the WAL is backed by log stream.





[jira] [Updated] (HBASE-21246) Introduce WALIdentity interface

2018-10-22 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-21246:
---
Attachment: 21246.HBASE-20952.008.patch

> Introduce WALIdentity interface
> ---
>
> Key: HBASE-21246
> URL: https://issues.apache.org/jira/browse/HBASE-21246
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Major
> Fix For: HBASE-20952
>
> Attachments: 21246.003.patch, 21246.HBASE-20952.001.patch, 
> 21246.HBASE-20952.002.patch, 21246.HBASE-20952.004.patch, 
> 21246.HBASE-20952.005.patch, 21246.HBASE-20952.007.patch, 
> 21246.HBASE-20952.008.patch
>
>
> We are introducing WALIdentity interface so that the WAL representation can 
> be decoupled from distributed filesystem.
> The interface provides getName method whose return value can represent 
> filename in distributed filesystem environment or, the name of the stream 
> when the WAL is backed by log stream.





[jira] [Commented] (HBASE-21246) Introduce WALIdentity interface

2018-10-22 Thread Josh Elser (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16659589#comment-16659589
 ] 

Josh Elser commented on HBASE-21246:


You looking for a review, [~yuzhih...@gmail.com], or still working?
{code:java}
+  public FSWALIdentity(String name) {
+    this.path = new Path(name);
+    if (path != null) {
+      this.name = path.getName();
+    }
+  }{code}
Why would we accept a {{null}} name or path?
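
One way to read the concern: in Java a `new` expression never evaluates to null, so the `path != null` branch cannot be false and the only place a null can sneak in is the argument. A fail-fast alternative might look like the sketch below, where StringWalIdentitySketch is a hypothetical class used purely for illustration.

```java
import java.util.Objects;

// Hypothetical identity that rejects a null name at construction time
// instead of silently leaving its name field unset.
final class StringWalIdentitySketch {
  private final String name;

  StringWalIdentitySketch(String name) {
    // Throws NullPointerException with this message when name is null.
    this.name = Objects.requireNonNull(name, "name must not be null");
  }

  String getName() {
    return name;
  }
}
```

Failing in the constructor means a bad caller is found at the point of the mistake rather than later, when getName() unexpectedly returns null.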

> Introduce WALIdentity interface
> ---
>
> Key: HBASE-21246
> URL: https://issues.apache.org/jira/browse/HBASE-21246
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Major
> Fix For: HBASE-20952
>
> Attachments: 21246.003.patch, 21246.HBASE-20952.001.patch, 
> 21246.HBASE-20952.002.patch, 21246.HBASE-20952.004.patch, 
> 21246.HBASE-20952.005.patch, 21246.HBASE-20952.007.patch
>
>
> We are introducing WALIdentity interface so that the WAL representation can 
> be decoupled from distributed filesystem.
> The interface provides getName method whose return value can represent 
> filename in distributed filesystem environment or, the name of the stream 
> when the WAL is backed by log stream.





[jira] [Commented] (HBASE-21352) Polish the rollback implementation in ProcedureExecutor

2018-10-22 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16659519#comment-16659519
 ] 

stack commented on HBASE-21352:
---

Do you think this is a blocker on branch-2.1, [~Apache9]? Could it wait till a 2.1.2?

Rollback needs some study and we need a branch-2.1 version of this patch. 
Thanks.

> Polish the rollback implementation in ProcedureExecutor
> ---
>
> Key: HBASE-21352
> URL: https://issues.apache.org/jira/browse/HBASE-21352
> Project: HBase
>  Issue Type: Sub-task
>  Components: amv2, proc-v2
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 3.0.0, 2.2.0, 2.1.1, 2.0.3
>
> Attachments: HBASE-21352.patch
>
>
> Need to confirm that the way we make sure that only one PE worker rolls 
> back the whole procedure stack is correct. The current comment is not enough.





[jira] [Commented] (HBASE-21294) [2.1] Update bouncycastle dependency.

2018-10-22 Thread Josh Elser (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16659473#comment-16659473
 ] 

Josh Elser commented on HBASE-21294:


{quote}Let me know if you want me to pull it in (Does it need update since the 
original patch seems to have evolved since?)
{quote}
I'll make sure it gets done. Ted caught some uncaught-by-precommit breakages in 
the nightly when using H3. Hopefully good now. Thanks.

> [2.1] Update bouncycastle dependency.
> -
>
> Key: HBASE-21294
> URL: https://issues.apache.org/jira/browse/HBASE-21294
> Project: HBase
>  Issue Type: Task
>  Components: dependencies, test
>Reporter: Josh Elser
>Assignee: Josh Elser
>Priority: Major
> Fix For: 2.1.1
>
> Attachments: HBASE-21281.001.branch-2.0.patch
>
>
> Looks like we still depend on bcprov-jdk16 for some x509 certificate 
> generation in our tests. Bouncycastle has moved beyond this in 1.47, changing 
> the artifact names.
> [http://www.bouncycastle.org/wiki/display/JA1/Porting+from+earlier+BC+releases+to+1.47+and+later]
> There are some API changes too, but it looks like we don't use any of these.
> It seems like we also have vestiges in the POMs from when we were depending 
> on a specific BC version that came in from Hadoop. We now have a 
> KeyStoreTestUtil class in HBase, which makes me think we can also clean up 
> some dependencies.





[jira] [Commented] (HBASE-21281) Update bouncycastle dependency.

2018-10-22 Thread Josh Elser (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16659443#comment-16659443
 ] 

Josh Elser commented on HBASE-21281:


{quote}Turns out that one more module needs the addition of the test dependency:
{quote}
Nice catch, Ted!

> Update bouncycastle dependency.
> ---
>
> Key: HBASE-21281
> URL: https://issues.apache.org/jira/browse/HBASE-21281
> Project: HBase
>  Issue Type: Task
>  Components: dependencies, test
>Reporter: Josh Elser
>Assignee: Josh Elser
>Priority: Major
> Fix For: 3.0.0, 2.2.0
>
> Attachments: 21281.addendum.patch, 21281.addendum2.patch, 
> HBASE-21281.001.branch-2.0.patch
>
>
> Looks like we still depend on bcprov-jdk16 for some x509 certificate 
> generation in our tests. Bouncycastle has moved beyond this in 1.47, changing 
> the artifact names.
> [http://www.bouncycastle.org/wiki/display/JA1/Porting+from+earlier+BC+releases+to+1.47+and+later]
> There are some API changes too, but it looks like we don't use any of these.
> It seems like we also have vestiges in the POMs from when we were depending 
> on a specific BC version that came in from Hadoop. We now have a 
> KeyStoreTestUtil class in HBase, which makes me think we can also clean up 
> some dependencies.





[jira] [Commented] (HBASE-20970) Update hadoop check versions in hbase-personality

2018-10-22 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16659410#comment-16659410
 ] 

stack commented on HBASE-20970:
---

Asked on the dev list whether there are any objections. Will push tomorrow if there are none.

> Update hadoop check versions in hbase-personality
> -
>
> Key: HBASE-20970
> URL: https://issues.apache.org/jira/browse/HBASE-20970
> Project: HBase
>  Issue Type: Bug
>  Components: build
>Reporter: Duo Zhang
>Priority: Major
> Fix For: 3.0.0, 2.2.0, 2.1.1
>
> Attachments: HBASE-20970.branch-2.0.001.patch, 
> HBASE-20970.master.001.patch
>
>






[jira] [Created] (HBASE-21358) Snapshot procedure fails but SnapshotManager thinks it is still running

2018-10-22 Thread Andrew Purtell (JIRA)
Andrew Purtell created HBASE-21358:
--

 Summary: Snapshot procedure fails but SnapshotManager thinks it is 
still running
 Key: HBASE-21358
 URL: https://issues.apache.org/jira/browse/HBASE-21358
 Project: HBase
  Issue Type: Bug
  Components: snapshots
Affects Versions: 1.3.2
Reporter: Andrew Purtell


A snapshot procedure fails due to chaotic test action but the snapshot manager 
still thinks it is running. The test client spins needlessly checking for 
something that will never actually complete. We give up eventually but we could 
be failing this a lot faster. 

On the integration client we are checking and re-checking: 

2018-10-20 01:06:11,718 DEBUG [ChaosMonkeyThread] client.HBaseAdmin: Getting current status of snapshot from master... 
2018-10-20 01:06:11,719 DEBUG [ChaosMonkeyThread] client.HBaseAdmin: (#40) Sleeping: 8571ms while waiting for snapshot completion. 

This is what it looks like on the master side each time the client checks in: 

2018-10-20 01:04:54,565 DEBUG [RpcServer.FifoWFPBQ.default.handler=29,queue=2,port=8100] master.MasterRpcServices: Checking to see if snapshot from request:{ ss=IntegrationTestBigLinkedList-it-1539997289258 table=IntegrationTestBigLinkedList type=FLUSH } is done 
2018-10-20 01:04:54,565 DEBUG [RpcServer.FifoWFPBQ.default.handler=29,queue=2,port=8100] snapshot.SnapshotManager: Snapshoting '{ ss=IntegrationTestBigLinkedList-it-1539997289258 table=IntegrationTestBigLinkedList type=FLUSH }' is still in progress! 

There is no running procedure for the snapshot. The procedure has failed. The 
snapshot manager does not take any useful action afterward but believes the 
snapshot to still be in progress.
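
One possible direction, sketched under stated assumptions: the status check could consult the outcome of the backing procedure before answering "in progress", so that a failed procedure is reported as a failure on the next poll. {{SnapshotStatusSketch}}, {{ProcState}}, {{Status}}, and {{checkStatus}} are hypothetical names for illustration and are not the SnapshotManager API:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical model of a snapshot status lookup; not the HBase SnapshotManager. */
final class SnapshotStatusSketch {
  enum ProcState { RUNNING, SUCCEEDED, FAILED }

  enum Status { IN_PROGRESS, DONE, FAILED, UNKNOWN }

  // Snapshot name -> last known state of the procedure behind it.
  private final Map<String, ProcState> procedures = new ConcurrentHashMap<>();

  void record(String snapshot, ProcState state) {
    procedures.put(snapshot, state);
  }

  /** Reports failure as soon as the procedure has failed, so clients can stop polling. */
  Status checkStatus(String snapshot) {
    ProcState state = procedures.get(snapshot);
    if (state == null) {
      return Status.UNKNOWN; // no procedure is tracked for this snapshot
    }
    switch (state) {
      case SUCCEEDED:
        return Status.DONE;
      case FAILED:
        return Status.FAILED;
      default:
        return Status.IN_PROGRESS;
    }
  }
}
```

A polling client that receives {{FAILED}} or {{UNKNOWN}} could give up immediately instead of sleeping until its own timeout expires.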

I see related complaints from the hfile archiver task afterward: empty 
directories, failure to parse protobuf in descriptor files... Seems like there 
was junk in the filesystem left over from the failed snapshot. The master was 
soon restarted by chaos action, and now I don't see these complaints, so that 
partially complete snapshot may have been cleaned up.

This is with 1.3.2, but patched to include the multithreaded hfile archiving 
improvements from later versions.





[jira] [Commented] (HBASE-21357) RS should abort if OOM in Reader thread

2018-10-22 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16659359#comment-16659359
 ] 

Hadoop QA commented on HBASE-21357:
---

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 21s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  1s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:orange}-0{color} | {color:orange} test4tests {color} | {color:orange}  0m  1s{color} | {color:orange} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} branch-1 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m  6s{color} | {color:green} branch-1 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 46s{color} | {color:green} branch-1 passed with JDK v1.8.0_181 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 46s{color} | {color:green} branch-1 passed with JDK v1.7.0_191 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 32s{color} | {color:green} branch-1 passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  3m  1s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 40s{color} | {color:green} branch-1 passed with JDK v1.8.0_181 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 41s{color} | {color:green} branch-1 passed with JDK v1.7.0_191 {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 44s{color} | {color:green} the patch passed with JDK v1.8.0_181 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 44s{color} | {color:green} the patch passed with JDK v1.7.0_191 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  3m 17s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green}  1m 53s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 38s{color} | {color:green} the patch passed with JDK v1.8.0_181 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 42s{color} | {color:green} the patch passed with JDK v1.7.0_191 {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}113m 59s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 24s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}142m  6s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:61288f8 |
| JIRA Issue | HBASE-21357 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12945045/HBASE-21357.branch-1.001.patch |
| Optional Tests |  dupname  asflicense  javac  javadoc  unit  findbugs  shadedjars  hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux 

[jira] [Commented] (HBASE-21247) Custom WAL Provider cannot be specified by configuration whose value is outside the enums in Providers

2018-10-22 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16659260#comment-16659260
 ] 

Hadoop QA commented on HBASE-21247:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 13s{color} | {color:blue} Docker mode activated. {color} |
| {color:blue}0{color} | {color:blue} patch {color} | {color:blue}  0m  2s{color} | {color:blue} The patch file was not named according to hbase's naming conventions. Please see https://yetus.apache.org/documentation/0.8.0/precommit-patchnames for instructions. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 2 new or modified test files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 21s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 57s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 14s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 24s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  1s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 31s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 22s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  0s{color} | {color:red} The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 43s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 11m 15s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  8s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 30s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}129m 27s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 21s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}174m 10s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-21247 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12945037/21247.v9.txt |
| Optional Tests |  dupname  asflicense  javac  javadoc  unit  findbugs  shadedjars  hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux 2bc6229d3c83 4.4.0-134-generic #160~14.04.1-Ubuntu SMP Fri Aug 17 11:07:07 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
| git revision | master / 931156f66b |
| maven | version: Apache Maven 3.5.4 (1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC3 |
| whitespace | 

[jira] [Commented] (HBASE-21336) Simplify the implementation of WALProcedureMap

2018-10-22 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16659249#comment-16659249
 ] 

Hudson commented on HBASE-21336:


Results for branch branch-2.1
[build #513 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/513/]: 
(/) *{color:green}+1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/513//General_Nightly_Build_Report/]




(/) {color:green}+1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/513//JDK8_Nightly_Build_Report_(Hadoop2)/]


(/) {color:green}+1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/513//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> Simplify the implementation of WALProcedureMap
> --
>
> Key: HBASE-21336
> URL: https://issues.apache.org/jira/browse/HBASE-21336
> Project: HBase
>  Issue Type: Sub-task
>  Components: proc-v2
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 3.0.0, 2.2.0, 2.1.1, 2.0.3
>
> Attachments: HBASE-21336-v1.patch, HBASE-21336-v2.patch, 
> HBASE-21336-v3.patch, HBASE-21336-v4.patch, HBASE-21336.patch
>
>
> I do not think we need to implement the logic from such a low level, i.e, 
> building complicated linked list by hand, which makes it really hard to 
> understand.
> Let me try to implement it with existing data structures...





[jira] [Commented] (HBASE-21355) HStore's storeSize is calculated repeatedly which causing the confusing region split

2018-10-22 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16659250#comment-16659250
 ] 

Hudson commented on HBASE-21355:


Results for branch branch-2.1
[build #513 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/513/]: 
(/) *{color:green}+1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/513//General_Nightly_Build_Report/]




(/) {color:green}+1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/513//JDK8_Nightly_Build_Report_(Hadoop2)/]


(/) {color:green}+1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/513//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> HStore's storeSize is calculated repeatedly which causing the confusing 
> region split 
> -
>
> Key: HBASE-21355
> URL: https://issues.apache.org/jira/browse/HBASE-21355
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Reporter: Zheng Hu
>Assignee: Zheng Hu
>Priority: Blocker
> Fix For: 3.0.0, 1.5.0, 1.3.3, 2.2.0, 2.1.1, 2.0.3, 1.4.9
>
> Attachments: HBASE-21355.addendum.patch, HBASE-21355.addendum.patch, 
> HBASE-21355.branch-1.patch, HBASE-21355.v1.patch
>
>
> When testing branch-2's write performance in our internal cluster, we 
> found that regions were being split inexplicably.
> We use the default ConstantSizeRegionSplitPolicy and 
> hbase.hregion.max.filesize=40G, but a region is split even when its 
> size is well below 40G (only ~6G). 
> After checking the code, I found that the following path accumulates the 
> store's storeSize to a very large value, because the path never resets it.
> {code}
> RsRpcServices#getRegionInfo
>   -> HRegion#isMergeable
>     -> HRegion#hasReferences
>       -> HStore#hasReferences
>         -> HStore#openStoreFiles
> {code}
> BTW, we seem to have forgotten to maintain the read replica's storeSize when 
> refreshing the store files.
> A note on the patch: I moved the storeSize calculation out of the 
> loadStoreFiles() method, because the secondary read replica's 
> refreshStoreFiles() also uses loadStoreFiles() to refresh its store files 
> and then updates the storeSize in completeCompaction(..) at the end (just 
> like compaction), so there is no need to calculate the storeSize twice.
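
The accumulation described above can be illustrated with a small model. This is a sketch only: {{StoreSizeSketch}} and its two methods are hypothetical and do not mirror the HStore code. They just contrast adding file sizes on every call with recomputing the total from scratch:

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical model of the accounting problem; not the HStore implementation. */
final class StoreSizeSketch {
  private final List<Long> fileSizes;
  private final AtomicLong storeSize = new AtomicLong();

  StoreSizeSketch(List<Long> fileSizes) {
    this.fileSizes = fileSizes;
  }

  /** Problematic variant: every call adds the file sizes again without a reset. */
  long openAccumulating() {
    for (long size : fileSizes) {
      storeSize.addAndGet(size);
    }
    return storeSize.get();
  }

  /** Idempotent variant: the total is recomputed, so repeated calls agree. */
  long openRecomputing() {
    long total = 0;
    for (long size : fileSizes) {
      total += size;
    }
    storeSize.set(total);
    return total;
  }
}
```

With the accumulating variant, a store whose files total about 6G would report roughly 42G after seven calls, which is enough to cross a 40G split threshold even though nothing was written.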





[jira] [Commented] (HBASE-21344) hbase:meta location in ZooKeeper set to OPENING by the procedure which eventually failed but precludes Master from assigning it forever

2018-10-22 Thread Josh Elser (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16659239#comment-16659239
 ] 

Josh Elser commented on HBASE-21344:


{quote}This is not right, the meta table may on a crashed server, and in RIT 
state, if we assign it directly, it may loss some data, since the WAL may not 
replayed.
{quote}
Ah, I think I see what you're saying, Allan; I didn't consider that. I was just 
thinking about it from the context of the failed AP at the end of an 
(otherwise) successful SCP. We don't necessarily know that there isn't an SCP 
that needs to happen (and that the SCP wouldn't be re-running when this new 
master comes up). I was just thinking about this problem knowing that there was 
no SCP to run and assuming IMP was the one who should be assigning meta.
{quote}I suggest we can set hbase.assignment.maximum.attempts to Long.MAX. So 
AP will try forever, until we resolve the problem which cause the region can't 
assign.
{quote}
 
{quote}We can't afford that the AP fails(thus SCP roll back) leaving some 
region unassigned and we don't know about it
{quote}
I get what you're saying, but I'm not excited about that being our answer, 
especially for the seriousness of meta being offline and that we know/expect 
such problems to happen.

Thinking out loud... the goal is that IMP is just doing a "normal" assignment 
of meta _or_ creating it for the first time. SCP is the one responsible if meta 
was on a failed server and getting it back onto the cluster. We can't re-run an 
SCP, so we're left with these hosed regions. I want to suggest something that 
can make sure we get Regions re-assigned (otherwise, huge user pain-point), but 
I need to read some more code to think about that some more :)

Thanks for your help (and Stack's too). Great to come back to!
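
For readers following the trade-off discussed above (a bounded number of assignment attempts followed by rollback, versus retrying until the underlying problem is fixed), here is a minimal sketch. {{AssignRetrySketch}} and {{assignWithRetries}} are hypothetical names; this is not how AssignProcedure is implemented, and the limit is passed in directly rather than read from {{hbase.assignment.maximum.attempts}}:

```java
import java.util.function.BooleanSupplier;

/** Hypothetical retry helper; it only contrasts bounded and very large attempt limits. */
final class AssignRetrySketch {
  /**
   * Runs the attempt until it succeeds or maxAttempts is reached.
   *
   * @return the number of attempts that were used
   * @throws IllegalStateException when every attempt failed
   */
  static int assignWithRetries(BooleanSupplier attempt, int maxAttempts) {
    for (int i = 1; i <= maxAttempts; i++) {
      if (attempt.getAsBoolean()) {
        return i;
      }
    }
    // With a small limit the caller sees a failure and has to roll back or escalate.
    throw new IllegalStateException("Max attempts exceeded: " + maxAttempts);
  }
}
```

Passing a very large limit makes the loop behave like "try forever", which avoids the failed rollback but also means the region stays unassigned for as long as the root cause persists.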

> hbase:meta location in ZooKeeper set to OPENING by the procedure which 
> eventually failed but precludes Master from assigning it forever
> ---
>
> Key: HBASE-21344
> URL: https://issues.apache.org/jira/browse/HBASE-21344
> Project: HBase
>  Issue Type: Bug
>  Components: proc-v2
>Reporter: Ankit Singhal
>Assignee: Ankit Singhal
>Priority: Major
> Attachments: HBASE-21344-branch-2.0.patch
>
>
> [~elserj] has already summarized it well.
> 1. hbase:meta was on RS8
> 2. RS8 crashed, SCP was queued for it, meta first
> 3. meta was marked OFFLINE
> 4. meta marked as OPENING on RS3
> 5. Can't actually send the openRegion RPC to RS3 due to the krb ticket issue
> 6. We attempt the openRegion/assignment 10 times, failing each time
> 7. We start rolling back the procedure:
> {code:java}
> 2018-10-08 06:51:24,440 WARN  [PEWorker-9] procedure2.ProcedureExecutor: 
> Usually this should not happen, we will release the lock before if the 
> procedure is finished, even if the holdLock is true, arrive here means we 
> have some holes where we do not release the lock. And the releaseLock below 
> may fail since the procedure may have already been deleted from the procedure 
> store.
> 2018-10-08 06:51:24,543 INFO  [PEWorker-9] 
> procedure.MasterProcedureScheduler: pid=48, ppid=47, 
> state=FAILED:REGION_TRANSITION_QUEUE, 
> exception=org.apache.hadoop.hbase.client.RetriesExhaustedException via 
> AssignProcedure:org.apache.hadoop.hbase.client.RetriesExhaustedException: Max 
> attempts exceeded; AssignProcedure table=hbase:meta, region=1588230740 
> checking lock on 1588230740
> {code}
> {code:java}
> 2018-10-08 06:51:30,957 ERROR [PEWorker-9] procedure2.ProcedureExecutor: 
> CODE-BUG: Uncaught runtime exception for pid=47, 
> state=FAILED:SERVER_CRASH_ASSIGN_META, locked=true, 
> exception=org.apache.hadoop.hbase.client.RetriesExhaustedException via 
> AssignProcedure:org.apache.hadoop.hbase.client.RetriesExhaustedException: Max 
> attempts exceeded; ServerCrashProcedure 
> server=,16020,1538974612843, splitWal=true, meta=true
> java.lang.UnsupportedOperationException: unhandled 
> state=SERVER_CRASH_GET_REGIONS
>   at 
> org.apache.hadoop.hbase.master.procedure.ServerCrashProcedure.rollbackState(ServerCrashProcedure.java:254)
>   at 
> org.apache.hadoop.hbase.master.procedure.ServerCrashProcedure.rollbackState(ServerCrashProcedure.java:58)
>   at 
> org.apache.hadoop.hbase.procedure2.StateMachineProcedure.rollback(StateMachineProcedure.java:203)
>   at 
> org.apache.hadoop.hbase.procedure2.Procedure.doRollback(Procedure.java:960)
>   at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1577)
>   at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1539)
>   at 
> 

[jira] [Updated] (HBASE-21247) Custom WAL Provider cannot be specified by configuration whose value is outside the enums in Providers

2018-10-22 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-21247:
---
Attachment: 21247.v11.txt

> Custom WAL Provider cannot be specified by configuration whose value is 
> outside the enums in Providers
> --
>
> Key: HBASE-21247
> URL: https://issues.apache.org/jira/browse/HBASE-21247
> Project: HBase
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: 21247.v1.txt, 21247.v10.txt, 21247.v11.txt, 
> 21247.v2.txt, 21247.v3.txt, 21247.v4.tst, 21247.v4.txt, 21247.v5.txt, 
> 21247.v6.txt, 21247.v7.txt, 21247.v8.txt, 21247.v9.txt
>
>
> Currently all the WAL Providers acceptable to hbase are specified in 
> Providers enum of WALFactory.
> This restricts the ability for additional WAL Providers to be supplied - by 
> class name.
> This issue fixes the bug by allowing the specification of new WAL Provider 
> class name using the config "hbase.wal.provider".
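
The idea in the description can be sketched as a two-step lookup: try the built-in short names first, then fall back to treating the configured value as a class name. Everything below is an assumption for illustration ({{ProviderSketch}}, {{BuiltIn}}, {{resolve}}, and the stand-in provider classes are hypothetical); it is not the WALFactory code from the patch:

```java
import java.util.Locale;

/** Marker for anything usable as a WAL provider in this sketch. */
interface ProviderSketch {}

final class DefaultProviderSketch implements ProviderSketch {}

final class CustomProviderSketch implements ProviderSketch {}

final class ProviderResolverSketch {
  /** Short names standing in for the Providers enum. */
  enum BuiltIn {
    FILESYSTEM(DefaultProviderSketch.class);

    final Class<? extends ProviderSketch> clazz;

    BuiltIn(Class<? extends ProviderSketch> clazz) {
      this.clazz = clazz;
    }
  }

  /** Tries the enum first; otherwise treats the configured value as a class name. */
  static Class<? extends ProviderSketch> resolve(String configured) {
    try {
      return BuiltIn.valueOf(configured.toUpperCase(Locale.ROOT)).clazz;
    } catch (IllegalArgumentException notBuiltIn) {
      try {
        return Class.forName(configured).asSubclass(ProviderSketch.class);
      } catch (ClassNotFoundException e) {
        throw new IllegalArgumentException("Unknown WAL provider: " + configured, e);
      }
    }
  }
}
```

Existing short names keep working under this scheme, while a fully qualified class name becomes an additional accepted value for the same configuration key.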





[jira] [Updated] (HBASE-21247) Custom WAL Provider cannot be specified by configuration whose value is outside the enums in Providers

2018-10-22 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-21247:
---
Status: Patch Available  (was: Open)

> Custom WAL Provider cannot be specified by configuration whose value is 
> outside the enums in Providers
> --
>
> Key: HBASE-21247
> URL: https://issues.apache.org/jira/browse/HBASE-21247
> Project: HBase
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: 21247.v1.txt, 21247.v10.txt, 21247.v11.txt, 
> 21247.v2.txt, 21247.v3.txt, 21247.v4.tst, 21247.v4.txt, 21247.v5.txt, 
> 21247.v6.txt, 21247.v7.txt, 21247.v8.txt, 21247.v9.txt
>
>
> Currently all the WAL Providers acceptable to hbase are specified in 
> Providers enum of WALFactory.
> This restricts the ability for additional WAL Providers to be supplied - by 
> class name.
> This issue fixes the bug by allowing the specification of new WAL Provider 
> class name using the config "hbase.wal.provider".





[jira] [Commented] (HBASE-21355) HStore's storeSize is calculated repeatedly which causing the confusing region split

2018-10-22 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16659235#comment-16659235
 ] 

Hudson commented on HBASE-21355:


Results for branch branch-2.0
[build #996 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/996/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/996//General_Nightly_Build_Report/]




(/) {color:green}+1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/996//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/996//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


> HStore's storeSize is calculated repeatedly which causing the confusing 
> region split 
> -
>
> Key: HBASE-21355
> URL: https://issues.apache.org/jira/browse/HBASE-21355
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Reporter: Zheng Hu
>Assignee: Zheng Hu
>Priority: Blocker
> Fix For: 3.0.0, 1.5.0, 1.3.3, 2.2.0, 2.1.1, 2.0.3, 1.4.9
>
> Attachments: HBASE-21355.addendum.patch, HBASE-21355.addendum.patch, 
> HBASE-21355.branch-1.patch, HBASE-21355.v1.patch
>
>
> When testing branch-2's write performance in our internal cluster, we 
> found that regions were being split inexplicably.
> We use the default ConstantSizeRegionSplitPolicy and 
> hbase.hregion.max.filesize=40G, but a region is split even when its 
> size is well below 40G (only ~6G). 
> After checking the code, I found that the following path accumulates the 
> store's storeSize to a very large value, because the path never resets it.
> {code}
> RsRpcServices#getRegionInfo
>   -> HRegion#isMergeable
>     -> HRegion#hasReferences
>       -> HStore#hasReferences
>         -> HStore#openStoreFiles
> {code}
> BTW, we seem to have forgotten to maintain the read replica's storeSize when 
> refreshing the store files.
> A note on the patch: I moved the storeSize calculation out of the 
> loadStoreFiles() method, because the secondary read replica's 
> refreshStoreFiles() also uses loadStoreFiles() to refresh its store files 
> and then updates the storeSize in completeCompaction(..) at the end (just 
> like compaction), so there is no need to calculate the storeSize twice.





[jira] [Commented] (HBASE-21336) Simplify the implementation of WALProcedureMap

2018-10-22 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16659234#comment-16659234
 ] 

Hudson commented on HBASE-21336:


Results for branch branch-2.0
[build #996 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/996/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/996//General_Nightly_Build_Report/]




(/) {color:green}+1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/996//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/996//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


> Simplify the implementation of WALProcedureMap
> --
>
> Key: HBASE-21336
> URL: https://issues.apache.org/jira/browse/HBASE-21336
> Project: HBase
> Issue Type: Sub-task
> Components: proc-v2
> Reporter: Duo Zhang
> Assignee: Duo Zhang
> Priority: Major
> Fix For: 3.0.0, 2.2.0, 2.1.1, 2.0.3
>
> Attachments: HBASE-21336-v1.patch, HBASE-21336-v2.patch, 
> HBASE-21336-v3.patch, HBASE-21336-v4.patch, HBASE-21336.patch
>
>
> I do not think we need to implement the logic at such a low level, i.e., 
> building a complicated linked list by hand, which makes it really hard to 
> understand.
> Let me try to implement it with existing data structures...





[jira] [Commented] (HBASE-21247) Custom WAL Provider cannot be specified by configuration whose value is outside the enums in Providers

2018-10-22 Thread Sean Busbey (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16659222#comment-16659222
 ] 

Sean Busbey commented on HBASE-21247:
-

probably best to just save a copy of the class name for this.provider 
pre-wrapping, since if we need it for the meta wal we'll instantiate our own.

> Custom WAL Provider cannot be specified by configuration whose value is 
> outside the enums in Providers
> --
>
> Key: HBASE-21247
> URL: https://issues.apache.org/jira/browse/HBASE-21247
> Project: HBase
> Issue Type: Bug
> Reporter: Ted Yu
> Assignee: Ted Yu
> Priority: Major
> Fix For: 3.0.0
>
> Attachments: 21247.v1.txt, 21247.v10.txt, 21247.v2.txt, 21247.v3.txt, 
> 21247.v4.tst, 21247.v4.txt, 21247.v5.txt, 21247.v6.txt, 21247.v7.txt, 
> 21247.v8.txt, 21247.v9.txt
>
>
> Currently all the WAL Providers acceptable to hbase are specified in 
> Providers enum of WALFactory.
> This restricts the ability for additional WAL Providers to be supplied - by 
> class name.
> This issue fixes the bug by allowing the specification of new WAL Provider 
> class name using the config "hbase.wal.provider".
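
As a rough illustration of the idea in the description, the sketch below resolves a configured value first against a set of built-in short names and, failing that, treats it as a fully qualified class name. Every type here (SketchWalProvider, SketchProviders, SketchProviderResolver and the two implementations) is hypothetical; this is not WALFactory's actual code, only the general lookup pattern.

```java
import java.util.Locale;

// Hypothetical provider interface and implementations, for illustration only.
interface SketchWalProvider {
    String describe();
}

final class DefaultSketchProvider implements SketchWalProvider {
    public String describe() { return "default"; }
}

final class CustomSketchProvider implements SketchWalProvider {
    public String describe() { return "custom"; }
}

// Built-in short names, loosely analogous to an enum of known providers.
enum SketchProviders {
    DEFAULTPROVIDER(DefaultSketchProvider.class);

    final Class<? extends SketchWalProvider> clazz;

    SketchProviders(Class<? extends SketchWalProvider> clazz) {
        this.clazz = clazz;
    }
}

final class SketchProviderResolver {
    // Try the enum first; if the value is not an enum name, load it as a class name.
    static Class<? extends SketchWalProvider> resolve(String configured)
            throws ClassNotFoundException {
        try {
            return SketchProviders.valueOf(configured.toUpperCase(Locale.ROOT)).clazz;
        } catch (IllegalArgumentException notAnEnumName) {
            return Class.forName(configured).asSubclass(SketchWalProvider.class);
        }
    }

    static SketchWalProvider create(String configured) throws ReflectiveOperationException {
        return resolve(configured).getDeclaredConstructor().newInstance();
    }
}
```

Under these assumptions, SketchProviderResolver.create("defaultProvider") yields the built-in implementation, while passing CustomSketchProvider.class.getName() yields the custom one without the enum having to know about it.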





[jira] [Commented] (HBASE-21247) Custom WAL Provider cannot be specified by configuration whose value is outside the enums in Providers

2018-10-22 Thread Sean Busbey (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16659221#comment-16659221
 ] 

Sean Busbey commented on HBASE-21247:
-

oh even better, WALFactory is the one that does the wrapping in SRWP so we can 
just save a reference pre-wrapping

{code}
  if (enableSyncReplicationWALProvider) {
    provider = new SyncReplicationWALProvider(provider);
  }
{code}
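
The "save a reference pre-wrapping" suggestion could look roughly like the sketch below. All names are hypothetical stand-ins rather than HBase classes, and it assumes the unwrapped provider has a no-argument constructor. The factory records the unwrapped provider's class before applying the decorator, so it can later instantiate a second provider of the same underlying type.

```java
// All names here are hypothetical stand-ins, not HBase classes.
interface SketchWal {
    String name();
}

final class PlainSketchWal implements SketchWal {
    public String name() { return "plain"; }
}

// Decorator playing the role of the wrapping provider.
final class WrappingSketchWal implements SketchWal {
    private final SketchWal delegate;

    WrappingSketchWal(SketchWal delegate) {
        this.delegate = delegate;
    }

    public String name() { return "wrapped(" + delegate.name() + ")"; }
}

final class SketchWalFactory {
    private final SketchWal provider;
    // Captured before wrapping, so the underlying type is still known afterwards.
    private final Class<? extends SketchWal> unwrappedClass;

    SketchWalFactory(SketchWal base, boolean enableWrapping) {
        this.unwrappedClass = base.getClass();
        this.provider = enableWrapping ? new WrappingSketchWal(base) : base;
    }

    SketchWal getProvider() {
        return provider;
    }

    // Builds a fresh, unwrapped instance from the saved class.
    SketchWal newUnwrappedInstance() throws ReflectiveOperationException {
        return unwrappedClass.getDeclaredConstructor().newInstance();
    }
}
```

Keeping the class rather than asking the wrapper for its delegate means the wrapper does not need to expose anything extra.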

> Custom WAL Provider cannot be specified by configuration whose value is 
> outside the enums in Providers
> --
>
> Key: HBASE-21247
> URL: https://issues.apache.org/jira/browse/HBASE-21247
> Project: HBase
> Issue Type: Bug
> Reporter: Ted Yu
> Assignee: Ted Yu
> Priority: Major
> Fix For: 3.0.0
>
> Attachments: 21247.v1.txt, 21247.v10.txt, 21247.v2.txt, 21247.v3.txt, 
> 21247.v4.tst, 21247.v4.txt, 21247.v5.txt, 21247.v6.txt, 21247.v7.txt, 
> 21247.v8.txt, 21247.v9.txt
>
>
> Currently all the WAL Providers acceptable to hbase are specified in 
> Providers enum of WALFactory.
> This restricts the ability for additional WAL Providers to be supplied - by 
> class name.
> This issue fixes the bug by allowing the specification of new WAL Provider 
> class name using the config "hbase.wal.provider".




