[jira] [Commented] (HBASE-24632) Enable procedure-based log splitting as default in hbase3

2020-11-12 Thread Guanghao Zhang (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17230493#comment-17230493
 ] 

Guanghao Zhang commented on HBASE-24632:


OK. Let's reconsider this after 2.5.x is released. :)

> Enable procedure-based log splitting as default in hbase3
> -
>
> Key: HBASE-24632
> URL: https://issues.apache.org/jira/browse/HBASE-24632
> Project: HBase
>  Issue Type: Sub-task
>  Components: wal
>Reporter: Michael Stack
>Assignee: Michael Stack
>Priority: Major
> Fix For: 3.0.0-alpha-1, 2.4.0
>
>
> Means changing this value in HConstants to false:
>public static final boolean DEFAULT_HBASE_SPLIT_COORDINATED_BY_ZK = true;
> Should probably also deprecate the current zk distributed split so we can 
> clear out those classes too.
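
A minimal sketch of what the flipped default means for code that reads the flag, assuming the setting is looked up with a fall-back to the constant. The class name `SplitCoordinationDefault`, the plain map standing in for the HBase configuration, and the property name are assumptions made for illustration, not HBase code; check HConstants in your version for the real key.

```java
import java.util.HashMap;
import java.util.Map;

final class SplitCoordinationDefault {
  // Assumed property name; verify against HConstants in your HBase version.
  static final String SPLIT_COORDINATED_BY_ZK_KEY = "hbase.split.wal.zk.coordinated";
  // The default proposed in this issue: procedure-based splitting unless zk is requested.
  static final boolean DEFAULT_SPLIT_COORDINATED_BY_ZK = false;

  /** Returns true when the zk-coordinated split should be used. */
  static boolean isCoordinatedByZk(Map<String, String> conf) {
    String value = conf.get(SPLIT_COORDINATED_BY_ZK_KEY);
    return value == null ? DEFAULT_SPLIT_COORDINATED_BY_ZK : Boolean.parseBoolean(value);
  }

  public static void main(String[] args) {
    Map<String, String> conf = new HashMap<>();
    System.out.println("default uses zk: " + isCoordinatedByZk(conf));
    conf.put(SPLIT_COORDINATED_BY_ZK_KEY, "true"); // opt back in to the old path
    System.out.println("override uses zk: " + isCoordinatedByZk(conf));
  }
}
```

With the default set to false, only a site that sets the property explicitly stays on the zk-coordinated path.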



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-24632) Enable procedure-based log splitting as default in hbase3

2020-11-10 Thread Michael Stack (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17229682#comment-17229682
 ] 

Michael Stack commented on HBASE-24632:
---

{quote}[~stack] [~anoop.hbase] The zk based log splitting is only an internal 
implementation. Can we purge it from the master branch rather than wait for 4.0.0?
{quote}
It is on by default in 2.4.0. I was thinking it should stay in place for a 
while in case we find a problem in procedure-based log splitting. Perhaps we 
figure out whether procedure-based log splitting is stable in 2.4 + 2.5. So, 
purging from trunk/branch-3 would be OK?



[jira] [Commented] (HBASE-24632) Enable procedure-based log splitting as default in hbase3

2020-11-10 Thread Guanghao Zhang (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17229625#comment-17229625
 ] 

Guanghao Zhang commented on HBASE-24632:


[~stack] [~anoop.hbase] The zk based log splitting is only an internal 
implementation. Can we purge it from the master branch rather than wait for 4.0.0?



[jira] [Commented] (HBASE-24632) Enable procedure-based log splitting as default in hbase3

2020-07-30 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17167987#comment-17167987
 ] 

Hudson commented on HBASE-24632:


Results for branch master
[build #1798 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/master/1798/]: (x) 
*{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/1798/General_20Nightly_20Build_20Report/]






(/) {color:green}+1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/1798/JDK8_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(/) {color:green}+1 jdk11 hadoop3 checks{color}
-- For more information [see jdk11 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/1798/JDK11_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(x) {color:red}-1 client integration test{color}
--Failed when running client tests on top of Hadoop 2. [see log for 
details|https://builds.apache.org/job/HBase%20Nightly/job/master/1798//artifact/output-integration/hadoop-2.log].
 (note that this means we didn't run on Hadoop 3)




[jira] [Commented] (HBASE-24632) Enable procedure-based log splitting as default in hbase3

2020-07-28 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17166283#comment-17166283
 ] 

Hudson commented on HBASE-24632:


Results for branch branch-2
[build #2760 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/2760/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/2760/General_20Nightly_20Build_20Report/]




(/) {color:green}+1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/2756/JDK8_20Nightly_20Build_20Report_20_28Hadoop2_29/]


(/) {color:green}+1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/2756/JDK8_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(/) {color:green}+1 jdk11 hadoop3 checks{color}
-- For more information [see jdk11 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/2756/JDK11_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}




[jira] [Commented] (HBASE-24632) Enable procedure-based log splitting as default in hbase3

2020-07-27 Thread Michael Stack (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17165818#comment-17165818
 ] 

Michael Stack commented on HBASE-24632:
---

Merged to branch-2. Added new PR for master (there is something up w/ the 
hadoop3 profile)



[jira] [Commented] (HBASE-24632) Enable procedure-based log splitting as default in hbase3

2020-07-24 Thread Anoop Sam John (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17164725#comment-17164725
 ] 

Anoop Sam John commented on HBASE-24632:


+1



[jira] [Commented] (HBASE-24632) Enable procedure-based log splitting as default in hbase3

2020-07-24 Thread Michael Stack (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17164620#comment-17164620
 ] 

Michael Stack commented on HBASE-24632:
---

Ok to commit then [~anoop.hbase] ?



[jira] [Commented] (HBASE-24632) Enable procedure-based log splitting as default in hbase3

2020-07-23 Thread Michael Stack (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17163998#comment-17163998
 ] 

Michael Stack commented on HBASE-24632:
---

{quote} So here also once the work is given, the thread won't get blocked but 
will yield?
{quote}
That's right. The RPD will dispatch the split WAL job to the remote RS wrapped 
in an ExecuteProceduresRemoteCall. Once the RPC delivers the job to the remote 
RS, the RPD dispatcher is done (and its thread from the RPD executor of 128 
threads is now idle again). Meanwhile the split WAL starts to run on the RS 
side. When it is done, it will make a similar call back to the Master to report 
success or failure, but this is a different RPC, one that originates at the RS.
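
The hand-off described above can be sketched roughly as follows. This is an illustration only, not the RemoteProcedureDispatcher code: `DispatchSketch`, `RemoteServer`, and `submitSplit` are hypothetical names, and a `CompletableFuture` stands in for the separate report RPC that the RS sends back.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

final class DispatchSketch {
  /** Hypothetical stand-in for the remote RS; it completes the future when the split ends. */
  interface RemoteServer {
    void submitSplit(String walPath, CompletableFuture<Boolean> report);
  }

  /**
   * Hands the job to the remote side on a dispatcher thread and returns at once.
   * The returned future is completed later by the remote side's own callback.
   */
  static CompletableFuture<Boolean> dispatch(ExecutorService dispatcherPool,
      RemoteServer server, String walPath) {
    CompletableFuture<Boolean> report = new CompletableFuture<>();
    // The dispatcher thread only delivers the request; it does not wait for the split.
    dispatcherPool.execute(() -> server.submitSplit(walPath, report));
    return report;
  }

  public static void main(String[] args) throws Exception {
    ExecutorService dispatcherPool = Executors.newFixedThreadPool(2);
    ExecutorService remoteWorkers = Executors.newSingleThreadExecutor();
    // Hypothetical remote server: works on its own thread, then reports back.
    RemoteServer server = (wal, report) ->
        remoteWorkers.execute(() -> report.complete(true));
    CompletableFuture<Boolean> result = dispatch(dispatcherPool, server, "wal-0001");
    System.out.println("split succeeded: " + result.get(5, TimeUnit.SECONDS));
    dispatcherPool.shutdown();
    remoteWorkers.shutdown();
  }
}
```

The point of the shape is that the dispatcher pool size bounds how many requests are being delivered at once, not how many splits are running on the RS side.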

 

HBASE-24766 is the issue where I'll add some doc around this stuff. The questions 
above are good. I said I'd do doc in this area a while ago. Let me do it now.



[jira] [Commented] (HBASE-24632) Enable procedure-based log splitting as default in hbase3

2020-07-23 Thread Anoop Sam John (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17163814#comment-17163814
 ] 

Anoop Sam John commented on HBASE-24632:


bq. And there in that executor we have 128 threads by default.
My bad.. I mean RemoteProcedureDispatcher.  There it is 128 threads by default in 
its pool. These threads are responsible for the remote RS calls (splits or region 
open etc).  So here also once the work is given, the thread won't get blocked 
but will yield?  Sorry, I messed it up by saying Executor.



[jira] [Commented] (HBASE-24632) Enable procedure-based log splitting as default in hbase3

2020-07-23 Thread Michael Stack (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17163800#comment-17163800
 ] 

Michael Stack commented on HBASE-24632:
---

{quote}So for every WAL file, it will add a sub procedure in ProcedureExecutor 
and that in turn adds entries for RemoteProcExecutor, correct?
{quote}
[~anoop.hbase] yes.

 
{quote}So at this stage also these sub proc executing threads will yield once 
the split proc is submitted to RemoteProcExecutor?
{quote}
Yes. Previously, SCP would block, occupying the PE worker thread until the WAL 
split completed.

 
{quote}And there in that executor we have 128 threads by default.
{quote}
 

It keeps the below number of threads running:

  final int numThreads = conf.getInt(MasterProcedureConstants.MASTER_PROCEDURE_THREADS,
      Math.max((cpus > 0 ? cpus / 4 : 0),
          MasterProcedureConstants.DEFAULT_MIN_MASTER_PROCEDURE_THREADS));

... where DEFAULT_MIN_MASTER_PROCEDURE_THREADS is 16.

... and internally, it will spin up to

  10 * numThreads;

... if it gets backed up.
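
To make the arithmetic concrete, here is a small sketch of the sizing rule quoted above for the case where no override is configured. `ProcedureThreadSizing` and its method names are made up for illustration; only the formula and the value 16 come from the comment.

```java
final class ProcedureThreadSizing {
  // Value quoted in the comment above.
  static final int DEFAULT_MIN_MASTER_PROCEDURE_THREADS = 16;

  /** Mirrors the quoted default: max(cpus / 4, minimum) when nothing is configured. */
  static int defaultCoreThreads(int cpus) {
    return Math.max(cpus > 0 ? cpus / 4 : 0, DEFAULT_MIN_MASTER_PROCEDURE_THREADS);
  }

  /** Upper bound the comment describes for when the executor gets backed up. */
  static int burstLimit(int coreThreads) {
    return 10 * coreThreads;
  }

  public static void main(String[] args) {
    for (int cpus : new int[] {8, 64, 128}) {
      int core = defaultCoreThreads(cpus);
      System.out.println(cpus + " cpus -> " + core + " core threads, up to " + burstLimit(core));
    }
  }
}
```

Under this rule a host needs more than 64 cpus before the cpu count, rather than the minimum of 16, decides the thread count.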

 
{quote}Or will that also yield?
{quote}
It will yield.

 

Ok to commit? Thanks [~anoop.hbase]

 

 

 



[jira] [Commented] (HBASE-24632) Enable procedure-based log splitting as default in hbase3

2020-07-23 Thread Anoop Sam John (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17163551#comment-17163551
 ] 

Anoop Sam John commented on HBASE-24632:


So for every WAL file, it will add a sub procedure in ProcedureExecutor and 
that in turn adds entries for RemoteProcExecutor, correct?  So at this stage also 
these sub proc executing threads will yield once the split proc is submitted to 
RemoteProcExecutor?  And there in that executor we have 128 threads by 
default. Will such a thread wait once it issues the split request to the RS?  Or will 
that also yield? If the former, then we will get a big backlog in a bigger cluster 
with many RSs down, each having many WALs to split.



[jira] [Commented] (HBASE-24632) Enable procedure-based log splitting as default in hbase3

2020-07-22 Thread Michael Stack (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17163200#comment-17163200
 ] 

Michael Stack commented on HBASE-24632:
---

I was going to push this as default in hbase-2.4 and on hbase3 in the morning 
unless there are objections. I've been running it a while and it's nice... There is a 
procedure per WAL split, and ServerCrashProcedure no longer blocks waiting until 
ALL WALs are split before it can move forward; now it schedules all the WAL splits 
and then suspends itself, freeing up the ProcedureExecutor.
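
A rough sketch of the schedule-then-suspend shape described above, under the assumption that the parent is woken when its last child reports in. `SuspendingParentSketch` and its methods are hypothetical and single-threaded for clarity; the real ServerCrashProcedure and ProcedureExecutor are considerably more involved.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

final class SuspendingParentSketch {
  private final AtomicInteger pendingChildren = new AtomicInteger();
  // Plain list: this sketch runs the children on the calling thread only.
  private final List<String> events = new ArrayList<>();

  /** Schedules one child per WAL, then returns so the worker thread is free. */
  List<Runnable> scheduleSplits(List<String> wals) {
    pendingChildren.set(wals.size());
    List<Runnable> children = new ArrayList<>();
    for (String wal : wals) {
      children.add(() -> childFinished(wal));
    }
    events.add("parent suspended");
    return children;
  }

  /** Called as each child completes; the last one wakes the parent. */
  private void childFinished(String wal) {
    events.add("split " + wal);
    if (pendingChildren.decrementAndGet() == 0) {
      events.add("parent resumed");
    }
  }

  List<String> events() {
    return events;
  }

  public static void main(String[] args) {
    SuspendingParentSketch parent = new SuspendingParentSketch();
    List<Runnable> children = parent.scheduleSplits(Arrays.asList("wal-1", "wal-2", "wal-3"));
    children.forEach(Runnable::run); // run inline here; an executor would use worker threads
    parent.events().forEach(System.out::println);
  }
}
```

The parent holds no thread between "parent suspended" and "parent resumed", which is the property the comment is pointing at.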


[jira] [Commented] (HBASE-24632) Enable procedure-based log splitting as default in hbase3

2020-07-09 Thread Michael Stack (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17154996#comment-17154996
 ] 

Michael Stack commented on HBASE-24632:
---

This one-liner has gotten kinda big:

 

 * We were trying to delete a non-empty directory; we weren't accounting for 
meta WALs where meta had moved off the server (successfully).
 * We were deleting split WALs rather than archiving them.
 * We were not handling corrupt files.

Also includes deprecations and removal of tests of the old system.


[jira] [Commented] (HBASE-24632) Enable procedure-based log splitting as default in hbase3

2020-07-08 Thread Pankaj Kumar (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17153366#comment-17153366
 ] 

Pankaj Kumar commented on HBASE-24632:
--

{quote}You in favor of enabling this for 2.4.0/3.0.0 by default sir?
{quote}
Yeah, we can make procedure-based log splitting the default in the master branch.


[jira] [Commented] (HBASE-24632) Enable procedure-based log splitting as default in hbase3

2020-07-03 Thread Michael Stack (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17151137#comment-17151137
 ] 

Michael Stack commented on HBASE-24632:
---

I was going to enable this for 2.4 unless there is an objection.


[jira] [Commented] (HBASE-24632) Enable procedure-based log splitting as default in hbase3

2020-07-02 Thread Michael Stack (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17150526#comment-17150526
 ] 

Michael Stack commented on HBASE-24632:
---

[~anoop.hbase] you are right sir, HBASE-24619. Let me fill in your concerns 
there.


[jira] [Commented] (HBASE-24632) Enable procedure-based log splitting as default in hbase3

2020-07-02 Thread Anoop Sam John (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17150515#comment-17150515
 ] 

Anoop Sam John commented on HBASE-24632:


we have one already I believe Stack.


[jira] [Commented] (HBASE-24632) Enable procedure-based log splitting as default in hbase3

2020-07-02 Thread Michael Stack (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17150488#comment-17150488
 ] 

Michael Stack commented on HBASE-24632:
---

Let me open a subissue w/ your concern [~anoop.hbase]. Meantime, will push ahead 
with this commit unless there is an objection.


[jira] [Commented] (HBASE-24632) Enable procedure-based log splitting as default in hbase3

2020-07-02 Thread Anoop Sam John (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17150482#comment-17150482
 ] 

Anoop Sam John commented on HBASE-24632:


bq. Only concern is the compaction after region open, which impacts MTTR due to 
heavy IO in large cluster with many outstanding WALs
I see. This compaction adds more IO pressure on storage that is already 
under heavy usage. Good point. That is also something to consider.


[jira] [Commented] (HBASE-24632) Enable procedure-based log splitting as default in hbase3

2020-07-02 Thread Anoop Sam John (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17150481#comment-17150481
 ] 

Anoop Sam John commented on HBASE-24632:


Yes Stack, mostly we might compact these files, since after region open we issue a 
compact request.  But there can be some exceptions.
Assume there were some small files because of flushes that never got compacted 
before the RS went down.  We will look for possible candidates starting from the 
oldest files, and in all likelihood the very old files would get excluded because of 
the size math.  But it is possible that newly flushed files would get selected. 
And we have the max files to compact config also, which is 10 by default.  Even 
the count of these small files alone might be >10. If there are say 15 WAL files to 
split, for sure we will have at least 15 small HFiles.
My thinking was this. After the region open, we have to make sure these small 
files are compacted in one go, and we should not even consider the max files 
limit for this compaction.  Also note that these files might not even have 
the DBE/compression etc. applied.   Coding wise, I am not sure how cleanly 
it might come out. Let us see. Extremely busy these days. Once out of that, I will 
have a look at this.


[jira] [Commented] (HBASE-24632) Enable procedure-based log splitting as default in hbase3

2020-07-02 Thread Michael Stack (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17150422#comment-17150422
 ] 

Michael Stack commented on HBASE-24632:
---

Thank you [~pankajkumar] for weighing in. You in favor of enabling this for 
2.4.0/3.0.0 by default sir? If so perhaps +1 the PR. Do we have an issue to 
cover your compaction concern? Thanks.


[jira] [Commented] (HBASE-24632) Enable procedure-based log splitting as default in hbase3

2020-07-01 Thread Pankaj Kumar (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17149921#comment-17149921
 ] 

Pankaj Kumar commented on HBASE-24632:
--

Basic functionalities are working fine; there were a few issues, which are already 
addressed. The only concern is the compaction after region open, which impacts MTTR 
due to heavy IO in a large cluster with many outstanding WALs.


[jira] [Commented] (HBASE-24632) Enable procedure-based log splitting as default in hbase3

2020-07-01 Thread Michael Stack (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17149504#comment-17149504
 ] 

Michael Stack commented on HBASE-24632:
---

Thanks [~anoop.hbase]. Will wait on [~13.pankajkumar] input.

On small hfiles being quickly compacted away -- I think this concern belongs 
against HBASE-23634 -- but by default, we generally pick up the small files 
first (from RatioBasedCompactionPolicy, our default compaction policy and the 
policy subclassed by the likes of DateTieredCompaction):

{code}
  /**
   * -- Default minor compaction selection algorithm:
   * choose CompactSelection from candidates --
   * First exclude bulk-load files if indicated in configuration.
   * Start at the oldest file and stop when you find the first file that
   * meets compaction criteria:
   * (1) a recently-flushed, small file (i.e. <= minCompactSize)
   * OR
   * (2) within the compactRatio of sum(newer_files)
   * Given normal skew, any newer files will also meet this criteria
   *
   * Additional Note:
   * If fileSizes.size() >> maxFilesToCompact, we will recurse on
   * compact().  Consider the oldest files first to avoid a
   * situation where we always compact [end-threshold,end).  Then, the
   * last file becomes an aggregate of the previous compactions.
   *
   * normal skew:
   *
   *         older ----> newer (increasing seqID)
   *     _
   *    | |   _
   *    | |  | |   _
   *  --|-|- |-|- |-|---_-------_-------  minCompactSize
   *    | |  | |  | |  | |  _  | |
   *    | |  | |  | |  | | | | | |
   *    | |  | |  | |  | | | | | |
   * @param candidates pre-filtrate
   * @return filtered subset
   */
{code}
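
As a rough illustration of the selection rule in that Javadoc, the sketch below walks from the oldest file and stops at the first one that is either small or within the ratio of the newer files. It is a simplification written for this thread, not the RatioBasedCompactionPolicy code: `RatioSelectionSketch` is a made-up name, the sum covers all newer files, and the min/max files-to-compact limits are left out.

```java
import java.util.Arrays;

final class RatioSelectionSketch {
  /**
   * sizes are ordered oldest to newest. Returns the index of the first file that
   * qualifies; files from that index onward are the compaction candidates.
   */
  static int firstSelectedIndex(long[] sizes, long minCompactSize, double ratio) {
    // suffixSum[i] is the total size of files i..end, so suffixSum[i + 1] is "newer than i".
    long[] suffixSum = new long[sizes.length + 1];
    for (int i = sizes.length - 1; i >= 0; i--) {
      suffixSum[i] = suffixSum[i + 1] + sizes[i];
    }
    int start = 0;
    while (start < sizes.length
        && sizes[start] > Math.max(minCompactSize, (long) (suffixSum[start + 1] * ratio))) {
      start++; // too large relative to the newer files: leave it out
    }
    return start;
  }

  public static void main(String[] args) {
    // One big old file followed by small files, as might follow a WAL split.
    long[] sizes = {1000, 12, 10, 8, 9, 11};
    int start = firstSelectedIndex(sizes, 16, 1.2);
    System.out.println("selected: "
        + Arrays.toString(Arrays.copyOfRange(sizes, start, sizes.length)));
  }
}
```

In this toy input the large old file is skipped and the run of small files is picked up together, which is the behaviour the comment above relies on.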


[jira] [Commented] (HBASE-24632) Enable procedure-based log splitting as default in hbase3

2020-07-01 Thread Anoop Sam John (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17149153#comment-17149153
 ] 

Anoop Sam John commented on HBASE-24632:


[~pankaj2461] raised a few.  Can u pls confirm, Pankaj, whether all are addressed?
Also there is one IA raised now so that we can compact all small HFiles created 
by this split as the 1st item once this region is opened. Not sure how 
easy/difficult it is.  But that will be super useful, as there is a chance now 
that these small HFiles will not have table-specific 
compression/encoding/blocksize etc.


[jira] [Commented] (HBASE-24632) Enable procedure-based log splitting as default in hbase3

2020-06-29 Thread Michael Stack (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17148275#comment-17148275
 ] 

Michael Stack commented on HBASE-24632:
---

Was going to turn it on. A few ITBLL runs indicate this thing at least works 
and doesn't lose data.

HBASE-23634 is for enabling hfile by default.

What are the open issues [~anoop.hbase]? I don't think there are any?


[jira] [Commented] (HBASE-24632) Enable procedure-based log splitting as default in hbase3

2020-06-28 Thread Anoop Sam John (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17147520#comment-17147520
 ] 

Anoop Sam John commented on HBASE-24632:


Even the feature where split creates HFiles should be enabled by default. (Or is it 
already in trunk?)  There are still some open issues with this. Once those are 
solved, we can do it.
