[jira] [Commented] (HBASE-21093) Increase the dispatch delay for testing DDL procedures
[ https://issues.apache.org/jira/browse/HBASE-21093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16588866#comment-16588866 ]

Duo Zhang commented on HBASE-21093:
---
It seems the difference is how fast we process sub-procedures... In a successful run, the PEWorker switches between procedures very quickly, but in the failing run we can see this:
{noformat}
2018-08-22 08:21:06,668 INFO [PEWorker-1] procedure.MasterProcedureScheduler(689): pid=576, ppid=75, state=RUNNABLE:REGION_STATE_TRANSITION_GET_ASSIGN_CANDIDATE, hasLock=false; TransitRegionStateProcedure table=TestMRegions, region=42b8c3f44612af35705d88fdacd3b9c1, ASSIGN checking lock on 42b8c3f44612af35705d88fdacd3b9c1
2018-08-22 08:21:06,745 INFO [PEWorker-1] assignment.TransitRegionStateProcedure(159): Starting pid=576, ppid=75, state=RUNNABLE:REGION_STATE_TRANSITION_GET_ASSIGN_CANDIDATE, hasLock=true; TransitRegionStateProcedure table=TestMRegions, region=42b8c3f44612af35705d88fdacd3b9c1, ASSIGN; rit=OFFLINE, location=asf916.gq1.ygridcore.net,41001,1534925965205; forceNewPlan=false, retain=false
2018-08-22 08:21:06,860 INFO [PEWorker-1] procedure.MasterProcedureScheduler(689): pid=575, ppid=75, state=RUNNABLE:REGION_STATE_TRANSITION_GET_ASSIGN_CANDIDATE, hasLock=false; TransitRegionStateProcedure table=TestMRegions, region=d0f9c85da53b9000b636ca5cc613ad26, ASSIGN checking lock on d0f9c85da53b9000b636ca5cc613ad26
2018-08-22 08:21:06,944 INFO [PEWorker-1] assignment.TransitRegionStateProcedure(159): Starting pid=575, ppid=75, state=RUNNABLE:REGION_STATE_TRANSITION_GET_ASSIGN_CANDIDATE, hasLock=true; TransitRegionStateProcedure table=TestMRegions, region=d0f9c85da53b9000b636ca5cc613ad26, ASSIGN; rit=OFFLINE, location=asf916.gq1.ygridcore.net,41001,1534925965205; forceNewPlan=false, retain=false
2018-08-22 08:21:07,045 INFO [PEWorker-1] assignment.RegionStateStore(199): pid=576 updating hbase:meta row=42b8c3f44612af35705d88fdacd3b9c1, regionState=OPENING, regionLocation=asf916.gq1.ygridcore.net,41001,1534925965205
2018-08-22 08:21:07,050 INFO [PEWorker-1] procedure2.ProcedureExecutor(1612): Initialized subprocedures=[{pid=577, ppid=576, state=RUNNABLE, hasLock=false; org.apache.hadoop.hbase.master.assignment.OpenRegionProcedure}]
2018-08-22 08:21:07,362 INFO [PEWorker-1] assignment.RegionStateStore(199): pid=575 updating hbase:meta row=d0f9c85da53b9000b636ca5cc613ad26, regionState=OPENING, regionLocation=asf916.gq1.ygridcore.net,41001,1534925965205
2018-08-22 08:21:07,366 INFO [PEWorker-1] procedure2.ProcedureExecutor(1612): Initialized subprocedures=[{pid=578, ppid=575, state=RUNNABLE, hasLock=false; org.apache.hadoop.hbase.master.assignment.OpenRegionProcedure}]
2018-08-22 08:21:07,751 INFO [PEWorker-1] procedure.MasterProcedureScheduler(689): pid=574, ppid=75, state=RUNNABLE:REGION_STATE_TRANSITION_GET_ASSIGN_CANDIDATE, hasLock=false; TransitRegionStateProcedure table=TestMRegions, region=e23509e402adee724763cf4f48cd86a5, ASSIGN checking lock on e23509e402adee724763cf4f48cd86a5
2018-08-22 08:21:07,834 INFO [PEWorker-1] assignment.TransitRegionStateProcedure(159): Starting pid=574, ppid=75, state=RUNNABLE:REGION_STATE_TRANSITION_GET_ASSIGN_CANDIDATE, hasLock=true; TransitRegionStateProcedure table=TestMRegions, region=e23509e402adee724763cf4f48cd86a5, ASSIGN; rit=OFFLINE, location=asf916.gq1.ygridcore.net,41001,1534925965205; forceNewPlan=false, retain=false
2018-08-22 08:21:07,918 INFO [PEWorker-1] procedure.MasterProcedureScheduler(689): pid=573, ppid=75, state=RUNNABLE:REGION_STATE_TRANSITION_GET_ASSIGN_CANDIDATE, hasLock=false; TransitRegionStateProcedure table=TestMRegions, region=2c050ee558571b09362e3c63f9b75dfb, ASSIGN checking lock on 2c050ee558571b09362e3c63f9b75dfb
2018-08-22 08:21:08,001 INFO [PEWorker-1] assignment.TransitRegionStateProcedure(159): Starting pid=573, ppid=75, state=RUNNABLE:REGION_STATE_TRANSITION_GET_ASSIGN_CANDIDATE, hasLock=true; TransitRegionStateProcedure table=TestMRegions, region=2c050ee558571b09362e3c63f9b75dfb, ASSIGN; rit=OFFLINE, location=asf916.gq1.ygridcore.net,41001,1534925965205; forceNewPlan=false, retain=false
2018-08-22 08:21:08,084 INFO [PEWorker-1] assignment.RegionStateStore(199): pid=574 updating hbase:meta row=e23509e402adee724763cf4f48cd86a5, regionState=OPENING, regionLocation=asf916.gq1.ygridcore.net,41001,1534925965205
2018-08-22 08:21:08,088 INFO [PEWorker-1] procedure2.ProcedureExecutor(1612): Initialized subprocedures=[{pid=579, ppid=574, state=RUNNABLE, hasLock=false; org.apache.hadoop.hbase.master.assignment.OpenRegionProcedure}]
2018-08-22 08:21:08,427 INFO [PEWorker-1] assignment.RegionStateStore(199): pid=573 updating hbase:meta row=2c050ee558571b09362e3c63f9b75dfb, regionState=OPENING, regionLocation=asf916.gq1.ygridcore.net,41001,1534925965205
2018-08-22 08:21:08,431 INFO [PEWorker-1] procedure2.ProcedureExecutor(1612): Initialized subproc
[jira] [Commented] (HBASE-21093) Increase the dispatch delay for testing DDL procedures
[ https://issues.apache.org/jira/browse/HBASE-21093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16588842#comment-16588842 ]

Hudson commented on HBASE-21093:

Results for branch master [build #448 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/master/448/]: (x) *{color:red}-1 overall{color}*
details (if available):
(/) {color:green}+1 general checks{color}
-- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/master/448//General_Nightly_Build_Report/]
(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/master/448//JDK8_Nightly_Build_Report_(Hadoop2)/]
(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/master/448//JDK8_Nightly_Build_Report_(Hadoop3)/]
(/) {color:green}+1 source release artifact{color}
-- See build output for details.
(/) {color:green}+1 client integration test{color}

> Increase the dispatch delay for testing DDL procedures
> --
>
> Key: HBASE-21093
> URL: https://issues.apache.org/jira/browse/HBASE-21093
> Project: HBase
> Issue Type: Sub-task
> Components: test
> Reporter: Duo Zhang
> Assignee: Duo Zhang
> Priority: Major
> Attachments: HBASE-21093.patch
>
>
> In TestTableDDLProcedureBase we set the procedure worker number to 1, so on a
> slow machine, we will fail to batch the remote calls to RS with the default
> dispatch delay, and lead to a very long time to finish the bunch of sub TRSPs.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
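The batching problem in the issue description can be illustrated with a toy model. This is only a sketch and not HBase code: the class `DispatchBatchSketch` and its `batch` method are hypothetical, and the model (a window opens at the first pending request and flushes once the delay has passed) is a simplification of whatever the real dispatcher does. The arrival times are taken from the "Initialized subprocedures" entries in the failing-run log quoted in this thread, used as a rough proxy for when the remote calls get queued. The 150 ms figure is an assumed default delay and 2000 ms is an arbitrary larger value, neither is taken from the patch.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of delay-based batching; hypothetical, not HBase code. */
final class DispatchBatchSketch {

  /**
   * Groups request arrival times (ms, ascending) into batches. A batch opens
   * at the first pending arrival; a request arriving more than delayMs after
   * that point starts a new batch.
   */
  static List<List<Long>> batch(long[] arrivalsMs, long delayMs) {
    List<List<Long>> batches = new ArrayList<>();
    List<Long> current = new ArrayList<>();
    long windowStart = 0;
    for (long t : arrivalsMs) {
      if (!current.isEmpty() && t - windowStart > delayMs) {
        // The window has expired, so the pending batch is flushed.
        batches.add(current);
        current = new ArrayList<>();
      }
      if (current.isEmpty()) {
        windowStart = t;
      }
      current.add(t);
    }
    if (!current.isEmpty()) {
      batches.add(current);
    }
    return batches;
  }

  public static void main(String[] args) {
    // Milliseconds after 08:21:07.000 of the four "Initialized subproc..."
    // entries in the failing-run log (08:21:07,050 / 07,366 / 08,088 / 08,431).
    long[] arrivals = {50, 366, 1088, 1431};
    // Assumed short delay: every request ends up in its own batch.
    System.out.println(batch(arrivals, 150).size());  // prints 4
    // Arbitrary longer delay: all four requests share one batch.
    System.out.println(batch(arrivals, 2000).size()); // prints 1
  }
}
```

With a single worker producing one sub-procedure every 300 to 700 ms, a short window never catches more than one request, which is the situation the description calls failing to batch the remote calls.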
[jira] [Commented] (HBASE-21093) Increase the dispatch delay for testing DDL procedures
[ https://issues.apache.org/jira/browse/HBASE-21093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16588828#comment-16588828 ]

Duo Zhang commented on HBASE-21093:
---
Checked a successful run and a failing run; there is not much difference in the time spent before we begin to assign regions.

This is the successful one
{noformat}
2018-08-22 02:22:12,073 DEBUG [RpcServer.default.FPBQ.Fifo.handler=4,queue=0,port=44370] procedure2.ProcedureExecutor(1004): Stored pid=75, state=RUNNABLE:CREATE_TABLE_PRE_OPERATION, hasLock=false; CreateTableProcedure table=TestMRegions
2018-08-22 02:22:25,003 INFO [RegionOpenAndInitThread-TestMRegions-15] regionserver.HRegion(1687): Closed TestMRegions,0499,1534904531895.e8c30f58fcef29d43e8c5d69b4e582c5.
2018-08-22 02:22:25,543 INFO [PEWorker-1] hbase.MetaTableAccessor(1700): Updated tableName=TestMRegions, state=ENABLING in hbase:meta
{noformat}

This is the failing one
{noformat}
2018-08-21 22:44:38,005 DEBUG [RpcServer.default.FPBQ.Fifo.handler=3,queue=0,port=48225] procedure2.ProcedureExecutor(1004): Stored pid=75, state=RUNNABLE:CREATE_TABLE_PRE_OPERATION, hasLock=false; CreateTableProcedure table=TestMRegions
2018-08-21 22:44:50,291 INFO [RegionOpenAndInitThread-TestMRegions-13] regionserver.HRegion(1687): Closed TestMRegions,0499,1534891477987.f51d1f6e305eed925b5a4975e84166dd.
2018-08-21 22:44:50,501 INFO [PEWorker-1] hbase.MetaTableAccessor(1700): Updated tableName=TestMRegions, state=ENABLING in hbase:meta
{noformat}

So I do not think there is much performance difference between the machines? Need to dig more.
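For reference, the comparison above can be put into numbers with a few lines of Java. This is just a sketch: the class `PreAssignElapsed` and its method are hypothetical helpers, and the only inputs are the first and last timestamps of each log excerpt (procedure stored, then table state set to ENABLING).

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

/** Hypothetical helper that diffs two timestamps in the log format above. */
final class PreAssignElapsed {

  // Matches timestamps such as "2018-08-22 02:22:12,073".
  private static final DateTimeFormatter LOG_TS =
      DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss,SSS");

  /** Returns the milliseconds elapsed between two log timestamps. */
  static long elapsedMs(String start, String end) {
    return Duration.between(
        LocalDateTime.parse(start, LOG_TS),
        LocalDateTime.parse(end, LOG_TS)).toMillis();
  }

  public static void main(String[] args) {
    // Successful run: "Stored pid=75" until "state=ENABLING".
    System.out.println(
        elapsedMs("2018-08-22 02:22:12,073", "2018-08-22 02:22:25,543")); // prints 13470
    // Failing run: same two events.
    System.out.println(
        elapsedMs("2018-08-21 22:44:38,005", "2018-08-21 22:44:50,501")); // prints 12496
  }
}
```

That is about 13.5 s for the successful run against about 12.5 s for the failing one, so the failing machine was, if anything, slightly faster up to this point.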
[jira] [Commented] (HBASE-21093) Increase the dispatch delay for testing DDL procedures
[ https://issues.apache.org/jira/browse/HBASE-21093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16588818#comment-16588818 ]

Duo Zhang commented on HBASE-21093:
---
Seems does not work... Let me dig more.
[jira] [Commented] (HBASE-21093) Increase the dispatch delay for testing DDL procedures
[ https://issues.apache.org/jira/browse/HBASE-21093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16588430#comment-16588430 ]

Duo Zhang commented on HBASE-21093:
---
Pushed to master. Let's see how it works. Will also keep an eye on the failing TestDisableTableProcedure.
[jira] [Commented] (HBASE-21093) Increase the dispatch delay for testing DDL procedures
[ https://issues.apache.org/jira/browse/HBASE-21093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16588413#comment-16588413 ]

Hadoop QA commented on HBASE-21093:
---
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 11s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 5s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 58s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 17s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 4m 25s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 13s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 34s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 8s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 10s{color} | {color:green} hbase-server: The patch generated 0 new + 0 unchanged - 1 fixed = 0 total (was 1) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 4m 17s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 7m 47s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 29s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}117m 31s{color} | {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 22s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}154m 43s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.master.procedure.TestDisableTableProcedure |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-21093 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12936554/HBASE-21093.patch |
| Optional Tests | asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile |
| uname | Linux c2ef5975d901 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 08:52:28 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
| git revision | master / 50055dbf04 |
| maven | version: Apache Maven 3.5.4 (1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC3 |
| unit | https://builds.apache.org/job/PreCommit-HBASE-Build/14127/artifact/patchprocess/patch-unit-hbase-server.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/14127/testReport/ |
| Max. process+thread count | 4388 (vs. uli
[jira] [Commented] (HBASE-21093) Increase the dispatch delay for testing DDL procedures
[ https://issues.apache.org/jira/browse/HBASE-21093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16588336#comment-16588336 ]

stack commented on HBASE-21093:
---
+1