[ https://issues.apache.org/jira/browse/HBASE-20842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16534648#comment-16534648 ]

Hadoop QA commented on HBASE-20842:
-----------------------------------

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 29s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m  4s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m  6s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 14s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  5m  4s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 29s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 34s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m  4s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  2m  4s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  5m 10s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 12m 33s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 34s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}201m 39s{color} | {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 23s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}250m 48s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-20842 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12930488/HBASE-20842.master.002.patch |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  shadedjars  hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux 60683424844a 4.4.0-98-generic #121-Ubuntu SMP Tue Oct 10 14:24:03 UTC 2017 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
| git revision | master / 1ade4d2f44 |
| maven | version: Apache Maven 3.5.4 (1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| Default Java | 1.8.0_171 |
| findbugs | v3.1.0-RC3 |
| unit | https://builds.apache.org/job/PreCommit-HBASE-Build/13521/artifact/patchprocess/patch-unit-hbase-server.txt |
|  Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/13521/testReport/ |
| Max. process+thread count | 4498 (vs. ulimit of 10000) |
| modules | C: hbase-server U: hbase-server |
| Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/13521/console |
| Powered by | Apache Yetus 0.7.0   http://yetus.apache.org |


This message was automatically generated.



> Infinite loop when replaying remote wals
> ----------------------------------------
>
>                 Key: HBASE-20842
>                 URL: https://issues.apache.org/jira/browse/HBASE-20842
>             Project: HBase
>          Issue Type: Bug
>          Components: Replication
>            Reporter: Duo Zhang
>            Assignee: Guanghao Zhang
>            Priority: Major
>             Fix For: 3.0.0
>
>         Attachments: HBASE-20842.master.001.patch, 
> HBASE-20842.master.002.patch, HBASE-20842.master.002.patch, 
> HBASE-20842.master.002.patch
>
>
> {noformat}
> 2018-07-03 12:25:11,375 WARN  [RSProcedureDispatcher-pool13-t19] 
> replication.SyncReplicationReplayWALRemoteProcedure(107): Replay wals 
> [remoteWALs/1-replay/asf916.gq1.ygridcore.net%2C36931%2C1530620616106-1530620683061-1.1530620683075.syncrep]
>  on asf916.gq1.ygridcore.net,33811,1530620636539 failed for peer id=1
> org.apache.hadoop.hbase.regionserver.RegionServerStoppedException: Server 
> asf916.gq1.ygridcore.net,33811,1530620636539 is not online
>       at 
> org.apache.hadoop.hbase.master.procedure.RSProcedureDispatcher$DeadRSRemoteCall.call(RSProcedureDispatcher.java:285)
>       at 
> org.apache.hadoop.hbase.master.procedure.RSProcedureDispatcher$DeadRSRemoteCall.call(RSProcedureDispatcher.java:276)
>       at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>       at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>       at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>       at java.lang.Thread.run(Thread.java:748)
> 2018-07-03 12:25:11,440 DEBUG [Thread-2883] 
> replication.TestSyncReplicationStandbyKillRS(111): Server 
> [asf916.gq1.ygridcore.net,33811,1530620636539] marked as dead, waiting for it 
> to finish dead processing
> 2018-07-03 12:25:11,441 DEBUG [Thread-2883] 
> replication.TestSyncReplicationStandbyKillRS(114): Server 
> [asf916.gq1.ygridcore.net,33811,1530620636539] still being processed, waiting
> 2018-07-03 12:25:11,456 WARN  [RS:3;asf916:45751] wal.AbstractFSWAL(419): 
> 'hbase.regionserver.maxlogs' was deprecated.
> 2018-07-03 12:25:11,457 INFO  [RS:3;asf916:45751] wal.AbstractFSWAL(424): WAL 
> configuration: blocksize=256 MB, rollsize=128 MB, 
> prefix=asf916.gq1.ygridcore.net%2C45751%2C1530620709275, suffix=, 
> logDir=hdfs://localhost:42624/user/jenkins/test-data/a86a805e-162f-5f22-7b9e-573dbf0f40fb/WALs/asf916.gq1.ygridcore.net,45751,1530620709275,
>  
> archiveDir=hdfs://localhost:42624/user/jenkins/test-data/a86a805e-162f-5f22-7b9e-573dbf0f40fb/oldWALs
> 2018-07-03 12:25:11,467 DEBUG [RS-EventLoopGroup-14-4] 
> asyncfs.FanOutOneBlockAsyncDFSOutputSaslHelper(737): SASL client skipping 
> handshake in unsecured configuration for addr = 127.0.0.1/127.0.0.1, 
> datanodeId = 
> DatanodeInfoWithStorage[127.0.0.1:38997,DS-6002160d-388b-4840-8538-e4c2255108be,DISK]
> 2018-07-03 12:25:11,467 DEBUG [RS-EventLoopGroup-14-5] 
> asyncfs.FanOutOneBlockAsyncDFSOutputSaslHelper(737): SASL client skipping 
> handshake in unsecured configuration for addr = 127.0.0.1/127.0.0.1, 
> datanodeId = 
> DatanodeInfoWithStorage[127.0.0.1:45904,DS-e189e3c8-a1bd-475c-86c0-3891e541fc6e,DISK]
> 2018-07-03 12:25:11,467 DEBUG [RS-EventLoopGroup-14-3] 
> asyncfs.FanOutOneBlockAsyncDFSOutputSaslHelper(737): SASL client skipping 
> handshake in unsecured configuration for addr = 127.0.0.1/127.0.0.1, 
> datanodeId = 
> DatanodeInfoWithStorage[127.0.0.1:39589,DS-62ced3f8-35c4-4904-80cc-4d514b8f4050,DISK]
> 2018-07-03 12:25:11,495 DEBUG [RegionServerTracker-0] 
> procedure2.ProcedureExecutor(887): Stored pid=30, 
> state=RUNNABLE:SERVER_CRASH_START; ServerCrashProcedure 
> server=asf916.gq1.ygridcore.net,33811,1530620636539, splitWal=true, meta=true
> 2018-07-03 12:25:11,495 DEBUG [RegionServerTracker-0] 
> assignment.AssignmentManager(1321): 
> Added=asf916.gq1.ygridcore.net,33811,1530620636539 to dead servers, submitted 
> shutdown handler to be executed meta=true
> 2018-07-03 12:25:11,498 INFO  [PEWorker-6] 
> procedure.ServerCrashProcedure(118): Start pid=30, 
> state=RUNNABLE:SERVER_CRASH_START; ServerCrashProcedure 
> server=asf916.gq1.ygridcore.net,33811,1530620636539, splitWal=true, meta=true
> 2018-07-03 12:25:11,500 WARN  [RegionServerTracker-0] 
> replication.SyncReplicationReplayWALRemoteProcedure(107): Replay wals 
> [remoteWALs/1-replay/asf916.gq1.ygridcore.net%2C36931%2C1530620616106-1530620683061-1.1530620683075.syncrep]
>  on asf916.gq1.ygridcore.net,33811,1530620636539 failed for peer id=1
> org.apache.hadoop.hbase.DoNotRetryIOException: server not online 
> asf916.gq1.ygridcore.net,33811,1530620636539
>       at 
> org.apache.hadoop.hbase.master.procedure.RSProcedureDispatcher.abortPendingOperations(RSProcedureDispatcher.java:130)
>       at 
> org.apache.hadoop.hbase.master.procedure.RSProcedureDispatcher.abortPendingOperations(RSProcedureDispatcher.java:60)
>       at 
> org.apache.hadoop.hbase.procedure2.RemoteProcedureDispatcher$BufferNode.abortOperationsInQueue(RemoteProcedureDispatcher.java:380)
>       at 
> org.apache.hadoop.hbase.procedure2.RemoteProcedureDispatcher.removeNode(RemoteProcedureDispatcher.java:193)
>       at 
> org.apache.hadoop.hbase.master.procedure.RSProcedureDispatcher.serverRemoved(RSProcedureDispatcher.java:143)
>       at 
> org.apache.hadoop.hbase.master.ServerManager.expireServer(ServerManager.java:610)
>       at 
> org.apache.hadoop.hbase.master.RegionServerTracker.refresh(RegionServerTracker.java:160)
>       at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>       at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>       at java.lang.Thread.run(Thread.java:748)
> 2018-07-03 12:25:11,503 WARN  [PEWorker-4] 
> replication.SyncReplicationReplayWALRemoteProcedure(162): Can not add remote 
> operation for replay wals 
> [remoteWALs/1-replay/asf916.gq1.ygridcore.net%2C36931%2C1530620616106-1530620683061-1.1530620683075.syncrep]
>  on asf916.gq1.ygridcore.net,33811,1530620636539 for peer id=1, this usually 
> because the server is already dead, retry
> 2018-07-03 12:25:11,503 WARN  [PEWorker-4] 
> replication.SyncReplicationReplayWALRemoteProcedure(162): Can not add remote 
> operation for replay wals 
> [remoteWALs/1-replay/asf916.gq1.ygridcore.net%2C36931%2C1530620616106-1530620683061-1.1530620683075.syncrep]
>  on asf916.gq1.ygridcore.net,33811,1530620636539 for peer id=1, this usually 
> because the server is already dead, retry
> 2018-07-03 12:25:11,503 WARN  [PEWorker-4] 
> replication.SyncReplicationReplayWALRemoteProcedure(162): Can not add remote 
> operation for replay wals 
> [remoteWALs/1-replay/asf916.gq1.ygridcore.net%2C36931%2C1530620616106-1530620683061-1.1530620683075.syncrep]
>  on asf916.gq1.ygridcore.net,33811,1530620636539 for peer id=1, this usually 
> because the server is already dead, retry
> 2018-07-03 12:25:11,503 WARN  [PEWorker-7] 
> replication.SyncReplicationReplayWALRemoteProcedure(162): Can not add remote 
> operation for replay wals 
> [remoteWALs/1-replay/asf916.gq1.ygridcore.net%2C36931%2C1530620616106-1530620683061-1.1530620683075.syncrep]
>  on asf916.gq1.ygridcore.net,33811,1530620636539 for peer id=1, this usually 
> because the server is already dead, retry
> 2018-07-03 12:25:11,504 WARN  [PEWorker-7] 
> replication.SyncReplicationReplayWALRemoteProcedure(162): Can not add remote 
> operation for replay wals 
> [remoteWALs/1-replay/asf916.gq1.ygridcore.net%2C36931%2C1530620616106-1530620683061-1.1530620683075.syncrep]
>  on asf916.gq1.ygridcore.net,33811,1530620636539 for peer id=1, this usually 
> because the server is already dead, retry
> 2018-07-03 12:25:11,504 WARN  [PEWorker-7] 
> replication.SyncReplicationReplayWALRemoteProcedure(162): Can not add remote 
> operation for replay wals 
> [remoteWALs/1-replay/asf916.gq1.ygridcore.net%2C36931%2C1530620616106-1530620683061-1.1530620683075.syncrep]
>  on asf916.gq1.ygridcore.net,33811,1530620636539 for peer id=1, this usually 
> because the server is already dead, retry
> 2018-07-03 12:25:11,504 WARN  [PEWorker-7] 
> replication.SyncReplicationReplayWALRemoteProcedure(162): Can not add remote 
> operation for replay wals 
> [remoteWALs/1-replay/asf916.gq1.ygridcore.net%2C36931%2C1530620616106-1530620683061-1.1530620683075.syncrep]
>  on asf916.gq1.ygridcore.net,33811,1530620636539 for peer id=1, this usually 
> because the server is already dead, retry
> 2018-07-03 12:25:11,504 WARN  [PEWorker-7] 
> replication.SyncReplicationReplayWALRemoteProcedure(162): Can not add remote 
> operation for replay wals 
> [remoteWALs/1-replay/asf916.gq1.ygridcore.net%2C36931%2C1530620616106-1530620683061-1.1530620683075.syncrep]
>  on asf916.gq1.ygridcore.net,33811,1530620636539 for peer id=1, this usually 
> because the server is already dead, retry
> 2018-07-03 12:25:11,504 WARN  [PEWorker-7] 
> replication.SyncReplicationReplayWALRemoteProcedure(162): Can not add remote 
> operation for replay wals 
> [remoteWALs/1-replay/asf916.gq1.ygridcore.net%2C36931%2C1530620616106-1530620683061-1.1530620683075.syncrep]
>  on asf916.gq1.ygridcore.net,33811,1530620636539 for peer id=1, this usually 
> because the server is already dead, retry
> 2018-07-03 12:25:11,504 WARN  [PEWorker-7] 
> replication.SyncReplicationReplayWALRemoteProcedure(162): Can not add remote 
> operation for replay wals 
> [remoteWALs/1-replay/asf916.gq1.ygridcore.net%2C36931%2C1530620616106-1530620683061-1.1530620683075.syncrep]
>  on asf916.gq1.ygridcore.net,33811,1530620636539 for peer id=1, this usually 
> because the server is already dead, retry
> 2018-07-03 12:25:11,504 WARN  [PEWorker-7] 
> replication.SyncReplicationReplayWALRemoteProcedure(162): Can not add remote 
> operation for replay wals 
> [remoteWALs/1-replay/asf916.gq1.ygridcore.net%2C36931%2C1530620616106-1530620683061-1.1530620683075.syncrep]
>  on asf916.gq1.ygridcore.net,33811,1530620636539 for peer id=1, this usually 
> because the server is already dead, retry
> 2018-07-03 12:25:11,505 WARN  [PEWorker-11] 
> replication.SyncReplicationReplayWALRemoteProcedure(162): Can not add remote 
> operation for replay wals 
> [remoteWALs/1-replay/asf916.gq1.ygridcore.net%2C36931%2C1530620616106-1530620683061-1.1530620683075.syncrep]
>  on asf916.gq1.ygridcore.net,33811,1530620636539 for peer id=1, this usually 
> because the server is already dead, retry
> 2018-07-03 12:25:11,505 WARN  [PEWorker-8] 
> replication.SyncReplicationReplayWALRemoteProcedure(162): Can not add remote 
> operation for replay wals 
> [remoteWALs/1-replay/asf916.gq1.ygridcore.net%2C36931%2C1530620616106-1530620683061-1.1530620683075.syncrep]
>  on asf916.gq1.ygridcore.net,33811,1530620636539 for peer id=1, this usually 
> because the server is already dead, retry
> 2018-07-03 12:25:11,505 WARN  [PEWorker-8] 
> replication.SyncReplicationReplayWALRemoteProcedure(162): Can not add remote 
> operation for replay wals 
> [remoteWALs/1-replay/asf916.gq1.ygridcore.net%2C36931%2C1530620616106-1530620683061-1.1530620683075.syncrep]
>  on asf916.gq1.ygridcore.net,33811,1530620636539 for peer id=1, this usually 
> because the server is already dead, retry
> 2018-07-03 12:25:11,505 WARN  [PEWorker-8] 
> replication.SyncReplicationReplayWALRemoteProcedure(162): Can not add remote 
> operation for replay wals 
> [remoteWALs/1-replay/asf916.gq1.ygridcore.net%2C36931%2C1530620616106-1530620683061-1.1530620683075.syncrep]
>  on asf916.gq1.ygridcore.net,33811,1530620636539 for peer id=1, this usually 
> because the server is already dead, retry
> {noformat}
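
Editorial note on the log above: the same "Can not add remote operation ... retry" warning is emitted many times within a few milliseconds for a server that has already been marked dead, which is the retry loop the issue title refers to. As a rough illustration only, and not the change made in the attached patches, the general idea of bounding such a retry and backing off between attempts might be sketched as follows. The class name BoundedRetry, the method retryWithBackoff, and all of its parameters are hypothetical and are not part of HBase.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical illustration of a bounded retry; this is not HBase code. */
public class BoundedRetry {

    /**
     * Runs {@code attempt} until it returns true or {@code maxAttempts} is reached,
     * sleeping with exponential backoff between failed attempts.
     * Returns the number of attempts actually made.
     */
    public static int retryWithBackoff(BooleanSupplier attempt, int maxAttempts,
                                       long initialDelayMs, long maxDelayMs) {
        long delay = initialDelayMs;
        for (int made = 1; made <= maxAttempts; made++) {
            if (attempt.getAsBoolean()) {
                return made; // the operation was accepted
            }
            if (made < maxAttempts) {
                try {
                    Thread.sleep(delay); // wait instead of retrying immediately
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // preserve the interrupt and stop
                    return made;
                }
                delay = Math.min(delay * 2, maxDelayMs); // double the wait, up to a cap
            }
        }
        return maxAttempts; // every attempt failed
    }

    public static void main(String[] args) {
        // A target that never accepts the operation, like a server already marked dead.
        int made = retryWithBackoff(() -> false, 5, 1, 8);
        System.out.println("gave up after " + made + " attempts");
    }
}
```

With a cap on attempts, the caller gets control back and can choose a different target instead of re-submitting to the same dead server indefinitely.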



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
