[jira] [Updated] (RATIS-840) Memory leak of LogAppender

2020-04-28 Thread runzhiwang (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

runzhiwang updated RATIS-840:
-
Attachment: (was: RATIS-840.004.patch)

> Memory leak of LogAppender
> --
>
> Key: RATIS-840
> URL: https://issues.apache.org/jira/browse/RATIS-840
> Project: Ratis
>  Issue Type: Bug
>  Components: server
>Reporter: runzhiwang
>Assignee: runzhiwang
>Priority: Blocker
> Attachments: RATIS-840.001.patch, RATIS-840.002.patch, 
> RATIS-840.003.patch, image-2020-04-06-14-27-28-485.png, 
> image-2020-04-06-14-27-39-582.png, screenshot-1.png, screenshot-2.png
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> *What's the problem ?*
>  After running hadoop-ozone for 4 days, the datanode leaked memory. When I dumped the heap, I found 460710 instances of GrpcLogAppender, but only 6 instances of SenderList, and each SenderList contained only 1-2 instances of GrpcLogAppender. There were also a lot of logs related to 
> [LeaderState::restartSender|https://github.com/apache/incubator-ratis/blob/master/ratis-server/src/main/java/org/apache/ratis/server/impl/LeaderState.java#L428].
>  {code:java}INFO impl.RaftServerImpl: 
> 1665f5ea-ab17-4a0e-af6d-6958efd322fa@group-F64B465F37B5-LeaderState: 
> Restarting GrpcLogAppender for 
> 1665f5ea-ab17-4a0e-af6d-6958efd322fa@group-F64B465F37B5-\u003e229cbcc1-a3b2-4383-9c0d-c0f4c28c3d4a{code}
>  
>  So there are a lot of GrpcLogAppender instances that did not stop their daemon thread when removed from senders. 
>  !image-2020-04-06-14-27-28-485.png! 
>  !image-2020-04-06-14-27-39-582.png! 
>  
> *Why is 
> [LeaderState::restartSender|https://github.com/apache/incubator-ratis/blob/master/ratis-server/src/main/java/org/apache/ratis/server/impl/LeaderState.java#L428]
>  called so many times ?*
> 1. As the image shows, when the group is removed, SegmentedRaftLog is closed, and GrpcLogAppender throws an exception when it finds that the SegmentedRaftLog was closed. GrpcLogAppender is then 
> [restarted|https://github.com/apache/incubator-ratis/blob/master/ratis-server/src/main/java/org/apache/ratis/server/impl/LogAppender.java#L94],
>  and the new GrpcLogAppender throws the same exception when it finds that the SegmentedRaftLog was closed, so it is restarted again, and so on. This results in an infinite restart of GrpcLogAppender.
> 2. Actually, when the group is removed, GrpcLogAppender is stopped: 
> RaftServerImpl::shutdown -> 
> [RoleInfo::shutdownLeaderState|https://github.com/apache/incubator-ratis/blob/master/ratis-server/src/main/java/org/apache/ratis/server/impl/RaftServerImpl.java#L266]
>  -> LeaderState::stop -> LogAppender::stopAppender, and then SegmentedRaftLog is closed: RaftServerImpl::shutdown -> 
> [ServerState::close|https://github.com/apache/incubator-ratis/blob/master/ratis-server/src/main/java/org/apache/ratis/server/impl/RaftServerImpl.java#L271]
>  ... . Though RoleInfo::shutdownLeaderState is called before ServerState::close, the GrpcLogAppender is stopped asynchronously. So the infinite restart of GrpcLogAppender happens whenever the GrpcLogAppender stops after the SegmentedRaftLog has closed.
>  !screenshot-1.png! 
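The shutdown race above can be sketched in a few lines. This is an illustrative model, not the actual Ratis code: the class, field, and method names (LeaderStateSketch, running, restartSender) are hypothetical. The point is that an unconditional restart on the exception path loops forever once the log is closed, while checking a "still running" flag before restarting breaks the loop.

```java
// Minimal sketch of the restart race (hypothetical names, not the Ratis API).
public class LeaderStateSketch {
    volatile boolean running = true;    // cleared by the (asynchronous) stop path
    volatile boolean logClosed = false; // set when SegmentedRaftLog closes
    int restartCount = 0;

    // Simulates one appender iteration hitting the closed log.
    void appenderIteration() {
        if (logClosed) {
            restartSender(); // the exception path in the real code
        }
    }

    void restartSender() {
        if (!running) {
            return; // the guard: never restart after the leader has stopped
        }
        restartCount++; // a new appender starts and will fail the same way
    }
}
```

With the guard, an appender that fails after LeaderState::stop has run is simply dropped instead of respawned.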
> *Why did GrpcLogAppender not stop the daemon thread when removed from senders ?*
>  I found a lot of GrpcLogAppender threads blocked inside log4j. I think the GrpcLogAppenders restart too fast and then block in log4j.
>  !screenshot-2.png! 
> *Can the new GrpcLogAppender work normally ?*
> 1. Even without the above problem, the newly created GrpcLogAppender still cannot work normally. 
> 2. When a new GrpcLogAppender is created, a new FollowerInfo is also created: 
> LeaderState::addAndStartSenders -> 
> LeaderState::addSenders -> RaftServerImpl::newLogAppender -> [new 
> FollowerInfo|https://github.com/apache/incubator-ratis/blob/master/ratis-server/src/main/java/org/apache/ratis/server/impl/RaftServerImpl.java#L129]
> 3. When the newly created GrpcLogAppender appends an entry to the follower, the follower responds with SUCCESS.
> 4. Then LeaderState::updateCommit -> [LeaderState::getMajorityMin | 
> https://github.com/apache/incubator-ratis/blob/master/ratis-server/src/main/java/org/apache/ratis/server/impl/LeaderState.java#L599]
>  -> 
> [voterLists.get(0) | 
> https://github.com/apache/incubator-ratis/blob/master/ratis-server/src/main/java/org/apache/ratis/server/impl/LeaderState.java#L607].
>  {color:#DE350B}The error happens because voterLists.get(0) returns the FollowerInfo of the old GrpcLogAppender, not the FollowerInfo of the new GrpcLogAppender.{color}
> 5. The majority commit computed from the FollowerInfo of the old GrpcLogAppender never changes. So even though the follower has appended the entry successfully, the leader cannot update the commit index, and the newly created GrpcLogAppender cannot work normally.
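Why a stale FollowerInfo pins the commit can be seen from how a majority index is derived from match indices. The sketch below is hedged: it is not the actual Ratis getMajorityMin implementation, just the standard "sort and take the majority-reached element" computation. If voterLists still holds the replaced appender's FollowerInfo, its frozen match index keeps appearing in this input and the result never advances.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch of a majority-min computation (not the Ratis code).
public class MajorityMinSketch {
    static long majorityMin(List<Long> matchIndices) {
        List<Long> sorted = new ArrayList<>(matchIndices);
        Collections.sort(sorted);
        // The element at (n-1)/2 has been reached by a majority of the n servers.
        return sorted.get((sorted.size() - 1) / 2);
    }
}
```

For three voters whose stale entries are frozen at 0, the majority min stays 0 no matter how far the live followers advance.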

[jira] [Updated] (RATIS-912) Failed UT: RejectedExecutionException: event executor terminated

2020-04-28 Thread Shashikant Banerjee (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shashikant Banerjee updated RATIS-912:
--
Parent: RATIS-863
Issue Type: Sub-task  (was: Bug)

> Failed UT: RejectedExecutionException: event executor terminated
> 
>
> Key: RATIS-912
> URL: https://issues.apache.org/jira/browse/RATIS-912
> Project: Ratis
>  Issue Type: Sub-task
>Reporter: runzhiwang
>Priority: Major
> Attachments: screenshot-1.png
>
>
> It looks like RATIS-910 generated a new failing UT. This type of UT failure did not happen in previous commits.
> https://github.com/apache/incubator-ratis/runs/625933249
> https://github.com/apache/incubator-ratis/runs/626750611
>  !screenshot-1.png! 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (RATIS-912) Failed UT: RejectedExecutionException: event executor terminated

2020-04-28 Thread runzhiwang (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

runzhiwang updated RATIS-912:
-
Description: 
It looks like RATIS-910 generated a new failing UT. This type of UT failure did not happen in previous commits.
https://github.com/apache/incubator-ratis/runs/625933249
https://github.com/apache/incubator-ratis/runs/626750611
 !screenshot-1.png! 

  was:
It looks like RATIS-910 generated a new failing UT. The previous commit did not generate this type of UT failure.
https://github.com/apache/incubator-ratis/runs/625933249
https://github.com/apache/incubator-ratis/runs/626750611
 !screenshot-1.png! 


> Failed UT: RejectedExecutionException: event executor terminated
> 
>
> Key: RATIS-912
> URL: https://issues.apache.org/jira/browse/RATIS-912
> Project: Ratis
>  Issue Type: Bug
>Reporter: runzhiwang
>Priority: Major
> Attachments: screenshot-1.png
>
>
> It looks like RATIS-910 generated a new failing UT. This type of UT failure did not happen in previous commits.
> https://github.com/apache/incubator-ratis/runs/625933249
> https://github.com/apache/incubator-ratis/runs/626750611
>  !screenshot-1.png! 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (RATIS-912) Failed UT: RejectedExecutionException: event executor terminated

2020-04-28 Thread runzhiwang (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

runzhiwang updated RATIS-912:
-
Description: 
It looks like RATIS-910 generated a new failing UT. The previous commit did not generate this type of UT failure.
https://github.com/apache/incubator-ratis/runs/625933249
https://github.com/apache/incubator-ratis/runs/626750611
 !screenshot-1.png! 

  was:
It looks like RATIS-910 generated a new failing UT. The previous commit never generated this type of UT failure.
https://github.com/apache/incubator-ratis/runs/625933249
https://github.com/apache/incubator-ratis/runs/626750611
 !screenshot-1.png! 


> Failed UT: RejectedExecutionException: event executor terminated
> 
>
> Key: RATIS-912
> URL: https://issues.apache.org/jira/browse/RATIS-912
> Project: Ratis
>  Issue Type: Bug
>Reporter: runzhiwang
>Priority: Major
> Attachments: screenshot-1.png
>
>
> It looks like RATIS-910 generated a new failing UT. The previous commit did not generate this type of UT failure.
> https://github.com/apache/incubator-ratis/runs/625933249
> https://github.com/apache/incubator-ratis/runs/626750611
>  !screenshot-1.png! 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (RATIS-773) Fix checkstyle violations in ratis-server

2020-04-28 Thread Dinesh Chitlangia (Jira)


[ 
https://issues.apache.org/jira/browse/RATIS-773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17095051#comment-17095051
 ] 

Dinesh Chitlangia commented on RATIS-773:
-

Thanks [~ljain] for review/commit.

> Fix checkstyle violations in ratis-server
> -
>
> Key: RATIS-773
> URL: https://issues.apache.org/jira/browse/RATIS-773
> Project: Ratis
>  Issue Type: Sub-task
>  Components: server
>Reporter: Dinesh Chitlangia
>Assignee: Dinesh Chitlangia
>Priority: Major
> Fix For: 0.6.0
>
> Attachments: RATIS-773.001.patch, RATIS-773.002.patch
>
>
> Fix checkstyle violations in ratis-server module



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (RATIS-912) Failed UT: RejectedExecutionException: event executor terminated

2020-04-28 Thread runzhiwang (Jira)


[ 
https://issues.apache.org/jira/browse/RATIS-912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17095022#comment-17095022
 ] 

runzhiwang commented on RATIS-912:
--

[~ljain] [~msingh] Could you have a look at this ?

> Failed UT: RejectedExecutionException: event executor terminated
> 
>
> Key: RATIS-912
> URL: https://issues.apache.org/jira/browse/RATIS-912
> Project: Ratis
>  Issue Type: Bug
>Reporter: runzhiwang
>Priority: Major
> Attachments: screenshot-1.png
>
>
> It looks like RATIS-910 generated a new failing UT. The previous commit never generated this type of UT failure.
> https://github.com/apache/incubator-ratis/runs/625933249
> https://github.com/apache/incubator-ratis/runs/626750611
>  !screenshot-1.png! 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (RATIS-912) Failed UT: RejectedExecutionException: event executor terminated

2020-04-28 Thread runzhiwang (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

runzhiwang updated RATIS-912:
-
Attachment: screenshot-1.png

> Failed UT: RejectedExecutionException: event executor terminated
> 
>
> Key: RATIS-912
> URL: https://issues.apache.org/jira/browse/RATIS-912
> Project: Ratis
>  Issue Type: Bug
>Reporter: runzhiwang
>Priority: Major
> Attachments: screenshot-1.png
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (RATIS-912) Failed UT: RejectedExecutionException: event executor terminated

2020-04-28 Thread runzhiwang (Jira)
runzhiwang created RATIS-912:


 Summary: Failed UT: RejectedExecutionException: event executor 
terminated
 Key: RATIS-912
 URL: https://issues.apache.org/jira/browse/RATIS-912
 Project: Ratis
  Issue Type: Bug
Reporter: runzhiwang






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (RATIS-840) Memory leak of LogAppender

2020-04-28 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/RATIS-840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17095000#comment-17095000
 ] 

Hadoop QA commented on RATIS-840:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  1m 
45s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
1s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
10s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
41s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
14s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
25s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
57s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  
7s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
52s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 56m 28s{color} 
| {color:red} root in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
19s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 71m  9s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | ratis.logservice.TestLogServiceWithNetty |
|   | ratis.logservice.server.TestMetaServer |
|   | ratis.netty.TestRaftExceptionWithNetty |
|   | ratis.netty.TestLeaderElectionWithNetty |
|   | ratis.netty.TestGroupManagementWithNetty |
|   | ratis.grpc.TestRaftWithGrpc |
|   | ratis.grpc.TestRaftServerWithGrpc |
|   | ratis.netty.TestRaftReconfigurationWithNetty |
|   | ratis.server.simulation.TestRaftStateMachineExceptionWithSimulatedRpc |
|   | ratis.netty.TestRaftSnapshotWithNetty |
|   | ratis.grpc.TestRaftAsyncWithGrpc |
|   | ratis.grpc.TestRaftSnapshotWithGrpc |
|   | ratis.server.simulation.TestGroupManagementWithSimulatedRpc |
|   | ratis.server.simulation.TestRaftSnapshotWithSimulatedRpc |
|   | ratis.netty.TestRaftWithNetty |
|   | ratis.netty.TestGroupInfoWithNetty |
|   | ratis.examples.filestore.TestFileStoreWithGrpc |
|   | ratis.examples.filestore.TestFileStoreWithNetty |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.8 Server=19.03.8 Image:yetus/ratis:date2020-04-29 |
| JIRA Issue | RATIS-840 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/13001528/RATIS-840.004.patch |
| Optional Tests |  dupname  asflicense  javac  javadoc  unit  findbugs  
checkstyle  compile  |
| uname | Linux f3c345ee734b 4.15.0-74-generic #84-Ubuntu SMP Thu Dec 19 
08:06:28 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-RATIS-Build/yetus-personality.sh
 |
| git revision | master / cac3336 |
| maven | version: Apache Maven 3.6.3 
(cecedd343002696d0abb50b32b541b8a6ba2883f) |
| Default Java | 1.8.0_252 |
| unit | 
https://builds.apache.org/job/PreCommit-RATIS-Build/1309/artifact/out/patch-unit-root.txt
 |
|  Test Results | 

[jira] [Updated] (RATIS-845) Memory leak of RaftServerImpl

2020-04-28 Thread runzhiwang (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

runzhiwang updated RATIS-845:
-
Description: 
*What's the problem ?*
As the image shows, there are 1885 instances of RaftServerImpl; most of them are closed and should be GC'd, but they are not. You can see from the images that 1513 RaftServerImpl were held by ManagementFactory->jmxMBeanServer->HashMap, and 372 RaftServerImpl were held by Datanode ReportManager Thread -> prometheus -> HashMap. So 1513 RaftServerImpl leak in ratis, and 372 leak in ozone. If RaftServerImpl cannot be GC'd, a lot of related resources cannot be GC'd either, such as the 
[DirectByteBuffer|https://github.com/apache/incubator-ratis/blob/master/ratis-server/src/main/java/org/apache/ratis/server/raftlog/segmented/SegmentedRaftLogWorker.java#L150]
 in SegmentedRaftLogWorker, which results in a 1GB off-heap memory leak.
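The JMX-side leak is a classic register-without-unregister pattern: the platform MBeanServer holds a strong reference to every registered MBean in its internal map, so the server object behind it stays reachable until the MBean is unregistered. The sketch below shows the lifecycle fix with illustrative names (LeakDemo, JmxLifecycleSketch); it is not the actual Ratis JMX code.

```java
import java.lang.management.ManagementFactory;
import javax.management.JMException;
import javax.management.MBeanServer;
import javax.management.ObjectName;

// Sketch of the leak pattern and its fix (hypothetical names).
public class JmxLifecycleSketch {
    // Standard MBean pair: the interface name must be <ClassName>MBean.
    public interface LeakDemoMBean { int getValue(); }
    public static class LeakDemo implements LeakDemoMBean {
        public int getValue() { return 42; }
    }

    private final MBeanServer mbs = ManagementFactory.getPlatformMBeanServer();
    private ObjectName name;

    public void register(String id) throws JMException {
        name = new ObjectName("sketch:type=LeakDemo,id=" + id);
        mbs.registerMBean(new LeakDemo(), name); // MBeanServer now holds a strong reference
    }

    public boolean isRegistered() {
        return name != null && mbs.isRegistered(name);
    }

    // Must be called from the server's close path, or the wrapped
    // object (and everything it references) can never be GC'd.
    public void close() throws JMException {
        if (isRegistered()) {
            mbs.unregisterMBean(name);
        }
        name = null;
    }
}
```

The same reasoning applies to the prometheus HashMap on the ozone side: whatever map a closed RaftServerImpl was published into must drop its entry on close.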

h3. *{color:#DE350B}1.  1885 instances of RaftServerImpl {color}*
 !screenshot-4.png! 

h3. *{color:#DE350B}2. 1513 RaftServerImpl were held by ManagementFactory->jmxMBeanServer->HashMap, 372 RaftServerImpl were held by Datanode ReportManager Thread -> prometheus -> HashMap{color}*
 !screenshot-5.png! 

h3. *{color:#DE350B}3. 1513 RaftServerImpl were held by ManagementFactory->jmxMBeanServer->HashMap{color}*
 !screenshot-6.png! 

h3. *{color:#DE350B}4. 372 RaftServerImpl were held by Datanode ReportManager 
Thread -> prometheus -> HashMap{color}*
 !screenshot-7.png! 

h3. *{color:#DE350B}5. 2038 DirectByteBuffer, and 1885 held by 
RaftServerImpl.{color}*
 !screenshot-8.png! 
 !screenshot-9.png! 

h3. *{color:#DE350B}6. 1033 DirectByteBuffer were held by ManagementFactory, 802 DirectByteBuffer were held by Datanode ReportManager Thread, total 1885.{color}*
 !screenshot-10.png! 






  was:
*What's the problem ?*
As the image shows, there are 1885 instances of RaftServerImpl; most of them are closed and should be GC'd, but they are not. You can see from the images that 1513 RaftServerImpl were held by ManagementFactory->jmxMBeanServer->HashMap, and 372 RaftServerImpl were held by Datanode ReportManager Thread -> prometheus -> HashMap. If RaftServerImpl cannot be GC'd, a lot of related resources cannot be GC'd either, such as the 
[DirectByteBuffer|https://github.com/apache/incubator-ratis/blob/master/ratis-server/src/main/java/org/apache/ratis/server/raftlog/segmented/SegmentedRaftLogWorker.java#L150]
 in SegmentedRaftLogWorker, which results in a 1GB off-heap memory leak.

h3. *{color:#DE350B}1.  1885 instances of RaftServerImpl {color}*
 !screenshot-4.png! 

h3. *{color:#DE350B}2. 1513 RaftServerImpl were held by ManagementFactory->jmxMBeanServer->HashMap, 372 RaftServerImpl were held by Datanode ReportManager Thread -> prometheus -> HashMap{color}*
 !screenshot-5.png! 

h3. *{color:#DE350B}3. 1513 RaftServerImpl were held by ManagementFactory->jmxMBeanServer->HashMap{color}*
 !screenshot-6.png! 

h3. *{color:#DE350B}4. 372 RaftServerImpl were held by Datanode ReportManager 
Thread -> prometheus -> HashMap{color}*
 !screenshot-7.png! 

h3. *{color:#DE350B}5. 2038 DirectByteBuffer, and 1885 held by 
RaftServerImpl.{color}*
 !screenshot-8.png! 
 !screenshot-9.png! 

h3. *{color:#DE350B}6. 1033 DirectByteBuffer were held by ManagementFactory, 802 DirectByteBuffer were held by Datanode ReportManager Thread, total 1885.{color}*
 !screenshot-10.png! 







> Memory leak of RaftServerImpl
> -
>
> Key: RATIS-845
> URL: https://issues.apache.org/jira/browse/RATIS-845
> Project: Ratis
>  Issue Type: Bug
>Reporter: runzhiwang
>Assignee: runzhiwang
>Priority: Major
> Attachments: screenshot-10.png, screenshot-2.png, screenshot-3.png, 
> screenshot-4.png, screenshot-5.png, screenshot-6.png, screenshot-7.png, 
> screenshot-8.png, screenshot-9.png
>
>
> *What's the problem ?*
> As the image shows, there are 1885 instances of RaftServerImpl; most of them are closed and should be GC'd, but they are not. You can see from the images that 1513 RaftServerImpl were held by ManagementFactory->jmxMBeanServer->HashMap, and 372 RaftServerImpl were held by Datanode ReportManager Thread -> prometheus -> HashMap. So 1513 RaftServerImpl leak in ratis, and 372 leak in ozone. If RaftServerImpl cannot be GC'd, a lot of related resources cannot be GC'd either, such as the 
> [DirectByteBuffer|https://github.com/apache/incubator-ratis/blob/master/ratis-server/src/main/java/org/apache/ratis/server/raftlog/segmented/SegmentedRaftLogWorker.java#L150]
>  in SegmentedRaftLogWorker, which results in a 1GB off-heap memory leak.
> h3. *{color:#DE350B}1.  1885 instances of RaftServerImpl {color}*
>  !screenshot-4.png! 
> h3. *{color:#DE350B}2. 1513 RaftServerImpl were held by ManagementFactory->jmxMBeanServer->HashMap, 372 RaftServerImpl were held by 
> Datanode 

[jira] [Updated] (RATIS-840) Memory leak of LogAppender

2020-04-28 Thread runzhiwang (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

runzhiwang updated RATIS-840:
-
Attachment: RATIS-840.004.patch

> Memory leak of LogAppender
> --
>
> Key: RATIS-840
> URL: https://issues.apache.org/jira/browse/RATIS-840
> Project: Ratis
>  Issue Type: Bug
>  Components: server
>Reporter: runzhiwang
>Assignee: runzhiwang
>Priority: Blocker
> Attachments: RATIS-840.001.patch, RATIS-840.002.patch, 
> RATIS-840.003.patch, RATIS-840.004.patch, image-2020-04-06-14-27-28-485.png, 
> image-2020-04-06-14-27-39-582.png, screenshot-1.png, screenshot-2.png
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> *What's the problem ?*
>  After running hadoop-ozone for 4 days, the datanode leaked memory. When I dumped the heap, I found 460710 instances of GrpcLogAppender, but only 6 instances of SenderList, and each SenderList contained only 1-2 instances of GrpcLogAppender. There were also a lot of logs related to 
> [LeaderState::restartSender|https://github.com/apache/incubator-ratis/blob/master/ratis-server/src/main/java/org/apache/ratis/server/impl/LeaderState.java#L428].
>  {code:java}INFO impl.RaftServerImpl: 
> 1665f5ea-ab17-4a0e-af6d-6958efd322fa@group-F64B465F37B5-LeaderState: 
> Restarting GrpcLogAppender for 
> 1665f5ea-ab17-4a0e-af6d-6958efd322fa@group-F64B465F37B5-\u003e229cbcc1-a3b2-4383-9c0d-c0f4c28c3d4a{code}
>  
>  So there are a lot of GrpcLogAppender instances that did not stop their daemon thread when removed from senders. 
>  !image-2020-04-06-14-27-28-485.png! 
>  !image-2020-04-06-14-27-39-582.png! 
>  
> *Why is 
> [LeaderState::restartSender|https://github.com/apache/incubator-ratis/blob/master/ratis-server/src/main/java/org/apache/ratis/server/impl/LeaderState.java#L428]
>  called so many times ?*
> 1. As the image shows, when the group is removed, SegmentedRaftLog is closed, and GrpcLogAppender throws an exception when it finds that the SegmentedRaftLog was closed. GrpcLogAppender is then 
> [restarted|https://github.com/apache/incubator-ratis/blob/master/ratis-server/src/main/java/org/apache/ratis/server/impl/LogAppender.java#L94],
>  and the new GrpcLogAppender throws the same exception when it finds that the SegmentedRaftLog was closed, so it is restarted again, and so on. This results in an infinite restart of GrpcLogAppender.
> 2. Actually, when the group is removed, GrpcLogAppender is stopped: 
> RaftServerImpl::shutdown -> 
> [RoleInfo::shutdownLeaderState|https://github.com/apache/incubator-ratis/blob/master/ratis-server/src/main/java/org/apache/ratis/server/impl/RaftServerImpl.java#L266]
>  -> LeaderState::stop -> LogAppender::stopAppender, and then SegmentedRaftLog is closed: RaftServerImpl::shutdown -> 
> [ServerState::close|https://github.com/apache/incubator-ratis/blob/master/ratis-server/src/main/java/org/apache/ratis/server/impl/RaftServerImpl.java#L271]
>  ... . Though RoleInfo::shutdownLeaderState is called before ServerState::close, the GrpcLogAppender is stopped asynchronously. So the infinite restart of GrpcLogAppender happens whenever the GrpcLogAppender stops after the SegmentedRaftLog has closed.
>  !screenshot-1.png! 
> *Why did GrpcLogAppender not stop the daemon thread when removed from senders ?*
>  I found a lot of GrpcLogAppender threads blocked inside log4j. I think the GrpcLogAppenders restart too fast and then block in log4j.
>  !screenshot-2.png! 
> *Can the new GrpcLogAppender work normally ?*
> 1. Even without the above problem, the newly created GrpcLogAppender still cannot work normally. 
> 2. When a new GrpcLogAppender is created, a new FollowerInfo is also created: 
> LeaderState::addAndStartSenders -> 
> LeaderState::addSenders -> RaftServerImpl::newLogAppender -> [new 
> FollowerInfo|https://github.com/apache/incubator-ratis/blob/master/ratis-server/src/main/java/org/apache/ratis/server/impl/RaftServerImpl.java#L129]
> 3. When the newly created GrpcLogAppender appends an entry to the follower, the follower responds with SUCCESS.
> 4. Then LeaderState::updateCommit -> [LeaderState::getMajorityMin | 
> https://github.com/apache/incubator-ratis/blob/master/ratis-server/src/main/java/org/apache/ratis/server/impl/LeaderState.java#L599]
>  -> 
> [voterLists.get(0) | 
> https://github.com/apache/incubator-ratis/blob/master/ratis-server/src/main/java/org/apache/ratis/server/impl/LeaderState.java#L607].
>  {color:#DE350B}The error happens because voterLists.get(0) returns the FollowerInfo of the old GrpcLogAppender, not the FollowerInfo of the new GrpcLogAppender.{color}
> 5. The majority commit computed from the FollowerInfo of the old GrpcLogAppender never changes. So even though the follower has appended the entry successfully, the leader cannot update the commit index, and the newly created GrpcLogAppender cannot work normally.

[jira] [Commented] (RATIS-874) Fix AppendEntry validity checks to take the SnapshotIndex into account

2020-04-28 Thread Hanisha Koneru (Jira)


[ 
https://issues.apache.org/jira/browse/RATIS-874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17094928#comment-17094928
 ] 

Hanisha Koneru commented on RATIS-874:
--

[~ljain], [~msingh] can you please review when you get a chance. Thanks.

> Fix AppendEntry validity checks to take the SnapshotIndex into account
> --
>
> Key: RATIS-874
> URL: https://issues.apache.org/jira/browse/RATIS-874
> Project: Ratis
>  Issue Type: Bug
>Reporter: Hanisha Koneru
>Assignee: Hanisha Koneru
>Priority: Major
> Attachments: RATIS-874.001.patch, RATIS-874.002.patch
>
>
> This Jira aims to fix the following:
>  # Before sending an appendEntry request to a follower, the leader checks the validity of the request by verifying that the follower has the previous log entry. But if the follower had installed a snapshot, the previous entry could be missing and the appendEntry would still be valid. Hence, the SnapshotIndex should be factored in while checking the validity of the appendEntry request. The leader should store the Follower's SnapshotIndex for this.
>  # When a follower receives an appendEntry request, it checks the validity of the log entries - the first index of the log entries must be exactly 1 more than the last log index. During this check, the snapshotIndex should also be considered, i.e. the first index of the log entries can be 1 more than either the last log index or the snapshotIndex.
>  # After the Ratis server is restarted, it loads all the available log segments. But logs with end index < last snapshot index should not be loaded. There can be gaps in the log segments up to the snapshotIndex if a snapshot was installed from the leader node. 
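The corrected check in points 1 and 2 boils down to accepting a batch whose first index directly follows either the last log index or the latest installed snapshot index. A hedged sketch, with illustrative names rather than the actual Ratis API:

```java
// Hypothetical sketch of the snapshot-aware appendEntry validity check.
public class AppendEntryCheckSketch {
    static boolean isValidFirstIndex(long firstIndex,
                                     long lastLogIndex,
                                     long snapshotIndex) {
        // Valid if the batch continues the log, or continues directly
        // after an installed snapshot that covers the missing entries.
        return firstIndex == lastLogIndex + 1
            || firstIndex == snapshotIndex + 1;
    }
}
```

A follower that installed a snapshot up to index 20 with a stale log ending at 5 would, under the old check, wrongly reject a batch starting at 21; the snapshot-aware check accepts it.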



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (RATIS-873) Fix Install Snapshot Notification

2020-04-28 Thread Hanisha Koneru (Jira)


[ 
https://issues.apache.org/jira/browse/RATIS-873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17094929#comment-17094929
 ] 

Hanisha Koneru commented on RATIS-873:
--

[~ljain], [~msingh] can you please review when you get a chance. Thanks.

> Fix Install Snapshot Notification
> -
>
> Key: RATIS-873
> URL: https://issues.apache.org/jira/browse/RATIS-873
> Project: Ratis
>  Issue Type: Bug
>Reporter: Hanisha Koneru
>Assignee: Hanisha Koneru
>Priority: Major
> Attachments: RATIS-873.001.patch
>
>
> This Jira aims to address the following:
> # When Follower is in the process of installing snapshot and it gets an 
> append entry, it replies with result INCONSISTENCY and the follower next 
> index is updated to the snapshot index being installed. This should not 
> happen as the snapshot installation is still in progress. Follower's next 
> index on the leader should remain the same.
> # After InstallSnapshot is done, Leader should update the commitIndex of the 
> Follower to the installed snapshot index.
> # After InstallSnapshot, when reloading StateMachine, any previously open 
> LogSegment should be closed, if the last entry in the open log is already 
> included in the snapshot.
> # When Follower is notified to install snapshot through StateMachine, the 
> reply should indicate the same. It would help with debugging if the Install 
> Snapshot success reply and Install Snapshot notified replies are distinct.
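Points 1 and 2 above can be condensed into one rule for the leader: an in-progress installation must not move the follower's next index, and only a completed installation advances it past the snapshot. The sketch below uses hypothetical names (Result, nextIndexAfterReply), not the actual Ratis types:

```java
// Illustrative sketch of how the leader should react to snapshot replies.
public class SnapshotReplySketch {
    enum Result { IN_PROGRESS, SNAPSHOT_INSTALLED }

    static long nextIndexAfterReply(long currentNextIndex,
                                    Result result,
                                    long snapshotIndex) {
        switch (result) {
            case SNAPSHOT_INSTALLED:
                return snapshotIndex + 1;  // install finished: jump past the snapshot
            case IN_PROGRESS:
            default:
                return currentNextIndex;   // still installing: keep next index unchanged
        }
    }
}
```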



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (RATIS-874) Fix AppendEntry validity checks to take the SnapshotIndex into account

2020-04-28 Thread Hanisha Koneru (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hanisha Koneru updated RATIS-874:
-
Attachment: RATIS-874.002.patch

> Fix AppendEntry validity checks to take the SnapshotIndex into account
> --
>
> Key: RATIS-874
> URL: https://issues.apache.org/jira/browse/RATIS-874
> Project: Ratis
>  Issue Type: Bug
>Reporter: Hanisha Koneru
>Assignee: Hanisha Koneru
>Priority: Major
> Attachments: RATIS-874.001.patch, RATIS-874.002.patch
>
>
> This Jira aims to fix the following:
>  # Before sending an appendEntry request to a follower, the leader checks the validity of the request by verifying that the follower has the previous log entry. But if the follower had installed a snapshot, the previous entry could be missing and the appendEntry would still be valid. Hence, the SnapshotIndex should be factored in while checking the validity of the appendEntry request. The leader should store the Follower's SnapshotIndex for this.
>  # When a follower receives an appendEntry request, it checks the validity of the log entries - the first index of the log entries must be exactly 1 more than the last log index. During this check, the snapshotIndex should also be considered, i.e. the first index of the log entries can be 1 more than either the last log index or the snapshotIndex.
>  # After the Ratis server is restarted, it loads all the available log segments. But logs with end index < last snapshot index should not be loaded. There can be gaps in the log segments up to the snapshotIndex if a snapshot was installed from the leader node. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (RATIS-874) Fix AppendEntry validity checks to take the SnapshotIndex into account

2020-04-28 Thread Hanisha Koneru (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hanisha Koneru updated RATIS-874:
-
Description: 
This Jira aims to fix the following:
 # Before sending an appendEntry request to a follower, the leader checks the 
validity of the request by verifying that the follower has the previous log 
entry. But if the follower has installed a snapshot, the previous log entry 
could be missing while the appendEntry is still valid. Hence, the SnapshotIndex 
should be factored in when checking the validity of an appendEntry request. The 
leader should store the follower's SnapshotIndex for this.
 # When a follower receives an appendEntry request, it checks the validity of 
the log entries - the first index of the entries must be exactly 1 more than 
the last log index. During this check, the snapshotIndex should also be 
considered, i.e. the first index of the entries can be 1 more than either the 
last log index or the snapshotIndex.
 # After a Ratis server is restarted, it loads all the available log segments. 
But logs with end index < last snapshot index should not be loaded. There can 
be gaps in the log segments up to the snapshotIndex if a snapshot was installed 
from the leader node. 

  was:
This Jira aims to fix the following:
# Before sending an appendEntry request to follower, leader checks the validity 
of the request be verifying if the follower has the previous log entry. But if 
the follower had installed a snapshot, the previous could be missing and the 
appendEntry would still be valid. Hence, the SnapshotIndex should be factored 
in while checking the validity of appendEntry request. Leader should store 
Follower's SnapshotIndex for this.
# When follower receives appendEntry request, it checks the validity of log 
entry - the first index of the log entry is exactly 1 more than the last log 
index. During this check, the snapshotIndex should also be considered i.e. the 
first index of the log entry can be 1 more than the last log index or the 
snapshotIndex.



> Fix AppendEntry validity checks to take the SnapshotIndex into account
> --
>
> Key: RATIS-874
> URL: https://issues.apache.org/jira/browse/RATIS-874
> Project: Ratis
>  Issue Type: Bug
>Reporter: Hanisha Koneru
>Assignee: Hanisha Koneru
>Priority: Major
> Attachments: RATIS-874.001.patch
>
>



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (RATIS-840) Memory leak of LogAppender

2020-04-28 Thread runzhiwang (Jira)


[ 
https://issues.apache.org/jira/browse/RATIS-840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17094551#comment-17094551
 ] 

runzhiwang edited comment on RATIS-840 at 4/28/20, 3:16 PM:


[~elek] Please wait for me, I have to make sure the patch does not introduce 
new failed unit tests. There are currently about 30 failed unit tests in Ratis 
even without my patch, so verifying this will take some time.


was (Author: yjxxtd):
[~elek] Please wait for me, I have to verify that the failed Ratis unit tests 
have nothing to do with my patch. There are currently about 30 failed unit 
tests in Ratis even without my patch, so verifying this will take some time.

> Memory leak of LogAppender
> --
>
> Key: RATIS-840
> URL: https://issues.apache.org/jira/browse/RATIS-840
> Project: Ratis
>  Issue Type: Bug
>  Components: server
>Reporter: runzhiwang
>Assignee: runzhiwang
>Priority: Blocker
> Attachments: RATIS-840.001.patch, RATIS-840.002.patch, 
> RATIS-840.003.patch, image-2020-04-06-14-27-28-485.png, 
> image-2020-04-06-14-27-39-582.png, screenshot-1.png, screenshot-2.png
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>

[jira] [Commented] (RATIS-840) Memory leak of LogAppender

2020-04-28 Thread runzhiwang (Jira)


[ 
https://issues.apache.org/jira/browse/RATIS-840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17094551#comment-17094551
 ] 

runzhiwang commented on RATIS-840:
--

[~elek] Please wait for me, I have to verify that the failed Ratis unit tests 
have nothing to do with my patch. There are currently about 30 failed unit 
tests in Ratis even without my patch, so verifying this will take some time.


[jira] [Commented] (RATIS-840) Memory leak of LogAppender

2020-04-28 Thread Marton Elek (Jira)


[ 
https://issues.apache.org/jira/browse/RATIS-840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17094548#comment-17094548
 ] 

Marton Elek commented on RATIS-840:
---

[~szetszwo] Can you please help review it? Ozone test results are very noisy 
because of this issue.

> Memory leak of LogAppender
> --
>
> Key: RATIS-840
> URL: https://issues.apache.org/jira/browse/RATIS-840
> Project: Ratis
>  Issue Type: Bug
>  Components: server
>Reporter: runzhiwang
>Assignee: runzhiwang
>Priority: Blocker
> Attachments: RATIS-840.001.patch, RATIS-840.002.patch, 
> RATIS-840.003.patch, image-2020-04-06-14-27-28-485.png, 
> image-2020-04-06-14-27-39-582.png, screenshot-1.png, screenshot-2.png
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> *What's the problem?*
>  After running hadoop-ozone for 4 days, the datanode leaked memory. When I 
> dumped the heap, I found 460710 instances of GrpcLogAppender, but only 6 
> instances of SenderList, and each SenderList contains only 1-2 instances of 
> GrpcLogAppender. There are also a lot of logs related to 
> [LeaderState::restartSender|https://github.com/apache/incubator-ratis/blob/master/ratis-server/src/main/java/org/apache/ratis/server/impl/LeaderState.java#L428]:
>  {code:java}INFO impl.RaftServerImpl: 
> 1665f5ea-ab17-4a0e-af6d-6958efd322fa@group-F64B465F37B5-LeaderState: 
> Restarting GrpcLogAppender for 
> 1665f5ea-ab17-4a0e-af6d-6958efd322fa@group-F64B465F37B5-\u003e229cbcc1-a3b2-4383-9c0d-c0f4c28c3d4a\n","stream":"stderr","time":"2020-04-06T03:59:53.37892512Z"}{code}
>  So many GrpcLogAppender instances did not stop their daemon threads when 
> removed from the senders. 
>  !image-2020-04-06-14-27-28-485.png! 
>  !image-2020-04-06-14-27-39-582.png! 
>  
> *Why is 
> [LeaderState::restartSender|https://github.com/apache/incubator-ratis/blob/master/ratis-server/src/main/java/org/apache/ratis/server/impl/LeaderState.java#L428]
>  called so many times?*
> 1. As the image shows, when a group is removed, SegmentedRaftLog is closed, 
> and GrpcLogAppender throws an exception when it finds that the 
> SegmentedRaftLog was closed. GrpcLogAppender is then 
> [restarted|https://github.com/apache/incubator-ratis/blob/master/ratis-server/src/main/java/org/apache/ratis/server/impl/LogAppender.java#L94],
>  the new GrpcLogAppender throws the same exception, is restarted again, and 
> so on. This results in an infinite restart loop of GrpcLogAppender.
> 2. Actually, when the group is removed, GrpcLogAppender is stopped: 
> RaftServerImpl::shutdown -> 
> [RoleInfo::shutdownLeaderState|https://github.com/apache/incubator-ratis/blob/master/ratis-server/src/main/java/org/apache/ratis/server/impl/RaftServerImpl.java#L266]
>  -> LeaderState::stop -> LogAppender::stopAppender; then SegmentedRaftLog is 
> closed: RaftServerImpl::shutdown -> 
> [ServerState:close|https://github.com/apache/incubator-ratis/blob/master/ratis-server/src/main/java/org/apache/ratis/server/impl/RaftServerImpl.java#L271]
>  ... . Although RoleInfo::shutdownLeaderState is called before 
> ServerState:close, GrpcLogAppender is stopped asynchronously. So the infinite 
> restart of GrpcLogAppender happens whenever GrpcLogAppender stops after 
> SegmentedRaftLog is closed.
>  !screenshot-1.png! 
> *Why did GrpcLogAppender not stop its daemon thread when removed from 
> senders?*
>  I found a lot of GrpcLogAppender instances blocked inside log4j. I think 
> GrpcLogAppender restarts too fast and then blocks in log4j.
>  !screenshot-2.png! 
> *Can the new GrpcLogAppender work normally?*
> 1. Even without the above problem, the newly created GrpcLogAppender still 
> cannot work normally. 
> 2. When a new GrpcLogAppender is created, a new FollowerInfo is also created: 
> LeaderState::addAndStartSenders -> 
> LeaderState::addSenders -> RaftServerImpl::newLogAppender -> [new 
> FollowerInfo|https://github.com/apache/incubator-ratis/blob/master/ratis-server/src/main/java/org/apache/ratis/server/impl/RaftServerImpl.java#L129]
> 3. When the newly created GrpcLogAppender appends an entry to the follower, 
> the follower responds SUCCESS.
> 4. Then LeaderState::updateCommit -> [LeaderState::getMajorityMin | 
> https://github.com/apache/incubator-ratis/blob/master/ratis-server/src/main/java/org/apache/ratis/server/impl/LeaderState.java#L599]
>  -> 
> [voterLists.get(0) | 
> https://github.com/apache/incubator-ratis/blob/master/ratis-server/src/main/java/org/apache/ratis/server/impl/LeaderState.java#L607].
>  {color:#DE350B}The error happens because voterLists.get(0) returns the 
> FollowerInfo of the old GrpcLogAppender, not the FollowerInfo of the new 
> GrpcLogAppender.{color}
> 5. Because the majority commit is computed from the FollowerInfo of the old 
> GrpcLogAppender, it never changes. So even though the follower has append 
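The stale-reference failure in point 4 can be illustrated with a few lines of plain Java. This is an illustrative sketch only (FollowerInfo here is a stand-in, not Ratis's class): a voter list that cached the object created with the old appender keeps reporting the old matchIndex even after a restarted appender replaces it.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch of the stale-reference bug: a cached list element keeps pointing at
// the old FollowerInfo after the appender is restarted, so a majority commit
// computed from it never advances. Names are illustrative, not Ratis's.
public class StaleFollowerInfoSketch {
  static class FollowerInfo {
    volatile long matchIndex;
  }

  /** Returns {indexSeenViaCachedList, indexSeenViaCurrentLookup}. */
  static long[] demo() {
    Map<String, FollowerInfo> senders = new HashMap<>();
    senders.put("follower-1", new FollowerInfo());

    // The voter list caches the FollowerInfo object at creation time.
    List<FollowerInfo> voterList = new ArrayList<>();
    voterList.add(senders.get("follower-1"));

    // The appender is restarted: a new FollowerInfo replaces the old one,
    // but voterList still holds the old object.
    FollowerInfo fresh = new FollowerInfo();
    senders.put("follower-1", fresh);
    fresh.matchIndex = 42;  // the follower acknowledges appended entries

    return new long[] {
        voterList.get(0).matchIndex,             // stale: still 0
        senders.get("follower-1").matchIndex     // current: 42
    };
  }

  public static void main(String[] args) {
    long[] seen = demo();
    System.out.println("cached=" + seen[0] + " current=" + seen[1]);
  }
}
```

Either the voter list must be rebuilt when an appender is restarted, or the majority computation must look up the current FollowerInfo instead of a cached reference.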

[jira] [Resolved] (RATIS-855) Release Ratis 0.4.0 Thirdparty

2020-04-28 Thread Mukul Kumar Singh (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mukul Kumar Singh resolved RATIS-855.
-
Fix Version/s: 0.6.0
   Resolution: Fixed

Ratis thirdparty artifacts have been released; the binaries can be found at 
https://dist.apache.org/repos/dist/release/incubator/ratis/thirdparty/. 
Resolving this as Fixed.

> Release Ratis 0.4.0 Thirdparty
> --
>
> Key: RATIS-855
> URL: https://issues.apache.org/jira/browse/RATIS-855
> Project: Ratis
>  Issue Type: Bug
>  Components: thirdparty
>Reporter: Mukul Kumar Singh
>Assignee: Mukul Kumar Singh
>Priority: Major
> Fix For: 0.6.0
>
>
> With RATIS-852 and RATIS-847 resolved, release the 0.4.0 thirdparty artifacts.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (RATIS-910) Update Ratis-thirdparty to 0.4.0

2020-04-28 Thread Mukul Kumar Singh (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mukul Kumar Singh resolved RATIS-910.
-
Fix Version/s: 0.6.0
   Resolution: Fixed

The PR has been merged. Thanks [~ljain] for the review and merge.

> Update Ratis-thirdparty to 0.4.0
> 
>
> Key: RATIS-910
> URL: https://issues.apache.org/jira/browse/RATIS-910
> Project: Ratis
>  Issue Type: Bug
>  Components: thirdparty
>Reporter: Mukul Kumar Singh
>Assignee: Mukul Kumar Singh
>Priority: Major
> Fix For: 0.6.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> This jira tracks the update of Ratis thirdparty to 0.4.0 after its release.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (RATIS-911) Failed UT: testRestartLogAppender

2020-04-28 Thread Shashikant Banerjee (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shashikant Banerjee updated RATIS-911:
--
Parent: RATIS-863
Issue Type: Sub-task  (was: Bug)

> Failed UT: testRestartLogAppender
> -
>
> Key: RATIS-911
> URL: https://issues.apache.org/jira/browse/RATIS-911
> Project: Ratis
>  Issue Type: Sub-task
>Reporter: runzhiwang
>Assignee: runzhiwang
>Priority: Major
> Attachments: screenshot-1.png
>
>
> A leader cannot be elected for a long time.
>  !screenshot-1.png! 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (RATIS-911) Failed UT: testRestartLogAppender

2020-04-28 Thread runzhiwang (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

runzhiwang updated RATIS-911:
-
Summary: Failed UT: testRestartLogAppender  (was: Failed UT: 
runTestRestartLogAppender)

> Failed UT: testRestartLogAppender
> -
>
> Key: RATIS-911
> URL: https://issues.apache.org/jira/browse/RATIS-911
> Project: Ratis
>  Issue Type: Bug
>Reporter: runzhiwang
>Assignee: runzhiwang
>Priority: Major
> Attachments: screenshot-1.png
>
>
> A leader cannot be elected for a long time.
>  !screenshot-1.png! 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (RATIS-911) Failed UT: runTestRestartLogAppender

2020-04-28 Thread runzhiwang (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

runzhiwang updated RATIS-911:
-
Attachment: screenshot-1.png

> Failed UT: runTestRestartLogAppender
> 
>
> Key: RATIS-911
> URL: https://issues.apache.org/jira/browse/RATIS-911
> Project: Ratis
>  Issue Type: Bug
>Reporter: runzhiwang
>Assignee: runzhiwang
>Priority: Major
> Attachments: screenshot-1.png
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (RATIS-911) Failed UT: runTestRestartLogAppender

2020-04-28 Thread runzhiwang (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

runzhiwang updated RATIS-911:
-
Description: 
A leader cannot be elected for a long time.
 !screenshot-1.png! 

> Failed UT: runTestRestartLogAppender
> 
>
> Key: RATIS-911
> URL: https://issues.apache.org/jira/browse/RATIS-911
> Project: Ratis
>  Issue Type: Bug
>Reporter: runzhiwang
>Assignee: runzhiwang
>Priority: Major
> Attachments: screenshot-1.png
>
>
> A leader cannot be elected for a long time.
>  !screenshot-1.png! 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (RATIS-911) Failed UT: runTestRestartLogAppender

2020-04-28 Thread runzhiwang (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

runzhiwang reassigned RATIS-911:


Assignee: runzhiwang

> Failed UT: runTestRestartLogAppender
> 
>
> Key: RATIS-911
> URL: https://issues.apache.org/jira/browse/RATIS-911
> Project: Ratis
>  Issue Type: Bug
>Reporter: runzhiwang
>Assignee: runzhiwang
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (RATIS-911) Failed UT: runTestRestartLogAppender

2020-04-28 Thread runzhiwang (Jira)
runzhiwang created RATIS-911:


 Summary: Failed UT: runTestRestartLogAppender
 Key: RATIS-911
 URL: https://issues.apache.org/jira/browse/RATIS-911
 Project: Ratis
  Issue Type: Bug
Reporter: runzhiwang






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (RATIS-910) Update Ratis-thirdparty to 0.4.0

2020-04-28 Thread Mukul Kumar Singh (Jira)
Mukul Kumar Singh created RATIS-910:
---

 Summary: Update Ratis-thirdparty to 0.4.0
 Key: RATIS-910
 URL: https://issues.apache.org/jira/browse/RATIS-910
 Project: Ratis
  Issue Type: Bug
  Components: thirdparty
Reporter: Mukul Kumar Singh
Assignee: Mukul Kumar Singh


This jira tracks the update of Ratis thirdparty to 0.4.0 after its release.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (RATIS-773) Fix checkstyle violations in ratis-server

2020-04-28 Thread Lokesh Jain (Jira)


[ 
https://issues.apache.org/jira/browse/RATIS-773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17094238#comment-17094238
 ] 

Lokesh Jain commented on RATIS-773:
---

[~dineshchitlangia] Thanks for the contribution! I have committed the patch to 
the master branch.

> Fix checkstyle violations in ratis-server
> -
>
> Key: RATIS-773
> URL: https://issues.apache.org/jira/browse/RATIS-773
> Project: Ratis
>  Issue Type: Sub-task
>  Components: server
>Reporter: Dinesh Chitlangia
>Assignee: Dinesh Chitlangia
>Priority: Major
> Attachments: RATIS-773.001.patch, RATIS-773.002.patch
>
>
> Fix checkstyle violations in ratis-server module



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (RATIS-773) Fix checkstyle violations in ratis-server

2020-04-28 Thread Lokesh Jain (Jira)


[ 
https://issues.apache.org/jira/browse/RATIS-773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17094230#comment-17094230
 ] 

Lokesh Jain commented on RATIS-773:
---

[~dineshchitlangia] I have created RATIS-909, which tracks the issue with the 
unit test: it caused the leader election to never succeed. I am +1 on this 
patch and will commit it shortly.

> Fix checkstyle violations in ratis-server
> -
>
> Key: RATIS-773
> URL: https://issues.apache.org/jira/browse/RATIS-773
> Project: Ratis
>  Issue Type: Sub-task
>  Components: server
>Reporter: Dinesh Chitlangia
>Assignee: Dinesh Chitlangia
>Priority: Major
> Attachments: RATIS-773.001.patch, RATIS-773.002.patch
>
>
> Fix checkstyle violations in ratis-server module



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (RATIS-909) Candidate does not transition to follower on leader election timeout

2020-04-28 Thread Lokesh Jain (Jira)
Lokesh Jain created RATIS-909:
-

 Summary: Candidate does not transition to follower on leader 
election timeout
 Key: RATIS-909
 URL: https://issues.apache.org/jira/browse/RATIS-909
 Project: Ratis
  Issue Type: Bug
Reporter: Lokesh Jain
Assignee: Lokesh Jain


When a leader election times out, the candidate should transition back to 
follower so that it can restart the leader election process. A new leader 
election is started only when a server transitions from follower to candidate; 
if the server remains in the candidate state, no new leader election will be 
started.

In TestStateMachineShutdownWithGrpc, it was seen that all the servers hit the 
leader election timeout and remained in CANDIDATE state. Therefore a new 
leader was never elected.
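The transition described above can be sketched as a tiny state function. This is a hedged illustration under assumed names (Role, onElectionTimeout), not Ratis's actual server state machine:

```java
// Sketch of the fix idea: on election timeout a CANDIDATE falls back to
// FOLLOWER, so a later follower->candidate transition can start a fresh
// election. Without it the server would stay CANDIDATE forever and no new
// election would ever begin. Names are illustrative only.
public class ElectionTimeoutSketch {
  enum Role { FOLLOWER, CANDIDATE, LEADER }

  static Role onElectionTimeout(Role current) {
    // Only a timed-out candidate steps down; other roles are unaffected.
    return current == Role.CANDIDATE ? Role.FOLLOWER : current;
  }

  public static void main(String[] args) {
    System.out.println(onElectionTimeout(Role.CANDIDATE));
  }
}
```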



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (RATIS-865) Fix TestRaftWithGrpc#testStateMachineMetrics

2020-04-28 Thread runzhiwang (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

runzhiwang reassigned RATIS-865:


Assignee: runzhiwang

> Fix TestRaftWithGrpc#testStateMachineMetrics
> 
>
> Key: RATIS-865
> URL: https://issues.apache.org/jira/browse/RATIS-865
> Project: Ratis
>  Issue Type: Sub-task
>Reporter: Shashikant Banerjee
>Assignee: runzhiwang
>Priority: Major
> Fix For: 0.6.0
>
>
> The failure was observed here:
> [https://builds.apache.org/job/PreCommit-RATIS-Build/1299/testReport/org.apache.ratis.grpc/TestRaftWithGrpc/testStateMachineMetrics/]
> {code:java}
> org.apache.ratis.MiniRaftCluster$Factory$Get.runWithNewCluster(MiniRaftCluster.java:113)
> java.lang.AssertionError
>   at org.junit.Assert.fail(Assert.java:86)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at org.junit.Assert.assertTrue(Assert.java:52)
>   at 
> org.apache.ratis.RaftBasicTests.checkFollowerCommitLagsLeader(RaftBasicTests.java:494)
>   at 
> org.apache.ratis.RaftBasicTests.testStateMachineMetrics(RaftBasicTests.java:469)
>   at 
> org.apache.ratis.grpc.TestRaftWithGrpc.lambda$testStateMachineMetrics$1(TestRaftWithGrpc.java:65)
>   at 
> org.apache.ratis.MiniRaftCluster$Factory$Get.runWithNewCluster(MiniRaftCluster.java:125)
>   at 
> org.apache.ratis.MiniRaftCluster$Factory$Get.runWithNewCluster(MiniRaftCluster.java:113)
>   at 
> org.apache.ratis.grpc.TestRaftWithGrpc.testStateMachineMetrics(TestRaftWithGrpc.java:64)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at java.lang.Thread.run(Thread.java:748)
> 2ND INSTANCE
> ---
> org.apache.ratis.MiniRaftCluster$Factory$Get.runWithNewCluster(MiniRaftCluster.java:113)
> java.util.NoSuchElementException
>   at java.util.TreeMap.key(TreeMap.java:1327)
>   at java.util.TreeMap.firstKey(TreeMap.java:290)
>   at 
> java.util.Collections$UnmodifiableSortedMap.firstKey(Collections.java:1808)
>   at 
> org.apache.ratis.server.impl.RaftServerMetrics.getPeerCommitIndexGauge(RaftServerMetrics.java:159)
>   at 
> org.apache.ratis.RaftBasicTests.checkFollowerCommitLagsLeader(RaftBasicTests.java:487)
>   at 
> org.apache.ratis.RaftBasicTests.testStateMachineMetrics(RaftBasicTests.java:458)
>   at 
> org.apache.ratis.grpc.TestRaftWithGrpc.lambda$testStateMachineMetrics$1(TestRaftWithGrpc.java:65)
>   at 
> org.apache.ratis.MiniRaftCluster$Factory$Get.runWithNewCluster(MiniRaftCluster.java:125)
>   at 
> org.apache.ratis.MiniRaftCluster$Factory$Get.runWithNewCluster(MiniRaftCluster.java:113)
>   at 
> org.apache.ratis.grpc.TestRaftWithGrpc.testStateMachineMetrics(TestRaftWithGrpc.java:64)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
>   at 

[jira] [Updated] (RATIS-908) Failed UT: testOldLeaderCommit

2020-04-28 Thread runzhiwang (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

runzhiwang updated RATIS-908:
-
Attachment: screenshot-1.png

> Failed UT: testOldLeaderCommit
> --
>
> Key: RATIS-908
> URL: https://issues.apache.org/jira/browse/RATIS-908
> Project: Ratis
>  Issue Type: Sub-task
>Reporter: runzhiwang
>Assignee: runzhiwang
>Priority: Major
> Attachments: screenshot-1.png
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (RATIS-908) Failed UT: testOldLeaderCommit

2020-04-28 Thread runzhiwang (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

runzhiwang updated RATIS-908:
-
Description:  !screenshot-1.png! 

> Failed UT: testOldLeaderCommit
> --
>
> Key: RATIS-908
> URL: https://issues.apache.org/jira/browse/RATIS-908
> Project: Ratis
>  Issue Type: Sub-task
>Reporter: runzhiwang
>Assignee: runzhiwang
>Priority: Major
> Attachments: screenshot-1.png
>
>
>  !screenshot-1.png! 




[jira] [Created] (RATIS-908) Failed UT: testOldLeaderCommit

2020-04-28 Thread runzhiwang (Jira)
runzhiwang created RATIS-908:


 Summary: Failed UT: testOldLeaderCommit
 Key: RATIS-908
 URL: https://issues.apache.org/jira/browse/RATIS-908
 Project: Ratis
  Issue Type: Sub-task
Reporter: runzhiwang
Assignee: runzhiwang







[jira] [Updated] (RATIS-907) Failed UT: RaftBasicTests.runTestBasicAppendEntries

2020-04-28 Thread runzhiwang (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

runzhiwang updated RATIS-907:
-
Description:  !screenshot-1.png! 

> Failed UT: RaftBasicTests.runTestBasicAppendEntries
> ---
>
> Key: RATIS-907
> URL: https://issues.apache.org/jira/browse/RATIS-907
> Project: Ratis
>  Issue Type: Sub-task
>Reporter: runzhiwang
>Assignee: runzhiwang
>Priority: Major
> Attachments: screenshot-1.png
>
>
>  !screenshot-1.png! 




[jira] [Updated] (RATIS-907) Failed UT: RaftBasicTests.runTestBasicAppendEntries

2020-04-28 Thread runzhiwang (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

runzhiwang updated RATIS-907:
-
Attachment: screenshot-1.png

> Failed UT: RaftBasicTests.runTestBasicAppendEntries
> ---
>
> Key: RATIS-907
> URL: https://issues.apache.org/jira/browse/RATIS-907
> Project: Ratis
>  Issue Type: Sub-task
>Reporter: runzhiwang
>Assignee: runzhiwang
>Priority: Major
> Attachments: screenshot-1.png
>
>





[jira] [Created] (RATIS-907) Failed UT: RaftBasicTests.runTestBasicAppendEntries

2020-04-28 Thread runzhiwang (Jira)
runzhiwang created RATIS-907:


 Summary: Failed UT: RaftBasicTests.runTestBasicAppendEntries
 Key: RATIS-907
 URL: https://issues.apache.org/jira/browse/RATIS-907
 Project: Ratis
  Issue Type: Sub-task
Reporter: runzhiwang
Assignee: runzhiwang







[jira] [Updated] (RATIS-906) Failed UT: testBasicAppendEntriesKillLeader

2020-04-28 Thread runzhiwang (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

runzhiwang updated RATIS-906:
-
Attachment: screenshot-1.png

> Failed UT: testBasicAppendEntriesKillLeader
> ---
>
> Key: RATIS-906
> URL: https://issues.apache.org/jira/browse/RATIS-906
> Project: Ratis
>  Issue Type: Sub-task
>Reporter: runzhiwang
>Assignee: runzhiwang
>Priority: Major
> Attachments: screenshot-1.png
>
>





[jira] [Updated] (RATIS-906) Failed UT: testBasicAppendEntriesKillLeader

2020-04-28 Thread runzhiwang (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

runzhiwang updated RATIS-906:
-
Description:  !screenshot-1.png! 

> Failed UT: testBasicAppendEntriesKillLeader
> ---
>
> Key: RATIS-906
> URL: https://issues.apache.org/jira/browse/RATIS-906
> Project: Ratis
>  Issue Type: Sub-task
>Reporter: runzhiwang
>Assignee: runzhiwang
>Priority: Major
> Attachments: screenshot-1.png
>
>
>  !screenshot-1.png! 




[jira] [Created] (RATIS-906) Failed UT: testBasicAppendEntriesKillLeader

2020-04-28 Thread runzhiwang (Jira)
runzhiwang created RATIS-906:


 Summary: Failed UT: testBasicAppendEntriesKillLeader
 Key: RATIS-906
 URL: https://issues.apache.org/jira/browse/RATIS-906
 Project: Ratis
  Issue Type: Sub-task
Reporter: runzhiwang
Assignee: runzhiwang





