[jira] [Updated] (RATIS-813) Add streamAsync(..)
[ https://issues.apache.org/jira/browse/RATIS-813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tsz-wo Sze updated RATIS-813:
-----------------------------
    Component/s: server
                 client

> Add streamAsync(..)
> -------------------
>
>                 Key: RATIS-813
>                 URL: https://issues.apache.org/jira/browse/RATIS-813
>             Project: Ratis
>          Issue Type: New Feature
>          Components: client, server
>            Reporter: Tsz-wo Sze
>            Assignee: Tsz-wo Sze
>            Priority: Major
>
> This is a followup of RATIS-759.  Will add streamAsync(..) here.
> {code}
>   /** Send the given message using a stream. */
>   CompletableFuture streamAsync(Message message);
> {code}

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
[jira] [Comment Edited] (RATIS-759) Support stream APIs to send large messages
[ https://issues.apache.org/jira/browse/RATIS-759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17032710#comment-17032710 ]

Tsz-wo Sze edited comment on RATIS-759 at 2/7/20 11:17 PM:
-----------------------------------------------------------

Filed RATIS-813 to add to the other method.

was (Author: szetszwo):
Filed RATIS-813 to add to other method.

> Support stream APIs to send large messages
> ------------------------------------------
>
>                 Key: RATIS-759
>                 URL: https://issues.apache.org/jira/browse/RATIS-759
>             Project: Ratis
>          Issue Type: New Feature
>          Components: client, server
>            Reporter: Tsz-wo Sze
>            Assignee: Tsz-wo Sze
>            Priority: Major
>             Fix For: 0.5.0
>
>         Attachments: r759_20200115.patch, r759_20200123.patch, r759_20200204.patch, r759_20200206.patch
>
>
> It is inefficient to send a large message using send(Message)/sendAsync(Message) in RaftClient.  We already have RaftOutputStream implemented with sendAsync(..).  We propose adding the following new APIs
> {code}
>   /** Create a stream to send a large message. */
>   MessageOutputStream stream();
>   /** Send the given message using a stream. */
>   CompletableFuture streamAsync(Message message);
> {code}
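The idea behind the proposal above, sending one large message as a sequence of smaller asynchronous sends, can be sketched roughly as follows. This is only an illustration under stated assumptions: ChunkSender, ChunkedStreamSketch and the chunk size are hypothetical names made up for this example, they are not Ratis classes, and the actual patches attached to RATIS-759 may work quite differently.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;

/** Hypothetical transport standing in for a sendAsync(..)-style call; not a Ratis API. */
interface ChunkSender {
  CompletableFuture<Boolean> sendAsync(byte[] chunk, boolean endOfMessage);
}

/** Hypothetical helper that splits one large message into fixed-size chunks. */
final class ChunkedStreamSketch {
  private final ChunkSender sender;
  private final int chunkSize;

  ChunkedStreamSketch(ChunkSender sender, int chunkSize) {
    if (chunkSize <= 0) {
      throw new IllegalArgumentException("chunkSize must be positive");
    }
    this.sender = sender;
    this.chunkSize = chunkSize;
  }

  /**
   * Issues one send per chunk, in order, without waiting for earlier sends.
   * The returned future completes with true only if every send reported true.
   */
  CompletableFuture<Boolean> streamAsync(byte[] message) {
    CompletableFuture<Boolean> result = CompletableFuture.completedFuture(true);
    int offset = 0;
    do {
      final int end = Math.min(offset + chunkSize, message.length);
      final byte[] chunk = Arrays.copyOfRange(message, offset, end);
      final boolean last = end == message.length;
      // Combine the outcome of this chunk with the outcome of all earlier chunks.
      result = result.thenCombine(sender.sendAsync(chunk, last), (a, b) -> a && b);
      offset = end;
    } while (offset < message.length); // an empty message still sends one final, empty chunk
    return result;
  }
}

final class ChunkedStreamDemo {
  public static void main(String[] args) {
    final List<byte[]> received = new ArrayList<>();
    // In-memory sender that records the chunks and always reports success.
    final ChunkSender sender = (chunk, last) -> {
      received.add(chunk);
      return CompletableFuture.completedFuture(true);
    };
    final byte[] message = "0123456789".getBytes(StandardCharsets.UTF_8);
    final boolean ok = new ChunkedStreamSketch(sender, 4).streamAsync(message).join();
    System.out.println("success=" + ok + ", chunks=" + received.size());
  }
}
```

The sketch sends eagerly and only combines the results at the end; a real client would presumably also need ordering guarantees, flow control and failure handling, none of which are modelled here.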
[jira] [Commented] (RATIS-759) Support stream APIs to send large messages
[ https://issues.apache.org/jira/browse/RATIS-759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17032710#comment-17032710 ]

Tsz-wo Sze commented on RATIS-759:
----------------------------------

Filed RATIS-813 to add to other method.
[jira] [Created] (RATIS-813) Add streamAsync(..)
Tsz-wo Sze created RATIS-813:
--------------------------------

             Summary: Add streamAsync(..)
                 Key: RATIS-813
                 URL: https://issues.apache.org/jira/browse/RATIS-813
             Project: Ratis
          Issue Type: New Feature
            Reporter: Tsz-wo Sze
            Assignee: Tsz-wo Sze

This is a followup of RATIS-759.  Will add streamAsync(..) here.
{code}
  /** Send the given message using a stream. */
  CompletableFuture streamAsync(Message message);
{code}
[jira] [Commented] (RATIS-804) Race condition between cache evict and load in LogSegment
[ https://issues.apache.org/jira/browse/RATIS-804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17032519#comment-17032519 ]

Marton Elek commented on RATIS-804:
-----------------------------------

+1

I tested and couldn't reproduce the Exception any more.

(In fact, it's very hard to reproduce with a properly configured client. I used a specific client which doesn't close the GRPC requests. With the fixed client, it's very hard to see the Exception during real tests...)

> Race condition between cache evict and load in LogSegment
> ---------------------------------------------------------
>
>                 Key: RATIS-804
>                 URL: https://issues.apache.org/jira/browse/RATIS-804
>             Project: Ratis
>          Issue Type: Bug
>          Components: server
>            Reporter: Marton Elek
>            Assignee: Tsz-wo Sze
>            Priority: Critical
>         Attachments: r804_20200205.patch
>
>
> I am doing some stress testing with Ozone. I start one Datanode in FOLLOWER mode and the load generator (Freon) behaves like a LEADER.
> I am sending a huge number of AppendLogEntries to the FOLLOWER without inhibitions.
> As a result I got an NPE:
> {code:java}
> 2020-01-28 15:08:20 ERROR StateMachineUpdater:184 - 3fda0c39-ce3c-4540-a804-44d9ac1f4853@group-E1B13B4CA5C0-StateMachineUpdater: the StateMachineUpdater hits Throwable
> org.apache.ratis.server.raftlog.RaftLogIOException: java.lang.NullPointerException
>         at org.apache.ratis.server.raftlog.segmented.LogSegment.loadCache(LogSegment.java:320)
>         at org.apache.ratis.server.raftlog.segmented.SegmentedRaftLog.get(SegmentedRaftLog.java:293)
>         at org.apache.ratis.server.impl.StateMachineUpdater.applyLog(StateMachineUpdater.java:218)
>         at org.apache.ratis.server.impl.StateMachineUpdater.run(StateMachineUpdater.java:167)
>         at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.NullPointerException
>         at java.util.Objects.requireNonNull(Objects.java:203)
>         at org.apache.ratis.server.raftlog.segmented.LogSegment$LogEntryLoader.load(LogSegment.java:214)
>         at org.apache.ratis.server.raftlog.segmented.LogSegment.loadCache(LogSegment.java:318)
>         ... 4 more
> {code}
> It seems to be a race condition between LogSegment.evictCache() and LogSegment.loadCache().
> # StateMachineUpdater tries to update the StateMachine with the next log entry
> # It can't be found in the cache, therefore LogSegment.loadCache() is called
> # LogSegment.LogEntryLoader.load() reads the segment files from the disk
> # After loading, it returns with the loaded entry
> If the GRPC thread evicts the cache between steps 3 and 4 (this is possible because the log segment may already be flushed and can therefore be evicted), an NPE will be thrown.
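The four steps in the description above can be reproduced in a small, deterministic sketch. Everything here is an assumption made for illustration: SegmentCacheSketch, its fields and the betweenLoadAndReturn hook are hypothetical and are not the Ratis LogSegment code, and loadSafe shows one common way to avoid this kind of race (return the locally loaded value instead of re-reading the shared cache), which is not necessarily what r804_20200205.patch does. The hook simulates the eviction between steps 3 and 4 on a single thread so that the outcome does not depend on timing.

```java
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;
import java.util.function.LongFunction;

/** Hypothetical stand-in for a log segment with an evictable entry cache; not Ratis code. */
final class SegmentCacheSketch {
  private final Map<Long, String> cache = new ConcurrentHashMap<>();
  /** Hypothetical disk reader: returns the entry stored at the given log index. */
  private final LongFunction<String> disk;
  /** Hook that runs after the cache is populated and before the entry is returned. */
  private final Consumer<SegmentCacheSketch> betweenLoadAndReturn;

  SegmentCacheSketch(LongFunction<String> disk, Consumer<SegmentCacheSketch> betweenLoadAndReturn) {
    this.disk = disk;
    this.betweenLoadAndReturn = betweenLoadAndReturn;
  }

  /** Drops all cached entries, as another thread might do once the segment is flushed. */
  void evictCache() {
    cache.clear();
  }

  /** Racy variant: populates the cache, then looks the entry up in the shared cache again. */
  String loadRacy(long index) {
    cache.put(index, disk.apply(index));
    betweenLoadAndReturn.accept(this); // an eviction here empties the cache
    return Objects.requireNonNull(cache.get(index), "entry evicted before it was returned");
  }

  /** Safer variant: keeps the loaded entry in a local variable and returns that. */
  String loadSafe(long index) {
    final String entry = Objects.requireNonNull(disk.apply(index), "entry missing on disk");
    cache.put(index, entry);
    betweenLoadAndReturn.accept(this); // an eviction here no longer affects the result
    return entry;
  }
}

final class SegmentCacheDemo {
  public static void main(String[] args) {
    // The hook evicts the cache at exactly the point described in the report.
    final SegmentCacheSketch segment =
        new SegmentCacheSketch(i -> "entry-" + i, SegmentCacheSketch::evictCache);
    try {
      segment.loadRacy(7L);
    } catch (NullPointerException e) {
      System.out.println("racy load failed: " + e.getMessage());
    }
    System.out.println("safe load returned: " + segment.loadSafe(7L));
  }
}
```

In a real server the eviction would come from a different thread, so the failure would show up only occasionally, which matches the observation that it is hard to reproduce.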
[jira] [Commented] (RATIS-804) Race condition between cache evict and load in LogSegment
[ https://issues.apache.org/jira/browse/RATIS-804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17032334#comment-17032334 ]

Marton Elek commented on RATIS-804:
-----------------------------------

{quote}[~elek], would you mind testing the patch?
{quote}
Sure, thanks for the patch. I just started to create a new build to deploy and test.
[jira] [Commented] (RATIS-759) Support stream APIs to send large messages
[ https://issues.apache.org/jira/browse/RATIS-759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17032311#comment-17032311 ]

Hadoop QA commented on RATIS-759:
---------------------------------

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 1m 58s | Docker mode activated. |
|| || || || Prechecks ||
| 0 | findbugs | 0m 0s | Findbugs executables are not available. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 2 new or modified test files. |
|| || || || master Compile Tests ||
| 0 | mvndep | 3m 6s | Maven dependency ordering for branch |
| +1 | mvninstall | 2m 12s | master passed |
| +1 | compile | 0m 55s | master passed |
| +1 | checkstyle | 0m 18s | master passed |
| +1 | javadoc | 0m 42s | master passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 5s | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 2s | the patch passed |
| +1 | compile | 0m 55s | the patch passed |
| +1 | cc | 0m 55s | the patch passed |
| +1 | javac | 0m 55s | the patch passed |
| +1 | checkstyle | 0m 11s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | javadoc | 0m 37s | the patch passed |
|| || || || Other Tests ||
| -1 | unit | 19m 26s | root in the patch failed. |
| +1 | asflicense | 0m 20s | The patch does not generate ASF License warnings. |
| | | 32m 5s | |

|| Reason || Tests ||
| Failed junit tests | ratis.logservice.server.TestMetaServer |
| | ratis.server.simulation.TestRaftSnapshotWithSimulatedRpc |
| | ratis.netty.TestRaftSnapshotWithNetty |
| | ratis.server.simulation.TestLogAppenderWithSimulatedRpc |
| | ratis.grpc.TestRaftSnapshotWithGrpc |
| | ratis.grpc.TestRaftExceptionWithGrpc |
| | ratis.grpc.TestServerRestartWithGrpc |

|| Subsystem || Report/Notes ||
| Docker | Client=19.03.4 Server=19.03.4 Image:yetus/ratis:date2020-02-07 |
| JIRA Issue | RATIS-759 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12992825/r759_20200206.patch |
| Optional Tests | dupname asflicense javac javadoc unit findbugs checkstyle compile cc |
| uname | Linux 59f0543554d0 4.15.0-74-generic #84-Ubuntu SMP Thu Dec 19 08:06:28 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-RATIS-Build/yetus-personality.sh |
| git revision | master / c7db0a2 |
| maven | version: Apache Maven 3.6.3 (cecedd343002696d0abb50b32b541b8a6ba2883f) |
| Default Java | 1.8.0_242 |
| unit | https://builds.apache.org/job/PreCommit-RATIS-Build/1239/artifact/out/patch-unit-root.txt |
| Test Results | https://builds.apache.org/job/PreCommit-RATIS-Build/1239/testReport/ |
| Max. process+thread count | 3435 (vs. ulimit of 5000) |
| modules | C: ratis-proto ratis-common ratis-client ratis-server ratis-test U: . |
| Console output | https://builds.apache.org/job/PreCommit-RATIS-Build/1239/console |
| Powered by | Apache Yetus 0.8.0 http://yetus.apache.org |

This message was automatically generated.

> Support