[jira] [Updated] (RATIS-813) Add streamAsync(..)

2020-02-07 Thread Tsz-wo Sze (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz-wo Sze updated RATIS-813:
-
Component/s: server, client

> Add streamAsync(..)
> ---
>
> Key: RATIS-813
> URL: https://issues.apache.org/jira/browse/RATIS-813
> Project: Ratis
> Issue Type: New Feature
> Components: client, server
> Reporter: Tsz-wo Sze
> Assignee: Tsz-wo Sze
> Priority: Major
>
> This is a followup of RATIS-759.  Will add streamAsync(..) here.
> {code}
>  /** Send the given message using a stream. */
>   CompletableFuture streamAsync(Message message);
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (RATIS-759) Support stream APIs to send large messages

2020-02-07 Thread Tsz-wo Sze (Jira)


[ 
https://issues.apache.org/jira/browse/RATIS-759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17032710#comment-17032710
 ] 

Tsz-wo Sze edited comment on RATIS-759 at 2/7/20 11:17 PM:
---

Filed RATIS-813 to add to the other method.


was (Author: szetszwo):
Filed RATIS-813 to add to other method.

> Support stream APIs to send large messages
> --
>
> Key: RATIS-759
> URL: https://issues.apache.org/jira/browse/RATIS-759
> Project: Ratis
> Issue Type: New Feature
> Components: client, server
> Reporter: Tsz-wo Sze
> Assignee: Tsz-wo Sze
> Priority: Major
> Fix For: 0.5.0
>
> Attachments: r759_20200115.patch, r759_20200123.patch, r759_20200204.patch, r759_20200206.patch
>
>
> It is inefficient to send a large message using 
> send(Message)/sendAsync(Message) in RaftClient.  We already have 
> RaftOutputStream implemented with sendAsync(..).  We propose adding the 
> following new APIs:
> {code}
>   /** Create a stream to send a large message. */
>   MessageOutputStream stream();
>   /** Send the given message using a stream. */
>   CompletableFuture streamAsync(Message message);
> {code} 
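
The idea behind a stream-based send can be sketched as follows: split the large message into bounded chunks, send each chunk asynchronously, and complete one future when all chunks are acknowledged. This is only an illustrative sketch, not the Ratis implementation or the attached patch. The names StreamAsyncSketch, ChunkSender, split and the chunkSize parameter are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;

public class StreamAsyncSketch {
  /** Hypothetical transport: sends one chunk of a stream asynchronously. */
  public interface ChunkSender {
    CompletableFuture<Void> sendAsync(long streamId, long chunkIndex, byte[] chunk, boolean endOfStream);
  }

  /** Split a message into chunks of at most chunkSize bytes. */
  public static List<byte[]> split(byte[] message, int chunkSize) {
    if (chunkSize <= 0) {
      throw new IllegalArgumentException("chunkSize must be positive: " + chunkSize);
    }
    final List<byte[]> chunks = new ArrayList<>();
    for (int off = 0; off < message.length; off += chunkSize) {
      chunks.add(Arrays.copyOfRange(message, off, Math.min(message.length, off + chunkSize)));
    }
    if (chunks.isEmpty()) {
      // An empty message still needs one chunk to carry the end-of-stream flag.
      chunks.add(new byte[0]);
    }
    return chunks;
  }

  /** Send all chunks; the returned future completes once every chunk send has completed. */
  public static CompletableFuture<Void> streamAsync(
      ChunkSender sender, long streamId, byte[] message, int chunkSize) {
    final List<byte[]> chunks = split(message, chunkSize);
    final List<CompletableFuture<Void>> futures = new ArrayList<>();
    for (int i = 0; i < chunks.size(); i++) {
      final boolean last = i == chunks.size() - 1;
      futures.add(sender.sendAsync(streamId, i, chunks.get(i), last));
    }
    return CompletableFuture.allOf(futures.toArray(new CompletableFuture[0]));
  }

  public static void main(String[] args) {
    // In-memory sender that only logs what it would send.
    final ChunkSender logging = (id, index, chunk, end) -> {
      System.out.println("stream " + id + " chunk " + index + " size " + chunk.length + " end=" + end);
      return CompletableFuture.completedFuture(null);
    };
    streamAsync(logging, 1L, new byte[10], 4).join();
  }
}
```

A real client would also need ordering, retries and flow control on the server side; those are left out here to keep the chunking idea visible.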





[jira] [Commented] (RATIS-759) Support stream APIs to send large messages

2020-02-07 Thread Tsz-wo Sze (Jira)


[ 
https://issues.apache.org/jira/browse/RATIS-759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17032710#comment-17032710
 ] 

Tsz-wo Sze commented on RATIS-759:
--

Filed RATIS-813 to add to other method.






[jira] [Created] (RATIS-813) Add streamAsync(..)

2020-02-07 Thread Tsz-wo Sze (Jira)
Tsz-wo Sze created RATIS-813:


Summary: Add streamAsync(..)
Key: RATIS-813
URL: https://issues.apache.org/jira/browse/RATIS-813
Project: Ratis
Issue Type: New Feature
Reporter: Tsz-wo Sze
Assignee: Tsz-wo Sze


This is a followup of RATIS-759.  Will add streamAsync(..) here.
{code}
 /** Send the given message using a stream. */
  CompletableFuture streamAsync(Message message);
{code}






[jira] [Commented] (RATIS-804) Race condition between cache evict and load in LogSegment

2020-02-07 Thread Marton Elek (Jira)


[ 
https://issues.apache.org/jira/browse/RATIS-804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17032519#comment-17032519
 ] 

Marton Elek commented on RATIS-804:
---

+1

 

I tested and couldn't reproduce the Exception any more.

 

(In fact it's very hard to reproduce with a properly configured client. I used 
a specific client which doesn't close the GRPC requests. With the fixed client, 
it's very hard to see the Exception during real tests...)

> Race condition between cache evict and load in LogSegment
> -
>
> Key: RATIS-804
> URL: https://issues.apache.org/jira/browse/RATIS-804
> Project: Ratis
> Issue Type: Bug
> Components: server
> Reporter: Marton Elek
> Assignee: Tsz-wo Sze
> Priority: Critical
> Attachments: r804_20200205.patch
>
>
> I am stress testing Ozone. I start one Datanode in FOLLOWER mode and the 
> load generator (Freon) behaves like a LEADER.
> I am sending a huge number of AppendLogEntries to the FOLLOWER without any 
> throttling.
> As a result I got an NPE:
> {code:java}
> 2020-01-28 15:08:20 ERROR StateMachineUpdater:184 - 3fda0c39-ce3c-4540-a804-44d9ac1f4853@group-E1B13B4CA5C0-StateMachineUpdater: the StateMachineUpdater hits Throwable
> org.apache.ratis.server.raftlog.RaftLogIOException: java.lang.NullPointerException
>     at org.apache.ratis.server.raftlog.segmented.LogSegment.loadCache(LogSegment.java:320)
>     at org.apache.ratis.server.raftlog.segmented.SegmentedRaftLog.get(SegmentedRaftLog.java:293)
>     at org.apache.ratis.server.impl.StateMachineUpdater.applyLog(StateMachineUpdater.java:218)
>     at org.apache.ratis.server.impl.StateMachineUpdater.run(StateMachineUpdater.java:167)
>     at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.NullPointerException
>     at java.util.Objects.requireNonNull(Objects.java:203)
>     at org.apache.ratis.server.raftlog.segmented.LogSegment$LogEntryLoader.load(LogSegment.java:214)
>     at org.apache.ratis.server.raftlog.segmented.LogSegment.loadCache(LogSegment.java:318)
>     ... 4 more
> {code}
> It seems to be a race condition between LogSegment.evictCache() and 
> LogSegment.loadCache().
>  # StateMachineUpdater tries to update the StateMachine with the next log 
> entry
>  # The entry can't be found in the cache, therefore LogSegment.loadCache() is 
> called
>  # LogSegment.LogEntryLoader.load() reads the segment files from the disk
>  # After loading, it returns the loaded entry
> If the GRPC thread evicts the cache between steps 3 and 4 (this is possible 
> when the log segment is already flushed and can therefore be evicted), an NPE 
> will be thrown.
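
The four steps above can be sketched as a load-then-re-read pattern. This is a simplified model under stated assumptions, not the actual LogSegment code or the r804 patch: the class EvictLoadRaceSketch, its methods and the string entries are hypothetical, and the concurrent eviction is simulated deterministically with a callback that runs between the load and the return.

```java
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;

public class EvictLoadRaceSketch {
  /** Hypothetical entry cache keyed by log index. */
  private final Map<Long, String> cache = new ConcurrentHashMap<>();

  /** Hypothetical stand-in for reading an entry from a segment file on disk. */
  private String readFromDisk(long index) {
    return "entry-" + index;
  }

  /** Drops all cached entries, as an eviction of a flushed segment might. */
  public void evictCache() {
    cache.clear();
  }

  /**
   * Racy variant: fills the cache, then re-reads the cache to get the result.
   * If an eviction happens in between, the re-read returns null and
   * requireNonNull throws a NullPointerException.
   */
  public String loadRacy(long index, Runnable betweenLoadAndReturn) {
    cache.put(index, readFromDisk(index));
    betweenLoadAndReturn.run(); // window in which another thread may evict
    return Objects.requireNonNull(cache.get(index));
  }

  /**
   * Safer variant: keeps a local reference to the loaded entry and returns it,
   * so a concurrent eviction cannot turn the result into null.
   */
  public String loadSafe(long index, Runnable betweenLoadAndReturn) {
    final String entry = readFromDisk(index);
    cache.put(index, entry);
    betweenLoadAndReturn.run();
    return entry;
  }

  public static void main(String[] args) {
    final EvictLoadRaceSketch segment = new EvictLoadRaceSketch();
    try {
      segment.loadRacy(7L, segment::evictCache);
    } catch (NullPointerException e) {
      System.out.println("racy load failed after eviction");
    }
    System.out.println("safe load returned " + segment.loadSafe(7L, segment::evictCache));
  }
}
```

Holding a local reference is only one possible remedy; synchronizing load and evict on the same lock would be another, at the cost of blocking eviction during disk reads.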





[jira] [Commented] (RATIS-804) Race condition between cache evict and load in LogSegment

2020-02-07 Thread Marton Elek (Jira)


[ 
https://issues.apache.org/jira/browse/RATIS-804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17032334#comment-17032334
 ] 

Marton Elek commented on RATIS-804:
---

{quote}[~elek], would you mind testing the patch?
{quote}
Sure, thanks for the patch. I have just started a new build to deploy and test.






[jira] [Commented] (RATIS-759) Support stream APIs to send large messages

2020-02-07 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/RATIS-759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17032311#comment-17032311
 ] 

Hadoop QA commented on RATIS-759:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 1m 58s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 2 new or modified test files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 3m 6s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 12s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 55s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 18s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 42s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 5s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 0m 55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 37s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 19m 26s{color} | {color:red} root in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 20s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 32m 5s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | ratis.logservice.server.TestMetaServer |
|   | ratis.server.simulation.TestRaftSnapshotWithSimulatedRpc |
|   | ratis.netty.TestRaftSnapshotWithNetty |
|   | ratis.server.simulation.TestLogAppenderWithSimulatedRpc |
|   | ratis.grpc.TestRaftSnapshotWithGrpc |
|   | ratis.grpc.TestRaftExceptionWithGrpc |
|   | ratis.grpc.TestServerRestartWithGrpc |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.4 Server=19.03.4 Image:yetus/ratis:date2020-02-07 |
| JIRA Issue | RATIS-759 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12992825/r759_20200206.patch |
| Optional Tests | dupname asflicense javac javadoc unit findbugs checkstyle compile cc |
| uname | Linux 59f0543554d0 4.15.0-74-generic #84-Ubuntu SMP Thu Dec 19 08:06:28 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-RATIS-Build/yetus-personality.sh |
| git revision | master / c7db0a2 |
| maven | version: Apache Maven 3.6.3 (cecedd343002696d0abb50b32b541b8a6ba2883f) |
| Default Java | 1.8.0_242 |
| unit | https://builds.apache.org/job/PreCommit-RATIS-Build/1239/artifact/out/patch-unit-root.txt |
| Test Results | https://builds.apache.org/job/PreCommit-RATIS-Build/1239/testReport/ |
| Max. process+thread count | 3435 (vs. ulimit of 5000) |
| modules | C: ratis-proto ratis-common ratis-client ratis-server ratis-test U: . |
| Console output | https://builds.apache.org/job/PreCommit-RATIS-Build/1239/console |
| Powered by | Apache Yetus 0.8.0 http://yetus.apache.org |


This message was automatically generated.



> Support