[jira] [Commented] (IGNITE-15568) Striped Disruptor doesn't work with JRaft event handlers properly

2024-05-27 Thread Vladislav Pyatkov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-15568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17849779#comment-17849779
 ] 

Vladislav Pyatkov commented on IGNITE-15568:


Merged ced0ebba0969ad1b75ee94ca5a252aef15d97955

> Striped Disruptor doesn't work with JRaft event handlers properly
> -
>
> Key: IGNITE-15568
> URL: https://issues.apache.org/jira/browse/IGNITE-15568
> Project: Ignite
>  Issue Type: Bug
>Reporter: Alexey Scherbakov
>Assignee: Vladislav Pyatkov
>Priority: Major
>  Labels: ignite-3, performance
> Fix For: 3.0.0-beta2
>
> Attachments: InsertBenchmark.java, MyInsertBenchmarkWithMetrics.java
>
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> The following scenario is broken:
>  # Two raft groups are started and mapped to the same stripe.
>  # Two LogEntryAndClosure events are added in quick succession so they form 
> a disruptor batch: the first for group 1, the second for group 2.
> The first event is delivered to group 1 with endOfBatch=false, so it is cached in 
> org.apache.ignite.raft.jraft.core.NodeImpl.LogEntryAndClosureHandler#tasks 
> and is not processed.
> The second event is delivered to group 2 with endOfBatch=true and is processed, but 
> the first event remains in the queue unprocessed forever, because 
> each raft group has its own LogEntryAndClosureHandler instance.
> A possible workaround is to set 
> org.apache.ignite.raft.jraft.option.RaftOptions#applyBatch=1
> Reproducible by 
> org.apache.ignite.internal.table.TxDistributedTest_1_1_1#testCrossTable + 
> applyBatch=32 in the ignite-15085 branch
> *Implementation notes*
> My proposal is limited to the Disruptor. The striped disruptor implementation has a 
> common interceptor that passes each event to a group-specific interceptor. Only the last 
> event in the batch carries the batch-completion flag. Every other RAFT group 
> that was notified within the same batch needs an additional event 
> that closes the batch for that group. The new event will be created in the 
> common striped disruptor interceptor and sent to the specific 
> interceptor with the batch-completion flag set.
> The rule for handling the new event differs between interceptors:
> {code:java|title=ApplyTaskHandler (FSMCallerImpl#runApplyTask)}
> if (maxCommittedIndex >= 0) {
>   doCommitted(maxCommittedIndex);
>   return -1;
> }
> {code}
> {code:java|title=LogEntryAndClosureHandler(LogEntryAndClosureHandler#onEvent)}
> if (this.tasks.size() > 0) {
>   executeApplyingTasks(this.tasks);
>   this.tasks.clear();
> }
> {code}
> {code:java|title=ReadIndexEventHandler(ReadIndexEventHandler#onEvent)}
> if (this.events.size() > 0) {
>   executeReadIndexEvents(this.events);
>   this.events.clear();
> }
> {code}
> {code:java|title=StableClosureEventHandler(StableClosureEventHandler#onEvent)}
> if (this.ab.size > 0) {
>   this.lastId = this.ab.flush();
>   setDiskId(this.lastId);
> }
> {code}
> As part of this issue, the benchmarks also need to be rerun. They are expected 
> to show an improvement under high parallelism within a single partition.
> There is [an example of the 
> benchmark|https://github.com/gridgain/apache-ignite-3/tree/4b9de922caa4aef97a5e8e159d5db76a3fc7a3ad/modules/runner/src/test/java/org/apache/ignite/internal/benchmark].
>  
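The broken scenario and the proposed extra batch-completion event can be reproduced with a small stand-alone model. This is only a sketch under stated assumptions: `GroupHandler`, `Event`, `dispatch` and `stuckEvents` are hypothetical names invented for illustration, and none of this is the actual StripedDisruptor or JRaft code. Each handler caches events until it sees endOfBatch=true, and the dispatcher optionally sends a synthetic completion event to every group that did not receive the last event of the batch.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class StripedBatchSketch {
    /** Hypothetical per-group handler: caches events until it sees endOfBatch=true. */
    static final class GroupHandler {
        final List<String> cached = new ArrayList<>();
        final List<String> processed = new ArrayList<>();

        /** A null payload models the synthetic "batch completed" event. */
        void onEvent(String payload, boolean endOfBatch) {
            if (payload != null) {
                cached.add(payload);
            }
            if (endOfBatch && !cached.isEmpty()) {
                processed.addAll(cached);
                cached.clear();
            }
        }
    }

    /** Hypothetical stripe event: target group plus payload. */
    static final class Event {
        final String group;
        final String payload;

        Event(String group, String payload) {
            this.group = group;
            this.payload = payload;
        }
    }

    /** Delivers one stripe batch; only the last event carries endOfBatch=true. */
    static void dispatch(List<Event> batch, Map<String, GroupHandler> handlers, boolean completeAllGroups) {
        Set<String> pendingGroups = new LinkedHashSet<>();
        for (int i = 0; i < batch.size(); i++) {
            Event e = batch.get(i);
            boolean endOfBatch = i == batch.size() - 1;
            handlers.get(e.group).onEvent(e.payload, endOfBatch);
            if (endOfBatch) {
                pendingGroups.remove(e.group); // this group has already seen the flag
            } else {
                pendingGroups.add(e.group);
            }
        }
        if (completeAllGroups) {
            // Proposed fix: close the batch for every other group touched by it.
            for (String group : pendingGroups) {
                handlers.get(group).onEvent(null, true);
            }
        }
    }

    /** Replays the two-group scenario and returns how many events stay cached. */
    public static int stuckEvents(boolean completeAllGroups) {
        Map<String, GroupHandler> handlers = new LinkedHashMap<>();
        handlers.put("group1", new GroupHandler());
        handlers.put("group2", new GroupHandler());
        List<Event> batch = List.of(new Event("group1", "e1"), new Event("group2", "e2"));
        dispatch(batch, handlers, completeAllGroups);
        int stuck = 0;
        for (GroupHandler h : handlers.values()) {
            stuck += h.cached.size();
        }
        return stuck;
    }

    public static void main(String[] args) {
        System.out.println("stuck without completion event: " + stuckEvents(false));
        System.out.println("stuck with completion event: " + stuckEvents(true));
    }
}
```

Without the completion event the entry for group 1 stays in its handler's cache; with it, both handlers drain their caches at the end of the batch.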



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (IGNITE-15568) Striped Disruptor doesn't work with JRaft event handlers properly

2024-05-27 Thread Vladislav Pyatkov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-15568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17849768#comment-17849768
 ] 

Vladislav Pyatkov commented on IGNITE-15568:


IGNITE-20536 is also fixed as part of this ticket.



[jira] [Commented] (IGNITE-15568) Striped Disruptor doesn't work with JRaft event handlers properly

2024-05-17 Thread Vladislav Pyatkov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-15568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17847377#comment-17847377
 ] 

Vladislav Pyatkov commented on IGNITE-15568:


{code}
New

Raft metrics:
raft.logmanager.disruptor.Batch: [
  0_10:2487232, 
  10_20:5693, 
  20_30:2096, 
  30_40:574, 
  40_50:295, 
  50_inf:4]

Benchmark                              (clusterSize)  (fsync)  (partitionCount)  Mode  Cnt     Score     Error  Units
MyInsertBenchmarkWithMetrics.kvInsert              1    false                 2  avgt  200  6723,882 ± 639,991  us/op
MyInsertBenchmarkWithMetrics.kvInsert              1     true                 2  avgt  200  7722,169 ± 504,716  us/op


Old

raft.logmanager.disruptor.Batch: [
  0_10:2788769, 
  10_20:8218, 
  20_30:4532, 
  30_40:1579, 
  40_50:782, 
  50_inf:61]
raft.nodeimpl.disruptor.Batch: [
  0_10:3274036, 
  10_20:2066, 
  20_30:446, 
  30_40:128, 
  40_50:35, 
  50_inf:8]
raft.readonlyservice.disruptor.Batch: [
  0_10:2, 
  10_20:0, 
  20_30:0, 
  30_40:0, 
  40_50:0, 
  50_inf:0]
raft.fsmcaller.disruptor.Batch: [
  0_10:9135, 
  10_20:6197, 
  20_30:4795, 
  30_40:6800, 
  40_50:80328, 
  50_inf:73]

Benchmark                              (clusterSize)  (fsync)  (partitionCount)  Mode  Cnt     Score     Error  Units
MyInsertBenchmarkWithMetrics.kvInsert              1    false                 2  avgt  200  7611,808 ± 695,469  us/op
MyInsertBenchmarkWithMetrics.kvInsert              1     true                 2  avgt  200  7681,789 ± 433,490  us/op
{code}

> Striped Disruptor doesn't work with JRaft event handlers properly
> -
>
> Key: IGNITE-15568
> URL: https://issues.apache.org/jira/browse/IGNITE-15568
> Project: Ignite
>  Issue Type: Bug
>Reporter: Alexey Scherbakov
>Assignee: Vladislav Pyatkov
>Priority: Major
>  Labels: ignite-3, performance
> Fix For: 3.0.0-beta2
>
> Attachments: InsertBenchmark.java, MyInsertBenchmarkWithMetrics.java
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>


[jira] [Commented] (IGNITE-15568) Striped Disruptor doesn't work with JRaft event handlers properly

2024-05-17 Thread Vladislav Pyatkov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-15568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17847319#comment-17847319
 ] 

Vladislav Pyatkov commented on IGNITE-15568:


That is how it was originally. I did not modify the old code.



[jira] [Commented] (IGNITE-15568) Striped Disruptor doesn't work with JRaft event handlers properly

2024-05-17 Thread Alexey Scherbakov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-15568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17847276#comment-17847276
 ] 

Alexey Scherbakov commented on IGNITE-15568:


This result doesn't make sense to me either.

The "old disruptor" (without the optimization) should only have batches of size 1.



[jira] [Commented] (IGNITE-15568) Striped Disruptor doesn't work with JRaft event handlers properly

2024-05-17 Thread Vladislav Pyatkov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-15568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17847263#comment-17847263
 ] 

Vladislav Pyatkov commented on IGNITE-15568:


I think the previous one also makes sense, because it shows that we do not degrade 
performance.
I have attached a new one: [^MyInsertBenchmarkWithMetrics.java] 
Here is the result from my laptop:
{code}
Old

Benchmark                 (clusterSize)  (fsync)  (partitionCount)  Mode  Cnt     Score      Error  Units
InsertBenchmark.kvInsert              1    false                 2  avgt  200  6821,523 ± 1190,279  us/op
InsertBenchmark.kvInsert              1     true                 2  avgt  200  8172,433 ±  294,077  us/op

raft.fsmcaller.disruptor.Batch:[
  0_10:29890, 
  10_20:71825, 
  20_30:37548, 
  30_40:9062, 
  40_50:1428, 
  50_inf:2]
raft.logmanager.disruptor.Batch:[
  0_10:32324, 
  10_20:48196, 
  20_30:53466, 
  30_40:8831, 
  40_50:1081, 
  50_inf:26]
raft.nodeimpl.disruptor.Batch:[
  0_10:1804447, 
  10_20:1205, 
  20_30:122, 
  30_40:27, 
  40_50:14, 
  50_inf:0]
raft.readonlyservice.disruptor.Batch:[
  0_10:6, 
  10_20:0, 
  20_30:0, 
  30_40:0, 
  40_50:0, 
  50_inf:0]
  
New

Benchmark                 (clusterSize)  (fsync)  (partitionCount)  Mode  Cnt     Score     Error  Units
InsertBenchmark.kvInsert              1    false                 2  avgt  200  7357,067 ± 640,983  us/op
InsertBenchmark.kvInsert              1     true                 2  avgt  200  8015,733 ± 469,096  us/op

raft:
raft.fsmcaller.disruptor.Batch:[
  0_10:177419, 
  10_20:78244, 
  20_30:2549, 
  30_40:8, 
  40_50:0, 
  50_inf:0]
raft.logmanager.disruptor.Batch:[
  0_10:71704, 
  10_20:76075, 
  20_30:26090, 
  30_40:1540, 
  40_50:44, 
  50_inf:0]
raft.nodeimpl.disruptor.Batch:[
  0_10:2283394, 
  10_20:516, 
  20_30:52, 
  30_40:4, 
  40_50:0, 
  50_inf:0]
raft.readonlyservice.disruptor.Batch:[
  0_10:6, 
  10_20:0, 
  20_30:0, 
  30_40:0, 
  40_50:0, 
  50_inf:0]
{code}


[jira] [Commented] (IGNITE-15568) Striped Disruptor doesn't work with JRaft event handlers properly

2024-05-17 Thread Alexey Scherbakov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-15568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17847257#comment-17847257
 ] 

Alexey Scherbakov commented on IGNITE-15568:


The proposed test scenario doesn't make sense to me.

We need at least 2 partitions and one stripe to see the benefit of the patch.

> Striped Disruptor doesn't work with JRaft event handlers properly
> -
>
> Key: IGNITE-15568
> URL: https://issues.apache.org/jira/browse/IGNITE-15568
> Project: Ignite
>  Issue Type: Bug
>Reporter: Alexey Scherbakov
>Assignee: Vladislav Pyatkov
>Priority: Major
>  Labels: ignite-3, performance
> Fix For: 3.0.0-beta2
>
> Attachments: InsertBenchmark.java
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>


[jira] [Commented] (IGNITE-15568) Striped Disruptor doesn't work with JRaft event handlers properly

2024-05-16 Thread Vladislav Pyatkov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-15568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17847113#comment-17847113
 ] 

Vladislav Pyatkov commented on IGNITE-15568:


After two passes (the benchmark is attached), it showed a weak positive 
impact.

{code:java}
New disruptor

Benchmark                 (clusterSize)  (fsync)  (partitionCount)  Mode  Cnt     Score     Error  Units
InsertBenchmark.kvInsert              1    false                 1  avgt  200  6891,786 ± 480,532  us/op
InsertBenchmark.kvInsert              1     true                 1  avgt  200  7615,249 ± 462,971  us/op

Benchmark                 (clusterSize)  (fsync)  (partitionCount)  Mode  Cnt     Score     Error  Units
InsertBenchmark.kvInsert              1    false                 1  avgt  200  6676,231 ± 435,272  us/op
InsertBenchmark.kvInsert              1     true                 1  avgt  200  7656,038 ± 460,172  us/op


Old disruptor

Benchmark                 (clusterSize)  (fsync)  (partitionCount)  Mode  Cnt     Score     Error  Units
InsertBenchmark.kvInsert              1    false                 1  avgt  200  7398,135 ± 895,617  us/op
InsertBenchmark.kvInsert              1     true                 1  avgt  200  7965,185 ± 443,870  us/op

Benchmark                 (clusterSize)  (fsync)  (partitionCount)  Mode  Cnt     Score      Error  Units
InsertBenchmark.kvInsert              1    false                 1  avgt  200  6618,169 ± 1093,236  us/op
InsertBenchmark.kvInsert              1     true                 1  avgt  200  8136,877 ±  292,777  us/op
{code}
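
One hedged way to read these numbers: the JMH "Error" column is, by default, the half-width of a 99.9% confidence interval around the score, so two results whose intervals overlap cannot be separated with confidence from this data alone. The sketch below checks that for the first pass of each variant. The class and method names are hypothetical, and the scores are copied from the table above with the decimal comma written as a dot.

```java
public class BenchmarkIntervalSketch {
    /**
     * Returns true when [scoreA - errA, scoreA + errA] and
     * [scoreB - errB, scoreB + errB] share at least one point.
     */
    public static boolean overlaps(double scoreA, double errA, double scoreB, double errB) {
        return Math.abs(scoreA - scoreB) <= errA + errB;
    }

    public static void main(String[] args) {
        // First pass, fsync=false: new disruptor vs old disruptor (us/op).
        boolean noFsync = overlaps(6891.786, 480.532, 7398.135, 895.617);
        // First pass, fsync=true: new disruptor vs old disruptor (us/op).
        boolean withFsync = overlaps(7615.249, 462.971, 7965.185, 443.870);
        System.out.println("fsync=false intervals overlap: " + noFsync);
        System.out.println("fsync=true intervals overlap: " + withFsync);
    }
}
```

Both pairs overlap, which is consistent with calling the effect weak: the mean scores favor the new disruptor, but the difference is within the reported error.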


> Striped Disruptor doesn't work with JRaft event handlers properly
> -
>
> Key: IGNITE-15568
> URL: https://issues.apache.org/jira/browse/IGNITE-15568
> Project: Ignite
>  Issue Type: Bug
>Reporter: Alexey Scherbakov
>Assignee: Vladislav Pyatkov
>Priority: Major
>  Labels: ignite-3, performance
> Fix For: 3.0.0-beta2
>
> Attachments: StripedDisruptor.java
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>


[jira] [Commented] (IGNITE-15568) Striped Disruptor doesn't work with JRaft event handlers properly

2023-07-10 Thread Ivan Bessonov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-15568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17741528#comment-17741528
 ] 

Ivan Bessonov commented on IGNITE-15568:


The actual implementation here differs from the description.

First of all, only the log manager is affected.

Second, instead of having a single event that would notify all listeners, I 
re-use the "endOfBatch" flag. This solution is not as general, and it can 
later be re-implemented using an additional event, but I decided not to change the 
API too much for now.

The log manager internally has the information about the stripe it belongs to. 
Using that information, it's possible to perform a single write into the shared 
log storage, even when the batch consists of data from several different 
replication groups.
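
The idea of one write per stripe batch can be illustrated with a minimal model. This is a sketch under assumptions, not the code that was merged: `SharedLogStorage`, `StripeBuffer` and `LogManagerHandler` are hypothetical names. Every per-group handler appends to a buffer owned by its stripe, and whichever handler receives endOfBatch=true flushes the whole buffer in a single write.

```java
import java.util.ArrayList;
import java.util.List;

public class SharedStripeFlushSketch {
    /** Hypothetical shared log storage that counts physical writes. */
    static final class SharedLogStorage {
        int writes;
        final List<String> persisted = new ArrayList<>();

        void write(List<String> entries) {
            writes++;
            persisted.addAll(entries);
        }
    }

    /** Hypothetical state shared by all log managers mapped to one stripe. */
    static final class StripeBuffer {
        private final SharedLogStorage storage;
        private final List<String> pending = new ArrayList<>();

        StripeBuffer(SharedLogStorage storage) {
            this.storage = storage;
        }

        void append(String groupId, String entry) {
            pending.add(groupId + ":" + entry);
        }

        /** Writes everything collected during the batch in one call. */
        void flush() {
            if (!pending.isEmpty()) {
                storage.write(new ArrayList<>(pending));
                pending.clear();
            }
        }
    }

    /** Hypothetical per-group log manager handler that knows its stripe. */
    static final class LogManagerHandler {
        private final String groupId;
        private final StripeBuffer stripe;

        LogManagerHandler(String groupId, StripeBuffer stripe) {
            this.groupId = groupId;
            this.stripe = stripe;
        }

        void onEvent(String entry, boolean endOfBatch) {
            stripe.append(groupId, entry);
            if (endOfBatch) {
                // The flag closes the batch for the whole stripe, not just this group.
                stripe.flush();
            }
        }
    }

    /** Runs a three-event batch spanning two groups and returns the storage. */
    static SharedLogStorage demo() {
        SharedLogStorage storage = new SharedLogStorage();
        StripeBuffer stripe = new StripeBuffer(storage);
        LogManagerHandler group1 = new LogManagerHandler("group1", stripe);
        LogManagerHandler group2 = new LogManagerHandler("group2", stripe);
        group1.onEvent("a", false);
        group2.onEvent("b", false);
        group1.onEvent("c", true);
        return storage;
    }

    public static void main(String[] args) {
        SharedLogStorage storage = demo();
        System.out.println("writes=" + storage.writes + ", entries=" + storage.persisted.size());
    }
}
```

Because the buffer belongs to the stripe rather than to a group, the entry for group 2 is persisted even though group 2 never sees endOfBatch=true itself.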

> Striped Disruptor doesn't work with JRaft event handlers properly
> -
>
> Key: IGNITE-15568
> URL: https://issues.apache.org/jira/browse/IGNITE-15568
> Project: Ignite
>  Issue Type: Bug
>Reporter: Alexey Scherbakov
>Assignee: Ivan Bessonov
>Priority: Major
>  Labels: ignite-3
>  Time Spent: 50m
>  Remaining Estimate: 0h
>