[ 
https://issues.apache.org/jira/browse/IGNITE-17735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17607891#comment-17607891
 ] 

Vladimir Steshin edited comment on IGNITE-17735 at 10/27/22 6:37 PM:
---------------------------------------------------------------------

A data streamer with 'allowOverwrite==true' writing to a persistent 
PRIMARY_SYNC cache may exhaust the heap or consume noticeably more of it.

There is a related setting, 'perNodeParallelOperations()', so this is probably 
not an issue at all. What concerned me is that I hit the problem during trivial 
research: a few servers, a simple cache, and data streaming with various 
persistence and load settings (see `HeapConsumptionDataStreamerTest.src`). I 
think users may hit the same. The default value could be improved for 
persistent caches.

Depending on the streamer receiver, the 'allowOverwrite' setting and the cache 
sync mode, the streamer node may not wait for backup updates and keeps sending 
more and more streamer batches to process. The receiving node accumulates the 
futures and requests related to the backup updates. 
The same happens on the backup node: incoming update requests pile up while 
they are stuck on disk writes. See 'DS_heap_consumption.png' for an example.
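
The accumulation described above can be illustrated with a plain-JDK sketch. 
It is not Ignite code: `InFlightBatchLimit`, `peakPending` and the 1 ms "disk 
write" are hypothetical names and values chosen for illustration. A sender 
submits batches to a slow single-threaded writer, and a semaphore caps how many 
batches may be pending at once, which is roughly the role a per-node parallel 
operations limit plays.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Illustration only: a sender feeding a slow writer, with a cap on pending batches. */
final class InFlightBatchLimit {
    /** Sends {@code batches} batches and returns the peak number pending at the same time. */
    static int peakPending(int batches, int maxInFlight) {
        Semaphore permits = new Semaphore(maxInFlight);
        AtomicInteger pending = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        // Single thread stands in for a node whose updates are stuck on disk writes.
        ExecutorService slowWriter = Executors.newSingleThreadExecutor();

        for (int i = 0; i < batches; i++) {
            // The sender blocks here once maxInFlight batches are unacknowledged.
            permits.acquireUninterruptibly();
            peak.accumulateAndGet(pending.incrementAndGet(), Math::max);

            slowWriter.submit(() -> {
                try {
                    Thread.sleep(1); // simulated slow disk write
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    pending.decrementAndGet();
                    permits.release(); // acknowledgement frees a slot for the sender
                }
            });
        }

        slowWriter.shutdown();
        try {
            slowWriter.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return peak.get();
    }
}
```

With a small `maxInFlight` the peak never exceeds the cap, however slow the 
writer is. With a cap as large as the number of batches the sender is never 
throttled, so pending work can grow towards the full batch count.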

Suggestion: introduce a reduced default number of parallel batches for 
persistent caches, `IgniteDataStreamer#DFLT_PARALLEL_OPS_PERSISTENT_MULTIPLIER` 
(PR #10343). Alternatively, use a per-internal-receiver setting, 
`InternalUpdater#perNodeParallelOperations()` (PR #10351).

I ran estimation benchmarks. Even the in-memory ones (e.g. 
'bench_inmem_isolated_pc2.txt') show that 2, or maybe 4, batches per thread 
seem to be enough. 

For persistent caches, `CPUs x 2` seems to be enough. See 
`bench_persistent_results_Isolated_pc1.txt` and 
`bench_persistent_results_Individual_pc1.txt`.
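
A minimal sketch of how such a default could be derived. The class and method 
names are hypothetical and are not taken from either PR. The in-memory 
multiplier of 8 is an assumption about the current 
`IgniteDataStreamer#DFLT_PARALLEL_OPS_MULTIPLIER`. The persistent multiplier of 
2 is the `CPUs x 2` figure suggested by the benchmarks.

```java
/** Illustration only: picks a per-node parallel batch limit from the CPU count. */
final class ParallelOpsDefaults {
    /** Assumption: mirrors Ignite's existing default multiplier for parallel operations. */
    static final int IN_MEMORY_MULTIPLIER = 8;

    /** Suggested by the persistent-cache benchmarks: CPUs x 2. */
    static final int PERSISTENT_MULTIPLIER = 2;

    /** Returns the number of batches allowed in flight per node. */
    static int perNodeParallelOps(int cpus, boolean persistent) {
        if (cpus < 1)
            throw new IllegalArgumentException("cpus must be positive: " + cpus);

        return cpus * (persistent ? PERSISTENT_MULTIPLIER : IN_MEMORY_MULTIPLIER);
    }
}
```

Until a new default is in place, a user could pass a value computed this way to 
the existing `IgniteDataStreamer#perNodeParallelOperations(int)` setter, for 
example with `Runtime.getRuntime().availableProcessors()` as the CPU count.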


> Datastreamer may consume heap with allowOverwrite=='false'.
> -----------------------------------------------------------
>
>                 Key: IGNITE-17735
>                 URL: https://issues.apache.org/jira/browse/IGNITE-17735
>             Project: Ignite
>          Issue Type: Sub-task
>            Reporter: Vladimir Steshin
>            Assignee: Vladimir Steshin
>            Priority: Major
>              Labels: ise
>         Attachments: DS_heap_consumption.png, DS_heap_consumption_2.png, 
> HeapConsumptionDataStreamerTest.src, bench_inmem_individual_pc2.txt, 
> bench_inmem_isolated_pc2.txt, bench_persistent_full_Individual_pc1.txt, 
> bench_persistent_results_Individual_pc1.txt, 
> bench_persistent_results_Isolated_pc1.txt
>
>          Time Spent: 1h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)
