Hi, Stan.
Thank you for the answer.
>>> "your data streamer queue size is something like"
You are right about the writes queue on the primary node. It does have a fixed
size, but one based on the number of CPUs (x8). Even on my laptop I get
16 x 8 = 128 batches. I wonder why the default is so large for persistence.
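To spell out that default: a tiny sketch of the calculation described above (the CPUs * 8 basis is taken from this thread; the constant name `DFLT_PARALLEL_OPS_MULTIPLIER` is my placeholder, not necessarily the one in the code):

```java
// Sketch of the default in-flight batch count discussed above:
// perNodeParallelOperations defaults to CPU count times a fixed multiplier (8).
public class DefaultParallelOps {
    // Placeholder name; the x8 multiplier is as described in the thread.
    static final int DFLT_PARALLEL_OPS_MULTIPLIER = 8;

    static int defaultParallelOps(int cpus) {
        return cpus * DFLT_PARALLEL_OPS_MULTIPLIER;
    }

    public static void main(String[] args) {
        // A 16-CPU laptop, as in the example above:
        System.out.println(defaultParallelOps(16)); // 128 batches
    }
}
```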
>>> "Can you check the heap dump in your tests to see what actually
occupies most of the heap?"
The backup nodes accumulate `GridDhtAtomicSingleUpdateRequest` objects with key/value
`byte[]` payloads. That's what we don't wait for in this case.
I thought we might slightly adjust the default setting, at least to
make a simple test more reliable. As a user, I wouldn't like it if I just
took a tool/product to try it out or research it and it failed quickly. But yes,
the user still has the related setting `perNodeParallelOperations()`.
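For anyone following along, a minimal sketch of tuning that setting (assuming a started Ignite node and a cache named "myCache"; the value 16 is purely illustrative):

```java
import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteDataStreamer;

// Hedged sketch, not the proposed fix: capping in-flight batches per node by hand.
// Assumes 'ignite' is a started node and a cache named "myCache" exists.
static void loadWithCappedParallelism(Ignite ignite) {
    try (IgniteDataStreamer<Long, byte[]> streamer = ignite.dataStreamer("myCache")) {
        streamer.allowOverwrite(true);          // the mode discussed in this thread
        streamer.perNodeParallelOperations(16); // lower than the CPUs * 8 default
        streamer.addData(1L, new byte[1024]);   // ... stream entries as usual ...
    }
}
```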
WDYT?
30.10.2022 21:24, Stanislav Lukyanov wrote:
Hi Vladimir,
I think this is potentially an issue but I don't think this is about PDS at all.
The description is a bit vague, I have to say. AFAIU what you see is that when
the caches are persistent, the streamer writes data faster than the nodes
(especially the backup nodes) can process the writes.
Therefore the nodes accumulate the writes in queues, the queues grow, and
then you might go OOM.
The solution of simply having smaller queues when persistence is enabled (and
therefore it's more likely the queues will reach their max size) is not the best
one, in my opinion.
If the default max queue size is too large, it should always be smaller,
regardless of why the queues grow.
Furthermore, I have a feeling that what gives you OOM isn't the data streamer
queue... AFAIR your data streamer queue size is something like (entrySize *
bufferSize * perNodeParallelOperations),
which for 1 KB entries and 16 threads gives (1 KB * 512 * 16 * 8) = 64 MB, which
is usually peanuts for server Java.
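The estimate above, spelled out (the formula entrySize * bufferSize * perNodeParallelOperations is as stated in this message; 512 entries per batch and 16 CPUs are the numbers used above):

```java
// Working out the queue-size estimate from this thread.
public class StreamerQueueEstimate {
    public static void main(String[] args) {
        long entrySize = 1024;       // 1 KB per entry
        long bufferSize = 512;       // entries per batch, as in the message above
        long parallelOps = 16 * 8;   // CPUs * 8 = 128 in-flight batches
        long bytes = entrySize * bufferSize * parallelOps;
        System.out.println(bytes / (1024 * 1024) + " MB"); // prints "64 MB"
    }
}
```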
Can you check the heap dump in your tests to see what actually occupies most of
the heap?
Thanks,
Stan
On 28 Oct 2022, at 11:54, Vladimir Steshin <vlads...@gmail.com> wrote:
Hi Folks,
I found that the DataStreamer may consume excessive heap, or at least an
increased amount of heap, when loading into a persistent cache.
This can happen when the streamer's 'allowOverwrite' == true and the cache is in
PRIMARY_SYNC mode.
What I don't like here is that the case looks simple. It occurs with the
defaults: a user might hit the issue in a trivial test, just trying out or
researching the streamer.
The streamer has the related 'perNodeParallelOperations()' setting, which helps.
But an additional DFLT_PARALLEL_PERSISTENT_OPS_MULTIPLIER might be introduced for PDS.
My questions are:
1) Is it an issue at all? Does it need a fix? Is it minor?
2) Should we introduce an additional default, DFLT_PARALLEL_PERSISTENT_OPS_MULTIPLIER,
for PDS, since it reduces heap consumption?
3) A better solution would be backpressure. But is it worth it for this case?
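To make question 3 concrete, here is a minimal, generic sketch of what backpressure could look like (plain Java, not Ignite code: a bounded permit pool that blocks the producer once too many batches are unacknowledged; all names here are mine):

```java
import java.util.concurrent.Semaphore;

// Generic backpressure sketch: the sender must acquire a permit before
// dispatching a batch, and the permit is released only when the remote node
// acknowledges the write. With N permits, at most N batches are in flight,
// bounding the memory the receiving node has to queue.
public class BackpressureSketch {
    private final Semaphore inFlight;
    private final int max;

    BackpressureSketch(int maxInFlightBatches) {
        this.max = maxInFlightBatches;
        this.inFlight = new Semaphore(maxInFlightBatches);
    }

    void sendBatch(Runnable dispatch) throws InterruptedException {
        inFlight.acquire();         // blocks once too many batches are unacknowledged
        try {
            dispatch.run();         // hand the batch to the network layer
        } catch (RuntimeException e) {
            inFlight.release();     // dispatch failed: free the permit
            throw e;
        }
    }

    void onAck() {
        inFlight.release();         // remote node processed the batch
    }

    int inFlightCount() {
        return max - inFlight.availablePermits();
    }
}
```

The same idea could also be expressed as dynamically shrinking the sender's window when the receiver reports a growing queue, rather than a fixed permit count.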
Ticket: https://issues.apache.org/jira/browse/IGNITE-17735
PR: https://github.com/apache/ignite/pull/10343