Hi,

We have been experiencing issues while connecting to Geode from Spark using
the putAll API. The issue is specific to one particular Spark job that
loads data into a replicated region. On the server side we see that the
cache server's default max-connections limit of 800 gets maxed out, and on
the client side we see retry attempts against each server, all of which
fail. However, when we re-run the same job it completes without any issue.

In the code, the problem I can see is that we create a Geode client cache
inside foreachPartition, which I think could be the issue: for each
partition we open a new connection to Geode. In the stats file we can see
connections timing out, and there are also thread bursts, sometimes
exceeding 4000 threads.
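
For reference, here is a minimal sketch of the pattern we currently use
(the locator address, region name, and column accessors are placeholders,
not our exact job code; it only illustrates the per-partition ClientCache):

    import scala.collection.JavaConverters._
    import org.apache.spark.sql.SparkSession
    import org.apache.geode.cache.client.{ClientCacheFactory, ClientRegionShortcut}

    object PutAllJob {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder().appName("geode-putall").getOrCreate()
        import spark.implicits._
        val df = Seq(("k1", "v1"), ("k2", "v2")).toDF("key", "value")

        // A ClientCache (and hence a connection pool) is built inside
        // every partition, so each Spark task connects to Geode on its own.
        df.rdd.foreachPartition { rows =>
          val cache = new ClientCacheFactory()
            .addPoolLocator("locator-host", 10334) // placeholder locator
            .create()
          val region = cache
            .createClientRegionFactory[String, String](ClientRegionShortcut.PROXY)
            .create("myRegion")                    // placeholder region name
          val batch = rows.map(r => r.getString(0) -> r.getString(1)).toMap
          region.putAll(batch.asJava)              // bulk write per partition
          cache.close()
        }
        spark.stop()
      }
    }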

What is the recommended way to connect to Geode from Spark?

This one specific job fails most of the time, and it targets a replicated
region. When we change the region type to partitioned, the job completes.
We have disk persistence enabled for both region types.

Thoughts?



With best regards,
Ashish
