Ksenia Rybakova created IGNITE-5707:
---------------------------------------
Summary: Client can't resume streaming even after topology got
stable during load test
Key: IGNITE-5707
URL: https://issues.apache.org/jira/browse/IGNITE-5707
Project: Ignite
Issue Type: Bug
Affects Versions: 2.1
Reporter: Ksenia Rybakova
Load test config:
- CacheRandomOperationBenchmark
- 8 clients, 48 servers on 8 hosts
- 26 physical caches of different types with different memory policies + 30
groups with 10 partitioned caches each + 20 groups with 10 replicated caches
each. Total 526 caches.
- Preloading amount: 50K, key range: 60K
Complete configs are attached.
3 of 8 clients print the following messages during preloading:
{noformat}
[12:17:56] (err) Failed to execute compound future reducer: GridCompoundFuture
[rdc=null, initFlag=1, lsnrCalls=0, done=false, cancelled=false, err=null,
futs=[true, false, false]]
[12:17:56] (err) Failed to execute compound future reducer: GridCompoundFuture [rdc=null, initFlag=1,
lsnrCalls=0, done=false, cancelled=false, err=null, futs=[true, true, false,
false, false, false, false, false, false, false, false,
false, false, false, false, false, false, false, false, false, false, false,
false, false, false, false, false, false, false, false, false, false, false,
false, false, false, false]]
[12:17:56] (err) Failed to execute compound future reducer: GridCompoundFuture [rdc=null, initFlag=1,
lsnrCalls=0, done=false, cancelled=false, err=null, futs=[true, true, false,
false, false, false, false, false, false, false, false,
false, false, false, false, false, false, false, false, false, false,
false, false, false, false, false, false, false, false, false, false, false,
false, false, false, false, false]]
class org.apache.ignite.IgniteCheckedException: DataStreamer request failed
[node=16a20d0c-4009-4bfa-ad6e-0261d9e3b2a3]
at org.apache.ignite.internal.processors.datastreamer.DataStreamerImpl$Buffer.onResponse(DataStreamerImpl.java:1785)
at org.apache.ignite.internal.processors.datastreamer.DataStreamerImpl$3.onMessage(DataStreamerImpl.java:333)
at org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1556)
at org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:1184)
at org.apache.ignite.internal.managers.communication.GridIoManager.access$4200(GridIoManager.java:126)
at org.apache.ignite.internal.managers.communication.GridIoManager$9.run(GridIoManager.java:1097)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: class org.apache.ignite.IgniteCheckedException: DataStreamer will
retry data transfer at stable topology [reqTop=AffinityTopologyVersion
[topVer=56, minorTopVer=0], topVer=AffinityTopologyVersion
[topVer=56, minorTopVer=1], node=remote]
at org.apache.ignite.internal.processors.datastreamer.DataStreamProcessor.localUpdate(DataStreamProcessor.java:343)
at org.apache.ignite.internal.processors.datastreamer.DataStreamProcessor.processRequest(DataStreamProcessor.java:301)
at org.apache.ignite.internal.processors.datastreamer.DataStreamProcessor.access$000(DataStreamProcessor.java:58)
at org.apache.ignite.internal.processors.datastreamer.DataStreamProcessor$1.onMessage(DataStreamProcessor.java:88)
... 7 more
{noformat}
2 drivers were able to resume streaming after some time, but 1 did not (it
kept printing the error messages). That driver had high heap utilization, which
resulted in a long GC pause. Eventually the other nodes considered it failed.
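
To make the "Caused by" line in the trace above easier to read: the streamer batch was mapped at reqTop=[topVer=56, minorTopVer=0], while the remote node was already at [topVer=56, minorTopVer=1]. The sketch below only illustrates that kind of version mismatch. It assumes the server rejects a batch whenever the two versions differ; the class and method names (TopologyCheckSketch, TopVer, mismatch) are hypothetical and are not Ignite's actual classes or logic.

```java
import java.util.Objects;

/** Illustration only: hypothetical names, not Ignite's implementation. */
public class TopologyCheckSketch {
    /** Hypothetical stand-in for the (topVer, minorTopVer) pair printed in the log. */
    static final class TopVer {
        final long topVer;
        final int minorTopVer;

        TopVer(long topVer, int minorTopVer) {
            this.topVer = topVer;
            this.minorTopVer = minorTopVer;
        }

        @Override public boolean equals(Object o) {
            if (!(o instanceof TopVer))
                return false;

            TopVer other = (TopVer)o;

            return topVer == other.topVer && minorTopVer == other.minorTopVer;
        }

        @Override public int hashCode() {
            return Objects.hash(topVer, minorTopVer);
        }
    }

    /**
     * Assumption: a batch is rejected (and should be retried later) when the
     * version it was mapped on differs from the version the node currently sees.
     */
    static boolean mismatch(TopVer reqTop, TopVer nodeTop) {
        return !reqTop.equals(nodeTop);
    }

    public static void main(String[] args) {
        TopVer reqTop = new TopVer(56, 0);  // reqTop from the log
        TopVer nodeTop = new TopVer(56, 1); // topVer on the remote node from the log

        System.out.println("retry needed: " + mismatch(reqTop, nodeTop));
    }
}
```

With the values from the log the two versions differ only in the minor part, so under this assumption the batch would be rejected until the client remaps it on the newer version.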