Hello! Can you please share more logs / the full stack trace?
Data Streamer is not especially fault tolerant, but it should survive a server node leaving. How many backups do you have? What is the partition loss policy?

Regards,
--
Ilya Kasnacheev


Fri, 31 Jan 2020 at 11:02, userx <gagan...@gmail.com>:

> Hi team,
>
> I performed a simple check of the CAP theorem on an Apache Ignite cluster
> and observed a few things related to the tolerance and availability of
> the system.
>
> Here are the steps:
> 1) Created a cluster of three Ignite servers - S1, S2, S3; say S1 is
> started first, so it is the coordinator.
> 2) Topology version: 3
> 3) 13 clients (C1 to C13) connect to the cluster, say, sporadically.
> 4) Topology version: 16 = 3 + 13
>
> Let's say the clients start writing into their respective distinct caches.
> After 7 or 8 minutes into this activity, I kill S2 by doing a kill -9.
> What I have observed is that I start getting the following errors for any
> cache writes occurring afterwards:
>
> 50008_116305_11951_2_12472_978_1_0_2 javax.cache.CacheException: class
> org.apache.ignite.IgniteCheckedException: Some of DataStreamer operations
> failed [failedCount=1]
>     at org.apache.ignite.internal.processors.cache.GridCacheUtils.convertToCacheException(GridCacheUtils.java:1337)
>     at org.apache.ignite.internal.processors.datastreamer.DataStreamerImpl.close(DataStreamerImpl.java:1287)
>     at org.apache.ignite.internal.processors.datastreamer.DataStreamerImpl.close(DataStreamerImpl.java:1388)
>     at com.abc.datagrid.DataGridClient.writeAll(DataGridClient.java:209)
>
> Therefore the observation is that it is not partition or fault tolerant,
> and in such a situation the rest of the cluster does not seem to be
> available for writing.
>
> Can someone throw some light here? I can share more logs.
>
> --
> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
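
For reference, the two settings asked about above live on Ignite's CacheConfiguration. A minimal Spring XML sketch (the cache name and chosen values are illustrative, not taken from this thread) of a partitioned cache configured to survive the loss of one server node:

```xml
<bean class="org.apache.ignite.configuration.CacheConfiguration">
    <!-- Illustrative cache name; the thread does not show the real one. -->
    <property name="name" value="myCache"/>
    <property name="cacheMode" value="PARTITIONED"/>
    <!-- With backups=0 (the default), killing the node that owns a
         partition loses that partition's data, and streamer writes to it
         will fail. One backup tolerates the loss of one node. -->
    <property name="backups" value="1"/>
    <!-- READ_WRITE_SAFE makes reads and writes against lost partitions
         fail explicitly (until the partitions are reset) instead of
         silently operating on partial data. -->
    <property name="partitionLossPolicy" value="READ_WRITE_SAFE"/>
</bean>
```

With zero backups, the CacheException above is expected behaviour after kill -9, since the killed node held the only copy of some partitions.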