Hello! If you are seeing any exceptions, please provide logs.
Yes, if you remove the node from the baseline and have 1 backup, then the data will be rebalanced among the remaining nodes.

1K messages per second means roughly 4 MB/s of writes just for checkpoints, given a 4 KB page size (1K dirty pages/s x 4 KB); then add the WAL on top of that.

Regards,
--
Ilya Kasnacheev

Wed, Apr 7, 2021 at 22:26, facundo.maldonado <maldonadofacu...@gmail.com>:

> Hi everyone, kind of frustrated/disappointed here.
>
> I have a small cluster in a test environment where I'm trying to take some
> measurements so I can size the cluster I will need in production and
> estimate some costs.
>
> The use case is simple: consume from a Kafka topic and populate the
> database so other components can start querying (key-value access only).
>
> The cluster is described below:
>
> AWS/K8s environment
> 4 data nodes and 4 'streamer' nodes.
>
> Data nodes:
> - 12 GB memory requested
> - 4 GB for JVM Xms and Xmx
> - 5 GB DataRegion maxSize
> - persistence enabled
> - writeThrottling enabled
> - walSegmentSize 256 MB
> - 10 GB volume attached for storage /opt/work/storage
> - 3 GB volume attached for WAL /opt/work/wal (~10 * walSegmentSize)
> - WAL archive disabled (walArchivePath == walPath)
> - 1 cache
> - partitionLossPolicy READ_ONLY_SAFE
> - cacheMode PARTITIONED
> - writeSynchronizationMode PRIMARY_SYNC
> - rebalanceMode ASYNC
> - backups 1
> - expiryPolicyFactory AccessedExpiryPolicy 20 min
>
> Streamer nodes (Kafka streamer as a grid service - node singleton):
> - 2 GB memory requested
> - allowOverwrite false
> - autoFlushFrequency 200 ms
> - 16 consumers (64 partitions in the topic)
>
> The streamer is configured with a stream receiver, a StreamTransformer
> that checks a special case where I have to choose which record to keep.
> Records are 1.5 KB (avg).
> They are deserialized, converted into domain objects, and streamed as
> BinaryObjects to the cache.
>
> Use case:
> Started with a clean environment: no data in the cache, no data in the
> WAL/storage volumes, no data in the topic.
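As a sanity check on the numbers in this setup (1K msg/s, 1.5 KB records, 20-min expiry, 1 backup, 4 KB pages, 4 data nodes), here is a rough back-of-envelope sketch; treating each update as dirtying one page is my assumption, not something measured:

```shell
# Back-of-envelope sizing from the figures quoted in this thread.
# Assumption: one dirty 4 KB page per update for the checkpoint estimate.
rate=1000                 # messages per second
ttl=$((20 * 60))          # AccessedExpiryPolicy, seconds
rec=1536                  # average record size in bytes (~1.5 KB)
copies=2                  # primary + 1 backup
nodes=4

live=$((rate * ttl * rec * copies))          # bytes live at steady state
echo "steady-state data: $((live / 1024 / 1024)) MB total, $((live / nodes / 1024 / 1024)) MB per node"

cp_rate=$((rate * 4096))                     # bytes/s of checkpoint page writes
echo "checkpoint writes: ~$cp_rate B/s (~4 MB/s), before any WAL traffic"
```

So the live data set (~3.5 GB with backups, under 1 GB per node) fits the 5 GB region, but the steady write pressure never stops, which is where the WAL and checkpointing come in.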
> Input data is generated at a constant rate of 1K messages per second.
> For the first 20 minutes, the cache size grew linearly; after that it
> stayed almost flat. That's expected, since the ExpiryPolicy was set to
> 20 min.
> Around the one-hour mark, the lag in the consumers started to grow.
> After that, everything went wrong.
> The WAL size grew beyond the limits; it exactly doubled before Kubernetes
> killed the pod.
> Around the same moment, memory usage started to grow to near the limit
> (12 GB).
> Throttling times and checkpointing duration were almost the same during
> the test. The latter is really high (2 min avg), but I don't know if that
> is expected or not, since I have nothing to compare against.
>
> After 2 nodes were killed, they never joined the cluster again.
> I increased the size of the WAL volume, but they still didn't join.
> The control.sh utility lists both nodes as offline.
> The logs output a message like this:
> Blocked system-critical thread has been detected. This can lead to
> cluster-wide undefined behaviour [workerName=sys-stripe-6,
> threadName=sys-stripe-6-#7, blockedFor=74s]
>
> After restarting them again, one joined the cluster but not the other.
> The control.sh utility displayed the node as offline.
> By mistake I deleted the contents of the WAL folder. Shame on me.
> Now the node doesn't even start.
> The node log displays:
> JVM will be halted immediately due to the failure:
> [failureCtx=FailureContext [type=CRITICAL_ERROR, err=class
> o.a.i.i.processors.cache.persistence.StorageException: Failed to read
> checkpoint record from WAL, persistence consistency cannot be guaranteed.
> Make sure configuration points to correct WAL folders and WAL folder is
> properly mounted [ptr=WALPointer [idx=179, fileOff=236972130, len=15006],
> walPath=/opt/work/wal, walArchive=/opt/work/wal]]]
>
> Which I think is expected.
> Now the node is completely unusable.
>
> Finally, my questions are:
> - How can I reuse that node? Can I reuse it? Is there a way to clean the
> data and rejoin the node?
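On reusing the node: once its WAL is gone, that node's local persistence cannot be trusted, but you can wipe it and let the node rejoin empty, pulling data back from the surviving backup copies during rebalance. A sketch of the usual procedure, with the node stopped first; the exact per-node directory layout and the `<consistentId>` placeholder depend on your deployment, so treat the paths as assumptions to verify against your mounts:

```shell
# With the node STOPPED: remove its local persistence and WAL so it
# starts empty and re-fetches data from backups during rebalance.
# <consistentId> is a placeholder; check your actual directory layout.
rm -rf /opt/work/storage/<consistentId>
rm -rf /opt/work/wal/*

# After the node restarts and joins, adjust the baseline so rebalancing
# kicks in:
control.sh --baseline                       # inspect the current baseline
control.sh --baseline add <consistentId>    # re-add the cleaned node
```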
> - Did I lose the data on that node? It should be recovered from backups
> once I remove the node from the baseline, is that correct?
> - If I increase the input rate to 2K, the lag generated at the consumers
> becomes unmanageable. Adding more consumers will not help, since they are
> already matched 1:1 with topic partitions.
> - 1K messages per second is really, really slow.
> - How exactly does the WAL work? Why am I constantly running out of space
> here?
> - Any clue what I'm doing wrong?
>
> <http://apache-ignite-users.70518.x6.nabble.com/file/t2948/WalSIze.png>
> <http://apache-ignite-users.70518.x6.nabble.com/file/t2948/MemoryUsage.png>
>
> Hope someone could throw some light here.
> Thanks
>
> --
> Sent from: http://apache-ignite-users.70518.x6.nabble.com/