[ https://issues.apache.org/jira/browse/IGNITE-22843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17871250#comment-17871250 ]
Roman Puchkovskiy commented on IGNITE-22843:
--------------------------------------------

We have an alternative implementation of Raft log storage, that is, logit. In [https://github.com/apache/ignite-3/pull/4178], I tried to switch to it, but there is a problem: it preallocates 2 segments per Raft group, and that's 128 MB with the default segment size (64 MB, taken from AI2's WAL). We have a lot of groups, and just 1 table with the default 25 partitions eats up 3.2 GB of disk, even if it's empty.

Configuration segments don't need to be so big, 100 KB is enough.

A check for 'entry fits a segment' seems to be absent, so tests crash the JVM when the data segment is too small. But there seems to be another problem: even when writing a small entry, the JVM sometimes crashes. Maybe it's a race, we need to investigate.

When the data segment size is configured to be 250 KB, most tests pass, but ItDmlTest always either hangs or crashes the JVM when run in its suite. Locally, when it passes, some tests take an enormous amount of time (like 3 minutes). It needs to be investigated.

With smaller segment sizes the segments are switched often, which ruins performance (4.3 us per write for a 64 MB segment, 4.9 us for a 10 MB segment, 9 us for a 1 MB segment on my machine).

The whole approach of having a lot of logs seems to be shaky. They need to preallocate a lot of disk (like 64 MB) per log (otherwise, performance suffers), but, as we have a lot of logs, this means a huge waste of disk space. Could we share logs between a few (or many?) Raft groups?

> Writing into RAFT log is too long
> ---------------------------------
>
>                 Key: IGNITE-22843
>                 URL: https://issues.apache.org/jira/browse/IGNITE-22843
>             Project: Ignite
>          Issue Type: Improvement
>            Reporter: Vladislav Pyatkov
>            Assignee: Roman Puchkovskiy
>            Priority: Major
>              Labels: ignite-3
>
> h3. Motivation
> We are using RocksDB as RAFT log storage. Writing in the log is significantly
> longer than writing in the memory-mapped buffer (as we used in Ignite 2).
> {noformat}
> appendLogEntry 0.8 6493700 6494500
> Here is hidden 0.5 us
> flushLog 20.1 6495000 6515100
> Here is hidden 2.8 us
> {noformat}
> h3. Definition of done
> We should find a way to implement faster log storage.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
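
The disk-usage figures in the comment above (2 preallocated segments per Raft group, 64 MB default segment size, 25 partitions per table) can be checked with a minimal sketch. The function and constant names here are hypothetical and are not part of logit or Ignite; the sketch only reproduces the arithmetic and assumes one Raft group per partition.

```python
# Hypothetical helper: estimates disk preallocated by per-group log segments.
# The defaults mirror the numbers quoted in the comment, not any real config API.
MIB = 1024 * 1024
KIB = 1024


def preallocated_bytes(groups: int,
                       segments_per_group: int = 2,
                       segment_size: int = 64 * MIB) -> int:
    """Total bytes preallocated when every group owns its own log segments."""
    return groups * segments_per_group * segment_size


# One empty table with the default 25 partitions (assumed: one Raft group each).
default_total = preallocated_bytes(25)
print(default_total // MIB, "MiB")  # 3200 MiB, i.e. the ~3.2 GB from the comment

# The same table with the 250 KB data segments tried in the tests.
small_total = preallocated_bytes(25, segment_size=250 * KIB)
print(small_total // KIB, "KiB")
```

The total grows linearly with the number of groups, which is why sharing one log between several Raft groups would reduce the preallocation proportionally.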