[ https://issues.apache.org/jira/browse/IGNITE-23240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ivan Bessonov updated IGNITE-23240:
-----------------------------------
    Description:
h1. Preface

The current log storage implementation, based on {{RocksDB}}, is known to be much slower than it should be. There are multiple obvious reasons for that:
 * Writing into the WAL +and+ the memtable
 * Creating unique keys for every record
 * Inability to serialize data efficiently: we must build an intermediate state before we pass data into {{RocksDB}}'s API.

h1. Benchmarks
h3. Local benchmarks

Local benchmarks ({{LogStorageBenchmarks}}) were run in my local environment with fsync disabled. I got the following results:
 * {{Logit}}:
{noformat}
Test write:
  Log number : 1024000
  Log Size : 16384
  Batch Size : 100
  Cost time(s) : 23.541
  Total size : 16777216000
  Throughput(bps) : 712680684
  Throughput(rps) : 43498
Test read:
  Log number : 1024000
  Log Size : 16384
  Batch Size : 100
  Cost time(s) : 3.808
  Total size : 16777216000
  Throughput(bps) : 4405781512
  Throughput(rps) : 268907
Test done!
{noformat}
 * {{RocksDB}}:
{noformat}
Test write:
  Log number : 1024000
  Log Size : 16384
  Batch Size : 100
  Cost time(s) : 178.785
  Total size : 16777216000
  Throughput(bps) : 93840176
  Throughput(rps) : 5727
Test read:
  Log number : 1024000
  Log Size : 16384
  Batch Size : 100
  Cost time(s) : 13.572
  Total size : 16777216000
  Throughput(bps) : 1236163866
  Throughput(rps) : 75449
Test done!
{noformat}

While testing in a local environment is not ideal, it still shows a huge improvement in writing speed (7.5x) and reading speed (3.5x). Enabling {{fsync}} roughly equalizes writing speed, but we still expect a simpler log implementation to be faster due to its smaller overall overhead.

h3. Integration testing

A benchmark with 3 servers and 1 client writing data in multiple threads shows a throughput improvement: 34438 vs 30299.

{{RocksDB}}:
!Screenshot from 2024-09-20 10-38-53.png!

{{Logit}}:
!Screenshot from 2024-09-20 10-38-57.png!
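The ratios quoted for the local benchmark can be rechecked directly from the reported records-per-second figures. The following is only a small sketch of that arithmetic in Python; the helper names ({{speedup}}, {{relative_gain}}) are made up for illustration, and it assumes that the larger of the two multi-threaded integration figures belongs to the run with the new log storage.

```python
# Records-per-second figures copied from the LogStorageBenchmarks output above.
LOCAL_RPS = {
    "Logit": {"write": 43498, "read": 268907},
    "RocksDB": {"write": 5727, "read": 75449},
}

# Multi-threaded integration throughput (3 servers, 1 client).
# Assumption: the larger figure is the run with the new log storage.
INTEGRATION_NEW = 34438
INTEGRATION_OLD = 30299


def speedup(new: float, old: float) -> float:
    """Return how many times larger `new` is than `old`."""
    return new / old


def relative_gain(new: float, old: float) -> float:
    """Return the fractional increase of `new` over `old` (0.10 means 10%)."""
    return (new - old) / old


write_speedup = speedup(LOCAL_RPS["Logit"]["write"], LOCAL_RPS["RocksDB"]["write"])
read_speedup = speedup(LOCAL_RPS["Logit"]["read"], LOCAL_RPS["RocksDB"]["read"])
integration_gain = relative_gain(INTEGRATION_NEW, INTEGRATION_OLD)

print(f"local write speedup: {write_speedup:.2f}x")
print(f"local read speedup:  {read_speedup:.2f}x")
print(f"integration gain:    {integration_gain:.1%}")
```

With these inputs the local write and read ratios come out at roughly 7.6x and 3.6x, close to the 7.5x and 3.5x quoted above, while the multi-threaded integration gain is a little under 14%.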

A benchmark of single-threaded insertions in embedded mode shows a throughput improvement: 4072 vs 3739.

{{RocksDB}}:
!Screenshot from 2024-09-20 10-42-49.png!

{{Logit}}:
!Screenshot from 2024-09-20 10-43-09.png!

h1. Observations

Despite a drastic difference in log throughput, the increase in user operation throughput is only about 10%. This means that we lose a lot of time elsewhere, and optimizing those parts could significantly increase performance too. Log optimizations would become more evident after that.

h1. Unsolved issues

There are multiple issues with the new log implementation; most of them are mentioned in [IGNITE-22843|https://issues.apache.org/jira/browse/IGNITE-22843?focusedCommentId=17871250&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17871250]

  was:
h1. Preface

The current log storage implementation, based on {{RocksDB}}, is known to be much slower than it should be. There are multiple obvious reasons for that:
 * Writing into the WAL +and+ the memtable
 * Creating unique keys for every record
 * Inability to serialize data efficiently: we must build an intermediate state before we pass data into {{RocksDB}}'s API.

h1. Benchmarks
h3. Local benchmarks

Local benchmarks ({{LogStorageBenchmarks}}) were run in my local environment with fsync disabled.
I got the following results:
 * {{Logit}}:
{noformat}
Test write:
  Log number : 1024000
  Log Size : 16384
  Batch Size : 100
  Cost time(s) : 23.541
  Total size : 16777216000
  Throughput(bps) : 712680684
  Throughput(rps) : 43498
Test read:
  Log number : 1024000
  Log Size : 16384
  Batch Size : 100
  Cost time(s) : 3.808
  Total size : 16777216000
  Throughput(bps) : 4405781512
  Throughput(rps) : 268907
Test done!
{noformat}
 * {{RocksDB}}:
{noformat}
Test write:
  Log number : 1024000
  Log Size : 16384
  Batch Size : 100
  Cost time(s) : 178.785
  Total size : 16777216000
  Throughput(bps) : 93840176
  Throughput(rps) : 5727
Test read:
  Log number : 1024000
  Log Size : 16384
  Batch Size : 100
  Cost time(s) : 13.572
  Total size : 16777216000
  Throughput(bps) : 1236163866
  Throughput(rps) : 75449
Test done!
{noformat}

While testing in a local environment is not ideal, it still shows a huge improvement in writing speed (7.5x) and reading speed (3.5x). Enabling {{fsync}} roughly equalizes writing speed, but we still expect a simpler log implementation to be faster due to its smaller overall overhead.

h3. Integration testing

Benchmark for 3 servers and 1 client writing data in multiple threads shows 34438 vs 30299 throughput.

{{RocksDB}}:
!Screenshot from 2024-09-20 10-38-53.png!

{{Logit}}:
!Screenshot from 2024-09-20 10-38-57.png!

Single thread insertions in embedded mode show


> Ignite 3 new log storage
> ------------------------
>
>                 Key: IGNITE-23240
>                 URL: https://issues.apache.org/jira/browse/IGNITE-23240
>             Project: Ignite
>          Issue Type: Epic
>            Reporter: Ivan Bessonov
>            Priority: Major
>              Labels: ignite-3
>         Attachments: Screenshot from 2024-09-20 10-38-53.png, Screenshot from 2024-09-20 10-38-57.png, Screenshot from 2024-09-20 10-42-49.png, Screenshot from 2024-09-20 10-43-09.png
>
>
> h1. Preface
> The current log storage implementation, based on {{RocksDB}}, is known to be much slower than it should be.
> There are multiple obvious reasons for that:
>  * Writing into the WAL +and+ the memtable
>  * Creating unique keys for every record
>  * Inability to serialize data efficiently: we must build an intermediate state before we pass data into {{RocksDB}}'s API.
> h1. Benchmarks
> h3. Local benchmarks
> Local benchmarks ({{LogStorageBenchmarks}}) were run in my local environment with fsync disabled. I got the following results:
>  * {{Logit}}:
> {noformat}
> Test write:
>   Log number : 1024000
>   Log Size : 16384
>   Batch Size : 100
>   Cost time(s) : 23.541
>   Total size : 16777216000
>   Throughput(bps) : 712680684
>   Throughput(rps) : 43498
> Test read:
>   Log number : 1024000
>   Log Size : 16384
>   Batch Size : 100
>   Cost time(s) : 3.808
>   Total size : 16777216000
>   Throughput(bps) : 4405781512
>   Throughput(rps) : 268907
> Test done!
> {noformat}
>  * {{RocksDB}}:
> {noformat}
> Test write:
>   Log number : 1024000
>   Log Size : 16384
>   Batch Size : 100
>   Cost time(s) : 178.785
>   Total size : 16777216000
>   Throughput(bps) : 93840176
>   Throughput(rps) : 5727
> Test read:
>   Log number : 1024000
>   Log Size : 16384
>   Batch Size : 100
>   Cost time(s) : 13.572
>   Total size : 16777216000
>   Throughput(bps) : 1236163866
>   Throughput(rps) : 75449
> Test done!
> {noformat}
> While testing in a local environment is not ideal, it still shows a huge improvement in writing speed (7.5x) and reading speed (3.5x). Enabling {{fsync}} roughly equalizes writing speed, but we still expect a simpler log implementation to be faster due to its smaller overall overhead.
> h3. Integration testing
> A benchmark with 3 servers and 1 client writing data in multiple threads shows a throughput improvement: 34438 vs 30299.
> {{RocksDB}}:
> !Screenshot from 2024-09-20 10-38-53.png!
> {{Logit}}:
> !Screenshot from 2024-09-20 10-38-57.png!
> A benchmark of single-threaded insertions in embedded mode shows a throughput improvement: 4072 vs 3739.
> {{RocksDB}}:
> !Screenshot from 2024-09-20 10-42-49.png!
> {{Logit}}:
> !Screenshot from 2024-09-20 10-43-09.png!
> h1. Observations
> Despite a drastic difference in log throughput, the increase in user operation throughput is only about 10%. This means that we lose a lot of time elsewhere, and optimizing those parts could significantly increase performance too. Log optimizations would become more evident after that.
> h1. Unsolved issues
> There are multiple issues with the new log implementation; most of them are mentioned in
> [IGNITE-22843|https://issues.apache.org/jira/browse/IGNITE-22843?focusedCommentId=17871250&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17871250]



--
This message was sent by Atlassian Jira (v8.20.10#820010)