sanastas commented on issue #7676: Add OakIncrementalIndex to Druid
URL: https://github.com/apache/incubator-druid/pull/7676#issuecomment-494413285

Here are some results for org.apache.druid.benchmark.indexing.IndexIngestionBenchmark (IndexIngestionBenchmark).

We insert 3M rows, which should be about 4GB of data. We try to give the same amount of data to the Oak case and to the native Druid IncrementalIndex. Note that, as IndexIngestionBenchmark is written, the rows are generated before the benchmark runs and hold almost 4GB of on-heap memory regardless. So the native Druid IncrementalIndex has some advantage: it only needs to reference data that is already on-heap, whereas Oak needs to copy the data and so needs additional memory. Much of the on-heap space also goes to StringIndexer and other structures.

[IngestionOakvsIncrIdx.pdf](https://github.com/apache/incubator-druid/files/3203010/IngestionOakvsIncrIdx.pdf)

Finally, this is single-threaded. We see a much larger advantage in the multithreaded case, which we will describe shortly.

The command lines, for reference:

```
java -Xmx15g -XX:MaxDirectMemorySize=0g -jar benchmarks/target/benchmarks.jar IndexIngestionBenchmark -p rowsPerSegment=3000000 -p indexType=onheap
java -Xmx9g -XX:MaxDirectMemorySize=6g -jar benchmarks/target/benchmarks.jar IndexIngestionBenchmark -p rowsPerSegment=3000000 -p indexType=oak
```
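
As a quick sanity check on the two command lines above, here is a minimal sketch that sums the heap and direct-memory limits of each command. It is not part of the benchmark or of Druid; the class name `MemoryBudget` and its methods are hypothetical, and it reads the flags at face value, assuming sizes are given in whole gigabytes with a `g` suffix as they are above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Druid or Oak: adds up the memory limits
// named on a java command line so two configurations can be compared.
final class MemoryBudget {
    // Assumption: sizes are written as whole gigabytes with a "g" suffix.
    private static final Pattern HEAP = Pattern.compile("-Xmx(\\d+)g");
    private static final Pattern DIRECT =
            Pattern.compile("-XX:MaxDirectMemorySize=(\\d+)g");

    // Returns the captured number of gigabytes, or 0 if the flag is absent.
    static long gigabytes(Pattern flag, String commandLine) {
        Matcher m = flag.matcher(commandLine);
        return m.find() ? Long.parseLong(m.group(1)) : 0L;
    }

    // Heap limit plus direct-memory limit, in gigabytes.
    static long totalGb(String commandLine) {
        return gigabytes(HEAP, commandLine) + gigabytes(DIRECT, commandLine);
    }

    public static void main(String[] args) {
        String onheap = "java -Xmx15g -XX:MaxDirectMemorySize=0g -jar benchmarks/target/benchmarks.jar"
                + " IndexIngestionBenchmark -p rowsPerSegment=3000000 -p indexType=onheap";
        String oak = "java -Xmx9g -XX:MaxDirectMemorySize=6g -jar benchmarks/target/benchmarks.jar"
                + " IndexIngestionBenchmark -p rowsPerSegment=3000000 -p indexType=oak";

        System.out.println("onheap total: " + totalGb(onheap) + " GB"); // 15 + 0
        System.out.println("oak total: " + totalGb(oak) + " GB");       // 9 + 6

        // Rough per-row size implied by "3M rows, about 4GB" (binary GB assumed).
        long bytesPerRow = 4L * 1024 * 1024 * 1024 / 3_000_000L;
        System.out.println("approx bytes per row: " + bytesPerRow);
    }
}
```

Under these assumptions both commands name 15 GB in total: the on-heap run puts all of it in the heap, while the Oak run splits it into 9 GB of heap and 6 GB of direct memory.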