[GitHub] [incubator-druid] sanastas commented on issue #7676: Add OakIncrementalIndex to Druid
sanastas commented on issue #7676: Add OakIncrementalIndex to Druid URL: https://github.com/apache/incubator-druid/pull/7676#issuecomment-52475 Hey, we are going to update this PR soon This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [incubator-druid] sanastas commented on issue #7676: Add OakIncrementalIndex to Druid
sanastas commented on issue #7676: Add OakIncrementalIndex to Druid URL: https://github.com/apache/incubator-druid/pull/7676#issuecomment-504968841 @b-slim , sure I can provide more details. As I have mentioned we have run Druid's IncrementalIngestionBenchmark (incubator-druid/benchmarks/src/main/java/org/apache/druid/benchmark/indexing/IndexIngestionBenchmark.java) with the data/key generator used there. In order to be more sensitive to data allocation we have increased the number of rows to 3 million (rowsPerSegment=300). Experiments compare OakIncrementalIndex implemented with two different memory allocators: OakNativeAllocator and NettyAllocator. Those are the only differences between two Oak implementations. Both allocators allocate from off-heap. The results are presented in the two columns compared in the table I pasted above. The rows in the table represent how many off-heap memory was given to each experiment, for example (-XX:MaxDirectMemorySize=4g). In each experiment 3.8GB are written off-heap. What other details would you like to know? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [incubator-druid] sanastas commented on issue #7676: Add OakIncrementalIndex to Druid
sanastas commented on issue #7676: Add OakIncrementalIndex to Druid URL: https://github.com/apache/incubator-druid/pull/7676#issuecomment-503631910 Hi Slim (@b-slim ), We have taken your comment about Netty Allocator seriously and implemented and measured OakIncrementalIndex performance based on Netty. The comparison was running IncrementalIngestionBenchmark and inserting 3 million rows which is about 3.8GGB of off-heap data. We have tried giving different off-heap size limits, the result can be seen in the following table. [Oak Native Allocator vs Netty.pdf](https://github.com/apache/incubator-druid/files/3306983/Oak.Native.Allocator.vs.Netty.pdf) To summarize, Netty Allocator requires up to twice memory and when its memory request is satisfied the latency is a bit slower. Looking closer on Netty implementation it is understandable, because Netty Allocator Pools hold buffers in different (exponentially increasing) sizes. This is reasonable for network protocol buffers but less reasonable for data storage. Bottom line, based on empirical results we would like to continue with the primitive OakNativeAllocator which satisfies the Druid needs and gives better performance. The experiments with Netty Allocator will continue. Thank you for you valuable input! This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [incubator-druid] sanastas commented on issue #7676: Add OakIncrementalIndex to Druid
sanastas commented on issue #7676: Add OakIncrementalIndex to Druid URL: https://github.com/apache/incubator-druid/pull/7676#issuecomment-494687467 Thank you for your input! This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [incubator-druid] sanastas commented on issue #7676: Add OakIncrementalIndex to Druid
sanastas commented on issue #7676: Add OakIncrementalIndex to Druid URL: https://github.com/apache/incubator-druid/pull/7676#issuecomment-494413285 Here we have some results for org.apache.druid.benchmark.indexing.IndexIngestionBenchmark (IndexIngestionBenchmark) We try to insert 3M rows, that should be about 4GB data. We try to give the same amount of data for Oak case and for native Druid IncrementalIndex. Pay attention that as IndexIngestionBenchmark is written the rows are generated prior to the benchmark and hold almost 4GB of on-heap memory anyway. So native Druid IncrementalIndex has some advantage, as it just needs to reference something already on-heap, but Oak needs to copy and needs additional memory. Also many on-heap space goes to StringIndexer and other structures. [IngestionOakvsIncrIdx.pdf](https://github.com/apache/incubator-druid/files/3203010/IngestionOakvsIncrIdx.pdf) Finally, this is single threaded. We see that we can give much more advantage in a multithreaded case, which we will describe shortly. The command lines for reference: ``` java -Xmx15g -XX:MaxDirectMemorySize=0g -jar benchmarks/target/benchmarks.jar IndexIngestionBenchmark -p rowsPerSegment=300 -p indexType=onheap java -Xmx9g -XX:MaxDirectMemorySize=6g -jar benchmarks/target/benchmarks.jar IndexIngestionBenchmark -p rowsPerSegment=300 -p indexType=oak ``` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [incubator-druid] sanastas commented on issue #7676: Add OakIncrementalIndex to Druid
sanastas commented on issue #7676: Add OakIncrementalIndex to Druid URL: https://github.com/apache/incubator-druid/pull/7676#issuecomment-494300561 Hi Slim (@b-slim ), I am impressed with such a thorough review! Thanks! It looks like the issue of lock-free programming and lock-free memory management is as close to your heart as it is close to mine! This is the reason I did PhD in this field :) I am glad to find an ally! What was your PhD thesis about? I have some answers below. If I understand something wrongly, I would be happy to hear and learn. > that is simple not true, some of those locks are part of allocate calls and some are part of iterator loop to do some stuff that i doubt it is static ... I would be happy to speak with you about internal Oak implementation. As it is time consuming to discuss it via chat, I suggest us to arrange us an off-line discussion via Hangouts for example. Let me know what time-slots suit you. > Looking at the allocator briefly i see bunch of red flags, like possible internal fragmentations up to factor 2 and not clear way to deal with external fragmentations … Memory Management is a very hard thing in general and make it lock free is not going to be simple task We're very well aware that the full lock-free Memory Management is a hard thing and actually not available yet. However, the Druid's usage of IncrementalIndex is a restricted one, IMHO. For example, there are no concurrent deletions, thus dealing with internal fragmentation should not be an issue. Dealing with external fragmentation might be an issue of tuning per specific Block size and common workloads. Bottom line, we had no intent to write a perfect general-purpose full lock-free Memory Management, but a specialized memory management. Not to say that it is not final and can be fixed/replaced if some objective issues are found. > Writing one form scratch is a common source of bugs, In fact most of the big data projects avoid this task and use the Netty allocator. As you have seen our memory manager and memory allocator is quite restricted. So we believe it can perform faster and it can be easier to find bugs there. Thanks for referring to the Netty Allocator. I have taken a brief look on Netty Allocator, it appears to be based on reference-counting and their underlying data structure is their own ByteBuf (not ByteBuffer). It is assumed that epoch-based GC is more efficient in performance than reference-counting. But mostly important, I am not sure we can convert ByteBuf into ByteBuffer without any complications or performance issues. Nor can we require Druid to start working with Netty ByteBufs. However, we will continue investigation and if Netty Allocator can be used and gives a better performance, no problems to switch to it. > The take our of this, current state is not lock free thus lets avoid confusion, (FYI even on the oak github page i do not see it called lock-free so not sure why you are calling it lock free). Generally speaking lock-free is about the data structure not about the underlying memory management. As a full lock-free memory management (GC+allocation) isn’t available yet, not any of currently existing lock-free data structures can be called lock-free and neither their performance results are valid… Huge amount of scientific papers wasted :) . But I will remove the "lock-free” term from anywhere it appears here. As I said, I am a big fan of lock-free programming, and I will be the first one to fight to eliminate any lock anywhere. But in reality what does matter is the final performance results and in order to see gains from lock-free programming we need big contention of multi-threading, is it the case in Druid’s IncrementalIndex? All the benchmarks are single threaded… I am OK with lock not on the hot path and with lock that doesn't impact the performance at all. > But the most important thing to add here is how OKA memory management will fit with how Druid manages Direct memory. Exactly! This is the mostly important! And here we need your input! How does Druid manages Direct memory? Is there some policy? Documentation to read? Pointers to code? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [incubator-druid] sanastas commented on issue #7676: Add OakIncrementalIndex to Druid
sanastas commented on issue #7676: Add OakIncrementalIndex to Druid URL: https://github.com/apache/incubator-druid/pull/7676#issuecomment-494267528 @fjy , thanks for being interested, we are going to publish some Druid benchmarks soon This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [incubator-druid] sanastas commented on issue #7676: Add OakIncrementalIndex to Druid
sanastas commented on issue #7676: Add OakIncrementalIndex to Druid URL: https://github.com/apache/incubator-druid/pull/7676#issuecomment-493675946 Hi @b-slim , First, it is great you are taking a look :) , thanks! Second, we are trying now different Memory Allocators and Memory Managers. The implementation is not final, so we allow ourselves to make it easier and just to check the performance. The locks you see are for singleton initialization just upon a start. Most-likely, Memory Allocator/Pool will not continue to be singleton. Anyway, there are currently no locks on the hot path, and no locks are going to be there. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org