vamossagar12 commented on a change in pull request #10798: URL: https://github.com/apache/kafka/pull/10798#discussion_r644616944
##########
File path: streams/src/main/java/org/apache/kafka/streams/state/internals/RocksDBStore.java
##########
@@ -505,6 +506,14 @@ private void closeOpenIterators() {
         }
     }
 
+    private ByteBuffer createDirectByteBufferAndPut(byte[] bytes) {
+        ByteBuffer directBuffer = ByteBuffer.allocateDirect(bytes.length);

Review comment:
   Thank you @guozhangwang, @cadonna. I think creating it every time does not make much sense. I should have been more careful before adding it and asking for the internal benchmarks.

   In that case, would it make more sense to leave it out of an API like put() and instead use it for putAll()/range/reverseRange/prefixSeek operations? With put(), it is difficult to know how many put operations may be requested. Users who work with the RocksDB library directly can create DirectByteBuffers once and push as many entries as they want.

   In my conversations on the RocksDB side, one of the comments was this:

   `Extracting large amounts of data under high concurrency, non-direct byte buffer will bring serious GC problems to the upper level Java services.`

   I guess we can target those APIs? WDYT?

##########
File path: streams/src/main/java/org/apache/kafka/streams/state/internals/RocksDBStore.java
##########
@@ -505,6 +506,14 @@ private void closeOpenIterators() {
         }
     }
 
+    private ByteBuffer createDirectByteBufferAndPut(byte[] bytes) {
+        ByteBuffer directBuffer = ByteBuffer.allocateDirect(bytes.length);

Review comment:
   OK, that makes sense. High concurrency is one of the cases where this might be useful.

   Having said that, the RocksDB PR has benchmark numbers for a large number of put operations issued from a single thread. According to those numbers, the direct byte buffer was 37% faster, with 0 GC cycles. Here is the comment: https://github.com/facebook/rocksdb/pull/2283#issuecomment-561563037

   Users of Kafka Streams might call put() in this manner, looping through a bunch of records and using put() to insert each one.
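   For illustration, the create-once-and-reuse idea mentioned above could look roughly like the sketch below. `ReusableDirectBuffer` and `fill` are hypothetical names, not part of this PR, Kafka Streams, or RocksDB. The sketch assumes single-threaded use and only shows the buffer handling, not the actual store write.

```java
import java.nio.ByteBuffer;

/**
 * Hypothetical helper (an assumption for illustration, not code from the PR):
 * holds one direct buffer and reuses it for every write instead of calling
 * ByteBuffer.allocateDirect() on each put. Not thread-safe.
 */
final class ReusableDirectBuffer {
    private ByteBuffer buffer;

    ReusableDirectBuffer(final int initialCapacity) {
        this.buffer = ByteBuffer.allocateDirect(initialCapacity);
    }

    /**
     * Copies bytes into the shared buffer and returns it ready for reading.
     * The returned buffer is only valid until the next call to fill().
     */
    ByteBuffer fill(final byte[] bytes) {
        if (bytes.length > buffer.capacity()) {
            // Grow only when a payload does not fit; this is the rare, costly path.
            buffer = ByteBuffer.allocateDirect(Math.max(bytes.length, buffer.capacity() * 2));
        }
        buffer.clear();    // reset position and limit, no new allocation
        buffer.put(bytes); // copy the heap bytes into off-heap memory
        buffer.flip();     // position = 0, limit = bytes.length
        return buffer;
    }
}
```

   A key and a value would each need their own holder, since the second fill() overwrites whatever the first one returned.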
   From the state store side, we could create one DirectByteBuffer object for put() and keep reusing it, subject to testing. But that usage pattern might not always be the case.

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org