ableegoldman commented on code in PR #15264: URL: https://github.com/apache/kafka/pull/15264#discussion_r1486914766
##########
streams/src/main/java/org/apache/kafka/streams/processor/internals/TaskManager.java:
##########

```diff
@@ -1836,6 +1842,36 @@ public void maybeThrowTaskExceptionsFromProcessingThreads() {
         }
     }

+    private boolean transactionBuffersExceedCapacity = false;
+
+    boolean transactionBuffersExceedCapacity() {
+        return transactionBuffersExceedCapacity;
+    }
+
+    boolean transactionBuffersWillExceedCapacity() {
+        final boolean transactionBuffersAreUnbounded = maxUncommittedStateBytes < 0;
+        if (transactionBuffersAreUnbounded) {
+            return false;
+        }
+
+        // force an early commit if the uncommitted bytes exceed or are *likely to exceed* the configured threshold
```

Review Comment:

> Writes in RocksDB are the same size, irrespective of whether they are inserts of new keys, or updates to existing keys.

I could be wrong about this, but I thought that writing to an existing key in the active memtable -- i.e., before it is flushed and/or made immutable -- is indeed applied as an overwrite, and does not grow the write buffer's overall size? Of course, this only applies to workloads in which keys are updated multiple times between flushes. Still, I feel like this is one of the major points of the write buffer: to deduplicate same-key writes as much as possible before flushing them, since once they're flushed, RocksDB has to wait for a compaction to deduplicate further.

That said, I don't feel strongly about this, and any potential performance degradation should come to light in the benchmarks. Thanks for explaining your thoughts on this.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
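To make the point concrete, here is a minimal Java sketch of the deduplicating write-buffer model described in the comment above. This is hypothetical illustration code, not Kafka Streams or RocksDB source: the `DedupWriteBuffer` class and its byte accounting are invented for this example. It shows how, under that model, a second write to an already-buffered key replaces the buffered value rather than growing the buffer, so the size estimate only grows with *distinct* keys between flushes.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of a deduplicating write buffer (NOT RocksDB's actual
// memtable implementation, which is more complex): same-key writes overwrite
// the buffered entry instead of appending a new one.
public class DedupWriteBuffer {
    private final Map<String, byte[]> buffer = new HashMap<>();
    private long approximateBytes = 0;

    public void put(final String key, final byte[] value) {
        final byte[] previous = buffer.put(key, value);
        if (previous != null) {
            // same-key update: subtract the replaced value so only the
            // size delta is counted, not a whole new entry
            approximateBytes -= previous.length;
        } else {
            // new key: count the key bytes once
            approximateBytes += key.length();
        }
        approximateBytes += value.length;
    }

    public long approximateBytes() {
        return approximateBytes;
    }

    public static void main(final String[] args) {
        final DedupWriteBuffer buf = new DedupWriteBuffer();
        buf.put("k1", new byte[100]);
        buf.put("k1", new byte[100]); // overwrite: buffer size is unchanged
        System.out.println(buf.approximateBytes()); // prints 102 (2-byte key + 100-byte value)
    }
}
```

If writes dedup this way before a flush, an uncommitted-bytes estimate that simply sums every incoming write (as the early-commit check above effectively does) will over-count for update-heavy workloads; whether that matters in practice is exactly what the comment defers to the benchmarks.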