cadonna commented on a change in pull request #9027:
URL: https://github.com/apache/kafka/pull/9027#discussion_r454858808

##########
File path: docs/streams/developer-guide/config-streams.html
##########
@@ -244,10 +265,15 @@ <h4><a class="toc-backref" href="#id5">bootstrap.servers</a><a class="headerlink
  <td colspan="2">Partition grouper class that implements the <code class="docutils literal"><span class="pre">PartitionGrouper</span></code> interface.</td>
  <td>See <a class="reference internal" href="#streams-developer-guide-partition-grouper"><span class="std std-ref">Partition Grouper</span></a></td>
  </tr>
+ <tr class="row-odd"><td>probing.rebalance.interval.ms</td>
+ <td>Low</td>
+ <td colspan="2">The maximum time to wait before triggering a rebalance to probe for warmup replicas that have sufficiently caught up.</td>
+ <td>600000 milliseconds</td>

Review comment:
Should we add "10 minutes" in parentheses to make the default more readable?

##########
File path: docs/streams/developer-guide/config-streams.html
##########
@@ -63,24 +63,29 @@
  </ul>
  </li>
  <li><a class="reference internal" href="#optional-configuration-parameters" id="id6">Optional configuration parameters</a><ul>
+ <li><a class="reference internal" href="#acceptable-recovery-lag" id="id27">acceptable.recovery.lag</a></li>
  <li><a class="reference internal" href="#default-deserialization-exception-handler" id="id7">default.deserialization.exception.handler</a></li>
  <li><a class="reference internal" href="#default-production-exception-handler" id="id24">default.production.exception.handler</a></li>
  <li><a class="reference internal" href="#default-key-serde" id="id8">default.key.serde</a></li>
  <li><a class="reference internal" href="#default-value-serde" id="id9">default.value.serde</a></li>
+ <li><a class="reference internal" href="#max-task-idle-ms" id="id28">max.task.idle.ms</a></li>

Review comment:
Thank you for adding this here. FYI, the docs for `max.task.idle.ms` are also here: https://kafka.apache.org/documentation/#streamsconfigs.
This relates to the issue Matthias once brought up: we have these config docs in two places on the website, which makes them harder to maintain. Until Matthias brought up this issue, I was not aware that we had these docs twice.

##########
File path: docs/streams/developer-guide/memory-mgmt.html
##########
@@ -206,7 +206,10 @@ <h2><a class="toc-backref" href="#id3">RocksDB</a><a class="headerlink" href="#r
  <span class="o">}</span>
  <span class="o">}</span>
  </div>
- <sup id="fn1">1. INDEX_FILTER_BLOCK_RATIO can be used to set a fraction of the block cache to set aside for "high priority" (aka index and filter) blocks, preventing them from being evicted by data blocks. See the full signature of the <a class="reference external" href="https://github.com/facebook/rocksdb/blob/master/java/src/main/java/org/rocksdb/LRUCache.java#L72">LRUCache constructor</a>. </sup>
+ <sup id="fn1">1. INDEX_FILTER_BLOCK_RATIO can be used to set a fraction of the block cache to set aside for "high priority" (aka index and filter) blocks, preventing them from being evicted by data blocks. See the full signature of the <a class="reference external" href="https://github.com/facebook/rocksdb/blob/master/java/src/main/java/org/rocksdb/LRUCache.java#L72">LRUCache constructor</a>.
+ NOTE: the boolean parameter in the cache constructor lets you control whether the cache should enforce a strict memory limit by failing the read or iteration in the rare cases where it might go larger than its capacity. Due to a
+ <a class="reference external" href="https://github.com/facebook/rocksdb/issues/6247">bug in RocksDB</a>, this option cannot be used
+ if the write buffer memory is also counted against the cache. If you set this true, you should NOT pass the cache in to the WriteBufferManager and just control the write buffer and cache memory separately.</sup>

Review comment:
Did you mean to write "If you set this _to_ true" or is it also correct to say "If you set this true"?
Could you put `WriteBufferManager` into `<code>` tags?

##########
File path: docs/streams/developer-guide/config-streams.html
##########
@@ -425,6 +465,24 @@ <h4><a class="toc-backref" href="#id9">default.value.serde</a><a class="headerli
  <p>This is discussed in more detail in <a class="reference internal" href="datatypes.html#streams-developer-guide-serdes"><span class="std std-ref">Data types and serialization</span></a>.</p>
  </div></blockquote>
  </div>
+ <div class="section" id="max-task-idle-ms">
+ <span id="streams-developer-guide-max-task-idle-ms"></span><h4><a class="toc-backref" href="#id28">max.task.idle.ms</a><a class="headerlink" href="#max-task-idle-ms" title="Permalink to this headline"></a></h4>
+ <blockquote>
+ <div>
+ The maximum amount of time a task will idle without processing data when waiting for all of its input partition buffers to contain records. This can help avoid potential out-of-order
+ processing when the task has multiple input streams, as in a join, for example. Setting this to a nonzero value may increase latency but will improve time synchronization.
+ </div></blockquote>
+ </div>
+ <div class="section" id="max-warmup-replicas">
+ <span id="streams-developer-guide-max-warmup-replicas"></span><h4><a class="toc-backref" href="#id29">max.warmup.replicas</a><a class="headerlink" href="#max-warmup-replicas" title="Permalink to this headline"></a></h4>
+ <blockquote>
+ <div>
+ The maximum number of warmup replicas (extra standbys beyond the configured num.standbys) that can be assigned at once for the purpose of keeping
+ the task available on one instance while it is warming up on another instance it has been reassigned to. Used to throttle how much extra broker
+ traffic and cluster state can be used for high availability. Increasing this will allow Streams to warm up more tasks at once, speeding up the time
+ for the reassigned warmups to restore sufficient state for them to be transitioned to active tasks. Must be at least 1.
+ </div></blockquote>

Review comment:
nit: Could you put `</blockquote>` on a new line?

##########
File path: docs/streams/developer-guide/config-streams.html
##########
@@ -270,21 +296,26 @@ <h4><a class="toc-backref" href="#id5">bootstrap.servers</a><a class="headerlink
  <td colspan="2">The amount of time in milliseconds, before a request is retried. This applies if the <code class="docutils literal"><span class="pre">retries</span></code> parameter is configured to be greater than 0. </td>
  <td>100</td>
  </tr>
- <tr class="row-even"><td>rocksdb.config.setter</td>
+ <tr class="row-odd"><td>rocksdb.config.setter</td>
  <td>Medium</td>
  <td colspan="2">The RocksDB configuration.</td>
  <td></td>
  </tr>
- <tr class="row-odd"><td>state.cleanup.delay.ms</td>
+ <tr class="row-even"><td>state.cleanup.delay.ms</td>
  <td>Low</td>
  <td colspan="2">The amount of time in milliseconds to wait before deleting state when a partition has migrated.</td>
  <td>600000 milliseconds</td>
  </tr>
- <tr class="row-even"><td>state.dir</td>
+ <tr class="row-odd"><td>state.dir</td>
  <td>High</td>
  <td colspan="2">Directory location for state stores.</td>
  <td><code class="docutils literal"><span class="pre">/tmp/kafka-streams</span></code></td>
  </tr>
+ <tr class="row-odd"><td>topology.optimization</td>

Review comment:
Shouldn't that be `row-even`?