Re: [PR] KAFKA-15776: Use the FETCH request timeout as the delay timeout for DelayedRemoteFetch [kafka]

2024-06-04 Thread via GitHub


showuon commented on PR #14778:
URL: https://github.com/apache/kafka/pull/14778#issuecomment-2148598853

   @kamalcph , could we move the dynamic config change into another PR? I have 
some comments to it, but that is separate from the original changes.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] KAFKA-15776: Use the FETCH request timeout as the delay timeout for DelayedRemoteFetch [kafka]

2024-06-04 Thread via GitHub


showuon commented on code in PR #14778:
URL: https://github.com/apache/kafka/pull/14778#discussion_r1625827494


##
clients/src/main/java/org/apache/kafka/clients/consumer/ConsumerConfig.java:
##
@@ -198,7 +198,7 @@ public class ConsumerConfig extends AbstractConfig {
  * fetch.max.wait.ms
  */
 public static final String FETCH_MAX_WAIT_MS_CONFIG = "fetch.max.wait.ms";
-private static final String FETCH_MAX_WAIT_MS_DOC = "The maximum amount of 
time the server will block before answering the fetch request if there isn't 
sufficient data to immediately satisfy the requirement given by 
fetch.min.bytes.";
+private static final String FETCH_MAX_WAIT_MS_DOC = "The maximum amount of 
time the server will block before answering the fetch request when it is 
reading near to the tail of the partition (high-watermark) and there isn't 
sufficient data to immediately satisfy the requirement given by 
fetch.min.bytes.";

Review Comment:
   Should we mention this config is only used for local log fetch? 
   Also, we should mention this in the end:
   `To tune the remote fetch maximum time wait, please refer to 
"remote.fetch.max.wait.ms" config.`. WDYT?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] KAFKA-15776: Use the FETCH request timeout as the delay timeout for DelayedRemoteFetch [kafka]

2024-06-04 Thread via GitHub


showuon commented on code in PR #14778:
URL: https://github.com/apache/kafka/pull/14778#discussion_r1625827494


##
clients/src/main/java/org/apache/kafka/clients/consumer/ConsumerConfig.java:
##
@@ -198,7 +198,7 @@ public class ConsumerConfig extends AbstractConfig {
  * fetch.max.wait.ms
  */
 public static final String FETCH_MAX_WAIT_MS_CONFIG = "fetch.max.wait.ms";
-private static final String FETCH_MAX_WAIT_MS_DOC = "The maximum amount of 
time the server will block before answering the fetch request if there isn't 
sufficient data to immediately satisfy the requirement given by 
fetch.min.bytes.";
+private static final String FETCH_MAX_WAIT_MS_DOC = "The maximum amount of 
time the server will block before answering the fetch request when it is 
reading near to the tail of the partition (high-watermark) and there isn't 
sufficient data to immediately satisfy the requirement given by 
fetch.min.bytes.";

Review Comment:
   Should we mention this config is only used for local log fetch? 
   Also, we should mention this in the end:
   `To tune the remote fetch maximum time wait, please refer to 
"remote.fetch.max.wait.ms" config.`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] KAFKA-15776: Use the FETCH request timeout as the delay timeout for DelayedRemoteFetch [kafka]

2024-06-04 Thread via GitHub


kamalcph commented on PR #14778:
URL: https://github.com/apache/kafka/pull/14778#issuecomment-2147168785

   @showuon @clolov @jeqo 
   
   The patch is ready for review. PTAL. 
   
   Will open a separate PR for dynamic `remote.fetch.max.wait.ms` config and 
RemoteLogReader FetchRateAndTimeMs metrics.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] KAFKA-15776: Use the FETCH request timeout as the delay timeout for DelayedRemoteFetch [kafka]

2024-04-12 Thread via GitHub


showuon commented on PR #14778:
URL: https://github.com/apache/kafka/pull/14778#issuecomment-2051465918

   Correction: For this:
   > some users might feel surprised when their fetch doesn't respond in 
fetch.max.wait.ms time.
   
   This is wrong. Even if the remote reading is not completed, yet, the fetch 
request will still return in `fetch.max.wait.ms`. It's just an empty response.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] KAFKA-15776: Use the FETCH request timeout as the delay timeout for DelayedRemoteFetch [kafka]

2024-02-19 Thread via GitHub


github-actions[bot] commented on PR #14778:
URL: https://github.com/apache/kafka/pull/14778#issuecomment-1953432771

   This PR is being marked as stale since it has not had any activity in 90 
days. If you would like to keep this PR alive, please ask a committer for 
review. If the PR has  merge conflicts, please update it with the latest from 
trunk (or appropriate release branch)  If this PR is no longer valid or 
desired, please feel free to close it. If no activity occurs in the next 30 
days, it will be automatically closed.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] KAFKA-15776: Use the FETCH request timeout as the delay timeout for DelayedRemoteFetch [kafka]

2023-11-21 Thread via GitHub


showuon commented on PR #14778:
URL: https://github.com/apache/kafka/pull/14778#issuecomment-1820588080

   I understand the problem you're trying to solve, but using the server 
default request timeout doesn't make sense to me. It will break the contract of 
fetch protocol that `fetch.max.wait.ms` will not be exceeded if no sufficient 
data in "local" log. I understand the remote read is some kind of grey area 
about if "data is existed or not", but we have to admit, some users might feel 
surprised when their fetch doesn't respond in `fetch.max.wait.ms` time. 
Ideally, we should introduce another config for this remote read waiting 
purpose, instead of re-using request timeout. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] KAFKA-15776: Use the FETCH request timeout as the delay timeout for DelayedRemoteFetch [kafka]

2023-11-19 Thread via GitHub


kamalcph commented on PR #14778:
URL: https://github.com/apache/kafka/pull/14778#issuecomment-1818291232

   > Could you please help me understand how this change works with 
fetch.max.wait.ms from a user perspective i.e. what happens when we are 
retrieving data from both local & remote in a single fetch call?
   
   `fetch.max.wait.ms` timeout is applicable only when there is no enough data 
(`fetch.min.bytes`) to respond back to the client. This is a special case where 
we are reading the data from both local and remote, the FETCH request has to 
wait for the tail latency which is a combined latency of reading from both 
local and remote storage. 
   
   Note that we always read from only one remote partition up-to 
`max.partition.fetch.bytes` even-though there is available bandwidth in the 
FETCH response (`fetch.max.bytes`) and the client rotates the partition order 
in the next FETCH request so that next partitions are served.
   
   > Also, wouldn't this change user clients? Asking because prior to this 
change users were expecting a guaranteed response within fetch.max.wait.ms = 
500ms but now they might not receive a response until 40s request.timeout.ms. 
If the user has configured their application timeouts to according to 
fetch.max.wait.ms, this change will break my application.
   
   `fetch.max.wait.ms` doesn't guarantee a response within this timeout. The 
client expires the request only when it exceeds the `request.timeout.ms` of 30 
seconds (default). The time taken to serve the FETCH request can be higher than 
the `fetch.max.wait.ms` due to slow hard-disk, sector errors in disk and so on.
   
   The 
[FetchRequest.json](https://sourcegraph.com/github.com/apache/kafka/-/blob/clients/src/main/resources/common/message/FetchRequest.json)
 doesn't expose the client configured request timeout, so we are using the 
default server request timeout of 30 seconds. Otherwise, we can introduce one 
more config `fetch.remote.max.wait.ms` to define the delay timeout for 
DelayedRemoteFetch requests. We need to decide whether to keep this config in 
the client/server since the server operator may need to tune this config if the 
remote storage degrades and latency to serve the FETCH requests is high.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] KAFKA-15776: Use the FETCH request timeout as the delay timeout for DelayedRemoteFetch [kafka]

2023-11-18 Thread via GitHub


divijvaidya commented on PR #14778:
URL: https://github.com/apache/kafka/pull/14778#issuecomment-1817593722

   Could you please help me understand how this change works with 
`fetch.max.wait.ms` from a user perspective i.e. what happens when we are 
retrieving data from both local & remote in a single fetch call?
   
   Also, wouldn't this change user clients? Asking because prior to this change 
users were expecting a guaranteed response within `fetch.max.wait.ms` = 500ms 
but now they might not receive a response until 40s `request.timeout.ms`. If 
the user has configured their application timeouts to according to 
`fetch.max.wait.ms`, this change will break my application. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] KAFKA-15776: Use the FETCH request timeout as the delay timeout for DelayedRemoteFetch [kafka]

2023-11-16 Thread via GitHub


kamalcph commented on code in PR #14778:
URL: https://github.com/apache/kafka/pull/14778#discussion_r1395458516


##
core/src/main/scala/kafka/server/ReplicaManager.scala:
##
@@ -1338,7 +1338,8 @@ class ReplicaManager(val config: KafkaConfig,
 return Some(createLogReadResult(e))
 }
 
-val remoteFetch = new DelayedRemoteFetch(remoteFetchTask, 
remoteFetchResult, remoteFetchInfo,
+val timeout = config.requestTimeoutMs.toLong
+val remoteFetch = new DelayedRemoteFetch(remoteFetchTask, 
remoteFetchResult, remoteFetchInfo, timeout,

Review Comment:
   Passing the full request timeout to the `DelayedRemoteFetch`, we can reduce 
the time spent by the handler from the `requestTimeout` to reach up-to here to 
expire the delayedRemoteFetch requests immediately if required.
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] KAFKA-15776: Use the FETCH request timeout as the delay timeout for DelayedRemoteFetch [kafka]

2023-11-16 Thread via GitHub


kamalcph commented on code in PR #14778:
URL: https://github.com/apache/kafka/pull/14778#discussion_r1395458516


##
core/src/main/scala/kafka/server/ReplicaManager.scala:
##
@@ -1338,7 +1338,8 @@ class ReplicaManager(val config: KafkaConfig,
 return Some(createLogReadResult(e))
 }
 
-val remoteFetch = new DelayedRemoteFetch(remoteFetchTask, 
remoteFetchResult, remoteFetchInfo,
+val timeout = config.requestTimeoutMs.toLong
+val remoteFetch = new DelayedRemoteFetch(remoteFetchTask, 
remoteFetchResult, remoteFetchInfo, timeout,

Review Comment:
   Passing the full request timeout to the `DelayedRemoteFetch`, we can reduce 
the time spent by the handler from the `requestTimeout` to reach up-to here to 
expire the delayedRemoteFetch requests immediately if required.
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[PR] KAFKA-15776: Use the FETCH request timeout as the delay timeout for DelayedRemoteFetch [kafka]

2023-11-16 Thread via GitHub


kamalcph opened a new pull request, #14778:
URL: https://github.com/apache/kafka/pull/14778

   DelayedRemoteFetch uses `fetch.max.wait.ms` config as a delay timeout for 
DelayedRemoteFetchPurgatory. `fetch.max.wait.ms` purpose is to wait for the 
given amount of time when there is no data available to serve the FETCH request.
   
   ```
   The maximum amount of time the server will block before answering the fetch 
request if there isn't sufficient data to immediately satisfy the requirement 
given by fetch.min.bytes.
   ```
   
   Using the same timeout in the DelayedRemoteFetchPurgatory can confuse the 
user on how to configure optimal value for each purpose. Moreover, the config 
is of LOW importance and most of the users won't configure it and use the 
default value of 500 ms.
   
   Having the delay timeout of 500 ms in DelayedRemoteFetchPurgatory can lead 
to higher number of expired delayed remote fetch requests when the remote 
storage have any degradation.
   
   Test: Existing unit and integration tests
   
   ### Committer Checklist (excluded from commit message)
   - [ ] Verify design and implementation 
   - [ ] Verify test coverage and CI build status
   - [ ] Verify documentation (including upgrade notes)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org