[GitHub] [kafka] showuon commented on pull request #12274: KAFKA-13959: Controller should unfence Broker with busy metadata log
showuon commented on PR #12274: URL: https://github.com/apache/kafka/pull/12274#issuecomment-1176986634

> I think this will unfence the broker at startup even if the broker hasn't applied the snapshot or any of the log records, right?

Currently, we replay the metadata records when the metadata listener gets new records. So yes, if we just return the current LEO, the records/snapshots might not have been applied yet. Sorry, it's easy to reject someone else's proposal but difficult to come up with another solution. If we don't have any better solution, maybe we can try the originally proposed one:

```
One solution to this problem is to require the broker to only catch up to the last committed offset when they last sent the heartbeat. For example:

Broker sends a heartbeat with current offset of Y. The last commit offset is X. The controller remembers this last commit offset, call it X'.
Broker sends another heartbeat with current offset of Z. Unfence the broker if Z >= X or Z >= X'.
```

And again, thanks for continuing to try to fix this difficult issue, @dengziming !

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
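The quoted proposal could be sketched roughly as follows. This is only an illustrative sketch of the unfencing rule being discussed, not Kafka's actual controller code; the `UnfenceTracker` class, the `shouldUnfence` method, and the map-based bookkeeping are all hypothetical names invented for this example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the proposed unfencing check: remember the
// committed offset X seen at each broker's previous heartbeat (X'), and
// unfence once the broker's current offset Z reaches X or X'.
public class UnfenceTracker {
    // Committed metadata offset observed when each broker last sent a heartbeat.
    private final Map<Integer, Long> committedOffsetAtLastHeartbeat = new HashMap<>();

    /**
     * @param brokerId        the heartbeating broker
     * @param brokerOffset    the broker's current metadata offset (Z in the proposal)
     * @param committedOffset the controller's last committed offset (X in the proposal)
     * @return whether the broker may be unfenced
     */
    public boolean shouldUnfence(int brokerId, long brokerOffset, long committedOffset) {
        // X' is the committed offset remembered from the previous heartbeat.
        Long priorCommitted = committedOffsetAtLastHeartbeat.get(brokerId);
        // Remember X so the next heartbeat can compare against it.
        committedOffsetAtLastHeartbeat.put(brokerId, committedOffset);
        if (priorCommitted == null) {
            // First heartbeat: only the current committed offset is available.
            return brokerOffset >= committedOffset;
        }
        return brokerOffset >= committedOffset || brokerOffset >= priorCommitted;
    }
}
```

The point of comparing against X' rather than only the live committed offset is that a busy metadata log keeps advancing X between heartbeats, so a broker chasing the current LEO might never catch up; catching up to the offset that was committed at the previous heartbeat is a stable target.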
showuon commented on PR #12274: URL: https://github.com/apache/kafka/pull/12274#issuecomment-1155888453

Yes, I think this is the best solution under the current design, although it defeats the purpose of `fetchPurgatory` when there are temporarily no new records in the metadata log. I agree with Jason that we can adopt this solution now and create another JIRA aimed at a solution that is not so sensitive to timing-related configurations.
showuon commented on PR #12274: URL: https://github.com/apache/kafka/pull/12274#issuecomment-1151085631

@dengziming , I like this idea for fixing the issue. You dug deeper than I did, nice finding! But in this case, I think we should increase `MetadataMaxIdleIntervalMs` to something like 800ms or 1000ms, instead of reducing the fetch wait time. After all, reducing the fetch wait time will make the client and broker busier. But let's wait for the Jenkins results to see if it really fixes the issue. (I think your investigation is correct.)
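For context, `MetadataMaxIdleIntervalMs` here refers to how often the KRaft controller appends NoOpRecords to keep the metadata log advancing. If the equivalent change were made in a deployment rather than in test constants, it would presumably go through the controller configuration; the sketch below assumes the KIP-835 config name `metadata.max.idle.interval.ms` and its 500 ms default.

```properties
# controller.properties (sketch; assumes the KIP-835 config name)
# Raise the NoOpRecord append interval from the 500 ms default to 1000 ms,
# making the metadata log less busy instead of shortening the fetch wait time.
metadata.max.idle.interval.ms=1000
```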