chickenchickenlove commented on PR #20969: URL: https://github.com/apache/kafka/pull/20969#issuecomment-3827252946
@mjsax @lucliu1108 Thanks everyone for the discussion, and thanks @lucliu1108 for outlining the options 🙇‍♂️

On why I initially picked option (2): I wanted to keep this PR as small and non-semantic as possible, i.e. only avoid a misleading WARN when the state directory can't be deleted solely because the process metadata file remains. Since that file was introduced in KAFKA-10716 to provide a stable process identity across restarts, I was hesitant to change whether it gets removed.

I agree we should first align on what `KafkaStreams#cleanUp` is intended to mean (KAFKA-17251). If we decide `cleanUp` must fully reset the instance identity, then option (1) makes sense, but it would also result in a new processId on the next start, potentially increasing task shuffling after the restart. Depending on how we define the `cleanUp` semantics, option (1) could also be an observable behavior change for users, so it might warrant a KIP (or at least an explicit compatibility discussion), unless we consider the current behavior a bug relative to the intended semantics.

Option (3) is also attractive since it makes the semantics explicit, though it likely requires a KIP.

As additional context, KAFKA-15190 doesn't define `cleanUp` semantics, but it does highlight how important a stable, controllable process identity is for predictable assignments, especially in environments without persistent local storage. That makes me lean toward preserving the identity by default unless we explicitly decide otherwise.

Happy to adjust the PR based on the consensus!

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
