Re: [PR] MINOR: Only commit running active and standby tasks when tasks corrupted [kafka]

2023-10-16 Thread via GitHub


cadonna commented on code in PR #14508:
URL: https://github.com/apache/kafka/pull/14508#discussion_r1360616252


##
streams/src/main/java/org/apache/kafka/streams/processor/internals/TaskManager.java:
##
@@ -223,10 +223,7 @@ boolean handleCorruption(final Set corruptedTasks) {
 final Collection tasksToCommit = allTasks()
 .values()
 .stream()
-// TODO: once we remove state restoration from the stream thread, we can also remove
-//  the RESTORING state here, since there will not be any restoring tasks managed
-//  by the stream thread anymore.
-.filter(t -> t.state() == Task.State.RUNNING || t.state() == Task.State.RESTORING)
+.filter(t -> t.state() == Task.State.RUNNING)

Review Comment:
   @guozhangwang as far as I can see in the code, a restoring active task does 
not return `true` from `commitNeeded()`. Thus, `postCommit()` is never called 
here. Do you agree? 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] MINOR: Only commit running active and standby tasks when tasks corrupted [kafka]

2023-10-13 Thread via GitHub


guozhangwang commented on code in PR #14508:
URL: https://github.com/apache/kafka/pull/14508#discussion_r1359038910


##
streams/src/main/java/org/apache/kafka/streams/processor/internals/TaskManager.java:
##
@@ -223,10 +223,7 @@ boolean handleCorruption(final Set corruptedTasks) {
 final Collection tasksToCommit = allTasks()
 .values()
 .stream()
-// TODO: once we remove state restoration from the stream thread, we can also remove
-//  the RESTORING state here, since there will not be any restoring tasks managed
-//  by the stream thread anymore.
-.filter(t -> t.state() == Task.State.RUNNING || t.state() == Task.State.RESTORING)
+.filter(t -> t.state() == Task.State.RUNNING)

Review Comment:
   Hey @cadonna, sorry I came late to this PR.
   
   One thing I'd like to raise is that in the past, we've seen active task 
restoration never complete under rolling restart / rebalance storm scenarios, 
since we kept losing the progress made so far when reviving. I'm not 100% 
sure whether this part of the code is related to that scenario; I just want to 
double check. If you have thought about it and concluded this is not 
related, I'm relieved :) 






Re: [PR] MINOR: Only commit running active and standby tasks when tasks corrupted [kafka]

2023-10-12 Thread via GitHub


cadonna merged PR #14508:
URL: https://github.com/apache/kafka/pull/14508





Re: [PR] MINOR: Only commit running active and standby tasks when tasks corrupted [kafka]

2023-10-12 Thread via GitHub


cadonna commented on code in PR #14508:
URL: https://github.com/apache/kafka/pull/14508#discussion_r1356667200


##
streams/src/main/java/org/apache/kafka/streams/processor/internals/TaskManager.java:
##
@@ -223,10 +223,7 @@ boolean handleCorruption(final Set corruptedTasks) {
 final Collection tasksToCommit = allTasks()

Review Comment:
   I am not sure if we strictly need to do it because, as you say, standby tasks 
have nothing to do with the ongoing transaction. I was merely referring to what 
the old code path does. 






Re: [PR] MINOR: Only commit running active and standby tasks when tasks corrupted [kafka]

2023-10-12 Thread via GitHub


lucasbru commented on code in PR #14508:
URL: https://github.com/apache/kafka/pull/14508#discussion_r1356569841


##
streams/src/main/java/org/apache/kafka/streams/processor/internals/TaskManager.java:
##
@@ -223,10 +223,7 @@ boolean handleCorruption(final Set corruptedTasks) {
 final Collection tasksToCommit = allTasks()

Review Comment:
   Actually, not so sure I understand this. Why do we want to checkpoint 
non-corrupted standby tasks here? The comment says `since this will force the 
ongoing txn to abort`, but what do standby tasks have to do with the ongoing 
txn?






Re: [PR] MINOR: Only commit running active and standby tasks when tasks corrupted [kafka]

2023-10-11 Thread via GitHub


lucasbru commented on code in PR #14508:
URL: https://github.com/apache/kafka/pull/14508#discussion_r1355337776


##
streams/src/main/java/org/apache/kafka/streams/processor/internals/TaskManager.java:
##
@@ -223,10 +223,7 @@ boolean handleCorruption(final Set corruptedTasks) {
 final Collection tasksToCommit = allTasks()

Review Comment:
   Oh, right. Forgot the checkpointing is piggy-backed here. 






Re: [PR] MINOR: Only commit running active and standby tasks when tasks corrupted [kafka]

2023-10-11 Thread via GitHub


cadonna commented on code in PR #14508:
URL: https://github.com/apache/kafka/pull/14508#discussion_r1354810862


##
streams/src/main/java/org/apache/kafka/streams/processor/internals/TaskManager.java:
##
@@ -223,10 +223,7 @@ boolean handleCorruption(final Set corruptedTasks) {
 final Collection tasksToCommit = allTasks()

Review Comment:
   Actually, we want to commit (actually checkpoint) non-corrupted standby 
tasks which are owned by the state updater.
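
   The distinction between the two lookups can be sketched as follows. This is a simplified, hypothetical model and not the actual `TaskManager` code: the class, the method names, and the use of plain maps are assumptions made for illustration. It only shows why a lookup restricted to the stream thread's own registry would miss standby tasks that are held by the state updater.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical model: task id -> task type. Not the real Kafka Streams classes.
public class AllTasksSketch {

    // Union of the tasks held by the stream thread and by the state updater.
    static Map<String, String> allTasks(final Map<String, String> streamThreadTasks,
                                        final Map<String, String> stateUpdaterTasks) {
        final Map<String, String> all = new TreeMap<>(streamThreadTasks);
        all.putAll(stateUpdaterTasks);
        return all;
    }

    public static void main(final String[] args) {
        final Map<String, String> streamThreadTasks = Map.of("0_0", "active");
        final Map<String, String> stateUpdaterTasks = Map.of("0_1", "standby");

        // Only the stream thread's registry: the standby task is not visible.
        System.out.println(streamThreadTasks.keySet());
        // Union of both owners: the standby task can be selected for checkpointing.
        System.out.println(allTasks(streamThreadTasks, stateUpdaterTasks).keySet());
    }
}
```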






Re: [PR] MINOR: Only commit running active and standby tasks when tasks corrupted [kafka]

2023-10-11 Thread via GitHub


lucasbru commented on code in PR #14508:
URL: https://github.com/apache/kafka/pull/14508#discussion_r1354615069


##
streams/src/main/java/org/apache/kafka/streams/processor/internals/TaskManager.java:
##
@@ -223,10 +223,7 @@ boolean handleCorruption(final Set corruptedTasks) {
 final Collection tasksToCommit = allTasks()

Review Comment:
   We could still use `tasks.allTasks()` here, since we certainly do not want 
to process tasks owned by the state updater, right? That would seem cleaner to me.






Re: [PR] MINOR: Only commit running active and standby tasks when tasks corrupted [kafka]

2023-10-10 Thread via GitHub


cadonna commented on code in PR #14508:
URL: https://github.com/apache/kafka/pull/14508#discussion_r1351723672


##
streams/src/main/java/org/apache/kafka/streams/processor/internals/TaskManager.java:
##
@@ -223,10 +223,7 @@ boolean handleCorruption(final Set corruptedTasks) {
 final Collection tasksToCommit = allTasks()
 .values()
 .stream()
-// TODO: once we remove state restoration from the stream thread, we can also remove
-//  the RESTORING state here, since there will not be any restoring tasks managed
-//  by the stream thread anymore.
-.filter(t -> t.state() == Task.State.RUNNING || t.state() == Task.State.RESTORING)
+.filter(t -> t.state() == Task.State.RUNNING)

Review Comment:
   Since restoring active tasks return false from `commitNeeded()` because they 
have never processed records and have never executed a punctuation, 
`preCommit()` and `postCommit()` are [never called on restoring active tasks in 
this specific 
code](https://github.com/apache/kafka/blob/c32d2338a7e0079e539b74eb16f0095380a1ce85/streams/src/main/java/org/apache/kafka/streams/processor/internals/TaskExecutor.java#L141).
 This is true whether the state updater is enabled or disabled.
   Additionally, as far as I know, there is nothing to flush in a restoring 
active task, because restoration uses the state restore callback. In any case, 
the flush is never called, for the reason I pointed out above.
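
   The argument above can be illustrated with a minimal sketch. It is a hypothetical model, not the real `Task` or `TaskExecutor` code: `SketchTask`, its fields, and the `commit` helper are invented for illustration, and the rule that a restoring task never reports `commitNeeded()` is taken from the comment above rather than from the Kafka sources. Under that assumption, including or excluding `RESTORING` in the filter commits the same set of tasks.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

// Simplified, hypothetical model; these are NOT the real Kafka Streams classes.
public class CommitFilterSketch {
    enum State { RESTORING, RUNNING }

    static final class SketchTask {
        final String id;
        final State state;
        final boolean processedSinceLastCommit;
        final List<String> calls = new ArrayList<>();

        SketchTask(final String id, final State state, final boolean processedSinceLastCommit) {
            this.id = id;
            this.state = state;
            this.processedSinceLastCommit = processedSinceLastCommit;
        }

        // Assumption: a restoring task has processed nothing, so it never needs a commit.
        boolean commitNeeded() {
            return state == State.RUNNING && processedSinceLastCommit;
        }

        void preCommit() { calls.add("preCommit"); }

        void postCommit() { calls.add("postCommit"); }
    }

    // Selects tasks by state, then commits only those that report commitNeeded().
    static List<String> commit(final List<SketchTask> tasks, final boolean includeRestoring) {
        final List<SketchTask> selected = tasks.stream()
            .filter(t -> t.state == State.RUNNING
                || (includeRestoring && t.state == State.RESTORING))
            .collect(Collectors.toList());
        final List<String> committed = new ArrayList<>();
        for (final SketchTask task : selected) {
            if (task.commitNeeded()) {
                task.preCommit();
                task.postCommit();
                committed.add(task.id);
            }
        }
        return committed;
    }

    public static void main(final String[] args) {
        final List<SketchTask> tasks = List.of(
            new SketchTask("0_0", State.RUNNING, true),
            new SketchTask("0_1", State.RESTORING, false));
        System.out.println(commit(tasks, true));  // old filter: prints [0_0]
        System.out.println(commit(tasks, false)); // new filter: prints [0_0]
    }
}
```

   In this model the restoring task `0_1` never records a `preCommit` or `postCommit` call, whichever filter is used.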






Re: [PR] MINOR: Only commit running active and standby tasks when tasks corrupted [kafka]

2023-10-09 Thread via GitHub


mjsax commented on code in PR #14508:
URL: https://github.com/apache/kafka/pull/14508#discussion_r1350625946


##
streams/src/main/java/org/apache/kafka/streams/processor/internals/TaskManager.java:
##
@@ -223,10 +223,7 @@ boolean handleCorruption(final Set corruptedTasks) {
 final Collection tasksToCommit = allTasks()
 .values()
 .stream()
-// TODO: once we remove state restoration from the stream thread, we can also remove
-//  the RESTORING state here, since there will not be any restoring tasks managed
-//  by the stream thread anymore.
-.filter(t -> t.state() == Task.State.RUNNING || t.state() == Task.State.RESTORING)
+.filter(t -> t.state() == Task.State.RUNNING)

Review Comment:
   I was just reading the TODO without thinking much about it... -- I guess we 
might still want to flush restoring tasks and write the checkpoint file (which 
is part of a commit) -- so should we execute `preCommit()` and `postCommit()` 
for those? -- I agree that we won't have input topic offsets to be committed 
(and there should not be any TX).






Re: [PR] MINOR: Only commit running active and standby tasks when tasks corrupted [kafka]

2023-10-09 Thread via GitHub


cadonna commented on code in PR #14508:
URL: https://github.com/apache/kafka/pull/14508#discussion_r1350271668


##
streams/src/main/java/org/apache/kafka/streams/processor/internals/TaskManager.java:
##
@@ -223,10 +223,7 @@ boolean handleCorruption(final Set corruptedTasks) {
 final Collection tasksToCommit = allTasks()
 .values()
 .stream()
-// TODO: once we remove state restoration from the stream thread, we can also remove
-//  the RESTORING state here, since there will not be any restoring tasks managed
-//  by the stream thread anymore.
-.filter(t -> t.state() == Task.State.RUNNING || t.state() == Task.State.RESTORING)
+.filter(t -> t.state() == Task.State.RUNNING)

Review Comment:
   I do not think that this is needed. Don't you agree that restoring active 
tasks do not need to be committed -- with or without the state updater?






Re: [PR] MINOR: Only commit running active and standby tasks when tasks corrupted [kafka]

2023-10-06 Thread via GitHub


mjsax commented on code in PR #14508:
URL: https://github.com/apache/kafka/pull/14508#discussion_r1349252619


##
streams/src/main/java/org/apache/kafka/streams/processor/internals/TaskManager.java:
##
@@ -223,10 +223,7 @@ boolean handleCorruption(final Set corruptedTasks) {
 final Collection tasksToCommit = allTasks()
 .values()
 .stream()
-// TODO: once we remove state restoration from the stream thread, we can also remove
-//  the RESTORING state here, since there will not be any restoring tasks managed
-//  by the stream thread anymore.
-.filter(t -> t.state() == Task.State.RUNNING || t.state() == Task.State.RESTORING)
+.filter(t -> t.state() == Task.State.RUNNING)

Review Comment:
   Given that we still have a feature flag, should we make this condition more 
complex and consider if the state updater thread is enabled or not, and check 
different conditions for both cases?
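
   A flag-dependent condition of the kind suggested here might look roughly like the sketch below. Everything in it is hypothetical: `commitCandidate`, the `stateUpdaterEnabled` parameter, and the local `State` enum are stand-ins chosen for illustration, not names from the Kafka code base, and the sketch does not claim that such a split is actually required.

```java
import java.util.function.Predicate;

// Hypothetical sketch of a filter that depends on the feature flag.
public class FlagAwareFilterSketch {
    enum State { RESTORING, RUNNING }

    // stateUpdaterEnabled stands in for the feature flag mentioned above.
    static Predicate<State> commitCandidate(final boolean stateUpdaterEnabled) {
        if (stateUpdaterEnabled) {
            // Assumption: restoring tasks are then not managed by the stream thread.
            return state -> state == State.RUNNING;
        }
        // Old code path: restoring tasks are also considered.
        return state -> state == State.RUNNING || state == State.RESTORING;
    }

    public static void main(final String[] args) {
        System.out.println(commitCandidate(true).test(State.RESTORING));  // prints false
        System.out.println(commitCandidate(false).test(State.RESTORING)); // prints true
        System.out.println(commitCandidate(true).test(State.RUNNING));    // prints true
    }
}
```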


