[jira] [Commented] (KAFKA-14294) Kafka Streams should commit transaction when no records are processed

2023-08-30 Thread A. Sophie Blee-Goldman (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-14294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17760632#comment-17760632
 ] 

A. Sophie Blee-Goldman commented on KAFKA-14294:


I'm not sure that workaround would help, as it follows the same mechanism for 
requesting a commit as the actual invocation of the punctuator, which is what 
is "broken" by this bug. Specifically, both a punctuator invocation and the 
`context.commit` API just set the `commitNeeded` flag. However the bug here is 
that this flag is never checked at all, and we skip over the commit entirely 
when there are no new input records processed. The only guarantee is really to 
upgrade to a version with the fix, ie 3.4 or above

> Kafka Streams should commit transaction when no records are processed
> -
>
> Key: KAFKA-14294
> URL: https://issues.apache.org/jira/browse/KAFKA-14294
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Affects Versions: 3.1.0, 3.2.1
>Reporter: Vicky Papavasileiou
>Assignee: A. Sophie Blee-Goldman
>Priority: Major
> Fix For: 3.4.0
>
> Attachments: image-2023-02-08-10-22-20-456.png
>
>
> Currently, if there are no records to process in the input topic, a 
> transaction does not commit. If a custom punctuator code is writing to a 
> state store (which is common practice) the producer gets fenced when trying 
> to write to the changelog topic. This throws a TaskMigratedException and 
> causes a rebalance. 
> A better approach would be to commit a transaction even when there are no 
> records processed as to allow the punctuator to make progress. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (KAFKA-14294) Kafka Streams should commit transaction when no records are processed

2023-02-08 Thread Sylvain Le Gouellec (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-14294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17685784#comment-17685784
 ] 

Sylvain Le Gouellec commented on KAFKA-14294:
-

*Workaround*

Developer can explicitly request to commit the transaction using 
{code:java}
context.commit(); {code}
after each punctate handler triggered.

> Kafka Streams should commit transaction when no records are processed
> -
>
> Key: KAFKA-14294
> URL: https://issues.apache.org/jira/browse/KAFKA-14294
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Affects Versions: 3.2.1
>Reporter: Vicky Papavasileiou
>Assignee: A. Sophie Blee-Goldman
>Priority: Major
> Fix For: 3.4.0
>
> Attachments: image-2023-02-08-10-22-20-456.png
>
>
> Currently, if there are no records to process in the input topic, a 
> transaction does not commit. If a custom punctuator code is writing to a 
> state store (which is common practice) the producer gets fenced when trying 
> to write to the changelog topic. This throws a TaskMigratedException and 
> causes a rebalance. 
> A better approach would be to commit a transaction even when there are no 
> records processed as to allow the punctuator to make progress. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (KAFKA-14294) Kafka Streams should commit transaction when no records are processed

2022-10-21 Thread A. Sophie Blee-Goldman (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-14294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17622498#comment-17622498
 ] 

A. Sophie Blee-Goldman commented on KAFKA-14294:


Mm, yeah this sounds similar to what I found in 
https://issues.apache.org/jira/browse/KAFKA-13295. Seems like we have a general 
vulnerability to uncommitted transactions. Unfortunately with the nontrivial 
overhead of committing, it's tricky to strike the right balance between keeping 
the producers alive and over-committing to the point of serious performance 
loss :/

 

Perhaps we need a better overall solution to the problem rather than playing 
whack-a-mole as issues like this come up. Fortunately, I actually think our 
work around decoupling the consumer from the processing thread(s) may naturally 
fix this, if we get it right.

 

I'm still in talks with [~guozhang] about some of the design details, including 
where the producers should fit into the new architecture. This is definitely 
some good food for thought there, if we're smart about it hopefully we can 
avoid any more issues like this with low(er) cost commits on a regular basis. 

> Kafka Streams should commit transaction when no records are processed
> -
>
> Key: KAFKA-14294
> URL: https://issues.apache.org/jira/browse/KAFKA-14294
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Affects Versions: 3.2.1
>Reporter: Vicky Papavasileiou
>Priority: Major
>
> Currently, if there are no records to process in the input topic, a 
> transaction does not commit. If a custom punctuator code is writing to a 
> state store (which is common practice) the producer gets fenced when trying 
> to write to the changelog topic. This throws a TaskMigratedException and 
> causes a rebalance. 
> A better approach would be to commit a transaction even when there are no 
> records processed as to allow the punctuator to make progress. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (KAFKA-14294) Kafka Streams should commit transaction when no records are processed

2022-11-07 Thread Matthias J. Sax (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-14294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17630098#comment-17630098
 ] 

Matthias J. Sax commented on KAFKA-14294:
-

Why do you think we don't commit? Based on the code, we should commit: 
StreamTask has a flag `commitNeeded` that is set to `true` after a punctuation 
executed.

> Kafka Streams should commit transaction when no records are processed
> -
>
> Key: KAFKA-14294
> URL: https://issues.apache.org/jira/browse/KAFKA-14294
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Affects Versions: 3.2.1
>Reporter: Vicky Papavasileiou
>Priority: Major
>
> Currently, if there are no records to process in the input topic, a 
> transaction does not commit. If a custom punctuator code is writing to a 
> state store (which is common practice) the producer gets fenced when trying 
> to write to the changelog topic. This throws a TaskMigratedException and 
> causes a rebalance. 
> A better approach would be to commit a transaction even when there are no 
> records processed as to allow the punctuator to make progress. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (KAFKA-14294) Kafka Streams should commit transaction when no records are processed

2022-11-08 Thread Bruno Cadonna (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-14294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17630328#comment-17630328
 ] 

Bruno Cadonna commented on KAFKA-14294:
---

Streams does not commit a transaction if no records have been consumed from the 
input topic. See 
https://github.com/apache/kafka/blob/cd4a1cb4101abcc7cdd1d2d0d73662114108f3e5/streams/src/main/java/org/apache/kafka/streams/processor/internals/TaskExecutor.java#L182
It does not matter whether {{commitNeeded}} is set to true or not.

If a punctuator writes to a state store but not records are consumed from the 
input topic, a transaction is started but it is never committed. 
  

> Kafka Streams should commit transaction when no records are processed
> -
>
> Key: KAFKA-14294
> URL: https://issues.apache.org/jira/browse/KAFKA-14294
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Affects Versions: 3.2.1
>Reporter: Vicky Papavasileiou
>Priority: Major
>
> Currently, if there are no records to process in the input topic, a 
> transaction does not commit. If a custom punctuator code is writing to a 
> state store (which is common practice) the producer gets fenced when trying 
> to write to the changelog topic. This throws a TaskMigratedException and 
> causes a rebalance. 
> A better approach would be to commit a transaction even when there are no 
> records processed as to allow the punctuator to make progress. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (KAFKA-14294) Kafka Streams should commit transaction when no records are processed

2022-11-08 Thread A. Sophie Blee-Goldman (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-14294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17630678#comment-17630678
 ] 

A. Sophie Blee-Goldman commented on KAFKA-14294:


I guess we should just remove that check? ie first go through all the tasks and 
check whether any of them have commitNeeded = true, and if so we still need to 
commit whether or not there are actually "new" offsets to be committed.

 

I do worry a bit about the potential impact of excessively committing, maybe we 
should try to expose/check whether the producer has an open transaction OR 
there are offsets to commit, rather than relying on the commitNeeded flag which 
would be true regardless of whether the punctuation actually tried to forward 
new records/open a new transaction

> Kafka Streams should commit transaction when no records are processed
> -
>
> Key: KAFKA-14294
> URL: https://issues.apache.org/jira/browse/KAFKA-14294
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Affects Versions: 3.2.1
>Reporter: Vicky Papavasileiou
>Priority: Major
>
> Currently, if there are no records to process in the input topic, a 
> transaction does not commit. If a custom punctuator code is writing to a 
> state store (which is common practice) the producer gets fenced when trying 
> to write to the changelog topic. This throws a TaskMigratedException and 
> causes a rebalance. 
> A better approach would be to commit a transaction even when there are no 
> records processed as to allow the punctuator to make progress. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)