[jira] [Commented] (KAFKA-12226) High-throughput source tasks fail to commit offsets

2021-11-15 Thread Randall Hauch (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-12226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17443975#comment-17443975
 ] 

Randall Hauch commented on KAFKA-12226:
---

Thanks for catching this, [~dajac]. I did miss merging this to the `3.1` branch 
-- I recall at the time looking for the branch and not seeing it. I'm in the 
process of building the branch after merging locally, so you should see this a 
bit later today.


> High-throughput source tasks fail to commit offsets
> ---
>
> Key: KAFKA-12226
> URL: https://issues.apache.org/jira/browse/KAFKA-12226
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Reporter: Chris Egerton
>Assignee: Chris Egerton
>Priority: Major
> Fix For: 3.1.0
>
>
> The current source task thread has the following workflow:
>  # Poll messages from the source task
>  # Queue these messages to the producer and send them to Kafka asynchronously.
>  # Add the message to outstandingMessages, or if a flush is currently active, 
> outstandingMessagesBacklog
>  # When the producer completes the send of a record, remove it from 
> outstandingMessages
> The commit offsets thread has the following workflow:
>  # Wait a flat timeout for outstandingMessages to flush completely
>  # If this times out, add all of the outstandingMessagesBacklog to the 
> outstandingMessages and reset
>  # If it succeeds, commit the source task offsets to the backing store.
>  # Retry the above on a fixed schedule
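The two workflows above can be sketched as a small bookkeeping class. This is an illustrative simplification, not Kafka Connect's actual WorkerSourceTask code; only the two field names are taken from the issue, everything else is hypothetical:

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Illustrative sketch of the bookkeeping described above. The field names
// mirror those in the issue, but this class is a simplification, not Kafka
// Connect's actual WorkerSourceTask implementation.
class OutstandingMessageTracker {
    private final Deque<String> outstandingMessages = new ArrayDeque<>();
    private final Deque<String> outstandingMessagesBacklog = new ArrayDeque<>();
    private boolean flushing = false;

    // Task thread, step 3: route a dispatched record to the active set,
    // or to the backlog while a flush is in progress.
    synchronized void recordSent(String record) {
        if (flushing) {
            outstandingMessagesBacklog.add(record);
        } else {
            outstandingMessages.add(record);
        }
    }

    // Task thread, step 4: the producer callback removes the record once acked.
    synchronized void recordAcked(String record) {
        outstandingMessages.remove(record);
    }

    // Commit thread: start a flush attempt.
    synchronized void beginFlush() {
        flushing = true;
    }

    // Commit thread: a flush succeeds only if every outstanding record was
    // acked before the timeout; either way the backlog is merged back so the
    // next attempt starts from the full outstanding set.
    synchronized boolean endFlush(boolean timedOut) {
        flushing = false;
        boolean success = !timedOut && outstandingMessages.isEmpty();
        outstandingMessages.addAll(outstandingMessagesBacklog);
        outstandingMessagesBacklog.clear();
        return success;
    }
}
```

The key detail for the bug below is that a flush can only succeed once outstandingMessages fully drains within the timeout; records keep accumulating in the backlog in the meantime.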
> If the source task is producing records quickly (faster than the producer can 
> send), then the producer will throttle the task thread by blocking in its 
> {{send}} method, waiting at most {{max.block.ms}} for space in the 
> {{buffer.memory}} to be available. This means that the number of records in 
> {{outstandingMessages}} + {{outstandingMessagesBacklog}} is proportional to 
> the size of the producer memory buffer.
> This amount of data might take more than {{offset.flush.timeout.ms}} to 
> flush, and thus the flush will never succeed while the source task is 
> rate-limited by the producer memory. This means that we may write multiple 
> hours of data to Kafka and not ever commit source offsets for the connector. 
> When the task is lost due to a worker failure, hours of data will be 
> re-processed that otherwise were successfully written to Kafka.
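The arithmetic behind this failure mode can be checked with the documented defaults: {{buffer.memory}} defaults to 32 MiB and {{offset.flush.timeout.ms}} to 5000 ms. The 1 MiB/s drain rate below is an assumed broker-limited send rate, chosen purely for illustration:

```java
// Back-of-the-envelope estimate of the worst-case flush time. buffer.memory
// (32 MiB) and offset.flush.timeout.ms (5000 ms) are the documented Kafka
// defaults; the drain rate is an assumed, illustrative producer send rate.
public class FlushTimeoutEstimate {
    public static void main(String[] args) {
        long bufferMemoryBytes = 32L * 1024 * 1024;   // producer buffer.memory default
        long drainRateBytesPerSec = 1L * 1024 * 1024; // assumed send throughput
        long offsetFlushTimeoutMs = 5_000;            // offset.flush.timeout.ms default

        // Worst case: the entire producer buffer is queued when a flush starts.
        long worstCaseFlushMs = 1_000L * bufferMemoryBytes / drainRateBytesPerSec;

        System.out.println("worst-case flush: " + worstCaseFlushMs + " ms"); // 32000 ms
        System.out.println("within timeout: " + (worstCaseFlushMs <= offsetFlushTimeoutMs)); // false
    }
}
```

Under these assumptions the worst-case flush takes roughly 32 seconds against a 5-second timeout, so every flush attempt times out for as long as the task stays producer-limited.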



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (KAFKA-12226) High-throughput source tasks fail to commit offsets

2021-11-15 Thread David Jacot (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-12226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17443699#comment-17443699
 ] 

David Jacot commented on KAFKA-12226:
-

[~rhauch] I don't see this commit in the 3.1.0 branch. Note that the 3.1.0 
branch was cut on November 2nd.






[jira] [Commented] (KAFKA-12226) High-throughput source tasks fail to commit offsets

2021-11-07 Thread Randall Hauch (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-12226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17440067#comment-17440067
 ] 

Randall Hauch commented on KAFKA-12226:
---

Merged the PR to the `trunk` branch; the fix will be included in the upcoming 
3.1.0 release.



