[GitHub] [kafka] ncliang commented on a change in pull request #10563: KAFKA-12487: Add support for cooperative consumer protocol with sink connectors

GitBox Thu, 29 Apr 2021 02:31:21 -0700


ncliang commented on a change in pull request #10563:
URL: https://github.com/apache/kafka/pull/10563#discussion_r622851011




##########
File path: connect/runtime/src/main/java/org/apache/kafka/connect/runtime/WorkerSinkTask.java
##########
@@ -631,13 +648,31 @@ private void rewind() {
     }
 
     private void openPartitions(Collection<TopicPartition> partitions) {
-        sinkTaskMetricsGroup.recordPartitionCount(partitions.size());
+        updatePartitionCount();
         task.open(partitions);
     }
 
-    private void closePartitions() {
-        commitOffsets(time.milliseconds(), true);
-        sinkTaskMetricsGroup.recordPartitionCount(0);
+    private void closeAllPartitions() {
+        closePartitions(currentOffsets.keySet(), false);
+    }
+
+    private void closePartitions(Collection<TopicPartition> topicPartitions, boolean lost) {
+        if (!lost) {
+            commitOffsets(time.milliseconds(), true, topicPartitions);
+        } else {
+            log.trace("{} Closing the task as partitions have been lost: {}", this, topicPartitions);
+            task.close(topicPartitions);
+            if (workerErrantRecordReporter != null) {
+                log.trace("Cancelling reported errors for {}", topicPartitions);
+                workerErrantRecordReporter.cancelFutures(topicPartitions);

Review comment:
       I'm not sure that cancelling the outstanding error-reporting futures is 
the right thing to do here. Would it make sense to wait a bounded amount of 
time for them to complete before giving up?
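
To make the "await, then give up" alternative concrete, here is a minimal sketch using only `java.util.concurrent`. It is not the Connect runtime's API: the class `AwaitThenCancel`, the method `awaitOrCancel`, and the single shared deadline are all assumptions for illustration, and the internals of `workerErrantRecordReporter` are not shown in this diff.

```java
import java.util.List;
import java.util.concurrent.CancellationException;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper, not part of Kafka Connect.
public class AwaitThenCancel {

    /**
     * Waits for the given futures until one shared deadline elapses,
     * then cancels whatever is still pending.
     *
     * @return the number of futures that were actually cancelled
     */
    public static int awaitOrCancel(List<? extends Future<?>> futures, long timeout, TimeUnit unit) {
        long deadlineNanos = System.nanoTime() + unit.toNanos(timeout);
        int cancelled = 0;
        for (Future<?> future : futures) {
            // The budget is shared across all futures, not granted per future.
            long remaining = Math.max(0L, deadlineNanos - System.nanoTime());
            try {
                future.get(remaining, TimeUnit.NANOSECONDS);
            } catch (TimeoutException e) {
                // Out of time: give up on this future.
                if (future.cancel(true)) {
                    cancelled++;
                }
            } catch (ExecutionException | CancellationException e) {
                // Already failed or already cancelled; nothing left to wait for.
            } catch (InterruptedException e) {
                // Preserve the interrupt and stop waiting on this future.
                Thread.currentThread().interrupt();
                if (future.cancel(true)) {
                    cancelled++;
                }
            }
        }
        return cancelled;
    }
}
```

A shared deadline keeps the worst-case delay on the rebalance path bounded by a single timeout regardless of how many futures are outstanding; whether any delay is acceptable when partitions have already been lost is the open question here.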

##########
File path: connect/runtime/src/main/java/org/apache/kafka/connect/runtime/WorkerSinkTask.java
##########
@@ -680,13 +717,13 @@ public void onPartitionsAssigned(Collection<TopicPartition> partitions) {
             }
             sinkTaskMetricsGroup.assignedOffsets(currentOffsets);
 
-            // If we paused everything for redelivery (which is no longer relevant since we discarded the data), make
+            // If we paused everything for redelivery and all partitions for the failed deliveries have been revoked, make
             // sure anything we paused that the task didn't request to be paused *and* which we still own is resumed.
             // Also make sure our tracking of paused partitions is updated to remove any partitions we no longer own.
-            pausedForRedelivery = false;
+            pausedForRedelivery = pausedForRedelivery && !messageBatch.isEmpty();

Review comment:
       I don't know if this change is required. The way I read the current 
implementation, we make sure that the paused partitions contain only assigned 
partitions in the block below, setting the paused partitions on context. We 
then rely on the code block in `iteration()` to resume partitions that should 
not be paused.
   ```
               } else if (!pausedForRedelivery) {
                   resumeAll();
                   onResume();
               }
   ```
   Setting this to anything other than false causes us not to resume partitions 
which we own that were not explicitly requested to be paused.
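
The concern can be illustrated with a deliberately simplified toy model. This is not the actual `WorkerSinkTask` code: the class `PauseModel`, its fields, and the `resetFlag` parameter are hypothetical, and partitions are represented as plain strings. It only mirrors the reading above, in which `iteration()` resumes partitions solely when `pausedForRedelivery` is false.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical toy model of the pause/resume bookkeeping; not Kafka code.
public class PauseModel {
    public final Set<String> assigned = new HashSet<>();
    public final Set<String> requestedPaused = new HashSet<>(); // paused at the task's request
    public final Set<String> consumerPaused = new HashSet<>();  // what the consumer actually has paused
    public boolean pausedForRedelivery;

    // A failed delivery pauses every assigned partition.
    public void pauseAllForRedelivery() {
        pausedForRedelivery = true;
        consumerPaused.addAll(assigned);
    }

    // Rebalance callback: drop bookkeeping for partitions we no longer own.
    // resetFlag == true models the original "pausedForRedelivery = false".
    public void onPartitionsAssigned(Set<String> nowAssigned, boolean resetFlag) {
        assigned.clear();
        assigned.addAll(nowAssigned);
        requestedPaused.retainAll(assigned);
        consumerPaused.retainAll(assigned);
        if (resetFlag) {
            pausedForRedelivery = false;
        }
    }

    // Models the "else if (!pausedForRedelivery)" branch quoted above:
    // resume everything except what the task asked to keep paused.
    public void iteration() {
        if (!pausedForRedelivery) {
            consumerPaused.clear();
            consumerPaused.addAll(requestedPaused);
        }
    }
}
```

In this model, with partitions `a` and `b` assigned and only `b` paused at the task's request, resetting the flag during assignment leaves just `b` paused after the next `iteration()`. Leaving the flag set keeps `a` paused as well, even though the task never asked for that.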




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
