lnbest0707 commented on code in PR #274:
URL: https://github.com/apache/flink-connector-kafka/pull/274#discussion_r3456008993


##########
flink-connector-kafka/src/main/java/org/apache/flink/connector/kafka/dynamic/source/enumerator/DynamicKafkaSourceEnumerator.java:
##########
@@ -587,17 +653,38 @@ private void startAllEnumerators() {
     }
 
     private void closeAllEnumeratorsAndContexts() {
-        clusterEnumeratorMap.forEach(
+        Map<String, StoppableKafkaEnumContextProxy> closingClusterEnumContextMap =
+                new HashMap<>(clusterEnumContextMap);
+        Map<String, SplitEnumerator<KafkaPartitionSplit, KafkaSourceEnumState>>
+                closingClusterEnumeratorMap = new HashMap<>(clusterEnumeratorMap);
+        closingClusterEnumContextMap
+                .values()
+                .forEach(StoppableKafkaEnumContextProxy::prepareForClose);
+        clusterEnumContextMap.clear();
+        clusterEnumeratorMap.clear();
+
+        enumeratorClosingExecutor.execute(
+                () ->
+                        closeEnumeratorsAndContexts(
+                                closingClusterEnumContextMap, closingClusterEnumeratorMap));
+    }
+
+    private void closeEnumeratorsAndContexts(
+            Map<String, StoppableKafkaEnumContextProxy> closingClusterEnumContextMap,
+            Map<String, SplitEnumerator<KafkaPartitionSplit, KafkaSourceEnumState>>
+                    closingClusterEnumeratorMap) {
+        closingClusterEnumeratorMap.forEach(
                 (cluster, subEnumerator) -> {
                     try {
-                        clusterEnumContextMap.get(cluster).close();
+                        closingClusterEnumContextMap.get(cluster).close();
                         subEnumerator.close();
                     } catch (Exception e) {
-                        throw new RuntimeException(e);
+                        enumContext.runInCoordinatorThread(

Review Comment:
   Addressed in 8995be24. The async stale-enumerator close path now retains the 
first failure, still attempts normal coordinator-thread propagation, and 
rethrows the retained failure from close() after awaiting the close executor. 
Added a regression test where coordinator callbacks are dropped.
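
   For readers following along, a minimal sketch of the retain-first-failure
   pattern described above. This is not the code from the PR: the class
   `RetainedFailureCloser`, its method names, and the 10-second wait are all
   hypothetical, and the coordinator-thread propagation step is omitted.

```java
import java.io.IOException;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical illustration; not the DynamicKafkaSourceEnumerator implementation. */
public class RetainedFailureCloser implements AutoCloseable {
    private final ExecutorService closingExecutor = Executors.newSingleThreadExecutor();
    // Only the first failure is kept; later failures are dropped in this sketch.
    private final AtomicReference<Throwable> firstFailure = new AtomicReference<>();

    /** Closes stale resources off the caller's thread, recording the first failure. */
    public void closeAsync(List<AutoCloseable> staleResources) {
        closingExecutor.execute(
                () -> {
                    for (AutoCloseable resource : staleResources) {
                        try {
                            resource.close();
                        } catch (Exception e) {
                            firstFailure.compareAndSet(null, e);
                        }
                    }
                });
    }

    /** Waits for pending async closes, then rethrows the retained failure, if any. */
    @Override
    public void close() throws IOException {
        closingExecutor.shutdown();
        try {
            // Assumed timeout, chosen only for the sketch.
            if (!closingExecutor.awaitTermination(10, TimeUnit.SECONDS)) {
                closingExecutor.shutdownNow();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IOException("Interrupted while awaiting async close", e);
        }
        Throwable failure = firstFailure.get();
        if (failure != null) {
            throw new IOException("Async close of stale resources failed", failure);
        }
    }
}
```

   Because the failure is stored rather than only handed to a callback, it
   still surfaces from close() even when the callback is never run.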



##########
flink-connector-kafka/src/main/java/org/apache/flink/connector/kafka/dynamic/source/enumerator/DynamicKafkaSourceEnumerator.java:
##########
@@ -725,6 +812,29 @@ private Set<SplitAndAssignmentStatus> filterStateByTopics(
                 .collect(Collectors.toSet());
     }
 
+    private void tuneEnumeratorAdminClientTimeouts(Properties consumerProps) {
+        tuneEnumeratorAdminClientTimeout(consumerProps, ConsumerConfig.REQUEST_TIMEOUT_MS_CONFIG);
+        tuneEnumeratorAdminClientTimeout(
+                consumerProps, CommonClientConfigs.DEFAULT_API_TIMEOUT_MS_CONFIG);
+    }
+
+    private void tuneEnumeratorAdminClientTimeout(Properties consumerProps, String propertyKey) {
+        String configuredTimeoutMs = consumerProps.getProperty(propertyKey);
+        if (configuredTimeoutMs == null) {
+            return;
+        }
+
+        long readerTimeoutMs = Long.parseLong(configuredTimeoutMs);
+        long enumeratorTimeoutMs =
+                Math.max(1L, readerTimeoutMs / ENUMERATOR_ADMIN_CLIENT_TIMEOUT_DIVISOR);
+        consumerProps.setProperty(propertyKey, Long.toString(enumeratorTimeoutMs));

Review Comment:
   Addressed in 8995be24. I removed the timeout rewriting entirely. The 
isolated metadata discovery worker remains the correctness fix, while user and 
cluster request.timeout.ms and default.api.timeout.ms settings remain unchanged.



##########
flink-connector-kafka/src/main/java/org/apache/flink/connector/kafka/dynamic/source/enumerator/DynamicKafkaSourceEnumerator.java:
##########
@@ -754,6 +864,8 @@ public void handleSourceEvent(int subtaskId, SourceEvent sourceEvent) {
     @Override
     public void close() throws IOException {
         try {
+            kafkaMetadataServiceDiscoveryContext.close();

Review Comment:
   Addressed in 8995be24. Shutdown now marks discovery and sub-enumerator 
contexts closing, closes KafkaMetadataService before waiting for discovery 
termination, and leaves SynchronizedKafkaMetadataService.close() unsynchronized 
so delegate close can unblock an in-flight metadata call. Added regression 
coverage for that unblock path.
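
   A minimal sketch of why the unsynchronized close() matters, assuming a
   wrapper that serializes calls with `synchronized`. The names
   `UnblockingCloseSketch`, `MetadataService`, `BlockingService`, and
   `SynchronizedService` are hypothetical stand-ins, not the classes in the
   PR, and the latch only simulates a metadata call that blocks until the
   client is closed.

```java
import java.util.concurrent.CountDownLatch;

/** Hypothetical illustration; not SynchronizedKafkaMetadataService itself. */
public class UnblockingCloseSketch {

    public interface MetadataService extends AutoCloseable {
        String fetch() throws Exception;

        @Override
        void close();
    }

    /** Stand-in delegate whose fetch() blocks until close() is called. */
    public static final class BlockingService implements MetadataService {
        public final CountDownLatch started = new CountDownLatch(1);
        private final CountDownLatch closed = new CountDownLatch(1);

        @Override
        public String fetch() throws InterruptedException {
            started.countDown();
            closed.await(); // simulates an in-flight call stuck on the network
            throw new IllegalStateException("service closed");
        }

        @Override
        public void close() {
            closed.countDown();
        }
    }

    /** Serializes fetch() calls, but leaves close() outside the monitor. */
    public static final class SynchronizedService implements MetadataService {
        private final MetadataService delegate;

        public SynchronizedService(MetadataService delegate) {
            this.delegate = delegate;
        }

        @Override
        public synchronized String fetch() throws Exception {
            return delegate.fetch();
        }

        // Deliberately not synchronized: a blocked fetch() holds the monitor,
        // so a synchronized close() would wait behind the call it must unblock.
        @Override
        public void close() {
            delegate.close();
        }
    }
}
```

   If close() were synchronized here, shutdown would wait for the monitor
   held by the blocked fetch(), which itself only returns once the delegate
   is closed.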



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
