lnbest0707 commented on code in PR #274:
URL:
https://github.com/apache/flink-connector-kafka/pull/274#discussion_r3456008993
##########
flink-connector-kafka/src/main/java/org/apache/flink/connector/kafka/dynamic/source/enumerator/DynamicKafkaSourceEnumerator.java:
##########
@@ -587,17 +653,38 @@ private void startAllEnumerators() {
}
private void closeAllEnumeratorsAndContexts() {
- clusterEnumeratorMap.forEach(
+ Map<String, StoppableKafkaEnumContextProxy> closingClusterEnumContextMap =
+ new HashMap<>(clusterEnumContextMap);
+ Map<String, SplitEnumerator<KafkaPartitionSplit, KafkaSourceEnumState>>
+ closingClusterEnumeratorMap = new HashMap<>(clusterEnumeratorMap);
+ closingClusterEnumContextMap
+ .values()
+ .forEach(StoppableKafkaEnumContextProxy::prepareForClose);
+ clusterEnumContextMap.clear();
+ clusterEnumeratorMap.clear();
+
+ enumeratorClosingExecutor.execute(
+ () ->
+ closeEnumeratorsAndContexts(
+ closingClusterEnumContextMap, closingClusterEnumeratorMap));
+ }
+
+ private void closeEnumeratorsAndContexts(
+ Map<String, StoppableKafkaEnumContextProxy> closingClusterEnumContextMap,
+ Map<String, SplitEnumerator<KafkaPartitionSplit, KafkaSourceEnumState>>
+ closingClusterEnumeratorMap) {
+ closingClusterEnumeratorMap.forEach(
(cluster, subEnumerator) -> {
try {
- clusterEnumContextMap.get(cluster).close();
+ closingClusterEnumContextMap.get(cluster).close();
subEnumerator.close();
} catch (Exception e) {
- throw new RuntimeException(e);
+ enumContext.runInCoordinatorThread(
Review Comment:
Addressed in 8995be24. The async stale-enumerator close path now retains the
first failure, still attempts normal coordinator-thread propagation, and
rethrows the retained failure from close() after awaiting the close executor.
Added a regression test where coordinator callbacks are dropped.
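
The retain-and-rethrow behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the class name `RetainedCloseFailureSketch`, the `coordinatorPropagator` callback, and the 30-second wait are all assumptions made for the example. The point it shows is that a failure from the background close is still reported by `close()` even when the coordinator callback is dropped.

```java
import java.io.IOException;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Consumer;

/** Hypothetical sketch: retain the first async close failure and rethrow it from close(). */
public class RetainedCloseFailureSketch {
    private final ExecutorService closingExecutor = Executors.newSingleThreadExecutor();
    private final AtomicReference<Throwable> firstFailure = new AtomicReference<>();
    // Stand-in for coordinator-thread propagation; it may silently drop the callback.
    private final Consumer<Throwable> coordinatorPropagator;

    public RetainedCloseFailureSketch(Consumer<Throwable> coordinatorPropagator) {
        this.coordinatorPropagator = coordinatorPropagator;
    }

    /** Closes stale resources on the background executor. */
    public void closeStaleAsync(List<AutoCloseable> staleResources) {
        closingExecutor.execute(() -> {
            for (AutoCloseable resource : staleResources) {
                try {
                    resource.close();
                } catch (Exception e) {
                    // Keep only the first failure, then still attempt normal propagation.
                    firstFailure.compareAndSet(null, e);
                    coordinatorPropagator.accept(e);
                }
            }
        });
    }

    /** Awaits the close executor, then rethrows the retained failure if there is one. */
    public void close() throws IOException {
        closingExecutor.shutdown();
        try {
            if (!closingExecutor.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IOException("Timed out waiting for stale resources to close");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IOException("Interrupted while waiting for stale resources to close", e);
        }
        Throwable failure = firstFailure.get();
        if (failure != null) {
            throw new IOException("Failed to close stale resources", failure);
        }
    }
}
```

Passing a no-op propagator (`t -> {}`) imitates the dropped-callback case: the failure is lost on the coordinator path but still surfaces from `close()`.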
##########
flink-connector-kafka/src/main/java/org/apache/flink/connector/kafka/dynamic/source/enumerator/DynamicKafkaSourceEnumerator.java:
##########
@@ -725,6 +812,29 @@ private Set<SplitAndAssignmentStatus> filterStateByTopics(
.collect(Collectors.toSet());
}
+ private void tuneEnumeratorAdminClientTimeouts(Properties consumerProps) {
+ tuneEnumeratorAdminClientTimeout(consumerProps, ConsumerConfig.REQUEST_TIMEOUT_MS_CONFIG);
+ tuneEnumeratorAdminClientTimeout(
+ consumerProps, CommonClientConfigs.DEFAULT_API_TIMEOUT_MS_CONFIG);
+ }
+
+ private void tuneEnumeratorAdminClientTimeout(Properties consumerProps, String propertyKey) {
+ String configuredTimeoutMs = consumerProps.getProperty(propertyKey);
+ if (configuredTimeoutMs == null) {
+ return;
+ }
+
+ long readerTimeoutMs = Long.parseLong(configuredTimeoutMs);
+ long enumeratorTimeoutMs =
+ Math.max(1L, readerTimeoutMs / ENUMERATOR_ADMIN_CLIENT_TIMEOUT_DIVISOR);
+ consumerProps.setProperty(propertyKey, Long.toString(enumeratorTimeoutMs));
Review Comment:
Addressed in 8995be24. I removed the timeout rewriting entirely. The
isolated metadata discovery worker remains the correctness fix, while user and
cluster request.timeout.ms and default.api.timeout.ms settings remain unchanged.
##########
flink-connector-kafka/src/main/java/org/apache/flink/connector/kafka/dynamic/source/enumerator/DynamicKafkaSourceEnumerator.java:
##########
@@ -754,6 +864,8 @@ public void handleSourceEvent(int subtaskId, SourceEvent sourceEvent) {
@Override
public void close() throws IOException {
try {
+ kafkaMetadataServiceDiscoveryContext.close();
Review Comment:
Addressed in 8995be24. Shutdown now marks discovery and sub-enumerator
contexts closing, closes KafkaMetadataService before waiting for discovery
termination, and leaves SynchronizedKafkaMetadataService.close() unsynchronized
so delegate close can unblock an in-flight metadata call. Added regression
coverage for that unblock path.
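
To illustrate why leaving close() unsynchronized matters, here is a small self-contained sketch. It does not reproduce the PR's classes: `MetadataService`, `BlockingService`, and `SynchronizedService` are hypothetical stand-ins, and the blocking behaviour is simulated with latches. The wrapper serializes metadata calls with `synchronized`, but its `close()` does not take the monitor, so it can reach the delegate while another thread is still blocked inside a call.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch: an unsynchronized close() unblocks an in-flight synchronized call. */
public class UnsynchronizedCloseSketch {
    interface MetadataService extends AutoCloseable {
        String describe() throws InterruptedException;

        @Override
        void close();
    }

    /** Delegate whose call blocks until close() is invoked. */
    static final class BlockingService implements MetadataService {
        final CountDownLatch entered = new CountDownLatch(1);
        private final CountDownLatch closed = new CountDownLatch(1);

        @Override
        public String describe() throws InterruptedException {
            entered.countDown();
            closed.await(); // simulates a metadata request stuck on the network
            throw new IllegalStateException("service closed");
        }

        @Override
        public void close() {
            closed.countDown();
        }
    }

    /** Wrapper that serializes calls but leaves close() outside the monitor. */
    static final class SynchronizedService implements MetadataService {
        private final MetadataService delegate;

        SynchronizedService(MetadataService delegate) {
            this.delegate = delegate;
        }

        @Override
        public synchronized String describe() throws InterruptedException {
            return delegate.describe();
        }

        // Deliberately not synchronized: a synchronized close() would wait for the
        // monitor held by the blocked describe() call and never reach the delegate.
        @Override
        public void close() {
            delegate.close();
        }
    }

    /** Returns true if the blocked caller finished after close(). */
    static boolean closeUnblocksInFlightCall() throws InterruptedException {
        BlockingService delegate = new BlockingService();
        SynchronizedService wrapper = new SynchronizedService(delegate);
        Thread caller = new Thread(() -> {
            try {
                wrapper.describe();
            } catch (Exception expected) {
                // The call fails once the delegate has been closed.
            }
        });
        caller.start();
        if (!delegate.entered.await(5, TimeUnit.SECONDS)) {
            return false;
        }
        wrapper.close();
        caller.join(5000);
        return !caller.isAlive();
    }
}
```

If `close()` in the wrapper were marked `synchronized`, the same scenario would deadlock until the blocked call returned on its own, which is the situation the shutdown ordering above is meant to avoid.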