dmvk commented on code in PR #22506: URL: https://github.com/apache/flink/pull/22506#discussion_r1195405513
########## flink-runtime/src/main/java/org/apache/flink/runtime/executiongraph/failover/flip1/FailureHandlingResult.java: ########## @@ -204,13 +226,15 @@ public static FailureHandlingResult restartable( @Nullable Execution failedExecution, @Nullable Throwable cause, long timestamp, + CompletableFuture<Map<String, String>> labels, Review Comment: nit: there is an inconsistent naming of this parameter across this file ########## flink-runtime/src/main/java/org/apache/flink/runtime/scheduler/DefaultScheduler.java: ########## @@ -167,9 +171,31 @@ protected DefaultScheduler( jobGraph.getName(), jobGraph.getJobID()); + final Context taskFailureCtx = + DefaultFailureEnricherContext.forTaskFailure( Review Comment: 🤔 Shouldn't we provide more context here? At least the task that the failure originated from? ########## flink-runtime/src/main/java/org/apache/flink/runtime/scheduler/adaptivebatch/AdaptiveBatchScheduler.java: ########## @@ -295,7 +300,7 @@ void initializeVerticesIfPossible() { } } catch (JobException ex) { log.error("Unexpected error occurred when initializing ExecutionJobVertex", ex); - failJob(ex, System.currentTimeMillis()); + failJob(ex, System.currentTimeMillis(), FailureEnricherUtils.EMPTY_LABELS); Review Comment: Why don't we run this through labeling? (ABS should still fall into the default scheduler family) ########## flink-runtime/src/main/java/org/apache/flink/runtime/scheduler/exceptionhistory/ExceptionHistoryEntry.java: ########## @@ -63,21 +66,27 @@ public static ExceptionHistoryEntry create(AccessExecution failedExecution, Stri return new ExceptionHistoryEntry( failure.getException(), failure.getTimestamp(), + failure.getLabelsFuture(), taskName, failedExecution.getAssignedResourceLocation()); } /** Creates an {@code ExceptionHistoryEntry} that is not based on an {@code Execution}. */ public static ExceptionHistoryEntry createGlobal(Throwable cause) { Review Comment: Can we mark this method as deprecated and maybe label it with an issue that will get rid of it? ########## flink-runtime/src/test/java/org/apache/flink/runtime/executiongraph/failover/flip1/ExecutionFailureHandlerTest.java: ########## @@ -59,17 +64,25 @@ class ExecutionFailureHandlerTest { private ExecutionFailureHandler executionFailureHandler; + private TrackingFailureEnricher trackingFailureEnricher; Review Comment: we should introduce a `TestingFailureEnricher`, because `TrackingFailureEnricher` already seems to be duplicated and we'd have to duplicate it in upcoming PRs as well ########## flink-runtime/src/main/java/org/apache/flink/runtime/scheduler/exceptionhistory/ExceptionHistoryEntry.java: ########## @@ -63,21 +66,27 @@ public static ExceptionHistoryEntry create(AccessExecution failedExecution, Stri return new ExceptionHistoryEntry( failure.getException(), failure.getTimestamp(), + failure.getLabelsFuture(), taskName, failedExecution.getAssignedResourceLocation()); } /** Creates an {@code ExceptionHistoryEntry} that is not based on an {@code Execution}. */ public static ExceptionHistoryEntry createGlobal(Throwable cause) { Review Comment: The same applies to other factory methods that ignore labels -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org