featzhang opened a new pull request, #27711: URL: https://github.com/apache/flink/pull/27711
## What is the purpose of the change

This PR implements the integration of NodeHealthManager with the slot allocation process in FineGrainedSlotManager. It enables filtering out quarantined nodes during slot allocation to prevent jobs from being scheduled on unhealthy nodes.

This PR builds on PR #27701 (NodeHealthManager abstraction) and implements Phase 2 of the node health management mechanism.

## Brief change log

- Modified FineGrainedSlotManager to filter out quarantined nodes during slot allocation in allocateSlotsAccordingTo() method
- Updated ResourceManagerRuntimeServices to accept NodeHealthManager parameter in createSlotManager() method
- Enhanced ResourceManagerFactory to pass NoOpNodeHealthManager as default implementation
- Extended FineGrainedSlotManagerBuilder and FineGrainedSlotManagerTestBase to support NodeHealthManager in test infrastructure
- Added comprehensive integration test NodeQuarantineSlotFilteringITCase covering slot allocation filtering, quarantine expiry, and manual removal scenarios
- Fixed compilation issues in test infrastructure related to method name conflicts

## Verifying this change

This change is verified by:

- Existing unit tests continue to pass
- New integration test NodeQuarantineSlotFilteringITCase validates the slot filtering functionality
- Manual testing with quarantined nodes shows slots are correctly filtered
- Compilation succeeds with mvnw clean spotless:apply install -DskipTests -Pfast

## Does this pull request potentially affect

- Public API: No
- Serializers: No
- The runtime per-record code paths: No
- Anything that affects deployment or recovery: JobManager failover: No
- The S3 file system connector: No

## Documentation

- Does this pull request introduce a new feature: Yes, node health-based slot filtering
- If yes, how is the feature documented: Code comments and integration tests. Full documentation will be added in subsequent PRs for REST API and configuration options.
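For reviewers who want a quick picture of the filtering described in the change log, here is a minimal, self-contained sketch. It is not the code in this PR: the interface shape, the `ExpiringNodeHealthManager` class, the `filterHealthy` helper, and the use of plain string node IDs are all assumptions made for illustration. Only the names NodeHealthManager and NoOpNodeHealthManager come from the description above, and their real signatures may differ.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Illustrative sketch only; none of these types are Flink's actual classes. */
final class QuarantineFilterSketch {

    /** Assumed minimal shape of a node health manager (hypothetical signature). */
    interface NodeHealthManager {
        boolean isQuarantined(String nodeId, Instant now);
    }

    /** Default that never quarantines anything, mirroring the role of a no-op implementation. */
    static final class NoOpNodeHealthManager implements NodeHealthManager {
        @Override
        public boolean isQuarantined(String nodeId, Instant now) {
            return false;
        }
    }

    /** Hypothetical manager that supports quarantine expiry and manual removal. */
    static final class ExpiringNodeHealthManager implements NodeHealthManager {
        private final Map<String, Instant> quarantinedUntil = new ConcurrentHashMap<>();

        void quarantine(String nodeId, Instant now, Duration duration) {
            quarantinedUntil.put(nodeId, now.plus(duration));
        }

        void release(String nodeId) {
            quarantinedUntil.remove(nodeId);
        }

        @Override
        public boolean isQuarantined(String nodeId, Instant now) {
            Instant until = quarantinedUntil.get(nodeId);
            // A node stays quarantined strictly before its expiry instant.
            return until != null && now.isBefore(until);
        }
    }

    /** Keeps only the candidate nodes that are not quarantined at the given instant. */
    static List<String> filterHealthy(
            List<String> candidateNodes, NodeHealthManager healthManager, Instant now) {
        List<String> healthy = new ArrayList<>();
        for (String nodeId : candidateNodes) {
            if (!healthManager.isQuarantined(nodeId, now)) {
                healthy.add(nodeId);
            }
        }
        return healthy;
    }

    public static void main(String[] args) {
        ExpiringNodeHealthManager manager = new ExpiringNodeHealthManager();
        Instant start = Instant.parse("2024-01-01T00:00:00Z");
        List<String> nodes = Arrays.asList("node-a", "node-b", "node-c");

        manager.quarantine("node-b", start, Duration.ofMinutes(5));
        System.out.println("during quarantine: " + filterHealthy(nodes, manager, start));
        System.out.println(
                "after expiry: " + filterHealthy(nodes, manager, start.plus(Duration.ofMinutes(6))));
    }
}
```

Passing the health manager in as a parameter, with a no-op default, keeps existing behavior unchanged when no health tracking is configured, which appears to be the intent of wiring NoOpNodeHealthManager through ResourceManagerFactory.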
