[ https://issues.apache.org/jira/browse/FLINK-13750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16910848#comment-16910848 ]
TisonKun commented on FLINK-13750:
----------------------------------

Hi [~Zentol] & [~till.rohrmann].

After some investigation I noticed that {{ClusterClient}} does not need to hold a {{highAvailabilityServices}} field or anything like it. Towards the goal of making {{ClusterClient}} an interface rather than an abstract class, we can push the initialization logic down into {{RestClusterClient}} and {{MiniClusterClient}}.

Here are two possible directions for the separation; I am posting them here for advice.

1. Introduce utility functions in {{HighAvailabilityServicesUtils}} that return a limited set of high-availability services regarded as client-side services, without introducing any new class or interface. (A prototype can be found at https://github.com/TisonKun/flink/commit/1ea7c4ed6c7c2ce2a82da48bcacfd20e2bc0fdfd)

pros:
- easy to implement.
- in the custom HA scenario, users don't need to modify their code unless their implementation has an issue similar to FLINK-13500.

cons:
- there is no explicit concept of a client-side service.
- {{HighAvailabilityServicesUtils}} knows details of the Standalone and ZooKeeper implementations.

nit: in the prototype we could separate {{getDispatcherLeaderRetrievalService}} and {{getWebMonitorLeaderRetrievalService}}, with the downside that we would initialize {{CuratorFramework}} and the custom HA service twice or more.

2. Introduce an interface {{RetrieverOnlyHighAvailabilityService}} which looks like

{code:java}
interface RetrieverOnlyHighAvailabilityService {
    LeaderRetrievalService getDispatcherLeaderRetrievalService();
    LeaderRetrievalService getWebMonitorLeaderRetrievalService();
}
{code}

and implement it for the different high-availability backends.

pros:
- a clear conceptual separation between high-availability services.
- {{HighAvailabilityServicesUtils}} only passes the configuration to create a {{RetrieverOnlyHighAvailabilityService}}, and only {{RetrieverOnlyHighAvailabilityService}} knows the details.
cons:
- we need to implement {{RetrieverOnlyHighAvailabilityService}} for every high-availability service.
- in the {{MiniClusterClient}} scenario, we actually use the service passed in from {{MiniCluster}}. Either we treat it as a special case or we completely change the {{MiniClusterClient}} initialization logic.
- in the custom HA scenario, users have to implement a new interface.

nit: in the current codebase, not every {{ClusterClient}} shares the same retrieval requirements; only {{RestClusterClient}} needs {{getWebMonitorLeaderRetrievalService}}. Alternatively, at a more conceptual level, the client should only communicate with the WebMonitor, and requests to the Dispatcher should be routed by the WebMonitor.

> Separate HA services between client-/ and server-side
> -----------------------------------------------------
>
>                 Key: FLINK-13750
>                 URL: https://issues.apache.org/jira/browse/FLINK-13750
>             Project: Flink
>          Issue Type: Improvement
>          Components: Command Line Client, Runtime / Coordination
>            Reporter: Chesnay Schepler
>            Assignee: TisonKun
>            Priority: Major
>
> Currently, we use the same {{HighAvailabilityServices}} on the client and
> server. However, the client does not need several of the features that the
> services currently provide (access to the blobstore or checkpoint metadata).
> Additionally, due to how these services are set up they also require the
> client to have access to the blob storage, despite it never actually being
> used, which can cause issues, like FLINK-13500.
> [~Tison] Would you be interested in this issue?

--
This message was sent by Atlassian Jira
(v8.3.2#803003)
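
For illustration only, option 2 from the comment above could be sketched roughly as follows. This is a minimal, self-contained sketch under stated assumptions, not code from the Flink codebase or from the linked prototype: {{LeaderRetrievalService}} is reduced to a hypothetical one-method stand-in, and {{StandaloneRetrieverOnlyService}}, {{RetrieverOnlyServices}} and the {{haMode}} string are names invented here. Only the two methods of {{RetrieverOnlyHighAvailabilityService}} come from the proposal itself.

```java
import java.util.Objects;

/** Hypothetical stand-in for Flink's LeaderRetrievalService, reduced to one method for this sketch. */
interface LeaderRetrievalService {
    /** Returns the leader address this service resolves to. */
    String currentLeaderAddress();
}

/** The client-side interface proposed as option 2. */
interface RetrieverOnlyHighAvailabilityService {
    LeaderRetrievalService getDispatcherLeaderRetrievalService();

    LeaderRetrievalService getWebMonitorLeaderRetrievalService();
}

/** Hypothetical backend for the non-HA case: both leaders are fixed, preconfigured addresses. */
final class StandaloneRetrieverOnlyService implements RetrieverOnlyHighAvailabilityService {
    private final String dispatcherAddress;
    private final String webMonitorAddress;

    StandaloneRetrieverOnlyService(String dispatcherAddress, String webMonitorAddress) {
        this.dispatcherAddress = Objects.requireNonNull(dispatcherAddress);
        this.webMonitorAddress = Objects.requireNonNull(webMonitorAddress);
    }

    @Override
    public LeaderRetrievalService getDispatcherLeaderRetrievalService() {
        return () -> dispatcherAddress;
    }

    @Override
    public LeaderRetrievalService getWebMonitorLeaderRetrievalService() {
        return () -> webMonitorAddress;
    }
}

/** Hypothetical factory: it only forwards configuration and knows no backend details. */
final class RetrieverOnlyServices {
    private RetrieverOnlyServices() {}

    static RetrieverOnlyHighAvailabilityService create(
            String haMode, String dispatcherAddress, String webMonitorAddress) {
        if ("NONE".equalsIgnoreCase(haMode)) {
            return new StandaloneRetrieverOnlyService(dispatcherAddress, webMonitorAddress);
        }
        // A ZooKeeper or custom backend would need its own implementation,
        // which is the first "con" listed for option 2.
        throw new UnsupportedOperationException(
                "No retriever-only service for HA mode: " + haMode);
    }
}
```

A client would then call something like {{RetrieverOnlyServices.create("NONE", dispatcherAddress, webMonitorAddress)}} and never see blob store or checkpoint accessors, which is the separation the issue description asks for.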