[ 
https://issues.apache.org/jira/browse/FLINK-13750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16910848#comment-16910848
 ] 

TisonKun commented on FLINK-13750:
----------------------------------

Hi [~Zentol] & [~till.rohrmann].

After investigating, I noticed that {{ClusterClient}} does not need to hold a 
{{highAvailabilityServices}} field or anything like it. Working towards the 
goal of making {{ClusterClient}} an interface rather than an abstract class, 
we can push the initialization logic down into {{RestClusterClient}} and 
{{MiniClusterClient}}.

Here are two possible directions for the separation; I am posting them here for advice.

1. Introduce utility functions in {{HighAvailabilityServicesUtils}} that return 
a limited set of high-availability services regarded as client-side services, 
without introducing any new class or interface. (A prototype can be found at 
https://github.com/TisonKun/flink/commit/1ea7c4ed6c7c2ce2a82da48bcacfd20e2bc0fdfd)

pros:

- easy to implement
- in the custom HA scenario, users don't need to modify their code unless 
their implementation has an issue similar to FLINK-13500.

cons:

- there is no explicit client-side service concept.
- {{HighAvailabilityServicesUtils}} has to know details of the Standalone and 
ZooKeeper implementations.

nit: for the prototype, we might separate 
{{getDispatcherLeaderRetrievalService}} and 
{{getWebMonitorLeaderRetrievalService}}; the downside is that we would 
initialize {{CuratorFramework}} and the custom HA services twice or more.
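
A minimal sketch of the trade-off in the nit above, not the prototype itself. 
Everything here is a hypothetical stand-in: {{LeaderRetrievalService}} is 
reduced to one method, the configuration is a plain map with made-up keys, and 
{{connectToBackend}} only imitates expensive setup such as creating a 
ZooKeeper client. It shows that two independent utility functions each pay for 
backend initialization.

{code:java}
import java.util.HashMap;
import java.util.Map;

public class ClientSideHaUtilsSketch {

    /** Simplified stand-in for a leader retrieval service; not the real Flink interface. */
    public interface LeaderRetrievalService {
        String getLeaderAddress();
    }

    /** Counts how often the (hypothetical) backend handle is created. */
    public static int backendInitializations = 0;

    /** Stand-in for expensive setup, e.g. connecting to ZooKeeper. Config keys are made up. */
    private static String connectToBackend(Map<String, String> config) {
        backendInitializations++;
        return config.getOrDefault("ha.mode", "standalone");
    }

    /** Utility function returning only what the client needs for the Dispatcher. */
    public static LeaderRetrievalService getDispatcherLeaderRetrievalService(
            Map<String, String> config) {
        String mode = connectToBackend(config);
        String address = config.get("dispatcher.address");
        return () -> mode + "://" + address;
    }

    /** Utility function returning only what the client needs for the WebMonitor. */
    public static LeaderRetrievalService getWebMonitorLeaderRetrievalService(
            Map<String, String> config) {
        String mode = connectToBackend(config);
        String address = config.get("webmonitor.address");
        return () -> mode + "://" + address;
    }

    public static void main(String[] args) {
        Map<String, String> config = new HashMap<>();
        config.put("ha.mode", "zookeeper");
        config.put("dispatcher.address", "host-a:6123");
        config.put("webmonitor.address", "host-a:8081");

        System.out.println(getDispatcherLeaderRetrievalService(config).getLeaderAddress());
        System.out.println(getWebMonitorLeaderRetrievalService(config).getLeaderAddress());
        // Each call above created its own backend handle.
        System.out.println("backend initializations: " + backendInitializations);
    }
}
{code}

A single function returning both retrieval services would initialize the 
backend once, at the cost of a less fine-grained API.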

2. Introduce an interface {{RetrieverOnlyHighAvailabilityService}} that looks 
like this:


{code:java}
interface RetrieverOnlyHighAvailabilityService {
  LeaderRetrievalService getDispatcherLeaderRetrievalService();
  LeaderRetrievalService getWebMonitorLeaderRetrievalService();
}
{code}

and implement it for different high-availability backends.
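
As a rough illustration of what one such implementation could look like, here 
is a sketch for a standalone (fixed-address) backend. The names 
{{StandaloneRetrieverOnlyService}} and {{create}}, the map-based configuration 
and its keys, and the one-method {{LeaderRetrievalService}} are assumptions 
made for this example, not existing Flink code.

{code:java}
import java.util.HashMap;
import java.util.Map;

public class RetrieverOnlySketch {

    /** Simplified stand-in for a leader retrieval service; not the real Flink interface. */
    public interface LeaderRetrievalService {
        String getLeaderAddress();
    }

    /** The client-side interface proposed in this comment. */
    public interface RetrieverOnlyHighAvailabilityService {
        LeaderRetrievalService getDispatcherLeaderRetrievalService();
        LeaderRetrievalService getWebMonitorLeaderRetrievalService();
    }

    /** Hypothetical standalone implementation: leader addresses are fixed up front. */
    public static final class StandaloneRetrieverOnlyService
            implements RetrieverOnlyHighAvailabilityService {
        private final String dispatcherAddress;
        private final String webMonitorAddress;

        public StandaloneRetrieverOnlyService(String dispatcherAddress, String webMonitorAddress) {
            this.dispatcherAddress = dispatcherAddress;
            this.webMonitorAddress = webMonitorAddress;
        }

        @Override
        public LeaderRetrievalService getDispatcherLeaderRetrievalService() {
            return () -> dispatcherAddress;
        }

        @Override
        public LeaderRetrievalService getWebMonitorLeaderRetrievalService() {
            return () -> webMonitorAddress;
        }
    }

    /** Hypothetical factory: the utility only forwards configuration to the backend. */
    public static RetrieverOnlyHighAvailabilityService create(Map<String, String> config) {
        String mode = config.getOrDefault("ha.mode", "standalone");
        if ("standalone".equals(mode)) {
            return new StandaloneRetrieverOnlyService(
                    config.get("dispatcher.address"), config.get("webmonitor.address"));
        }
        // Other backends (ZooKeeper, custom) would need their own implementation.
        throw new IllegalArgumentException("No retriever-only service for HA mode: " + mode);
    }

    public static void main(String[] args) {
        Map<String, String> config = new HashMap<>();
        config.put("dispatcher.address", "host-a:6123");
        config.put("webmonitor.address", "host-a:8081");
        RetrieverOnlyHighAvailabilityService services = create(config);
        System.out.println(services.getWebMonitorLeaderRetrievalService().getLeaderAddress());
    }
}
{code}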

pros:

- a clear conceptual separation between high-availability services.
- {{HighAvailabilityServicesUtils}} only passes the configuration to create a 
{{RetrieverOnlyHighAvailabilityService}}, and only 
{{RetrieverOnlyHighAvailabilityService}} knows the details.

cons:

- we need to implement {{RetrieverOnlyHighAvailabilityService}} for every 
high-availability backend.
- in the {{MiniClusterClient}} scenario, we actually use the services passed in 
from {{MiniCluster}}. Either we treat it as a special case or we completely 
change the {{MiniClusterClient}} initialization logic.
- in the custom HA scenario, users have to implement a new interface.
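
One conceivable way to handle the {{MiniClusterClient}} special case is a thin 
adapter that exposes already-created full services through the narrower 
interface. This is only a sketch under assumptions: {{FullHaServices}} is a 
hypothetical stand-in for the server-side services (with {{createBlobStore}} 
representing the parts a client should not touch), and {{retrieverOnlyView}} 
is a made-up helper name.

{code:java}
public class MiniClusterAdapterSketch {

    /** Simplified stand-in for a leader retrieval service; not the real Flink interface. */
    public interface LeaderRetrievalService {
        String getLeaderAddress();
    }

    /** The client-side interface proposed in this comment. */
    public interface RetrieverOnlyHighAvailabilityService {
        LeaderRetrievalService getDispatcherLeaderRetrievalService();
        LeaderRetrievalService getWebMonitorLeaderRetrievalService();
    }

    /** Hypothetical stand-in for the full, server-side services. */
    public interface FullHaServices {
        LeaderRetrievalService getDispatcherLeaderRetrievalService();
        LeaderRetrievalService getWebMonitorLeaderRetrievalService();
        Object createBlobStore(); // server-side concern, never needed by the client
    }

    /** Wraps existing full services so the client only sees the retrieval methods. */
    public static RetrieverOnlyHighAvailabilityService retrieverOnlyView(FullHaServices full) {
        return new RetrieverOnlyHighAvailabilityService() {
            @Override
            public LeaderRetrievalService getDispatcherLeaderRetrievalService() {
                return full.getDispatcherLeaderRetrievalService();
            }

            @Override
            public LeaderRetrievalService getWebMonitorLeaderRetrievalService() {
                return full.getWebMonitorLeaderRetrievalService();
            }
        };
    }

    public static void main(String[] args) {
        FullHaServices fromMiniCluster = new FullHaServices() {
            @Override
            public LeaderRetrievalService getDispatcherLeaderRetrievalService() {
                return () -> "localhost:6123";
            }

            @Override
            public LeaderRetrievalService getWebMonitorLeaderRetrievalService() {
                return () -> "localhost:8081";
            }

            @Override
            public Object createBlobStore() {
                throw new UnsupportedOperationException("not used on the client side");
            }
        };
        RetrieverOnlyHighAvailabilityService view = retrieverOnlyView(fromMiniCluster);
        System.out.println(view.getDispatcherLeaderRetrievalService().getLeaderAddress());
    }
}
{code}

The adapter keeps the {{MiniClusterClient}} initialization untouched, but it 
is still a special case compared to the configuration-driven backends.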

nit:

In the current codebase it is not true that every {{ClusterClient}} shares the 
same retrieval requirements: only {{RestClusterClient}} needs 
{{getWebMonitorLeaderRetrievalService}}. At a more conceptual level, the client 
should only communicate with the WebMonitor, and requests to the Dispatcher 
should be routed by the WebMonitor.

> Separate HA services between client-/ and server-side
> -----------------------------------------------------
>
>                 Key: FLINK-13750
>                 URL: https://issues.apache.org/jira/browse/FLINK-13750
>             Project: Flink
>          Issue Type: Improvement
>          Components: Command Line Client, Runtime / Coordination
>            Reporter: Chesnay Schepler
>            Assignee: TisonKun
>            Priority: Major
>
> Currently, we use the same {{HighAvailabilityServices}} on the client and 
> server. However, the client does not need several of the features that the 
> services currently provide (access to the blobstore or checkpoint metadata).
> Additionally, due to how these services are set up they also require the 
> client to have access to the blob storage, despite it never actually being 
> used, which can cause issues, like FLINK-13500.
> [~Tison] Would you be interested in this issue?



--
This message was sent by Atlassian Jira
(v8.3.2#803003)
