Re: Flink on Kubernetes - Session vs Job cluster mode and storage

Arvid Heise Wed, 26 Feb 2020 06:52:47 -0800

Credentials on AWS are typically resolved through IAM roles assigned to K8s
pods (for example with KIAM [1]).


[1] https://github.com/uswitch/kiam
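
As a rough illustration of the role-based approach, the Python sketch below
builds the pod metadata that KIAM inspects. It assumes KIAM's
`iam.amazonaws.com/role` pod annotation; the pod name and role name are
hypothetical placeholders, and a real setup also needs KIAM itself deployed in
the cluster.

```python
import json


def pod_metadata_with_role(pod_name: str, role_name: str) -> dict:
    """Return pod metadata carrying the annotation KIAM reads to pick an IAM role."""
    return {
        "name": pod_name,
        "annotations": {"iam.amazonaws.com/role": role_name},
    }


# Hypothetical names, for illustration only.
metadata = pod_metadata_with_role("flink-taskmanager", "flink-s3-access")
print(json.dumps(metadata, indent=2))
```

With this in place no access keys need to be stored in the pod spec at all;
the AWS SDK inside the container picks up temporary credentials for the role.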

On Tue, Feb 25, 2020 at 3:36 AM Yang Wang <danrtsey...@gmail.com> wrote:

> Hi M Singh,
>
> > Mans - If we use the session based deployment option for K8 - I thought
> > K8 will automatically restart any failed TM or JM.
> > In the case of a failed TM - the job will probably recover, but in the case
> > of a failed JM - perhaps we need to resubmit all jobs.
> > Let me know if I have misunderstood anything.
>
>
> Since you are starting the JM/TM with a K8s Deployment, a new JM/TM will be
> created whenever one fails. If you do not set the high-availability
> configuration, your jobs can still recover when a TM fails, but they cannot
> recover when the JM fails. With HA configured, the jobs can always be
> recovered and you do not need to re-submit them.
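
A minimal sketch of the HA settings referred to above, written as a small
Python helper that renders `flink-conf.yaml` lines. The keys are the
ZooKeeper-based HA options from the Flink configuration docs; the quorum
address, bucket path, and cluster id are hypothetical placeholders.

```python
def render_ha_config(zk_quorum: str, storage_dir: str, cluster_id: str) -> str:
    """Render ZooKeeper-based HA options as flink-conf.yaml lines."""
    options = {
        "high-availability": "zookeeper",
        "high-availability.zookeeper.quorum": zk_quorum,
        # Durable location for JobManager metadata (job graphs, checkpoint pointers).
        "high-availability.storageDir": storage_dir,
        "high-availability.cluster-id": cluster_id,
    }
    return "\n".join(f"{key}: {value}" for key, value in options.items())


# Placeholder values, for illustration only.
ha_conf = render_ha_config(
    zk_quorum="zookeeper:2181",
    storage_dir="s3://my-bucket/flink/ha",
    cluster_id="/flink-session",
)
print(ha_conf)
```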
>
> > Mans - Is there any safe way of passing creds?
>
>
> Yes, you are right: using a ConfigMap to pass the credentials is not safe.
> On K8s, I think you could use Secrets instead [1].
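
To make the Secret suggestion concrete, here is a hedged Python sketch that
builds a Secret manifest and a container env entry referencing it via
`secretKeyRef`. The Secret name, key, and value are hypothetical placeholders;
only the manifest field names come from the Kubernetes API.

```python
import base64
import json


def make_secret(name: str, values: dict) -> dict:
    """Build a Kubernetes Secret manifest; `data` values must be base64-encoded."""
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": name},
        "type": "Opaque",
        "data": {
            key: base64.b64encode(value.encode("utf-8")).decode("ascii")
            for key, value in values.items()
        },
    }


def env_from_secret(env_name: str, secret_name: str, key: str) -> dict:
    """Build a container env entry that reads one key from the Secret."""
    return {
        "name": env_name,
        "valueFrom": {"secretKeyRef": {"name": secret_name, "key": key}},
    }


# Placeholder values only; never commit real credentials.
secret = make_secret("flink-aws-creds", {"access-key": "EXAMPLE_KEY_ID"})
env = env_from_secret("AWS_ACCESS_KEY_ID", "flink-aws-creds", "access-key")
print(json.dumps([secret, env], indent=2))
```

Note that base64 is only an encoding, not encryption; the protection comes
from restricting who can read Secret objects in the cluster.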
>
> > Mans - Does a task manager failure cause the job to fail?  My
> > understanding is that JM failures are catastrophic while TM failures are
> > recoverable.
>
>
> What I mean is that the job fails, and it can then be restarted by your
> configured restart strategy [2].
>
> > Mans - So if we are saving checkpoints in S3 then there is no need for
> > disks - should we use emptyDir?
>
>
> Yes, if you are saving the checkpoints in S3 and have also set
> `high-availability.storageDir` to S3, then you do not need a persistent
> volume. The local directory is only used as a local cache, so you could
> directly use the overlay filesystem or an emptyDir (better I/O
> performance).
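
A rough Python sketch of that setup: an emptyDir scratch volume for the
TaskManager pod, with all durable state pointed at S3. The volume name, mount
path, and bucket are hypothetical; the Flink keys (`state.checkpoints.dir`,
`high-availability.storageDir`, `io.tmp.dirs`) are standard configuration
options.

```python
import json


def scratch_volume(mount_path: str, volume_name: str = "flink-scratch") -> dict:
    """Return pod spec fragments for an emptyDir used as local scratch space."""
    return {
        "volumes": [{"name": volume_name, "emptyDir": {}}],
        "volumeMounts": [{"name": volume_name, "mountPath": mount_path}],
    }


def state_config(bucket: str, tmp_dir: str) -> dict:
    """Keep durable state in S3; only temporary files live on the emptyDir."""
    return {
        "state.checkpoints.dir": f"s3://{bucket}/flink/checkpoints",
        "high-availability.storageDir": f"s3://{bucket}/flink/ha",
        "io.tmp.dirs": tmp_dir,
    }


# Placeholder path and bucket, for illustration only.
pod_fragments = scratch_volume("/opt/flink/tmp")
flink_conf = state_config("my-bucket", "/opt/flink/tmp")
print(json.dumps({"pod": pod_fragments, "flink-conf": flink_conf}, indent=2))
```

Because the emptyDir is deleted together with the pod, nothing that must
survive a restart should be written to it.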
>
>
> [1].
> https://kubernetes.io/docs/tasks/inject-data-application/distribute-credentials-secure/
> [2].
> https://ci.apache.org/projects/flink/flink-docs-master/ops/config.html#fault-tolerance
>
> On Tue, Feb 25, 2020 at 5:53 AM M Singh <mans2si...@yahoo.com> wrote:
>
>> Thanks Wang for your detailed answers.
>>
>> From what I understand the native_kubernetes also leans towards creating
>> a session and submitting a job to it.
>>
>> Regarding the other recommendations, please see my inline comments and advise.
>>
>> On Sunday, February 23, 2020, 10:01:10 PM EST, Yang Wang <
>> danrtsey...@gmail.com> wrote:
>>
>>
>> Hi Singh,
>>
>> Glad to hear that you are looking to run Flink on Kubernetes. I will
>> try to answer your questions based on my limited knowledge; others
>> can correct me and add anything I have missed.
>>
>> I think the biggest difference between a session cluster and a per-job
>> cluster on Kubernetes is isolation. For per-job, a dedicated Flink
>> cluster is started for that one job only, and no other jobs can be
>> submitted to it.
>> Once the job is finished, the Flink cluster is destroyed immediately.
>> The second point is one-step submission. You do not need to start a Flink
>> cluster first and then submit a job to the existing session.
>>
>> > Are there any benefits with regards to
>> 1. Configuring the jobs
>> Whether you use a per-job cluster or submit to an existing session
>> cluster, they share the same configuration mechanism. You do not have
>> to change any code or configuration.
>>
>> 2. Scaling the taskmanager
>> Since you are using the standalone cluster on Kubernetes, it does not
>> provide an active resourcemanager. You need to use external tools to
>> monitor and scale up the taskmanagers. The active integration is still
>> evolving, and you could try it out [1].
>>
>> Mans - If we use the session based deployment option for K8 - I thought
>> K8 will automatically restart any failed TM or JM.
>> In the case of a failed TM - the job will probably recover, but in the case
>> of a failed JM - perhaps we need to resubmit all jobs.
>> Let me know if I have misunderstood anything.
>>
>> 3. Restarting jobs
>> For the session cluster, you can directly cancel the job and re-submit it.
>> For a per-job cluster, once the job is canceled you need to start a new
>> per-job cluster from the latest savepoint.
>>
>> 4. Managing the flink jobs
>> The REST API and the flink command line can be used to manage the jobs
>> (e.g. flink cancel, etc.). I think there is no difference between session
>> and per-job here.
>>
>> 5. Passing credentials (in case of AWS, etc)
>> I am not sure how you provide your credentials. If you put them in a
>> config map and then mount it into the jobmanager/taskmanager pod, then both
>> session and per-job support this approach.
>>
>> Mans - Is there any safe way of passing creds?
>>
>> 6. Fault tolerance and recovery of jobs from failure
>> For a session cluster, if one taskmanager crashes, then all the jobs that
>> have tasks on this taskmanager will fail.
>> Both session and per-job can be configured with high availability and
>> recover from the latest checkpoint.
>>
>> Mans - Does a task manager failure cause the job to fail?  My
>> understanding is that JM failures are catastrophic while TM failures are
>> recoverable.
>>
>> > Is there any need for specifying volume for the pods?
>> No, you do not need to specify a volume for the pods. All the data in the
>> pod's local directory is temporary. When a pod crashes and is relaunched,
>> the taskmanager will retrieve the checkpoint from zookeeper + S3 and resume
>> from the latest checkpoint.
>>
>> Mans - So if we are saving checkpoint in S3 then there is no need for
>> disks - should we use emptyDir ?
>>
>>
>> [1].
>> https://ci.apache.org/projects/flink/flink-docs-master/ops/deployment/native_kubernetes.html
>>
>> On Sun, Feb 23, 2020 at 2:28 AM M Singh <mans2si...@yahoo.com> wrote:
>>
>> Hey Folks:
>>
>> I am trying to figure out the options for running Flink on Kubernetes and
>> am trying to find out the pros and cons of running in Flink Session vs
>> Flink Cluster mode (
>> https://ci.apache.org/projects/flink/flink-docs-stable/ops/deployment/kubernetes.html#flink-session-cluster-on-kubernetes
>> ).
>>
>> I understand that in job mode there is no need to submit the job since it
>> is part of the job image.  But what are the other pros and cons of this
>> approach vs session mode where a job manager is deployed and flink jobs can
>> be submitted to it?  Are there any benefits with regards to:
>>
>> 1. Configuring the jobs
>> 2. Scaling the taskmanager
>> 3. Restarting jobs
>> 4. Managing the flink jobs
>> 5. Passing credentials (in case of AWS, etc)
>> 6. Fault tolerance and recovery of jobs from failure
>>
>> Also, we will be keeping the checkpoints for the jobs on S3.  Is there
>> any need for specifying a volume for the pods?  If a volume is required, do
>> we need a provisioned volume, and what are the recommended
>> alternatives/considerations, especially with AWS?
>>
>> If there are any other considerations, please let me know.
>>
>> Thanks for your advice.
>>
>>
>>
>>
>>
