Re: Re: how to setup a ha flink cluster on k8s?
Hi Rock,

If you have correctly set the restart strategy, I think the JobManager will fail over and be relaunched, and the job will be recovered as well. Please share more JobManager logs if you want help digging further.

Best,
Yang

Rock wrote on Wed, Nov 20, 2019 at 2:57 PM:

> Hi Yang Wang,
>
> Thanks for your reply. I MAY HAVE set up an HA cluster successfully. The
> reason I couldn't set it up before may be a bug around S3 in Flink; after
> changing to HDFS, I can run it successfully.
>
> But after about one day of running, the JobManager crashes and can't
> recover automatically. I must re-apply the JobManager deployment manually
> (and that fixes the problem, my jobs auto-start), which is so strange.
>
> Since I changed too much from the YAML in Flink's docs, I really don't
> know where my configuration is wrong. But I have added Logback to Flink
> and let it send logs to my Elasticsearch cluster; maybe the logs can
> tell more.
>
> ---------- Original message ----------
> From: "Yang Wang"
> Sent: Tuesday, Nov 19, 2019, 12:05 PM
> To: "vino yang"
> Cc: "Rock"; "user@flink.apache.org"
> Subject: Re: how to setup a ha flink cluster on k8s?
>
> Hi Rock,
>
> If you want to start an HA Flink cluster on k8s, the simplest way is to
> use ZK + HDFS/S3, just as in the HA configuration on YARN. The
> zookeeper-operator can help start a ZK cluster. [1] Please share more
> information about why it did not work.
>
> If you are using a Kubernetes per-job cluster, the job can be recovered
> when the JM pod crashes and is restarted. [2] A savepoint can also be
> used for better recovery.
>
> [1] https://github.com/pravega/zookeeper-operator
> [2] https://github.com/apache/flink/blob/release-1.9/flink-container/kubernetes/README.md#deploy-flink-job-cluster
>
> vino yang wrote on Sat, Nov 16, 2019 at 5:00 PM:
>
>> Hi Rock,
>>
>> I searched on Google and found a blog [1] about how to configure JM HA
>> for Flink on k8s. I don't know whether it suits your case or not; please
>> feel free to refer to it.
>>
>> Best,
>> Vino
>>
>> [1] http://shzhangji.com/blog/2019/08/24/deploy-flink-job-cluster-on-kubernetes/
>>
>> Rock wrote on Sat, Nov 16, 2019 at 11:02 AM:
>>
>>> I'm trying to set up a Flink cluster on k8s for production use. But the
>>> setup here
>>> https://ci.apache.org/projects/flink/flink-docs-release-1.9/ops/deployment/kubernetes.html
>>> is not HA: when the JobManager goes down and is rescheduled, the
>>> metadata for running jobs is lost.
>>>
>>> I tried to use the ZK HA setup
>>> https://ci.apache.org/projects/flink/flink-docs-release-1.9/ops/jobmanager_high_availability.html
>>> on k8s, but can't get it right.
>>>
>>> Storing a job's metadata on k8s using a PVC or other external file
>>> system should be very easy. Is there a way to achieve it?
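The restart-strategy settings Yang mentions live in flink-conf.yaml. As a minimal sketch (the attempt count and delay below are illustrative values, not recommendations from this thread):

```yaml
# flink-conf.yaml — restart strategy sketch (Flink 1.9); tune values for your job
restart-strategy: fixed-delay
restart-strategy.fixed-delay.attempts: 10   # how many restart attempts before the job fails
restart-strategy.fixed-delay.delay: 10 s    # pause between attempts
```

Note that the restart strategy only governs job restarts within a running cluster; surviving a JobManager pod crash additionally requires the HA setup discussed below.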
Re: Re: how to setup a ha flink cluster on k8s?
Hi Yang Wang,

Thanks for your reply. I MAY HAVE set up an HA cluster successfully. The reason I couldn't set it up before may be a bug around S3 in Flink; after changing to HDFS, I can run it successfully.

But after about one day of running, the JobManager crashes and can't recover automatically. I must re-apply the JobManager deployment manually (and that fixes the problem, my jobs auto-start), which is so strange.

Since I changed too much from the YAML in Flink's docs, I really don't know where my configuration is wrong. But I have added Logback to Flink and let it send logs to my Elasticsearch cluster; maybe the logs can tell more.

---------- Original message ----------
From: "Yang Wang"

> Hi Rock,
>
> If you want to start an HA Flink cluster on k8s, the simplest way is to
> use ZK + HDFS/S3, just as in the HA configuration on YARN. The
> zookeeper-operator can help start a ZK cluster. [1]
>
> If you are using a Kubernetes per-job cluster, the job can be recovered
> when the JM pod crashes and is restarted. [2]
>
> [1] https://github.com/pravega/zookeeper-operator
> [2] https://github.com/apache/flink/blob/release-1.9/flink-container/kubernetes/README.md#deploy-flink-job-cluster
>
> vino yang wrote:
>
>> [1] http://shzhangji.com/blog/2019/08/24/deploy-flink-job-cluster-on-kubernetes/
>>
>> Rock wrote:
>>
>>> I'm trying to set up a Flink cluster on k8s for production use. But the
>>> setup here
>>> https://ci.apache.org/projects/flink/flink-docs-release-1.9/ops/deployment/kubernetes.html
>>> is not HA: when the JobManager goes down and is rescheduled, the
>>> metadata for running jobs is lost.
>>>
>>> I tried to use the ZK HA setup
>>> https://ci.apache.org/projects/flink/flink-docs-release-1.9/ops/jobmanager_high_availability.html
>>> on k8s, but can't get it right.
>>>
>>> Storing a job's metadata on k8s using a PVC or other external file
>>> system should be very easy. Is there a way to achieve it?
Re: how to setup a ha flink cluster on k8s?
Hi Rock,

If you want to start an HA Flink cluster on k8s, the simplest way is to use ZK + HDFS/S3, just as in the HA configuration on YARN. The zookeeper-operator can help start a ZK cluster. [1] Please share more information about why it did not work.

If you are using a Kubernetes per-job cluster, the job can be recovered when the JM pod crashes and is restarted. [2] A savepoint can also be used for better recovery.

[1] https://github.com/pravega/zookeeper-operator
[2] https://github.com/apache/flink/blob/release-1.9/flink-container/kubernetes/README.md#deploy-flink-job-cluster

vino yang wrote on Sat, Nov 16, 2019 at 5:00 PM:

> Hi Rock,
>
> I searched on Google and found a blog [1] about how to configure JM HA
> for Flink on k8s. I don't know whether it suits your case or not; please
> feel free to refer to it.
>
> Best,
> Vino
>
> [1] http://shzhangji.com/blog/2019/08/24/deploy-flink-job-cluster-on-kubernetes/
>
> Rock wrote on Sat, Nov 16, 2019 at 11:02 AM:
>
>> I'm trying to set up a Flink cluster on k8s for production use. But the
>> setup here
>> https://ci.apache.org/projects/flink/flink-docs-release-1.9/ops/deployment/kubernetes.html
>> is not HA: when the JobManager goes down and is rescheduled, the
>> metadata for running jobs is lost.
>>
>> I tried to use the ZK HA setup
>> https://ci.apache.org/projects/flink/flink-docs-release-1.9/ops/jobmanager_high_availability.html
>> on k8s, but can't get it right.
>>
>> Storing a job's metadata on k8s using a PVC or other external file
>> system should be very easy. Is there a way to achieve it?
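The ZK + HDFS setup described above boils down to a few flink-conf.yaml entries. A minimal sketch for Flink 1.9 — the ZooKeeper hostnames, HDFS paths, and cluster-id below are placeholders you would replace with your own:

```yaml
# flink-conf.yaml — HA sketch for ZK + HDFS (Flink 1.9); all endpoints are placeholders
high-availability: zookeeper
high-availability.zookeeper.quorum: zk-0.zk:2181,zk-1.zk:2181,zk-2.zk:2181
high-availability.storageDir: hdfs:///flink/ha/        # JobGraphs and checkpoint pointers live here
high-availability.cluster-id: /my-flink-cluster        # keep stable across JM restarts so state is found again

# Checkpoints should also go to the shared filesystem so a relaunched JM can restore them
state.backend: filesystem
state.checkpoints.dir: hdfs:///flink/checkpoints
```

The key point is that ZooKeeper only stores pointers; the actual metadata sits in `high-availability.storageDir`, which is why it must be on storage (HDFS/S3) that outlives any single pod.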
Re: how to setup a ha flink cluster on k8s?
Hi Rock,

I searched on Google and found a blog [1] about how to configure JM HA for Flink on k8s. I don't know whether it suits your case or not; please feel free to refer to it.

Best,
Vino

[1] http://shzhangji.com/blog/2019/08/24/deploy-flink-job-cluster-on-kubernetes/

Rock wrote on Sat, Nov 16, 2019 at 11:02 AM:

> I'm trying to set up a Flink cluster on k8s for production use. But the
> setup here
> https://ci.apache.org/projects/flink/flink-docs-release-1.9/ops/deployment/kubernetes.html
> is not HA: when the JobManager goes down and is rescheduled, the
> metadata for running jobs is lost.
>
> I tried to use the ZK HA setup
> https://ci.apache.org/projects/flink/flink-docs-release-1.9/ops/jobmanager_high_availability.html
> on k8s, but can't get it right.
>
> Storing a job's metadata on k8s using a PVC or other external file
> system should be very easy. Is there a way to achieve it?
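Regarding the PVC idea in the original question: a PVC can stand in for HDFS/S3 as the HA storage directory (point `high-availability.storageDir` at a mounted path), though leader election still needs ZooKeeper, so a PVC alone is not a full HA setup. A hypothetical excerpt of the JobManager Deployment from the linked docs, with the mount added — the PVC name and mount path are made up for illustration:

```yaml
# jobmanager-deployment.yaml (excerpt) — hypothetical sketch, not from the official docs
apiVersion: apps/v1
kind: Deployment
metadata:
  name: flink-jobmanager
spec:
  replicas: 1
  selector:
    matchLabels: {app: flink, component: jobmanager}
  template:
    metadata:
      labels: {app: flink, component: jobmanager}
    spec:
      containers:
        - name: jobmanager
          image: flink:1.9
          args: ["jobmanager"]
          volumeMounts:
            - name: ha-storage
              mountPath: /flink-ha   # then set high-availability.storageDir: file:///flink-ha
      volumes:
        - name: ha-storage
          persistentVolumeClaim:
            claimName: flink-ha-pvc  # hypothetical, pre-created ReadWriteMany PVC
```

Note the PVC would need an access mode (e.g. ReadWriteMany) that lets a rescheduled pod on another node reattach to it, which depends on the storage class your cluster provides.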