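Sorry, I forgot to attach the screenshot.
My Flink standalone cluster went down. When I checked the logs, I saw the error shown in the screenshot.
I can't work out the cause myself, and I couldn't find a matching problem on Google either. I'd appreciate your help, thanks!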

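Problem description: my Flink cluster, which was running jobs, went down after two days. The cause was that all the TaskManager processes died; only one JobManager was still alive.
Cluster environment: 5 CentOS 7 machines (32 cores, 256 GB memory), 2 JobManagers, 5 TaskManagers, 32 slots per machine. The JobManagers use ZooKeeper for high availability.
Preliminary analysis of the cause: a ZooKeeper problem.
Also: I accidentally cleaned up the logs, so I can't paste them as text.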

On Mon, Apr 22, 2019 at 1:27 PM Xintong Song <tonysong...@gmail.com> wrote:

> Hi naisili,
>
> This is the user-zh mailing list, so if you speak Chinese you can ask
> questions in Chinese. If you prefer using English, you can send emails to
> u...@flink.apache.org. Hope that helps you.
>
> BTW, I think you forgot to attach the screenshot.
>
> Thank you~
>
> Xintong Song
>
>
>
> On Mon, Apr 22, 2019 at 10:53 AM naisili Yuan <yuanlong1...@gmail.com>
> wrote:
>
> > I run a Flink standalone cluster, and I use ZooKeeper for JobManager
> > HA.
> > The screenshot is the log from when my TaskManager process went down; it shows an error.
> > I don't know why it failed, even after googling the error.
> > I'm asking for help, thanks.
> >
> >
> >
>
