Re: Flink on YARN job crashes and restarts repeatedly

2022-07-25 Thread Weihua Hu
You could check whether the JobManager ran out of memory and was OOM-killed. If you have more logs, please post them as well.
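
As a rough sketch of that check (not part of the original reply): once the JobManager container logs have been collected, for example with `yarn logs -applicationId <application id>`, a small script can flag lines that hint at a memory-related kill. The function name `classify_log_lines` and the three patterns below are assumptions for illustration; the exact wording of the memory-limit message depends on the Hadoop version, so adjust them to what your cluster actually prints.

```python
import re

# Assumed patterns for illustration only; adapt them to your cluster's real log messages.
SIGTERM_PATTERN = re.compile(r"RECEIVED SIGNAL 15: SIGTERM")
MEMORY_KILL_PATTERN = re.compile(r"beyond .*memory limit", re.IGNORECASE)
JAVA_OOM_PATTERN = re.compile(r"java\.lang\.OutOfMemoryError")


def classify_log_lines(lines):
    """Group log lines by which (hypothetical) symptom pattern they match."""
    findings = {"sigterm": [], "memory_limit_kill": [], "java_oom": []}
    for line in lines:
        if SIGTERM_PATTERN.search(line):
            findings["sigterm"].append(line)
        if MEMORY_KILL_PATTERN.search(line):
            findings["memory_limit_kill"].append(line)
        if JAVA_OOM_PATTERN.search(line):
            findings["java_oom"].append(line)
    return findings


if __name__ == "__main__":
    import sys

    # Usage: python check_jm_log.py jobmanager.log
    with open(sys.argv[1], encoding="utf-8", errors="replace") as handle:
        result = classify_log_lines(handle)
    for name, matched in result.items():
        print(f"{name}: {len(matched)} matching line(s)")
```

A SIGTERM line with no memory-related line next to it does not rule out an OOM kill, since the kill may only be recorded in the NodeManager log. If a memory-limit message does show up, raising `jobmanager.memory.process.size` is one thing to try.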

Best,
Weihua


On Mon, Jul 18, 2022 at 8:41 PM SmileSmile  wrote:

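> Hi all,
> We have run into the following scenario: Flink on YARN with a parallelism of
> 3000, and the job contains several agg operations. Whenever the job recovers
> from a checkpoint or savepoint, it consistently fails to restore and keeps
> restarting.
> The JM reports: org.apache.flink.runtime.entrypoint.ClusterEntrypoint[] -
> RECEIVED SIGNAL 15: SIGTERM. Shutting down as requested.
>
> Are there any good approaches for troubleshooting this?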


Flink on YARN job crashes and restarts repeatedly

2022-07-18 Thread SmileSmile
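Hi all,
We have run into the following scenario: Flink on YARN with a parallelism of 3000, and the job contains several agg operations. Whenever the job recovers from a checkpoint or savepoint, it consistently fails to restore and keeps restarting.
The JM reports: org.apache.flink.runtime.entrypoint.ClusterEntrypoint[] - RECEIVED SIGNAL 15: SIGTERM. Shutting down as requested.

Are there any good approaches for troubleshooting this?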