Hi all,
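We have run into the following scenario: Flink on YARN with a parallelism of 3000, in a job that contains multiple agg operations. Whenever the job recovers from a checkpoint or a savepoint, the restore fails every time and the job restarts repeatedly.
The JM logs this error:
org.apache.flink.runtime.entrypoint.ClusterEntrypoint        [] - RECEIVED SIGNAL 15: SIGTERM. Shutting down as requested.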

Are there any good approaches for troubleshooting this?



