RE: jobmanager holds too many CLOSE_WAIT connection to datanode

2018-08-24 Thread Yuan,Youjun
One safer approach is to execute cancel with savepoint on all jobs first >> this sounds great! Thanks, Youjun From: vino yang Sent: Friday, August 24, 2018 2:43 PM To: Yuan,Youjun ; user Subject: Re: jobmanager holds too many CLOSE_WAIT connection to datanode Hi Youjun, You c

Re: jobmanager holds too many CLOSE_WAIT connection to datanode

2018-08-24 Thread vino yang
re, how to avoid jobmanager holding so many, apparently not necessary, TCP connections? > Thanks > Youjun > *From:* vino yang > *Sent:* Friday, August 24, 2018 10:26 AM > *To:* Yuan,Youjun > *Cc:* user > *Subject:* Re: jobmanager holds too m

Re: jobmanager holds too many CLOSE_WAIT connection to datanode

2018-08-23 Thread vino yang
Hi Youjun, How long has your job been running? As far as I know, the jobmanager will not open so many connections to HDFS for checkpoints in a short time. What is your Flink cluster environment? Standalone or Flink on YARN? In addition, does JM's log show any timeout

jobmanager holds too many CLOSE_WAIT connection to datanode

2018-08-23 Thread Yuan,Youjun
Hi, After running for a while, my job manager holds thousands of CLOSE_WAIT TCP connections to the HDFS datanode. The number is growing slowly, and it will likely hit the max open file limit. My jobs checkpoint to HDFS every minute. If I run lsof -i -a -p $JMPID, I get tons of the following
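
To track how fast the CLOSE_WAIT count grows, the lsof output can be tallied per remote endpoint. The following is only a minimal sketch: the function name `count_close_wait` is made up for illustration, and it assumes the usual `lsof -i` layout in which the NAME column reads `local->remote` followed by the TCP state in parentheses. It is not taken from the thread.

```python
from collections import Counter


def count_close_wait(lsof_output: str) -> Counter:
    """Count CLOSE_WAIT connections per remote endpoint.

    Assumes each connection line contains a `local->remote` token and
    ends with the TCP state in parentheses, e.g. `(CLOSE_WAIT)`.
    """
    counts = Counter()
    for line in lsof_output.splitlines():
        # Skip the header and connections in any other state.
        if "(CLOSE_WAIT)" not in line:
            continue
        for token in line.split():
            if "->" in token:
                # Everything after "->" is the remote host:port.
                remote = token.split("->", 1)[1]
                counts[remote] += 1
                break
    return counts
```

One way to use it would be to save the output of `lsof -i -a -p $JMPID` to a file at regular intervals, pass the file contents to `count_close_wait`, and compare the totals over time to see which datanode endpoints accumulate connections.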