I suspect you suffer from a client submission failure, which also throws an AskTimeoutException.
The related configuration option is `akka.client.timeout`, which you can increase. There have also been cases where the problem was resolved by upgrading Java (to 8u212 or later).
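For reference, a minimal sketch of raising the client-side timeout, assuming a YARN per-job submission; the 600 s value and the jar name are illustrative:

```
# flink-conf.yaml
akka.client.timeout: 600 s

# or as a dynamic property for a single YARN submission:
flink run -m yarn-cluster -yD akka.client.timeout="600 s" my-job.jar
```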
Best,
tison.

On Mon, Nov 11, 2019 at 6:03 PM Zhu Zhu <[email protected]> wrote:

> Hi Dominik,
>
> Would you check the JM GC status?
> One possible cause is that the large number of file metas
> in HadoopInputFormat is burdening the JM memory.
>
> `akka.ask.timeout` is the default RPC timeout, while some RPCs may
> override this timeout for their own purposes; e.g. the RPCs from the
> web frontend usually use `web.timeout`.
> Providing the detailed call stack of the AskTimeoutException may help
> to identify where this timeout happened.
>
> Thanks,
> Zhu Zhu
>
> On Mon, Nov 11, 2019 at 3:35 AM Dominik Wosiński <[email protected]> wrote:
>
> > Hey,
> > I have a very specific use case. I have a history of records stored
> > as Parquet in S3. I would like to read and process them with Flink.
> > The issue is that the number of files is quite large (>100k). If I
> > provide the full list of files to the HadoopInputFormat that I am
> > using, it fails with an AskTimeoutException, which is weird since I
> > am using YARN and setting -yD akka.ask.timeout=600s. Even though,
> > according to the logs, the setting is processed properly, the job
> > still fails with an AskTimeoutException after 10s, which seems
> > strange to me. I have managed to work around this by grouping the
> > files and reading them in a loop, so that I end up with a
> > Seq[DataSet[Record]]. But if I try to union those datasets, I
> > receive the AskTimeoutException again. So my question is: what can
> > be the reason behind this exception being thrown, and why is the
> > setting ignored even though it is parsed properly?
> >
> > I will be glad for any help.
> >
> > Best Regards,
> > Dom.
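As an illustration of the workaround Dominik describes above (reading the files in batches and unioning the resulting datasets), here is a minimal Scala sketch. The `Record` type, the batch size, the S3 paths, and the `readBatch` helper are all placeholders; in the real job `readBatch` would wrap a HadoopInputFormat configured with the batch's input paths:

```scala
import org.apache.flink.api.scala._

// Placeholder record type; the actual schema comes from the Parquet files.
case class Record(id: Long, payload: String)

object GroupedRead {
  def main(args: Array[String]): Unit = {
    val env = ExecutionEnvironment.getExecutionEnvironment

    // Placeholder for the >100k S3 object paths.
    val allPaths: Seq[String] =
      (1 to 5000).map(i => s"s3://bucket/history/part-$i.parquet")

    // Placeholder reader: in the real job this would build a DataSet
    // from a HadoopInputFormat configured with `paths` as input paths.
    def readBatch(paths: Seq[String]): DataSet[Record] =
      env.fromCollection(paths.map(p => Record(p.hashCode.toLong, p)))

    // Read the files in batches so that no single input format has to
    // carry the metadata of all 100k+ files at once.
    val batches: Seq[DataSet[Record]] =
      allPaths.grouped(1000).map(readBatch).toSeq

    // Union the batches back into one DataSet. Note that the union is
    // only folded into the plan at execution time, so the JobManager
    // still has to handle the full job graph -- which is where this
    // thread suspects the timeout comes from.
    val all: DataSet[Record] = batches.reduce(_ union _)

    all.first(10).print()
  }
}
```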

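On Zhu Zhu's suggestion of checking the JM GC status, one way to get visibility is to enable GC logging on the JobManager JVM; a sketch assuming JDK 8 GC flags (the log path and heap size are illustrative):

```
# flink-conf.yaml: JDK 8 GC logging for the JobManager JVM
env.java.opts.jobmanager: -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:/tmp/jm-gc.log

# if the file metas do turn out to burden the JM, a larger heap may help:
jobmanager.heap.size: 4096m
```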