No worries :)
Thank you~
Xintong Song
On Mon, Oct 12, 2020 at 2:48 PM Paul Lam wrote:
Sorry for the misspelled name, Xintong
Best,
Paul Lam
On Oct 12, 2020, at 14:46, Paul Lam wrote:
Hi Xingtong,
Thanks a lot for the pointer!
It’s good to see there would be a new IO executor to take care of the TM
contexts. Looking forward to the 1.12 release!
Best,
Paul Lam
FYI, I just created FLINK-19568 [1] for tracking this issue.
Thank you~
Xintong Song
[1] https://issues.apache.org/jira/browse/FLINK-19568
On Mon, Oct 12, 2020 at 2:18 PM Xintong Song wrote:
Hi Paul,
Thanks for reporting this.
Indeed, Flink's RM currently performs several HDFS operations in the RPC
main thread when preparing the TM context, which may block the main thread
when HDFS is slow.
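For illustration only, here is a minimal Java sketch of the general pattern
of moving blocking I/O off a single main thread. It is not Flink's code: the
class, the two executors, and the methods prepareTmContext, startTaskManager
and handleHeartbeat are all hypothetical names, and the slow HDFS access is
simulated with a sleep.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.TimeUnit;

class IoOffloadSketch {
    // Daemon threads so the JVM can exit without an explicit shutdown.
    static final ThreadFactory DAEMON = r -> {
        Thread t = new Thread(r);
        t.setDaemon(true);
        return t;
    };

    // Stand-in for the single RPC main thread (hypothetical).
    static final ExecutorService mainThread = Executors.newSingleThreadExecutor(DAEMON);
    // Separate pool reserved for blocking I/O (hypothetical).
    static final ExecutorService ioExecutor = Executors.newFixedThreadPool(4, DAEMON);

    // Simulates the slow file system work needed to build a TM context.
    static String prepareTmContext(String containerId) {
        try {
            Thread.sleep(200);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "context-for-" + containerId;
    }

    // The blocking part runs on ioExecutor; only the cheap follow-up
    // step is scheduled back onto the main thread.
    static CompletableFuture<String> startTaskManager(String containerId) {
        return CompletableFuture
                .supplyAsync(() -> prepareTmContext(containerId), ioExecutor)
                .thenApplyAsync(ctx -> "launched:" + ctx, mainThread);
    }

    // Heartbeats are handled on the main thread, so they are only
    // delayed by whatever else is occupying that thread.
    static CompletableFuture<String> handleHeartbeat(String tmId) {
        return CompletableFuture.supplyAsync(() -> "heartbeat-ack:" + tmId, mainThread);
    }

    public static void main(String[] args) throws Exception {
        CompletableFuture<String> launch = startTaskManager("container_1");
        // The main thread is free while the I/O pool does the slow work.
        System.out.println(handleHeartbeat("tm_0").get(5, TimeUnit.SECONDS));
        System.out.println(launch.get(5, TimeUnit.SECONDS));
    }
}
```

If prepareTmContext were instead called directly on mainThread, every
heartbeat queued behind it would wait for the slow call to finish, which is
the situation described above.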
Unfortunately, I don't see any out-of-the-box approach that fixes the problem
at the moment,
Hi,
After FLINK-13184 is implemented (even with Flink 1.11), occasionally there
would still be jobs with high parallelism getting TM-RM heartbeat timeouts
when the RM is busy creating TM contexts on cluster initialization and HDFS
is slow at that moment.
Apart from increasing the TM heartbeat