Tao Wang created FLINK-6020:
-------------------------------
Summary: Blob Server cannot handle multiple job submissions (with same
content) in parallel
Key: FLINK-6020
URL: https://issues.apache.org/jira/browse/FLINK-6020
Project: Flink
Issue Type: Bug
Reporter: Tao Wang
Priority: Critical
In yarn-cluster mode, if we submit the same job multiple times in parallel, the
tasks will encounter class loading problems and lease occupation.
This happens because the blob server stores user jars under a name generated
from their sha1sum: it first writes a temp file and then moves it to its final
location. For recovery, it also puts them on HDFS under the same file name.
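The naming scheme described above can be sketched roughly as follows. This is only an illustration, not the actual BlobServer code: the class name ContentAddressedStore, the "blob_" prefix, and the "temp-" file pattern are all assumptions made for the example. It shows why two uploads of the same jar end up targeting the same final file.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical sketch of content-addressed storage; not Flink's implementation.
public class ContentAddressedStore {

    // File name derived only from the content: identical jars get identical names.
    // The "blob_" prefix is an assumption for this example.
    public static String fileNameFor(byte[] content) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-1");
            StringBuilder sb = new StringBuilder("blob_");
            for (byte b : md.digest(content)) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-1 not available", e);
        }
    }

    // Write to a temp file first, then move it to the final hash-derived name.
    // Concurrent callers with the same content all move onto the same target.
    public static Path put(Path storeDir, byte[] content) {
        try {
            Path tmp = Files.createTempFile(storeDir, "temp-", ".blob");
            Files.write(tmp, content);
            Path target = storeDir.resolve(fileNameFor(content));
            return Files.move(tmp, target, StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Since the final name depends only on the bytes, nothing in this scheme distinguishes one submission from another once the content matches.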
At the same time, when multiple clients submit the same job with the same jar,
the local jar files in the blob server and the files on HDFS are handled by
multiple threads (BlobServerConnection), which interfere with each other.
It would be better to have a way to handle this. Two ideas come to mind:
1. lock the write operation, or
2. use some unique identifier as the file name instead of (or in addition to)
the sha1sum of the file contents.
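Both ideas could look roughly like the sketch below. It is a minimal illustration under stated assumptions, not a proposed patch: BlobWriteGuard, withKeyLock, and uniqueName are hypothetical names, and the lock only covers writers inside a single JVM, so it would not by itself coordinate writes to HDFS from several processes.

```java
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical helper illustrating the two ideas; not part of Flink.
public class BlobWriteGuard {

    // One lock per content key (e.g. the sha1sum), created on first use.
    private final ConcurrentHashMap<String, ReentrantLock> locks = new ConcurrentHashMap<>();

    // Idea 1: serialize all writers that target the same content key,
    // while writers for different keys still proceed in parallel.
    public void withKeyLock(String contentKey, Runnable writeAction) {
        ReentrantLock lock = locks.computeIfAbsent(contentKey, k -> new ReentrantLock());
        lock.lock();
        try {
            writeAction.run();
        } finally {
            lock.unlock();
        }
    }

    // Idea 2: make the file name unique per upload by appending a random
    // identifier to the content key, so concurrent uploads never share a file.
    public static String uniqueName(String contentKey) {
        return contentKey + "-" + UUID.randomUUID();
    }
}
```

The trade-off is that locking keeps a single copy per jar but makes uploads wait on each other, whereas unique names avoid contention at the cost of storing duplicate copies of the same content.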