Till Rohrmann created FLINK-6528:
------------------------------------
Summary: Refactor BlobLibraryCacheManager <=> BlobServer/Cache <=>
BlobStore hierarchy
Key: FLINK-6528
URL: https://issues.apache.org/jira/browse/FLINK-6528
Project: Flink
Issue Type: Improvement
Components: Distributed Coordination
Affects Versions: 1.3.0, 1.4.0
Reporter: Till Rohrmann
Currently, the {{BlobLibraryCacheManager}} is responsible, among other things,
for keeping reference counts for files stored in the {{BlobServer}}.
Additionally, it has a periodic timer task which deletes files from the
{{BlobServer}} if their reference count is {{0}}.
The {{BlobServer}} is responsible for storing files locally (on the
{{JobManager}}). Moreover, it has a {{BlobStore}} reference which it uses to
store files persistently for HA.
I think making the {{BlobLibraryCacheManager}} responsible for reference
counting is the wrong separation of concerns. Instead, I would propose to move
this functionality to the {{BlobServer}} and {{BlobCache}}, respectively. That
way, the {{BlobLibraryCacheManager}} would only be responsible for keeping
track of the blobs relevant for a given job and its user code class loader.
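
To make the proposed split concrete, here is a minimal sketch of how it could look. All class and method names below ({{RefCountingBlobStore}}, {{JobBlobTracker}}, {{register}}, {{release}}, {{cleanup}}) are hypothetical and not part of the existing Flink API; the in-memory map stands in for local file storage, and {{cleanup()}} stands in for the periodic timer task.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Map;
import java.util.Set;

/** Hypothetical stand-in for a BlobServer/BlobCache that owns the reference counts. */
class RefCountingBlobStore {
    private final Map<String, byte[]> blobs = new HashMap<>();
    private final Map<String, Integer> refCounts = new HashMap<>();

    /** Stores a blob; a newly stored blob starts with a reference count of 0. */
    synchronized void put(String key, byte[] data) {
        blobs.put(key, data.clone());
        refCounts.putIfAbsent(key, 0);
    }

    /** Increments the reference count of an existing blob. */
    synchronized void register(String key) {
        if (!blobs.containsKey(key)) {
            throw new IllegalArgumentException("Unknown blob: " + key);
        }
        refCounts.merge(key, 1, Integer::sum);
    }

    /** Decrements the reference count; the blob is only deleted by cleanup(). */
    synchronized void release(String key) {
        Integer count = refCounts.get(key);
        if (count == null || count == 0) {
            throw new IllegalStateException("Blob not referenced: " + key);
        }
        refCounts.put(key, count - 1);
    }

    /** Deletes every blob whose reference count is 0 and returns how many were deleted. */
    synchronized int cleanup() {
        int removed = 0;
        Iterator<Map.Entry<String, Integer>> it = refCounts.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Integer> entry = it.next();
            if (entry.getValue() == 0) {
                blobs.remove(entry.getKey());
                it.remove();
                removed++;
            }
        }
        return removed;
    }

    synchronized boolean contains(String key) {
        return blobs.containsKey(key);
    }
}

/** Hypothetical stand-in for a slimmed-down library cache manager: job-to-blob bookkeeping only. */
class JobBlobTracker {
    private final RefCountingBlobStore store;
    private final Map<String, Set<String>> blobsPerJob = new HashMap<>();

    JobBlobTracker(RefCountingBlobStore store) {
        this.store = store;
    }

    /** Records that a job needs a blob; counts each (job, blob) pair at most once. */
    synchronized void registerJob(String jobId, String blobKey) {
        if (blobsPerJob.computeIfAbsent(jobId, id -> new HashSet<>()).add(blobKey)) {
            store.register(blobKey);
        }
    }

    /** Releases all blobs of a finished job; the store decides when to delete them. */
    synchronized void unregisterJob(String jobId) {
        Set<String> keys = blobsPerJob.remove(jobId);
        if (keys != null) {
            keys.forEach(store::release);
        }
    }
}
```

In this sketch the tracker never deletes anything itself. It only tells the store which blobs a job holds, and the store alone decides when a blob with a count of {{0}} can go.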
Additionally, we could consider merging the {{BlobServer}}/{{BlobCache}} and
the {{BlobStore}} functionality by having a
{{FileSystemBlobServer}}/{{FileSystemBlobCache}} which would be configured
with a DFS path in the HA case.
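
As a rough illustration of the merged variant, the sketch below shows a blob store whose only storage is a configurable base directory. The class name {{FileBackedBlobStore}} and its methods are hypothetical. It also uses {{java.nio.file}} on a local path purely as a stand-in; a real implementation for the HA case would presumably go through Flink's file system abstraction so that the path can point to a DFS.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical blob store that keeps all blobs under one configurable base directory. */
class FileBackedBlobStore {
    private final Path baseDir;

    /** In the HA case the base directory would be the configured (distributed) storage path. */
    FileBackedBlobStore(Path baseDir) {
        try {
            this.baseDir = Files.createDirectories(baseDir);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Convenience factory for the non-HA case: blobs live in a local temporary directory. */
    static FileBackedBlobStore inTempDir() {
        try {
            return new FileBackedBlobStore(Files.createTempDirectory("blobs"));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Writes the blob to a file named after its key (assumed to be a plain file name). */
    void put(String key, byte[] data) {
        try {
            Files.write(baseDir.resolve(key), data);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads the blob back from the base directory. */
    byte[] get(String key) {
        try {
            return Files.readAllBytes(baseDir.resolve(key));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Deletes the blob file; returns false if it did not exist. */
    boolean delete(String key) {
        try {
            return Files.deleteIfExists(baseDir.resolve(key));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With this shape, the HA and non-HA setups would differ only in the path handed to the constructor, rather than in a separate {{BlobStore}} reference.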