[
https://issues.apache.org/jira/browse/FLINK-1025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Robert Metzger updated FLINK-1025:
----------------------------------
Fix Version/s: 0.7-incubating (was: 0.6-incubating)
> Improve BLOB Service
> --------------------
>
> Key: FLINK-1025
> URL: https://issues.apache.org/jira/browse/FLINK-1025
> Project: Flink
> Issue Type: Improvement
> Components: JobManager
> Affects Versions: 0.6-incubating
> Reporter: Stephan Ewen
> Assignee: Daniel Warneke
> Fix For: 0.7-incubating
>
>
> I like the idea of making it transparent where the blob service runs, so the
> code on the server/client side is agnostic to that.
> The current merged code is in
> https://github.com/StephanEwen/incubator-flink/commits/blobservice
> Local tests pass, I am trying distributed tests now.
> There are a few suggestions for improvements:
> - Since all the resources are bound to a job or session, it makes sense
> to make all puts/gets relative to a jobId (becoming session id) and to have a
> cleanup hook that deletes all resources associated with that job.
> - The BLOB service is hardwired to compute a message digest for the
> contents, and to use that as the key. While it may make sense for jar files
> (cached libraries), for many cases in the future, that will be unnecessary
> and only add overhead. I would vote to make this optional and allow just
> UUIDs for keys. An example is for the taskmanager to put a part of an
> intermediate result into the blob store, for the client to pick it up.
> - At most points, we have started moving away from configured ports, because
> of configuration overhead and collisions in setups where multiple instances
> end up on one machine. The latter actually happens frequently with YARN. I
> would suggest having the JM open a port dynamically for the BlobService
> (similar to TaskManager#getAvailablePort() ). RPC calls to figure out this
> configuration need to happen only once between client/JM and TM/JM. We can
> stomach that overhead ;-)
> - The write method does not write the length a single time, but "per
> buffer". Why is it done that way? The array-based methods know the length up
> front, and when the contents come from an input stream, I think we know the
> length as well (for files: filesize, for network: sent up front).
> - I am personally in favor of moving away from static singleton registries.
> They tend to cause trouble during testing and in pseudo cluster modes
> (multiple workers within one JVM). How hard is it to have a BlobService at
> the TaskManager / JobManager that we can pass as a reference to the points
> where it is needed?
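
The suggestions above could be sketched roughly as follows. This is only an
illustrative in-memory sketch, not the code on the blobservice branch: the
class JobScopedBlobStore and all of its method names are hypothetical, job
ids are plain strings, and SHA-1 is an assumed choice of digest. It shows
puts/gets relative to a job id, UUID keys with content digests as an opt-in,
a per-job cleanup hook, a server socket on an OS-assigned port, and a length
written once up front. It is a plain object with no static registry, so it
can be passed by reference to wherever it is needed.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.net.ServerSocket;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical in-memory sketch; not Flink's actual BLOB service API. */
final class JobScopedBlobStore {

    // jobId -> (key -> contents); every blob is owned by exactly one job
    private final Map<String, Map<String, byte[]>> blobsByJob = new ConcurrentHashMap<>();

    /** Stores a blob under a fresh random UUID key, with no digest computation. */
    String put(String jobId, byte[] contents) {
        String key = UUID.randomUUID().toString();
        store(jobId, key, contents);
        return key;
    }

    /** Opt-in content-addressed put (e.g. cached jars): the key is the SHA-1 hex digest. */
    String putContentAddressed(String jobId, byte[] contents) throws NoSuchAlgorithmException {
        byte[] digest = MessageDigest.getInstance("SHA-1").digest(contents);
        StringBuilder hex = new StringBuilder();
        for (byte b : digest) {
            hex.append(String.format("%02x", b));
        }
        store(jobId, hex.toString(), contents);
        return hex.toString();
    }

    /** Returns a copy of the blob, or null if the job or key is unknown. */
    byte[] get(String jobId, String key) {
        Map<String, byte[]> blobs = blobsByJob.get(jobId);
        byte[] contents = (blobs == null) ? null : blobs.get(key);
        return (contents == null) ? null : contents.clone();
    }

    /** Cleanup hook: drops every blob of the job and returns how many were removed. */
    int cleanupJob(String jobId) {
        Map<String, byte[]> removed = blobsByJob.remove(jobId);
        return (removed == null) ? 0 : removed.size();
    }

    private void store(String jobId, String key, byte[] contents) {
        blobsByJob.computeIfAbsent(jobId, id -> new ConcurrentHashMap<>())
                  .put(key, contents.clone());
    }

    /** Port 0 asks the OS for any free port; getLocalPort() then reports the one chosen. */
    static ServerSocket openOnFreePort() throws IOException {
        return new ServerSocket(0);
    }

    /** Writes the length a single time, followed by the payload. */
    static void writeFramed(DataOutputStream out, byte[] contents) throws IOException {
        out.writeInt(contents.length);
        out.write(contents);
    }

    /** Reads one blob written by writeFramed. */
    static byte[] readFramed(DataInputStream in) throws IOException {
        byte[] contents = new byte[in.readInt()];
        in.readFully(contents);
        return contents;
    }
}
```

In this sketch, the port chosen by openOnFreePort() would be reported to
clients and TaskManagers once via RPC, and cleanupJob(jobId) would be called
when the job or session ends.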
--
This message was sent by Atlassian JIRA
(v6.2#6252)