[ https://issues.apache.org/jira/browse/MESOS-700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13806345#comment-13806345 ]
Sam Taha commented on MESOS-700:
--------------------------------

I suggest that any "readonly" resource (tgz/zip, JAR, etc.) referenced by an executor CommandInfo.URI is always put in the Framework cache for each slave (XXX directory) so that it is available to any future executor in that Framework. The new md5 field in CommandInfo.URI would essentially indicate that the resource is readonly and cacheable at the Framework level, and can be shared by all executors of a given Framework. Something like this:

{code:title=CommandInfo.URI|borderStyle=solid}
message URI {
  required string value = 1;
  optional bool executable = 2;
  optional bool md5 = 3; // if present, indicates this resource can be cached at Framework level on slave
}
{code}

Then any executor that runs on a given slave/Framework and references one of the cached resources (with the correct resource name and md5 hash) would get a symlink to that resource in its working directory, which avoids copying it from the slave/Framework cache into the executor's run working directory. For tgz/zip resources, the cached expanded directory is symlinked into the executor's working directory.

So as future executors run that reference previously cached resources (with the same md5), no remote or local copying is done; only symlinks to the local cache are set up in the executor's working directory.

> more efficient distribution of frameworks via HDFS
> --------------------------------------------------
>
>                 Key: MESOS-700
>                 URL: https://issues.apache.org/jira/browse/MESOS-700
>             Project: Mesos
>          Issue Type: Improvement
>          Components: framework
>    Affects Versions: 0.13.0, 0.14.0, 0.15.0
>         Environment: general
>            Reporter: Du Li
>             Fix For: 0.15.0
>
>
> I was exploring the latest code (0.15.0) at https://github.com/apache/mesos
> to test the tgz distribution of frameworks. Take spark for example. I created
> a tgz of the spark binary and put it on HDFS. After a job is submitted, it is
> decomposed into many tasks. For each task, the assigned mesos slave downloads
> the tgz from HDFS, unzips it, and executes some script to launch the task.
> This seems very wasteful and unnecessary.
> Does the following suggestion make sense? When a spark job is submitted, the
> spark/mesos master calculates a checksum or something like it for the tgz
> distribution. Then the checksum is sent to the slaves when tasks are
> assigned. If the same file has already been downloaded/unzipped, a slave
> directly launches the task. This way the tgz is processed at most once for
> each job (which may have thousands of tasks). The aggregated saving would be
> tremendous.



--
This message was sent by Atlassian JIRA
(v6.1#6144)
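
For illustration only, the cache-and-symlink behaviour suggested in the comment above could be sketched roughly as follows. This is not Mesos code. The function names, the cache layout (one directory per hash), and the `fetch` callback are all hypothetical, and the sketch assumes the md5 value travels as a hex digest string. It handles plain files only and skips tgz/zip expansion.

```python
import hashlib
import os
from pathlib import Path
from typing import Callable


def file_md5(path: Path) -> str:
    """Return the hex MD5 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.md5()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def link_cached_resource(cache_dir: Path, work_dir: Path, name: str,
                         md5_hex: str, fetch: Callable[[Path], None]) -> bool:
    """Symlink a cached resource into work_dir, fetching it only on a miss.

    Returns True on a cache hit, False if the resource had to be fetched.
    """
    # Hypothetical layout: <cache_dir>/<md5>/<resource name>
    cached = cache_dir / md5_hex / name
    hit = cached.exists()
    if not hit:
        cached.parent.mkdir(parents=True, exist_ok=True)
        tmp = cached.parent / (name + ".part")
        fetch(tmp)  # hypothetical downloader, e.g. a copy out of HDFS
        if file_md5(tmp) != md5_hex:
            tmp.unlink()
            raise ValueError(f"md5 mismatch for {name}")
        # Rename into place so other executors never see a partial file.
        os.replace(tmp, cached)
    work_dir.mkdir(parents=True, exist_ok=True)
    link = work_dir / name
    if link.is_symlink() or link.exists():
        link.unlink()
    # Point the executor's working directory at the shared cache entry.
    link.symlink_to(cached.resolve())
    return hit
```

With this shape, the first executor on a slave pays for the download and every later executor that presents the same name and hash only gets a symlink. A real implementation would also need locking between concurrent executors and some cache eviction policy, both of which are left out here.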