[ https://issues.apache.org/jira/browse/MESOS-1554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Chris Lambert updated MESOS-1554:
---------------------------------
    Epic Name: Persistence
    Epic Status: To Do
    Issue Type: Epic  (was: Story)

> Persistent resources support for storage-like services
> ------------------------------------------------------
>
>                 Key: MESOS-1554
>                 URL: https://issues.apache.org/jira/browse/MESOS-1554
>             Project: Mesos
>          Issue Type: Epic
>          Components: general, hadoop
>            Reporter: Nikita Vetoshkin
>            Priority: Minor
>
> This question came up in the [dev mailing list|http://mail-archives.apache.org/mod_mbox/mesos-dev/201406.mbox/%3CCAK8jAgNDs9Fe011Sq1jeNr0h%3DE-tDD9rak6hAsap3PqHx1y%3DKQ%40mail.gmail.com%3E].
> It seems reasonable for storage-like services (e.g. HDFS or Cassandra) to use
> Mesos to manage their instances. But right now, if we'd like to restart an
> instance (e.g. to spin up a new version), all of the previous instance's
> sandbox filesystem resources will be recycled by the slave's garbage collector.
> At the moment, filesystem resources can be managed out of band, i.e.
> instances can save their data in some database-specific place that various
> instances can share (e.g. {{/var/lib/cassandra}}).
> [~benjaminhindman] suggested an idea on the mailing list (though it still
> needs some fleshing out):
> {quote}
> The idea originally came about because, even today, if we allocate some
> file system space to a task/executor, and then that task/executor
> terminates, we haven't officially "freed" those file system resources until
> after we garbage collect the task/executor sandbox! (We keep the sandbox
> around so a user/operator can get the stdout/stderr or anything else left
> around from their task/executor.)
> To solve this problem we wanted to be able to let a task/executor terminate
> but not *give up* all of its resources, hence: persistent resources.
> Pushing this concept even further you could imagine always reallocating
> resources to a framework that had already been allocated those resources
> for a previous task/executor.
> Looked at from another perspective, these are
> "late-binding", or "lazy", resource reservations.
> At one point in time we had considered just doing 'right-of-first-refusal'
> for allocations after a task/executor terminates. But this is really
> insufficient for supporting storage-like frameworks well (and likely even
> harder to reliably implement than 'persistent resources' IMHO).
> There are a ton of things that need to get worked out in this model,
> including (but not limited to): how should a file system (or disk) be
> exposed in order to be made persistent? How should persistent resources be
> returned to a master? How many persistent resources can a framework get
> allocated?
> {quote}

--
This message was sent by Atlassian JIRA
(v6.2#6252)