Hi @tobias, I think a lot of people encounter such problems. I saw in the CSI design document <https://docs.google.com/document/d/125YWqg_5BB5OY9a6M7LZcby5RSqBwo2PZzpVLuxYXh4/edit> (from @jieyu) that Mesos is adding a new component, the resource provider, which may help resolve the data locality problem.
For dynamic attributes, I think this is also doable; we could expose them via HTTP APIs, just like dynamic reservations.

On Wed, Jun 28, 2017 at 8:22 AM, Tobias Pfeiffer <t...@preferred.jp> wrote:
> Hi,
>
> one of the major selling points of HDFS is (was?) that it is possible to
> schedule a Hadoop job close to where the data that it operates on is. I am
> not using HDFS, but I was wondering if/how Mesos supports an approach to
> schedule a job to a machine that has a certain file/dataset already locally
> as opposed to scheduling it to a machine that would have to access it via
> the network or download it to the local disk first.
>
> I was wondering if Mesos attributes could be used: I could have an
> attribute `datasets` of type `set`, and then node A could have {dataset1,
> dataset17, dataset3} and node B could have {dataset17, dataset5}, and during
> scheduling I could decide based on this attribute where to run a task.
> However, I was wondering if dynamic changes of such attributes are
> possible. Imagine that node A deletes dataset17 from the local cache and
> downloads dataset5 instead; then I would like to update the `datasets`
> attribute dynamically, but without affecting the jobs that are running on
> node A. Is such a thing possible?
>
> Is there an approach other than attributes to describe the data that
> resides on a node in order to achieve data locality?
>
> Thanks
> Tobias

--
Best Regards,
Haosdent Huang
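As an illustration of the attribute-based scheduling idea from the quoted mail, here is a minimal sketch of how a framework scheduler might prefer offers from agents that already hold a dataset. It assumes a simplified dict representation of offers; real Mesos offers carry attributes as protobuf messages, and the `datasets` attribute name, helper names, and offer layout here are all hypothetical:

```python
# Sketch: pick an agent whose `datasets` attribute (a set-valued Mesos
# attribute, e.g. "{dataset1,dataset17,dataset3}") already contains the
# dataset a task needs. The offer dicts below are a simplified stand-in
# for real Mesos Offer protobufs.

def parse_set_attribute(attr_value):
    """Split a Mesos set-style attribute value like '{dataset1,dataset3}'."""
    return set(attr_value.strip("{}").split(","))

def pick_offer(offers, needed_dataset):
    """Prefer an offer from an agent that already holds the dataset locally."""
    for offer in offers:
        datasets = parse_set_attribute(offer["attributes"].get("datasets", "{}"))
        if needed_dataset in datasets:
            return offer
    # Fall back to any offer; the task would then fetch the data remotely.
    return offers[0] if offers else None

offers = [
    {"agent": "A", "attributes": {"datasets": "{dataset1,dataset17,dataset3}"}},
    {"agent": "B", "attributes": {"datasets": "{dataset17,dataset5}"}},
]
print(pick_offer(offers, "dataset5")["agent"])  # -> B (holds dataset5 locally)
```

Today agent attributes are set at agent startup, so keeping them in sync with a changing local cache is exactly the gap the dynamic-update proposal above would address.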