Thanks for listing this out, Adam.

Data Residency:
> - Should we destroy the sandbox/hdfs-data when shutting down a DN?
> - If starting DN on node that was previously running a DN, can/should we
> try to revive the existing data?
>

I think this is one of the key challenges for a production-quality HDFS on
Mesos. Currently, since the sandbox is deleted after a task exits, if all
the data nodes that hold a block (and its replicas) are lost or killed for
whatever reason, that data is lost. A short-term solution would be to
write outside the sandbox and use slave attributes to track where to
re-launch data node tasks.
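
To make the idea concrete, here is a rough sketch of how a scheduler
might pick an offer when re-launching a data node. It is only an
illustration: offers are modeled as plain dicts rather than real Mesos
messages, and the attribute name `hdfs_datanode_dir`, the function
`pick_offer_for_datanode`, and the `known_hosts` set are all made up for
this example.

```python
from typing import Dict, List, Optional, Set

# Hypothetical attribute an operator would set on slaves that keep a
# DataNode directory outside the sandbox (name invented for this sketch).
DATA_ATTR = "hdfs_datanode_dir"


def pick_offer_for_datanode(
    offers: List[Dict], known_hosts: Set[str]
) -> Optional[Dict]:
    """Choose an offer for a DataNode task.

    Prefer a tagged slave that previously ran a DataNode (so existing
    block data can be revived); otherwise fall back to any tagged slave.
    Returns None if no offer carries the attribute.
    """
    tagged = [o for o in offers if DATA_ATTR in o.get("attributes", {})]
    for offer in tagged:
        if offer["hostname"] in known_hosts:
            return offer  # this node already holds DataNode data
    return tagged[0] if tagged else None
```

The scheduler would have to persist `known_hosts` somewhere that
survives its own restarts, otherwise it forgets where the data lives.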
