Good question; I don't know of one, but I believe people at Cloudera had some 
thoughts of porting Sqoop to Spark in the future, and maybe they'd consider 
DistCp as part of this effort. I agree it's missing right now.

Matei

On August 12, 2014 at 11:04:28 AM, Gary Malouf (malouf.g...@gmail.com) wrote:

We are probably still in the minority, but our analytics platform based on 
Spark + HDFS does not have MapReduce installed. I'm wondering if there is a 
distcp equivalent that leverages Spark to do the work.

Our team is trying to find the best way to do cross-datacenter replication of 
our HDFS data to minimize the impact of outages or datacenter failures.
