DistCP - Spark-based

Gary Malouf Tue, 12 Aug 2014 11:04:48 -0700

We are probably still in the minority, but our analytics platform, based on
Spark + HDFS, does not have MapReduce installed.  I'm wondering if there is
a distcp equivalent that leverages Spark to do the work.


Our team is trying to find the best way to do cross-datacenter replication
of our HDFS data to minimize the impact of outages or a datacenter failure.
