We are probably still in the minority, but our analytics platform, based on
Spark + HDFS, does not have MapReduce installed.  I'm wondering if there is
a distcp equivalent that uses Spark to do the work.

Our team is trying to find the best way to do cross-datacenter replication
of our HDFS data to minimize the impact of outages or a datacenter failure.
