I've created SPARK-3499 <https://issues.apache.org/jira/browse/SPARK-3499> to
track creating a Spark-based distcp utility.

Nick

On Tue, Aug 12, 2014 at 4:20 PM, Matei Zaharia <matei.zaha...@gmail.com>
wrote:

> Good question; I don't know of one, but I believe people at Cloudera had
> some thoughts about porting Sqoop to Spark in the future, and maybe they'd
> consider DistCP as part of this effort. I agree it's missing right now.
>
> Matei
>
> On August 12, 2014 at 11:04:28 AM, Gary Malouf (malouf.g...@gmail.com)
> wrote:
>
> We are probably still in the minority, but our analytics platform based on
> Spark + HDFS does not have MapReduce installed. I'm wondering if there is
> a distcp equivalent that leverages Spark to do the work.
>
> Our team is trying to find the best way to do cross-datacenter replication
> of our HDFS data to minimize the impact of outages or datacenter failures.
>
>
