Hi Deepak,
       Thanks for your response. If I understand correctly, you are
suggesting that I read all of those files into an RDD using wholeTextFiles,
apply a compression codec when saving, and write the RDD out to the other
Hadoop cluster?
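
Something along the lines of this minimal sketch? (The cluster URIs and
paths below are placeholders for our environment, and I'm assuming gzip
as the codec just as an example.)

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.hadoop.io.compress.GzipCodec

    object CompressAndCopy {
      def main(args: Array[String]): Unit = {
        val conf = new SparkConf().setAppName("CompressAndCopy")
        val sc = new SparkContext(conf)

        // Read every file under the source path as (filePath, fileContent) pairs.
        // NOTE: "source-cluster" / "dest-cluster" and the paths are placeholders.
        val files = sc.wholeTextFiles("hdfs://source-cluster:8020/data/input")

        // Drop the path keys and write the contents out gzip-compressed
        // to the destination cluster.
        files.values.saveAsTextFile(
          "hdfs://dest-cluster:8020/data/output",
          classOf[GzipCodec])

        sc.stop()
      }
    }

If that matches what you had in mind, I'll adapt it to our paths and codec.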

Thank you,
Ajay

On Tuesday, May 10, 2016, Deepak Sharma <deepakmc...@gmail.com> wrote:

> Hi Ajay
> You can look at the wholeTextFiles method, which gives you an
> RDD[(String, String)], and then save each RDD with saveAsTextFile.
> This will serve the purpose.
> I don't think anything like distcp exists in Spark by default.
>
> Thanks
> Deepak
> On 10 May 2016 11:27 pm, "Ajay Chander" <itsche...@gmail.com> wrote:
>
>> Hi Everyone,
>>
>> We are planning to migrate data between 2 clusters, and I see distcp
>> doesn't support data compression. Is there any efficient way to compress
>> the data during the migration? Can I implement a Spark job to do this?
>>  Thanks.
>>
>
