Re: Mapreduce to and from public clouds

2019-06-18 Thread Steve Loughran
You can use any of the cloud stores as the destination of work; just get their connectors on the classpath and use the matching URL scheme: s3a://, gs://, or wasb://. To use S3 as a destination with good performance and consistency, you need the S3A committers (see the hadoop-aws docs), and to chain work safely, the S3Guard consistency tool. To
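A minimal sketch of how the S3A committer settings mentioned above might be passed to a job as generic options. It assumes Hadoop 3.1 or later with hadoop-aws on the classpath and a job driver that goes through ToolRunner/GenericOptionsParser; the property names are given as I recall them from the hadoop-aws committer documentation, so check them against your version. The class name `S3aJobOptions` and the choice of the "directory" committer are purely illustrative.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper: collects the settings that select an S3A committer
// and renders them as "-D key=value" generic options for `hadoop jar`.
final class S3aJobOptions {

    // committerName is assumed to be one of the names the hadoop-aws docs
    // describe ("directory", "partitioned", "magic").
    static Map<String, String> build(String committerName) {
        Map<String, String> props = new LinkedHashMap<>();
        // Route output committed to s3a:// paths through the S3A committer factory.
        props.put("mapreduce.outputcommitter.factory.scheme.s3a",
                  "org.apache.hadoop.fs.s3a.commit.S3ACommitterFactory");
        // Pick which S3A committer the factory should create.
        props.put("fs.s3a.committer.name", committerName);
        return props;
    }

    // Render the properties in insertion order as generic options.
    static String toGenericOptions(Map<String, String> props) {
        StringJoiner joiner = new StringJoiner(" ");
        for (Map.Entry<String, String> e : props.entrySet()) {
            joiner.add("-D " + e.getKey() + "=" + e.getValue());
        }
        return joiner.toString();
    }

    public static void main(String[] args) {
        System.out.println(toGenericOptions(build("directory")));
    }
}
```

The printed options would go between the job's main class and its own arguments, with the output path given as an s3a:// URL. S3Guard is configured separately and is not covered by this sketch.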

Re: Mapreduce to and from public clouds

2019-06-14 Thread Amit Kabra
Any help here?

On Thu, Jun 13, 2019 at 12:38 PM Amit Kabra wrote:
> Hello,
>
> I have a requirement to read and write data to a public cloud from a
> MapReduce job.
>
> Our systems currently read and write data on HDFS using MapReduce, and
> it's working well; we write data in sequencef

Mapreduce to and from public clouds

2019-06-13 Thread Amit Kabra
Hello, I have a requirement to read and write data to a public cloud from a MapReduce job. Our systems currently read and write data on HDFS using MapReduce, and it's working well; we write data in SequenceFile format. We might have to move data to a public cloud, i.e. S3 / GCP. Where everyt