+1, we need this functionality. Is it going to be a single operator or multiple operators? If there are multiple operators, can you explain what functionality each one will provide?
Regards,
-Tushar.

On Wed, Mar 23, 2016 at 5:01 PM, Yogi Devendra <[email protected]> wrote:

> Writing to S3 is a common use-case for applications.
> This module will definitely be helpful.
>
> +1 for adding this module.
>
> ~ Yogi
>
> On 22 March 2016 at 13:52, Chaitanya Chebolu <[email protected]>
> wrote:
>
> > Hi All,
> >
> > I am proposing an S3 output copy module. The primary functionality of
> > this module is uploading files to an S3 bucket using a block-by-block
> > approach.
> >
> > Below is the JIRA created for this task:
> > https://issues.apache.org/jira/browse/APEXMALHAR-2022
> >
> > The design of this module is similar to the HDFS copy module, so I will
> > extend the HDFS copy module for S3.
> >
> > Design of this Module:
> > =======================
> > 1) Write the blocks into HDFS.
> > 2) Merge the blocks into a file.
> > 3) Upload the merged file into the S3 bucket using the AmazonS3Client
> > APIs.
> >
> > Steps (1) & (2) are the same as in the HDFS copy module.
> >
> > *Limitation:* Supports files up to 5 GB in size. Please refer to the
> > link below for the limitations on uploading objects into S3:
> > http://docs.aws.amazon.com/AmazonS3/latest/dev/UploadingObjects.html
> >
> > We can resolve the above limitation by using the S3 multipart feature.
> > I will add multipart support in the next iteration.
> >
> > Please share your thoughts on this.
> >
> > Regards,
> > Chaitanya
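To make the 5 GB limitation in the proposal concrete, here is a rough sketch of the size check that would decide between a single upload and the multipart path. It is only an illustration, not the proposed module's code: the class name `S3UploadPlan`, its methods, and the 128 MB part size are hypothetical. The limits used (5 GB for a single PUT, a 5 MB minimum part size for all but the last part, at most 10,000 parts) are the ones documented by S3, with 5 GB treated here as 5 * 2^30 bytes.

```java
// Hypothetical helper, not part of Apex Malhar or the AWS SDK.
final class S3UploadPlan {
    // S3 limits as documented by AWS (5 GB assumed to mean 5 * 2^30 bytes).
    static final long MAX_SINGLE_PUT_BYTES = 5L * 1024 * 1024 * 1024;
    static final long MIN_PART_BYTES = 5L * 1024 * 1024;
    static final int MAX_PARTS = 10_000;

    // True when the merged file is too large for a single PUT.
    static boolean needsMultipart(long fileSize) {
        return fileSize > MAX_SINGLE_PUT_BYTES;
    }

    // Number of parts needed to upload fileSize bytes with the given part size.
    static long partCount(long fileSize, long partSize) {
        if (fileSize <= 0) {
            throw new IllegalArgumentException("fileSize must be positive");
        }
        if (partSize < MIN_PART_BYTES) {
            throw new IllegalArgumentException("partSize is below the 5 MB minimum");
        }
        // Ceiling division: the last part may be smaller than partSize.
        long parts = (fileSize + partSize - 1) / partSize;
        if (parts > MAX_PARTS) {
            throw new IllegalArgumentException("more than 10,000 parts; use a larger partSize");
        }
        return parts;
    }

    public static void main(String[] args) {
        long sixGb = 6L * 1024 * 1024 * 1024;
        long partSize = 128L * 1024 * 1024; // assumed part size, for illustration only
        System.out.println(needsMultipart(sixGb));      // prints true
        System.out.println(partCount(sixGb, partSize)); // prints 48
    }
}
```

With a check like this, files at or under the limit could keep using the single upload from step (3), and only larger files would need the multipart support planned for the next iteration.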
