[jira] [Commented] (APEXMALHAR-2019) Creation of S3 Input Module

2016-05-01 Thread ASF GitHub Bot (JIRA)
Parallel read will work only if the scheme is "s3a" and the Hadoop version is 2.7+. --- End diff -- Add a link to the Amazon S3 wiki page to explain the details about Hadoop version and scheme. > Creation of S3 Input Module > --- > >
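
The constraint quoted in this comment (scheme "s3a" and Hadoop 2.7+) could be checked up front along the following lines. This is only a sketch under assumptions: the class `ParallelReadCheck` and the method `supportsParallelRead` are hypothetical names introduced here and are not taken from the pull request, and the version is assumed to be a plain "major.minor[.patch][-suffix]" string.

```java
// Hypothetical helper for illustration; not part of the pull request under review.
final class ParallelReadCheck {

  /**
   * Returns true when the scheme is "s3a" and the Hadoop version is 2.7 or later.
   * Only the major and minor components are compared.
   */
  static boolean supportsParallelRead(String scheme, String hadoopVersion) {
    if (!"s3a".equalsIgnoreCase(scheme)) {
      return false;
    }
    String[] parts = hadoopVersion.split("\\.");
    if (parts.length < 2) {
      return false;
    }
    try {
      int major = Integer.parseInt(parts[0]);
      // Drop any suffix such as "-SNAPSHOT" that follows the minor number.
      int minor = Integer.parseInt(parts[1].replaceAll("\\D.*$", ""));
      return major > 2 || (major == 2 && minor >= 7);
    } catch (NumberFormatException e) {
      // Unparseable version strings are treated as unsupported.
      return false;
    }
  }

  public static void main(String[] args) {
    System.out.println(supportsParallelRead("s3a", "2.7.1"));
    System.out.println(supportsParallelRead("s3n", "2.7.1"));
  }
}
```

In a real operator the version string would presumably come from the Hadoop runtime rather than being passed in by hand; it is a parameter here to keep the sketch self-contained.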

[jira] [Commented] (APEXMALHAR-2019) Creation of S3 Input Module

2016-05-01 Thread ASF GitHub Bot (JIRA)
ublic class S3FileSplitter extends FSFileSplitter +{ + public S3FileSplitter() + { --- End diff -- Not required. > Creation of S3 Input Module > --- > > Key: APEXMALHAR-2019 > URL: https://issues.apache.org/jira/

[jira] [Commented] (APEXMALHAR-2019) Creation of S3 Input Module

2016-05-01 Thread ASF GitHub Bot (JIRA)
IOException --- End diff -- mention the difference from super in javadocs. > Creation of S3 Input Module > --- > > Key: APEXMALHAR-2019 > URL: https://issues.apache.org/jira/browse/APEXMALHAR-2019 >

[jira] [Commented] (APEXMALHAR-2019) Creation of S3 Input Module

2016-05-01 Thread ASF GitHub Bot (JIRA)
bucketUri = fs.getScheme() + "://" + extractBucket(uri); + } + + @VisibleForTesting + protected String extractBucket(String s3uri) --- End diff -- javadoc. > Creation of S3 Input Module > --- > > Key
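
The reviewer asks for Javadoc on `extractBucket`. As a rough illustration of what a documented version might look like, here is a minimal sketch. The body is an assumption, not the code from the pull request: it assumes URIs of the form `scheme://[accessKey:secretKey@]bucket[/path]` and credentials that contain no `/`. The class name `S3UriSketch` is hypothetical.

```java
// Illustrative sketch only; the actual method body in the pull request is not shown in this thread.
final class S3UriSketch {

  /**
   * Extracts the bucket name from an S3 URI.
   *
   * Assumed input form: scheme://[accessKey:secretKey@]bucket[/path].
   * Credentials, if present, are assumed not to contain '/'.
   *
   * @param s3uri the S3 URI, for example "s3a://key:secret@my-bucket/dir/file.txt"
   * @return the bucket name, for example "my-bucket"
   */
  static String extractBucket(String s3uri) {
    int schemeEnd = s3uri.indexOf("://");
    String rest = schemeEnd >= 0 ? s3uri.substring(schemeEnd + 3) : s3uri;
    // The authority part ends at the first '/' (or at the end of the string).
    int slash = rest.indexOf('/');
    String authority = slash >= 0 ? rest.substring(0, slash) : rest;
    // Anything before the last '@' is treated as credentials and dropped.
    int at = authority.lastIndexOf('@');
    return at >= 0 ? authority.substring(at + 1) : authority;
  }

  public static void main(String[] args) {
    String uri = "s3a://key:secret@my-bucket/dir/file.txt";
    // Mirrors the quoted line: scheme + "://" + bucket.
    String bucketUri = "s3a" + "://" + extractBucket(uri);
    System.out.println(bucketUri);
  }
}
```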

[jira] [Commented] (APEXMALHAR-2019) Creation of S3 Input Module

2016-05-01 Thread ASF GitHub Bot (JIRA)
hu14 opened a pull request: https://github.com/apache/incubator-apex-malhar/pull/263 APEXMALHAR-2019 S3-Input Implemented S3 Input Module You can merge this pull request into a Git repository by running: $ git pull https://github.com/chaithu14/incubator-apex-malhar APEXMALHAR-201

Re: S3 Input Module

2016-03-24 Thread Chaitanya Chebolu
> 3. Mix of 1 and 2. Multiple files are read in parallel, and every file in itself is also read in parallel.

Re: S3 Input Module

2016-03-23 Thread Ashwin Chandra Putta
ersions of Hadoop : 2.2.0 or so and a lot better support in 2.7. So, will your module work on all Hadoop versions post 2.2 or only

Re: S3 Input Module

2016-03-21 Thread Chaitanya Chebolu
> One way to support this feature is to copy few S3 related files from Hadoop 2.7 version into the module and will use this in module.

Re: S3 Input Module

2016-03-20 Thread Sandeep Deshmukh
dent of Hadoop version. > @All: > Please share your thoughts on this approach. > Regards, > Chaitanya

Re: S3 Input Module

2016-03-19 Thread Chaitanya Chebolu
> wrote: > +1 > Many people face issues while copy data from S3 at large scale. This module i

Re: S3 Input Module

2016-03-19 Thread Priyanka Gugale
It's a good idea to extract out common code in parent class. +1 for this feature. -Priyanka On Thu, Mar 17, 2016 at 1:57 PM, Chaitanya Chebolu < chaita...@datatorrent.com> wrote: > Dear Community, > > I am proposing S3 Input Module. Primary functionality of this module i

Re: S3 Input Module

2016-03-19 Thread Chaitanya Chebolu
> Regards, > Sandeep > On Thu, Mar 17, 2016 at 2:04 PM, Priyanka Gugale <priya...@datatorrent.com> wrote:

Re: S3 Input Module

2016-03-19 Thread Sandeep Deshmukh
> wrote: > It's a good idea to extract out common code in parent class. > +1 for this feature. > -Priyanka > On Thu, Mar 17,

[jira] [Created] (APEXMALHAR-2019) Creation of S3 Input Module

2016-03-19 Thread Chaitanya (JIRA)
Chaitanya created APEXMALHAR-2019: - Summary: Creation of S3 Input Module Key: APEXMALHAR-2019 URL: https://issues.apache.org/jira/browse/APEXMALHAR-2019 Project: Apache Apex Malhar Issue

S3 Input Module

2016-03-19 Thread Chaitanya Chebolu
Dear Community, I am proposing S3 Input Module. Primary functionality of this module is to parallel read files from S3 bucket. Below is the JIRA created for this task: https://issues.apache.org/jira/browse/APEXMALHAR-2019 Design of this module is similar to HDFS input module. So, I will

Re: S3 Input Module

2016-03-19 Thread Amol Kekre
6 at 1:57 PM, Chaitanya Chebolu <chaita...@datatorrent.com> wrote: > Dear Community, > I am proposing S3 Input Module. Primary functionality of this module is to parallel read files from S3 bucket.

Re: S3 Input Module

2016-03-19 Thread Chaitanya Chebolu
a...@datatorrent.com> wrote: > +1. Very common use case. Nice to have it. > Thks > Amol

Re: S3 Input Module

2016-03-19 Thread Yogi Devendra
readily used with simple configuration. > Regards, > Sandeep > On Thu, Mar 17, 2016 at 2:04 PM, Priyanka Gugale <priya

Re: S3 Input Module

2016-03-19 Thread Thomas Weise
s > Amol > On Thu, Mar 17, 2016 at 1:49 AM, Sandeep Deshmukh <sand...@datatorrent.com> wro

Re: S3 Input Module

2016-03-19 Thread Pradeep Dalvi
> wrote: > It's a good idea to extract out common code in parent class. > +1 for this feature. > -Priyanka > On Thu, Mar 17, 2016 at 1:57 PM, Chaitanya Chebolu <

Re: S3 Input Module

2016-03-19 Thread Sandeep Deshmukh
nt class. > +1 for this feature. > -Priyanka > On Thu, Mar 17, 2016 at 1:57 PM, Chaitanya Chebolu <chaita...@datatorrent.com> wrote: > Dear Community, > I am proposing S3 Input Module. Primary functionality of this module is

Re: S3 Input Module

2016-03-18 Thread Yogi Devendra
> Regards, > Sandeep > On Fri, Mar 18, 2016 at 10:49 AM, Pradeep Dalvi <pradeep.da...@datatorrent.com> wrote:

Re: S3 Input Module

2016-03-18 Thread Thomas Weise
10:49 AM, Pradeep Dalvi <pradeep.da...@datatorrent.com> wrote: > +1 > On Thu, Mar 17, 2016 at 10:56 PM, Amol Kekre <a...@datatorrent.com>