Re: Aligning FileSplitter and BlocReader with hadoop.mapreduce InputFormats

2016-03-24 Thread Shubham Pathak
+1. Will certainly be a good addition. On Fri, Mar 25, 2016 at 9:31 AM, Chandni Singh wrote: > +1 for the idea > On Mar 24, 2016 8:41 PM, "Thomas Weise" wrote: > > > +1 for the idea in general and extending existing implementation. > > > > In case this introduces a MapReduce dependency we will

Re: Aligning FileSplitter and BlocReader with hadoop.mapreduce InputFormats

2016-03-24 Thread Chandni Singh
+1 for the idea On Mar 24, 2016 8:41 PM, "Thomas Weise" wrote: > +1 for the idea in general and extending existing implementation. > > In case this introduces a MapReduce dependency we will also need to > consider a separate module. > > Thomas > > > On Thu, Mar 24, 2016 at 2:35 AM, Devendra Tagar

Re: Aligning FileSplitter and BlocReader with hadoop.mapreduce InputFormats

2016-03-24 Thread Thomas Weise
+1 for the idea in general and extending existing implementation. In case this introduces a MapReduce dependency we will also need to consider a separate module. Thomas On Thu, Mar 24, 2016 at 2:35 AM, Devendra Tagare wrote: > Hi, > > We are thinking of extending the FileSplitter and BlockRea

Re: Aligning FileSplitter and BlocReader with hadoop.mapreduce InputFormats

2016-03-24 Thread Devendra Tagare
Hi, We are thinking of extending the FileSplitter and BlockReader. Changing the existing code could have side effects. Thanks, Dev On Mar 24, 2016 1:16 AM, "Tushar Gosavi" wrote: > My suggestion is to extend from FileSplitter and BlockReader without > changing them, and add support for InputFo

Re: Aligning FileSplitter and BlocReader with hadoop.mapreduce InputFormats

2016-03-24 Thread Tushar Gosavi
My suggestion is to extend from FileSplitter and BlockReader without changing them, and add support for InputFormat in derived classes. FileSplitter and BlockReader already provide enough hooks to define splits and read records. - Tushar. On Thu, Mar 24, 2016 at 11:17 AM, Yogi Devendra wrote:
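The "extend without changing" suggestion above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the actual Malhar or Hadoop API: `BaseSplitter`, `Block`, `SplitSource`, and `FormatAwareSplitter` are hypothetical stand-ins invented here, with `SplitSource` playing the role an InputFormat might play in supplying split boundaries.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SplitterSketch {

    /** Hypothetical stand-in for one block of a file: [offset, offset + length). */
    public static final class Block {
        public final String path;
        public final long offset;
        public final long length;

        public Block(String path, long offset, long length) {
            this.path = path;
            this.offset = offset;
            this.length = length;
        }
    }

    /** Hypothetical stand-in for the existing splitter; it is left unchanged. */
    public static class BaseSplitter {
        protected final long blockSize;

        public BaseSplitter(long blockSize) {
            if (blockSize <= 0) {
                throw new IllegalArgumentException("blockSize must be positive");
            }
            this.blockSize = blockSize;
        }

        /** Hook: the default behaviour cuts the file into fixed-size blocks. */
        public List<Block> computeBlocks(String path, long fileLength) {
            List<Block> blocks = new ArrayList<>();
            for (long offset = 0; offset < fileLength; offset += blockSize) {
                blocks.add(new Block(path, offset, Math.min(blockSize, fileLength - offset)));
            }
            return blocks;
        }
    }

    /** Hypothetical stand-in for an InputFormat-like source of split boundaries. */
    public interface SplitSource {
        /** Returns {offset, length} pairs describing format-aware splits. */
        List<long[]> getSplits(String path, long fileLength);
    }

    /** Derived class: overrides the hook and delegates boundaries to the source. */
    public static class FormatAwareSplitter extends BaseSplitter {
        private final SplitSource source;

        public FormatAwareSplitter(long blockSize, SplitSource source) {
            super(blockSize);
            this.source = source;
        }

        @Override
        public List<Block> computeBlocks(String path, long fileLength) {
            List<Block> blocks = new ArrayList<>();
            for (long[] split : source.getSplits(path, fileLength)) {
                blocks.add(new Block(path, split[0], split[1]));
            }
            return blocks;
        }
    }

    public static void main(String[] args) {
        // Assumed example source: one split for the first 6 bytes, one for the rest.
        SplitSource twoSplits =
            (path, len) -> Arrays.asList(new long[] {0, 6}, new long[] {6, len - 6});

        BaseSplitter base = new BaseSplitter(4);
        BaseSplitter aware = new FormatAwareSplitter(4, twoSplits);

        System.out.println("fixed-size blocks: " + base.computeBlocks("f", 10).size());
        System.out.println("format-aware blocks: " + aware.computeBlocks("f", 10).size());
    }
}
```

Callers that hold a `BaseSplitter` reference keep working unchanged, which is the point of deriving rather than modifying: the side effects Dev mentions stay confined to the new subclass.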

Re: Aligning FileSplitter and BlocReader with hadoop.mapreduce InputFormats

2016-03-23 Thread Yogi Devendra
Aligning FileSplitter and BlockReader with their respective counterparts from mapreduce will be an excellent value addition. IMO, it has 2 advantages: 1. It will allow us to plug in more formats for FileSplitter+BlockReader pattern use-cases. 2. It will be easy for end-users coming from a mapreduce background

Re: Aligning FileSplitter and BlocReader with hadoop.mapreduce InputFormats

2016-03-23 Thread Priyanka Gugale
So, as I understand it, the splitter would be format-aware. In that case, would we still need the different kinds of parsers we have right now? Or will the format-aware splitter take care of parsing different file formats, e.g. CSV? -Priyanka On Wed, Mar 23, 2016 at 11:41 PM, Devendra Tagare wrote: > Hi All, > >

Aligning FileSplitter and BlocReader with hadoop.mapreduce InputFormats

2016-03-23 Thread Devendra Tagare
Hi All, Initiating this thread to get the community's opinion on aligning the FileSplitter with InputSplit & the BlockReader with the RecordReader from org.apache.hadoop.mapreduce.InputSplit & org.apache.hadoop.mapreduce.RecordReader respectively. Some more details and rationale on the approach,