Re: Merge HadoopInputFormatIO and HDFSIO in a single module

2017-03-02 Thread Jean-Baptiste Onofré
i_dkulka...@persistent.com> wrote: Thank you all for your inputs! -Original Message- From: Dan Halperin [mailto:dhalp...@google.com.INVALID] Sent: Friday, February 17, 2017 12:17 PM To: dev@beam.apache.org Subject: Re: Merge HadoopInputFormatIO and HDFSIO in a single module Raghu, Amit

Re: Merge HadoopInputFormatIO and HDFSIO in a single module

2017-03-01 Thread Stephen Sisk
9:38 AM, Dipti Kulkarni < dipti_dkulka...@persistent.com> wrote: > Thank you all for your inputs! > > > -Original Message- > From: Dan Halperin [mailto:dhalp...@google.com.INVALID] > Sent: Friday, February 17, 2017 12:17 PM > To: dev@beam.apache.org > Subject:

RE: Merge HadoopInputFormatIO and HDFSIO in a single module

2017-02-17 Thread Dipti Kulkarni
Thank you all for your inputs! -Original Message- From: Dan Halperin [mailto:dhalp...@google.com.INVALID] Sent: Friday, February 17, 2017 12:17 PM To: dev@beam.apache.org Subject: Re: Merge HadoopInputFormatIO and HDFSIO in a single module Raghu, Amit -- +1 to your expertise

Re: Merge HadoopInputFormatIO and HDFSIO in a single module

2017-02-16 Thread Dan Halperin
Raghu, Amit -- +1 to your expertise :) On Thu, Feb 16, 2017 at 3:39 PM, Amit Sela wrote: > I agree with Dan on everything regarding HdfsFileSystem - it's super > convenient for users to use TextIO with HdfsFileSystem rather than > replacing the IO and also specifying the

Re: Merge HadoopInputFormatIO and HDFSIO in a single module

2017-02-15 Thread Raghu Angadi
I skimmed through HdfsIO and I think it is essentially HadoopInputFormatIO with FileInputFormat. I would pretty much move most of the code to HadoopInputFormatIO (just make HdfsIO a specific instance of HIF_IO). On Wed, Feb 15, 2017 at 9:15 AM, Dipti Kulkarni < dipti_dkulka...@persistent.com>

Merge HadoopInputFormatIO and HDFSIO in a single module

2017-02-15 Thread Dipti Kulkarni
Hello there! I am working on writing a Read IO for Hadoop InputFormat. This will enable reading from any data source that supports Hadoop InputFormat, i.e. provides source to read from InputFormat for integration with Hadoop. It makes sense for the HadoopInputFormatIO to share some code with the