Thomas and I talked. We both agree that it is time to get a white paper
going. Being indexed by Google clearly beats "find . | grep ..." in this
day and age.

The white paper would walk through HDFS, FTP, NFS, and S3, with data on each,
and maybe even include example apps (these could be app properties).

So any volunteers?

Thks
Amol


On Thu, May 5, 2016 at 5:10 PM, Thomas Weise <[email protected]> wrote:

> Do we have other projects that create dummy classes for every possible
> mounted file system just so that the user knows that's possible? The
> capability that matters here from the app's perspective is the local file
> system, and every developer in the Hadoop ecosystem should understand that.
>
> If the operator doesn't have anything specific to NFS then there is no
> place for it in the library (it would be confusing, not helpful).
>
> There should be a different approach for pre-configured operators that
> doesn't involve writing Java code.
>
> Thomas
>
>
>
> On Thu, May 5, 2016 at 3:10 PM, Amol Kekre <[email protected]> wrote:
>
> > I am not suggesting duplicating code; extend the operators. Just add
> > something (may not even be a function) that can be viewed as specific to a
> > particular source. Say for NFS, it may be as simple as changing a default.
> > A file with NFS in its name helps a great deal with adoption.
> >
> > Thks
> > Amol
> >
> >
> > On Thu, May 5, 2016 at 11:45 AM, Chandni Singh <[email protected]>
> > wrote:
> >
> > > IMO this is not a good idea.
> > >
> > > We are proposing to add additional Java code which is generic (works with
> > > HDFS, NFS, local FS) but just calling it something specific - NFS. IMO this
> > > is much more confusing to users.
> > >
> > > If we want to make it easier for users to find out that the FS Module
> > > supports writing to NFS then maybe we need to improve documentation or
> > > highlight it somewhere else.
> > >
> > > Adding Java classes means more maintenance overhead, and here these classes
> > > are not doing anything additional.
> > >
> > > Thanks,
> > > Chandni
> > >
> > >
> > >
> > >
> > >
> > > On Thu, May 5, 2016 at 11:24 AM, Mohit Jotwani <[email protected]>
> > > wrote:
> > >
> > > > +1 on Sandeep's suggestion. This would make an end user's life a lot
> > > > easier!
> > > >
> > > > Regards,
> > > > Mohit
> > > >
> > > > On Thu, May 5, 2016 at 11:51 PM, Sandeep Deshmukh <[email protected]>
> > > > wrote:
> > > >
> > > > > I do agree with Amol on having clear and explicit modules. This is more
> > > > > from an end user perspective. For someone who is new to Apex, having
> > > > > separate NFS, HDFS, FTP, etc. modules would make a lot more sense than one
> > > > > generic FS module. However small the changes in these modules may be, even
> > > > > just a couple of small functions, I would like to keep them separate for
> > > > > the end user.
> > > > >
> > > > > It is finally about the perspective and the user experience :)
> > > > >
> > > > > Regards,
> > > > > Sandeep
> > > > >
> > > > > On Thu, May 5, 2016 at 8:48 PM, Thomas Weise <[email protected]>
> > > > > wrote:
> > > > >
> > > > > > I don't think we should name something NFS* when it isn't specific to
> > > > > > NFS. It is just like any other local FS for this purpose and that's
> > > > > > already covered by the Hadoop file system abstraction.
> > > > > >
> > > > > > Why can't a single FS Input module accommodate all of this? Once you
> > > > > > know the FS URL, you can automatically optimize the configuration, if
> > > > > > appropriate.
> > > > > >
> > > > > > Thanks,
> > > > > > Thomas
> > > > > >
> > > > > >
> > > > > > On Thu, May 5, 2016 at 12:08 AM, Chaitanya Chebolu <
> > > > > > [email protected]> wrote:
> > > > > >
> > > > > > > Hi Chandni,
> > > > > > >
> > > > > > >   It's a good point. I created the hierarchy based on the user
> > > > > > > perspective, especially for non-Java users. If I return FileSplitter
> > > > > > > and BlockReader from FS Input Module, then this module works for NFS.
> > > > > > > But from the user's perspective, it would be difficult to tell whether
> > > > > > > this module works for NFS or any other file system.
> > > > > > >
> > > > > > > Regards,
> > > > > > > Chaitanya
> > > > > > >
> > > > > > > On Thu, May 5, 2016 at 11:05 AM, Chandni Singh <[email protected]>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > I am sorry Chaitanya, but I have more questions about this.
> > > > > > > >
> > > > > > > > 1. Why is the FS Input Module abstract when by default it can return
> > > > > > > > FileSplitter & BlockReader in com.datatorrent.lib.io.fs?
> > > > > > > > These implementations are not specific to NFS.
> > > > > > > >
> > > > > > > > 2. In the NFS module that you have suggested to create, what is
> > > > > > > > specific to NFS?
> > > > > > > >
> > > > > > > > Please note: I have created a ticket APEXMALHAR-2081 to remove
> > > > > > > > FSFileSplitter from the library and move its feature to the base
> > > > > > > > operator.
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > > Chandni
> > > > > > > >
> > > > > > > > On Wed, May 4, 2016 at 10:29 PM, Chaitanya Chebolu <
> > > > > > > > [email protected]> wrote:
> > > > > > > >
> > > > > > > > > FSFileSplitter & BlockReader are available in the
> > > > > > > > > com.datatorrent.lib.io.fs package.
> > > > > > > > >
> > > > > > > > > On Thu, May 5, 2016 at 10:47 AM, Chandni Singh <[email protected]>
> > > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > Ok. What is specific about the fileSplitter and blockReader
> > > > > > > > > > returned by this implementation?
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > On May 4, 2016 9:43 PM, "Chaitanya Chebolu" <[email protected]>
> > > > > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > Hi Chandni,
> > > > > > > > > > >
> > > > > > > > > > > Properties wise nothing specific. FS Input Module is an abstract
> > > > > > > > > > > Module and NFS Module implements the abstract methods -
> > > > > > > > > > > createFileSplitter() and createBlockReader().
> > > > > > > > > > >
> > > > > > > > > > > Regards,
> > > > > > > > > > > Chaitanya
> > > > > > > > > > >
> > > > > > > > > > > On Wed, May 4, 2016 at 9:45 PM, Chandni Singh <[email protected]>
> > > > > > > > > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Hi Chaitanya,
> > > > > > > > > > > >
> > > > > > > > > > > > What will be specific in NFS Input Module that is not provided by
> > > > > > > > > > > > FS Input Module?
> > > > > > > > > > > >
> > > > > > > > > > > > Thanks,
> > > > > > > > > > > > Chandni
> > > > > > > > > > > >
> > > > > > > > > > > > On Wed, May 4, 2016 at 7:12 AM, Amol Kekre <[email protected]>
> > > > > > > > > > > > wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > +1
> > > > > > > > > > > > >
> > > > > > > > > > > > > Thks
> > > > > > > > > > > > > Amol
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Tue, May 3, 2016 at 10:06 PM, Sandeep Deshmukh <[email protected]>
> > > > > > > > > > > > > wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > +1
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Regards,
> > > > > > > > > > > > > > Sandeep
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On Fri, Apr 29, 2016 at 3:26 PM, Mohit Jotwani <[email protected]>
> > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > +1
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Regards,
> > > > > > > > > > > > > > > Mohit
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > On Fri, Apr 29, 2016 at 2:09 PM, Chaitanya Chebolu <[email protected]>
> > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Hi All,
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >   I am proposing an NFS Input Module. The use case is to read large
> > > > > > > > > > > > > > > > files from NFS in parallel.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >  Design of NFS input module:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >    There is a common interface "FSInputModule" in Malhar for the
> > > > > > > > > > > > > > > > input Modules. NFS input Module extends from FSInputModule and can
> > > > > > > > > > > > > > > > be achieved by using FSFileSplitter and BlockReader operators.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >   Please share your thoughts on this.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Regards,
> > > > > > > > > > > > > > > > Chaitanya
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
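
Thomas's point in the thread above, that a single FS input module could pick
its configuration once the FS URL is known, can be illustrated with a minimal
sketch. Everything in it is hypothetical: the class FsUrlDefaults, its methods,
and the block-size values are assumptions made for illustration and are not
part of Apex Malhar or Hadoop. It relies only on java.net.URI from the JDK. An
NFS mount is addressed as an ordinary local path (scheme "file"), so it falls
into the local branch without a class of its own.

```java
import java.net.URI;
import java.util.Locale;

// Hypothetical helper for illustration only; not part of Apex Malhar.
final class FsUrlDefaults {
    // Assumed, untuned values chosen only to make the two branches differ.
    static final long LOCAL_BLOCK_SIZE = 64L * 1024 * 1024;
    static final long REMOTE_BLOCK_SIZE = 128L * 1024 * 1024;

    private FsUrlDefaults() {}

    // Returns the lower-cased URI scheme. A bare path such as /mnt/nfs/data
    // has no scheme and is treated as the local file system ("file").
    static String schemeOf(String url) {
        String scheme = URI.create(url).getScheme();
        return scheme == null ? "file" : scheme.toLowerCase(Locale.ROOT);
    }

    // Picks a default block size from the scheme alone. An NFS mount is
    // addressed as a local path, so it takes the "file" branch.
    static long defaultBlockSize(String url) {
        return "file".equals(schemeOf(url)) ? LOCAL_BLOCK_SIZE : REMOTE_BLOCK_SIZE;
    }

    public static void main(String[] args) {
        String[] urls = {
            "/mnt/nfs/data", "file:///mnt/nfs/data",
            "hdfs://namenode:8020/data", "ftp://host/dir"
        };
        for (String url : urls) {
            System.out.println(url + " -> " + schemeOf(url) + ", " + defaultBlockSize(url));
        }
    }
}
```

In a real module the branch would more likely set several properties at once;
the single block-size value here only stands in for that idea.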
