@Thomas, @Amol: I would like to contribute to and collaborate on this.

I will create a ticket for it.

Thanks,
Dev

On Sat, May 7, 2016 at 11:04 AM, Thomas Weise <[email protected]>
wrote:

> The documentation is here and is indexed:
>
> http://apex.apache.org/docs/malhar/
>
> I think this is a matter of enhancing it.
>
>
> On Sat, May 7, 2016 at 9:18 AM, Amol Kekre <[email protected]> wrote:
>
> > Thomas and I talked. Both of us agree that it is time to get a white paper
> > going. A Google index clearly beats "find . | grep ..." in this day and
> > age.
> >
> > The white paper would walk through and have data on HDFS, FTP, NFS, S3,
> > maybe even example apps (could be app properties) accompanying this.
> >
> > So any volunteers?
> >
> > Thks
> > Amol
> >
> >
> > On Thu, May 5, 2016 at 5:10 PM, Thomas Weise <[email protected]>
> > wrote:
> >
> > > Do we have other projects that create dummy classes for every possible
> > > mounted file system just so that the user knows that's possible? The
> > > capability that matters here from the app perspective is the local file
> > > system, and every developer in the Hadoop ecosystem should understand that.
> > >
> > > If the operator doesn't have anything specific to NFS then there is no
> > > place for it in the library (it would be confusing, not helpful).
> > >
> > > There should be a different approach for pre-configured operators that
> > > doesn't involve writing Java code.
> > >
> > > Thomas
> > >
> > >
> > >
> > > On Thu, May 5, 2016 at 3:10 PM, Amol Kekre <[email protected]> wrote:
> > >
> > > > I am not suggesting duplicating code; extend the operators. Just add
> > > > something (may not even be a function) that can be viewed as specific
> > > > to a particular source. Say for NFS, it may be as simple as changing a
> > > > default. A file with NFS in its name helps a great deal with adoption.
> > > >
> > > > Thks
> > > > Amol
> > > >
> > > >
> > > > On Thu, May 5, 2016 at 11:45 AM, Chandni Singh <[email protected]>
> > > > wrote:
> > > >
> > > > > IMO this is not a good idea.
> > > > >
> > > > > We are proposing to add additional Java code which is generic (works
> > > > > with HDFS, NFS, local FS) but just calling it something specific - NFS.
> > > > > IMO this is much more confusing to users.
> > > > >
> > > > > If we want to make it easier for users to find out that the FS Module
> > > > > supports writing to NFS then maybe we need to improve documentation
> > > > > or highlight it somewhere else.
> > > > >
> > > > > Adding Java classes means more maintenance overhead, and here these
> > > > > classes are not doing anything additional.
> > > > >
> > > > > Thanks,
> > > > > Chandni
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > On Thu, May 5, 2016 at 11:24 AM, Mohit Jotwani <[email protected]>
> > > > > wrote:
> > > > >
> > > > > > +1 on Sandeep's suggestion. This would make an end user's life a
> > > > > > lot easier!
> > > > > >
> > > > > > Regards,
> > > > > > Mohit
> > > > > >
> > > > > > On Thu, May 5, 2016 at 11:51 PM, Sandeep Deshmukh <[email protected]>
> > > > > > wrote:
> > > > > >
> > > > > > > I do agree with Amol on having clear and explicit modules. This
> > > > > > > is more from an end-user perspective. For someone who is new to
> > > > > > > Apex, having separate NFS, HDFS, FTP, etc. modules would make a
> > > > > > > lot more sense than one generic FS module. However small the
> > > > > > > changes in these modules may be, like just a couple of small
> > > > > > > functions, I would like to have them separate for the end user.
> > > > > > >
> > > > > > > It is finally about the perspective and the user experience :)
> > > > > > >
> > > > > > > Regards,
> > > > > > > Sandeep
> > > > > > >
> > > > > > > On Thu, May 5, 2016 at 8:48 PM, Thomas Weise <[email protected]>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > I don't think we should name something NFS* when it isn't
> > > > > > > > specific to NFS. It is just like any other local FS for this
> > > > > > > > purpose and that's already covered by the Hadoop file system
> > > > > > > > abstraction.
> > > > > > > >
> > > > > > > > Why can't a single FS Input module accommodate all of this?
> > > > > > > > Once you know the FS URL, you can automatically optimize the
> > > > > > > > configuration, if appropriate.
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > > Thomas
> > > > > > > >
> > > > > > > >
> > > > > > > > On Thu, May 5, 2016 at 12:08 AM, Chaitanya Chebolu <
> > > > > > > > [email protected]> wrote:
> > > > > > > >
> > > > > > > > > Hi Chandni,
> > > > > > > > >
> > > > > > > > >   It's a good point. I created the hierarchy based on the user
> > > > > > > > > perspective, especially for non-Java users. If I return
> > > > > > > > > FileSplitter and BlockReader from FS Input Module, then this
> > > > > > > > > module works for NFS. But from a user's perspective, it would
> > > > > > > > > be difficult to tell whether this module works for NFS or any
> > > > > > > > > other file system.
> > > > > > > > >
> > > > > > > > > Regards,
> > > > > > > > > Chaitanya
> > > > > > > > >
> > > > > > > > > On Thu, May 5, 2016 at 11:05 AM, Chandni Singh <[email protected]>
> > > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > I am sorry, Chaitanya, but I have more questions about this.
> > > > > > > > > >
> > > > > > > > > > 1. Why is the FS Input Module abstract when by default it can
> > > > > > > > > > return FileSplitter & BlockReader in com.datatorrent.lib.io.fs?
> > > > > > > > > > These implementations are not specific to NFS.
> > > > > > > > > >
> > > > > > > > > > 2. In the NFS module that you have suggested to create, what is
> > > > > > > > > > specific to NFS?
> > > > > > > > > >
> > > > > > > > > > Please note: I have created a ticket, APEXMALHAR-2081, to
> > > > > > > > > > remove FSFileSplitter from the library and move its feature to
> > > > > > > > > > the base operator.
> > > > > > > > > >
> > > > > > > > > > Thanks,
> > > > > > > > > > Chandni
> > > > > > > > > >
> > > > > > > > > > On Wed, May 4, 2016 at 10:29 PM, Chaitanya Chebolu <
> > > > > > > > > > [email protected]> wrote:
> > > > > > > > > >
> > > > > > > > > > > FSFileSplitter & BlockReader are available in the
> > > > > > > > > > > com.datatorrent.lib.io.fs package.
> > > > > > > > > > >
> > > > > > > > > > > On Thu, May 5, 2016 at 10:47 AM, Chandni Singh <[email protected]>
> > > > > > > > > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > > > OK. What is specific about the fileSplitter and blockReader
> > > > > > > > > > > > returned by this implementation?
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > On May 4, 2016 9:43 PM, "Chaitanya Chebolu" <[email protected]>
> > > > > > > > > > > > wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > Hi Chandni,
> > > > > > > > > > > > >
> > > > > > > > > > > > > Properties-wise, nothing specific. FS Input Module is an
> > > > > > > > > > > > > abstract Module, and NFS Module implements the abstract
> > > > > > > > > > > > > methods: createFileSplitter() and createBlockReader().
> > > > > > > > > > > > >
> > > > > > > > > > > > > Regards,
> > > > > > > > > > > > > Chaitanya
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Wed, May 4, 2016 at 9:45 PM, Chandni Singh <[email protected]>
> > > > > > > > > > > > > wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > Hi Chaitanya,
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > What will be specific in NFS Input Module that is not
> > > > > > > > > > > > > > provided by FS Input Module?
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > Chandni
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On Wed, May 4, 2016 at 7:12 AM, Amol Kekre <[email protected]>
> > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > +1
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Thks
> > > > > > > > > > > > > > > Amol
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > On Tue, May 3, 2016 at 10:06 PM, Sandeep Deshmukh <[email protected]>
> > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > +1
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Regards,
> > > > > > > > > > > > > > > > Sandeep
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > On Fri, Apr 29, 2016 at 3:26 PM, Mohit Jotwani <[email protected]>
> > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > +1
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Regards,
> > > > > > > > > > > > > > > > > Mohit
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > On Fri, Apr 29, 2016 at 2:09 PM, Chaitanya Chebolu <[email protected]>
> > > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Hi All,
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >   I am proposing an NFS Input Module. The use case is to read
> > > > > > > > > > > > > > > > > > large files from NFS in parallel.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >  Design of NFS input module:
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >    There is a common interface "FSInputModule" in Malhar for
> > > > > > > > > > > > > > > > > > the input Modules. The NFS Input Module extends FSInputModule
> > > > > > > > > > > > > > > > > > and can be implemented using the FSFileSplitter and
> > > > > > > > > > > > > > > > > > BlockReader operators.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >   Please share your thoughts on this.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Regards,
> > > > > > > > > > > > > > > > > > Chaitanya
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
