Hi,

I see HDFSFileCopyModule and HDFSFileMerger in the library as well. Since
we are so close to the release and I am not sure if these classes are just
specific to HDFS, I am going to mark them Evolving so that we can address
this afterwards and change the names if that is suitable.
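
As an illustration only, marking a class Evolving could look like the sketch
below. The nested Evolving annotation is a hypothetical stand-in defined just
for this example (Hadoop ships a similar marker,
org.apache.hadoop.classification.InterfaceStability.Evolving), and the two
class bodies are empty placeholders, not the real library code.

```java
import java.lang.annotation.Documented;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

public class EvolvingSketch {

  // Hypothetical stand-in for a stability marker; defined locally so the
  // example compiles without any Hadoop or Malhar dependency.
  @Documented
  @Retention(RetentionPolicy.RUNTIME)
  @Target(ElementType.TYPE)
  @interface Evolving {}

  // Placeholder classes: the names mirror the ones mentioned above,
  // the bodies are intentionally empty.
  @Evolving
  static class HDFSFileCopyModule {}

  @Evolving
  static class HDFSFileMerger {}

  // Returns true when the given type carries the Evolving marker.
  static boolean isEvolving(Class<?> type) {
    return type.isAnnotationPresent(Evolving.class);
  }

  public static void main(String[] args) {
    System.out.println(isEvolving(HDFSFileCopyModule.class));
    System.out.println(isEvolving(HDFSFileMerger.class));
    System.out.println(isEvolving(String.class));
  }
}
```

The marker changes no behavior; it only signals that the class names and API
may still change after the release.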

Thanks,
Chandni

On Sat, May 7, 2016 at 2:17 PM, Chandni Singh <[email protected]>
wrote:

> I can help Dev.
>
> Thanks,
> Chandni
>
> On Sat, May 7, 2016 at 1:23 PM, Amol Kekre <[email protected]> wrote:
>
>> We do have docs on apache.org. Would love to see a very extensive and
>> deep doc on this topic.
>>
>> Should we add "How to ..." sections?
>>
>> @Dev, thanks for volunteering. Any more volunteers?
>>
>> Thks,
>> Amol
>>
>>
>> On Sat, May 7, 2016 at 12:20 PM, Devendra Tagare <[email protected]>
>> wrote:
>>
>> > @Thomas,@Amol I would like to contribute/collaborate on this.
>> >
>> > Will create a ticket for the same.
>> >
>> > Thanks,
>> > Dev
>> >
>> > On Sat, May 7, 2016 at 11:04 AM, Thomas Weise <[email protected]>
>> > wrote:
>> >
>> > > The documentation is here and is indexed:
>> > >
>> > > http://apex.apache.org/docs/malhar/
>> > >
>> > > I think this is a matter of enhancing it.
>> > >
>> > >
>> > > On Sat, May 7, 2016 at 9:18 AM, Amol Kekre <[email protected]> wrote:
>> > >
>> > > > Thomas and I talked. Both of us agree that a white paper is due to get
>> > > > going. Google index clearly beats "find . | grep ..." in this day and
>> > > > age.
>> > > >
>> > > > The white paper would walk through and have data on HDFS, FTP, NFS, S3,
>> > > > maybe even example apps (could be app properties) accompanying this.
>> > > >
>> > > > So any volunteers?
>> > > >
>> > > > Thks
>> > > > Amol
>> > > >
>> > > >
>> > > > On Thu, May 5, 2016 at 5:10 PM, Thomas Weise <[email protected]>
>> > > > wrote:
>> > > >
>> > > > > Do we have other projects that create dummy classes for every possible
>> > > > > mounted file system just so that the user knows that's possible? The
>> > > > > capability that matters here from an app perspective is the local file
>> > > > > system, and every developer in the Hadoop ecosystem should understand that.
>> > > > >
>> > > > > If the operator doesn't have anything specific to NFS then there is no
>> > > > > place for it in the library (it would be confusing, not helpful).
>> > > > >
>> > > > > There should be a different approach for pre-configured operators that
>> > > > > doesn't involve writing Java code.
>> > > > >
>> > > > > Thomas
>> > > > >
>> > > > >
>> > > > >
>> > > > > On Thu, May 5, 2016 at 3:10 PM, Amol Kekre <[email protected]> wrote:
>> > > > >
>> > > > > > I am not suggesting duplicating code; extend the operators. Just add
>> > > > > > something (may not even be a function) that can be viewed as specific to a
>> > > > > > particular source. Say for NFS, it may be as simple as changing a default.
>> > > > > > A file with NFS in its name helps a great deal with adoption.
>> > > > > >
>> > > > > > Thks
>> > > > > > Amol
>> > > > > >
>> > > > > >
>> > > > > > On Thu, May 5, 2016 at 11:45 AM, Chandni Singh <[email protected]>
>> > > > > > wrote:
>> > > > > >
>> > > > > > > IMO this is not a good idea.
>> > > > > > >
>> > > > > > > We are proposing to add additional Java code which is generic (works with
>> > > > > > > HDFS, NFS, local FS) but just calling it something specific - NFS. IMO this
>> > > > > > > is much more confusing to users.
>> > > > > > >
>> > > > > > > If we want to make it easier for users to find out that the FS Module
>> > > > > > > supports writing to NFS then maybe we need to improve documentation or
>> > > > > > > highlight it somewhere else.
>> > > > > > >
>> > > > > > > Adding Java classes means more maintenance overhead and here these classes
>> > > > > > > are not doing anything additional.
>> > > > > > >
>> > > > > > > Thanks,
>> > > > > > > Chandni
>> > > > > > >
>> > > > > > >
>> > > > > > >
>> > > > > > >
>> > > > > > >
>> > > > > > > On Thu, May 5, 2016 at 11:24 AM, Mohit Jotwani <[email protected]>
>> > > > > > > wrote:
>> > > > > > >
>> > > > > > > > +1 on Sandeep's suggestion. This would make an end user's life a lot
>> > > > > > > > easier!
>> > > > > > > >
>> > > > > > > > Regards,
>> > > > > > > > Mohit
>> > > > > > > >
>> > > > > > > > On Thu, May 5, 2016 at 11:51 PM, Sandeep Deshmukh <[email protected]>
>> > > > > > > > wrote:
>> > > > > > > >
>> > > > > > > > > I do agree with Amol on having clear and explicit modules. This is more
>> > > > > > > > > from an end user perspective. For someone who is new to Apex, having
>> > > > > > > > > separate NFS, HDFS, FTP, etc. modules would make a lot more sense than one
>> > > > > > > > > generic FS module. However small the changes in these modules may be, even
>> > > > > > > > > just a couple of small functions, I would like to have them separate for
>> > > > > > > > > the end user.
>> > > > > > > > >
>> > > > > > > > > It is finally about the perspective and the user experience :)
>> > > > > > > > >
>> > > > > > > > > Regards,
>> > > > > > > > > Sandeep
>> > > > > > > > >
>> > > > > > > > > On Thu, May 5, 2016 at 8:48 PM, Thomas Weise <[email protected]>
>> > > > > > > > > wrote:
>> > > > > > > > >
>> > > > > > > > > > I don't think we should name something NFS* when it isn't specific to NFS.
>> > > > > > > > > > It is just like any other local FS for this purpose and that's already
>> > > > > > > > > > covered by the Hadoop file system abstraction.
>> > > > > > > > > >
>> > > > > > > > > > Why can't a single FS Input module accommodate all of this? Once you know
>> > > > > > > > > > the FS URL, you can automatically optimize the configuration, if
>> > > > > > > > > > appropriate.
>> > > > > > > > > >
>> > > > > > > > > > Thanks,
>> > > > > > > > > > Thomas
>> > > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > > > On Thu, May 5, 2016 at 12:08 AM, Chaitanya Chebolu <
>> > > > > > > > > > [email protected]> wrote:
>> > > > > > > > > >
>> > > > > > > > > > > Hi Chandni,
>> > > > > > > > > > >
>> > > > > > > > > > >   It's a good point. I created the hierarchy based on the user perspective,
>> > > > > > > > > > > especially for non-Java users. If I return FileSplitter and BlockReader
>> > > > > > > > > > > from FS Input Module, then this module works for NFS. But from a user's
>> > > > > > > > > > > perspective it would be difficult to tell whether this module works for
>> > > > > > > > > > > NFS or any other file system.
>> > > > > > > > > > >
>> > > > > > > > > > > Regards,
>> > > > > > > > > > > Chaitanya
>> > > > > > > > > > >
>> > > > > > > > > > > On Thu, May 5, 2016 at 11:05 AM, Chandni Singh <[email protected]>
>> > > > > > > > > > > wrote:
>> > > > > > > > > > >
>> > > > > > > > > > > > I am sorry Chaitanya but I have more questions about this:
>> > > > > > > > > > > >
>> > > > > > > > > > > > 1. Why is the FS Input Module abstract when by default it can return
>> > > > > > > > > > > > FileSplitter & BlockReader in com.datatorrent.lib.io.fs?
>> > > > > > > > > > > >  These implementations are not specific to NFS.
>> > > > > > > > > > > >
>> > > > > > > > > > > > 2. In the NFS module that you have suggested to create, what is specific to
>> > > > > > > > > > > > NFS?
>> > > > > > > > > > > >
>> > > > > > > > > > > > Please note: I have created a ticket APEXMALHAR-2081 to remove
>> > > > > > > > > > > > FSFileSplitter from the library and move its feature to the base operator.
>> > > > > > > > > > > >
>> > > > > > > > > > > > Thanks,
>> > > > > > > > > > > > Chandni
>> > > > > > > > > > > >
>> > > > > > > > > > > > On Wed, May 4, 2016 at 10:29 PM, Chaitanya Chebolu <
>> > > > > > > > > > > > [email protected]> wrote:
>> > > > > > > > > > > >
>> > > > > > > > > > > > > FSFileSplitter & BlockReader are available in the com.datatorrent.lib.io.fs
>> > > > > > > > > > > > > package.
>> > > > > > > > > > > > >
>> > > > > > > > > > > > > On Thu, May 5, 2016 at 10:47 AM, Chandni Singh <[email protected]>
>> > > > > > > > > > > > > wrote:
>> > > > > > > > > > > > >
>> > > > > > > > > > > > > > Ok. What is specific about the fileSplitter and blockReader returned by
>> > > > > > > > > > > > > > this implementation?
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > > On May 4, 2016 9:43 PM, "Chaitanya Chebolu" <[email protected]>
>> > > > > > > > > > > > > > wrote:
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > Hi Chandni,
>> > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > Properties wise nothing specific. FS Input Module is an abstract Module and
>> > > > > > > > > > > > > > > NFS Module implements the abstract methods - createFileSplitter() and
>> > > > > > > > > > > > > > > createBlockReader().
>> > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > Regards,
>> > > > > > > > > > > > > > > Chaitanya
>> > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > On Wed, May 4, 2016 at 9:45 PM, Chandni Singh <[email protected]>
>> > > > > > > > > > > > > > > wrote:
>> > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > Hi Chaitanya,
>> > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > What will be specific in NFS Input Module that is not provided by FS Input
>> > > > > > > > > > > > > > > > Module?
>> > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > Thanks,
>> > > > > > > > > > > > > > > > Chandni
>> > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > On Wed, May 4, 2016 at 7:12 AM, Amol Kekre <[email protected]> wrote:
>> > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > +1
>> > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > Thks
>> > > > > > > > > > > > > > > > > Amol
>> > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > On Tue, May 3, 2016 at 10:06 PM, Sandeep Deshmukh <[email protected]>
>> > > > > > > > > > > > > > > > > wrote:
>> > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > +1
>> > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > Regards,
>> > > > > > > > > > > > > > > > > > Sandeep
>> > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > On Fri, Apr 29, 2016 at 3:26 PM, Mohit Jotwani <[email protected]>
>> > > > > > > > > > > > > > > > > > wrote:
>> > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > +1
>> > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > Regards,
>> > > > > > > > > > > > > > > > > > > Mohit
>> > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > On Fri, Apr 29, 2016 at 2:09 PM, Chaitanya Chebolu <[email protected]> wrote:
>> > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > Hi All,
>> > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > >   I am proposing an NFS Input Module. The use case is to read large files
>> > > > > > > > > > > > > > > > > > > > from NFS in parallel.
>> > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > >  Design of NFS input module:
>> > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > >    There is a common interface "FSInputModule" in Malhar for the input
>> > > > > > > > > > > > > > > > > > > > Modules. NFS input Module extends from FSInputModule and can be achieved by
>> > > > > > > > > > > > > > > > > > > > using FSFileSplitter and BlockReader operators.
>> > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > >   Please share your thoughts on this.
>> > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > Regards,
>> > > > > > > > > > > > > > > > > > > > Chaitanya
