Re: A proposal for Malhar

2016-08-11 Thread Lakshmi Velineni
Thomas, thanks for the suggestions and the comments in the document. I will
take another look at the operators that I had shortlisted in the document to
keep. Within that subset, would it be OK to leave the ones that don't have
a large state problem in place for the time being, until we have replacement
operators implemented with the new windowing and state management? After
the cleanup, I can also help with the development of those replacement
operators.

Thanks

On Tue, Aug 9, 2016 at 11:21 AM, Thomas Weise 
wrote:

> There are a bunch of operators that don't have proper state management and
> also don't support generic windowing (event time etc.). I would suggest to
> move those out or deprecate them.
>
> The new windowing and state management support along with the appropriate
> aggregators is going to make them obsolete.
>
> Thomas
>
>
> On Tue, Aug 9, 2016 at 10:47 AM, Lakshmi Velineni wrote:
>
>> Hi,
>>
>> Friendly Reminder :
>>
>> I created a shared Google Sheet and tracked the various details of
>> operators. The sheet contains information about operators under lib/algo,
>> lib/math & lib/streamquery. Link is
>> https://docs.google.com/a/datatorrent.com/spreadsheets/d/15IjMa-vYK6Wru4kZnUGIgg6sQAptDJ_CaWpXt3GDccM/edit?usp=sharing
>> I also added a recommendation for each operator. Please take a look and
>> provide comments, if any.
>>
>> Thanks
>> Lakshmi Prasanna
>>
>> On Tue, Aug 9, 2016 at 10:07 AM, Pramod Immaneni 
>> wrote:
>>
>>> Added comments, also recommend having the misc folder for the remaining
>>> operators in contrib according to proposed guidelines
>>>
>>> https://github.com/apache/apex-site/pull/44
>>>
>>> On Mon, Aug 1, 2016 at 12:22 AM, Lakshmi Velineni <
>>> laks...@datatorrent.com>
>>> wrote:
>>>
>>> > Hi
>>> >
>>> > I also added recommendation for lib/math operators to the same
>>> > document as a separate sheet. Please have a look.
>>> >
>>> > Thanks
>>> > Lakshmi Prasanna
>>> >
>>> > On Fri, Jul 29, 2016 at 7:19 PM, Lakshmi Velineni <
>>> laks...@datatorrent.com
>>> > > wrote:
>>> >
>>> >> Hi,
>>> >>
>>> >> I also added recommendation for each operator . Please take a look.
>>> >>
>>> >> thanks
>>> >>
>>> >> On Thu, Jul 28, 2016 at 11:03 AM, Lakshmi Velineni <
>>> >> laks...@datatorrent.com> wrote:
>>> >>
>>> >>> Hi,
>>> >>>
>>> >>> I created a shared google sheet and tracked the various details of
>>> >>> operators. Currently, the sheet contains information about operators
>>> >>> under lib/algo only. Link is
>>> >>> https://docs.google.com/a/datatorrent.com/spreadsheets/d/15IjMa-vYK6Wru4kZnUGIgg6sQAptDJ_CaWpXt3GDccM/edit?usp=sharing
>>> >>> Will update the sheet soon with lib/math too.
>>> >>>
>>> >>> Thanks
>>> >>> Lakshmi Prasanna
>>> >>>
>>> >>> On Tue, Jul 12, 2016 at 2:36 PM, David Yan 
>>> >>> wrote:
>>> >>>
>>> >>>> Hi Lakshmi,
>>> >>>>
>>> >>>> Thanks for volunteering.
>>> >>>>
>>> >>>> I think Pramod's suggestion of putting the operators into 3 buckets and
>>> >>>> Siyuan's suggestion of starting a shared Google Sheet that tracks
>>> >>>> individual operators are both good, with the exception that
>>> >>>> lib/streamquery is one unit and we probably do not need to look at
>>> >>>> individual operators under it.
>>> >>>>
>>> >>>> If we don't have any objection in the community, let's start the
>>> >>>> process.
>>> >>>>
>>> >>>> David
>>> >>>>
>>> >>>> On Tue, Jul 12, 2016 at 2:24 PM, Lakshmi Velineni <
>>> >>>> laks...@datatorrent.com> wrote:
>>> >>>>
>>> > I am interested to work on this.
>>> >
>>> > Regards,
>>> > Lakshmi prasanna
>>> >
>>> > On Tue, Jul 12, 2016 at 1:55 PM, hsy...@gmail.com <
>>> hsy...@gmail.com>
>>> > wrote:
>>> >
>>> > > Why not have a shared google sheet with a list of operators and
>>> > options
>>> > > that we want to do with it.
>>> > > I think it's case by case.
>>> > > But retire unused or obsolete operators is important and we
>>> should
>>> > do it
>>> > > sooner rather than later.
>>> > >
>>> > > Regards,
>>> > > Siyuan
>>> > >
>>> > > On Tue, Jul 12, 2016 at 1:09 PM, Amol Kekre <
>>> a...@datatorrent.com>
>>> > wrote:
>>> > >
>>> > >>
>>> > >> My vote is to do 2&3
>>> > >>
>>> > >> Thks
>>> > >> Amol
>>> > >>
>>> > >>
>>> > >> On Tue, Jul 12, 2016 at 12:14 PM, Kottapalli, Venkatesh <
>>> > >> vkottapa...@directv.com> wrote:
>>> > >>
>>> > >>> +1 for deprecating the packages listed below.
>>> > >>>
>>> > >>> -Original Message-
>>> > >>> From: hsy...@gmail.com [mailto:hsy...@gmail.com]
>>> > >>> Sent: Tuesday, July 12, 2016 12:01 PM
>>> > >>>
>>> > >>> +1
>>> > >>>
>>> > >>> On Tue, Jul 12, 2016 at 11:53 AM, David Yan <
>>> da...@datatorrent.com
>>> > >
>>> > >>> wrote:
>>> > >>>
>>> > >>> > Hi all,
>>> > >>> 

Re: A proposal for Malhar

2016-08-09 Thread Thomas Weise
There are a bunch of operators that don't have proper state management and
also don't support generic windowing (event time etc.). I would suggest
moving those out or deprecating them.

The new windowing and state management support along with the appropriate
aggregators is going to make them obsolete.

Thomas


On Tue, Aug 9, 2016 at 10:47 AM, Lakshmi Velineni 
wrote:

> Hi,
>
> Friendly Reminder :
>
> I created a shared Google Sheet and tracked the various details of
> operators. The sheet contains information about operators under lib/algo,
> lib/math & lib/streamquery. Link is
> https://docs.google.com/a/datatorrent.com/spreadsheets/d/15IjMa-vYK6Wru4kZnUGIgg6sQAptDJ_CaWpXt3GDccM/edit?usp=sharing
> I also added a recommendation for each operator. Please take a look and
> provide comments, if any.
>
> Thanks
> Lakshmi Prasanna
>
> On Tue, Aug 9, 2016 at 10:07 AM, Pramod Immaneni 
> wrote:
>
>> Added comments, also recommend having the misc folder for the remaining
>> operators in contrib according to proposed guidelines
>>
>> https://github.com/apache/apex-site/pull/44
>>
>> On Mon, Aug 1, 2016 at 12:22 AM, Lakshmi Velineni <
>> laks...@datatorrent.com>
>> wrote:
>>
>> > Hi
>> >
>> > I also added recommendation for lib/math operators to the same document
>> > as a separate sheet. Please have a look.
>> >
>> > Thanks
>> > Lakshmi Prasanna
>> >
>> > On Fri, Jul 29, 2016 at 7:19 PM, Lakshmi Velineni <
>> laks...@datatorrent.com
>> > > wrote:
>> >
>> >> Hi,
>> >>
>> >> I also added recommendation for each operator . Please take a look.
>> >>
>> >> thanks
>> >>
>> >> On Thu, Jul 28, 2016 at 11:03 AM, Lakshmi Velineni <
>> >> laks...@datatorrent.com> wrote:
>> >>
>> >>> Hi,
>> >>>
>> >>> I created a shared google sheet and tracked the various details of
>> >>> operators. Currently, the sheet contains information about operators
>> >>> under lib/algo only. Link is
>> >>> https://docs.google.com/a/datatorrent.com/spreadsheets/d/15IjMa-vYK6Wru4kZnUGIgg6sQAptDJ_CaWpXt3GDccM/edit?usp=sharing
>> >>> Will update the sheet soon with lib/math too.
>> >>>
>> >>> Thanks
>> >>> Lakshmi Prasanna
>> >>>
>> >>> On Tue, Jul 12, 2016 at 2:36 PM, David Yan 
>> >>> wrote:
>> >>>
>> >>>> Hi Lakshmi,
>> >>>>
>> >>>> Thanks for volunteering.
>> >>>>
>> >>>> I think Pramod's suggestion of putting the operators into 3 buckets and
>> >>>> Siyuan's suggestion of starting a shared Google Sheet that tracks
>> >>>> individual operators are both good, with the exception that
>> >>>> lib/streamquery is one unit and we probably do not need to look at
>> >>>> individual operators under it.
>> >>>>
>> >>>> If we don't have any objection in the community, let's start the
>> >>>> process.
>> >>>>
>> >>>> David
>> >>>>
>> >>>> On Tue, Jul 12, 2016 at 2:24 PM, Lakshmi Velineni <
>> >>>> laks...@datatorrent.com> wrote:
>> >>>>
>> > I am interested to work on this.
>> >
>> > Regards,
>> > Lakshmi prasanna
>> >
>> > On Tue, Jul 12, 2016 at 1:55 PM, hsy...@gmail.com
>> > wrote:
>> >
>> > > Why not have a shared google sheet with a list of operators and
>> > options
>> > > that we want to do with it.
>> > > I think it's case by case.
>> > > But retire unused or obsolete operators is important and we should
>> > do it
>> > > sooner rather than later.
>> > >
>> > > Regards,
>> > > Siyuan
>> > >
>> > > On Tue, Jul 12, 2016 at 1:09 PM, Amol Kekre
>> > wrote:
>> > >
>> > >>
>> > >> My vote is to do 2&3
>> > >>
>> > >> Thks
>> > >> Amol
>> > >>
>> > >>
>> > >> On Tue, Jul 12, 2016 at 12:14 PM, Kottapalli, Venkatesh <
>> > >> vkottapa...@directv.com> wrote:
>> > >>
>> > >>> +1 for deprecating the packages listed below.
>> > >>>
>> > >>> -Original Message-
>> > >>> From: hsy...@gmail.com [mailto:hsy...@gmail.com]
>> > >>> Sent: Tuesday, July 12, 2016 12:01 PM
>> > >>>
>> > >>> +1
>> > >>>
>> > >>> On Tue, Jul 12, 2016 at 11:53 AM, David Yan <
>> da...@datatorrent.com
>> > >
>> > >>> wrote:
>> > >>>
>> > >>> > Hi all,
>> > >>> >
>> > >>> > I would like to renew the discussion of retiring operators in
>> > Malhar.
>> > >>> >
>> > >>> > As stated before, the reason why we would like to retire
>> > operators in
>> > >>> > Malhar is because some of them were written a long time ago
>> > before
>> > >>> > Apache incubation, and they do not pertain to real use cases,
>> > are not
>> > >>> > up to par in code quality, have no potential for improvement,
>> and
>> > >>> > probably completely unused by anybody.
>> > >>> >
>> > >>> > We do not want contributors to use them as a model of their
>> > >>> > contribution, or users to use them thinking they are of
>> quality,
>> 

Re: A proposal for Malhar

2016-08-09 Thread Lakshmi Velineni
Hi,

Friendly reminder:

I created a shared Google Sheet and tracked the various details of
operators. The sheet contains information about operators under lib/algo,
lib/math & lib/streamquery. The link is
https://docs.google.com/a/datatorrent.com/spreadsheets/d/15IjMa-vYK6Wru4kZnUGIgg6sQAptDJ_CaWpXt3GDccM/edit?usp=sharing
I also added a recommendation for each operator. Please take a look and
provide comments, if any.

Thanks
Lakshmi Prasanna

On Tue, Aug 9, 2016 at 10:07 AM, Pramod Immaneni 
wrote:

> Added comments, also recommend having the misc folder for the remaining
> operators in contrib according to proposed guidelines
>
> https://github.com/apache/apex-site/pull/44
>
> On Mon, Aug 1, 2016 at 12:22 AM, Lakshmi Velineni
> wrote:
>
> > Hi
> >
> > I also added recommendation for lib/math operators to the same document
> > as a separate sheet. Please have a look.
> >
> > Thanks
> > Lakshmi Prasanna
> >
> > On Fri, Jul 29, 2016 at 7:19 PM, Lakshmi Velineni <
> laks...@datatorrent.com
> > > wrote:
> >
> >> Hi,
> >>
> >> I also added recommendation for each operator . Please take a look.
> >>
> >> thanks
> >>
> >> On Thu, Jul 28, 2016 at 11:03 AM, Lakshmi Velineni <
> >> laks...@datatorrent.com> wrote:
> >>
> >>> Hi,
> >>>
> >>> I created a shared google sheet and tracked the various details of
> >>> operators. Currently, the sheet contains information about operators
> >>> under lib/algo only. Link is
> >>> https://docs.google.com/a/datatorrent.com/spreadsheets/d/15IjMa-vYK6Wru4kZnUGIgg6sQAptDJ_CaWpXt3GDccM/edit?usp=sharing
> >>> Will update the sheet soon with lib/math too.
> >>>
> >>> Thanks
> >>> Lakshmi Prasanna
> >>>
> >>> On Tue, Jul 12, 2016 at 2:36 PM, David Yan 
> >>> wrote:
> >>>
> >>>> Hi Lakshmi,
> >>>>
> >>>> Thanks for volunteering.
> >>>>
> >>>> I think Pramod's suggestion of putting the operators into 3 buckets and
> >>>> Siyuan's suggestion of starting a shared Google Sheet that tracks
> >>>> individual operators are both good, with the exception that
> >>>> lib/streamquery is one unit and we probably do not need to look at
> >>>> individual operators under it.
> >>>>
> >>>> If we don't have any objection in the community, let's start the
> >>>> process.
> >>>>
> >>>> David
> >>>>
> >>>> On Tue, Jul 12, 2016 at 2:24 PM, Lakshmi Velineni <
> >>>> laks...@datatorrent.com> wrote:
> >>>>
> > I am interested to work on this.
> >
> > Regards,
> > Lakshmi prasanna
> >
> > On Tue, Jul 12, 2016 at 1:55 PM, hsy...@gmail.com 
> > wrote:
> >
> > > Why not have a shared google sheet with a list of operators and
> > options
> > > that we want to do with it.
> > > I think it's case by case.
> > > But retire unused or obsolete operators is important and we should
> > do it
> > > sooner rather than later.
> > >
> > > Regards,
> > > Siyuan
> > >
> > > On Tue, Jul 12, 2016 at 1:09 PM, Amol Kekre 
> > wrote:
> > >
> > >>
> > >> My vote is to do 2&3
> > >>
> > >> Thks
> > >> Amol
> > >>
> > >>
> > >> On Tue, Jul 12, 2016 at 12:14 PM, Kottapalli, Venkatesh <
> > >> vkottapa...@directv.com> wrote:
> > >>
> > >>> +1 for deprecating the packages listed below.
> > >>>
> > >>> -Original Message-
> > >>> From: hsy...@gmail.com [mailto:hsy...@gmail.com]
> > >>> Sent: Tuesday, July 12, 2016 12:01 PM
> > >>>
> > >>> +1
> > >>>
> > >>> On Tue, Jul 12, 2016 at 11:53 AM, David Yan <
> da...@datatorrent.com
> > >
> > >>> wrote:
> > >>>
> > >>> > Hi all,
> > >>> >
> > >>> > I would like to renew the discussion of retiring operators in
> > Malhar.
> > >>> >
> > >>> > As stated before, the reason why we would like to retire
> > operators in
> > >>> > Malhar is because some of them were written a long time ago
> > before
> > >>> > Apache incubation, and they do not pertain to real use cases,
> > are not
> > >>> > up to par in code quality, have no potential for improvement,
> and
> > >>> > probably completely unused by anybody.
> > >>> >
> > >>> > We do not want contributors to use them as a model of their
> > >>> > contribution, or users to use them thinking they are of
> quality,
> > and
> > >>> then hit a wall.
> > >>> > Both scenarios are not beneficial to the reputation of Apex.
> > >>> >
> > >>> > The initial 3 packages that we would like to target are
> > *lib/algo*,
> > >>> > *lib/math*, and *lib/streamquery*.
> > >>>
> > >>> >
> > >>> > I'm adding this thread to the users list. Please speak up if
> you
> > are
> > >>> > using any operator in these 3 packages. We would like to hear
> > from you.
> > >>> >
> > >>> > These are the options I can think of for retiring those
> > 

Re: A proposal for Malhar

2016-08-09 Thread Pramod Immaneni
Added comments. I also recommend having the misc folder for the remaining
operators in contrib, according to the proposed guidelines:

https://github.com/apache/apex-site/pull/44

On Mon, Aug 1, 2016 at 12:22 AM, Lakshmi Velineni 
wrote:

> Hi
>
> I also added recommendation for lib/math operators to the same document as
> a separate sheet. Please have a look.
>
> Thanks
> Lakshmi Prasanna
>
> On Fri, Jul 29, 2016 at 7:19 PM, Lakshmi Velineni wrote:
>
>> Hi,
>>
>> I also added recommendation for each operator . Please take a look.
>>
>> thanks
>>
>> On Thu, Jul 28, 2016 at 11:03 AM, Lakshmi Velineni <
>> laks...@datatorrent.com> wrote:
>>
>>> Hi,
>>>
>>> I created a shared google sheet and tracked the various details of
>>> operators. Currently, the sheet contains information about operators under
>>> lib/algo only. Link is
>>> https://docs.google.com/a/datatorrent.com/spreadsheets/d/15IjMa-vYK6Wru4kZnUGIgg6sQAptDJ_CaWpXt3GDccM/edit?usp=sharing
>>> Will update the sheet soon with lib/math too.
>>>
>>> Thanks
>>> Lakshmi Prasanna
>>>
>>> On Tue, Jul 12, 2016 at 2:36 PM, David Yan 
>>> wrote:
>>>
>>>> Hi Lakshmi,
>>>>
>>>> Thanks for volunteering.
>>>>
>>>> I think Pramod's suggestion of putting the operators into 3 buckets and
>>>> Siyuan's suggestion of starting a shared Google Sheet that tracks
>>>> individual operators are both good, with the exception that lib/streamquery
>>>> is one unit and we probably do not need to look at individual operators
>>>> under it.
>>>>
>>>> If we don't have any objection in the community, let's start the
>>>> process.
>>>>
>>>> David
>>>>
>>>> On Tue, Jul 12, 2016 at 2:24 PM, Lakshmi Velineni <
>>>> laks...@datatorrent.com> wrote:
>>>>
> I am interested to work on this.
>
> Regards,
> Lakshmi prasanna
>
> On Tue, Jul 12, 2016 at 1:55 PM, hsy...@gmail.com 
> wrote:
>
> > Why not have a shared google sheet with a list of operators and
> options
> > that we want to do with it.
> > I think it's case by case.
> > But retire unused or obsolete operators is important and we should
> do it
> > sooner rather than later.
> >
> > Regards,
> > Siyuan
> >
> > On Tue, Jul 12, 2016 at 1:09 PM, Amol Kekre 
> wrote:
> >
> >>
> >> My vote is to do 2&3
> >>
> >> Thks
> >> Amol
> >>
> >>
> >> On Tue, Jul 12, 2016 at 12:14 PM, Kottapalli, Venkatesh <
> >> vkottapa...@directv.com> wrote:
> >>
> >>> +1 for deprecating the packages listed below.
> >>>
> >>> -Original Message-
> >>> From: hsy...@gmail.com [mailto:hsy...@gmail.com]
> >>> Sent: Tuesday, July 12, 2016 12:01 PM
> >>>
> >>> +1
> >>>
> >>> On Tue, Jul 12, 2016 at 11:53 AM, David Yan
> >>> wrote:
> >>>
> >>> > Hi all,
> >>> >
> >>> > I would like to renew the discussion of retiring operators in
> Malhar.
> >>> >
> >>> > As stated before, the reason why we would like to retire
> operators in
> >>> > Malhar is because some of them were written a long time ago
> before
> >>> > Apache incubation, and they do not pertain to real use cases,
> are not
> >>> > up to par in code quality, have no potential for improvement, and
> >>> > probably completely unused by anybody.
> >>> >
> >>> > We do not want contributors to use them as a model of their
> >>> > contribution, or users to use them thinking they are of quality,
> and
> >>> then hit a wall.
> >>> > Both scenarios are not beneficial to the reputation of Apex.
> >>> >
> >>> > The initial 3 packages that we would like to target are
> *lib/algo*,
> >>> > *lib/math*, and *lib/streamquery*.
> >>>
> >>> >
> >>> > I'm adding this thread to the users list. Please speak up if you
> are
> >>> > using any operator in these 3 packages. We would like to hear
> from you.
> >>> >
> >>> > These are the options I can think of for retiring those
> operators:
> >>> >
> >>> > 1) Completely remove them from the malhar repository.
> >>> > 2) Move them from malhar-library into a separate artifact called
> >>> > malhar-misc
> >>> > 3) Mark them deprecated and add to their javadoc that they are no
> >>> > longer supported
> >>> >
> >>> > Note that 2 and 3 are not mutually exclusive. Any thoughts?
> >>> >
> >>> > David
> >>> >
> >>> > On Tue, Jun 7, 2016 at 2:27 PM, Pramod Immaneni
> >>> > 
> >>> > wrote:
> >>> >
> >>> >> I wanted to close the loop on this discussion. In general
> everyone
> >>> >> seemed to be favorable to this idea with no serious objections.
> Folks
> >>> >> had good suggestions like documenting capabilities of
> operators, come
> >>> >> 

Re: A proposal for Malhar

2016-07-29 Thread Lakshmi Velineni
Hi,

I also added a recommendation for each operator. Please take a look.

thanks

On Thu, Jul 28, 2016 at 11:03 AM, Lakshmi Velineni 
wrote:

> Hi,
>
> I created a shared google sheet and tracked the various details of
> operators. Currently, the sheet contains information about operators under
> lib/algo only. Link is
> https://docs.google.com/a/datatorrent.com/spreadsheets/d/15IjMa-vYK6Wru4kZnUGIgg6sQAptDJ_CaWpXt3GDccM/edit?usp=sharing
> Will update the sheet soon with lib/math too.
>
> Thanks
> Lakshmi Prasanna
>
> On Tue, Jul 12, 2016 at 2:36 PM, David Yan  wrote:
>
>> Hi Lakshmi,
>>
>> Thanks for volunteering.
>>
>> I think Pramod's suggestion of putting the operators into 3 buckets and
>> Siyuan's suggestion of starting a shared Google Sheet that tracks
>> individual operators are both good, with the exception that lib/streamquery
>> is one unit and we probably do not need to look at individual operators
>> under it.
>>
>> If we don't have any objection in the community, let's start the process.
>>
>> David
>>
>> On Tue, Jul 12, 2016 at 2:24 PM, Lakshmi Velineni <
>> laks...@datatorrent.com> wrote:
>>
>>> I am interested to work on this.
>>>
>>> Regards,
>>> Lakshmi prasanna
>>>
>>> On Tue, Jul 12, 2016 at 1:55 PM, hsy...@gmail.com 
>>> wrote:
>>>
>>> > Why not have a shared google sheet with a list of operators and options
>>> > that we want to do with it.
>>> > I think it's case by case.
>>> > But retire unused or obsolete operators is important and we should do
>>> it
>>> > sooner rather than later.
>>> >
>>> > Regards,
>>> > Siyuan
>>> >
>>> > On Tue, Jul 12, 2016 at 1:09 PM, Amol Kekre 
>>> wrote:
>>> >
>>> >>
>>> >> My vote is to do 2&3
>>> >>
>>> >> Thks
>>> >> Amol
>>> >>
>>> >>
>>> >> On Tue, Jul 12, 2016 at 12:14 PM, Kottapalli, Venkatesh <
>>> >> vkottapa...@directv.com> wrote:
>>> >>
>>> >>> +1 for deprecating the packages listed below.
>>> >>>
>>> >>> -Original Message-
>>> >>> From: hsy...@gmail.com [mailto:hsy...@gmail.com]
>>> >>> Sent: Tuesday, July 12, 2016 12:01 PM
>>> >>>
>>> >>> +1
>>> >>>
>>> >>> On Tue, Jul 12, 2016 at 11:53 AM, David Yan 
>>> >>> wrote:
>>> >>>
>>> >>> > Hi all,
>>> >>> >
>>> >>> > I would like to renew the discussion of retiring operators in
>>> Malhar.
>>> >>> >
>>> >>> > As stated before, the reason why we would like to retire operators
>>> in
>>> >>> > Malhar is because some of them were written a long time ago before
>>> >>> > Apache incubation, and they do not pertain to real use cases, are
>>> not
>>> >>> > up to par in code quality, have no potential for improvement, and
>>> >>> > probably completely unused by anybody.
>>> >>> >
>>> >>> > We do not want contributors to use them as a model of their
>>> >>> > contribution, or users to use them thinking they are of quality,
>>> and
>>> >>> then hit a wall.
>>> >>> > Both scenarios are not beneficial to the reputation of Apex.
>>> >>> >
>>> >>> > The initial 3 packages that we would like to target are *lib/algo*,
>>> >>> > *lib/math*, and *lib/streamquery*.
>>> >>>
>>> >>> >
>>> >>> > I'm adding this thread to the users list. Please speak up if you
>>> are
>>> >>> > using any operator in these 3 packages. We would like to hear from
>>> you.
>>> >>> >
>>> >>> > These are the options I can think of for retiring those operators:
>>> >>> >
>>> >>> > 1) Completely remove them from the malhar repository.
>>> >>> > 2) Move them from malhar-library into a separate artifact called
>>> >>> > malhar-misc
>>> >>> > 3) Mark them deprecated and add to their javadoc that they are no
>>> >>> > longer supported
>>> >>> >
>>> >>> > Note that 2 and 3 are not mutually exclusive. Any thoughts?
>>> >>> >
>>> >>> > David
>>> >>> >
>>> >>> > On Tue, Jun 7, 2016 at 2:27 PM, Pramod Immaneni
>>> >>> > 
>>> >>> > wrote:
>>> >>> >
>>> >>> >> I wanted to close the loop on this discussion. In general everyone
>>> >>> >> seemed to be favorable to this idea with no serious objections.
>>> Folks
>>> >>> >> had good suggestions like documenting capabilities of operators,
>>> come
>>> >>> >> up well defined criteria for graduation of operators and what
>>> those
>>> >>> >> criteria may be and what to do with existing operators that may
>>> not
>>> >>> >> yet be mature or unused.
>>> >>> >>
>>> >>> >> I am going to summarize the key points that resulted from the
>>> >>> >> discussion and would like to proceed with them.
>>> >>> >>
>>> >>> >>- Operators that do not yet provide the key platform
>>> capabilities
>>> >>> to
>>> >>> >>make an operator useful across different applications such as
>>> >>> >> reusability,
>>> >>> >>partitioning static or dynamic, idempotency, exactly once will
>>> >>> still be
>>> >>> >>accepted as long as they are functionally correct, have unit
>>> tests
>>> >>> >> and will
>>> >>> >>go into a separate module.
>>> >>> >>- Contrib module was 

Re: A proposal for Malhar

2016-07-26 Thread Pramod Immaneni
A document with the Malhar contribution guidelines has been prepared and
submitted in a pull request:

https://github.com/apache/apex-site/pull/44

Thanks

On Tue, Jun 7, 2016 at 2:27 PM, Pramod Immaneni 
wrote:

> I wanted to close the loop on this discussion. In general everyone seemed
> to be favorable to this idea with no serious objections. Folks had good
> suggestions like documenting capabilities of operators, coming up with
> well-defined criteria for graduation of operators and what those criteria
> may be, and what to do with existing operators that may not yet be mature
> or may be unused.
>
> I am going to summarize the key points that resulted from the discussion
> and would like to proceed with them.
>
>    - Operators that do not yet provide the key platform capabilities that
>      make an operator useful across different applications, such as
>      reusability, partitioning (static or dynamic), idempotency, and exactly
>      once, will still be accepted as long as they are functionally correct
>      and have unit tests, and will go into a separate module.
>    - Contrib module was suggested as a place where new contributions go
>      in that don't yet have all the platform capabilities and are not yet
>      mature. If there are no other suggestions we will go with this one.
>    - It was suggested that the operator's documentation list those platform
>      capabilities it currently provides from the list above. I will document
>      a structure for this in the contribution guidelines.
>    - Folks wanted to know what would be the criteria to graduate an
>      operator to the big leagues :). I will kick off a separate thread for it
>      as I think it requires its own discussion and hopefully we can come up
>      with a set of guidelines for it.
>    - David brought up the state of some of the existing operators and their
>      retirement, and the layout of operators in Malhar in general and how it
>      causes problems with development. I will ask him to lead the discussion
>      on that.
>
> Thanks
>
> On Fri, May 27, 2016 at 7:47 PM, David Yan  wrote:
>
>> The two ideas are not conflicting, but rather complementing.
>>
>> On the contrary, putting a new process for people trying to contribute
>> while NOT addressing the old unused subpar operators in the repository is
>> what is conflicting.
>>
>> Keep in mind that when people try to contribute, they always look at the
>> existing operators already in the repository as examples and likely a
>> model
>> for their new operators.
>>
>> David
>>
>>
>> On Fri, May 27, 2016 at 4:05 PM, Amol Kekre  wrote:
>>
>> > Yes there are two conflicting threads now. The original thread was to open
>> > up a way for contributors to submit code in a dir (contrib?) as long as
>> > the license part is taken care of.
>> >
>> > On the thread of removing non-used operators -> How do we know what is
>> > being used?
>> >
>> > Thks,
>> > Amol
>> >
>> >
>> > On Fri, May 27, 2016 at 3:40 PM, Sandesh Hegde
>> > wrote:
>> >
>> > > +1 for removing the not-used operators.
>> > >
>> > > So we are creating a process for operator writers who don't want to
>> > > understand the platform, yet want to contribute? How big is that set?
>> > > If we tell the app-user, here is the code which has not passed the whole
>> > > checklist, will they be ready to use that in production?
>> > >
>> > > This thread has 2 conflicting forces: reduce the operators and make it
>> > > easy to add more operators.
>> > >
>> > >
>> > >
>> > > On Fri, May 27, 2016 at 3:03 PM Pramod Immaneni <
>> pra...@datatorrent.com>
>> > > wrote:
>> > >
>> > > > On Fri, May 27, 2016 at 2:30 PM, Gaurav Gupta <
>> > gaurav.gopi...@gmail.com>
>> > > > wrote:
>> > > >
>> > > > > Pramod,
>> > > > >
>> > > > > By that logic I would say let's put all partitionable operators
>> into
>> > > one
>> > > > > folder, non-partitionable operators in another and so on...
>> > > > >
>> > > >
>> > > > Remember the original goal of making it easier for new members to
>> > > > contribute and managing those contributions to maturity. It is not a
>> > > > functional level separation.
>> > > >
>> > > >
>> > > > > When I look at hadoop code I see these annotations being used at
>> > class
>> > > > > level and not at package/folder level.
>> > > >
>> > > >
>> > > > I had a typo in my email, I meant to say "think of this like a
>> > folder..."
>> > > > as an analogy and not literally.
>> > > >
>> > > > Thanks
>> > > >
>> > > >
>> > > > > Thanks
>> > > > >
>> > > > > On Fri, May 27, 2016 at 2:10 PM, Pramod Immaneni <
>> > > pra...@datatorrent.com
>> > > > >
>> > > > > wrote:
>> > > > >
>> > > > > > On Fri, May 27, 2016 at 1:05 PM, Gaurav Gupta <
>> > > > gaurav.gopi...@gmail.com>
>> > > > > > wrote:
>> > > > > >
>> > > > > > > Can same goal not be achieved by
>> > > > > > > using
>> > org.apache.hadoop.classification.InterfaceStability.Evolving
>> > > /
>> > > > > > > 

Re: A proposal for Malhar

2016-07-12 Thread Chinmay Kolhatkar
+1. This is a really good starting point to clean up Malhar.

On Wed, Jul 13, 2016 at 3:06 AM, David Yan  wrote:

> Hi Lakshmi,
>
> Thanks for volunteering.
>
> I think Pramod's suggestion of putting the operators into 3 buckets and
> Siyuan's suggestion of starting a shared Google Sheet that tracks
> individual operators are both good, with the exception that lib/streamquery
> is one unit and we probably do not need to look at individual operators
> under it.
>
> If we don't have any objection in the community, let's start the process.
>
> David
>
> On Tue, Jul 12, 2016 at 2:24 PM, Lakshmi Velineni wrote:
>
>> I am interested to work on this.
>>
>> Regards,
>> Lakshmi prasanna
>>
>> On Tue, Jul 12, 2016 at 1:55 PM, hsy...@gmail.com 
>> wrote:
>>
>> > Why not have a shared google sheet with a list of operators and options
>> > that we want to do with it.
>> > I think it's case by case.
>> > But retire unused or obsolete operators is important and we should do it
>> > sooner rather than later.
>> >
>> > Regards,
>> > Siyuan
>> >
>> > On Tue, Jul 12, 2016 at 1:09 PM, Amol Kekre 
>> wrote:
>> >
>> >>
>> >> My vote is to do 2&3
>> >>
>> >> Thks
>> >> Amol
>> >>
>> >>
>> >> On Tue, Jul 12, 2016 at 12:14 PM, Kottapalli, Venkatesh <
>> >> vkottapa...@directv.com> wrote:
>> >>
>> >>> +1 for deprecating the packages listed below.
>> >>>
>> >>> -Original Message-
>> >>> From: hsy...@gmail.com [mailto:hsy...@gmail.com]
>> >>> Sent: Tuesday, July 12, 2016 12:01 PM
>> >>>
>> >>> +1
>> >>>
>> >>> On Tue, Jul 12, 2016 at 11:53 AM, David Yan 
>> >>> wrote:
>> >>>
>> >>> > Hi all,
>> >>> >
>> >>> > I would like to renew the discussion of retiring operators in
>> Malhar.
>> >>> >
>> >>> > As stated before, the reason why we would like to retire operators
>> in
>> >>> > Malhar is because some of them were written a long time ago before
>> >>> > Apache incubation, and they do not pertain to real use cases, are
>> not
>> >>> > up to par in code quality, have no potential for improvement, and
>> >>> > probably completely unused by anybody.
>> >>> >
>> >>> > We do not want contributors to use them as a model of their
>> >>> > contribution, or users to use them thinking they are of quality, and
>> >>> then hit a wall.
>> >>> > Both scenarios are not beneficial to the reputation of Apex.
>> >>> >
>> >>> > The initial 3 packages that we would like to target are *lib/algo*,
>> >>> > *lib/math*, and *lib/streamquery*.
>> >>>
>> >>> >
>> >>> > I'm adding this thread to the users list. Please speak up if you are
>> >>> > using any operator in these 3 packages. We would like to hear from
>> you.
>> >>> >
>> >>> > These are the options I can think of for retiring those operators:
>> >>> >
>> >>> > 1) Completely remove them from the malhar repository.
>> >>> > 2) Move them from malhar-library into a separate artifact called
>> >>> > malhar-misc
>> >>> > 3) Mark them deprecated and add to their javadoc that they are no
>> >>> > longer supported
>> >>> >
>> >>> > Note that 2 and 3 are not mutually exclusive. Any thoughts?
>> >>> >
>> >>> > David
>> >>> >
>> >>> > On Tue, Jun 7, 2016 at 2:27 PM, Pramod Immaneni
>> >>> > 
>> >>> > wrote:
>> >>> >
>> >>> >> I wanted to close the loop on this discussion. In general everyone
>> >>> >> seemed to be favorable to this idea with no serious objections.
>> Folks
>> >>> >> had good suggestions like documenting capabilities of operators,
>> come
>> >>> >> up well defined criteria for graduation of operators and what those
>> >>> >> criteria may be and what to do with existing operators that may not
>> >>> >> yet be mature or unused.
>> >>> >>
>> >>> >> I am going to summarize the key points that resulted from the
>> >>> >> discussion and would like to proceed with them.
>> >>> >>
>> >>> >>- Operators that do not yet provide the key platform
>> capabilities
>> >>> to
>> >>> >>make an operator useful across different applications such as
>> >>> >> reusability,
>> >>> >>partitioning static or dynamic, idempotency, exactly once will
>> >>> still be
>> >>> >>accepted as long as they are functionally correct, have unit
>> tests
>> >>> >> and will
>> >>> >>go into a separate module.
>> >>> >>- Contrib module was suggested as a place where new
>> contributions
>> >>> go in
>> >>> >>that don't yet have all the platform capabilities and are not
>> yet
>> >>> >> mature.
>> >>> >>If there are no other suggestions we will go with this one.
>> >>> >>- It was suggested the operators documentation list those
>> platform
>> >>> >>capabilities it currently provides from the list above. I will
>> >>> >> document a
>> >>> >>structure for this in the contribution guidelines.
>> >>> >>- Folks wanted to know what would be the criteria to graduate an
>> >>> >>operator to the big leagues :). I will kick-off a separate
>> thread
>> 

Re: A proposal for Malhar

2016-07-12 Thread David Yan
Hi Lakshmi,

Thanks for volunteering.

I think Pramod's suggestion of putting the operators into 3 buckets and
Siyuan's suggestion of starting a shared Google Sheet that tracks
individual operators are both good, with the exception that lib/streamquery
is one unit and we probably do not need to look at individual operators
under it.

If there are no objections from the community, let's start the process.

David

On Tue, Jul 12, 2016 at 2:24 PM, Lakshmi Velineni 
wrote:

> I am interested to work on this.
>
> Regards,
> Lakshmi prasanna
>
> On Tue, Jul 12, 2016 at 1:55 PM, hsy...@gmail.com 
> wrote:
>
> > Why not have a shared google sheet with a list of operators and options
> > that we want to do with it.
> > I think it's case by case.
> > But retire unused or obsolete operators is important and we should do it
> > sooner rather than later.
> >
> > Regards,
> > Siyuan
> >
> > On Tue, Jul 12, 2016 at 1:09 PM, Amol Kekre 
> wrote:
> >
> >>
> >> My vote is to do 2&3
> >>
> >> Thks
> >> Amol
> >>
> >>
> >> On Tue, Jul 12, 2016 at 12:14 PM, Kottapalli, Venkatesh <
> >> vkottapa...@directv.com> wrote:
> >>
> >>> +1 for deprecating the packages listed below.
> >>>
> >>> -Original Message-
> >>> From: hsy...@gmail.com [mailto:hsy...@gmail.com]
> >>> Sent: Tuesday, July 12, 2016 12:01 PM
> >>>
> >>> +1
> >>>
> >>> On Tue, Jul 12, 2016 at 11:53 AM, David Yan 
> >>> wrote:
> >>>
> >>> > Hi all,
> >>> >
> >>> > I would like to renew the discussion of retiring operators in Malhar.
> >>> >
> >>> > As stated before, the reason why we would like to retire operators in
> >>> > Malhar is because some of them were written a long time ago before
> >>> > Apache incubation, and they do not pertain to real use cases, are not
> >>> > up to par in code quality, have no potential for improvement, and
> >>> > probably completely unused by anybody.
> >>> >
> >>> > We do not want contributors to use them as a model of their
> >>> > contribution, or users to use them thinking they are of quality, and
> >>> then hit a wall.
> >>> > Both scenarios are not beneficial to the reputation of Apex.
> >>> >
> >>> > The initial 3 packages that we would like to target are *lib/algo*,
> >>> > *lib/math*, and *lib/streamquery*.
> >>>
> >>> >
> >>> > I'm adding this thread to the users list. Please speak up if you are
> >>> > using any operator in these 3 packages. We would like to hear from
> you.
> >>> >
> >>> > These are the options I can think of for retiring those operators:
> >>> >
> >>> > 1) Completely remove them from the malhar repository.
> >>> > 2) Move them from malhar-library into a separate artifact called
> >>> > malhar-misc
> >>> > 3) Mark them deprecated and add to their javadoc that they are no
> >>> > longer supported
> >>> >
> >>> > Note that 2 and 3 are not mutually exclusive. Any thoughts?
> >>> >
> >>> > David
> >>> >
> >>> > On Tue, Jun 7, 2016 at 2:27 PM, Pramod Immaneni
> >>> > 
> >>> > wrote:
> >>> >
> >>> >> I wanted to close the loop on this discussion. In general everyone
> >>> >> seemed to be favorable to this idea with no serious objections.
> Folks
> >>> >> had good suggestions like documenting capabilities of operators,
> come
> >>> >> up well defined criteria for graduation of operators and what those
> >>> >> criteria may be and what to do with existing operators that may not
> >>> >> yet be mature or unused.
> >>> >>
> >>> >> I am going to summarize the key points that resulted from the
> >>> >> discussion and would like to proceed with them.
> >>> >>
> >>> >>- Operators that do not yet provide the key platform capabilities
> >>> to
> >>> >>make an operator useful across different applications such as
> >>> >> reusability,
> >>> >>partitioning static or dynamic, idempotency, exactly once will
> >>> still be
> >>> >>accepted as long as they are functionally correct, have unit
> tests
> >>> >> and will
> >>> >>go into a separate module.
> >>> >>- Contrib module was suggested as a place where new contributions
> >>> go in
> >>> >>that don't yet have all the platform capabilities and are not yet
> >>> >> mature.
> >>> >>If there are no other suggestions we will go with this one.
> >>> >>- It was suggested the operators documentation list those
> platform
> >>> >>capabilities it currently provides from the list above. I will
> >>> >> document a
> >>> >>structure for this in the contribution guidelines.
> >>> >>- Folks wanted to know what would be the criteria to graduate an
> >>> >>operator to the big leagues :). I will kick-off a separate thread
> >>> >> for it as
> >>> >>I think it requires its own discussion and hopefully we can come
> >>> >> up with a
> >>> >>set of guidelines for it.
> >>> >>- David brought up state of some of the existing operators and
> >>> their
> >>> >>retirement and the layout of operators in Malhar in 

Re: A proposal for Malhar

2016-07-12 Thread Lakshmi Velineni
I am interested in working on this.

Regards,
Lakshmi prasanna

On Tue, Jul 12, 2016 at 1:55 PM, hsy...@gmail.com  wrote:

> Why not have a shared google sheet with a list of operators and options
> that we want to do with it.
> I think it's case by case.
> But retire unused or obsolete operators is important and we should do it
> sooner rather than later.
>
> Regards,
> Siyuan
>
> On Tue, Jul 12, 2016 at 1:09 PM, Amol Kekre  wrote:
>
>>
>> My vote is to do 2&3
>>
>> Thks
>> Amol
>>
>>
>> On Tue, Jul 12, 2016 at 12:14 PM, Kottapalli, Venkatesh <
>> vkottapa...@directv.com> wrote:
>>
>>> +1 for deprecating the packages listed below.
>>>
>>> -Original Message-
>>> From: hsy...@gmail.com [mailto:hsy...@gmail.com]
>>> Sent: Tuesday, July 12, 2016 12:01 PM
>>>
>>> +1
>>>
>>> On Tue, Jul 12, 2016 at 11:53 AM, David Yan 
>>> wrote:
>>>
>>> > Hi all,
>>> >
>>> > I would like to renew the discussion of retiring operators in Malhar.
>>> >
>>> > As stated before, the reason why we would like to retire operators in
>>> > Malhar is because some of them were written a long time ago before
>>> > Apache incubation, and they do not pertain to real use cases, are not
>>> > up to par in code quality, have no potential for improvement, and
>>> > probably completely unused by anybody.
>>> >
>>> > We do not want contributors to use them as a model of their
>>> > contribution, or users to use them thinking they are of quality, and
>>> then hit a wall.
>>> > Both scenarios are not beneficial to the reputation of Apex.
>>> >
>>> > The initial 3 packages that we would like to target are *lib/algo*,
>>> > *lib/math*, and *lib/streamquery*.
>>>
>>> >
>>> > I'm adding this thread to the users list. Please speak up if you are
>>> > using any operator in these 3 packages. We would like to hear from you.
>>> >
>>> > These are the options I can think of for retiring those operators:
>>> >
>>> > 1) Completely remove them from the malhar repository.
>>> > 2) Move them from malhar-library into a separate artifact called
>>> > malhar-misc
>>> > 3) Mark them deprecated and add to their javadoc that they are no
>>> > longer supported
>>> >
>>> > Note that 2 and 3 are not mutually exclusive. Any thoughts?
>>> >
>>> > David
>>> >
>>> > On Tue, Jun 7, 2016 at 2:27 PM, Pramod Immaneni
>>> > 
>>> > wrote:
>>> >
>>> >> I wanted to close the loop on this discussion. In general everyone
>>> >> seemed to be favorable to this idea with no serious objections. Folks
>>> >> had good suggestions like documenting capabilities of operators, come
>>> >> up well defined criteria for graduation of operators and what those
>>> >> criteria may be and what to do with existing operators that may not
>>> >> yet be mature or unused.
>>> >>
>>> >> I am going to summarize the key points that resulted from the
>>> >> discussion and would like to proceed with them.
>>> >>
>>> >>- Operators that do not yet provide the key platform capabilities
>>> to
>>> >>make an operator useful across different applications such as
>>> >> reusability,
>>> >>partitioning static or dynamic, idempotency, exactly once will
>>> still be
>>> >>accepted as long as they are functionally correct, have unit tests
>>> >> and will
>>> >>go into a separate module.
>>> >>- Contrib module was suggested as a place where new contributions
>>> go in
>>> >>that don't yet have all the platform capabilities and are not yet
>>> >> mature.
>>> >>If there are no other suggestions we will go with this one.
>>> >>- It was suggested the operators documentation list those platform
>>> >>capabilities it currently provides from the list above. I will
>>> >> document a
>>> >>structure for this in the contribution guidelines.
>>> >>- Folks wanted to know what would be the criteria to graduate an
>>> >>operator to the big leagues :). I will kick-off a separate thread
>>> >> for it as
>>> >>I think it requires its own discussion and hopefully we can come
>>> >> up with a
>>> >>set of guidelines for it.
>>> >>- David brought up state of some of the existing operators and
>>> their
>>> >>retirement and the layout of operators in Malhar in general and
>>> how it
>>> >>causes problems with development. I will ask him to lead the
>>> >> discussion on
>>> >>that.
>>> >>
>>> >> Thanks
>>> >>
>>> >> On Fri, May 27, 2016 at 7:47 PM, David Yan 
>>> wrote:
>>> >>
>>> >> > The two ideas are not conflicting, but rather complementing.
>>> >> >
>>> >> > On the contrary, putting a new process for people trying to
>>> >> > contribute while NOT addressing the old unused subpar operators in
>>> >> > the repository
>>> >> is
>>> >> > what is conflicting.
>>> >> >
>>> >> > Keep in mind that when people try to contribute, they always look
>>> >> > at the existing operators already in the repository as examples and
>>> >> > likely a
>>> >> model
>>> >> > 

Re: A proposal for Malhar

2016-07-12 Thread hsy...@gmail.com
Why not have a shared Google Sheet with a list of operators and the action
we want to take for each one?
I think it's case by case.
But retiring unused or obsolete operators is important and we should do it
sooner rather than later.

Regards,
Siyuan

On Tue, Jul 12, 2016 at 1:09 PM, Amol Kekre  wrote:

>
> My vote is to do 2&3
>
> Thks
> Amol
>
>
> On Tue, Jul 12, 2016 at 12:14 PM, Kottapalli, Venkatesh <
> vkottapa...@directv.com> wrote:
>
>> +1 for deprecating the packages listed below.
>>
>> -Original Message-
>> From: hsy...@gmail.com [mailto:hsy...@gmail.com]
>> Sent: Tuesday, July 12, 2016 12:01 PM
>>
>> +1
>>
>> On Tue, Jul 12, 2016 at 11:53 AM, David Yan 
>> wrote:
>>
>> > Hi all,
>> >
>> > I would like to renew the discussion of retiring operators in Malhar.
>> >
>> > As stated before, the reason why we would like to retire operators in
>> > Malhar is because some of them were written a long time ago before
>> > Apache incubation, and they do not pertain to real use cases, are not
>> > up to par in code quality, have no potential for improvement, and
>> > probably completely unused by anybody.
>> >
>> > We do not want contributors to use them as a model of their
>> > contribution, or users to use them thinking they are of quality, and
>> then hit a wall.
>> > Both scenarios are not beneficial to the reputation of Apex.
>> >
>> > The initial 3 packages that we would like to target are *lib/algo*,
>> > *lib/math*, and *lib/streamquery*.
>>
>> >
>> > I'm adding this thread to the users list. Please speak up if you are
>> > using any operator in these 3 packages. We would like to hear from you.
>> >
>> > These are the options I can think of for retiring those operators:
>> >
>> > 1) Completely remove them from the malhar repository.
>> > 2) Move them from malhar-library into a separate artifact called
>> > malhar-misc
>> > 3) Mark them deprecated and add to their javadoc that they are no
>> > longer supported
>> >
>> > Note that 2 and 3 are not mutually exclusive. Any thoughts?
>> >
>> > David
>> >
>> > On Tue, Jun 7, 2016 at 2:27 PM, Pramod Immaneni
>> > 
>> > wrote:
>> >
>> >> I wanted to close the loop on this discussion. In general everyone
>> >> seemed to be favorable to this idea with no serious objections. Folks
>> >> had good suggestions like documenting capabilities of operators, come
>> >> up well defined criteria for graduation of operators and what those
>> >> criteria may be and what to do with existing operators that may not
>> >> yet be mature or unused.
>> >>
>> >> I am going to summarize the key points that resulted from the
>> >> discussion and would like to proceed with them.
>> >>
>> >>- Operators that do not yet provide the key platform capabilities to
>> >>make an operator useful across different applications such as
>> >> reusability,
>> >>partitioning static or dynamic, idempotency, exactly once will
>> still be
>> >>accepted as long as they are functionally correct, have unit tests
>> >> and will
>> >>go into a separate module.
>> >>- Contrib module was suggested as a place where new contributions
>> go in
>> >>that don't yet have all the platform capabilities and are not yet
>> >> mature.
>> >>If there are no other suggestions we will go with this one.
>> >>- It was suggested the operators documentation list those platform
>> >>capabilities it currently provides from the list above. I will
>> >> document a
>> >>structure for this in the contribution guidelines.
>> >>- Folks wanted to know what would be the criteria to graduate an
>> >>operator to the big leagues :). I will kick-off a separate thread
>> >> for it as
>> >>I think it requires its own discussion and hopefully we can come
>> >> up with a
>> >>set of guidelines for it.
>> >>- David brought up state of some of the existing operators and their
>> >>retirement and the layout of operators in Malhar in general and how
>> it
>> >>causes problems with development. I will ask him to lead the
>> >> discussion on
>> >>that.
>> >>
>> >> Thanks
>> >>
>> >> On Fri, May 27, 2016 at 7:47 PM, David Yan 
>> wrote:
>> >>
>> >> > The two ideas are not conflicting, but rather complementing.
>> >> >
>> >> > On the contrary, putting a new process for people trying to
>> >> > contribute while NOT addressing the old unused subpar operators in
>> >> > the repository
>> >> is
>> >> > what is conflicting.
>> >> >
>> >> > Keep in mind that when people try to contribute, they always look
>> >> > at the existing operators already in the repository as examples and
>> >> > likely a
>> >> model
>> >> > for their new operators.
>> >> >
>> >> > David
>> >> >
>> >> >
>> >> > On Fri, May 27, 2016 at 4:05 PM, Amol Kekre 
>> >> wrote:
>> >> >
>> >> > > Yes there are two conflicting threads now. The original thread
>> >> > > was to
>> >> > open
>> >> > > up a way 

Re: A proposal for Malhar

2016-07-12 Thread Pramod Immaneni
I would suggest we go through the operators in those packages individually
and grade them into 3 buckets: those that meet the level we expect from
operators (possibly only a few), those that are potentially useful but need
additional work, and those that we don't think would be useful. The ones in
the first bucket can remain in place, the second set can be moved to misc,
and the third set can be moved to misc and deprecated.

Thanks

On Tue, Jul 12, 2016 at 11:53 AM, David Yan  wrote:

> Hi all,
>
> I would like to renew the discussion of retiring operators in Malhar.
>
> As stated before, the reason we would like to retire operators in Malhar
> is that some of them were written a long time ago, before Apache
> incubation, and they do not pertain to real use cases, are not up to par in
> code quality, have no potential for improvement, and are probably
> completely unused by anybody.
>
> We do not want contributors to use them as a model of their contribution,
> or users to use them thinking they are of quality, and then hit a wall.
> Both scenarios are not beneficial to the reputation of Apex.
>
> The initial 3 packages that we would like to target are *lib/algo*,
> *lib/math*, and *lib/streamquery*.
>
> I'm adding this thread to the users list. Please speak up if you are using
> any operator in these 3 packages. We would like to hear from you.
>
> These are the options I can think of for retiring those operators:
>
> 1) Completely remove them from the malhar repository.
> 2) Move them from malhar-library into a separate artifact called
> malhar-misc
> 3) Mark them deprecated and add to their javadoc that they are no longer
> supported
>
> Note that 2 and 3 are not mutually exclusive. Any thoughts?
>
> David
>
> On Tue, Jun 7, 2016 at 2:27 PM, Pramod Immaneni 
> wrote:
>
>> I wanted to close the loop on this discussion. In general everyone seemed
>> to be favorable to this idea with no serious objections. Folks had good
>> suggestions like documenting capabilities of operators, coming up with
>> well-defined criteria for graduation of operators and what those criteria
>> may be, and what to do with existing operators that may not yet be mature
>> or may be unused.
>>
>> I am going to summarize the key points that resulted from the discussion
>> and would like to proceed with them.
>>
>>    - Operators that do not yet provide the key platform capabilities to
>>      make an operator useful across different applications, such as
>>      reusability, partitioning (static or dynamic), idempotency, and
>>      exactly-once, will still be accepted as long as they are functionally
>>      correct and have unit tests, and will go into a separate module.
>>    - Contrib module was suggested as a place where new contributions go in
>>      that don't yet have all the platform capabilities and are not yet
>>      mature. If there are no other suggestions we will go with this one.
>>    - It was suggested the operator's documentation list those platform
>>      capabilities it currently provides from the list above. I will
>>      document a structure for this in the contribution guidelines.
>>    - Folks wanted to know what would be the criteria to graduate an
>>      operator to the big leagues :). I will kick off a separate thread for
>>      it as I think it requires its own discussion and hopefully we can come
>>      up with a set of guidelines for it.
>>    - David brought up the state of some of the existing operators and their
>>      retirement and the layout of operators in Malhar in general and how it
>>      causes problems with development. I will ask him to lead the
>>      discussion on that.
>>
>> Thanks
>>
>> On Fri, May 27, 2016 at 7:47 PM, David Yan  wrote:
>>
>> > The two ideas are not conflicting, but rather complementing.
>> >
>> > On the contrary, putting a new process for people trying to contribute
>> > while NOT addressing the old unused subpar operators in the repository
>> > is what is conflicting.
>> >
>> > Keep in mind that when people try to contribute, they always look at the
>> > existing operators already in the repository as examples and likely a
>> > model for their new operators.
>> >
>> > David
>> >
>> >
>> > On Fri, May 27, 2016 at 4:05 PM, Amol Kekre wrote:
>> >
>> > > Yes there are two conflicting threads now. The original thread was to
>> > > open up a way for contributors to submit code in a dir (contrib?) as
>> > > long as the license part is taken care of.
>> > >
>> > > On the thread of removing non-used operators -> How do we know what is
>> > > being used?
>> > >
>> > > Thks,
>> > > Amol
>> > >
>> > >
>> > > On Fri, May 27, 2016 at 3:40 PM, Sandesh Hegde <sand...@datatorrent.com>
>> > > wrote:
>> > >
>> > > > +1 for removing the not-used operators.
>> > > >
>> > > > So we are creating a process for operator writers who don't want to
>> > > > understand the platform, yet wants to contribute? 

Re: A proposal for Malhar

2016-07-12 Thread Amol Kekre
My vote is to do 2&3

Thks
Amol


On Tue, Jul 12, 2016 at 12:14 PM, Kottapalli, Venkatesh <
vkottapa...@directv.com> wrote:

> +1 for deprecating the packages listed below.
>
> -Original Message-
> From: hsy...@gmail.com [mailto:hsy...@gmail.com]
> Sent: Tuesday, July 12, 2016 12:01 PM
>
> +1
>
> On Tue, Jul 12, 2016 at 11:53 AM, David Yan  wrote:
>
> > Hi all,
> >
> > I would like to renew the discussion of retiring operators in Malhar.
> >
> > As stated before, the reason why we would like to retire operators in
> > Malhar is because some of them were written a long time ago before
> > Apache incubation, and they do not pertain to real use cases, are not
> > up to par in code quality, have no potential for improvement, and
> > probably completely unused by anybody.
> >
> > We do not want contributors to use them as a model of their
> > contribution, or users to use them thinking they are of quality, and
> then hit a wall.
> > Both scenarios are not beneficial to the reputation of Apex.
> >
> > The initial 3 packages that we would like to target are *lib/algo*,
> > *lib/math*, and *lib/streamquery*.
> >
> > I'm adding this thread to the users list. Please speak up if you are
> > using any operator in these 3 packages. We would like to hear from you.
> >
> > These are the options I can think of for retiring those operators:
> >
> > 1) Completely remove them from the malhar repository.
> > 2) Move them from malhar-library into a separate artifact called
> > malhar-misc
> > 3) Mark them deprecated and add to their javadoc that they are no
> > longer supported
> >
> > Note that 2 and 3 are not mutually exclusive. Any thoughts?
> >
> > David
> >
> > On Tue, Jun 7, 2016 at 2:27 PM, Pramod Immaneni
> > 
> > wrote:
> >
> >> I wanted to close the loop on this discussion. In general everyone
> >> seemed to be favorable to this idea with no serious objections. Folks
> >> had good suggestions like documenting capabilities of operators, come
> >> up well defined criteria for graduation of operators and what those
> >> criteria may be and what to do with existing operators that may not
> >> yet be mature or unused.
> >>
> >> I am going to summarize the key points that resulted from the
> >> discussion and would like to proceed with them.
> >>
> >>- Operators that do not yet provide the key platform capabilities to
> >>make an operator useful across different applications such as
> >> reusability,
> >>partitioning static or dynamic, idempotency, exactly once will still
> be
> >>accepted as long as they are functionally correct, have unit tests
> >> and will
> >>go into a separate module.
> >>- Contrib module was suggested as a place where new contributions go
> in
> >>that don't yet have all the platform capabilities and are not yet
> >> mature.
> >>If there are no other suggestions we will go with this one.
> >>- It was suggested the operators documentation list those platform
> >>capabilities it currently provides from the list above. I will
> >> document a
> >>structure for this in the contribution guidelines.
> >>- Folks wanted to know what would be the criteria to graduate an
> >>operator to the big leagues :). I will kick-off a separate thread
> >> for it as
> >>I think it requires its own discussion and hopefully we can come
> >> up with a
> >>set of guidelines for it.
> >>- David brought up state of some of the existing operators and their
> >>retirement and the layout of operators in Malhar in general and how
> it
> >>causes problems with development. I will ask him to lead the
> >> discussion on
> >>that.
> >>
> >> Thanks
> >>
> >> On Fri, May 27, 2016 at 7:47 PM, David Yan 
> wrote:
> >>
> >> > The two ideas are not conflicting, but rather complementing.
> >> >
> >> > On the contrary, putting a new process for people trying to
> >> > contribute while NOT addressing the old unused subpar operators in
> >> > the repository
> >> is
> >> > what is conflicting.
> >> >
> >> > Keep in mind that when people try to contribute, they always look
> >> > at the existing operators already in the repository as examples and
> >> > likely a
> >> model
> >> > for their new operators.
> >> >
> >> > David
> >> >
> >> >
> >> > On Fri, May 27, 2016 at 4:05 PM, Amol Kekre 
> >> wrote:
> >> >
> >> > > Yes there are two conflicting threads now. The original thread
> >> > > was to
> >> > open
> >> > > up a way for contributors to submit code in a dir (contrib?) as
> >> > > long
> >> as
> >> > > license part of taken care of.
> >> > >
> >> > > On the thread of removing non-used operators -> How do we know
> >> > > what is being used?
> >> > >
> >> > > Thks,
> >> > > Amol
> >> > >
> >> > >
> >> > > On Fri, May 27, 2016 at 3:40 PM, Sandesh Hegde <
> >> sand...@datatorrent.com>
> >> > > wrote:
> >> > >
> >> > > > +1 for removing the not-used operators.
> >> > > >
> 

Re: A proposal for Malhar

2016-07-12 Thread Timothy Farkas
+1 for options 2 and 3

On Tue, Jul 12, 2016 at 12:14 PM, Kottapalli, Venkatesh <
vkottapa...@directv.com> wrote:

> +1 for deprecating the packages listed below.
>
> -Original Message-
> From: hsy...@gmail.com [mailto:hsy...@gmail.com]
> Sent: Tuesday, July 12, 2016 12:01 PM
>
> +1
>
> On Tue, Jul 12, 2016 at 11:53 AM, David Yan  wrote:
>
> > Hi all,
> >
> > I would like to renew the discussion of retiring operators in Malhar.
> >
> > As stated before, the reason why we would like to retire operators in
> > Malhar is because some of them were written a long time ago before
> > Apache incubation, and they do not pertain to real use cases, are not
> > up to par in code quality, have no potential for improvement, and
> > probably completely unused by anybody.
> >
> > We do not want contributors to use them as a model of their
> > contribution, or users to use them thinking they are of quality, and
> then hit a wall.
> > Both scenarios are not beneficial to the reputation of Apex.
> >
> > The initial 3 packages that we would like to target are *lib/algo*,
> > *lib/math*, and *lib/streamquery*.
> >
> > I'm adding this thread to the users list. Please speak up if you are
> > using any operator in these 3 packages. We would like to hear from you.
> >
> > These are the options I can think of for retiring those operators:
> >
> > 1) Completely remove them from the malhar repository.
> > 2) Move them from malhar-library into a separate artifact called
> > malhar-misc
> > 3) Mark them deprecated and add to their javadoc that they are no
> > longer supported
> >
> > Note that 2 and 3 are not mutually exclusive. Any thoughts?
> >
> > David
> >
> > On Tue, Jun 7, 2016 at 2:27 PM, Pramod Immaneni
> > 
> > wrote:
> >
> >> I wanted to close the loop on this discussion. In general everyone
> >> seemed to be favorable to this idea with no serious objections. Folks
> >> had good suggestions like documenting capabilities of operators, come
> >> up well defined criteria for graduation of operators and what those
> >> criteria may be and what to do with existing operators that may not
> >> yet be mature or unused.
> >>
> >> I am going to summarize the key points that resulted from the
> >> discussion and would like to proceed with them.
> >>
> >>- Operators that do not yet provide the key platform capabilities to
> >>make an operator useful across different applications such as
> >> reusability,
> >>partitioning static or dynamic, idempotency, exactly once will still
> be
> >>accepted as long as they are functionally correct, have unit tests
> >> and will
> >>go into a separate module.
> >>- Contrib module was suggested as a place where new contributions go
> in
> >>that don't yet have all the platform capabilities and are not yet
> >> mature.
> >>If there are no other suggestions we will go with this one.
> >>- It was suggested the operators documentation list those platform
> >>capabilities it currently provides from the list above. I will
> >> document a
> >>structure for this in the contribution guidelines.
> >>- Folks wanted to know what would be the criteria to graduate an
> >>operator to the big leagues :). I will kick-off a separate thread
> >> for it as
> >>I think it requires its own discussion and hopefully we can come
> >> up with a
> >>set of guidelines for it.
> >>- David brought up state of some of the existing operators and their
> >>retirement and the layout of operators in Malhar in general and how
> it
> >>causes problems with development. I will ask him to lead the
> >> discussion on
> >>that.
> >>
> >> Thanks
> >>
> >> On Fri, May 27, 2016 at 7:47 PM, David Yan 
> wrote:
> >>
> >> > The two ideas are not conflicting, but rather complementing.
> >> >
> >> > On the contrary, putting a new process for people trying to
> >> > contribute while NOT addressing the old unused subpar operators in
> >> > the repository
> >> is
> >> > what is conflicting.
> >> >
> >> > Keep in mind that when people try to contribute, they always look
> >> > at the existing operators already in the repository as examples and
> >> > likely a
> >> model
> >> > for their new operators.
> >> >
> >> > David
> >> >
> >> >
> >> > On Fri, May 27, 2016 at 4:05 PM, Amol Kekre 
> >> wrote:
> >> >
> >> > > Yes there are two conflicting threads now. The original thread
> >> > > was to
> >> > open
> >> > > up a way for contributors to submit code in a dir (contrib?) as
> >> > > long
> >> as
> >> > > license part of taken care of.
> >> > >
> >> > > On the thread of removing non-used operators -> How do we know
> >> > > what is being used?
> >> > >
> >> > > Thks,
> >> > > Amol
> >> > >
> >> > >
> >> > > On Fri, May 27, 2016 at 3:40 PM, Sandesh Hegde <
> >> sand...@datatorrent.com>
> >> > > wrote:
> >> > >
> >> > > > +1 for removing the not-used operators.
> >> > > >
> >> > > > 

RE: A proposal for Malhar

2016-07-12 Thread Kottapalli, Venkatesh
+1 for deprecating the packages listed below.

-----Original Message-----
From: hsy...@gmail.com [mailto:hsy...@gmail.com] 
Sent: Tuesday, July 12, 2016 12:01 PM

+1

On Tue, Jul 12, 2016 at 11:53 AM, David Yan  wrote:

> Hi all,
>
> I would like to renew the discussion of retiring operators in Malhar.
>
> As stated before, the reason we would like to retire operators in
> Malhar is that some of them were written a long time ago, before
> Apache incubation, and they do not pertain to real use cases, are not
> up to par in code quality, have no potential for improvement, and are
> probably completely unused by anybody.
>
> We do not want contributors to use them as a model of their 
> contribution, or users to use them thinking they are of quality, and then hit 
> a wall.
> Both scenarios are not beneficial to the reputation of Apex.
>
> The initial 3 packages that we would like to target are *lib/algo*, 
> *lib/math*, and *lib/streamquery*.
>
> I'm adding this thread to the users list. Please speak up if you are 
> using any operator in these 3 packages. We would like to hear from you.
>
> These are the options I can think of for retiring those operators:
>
> 1) Completely remove them from the malhar repository.
> 2) Move them from malhar-library into a separate artifact called 
> malhar-misc
> 3) Mark them deprecated and add to their javadoc that they are no 
> longer supported
>
> Note that 2 and 3 are not mutually exclusive. Any thoughts?
>
> David
>
> On Tue, Jun 7, 2016 at 2:27 PM, Pramod Immaneni 
> 
> wrote:
>
>> I wanted to close the loop on this discussion. In general everyone
>> seemed to be favorable to this idea with no serious objections. Folks
>> had good suggestions like documenting capabilities of operators, coming
>> up with well-defined criteria for graduation of operators and what those
>> criteria may be, and what to do with existing operators that may not
>> yet be mature or may be unused.
>>
>> I am going to summarize the key points that resulted from the 
>> discussion and would like to proceed with them.
>>
>>    - Operators that do not yet provide the key platform capabilities to
>>    make an operator useful across different applications such as reusability,
>>    partitioning static or dynamic, idempotency, exactly once will still be
>>    accepted as long as they are functionally correct, have unit tests and will
>>    go into a separate module.
>>    - Contrib module was suggested as a place where new contributions go in
>>    that don't yet have all the platform capabilities and are not yet mature.
>>    If there are no other suggestions we will go with this one.
>>    - It was suggested the operators documentation list those platform
>>    capabilities it currently provides from the list above. I will document a
>>    structure for this in the contribution guidelines.
>>    - Folks wanted to know what would be the criteria to graduate an
>>    operator to the big leagues :). I will kick-off a separate thread for it as
>>    I think it requires its own discussion and hopefully we can come up with a
>>    set of guidelines for it.
>>    - David brought up state of some of the existing operators and their
>>    retirement and the layout of operators in Malhar in general and how it
>>    causes problems with development. I will ask him to lead the discussion on
>>    that.
>>
>> Thanks
>>
>> On Fri, May 27, 2016 at 7:47 PM, David Yan  wrote:
>>
>> > The two ideas are not conflicting, but rather complementing.
>> >
>> > On the contrary, putting a new process for people trying to
>> > contribute while NOT addressing the old unused subpar operators in
>> > the repository is what is conflicting.
>> >
>> > Keep in mind that when people try to contribute, they always look
>> > at the existing operators already in the repository as examples and
>> > likely a model for their new operators.
>> >
>> > David
>> >
>> >
>> > On Fri, May 27, 2016 at 4:05 PM, Amol Kekre 
>> wrote:
>> >
>> > > Yes there are two conflicting threads now. The original thread
>> > > was to open up a way for contributors to submit code in a dir
>> > > (contrib?) as long as license part of taken care of.
>> > >
>> > > On the thread of removing non-used operators -> How do we know 
>> > > what is being used?
>> > >
>> > > Thks,
>> > > Amol
>> > >
>> > >
>> > > On Fri, May 27, 2016 at 3:40 PM, Sandesh Hegde <
>> sand...@datatorrent.com>
>> > > wrote:
>> > >
>> > > > +1 for removing the not-used operators.
>> > > >
>> > > > So we are creating a process for operator writers who don't
>> > > > want to understand the platform, yet wants to contribute? How
>> > > > big is that set?
>> > > > If we tell the app-user, here is the code which has not passed
>> > > > all the checklist, will they be ready to use that in production?
>> > > >
>> > > > This 

Re: A proposal for Malhar

2016-07-12 Thread hsy...@gmail.com
+1

On Tue, Jul 12, 2016 at 11:53 AM, David Yan  wrote:

> Hi all,
>
> I would like to renew the discussion of retiring operators in Malhar.
>
> As stated before, the reason why we would like to retire operators in
> Malhar is because some of them were written a long time ago before Apache
> incubation, and they do not pertain to real use cases, are not up to par in
> code quality, have no potential for improvement, and probably completely
> unused by anybody.
>
> We do not want contributors to use them as a model of their contribution,
> or users to use them thinking they are of quality, and then hit a wall.
> Both scenarios are not beneficial to the reputation of Apex.
>
> The initial 3 packages that we would like to target are *lib/algo*,
> *lib/math*, and *lib/streamquery*.
>
> I'm adding this thread to the users list. Please speak up if you are using
> any operator in these 3 packages. We would like to hear from you.
>
> These are the options I can think of for retiring those operators:
>
> 1) Completely remove them from the malhar repository.
> 2) Move them from malhar-library into a separate artifact called
> malhar-misc
> 3) Mark them deprecated and add to their javadoc that they are no longer
> supported
>
> Note that 2 and 3 are not mutually exclusive. Any thoughts?
>
> David
>
> On Tue, Jun 7, 2016 at 2:27 PM, Pramod Immaneni 
> wrote:
>
>> I wanted to close the loop on this discussion. In general everyone seemed
>> to be favorable to this idea with no serious objections. Folks had good
>> suggestions like documenting capabilities of operators, come up well
>> defined criteria for graduation of operators and what those criteria may
>> be and what to do with existing operators that may not yet be mature or
>> unused.
>>
>> I am going to summarize the key points that resulted from the discussion
>> and would like to proceed with them.
>>
>>    - Operators that do not yet provide the key platform capabilities to
>>    make an operator useful across different applications such as reusability,
>>    partitioning static or dynamic, idempotency, exactly once will still be
>>    accepted as long as they are functionally correct, have unit tests and will
>>    go into a separate module.
>>    - Contrib module was suggested as a place where new contributions go in
>>    that don't yet have all the platform capabilities and are not yet mature.
>>    If there are no other suggestions we will go with this one.
>>    - It was suggested the operators documentation list those platform
>>    capabilities it currently provides from the list above. I will document a
>>    structure for this in the contribution guidelines.
>>    - Folks wanted to know what would be the criteria to graduate an
>>    operator to the big leagues :). I will kick-off a separate thread for it as
>>    I think it requires its own discussion and hopefully we can come up with a
>>    set of guidelines for it.
>>    - David brought up state of some of the existing operators and their
>>    retirement and the layout of operators in Malhar in general and how it
>>    causes problems with development. I will ask him to lead the discussion on
>>    that.
>>
>> Thanks
>>
>> On Fri, May 27, 2016 at 7:47 PM, David Yan  wrote:
>>
>> > The two ideas are not conflicting, but rather complementing.
>> >
>> > On the contrary, putting a new process for people trying to contribute
>> > while NOT addressing the old unused subpar operators in the repository
>> > is what is conflicting.
>> >
>> > Keep in mind that when people try to contribute, they always look at the
>> > existing operators already in the repository as examples and likely a
>> > model for their new operators.
>> >
>> > David
>> >
>> >
>> > On Fri, May 27, 2016 at 4:05 PM, Amol Kekre 
>> wrote:
>> >
>> > > Yes there are two conflicting threads now. The original thread was to
>> > > open up a way for contributors to submit code in a dir (contrib?) as
>> > > long as license part of taken care of.
>> > >
>> > > On the thread of removing non-used operators -> How do we know what is
>> > > being used?
>> > >
>> > > Thks,
>> > > Amol
>> > >
>> > >
>> > > On Fri, May 27, 2016 at 3:40 PM, Sandesh Hegde <
>> sand...@datatorrent.com>
>> > > wrote:
>> > >
>> > > > +1 for removing the not-used operators.
>> > > >
>> > > > So we are creating a process for operator writers who don't want to
>> > > > understand the platform, yet wants to contribute? How big is that
>> > > > set?
>> > > > If we tell the app-user, here is the code which has not passed all
>> > > > the checklist, will they be ready to use that in production?
>> > > >
>> > > > This thread has 2 conflicting forces, reduce the operators and make
>> > > > it easy to add more operators.
>> > > >
>> > > >
>> > > >
>> > > > On Fri, May 27, 2016 at 3:03 PM Pramod Immaneni <
>> > pra...@datatorrent.com>
>> > > > 

Re: A proposal for Malhar

2016-06-07 Thread Pramod Immaneni
I wanted to close the loop on this discussion. In general, everyone seemed
to be in favor of this idea, with no serious objections. Folks had good
suggestions, such as documenting the capabilities of operators, coming up
with well-defined criteria for graduating operators, and deciding what to do
with existing operators that are not yet mature or are unused.

I am going to summarize the key points that resulted from the discussion
and would like to proceed with them.

   - Operators that do not yet provide the key platform capabilities that
   make an operator useful across different applications (such as reusability,
   static or dynamic partitioning, idempotency, and exactly-once processing)
   will still be accepted as long as they are functionally correct and have
   unit tests. They will go into a separate module.
   - The contrib module was suggested as the place for new contributions
   that don't yet have all the platform capabilities and are not yet mature.
   If there are no other suggestions, we will go with this one.
   - It was suggested that each operator's documentation list which of the
   platform capabilities above it currently provides. I will document a
   structure for this in the contribution guidelines.
   - Folks wanted to know what the criteria would be for graduating an
   operator to the big leagues :). I will kick off a separate thread for it, as
   I think it requires its own discussion, and hopefully we can come up with a
   set of guidelines.
   - David brought up the state of some of the existing operators and their
   retirement, as well as the layout of operators in Malhar in general and how
   it causes problems with development. I will ask him to lead the discussion
   on that.

Thanks

On Fri, May 27, 2016 at 7:47 PM, David Yan  wrote:

> The two ideas are not conflicting, but rather complementing.
>
> On the contrary, putting a new process for people trying to contribute
> while NOT addressing the old unused subpar operators in the repository is
> what is conflicting.
>
> Keep in mind that when people try to contribute, they always look at the
> existing operators already in the repository as examples and likely a model
> for their new operators.
>
> David
>
>
> On Fri, May 27, 2016 at 4:05 PM, Amol Kekre  wrote:
>
> > Yes there are two conflicting threads now. The original thread was to
> > open up a way for contributors to submit code in a dir (contrib?) as
> > long as license part of taken care of.
> >
> > On the thread of removing non-used operators -> How do we know what is
> > being used?
> >
> > Thks,
> > Amol
> >
> >
> > On Fri, May 27, 2016 at 3:40 PM, Sandesh Hegde 
> > wrote:
> >
> > > +1 for removing the not-used operators.
> > >
> > > So we are creating a process for operator writers who don't want to
> > > understand the platform, yet wants to contribute? How big is that set?
> > > If we tell the app-user, here is the code which has not passed all the
> > > checklist, will they be ready to use that in production?
> > >
> > > This thread has 2 conflicting forces, reduce the operators and make it
> > > easy to add more operators.
> > >
> > >
> > >
> > > On Fri, May 27, 2016 at 3:03 PM Pramod Immaneni <
> pra...@datatorrent.com>
> > > wrote:
> > >
> > > > On Fri, May 27, 2016 at 2:30 PM, Gaurav Gupta <
> > gaurav.gopi...@gmail.com>
> > > > wrote:
> > > >
> > > > > Pramod,
> > > > >
> > > > > By that logic I would say let's put all partitionable operators
> into
> > > one
> > > > > folder, non-partitionable operators in another and so on...
> > > > >
> > > >
> > > > Remember the original goal of making it easier for new members to
> > > > contribute and managing those contributions to maturity. It is not a
> > > > functional level separation.
> > > >
> > > >
> > > > > When I look at hadoop code I see these annotations being used at
> > class
> > > > > level and not at package/folder level.
> > > >
> > > >
> > > > I had a typo in my email, I meant to say "think of this like a
> > folder..."
> > > > as an analogy and not literally.
> > > >
> > > > Thanks
> > > >
> > > >
> > > > > Thanks
> > > > >
> > > > > On Fri, May 27, 2016 at 2:10 PM, Pramod Immaneni <
> > > pra...@datatorrent.com
> > > > >
> > > > > wrote:
> > > > >
> > > > > > On Fri, May 27, 2016 at 1:05 PM, Gaurav Gupta <
> > > > gaurav.gopi...@gmail.com>
> > > > > > wrote:
> > > > > >
> > > > > > > Can same goal not be achieved by
> > > > > > > using
> > org.apache.hadoop.classification.InterfaceStability.Evolving
> > > /
> > > > > > > org.apache.hadoop.classification.InterfaceStability.Unstable
> > > > > annotation?
> > > > > > >
> > > > > >
> > > > > > I think it is important to localize the additions in one place so
> > > that
> > > > it
> > > > > > becomes clearer to users about the maturity level of these,
> easier
> > > for
> > > > > > developers to track them towards the path to maturity and also
> > > > provides a
> > > > > > clearer directive 

Re: A proposal for Malhar

2016-05-27 Thread David Yan
The two ideas are not conflicting but complementary.

On the contrary, putting a new process in place for people trying to
contribute while NOT addressing the old, unused, subpar operators in the
repository is what is conflicting.

Keep in mind that when people try to contribute, they always look at the
existing operators already in the repository as examples, and likely as a
model for their new operators.

David


On Fri, May 27, 2016 at 4:05 PM, Amol Kekre  wrote:

> Yes there are two conflicting threads now. The original thread was to open
> up a way for contributors to submit code in a dir (contrib?) as long as
> license part of taken care of.
>
> On the thread of removing non-used operators -> How do we know what is
> being used?
>
> Thks,
> Amol
>
>
> On Fri, May 27, 2016 at 3:40 PM, Sandesh Hegde 
> wrote:
>
> > +1 for removing the not-used operators.
> >
> > So we are creating a process for operator writers who don't want to
> > understand the platform, yet wants to contribute? How big is that set?
> > If we tell the app-user, here is the code which has not passed all the
> > checklist, will they be ready to use that in production?
> >
> > This thread has 2 conflicting forces, reduce the operators and make it
> > easy to add more operators.
> >
> >
> >
> > On Fri, May 27, 2016 at 3:03 PM Pramod Immaneni 
> > wrote:
> >
> > > On Fri, May 27, 2016 at 2:30 PM, Gaurav Gupta <
> gaurav.gopi...@gmail.com>
> > > wrote:
> > >
> > > > Pramod,
> > > >
> > > > By that logic I would say let's put all partitionable operators into
> > one
> > > > folder, non-partitionable operators in another and so on...
> > > >
> > >
> > > Remember the original goal of making it easier for new members to
> > > contribute and managing those contributions to maturity. It is not a
> > > functional level separation.
> > >
> > >
> > > > When I look at hadoop code I see these annotations being used at
> class
> > > > level and not at package/folder level.
> > >
> > >
> > > I had a typo in my email, I meant to say "think of this like a
> folder..."
> > > as an analogy and not literally.
> > >
> > > Thanks
> > >
> > >
> > > > Thanks
> > > >
> > > > On Fri, May 27, 2016 at 2:10 PM, Pramod Immaneni <
> > pra...@datatorrent.com
> > > >
> > > > wrote:
> > > >
> > > > > On Fri, May 27, 2016 at 1:05 PM, Gaurav Gupta <
> > > gaurav.gopi...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > Can same goal not be achieved by
> > > > > > using
> org.apache.hadoop.classification.InterfaceStability.Evolving
> > /
> > > > > > org.apache.hadoop.classification.InterfaceStability.Unstable
> > > > annotation?
> > > > > >
> > > > >
> > > > > I think it is important to localize the additions in one place so
> > that
> > > it
> > > > > becomes clearer to users about the maturity level of these, easier
> > for
> > > > > developers to track them towards the path to maturity and also
> > > provides a
> > > > > clearer directive for committers and contributors on acceptance of
> > new
> > > > > submissions. Relying on the annotations alone makes them spread all
> > > over
> > > > > the place and adds an additional layer of difficulty in
> > identification
> > > > not
> > > > > just for users but also for developers who want to find such
> > operators
> > > > and
> > > > > improve them. This of this like a folder level annotation where
> > > > everything
> > > > > under this folder is unstable or evolving.
> > > > >
> > > > > Thanks
> > > > >
> > > > >
> > > > > >
> > > > > > On Fri, May 27, 2016 at 12:35 PM, David Yan <
> da...@datatorrent.com
> > >
> > > > > wrote:
> > > > > >
> > > > > > > >
> > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Malhar in its current state, has way too many operators
> > that
> > > > fall
> > > > > > in
> > > > > > > > the
> > > > > > > > > > "non-production quality" category. We should make it
> > obvious
> > > to
> > > > > > users
> > > > > > > > > that
> > > > > > > > > > which operators are up to par, and which operators are
> not,
> > > and
> > > > > > maybe
> > > > > > > > > even
> > > > > > > > > > remove those that are likely not ever used in a real use
> > > case.
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > I am ambivalent about revisiting older operators and doing
> > this
> > > > > > > exercise
> > > > > > > > as
> > > > > > > > > this can cause unnecessary tensions. My original intent is
> > for
> > > > > > > > > contributions going forward.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > IMO it is important to address this as well. Operators
> outside
> > > the
> > > > > play
> > > > > > > > area should be of well known quality.
> > > > > > > >
> > > > > > > >
> > > > > > > I think this is important, and I don't anticipate much tension
> if
> > > we
> > > > > > > establish clear criteria.
> > > > > > > It's not helpful if we let the old subpar operators stay and
> put
> > up
> > > > the
> > > > > > > bars for new operators.
> > 

Re: A proposal for Malhar

2016-05-27 Thread Ashwin Chandra Putta
Instead of creating a new space for evolving operators, why not create a
new space for mature operators and graduate existing mature operators?

That way, it addresses both items under discussion:
- having separation between evolving and mature operators.
- not having evolving operators in the mature space.

Regards,
Ashwin.

On Fri, May 27, 2016 at 6:01 PM, Thomas Weise 
wrote:

> It is important that the discussion happens, regardless in which thread. If
> we want to establish criteria for operators to be promoted out of contrib
> then we need to make sure those already out there match the bar. Today
> there are many operators that don't.
>
> Also, the incubator space needs to be managed. If things don't make it /
> are abandoned then there needs to be a policy to deal with it. That's part
> of the same package, criteria for going in needs to be complemented by
> criteria for removal.
>
> Thomas
>
>
> On Fri, May 27, 2016 at 4:30 PM, Amol Kekre  wrote:
>
> > Agreed
> >
> > Thks
> > Amol
> >
> > On Fri, May 27, 2016 at 4:14 PM, Pramod Immaneni
> > wrote:
> >
> > > Amol,
> > >
> > > I would suggest starting a separate thread for that discussion.
> > >
> > > Thanks
> > >
> > > On Fri, May 27, 2016 at 4:05 PM, Amol Kekre 
> > wrote:
> > >
> > > > Yes there are two conflicting threads now. The original thread was to
> > > > open up a way for contributors to submit code in a dir (contrib?) as
> > > > long as license part of taken care of.
> > > >
> > > > On the thread of removing non-used operators -> How do we know what
> is
> > > > being used?
> > > >
> > > > Thks,
> > > > Amol
> > > >
> > > >
> > > > On Fri, May 27, 2016 at 3:40 PM, Sandesh Hegde <
> > sand...@datatorrent.com>
> > > > wrote:
> > > >
> > > > > +1 for removing the not-used operators.
> > > > >
> > > > > So we are creating a process for operator writers who don't want to
> > > > > understand the platform, yet wants to contribute? How big is that
> > set?
> > > > > If we tell the app-user, here is the code which has not passed all
> > the
> > > > > checklist, will they be ready to use that in production?
> > > > >
> > > > > This thread has 2 conflicting forces, reduce the operators and make
> > it
> > > > easy
> > > > > to add more operators.
> > > > >
> > > > >
> > > > >
> > > > > On Fri, May 27, 2016 at 3:03 PM Pramod Immaneni <
> > > pra...@datatorrent.com>
> > > > > wrote:
> > > > >
> > > > > > On Fri, May 27, 2016 at 2:30 PM, Gaurav Gupta <
> > > > gaurav.gopi...@gmail.com>
> > > > > > wrote:
> > > > > >
> > > > > > > Pramod,
> > > > > > >
> > > > > > > By that logic I would say let's put all partitionable operators
> > > into
> > > > > one
> > > > > > > folder, non-partitionable operators in another and so on...
> > > > > > >
> > > > > >
> > > > > > Remember the original goal of making it easier for new members to
> > > > > > contribute and managing those contributions to maturity. It is
> not
> > a
> > > > > > functional level separation.
> > > > > >
> > > > > >
> > > > > > > When I look at hadoop code I see these annotations being used
> at
> > > > class
> > > > > > > level and not at package/folder level.
> > > > > >
> > > > > >
> > > > > > I had a typo in my email, I meant to say "think of this like a
> > > > folder..."
> > > > > > as an analogy and not literally.
> > > > > >
> > > > > > Thanks
> > > > > >
> > > > > >
> > > > > > > Thanks
> > > > > > >
> > > > > > > On Fri, May 27, 2016 at 2:10 PM, Pramod Immaneni <
> > > > > pra...@datatorrent.com
> > > > > > >
> > > > > > > wrote:
> > > > > > >
> > > > > > > > On Fri, May 27, 2016 at 1:05 PM, Gaurav Gupta <
> > > > > > gaurav.gopi...@gmail.com>
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > Can same goal not be achieved by
> > > > > > > > > using
> > > > org.apache.hadoop.classification.InterfaceStability.Evolving
> > > > > /
> > > > > > > > >
> org.apache.hadoop.classification.InterfaceStability.Unstable
> > > > > > > annotation?
> > > > > > > > >
> > > > > > > >
> > > > > > > > I think it is important to localize the additions in one
> place
> > so
> > > > > that
> > > > > > it
> > > > > > > > becomes clearer to users about the maturity level of these,
> > > easier
> > > > > for
> > > > > > > > developers to track them towards the path to maturity and
> also
> > > > > > provides a
> > > > > > > > clearer directive for committers and contributors on
> acceptance
> > > of
> > > > > new
> > > > > > > > submissions. Relying on the annotations alone makes them
> spread
> > > all
> > > > > > over
> > > > > > > > the place and adds an additional layer of difficulty in
> > > > > identification
> > > > > > > not
> > > > > > > > just for users but also for developers who want to find such
> > > > > operators
> > > > > > > and
> > > > > > > > improve them. This of this like a folder level annotation
> where
> > > > > > > everything
> > > > > > > > under this folder is unstable or 

Re: A proposal for Malhar

2016-05-27 Thread Thomas Weise
It is important that the discussion happens, regardless of which thread it
is in. If we want to establish criteria for operators to be promoted out of
contrib, then we need to make sure those already out there meet the bar.
Today there are many operators that don't.

Also, the incubator space needs to be managed. If things don't make it or
are abandoned, then there needs to be a policy to deal with it. That's part
of the same package: criteria for going in need to be complemented by
criteria for removal.

Thomas


On Fri, May 27, 2016 at 4:30 PM, Amol Kekre  wrote:

> Agreed
>
> Thks
> Amol
>
> On Fri, May 27, 2016 at 4:14 PM, Pramod Immaneni 
> wrote:
>
> > Amol,
> >
> > I would suggest starting a separate thread for that discussion.
> >
> > Thanks
> >
> > On Fri, May 27, 2016 at 4:05 PM, Amol Kekre 
> wrote:
> >
> > > Yes there are two conflicting threads now. The original thread was to
> > > open up a way for contributors to submit code in a dir (contrib?) as
> > > long as license part of taken care of.
> > >
> > > On the thread of removing non-used operators -> How do we know what is
> > > being used?
> > >
> > > Thks,
> > > Amol
> > >
> > >
> > > On Fri, May 27, 2016 at 3:40 PM, Sandesh Hegde <
> sand...@datatorrent.com>
> > > wrote:
> > >
> > > > +1 for removing the not-used operators.
> > > >
> > > > So we are creating a process for operator writers who don't want to
> > > > understand the platform, yet wants to contribute? How big is that
> set?
> > > > If we tell the app-user, here is the code which has not passed all
> the
> > > > checklist, will they be ready to use that in production?
> > > >
> > > > This thread has 2 conflicting forces, reduce the operators and make
> it
> > > easy
> > > > to add more operators.
> > > >
> > > >
> > > >
> > > > On Fri, May 27, 2016 at 3:03 PM Pramod Immaneni <
> > pra...@datatorrent.com>
> > > > wrote:
> > > >
> > > > > On Fri, May 27, 2016 at 2:30 PM, Gaurav Gupta <
> > > gaurav.gopi...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > Pramod,
> > > > > >
> > > > > > By that logic I would say let's put all partitionable operators
> > into
> > > > one
> > > > > > folder, non-partitionable operators in another and so on...
> > > > > >
> > > > >
> > > > > Remember the original goal of making it easier for new members to
> > > > > contribute and managing those contributions to maturity. It is not
> a
> > > > > functional level separation.
> > > > >
> > > > >
> > > > > > When I look at hadoop code I see these annotations being used at
> > > class
> > > > > > level and not at package/folder level.
> > > > >
> > > > >
> > > > > I had a typo in my email, I meant to say "think of this like a
> > > folder..."
> > > > > as an analogy and not literally.
> > > > >
> > > > > Thanks
> > > > >
> > > > >
> > > > > > Thanks
> > > > > >
> > > > > > On Fri, May 27, 2016 at 2:10 PM, Pramod Immaneni <
> > > > pra...@datatorrent.com
> > > > > >
> > > > > > wrote:
> > > > > >
> > > > > > > On Fri, May 27, 2016 at 1:05 PM, Gaurav Gupta <
> > > > > gaurav.gopi...@gmail.com>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Can same goal not be achieved by
> > > > > > > > using
> > > org.apache.hadoop.classification.InterfaceStability.Evolving
> > > > /
> > > > > > > > org.apache.hadoop.classification.InterfaceStability.Unstable
> > > > > > annotation?
> > > > > > > >
> > > > > > >
> > > > > > > I think it is important to localize the additions in one place
> so
> > > > that
> > > > > it
> > > > > > > becomes clearer to users about the maturity level of these,
> > easier
> > > > for
> > > > > > > developers to track them towards the path to maturity and also
> > > > > provides a
> > > > > > > clearer directive for committers and contributors on acceptance
> > of
> > > > new
> > > > > > > submissions. Relying on the annotations alone makes them spread
> > all
> > > > > over
> > > > > > > the place and adds an additional layer of difficulty in
> > > > identification
> > > > > > not
> > > > > > > just for users but also for developers who want to find such
> > > > operators
> > > > > > and
> > > > > > > improve them. This of this like a folder level annotation where
> > > > > > everything
> > > > > > > under this folder is unstable or evolving.
> > > > > > >
> > > > > > > Thanks
> > > > > > >
> > > > > > >
> > > > > > > >
> > > > > > > > On Fri, May 27, 2016 at 12:35 PM, David Yan <
> > > da...@datatorrent.com
> > > > >
> > > > > > > wrote:
> > > > > > > >
> > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > Malhar in its current state, has way too many
> operators
> > > > that
> > > > > > fall
> > > > > > > > in
> > > > > > > > > > the
> > > > > > > > > > > > "non-production quality" category. We should make it
> > > > obvious
> > > > > to
> > > > > > > > users
> > > > > > > > > > > that
> > > > > > > > > > > > which operators are up to par, and which operators
> are
> > > not,
> > > > > 

Re: A proposal for Malhar

2016-05-27 Thread Amol Kekre
Agreed

Thks
Amol

On Fri, May 27, 2016 at 4:14 PM, Pramod Immaneni 
wrote:

> Amol,
>
> I would suggest starting a separate thread for that discussion.
>
> Thanks
>
> On Fri, May 27, 2016 at 4:05 PM, Amol Kekre  wrote:
>
> > Yes there are two conflicting threads now. The original thread was to
> > open up a way for contributors to submit code in a dir (contrib?) as
> > long as license part of taken care of.
> >
> > On the thread of removing non-used operators -> How do we know what is
> > being used?
> >
> > Thks,
> > Amol
> >
> >
> > On Fri, May 27, 2016 at 3:40 PM, Sandesh Hegde 
> > wrote:
> >
> > > +1 for removing the not-used operators.
> > >
> > > So we are creating a process for operator writers who don't want to
> > > understand the platform, yet wants to contribute? How big is that set?
> > > If we tell the app-user, here is the code which has not passed all the
> > > checklist, will they be ready to use that in production?
> > >
> > > This thread has 2 conflicting forces, reduce the operators and make it
> > > easy to add more operators.
> > >
> > >
> > >
> > > On Fri, May 27, 2016 at 3:03 PM Pramod Immaneni <
> pra...@datatorrent.com>
> > > wrote:
> > >
> > > > On Fri, May 27, 2016 at 2:30 PM, Gaurav Gupta <
> > gaurav.gopi...@gmail.com>
> > > > wrote:
> > > >
> > > > > Pramod,
> > > > >
> > > > > By that logic I would say let's put all partitionable operators
> into
> > > one
> > > > > folder, non-partitionable operators in another and so on...
> > > > >
> > > >
> > > > Remember the original goal of making it easier for new members to
> > > > contribute and managing those contributions to maturity. It is not a
> > > > functional level separation.
> > > >
> > > >
> > > > > When I look at hadoop code I see these annotations being used at
> > class
> > > > > level and not at package/folder level.
> > > >
> > > >
> > > > I had a typo in my email, I meant to say "think of this like a
> > folder..."
> > > > as an analogy and not literally.
> > > >
> > > > Thanks
> > > >
> > > >
> > > > > Thanks
> > > > >
> > > > > On Fri, May 27, 2016 at 2:10 PM, Pramod Immaneni <
> > > pra...@datatorrent.com
> > > > >
> > > > > wrote:
> > > > >
> > > > > > On Fri, May 27, 2016 at 1:05 PM, Gaurav Gupta <
> > > > gaurav.gopi...@gmail.com>
> > > > > > wrote:
> > > > > >
> > > > > > > Can same goal not be achieved by
> > > > > > > using
> > org.apache.hadoop.classification.InterfaceStability.Evolving
> > > /
> > > > > > > org.apache.hadoop.classification.InterfaceStability.Unstable
> > > > > annotation?
> > > > > > >
> > > > > >
> > > > > > I think it is important to localize the additions in one place so
> > > that
> > > > it
> > > > > > becomes clearer to users about the maturity level of these,
> easier
> > > for
> > > > > > developers to track them towards the path to maturity and also
> > > > provides a
> > > > > > clearer directive for committers and contributors on acceptance
> of
> > > new
> > > > > > submissions. Relying on the annotations alone makes them spread
> all
> > > > over
> > > > > > the place and adds an additional layer of difficulty in
> > > identification
> > > > > not
> > > > > > just for users but also for developers who want to find such
> > > operators
> > > > > and
> > > > > > improve them. This of this like a folder level annotation where
> > > > > everything
> > > > > > under this folder is unstable or evolving.
> > > > > >
> > > > > > Thanks
> > > > > >
> > > > > >
> > > > > > >
> > > > > > > On Fri, May 27, 2016 at 12:35 PM, David Yan <
> > da...@datatorrent.com
> > > >
> > > > > > wrote:
> > > > > > >
> > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > Malhar in its current state, has way too many operators
> > > that
> > > > > fall
> > > > > > > in
> > > > > > > > > the
> > > > > > > > > > > "non-production quality" category. We should make it
> > > obvious
> > > > to
> > > > > > > users
> > > > > > > > > > that
> > > > > > > > > > > which operators are up to par, and which operators are
> > not,
> > > > and
> > > > > > > maybe
> > > > > > > > > > even
> > > > > > > > > > > remove those that are likely not ever used in a real
> use
> > > > case.
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > I am ambivalent about revisiting older operators and
> doing
> > > this
> > > > > > > > exercise
> > > > > > > > > as
> > > > > > > > > > this can cause unnecessary tensions. My original intent
> is
> > > for
> > > > > > > > > > contributions going forward.
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > IMO it is important to address this as well. Operators
> > outside
> > > > the
> > > > > > play
> > > > > > > > > area should be of well known quality.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > I think this is important, and I don't anticipate much
> tension
> > if
> > > > we
> > > > > > > > establish clear criteria.
> > > > > > > > It's not helpful if we 

Re: A proposal for Malhar

2016-05-27 Thread Pramod Immaneni
On Fri, May 27, 2016 at 3:40 PM, Sandesh Hegde 
wrote:

> +1 for removing the not-used operators.
>
> So we are creating a process for operator writers who don't want to
> understand the platform, yet wants to contribute? How big is that set?
> If we tell the app-user, here is the code which has not passed all the
> checklist, will they be ready to use that in production?
>
> This thread has 2 conflicting forces, reduce the operators and make it easy
> to add more operators.
>

As I mentioned in my responses to earlier comments on this topic, let's
have a separate discussion about existing operators, as that requires
consideration of its own. This proposal isn't for that.

Thanks


>
>
> On Fri, May 27, 2016 at 3:03 PM Pramod Immaneni 
> wrote:
>
> > On Fri, May 27, 2016 at 2:30 PM, Gaurav Gupta 
> > wrote:
> >
> > > Pramod,
> > >
> > > By that logic I would say let's put all partitionable operators into
> one
> > > folder, non-partitionable operators in another and so on...
> > >
> >
> > Remember the original goal of making it easier for new members to
> > contribute and managing those contributions to maturity. It is not a
> > functional level separation.
> >
> >
> > > When I look at hadoop code I see these annotations being used at class
> > > level and not at package/folder level.
> >
> >
> > I had a typo in my email, I meant to say "think of this like a folder..."
> > as an analogy and not literally.
> >
> > Thanks
> >
> >
> > > Thanks
> > >
> > > On Fri, May 27, 2016 at 2:10 PM, Pramod Immaneni <
> pra...@datatorrent.com
> > >
> > > wrote:
> > >
> > > > On Fri, May 27, 2016 at 1:05 PM, Gaurav Gupta <
> > gaurav.gopi...@gmail.com>
> > > > wrote:
> > > >
> > > > > Can same goal not be achieved by
> > > > > using org.apache.hadoop.classification.InterfaceStability.Evolving
> /
> > > > > org.apache.hadoop.classification.InterfaceStability.Unstable
> > > annotation?
> > > > >
> > > >
> > > > I think it is important to localize the additions in one place so
> that
> > it
> > > > becomes clearer to users about the maturity level of these, easier
> for
> > > > developers to track them towards the path to maturity and also
> > provides a
> > > > clearer directive for committers and contributors on acceptance of
> new
> > > > submissions. Relying on the annotations alone makes them spread all
> > over
> > > > the place and adds an additional layer of difficulty in
> identification
> > > not
> > > > just for users but also for developers who want to find such
> operators
> > > and
> > > > improve them. This of this like a folder level annotation where
> > > everything
> > > > under this folder is unstable or evolving.
> > > >
> > > > Thanks
> > > >
> > > >
> > > > >
> > > > > On Fri, May 27, 2016 at 12:35 PM, David Yan
> > > > wrote:
> > > > >
> > > > > > >
> > > > > > > >
> > > > > > > > >
> > > > > > > > > Malhar in its current state, has way too many operators
> that
> > > fall
> > > > > in
> > > > > > > the
> > > > > > > > > "non-production quality" category. We should make it
> obvious
> > to
> > > > > users
> > > > > > > > that
> > > > > > > > > which operators are up to par, and which operators are not,
> > and
> > > > > maybe
> > > > > > > > even
> > > > > > > > > remove those that are likely not ever used in a real use
> > case.
> > > > > > > > >
> > > > > > > >
> > > > > > > > I am ambivalent about revisiting older operators and doing
> this
> > > > > > exercise
> > > > > > > as
> > > > > > > > this can cause unnecessary tensions. My original intent is
> for
> > > > > > > > contributions going forward.
> > > > > > > >
> > > > > > > >
> > > > > > > IMO it is important to address this as well. Operators outside
> > the
> > > > play
> > > > > > > area should be of well known quality.
> > > > > > >
> > > > > > >
> > > > > > I think this is important, and I don't anticipate much tension if
> > we
> > > > > > establish clear criteria.
> > > > > > It's not helpful if we let the old subpar operators stay and put
> up
> > > the
> > > > > > bars for new operators.
> > > > > >
> > > > > > David
> > > > > >
> > > > >
> > > >
> > >
> >
>


Re: A proposal for Malhar

2016-05-27 Thread Sandesh Hegde
+1 for removing the not-used operators.

So we are creating a process for operator writers who don't want to
understand the platform, yet want to contribute? How big is that set?
If we tell the app user that this code has not passed every item on the
checklist, will they be ready to use it in production?

This thread has two conflicting forces: reducing the number of operators
and making it easy to add more operators.



On Fri, May 27, 2016 at 3:03 PM Pramod Immaneni 
wrote:

> On Fri, May 27, 2016 at 2:30 PM, Gaurav Gupta 
> wrote:
>
> > Pramod,
> >
> > By that logic I would say let's put all partitionable operators into one
> > folder, non-partitionable operators in another and so on...
> >
>
> Remember the original goal of making it easier for new members to
> contribute and managing those contributions to maturity. It is not a
> functional level separation.
>
>
> > When I look at hadoop code I see these annotations being used at class
> > level and not at package/folder level.
>
>
> I had a typo in my email, I meant to say "think of this like a folder..."
> as an analogy and not literally.
>
> Thanks
>
>
> > Thanks
> >
> > On Fri, May 27, 2016 at 2:10 PM, Pramod Immaneni
> > wrote:
> >
> > > On Fri, May 27, 2016 at 1:05 PM, Gaurav Gupta <
> gaurav.gopi...@gmail.com>
> > > wrote:
> > >
> > > > Can same goal not be achieved by
> > > > using org.apache.hadoop.classification.InterfaceStability.Evolving /
> > > > org.apache.hadoop.classification.InterfaceStability.Unstable
> > annotation?
> > > >
> > >
> > > I think it is important to localize the additions in one place so that
> it
> > > becomes clearer to users about the maturity level of these, easier for
> > > developers to track them towards the path to maturity and also
> provides a
> > > clearer directive for committers and contributors on acceptance of new
> > > submissions. Relying on the annotations alone makes them spread all
> over
> > > the place and adds an additional layer of difficulty in identification
> > not
> > > just for users but also for developers who want to find such operators
> > and
> > > improve them. This of this like a folder level annotation where
> > everything
> > > under this folder is unstable or evolving.
> > >
> > > Thanks
> > >
> > >
> > > >
> > > > On Fri, May 27, 2016 at 12:35 PM, David Yan 
> > > wrote:
> > > >
> > > > > >
> > > > > > >
> > > > > > > >
> > > > > > > > Malhar in its current state, has way too many operators that
> > fall
> > > > in
> > > > > > the
> > > > > > > > "non-production quality" category. We should make it obvious
> to
> > > > users
> > > > > > > that
> > > > > > > > which operators are up to par, and which operators are not,
> and
> > > > maybe
> > > > > > > even
> > > > > > > > remove those that are likely not ever used in a real use
> case.
> > > > > > > >
> > > > > > >
> > > > > > > I am ambivalent about revisiting older operators and doing this
> > > > > exercise
> > > > > > as
> > > > > > > this can cause unnecessary tensions. My original intent is for
> > > > > > > contributions going forward.
> > > > > > >
> > > > > > >
> > > > > > IMO it is important to address this as well. Operators outside
> the
> > > play
> > > > > > area should be of well known quality.
> > > > > >
> > > > > >
> > > > > I think this is important, and I don't anticipate much tension if
> we
> > > > > establish clear criteria.
> > > > > It's not helpful if we let the old subpar operators stay and put up
> > the
> > > > > bars for new operators.
> > > > >
> > > > > David
> > > > >
> > > >
> > >
> >
>


Re: A proposal for Malhar

2016-05-27 Thread Pramod Immaneni
On Fri, May 27, 2016 at 2:30 PM, Gaurav Gupta 
wrote:

> Pramod,
>
> By that logic I would say let's put all partitionable operators into one
> folder, non-partitionable operators in another and so on...
>

Remember the original goal of making it easier for new members to
contribute and managing those contributions to maturity. It is not a
functional level separation.


> When I look at hadoop code I see these annotations being used at class
> level and not at package/folder level.


I had a typo in my email, I meant to say "think of this like a folder..."
as an analogy and not literally.

Thanks


> Thanks
>
> On Fri, May 27, 2016 at 2:10 PM, Pramod Immaneni 
> wrote:
>
> > On Fri, May 27, 2016 at 1:05 PM, Gaurav Gupta 
> > wrote:
> >
> > > Can same goal not be achieved by
> > > using org.apache.hadoop.classification.InterfaceStability.Evolving /
> > > org.apache.hadoop.classification.InterfaceStability.Unstable
> annotation?
> > >
> >
> > I think it is important to localize the additions in one place so that it
> > becomes clearer to users about the maturity level of these, easier for
> > developers to track them towards the path to maturity and also provides a
> > clearer directive for committers and contributors on acceptance of new
> > submissions. Relying on the annotations alone makes them spread all over
> > the place and adds an additional layer of difficulty in identification
> not
> > just for users but also for developers who want to find such operators
> and
> > improve them. This of this like a folder level annotation where
> everything
> > under this folder is unstable or evolving.
> >
> > Thanks
> >
> >
> > >
> > > On Fri, May 27, 2016 at 12:35 PM, David Yan 
> > wrote:
> > >
> > > > >
> > > > > >
> > > > > > >
> > > > > > > Malhar in its current state, has way too many operators that
> fall
> > > in
> > > > > the
> > > > > > > "non-production quality" category. We should make it obvious to
> > > users
> > > > > > that
> > > > > > > which operators are up to par, and which operators are not, and
> > > maybe
> > > > > > even
> > > > > > > remove those that are likely not ever used in a real use case.
> > > > > > >
> > > > > >
> > > > > > I am ambivalent about revisiting older operators and doing this
> > > > exercise
> > > > > as
> > > > > > this can cause unnecessary tensions. My original intent is for
> > > > > > contributions going forward.
> > > > > >
> > > > > >
> > > > > IMO it is important to address this as well. Operators outside the
> > play
> > > > > area should be of well known quality.
> > > > >
> > > > >
> > > > I think this is important, and I don't anticipate much tension if we
> > > > establish clear criteria.
> > > > It's not helpful if we let the old subpar operators stay and put up
> the
> > > > bars for new operators.
> > > >
> > > > David
> > > >
> > >
> >
>


Re: A proposal for Malhar

2016-05-27 Thread Amol Kekre
A separate directory conveys "new code; has not cleared all checklist items"
much more clearly and strongly than marking code as evolving within the same
directory. There are examples, like PiggyBank, that achieved similar results.
As Hitesh pointed out, licensing compliance will need to be checked.

Thks,
Amol


On Fri, May 27, 2016 at 2:10 PM, Pramod Immaneni 
wrote:

> On Fri, May 27, 2016 at 1:05 PM, Gaurav Gupta 
> wrote:
>
> > Can same goal not be achieved by
> > using org.apache.hadoop.classification.InterfaceStability.Evolving /
> > org.apache.hadoop.classification.InterfaceStability.Unstable annotation?
> >
>
> I think it is important to localize the additions in one place so that it
> becomes clearer to users about the maturity level of these, easier for
> developers to track them towards the path to maturity and also provides a
> clearer directive for committers and contributors on acceptance of new
> submissions. Relying on the annotations alone makes them spread all over
> the place and adds an additional layer of difficulty in identification not
> just for users but also for developers who want to find such operators and
> improve them. This of this like a folder level annotation where everything
> under this folder is unstable or evolving.
>
> Thanks
>
>
> >
> > On Fri, May 27, 2016 at 12:35 PM, David Yan 
> wrote:
> >
> > > >
> > > > >
> > > > > >
> > > > > > Malhar in its current state, has way too many operators that fall
> > in
> > > > the
> > > > > > "non-production quality" category. We should make it obvious to
> > users
> > > > > that
> > > > > > which operators are up to par, and which operators are not, and
> > maybe
> > > > > even
> > > > > > remove those that are likely not ever used in a real use case.
> > > > > >
> > > > >
> > > > > I am ambivalent about revisiting older operators and doing this
> > > exercise
> > > > as
> > > > > this can cause unnecessary tensions. My original intent is for
> > > > > contributions going forward.
> > > > >
> > > > >
> > > > IMO it is important to address this as well. Operators outside the
> play
> > > > area should be of well known quality.
> > > >
> > > >
> > > I think this is important, and I don't anticipate much tension if we
> > > establish clear criteria.
> > > It's not helpful if we let the old subpar operators stay and put up the
> > > bars for new operators.
> > >
> > > David
> > >
> >
>


Re: A proposal for Malhar

2016-05-27 Thread Gaurav Gupta
Pramod,

By that logic I would say let's put all partitionable operators into one
folder, non-partitionable operators in another and so on...
When I look at Hadoop code, I see these annotations being used at class
level and not at package/folder level.

Thanks

On Fri, May 27, 2016 at 2:10 PM, Pramod Immaneni 
wrote:

> On Fri, May 27, 2016 at 1:05 PM, Gaurav Gupta 
> wrote:
>
> > Can same goal not be achieved by
> > using org.apache.hadoop.classification.InterfaceStability.Evolving /
> > org.apache.hadoop.classification.InterfaceStability.Unstable annotation?
> >
>
> I think it is important to localize the additions in one place so that it
> becomes clearer to users about the maturity level of these, easier for
> developers to track them towards the path to maturity and also provides a
> clearer directive for committers and contributors on acceptance of new
> submissions. Relying on the annotations alone makes them spread all over
> the place and adds an additional layer of difficulty in identification not
> just for users but also for developers who want to find such operators and
> improve them. This of this like a folder level annotation where everything
> under this folder is unstable or evolving.
>
> Thanks
>
>
> >
> > On Fri, May 27, 2016 at 12:35 PM, David Yan 
> wrote:
> >
> > > >
> > > > >
> > > > > >
> > > > > > Malhar in its current state, has way too many operators that fall
> > in
> > > > the
> > > > > > "non-production quality" category. We should make it obvious to
> > users
> > > > > that
> > > > > > which operators are up to par, and which operators are not, and
> > maybe
> > > > > even
> > > > > > remove those that are likely not ever used in a real use case.
> > > > > >
> > > > >
> > > > > I am ambivalent about revisiting older operators and doing this
> > > exercise
> > > > as
> > > > > this can cause unnecessary tensions. My original intent is for
> > > > > contributions going forward.
> > > > >
> > > > >
> > > > IMO it is important to address this as well. Operators outside the
> play
> > > > area should be of well known quality.
> > > >
> > > >
> > > I think this is important, and I don't anticipate much tension if we
> > > establish clear criteria.
> > > It's not helpful if we let the old subpar operators stay and put up the
> > > bars for new operators.
> > >
> > > David
> > >
> >
>


Re: A proposal for Malhar

2016-05-27 Thread Amol Kekre
I am in favor of contrib too, as long as it does not imply a "license
incompatible" issue in other Apache projects. I found the following:

https://mail-archives.apache.org/mod_mbox/pig-user/201205.mbox/%3CCAB-acjMTZUy=8skp+aqdm4rj1ordbxynvjnmsfymjsdfdvp...@mail.gmail.com%3E
https://cwiki.apache.org/confluence/display/PIG/PiggyBank
https://issues.apache.org/jira/browse/HIVE-639

Looks like contrib is viable if we all decide to adopt that name.

Thks
Amol


On Fri, May 27, 2016 at 9:23 AM, Pramod Immaneni 
wrote:

> I like the idea of using contrib for this end.
>
> On Fri, May 27, 2016 at 7:57 AM, Thomas Weise 
> wrote:
>
> > That's a good proposal. Sounds like an "incubator" space inside Malhar.
> >
> > Originally that was the intention behind the contrib module. But now it
> > contains many connectors that should be promoted to their own modules.
> So a
> > possible route would be to use contrib for early/evolving code and then
> > promote as it matures.
> >
> > In any case the expectations need to be documented:
> >
> > http://apex.apache.org/contributing.html
> >
> > Currently these guidelines skip everything till submitting PR, we need to
> > update them to give newcomers a better picture.
> >
> > Thomas
> >
> >
> > On Thu, May 26, 2016 at 11:25 PM, Pramod Immaneni <
> pra...@datatorrent.com>
> > wrote:
> >
> > > As you all know the continued success and growth of an open source
> > project
> > > is dependent on new members joining the project and contributing. This
> > will
> > > depend on how accessible the project is for new folks to make
> meaningful
> > > contributions. For a project like ours where the code base has been in
> > > development for years, it can be quite daunting for new members to just
> > > pick up and make contributions. We need to find ways to make it easier
> > for
> > > people to do so. Malhar, namely the operator library, is an area where
> > > people can contribute without requiring deep knowledge or expertise.
> > >
> > > We have seen operators take time to mature as evidenced by the road
> taken
> > > by some of our commonly used operators to reach production quality.
> This
> > is
> > > due to the fact that apart from the core functionality the operator is
> > > trying to implement there are many other aspects to address such as
> > > performance, idempotency, processing semantics and scalability. It
> would
> > be
> > > difficult even for folks familiar with all these aspects to get
> > everything
> > > right the first time around and produce comprehensive operators let
> alone
> > > first time contributors. At the same time operators cannot reach this
> > > maturity level unless they get used in different scenarios and get a
> good
> > > look at by different people. In maturity I am also including API
> > stability.
> > >
> > > I would like to propose creation of a space inside Malhar, a
> sub-folder,
> > > where contributions can first go in if they are not fully ready and
> when
> > > they mature can be moved out of the sub-folder into an existing module
> > or a
> > > new module of its own, the package paths can remain the same. The
> > > evaluation bar for contributions into this space would be more
> permissive
> > > than it is today, it would require the functionality the operator was
> > > developed for be working but will not necessitate that all fault
> tolerant
> > > and scalability aspects be addressed. It will also allow new operators
> > that
> > > are variations of existing operators till such time as we can determine
> > if
> > > the new functionality can be subsumed by the original operator or it
> > makes
> > > sense for the new operator to exist as a separate entity. It will be
> > up-to
> > > committers and contributors to work together and make the decisions as
> to
> > > whether the individual contributions go into this space or are ready to
> > > just go into the regular modules.
> > >
> > > What does everyone think.
> > >
> > > Thanks,
> > > Pramod
> > >
> >
>


Re: A proposal for Malhar

2016-05-27 Thread Pramod Immaneni
On Fri, May 27, 2016 at 1:05 PM, Gaurav Gupta 
wrote:

> Can same goal not be achieved by
> using org.apache.hadoop.classification.InterfaceStability.Evolving /
> org.apache.hadoop.classification.InterfaceStability.Unstable annotation?
>

I think it is important to localize the additions in one place so that it
becomes clearer to users about the maturity level of these, easier for
developers to track them towards the path to maturity and also provides a
clearer directive for committers and contributors on acceptance of new
submissions. Relying on the annotations alone makes them spread all over
the place and adds an additional layer of difficulty in identification not
just for users but also for developers who want to find such operators and
improve them. This of this like a folder level annotation where everything
under this folder is unstable or evolving.

Thanks


>
> On Fri, May 27, 2016 at 12:35 PM, David Yan  wrote:
>
> > >
> > > >
> > > > >
> > > > > Malhar in its current state, has way too many operators that fall
> in
> > > the
> > > > > "non-production quality" category. We should make it obvious to
> users
> > > > that
> > > > > which operators are up to par, and which operators are not, and
> maybe
> > > > even
> > > > > remove those that are likely not ever used in a real use case.
> > > > >
> > > >
> > > > I am ambivalent about revisiting older operators and doing this
> > exercise
> > > as
> > > > this can cause unnecessary tensions. My original intent is for
> > > > contributions going forward.
> > > >
> > > >
> > > IMO it is important to address this as well. Operators outside the play
> > > area should be of well known quality.
> > >
> > >
> > I think this is important, and I don't anticipate much tension if we
> > establish clear criteria.
> > It's not helpful if we let the old subpar operators stay and put up the
> > bars for new operators.
> >
> > David
> >
>


Re: A proposal for Malhar

2016-05-27 Thread Pramod Immaneni
On Fri, May 27, 2016 at 1:05 PM, Ashwin Chandra Putta <
ashwinchand...@gmail.com> wrote:

> Thanks for the proposal to drop the barrier. This will enable a lot of
> functional contribution to the code base. We can always make a given
> operator production ready as and when an opportunity arises. I would go
> even further and categorize operators into the following which require
> various degrees of graduation criteria for maturity.
>
> 1. Input Operators --> needs partitioning, fault tolerance, idempotency
> 2. Output Operators --> needs partitioning, fault tolerance, exactly once
> writes behavior
> 3. Stateless Processing Operators --> default
> 4. Stateful Processing Operators --> needs partitioning, fault tolerance
>
> Also, we should have a common goal as a community to keep maturing
> operators.
>
> Regards,
> Ashwin.
>

Useful input; please present it when we have deeper discussions on the
maturity guidelines.

Thanks


>
> On Fri, May 27, 2016 at 9:59 AM, Sasha Parfenov <sas...@apache.org> wrote:
>
> > +1
> >
> > I think this is a great idea to lower the barrier for entry and encourage
> > more contributions for Apex.  And checklist for each operator is very
> > valuable for end-users who do not want to spend time going through every
> > line of code, to provide better understanding and confidence in various
> > operator aspects including processing semantics, partitioning,
> idempotency,
> > etc.
> >
> > Thanks,
> > Sasha
> >
> > On Fri, May 27, 2016 at 9:32 AM, Pramod Immaneni <pra...@datatorrent.com
> >
> > wrote:
> >
> > > I think this is a good idea and the committer can help in determining
> > this
> > > without putting all the burden on the contributor. One way would be to
> > list
> > > what is missing in terms of platform capabilities and under what
> > scenarios
> > > the operators cannot be used or unsure whether they will work. As you
> > > mentioned it could be informally done in a javadoc or we could
> introduce
> > > special programming language constructs to denote these.
> > >
> > > On Fri, May 27, 2016 at 8:22 AM, Ganelin, Ilya <
> > > ilya.gane...@capitalone.com>
> > > wrote:
> > >
> > > > If we're going to adopt partially completed code, I propose that
> every
> > > new
> > > > Malhar operator then contain a checklist as a comment within the
> class.
> > > >
> > > > This checklist would be defined by the community and would track the
> > > > current development state of the operator. That way there are no
> > > unexpected
> > > > surprises.
> > > >
> > > >
> > > >
> > > > 
> > > > From: Amol Kekre <a...@datatorrent.com>
> > > > Sent: Friday, May 27, 2016 10:13:06 AM
> > > > To: dev@apex.apache.org
> > > > Cc: d...@apex.incubator.apache.org
> > > > Subject: Re: A proposal for Malhar
> > > >
> > > > This is a very good idea and will greatly help contributors. The
> > > > requirements to submit code to this Malhar folder should be very
> > > minimal. A
> > > > few that come to my mind
> > > >
> > > > - Should compile
> > > > - License of the external lib (if any) should be Apache compliant
> > license
> > > >  // Need to see if this is part of ASF guidelines
> > > >
> > > > Everything else including naming, idempotency, performance, ...
> should
> > be
> > > > waived.
> > > >
> > > > Thks
> > > > Amol
> > > >
> > > >
> > > > On Thu, May 26, 2016 at 11:25 PM, Pramod Immaneni <
> > > pra...@datatorrent.com>
> > > > wrote:
> > > >
> > > > > As you all know the continued success and growth of an open source
> > > > project
> > > > > is dependent on new members joining the project and contributing.
> > This
> > > > will
> > > > > depend on how accessible the project is for new folks to make
> > > meaningful
> > > > > contributions. For a project like ours where the code base has been
> > in
> > > > > development for years, it can be quite daunting for new members to
> > just
> > > > > pick up and make contributions. We need to find ways to make it
> > easier
> > > > for

Re: A proposal for Malhar

2016-05-27 Thread David Yan
I think you misunderstood what I said.
For operators that will never be used and have no potential to improve, my
opinion is to remove them completely.
I am not against using the annotations in place of the incubator space.

David

On Fri, May 27, 2016 at 1:37 PM, Gaurav Gupta 
wrote:

> David,
>
> Why do you want to have operators is new incubator  space that will never
> be used?
> My question what is this new incubator space going to provide that can't be
> achieved by annotations?
>
> Thanks
> Gaurav
>
> On Fri, May 27, 2016 at 1:20 PM, David Yan  wrote:
>
> > Yes, we can certainly do that for operators that have the potential to be
> > up to par.
> >
> > But we know that there are also many operators that are not likely to be
> > used in a real use case and will probably not change.  Examples include
> > most operators in lib/math and lib/algo.
> >
> > It's not helpful to have them stay in the repository.
> >
> > David
> >
> > On Fri, May 27, 2016 at 1:13 PM, Gaurav Gupta 
> > wrote:
> >
> > > To add to my previous mail,
> > > Contributor can add
> > > org.apache.hadoop.classification.InterfaceStability.Evolving
> > > / org.apache.hadoop.classification.InterfaceStability.Unstable
> > annotations
> > > to operator and list of JIRAs in documentation that are being tracked
> to
> > > move the given operator to stable state...
> > >
> > >
> > > On Fri, May 27, 2016 at 1:05 PM, Gaurav Gupta <
> gaurav.gopi...@gmail.com>
> > > wrote:
> > >
> > > > Can same goal not be achieved by
> > > > using org.apache.hadoop.classification.InterfaceStability.Evolving /
> > > > org.apache.hadoop.classification.InterfaceStability.Unstable
> > annotation?
> > > >
> > > > On Fri, May 27, 2016 at 12:35 PM, David Yan 
> > > wrote:
> > > >
> > > >> >
> > > >> > >
> > > >> > > >
> > > >> > > > Malhar in its current state, has way too many operators that
> > fall
> > > in
> > > >> > the
> > > >> > > > "non-production quality" category. We should make it obvious
> to
> > > >> users
> > > >> > > that
> > > >> > > > which operators are up to par, and which operators are not,
> and
> > > >> maybe
> > > >> > > even
> > > >> > > > remove those that are likely not ever used in a real use case.
> > > >> > > >
> > > >> > >
> > > >> > > I am ambivalent about revisiting older operators and doing this
> > > >> exercise
> > > >> > as
> > > >> > > this can cause unnecessary tensions. My original intent is for
> > > >> > > contributions going forward.
> > > >> > >
> > > >> > >
> > > >> > IMO it is important to address this as well. Operators outside the
> > > play
> > > >> > area should be of well known quality.
> > > >> >
> > > >> >
> > > >> I think this is important, and I don't anticipate much tension if we
> > > >> establish clear criteria.
> > > >> It's not helpful if we let the old subpar operators stay and put up
> > the
> > > >> bars for new operators.
> > > >>
> > > >> David
> > > >>
> > > >
> > > >
> > >
> >
>


Re: A proposal for Malhar

2016-05-27 Thread Gaurav Gupta
David,

Why do you want to have operators in a new incubator space that will never
be used?
My question is: what is this new incubator space going to provide that can't
be achieved by annotations?

Thanks
Gaurav

On Fri, May 27, 2016 at 1:20 PM, David Yan  wrote:

> Yes, we can certainly do that for operators that have the potential to be
> up to par.
>
> But we know that there are also many operators that are not likely to be
> used in a real use case and will probably not change.  Examples include
> most operators in lib/math and lib/algo.
>
> It's not helpful to have them stay in the repository.
>
> David
>
> On Fri, May 27, 2016 at 1:13 PM, Gaurav Gupta 
> wrote:
>
> > To add to my previous mail,
> > Contributor can add
> > org.apache.hadoop.classification.InterfaceStability.Evolving
> > / org.apache.hadoop.classification.InterfaceStability.Unstable
> annotations
> > to operator and list of JIRAs in documentation that are being tracked to
> > move the given operator to stable state...
> >
> >
> > On Fri, May 27, 2016 at 1:05 PM, Gaurav Gupta 
> > wrote:
> >
> > > Can same goal not be achieved by
> > > using org.apache.hadoop.classification.InterfaceStability.Evolving /
> > > org.apache.hadoop.classification.InterfaceStability.Unstable
> annotation?
> > >
> > > On Fri, May 27, 2016 at 12:35 PM, David Yan 
> > wrote:
> > >
> > >> >
> > >> > >
> > >> > > >
> > >> > > > Malhar in its current state, has way too many operators that
> fall
> > in
> > >> > the
> > >> > > > "non-production quality" category. We should make it obvious to
> > >> users
> > >> > > that
> > >> > > > which operators are up to par, and which operators are not, and
> > >> maybe
> > >> > > even
> > >> > > > remove those that are likely not ever used in a real use case.
> > >> > > >
> > >> > >
> > >> > > I am ambivalent about revisiting older operators and doing this
> > >> exercise
> > >> > as
> > >> > > this can cause unnecessary tensions. My original intent is for
> > >> > > contributions going forward.
> > >> > >
> > >> > >
> > >> > IMO it is important to address this as well. Operators outside the
> > play
> > >> > area should be of well known quality.
> > >> >
> > >> >
> > >> I think this is important, and I don't anticipate much tension if we
> > >> establish clear criteria.
> > >> It's not helpful if we let the old subpar operators stay and put up
> the
> > >> bars for new operators.
> > >>
> > >> David
> > >>
> > >
> > >
> >
>


Re: A proposal for Malhar

2016-05-27 Thread David Yan
>
> >
> > >
> > > Malhar in its current state, has way too many operators that fall in
> the
> > > "non-production quality" category. We should make it obvious to users
> > that
> > > which operators are up to par, and which operators are not, and maybe
> > even
> > > remove those that are likely not ever used in a real use case.
> > >
> >
> > I am ambivalent about revisiting older operators and doing this exercise
> as
> > this can cause unnecessary tensions. My original intent is for
> > contributions going forward.
> >
> >
> IMO it is important to address this as well. Operators outside the play
> area should be of well known quality.
>
>
I think this is important, and I don't anticipate much tension if we
establish clear criteria.
It's not helpful if we let the old subpar operators stay while raising the
bar for new operators.

David


Re: A proposal for Malhar

2016-05-27 Thread Pramod Immaneni
On Fri, May 27, 2016 at 10:59 AM, Priyanka Gugale 
wrote:

> I like the idea. I am just worried about being left with unreadable code as
> time passes. As a few people already pointed out, I would suggest there
> should be a kind of checklist whose items are marked as done/not-done. Not
> done is fine; it is just better if the developer calls that out. Also, if
> they are adding any new parameters or any complicated logic, let's make sure
> it's well documented.
>

A checklist and a set of guidelines for proactively managing the code in
this space have been suggested and supported by many, and should be part of
the plan if the proposal goes forward. These should ensure that things
don't get out of hand.

Thanks


>
> -Priyanka
>
> On Fri, May 27, 2016 at 10:52 AM, Pramod Immaneni 
> wrote:
>
> > On Fri, May 27, 2016 at 10:29 AM, Chinmay Kolhatkar <
> > chin...@datatorrent.com
> > > wrote:
> >
> > > +1 for the idea of an incubating space in Malhar. This will not just
> > > ensure more contributions but also faster development of operators that
> > > can be put to use first-hand.
> > >
> > > Though, I believe we need to build a process around that, right from the
> > > point where the operator developer starts to develop until the operator
> > > matures and goes into some stable space in Malhar. But the process should
> > > be lightweight enough for it not to become a new reason for fewer
> > > contributions.
> > >
> >
> > Agreed, we should call out the steps and have well documented guidelines
> > for these. As Thomas pointed out earlier, the contribution guidelines
> > document is a good place to put these.
> >
> >
> > >
> > > Here are some thoughts around process (contributing guidelines) that I
> > > think we need to take care of for this:
> > > 1. What class address space the operators should go in. This is to
> > > ensure that there are fewer backward compatibility issues after moving
> > > to the stable space. In the best scenario, the only change needed should
> > > be the Maven dependency, and everything else should work fine.
> > > 2. As Ilya pointed out, a minimum checklist of criteria is required both
> > > for getting into the incubating space and then into the stable space.
> > > This will ensure nothing is missed and the quality is maintained.
> > > 3. IMO it's very important to decide the when, what, why and who of
> > > moving the code from incubating to stable space. This can be a grey area
> > > and the process should define that upfront. The last thing we want is to
> > > have some code in incubating that is not moved to stable space for ages.
> > > 4. We should also decide about the cases where some serious changes are
> > > required in stable code: should we move it back to incubating and go
> > > through the process again?
> > > 5. Maybe we can go for a Commit-Then-Review (CTR) model for this
> > > incubating space. It's important to keep in mind that we're enabling
> > > faster development at the cost of probably more work in the longer run.
> > > Hence suggesting the CTR model instead of the Review-Then-Commit (RTC)
> > > model.
> > > 6. We should also make it clear that though this is an incubating space,
> > > there should not be any laxity on unit testing of the operators.
> > >
> >
> > I agree with most of what you have said and some of these should be
> > included in the guidelines. I would say we need further discussions on the
> > actual mechanisms for 3. and 4. and these could start immediately after the
> > current proposal, if acceptable to the group, is put in place. Regarding 5,
> > I think we should stick with the RTC model for now and evaluate how it is
> > working before considering the alternative.
> >
> > Thanks
> >
> >
> > > Thanks,
> > > Chinmay.
> > >
> > >
> > > On Fri, May 27, 2016 at 10:24 AM, Pramod Immaneni <
> > pra...@datatorrent.com>
> > > wrote:
> > >
> > > > My comment inline
> > > >
> > > > On Fri, May 27, 2016 at 9:00 AM, David Yan 
> > > wrote:
> > > >
> > > > > This is an important change because not only will it help
> > contributors
> > > > who
> > > > > want to contribute to Apex Malhar, it also helps users on deciding
> > > which
> > > > > operators they can use.
> > > > >
> > > >
> > > > Thanks
> > > >
> > > >
> > > > >
> > > > > Malhar in its current state, has way too many operators that fall
> in
> > > the
> > > > > "non-production quality" category. We should make it obvious to
> users
> > > > that
> > > > > which operators are up to par, and which operators are not, and
> maybe
> > > > even
> > > > > remove those that are likely not ever used in a real use case.
> > > > >
> > > >
> > > > I am ambivalent about revisiting older operators and doing this
> > exercise
> > > as
> > > > this can cause unnecessary tensions. My original intent is for
> > > > contributions going forward.
> > > >
> > > > Thanks
> > > >
> > > >
> > > > > David
> > > > >
> > > > > On Fri, May 27, 2016 at 7:13 AM, Amol Kekre 
> > > > wrote:
> > > > >

Re: A proposal for Malhar

2016-05-27 Thread Pramod Immaneni
On Fri, May 27, 2016 at 10:31 AM, Thomas Weise 
wrote:

> >
> > >
> > > Malhar in its current state, has way too many operators that fall in
> the
> > > "non-production quality" category. We should make it obvious to users
> > that
> > > which operators are up to par, and which operators are not, and maybe
> > even
> > > remove those that are likely not ever used in a real use case.
> > >
> >
> > I am ambivalent about revisiting older operators and doing this exercise
> as
> > this can cause unnecessary tensions. My original intent is for
> > contributions going forward.
> >
> >
> IMO it is important to address this as well. Operators outside the play
> area should be of well known quality.
>
> Thomas
>

I would ask we take that up as a separate discussion.

Thanks


Re: A proposal for Malhar

2016-05-27 Thread Pramod Immaneni
Comments inline.

On Fri, May 27, 2016 at 7:13 AM, Amol Kekre  wrote:

> This is a very good idea and will greatly help contributors. The
> requirements to submit code to this Malhar folder should be very minimal. A
> few that come to my mind
>

Thanks


>
> - Should compile
> - License of the external lib (if any) should be Apache compliant license
>  // Need to see if this is part of ASF guidelines
>
> Everything else including naming, idempotency, performance, ... should be
> waived.
>

I would say that the core functionality the operator is developed for
should work correctly, unit tests etc. should be present and pass, and other
guidelines such as checkstyle should be followed.

Thanks

>
> Thks
> Amol
>
>
> On Thu, May 26, 2016 at 11:25 PM, Pramod Immaneni 
> wrote:
>
> > As you all know the continued success and growth of an open source
> project
> > is dependent on new members joining the project and contributing. This
> will
> > depend on how accessible the project is for new folks to make meaningful
> > contributions. For a project like ours where the code base has been in
> > development for years, it can be quite daunting for new members to just
> > pick up and make contributions. We need to find ways to make it easier
> for
> > people to do so. Malhar, namely the operator library, is an area where
> > people can contribute without requiring deep knowledge or expertise.
> >
> > We have seen operators take time to mature as evidenced by the road taken
> > by some of our commonly used operators to reach production quality. This
> is
> > due to the fact that apart from the core functionality the operator is
> > trying to implement there are many other aspects to address such as
> > performance, idempotency, processing semantics and scalability. It would
> be
> > difficult even for folks familiar with all these aspects to get
> everything
> > right the first time around and produce comprehensive operators let alone
> > first time contributors. At the same time operators cannot reach this
> > maturity level unless they get used in different scenarios and get a good
> > look at by different people. In maturity I am also including API
> stability.
> >
> > I would like to propose creation of a space inside Malhar, a sub-folder,
> > where contributions can first go in if they are not fully ready and when
> > they mature can be moved out of the sub-folder into an existing module
> or a
> > new module of its own, the package paths can remain the same. The
> > evaluation bar for contributions into this space would be more permissive
> > than it is today, it would require the functionality the operator was
> > developed for be working but will not necessitate that all fault tolerant
> > and scalability aspects be addressed. It will also allow new operators
> that
> > are variations of existing operators till such time as we can determine
> if
> > the new functionality can be subsumed by the original operator or it
> makes
> > sense for the new operator to exist as a separate entity. It will be
> up-to
> > committers and contributors to work together and make the decisions as to
> > whether the individual contributions go into this space or are ready to
> > just go into the regular modules.
> >
> > What does everyone think.
> >
> > Thanks,
> > Pramod
> >
>


Re: A proposal for Malhar

2016-05-27 Thread Pramod Immaneni
My comment inline

On Fri, May 27, 2016 at 9:00 AM, David Yan  wrote:

> This is an important change because not only will it help contributors who
> want to contribute to Apex Malhar, it also helps users on deciding which
> operators they can use.
>

Thanks


>
> Malhar in its current state, has way too many operators that fall in the
> "non-production quality" category. We should make it obvious to users that
> which operators are up to par, and which operators are not, and maybe even
> remove those that are likely not ever used in a real use case.
>

I am ambivalent about revisiting older operators and doing this exercise as
this can cause unnecessary tensions. My original intent is for
contributions going forward.

Thanks


> David
>
> On Fri, May 27, 2016 at 7:13 AM, Amol Kekre  wrote:
>
> > This is a very good idea and will greatly help contributors. The
> > requirements to submit code to this Malhar folder should be very
> minimal. A
> > few that come to my mind
> >
> > - Should compile
> > - License of the external lib (if any) should be Apache compliant license
> >  // Need to see if this is part of ASF guidelines
> >
> > Everything else including naming, idempotency, performance, ... should be
> > waived.
> >
> > Thks
> > Amol
> >
> >
> > On Thu, May 26, 2016 at 11:25 PM, Pramod Immaneni <
> pra...@datatorrent.com>
> > wrote:
> >
> > > As you all know the continued success and growth of an open source
> > project
> > > is dependent on new members joining the project and contributing. This
> > will
> > > depend on how accessible the project is for new folks to make
> meaningful
> > > contributions. For a project like ours where the code base has been in
> > > development for years, it can be quite daunting for new members to just
> > > pick up and make contributions. We need to find ways to make it easier
> > for
> > > people to do so. Malhar, namely the operator library, is an area where
> > > people can contribute without requiring deep knowledge or expertise.
> > >
> > > We have seen operators take time to mature as evidenced by the road
> taken
> > > by some of our commonly used operators to reach production quality.
> This
> > is
> > > due to the fact that apart from the core functionality the operator is
> > > trying to implement there are many other aspects to address such as
> > > performance, idempotency, processing semantics and scalability. It
> would
> > be
> > > difficult even for folks familiar with all these aspects to get
> > everything
> > > right the first time around and produce comprehensive operators let
> alone
> > > first time contributors. At the same time operators cannot reach this
> > > maturity level unless they get used in different scenarios and get a
> good
> > > look at by different people. In maturity I am also including API
> > stability.
> > >
> > > I would like to propose creation of a space inside Malhar, a
> sub-folder,
> > > where contributions can first go in if they are not fully ready and
> when
> > > they mature can be moved out of the sub-folder into an existing module
> > or a
> > > new module of its own, the package paths can remain the same. The
> > > evaluation bar for contributions into this space would be more
> permissive
> > > than it is today, it would require the functionality the operator was
> > > developed for be working but will not necessitate that all fault
> tolerant
> > > and scalability aspects be addressed. It will also allow new operators
> > that
> > > are variations of existing operators till such time as we can determine
> > if
> > > the new functionality can be subsumed by the original operator or it
> > makes
> > > sense for the new operator to exist as a separate entity. It will be
> > up-to
> > > committers and contributors to work together and make the decisions as
> to
> > > whether the individual contributions go into this space or are ready to
> > > just go into the regular modules.
> > >
> > > What does everyone think.
> > >
> > > Thanks,
> > > Pramod
> > >
> >
>


Re: A proposal for Malhar

2016-05-27 Thread Pramod Immaneni
My comments inline

On Fri, May 27, 2016 at 9:49 AM, Bhupesh Chawda <bhup...@datatorrent.com>
wrote:

> +1 for the idea of having a separate sub space (or contrib in this case)
> for new contributors to participate. Perhaps the contributor should not be
> given the impression that the task is done unless the operator becomes part
> of some mainstream project like lib.
>

When an operator gets added, with the checklist we can address what is
missing so the contributor will know. It will be up to them whether they
want to continue to improve their contribution or whether they want to
defer to someone else more familiar with the platform to take it up and
address the platform capabilities. We can also use JIRAs to track these
additional tasks that need to be done.


> Thinking from the perspective of a potential contributor, I think there is
> less motivation in just developing an operator for its sake. If I were to
> develop some operator, I would first have a usecase in mind. An app, which
> I think would need the operator in question.
> Should these operator contributions be accompanied by an example app,
> demonstrating how to use the operator and perhaps also providing some
> context of the real use case?
>

Examples would be a good way to show a potential use case or serve as a
howto for the operator but we cannot mandate it. We could definitely
recommend it.

Thanks for your input.


>
> ~ Bhupesh
> On 27-May-2016 9:32 am, "Pramod Immaneni" <pra...@datatorrent.com> wrote:
>
> > I think this is a good idea and the committer can help in determining
> this
> > without putting all the burden on the contributor. One way would be to
> list
> > what is missing in terms of platform capabilities and under what
> scenarios
> > the operators cannot be used or unsure whether they will work. As you
> > mentioned it could be informally done in a javadoc or we could introduce
> > special programming language constructs to denote these.
> >
> > On Fri, May 27, 2016 at 8:22 AM, Ganelin, Ilya <
> > ilya.gane...@capitalone.com>
> > wrote:
> >
> > > If we're going to adopt partially completed code, I propose that every
> > new
> > > Malhar operator then contain a checklist as a comment within the class.
> > >
> > > This checklist would be defined by the community and would track the
> > > current development state of the operator. That way there are no
> > unexpected
> > > surprises.
> > >
> > >
> > >
> > > Sent with Good (www.good.com)
> > > 
> > > From: Amol Kekre <a...@datatorrent.com>
> > > Sent: Friday, May 27, 2016 10:13:06 AM
> > > To: dev@apex.apache.org
> > > Cc: d...@apex.incubator.apache.org
> > > Subject: Re: A proposal for Malhar
> > >
> > > This is a very good idea and will greatly help contributors. The
> > > requirements to submit code to this Malhar folder should be very
> > minimal. A
> > > few that come to my mind
> > >
> > > - Should compile
> > > - License of the external lib (if any) should be Apache compliant
> license
> > >  // Need to see if this is part of ASF guidelines
> > >
> > > Everything else including naming, idempotency, performance, ... should
> be
> > > waived.
> > >
> > > Thks
> > > Amol
> > >
> > >
> > > On Thu, May 26, 2016 at 11:25 PM, Pramod Immaneni <
> > pra...@datatorrent.com>
> > > wrote:
> > >
> > > > As you all know the continued success and growth of an open source
> > > project
> > > > is dependent on new members joining the project and contributing.
> This
> > > will
> > > > depend on how accessible the project is for new folks to make
> > meaningful
> > > > contributions. For a project like ours where the code base has been
> in
> > > > development for years, it can be quite daunting for new members to
> just
> > > > pick up and make contributions. We need to find ways to make it
> easier
> > > for
> > > > people to do so. Malhar, namely the operator library, is an area
> where
> > > > people can contribute without requiring deep knowledge or expertise.
> > > >
> > > > We have seen operators take time to mature as evidenced by the road
> > taken
> > > > by some of our commonly used operators to reach production quality.
> > This
> > > is
> > > > due to the fact that apart from the core functionality the operator
> is
> 

Re: A proposal for Malhar

2016-05-27 Thread Pramod Immaneni
I think this is a good idea and the committer can help in determining this
without putting all the burden on the contributor. One way would be to list
what is missing in terms of platform capabilities and the scenarios in which
the operators cannot be used or in which it is unclear whether they will
work. As you mentioned, it could be done informally in a javadoc, or we could
introduce special programming language constructs to denote these.
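
As a rough illustration of the "special constructs" option, a checklist
could be carried by a Java annotation instead of a comment. The sketch below
is only that: the annotation name OperatorChecklist, its fields, the
WordCountOperator class and the JIRA id are all hypothetical and are not
part of Malhar or Apex.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.Arrays;

public class ChecklistSketch {

  // Hypothetical annotation (not an existing Apex/Malhar API). Each field is
  // one checklist item; everything defaults to "not done".
  @Retention(RetentionPolicy.RUNTIME)
  @Target(ElementType.TYPE)
  @interface OperatorChecklist {
    boolean unitTested() default false;
    boolean idempotent() default false;
    boolean faultTolerant() default false;
    String[] openJiras() default {};
  }

  // Hypothetical operator that declares its current state. The JIRA id is a
  // placeholder, not a real issue.
  @OperatorChecklist(unitTested = true, openJiras = {"EXAMPLE-1"})
  static class WordCountOperator {
  }

  // Reads the checklist via reflection so tooling or docs could report it.
  static String describe(Class<?> operatorClass) {
    OperatorChecklist c = operatorClass.getAnnotation(OperatorChecklist.class);
    if (c == null) {
      return operatorClass.getSimpleName() + ": no checklist";
    }
    return operatorClass.getSimpleName()
        + ": unitTested=" + c.unitTested()
        + ", idempotent=" + c.idempotent()
        + ", faultTolerant=" + c.faultTolerant()
        + ", openJiras=" + Arrays.toString(c.openJiras());
  }

  public static void main(String[] args) {
    System.out.println(describe(WordCountOperator.class));
  }
}
```

Compared with a javadoc checklist, an annotation like this can be read by
build tooling, at the cost of one more construct for contributors to learn.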

On Fri, May 27, 2016 at 8:22 AM, Ganelin, Ilya <ilya.gane...@capitalone.com>
wrote:

> If we're going to adopt partially completed code, I propose that every new
> Malhar operator then contain a checklist as a comment within the class.
>
> This checklist would be defined by the community and would track the
> current development state of the operator. That way there are no unexpected
> surprises.
>
>
>
> Sent with Good (www.good.com)
> 
> From: Amol Kekre <a...@datatorrent.com>
> Sent: Friday, May 27, 2016 10:13:06 AM
> To: dev@apex.apache.org
> Cc: d...@apex.incubator.apache.org
> Subject: Re: A proposal for Malhar
>
> This is a very good idea and will greatly help contributors. The
> requirements to submit code to this Malhar folder should be very minimal. A
> few that come to my mind
>
> - Should compile
> - License of the external lib (if any) should be Apache compliant license
>  // Need to see if this is part of ASF guidelines
>
> Everything else including naming, idempotency, performance, ... should be
> waived.
>
> Thks
> Amol
>
>
> On Thu, May 26, 2016 at 11:25 PM, Pramod Immaneni <pra...@datatorrent.com>
> wrote:
>
> > As you all know the continued success and growth of an open source
> project
> > is dependent on new members joining the project and contributing. This
> will
> > depend on how accessible the project is for new folks to make meaningful
> > contributions. For a project like ours where the code base has been in
> > development for years, it can be quite daunting for new members to just
> > pick up and make contributions. We need to find ways to make it easier
> for
> > people to do so. Malhar, namely the operator library, is an area where
> > people can contribute without requiring deep knowledge or expertise.
> >
> > We have seen operators take time to mature as evidenced by the road taken
> > by some of our commonly used operators to reach production quality. This
> is
> > due to the fact that apart from the core functionality the operator is
> > trying to implement there are many other aspects to address such as
> > performance, idempotency, processing semantics and scalability. It would
> be
> > difficult even for folks familiar with all these aspects to get
> everything
> > right the first time around and produce comprehensive operators let alone
> > first time contributors. At the same time operators cannot reach this
> > maturity level unless they get used in different scenarios and get a good
> > look at by different people. In maturity I am also including API
> stability.
> >
> > I would like to propose creation of a space inside Malhar, a sub-folder,
> > where contributions can first go in if they are not fully ready and when
> > they mature can be moved out of the sub-folder into an existing module
> or a
> > new module of its own, the package paths can remain the same. The
> > evaluation bar for contributions into this space would be more permissive
> > than it is today, it would require the functionality the operator was
> > developed for be working but will not necessitate that all fault tolerant
> > and scalability aspects be addressed. It will also allow new operators
> that
> > are variations of existing operators till such time as we can determine
> if
> > the new functionality can be subsumed by the original operator or it
> makes
> > sense for the new operator to exist as a separate entity. It will be
> up-to
> > committers and contributors to work together and make the decisions as to
> > whether the individual contributions go into this space or are ready to
> > just go into the regular modules.
> >
> > What does everyone think.
> >
> > Thanks,
> > Pramod
> >
> 
>
> The information contained in this e-mail is confidential and/or
> proprietary to Capital One and/or its affiliates and may only be used
> solely in performance of work or services for Capital One. The information
> transmitted herewith is intended only for use by the individual or entity
> to which it is addressed. If the reader of this message is not the intended
> recipient, you are hereby notified that any review, retransmission,
> dissemination, distribution, copying or other use of, or taking of any
> action in reliance upon this information is strictly prohibited. If you
> have received this communication in error, please contact the sender and
> delete the material from your computer.
>


RE: A proposal for Malhar

2016-05-27 Thread Ganelin, Ilya
If we're going to adopt partially completed code, I propose that every new 
Malhar operator then contain a checklist as a comment within the class.

This checklist would be defined by the community and would track the current 
development state of the operator. That way there are no unexpected surprises.



Sent with Good (www.good.com)

From: Amol Kekre <a...@datatorrent.com>
Sent: Friday, May 27, 2016 10:13:06 AM
To: dev@apex.apache.org
Cc: d...@apex.incubator.apache.org
Subject: Re: A proposal for Malhar

This is a very good idea and will greatly help contributors. The
requirements to submit code to this Malhar folder should be very minimal. A
few that come to my mind

- Should compile
- License of the external lib (if any) should be Apache compliant license
 // Need to see if this is part of ASF guidelines

Everything else including naming, idempotency, performance, ... should be
waived.

Thks
Amol


On Thu, May 26, 2016 at 11:25 PM, Pramod Immaneni <pra...@datatorrent.com>
wrote:

> As you all know the continued success and growth of an open source project
> is dependent on new members joining the project and contributing. This will
> depend on how accessible the project is for new folks to make meaningful
> contributions. For a project like ours where the code base has been in
> development for years, it can be quite daunting for new members to just
> pick up and make contributions. We need to find ways to make it easier for
> people to do so. Malhar, namely the operator library, is an area where
> people can contribute without requiring deep knowledge or expertise.
>
> We have seen operators take time to mature as evidenced by the road taken
> by some of our commonly used operators to reach production quality. This is
> due to the fact that apart from the core functionality the operator is
> trying to implement there are many other aspects to address such as
> performance, idempotency, processing semantics and scalability. It would be
> difficult even for folks familiar with all these aspects to get everything
> right the first time around and produce comprehensive operators let alone
> first time contributors. At the same time operators cannot reach this
> maturity level unless they get used in different scenarios and get a good
> look at by different people. In maturity I am also including API stability.
>
> I would like to propose creation of a space inside Malhar, a sub-folder,
> where contributions can first go in if they are not fully ready and when
> they mature can be moved out of the sub-folder into an existing module or a
> new module of its own, the package paths can remain the same. The
> evaluation bar for contributions into this space would be more permissive
> than it is today, it would require the functionality the operator was
> developed for be working but will not necessitate that all fault tolerant
> and scalability aspects be addressed. It will also allow new operators that
> are variations of existing operators till such time as we can determine if
> the new functionality can be subsumed by the original operator or it makes
> sense for the new operator to exist as a separate entity. It will be up-to
> committers and contributors to work together and make the decisions as to
> whether the individual contributions go into this space or are ready to
> just go into the regular modules.
>
> What does everyone think.
>
> Thanks,
> Pramod
>




Re: A proposal for Malhar

2016-05-27 Thread Thomas Weise
That's a good proposal. Sounds like an "incubator" space inside Malhar.

Originally that was the intention behind the contrib module. But now it
contains many connectors that should be promoted to their own modules. So a
possible route would be to use contrib for early/evolving code and then
promote as it matures.

In any case the expectations need to be documented:

http://apex.apache.org/contributing.html

Currently these guidelines skip everything up to submitting a PR; we need to
update them to give newcomers a better picture.

Thomas


On Thu, May 26, 2016 at 11:25 PM, Pramod Immaneni 
wrote:

> As you all know the continued success and growth of an open source project
> is dependent on new members joining the project and contributing. This will
> depend on how accessible the project is for new folks to make meaningful
> contributions. For a project like ours where the code base has been in
> development for years, it can be quite daunting for new members to just
> pick up and make contributions. We need to find ways to make it easier for
> people to do so. Malhar, namely the operator library, is an area where
> people can contribute without requiring deep knowledge or expertise.
>
> We have seen operators take time to mature as evidenced by the road taken
> by some of our commonly used operators to reach production quality. This is
> due to the fact that apart from the core functionality the operator is
> trying to implement there are many other aspects to address such as
> performance, idempotency, processing semantics and scalability. It would be
> difficult even for folks familiar with all these aspects to get everything
> right the first time around and produce comprehensive operators let alone
> first time contributors. At the same time operators cannot reach this
> maturity level unless they get used in different scenarios and get a good
> look at by different people. In maturity I am also including API stability.
>
> I would like to propose creation of a space inside Malhar, a sub-folder,
> where contributions can first go in if they are not fully ready and when
> they mature can be moved out of the sub-folder into an existing module or a
> new module of its own, the package paths can remain the same. The
> evaluation bar for contributions into this space would be more permissive
> than it is today, it would require the functionality the operator was
> developed for be working but will not necessitate that all fault tolerant
> and scalability aspects be addressed. It will also allow new operators that
> are variations of existing operators till such time as we can determine if
> the new functionality can be subsumed by the original operator or it makes
> sense for the new operator to exist as a separate entity. It will be up-to
> committers and contributors to work together and make the decisions as to
> whether the individual contributions go into this space or are ready to
> just go into the regular modules.
>
> What does everyone think.
>
> Thanks,
> Pramod
>


Re: A proposal for Malhar

2016-05-27 Thread Amol Kekre
This is a very good idea and will greatly help contributors. The
requirements to submit code to this Malhar folder should be very minimal. A
few that come to my mind:

- Should compile
- License of the external lib (if any) should be an Apache compliant license
  // Need to see if this is part of ASF guidelines

Everything else including naming, idempotency, performance, ... should be
waived.

Thks
Amol


On Thu, May 26, 2016 at 11:25 PM, Pramod Immaneni 
wrote:

> As you all know the continued success and growth of an open source project
> is dependent on new members joining the project and contributing. This will
> depend on how accessible the project is for new folks to make meaningful
> contributions. For a project like ours where the code base has been in
> development for years, it can be quite daunting for new members to just
> pick up and make contributions. We need to find ways to make it easier for
> people to do so. Malhar, namely the operator library, is an area where
> people can contribute without requiring deep knowledge or expertise.
>
> We have seen operators take time to mature as evidenced by the road taken
> by some of our commonly used operators to reach production quality. This is
> due to the fact that apart from the core functionality the operator is
> trying to implement there are many other aspects to address such as
> performance, idempotency, processing semantics and scalability. It would be
> difficult even for folks familiar with all these aspects to get everything
> right the first time around and produce comprehensive operators let alone
> first time contributors. At the same time operators cannot reach this
> maturity level unless they get used in different scenarios and get a good
> look at by different people. In maturity I am also including API stability.
>
> I would like to propose creation of a space inside Malhar, a sub-folder,
> where contributions can first go in if they are not fully ready and when
> they mature can be moved out of the sub-folder into an existing module or a
> new module of its own, the package paths can remain the same. The
> evaluation bar for contributions into this space would be more permissive
> than it is today, it would require the functionality the operator was
> developed for be working but will not necessitate that all fault tolerant
> and scalability aspects be addressed. It will also allow new operators that
> are variations of existing operators till such time as we can determine if
> the new functionality can be subsumed by the original operator or it makes
> sense for the new operator to exist as a separate entity. It will be up-to
> committers and contributors to work together and make the decisions as to
> whether the individual contributions go into this space or are ready to
> just go into the regular modules.
>
> What does everyone think.
>
> Thanks,
> Pramod
>