Hi,

Friendly reminder:

I created a shared Google Sheet that tracks the details of the operators under lib/algo, lib/math, and lib/streamquery. The link is
https://docs.google.com/a/datatorrent.com/spreadsheets/d/15IjMa-vYK6Wru4kZnUGIgg6sQAptDJ_CaWpXt3GDccM/edit?usp=sharing .
I also added a recommendation for each operator. Please take a look and provide comments, if any.

Thanks,
Lakshmi Prasanna

On Tue, Aug 9, 2016 at 10:07 AM, Pramod Immaneni <pra...@datatorrent.com> wrote:

> Added comments, also recommend having the misc folder for the remaining operators in contrib according to proposed guidelines
>
> https://github.com/apache/apex-site/pull/44
>
> On Mon, Aug 1, 2016 at 12:22 AM, Lakshmi Velineni <laks...@datatorrent.com> wrote:
>
> > Hi
> >
> > I also added recommendation for lib/math operators to the same document as a separate sheet. Please have a look.
> >
> > Thanks
> > Lakshmi Prasanna
> >
> > On Fri, Jul 29, 2016 at 7:19 PM, Lakshmi Velineni <laks...@datatorrent.com> wrote:
> >
> >> Hi,
> >>
> >> I also added recommendation for each operator. Please take a look.
> >>
> >> thanks
> >>
> >> On Thu, Jul 28, 2016 at 11:03 AM, Lakshmi Velineni <laks...@datatorrent.com> wrote:
> >>
> >>> Hi,
> >>>
> >>> I created a shared google sheet and tracked the various details of operators. Currently, the sheet contains information about operators under lib/algo only. Link is https://docs.google.com/a/datatorrent.com/spreadsheets/d/15IjMa-vYK6Wru4kZnUGIgg6sQAptDJ_CaWpXt3GDccM/edit?usp=sharing . Will update the sheet soon with lib/math too.
> >>>
> >>> Thanks
> >>> Lakshmi Prasanna
> >>>
> >>> On Tue, Jul 12, 2016 at 2:36 PM, David Yan <da...@datatorrent.com> wrote:
> >>>
> >>>> Hi Lakshmi,
> >>>>
> >>>> Thanks for volunteering.
> >>>>
> >>>> I think Pramod's suggestion of putting the operators into 3 buckets and Siyuan's suggestion of starting a shared Google Sheet that tracks individual operators are both good, with the exception that lib/streamquery is one unit and we probably do not need to look at individual operators under it.
> >>>>
> >>>> If we don't have any objection in the community, let's start the process.
> >>>>
> >>>> David
> >>>>
> >>>> On Tue, Jul 12, 2016 at 2:24 PM, Lakshmi Velineni <laks...@datatorrent.com> wrote:
> >>>>
> >>>>> I am interested to work on this.
> >>>>>
> >>>>> Regards,
> >>>>> Lakshmi prasanna
> >>>>>
> >>>>> On Tue, Jul 12, 2016 at 1:55 PM, hsy...@gmail.com <hsy...@gmail.com> wrote:
> >>>>>
> >>>>> > Why not have a shared google sheet with a list of operators and options that we want to do with it.
> >>>>> > I think it's case by case.
> >>>>> > But retire unused or obsolete operators is important and we should do it sooner rather than later.
> >>>>> >
> >>>>> > Regards,
> >>>>> > Siyuan
> >>>>> >
> >>>>> > On Tue, Jul 12, 2016 at 1:09 PM, Amol Kekre <a...@datatorrent.com> wrote:
> >>>>> >
> >>>>> >> My vote is to do 2&3
> >>>>> >>
> >>>>> >> Thks
> >>>>> >> Amol
> >>>>> >>
> >>>>> >> On Tue, Jul 12, 2016 at 12:14 PM, Kottapalli, Venkatesh <vkottapa...@directv.com> wrote:
> >>>>> >>
> >>>>> >>> +1 for deprecating the packages listed below.
> >>>>> >>>
> >>>>> >>> -----Original Message-----
> >>>>> >>> From: hsy...@gmail.com [mailto:hsy...@gmail.com]
> >>>>> >>> Sent: Tuesday, July 12, 2016 12:01 PM
> >>>>> >>>
> >>>>> >>> +1
> >>>>> >>>
> >>>>> >>> On Tue, Jul 12, 2016 at 11:53 AM, David Yan <da...@datatorrent.com> wrote:
> >>>>> >>>
> >>>>> >>> > Hi all,
> >>>>> >>> >
> >>>>> >>> > I would like to renew the discussion of retiring operators in Malhar.
> >>>>> >>> >
> >>>>> >>> > As stated before, the reason why we would like to retire operators in Malhar is because some of them were written a long time ago before Apache incubation, and they do not pertain to real use cases, are not up to par in code quality, have no potential for improvement, and probably completely unused by anybody.
> >>>>> >>> >
> >>>>> >>> > We do not want contributors to use them as a model of their contribution, or users to use them thinking they are of quality, and then hit a wall.
> >>>>> >>> > Both scenarios are not beneficial to the reputation of Apex.
> >>>>> >>> >
> >>>>> >>> > The initial 3 packages that we would like to target are *lib/algo*, *lib/math*, and *lib/streamquery*.
> >>>>> >>> >
> >>>>> >>> > I'm adding this thread to the users list. Please speak up if you are using any operator in these 3 packages. We would like to hear from you.
> >>>>> >>> >
> >>>>> >>> > These are the options I can think of for retiring those operators:
> >>>>> >>> >
> >>>>> >>> > 1) Completely remove them from the malhar repository.
> >>>>> >>> > 2) Move them from malhar-library into a separate artifact called malhar-misc
> >>>>> >>> > 3) Mark them deprecated and add to their javadoc that they are no longer supported
> >>>>> >>> >
> >>>>> >>> > Note that 2 and 3 are not mutually exclusive. Any thoughts?
> >>>>> >>> >
> >>>>> >>> > David
> >>>>> >>> >
> >>>>> >>> > On Tue, Jun 7, 2016 at 2:27 PM, Pramod Immaneni <pra...@datatorrent.com> wrote:
> >>>>> >>> >
> >>>>> >>> >> I wanted to close the loop on this discussion. In general everyone seemed to be favorable to this idea with no serious objections.
> >>>>> >>> >> Folks had good suggestions like documenting capabilities of operators, come up well defined criteria for graduation of operators and what those criteria may be and what to do with existing operators that may not yet be mature or unused.
> >>>>> >>> >>
> >>>>> >>> >> I am going to summarize the key points that resulted from the discussion and would like to proceed with them.
> >>>>> >>> >>
> >>>>> >>> >>    - Operators that do not yet provide the key platform capabilities to make an operator useful across different applications such as reusability, partitioning static or dynamic, idempotency, exactly once will still be accepted as long as they are functionally correct, have unit tests and will go into a separate module.
> >>>>> >>> >>    - Contrib module was suggested as a place where new contributions go in that don't yet have all the platform capabilities and are not yet mature. If there are no other suggestions we will go with this one.
> >>>>> >>> >>    - It was suggested the operators documentation list those platform capabilities it currently provides from the list above. I will document a structure for this in the contribution guidelines.
> >>>>> >>> >>    - Folks wanted to know what would be the criteria to graduate an operator to the big leagues :). I will kick-off a separate thread for it as I think it requires its own discussion and hopefully we can come up with a set of guidelines for it.
> >>>>> >>> >>    - David brought up state of some of the existing operators and their retirement and the layout of operators in Malhar in general and how it causes problems with development. I will ask him to lead the discussion on that.
> >>>>> >>> >>
> >>>>> >>> >> Thanks
> >>>>> >>> >>
> >>>>> >>> >> On Fri, May 27, 2016 at 7:47 PM, David Yan <da...@datatorrent.com> wrote:
> >>>>> >>> >>
> >>>>> >>> >> > The two ideas are not conflicting, but rather complementing.
> >>>>> >>> >> >
> >>>>> >>> >> > On the contrary, putting a new process for people trying to contribute while NOT addressing the old unused subpar operators in the repository is what is conflicting.
> >>>>> >>> >> >
> >>>>> >>> >> > Keep in mind that when people try to contribute, they always look at the existing operators already in the repository as examples and likely a model for their new operators.
> >>>>> >>> >> >
> >>>>> >>> >> > David
> >>>>> >>> >> >
> >>>>> >>> >> > On Fri, May 27, 2016 at 4:05 PM, Amol Kekre <a...@datatorrent.com> wrote:
> >>>>> >>> >> >
> >>>>> >>> >> > > Yes there are two conflicting threads now. The original thread was to open up a way for contributors to submit code in a dir (contrib?) as long as license part of taken care of.
> >>>>> >>> >> > >
> >>>>> >>> >> > > On the thread of removing non-used operators -> How do we know what is being used?
> >>>>> >>> >> > >
> >>>>> >>> >> > > Thks,
> >>>>> >>> >> > > Amol
> >>>>> >>> >> > >
> >>>>> >>> >> > > On Fri, May 27, 2016 at 3:40 PM, Sandesh Hegde <sand...@datatorrent.com> wrote:
> >>>>> >>> >> > >
> >>>>> >>> >> > > > +1 for removing the not-used operators.
> >>>>> >>> >> > > >
> >>>>> >>> >> > > > So we are creating a process for operator writers who don't want to understand the platform, yet wants to contribute? How big is that set?
> >>>>> >>> >> > > > If we tell the app-user, here is the code which has not passed all the checklist, will they be ready to use that in production?
> >>>>> >>> >> > > >
> >>>>> >>> >> > > > This thread has 2 conflicting forces, reduce the operators and make it easy to add more operators.
> >>>>> >>> >> > > >
> >>>>> >>> >> > > > On Fri, May 27, 2016 at 3:03 PM Pramod Immaneni <pra...@datatorrent.com> wrote:
> >>>>> >>> >> > > >
> >>>>> >>> >> > > > > On Fri, May 27, 2016 at 2:30 PM, Gaurav Gupta <gaurav.gopi...@gmail.com> wrote:
> >>>>> >>> >> > > > >
> >>>>> >>> >> > > > > > Pramod,
> >>>>> >>> >> > > > > >
> >>>>> >>> >> > > > > > By that logic I would say let's put all partitionable operators into one folder, non-partitionable operators in another and so on...
> >>>>> >>> >> > > > > >
> >>>>> >>> >> > > > >
> >>>>> >>> >> > > > > Remember the original goal of making it easier for new members to contribute and managing those contributions to maturity.
> >>>>> >>> >> > > > > It is not a functional level separation.
> >>>>> >>> >> > > > >
> >>>>> >>> >> > > > > > When I look at hadoop code I see these annotations being used at class level and not at package/folder level.
> >>>>> >>> >> > > > >
> >>>>> >>> >> > > > > I had a typo in my email, I meant to say "think of this like a folder..." as an analogy and not literally.
> >>>>> >>> >> > > > >
> >>>>> >>> >> > > > > Thanks
> >>>>> >>> >> > > > >
> >>>>> >>> >> > > > > > Thanks
> >>>>> >>> >> > > > > >
> >>>>> >>> >> > > > > > On Fri, May 27, 2016 at 2:10 PM, Pramod Immaneni <pra...@datatorrent.com> wrote:
> >>>>> >>> >> > > > > >
> >>>>> >>> >> > > > > > > On Fri, May 27, 2016 at 1:05 PM, Gaurav Gupta <gaurav.gopi...@gmail.com> wrote:
> >>>>> >>> >> > > > > > >
> >>>>> >>> >> > > > > > > > Can same goal not be achieved by using org.apache.hadoop.classification.InterfaceStability.Evolving / org.apache.hadoop.classification.InterfaceStability.Unstable annotation?
> >>>>> >>> >> > > > > > > >
> >>>>> >>> >> > > > > > >
> >>>>> >>> >> > > > > > > I think it is important to localize the additions in one place so that it becomes clearer to users about the maturity level of these, easier for developers to track them towards the path to maturity and also provides a clearer directive for committers and contributors on acceptance of new submissions. Relying on the annotations alone makes them spread all over the place and adds an additional layer of difficulty in identification not just for users but also for developers who want to find such operators and improve them. This of this like a folder level annotation where everything under this folder is unstable or evolving.
> >>>>> >>> >> > > > > > >
> >>>>> >>> >> > > > > > > Thanks
> >>>>> >>> >> > > > > > >
> >>>>> >>> >> > > > > > > >
> >>>>> >>> >> > > > > > > > On Fri, May 27, 2016 at 12:35 PM, David Yan <da...@datatorrent.com> wrote:
> >>>>> >>> >> > > > > > > >
> >>>>> >>> >> > > > > > > > > > > > Malhar in its current state, has way too many operators that fall in the "non-production quality" category. We should make it obvious to users that which operators are up to par, and which operators are not, and maybe even remove those that are likely not ever used in a real use case.
> >>>>> >>> >> > > > > > > > > > > >
> >>>>> >>> >> > > > > > > > > > >
> >>>>> >>> >> > > > > > > > > > > I am ambivalent about revisiting older operators and doing this exercise as this can cause unnecessary tensions. My original intent is for contributions going forward.
> >>>>> >>> >> > > > > > > > > > >
> >>>>> >>> >> > > > > > > > > >
> >>>>> >>> >> > > > > > > > > > IMO it is important to address this as well. Operators outside the play area should be of well known quality.
> >>>>> >>> >> > > > > > > > > >
> >>>>> >>> >> > > > > > > > >
> >>>>> >>> >> > > > > > > > > I think this is important, and I don't anticipate much tension if we establish clear criteria.
> >>>>> >>> >> > > > > > > > > It's not helpful if we let the old subpar operators stay and put up the bars for new operators.
> >>>>> >>> >> > > > > > > > >
> >>>>> >>> >> > > > > > > > > David