Hi,

I also added recommendations for the lib/math operators to the same document, as a separate sheet. Please have a look.

Thanks
Lakshmi Prasanna

On Fri, Jul 29, 2016 at 7:19 PM, Lakshmi Velineni <laks...@datatorrent.com> wrote:

> Hi,
>
> I also added a recommendation for each operator. Please take a look.
>
> thanks
>
> On Thu, Jul 28, 2016 at 11:03 AM, Lakshmi Velineni <laks...@datatorrent.com> wrote:
>
>> Hi,
>>
>> I created a shared google sheet and tracked the various details of
>> operators. Currently, the sheet contains information about operators under
>> lib/algo only. Link is
>> https://docs.google.com/a/datatorrent.com/spreadsheets/d/15IjMa-vYK6Wru4kZnUGIgg6sQAptDJ_CaWpXt3GDccM/edit?usp=sharing
>> .
>> Will update the sheet soon with lib/math too.
>>
>> Thanks
>> Lakshmi Prasanna
>>
>> On Tue, Jul 12, 2016 at 2:36 PM, David Yan <da...@datatorrent.com> wrote:
>>
>>> Hi Lakshmi,
>>>
>>> Thanks for volunteering.
>>>
>>> I think Pramod's suggestion of putting the operators into 3 buckets and
>>> Siyuan's suggestion of starting a shared Google Sheet that tracks
>>> individual operators are both good, with the exception that lib/streamquery
>>> is one unit and we probably do not need to look at individual operators
>>> under it.
>>>
>>> If we don't have any objection in the community, let's start the process.
>>>
>>> David
>>>
>>> On Tue, Jul 12, 2016 at 2:24 PM, Lakshmi Velineni <laks...@datatorrent.com> wrote:
>>>
>>>> I am interested to work on this.
>>>>
>>>> Regards,
>>>> Lakshmi prasanna
>>>>
>>>> On Tue, Jul 12, 2016 at 1:55 PM, hsy...@gmail.com <hsy...@gmail.com> wrote:
>>>>
>>>> > Why not have a shared google sheet with a list of operators and options
>>>> > that we want to do with it?
>>>> > I think it's case by case.
>>>> > But retiring unused or obsolete operators is important and we should do it
>>>> > sooner rather than later.
>>>> >
>>>> > Regards,
>>>> > Siyuan
>>>> >
>>>> > On Tue, Jul 12, 2016 at 1:09 PM, Amol Kekre <a...@datatorrent.com> wrote:
>>>> >
>>>> >>
>>>> >> My vote is to do 2&3
>>>> >>
>>>> >> Thks
>>>> >> Amol
>>>> >>
>>>> >>
>>>> >> On Tue, Jul 12, 2016 at 12:14 PM, Kottapalli, Venkatesh <vkottapa...@directv.com> wrote:
>>>> >>
>>>> >>> +1 for deprecating the packages listed below.
>>>> >>>
>>>> >>> -----Original Message-----
>>>> >>> From: hsy...@gmail.com [mailto:hsy...@gmail.com]
>>>> >>> Sent: Tuesday, July 12, 2016 12:01 PM
>>>> >>>
>>>> >>> +1
>>>> >>>
>>>> >>> On Tue, Jul 12, 2016 at 11:53 AM, David Yan <da...@datatorrent.com> wrote:
>>>> >>>
>>>> >>> > Hi all,
>>>> >>> >
>>>> >>> > I would like to renew the discussion of retiring operators in Malhar.
>>>> >>> >
>>>> >>> > As stated before, the reason why we would like to retire operators in
>>>> >>> > Malhar is because some of them were written a long time ago before
>>>> >>> > Apache incubation, and they do not pertain to real use cases, are not
>>>> >>> > up to par in code quality, have no potential for improvement, and
>>>> >>> > probably completely unused by anybody.
>>>> >>> >
>>>> >>> > We do not want contributors to use them as a model of their
>>>> >>> > contribution, or users to use them thinking they are of quality, and
>>>> >>> > then hit a wall.
>>>> >>> > Both scenarios are not beneficial to the reputation of Apex.
>>>> >>> >
>>>> >>> > The initial 3 packages that we would like to target are *lib/algo*,
>>>> >>> > *lib/math*, and *lib/streamquery*.
>>>> >>> >
>>>> >>> > I'm adding this thread to the users list. Please speak up if you are
>>>> >>> > using any operator in these 3 packages. We would like to hear from you.
>>>> >>> >
>>>> >>> > These are the options I can think of for retiring those operators:
>>>> >>> >
>>>> >>> > 1) Completely remove them from the malhar repository.
>>>> >>> > 2) Move them from malhar-library into a separate artifact called
>>>> >>> > malhar-misc
>>>> >>> > 3) Mark them deprecated and add to their javadoc that they are no
>>>> >>> > longer supported
>>>> >>> >
>>>> >>> > Note that 2 and 3 are not mutually exclusive. Any thoughts?
>>>> >>> >
>>>> >>> > David
>>>> >>> >
>>>> >>> > On Tue, Jun 7, 2016 at 2:27 PM, Pramod Immaneni <pra...@datatorrent.com> wrote:
>>>> >>> >
>>>> >>> >> I wanted to close the loop on this discussion. In general everyone
>>>> >>> >> seemed to be favorable to this idea with no serious objections. Folks
>>>> >>> >> had good suggestions like documenting capabilities of operators, coming
>>>> >>> >> up with well-defined criteria for graduation of operators and what those
>>>> >>> >> criteria may be, and what to do with existing operators that may not
>>>> >>> >> yet be mature or unused.
>>>> >>> >>
>>>> >>> >> I am going to summarize the key points that resulted from the
>>>> >>> >> discussion and would like to proceed with them.
>>>> >>> >>
>>>> >>> >> - Operators that do not yet provide the key platform capabilities to
>>>> >>> >>   make an operator useful across different applications such as
>>>> >>> >>   reusability, partitioning static or dynamic, idempotency, exactly once
>>>> >>> >>   will still be accepted as long as they are functionally correct, have
>>>> >>> >>   unit tests and will go into a separate module.
>>>> >>> >> - Contrib module was suggested as a place where new contributions go in
>>>> >>> >>   that don't yet have all the platform capabilities and are not yet
>>>> >>> >>   mature. If there are no other suggestions we will go with this one.
>>>> >>> >> - It was suggested the operators documentation list those platform
>>>> >>> >>   capabilities it currently provides from the list above.
>>>> >>> >>   I will document a structure for this in the contribution guidelines.
>>>> >>> >> - Folks wanted to know what would be the criteria to graduate an
>>>> >>> >>   operator to the big leagues :). I will kick-off a separate thread
>>>> >>> >>   for it as I think it requires its own discussion and hopefully we can
>>>> >>> >>   come up with a set of guidelines for it.
>>>> >>> >> - David brought up state of some of the existing operators and their
>>>> >>> >>   retirement and the layout of operators in Malhar in general and how it
>>>> >>> >>   causes problems with development. I will ask him to lead the
>>>> >>> >>   discussion on that.
>>>> >>> >>
>>>> >>> >> Thanks
>>>> >>> >>
>>>> >>> >> On Fri, May 27, 2016 at 7:47 PM, David Yan <da...@datatorrent.com> wrote:
>>>> >>> >>
>>>> >>> >> > The two ideas are not conflicting, but rather complementing.
>>>> >>> >> >
>>>> >>> >> > On the contrary, putting a new process for people trying to
>>>> >>> >> > contribute while NOT addressing the old unused subpar operators in
>>>> >>> >> > the repository is what is conflicting.
>>>> >>> >> >
>>>> >>> >> > Keep in mind that when people try to contribute, they always look
>>>> >>> >> > at the existing operators already in the repository as examples and
>>>> >>> >> > likely a model for their new operators.
>>>> >>> >> >
>>>> >>> >> > David
>>>> >>> >> >
>>>> >>> >> >
>>>> >>> >> > On Fri, May 27, 2016 at 4:05 PM, Amol Kekre <a...@datatorrent.com> wrote:
>>>> >>> >> >
>>>> >>> >> > > Yes there are two conflicting threads now. The original thread
>>>> >>> >> > > was to open up a way for contributors to submit code in a dir
>>>> >>> >> > > (contrib?) as long as license part is taken care of.
>>>> >>> >> > >
>>>> >>> >> > > On the thread of removing non-used operators -> How do we know
>>>> >>> >> > > what is being used?
>>>> >>> >> > >
>>>> >>> >> > > Thks,
>>>> >>> >> > > Amol
>>>> >>> >> > >
>>>> >>> >> > >
>>>> >>> >> > > On Fri, May 27, 2016 at 3:40 PM, Sandesh Hegde <sand...@datatorrent.com> wrote:
>>>> >>> >> > >
>>>> >>> >> > > > +1 for removing the not-used operators.
>>>> >>> >> > > >
>>>> >>> >> > > > So we are creating a process for operator writers who don't
>>>> >>> >> > > > want to understand the platform, yet want to contribute? How
>>>> >>> >> > > > big is that set?
>>>> >>> >> > > > If we tell the app-user, here is the code which has not passed
>>>> >>> >> > > > all the checklist, will they be ready to use that in production?
>>>> >>> >> > > >
>>>> >>> >> > > > This thread has 2 conflicting forces, reduce the operators and
>>>> >>> >> > > > make it easy to add more operators.
>>>> >>> >> > > >
>>>> >>> >> > > >
>>>> >>> >> > > >
>>>> >>> >> > > > On Fri, May 27, 2016 at 3:03 PM Pramod Immaneni <pra...@datatorrent.com> wrote:
>>>> >>> >> > > >
>>>> >>> >> > > > > On Fri, May 27, 2016 at 2:30 PM, Gaurav Gupta <gaurav.gopi...@gmail.com> wrote:
>>>> >>> >> > > > >
>>>> >>> >> > > > > > Pramod,
>>>> >>> >> > > > > >
>>>> >>> >> > > > > > By that logic I would say let's put all partitionable
>>>> >>> >> > > > > > operators into one folder, non-partitionable operators in
>>>> >>> >> > > > > > another and so on...
>>>> >>> >> > > > > >
>>>> >>> >> > > > >
>>>> >>> >> > > > > Remember the original goal of making it easier for new
>>>> >>> >> > > > > members to contribute and managing those contributions to
>>>> >>> >> > > > > maturity.
>>>> >>> >> > > > > It is not a functional level separation.
>>>> >>> >> > > > >
>>>> >>> >> > > > >
>>>> >>> >> > > > > > When I look at hadoop code I see these annotations being
>>>> >>> >> > > > > > used at class level and not at package/folder level.
>>>> >>> >> > > > >
>>>> >>> >> > > > >
>>>> >>> >> > > > > I had a typo in my email, I meant to say "think of this like
>>>> >>> >> > > > > a folder..." as an analogy and not literally.
>>>> >>> >> > > > >
>>>> >>> >> > > > > Thanks
>>>> >>> >> > > > >
>>>> >>> >> > > > >
>>>> >>> >> > > > > > Thanks
>>>> >>> >> > > > > >
>>>> >>> >> > > > > > On Fri, May 27, 2016 at 2:10 PM, Pramod Immaneni <pra...@datatorrent.com> wrote:
>>>> >>> >> > > > > >
>>>> >>> >> > > > > > > On Fri, May 27, 2016 at 1:05 PM, Gaurav Gupta <gaurav.gopi...@gmail.com> wrote:
>>>> >>> >> > > > > > >
>>>> >>> >> > > > > > > > Can same goal not be achieved by using
>>>> >>> >> > > > > > > > org.apache.hadoop.classification.InterfaceStability.Evolving /
>>>> >>> >> > > > > > > > org.apache.hadoop.classification.InterfaceStability.Unstable
>>>> >>> >> > > > > > > > annotation?
>>>> >>> >> > > > > > > >
>>>> >>> >> > > > > > >
>>>> >>> >> > > > > > > I think it is important to localize the additions in one
>>>> >>> >> > > > > > > place so that it becomes clearer to users about the maturity
>>>> >>> >> > > > > > > level of these, easier for developers to track them towards
>>>> >>> >> > > > > > > the path to maturity and also provides a clearer directive
>>>> >>> >> > > > > > > for committers and contributors on acceptance of new
>>>> >>> >> > > > > > > submissions. Relying on the annotations alone makes them
>>>> >>> >> > > > > > > spread all over the place and adds an additional layer of
>>>> >>> >> > > > > > > difficulty in identification not just for users but also for
>>>> >>> >> > > > > > > developers who want to find such operators and improve them.
>>>> >>> >> > > > > > > This of this like a folder level annotation where everything
>>>> >>> >> > > > > > > under this folder is unstable or evolving.
>>>> >>> >> > > > > > >
>>>> >>> >> > > > > > > Thanks
>>>> >>> >> > > > > > >
>>>> >>> >> > > > > > >
>>>> >>> >> > > > > > > >
>>>> >>> >> > > > > > > > On Fri, May 27, 2016 at 12:35 PM, David Yan <da...@datatorrent.com> wrote:
>>>> >>> >> > > > > > > >
>>>> >>> >> > > > > > > > > >
>>>> >>> >> > > > > > > > > > >
>>>> >>> >> > > > > > > > > > > >
>>>> >>> >> > > > > > > > > > > > Malhar in its current state, has way too many operators
>>>> >>> >> > > > > > > > > > > > that fall in the "non-production quality" category. We
>>>> >>> >> > > > > > > > > > > > should make it obvious to users that which operators are
>>>> >>> >> > > > > > > > > > > > up to par, and which operators are not, and maybe even
>>>> >>> >> > > > > > > > > > > > remove those that are likely not ever used in a real use
>>>> >>> >> > > > > > > > > > > > case.
>>>> >>> >> > > > > > > > > > > >
>>>> >>> >> > > > > > > > > > >
>>>> >>> >> > > > > > > > > > > I am ambivalent about revisiting older operators and
>>>> >>> >> > > > > > > > > > > doing this exercise as this can cause unnecessary
>>>> >>> >> > > > > > > > > > > tensions. My original intent is for contributions going
>>>> >>> >> > > > > > > > > > > forward.
>>>> >>> >> > > > > > > > > > >
>>>> >>> >> > > > > > > > > > >
>>>> >>> >> > > > > > > > > > IMO it is important to address this as well.
>>>> >>> >> > > > > > > > > > Operators outside the play area should be of well known
>>>> >>> >> > > > > > > > > > quality.
>>>> >>> >> > > > > > > > > >
>>>> >>> >> > > > > > > > > >
>>>> >>> >> > > > > > > > > I think this is important, and I don't anticipate much
>>>> >>> >> > > > > > > > > tension if we establish clear criteria.
>>>> >>> >> > > > > > > > > It's not helpful if we let the old subpar operators stay
>>>> >>> >> > > > > > > > > and put up the bars for new operators.
>>>> >>> >> > > > > > > > >
>>>> >>> >> > > > > > > > > David