I'm very much in favor of just using Python modules for operators. I initially wrote mine to be compatible with the Airflow plugin system and eventually had to undo that, so it really is a trap for new users.
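For anyone newer to this, here is a minimal sketch of what "just a Python module" means. The file path `my_company/operators/greet.py` and the `GreetOperator` class are hypothetical names made up for illustration, and the fallback `BaseOperator` is only a stand-in so the snippet runs without Airflow installed; it is not Airflow's real base class.

```python
# Hypothetical file: my_company/operators/greet.py
# A plain module: no AirflowPlugin subclass, no registration step.
try:
    from airflow.models import BaseOperator
except ImportError:
    class BaseOperator(object):
        """Stand-in (assumption) so this sketch runs without Airflow."""

        def __init__(self, task_id, **kwargs):
            self.task_id = task_id


class GreetOperator(BaseOperator):
    """Toy operator that returns a greeting from execute()."""

    def __init__(self, recipient, **kwargs):
        # Pass task_id and any other common arguments to the base class.
        super().__init__(**kwargs)
        self.recipient = recipient

    def execute(self, context):
        return "Hello, {}!".format(self.recipient)
```

A DAG file would then pick it up with an ordinary import, assuming the package is installed or on the Python path: `from my_company.operators.greet import GreetOperator`.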
At least half a dozen times while installing, testing, or operating Airflow, we've hit some issue caused by an integration for a service we've never even used (like Hive). I would love to see all of that go away. However, we should make sure that it's not too onerous to get a fairly fully featured Airflow install, for example by having a way for external repos/packages to be discoverable.

On Tue, Sep 18, 2018 at 1:28 PM Driesprong, Fokko <fo...@driesprong.frl> wrote:

> I fully agree with using plain Python modules :)
>
> I don't think a lot of hooks/operators graduate to core since it will break
> the import. A few of them, for example Databricks and the Google hooks, are
> mature enough. For me the main point is having test coverage and a stable
> API.
>
> Cheers, Fokko
>
> On Tue, Sep 18, 2018 at 18:30, Victor Noagbodji <vnoagbo...@amplify-analytics.com> wrote:
>
> > yes, please!
> >
> > > On Sep 18, 2018, at 12:23 PM, Maxime Beauchemin <maximebeauche...@gmail.com> wrote:
> > >
> > > +1 for deprecating operators/hooks as plugins, let's use Python's good old
> > > python packages and maybe python "entry points" if we want to inject them
> > > in "airflow.operators"/"airflow.hooks" (which is probably not necessary)
> > >
> > > On Tue, Sep 18, 2018 at 2:12 AM Ash Berlin-Taylor <a...@apache.org> wrote:
> > >
> > >> Operators and hooks don't need any special plugin system - simply having
> > >> them as separate Python modules which are imported using normal python
> > >> semantics is enough.
> > >>
> > >> In fact now that I think about it: I want to deprecate the plugins
> > >> registering hooks/operators etc and limit it to only bits which a simple
> > >> python import can't manage - which I think is only anything that needs to
> > >> be registered with another system, such as custom routes in the web UI.
> > >>
> > >> I'll draft an AIP for this soon.
> > >>
> > >> -ash
> > >>
> > >>> On 18 Sep 2018, at 00:50, George Leslie-Waksman <waks...@gmail.com> wrote:
> > >>>
> > >>> Given we have a plugin system, could we alternatively move away from
> > >>> keeping non-core supported code outside of the core project/repo?
> > >>>
> > >>> It would hugely decrease the surface area of the main repository and
> > >>> testing infrastructure to get most of the contrib code out to its own
> > >>> place.
> > >>>
> > >>> Further, it would decrease the committer burden of having to approve/merge
> > >>> code that is not supposed to be their responsibility.
> > >>>
> > >>> On Mon, Sep 17, 2018 at 4:37 PM Tim Swast <sw...@google.com.invalid> wrote:
> > >>>
> > >>>>> Individual operators and hooks living in separate repositories on github
> > >>>>> (or possibly other Apache projects), which are then distributed by pip and
> > >>>>> installed as libraries seems like it would scale better.
> > >>>>
> > >>>> Pandas did this about a year ago, and it's seemed to have worked well. For
> > >>>> example, pandas.read_gbq is a very thin wrapper around pandas_gbq.read_gbq
> > >>>> (distributed as a separate package). It has made it easier for me to track
> > >>>> issues corresponding to my area of expertise.
> > >>>>
> > >>>> On Sun, Sep 16, 2018 at 1:25 PM Jakob Homan <jgho...@gmail.com> wrote:
> > >>>>
> > >>>>>> My understanding as a contributor is that if a hook/operator is in core, it
> > >>>>>> means that a committer is willing to take personal responsibility to
> > >>>>>> maintain it (or at least help maintain it), and everything else goes in
> > >>>>>> contrib.
> > >>>>>
> > >>>>> That's not correct. All of the code is owned by the entire
> > >>>>> community[1]; no one person is responsible for any of it. There's no
> > >>>>> silos, fiefdoms, walled gardens, etc. If the community cannot support
> > >>>>> a piece of code it should be deprecated and subsequently removed.
> > >>>>>
> > >>>>> Contrib sections are almost always problematic for this reason.
> > >>>>> Hadoop ended up abandoning its. Because Airflow acts as a gathering
> > >>>>> point for so many disparate technologies (databases, storage systems,
> > >>>>> compute engines, etc.), trying to keep all of them corralled and up to
> > >>>>> date will be very difficult. Individual operators and hooks living in
> > >>>>> separate repositories on github (or possibly other Apache projects),
> > >>>>> which are then distributed by pip and installed as libraries seems
> > >>>>> like it would scale better.
> > >>>>>
> > >>>>> -Jakob
> > >>>>>
> > >>>>> [1] https://blogs.apache.org/foundation/entry/success-at-apache-a-newbie
> > >>>>>
> > >>>>> On 15 September 2018 at 13:29, Jeff Payne <jpa...@bombora.com> wrote:
> > >>>>>> How many operators are added to contrib per month? Is it too many to
> > >>>>>> make the decision case by case? If so, then the above mentioned rule sounds
> > >>>>>> fairly reasonable. However, if that's the rule, shouldn't a bunch of
> > >>>>>> existing modules be moved from contrib to core?
> > >>>>>>
> > >>>>>> ________________________________
> > >>>>>> From: Taylor Edmiston <tedmis...@gmail.com>
> > >>>>>> Sent: Saturday, September 15, 2018 1:13:47 PM
> > >>>>>> To: dev@airflow.incubator.apache.org
> > >>>>>> Subject: Re: Guidelines on Contrib vs Non-contrib
> > >>>>>>
> > >>>>>> My understanding as a contributor is that if a hook/operator is in core, it
> > >>>>>> means that a committer is willing to take personal responsibility to
> > >>>>>> maintain it (or at least help maintain it), and everything else goes in
> > >>>>>> contrib.
> > >>>>>>
> > >>>>>> Taylor Edmiston
> > >>>>>> Blog <https://blog.tedmiston.com/> | LinkedIn
> > >>>>>> <https://www.linkedin.com/in/tedmiston/> | Stack Overflow
> > >>>>>> <https://stackoverflow.com/users/149428/taylor-edmiston> | Developer Story
> > >>>>>> <https://stackoverflow.com/story/taylor>
> > >>>>>>
> > >>>>>> On Sat, Sep 15, 2018 at 2:02 PM Kaxil Naik <kaxiln...@gmail.com> wrote:
> > >>>>>>
> > >>>>>>> Hi, all (mainly contributors),
> > >>>>>>>
> > >>>>>>> Can we decide on a common guideline on when a hook/operator should go under
> > >>>>>>> contrib vs core?
> > >>>>>>>
> > >>>>>>> Regards,
> > >>>>>>>
> > >>>>>>> Kaxil Naik
> > >>>>>>> Big Data Consultant @ Data Reply UK
> > >>>>>>> Certified Google Cloud Data Engineer | Certified Apache Spark & Neo4j
> > >>>>>>> Developer
> > >>>>>>> Phone: +44 (0) 74820 88992
> > >>>>>>> LinkedIn: https://www.linkedin.com/in/kaxil
> > >>>>
> > >>>> --
> > >>>> Tim Swast
> > >>>> Software Friendliness Engineer
> > >>>> Google Cloud Developer Relations
> > >>>> Seattle, WA, USA