Fantastic reading. I love this kind of detailed analysis of real-life problems :).
I am myself guilty of instantiating hooks in the constructors of some of the operators :(. When I see such problems I always think, "What kind of system improvement can we make to avoid such problems in the future?" ... I thought that we might want to do some .... yes ... linting ... or, more precisely, use https://github.com/davidfraser/pyan to analyse call graphs in Airflow and detect such problems (yes, you guessed it) in a pre-commit hook. I think it should be rather easy to look at all the operators and check that none of the classes in the hooks packages are instantiated in the __init__() methods of the operators.

I am currently on vacation, so I have no time to take a serious look at it or build a POC, but maybe someone could take a look and see if we can have something like that in place :)

J.

On Sun, Aug 18, 2019 at 4:17 AM Kamil Breguła <kamil.breg...@polidea.com> wrote:

> Hi
>
> This problem also exists in the GCP operators. I noticed it a long
> time ago and I want to solve it:
> https://issues.apache.org/jira/browse/AIRFLOW-4771
> This problem limits the use of Airflow in a multitenant
> environment, because the scheduler connects to the connection table.
>
> Greets
>
> On Sat, Aug 17, 2019, 11:17 AM Bas Harenslak <basharens...@godatadriven.com>
> wrote:
>
> > Nice work! Always love reading these sorts of “bug reports from hell” and
> > the work required to find the cause.
> >
> > Also strongly agree we should standardize hooks in some way.
> >
> > Cheers,
> > Bas
> >
> > > On 16 Aug 2019, at 17:52, Shaw, Damian P. <damian.sha...@credit-suisse.com> wrote:
> > >
> > > Thanks, this is really useful to know! I often write my own
> > > Operators/Sensors/Hooks and was just looking at doing the same with the
> > > SFTPSensor and Operator.
> > >
> > > I've never formalized it, but my current pattern is the following:
> > >
> > > Hooks:
> > > Set self._conn to None in __init__, and have a property "self.conn" that
> > > checks if "self._conn" is None:
> > > * if None, create a new connection, set it to self._conn and return it
> > > * if not None, run a check to see if the connection is still alive; if it
> > >   is alive, return self._conn, otherwise create a new connection
> > >
> > > Sensors/Operators:
> > > In __init__, set self.conn_id to the conn_id string, set
> > > "self._{conn_type}_hook" to None, and have a property "self.{conn_type}_hook".
> > > In the property, check if "self._{conn_type}_hook" is None and if so create a
> > > new Hook; if not None, return "self._{conn_type}_hook".
> > >
> > > I would really appreciate any best practices here that others could
> > > share.
> > >
> > >
> > > -----Original Message-----
> > > From: James Meickle [mailto:jmeic...@quantopian.com.INVALID]
> > > Sent: Friday, August 16, 2019 11:27 AM
> > > To: dev@airflow.apache.org
> > > Subject: Outage report
> > >
> > > We had an outage last night that was rather complex and difficult to debug.
> > > Rather than just writing up the bug, I included what we did for various
> > > debug steps. Hope some folks who are also cluster maintainers may find it
> > > interesting!
> > >
> > > https://issues.apache.org/jira/browse/AIRFLOW-5238
> > >
> > > ===============================================================================
> > > Please access the attached hyperlink for an important electronic
> > > communications disclaimer:
> > > http://www.credit-suisse.com/legal/en/disclaimer_email_ib.html
> > > ===============================================================================

-- 
Jarek Potiuk
Polidea <https://www.polidea.com/> | Principal Software Engineer
M: +48 660 796 129
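
For anyone who wants to experiment with the check discussed in this thread (no hook instantiated in an operator's __init__), here is a minimal sketch. It uses Python's standard `ast` module instead of pyan, purely to keep the example self-contained, and it assumes that hook classes can be recognised by a "Hook" name suffix; the thread itself suggests checking against the classes in the hooks packages, which would be more precise. The function name `find_hooks_in_init` and the two sample classes are hypothetical, made up for illustration. The second sample follows the lazy-property pattern described in the quoted message.

```python
import ast

# Assumption for this sketch: a class counts as a hook if its name ends in "Hook".
HOOK_SUFFIX = "Hook"


def _called_name(call):
    """Return the simple name being called, e.g. 'SFTPHook' for mod.SFTPHook(...)."""
    func = call.func
    if isinstance(func, ast.Name):
        return func.id
    if isinstance(func, ast.Attribute):
        return func.attr
    return None


def find_hooks_in_init(source, filename="<string>"):
    """Return (class_name, hook_name, lineno) for each hook created inside an __init__."""
    findings = []
    tree = ast.parse(source, filename=filename)
    for cls in ast.walk(tree):
        if not isinstance(cls, ast.ClassDef):
            continue
        for item in cls.body:
            if isinstance(item, ast.FunctionDef) and item.name == "__init__":
                # Look at every call made anywhere inside the constructor body.
                for node in ast.walk(item):
                    if isinstance(node, ast.Call):
                        name = _called_name(node)
                        if name and name.endswith(HOOK_SUFFIX):
                            findings.append((cls.name, name, node.lineno))
    return findings


# Hypothetical operator that creates its hook eagerly in the constructor.
EAGER = '''
class EagerSFTPSensor:
    def __init__(self, conn_id):
        self.hook = SFTPHook(conn_id)
'''

# Hypothetical operator using the lazy-property pattern from the thread.
LAZY = '''
class LazySFTPSensor:
    def __init__(self, conn_id):
        self.conn_id = conn_id
        self._sftp_hook = None

    @property
    def sftp_hook(self):
        if self._sftp_hook is None:
            self._sftp_hook = SFTPHook(self.conn_id)
        return self._sftp_hook
'''

if __name__ == "__main__":
    for label, src in (("eager", EAGER), ("lazy", LAZY)):
        print(label, find_hooks_in_init(src))
```

In a pre-commit hook, the same function could be run over each changed operator file, failing the commit when the returned list is non-empty. A name-suffix check will miss hooks created through helper functions, which is where a call-graph tool such as pyan might do better.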