+1 binding On Sat, Jul 25, 2020 at 12:05 AM Kevin Yang <yrql...@gmail.com> wrote:
> Thank you everyone! We'll keep the vote open until 2020-07-27T00:00Z. > > On Thu, Jul 23, 2020 at 4:16 AM Tomasz Urbaszek <turbas...@apache.org> > wrote: > > > +1 binding > > > > On Wed, Jul 22, 2020 at 4:57 PM Arthur Wiedmer <arthur.wied...@gmail.com > > > > wrote: > > > > > +1 (binding) > > > > > > On Tue, Jul 21, 2020 at 11:12 PM Kevin Yang <yrql...@gmail.com> wrote: > > > > > > > Thank you everyone for the feedback! I think the PR covered what is > > being > > > > proposed in the AIP and does not have big controversial change that > > > cannot > > > > be revised easily. The feature itself has also been baked in Airbnb's > > > large > > > > Airflow setup for 6+ months. So consider this my binding +1. > > > > > > > > I would love to call for more votes from the community, esp. from the > > > > committers. As mentioned already above by Vikram, Kaxil and Jarek, > this > > > > change would lay a good foundation for further work on > > onboarding/extend > > > to > > > > other operators. > > > > > > > > Release wise, what would be the implication of planning it for 2.1? > If > > > that > > > > doesn't mean we block the PR from merging until 2.0 is ready then I'm > > > > totally fine. > > > > > > > > > > > > Cheers, > > > > Kevin Y > > > > > > > > On Mon, Jul 20, 2020 at 9:24 PM Yingbo Wang <ybw...@gmail.com> > wrote: > > > > > > > > > It would be nice to include DAG serialization in the smart sensor > > > however > > > > > we may want to put it in a follow up PR. Cong Zhu has explained why > > it > > > > > would need more investigation to put the DAG serialization in a > > > separate > > > > PR > > > > > here > > > https://github.com/apache/airflow/pull/5499#issuecomment-657828649. > > > > > > > > > > In my opinion, the current PR is already too big and it is becoming > > > > harder > > > > > and harder for us to make the branch tracking with the master > branch. > > > At > > > > > the moment, I think it makes more sense to merge this change with > the > > > > > master before we make additional incremental improvement. > > > > > > > > > > Yingbo > > > > > > > > > > > > > > > On Sun, Jun 21, 2020 at 1:57 PM Jarek Potiuk < > > jarek.pot...@polidea.com > > > > > > > > > wrote: > > > > > > > > > > > I also llke the idea - it is really nice optimisation. As Kaxil > > > > > mentioned - > > > > > > it would be great to include Dag Serialisation. > > > > > > > > > > > > I think for some of the other operators that just send query and > > poll > > > > > > execution - I think once this one is implemented for sensors, we > > can > > > > > think > > > > > > about splitting some of the "operators" to support it. There was > > > > already > > > > > a > > > > > > PokeReschedule discussion > > > > > > > > > > > > > > > > > > > > > > > > > > > https://lists.apache.org/thread.html/rc6f56234342c87f154865489e3a6555609e4b98a8c62ca4997cb6a6c%40%3Cdev.airflow.apache.org%3E > > > > > > which > > > > > > i think might benefit from this architecture - but I'd say - > let's > > > > > > implement it for sensors first and then we can think about > > > implementing > > > > > it > > > > > > also for operators. > > > > > > > > > > > > One thing though - is it something that we could plan for 2.1 ? I > > > think > > > > > our > > > > > > 2.0 release gets some important features that focus on delivering > > > most > > > > > > value for Airflow users. And it keeps on slipping. I think > > > implementing > > > > > > such a change will keep some of the committers distracted (even > if > > it > > > > is > > > > > > already at a PR stage). Maybe we can start prioritising new > > features > > > to > > > > > 2.1 > > > > > > by default unless we have very good reason to implement it in > 2.0? > > > > > > > > > > > > J. > > > > > > > > > > > > > > > > > > > > > > > > On Sat, Jun 20, 2020 at 5:37 PM Kaxil Naik <kaxiln...@gmail.com> > > > > wrote: > > > > > > > > > > > > > Hi Yingbo, > > > > > > > > > > > > > > I like the concept described in the AIP. I was wondering if we > > > could > > > > > > > leverage Dag Serialization ( > > > > > > > > https://airflow.readthedocs.io/en/latest/dag-serialization.html) > > > to > > > > > get > > > > > > > the > > > > > > > "task level fields" without re-parsing the DAGs or storing it > in > > > the > > > > > new > > > > > > > table. > > > > > > > > > > > > > > And can we use some of the operators like the BigQueryOperator, > > > > > > > SparkOperator which just submits the SQL query and polls until > > > > > > completion? > > > > > > > > > > > > > > Regards, > > > > > > > Kaxil > > > > > > > > > > > > > > On Sat, Jun 20, 2020 at 8:26 AM Yingbo Wang <ybw...@gmail.com> > > > > wrote: > > > > > > > > > > > > > > > Thanks everyone for the feedback. I will also add the details > > > > > mentioned > > > > > > > in > > > > > > > > this thread into the AIP > > > > > > > > > > > > > > > > > > > > > > > > Q: From an implementation perspective, my one area of concern > > is > > > > the > > > > > > > > > > > > > > > > "sharding" concept and the configuration / management > overhead > > > > > > involved. > > > > > > > I > > > > > > > > > > > > > > > > may have missed it in the AIP, but would it be possible to > add > > > > > > > auto-scaling > > > > > > > > > > > > > > > > to minimize this configuration? > > > > > > > > > > > > > > > > The “sharding” configuration is an integer which implies the > > > number > > > > > of > > > > > > > > concurrently running smart sensor jobs for the whole airflow > > > > > cluster. A > > > > > > > > proper sharding setting mainly depends on the following > issues: > > > 1. > > > > > > > Cluster > > > > > > > > load -- how many sensor tasks need to be executed at the same > > > time. > > > > > 2. > > > > > > > How > > > > > > > > often should each sensor be poked at least once. 3. The > > response > > > > time > > > > > > > for a > > > > > > > > sensor task in the current system. As these answers may vary > > for > > > > > > > different > > > > > > > > systems we leave “sharding” as a configurable field for users > > to > > > > > > satisfy > > > > > > > > different use cases. > > > > > > > > > > > > > > > > Also, a couple of clarifying questions: > > > > > > > > > > > > > > > > 1. Do you know if this is more suitable to certain kinds of > > > sensors > > > > > vs. > > > > > > > > > > > > > > > > Others? > > > > > > > > > > > > > > > > Most sensors should be suitable for the smart sensor. Except > if > > > the > > > > > > > > argument needed to initialize a sensor object is > > unserializable, > > > > > e.g. a > > > > > > > > function. Serialize more complex types other than builtin and > > > > > datetime > > > > > > is > > > > > > > > not supported right now but we are planning to add them in > the > > > > > future. > > > > > > > > > > > > > > > > 2. What do you think about leveraging this to enable "async" > > > > > operations > > > > > > > > > > > > > > > > using Airflow i.e. submit a task and then use a "smart > sensor" > > to > > > > > check > > > > > > > for > > > > > > > > > > > > > > > > Completion? > > > > > > > > > > > > > > > > This is a very good point. We do notice the relationship > > between > > > > > these > > > > > > > two > > > > > > > > ideas. Technically this logic should also work. The “task > > > > submission” > > > > > > map > > > > > > > > to the pre_execute() in a sensor task logic and “check for > > > > > completion” > > > > > > > map > > > > > > > > to the sensor’s poke() function. The current implementation > of > > > > > > > > SubDagOperator > > > > > > > > < > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://github.com/apache/airflow/blob/master/airflow/operators/subdag_operator.py#L144-L177 > > > > > > > > > > > > > > > > > actually follows this pattern. If the operator requires no > > > > > > unserializable > > > > > > > > argument to be instantiated, we should already be able to > > > leverage > > > > > the > > > > > > > > async operation in SmartSensor for it. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Q: How would a user enable their own smart sensors? I don’t > see > > > any > > > > > > added > > > > > > > > documentation for this. It looks like they need to manually > add > > > the > > > > > > name > > > > > > > of > > > > > > > > the class to the airflow configuration and do *something* to > > > their > > > > > > sensor > > > > > > > > class, including override the "is_smart_sensor" method (why a > > > > method > > > > > > and > > > > > > > > not an attribute?) > > > > > > > > > > > > > > > > Having to enable it in multiple places seems a little > > cumbersome, > > > > why > > > > > > not > > > > > > > > have a "BaseSmartSensor" that the user inherits from like > most > > of > > > > the > > > > > > > rest > > > > > > > > of Airflow? Sensors inherited from BaseSmartSensor would be > > > "Smart" > > > > > > when > > > > > > > > smart sensors are enabled in the configuration and not smart > > when > > > > > smart > > > > > > > > sensors are not enabled. > > > > > > > > > > > > > > > > Enabling/Disabling the smart sensor is a system level config > > > which > > > > is > > > > > > > > transparent to the individual users. An example of smart > sensor > > > > > enabled > > > > > > > > cluster config is as follows: > > > > > > > > > > > > > > > > [smart_sensor] > > > > > > > > > > > > > > > > use_smart_sensor = true > > > > > > > > > > > > > > > > shard_code_upper_limit = 10000 > > > > > > > > > > > > > > > > shards = 5 > > > > > > > > > > > > > > > > sensor_enabled = NamedHivePartitionSensor, > > > MetastorePartitionSensor > > > > > > > > > > > > > > > > > > > > > > > > The "use_smart_sensor" config indicates if the smart sensor > is > > > > > enabled. > > > > > > > The > > > > > > > > "shards" config indicates the number of concurrently running > > > smart > > > > > > sensor > > > > > > > > jobs for the airflow cluster. The "sensor_enabled" config is > a > > > list > > > > > of > > > > > > > > sensor class names that will use the smart sensor. The users > > use > > > > the > > > > > > > same > > > > > > > > class names (e.g. HivePartitionSensor) in their DAGs and they > > > don’t > > > > > > have > > > > > > > > the control to use smart sensors or not, unless they exclude > > > their > > > > > > tasks > > > > > > > > explicits. > > > > > > > > > > > > > > > > > > > > > > > > Existing DAGs don't need to be changed for enabling/disabling > > the > > > > > smart > > > > > > > > sensor. > > > > > > > > > > > > > > > > > > > > > > > > “Is_smart_sensor_compatible” is a class level configuration > > > > (instead > > > > > of > > > > > > > > instance-level) so that the system knows if a particular > sensor > > > > > > operator > > > > > > > > can use the smart sensor. Currently only > > NamedHivePartitionSensor > > > > and > > > > > > > > MetastorePartitionSensor > > > > > > > > are enabled to use the smart sensor in the PR. > > > > > > > > > > > > > > > > To include other sensor operators for smart sensors that are > > not > > > > > > included > > > > > > > > in this PR: > > > > > > > > > > > > > > > > 1. > > > > > > > > > > > > > > > > Add a class attribute "poke_context_fields" to the > operator. > > > > > > > > "poke_context_fields" include all key names used for > > > > initializing > > > > > a > > > > > > > > sensor > > > > > > > > object. > > > > > > > > 2. > > > > > > > > > > > > > > > > In airflow.cfg, add the operator’s classname to the > session > > of > > > > > > > > “[smart_sensor]” with the field “sensors_enabled” as > > follows. > > > > > > > > > > > > > > > > > > > > > > > > Yingbo > > > > > > > > > > > > > > > > On Fri, Jun 19, 2020 at 7:27 AM Shaw, Damian P. < > > > > > > > > damian.sha...@credit-suisse.com> wrote: > > > > > > > > > > > > > > > > > Also +1 (non-binding) on the AIP but questions on the > > > > > implementation. > > > > > > > > > > > > > > > > > > How would a user enable their own smart sensors? I don’t > see > > > any > > > > > > added > > > > > > > > > documentation for this. It looks like they need to manually > > add > > > > the > > > > > > > name > > > > > > > > of > > > > > > > > > the class to the airflow configuration and do *something* > to > > > > their > > > > > > > sensor > > > > > > > > > class, including override the "is_smart_sensor" method > (why a > > > > > method > > > > > > > and > > > > > > > > > not an attribute?) > > > > > > > > > > > > > > > > > > Having to enable it in multiple places seems a little > > > cumbersome, > > > > > why > > > > > > > not > > > > > > > > > have a "BaseSmartSensor" that the user inherits from like > > most > > > of > > > > > the > > > > > > > > rest > > > > > > > > > of Airflow? Sensors inherited from BaseSmartSensor would be > > > > "Smart" > > > > > > > when > > > > > > > > > smart sensors are enabled in the configuration and not > smart > > > when > > > > > > smart > > > > > > > > > sensors are not enaled. > > > > > > > > > > > > > > > > > > Damian > > > > > > > > > > > > > > > > > > -----Original Message----- > > > > > > > > > From: Vikram Koka <vik...@astronomer.io> > > > > > > > > > Sent: Friday, June 19, 2020 00:57 > > > > > > > > > To: dev@airflow.apache.org > > > > > > > > > Subject: Re: [VOTE] AIP-17: Consolidate and de-duplicate > > sensor > > > > > tasks > > > > > > > in > > > > > > > > > airflow Smart Sensor > > > > > > > > > > > > > > > > > > +1 (non-binding) for this AIP. > > > > > > > > > > > > > > > > > > I really like the concept and the efficiency improvements. > > The > > > > > > general > > > > > > > > > SmartSensor concept and the ability to add additional > sensor > > > > > classes > > > > > > is > > > > > > > > > elegant. > > > > > > > > > > > > > > > > > > From an implementation perspective, my one area of concern > is > > > the > > > > > > > > > "sharding" concept and the configuration / management > > overhead > > > > > > > involved. > > > > > > > > I > > > > > > > > > may have missed it in the AIP, but would it be possible to > > add > > > > > > > > auto-scaling > > > > > > > > > to minimize this configuration? > > > > > > > > > > > > > > > > > > Also, a couple of clarifying questions: > > > > > > > > > 1. Do you know if this is more suitable to certain kinds of > > > > sensors > > > > > > vs. > > > > > > > > > others? > > > > > > > > > 2. What do you think about leveraging this to enable > "async" > > > > > > operations > > > > > > > > > using Airflow i.e. submit a task and then use a "smart > > sensor" > > > to > > > > > > check > > > > > > > > for > > > > > > > > > completion? > > > > > > > > > > > > > > > > > > Best regards, > > > > > > > > > > > > > > > > > > Vikram > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Thu, Jun 18, 2020 at 3:38 PM Yingbo Wang < > > ybw...@gmail.com> > > > > > > wrote: > > > > > > > > > > > > > > > > > > > Hello everyone! > > > > > > > > > > > > > > > > > > > > This email calls for a vote to add the airflow smart > sensor > > > at > > > > > > > > > > https://github.com/apache/airflow/pull/5499 > > > > > > > > > > > > > > > > > > > > AIP: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-17%3A+Consolid > > > > > > > > > > ate+and+de-duplicate+sensor+tasks+in+airflow+Smart+Sensor > > > > > > > > > > > > > > > > > > > > Change summary: > > > > > > > > > > > > > > > > > > > > - Add a new mode called “smart sensor mode”. In smart > > > sensor > > > > > > mode, > > > > > > > > > > instead of holding a long running process for each > > sensor > > > > and > > > > > > > poking > > > > > > > > > > periodically, a sensor will only store poke context at > > > > > > > > sensor_instance > > > > > > > > > > table and then exits with a ‘sensing’ state. > > > > > > > > > > - When the smart sensor mode is enabled, a special set > > of > > > > > > builtin > > > > > > > > > smart > > > > > > > > > > sensor DAGs (named smart_sensor_group_shard_xxx) is > > > created > > > > by > > > > > > the > > > > > > > > > > system; > > > > > > > > > > These DAGs contain SmartSensorOperator task and manage > > the > > > > > smart > > > > > > > > > sensor > > > > > > > > > > jobs for the airflow cluster. The SmartSensorOperator > > task > > > > can > > > > > > > fetch > > > > > > > > > > hundreds of ‘sensing’ instances from sensor_instance > > table > > > > and > > > > > > > poke > > > > > > > > on > > > > > > > > > > behalf of them in batches. Users don’t need to change > > > their > > > > > > > > > > existing DAGs. > > > > > > > > > > - The smart sensor mode currently supports > > > > > > > NamedHivePartitionSensor > > > > > > > > > and > > > > > > > > > > MetastorePartitionSensor however it can easily be > > extended > > > > to > > > > > > > > > > support more > > > > > > > > > > sensor classes. > > > > > > > > > > - Smart sensor mode on/off, the list of smart sensor > > > enabled > > > > > > > > classes, > > > > > > > > > > and the number of SmartSensorOperator tasks can be > > > > configured > > > > > in > > > > > > > > > airflow > > > > > > > > > > config. > > > > > > > > > > - Sensor logs in smart sensors are populated to each > > task > > > > > > instance > > > > > > > > log > > > > > > > > > > UI. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > A PR https://github.com/apache/airflow/pull/5499 is > ready > > > for > > > > > > review > > > > > > > > > > from the committers and community. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > This email is formally calling for a vote to accept the > AIP > > > and > > > > > PR. > > > > > > > > > > Please note that we will update the PR / feature to fix > > bugs > > > if > > > > > we > > > > > > > find > > > > > > > > > any. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Best > > > > > > > > > > > > > > > > > > > > Yingbo > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > =============================================================================== > > > > > > > > > > > > > > > > > > Please access the attached hyperlink for an important > > > electronic > > > > > > > > > communications disclaimer: > > > > > > > > > > > http://www.credit-suisse.com/legal/en/disclaimer_email_ib.html > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > =============================================================================== > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > > > > > > Jarek Potiuk > > > > > > Polidea <https://www.polidea.com/> | Principal Software Engineer > > > > > > > > > > > > M: +48 660 796 129 <+48660796129> > > > > > > [image: Polidea] <https://www.polidea.com/> > > > > > > > > > > > > > > > > > > > > >