+1 (binding)

On Tue, Jul 21, 2020 at 11:12 PM Kevin Yang <yrql...@gmail.com> wrote:

> Thank you everyone for the feedback! I think the PR covered what is being
> proposed in the AIP and does not have big controversial change that cannot
> be revised easily. The feature itself has also been baked in Airbnb's large
> Airflow setup for 6+ months. So consider this my binding +1.
>
> I would love to call for more votes from the community, esp. from the
> committers. As mentioned already above by Vikram, Kaxil and Jarek, this
> change would lay a good foundation for further work on onboarding/extend to
> other operators.
>
> Release wise, what would be the implication of planning it for 2.1? If that
> doesn't mean we block the PR from merging until 2.0 is ready then I'm
> totally fine.
>
>
> Cheers,
> Kevin Y
>
> On Mon, Jul 20, 2020 at 9:24 PM Yingbo Wang <ybw...@gmail.com> wrote:
>
> > It would be nice to include DAG serialization in the smart sensor however
> > we may want to put it in a follow up PR. Cong Zhu has explained why it
> > would need more investigation to put the DAG serialization in a separate
> PR
> > here https://github.com/apache/airflow/pull/5499#issuecomment-657828649.
> >
> > In my opinion, the current PR is already too big and it is becoming
> harder
> > and harder for us to make the branch tracking with the master branch.  At
> > the moment, I think it makes more sense to merge this change with the
> > master before we make additional incremental improvement.
> >
> > Yingbo
> >
> >
> > On Sun, Jun 21, 2020 at 1:57 PM Jarek Potiuk <jarek.pot...@polidea.com>
> > wrote:
> >
> > > I also llke the idea - it is really nice optimisation. As Kaxil
> > mentioned -
> > > it would be great to include Dag Serialisation.
> > >
> > > I think for some of the other operators that just send query and poll
> > > execution - I think once this one is implemented for sensors, we can
> > think
> > > about splitting some of the "operators" to support it. There was
> already
> > a
> > > PokeReschedule discussion
> > >
> > >
> >
> https://lists.apache.org/thread.html/rc6f56234342c87f154865489e3a6555609e4b98a8c62ca4997cb6a6c%40%3Cdev.airflow.apache.org%3E
> > > which
> > > i think might benefit from this architecture - but I'd say - let's
> > > implement it for sensors first and then we can think about implementing
> > it
> > > also for operators.
> > >
> > > One thing though - is it something that we could plan for 2.1 ? I think
> > our
> > > 2.0 release gets some important features that focus on delivering most
> > > value for Airflow users. And it keeps on slipping. I think implementing
> > > such a change will keep some of the committers distracted (even if it
> is
> > > already at a PR stage). Maybe we can start prioritising new features to
> > 2.1
> > > by default unless we have very good reason to implement it in 2.0?
> > >
> > > J.
> > >
> > >
> > >
> > > On Sat, Jun 20, 2020 at 5:37 PM Kaxil Naik <kaxiln...@gmail.com>
> wrote:
> > >
> > > > Hi Yingbo,
> > > >
> > > > I like the concept described in the AIP. I was wondering if we could
> > > > leverage Dag Serialization (
> > > > https://airflow.readthedocs.io/en/latest/dag-serialization.html) to
> > get
> > > > the
> > > > "task level fields" without re-parsing the DAGs or storing it in the
> > new
> > > > table.
> > > >
> > > > And can we use some of the operators like the BigQueryOperator,
> > > > SparkOperator which just submits the SQL query and polls until
> > > completion?
> > > >
> > > > Regards,
> > > > Kaxil
> > > >
> > > > On Sat, Jun 20, 2020 at 8:26 AM Yingbo Wang <ybw...@gmail.com>
> wrote:
> > > >
> > > > > Thanks everyone for the feedback. I will also add the details
> > mentioned
> > > > in
> > > > > this thread into the AIP
> > > > >
> > > > >
> > > > > Q: From an implementation perspective, my one area of concern is
> the
> > > > >
> > > > > "sharding" concept and the configuration / management overhead
> > > involved.
> > > > I
> > > > >
> > > > > may have missed it in the AIP, but would it be possible to add
> > > > auto-scaling
> > > > >
> > > > > to minimize this configuration?
> > > > >
> > > > > The “sharding” configuration is an integer which implies the number
> > of
> > > > > concurrently running smart sensor jobs for the whole airflow
> > cluster. A
> > > > > proper sharding setting mainly depends on the following issues: 1.
> > > > Cluster
> > > > > load -- how many sensor tasks need to be executed at the same time.
> > 2.
> > > > How
> > > > > often should each sensor be poked at least once. 3. The response
> time
> > > > for a
> > > > > sensor task in the current system. As these answers may vary for
> > > > different
> > > > > systems we leave “sharding” as a configurable field for users to
> > > satisfy
> > > > > different use cases.
> > > > >
> > > > > Also, a couple of clarifying questions:
> > > > >
> > > > > 1. Do you know if this is more suitable to certain kinds of sensors
> > vs.
> > > > >
> > > > > Others?
> > > > >
> > > > > Most sensors should be suitable for the smart sensor. Except if the
> > > > > argument needed to initialize a sensor object is unserializable,
> > e.g. a
> > > > > function. Serialize more complex types other than builtin and
> > datetime
> > > is
> > > > > not supported right now but we are planning to add them in the
> > future.
> > > > >
> > > > > 2. What do you think about leveraging this to enable "async"
> > operations
> > > > >
> > > > > using Airflow i.e. submit a task and then use a "smart sensor" to
> > check
> > > > for
> > > > >
> > > > > Completion?
> > > > >
> > > > > This is a very good point. We do notice the relationship between
> > these
> > > > two
> > > > > ideas. Technically this logic should also work. The “task
> submission”
> > > map
> > > > > to the pre_execute() in a sensor task logic and “check for
> > completion”
> > > > map
> > > > > to the sensor’s poke() function. The current implementation of
> > > > > SubDagOperator
> > > > > <
> > > > >
> > > >
> > >
> >
> https://github.com/apache/airflow/blob/master/airflow/operators/subdag_operator.py#L144-L177
> > > > > >
> > > > > actually follows this pattern. If the operator requires no
> > > unserializable
> > > > > argument to be instantiated, we should already be able to leverage
> > the
> > > > > async operation in SmartSensor for it.
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > Q: How would a user enable their own smart sensors? I don’t see any
> > > added
> > > > > documentation for this. It looks like they need to manually add the
> > > name
> > > > of
> > > > > the class to the airflow configuration and do *something* to their
> > > sensor
> > > > > class, including override the "is_smart_sensor" method (why a
> method
> > > and
> > > > > not an attribute?)
> > > > >
> > > > > Having to enable it in multiple places seems a little cumbersome,
> why
> > > not
> > > > > have a "BaseSmartSensor" that the user inherits from like most of
> the
> > > > rest
> > > > > of Airflow? Sensors inherited from BaseSmartSensor would be "Smart"
> > > when
> > > > > smart sensors are enabled in the configuration and not smart when
> > smart
> > > > > sensors are not enabled.
> > > > >
> > > > > Enabling/Disabling the smart sensor is a system level config which
> is
> > > > > transparent to the individual users. An example of smart sensor
> > enabled
> > > > > cluster config is as follows:
> > > > >
> > > > > [smart_sensor]
> > > > >
> > > > > use_smart_sensor = true
> > > > >
> > > > > shard_code_upper_limit = 10000
> > > > >
> > > > > shards = 5
> > > > >
> > > > > sensor_enabled = NamedHivePartitionSensor, MetastorePartitionSensor
> > > > >
> > > > >
> > > > > The "use_smart_sensor" config indicates if the smart sensor is
> > enabled.
> > > > The
> > > > > "shards" config indicates the number of concurrently running smart
> > > sensor
> > > > > jobs for the airflow cluster. The "sensor_enabled" config is a list
> > of
> > > > > sensor class names that will use the smart sensor.  The users use
> the
> > > > same
> > > > > class names (e.g. HivePartitionSensor) in their DAGs and they don’t
> > > have
> > > > > the control to use smart sensors or not, unless they exclude their
> > > tasks
> > > > > explicits.
> > > > >
> > > > >
> > > > > Existing DAGs don't need to be changed for enabling/disabling the
> > smart
> > > > > sensor.
> > > > >
> > > > >
> > > > > “Is_smart_sensor_compatible” is a class level configuration
> (instead
> > of
> > > > > instance-level) so that the system knows if a particular sensor
> > > operator
> > > > > can use the smart sensor. Currently only NamedHivePartitionSensor
> and
> > > > > MetastorePartitionSensor
> > > > > are enabled to use the smart sensor in the PR.
> > > > >
> > > > > To include other sensor operators for smart sensors that are not
> > > included
> > > > > in this PR:
> > > > >
> > > > >    1.
> > > > >
> > > > >    Add a class attribute "poke_context_fields" to the operator.
> > > > >    "poke_context_fields" include all key names used for
> initializing
> > a
> > > > > sensor
> > > > >    object.
> > > > >    2.
> > > > >
> > > > >    In airflow.cfg, add the operator’s classname to the session of
> > > > >    “[smart_sensor]” with the field “sensors_enabled” as follows.
> > > > >
> > > > >
> > > > > Yingbo
> > > > >
> > > > > On Fri, Jun 19, 2020 at 7:27 AM Shaw, Damian P. <
> > > > > damian.sha...@credit-suisse.com> wrote:
> > > > >
> > > > > > Also +1 (non-binding) on the AIP but questions on the
> > implementation.
> > > > > >
> > > > > > How would a user enable their own smart sensors? I don’t see any
> > > added
> > > > > > documentation for this. It looks like they need to manually add
> the
> > > > name
> > > > > of
> > > > > > the class to the airflow configuration and do *something* to
> their
> > > > sensor
> > > > > > class, including override the "is_smart_sensor" method (why a
> > method
> > > > and
> > > > > > not an attribute?)
> > > > > >
> > > > > > Having to enable it in multiple places seems a little cumbersome,
> > why
> > > > not
> > > > > > have a "BaseSmartSensor" that the user inherits from like most of
> > the
> > > > > rest
> > > > > > of Airflow? Sensors inherited from BaseSmartSensor would be
> "Smart"
> > > > when
> > > > > > smart sensors are enabled in the configuration and not smart when
> > > smart
> > > > > > sensors are not enaled.
> > > > > >
> > > > > > Damian
> > > > > >
> > > > > > -----Original Message-----
> > > > > > From: Vikram Koka <vik...@astronomer.io>
> > > > > > Sent: Friday, June 19, 2020 00:57
> > > > > > To: dev@airflow.apache.org
> > > > > > Subject: Re: [VOTE] AIP-17: Consolidate and de-duplicate sensor
> > tasks
> > > > in
> > > > > > airflow Smart Sensor
> > > > > >
> > > > > > +1 (non-binding) for this AIP.
> > > > > >
> > > > > > I really like the concept and the efficiency improvements. The
> > > general
> > > > > > SmartSensor concept and the ability to add additional sensor
> > classes
> > > is
> > > > > > elegant.
> > > > > >
> > > > > > From an implementation perspective, my one area of concern is the
> > > > > > "sharding" concept and the configuration / management overhead
> > > > involved.
> > > > > I
> > > > > > may have missed it in the AIP, but would it be possible to add
> > > > > auto-scaling
> > > > > > to minimize this configuration?
> > > > > >
> > > > > > Also, a couple of clarifying questions:
> > > > > > 1. Do you know if this is more suitable to certain kinds of
> sensors
> > > vs.
> > > > > > others?
> > > > > > 2. What do you think about leveraging this to enable "async"
> > > operations
> > > > > > using Airflow i.e. submit a task and then use a "smart sensor" to
> > > check
> > > > > for
> > > > > > completion?
> > > > > >
> > > > > > Best regards,
> > > > > >
> > > > > > Vikram
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > On Thu, Jun 18, 2020 at 3:38 PM Yingbo Wang <ybw...@gmail.com>
> > > wrote:
> > > > > >
> > > > > > > Hello everyone!
> > > > > > >
> > > > > > > This email calls for a vote to add the airflow smart sensor at
> > > > > > > https://github.com/apache/airflow/pull/5499
> > > > > > >
> > > > > > > AIP:
> > > > > > >
> > > > > > >
> > > >
> https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-17%3A+Consolid
> > > > > > > ate+and+de-duplicate+sensor+tasks+in+airflow+Smart+Sensor
> > > > > > >
> > > > > > > Change summary:
> > > > > > >
> > > > > > >    - Add a new mode called “smart sensor mode”. In smart sensor
> > > mode,
> > > > > > >    instead of holding a long running process for each sensor
> and
> > > > poking
> > > > > > >    periodically, a sensor will only store poke context at
> > > > > sensor_instance
> > > > > > >    table and then exits with a ‘sensing’ state.
> > > > > > >    - When the smart sensor mode is enabled, a special set of
> > > builtin
> > > > > > smart
> > > > > > >    sensor DAGs (named smart_sensor_group_shard_xxx) is created
> by
> > > the
> > > > > > > system;
> > > > > > >    These DAGs contain SmartSensorOperator task and manage the
> > smart
> > > > > > sensor
> > > > > > >    jobs for the airflow cluster. The SmartSensorOperator task
> can
> > > > fetch
> > > > > > >    hundreds of ‘sensing’ instances from sensor_instance table
> and
> > > > poke
> > > > > on
> > > > > > >    behalf of them in batches. Users don’t need to change their
> > > > > > > existing DAGs.
> > > > > > >    - The smart sensor mode currently supports
> > > > NamedHivePartitionSensor
> > > > > > and
> > > > > > >    MetastorePartitionSensor however it can easily be extended
> to
> > > > > > > support more
> > > > > > >    sensor classes.
> > > > > > >    - Smart sensor mode on/off, the list of smart sensor enabled
> > > > > classes,
> > > > > > >    and the number of SmartSensorOperator tasks can be
> configured
> > in
> > > > > > airflow
> > > > > > >    config.
> > > > > > >    - Sensor logs in smart sensors are populated to each task
> > > instance
> > > > > log
> > > > > > >    UI.
> > > > > > >
> > > > > > >
> > > > > > > A PR https://github.com/apache/airflow/pull/5499 is ready for
> > > review
> > > > > > > from the committers and community.
> > > > > > >
> > > > > > >
> > > > > > > This email is formally calling for a vote to accept the AIP and
> > PR.
> > > > > > > Please note that we will update the PR / feature to fix bugs if
> > we
> > > > find
> > > > > > any.
> > > > > > >
> > > > > > >
> > > > > > > Best
> > > > > > >
> > > > > > > Yingbo
> > > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> ===============================================================================
> > > > > >
> > > > > > Please access the attached hyperlink for an important electronic
> > > > > > communications disclaimer:
> > > > > > http://www.credit-suisse.com/legal/en/disclaimer_email_ib.html
> > > > > >
> > > > >
> > > >
> > >
> >
> ===============================================================================
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> > >
> > > --
> > >
> > > Jarek Potiuk
> > > Polidea <https://www.polidea.com/> | Principal Software Engineer
> > >
> > > M: +48 660 796 129 <+48660796129>
> > > [image: Polidea] <https://www.polidea.com/>
> > >
> >
>

Reply via email to