Hi all,

I have created a document to summarize the discussion from our third dev
call for Airflow 2.0.

Thank you to all who joined the call.

*Doc Link*:
https://cwiki.apache.org/confluence/display/AIRFLOW/Meeting+Notes#MeetingNotes-#4:14Sep2020

To all those who attended, can you please double-check and add if I have
missed anything?

To all those who didn't join, if you disagree with anything in
the summary, please voice your opinion.

Also, please let me know if you would like to include an item in the next
call's agenda.

Including the summary here too (the formatting might break):

*Key Decisions*

   - *Updates*
      - The Airflow v2-0-test branch
      <https://github.com/apache/airflow/commits/v2-0-test> has already
      been cut and is currently rebased manually on top of Master.
      Currently, we don't run CI on it as the branch is in sync with
      Master. As soon as we have a PR / commit that we don't want in 2.0,
      we will diverge the v2-0-test branch from Master and start running
      tests against it.
      - The upgrade-check PR <https://github.com/apache/airflow/pull/9467> was
      merged; we now need to define more rules to add more checks.
   - *API*
      - Progress:
         - Project Board: https://github.com/apache/airflow/projects/1
            - The issues labelled with "Enhancement" are not a requirement
            for 2.0
         - Endpoints:
            - Task Instance Endpoint
            <https://github.com/apache/airflow/pull/9597> is WIP; all the
            other endpoints have been implemented.
         - Permissions Model:
            - Discussion on the PR
            <https://github.com/apache/airflow/pull/10594> is ongoing but
            close to completion.
            - The next piece of work is migrating the existing Views to
            use resource-based permissions (GitHub issue
            <https://github.com/apache/airflow/issues/10469>). This is
            mainly to standardize the permissions model across API and UI.
   - *Improvements to SubDags / Concept of TaskGroup*
      - AIP-34
      <https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-34+TaskGroup%3A+A+UI+task+grouping+concept+as+an+alternative+to+SubDagOperator>
      | PR <https://github.com/apache/airflow/pull/10153> introduced the
      concept of TaskGroup and will be *included in Airflow 2.0*.
         - The PR implements TaskGroups for the Graph View; support for the
         Tree View will be implemented in follow-up PRs.
      - Follow-up items from the discussion:
         - Discuss on the mailing list whether we should deprecate SubDags in
         favour of TaskGroup in 2.0 or wait until Airflow 2.1 or 2.2.
         - Add docs on when to use TaskGroup vs SubDag, potentially
         listing pros and cons.
   - *Scheduler HA* (AIP-15
   <https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=103092651>)
      - A Draft PR <https://github.com/apache/airflow/pull/10956> has been
      created to enable code reviews and to allow the members of the
      community to start testing it with various setups.
      - To get the most benefit from Scheduler HA on MySQL, users will need
      to use MySQL 8. This is because MySQL 5.7 does not support the
      SKIP LOCKED
      <https://dev.mysql.com/doc/refman/8.0/en/innodb-locking-reads.html#innodb-locking-reads-nowait-skip-locked>
      feature, but note that *MySQL 5.7 will continue to work with at
      least the same performance as now, or better*.
      - Astronomer has done performance testing with different Scenarios
      and will publish benchmarks over the coming weeks. Google Composer Team +
      Polidea said that they would be happy to carry out various tests for
      Scheduler HA as well.
      - There were some concerns raised around lock timeout periods and
      the usage of DAG Serialization. More testing in the upcoming weeks should
      help address these concerns and fix any bugs that are discovered.
      - *Docs:*
         - Explicitly mention that the HA Scheduler reads some of the
         properties from the serialized_dag table. Users can turn DAG
         Serialization on/off in the Webserver, but the Scheduler will
         continue using it.
         - Do we recommend 2 schedulers for Production deployments?
         - X Schedulers vs a single Scheduler: use cases where one would be
         better than the other.
            - Some kind of bell curve showing the point at which adding
            Schedulers stops improving performance and may even degrade
            it. This is intended to give guidance on how many schedulers
            to run based on expected load, since this decision could be
            based on multiple factors.
      - Follow-up items:
         - Create a mailing list thread to discuss "Removing Pickling from
         Airflow 2.0". Currently, pickled DAGs are only supported by
         CeleryExecutor, and we have a "--do-pickle" flag on the
         *airflow scheduler*
         <https://airflow.readthedocs.io/en/latest/cli-ref.html#scheduler>
         command and a "--ship-dag" flag on the *airflow tasks run*
         <https://airflow.readthedocs.io/en/latest/cli-ref.html#run>
         command. If we want to remove pickling, Airflow 2.0 is the right
         time; otherwise we shouldn't do it until 3.0.
   - *Helm Chart*
      - We will continue focusing on getting Airflow 2.0 out so the first
      official release of Helm Chart might need to wait.
      - The issue with Helm Chart sources was fixed and there are no
      blockers currently if we were to release it at some point in the near
      future.
      - Enhancements (but not blockers) are:
         - Better Test Coverage with integration tests
         - Docs pointing to the chart on the Airflow Website or the docsite
      - The artifacts for the Helm chart would be published at
      https://downloads.apache.org/airflow/
      - There is still an open question around the *Helm Chart Versioning
      Policy*, i.e. do we want to tie the Helm Chart version to Airflow
      versions, or do we just start from *1.0.0*? This needs to be decided
      before the release of the Helm Chart.
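
For anyone unfamiliar with the SKIP LOCKED point in the Scheduler HA notes
above, here is a toy, in-memory Python sketch of the semantics. It is not
the code from the draft PR: the Row class, the function names and the two
"scheduler" variables are all made up for illustration, and threading.Lock
merely stands in for a database row lock.

```python
import threading


class Row:
    """Hypothetical stand-in for a database row that can be locked."""

    def __init__(self, row_id):
        self.row_id = row_id
        self.lock = threading.Lock()


def select_for_update_skip_locked(rows, limit):
    """Lock and return up to `limit` rows, skipping rows that are already locked.

    This mimics the idea of SELECT ... FOR UPDATE SKIP LOCKED: instead of
    waiting for a locked row, the caller simply moves on to the next one.
    """
    claimed = []
    for row in rows:
        if len(claimed) >= limit:
            break
        # Non-blocking acquire: returns False immediately if the row is held.
        if row.lock.acquire(blocking=False):
            claimed.append(row)
    return claimed


def release(claimed):
    """Release the locks taken by select_for_update_skip_locked."""
    for row in claimed:
        row.lock.release()


rows = [Row(i) for i in range(5)]

# Two hypothetical schedulers ask for work while the first still holds its rows.
scheduler_a = select_for_update_skip_locked(rows, limit=2)
scheduler_b = select_for_update_skip_locked(rows, limit=2)

ids_a = [row.row_id for row in scheduler_a]
ids_b = [row.row_id for row in scheduler_b]
print(ids_a, ids_b)

release(scheduler_a)
release(scheduler_b)
```

Without SKIP LOCKED, a plain locking read would make the second caller wait
for the rows held by the first instead of moving on to unlocked rows.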



*Things to Discuss Next*

   - *21 September (Subject to Change)*
      - Finish up open discussion items from the earlier meeting if not yet
      resolved:
         - Providers versioning,
         - SubDag deprecation,
         - Helm Chart release,
         - REST API permissions
         - Docs changes
      - UI Changes for 2.0
         - Minimum-effort changes: CSS/colours/spacing to make the UI look
         a bit more modern
      - Process:
         - When should we defer the in-scope items to post-2.0
            - Completion by a date?
            - Progress by a date?


Regards,
Kaxil
