Hey all, I have updated our meeting notes document to summarize the discussion from our dev call for Airflow 3.0 on 4th June.
Link: https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+3+Dev+call%3A+Meeting+Notes#Airflow3Devcall:MeetingNotes-4June2024

To all those who attended, can you please double-check and add if I have missed anything?
To all those who didn't join, if you disagree with anything in the Summary, please voice your opinion.

I will send a separate email for the agenda for the next meeting on 13 June.

Regards,
Kaxil

------

Including the Summary here too (might break formatting):

The following *principles* were agreed upon to drive Airflow 3 development:

1) For the features that require breaking changes, ship Airflow 3 with the foundational code to allow for iterative development, optimizing for speed and a quicker feedback cycle.
   - Additional features that do not require breaking changes can be included in minor releases such as 3.1, 3.2, and beyond since we follow SemVer.
   - Discussion points:
     - Examples:
       - Add the hook points with a few references (fetcher for GCS, S3, Git) for DAG Versioning’s Execution AIP (AIP-66).
       - Removing dependency on FAB in Core: The new plugin framework might support only a few functionalities, which are then built upon later.
       - For the multi-language AIP, start with only Python + one more language in AF 3.0 and then 3.1 and later minor versions can have more support for languages like TypeScript.
     - Identify users who would be willing to give feedback during dev and beta snapshots. The Astronomer, MWAA, and GCC teams can help identify these.

2) Ensure a smoother migration path between Airflow 2 and 3, particularly for DAG authors using the existing official Airflow providers.
   - Directionally, the time required to update DAGs should be measured in hours, not days or months.
   - Action Items:
     - Update the AIP template to ask for the level of effort (manual and automated) needed for the users to adapt to the breaking changes. This should include high-level details on what could go in the upgrade utilities.
       This will help the AIP authors consciously think through the migration efforts. (Kaxil Naik)
   - For AIPs, be explicit about what’s for AF 3.0 and what’s for the next minor releases (3.1, 3.2, ...).
   - Action Items:
     - Complete the housekeeping of the AIPs (Kaxil Naik):
       - Move the AIPs that we don’t plan to work on to the Abandoned state.
       - Add labels if an AIP is for Airflow 2, 3.0 or >= 3.1. These labels will be used via Macros to auto-populate the tables in Airflow Improvement Proposals, making it a good page for Roadmap items.

3) Build features that solidify Airflow as the modern Orchestrator that has state-of-the-art support for Data, AI & ML workloads.
   - This includes enhancing scalability, performance, and enterprise-level security, adhering to the principle of least privilege.
   - Make Airflow aware of what’s happening in the task to provide better auditability, lineage & observability.

4) Set up the codebase for the next 3-5 years.
   - Reduce the matrix of supported combinations to lower complexity in testing & development. E.g., remove MySQL support to reduce the test matrix.
   - Simplify the codebase & standardize the architecture (e.g., consolidating serialization methods).
   - Remove deprecations.
   - Consider optimizing development workflows (core Airflow vs. provider, chart development).

5) Simplify the Learning Curve for new Airflow users.
   - Decrease the time from running the install command to first DAG.
   - Decrease the boilerplate code needed to run the first DAG/task.
   - Action Items:
     - Write a first draft of a doc on the different personas of Airflow users & current state. This will help tailor the learning curve via docs & tutorials as well as tailor features towards that persona. (Elad Kalif)

6) Shift focus on Airflow 2 to stability: bug fixes + security fixes after AF 2.10. This should continue for a longer period of time after AF 3 release.
   - The provider release will continue to happen independently of the core Airflow.
   - Discussion points:
     - After the AF 2.10 release (~Aug), the "main" branch will become Airflow 3, and the release manager will cherry-pick anything targeting Airflow 2 into the Airflow 2 release branch.
     - The primary focus for AF 2 will shift to reliability. If certain features need to go in AF 2, they will be done by cherry-picking or opening PRs targeting the AF 2 branch.

7) Target a shorter cycle to release Airflow 3.
   - So that Airflow 2 branches for features don't diverge.
   - Users have enough time between Airflow 3 release and Airflow Summit 2025, so we can have talks about successful migrations.
   - Discussion point:
     - This means the realistic target is March-April 2025. This is to allow users enough time to migrate to AF 3 and use it so they can submit CFPs around May-July 2025.
     - The focus for AF Summit 2025 would be on migration stories & features of Airflow 3.

The following *Guidelines* were agreed upon that help decide if a feature should be in Airflow 3 or not:

1. Alignment with Core Principles (mentioned above).
2. Workstream Ownership (can be more than one). If no one is available to lead the workstream, the feature will be parked until a dedicated owner is found.
3. Community Demand and Feedback.
4. Impact on Scalability, Performance & Security.
5. Backward Compatibility and Migration Effort.
6. Implementation Complexity and Maintenance.
7. For big features, discussion on AIPs & a successful vote on the dev mailing list.

*Next steps*:
- Create AIPs for the features targeted for AF 3 in the next few days to start technical discussions.
- Finalize the agenda for the next call.