Hey everyone,

Thank you for attending the dev call on Thursday. I updated our meeting notes document in the Airflow 3.x wiki <https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+3.x> to capture the notes. The link for those notes is here <https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=373886699#Airflow3.xDevCall:Meetingnotes-Summary.11>.

The meeting continued our focus on user feedback about Airflow 3 and on resolving adoption issues. I have also updated the Airflow 3.x wiki page <https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+3.x> with a specific "Airflow 3 adoption focus" section.

To everyone who attended the meeting, please check the summary and add anything that I may have missed. For those who could not join, please let us know if you disagree with anything discussed and agreed upon in the meeting. Also, please do ask questions if something is unclear.

Our next meeting is scheduled for the 20th of November at the same time. Please let me know if you would like to add anything to the agenda <https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=373886699#Airflow3.xDevCall:Meetingnotes-ProposedAgenda.12>.

Best regards,
Vikram

--

Below is the summary from the call:

- Catch-up on action items from last call:
  - DAG import issues (Dheeraj)
    - Dheeraj said that he had re-tested the upgrade process and that the Ruff-based utilities, when run with autofix, had significantly improved DAG compatibility from Airflow 2 to Airflow 3: over 50% of all the DAGs parsed successfully with Airflow 3 without needing manual changes.
    - Dheeraj went on to say that the remaining issues requiring manual fixes were with:
      - airflow.utils days_ago method
      - DB create session no longer being available, because direct database access was removed
      - Simple HTTP Operator deprecation and Bash Operator being moved
    - Dheeraj's summary was that the migration timeline after using the utilities would be about 3-4 days to achieve 80-90% DAG compatibility.
    - The only remaining issue in his mind was UI performance, where there seemed to be a noticeable slowdown compared to Airflow 2.x.
    - This report raised a fair number of questions and a good deal of discussion in the meeting itself. It was very helpful for the rest of the team to hear Dheeraj's feedback!
- Development Updates and Presentations:
  - Airflow 3.1.x patch release update (Ephraim Anierobi)
    - Ephraim said that 3.1.2 had been released successfully.
    - Jarek noted that an issue with disappearing logs had been reported right after the release; it may be critical and require a follow-on patch release.
    - Rahul chimed in to say that this log issue was reproducible and that a fix had been identified and tested.
    - There was agreement that this may require a 3.1.3 very soon, instead of waiting for the 2-week release cycle.
    - This is currently scheduled for this week and has been added to the Airflow 3.x wiki page <https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+3.x>.
  - UI performance issues (Pierre Jeambrun)
    - Pierre reported that a number of N+1 query problems had been identified and were being resolved, and that guard rails were being put in place. The root cause was the serialization layer lazy-loading relationships in loops.
    - Pierre also referenced an issue that had identified missing indexes as a source of slowness; this was being resolved by creating new indexes. Vikram raised his concern that new index creation could cause issues in the "DB migration" part of an Airflow upgrade. Ash concurred with the concern and proposed making index creation part of the API server / Scheduler startup rather than part of the migration.
    - Brent added that Grid view performance remains challenging and that additional optimization work was being planned for after the N+1 fixes were complete.
    - There was also discussion about FastAPI configuration changes, because FastAPI scales differently from the Flask approach. This created a need to update the documentation with a recommended scaling approach.
  - Auth issues (Vincent Beck)
    - Vincent reported that issues related to Auth were being resolved and that he had taken this on at Vikram's request.
  - Expanding Task SDK Integration test framework with more tests (Amogh)
    - Amogh made a quick request for community contributions to the Task SDK integration test framework.
    - Amogh said that the complexity was higher than in previous efforts and that this may require a SIG on Slack for coordination.
- Discussion topics:
  - Issue triage process (Vikram)
    - Vikram followed up on his email summary of issues sent to the dev list earlier, saying that the "needs triage" label was applied to 73 of the 284 open issues related to Airflow 3, and that the label still seemed to be applied even after a PR had been created to address the issue.
    - Jarek chimed in to say that this was unintentional and that at least he himself often forgot to assign or remove labels during the review process.
    - Vikram proposed grouping issues into logical swim lanes, with volunteer owners for those lanes, such as:
      - Auth issues: Vincent leading
      - UI / API issues: Pierre and Brent leading
      - Data aware scheduling: TP and Wei leading
      - Edge Worker: Jens leading
    - There was some discussion around this, with senior contributors such as Ash saying that they look at everything rather than at individual areas. However, Vincent and Jens chimed in to say that this would help them focus their attention. Brent added that issues were sometimes mislabeled with the UI tag, but that this was solvable by reassigning the UI-labelled issues after initial triage.
    - An aspiring contributor commented that these swim lane labels would also be useful for issues tagged with "Good first issue", so that they could pick something to work on based on their own skills and interests.
    - Elad pointed out that this needs to be tried out in practice and, if it works, also applied to PRs, since there are many PRs sitting waiting for approval. At this point, we have hit a record of 343 open PRs in the project!
  - Thoughts on how to document DB access options in Airflow 3 upgrade docs (Amogh)
    - Amogh said that the topic of database access options had raised a lot of discussion on the PR.
    - As a result, Amogh started a dev list discussion and was looking for input. Based on that, a lazy consensus would be started in the middle of next week.

--
Vikram Koka
Chief Strategy Officer
Email: [email protected]
<https://www.astronomer.io/>
