> "Core Changes" for me is anything that touches task execution and its > behaviour - Scheduler loop, Local Task jobs, related CLI commands, DAG > Serialization.
Yep. And again let me reiterate: making it clear and communicating it as straightforwardly as possible in an automated way is even better (or good on top of it) than "describing" what core is (especially as it seems rather easy - it clearly follows our project structure).

>> Airflow, to put it delicately, does not excel at getting changes merged in a
>> timely manner. I routinely find myself asking stakeholders multiple times to
>> review a change, and that's before addressing any of their feedback. I'm
>> worried that doubling the approval requirement will exacerbate one problem,
>> without meaningfully solving the core issue of changes getting released with
>> bugs or breaking changes. There's also a psychological component to
>> consider. If a PR has passed all tests and been reviewed/approved by another
>> committer, it's far too easy for the second reviewer to simply rubber-stamp
>> without doing the kind of deep review needed to catch bugs that were missed
>> by the original programmer and the first reviewer.

Yeah. James, I think you touched on something very important. I definitely agree with the "timely" part, but I believe "merging quickly" is a very wrong optimization goal. By all means "give PROPER feedback quickly" should be the goal, but "merging PRs quickly" should not be. I think telling the user very explicitly "you CAN expect that it will take days or sometimes weeks to merge your PR" is just setting the right expectations. And we should do it - this is something we did at all the "First time contributors" workshops, where we also explained why this is expected. The nature of an OSS project is that people (including committers and PMC members) contribute to it (including reviews) when they can, and different people have different areas of expertise, so in certain areas reviews might take more time because there are fewer people and they have less time. Surely some people are paid to contribute and can provide feedback, review and merge PRs faster, but the whole "concept" of an OSS project run under the ASF is that there are various types of contributors, that many of them are doing it for free, and that this should not put those people at a disadvantage vs. people who are paid to do the job. Similarly, people who are not paid for their contributions should have the same chance as those who are to review and comment on the different PRs. There is a reason why we have a minimum of 72 hrs for any vote - precisely to give such people time to think, respond, and be engaged in a meaningful discussion (if they want).

So let me repeat that - "optimizing for speed of merging" is a very wrong goal IMHO. Instead, I believe we should embrace that it can take time to merge PRs and build our process around this - and optimize for "speed and accuracy of feedback" rather than "speed of merging". And yeah - I agree we have a lot of improvements to make in the area of feedback - both in the "timely" part and in being "proper". But I think we have two kinds of feedback: the CI "automated" feedback and the "human" feedback.

Regarding the "CI feedback": everything that can be automated to give proper and timely "CI feedback" - should be automated. Full stop. If there is one thing I am 100% sure about in a long-term sustainable software development process - this is it. This is the only way you can "scale" committers - in many cases almost infinitely - without growing the committers list too fast.
(Like any organisation, when a group gets too big, communication problems prevent it from scaling efficiently - and the committer group is no exception to that.) CI automation effectively allows you to scale committers without incurring the increased communication complexity cost.

One area that we should (continue to) improve is making sure that we quickly isolate and fix the "false-negative" feedback from our build system. I personally believe that CI is a GREAT "feedback" system that frees up a lot of contributors' time, as long as we spend time and effort on keeping it running "well". We should communicate the "correct" status of the PR to the user - whether it fails certain checks and what should be done to fix it. This is an ongoing process and I would love to get more hands helping with it. I would love more people paying attention to fixing the "flaky tests" and helping with the investigation of some "consistent" failures there. Or even improving the "messages" the CI system gives, to provide better guidance to users. It seems that there are just a few people who work in this area and "care" about "good CI feedback" and the "proper" working of the CI. Most people complain when it does not work, but few would like to improve it. I wish there were more people (I am running - together with Elad and Nasser - an Outreachy internship on rewriting some of the bash to Python in our CI, so I hope the "bash-ism" there will be less of a barrier), but I think this is less about bash and more about a mental "block" which I saw in many other projects (where, for example, Python was used). Most people do not understand how important CI is - until it breaks, of course - and then they have little incentive to help with fixing and improving it. I would love to get more people engaged here :). Note it's not a "complaint", it's more an observation that more help in this area might improve the quality and timeliness of the "automated feedback" our users get - and a call for more participation.

I think the more of the "feedback" we automate and the more "rock solid" it can be, the faster the "initial feedback" will be for all users - freeing a lot of time and effort for committers, because CI does the job for them. This is why I think things like "identifying a PR that is Core and needs more than 2 maintainers" are a fantastic tool to focus the effort and energy of committers on those PRs that need more people. Or CI informing the user "Please add the unit test as it is missing" - a perfect candidate for automation, freeing the committer from even looking at the PR before the test is added. Overall, the unreachable goal (but one that is good to keep in your head, to "asymptotically get closer and closer to") is that committers could look only at PRs that are "green". And yes, it requires constant effort by more than one person to keep it green. Once that effort is there, I think the more we automate - the better.
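To make this a bit more concrete, here is a very rough Python sketch - NOT the actual selective_checks code; the path patterns, the review payload shape and the test-layout heuristic are just placeholder assumptions - of the two signals mentioned above: flagging a "core" PR that still needs a second committer approval, and flagging a PR that changes source code without touching any tests:

from __future__ import annotations

import fnmatch

# Placeholder list of path patterns treated as "core" (illustrative only).
CORE_PATTERNS = [
    "airflow/jobs/*",
    "airflow/models/*",
    "airflow/serialization/*",
    "airflow/migrations/*",
]


def is_core_change(changed_files: list[str]) -> bool:
    """Return True if any changed file matches a "core" path pattern."""
    return any(
        fnmatch.fnmatch(changed, pattern)
        for changed in changed_files
        for pattern in CORE_PATTERNS
    )


def has_two_committer_approvals(reviews: list[dict], committers: set[str]) -> bool:
    """Count distinct committers whose latest review state is APPROVED."""
    latest_state: dict[str, str] = {}
    for review in reviews:  # assumed ordered oldest -> newest
        latest_state[review["user"]] = review["state"]
    approvals = {
        user for user, state in latest_state.items()
        if state == "APPROVED" and user in committers
    }
    return len(approvals) >= 2


def misses_unit_tests(changed_files: list[str]) -> bool:
    """Source files changed, but nothing under tests/ changed."""
    touches_source = any(
        f.startswith("airflow/") and f.endswith(".py") for f in changed_files
    )
    touches_tests = any(f.startswith("tests/") for f in changed_files)
    return touches_source and not touches_tests


if __name__ == "__main__":
    changed = ["airflow/jobs/scheduler_job.py", "airflow/utils/timezone.py"]
    reviews = [{"user": "committer-a", "state": "APPROVED"}]
    if is_core_change(changed) and not has_two_committer_approvals(
        reviews, {"committer-a", "committer-b"}
    ):
        print("Core change - a second committer approval is still needed.")
    if misses_unit_tests(changed):
        print("Please add unit tests covering the change.")

The changed-file list and the review data would of course come from the CI environment / GitHub API in a real check - the point is only that both signals are cheap, deterministic checks that CI can give without any committer involvement.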
Providing the "human" feedback: it is - of course - much more important to provide good feedback in reviews. And the more of the committers' time is freed from the "trivial" checks, the more they can spend on the "difficult stuff". But the "human" feedback from committers also should come much faster IMHO, and I think this should mostly be about "area of expertise". There are a number of areas where people are "experts" and, well, it seems they can provide feedback "fast" there. Sometimes I wait for feedback from people who I know are much more knowledgeable in a certain area. Sometimes I have a lot of questions in "my area" that I have to address. Maybe this is a good area to improve? Maybe we should review our CODEOWNERS list and make a more "updated" and "serious" list of CODEOWNERS - i.e. if someone is put in CODEOWNERS for a given area, it is also expected that they will respond in a timely fashion (when they can of course - there are vacations and such, but this should be handled by having at least 3 people in every area as CODEOWNERS, even as an "aspiration" for people who have no great expertise in that area yet). I think a CODEOWNERS entry should carry certain responsibilities for maintainers.

J.

>>
>> Thanks,
>> James
>
>
> On Sat, Nov 13, 2021 at 12:49 AM Leah Cole <[email protected]> wrote:
>>
>> Regarding these guidelines, love em, how do y'all feel about having them in
>> the repo or somewhere in confluence? (maybe as part of a broader maintainers
>> guide?)
>>
>> On Thu, Nov 11, 2021 at 11:03 PM Jarek Potiuk <[email protected]> wrote:
>>>
>>> Good guideline. All for it. Good point, Ace, about automation. I think
>>> any "sustainable" guidelines that are going to survive and be followed
>>> in the future should be automated.
>>>
>>> I actually think we should introduce automation for those - especially
>>> as this seems entirely possible:
>>>
>>> 1) I believe it should be possible (and even easy) to add a check in
>>> CI that "2 reviews from committers are needed" when any of the core
>>> files change. Should be easy with the right "change set". We already
>>> have all the tools in our selective_checks and notifications to do
>>> that. We could even make a new "check" (similarly to the "up-to-date"
>>> check we have for migrations) that will signal the status nicely
>>> if a core change has not been reviewed by two committers.
>>>
>>> 2) Similarly, we could also flag code that does not change
>>> unit tests. This would also be a nice "signal" for new contributors
>>> that they still have some work to do - and we could explain why in the
>>> message. That makes for a nice communication tool towards new
>>> contributors.
>>>
>>> 3) Tests for migrations, raised by Ace - I think that is a good idea, but
>>> we would need to run them outside of the main CI, because the only
>>> reasonable tests are the ones run with a "biggish" amount of data.
>>> The problems with migrations only really become apparent when the
>>> amount of data is sizable. But maybe we could do that as an automated
>>> "pre-release" check - we just need a dump of a biggish database
>>> (mysql/postgres/mssql), run the migration and measure the added
>>> migration time.
>>>
>>> J.
>>>
>>> On Fri, Nov 12, 2021 at 5:08 AM Ace Haidrey
>>> <[email protected]> wrote:
>>> >
>>> > Hey Kaxil,
>>> > These are great guidelines. Quick question: do we have tests
>>> > watching for an increase in runtime when running the migration scripts,
>>> > for example? Any monitoring there?
>>> >
>>> > On Nov 11, 2021, at 4:26 PM, Kaxil Naik <[email protected]> wrote:
>>> >
>>> > Hi all committers and reviewers,
>>> >
>>> > Let's be stricter with PR reviews. Some PRs that had breaking changes
>>> > have slipped by and been merged in the last couple of versions (I have
>>> > been guilty too). Those are now fixed, but let's be more vigilant.
>>> >
>>> > I propose the following guidelines (not rules):
>>> >
>>> > - Ask for unit test coverage wherever applicable.
>>> > - Require at least 2 approvals for Core changes.
>>> > - Be extra sensitive to DB migrations:
>>> >   - Verify the logic to confirm that it would not take an unreasonable
>>> >     amount of time to run - especially for migrations touching the
>>> >     task_instance table.
>>> >   - Use the utilities added in
>>> >     https://github.com/apache/airflow/commit/7622f5e08261afe5ab50a08a6ca0804af8c7c7fe
>>> >     to create migrations, to avoid cases where, for example, we miss
>>> >     precision for datetime columns in MySQL - PR.
>>> >   - Ideally, each migration should be idempotent.
>>> > - At least 1 minor release every 3 months so we don't diverge hugely from
>>> >   the main branch.
>>> >
>>> >
>>> > Thanks.
>>> >
>>> > Regards,
>>> > Kaxil
>>> >
>>> >
>>
>>
>>
>> --
>>
>> Leah Cole (she/her) | Developer Programs Engineer, Data Analytics |
>> [email protected] | +1 (925) 257-2112
>> My working hours may not be your working hours. Please ping me anytime if
>> you'd like a status update on anything we are working on together - my goal
>> is to never be a blocker for you.
>>
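PS. Regarding point 3) above and Ace's question about monitoring migration runtime - a very rough sketch of the kind of "pre-release" timing check I mean (the connection string, the pre-loaded dump and the time budget are placeholder assumptions; this is not an existing script):

# Very rough sketch of a "pre-release" migration timing check, run outside
# the main CI against a database pre-loaded with a sizeable dump.
import os
import subprocess
import time

# Point Airflow at the pre-loaded database (placeholder DSN).
os.environ["AIRFLOW__CORE__SQL_ALCHEMY_CONN"] = (
    "postgresql+psycopg2://airflow:airflow@localhost:5432/airflow_big_dump"
)

MAX_MIGRATION_SECONDS = 600  # placeholder budget for the full upgrade

start = time.monotonic()
# Run all pending migrations against the pre-loaded database and time them.
subprocess.run(["airflow", "db", "upgrade"], check=True)
elapsed = time.monotonic() - start

print(f"Migrations took {elapsed:.1f}s")
if elapsed > MAX_MIGRATION_SECONDS:
    raise SystemExit(
        f"Migrations took longer than {MAX_MIGRATION_SECONDS}s - investigate before release."
    )

The same approach could be repeated against MySQL and MSSQL dumps as part of the release checklist.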
