Yeah. I am officially canceling the RC2 vote now. I also found another problem in the meantime: while the embedded README has all the information about the package that is needed, CHANGELOG.txt contains information about the changelog for the main apache-airflow package, not the backport one. Similarly, the INSTALL.txt (which is mentioned when the package is distributed does not have an installation on building and installing the backport packages - just the apache-airflow itself.
I think all the reasons compounding lead to canceling RC2 - I will cut a new RC3 release tomorrow. Thanks again for the thorough checks Bas and Fokko (and Ash/Kaxil before). Much appreciated! I look forward to a really good final release :) J. On Sat, May 23, 2020 at 4:17 PM Kaxil Naik <kaxiln...@gmail.com> wrote: > Yes, let's cancel this vote and create rc3 after reviewing/fixing errors > that Bas found with his script. > > Thanks Bas & Fokko. > > Regards, > Kaxil > > On Sat, May 23, 2020, 15:09 Jarek Potiuk <jarek.pot...@polidea.com> wrote: > > > Thanks Bas for looking at it. Much appreciated ! > > > > > > > > (1) First general comment is the split in microsoft providers seems > > unnecessary to me: > > > > > > * > > > * apache-airflow-backport-providers-microsoft-azure==2020.5.20rc2 > > > * apache-airflow-backport-providers-microsoft-mssql==2020.5.20rc2 > > > * apache-airflow-backport-providers-microsoft-winrm==2020.5.20rc2 > > > Amazon and Google have just 1 single providers package. Is this > > intentional? > > > > Initially, we had separate packages for Google cloud but we decided to > join > > them as they have a lot of common stuff and having them separated would > > mean that they heavily depended on each other anyway, > > For Amazon, there is only one "aws" sub-package so we could argue if it > > should be an "amazon" or "amazon-aws" one. Both would be fine. For > > Microsoft, it is a bit different. > > Those three packages do not depend on each other at all. Even when you > look > > at their requirements you have this: > > > > PROVIDERS_REQUIREMENTS: Dict[str, Iterable[str]] = { > > "amazon": amazon, > > ... > > "google": google, > > ... > > "microsoft.azure": azure, > > "microsoft.mssql": mssql, > > "microsoft.winrm": winrm, > > ... > > } > > > > And looking at the READMEs - "microsoft.azure" cross-depends on "oracle" > > provider, "Microsoft.mssql" cross-depends on "odbc" one, and "winrm" does > > not cross-depend on any other provider package. So it made perfect sense > to > > leave them as separate packages. One might argue, it might reflect the > > "philosophical" difference between those companies. I believe Google And > > Amazon - have a lot of cross-dependencies in their own products - even > > different ones and often have a lot of common code/interfaces and > whenever > > Google or Amazon buys something they closely integrate it in their > product > > offering. With Microsoft it's a bit different - Azure and Office And > MSSQL > > and WINRM have very little to do with each other on a technology level - > > they are not that well integrated (well Azure runs mostly on Linux, not > > Windows :) ). > > > > So for me, such a distinction makes sense. But I am happy to hear > > other's opinions. > > > > > > > > (2) Second, the naming convention is now consistent which is great! > Nit: > > > > > airflow.providers.google.suite.operators.sheets.GoogleSheetsCreateSpreadsheet > > is an operator but doesn’t end with “Operator”. > > > > Aaargh! Thanks! - That's a bummer :). I think that on its own is a reason > > to cancel rc2 :D:D:D:D:D:D > > > > > > > > (3) Third, I wanted to validate at least importing all > > hooks/sensors/operators/etc. works correctly. Therefore I wrote a test > > script, which iterates over all providers and per backport package: > > > > > > 1. Runs a Python 3.7 Docker container > > > 2. Does a fresh install of Airflow 1.10.10 > > > 3. Does a fresh install of the backport package > > > 4. And tries importing each class (e.g. “from > > airflow.providers.amazon.aws.operators.ecs import ECSOperator”) > > > > Indeed that's a great idea and something that ACTUALLY can make us go for > > rc3. I already do import all provider classes for 2.0. I am importing all > > operators/hooks/classes/secrets in order to generate the > > README's automatically (so I have the robust and tested script for that): > > > > > https://github.com/apache/airflow/blob/master/backport_packages/setup_backport_packages.py#L1070 > > ) > > - the method finds and imports all > > operators/hooks/sensors/secrets/protocols from providers. And with all > the > > CI/Breeze automation we can very easily plug this in the CI framework of > > ours. I also already run automatically (in CI) test the installation of > > all packages one-by-one using 1.10.10 version of Airflow. You can see the > > result here (for example): > > > > > https://github.com/apache/airflow/runs/701914153?check_suite_focus=true#step:6:9384 > > ) > > but indeed we should combine the two - installation on 1.10 and > importing! > > > > > The script is not 100% fool-proof and wildly inefficient, but should > give > > us the bulk of the import errors. I did not test for cross-provider > > installations. Also uncertain if all import errors are related to the > > backports packages, or just fail in general, but we should at least check > > them out. > > > > Agreed. I will automate and review them all now. > > > > Others? I think the last one is a valid reason to cancel rc2 and make > rc3. > > What do you think? > > > > J. > > -- > > > > Jarek Potiuk > > Polidea | Principal Software Engineer > > > > M: +48 660 796 129 > > > -- Jarek Potiuk Polidea <https://www.polidea.com/> | Principal Software Engineer M: +48 660 796 129 <+48660796129> [image: Polidea] <https://www.polidea.com/>