On Fri, Dec 18, 2020 at 12:16 AM Sid Anand <san...@apache.org> wrote:
Woot!!! Wonderful work, everyone. A truly long-awaited milestone for the project -- almost since the beginning of incubation itself!

-s

On Thu, Dec 17, 2020 at 2:14 PM Aizhamal Nurmamat kyzy <aizha...@apache.org> wrote:

Thanks to everyone who put an incredible amount of work into making this happen! 🎉 🎊

On Thu, Dec 17, 2020 at 1:58 PM Xinbin Huang <bin.huan...@gmail.com> wrote:

Amazing to see this! 🎉 🎉 🎉

On Thu, Dec 17, 2020 at 1:54 PM kumar pavan <pavankumar5...@gmail.com> wrote:

Congrats, everyone!

Thanks & Regards
Pavan

On Thu, Dec 17, 2020 at 12:36 PM Ash Berlin-Taylor <a...@apache.org> wrote:

I am proud to announce that Apache Airflow 2.0.0 has been released.

The source release, as well as the binary "wheel" release (no sdist this time), are available here.

We also made this version available on PyPI for convenience (`pip install apache-airflow`):

📦 PyPI: https://pypi.org/project/apache-airflow/2.0.0

The documentation is available on:
https://airflow.apache.org/
📚 Docs: http://airflow.apache.org/docs/apache-airflow/2.0.0/

Docker images will be available shortly -- watch
https://hub.docker.com/r/apache/airflow/tags?page=1&ordering=last_updated&name=2.0.0
for them to appear.

The full changelog is about 3,000 lines long (already excluding everything backported to 1.10), so for now I'll simply share some of the major features in 2.0.0 compared to 1.10.14:

*A new way of writing dags: the TaskFlow API (AIP-31)*

(Known in the 2.0.0 alphas as Functional DAGs.)

DAGs are now much nicer to author, especially when using PythonOperator.
Dependencies are handled more clearly, and XCom is nicer to use.

Read more here:

TaskFlow API Tutorial
<http://airflow.apache.org/docs/apache-airflow/stable/tutorial_taskflow_api.html>
TaskFlow API Documentation
<https://airflow.apache.org/docs/apache-airflow/stable/concepts.html#decorated-flows>

A quick teaser of what DAGs can now look like:

```
from airflow.decorators import dag, task
from airflow.utils.dates import days_ago

@dag(default_args={'owner': 'airflow'}, schedule_interval=None, start_date=days_ago(2))
def tutorial_taskflow_api_etl():
    @task
    def extract():
        return {"1001": 301.27, "1002": 433.21, "1003": 502.22}

    @task
    def transform(order_data_dict: dict) -> dict:
        total_order_value = 0
        for value in order_data_dict.values():
            total_order_value += value
        return {"total_order_value": total_order_value}

    @task()
    def load(total_order_value: float):
        print("Total order value is: %.2f" % total_order_value)

    order_data = extract()
    order_summary = transform(order_data)
    load(order_summary["total_order_value"])

tutorial_etl_dag = tutorial_taskflow_api_etl()
```

*Fully specified REST API (AIP-32)*

We now have a fully supported, no-longer-experimental API with a comprehensive OpenAPI specification.

Read more here:

REST API Documentation
<http://airflow.apache.org/docs/apache-airflow/stable/stable-rest-api-ref.html>

*Massive scheduler performance improvements*

As part of AIP-15 (Scheduler HA + performance) and other work Kamil did, we significantly improved the performance of the Airflow scheduler. It now starts tasks much, MUCH quicker.
Over at Astronomer.io we've benchmarked the scheduler -- it's fast
<https://www.astronomer.io/blog/airflow-2-scheduler> (we had to triple-check the numbers, as we didn't quite believe them at first!)

*Scheduler is now HA compatible (AIP-15)*

It's now possible and supported to run more than a single scheduler instance. This is super useful both for resiliency (in case a scheduler goes down) and for scheduling performance.

To fully use this feature you need Postgres 9.6+ or MySQL 8+ (MySQL 5 and MariaDB won't work with more than one scheduler, I'm afraid).

There's no config or other setup required to run more than one scheduler -- just start up a scheduler somewhere else (ensuring it has access to the DAG files) and it will cooperate with your existing schedulers through the database.

For more information, read the Scheduler HA documentation
<http://airflow.apache.org/docs/apache-airflow/stable/scheduler.html#running-more-than-one-scheduler>

*Task Groups (AIP-34)*

SubDAGs were commonly used for grouping tasks in the UI, but they had many drawbacks in their execution behaviour (primarily that they only executed a single task in parallel!). To improve this experience, we've introduced "Task Groups": a method for organizing tasks which provides the same grouping behaviour as a SubDAG without any of the execution-time drawbacks.

SubDAGs will still work for now, but we think any previous use of SubDAGs can be replaced with task groups. If you find an example where this isn't the case, please let us know by opening an issue on GitHub.

For more information, check out the Task Group documentation
<http://airflow.apache.org/docs/apache-airflow/stable/concepts.html#taskgroup>
*Refreshed UI*

We've given the Airflow UI a visual refresh and updated some of the styling. Check out the UI section of the docs
<http://airflow.apache.org/docs/apache-airflow/stable/ui.html> for screenshots.

We have also added an option to auto-refresh task states in Graph View, so you no longer need to continuously press the refresh button :).

*Smart Sensors for reduced load from sensors (AIP-17)*

If you make heavy use of sensors in your Airflow cluster, you might find that sensor execution takes up a significant proportion of your cluster, even with "reschedule" mode. To improve this, we've added a new mode called "Smart Sensors".

This feature is in "early access": it's been well tested by Airbnb and is "stable"/usable, but we reserve the right to make backwards-incompatible changes to it in a future release (if we have to -- we'll try very hard not to!).

Read more about it in the Smart Sensors documentation
<https://airflow.apache.org/docs/apache-airflow/stable/smart-sensor.html>

*Simplified KubernetesExecutor*

For Airflow 2.0, we have re-architected the KubernetesExecutor in a fashion that is simultaneously faster, easier to understand, and more flexible for Airflow users. Users will now be able to access the full Kubernetes API to create a .yaml pod_template_file instead of specifying parameters in their airflow.cfg.

We have also replaced the executor_config dictionary with the pod_override parameter, which takes a Kubernetes V1Pod object for a 1:1 setting override. These changes have removed over three thousand lines of code from the KubernetesExecutor, which makes it run faster and creates fewer potential errors.
Read more here:

Docs on pod_template_file
<https://airflow.apache.org/docs/apache-airflow/stable/executor/kubernetes.html?highlight=pod_override#pod-template-file>
Docs on pod_override
<https://airflow.apache.org/docs/apache-airflow/stable/executor/kubernetes.html?highlight=pod_override#pod-override>

*Airflow core and providers: splitting Airflow into 60+ packages*

Airflow 2.0 is no longer a monolithic "one to rule them all" package. We've split Airflow into core and 61 (for now) provider packages. Each provider package covers a particular external service (Google, Amazon, Microsoft, Snowflake), a database (Postgres, MySQL), or a protocol (HTTP/FTP). Now you can create a custom Airflow installation from "building" blocks and choose only what you need, plus add whatever other requirements you might have. Some common providers (ftp, http, imap, sqlite) are installed automatically, as they are commonly used. Other providers are installed automatically when you choose the appropriate extras while installing Airflow.

The provider architecture should make it much easier to get a fully customized, yet consistent, runtime with the right set of Python dependencies.

But that's not all: you can write your own custom providers and add things like custom connection types, customizations of the connection forms, and extra links to your operators, all in a manageable way. You can build your own provider, install it as a Python package, and have your customizations visible right in the Airflow UI.

Our very own Jarek Potiuk has written about providers in much more detail
<https://www.polidea.com/blog/airflow-2-providers/> on the Polidea blog.
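For instance, installation might look like this (an illustrative sketch; the exact extras and provider package names are listed in the package reference docs):

```shell
# Core only, plus the handful of bundled providers (ftp, http, imap, sqlite):
pip install apache-airflow==2.0.0

# Extras pull in the matching provider packages automatically:
pip install "apache-airflow[google,amazon,postgres]==2.0.0"

# Provider packages can also be installed or upgraded individually:
pip install apache-airflow-providers-snowflake
```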
Docs on the providers concept and writing custom providers
<http://airflow.apache.org/docs/apache-airflow-providers/>
Docs on all the provider packages available
<http://airflow.apache.org/docs/apache-airflow-providers/packages-ref.html>

*Security*

As part of the Airflow 2.0 effort, there has been a conscious focus on security and on reducing areas of exposure. This is represented across different functional areas in different forms. For example, in the new REST API, all operations now require authorization. Similarly, in the configuration settings, the Fernet key is now required to be specified.

*Configuration*

Configuration in the form of the airflow.cfg file has been rationalized further into distinct sections, specifically around "core". Additionally, a significant number of configuration options have been deprecated or moved to individual component-specific configuration files, such as the pod_template_file for Kubernetes execution-related configuration.

*Thanks to all of you*

We've tried to make as few breaking changes as possible and to provide deprecation paths in the code, especially in the case of anything called in the DAG. That said, please read through UPDATING.md to check what might affect you. For example: we re-organized the layout of operators (they now all live under airflow.providers.*), but the old names should continue to work -- you'll just notice a lot of DeprecationWarnings that need to be fixed up.
Thank you so much to all the contributors who got us to this point, in no particular order: Kaxil Naik, Daniel Imberman, Jarek Potiuk, Tomek Urbaszek, Kamil Breguła, Gerard Casas Saez, Xiaodong DENG, Kevin Yang, James Timmins, Yingbo Wang, Qian Yu, Ryan Hamilton, and the hundreds of others who keep making Airflow better for everyone.

-- Eugene