syedahsn opened a new pull request, #37618: URL: https://github.com/apache/airflow/pull/37618
Overview
------------
This PR introduces the AWS Batch Executor. This Executor can be configured to run Airflow tasks using AWS Batch. It is based on an initial contribution from @aelzeiny. From the README:

```
This is an Airflow executor powered by Amazon Batch. Each task scheduled by Airflow is run inside a separate container, scheduled by Batch. Some benefits of an executor like this include:

1. Scalability and Lower Costs: AWS Batch allows the ability to dynamically provision the resources needed to execute tasks. Depending on the resources allocated, AWS Batch can autoscale up or down based on the workload, ensuring efficient resource utilization and reducing costs.
2. Job Queues and Priority: AWS Batch provides the concept of job queues, allowing the ability to prioritize and manage the execution of tasks. This ensures that when multiple tasks are scheduled simultaneously, they are executed in the desired order of priority.
3. Flexibility: AWS Batch supports Fargate (ECS), EC2 and EKS compute environments. This range of compute environments, as well as the ability to finely define the resources allocated to the compute environments gives a lot of flexibility to users in choosing the most suitable execution environment for their workloads.
4. Rapid Task Execution: By maintaining an active worker within AWS Batch, tasks submitted to the service can be executed swiftly. With a ready-to-go worker, there's minimal startup delay, ensuring tasks commence immediately upon submission. This feature is particularly advantageous for time-sensitive workloads or applications requiring near-real-time processing, enhancing overall workflow efficiency and responsiveness.
```

This PR comes with *most* features now included in the ECS Executor, but we will continue to update both executors as we work to improve them.

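To make the "each task runs in its own container, scheduled by Batch" idea concrete, here is a minimal sketch of how a single Airflow task command could be mapped onto the arguments of boto3's Batch `submit_job` call. This is an illustration only, not the code in this PR: the helper name `build_submit_job_kwargs`, the queue and job definition names, and the example command are all hypothetical. The sketch only builds the request and does not contact AWS.

```python
from __future__ import annotations

from typing import Any


def build_submit_job_kwargs(
    command: list[str],
    job_name: str,
    job_queue: str,
    job_definition: str,
) -> dict[str, Any]:
    """Build keyword arguments for boto3's Batch ``submit_job`` call.

    Hypothetical helper for illustration; it is not part of Airflow.
    """
    if not command:
        raise ValueError("command must not be empty")
    return {
        "jobName": job_name,
        "jobQueue": job_queue,
        "jobDefinition": job_definition,
        # Override the container command so that this job runs one task.
        "containerOverrides": {"command": list(command)},
    }


# All values below are assumed placeholders, not defaults from the PR.
kwargs = build_submit_job_kwargs(
    command=["airflow", "tasks", "run", "example_dag", "example_task", "example_run_id"],
    job_name="airflow-task",
    job_queue="my-job-queue",
    job_definition="my-job-definition",
)

# With AWS credentials configured, the request could then be sent with:
#   import boto3
#   boto3.client("batch").submit_job(**kwargs)
```

Keeping the request construction separate from the network call makes the mapping easy to unit test without an AWS account.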
Review Notes
------------------
Similar to the ECS Executor, this PR comes as a fully functional feature, which is great for testing, but it does mean that the PR is large. When reviewing, it isn't necessary to go through every single line. Instead, it is more useful to read through the documentation and glance at how the Executor is implemented. There are a lot of similarities in the code between the Batch Executor and the ECS Executor; this is by design. As we continue to refine the process of writing custom Executors, we are converging on an optimal framework for writing efficient, reliable, and fault-tolerant custom Executors.

Testing
---------
There is extensive unit testing, which has nearly 100% line coverage in most cases:

![BatchExecutorCodeCoverage](https://github.com/apache/airflow/assets/103602455/71fc531f-c2e1-4592-b2a3-15eb3684e60f)

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org