Re: [PR] Updating Airflow executor docs for AF3 [airflow]
amoghrajesh commented on code in PR #49389: URL: https://github.com/apache/airflow/pull/49389#discussion_r2055220505 ## airflow-core/docs/core-concepts/executor/index.rst: ## @@ -239,7 +239,8 @@ Important BaseExecutor Methods These methods don't require overriding to implement your own executor, but are useful to be aware of: * ``heartbeat``: The Airflow scheduler Job loop will periodically call heartbeat on the executor. This is one of the main points of interaction between the Airflow scheduler and the executor. This method updates some metrics, triggers newly queued tasks to execute and updates state of running/completed tasks. -* ``queue_command``: The Airflow Executor will call this method of the BaseExecutor to provide tasks to be run by the executor. The BaseExecutor simply adds the TaskInstances to an internal list of queued tasks within the executor. +* ``queue_command``: Airflow 2 way of doing things. The Airflow Executor will call this method of the BaseExecutor to provide tasks to be run by the executor. The BaseExecutor simply adds the TaskInstances to an internal list of queued tasks within the executor. CeleryK8s and LocalK8s executors are examples of this. Review Comment: Yep that has been removed / reworked -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] Updating Airflow executor docs for AF3 [airflow]
amoghrajesh merged PR #49389: URL: https://github.com/apache/airflow/pull/49389
Re: [PR] Updating Airflow executor docs for AF3 [airflow]
amoghrajesh commented on code in PR #49389:
URL: https://github.com/apache/airflow/pull/49389#discussion_r2055220078

## airflow-core/docs/core-concepts/executor/index.rst: ##
@@ -251,7 +252,7 @@ Mandatory Methods to Implement
 The following methods must be overridden at minimum to have your executor supported by Airflow:
 * ``sync``: Sync will get called periodically during executor heartbeats. Implement this method to update the state of the tasks which the executor knows about. Optionally, attempting to execute queued tasks that have been received from the scheduler.
-* ``execute_async``: Executes a command asynchronously. A command in this context is an Airflow CLI command to run an Airflow task. This method is called (after a few layers) during executor heartbeat which is run periodically by the scheduler. In practice, this method often just enqueues tasks into an internal or external queue of tasks to be run (e.g. ``KubernetesExecutor``). But can also execute the tasks directly as well (e.g. ``LocalExecutor``). This will depend on the executor.
+* ``execute_async``: Executes a command (Airflow 2) /workload(Airflow 3) asynchronously. A command is an Airflow CLI command whereas workload is basic unit of work that represents an Airflow task. This method is called (after a few layers) during executor heartbeat which is run periodically by the scheduler. In practice, this method often just enqueues tasks into an internal or external queue of tasks to be run (e.g. ``KubernetesExecutor``). But can also execute the tasks directly as well (e.g. ``LocalExecutor``). This will depend on the executor.

Review Comment:
Checked with Elad and reworked this; he was OK with it.

## airflow-core/docs/core-concepts/executor/index.rst: ##
@@ -239,7 +239,8 @@ Important BaseExecutor Methods
 These methods don't require overriding to implement your own executor, but are useful to be aware of:
 * ``heartbeat``: The Airflow scheduler Job loop will periodically call heartbeat on the executor. This is one of the main points of interaction between the Airflow scheduler and the executor. This method updates some metrics, triggers newly queued tasks to execute and updates state of running/completed tasks.
-* ``queue_command``: The Airflow Executor will call this method of the BaseExecutor to provide tasks to be run by the executor. The BaseExecutor simply adds the TaskInstances to an internal list of queued tasks within the executor.
+* ``queue_command``: Airflow 2 way of doing things. The Airflow Executor will call this method of the BaseExecutor to provide tasks to be run by the executor. The BaseExecutor simply adds the TaskInstances to an internal list of queued tasks within the executor. CeleryK8s and LocalK8s executors are examples of this.
+* ``queue_workload``: Airflow 3 way of doing things. The Airflow Executor will call this method of the BaseExecutor to provide tasks to be run by the executor. The BaseExecutor simply adds the workloads to an internal list of queued workloads to run within the executor. All in-tree executors except the ones mentioned above are using this.

Review Comment:
Yeah, that is removed now.
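The interplay of ``heartbeat``, ``queue_workload``, ``execute_async`` and ``sync`` described in the quoted docs can be sketched with a toy model. This is a minimal sketch only: `ToyWorkload`, `ToyExecutor` and the `parallelism` field are hypothetical names invented for illustration, and nothing here is Airflow's `BaseExecutor` implementation. It assumes a heartbeat first starts queued work up to the number of open slots and then calls `sync`, and that `sync` simply marks everything running as finished.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ToyWorkload:
    """Hypothetical stand-in for a workload; not Airflow's class."""

    key: str


@dataclass
class ToyExecutor:
    """Illustrative only: borrows the method names from the thread, not the real API."""

    parallelism: int = 2  # assumed cap on concurrently running workloads
    queued: deque = field(default_factory=deque)
    running: dict = field(default_factory=dict)
    finished: dict = field(default_factory=dict)

    def queue_workload(self, workload: ToyWorkload) -> None:
        # Scheduler side: append the workload to the executor's internal queue.
        self.queued.append(workload)

    def execute_async(self, workload: ToyWorkload) -> None:
        # A real executor would submit to a pool, broker, or cluster here.
        self.running[workload.key] = workload

    def sync(self) -> None:
        # Update the state of known workloads; this toy completes all of them.
        for key in list(self.running):
            del self.running[key]
            self.finished[key] = "success"

    def heartbeat(self) -> None:
        # Start as many queued workloads as there are open slots, then sync.
        open_slots = self.parallelism - len(self.running)
        for _ in range(min(open_slots, len(self.queued))):
            self.execute_async(self.queued.popleft())
        self.sync()
```

With `parallelism=2` and three queued workloads, the first heartbeat starts and completes two of them and leaves one queued; the second heartbeat picks up the remainder.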
Re: [PR] Updating Airflow executor docs for AF3 [airflow]
amoghrajesh commented on code in PR #49389:
URL: https://github.com/apache/airflow/pull/49389#discussion_r2052233762

## airflow-core/docs/core-concepts/executor/index.rst: ##
@@ -49,6 +49,40 @@ If you want to check which executor is currently set, you can use the ``airflow LocalExecutor
+Workloads
+-
+
+A workload in context of an Executor is the fundamental unit of execution for an executor. It represents a discrete
+operation or job that the executor runs on a worker. For example, it can run user code encapsulated in an Airflow task
+on a worker.

Review Comment:
Okay, let me answer the difference between a workload and a ti. In the context of an executor, both essentially mean the same thing. This is what a workload looks like:

```
ExecuteTask(
    token="mock",
    ti=TaskInstance(
        id=UUID("4d828a62-a417-4936-a7a6-2b3fabacecab"),
        task_id="mock",
        dag_id="mock",
        run_id="mock",
        try_number=1,
        map_index=-1,
        pool_slots=1,
        queue="default",
        priority_weight=1,
        executor_config=None,
        parent_context_carrier=None,
        context_carrier=None,
        queued_dttm=None,
    ),
    dag_rel_path=PurePosixPath("mock.py"),
    bundle_info=BundleInfo(name="n/a", version="no matter"),
    log_path="mock.log",
    type="ExecuteTask",
)
```

A workload is an operational unit for running "something" using the Task SDK paradigm. For executors, it's a `ti`, but we also run dag processors, triggers, etc. with the Task SDK, and that is when the workload definition changes.
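The point that a workload wraps a `ti` for executors but can take other shapes elsewhere can be sketched as a small tagged union. This is a minimal sketch under stated assumptions: every class below (`ToyTaskInstance`, `ToyExecuteTask`, `ToyRunTrigger`) and the `describe` helper are hypothetical stand-ins, carrying only a few of the fields shown in the example above, and the trigger variant is invented purely to show a second kind of workload.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class ToyTaskInstance:
    """Hypothetical subset of the ti fields shown in the thread."""

    dag_id: str
    task_id: str
    run_id: str
    try_number: int = 1
    map_index: int = -1  # -1 is the value used for a non-mapped task instance


@dataclass(frozen=True)
class ToyExecuteTask:
    """Workload variant an executor would receive: it carries a ti."""

    ti: ToyTaskInstance
    log_path: str
    type: str = "ExecuteTask"


@dataclass(frozen=True)
class ToyRunTrigger:
    """Invented second variant, to show a workload that is not a ti."""

    classpath: str
    type: str = "RunTrigger"


ToyWorkload = Union[ToyExecuteTask, ToyRunTrigger]


def describe(workload: ToyWorkload) -> str:
    # Dispatch on the concrete variant; only ExecuteTask exposes a ti.
    if isinstance(workload, ToyExecuteTask):
        ti = workload.ti
        return f"{workload.type}: {ti.dag_id}.{ti.task_id} try {ti.try_number}"
    return f"{workload.type}: {workload.classpath}"
```

Under this reading, a mapped task would differ from the example only in its `map_index`, so each mapped index would arrive as its own workload with its own `ti`.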
Re: [PR] Updating Airflow executor docs for AF3 [airflow]
eladkal commented on code in PR #49389:
URL: https://github.com/apache/airflow/pull/49389#discussion_r2052162668

## airflow-core/docs/core-concepts/executor/index.rst: ##
@@ -49,6 +49,40 @@ If you want to check which executor is currently set, you can use the ``airflow LocalExecutor
+Workloads
+-
+
+A workload in context of an Executor is the fundamental unit of execution for an executor. It represents a discrete
+operation or job that the executor runs on a worker. For example, it can run user code encapsulated in an Airflow task
+on a worker.

Review Comment:
So what is the difference between a workload and a task instance (or the index of a task instance, in the case of mapped tasks)? Is this a user-facing concept or an internal one? If it's not user-facing, we should not mention it anywhere in the docs except the developer ones.
Re: [PR] Updating Airflow executor docs for AF3 [airflow]
eladkal commented on code in PR #49389:
URL: https://github.com/apache/airflow/pull/49389#discussion_r2048420376

## airflow-core/docs/core-concepts/executor/index.rst: ##
@@ -251,7 +252,7 @@ Mandatory Methods to Implement
 The following methods must be overridden at minimum to have your executor supported by Airflow:
 * ``sync``: Sync will get called periodically during executor heartbeats. Implement this method to update the state of the tasks which the executor knows about. Optionally, attempting to execute queued tasks that have been received from the scheduler.
-* ``execute_async``: Executes a command asynchronously. A command in this context is an Airflow CLI command to run an Airflow task. This method is called (after a few layers) during executor heartbeat which is run periodically by the scheduler. In practice, this method often just enqueues tasks into an internal or external queue of tasks to be run (e.g. ``KubernetesExecutor``). But can also execute the tasks directly as well (e.g. ``LocalExecutor``). This will depend on the executor.
+* ``execute_async``: Executes a command (Airflow 2) /workload(Airflow 3) asynchronously. A command is an Airflow CLI command whereas workload is basic unit of work that represents an Airflow task. This method is called (after a few layers) during executor heartbeat which is run periodically by the scheduler. In practice, this method often just enqueues tasks into an internal or external queue of tasks to be run (e.g. ``KubernetesExecutor``). But can also execute the tasks directly as well (e.g. ``LocalExecutor``). This will depend on the executor.

Review Comment:
I think we need to refrain from splitting AF2 / AF3 like that. It would be challenging for users to find what they need and for us to maintain the docs. For example, this raises the question of what a command is and what a workload is. Note that the word "workload" is not defined in the doc. I suggest the docs speak only AF3 language. In cases where we want to explain something in AF2 vs AF3, we should do it in a designated paragraph.
Re: [PR] Updating Airflow executor docs for AF3 [airflow]
o-nikolas commented on code in PR #49389:
URL: https://github.com/apache/airflow/pull/49389#discussion_r2049749306

## airflow-core/docs/core-concepts/executor/index.rst: ##
@@ -239,7 +239,8 @@ Important BaseExecutor Methods
 These methods don't require overriding to implement your own executor, but are useful to be aware of:
 * ``heartbeat``: The Airflow scheduler Job loop will periodically call heartbeat on the executor. This is one of the main points of interaction between the Airflow scheduler and the executor. This method updates some metrics, triggers newly queued tasks to execute and updates state of running/completed tasks.
-* ``queue_command``: The Airflow Executor will call this method of the BaseExecutor to provide tasks to be run by the executor. The BaseExecutor simply adds the TaskInstances to an internal list of queued tasks within the executor.
+* ``queue_command``: Airflow 2 way of doing things. The Airflow Executor will call this method of the BaseExecutor to provide tasks to be run by the executor. The BaseExecutor simply adds the TaskInstances to an internal list of queued tasks within the executor. CeleryK8s and LocalK8s executors are examples of this.
+* ``queue_workload``: Airflow 3 way of doing things. The Airflow Executor will call this method of the BaseExecutor to provide tasks to be run by the executor. The BaseExecutor simply adds the workloads to an internal list of queued workloads to run within the executor. All in-tree executors except the ones mentioned above are using this.

Review Comment:
Many users won't really know what 'in-tree' means. And I'm not sure it's too helpful listing them here. If anything I'd create a new section to list executors that support task SDK, and we can have them clearly documented there, rather than buried deep in this technical text.

## airflow-core/docs/core-concepts/executor/index.rst: ##
@@ -239,7 +239,8 @@ Important BaseExecutor Methods
 These methods don't require overriding to implement your own executor, but are useful to be aware of:
 * ``heartbeat``: The Airflow scheduler Job loop will periodically call heartbeat on the executor. This is one of the main points of interaction between the Airflow scheduler and the executor. This method updates some metrics, triggers newly queued tasks to execute and updates state of running/completed tasks.
-* ``queue_command``: The Airflow Executor will call this method of the BaseExecutor to provide tasks to be run by the executor. The BaseExecutor simply adds the TaskInstances to an internal list of queued tasks within the executor.
+* ``queue_command``: Airflow 2 way of doing things. The Airflow Executor will call this method of the BaseExecutor to provide tasks to be run by the executor. The BaseExecutor simply adds the TaskInstances to an internal list of queued tasks within the executor. CeleryK8s and LocalK8s executors are examples of this.

Review Comment:
> CeleryK8s and LocalK8s executors are examples of this.

Those hardcoded hybrid executors don't have anything to do with the queue_command interface. _Any_ executor working with v2 core airflow will use this method.

## airflow-core/docs/core-concepts/executor/index.rst: ##
@@ -251,7 +252,7 @@ Mandatory Methods to Implement
 The following methods must be overridden at minimum to have your executor supported by Airflow:
 * ``sync``: Sync will get called periodically during executor heartbeats. Implement this method to update the state of the tasks which the executor knows about. Optionally, attempting to execute queued tasks that have been received from the scheduler.
-* ``execute_async``: Executes a command asynchronously. A command in this context is an Airflow CLI command to run an Airflow task. This method is called (after a few layers) during executor heartbeat which is run periodically by the scheduler. In practice, this method often just enqueues tasks into an internal or external queue of tasks to be run (e.g. ``KubernetesExecutor``). But can also execute the tasks directly as well (e.g. ``LocalExecutor``). This will depend on the executor.
+* ``execute_async``: Executes a command (Airflow 2) /workload(Airflow 3) asynchronously. A command is an Airflow CLI command whereas workload is basic unit of work that represents an Airflow task. This method is called (after a few layers) during executor heartbeat which is run periodically by the scheduler. In practice, this method often just enqueues tasks into an internal or external queue of tasks to be run (e.g. ``KubernetesExecutor``). But can also execute the tasks directly as well (e.g. ``LocalExecutor``). This will depend on the executor.

Review Comment:
Honestly, I'm not sure this paragraph needs to even say whether it's a cli command or workload. But I like Elad's suggestion if it must.
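The quoted ``execute_async`` text contrasts executors that only enqueue work with executors that run it directly. A minimal sketch of that contrast follows, assuming a workload can be reduced to a plain callable. `Job`, `EnqueueingExecutor`, `DirectExecutor` and `drain` are hypothetical names for illustration and do not model how ``KubernetesExecutor`` or ``LocalExecutor`` are actually implemented.

```python
import queue
from typing import Callable, List

Job = Callable[[], str]  # hypothetical: a workload reduced to a callable


class EnqueueingExecutor:
    """execute_async only enqueues; something else runs the job later."""

    def __init__(self) -> None:
        self.pending: "queue.Queue[Job]" = queue.Queue()
        self.results: List[str] = []

    def execute_async(self, job: Job) -> None:
        # No work happens here; the job is just handed to a queue.
        self.pending.put(job)

    def drain(self) -> None:
        # Stand-in for an external worker consuming the queue.
        while not self.pending.empty():
            self.results.append(self.pending.get()())


class DirectExecutor:
    """execute_async runs the job itself (inline, in this toy)."""

    def __init__(self) -> None:
        self.results: List[str] = []

    def execute_async(self, job: Job) -> None:
        # The job is executed as part of the call.
        self.results.append(job())
```

In the enqueueing variant, the result only appears once `drain` has run, which is why such an executor relies on ``sync`` to learn about state changes afterwards.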
Re: [PR] Updating Airflow executor docs for AF3 [airflow]
amoghrajesh commented on code in PR #49389:
URL: https://github.com/apache/airflow/pull/49389#discussion_r2048460739

## airflow-core/docs/core-concepts/executor/index.rst: ##
@@ -251,7 +252,7 @@ Mandatory Methods to Implement
 The following methods must be overridden at minimum to have your executor supported by Airflow:
 * ``sync``: Sync will get called periodically during executor heartbeats. Implement this method to update the state of the tasks which the executor knows about. Optionally, attempting to execute queued tasks that have been received from the scheduler.
-* ``execute_async``: Executes a command asynchronously. A command in this context is an Airflow CLI command to run an Airflow task. This method is called (after a few layers) during executor heartbeat which is run periodically by the scheduler. In practice, this method often just enqueues tasks into an internal or external queue of tasks to be run (e.g. ``KubernetesExecutor``). But can also execute the tasks directly as well (e.g. ``LocalExecutor``). This will depend on the executor.
+* ``execute_async``: Executes a command (Airflow 2) /workload(Airflow 3) asynchronously. A command is an Airflow CLI command whereas workload is basic unit of work that represents an Airflow task. This method is called (after a few layers) during executor heartbeat which is run periodically by the scheduler. In practice, this method often just enqueues tasks into an internal or external queue of tasks to be run (e.g. ``KubernetesExecutor``). But can also execute the tasks directly as well (e.g. ``LocalExecutor``). This will depend on the executor.

Review Comment:
Actually that makes sense, let me reword this thing.
Re: [PR] Updating Airflow executor docs for AF3 [airflow]
eladkal commented on code in PR #49389:
URL: https://github.com/apache/airflow/pull/49389#discussion_r2048424994

## airflow-core/docs/core-concepts/executor/index.rst: ##
@@ -251,7 +252,7 @@ Mandatory Methods to Implement
 The following methods must be overridden at minimum to have your executor supported by Airflow:
 * ``sync``: Sync will get called periodically during executor heartbeats. Implement this method to update the state of the tasks which the executor knows about. Optionally, attempting to execute queued tasks that have been received from the scheduler.
-* ``execute_async``: Executes a command asynchronously. A command in this context is an Airflow CLI command to run an Airflow task. This method is called (after a few layers) during executor heartbeat which is run periodically by the scheduler. In practice, this method often just enqueues tasks into an internal or external queue of tasks to be run (e.g. ``KubernetesExecutor``). But can also execute the tasks directly as well (e.g. ``LocalExecutor``). This will depend on the executor.
+* ``execute_async``: Executes a command (Airflow 2) /workload(Airflow 3) asynchronously. A command is an Airflow CLI command whereas workload is basic unit of work that represents an Airflow task. This method is called (after a few layers) during executor heartbeat which is run periodically by the scheduler. In practice, this method often just enqueues tasks into an internal or external queue of tasks to be run (e.g. ``KubernetesExecutor``). But can also execute the tasks directly as well (e.g. ``LocalExecutor``). This will depend on the executor.

Review Comment:
Note that the docs are versioned, so if someone wants to see the content of this page for Airflow 2, they can just click the version button.
[PR] Updating Airflow executor docs for AF3 [airflow]
amoghrajesh opened a new pull request, #49389: URL: https://github.com/apache/airflow/pull/49389 Well, had to do it :)