pankajkoti commented on code in PR #45188:
URL: https://github.com/apache/airflow/pull/45188#discussion_r1900587419
##########
providers/src/airflow/providers/databricks/operators/databricks.py:
##########
@@ -292,6 +292,8 @@ class DatabricksCreateJobsOperator(BaseOperator):
     :param databricks_retry_delay: Number of seconds to wait between retries (it might be a floating point number).
     :param databricks_retry_args: An optional dictionary with arguments passed to ``tenacity.Retrying`` class.
+    :param databricks_environments: An optional list of task execution environment specifications

Review Comment:
   Can we call this just `environments`? The `databricks` prefix seems redundant. Same suggestion across the changes in the PR.

##########
providers/src/airflow/providers/databricks/operators/databricks.py:
##########
@@ -1087,7 +1100,9 @@ def _get_run_json(self) -> dict[str, Any]:
         elif self.existing_cluster_id:
             run_json["existing_cluster_id"] = self.existing_cluster_id
         else:
-            raise ValueError("Must specify either existing_cluster_id or new_cluster.")

Review Comment:
   IMO, let's keep the else branch and the error, altering the message to say that one of existing_cluster_id, the new_cluster spec, or databricks_environments needs to be specified, and also adding another `elif` block to verify that self.databricks_environments is set.

##########
providers/src/airflow/providers/databricks/operators/databricks.py:
##########
@@ -1087,7 +1100,9 @@ def _get_run_json(self) -> dict[str, Any]:
         elif self.existing_cluster_id:
             run_json["existing_cluster_id"] = self.existing_cluster_id
         else:
-            raise ValueError("Must specify either existing_cluster_id or new_cluster.")
+            self.log.info("The task %s will be executed in serverless mode", run_json["run_name"])
+            if self.databricks_environments:

Review Comment:
   This could be moved to an `elif` block before the `else` block above.
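Taken together, the two suggestions on `_get_run_json` amount to a control flow roughly like the sketch below. This is illustrative only: `RunJsonSketch` is a made-up stand-in class, the attribute names are taken from the diff context, and the `"environments"` payload key is an assumption rather than the PR's actual code.

```python
import logging

logger = logging.getLogger(__name__)


class RunJsonSketch:
    """Hypothetical stand-in for the operator, reduced to the branching logic."""

    def __init__(self, run_name, new_cluster=None, existing_cluster_id=None,
                 databricks_environments=None):
        self.run_name = run_name
        self.new_cluster = new_cluster
        self.existing_cluster_id = existing_cluster_id
        self.databricks_environments = databricks_environments

    def _get_run_json(self):
        run_json = {"run_name": self.run_name}
        if self.new_cluster:
            run_json["new_cluster"] = self.new_cluster
        elif self.existing_cluster_id:
            run_json["existing_cluster_id"] = self.existing_cluster_id
        elif self.databricks_environments:
            # Serverless execution with explicit environment specs; the
            # "environments" payload key is an assumption, not from the diff.
            run_json["environments"] = self.databricks_environments
            logger.info("The task %s will be executed in serverless mode",
                        run_json["run_name"])
        else:
            # Keep the else branch and its error, with an updated message,
            # as suggested in the review comment.
            raise ValueError(
                "Must specify one of existing_cluster_id, new_cluster, "
                "or databricks_environments."
            )
        return run_json
```

With this ordering, the serverless case is an explicit `elif` rather than a catch-all, so an operator configured with none of the three options still fails fast with a clear error.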
##########
providers/src/airflow/providers/databricks/operators/databricks.py:
##########
@@ -1372,27 +1387,29 @@ def _convert_to_databricks_workflow_task(

 class DatabricksTaskOperator(DatabricksTaskBaseOperator):
     """
-    Runs a task on Databricks using an Airflow operator.
-
-    The DatabricksTaskOperator allows users to launch and monitor task job runs on Databricks as Airflow
-    tasks. It can be used as a part of a DatabricksWorkflowTaskGroup to take advantage of job clusters, which
-    allows users to run their tasks on cheaper clusters that can be shared between tasks.
+    Runs a task on Databricks using an Airflow operator.

-    .. seealso::
-        For more information on how to use this operator, take a look at the guide:
-        :ref:`howto/operator:DatabricksTaskOperator`
+    The DatabricksTaskOperator allows users to launch and monitor task job runs on Databricks as Airflow

Review Comment:
   Why are we changing the indentation here?

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org