[GitHub] [airflow] eladkal commented on issue #27890: SFTP Sensor is not working with File Pattern Parameter
eladkal commented on issue #27890: URL: https://github.com/apache/airflow/issues/27890#issuecomment-1327112678 cc @Bowrna -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] ephraimbuddy commented on pull request #27829: Improving the release process
ephraimbuddy commented on PR #27829: URL: https://github.com/apache/airflow/pull/27829#issuecomment-1327111335 I added a new function `user_confirm_bools` that returns a bool using `user_confirm` under the hood. This helped me reduce a lot of `if else` statements. Also added `console_print` that uses `get_console().print` to print messages to the screen.
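The helpers named in the comment are not shown there; the following is a minimal sketch of the idea only, with hypothetical signatures (the real Breeze helpers differ, and the `read` parameter is added here purely so the sketch can be exercised non-interactively):

```python
from typing import Callable

def user_confirm(message: str, read: Callable[[str], str] = input) -> str:
    """Stand-in for the interactive prompt; returns the normalized raw answer."""
    return read(f"{message} [y/n] ").strip().lower()

def user_confirm_bools(message: str, read: Callable[[str], str] = input) -> bool:
    """Wrap user_confirm so call sites can branch on a bool instead of
    repeating if/else chains on the raw string answer."""
    return user_confirm(message, read) == "y"
```

A call site then collapses from an if/else on strings to `if user_confirm_bools("Tag the release?"): ...`.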
[GitHub] [airflow] dstandish commented on pull request #27344: Add retry to submit_event in trigger to avoid deadlock
dstandish commented on PR #27344: URL: https://github.com/apache/airflow/pull/27344#issuecomment-1327107144 @NickYadance did you give up on this one?
[GitHub] [airflow] bolkedebruin commented on a diff in pull request #27887: Add allow list for imports during deserialization
bolkedebruin commented on code in PR #27887: URL: https://github.com/apache/airflow/pull/27887#discussion_r1032070380

## airflow/utils/json.py:

```diff
@@ -189,7 +189,7 @@ def __init__(self, *args, **kwargs) -> None:
     if not kwargs.get("object_hook"):
         kwargs["object_hook"] = self.object_hook
-    patterns = conf.getjson("core", "allowed_deserialization_classes")
+    patterns = cast(list, conf.getjson("core", "allowed_deserialization_classes"))
```

Review Comment: Mmm yes, I'd prefer that check at configure time rather than here. The config file shouldn't validate if this is not a list.
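A configure-time check along the lines the reviewer suggests might look like this. This is a hypothetical helper for illustration only, not Airflow's actual configuration machinery; the function name and error wording are assumptions:

```python
import json

def get_allowed_deserialization_classes(raw: str) -> list:
    """Parse the option value and fail fast, at configuration time,
    if it is not a JSON list -- instead of cast()-ing at the call site."""
    patterns = json.loads(raw)
    if not isinstance(patterns, list):
        raise ValueError(
            "Option [core] allowed_deserialization_classes must be a JSON list, "
            f"got {type(patterns).__name__}"
        )
    return patterns
```

With a check like this, the deserializer can assume a list and the `cast(list, ...)` becomes unnecessary.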
[GitHub] [airflow] Bowrna opened a new pull request, #27905: listener plugin example added
Bowrna opened a new pull request, #27905: URL: https://github.com/apache/airflow/pull/27905 related: #15353 --- **^ Add meaningful description above** Read the **[Pull Request Guidelines](https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst#pull-request-guidelines)** for more information. In case of fundamental code changes, an Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvement+Proposals)) is needed. In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x). In case of backwards incompatible changes please leave a note in a newsfragment file, named `{pr_number}.significant.rst` or `{issue_number}.significant.rst`, in [newsfragments](https://github.com/apache/airflow/tree/main/newsfragments).
[GitHub] [airflow] Bowrna closed pull request #27435: listener plugin example and documentation
Bowrna closed pull request #27435: listener plugin example and documentation URL: https://github.com/apache/airflow/pull/27435
[GitHub] [airflow] Bowrna opened a new pull request, #27435: listener plugin example and documentation
Bowrna opened a new pull request, #27435: URL: https://github.com/apache/airflow/pull/27435 This PR contains example code and documentation for using the listener plugin feature in Airflow. related: https://github.com/apache/airflow/issues/15353
[GitHub] [airflow] blag commented on a diff in pull request #27828: Soft delete datasets that are no longer referenced in DAG schedules or task outlets
blag commented on code in PR #27828: URL: https://github.com/apache/airflow/pull/27828#discussion_r1032052680

## airflow/jobs/scheduler_job.py:

```diff
@@ -1574,3 +1585,33 @@ def _cleanup_stale_dags(self, session: Session = NEW_SESSION) -> None:
         dag.is_active = False
         SerializedDagModel.remove_dag(dag_id=dag.dag_id, session=session)
         session.flush()
+
+    @provide_session
+    def _orphan_unreferenced_datasets(self, session: Session = NEW_SESSION) -> None:
+        """
+        Detects datasets that are no longer referenced in any DAG schedule parameters or task outlets and
+        sets the dataset is_orphaned flags to True
+        """
+        orphaned_dataset_query = (
+            session.query(DatasetModel)
+            .join(
+                DagScheduleDatasetReference,
+                DagScheduleDatasetReference.dataset_id == DatasetModel.id,
+                isouter=True,
+            )
+            .join(
+                TaskOutletDatasetReference,
+                TaskOutletDatasetReference.dataset_id == DatasetModel.id,
+                isouter=True,
+            )
+            .group_by(DatasetModel.id)
+            .having(
+                and_(
+                    func.count(DagScheduleDatasetReference.dag_id) == 0,
+                    func.count(TaskOutletDatasetReference.dag_id) == 0,
+                )
+            )
+        )
+        for dataset in orphaned_dataset_query:
+            self.log.info("Orphaning unreferenced dataset '%s'", dataset.uri)
+            dataset.is_orphaned = True
```

Review Comment: Nope, didn't work. Good idea though. :)

```
sqlalchemy.exc.InvalidRequestError: Can't call Query.update() or Query.delete() when group_by() has been called
```
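The restriction blag hit (no `Query.update()` after `group_by()`) can be worked around with a two-step approach: identify the orphan ids with the grouped query, then issue the bulk UPDATE from a separate, ungrouped query. This is a minimal sketch against a simplified stand-in schema (the model and table names here are illustrative, not Airflow's actual models):

```python
from sqlalchemy import Boolean, Column, ForeignKey, Integer, String, create_engine, func
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()

# Simplified stand-ins for DatasetModel and one of its reference tables.
class Dataset(Base):
    __tablename__ = "dataset"
    id = Column(Integer, primary_key=True)
    uri = Column(String)
    is_orphaned = Column(Boolean, default=False, nullable=False)

class ScheduleRef(Base):
    __tablename__ = "schedule_ref"
    id = Column(Integer, primary_key=True)
    dataset_id = Column(Integer, ForeignKey("dataset.id"))

engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()
session.add_all([Dataset(id=1, uri="s3://a"), Dataset(id=2, uri="s3://b"), ScheduleRef(dataset_id=1)])
session.commit()

# Step 1: the grouped query only *identifies* the orphans...
orphaned_ids = [
    row.id
    for row in session.query(Dataset.id)
    .outerjoin(ScheduleRef, ScheduleRef.dataset_id == Dataset.id)
    .group_by(Dataset.id)
    .having(func.count(ScheduleRef.id) == 0)
]
# Step 2: ...and a second, ungrouped query performs the bulk UPDATE,
# sidestepping the "Can't call Query.update() when group_by() has been
# called" restriction.
session.query(Dataset).filter(Dataset.id.in_(orphaned_ids)).update(
    {Dataset.is_orphaned: True}, synchronize_session="fetch"
)
session.commit()
```

`synchronize_session="fetch"` keeps objects already loaded in the session consistent with the bulk UPDATE.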
[GitHub] [airflow] blag commented on a diff in pull request #27828: Soft delete datasets that are no longer referenced in DAG schedules or task outlets
blag commented on code in PR #27828: URL: https://github.com/apache/airflow/pull/27828#discussion_r1032045393

## airflow/jobs/scheduler_job.py:

```diff
@@ -1574,3 +1585,33 @@ def _cleanup_stale_dags(self, session: Session = NEW_SESSION) -> None:
         dag.is_active = False
         SerializedDagModel.remove_dag(dag_id=dag.dag_id, session=session)
         session.flush()
+
+    @provide_session
+    def _orphan_unreferenced_datasets(self, session: Session = NEW_SESSION) -> None:
+        """
+        Detects datasets that are no longer referenced in any DAG schedule parameters or task outlets and
+        sets the dataset is_orphaned flags to True
+        """
+        orphaned_dataset_query = (
+            session.query(DatasetModel)
+            .join(
+                DagScheduleDatasetReference,
+                DagScheduleDatasetReference.dataset_id == DatasetModel.id,
+                isouter=True,
+            )
+            .join(
+                TaskOutletDatasetReference,
+                TaskOutletDatasetReference.dataset_id == DatasetModel.id,
+                isouter=True,
+            )
+            .group_by(DatasetModel.id)
+            .having(
+                and_(
+                    func.count(DagScheduleDatasetReference.dag_id) == 0,
+                    func.count(TaskOutletDatasetReference.dag_id) == 0,
+                )
+            )
+        )
+        for dataset in orphaned_dataset_query:
+            self.log.info("Orphaning unreferenced dataset '%s'", dataset.uri)
+            dataset.is_orphaned = True
```

Review Comment: The group by expression might interfere but I'll try it, thanks!
[GitHub] [airflow] tanelk opened a new pull request, #27904: Order TIs by map_index
tanelk opened a new pull request, #27904: URL: https://github.com/apache/airflow/pull/27904 Sort TIs by the `map_index` field when selecting them for queueing. Currently TIs are only ordered by `priority_weight` and `execution_date`. This does not fix any bug, but it makes the execution order more understandable and "cleaner" in the UI. Without this, every now and then the TIs get executed from the middle - probably something to do with database internals. ![2022-11-24_15-47](https://user-images.githubusercontent.com/3342974/203914401-97c9be97-43ab-432f-bd8f-b858ad09c058.png)
[GitHub] [airflow] ephraimbuddy commented on a diff in pull request #27828: Soft delete datasets that are no longer referenced in DAG schedules or task outlets
ephraimbuddy commented on code in PR #27828: URL: https://github.com/apache/airflow/pull/27828#discussion_r1032034170

## airflow/jobs/scheduler_job.py:

```diff
@@ -1574,3 +1585,33 @@ def _cleanup_stale_dags(self, session: Session = NEW_SESSION) -> None:
         dag.is_active = False
         SerializedDagModel.remove_dag(dag_id=dag.dag_id, session=session)
         session.flush()
+
+    @provide_session
+    def _orphan_unreferenced_datasets(self, session: Session = NEW_SESSION) -> None:
+        """
+        Detects datasets that are no longer referenced in any DAG schedule parameters or task outlets and
+        sets the dataset is_orphaned flags to True
+        """
+        orphaned_dataset_query = (
+            session.query(DatasetModel)
+            .join(
+                DagScheduleDatasetReference,
+                DagScheduleDatasetReference.dataset_id == DatasetModel.id,
+                isouter=True,
+            )
+            .join(
+                TaskOutletDatasetReference,
+                TaskOutletDatasetReference.dataset_id == DatasetModel.id,
+                isouter=True,
+            )
+            .group_by(DatasetModel.id)
+            .having(
+                and_(
+                    func.count(DagScheduleDatasetReference.dag_id) == 0,
+                    func.count(TaskOutletDatasetReference.dag_id) == 0,
+                )
+            )
+        )
+        for dataset in orphaned_dataset_query:
+            self.log.info("Orphaning unreferenced dataset '%s'", dataset.uri)
+            dataset.is_orphaned = True
```

Review Comment:
```suggestion
        ).update({DatasetModel.is_orphaned: True}, synchronize_session='fetch')
    )
```
If this will work I think it's faster.

## airflow/migrations/versions/0122_2_5_0_add_is_orphaned_to_datasetmodel.py:

```diff
@@ -0,0 +1,49 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied. See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""Add is_orphaned to DatasetModel
+
+Revision ID: 290244fb8b83
+Revises: 65a852f26899
+Create Date: 2022-11-22 00:12:53.432961
+
+"""
+
+from __future__ import annotations
+
+import sqlalchemy as sa
+from alembic import op
+
+# revision identifiers, used by Alembic.
+revision = "290244fb8b83"
+down_revision = "65a852f26899"
+branch_labels = None
+depends_on = None
+airflow_version = "2.5.0"
+
+
+def upgrade():
+    """Add is_orphaned to DatasetModel"""
+    with op.batch_alter_table("dataset") as batch_op:
```

Review Comment: I think so, due to SQLite but I don't think we need the server_default since it's `False`
[GitHub] [airflow] Taragolis commented on a diff in pull request #27901: Add information on how to run tests in Breeze via the PyCharm IDE
Taragolis commented on code in PR #27901: URL: https://github.com/apache/airflow/pull/27901#discussion_r1032025382

## TESTING.rst:

```diff
@@ -61,20 +61,51 @@ Running Unit Tests from PyCharm IDE
 To run unit tests from the PyCharm IDE, create the `local virtualenv `_,
 select it as the default project's environment, then configure your test runner:
 
-.. image:: images/configure_test_runner.png
+.. image:: images/pycharm/configure_test_runner.png
     :align: center
     :alt: Configuring test runner
 
 and run unit tests as follows:
 
-.. image:: images/running_unittests.png
+.. image:: images/pycharm/running_unittests.png
     :align: center
     :alt: Running unit tests
 
 **NOTE:** You can run the unit tests in the standalone local virtualenv (with no Breeze installed) if they do not have dependencies such as Postgres/MySQL/Hadoop/etc.
 
+Running Unit Tests from PyCharm IDE using Breeze
+
+Ideally, all unit tests should be run using the standardized Breeze environment. While not
+as convenient as the one-click "play button" in PyCharm, the IDE can be configured to do
+this in two clicks.
+
+1. Add Breeze as an "External Tool"
+   a. File > Settings > Tools > External Tools
+   b. Click the little plus symbol to open the "Create Tool" popup and fill it out:
```

Review Comment: Some macOS-specific stuff: on macOS (and only there) the user should navigate to `PyCharm -> Preferences` instead of `File > Settings`.
[GitHub] [airflow] jedcunningham commented on a diff in pull request #27828: Soft delete datasets that are no longer referenced in DAG schedules or task outlets
jedcunningham commented on code in PR #27828: URL: https://github.com/apache/airflow/pull/27828#discussion_r1031991662

## airflow/jobs/scheduler_job.py:

```diff
@@ -1574,3 +1585,33 @@ def _cleanup_stale_dags(self, session: Session = NEW_SESSION) -> None:
         dag.is_active = False
         SerializedDagModel.remove_dag(dag_id=dag.dag_id, session=session)
         session.flush()
+
+    @provide_session
+    def _orphan_unreferenced_datasets(self, session: Session = NEW_SESSION) -> None:
+        """
+        Detects datasets that are no longer referenced in any DAG schedule parameters or task outlets and
+        sets the dataset is_orphaned flags to True
+        """
+        orphaned_dataset_query = (
+            session.query(DatasetModel)
+            .join(
+                DagScheduleDatasetReference,
+                DagScheduleDatasetReference.dataset_id == DatasetModel.id,
+                isouter=True,
+            )
+            .join(
+                TaskOutletDatasetReference,
+                TaskOutletDatasetReference.dataset_id == DatasetModel.id,
+                isouter=True,
+            )
+            .group_by(DatasetModel.id)
+            .having(
+                and_(
+                    func.count(DagScheduleDatasetReference.dag_id) == 0,
+                    func.count(TaskOutletDatasetReference.dag_id) == 0,
+                )
+            )
+        )
+        for dataset in orphaned_dataset_query.all():
```

Review Comment:
```suggestion
        for dataset in orphaned_dataset_query:
```

## airflow/www/views.py:

```diff
@@ -3648,7 +3648,7 @@ def datasets_summary(self):
         if has_event_filters:
             count_query = count_query.join(DatasetEvent, DatasetEvent.dataset_id == DatasetModel.id)
-        filters = []
+        filters = [DatasetModel.is_orphaned.is_(False)]
```

Review Comment:
```suggestion
        filters = [~DatasetModel.is_orphaned]
```

## airflow/jobs/scheduler_job.py:

```diff
@@ -1574,3 +1585,33 @@ def _cleanup_stale_dags(self, session: Session = NEW_SESSION) -> None:
         dag.is_active = False
         SerializedDagModel.remove_dag(dag_id=dag.dag_id, session=session)
         session.flush()
+
+    @provide_session
+    def _orphan_unreferenced_datasets(self, session: Session = NEW_SESSION) -> None:
+        """
+        Detects datasets that are no longer referenced in any DAG schedule parameters or task outlets and
+        sets the dataset is_orphaned flags to True
+        """
+        orphaned_dataset_query = (
+            session.query(DatasetModel)
+            .join(
+                DagScheduleDatasetReference,
+                DagScheduleDatasetReference.dataset_id == DatasetModel.id,
+                isouter=True,
+            )
+            .join(
+                TaskOutletDatasetReference,
+                TaskOutletDatasetReference.dataset_id == DatasetModel.id,
+                isouter=True,
+            )
+            .group_by(DatasetModel.id)
+            .having(
+                and_(
+                    func.count(DagScheduleDatasetReference.dag_id) == 0,
+                    func.count(TaskOutletDatasetReference.dag_id) == 0,
+                )
+            )
+        )
+        for dataset in orphaned_dataset_query.all():
+            self.log.info("Orphaning dataset '%s'", dataset.uri)
```

Review Comment:
```suggestion
            self.log.info("Orphaning unreferenced dataset '%s'", dataset.uri)
```

## airflow/dag_processing/manager.py:

```diff
@@ -433,8 +433,10 @@ def __init__(
         self.last_stat_print_time = 0
-        # Last time we cleaned up DAGs which are no longer in files
         self.last_deactivate_stale_dags_time = timezone.make_aware(datetime.fromtimestamp(0))
-        # How often to check for DAGs which are no longer in files
-        self.deactivate_stale_dags_interval = conf.getint("scheduler", "deactivate_stale_dags_interval")
+        # How often to clean up:
+        # * DAGs which are no longer in files
+        # * datasets that are no longer referenced by any DAG schedule parameters or task outlets
```

Review Comment:
```suggestion
        # How often to check for DAGs which are no longer in files
```

## airflow/models/dag.py:

```diff
@@ -2828,6 +2828,7 @@ def bulk_write_to_db(
         for dataset in all_datasets:
             stored_dataset = session.query(DatasetModel).filter(DatasetModel.uri == dataset.uri).first()
             if stored_dataset:
+                stored_dataset.is_orphaned = False
```

Review Comment: Test this situation.

## airflow/migrations/versions/0122_2_5_0_add_is_orphaned_to_datasetmodel.py:

```diff
@@ -0,0 +1,49 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regard
```
[GitHub] [airflow] Ken-poc commented on issue #27903: dag.timezone can not have start_date.tzinfo
Ken-poc commented on issue #27903: URL: https://github.com/apache/airflow/issues/27903#issuecomment-1326986647 This is not a bug. Could you remove the label?
[GitHub] [airflow] Ken-poc commented on issue #27903: dag.timezone can not have start_date.tzinfo
Ken-poc commented on issue #27903: URL: https://github.com/apache/airflow/issues/27903#issuecomment-1326984633 Yes, I know that works.

```python
if start_date and start_date.tzinfo:
    tzinfo = None if start_date.tzinfo else settings.TIMEZONE
    tz = pendulum.instance(start_date, tz=tzinfo).timezone
```

Though `start_date` has its `tzinfo`, `tzinfo` is always assigned to _None_, and `tz` is eventually made from `start_date`, not `tzinfo`, anyway. This makes it confusing even though it actually works.
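Ken-poc's point is that inside the tz-aware branch the `tzinfo` variable is always `None`, so the branch boils down to using the datetime's own timezone directly. A stdlib-only sketch of that simplification (Airflow itself uses pendulum and `settings.TIMEZONE`; the function name and default here are illustrative):

```python
from datetime import datetime, timedelta, timezone

TOKYO = timezone(timedelta(hours=9))  # stand-in for settings.TIMEZONE

def resolve_dag_timezone(start_date, default_tz=TOKYO):
    """A tz-aware start_date supplies the timezone directly; only a naive
    one falls back to the configured default. The quoted Airflow branch
    reaches the same result, but via a tzinfo variable that is always
    None once the branch condition holds."""
    if start_date is not None and start_date.tzinfo is not None:
        return start_date.tzinfo
    return default_tz

aware = datetime(2022, 11, 25, tzinfo=timezone.utc)
naive = datetime(2022, 11, 25)
```

Calling `resolve_dag_timezone(aware)` keeps the datetime's own `tzinfo`, while `resolve_dag_timezone(naive)` falls back to the default.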
[GitHub] [airflow] zsdyx commented on issue #13668: scheduler dies with "MySQLdb._exceptions.OperationalError: (1213, 'Deadlock found when trying to get lock; try restarting transaction')"
zsdyx commented on issue #13668: URL: https://github.com/apache/airflow/issues/13668#issuecomment-1326980476 A MySQL deadlock occurs when I use 2.4.1. Has this problem ever been solved?
[GitHub] [airflow] NickYadance closed pull request #27344: Add retry to submit_event in trigger to avoid deadlock
NickYadance closed pull request #27344: Add retry to submit_event in trigger to avoid deadlock URL: https://github.com/apache/airflow/pull/27344
[GitHub] [airflow] uranusjr commented on issue #27026: Parameterise key sorting in "Rendered Template" view
uranusjr commented on issue #27026: URL: https://github.com/apache/airflow/issues/27026#issuecomment-1326964845 Problem is `template_fields` does not contain all operator fields, and you’d have no way to sort those non-templated fields (especially if some of them can’t be templated to begin with).
[GitHub] [airflow] Ken-poc commented on issue #27903: dag.timezone can not have start_date.tzinfo
Ken-poc commented on issue #27903: URL: https://github.com/apache/airflow/issues/27903#issuecomment-1326961733 Yes, that's right. I pointed out that tzinfo is always assigned to None. I think this is unnecessary. https://github.com/apache/airflow/blob/3e288abd0bc3e5788dcd7f6d9f6bef26ec4c7281/airflow/models/dag.py#L465
[GitHub] [airflow] vksunilk commented on issue #27894: Status of testing Providers that were prepared on November 24, 2022
vksunilk commented on issue #27894: URL: https://github.com/apache/airflow/issues/27894#issuecomment-1326959302 #26986 Works as expected. I am able to view the DataprocLink irrespective of Job status
[GitHub] [airflow] uranusjr commented on issue #27903: dag.timezone can not have start_date.tzinfo
uranusjr commented on issue #27903: URL: https://github.com/apache/airflow/issues/27903#issuecomment-1326956524 Not sure what you mean. The attribute is set, from what I can tell.

```pycon
>>> from airflow.models.dag import DAG
>>> import pendulum
>>> d = pendulum.now()
>>> d.tzinfo
Timezone('Etc/UTC')
>>> dag = DAG(dag_id="xxx", start_date=d)
>>> dag.timezone
Timezone('Etc/UTC')
```
[GitHub] [airflow] boring-cyborg[bot] commented on issue #27903: dag.timezone can not have start_date.tzinfo
boring-cyborg[bot] commented on issue #27903: URL: https://github.com/apache/airflow/issues/27903#issuecomment-1326954259 Thanks for opening your first issue here! Be sure to follow the issue template!
[GitHub] [airflow] Ken-poc opened a new issue, #27903: dag.timezone can not have start_date.tzinfo
Ken-poc opened a new issue, #27903: URL: https://github.com/apache/airflow/issues/27903

### Apache Airflow version
main (development)

### What happened
I found a weird piece of code when assigning dag.timezone in the DAG model. The timezone of the DAG is always assigned to None, even though `start_date` has `tzinfo`. Is it intended?

### What you think should happen instead
The DAG should have the timezone if the start_date passed in has tzinfo.

### How to reproduce
The DAG can't get its timezone from start_date.

### Operating System
macOS Monterey

### Versions of Apache Airflow Providers
apache-airflow-providers-amazon==6.0.0
apache-airflow-providers-cncf-kubernetes==4.4.0
apache-airflow-providers-common-sql==1.2.0
apache-airflow-providers-ftp==3.1.0
apache-airflow-providers-http==4.0.0
apache-airflow-providers-imap==3.0.0
apache-airflow-providers-sqlite==3.2.1

### Deployment
Virtualenv installation

### Deployment details
None

### Anything else
None

### Are you willing to submit PR?
- [X] Yes I am willing to submit a PR!

### Code of Conduct
- [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
[GitHub] [airflow] uranusjr commented on pull request #27740: Remove XCom API endpoint full deserialization option
uranusjr commented on PR #27740: URL: https://github.com/apache/airflow/pull/27740#issuecomment-1326951283 Sounds to me the most reasonable approach here would be to add a config to allow this feature.
[airflow] branch main updated: Remove is_mapped attribute (#27881)
This is an automated email from the ASF dual-hosted git repository.

uranusjr pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/airflow.git

The following commit(s) were added to refs/heads/main by this push:
     new 3e288abd0b  Remove is_mapped attribute (#27881)
3e288abd0b is described below

commit 3e288abd0bc3e5788dcd7f6d9f6bef26ec4c7281
Author: Tzu-ping Chung
AuthorDate: Fri Nov 25 09:21:01 2022 +0800

    Remove is_mapped attribute (#27881)
---
 .../endpoints/task_instance_endpoint.py      |  3 +-
 airflow/api_connexion/schemas/task_schema.py | 17 ++--
 airflow/cli/commands/task_command.py         |  3 +-
 airflow/models/baseoperator.py               |  2 -
 airflow/models/mappedoperator.py             |  2 -
 airflow/models/operator.py                   | 23 -
 airflow/models/taskinstance.py               |  5 +-
 airflow/models/xcom_arg.py                   |  3 +-
 airflow/ti_deps/deps/ready_to_reschedule.py  |  4 +-
 airflow/ti_deps/deps/trigger_rule_dep.py     |  3 +-
 airflow/www/views.py                         |  7 +-
 tests/decorators/test_python.py              |  7 +-
 tests/models/test_taskinstance.py            | 99 +-
 13 files changed, 151 insertions(+), 27 deletions(-)

diff --git a/airflow/api_connexion/endpoints/task_instance_endpoint.py b/airflow/api_connexion/endpoints/task_instance_endpoint.py
index 4e9d6cb9a1..9d5d54ba58 100644
--- a/airflow/api_connexion/endpoints/task_instance_endpoint.py
+++ b/airflow/api_connexion/endpoints/task_instance_endpoint.py
@@ -45,6 +45,7 @@ from airflow.api_connexion.schemas.task_instance_schema import (
 from airflow.api_connexion.types import APIResponse
 from airflow.models import SlaMiss
 from airflow.models.dagrun import DagRun as DR
+from airflow.models.operator import needs_expansion
 from airflow.models.taskinstance import TaskInstance as TI, clear_task_instances
 from airflow.security import permissions
 from airflow.utils.airflow_flask_app import get_airflow_app
@@ -202,7 +203,7 @@ def get_mapped_task_instances(
     if not task:
         error_message = f"Task id {task_id} not found"
         raise NotFound(error_message)
-    if not task.is_mapped:
+    if not needs_expansion(task):
         error_message = f"Task id {task_id} is not mapped"
         raise NotFound(error_message)

diff --git a/airflow/api_connexion/schemas/task_schema.py b/airflow/api_connexion/schemas/task_schema.py
index 0fcb9ff18f..5715ca2ea0 100644
--- a/airflow/api_connexion/schemas/task_schema.py
+++ b/airflow/api_connexion/schemas/task_schema.py
@@ -27,6 +27,7 @@ from airflow.api_connexion.schemas.common_schema import (
     WeightRuleField,
 )
 from airflow.api_connexion.schemas.dag_schema import DAGSchema
+from airflow.models.mappedoperator import MappedOperator
 from airflow.models.operator import Operator
@@ -59,22 +60,28 @@ class TaskSchema(Schema):
     template_fields = fields.List(fields.String(), dump_only=True)
     sub_dag = fields.Nested(DAGSchema, dump_only=True)
     downstream_task_ids = fields.List(fields.String(), dump_only=True)
-    params = fields.Method("get_params", dump_only=True)
-    is_mapped = fields.Boolean(dump_only=True)
+    params = fields.Method("_get_params", dump_only=True)
+    is_mapped = fields.Method("_get_is_mapped", dump_only=True)

-    def _get_class_reference(self, obj):
+    @staticmethod
+    def _get_class_reference(obj):
         result = ClassReferenceSchema().dump(obj)
         return result.data if hasattr(result, "data") else result

-    def _get_operator_name(self, obj):
+    @staticmethod
+    def _get_operator_name(obj):
         return obj.operator_name

     @staticmethod
-    def get_params(obj):
+    def _get_params(obj):
         """Get the Params defined in a Task."""
         params = obj.params
         return {k: v.dump() for k, v in params.items()}

+    @staticmethod
+    def _get_is_mapped(obj):
+        return isinstance(obj, MappedOperator)
+

 class TaskCollection(NamedTuple):
     """List of Tasks with metadata."""

diff --git a/airflow/cli/commands/task_command.py b/airflow/cli/commands/task_command.py
index a217d2c78d..078565dc38 100644
--- a/airflow/cli/commands/task_command.py
+++ b/airflow/cli/commands/task_command.py
@@ -42,6 +42,7 @@ from airflow.models import DagPickle, TaskInstance
 from airflow.models.baseoperator import BaseOperator
 from airflow.models.dag import DAG
 from airflow.models.dagrun import DagRun
+from airflow.models.operator import needs_expansion
 from airflow.ti_deps.dep_context import DepContext
 from airflow.ti_deps.dependencies_deps import SCHEDULER_QUEUED_DEPS
 from airflow.typing_compat import Literal
@@ -150,7 +151,7 @@ def _get_ti(
     """Get the task instance through DagRun.run_id, if that fails, get the TI the old way."""
     if not exec_date_or_run_id and not create_if_necessary:
[GitHub] [airflow] uranusjr merged pull request #27881: Remove is_mapped attribute
uranusjr merged PR #27881: URL: https://github.com/apache/airflow/pull/27881 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] uranusjr closed issue #27879: Getting error in scheduler logs "Task killed externally" when running a dag with task group mapping
uranusjr closed issue #27879: Getting error in scheduler logs "Task killed externally" when running a dag with task group mapping URL: https://github.com/apache/airflow/issues/27879
[GitHub] [airflow] uranusjr commented on a diff in pull request #27778: Lambda hook: make runtime and handler optional
uranusjr commented on code in PR #27778: URL: https://github.com/apache/airflow/pull/27778#discussion_r1031930667

## airflow/providers/amazon/aws/hooks/lambda_function.py: ##

@@ -93,6 +93,12 @@ def create_lambda(
         code_signing_config_arn: str | None = None,
         architectures: list[str] | None = None,
     ) -> dict:
+        if package_type == "Zip":
+            if handler is None:
+                raise ValueError("Parameter 'handler' is required if 'package_type' is 'Zip'")
+            if runtime is None:
+                raise ValueError("Parameter 'runtime' is required if 'package_type' is 'Zip'")

Review Comment: These should be TypeError to mirror Python’s default behaviour.

```pycon
>>> def f(*, a): pass
...
>>> f()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: f() missing 1 required keyword-only argument: 'a'
```
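The review suggestion above — raise `TypeError` for conditionally required arguments, since that is what Python itself raises for a missing keyword-only argument — can be sketched outside Airflow like this. The function and message wording here are illustrative, not the provider's actual API:

```python
def create_function(package_type="Zip", handler=None, runtime=None):
    """Validate conditionally required arguments.

    'handler' and 'runtime' are only mandatory when package_type is "Zip",
    so they cannot be plain required keyword-only parameters; instead we
    raise TypeError manually, mirroring Python's built-in behaviour.
    """
    if package_type == "Zip":
        if handler is None:
            raise TypeError("create_function() missing required argument for package_type='Zip': 'handler'")
        if runtime is None:
            raise TypeError("create_function() missing required argument for package_type='Zip': 'runtime'")
    return {"PackageType": package_type, "Handler": handler, "Runtime": runtime}
```

Callers that forget the argument then get the same exception type they would get from a genuinely required parameter, which keeps error handling uniform.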
[GitHub] [airflow] uranusjr commented on a diff in pull request #27844: Detect alternative container runtime automatically
uranusjr commented on code in PR #27844: URL: https://github.com/apache/airflow/pull/27844#discussion_r1031929935

## CONTRIBUTORS_QUICK_START.rst: ##

@@ -50,7 +50,7 @@ Local machine development
 If you do not work with remote development environment, you need those prerequisites.

-1. Docker Community Edition (you can also use Colima, see instructions below)
+1. Container runtime: Docker Community Edition (recommended), Colima.

Review Comment: FWIW, last time I tried using containerd with breeze, the Docker CLI was not the main problem (Podman is actually close enough that you can just change a few constants to make things work); the real blocker was docker-compose. But that’s off-topic; the main point here is that the terminology needs to be fixed so it doesn't introduce confusion unnecessarily.
[GitHub] [airflow] uranusjr commented on issue #27859: DYNAMICALLY CREATING TASKS issue : "_TaskDecorator' object has no attribute 'update_relative': "
uranusjr commented on issue #27859: URL: https://github.com/apache/airflow/issues/27859#issuecomment-1326922559

I suspect a call is missed somewhere in how you instantiate tasks. Note that a `@task` function needs to be _called_ (either with `f.expand()`, `f.expand_kwargs()`, or just `f()` like a function) to become a concrete task. We can probably check for this user error and emit a better message, but we need a reproduction first to identify the issue.
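The failure mode described here can be reproduced with a minimal stand-in for a `@task`-style decorator (this is NOT Airflow's actual `_TaskDecorator`, just a sketch of the pattern): the decorated name is a wrapper object until it is called, so passing the wrapper itself where a concrete task is expected fails with an `AttributeError` like the one reported:

```python
class TaskDecorator:
    """Minimal stand-in for a @task-style decorator wrapper (hypothetical)."""

    def __init__(self, func):
        self.func = func

    def __call__(self, *args, **kwargs):
        # Calling the wrapper is what produces a concrete "task" object.
        return {"task_id": self.func.__name__, "result": self.func(*args, **kwargs)}


def task(func):
    return TaskDecorator(func)


@task
def extract():
    return [1, 2, 3]


# Correct usage: call the decorated function to get a concrete task.
concrete = extract()

# Incorrect usage: `extract` by itself is still the wrapper, so accessing a
# task-only attribute (such as update_relative) raises AttributeError.
```

In real Airflow code the fix is the same shape: use `my_task()`, `my_task.expand(...)`, or `my_task.expand_kwargs(...)` rather than passing `my_task` itself into dependency-setting code.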
[GitHub] [airflow] uranusjr closed issue #27862: airflow failure and success callbacks read task instance state as 'running'
uranusjr closed issue #27862: airflow failure and success callbacks read task instance state as 'running' URL: https://github.com/apache/airflow/issues/27862
[GitHub] [airflow] uranusjr commented on issue #27862: airflow failure and success callbacks read task instance state as 'running'
uranusjr commented on issue #27862: URL: https://github.com/apache/airflow/issues/27862#issuecomment-1326920978 Duplicate of #26760.
[GitHub] [airflow] uranusjr commented on a diff in pull request #27887: Add allow list for imports during deserialization
uranusjr commented on code in PR #27887: URL: https://github.com/apache/airflow/pull/27887#discussion_r1031925774

## airflow/utils/json.py: ##

@@ -189,7 +189,7 @@ def __init__(self, *args, **kwargs) -> None:
         if not kwargs.get("object_hook"):
             kwargs["object_hook"] = self.object_hook

-        patterns = conf.getjson("core", "allowed_deserialization_classes")
+        patterns = cast(list, conf.getjson("core", "allowed_deserialization_classes"))

Review Comment: This would result in a confusing error if the value is not set to a list. It’s probably better to explicitly check the value is a list instead (and raise a clear message explaining the config value is the exact source of failure).
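The explicit check uranusjr suggests could look roughly like this (a sketch, not the actual Airflow patch; the option name is taken from the diff above). Unlike `cast()`, which only silences the type checker, this fails fast with a message that names the offending config option:

```python
def validate_allowed_classes(patterns):
    """Verify the config value is a list, pointing at the exact config
    option when it is not, instead of hiding the problem behind cast()."""
    if not isinstance(patterns, list):
        raise ValueError(
            "The value of [core] allowed_deserialization_classes must be a "
            f"JSON list, got {type(patterns).__name__!r}"
        )
    return patterns
```

A user who sets the option to a bare string then gets an error that points straight at the misconfiguration rather than an opaque failure deep in the deserializer.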
[GitHub] [airflow] ferruzzi commented on a diff in pull request #27901: Add information on how to run tests in Breeze via the PyCharm IDE
ferruzzi commented on code in PR #27901: URL: https://github.com/apache/airflow/pull/27901#discussion_r1031925157

## TESTING.rst: ##

@@ -61,20 +61,51 @@ Running Unit Tests from PyCharm IDE
 To run unit tests from the PyCharm IDE, create the `local virtualenv `_,
 select it as the default project's environment, then configure your test runner:

-.. image:: images/configure_test_runner.png
+.. image:: images/pycharm/configure_test_runner.png
     :align: center
     :alt: Configuring test runner

 and run unit tests as follows:

-.. image:: images/running_unittests.png
+.. image:: images/pycharm/running_unittests.png
     :align: center
     :alt: Running unit tests

 **NOTE:** You can run the unit tests in the standalone local virtualenv (with no Breeze installed) if they do not have dependencies such as Postgres/MySQL/Hadoop/etc.

+Running Unit Tests from PyCharm IDE using Breeze
+
+Ideally, all unit tests should be run using the standardized Breeze environment. While not
+as convenient as the one-click "play button" in PyCharm, the IDE can be configured to do
+this in two clicks.
+
+1. Add Breeze as an "External Tool"
+   a. File > Settings > Tools > External Tools
+   b. Click the little plus symbol to open the "Create Tool" popup and fill it out:
+
+.. image:: images/pycharm/pycharm_create_tool.png
+    :align: center
+    :alt: Installing Python extension
+
+2. Add the tool to the context menu
+   a. File > Settings > Appearance and Behavior > Menus and Toolbars > Project View Popup Menu
+   b. Click on the list of entries where you would like it to be added. Right above or below
+      "Project View Popup Menu Run Group" may be a good choice, you can drag and drop this list
+      to rearrange the placement later.
+   c. Click the little plus at the top of the popup window
+   d. Find your "External Tool" in the new "Choose Actions to Add" popup and click OK. If you
+      followed the image above, it will be at External Tools > External Tools > Breeze

Review Comment: Committed the phrasing change, thanks.

For the bullet styles, this is how it renders, despite the letter-bullets in the raw code.

![image](https://user-images.githubusercontent.com/1920178/203878654-362d882f-582b-4f38-83fc-3769ea3f5a05.png)
[GitHub] [airflow] uranusjr commented on a diff in pull request #27834: Make sure we can get out of a faulty scheduler state
uranusjr commented on code in PR #27834: URL: https://github.com/apache/airflow/pull/27834#discussion_r1031924843

## airflow/models/dagrun.py: ##

@@ -780,8 +780,7 @@ def _expand_mapped_task_if_needed(ti: TI) -> Iterable[TI] | None:
         except NotMapped:  # Not a mapped task, nothing needed.
             return None
         if expanded_tis:
-            assert expanded_tis[0] is ti
-            return expanded_tis[1:]
+            return expanded_tis

Review Comment: Since this function only returns _new_ ti objects, should we do something like this?

```python
if expanded_tis[0] is ti:
    return expanded_tis[1:]
return expanded_tis
```
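The identity check in that suggestion — drop the first element only when it *is* the original object — behaves like this plain-Python sketch (names are illustrative, not Airflow's):

```python
def new_tis_only(expanded, original):
    """Return only newly created items.

    If expansion reused the original object as element 0, skip it;
    otherwise every element in `expanded` is new. Note the use of `is`
    (object identity), not `==`: two distinct TI objects could compare
    equal, but only the very same object should be skipped.
    """
    if expanded and expanded[0] is original:
        return expanded[1:]
    return expanded
```

This keeps the caller's contract ("returns new objects only") without the hard `assert` that made the scheduler crash when the assumption did not hold.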
[GitHub] [airflow] uranusjr commented on a diff in pull request #27834: Make sure we can get out of a faulty scheduler state
uranusjr commented on code in PR #27834: URL: https://github.com/apache/airflow/pull/27834#discussion_r1031924565

## airflow/models/abstractoperator.py: ##

@@ -494,18 +495,30 @@ def expand_mapped_task(self, run_id: str, *, session: Session) -> tuple[Sequence
                 total_length,
             )
             unmapped_ti.state = TaskInstanceState.SKIPPED
-            indexes_to_map = ()
         else:
-            # Otherwise convert this into the first mapped index, and create
-            # TaskInstance for other indexes.
-            unmapped_ti.map_index = 0
-            self.log.debug("Updated in place to become %s", unmapped_ti)
-            all_expanded_tis.append(unmapped_ti)
-            indexes_to_map = range(1, total_length)
-        state = unmapped_ti.state
-    elif not total_length:
+            zero_index_ti_exists = session.query(
+                exists().where(
+                    TaskInstance.dag_id == self.dag_id,
+                    TaskInstance.task_id == self.task_id,
+                    TaskInstance.run_id == run_id,
+                    TaskInstance.map_index == 0,
+                )
+            ).scalar()

Review Comment: IIRC `EXISTS` has some compatibility issues across databases (don’t remember what exactly), so we generally use `query(count())...scalar() > 0` instead.
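The two predicates are logically equivalent for this purpose; the review comment is only about portability across the databases Airflow supports. In raw SQL (sqlite used here purely for illustration) the equivalence looks like:

```python
import sqlite3

# Toy table standing in for task_instance (schema simplified).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE task_instance (task_id TEXT, map_index INTEGER)")
conn.execute("INSERT INTO task_instance VALUES ('t1', 0)")

# EXISTS form -- the construct the review flags as less portable in ORM use:
exists_row = conn.execute(
    "SELECT EXISTS(SELECT 1 FROM task_instance WHERE map_index = 0)"
).fetchone()[0]

# COUNT form -- the equivalent `query(count())...scalar() > 0` pattern:
count = conn.execute(
    "SELECT COUNT(*) FROM task_instance WHERE map_index = 0"
).fetchone()[0]
```

`count > 0` gives the same boolean answer as the `EXISTS` subquery for any filter, which is why the codebase can standardize on the count form.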
[GitHub] [airflow] xlanor commented on a diff in pull request #27898: fix: current_state method on TaskInstance doesn't filter by map_index
xlanor commented on code in PR #27898: URL: https://github.com/apache/airflow/pull/27898#discussion_r1031921282

## airflow/models/taskinstance.py: ##

@@ -719,6 +719,7 @@ def current_state(self, session: Session = NEW_SESSION) -> str:
         .filter(
             TaskInstance.dag_id == self.dag_id,
             TaskInstance.task_id == self.task_id,
+            TaskInstance.map_index == self.map_index,

Review Comment: Thanks, will work on this PR tomorrow and hopefully get it ready for review shortly
[GitHub] [airflow] uranusjr commented on a diff in pull request #27898: fix: current_state method on TaskInstance doesn't filter by map_index
uranusjr commented on code in PR #27898: URL: https://github.com/apache/airflow/pull/27898#discussion_r1031920687

## airflow/models/taskinstance.py: ##

@@ -719,6 +719,7 @@ def current_state(self, session: Session = NEW_SESSION) -> str:
         .filter(
             TaskInstance.dag_id == self.dag_id,
             TaskInstance.task_id == self.task_id,
+            TaskInstance.map_index == self.map_index,

Review Comment: Honestly `current_state` isn’t really used almost anywhere in the code base (the only use is for `airflow tasks state`, and I’d challenge whether even that usage is necessary at all), so the test coverage is mostly non-existent. You can probably add a test for the `airflow tasks` CLI command (in `tests/cli/commands/test_task_command.py`) to cover this.
[GitHub] [airflow] xlanor commented on a diff in pull request #27898: fix: current_state method on TaskInstance doesn't filter by map_index
xlanor commented on code in PR #27898: URL: https://github.com/apache/airflow/pull/27898#discussion_r1031919560

## airflow/models/taskinstance.py: ##

@@ -719,6 +719,7 @@ def current_state(self, session: Session = NEW_SESSION) -> str:
         .filter(
             TaskInstance.dag_id == self.dag_id,
             TaskInstance.task_id == self.task_id,
+            TaskInstance.map_index == self.map_index,

Review Comment: Thanks! I've looked at the tests in tests/model.py and I don't see any examples of a test of a mapped task there. Is there any test that you would suggest so that a regression does not occur in the future?
[GitHub] [airflow] uranusjr commented on a diff in pull request #27898: fix: current_state method on TaskInstance doesn't filter by map_index
uranusjr commented on code in PR #27898: URL: https://github.com/apache/airflow/pull/27898#discussion_r1031919036

## airflow/models/taskinstance.py: ##

@@ -719,6 +719,7 @@ def current_state(self, session: Session = NEW_SESSION) -> str:
         .filter(
             TaskInstance.dag_id == self.dag_id,
             TaskInstance.task_id == self.task_id,
+            TaskInstance.map_index == self.map_index,

Review Comment: It’s probably a good chance to rewrite this to something like

```python
from sqlalchemy.inspection import inspect

session.query(TaskInstance.state).filter(
    *(col == getattr(self, col.name) for col in inspect(TaskInstance).primary_key)
).scalar()
```

This would be resilient to any primary key changes in the future. (Note the `*` unpacking: `Query.filter` takes each criterion as a separate positional argument, so a bare generator expression would not work.)
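Why the full primary key matters: for a mapped task, several rows share the same dag_id/task_id/run_id and differ only in map_index, so a filter that omits map_index is ambiguous. A sqlite sketch of the idea (the table layout is a simplification of Airflow's actual task_instance table):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE task_instance ("
    "dag_id TEXT, task_id TEXT, run_id TEXT, map_index INTEGER, state TEXT, "
    "PRIMARY KEY (dag_id, task_id, run_id, map_index))"
)
# Two expanded instances of the same mapped task:
conn.executemany(
    "INSERT INTO task_instance VALUES (?, ?, ?, ?, ?)",
    [("d", "t", "r", 0, "success"), ("d", "t", "r", 1, "failed")],
)

# Without map_index, the filter matches BOTH expanded instances -- the bug:
ambiguous = conn.execute(
    "SELECT state FROM task_instance WHERE dag_id='d' AND task_id='t' AND run_id='r'"
).fetchall()

# Filtering on the complete primary key pins down exactly one row -- the fix:
exact = conn.execute(
    "SELECT state FROM task_instance "
    "WHERE dag_id='d' AND task_id='t' AND run_id='r' AND map_index=1"
).fetchone()
```

Deriving the filter from the model's declared primary key, as the review suggests, makes this property hold automatically even if the key columns change later.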
[GitHub] [airflow] pulquero opened a new issue, #27902: HdfsSensor has no clear failure mode
pulquero opened a new issue, #27902: URL: https://github.com/apache/airflow/issues/27902

### Description

Currently, HdfsSensor pings forever if some failure causes the file not to be written. Some sort of timeout parameter would be nice.

### Use case/motivation

If there is a failure earlier in the pipeline that prevents the file of interest being written, HdfsSensor just pings forever, and everything looks fine. I would like some sort of way to have HdfsSensor fail, so that my team can detect issues promptly and address them.

### Related issues

_No response_

### Are you willing to submit a PR?

- [ ] Yes I am willing to submit a PR!

### Code of Conduct

- [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
[GitHub] [airflow] uranusjr commented on a diff in pull request #27901: Add information on how to run tests in Breeze via the PyCharm IDE
uranusjr commented on code in PR #27901: URL: https://github.com/apache/airflow/pull/27901#discussion_r1031914073

## TESTING.rst: ##

@@ -61,20 +61,51 @@ Running Unit Tests from PyCharm IDE
 To run unit tests from the PyCharm IDE, create the `local virtualenv `_,
 select it as the default project's environment, then configure your test runner:

-.. image:: images/configure_test_runner.png
+.. image:: images/pycharm/configure_test_runner.png
     :align: center
     :alt: Configuring test runner

 and run unit tests as follows:

-.. image:: images/running_unittests.png
+.. image:: images/pycharm/running_unittests.png
     :align: center
     :alt: Running unit tests

 **NOTE:** You can run the unit tests in the standalone local virtualenv (with no Breeze installed) if they do not have dependencies such as Postgres/MySQL/Hadoop/etc.

+Running Unit Tests from PyCharm IDE using Breeze
+
+Ideally, all unit tests should be run using the standardized Breeze environment. While not
+as convenient as the one-click "play button" in PyCharm, the IDE can be configured to do
+this in two clicks.
+
+1. Add Breeze as an "External Tool"
+   a. File > Settings > Tools > External Tools
+   b. Click the little plus symbol to open the "Create Tool" popup and fill it out:
+
+.. image:: images/pycharm/pycharm_create_tool.png
+    :align: center
+    :alt: Installing Python extension
+
+2. Add the tool to the context menu
+   a. File > Settings > Appearance and Behavior > Menus and Toolbars > Project View Popup Menu
+   b. Click on the list of entries where you would like it to be added. Right above or below
+      "Project View Popup Menu Run Group" may be a good choice, you can drag and drop this list
+      to rearrange the placement later.
+   c. Click the little plus at the top of the popup window
+   d. Find your "External Tool" in the new "Choose Actions to Add" popup and click OK. If you
+      followed the image above, it will be at External Tools > External Tools > Breeze

Review Comment:
```suggestion
   a. Navigate to File > Settings > Appearance and Behavior > Menus and Toolbars > Project View Popup Menu.
   b. Click on the list of entries where you would like it to be added. Right above or below
      "Project View Popup Menu Run Group" may be a good choice, you can drag and drop this list
      to rearrange the placement later.
   c. Click the little plus at the top of the popup window.
   d. Find your "External Tool" in the new "Choose Actions to Add" popup and click OK. If you
      followed the image above, it will be at External Tools > External Tools > Breeze.
```

Maybe unify the style of bullet items?
[airflow] branch main updated (eba04d7c40 -> bad875b58d)
This is an automated email from the ASF dual-hosted git repository.

ephraimanierobi pushed a change to branch main
in repository https://gitbox.apache.org/repos/asf/airflow.git

    from eba04d7c40  tests: always cleanup registered test listeners (#27896)
     add bad875b58d  Only get changelog for core commits (#27900)

No new revisions were added by this update.

Summary of changes:
 dev/airflow-github | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)
[GitHub] [airflow] ephraimbuddy merged pull request #27900: Only get changelog for core commits
ephraimbuddy merged PR #27900: URL: https://github.com/apache/airflow/pull/27900
[GitHub] [airflow] pierrejeambrun commented on pull request #27805: Automatically save and allow restore of recent DAG run configs
pierrejeambrun commented on PR #27805: URL: https://github.com/apache/airflow/pull/27805#issuecomment-1326861132

@aaronabraham311 There are many examples of querying resources from the DB in the views.py file. In this case DagRun should have what you need. In the example you mentioned above (the JSON config displayed in the DAG run details), it's coming from the grid_data view of that file :)
[GitHub] [airflow] ferruzzi opened a new pull request, #27901: Add information on how to run tests in Breeze via the PyCharm IDE
ferruzzi opened a new pull request, #27901: URL: https://github.com/apache/airflow/pull/27901

How to add a context menu entry in PyCharm to run selected unit tests in the Breeze environment instead of in your working venv. Also moved the two existing PyCharm-specific images into a new subdirectory for organizational reasons.
[GitHub] [airflow] ferruzzi commented on a diff in pull request #27823: Amazon Provider Package user agent
ferruzzi commented on code in PR #27823: URL: https://github.com/apache/airflow/pull/27823#discussion_r1031869533

## airflow/providers/amazon/aws/hooks/base_aws.py: ##

@@ -42,11 +46,13 @@
 from dateutil.tz import tzlocal
 from slugify import slugify

+from airflow import __version__ as airflow_version

Review Comment: I think I addressed this in https://github.com/apache/airflow/pull/27823/commits/a5fc3bc4a39855b9f3c7fc5ac26505709d191a98 by moving it to a local import in the helper method.
[GitHub] [airflow] ferruzzi commented on a diff in pull request #27823: Amazon Provider Package user agent
ferruzzi commented on code in PR #27823: URL: https://github.com/apache/airflow/pull/27823#discussion_r1031868722

## airflow/providers/amazon/aws/hooks/base_aws.py: ##

@@ -405,9 +411,68 @@ def __init__(
         self.resource_type = resource_type
         self._region_name = region_name
-        self._config = config
+        self._config = config or botocore.config.Config()
         self._verify = verify

+    @classmethod
+    def _get_provider_version(cls) -> str:
+        """Checks the Providers Manager for the package version."""
+        manager = ProvidersManager()
+        provider_name = manager.hooks[cls.conn_type].package_name  # type: ignore[union-attr]

Review Comment: Sorry for the delay, just got back from vacation and it took a little longer to get back into gear. I ended up wrapping it in a try/except as mentioned and dropped the `if not hook`. If `hook` is falsy, then it'll error out on the next line at `hook.package_name` anyway and get caught by the `except`. Addressed in https://github.com/apache/airflow/pull/27823/commits/a5fc3bc4a39855b9f3c7fc5ac26505709d191a98
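The fallback pattern ferruzzi describes — let any lookup failure (including a falsy hook entry, which errors on the attribute access) degrade to a placeholder instead of breaking hook construction — might look roughly like this. Names are illustrative; the real code consults Airflow's ProvidersManager:

```python
def get_provider_version(registry, conn_type):
    """Best-effort version lookup (hypothetical sketch).

    Any failure -- unknown conn_type, falsy entry, missing key -- degrades
    to a placeholder string rather than raising, because a broken version
    lookup should never prevent the hook from being constructed.
    """
    try:
        return registry[conn_type]["version"]
    except Exception:
        return "unknown"
```

Since a falsy or malformed entry fails inside the `try` just like a missing one, a separate `if not hook` guard adds nothing, which is the point made in the comment above.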
[GitHub] [airflow] vandonr-amz opened a new pull request, #27899: fix sagemaker system test to run on Apple Silicon
vandonr-amz opened a new pull request, #27899: URL: https://github.com/apache/airflow/pull/27899

This test was failing when launched from an M1 Mac because the docker image was built for the local CPU type (arm64) and then uploaded to an amd64 Linux host, which didn't work. `--platform` is a buildx flag, but Breeze already replaces the default `docker build` with buildx, so this works as-is. Tested on an M1 MacBook Pro and on an EC2 Ubuntu instance.
[airflow] branch v2-5-test updated (523df868dd -> 9bae336e69)
This is an automated email from the ASF dual-hosted git repository.

ephraimanierobi pushed a change to branch v2-5-test
in repository https://gitbox.apache.org/repos/asf/airflow.git

 omit 523df868dd  Add release notes
  add 9bae336e69  Add release notes

This update added new revisions after undoing existing revisions. That is to say, some revisions that were in the old version of the branch are not in the new version. This situation occurs when a user --force pushes a change and generates a repository containing something like this:

 * -- * -- B -- O -- O -- O   (523df868dd)
            \
             N -- N -- N      refs/heads/v2-5-test (9bae336e69)

You should already have received notification emails for all of the O revisions, and so the following emails describe only the N revisions from the common base, B.

Any revisions marked "omit" are not gone; other references still refer to them. Any revisions marked "discard" are gone forever.

No new revisions were added by this update.

Summary of changes:
 RELEASE_NOTES.rst | 19 ---
 1 file changed, 4 insertions(+), 15 deletions(-)
[GitHub] [airflow] vincbeck commented on a diff in pull request #27820: Add retry option in RedshiftDeleteClusterOperator to retry when an operation is running in the cluster
vincbeck commented on code in PR #27820: URL: https://github.com/apache/airflow/pull/27820#discussion_r1031804824

## airflow/providers/amazon/aws/operators/redshift_cluster.py: ##

@@ -498,22 +502,38 @@ def __init__(
         wait_for_completion: bool = True,
         aws_conn_id: str = "aws_default",
         poll_interval: float = 30.0,
+        retry: bool = False,
+        retry_attempts: int = 10,

Review Comment: Should be good now @eladkal
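A bounded-retry wrapper along the lines of the `retry`/`retry_attempts` parameters in the diff above could be sketched as follows (a hypothetical helper, not the operator's real implementation):

```python
import time


def call_with_retry(fn, retry=False, retry_attempts=10, delay=0.0):
    """Call fn; when retry is enabled, swallow failures and try again,
    up to retry_attempts total attempts, before re-raising the last error.

    With retry=False this degrades to a single plain call, matching the
    parameter defaults in the diff.
    """
    attempts = retry_attempts if retry else 1
    last_exc = None
    for _ in range(attempts):
        try:
            return fn()
        except Exception as exc:
            last_exc = exc
            time.sleep(delay)  # back off between attempts
    raise last_exc
```

This is the general shape used to ride out transient "operation in progress" errors such as the Redshift case the PR addresses.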
[GitHub] [airflow] ajaykarthick27 commented on issue #27890: SFTP Sensor is not working with File Pattern Parameter
ajaykarthick27 commented on issue #27890: URL: https://github.com/apache/airflow/issues/27890#issuecomment-1326810050 yes did not notice. I will close this issue.
[GitHub] [airflow] ajaykarthick27 closed issue #27890: SFTP Sensor is not working with File Pattern Parameter
ajaykarthick27 closed issue #27890: SFTP Sensor is not working with File Pattern Parameter URL: https://github.com/apache/airflow/issues/27890
[GitHub] [airflow] eladkal commented on issue #27890: SFTP Sensor is not working with File Pattern Parameter
eladkal commented on issue #27890: URL: https://github.com/apache/airflow/issues/27890#issuecomment-1326807715 Duplicate of https://github.com/apache/airflow/issues/27418 ?
[GitHub] [airflow] vandonr-amz commented on a diff in pull request #27786: Add operators + sensor for aws sagemaker pipelines
vandonr-amz commented on code in PR #27786: URL: https://github.com/apache/airflow/pull/27786#discussion_r1031797200 ## airflow/providers/amazon/aws/hooks/sagemaker.py: ## @@ -647,28 +649,28 @@ def describe_endpoint(self, name: str) -> dict: def check_status( self, -job_name: str, +resource_name: str, Review Comment: Ah, that's a good point... I can keep it as `job_name` to avoid that; there is no strong need to rename it. I can also add a small comment noting that it can be used to check more than jobs. To be honest, I think this method should have been private: it's mostly a helper, only used in `wait_for_completion` cases, but now it's too late to change that.
[GitHub] [airflow] xlanor commented on pull request #27898: fix: current_state method on TaskInstance doesn't filter by map_index
xlanor commented on PR #27898: URL: https://github.com/apache/airflow/pull/27898#issuecomment-1326802435 Currently I'm still trying to figure out how to run the unit tests as I'm fairly new to this code base, opening a PR first for CI.
[GitHub] [airflow] boring-cyborg[bot] commented on pull request #27898: fix: current_state method on TaskInstance doesn't filter by map_index
boring-cyborg[bot] commented on PR #27898: URL: https://github.com/apache/airflow/pull/27898#issuecomment-1326802257 Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about anything, please check our Contribution Guide (https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst) Here are some useful points: - Pay attention to the quality of your code (flake8, mypy and type annotations). Our [pre-commits]( https://github.com/apache/airflow/blob/main/STATIC_CODE_CHECKS.rst#prerequisites-for-pre-commit-hooks) will help you with that. - In case of a new feature, add useful documentation (in docstrings or in the `docs/` directory). Adding a new operator? Check this short [guide](https://github.com/apache/airflow/blob/main/docs/apache-airflow/howto/custom-operator.rst) and consider adding an example DAG that shows how users should use it. - Consider using the [Breeze environment](https://github.com/apache/airflow/blob/main/BREEZE.rst) for testing locally; it's a heavy Docker setup, but it ships with a working Airflow and a lot of integrations. - Be patient and persistent. It might take some time to get a review or the final approval from Committers. - Please follow the [ASF Code of Conduct](https://www.apache.org/foundation/policies/conduct) for all communication, including (but not limited to) comments on Pull Requests, the Mailing list and Slack. - Be sure to read the [Airflow Coding style]( https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst#coding-style-and-best-practices). Apache Airflow is a community-driven project and together we are making it better 🚀. In case of doubts contact the developers at: Mailing List: d...@airflow.apache.org Slack: https://s.apache.org/airflow-slack
[GitHub] [airflow] xlanor opened a new pull request, #27898: fix: current_state method on TaskInstance doesn't filter by map_index
xlanor opened a new pull request, #27898: URL: https://github.com/apache/airflow/pull/27898 Signed-off-by: xlanor Fixes #27864 --- The current_state method on TaskInstance doesn't filter by map_index, so calling this method on a mapped task instance fails. Read the **[Pull Request Guidelines](https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst#pull-request-guidelines)** for more information. In case of fundamental code changes, an Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvement+Proposals)) is needed. In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x). In case of backwards incompatible changes please leave a note in a newsfragment file, named `{pr_number}.significant.rst` or `{issue_number}.significant.rst`, in [newsfragments](https://github.com/apache/airflow/tree/main/newsfragments).
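Editor's note: a plain-Python sketch of the bug described in this PR, using toy stand-ins rather than the actual Airflow models or session code (all names here are illustrative). A query that omits `map_index` matches every expanded instance of a mapped task and is therefore ambiguous; filtering on `map_index` as well pins down one row:

```python
from dataclasses import dataclass


@dataclass
class TI:  # toy stand-in for a TaskInstance row
    dag_id: str
    task_id: str
    run_id: str
    map_index: int
    state: str


ROWS = [
    TI("d", "t", "r", 0, "success"),
    TI("d", "t", "r", 1, "failed"),
]


def states_without_map_index(dag_id, task_id, run_id):
    # old behaviour: map_index is not part of the filter, so every expanded
    # instance of a mapped task matches and the result is ambiguous
    return [ti.state for ti in ROWS
            if (ti.dag_id, ti.task_id, ti.run_id) == (dag_id, task_id, run_id)]


def state_with_map_index(dag_id, task_id, run_id, map_index):
    # fixed behaviour: map_index is part of the filter, at most one row matches
    matches = [ti.state for ti in ROWS
               if (ti.dag_id, ti.task_id, ti.run_id, ti.map_index)
               == (dag_id, task_id, run_id, map_index)]
    return matches[0] if matches else None


print(states_without_map_index("d", "t", "r"))  # ['success', 'failed'] -> ambiguous
print(state_with_map_index("d", "t", "r", 1))   # failed
```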
[GitHub] [airflow] syedahsn closed pull request #27873: EMR Notebook Execution Sensor
syedahsn closed pull request #27873: EMR Notebook Execution Sensor URL: https://github.com/apache/airflow/pull/27873
[GitHub] [airflow] aaronabraham311 commented on pull request #27805: Automatically save and allow restore of recent DAG run configs
aaronabraham311 commented on PR #27805: URL: https://github.com/apache/airflow/pull/27805#issuecomment-1326792202 @pierrejeambrun Oh that's great! Is there any example on how to access the DagRun configs from the db? Or is there a code snippet that we can use as an example?
[GitHub] [airflow] pierrejeambrun commented on pull request #27805: Automatically save and allow restore of recent DAG run configs
pierrejeambrun commented on PR #27805: URL: https://github.com/apache/airflow/pull/27805#issuecomment-1326777609 Now that you mention it, DagRun confs are already stored and available for each run. Isn't it easier to just retrieve them and provide them to the `trigger.html` template directly? We also have a lot of control over which confs should be retrieved (the 5 most recent confs for a specific dag, etc.)
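Editor's note: a minimal sketch of the idea above in plain Python, not the actual DagRun model or a SQLAlchemy session (all names here are illustrative). The point is just the query shape: take the most recent non-empty confs for one dag and hand those to the trigger form:

```python
from datetime import datetime

# stand-ins for DagRun rows: (dag_id, execution_date, conf)
runs = [
    ("my_dag", datetime(2022, 11, 20), {"env": "dev"}),
    ("my_dag", datetime(2022, 11, 22), {}),           # empty conf, skipped
    ("my_dag", datetime(2022, 11, 24), {"env": "prod"}),
    ("other",  datetime(2022, 11, 23), {"x": 1}),
]


def recent_confs(dag_id, limit=5):
    # filter to this dag and drop empty confs, newest first, capped at `limit`
    matching = [r for r in runs if r[0] == dag_id and r[2]]
    matching.sort(key=lambda r: r[1], reverse=True)
    return [conf for _, _, conf in matching[:limit]]


print(recent_confs("my_dag"))  # [{'env': 'prod'}, {'env': 'dev'}]
```

In Airflow itself the equivalent would be a SQLAlchemy query against the DagRun table, ordered by execution date and limited, with the resulting confs passed into the template context.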
[GitHub] [airflow] george-zubrienko commented on issue #27838: apache-airflow-providers-common-sql==1.3.0 breaks BigQuery operators
george-zubrienko commented on issue #27838: URL: https://github.com/apache/airflow/issues/27838#issuecomment-1326771187 @potiuk I usually pin versions (`==1.2.3`) of providers that ship a lot of dependencies to what is shown in the official image by running `pip show `, and only upgrade if it was upgraded in the next release. Also, we don't upgrade to every release right away, so the snippets I posted were for the 2.4.1 version, where we did some dependency shuffling (no version bumps, a simple `poetry update`), and then I saw errors popping up on the test env after the new image was deployed. The reason we use Poetry is to resolve potential incompatibilities between our own libraries and Airflow dependencies. For some providers - like datadog in the example above - it is more or less safe to only lock major and minor with `~`, but for google, after running into problems with the protobuf library upgrade, I learned it is safer to pin dependencies.
[airflow] branch constraints-2-5 updated: Updating constraints. Build id:
This is an automated email from the ASF dual-hosted git repository. github-bot pushed a commit to branch constraints-2-5 in repository https://gitbox.apache.org/repos/asf/airflow.git The following commit(s) were added to refs/heads/constraints-2-5 by this push: new d6aeef755f Updating constraints. Build id: d6aeef755f is described below commit d6aeef755fb797f01728fe3fb552c22dade5e087 Author: Automated GitHub Actions commit AuthorDate: Thu Nov 24 18:46:58 2022 + Updating constraints. Build id: This update in constraints is automatically committed by the CI 'constraints-push' step based on HEAD of '' in '' with commit sha . All tests passed in this build so we determined we can push the updated constraints. See https://github.com/apache/airflow/blob/main/README.md#installing-from-pypi for details. --- constraints-3.10.txt | 28 ++-- constraints-3.7.txt | 20 ++-- constraints-3.8.txt | 28 ++-- constraints-3.9.txt | 28 ++-- constraints-no-providers-3.10.txt | 6 +++--- constraints-no-providers-3.7.txt | 6 +++--- constraints-no-providers-3.8.txt | 8 constraints-no-providers-3.9.txt | 8 constraints-source-providers-3.10.txt | 28 ++-- constraints-source-providers-3.7.txt | 20 ++-- constraints-source-providers-3.8.txt | 28 ++-- constraints-source-providers-3.9.txt | 28 ++-- 12 files changed, 118 insertions(+), 118 deletions(-) diff --git a/constraints-3.10.txt b/constraints-3.10.txt index 4bea7fd639..61f49127f9 100644 --- a/constraints-3.10.txt +++ b/constraints-3.10.txt @@ -1,6 +1,6 @@ # -# This constraints file was automatically generated on 2022-11-23T11:28:48Z -# via "eager-upgrade" mechanism of PIP. For the "main" branch of Airflow. +# This constraints file was automatically generated on 2022-11-24T18:46:37Z +# via "eager-upgrade" mechanism of PIP. For the "v2-5-test" branch of Airflow. 
# This variant of constraints install uses the HEAD of the branch version for 'apache-airflow' but installs # the providers from PIP-released packages at the moment of the constraint generation. # @@ -172,9 +172,9 @@ billiard==3.6.4.0 black==22.10.0 bleach==5.0.1 blinker==1.5 -boto3==1.26.15 +boto3==1.26.16 boto==2.49.0 -botocore==1.29.15 +botocore==1.29.16 bowler==0.9.0 cachelib==0.9.0 cachetools==4.2.2 @@ -239,7 +239,7 @@ fastjsonschema==2.16.2 filelock==3.8.0 fissix==21.11.13 flake8-colors==0.1.9 -flake8==5.0.4 +flake8==6.0.0 flake8_implicit_str_concat==0.3.0 flaky==3.7.0 flower==1.2.0 @@ -253,7 +253,7 @@ gcloud-aio-storage==7.0.1 gcsfs==2022.11.0 geomet==0.2.1.post1 gevent==22.10.2 -gitdb==4.0.9 +gitdb==4.0.10 google-ads==18.0.0 google-api-core==2.8.2 google-api-python-client==1.12.11 @@ -322,7 +322,7 @@ identify==2.5.9 idna==3.4 ijson==3.1.4 imagesize==1.4.1 -importlib-metadata==5.0.0 +importlib-metadata==5.1.0 incremental==22.10.0 inflection==0.5.1 influxdb-client==1.34.0 @@ -447,7 +447,7 @@ pyOpenSSL==22.0.0 pyarrow==9.0.0 pyasn1-modules==0.2.8 pyasn1==0.4.8 -pycodestyle==2.9.1 +pycodestyle==2.10.0 pycountry==22.3.5 pycparser==2.21 pycryptodome==3.15.0 @@ -457,7 +457,7 @@ pydot==1.4.2 pydruid==0.6.5 pyenchant==3.2.2 pyexasol==0.25.1 -pyflakes==2.5.0 +pyflakes==3.0.1 pygraphviz==1.10 pyhcl==0.4.4 pykerberos==1.2.4 @@ -581,20 +581,20 @@ types-Deprecated==1.2.9 types-Markdown==3.4.2.1 types-PyMySQL==1.0.19.1 types-PyYAML==6.0.12.2 -types-boto==2.49.18.2 +types-boto==2.49.18.3 types-certifi==2021.10.8.3 types-croniter==1.3.2 types-cryptography==3.3.23.2 types-docutils==0.19.1.1 types-freezegun==1.1.10 types-paramiko==2.12.0.1 -types-protobuf==3.20.4.5 +types-protobuf==3.20.4.6 types-python-dateutil==2.8.19.4 types-python-slugify==7.0.0.1 types-pytz==2022.6.0.1 -types-redis==4.3.21.4 +types-redis==4.3.21.5 types-requests==2.28.11.5 -types-setuptools==65.6.0.0 +types-setuptools==65.6.0.1 types-tabulate==0.9.0.0 types-termcolor==1.1.6 types-toml==0.10.8.1 @@ -606,7 
+606,7 @@ uamqp==1.6.3 uc-micro-py==1.0.1 unicodecsv==0.14.1 uritemplate==3.0.1 -urllib3==1.26.12 +urllib3==1.26.13 userpath==1.8.0 vertica-python==1.1.1 vine==5.0.0 diff --git a/constraints-3.7.txt b/constraints-3.7.txt index f0b6e98172..a6d25e0c48 100644 --- a/constraints-3.7.txt +++ b/constraints-3.7.txt @@ -1,6 +1,6 @@ # -# This constraints file was automatically generated on 2022-11-23T11:29:26Z -# via "eager-upgrade" mechanism of PIP. For the "main" branch of Airflow. +# This constraints file was automatically generated on 2022-11-24T18:46:56Z +# via "eager-upgrade" mechanism of PIP. For the "v2-5-test" branch of Airflow. # This variant of constraints install uses the HEAD of the branch version for 'apache-airflow' but installs # the provi
[GitHub] [airflow] Taragolis commented on a diff in pull request #27786: Add operators + sensor for aws sagemaker pipelines
Taragolis commented on code in PR #27786: URL: https://github.com/apache/airflow/pull/27786#discussion_r1031770958 ## airflow/providers/amazon/aws/hooks/sagemaker.py: ## @@ -647,28 +649,28 @@ def describe_endpoint(self, name: str) -> dict: def check_status( self, -job_name: str, +resource_name: str, Review Comment: I'm really waiting for the Python 3.7 EOL date and positional-only arguments
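Editor's note: for readers unfamiliar with the reference, Python 3.8 added positional-only parameters (everything before `/` in a signature cannot be passed by keyword), which is what would let a hook rename a parameter like this without breaking keyword callers. A toy sketch, not the real hook:

```python
def check_status(resource_name, /, key=None, check_interval=30):
    # resource_name is positional-only: callers cannot bind it by keyword,
    # so renaming it later is not a breaking change for any caller
    return (resource_name, key, check_interval)


print(check_status("foo-bar", key="spam"))  # ('foo-bar', 'spam', 30)

try:
    check_status(resource_name="foo-bar")   # rejected: positional-only
except TypeError as exc:
    print(type(exc).__name__)               # TypeError
```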
[GitHub] [airflow] Taragolis commented on a diff in pull request #27786: Add operators + sensor for aws sagemaker pipelines
Taragolis commented on code in PR #27786: URL: https://github.com/apache/airflow/pull/27786#discussion_r1031764368 ## airflow/providers/amazon/aws/hooks/sagemaker.py: ## @@ -647,28 +649,28 @@ def describe_endpoint(self, name: str) -> dict: def check_status( self, -job_name: str, +resource_name: str, Review Comment: I'm just wondering whether these changes should be classified as breaking changes or not. There is a small chance that a user calls this public method in their code and passes the arguments as keywords:

```python
SageMakerHook.check_status(
    job_name="foo-bar",
    key="spam",
    describe_function=some_callable,
    check_interval=42,
)
```

After this change they would get: `TypeError: got an unexpected keyword argument 'job_name'`
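Editor's note: the concern can be reproduced in miniature. The classes below are toy stand-ins for the hook before and after the rename, not the real `SageMakerHook`:

```python
class OldHook:
    def check_status(self, job_name, check_interval=30):
        return job_name


class NewHook:  # same method after the job_name -> resource_name rename
    def check_status(self, resource_name, check_interval=30):
        return resource_name


print(OldHook().check_status(job_name="foo-bar"))  # worked before the rename

try:
    NewHook().check_status(job_name="foo-bar")     # keyword callers now break
except TypeError as exc:
    print(type(exc).__name__)                      # TypeError

print(NewHook().check_status("foo-bar"))           # positional callers are fine
```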
[GitHub] [airflow] ZhangCreations commented on issue #24988: sendgrid as a data source provider? #24815
ZhangCreations commented on issue #24988: URL: https://github.com/apache/airflow/issues/24988#issuecomment-1326760622 Just to double check my understanding of the release cycle as outlined [here](https://github.com/apache/airflow#release-process-for-providers). If I implement this provider, does that require me to continue to maintain the versioning of this provider indefinitely?
[GitHub] [airflow] syedahsn commented on a diff in pull request #27893: AWSGlueJobHook updates job configuration if it exists
syedahsn commented on code in PR #27893: URL: https://github.com/apache/airflow/pull/27893#discussion_r1031764389 ## airflow/providers/amazon/aws/hooks/glue.py: ## @@ -92,10 +93,51 @@ def __init__( kwargs["client_type"] = "glue" super().__init__(*args, **kwargs) +def create_glue_job_config(self) -> dict: +if self.s3_bucket is None: +raise AirflowException("Could not initialize glue job, error: Specify Parameter `s3_bucket`") + +default_command = { +"Name": "glueetl", +"ScriptLocation": self.script_location, +} +command = self.create_job_kwargs.pop("Command", default_command) + +s3_log_path = f"s3://{self.s3_bucket}/{self.s3_glue_logs}{self.job_name}" +execution_role = self.get_iam_execution_role() + +if "WorkerType" in self.create_job_kwargs and "NumberOfWorkers" in self.create_job_kwargs: +return dict( Review Comment: This part can be refactored to be a bit more concise. Rather than having two return statements that build very similar dictionaries, something like this would be cleaner (note the negated condition, which preserves the current behaviour of adding `MaxCapacity` only when `WorkerType`/`NumberOfWorkers` are not supplied):

```python
ret_config = {
    "Name": self.job_name,
    "Description": self.desc,
    "LogUri": s3_log_path,
    "Role": execution_role["Role"]["Arn"],
    "ExecutionProperty": {"MaxConcurrentRuns": self.concurrent_run_limit},
    "Command": command,
    "MaxRetries": self.retry_limit,
    **self.create_job_kwargs,
}
if "WorkerType" not in self.create_job_kwargs or "NumberOfWorkers" not in self.create_job_kwargs:
    ret_config["MaxCapacity"] = self.num_of_dpus
return ret_config
```

Also, it's [generally preferable](https://stackoverflow.com/questions/2853683/what-is-the-preferred-syntax-for-initializing-a-dict-curly-brace-literals-or/2853738#2853738) to use `{}` rather than the `dict()` function
[airflow] branch constraints-main updated: Updating constraints. Build id:
This is an automated email from the ASF dual-hosted git repository. github-bot pushed a commit to branch constraints-main in repository https://gitbox.apache.org/repos/asf/airflow.git The following commit(s) were added to refs/heads/constraints-main by this push: new 61e50acc6a Updating constraints. Build id: 61e50acc6a is described below commit 61e50acc6ac4ab9ed867024e01403171060b6827 Author: Automated GitHub Actions commit AuthorDate: Thu Nov 24 18:23:24 2022 + Updating constraints. Build id: This update in constraints is automatically committed by the CI 'constraints-push' step based on HEAD of '' in '' with commit sha . All tests passed in this build so we determined we can push the updated constraints. See https://github.com/apache/airflow/blob/main/README.md#installing-from-pypi for details. --- constraints-3.10.txt | 8 constraints-3.7.txt | 4 ++-- constraints-3.8.txt | 8 constraints-3.9.txt | 8 constraints-no-providers-3.10.txt | 2 +- constraints-no-providers-3.7.txt | 2 +- constraints-no-providers-3.8.txt | 4 ++-- constraints-no-providers-3.9.txt | 4 ++-- constraints-source-providers-3.10.txt | 8 constraints-source-providers-3.7.txt | 4 ++-- constraints-source-providers-3.8.txt | 8 constraints-source-providers-3.9.txt | 8 12 files changed, 34 insertions(+), 34 deletions(-) diff --git a/constraints-3.10.txt b/constraints-3.10.txt index 9cba609a82..ea2c4d025d 100644 --- a/constraints-3.10.txt +++ b/constraints-3.10.txt @@ -1,5 +1,5 @@ # -# This constraints file was automatically generated on 2022-11-24T00:13:51Z +# This constraints file was automatically generated on 2022-11-24T18:22:43Z # via "eager-upgrade" mechanism of PIP. For the "main" branch of Airflow. # This variant of constraints install uses the HEAD of the branch version for 'apache-airflow' but installs # the providers from PIP-released packages at the moment of the constraint generation. 
@@ -253,7 +253,7 @@ gcloud-aio-storage==7.0.1 gcsfs==2022.11.0 geomet==0.2.1.post1 gevent==22.10.2 -gitdb==4.0.9 +gitdb==4.0.10 google-ads==18.0.0 google-api-core==2.8.2 google-api-python-client==1.12.11 @@ -322,7 +322,7 @@ identify==2.5.9 idna==3.4 ijson==3.1.4 imagesize==1.4.1 -importlib-metadata==5.0.0 +importlib-metadata==5.1.0 incremental==22.10.0 inflection==0.5.1 influxdb-client==1.34.0 @@ -457,7 +457,7 @@ pydot==1.4.2 pydruid==0.6.5 pyenchant==3.2.2 pyexasol==0.25.1 -pyflakes==3.0.0 +pyflakes==3.0.1 pygraphviz==1.10 pyhcl==0.4.4 pykerberos==1.2.4 diff --git a/constraints-3.7.txt b/constraints-3.7.txt index 6953da5a27..d6a519d406 100644 --- a/constraints-3.7.txt +++ b/constraints-3.7.txt @@ -1,5 +1,5 @@ # -# This constraints file was automatically generated on 2022-11-24T00:14:29Z +# This constraints file was automatically generated on 2022-11-24T18:23:21Z # via "eager-upgrade" mechanism of PIP. For the "main" branch of Airflow. # This variant of constraints install uses the HEAD of the branch version for 'apache-airflow' but installs # the providers from PIP-released packages at the moment of the constraint generation. @@ -253,7 +253,7 @@ gcloud-aio-storage==7.0.1 gcsfs==2022.11.0 geomet==0.2.1.post1 gevent==22.10.2 -gitdb==4.0.9 +gitdb==4.0.10 google-ads==18.0.0 google-api-core==2.8.2 google-api-python-client==1.12.11 diff --git a/constraints-3.8.txt b/constraints-3.8.txt index 8213266a0c..5d10586a80 100644 --- a/constraints-3.8.txt +++ b/constraints-3.8.txt @@ -1,5 +1,5 @@ # -# This constraints file was automatically generated on 2022-11-24T00:14:21Z +# This constraints file was automatically generated on 2022-11-24T18:23:13Z # via "eager-upgrade" mechanism of PIP. For the "main" branch of Airflow. # This variant of constraints install uses the HEAD of the branch version for 'apache-airflow' but installs # the providers from PIP-released packages at the moment of the constraint generation. 
@@ -254,7 +254,7 @@ gcloud-aio-storage==7.0.1 gcsfs==2022.11.0 geomet==0.2.1.post1 gevent==22.10.2 -gitdb==4.0.9 +gitdb==4.0.10 google-ads==18.0.0 google-api-core==2.8.2 google-api-python-client==1.12.11 @@ -323,7 +323,7 @@ identify==2.5.9 idna==3.4 ijson==3.1.4 imagesize==1.4.1 -importlib-metadata==5.0.0 +importlib-metadata==5.1.0 importlib-resources==5.10.0 incremental==22.10.0 inflection==0.5.1 @@ -460,7 +460,7 @@ pydot==1.4.2 pydruid==0.6.5 pyenchant==3.2.2 pyexasol==0.25.1 -pyflakes==3.0.0 +pyflakes==3.0.1 pygraphviz==1.10 pyhcl==0.4.4 pykerberos==1.2.4 diff --git a/constraints-3.9.txt b/constraints-3.9.txt index 05309e3648..3fe64b8aad 100644 --- a/constraints-3.9.txt +++ b/constraints-3.9.txt @@ -1,5 +1,5 @@ # -# This constraints file was automatically generated on 2022-11-24T00:14:18Z +# This constraints file was automatically generated on 2022-11-24T18:23:10Z # via "eager-up
[GitHub] [airflow] alexott commented on issue #27894: Status of testing Providers that were prepared on November 24, 2022
alexott commented on issue #27894: URL: https://github.com/apache/airflow/issues/27894#issuecomment-1326755638 Opened #27897 - tested all file formats
[GitHub] [airflow] alexott opened a new pull request, #27897: Additional fix for writing output in DatabricksSqlOperator
alexott opened a new pull request, #27897: URL: https://github.com/apache/airflow/pull/27897 This PR adds a fix for writing results in DatabricksSqlOperator
[GitHub] [airflow] ephraimbuddy commented on a diff in pull request #27895: Sync v2-5-stable with v2-5-test to release 2.5.0
ephraimbuddy commented on code in PR #27895: URL: https://github.com/apache/airflow/pull/27895#discussion_r1031752268 ## RELEASE_NOTES.rst: ## @@ -21,6 +21,279 @@ .. towncrier release notes start +Airflow 2.5.0 (2022-11-28) +-- + +Significant Changes +^^^ + +- ``airflow dags test`` no longer performs a backfill job. + + In order to make ``airflow dags test`` more useful as a testing and debugging tool, we no + longer run a backfill job and instead run a "local task runner". Users can still backfill + their DAGs using the ``airflow dags backfill`` command. (#26400) +- Airflow config section ``kubernetes`` renamed to ``kubernetes_executor`` + + KubernetesPodOperator no longer considers any core kubernetes config params, so this section now only applies to kubernetes executor. Renaming it reduces potential for confusion. (#26873) +- ``ExternalTaskSensor`` no longer hangs indefinitely when ``failed_states`` is set, an ``execute_date_fn`` is used, and some but not all of the dependent tasks fail. Instead, an ``AirflowException`` is thrown as soon as any of the dependent tasks fail. + + Any code handling this failure in addition to timeouts should move to catching the ``AirflowException`` baseclass and not only the ``AirflowSensorTimeout`` subclass. 
(#27190) + +New Features + +- ``TaskRunner``: notify of component start and finish (#27855) +- Add DagRun state change to the Listener plugin system(#27113) +- Metric for raw task return codes (#27155) +- Add logic for XComArg to pull specific map indexes (#27771) +- Clear TaskGroup (#26658) +- Add critical section query duration metric (#27700) +- Add: #23880 :: Audit log for ``AirflowModelViews(Variables/Connection)`` (#24079) +- Add postgres 15 support (#27444) +- Expand tasks in mapped group at run time (#27491) +- reset commits, clean submodules (#27560) +- scheduler_job, add metric for scheduler loop timer (#27605) +- Allow datasets to be used in taskflow (#27540) +- Add expanded_ti_count to ti context (#27680) +- Add user comment to task instance and dag run (#26457, #27849, #27867) +- Enable copying DagRun JSON to clipboard (#27639) +- Implement extra controls for SLAs (#27557) +- add dag parsed time in DAG view (#27573) +- Add max_wait for exponential_backoff in BaseSensor (#27597) +- Expand tasks in mapped group at parse time (#27158) +- Add disable retry flag on backfill (#23829) +- Adding sensor decorator (#22562) +- Api endpoint update ti (#26165) +- Filtering datasets by recent update events (#26942) +- Support Is /not Null filter for value is None on webui (#26584) +- Add search to datasets list (#26893) +- Split out and handle 'params' in mapped operator (#26100) +- Add authoring API for TaskGroup mapping (#26844) +- Add ``one_done`` trigger rule (#26146) +- Create a more efficient airflow dag test command that also has better local logging (#26400) +- Support add/remove permissions to roles commands (#26338) +- Auto tail file logs in Web UI (#26169) +- Add triggerer info to task instance in API (#26249) +- Flag to deserialize value on custom XCom backend (#26343) + +Bug Fixes +^ +- Redirect to home view when there are no valid tags in the URL (#25715) +- Make MappedTaskGroup depend on its expand inputs (#27876) +- Make DagRun state updates for 
paused DAGs faster (#27725) +- Don't explicitly set include_examples to False on task run command (#27813) +- Fix menu border color (#27789) +- Fix backfill queued task getting reset to scheduled state. (#23720) +- Fix clearing child dag mapped tasks from parent dag (#27501) +- Handle json encoding of ``V1Pod`` in task callback (#27609) +- Fix ExternalTaskSensor can't check zipped dag (#27056) +- Avoid re-fetching DAG run in TriggerDagRunOperator (#27635) +- Continue on exception when retrieving metadata (#27665) +- Fix double logging with some task logging handler (#27591) +- External task sensor fail fix (#27190) +- Replace FAB url filtering function with Airflows (#27576) +- Fix mini scheduler expansion of mapped task (#27506) +- Add the default None when pop actions (#27537) +- Display parameter values from serialized dag in trigger dag view. (#27482) +- Fix getting the dag/task ids from base executor (#27550) +- Fix sqlalchemy primary key black-out error on DDRQ (#27538) +- Move TriggerDagRun conf check to execute (#27035) +- SLAMiss is nullable and not always given back when pulling task instances (#27423) +- Fix behavior of ``_`` when searching for DAGs (#27448) +- Add case insensitive constraint to username (#27266) +- Fix python external template keys (#27256) +- Resolve trigger assignment race condition (#27072) +- Update google_analytics.html (#27226) +- Fix IntegrityError during webserver startup (#27297) +- reduce extraneous task log requests (#27233) +- Make RotatingFilehandler used in DagProcessor non-caching (#27223) +- set executor.job_id to BackfillJob.id for backfills (#27020) +- Fix som
[GitHub] [airflow] vandonr-amz commented on pull request #27786: Add operators + sensor for aws sagemaker pipelines
vandonr-amz commented on PR #27786: URL: https://github.com/apache/airflow/pull/27786#issuecomment-1326741439 @Taragolis @potiuk do you think you could review this PR?
[airflow] branch v2-5-test updated (0c2ee0ad95 -> 523df868dd)
This is an automated email from the ASF dual-hosted git repository. ephraimanierobi pushed a change to branch v2-5-test in repository https://gitbox.apache.org/repos/asf/airflow.git omit 0c2ee0ad95 Add release notes omit 59d16b6765 Update version to 2.5.0 add 82b37d3ce5 tests: always cleanup registered test listeners (#27896) add 43c0607590 Update version to 2.5.0 add 523df868dd Add release notes This update added new revisions after undoing existing revisions. That is to say, some revisions that were in the old version of the branch are not in the new version. This situation occurs when a user --force pushes a change and generates a repository containing something like this: * -- * -- B -- O -- O -- O (0c2ee0ad95) \ N -- N -- N refs/heads/v2-5-test (523df868dd) You should already have received notification emails for all of the O revisions, and so the following emails describe only the N revisions from the common base, B. Any revisions marked "omit" are not gone; other references still refer to them. Any revisions marked "discard" are gone forever. No new revisions were added by this update. Summary of changes: README.md | 2 +- RELEASE_NOTES.rst | 121 + docs/spelling_wordlist.txt | 2 + tests/plugins/test_plugins_manager.py | 1 + .../task/task_runner/test_standard_task_runner.py | 2 + 5 files changed, 32 insertions(+), 96 deletions(-)
[GitHub] [airflow] Taragolis commented on a diff in pull request #27893: AWSGlueJobHook updates job configuration if it exists
Taragolis commented on code in PR #27893: URL: https://github.com/apache/airflow/pull/27893#discussion_r1031748050 ## airflow/providers/amazon/aws/hooks/glue.py: ## @@ -92,10 +93,51 @@ def __init__( kwargs["client_type"] = "glue" super().__init__(*args, **kwargs) +def create_glue_job_config(self) -> dict: +if self.s3_bucket is None: +raise AirflowException("Could not initialize glue job, error: Specify Parameter `s3_bucket`") + +default_command = { +"Name": "glueetl", +"ScriptLocation": self.script_location, +} +command = self.create_job_kwargs.pop("Command", default_command) + +s3_log_path = f"s3://{self.s3_bucket}/{self.s3_glue_logs}{self.job_name}" +execution_role = self.get_iam_execution_role() + +if "WorkerType" in self.create_job_kwargs and "NumberOfWorkers" in self.create_job_kwargs: +return dict( +Name=self.job_name, +Description=self.desc, +LogUri=s3_log_path, +Role=execution_role["Role"]["Arn"], +ExecutionProperty={"MaxConcurrentRuns": self.concurrent_run_limit}, +Command=command, +MaxRetries=self.retry_limit, +**self.create_job_kwargs, +) +else: +return dict( +Name=self.job_name, +Description=self.desc, +LogUri=s3_log_path, +Role=execution_role["Role"]["Arn"], +ExecutionProperty={"MaxConcurrentRuns": self.concurrent_run_limit}, +Command=command, +MaxRetries=self.retry_limit, +MaxCapacity=self.num_of_dpus, +**self.create_job_kwargs, +) + +@cached_property +def glue_client(self): +""":return: AWS Glue client""" +return self.get_conn() Review Comment: Small nitpick: `AwsBaseHook` already has two ways to get a `boto3.client`: - AwsBaseHook.get_conn() - AwsBaseHook.conn Both of them are cached, so it might be better not to create a third way?
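Editor's note: the caching point can be illustrated with a toy hook (no AWS involved). `functools.cached_property`, which the `conn` accessor relies on, memoizes the client after the first access, so an extra `glue_client` property adds nothing:

```python
from functools import cached_property


class FakeHook:
    calls = 0  # counts how often a "client" is actually built

    def get_conn(self):
        FakeHook.calls += 1
        return object()  # stand-in for a boto3 client

    @cached_property
    def conn(self):
        return self.get_conn()


hook = FakeHook()
assert hook.conn is hook.conn  # second access hits the cache
print(FakeHook.calls)          # 1 -> get_conn ran only once
```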
[GitHub] [airflow] potiuk commented on issue #27894: Status of testing Providers that were prepared on November 24, 2022
potiuk commented on issue #27894: URL: https://github.com/apache/airflow/issues/27894#issuecomment-1326737975

> In the Databricks SQL operator (and I believe in others as well), the strategy was: always return only the last result - previous results were always discarded. The primary reasons for this were:
>
> * When you have multiple SQL statements, the first ones usually create tables, insert data, etc. Only when you have a select as the last statement do you get results. This matches the logic of SQL's `BATCH` statement
> * When you have multiple SQL statements, their results may have different schemas, but results will be processed only according to the latest schema, not the schemas of the corresponding result sets
>
> We may need to think a bit about it - should we return results for each of the statements, or not. If yes, then we need to return pairs of description + results for each SQL statement, instead of using only the latest statement

Yes - I noticed that too now. With two caveats:

* what the default is depends on the operator (no problem)
* it behaves differently when an "sql" string is passed and return_last is true -> then instead of a one-element result array it returns the results directly

It is surprisingly difficult to unwind the original convoluted behaviour :)
[GitHub] [airflow] syedahsn commented on issue #27894: Status of testing Providers that were prepared on November 24, 2022
syedahsn commented on issue #27894: URL: https://github.com/apache/airflow/issues/27894#issuecomment-1326737941 #27276 works as expected. System tests using those operators are all passing.
[GitHub] [airflow] alexott commented on issue #27894: Status of testing Providers that were prepared on November 24, 2022
alexott commented on issue #27894: URL: https://github.com/apache/airflow/issues/27894#issuecomment-1326727196

In the Databricks SQL operator (and I believe in others as well), the strategy was: always return only the last result - previous results were always discarded. The primary reasons for this were:

* When you have multiple SQL statements, the first ones usually create tables, insert data, etc. Only when you have a select as the last statement do you get results. This matches the logic of SQL's `BATCH` statement
* When you have multiple SQL statements, their results may have different schemas, but results will be processed only according to the latest schema, not the schemas of the corresponding result sets

We may need to think a bit about it - should we return results for each of the statements, or not. If yes, then we need to return pairs of description + results for each SQL statement, instead of using only the latest statement
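A minimal sketch of the alternative described above: collect a (description, results) pair per SQL statement, so result sets with different schemas stay separated, instead of keeping only the last one. `run_statement` here is a hypothetical stand-in for the hook's cursor handling, not a real Airflow API.

```python
def run_statement(sql: str) -> tuple[list[str], list[tuple]]:
    # Hypothetical executor stub: pretend every statement yields a
    # one-column result set whose single row echoes the statement text.
    return (["col"], [(sql,)])


def run_statements(statements: list[str], return_last: bool):
    """Return every (description, rows) pair, or only the last one."""
    outputs: list[tuple[list[str], list[tuple]]] = []
    for sql in statements:
        outputs.append(run_statement(sql))
    if return_last:
        # Old behaviour: discard everything but the final result set.
        return outputs[-1]
    return outputs
```

With `return_last=False` the caller gets one `(description, rows)` pair per statement, which addresses the differing-schemas concern raised above.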
[airflow] branch main updated: tests: always cleanup registered test listeners (#27896)
This is an automated email from the ASF dual-hosted git repository.

ephraimanierobi pushed a commit to branch main in repository https://gitbox.apache.org/repos/asf/airflow.git

The following commit(s) were added to refs/heads/main by this push:
     new eba04d7c40 tests: always cleanup registered test listeners (#27896)

eba04d7c40 is described below

commit eba04d7c400c0d89492d75a7c81d21073933cd0c
Author: Maciej Obuchowski
AuthorDate: Thu Nov 24 18:24:31 2022 +0100

    tests: always cleanup registered test listeners (#27896)

    Signed-off-by: Maciej Obuchowski

---
 tests/plugins/test_plugins_manager.py               | 1 +
 tests/task/task_runner/test_standard_task_runner.py | 2 ++
 2 files changed, 3 insertions(+)

diff --git a/tests/plugins/test_plugins_manager.py b/tests/plugins/test_plugins_manager.py
index 9ed00cae05..9ae6f55b6b 100644
--- a/tests/plugins/test_plugins_manager.py
+++ b/tests/plugins/test_plugins_manager.py
@@ -65,6 +65,7 @@ class AirflowTestOnLoadExceptionPlugin(AirflowPlugin):
 @pytest.fixture(autouse=True, scope="module")
 def clean_plugins():
+    get_listener_manager().clear()
     yield
     get_listener_manager().clear()

diff --git a/tests/task/task_runner/test_standard_task_runner.py b/tests/task/task_runner/test_standard_task_runner.py
index c54a27ae89..797462136a 100644
--- a/tests/task/task_runner/test_standard_task_runner.py
+++ b/tests/task/task_runner/test_standard_task_runner.py
@@ -72,6 +72,7 @@ class TestStandardTaskRunner:
         (as the test environment does not have enough context for the normal way to run)
         and ensures they reset back to normal on the way out.
         """
+        get_listener_manager().clear()
         clear_db_runs()
         dictConfig(LOGGING_CONFIG)
         yield
@@ -79,6 +80,7 @@ class TestStandardTaskRunner:
         airflow_logger.handlers = []
         clear_db_runs()
         dictConfig(DEFAULT_LOGGING_CONFIG)
+        get_listener_manager().clear()

     def test_start_and_terminate(self):
         local_task_job = mock.Mock()
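The commit's pattern can be shown in isolation: clear shared listener state both before and after each test, so a test that leaked listeners earlier in the run cannot poison later ones. `ListenerManager` below is a minimal stand-in for what Airflow's `get_listener_manager()` returns, and the context manager mirrors the `clear(); yield; clear()` shape that the commit puts into pytest fixtures.

```python
from contextlib import contextmanager


class ListenerManager:
    """Minimal stand-in for Airflow's listener manager."""

    def __init__(self):
        self.listeners: list = []

    def add_listener(self, listener):
        self.listeners.append(listener)

    def clear(self):
        self.listeners.clear()


# Module-level shared state, like the real listener manager singleton.
listener_manager = ListenerManager()


@contextmanager
def clean_listeners():
    listener_manager.clear()  # defensive: drop anything a prior test leaked
    try:
        yield listener_manager
    finally:
        listener_manager.clear()  # remove our own registrations on the way out
```

In pytest the same shape becomes an `autouse=True` fixture whose body is `clear(); yield; clear()`, exactly as in the diff above.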
[GitHub] [airflow] potiuk commented on issue #27894: Status of testing Providers that were prepared on November 24, 2022
potiuk commented on issue #27894: URL: https://github.com/apache/airflow/issues/27894#issuecomment-1326710383 Ah, I think I see where I made a wrong assumption, @alexott. Looking at it.
[GitHub] [airflow] ephraimbuddy merged pull request #27896: tests: always cleanup registered test listeners
ephraimbuddy merged PR #27896: URL: https://github.com/apache/airflow/pull/27896
[GitHub] [airflow] potiuk commented on issue #27894: Status of testing Providers that were prepared on November 24, 2022
potiuk commented on issue #27894: URL: https://github.com/apache/airflow/issues/27894#issuecomment-1326699948

> wraps this list into another list. I think that logic should be changed a bit - right now we're collecting results for all SQL statements into a single list although they could have different schemas.

Yeah, the intention of that part is a bit unclear (or maybe I misunderstood it). Would be great if you open a PR indeed.
[GitHub] [airflow] vincbeck commented on pull request #27854: Fix errors in Databricks SQL operator introduced when refactoring
vincbeck commented on PR #27854: URL: https://github.com/apache/airflow/pull/27854#issuecomment-1326685898 Sorry for the late review/reply! LGTM! Thanks @potiuk for the explanations as well
[GitHub] [airflow] jedcunningham commented on a diff in pull request #27895: Sync v2-5-stable with v2-5-test to release 2.5.0
jedcunningham commented on code in PR #27895: URL: https://github.com/apache/airflow/pull/27895#discussion_r1031692178

## RELEASE_NOTES.rst:

```diff
@@ -21,6 +21,279 @@
 .. towncrier release notes start
+Airflow 2.5.0 (2022-11-28)
```

Review Comment:
```suggestion
Airflow 2.5.0 (2022-11-30)
```

## RELEASE_NOTES.rst:

```diff
+Airflow 2.5.0 (2022-11-28)
+--------------------------
+
+Significant Changes
+^^^^^^^^^^^^^^^^^^^
+
+- ``airflow dags test`` no longer performs a backfill job.
+
+  In order to make ``airflow dags test`` more useful as a testing and debugging tool, we no
+  longer run a backfill job and instead run a "local task runner". Users can still backfill
+  their DAGs using the ``airflow dags backfill`` command. (#26400)
+- Airflow config section ``kubernetes`` renamed to ``kubernetes_executor``
```

Review Comment:
```suggestion
Airflow config section ``kubernetes`` renamed to ``kubernetes_executor`` (#26873)
```

## RELEASE_NOTES.rst:

```diff
+- ``airflow dags test`` no longer performs a backfill job.
+
+  In order to make ``airflow dags test`` more useful as a testing and debugging tool, we no
+  longer run a backfill job and instead run a "local task runner". Users can still backfill
+  their DAGs using the ``airflow dags backfill`` command. (#26400)
```

Review Comment:
```suggestion
  their DAGs using the ``airflow dags backfill`` command.
```

## RELEASE_NOTES.rst:

```diff
+- Airflow config section ``kubernetes`` renamed to ``kubernetes_executor``
+
+  KubernetesPodOperator no longer considers any core kubernetes config params, so this section now only applies to kubernetes executor. Renaming it reduces potential for confusion. (#26873)
+- ``ExternalTaskSensor`` no longer hangs indefinitely when ``failed_states`` is set, an ``execute_date_fn`` is used, and some but not all of the dependent tasks fail. Instead, an ``AirflowException`` is thrown as soon as any of the dependent tasks fail.
+
+  Any code handling this failure in addition to timeouts should move to catching the ``AirflowException`` baseclass and not only the ``AirflowSensorTimeout`` subclass. (#27190)
```

Review Comment: This will need a similar change as above, but we probably need a shorter title for this?

## RELEASE_NOTES.rst:

```diff
+- ``airflow dags test`` no longer performs a backfill job.
```

Review Comment:
```suggestion
``airflow dags test`` no longer performs a backfill job (#26400)
```

## RELEASE_NOTES.rst:

```diff
@@ -21,6 +21,279 @@
 .. towncrier release notes start
+Airflow 2.5.0 (2022-11-28)
+--------------------------
+
+Significant Changes
+^^^^^^^^^^^^^^^^^^^
+
+- ``airflow dags test`` no longer performs a backfill job.
+
+  In order to make ``airflow dags test`` more useful as a testing and debugging tool, we no
+  longer run a backfill job and instead run a "local task runner". Users can still backfill
+  their DAGs using the ``airflow dags backfill`` command. (#26400)
+- Airflow config section ``kubernetes`` renamed to ``kubernetes_executor``
+
+  KubernetesPodOperator no longer considers any core kubernetes config params, so this section now only applies to kubernetes executor. Renaming it reduces potential for confusion. (#26873)
+- ``ExternalTaskSensor`` no longer hangs indefinitely when ``failed_states`` is set, an ``execute_date_fn`` is used, and some but not all of the dependent tasks fail. Instead, an ``AirflowException`` is thrown as soon as any of the dependent tasks fail.
+
+  Any code handling this failure in addition to timeouts should move to catching the ``AirflowException`` baseclass and not only the ``AirflowSensorTimeout`` subclass. (#27190)
+
+New Features
+^^^^^^^^^^^^
+
+- ``TaskRunner``: notify of component start and finish (#27855)
+- Add DagRun state change to the Listener plugin system (#27113)
+- Metric for raw task return codes (#27155)
+- Add logic for XComAr
```
[GitHub] [airflow] alexott commented on issue #27894: Status of testing Providers that were prepared on November 24, 2022
alexott commented on issue #27894: URL: https://github.com/apache/airflow/issues/27894#issuecomment-1326659796

#27868 works, #27854 unfortunately not - results are double wrapped. For example, for `select * from default.my_***_table, parameters: None` we get `scalar_results=True`, `results=[Row(id=1, v='test 1'), Row(id=2, v='test 2')]`. And then this code:

```
if scalar_results:
    list_results: list[Any] = [results]
else:
    list_results = results
```

wraps this list into another list. I think that logic should be changed a bit - right now we're collecting results for all SQL statements into a single list although they could have different schemas. Let me debug it and open another PR
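The double wrap can be reproduced in a few lines. The plain tuples below are simplified stand-ins for the driver's `Row` objects, and the `if/else` mirrors the snippet quoted above.

```python
from typing import Any

# One result set with two rows, as returned by the hook for the SELECT above.
results = [(1, "test 1"), (2, "test 2")]

# The flag is (wrongly) True even though `results` is already a list of rows.
scalar_results = True

if scalar_results:
    list_results: list[Any] = [results]  # wraps the list again -> [[row, row]]
else:
    list_results = results

# Downstream code that expects a list of rows now sees a
# one-element list containing the whole result set.
```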
[GitHub] [airflow] o-nikolas commented on pull request #26962: Fix system test for Memorystore memcached
o-nikolas commented on PR #26962: URL: https://github.com/apache/airflow/pull/26962#issuecomment-1326644068 > @o-nikolas , hi, > > Can we merge this PR please? Your comment has been addressed. All good on my end, but unfortunately I am not a committer and cannot merge changes
[GitHub] [airflow] mobuchowski opened a new pull request, #27896: tests: always cleanup registered test listeners
mobuchowski opened a new pull request, #27896: URL: https://github.com/apache/airflow/pull/27896 Listeners get registered in tests that run in various orders. Every place that registers listeners should clean up defensively, both before and after registering new listeners. Signed-off-by: Maciej Obuchowski
[GitHub] [airflow] potiuk commented on a diff in pull request #27895: Sync v2-5-stable with v2-5-test to release 2.5.0
potiuk commented on code in PR #27895: URL: https://github.com/apache/airflow/pull/27895#discussion_r1031664386 ## README.md: ## @@ -86,7 +86,7 @@ Airflow is not a streaming solution, but it is often used to process real-time d Apache Airflow is tested with: -| | Main version (dev) | Stable version (2.4.2) | +| | Main version (dev) | Stable version (2.5.0) | Review Comment: good catch
[GitHub] [airflow] raphaelauv commented on a diff in pull request #27895: Sync v2-5-stable with v2-5-test to release 2.5.0
raphaelauv commented on code in PR #27895: URL: https://github.com/apache/airflow/pull/27895#discussion_r1031649086 ## README.md: ## @@ -86,7 +86,7 @@ Airflow is not a streaming solution, but it is often used to process real-time d Apache Airflow is tested with: -| | Main version (dev) | Stable version (2.4.2) | +| | Main version (dev) | Stable version (2.5.0) | Review Comment: missing PostgreSQL 15 support for 2.5.0
[GitHub] [airflow] jh242 commented on pull request #27805: Automatically save and allow restore of recent DAG run configs
jh242 commented on PR #27805: URL: https://github.com/apache/airflow/pull/27805#issuecomment-1326606738

I disagree with the db/API endpoint idea. I think it's too much overhead for a minor feature that seems intended to save some time for DAGs with small configs that the user didn't expect to run multiple times. I feel like the ability to copy/paste configurations from [here](https://github.com/apache/airflow/pull/27639) also overlaps with this feature.

My suggestion is that we can either save recent configs in session storage by DAG, or in local storage but limited to a certain number of total recent configs across all DAGs.

Additionally, Aaron and I are on a bit of a short timeline and likely won't have time to implement a backend-supported version of this feature, but if that's the direction we really want to go, we can get started on it and see where it goes.
[GitHub] [airflow] potiuk commented on pull request #27895: Sync v2-5-stable with v2-5-test to release 2.5.0
potiuk commented on PR #27895: URL: https://github.com/apache/airflow/pull/27895#issuecomment-1326598444 Very cool release - no dramatic changes, but a steady stream of improvements :muscle:
[GitHub] [airflow] alexandermalyga commented on issue #27894: Status of testing Providers that were prepared on November 24, 2022
alexandermalyga commented on issue #27894: URL: https://github.com/apache/airflow/issues/27894#issuecomment-1326589981 #27724 works as expected! Finally Trino inserts are 100% working
[GitHub] [airflow] potiuk commented on issue #27894: Status of testing Providers that were prepared on November 24, 2022
potiuk commented on issue #27894: URL: https://github.com/apache/airflow/issues/27894#issuecomment-1326587629 Checked my changes. cc: @alexott @kazanzhy -> would appreciate checking the Databricks SQL executor integration with the new common-sql provider.
[GitHub] [airflow] ephraimbuddy opened a new pull request, #27895: Sync v2-5-stable with v2-5-test to release 2.5.0
ephraimbuddy opened a new pull request, #27895: URL: https://github.com/apache/airflow/pull/27895 Time for `2.5.0rc1`!
[airflow] 01/02: Update version to 2.5.0
This is an automated email from the ASF dual-hosted git repository.

ephraimanierobi pushed a commit to branch v2-5-test in repository https://gitbox.apache.org/repos/asf/airflow.git

commit 59d16b6765c0d7aee4efb9150ff30508863590b6
Author: Ephraim Anierobi
AuthorDate: Thu Nov 24 15:12:33 2022 +0100

    Update version to 2.5.0

---
 README.md                                                  | 14 +++----
 airflow/utils/db.py                                        |  1 +
 docs/apache-airflow/installation/supported-versions.rst    |  2 +-
 docs/docker-stack/README.md                                | 10 +-
 .../docker-examples/extending/add-apt-packages/Dockerfile  |  2 +-
 .../extending/add-build-essential-extend/Dockerfile        |  2 +-
 .../docker-examples/extending/add-providers/Dockerfile     |  2 +-
 .../docker-examples/extending/add-pypi-packages/Dockerfile |  2 +-
 .../extending/add-requirement-packages/Dockerfile          |  2 +-
 .../docker-examples/extending/custom-providers/Dockerfile  |  2 +-
 .../docker-examples/extending/embedding-dags/Dockerfile    |  2 +-
 .../extending/writable-directory/Dockerfile                |  2 +-
 docs/docker-stack/entrypoint.rst                           | 14 +++----
 scripts/ci/pre_commit/pre_commit_supported_versions.py     |  2 +-
 setup.py                                                   |  2 +-
 15 files changed, 31 insertions(+), 30 deletions(-)

diff --git a/README.md b/README.md
index a4f37d6dae..cf15510901 100644
--- a/README.md
+++ b/README.md
@@ -86,7 +86,7 @@ Airflow is not a streaming solution, but it is often used to process real-time d
 Apache Airflow is tested with:

-|          | Main version (dev)  | Stable version (2.4.2) |
+|          | Main version (dev)  | Stable version (2.5.0) |
 |----------|---------------------|------------------------|
 | Python   | 3.7, 3.8, 3.9, 3.10 | 3.7, 3.8, 3.9, 3.10    |
 | Platform | AMD64/ARM64(\*)     | AMD64/ARM64(\*)        |
@@ -158,15 +158,15 @@ them to the appropriate format and workflow that your tool requires.

 ```bash
-pip install 'apache-airflow==2.4.2' \
- --constraint "https://raw.githubusercontent.com/apache/airflow/constraints-2.4.2/constraints-3.7.txt"
+pip install 'apache-airflow==2.5.0' \
+ --constraint "https://raw.githubusercontent.com/apache/airflow/constraints-2.5.0/constraints-3.7.txt"
 ```

 2. Installing with extras (i.e., postgres, google)

 ```bash
-pip install 'apache-airflow[postgres,google]==2.4.2' \
- --constraint "https://raw.githubusercontent.com/apache/airflow/constraints-2.4.2/constraints-3.7.txt"
+pip install 'apache-airflow[postgres,google]==2.5.0' \
+ --constraint "https://raw.githubusercontent.com/apache/airflow/constraints-2.5.0/constraints-3.7.txt"
 ```

 For information on installing provider packages, check
@@ -271,7 +271,7 @@ Apache Airflow version life cycle:

 | Version | Current Patch/Minor | State     | First Release | Limited Support | EOL/Terminated |
 |---------|---------------------|-----------|---------------|-----------------|----------------|
-| 2       | 2.4.3               | Supported | Dec 17, 2020  | TBD             | TBD            |
+| 2       | 2.5.0               | Supported | Dec 17, 2020  | TBD             | TBD            |
 | 1.10    | 1.10.15             | EOL       | Aug 27, 2018  | Dec 17, 2020    | June 17, 2021  |
 | 1.9     | 1.9.0               | EOL       | Jan 03, 2018  | Aug 27, 2018    | Aug 27, 2018   |
 | 1.8     | 1.8.2               | EOL       | Mar 19, 2017  | Jan 03, 2018    | Jan 03, 2018   |
@@ -301,7 +301,7 @@ They are based on the official release schedule of Python and Kubernetes, nicely
 2. The "oldest" supported version of Python/Kubernetes is the default one until we decide to switch to
    later version. "Default" is only meaningful in terms of "smoke tests" in CI PRs, which are run using this
    default version and the default reference image available. Currently `apache/airflow:latest`
-   and `apache/airflow:2.4.2` images are Python 3.7 images. This means that default reference image will
+   and `apache/airflow:2.5.0` images are Python 3.7 images. This means that default reference image will
    become the default at the time when we start preparing for dropping 3.7 support which is few months
    before the end of life for Python 3.7.

diff --git a/airflow/utils/db.py b/airflow/utils/db.py
index b5ea63be00..00bf243e1d 100644
--- a/airflow/utils/db.py
+++ b/airflow/utils/db.py
@@ -75,6 +75,7 @@ REVISION_HEADS_MAP = {
     "2.4.1": "ecb43d2a1842",
     "2.4.2": "b0d31815b5a6",
     "2.4.3": "e07f49787c9d",
+    "2.5.0": "1986afd32c1b",
 }

diff --git a/docs/apache-airflow/installation/supported-ver
[airflow] 02/02: Add release notes
This is an automated email from the ASF dual-hosted git repository.

ephraimanierobi pushed a commit to branch v2-5-test in repository https://gitbox.apache.org/repos/asf/airflow.git

commit 0c2ee0ad958e5424be084793efbacf023dccd333
Author: Ephraim Anierobi
AuthorDate: Thu Nov 24 16:14:49 2022 +0100

    Add release notes

---
 RELEASE_NOTES.rst                   | 273 ++++++++++++++++++++++++++++
 newsfragments/26400.significant.rst |   5 -
 newsfragments/26873.significant.rst |   3 -
 newsfragments/27190.significant.rst |   3 -
 4 files changed, 273 insertions(+), 11 deletions(-)

diff --git a/RELEASE_NOTES.rst b/RELEASE_NOTES.rst
index fe2babef13..b94810118e 100644
--- a/RELEASE_NOTES.rst
+++ b/RELEASE_NOTES.rst
@@ -21,6 +21,279 @@

 .. towncrier release notes start

+Airflow 2.5.0 (2022-11-28)
+--------------------------
+
+Significant Changes
+^^^^^^^^^^^^^^^^^^^
+
+- ``airflow dags test`` no longer performs a backfill job.
+
+  In order to make ``airflow dags test`` more useful as a testing and debugging tool, we no
+  longer run a backfill job and instead run a "local task runner". Users can still backfill
+  their DAGs using the ``airflow dags backfill`` command. (#26400)
+- Airflow config section ``kubernetes`` renamed to ``kubernetes_executor``
+
+  KubernetesPodOperator no longer considers any core kubernetes config params, so this section
+  now only applies to kubernetes executor. Renaming it reduces potential for confusion. (#26873)
+- ``ExternalTaskSensor`` no longer hangs indefinitely when ``failed_states`` is set, an
+  ``execute_date_fn`` is used, and some but not all of the dependent tasks fail. Instead, an
+  ``AirflowException`` is thrown as soon as any of the dependent tasks fail.
+
+  Any code handling this failure in addition to timeouts should move to catching the
+  ``AirflowException`` baseclass and not only the ``AirflowSensorTimeout`` subclass. (#27190)
+
+New Features
+^^^^^^^^^^^^
+
+- ``TaskRunner``: notify of component start and finish (#27855)
+- Add DagRun state change to the Listener plugin system (#27113)
+- Metric for raw task return codes (#27155)
+- Add logic for XComArg to pull specific map indexes (#27771)
+- Clear TaskGroup (#26658)
+- Add critical section query duration metric (#27700)
+- Add: #23880 :: Audit log for ``AirflowModelViews(Variables/Connection)`` (#24079)
+- Add postgres 15 support (#27444)
+- Expand tasks in mapped group at run time (#27491)
+- reset commits, clean submodules (#27560)
+- scheduler_job, add metric for scheduler loop timer (#27605)
+- Allow datasets to be used in taskflow (#27540)
+- Add expanded_ti_count to ti context (#27680)
+- Add user comment to task instance and dag run (#26457, #27849, #27867)
+- Enable copying DagRun JSON to clipboard (#27639)
+- Implement extra controls for SLAs (#27557)
+- add dag parsed time in DAG view (#27573)
+- Add max_wait for exponential_backoff in BaseSensor (#27597)
+- Expand tasks in mapped group at parse time (#27158)
+- Add disable retry flag on backfill (#23829)
+- Adding sensor decorator (#22562)
+- Api endpoint update ti (#26165)
+- Filtering datasets by recent update events (#26942)
+- Support Is /not Null filter for value is None on webui (#26584)
+- Add search to datasets list (#26893)
+- Split out and handle 'params' in mapped operator (#26100)
+- Add authoring API for TaskGroup mapping (#26844)
+- Add ``one_done`` trigger rule (#26146)
+- Create a more efficient airflow dag test command that also has better local logging (#26400)
+- Support add/remove permissions to roles commands (#26338)
+- Auto tail file logs in Web UI (#26169)
+- Add triggerer info to task instance in API (#26249)
+- Flag to deserialize value on custom XCom backend (#26343)
+
+Bug Fixes
+^^^^^^^^^
+
+- Redirect to home view when there are no valid tags in the URL (#25715)
+- Make MappedTaskGroup depend on its expand inputs (#27876)
+- Make DagRun state updates for paused DAGs faster (#27725)
+- Don't explicitly set include_examples to False on task run command (#27813)
+- Fix menu border color (#27789)
+- Fix backfill queued task getting reset to scheduled state. (#23720)
+- Fix clearing child dag mapped tasks from parent dag (#27501)
+- Handle json encoding of ``V1Pod`` in task callback (#27609)
+- Fix ExternalTaskSensor can't check zipped dag (#27056)
+- Avoid re-fetching DAG run in TriggerDagRunOperator (#27635)
+- Continue on exception when retrieving metadata (#27665)
+- Fix double logging with some task logging handler (#27591)
+- External task sensor fail fix (#27190)
+- Replace FAB url filtering function with Airflows (#27576)
+- Fix mini scheduler expansion of mapped task (#27506)
+- Add the default None when pop actions (#27537)
+- Display parameter values from serialized dag in trigger dag view. (#27482)
+- Fix getting the dag/task ids from base executor (#27550)
+- Fix sqlalchemy primary key black-out error on DDRQ (#27538)
+- Move TriggerDagRun conf check to execute (#27035)
+- SLAMiss is nullabl
[airflow] branch v2-5-test updated (cc18921381 -> 0c2ee0ad95)
This is an automated email from the ASF dual-hosted git repository.

ephraimanierobi pushed a change to branch v2-5-test in repository https://gitbox.apache.org/repos/asf/airflow.git

 from cc18921381 Update default branches for 2-5
  new 59d16b6765 Update version to 2.5.0
  new 0c2ee0ad95 Add release notes

The 2 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference.

Summary of changes:
 README.md                                          |  14 +-
 RELEASE_NOTES.rst                                  | 273 +++++++++++++++++
 airflow/utils/db.py                                |   1 +
 .../installation/supported-versions.rst            |   2 +-
 docs/docker-stack/README.md                        |  10 +-
 .../extending/add-apt-packages/Dockerfile          |   2 +-
 .../add-build-essential-extend/Dockerfile          |   2 +-
 .../extending/add-providers/Dockerfile             |   2 +-
 .../extending/add-pypi-packages/Dockerfile         |   2 +-
 .../extending/add-requirement-packages/Dockerfile  |   2 +-
 .../extending/custom-providers/Dockerfile          |   2 +-
 .../extending/embedding-dags/Dockerfile            |   2 +-
 .../extending/writable-directory/Dockerfile        |   2 +-
 docs/docker-stack/entrypoint.rst                   |  14 +-
 newsfragments/26400.significant.rst                |   5 -
 newsfragments/26873.significant.rst                |   3 -
 newsfragments/27190.significant.rst                |   3 -
 .../ci/pre_commit/pre_commit_supported_versions.py |   2 +-
 setup.py                                           |   2 +-
 19 files changed, 304 insertions(+), 41 deletions(-)
 delete mode 100644 newsfragments/26400.significant.rst
 delete mode 100644 newsfragments/26873.significant.rst
 delete mode 100644 newsfragments/27190.significant.rst
[GitHub] [airflow] potiuk opened a new issue, #27894: Status of testing Providers that were prepared on November 24, 2022
potiuk opened a new issue, #27894: URL: https://github.com/apache/airflow/issues/27894

### Body

I have a kind request for all the contributors to the latest provider packages release. Could you please help us test the RC versions of the providers? Let us know in a comment whether the issue is addressed.

These providers require testing, as some substantial changes were introduced:

## Provider [amazon: 6.2.0rc1](https://pypi.org/project/apache-airflow-providers-amazon/6.2.0rc1)
- [ ] [Use Boto waiters instead of custom _await_status method for RDS Operators (#27410)](https://github.com/apache/airflow/pull/27410): @hankehly
- [ ] [Handle transient state errors in `RedshiftResumeClusterOperator` and `RedshiftPauseClusterOperator` (#27276)](https://github.com/apache/airflow/pull/27276): @syedahsn
- [ ] [Correct job name matching in SagemakerProcessingOperator (#27634)](https://github.com/apache/airflow/pull/27634): @ferruzzi

## Provider [asana: 2.1.0rc1](https://pypi.org/project/apache-airflow-providers-asana/2.1.0rc1)
- [ ] [Allow and prefer non-prefixed extra fields for AsanaHook (#27043)](https://github.com/apache/airflow/pull/27043): @dstandish

## Provider [common.sql: 1.3.1rc1](https://pypi.org/project/apache-airflow-providers-common-sql/1.3.1rc1)
- [ ] [Restore removed (but used) methods in common.sql (#27843)](https://github.com/apache/airflow/pull/27843): @potiuk
- [ ] [Fix errors in Databricks SQL operator introduced when refactoring (#27854)](https://github.com/apache/airflow/pull/27854): @potiuk

## Provider [databricks: 4.0.0rc1](https://pypi.org/project/apache-airflow-providers-databricks/4.0.0rc1)
- [ ] [Fix errors in Databricks SQL operator introduced when refactoring (#27854)](https://github.com/apache/airflow/pull/27854): @potiuk
- [ ] [Fix templating fields and do_xcom_push in DatabricksSQLOperator (#27868)](https://github.com/apache/airflow/pull/27868): @potiuk

## Provider [exasol: 4.1.1rc1](https://pypi.org/project/apache-airflow-providers-exasol/4.1.1rc1)
- [ ] [Fix errors in Databricks SQL operator introduced when refactoring (#27854)](https://github.com/apache/airflow/pull/27854): @potiuk

## Provider [google: 8.6.0rc1](https://pypi.org/project/apache-airflow-providers-google/8.6.0rc1)
- [ ] [Persist DataprocLink for workflow operators regardless of job status (#26986)](https://github.com/apache/airflow/pull/26986): @vksunilk
- [ ] [Deferrable mode for BigQueryToGCSOperator (#27683)](https://github.com/apache/airflow/pull/27683): @lwyszomi
- [ ] [Fix to read location parameter properly in BigQueryToBigQueryOperator (#27661)](https://github.com/apache/airflow/pull/27661): @VladaZakharova

## Provider [jdbc: 3.3.0rc1](https://pypi.org/project/apache-airflow-providers-jdbc/3.3.0rc1)
- [ ] [Allow and prefer non-prefixed extra fields for JdbcHook (#27044)](https://github.com/apache/airflow/pull/27044): @dstandish
- [ ] [Add SQLExecuteQueryOperator (#25717)](https://github.com/apache/airflow/pull/25717): @kazanzhy

## Provider [mysql: 3.4.0rc1](https://pypi.org/project/apache-airflow-providers-mysql/3.4.0rc1)
- [ ] [Allow SSL mode in MySQL provider (#27717)](https://github.com/apache/airflow/pull/27717): @Adityamalik123

## Provider [neo4j: 3.2.1rc1](https://pypi.org/project/apache-airflow-providers-neo4j/3.2.1rc1)
- [ ] [Fix typing problem revealed after recent Neo4J release (#27759)](https://github.com/apache/airflow/pull/27759): @potiuk

## Provider [presto: 4.2.0rc1](https://pypi.org/project/apache-airflow-providers-presto/4.2.0rc1)
- [ ] [Add _serialize_cell method to TrinoHook and PrestoHook (#27724)](https://github.com/apache/airflow/pull/27724): @alexandermalyga

## Provider [slack: 7.1.0rc1](https://pypi.org/project/apache-airflow-providers-slack/7.1.0rc1)
- [ ] [Implements SqlToSlackApiFileOperator (#26374)](https://github.com/apache/airflow/pull/26374): @Taragolis

## Provider [snowflake: 4.0.1rc1](https://pypi.org/project/apache-airflow-providers-snowflake/4.0.1rc1)
- [ ] [Fix errors in Databricks SQL operator introduced when refactoring (#27854)](https://github.com/apache/airflow/pull/27854): @potiuk

## Provider [trino: 4.3.0rc1](https://pypi.org/project/apache-airflow-providers-trino/4.3.0rc1)
- [ ] [Add _serialize_cell method to TrinoHook and PrestoHook (#27724)](https://github.com/apache/airflow/pull/27724): @alexandermalyga

The guidelines on how to test providers can be found in [Verify providers by contributors](https://github.com/apache/airflow/blob/main/dev/README_RELEASE_PROVIDER_PACKAGES.md#verify-by-contributors)

### Committer
- [X] I acknowledge that I am a maintainer/committer of the Apache Airflow project.

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
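Testing an RC boils down to installing the candidate package and confirming the pinned version is what ends up in the environment. A minimal sketch of such a check is below; `check_rc` is a hypothetical helper, not part of the Airflow release tooling, and assumes the provider was installed with e.g. `pip install apache-airflow-providers-amazon==6.2.0rc1`:

```python
from importlib.metadata import PackageNotFoundError, version


def check_rc(package: str, expected: str) -> bool:
    """Return True if `package` is installed at exactly the expected RC version."""
    try:
        return version(package) == expected
    except PackageNotFoundError:
        # Package is not installed at all.
        return False


# Example (assumes the amazon provider RC from the list above is installed):
# check_rc("apache-airflow-providers-amazon", "6.2.0rc1")
```

After the version check passes, the linked [Verify providers by contributors](https://github.com/apache/airflow/blob/main/dev/README_RELEASE_PROVIDER_PACKAGES.md#verify-by-contributors) guide covers exercising the changed operators themselves.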
[GitHub] [airflow] boring-cyborg[bot] commented on pull request #27893: AWSGlueJobHook updates job configuration if it exists
boring-cyborg[bot] commented on PR #27893: URL: https://github.com/apache/airflow/pull/27893#issuecomment-1326563042

Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about anything, please check our [Contribution Guide](https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst). Here are some useful points:

- Pay attention to the quality of your code (flake8, mypy and type annotations). Our [pre-commits](https://github.com/apache/airflow/blob/main/STATIC_CODE_CHECKS.rst#prerequisites-for-pre-commit-hooks) will help you with that.
- In case of a new feature, add useful documentation (in docstrings or in the `docs/` directory). Adding a new operator? Check this short [guide](https://github.com/apache/airflow/blob/main/docs/apache-airflow/howto/custom-operator.rst). Consider adding an example DAG that shows how users should use it.
- Consider using the [Breeze environment](https://github.com/apache/airflow/blob/main/BREEZE.rst) for testing locally. It's a heavy Docker setup, but it ships with a working Airflow and a lot of integrations.
- Be patient and persistent. It might take some time to get a review or the final approval from Committers.
- Please follow the [ASF Code of Conduct](https://www.apache.org/foundation/policies/conduct) for all communication, including (but not limited to) comments on Pull Requests, the mailing list and Slack.
- Be sure to read the [Airflow Coding style](https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst#coding-style-and-best-practices).

Apache Airflow is a community-driven project and together we are making it better 🚀. In case of doubts contact the developers at: Mailing List: d...@airflow.apache.org Slack: https://s.apache.org/airflow-slack
[GitHub] [airflow] romibuzi opened a new pull request, #27893: AWSGlueJobHook updates job configuration if it exists
romibuzi opened a new pull request, #27893: URL: https://github.com/apache/airflow/pull/27893 closes: #27592 --- Rename `GlueJobHook.get_or_create_glue_job()` to `create_or_update_glue_job()` and split the code into separate methods: `create_glue_job_config()`, `has_job()`, `create_job()` and `update_job()`. The behavior is now similar to that of `GlueCrawlerOperator`.
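The create-or-update flow the PR describes can be sketched as follows. This is a simplified illustration, not the actual hook code: `create_or_update_glue_job` here takes an injected client whose `get_job`/`create_job`/`update_job` methods mirror the boto3 Glue API, and the exact keyword shapes are assumptions:

```python
def create_or_update_glue_job(client, job_name: str, config: dict) -> str:
    """Create the Glue job if it does not exist, otherwise update its configuration.

    `client` is expected to look like a boto3 Glue client: get_job raises
    EntityNotFoundException for a missing job, create_job takes the full
    definition, and update_job takes the new definition under JobUpdate.
    """
    try:
        client.get_job(JobName=job_name)  # probe for an existing job
    except client.exceptions.EntityNotFoundException:
        client.create_job(Name=job_name, **config)
        return "created"
    client.update_job(JobName=job_name, JobUpdate=config)
    return "updated"
```

Compared with the old `get_or_create_glue_job()`, the key difference is the second branch: an existing job now has its configuration pushed via `update_job` instead of being left untouched.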