[GitHub] [airflow] eladkal commented on issue #27890: SFTP Sensor is not working with File Pattern Parameter

2022-11-24 Thread GitBox


eladkal commented on issue #27890:
URL: https://github.com/apache/airflow/issues/27890#issuecomment-1327112678

   cc @Bowrna 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [airflow] ephraimbuddy commented on pull request #27829: Improving the release process

2022-11-24 Thread GitBox


ephraimbuddy commented on PR #27829:
URL: https://github.com/apache/airflow/pull/27829#issuecomment-1327111335

   I added a new function `user_confirm_bools` that returns a bool using 
`user_confirm` under the hood. This helped me reduce a lot of `if else` 
statements. 
   Also added `console_print` that uses `get_console().print` to print messages 
to the screen.
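The refactor described above might look roughly like this (a hedged sketch: `user_confirm`, `user_confirm_bools`, and the `Answer` values are taken from the comment; the real Breeze implementations may differ):

```python
from __future__ import annotations

from enum import Enum


class Answer(Enum):
    # Hypothetical three-way result returned by the interactive prompt
    YES = "yes"
    NO = "no"
    QUIT = "quit"


def user_confirm(message: str, answer: Answer | None = None) -> Answer:
    # Stand-in for the interactive prompt; the answer is injected here so
    # the sketch can run without a TTY.
    return answer if answer is not None else Answer.NO


def user_confirm_bools(message: str, answer: Answer | None = None) -> bool:
    # Collapse the three-way Answer into a plain bool, so call sites can
    # write `if user_confirm_bools(...):` instead of chained if/else.
    return user_confirm(message, answer) is Answer.YES
```

The design win is that call sites that only care about yes/no no longer pattern-match on the enum.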
   





[GitHub] [airflow] dstandish commented on pull request #27344: Add retry to submit_event in trigger to avoid deadlock

2022-11-24 Thread GitBox


dstandish commented on PR #27344:
URL: https://github.com/apache/airflow/pull/27344#issuecomment-1327107144

   @NickYadance did you give up on this one?





[GitHub] [airflow] bolkedebruin commented on a diff in pull request #27887: Add allow list for imports during deserialization

2022-11-24 Thread GitBox


bolkedebruin commented on code in PR #27887:
URL: https://github.com/apache/airflow/pull/27887#discussion_r1032070380


##
airflow/utils/json.py:
##
@@ -189,7 +189,7 @@ def __init__(self, *args, **kwargs) -> None:
         if not kwargs.get("object_hook"):
             kwargs["object_hook"] = self.object_hook
 
-        patterns = conf.getjson("core", "allowed_deserialization_classes")
+        patterns = cast(list, conf.getjson("core", "allowed_deserialization_classes"))

Review Comment:
   Mmm yes, I'd prefer that check at configure time rather than here. The config file shouldn't pass validation if this is not a list.
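A configure-time check along those lines could look like this (hedged sketch; the helper name `validate_list_option` is hypothetical, while the section and key names come from the diff above):

```python
import json


def validate_list_option(raw: str, section: str, key: str) -> list:
    # Parse the JSON config value and fail fast at configuration-read time
    # if it is not a list, so downstream consumers never need a cast() or
    # an isinstance() check.
    value = json.loads(raw)
    if not isinstance(value, list):
        raise ValueError(
            f"Option [{section}] {key} must be a JSON list, "
            f"got {type(value).__name__}"
        )
    return value
```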






[GitHub] [airflow] Bowrna opened a new pull request, #27905: listener plugin example added

2022-11-24 Thread GitBox


Bowrna opened a new pull request, #27905:
URL: https://github.com/apache/airflow/pull/27905

   related: #15353
   
   
   ---
   **^ Add meaningful description above**
   
   Read the **[Pull Request 
Guidelines](https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst#pull-request-guidelines)**
 for more information.
   In case of fundamental code changes, an Airflow Improvement Proposal 
([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvement+Proposals))
 is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party 
License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in a 
newsfragment file, named `{pr_number}.significant.rst` or 
`{issue_number}.significant.rst`, in 
[newsfragments](https://github.com/apache/airflow/tree/main/newsfragments).
   





[GitHub] [airflow] Bowrna closed pull request #27435: listener plugin example and documentation

2022-11-24 Thread GitBox


Bowrna closed pull request #27435: listener plugin example and documentation
URL: https://github.com/apache/airflow/pull/27435





[GitHub] [airflow] Bowrna opened a new pull request, #27435: listener plugin example and documentation

2022-11-24 Thread GitBox


Bowrna opened a new pull request, #27435:
URL: https://github.com/apache/airflow/pull/27435

   This PR contains example code and documentation to use listener plugin 
feature in Airflow.
   
   related: https://github.com/apache/airflow/issues/15353
   
   
   ---
   **^ Add meaningful description above**
   
   Read the **[Pull Request 
Guidelines](https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst#pull-request-guidelines)**
 for more information.
   In case of fundamental code changes, an Airflow Improvement Proposal 
([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvement+Proposals))
 is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party 
License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in a 
newsfragment file, named `{pr_number}.significant.rst` or 
`{issue_number}.significant.rst`, in 
[newsfragments](https://github.com/apache/airflow/tree/main/newsfragments).
   





[GitHub] [airflow] blag commented on a diff in pull request #27828: Soft delete datasets that are no longer referenced in DAG schedules or task outlets

2022-11-24 Thread GitBox


blag commented on code in PR #27828:
URL: https://github.com/apache/airflow/pull/27828#discussion_r1032052680


##
airflow/jobs/scheduler_job.py:
##
@@ -1574,3 +1585,33 @@ def _cleanup_stale_dags(self, session: Session = NEW_SESSION) -> None:
             dag.is_active = False
             SerializedDagModel.remove_dag(dag_id=dag.dag_id, session=session)
         session.flush()
+
+    @provide_session
+    def _orphan_unreferenced_datasets(self, session: Session = NEW_SESSION) -> None:
+        """
+        Detects datasets that are no longer referenced in any DAG schedule parameters or task outlets and
+        sets the dataset is_orphaned flags to True
+        """
+        orphaned_dataset_query = (
+            session.query(DatasetModel)
+            .join(
+                DagScheduleDatasetReference,
+                DagScheduleDatasetReference.dataset_id == DatasetModel.id,
+                isouter=True,
+            )
+            .join(
+                TaskOutletDatasetReference,
+                TaskOutletDatasetReference.dataset_id == DatasetModel.id,
+                isouter=True,
+            )
+            .group_by(DatasetModel.id)
+            .having(
+                and_(
+                    func.count(DagScheduleDatasetReference.dag_id) == 0,
+                    func.count(TaskOutletDatasetReference.dag_id) == 0,
+                )
+            )
+        )
+        for dataset in orphaned_dataset_query:
+            self.log.info("Orphaning unreferenced dataset '%s'", dataset.uri)
+            dataset.is_orphaned = True

Review Comment:
   Nope, didn't work. Good idea though. :)
   
   ```
   sqlalchemy.exc.InvalidRequestError: Can't call Query.update() or 
Query.delete() when group_by() has been called
   ```
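One possible workaround for that error (a hedged sketch; `Dataset` and `Ref` below are simplified stand-ins for `DatasetModel` and a reference table, assuming SQLAlchemy 1.4+): collect the grouped ids first, then issue the UPDATE through a plain filtered query, which SQLAlchemy does allow.

```python
from sqlalchemy import Boolean, Column, Integer, String, create_engine, func
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class Dataset(Base):
    # Simplified stand-in for DatasetModel
    __tablename__ = "dataset"
    id = Column(Integer, primary_key=True)
    uri = Column(String)
    is_orphaned = Column(Boolean, default=False)


class Ref(Base):
    # Simplified stand-in for one of the dataset reference tables
    __tablename__ = "ref"
    id = Column(Integer, primary_key=True)
    dataset_id = Column(Integer)


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add_all([Dataset(id=1, uri="a"), Dataset(id=2, uri="b"), Ref(id=1, dataset_id=1)])
    session.commit()

    # Step 1: run the group_by/having query, but only to collect the ids.
    orphan_rows = (
        session.query(Dataset.id)
        .outerjoin(Ref, Ref.dataset_id == Dataset.id)
        .group_by(Dataset.id)
        .having(func.count(Ref.dataset_id) == 0)
        .all()
    )
    orphan_ids = [row.id for row in orphan_rows]

    # Step 2: UPDATE via a group_by-free query, which is permitted.
    session.query(Dataset).filter(Dataset.id.in_(orphan_ids)).update(
        {Dataset.is_orphaned: True}, synchronize_session="fetch"
    )
    session.commit()
```

The trade-off is two round trips instead of one bulk UPDATE, which is still cheaper than loading full ORM rows and mutating them one by one.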






[GitHub] [airflow] blag commented on a diff in pull request #27828: Soft delete datasets that are no longer referenced in DAG schedules or task outlets

2022-11-24 Thread GitBox


blag commented on code in PR #27828:
URL: https://github.com/apache/airflow/pull/27828#discussion_r1032045393


##
airflow/jobs/scheduler_job.py:
##
@@ -1574,3 +1585,33 @@ def _cleanup_stale_dags(self, session: Session = NEW_SESSION) -> None:
             dag.is_active = False
             SerializedDagModel.remove_dag(dag_id=dag.dag_id, session=session)
         session.flush()
+
+    @provide_session
+    def _orphan_unreferenced_datasets(self, session: Session = NEW_SESSION) -> None:
+        """
+        Detects datasets that are no longer referenced in any DAG schedule parameters or task outlets and
+        sets the dataset is_orphaned flags to True
+        """
+        orphaned_dataset_query = (
+            session.query(DatasetModel)
+            .join(
+                DagScheduleDatasetReference,
+                DagScheduleDatasetReference.dataset_id == DatasetModel.id,
+                isouter=True,
+            )
+            .join(
+                TaskOutletDatasetReference,
+                TaskOutletDatasetReference.dataset_id == DatasetModel.id,
+                isouter=True,
+            )
+            .group_by(DatasetModel.id)
+            .having(
+                and_(
+                    func.count(DagScheduleDatasetReference.dag_id) == 0,
+                    func.count(TaskOutletDatasetReference.dag_id) == 0,
+                )
+            )
+        )
+        for dataset in orphaned_dataset_query:
+            self.log.info("Orphaning unreferenced dataset '%s'", dataset.uri)
+            dataset.is_orphaned = True

Review Comment:
   The group by expression might interfere but I'll try it, thanks!






[GitHub] [airflow] tanelk opened a new pull request, #27904: Order TIs by map_index

2022-11-24 Thread GitBox


tanelk opened a new pull request, #27904:
URL: https://github.com/apache/airflow/pull/27904

   
   
   Sort TIs by the `map_index` field when selecting them for queueing. Currently TIs are only ordered by `priority_weight` and `execution_date`. This does not fix any bug, but it makes the behaviour more understandable and "cleaner" in the UI.
   
   Without this, every now and then the TIs get executed from the middle, probably something to do with database internals.
   
![2022-11-24_15-47](https://user-images.githubusercontent.com/3342974/203914401-97c9be97-43ab-432f-bd8f-b858ad09c058.png)
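The intended ordering can be illustrated in plain Python (a sketch; the actual change touches the scheduler's SQL ORDER BY, and the `TI` dataclass below is only a stand-in):

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class TI:
    # Minimal stand-in for a task instance row
    priority_weight: int
    execution_date: datetime
    map_index: int


tis = [TI(1, datetime(2022, 11, 24), m) for m in (2, 0, 1)]

# priority_weight descending, execution_date ascending, and map_index
# ascending as the new final tiebreaker.
ordered = sorted(
    tis, key=lambda t: (-t.priority_weight, t.execution_date, t.map_index)
)
```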
   
   
   
   ---
   **^ Add meaningful description above**
   
   Read the **[Pull Request 
Guidelines](https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst#pull-request-guidelines)**
 for more information.
   In case of fundamental code changes, an Airflow Improvement Proposal 
([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvement+Proposals))
 is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party 
License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in a 
newsfragment file, named `{pr_number}.significant.rst` or 
`{issue_number}.significant.rst`, in 
[newsfragments](https://github.com/apache/airflow/tree/main/newsfragments).
   





[GitHub] [airflow] ephraimbuddy commented on a diff in pull request #27828: Soft delete datasets that are no longer referenced in DAG schedules or task outlets

2022-11-24 Thread GitBox


ephraimbuddy commented on code in PR #27828:
URL: https://github.com/apache/airflow/pull/27828#discussion_r1032034170


##
airflow/jobs/scheduler_job.py:
##
@@ -1574,3 +1585,33 @@ def _cleanup_stale_dags(self, session: Session = NEW_SESSION) -> None:
             dag.is_active = False
             SerializedDagModel.remove_dag(dag_id=dag.dag_id, session=session)
         session.flush()
+
+    @provide_session
+    def _orphan_unreferenced_datasets(self, session: Session = NEW_SESSION) -> None:
+        """
+        Detects datasets that are no longer referenced in any DAG schedule parameters or task outlets and
+        sets the dataset is_orphaned flags to True
+        """
+        orphaned_dataset_query = (
+            session.query(DatasetModel)
+            .join(
+                DagScheduleDatasetReference,
+                DagScheduleDatasetReference.dataset_id == DatasetModel.id,
+                isouter=True,
+            )
+            .join(
+                TaskOutletDatasetReference,
+                TaskOutletDatasetReference.dataset_id == DatasetModel.id,
+                isouter=True,
+            )
+            .group_by(DatasetModel.id)
+            .having(
+                and_(
+                    func.count(DagScheduleDatasetReference.dag_id) == 0,
+                    func.count(TaskOutletDatasetReference.dag_id) == 0,
+                )
+            )
+        )
+        for dataset in orphaned_dataset_query:
+            self.log.info("Orphaning unreferenced dataset '%s'", dataset.uri)
+            dataset.is_orphaned = True

Review Comment:
   ```suggestion
   ).update({DatasetModel.is_orphaned: True}, synchronize_session='fetch')
   )
   ```
   If this works, I think it's faster.



##
airflow/migrations/versions/0122_2_5_0_add_is_orphaned_to_datasetmodel.py:
##
@@ -0,0 +1,49 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""Add is_orphaned to DatasetModel
+
+Revision ID: 290244fb8b83
+Revises: 65a852f26899
+Create Date: 2022-11-22 00:12:53.432961
+
+"""
+
+from __future__ import annotations
+
+import sqlalchemy as sa
+from alembic import op
+
+# revision identifiers, used by Alembic.
+revision = "290244fb8b83"
+down_revision = "65a852f26899"
+branch_labels = None
+depends_on = None
+airflow_version = "2.5.0"
+
+
+def upgrade():
+    """Add is_orphaned to DatasetModel"""
+    with op.batch_alter_table("dataset") as batch_op:

Review Comment:
   I think so, due to SQLite, but I don't think we need the server_default since the default is `False`.






[GitHub] [airflow] Taragolis commented on a diff in pull request #27901: Add information on how to run tests in Breeze via the PyCharm IDE

2022-11-24 Thread GitBox


Taragolis commented on code in PR #27901:
URL: https://github.com/apache/airflow/pull/27901#discussion_r1032025382


##
TESTING.rst:
##
@@ -61,20 +61,51 @@ Running Unit Tests from PyCharm IDE
 To run unit tests from the PyCharm IDE, create the `local virtualenv `_,
 select it as the default project's environment, then configure your test runner:
 
-.. image:: images/configure_test_runner.png
+.. image:: images/pycharm/configure_test_runner.png
     :align: center
     :alt: Configuring test runner
 
 and run unit tests as follows:
 
-.. image:: images/running_unittests.png
+.. image:: images/pycharm/running_unittests.png
     :align: center
     :alt: Running unit tests
 
 **NOTE:** You can run the unit tests in the standalone local virtualenv
 (with no Breeze installed) if they do not have dependencies such as
 Postgres/MySQL/Hadoop/etc.
 
+Running Unit Tests from PyCharm IDE using Breeze
+------------------------------------------------
+
+Ideally, all unit tests should be run using the standardized Breeze environment.  While not
+as convenient as the one-click "play button" in PyCharm, the IDE can be configured to do
+this in two clicks.
+
+1. Add Breeze as an "External Tool"
+  a. File > Settings > Tools > External Tools
+  b. Click the little plus symbol to open the "Create Tool" popup and fill it out:

Review Comment:
   Some macOS-specific stuff: on macOS (and only there) the user should navigate to `PyCharm -> Preferences` instead of `File > Settings`.






[GitHub] [airflow] jedcunningham commented on a diff in pull request #27828: Soft delete datasets that are no longer referenced in DAG schedules or task outlets

2022-11-24 Thread GitBox


jedcunningham commented on code in PR #27828:
URL: https://github.com/apache/airflow/pull/27828#discussion_r1031991662


##
airflow/jobs/scheduler_job.py:
##
@@ -1574,3 +1585,33 @@ def _cleanup_stale_dags(self, session: Session = NEW_SESSION) -> None:
             dag.is_active = False
             SerializedDagModel.remove_dag(dag_id=dag.dag_id, session=session)
         session.flush()
+
+    @provide_session
+    def _orphan_unreferenced_datasets(self, session: Session = NEW_SESSION) -> None:
+        """
+        Detects datasets that are no longer referenced in any DAG schedule parameters or task outlets and
+        sets the dataset is_orphaned flags to True
+        """
+        orphaned_dataset_query = (
+            session.query(DatasetModel)
+            .join(
+                DagScheduleDatasetReference,
+                DagScheduleDatasetReference.dataset_id == DatasetModel.id,
+                isouter=True,
+            )
+            .join(
+                TaskOutletDatasetReference,
+                TaskOutletDatasetReference.dataset_id == DatasetModel.id,
+                isouter=True,
+            )
+            .group_by(DatasetModel.id)
+            .having(
+                and_(
+                    func.count(DagScheduleDatasetReference.dag_id) == 0,
+                    func.count(TaskOutletDatasetReference.dag_id) == 0,
+                )
+            )
+        )
+        for dataset in orphaned_dataset_query.all():

Review Comment:
   ```suggestion
   for dataset in orphaned_dataset_query:
   ```



##
airflow/www/views.py:
##
@@ -3648,7 +3648,7 @@ def datasets_summary(self):
         if has_event_filters:
             count_query = count_query.join(DatasetEvent, DatasetEvent.dataset_id == DatasetModel.id)
 
-        filters = []
+        filters = [DatasetModel.is_orphaned.is_(False)]

Review Comment:
   ```suggestion
   filters = [~DatasetModel.is_orphaned]
   ```



##
airflow/jobs/scheduler_job.py:
##
@@ -1574,3 +1585,33 @@ def _cleanup_stale_dags(self, session: Session = NEW_SESSION) -> None:
             dag.is_active = False
             SerializedDagModel.remove_dag(dag_id=dag.dag_id, session=session)
         session.flush()
+
+    @provide_session
+    def _orphan_unreferenced_datasets(self, session: Session = NEW_SESSION) -> None:
+        """
+        Detects datasets that are no longer referenced in any DAG schedule parameters or task outlets and
+        sets the dataset is_orphaned flags to True
+        """
+        orphaned_dataset_query = (
+            session.query(DatasetModel)
+            .join(
+                DagScheduleDatasetReference,
+                DagScheduleDatasetReference.dataset_id == DatasetModel.id,
+                isouter=True,
+            )
+            .join(
+                TaskOutletDatasetReference,
+                TaskOutletDatasetReference.dataset_id == DatasetModel.id,
+                isouter=True,
+            )
+            .group_by(DatasetModel.id)
+            .having(
+                and_(
+                    func.count(DagScheduleDatasetReference.dag_id) == 0,
+                    func.count(TaskOutletDatasetReference.dag_id) == 0,
+                )
+            )
+        )
+        for dataset in orphaned_dataset_query.all():
+            self.log.info("Orphaning dataset '%s'", dataset.uri)

Review Comment:
   ```suggestion
   self.log.info("Orphaning unreferenced dataset '%s'", dataset.uri)
   ```



##
airflow/dag_processing/manager.py:
##
@@ -433,8 +433,10 @@ def __init__(
         self.last_stat_print_time = 0
         # Last time we cleaned up DAGs which are no longer in files
        self.last_deactivate_stale_dags_time = timezone.make_aware(datetime.fromtimestamp(0))
-        # How often to check for DAGs which are no longer in files
-        self.deactivate_stale_dags_interval = conf.getint("scheduler", "deactivate_stale_dags_interval")
+        # How often to clean up:
+        # * DAGs which are no longer in files
+        # * datasets that are no longer referenced by any DAG schedule parameters or task outlets

Review Comment:
   ```suggestion
   # How often to check for DAGs which are no longer in files
   ```



##
airflow/models/dag.py:
##
@@ -2828,6 +2828,7 @@ def bulk_write_to_db(
         for dataset in all_datasets:
             stored_dataset = session.query(DatasetModel).filter(DatasetModel.uri == dataset.uri).first()
             if stored_dataset:
+                stored_dataset.is_orphaned = False

Review Comment:
   Test this situation.



##
airflow/migrations/versions/0122_2_5_0_add_is_orphaned_to_datasetmodel.py:
##
@@ -0,0 +1,49 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regard

[GitHub] [airflow] Ken-poc commented on issue #27903: dag.timezone can not have start_date.tzinfo

2022-11-24 Thread GitBox


Ken-poc commented on issue #27903:
URL: https://github.com/apache/airflow/issues/27903#issuecomment-1326986647

   This is not a bug. Could you remove the label?





[GitHub] [airflow] Ken-poc commented on issue #27903: dag.timezone can not have start_date.tzinfo

2022-11-24 Thread GitBox


Ken-poc commented on issue #27903:
URL: https://github.com/apache/airflow/issues/27903#issuecomment-1326984633

   yes, I know that works.
   
       if start_date and start_date.tzinfo:
           tzinfo = None if start_date.tzinfo else settings.TIMEZONE
           tz = pendulum.instance(start_date, tz=tzinfo).timezone
   
   Though `start_date` has its `tzinfo`, `tzinfo` is always assigned _None_, and `tz` is eventually made from `start_date`, not `tzinfo`, anyway. This makes it confusing even though it actually works.
   
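The point can be reproduced with the standard library alone (a hedged sketch: `DEFAULT_TZ` stands in for `settings.TIMEZONE`, and the return expression approximates `pendulum.instance(start_date, tz=tzinfo).timezone`, which keeps an aware datetime's own timezone regardless of `tz`):

```python
from datetime import datetime, timedelta, timezone

DEFAULT_TZ = timezone.utc  # stand-in for settings.TIMEZONE


def resolve_tz(start_date: datetime):
    # The quoted branch: tzinfo is None exactly when start_date is aware...
    tzinfo = None if start_date.tzinfo else DEFAULT_TZ
    # ...so the resolved timezone always comes from start_date itself when
    # it is aware, and from the fallback only when it is naive.
    return start_date.tzinfo if start_date.tzinfo else tzinfo


aware = datetime(2022, 11, 24, tzinfo=timezone(timedelta(hours=9)))
naive = datetime(2022, 11, 24)
```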





[GitHub] [airflow] zsdyx commented on issue #13668: scheduler dies with "MySQLdb._exceptions.OperationalError: (1213, 'Deadlock found when trying to get lock; try restarting transaction')"

2022-11-24 Thread GitBox


zsdyx commented on issue #13668:
URL: https://github.com/apache/airflow/issues/13668#issuecomment-1326980476

   A MySQL deadlock occurs when I use 2.4.1. Has this problem never been solved?





[GitHub] [airflow] NickYadance closed pull request #27344: Add retry to submit_event in trigger to avoid deadlock

2022-11-24 Thread GitBox


NickYadance closed pull request #27344: Add retry to submit_event in trigger to 
avoid deadlock
URL: https://github.com/apache/airflow/pull/27344





[GitHub] [airflow] uranusjr commented on issue #27026: Parameterise key sorting in "Rendered Template" view

2022-11-24 Thread GitBox


uranusjr commented on issue #27026:
URL: https://github.com/apache/airflow/issues/27026#issuecomment-1326964845

   The problem is that `template_fields` does not contain all operator fields, and you'd have no way to sort those non-templated fields (especially since some of them can't be templated to begin with).





[GitHub] [airflow] Ken-poc commented on issue #27903: dag.timezone can not have start_date.tzinfo

2022-11-24 Thread GitBox


Ken-poc commented on issue #27903:
URL: https://github.com/apache/airflow/issues/27903#issuecomment-1326961733

   Yes, that's right. I pointed out that `tzinfo` is always assigned to None. I think this is unnecessary.
   
https://github.com/apache/airflow/blob/3e288abd0bc3e5788dcd7f6d9f6bef26ec4c7281/airflow/models/dag.py#L465





[GitHub] [airflow] vksunilk commented on issue #27894: Status of testing Providers that were prepared on November 24, 2022

2022-11-24 Thread GitBox


vksunilk commented on issue #27894:
URL: https://github.com/apache/airflow/issues/27894#issuecomment-1326959302

   #26986 works as expected. I am able to view the DataprocLink irrespective of job status.





[GitHub] [airflow] uranusjr commented on issue #27903: dag.timezone can not have start_date.tzinfo

2022-11-24 Thread GitBox


uranusjr commented on issue #27903:
URL: https://github.com/apache/airflow/issues/27903#issuecomment-1326956524

   Not sure what you mean. The attribute is set, from what I can tell.
   
   ```pycon
   >>> from airflow.models.dag import DAG
   >>> import pendulum
   >>> d = pendulum.now()
   >>> d.tzinfo
   Timezone('Etc/UTC')
   >>> dag = DAG(dag_id="xxx", start_date=d)
   >>> dag.timezone
   Timezone('Etc/UTC')
   ```





[GitHub] [airflow] boring-cyborg[bot] commented on issue #27903: dag.timezone can not have start_date.tzinfo

2022-11-24 Thread GitBox


boring-cyborg[bot] commented on issue #27903:
URL: https://github.com/apache/airflow/issues/27903#issuecomment-1326954259

   Thanks for opening your first issue here! Be sure to follow the issue 
template!
   





[GitHub] [airflow] Ken-poc opened a new issue, #27903: dag.timezone can not have start_date.tzinfo

2022-11-24 Thread GitBox


Ken-poc opened a new issue, #27903:
URL: https://github.com/apache/airflow/issues/27903

   ### Apache Airflow version
   
   main (development)
   
   ### What happened
   
   I found a weird piece of code when assigning `dag.timezone` in the DAG model. `tzinfo` is always assigned to None even though `start_date` has `tzinfo`. Is it intended?
   
   ### What you think should happen instead
   
   Dag should have the timezone if start_date passed in has tzinfo. 
   
   ### How to reproduce
   
   Dag can't have timezone from start_date.
   
   ### Operating System
   
   macOS Monterey
   
   ### Versions of Apache Airflow Providers
   
   apache-airflow-providers-amazon==6.0.0
   apache-airflow-providers-cncf-kubernetes==4.4.0
   apache-airflow-providers-common-sql==1.2.0
   apache-airflow-providers-ftp==3.1.0
   apache-airflow-providers-http==4.0.0
   apache-airflow-providers-imap==3.0.0
   apache-airflow-providers-sqlite==3.2.1
   
   
   ### Deployment
   
   Virtualenv installation
   
   ### Deployment details
   
   None
   
   ### Anything else
   
   None
   
   ### Are you willing to submit PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of 
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   





[GitHub] [airflow] uranusjr commented on pull request #27740: Remove XCom API endpoint full deserialization option

2022-11-24 Thread GitBox


uranusjr commented on PR #27740:
URL: https://github.com/apache/airflow/pull/27740#issuecomment-1326951283

   Sounds to me like the most reasonable approach here would be to add a config option to allow this feature.





[airflow] branch main updated: Remove is_mapped attribute (#27881)

2022-11-24 Thread uranusjr
This is an automated email from the ASF dual-hosted git repository.

uranusjr pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/airflow.git


The following commit(s) were added to refs/heads/main by this push:
 new 3e288abd0b Remove is_mapped attribute (#27881)
3e288abd0b is described below

commit 3e288abd0bc3e5788dcd7f6d9f6bef26ec4c7281
Author: Tzu-ping Chung 
AuthorDate: Fri Nov 25 09:21:01 2022 +0800

Remove is_mapped attribute (#27881)
---
 .../endpoints/task_instance_endpoint.py|  3 +-
 airflow/api_connexion/schemas/task_schema.py   | 17 ++--
 airflow/cli/commands/task_command.py   |  3 +-
 airflow/models/baseoperator.py |  2 -
 airflow/models/mappedoperator.py   |  2 -
 airflow/models/operator.py | 23 -
 airflow/models/taskinstance.py |  5 +-
 airflow/models/xcom_arg.py |  3 +-
 airflow/ti_deps/deps/ready_to_reschedule.py|  4 +-
 airflow/ti_deps/deps/trigger_rule_dep.py   |  3 +-
 airflow/www/views.py   |  7 +-
 tests/decorators/test_python.py|  7 +-
 tests/models/test_taskinstance.py  | 99 +-
 13 files changed, 151 insertions(+), 27 deletions(-)

diff --git a/airflow/api_connexion/endpoints/task_instance_endpoint.py 
b/airflow/api_connexion/endpoints/task_instance_endpoint.py
index 4e9d6cb9a1..9d5d54ba58 100644
--- a/airflow/api_connexion/endpoints/task_instance_endpoint.py
+++ b/airflow/api_connexion/endpoints/task_instance_endpoint.py
@@ -45,6 +45,7 @@ from airflow.api_connexion.schemas.task_instance_schema import (
 from airflow.api_connexion.types import APIResponse
 from airflow.models import SlaMiss
 from airflow.models.dagrun import DagRun as DR
+from airflow.models.operator import needs_expansion
 from airflow.models.taskinstance import TaskInstance as TI, clear_task_instances
 from airflow.security import permissions
 from airflow.utils.airflow_flask_app import get_airflow_app
@@ -202,7 +203,7 @@ def get_mapped_task_instances(
 if not task:
 error_message = f"Task id {task_id} not found"
 raise NotFound(error_message)
-if not task.is_mapped:
+if not needs_expansion(task):
 error_message = f"Task id {task_id} is not mapped"
 raise NotFound(error_message)
 
diff --git a/airflow/api_connexion/schemas/task_schema.py b/airflow/api_connexion/schemas/task_schema.py
index 0fcb9ff18f..5715ca2ea0 100644
--- a/airflow/api_connexion/schemas/task_schema.py
+++ b/airflow/api_connexion/schemas/task_schema.py
@@ -27,6 +27,7 @@ from airflow.api_connexion.schemas.common_schema import (
 WeightRuleField,
 )
 from airflow.api_connexion.schemas.dag_schema import DAGSchema
+from airflow.models.mappedoperator import MappedOperator
 from airflow.models.operator import Operator
 
 
@@ -59,22 +60,28 @@ class TaskSchema(Schema):
 template_fields = fields.List(fields.String(), dump_only=True)
 sub_dag = fields.Nested(DAGSchema, dump_only=True)
 downstream_task_ids = fields.List(fields.String(), dump_only=True)
-params = fields.Method("get_params", dump_only=True)
-is_mapped = fields.Boolean(dump_only=True)
+params = fields.Method("_get_params", dump_only=True)
+is_mapped = fields.Method("_get_is_mapped", dump_only=True)
 
-def _get_class_reference(self, obj):
+@staticmethod
+def _get_class_reference(obj):
 result = ClassReferenceSchema().dump(obj)
 return result.data if hasattr(result, "data") else result
 
-def _get_operator_name(self, obj):
+@staticmethod
+def _get_operator_name(obj):
 return obj.operator_name
 
 @staticmethod
-def get_params(obj):
+def _get_params(obj):
 """Get the Params defined in a Task."""
 params = obj.params
 return {k: v.dump() for k, v in params.items()}
 
+@staticmethod
+def _get_is_mapped(obj):
+return isinstance(obj, MappedOperator)
+
 
 class TaskCollection(NamedTuple):
 """List of Tasks with metadata."""
diff --git a/airflow/cli/commands/task_command.py b/airflow/cli/commands/task_command.py
index a217d2c78d..078565dc38 100644
--- a/airflow/cli/commands/task_command.py
+++ b/airflow/cli/commands/task_command.py
@@ -42,6 +42,7 @@ from airflow.models import DagPickle, TaskInstance
 from airflow.models.baseoperator import BaseOperator
 from airflow.models.dag import DAG
 from airflow.models.dagrun import DagRun
+from airflow.models.operator import needs_expansion
 from airflow.ti_deps.dep_context import DepContext
 from airflow.ti_deps.dependencies_deps import SCHEDULER_QUEUED_DEPS
 from airflow.typing_compat import Literal
@@ -150,7 +151,7 @@ def _get_ti(
 """Get the task instance through DagRun.run_id, if that fails, get the TI the old way."""
 if not exec_date_or_run_id and not create_if_necessary:

[GitHub] [airflow] uranusjr merged pull request #27881: Remove is_mapped attribute

2022-11-24 Thread GitBox


uranusjr merged PR #27881:
URL: https://github.com/apache/airflow/pull/27881





[GitHub] [airflow] uranusjr closed issue #27879: Getting error in scheduler logs "Task killed externally" when running a dag with task group mapping

2022-11-24 Thread GitBox


uranusjr closed issue #27879: Getting error in scheduler logs "Task killed 
externally" when running a dag with task group mapping
URL: https://github.com/apache/airflow/issues/27879





[GitHub] [airflow] uranusjr commented on a diff in pull request #27778: Lambda hook: make runtime and handler optional

2022-11-24 Thread GitBox


uranusjr commented on code in PR #27778:
URL: https://github.com/apache/airflow/pull/27778#discussion_r1031930667


##
airflow/providers/amazon/aws/hooks/lambda_function.py:
##
@@ -93,6 +93,12 @@ def create_lambda(
 code_signing_config_arn: str | None = None,
 architectures: list[str] | None = None,
 ) -> dict:
+        if package_type == "Zip":
+            if handler is None:
+                raise ValueError("Parameter 'handler' is required if 'package_type' is 'Zip'")
+            if runtime is None:
+                raise ValueError("Parameter 'runtime' is required if 'package_type' is 'Zip'")

Review Comment:
   These should be TypeError to mirror Python’s default behaviour.
   
   ```pycon
   >>> def f(*, a): pass
   ... 
   >>> f()
   Traceback (most recent call last):
 File "", line 1, in 
   TypeError: f() missing 1 required keyword-only argument: 'a'
   ```
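   A hedged sketch of that validation raising `TypeError` instead (the function shape and error messages are illustrative, not the provider's actual code):

   ```python
   # Illustrative only: parameters that become required when package_type is
   # "Zip" raise TypeError, mirroring Python's missing-argument behaviour.
   def create_lambda(function_name, package_type, handler=None, runtime=None):
       if package_type == "Zip":
           if handler is None:
               raise TypeError("create_lambda() missing required argument for Zip package: 'handler'")
           if runtime is None:
               raise TypeError("create_lambda() missing required argument for Zip package: 'runtime'")
       return {"FunctionName": function_name, "PackageType": package_type}
   ```

   Callers then see the same exception class they would get for any other missing argument.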






[GitHub] [airflow] uranusjr commented on a diff in pull request #27844: Detect alternative container runtime automatically

2022-11-24 Thread GitBox


uranusjr commented on code in PR #27844:
URL: https://github.com/apache/airflow/pull/27844#discussion_r1031929935


##
CONTRIBUTORS_QUICK_START.rst:
##
@@ -50,7 +50,7 @@ Local machine development
 
 If you do not work with remote development environment, you need those 
prerequisites.
 
-1. Docker Community Edition (you can also use Colima, see instructions below)
+1. Container runtime: Docker Community Edition (recommended), Colima.

Review Comment:
   FWIW, last time I tried using containerd with breeze, the Docker CLI was not the 
main problem (Podman is actually close enough that you can just change a few constants 
to make things work); it was docker-compose. But that’s off-topic; the main point 
here is that the terminology needs to be fixed to avoid introducing confusion 
unnecessarily.






[GitHub] [airflow] uranusjr commented on issue #27859: DYNAMICALLY CREATING TASKS issue : "_TaskDecorator' object has no attribute 'update_relative': "

2022-11-24 Thread GitBox


uranusjr commented on issue #27859:
URL: https://github.com/apache/airflow/issues/27859#issuecomment-1326922559

   I suspect a call is missed somewhere in how you instantiate tasks. Note that 
a `@task` function needs to be _called_ (either with `f.expand()`, 
`f.expand_kwargs()`, or just `f()` like a function) to become a concrete task. 
We can probably check for this user error and emit a better message, but we 
need a reproduction first to identify the issue.
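   The wrapper-vs-called distinction can be illustrated with a minimal stand-in decorator (purely illustrative; this is not Airflow's implementation):

   ```python
   # Stand-in for Airflow's @task wrapper: the decorated name stays a wrapper
   # object until it is called, and only the call produces a concrete "task".
   class _TaskDecorator:
       def __init__(self, func):
           self.func = func

       def __call__(self, *args, **kwargs):
           # Calling the wrapper is what yields the concrete task object.
           return {"task_id": self.func.__name__, "op_args": args, "op_kwargs": kwargs}

   def task(func):
       return _TaskDecorator(func)

   @task
   def extract():
       return [1, 2, 3]

   wrapper = extract      # still the decorator wrapper -- not usable as a task
   concrete = extract()   # calling it yields a concrete task
   ```

   Forgetting the call leaves `wrapper` in the DAG, which is roughly the shape of the reported error.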





[GitHub] [airflow] uranusjr closed issue #27862: airflow failure and success callbacks read task instance state as 'running'

2022-11-24 Thread GitBox


uranusjr closed issue #27862: airflow failure and success callbacks read task 
instance state as 'running'
URL: https://github.com/apache/airflow/issues/27862





[GitHub] [airflow] uranusjr commented on issue #27862: airflow failure and success callbacks read task instance state as 'running'

2022-11-24 Thread GitBox


uranusjr commented on issue #27862:
URL: https://github.com/apache/airflow/issues/27862#issuecomment-1326920978

   Duplicate of #26760.





[GitHub] [airflow] uranusjr commented on a diff in pull request #27887: Add allow list for imports during deserialization

2022-11-24 Thread GitBox


uranusjr commented on code in PR #27887:
URL: https://github.com/apache/airflow/pull/27887#discussion_r1031925774


##
airflow/utils/json.py:
##
@@ -189,7 +189,7 @@ def __init__(self, *args, **kwargs) -> None:
 if not kwargs.get("object_hook"):
 kwargs["object_hook"] = self.object_hook
 
-patterns = conf.getjson("core", "allowed_deserialization_classes")
+patterns = cast(list, conf.getjson("core", "allowed_deserialization_classes"))

Review Comment:
   This would result in a confusing error if the value is not set to a list. 
It’s probably better to explicitly check that the value is a list (and raise 
a clear message explaining that the config value is the exact source of failure).
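   A minimal sketch of the explicit check being suggested (the function name and message wording are illustrative, not actual Airflow code):

   ```python
   # Illustrative only: reject non-list config values with an error that names
   # the offending config option instead of failing later with a vague message.
   def load_patterns(value):
       if not isinstance(value, list):
           raise ValueError(
               "[core] allowed_deserialization_classes must be a JSON list, "
               f"got {type(value).__name__}"
           )
       return value
   ```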







[GitHub] [airflow] ferruzzi commented on a diff in pull request #27901: Add information on how to run tests in Breeze via the PyCharm IDE

2022-11-24 Thread GitBox


ferruzzi commented on code in PR #27901:
URL: https://github.com/apache/airflow/pull/27901#discussion_r1031925157


##
TESTING.rst:
##
@@ -61,20 +61,51 @@ Running Unit Tests from PyCharm IDE
 To run unit tests from the PyCharm IDE, create the `local virtualenv 
`_,
 select it as the default project's environment, then configure your test 
runner:
 
-.. image:: images/configure_test_runner.png
+.. image:: images/pycharm/configure_test_runner.png
 :align: center
 :alt: Configuring test runner
 
 and run unit tests as follows:
 
-.. image:: images/running_unittests.png
+.. image:: images/pycharm/running_unittests.png
 :align: center
 :alt: Running unit tests
 
 **NOTE:** You can run the unit tests in the standalone local virtualenv
 (with no Breeze installed) if they do not have dependencies such as
 Postgres/MySQL/Hadoop/etc.
 
+Running Unit Tests from PyCharm IDE using Breeze
+------------------------------------------------
+
+Ideally, all unit tests should be run using the standardized Breeze 
environment.  While not
+as convenient as the one-click "play button" in PyCharm, the IDE can be 
configured to do
+this in two clicks.
+
+1. Add Breeze as an "External Tool"
+  a. File > Settings > Tools > External Tools
+  b. Click the little plus symbol to open the "Create Tool" popup and fill it 
out:
+
+.. image:: images/pycharm/pycharm_create_tool.png
+:align: center
+:alt: Installing Python extension
+
+2. Add the tool to the context menu
+  a. File > Settings > Appearance and Behavior > Menus and Toolbars > Project 
View Popup Menu
+  b. Click on the list of entries where you would like it to be added.  Right 
above or below
+ "Project View Popup Menu Run Group" may be a good choice, you can drag 
and drop this list
+ to rearrange the placement later.
+  c. Click the little plus at the top of the popup window
+  d. Find your "External Tool" in the new "Choose Actions to Add" popup and 
click OK.  If you
+ followed the image above, it will be at External Tools > External Tools > 
Breeze

Review Comment:
   Committed the phrasing change, thanks.   
   
   For the bullet styles, this is how it renders, despite the letter-bullets in 
the raw code. 
   
![image](https://user-images.githubusercontent.com/1920178/203878654-362d882f-582b-4f38-83fc-3769ea3f5a05.png)
   






[GitHub] [airflow] uranusjr commented on a diff in pull request #27834: Make sure we can get out of a faulty scheduler state

2022-11-24 Thread GitBox


uranusjr commented on code in PR #27834:
URL: https://github.com/apache/airflow/pull/27834#discussion_r1031924843


##
airflow/models/dagrun.py:
##
@@ -780,8 +780,7 @@ def _expand_mapped_task_if_needed(ti: TI) -> Iterable[TI] | None:
 except NotMapped:  # Not a mapped task, nothing needed.
 return None
 if expanded_tis:
-assert expanded_tis[0] is ti
-return expanded_tis[1:]
+return expanded_tis

Review Comment:
   Since this function only returns _new_ ti objects, should we do something 
like this?
   
   ```python
   if expanded_tis[0] is ti:
   return expanded_tis[1:]
   return expanded_tis
   ```






[GitHub] [airflow] uranusjr commented on a diff in pull request #27834: Make sure we can get out of a faulty scheduler state

2022-11-24 Thread GitBox


uranusjr commented on code in PR #27834:
URL: https://github.com/apache/airflow/pull/27834#discussion_r1031924565


##
airflow/models/abstractoperator.py:
##
@@ -494,18 +495,30 @@ def expand_mapped_task(self, run_id: str, *, session: Session) -> tuple[Sequence
 total_length,
 )
 unmapped_ti.state = TaskInstanceState.SKIPPED
-indexes_to_map = ()
 else:
-# Otherwise convert this into the first mapped index, and create
-# TaskInstance for other indexes.
-unmapped_ti.map_index = 0
-self.log.debug("Updated in place to become %s", unmapped_ti)
-all_expanded_tis.append(unmapped_ti)
-indexes_to_map = range(1, total_length)
-state = unmapped_ti.state
-elif not total_length:
+zero_index_ti_exists = session.query(
+exists().where(
+TaskInstance.dag_id == self.dag_id,
+TaskInstance.task_id == self.task_id,
+TaskInstance.run_id == run_id,
+TaskInstance.map_index == 0,
+)
+).scalar()

Review Comment:
   IIRC `EXISTS` has some compatibility issues across databases (don’t remember 
what exactly), so we generally use `query(count())...scalar() > 0` instead.
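   At the SQL level the count-based pattern reduces to a plain `COUNT(*)`; a minimal stdlib sketch with sqlite3 standing in for the real backend (table and column names mirror the snippet above, but the code is illustrative):

   ```python
   import sqlite3

   conn = sqlite3.connect(":memory:")
   conn.execute("CREATE TABLE task_instance (task_id TEXT, run_id TEXT, map_index INTEGER)")
   conn.execute("INSERT INTO task_instance VALUES ('t1', 'r1', 0)")

   def zero_index_ti_exists(conn, task_id, run_id):
       # COUNT(*) > 0 instead of EXISTS: behaves identically on every backend.
       (n,) = conn.execute(
           "SELECT COUNT(*) FROM task_instance "
           "WHERE task_id = ? AND run_id = ? AND map_index = 0",
           (task_id, run_id),
       ).fetchone()
       return n > 0
   ```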






[GitHub] [airflow] xlanor commented on a diff in pull request #27898: fix: current_state method on TaskInstance doesn't filter by map_index

2022-11-24 Thread GitBox


xlanor commented on code in PR #27898:
URL: https://github.com/apache/airflow/pull/27898#discussion_r1031921282


##
airflow/models/taskinstance.py:
##
@@ -719,6 +719,7 @@ def current_state(self, session: Session = NEW_SESSION) -> str:
 .filter(
 TaskInstance.dag_id == self.dag_id,
 TaskInstance.task_id == self.task_id,
+TaskInstance.map_index == self.map_index,

Review Comment:
   Thanks, will work on this PR tomorrow and hopefully get it ready for review 
shortly






[GitHub] [airflow] uranusjr commented on a diff in pull request #27898: fix: current_state method on TaskInstance doesn't filter by map_index

2022-11-24 Thread GitBox


uranusjr commented on code in PR #27898:
URL: https://github.com/apache/airflow/pull/27898#discussion_r1031920687


##
airflow/models/taskinstance.py:
##
@@ -719,6 +719,7 @@ def current_state(self, session: Session = NEW_SESSION) -> str:
 .filter(
 TaskInstance.dag_id == self.dag_id,
 TaskInstance.task_id == self.task_id,
+TaskInstance.map_index == self.map_index,

Review Comment:
   Honestly `current_state` isn’t really used almost anywhere in the code base 
(the only use is for `airflow tasks state`, and I’d argue even that usage is 
not necessary at all), so the test coverage is mostly non-existent. You can 
probably add a test for the `airflow tasks` CLI command (in 
`tests/cli/commands/test_task_command.py`) to cover this.






[GitHub] [airflow] xlanor commented on a diff in pull request #27898: fix: current_state method on TaskInstance doesn't filter by map_index

2022-11-24 Thread GitBox


xlanor commented on code in PR #27898:
URL: https://github.com/apache/airflow/pull/27898#discussion_r1031919560


##
airflow/models/taskinstance.py:
##
@@ -719,6 +719,7 @@ def current_state(self, session: Session = NEW_SESSION) -> str:
 .filter(
 TaskInstance.dag_id == self.dag_id,
 TaskInstance.task_id == self.task_id,
+TaskInstance.map_index == self.map_index,

Review Comment:
   Thanks!
   
   I've looked at the tests in tests/model.py and I don't see any examples of a 
test of a mapped task there. Is there any test that you would suggest so that a 
regression does not occur in the future?






[GitHub] [airflow] uranusjr commented on a diff in pull request #27898: fix: current_state method on TaskInstance doesn't filter by map_index

2022-11-24 Thread GitBox


uranusjr commented on code in PR #27898:
URL: https://github.com/apache/airflow/pull/27898#discussion_r1031919036


##
airflow/models/taskinstance.py:
##
@@ -719,6 +719,7 @@ def current_state(self, session: Session = NEW_SESSION) -> str:
 .filter(
 TaskInstance.dag_id == self.dag_id,
 TaskInstance.task_id == self.task_id,
+TaskInstance.map_index == self.map_index,

Review Comment:
   It’s probably a good chance to rewrite this to something like
   
   ```python
   from sqlalchemy.inspection import inspect
   
   session.query(TaskInstance.state).filter(
       *(col == getattr(self, col.name) for col in inspect(TaskInstance).primary_key)
   ).scalar()
   ```
   
   This would be resilient to any primary key changes in the future.






[GitHub] [airflow] pulquero opened a new issue, #27902: HdfsSensor has no clear failure mode

2022-11-24 Thread GitBox


pulquero opened a new issue, #27902:
URL: https://github.com/apache/airflow/issues/27902

   ### Description
   
   Currently, HdfsSensor pings forever if some failure causes the file not to 
be written. Some sort of timeout parameter would be nice.
   
   ### Use case/motivation
   
   If there is a failure earlier in the pipeline that prevents the file of 
interest from being written, HdfsSensor just pings forever, and everything looks 
fine. I would like some way to have HdfsSensor fail, so that my team 
can detect issues promptly and address them.
   
   ### Related issues
   
   _No response_
   
   ### Are you willing to submit a PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of 
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   





[GitHub] [airflow] uranusjr commented on a diff in pull request #27901: Add information on how to run tests in Breeze via the PyCharm IDE

2022-11-24 Thread GitBox


uranusjr commented on code in PR #27901:
URL: https://github.com/apache/airflow/pull/27901#discussion_r1031914073


##
TESTING.rst:
##
@@ -61,20 +61,51 @@ Running Unit Tests from PyCharm IDE
 To run unit tests from the PyCharm IDE, create the `local virtualenv 
`_,
 select it as the default project's environment, then configure your test 
runner:
 
-.. image:: images/configure_test_runner.png
+.. image:: images/pycharm/configure_test_runner.png
 :align: center
 :alt: Configuring test runner
 
 and run unit tests as follows:
 
-.. image:: images/running_unittests.png
+.. image:: images/pycharm/running_unittests.png
 :align: center
 :alt: Running unit tests
 
 **NOTE:** You can run the unit tests in the standalone local virtualenv
 (with no Breeze installed) if they do not have dependencies such as
 Postgres/MySQL/Hadoop/etc.
 
+Running Unit Tests from PyCharm IDE using Breeze
+------------------------------------------------
+
+Ideally, all unit tests should be run using the standardized Breeze 
environment.  While not
+as convenient as the one-click "play button" in PyCharm, the IDE can be 
configured to do
+this in two clicks.
+
+1. Add Breeze as an "External Tool"
+  a. File > Settings > Tools > External Tools
+  b. Click the little plus symbol to open the "Create Tool" popup and fill it 
out:
+
+.. image:: images/pycharm/pycharm_create_tool.png
+:align: center
+:alt: Installing Python extension
+
+2. Add the tool to the context menu
+  a. File > Settings > Appearance and Behavior > Menus and Toolbars > Project 
View Popup Menu
+  b. Click on the list of entries where you would like it to be added.  Right 
above or below
+ "Project View Popup Menu Run Group" may be a good choice, you can drag 
and drop this list
+ to rearrange the placement later.
+  c. Click the little plus at the top of the popup window
+  d. Find your "External Tool" in the new "Choose Actions to Add" popup and 
click OK.  If you
+ followed the image above, it will be at External Tools > External Tools > 
Breeze

Review Comment:
   ```suggestion
 a. Navigate to File > Settings > Appearance and Behavior > Menus and 
Toolbars > Project View Popup Menu.
 b. Click on the list of entries where you would like it to be added.  
Right above or below
"Project View Popup Menu Run Group" may be a good choice, you can drag 
and drop this list
to rearrange the placement later.
 c. Click the little plus at the top of the popup window.
 d. Find your "External Tool" in the new "Choose Actions to Add" popup and 
click OK.  If you
followed the image above, it will be at External Tools > External Tools 
> Breeze.
   ```
   
   Maybe unify the style of bullet items?






[airflow] branch main updated (eba04d7c40 -> bad875b58d)

2022-11-24 Thread ephraimanierobi
This is an automated email from the ASF dual-hosted git repository.

ephraimanierobi pushed a change to branch main
in repository https://gitbox.apache.org/repos/asf/airflow.git


from eba04d7c40 tests: always cleanup registered test listeners (#27896)
 add bad875b58d Only get changelog for core commits (#27900)

No new revisions were added by this update.

Summary of changes:
 dev/airflow-github | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)



[GitHub] [airflow] ephraimbuddy merged pull request #27900: Only get changelog for core commits

2022-11-24 Thread GitBox


ephraimbuddy merged PR #27900:
URL: https://github.com/apache/airflow/pull/27900





[GitHub] [airflow] pierrejeambrun commented on pull request #27805: Automatically save and allow restore of recent DAG run configs

2022-11-24 Thread GitBox


pierrejeambrun commented on PR #27805:
URL: https://github.com/apache/airflow/pull/27805#issuecomment-1326861132

   @aaronabraham311 There are a lot of examples querying resources from the db in 
the views.py file. In this case DagRun should have what you need. In the 
example you mentioned above (the json config displayed in the dagrun details), it's 
coming from the grid_data view of that file :)





[GitHub] [airflow] ferruzzi opened a new pull request, #27901: Add information on how to run tests in Breeze via the PyCharm IDE

2022-11-24 Thread GitBox


ferruzzi opened a new pull request, #27901:
URL: https://github.com/apache/airflow/pull/27901

   How to add a context menu entry in PyCharm to run selected unit tests in the 
Breeze environment instead of in your working venv.
   
   Also moved the two existing PyCharm-specific images into a new subdirectory 
for organizational reasons.





[GitHub] [airflow] ferruzzi commented on a diff in pull request #27823: Amazon Provider Package user agent

2022-11-24 Thread GitBox


ferruzzi commented on code in PR #27823:
URL: https://github.com/apache/airflow/pull/27823#discussion_r1031869533


##
airflow/providers/amazon/aws/hooks/base_aws.py:
##
@@ -42,11 +46,13 @@
 from dateutil.tz import tzlocal
 from slugify import slugify
 
+from airflow import __version__ as airflow_version

Review Comment:
   I think I addressed this in 
https://github.com/apache/airflow/pull/27823/commits/a5fc3bc4a39855b9f3c7fc5ac26505709d191a98
 by moving it to a local import in the helper method.






[GitHub] [airflow] ferruzzi commented on a diff in pull request #27823: Amazon Provider Package user agent

2022-11-24 Thread GitBox


ferruzzi commented on code in PR #27823:
URL: https://github.com/apache/airflow/pull/27823#discussion_r1031868722


##
airflow/providers/amazon/aws/hooks/base_aws.py:
##
@@ -405,9 +411,68 @@ def __init__(
 self.resource_type = resource_type
 
 self._region_name = region_name
-self._config = config
+self._config = config or botocore.config.Config()
 self._verify = verify
 
+@classmethod
+def _get_provider_version(cls) -> str:
+"""Checks the Providers Manager for the package version."""
+manager = ProvidersManager()
+provider_name = manager.hooks[cls.conn_type].package_name  # type: ignore[union-attr]

Review Comment:
   Sorry for the delay, just got back from vacation and it took a little longer 
to get back into gear.  I ended up wrapping it in a try/except as mentioned and 
dropped the `if not hook`.  If `hook` is falsy, then it'll error out on the 
next line at `hook.package_name` anyway and get caught by the `except`.
   
   Addressed in 
https://github.com/apache/airflow/pull/27823/commits/a5fc3bc4a39855b9f3c7fc5ac26505709d191a98








[GitHub] [airflow] vandonr-amz opened a new pull request, #27899: fix sagemaker system test to run on Apple Silicon

2022-11-24 Thread GitBox


vandonr-amz opened a new pull request, #27899:
URL: https://github.com/apache/airflow/pull/27899

   this test was failing when launched from an M1 mac because the docker image 
was built for the local CPU type (arm64) and then uploaded to an amd64 linux, 
which didn't work.
   `--platform` is a buildx flag, but Breeze already replaces the default `docker 
build` with buildx, so this works alright.
   
   tested on an M1 macbook pro and on an EC2 ubuntu instance.





[airflow] branch v2-5-test updated (523df868dd -> 9bae336e69)

2022-11-24 Thread ephraimanierobi
This is an automated email from the ASF dual-hosted git repository.

ephraimanierobi pushed a change to branch v2-5-test
in repository https://gitbox.apache.org/repos/asf/airflow.git


omit 523df868dd Add release notes
 add 9bae336e69 Add release notes

This update added new revisions after undoing existing revisions.
That is to say, some revisions that were in the old version of the
branch are not in the new version.  This situation occurs
when a user --force pushes a change and generates a repository
containing something like this:

 * -- * -- B -- O -- O -- O   (523df868dd)
\
 N -- N -- N   refs/heads/v2-5-test (9bae336e69)

You should already have received notification emails for all of the O
revisions, and so the following emails describe only the N revisions
from the common base, B.

Any revisions marked "omit" are not gone; other references still
refer to them.  Any revisions marked "discard" are gone forever.

No new revisions were added by this update.

Summary of changes:
 RELEASE_NOTES.rst | 19 ---
 1 file changed, 4 insertions(+), 15 deletions(-)



[GitHub] [airflow] vincbeck commented on a diff in pull request #27820: Add retry option in RedshiftDeleteClusterOperator to retry when an operation is running in the cluster

2022-11-24 Thread GitBox


vincbeck commented on code in PR #27820:
URL: https://github.com/apache/airflow/pull/27820#discussion_r1031804824


##
airflow/providers/amazon/aws/operators/redshift_cluster.py:
##
@@ -498,22 +502,38 @@ def __init__(
 wait_for_completion: bool = True,
 aws_conn_id: str = "aws_default",
 poll_interval: float = 30.0,
+retry: bool = False,
+retry_attempts: int = 10,

Review Comment:
   Should be good now @eladkal 
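   For context, the `retry`/`retry_attempts` behavior under discussion can be sketched like this; `InvalidClusterStateFault` and `delete_with_retry` are hypothetical stand-ins, not the operator's actual implementation or the real boto3 API surface:

   ```python
   import time

   # Hedged sketch: retry a delete that raises while another cluster
   # operation is still in progress, up to `retry_attempts` times.

   class InvalidClusterStateFault(Exception):
       pass

   def delete_with_retry(delete_fn, retry=False, retry_attempts=10, poll_interval=0.0):
       attempts = retry_attempts if retry else 1
       for attempt in range(attempts):
           try:
               return delete_fn()
           except InvalidClusterStateFault:
               if attempt == attempts - 1:
                   raise  # exhausted all attempts, propagate the error
               time.sleep(poll_interval)

   calls = {"n": 0}
   def flaky_delete():
       # Fails twice (cluster busy), then succeeds.
       calls["n"] += 1
       if calls["n"] < 3:
           raise InvalidClusterStateFault("operation in progress")
       return "deleted"

   print(delete_with_retry(flaky_delete, retry=True, retry_attempts=5))  # deleted
   ```

   With `retry=False` the loop runs once, preserving the pre-existing fail-fast behavior.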






[GitHub] [airflow] ajaykarthick27 commented on issue #27890: SFTP Sensor is not working with File Pattern Parameter

2022-11-24 Thread GitBox


ajaykarthick27 commented on issue #27890:
URL: https://github.com/apache/airflow/issues/27890#issuecomment-1326810050

   Yes, I did not notice.  I will close this issue.





[GitHub] [airflow] ajaykarthick27 closed issue #27890: SFTP Sensor is not working with File Pattern Parameter

2022-11-24 Thread GitBox


ajaykarthick27 closed issue #27890: SFTP Sensor is not working with File 
Pattern Parameter
URL: https://github.com/apache/airflow/issues/27890





[GitHub] [airflow] eladkal commented on issue #27890: SFTP Sensor is not working with File Pattern Parameter

2022-11-24 Thread GitBox


eladkal commented on issue #27890:
URL: https://github.com/apache/airflow/issues/27890#issuecomment-1326807715

   Duplicate of https://github.com/apache/airflow/issues/27418 ?





[GitHub] [airflow] vandonr-amz commented on a diff in pull request #27786: Add operators + sensor for aws sagemaker pipelines

2022-11-24 Thread GitBox


vandonr-amz commented on code in PR #27786:
URL: https://github.com/apache/airflow/pull/27786#discussion_r1031797200


##
airflow/providers/amazon/aws/hooks/sagemaker.py:
##
@@ -647,28 +649,28 @@ def describe_endpoint(self, name: str) -> dict:
 
 def check_status(
 self,
-job_name: str,
+resource_name: str,

Review Comment:
   ah, that's a good point... I can keep it as `job_name` to avoid that; there is 
no strong need to rename it.
   I can also add a small comment noting that it can be used to check more than 
jobs.
   
   tbh I think this method should have been private, since it's mostly a helper 
only used in `wait_for_completion` cases, but now it's too late to change that.






[GitHub] [airflow] xlanor commented on pull request #27898: fix: current_state method on TaskInstance doesn't filter by map_index

2022-11-24 Thread GitBox


xlanor commented on PR #27898:
URL: https://github.com/apache/airflow/pull/27898#issuecomment-1326802435

   Currently I'm still trying to figure out how to run the unit tests, as I'm 
fairly new to this code base, so I'm opening a PR first to run CI.





[GitHub] [airflow] boring-cyborg[bot] commented on pull request #27898: fix: current_state method on TaskInstance doesn't filter by map_index

2022-11-24 Thread GitBox


boring-cyborg[bot] commented on PR #27898:
URL: https://github.com/apache/airflow/pull/27898#issuecomment-1326802257

   Congratulations on your first Pull Request and welcome to the Apache Airflow 
community! If you have any issues or are unsure about anything, please check 
our Contribution Guide 
(https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst)
   Here are some useful points:
   - Pay attention to the quality of your code (flake8, mypy and type 
annotations). Our [pre-commits]( 
https://github.com/apache/airflow/blob/main/STATIC_CODE_CHECKS.rst#prerequisites-for-pre-commit-hooks)
 will help you with that.
   - In case of a new feature add useful documentation (in docstrings or in 
`docs/` directory). Adding a new operator? Check this short 
[guide](https://github.com/apache/airflow/blob/main/docs/apache-airflow/howto/custom-operator.rst)
 Consider adding an example DAG that shows how users should use it.
   - Consider using [Breeze 
environment](https://github.com/apache/airflow/blob/main/BREEZE.rst) for 
testing locally, it's a heavy docker but it ships with a working Airflow and a 
lot of integrations.
   - Be patient and persistent. It might take some time to get a review or get 
the final approval from Committers.
   - Please follow [ASF Code of 
Conduct](https://www.apache.org/foundation/policies/conduct) for all 
communication including (but not limited to) comments on Pull Requests, Mailing 
list and Slack.
   - Be sure to read the [Airflow Coding style]( 
https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst#coding-style-and-best-practices).
   Apache Airflow is a community-driven project and together we are making it 
better 🚀.
   In case of doubts contact the developers at:
   Mailing List: d...@airflow.apache.org
   Slack: https://s.apache.org/airflow-slack
   





[GitHub] [airflow] xlanor opened a new pull request, #27898: fix: current_state method on TaskInstance doesn't filter by map_index

2022-11-24 Thread GitBox


xlanor opened a new pull request, #27898:
URL: https://github.com/apache/airflow/pull/27898

   Signed-off-by: xlanor 
   
   Fixes #27864 
   ---
   
   The `current_state` method on `TaskInstance` doesn't filter by `map_index`, so 
calling it on a mapped task instance fails.
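   A minimal, self-contained illustration of the ambiguity being fixed; plain dicts stand in for `task_instance` rows, and `current_state_buggy`/`current_state_fixed` are sketches, not Airflow's actual query code:

   ```python
   # For a mapped task, several rows share (dag_id, task_id, run_id) and
   # differ only in map_index, so a lookup that ignores map_index matches
   # more than one row.

   rows = [
       {"dag_id": "d", "task_id": "t", "run_id": "r", "map_index": 0, "state": "success"},
       {"dag_id": "d", "task_id": "t", "run_id": "r", "map_index": 1, "state": "failed"},
   ]

   def current_state_buggy(dag_id, task_id, run_id):
       matches = [
           r for r in rows
           if (r["dag_id"], r["task_id"], r["run_id"]) == (dag_id, task_id, run_id)
       ]
       # Multiple rows match: the result is ambiguous for mapped tasks.
       return [r["state"] for r in matches]

   def current_state_fixed(dag_id, task_id, run_id, map_index=-1):
       # Including map_index in the filter pins down exactly one row
       # (-1 is the conventional index for unmapped task instances).
       matches = [
           r for r in rows
           if (r["dag_id"], r["task_id"], r["run_id"], r["map_index"])
           == (dag_id, task_id, run_id, map_index)
       ]
       return matches[0]["state"] if matches else None

   print(current_state_buggy("d", "t", "r"))     # ['success', 'failed']
   print(current_state_fixed("d", "t", "r", 1))  # failed
   ```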
   
   Read the **[Pull Request 
Guidelines](https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst#pull-request-guidelines)**
 for more information.
   In case of fundamental code changes, an Airflow Improvement Proposal 
([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvement+Proposals))
 is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party 
License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in a 
newsfragment file, named `{pr_number}.significant.rst` or 
`{issue_number}.significant.rst`, in 
[newsfragments](https://github.com/apache/airflow/tree/main/newsfragments).
   





[GitHub] [airflow] syedahsn closed pull request #27873: EMR Notebook Execution Sensor

2022-11-24 Thread GitBox


syedahsn closed pull request #27873: EMR Notebook Execution Sensor
URL: https://github.com/apache/airflow/pull/27873





[GitHub] [airflow] aaronabraham311 commented on pull request #27805: Automatically save and allow restore of recent DAG run configs

2022-11-24 Thread GitBox


aaronabraham311 commented on PR #27805:
URL: https://github.com/apache/airflow/pull/27805#issuecomment-1326792202

   @pierrejeambrun Oh that's great! Is there any example on how to access the 
DagRun configs from the db? Or is there a code snippet that we can use as an 
example?





[GitHub] [airflow] pierrejeambrun commented on pull request #27805: Automatically save and allow restore of recent DAG run configs

2022-11-24 Thread GitBox


pierrejeambrun commented on PR #27805:
URL: https://github.com/apache/airflow/pull/27805#issuecomment-1326777609

   Now that you mention it, DagRun confs are already stored and available for 
each run. Isn't it easier to just retrieve them and provide them to the 
`trigger.html` template directly? We would also have a lot of control over which 
confs should be retrieved (the 5 most recent confs for a specific dag, etc.).
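   The suggestion above could look roughly like this; plain dicts stand in for stored DagRun records, and `recent_confs` is a sketch (the real code would be a SQLAlchemy query over the DagRun model ordered by execution date):

   ```python
   from operator import itemgetter

   # Stand-in for persisted DagRun rows; only dag_id, execution_date,
   # and conf matter for this sketch.
   dag_runs = [
       {"dag_id": "etl", "execution_date": "2022-11-22", "conf": {"x": 1}},
       {"dag_id": "etl", "execution_date": "2022-11-24", "conf": {"x": 3}},
       {"dag_id": "etl", "execution_date": "2022-11-23", "conf": {"x": 2}},
       {"dag_id": "other", "execution_date": "2022-11-24", "conf": {"y": 9}},
   ]

   def recent_confs(dag_id, limit=5):
       """Return the confs of the `limit` most recent runs of `dag_id`."""
       runs = [r for r in dag_runs if r["dag_id"] == dag_id and r["conf"]]
       runs.sort(key=itemgetter("execution_date"), reverse=True)
       return [r["conf"] for r in runs[:limit]]

   print(recent_confs("etl", limit=2))  # [{'x': 3}, {'x': 2}]
   ```

   The template would then render these confs as selectable presets, with no extra persistence layer needed.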





[GitHub] [airflow] george-zubrienko commented on issue #27838: apache-airflow-providers-common-sql==1.3.0 breaks BigQuery operators

2022-11-24 Thread GitBox


george-zubrienko commented on issue #27838:
URL: https://github.com/apache/airflow/issues/27838#issuecomment-1326771187

   @potiuk I usually pin versions (`==1.2.3`) of providers that ship a lot of 
dependencies, to what is shown in the official image by running `pip show 
` and only upgrade if it was upgraded in the next release. Also, we 
don't upgrade to every release right away, so the snippets I posted were for 
2.4.1 version where we did some dependency shuffling (no version bumps, simple 
`poetry update`) and then I saw errors popping on test env after new image was 
deployed.
   
   Reason we use poetry is to resolve potential incompatibilities between our 
own libraries and airflow dependencies. For some providers - like datadog in 
the example above - it is more or less safe to only lock major and minor with 
`~`, but for google after running into problems with the protobuf library 
upgrade, I learned it is safer to pin dependencies.





[airflow] branch constraints-2-5 updated: Updating constraints. Build id:

2022-11-24 Thread github-bot
This is an automated email from the ASF dual-hosted git repository.

github-bot pushed a commit to branch constraints-2-5
in repository https://gitbox.apache.org/repos/asf/airflow.git


The following commit(s) were added to refs/heads/constraints-2-5 by this push:
 new d6aeef755f Updating constraints. Build id:
d6aeef755f is described below

commit d6aeef755fb797f01728fe3fb552c22dade5e087
Author: Automated GitHub Actions commit 
AuthorDate: Thu Nov 24 18:46:58 2022 +

Updating constraints. Build id:

This update in constraints is automatically committed by the CI 
'constraints-push' step based on
HEAD of '' in ''
with commit sha .

All tests passed in this build so we determined we can push the updated 
constraints.

See 
https://github.com/apache/airflow/blob/main/README.md#installing-from-pypi for 
details.
---
 constraints-3.10.txt  | 28 ++--
 constraints-3.7.txt   | 20 ++--
 constraints-3.8.txt   | 28 ++--
 constraints-3.9.txt   | 28 ++--
 constraints-no-providers-3.10.txt |  6 +++---
 constraints-no-providers-3.7.txt  |  6 +++---
 constraints-no-providers-3.8.txt  |  8 
 constraints-no-providers-3.9.txt  |  8 
 constraints-source-providers-3.10.txt | 28 ++--
 constraints-source-providers-3.7.txt  | 20 ++--
 constraints-source-providers-3.8.txt  | 28 ++--
 constraints-source-providers-3.9.txt  | 28 ++--
 12 files changed, 118 insertions(+), 118 deletions(-)

diff --git a/constraints-3.10.txt b/constraints-3.10.txt
index 4bea7fd639..61f49127f9 100644
--- a/constraints-3.10.txt
+++ b/constraints-3.10.txt
@@ -1,6 +1,6 @@
 #
-# This constraints file was automatically generated on 2022-11-23T11:28:48Z
-# via "eager-upgrade" mechanism of PIP. For the "main" branch of Airflow.
+# This constraints file was automatically generated on 2022-11-24T18:46:37Z
+# via "eager-upgrade" mechanism of PIP. For the "v2-5-test" branch of Airflow.
 # This variant of constraints install uses the HEAD of the branch version for 
'apache-airflow' but installs
 # the providers from PIP-released packages at the moment of the constraint 
generation.
 #
@@ -172,9 +172,9 @@ billiard==3.6.4.0
 black==22.10.0
 bleach==5.0.1
 blinker==1.5
-boto3==1.26.15
+boto3==1.26.16
 boto==2.49.0
-botocore==1.29.15
+botocore==1.29.16
 bowler==0.9.0
 cachelib==0.9.0
 cachetools==4.2.2
@@ -239,7 +239,7 @@ fastjsonschema==2.16.2
 filelock==3.8.0
 fissix==21.11.13
 flake8-colors==0.1.9
-flake8==5.0.4
+flake8==6.0.0
 flake8_implicit_str_concat==0.3.0
 flaky==3.7.0
 flower==1.2.0
@@ -253,7 +253,7 @@ gcloud-aio-storage==7.0.1
 gcsfs==2022.11.0
 geomet==0.2.1.post1
 gevent==22.10.2
-gitdb==4.0.9
+gitdb==4.0.10
 google-ads==18.0.0
 google-api-core==2.8.2
 google-api-python-client==1.12.11
@@ -322,7 +322,7 @@ identify==2.5.9
 idna==3.4
 ijson==3.1.4
 imagesize==1.4.1
-importlib-metadata==5.0.0
+importlib-metadata==5.1.0
 incremental==22.10.0
 inflection==0.5.1
 influxdb-client==1.34.0
@@ -447,7 +447,7 @@ pyOpenSSL==22.0.0
 pyarrow==9.0.0
 pyasn1-modules==0.2.8
 pyasn1==0.4.8
-pycodestyle==2.9.1
+pycodestyle==2.10.0
 pycountry==22.3.5
 pycparser==2.21
 pycryptodome==3.15.0
@@ -457,7 +457,7 @@ pydot==1.4.2
 pydruid==0.6.5
 pyenchant==3.2.2
 pyexasol==0.25.1
-pyflakes==2.5.0
+pyflakes==3.0.1
 pygraphviz==1.10
 pyhcl==0.4.4
 pykerberos==1.2.4
@@ -581,20 +581,20 @@ types-Deprecated==1.2.9
 types-Markdown==3.4.2.1
 types-PyMySQL==1.0.19.1
 types-PyYAML==6.0.12.2
-types-boto==2.49.18.2
+types-boto==2.49.18.3
 types-certifi==2021.10.8.3
 types-croniter==1.3.2
 types-cryptography==3.3.23.2
 types-docutils==0.19.1.1
 types-freezegun==1.1.10
 types-paramiko==2.12.0.1
-types-protobuf==3.20.4.5
+types-protobuf==3.20.4.6
 types-python-dateutil==2.8.19.4
 types-python-slugify==7.0.0.1
 types-pytz==2022.6.0.1
-types-redis==4.3.21.4
+types-redis==4.3.21.5
 types-requests==2.28.11.5
-types-setuptools==65.6.0.0
+types-setuptools==65.6.0.1
 types-tabulate==0.9.0.0
 types-termcolor==1.1.6
 types-toml==0.10.8.1
@@ -606,7 +606,7 @@ uamqp==1.6.3
 uc-micro-py==1.0.1
 unicodecsv==0.14.1
 uritemplate==3.0.1
-urllib3==1.26.12
+urllib3==1.26.13
 userpath==1.8.0
 vertica-python==1.1.1
 vine==5.0.0
diff --git a/constraints-3.7.txt b/constraints-3.7.txt
index f0b6e98172..a6d25e0c48 100644
--- a/constraints-3.7.txt
+++ b/constraints-3.7.txt
@@ -1,6 +1,6 @@
 #
-# This constraints file was automatically generated on 2022-11-23T11:29:26Z
-# via "eager-upgrade" mechanism of PIP. For the "main" branch of Airflow.
+# This constraints file was automatically generated on 2022-11-24T18:46:56Z
+# via "eager-upgrade" mechanism of PIP. For the "v2-5-test" branch of Airflow.
 # This variant of constraints install uses the HEAD of the branch version for 
'apache-airflow' but installs
 # the provi

[GitHub] [airflow] Taragolis commented on a diff in pull request #27786: Add operators + sensor for aws sagemaker pipelines

2022-11-24 Thread GitBox


Taragolis commented on code in PR #27786:
URL: https://github.com/apache/airflow/pull/27786#discussion_r1031770958


##
airflow/providers/amazon/aws/hooks/sagemaker.py:
##
@@ -647,28 +649,28 @@ def describe_endpoint(self, name: str) -> dict:
 
 def check_status(
 self,
-job_name: str,
+resource_name: str,

Review Comment:
   I'm really waiting for the Python 3.7 EOL date so we can use positional-only 
arguments
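   For reference, the `/` marker (Python 3.8+, hence waiting for 3.7's EOL) makes parameters positional-only, so a later rename could never break keyword callers. This is a sketch only, not the actual `SageMakerHook.check_status` signature:

   ```python
   # Everything before `/` is positional-only: callers cannot pass
   # job_name by keyword, so renaming it later is not a breaking change.
   def check_status(job_name, /, key, describe_function, check_interval=30):
       return f"checking {job_name} every {check_interval}s"

   # Positional call works as usual.
   print(check_status("foo-bar", key="spam", describe_function=len))

   # Keyword use of a positional-only parameter is rejected up front.
   try:
       check_status(job_name="foo-bar", key="spam", describe_function=len)
   except TypeError as err:
       print(err)  # TypeError naming job_name as positional-only
   ```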






[GitHub] [airflow] Taragolis commented on a diff in pull request #27786: Add operators + sensor for aws sagemaker pipelines

2022-11-24 Thread GitBox


Taragolis commented on code in PR #27786:
URL: https://github.com/apache/airflow/pull/27786#discussion_r1031764368


##
airflow/providers/amazon/aws/hooks/sagemaker.py:
##
@@ -647,28 +649,28 @@ def describe_endpoint(self, name: str) -> dict:
 
 def check_status(
 self,
-job_name: str,
+resource_name: str,

Review Comment:
   I'm just wondering whether this change could be classified as a breaking 
change or not?
   There is a small chance that a user calls this public method in their code 
and passes the arguments as keywords:
   
   ```python
   SageMakerHook.check_status(
   job_name="foo-bar",
   key="spam",
   describe_function=some_callable,
   check_interval=42
   )
   ```
   And after this change they would get:
   `TypeError: got an unexpected keyword argument 'job_name'`






[GitHub] [airflow] ZhangCreations commented on issue #24988: sendgrid as a data source provider? #24815

2022-11-24 Thread GitBox


ZhangCreations commented on issue #24988:
URL: https://github.com/apache/airflow/issues/24988#issuecomment-1326760622

   Just to double check my understanding of the release cycle as outlined 
[here](https://github.com/apache/airflow#release-process-for-providers). If I 
implement this provider, does that require me to continue to maintain the 
versioning of this provider indefinitely?





[GitHub] [airflow] syedahsn commented on a diff in pull request #27893: AWSGlueJobHook updates job configuration if it exists

2022-11-24 Thread GitBox


syedahsn commented on code in PR #27893:
URL: https://github.com/apache/airflow/pull/27893#discussion_r1031764389


##
airflow/providers/amazon/aws/hooks/glue.py:
##
@@ -92,10 +93,51 @@ def __init__(
 kwargs["client_type"] = "glue"
 super().__init__(*args, **kwargs)
 
+def create_glue_job_config(self) -> dict:
+if self.s3_bucket is None:
+raise AirflowException("Could not initialize glue job, error: 
Specify Parameter `s3_bucket`")
+
+default_command = {
+"Name": "glueetl",
+"ScriptLocation": self.script_location,
+}
+command = self.create_job_kwargs.pop("Command", default_command)
+
+s3_log_path = 
f"s3://{self.s3_bucket}/{self.s3_glue_logs}{self.job_name}"
+execution_role = self.get_iam_execution_role()
+
+if "WorkerType" in self.create_job_kwargs and "NumberOfWorkers" in 
self.create_job_kwargs:
+return dict(

Review Comment:
   This part here can be refactored to be a bit more concise. Rather than have 
two return statements returning very similar dictionaries, something like this 
would be cleaner:
   ```
   ret_config = {
   "Name": self.job_name,
   "Description": self.desc,
   "LogUri": s3_log_path,
   "Role": execution_role["Role"]["Arn"],
   "ExecutionProperty": {"MaxConcurrentRuns": self.concurrent_run_limit},
   "Command": command,
   "MaxRetries": self.retry_limit,
   **self.create_job_kwargs,
   }
   
   if "WorkerType" in self.create_job_kwargs and "NumberOfWorkers" in 
self.create_job_kwargs:
   ret_config["MaxCapacity"] = self.num_of_dpus
   
   return ret_config
   ```
   
   Also, it's [generally 
preferable](https://stackoverflow.com/questions/2853683/what-is-the-preferred-syntax-for-initializing-a-dict-curly-brace-literals-or/2853738#2853738)
 to use {} rather than the `dict()` function
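   A quick check of the merge-order behavior the refactor relies on: keys unpacked from the kwargs dict at the end of the literal override the defaults listed before them. The dict contents here are illustrative, not the real Glue job fields:

   ```python
   # In {**defaults, **overrides}, later entries win, so user-supplied
   # create_job_kwargs override the defaults spelled out earlier.
   defaults = {"Name": "job", "MaxRetries": 0}
   create_job_kwargs = {"MaxRetries": 3, "GlueVersion": "3.0"}

   config = {**defaults, **create_job_kwargs}
   print(config)  # {'Name': 'job', 'MaxRetries': 3, 'GlueVersion': '3.0'}
   ```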






[airflow] branch constraints-main updated: Updating constraints. Build id:

2022-11-24 Thread github-bot
This is an automated email from the ASF dual-hosted git repository.

github-bot pushed a commit to branch constraints-main
in repository https://gitbox.apache.org/repos/asf/airflow.git


The following commit(s) were added to refs/heads/constraints-main by this push:
 new 61e50acc6a Updating constraints. Build id:
61e50acc6a is described below

commit 61e50acc6ac4ab9ed867024e01403171060b6827
Author: Automated GitHub Actions commit 
AuthorDate: Thu Nov 24 18:23:24 2022 +

Updating constraints. Build id:

This update in constraints is automatically committed by the CI 
'constraints-push' step based on
HEAD of '' in ''
with commit sha .

All tests passed in this build so we determined we can push the updated 
constraints.

See 
https://github.com/apache/airflow/blob/main/README.md#installing-from-pypi for 
details.
---
 constraints-3.10.txt  | 8 
 constraints-3.7.txt   | 4 ++--
 constraints-3.8.txt   | 8 
 constraints-3.9.txt   | 8 
 constraints-no-providers-3.10.txt | 2 +-
 constraints-no-providers-3.7.txt  | 2 +-
 constraints-no-providers-3.8.txt  | 4 ++--
 constraints-no-providers-3.9.txt  | 4 ++--
 constraints-source-providers-3.10.txt | 8 
 constraints-source-providers-3.7.txt  | 4 ++--
 constraints-source-providers-3.8.txt  | 8 
 constraints-source-providers-3.9.txt  | 8 
 12 files changed, 34 insertions(+), 34 deletions(-)

diff --git a/constraints-3.10.txt b/constraints-3.10.txt
index 9cba609a82..ea2c4d025d 100644
--- a/constraints-3.10.txt
+++ b/constraints-3.10.txt
@@ -1,5 +1,5 @@
 #
-# This constraints file was automatically generated on 2022-11-24T00:13:51Z
+# This constraints file was automatically generated on 2022-11-24T18:22:43Z
 # via "eager-upgrade" mechanism of PIP. For the "main" branch of Airflow.
 # This variant of constraints install uses the HEAD of the branch version for 
'apache-airflow' but installs
 # the providers from PIP-released packages at the moment of the constraint 
generation.
@@ -253,7 +253,7 @@ gcloud-aio-storage==7.0.1
 gcsfs==2022.11.0
 geomet==0.2.1.post1
 gevent==22.10.2
-gitdb==4.0.9
+gitdb==4.0.10
 google-ads==18.0.0
 google-api-core==2.8.2
 google-api-python-client==1.12.11
@@ -322,7 +322,7 @@ identify==2.5.9
 idna==3.4
 ijson==3.1.4
 imagesize==1.4.1
-importlib-metadata==5.0.0
+importlib-metadata==5.1.0
 incremental==22.10.0
 inflection==0.5.1
 influxdb-client==1.34.0
@@ -457,7 +457,7 @@ pydot==1.4.2
 pydruid==0.6.5
 pyenchant==3.2.2
 pyexasol==0.25.1
-pyflakes==3.0.0
+pyflakes==3.0.1
 pygraphviz==1.10
 pyhcl==0.4.4
 pykerberos==1.2.4
diff --git a/constraints-3.7.txt b/constraints-3.7.txt
index 6953da5a27..d6a519d406 100644
--- a/constraints-3.7.txt
+++ b/constraints-3.7.txt
@@ -1,5 +1,5 @@
 #
-# This constraints file was automatically generated on 2022-11-24T00:14:29Z
+# This constraints file was automatically generated on 2022-11-24T18:23:21Z
 # via "eager-upgrade" mechanism of PIP. For the "main" branch of Airflow.
 # This variant of constraints install uses the HEAD of the branch version for 
'apache-airflow' but installs
 # the providers from PIP-released packages at the moment of the constraint 
generation.
@@ -253,7 +253,7 @@ gcloud-aio-storage==7.0.1
 gcsfs==2022.11.0
 geomet==0.2.1.post1
 gevent==22.10.2
-gitdb==4.0.9
+gitdb==4.0.10
 google-ads==18.0.0
 google-api-core==2.8.2
 google-api-python-client==1.12.11
diff --git a/constraints-3.8.txt b/constraints-3.8.txt
index 8213266a0c..5d10586a80 100644
--- a/constraints-3.8.txt
+++ b/constraints-3.8.txt
@@ -1,5 +1,5 @@
 #
-# This constraints file was automatically generated on 2022-11-24T00:14:21Z
+# This constraints file was automatically generated on 2022-11-24T18:23:13Z
 # via "eager-upgrade" mechanism of PIP. For the "main" branch of Airflow.
 # This variant of constraints install uses the HEAD of the branch version for 
'apache-airflow' but installs
 # the providers from PIP-released packages at the moment of the constraint 
generation.
@@ -254,7 +254,7 @@ gcloud-aio-storage==7.0.1
 gcsfs==2022.11.0
 geomet==0.2.1.post1
 gevent==22.10.2
-gitdb==4.0.9
+gitdb==4.0.10
 google-ads==18.0.0
 google-api-core==2.8.2
 google-api-python-client==1.12.11
@@ -323,7 +323,7 @@ identify==2.5.9
 idna==3.4
 ijson==3.1.4
 imagesize==1.4.1
-importlib-metadata==5.0.0
+importlib-metadata==5.1.0
 importlib-resources==5.10.0
 incremental==22.10.0
 inflection==0.5.1
@@ -460,7 +460,7 @@ pydot==1.4.2
 pydruid==0.6.5
 pyenchant==3.2.2
 pyexasol==0.25.1
-pyflakes==3.0.0
+pyflakes==3.0.1
 pygraphviz==1.10
 pyhcl==0.4.4
 pykerberos==1.2.4
diff --git a/constraints-3.9.txt b/constraints-3.9.txt
index 05309e3648..3fe64b8aad 100644
--- a/constraints-3.9.txt
+++ b/constraints-3.9.txt
@@ -1,5 +1,5 @@
 #
-# This constraints file was automatically generated on 2022-11-24T00:14:18Z
+# This constraints file was automatically generated on 2022-11-24T18:23:10Z
 # via "eager-up

[GitHub] [airflow] alexott commented on issue #27894: Status of testing Providers that were prepared on November 24, 2022

2022-11-24 Thread GitBox


alexott commented on issue #27894:
URL: https://github.com/apache/airflow/issues/27894#issuecomment-1326755638

   Opened #27897 - tested all file formats





[GitHub] [airflow] alexott opened a new pull request, #27897: Additional fix for writing output in DatabricksSqlOperator

2022-11-24 Thread GitBox


alexott opened a new pull request, #27897:
URL: https://github.com/apache/airflow/pull/27897

   This PR adds fix for writing results in DatabricksSqlOperator 





[GitHub] [airflow] ephraimbuddy commented on a diff in pull request #27895: Sync v2-5-stable with v2-5-test to release 2.5.0

2022-11-24 Thread GitBox


ephraimbuddy commented on code in PR #27895:
URL: https://github.com/apache/airflow/pull/27895#discussion_r1031752268


##
RELEASE_NOTES.rst:
##
@@ -21,6 +21,279 @@
 
 .. towncrier release notes start
 
+Airflow 2.5.0 (2022-11-28)
+--
+
+Significant Changes
+^^^
+
+- ``airflow dags test`` no longer performs a backfill job.
+
+  In order to make ``airflow dags test`` more useful as a testing and 
debugging tool, we no
+  longer run a backfill job and instead run a "local task runner". Users can 
still backfill
+  their DAGs using the ``airflow dags backfill`` command. (#26400)
+- Airflow config section ``kubernetes`` renamed to ``kubernetes_executor``
+
+  KubernetesPodOperator no longer considers any core kubernetes config params, 
so this section now only applies to kubernetes executor. Renaming it reduces 
potential for confusion. (#26873)
+- ``ExternalTaskSensor`` no longer hangs indefinitely when ``failed_states`` 
is set, an ``execute_date_fn`` is used, and some but not all of the dependent 
tasks fail. Instead, an ``AirflowException`` is thrown as soon as any of the 
dependent tasks fail.
+
+  Any code handling this failure in addition to timeouts should move to 
catching the ``AirflowException`` base class and not only the 
``AirflowSensorTimeout`` subclass. (#27190)
+
+New Features
+
+- ``TaskRunner``: notify of component start and finish (#27855)
+- Add DagRun state change to the Listener plugin system(#27113)
+- Metric for raw task return codes (#27155)
+- Add logic for XComArg to pull specific map indexes (#27771)
+- Clear TaskGroup (#26658)
+- Add critical section query duration metric (#27700)
+- Add: #23880 :: Audit log for ``AirflowModelViews(Variables/Connection)`` 
(#24079)
+- Add postgres 15 support (#27444)
+- Expand tasks in mapped group at run time (#27491)
+- reset commits, clean submodules (#27560)
+- scheduler_job, add metric for scheduler loop timer (#27605)
+- Allow datasets to be used in taskflow (#27540)
+- Add expanded_ti_count to ti context (#27680)
+- Add user comment to task instance and dag run (#26457, #27849, #27867)
+- Enable copying DagRun JSON to clipboard (#27639)
+- Implement extra controls for SLAs (#27557)
+- add dag parsed time in DAG view (#27573)
+- Add max_wait for exponential_backoff in BaseSensor (#27597)
+- Expand tasks in mapped group at parse time (#27158)
+- Add disable retry flag on backfill (#23829)
+- Adding sensor decorator (#22562)
+- Api endpoint update ti (#26165)
+- Filtering datasets by recent update events (#26942)
+- Support Is /not Null filter for value is None on webui (#26584)
+- Add search to datasets list (#26893)
+- Split out and handle 'params' in mapped operator (#26100)
+- Add authoring API for TaskGroup mapping (#26844)
+- Add ``one_done`` trigger rule (#26146)
+- Create a more efficient airflow dag test command that also has better local 
logging (#26400)
+- Support add/remove permissions to roles commands (#26338)
+- Auto tail file logs in Web UI (#26169)
+- Add triggerer info to task instance in API (#26249)
+- Flag to deserialize value on custom XCom backend (#26343)
+
+Bug Fixes
+^
+- Redirect to home view when there are no valid tags in the URL (#25715)
+- Make MappedTaskGroup depend on its expand inputs (#27876)
+- Make DagRun state updates for paused DAGs faster (#27725)
+- Don't explicitly set include_examples to False on task run command (#27813)
+- Fix menu border color (#27789)
+- Fix backfill queued task getting reset to scheduled state. (#23720)
+- Fix clearing child dag mapped tasks from parent dag (#27501)
+- Handle json encoding of ``V1Pod`` in task callback (#27609)
+- Fix ExternalTaskSensor can't check zipped dag (#27056)
+- Avoid re-fetching DAG run in TriggerDagRunOperator (#27635)
+- Continue on exception when retrieving metadata (#27665)
+- Fix double logging with some task logging handler (#27591)
+- External task sensor fail fix (#27190)
+- Replace FAB url filtering function with Airflows (#27576)
+- Fix mini scheduler expansion of mapped task  (#27506)
+- Add the default None when pop actions (#27537)
+- Display parameter values from serialized dag in trigger dag view. (#27482)
+- Fix getting the dag/task ids from base executor (#27550)
+- Fix sqlalchemy primary key black-out error on DDRQ (#27538)
+- Move TriggerDagRun conf check to execute (#27035)
+- SLAMiss is nullable and not always given back when pulling task instances 
(#27423)
+- Fix behavior of ``_`` when searching for DAGs (#27448)
+- Add case insensitive constraint to username (#27266)
+- Fix python external template keys (#27256)
+- Resolve trigger assignment race condition (#27072)
+- Update google_analytics.html (#27226)
+- Fix IntegrityError during webserver startup (#27297)
+- reduce extraneous task log requests (#27233)
+- Make RotatingFilehandler used in DagProcessor non-caching (#27223)
+- set executor.job_id to BackfillJob.id for backfills (#27020)
+- Fix som

[GitHub] [airflow] vandonr-amz commented on pull request #27786: Add operators + sensor for aws sagemaker pipelines

2022-11-24 Thread GitBox


vandonr-amz commented on PR #27786:
URL: https://github.com/apache/airflow/pull/27786#issuecomment-1326741439

   @Taragolis @potiuk do you think you could review this PR?





[airflow] branch v2-5-test updated (0c2ee0ad95 -> 523df868dd)

2022-11-24 Thread ephraimanierobi
This is an automated email from the ASF dual-hosted git repository.

ephraimanierobi pushed a change to branch v2-5-test
in repository https://gitbox.apache.org/repos/asf/airflow.git


omit 0c2ee0ad95 Add release notes
omit 59d16b6765 Update version to 2.5.0
 add 82b37d3ce5 tests: always cleanup registered test listeners (#27896)
 add 43c0607590 Update version to 2.5.0
 add 523df868dd Add release notes

This update added new revisions after undoing existing revisions.
That is to say, some revisions that were in the old version of the
branch are not in the new version.  This situation occurs
when a user --force pushes a change and generates a repository
containing something like this:

 * -- * -- B -- O -- O -- O   (0c2ee0ad95)
\
 N -- N -- N   refs/heads/v2-5-test (523df868dd)

You should already have received notification emails for all of the O
revisions, and so the following emails describe only the N revisions
from the common base, B.

Any revisions marked "omit" are not gone; other references still
refer to them.  Any revisions marked "discard" are gone forever.

No new revisions were added by this update.

Summary of changes:
 README.md  |   2 +-
 RELEASE_NOTES.rst  | 121 +
 docs/spelling_wordlist.txt |   2 +
 tests/plugins/test_plugins_manager.py  |   1 +
 .../task/task_runner/test_standard_task_runner.py  |   2 +
 5 files changed, 32 insertions(+), 96 deletions(-)



[GitHub] [airflow] Taragolis commented on a diff in pull request #27893: AWSGlueJobHook updates job configuration if it exists

2022-11-24 Thread GitBox


Taragolis commented on code in PR #27893:
URL: https://github.com/apache/airflow/pull/27893#discussion_r1031748050


##
airflow/providers/amazon/aws/hooks/glue.py:
##
@@ -92,10 +93,51 @@ def __init__(
 kwargs["client_type"] = "glue"
 super().__init__(*args, **kwargs)
 
+def create_glue_job_config(self) -> dict:
+if self.s3_bucket is None:
+raise AirflowException("Could not initialize glue job, error: 
Specify Parameter `s3_bucket`")
+
+default_command = {
+"Name": "glueetl",
+"ScriptLocation": self.script_location,
+}
+command = self.create_job_kwargs.pop("Command", default_command)
+
+s3_log_path = 
f"s3://{self.s3_bucket}/{self.s3_glue_logs}{self.job_name}"
+execution_role = self.get_iam_execution_role()
+
+if "WorkerType" in self.create_job_kwargs and "NumberOfWorkers" in 
self.create_job_kwargs:
+return dict(
+Name=self.job_name,
+Description=self.desc,
+LogUri=s3_log_path,
+Role=execution_role["Role"]["Arn"],
+ExecutionProperty={"MaxConcurrentRuns": 
self.concurrent_run_limit},
+Command=command,
+MaxRetries=self.retry_limit,
+**self.create_job_kwargs,
+)
+else:
+return dict(
+Name=self.job_name,
+Description=self.desc,
+LogUri=s3_log_path,
+Role=execution_role["Role"]["Arn"],
+ExecutionProperty={"MaxConcurrentRuns": 
self.concurrent_run_limit},
+Command=command,
+MaxRetries=self.retry_limit,
+MaxCapacity=self.num_of_dpus,
+**self.create_job_kwargs,
+)
+
+@cached_property
+def glue_client(self):
+""":return: AWS Glue client"""
+return self.get_conn()

Review Comment:
   small nitpick: `AwsBaseHook` already has two ways to get a `boto3.client`:
   - AwsBaseHook.get_conn()
   - AwsBaseHook.conn
   
   And both of them are cached, so... it might be better not to create a third way?

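The caching point in the review can be sketched in isolation. The classes below are simplified stand-ins, not the real `AwsBaseHook`/`GlueJobHook` code; the names and caching details are assumptions for illustration only:

```python
from functools import cached_property


class BaseHook:
    """Simplified stand-in for a hook with a cached client accessor."""

    def __init__(self):
        self.create_calls = 0  # counts how often a client is actually built

    @cached_property
    def conn(self):
        self.create_calls += 1
        return object()  # stands in for a boto3 client

    def get_conn(self):
        # Route through the cached property so both access paths
        # share the same client instance.
        return self.conn


class GlueHook(BaseHook):
    # A third cached accessor (as in the reviewed diff) works, but it
    # duplicates a cache the base class already provides.
    @cached_property
    def glue_client(self):
        return self.get_conn()


hook = GlueHook()
assert hook.conn is hook.get_conn() is hook.glue_client
print(hook.create_calls)  # 1 -- the client was only created once
```

All three access paths resolve to the same cached object, which is why reusing `conn`/`get_conn()` directly avoids the redundant third cache.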





[GitHub] [airflow] potiuk commented on issue #27894: Status of testing Providers that were prepared on November 24, 2022

2022-11-24 Thread GitBox


potiuk commented on issue #27894:
URL: https://github.com/apache/airflow/issues/27894#issuecomment-1326737975

   > In the Databricks SQL operator (and I believe in others as well), the 
strategy was: always return only the last result - previous results were always 
discarded. The primary reasons for this were:
   > 
   > * When you have multiple SQL statements, the first ones usually create 
tables, insert data, etc., and only when you have a select as the last 
statement do you get results. This matches the logic of SQL's `BATCH` statement
   > * When you have multiple SQL statements, their results may have different 
schemas, but results will be processed only according to the latest schema, not 
the schemas of the corresponding result sets
   > 
   > We may need to think a bit about it - should we return results for each of 
the statements, or not? If yes, then we need to return pairs of description + 
results for each SQL statement, instead of using only the latest statement
   
   Yes - I noticed that too now. With two caveats:
   *  it depends on the operator what the default is (no problem)
   *  it behaves differently when an "sql" string is passed and return_last is 
true -> then instead of a one-element result array it returns the results 
directly
   
   It is surprisingly difficult to unwind the original convoluted behaviour :)





[GitHub] [airflow] syedahsn commented on issue #27894: Status of testing Providers that were prepared on November 24, 2022

2022-11-24 Thread GitBox


syedahsn commented on issue #27894:
URL: https://github.com/apache/airflow/issues/27894#issuecomment-1326737941

   #27276 works as expected. System tests using those operators are all passing.





[GitHub] [airflow] alexott commented on issue #27894: Status of testing Providers that were prepared on November 24, 2022

2022-11-24 Thread GitBox


alexott commented on issue #27894:
URL: https://github.com/apache/airflow/issues/27894#issuecomment-1326727196

   In the Databricks SQL operator (and I believe in others as well), the 
strategy was: always return only the last result - previous results were always 
discarded. The primary reasons for this were:
   
   * When you have multiple SQL statements, the first ones usually create 
tables, insert data, etc., and only when you have a select as the last 
statement do you get results. This matches the logic of SQL's `BATCH` statement
   * When you have multiple SQL statements, their results may have different 
schemas, but results will be processed only according to the latest schema, not 
the schemas of the corresponding result sets
   
   We may need to think a bit about it - should we return results for each of 
the statements, or not? If yes, then we need to return pairs of description + 
results for each SQL statement, instead of using only the latest statement

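The trade-off described above can be sketched as follows. Statement execution is faked and every name is illustrative; this is not the real `DbApiHook`/Databricks code:

```python
def fake_execute(sql):
    """Pretend to run one statement; only SELECTs produce a result set."""
    if sql.strip().lower().startswith("select"):
        return [("id",), ("v",)], [(1, "a"), (2, "b")]  # (description, rows)
    return None, []  # DDL/DML: no result set


def run_statements(statements, return_last=True):
    all_results = [fake_execute(sql) for sql in statements]
    if return_last:
        # Historic behaviour: earlier results are discarded, so a trailing
        # SELECT after CREATE/INSERT is exactly what the caller gets.
        return all_results[-1]
    # Alternative raised above: keep a (description, rows) pair per
    # statement, since each statement may have a different schema.
    return all_results


stmts = ["CREATE TABLE t (id INT)", "INSERT INTO t VALUES (1, 'a')", "SELECT * FROM t"]
description, rows = run_statements(stmts)
print(len(rows))                                      # 2 rows of the final SELECT
print(len(run_statements(stmts, return_last=False)))  # 3 pairs, one per statement
```

Returning per-statement pairs keeps each description next to its own rows, which is what makes mixed-schema batches safe to consume.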




[airflow] branch main updated: tests: always cleanup registered test listeners (#27896)

2022-11-24 Thread ephraimanierobi
This is an automated email from the ASF dual-hosted git repository.

ephraimanierobi pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/airflow.git


The following commit(s) were added to refs/heads/main by this push:
 new eba04d7c40 tests: always cleanup registered test listeners (#27896)
eba04d7c40 is described below

commit eba04d7c400c0d89492d75a7c81d21073933cd0c
Author: Maciej Obuchowski 
AuthorDate: Thu Nov 24 18:24:31 2022 +0100

tests: always cleanup registered test listeners (#27896)

Signed-off-by: Maciej Obuchowski 

Signed-off-by: Maciej Obuchowski 
---
 tests/plugins/test_plugins_manager.py   | 1 +
 tests/task/task_runner/test_standard_task_runner.py | 2 ++
 2 files changed, 3 insertions(+)

diff --git a/tests/plugins/test_plugins_manager.py 
b/tests/plugins/test_plugins_manager.py
index 9ed00cae05..9ae6f55b6b 100644
--- a/tests/plugins/test_plugins_manager.py
+++ b/tests/plugins/test_plugins_manager.py
@@ -65,6 +65,7 @@ class AirflowTestOnLoadExceptionPlugin(AirflowPlugin):
 
 @pytest.fixture(autouse=True, scope="module")
 def clean_plugins():
+get_listener_manager().clear()
 yield
 get_listener_manager().clear()
 
diff --git a/tests/task/task_runner/test_standard_task_runner.py 
b/tests/task/task_runner/test_standard_task_runner.py
index c54a27ae89..797462136a 100644
--- a/tests/task/task_runner/test_standard_task_runner.py
+++ b/tests/task/task_runner/test_standard_task_runner.py
@@ -72,6 +72,7 @@ class TestStandardTaskRunner:
 (as the test environment does not have enough context for the normal
 way to run) and ensures they reset back to normal on the way out.
 """
+get_listener_manager().clear()
 clear_db_runs()
 dictConfig(LOGGING_CONFIG)
 yield
@@ -79,6 +80,7 @@ class TestStandardTaskRunner:
 airflow_logger.handlers = []
 clear_db_runs()
 dictConfig(DEFAULT_LOGGING_CONFIG)
+get_listener_manager().clear()
 
 def test_start_and_terminate(self):
 local_task_job = mock.Mock()



[GitHub] [airflow] potiuk commented on issue #27894: Status of testing Providers that were prepared on November 24, 2022

2022-11-24 Thread GitBox


potiuk commented on issue #27894:
URL: https://github.com/apache/airflow/issues/27894#issuecomment-1326710383

   Ah I think I see where I made wrong assumption @alexott . looking at it





[GitHub] [airflow] ephraimbuddy merged pull request #27896: tests: always cleanup registered test listeners

2022-11-24 Thread GitBox


ephraimbuddy merged PR #27896:
URL: https://github.com/apache/airflow/pull/27896





[GitHub] [airflow] potiuk commented on issue #27894: Status of testing Providers that were prepared on November 24, 2022

2022-11-24 Thread GitBox


potiuk commented on issue #27894:
URL: https://github.com/apache/airflow/issues/27894#issuecomment-1326699948

   > wraps this list into another list. I think that logic should be changed a 
bit - right now we're collecting results for all SQL statements into a single 
list although they could have different schemas.
   
   Yeah, that part is a bit unclear about the intentions (or maybe I 
misunderstood it). Would be great if you open a PR indeed.





[GitHub] [airflow] vincbeck commented on pull request #27854: Fix errors in Databricks SQL operator introduced when refactoring

2022-11-24 Thread GitBox


vincbeck commented on PR #27854:
URL: https://github.com/apache/airflow/pull/27854#issuecomment-1326685898

   Sorry for the late review/reply! LGTM! Thanks @potiuk for the explanations 
as well





[GitHub] [airflow] jedcunningham commented on a diff in pull request #27895: Sync v2-5-stable with v2-5-test to release 2.5.0

2022-11-24 Thread GitBox


jedcunningham commented on code in PR #27895:
URL: https://github.com/apache/airflow/pull/27895#discussion_r1031692178


##
RELEASE_NOTES.rst:
##
@@ -21,6 +21,279 @@
 
 .. towncrier release notes start
 
+Airflow 2.5.0 (2022-11-28)

Review Comment:
   ```suggestion
   Airflow 2.5.0 (2022-11-30)
   ```



##
RELEASE_NOTES.rst:
##
@@ -21,6 +21,279 @@
 
 .. towncrier release notes start
 
+Airflow 2.5.0 (2022-11-28)
+--
+
+Significant Changes
+^^^
+
+- ``airflow dags test`` no longer performs a backfill job.
+
+  In order to make ``airflow dags test`` more useful as a testing and 
debugging tool, we no
+  longer run a backfill job and instead run a "local task runner". Users can 
still backfill
+  their DAGs using the ``airflow dags backfill`` command. (#26400)
+- Airflow config section ``kubernetes`` renamed to ``kubernetes_executor``

Review Comment:
   ```suggestion
   Airflow config section ``kubernetes`` renamed to ``kubernetes_executor`` 
(#26873)
   
"
   ```



##
RELEASE_NOTES.rst:
##
@@ -21,6 +21,279 @@
 
 .. towncrier release notes start
 
+Airflow 2.5.0 (2022-11-28)
+--
+
+Significant Changes
+^^^
+
+- ``airflow dags test`` no longer performs a backfill job.
+
+  In order to make ``airflow dags test`` more useful as a testing and 
debugging tool, we no
+  longer run a backfill job and instead run a "local task runner". Users can 
still backfill
+  their DAGs using the ``airflow dags backfill`` command. (#26400)

Review Comment:
   ```suggestion
 their DAGs using the ``airflow dags backfill`` command.
   ```



##
RELEASE_NOTES.rst:
##
@@ -21,6 +21,279 @@
 
 .. towncrier release notes start
 
+Airflow 2.5.0 (2022-11-28)
+--
+
+Significant Changes
+^^^
+
+- ``airflow dags test`` no longer performs a backfill job.
+
+  In order to make ``airflow dags test`` more useful as a testing and 
debugging tool, we no
+  longer run a backfill job and instead run a "local task runner". Users can 
still backfill
+  their DAGs using the ``airflow dags backfill`` command. (#26400)
+- Airflow config section ``kubernetes`` renamed to ``kubernetes_executor``
+
+  KubernetesPodOperator no longer considers any core kubernetes config params, 
so this section now only applies to kubernetes executor. Renaming it reduces 
potential for confusion. (#26873)
+- ``ExternalTaskSensor`` no longer hangs indefinitely when ``failed_states`` 
is set, an ``execution_date_fn`` is used, and some but not all of the dependent 
tasks fail. Instead, an ``AirflowException`` is thrown as soon as any of the 
dependent tasks fail.
+
+  Any code handling this failure in addition to timeouts should move to 
catching the ``AirflowException`` baseclass and not only the 
``AirflowSensorTimeout`` subclass. (#27190)

Review Comment:
   This will need a similar change as above, but we probably need a shorter 
title for this?



##
RELEASE_NOTES.rst:
##
@@ -21,6 +21,279 @@
 
 .. towncrier release notes start
 
+Airflow 2.5.0 (2022-11-28)
+--
+
+Significant Changes
+^^^
+
+- ``airflow dags test`` no longer performs a backfill job.

Review Comment:
   ```suggestion
   ``airflow dags test`` no longer performs a backfill job (#26400)
   
   ```



##
RELEASE_NOTES.rst:
##
@@ -21,6 +21,279 @@
 
 .. towncrier release notes start
 
+Airflow 2.5.0 (2022-11-28)
+--
+
+Significant Changes
+^^^
+
+- ``airflow dags test`` no longer performs a backfill job.
+
+  In order to make ``airflow dags test`` more useful as a testing and 
debugging tool, we no
+  longer run a backfill job and instead run a "local task runner". Users can 
still backfill
+  their DAGs using the ``airflow dags backfill`` command. (#26400)
+- Airflow config section ``kubernetes`` renamed to ``kubernetes_executor``
+
+  KubernetesPodOperator no longer considers any core kubernetes config params, 
so this section now only applies to kubernetes executor. Renaming it reduces 
potential for confusion. (#26873)
+- ``ExternalTaskSensor`` no longer hangs indefinitely when ``failed_states`` 
is set, an ``execution_date_fn`` is used, and some but not all of the dependent 
tasks fail. Instead, an ``AirflowException`` is thrown as soon as any of the 
dependent tasks fail.
+
+  Any code handling this failure in addition to timeouts should move to 
catching the ``AirflowException`` baseclass and not only the 
``AirflowSensorTimeout`` subclass. (#27190)
+
+New Features
+
+- ``TaskRunner``: notify of component start and finish (#27855)
+- Add DagRun state change to the Listener plugin system(#27113)
+- Metric for raw task return codes (#27155)
+- Add logic for XComAr

[GitHub] [airflow] alexott commented on issue #27894: Status of testing Providers that were prepared on November 24, 2022

2022-11-24 Thread GitBox


alexott commented on issue #27894:
URL: https://github.com/apache/airflow/issues/27894#issuecomment-1326659796

   #27868 works, #27854 unfortunately does not - results are double-wrapped:
   
   For example for `select * from default.my_***_table, parameters: None` we 
get: `scalar_results=True`, `results=[Row(id=1, v='test 1'), Row(id=2, v='test 
2')]`. And then code:
   
   ```
   if scalar_results:
   list_results: list[Any] = [results]
   else:
   list_results = results
   ```
   
   wraps this list into another list. I think that logic should be changed a 
bit - right now we're collecting results for all SQL statements into a single 
list although they could have different schemas.
   
   Let me debug it and open another PR
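   
   The double wrapping can be reproduced in isolation. The flag name mirrors the snippet above; everything else here is assumed for illustration and is not the actual operator code:

```python
def wrap(results, scalar_results):
    if scalar_results:
        # Intended to normalize a single statement's result into a list.
        # If `results` is already a list of rows (as in the report above),
        # this adds a spurious outer list.
        return [results]
    return results


rows = [(1, "test 1"), (2, "test 2")]  # rows from one SELECT

print(wrap(rows, scalar_results=True))   # [[(1, 'test 1'), (2, 'test 2')]]
print(wrap(rows, scalar_results=False))  # [(1, 'test 1'), (2, 'test 2')]
```

The fix therefore needs either a correct `scalar_results` flag or a check that `results` is not already the list the caller expects.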





[GitHub] [airflow] o-nikolas commented on pull request #26962: Fix system test for Memorystore memcached

2022-11-24 Thread GitBox


o-nikolas commented on PR #26962:
URL: https://github.com/apache/airflow/pull/26962#issuecomment-1326644068

   > @o-nikolas , hi,
   > 
   > Can we merge this PR please? Your comment has been addressed.
   
   All good on my end, but unfortunately I am not a committer and cannot merge 
changes





[GitHub] [airflow] mobuchowski opened a new pull request, #27896: tests: always cleanup registered test listeners

2022-11-24 Thread GitBox


mobuchowski opened a new pull request, #27896:
URL: https://github.com/apache/airflow/pull/27896

   Listeners get registered in tests that run in various orders. 
   
   Every place that registers listeners should clean up defensively, both 
before and after registering new listeners.
   
   Signed-off-by: Maciej Obuchowski 
   
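The "clean up before and after" pattern can be sketched with a toy listener manager. `get_listener_manager` below is a stand-in, not Airflow's real plugin API, and a context manager replaces the `@pytest.fixture` decorator to keep the sketch self-contained:

```python
import contextlib


class ListenerManager:
    """Toy stand-in for Airflow's listener manager."""

    def __init__(self):
        self.listeners = []

    def add(self, listener):
        self.listeners.append(listener)

    def clear(self):
        self.listeners.clear()


_manager = ListenerManager()


def get_listener_manager():
    return _manager


@contextlib.contextmanager
def clean_listeners():
    # Clear *before* the test too: an earlier test may have leaked
    # listeners, so don't trust the starting state.
    get_listener_manager().clear()
    try:
        yield
    finally:
        # Clear after as well, so this test cannot leak into later ones.
        get_listener_manager().clear()


with clean_listeners():
    get_listener_manager().add(object())
print(get_listener_manager().listeners)  # [] -- cleaned up after the block
```

In a real suite the same body would live in an autouse pytest fixture, as in the committed diff.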





[GitHub] [airflow] potiuk commented on a diff in pull request #27895: Sync v2-5-stable with v2-5-test to release 2.5.0

2022-11-24 Thread GitBox


potiuk commented on code in PR #27895:
URL: https://github.com/apache/airflow/pull/27895#discussion_r1031664386


##
README.md:
##
@@ -86,7 +86,7 @@ Airflow is not a streaming solution, but it is often used to 
process real-time d
 
 Apache Airflow is tested with:
 
-| | Main version (dev)   | Stable version (2.4.2)  
 |
+| | Main version (dev)   | Stable version (2.5.0)  
 |

Review Comment:
   good catch






[GitHub] [airflow] raphaelauv commented on a diff in pull request #27895: Sync v2-5-stable with v2-5-test to release 2.5.0

2022-11-24 Thread GitBox


raphaelauv commented on code in PR #27895:
URL: https://github.com/apache/airflow/pull/27895#discussion_r1031649086


##
README.md:
##
@@ -86,7 +86,7 @@ Airflow is not a streaming solution, but it is often used to 
process real-time d
 
 Apache Airflow is tested with:
 
-| | Main version (dev)   | Stable version (2.4.2)  
 |
+| | Main version (dev)   | Stable version (2.5.0)  
 |

Review Comment:
   missing PostgreSQL 15 support for 2.5.0






[GitHub] [airflow] jh242 commented on pull request #27805: Automatically save and allow restore of recent DAG run configs

2022-11-24 Thread GitBox


jh242 commented on PR #27805:
URL: https://github.com/apache/airflow/pull/27805#issuecomment-1326606738

   I disagree with the db/API endpoint idea. I think it's too much overhead for 
a minor feature that seems intended to save some time for DAGs with small 
configs that the user didn't expect to run multiple times. I feel like the 
ability to copy/paste configurations from 
[here](https://github.com/apache/airflow/pull/27639) also overlaps this 
feature. My suggestion is that we can either save recent configs in session 
storage by DAG, or in local storage but limited to a certain number of total 
recent configs across all DAGs.
   
   Additionally, Aaron and I are on a bit of a short timeline and likely won't 
have time to implement a backend-supported version of this feature, but if 
that's the direction we really want to go, we can get started on it and see 
where it goes.





[GitHub] [airflow] potiuk commented on pull request #27895: Sync v2-5-stable with v2-5-test to release 2.5.0

2022-11-24 Thread GitBox


potiuk commented on PR #27895:
URL: https://github.com/apache/airflow/pull/27895#issuecomment-1326598444

   Very cool release - no dramatic changes, but a steady stream of improvements 
:muscle: 





[GitHub] [airflow] alexandermalyga commented on issue #27894: Status of testing Providers that were prepared on November 24, 2022

2022-11-24 Thread GitBox


alexandermalyga commented on issue #27894:
URL: https://github.com/apache/airflow/issues/27894#issuecomment-1326589981

   #27724 works as expected! Finally Trino inserts are 100% working





[GitHub] [airflow] potiuk commented on issue #27894: Status of testing Providers that were prepared on November 24, 2022

2022-11-24 Thread GitBox


potiuk commented on issue #27894:
URL: https://github.com/apache/airflow/issues/27894#issuecomment-1326587629

   Checked my changes. 
   
   cc: @alexott @kazanzhy -> would appreciate checking the Databricks SQL 
executor integration with the new common-sql provider.





[GitHub] [airflow] ephraimbuddy opened a new pull request, #27895: Sync v2-5-stable with v2-5-test to release 2.5.0

2022-11-24 Thread GitBox


ephraimbuddy opened a new pull request, #27895:
URL: https://github.com/apache/airflow/pull/27895

   Time for `2.5.0rc1`!





[airflow] 01/02: Update version to 2.5.0

2022-11-24 Thread ephraimanierobi
This is an automated email from the ASF dual-hosted git repository.

ephraimanierobi pushed a commit to branch v2-5-test
in repository https://gitbox.apache.org/repos/asf/airflow.git

commit 59d16b6765c0d7aee4efb9150ff30508863590b6
Author: Ephraim Anierobi 
AuthorDate: Thu Nov 24 15:12:33 2022 +0100

Update version to 2.5.0
---
 README.md  | 14 +++---
 airflow/utils/db.py|  1 +
 docs/apache-airflow/installation/supported-versions.rst|  2 +-
 docs/docker-stack/README.md| 10 +-
 .../docker-examples/extending/add-apt-packages/Dockerfile  |  2 +-
 .../extending/add-build-essential-extend/Dockerfile|  2 +-
 .../docker-examples/extending/add-providers/Dockerfile |  2 +-
 .../docker-examples/extending/add-pypi-packages/Dockerfile |  2 +-
 .../extending/add-requirement-packages/Dockerfile  |  2 +-
 .../docker-examples/extending/custom-providers/Dockerfile  |  2 +-
 .../docker-examples/extending/embedding-dags/Dockerfile|  2 +-
 .../extending/writable-directory/Dockerfile|  2 +-
 docs/docker-stack/entrypoint.rst   | 14 +++---
 scripts/ci/pre_commit/pre_commit_supported_versions.py |  2 +-
 setup.py   |  2 +-
 15 files changed, 31 insertions(+), 30 deletions(-)

diff --git a/README.md b/README.md
index a4f37d6dae..cf15510901 100644
--- a/README.md
+++ b/README.md
@@ -86,7 +86,7 @@ Airflow is not a streaming solution, but it is often used to 
process real-time d
 
 Apache Airflow is tested with:
 
-| | Main version (dev)   | Stable version (2.4.2)  
 |
+| | Main version (dev)   | Stable version (2.5.0)  
 |
 
|-|--|--|
 | Python  | 3.7, 3.8, 3.9, 3.10  | 3.7, 3.8, 3.9, 3.10 
 |
 | Platform| AMD64/ARM64(\*)  | AMD64/ARM64(\*) 
 |
@@ -158,15 +158,15 @@ them to the appropriate format and workflow that your 
tool requires.
 
 
 ```bash
-pip install 'apache-airflow==2.4.2' \
- --constraint 
"https://raw.githubusercontent.com/apache/airflow/constraints-2.4.2/constraints-3.7.txt";
+pip install 'apache-airflow==2.5.0' \
+ --constraint 
"https://raw.githubusercontent.com/apache/airflow/constraints-2.5.0/constraints-3.7.txt";
 ```
 
 2. Installing with extras (i.e., postgres, google)
 
 ```bash
-pip install 'apache-airflow[postgres,google]==2.4.2' \
- --constraint 
"https://raw.githubusercontent.com/apache/airflow/constraints-2.4.2/constraints-3.7.txt";
+pip install 'apache-airflow[postgres,google]==2.5.0' \
+ --constraint 
"https://raw.githubusercontent.com/apache/airflow/constraints-2.5.0/constraints-3.7.txt";
 ```
 
 For information on installing provider packages, check
@@ -271,7 +271,7 @@ Apache Airflow version life cycle:
 
 | Version   | Current Patch/Minor   | State | First Release   | Limited 
Support   | EOL/Terminated   |
 
|---|---|---|-|---|--|
-| 2 | 2.4.3 | Supported | Dec 17, 2020| TBD
   | TBD  |
+| 2 | 2.5.0 | Supported | Dec 17, 2020| TBD
   | TBD  |
 | 1.10  | 1.10.15   | EOL   | Aug 27, 2018| Dec 17, 
2020  | June 17, 2021|
 | 1.9   | 1.9.0 | EOL   | Jan 03, 2018| Aug 27, 
2018  | Aug 27, 2018 |
 | 1.8   | 1.8.2 | EOL   | Mar 19, 2017| Jan 03, 
2018  | Jan 03, 2018 |
@@ -301,7 +301,7 @@ They are based on the official release schedule of Python 
and Kubernetes, nicely
 2. The "oldest" supported version of Python/Kubernetes is the default one 
until we decide to switch to
later version. "Default" is only meaningful in terms of "smoke tests" in CI 
PRs, which are run using this
default version and the default reference image available. Currently 
`apache/airflow:latest`
-   and `apache/airflow:2.4.2` images are Python 3.7 images. This means that 
default reference image will
+   and `apache/airflow:2.5.0` images are Python 3.7 images. This means that 
default reference image will
become the default at the time when we start preparing for dropping 3.7 
support which is few months
before the end of life for Python 3.7.
 
diff --git a/airflow/utils/db.py b/airflow/utils/db.py
index b5ea63be00..00bf243e1d 100644
--- a/airflow/utils/db.py
+++ b/airflow/utils/db.py
@@ -75,6 +75,7 @@ REVISION_HEADS_MAP = {
 "2.4.1": "ecb43d2a1842",
 "2.4.2": "b0d31815b5a6",
 "2.4.3": "e07f49787c9d",
+"2.5.0": "1986afd32c1b",
 }
 
 
diff --git a/docs/apache-airflow/installation/supported-versions.rst 
b/docs/apache-airflow/installation/supported-ver

[airflow] 02/02: Add release notes

2022-11-24 Thread ephraimanierobi

ephraimanierobi pushed a commit to branch v2-5-test
in repository https://gitbox.apache.org/repos/asf/airflow.git

commit 0c2ee0ad958e5424be084793efbacf023dccd333
Author: Ephraim Anierobi 
AuthorDate: Thu Nov 24 16:14:49 2022 +0100

Add release notes
---
 RELEASE_NOTES.rst   | 273 
 newsfragments/26400.significant.rst |   5 -
 newsfragments/26873.significant.rst |   3 -
 newsfragments/27190.significant.rst |   3 -
 4 files changed, 273 insertions(+), 11 deletions(-)

diff --git a/RELEASE_NOTES.rst b/RELEASE_NOTES.rst
index fe2babef13..b94810118e 100644
--- a/RELEASE_NOTES.rst
+++ b/RELEASE_NOTES.rst
@@ -21,6 +21,279 @@
 
 .. towncrier release notes start
 
+Airflow 2.5.0 (2022-11-28)
+--
+
+Significant Changes
+^^^
+
+- ``airflow dags test`` no longer performs a backfill job.
+
+  In order to make ``airflow dags test`` more useful as a testing and 
debugging tool, we no
+  longer run a backfill job and instead run a "local task runner". Users can 
still backfill
+  their DAGs using the ``airflow dags backfill`` command. (#26400)
+- Airflow config section ``kubernetes`` renamed to ``kubernetes_executor``
+
+  KubernetesPodOperator no longer considers any core kubernetes config params, 
so this section now only applies to kubernetes executor. Renaming it reduces 
potential for confusion. (#26873)
+- ``ExternalTaskSensor`` no longer hangs indefinitely when ``failed_states`` is set, an ``execution_date_fn`` is used, and some but not all of the dependent tasks fail. Instead, an ``AirflowException`` is thrown as soon as any of the dependent tasks fail.
+
+  Any code handling this failure in addition to timeouts should move to catching the ``AirflowException`` base class and not only the ``AirflowSensorTimeout`` subclass. (#27190)
+
+New Features
+
+- ``TaskRunner``: notify of component start and finish (#27855)
+- Add DagRun state change to the Listener plugin system (#27113)
+- Metric for raw task return codes (#27155)
+- Add logic for XComArg to pull specific map indexes (#27771)
+- Clear TaskGroup (#26658)
+- Add critical section query duration metric (#27700)
+- Add: #23880 :: Audit log for ``AirflowModelViews(Variables/Connection)`` 
(#24079)
+- Add postgres 15 support (#27444)
+- Expand tasks in mapped group at run time (#27491)
+- reset commits, clean submodules (#27560)
+- scheduler_job, add metric for scheduler loop timer (#27605)
+- Allow datasets to be used in taskflow (#27540)
+- Add expanded_ti_count to ti context (#27680)
+- Add user comment to task instance and dag run (#26457, #27849, #27867)
+- Enable copying DagRun JSON to clipboard (#27639)
+- Implement extra controls for SLAs (#27557)
+- add dag parsed time in DAG view (#27573)
+- Add max_wait for exponential_backoff in BaseSensor (#27597)
+- Expand tasks in mapped group at parse time (#27158)
+- Add disable retry flag on backfill (#23829)
+- Adding sensor decorator (#22562)
+- Api endpoint update ti (#26165)
+- Filtering datasets by recent update events (#26942)
+- Support Is /not Null filter for value is None on webui (#26584)
+- Add search to datasets list (#26893)
+- Split out and handle 'params' in mapped operator (#26100)
+- Add authoring API for TaskGroup mapping (#26844)
+- Add ``one_done`` trigger rule (#26146)
+- Create a more efficient ``airflow dags test`` command that also has better local logging (#26400)
+- Support add/remove permissions to roles commands (#26338)
+- Auto tail file logs in Web UI (#26169)
+- Add triggerer info to task instance in API (#26249)
+- Flag to deserialize value on custom XCom backend (#26343)
+
+Bug Fixes
+^
+- Redirect to home view when there are no valid tags in the URL (#25715)
+- Make MappedTaskGroup depend on its expand inputs (#27876)
+- Make DagRun state updates for paused DAGs faster (#27725)
+- Don't explicitly set include_examples to False on task run command (#27813)
+- Fix menu border color (#27789)
+- Fix backfill queued task getting reset to scheduled state. (#23720)
+- Fix clearing child dag mapped tasks from parent dag (#27501)
+- Handle json encoding of ``V1Pod`` in task callback (#27609)
+- Fix ExternalTaskSensor can't check zipped dag (#27056)
+- Avoid re-fetching DAG run in TriggerDagRunOperator (#27635)
+- Continue on exception when retrieving metadata (#27665)
+- Fix double logging with some task logging handler (#27591)
+- External task sensor fail fix (#27190)
+- Replace FAB url filtering function with Airflows (#27576)
+- Fix mini scheduler expansion of mapped task  (#27506)
+- Add the default None when pop actions (#27537)
+- Display parameter values from serialized dag in trigger dag view. (#27482)
+- Fix getting the dag/task ids from base executor (#27550)
+- Fix sqlalchemy primary key black-out error on DDRQ (#27538)
+- Move TriggerDagRun conf check to execute (#27035)
+- SLAMiss is nullabl

[airflow] branch v2-5-test updated (cc18921381 -> 0c2ee0ad95)

2022-11-24 Thread ephraimanierobi

ephraimanierobi pushed a change to branch v2-5-test
in repository https://gitbox.apache.org/repos/asf/airflow.git


from cc18921381 Update default branches for 2-5
 new 59d16b6765 Update version to 2.5.0
 new 0c2ee0ad95 Add release notes

The 2 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 README.md  |  14 +-
 RELEASE_NOTES.rst  | 273 +
 airflow/utils/db.py|   1 +
 .../installation/supported-versions.rst|   2 +-
 docs/docker-stack/README.md|  10 +-
 .../extending/add-apt-packages/Dockerfile  |   2 +-
 .../add-build-essential-extend/Dockerfile  |   2 +-
 .../extending/add-providers/Dockerfile |   2 +-
 .../extending/add-pypi-packages/Dockerfile |   2 +-
 .../extending/add-requirement-packages/Dockerfile  |   2 +-
 .../extending/custom-providers/Dockerfile  |   2 +-
 .../extending/embedding-dags/Dockerfile|   2 +-
 .../extending/writable-directory/Dockerfile|   2 +-
 docs/docker-stack/entrypoint.rst   |  14 +-
 newsfragments/26400.significant.rst|   5 -
 newsfragments/26873.significant.rst|   3 -
 newsfragments/27190.significant.rst|   3 -
 .../ci/pre_commit/pre_commit_supported_versions.py |   2 +-
 setup.py   |   2 +-
 19 files changed, 304 insertions(+), 41 deletions(-)
 delete mode 100644 newsfragments/26400.significant.rst
 delete mode 100644 newsfragments/26873.significant.rst
 delete mode 100644 newsfragments/27190.significant.rst



[GitHub] [airflow] potiuk opened a new issue, #27894: Status of testing Providers that were prepared on November 24, 2022

2022-11-24 Thread GitBox


potiuk opened a new issue, #27894:
URL: https://github.com/apache/airflow/issues/27894

   ### Body
   
   I have a kind request for all the contributors to the latest provider 
packages release.
   Could you please help us to test the RC versions of the providers?
   
   Let us know in the comment whether the issue is addressed.
   
   Those are providers that require testing as there were some substantial 
changes introduced:
   
   
   ## Provider [amazon: 
6.2.0rc1](https://pypi.org/project/apache-airflow-providers-amazon/6.2.0rc1)
  - [ ] [Use Boto waiters instead of customer _await_status method for RDS 
Operators (#27410)](https://github.com/apache/airflow/pull/27410): @hankehly
  - [ ] [Handle transient state errors in `RedshiftResumeClusterOperator` 
and `RedshiftPauseClusterOperator` 
(#27276)](https://github.com/apache/airflow/pull/27276): @syedahsn
  - [ ] [Correct job name matching in SagemakerProcessingOperator 
(#27634)](https://github.com/apache/airflow/pull/27634): @ferruzzi
   ## Provider [asana: 
2.1.0rc1](https://pypi.org/project/apache-airflow-providers-asana/2.1.0rc1)
  - [ ] [Allow and prefer non-prefixed extra fields for AsanaHook 
(#27043)](https://github.com/apache/airflow/pull/27043): @dstandish
   ## Provider [common.sql: 
1.3.1rc1](https://pypi.org/project/apache-airflow-providers-common-sql/1.3.1rc1)
  - [ ] [Restore removed (but used) methods in common.sql 
(#27843)](https://github.com/apache/airflow/pull/27843): @potiuk
  - [ ] [Fix errors in Databricks SQL operator introduced when refactoring 
(#27854)](https://github.com/apache/airflow/pull/27854): @potiuk
   ## Provider [databricks: 
4.0.0rc1](https://pypi.org/project/apache-airflow-providers-databricks/4.0.0rc1)
  - [ ] [Fix errors in Databricks SQL operator introduced when refactoring 
(#27854)](https://github.com/apache/airflow/pull/27854): @potiuk
  - [ ] [Fix templating fields and do_xcom_push in DatabricksSQLOperator 
(#27868)](https://github.com/apache/airflow/pull/27868): @potiuk
   ## Provider [exasol: 
4.1.1rc1](https://pypi.org/project/apache-airflow-providers-exasol/4.1.1rc1)
  - [ ] [Fix errors in Databricks SQL operator introduced when refactoring 
(#27854)](https://github.com/apache/airflow/pull/27854): @potiuk
   ## Provider [google: 
8.6.0rc1](https://pypi.org/project/apache-airflow-providers-google/8.6.0rc1)
  - [ ] [Persist DataprocLink for workflow operators regardless of job 
status (#26986)](https://github.com/apache/airflow/pull/26986): @vksunilk
  - [ ] [Deferrable mode for BigQueryToGCSOperator 
(#27683)](https://github.com/apache/airflow/pull/27683): @lwyszomi
  - [ ] [Fix to read location parameter properly in 
BigQueryToBigQueryOperator 
(#27661)](https://github.com/apache/airflow/pull/27661): @VladaZakharova
   ## Provider [jdbc: 
3.3.0rc1](https://pypi.org/project/apache-airflow-providers-jdbc/3.3.0rc1)
  - [ ] [Allow and prefer non-prefixed extra fields for JdbcHook 
(#27044)](https://github.com/apache/airflow/pull/27044): @dstandish
  - [ ] [Add SQLExecuteQueryOperator 
(#25717)](https://github.com/apache/airflow/pull/25717): @kazanzhy
   ## Provider [mysql: 
3.4.0rc1](https://pypi.org/project/apache-airflow-providers-mysql/3.4.0rc1)
  - [ ] [Allow SSL mode in MySQL provider 
(#27717)](https://github.com/apache/airflow/pull/27717): @Adityamalik123
   ## Provider [neo4j: 
3.2.1rc1](https://pypi.org/project/apache-airflow-providers-neo4j/3.2.1rc1)
  - [ ] [Fix typing problem revealed after recent Neo4J release 
(#27759)](https://github.com/apache/airflow/pull/27759): @potiuk
   ## Provider [presto: 
4.2.0rc1](https://pypi.org/project/apache-airflow-providers-presto/4.2.0rc1)
  - [ ] [Add _serialize_cell method to TrinoHook and PrestoHook 
(#27724)](https://github.com/apache/airflow/pull/27724): @alexandermalyga
   ## Provider [slack: 
7.1.0rc1](https://pypi.org/project/apache-airflow-providers-slack/7.1.0rc1)
  - [ ] [Implements SqlToSlackApiFileOperator 
(#26374)](https://github.com/apache/airflow/pull/26374): @Taragolis
   ## Provider [snowflake: 
4.0.1rc1](https://pypi.org/project/apache-airflow-providers-snowflake/4.0.1rc1)
  - [ ] [Fix errors in Databricks SQL operator introduced when refactoring 
(#27854)](https://github.com/apache/airflow/pull/27854): @potiuk
   ## Provider [trino: 
4.3.0rc1](https://pypi.org/project/apache-airflow-providers-trino/4.3.0rc1)
  - [ ] [Add _serialize_cell method to TrinoHook and PrestoHook 
(#27724)](https://github.com/apache/airflow/pull/27724): @alexandermalyga
   
   The guidelines on how to test providers can be found in
   
   [Verify providers by 
contributors](https://github.com/apache/airflow/blob/main/dev/README_RELEASE_PROVIDER_PACKAGES.md#verify-by-contributors)
   
   
   ### Committer
   
   - [X] I acknowledge that I am a maintainer/committer of the Apache Airflow 
project.


-- 
This is an automated message from the Apache Git Service.
To respond to the messa

[GitHub] [airflow] boring-cyborg[bot] commented on pull request #27893: AWSGlueJobHook updates job configuration if it exists

2022-11-24 Thread GitBox


boring-cyborg[bot] commented on PR #27893:
URL: https://github.com/apache/airflow/pull/27893#issuecomment-1326563042

   Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about anything, please check our Contribution Guide (https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst)
   Here are some useful points:
   - Pay attention to the quality of your code (flake8, mypy and type 
annotations). Our [pre-commits]( 
https://github.com/apache/airflow/blob/main/STATIC_CODE_CHECKS.rst#prerequisites-for-pre-commit-hooks)
 will help you with that.
   - In case of a new feature add useful documentation (in docstrings or in 
`docs/` directory). Adding a new operator? Check this short 
[guide](https://github.com/apache/airflow/blob/main/docs/apache-airflow/howto/custom-operator.rst)
 Consider adding an example DAG that shows how users should use it.
   - Consider using [Breeze 
environment](https://github.com/apache/airflow/blob/main/BREEZE.rst) for 
testing locally, it's a heavy docker but it ships with a working Airflow and a 
lot of integrations.
   - Be patient and persistent. It might take some time to get a review or get 
the final approval from Committers.
   - Please follow [ASF Code of 
Conduct](https://www.apache.org/foundation/policies/conduct) for all 
communication including (but not limited to) comments on Pull Requests, Mailing 
list and Slack.
   - Be sure to read the [Airflow Coding style]( 
https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst#coding-style-and-best-practices).
   Apache Airflow is a community-driven project and together we are making it 
better 🚀.
   In case of doubts contact the developers at:
   Mailing List: d...@airflow.apache.org
   Slack: https://s.apache.org/airflow-slack
   





[GitHub] [airflow] romibuzi opened a new pull request, #27893: AWSGlueJobHook updates job configuration if it exists

2022-11-24 Thread GitBox


romibuzi opened a new pull request, #27893:
URL: https://github.com/apache/airflow/pull/27893

   closes: #27592 
   
   ---
   
   Rename `GlueJobHook.get_or_create_glue_job()` to `create_or_update_glue_job()` and split the code into separate methods: `create_glue_job_config()`, `has_job()`, `create_job()` and `update_job()`.
   
   The behavior now mirrors that of `GlueCrawlerOperator`.
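   The create-or-update flow described in this PR can be sketched roughly as follows. This is an illustrative Python sketch of the pattern only, not the actual provider code: the fake client, the `create_or_update_glue_job` signature, and the config keys are assumptions standing in for the boto3 Glue API.
   
   ```python
   # Illustrative sketch of the create-or-update pattern (NOT the real
   # GlueJobHook code). FakeGlueClient stands in for the boto3 Glue client.
   
   def create_or_update_glue_job(client, job_name: str, job_config: dict) -> str:
       """Create the Glue job if it does not exist, otherwise update it."""
       try:
           client.get_job(JobName=job_name)  # raises if the job is missing
           exists = True
       except Exception:
           exists = False
       if exists:
           # the update payload goes under ``JobUpdate`` and must not
           # repeat the job name inside it
           job_update = {k: v for k, v in job_config.items() if k != "Name"}
           client.update_job(JobName=job_name, JobUpdate=job_update)
           return "updated"
       client.create_job(**job_config)
       return "created"
   
   
   class FakeGlueClient:
       """Minimal in-memory stand-in for the Glue API surface used above."""
   
       def __init__(self):
           self.jobs = {}
   
       def get_job(self, JobName):
           return {"Job": self.jobs[JobName]}  # KeyError if the job is absent
   
       def create_job(self, **job_config):
           self.jobs[job_config["Name"]] = job_config
   
       def update_job(self, JobName, JobUpdate):
           self.jobs[JobName].update(JobUpdate)
   
   
   client = FakeGlueClient()
   config = {"Name": "demo", "Role": "arn:aws:iam::111122223333:role/glue-demo"}
   print(create_or_update_glue_job(client, "demo", config))   # created
   config["Role"] = "arn:aws:iam::111122223333:role/glue-v2"
   print(create_or_update_glue_job(client, "demo", config))   # updated
   ```
   
   Running the hook twice with a changed config is thus idempotent in shape: the first call creates the job, every later call updates it in place, which is the behavior the PR aligns with `GlueCrawlerOperator`.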
   
   




