[GitHub] [airflow] github-actions[bot] commented on pull request #20441: Add "use_ssl" option to IMAP connection

2022-01-12 Thread GitBox


github-actions[bot] commented on pull request #20441:
URL: https://github.com/apache/airflow/pull/20441#issuecomment-1010765291


   The PR is likely OK to be merged with just a subset of tests for the default Python and Database versions without running the full matrix of tests, because it does not modify the core of Airflow. If the committers decide that the full test matrix is needed, they will add the label 'full tests needed'. Then you should rebase to the latest main or amend the last commit of the PR, and push it with --force-with-lease.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] potiuk commented on pull request #20441: Add "use_ssl" option to IMAP connection

2022-01-12 Thread GitBox


potiuk commented on pull request #20441:
URL: https://github.com/apache/airflow/pull/20441#issuecomment-1010780169


   Thanks @feluelle !


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] potiuk merged pull request #20441: Add "use_ssl" option to IMAP connection

2022-01-12 Thread GitBox


potiuk merged pull request #20441:
URL: https://github.com/apache/airflow/pull/20441


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[airflow] branch main updated (46d3799 -> 25a5f55)

2022-01-12 Thread potiuk
This is an automated email from the ASF dual-hosted git repository.

potiuk pushed a change to branch main
in repository https://gitbox.apache.org/repos/asf/airflow.git.


from 46d3799  Add Context stub to Airflow packages (#20817)
 add 25a5f55  Add "use_ssl" option to IMAP connection (#20441)

No new revisions were added by this update.

Summary of changes:
 airflow/providers/imap/hooks/imap.py   | 23 +++
 .../connections/imap.rst   | 17 +--
 tests/providers/imap/hooks/test_imap.py| 34 +++---
 3 files changed, 62 insertions(+), 12 deletions(-)


[GitHub] [airflow] Bowrna commented on issue #20740: Breeze: Running static checks with Breeze

2022-01-12 Thread GitBox


Bowrna commented on issue #20740:
URL: https://github.com/apache/airflow/issues/20740#issuecomment-1010789046


   Static Checks:
   
   docker_engine_resources::check_all_resources
   breeze::make_sure_precommit_is_installed
   breeze::run_static_checks
   
   
   docker_engine_resources::check_all_resources:
   Runs the docker command in Python to determine resource availability, using:
   EXTRA_DOCKER_FLAGS
   AIRFLOW_CI_IMAGE_WITH_TAG
   
   This runs the Airflow CI image (with tag) with the extra docker flags, with /bin/bash as the entrypoint, and executes the script at /opt/airflow/scripts/in_container/run_resource_check.sh
   
   Since this is an in_container script, it is taken care of inside the docker container once docker is running; I only have to work on starting the docker run command itself.
   
   What's in the script?
   Resources checked: memory, CPU, disk space
   
   Minimum resource thresholds to run (a Python sketch of this check follows below):
   Memory: 4 GB
   CPU: 2
   Disk available: 20 GB
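   
   A minimal sketch, assuming a plain `docker info` call, of how this resource check could look once ported to Python; the thresholds mirror the values listed above, while the function name and output handling are my assumptions, not the actual Breeze code:
   
   ```python
   import shutil
   import subprocess

   MIN_MEMORY_GB = 4
   MIN_CPUS = 2
   MIN_DISK_GB = 20

   def check_docker_resources() -> None:
       # Ask the Docker daemon how much memory and how many CPUs it can use.
       info = subprocess.run(
           ["docker", "info", "--format", "{{.MemTotal}} {{.NCPU}}"],
           capture_output=True, text=True, check=True,
       )
       mem_bytes, ncpu = info.stdout.split()
       if int(mem_bytes) / 1024**3 < MIN_MEMORY_GB:
           print(f"WARNING: less than {MIN_MEMORY_GB} GB of memory available for Docker")
       if int(ncpu) < MIN_CPUS:
           print(f"WARNING: less than {MIN_CPUS} CPUs available for Docker")
       if shutil.disk_usage("/").free / 1024**3 < MIN_DISK_GB:
           print(f"WARNING: less than {MIN_DISK_GB} GB of disk space available")
   ```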
   
   
   breeze::make_sure_precommit_is_installed
   Checks whether pip or pip3 is on the path, and errors out if neither is available. Then runs pip install --upgrade pre-commit to install pre-commit.
   
   ~/.local/bin is added to the PATH in case pip is run outside of a virtualenv. (Do we have to do this step when converting to Python? Also, it is not clear to me how this works; could you explain it to me? A sketch of my understanding follows below.)
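   
   A hedged sketch of my understanding of what this could look like in Python; names are hypothetical, and the PATH extension reflects the fact that pip installs user-level console scripts into ~/.local/bin when run outside a virtualenv:
   
   ```python
   import os
   import shutil
   import subprocess
   import sys

   def make_sure_precommit_is_installed() -> None:
       # Error out if neither pip nor pip3 is on the PATH.
       pip = shutil.which("pip") or shutil.which("pip3")
       if pip is None:
           sys.exit("ERROR: pip or pip3 must be available on the PATH")
       subprocess.run([pip, "install", "--upgrade", "pre-commit"], check=True)
       # Outside a virtualenv, pip places the pre-commit script in ~/.local/bin,
       # so make sure that directory is on the PATH before invoking it.
       local_bin = os.path.expanduser("~/.local/bin")
       if local_bin not in os.environ.get("PATH", "").split(os.pathsep):
           os.environ["PATH"] = os.environ["PATH"] + os.pathsep + local_bin
   ```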
   
   breeze::run_static_checks
   EXTRA_STATIC_CHECK_OPTIONS - I am not sure where this is initialized. Could you point me to where it's initialized in the code?
   In all cases, pre-commit is run with EXTRA_STATIC_CHECK_OPTIONS.
   
   If the static check is mypy or flake8, the pre-commit "build" hook is run first, and then pre-commit is run with the static check and EXTRA_STATIC_CHECK_OPTIONS (a rough sketch follows below).
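   
   A rough sketch of that flow, assuming EXTRA_STATIC_CHECK_OPTIONS arrives as a list of extra CLI arguments and that "build" is the pre-commit hook that prepares the CI image (function and parameter names are mine, not the actual code):
   
   ```python
   import subprocess
   from typing import List, Optional

   def run_static_checks(check: Optional[str], extra_options: List[str]) -> None:
       if check in ("mypy", "flake8"):
           # Image-based checks need the CI image available first.
           subprocess.run(["pre-commit", "run", "build"], check=True)
       cmd = ["pre-commit", "run"]
       if check is not None and check != "all":
           cmd.append(check)  # run a single named hook instead of all of them
       subprocess.run(cmd + extra_options, check=True)
   ```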
   
   @potiuk I have traced this first check in the current Breeze and written down the current flow. Could you check and tell me whether my understanding is correct? Also, can you tell me whether I have to follow a similar flow in Python too?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] uranusjr commented on a change in pull request #20286: Add TaskMap and TaskInstance.map_id

2022-01-12 Thread GitBox


uranusjr commented on a change in pull request #20286:
URL: https://github.com/apache/airflow/pull/20286#discussion_r782838839



##
File path: airflow/migrations/versions/e655c0453f75_add_taskmap_and_map_id_on_taskinstance.py
##
@@ -0,0 +1,120 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""Add TaskMap and map_index on TaskInstance.
+
+Revision ID: e655c0453f75
+Revises: 587bdf053233
+Create Date: 2021-12-13 22:59:41.052584
+"""
+
+from alembic import op
+from sqlalchemy import Column, ForeignKeyConstraint, Integer
+
+from airflow.models.base import StringID
+from airflow.utils.sqlalchemy import ExtendedJSON
+
+# Revision identifiers, used by Alembic.
+revision = "e655c0453f75"
+down_revision = "587bdf053233"
+branch_labels = None
+depends_on = None
+
+
+def upgrade():
+"""Add TaskMap and map_index on TaskInstance."""
+# We need to first remove constraints on task_reschedule since they depend 
on task_instance.
+with op.batch_alter_table("task_reschedule") as batch_op:
+batch_op.drop_constraint("task_reschedule_ti_fkey", "foreignkey")
+batch_op.drop_index("idx_task_reschedule_dag_task_run")
+
+# Change task_instance's primary key.
+with op.batch_alter_table("task_instance") as batch_op:
+# I think we always use this name for TaskInstance after 7b2661a43ba3?
+batch_op.drop_constraint("task_instance_pkey", type_="primary")
+batch_op.add_column(Column("map_index", Integer, nullable=False, 
default=-1))
+batch_op.create_primary_key("task_instance_pkey", ["dag_id", 
"task_id", "run_id", "map_index"])
+
+# Re-create task_reschedule's constraints.
+with op.batch_alter_table("task_reschedule") as batch_op:
+batch_op.add_column(Column("map_index", Integer, nullable=False, 
default=-1))
+batch_op.create_foreign_key(
+"task_reschedule_ti_fkey",
+"task_instance",
+["dag_id", "task_id", "run_id", "map_index"],
+["dag_id", "task_id", "run_id", "map_index"],
+ondelete="CASCADE",
+)
+batch_op.create_index(
+"idx_task_reschedule_dag_task_run",
+["dag_id", "task_id", "run_id", "map_index"],
+unique=False,
+)
+
+# Create task_map.
+op.create_table(
+"task_map",
+Column("dag_id", StringID(), primary_key=True),
+Column("task_id", StringID(), primary_key=True),
+Column("run_id", StringID(), primary_key=True),
+Column("map_index", Integer, primary_key=True),

Review comment:
   Added a CheckConstraint and a check in `__init__` for this instead.
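   
   For reference, a minimal sketch, assuming plain SQLAlchemy 1.4+ declarative models, of the "CheckConstraint plus `__init__` check" approach; the column types and names here are simplified stand-ins, not Airflow's actual model:
   
   ```python
   from sqlalchemy import CheckConstraint, Column, Integer, String
   from sqlalchemy.orm import declarative_base

   Base = declarative_base()

   class TaskMap(Base):
       __tablename__ = "task_map"

       dag_id = Column(String(250), primary_key=True)
       task_id = Column(String(250), primary_key=True)
       run_id = Column(String(250), primary_key=True)
       map_index = Column(Integer, primary_key=True)

       # Enforced by the database ...
       __table_args__ = (CheckConstraint(map_index >= 0, name="ck_task_map_index_positive"),)

       # ... and validated eagerly in Python, so bad values fail fast.
       def __init__(self, *, map_index: int, **kwargs):
           if map_index < 0:
               raise ValueError("map_index must be non-negative")
           super().__init__(map_index=map_index, **kwargs)
   ```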




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] uranusjr commented on a change in pull request #20286: Add TaskMap and TaskInstance.map_id

2022-01-12 Thread GitBox


uranusjr commented on a change in pull request #20286:
URL: https://github.com/apache/airflow/pull/20286#discussion_r782776457



##
File path: airflow/migrations/versions/e655c0453f75_add_taskmap_and_map_id_on_taskinstance.py
##
@@ -0,0 +1,120 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""Add TaskMap and map_index on TaskInstance.
+
+Revision ID: e655c0453f75
+Revises: 587bdf053233
+Create Date: 2021-12-13 22:59:41.052584
+"""
+
+from alembic import op
+from sqlalchemy import Column, ForeignKeyConstraint, Integer
+
+from airflow.models.base import StringID
+from airflow.utils.sqlalchemy import ExtendedJSON
+
+# Revision identifiers, used by Alembic.
+revision = "e655c0453f75"
+down_revision = "587bdf053233"
+branch_labels = None
+depends_on = None
+
+
+def upgrade():
+"""Add TaskMap and map_index on TaskInstance."""
+# We need to first remove constraints on task_reschedule since they depend 
on task_instance.
+with op.batch_alter_table("task_reschedule") as batch_op:
+batch_op.drop_constraint("task_reschedule_ti_fkey", "foreignkey")
+batch_op.drop_index("idx_task_reschedule_dag_task_run")
+
+# Change task_instance's primary key.
+with op.batch_alter_table("task_instance") as batch_op:
+# I think we always use this name for TaskInstance after 7b2661a43ba3?
+batch_op.drop_constraint("task_instance_pkey", type_="primary")
+batch_op.add_column(Column("map_index", Integer, nullable=False, 
default=-1))
+batch_op.create_primary_key("task_instance_pkey", ["dag_id", 
"task_id", "run_id", "map_index"])
+
+# Re-create task_reschedule's constraints.
+with op.batch_alter_table("task_reschedule") as batch_op:
+batch_op.add_column(Column("map_index", Integer, nullable=False, 
default=-1))
+batch_op.create_foreign_key(
+"task_reschedule_ti_fkey",
+"task_instance",
+["dag_id", "task_id", "run_id", "map_index"],
+["dag_id", "task_id", "run_id", "map_index"],
+ondelete="CASCADE",
+)
+batch_op.create_index(
+"idx_task_reschedule_dag_task_run",
+["dag_id", "task_id", "run_id", "map_index"],
+unique=False,
+)
+
+# Create task_map.
+op.create_table(
+"task_map",
+Column("dag_id", StringID(), primary_key=True),
+Column("task_id", StringID(), primary_key=True),
+Column("run_id", StringID(), primary_key=True),
+Column("map_index", Integer, primary_key=True),

Review comment:
   Actually I think this _shouldn’t ever_ be `-1` because `TI.map_index == 
-1` means the TI is _not_ mapped. Let me check if SQLAlchemy has an unsigned 
integer column type…

##
File path: airflow/migrations/versions/e655c0453f75_add_taskmap_and_map_id_on_taskinstance.py
##
@@ -0,0 +1,120 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""Add TaskMap and map_index on TaskInstance.
+
+Revision ID: e655c0453f75
+Revises: 587bdf053233
+Create Date: 2021-12-13 22:59:41.052584
+"""
+
+from alembic import op
+from sqlalchemy import Column, ForeignKeyConstraint, Integer
+
+from airflow.models.base import StringID
+from airflow.utils.sqlalchemy import ExtendedJSON
+
+# Revision identifiers, used by Alembic.
+revision = "e655c0453f75"
+down_revision = "587bdf053233"
+branch_labels = None
+depends_on = None
+
+
+def upgrade():
+"""Add TaskMap and map_index on TaskInstanc

[GitHub] [airflow] uranusjr commented on a change in pull request #20286: Add TaskMap and TaskInstance.map_id

2022-01-12 Thread GitBox


uranusjr commented on a change in pull request #20286:
URL: https://github.com/apache/airflow/pull/20286#discussion_r782844937



##
File path: airflow/migrations/versions/e655c0453f75_add_taskmap_and_map_id_on_taskinstance.py
##
@@ -0,0 +1,120 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""Add TaskMap and map_index on TaskInstance.
+
+Revision ID: e655c0453f75
+Revises: 587bdf053233
+Create Date: 2021-12-13 22:59:41.052584
+"""
+
+from alembic import op
+from sqlalchemy import Column, ForeignKeyConstraint, Integer
+
+from airflow.models.base import StringID
+from airflow.utils.sqlalchemy import ExtendedJSON
+
+# Revision identifiers, used by Alembic.
+revision = "e655c0453f75"
+down_revision = "587bdf053233"
+branch_labels = None
+depends_on = None
+
+
+def upgrade():
+"""Add TaskMap and map_index on TaskInstance."""
+# We need to first remove constraints on task_reschedule since they depend 
on task_instance.
+with op.batch_alter_table("task_reschedule") as batch_op:
+batch_op.drop_constraint("task_reschedule_ti_fkey", "foreignkey")
+batch_op.drop_index("idx_task_reschedule_dag_task_run")
+
+# Change task_instance's primary key.
+with op.batch_alter_table("task_instance") as batch_op:
+# I think we always use this name for TaskInstance after 7b2661a43ba3?
+batch_op.drop_constraint("task_instance_pkey", type_="primary")
+batch_op.add_column(Column("map_index", Integer, nullable=False, 
default=-1))
+batch_op.create_primary_key("task_instance_pkey", ["dag_id", 
"task_id", "run_id", "map_index"])
+
+# Re-create task_reschedule's constraints.
+with op.batch_alter_table("task_reschedule") as batch_op:
+batch_op.add_column(Column("map_index", Integer, nullable=False, 
default=-1))
+batch_op.create_foreign_key(
+"task_reschedule_ti_fkey",
+"task_instance",
+["dag_id", "task_id", "run_id", "map_index"],
+["dag_id", "task_id", "run_id", "map_index"],
+ondelete="CASCADE",
+)
+batch_op.create_index(
+"idx_task_reschedule_dag_task_run",
+["dag_id", "task_id", "run_id", "map_index"],
+unique=False,
+)
+
+# Create task_map.
+op.create_table(
+"task_map",
+Column("dag_id", StringID(), primary_key=True),
+Column("task_id", StringID(), primary_key=True),
+Column("run_id", StringID(), primary_key=True),
+Column("map_index", Integer, primary_key=True),

Review comment:
   The default on `TI.map_index` is mainly added so we don’t need to change 
too much existing code (otherwise we’d need to add a ton of `map_index=-1`), 
but this being a new class, I think I prefer being explicit instead.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] Bowrna commented on issue #20740: Breeze: Running static checks with Breeze

2022-01-12 Thread GitBox


Bowrna commented on issue #20740:
URL: https://github.com/apache/airflow/issues/20740#issuecomment-1010809006


   > The list of available static checks should be retrieved by parsing the .pre-commit.yml file rather than (as it is currently done) maintaining the list in ./breeze-complete script
   
   @potiuk Do you think we have to parse the .pre-commit.yml file on the fly and enable auto-complete for those checks, rather than adding click options? (A sketch of what that parsing could look like follows below.)
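   
   A sketch of what that on-the-fly parsing could look like, assuming the standard pre-commit config file name (.pre-commit-config.yaml) and layout, and requiring PyYAML; the function name is hypothetical:
   
   ```python
   from typing import List

   import yaml

   def available_static_checks(config_path: str = ".pre-commit-config.yaml") -> List[str]:
       with open(config_path) as f:
           config = yaml.safe_load(f)
       # Each repo entry carries a list of hooks; the hook ids are the names
       # users pass to `pre-commit run <id>`, so they are what auto-complete needs.
       return sorted(hook["id"] for repo in config["repos"] for hook in repo["hooks"])
   ```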


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] kaxil commented on a change in pull request #20349: Fix Scheduler crash when executing task instances of missing DAG

2022-01-12 Thread GitBox


kaxil commented on a change in pull request #20349:
URL: https://github.com/apache/airflow/pull/20349#discussion_r782858400



##
File path: tests/jobs/test_scheduler_job.py
##
@@ -645,6 +645,34 @@ def test_find_executable_task_instances_in_default_pool(self, dag_maker):
         session.rollback()
         session.close()
 
+    def test_executable_task_instances_to_queued_fails_task_for_missing_dag_in_dagbag(
+        self, dag_maker, session
+    ):
+        """Only check concurrency for dag in dagbag"""

Review comment:
   docstring is wrong

##
File path: tests/jobs/test_scheduler_job.py
##
@@ -645,6 +645,34 @@ def test_find_executable_task_instances_in_default_pool(self, dag_maker):
         session.rollback()
         session.close()
 
+    def test_executable_task_instances_to_queued_fails_task_for_missing_dag_in_dagbag(
+        self, dag_maker, session
+    ):
+        """Only check concurrency for dag in dagbag"""

Review comment:
   docstring is wrong




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] kaxil commented on a change in pull request #20349: Fix Scheduler crash when executing task instances of missing DAG

2022-01-12 Thread GitBox


kaxil commented on a change in pull request #20349:
URL: https://github.com/apache/airflow/pull/20349#discussion_r782861728



##
File path: tests/jobs/test_scheduler_job.py
##
@@ -645,6 +645,34 @@ def test_find_executable_task_instances_in_default_pool(self, dag_maker):
         session.rollback()
         session.close()
 
+    def test_executable_task_instances_to_queued_fails_task_for_missing_dag_in_dagbag(
+        self, dag_maker, session
+    ):
+        """Only check concurrency for dag in dagbag"""
+        dag_id = 'SchedulerJobTest.test_find_executable_task_instances_not_in_dagbag'
+        task_id_1 = 'dummy'
+        task_id_2 = 'dummydummy'
+
+        with dag_maker(dag_id=dag_id, session=session, default_args={"max_active_tis_per_dag": 1}):
+            DummyOperator(task_id=task_id_1)
+            DummyOperator(task_id=task_id_2)
+
+        self.scheduler_job = SchedulerJob(subdir=os.devnull)
+        self.scheduler_job.dagbag = mock.MagicMock()
+        self.scheduler_job.dagbag.get_dag.return_value = None
+
+        dr = dag_maker.create_dagrun(state=DagRunState.RUNNING)
+
+        tis = dr.task_instances
+        for ti in tis:
+            ti.state = State.SCHEDULED
+            session.merge(ti)
+        session.flush()
+        res = self.scheduler_job._executable_task_instances_to_queued(max_tis=32, session=session)
+        session.flush()
+        assert 0 == len(res)
+        assert session.query(TaskInstance).filter(TaskInstance.state == State.FAILED).count() == 2

Review comment:
   you can use `dr.get_task_instances` here




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] kaxil commented on a change in pull request #20349: Fix Scheduler crash when executing task instances of missing DAG

2022-01-12 Thread GitBox


kaxil commented on a change in pull request #20349:
URL: https://github.com/apache/airflow/pull/20349#discussion_r782861728



##
File path: tests/jobs/test_scheduler_job.py
##
@@ -645,6 +645,34 @@ def test_find_executable_task_instances_in_default_pool(self, dag_maker):
         session.rollback()
         session.close()
 
+    def test_executable_task_instances_to_queued_fails_task_for_missing_dag_in_dagbag(
+        self, dag_maker, session
+    ):
+        """Only check concurrency for dag in dagbag"""
+        dag_id = 'SchedulerJobTest.test_find_executable_task_instances_not_in_dagbag'
+        task_id_1 = 'dummy'
+        task_id_2 = 'dummydummy'
+
+        with dag_maker(dag_id=dag_id, session=session, default_args={"max_active_tis_per_dag": 1}):
+            DummyOperator(task_id=task_id_1)
+            DummyOperator(task_id=task_id_2)
+
+        self.scheduler_job = SchedulerJob(subdir=os.devnull)
+        self.scheduler_job.dagbag = mock.MagicMock()
+        self.scheduler_job.dagbag.get_dag.return_value = None
+
+        dr = dag_maker.create_dagrun(state=DagRunState.RUNNING)
+
+        tis = dr.task_instances
+        for ti in tis:
+            ti.state = State.SCHEDULED
+            session.merge(ti)
+        session.flush()
+        res = self.scheduler_job._executable_task_instances_to_queued(max_tis=32, session=session)
+        session.flush()
+        assert 0 == len(res)
+        assert session.query(TaskInstance).filter(TaskInstance.state == State.FAILED).count() == 2

Review comment:
   you can use `dr.get_task_instances` here which will guarantee we get the 
correct ti's 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] kaxil commented on a change in pull request #19769: Handle stuck queued tasks in Celery

2022-01-12 Thread GitBox


kaxil commented on a change in pull request #19769:
URL: https://github.com/apache/airflow/pull/19769#discussion_r782876872



##
File path: airflow/executors/celery_executor.py
##
@@ -377,6 +384,52 @@ def _check_for_stalled_adopted_tasks(self):
         for key in timedout_keys:
             self.change_state(key, State.FAILED)
 
+    @provide_session
+    def _clear_stuck_queued_tasks(self, session=None):
+        """
+        Tasks can get stuck in queued state in DB while still not in
+        worker. This happens when the worker is autoscaled down and
+        the task is queued but has not been picked up by any worker prior to the scaling.
+
+        In such situation, we update the task instance state to scheduled so that
+        it can be queued again. We chose to use task_adoption_timeout to decide
+        """
+        if not isinstance(app.backend, DatabaseBackend):
+            # We only want to do this for database backends where
+            # this case has been spotted
+            return
+        # We use this instead of using bulk_state_fetcher because we
+        # may not have the stuck task in self.tasks and we don't want
+        # to clear task in self.tasks too
+        session_ = app.backend.ResultSession()
+        task_cls = getattr(app.backend, "task_cls", TaskDb)
+        with session_cleanup(session_):
+            celery_task_ids = [t.task_id for t in session_.query(task_cls.task_id).all()]
+        self.log.debug("Checking for stuck queued tasks")
+
+        max_allowed_time = utcnow() - self.task_adoption_timeout
+
+        for task in session.query(TaskInstance).filter(
+            TaskInstance.state == State.QUEUED, TaskInstance.queued_dttm < max_allowed_time
+        ):
+
+            self.log.info("Checking task %s", task)

Review comment:
   ```suggestion
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] kaxil commented on a change in pull request #19769: Handle stuck queued tasks in Celery

2022-01-12 Thread GitBox


kaxil commented on a change in pull request #19769:
URL: https://github.com/apache/airflow/pull/19769#discussion_r782877828



##
File path: airflow/executors/celery_executor.py
##
@@ -377,6 +384,52 @@ def _check_for_stalled_adopted_tasks(self):
         for key in timedout_keys:
             self.change_state(key, State.FAILED)
 
+    @provide_session
+    def _clear_stuck_queued_tasks(self, session=None):

Review comment:
   ```suggestion
   def _clear_stuck_queued_tasks(self, session: Session = NEW_SESSION) -> None:
   ```
   
   while we are at it and since it is a new method




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] ashb commented on a change in pull request #20443: Add Listener Plugin API that tracks TaskInstance state changes

2022-01-12 Thread GitBox


ashb commented on a change in pull request #20443:
URL: https://github.com/apache/airflow/pull/20443#discussion_r782855132



##
File path: tests/listeners/test_empty_listener.py
##
@@ -0,0 +1,24 @@
+#

Review comment:
   Let's change the name of these files that don't contain tests to just 
`empty_listener` etc -- the `test_` prefix is confusing here as they don't 
contain any tests themselves, but are used by them.

##
File path: tests/listeners/test_listeners.py
##
@@ -0,0 +1,108 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import pytest as pytest
+
+from airflow import AirflowException
+from airflow.listeners import hookimpl
+from airflow.listeners.events import register_task_instance_state_events
+from airflow.listeners.listener import get_listener_manager
+from airflow.operators.bash import BashOperator
+from airflow.utils import timezone
+from airflow.utils.session import provide_session
+from airflow.utils.state import State
+from tests.listeners import test_full_listener, test_partial_listener, test_throwing_listener
+
+DAG_ID = "test_listener_dag"
+TASK_ID = "test_listener_task"
+EXECUTION_DATE = timezone.utcnow()
+
+
+@pytest.fixture(scope="module", autouse=True)
+def register_events():
+    register_task_instance_state_events()

Review comment:
   Do we need to remove this setting/config when the test finishes?

##
File path: tests/listeners/test_listeners.py
##
@@ -0,0 +1,108 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import pytest as pytest
+
+from airflow import AirflowException
+from airflow.listeners import hookimpl
+from airflow.listeners.events import register_task_instance_state_events
+from airflow.listeners.listener import get_listener_manager
+from airflow.operators.bash import BashOperator
+from airflow.utils import timezone
+from airflow.utils.session import provide_session
+from airflow.utils.state import State
+from tests.listeners import test_full_listener, test_partial_listener, test_throwing_listener
+
+DAG_ID = "test_listener_dag"
+TASK_ID = "test_listener_task"
+EXECUTION_DATE = timezone.utcnow()
+
+
+@pytest.fixture(scope="module", autouse=True)
+def register_events():
+register_task_instance_state_events()
+
+
+@pytest.fixture(autouse=True)
+def clean_listener_manager():
+lm = get_listener_manager()
+lm.clear()
+yield
+lm = get_listener_manager()
+lm.clear()
+test_full_listener.state = []
+
+
+@provide_session
+def test_listener_gets_calls(create_task_instance, session=None):
+lm = get_listener_manager()
+lm.add_listener(test_full_listener)
+
+ti = create_task_instance(session=session, state=State.QUEUED)
+ti.run()

Review comment:
   Is this one using `ti.run()` and not `ti._run_raw_task()` for a reason? 
If so please add a short comment

##
File path: airflow/plugins_manager.py
##
@@ -458,6 +463,20 @@ def integrate_macros_plugins() -> None:
         setattr(macros, plugin.name, macros_module)
 
 
+def integrate_listener_plugins(listener_manager: "ListenerManager") -> None:
+    global plugins
+    global macros_modules

Review comment:
   ```suggestion
   ```
   
   Not used?

##
File path: tests/listeners/test_listeners.py
##
@@ -0,0 +1,108 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information

[GitHub] [airflow] kaxil commented on a change in pull request #19769: Handle stuck queued tasks in Celery

2022-01-12 Thread GitBox


kaxil commented on a change in pull request #19769:
URL: https://github.com/apache/airflow/pull/19769#discussion_r782895730



##
File path: tests/executors/test_celery_executor.py
##
@@ -411,6 +406,171 @@ def test_check_for_stalled_adopted_tasks_goes_in_ordered_fashion(self):
         assert executor.running == {key_2}
         assert executor.adopted_task_timeouts == {key_2: queued_dttm_2 + executor.task_adoption_timeout}
 
+    @pytest.mark.backend("mysql", "postgres")
+    @pytest.mark.parametrize(
+        "state, queued_dttm, executor_id",
+        [
+            (State.SCHEDULED, timezone.utcnow() - timedelta(days=2), '231'),
+            (State.QUEUED, timezone.utcnow(), '231'),
+            (State.QUEUED, timezone.utcnow(), None),
+        ],
+    )
+    def test_stuck_queued_tasks_are_cleared(
+        self, state, queued_dttm, executor_id, session, dag_maker, create_dummy_dag, create_task_instance
+    ):
+        """Test that clear_stuck_queued_tasks works"""
+        ti = create_task_instance(state=State.QUEUED)
+        ti.queued_dttm = queued_dttm
+        ti.external_executor_id = executor_id
+        session.merge(ti)
+        session.flush()
+        executor = celery_executor.CeleryExecutor()
+        executor._clear_stuck_queued_tasks()
+        session.flush()
+        ti = session.query(TaskInstance).filter(TaskInstance.task_id == ti.task_id).one()
+        assert ti.state == state
+
+    @pytest.mark.backend("mysql", "postgres")
+    def test_task_in_queued_tasks_dict_are_not_cleared(
+        self, session, dag_maker, create_dummy_dag, create_task_instance
+    ):
+        """Test that clear_stuck_queued_tasks doesn't clear tasks in executor.queued_tasks"""
+        ti = create_task_instance(state=State.QUEUED)
+        ti.queued_dttm = timezone.utcnow() - timedelta(days=2)
+        ti.external_executor_id = '231'
+        session.merge(ti)
+        session.flush()
+        executor = celery_executor.CeleryExecutor()
+        executor.queued_tasks = {ti.key: AsyncResult("231")}
+        executor._clear_stuck_queued_tasks()
+        session.flush()
+        ti = session.query(TaskInstance).filter(TaskInstance.task_id == ti.task_id).one()

Review comment:
   based on the test name, you should check the following:
   
   ```
   assert executor.queued_tasks == {ti.key: AsyncResult("231")}
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] kaxil commented on a change in pull request #19769: Handle stuck queued tasks in Celery

2022-01-12 Thread GitBox


kaxil commented on a change in pull request #19769:
URL: https://github.com/apache/airflow/pull/19769#discussion_r782897587



##
File path: tests/executors/test_celery_executor.py
##
@@ -411,6 +406,171 @@ def test_check_for_stalled_adopted_tasks_goes_in_ordered_fashion(self):
         assert executor.running == {key_2}
         assert executor.adopted_task_timeouts == {key_2: queued_dttm_2 + executor.task_adoption_timeout}
 
+    @pytest.mark.backend("mysql", "postgres")
+    @pytest.mark.parametrize(
+        "state, queued_dttm, executor_id",
+        [
+            (State.SCHEDULED, timezone.utcnow() - timedelta(days=2), '231'),
+            (State.QUEUED, timezone.utcnow(), '231'),
+            (State.QUEUED, timezone.utcnow(), None),
+        ],
+    )
+    def test_stuck_queued_tasks_are_cleared(
+        self, state, queued_dttm, executor_id, session, dag_maker, create_dummy_dag, create_task_instance
+    ):
+        """Test that clear_stuck_queued_tasks works"""
+        ti = create_task_instance(state=State.QUEUED)
+        ti.queued_dttm = queued_dttm
+        ti.external_executor_id = executor_id
+        session.merge(ti)
+        session.flush()
+        executor = celery_executor.CeleryExecutor()
+        executor._clear_stuck_queued_tasks()
+        session.flush()
+        ti = session.query(TaskInstance).filter(TaskInstance.task_id == ti.task_id).one()
+        assert ti.state == state
+
+    @pytest.mark.backend("mysql", "postgres")
+    def test_task_in_queued_tasks_dict_are_not_cleared(
+        self, session, dag_maker, create_dummy_dag, create_task_instance
+    ):
+        """Test that clear_stuck_queued_tasks doesn't clear tasks in executor.queued_tasks"""
+        ti = create_task_instance(state=State.QUEUED)
+        ti.queued_dttm = timezone.utcnow() - timedelta(days=2)
+        ti.external_executor_id = '231'
+        session.merge(ti)
+        session.flush()
+        executor = celery_executor.CeleryExecutor()
+        executor.queued_tasks = {ti.key: AsyncResult("231")}
+        executor._clear_stuck_queued_tasks()
+        session.flush()
+        ti = session.query(TaskInstance).filter(TaskInstance.task_id == ti.task_id).one()
+        assert ti.state == State.QUEUED
+
+    @pytest.mark.backend("mysql", "postgres")
+    def test_task_in_running_dict_are_not_cleared(
+        self, session, dag_maker, create_dummy_dag, create_task_instance
+    ):
+        """Test that clear_stuck_queued_tasks doesn't clear tasks in executor.running"""
+        ti = create_task_instance(state=State.QUEUED)
+        ti.queued_dttm = timezone.utcnow() - timedelta(days=2)
+        ti.external_executor_id = '231'
+        session.merge(ti)
+        session.flush()
+        executor = celery_executor.CeleryExecutor()
+        executor.running = {ti.key: AsyncResult("231")}
+        executor._clear_stuck_queued_tasks()
+        session.flush()
+        ti = session.query(TaskInstance).filter(TaskInstance.task_id == ti.task_id).one()
Review comment:
   ```
   assert executor.running == {ti.key: AsyncResult("231")}
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] kaxil merged pull request #20796: Better signing instructions for helm chart releases

2022-01-12 Thread GitBox


kaxil merged pull request #20796:
URL: https://github.com/apache/airflow/pull/20796


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[airflow] branch main updated (25a5f55 -> 19fb7fd)

2022-01-12 Thread kaxilnaik
This is an automated email from the ASF dual-hosted git repository.

kaxilnaik pushed a change to branch main
in repository https://gitbox.apache.org/repos/asf/airflow.git.


from 25a5f55  Add "use_ssl" option to IMAP connection (#20441)
 add 19fb7fd  Better signing instructions for helm chart releases (#20796)

No new revisions were added by this update.

Summary of changes:
 dev/README_RELEASE_HELM_CHART.md | 71 
 1 file changed, 36 insertions(+), 35 deletions(-)


[GitHub] [airflow] Bowrna opened a new issue #20823: Static check docs mistakes in example

2022-01-12 Thread GitBox


Bowrna opened a new issue #20823:
URL: https://github.com/apache/airflow/issues/20823


   ### Describe the issue with documentation
   
   (screenshot: https://user-images.githubusercontent.com/10162465/149127114-5810f86e-83eb-40f6-b438-5b18b7026e86.png)
   Run the flake8 check for the tests.core package with verbose output:
   
   ./breeze static-check mypy -- --files tests/hooks/test_druid_hook.py
   
   The doc says it is a flake8 check for the tests.core package, but the command shown runs a mypy check on files.
   
   ### How to solve the problem
   
   It can be solved by adding the right instruction:
   ./breeze static-check flake8 -- --files tests/core/* --verbose
   
   I didn't check whether the above command works, but we have to use a similar command to the one above.
   
   ### Anything else
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of 
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] potiuk commented on issue #20823: Static check docs mistakes in example

2022-01-12 Thread GitBox


potiuk commented on issue #20823:
URL: https://github.com/apache/airflow/issues/20823#issuecomment-1010938435


   Feel free to update it :)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] potiuk commented on a change in pull request #20795: Fix remaining mypy issues in "core" Airflow

2022-01-12 Thread GitBox


potiuk commented on a change in pull request #20795:
URL: https://github.com/apache/airflow/pull/20795#discussion_r782982377



##
File path: airflow/executors/base_executor.py
##
@@ -55,7 +55,7 @@ class BaseExecutor(LoggingMixin):
     ``0`` for infinity
     """
 
-    job_id: Optional[str] = None
+    job_id: Union[None, int, str] = None

Review comment:
   Maybe:
   
   ```suggestion
   job_id: Optional[Union[int, str]] = None
   ```
   
   A bit longer, but I think conceptually more consistent with other optionals.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] wojsamjan commented on a change in pull request #20377: Dataplex operators

2022-01-12 Thread GitBox


wojsamjan commented on a change in pull request #20377:
URL: https://github.com/apache/airflow/pull/20377#discussion_r782982522



##
File path: airflow/providers/google/cloud/hooks/dataplex.py
##
@@ -0,0 +1,247 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""This module contains Google Dataplex hook."""
+import os
+from time import sleep
+from typing import Any, Dict, Optional, Sequence, Union
+
+from google.api_core.retry import exponential_sleep_generator
+from googleapiclient.discovery import Resource, build
+
+from airflow.exceptions import AirflowException
+from airflow.providers.google.common.hooks.base_google import GoogleBaseHook
+
+API_KEY = os.environ.get("GCP_API_KEY", "INVALID API KEY")

Review comment:
   Discussed within the team: for now I am going to remove API_KEY - it was only needed for development purposes. Once the Dataplex API is publicly available, it will not be needed any more. I will commit the changes and then draft this PR.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] potiuk commented on a change in pull request #20795: Fix remaining mypy issues in "core" Airflow

2022-01-12 Thread GitBox


potiuk commented on a change in pull request #20795:
URL: https://github.com/apache/airflow/pull/20795#discussion_r782987340



##
File path: airflow/models/baseoperator.py
##
@@ -1600,6 +1601,10 @@ def get_serialized_fields(cls):
 
         return cls.__serialized_fields
 
+    def serialize_for_task_group(self) -> Tuple[DagAttributeTypes, Any]:

Review comment:
   nice




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] potiuk commented on a change in pull request #20795: Fix remaining mypy issues in "core" Airflow

2022-01-12 Thread GitBox


potiuk commented on a change in pull request #20795:
URL: https://github.com/apache/airflow/pull/20795#discussion_r782988971



##
File path: airflow/providers_manager.py
##
@@ -571,10 +571,10 @@ def _add_taskflow_decorator(self, name, decorator_class_name: str, provider_pack
         if name in self._taskflow_decorators:
             try:
                 existing = self._taskflow_decorators[name]
-                other_name = f'{existing.__module__}.{existing.__name}'
+                other_name = f'{existing.__module__}.{existing.__name__}'

Review comment:
   ups.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[airflow] branch constraints-main updated: Updating constraints. Build id:1687075497

2022-01-12 Thread github-bot
This is an automated email from the ASF dual-hosted git repository.

github-bot pushed a commit to branch constraints-main
in repository https://gitbox.apache.org/repos/asf/airflow.git


The following commit(s) were added to refs/heads/constraints-main by this push:
 new 5b26c8a  Updating constraints. Build id:1687075497
5b26c8a is described below

commit 5b26c8a83953ecd5dc16cdc1aae828548e74c0ef
Author: Automated GitHub Actions commit 
AuthorDate: Wed Jan 12 11:36:51 2022 +

Updating constraints. Build id:1687075497

This update in constraints is automatically committed by the CI 'constraints-push' step based on
HEAD of 'refs/heads/main' in 'apache/airflow' with commit sha 19fb7fde4db593c2cb549ded95bec1585a4191e5.

All tests passed in this build so we determined we can push the updated constraints.

See https://github.com/apache/airflow/blob/main/README.md#installing-from-pypi for details.
---
 constraints-3.7.txt  | 6 +++---
 constraints-3.8.txt  | 6 +++---
 constraints-3.9.txt  | 6 +++---
 constraints-no-providers-3.7.txt | 2 +-
 constraints-no-providers-3.8.txt | 2 +-
 constraints-no-providers-3.9.txt | 2 +-
 constraints-source-providers-3.7.txt | 6 +++---
 constraints-source-providers-3.8.txt | 6 +++---
 constraints-source-providers-3.9.txt | 6 +++---
 9 files changed, 21 insertions(+), 21 deletions(-)

diff --git a/constraints-3.7.txt b/constraints-3.7.txt
index 18a6129..a5f4000 100644
--- a/constraints-3.7.txt
+++ b/constraints-3.7.txt
@@ -1,5 +1,5 @@
 #
-# This constraints file was automatically generated on 2022-01-12T06:43:51Z
+# This constraints file was automatically generated on 2022-01-12T11:32:17Z
 # via "eager-upgrade" mechanism of PIP. For the "main" branch of Airflow.
# This variant of constraints install uses the HEAD of the branch version for 'apache-airflow' but installs
# the providers from PIP-released packages at the moment of the constraint generation.
@@ -548,13 +548,13 @@ types-python-dateutil==2.8.6
 types-python-slugify==5.0.3
 types-pytz==2021.3.4
 types-redis==4.1.7
-types-requests==2.27.5
+types-requests==2.27.6
 types-setuptools==57.4.7
 types-six==1.16.9
 types-tabulate==0.8.5
 types-termcolor==1.1.3
 types-toml==0.10.3
-types-urllib3==1.26.4
+types-urllib3==1.26.6
 typing-extensions==3.10.0.2
 typing-inspect==0.7.1
 tzdata==2021.5
diff --git a/constraints-3.8.txt b/constraints-3.8.txt
index a11d811..c07a775 100644
--- a/constraints-3.8.txt
+++ b/constraints-3.8.txt
@@ -1,5 +1,5 @@
 #
-# This constraints file was automatically generated on 2022-01-12T06:43:49Z
+# This constraints file was automatically generated on 2022-01-12T11:32:13Z
 # via "eager-upgrade" mechanism of PIP. For the "main" branch of Airflow.
# This variant of constraints install uses the HEAD of the branch version for 'apache-airflow' but installs
# the providers from PIP-released packages at the moment of the constraint generation.
@@ -545,13 +545,13 @@ types-python-dateutil==2.8.6
 types-python-slugify==5.0.3
 types-pytz==2021.3.4
 types-redis==4.1.7
-types-requests==2.27.5
+types-requests==2.27.6
 types-setuptools==57.4.7
 types-six==1.16.9
 types-tabulate==0.8.5
 types-termcolor==1.1.3
 types-toml==0.10.3
-types-urllib3==1.26.4
+types-urllib3==1.26.6
 typing-extensions==3.10.0.2
 typing-inspect==0.7.1
 tzdata==2021.5
diff --git a/constraints-3.9.txt b/constraints-3.9.txt
index aa838c3..7cb5034 100644
--- a/constraints-3.9.txt
+++ b/constraints-3.9.txt
@@ -1,5 +1,5 @@
 #
-# This constraints file was automatically generated on 2022-01-12T06:43:48Z
+# This constraints file was automatically generated on 2022-01-12T11:32:14Z
 # via "eager-upgrade" mechanism of PIP. For the "main" branch of Airflow.
# This variant of constraints install uses the HEAD of the branch version for 'apache-airflow' but installs
# the providers from PIP-released packages at the moment of the constraint generation.
@@ -540,13 +540,13 @@ types-python-dateutil==2.8.6
 types-python-slugify==5.0.3
 types-pytz==2021.3.4
 types-redis==4.1.7
-types-requests==2.27.5
+types-requests==2.27.6
 types-setuptools==57.4.7
 types-six==1.16.9
 types-tabulate==0.8.5
 types-termcolor==1.1.3
 types-toml==0.10.3
-types-urllib3==1.26.4
+types-urllib3==1.26.6
 typing-extensions==3.10.0.2
 typing-inspect==0.7.1
 tzdata==2021.5
diff --git a/constraints-no-providers-3.7.txt b/constraints-no-providers-3.7.txt
index 6bfbca5..457b9c5 100644
--- a/constraints-no-providers-3.7.txt
+++ b/constraints-no-providers-3.7.txt
@@ -1,5 +1,5 @@
 #
-# This constraints file was automatically generated on 2022-01-12T06:47:14Z
+# This constraints file was automatically generated on 2022-01-12T11:36:42Z
 # via "eager-upgrade" mechanism of PIP. For the "main" branch of Airflow.
# This variant of constraints install just the 'bare' 'apache-airflow' package build from the HEAD of
# the branch, without installing any of the providers.
diff --git a/constraints-

[GitHub] [airflow] potiuk commented on a change in pull request #20795: Fix remaining mypy issues in "core" Airflow

2022-01-12 Thread GitBox


potiuk commented on a change in pull request #20795:
URL: https://github.com/apache/airflow/pull/20795#discussion_r782993963



##
File path: airflow/utils/log/logging_mixin.py
##
@@ -73,11 +73,23 @@ def supports_external_link(self) -> bool:
         """Return whether handler is able to support external links."""
 
 
-class StreamLogWriter(IOBase):
+class StreamLogWriter(IO[str], IOBase):
     """Allows to redirect stdout and stderr to logger"""
 
     encoding: None = None
 
+    def fileno(self) -> int:

Review comment:
   Hmmm. I had the impression this has been here already :)




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] Bowrna commented on issue #20740: Breeze: Running static checks with Breeze

2022-01-12 Thread GitBox


Bowrna commented on issue #20740:
URL: https://github.com/apache/airflow/issues/20740#issuecomment-1010955332


   @potiuk I have one more question.
   
   Why do we have to run static checks with Breeze when it's possible to run them via pre-commit? Only a few checks like flake8 and mypy require building an image, while the others can be run directly. Is there any specific reason to support static checks via Breeze commands?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] potiuk commented on a change in pull request #20795: Fix remaining mypy issues in "core" Airflow

2022-01-12 Thread GitBox


potiuk commented on a change in pull request #20795:
URL: https://github.com/apache/airflow/pull/20795#discussion_r782995769



##
File path: airflow/version.py
##
@@ -22,7 +22,7 @@
 try:
     import importlib_metadata as metadata
 except ImportError:
-    from importlib import metadata
+    from importlib import metadata  # type: ignore

Review comment:
   ```suggestion
   from importlib import metadata  # type: ignore[attr-defined]
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] potiuk commented on a change in pull request #20795: Fix remaining mypy issues in "core" Airflow

2022-01-12 Thread GitBox


potiuk commented on a change in pull request #20795:
URL: https://github.com/apache/airflow/pull/20795#discussion_r782996345



##
File path: airflow/plugins_manager.py
##
@@ -24,12 +24,12 @@
 import os
 import sys
 import types
-from typing import TYPE_CHECKING, Any, Dict, List, Optional, Type
+from typing import TYPE_CHECKING, Any, Dict, Iterable, List, Optional, Type
 
 try:
     import importlib_metadata
 except ImportError:
-    from importlib import metadata as importlib_metadata
+    from importlib import metadata as importlib_metadata  # type: ignore

Review comment:
   ```suggestion
    from importlib import metadata as importlib_metadata  # type: ignore[attr-defined]
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] potiuk commented on a change in pull request #20795: Fix remaining mypy issues in "core" Airflow

2022-01-12 Thread GitBox


potiuk commented on a change in pull request #20795:
URL: https://github.com/apache/airflow/pull/20795#discussion_r782997629



##
File path: airflow/providers_manager.py
##
@@ -571,10 +571,10 @@ def _add_taskflow_decorator(self, name, decorator_class_name: str, provider_pack
         if name in self._taskflow_decorators:
             try:
                 existing = self._taskflow_decorators[name]
-                other_name = f'{existing.__module__}.{existing.__name}'
+                other_name = f'{existing.__module__}.{existing.__name__}'
             except Exception:
                 # If problem importing, then get the value from the functools.partial
-                other_name = self._taskflow_decorators._raw_dict[name].args[0]
+                other_name = self._taskflow_decorators._raw_dict[name].args[0]  # type: ignore

Review comment:
   Not sure which error is silenced here but we started to silence specific 
error codes recently: 
https://mypy.readthedocs.io/en/stable/error_code_list.html#error-code-list
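   
   For illustration, a small self-contained example of silencing one specific error code instead of everything; the attribute here is set dynamically on purpose, so mypy reports `attr-defined` at lint time while the code runs fine:
   
   ```python
   class Flexible:
       def __init__(self) -> None:
           # mypy cannot see dynamically-set attributes, so accessing
           # "extra" below is flagged as [attr-defined] at lint time only.
           setattr(self, "extra", 42)

   f = Flexible()
   print(f.extra)  # type: ignore[attr-defined]
   ```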




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] potiuk commented on a change in pull request #20795: Fix remaining mypy issues in "core" Airflow

2022-01-12 Thread GitBox


potiuk commented on a change in pull request #20795:
URL: https://github.com/apache/airflow/pull/20795#discussion_r782999465



##
File path: airflow/utils/operator_helpers.py
##
@@ -164,7 +164,10 @@ def determine(
 
     def unpacking(self) -> Mapping[str, Any]:
         """Dump the kwargs mapping to unpack with ``**`` in a function call."""
-        if self._wildcard and isinstance(self._kwargs, Context):
+        # Context is a TypedDict at lint time, and Mypy would complain it cannot
+        # be used in isinstance. But the call works at runtime since the type is
+        # actually implemented as a custom mapping, so we ignore the Mypy error.
+        if self._wildcard and isinstance(self._kwargs, Context):  # type: ignore

Review comment:
   Might be good also to review other type-ignores to see if we can silence 
a specific error (watch-out for `misc`)




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] github-actions[bot] commented on pull request #20795: Fix remaining mypy issues in "core" Airflow

2022-01-12 Thread GitBox


github-actions[bot] commented on pull request #20795:
URL: https://github.com/apache/airflow/pull/20795#issuecomment-1010959908


   The PR most likely needs to run full matrix of tests because it modifies 
parts of the core of Airflow. However, committers might decide to merge it 
quickly and take the risk. If they don't merge it quickly - please rebase it to 
the latest main at your convenience, or amend the last commit of the PR, and 
push it with --force-with-lease.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] ashb commented on a change in pull request #20795: Fix remaining mypy issues in "core" Airflow

2022-01-12 Thread GitBox


ashb commented on a change in pull request #20795:
URL: https://github.com/apache/airflow/pull/20795#discussion_r783014315



##
File path: airflow/utils/log/logging_mixin.py
##
@@ -73,11 +73,23 @@ def supports_external_link(self) -> bool:
 """Return whether handler is able to support external links."""
 
 
-class StreamLogWriter(IOBase):
+class StreamLogWriter(IO[str], IOBase):
 """Allows to redirect stdout and stderr to logger"""
 
 encoding: None = None
 
+def fileno(self) -> int:

Review comment:
   I thought so too but it complained without it -- this was mostly around 
the use in StdoutRedirector which wants this to be `IO[str]`
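   
   Roughly the shape of the class in question, as a standalone sketch (the names here are made up; only the `IO[str]` + `IOBase` combination and the loud `fileno` mirror the PR):
   ```python
   from io import IOBase
   from typing import IO


   class FakeLogStream(IO[str], IOBase):
       """Declares itself IO[str] for type checkers (so redirector-style
       call sites accept it), while IOBase supplies the runtime plumbing."""

       def fileno(self) -> int:
           # No real file descriptor backs this stream, so anything that
           # asks for one should fail loudly rather than silently.
           raise OSError("this stream has no file descriptor")

       def write(self, message: str) -> int:
           # A real implementation would forward to a logger instead.
           return len(message)
   ```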




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] potiuk commented on issue #20740: Breeze: Running static checks with Breeze

2022-01-12 Thread GitBox


potiuk commented on issue #20740:
URL: https://github.com/apache/airflow/issues/20740#issuecomment-1010978696


   > Why do we have to run static checks with Breeze when it's possible to run via pre-commit. Only a few checks like flake8, mypy requires to build an image while others can be run. Is there any specific reason to support static-checks via Breeze commands?
   
   Actually the only reason is auto-complete. Even the few "image" pre-commits can be run with pre-commit (they will just fail in case the image is not built). But I found it really problematic that someone who wants to run a specific pre-commit has to look it up elsewhere (pre-commit has no support for auto-complete, and apparently the author of pre-commit is rather "cold" when it comes to supporting it).
   
   The pre-commit autocomplete issue is here https://github.com/pre-commit/pre-commit/issues/1119 (you can see my comments there) - 2.5 years on, and the response from @asottile was rather "cold" towards it (and he refused the yaml-parsing and click ideas). While I love pre-commit in general, this is the only feature I actually miss. In our case, with our 100 (just counted) pre-commits, where we continue to add a few more every month, the only way we can do it is by parsing the yaml. And since we are already using click and autocomplete in the new Breeze, I think this is the easiest way.
   
   > @potiuk Do you think we have to parse the .pre-commit.yml file on the fly and enable auto-complete for those checks rather than adding click options ?
   
   We can do something else. As @asottile mentioned in the comment - parsing the yaml on the fly might be slow. So one other option could be to use (yes!) pre-commit itself to generate the list of pre-commits. We already do that in https://github.com/apache/airflow/blob/main/scripts/ci/pre_commit/pre_commit_check_pre_commit_hook_names.py where we check that all names of the pre-commits are present in `./breeze-complete` and documented in the docs. We could actually modify that one to generate a python file with the list of pre-commits that we could import in Breeze (not only check it):
   
   Something like that: 
   ```
   PRE_COMMIT_LIST = [
       'identity',
       'check-hooks-apply',
       ...
   ]
   ```
   
   then we could import it from that file and add the entries as valid click options. That would likely be the best way. 
   
   Additionally (not a must, but should be easy), the whole index of pre-commits in our docs could be auto-generated. Currently we just check that the pre-commits are there, but instead (as we do with a few other pieces of auto-generated documentation) we could hard-code the names of the checks that need the breeze image, read the names and descriptions from the yaml file, and generate the table automatically: https://github.com/apache/airflow/blob/main/STATIC_CODE_CHECKS.rst#available-pre-commit-checks
   
   UPDATE: https://github.com/apache/airflow/blob/main/scripts/ci/pre_commit/supported_versions.py - this is an example table of supported versions that we generate in pre-commit, so we should do it in a very similar fashion.
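   
   A minimal sketch of what such a generator could look like (it assumes the standard `.pre-commit-config.yaml` layout of repos -> hooks -> id; the output path and constant name are illustrative, not real repo paths):
   ```python
   #!/usr/bin/env python3
   """Sketch: turn .pre-commit-config.yaml into an importable Python list."""
   from pathlib import Path

   import yaml  # PyYAML

   CONFIG = Path(".pre-commit-config.yaml")
   # Hypothetical destination - the real location would live under Breeze's sources.
   OUTPUT = Path("dev/breeze/src/airflow_breeze/pre_commit_ids.py")

   config = yaml.safe_load(CONFIG.read_text())
   # Collect every hook id declared in the config, sorted for stable output.
   hook_ids = sorted(hook["id"] for repo in config["repos"] for hook in repo["hooks"])

   OUTPUT.parent.mkdir(parents=True, exist_ok=True)
   OUTPUT.write_text(
       "# Generated file - do not edit by hand.\n"
       f"PRE_COMMIT_LIST = {hook_ids!r}\n"
   )
   ```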
 
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] ashb commented on a change in pull request #20795: Fix remaining mypy issues in "core" Airflow

2022-01-12 Thread GitBox


ashb commented on a change in pull request #20795:
URL: https://github.com/apache/airflow/pull/20795#discussion_r783021187



##
File path: airflow/providers_manager.py
##
@@ -571,10 +571,10 @@ def _add_taskflow_decorator(self, name, 
decorator_class_name: str, provider_pack
 if name in self._taskflow_decorators:
 try:
 existing = self._taskflow_decorators[name]
-other_name = f'{existing.__module__}.{existing.__name}'
+other_name = f'{existing.__module__}.{existing.__name__}'
 except Exception:
 # If problem importing, then get the value from the functools.partial
-other_name = self._taskflow_decorators._raw_dict[name].args[0]
+other_name = self._taskflow_decorators._raw_dict[name].args[0]  # type: ignore

Review comment:
   I'll add `show_error_codes=True` into the config in this PR too
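   
   For reference, that flag lives in the mypy configuration and looks something like this (the exact config file and section in the repo may differ):
   ```
   [mypy]
   show_error_codes = True
   ```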




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] potiuk commented on a change in pull request #20795: Fix remaining mypy issues in "core" Airflow

2022-01-12 Thread GitBox


potiuk commented on a change in pull request #20795:
URL: https://github.com/apache/airflow/pull/20795#discussion_r783022283



##
File path: airflow/providers_manager.py
##
@@ -571,10 +571,10 @@ def _add_taskflow_decorator(self, name, 
decorator_class_name: str, provider_pack
 if name in self._taskflow_decorators:
 try:
 existing = self._taskflow_decorators[name]
-other_name = f'{existing.__module__}.{existing.__name}'
+other_name = f'{existing.__module__}.{existing.__name__}'
 except Exception:
 # If problem importing, then get the value from the functools.partial
-other_name = self._taskflow_decorators._raw_dict[name].args[0]
+other_name = self._taskflow_decorators._raw_dict[name].args[0]  # type: ignore

Review comment:
   Ah YEAH!




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] ashb commented on a change in pull request #20795: Fix remaining mypy issues in "core" Airflow

2022-01-12 Thread GitBox


ashb commented on a change in pull request #20795:
URL: https://github.com/apache/airflow/pull/20795#discussion_r783023128



##
File path: airflow/version.py
##
@@ -22,7 +22,7 @@
 try:
 import importlib_metadata as metadata
 except ImportError:
-from importlib import metadata
+from importlib import metadata  # type: ignore

Review comment:
   You mean `no-redef` :) 
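   
   i.e. presumably the scoped version of the ignore would read:
   ```python
   try:
       import importlib_metadata as metadata
   except ImportError:
       from importlib import metadata  # type: ignore[no-redef]
   ```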




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] ashb commented on a change in pull request #20795: Fix remaining mypy issues in "core" Airflow

2022-01-12 Thread GitBox


ashb commented on a change in pull request #20795:
URL: https://github.com/apache/airflow/pull/20795#discussion_r783028253



##
File path: airflow/utils/operator_helpers.py
##
@@ -164,7 +164,10 @@ def determine(
 
 def unpacking(self) -> Mapping[str, Any]:
 """Dump the kwargs mapping to unpack with ``**`` in a function call."""
-if self._wildcard and isinstance(self._kwargs, Context):
+# Context is a TypedDict at lint time, and Mypy would complain it cannot
+# be used in isinstance. But the call works at runtime since the type is
+# actually implemented as a custom mapping, so we ignore the Mypy error.
+if self._wildcard and isinstance(self._kwargs, Context):  # type: ignore

Review comment:
   Have gone through all the ones added in this PR and converted them to 
specific codes where possible.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] itayB commented on pull request #20801: changing execution_date to run_id

2022-01-12 Thread GitBox


itayB commented on pull request #20801:
URL: https://github.com/apache/airflow/pull/20801#issuecomment-1010995914


   I'm just not sure if it's a bug or a misconfiguration.
   I'm working with remote logging and the Kubernetes executor.
   Airflow webserver's log page is still sending log requests with 
`execution_date` rather than `run_id`. Inspection via the browser shows:
   ```
   
https://airflow-audience-webserver.use1.dynamicyield.com/get_logs_with_metadata?dag_id=sparkjobs-8764248&task_id=daily&execution_date=2022-01-11T10:50:00+00:00&try_number=1&metadata={"end_of_log":false,"last_log_timestamp":"2022-01-12T11:48:42.713060+00:00","offset":"0"}
   ```
   and in the web-server side:
   ```
   10.1.113.221 - - [12/Jan/2022:12:05:06 +] "GET 
/get_logs_with_metadata?dag_id=sparkjobs-8764248&task_id=daily&execution_date=2022-01-11T10%3A50%3A00%2B00%3A00&try_number=1&metadata=%7B%22end_of_log%22%3Afalse%2C%22last_log_timestamp%22%3A%222022-01-12T12%3A05%3A03.678921%2B00%3A00%22%2C%22offset%22%3A%220%22%7D
 HTTP/1.1" 200 116 
"https://airflow-audience-webserver.dev-use1.dynamicyield.com/log?dag_id=sparkjobs-8764248&task_id=daily&execution_date=2022-01-11T10%3A50%3A00%2B00%3A00";
 "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, 
like Gecko) Chrome/96.0.4664.110 Safari/537.36"
   ```
   
   On the other hand, the `execution_date` info was dropped from (Kubernetes) 
executors - it was set as a label on the pod before v2.2.x; a `run_id` label 
was added instead.
   
   - Is there a bug in the web client that doesn't include/send `run_id` in the 
log requests?
   - Is there a configuration change to make it send the new `run_id`?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] ashb commented on a change in pull request #20795: Fix remaining mypy issues in "core" Airflow

2022-01-12 Thread GitBox


ashb commented on a change in pull request #20795:
URL: https://github.com/apache/airflow/pull/20795#discussion_r783029207



##
File path: airflow/utils/operator_helpers.py
##
@@ -164,10 +164,7 @@ def determine(
 
 def unpacking(self) -> Mapping[str, Any]:
 """Dump the kwargs mapping to unpack with ``**`` in a function call."""
-# Context is a TypedDict at lint time, and Mypy would complain it cannot
-# be used in isinstance. But the call works at runtime since the type is
-# actually implemented as a custom mapping, so we ignore the Mypy error.
-if self._wildcard and isinstance(self._kwargs, Context):  # type: ignore
+if self._wildcard and isinstance(self._kwargs, Context):

Review comment:
   Despite mypy complaining about this earlier _in this PR_, it's not complaining 
now - at least not for me when I run `pre-commit run -a mypy` locally.
   
   I want to see what number CI gets -- it should be 102 errors left for the 
first mypy job.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] mobuchowski commented on a change in pull request #20443: Add Listener Plugin API that tracks TaskInstance state changes

2022-01-12 Thread GitBox


mobuchowski commented on a change in pull request #20443:
URL: https://github.com/apache/airflow/pull/20443#discussion_r783036364



##
File path: tests/listeners/test_listeners.py
##
@@ -0,0 +1,108 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import pytest as pytest
+
+from airflow import AirflowException
+from airflow.listeners import hookimpl
+from airflow.listeners.events import register_task_instance_state_events
+from airflow.listeners.listener import get_listener_manager
+from airflow.operators.bash import BashOperator
+from airflow.utils import timezone
+from airflow.utils.session import provide_session
+from airflow.utils.state import State
+from tests.listeners import test_full_listener, test_partial_listener, test_throwing_listener
+
+DAG_ID = "test_listener_dag"
+TASK_ID = "test_listener_task"
+EXECUTION_DATE = timezone.utcnow()
+
+
+@pytest.fixture(scope="module", autouse=True)
+def register_events():
+register_task_instance_state_events()
+
+
+@pytest.fixture(autouse=True)
+def clean_listener_manager():
+lm = get_listener_manager()
+lm.clear()
+yield
+lm = get_listener_manager()
+lm.clear()
+test_full_listener.state = []
+
+
+@provide_session
+def test_listener_gets_calls(create_task_instance, session=None):
+lm = get_listener_manager()
+lm.add_listener(test_full_listener)
+
+ti = create_task_instance(session=session, state=State.QUEUED)
+ti.run()
+
+assert len(test_full_listener.state) == 2
+assert test_full_listener.state == [State.RUNNING, State.SUCCESS]
+
+
+@provide_session
+def test_listener_gets_only_subscribed_calls(create_task_instance, session=None):
+lm = get_listener_manager()
+lm.add_listener(test_partial_listener)
+
+ti = create_task_instance(session=session, state=State.QUEUED)
+ti.run()
+
+assert len(test_partial_listener.state) == 1
+assert test_partial_listener.state == [State.RUNNING]
+
+
+@provide_session
+def test_listener_throws_exceptions(create_task_instance, session=None):
+lm = get_listener_manager()
+lm.add_listener(test_throwing_listener)
+
+with pytest.raises(RuntimeError):
+ti = create_task_instance(session=session, state=State.QUEUED)
+ti._run_raw_task()
+
+
+@provide_session
+def test_listener_captures_failed_taskinstances(create_task_instance_of_operator, session=None):
+lm = get_listener_manager()
+lm.add_listener(test_full_listener)
+
+with pytest.raises(AirflowException):
+ti = create_task_instance_of_operator(
+BashOperator, dag_id=DAG_ID, execution_date=EXECUTION_DATE, task_id=TASK_ID, bash_command="exit 1"
+)
+ti._run_raw_task()

Review comment:
   Fixed.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] mobuchowski commented on a change in pull request #20443: Add Listener Plugin API that tracks TaskInstance state changes

2022-01-12 Thread GitBox


mobuchowski commented on a change in pull request #20443:
URL: https://github.com/apache/airflow/pull/20443#discussion_r783036547



##
File path: tests/listeners/test_listeners.py
##
@@ -0,0 +1,108 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import pytest as pytest
+
+from airflow import AirflowException
+from airflow.listeners import hookimpl
+from airflow.listeners.events import register_task_instance_state_events
+from airflow.listeners.listener import get_listener_manager
+from airflow.operators.bash import BashOperator
+from airflow.utils import timezone
+from airflow.utils.session import provide_session
+from airflow.utils.state import State
+from tests.listeners import test_full_listener, test_partial_listener, test_throwing_listener
+
+DAG_ID = "test_listener_dag"
+TASK_ID = "test_listener_task"
+EXECUTION_DATE = timezone.utcnow()
+
+
+@pytest.fixture(scope="module", autouse=True)
+def register_events():
+register_task_instance_state_events()
+
+
+@pytest.fixture(autouse=True)
+def clean_listener_manager():
+lm = get_listener_manager()
+lm.clear()
+yield
+lm = get_listener_manager()
+lm.clear()
+test_full_listener.state = []
+
+
+@provide_session
+def test_listener_gets_calls(create_task_instance, session=None):
+lm = get_listener_manager()
+lm.add_listener(test_full_listener)
+
+ti = create_task_instance(session=session, state=State.QUEUED)
+ti.run()

Review comment:
   Added comment:
   
   ```
   # Using ti.run() instead of ti._run_raw_task() to capture state change to RUNNING
   # that only happens on `check_and_change_state_before_execution()` that is called before
   # `run()` calls `_run_raw_task()`
   ```

##
File path: airflow/plugins_manager.py
##
@@ -458,6 +463,20 @@ def integrate_macros_plugins() -> None:
 setattr(macros, plugin.name, macros_module)
 
 
+def integrate_listener_plugins(listener_manager: "ListenerManager") -> None:
+global plugins
+global macros_modules

Review comment:
   Removed.

##
File path: docs/apache-airflow/listeners.rst
##
@@ -0,0 +1,39 @@
+ .. Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+ ..   http://www.apache.org/licenses/LICENSE-2.0
+
+ .. Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+
+Listeners
+=
+
+Airflow gives you an option to be notified of events happening in Airflow
+by writing listeners. Listeners are powered by `pluggy 
`__
+
+Listener API is meant to be called across all dags, and all operators - in 
contrast to methods like
+``on_success_callback``, ``pre_execute`` and related family which are meant to 
provide callbacks
+for particular dag authors, or operator creators. There is no possibility to 
listen on events generated
+by particular dag.
+
+To include listener in your Airflow installation, include it as a part of an 
:doc:`Airflow Plugin `

Review comment:
   Done.

##
File path: tests/listeners/test_empty_listener.py
##
@@ -0,0 +1,24 @@
+#

Review comment:
   Done.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] mobuchowski commented on a change in pull request #20443: Add Listener Plugin API that tracks TaskInstance state changes

2022-01-12 Thread GitBox


mobuchowski commented on a change in pull request #20443:
URL: https://github.com/apache/airflow/pull/20443#discussion_r783037348



##
File path: tests/listeners/test_listeners.py
##
@@ -0,0 +1,108 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import pytest as pytest
+
+from airflow import AirflowException
+from airflow.listeners import hookimpl
+from airflow.listeners.events import register_task_instance_state_events
+from airflow.listeners.listener import get_listener_manager
+from airflow.operators.bash import BashOperator
+from airflow.utils import timezone
+from airflow.utils.session import provide_session
+from airflow.utils.state import State
+from tests.listeners import test_full_listener, test_partial_listener, test_throwing_listener
+
+DAG_ID = "test_listener_dag"
+TASK_ID = "test_listener_task"
+EXECUTION_DATE = timezone.utcnow()
+
+
+@pytest.fixture(scope="module", autouse=True)
+def register_events():
+register_task_instance_state_events()

Review comment:
   Added `unregister_task_instance_state_events` function - please take a 
look at `airflow.listeners.events` now.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] mobuchowski commented on a change in pull request #20443: Add Listener Plugin API that tracks TaskInstance state changes

2022-01-12 Thread GitBox


mobuchowski commented on a change in pull request #20443:
URL: https://github.com/apache/airflow/pull/20443#discussion_r783037510



##
File path: tests/listeners/test_listeners.py
##
@@ -0,0 +1,108 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import pytest as pytest
+
+from airflow import AirflowException
+from airflow.listeners import hookimpl
+from airflow.listeners.events import register_task_instance_state_events
+from airflow.listeners.listener import get_listener_manager
+from airflow.operators.bash import BashOperator
+from airflow.utils import timezone
+from airflow.utils.session import provide_session
+from airflow.utils.state import State
+from tests.listeners import test_full_listener, test_partial_listener, test_throwing_listener
+
+DAG_ID = "test_listener_dag"
+TASK_ID = "test_listener_task"
+EXECUTION_DATE = timezone.utcnow()
+
+
+@pytest.fixture(scope="module", autouse=True)
+def register_events():
+register_task_instance_state_events()
+
+
+@pytest.fixture(autouse=True)
+def clean_listener_manager():
+lm = get_listener_manager()
+lm.clear()
+yield
+lm = get_listener_manager()
+lm.clear()
+test_full_listener.state = []
+
+
+@provide_session
+def test_listener_gets_calls(create_task_instance, session=None):
+lm = get_listener_manager()
+lm.add_listener(test_full_listener)
+
+ti = create_task_instance(session=session, state=State.QUEUED)
+ti.run()
+
+assert len(test_full_listener.state) == 2
+assert test_full_listener.state == [State.RUNNING, State.SUCCESS]
+
+
+@provide_session
+def test_listener_gets_only_subscribed_calls(create_task_instance, session=None):
+lm = get_listener_manager()
+lm.add_listener(test_partial_listener)
+
+ti = create_task_instance(session=session, state=State.QUEUED)
+ti.run()
+
+assert len(test_partial_listener.state) == 1
+assert test_partial_listener.state == [State.RUNNING]
+
+
+@provide_session
+def test_listener_throws_exceptions(create_task_instance, session=None):
+lm = get_listener_manager()
+lm.add_listener(test_throwing_listener)
+
+with pytest.raises(RuntimeError):
+ti = create_task_instance(session=session, state=State.QUEUED)
+ti._run_raw_task()
+
+
+@provide_session
+def test_listener_captures_failed_taskinstances(create_task_instance_of_operator, session=None):
+lm = get_listener_manager()
+lm.add_listener(test_full_listener)
+
+with pytest.raises(AirflowException):
+ti = create_task_instance_of_operator(
+BashOperator, dag_id=DAG_ID, execution_date=EXECUTION_DATE, task_id=TASK_ID, bash_command="exit 1"
+)
+ti._run_raw_task()
+
+assert test_full_listener.state == [State.FAILED]
+assert len(test_full_listener.state) == 1
+
+
+def test_non_module_listener_is_not_registered():
+class NotAListener:
+@hookimpl
+def on_task_instance_running(self, previous_state, task_instance, session):
+pass
+
+lm = get_listener_manager()
+lm.add_listener(NotAListener())

Review comment:
   Yes. Fixed.
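   
   For readers following the thread, a minimal module-style listener consistent with the hook signature quoted above (the success hook name mirrors the states observed in the tests and is an assumption here):
   ```python
   # my_listener.py - listeners are registered as modules, which is exactly
   # what the NotAListener test above verifies.
   from airflow.listeners import hookimpl


   @hookimpl
   def on_task_instance_running(previous_state, task_instance, session):
       print(f"{task_instance.task_id}: {previous_state} -> RUNNING")


   @hookimpl
   def on_task_instance_success(previous_state, task_instance, session):
       print(f"{task_instance.task_id}: {previous_state} -> SUCCESS")

   # Registration, mirroring the tests above:
   #     from airflow.listeners.listener import get_listener_manager
   #     get_listener_manager().add_listener(my_listener)
   ```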




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] boring-cyborg[bot] commented on issue #20824: Apache Airflow environment is unable to load UI for the code.

2022-01-12 Thread GitBox


boring-cyborg[bot] commented on issue #20824:
URL: https://github.com/apache/airflow/issues/20824#issuecomment-1011019631


   Thanks for opening your first issue here! Be sure to follow the issue 
template!
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] geetakew opened a new issue #20824: Apache Airflow environment is unable to load UI for the code.

2022-01-12 Thread GitBox


geetakew opened a new issue #20824:
URL: https://github.com/apache/airflow/issues/20824


   ### Apache Airflow version
   
   2.0.2
   
   ### What happened
   
   I created an Airflow environment on AWS by following the steps from 
https://towardsdatascience.com/set-up-an-airflow-environment-on-aws-in-minutes-f934cf10ec54.
 
   
   After the environment became available, I clicked on the Airflow UI, but I am 
getting the message below:
   
   Something bad has happened.
   Please consider letting us know by creating a bug report using GitHub.
   
   Python version: 3.7.10
   Airflow version: 2.0.2
   Node: redact
   
---
   Error! Please contact server admin.
   
   ### What you expected to happen
   
   _No response_
   
   ### How to reproduce
   
   _No response_
   
   ### Operating System
   
   AWS Airflow
   
   ### Versions of Apache Airflow Providers
   
   _No response_
   
   ### Deployment
   
   MWAA
   
   ### Deployment details
   
   _No response_
   
   ### Anything else
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of 
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] potiuk commented on a change in pull request #20795: Fix remaining mypy issues in "core" Airflow

2022-01-12 Thread GitBox


potiuk commented on a change in pull request #20795:
URL: https://github.com/apache/airflow/pull/20795#discussion_r783059967



##
File path: airflow/utils/operator_helpers.py
##
@@ -164,10 +164,7 @@ def determine(
 
 def unpacking(self) -> Mapping[str, Any]:
 """Dump the kwargs mapping to unpack with ``**`` in a function call."""
-# Context is a TypedDict at lint time, and Mypy would complain it cannot
-# be used in isinstance. But the call works at runtime since the type is
-# actually implemented as a custom mapping, so we ignore the Mypy error.
-if self._wildcard and isinstance(self._kwargs, Context):  # type: ignore
+if self._wildcard and isinstance(self._kwargs, Context):

Review comment:
   ```
   pre-commit run mypy --all-files 2>&1 | sed -r "s/\x1B\[([0-9]{1,2}(;[0-9]{1,2})?)?[mGK]//g" | grep "error:" | wc
   
   105 2928934
   ```
   
   BTW. The numbers might not add up. It's quite possible that fixing a mypy 
error in the "selective" check will generate more errors elsewhere (in providers, 
for example).




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] github-actions[bot] commented on pull request #20822: Add ElasticSearch log_id_template to tracking model

2022-01-12 Thread GitBox


github-actions[bot] commented on pull request #20822:
URL: https://github.com/apache/airflow/pull/20822#issuecomment-1011043081


   The PR most likely needs to run full matrix of tests because it modifies 
parts of the core of Airflow. However, committers might decide to merge it 
quickly and take the risk. If they don't merge it quickly - please rebase it to 
the latest main at your convenience, or amend the last commit of the PR, and 
push it with --force-with-lease.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] itayB edited a comment on pull request #20801: changing execution_date to run_id

2022-01-12 Thread GitBox


itayB edited a comment on pull request #20801:
URL: https://github.com/apache/airflow/pull/20801#issuecomment-1010995914


   I'm just not sure if it's a bug or misconfiguration.
   I'm working with remote logging and Kubernetes executor.
   Airflow webserver's log page is still sending log 
[requests](https://github.com/apache/airflow/blob/main/airflow/www/views.py#L1292)
 with `execution_date` rather than `run_id`. Inspection via browser shows:
   ```
   
https://airflow-audience-webserver.use1.dynamicyield.com/get_logs_with_metadata?dag_id=sparkjobs-8764248&task_id=daily&execution_date=2022-01-11T10:50:00+00:00&try_number=1&metadata={"end_of_log":false,"last_log_timestamp":"2022-01-12T11:48:42.713060+00:00","offset":"0"}
   ```
   and in the web-server side:
   ```
   10.1.113.221 - - [12/Jan/2022:12:05:06 +] "GET 
/get_logs_with_metadata?dag_id=sparkjobs-8764248&task_id=daily&execution_date=2022-01-11T10%3A50%3A00%2B00%3A00&try_number=1&metadata=%7B%22end_of_log%22%3Afalse%2C%22last_log_timestamp%22%3A%222022-01-12T12%3A05%3A03.678921%2B00%3A00%22%2C%22offset%22%3A%220%22%7D
 HTTP/1.1" 200 116 
"https://airflow-audience-webserver.dev-use1.dynamicyield.com/log?dag_id=sparkjobs-8764248&task_id=daily&execution_date=2022-01-11T10%3A50%3A00%2B00%3A00";
 "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, 
like Gecko) Chrome/96.0.4664.110 Safari/537.36"
   ```
   
   On the other hand, `execution_date` info was dropped from (Kubernetes) 
executors - it was marked as a label in the pod before v2.2.x. `run_id` label 
was added instead.
   
   - Is there a bug in the web client that doesn't contain/send `run_id` in the 
log requests?
   - Is there a configuration change to make it send the new `run_id`?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] itayB edited a comment on pull request #20801: changing execution_date to run_id

2022-01-12 Thread GitBox


itayB edited a comment on pull request #20801:
URL: https://github.com/apache/airflow/pull/20801#issuecomment-1010995914


   I'm just not sure if it's a bug or misconfiguration.
   I'm working with remote logging and Kubernetes executor.
   Airflow webserver's log page is still sending log 
[requests](https://github.com/apache/airflow/blob/main/airflow/www/views.py#L1292)
 with `execution_date` rather than `run_id`. Inspection via browser shows:
   ```
   
https://airflow-audience-webserver.use1.dynamicyield.com/get_logs_with_metadata?dag_id=sparkjobs-8764248&task_id=daily&execution_date=2022-01-11T10:50:00+00:00&try_number=1&metadata={"end_of_log":false,"last_log_timestamp":"2022-01-12T11:48:42.713060+00:00","offset":"0"}
   ```
   and in the web-server side:
   ```
   10.1.113.221 - - [12/Jan/2022:12:05:06 +] "GET 
/get_logs_with_metadata?dag_id=sparkjobs-8764248&task_id=daily&execution_date=2022-01-11T10%3A50%3A00%2B00%3A00&try_number=1&metadata=%7B%22end_of_log%22%3Afalse%2C%22last_log_timestamp%22%3A%222022-01-12T12%3A05%3A03.678921%2B00%3A00%22%2C%22offset%22%3A%220%22%7D
 HTTP/1.1" 200 116 
"https://airflow-audience-webserver.dev-use1.dynamicyield.com/log?dag_id=sparkjobs-8764248&task_id=daily&execution_date=2022-01-11T10%3A50%3A00%2B00%3A00";
 "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, 
like Gecko) Chrome/96.0.4664.110 Safari/537.36"
   ```
   
   On the other hand, `execution_date` info was dropped from (Kubernetes) 
executors - it was marked as a label in the pod before v2.2.x. `run_id` label 
was added instead.
   Bottom line, there's no way to retrieve remote logs to Airflow's UI
   
   - Is there a bug in the web client that doesn't contain/send `run_id` in the 
log requests?
   - Is there a configuration change to make it send the new `run_id`?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] ashb opened a new pull request #20826: Move tests out of test core that are duplicated/should live elsewhere

2022-01-12 Thread GitBox


ashb opened a new pull request #20826:
URL: https://github.com/apache/airflow/pull/20826


   - Testing BashOperator lives in tests.operators.test_bash
   - CheckOperator and ValueCheckOperator already tested in
 tests.operators.test_sql (using the non-deprecated names)
   - on_failure_callback already tested in test_local_task_job
   - SqliteOperator already tested in the sqlite provider
   - PythonOperator already extensively tested in
 tests.operators.test_python
   - trigger_rule tested in test_baseoperator
   - Testing the task context was partially covered already.
   
   
   
   ---
   **^ Add meaningful description above**
   
   Read the **[Pull Request 
Guidelines](https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst#pull-request-guidelines)**
 for more information.
   In case of fundamental code change, Airflow Improvement Proposal 
([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals))
 is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party 
License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in 
[UPDATING.md](https://github.com/apache/airflow/blob/main/UPDATING.md).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] potiuk opened a new pull request #20827: Add explanation to stub files introduced to handle errors in examples

2022-01-12 Thread GitBox


potiuk opened a new pull request #20827:
URL: https://github.com/apache/airflow/pull/20827


   
   
   ---
   **^ Add meaningful description above**
   
   Read the **[Pull Request 
Guidelines](https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst#pull-request-guidelines)**
 for more information.
   In case of fundamental code change, Airflow Improvement Proposal 
([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals))
 is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party 
License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in 
[UPDATING.md](https://github.com/apache/airflow/blob/main/UPDATING.md).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] potiuk commented on a change in pull request #20826: Move tests out of test core that are duplicated/should live elsewhere

2022-01-12 Thread GitBox


potiuk commented on a change in pull request #20826:
URL: https://github.com/apache/airflow/pull/20826#discussion_r783083652



##
File path: tests/models/test_taskinstance.py
##
@@ -1616,7 +1616,34 @@ def test_template_with_json_variable_missing(self, 
create_task_instance):
 with pytest.raises(KeyError):
 ti.task.render_template('{{ var.json.get("missing_variable") }}', 
context)
 
-def test_tempalte_with_custom_timetable_deprecated_context(self, 
create_task_instance):
+@pytest.mark.parametrize(
+("field", "expected"),
+[
+("next_ds", "2016-01-01"),
+("next_ds_nodash", "20160101"),
+("prev_ds", "2015-12-31"),
+("prev_ds_nodash", "20151231"),
+("yesterday_ds", "2015-12-31"),
+("yesterday_ds_nodash", "20151231"),
+("tomorrow_ds", "2016-01-02"),
+("tomorrow_ds_nodash", "20160102"),
+],
+)
+def test_deprecated_context(self, field, expected, create_task_instance):
+ti = create_task_instance(execution_date=DEFAULT_DATE)
+context = ti.get_template_context()
+with pytest.deprecated_call() as recorder:
+assert context[field] == expected
+message_beginning = (
+f"Accessing {field!r} from the template is deprecated and "
+f"will be removed in a future version."
+)
+
+recorded_message = [str(m.message) for m in recorder]
+assert len(recorded_message) == 1
+assert recorded_message[0].startswith(message_beginning)
+
+def test_templte_with_custom_timetable_deprecated_context(self, 
create_task_instance):

Review comment:
   ```suggestion
   def test_template_with_custom_timetable_deprecated_context(self, 
create_task_instance):
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] ashb commented on a change in pull request #20826: Move tests out of test core that are duplicated/should live elsewhere

2022-01-12 Thread GitBox


ashb commented on a change in pull request #20826:
URL: https://github.com/apache/airflow/pull/20826#discussion_r783084638



##
File path: tests/models/test_taskinstance.py
##
@@ -1616,7 +1616,34 @@ def test_template_with_json_variable_missing(self, 
create_task_instance):
 with pytest.raises(KeyError):
 ti.task.render_template('{{ var.json.get("missing_variable") }}', 
context)
 
-def test_tempalte_with_custom_timetable_deprecated_context(self, 
create_task_instance):
+@pytest.mark.parametrize(
+("field", "expected"),
+[
+("next_ds", "2016-01-01"),
+("next_ds_nodash", "20160101"),
+("prev_ds", "2015-12-31"),
+("prev_ds_nodash", "20151231"),
+("yesterday_ds", "2015-12-31"),
+("yesterday_ds_nodash", "20151231"),
+("tomorrow_ds", "2016-01-02"),
+("tomorrow_ds_nodash", "20160102"),
+],
+)
+def test_deprecated_context(self, field, expected, create_task_instance):
+ti = create_task_instance(execution_date=DEFAULT_DATE)
+context = ti.get_template_context()
+with pytest.deprecated_call() as recorder:
+assert context[field] == expected
+message_beginning = (
+f"Accessing {field!r} from the template is deprecated and "
+f"will be removed in a future version."
+)
+
+recorded_message = [str(m.message) for m in recorder]
+assert len(recorded_message) == 1
+assert recorded_message[0].startswith(message_beginning)
+
+def test_templte_with_custom_timetable_deprecated_context(self, 
create_task_instance):

Review comment:
   Whoops, typo'd the typo fix 😁 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] ashb commented on a change in pull request #20795: Fix remaining mypy issues in "core" Airflow

2022-01-12 Thread GitBox


ashb commented on a change in pull request #20795:
URL: https://github.com/apache/airflow/pull/20795#discussion_r783085692



##
File path: airflow/utils/operator_helpers.py
##
@@ -164,10 +164,7 @@ def determine(
 
 def unpacking(self) -> Mapping[str, Any]:
 """Dump the kwargs mapping to unpack with ``**`` in a function call."""
-# Context is a TypedDict at lint time, and Mypy would complain it 
cannot
-# be used in isinstance. But the call works at runtime since the type 
is
-# actually implemented as a custom mapping, so we ignore the Mypy 
error.
-if self._wildcard and isinstance(self._kwargs, Context):  # type: 
ignore
+if self._wildcard and isinstance(self._kwargs, Context):

Review comment:
   Confirmed it's fine now. I have _no_ idea what was going on before but 
this was def a problem.
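   
   For reference, the general behaviour the removed comment described can be 
sketched outside Airflow with a plain `typing.TypedDict` (`Point` is an 
illustrative name, unrelated to Airflow's `Context`):
   
   ```python
   from typing import TypedDict
   
   class Point(TypedDict):
       x: int
       y: int
   
   p = {"x": 1, "y": 2}
   
   # Mypy flags the next line, and at runtime it raises
   # "TypeError: TypedDict does not support instance and class checks":
   # isinstance(p, Point)
   #
   # Airflow's Context avoids this only because, at runtime, it is
   # implemented as a real mapping class rather than a TypedDict.
   ```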




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] github-actions[bot] commented on pull request #20826: Move tests out of test core that are duplicated/should live elsewhere

2022-01-12 Thread GitBox


github-actions[bot] commented on pull request #20826:
URL: https://github.com/apache/airflow/pull/20826#issuecomment-1011059757


   The PR most likely needs to run full matrix of tests because it modifies 
parts of the core of Airflow. However, committers might decide to merge it 
quickly and take the risk. If they don't merge it quickly - please rebase it to 
the latest main at your convenience, or amend the last commit of the PR, and 
push it with --force-with-lease.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] potiuk commented on a change in pull request #20795: Fix remaining mypy issues in "core" Airflow

2022-01-12 Thread GitBox


potiuk commented on a change in pull request #20795:
URL: https://github.com/apache/airflow/pull/20795#discussion_r783088432



##
File path: airflow/utils/operator_helpers.py
##
@@ -164,10 +164,7 @@ def determine(
 
 def unpacking(self) -> Mapping[str, Any]:
 """Dump the kwargs mapping to unpack with ``**`` in a function call."""
-# Context is a TypedDict at lint time, and Mypy would complain it 
cannot
-# be used in isinstance. But the call works at runtime since the type 
is
-# actually implemented as a custom mapping, so we ignore the Mypy 
error.
-if self._wildcard and isinstance(self._kwargs, Context):  # type: 
ignore
+if self._wildcard and isinstance(self._kwargs, Context):

Review comment:
   Possibly another fix fixed it. Also, while MyPy is stable and repeatable, 
it is somewhat "vulnerable" to small changes - it generates more or less 
loosely related errors after small changes that should not (at least not obviously) 
trigger them - at least that's the impression I have had sometimes.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] github-actions[bot] commented on pull request #20827: Add explanation to stub files introduced to handle errors in examples

2022-01-12 Thread GitBox


github-actions[bot] commented on pull request #20827:
URL: https://github.com/apache/airflow/pull/20827#issuecomment-1011064346


   The PR is likely OK to be merged with just subset of tests for default 
Python and Database versions without running the full matrix of tests, because 
it does not modify the core of Airflow. If the committers decide that the full 
tests matrix is needed, they will add the label 'full tests needed'. Then you 
should rebase to the latest main or amend the last commit of the PR, and push 
it with --force-with-lease.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] potiuk commented on pull request #20795: Fix remaining mypy issues in "core" Airflow

2022-01-12 Thread GitBox


potiuk commented on pull request #20795:
URL: https://github.com/apache/airflow/pull/20795#issuecomment-1011064541


   Still a few nits, but it's good enough :)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] josh-fell commented on pull request #20795: Fix remaining mypy issues in "core" Airflow

2022-01-12 Thread GitBox


josh-fell commented on pull request #20795:
URL: https://github.com/apache/airflow/pull/20795#issuecomment-1011076700


   Do we want to use this as the fix for `airflow/dag_processing` rather than 
#20470?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] potiuk commented on pull request #19852: created SFTPBatchOperator which add batch function

2022-01-12 Thread GitBox


potiuk commented on pull request #19852:
URL: https://github.com/apache/airflow/pull/19852#issuecomment-1011082315


   Fixing doc/static checks and rebasing?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] edithturn commented on pull request #20701: [WIP] Check free space python

2022-01-12 Thread GitBox


edithturn commented on pull request #20701:
URL: https://github.com/apache/airflow/pull/20701#issuecomment-1011082989


   I got it @potiuk 
   "apt is for the terminal and gives beautiful output while apt-get and 
apt-cache are for scripts and give stable, parsable output. ", making the 
changes now :)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] edithturn edited a comment on pull request #20701: [WIP] Check free space python

2022-01-12 Thread GitBox


edithturn edited a comment on pull request #20701:
URL: https://github.com/apache/airflow/pull/20701#issuecomment-1011082989


   I got it @potiuk 
   "apt is for the terminal and gives beautiful output while apt-get and 
apt-cache are for scripts and give stable, parsable output. ", making the 
changes now :)
   
   Something I was wondering: when will we use "dry_run"? We are 
sending it as a parameter too.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] potiuk merged pull request #20781: Fix airflow dag trigger cli

2022-01-12 Thread GitBox


potiuk merged pull request #20781:
URL: https://github.com/apache/airflow/pull/20781


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] potiuk closed issue #20579: Airflow 2.2.3 : "airflow dags trigger" command gets "Calling `DAG.create_dagrun()` without an explicit data interval is deprecated"

2022-01-12 Thread GitBox


potiuk closed issue #20579:
URL: https://github.com/apache/airflow/issues/20579


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[airflow] branch main updated (19fb7fd -> 82e466d)

2022-01-12 Thread potiuk
This is an automated email from the ASF dual-hosted git repository.

potiuk pushed a change to branch main
in repository https://gitbox.apache.org/repos/asf/airflow.git.


from 19fb7fd  Better signing instructions for helm chart releases (#20796)
 add 82e466d  Fix airflow trigger cli (#20781)

No new revisions were added by this update.

Summary of changes:
 airflow/api/common/trigger_dag.py |  5 +
 tests/api/client/test_local_client.py | 24 +++-
 2 files changed, 24 insertions(+), 5 deletions(-)


[GitHub] [airflow] edithturn commented on pull request #20701: [WIP] Check free space python

2022-01-12 Thread GitBox


edithturn commented on pull request #20701:
URL: https://github.com/apache/airflow/pull/20701#issuecomment-1011103682


   Just to add more information about "check":
   
   "If check is true, and the process exits with a non-zero exit code, a 
CalledProcessError exception will be raised. Attributes of that exception hold 
the arguments, the exit code, and stdout and stderr if they were captured."
   
   From what I understand: "sudo apt-get clean" will finish with a state zero 
(0), so if the check is true it will generate an exception, @potiuk, we don't 
want that exception to happen, right? for that reason we are changing 
"check=False"
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] edithturn edited a comment on pull request #20701: [WIP] Check free space python

2022-01-12 Thread GitBox


edithturn edited a comment on pull request #20701:
URL: https://github.com/apache/airflow/pull/20701#issuecomment-1011103682


   Just to add more information about "check":
   
   "If check is true, and the process exits with a non-zero exit code, a 
CalledProcessError exception will be raised. Attributes of that exception hold 
the arguments, the exit code, and stdout and stderr if they were captured."
   
   From what I understand, "sudo apt-get clean" will normally finish with exit status 
zero (0); if it finishes with a status other than zero and check is true, an 
exception will be raised. @potiuk, we don't want that exception to happen, right? 
For that reason we are changing to "check=False"
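   
   A minimal sketch of that `check` behaviour with plain `subprocess.run` (the 
`false` command is just an illustrative way to get a non-zero exit status):
   
   ```python
   import subprocess
   
   # check=False: a non-zero exit status is only reported on the result
   # object, no exception is raised, and the caller decides what to do.
   result = subprocess.run(["false"], check=False)
   print("exit status:", result.returncode)  # -> 1
   
   # check=True: the same non-zero exit status raises CalledProcessError.
   try:
       subprocess.run(["false"], check=True)
   except subprocess.CalledProcessError as exc:
       print("raised:", exc)
   ```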
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] mikaeld commented on issue #12136: KubernetesPodOperator breaks with active log-collection for long running tasks

2022-01-12 Thread GitBox


mikaeld commented on issue #12136:
URL: https://github.com/apache/airflow/issues/12136#issuecomment-107406


   @flrn77 what provider package version are you using for `cncf.kubernetes` 
when facing the issue?
   The possible fix mentioned by @nikie is for `cncf.kubernetes` version `2.0.3`, 
but you seem to mention Airflow "core" versions. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] mikaeld edited a comment on issue #12136: KubernetesPodOperator breaks with active log-collection for long running tasks

2022-01-12 Thread GitBox


mikaeld edited a comment on issue #12136:
URL: https://github.com/apache/airflow/issues/12136#issuecomment-107406


   @flrn77 what provider package version are you using for `cncf.kubernetes` 
when facing the issue?
   The possible fix mentioned by @nikie is for `cncf.kubernetes` version `2.0.3`, 
but you seem to mention Airflow "core" versions, which are independent of the 
provider package versioning. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] ashb commented on pull request #20795: Fix remaining mypy issues in "core" Airflow

2022-01-12 Thread GitBox


ashb commented on pull request #20795:
URL: https://github.com/apache/airflow/pull/20795#issuecomment-109023


   @josh-fell Oh I like 
https://github.com/apache/airflow/pull/20470/files#diff-af25dd96233a97ace811c614fa7d5bd059cbbc1571e421f6675e16f6290814c5
 more than adding all those "not implemeneted errors" I did.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] mikaeld edited a comment on issue #12136: KubernetesPodOperator breaks with active log-collection for long running tasks

2022-01-12 Thread GitBox


mikaeld edited a comment on issue #12136:
URL: https://github.com/apache/airflow/issues/12136#issuecomment-107406


   @flrn77 what provider package version are you using for `cncf.kubernetes` 
when facing the issue?
   The possible fix mentioned by @nikie is for `cncf.kubernetes` version `2.0.3`, 
but you seem to mention Airflow "core" versions, which are independent of the 
provider package versioning. 
   
   We are facing the issue as well on an older Airflow 1.10 instance that we 
are in the process of migrating to Airflow 2.2 with the latest provider. I'm 
curious if that's something I should be expecting on the latest version as well.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] mikaeld edited a comment on issue #12136: KubernetesPodOperator breaks with active log-collection for long running tasks

2022-01-12 Thread GitBox


mikaeld edited a comment on issue #12136:
URL: https://github.com/apache/airflow/issues/12136#issuecomment-107406


   @flrn77 what provider package version are you using for `cncf.kubernetes` 
when facing the issue?
   The possible fix mentioned by @nikie is for `cncf.kubernetes` version `2.0.3`, 
but you seem to mention Airflow "core" versions, which are independent of the 
provider package versioning. 
   
   We are also facing the issue on an older Airflow 1.10 instance that we are 
in the process of migrating to Airflow 2.2 with the latest provider packages. 
I'm curious if that's something I should be expecting on the latest version as 
well.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] uranusjr commented on pull request #20286: Add TaskMap and TaskInstance.map_id

2022-01-12 Thread GitBox


uranusjr commented on pull request #20286:
URL: https://github.com/apache/airflow/pull/20286#issuecomment-1011122326


   Alright, modified the code a bit and added a few tests. We can’t put a mapped 
task in a DAG yet (`bag_dag` kept throwing AttributeError at me about a ton of 
things expected on BaseOperator but not yet present on MappedOperator), so a 
bit of mocking is needed. But otherwise things seem to work as expected.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] bbovenzi commented on a change in pull request #20733: Add Audit Log View to Dag View

2022-01-12 Thread GitBox


bbovenzi commented on a change in pull request #20733:
URL: https://github.com/apache/airflow/pull/20733#discussion_r783153020



##
File path: airflow/config_templates/default_airflow.cfg
##
@@ -658,6 +658,18 @@ auto_refresh_interval = 3
 # Boolean for displaying warning for publicly viewable deployment
 warn_deployment_exposure = True
 
+# Number of days to look back for dag audit log view
+audit_trail_query_days = 31
+
+# Types of events to ignore for dag audit log view
+# E.g. excluded_events = success,failed,skipped,landing_times
+excluded_events =

Review comment:
   We probably want to exclude all of the read-only events by default to 
cut down on noise.
   ![screenshot](https://user-images.githubusercontent.com/4600967/149164066-6ac3d6df-149a-48eb-b20f-4e6cca47b6fc.png)
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] bbovenzi commented on a change in pull request #20733: Add Audit Log View to Dag View

2022-01-12 Thread GitBox


bbovenzi commented on a change in pull request #20733:
URL: https://github.com/apache/airflow/pull/20733#discussion_r783154481



##
File path: airflow/www/templates/airflow/dag_audit_log.html
##
@@ -0,0 +1,92 @@
+{#
+ Licensed to the Apache Software Foundation (ASF) under one
+ or more contributor license agreements.  See the NOTICE file
+ distributed with this work for additional information
+ regarding copyright ownership.  The ASF licenses this file
+ to you under the Apache License, Version 2.0 (the
+ "License"); you may not use this file except in compliance
+ with the License.  You may obtain a copy of the License at
+
+   http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing,
+ software distributed under the License is distributed on an
+ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ KIND, either express or implied.  See the License for the
+ specific language governing permissions and limitations
+ under the License.
+#}
+
+{% extends "airflow/dag.html" %}
+{% block title %}Dag Audit Log{% endblock %}
+
+{% block head_css %}
+{{ super() }}
+
+
+{% endblock %}
+
+{% block content %}
+  {{ super() }}
+  Dag Audit Log
+
+  
+Selected events and operations in the last {{ max_query_days }} days.
+For full list of events click
+here.
+  
+  User operations with comments are highlighted.
+
+
+
+
+Time
+Task Id
+Event
+Execution Date
+By User
+Details
+
+
+
+{% for log in dag_logs %}
+
+  
+  {{ log.dttm.strftime('%Y-%m-%d %H:%M:%S %Z') if log.dttm else None }}

Review comment:
   We should make sure this date is formatted to the app's timezone and not 
just UTC. We probably want to wrap it in a `` element.
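   
   A sketch of what that could look like in the template - the `<time>` element 
and the ISO string are assumptions here, not the final markup:
   
   ```html
   {# Emit a machine-readable timestamp so client-side code can
      localize it to the webserver's configured timezone. #}
   <time datetime="{{ log.dttm.isoformat() if log.dttm else '' }}">
     {{ log.dttm.strftime('%Y-%m-%d %H:%M:%S %Z') if log.dttm else '' }}
   </time>
   ```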
   
   ![screenshot](https://user-images.githubusercontent.com/4600967/149164600-cfab2015-ce85-4b9d-a484-c9b84beb8c50.png)
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] ashb commented on a change in pull request #20826: Move tests out of test core that are duplicated/should live elsewhere

2022-01-12 Thread GitBox


ashb commented on a change in pull request #20826:
URL: https://github.com/apache/airflow/pull/20826#discussion_r783154034



##
File path: tests/core/test_core.py
##
@@ -16,32 +16,21 @@
 # specific language governing permissions and limitations
 # under the License.
 
-import os
-import signal
-from datetime import timedelta
+from datetime import datetime, timedelta

Review comment:
   ```suggestion
   from datetime import timedelta
   ```

##
File path: tests/core/test_core.py
##
@@ -16,32 +16,21 @@
 # specific language governing permissions and limitations
 # under the License.
 
-import os
-import signal
-from datetime import timedelta
+from datetime import datetime, timedelta
 from time import sleep
-from unittest.mock import MagicMock
 
 import pytest
 
 from airflow import settings
-from airflow.exceptions import AirflowException, AirflowTaskTimeout
-from airflow.hooks.base import BaseHook
-from airflow.models import DagBag, TaskFail, TaskInstance
+from airflow.exceptions import AirflowTaskTimeout
+from airflow.models import TaskFail, TaskInstance
 from airflow.operators.bash import BashOperator
-from airflow.operators.check_operator import CheckOperator, ValueCheckOperator
 from airflow.operators.dummy import DummyOperator
 from airflow.operators.python import PythonOperator
-from airflow.utils.dates import days_ago
-from airflow.utils.state import State
-from airflow.utils.timezone import datetime
 from airflow.utils.types import DagRunType

Review comment:
   ```suggestion
   from airflow.utils.timezone import datetime
   from airflow.utils.types import DagRunType
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] bbovenzi commented on a change in pull request #20733: Add Audit Log View to Dag View

2022-01-12 Thread GitBox


bbovenzi commented on a change in pull request #20733:
URL: https://github.com/apache/airflow/pull/20733#discussion_r783155908



##
File path: airflow/www/templates/airflow/dag_audit_log.html
##
@@ -0,0 +1,92 @@
+{#
+ Licensed to the Apache Software Foundation (ASF) under one
+ or more contributor license agreements.  See the NOTICE file
+ distributed with this work for additional information
+ regarding copyright ownership.  The ASF licenses this file
+ to you under the Apache License, Version 2.0 (the
+ "License"); you may not use this file except in compliance
+ with the License.  You may obtain a copy of the License at
+
+   http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing,
+ software distributed under the License is distributed on an
+ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ KIND, either express or implied.  See the License for the
+ specific language governing permissions and limitations
+ under the License.
+#}
+
+{% extends "airflow/dag.html" %}
+{% block title %}Dag Audit Log{% endblock %}
+
+{% block head_css %}
+{{ super() }}
+
+
+{% endblock %}
+
+{% block content %}
+  {{ super() }}
+  Dag Audit Log
+
+  
+Selected events and operations in the last {{ max_query_days }} days.
+For full list of events click
+here.
+  
+  User operations with comments are highlighted.
+
+
+
+
+Time
+Task Id
+Event
+Execution Date

Review comment:
   Why is this min-width so much larger than the Time column's?
   Also, I believe we want to use `Logical Date` instead of `Execution Date`. 
Is that right, @uranusjr?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] bbovenzi commented on a change in pull request #20733: Add Audit Log View to Dag View

2022-01-12 Thread GitBox


bbovenzi commented on a change in pull request #20733:
URL: https://github.com/apache/airflow/pull/20733#discussion_r783159243



##
File path: airflow/www/templates/airflow/dag_audit_log.html
##
@@ -0,0 +1,92 @@
+{#
+ Licensed to the Apache Software Foundation (ASF) under one
+ or more contributor license agreements.  See the NOTICE file
+ distributed with this work for additional information
+ regarding copyright ownership.  The ASF licenses this file
+ to you under the Apache License, Version 2.0 (the
+ "License"); you may not use this file except in compliance
+ with the License.  You may obtain a copy of the License at
+
+   http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing,
+ software distributed under the License is distributed on an
+ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ KIND, either express or implied.  See the License for the
+ specific language governing permissions and limitations
+ under the License.
+#}
+
+{% extends "airflow/dag.html" %}
+{% block title %}Dag Audit Log{% endblock %}
+
+{% block head_css %}
+{{ super() }}
+
+
+{% endblock %}
+
+{% block content %}
+  {{ super() }}
+  Dag Audit Log
+
+  
+Selected events and operations in the last {{ max_query_days }} days.
+For full list of events click

Review comment:
   I don't find this very descriptive, as we don't explain what is 
selected. I'm not certain what better copy would be, but let's at least 
mention how this is managed in one's Airflow config.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] josh-fell commented on a change in pull request #19852: created SFTPBatchOperator which add batch function

2022-01-12 Thread GitBox


josh-fell commented on a change in pull request #19852:
URL: https://github.com/apache/airflow/pull/19852#discussion_r783132963



##
File path: airflow/providers/sftp/operators/sftp_batch.py
##
@@ -0,0 +1,312 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+"""This module contains SFTP Batch operator."""
+import ast
+import os
+import re
+from pathlib import Path
+from typing import List, Optional
+
+from paramiko.sftp_client import SFTPClient
+
+from airflow.exceptions import AirflowException
+from airflow.models import BaseOperator
+from airflow.providers.sftp.operators.sftp import SFTPOperation
+from airflow.providers.sftp.utils.common import check_conn, 
make_intermediate_dirs
+from airflow.utils.context import Context
+
+
+class SFTPBatchOperator(BaseOperator):
+"""
+SFTPBatchOperator for transferring files from remote host to local or vice versa.
+This operator uses ssh_hook to open an sftp transport channel that serves as the basis
+for file transfer.
+:param ssh_hook: predefined ssh_hook to use for remote execution.
+Either `ssh_hook` or `ssh_conn_id` needs to be provided.
+:type ssh_hook: airflow.providers.ssh.hooks.ssh.SSHHook
+:param ssh_conn_id: :ref:`ssh connection id`
+from airflow Connections. `ssh_conn_id` will be ignored if `ssh_hook`
+is provided.
+:type ssh_conn_id: str
+:param remote_host: remote host to connect (templated)
+Nullable. If provided, it will replace the `remote_host` which was
+defined in `ssh_hook` or predefined in the connection of `ssh_conn_id`.
+:type remote_host: str
+:param local_files_path: local files path to get or put. (templated)
+:type local_files_path: list
+:param local_folder: local folder path to get or put. (templated)
+:type local_folder: str
+:param remote_folder: remote folder path to get or put. (templated)
+:type remote_folder: str
+:param remote_files_path: remote folder path to get or put. (templated)
+:type remote_files_path: list
+:param regexp_mask: regexp mask for file match in local_folder for PUT 
operational
+or match filenames in remote_folder for GET operational. (templated)
+:type regexp_mask: str
+:param operation: specify operation 'get' or 'put', defaults to put
+:type operation: str
+:param confirm: specify if the SFTP operation should be confirmed, 
defaults to True
+:type confirm: bool
+:param create_intermediate_dirs: create missing intermediate directories 
when
+:type create_intermediate_dirs: bool
+:param force: if the file already exists, it will be overwritten
+:type force: bool
+copying from remote to local and vice-versa. Default is False.
+:param list_as_str: if you uses xcom or string in remote_files_path or 
local_files_path
+for example your xcom return "['/tmp/tmp1/batch/file1.txt']"
+then turn use_xcom_args = True and your will be xcom transformed to 
list type
+:type list_as_str: bool
+copying from remote to local and vice-versa. Default is False.
+Summary, support arguments:
+Possible options for PUT:
+1.optional(regexp_mask:str) + local_folder:str + remote_folder:str
+2.local_files_path:list + remote_folder:str
+Possible options for GET:
+1.local_folder:str + remote_folder:str + optional(regexp_mask:str)
+2.local_folder:str + remote_files_path:list
+Example:
+Move all txt files
+from local `/tmp/dir_for_local_transfer/` to remote folder 
`/tmp/dir_for_remote_transfer/`
+put_dir_txt_files = SFTPOperator(
+task_id="put_dir_txt_files",
+ssh_conn_id="ssh_default",
+local_folder="/tmp/dir_for_local_transfer/",
+remote_folder="/tmp/dir_for_remote_transfer/",
+regexp_mask=".*[.]txt",
+operation=SFTPOperation.PUT,
+create_intermediate_dirs=True
+)
+Move `/tmp/file1.txt` file
+from local to remote folder `/tmp/dir_for_remote_transfer/`
+put_files = SFTPOperator(
+task_id="put_dir_txt_files",
+ssh_con

[GitHub] [airflow] edithturn commented on a change in pull request #20763: [WIP] Verify enough resources for breeze

2022-01-12 Thread GitBox


edithturn commented on a change in pull request #20763:
URL: https://github.com/apache/airflow/pull/20763#discussion_r783163958



##
File path: scripts/in_container/run_resource_check.py
##
@@ -0,0 +1,102 @@
+#!/usr/bin/env python
+
+
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+
+import psutil
+from rich.console import Console
+
+console = Console(force_terminal=True, color_system="standard", width=180)
+
+
+def get_size(kilobytes):
+"""
+Convert kilobytes into gigabytes
+1 Gigabytes = 1,048,576 Kb
+"""
+factor = 10
+value_gb = kilobytes // factor
+return value_gb
+
+
+def resoure_check():
+"""
+Use psutil to get resources in kilobytes: memory, cpus and disk
+"""
+resources = {}
+print("\nChecking resources.\n")
+
+# Memory available
+svmem = psutil.virtual_memory()
+mem_available = get_size(svmem.total)
+resources.setdefault('Memory', []).append(mem_available)
+
+# Cpus available
+cpus_available = psutil.cpu_count(logical=True)
+resources.setdefault('Cpus', []).append(cpus_available)
+
+# Disk: Get all disk partitions /
+partitions = psutil.disk_partitions()
+partition_usage = psutil.disk_usage(partitions[0].mountpoint)
+disk_available = get_size(partition_usage.free)
+resources.setdefault('Disk', []).append(disk_available)
+
+return resources
+
+
+def resoure_validate():
+resources = resoure_check()
+warning_resources = False
+check = "OK"
+
+resources.setdefault("Memory", []).append(4)
+resources.setdefault("Cpus", []).append(2)
+resources.setdefault("Disk", []).append(20)
+
+for resource, available in resources.items():
+
+check = '' if resource == "Cpus" else 'GB'
+
+if (
+resource == "Memory"
+and available[0] < 4
+or resource == "Cpus"
+and available[0] < 2
+or resource == "Disk"
+and available[0] < 20
+):
+console.print(f"[yellow]WARNING!!!: Not enough {resource} 
available for Docker.")
+print(f"At least {available[1]}{check} of {resource} required. You 
have {available[0]}{check}\n")
+warning_resources = True
+else:
+console.print(f" * {resource} available {available[0]}{check}. 
[green]OK.\n")

Review comment:
   Done :), definitely looks great without space.
   ![Screenshot from 2022-01-12 
10-08-00](https://user-images.githubusercontent.com/58795858/149166520-6516d0fc-b6ee-40bb-aeea-28b312a457a4.png)
   
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] edithturn commented on a change in pull request #20763: [WIP] Verify enough resources for breeze

2022-01-12 Thread GitBox


edithturn commented on a change in pull request #20763:
URL: https://github.com/apache/airflow/pull/20763#discussion_r783163958



##
File path: scripts/in_container/run_resource_check.py
##
@@ -0,0 +1,102 @@
+#!/usr/bin/env python
+
+
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+
+import psutil
+from rich.console import Console
+
+console = Console(force_terminal=True, color_system="standard", width=180)
+
+
+def get_size(kilobytes):
+"""
+Convert kilobytes into gigabytes
+1 Gigabytes = 1,048,576 Kb
+"""
+factor = 10
+value_gb = kilobytes // factor
+return value_gb
+
+
+def resoure_check():
+"""
+Use psutil to get resources in kilobytes: memory, cpus and disk
+"""
+resources = {}
+print("\nChecking resources.\n")
+
+# Memory available
+svmem = psutil.virtual_memory()
+mem_available = get_size(svmem.total)
+resources.setdefault('Memory', []).append(mem_available)
+
+# Cpus available
+cpus_available = psutil.cpu_count(logical=True)
+resources.setdefault('Cpus', []).append(cpus_available)
+
+# Disk: Get all disk partitions /
+partitions = psutil.disk_partitions()
+partition_usage = psutil.disk_usage(partitions[0].mountpoint)
+disk_available = get_size(partition_usage.free)
+resources.setdefault('Disk', []).append(disk_available)
+
+return resources
+
+
+def resoure_validate():
+resources = resoure_check()
+warning_resources = False
+check = "OK"
+
+resources.setdefault("Memory", []).append(4)
+resources.setdefault("Cpus", []).append(2)
+resources.setdefault("Disk", []).append(20)
+
+for resource, available in resources.items():
+
+check = '' if resource == "Cpus" else 'GB'
+
+if (
+resource == "Memory"
+and available[0] < 4
+or resource == "Cpus"
+and available[0] < 2
+or resource == "Disk"
+and available[0] < 20
+):
+console.print(f"[yellow]WARNING!!!: Not enough {resource} 
available for Docker.")
+print(f"At least {available[1]}{check} of {resource} required. You 
have {available[0]}{check}\n")
+warning_resources = True
+else:
+console.print(f" * {resource} available {available[0]}{check}. 
[green]OK.\n")

Review comment:
   Done :), definitely looks great without the newline at the end.
   ![Screenshot from 2022-01-12 
10-08-00](https://user-images.githubusercontent.com/58795858/149166520-6516d0fc-b6ee-40bb-aeea-28b312a457a4.png)
   
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] ashb commented on a change in pull request #20286: Add TaskMap and TaskInstance.map_id

2022-01-12 Thread GitBox


ashb commented on a change in pull request #20286:
URL: https://github.com/apache/airflow/pull/20286#discussion_r783171485



##
File path: airflow/models/baseoperator.py
##
@@ -1632,6 +1632,33 @@ def defer(
 def map(self, **kwargs) -> "MappedOperator":
 return MappedOperator.from_operator(self, kwargs)
 
+def has_mapped_dependants(self) -> bool:
+"""Whether any downstream dependencies depend on this task for 
mapping."""
+from airflow.utils.task_group import MappedTaskGroup, TaskGroup
+
+if not self.has_dag():
+return False
+
+def _walk_group(group: TaskGroup) -> Iterable[Tuple[str, DAGNode]]:
+"""Recursively walk children in a task group.
+
+This yields all direct children (including both tasks and task
+groups), and all children of any task groups.
+"""
+for key, child in group.children.items():
+yield key, child
+if isinstance(child, TaskGroup):
+yield from _walk_group(child)
+
+for key, child in _walk_group(self.dag.task_group):
+if key == self.task_id:
+continue
+if not isinstance(child, (MappedOperator, MappedTaskGroup)):
+continue
+if self.task_id in child.upstream_task_ids:
+return True
+return False

Review comment:
   Could you add a comment saying why?
   
   (Cos I think fairly soon it will be time to have TaskGroups directly in the 
graph rather than tacked on and UI only like we have it)




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] ashb commented on pull request #20286: Add TaskMap and TaskInstance.map_id

2022-01-12 Thread GitBox


ashb commented on pull request #20286:
URL: https://github.com/apache/airflow/pull/20286#issuecomment-1011149780


   > so a bit of mocking is needed. But otherwise things seem to work as 
expected.
   
   Yeah, I guess those'll be fixed by 
https://github.com/apache/airflow/pull/20743


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] ashb commented on pull request #19808: Refactor dangling row check to use SQLA queries

2022-01-12 Thread GitBox


ashb commented on pull request #19808:
URL: https://github.com/apache/airflow/pull/19808#issuecomment-1011160510


   I've rebased this, and I'll need to re-run these checks to make sure they 
still work as it's been a while.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] ashb commented on a change in pull request #20286: Add TaskMap and TaskInstance.map_id

2022-01-12 Thread GitBox


ashb commented on a change in pull request #20286:
URL: https://github.com/apache/airflow/pull/20286#discussion_r783184829



##
File path: airflow/models/taskinstance.py
##
@@ -2128,6 +2138,14 @@ def set_duration(self) -> None:
 self.duration = None
 self.log.debug("Task Duration set to %s", self.duration)
 
+def _record_task_map_for_downstreams(self, value: Any, *, session: 
Session) -> None:
+if not self.task.has_mapped_dependants():
+return
+if not isinstance(value, collections.abc.Collection) or 
isinstance(value, (bytes, str)):
+self.log.info("Failing %s for unmappable XCom push %r", self.key, 
value)
+raise UnmappableXComPushed(value)

Review comment:
   I don't think we should include the _whole_ value here -- it could 
potentially be a huge object/structure/etc. (Either here, or in the exception 
itself.)




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] edithturn commented on a change in pull request #20763: [WIP] Verify enough resources for breeze

2022-01-12 Thread GitBox


edithturn commented on a change in pull request #20763:
URL: https://github.com/apache/airflow/pull/20763#discussion_r783189748



##
File path: scripts/in_container/run_resource_check.py
##
@@ -0,0 +1,102 @@
+#!/usr/bin/env python
+
+
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+
+import psutil
+from rich.console import Console
+
+console = Console(force_terminal=True, color_system="standard", width=180)
+
+
+def get_size(kilobytes):
+"""
+Convert kilobytes into gigabytes
+1 Gigabytes = 1,048,576 Kb
+"""
+factor = 10
+value_gb = kilobytes // factor
+return value_gb
+
+
+def resoure_check():
+"""
+Use psutil to get resources in kilobytes: memory, cpus and disk
+"""
+resources = {}
+print("\nChecking resources.\n")
+
+# Memory available
+svmem = psutil.virtual_memory()
+mem_available = get_size(svmem.total)
+resources.setdefault('Memory', []).append(mem_available)
+
+# Cpus available
+cpus_available = psutil.cpu_count(logical=True)
+resources.setdefault('Cpus', []).append(cpus_available)
+
+# Disk: Get all disk partitions /
+partitions = psutil.disk_partitions()
+partition_usage = psutil.disk_usage(partitions[0].mountpoint)
+disk_available = get_size(partition_usage.free)
+resources.setdefault('Disk', []).append(disk_available)
+
+return resources
+
+
+def resoure_validate():
+resources = resoure_check()
+warning_resources = False
+check = "OK"
+
+resources.setdefault("Memory", []).append(4)
+resources.setdefault("Cpus", []).append(2)
+resources.setdefault("Disk", []).append(20)
+
+for resource, available in resources.items():
+
+check = '' if resource == "Cpus" else 'GB'
+
+if (
+resource == "Memory"
+and available[0] < 4

Review comment:
   Should I do the same with "Memory", "Cpus" and "Disk"? They are used in 
both functions; would a module-level constant be a good fit here?
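
   For reference, a minimal sketch of that idea, assuming a single 
module-level constant; the names and thresholds below are illustrative, not 
taken from this PR:

   ```python
   # Hypothetical sketch: one shared constant holds the resource names and
   # minimums, so the check and validate functions cannot drift apart.
   MINIMUM_RESOURCES = {
       "Memory": 4,  # GB
       "Cpus": 2,
       "Disk": 20,  # GB
   }


   def validate(resources):
       """Compare measured resources against the shared minimums."""
       for name, minimum in MINIMUM_RESOURCES.items():
           available = resources.get(name, [0])[0]
           unit = "" if name == "Cpus" else " GB"
           if available < minimum:
               print(f"WARNING: {name}: {available}{unit} available, {minimum}{unit} required")
           else:
               print(f"{name}: {available}{unit} available (OK)")


   validate({"Memory": [8], "Cpus": [4], "Disk": [100]})
   ```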




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] PApostol opened a new pull request #20828: Add Helm Chart 1.4.0 in airflow_helmchart_bug_report.yml

2022-01-12 Thread GitBox


PApostol opened a new pull request #20828:
URL: https://github.com/apache/airflow/pull/20828


   Update `.github/ISSUE_TEMPLATE/airflow_helmchart_bug_report.yml` with the 
new version of the Helm Chart.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] uranusjr commented on a change in pull request #20286: Add TaskMap and TaskInstance.map_id

2022-01-12 Thread GitBox


uranusjr commented on a change in pull request #20286:
URL: https://github.com/apache/airflow/pull/20286#discussion_r783194609



##
File path: airflow/models/taskinstance.py
##
@@ -2128,6 +2138,14 @@ def set_duration(self) -> None:
         self.duration = None
         self.log.debug("Task Duration set to %s", self.duration)
 
+    def _record_task_map_for_downstreams(self, value: Any, *, session: Session) -> None:
+        if not self.task.has_mapped_dependants():
+            return
+        if not isinstance(value, collections.abc.Collection) or isinstance(value, (bytes, str)):
+            self.log.info("Failing %s for unmappable XCom push %r", self.key, value)
+            raise UnmappableXComPushed(value)

Review comment:
   Oh good point. Even doing `repr(value)[:100]` can potentially consume a 
lot of memory, so I guess the only reasonable approach is to not log the value 
at all. Worst case, the user can manually look up that problematic value in the 
XCom table.
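
   As an aside: if a bounded preview were ever wanted, the standard library's 
`reprlib` builds a truncated repr of containers without formatting every 
element first. A small sketch (not a suggestion for this PR):

   ```python
   import reprlib

   # reprlib truncates containers while formatting, instead of building the
   # full repr string and slicing it afterwards.
   bounded = reprlib.Repr()
   bounded.maxlist = 3      # show at most 3 list elements
   bounded.maxstring = 40   # cap string previews at 40 characters

   huge = list(range(1_000_000))
   print(bounded.repr(huge))  # e.g. "[0, 1, 2, ...]"
   ```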




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] github-actions[bot] commented on pull request #20828: Add Helm Chart 1.4.0 in airflow_helmchart_bug_report.yml

2022-01-12 Thread GitBox


github-actions[bot] commented on pull request #20828:
URL: https://github.com/apache/airflow/pull/20828#issuecomment-1011181987


   The PR is likely ready to be merged. No tests are needed, as no important 
environment files or Python files were modified by it. However, committers 
might decide that the full test matrix is needed and add the 'full tests 
needed' label. Then you should rebase it to the latest main or amend the last 
commit of the PR, and push it with --force-with-lease.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] kaxil commented on a change in pull request #19769: Handle stuck queued tasks in Celery

2022-01-12 Thread GitBox


kaxil commented on a change in pull request #19769:
URL: https://github.com/apache/airflow/pull/19769#discussion_r783205515



##
File path: airflow/config_templates/config.yml
##
@@ -1678,6 +1678,13 @@
   type: string
   example: ~
   default: "False"
+- name: stuck_queued_task_check_interval
+  description: |
+    How often to check for stuck queued tasks (in seconds)
+  version_added: 2.2.3

Review comment:
   ```suggestion
 version_added: 2.3.0
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] boring-cyborg[bot] commented on issue #20829: Update google-cloud-container package to v2.10.1

2022-01-12 Thread GitBox


boring-cyborg[bot] commented on issue #20829:
URL: https://github.com/apache/airflow/issues/20829#issuecomment-1011207396


   Thanks for opening your first issue here! Be sure to follow the issue 
template!
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



