[GitHub] [airflow] uranusjr commented on a change in pull request #22333: Patch sql_alchemy_conn if old postgres scheme used
uranusjr commented on a change in pull request #22333: URL: https://github.com/apache/airflow/pull/22333#discussion_r828800148 ## File path: airflow/settings.py ## @@ -228,6 +229,19 @@ def configure_vars(): global PLUGINS_FOLDER global DONOT_MODIFY_HANDLERS SQL_ALCHEMY_CONN = conf.get('core', 'SQL_ALCHEMY_CONN') + +# as of sqlalchemy 1.4, scheme `postgres+psycopg2` must be replaced with `postgresql` +parsed = urlparse(SQL_ALCHEMY_CONN) +bad_scheme = 'postgres+psycopg2' +if parsed.scheme == bad_scheme: +warnings.warn( +f"Scheme for metadata sql alchemy connection is `{bad_scheme}`." +"As of sqlalchemy 1.4 this is no longer supported. You must change " +"to `postgresql`", +PendingDeprecationWarning, Review comment: Should this be `DeprecationWarning`? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
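The check under review can be sketched stand-alone, outside Airflow. In this sketch, `normalize_conn` is an invented helper name and the rewrite-and-return behavior is illustrative (the patch under review only warns); it also uses the `DeprecationWarning` the reviewer suggests:

```python
import warnings
from urllib.parse import urlparse

# Stand-alone sketch of the check under review; `normalize_conn` and the
# rewrite behavior are illustrative, not Airflow's actual implementation.
def normalize_conn(uri: str) -> str:
    bad_scheme = "postgres+psycopg2"
    parsed = urlparse(uri)
    if parsed.scheme == bad_scheme:
        warnings.warn(
            f"Scheme `{bad_scheme}` is no longer supported as of SQLAlchemy 1.4; "
            "use `postgresql` instead.",
            DeprecationWarning,  # the reviewer's suggestion over PendingDeprecationWarning
            stacklevel=2,
        )
        return "postgresql" + uri[len(bad_scheme):]
    return uri

print(normalize_conn("postgres+psycopg2://user:pw@db:5432/airflow"))
# -> postgresql://user:pw@db:5432/airflow
```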
[GitHub] [airflow] uranusjr commented on a change in pull request #22332: Events Timetable
uranusjr commented on a change in pull request #22332: URL: https://github.com/apache/airflow/pull/22332#discussion_r828798735 ## File path: airflow/timetables/events.py ## @@ -0,0 +1,83 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. + +from typing import Iterable, Optional + +import numpy as np +import pendulum +from pendulum import DateTime + +from airflow.timetables.base import DagRunInfo, DataInterval, TimeRestriction, Timetable +from airflow.timetables.simple import NullTimetable + + +class EventsTimetable(NullTimetable): Review comment: `NullTimetable` (and other non-recurring timetables) has some special treatments in UI. I think it’s better to inherit `Timetable` instead (the root class). Also it should be useful to implement `__repr__`, `summary`, and `description` For UI representation. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
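The UI-facing hooks the reviewer asks for can be sketched without Airflow installed. This is an airflow-free toy (a real implementation would subclass `airflow.timetables.base.Timetable`, and all method bodies here are invented for illustration):

```python
# Airflow-free sketch of the `__repr__`/`summary`/`description` hooks the
# reviewer asks for; a real EventsTimetable would subclass Timetable.
class EventsTimetable:
    def __init__(self, event_dates):
        self.event_dates = sorted(event_dates)

    def __repr__(self):
        return f"EventsTimetable({len(self.event_dates)} events)"

    @property
    def summary(self):
        # Short label shown next to the DAG in list views.
        return f"{len(self.event_dates)} events"

    @property
    def description(self):
        # Longer human-readable text, e.g. for a tooltip.
        first, last = self.event_dates[0], self.event_dates[-1]
        return f"Runs at {len(self.event_dates)} fixed events between {first} and {last}"

tt = EventsTimetable(["2022-04-01", "2022-03-20"])
print(tt.summary)  # -> 2 events
```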
[GitHub] [airflow] uranusjr opened a new pull request #22334: Add fk between xcom and task instance
uranusjr opened a new pull request #22334: URL: https://github.com/apache/airflow/pull/22334 As discussed previously. Also improved the dag_run relationship to use the dag_run_id field directly because why not? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] shuhoy commented on pull request #22252: Refactor BigQuery to GCS Operator
shuhoy commented on pull request #22252: URL: https://github.com/apache/airflow/pull/22252#issuecomment-1070349062 All checks passed!!:rocket: -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] dstandish commented on a change in pull request #22184: Add run_id and map_index to SlaMiss
dstandish commented on a change in pull request #22184: URL: https://github.com/apache/airflow/pull/22184#discussion_r828760938 ## File path: airflow/dag_processing/processor.py ## @@ -412,20 +410,20 @@ def manage_slas(self, dag: DAG, session: Session = None) -> None: else: while next_info.logical_date < ts: next_info = dag.next_dagrun_info(next_info.data_interval, restricted=False) - if next_info is None: break -if (ti.dag_id, ti.task_id, next_info.logical_date) in recorded_slas_query: +next_run_id = DR.generate_run_id(DagRunType.SCHEDULED, next_info.logical_date) +if (ti.dag_id, ti.task_id, next_run_id, ti.map_index) in recorded_sla_misses: Review comment: though it is weird ... i don't understand why we immediately call next run here https://github.com/apache/airflow/blob/main/airflow/dag_processing/processor.py#L414 we're already at "the next run" relative to the last completed TI -- why don't we see if _that_ run has an SLA miss? it seems we skip to "the run _after_ the run after" the latest TI. so that we could only get an SLA miss if airflow is 2 runs behind 🤪 i need to actually do some live testing to understand how this actually behaves. update: i did some live testing, and yeah, SLAs are all screwed up here's an example: ![image](https://user-images.githubusercontent.com/15932138/158741020-7b21db86-8cbb-439c-9c66-aecbe4365e50.png) i created a task that is sleep(300), SLA of one minute, and runs every 5 minutes. here you can see that first TI gets no miss (expected) and the second TI does not get any miss. the one that is created is always 2 runs ahead of the latest successful TI 🤦 and one ahead of the one that is running. also visible here is an odd quirk that appears to be a bug re timetables or data intervals or something in that area: the execution dates are wiggly i.e. not a value `N * timedelta + dag.start_date`. they have seemingly random millis precision. 
and as a consequence, not only are the SlaMiss records in the future, but i have found that they may _never_ correspond to TIs that actually come into existence. e.g. the SlaMiss you see above is 53m:54s and change but observe here after waiting a few minutes the exec date for the TI actually created is 54m:11s (we never end up seeing a TI with 53m:54s). https://user-images.githubusercontent.com/15932138/158741876-f8085c2c-70c7-4c47-a663-216a545f701b.png";> -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
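The off-by-one described above can be reproduced with a toy fixed-interval schedule. The names, interval, and timestamps below are invented, and `next_run` is only a stand-in for `dag.next_dagrun_info(...)`:

```python
from datetime import datetime, timedelta

# Toy fixed-interval schedule reproducing the described off-by-one:
# the loop advances one interval past `ts`.
interval = timedelta(minutes=5)
start = datetime(2022, 3, 16, 0, 0)      # last completed TI's logical date

def next_run(logical_date):
    # stand-in for dag.next_dagrun_info(...).logical_date
    return logical_date + interval

ts = start + timedelta(minutes=7)        # "now", partway through the 00:05 run
next_info = next_run(start)              # 00:05 -- the run after the last TI

while next_info < ts:                    # mirrors `while next_info.logical_date < ts`
    next_info = next_run(next_info)

print(next_info)  # -> 2022-03-16 00:10:00, two runs past the last completed TI
```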
[GitHub] [airflow] uranusjr commented on a change in pull request #22272: Add map_index support to all task instance-related views
uranusjr commented on a change in pull request #22272: URL: https://github.com/apache/airflow/pull/22272#discussion_r828749020 ## File path: airflow/www/static/js/tree/StatusBox.jsx ## @@ -36,9 +36,9 @@ const StatusBox = ({ group, instance, containerRef, extraLinks = [], }) => { const { -executionDate, taskId, tryNumber = 0, operator, runId, +executionDate, taskId, tryNumber = 0, operator, runId, mapIndex, } = instance; - const onClick = () => executionDate && callModal(taskId, executionDate, extraLinks, tryNumber, operator === 'SubDagOperator' || undefined, runId); + const onClick = () => executionDate && callModal(taskId, executionDate, extraLinks, tryNumber, operator === 'SubDagOperator', runId, mapIndex); Review comment: Same, does it matter to use false vs undefined? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] uranusjr commented on a change in pull request #22272: Add map_index support to al task instance-related views
uranusjr commented on a change in pull request #22272: URL: https://github.com/apache/airflow/pull/22272#discussion_r828748779 ## File path: airflow/www/static/js/graph.js ## @@ -172,8 +172,15 @@ function draw() { const task = tasks[nodeId]; const tryNumber = taskInstances[nodeId].try_number || 0; - if (task.task_type === 'SubDagOperator') callModal(nodeId, executionDate, task.extra_links, tryNumber, true, dagRunId); - else callModal(nodeId, executionDate, task.extra_links, tryNumber, undefined, dagRunId); + callModal( +nodeId, +executionDate, +task.extra_links, +tryNumber, +task.task_tupe === 'SubDagOperator', Review comment: ```suggestion task.task_tupe === 'SubDagOperator' ? true : undefined, ``` Does it make a difference? This is closer to the original. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] collinmcnulty opened a new pull request #22332: Events Timetable
collinmcnulty opened a new pull request #22332: URL: https://github.com/apache/airflow/pull/22332 I've added a new Timetable that I believe will be widely useful for timing based on sporting events, planned communication campaigns, and other schedules that are arbitrary and irregular but predictable. I need to put more thought into (and could use help with) testing, but I wanted to put it out in draft form to solicit feedback. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] pohek321 commented on issue #22330: Add functionality to DatabricksHook to create notebooks
pohek321 commented on issue #22330: URL: https://github.com/apache/airflow/issues/22330#issuecomment-1070306241 Created pull request #22331 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] bbovenzi commented on issue #21201: Add Trigger Rule Display to Graph View
bbovenzi commented on issue #21201: URL: https://github.com/apache/airflow/issues/21201#issuecomment-1070306044 I agree we can do it in a new graph view, but I don't think this would be too complicated to do for the current one either. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] bbovenzi commented on issue #22325: ReST API : get_dag should return more than a simplified view of the dag
bbovenzi commented on issue #22325: URL: https://github.com/apache/airflow/issues/22325#issuecomment-1070305496 Good idea! I wonder if it should be on get dag or get dag details. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] boring-cyborg[bot] commented on pull request #22331: Add import_notebook method to databricks hook
boring-cyborg[bot] commented on pull request #22331: URL: https://github.com/apache/airflow/pull/22331#issuecomment-1070305458 Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about anything please check our Contribution Guide (https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst) Here are some useful points: - Pay attention to the quality of your code (flake8, mypy and type annotations). Our [pre-commits]( https://github.com/apache/airflow/blob/main/STATIC_CODE_CHECKS.rst#prerequisites-for-pre-commit-hooks) will help you with that. - In case of a new feature add useful documentation (in docstrings or in `docs/` directory). Adding a new operator? Check this short [guide](https://github.com/apache/airflow/blob/main/docs/apache-airflow/howto/custom-operator.rst) Consider adding an example DAG that shows how users should use it. - Consider using [Breeze environment](https://github.com/apache/airflow/blob/main/BREEZE.rst) for testing locally, it’s a heavy docker but it ships with a working Airflow and a lot of integrations. - Be patient and persistent. It might take some time to get a review or get the final approval from Committers. - Please follow [ASF Code of Conduct](https://www.apache.org/foundation/policies/conduct) for all communication including (but not limited to) comments on Pull Requests, Mailing list and Slack. - Be sure to read the [Airflow Coding style]( https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst#coding-style-and-best-practices). Apache Airflow is a community-driven project and together we are making it better 🚀. In case of doubts contact the developers at: Mailing List: d...@airflow.apache.org Slack: https://s.apache.org/airflow-slack -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] pohek321 opened a new pull request #22331: Add import_notebook method to databricks hook
pohek321 opened a new pull request #22331: URL: https://github.com/apache/airflow/pull/22331 --- **^ Add meaningful description above** Read the **[Pull Request Guidelines](https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst#pull-request-guidelines)** for more information. In case of fundamental code change, Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)) is needed. In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x). In case of backwards incompatible changes please leave a note in [UPDATING.md](https://github.com/apache/airflow/blob/main/UPDATING.md). -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[airflow] branch main updated: Add recipe for BeamRunGoPipelineOperator (#22296)
This is an automated email from the ASF dual-hosted git repository. kamilbregula pushed a commit to branch main in repository https://gitbox.apache.org/repos/asf/airflow.git The following commit(s) were added to refs/heads/main by this push: new 4a1503b Add recipe for BeamRunGoPipelineOperator (#22296) 4a1503b is described below commit 4a1503b39b0aaf50940c29ac886c6eeda35a79ff Author: pierrejeambrun AuthorDate: Thu Mar 17 04:57:22 2022 +0100 Add recipe for BeamRunGoPipelineOperator (#22296) --- airflow/providers/apache/beam/hooks/beam.py| 10 +- .../docker-images-recipes/go-beam.Dockerfile | 37 ++ docs/docker-stack/recipes.rst | 20 tests/providers/apache/beam/hooks/test_beam.py | 21 +++- 4 files changed, 86 insertions(+), 2 deletions(-) diff --git a/airflow/providers/apache/beam/hooks/beam.py b/airflow/providers/apache/beam/hooks/beam.py index 9be1a75..0644e02 100644 --- a/airflow/providers/apache/beam/hooks/beam.py +++ b/airflow/providers/apache/beam/hooks/beam.py @@ -20,12 +20,13 @@ import json import os import select import shlex +import shutil import subprocess import textwrap from tempfile import TemporaryDirectory from typing import Callable, List, Optional -from airflow.exceptions import AirflowException +from airflow.exceptions import AirflowConfigException, AirflowException from airflow.hooks.base import BaseHook from airflow.providers.google.go_module_utils import init_module, install_dependencies from airflow.utils.log.logging_mixin import LoggingMixin @@ -307,6 +308,13 @@ class BeamHook(BaseHook): source with GCSHook. :return: """ +if shutil.which("go") is None: +raise AirflowConfigException( +"You need to have Go installed to run beam go pipeline. See https://go.dev/doc/install " +"installation guide. If you are running airflow in Docker see more info at " +"'https://airflow.apache.org/docs/docker-stack/recipes.html'." 
+) + if "labels" in variables: variables["labels"] = json.dumps(variables["labels"], separators=(",", ":")) diff --git a/docs/docker-stack/docker-images-recipes/go-beam.Dockerfile b/docs/docker-stack/docker-images-recipes/go-beam.Dockerfile new file mode 100644 index 000..b224fe1 --- /dev/null +++ b/docs/docker-stack/docker-images-recipes/go-beam.Dockerfile @@ -0,0 +1,37 @@ +# Licensed to the Apache Software Foundation (ASF) under one or more +# contributor license agreements. See the NOTICE file distributed with +# this work for additional information regarding copyright ownership. +# The ASF licenses this file to You under the Apache License, Version 2.0 +# (the "License"); you may not use this file except in compliance with +# the License. You may obtain a copy of the License at +# +#http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ARG BASE_AIRFLOW_IMAGE +FROM ${BASE_AIRFLOW_IMAGE} + +SHELL ["/bin/bash", "-o", "pipefail", "-e", "-u", "-x", "-c"] + +USER 0 + +ARG GO_VERSION=1.16.4 +ENV GO_INSTALL_DIR=/usr/local/go + +# Install Go +RUN if [[ "$(uname -a)" = *"x86_64"* ]] ; then export ARCH=amd64 ; else export ARCH=arm64 ; fi \ +&& DOWNLOAD_URL="https://dl.google.com/go/go${GO_VERSION}.linux-${ARCH}.tar.gz"; \ +&& TMP_DIR="$(mktemp -d)" \ +&& curl -fL "${DOWNLOAD_URL}" --output "${TMP_DIR}/go.linux-${ARCH}.tar.gz" \ +&& mkdir -p "${GO_INSTALL_DIR}" \ +&& tar xzf "${TMP_DIR}/go.linux-${ARCH}.tar.gz" -C "${GO_INSTALL_DIR}" --strip-components=1 \ +&& rm -rf "${TMP_DIR}" + +ENV GOROOT=/usr/local/go +ENV PATH="$GOROOT/bin:$PATH" + +USER ${AIRFLOW_UID} diff --git a/docs/docker-stack/recipes.rst b/docs/docker-stack/recipes.rst index a1c5777..1d258ab 100644 --- a/docs/docker-stack/recipes.rst +++ b/docs/docker-stack/recipes.rst @@ -70,3 +70,23 @@ Then build a new image. --pull \ --build-arg BASE_AIRFLOW_IMAGE="apache/airflow:2.0.2" \ --tag my-airflow-image:0.0.1 + +Apache Beam Go Stack installation +- + +To be able to run Beam Go Pipeline with the :class:`~airflow.providers.apache.beam.operators.beam.BeamRunGoPipelineOperator`, +you will need Go in your container. Install airflow with ``apache-airflow-providers-google>=6.5.0`` and ``apache-airflow-providers-apache-beam>=3.2.0`` + +Create a new Dockerfile like the one shown below. + +.. exampleinclude:: /docker-images-recipes/go-beam.Dockerfile +:language: dockerfile + +Then build a new image. + +.. code-block:: bash + + docker build . \ +--pull \ +--build-arg BASE_AIRFLOW_
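The Python-side change in this commit is a fail-fast check with `shutil.which` before shelling out to `go`. The pattern generalizes; in this sketch, `require_binary` is an invented helper name, not part of Airflow:

```python
import shutil

# Generalized sketch of the pattern this commit adds for `go`:
# fail fast with an actionable message when a required CLI tool is missing.
# `require_binary` is an invented helper name, not part of Airflow.
def require_binary(name: str, hint: str) -> None:
    if shutil.which(name) is None:
        raise RuntimeError(f"{name!r} not found on PATH. {hint}")

require_binary("sh", "Install a POSIX shell.")  # present on any Unix-like host
```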
[GitHub] [airflow] mik-laj merged pull request #22296: Add recipe for BeamRunGoPipelineOperator
mik-laj merged pull request #22296: URL: https://github.com/apache/airflow/pull/22296 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] mik-laj closed issue #21545: Add Go to docker images
mik-laj closed issue #21545: URL: https://github.com/apache/airflow/issues/21545 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] github-actions[bot] commented on pull request #22296: Add recipe for BeamRunGoPipelineOperator
github-actions[bot] commented on pull request #22296: URL: https://github.com/apache/airflow/pull/22296#issuecomment-1070279820 The PR is likely OK to be merged with just subset of tests for default Python and Database versions without running the full matrix of tests, because it does not modify the core of Airflow. If the committers decide that the full tests matrix is needed, they will add the label 'full tests needed'. Then you should rebase to the latest main or amend the last commit of the PR, and push it with --force-with-lease. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] pohek321 opened a new issue #22330: Add functionality to DatabricksHook to create notebooks
pohek321 opened a new issue #22330: URL: https://github.com/apache/airflow/issues/22330 ### Description Currently, there is no way to programmatically create a notebook in Databricks using the core provider components ([DatabricksHook](https://registry.astronomer.io/providers/databricks/modules/databrickshook), [DatabricksRunNowOperator](https://registry.astronomer.io/providers/databricks/modules/databricksrunnowoperator), or [DatabricksSubmitRunOperator](https://registry.astronomer.io/providers/databricks/modules/databrickssubmitrunoperator)). If implemented, this change would allow Airflow users to create a Scala, R, Python, or SQL notebook in DBFS programmatically from Airflow. ### Use case/motivation This would be useful for our users who are hosting their notebooks in an Airflow repository and would like to utilize advantages of tools like jinja templating, orchestration, git version control, etc. ### Related issues _No response_ ### Are you willing to submit a PR? - [X] Yes I am willing to submit a PR! ### Code of Conduct - [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
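A hypothetical sketch of what such a hook method might send: the field names below follow the public Databricks Workspace API (`POST /api/2.0/workspace/import`), but `build_import_payload` is an invented helper and no such method exists in `DatabricksHook` today:

```python
import base64
import json

# Hypothetical payload builder for the Databricks Workspace import endpoint
# (`POST /api/2.0/workspace/import`); field names follow the public API,
# the helper itself is illustrative only.
def build_import_payload(path: str, source: str, language: str = "PYTHON") -> dict:
    return {
        "path": path,              # DBFS workspace path for the new notebook
        "format": "SOURCE",
        "language": language,      # PYTHON, SCALA, R, or SQL
        "overwrite": True,
        "content": base64.b64encode(source.encode()).decode(),  # API expects base64
    }

payload = build_import_payload("/Shared/example", "print('hello')")
print(json.dumps(payload, indent=2))
```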
[GitHub] [airflow] Madaditya commented on issue #20110: CORS access_control_allow_origin header never returned
Madaditya commented on issue #20110: URL: https://github.com/apache/airflow/issues/20110#issuecomment-1070231488 Was anybody able to find a fix for the CORS issue with the Airflow API? We are facing a similar issue and wondered if it was specific to v2.2.2. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] bskim45 commented on issue #21072: manage_sla firing notifications for the same sla miss instances repeatedly
bskim45 commented on issue #21072: URL: https://github.com/apache/airflow/issues/21072#issuecomment-1070181632 I'm experiencing a similar, somewhat related issue. When the `sla` argument is provided but no SLA miss email is sent and no `sla_miss_callback` is specified, SlaMiss entries pile up in the `sla_miss` table with `notification_sent=false`. This causes `DAGFileProcessor.manage_slas` to time out during callback processing. Quick example:

```python
import datetime

from airflow import DAG
from airflow.operators.dummy import DummyOperator
from airflow.utils.dates import days_ago

with DAG(
    dag_id="example_dag",
    schedule_interval='@hourly',
    start_date=days_ago(1),
    catchup=False,
) as dag:
    dummy_task = DummyOperator(
        task_id='dummy_task',
        sla=datetime.timedelta(hours=18),
    )
```

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
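For anyone wiring one up: a minimal `sla_miss_callback` can be sketched as below. The five-argument signature follows Airflow's documented callback contract; the callback body and the fake DAG object are purely illustrative:

```python
from types import SimpleNamespace

# Hedged sketch of a minimal `sla_miss_callback`. The five-argument signature
# follows Airflow's documented contract; the rest is illustrative.
def record_sla_miss(dag, task_list, blocking_task_list, slas, blocking_tis):
    # A real callback would push this summary to Slack, PagerDuty, etc.
    return f"SLA missed in {dag.dag_id}: {task_list}"

fake_dag = SimpleNamespace(dag_id="example_dag")  # stand-in for a real DAG object
print(record_sla_miss(fake_dag, "dummy_task", [], [], []))
# -> SLA missed in example_dag: dummy_task
```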
[jira] [Commented] (AIRFLOW-5071) Thousand os Executor reports task instance X finished (success) although the task says its queued. Was the task killed externally?
[ https://issues.apache.org/jira/browse/AIRFLOW-5071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17507936#comment-17507936 ] ASF GitHub Bot commented on AIRFLOW-5071: - kenny813x201 commented on issue #10790: URL: https://github.com/apache/airflow/issues/10790#issuecomment-1069820469 We also got the same error message. In our case, it turned out that we were using the same variable name for different DAGs. Changing the DAGs from `as dag` to, for example, `as dag1` and `as dag2` solved the issue for us.

```python
with DAG(
    "dag_name",
) as dag:
    ...
```

> Thousands of "Executor reports task instance X finished (success) although the task says its queued. Was the task killed externally?"
>
> Key: AIRFLOW-5071
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5071
> Project: Apache Airflow
> Issue Type: Bug
> Components: DAG, scheduler
> Affects Versions: 1.10.3
> Reporter: msempere
> Priority: Critical
> Fix For: 1.10.12
> Attachments: image-2020-01-27-18-10-29-124.png, image-2020-07-08-07-58-42-972.png
>
> I'm opening this issue because since I updated to 1.10.3 I'm seeing thousands of daily messages like the following in the logs:
>
> ```
> {{__init__.py:1580}} ERROR - Executor reports task instance <TaskInstance: ... 2019-07-29 00:00:00+00:00 [queued]> finished (success) although the task says its queued. Was the task killed externally?
> {{jobs.py:1484}} ERROR - Executor reports task instance <TaskInstance: ... 2019-07-29 00:00:00+00:00 [queued]> finished (success) although the task says its queued. Was the task killed externally?
> ```
> -And it looks like this is also triggering thousands of daily emails, because the flag to send email in case of failure is set to True.-
> I have Airflow set up to use Celery and Redis as a backend queue service.
-- This message was sent by Atlassian Jira (v8.20.1#820001)
[GitHub] [airflow] kenny813x201 commented on issue #10790: Copy of [AIRFLOW-5071] JIRA: Thousands of Executor reports task instance X finished (success) although the task says its queued. Was the task
kenny813x201 commented on issue #10790: URL: https://github.com/apache/airflow/issues/10790#issuecomment-1069820469 We also got the same error message. In our case, it turned out that we were using the same variable name for different DAGs. Changing the DAGs from `as dag` to, for example, `as dag1` and `as dag2` solved the issue for us.

```python
with DAG(
    "dag_name",
) as dag:
    ...
```
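The root cause above boils down to two files registering a DAG under the same `dag_id` (`"dag_name"` in both), so one silently overwrites the other in the scheduler's view. An Airflow-free guard like this in CI can catch the collision early (illustrative only — the function name is not part of Airflow):

```python
from collections import Counter


def find_duplicate_dag_ids(dag_ids):
    """Return, sorted, any dag_id that appears more than once."""
    return sorted(i for i, n in Counter(dag_ids).items() if n > 1)
```

A CI step could collect `dag.dag_id` from every parsed module and fail the build if this returns a non-empty list.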
[GitHub] [airflow] EricGao888 commented on pull request #21961: Add support for Alibaba-Cloud EMR cluster template (#21957)
EricGao888 commented on pull request #21961: URL: https://github.com/apache/airflow/pull/21961#issuecomment-1069771576
> @EricGao888 I marked the PR as draft. When finished please convert to ready so we can review.

Sure, thanks!
[GitHub] [airflow] boring-cyborg[bot] commented on issue #22328: bigquery provider's - BigQueryCursor missing implementation for description property.
boring-cyborg[bot] commented on issue #22328: URL: https://github.com/apache/airflow/issues/22328#issuecomment-1069764906 Thanks for opening your first issue here! Be sure to follow the issue template!
[GitHub] [airflow] utkarsharma2 opened a new issue #22328: bigquery provider's - BigQueryCursor missing implementation for description property.
utkarsharma2 opened a new issue #22328: URL: https://github.com/apache/airflow/issues/22328

### Apache Airflow version
2.2.4 (latest released)

### What happened
When trying to run the following code:

```python
import pandas as pd
from airflow.providers.google.cloud.hooks.bigquery import BigQueryHook

# using the default connection
hook = BigQueryHook()
df = pd.read_sql(
    "SELECT * FROM table_name",
    con=hook.get_conn(),
)
```

I run into the following issue:

```
Traceback (most recent call last):
  File "/Applications/PyCharm CE.app/Contents/plugins/python-ce/helpers/pydev/_pydevd_bundle/pydevd_exec2.py", line 3, in Exec
    exec(exp, global_vars, local_vars)
  File "", line 1, in
  File "/Users/utkarsharma/sandbox/astronomer/astro/.nox/dev/lib/python3.8/site-packages/pandas/io/sql.py", line 602, in read_sql
    return pandas_sql.read_query(
  File "/Users/utkarsharma/sandbox/astronomer/astro/.nox/dev/lib/python3.8/site-packages/pandas/io/sql.py", line 2117, in read_query
    columns = [col_desc[0] for col_desc in cursor.description]
  File "/Users/utkarsharma/sandbox/astronomer/astro/.nox/dev/lib/python3.8/site-packages/airflow/providers/google/cloud/hooks/bigquery.py", line 2599, in description
    raise NotImplementedError
NotImplementedError
```

### What you think should happen instead
The property should be implemented in a similar manner to [postgres_to_gcs.py](https://github.com/apache/airflow/blob/7bd165fbe2cbbfa8208803ec352c5d16ca2bd3ec/airflow/providers/google/cloud/transfers/postgres_to_gcs.py#L58)

### How to reproduce
_No response_

### Operating System
macOS

### Versions of Apache Airflow Providers
apache-airflow-providers-google==6.5.0

### Deployment
Virtualenv installation

### Deployment details
_No response_

### Anything else
_No response_

### Are you willing to submit PR?
- [X] Yes I am willing to submit a PR!

### Code of Conduct
- [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
[GitHub] [airflow] github-actions[bot] closed pull request #18759: Movin trigger dag to operations folder
github-actions[bot] closed pull request #18759: URL: https://github.com/apache/airflow/pull/18759
[GitHub] [airflow] github-actions[bot] commented on issue #8687: Airflow webserver should handle signals immediately when starting
github-actions[bot] commented on issue #8687: URL: https://github.com/apache/airflow/issues/8687#issuecomment-1069763273 This issue has been closed because it has not received a response from the issue author.
[GitHub] [airflow] github-actions[bot] commented on issue #19158: Macros time operations not working
github-actions[bot] commented on issue #19158: URL: https://github.com/apache/airflow/issues/19158#issuecomment-1069763182 This issue has been closed because it has not received a response from the issue author.
[GitHub] [airflow] github-actions[bot] closed issue #8687: Airflow webserver should handle signals immediately when starting
github-actions[bot] closed issue #8687: URL: https://github.com/apache/airflow/issues/8687
[GitHub] [airflow] github-actions[bot] closed pull request #21080: Fix wrong use of $ref and nullable
github-actions[bot] closed pull request #21080: URL: https://github.com/apache/airflow/pull/21080
[GitHub] [airflow] github-actions[bot] closed issue #19158: Macros time operations not working
github-actions[bot] closed issue #19158: URL: https://github.com/apache/airflow/issues/19158
[GitHub] [airflow] alexbegg commented on issue #22320: Copying DAG ID from UI and pasting in Slack includes schedule
alexbegg commented on issue #22320: URL: https://github.com/apache/airflow/issues/22320#issuecomment-1069750132
> I can confirm this indeed happens with paste (CMD + V). However, if you paste with CMD + SHIFT + V, it will behave as you expect.

Good catch on the CMD + SHIFT + V; I do that sometimes when I don't want to paste HTML. I'll keep that in mind. It would still be nice, though, if selecting the text selected just the DAG ID and not the other stuff.
[GitHub] [airflow] dstandish commented on a change in pull request #22184: Add run_id and map_index to SlaMiss
dstandish commented on a change in pull request #22184: URL: https://github.com/apache/airflow/pull/22184#discussion_r828511240 ## File path: airflow/dag_processing/processor.py

```diff
@@ -412,20 +410,20 @@ def manage_slas(self, dag: DAG, session: Session = None) -> None:
         else:
             while next_info.logical_date < ts:
                 next_info = dag.next_dagrun_info(next_info.data_interval, restricted=False)
                 if next_info is None:
                     break
-            if (ti.dag_id, ti.task_id, next_info.logical_date) in recorded_slas_query:
+            next_run_id = DR.generate_run_id(DagRunType.SCHEDULED, next_info.logical_date)
+            if (ti.dag_id, ti.task_id, next_run_id, ti.map_index) in recorded_sla_misses:
```

Review comment:
> the current SLA behaviour of creating the SlaMiss record against the next execution date is confusing (and likely wrong) so lets not confuse matters more by changing it to be against a future run_id that may never exist.

i think that the notion of "next execution date" is maybe a little misleading. it just means next, relative to the last one examined. so we're in `manage_slas`. we start with a task. we look at the "last successful or skipped TI". then we say, ok, let's look at the next run for that task -- relative to the last one that's done. and let's see if it's failed its SLA (e.g. cus it is still running). if so, let's create an SlaMiss for it. the only time that the TI would not exist is when the scheduler for whatever reason isn't even creating the TI -- e.g. because it's catchup=True, or max_active_tasks=1, or the scheduler is having trouble. `next_run_id` would probably be more accurately called `curr_run_id`. wdyt? for now i'll rename that variable.

> (i.e. throw away most of this PR, sorry.)

no worries if that's how it goes. we should do what we need to do, and it was a good exercise in any case. but i'm not convinced that it's not the right change.

> create a parse-time-error if someone tries to set an sla property on a mapped task.

might be better to just do a warning and ignore the SLA, because e.g. if you have a cluster policy putting a default SLA on everything, or if you apply it to all tasks in a dag with `default_args`, this could be a little inconvenient.
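For readers following the `run_id` discussion: a scheduled run_id is a pure function of the logical date, along the lines of `DagRun.generate_run_id(DagRunType.SCHEDULED, ...)` producing `"<run type>__<logical date ISO string>"` in Airflow 2.x. A standalone sketch of that derivation — treat the exact format as an implementation detail shown only to ground the discussion:

```python
from datetime import datetime, timezone


def scheduled_run_id(logical_date: datetime) -> str:
    """Derive a scheduled-run id from a logical date, Airflow-2.x style."""
    return f"scheduled__{logical_date.isoformat()}"
```

The point made in the review is that this id is computable for a run that may never actually be created.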
[GitHub] [airflow] ashb commented on a change in pull request #22184: Add run_id and map_index to SlaMiss
ashb commented on a change in pull request #22184: URL: https://github.com/apache/airflow/pull/22184#discussion_r828502785 ## File path: airflow/dag_processing/processor.py

```diff
@@ -412,20 +410,20 @@ def manage_slas(self, dag: DAG, session: Session = None) -> None:
         else:
             while next_info.logical_date < ts:
                 next_info = dag.next_dagrun_info(next_info.data_interval, restricted=False)
                 if next_info is None:
                     break
-            if (ti.dag_id, ti.task_id, next_info.logical_date) in recorded_slas_query:
+            next_run_id = DR.generate_run_id(DagRunType.SCHEDULED, next_info.logical_date)
+            if (ti.dag_id, ti.task_id, next_run_id, ti.map_index) in recorded_sla_misses:
```

Review comment: I.e. this dag should throw an error at parse time:

```python
with DAG(dag_id='test'):
    BashOperator.partial(sla=timedelta(seconds=10)).expand(bash_command=["echo true", "echo false"])
```
[GitHub] [airflow] ashb commented on a change in pull request #22184: Add run_id and map_index to SlaMiss
ashb commented on a change in pull request #22184: URL: https://github.com/apache/airflow/pull/22184#discussion_r828502055 ## File path: airflow/dag_processing/processor.py

```diff
@@ -412,20 +410,20 @@ def manage_slas(self, dag: DAG, session: Session = None) -> None:
         else:
             while next_info.logical_date < ts:
                 next_info = dag.next_dagrun_info(next_info.data_interval, restricted=False)
                 if next_info is None:
                     break
-            if (ti.dag_id, ti.task_id, next_info.logical_date) in recorded_slas_query:
+            next_run_id = DR.generate_run_id(DagRunType.SCHEDULED, next_info.logical_date)
+            if (ti.dag_id, ti.task_id, next_run_id, ti.map_index) in recorded_sla_misses:
```

Review comment: I think this behaviour is _even_ more confusing. Let's not touch SlaMiss _at all_, and instead create a parse-time-error if someone tries to set an `sla` property on a mapped task. (i.e. throw away most of this PR, sorry.) My reason: the current SLA behaviour of creating the SlaMiss record against the next execution date is confusing (and likely wrong) so lets not confuse matters more by changing it to be against a future run_id that _may never exist_. We can come back and re-visit this once we have made SlaMiss more sensible.
[GitHub] [airflow] github-actions[bot] commented on pull request #22272: Add map_index support to all task instance-related views
github-actions[bot] commented on pull request #22272: URL: https://github.com/apache/airflow/pull/22272#issuecomment-1069693314 The PR is likely OK to be merged with just a subset of tests for the default Python and database versions, without running the full matrix of tests, because it does not modify the core of Airflow. If the committers decide that the full test matrix is needed, they will add the label 'full tests needed'. Then you should rebase to the latest main or amend the last commit of the PR, and push it with --force-with-lease.
[GitHub] [airflow] ashb commented on a change in pull request #22272: Add map_index support to all task instance-related views
ashb commented on a change in pull request #22272: URL: https://github.com/apache/airflow/pull/22272#discussion_r828494223 ## File path: airflow/www/static/js/dag.js

```diff
@@ -151,46 +163,41 @@ export function callModal(t, d, extraLinks, tryNumbers, sd, drID) {
   $('#try_index > li').remove();
   $('#redir_log_try_index > li').remove();
   const startIndex = (tryNumbers > 2 ? 0 : 1);
-  for (let index = startIndex; index < tryNumbers; index += 1) {
-    let url = `${logsWithMetadataUrl
-    }?dag_id=${encodeURIComponent(dagId)
-    }&task_id=${encodeURIComponent(taskId)
-    }&execution_date=${encodeURIComponent(executionDate)
-    }&metadata=null`
-      + '&format=file';
```

Review comment: Whoops, lost this. One moment
[GitHub] [airflow] edithturn opened a new pull request #22327: Rewrite Selective Check in Python
edithturn opened a new pull request #22327: URL: https://github.com/apache/airflow/pull/22327 closes #19971
[GitHub] [airflow] dstandish opened a new pull request #22326: Remove incorrect deprecation warning in secrets backend
dstandish opened a new pull request #22326: URL: https://github.com/apache/airflow/pull/22326 When no value is found with `get_conn_value`, the warning was being triggered even though `get_conn_value` was implemented and just returned no value (cus there wasn't one). Now we make the logic a little tighter and only raise the deprecation warning when `get_conn_value` is not implemented, which is what we intended to do in the first place.
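The fix described above amounts to asking "did the subclass override the method?" rather than "did the method return a value?". A minimal, self-contained sketch of that check — class names here are illustrative, not Airflow's actual secrets-backend classes:

```python
class BaseBackend:
    """Stand-in for a base secrets backend."""

    def get_conn_value(self, conn_id):
        return None


def uses_inherited_get_conn_value(backend) -> bool:
    """True only when the backend did NOT override get_conn_value."""
    # Compare the attribute on the instance's class with the base class's:
    # identity holds iff the subclass inherited the base implementation.
    return type(backend).get_conn_value is BaseBackend.get_conn_value


class CustomBackend(BaseBackend):
    def get_conn_value(self, conn_id):
        return None  # implemented, legitimately finds nothing
```

With this check, `CustomBackend` returning `None` no longer looks like a missing implementation.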
[GitHub] [airflow] ashb commented on a change in pull request #22272: Add map_index support to al task instance-related views
ashb commented on a change in pull request #22272: URL: https://github.com/apache/airflow/pull/22272#discussion_r828471044 ## File path: airflow/www/static/js/dag.js

```diff
@@ -53,6 +53,7 @@
 let taskId = '';
 let executionDate = '';
 let subdagId = '';
 let dagRunId = '';
+let mapIndex = undefined;
```

Review comment: I'm on this.
[GitHub] [airflow] pierrejeambrun commented on pull request #22296: Add recipe for BeamRunGoPipelineOperator
pierrejeambrun commented on pull request #22296: URL: https://github.com/apache/airflow/pull/22296#issuecomment-1069657325 @mik-laj Thank you for your comments. I just updated the PR, let me know what you think.
[GitHub] [airflow] pierrejeambrun commented on a change in pull request #22296: Add recipe for BeamRunGoPipelineOperator
pierrejeambrun commented on a change in pull request #22296: URL: https://github.com/apache/airflow/pull/22296#discussion_r828468900 ## File path: docs/docker-stack/docker-images-recipes/go-beam.Dockerfile

```dockerfile
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements.  See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License.  You may obtain a copy of the License at
#
#    http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
ARG BASE_AIRFLOW_IMAGE
FROM ${BASE_AIRFLOW_IMAGE}

SHELL ["/bin/bash", "-o", "pipefail", "-e", "-u", "-x", "-c"]

USER 0

ENV GO_INSTALL_DIR=/usr/local/go

# Install Go
RUN DOWNLOAD_URL="https://dl.google.com/go/go1.16.4.linux-amd64.tar.gz" \
    && TMP_DIR="$(mktemp -d)" \
    && curl -fL "${DOWNLOAD_URL}" --output "${TMP_DIR}/go.linux-amd64.tar.gz" \
    && mkdir -p "${GO_INSTALL_DIR}" \
    && tar xzf "${TMP_DIR}/go.linux-amd64.tar.gz" -C "${GO_INSTALL_DIR}" --strip-components=1 \
    && rm -rf "${TMP_DIR}"

ENV GOROOT=/usr/local/go
ENV PATH="$GOROOT/bin:$PATH"
```

Review comment: Good idea. I just added that check and a test as well to validate this behavior.
[GitHub] [airflow] pierrejeambrun commented on a change in pull request #22296: Add recipe for BeamRunGoPipelineOperator
pierrejeambrun commented on a change in pull request #22296: URL: https://github.com/apache/airflow/pull/22296#discussion_r828468492 ## File path: docs/docker-stack/docker-images-recipes/go-beam.Dockerfile

```dockerfile
ARG BASE_AIRFLOW_IMAGE
FROM ${BASE_AIRFLOW_IMAGE}

SHELL ["/bin/bash", "-o", "pipefail", "-e", "-u", "-x", "-c"]

USER 0

ENV GO_INSTALL_DIR=/usr/local/go

# Install Go
RUN DOWNLOAD_URL="https://dl.google.com/go/go1.16.4.linux-amd64.tar.gz" \
```

Review comment: Nice idea. I tweaked the run command a little bit. It feels a bit hacky; let me know if you have a better idea of how to do that, as I am not really familiar with multi-platform Docker images.
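One common approach to the multi-platform concern raised here (shown as an illustrative sketch only, not this PR's actual solution): BuildKit exposes a `TARGETARCH` build argument (`amd64`, `arm64`, ...) whose values happen to match Go's tarball naming, so the download URL can be derived from it instead of being hard-coded to one architecture:

```dockerfile
# Illustrative: select the Go tarball for the platform being built.
# TARGETARCH is provided automatically by BuildKit for multi-platform builds.
ARG TARGETARCH
RUN DOWNLOAD_URL="https://dl.google.com/go/go1.16.4.linux-${TARGETARCH}.tar.gz" \
    && TMP_DIR="$(mktemp -d)" \
    && curl -fL "${DOWNLOAD_URL}" --output "${TMP_DIR}/go.tar.gz" \
    && mkdir -p "${GO_INSTALL_DIR}" \
    && tar xzf "${TMP_DIR}/go.tar.gz" -C "${GO_INSTALL_DIR}" --strip-components=1 \
    && rm -rf "${TMP_DIR}"
```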
[GitHub] [airflow] pierrejeambrun commented on a change in pull request #22296: Add recipe for BeamRunGoPipelineOperator
pierrejeambrun commented on a change in pull request #22296: URL: https://github.com/apache/airflow/pull/22296#discussion_r828467003 ## File path: docs/docker-stack/docker-images-recipes/go-beam.Dockerfile

```dockerfile
ARG BASE_AIRFLOW_IMAGE
FROM ${BASE_AIRFLOW_IMAGE}

SHELL ["/bin/bash", "-o", "pipefail", "-e", "-u", "-x", "-c"]

USER 0

ENV GO_INSTALL_DIR=/usr/local/go

# Install Go
RUN DOWNLOAD_URL="https://dl.google.com/go/go1.16.4.linux-amd64.tar.gz" \
```

Review comment: You are right, good catch. Done :)
[GitHub] [airflow] mik-laj edited a comment on issue #22306: DataflowStartFlexTemplateOperator is missing the impersonation_chain argument
mik-laj edited a comment on issue #22306: URL: https://github.com/apache/airflow/issues/22306#issuecomment-1069648241 We should make a similar change to: https://github.com/apache/airflow/pull/19518#discussion_r748632907 (CC: @lwyszomi )
[GitHub] [airflow] mik-laj commented on issue #22306: DataflowStartFlexTemplateOperator is missing the impersonation_chain argument
mik-laj commented on issue #22306: URL: https://github.com/apache/airflow/issues/22306#issuecomment-1069648241 This problem has been documented here: https://airflow.apache.org/docs/apache-airflow-providers-google/stable/connections/gcp.html#direct-impersonation-of-a-service-account Here's a longer discussion on this issue: https://github.com/apache/airflow/pull/19518#discussion_r748632907 (CC: @lwyszomi )
[GitHub] [airflow] ianbuss commented on a change in pull request #22051: Support glob syntax in .airflowignore files (#21392)
ianbuss commented on a change in pull request #22051: URL: https://github.com/apache/airflow/pull/22051#discussion_r828447783 ## File path: airflow/config_templates/config.yml

```yaml
      type: string
      example: ~
      default: "True"
    - name: dag_ignorefile_syntax
      description: |
        The pattern syntax used in the ".airflowignore" files in the DAG directories. Valid values are
        ``regexp`` or ``glob``.
      version_added: 2.3.0
      type: string
      example: ~
      default: "regexp"
```

Review comment: Done
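For context, with `dag_ignorefile_syntax = glob` an `.airflowignore` file uses `.gitignore`-style patterns instead of regular expressions. The entries below are illustrative only:

```
# .airflowignore with [core] dag_ignorefile_syntax = glob (Airflow 2.3+):
# patterns behave like .gitignore entries rather than regexes.
venv/
helpers/**
*_backup.py
```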
[GitHub] [airflow] josh-fell commented on a change in pull request #22280: Add links for BigQuery Data Transfer
josh-fell commented on a change in pull request #22280: URL: https://github.com/apache/airflow/pull/22280#discussion_r827246403

## File path: airflow/providers/google/cloud/links/bigquery_dts.py

```diff
@@ -0,0 +1,49 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+"""This module contains Google BigQuery Data Transfer links."""
+from typing import TYPE_CHECKING, Optional
+
+from airflow.providers.google.cloud.links.base import BaseGoogleLink
+
+if TYPE_CHECKING:
+    from airflow.utils.context import Context
+
+BIGQUERY_BASE_LINK = "https://console.cloud.google.com/bigquery/transfers"
+BIGQUERY_DTS_LINK = BIGQUERY_BASE_LINK + "/locations/{region}/configs/{config_id}/runs?project={project_id}"
+
+
+class BigQueryDataTransferConfigLink(BaseGoogleLink):
+    """Helper class for constructing BigQuery Data Transfer Config Link"""
+
+    name = "BigQuery Data Transfer Config"
+    key = "bigquery_dts_config"
+    format_str = BIGQUERY_DTS_LINK
+
+    @staticmethod
+    def persist(
+        context: "Context",
+        task_instance,
+        region: Optional[str],
+        config_id: Optional[str],
+        project_id: Optional[str],
```

Review comment: These args don't seem to be optional. Looks like they will always be provided when `BigQueryDataTransferConfigLink.persist()` is called, but I could be missing something along the way here. I suspect the link value wouldn't be correct if these were missing too?

```suggestion
        region: str,
        config_id: str,
        project_id: str,
```
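To see why the reviewer's suspicion about the link value is warranted: if `None` actually reached `persist()`, `str.format` would silently interpolate the literal text `None` into the URL. A quick sketch, reusing the format string from the reviewed module:

```python
# The format string from the reviewed module, copied here for illustration.
BIGQUERY_BASE_LINK = "https://console.cloud.google.com/bigquery/transfers"
BIGQUERY_DTS_LINK = BIGQUERY_BASE_LINK + "/locations/{region}/configs/{config_id}/runs?project={project_id}"

# If the Optional[...] hints were taken at face value and None slipped
# through, str.format would happily produce a broken console URL
# (values below are made up):
broken = BIGQUERY_DTS_LINK.format(region=None, config_id="my-config", project_id="my-project")
print(broken)
```

Tightening the signature to plain `str` makes such a URL unrepresentable at the type-checking level, which is a second argument for the suggested change.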
[GitHub] [airflow] edithturn edited a comment on issue #19971: CI: Rewrite selective check script in Python
edithturn edited a comment on issue #19971: URL: https://github.com/apache/airflow/issues/19971#issuecomment-1004284910

> It's used in two places: ci.yml and build.yml:

The right file should be `build-images.yml` :) Closes: #19971
[GitHub] [airflow] carmoreno1 commented on issue #8468: DbApiHook: add chunksize to get_pandas_df parameters
carmoreno1 commented on issue #8468: URL: https://github.com/apache/airflow/issues/8468#issuecomment-1069620108

> There is a problem with `chunksize`, because `get_pandas_df` closes the connection on the first call of the returned Iterator[DataFrame].

Hi @imorales-mosaico, is there a workaround for `chunksize` when using Airflow?
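One commonly suggested workaround, sketched below, is to bypass `get_pandas_df` entirely and manage the DBAPI connection yourself (e.g. obtained via the hook's `get_conn()`), so that it stays open for the whole iteration. This is a hedged illustration of the pattern using the standard library's `sqlite3` as a stand-in database, not Airflow's own API:

```python
import sqlite3

def iter_chunks(conn, sql, chunksize):
    """Yield lists of rows, keeping the connection open until iteration finishes."""
    cur = conn.cursor()
    cur.execute(sql)
    while True:
        rows = cur.fetchmany(chunksize)
        if not rows:
            break
        yield rows

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(10)])

chunks = list(iter_chunks(conn, "SELECT x FROM t ORDER BY x", chunksize=4))
print([len(c) for c in chunks])  # [4, 4, 2]
conn.close()  # closed only after all chunks were consumed
```

The key point relative to the quoted bug report: the connection's lifetime is controlled by the caller, not closed inside the helper before the iterator is exhausted.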
[GitHub] [airflow] dylanbstorey opened a new issue #22325: ReST API : get_dag should return more than a simplified view of the dag
dylanbstorey opened a new issue #22325: URL: https://github.com/apache/airflow/issues/22325

### Description

The current response payload from https://airflow.apache.org/docs/apache-airflow/stable/stable-rest-api-ref.html#operation/get_dag is a useful but simple view of the state of a given DAG. However, it is missing some additional attributes that I feel would be useful for individuals/groups who choose to interact with Airflow primarily through the ReST interface.

### Use case/motivation

As part of a testing workflow we upload DAGs to a running Airflow instance and want to trigger an execution of the DAG after we know that the scheduler has updated it. We're currently automating this process through the ReST API, but `last_updated` is not exposed. This could be implemented from the dag_source endpoint: https://github.com/apache/airflow/blob/main/airflow/api_connexion/endpoints/dag_source_endpoint.py

### Related issues

_No response_

### Are you willing to submit a PR?

- [X] Yes I am willing to submit a PR!

### Code of Conduct

- [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
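To make the upload-then-trigger workflow concrete: assuming the requested `last_updated` field were added to the response (it is hypothetical today — that is exactly what this issue asks for), the client-side gate might be a predicate like this sketch:

```python
from datetime import datetime, timezone

def dag_refreshed(last_updated_iso: str, uploaded_at: datetime) -> bool:
    """Return True once the scheduler's reported parse time is at or after our upload time.

    `last_updated_iso` stands in for the hypothetical `last_updated` field
    of the GET /dags/{dag_id} response.
    """
    last_updated = datetime.fromisoformat(last_updated_iso)
    return last_updated >= uploaded_at

uploaded_at = datetime(2022, 3, 16, 12, 0, tzinfo=timezone.utc)
print(dag_refreshed("2022-03-16T12:05:00+00:00", uploaded_at))  # True  -> safe to trigger
print(dag_refreshed("2022-03-16T11:55:00+00:00", uploaded_at))  # False -> keep polling
```

A polling loop would call `GET /dags/{dag_id}` until this returns True, then POST a new DAG run.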
[GitHub] [airflow] boring-cyborg[bot] commented on issue #22325: ReST API : get_dag should return more than a simplified view of the dag
boring-cyborg[bot] commented on issue #22325: URL: https://github.com/apache/airflow/issues/22325#issuecomment-1069618240 Thanks for opening your first issue here! Be sure to follow the issue template!
[GitHub] [airflow] boring-cyborg[bot] commented on pull request #22324: Update ssh.py
boring-cyborg[bot] commented on pull request #22324: URL: https://github.com/apache/airflow/pull/22324#issuecomment-1069612287 Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about any anything please check our Contribution Guide (https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst) Here are some useful points: - Pay attention to the quality of your code (flake8, mypy and type annotations). Our [pre-commits]( https://github.com/apache/airflow/blob/main/STATIC_CODE_CHECKS.rst#prerequisites-for-pre-commit-hooks) will help you with that. - In case of a new feature add useful documentation (in docstrings or in `docs/` directory). Adding a new operator? Check this short [guide](https://github.com/apache/airflow/blob/main/docs/apache-airflow/howto/custom-operator.rst) Consider adding an example DAG that shows how users should use it. - Consider using [Breeze environment](https://github.com/apache/airflow/blob/main/BREEZE.rst) for testing locally, it’s a heavy docker but it ships with a working Airflow and a lot of integrations. - Be patient and persistent. It might take some time to get a review or get the final approval from Committers. - Please follow [ASF Code of Conduct](https://www.apache.org/foundation/policies/conduct) for all communication including (but not limited to) comments on Pull Requests, Mailing list and Slack. - Be sure to read the [Airflow Coding style]( https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst#coding-style-and-best-practices). Apache Airflow is a community-driven project and together we are making it better 🚀. In case of doubts contact the developers at: Mailing List: d...@airflow.apache.org Slack: https://s.apache.org/airflow-slack -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
[GitHub] [airflow] a246530 opened a new pull request #22324: Update ssh.py
a246530 opened a new pull request #22324: URL: https://github.com/apache/airflow/pull/22324 Incorrect logic for the self.allow_host_key_change warning regarding "Remote Identification Change is not verified". This was identified in https://github.com/apache/airflow/issues/9510
[GitHub] [airflow] ajbosco opened a new pull request #22323: enable optional subPath for dags volume mount
ajbosco opened a new pull request #22323: URL: https://github.com/apache/airflow/pull/22323 Users may have their DAGs on a volume that includes other things and as such want to use a `subPath` on the Volume Mount. This is similar to how the `gitSync` Volume Mount is setup. --- **^ Add meaningful description above** Read the **[Pull Request Guidelines](https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst#pull-request-guidelines)** for more information. In case of fundamental code change, Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)) is needed. In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x). In case of backwards incompatible changes please leave a note in [UPDATING.md](https://github.com/apache/airflow/blob/main/UPDATING.md).
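For illustration, here is a sketch of the container-spec fragment such a change would render, mirroring the existing `gitSync` pattern. Plain dicts with hypothetical names and paths — not the Helm chart's actual template output:

```python
# Hypothetical rendered volumeMount for the worker/scheduler container:
# the shared volume holds more than DAGs, so only its "repo/dags" folder
# is mounted at the DAG directory via subPath.
dags_volume_mount = {
    "name": "shared-volume",            # assumed volume name
    "mountPath": "/opt/airflow/dags",   # where Airflow expects DAG files
    "subPath": "repo/dags",             # only this folder of the volume is mounted
    "readOnly": True,
}

print(dags_volume_mount["subPath"])
```

Without `subPath`, the whole volume root would appear under `mountPath`, which is exactly the situation the PR description wants to avoid.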
[airflow] branch constraints-main updated: Updating constraints. Build id:1994689956
This is an automated email from the ASF dual-hosted git repository. github-bot pushed a commit to branch constraints-main in repository https://gitbox.apache.org/repos/asf/airflow.git The following commit(s) were added to refs/heads/constraints-main by this push: new 1ea6f9b Updating constraints. Build id:1994689956 1ea6f9b is described below commit 1ea6f9bf9b989df3f8ec1b40d8b16464cacedce3 Author: Automated GitHub Actions commit AuthorDate: Wed Mar 16 20:34:42 2022 + Updating constraints. Build id:1994689956 This update in constraints is automatically committed by the CI 'constraints-push' step based on HEAD of 'refs/heads/main' in 'apache/airflow' with commit sha 7bd165fbe2cbbfa8208803ec352c5d16ca2bd3ec. All tests passed in this build so we determined we can push the updated constraints. See https://github.com/apache/airflow/blob/main/README.md#installing-from-pypi for details. --- constraints-3.7.txt | 28 ++-- constraints-3.8.txt | 28 ++-- constraints-3.9.txt | 28 ++-- constraints-no-providers-3.7.txt | 4 ++-- constraints-no-providers-3.8.txt | 4 ++-- constraints-no-providers-3.9.txt | 4 ++-- constraints-source-providers-3.7.txt | 30 +++--- constraints-source-providers-3.8.txt | 30 +++--- constraints-source-providers-3.9.txt | 30 +++--- 9 files changed, 93 insertions(+), 93 deletions(-) diff --git a/constraints-3.7.txt b/constraints-3.7.txt index 33751e7..fe6a60e 100644 --- a/constraints-3.7.txt +++ b/constraints-3.7.txt @@ -1,5 +1,5 @@ # -# This constraints file was automatically generated on 2022-03-16T09:20:18Z +# This constraints file was automatically generated on 2022-03-16T20:18:24Z # via "eager-upgrade" mechanism of PIP. For the "main" branch of Airflow. # This variant of constraints install uses the HEAD of the branch version for 'apache-airflow' but installs # the providers from PIP-released packages at the moment of the constraint generation. 
@@ -298,7 +298,7 @@ httplib2==0.19.1 httpx==0.22.0 humanize==4.0.0 hvac==0.11.2 -identify==2.4.11 +identify==2.4.12 idna==3.3 ijson==3.1.4 imagesize==1.3.0 @@ -536,24 +536,24 @@ twine==3.8.0 typed-ast==1.5.2 types-Deprecated==1.2.5 types-Markdown==3.3.12 -types-PyMySQL==1.0.13 -types-PyYAML==6.0.4 +types-PyMySQL==1.0.14 +types-PyYAML==6.0.5 types-boto==2.49.9 types-certifi==2021.10.8.1 -types-croniter==1.0.7 +types-croniter==1.0.8 types-cryptography==3.3.18 types-docutils==0.18.0 -types-freezegun==1.1.6 -types-paramiko==2.8.16 +types-freezegun==1.1.7 +types-paramiko==2.8.17 types-protobuf==3.19.12 -types-python-dateutil==2.8.9 +types-python-dateutil==2.8.10 types-python-slugify==5.0.3 types-pytz==2021.3.5 -types-redis==4.1.17 -types-requests==2.27.12 -types-setuptools==57.4.10 +types-redis==4.1.18 +types-requests==2.27.13 +types-setuptools==57.4.11 types-six==1.16.12 -types-tabulate==0.8.5 +types-tabulate==0.8.6 types-termcolor==1.1.3 types-toml==0.10.4 types-urllib3==1.26.11 @@ -562,7 +562,7 @@ tzdata==2021.5 tzlocal==4.1 unicodecsv==0.14.1 uritemplate==3.0.1 -urllib3==1.26.8 +urllib3==1.26.9 userpath==1.8.0 vertica-python==1.0.3 vine==5.0.0 @@ -575,7 +575,7 @@ websocket-client==1.3.1 wrapt==1.14.0 xmltodict==0.12.0 yamllint==1.26.3 -yandexcloud==0.148.0 +yandexcloud==0.149.0 yarl==1.7.2 zeep==4.1.0 zenpy==2.0.24 diff --git a/constraints-3.8.txt b/constraints-3.8.txt index 712afc2..7294c16 100644 --- a/constraints-3.8.txt +++ b/constraints-3.8.txt @@ -1,5 +1,5 @@ # -# This constraints file was automatically generated on 2022-03-16T09:20:42Z +# This constraints file was automatically generated on 2022-03-16T20:29:10Z # via "eager-upgrade" mechanism of PIP. For the "main" branch of Airflow. # This variant of constraints install uses the HEAD of the branch version for 'apache-airflow' but installs # the providers from PIP-released packages at the moment of the constraint generation. 
@@ -299,7 +299,7 @@ httplib2==0.19.1 httpx==0.22.0 humanize==4.0.0 hvac==0.11.2 -identify==2.4.11 +identify==2.4.12 idna==3.3 ijson==3.1.4 imagesize==1.3.0 @@ -538,24 +538,24 @@ trino==0.310.0 twine==3.8.0 types-Deprecated==1.2.5 types-Markdown==3.3.12 -types-PyMySQL==1.0.13 -types-PyYAML==6.0.4 +types-PyMySQL==1.0.14 +types-PyYAML==6.0.5 types-boto==2.49.9 types-certifi==2021.10.8.1 -types-croniter==1.0.7 +types-croniter==1.0.8 types-cryptography==3.3.18 types-docutils==0.18.0 -types-freezegun==1.1.6 -types-paramiko==2.8.16 +types-freezegun==1.1.7 +types-paramiko==2.8.17 types-protobuf==3.19.12 -types-python-dateutil==2.8.9 +types-python-dateutil==2.8.10 types-python-slugify==5.0.3 types-pytz==2021.3.5 -types-redis==4.1.17 -types-requests==2.27.12 -types-setuptools==57.4.10 +types-redis==4.1.18 +types-requests==2.27.13 +types-setuptools==57.4.1
[GitHub] [airflow] eladkal commented on pull request #21961: Add support for Alibaba-Cloud EMR cluster template (#21957)
eladkal commented on pull request #21961: URL: https://github.com/apache/airflow/pull/21961#issuecomment-1069590428 @EricGao888 I marked the PR as draft. When finished please convert to ready so we can review.
[airflow] branch mapped-task-drawer updated (9b246fb -> 2566a74)
This is an automated email from the ASF dual-hosted git repository. bbovenzi pushed a change to branch mapped-task-drawer in repository https://gitbox.apache.org/repos/asf/airflow.git. from 9b246fb fix extra links, hide local TZ if UTC, add 2566a74 confirm mark task failed/success No new revisions were added by this update. Summary of changes: airflow/www/static/js/tree/api/index.js| 2 + ...seMarkFailedTask.js => useConfirmTaskChange.js} | 25 + .../details/content/taskInstance/ExtraLinks.jsx} | 67 ++-- .../js/tree/details/content/taskInstance/Logs.jsx | 121 + .../{TaskInstance.jsx => taskInstance/index.jsx} | 119 +++- .../taskActions/ActionButton.jsx | 0 .../{ => taskInstance}/taskActions/Clear.jsx | 2 +- .../taskInstance/taskActions/ConfirmDialog.jsx | 71 .../{ => taskInstance}/taskActions/MarkFailed.jsx | 39 ++- .../{ => taskInstance}/taskActions/MarkSuccess.jsx | 40 ++- .../content/{ => taskInstance}/taskActions/Run.jsx | 2 +- airflow/www/static/js/tree/details/index.jsx | 2 +- airflow/www/views.py | 64 +++ 13 files changed, 389 insertions(+), 165 deletions(-) copy airflow/www/static/js/tree/api/{useMarkFailedTask.js => useConfirmTaskChange.js} (65%) copy airflow/{ui/src/components/SectionNav.tsx => www/static/js/tree/details/content/taskInstance/ExtraLinks.jsx} (53%) create mode 100644 airflow/www/static/js/tree/details/content/taskInstance/Logs.jsx rename airflow/www/static/js/tree/details/content/{TaskInstance.jsx => taskInstance/index.jsx} (72%) rename airflow/www/static/js/tree/details/content/{ => taskInstance}/taskActions/ActionButton.jsx (100%) rename airflow/www/static/js/tree/details/content/{ => taskInstance}/taskActions/Clear.jsx (98%) create mode 100644 airflow/www/static/js/tree/details/content/taskInstance/taskActions/ConfirmDialog.jsx rename airflow/www/static/js/tree/details/content/{ => taskInstance}/taskActions/MarkFailed.jsx (66%) rename airflow/www/static/js/tree/details/content/{ => taskInstance}/taskActions/MarkSuccess.jsx (66%) rename 
airflow/www/static/js/tree/details/content/{ => taskInstance}/taskActions/Run.jsx (97%)
[GitHub] [airflow] kaxil commented on issue #19049: Migrate execution_date database columns to logical_date
kaxil commented on issue #19049: URL: https://github.com/apache/airflow/issues/19049#issuecomment-1069569440 Is this still pending @uranusjr ?
[GitHub] [airflow] dstandish commented on pull request #19857: Enable json serialization for secrets backend
dstandish commented on pull request #19857: URL: https://github.com/apache/airflow/pull/19857#issuecomment-1069542431 got it @eladkal i think i see the issue, and i'll work on a fix today
[airflow] branch main updated: Remove RefreshConfiguration workaround for K8s token refreshing (#20759)
This is an automated email from the ASF dual-hosted git repository. dstandish pushed a commit to branch main in repository https://gitbox.apache.org/repos/asf/airflow.git The following commit(s) were added to refs/heads/main by this push: new 7bd165f Remove RefreshConfiguration workaround for K8s token refreshing (#20759) 7bd165f is described below commit 7bd165fbe2cbbfa8208803ec352c5d16ca2bd3ec Author: Daniel Standish <15932138+dstand...@users.noreply.github.com> AuthorDate: Wed Mar 16 12:33:01 2022 -0700 Remove RefreshConfiguration workaround for K8s token refreshing (#20759) A workaround was added (https://github.com/apache/airflow/pull/5731) to handle the refreshing of EKS tokens. It was necessary because of an upstream bug. It has since been fixed (https://github.com/kubernetes-client/python-base/commit/70b78cd8488068c014b6d762a0c8d358273865b4) and released in v21.7.0 (https://github.com/kubernetes-client/python/blob/master/CHANGELOG.md#v2170). --- UPDATING.md| 4 + airflow/kubernetes/kube_client.py | 47 ++-- airflow/kubernetes/refresh_config.py | 124 - .../providers/cncf/kubernetes/utils/pod_manager.py | 8 +- setup.py | 2 +- tests/kubernetes/test_client.py| 22 ++-- tests/kubernetes/test_refresh_config.py| 106 -- 7 files changed, 26 insertions(+), 287 deletions(-) diff --git a/UPDATING.md b/UPDATING.md index cda775f..c929ece 100644 --- a/UPDATING.md +++ b/UPDATING.md @@ -81,6 +81,10 @@ https://developers.google.com/style/inclusive-documentation --> +### Minimum kubernetes version bumped from 3.0.0 to 21.7.0 + +No change in behavior is expected. This was necessary in order to take advantage of a [bugfix](https://github.com/kubernetes-client/python-base/commit/70b78cd8488068c014b6d762a0c8d358273865b4) concerning refreshing of Kubernetes API tokens with EKS, which enabled the removal of some [workaround code](https://github.com/apache/airflow/pull/20759). 
+ ### Deprecation: `Connection.extra` must be JSON-encoded dict TLDR diff --git a/airflow/kubernetes/kube_client.py b/airflow/kubernetes/kube_client.py index 97836be..7e6ba05 100644 --- a/airflow/kubernetes/kube_client.py +++ b/airflow/kubernetes/kube_client.py @@ -25,39 +25,10 @@ log = logging.getLogger(__name__) try: from kubernetes import client, config from kubernetes.client import Configuration -from kubernetes.client.api_client import ApiClient from kubernetes.client.rest import ApiException -from airflow.kubernetes.refresh_config import RefreshConfiguration, load_kube_config - has_kubernetes = True -def _get_kube_config( -in_cluster: bool, cluster_context: Optional[str], config_file: Optional[str] -) -> Optional[Configuration]: -if in_cluster: -# load_incluster_config set default configuration with config populated by k8s -config.load_incluster_config() -return None -else: -# this block can be replaced with just config.load_kube_config once -# refresh_config module is replaced with upstream fix -cfg = RefreshConfiguration() -load_kube_config(client_configuration=cfg, config_file=config_file, context=cluster_context) -return cfg - -def _get_client_with_patched_configuration(cfg: Optional[Configuration]) -> client.CoreV1Api: -""" -This is a workaround for supporting api token refresh in k8s client. - -The function can be replace with `return client.CoreV1Api()` once the -upstream client supports token refresh. 
-""" -if cfg: -return client.CoreV1Api(api_client=ApiClient(configuration=cfg)) -else: -return client.CoreV1Api() - def _disable_verify_ssl() -> None: configuration = Configuration() configuration.verify_ssl = False @@ -126,17 +97,19 @@ def get_kube_client( if not has_kubernetes: raise _import_err -if not in_cluster: -if cluster_context is None: -cluster_context = conf.get('kubernetes', 'cluster_context', fallback=None) -if config_file is None: -config_file = conf.get('kubernetes', 'config_file', fallback=None) - if conf.getboolean('kubernetes', 'enable_tcp_keepalive'): _enable_tcp_keepalive() if not conf.getboolean('kubernetes', 'verify_ssl'): _disable_verify_ssl() -client_conf = _get_kube_config(in_cluster, cluster_context, config_file) -return _get_client_with_patched_configuration(client_conf) +if in_cluster: +config.load_incluster_config() +else: +if cluster_context is None: +cluster_context = conf.get('kubernetes', 'cluster_context', fallback=None) +if config_file is None: +
[GitHub] [airflow] dstandish merged pull request #20759: Remove RefreshConfiguration workaround for K8s token refreshing
dstandish merged pull request #20759: URL: https://github.com/apache/airflow/pull/20759
[GitHub] [airflow] eladkal commented on pull request #19857: Enable json serialization for secrets backend
eladkal commented on pull request #19857: URL: https://github.com/apache/airflow/pull/19857#issuecomment-1069533737

@dstandish example:

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator
from airflow.providers.amazon.aws.hooks.s3 import S3Hook

default_args = {
    'start_date': datetime(2022, 3, 15),
    'owner': 'airflow',
    'retries': 1,
}

def test_hook():
    s3_hook = S3Hook()
    s3_hook.load_file(filename='bla', key='blaa', bucket_name='fff')

dag = DAG(dag_id='my_dag', default_args=default_args)
py_test = PythonOperator(task_id='python_task', python_callable=test_hook, dag=dag)
```

It produces the warning (along with some other, unrelated warnings):

```
[2022-03-16, 19:26:09 UTC] {logging_mixin.py:115} WARNING - /opt/***/***/secrets/base_secrets.py:95 PendingDeprecationWarning: This method is deprecated. Please use `***.secrets.environment_variables.EnvironmentVariablesBackend.get_conn_value`.
[2022-03-16, 19:26:09 UTC] {logging_mixin.py:115} WARNING - /opt/***/***/models/connection.py:420 PendingDeprecationWarning: Method `get_conn_uri` is deprecated. Please use `get_conn_value`.
[2022-03-16, 19:26:11 UTC] {logging_mixin.py:115} WARNING - /opt/***/***/models/dag.py:1084 SADeprecationWarning: Query.value() is deprecated and will be removed in a future release. Please use Query.with_entities() in combination with Query.scalar() (deprecated since: 1.4)
```
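As an aside, deprecation warnings like the one quoted above can be pinned down deterministically with the standard `warnings` machinery. A minimal, self-contained sketch — `get_conn_uri` here is a stand-in that mimics the deprecated method, not Airflow's implementation:

```python
import warnings

def get_conn_uri():
    # Stand-in for the deprecated method: warn, then delegate.
    warnings.warn(
        "Method `get_conn_uri` is deprecated. Please use `get_conn_value`.",
        PendingDeprecationWarning,
        stacklevel=2,
    )
    return "postgresql://user:pass@host/db"

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # don't let the once-per-location filter hide repeats
    get_conn_uri()

print([str(w.message) for w in caught])
```

Running a test suite with `-W error::PendingDeprecationWarning` (or pytest's `filterwarnings`) turns every remaining internal call site into a failure, which is one way to flush out the references this comment mentions.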
[GitHub] [airflow] dstandish commented on pull request #19857: Enable json serialization for secrets backend
dstandish commented on pull request #19857: URL: https://github.com/apache/airflow/pull/19857#issuecomment-1069480919 ok thanks @eladkal -- i can take a look at this. can you give me any examples of where you are seeing this?
[GitHub] [airflow] eladkal commented on issue #22320: Copying DAG ID from UI and pasting in Slack includes schedule
eladkal commented on issue #22320: URL: https://github.com/apache/airflow/issues/22320#issuecomment-1069467356 I can confirm this indeed happens with paste (CMD + V). However, if you paste with CMD + SHIFT + V, it behaves as you expect.
[GitHub] [airflow] alexbegg opened a new issue #22320: Copying DAG ID from UI and pasting in Slack includes schedule
alexbegg opened a new issue #22320: URL: https://github.com/apache/airflow/issues/22320 ### Apache Airflow version 2.2.3 ### What happened (Yes, I know the title says Slack and it might not seem like an Airflow issue, but so far this is the only application I noticed this on. There might be others.) PR https://github.com/apache/airflow/pull/11503 was a fix to issue https://github.com/apache/airflow/issues/11500 to prevent text-selection of scheduler interval when selecting DAG ID. However it does not fix pasting the text into certain applications (such as Slack), at least on a Mac. @ryanahamilton thanks for the fix, but this is fixed in the visible sense (double clicking the DAG ID to select it will now not show the schedule interval and next run as selected in the UI), however if you copy what is selected for some reason it still includes schedule interval and next run when pasted into certain applications. I can't be sure why this is happening, but certain places such as pasting into Google Chrome, TextEdit, or Visual Studio Code it will only include the DAG ID and a new line. 
But other applications such as Slack (so far only one I can tell) it includes the schedule interval and next run, as you can see below: - Schedule interval and next run **not shown as selected** on the DAG page: ![Screen Shot 2022-03-16 at 11 04 21 AM](https://user-images.githubusercontent.com/45696489/158659392-2df1f428-61e9-4785-be21-cdb1eda9ff6e.png) - Schedule interval and next run **not pasted** in Google Chrome and TextEdit: ![Screen Shot 2022-03-16 at 11 05 10 AM](https://user-images.githubusercontent.com/45696489/158659521-adc2be64-1b31-403f-8630-b36b40900b42.png) ![Screen Shot 2022-03-16 at 11 15 14 AM](https://user-images.githubusercontent.com/45696489/158659539-0c76c079-3b44-4846-b41e-9038689bb33d.png) - Schedule interval and next run **_pasted and visible_** in Slack: ![Screen Shot 2022-03-16 at 11 05 40 AM](https://user-images.githubusercontent.com/45696489/158659837-a57b0a57-306e-4ea2-9648-a4922d41c403.png) ### What you think should happen instead When you select the DAG ID on the DAG page, copy what is selected, and then paste into a Slack message, only the DAG ID should be pasted. ### How to reproduce Select the DAG ID on the DAG page (such as double-clicking the DAG ID), copy what is selected, and then paste into a Slack message. ### Operating System macOS 1.15.7 (Catalina) ### Versions of Apache Airflow Providers _No response_ ### Deployment Astronomer ### Deployment details _No response_ ### Anything else This is something that possibly could be a Slack bug (one could say that Slack should strip out anything that is `user-select: none`), however it should be possible to fix the HTML layout so `user-select: none` is not even needed to prevent selection. It is sort of a band-aid fix. ### Are you willing to submit PR? - [X] Yes I am willing to submit a PR! 
### Code of Conduct

- [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
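For anyone picking this up, a simplified sketch of the two approaches discussed in the issue. The markup below is hypothetical (the real Airflow template differs): the current fix hides the schedule from selection purely with CSS, which some paste targets — apparently Slack — do not honor when reading the clipboard's rich-text flavor, while the reporter's suggestion is to restructure the layout so the selection never contains the schedule at all:

```html
<!-- Band-aid (current): schedule sits inline next to the DAG ID and is
     only excluded from selection via CSS. Paste targets that consume the
     raw clipboard HTML may still see it. -->
<a class="dag-id">example_dag</a>
<span class="schedule" style="user-select: none">@daily, Next run: 2022-03-17</span>

<!-- Suggested direction: keep the schedule in a structurally separate
     element (e.g. its own table cell) so double-click selection of the
     DAG ID cannot extend into it, making user-select unnecessary. -->
<td><a class="dag-id">example_dag</a></td>
<td><span class="schedule">@daily</span></td>
```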
[GitHub] [airflow] boring-cyborg[bot] commented on issue #22317: Make pause DAG it's own role separate from edit DAG
boring-cyborg[bot] commented on issue #22317: URL: https://github.com/apache/airflow/issues/22317#issuecomment-1069421624 Thanks for opening your first issue here! Be sure to follow the issue template!
[GitHub] [airflow] boring-cyborg[bot] commented on issue #22318: KubernetesPodOperator xcom sidecar stuck in running
boring-cyborg[bot] commented on issue #22318: URL: https://github.com/apache/airflow/issues/22318#issuecomment-1069412224 Thanks for opening your first issue here! Be sure to follow the issue template!
[GitHub] [airflow] eladkal commented on pull request #19857: Enable json serialization for secrets backend
eladkal commented on pull request #19857: URL: https://github.com/apache/airflow/pull/19857#issuecomment-1069406427

I'm starting to see warnings raised about `get_conn_uri`:

`[2022-03-16, 18:06:38 UTC] {logging_mixin.py:115} WARNING - /opt/***/***/models/connection.py:420 PendingDeprecationWarning: Method `get_conn_uri` is deprecated. Please use `get_conn_value`.`

We have several references in the code to the deprecated method; we should fix them to avoid the warnings.
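The deprecation shim eladkal quotes follows a standard Python pattern: the old method warns and delegates to its replacement. A minimal self-contained sketch of that pattern, using a hypothetical `SecretsBackend` class rather than Airflow's actual `Connection` model:

```python
import warnings


class SecretsBackend:
    """Hypothetical backend illustrating the `get_conn_uri` -> `get_conn_value` handoff."""

    def get_conn_value(self, conn_id: str) -> str:
        # The replacement method: the only place real lookup logic lives.
        return f"value-for-{conn_id}"

    def get_conn_uri(self, conn_id: str) -> str:
        # Deprecated accessor: warn, then delegate so behavior stays identical.
        warnings.warn(
            "Method `get_conn_uri` is deprecated. Please use `get_conn_value`.",
            PendingDeprecationWarning,
            stacklevel=2,
        )
        return self.get_conn_value(conn_id)
```

Callers that still use `get_conn_uri` keep working but surface exactly the WARNING line shown in the log above; silencing it means updating the call sites, as the comment suggests.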
[GitHub] [airflow] pingzh commented on pull request #21877: AIP-45 Remove dag parsing in airflow run local
pingzh commented on pull request #21877: URL: https://github.com/apache/airflow/pull/21877#issuecomment-1069398672 hi @potiuk, AIP-45 is ready for review. Could you please take a look?
[GitHub] [airflow] eladkal commented on issue #16249: Fully functional DAG Dependencies Graph view
eladkal commented on issue #16249: URL: https://github.com/apache/airflow/issues/16249#issuecomment-1069385083 @ManiBharataraju do you have plans to add the functionality of your add-on to Airflow?
[GitHub] [airflow] o-nikolas commented on a change in pull request #22295: Add docs and example dag for AWS Glue
o-nikolas commented on a change in pull request #22295: URL: https://github.com/apache/airflow/pull/22295#discussion_r828284103

## File path: airflow/providers/amazon/aws/example_dags/example_glue.py

```python
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
#   http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
import tempfile
from datetime import datetime
from os import getenv

from airflow import DAG
from airflow.models.baseoperator import chain
from airflow.operators.python import PythonOperator
from airflow.providers.amazon.aws.hooks.s3 import S3Hook
from airflow.providers.amazon.aws.operators.glue import GlueJobOperator
from airflow.providers.amazon.aws.operators.glue_crawler import GlueCrawlerOperator
from airflow.providers.amazon.aws.sensors.glue import GlueJobSensor
from airflow.providers.amazon.aws.sensors.glue_crawler import GlueCrawlerSensor

GLUE_DATABASE_NAME = getenv('GLUE_DATABASE_NAME', 'glue_database_name')
GLUE_EXAMPLE_S3_BUCKET = getenv('GLUE_EXAMPLE_S3_BUCKET', 'glue_example_s3_bucket')

# Role needs putobject/getobject access to the above bucket as well as the glue
# service role, see docs here: https://docs.aws.amazon.com/glue/latest/dg/create-an-iam-role.html
GLUE_CRAWLER_ROLE = getenv('GLUE_CRAWLER_ROLE', 'glue_crawler_role')
GLUE_CRAWLER_NAME = 'example_crawler'
GLUE_CRAWLER_CONFIG = {
    'Name': GLUE_CRAWLER_NAME,
    'Role': GLUE_CRAWLER_ROLE,
    'DatabaseName': GLUE_DATABASE_NAME,
    'Targets': {
        'S3Targets': [
            {
                'Path': f'{GLUE_EXAMPLE_S3_BUCKET}/input',
            }
        ]
    },
}

# Example csv data used as input to the example AWS Glue Job.
EXAMPLE_CSV = '''
food,price
apple,0.5
milk,2.5
bread,4.0
'''

# Example Spark script to operate on the above sample csv data.
EXAMPLE_SCRIPT = f'''
from pyspark.context import SparkContext
from awsglue.context import GlueContext

glueContext = GlueContext(SparkContext.getOrCreate())
datasource = glueContext.create_dynamic_frame.from_catalog(
    database='{GLUE_DATABASE_NAME}', table_name='input')
print('There are %s items in the table' % datasource.count())

datasource.toDF().write.format('csv').mode("append").save('s3://{GLUE_EXAMPLE_S3_BUCKET}/output')
'''


def _upload_from_tmp_file_to_glue_bucket(contents: str, s3_key: str):
    '''Upload contents of string to S3 leveraging tempfiles to copy the data'''
    with tempfile.NamedTemporaryFile(mode='a') as tmp:
        tmp.write(contents.strip())
        tmp.seek(0)
        s3_hook = S3Hook()
        s3_hook.load_file(
            filename=tmp.name,
            key=s3_key,
            bucket_name=GLUE_EXAMPLE_S3_BUCKET,
            replace=True,
        )


with DAG(
    dag_id='example_glue',
    schedule_interval=None,
    start_date=datetime(2021, 1, 1),
    tags=['example'],
    catchup=False,
) as glue_dag:

    upload_csv = PythonOperator(
        task_id='upload_csv',
        python_callable=_upload_from_tmp_file_to_glue_bucket,
        op_kwargs={'contents': EXAMPLE_CSV, 's3_key': 'input/input.csv'},
    )
    upload_etl_script = PythonOperator(
        task_id='upload_etl_script',
        python_callable=_upload_from_tmp_file_to_glue_bucket,
        op_kwargs={'contents': EXAMPLE_SCRIPT, 's3_key': 'etl_script.py'},
    )
```

Review comment: Sure, folks, I will make the change :)
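The `_upload_from_tmp_file_to_glue_bucket` helper above stages a string through `tempfile.NamedTemporaryFile` so `S3Hook.load_file` can read it by path. That staging step can be sketched (and exercised) without S3; here a plain `open()` stands in for the hook, which is an assumption for illustration only. Note that reopening `tmp.name` by path works on POSIX; on Windows, `NamedTemporaryFile` would need `delete=False`.

```python
import tempfile


def stage_contents(contents: str) -> str:
    """Write a string to a named temp file, then read it back via its path,
    mimicking how the example DAG hands tmp.name to S3Hook.load_file."""
    with tempfile.NamedTemporaryFile(mode='a') as tmp:
        tmp.write(contents.strip())
        tmp.flush()  # ensure the bytes are on disk before a second reader opens the path
        with open(tmp.name) as reader:  # stand-in for S3Hook().load_file(filename=tmp.name, ...)
            return reader.read()
```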
[GitHub] [airflow] brettplarson opened a new issue #22317: Make pause DAG its own role separate from edit DAG
brettplarson opened a new issue #22317: URL: https://github.com/apache/airflow/issues/22317

### Description

As an Airflow administrator I would like to be able to configure a user account with permissions to "clear" and re-run a DAG run but _not_ to pause a DAG. Currently this doesn't appear to be possible, as `Toggle DAG paused status` and `Clear DAG Run` both come from the same `DAGs.can_edit` permission set.

### Use case/motivation

A failure such as an unexpected reboot may require a non-privileged user to restart a run, without needing a huge amount of permissions. It is **_really easy_** to click and pause a DAG when viewing / browsing in the UI; either cancelling an existing dag run or preventing further runs from proceeding can be an issue. This happened a few times and has caused issues, as the DAG was paused, preventing important ETLs from running at their scheduled times.

### Related issues

_No response_

### Are you willing to submit a PR?

- [ ] Yes I am willing to submit a PR!

### Code of Conduct

- [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
[GitHub] [airflow] eladkal commented on issue #18323: Detect and fail any command when db is not migrated
eladkal commented on issue #18323: URL: https://github.com/apache/airflow/issues/18323#issuecomment-1069383143 fixed in https://github.com/apache/airflow/pull/18439
[GitHub] [airflow] eladkal closed issue #18323: Detect and fail any command when db is not migrated
eladkal closed issue #18323: URL: https://github.com/apache/airflow/issues/18323
[GitHub] [airflow] potiuk commented on pull request #22310: Add generic connection type
potiuk commented on pull request #22310: URL: https://github.com/apache/airflow/pull/22310#issuecomment-1069365930 Yeah. I think it's good to have it to avoid confusion for users. We had the same problem with the email connection. This is really to give the right answer to users who look for it in the UI and don't realize they could use HTTP, for example.
[GitHub] [airflow] tnyz opened a new issue #22318: KubernetesPodOperator xcom sidecar stuck in running
tnyz opened a new issue #22318: URL: https://github.com/apache/airflow/issues/22318

### Apache Airflow version

2.2.4 (latest released)

### What happened

When the main container errors and fails to write a return.json file, the xcom sidecar hangs and doesn't exit properly with an empty return.json. This is a problem because we want to suppress the following error, as the reason the pod failed should not be that xcom failed.

```
[2022-03-16, 17:08:07 UTC] {pod_manager.py:342} INFO - Running command... cat /airflow/xcom/return.json
[2022-03-16, 17:08:07 UTC] {pod_manager.py:349} INFO - stderr from command: cat: can't open '/airflow/xcom/return.json': No such file or directory
[2022-03-16, 17:08:07 UTC] {pod_manager.py:342} INFO - Running command... kill -s SIGINT 1
[2022-03-16, 17:08:08 UTC] {kubernetes_pod.py:417} INFO - Deleting pod: test.20882a4c607d418d94e87231214d34c0
[2022-03-16, 17:08:08 UTC] {taskinstance.py:1718} ERROR - Task failed with exception
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/airflow/providers/cncf/kubernetes/operators/kubernetes_pod.py", line 385, in execute
    result = self.extract_xcom(pod=self.pod)
  File "/usr/local/lib/python3.8/site-packages/airflow/providers/cncf/kubernetes/operators/kubernetes_pod.py", line 360, in extract_xcom
    result = self.pod_manager.extract_xcom(pod)
  File "/usr/local/lib/python3.8/site-packages/airflow/providers/cncf/kubernetes/utils/pod_manager.py", line 337, in extract_xcom
    raise AirflowException(f'Failed to extract xcom from pod: {pod.metadata.name}')
airflow.exceptions.AirflowException: Failed to extract xcom from pod: test.20882a4c607d418d94e87231214d34c0
```

and have the KubernetesPodOperator exit gracefully.

### What you think should happen instead

The sidecar should exit with an empty xcom return value.

### How to reproduce

KubernetesPodOperator with command `mkdir -p /airflow/xcom;touch /airflow/xcom/return.json; cat a >> /airflow/xcom/return.json`

### Operating System

-

### Versions of Apache Airflow Providers

_No response_

### Deployment

Other Docker-based deployment

### Deployment details

_No response_

### Anything else

_No response_

### Are you willing to submit PR?

- [X] Yes I am willing to submit a PR!

### Code of Conduct

- [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
[GitHub] [airflow] eladkal commented on issue #11294: Bugs/Improvement proposals filed to GitHub Actions
eladkal commented on issue #11294: URL: https://github.com/apache/airflow/issues/11294#issuecomment-1069366984 @potiuk any further actions here? The tickets you opened with GitHub can be viewed only by you :)
[GitHub] [airflow] potiuk merged pull request #22310: Add generic connection type
potiuk merged pull request #22310: URL: https://github.com/apache/airflow/pull/22310
[airflow] branch main updated (df6058c -> 6d1d53b)
This is an automated email from the ASF dual-hosted git repository. potiuk pushed a change to branch main in repository https://gitbox.apache.org/repos/asf/airflow.git. from df6058c Disable default_pool delete on web ui (#21658) add 6d1d53b Add generic connection type (#22310) No new revisions were added by this update. Summary of changes: airflow/www/views.py | 1 + 1 file changed, 1 insertion(+)
[GitHub] [airflow] bbovenzi edited a comment on pull request #22272: Add map_index support to all task instance-related views
bbovenzi edited a comment on pull request #22272: URL: https://github.com/apache/airflow/pull/22272#issuecomment-1069354329

![Screen shot](https://user-images.githubusercontent.com/4600967/158645377-3d3502d8-8f8a-4532-b72e-240542aa1bde.png)

I think it would be useful to show the map index in the header when it isn't `-1`:

```
// task_instance.html
{% if map_index and map_index != -1 %}
  Map Index: {{ map_index }}
{% endif %}
```

or the task instance modal (I think this only applies to a modal triggered from the gantt view right now):

```
// dag.html
Map Index:
```

and

```
// dag.js
if (mi >= 0) {
  $('#show_map_index').show();
  $('#map_index').text(mi);
} else {
  $('#show_map_index').hide();
}
```
[GitHub] [airflow] dstandish commented on a change in pull request #20759: Remove RefreshConfiguration workaround for K8s token refreshing
dstandish commented on a change in pull request #20759: URL: https://github.com/apache/airflow/pull/20759#discussion_r828248873

## File path: setup.py

```
@@ -381,7 +381,7 @@ def write_version(filename: str = os.path.join(*[my_dir, "airflow", "git_version
 ]
 kubernetes = [
     'cryptography>=2.0.0',
-    'kubernetes>=3.0.0',
+    'kubernetes>=21.7.0',
```

Review comment: OK @mik-laj I made this change. I could not find any other instances like this. Any concerns with merging now?
[GitHub] [airflow] bbovenzi commented on pull request #22272: Add map_index support to all task instance-related views
bbovenzi commented on pull request #22272: URL: https://github.com/apache/airflow/pull/22272#issuecomment-1069351761

> * `/success` and `/failed` set state to a task (and potentially its upstreams/downstreams). It feels a bit weird to be able to set the state to only one mapped task? Also upstream/downstream makes less sense if any of those are mapped. So I’m currently leaving these views alone (i.e. setting success/failed to a mapped task sets the state of all mapped tis). We can perhaps add some UI components to optionally do more fine-grained state modification.

I think leaving it alone is fine, as long as the confirmation screen still works. We will want to select individual mapped tasks too. I'm not sure if enabling that is also in scope of this PR or a later task. But we probably want to allow only past/future of that exact map_index and disable upstream/downstream options.
[GitHub] [airflow] bbovenzi commented on a change in pull request #22314: WIP summarize mapped
bbovenzi commented on a change in pull request #22314: URL: https://github.com/apache/airflow/pull/22314#discussion_r828232501

## File path: airflow/api_connexion/endpoints/task_instance_endpoint.py

```python
@@ -296,10 +298,49 @@ def get_task_instances_batch(session: Session = NEW_SESSION) -> APIResponse:
    ti_query = base_query.options(joinedload(TI.rendered_task_instance_fields))
    task_instances = ti_query.all()

    results = task_instance_collection_schema.dump(
        TaskInstanceCollection(task_instances=task_instances, total_entries=total_entries)
    )

    if "summarize_mapped" in body and body["summarize_mapped"]:
        dag_run_ids = [ti["dag_run_id"] for ti in results["task_instances"]]
        mapped_ti_query = session.query(TI).join(TI.dag_run)
        mapped_ti_query = _apply_array_filter(mapped_ti_query, key=TI.run_id, values=dag_run_ids)
        mapped_ti_query = mapped_ti_query.filter(TI.map_index != -1)
        # FIXME without SLA block, this error when rendering:
        # TypeError: 'TaskInstance' object is not subscriptable
        mapped_ti_query = mapped_ti_query.join(
            SlaMiss,
            and_(
                SlaMiss.dag_id == TI.dag_id,
                SlaMiss.task_id == TI.task_id,
                SlaMiss.execution_date == DR.execution_date,
            ),
            isouter=True,
        ).add_entity(SlaMiss)
        mapped_ti_query = mapped_ti_query.options(joinedload(TI.rendered_task_instance_fields))
        mapped_task_instances = mapped_ti_query.all()
        mapped_summaries = task_instance_summary_collection_schema.dump(
            TaskInstanceCollection(task_instances=mapped_task_instances, total_entries=1)
        )

        by_dag_run_id = {}
        for mapped_ti in mapped_summaries["task_instances"]:
            dag_run_id = mapped_ti["dag_run_id"]
            try:
                by_dag_run_id[dag_run_id].append(mapped_ti)
            except:
                by_dag_run_id[dag_run_id] = [
                    mapped_ti,
                ]

        for ti in results["task_instances"]:
            dag_run_id = ti["dag_run_id"]
            if dag_run_id in by_dag_run_id:
                ti["mapped_tasks"] = by_dag_run_id[dag_run_id]
```

Review comment: Let's have the list of `mapped_tasks` be a separate endpoint with which we can paginate with `limit` and `offset` params (and possibly, search).
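The grouping loop in the diff builds `by_dag_run_id` with a bare `except:`, which also swallows unrelated errors. The same grouping is usually written with `collections.defaultdict`; a small sketch over hypothetical serialized task instances:

```python
from collections import defaultdict


def group_by_dag_run_id(task_instances):
    """Group serialized task instances by their dag_run_id key."""
    by_dag_run_id = defaultdict(list)
    for ti in task_instances:
        # defaultdict creates the list on first access, no try/except needed
        by_dag_run_id[ti["dag_run_id"]].append(ti)
    return dict(by_dag_run_id)
```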
[GitHub] [airflow] boring-cyborg[bot] commented on pull request #22315: Airflow file sensor by prefix for azure data lake storage
boring-cyborg[bot] commented on pull request #22315: URL: https://github.com/apache/airflow/pull/22315#issuecomment-1069340804

Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about anything please check our Contribution Guide (https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst). Here are some useful points:

- Pay attention to the quality of your code (flake8, mypy and type annotations). Our [pre-commits](https://github.com/apache/airflow/blob/main/STATIC_CODE_CHECKS.rst#prerequisites-for-pre-commit-hooks) will help you with that.
- In case of a new feature add useful documentation (in docstrings or in `docs/` directory). Adding a new operator? Check this short [guide](https://github.com/apache/airflow/blob/main/docs/apache-airflow/howto/custom-operator.rst). Consider adding an example DAG that shows how users should use it.
- Consider using [Breeze environment](https://github.com/apache/airflow/blob/main/BREEZE.rst) for testing locally; it’s a heavy docker setup, but it ships with a working Airflow and a lot of integrations.
- Be patient and persistent. It might take some time to get a review or get the final approval from Committers.
- Please follow [ASF Code of Conduct](https://www.apache.org/foundation/policies/conduct) for all communication including (but not limited to) comments on Pull Requests, Mailing list and Slack.
- Be sure to read the [Airflow Coding style](https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst#coding-style-and-best-practices).

Apache Airflow is a community-driven project and together we are making it better 🚀. In case of doubts contact the developers at: Mailing List: d...@airflow.apache.org Slack: https://s.apache.org/airflow-slack
[GitHub] [airflow] Ash-ZAMAN opened a new pull request #22315: Airflow file sensor by prefix for azure data lake storage
Ash-ZAMAN opened a new pull request #22315: URL: https://github.com/apache/airflow/pull/22315

Hi, I noticed there was no Airflow file sensor suitable for Azure Data Lake Storage, so I created one. It could be really useful for people who store data in Data Lake. My sensor adapts to the prefix of the file's name and any directory path. I hope this contribution will help someone. Regards, Ashraf ZAMAN

---
**^ Add meaningful description above** Read the **[Pull Request Guidelines](https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst#pull-request-guidelines)** for more information. In case of fundamental code change, Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)) is needed. In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x). In case of backwards incompatible changes please leave a note in [UPDATING.md](https://github.com/apache/airflow/blob/main/UPDATING.md).
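The PR's diff is not shown here, but the core of a prefix-based file sensor is a matching check over listed storage paths. A self-contained sketch of that check — the function name and signature are invented for illustration, and the real sensor would list files via the Azure Data Lake hook:

```python
from typing import List


def matches_prefix(file_names: List[str], directory: str, prefix: str) -> List[str]:
    """Return the paths under `directory` whose base name starts with `prefix`.

    A sensor's poke() would return True as soon as this list is non-empty.
    """
    matched = []
    for name in file_names:
        folder, _, base = name.rpartition("/")
        if folder == directory.rstrip("/") and base.startswith(prefix):
            matched.append(name)
    return matched
```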
[GitHub] [airflow] bbovenzi commented on a change in pull request #22272: Add map_index support to all task instance-related views
bbovenzi commented on a change in pull request #22272: URL: https://github.com/apache/airflow/pull/22272#discussion_r828200131

## File path: airflow/www/static/js/graph.js

```
@@ -172,8 +172,8 @@ function draw() {
   const task = tasks[nodeId];
   const tryNumber = taskInstances[nodeId].try_number || 0;
-  if (task.task_type === 'SubDagOperator') callModal(nodeId, executionDate, task.extra_links, tryNumber, true, dagRunId);
-  else callModal(nodeId, executionDate, task.extra_links, tryNumber, undefined, dagRunId);
+  if (task.task_type === 'SubDagOperator') callModal(nodeId, executionDate, task.extra_links, tryNumber, true, dagRunId, task.map_index);
+  else callModal(nodeId, executionDate, task.extra_links, tryNumber, undefined, dagRunId, task.map_index);
```

Review comment: We probably need the `// eslint-disable-next-line max-len` or to split everything into multiple lines.
[GitHub] [airflow] bbovenzi commented on a change in pull request #22272: Add map_index support to all task instance-related views
bbovenzi commented on a change in pull request #22272: URL: https://github.com/apache/airflow/pull/22272#discussion_r828198262

## File path: airflow/www/static/js/gantt.js

```
@@ -204,7 +204,7 @@ d3.gantt = () => {
   .on('mouseover', tip.show)
   .on('mouseout', tip.hide)
   .on('click', (d) => {
-    callModal(d.task_id, d.execution_date, d.extraLinks, undefined, undefined, d.run_id);
+    callModal(d.task_id, d.execution_date, d.extraLinks, undefined, undefined, d.run_id, d.map_index);
```

Review comment:

```suggestion
// eslint-disable-next-line max-len
callModal(d.task_id, d.execution_date, d.extraLinks, undefined, undefined, d.run_id, d.map_index);
```

The linter is angry about this. Let's ignore it for now. In a later PR, I would like to use object props here to avoid passing `undefined` and make it clearer which arg is what.
[GitHub] [airflow] bbovenzi commented on a change in pull request #22272: Add map_index support to all task instance-related views
bbovenzi commented on a change in pull request #22272: URL: https://github.com/apache/airflow/pull/22272#discussion_r828194742

## File path: airflow/www/static/js/dag.js

```
@@ -53,6 +53,7 @@ let taskId = '';
 let executionDate = '';
 let subdagId = '';
 let dagRunId = '';
+let mapIndex = undefined;
```

Review comment:

```suggestion
let mapIndex = -1;
```

```suggestion
let mapIndex;
```

The linter doesn't like initializing as `undefined`, so let's default to `-1` or not set anything.
[GitHub] [airflow] norm opened a new pull request #22314: WIP summarize mapped
norm opened a new pull request #22314: URL: https://github.com/apache/airflow/pull/22314 --- **^ Add meaningful description above** Read the **[Pull Request Guidelines](https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst#pull-request-guidelines)** for more information. In case of fundamental code change, Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)) is needed. In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x). In case of backwards incompatible changes please leave a note in [UPDATING.md](https://github.com/apache/airflow/blob/main/UPDATING.md).
[airflow] branch mapped-task-drawer updated (c3d2d95 -> 9b246fb)
This is an automated email from the ASF dual-hosted git repository. bbovenzi pushed a change to branch mapped-task-drawer in repository https://gitbox.apache.org/repos/asf/airflow.git. from c3d2d95 download log bug fixes add 9b246fb fix extra links, hide local TZ if UTC, No new revisions were added by this update. Summary of changes: airflow/www/static/js/tree/api/useExtraLinks.js| 10 +- .../www/static/js/tree/details/content/DagRun.jsx | 38 +++-- .../js/tree/details/content/TaskInstance.jsx | 187 +++-- 3 files changed, 128 insertions(+), 107 deletions(-)
[GitHub] [airflow] dlesco commented on issue #22191: dag_processing code needs to handle OSError("handle is closed") in poll() and recv() calls
dlesco commented on issue #22191: URL: https://github.com/apache/airflow/issues/22191#issuecomment-1069278603 I plan to submit a PR within the next two weeks.
[GitHub] [airflow] villasv commented on issue #15340: helm install airflow in namespace get error: File "", line 32, in TimeoutError: There are still unapplied migrations after 60 sec
villasv commented on issue #15340: URL: https://github.com/apache/airflow/issues/15340#issuecomment-1069268372 In my case, just removing the --wait flag was enough, I didn't have to fiddle with `airflow.dbMigrations.runAsJob`
[GitHub] [airflow] vincbeck opened a new pull request #22313: Update doc and sample dag for S3ToSFTPOperator and SFTPToS3Operator
vincbeck opened a new pull request #22313: URL: https://github.com/apache/airflow/pull/22313 Update doc and sample dag for `S3ToSFTPOperator` and `SFTPToS3Operator`