[GitHub] [airflow] jedcunningham opened a new pull request #20549: Remove unnecessary python 3.6 conditionals
jedcunningham opened a new pull request #20549: URL: https://github.com/apache/airflow/pull/20549 Since Python 3.7 is now the lowest supported version, we no longer need to have conditionals to support 3.6. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
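For illustration, the kind of conditional this PR removes looks roughly like the sketch below (hypothetical code, not taken from the PR; `dataclasses` is just one example of a module that first shipped in the 3.7 stdlib, and `dataclasses_backport` is a made-up package name):

```python
import sys

# Hypothetical illustration of a Python 3.6 conditional: dataclasses
# joined the stdlib in 3.7, so 3.6 needed a backport package.
if sys.version_info >= (3, 7):
    from dataclasses import dataclass
else:
    # Dead code once 3.7 is the minimum supported version.
    from dataclasses_backport import dataclass  # type: ignore

# With Python 3.7 as the floor, the whole conditional collapses to a
# plain "from dataclasses import dataclass".

@dataclass
class Example:
    value: int
```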
[GitHub] [airflow] microhuang opened a new pull request #20548: lazy load operation of attribute 'dag_model'
microhuang opened a new pull request #20548: URL: https://github.com/apache/airflow/pull/20548 is not bound to a Session; lazy load operation of attribute 'dag_model' #20534
[GitHub] [airflow] jedcunningham commented on pull request #20544: Use local definitions for k8s schema validation
jedcunningham commented on pull request #20544: URL: https://github.com/apache/airflow/pull/20544#issuecomment-1002414889

> The only problem with this is how to use definitions that we do not currently use. I think it is worth doing it in a way that we can also remove the vendoring easily (reversing the process). The process would look like this:

Yeah, I think an easier way would be to simply remove all `io.k8s.*` definitions instead of resetting to `{}`. That would still get us the automatic k8s defs, while leaving any custom ones in place. I just pushed that change 👍
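The approach described here — dropping the vendored upstream definitions while keeping custom ones — amounts to filtering the `definitions` mapping by key prefix. A minimal sketch (the dictionary contents below are made up for illustration; the real script operates on `chart/values.schema.json`):

```python
# Toy "definitions" mapping: two vendored k8s defs plus one custom def.
definitions = {
    "io.k8s.api.core.v1.Pod": {"type": "object"},                     # vendored
    "io.k8s.apimachinery.pkg.apis.meta.v1.Time": {"type": "string"},  # vendored
    "customLabels": {"type": "object"},                               # custom, must survive
}

# Remove all io.k8s.* definitions instead of resetting the dict to {}.
definitions = {
    key: value for key, value in definitions.items() if not key.startswith("io.k8s.")
}

assert sorted(definitions) == ["customLabels"]
```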
[GitHub] [airflow] mik-laj commented on a change in pull request #20544: Use local definitions for k8s schema validation
mik-laj commented on a change in pull request #20544: URL: https://github.com/apache/airflow/pull/20544#discussion_r776160033

## File path: scripts/ci/pre_commit/pre_commit_vendor_k8s_json_schema.py

## @@ -0,0 +1,87 @@
+#!/usr/bin/env python3
+
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+import json
+
+import requests
+
+K8S_DEFINITIONS = (
+    "https://raw.githubusercontent.com/yannh/kubernetes-json-schema"
+    "/master/v1.22.0-standalone-strict/_definitions.json"
+)
+VALUES_SCHEMA_FILE = "chart/values.schema.json"
+
+
+with open(VALUES_SCHEMA_FILE) as f:
+    schema = json.load(f)
+
+refs = set()
+
+
+def find_refs(props: dict):
+    for value in props.values():
+        if "$ref" in value:
+            refs.add(value["$ref"])
+
+        if "items" in value:
+            if "$ref" in value["items"]:
+                refs.add(value["items"]["$ref"])
+
+        if "properties" in value:
+            find_refs(value["properties"])
+
+
+def get_remote_schema(url: str) -> dict:
+    req = requests.get(url)
+    req.raise_for_status()
+    return json.loads(req.text)

Review comment:
```suggestion
    return req.json()
```
[GitHub] [airflow] mik-laj commented on a change in pull request #20544: Use local definitions for k8s schema validation
mik-laj commented on a change in pull request #20544: URL: https://github.com/apache/airflow/pull/20544#discussion_r776159423

## File path: scripts/ci/pre_commit/pre_commit_vendor_k8s_json_schema.py

## @@ -0,0 +1,87 @@
+#!/usr/bin/env python3
+
+# [Apache-2.0 license header, as in the quote above]
+
+import json
+
+import requests
+
+K8S_DEFINITIONS = (
+    "https://raw.githubusercontent.com/yannh/kubernetes-json-schema"
+    "/master/v1.22.0-standalone-strict/_definitions.json"
+)
+VALUES_SCHEMA_FILE = "chart/values.schema.json"
+
+
+with open(VALUES_SCHEMA_FILE) as f:
+    schema = json.load(f)
+
+refs = set()
+
+
+def find_refs(props: dict):
+    for value in props.values():
+        if "$ref" in value:
+            refs.add(value["$ref"])
+
+        if "items" in value:
+            if "$ref" in value["items"]:
+                refs.add(value["items"]["$ref"])
+
+        if "properties" in value:
+            find_refs(value["properties"])

Review comment: The code is just as long, but a tad more readable, because the `refs` variable is created with a certain value and is not modified from another function.
[GitHub] [airflow] mik-laj commented on a change in pull request #20544: Use local definitions for k8s schema validation
mik-laj commented on a change in pull request #20544: URL: https://github.com/apache/airflow/pull/20544#discussion_r776159082

## File path: scripts/ci/pre_commit/pre_commit_vendor_k8s_json_schema.py

## @@ -0,0 +1,87 @@
+#!/usr/bin/env python3
+
+# [Apache-2.0 license header, as in the quote above]
+
+import json
+
+import requests
+
+K8S_DEFINITIONS = (
+    "https://raw.githubusercontent.com/yannh/kubernetes-json-schema"
+    "/master/v1.22.0-standalone-strict/_definitions.json"
+)
+VALUES_SCHEMA_FILE = "chart/values.schema.json"
+
+
+with open(VALUES_SCHEMA_FILE) as f:
+    schema = json.load(f)
+
+refs = set()
+
+
+def find_refs(props: dict):
+    for value in props.values():
+        if "$ref" in value:
+            refs.add(value["$ref"])
+
+        if "items" in value:
+            if "$ref" in value["items"]:
+                refs.add(value["items"]["$ref"])
+
+        if "properties" in value:
+            find_refs(value["properties"])
+
+
+def get_remote_schema(url: str) -> dict:
+    req = requests.get(url)
+    req.raise_for_status()
+    return json.loads(req.text)
+
+
+# Create 'definitions' if it doesn't exist or reset it
+schema["definitions"] = {}
+
+# Get the k8s defs
+defs = get_remote_schema(K8S_DEFINITIONS)
+
+# first find refs in our schema
+find_refs(schema["properties"])

Review comment:
```suggestion
refs = list(find_refs(schema["properties"]))
```
[GitHub] [airflow] mik-laj commented on a change in pull request #20544: Use local definitions for k8s schema validation
mik-laj commented on a change in pull request #20544: URL: https://github.com/apache/airflow/pull/20544#discussion_r776159009

## File path: scripts/ci/pre_commit/pre_commit_vendor_k8s_json_schema.py

## @@ -0,0 +1,87 @@
+#!/usr/bin/env python3
+
+# [Apache-2.0 license header, as in the quote above]
+
+import json
+
+import requests
+
+K8S_DEFINITIONS = (
+    "https://raw.githubusercontent.com/yannh/kubernetes-json-schema"
+    "/master/v1.22.0-standalone-strict/_definitions.json"
+)
+VALUES_SCHEMA_FILE = "chart/values.schema.json"
+
+
+with open(VALUES_SCHEMA_FILE) as f:
+    schema = json.load(f)
+
+refs = set()
+
+
+def find_refs(props: dict):
+    for value in props.values():
+        if "$ref" in value:
+            refs.add(value["$ref"])
+
+        if "items" in value:
+            if "$ref" in value["items"]:
+                refs.add(value["items"]["$ref"])
+
+        if "properties" in value:
+            find_refs(value["properties"])

Review comment:
```suggestion
def find_refs(props: dict):
    for value in props.values():
        if "$ref" in value:
            yield value["$ref"]

        if "items" in value:
            if "$ref" in value["items"]:
                yield value["items"]["$ref"]

        if "properties" in value:
            yield from find_refs(value["properties"])
```
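To see the suggested generator refactor in action, here is a self-contained run against a made-up schema fragment (the `$ref` targets are illustrative, not the chart's real ones):

```python
def find_refs(props: dict):
    # Generator version: yields every "$ref" reachable from a
    # JSON-schema "properties" mapping, recursing into nested objects.
    for value in props.values():
        if "$ref" in value:
            yield value["$ref"]
        if "items" in value and "$ref" in value["items"]:
            yield value["items"]["$ref"]
        if "properties" in value:
            yield from find_refs(value["properties"])


sample_properties = {
    "pod": {"$ref": "#/definitions/io.k8s.api.core.v1.Pod"},
    "containers": {"items": {"$ref": "#/definitions/io.k8s.api.core.v1.Container"}},
    "nested": {"properties": {"pod": {"$ref": "#/definitions/io.k8s.api.core.v1.Pod"}}},
}

# The caller decides how to collect; a set de-duplicates the repeated Pod ref.
refs = set(find_refs(sample_properties))
```

This keeps `find_refs` free of shared mutable state: the function only yields values, and the caller chooses between `set(...)` and `list(...)` at the call site.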
[GitHub] [airflow] edithturn commented on pull request #20200: [New] - Ci free space python
edithturn commented on pull request #20200: URL: https://github.com/apache/airflow/pull/20200#issuecomment-1002400077

That is right Jarek, that directory is good. In the previous job there was an error (something like: `ci` could not be imported from `main`). It was related to the entry point. For the entry point `Breeze2 = airflow_breeze.breeze:main`, the directory `airflow_breeze` is inside `src`, so it is not covered by the entry point, and free space is inside `breeze/ci`, not inside `src`.
[GitHub] [airflow] dstandish commented on a change in pull request #20530: Add sensor decorator
dstandish commented on a change in pull request #20530: URL: https://github.com/apache/airflow/pull/20530#discussion_r776152045

## File path: airflow/decorators/sensor.py

## @@ -0,0 +1,93 @@
+# [Apache-2.0 license header, as in the quotes above]
+
+from inspect import signature
+from typing import Any, Callable, Dict, Optional, Tuple
+
+from airflow.decorators.base import get_unique_task_id, task_decorator_factory
+from airflow.sensors.base import BaseSensorOperator
+
+
+class DecoratedSensorOperator(BaseSensorOperator):
+    """
+    Wraps a Python callable and captures args/kwargs when called for execution.
+
+    :param python_callable: A reference to an object that is callable
+    :type python_callable: python callable
+    :param task_id: task Id
+    :type task_id: str
+    :param op_kwargs: a dictionary of keyword arguments that will get unpacked
+        in your function (templated)
+    :type op_kwargs: dict
+    :param op_args: a list of positional arguments that will get unpacked when
+        calling your callable (templated)
+    :type op_args: list
+    :param kwargs_to_upstream: For certain operators, we might need to upstream certain arguments
+        that would otherwise be absorbed by the DecoratedOperator (for example python_callable for the
+        PythonOperator). This gives a user the option to upstream kwargs as needed.
+    :type kwargs_to_upstream: dict
+    """
+
+    template_fields = ('op_args', 'op_kwargs')
+    template_fields_renderers = {"op_args": "py", "op_kwargs": "py"}
+
+    # since we won't mutate the arguments, we should just do the shallow copy
+    # there are some cases we can't deepcopy the objects (e.g. protobuf).
+    shallow_copy_attrs = ('python_callable',)
+
+    def __init__(
+        self,
+        *,
+        python_callable: Callable,
+        task_id: str,
+        op_args: Tuple[Any],
+        op_kwargs: Dict[str, Any],
+        **kwargs,
+    ) -> None:
+        kwargs.pop('multiple_outputs')
+        kwargs['task_id'] = get_unique_task_id(task_id, kwargs.get('dag'), kwargs.get('task_group'))
+        self.python_callable = python_callable
+        # Check that arguments can be bound
+        signature(python_callable).bind(*op_args, **op_kwargs)
+        self.op_args = op_args
+        self.op_kwargs = op_kwargs
+        super().__init__(**kwargs)
+
+    def poke(self, context: Dict) -> bool:
+        return self.python_callable(*self.op_args, **self.op_kwargs)
+
+
+def sensor(python_callable: Optional[Callable] = None, multiple_outputs: Optional[bool] = None, **kwargs):
+    """
+    Wraps a function into an Airflow operator.
+
+    Accepts kwargs for operator kwarg. Can be reused in a single DAG.
+
+    :param python_callable: Function to decorate
+    :type python_callable: Optional[Callable]
+    :param multiple_outputs: if set, function return value will be
+        unrolled to multiple XCom values. List/Tuples will unroll to xcom values
+        with index as key. Dict will unroll to xcom values with keys as XCom keys.
+        Defaults to False.

Review comment: update: [draft PR](https://github.com/apache/airflow/pull/20547) for allowing return of XCom from the poke method. With this in place we could allow return of XCom (even with multiple outputs) with this decorator. I'll try to confirm this approach is OK tomorrow and work to get it merged.
[GitHub] [airflow] thomasrockhu-codecov edited a comment on pull request #20087: Upload coverage for PRs to main
thomasrockhu-codecov edited a comment on pull request #20087: URL: https://github.com/apache/airflow/pull/20087#issuecomment-1002392632

hey @potiuk sorry for the delay here (holidays). I'm having trouble reproducing the issue you mentioned. For both cases

`pytest ./tests/providers/cncf/kubernetes/hooks/test_kubernetes.py ./tests/jobs/test_backfill_job.py`

`pytest ./tests/providers/cncf/kubernetes/hooks/test_kubernetes.py ./tests/jobs/test_backfill_job.py --cov`

I'm hitting about +20MB of memory usage in the `coverage` case at most. If it helps:

```
docker: 20.10.7
python: 3.7
```

I uploaded a video of what I'm seeing on a clean run: https://youtu.be/VjX7892gL6Y
[GitHub] [airflow] dstandish opened a new pull request #20547: Allow sensors to return XCom values with PokeReturnValue
dstandish opened a new pull request #20547: URL: https://github.com/apache/airflow/pull/20547

Historically, sensors do not support returning XCom unless you override the `execute` method. With this change we can optionally return an XCom value by returning one along with `True` in the new class `PokeReturnValue`.
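Based only on the PR description, the proposed mechanism could look roughly like this sketch. `PokeReturnValue` here is a guess at the shape, and `execute_poke_loop` is a hypothetical stand-in for the sensor's execute loop, not Airflow's actual code:

```python
# Hypothetical sketch of the proposed PokeReturnValue: poke() reports
# completion plus an optional XCom payload.
class PokeReturnValue:
    def __init__(self, is_done: bool, xcom_value=None):
        self.is_done = is_done
        self.xcom_value = xcom_value

    def __bool__(self):
        # Lets a sensor loop keep using "if poke_result:" unchanged.
        return self.is_done


def execute_poke_loop(poke):
    # Simplified stand-in for a sensor's execute: once poking succeeds,
    # return the attached value so it gets pushed as the task's XCom.
    result = poke()
    if result:
        return getattr(result, "xcom_value", None)


value = execute_poke_loop(lambda: PokeReturnValue(True, {"some": "information"}))
# value is the XCom payload, {"some": "information"}
```

The `__bool__` shim is the design trick that makes such a change backward compatible: existing sensors returning plain `True`/`False` keep working, while new ones can attach a payload.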
[GitHub] [airflow] thomasrockhu-codecov commented on pull request #20087: Upload coverage for PRs to main
thomasrockhu-codecov commented on pull request #20087: URL: https://github.com/apache/airflow/pull/20087#issuecomment-1002392632

hey @potiuk sorry for the delay here (holidays). I'm having trouble reproducing the issue you mentioned. For both cases

`pytest ./tests/providers/cncf/kubernetes/hooks/test_kubernetes.py ./tests/jobs/test_backfill_job.py`

`pytest ./tests/providers/cncf/kubernetes/hooks/test_kubernetes.py ./tests/jobs/test_backfill_job.py --cov`

I'm hitting about 200-230 MB of memory usage in both cases.
[GitHub] [airflow] edithturn commented on pull request #20148: [Update] Breeze2-on-windows
edithturn commented on pull request #20148: URL: https://github.com/apache/airflow/pull/20148#issuecomment-1002391975

This part `entry: ([\W\s\n\t\r]|^)breeze([\W\s\n\t\r]|$)` was added. I will revert my code.
[GitHub] [airflow] dstandish opened a new pull request #20546: Allow sensors to return XCom values
dstandish opened a new pull request #20546: URL: https://github.com/apache/airflow/pull/20546

Historically, sensors do not support returning XCom unless you override the `execute` method. With this change we can optionally return an XCom value by returning one along with `True` in a tuple, e.g. `True, {"some": "information"}`.
[GitHub] [airflow] yehoshuadimarsky commented on pull request #20499: Docs: Note that `remote_log_conn_id` is only used for reading logs, not writing them
yehoshuadimarsky commented on pull request #20499: URL: https://github.com/apache/airflow/pull/20499#issuecomment-1002369861

updated, @potiuk can you please enable the "first-time contributor" workflow?
[GitHub] [airflow] subkanthi commented on a change in pull request #20497: Fix mypy aws example dags
subkanthi commented on a change in pull request #20497: URL: https://github.com/apache/airflow/pull/20497#discussion_r776120651

## File path: airflow/providers/amazon/aws/operators/eks.pyi

## @@ -0,0 +1,97 @@
+from typing import Optional, Dict, List

Review comment: Done, thanks.
[GitHub] [airflow] rustikk opened a new issue #20545: built in macros (macros.random, macros.time) import functions instead of modules
rustikk opened a new issue #20545: URL: https://github.com/apache/airflow/issues/20545

### Apache Airflow version

2.2.3 (latest released)

### What happened

My gut says that the way forward is to change the macros object so that it only exposes modules:

- datetime
- time
- uuid
- random

... and then leave it to the user to decide which functions on those modules they want to call. I'm not confident enough to make that change. If instead we want to change the docs to match the actual functionality, I can submit a PR for that.

### What you expected to happen

When using either of the built-in macros `time` or `random`, they don't behave as documented: instead of `datetime.time`, `macros.time` is the standard-library `time` module, and instead of the `random` module, `macros.random` is the function `random.random`.

### How to reproduce

```
import datetime as dt
import time
from uuid import uuid4
from textwrap import dedent

from airflow.models import DAG
from airflow.operators.python import PythonOperator
from dateutil.parser import parse as dateutil_parse

"""
According to the docs:
macros.datetime - datetime.datetime
macros.timedelta - datetime.timedelta
macros.dateutil - dateutil package
macros.time - datetime.time
macros.uuid - python standard lib uuid
macros.random - python standard lib random

According to the code:
macros.datetime - datetime.datetime
macros.timedelta - datetime.timedelta
macros.dateutil - dateutil package
macros.time - python standard lib time  <--- differs
macros.uuid - python standard lib uuid
macros.random - random.random  <--- differs
"""


def date_time(datetime_obj):
    compare_obj = dt.datetime(2021, 12, 12, 8, 32, 23)
    assert datetime_obj == compare_obj


def time_delta(timedelta_obj):
    compare_obj = dt.timedelta(days=3, hours=4)
    assert timedelta_obj == compare_obj


def date_util(dateutil_obj):
    compare_obj = dateutil_parse("Thu Sep 26 10:36:28 2019")
    assert dateutil_obj == compare_obj


def time_tester(time_obj):
    # note that datetime.time.time() gives an AttributeError
    # time.time() on the other hand, returns a float
    # this works because macros.time isn't 'datetime.time', like the docs say
    # it's just 'time'
    compare_obj = time.time()
    print(time_obj)
    print(compare_obj)
    # the macro might have captured a slightly different time than the task,
    # but they're not going to be more than 10s apart
    assert abs(time_obj - compare_obj) < 10


def uuid_tester(uuid_obj):
    compare_obj = uuid4()
    assert len(str(uuid_obj)) == len(str(compare_obj))


def random_tester(random_float):
    # note that 'random.random' is a function that returns a float
    # while 'random' is a module (and isn't callable)
    # the macro was 'macros.random()' and here we have a float:
    assert -0.1 < random_float < 100.1
    # so the docs are wrong here too
    # macros.random actually returns a function, not the random module


def show_docs(attr):
    print(attr.__doc__)


with DAG(
    dag_id="builtin_macros_with_airflow_specials",
    schedule_interval=None,
    start_date=dt.datetime(1970, 1, 1),
    render_template_as_native_obj=True,  # render templates using Jinja NativeEnvironment
    tags=["core"],
) as dag:
    test_functions = {
        "datetime": (date_time, "{{ macros.datetime(2021, 12, 12, 8, 32, 23) }}"),
        "timedelta": (time_delta, "{{ macros.timedelta(days=3, hours=4) }}"),
        "dateutil": (
            date_util,
            "{{ macros.dateutil.parser.parse('Thu Sep 26 10:36:28 2019') }}",
        ),
        "time": (time_tester, "{{ macros.time.time() }}"),
        "uuid": (uuid_tester, "{{ macros.uuid.uuid4() }}"),
        "random": (
            random_tester,
            "{{ 100 * macros.random() }}",
        ),
    }
    for name, (func, template) in test_functions.items():
        (
            PythonOperator(
                task_id=f"showdoc_{name}",
                python_callable=show_docs,
                op_args=[f" macros.{name} "],
            )
            >> PythonOperator(task_id=f"test_{name}", python_callable=func, op_args=[template])
        )
```

### Operating System

Docker (debian:buster)

### Versions of Apache Airflow Providers

2.2.2 and 2.2.3

### Deployment

Other

### Deployment details

Astro CLI with sequential executor

### Anything else

Rather than changing the docs to describe what the code actually does, would it be better to make the code behave in the way the docs describe?
[GitHub] [airflow] potiuk commented on pull request #20544: Use local definitions for k8s schema validation
potiuk commented on pull request #20544: URL: https://github.com/apache/airflow/pull/20544#issuecomment-1002345416

> I'm not sure if it is worth the effort to only include the upstream defs we actually use, but it does cut it down from 17.5k lines to 2.6k, so figured it was worth exploring at least. Thoughts?

The only problem with this is how to use definitions that we do not currently use. I think it is worth doing it in a way that we can also remove the vendoring easily (reversing the process). The process would look like this:

1. If you want to work on the schema and add new stuff, run a script to un-vendor the refs.
2. Make your changes.
3. Commit (and this will automatically vendor them back in).
[GitHub] [airflow] potiuk commented on pull request #20544: Use local definitions for k8s schema validation
potiuk commented on pull request #20544: URL: https://github.com/apache/airflow/pull/20544#issuecomment-1002344349 > I'm not sure if it is worth the effort to only include the upstream defs we actually use, but it does cut it down from 17.5k lines to 2.6k, so figured it was worth exploring at least. Thoughts? Makes perfect sense - especially that it's fully automated with pre-commit. Eventually we might want to contribute (thought about it) support to add mapped loaders for refs and then we could even bundle all the definitions as separate files - https://github.com/xeipuuv/gojsonschema#loading-local-schemas - but that would have to be approved by helm maintainers and released. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] subkanthi edited a comment on issue #19903: Error setting dependencies on task_group defined using the decorator
subkanthi edited a comment on issue #19903: URL: https://github.com/apache/airflow/issues/19903#issuecomment-1002342857 Looking at this issue. The problem is here: for some reason, when the right shift is called for start, the other value is None; it's almost like the task group is not created properly.

```
def __rshift__(self, other: Union["DependencyMixin", Sequence["DependencyMixin"]]):
    """Implements Task >> Task"""
    self.set_downstream(other)
    return other
```

```
if not isinstance(task_or_task_list, Sequence):
    task_or_task_list = [task_or_task_list]

task_list: List["BaseOperator"] = []
for task_object in task_or_task_list:
```

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
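The failure mode described above can be reproduced with a minimal stand-in (not Airflow's real classes): when the right-hand side of `>>` is `None`, `set_downstream` wraps it into `[None]` and then trips over a non-task object. The `Node` class below is a toy sketch of the `DependencyMixin` pattern, for illustration only.

```python
from collections.abc import Sequence
from typing import List


class Node:
    """Toy stand-in for a task/DependencyMixin, to show the >> plumbing."""

    def __init__(self, name: str):
        self.name = name
        self.downstream: List["Node"] = []

    def set_downstream(self, other):
        # mirrors the wrapping logic quoted above
        if not isinstance(other, Sequence):
            other = [other]
        for obj in other:
            if not isinstance(obj, Node):
                raise TypeError(f"Expected a task, got {obj!r}")
            self.downstream.append(obj)

    def __rshift__(self, other):
        """Implements Task >> Task"""
        self.set_downstream(other)
        return other


a, b = Node("a"), Node("b")
a >> b
print([n.name for n in a.downstream])  # ['b']

try:
    a >> None  # the reported case: the right-hand side is None
except TypeError as exc:
    print("failed as expected:", exc)
```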
[GitHub] [airflow] subkanthi commented on issue #19903: Error setting dependencies on task_group defined using the decorator
subkanthi commented on issue #19903: URL: https://github.com/apache/airflow/issues/19903#issuecomment-1002342857 Looking at this issue. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] dstandish commented on a change in pull request #19572: Simplify ``KubernetesPodOperator``
dstandish commented on a change in pull request #19572: URL: https://github.com/apache/airflow/pull/19572#discussion_r776114628 ## File path: airflow/providers/cncf/kubernetes/CHANGELOG.rst ##

```
3.0.0
.....

Breaking changes
~~~~~~~~~~~~~~~~

* ``Simplify KubernetesPodOperator (#19572)``

.. warning:: Many methods in :class:`~.KubernetesPodOperator` and :class:`~.PodLauncher` have been renamed.
   If you have subclassed :class:`~.KubernetesPodOperator` you will need to update your subclass to reflect
   the new structure. Additionally, the ``PodStatus`` enum has been renamed to ``PodPhase``.
```

Review comment: ok, I have added the notes for PodLauncher as well and will mark as resolved. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] github-actions[bot] commented on pull request #20544: Use local definitions for k8s schema validation
github-actions[bot] commented on pull request #20544: URL: https://github.com/apache/airflow/pull/20544#issuecomment-1002342117 The PR most likely needs to run full matrix of tests because it modifies parts of the core of Airflow. However, committers might decide to merge it quickly and take the risk. If they don't merge it quickly - please rebase it to the latest main at your convenience, or amend the last commit of the PR, and push it with --force-with-lease. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] jedcunningham commented on pull request #20544: Use local definitions for k8s schema validation
jedcunningham commented on pull request #20544: URL: https://github.com/apache/airflow/pull/20544#issuecomment-1002340933 I'm not sure if it is worth the effort to only include the upstream defs we actually use, but it does cut it down from 17.5k lines to 2.6k, so figured it was worth exploring at least. Thoughts? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] jedcunningham opened a new pull request #20544: Use local definitions for k8s schema validation
jedcunningham opened a new pull request #20544: URL: https://github.com/apache/airflow/pull/20544 Instead of using remote schemas to validate helm chart values, we will vendor in the definitions locally so the chart can be used without using outbound connections. There is a precommit hook that will ensure only the definitions we use are included in our schema file, as there are quite a few of them all together otherwise. closes: #20539 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] dstandish commented on a change in pull request #20530: Add sensor decorator
dstandish commented on a change in pull request #20530: URL: https://github.com/apache/airflow/pull/20530#discussion_r776113484 ## File path: airflow/decorators/sensor.py ##

```
@@ -0,0 +1,93 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from inspect import signature
+from typing import Any, Callable, Dict, Optional, Tuple
+
+from airflow.decorators.base import get_unique_task_id, task_decorator_factory
+from airflow.sensors.base import BaseSensorOperator
+
+
+class DecoratedSensorOperator(BaseSensorOperator):
+    """
+    Wraps a Python callable and captures args/kwargs when called for execution.
+
+    :param python_callable: A reference to an object that is callable
+    :type python_callable: python callable
+    :param task_id: task Id
+    :type task_id: str
+    :param op_kwargs: a dictionary of keyword arguments that will get unpacked
+        in your function (templated)
+    :type op_kwargs: dict
+    :param op_args: a list of positional arguments that will get unpacked when
+        calling your callable (templated)
+    :type op_args: list
+    :param kwargs_to_upstream: For certain operators, we might need to upstream certain arguments
+        that would otherwise be absorbed by the DecoratedOperator (for example python_callable for the
+        PythonOperator). This gives a user the option to upstream kwargs as needed.
+    :type kwargs_to_upstream: dict
+    """
+
+    template_fields = ('op_args', 'op_kwargs')
+    template_fields_renderers = {"op_args": "py", "op_kwargs": "py"}
+
+    # since we won't mutate the arguments, we should just do the shallow copy
+    # there are some cases we can't deepcopy the objects (e.g. protobuf).
+    shallow_copy_attrs = ('python_callable',)
+
+    def __init__(
+        self,
+        *,
+        python_callable: Callable,
+        task_id: str,
+        op_args: Tuple[Any],
+        op_kwargs: Dict[str, Any],
+        **kwargs,
+    ) -> None:
+        kwargs.pop('multiple_outputs')
+        kwargs['task_id'] = get_unique_task_id(task_id, kwargs.get('dag'), kwargs.get('task_group'))
+        self.python_callable = python_callable
+        # Check that arguments can be bound
+        signature(python_callable).bind(*op_args, **op_kwargs)
+        self.op_args = op_args
+        self.op_kwargs = op_kwargs
+        super().__init__(**kwargs)
+
+    def poke(self, context: Dict) -> bool:
+        return self.python_callable(*self.op_args, **self.op_kwargs)
+
+
+def sensor(python_callable: Optional[Callable] = None, multiple_outputs: Optional[bool] = None, **kwargs):
+    """
+    Wraps a function into an Airflow operator.
+
+    Accepts kwargs for operator kwarg. Can be reused in a single DAG.
+
+    :param python_callable: Function to decorate
+    :type python_callable: Optional[Callable]
+    :param multiple_outputs: if set, function return value will be
+        unrolled to multiple XCom values. List/Tuples will unroll to xcom values
+        with index as key. Dict will unroll to xcom values with keys as XCom keys.
+        Defaults to False.
```

Review comment: Separately, it seems to me that it would be nice if we could optionally return an XCom value with the python sensor, so the return type would be (and this is a mouthful) `Optional[Union[bool, Tuple[bool, Any]]]` or something like that. So if you return a bool it's as normal, and when sensing is successful you can return `True, {"some": "information"}`, i.e. a 2-tuple where the first element is the poke boolean and the second element is the xcom return. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
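The 2-tuple idea floated in the review comment above could be sketched as a small normalization helper (this is a hypothetical illustration of the proposal, not merged Airflow behavior; the names are made up):

```python
from typing import Any, Optional, Tuple, Union

# the proposed return type: a bare bool, or (bool, xcom_value)
PokeReturn = Union[bool, Tuple[bool, Any]]


def interpret_poke_result(result: PokeReturn) -> Tuple[bool, Optional[Any]]:
    """Normalize the sensor callable's return into (done, xcom_value)."""
    if isinstance(result, tuple):
        done, xcom_value = result
        return bool(done), xcom_value
    return bool(result), None


print(interpret_poke_result(True))                             # (True, None)
print(interpret_poke_result((True, {"some": "information"})))  # (True, {'some': 'information'})
```

A `poke` implementation could call this on the callable's return, push the second element to XCom when present, and use the first as the usual poke boolean.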
[GitHub] [airflow] dstandish commented on a change in pull request #20530: Add sensor decorator
dstandish commented on a change in pull request #20530: URL: https://github.com/apache/airflow/pull/20530#discussion_r776113047 ## File path: airflow/decorators/sensor.py ## (quoting the same hunk of the new `sensor` docstring as the previous review comment, anchored at "Defaults to False.") Review comment: @josh-fell re > Tuples and Lists are not currently supported for multiple_outputs. Are you sure? How come it indicates it is supported in the docstring, e.g. for `_PythonDecoratedOperator`? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] dstandish removed a comment on pull request #20530: Add sensor decorator
dstandish removed a comment on pull request #20530: URL: https://github.com/apache/airflow/pull/20530#issuecomment-1002337888 @josh-fell re > Tuples and Lists are not currently supported for multiple_outputs. are you sure? how come it indicates it is supported in the docstring e.g. for `_PythonDecoratedOperator`? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] dstandish commented on a change in pull request #20530: Add sensor decorator
dstandish commented on a change in pull request #20530: URL: https://github.com/apache/airflow/pull/20530#discussion_r776109537 ## File path: airflow/decorators/__init__.py ## @@ -29,9 +30,11 @@ def __getattr__(self, name): if name.startswith("__"): raise AttributeError(f'{type(self).__name__} has no attribute {name!r}') decorators = ProvidersManager().taskflow_decorators -if name not in decorators: -raise AttributeError(f"task decorator {name!r} not found") -return decorators[name] +if name in decorators: +return decorators[name] +if name == "sensor": +return sensor Review comment: @dimberman can you comment on whether this is a good approach or whether instead we should do multiple inheritance with mixins as is done with `@task.virtualenv`? i lean towards supporting the approach taken here but interested if you want to advocate for the mixin approach -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
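The `__getattr__` approach under discussion in the diff above can be seen in a toy stand-in (the registry contents and decorator bodies below are placeholders, not Airflow's actual `ProvidersManager` machinery): normal attribute lookup fails, `__getattr__` consults the provider registry first, and falls back to the built-in `sensor` decorator.

```python
def sensor(func):
    """Placeholder for the built-in @task.sensor decorator."""
    return func


class TaskDecoratorCollection:
    # stand-in for ProvidersManager().taskflow_decorators
    _registry = {"python": lambda f: f}

    def __getattr__(self, name):
        if name.startswith("__"):
            raise AttributeError(f"{type(self).__name__} has no attribute {name!r}")
        if name in self._registry:
            return self._registry[name]
        if name == "sensor":
            return sensor
        raise AttributeError(f"task decorator {name!r} not found")


task = TaskDecoratorCollection()
print(task.sensor is sensor)   # True: resolved via the fallback branch
print(callable(task.python))   # True: resolved via the registry
```

Compared with the mixin approach, this keeps the dispatch in one place at the cost of making the available decorators invisible to static analysis, which is part of what the reviewers are weighing.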
[GitHub] [airflow] dstandish commented on pull request #20530: Add sensor decorator
dstandish commented on pull request #20530: URL: https://github.com/apache/airflow/pull/20530#issuecomment-1002337888 @josh-fell re > Tuples and Lists are not currently supported for multiple_outputs. are you sure? how come it indicates it is supported in the docstring e.g. for `_PythonDecoratedOperator`? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] potiuk commented on pull request #20473: SSHOperator.execute() returns str regardless of enable_xcom_pickling
potiuk commented on pull request #20473: URL: https://github.com/apache/airflow/pull/20473#issuecomment-1002337416 Yah I see where it came from :) > * "run_as_user" is broken (at least for us) in the new version. We had to switch some `PythonOperator` to `BashOperator` (use sudo in our script) or, occasionally, `SSHOperator`. Yeah. I think that one will be fixed in 2.2.4 (or at least there is a work in progress). -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] potiuk commented on issue #19891: Re-enable MyPy
potiuk commented on issue #19891: URL: https://github.com/apache/airflow/issues/19891#issuecomment-1002336697 > I fixed mypy errors in airflow/decorators and then realized it was a dup effort. I'd be happy to help if there is any work not taken by anyone yet. See the list in the issue - it contains a number of issues unresolved as of when we last checked (and check whether someone has already signed up for one in the thread above). I hope to merge 3-4 more PRs tomorrow and will re-run it to refresh the numbers. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] derkuci closed pull request #20473: SSHOperator.execute() returns str regardless of enable_xcom_pickling
derkuci closed pull request #20473: URL: https://github.com/apache/airflow/pull/20473 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] cqzlxl commented on pull request #20491: Refactor: Remove unnecessary database commit and exception stack printing
cqzlxl commented on pull request #20491: URL: https://github.com/apache/airflow/pull/20491#issuecomment-1002333839 Suppressing the logging is not OK. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] cqzlxl closed pull request #20491: Refactor: Remove unnecessary database commit and exception stack printing
cqzlxl closed pull request #20491: URL: https://github.com/apache/airflow/pull/20491 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] derkuci commented on pull request #20473: SSHOperator.execute() returns str regardless of enable_xcom_pickling
derkuci commented on pull request #20473: URL: https://github.com/apache/airflow/pull/20473#issuecomment-1002333155 > I just wonder, why do you think it should be consistent? We do not have auto-discovery of types and formats of the returned values in Airflow. Airflow is task-based, not data-based, so basically if you want to pass data between tasks, both producer and consumer have to agree on the naming conventions and data format used. There is no "universal" discovery. Also, even in the case you started with, the "pickling" is deprecated and actually should be set "per installation". So different installations may use different formats, but the format will be "consistent" within each installation. I do not see huge value in providing a "consistent" output format in this case. Got you. Our particular case indeed involves an "installation" change: upgrading from an old version of Airflow to the current version. * We recently upgraded from 1.9 to 2.2... `enable_xcom_pickling` was True with the old version, and I hesitated over whether I should keep the old value or switch to "False" as suggested (safety, etc.). * "run_as_user" is broken (at least for us) in the new version. We had to switch some `PythonOperator` to `BashOperator` (use sudo in our script) or, occasionally, `SSHOperator`. While experimenting with all these changes, we had to deal with uncertainties in the XCom data types/formats. My patch is not general enough (as you kindly explained) but served us well during the upgrade process. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
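A consumer-side shim for the situation described above might look like the sketch below. It assumes (as older `SSHOperator` versions did) that command output is pushed as raw bytes when XCom pickling is enabled and as base64-encoded text when it is not; the function name is illustrative, not Airflow API.

```python
import base64


def normalize_xcom_value(value, enable_xcom_pickling: bool) -> str:
    """Return SSH command output as text regardless of the pickling setting."""
    if enable_xcom_pickling:
        # with pickling enabled, the pushed bytes round-trip through XCom as-is
        return value.decode('utf-8') if isinstance(value, bytes) else value
    # without pickling, the output was historically pushed base64-encoded
    return base64.b64decode(value).decode('utf-8')


print(normalize_xcom_value(b"hello", True))                              # hello
print(normalize_xcom_value(base64.b64encode(b"hello").decode(), False))  # hello
```

A shim like this lets downstream tasks keep working while an installation flips `enable_xcom_pickling` during an upgrade.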
[GitHub] [airflow] github-actions[bot] commented on pull request #19536: Fix dynamic queue loss
github-actions[bot] commented on pull request #19536: URL: https://github.com/apache/airflow/pull/19536#issuecomment-1002330022 This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 5 days if no further activity occurs. Thank you for your contributions. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] mingshi-wang commented on issue #19891: Re-enable MyPy
mingshi-wang commented on issue #19891: URL: https://github.com/apache/airflow/issues/19891#issuecomment-1002328580 I fixed mypy errors in airflow/decorators and then realized it was a dup effort. I'd be happy to help if there is any work not taken by anyone yet. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] potiuk commented on pull request #20200: [New] - Ci free space python
potiuk commented on pull request #20200: URL: https://github.com/apache/airflow/pull/20200#issuecomment-1002328441 I think `./dev/breeze` should be good here for `pip install` - because that's where `setup.cfg` and `setup.py` are. Was there a problem with it ? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] potiuk commented on pull request #20148: [Update] Breeze2-on-windows
potiuk commented on pull request #20148: URL: https://github.com/apache/airflow/pull/20148#issuecomment-1002326952 I am afraid you will need to rebase the changes due to a conflict, @edithturn. We have nice instructions on rebasing in https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst#id15 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[airflow] branch main updated (2cd4ed0 -> bfd6d45)
This is an automated email from the ASF dual-hosted git repository. potiuk pushed a change to branch main in repository https://gitbox.apache.org/repos/asf/airflow.git. from 2cd4ed0 Set default logger in logging Mixin (#20355) add bfd6d45 switch to follow_redirects on httpx.get call in CloudSQL provider (#20239) No new revisions were added by this update. Summary of changes: airflow/providers/google/cloud/hooks/cloud_sql.py | 10 -- 1 file changed, 8 insertions(+), 2 deletions(-)
[GitHub] [airflow] boring-cyborg[bot] commented on pull request #20239: switch to follow_redirects on httpx.get call in CloudSQL provider
boring-cyborg[bot] commented on pull request #20239: URL: https://github.com/apache/airflow/pull/20239#issuecomment-1002326641 Awesome work, congrats on your first merged pull request! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] potiuk merged pull request #20239: switch to follow_redirects on httpx.get call in CloudSQL provider
potiuk merged pull request #20239: URL: https://github.com/apache/airflow/pull/20239 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[airflow] branch main updated (7d4d38b -> 2cd4ed0)
This is an automated email from the ASF dual-hosted git repository. potiuk pushed a change to branch main in repository https://gitbox.apache.org/repos/asf/airflow.git. from 7d4d38b avoid deprecation warnings in BigQuery transfer operators (#20502) add 2cd4ed0 Set default logger in logging Mixin (#20355) No new revisions were added by this update. Summary of changes: airflow/utils/log/logging_mixin.py | 10 +- 1 file changed, 5 insertions(+), 5 deletions(-)
[GitHub] [airflow] potiuk merged pull request #20355: Set default logger in logging Mixin
potiuk merged pull request #20355: URL: https://github.com/apache/airflow/pull/20355 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] potiuk commented on a change in pull request #20497: Fix mypy aws example dags
potiuk commented on a change in pull request #20497: URL: https://github.com/apache/airflow/pull/20497#discussion_r776105553 ## File path: airflow/providers/amazon/aws/operators/eks.pyi ## @@ -0,0 +1,97 @@ +from typing import Optional, Dict, List Review comment: Rebasing to latest main and running pre-commit should add licence automatically @subkanthi -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] potiuk commented on a change in pull request #20497: Fix mypy aws example dags
potiuk commented on a change in pull request #20497: URL: https://github.com/apache/airflow/pull/20497#discussion_r776105553 ## File path: airflow/providers/amazon/aws/operators/eks.pyi ## @@ -0,0 +1,97 @@ +from typing import Optional, Dict, List Review comment: Rebasing to latest main and running pre-commit should add licence automatically. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] potiuk commented on pull request #20528: Change download_video parameter to resourceName
potiuk commented on pull request #20528: URL: https://github.com/apache/airflow/pull/20528#issuecomment-1002324364 Some tests are failing unfortunately :( -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] potiuk commented on issue #20515: Improve the connections list view
potiuk commented on issue #20515: URL: https://github.com/apache/airflow/issues/20515#issuecomment-1002324192 Pretty much so @mik-laj . @BasPH - should we close that one? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] potiuk commented on pull request #20527: Fixed has_calls to assert_has_calls
potiuk commented on pull request #20527: URL: https://github.com/apache/airflow/pull/20527#issuecomment-1002324006 Oh yeah. I expected some errors like those in the tests :( -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] potiuk commented on pull request #19852: created SFTPBatchOperator which add batch function
potiuk commented on pull request #19852: URL: https://github.com/apache/airflow/pull/19852#issuecomment-1002323697 Docs are failing too. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[airflow] branch main updated (b5d520c -> 7d4d38b)
This is an automated email from the ASF dual-hosted git repository. potiuk pushed a change to branch main in repository https://gitbox.apache.org/repos/asf/airflow.git. from b5d520c Reinstate `region` to `default_args` for Alibaba example DAGs (#20423) add 7d4d38b avoid deprecation warnings in BigQuery transfer operators (#20502) No new revisions were added by this update. Summary of changes: airflow/providers/google/cloud/transfers/bigquery_to_gcs.py | 2 +- airflow/providers/google/cloud/transfers/bigquery_to_mssql.py | 2 +- airflow/providers/google/cloud/transfers/bigquery_to_mysql.py | 2 +- 3 files changed, 3 insertions(+), 3 deletions(-)
[GitHub] [airflow] potiuk closed issue #20437: Deprecated arg is passed to BigQueryHook
potiuk closed issue #20437: URL: https://github.com/apache/airflow/issues/20437
[GitHub] [airflow] potiuk merged pull request #20502: avoid deprecation warnings in BigQuery transfer operators
potiuk merged pull request #20502: URL: https://github.com/apache/airflow/pull/20502
[GitHub] [airflow] mingshi-wang closed pull request #20543: Fix mypy errors in decorators module
mingshi-wang closed pull request #20543: URL: https://github.com/apache/airflow/pull/20543
[GitHub] [airflow] mingshi-wang commented on pull request #20543: Fix mypy errors in decorators module
mingshi-wang commented on pull request #20543: URL: https://github.com/apache/airflow/pull/20543#issuecomment-1002320984 I just found that this is a dup of https://github.com/apache/airflow/pull/20034
[GitHub] [airflow] mingshi-wang opened a new pull request #20543: Fix mypy errors in decorators module
mingshi-wang opened a new pull request #20543: URL: https://github.com/apache/airflow/pull/20543 Fix mypy errors in airflow/decorators module. related: https://github.com/apache/airflow/issues/19891
[GitHub] [airflow] rsg17 opened a new pull request #20542: [Part 1]: Add hook for integrating with Google Calendar
rsg17 opened a new pull request #20542: URL: https://github.com/apache/airflow/pull/20542 related: #8471 This PR adds a hook for integrating with Google Calendar and corresponding unit tests. The code/steps follow a structure similar to that of [Google Sheets integration](https://github.com/apache/airflow/blob/main/airflow/providers/google/suite/hooks/sheets.py). Further steps to close #8471 involve creating operators to write to GCS, corresponding unit tests, and system tests. I will create those as a separate PR.
[airflow] branch main updated (528baf4 -> b5d520c)
This is an automated email from the ASF dual-hosted git repository. potiuk pushed a change to branch main in repository https://gitbox.apache.org/repos/asf/airflow.git. from 528baf4 Remove PyJWT upper bound from Dockerfile (#20503) add b5d520c Reinstate `region` to `default_args` for Alibaba example DAGs (#20423) No new revisions were added by this update. Summary of changes: .../cloud/example_dags/example_oss_bucket.py | 6 +- .../cloud/example_dags/example_oss_object.py | 6 +- airflow/providers/alibaba/cloud/operators/oss.py | 12 ++-- airflow/providers/alibaba/cloud/operators/oss.pyi | 82 ++ 4 files changed, 92 insertions(+), 14 deletions(-) create mode 100644 airflow/providers/alibaba/cloud/operators/oss.pyi
[GitHub] [airflow] potiuk merged pull request #20423: Reinstate `region` to `default_args` for Alibaba example DAGs
potiuk merged pull request #20423: URL: https://github.com/apache/airflow/pull/20423
[GitHub] [airflow] potiuk closed pull request #20506: Add support to replace S3 file on MySqlToS3Operator
potiuk closed pull request #20506: URL: https://github.com/apache/airflow/pull/20506
[airflow] branch main updated (f743e46 -> 528baf4)
This is an automated email from the ASF dual-hosted git repository. potiuk pushed a change to branch main in repository https://gitbox.apache.org/repos/asf/airflow.git. from f743e46 20496 fix port standalone mode (#20505) add 528baf4 Remove PyJWT upper bound from Dockerfile (#20503) No new revisions were added by this update. Summary of changes: Dockerfile | 3 +-- Dockerfile.ci | 3 +-- 2 files changed, 2 insertions(+), 4 deletions(-)
[GitHub] [airflow] potiuk merged pull request #20503: Remove PyJWT upper bound from Dockerfile
potiuk merged pull request #20503: URL: https://github.com/apache/airflow/pull/20503
[GitHub] [airflow] potiuk commented on pull request #20238: Optimize dockerfiles for local rebuilds
potiuk commented on pull request #20238: URL: https://github.com/apache/airflow/pull/20238#issuecomment-1002313050 All should be fixed.
[GitHub] [airflow] potiuk commented on a change in pull request #20238: Optimize dockerfiles for local rebuilds
potiuk commented on a change in pull request #20238: URL: https://github.com/apache/airflow/pull/20238#discussion_r776097411 ## File path: Dockerfile ## @@ -62,6 +65,7 @@ ENV PYTHON_BASE_IMAGE=${PYTHON_BASE_IMAGE} \ # Install curl and gnupg2 - needed for many other installation steps RUN apt-get update \ +&& apt-get install --no-install-recommends -yqq apt-utils >/dev/null 2>&1 \ Review comment: However - I think that is a good idea eventually and I merged this RUN with the next one. Also, I realized that you asked about the ``` && apt-get install --no-install-recommends -yqq apt-utils >/dev/null 2>&1 \ ``` The main reason is that you need to install apt-utils to get rid of some warnings printed by apt-get. But I do not want to redirect stderr to /dev/null for all installation steps (just in case of some errors) - so I only redirect stderr in this one step (otherwise installing apt-utils itself would produce a stderr warning too).
[GitHub] [airflow] potiuk commented on a change in pull request #20238: Optimize dockerfiles for local rebuilds
potiuk commented on a change in pull request #20238: URL: https://github.com/apache/airflow/pull/20238#discussion_r776088296 ## File path: scripts/docker/compile_www_assets.sh ## @@ -35,8 +34,26 @@ function compile_www_assets() { www_dir="$(python -m site --user-site)/airflow/www" fi pushd ${www_dir} || exit 1 -yarn install --frozen-lockfile --no-cache -yarn run prod +set +e +yarn install --frozen-lockfile --no-cache 2>/tmp/out-yarn-install.txt +local res=$? +if [[ ${res} != 0 ]]; then +>&2 echo +>&2 echo "Error when running yarn install:" +>&2 echo +>&2 cat /tmp/out-yarn-install.txt +exit 1 +fi +yarn run prod 2>/tmp/out-yarn-run.txt +res=$? +if [[ ${res} != 0 ]]; then +>&2 echo +>&2 echo "Error when running yarn install:" +>&2 echo +>&2 cat /tmp/out-yarn-run.txt +exit 1 +fi +set -e find package.json yarn.lock static/css static/js -type f | sort | xargs md5sum > "${md5sum_file}" Review comment: Yep.
[GitHub] [airflow] potiuk commented on a change in pull request #20238: Optimize dockerfiles for local rebuilds
potiuk commented on a change in pull request #20238: URL: https://github.com/apache/airflow/pull/20238#discussion_r776087280 ## File path: Dockerfile ## @@ -62,6 +65,7 @@ ENV PYTHON_BASE_IMAGE=${PYTHON_BASE_IMAGE} \ # Install curl and gnupg2 - needed for many other installation steps RUN apt-get update \ +&& apt-get install --no-install-recommends -yqq apt-utils >/dev/null 2>&1 \ Review comment: Because we need curl/gpg2 to run DEV_APT_COMMAND - and we can potentially override this command (and we need curl/gpg2 to be able to install "extra resources" if needed). I prefer to install curl/gpg2 using "official sources" first - to be able to add "new sources" later. Otherwise it would require running "apt update" twice anyhow - once with the original sources to install curl/gpg2 and download keys for the "secondary" sources, and then running apt update again. The optimisations we could get this way are minimal, and it's much better to separate the concerns - the command to run all of that in one RUN would be pretty long.
[GitHub] [airflow] khalidmammadov commented on issue #19891: Re-enable MyPy
khalidmammadov commented on issue #19891: URL: https://github.com/apache/airflow/issues/19891#issuecomment-1002299683 Anyone looking at `tests/providers/amazon`?
[GitHub] [airflow] potiuk commented on a change in pull request #20238: Optimize dockerfiles for local rebuilds
potiuk commented on a change in pull request #20238: URL: https://github.com/apache/airflow/pull/20238#discussion_r776085431 ## File path: Dockerfile ## @@ -62,6 +65,7 @@ ENV PYTHON_BASE_IMAGE=${PYTHON_BASE_IMAGE} \ # Install curl and gnupg2 - needed for many other installation steps Review comment: Removed it. It's not needed.
[GitHub] [airflow] potiuk commented on pull request #20499: Docs: Note that `remote_log_conn_id` is only used for reading logs, not writing them
potiuk commented on pull request #20499: URL: https://github.com/apache/airflow/pull/20499#issuecomment-1002295331 Static checks are failing - the example file needs to be updated. You can install pre-commit to get it fixed automatically for you.
[GitHub] [airflow] josh-fell commented on pull request #20510: Add SAS check before attempting container creation when uploading files via WasbHook
josh-fell commented on pull request #20510: URL: https://github.com/apache/airflow/pull/20510#issuecomment-1002293598 Thinking out loud here, would changing up the behavior here be considered a breaking change? We're no longer going to attempt to create a container when uploading to Azure Blob Storage which _may_ be functionality users take advantage of now.
[GitHub] [airflow] khalidmammadov closed pull request #20504: Fixing various issues in utils folder reported by MyPy
khalidmammadov closed pull request #20504: URL: https://github.com/apache/airflow/pull/20504
[GitHub] [airflow] khalidmammadov commented on pull request #20504: Fixing various issues in utils folder reported by MyPy
khalidmammadov commented on pull request #20504: URL: https://github.com/apache/airflow/pull/20504#issuecomment-1002287711 Closing in favor of #20482
[airflow] branch main updated (2976e6f -> f743e46)
This is an automated email from the ASF dual-hosted git repository. potiuk pushed a change to branch main in repository https://gitbox.apache.org/repos/asf/airflow.git. from 2976e6f Removes unnecessary --upgrade option from our examples (#20537) add f743e46 20496 fix port standalone mode (#20505) No new revisions were added by this update. Summary of changes: airflow/cli/commands/standalone_command.py | 8 +++- 1 file changed, 7 insertions(+), 1 deletion(-)
[GitHub] [airflow] potiuk merged pull request #20505: 20496 fix port standalone mode
potiuk merged pull request #20505: URL: https://github.com/apache/airflow/pull/20505
[GitHub] [airflow] potiuk closed issue #20496: The standalone model bug, about port 8080
potiuk closed issue #20496: URL: https://github.com/apache/airflow/issues/20496
[GitHub] [airflow] potiuk commented on issue #20248: "TaskInstance.dag_run" is joined twice in Dag.get_task_instances()
potiuk commented on issue #20248: URL: https://github.com/apache/airflow/issues/20248#issuecomment-1002284426 Assigned
[GitHub] [airflow] github-actions[bot] commented on pull request #20541: StreamLogWriter: swap _buffer before _propagate_log()
github-actions[bot] commented on pull request #20541: URL: https://github.com/apache/airflow/pull/20541#issuecomment-1002279446 The PR most likely needs to run full matrix of tests because it modifies parts of the core of Airflow. However, committers might decide to merge it quickly and take the risk. If they don't merge it quickly - please rebase it to the latest main at your convenience, or amend the last commit of the PR, and push it with --force-with-lease.
[GitHub] [airflow] potiuk commented on pull request #20541: StreamLogWriter: swap _buffer before _propagate_log()
potiuk commented on pull request #20541: URL: https://github.com/apache/airflow/pull/20541#issuecomment-1002279213 Looks good! @ashb @kaxil @mik-laj @uranusjr can you please take a look in case there is anything I could have missed? We need two committer reviews here.
[GitHub] [airflow] yehoshuadimarsky commented on issue #20538: Change Helm Chart tolerations logic when combining multiple parts of values.yaml
yehoshuadimarsky commented on issue #20538: URL: https://github.com/apache/airflow/issues/20538#issuecomment-1002276844 But for some reason, the Scheduler has the order reversed: https://github.com/apache/airflow/blob/6dfc939833fd3dc477b3971d965ba142d3b8bd77/chart/templates/scheduler/scheduler-deployment.yaml#L33
[GitHub] [airflow] potiuk edited a comment on pull request #20473: SSHOperator.execute() returns str regardless of enable_xcom_pickling
potiuk edited a comment on pull request #20473: URL: https://github.com/apache/airflow/pull/20473#issuecomment-1002275764
[GitHub] [airflow] potiuk commented on pull request #20473: SSHOperator.execute() returns str regardless of enable_xcom_pickling
potiuk commented on pull request #20473: URL: https://github.com/apache/airflow/pull/20473#issuecomment-1002275764 > My real point always has been that the `xcom_pull` result should be consistent, and whether an encoding/decoding is used shouldn't matter to the user. Since `enable_xcom_pickling=False` (which uses json) is quite limited in the data types that can be serialized, we chose to uuencode the return value from the `SSHOperator`, which causes the inconsistency. It works, though, as you indicated. I just wonder, why do you think it should be consistent? We do not have auto-discovery of the types and format of returned values in Airflow. Airflow is task-based, not data-based, so basically if you want to pass data between tasks, both producer and consumer have to agree on the naming conventions and data format used. There is no "universal" discovery. Also, even in the case you started with - "pickling" is deprecated and is actually set "per installation". So different installations may use different formats, but the format will be "consistent" within each installation. I do not see huge value in providing a "consistent" output format in this case. > However, if "arbitrary binary return value" + "json serialization" is what we want to support, we should wrap json to deal with binary data, instead of forcing `SSHOperator` to do the encode work --- what about other operators that may return binary data? I do not think we need to do it. Again, this is something we do not really want to support via json serialization. SSH is an outlier here indeed. We deliberately only support "json-serializable" objects in the default XCom implementation. The possibility of supporting binary storage is already implemented via custom XCom backends. See here for a well-explained example: https://www.astronomer.io/guides/custom-xcom-backends
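The custom-XCom-backend approach mentioned above can be sketched in plain Python. This is a hypothetical, self-contained illustration (a dict stands in for real object storage such as S3/GCS, and the class/helper names are invented), not Airflow's actual `BaseXCom` code: binary payloads are stored externally and only a JSON-safe string reference travels through XCom.

```python
import json
import uuid

# Stand-in for external object storage (a real backend would use S3/GCS).
_store = {}


class BinaryXComBackend:
    """Sketch: keep XCom JSON-serializable by storing binary payloads
    externally and passing only a string reference through XCom."""

    @staticmethod
    def serialize_value(value):
        if isinstance(value, bytes):
            key = f"xcom-blob-{uuid.uuid4()}"
            _store[key] = value                   # "upload" to object storage
            value = {"__blob_ref__": key}         # only the reference is JSON
        return json.dumps(value)

    @staticmethod
    def deserialize_value(payload):
        value = json.loads(payload)
        if isinstance(value, dict) and "__blob_ref__" in value:
            return _store[value["__blob_ref__"]]  # "download" from storage
        return value


# Round trip: bytes survive, ordinary JSON values are untouched.
blob = b"\x00\x01binary output\xff"
assert BinaryXComBackend.deserialize_value(
    BinaryXComBackend.serialize_value(blob)) == blob
assert BinaryXComBackend.deserialize_value(
    BinaryXComBackend.serialize_value({"rows": 3})) == {"rows": 3}
```

The key design point is that the XCom table itself only ever sees JSON; the producer and consumer agree implicitly through the backend, which matches the "per installation" consistency described above.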
[GitHub] [airflow] derkuci commented on issue #20500: infinite recursion in StreamLogWriter
derkuci commented on issue #20500: URL: https://github.com/apache/airflow/issues/20500#issuecomment-1002273148 Here you go, #20541 Thanks for your time.
[GitHub] [airflow] derkuci opened a new pull request #20541: StreamLogWriter: swap _buffer before _propagate_log()
derkuci opened a new pull request #20541: URL: https://github.com/apache/airflow/pull/20541 As initially discussed in #20500, if the user code adds a log handler to write to the standard out/err, there will be an infinite loop of logging messages. The `StreamLogWriter._buffer` will be doubled in each iteration, and soon use up all the memory. This change doesn't stop the infinite loop; it merely changes the growth of `StreamLogWriter._buffer` from exponential to linear, which will also lead to a runtime error (stack overflow). However, that is much easier to debug (see the infinite loop) than the out-of-memory error (no call stack). For normal cases, this doesn't change the logging behavior.
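The "swap `_buffer` before propagating" idea from this PR can be illustrated with a minimal pure-Python mock (hypothetical names, not the actual Airflow class): when a user handler writes back into the stream, clearing the buffer before calling the propagate function means each re-entrant call starts from an empty buffer, so message size grows linearly instead of the buffer doubling.

```python
class MockStreamLogWriter:
    """Minimal mock of a stream-like log writer with a line buffer."""

    def __init__(self, propagate):
        self._buffer = ""
        self._propagate = propagate

    def write(self, message):
        if "\n" not in message:
            self._buffer += message
            return
        # Swap the buffer out *before* propagating: a re-entrant write()
        # triggered inside the handler then appends to an empty buffer,
        # so nesting grows linearly instead of the buffer doubling.
        buf = self._buffer + message
        self._buffer = ""
        self._propagate(buf)


propagated = []

def echo_handler(msg):
    # A user handler that writes back to the stream (the loop from #20500).
    propagated.append(msg)
    if len(propagated) < 5:  # cap the recursion for the demo
        writer.write("handled: " + msg)

writer = MockStreamLogWriter(echo_handler)
writer.write("hello\n")

# Each propagated message is exactly one "handled: " prefix longer
# than the previous one - linear growth.
assert propagated[0] == "hello\n"
assert propagated[4] == "handled: " * 4 + "hello\n"
```

Without the swap, the re-entrant `write()` would append to a buffer that still contains the pending message, which is then propagated again, roughly doubling the buffered data on every iteration.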
[GitHub] [airflow] josh-fell commented on pull request #20510: Add SAS check before attempting container creation when uploading files via WasbHook
josh-fell commented on pull request #20510: URL: https://github.com/apache/airflow/pull/20510#issuecomment-1002269225 > We should not do this, instead we should get blob client, and upload. Look at download method Right on. I had a similar observation in the original issue as part of whatever the next steps could be. Happy to move in this direction. I'll set this PR to draft for now. Thanks for the feedback @ephraimbuddy! >P.S. Generally speaking the WasbHook methods seem to favor using a ContainerClient to perform blob-level operations even though this hook specifically interacts with Azure Blob Storage using a BlobServiceClient which can directly perform these operations. Presumably this exists because of the "auto-container creation" behavior. IMO this behavior seems worthy of revisiting.
[GitHub] [airflow] dstandish commented on pull request #20540: Remove `host` as an instance attr in `DatabricksHook`
dstandish commented on pull request #20540: URL: https://github.com/apache/airflow/pull/20540#issuecomment-1002268700 thanks @potiuk
[GitHub] [airflow] derkuci commented on pull request #20473: SSHOperator.execute() returns str regardless of enable_xcom_pickling
derkuci commented on pull request #20473: URL: https://github.com/apache/airflow/pull/20473#issuecomment-1002268405 Good point! Let's abandon the changes. My real point always has been that the `xcom_pull` result should be consistent, and whether an encoding/decoding is used shouldn't matter to the user. Since `enable_xcom_pickling=False` (which uses json) is quite limited in the data types that can be serialized, we chose to uuencode the return value from the `SSHOperator`, which causes the inconsistency. It works, though, as you indicated. However, if "arbitrary binary return value" + "json serialization" is what we want to support, we should wrap json to deal with binary data, instead of forcing `SSHOperator` to do the encode work --- what about other operators that may return binary data?
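The JSON limitation discussed here is easy to demonstrate with the stdlib: `json.dumps` raises `TypeError` on raw `bytes`, which is why the operator has to encode its binary output first. A minimal sketch of the "wrap json to deal with binary data" idea (hypothetical helper names, not Airflow's actual XCom serialization):

```python
import base64
import json


def xcom_serialize(value):
    # JSON cannot represent raw bytes; wrap them in a tagged,
    # base64-encoded envelope so the round trip is lossless.
    if isinstance(value, bytes):
        value = {"__type__": "bytes",
                 "data": base64.b64encode(value).decode("ascii")}
    return json.dumps(value)


def xcom_deserialize(payload):
    value = json.loads(payload)
    if isinstance(value, dict) and value.get("__type__") == "bytes":
        return base64.b64decode(value["data"])
    return value


raw = b"\x00\x01binary output\xff"
assert xcom_deserialize(xcom_serialize(raw)) == raw           # bytes survive
assert xcom_deserialize(xcom_serialize("plain")) == "plain"   # strings untouched
```

With a wrapper like this, the consumer gets back the same `bytes` the producer returned, so no per-operator encoding convention is needed; the trade-off is that every deserializer must understand the envelope.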
[GitHub] [airflow] khalidmammadov commented on pull request #20482: Fix mypy errors in airflow/utils/
khalidmammadov commented on pull request #20482: URL: https://github.com/apache/airflow/pull/20482#issuecomment-1002267626 > > Duplication of effort here #20504 ... > > I wish we all mention/reference these changes here #19891 as per convention so to avoid time waste. > > My bad... No worries. I will close that. Please take a look at it anyway to see if you want to amend or express anything here differently, particularly those overloadings. I am not sure why they are necessary.
[GitHub] [airflow] john-jac commented on pull request #19137: Add RedshiftDataHook
john-jac commented on pull request #19137: URL: https://github.com/apache/airflow/pull/19137#issuecomment-1002267357 > After #20276 > ~is this PR still needed?~ > Using `RedshiftDataHook` can be confusing, isn't it? > cc @dstandish It is different. See description in the comments above. Very sorry for the delay. The additional tests requested are still to be coded.
[GitHub] [airflow] potiuk commented on pull request #19608: Enhance `multiple_outputs` inference of dict typing
potiuk commented on pull request #19608: URL: https://github.com/apache/airflow/pull/19608#issuecomment-1002265864 > Looks like a new static check was added in #20501 which checks syntax for 3.7+ and is attempting to remove the check for Python 3.6 in the typing inference logic here. Is dropping Python 3.6 support slated for 2.2.4? Or should I go ahead and remove the Python 3.6 check and assume it's no longer supported even though #20467 is not part of a release yet? Interesting one that we have not foreseen. I think we should disable the pyupgrade for now and re-enable it when we release 2.3.0.
[airflow] branch main updated (d3b3161 -> 2976e6f)
This is an automated email from the ASF dual-hosted git repository. potiuk pushed a change to branch main in repository https://gitbox.apache.org/repos/asf/airflow.git. from d3b3161 Remove `host` as an instance attr in `DatabricksHook` (#20540) add 2976e6f Removes unnecessary --upgrade option from our examples (#20537) No new revisions were added by this update. Summary of changes: docs/apache-airflow/installation/installing-from-pypi.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
[GitHub] [airflow] potiuk merged pull request #20537: Removes unnecessary --upgrade option from our examples
potiuk merged pull request #20537: URL: https://github.com/apache/airflow/pull/20537
[airflow] branch main updated (9e315ff -> d3b3161)
This is an automated email from the ASF dual-hosted git repository. potiuk pushed a change to branch main in repository https://gitbox.apache.org/repos/asf/airflow.git. from 9e315ff Avoid calling DAG.following_schedule() for TaskInstance.get_template_context() (#20486) add d3b3161 Remove `host` as an instance attr in `DatabricksHook` (#20540) No new revisions were added by this update. Summary of changes: airflow/providers/databricks/hooks/databricks.py | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
[GitHub] [airflow] potiuk merged pull request #20540: Remove `host` as an instance attr in `DatabricksHook`
potiuk merged pull request #20540: URL: https://github.com/apache/airflow/pull/20540
[GitHub] [airflow] eladkal edited a comment on pull request #19137: Add RedshiftDataHook
eladkal edited a comment on pull request #19137: URL: https://github.com/apache/airflow/pull/19137#issuecomment-1002263888 After https://github.com/apache/airflow/pull/20276 ~~is this PR still needed?~~ Using `RedshiftDataHook` can be confusing, isn't it? cc @dstandish
[GitHub] [airflow] potiuk edited a comment on pull request #20540: Remove `host` as an instance attr in `DatabricksHook`
potiuk edited a comment on pull request #20540: URL: https://github.com/apache/airflow/pull/20540#issuecomment-1002263115 I think "behavioural" changes like that are not "breaking". Whether `self.host` is an instance variable does not change the "use" of it - at most it optimizes some behaviours - in this case `self.host` was created in both the DagParser and the Task; once removed, it is only created in the Task (during execute). As we discussed here: https://github.com/apache/airflow/pull/19572#discussion_r776052141 - Databricks is not really a "generic" operator that you might want to extend or derive from to use internal features. Unlike KPO, for example, which is generic and where it's easier to imagine people wanting to change its behaviour by extending it. So I think we can consider its internals to be "implementation details". Unfortunately we do not have a clear-cut rule for what is breaking and what is not :(. Theoretically even adding a default parameter might be breaking, because someone could overload that method in a derived class, or you could imagine someone checking the signature of the method. This is really almost always an exercise of "how likely is this change to break something" rather than a clear-cut 0/1 breaking/non-breaking decision. And except for some obvious cases, it will almost always have to be judged individually.
[GitHub] [airflow] eladkal commented on pull request #19137: Add RedshiftDataHook
eladkal commented on pull request #19137: URL: https://github.com/apache/airflow/pull/19137#issuecomment-1002263888 After https://github.com/apache/airflow/pull/20276 is this PR still needed? cc @dstandish
[GitHub] [airflow] potiuk commented on pull request #20540: Remove `host` as an instance attr in `DatabricksHook`
potiuk commented on pull request #20540: URL: https://github.com/apache/airflow/pull/20540#issuecomment-1002263115 I think "behavioural" changes like that are not "breaking". Whether `self.host` is an instance variable does not change the "use" of it - at most it optimizes some behaviours - in this case `self.host` was created in both the DagParser and the Task; once removed, it is only created in the Task (during execute). As we discussed here: https://github.com/apache/airflow/pull/19572#discussion_r776052141 - Databricks is not really a "generic" operator that you might want to extend or derive from to use internal features. Unlike KPO, for example, which is generic and where it's easier to imagine people wanting to change its behaviour by extending it. So I think we can consider its internals to be "implementation details". Unfortunately we do not have a clear-cut rule for what is breaking and what is not :(. Theoretically even adding a default parameter might be breaking, because someone could overload that method in a derived class, or you could imagine someone checking the signature of the method. This is really almost always an exercise of "how likely is this change to break something" rather than a clear-cut 0/1 breaking/non-breaking decision. And except for some obvious cases, it will almost always have to be judged individually.
[GitHub] [airflow] dstandish edited a comment on pull request #20540: Remove `host` as an instance attr in `DatabricksHook`
dstandish edited a comment on pull request #20540: URL: https://github.com/apache/airflow/pull/20540#issuecomment-1002260448 > > I like it... BUT... must we consider this a "breaking" change? > > Why ? Because `host` will no longer be set after `__init__`. In the last release it seems the instance attr `host` was set in `__init__`. This is among the most marginal of changes we should consider "breaking", but if we want to be strict it is probably right to consider it so. In this case the solution should probably be to make it a cached property. But @potiuk, if you think we can just chop the attr, that would be best from a code perspective. Based on @josh-fell's analysis it seems it was only an instance attr for one release (since v2.1.0).
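The cached-property alternative mentioned above can be sketched as follows. This is an illustrative stand-in, not the real `DatabricksHook` (the class name, `conn_host` field, and normalization logic are hypothetical): `host` stays accessible as an attribute for any code that reads it, but is computed lazily on first access rather than eagerly in `__init__`, so DAG parsing never pays the cost.

```python
from functools import cached_property

class DatabricksHookSketch:
    """Hypothetical sketch of the cached-property approach to `host`."""

    def __init__(self, conn_host: str):
        # Stand-in for the Airflow connection lookup; illustrative only.
        self._conn_host = conn_host

    @cached_property
    def host(self) -> str:
        # Computed once, on first access (typically during task execution),
        # then cached on the instance for subsequent reads.
        return self._conn_host.rstrip("/")

hook = DatabricksHookSketch("https://example.cloud.databricks.com/")
print(hook.host)
```

This keeps the public surface backward compatible while matching the "only created during execute" behavior described in the merged change.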