[GitHub] [airflow] eladkal commented on issue #24364: BigqueryToGCS Operator Failing
eladkal commented on issue #24364: URL: https://github.com/apache/airflow/issues/24364#issuecomment-115288 That is a question for Composer support. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] adityaprakash-bobby commented on issue #24364: BigqueryToGCS Operator Failing
adityaprakash-bobby commented on issue #24364: URL: https://github.com/apache/airflow/issues/24364#issuecomment-1152854133 @eladkal Is it wise to override the Composer-managed Airflow Python packages? I see that Composer 2.0.14 through the latest 2.0.16 has the apache-airflow-providers-google==2022.5.18+composer package.
[GitHub] [airflow-site] rajans163 opened a new issue, #612: requests.exceptions.SSLError: HTTPSConnectionPool(host='cloud.getdbt.com', port=443) while connecting to dbt cloud
rajans163 opened a new issue, #612: URL: https://github.com/apache/airflow-site/issues/612 Hi all, I am trying to connect to dbt Cloud using Airflow but am getting this error: requests.exceptions.SSLError: HTTPSConnectionPool(host='cloud.getdbt.com', port=443):
[GitHub] [airflow] josh-fell commented on issue #24014: Task mapping against 'params'
josh-fell commented on issue #24014: URL: https://github.com/apache/airflow/issues/24014#issuecomment-1152839870 @hammerhead See the "Anything else" section of #24388, but you should still be able to map via `parameters` for the `DELETE` DML in Jinja-templated SQL using `{{ task.mapped_kwargs.parameters[ti.map_index].<attr> }}` as a workaround. Assuming the desired SQL is `DELETE FROM {table_fqn} WHERE {column} = {value};`, you could try:

```python
sql = """
DELETE FROM {{ task.mapped_kwargs.parameters[ti.map_index].table_fqn }}
WHERE {{ task.mapped_kwargs.parameters[ti.map_index].column }}
    = {{ task.mapped_kwargs.parameters[ti.map_index].value }};"""

PostgresOperator.partial(
    task_id="delete_partitions",
    postgres_conn_id="createdb_connection",
    sql=sql,
).expand(
    parameters=[
        {"table_fqn": "tbl1", "column": "col1", "value": "val1"},
        {"table_fqn": "tbl2", "column": "col2", "value": "val3"},
    ]
)
```

Does this help?
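The lookup the workaround relies on can be sketched without Airflow at all: each mapped task instance selects its own parameter dict by its `map_index` and substitutes the pieces into the statement. This is a minimal, Airflow-free sketch (the `render_delete_sql` helper is hypothetical, not an Airflow API), not what Airflow's Jinja engine actually executes:

```python
# Each mapped task instance picks its own parameter dict by map_index,
# mirroring {{ task.mapped_kwargs.parameters[ti.map_index] }}.
PARAMETERS = [
    {"table_fqn": "tbl1", "column": "col1", "value": "val1"},
    {"table_fqn": "tbl2", "column": "col2", "value": "val3"},
]


def render_delete_sql(parameters, map_index):
    """Build the DELETE statement a given mapped task instance would run."""
    p = parameters[map_index]
    return f"DELETE FROM {p['table_fqn']} WHERE {p['column']} = {p['value']};"


for i in range(len(PARAMETERS)):
    print(render_delete_sql(PARAMETERS, i))
```

Each of the two mapped task instances thus renders its own statement from the shared template.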
[GitHub] [airflow] josh-fell opened a new issue, #24388: Unable to access operator attrs within Jinja context for mapped tasks
josh-fell opened a new issue, #24388: URL: https://github.com/apache/airflow/issues/24388

### Apache Airflow version

2.3.2 (latest released)

### What happened

When attempting to generate mapped SQL tasks using a Jinja-templated query, an exception like the following is thrown:

`jinja2.exceptions.UndefinedError: 'airflow.models.mappedoperator.MappedOperator object' has no attribute '<attr>'`

For example, when attempting to map `SQLValueCheckOperator` tasks with respect to `database` using a query of `SELECT COUNT(*) FROM {{ task.database }}.tbl;`:

`jinja2.exceptions.UndefinedError: 'airflow.models.mappedoperator.MappedOperator object' has no attribute 'database'`

Or, when using `SnowflakeOperator` and mapping via `parameters` of a query like `SELECT * FROM {{ task.parameters.tbl }};`:

`jinja2.exceptions.UndefinedError: 'airflow.models.mappedoperator.MappedOperator object' has no attribute 'parameters'`

### What you think should happen instead

When using Jinja-templated SQL queries, the attribute being used for the mapping should be accessible via `{{ task.<attr> }}`. Executing the same SQL query with classic, non-mapped tasks allows this operator-attribute access from the `task` context object. Ideally, the same interface should apply to both non-mapped and mapped tasks.
### How to reproduce

Consider the following DAG:

```python
from pendulum import datetime

from airflow.decorators import dag
from airflow.operators.sql import SQLValueCheckOperator
from airflow.providers.snowflake.operators.snowflake import SnowflakeOperator

CORE_SQL = "SELECT COUNT(*) FROM {{ task.database }}.tbl;"
SNOWFLAKE_SQL = """SELECT * FROM {{ task.parameters.tbl }};"""


@dag(dag_id="map-city", start_date=datetime(2022, 6, 7), schedule_interval=None)
def map_city():
    classic_sql_value_check = SQLValueCheckOperator(
        task_id="classic_sql_value_check",
        conn_id="snowflake",
        sql=CORE_SQL,
        database="dev",
        pass_value=2,
    )

    mapped_value_check = SQLValueCheckOperator.partial(
        task_id="check_row_count",
        conn_id="snowflake",
        sql=CORE_SQL,
        pass_value=2,
    ).expand(database=["dev", "production"])

    classic_snowflake_task = SnowflakeOperator(
        task_id="classic_snowflake_task",
        snowflake_conn_id="snowflake",
        sql=SNOWFLAKE_SQL,
        parameters={"tbl": "foo"},
    )

    mapped_snowflake_task = SnowflakeOperator.partial(
        task_id="mapped_snowflake_task",
        snowflake_conn_id="snowflake",
        sql=SNOWFLAKE_SQL,
    ).expand(
        parameters=[
            {"tbl": "foo"},
            {"tbl": "bar"},
        ]
    )


_ = map_city()
```

**`SQLValueCheckOperator` tasks**

The logs for the non-mapped "classic_sql_value_check" task show the query executing as expected:

`[2022-06-11, 02:01:03 UTC] {sql.py:204} INFO - Executing SQL check: SELECT COUNT(*) FROM dev.tbl;`

while the mapped "check_row_count" task fails with the following exception:

```bash
[2022-06-11, 02:01:03 UTC] {standard_task_runner.py:79} INFO - Running: ['airflow', 'tasks', 'run', 'map-city', 'check_row_count', 'manual__2022-06-11T02:01:01.831761+00:00', '--job-id', '350', '--raw', '--subdir', 'DAGS_FOLDER/map_city.py', '--cfg-path', '/tmp/tmpm5bg9mt5', '--map-index', '0', '--error-file', '/tmp/tmp2kbilt2l']
[2022-06-11, 02:01:03 UTC] {standard_task_runner.py:80} INFO - Job 350: Subtask check_row_count
[2022-06-11, 02:01:03 UTC] {task_command.py:370} INFO - Running on host 569596df5be5
[2022-06-11, 02:01:03 UTC] {taskinstance.py:1889} ERROR - Task failed with exception
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/airflow/models/taskinstance.py", line 1451, in _run_raw_task
    self._execute_task_with_callbacks(context, test_mode)
  File "/usr/local/lib/python3.9/site-packages/airflow/models/taskinstance.py", line 1555, in _execute_task_with_callbacks
    task_orig = self.render_templates(context=context)
  File "/usr/local/lib/python3.9/site-packages/airflow/models/taskinstance.py", line 2212, in render_templates
    rendered_task = self.task.render_template_fields(context)
  File "/usr/local/lib/python3.9/site-packages/airflow/models/mappedoperator.py", line 726, in render_template_fields
    self._do_render_template_fields(
  File "/usr/local/lib/python3.9/site-packages/airflow/utils/session.py", line 68, in wrapper
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/airflow/models/abstractoperator.py", line 344, in _do_render_template_fields
    rendered_content = self.render_template(
  File
```
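The difference the issue describes can be reproduced with a toy model (these are illustrative classes, not Airflow's real `MappedOperator`): a classic operator sets its kwargs as attributes, while a mapped wrapper stores its `expand()` kwargs in a `mapped_kwargs` mapping, so plain attribute lookup fails in templates.

```python
# Toy reproduction of the reported difference (not Airflow's real classes).
class ClassicOperator:
    def __init__(self, database):
        self.database = database  # Jinja's {{ task.database }} finds this


class MappedOperatorToy:
    def __init__(self, **expand_kwargs):
        # No per-attribute assignment, so attribute lookup fails in templates,
        # producing the UndefinedError from the issue.
        self.mapped_kwargs = expand_kwargs


classic = ClassicOperator(database="dev")
mapped = MappedOperatorToy(database=["dev", "production"])
```

On the classic operator `task.database` resolves; on the mapped one only the `mapped_kwargs` mapping holds the values.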
[GitHub] [airflow] vanchaxy commented on pull request #24342: KubernetesExecutor: Always give precedence to pod from executor_config arg
vanchaxy commented on PR #24342: URL: https://github.com/apache/airflow/pull/24342#issuecomment-1152813887 Hi, I also encountered this bug. For now, I work around it with pod_hook. From what I see, at the start of `construct_pod` (line 339) we check for the image in pod_override_object and use it in kube_image. This should be removed if we change the order of pods as you did in this PR, or the namespace should be overwritten in a similar way. I see the problem that after this PR, users of Airflow can overwrite some important annotations/labels which Airflow uses to operate, and this will break executor work. My suggestion is:
1) Remove the image overwriting at the start of the function (the try...except block).
2) Split dynamic_pod into two different pods: one with namespace and kube_image, another with annotations/labels/name/args/env.
3) Change the pod_list order to: pod_template_file -> pod with namespace and image -> pod from executor_config -> pod from dynamic arguments (annotations/labels/...).
After this change, we will have two dynamic pods: one with values that we allow users to overwrite and another with values that should not be overwritten by users.
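The proposed precedence order can be sketched as a left-to-right merge where later pods win on conflicts. All names below are illustrative (plain dicts standing in for `V1Pod` objects), not the actual `construct_pod` implementation:

```python
from functools import reduce


def merge_pods(pods):
    """Shallow-merge pod dicts left to right; the rightmost value wins."""
    return reduce(lambda acc, pod: {**acc, **pod}, pods, {})


# Illustrative pods, in the suggested precedence order:
pod_template = {"image": "template-image", "namespace": "default"}
namespace_and_image = {"image": "worker-image", "namespace": "team-a"}
executor_config = {"image": "user-image"}  # user override from executor_config
dynamic = {"labels": {"airflow-worker": "123"}}  # values Airflow must control

merged = merge_pods([pod_template, namespace_and_image, executor_config, dynamic])
```

Because the `dynamic` pod merges last, the labels Airflow needs to operate cannot be clobbered by `executor_config`, while the user's image override still takes effect.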
[GitHub] [airflow] Taragolis opened a new pull request, #24387: fix: RedshiftDataHook and RdsHook not use cached connection
Taragolis opened a new pull request, #24387: URL: https://github.com/apache/airflow/pull/24387 Following the finding in this PR: https://github.com/apache/airflow/pull/24057#discussion_r894848008 I've checked the other AWS hooks that do not use the cached `conn` property and found two: `RedshiftDataHook` and `RdsHook`. This PR adds a generic connection to `AwsBaseHook`, so overriding the `conn` property is no longer required. Type hinting still works in the IDE (PyCharm) after the changes: ![out](https://user-images.githubusercontent.com/3998685/173164439-fbf99a81-087e-43d7-b973-e51f378763fe.gif)
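Why a cached `conn` matters can be shown with a minimal sketch (illustrative classes, not the Airflow hooks): without caching, every access to the property would construct a new client.

```python
from functools import cached_property


class FakeClient:
    """Stand-in for a boto3 client; counts how many times it is built."""
    instances = 0

    def __init__(self):
        FakeClient.instances += 1


class HookWithCachedConn:
    @cached_property
    def conn(self):
        # Built on first access, then stored on the instance.
        return FakeClient()


hook = HookWithCachedConn()
first, second = hook.conn, hook.conn
```

Both accesses return the same object and only one client is ever built, which is the behavior the PR restores for `RedshiftDataHook` and `RdsHook`.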
[GitHub] [airflow] github-actions[bot] commented on pull request #18615: fix template string passed to env_vars when using render_template_as_native_obj for Kubernetes Pod Operator and add Tests
github-actions[bot] commented on PR #18615: URL: https://github.com/apache/airflow/pull/18615#issuecomment-1152810873 This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 5 days if no further activity occurs. Thank you for your contributions.
[GitHub] [airflow] github-actions[bot] commented on pull request #18938: Jinja templates should be rendered in dict keys
github-actions[bot] commented on PR #18938: URL: https://github.com/apache/airflow/pull/18938#issuecomment-1152810851 This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 5 days if no further activity occurs. Thank you for your contributions.
[GitHub] [airflow] github-actions[bot] closed pull request #21053: Fix: Exception when parsing log #20966
github-actions[bot] closed pull request #21053: Fix: Exception when parsing log #20966 URL: https://github.com/apache/airflow/pull/21053
[GitHub] [airflow] github-actions[bot] commented on pull request #23239: Pluggable dag bag
github-actions[bot] commented on PR #23239: URL: https://github.com/apache/airflow/pull/23239#issuecomment-1152810814 This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 5 days if no further activity occurs. Thank you for your contributions.
[GitHub] [airflow] dungdm93 commented on a diff in pull request #24311: Update Oracle library to latest version
dungdm93 commented on code in PR #24311: URL: https://github.com/apache/airflow/pull/24311#discussion_r894940070

## tests/providers/google/cloud/transfers/test_oracle_to_gcs.py:

```diff
@@ -19,7 +19,7 @@
 import unittest
 from unittest import mock

-import cx_Oracle
+import oracledb.base_impl as oracledbbase
```

Review Comment: Should import `oracledb` as well.

## tests/providers/oracle/hooks/test_oracle.py:

```diff
@@ -158,12 +158,12 @@ def test_get_conn_events(self, mock_connect):
         assert args == ()
         assert kwargs['events'] is True

-    @mock.patch('airflow.providers.oracle.hooks.oracle.cx_Oracle.connect')
+    @mock.patch('airflow.providers.oracle.hooks.oracle.oracledb.connect')
     def test_get_conn_purity(self, mock_connect):
         purity = {
-            'new': cx_Oracle.ATTR_PURITY_NEW,
-            'self': cx_Oracle.ATTR_PURITY_SELF,
-            'default': cx_Oracle.ATTR_PURITY_DEFAULT,
+            'new': oracledb.ATTR_PURITY_NEW,
+            'self': oracledb.ATTR_PURITY_SELF,
+            'default': oracledb.ATTR_PURITY_DEFAULT,
```

Review Comment: Should be `oracledb.PURITY_*` as well.

## tests/providers/oracle/hooks/test_oracle.py:

```diff
@@ -119,15 +119,15 @@ def test_get_conn_nencoding(self, mock_connect):
         assert 'encoding' not in kwargs
         assert kwargs['nencoding'] == 'UTF-8'

-    @mock.patch('airflow.providers.oracle.hooks.oracle.cx_Oracle.connect')
+    @mock.patch('airflow.providers.oracle.hooks.oracle.oracledb.connect')
     def test_get_conn_mode(self, mock_connect):
         mode = {
-            'sysdba': cx_Oracle.SYSDBA,
-            'sysasm': cx_Oracle.SYSASM,
-            'sysoper': cx_Oracle.SYSOPER,
-            'sysbkp': cx_Oracle.SYSBKP,
-            'sysdgd': cx_Oracle.SYSDGD,
-            'syskmt': cx_Oracle.SYSKMT,
+            'sysdba': oracledb.SYSDBA,
+            'sysasm': oracledb.SYSASM,
+            'sysoper': oracledb.SYSOPER,
+            'sysbkp': oracledb.SYSBKP,
+            'sysdgd': oracledb.SYSDGD,
+            'syskmt': oracledb.SYSKMT,
```

Review Comment: Should use `oracledb.AUTH_MODE_*` as well.
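The patching pattern these tests rely on can be shown without the Oracle driver installed. Below, a `SimpleNamespace` stands in for the hook module (a deliberate, hypothetical substitute for `airflow.providers.oracle.hooks.oracle`), and a plain integer stands in for an `oracledb` auth-mode constant:

```python
import types
from unittest import mock

# Hypothetical stand-in for the module under test; the real tests patch
# 'airflow.providers.oracle.hooks.oracle.oracledb.connect' the same way.
oracle_module = types.SimpleNamespace(connect=lambda **kwargs: None)

with mock.patch.object(oracle_module, "connect") as mock_connect:
    # 2 stands in for an oracledb AUTH_MODE_* constant.
    oracle_module.connect(user="scott", mode=2)

# call_args records exactly what the hook passed to connect().
args, kwargs = mock_connect.call_args
```

This is why the constant rename matters: the assertion compares the `mode` value forwarded to `connect()` against the driver's constant, so the tests must reference the names the new driver actually exports.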
[GitHub] [airflow] potiuk commented on pull request #24236: Add CI-friendly progress output for tests
potiuk commented on PR #24236: URL: https://github.com/apache/airflow/pull/24236#issuecomment-1152776485 An approval would be nice :)
[GitHub] [airflow] potiuk commented on pull request #24386: Fix links to sources for examples
potiuk commented on PR #24386: URL: https://github.com/apache/airflow/pull/24386#issuecomment-1152774964 @mik-laj - finally got some time to take a look, and it looks like I managed to fix it :)
[GitHub] [airflow] potiuk commented on pull request #24386: Fix links to sources for examples
potiuk commented on PR #24386: URL: https://github.com/apache/airflow/pull/24386#issuecomment-1152774199 cc: @rbiegacz
[GitHub] [airflow] potiuk commented on pull request #24386: Fix links to sources for examples
potiuk commented on PR #24386: URL: https://github.com/apache/airflow/pull/24386#issuecomment-1152772541 CC: @josh-fell - another fix to links for AIP-47. https://user-images.githubusercontent.com/595491/173156966-67e287dc-fa79-4658-9a9e-babdf52e4f8c.png vs. https://user-images.githubusercontent.com/595491/173157465-b6507187-0d0e-4266-9d4c-265a2dc5e2a7.png I think fixing past broken links will be difficult, because we would have to regenerate a lot of the class "apis" which were (mistakenly) excluded, but I have an idea how we can instead utilise links to GitHub and post-process links in the generated documentation (very similar to the fix merged yesterday).
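The "post-process links in the generated documentation" idea can be sketched as a regex rewrite over the rendered HTML. The URL pattern and base path below are purely illustrative assumptions, not the actual Airflow docs layout or the merged fix:

```python
import re

# Illustrative target; the real base would depend on the docs build.
GITHUB_BASE = "https://github.com/apache/airflow/blob/main"


def rewrite_example_links(html: str) -> str:
    """Rewrite relative example-source links to point at GitHub."""
    return re.sub(
        r'href="(/docs/_sources/[^"]+\.py)"',
        lambda m: f'href="{GITHUB_BASE}{m.group(1)}"',
        html,
    )


out = rewrite_example_links('<a href="/docs/_sources/example_dag.py">src</a>')
```

A post-processing pass like this avoids regenerating the excluded class "apis", since only the final HTML is touched.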
[GitHub] [airflow] potiuk opened a new pull request, #24386: Fix links to sources for examples
potiuk opened a new pull request, #24386: URL: https://github.com/apache/airflow/pull/24386 The links to example sources in exampleinclude have been broken in a number of providers, and they were additionally broken by AIP-47. This PR fixes that. --- **^ Add meaningful description above** Read the **[Pull Request Guidelines](https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst#pull-request-guidelines)** for more information. In case of fundamental code change, an Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)) is needed. In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x). In case of backwards-incompatible changes, please leave a note in a newsfragment file, named `{pr_number}.significant.rst`, in [newsfragments](https://github.com/apache/airflow/tree/main/newsfragments).
[GitHub] [airflow] ferruzzi commented on a diff in pull request #23628: Fixes SageMaker operator return values
ferruzzi commented on code in PR #23628: URL: https://github.com/apache/airflow/pull/23628#discussion_r894903355

## airflow/providers/amazon/aws/operators/sagemaker.py:

```diff
@@ -188,7 +193,7 @@ def execute(self, context: 'Context') -> dict:
         )
         if response['ResponseMetadata']['HTTPStatusCode'] != 200:
             raise AirflowException(f'Sagemaker Processing Job creation failed: {response}')
-        return {'Processing': self.hook.describe_processing_job(self.config['ProcessingJobName'])}
+        return {'Processing': serialize(self.hook.describe_processing_job(self.config['ProcessingJobName']))}
```

Review Comment: Thanks for the bump :stuck_out_tongue: For some reason this got stuck as a comment on a pending review instead of a reply. Strange. Sorry.

```
{
    'ProcessingInputs': [
        {
            'InputName': 'input',
            'AppManaged': False,
            'S3Input': {
                'S3Uri': 's3://bucket/project/preprocessing/input.csv',
                'LocalPath': '/opt/ml/processing/input',
                'S3DataType': 'S3Prefix',
                'S3InputMode': 'File',
                'S3DataDistributionType': 'FullyReplicated',
                'S3CompressionType': 'None'
            }
        }
    ],
    'ProcessingOutputConfig': {
        'Outputs': [
            {
                'OutputName': 'output',
                'S3Output': {
                    'S3Uri': 's3://bucket/project/processed-input-data',
                    'LocalPath': '/opt/ml/processing/output',
                    'S3UploadMode': 'EndOfJob'
                },
                'AppManaged': False
            }
        ]
    },
    'ProcessingJobName': 'project-processing',
    'ProcessingResources': {
        'ClusterConfig': {
            'InstanceCount': 1,
            'InstanceType': 'ml.m5.large',
            'VolumeSizeInGB': 1
        }
    },
    'StoppingCondition': {'MaxRuntimeInSeconds': 300},
    'AppSpecification': {'ImageUri': '123456789012.dkr.ecr.re-gio-n.amazonaws.com/process_data'},
    'RoleArn': 'arn:aws:iam::123456789012:role/service-role/SageMaker-Role',
    'ProcessingJobArn': 'arn:aws:sagemaker:re-gio-n:123456789012:processing-job/project-processing',
    'ProcessingJobStatus': 'Completed',
    'ProcessingEndTime': datetime.datetime(2022, 6, 7, 14, 1, 25, 20, tzinfo=tzlocal()),
    'ProcessingStartTime': datetime.datetime(2022, 6, 7, 14, 1, 6, 34000, tzinfo=tzlocal()),
    'LastModifiedTime': datetime.datetime(2022, 6, 7, 14, 1, 25, 443000, tzinfo=tzlocal()),
    'CreationTime': datetime.datetime(2022, 6, 7, 13, 57, 29, 487000, tzinfo=tzlocal()),
    'ResponseMetadata': {
        'RequestId': 'fy2r5b9e-552a-3759-7b2a-d15ef4f32dj3',
        'HTTPStatusCode': 200,
        'HTTPHeaders': {
            'x-amzn-requestid': 'fy2r5b9e-552a-3759-7b2a-d15ef4f32dj3',
            'content-type': 'application/x-amz-json-1.1',
            'content-length': '1244',
            'date': 'Tue, 07 Jun 2022 14:01:30 GMT'
        },
        'RetryAttempts': 0
    }
}
```
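The response above shows why serialization is needed: boto3 `describe_*` results contain `datetime` objects, which `json.dumps` rejects without a default handler. A minimal sketch of the idea (the `serialize` helper here is a hypothetical stand-in, not the actual Airflow helper from the PR):

```python
import datetime
import json

# A trimmed, boto3-style response containing a datetime value.
response = {
    "ProcessingJobStatus": "Completed",
    "CreationTime": datetime.datetime(2022, 6, 7, 13, 57, 29),
}


def serialize(obj):
    """Render datetimes as ISO-8601 strings so the payload is JSON/XCom safe."""
    return json.dumps(obj, default=lambda o: o.isoformat())


payload = serialize(response)
```

Without the `default` handler, `json.dumps(response)` raises `TypeError: Object of type datetime is not JSON serializable`; with it, the whole describe-job response round-trips as JSON.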
[GitHub] [airflow] o-nikolas commented on a diff in pull request #24357: DebugExecutor use ti.run() instead of ti._run_raw_task
o-nikolas commented on code in PR #24357: URL: https://github.com/apache/airflow/pull/24357#discussion_r894902778

## tests/jobs/test_backfill_job.py:

```diff
@@ -1599,7 +1615,7 @@ def test_mapped_dag(self, dag_id, executor_name, session):
         self.dagbag.process_file(str(TEST_DAGS_FOLDER / f'{dag_id}.py'))
         dag = self.dagbag.get_dag(dag_id)

-        when = datetime.datetime(2022, 1, 1)
+        when = timezone.datetime(2022, 1, 1)
```

Review Comment: :laughing: I meant more context for _why_ that's the case. I assume it's because folks just don't want to slow down the regular builds, but maybe there were timeouts on the test runners? Either way, what do you think about a scheduled CI run (once daily, maybe?) on `main` which includes long-running tests, so that we at least get some coverage for them?
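The `datetime.datetime` to `timezone.datetime` change in the diff above matters because naive and timezone-aware datetimes cannot be compared in Python. A minimal stdlib demonstration of the failure mode (Airflow's `timezone.datetime` helper produces UTC-aware values; here plain `tzinfo=timezone.utc` stands in for it):

```python
import datetime

naive = datetime.datetime(2022, 1, 1)
aware = datetime.datetime(2022, 1, 1, tzinfo=datetime.timezone.utc)

# Ordering comparisons between naive and aware datetimes raise TypeError.
try:
    naive < aware
    comparison_failed = False
except TypeError:
    comparison_failed = True
```

Using aware datetimes throughout the tests avoids tripping this when the value is compared against Airflow's internal (aware) execution dates.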
[GitHub] [airflow] potiuk commented on issue #8540: MLFlow Operator
potiuk commented on issue #8540: URL: https://github.com/apache/airflow/issues/8540#issuecomment-1152742880 @Superskyyy - I think the easiest way to get it done is to work on it. For now I will move it to a discussion (where it belongs), but if you would like to contribute (as one of > 2000 contributors) and make a PR to get it implemented, that would be cool.
[GitHub] [airflow] potiuk closed issue #8540: MLFlow Operator
potiuk closed issue #8540: MLFlow Operator URL: https://github.com/apache/airflow/issues/8540
[GitHub] [airflow] ferruzzi opened a new pull request, #24384: Convert SNS Sample DAG to System Test (AIP-47)
ferruzzi opened a new pull request, #24384: URL: https://github.com/apache/airflow/pull/24384 Related: https://github.com/apache/airflow/issues/22438
[GitHub] [airflow] Taragolis commented on a diff in pull request #24057: Amazon appflow
Taragolis commented on code in PR #24057: URL: https://github.com/apache/airflow/pull/24057#discussion_r894885197

## airflow/providers/amazon/aws/hooks/appflow.py:

```python
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements.  See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership.  The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License.  You may obtain a copy of the License at
#
#   http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied.  See the License for the
# specific language governing permissions and limitations
# under the License.

from typing import TYPE_CHECKING

from airflow.providers.amazon.aws.hooks.base_aws import AwsBaseHook

if TYPE_CHECKING:
    from mypy_boto3_appflow.client import AppflowClient


class AppflowHook(AwsBaseHook):
    """
    Interact with Amazon Appflow, using the boto3 library
    Hook attribute `conn` has all methods that listed in documentation

    .. seealso::
        - https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/appflow.html
        - https://docs.aws.amazon.com/appflow/1.0/APIReference/Welcome.html

    Additional arguments (such as ``aws_conn_id`` or ``region_name``) may be specified and
    are passed down to the underlying AwsBaseHook.

    .. seealso::
        :class:`~airflow.providers.amazon.aws.hooks.base_aws.AwsBaseHook`
    """

    def __init__(self, *args, **kwargs) -> None:
        kwargs["client_type"] = "appflow"
        super().__init__(*args, **kwargs)

    @property
    def conn(self) -> 'AppflowClient':
        """Get the underlying boto3 Appflow client (cached)"""
```

Review Comment: I think in the future it could be simplified by adding inheritance of a `Generic` type to `AwsBaseHook`.
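The `Generic` suggestion can be sketched as follows. These are illustrative classes (not the real `AwsBaseHook` or `AppflowHook`): the base hook is parameterized by a client type, so each subclass gets a correctly typed, cached `conn` without overriding the property:

```python
from functools import cached_property
from typing import Generic, TypeVar

ClientT = TypeVar("ClientT")


class BaseAwsHookSketch(Generic[ClientT]):
    """Illustrative base: subclasses fix ClientT so `conn` is typed for them."""

    def __init__(self, client_factory):
        self._client_factory = client_factory

    @cached_property
    def conn(self) -> ClientT:
        # Built once on first access, then cached on the instance.
        return self._client_factory()


class FakeAppflowClient:
    """Stand-in for mypy_boto3_appflow.client.AppflowClient."""


class AppflowHookSketch(BaseAwsHookSketch[FakeAppflowClient]):
    pass


hook = AppflowHookSketch(FakeAppflowClient)
```

With this shape, type checkers infer `hook.conn` as the subclass's client type, and the per-hook `conn` override shown in the diff becomes unnecessary.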
[GitHub] [airflow] potiuk closed issue #24374: Airflow on ARM - Optional provider feature disabled when importing 'airflow.providers.google.leveldb.hooks.leveldb.LevelDBHook' from 'apache-airflow-prov
potiuk closed issue #24374: Airflow on ARM - Optional provider feature disabled when importing 'airflow.providers.google.leveldb.hooks.leveldb.LevelDBHook' from 'apache-airflow-providers-google' package URL: https://github.com/apache/airflow/issues/24374
[GitHub] [airflow] potiuk commented on issue #24374: Airflow on ARM - Optional provider feature disabled when importing 'airflow.providers.google.leveldb.hooks.leveldb.LevelDBHook' from 'apache-airflo
potiuk commented on issue #24374: URL: https://github.com/apache/airflow/issues/24374#issuecomment-1152724609
> I've tested it with the `airflow:slim-2.3.2` image and the issue disappeared. After adding `apache-airflow-providers-google==7.0.0` into the Airflow dependencies, I'm notified about the optional provider feature being disabled.
>
> I'm wondering whether this will cause any harm when this module needs to be used on an ARM-based machine

INFO is an info, not a warning; it's for diagnostics. I don't think an INFO message is anything you should be worried about. You can get rid of it by installing the optional dependencies. But maybe we could indeed add some extra message about how it can be enabled. Would you like to contribute a better message there? I am happy to guide you and review your Pull Request for it. You could become one of > 2K contributors to Apache Airflow. Converting this to a discussion, as it is really not a bug.
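The "optional provider feature" pattern behind that INFO message can be sketched with plain stdlib code: import the optional dependency lazily and report an informative message when it is missing. The module name below is deliberately fake so the except branch runs; the real hook would import its actual optional dependency:

```python
import logging

log = logging.getLogger(__name__)


def load_optional_feature():
    """Return the optional module, or None (with an INFO log) if unavailable."""
    try:
        import plyvel_nonexistent_module  # fake stand-in for the optional dep
    except ImportError:
        log.info(
            "Optional provider feature disabled; install the provider's "
            "optional extra to enable it."
        )
        return None
    return plyvel_nonexistent_module


feature = load_optional_feature()
```

Because the message is logged at INFO level, it is diagnostic only; installing the optional dependency makes the import succeed and silences it.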
[GitHub] [airflow] igorborgest commented on a diff in pull request #24057: Amazon appflow
igorborgest commented on code in PR #24057: URL: https://github.com/apache/airflow/pull/24057#discussion_r894871685

## airflow/providers/amazon/aws/hooks/appflow.py:

```diff
@@ -0,0 +1,50 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied. See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from typing import TYPE_CHECKING
+
+from airflow.providers.amazon.aws.hooks.base_aws import AwsBaseHook
+
+if TYPE_CHECKING:
+    from mypy_boto3_appflow.client import AppflowClient
+
+
+class AppflowHook(AwsBaseHook):
+    """
+    Interact with Amazon Appflow, using the boto3 library.
+    Hook attribute `conn` has all the methods listed in the documentation.
+
+    .. seealso::
+        - https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/appflow.html
+        - https://docs.aws.amazon.com/appflow/1.0/APIReference/Welcome.html
+
+    Additional arguments (such as ``aws_conn_id`` or ``region_name``) may be specified and
+    are passed down to the underlying AwsBaseHook.
+
+    .. seealso::
+        :class:`~airflow.providers.amazon.aws.hooks.base_aws.AwsBaseHook`
+    """
+
+    def __init__(self, *args, **kwargs) -> None:
+        kwargs["client_type"] = "appflow"
+        super().__init__(*args, **kwargs)
+
+    @property
+    def conn(self) -> 'AppflowClient':
+        """Get the underlying boto3 Appflow client (cached)"""
```

Review Comment: Good catch @Taragolis! I've updated the code and added this assertion to the unit tests.
[GitHub] [airflow] potiuk commented on issue #22870: Keyboard shortcut in Airflow web UI
potiuk commented on issue #22870: URL: https://github.com/apache/airflow/issues/22870#issuecomment-1152717268

Feel free. Assigned.
[GitHub] [airflow] potiuk commented on issue #16654: "Suggest a change on this page" button bug/change request
potiuk commented on issue #16654: URL: https://github.com/apache/airflow/issues/16654#issuecomment-1152716260

Assigned you
[GitHub] [airflow] potiuk commented on issue #16654: "Suggest a change on this page" button bug/change request
potiuk commented on issue #16654: URL: https://github.com/apache/airflow/issues/16654#issuecomment-1152715870

Sure.
[GitHub] [airflow] potiuk commented on issue #23727: Airflow 2.3 scheduler error: 'V1Container' object has no attribute '_startup_probe'
potiuk commented on issue #23727: URL: https://github.com/apache/airflow/issues/23727#issuecomment-1152691754

Ah yeah. thanks for the info!
[GitHub] [airflow] Taragolis commented on a diff in pull request #24057: Amazon appflow
Taragolis commented on code in PR #24057: URL: https://github.com/apache/airflow/pull/24057#discussion_r894848008

## airflow/providers/amazon/aws/hooks/appflow.py:

```diff
@@ -0,0 +1,50 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied. See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from typing import TYPE_CHECKING
+
+from airflow.providers.amazon.aws.hooks.base_aws import AwsBaseHook
+
+if TYPE_CHECKING:
+    from mypy_boto3_appflow.client import AppflowClient
+
+
+class AppflowHook(AwsBaseHook):
+    """
+    Interact with Amazon Appflow, using the boto3 library.
+    Hook attribute `conn` has all the methods listed in the documentation.
+
+    .. seealso::
+        - https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/appflow.html
+        - https://docs.aws.amazon.com/appflow/1.0/APIReference/Welcome.html
+
+    Additional arguments (such as ``aws_conn_id`` or ``region_name``) may be specified and
+    are passed down to the underlying AwsBaseHook.
+
+    .. seealso::
+        :class:`~airflow.providers.amazon.aws.hooks.base_aws.AwsBaseHook`
+    """
+
+    def __init__(self, *args, **kwargs) -> None:
+        kwargs["client_type"] = "appflow"
+        super().__init__(*args, **kwargs)
+
+    @property
+    def conn(self) -> 'AppflowClient':
+        """Get the underlying boto3 Appflow client (cached)"""
```

Review Comment: This connection property is not cached anymore:

```python
from airflow.providers.amazon.aws.hooks.appflow import AppflowHook

hook = AppflowHook(region_name="eu-west-1")
conn = hook.conn
assert conn is hook.conn, "AppflowHook conn property non-cached"
```
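The assertion in the snippet above points at the fix: the `conn` property needs to be cached again. A minimal sketch of that caching behaviour with `functools.cached_property`, using a hypothetical `FakeClientHook` rather than the real `AwsBaseHook`:

```python
from functools import cached_property


class FakeClientHook:
    """Hypothetical stand-in for a hook whose `conn` should be built once and reused."""

    def __init__(self, client_type: str) -> None:
        self.client_type = client_type
        self.build_count = 0

    @cached_property
    def conn(self) -> object:
        # First access builds the client; later accesses return the same cached object.
        self.build_count += 1
        return object()


hook = FakeClientHook(client_type="appflow")
assert hook.conn is hook.conn  # identity holds: the client is built only once
assert hook.build_count == 1
```

A plain `@property` would rebuild the client on every access, which is exactly what the `conn is hook.conn` identity assertion catches.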
[GitHub] [airflow] Superskyyy commented on issue #8540: MLFlow Operator
Superskyyy commented on issue #8540: URL: https://github.com/apache/airflow/issues/8540#issuecomment-1152661027

Hello, is there any new progress on this?
[GitHub] [airflow] patricker opened a new pull request, #24382: Expose sql to gcs meta
patricker opened a new pull request, #24382: URL: https://github.com/apache/airflow/pull/24382

closes: #22065

---

This change captures the row count in total, and per file, for a DB to GCS operation. This metadata is then included in the XCom output. This metadata can also optionally be included as GCS Blob metadata.

During test development, I noticed that the CSV file write test was outputting a 4th empty file. This wasn't immediately obvious with the old tests, since they had no metadata, but with the updated tests we know in advance that the max row count available is 3.
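The row-count bookkeeping described above (total and per file, plus the guard against emitting a trailing empty file) can be sketched independently of the operator. `split_rows_to_files` is a hypothetical helper, not the PR's code:

```python
import csv
import io


def split_rows_to_files(rows, rows_per_file):
    """Split rows into in-memory CSV "files" and collect row-count metadata.

    A new file is started only while rows remain, so no trailing empty
    file is ever produced.
    """
    files, per_file_counts = [], []
    for start in range(0, len(rows), rows_per_file):
        chunk = rows[start:start + rows_per_file]
        buf = io.StringIO()
        csv.writer(buf).writerows(chunk)
        files.append(buf.getvalue())
        per_file_counts.append(len(chunk))
    # Metadata mirroring what could be pushed to XCom / attached to GCS blobs.
    return files, {"total_row_count": sum(per_file_counts), "per_file": per_file_counts}
```

With 3 rows and `rows_per_file=2` this yields two files with counts `[2, 1]`, never a third (empty) one.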
[GitHub] [airflow] pankajastro commented on a diff in pull request #24353: GCSDeleteObjectsOperator empty prefix bug fix
pankajastro commented on code in PR #24353: URL: https://github.com/apache/airflow/pull/24353#discussion_r894803146

## airflow/providers/google/cloud/operators/gcs.py:

```diff
@@ -305,8 +305,10 @@ def __init__(
         self.delegate_to = delegate_to
         self.impersonation_chain = impersonation_chain

-        if not objects and not prefix:
-            raise ValueError("Either object or prefix should be set. Both are None")
+        if objects is None and prefix is None:
```

Review Comment: what about replacing this line with the below snippet?

```python
if (objects or prefix) is None:
```
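Note that the suggested one-liner is not equivalent to the explicit check in the PR: `[] or None` evaluates to `None`, so an explicit empty `objects` list would satisfy the suggested condition but not the original one. A quick demonstration with hypothetical helper names:

```python
def strict_check(objects, prefix) -> bool:
    # The form used in the PR: both must literally be None.
    return objects is None and prefix is None


def suggested_check(objects, prefix) -> bool:
    # The suggested shorthand: `or` returns the first truthy operand,
    # so a falsy-but-not-None `objects` (e.g. []) falls through to `prefix`.
    return (objects or prefix) is None


# Both agree when both arguments are None:
assert strict_check(None, None) is True
assert suggested_check(None, None) is True

# They diverge for an explicit empty list:
assert strict_check([], None) is False
assert suggested_check([], None) is True
```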
[GitHub] [airflow] potiuk commented on issue #24344: airflow 2.3.2 vulnerabilities in docker images
potiuk commented on issue #24344: URL: https://github.com/apache/airflow/issues/24344#issuecomment-1152628473

> maybe there is something wrong behind that may explain these 200 vulns (almost all of them image, not as python dependency).

One more thing. Please. By all means do. Analyse those, and let us know when you find a real vulnerability: find a reproduction scenario and describe it in a way that lets us assess severity. This is one of the best ways you and your company can pay back for the absolutely free software you get.
[GitHub] [airflow] PrincipalsOffice opened a new issue, #24381: Prioritize tasks based on execution_date
PrincipalsOffice opened a new issue, #24381: URL: https://github.com/apache/airflow/issues/24381

### Description

A config that lets the scheduler choose tasks with the latest/oldest execution_date to run first. Or perhaps a way to dynamically assign pool slots based on the execution_date could also work.

### Use case/motivation

Sometimes the same task can have multiple task instances from different dag runs queued and occupying pool slots. The task instance from the most recent dag run could have a higher priority than the older ones. Currently there is no way to prioritize that.

### Related issues

_No response_

### Are you willing to submit a PR?

- [X] Yes I am willing to submit a PR!

### Code of Conduct

- [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
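One way the requested "latest run first" behaviour could be approximated today is by deriving each task's priority weight from the run's logical date, so newer runs outrank older ones. A rough sketch of the idea (a hypothetical pure-Python helper, not an Airflow API):

```python
from datetime import datetime, timezone


def priority_for_logical_date(logical_date: datetime) -> int:
    """Hypothetical helper: map a run's logical date to an integer priority.

    Newer runs get larger values, so passing this as a task's priority
    weight would make the scheduler prefer the most recent dag run when
    several task instances compete for the same pool slots.
    """
    return int(logical_date.replace(tzinfo=timezone.utc).timestamp())


older = priority_for_logical_date(datetime(2022, 6, 1))
newer = priority_for_logical_date(datetime(2022, 6, 10))
assert newer > older
```

Negating the returned value would give the opposite ("oldest run first") policy.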
[GitHub] [airflow] potiuk commented on issue #24344: airflow 2.3.2 vulnerabilities in docker images
potiuk commented on issue #24344: URL: https://github.com/apache/airflow/issues/24344#issuecomment-1152614965

Just to explain things you completely misunderstand. You assume we have responsibilities for things that we don't. So let me set your expectations here. It would be great if you read all the information available about what the ASF releases and how. You also apparently completely missed our explanation of what the airflow image is and how (if you have higher expectations about the security) YOU can make sure your expectations are met (but it's you who has to spend all the effort). Basically, if you have high demands, you have to pay for it. Airflow (which you get completely free) has very high standards when it comes to security, but your expectations clearly demand a higher level, and if you want to pay for it (with the time of your security team) - you are not only free to do it, but we spent a lot of effort to make it possible. Let me explain then (and please read it VERY carefully before you respond). Ideally, if you demand something, it should start with the amount of money your company is willing to pay to get a higher security level. There are a number of individuals and companies that I will be able to point you to who have good paid commercial support, in case you have not enough internal security experts.

Airflow images are NOT what Airflow or the Apache Software Foundation releases "officially". The only officially released software that the ASF releases is at https://downloads.apache.org/airflow/2.3.2/ - the sources of Airflow that you can use, having sufficient tools and software to build Airflow from the sources. If you are concerned about vulnerabilities in whatever non-airflow-sources part, you are free to download the -src.tgz, follow the instructions and build your software - using whatever third-party OS, compilers etc. you have. This is the only thing that can be the source of any security vulnerabilities you might raise to the ASF.

Any other vulnerabilities you can raise are vulnerabilities in 3rd-party packages, and you can raise your vulnerabilities there. The "reference image" is really a "convenience package", also called a "compiled" package (https://www.apache.org/legal/release-policy.html#compiled-packages), and is NOT an official release by the ASF. This is a "convenience package" that makes it easy to start. BUT this is not something that you have to use - you can build your own image: https://airflow.apache.org/docs/docker-stack/build.html via customization that only uses our source Dockerfile, and you can customize the Dockerfile and use another base (CentOS, whatever) than us - we use debian buster.

> we will try to figure out how to improve the security. Maybe is another base image, creating a new distroless image and adding a pipelines for vulnerability management. What is the proper channel to bring this proposals? just PRs?

If you have ideas how to improve and automate things - absolutely feel free to raise PRs.

> we will try to figure out how to improve the security. Maybe is another base image, creating a new distroless image

Just a bit of warning - you misunderstand how distroless images work and do not understand that they are not good for airflow.

> almost all of them image, not as python dependency

Yes. Since this is just a convenience image, whatever you see there might come from non-airflow code, and this is where you should raise your security issues. We are basing our images on the latest (at the moment of release) security-patched debian buster version. Whenever we release a new airflow version, it uses the latest available version of the "official" python debian-based images https://hub.docker.com/_/python, which use the latest official debian as a base. If you see any problems with dependencies we install - raise them to the Python team or the debian team. Whenever they release their updated versions of images, we will automatically use the latest images.

Our CI pipeline is specifically done in a way to automatically update to the latest available Python official image from each line (3.7 - 3.10). We also automatically upgrade dependencies to our constraints. And we almost never re-release released images for old versions. Mostly for stability, but also because if you want - you can very easily [build your own image](https://airflow.apache.org/docs/docker-stack/build.html#customizing-the-image) following our instructions. If you care about updating to today's Python version of Airflow 2.1.0 - feel free. It's very much possible. There are even people who establish their CI pipelines and use our standalone Dockerfile to build their image every day, taking the latest versions of the dependencies. While the Airflow installation recommends using constrained Python dependencies, it is also possible to roll your own constraints and installation sources (where you can pre-vet and scan your own
[GitHub] [airflow] tanelk commented on a diff in pull request #24366: Send DAG timeout callbacks to processor outside of prohibit_commit
tanelk commented on code in PR #24366: URL: https://github.com/apache/airflow/pull/24366#discussion_r894779661

## airflow/jobs/scheduler_job.py:

```diff
@@ -1159,10 +1159,7 @@ def _schedule_dag_run(
                 msg='timed_out',
             )

-            # Send SLA & DAG Success/Failure Callbacks to be executed
-            self._send_dag_callbacks_to_processor(dag, callback_to_execute)
```

Review Comment: No, the callback is executed once, because the `callback` it returned was `None`.
[GitHub] [airflow] tanelk commented on pull request #24372: Move callback_sink from executor to scheduler
tanelk commented on PR #24372: URL: https://github.com/apache/airflow/pull/24372#issuecomment-115260

It sounds like this is a feature that should be left as is.
[GitHub] [airflow] tanelk closed pull request #24372: Move callback_sink from executor to scheduler
tanelk closed pull request #24372: Move callback_sink from executor to scheduler URL: https://github.com/apache/airflow/pull/24372
[GitHub] [airflow] pawellrus commented on issue #24329: '__type' is a required property exception when loading DAG Grid
pawellrus commented on issue #24329: URL: https://github.com/apache/airflow/issues/24329#issuecomment-1152595207

1) Endpoint /tasks is failing.
2) [dag.txt](https://github.com/apache/airflow/files/8881021/dag.txt)
[GitHub] [airflow] potiuk commented on issue #24344: airflow 2.3.2 vulnerabilities in docker images
potiuk commented on issue #24344: URL: https://github.com/apache/airflow/issues/24344#issuecomment-1152587946

This is the ASF policy. You can raise this point to secur...@apache.org.
[GitHub] [airflow] potiuk commented on issue #23538: CI: convert tests to use Breeze tests command
potiuk commented on issue #23538: URL: https://github.com/apache/airflow/issues/23538#issuecomment-1152586884

Just parallel running - I have not covered that.
[GitHub] [airflow] potiuk commented on a diff in pull request #24236: Add CI-friendly progress output for tests
potiuk commented on code in PR #24236: URL: https://github.com/apache/airflow/pull/24236#discussion_r894758526

## dev/breeze/src/airflow_breeze/commands/testing_commands.py:

```diff
@@ -149,11 +268,39 @@ def tests(
         os.environ["LIST_OF_INTEGRATION_TESTS_TO_RUN"] = ' '.join(list(integration))
     if db_reset:
         os.environ["DB_RESET"] = "true"
-
-    exec_shell_params = ShellParams(verbose=verbose, dry_run=dry_run)
+    exec_shell_params = ShellParams(
+        verbose=verbose,
+        dry_run=dry_run,
+        python=python,
+        backend=backend,
+        postgres_version=postgres_version,
+        mysql_version=mysql_version,
+        mssql_version=mssql_version,
+    )
     env_variables = get_env_variables_for_docker_commands(exec_shell_params)
     perform_environment_checks(verbose=verbose)
     cmd = ['docker-compose', 'run', '--service-ports', '--rm', 'airflow']
     cmd.extend(list(extra_pytest_args))
-    result = run_command(cmd, verbose=verbose, dry_run=dry_run, env=env_variables, check=False)
+    version = (
+        mssql_version
+        if backend == "mssql"
+        else mysql_version
+        if backend == "mysql"
+        else postgres_version
+        if backend == "postgres"
+        else "none"
+    )
+    if limit_progress_output:
+        result = run_with_progress(
+            cmd=cmd,
+            env_variables=env_variables,
+            test_type=test_type,
+            python=python,
+            backend=backend,
+            version=version,
+            verbose=verbose,
+            dry_run=dry_run,
+        )
+    else:
```

Review Comment: It's not the "limit progress output". It's the "tests" command that missed it. The version of the tests command we had just did not have those parameters and always used defaults, which was fine for running quick tests locally. But if we are going to use the "tests" command in CI, we need to test many combinations of tests (i.e. running MySQL version 8 on Python 3.7, or running Postgres 10 on Python 3.10). That's why we need to specify those as parameters here (and that's why I added them to ShellParams - because they will determine which CI image (Python 3.7 - 3.10) will be used for the tests).
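The nested conditional expression that picks `version` in the review context above could equally be written as a dictionary lookup; a sketch assuming the same parameter names (`postgres_version`, `mysql_version`, `mssql_version`):

```python
def select_backend_version(
    backend: str,
    postgres_version: str,
    mysql_version: str,
    mssql_version: str,
) -> str:
    # Map each backend to its version flag; unknown backends fall back to "none".
    versions = {
        "postgres": postgres_version,
        "mysql": mysql_version,
        "mssql": mssql_version,
    }
    return versions.get(backend, "none")


assert select_backend_version("mysql", "10", "8", "2017") == "8"
assert select_backend_version("sqlite", "10", "8", "2017") == "none"
```

The dict form keeps the backend-to-flag mapping in one place, so adding another backend is a one-line change.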
[GitHub] [airflow] Taragolis commented on pull request #24294: Refactoring EmrClusterLink and add it for other AWS EMR Operators
Taragolis commented on PR #24294: URL: https://github.com/apache/airflow/pull/24294#issuecomment-1152583999

> Maybe. But also, if BaseOperator created a Link set containing the link to the service's main console page

The implementation of an AWS Service link is straightforward:

```python
import attr

from airflow.providers.amazon.aws.links.base_aws import BaseAwsLink, BASE_AWS_CONSOLE_LINK


@attr.s(auto_attribs=True)
class AwsServiceLink(BaseAwsLink):
    name: str
    format_str: str
    key = "aws_service"


AwsServiceLink(
    name="AWS EMR Console Link",
    format_str=BASE_AWS_CONSOLE_LINK + "/elasticmapreduce/home?region={region_name}#",
)
```

There are two things which need to be checked and solved:

1. Each operator should call `AwsServiceLink.persist` in the execute method of the operator/sensor.
2. Need to check whether it is fine to change `operator_extra_links` at runtime.

> I'm also not sure what this current implementation is going to look like when you want to add a second/third/nth link.

All operator links are predefined in the operator at that moment. In the execute stage, the `persist` method of each link is called with the appropriate keywords, which are stored in XCom.
[GitHub] [airflow] potiuk commented on a diff in pull request #24236: Add CI-friendly progress output for tests
potiuk commented on code in PR #24236: URL: https://github.com/apache/airflow/pull/24236#discussion_r894755846

## dev/breeze/src/airflow_breeze/commands/testing_commands.py:

```diff
@@ -112,6 +129,93 @@ def docker_compose_tests(
     sys.exit(return_code)


+class MonitoringThread(Thread):
+    """Thread class with a stop() method. The thread itself has to check
+    regularly for the stopped() condition."""
+
+    def __init__(self, title: str, file_name: str):
+        super().__init__(target=self.peek_percent_at_last_lines_of_file, daemon=True)
+        self._stop_event = Event()
+        self.title = title
+        self.file_name = file_name
+
+    def peek_percent_at_last_lines_of_file(self) -> None:
+        max_line_length = 400
+        matcher = re.compile(r"^.*\[([^\]]*)\]$")
+        while not self.stopped():
+            if os.path.exists(self.file_name):
+                try:
+                    with open(self.file_name, 'rb') as temp_f:
+                        temp_f.seek(-(max_line_length * 2), os.SEEK_END)
```

Review Comment: Because I want to find the last two lines in the file very efficiently, without reading the whole file into memory and without losing time on reading the whole file from the beginning (it can be a 20-30MB file, so it might take a few seconds to read it, and it will consume I/O and memory). I really do not want to read the whole file to find the last two lines - this is all I care about, and I can survive if I don't succeed - at worst I will not print progress. No big deal. And it's generally a difficult problem to solve, because in order to be sure, you'd have to read the whole file to be absolutely certain you print the last two lines. For example, if you have a big file with just two lines, each of them extremely long, you would have to read nearly the whole file to print the last line. So I am cheating "a bit" here. I assume the line is not longer than X. I go to the END of the file (SEEK_END), then I go back 2*X and read only those maximum 2*X characters.

This is much faster and takes much less memory than reading the whole file and being "sure". On the other hand, I KNOW our lines are not too long in most cases, and actually I only care about the two last lines if they contain pytest output:

```
test1 [20%]
test2 ..
```

So I KNOW the lines I am interested in are not long. And in case I have longer lines - too bad (the code does nothing if it does not find the [X%] in the output).
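The seek-from-end trick described here can be isolated into a small helper; `tail_lines` below is a hypothetical standalone sketch, not the Breeze implementation:

```python
import os


def tail_lines(path, max_line_length=400, count=2):
    """Return roughly the last `count` lines of a file without reading it all.

    Seeks `count * max_line_length` bytes back from the end; lines longer than
    `max_line_length` may come back truncated, which is acceptable when merely
    peeking at progress output.
    """
    with open(path, 'rb') as f:
        try:
            f.seek(-(max_line_length * count), os.SEEK_END)
        except OSError:
            f.seek(0)  # file shorter than the window: just read it all
        return f.read().decode(errors='replace').splitlines()[-count:]
```

The `OSError` fallback covers the case potiuk's code handles via `errno.EINVAL`: seeking before the start of a short file.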
[GitHub] [airflow] patricker closed pull request #23184: Track and optionally include row count metadata with GCS upload
patricker closed pull request #23184: Track and optionally include row count metadata with GCS upload URL: https://github.com/apache/airflow/pull/23184
[GitHub] [airflow] patricker commented on pull request #23184: Track and optionally include row count metadata with GCS upload
patricker commented on PR #23184: URL: https://github.com/apache/airflow/pull/23184#issuecomment-1152581094

Will re-open on a new branch. Underlying structure changed enough, I decided to start-over/copy-paste my changes into a fresh branch.
[GitHub] [airflow] dstandish commented on a diff in pull request #23317: Add dag configuration warning for missing pools
dstandish commented on code in PR #23317: URL: https://github.com/apache/airflow/pull/23317#discussion_r894753175

## airflow/dag_processing/processor.py:

```diff
@@ -559,6 +560,28 @@ def update_import_errors(session: Session, dagbag: DagBag) -> None:
         session.commit()


+    @staticmethod
+    def process_dag_warnings(session: Session, dagbag: DagBag) -> None:
+        """
+        For the DAGs in the given DagBag, record any associated configuration warnings and clear
+        warnings for files that no longer have them. These are usually displayed through the
+        Airflow UI so that users know that there are issues parsing DAGs.
+
+        :param session: session for ORM operations
+        :param dagbag: DagBag containing DAGs with configuration warnings
+        """
+        stored_config_warnings = set(session.query(DagWarning).all())
```

Review Comment: So, this is always gonna be a small amount of data. And, in any case, we already always hold all current dag warnings in memory; every warning that was emitted in the course of the dagbag load gets stored on an attr, and then we just have to sync that with the db. Similar is done for import errors. Which part are you thinking needs to be optimized? The "delete stale warnings" operation?
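The sync dstandish describes, reconciling in-memory warnings from the dagbag load against persisted rows, boils down to a set difference; a sketch with plain string sets standing in for ORM rows (hypothetical helper name):

```python
def reconcile_warnings(stored, current):
    """Return (to_insert, to_delete) so the persisted set matches the current one.

    `stored` stands in for the DagWarning rows already in the db; `current`
    for the warnings collected in memory during the dagbag load.
    """
    to_insert = set(current) - set(stored)  # new warnings to write
    to_delete = set(stored) - set(current)  # stale warnings to clear
    return to_insert, to_delete


stored = {"dag_a: pool 'x' missing", "dag_b: pool 'y' missing"}
current = {"dag_a: pool 'x' missing", "dag_c: pool 'z' missing"}
to_insert, to_delete = reconcile_warnings(stored, current)
assert to_insert == {"dag_c: pool 'z' missing"}
assert to_delete == {"dag_b: pool 'y' missing"}
```

Both differences are computed over data already held in memory, which matches the point that the volume here is always small.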
[GitHub] [airflow] potiuk commented on a diff in pull request #24236: Add CI-friendly progress output for tests
potiuk commented on code in PR #24236: URL: https://github.com/apache/airflow/pull/24236#discussion_r894749991 ## dev/breeze/src/airflow_breeze/commands/testing_commands.py: ## @@ -112,6 +129,93 @@ def docker_compose_tests( sys.exit(return_code) +class MonitoringThread(Thread): +"""Thread class with a stop() method. The thread itself has to check +regularly for the stopped() condition.""" + +def __init__(self, title: str, file_name: str): +super().__init__(target=self.peek_percent_at_last_lines_of_file, daemon=True) +self._stop_event = Event() +self.title = title +self.file_name = file_name + +def peek_percent_at_last_lines_of_file(self) -> None: +max_line_length = 400 +matcher = re.compile(r"^.*\[([^\]]*)\]$") +while not self.stopped(): +if os.path.exists(self.file_name): +try: +with open(self.file_name, 'rb') as temp_f: +temp_f.seek(-(max_line_length * 2), os.SEEK_END) +tail = temp_f.read().decode() +try: +two_last_lines = tail.splitlines()[-2:] +previous_no_ansi_line = escape_ansi(two_last_lines[0]) +m = matcher.match(previous_no_ansi_line) +if m: +get_console().print(f"[info]{self.title}:[/] {m.group(1).strip()}") +print(f"\r{two_last_lines[0]}\r") +print(f"\r{two_last_lines[1]}\r") +except IndexError: +pass +except OSError as e: +if e.errno == errno.EINVAL: +pass +else: +raise +sleep(5) + +def stop(self): +self._stop_event.set() + +def stopped(self): +return self._stop_event.is_set() + + +def escape_ansi(line): +ansi_escape = re.compile(r'(?:\x1B[@-_]|[\x80-\x9F])[0-?]*[ -/]*[@-~]') +return ansi_escape.sub('', line) + + +def run_with_progress( +cmd: List[str], +env_variables: Dict[str, str], +test_type: str, +python: str, +backend: str, +version: str, +verbose: bool, +dry_run: bool, +) -> RunCommandResult: +title = f"Running tests: {test_type}, Python: {python}, Backend: {backend}:{version}" +try: +with tempfile.NamedTemporaryFile(mode='w+t', delete=False) as f: +get_console().print(f"[info]Starting test = {title}[/]") +thread = MonitoringThread(title=title, 
file_name=f.name) +thread.start() +try: +result = run_command( +cmd, +verbose=verbose, +dry_run=dry_run, +env=env_variables, +check=False, +stdout=f, +stderr=subprocess.STDOUT, Review Comment: It is in captured in the same file. The stderr = `subprocess.STDOUT` means literally "send the output to the same place where stdout is sent". If you want to send it to stdout you would like to send it to `sys.stdout` instead. See https://docs.python.org/3/library/subprocess.html#subprocess.CompletedProcess.stdout > If you ran the process with stderr=subprocess.STDOUT, stdout and stderr will be combined in this attribute, and [stderr](https://docs.python.org/3/library/subprocess.html#subprocess.CompletedProcess.stderr) will be None. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
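The behaviour potiuk describes can be verified with a minimal stdlib-only snippet (the child command here is invented for the demo):

```python
import subprocess
import sys

# A child process that writes one line to stdout and one to stderr.
# stderr=subprocess.STDOUT redirects the child's stderr into the same
# pipe as its stdout, so both streams arrive combined in result.stdout
# and result.stderr is None -- as the subprocess docs state.
result = subprocess.run(
    [sys.executable, "-c",
     "import sys; print('to-stdout'); print('to-stderr', file=sys.stderr)"],
    stdout=subprocess.PIPE,
    stderr=subprocess.STDOUT,
    text=True,
)
```

Passing `stderr=sys.stdout` instead would send the child's stderr to the parent's current stdout rather than into the captured pipe.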
[GitHub] [airflow] potiuk commented on a diff in pull request #24236: Add CI-friendly progress output for tests
potiuk commented on code in PR #24236: URL: https://github.com/apache/airflow/pull/24236#discussion_r894747966 ## dev/breeze/src/airflow_breeze/commands/testing_commands.py: ## @@ -112,6 +129,93 @@ def docker_compose_tests( sys.exit(return_code) +class MonitoringThread(Thread): +"""Thread class with a stop() method. The thread itself has to check +regularly for the stopped() condition.""" + +def __init__(self, title: str, file_name: str): +super().__init__(target=self.peek_percent_at_last_lines_of_file, daemon=True) +self._stop_event = Event() +self.title = title +self.file_name = file_name + +def peek_percent_at_last_lines_of_file(self) -> None: +max_line_length = 400 +matcher = re.compile(r"^.*\[([^\]]*)\]$") +while not self.stopped(): +if os.path.exists(self.file_name): +try: +with open(self.file_name, 'rb') as temp_f: +temp_f.seek(-(max_line_length * 2), os.SEEK_END) +tail = temp_f.read().decode() +try: +two_last_lines = tail.splitlines()[-2:] +previous_no_ansi_line = escape_ansi(two_last_lines[0]) +m = matcher.match(previous_no_ansi_line) +if m: +get_console().print(f"[info]{self.title}:[/] {m.group(1).strip()}") +print(f"\r{two_last_lines[0]}\r") +print(f"\r{two_last_lines[1]}\r") +except IndexError: +pass +except OSError as e: +if e.errno == errno.EINVAL: +pass +else: +raise +sleep(5) + +def stop(self): +self._stop_event.set() + +def stopped(self): +return self._stop_event.is_set() + + +def escape_ansi(line): +ansi_escape = re.compile(r'(?:\x1B[@-_]|[\x80-\x9F])[0-?]*[ -/]*[@-~]') +return ansi_escape.sub('', line) + + +def run_with_progress( +cmd: List[str], +env_variables: Dict[str, str], +test_type: str, +python: str, +backend: str, +version: str, +verbose: bool, +dry_run: bool, +) -> RunCommandResult: +title = f"Running tests: {test_type}, Python: {python}, Backend: {backend}:{version}" +try: +with tempfile.NamedTemporaryFile(mode='w+t', delete=False) as f: +get_console().print(f"[info]Starting test = {title}[/]") +thread = MonitoringThread(title=title, 
file_name=f.name) +thread.start() +try: +result = run_command( +cmd, +verbose=verbose, +dry_run=dry_run, +env=env_variables, +check=False, +stdout=f, +stderr=subprocess.STDOUT, +) +finally: +thread.stop() +thread.join() +with ci_group( +f"Result of {title}", message_type="[success]" if result.returncode == 0 else "[error]" +): +with open(f.name) as f: +shutil.copyfileobj(f, sys.stdout) Review Comment: The `copyfileobj` sends directly content of the file to stdeout without any buffering and reading it to memory. If you read the content to a string and print it - it will be loaded in memory first. the output can be potentially very long (20-30MB) and when you load it to string - this is how much memory you wil use. And we already had a lot of troubles with our tests exceeding memory available to run tests (that's why some tests are not run when we have mssql or mysql database). -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
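The memory argument for `copyfileobj` can be sketched with a tiny helper (the function name is mine, not from the PR):

```python
import shutil
import sys


def stream_file_to_stdout(path: str, chunk_size: int = 64 * 1024) -> None:
    # shutil.copyfileobj reads and writes in fixed-size chunks, so peak
    # memory stays bounded no matter how large the log file is -- unlike
    # sys.stdout.write(open(path).read()), which would first load the
    # whole (potentially 20-30 MB) output into a single Python string.
    with open(path) as src:
        shutil.copyfileobj(src, sys.stdout, chunk_size)
```

This matters exactly in the scenario the review comment mentions: CI workers already near their memory limit while running tests.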
[GitHub] [airflow] chenglongyan opened a new pull request, #24380: Fix HttpHook.run_with_advanced_retry document error
chenglongyan opened a new pull request, #24380: URL: https://github.com/apache/airflow/pull/24380 related: #9569 --- **^ Add meaningful description above** Read the **[Pull Request Guidelines](https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst#pull-request-guidelines)** for more information. In case of a fundamental code change, an Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)) is needed. In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x). In case of backwards-incompatible changes, please leave a note in a newsfragment file, named `{pr_number}.significant.rst`, in [newsfragments](https://github.com/apache/airflow/tree/main/newsfragments).
[GitHub] [airflow] ShahNewazKhan commented on issue #24197: KubernetesPodOperator rendered template tab does not pretty print `env_vars`
ShahNewazKhan commented on issue #24197: URL: https://github.com/apache/airflow/issues/24197#issuecomment-1152565619 Thanks for looking into this @DMilmont! @josh-fell, unfortunately I do not have the bandwidth to take this on; I was just reporting it. For the time being I will use the human-readable `env_vars` `dict` passed in from the task definition.
[GitHub] [airflow] ldacey commented on pull request #21731: Store callbacks in database if standalone_dag_processor config is True.
ldacey commented on PR #21731: URL: https://github.com/apache/airflow/pull/21731#issuecomment-1152561962 FYI - I disabled the standalone DAG processor and it fixed my issues, but while enabled my scheduler would die sometimes. Here are some logs where it died almost immediately after I deployed Airflow: ``` | [2022-06-09 19:11:22,231] {scheduler_job.py:696} INFO - Starting the scheduler | [2022-06-09 19:11:22,231] {scheduler_job.py:701} INFO - Processing each file at most -1 times | [2022-06-09 19:11:22,501] {executor_loader.py:105} INFO - Loaded executor: CeleryExecutor | [2022-06-09 19:11:22,502] {scheduler_job.py:1221} INFO - Resetting orphaned tasks for active dag runs | [2022-06-09 19:11:22,656] {celery_executor.py:532} INFO - Adopted the following 5 tasks from a dead executor |in state STARTED |in state STARTED |in state STARTED |in state STARTED | [2022-06-09 19:11:22,777] {dag.py:2927} INFO - Setting next_dagrun for telecom-gts to 2022-06-09T19:05:00+00:00, run_after=2022-06-09T19:35:00+00:00 | [2022-06-09 19:11:27,118] {font_manager.py:1443} INFO - generated new fontManager | [2022-06-09 19:11:31,341] {scheduler_job.py:1126} INFO - Run scheduled__2022-06-09T17:33:00+00:00 of cms-agent has timed-out | [2022-06-09 19:11:31,347] {dag.py:2927} INFO - Setting next_dagrun for cms-agent to 2022-06-09T18:33:00+00:00, run_after=2022-06-09T19:03:00+00:00 | [2022-06-09 19:11:31,349] {scheduler_job.py:756} ERROR - Exception when executing SchedulerJob._run_scheduler_loop | Traceback (most recent call last): | File "/home/airflow/.local/lib/python3.10/site-packages/airflow/jobs/scheduler_job.py", line 739, in _execute | self._run_scheduler_loop() | File "/home/airflow/.local/lib/python3.10/site-packages/airflow/jobs/scheduler_job.py", line 827, in _run_scheduler_loop | num_queued_tis = self._do_scheduling(session) | File "/home/airflow/.local/lib/python3.10/site-packages/airflow/jobs/scheduler_job.py", line 909, in _do_scheduling | callback_to_run = 
self._schedule_dag_run(dag_run, session) | File "/home/airflow/.local/lib/python3.10/site-packages/airflow/jobs/scheduler_job.py", line 1141, in _schedule_dag_run | self._send_dag_callbacks_to_processor(dag, callback_to_execute) | File "/home/airflow/.local/lib/python3.10/site-packages/airflow/jobs/scheduler_job.py", line 1184, in _send_dag_callbacks_to_processor | self.executor.send_callback(callback) | File "/home/airflow/.local/lib/python3.10/site-packages/airflow/executors/base_executor.py", line 363, in send_callback | self.callback_sink.send(request) | File "/home/airflow/.local/lib/python3.10/site-packages/airflow/utils/session.py", line 70, in wrapper | with create_session() as session: | File "/usr/local/lib/python3.10/contextlib.py", line 142, in __exit__ | next(self.gen) | File "/home/airflow/.local/lib/python3.10/site-packages/airflow/utils/session.py", line 33, in create_session | session.commit() | File "/home/airflow/.local/lib/python3.10/site-packages/sqlalchemy/orm/session.py", line 1423, in commit | self._transaction.commit(_to_root=self.future) | File "/home/airflow/.local/lib/python3.10/site-packages/sqlalchemy/orm/session.py", line 829, in commit | self._prepare_impl() | File "/home/airflow/.local/lib/python3.10/site-packages/sqlalchemy/orm/session.py", line 797, in _prepare_impl | self.session.dispatch.before_commit(self.session) | File "/home/airflow/.local/lib/python3.10/site-packages/sqlalchemy/event/attr.py", line 320, in __call__ | fn(*args, **kw) | File "/home/airflow/.local/lib/python3.10/site-packages/airflow/utils/sqlalchemy.py", line 268, in _validate_commit | raise RuntimeError("UNEXPECTED COMMIT - THIS WILL BREAK HA LOCKS!") | RuntimeError: UNEXPECTED COMMIT - THIS WILL BREAK HA LOCKS! 
| [2022-06-09 19:11:31,372] {scheduler_job.py:768} INFO - Exited execute loop | Traceback (most recent call last): | File "/home/airflow/.local/bin/airflow", line 8, in | sys.exit(main()) | File "/home/airflow/.local/lib/python3.10/site-packages/airflow/__main__.py", line 38, in main | args.func(args) | File "/home/airflow/.local/lib/python3.10/site-packages/airflow/cli/cli_parser.py", line 51, in command | return func(*args, **kwargs) | File "/home/airflow/.local/lib/python3.10/site-packages/airflow/utils/cli.py", line 99, in wrapper | return f(*args, **kwargs) | File "/home/airflow/.local/lib/python3.10/site-packages/airflow/cli/commands/scheduler_command.py", line 75, in scheduler | _run_scheduler_job(args=args) | File "/home/airflow/.local/lib/python3.10/site-packages/airflow/cli/commands/scheduler_command.py", line 46, in _run_scheduler_job | job.run() |
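The `UNEXPECTED COMMIT - THIS WILL BREAK HA LOCKS!` error in the traceback comes from the scheduler guarding the session that holds its HA lock against unexpected commits. A loose stand-in for that guard (Airflow's real `prohibit_commit` hooks SQLAlchemy's `before_commit` event; this sketch just monkeypatches a session-like object to show the mechanism):

```python
class CommitGuard:
    """Context manager mimicking Airflow's prohibit_commit: while active,
    any commit issued on the guarded session raises instead of running,
    because an early commit would release the scheduler's HA lock."""

    def __init__(self, session):
        self.session = session
        self._orig_commit = session.commit

    def __enter__(self):
        def _blocked():
            raise RuntimeError("UNEXPECTED COMMIT - THIS WILL BREAK HA LOCKS!")

        # Shadow the instance's commit method for the duration of the guard.
        self.session.commit = _blocked
        return self

    def __exit__(self, *exc):
        self.session.commit = self._orig_commit
        return False


class FakeSession:
    """Minimal stand-in for a SQLAlchemy session (demo only)."""

    def __init__(self):
        self.committed = False

    def commit(self):
        self.committed = True
```

In the log above, `send_callback` ran inside the guarded scheduler loop but used `create_session`, which commits on exit, triggering exactly this error.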
[GitHub] [airflow] eladkal closed issue #17978: Can't delete user "admin"
eladkal closed issue #17978: Can't delete user "admin" URL: https://github.com/apache/airflow/issues/17978
[GitHub] [airflow] chenglongyan commented on issue #17978: Can't delete user "admin"
chenglongyan commented on issue #17978: URL: https://github.com/apache/airflow/issues/17978#issuecomment-1152539160 Resolved on main branch.
[GitHub] [airflow] chenglongyan opened a new pull request, #24379: Remove bigquery example already migrated to AIP-47
chenglongyan opened a new pull request, #24379: URL: https://github.com/apache/airflow/pull/24379 related: #22430, #22447, #22311 --- **^ Add meaningful description above** Read the **[Pull Request Guidelines](https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst#pull-request-guidelines)** for more information. In case of a fundamental code change, an Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)) is needed. In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x). In case of backwards-incompatible changes, please leave a note in a newsfragment file, named `{pr_number}.significant.rst`, in [newsfragments](https://github.com/apache/airflow/tree/main/newsfragments).
[GitHub] [airflow] DMilmont commented on issue #24197: KubernetesPodOperator rendered template tab does not pretty print `env_vars`
DMilmont commented on issue #24197: URL: https://github.com/apache/airflow/issues/24197#issuecomment-1152529215 I took a crack at this, but I don't think it is easily possible to make the env_vars formatting nicer. I tried adding this line to the k8s pod operator: ```template_fields_renderers = {"env_vars": "json"}``` The env_vars look like this in the UI afterwards: https://user-images.githubusercontent.com/3598125/173108728-db7dbcd6-0abd-4385-a4ff-39979e48f1c9.png In the past it looks like env_vars was just a dict, so it was easy to format it with the json lexer; however, now env_vars are converted to a list of env_vars with [this function](https://github.com/apache/airflow/blob/2b2d97068fa45881672dab6f2134becae246f3f3/airflow/providers/cncf/kubernetes/backcompat/backwards_compat_converters.py#L90-L105). Perhaps there is a way to still make this work, but it isn't as straightforward as I was hoping.
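The conversion DMilmont links to is roughly why the `json` renderer stops helping: a readable dict becomes a list of name/value objects. A loose imitation of that shape change (the real converter returns `k8s.V1EnvVar` objects, not plain dicts):

```python
def convert_env_vars(env_vars):
    """Loosely mimic the backcompat converter's shape change: a plain
    dict becomes a list of {'name': ..., 'value': ...} mappings, which
    the UI's json renderer displays as a flat list of objects rather
    than the readable key/value dict the user originally wrote."""
    if isinstance(env_vars, dict):
        return [{"name": k, "value": v} for k, v in env_vars.items()]
    # Already a list (e.g. of V1EnvVar-like items) -- pass through.
    return env_vars
```

So even with `template_fields_renderers = {"env_vars": "json"}`, the renderer is handed the post-conversion list, not the original dict.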
[GitHub] [airflow] potiuk commented on issue #24360: Pattern parameter in S3ToSnowflakeOperator
potiuk commented on issue #24360: URL: https://github.com/apache/airflow/issues/24360#issuecomment-1152516697 Feel free to add it in a PR. BTW, there is no need to open an issue for it; just a pull request is enough.
[GitHub] [airflow] voltcode closed issue #24345: Grid view broken for DAG
voltcode closed issue #24345: Grid view broken for DAG URL: https://github.com/apache/airflow/issues/24345
[GitHub] [airflow] vincbeck opened a new pull request, #24378: Fix S3KeySensor
vincbeck opened a new pull request, #24378: URL: https://github.com/apache/airflow/pull/24378 Fixes #24321
[GitHub] [airflow] github-actions[bot] commented on pull request #24366: Send DAG timeout callbacks to processor outside of prohibit_commit
github-actions[bot] commented on PR #24366: URL: https://github.com/apache/airflow/pull/24366#issuecomment-1152445045 The PR most likely needs to run the full matrix of tests because it modifies parts of the core of Airflow. However, committers might decide to merge it quickly and take the risk. If they don't merge it quickly, please rebase it to the latest main at your convenience, or amend the last commit of the PR, and push it with --force-with-lease.
[GitHub] [airflow] ashb commented on a diff in pull request #24366: Send DAG timeout callbacks to processor outside of prohibit_commit
ashb commented on code in PR #24366: URL: https://github.com/apache/airflow/pull/24366#discussion_r894613222 ## airflow/jobs/scheduler_job.py: ## @@ -1159,10 +1159,7 @@ def _schedule_dag_run( msg='timed_out', ) -# Send SLA & DAG Success/Failure Callbacks to be executed -self._send_dag_callbacks_to_processor(dag, callback_to_execute) Review Comment: Yet another case where our session handling causes bugs. Our use of `@create_session`, which does an expunge_all _and_ closes the session when it exits, needs fixing.
[GitHub] [airflow] ashb commented on a diff in pull request #24366: Send DAG timeout callbacks to processor outside of prohibit_commit
ashb commented on code in PR #24366: URL: https://github.com/apache/airflow/pull/24366#discussion_r894611528 ## airflow/jobs/scheduler_job.py: ## @@ -1159,10 +1159,7 @@ def _schedule_dag_run( msg='timed_out', ) -# Send SLA & DAG Success/Failure Callbacks to be executed -self._send_dag_callbacks_to_processor(dag, callback_to_execute) Review Comment: Whoops, didn't this end up with the callback being sent twice before too?
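The `create_session` pattern ashb is criticising looks roughly like the sketch below (simplified; the real helper lives in `airflow.utils.session` and also expunges all objects). Committing and closing on exit means any ORM objects touched inside the block are detached afterwards, which is easy to trip over:

```python
from contextlib import contextmanager


@contextmanager
def create_session(session_factory):
    """Sketch of the pattern: commit on success, roll back on error,
    and close (detaching all ORM objects) on exit. The implicit commit
    is what collides with the scheduler's prohibit_commit guard."""
    session = session_factory()
    try:
        yield session
        session.commit()
    except Exception:
        session.rollback()
        raise
    finally:
        session.close()
```

A call like `self.callback_sink.send(request)` wrapped with this helper will commit as soon as the `with` block exits, even if the caller is holding a transaction it did not want committed yet.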
[airflow] branch main updated: Add tests for the grid_data endpoint (#24375)
This is an automated email from the ASF dual-hosted git repository. ash pushed a commit to branch main in repository https://gitbox.apache.org/repos/asf/airflow.git The following commit(s) were added to refs/heads/main by this push: new 2b2d97068f Add tests for the grid_data endpoint (#24375) 2b2d97068f is described below commit 2b2d97068fa45881672dab6f2134becae246f3f3 Author: Ash Berlin-Taylor AuthorDate: Fri Jun 10 15:35:38 2022 +0100 Add tests for the grid_data endpoint (#24375) The one fix/change here was to include the JSON content response here so that `resp.json` works in the test. --- airflow/www/views.py | 5 +- tests/www/views/test_views_grid.py | 238 + 2 files changed, 242 insertions(+), 1 deletion(-) diff --git a/airflow/www/views.py b/airflow/www/views.py index 32cb921a70..fe8bf5c027 100644 --- a/airflow/www/views.py +++ b/airflow/www/views.py @@ -3540,7 +3540,10 @@ class Airflow(AirflowBaseView): } # avoid spaces to reduce payload size -return htmlsafe_json_dumps(data, separators=(',', ':')) +return ( +htmlsafe_json_dumps(data, separators=(',', ':')), +{'Content-Type': 'application/json; charset=utf-8'}, +) @expose('/robots.txt') @action_logging diff --git a/tests/www/views/test_views_grid.py b/tests/www/views/test_views_grid.py new file mode 100644 index 00..e5d29be8a2 --- /dev/null +++ b/tests/www/views/test_views_grid.py @@ -0,0 +1,238 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. 
You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. + +import freezegun +import pendulum +import pytest + +from airflow.models import DagBag +from airflow.operators.empty import EmptyOperator +from airflow.utils.state import DagRunState, TaskInstanceState +from airflow.utils.task_group import TaskGroup +from airflow.utils.types import DagRunType +from tests.test_utils.mock_operators import MockOperator + +DAG_ID = 'test' +CURRENT_TIME = pendulum.DateTime(2021, 9, 7) + + +@pytest.fixture(autouse=True, scope="module") +def examples_dag_bag(): +# Speed up: We don't want example dags for this module +return DagBag(include_examples=False, read_dags_from_db=True) + + +@pytest.fixture +def dag_without_runs(dag_maker, session, app, monkeypatch): +with monkeypatch.context() as m: +# Remove global operator links for this test +m.setattr('airflow.plugins_manager.global_operator_extra_links', []) +m.setattr('airflow.plugins_manager.operator_extra_links', []) +m.setattr('airflow.plugins_manager.registered_operator_link_classes', {}) + +with dag_maker(dag_id=DAG_ID, serialized=True, session=session): +EmptyOperator(task_id="task1") +with TaskGroup(group_id='group'): +MockOperator.partial(task_id='mapped').expand(arg1=['a', 'b', 'c']) + +m.setattr(app, 'dag_bag', dag_maker.dagbag) +yield dag_maker + + +@pytest.fixture +def dag_with_runs(dag_without_runs): +with freezegun.freeze_time(CURRENT_TIME): +date = dag_without_runs.dag.start_date +run_1 = dag_without_runs.create_dagrun( +run_id='run_1', state=DagRunState.SUCCESS, run_type=DagRunType.SCHEDULED, execution_date=date +) +run_2 = dag_without_runs.create_dagrun( 
+run_id='run_2', +run_type=DagRunType.SCHEDULED, + execution_date=dag_without_runs.dag.next_dagrun_info(date).logical_date, +) + +yield run_1, run_2 + + +def test_no_runs(admin_client, dag_without_runs): +resp = admin_client.get(f'/object/grid_data?dag_id={DAG_ID}', follow_redirects=True) +assert resp.status_code == 200, resp.json +assert resp.json == { +'dag_runs': [], +'groups': { +'children': [ +{ +'extra_links': [], +'id': 'task1', +'instances': [], +'is_mapped': False, +'label': 'task1', +}, +{ +'children': [ +{ +'extra_links': [], +'id': 'group.mapped',
[GitHub] [airflow] ashb merged pull request #24375: Add tests for the grid_data endpoint
ashb merged PR #24375: URL: https://github.com/apache/airflow/pull/24375
[GitHub] [airflow] NilsJPWerner commented on pull request #24249: Add Task Logs to Grid details panel
NilsJPWerner commented on PR #24249: URL: https://github.com/apache/airflow/pull/24249#issuecomment-1152424576 > In a separate PR, before or after, we should make the grid only as big as it needs to be and the details panel can grow to take up the rest of the page (inverse of today). The logs textarea gets cramped as it is now. It might be cool to make the panel resizable.
[GitHub] [airflow] github-actions[bot] commented on pull request #24375: Add tests for the grid_data endpoint
github-actions[bot] commented on PR #24375: URL: https://github.com/apache/airflow/pull/24375#issuecomment-1152415587 The PR is likely OK to be merged with just a subset of tests for default Python and Database versions, without running the full matrix of tests, because it does not modify the core of Airflow. If the committers decide that the full tests matrix is needed, they will add the label 'full tests needed'. Then you should rebase to the latest main or amend the last commit of the PR, and push it with --force-with-lease.
[GitHub] [airflow] bbovenzi commented on pull request #23516: clear specific dag run TI
bbovenzi commented on PR #23516: URL: https://github.com/apache/airflow/pull/23516#issuecomment-1152388565 > @bbovenzi instead of using the field `recursive`, the existing api for `clearTaskInstances` have the param `include_subdags` and `include_parentdag`. The older api in views.py use the value in `recursive` param to fill the value for both `include_subdags` and `include_parentdag`. > > Do you still think `recursive` field has to be given in the query param? Ah, I didn't realize that. We don't need recursive then.
[airflow] branch main updated: Refactor `DagRun.verify_integrity` (#24114)
This is an automated email from the ASF dual-hosted git repository.

ephraimanierobi pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/airflow.git

The following commit(s) were added to refs/heads/main by this push:
     new 12638d2310  Refactor `DagRun.verify_integrity` (#24114)

12638d2310 is described below

commit 12638d2310d962986b43af8f1584a405e280badf
Author: Ephraim Anierobi
AuthorDate: Fri Jun 10 14:44:19 2022 +0100

    Refactor `DagRun.verify_integrity` (#24114)

    This refactoring became necessary because we need to add additional code
    to the already existing code to handle mapped task immutability during a
    run. The additional code would make this method difficult to read.
    Refactoring the code will aid understanding and help in debugging.
---
 airflow/models/dagrun.py | 102 ---
 1 file changed, 88 insertions(+), 14 deletions(-)

diff --git a/airflow/models/dagrun.py b/airflow/models/dagrun.py
index c66e24c536..216272c79a 100644
--- a/airflow/models/dagrun.py
+++ b/airflow/models/dagrun.py
@@ -23,6 +23,7 @@ from datetime import datetime
 from typing import (
     TYPE_CHECKING,
     Any,
+    Callable,
     Dict,
     Generator,
     Iterable,
@@ -30,6 +31,7 @@ from typing import (
     NamedTuple,
     Optional,
     Sequence,
+    Set,
     Tuple,
     Union,
     cast,
@@ -818,13 +820,50 @@ class DagRun(Base, LoggingMixin):
         """
         from airflow.settings import task_instance_mutation_hook

+        # Set for the empty default in airflow.settings -- if it's not set this means it has been changed
+        hook_is_noop = getattr(task_instance_mutation_hook, 'is_noop', False)
+
         dag = self.get_dag()
+        task_ids = self._check_for_removed_or_restored_tasks(
+            dag, task_instance_mutation_hook, session=session
+        )
+
+        def task_filter(task: "Operator") -> bool:
+            return task.task_id not in task_ids and (
+                self.is_backfill
+                or task.start_date <= self.execution_date
+                and (task.end_date is None or self.execution_date <= task.end_date)
+            )
+
+        created_counts: Dict[str, int] = defaultdict(int)
+
+        # Get task creator function
+        task_creator = self._get_task_creator(created_counts, task_instance_mutation_hook, hook_is_noop)
+
+        # Create the missing tasks, including mapped tasks
+        tasks = self._create_missing_tasks(dag, task_creator, task_filter, session=session)
+
+        self._create_task_instances(dag.dag_id, tasks, created_counts, hook_is_noop, session=session)
+
+    def _check_for_removed_or_restored_tasks(
+        self, dag: "DAG", ti_mutation_hook, *, session: Session
+    ) -> Set[str]:
+        """
+        Check for removed tasks/restored tasks.
+
+        :param dag: DAG object corresponding to the dagrun
+        :param ti_mutation_hook: task_instance_mutation_hook function
+        :param session: Sqlalchemy ORM Session
+
+        :return: List of task_ids in the dagrun
+        """
         tis = self.get_task_instances(session=session)

         # check for removed or restored tasks
         task_ids = set()
         for ti in tis:
-            task_instance_mutation_hook(ti)
+            ti_mutation_hook(ti)
             task_ids.add(ti.task_id)
             task = None
             try:
@@ -885,19 +924,21 @@ class DagRun(Base, LoggingMixin):
             )
             ti.state = State.REMOVED
 ...
+        return task_ids

-        def task_filter(task: "Operator") -> bool:
-            return task.task_id not in task_ids and (
-                self.is_backfill
-                or task.start_date <= self.execution_date
-                and (task.end_date is None or self.execution_date <= task.end_date)
-            )
+    def _get_task_creator(
+        self, created_counts: Dict[str, int], ti_mutation_hook: Callable, hook_is_noop: bool
+    ) -> Callable:
+        """
+        Get the task creator function.

-        created_counts: Dict[str, int] = defaultdict(int)
+        This function also updates the created_counts dictionary with the number of tasks created.

-        # Set for the empty default in airflow.settings -- if it's not set this means it has been changed
-        hook_is_noop = getattr(task_instance_mutation_hook, 'is_noop', False)
+        :param created_counts: Dictionary of task_type -> count of created TIs
+        :param ti_mutation_hook: task_instance_mutation_hook function
+        :param hook_is_noop: Whether the task_instance_mutation_hook is a noop
+        """
         if hook_is_noop:

             def create_ti_mapping(task: "Operator", indexes: Tuple[int, ...]) -> Generator:
@@ -912,13 +953,25 @@ class DagRun(Base, LoggingMixin):

             def create_ti(task: "Operator", indexes: Tuple[int, ...]) -> Generator:
                 for map_index in
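The `task_filter` closure carried over by this refactor decides which DAG tasks still need task instances: tasks that already have one are skipped, and outside a backfill only tasks whose date window contains the run's execution date are kept. A standalone sketch of that predicate (plain dicts stand in for Airflow's `Operator`; this is an illustration, not the actual implementation):

```python
from datetime import datetime

def task_filter(task, existing_task_ids, execution_date, is_backfill):
    # Skip tasks that already have a task instance for this run; outside a
    # backfill, additionally require that [start_date, end_date] covers the
    # execution date.
    return task["task_id"] not in existing_task_ids and (
        is_backfill
        or (
            task["start_date"] <= execution_date
            and (task["end_date"] is None or execution_date <= task["end_date"])
        )
    )

run_date = datetime(2022, 6, 10)
new_task = {"task_id": "b", "start_date": datetime(2022, 1, 1), "end_date": None}
retired = {"task_id": "c", "start_date": datetime(2022, 1, 1), "end_date": datetime(2022, 3, 1)}

print(task_filter(new_task, {"a"}, run_date, False))  # True: no TI yet, window open
print(task_filter(retired, {"a"}, run_date, False))   # False: end_date has passed
```

A backfill run ignores the date window, which is why `is_backfill` short-circuits the date check.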
[GitHub] [airflow] ephraimbuddy merged pull request #24114: Refactor `DagRun.verify_integrity`
ephraimbuddy merged PR #24114: URL: https://github.com/apache/airflow/pull/24114 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[airflow] branch constraints-2-3 updated: Updating constraints. Build id:2474949367
This is an automated email from the ASF dual-hosted git repository.

github-bot pushed a commit to branch constraints-2-3
in repository https://gitbox.apache.org/repos/asf/airflow.git

The following commit(s) were added to refs/heads/constraints-2-3 by this push:
     new b8ecf47e3c  Updating constraints. Build id:2474949367

b8ecf47e3c is described below

commit b8ecf47e3c522c3ed7a0d20e01b01b578995f0b7
Author: Automated GitHub Actions commit
AuthorDate: Fri Jun 10 13:28:07 2022 +0000

    Updating constraints. Build id:2474949367

    This update in constraints is automatically committed by the CI
    'constraints-push' step based on HEAD of 'refs/heads/v2-3-test' in
    'apache/airflow' with commit sha 641ce142611d4a57bdf3ee1679c12d9f59f070a5.

    All tests passed in this build so we determined we can push the updated
    constraints. See
    https://github.com/apache/airflow/blob/main/README.md#installing-from-pypi
    for details.
---
 constraints-3.10.txt                  | 120 +-
 constraints-3.7.txt                   | 120 +-
 constraints-3.8.txt                   | 120 +-
 constraints-3.9.txt                   | 120 +-
 constraints-no-providers-3.10.txt     |  10 +-
 constraints-no-providers-3.7.txt      |  10 +-
 constraints-no-providers-3.8.txt      |  10 +-
 constraints-no-providers-3.9.txt      |  10 +-
 constraints-source-providers-3.10.txt | 120 +-
 constraints-source-providers-3.7.txt  | 120 +-
 constraints-source-providers-3.8.txt  | 120 +-
 constraints-source-providers-3.9.txt  | 120 +-
 12 files changed, 500 insertions(+), 500 deletions(-)

diff --git a/constraints-3.10.txt b/constraints-3.10.txt
index c9f2982c8c..c93d7d3e21 100644
--- a/constraints-3.10.txt
+++ b/constraints-3.10.txt
@@ -1,5 +1,5 @@
 #
-# This constraints file was automatically generated on 2022-06-01T12:25:26Z
+# This constraints file was automatically generated on 2022-06-10T13:27:43Z
 # via "eager-upgrade" mechanism of PIP. For the "v2-3-test" branch of Airflow.
 # This variant of constraints install uses the HEAD of the branch version for 'apache-airflow' but installs
 # the providers from PIP-released packages at the moment of the constraint generation.
@@ -148,7 +148,7 @@ attrs==20.3.0
 aws-xray-sdk==2.9.0
 azure-batch==12.0.0
 azure-common==1.1.28
-azure-core==1.24.0
+azure-core==1.24.1
 azure-cosmos==4.3.0
 azure-datalake-store==0.0.52
 azure-identity==1.10.0
@@ -173,9 +173,9 @@ billiard==3.6.4.0
 black==22.3.0
 bleach==5.0.0
 blinker==1.4
-boto3==1.24.0
+boto3==1.24.6
 boto==2.49.0
-botocore==1.27.0
+botocore==1.27.6
 bowler==0.9.0
 cached-property==1.5.2
 cachelib==0.7.0
@@ -200,7 +200,7 @@ colorama==0.4.4
 colorlog==4.8.0
 commonmark==0.9.1
 connexion==2.13.1
-coverage==6.4
+coverage==6.4.1
 crcmod==1.7
 cron-descriptor==1.2.24
 croniter==1.3.5
@@ -210,7 +210,7 @@ cx-Oracle==8.3.0
 dask==2022.5.2
 databricks-sql-connector==2.0.2
 datadog==0.44.0
-db-dtypes==1.0.1
+db-dtypes==1.0.2
 decorator==5.1.1
 defusedxml==0.7.1
 dill==0.3.1.1
@@ -230,7 +230,7 @@ eventlet==0.33.1
 execnet==1.9.0
 executing==0.8.3
 facebook-business==13.0.0
-fastavro==1.4.12
+fastavro==1.5.1
 fastjsonschema==2.15.3
 filelock==3.7.1
 fissix==21.11.13
@@ -252,44 +252,44 @@ google-ads==16.0.0
 google-api-core==2.8.1
 google-api-python-client==1.12.11
 google-auth-httplib2==0.1.0
-google-auth-oauthlib==0.5.1
-google-auth==2.6.6
-google-cloud-aiplatform==1.13.1
-google-cloud-appengine-logging==1.1.1
+google-auth-oauthlib==0.5.2
+google-auth==2.7.0
+google-cloud-aiplatform==1.14.0
+google-cloud-appengine-logging==1.1.2
 google-cloud-audit-log==0.2.2
-google-cloud-automl==2.7.2
-google-cloud-bigquery-datatransfer==3.6.1
-google-cloud-bigquery-storage==2.13.1
-google-cloud-bigquery==2.34.3
-google-cloud-bigtable==1.7.1
-google-cloud-build==3.8.2
-google-cloud-container==2.10.7
-google-cloud-core==2.3.0
-google-cloud-datacatalog==3.8.0
-google-cloud-dataplex==1.0.0
-google-cloud-dataproc-metastore==1.5.0
-google-cloud-dataproc==4.0.2
-google-cloud-dlp==1.0.1
-google-cloud-kms==2.11.1
-google-cloud-language==1.3.1
-google-cloud-logging==3.1.1
-google-cloud-memcache==1.3.1
-google-cloud-monitoring==2.9.1
-google-cloud-orchestration-airflow==1.3.1
-google-cloud-os-login==2.6.1
-google-cloud-pubsub==2.12.1
-google-cloud-redis==2.8.0
-google-cloud-resource-manager==1.5.0
-google-cloud-secret-manager==1.0.1
-google-cloud-spanner==1.19.2
-google-cloud-speech==1.3.3
+google-cloud-automl==2.7.3
+google-cloud-bigquery-datatransfer==3.6.2
+google-cloud-bigquery-storage==2.13.2
+google-cloud-bigquery==2.34.4
+google-cloud-bigtable==1.7.2
+google-cloud-build==3.8.3
+google-cloud-container==2.10.8
+google-cloud-core==2.3.1
+google-cloud-datacatalog==3.8.1
+google-cloud-dataplex==1.0.1
[airflow] branch main updated: Fix: `emr_conn_id` should be optional in `EmrCreateJobFlowOperator` (#24306)
This is an automated email from the ASF dual-hosted git repository.

kaxilnaik pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/airflow.git

The following commit(s) were added to refs/heads/main by this push:
     new 99d9833631  Fix: `emr_conn_id` should be optional in `EmrCreateJobFlowOperator` (#24306)

99d9833631 is described below

commit 99d98336312d188a078721579a3f71060bdde542
Author: Pankaj Singh <98807258+pankajas...@users.noreply.github.com>
AuthorDate: Fri Jun 10 18:55:12 2022 +0530

    Fix: `emr_conn_id` should be optional in `EmrCreateJobFlowOperator` (#24306)

    Closes: #24318
---
 airflow/providers/amazon/aws/hooks/emr.py     | 17 -
 airflow/providers/amazon/aws/operators/emr.py |  9 +++--
 2 files changed, 15 insertions(+), 11 deletions(-)

diff --git a/airflow/providers/amazon/aws/hooks/emr.py b/airflow/providers/amazon/aws/hooks/emr.py
index 143bdcdcc8..2141b38ed8 100644
--- a/airflow/providers/amazon/aws/hooks/emr.py
+++ b/airflow/providers/amazon/aws/hooks/emr.py
@@ -20,7 +20,7 @@ from typing import Any, Dict, List, Optional

 from botocore.exceptions import ClientError

-from airflow.exceptions import AirflowException
+from airflow.exceptions import AirflowException, AirflowNotFoundException
 from airflow.providers.amazon.aws.hooks.base_aws import AwsBaseHook

@@ -41,8 +41,8 @@ class EmrHook(AwsBaseHook):
     conn_type = 'emr'
     hook_name = 'Amazon Elastic MapReduce'

-    def __init__(self, emr_conn_id: Optional[str] = default_conn_name, *args, **kwargs) -> None:
-        self.emr_conn_id = emr_conn_id
+    def __init__(self, emr_conn_id: str = default_conn_name, *args, **kwargs) -> None:
+        self.emr_conn_id: str = emr_conn_id
         kwargs["client_type"] = "emr"
         super().__init__(*args, **kwargs)

@@ -78,12 +78,11 @@ class EmrHook(AwsBaseHook):
         run_job_flow method.
         Overrides for this config may be passed as the job_flow_overrides.
         """
-        if not self.emr_conn_id:
-            raise AirflowException('emr_conn_id must be present to use create_job_flow')
-
-        emr_conn = self.get_connection(self.emr_conn_id)
-
-        config = emr_conn.extra_dejson.copy()
+        try:
+            emr_conn = self.get_connection(self.emr_conn_id)
+            config = emr_conn.extra_dejson.copy()
+        except AirflowNotFoundException:
+            config = {}
         config.update(job_flow_overrides)

         response = self.get_conn().run_job_flow(**config)
diff --git a/airflow/providers/amazon/aws/operators/emr.py b/airflow/providers/amazon/aws/operators/emr.py
index 510c77184f..67ae54af50 100644
--- a/airflow/providers/amazon/aws/operators/emr.py
+++ b/airflow/providers/amazon/aws/operators/emr.py
@@ -285,8 +285,13 @@ class EmrCreateJobFlowOperator(BaseOperator):
     For more information on how to use this operator, take a look at the guide:
     :ref:`howto/operator:EmrCreateJobFlowOperator`

-    :param aws_conn_id: aws connection to uses
-    :param emr_conn_id: emr connection to use
+    :param aws_conn_id: The Airflow connection used for AWS credentials.
+        If this is None or empty then the default boto3 behaviour is used. If
+        running Airflow in a distributed manner and aws_conn_id is None or
+        empty, then default boto3 configuration would be used (and must be
+        maintained on each worker node)
+    :param emr_conn_id: emr connection to use for run_job_flow request body.
+        This will be overridden by the job_flow_overrides param
     :param job_flow_overrides: boto3 style arguments or reference to an arguments
         file (must be '.json') to override emr_connection extra. (templated)
     :param region_name: Region named passed to EmrHook
[GitHub] [airflow] kaxil merged pull request #24306: Fix: `emr_conn_id` should be optional in `EmrCreateJobFlowOperator`
kaxil merged PR #24306: URL: https://github.com/apache/airflow/pull/24306 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
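The essence of the merged fix is a lookup-with-fallback in `EmrHook.create_job_flow`: a missing `emr_conn_id` connection no longer raises, it simply yields an empty base config that `job_flow_overrides` is merged into. A self-contained sketch of that pattern (the dict store and exception class are stand-ins for Airflow's connection machinery, not the real code):

```python
class NotFoundError(Exception):
    """Stand-in for airflow.exceptions.AirflowNotFoundException."""

# Fake connection store; maps conn_id -> connection "extra" dict.
CONNECTIONS = {"emr_default": {"ReleaseLabel": "emr-6.6.0"}}

def get_connection_extra(conn_id):
    try:
        return CONNECTIONS[conn_id]
    except KeyError:
        raise NotFoundError(conn_id)

def build_run_job_flow_config(emr_conn_id, job_flow_overrides):
    # After #24306: a missing connection falls back to an empty config
    # instead of raising, so emr_conn_id is effectively optional.
    try:
        config = dict(get_connection_extra(emr_conn_id))
    except NotFoundError:
        config = {}
    config.update(job_flow_overrides)
    return config

print(build_run_job_flow_config("missing", {"Name": "my-cluster"}))
print(build_run_job_flow_config("emr_default", {"Name": "my-cluster"}))
```

Note that `job_flow_overrides` always wins over the connection extras, which matches the updated docstring.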
[GitHub] [airflow] kaxil closed issue #24318: `EmrCreateJobFlowOperator` does not work if emr_conn_id param contain credential
kaxil closed issue #24318: `EmrCreateJobFlowOperator` does not work if emr_conn_id param contain credential URL: https://github.com/apache/airflow/issues/24318 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] kijewskimateusz commented on issue #24374: Airflow on ARM - Optional provider feature disabled when importing 'airflow.providers.google.leveldb.hooks.leveldb.LevelDBHook' from 'apac
kijewskimateusz commented on issue #24374: URL: https://github.com/apache/airflow/issues/24374#issuecomment-1152352427 I've tested it with the `airflow:slim-2.3.2` image and the issue disappeared. After adding `apache-airflow-providers-google==7.0.0` to the Airflow dependencies, I'm notified about the optional provider feature being disabled. I'm wondering whether this will cause any harm when the module is needed on an ARM-based machine. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] voltcode commented on issue #24345: Grid view broken for DAG
voltcode commented on issue #24345: URL: https://github.com/apache/airflow/issues/24345#issuecomment-1152351132 Thanks a lot for the suggestion! The grid view works fine in 2.3.2. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] pankajastro commented on pull request #24306: Fix: `emr_conn_id` should be optional in `EmrCreateJobFlowOperator`
pankajastro commented on PR #24306: URL: https://github.com/apache/airflow/pull/24306#issuecomment-1152348168 Hey guys, just wanted to check if I need to address anything more here? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] eladkal commented on issue #24364: BigqueryToGCS Operator Failing
eladkal commented on issue #24364: URL: https://github.com/apache/airflow/issues/24364#issuecomment-1152334563 The fix is in `apache-airflow-providers-google==8.0.0rc2` (`pip install apache-airflow-providers-google==8.0.0rc2`). Please test it and report in https://github.com/apache/airflow/issues/24289 if the fix doesn't work. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] eladkal closed issue #24364: BigqueryToGCS Operator Failing
eladkal closed issue #24364: BigqueryToGCS Operator Failing URL: https://github.com/apache/airflow/issues/24364 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] kianelbo commented on issue #24343: BigQueryCreateEmptyTableOperator do not deprecated bigquery_conn_id yet
kianelbo commented on issue #24343: URL: https://github.com/apache/airflow/issues/24343#issuecomment-1152322651 I also took the liberty of using `gcp_conn_id` in [BigQueryCreateExternalTableOperator](https://github.com/apache/airflow/blob/main/airflow/providers/google/cloud/operators/bigquery.py#L898) in my PR if that's okay. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] kianelbo opened a new pull request, #24376: Replace the remaining occurrences of bigquery_conn_id with gcp_conn_id
kianelbo opened a new pull request, #24376: URL: https://github.com/apache/airflow/pull/24376 closes: #24343 Replace the remaining occurrences of the deprecated `bigquery_conn_id` arg in google cloud providers with `gcp_conn_id` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
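Argument renames like `bigquery_conn_id` → `gcp_conn_id` usually keep the old keyword working while emitting a `DeprecationWarning`. A minimal sketch of that backward-compatibility pattern (illustrative only; `resolve_conn_id` is a hypothetical helper, not the operator's actual code):

```python
import warnings

def resolve_conn_id(gcp_conn_id="google_cloud_default", bigquery_conn_id=None):
    # Accept the deprecated keyword for backwards compatibility, but warn
    # callers so they can migrate to the new name.
    if bigquery_conn_id is not None:
        warnings.warn(
            "The `bigquery_conn_id` parameter has been deprecated; "
            "use `gcp_conn_id` instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        gcp_conn_id = bigquery_conn_id
    return gcp_conn_id

print(resolve_conn_id())  # google_cloud_default
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    print(resolve_conn_id(bigquery_conn_id="legacy_conn"))  # legacy_conn
    print(len(caught))  # 1
```

The deprecated name takes precedence when supplied, so existing DAGs keep working unchanged until the alias is finally removed.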
[airflow] branch v2-3-test updated: fix 2.3.2 release date. (#24370)
This is an automated email from the ASF dual-hosted git repository. ephraimanierobi pushed a commit to branch v2-3-test in repository https://gitbox.apache.org/repos/asf/airflow.git The following commit(s) were added to refs/heads/v2-3-test by this push: new 641ce14261 fix 2.3.2 release date. (#24370) 641ce14261 is described below commit 641ce142611d4a57bdf3ee1679c12d9f59f070a5 Author: Soichiro Taga (Future) <52783301+future-t...@users.noreply.github.com> AuthorDate: Fri Jun 10 21:20:25 2022 +0900 fix 2.3.2 release date. (#24370) (cherry picked from commit 4daf51a2c388b41201a0a8095e0a97c27d6704c8) --- RELEASE_NOTES.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/RELEASE_NOTES.rst b/RELEASE_NOTES.rst index 3ec086d249..edc7de3df9 100644 --- a/RELEASE_NOTES.rst +++ b/RELEASE_NOTES.rst @@ -21,7 +21,7 @@ .. towncrier release notes start -Airflow 2.3.2 (2021-06-04) +Airflow 2.3.2 (2022-06-04) -- No significant changes
[GitHub] [airflow] boring-cyborg[bot] commented on pull request #24370: fix 2.3.2 release date.
boring-cyborg[bot] commented on PR #24370: URL: https://github.com/apache/airflow/pull/24370#issuecomment-1152300301 Awesome work, congrats on your first merged pull request! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[airflow] branch main updated (6b46240520 -> 4daf51a2c3)
This is an automated email from the ASF dual-hosted git repository. ephraimanierobi pushed a change to branch main in repository https://gitbox.apache.org/repos/asf/airflow.git from 6b46240520 Added small health check server and endpoint in scheduler and updated… (#23905) add 4daf51a2c3 fix 2.3.2 release date. (#24370) No new revisions were added by this update. Summary of changes: RELEASE_NOTES.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
[GitHub] [airflow] ephraimbuddy merged pull request #24370: fix 2.3.2 release date.
ephraimbuddy merged PR #24370: URL: https://github.com/apache/airflow/pull/24370 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] github-actions[bot] commented on pull request #24370: fix 2.3.2 release date.
github-actions[bot] commented on PR #24370: URL: https://github.com/apache/airflow/pull/24370#issuecomment-1152293138 The PR is likely ready to be merged. No tests are needed as no important environment files, nor python files were modified by it. However, committers might decide that full test matrix is needed and add the 'full tests needed' label. Then you should rebase it to the latest main or amend the last commit of the PR, and push it with --force-with-lease. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] ashb commented on pull request #24375: Add tests for the grid_data endpoint
ashb commented on PR #24375: URL: https://github.com/apache/airflow/pull/24375#issuecomment-1152282788 /cc @bbovenzi -- though I'm not sure I'm testing the case that you fixed in https://github.com/apache/airflow/pull/24327 here yet. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] ashb opened a new pull request, #24375: Add tests for the grid_data endpoint
ashb opened a new pull request, #24375: URL: https://github.com/apache/airflow/pull/24375 --- **^ Add meaningful description above** Read the **[Pull Request Guidelines](https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst#pull-request-guidelines)** for more information. In case of fundamental code change, Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)) is needed. In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x). In case of backwards incompatible changes please leave a note in a newsfragement file, named `{pr_number}.significant.rst`, in [newsfragments](https://github.com/apache/airflow/tree/main/newsfragments). -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] argibbs commented on pull request #23647: ExternalTaskSensor respects soft_fail if the external task enters a failed_state
argibbs commented on PR #23647: URL: https://github.com/apache/airflow/pull/23647#issuecomment-1152280057 Just checking, but I assume the helm tests timing out are not a blocker? Do they need to be rerun, or do we just not mind, given the unrelated nature of the change? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] boring-cyborg[bot] commented on issue #24374: {providers_manager.py:218} INFO - Optional provider feature disabled when importing 'airflow.providers.google.leveldb.hooks.leveldb.Lev
boring-cyborg[bot] commented on issue #24374: URL: https://github.com/apache/airflow/issues/24374#issuecomment-1152278773 Thanks for opening your first issue here! Be sure to follow the issue template! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] kijewskimateusz opened a new issue, #24374: {providers_manager.py:218} INFO - Optional provider feature disabled when importing 'airflow.providers.google.leveldb.hooks.leveldb.Level
kijewskimateusz opened a new issue, #24374: URL: https://github.com/apache/airflow/issues/24374

### Apache Airflow version

2.3.2 (latest released)

### What happened

I'm running Airflow 2.3.2 locally on my ARM-based laptop, and when the webserver starts I'm seeing multiple notifications with the content: `{providers_manager.py:218} INFO - Optional provider feature disabled when importing 'airflow.providers.google.leveldb.hooks.leveldb.LevelDBHook' from 'apache-airflow-providers-google' package`

### What you think should happen instead

I think the info should be prompted only once, or the feature should be enabled.

### How to reproduce

I'm using the `apache/airflow:2.3.2` image, installing the additional provider packages mentioned below.

### Operating System

12.4 (21F79)

### Versions of Apache Airflow Providers

apache-airflow-providers-postgres==4.1.0
apache-airflow-providers-databricks==2.6.0

### Deployment

Docker-Compose

### Deployment details

I'm building a custom Docker image as described in the Airflow documentation and then running it with a docker compose command.

### Anything else

_No response_

### Are you willing to submit PR?

- [ ] Yes I am willing to submit a PR!

### Code of Conduct

- [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
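The INFO line in this issue comes from the general pattern of attempting an optional import and disabling the feature when its dependency is missing. A simplified sketch of that guard, with an illustrative once-per-feature dedup set (this is not Airflow's actual `providers_manager` code; names are hypothetical):

```python
import logging

log = logging.getLogger("providers_manager")
_already_reported = set()

def import_optional_feature(dotted_path):
    """Return the named attribute, or None (logging once) if its deps are missing."""
    module_name, attr = dotted_path.rsplit(".", 1)
    try:
        module = __import__(module_name, fromlist=[attr])
        return getattr(module, attr)
    except ImportError:
        # Log the "feature disabled" notice only once per dotted path,
        # instead of on every scan of the providers.
        if dotted_path not in _already_reported:
            _already_reported.add(dotted_path)
            log.info("Optional provider feature disabled when importing %r", dotted_path)
        return None

print(import_optional_feature("json.loads") is not None)    # True
print(import_optional_feature("plyvel_missing.DB") is None)  # True
```

Deduplicating on the dotted path is one plausible way to address the "info should be prompted only once" request in the report.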
[GitHub] [airflow] Megha-Sai-Sree-PINNAKA commented on issue #22870: Keyboard shortcut in Airflow web UI
Megha-Sai-Sree-PINNAKA commented on issue #22870: URL: https://github.com/apache/airflow/issues/22870#issuecomment-1152269911 Hello, is this issue still available? If yes, I would like to try it. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] Megha-Sai-Sree-PINNAKA commented on issue #16654: "Suggest a change on this page" button bug/change request
Megha-Sai-Sree-PINNAKA commented on issue #16654: URL: https://github.com/apache/airflow/issues/16654#issuecomment-1152269363 Hello, is this issue still available? If yes, I would like to try it. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] rino0601 commented on issue #23796: Webserver shows wrong datetime (timezone) in log
rino0601 commented on issue #23796: URL: https://github.com/apache/airflow/issues/23796#issuecomment-1152255996 I've created PR #24373 for this issue. @tanelk @uranusjr, could you take a look? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] gmyrianthous commented on issue #24364: BigqueryToGCS Operator Failing
gmyrianthous commented on issue #24364: URL: https://github.com/apache/airflow/issues/24364#issuecomment-1152252154 @adityaprakash-bobby This isn't fixed in composer `2.0.14` (or even a more recent one), as a fix has not yet been merged. For the time being, you can follow one of the workarounds [mentioned here](https://stackoverflow.com/questions/72529432/airflow-2-job-not-found-when-transferring-data-from-bigquery-into-cloud-storage). -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] adityaprakash-bobby commented on issue #24364: BigqueryToGCS Operator Failing
adityaprakash-bobby commented on issue #24364: URL: https://github.com/apache/airflow/issues/24364#issuecomment-1152250554 Thank you @gmyrianthous. This helps. In our case we are not managing the libraries ourselves; we are letting Composer manage them. Hopefully this is brought into Composer 2.0.14; otherwise we'll have to call this out explicitly and take extra care while upgrading our Composer environment. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (AIRFLOW-3905) Allow using parameters for sql statement in SqlSensor
[ https://issues.apache.org/jira/browse/AIRFLOW-3905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17552690#comment-17552690 ]

ASF GitHub Bot commented on AIRFLOW-3905:
-----------------------------------------

malthe commented on PR #4723:
URL: https://github.com/apache/airflow/pull/4723#issuecomment-1152249671

    Shouldn't the `parameters` be templated?

> Allow using parameters for sql statement in SqlSensor
> -----------------------------------------------------
>
>                 Key: AIRFLOW-3905
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-3905
>             Project: Apache Airflow
>          Issue Type: Improvement
>          Components: operators
>    Affects Versions: 1.10.2
>            Reporter: Xiaodong Deng
>            Assignee: Xiaodong Deng
>            Priority: Minor
>             Fix For: 1.10.3
>
> In most SQL-related operators/sensors, argument `parameters` is available to
> help render SQL command conveniently. But this is not available in SqlSensor
> yet.

-- This message was sent by Atlassian Jira (v8.20.7#820007)
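For illustration, the idea behind a `parameters` argument — letting the database driver bind values rather than formatting them into the SQL string — looks like this with plain `sqlite3` (a stand-in here; Airflow's `SqlSensor(sql=..., parameters=...)` delegates the same binding to the underlying hook):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE job_runs (job TEXT, status TEXT)")
conn.executemany(
    "INSERT INTO job_runs VALUES (?, ?)",
    [("etl", "done"), ("etl", "running"), ("report", "done")],
)

# The sensor-style check: poke until this count is non-zero. The values are
# passed as driver-bound parameters, not interpolated into the statement,
# which avoids quoting bugs and SQL injection.
sql = "SELECT COUNT(*) FROM job_runs WHERE job = ? AND status = ?"
count = conn.execute(sql, ("etl", "done")).fetchone()[0]
print(count)  # 1
```

Templating `parameters` (the question raised in the comment) would additionally let the bound values themselves contain Jinja expressions such as `{{ ds }}`.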