[GitHub] [airflow] uranusjr commented on a change in pull request #17797: Fix broken MSSQL test
uranusjr commented on a change in pull request #17797:
URL: https://github.com/apache/airflow/pull/17797#discussion_r694509540

## File path: tests/task/task_runner/test_standard_task_runner.py

```diff
@@ -62,15 +62,12 @@ def logging_and_db(self):
         (as the test environment does not have enough context for the normal way to run)
         and ensures they reset back to normal on the way out.
         """
+        clear_db_runs()
         dictConfig(LOGGING_CONFIG)
         yield
         airflow_logger = logging.getLogger('airflow')
         airflow_logger.handlers = []
-        try:
-            clear_db_runs()
-        except Exception:
-            # It might happen that we lost connection to the server here so we need to ignore any errors here
-            pass
+        clear_db_runs()
```

Review comment: Is this related? I’m suspecting the try-except is there for a reason… but we can always add it back anyway.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org
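The shape of the fixture after this change can be sketched in isolation. Everything below is a hypothetical stand-in: `clear_db_runs` here just records calls, and driving the generator by hand mimics how pytest runs a yield-fixture.

```python
# Minimal sketch (hypothetical stand-ins) of the fixture shape after this
# change: DB runs are cleared before the test starts AND after it ends,
# and teardown errors are no longer swallowed by a blanket try/except.
calls = []

def clear_db_runs():
    # stand-in for the real helper that wipes DagRun rows
    calls.append("clear")

def logging_and_db():
    clear_db_runs()   # setup: start from a clean slate
    yield             # the test body runs here
    clear_db_runs()   # teardown: a lost DB connection now fails loudly

# drive the generator the way pytest drives a yield-fixture
fixture = logging_and_db()
next(fixture)                 # run setup
try:
    next(fixture)             # run teardown
except StopIteration:
    pass

assert calls == ["clear", "clear"]
```

The point of clearing *before* the test as well is that a dirty database left by an earlier (possibly crashed) test no longer poisons this one.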
[GitHub] [airflow] uranusjr commented on a change in pull request #17797: Fix broken MSSQL test
uranusjr commented on a change in pull request #17797:
URL: https://github.com/apache/airflow/pull/17797#discussion_r694508782

## File path: tests/task/task_runner/test_standard_task_runner.py

```diff
@@ -228,6 +225,8 @@ def test_on_kill(self):
         for process in processes:
             assert not psutil.pid_exists(process.pid), f"{process} is still alive"
+        session.close()
```

Review comment: I like how your code blocks are all in a Java-ish language
[GitHub] [airflow] uranusjr edited a comment on issue #17800: BigQueryCreateExternalTableOperator from providers package fails to get schema from GCS object
uranusjr edited a comment on issue #17800:
URL: https://github.com/apache/airflow/issues/17800#issuecomment-904336344

It feels weird to me that using `download_as_string` results in a 404; the issue seems to be separate. A PR fixing `download_as_string` would be very much welcomed, but I suspect it won’t fix your issue. But I can very well be wrong.
[airflow] branch main updated: Fix missing whitespace in ``apply_default`` deprecation message (#17799)
This is an automated email from the ASF dual-hosted git repository.

uranusjr pushed a commit to branch main in repository https://gitbox.apache.org/repos/asf/airflow.git

The following commit(s) were added to refs/heads/main by this push:
     new 41e3a91  Fix missing whitespace in ``apply_default`` deprecation message (#17799)

41e3a91 is described below

commit 41e3a918f01a4506ea676b0db7856e99fae47528
Author: Kaxil Naik
AuthorDate: Tue Aug 24 06:30:46 2021 +0100

    Fix missing whitespace in ``apply_default`` deprecation message (#17799)
---
 airflow/utils/decorators.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

```diff
diff --git a/airflow/utils/decorators.py b/airflow/utils/decorators.py
index e0b314b..ea5a536 100644
--- a/airflow/utils/decorators.py
+++ b/airflow/utils/decorators.py
@@ -36,7 +36,7 @@ def apply_defaults(func: T) -> T:
     warnings.warn(
         "This decorator is deprecated. \n"
         "\n"
-        "In previous versions, all subclasses of BaseOperator must use apply_default decorator for the"
+        "In previous versions, all subclasses of BaseOperator must use apply_default decorator for the "
         "`default_args` feature to work properly.\n"
         "\n"
         "In current version, it is optional. The decorator is applied automatically using the metaclass.\n",
```
[GitHub] [airflow] uranusjr merged pull request #17799: Fix missing whitespace in ``apply_default`` deprecation message
uranusjr merged pull request #17799:
URL: https://github.com/apache/airflow/pull/17799
[GitHub] [airflow] uranusjr commented on a change in pull request #17777: get_pandas_df() fails when it tries to read an empty table
uranusjr commented on a change in pull request #17777:
URL: https://github.com/apache/airflow/pull/17777#discussion_r694503504

## File path: tests/providers/apache/hive/hooks/test_hive.py

```diff
@@ -687,6 +687,13 @@ def test_get_pandas_df(self):
         hook.mock_cursor.execute.assert_any_call('set airflow.ctx.dag_owner=airflow')
         hook.mock_cursor.execute.assert_any_call('set airflow.ctx.dag_email=t...@airflow.com')
+        hook = MockHiveServer2Hook(empty_table_flag = True)
```

Review comment: I’d probably do something like `MockHiveServer2Hook(connection_cursor=EmptyMockTableCursor())` instead. `EmptyMockTableCursor` would inherit from a new `BaseMockHiveServer2Hook` class that contains common logic refactored out of `MockHiveServer2Hook` (which would then also inherit from `BaseMockHiveServer2Hook`).
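A rough sketch of what the suggested refactor could look like: inject the cursor instead of a boolean flag. Apart from `MockHiveServer2Hook`, every name below (`MockTableCursor`, `EmptyMockTableCursor`, `BaseMockHiveServer2Hook`, `get_records`) is hypothetical, and the hook logic is reduced to the bare minimum.

```python
# Hypothetical sketch of the review suggestion: the empty-table behaviour
# lives in a swappable cursor class, not an empty_table_flag argument.
class MockTableCursor:
    def execute(self, sql):
        self.last_sql = sql          # remember what was executed

    def fetchall(self):
        return [(1, "a"), (2, "b")]  # a table with some rows

class EmptyMockTableCursor(MockTableCursor):
    def fetchall(self):
        return []                    # behaves like a SELECT on an empty table

class BaseMockHiveServer2Hook:
    """Common logic refactored out of MockHiveServer2Hook."""
    def __init__(self, connection_cursor=None):
        self.mock_cursor = connection_cursor or MockTableCursor()

    def get_records(self, sql):
        self.mock_cursor.execute(sql)
        return self.mock_cursor.fetchall()

class MockHiveServer2Hook(BaseMockHiveServer2Hook):
    pass

hook = MockHiveServer2Hook(connection_cursor=EmptyMockTableCursor())
assert hook.get_records("SELECT * FROM t") == []
```

The design win is that each new edge case (empty table, NULL-heavy table, huge table) becomes a new cursor class rather than another constructor flag.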
[GitHub] [airflow] uranusjr commented on issue #17795: Circular Dependency in Apache Airflow 2.1.3
uranusjr commented on issue #17795:
URL: https://github.com/apache/airflow/issues/17795#issuecomment-904329243

FWIW circular dependencies are allowed in Python (in other words, a project’s Python package dependency graph is not necessarily a DAG), so Bazel not supporting them is a bug (or missing feature).
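To make the distinction concrete, here is a small, self-contained cycle check over a made-up dependency graph (the package names and the idea that these two packages require each other are illustrative only). pip can install such a cycle because it resolves versions first and does not need a topological install order, while a strictly DAG-based build tool must reject it.

```python
def has_cycle(graph):
    """DFS with three-colour marking; a gray->gray edge means a cycle."""
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {node: WHITE for node in graph}

    def visit(node):
        color[node] = GRAY
        for dep in graph.get(node, []):
            state = color.get(dep, WHITE)
            if state == GRAY:
                return True            # back edge: dependency cycle
            if state == WHITE and visit(dep):
                return True
        color[node] = BLACK
        return False

    return any(color[n] == WHITE and visit(n) for n in list(graph))

# Illustrative (not real) metadata: two packages that require each other.
deps = {
    "apache-airflow": ["apache-airflow-providers-http"],
    "apache-airflow-providers-http": ["apache-airflow"],
}
assert has_cycle(deps)                      # legal for pip, fatal for a DAG-only tool
assert not has_cycle({"a": ["b"], "b": []}) # an acyclic graph passes
```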
[GitHub] [airflow] edwardwang888 edited a comment on issue #16877: Cleared task instances in manual runs should have borders
edwardwang888 edited a comment on issue #16877:
URL: https://github.com/apache/airflow/issues/16877#issuecomment-904297333

@bbovenzi Sorry for the delay! I have my local development environment set up now, and am playing around with `tree.js`. I will reach out either here or on Slack if I have any questions.
[GitHub] [airflow] EliMor commented on issue #17661: Support git sync from multiple repositories
EliMor commented on issue #17661:
URL: https://github.com/apache/airflow/issues/17661#issuecomment-904285865

Seems related to: https://github.com/apache/airflow/issues/11708
[GitHub] [airflow] EliMor commented on issue #17737: Allow multiple repos to be source for dags in helm chart
EliMor commented on issue #17737:
URL: https://github.com/apache/airflow/issues/17737#issuecomment-904266423

Seems like a duplicate of https://github.com/apache/airflow/issues/17661
[GitHub] [airflow] iblaine commented on pull request #17777: get_pandas_df() fails when it tries to read an empty table
iblaine commented on pull request #17777:
URL: https://github.com/apache/airflow/pull/17777#issuecomment-904239844

* A new flag `empty_table_flag` has been added to enable tests against an empty Hive table
* A new test has been added to validate `get_pandas_df` against an empty table
[GitHub] [airflow] boring-cyborg[bot] commented on issue #17800: BigQueryCreateExternalTableOperator from providers package fails to get schema from GCS object
boring-cyborg[bot] commented on issue #17800:
URL: https://github.com/apache/airflow/issues/17800#issuecomment-904225776

Thanks for opening your first issue here! Be sure to follow the issue template!
[GitHub] [airflow] lawrencestfs opened a new issue #17800: BigQueryCreateExternalTableOperator from providers package fails to get schema from GCS object
lawrencestfs opened a new issue #17800:
URL: https://github.com/apache/airflow/issues/17800

**Apache Airflow version**: 1.10.15

**OS**: Linux 5.4.109+

**Apache Airflow Provider versions**:
apache-airflow-backport-providers-apache-beam==2021.3.13
apache-airflow-backport-providers-cncf-kubernetes==2021.3.3
apache-airflow-backport-providers-google==2021.3.3

**Deployment**: Cloud Composer 1.16.6 (Google Cloud Managed Airflow Service)

**What happened**: `BigQueryCreateExternalTableOperator` from the providers package ([airflow.providers.google.cloud.operators.bigquery](https://github.com/apache/airflow/blob/main/airflow/providers/google/cloud/operators/bigquery.py)) fails even with a correct _schema_object_ parameter.

**What you expected to happen**: I expected the DAG to successfully run, as I've previously tested it with the deprecated operator from the contrib package ([airflow.contrib.operators.bigquery_operator](https://github.com/apache/airflow/blob/5786dcdc392f7a2649f398353a0beebef01c428e/airflow/contrib/operators/bigquery_operator.py#L476)), using the same parameters.

Debugging the DAG execution log, I saw the providers operator generated a wrong call to the Cloud Storage API: it mixed up the bucket and object parameters, according to the stack trace below.
```
[2021-08-23 23:17:22,316] {taskinstance.py:1152} ERROR - 404 GET https://storage.googleapis.com/download/storage/v1/b/foo/bar/schema.json/o/mybucket?alt=media: Not Found: ('Request failed with status code', 404, 'Expected one of', ...)
Traceback (most recent call last):
  File "/opt/python3.6/lib/python3.6/site-packages/google/cloud/storage/client.py", line 728, in download_blob_to_file
    checksum=checksum
  File "/opt/python3.6/lib/python3.6/site-packages/google/cloud/storage/blob.py", line 956, in _do_download
    response = download.consume(transport, timeout=timeout)
  File "/opt/python3.6/lib/python3.6/site-packages/google/resumable_media/requests/download.py", line 168, in consume
    self._process_response(result)
  File "/opt/python3.6/lib/python3.6/site-packages/google/resumable_media/_download.py", line 186, in _process_response
    response, _ACCEPTABLE_STATUS_CODES, self._get_status_code
  File "/opt/python3.6/lib/python3.6/site-packages/google/resumable_media/_helpers.py", line 104, in require_status_code
    *status_code
google.resumable_media.common.InvalidResponse: ('Request failed with status code', 404, 'Expected one of', ...)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/airflow/airflow/models/taskinstance.py", line 985, in _run_raw_task
    result = task_copy.execute(context=context)
  File "/usr/local/lib/airflow/airflow/providers/google/cloud/operators/bigquery.py", line 1178, in execute
    schema_fields = json.loads(gcs_hook.download(self.bucket, self.schema_object))
  File "/usr/local/lib/airflow/airflow/providers/google/cloud/hooks/gcs.py", line 301, in download
    return blob.download_as_string()
  File "/opt/python3.6/lib/python3.6/site-packages/google/cloud/storage/blob.py", line 1391, in download_as_string
    timeout=timeout
  File "/opt/python3.6/lib/python3.6/site-packages/google/cloud/storage/blob.py", line 1302, in download_as_bytes
    checksum=checksum
  File "/opt/python3.6/lib/python3.6/site-packages/google/cloud/storage/client.py", line 731, in download_blob_to_file
    _raise_from_invalid_response(exc)
  File "/opt/python3.6/lib/python3.6/site-packages/google/cloud/storage/blob.py", line 3936, in _raise_from_invalid_response
    raise exceptions.from_http_status(response.status_code, message, response=response)
google.api_core.exceptions.NotFound: 404 GET https://storage.googleapis.com/download/storage/v1/b/foo/bar/schema.json/o/mybucket?alt=media: Not Found: ('Request failed with status code', 404, 'Expected one of', ...)
```

PS: the bucket (_mybucket_) and object path (_foo/bar/schema.json_) were masked for security reasons.

I believe the error appears on the [following](https://github.com/apache/airflow/blob/main/airflow/providers/google/cloud/operators/bigquery.py#L1183) line, although the bug itself is probably located in the [gcs_hook.download()](https://github.com/apache/airflow/blob/0264fea8c2024d7d3d64aa0ffa28a0cfa48839cd/airflow/providers/google/cloud/hooks/gcs.py#L266) method:

`schema_fields = json.loads(gcs_hook.download(self.bucket, self.schema_object))`

**How to reproduce it**: Create a DAG using both operators and the same parameters, as in the example below. The task using the contrib version of the operator should work, while the task using the providers version should fail.

```
from airflow.contrib.operators.bigquery_operator import BigQueryCreateExternalTableOperator as
```
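The 404 URL in the trace is the tell: `/b/foo/bar/schema.json/o/mybucket` has the object path in the bucket slot and vice versa. A tiny illustration of the suspected transposition; the URL-building helper below is invented for the demo and is not the real hook code, it only mirrors the `/b/<bucket>/o/<object>` layout of GCS media URLs.

```python
# Hypothetical helper mirroring how a GCS media download URL is laid out:
# /b/<bucket>/o/<object>. Not the actual gcs_hook implementation.
def media_url(bucket, object_name):
    return (
        "https://storage.googleapis.com/download/storage/v1"
        f"/b/{bucket}/o/{object_name}?alt=media"
    )

bucket = "mybucket"
schema_object = "foo/bar/schema.json"

# Arguments transposed -> the broken URL seen in the stack trace (404):
assert media_url(schema_object, bucket) == (
    "https://storage.googleapis.com/download/storage/v1"
    "/b/foo/bar/schema.json/o/mybucket?alt=media"
)

# Correct order -> bucket in the /b/ slot, object in the /o/ slot:
assert media_url(bucket, schema_object) == (
    "https://storage.googleapis.com/download/storage/v1"
    "/b/mybucket/o/foo/bar/schema.json?alt=media"
)
```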
[GitHub] [airflow] github-actions[bot] closed issue #10574: Dag in existing persistent volume claim not registering
github-actions[bot] closed issue #10574:
URL: https://github.com/apache/airflow/issues/10574
[GitHub] [airflow] github-actions[bot] commented on issue #10574: Dag in existing persistent volume claim not registering
github-actions[bot] commented on issue #10574:
URL: https://github.com/apache/airflow/issues/10574#issuecomment-904217119

This issue has been closed because it has not received a response from the issue author.
[GitHub] [airflow] kaxil opened a new pull request #17799: Fix missing whitespace in ``apply_default`` deprecation message
kaxil opened a new pull request #17799:
URL: https://github.com/apache/airflow/pull/17799

Before:
```
In previous versions, all subclasses of BaseOperator must use apply_default decorator for the`default_args` feature to work properly.
```

After:
```
In previous versions, all subclasses of BaseOperator must use apply_default decorator for the `default_args` feature to work properly.
```

---

**^ Add meaningful description above**

Read the **[Pull Request Guidelines](https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst#pull-request-guidelines)** for more information. In case of fundamental code change, Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)) is needed. In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x). In case of backwards incompatible changes please leave a note in [UPDATING.md](https://github.com/apache/airflow/blob/main/UPDATING.md).
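The root cause is Python's implicit concatenation of adjacent string literals, which joins the pieces with no separator, so a missing trailing space fuses two words across the literal boundary. A quick check of the before/after behavior:

```python
# Adjacent string literals concatenate verbatim; the only difference
# between the two values below is the trailing space in the first piece.
before = (
    "all subclasses of BaseOperator must use apply_default decorator for the"
    "`default_args` feature to work properly."
)
after = (
    "all subclasses of BaseOperator must use apply_default decorator for the "
    "`default_args` feature to work properly."
)
assert "the`default_args`" in before   # fused: the bug
assert "the `default_args`" in after   # space restored: the fix
```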
[GitHub] [airflow] kaxil opened a new pull request #17798: Prepare release for Kubernetes Provider
kaxil opened a new pull request #17798:
URL: https://github.com/apache/airflow/pull/17798

https://github.com/apache/airflow/issues/17186 made the XCom functionality not work with KubernetesPodOperator. This has been fixed by https://github.com/apache/airflow/pull/17760, so we should get this release out sooner rather than later, as the recently released Airflow 2.1.3 will pull in the latest Kubernetes provider when we run `pip install -U apache-airflow[cncf.kubernetes]`.
[GitHub] [airflow] potiuk commented on a change in pull request #17797: Fix broken MSSQL test
potiuk commented on a change in pull request #17797:
URL: https://github.com/apache/airflow/pull/17797#discussion_r694367348

## File path: tests/task/task_runner/test_standard_task_runner.py

```diff
@@ -228,6 +225,8 @@ def test_on_kill(self):
         for process in processes:
             assert not psutil.pid_exists(process.pid), f"{process} is still alive"
+        session.close()
```

Review comment: Maybe simply:
```
from contextlib import closing

with closing(new Session()) as session:
```
That would avoid creating an explicit context manager.
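In actual Python (the snippet above reads more like Java), the same idea with `contextlib.closing` might look like this. The `Session` stub below is a stand-in for a SQLAlchemy session and exists only to make the example self-contained.

```python
from contextlib import closing

class Session:
    """Stand-in for a SQLAlchemy session; only close() matters here."""
    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True

# closing() turns any object with a close() method into a context
# manager, so no try/finally or hand-written __exit__ is needed.
with closing(Session()) as session:
    pass  # ... run queries against the test database ...

assert session.closed  # close() was called on exiting the with-block
```

Unlike a bare `session.close()` at the end of the test, `closing` guarantees the call even when an assertion inside the block fails.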
[GitHub] [airflow] potiuk commented on a change in pull request #17797: Fix broken MSSQL test
potiuk commented on a change in pull request #17797:
URL: https://github.com/apache/airflow/pull/17797#discussion_r694365694

## File path: tests/task/task_runner/test_standard_task_runner.py

```diff
@@ -228,6 +225,8 @@ def test_on_kill(self):
         for process in processes:
             assert not psutil.pid_exists(process.pid), f"{process} is still alive"
+        session.close()
```

Review comment: Actually it does call for a `with newSession() as session` ... Should we make a ContextManager for that? That would avoid try/finally.
[GitHub] [airflow] potiuk commented on a change in pull request #17797: Fix broken MSSQL test
potiuk commented on a change in pull request #17797:
URL: https://github.com/apache/airflow/pull/17797#discussion_r694363964

## File path: tests/task/task_runner/test_standard_task_runner.py

```diff
@@ -228,6 +225,8 @@ def test_on_kill(self):
         for process in processes:
             assert not psutil.pid_exists(process.pid), f"{process} is still alive"
+        session.close()
```

Review comment: I see a number of similar uses (but a number of `finally` ones as well). Actually, not having `finally` might be a reason why we are having side effects/flaky tests, so we might want to fix them all.
[GitHub] [airflow] jedcunningham commented on a change in pull request #17797: Fix broken MSSQL test
jedcunningham commented on a change in pull request #17797:
URL: https://github.com/apache/airflow/pull/17797#discussion_r694361191

## File path: tests/task/task_runner/test_standard_task_runner.py

```diff
@@ -228,6 +225,8 @@ def test_on_kill(self):
         for process in processes:
             assert not psutil.pid_exists(process.pid), f"{process} is still alive"
+        session.close()
```

Review comment: I can convert this to `createSession` which would do that, but calling `close` like this is pretty common in our test suite.
[GitHub] [airflow] jedcunningham closed pull request #17797: Fix broken MSSQL test
jedcunningham closed pull request #17797:
URL: https://github.com/apache/airflow/pull/17797
[airflow] branch main updated (36c5fd3 -> 0264fea)
This is an automated email from the ASF dual-hosted git repository.

potiuk pushed a change to branch main in repository https://gitbox.apache.org/repos/asf/airflow.git.

     from 36c5fd3  Move instriuctions of constraint/image refreshing to dev
      add 0264fea  Remove airflow dependency from http provider

No new revisions were added by this update.

Summary of changes:
 airflow/providers/http/provider.yaml | 3 ---
 1 file changed, 3 deletions(-)
[GitHub] [airflow] potiuk closed issue #17795: Circular Dependency in Apache Airflow 2.1.3
potiuk closed issue #17795:
URL: https://github.com/apache/airflow/issues/17795
[GitHub] [airflow] potiuk merged pull request #17796: Remove airflow dependency from http provider
potiuk merged pull request #17796:
URL: https://github.com/apache/airflow/pull/17796
[GitHub] [airflow] potiuk commented on a change in pull request #17797: Fix broken MSSQL test
potiuk commented on a change in pull request #17797:
URL: https://github.com/apache/airflow/pull/17797#discussion_r694358373

## File path: tests/task/task_runner/test_standard_task_runner.py

```diff
@@ -228,6 +225,8 @@ def test_on_kill(self):
         for process in processes:
             assert not psutil.pid_exists(process.pid), f"{process} is still alive"
+        session.close()
```

Review comment: Should it be in a try/finally clause? Or maybe even setup/tearDown?
[GitHub] [airflow] github-actions[bot] commented on pull request #17796: Remove airflow dependency from http provider
github-actions[bot] commented on pull request #17796:
URL: https://github.com/apache/airflow/pull/17796#issuecomment-904183023

The PR is likely OK to be merged with just a subset of tests for default Python and Database versions without running the full matrix of tests, because it does not modify the core of Airflow. If the committers decide that the full tests matrix is needed, they will add the label 'full tests needed'. Then you should rebase to the latest main or amend the last commit of the PR, and push it with --force-with-lease.
[GitHub] [airflow] potiuk commented on issue #17788: livy operator print sparkUiUrl in log output
potiuk commented on issue #17788:
URL: https://github.com/apache/airflow/issues/17788#issuecomment-904174613

> @potiuk I am not sure if that extra operator will work for this case. The livy api only stores data for about 3 hours in our case. The default is much shorter. Since the uri has to be pulled from the API and is not accessible long term, I am not sure the link would be valid after or even available later. I hope that makes sense.

But yeah, if the link is not functional after 3 hours, then you are probably right that exposing it via an extra link is not a good idea. I thought that you would still be able to access it after 3 hrs (but not retrieve it), but if I read it correctly, this is not the case.
[GitHub] [airflow] potiuk commented on issue #17788: livy operator print sparkUiUrl in log output
potiuk commented on issue #17788:
URL: https://github.com/apache/airflow/issues/17788#issuecomment-904173353

> @potiuk I am not sure if that extra operator will work for this case. The livy api only stores data for about 3 hours in our case. The default is much shorter. Since the uri has to be pulled from the API and is not accessible long term, I am not sure the link would be valid after or even available later. I hope that makes sense.

You could store the link in the XCom right after submission, and retrieve it from the XCom to show it in the extra link.
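A minimal, framework-free simulation of that suggestion. A plain dict stands in for Airflow's XCom table, and the Livy batch-response shape plus all function names are assumptions made for the sketch, not the real provider API.

```python
# Simulated XCom backend: (task_id, key) -> value
xcom_store = {}

def submit_livy_batch(task_id, batch_response):
    """On submission, stash the Spark UI URL before Livy expires it."""
    spark_ui_url = batch_response["appInfo"]["sparkUiUrl"]  # assumed shape
    xcom_store[(task_id, "spark_ui_url")] = spark_ui_url

def spark_ui_extra_link(task_id):
    """Extra-link callback: reads from XCom, so it keeps working even
    after the Livy API has dropped the batch metadata (~3h here)."""
    return xcom_store.get((task_id, "spark_ui_url"), "")

submit_livy_batch(
    "livy_task",
    {"appInfo": {"sparkUiUrl": "http://spark-master:8080/app/123"}},
)
assert spark_ui_extra_link("livy_task") == "http://spark-master:8080/app/123"
assert spark_ui_extra_link("unknown_task") == ""  # no link pushed -> empty
```

The key property is that the URL's lifetime is now tied to the XCom row, not to Livy's retention window.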
[GitHub] [airflow] jedcunningham opened a new pull request #17797: Fix broken MSSQL test
jedcunningham opened a new pull request #17797: URL: https://github.com/apache/airflow/pull/17797 This broken test was causing the next test that used the db to fail. Also, by not ignoring exceptions here, we let the failure be exposed where it's broken, not in the next test that happens to run. This was causing `TestBaseSensor.test_fail` to fail with: ``` pyodbc.ProgrammingError: ('42000', '[42000] [Microsoft][ODBC Driver 17 for SQL Server][SQL Server]The server failed to resume the transaction. Desc:330006. (3971) (SQLExecDirectW)') ```
[airflow] branch main updated: Move instructions of constraint/image refreshing to dev
This is an automated email from the ASF dual-hosted git repository. potiuk pushed a commit to branch main in repository https://gitbox.apache.org/repos/asf/airflow.git The following commit(s) were added to refs/heads/main by this push: new 36c5fd3 Move instructions of constraint/image refreshing to dev 36c5fd3 is described below commit 36c5fd3df9b271702e1dd2d73c579de3f3bd5fc0 Author: Jarek Potiuk AuthorDate: Mon Aug 23 11:35:44 2021 +0200 Move instructions of constraint/image refreshing to dev When we have a prolonged issue with flaky tests or GitHub runner instabilities, our automated constraint and image refresh might not work, so we might need to manually refresh the constraints and images. Documentation about that was in CONTRIBUTING.rst but it is more appropriate to keep it in ``dev`` as it only applies to committers. Also, during testing of the parallel refresh without delays, an error was discovered which prevented the parallel check of the random image hash during the build. This has been fixed and parallel image cache building should now work flawlessly. --- CONTRIBUTING.rst| 36 - dev/REFRESHING_CI_CACHE.md | 94 + dev/refresh_images.sh | 38 + scripts/ci/libraries/_build_images.sh | 68 +--- scripts/ci/libraries/_initialization.sh | 11 5 files changed, 170 insertions(+), 77 deletions(-) diff --git a/CONTRIBUTING.rst b/CONTRIBUTING.rst index b5a81ff..3a561f8 100644 --- a/CONTRIBUTING.rst +++ b/CONTRIBUTING.rst @@ -877,42 +877,6 @@ The ``constraints-.txt`` and ``constraints-no-provid will be automatically regenerated by CI job every time after the ``setup.py`` is updated and pushed if the tests are successful. -Manually generating constraint files - - -The constraint files are generated automatically by the CI job. Sometimes however it is needed to regenerate -them manually (committers only), for example when the main build did not succeed for quite some time. -This can be done by running this (it utilizes parallel preparation of the constraints): - -.. 
code-block:: bash - -export CURRENT_PYTHON_MAJOR_MINOR_VERSIONS_AS_STRING="3.6 3.7 3.8 3.9" -for python_version in $(echo "${CURRENT_PYTHON_MAJOR_MINOR_VERSIONS_AS_STRING}") -do - ./breeze build-image --upgrade-to-newer-dependencies --python ${python_version} --build-cache-local -done - -GENERATE_CONSTRAINTS_MODE="pypi-providers" ./scripts/ci/constraints/ci_generate_all_constraints.sh -GENERATE_CONSTRAINTS_MODE="source-providers" ./scripts/ci/constraints/ci_generate_all_constraints.sh -GENERATE_CONSTRAINTS_MODE="no-providers" ./scripts/ci/constraints/ci_generate_all_constraints.sh - -AIRFLOW_SOURCES=$(pwd) - - -The constraints will be generated in "files/constraints-PYTHON_VERSION/constraints-*.txt files. You need to -checkout the right 'constraints-' branch in a separate repository and then you can copy, commit and push the -generated files: - -.. code-block:: bash - -cd -git pull -cp ${AIRFLOW_SOURCES}/files/constraints-*/constraints*.txt . -git diff -git add . -git commit -m "Your commit message here" --no-verify -git push - Documentation = diff --git a/dev/REFRESHING_CI_CACHE.md b/dev/REFRESHING_CI_CACHE.md new file mode 100644 index 000..c5a27ee --- /dev/null +++ b/dev/REFRESHING_CI_CACHE.md @@ -0,0 +1,94 @@ + + + + +**Table of Contents** *generated with [DocToc](https://github.com/thlorenz/doctoc)* + +- [Automated cache refreshing in CI](#automated-cache-refreshing-in-ci) +- [Manually generating constraint files](#manually-generating-constraint-files) +- [Manually refreshing the images](#manually-refreshing-the-images) + + + +# Automated cache refreshing in CI + +Our [CI system](../CI.rst) is build in the way that it self-maintains. Regular scheduled builds and +merges to `main` branch have separate maintenance step that take care about refreshing the cache that is +used to speed up our builds and to speed up rebuilding of [Breeze](../BREEZE.rst) images for development +purpose. 
This is all happening automatically, usually: + +* The latest [constraints](../COMMITTERS.rst#pinned-constraint-files) are pushed to appropriate branch + after all tests succeeded in `main` merge or in `scheduled` build + +* The [images](../IMAGES.rst) in `ghcr.io` registry are refreshed after every successful merge to `main` + or `scheduled` build and after pushing the constraints, this means that the latest image cache uses + also the latest tested constraints + +Sometimes however, when we have prolonged period of fighting with flakiness of GitHub Actions runners or our +tests, the refresh might not be triggered - because tests will not succeed for some time. In this case +manual refresh might be needed. + +# Manually
[GitHub] [airflow] potiuk merged pull request #17782: Move instructions of constraint/image refreshing to dev
potiuk merged pull request #17782: URL: https://github.com/apache/airflow/pull/17782
[GitHub] [airflow] potiuk edited a comment on issue #17795: Circular Dependency in Apache Airflow 2.1.3
potiuk edited a comment on issue #17795: URL: https://github.com/apache/airflow/issues/17795#issuecomment-904169637 Yeah. The HTTP provider has been brought back to be preinstalled and we have to fix that and release it to remove the circular dependency. This will be fixed with the next release of the HTTP provider (https://github.com/apache/airflow/pull/17796). BTW, Bazel is not supported as a way of installing Airflow. The only official way of installing Airflow is via `pip` and constraints - https://airflow.apache.org/docs/apache-airflow/stable/installation.html#installation-tools. PIP can handle circular dependencies well, so it's really Bazel's problem, not Airflow's (we are happy to fix it anyway for the next release - thanks for reporting).
[GitHub] [airflow] potiuk commented on issue #17795: Circular Dependency in Apache Airflow 2.1.3
potiuk commented on issue #17795: URL: https://github.com/apache/airflow/issues/17795#issuecomment-904169637 Yeah. The HTTP provider has been brought back to be preinstalled and we had to release it to remove the circular dependency. This will be fixed with the next release of the HTTP provider (https://github.com/apache/airflow/pull/17796). BTW, Bazel is not supported as a way of installing Airflow. The only official way of installing Airflow is via `pip` and constraints - https://airflow.apache.org/docs/apache-airflow/stable/installation.html#installation-tools. PIP can handle circular dependencies well, so it's really Bazel's problem, not Airflow's (we are happy to fix it anyway for the next release - thanks for reporting).
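The cycle being discussed is concrete: `apache-airflow` lists the HTTP provider as a preinstalled dependency, while `apache-airflow-providers-http` in turn pinned `apache-airflow>=2.1`. pip's resolver tolerates such cycles (it resolves the whole set at once), but a tool that needs a topological order of packages, as Bazel does, cannot. A generic sketch of how such a cycle is detected (the graph mirrors the packages from this issue; the depth-first search is purely illustrative, not Bazel's implementation):

```python
def find_cycle(graph, start):
    """Depth-first search returning one dependency cycle as a list, or None."""
    path, seen_in_path = [], set()

    def dfs(node):
        if node in seen_in_path:
            # We walked back into a package already on the current path: cycle.
            return path[path.index(node):] + [node]
        path.append(node)
        seen_in_path.add(node)
        for dep in graph.get(node, []):
            cycle = dfs(dep)
            if cycle:
                return cycle
        path.pop()
        seen_in_path.remove(node)
        return None

    return dfs(start)


deps = {
    "apache-airflow": ["apache-airflow-providers-http"],
    "apache-airflow-providers-http": ["apache-airflow", "requests"],
    "requests": [],
}
print(find_cycle(deps, "apache-airflow"))
```

A topological sort of this graph is impossible until the provider's `apache-airflow>=2.1` edge is dropped, which is exactly what #17796 does.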
[GitHub] [airflow] potiuk opened a new pull request #17796: Remove airflow dependency from http provider
potiuk opened a new pull request #17796: URL: https://github.com/apache/airflow/pull/17796 The http provider had been temporarily moved out of the preinstalled providers (because of licensing issues). Those issues have now been resolved and the http provider went back to being preinstalled; however, it still had apache-airflow>=2.1 as a dependency. This PR removes the dependency. Fixes: #17795
[GitHub] [airflow] cocampbe commented on issue #17788: livy operator print sparkUiUrl in log output
cocampbe commented on issue #17788: URL: https://github.com/apache/airflow/issues/17788#issuecomment-904163197 @potiuk I am not sure if that extra operator will work for this case. The livy api only stores data for about 3 hours in our case. The default is much shorter. Since the uri has to be pulled from the API and is not accessible long term, I am not sure the link would be valid after or even available later. I hope that makes sense.
[GitHub] [airflow] potiuk closed pull request #17767: Update description about the new ``connection-types`` provider meta-data
potiuk closed pull request #17767: URL: https://github.com/apache/airflow/pull/17767
[GitHub] [airflow] potiuk commented on pull request #17767: Update description about the new ``connection-types`` provider meta-data
potiuk commented on pull request #17767: URL: https://github.com/apache/airflow/pull/17767#issuecomment-904161210 Already merged together with #17775
[airflow] 02/02: Improve discoverability of Provider packages' functionality
This is an automated email from the ASF dual-hosted git repository. potiuk pushed a commit to branch main in repository https://gitbox.apache.org/repos/asf/airflow.git commit bcc76656844be47f923527c0a6cd1de546655cb4 Author: Jarek Potiuk AuthorDate: Sat Aug 21 23:10:31 2021 +0200 Improve discoverability of Provider packages' functionality The documentation of provider packages was rather disconnected from the apache-airflow documentation. It was hard to find how Apache Airflow's core extensions are implemented by the community-managed providers - you needed to know what you were looking for, and you could not find links to the summary of the core functionality extended by providers when you were looking at the functionality (like logging/secret backends/connections/auth). This PR introduces much more comprehensive cross-linking between the airflow core functionality and the community-managed providers that are providing extensions to the core functionality. --- .../logging/cloud-watch-task-handlers.rst | 2 +- .../logging/s3-task-handler.rst| 2 +- .../index.rst | 2 +- .../{logging.rst => logging/index.rst} | 6 +- .../redirects.txt | 1 + .../logging/gcs.rst| 2 +- .../logging/index.rst | 4 +- .../logging/stackdriver.rst| 2 +- .../index.rst | 2 +- .../{logging.rst => logging/index.rst} | 2 +- .../redirects.txt | 1 + .../core-extensions/auth-backends.rst} | 19 +- .../core-extensions/connections.rst} | 22 +- .../core-extensions/extra-links.rst} | 21 +- .../core-extensions}/index.rst | 10 +- .../core-extensions/logging.rst} | 15 +- .../core-extensions/secrets-backends.rst | 36 +++ docs/apache-airflow-providers/index.rst| 222 --- docs/apache-airflow/concepts/connections.rst | 10 + docs/apache-airflow/concepts/operators.rst | 9 +- docs/apache-airflow/howto/define_extra_link.rst| 7 +- .../logging-monitoring/logging-tasks.rst | 11 +- docs/apache-airflow/operators-and-hooks-ref.rst| 7 +- .../security/secrets/secrets-backend/index.rst | 23 +- docs/build_docs.py | 75 
+-- docs/exts/auth_backend.rst.jinja2 | 27 +++ docs/exts/connections.rst.jinja2 | 27 +++ docs/exts/extra_links.rst.jinja2 | 27 +++ docs/exts/logging.rst.jinja2 | 29 +++ docs/exts/operators_and_hooks_ref.py | 246 ++--- docs/exts/secret_backend.rst.jinja2| 27 +++ docs/helm-chart/manage-logs.rst| 2 +- setup.py | 1 + 33 files changed, 719 insertions(+), 180 deletions(-) diff --git a/docs/apache-airflow-providers-amazon/logging/cloud-watch-task-handlers.rst b/docs/apache-airflow-providers-amazon/logging/cloud-watch-task-handlers.rst index 4d431c7..c576d78 100644 --- a/docs/apache-airflow-providers-amazon/logging/cloud-watch-task-handlers.rst +++ b/docs/apache-airflow-providers-amazon/logging/cloud-watch-task-handlers.rst @@ -17,7 +17,7 @@ .. _write-logs-amazon-cloudwatch: -Writing Logs to Amazon Cloudwatch +Writing logs to Amazon Cloudwatch - Remote logging to Amazon Cloudwatch uses an existing Airflow connection to read or write logs. If you diff --git a/docs/apache-airflow-providers-amazon/logging/s3-task-handler.rst b/docs/apache-airflow-providers-amazon/logging/s3-task-handler.rst index e37f622..bc12088 100644 --- a/docs/apache-airflow-providers-amazon/logging/s3-task-handler.rst +++ b/docs/apache-airflow-providers-amazon/logging/s3-task-handler.rst @@ -17,7 +17,7 @@ .. _write-logs-amazon-s3: -Writing Logs to Amazon S3 +Writing logs to Amazon S3 - Remote logging to Amazon S3 uses an existing Airflow connection to read or write logs. If you diff --git a/docs/apache-airflow-providers-elasticsearch/index.rst b/docs/apache-airflow-providers-elasticsearch/index.rst index a5f9799..60d500e 100644 --- a/docs/apache-airflow-providers-elasticsearch/index.rst +++ b/docs/apache-airflow-providers-elasticsearch/index.rst @@ -27,7 +27,7 @@ Content :caption: Guides Connection types -Logging for Tasks +Logging for Tasks .. 
toctree:: :maxdepth: 1 diff --git a/docs/apache-airflow-providers-elasticsearch/logging.rst b/docs/apache-airflow-providers-elasticsearch/logging/index.rst similarity index 98% rename from docs/apache-airflow-providers-elasticsearch/logging.rst rename to
[airflow] 01/02: Update description about the new ``connection-types`` provider meta-data
This is an automated email from the ASF dual-hosted git repository. potiuk pushed a commit to branch main in repository https://gitbox.apache.org/repos/asf/airflow.git commit be75dcd39cd10264048c86e74110365bd5daf8b7 Author: Jarek Potiuk AuthorDate: Sat Aug 21 12:12:39 2021 +0200 Update description about the new ``connection-types`` provider meta-data The ``hook-class-names`` provider's meta-data property has been deprecated and is now replaced by ``connection-types`` property. This documents the change. --- airflow/providers/airbyte/provider.yaml | 2 +- airflow/providers/alibaba/provider.yaml | 2 +- airflow/providers/amazon/provider.yaml | 2 +- airflow/providers/apache/cassandra/provider.yaml| 2 +- airflow/providers/apache/drill/provider.yaml| 2 +- airflow/providers/apache/druid/provider.yaml| 2 +- airflow/providers/apache/hdfs/provider.yaml | 2 +- airflow/providers/apache/hive/provider.yaml | 2 +- airflow/providers/apache/livy/provider.yaml | 2 +- airflow/providers/apache/pig/provider.yaml | 2 +- airflow/providers/apache/spark/provider.yaml| 2 +- airflow/providers/apache/sqoop/provider.yaml| 2 +- airflow/providers/asana/provider.yaml | 2 +- airflow/providers/cloudant/provider.yaml| 2 +- airflow/providers/cncf/kubernetes/provider.yaml | 2 +- airflow/providers/databricks/provider.yaml | 2 +- airflow/providers/dingding/provider.yaml| 2 +- airflow/providers/discord/provider.yaml | 2 +- airflow/providers/docker/provider.yaml | 2 +- airflow/providers/elasticsearch/provider.yaml | 2 +- airflow/providers/exasol/provider.yaml | 2 +- airflow/providers/facebook/provider.yaml| 2 +- airflow/providers/ftp/provider.yaml | 2 +- airflow/providers/google/provider.yaml | 2 +- airflow/providers/grpc/provider.yaml| 2 +- airflow/providers/hashicorp/provider.yaml | 2 +- airflow/providers/http/provider.yaml| 2 +- airflow/providers/imap/provider.yaml| 2 +- airflow/providers/jdbc/provider.yaml| 2 +- airflow/providers/jenkins/provider.yaml | 2 +- airflow/providers/jira/provider.yaml| 2 +- 
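The migration this commit documents — each `provider.yaml` keeps the deprecated `hook-class-names` list for older Airflow versions while adding a `connection-types` mapping — can be illustrated with a small lookup sketch. The dict below mirrors the airbyte `provider.yaml` fields shown in the diff; the resolver function is illustrative only, not Airflow's actual `ProvidersManager` code:

```python
# Meta-data shape after this commit (mirrors airflow/providers/airbyte/provider.yaml).
provider_meta = {
    "hook-class-names": [  # deprecated - kept until providers require Airflow 2.2.0+
        "airflow.providers.airbyte.hooks.airbyte.AirbyteHook",
    ],
    "connection-types": [
        {
            "connection-type": "airbyte",
            "hook-class-name": "airflow.providers.airbyte.hooks.airbyte.AirbyteHook",
        },
    ],
}


def hook_for_conn_type(meta, conn_type):
    """Resolve a hook class name from ``connection-types`` without importing the hook.

    With the old ``hook-class-names`` list alone, the connection type is not in
    the meta-data at all, so the hook class itself had to be inspected to find it.
    """
    for entry in meta.get("connection-types", []):
        if entry["connection-type"] == conn_type:
            return entry["hook-class-name"]
    return None


print(hook_for_conn_type(provider_meta, "airbyte"))
```

The win of the new field is that connection-type resolution becomes a pure meta-data lookup, while the legacy list stays alongside it for Airflow versions that only understand `hook-class-names`.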
airflow/providers/microsoft/azure/provider.yaml | 2 +- airflow/providers/microsoft/mssql/provider.yaml | 2 +- airflow/providers/mongo/provider.yaml | 2 +- airflow/providers/mysql/provider.yaml | 2 +- airflow/providers/neo4j/provider.yaml | 2 +- airflow/providers/odbc/provider.yaml| 2 +- airflow/providers/opsgenie/provider.yaml| 2 +- airflow/providers/oracle/provider.yaml | 2 +- airflow/providers/postgres/provider.yaml| 2 +- airflow/providers/presto/provider.yaml | 2 +- airflow/providers/qubole/provider.yaml | 2 +- airflow/providers/redis/provider.yaml | 2 +- airflow/providers/salesforce/provider.yaml | 2 +- airflow/providers/samba/provider.yaml | 2 +- airflow/providers/segment/provider.yaml | 2 +- airflow/providers/sftp/provider.yaml| 2 +- airflow/providers/slack/provider.yaml | 2 +- airflow/providers/snowflake/provider.yaml | 2 +- airflow/providers/sqlite/provider.yaml | 2 +- airflow/providers/ssh/provider.yaml | 2 +- airflow/providers/tableau/provider.yaml | 2 +- airflow/providers/trino/provider.yaml | 2 +- airflow/providers/vertica/provider.yaml | 2 +- airflow/providers/yandex/provider.yaml | 2 +- .../howto/create-update-providers.rst | 17 ++--- docs/apache-airflow-providers/index.rst | 17 + docs/apache-airflow/howto/connection.rst| 12 ++-- 58 files changed, 92 insertions(+), 64 deletions(-) diff --git a/airflow/providers/airbyte/provider.yaml b/airflow/providers/airbyte/provider.yaml index e082e66..4d2778d 100644 --- a/airflow/providers/airbyte/provider.yaml +++ b/airflow/providers/airbyte/provider.yaml @@ -52,7 +52,7 @@ sensors: python-modules: - airflow.providers.airbyte.sensors.airbyte -hook-class-names: +hook-class-names: # deprecated - to be removed after providers add dependency on Airflow 2.2.0+ - airflow.providers.airbyte.hooks.airbyte.AirbyteHook connection-types: diff --git a/airflow/providers/alibaba/provider.yaml
[airflow] branch main updated (3a7a65c -> bcc7665)
This is an automated email from the ASF dual-hosted git repository. potiuk pushed a change to branch main in repository https://gitbox.apache.org/repos/asf/airflow.git. from 3a7a65c feat: Add Loadsmart in the list of companies using it (#17792) new be75dcd Update description about the new ``connection-types`` provider meta-data new bcc7665 Improve discoverability of Provider packages' functionality The 2 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference. Summary of changes: airflow/providers/airbyte/provider.yaml| 2 +- airflow/providers/alibaba/provider.yaml| 2 +- airflow/providers/amazon/provider.yaml | 2 +- airflow/providers/apache/cassandra/provider.yaml | 2 +- airflow/providers/apache/drill/provider.yaml | 2 +- airflow/providers/apache/druid/provider.yaml | 2 +- airflow/providers/apache/hdfs/provider.yaml| 2 +- airflow/providers/apache/hive/provider.yaml| 2 +- airflow/providers/apache/livy/provider.yaml| 2 +- airflow/providers/apache/pig/provider.yaml | 2 +- airflow/providers/apache/spark/provider.yaml | 2 +- airflow/providers/apache/sqoop/provider.yaml | 2 +- airflow/providers/asana/provider.yaml | 2 +- airflow/providers/cloudant/provider.yaml | 2 +- airflow/providers/cncf/kubernetes/provider.yaml| 2 +- airflow/providers/databricks/provider.yaml | 2 +- airflow/providers/dingding/provider.yaml | 2 +- airflow/providers/discord/provider.yaml| 2 +- airflow/providers/docker/provider.yaml | 2 +- airflow/providers/elasticsearch/provider.yaml | 2 +- airflow/providers/exasol/provider.yaml | 2 +- airflow/providers/facebook/provider.yaml | 2 +- airflow/providers/ftp/provider.yaml| 2 +- airflow/providers/google/provider.yaml | 2 +- airflow/providers/grpc/provider.yaml | 2 +- airflow/providers/hashicorp/provider.yaml | 2 +- airflow/providers/http/provider.yaml | 2 +- airflow/providers/imap/provider.yaml 
| 2 +- airflow/providers/jdbc/provider.yaml | 2 +- airflow/providers/jenkins/provider.yaml| 2 +- airflow/providers/jira/provider.yaml | 2 +- airflow/providers/microsoft/azure/provider.yaml| 2 +- airflow/providers/microsoft/mssql/provider.yaml| 2 +- airflow/providers/mongo/provider.yaml | 2 +- airflow/providers/mysql/provider.yaml | 2 +- airflow/providers/neo4j/provider.yaml | 2 +- airflow/providers/odbc/provider.yaml | 2 +- airflow/providers/opsgenie/provider.yaml | 2 +- airflow/providers/oracle/provider.yaml | 2 +- airflow/providers/postgres/provider.yaml | 2 +- airflow/providers/presto/provider.yaml | 2 +- airflow/providers/qubole/provider.yaml | 2 +- airflow/providers/redis/provider.yaml | 2 +- airflow/providers/salesforce/provider.yaml | 2 +- airflow/providers/samba/provider.yaml | 2 +- airflow/providers/segment/provider.yaml| 2 +- airflow/providers/sftp/provider.yaml | 2 +- airflow/providers/slack/provider.yaml | 2 +- airflow/providers/snowflake/provider.yaml | 2 +- airflow/providers/sqlite/provider.yaml | 2 +- airflow/providers/ssh/provider.yaml| 2 +- airflow/providers/tableau/provider.yaml| 2 +- airflow/providers/trino/provider.yaml | 2 +- airflow/providers/vertica/provider.yaml| 2 +- airflow/providers/yandex/provider.yaml | 2 +- .../logging/cloud-watch-task-handlers.rst | 2 +- .../logging/s3-task-handler.rst| 2 +- .../index.rst | 2 +- .../{logging.rst => logging/index.rst} | 6 +- .../redirects.txt | 1 + .../logging/gcs.rst| 2 +- .../logging/index.rst | 4 +- .../logging/stackdriver.rst| 2 +- .../index.rst | 2 +- .../{logging.rst => logging/index.rst} | 2 +- .../redirects.txt | 1 + .../auth-backends.rst} | 28 +-- .../connections.rst} | 29 ++- .../apache.rst => core-extensions/extra-links.rst} | 32 +-- .../core-extensions}/index.rst | 10 +- .../protocol.rst => core-extensions/logging.rst}
[GitHub] [airflow] potiuk merged pull request #17775: Improve discoverability of Provider packages' functionality
potiuk merged pull request #17775: URL: https://github.com/apache/airflow/pull/17775
[GitHub] [airflow] potiuk merged pull request #17792: Added Loadsmart to the list of companies using Airflow
potiuk merged pull request #17792: URL: https://github.com/apache/airflow/pull/17792
[airflow] branch main updated: feat: Add Loadsmart in the list of companies using it (#17792)
This is an automated email from the ASF dual-hosted git repository. potiuk pushed a commit to branch main in repository https://gitbox.apache.org/repos/asf/airflow.git The following commit(s) were added to refs/heads/main by this push: new 3a7a65c feat: Add Loadsmart in the list of companies using it (#17792) 3a7a65c is described below commit 3a7a65c1b654346a08125e5334d2943fb0cd9c8f Author: Guilherme Martins Crocetti AuthorDate: Mon Aug 23 18:55:28 2021 -0300 feat: Add Loadsmart in the list of companies using it (#17792) --- INTHEWILD.md | 1 + 1 file changed, 1 insertion(+) diff --git a/INTHEWILD.md b/INTHEWILD.md index 06df389..d1adec5 100644 --- a/INTHEWILD.md +++ b/INTHEWILD.md @@ -255,6 +255,7 @@ Currently, **officially** using Airflow: 1. [Liberty Global](https://www.libertyglobal.com/) [[@LibertyGlobal](https://github.com/LibertyGlobal/)] 1. [liligo](http://liligo.com/) [[@tromika](https://github.com/tromika)] 1. [LingoChamp](http://www.liulishuo.com/) [[@haitaoyao](https://github.com/haitaoyao)] +1. [Loadsmart](https://loadsmart.com/) [[@loadsmart](https://github.com/loadsmart)] 1. [Logitravel Group](https://www.logitravel.com/) 1. [LokSuvidha](http://loksuvidha.com/) [[@saurabhwahile](https://github.com/saurabhwahile)] 1. [Los Angeles Times](http://www.latimes.com/) [[@standyro](https://github.com/standyro)]
[airflow] branch main updated: Update ``README.md`` to point to Airflow 2.1.3 (#17793)
This is an automated email from the ASF dual-hosted git repository. kaxilnaik pushed a commit to branch main in repository https://gitbox.apache.org/repos/asf/airflow.git The following commit(s) were added to refs/heads/main by this push: new 39a8b2c Update ``README.md`` to point to Airflow 2.1.3 (#17793) 39a8b2c is described below commit 39a8b2c556d81333ff371547992ab572783ce8d7 Author: Kaxil Naik AuthorDate: Mon Aug 23 22:48:10 2021 +0100 Update ``README.md`` to point to Airflow 2.1.3 (#17793) --- README.md | 14 +++--- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/README.md b/README.md index b64e316..5ff45ea 100644 --- a/README.md +++ b/README.md @@ -82,7 +82,7 @@ Airflow is not a streaming solution, but it is often used to process real-time d Apache Airflow is tested with: -| | Main version (dev)| Stable version (2.1.2) | +| | Main version (dev)| Stable version (2.1.3) | | | - | | | Python | 3.6, 3.7, 3.8, 3.9| 3.6, 3.7, 3.8, 3.9 | | Kubernetes | 1.20, 1.19, 1.18 | 1.20, 1.19, 1.18 | @@ -142,15 +142,15 @@ them to appropriate format and workflow that your tool requires. ```bash -pip install 'apache-airflow==2.1.2' \ - --constraint "https://raw.githubusercontent.com/apache/airflow/constraints-2.1.2/constraints-3.7.txt" +pip install 'apache-airflow==2.1.3' \ + --constraint "https://raw.githubusercontent.com/apache/airflow/constraints-2.1.3/constraints-3.7.txt" ``` 2. 
Installing with extras (for example postgres,google) ```bash -pip install 'apache-airflow[postgres,google]==2.1.2' \ - --constraint "https://raw.githubusercontent.com/apache/airflow/constraints-2.1.2/constraints-3.7.txt" +pip install 'apache-airflow[postgres,google]==2.1.3' \ + --constraint "https://raw.githubusercontent.com/apache/airflow/constraints-2.1.3/constraints-3.7.txt" ``` For information on installing provider packages check @@ -254,7 +254,7 @@ Apache Airflow version life cycle: | Version | Current Patch/Minor | State | First Release | Limited Support | EOL/Terminated | |-|-|---|---|-|| -| 2 | 2.1.2 | Supported | Dec 17, 2020 | Dec 2021 | TBD| +| 2 | 2.1.3 | Supported | Dec 17, 2020 | Dec 2021 | TBD| | 1.10| 1.10.15 | EOL | Aug 27, 2018 | Dec 17, 2020 | June 17, 2021 | | 1.9 | 1.9.0 | EOL | Jan 03, 2018 | Aug 27, 2018 | Aug 27, 2018 | | 1.8 | 1.8.2 | EOL | Mar 19, 2017 | Jan 03, 2018 | Jan 03, 2018 | @@ -280,7 +280,7 @@ They are based on the official release schedule of Python and Kubernetes, nicely 2. The "oldest" supported version of Python/Kubernetes is the default one. "Default" is only meaningful in terms of "smoke tests" in CI PRs which are run using this default version and default reference - image available. Currently `apache/airflow:latest` and `apache/airflow:2.1.2` images + image available. Currently `apache/airflow:latest` and `apache/airflow:2.1.3` images are both Python 3.6 images, however the first MINOR/MAJOR release of Airflow release after 23.12.2021 will become Python 3.7 images.
[GitHub] [airflow] kaxil merged pull request #17793: Update ``README.md`` to point to Airflow 2.1.3
kaxil merged pull request #17793: URL: https://github.com/apache/airflow/pull/17793
[GitHub] [airflow] github-actions[bot] commented on pull request #17793: Update ``README.md`` to point to Airflow 2.1.3
github-actions[bot] commented on pull request #17793: URL: https://github.com/apache/airflow/pull/17793#issuecomment-904156007 The PR is likely ready to be merged. No tests are needed as no important environment files, nor python files were modified by it. However, committers might decide that full test matrix is needed and add the 'full tests needed' label. Then you should rebase it to the latest main or amend the last commit of the PR, and push it with --force-with-lease.
[airflow] branch main updated (a0ce41c -> 6fdfa09)
This is an automated email from the ASF dual-hosted git repository.

kaxilnaik pushed a change to branch main
in repository https://gitbox.apache.org/repos/asf/airflow.git.

    from a0ce41c  Add warning about https configuration in SimpleHttpOperator (#17783)
     add 6fdfa09  Chart: Update the default Airflow version to ``2.1.3`` (#17794)

No new revisions were added by this update.

Summary of changes:
 chart/Chart.yaml         |  2 +-
 chart/UPDATING.rst       | 10 ++
 chart/values.schema.json |  4 ++--
 chart/values.yaml        |  4 ++--
 4 files changed, 15 insertions(+), 5 deletions(-)
[GitHub] [airflow] kaxil merged pull request #17794: Chart: Update the default Airflow version to ``2.1.3``
kaxil merged pull request #17794: URL: https://github.com/apache/airflow/pull/17794
[GitHub] [airflow] github-actions[bot] commented on pull request #17794: Chart: Update the default Airflow version to ``2.1.3``
github-actions[bot] commented on pull request #17794: URL: https://github.com/apache/airflow/pull/17794#issuecomment-904155679 The PR is likely OK to be merged with just subset of tests for default Python and Database versions without running the full matrix of tests, because it does not modify the core of Airflow. If the committers decide that the full tests matrix is needed, they will add the label 'full tests needed'. Then you should rebase to the latest main or amend the last commit of the PR, and push it with --force-with-lease.
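The "rebase or amend, then push with --force-with-lease" workflow the bot keeps recommending can be sketched end to end in a throwaway repository (repository paths, branch names, and commit messages below are made up for illustration):

```shell
set -e
tmp=$(mktemp -d)
cd "$tmp"

# A local bare repository stands in for the GitHub remote.
git init -q --bare remote.git
git clone -q remote.git work
cd work
git config user.email "dev@example.com"
git config user.name "Dev"

echo "base" > file.txt
git add file.txt
git commit -qm "initial commit"
git push -q origin HEAD:main

# Work on a feature branch and publish it.
git checkout -qb feature
echo "change" >> file.txt
git commit -qam "my change"
git push -q origin feature

# Amend the last commit (e.g. after review feedback), then push safely:
# --force-with-lease refuses to overwrite the remote branch if someone
# else pushed to it since our last fetch, unlike plain --force.
git commit --amend -qm "my change (amended)"
git push -q --force-with-lease origin feature

git log --oneline -1
```

The point of `--force-with-lease` over `--force` is exactly the shared-branch case: if a committer pushed a fixup to your PR branch in the meantime, the lease check fails instead of silently discarding their commit.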
[jira] [Commented] (AIRFLOW-2910) HTTP Connection not obvious how use with https://
[ https://issues.apache.org/jira/browse/AIRFLOW-2910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17403399#comment-17403399 ]

ASF subversion and git services commented on AIRFLOW-2910:
--

Commit a0ce41cc80a8c187800417b8484a305dd910dde0 in airflow's branch refs/heads/main from Jarek Potiuk
[ https://gitbox.apache.org/repos/asf?p=airflow.git;h=a0ce41c ]

Add warning about https configuration in SimpleHttpOperator (#17783)

For historical reasons, configuring ``https`` via SimpleHttpOperator is, well, complex. This PR adds a warning which informs the users about it, as well as provides an explanation of why it is like that, and gives some helpful examples, so that people do not have to look for answers in StackOverflow questions or GitHub issues or JIRAs (as they did so far - for example #17780 and https://issues.apache.org/jira/browse/AIRFLOW-2910 or https://stackoverflow.com/questions/51630344 and many other questions).

> HTTP Connection not obvious how use with https://
> -
>
> Key: AIRFLOW-2910
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2910
> Project: Apache Airflow
> Issue Type: Bug
> Reporter: isaac martin
> Priority: Major
>
> The SimpleHttpOperator, and anything else relying on
> airflow.models.Connection, cannot make use of https due to what appears to be
> a bug in the way it parses user-provided urls. The bug ends up replacing any
> https uri with an http uri.
> To reproduce:
> * Create a new airflow implementation.
> * Set a connection environment var:
> AIRFLOW_CONN_ETL_API=[https://yourdomain.com|https://yourdomain.com/]
> * Instantiate a SimpleHttpOperator which uses the above for its http_conn_id
> argument.
> * Notice with horror that your requests are made to http://yourdomain.com
> To fix:
> Proposal 1
> Line 590 of airflow.models.py assigns nothing to Connection.schema.
> Change:
> self.schema = temp_uri.path[1:]
> to
> self.schema = temp_uri[0]
>
> Proposal 2:
> Line 40 of airflow.hooks.http_hook.py starts a block which tries to set the
> base_url. We could add a new elif which checks self.conn_type, as
> self.conn_type is correctly populated with 'https'.
> For example:
> elif conn.conn_type:
> self.base_url = conn.conn_type + "://" + conn.host

-- This message was sent by Atlassian Jira (v8.3.4#803005)
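The scheme-dropping behaviour the reporter describes is easy to check with the standard library: `urlparse` itself keeps the `https` scheme, so whatever stores the connection must be discarding it. A minimal sketch of what proposal 2 amounts to (the `conn_type`/`host`/`base_url` names mirror the quoted snippet; this is an illustration, not the actual Airflow code):

```python
from urllib.parse import urlparse

uri = "https://yourdomain.com"
parsed = urlparse(uri)

# The scheme survives parsing, so it is available to build the base URL.
print(parsed.scheme)  # https
print(parsed.netloc)  # yourdomain.com

# Sketch of proposal 2: use the scheme (or a stored connection type)
# instead of hard-coding http.
conn_type = parsed.scheme or "http"
host = parsed.netloc
base_url = conn_type + "://" + host
print(base_url)  # https://yourdomain.com
```

In other words, the bug is not in URL parsing but in which parsed field ends up in the connection model.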
[airflow] branch main updated (cb9f0bd -> a0ce41c)
This is an automated email from the ASF dual-hosted git repository.

potiuk pushed a change to branch main
in repository https://gitbox.apache.org/repos/asf/airflow.git.

    from cb9f0bd  Move Model collation args tests to correct folder (#17791)
     add a0ce41c  Add warning about https configuration in SimpleHttpOperator (#17783)

No new revisions were added by this update.

Summary of changes:
 docs/apache-airflow-providers-http/operators.rst | 21 +
 1 file changed, 21 insertions(+)
[GitHub] [airflow] potiuk merged pull request #17783: Add warning about https configuration in SimpleHttpOperator
potiuk merged pull request #17783: URL: https://github.com/apache/airflow/pull/17783
[GitHub] [airflow] github-actions[bot] commented on pull request #17782: Move instructions of constraint/image refreshing to dev
github-actions[bot] commented on pull request #17782: URL: https://github.com/apache/airflow/pull/17782#issuecomment-904154595 The PR most likely needs to run full matrix of tests because it modifies parts of the core of Airflow. However, committers might decide to merge it quickly and take the risk. If they don't merge it quickly - please rebase it to the latest main at your convenience, or amend the last commit of the PR, and push it with --force-with-lease.
[GitHub] [airflow] potiuk commented on pull request #17791: Move Model collation args tests to correct folder
potiuk commented on pull request #17791: URL: https://github.com/apache/airflow/pull/17791#issuecomment-904154511 Nice catch!
[airflow] branch main updated: Move Model collation args tests to correct folder (#17791)
This is an automated email from the ASF dual-hosted git repository.

potiuk pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/airflow.git

The following commit(s) were added to refs/heads/main by this push:
     new cb9f0bd  Move Model collation args tests to correct folder (#17791)
cb9f0bd is described below

commit cb9f0bd5bd27306e4e8c0985468fe979518d0896
Author: Ash Berlin-Taylor
AuthorDate: Mon Aug 23 22:43:35 2021 +0100

    Move Model collation args tests to correct folder (#17791)

    The tests for this got added to test_base.py, which is the right file name,
    but inside tests/sensors/, which isn't right :)

    Created a new tests/models/test_base.py for this
---
 tests/models/test_base.py  | 47 ++
 tests/sensors/test_base.py | 41 
 2 files changed, 47 insertions(+), 41 deletions(-)

diff --git a/tests/models/test_base.py b/tests/models/test_base.py
new file mode 100644
index 000..fa25313
--- /dev/null
+++ b/tests/models/test_base.py
@@ -0,0 +1,47 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied. See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+import pytest
+from pytest import param
+
+from airflow.models.base import get_id_collation_args
+from tests.test_utils.config import conf_vars
+
+
+@pytest.mark.parametrize(
+    ("dsn", "expected", "extra"),
+    [
+        param("postgres://host/the_database", {}, {}, id="postgres"),
+        param("mysql://host/the_database", {"collation": "utf8mb3_general_ci"}, {}, id="mysql"),
+        param("mysql+pymsql://host/the_database", {"collation": "utf8mb3_general_ci"}, {}, id="mysql+pymsql"),
+        param(
+            "mysql://host/the_database",
+            {"collation": "ascii"},
+            {('core', 'sql_engine_collation_for_ids'): 'ascii'},
+            id="mysql with explicit config",
+        ),
+        param(
+            "postgres://host/the_database",
+            {"collation": "ascii"},
+            {('core', 'sql_engine_collation_for_ids'): 'ascii'},
+            id="postgres with explicit config",
+        ),
+    ],
+)
+def test_collation(dsn, expected, extra):
+    with conf_vars({('core', 'sql_alchemy_conn'): dsn, **extra}):
+        assert expected == get_id_collation_args()

diff --git a/tests/sensors/test_base.py b/tests/sensors/test_base.py
index a26bc94..dd3bf29 100644
--- a/tests/sensors/test_base.py
+++ b/tests/sensors/test_base.py
@@ -19,7 +19,6 @@
 import unittest
 from datetime import timedelta
-from unittest import mock
 from unittest.mock import Mock, patch

 import pytest
@@ -27,7 +26,6 @@
 from freezegun import freeze_time

 from airflow.exceptions import AirflowException, AirflowRescheduleException, AirflowSensorTimeout
 from airflow.models import DagBag, TaskInstance, TaskReschedule
-from airflow.models.base import get_id_collation_args
 from airflow.models.dag import DAG
 from airflow.operators.dummy import DummyOperator
 from airflow.sensors.base import BaseSensorOperator, poke_mode_only
@@ -657,42 +655,3 @@ class TestPokeModeOnly(unittest.TestCase):
         sensor = DummyPokeOnlySensor(task_id='foo', mode='poke', poke_changes_mode=True, dag=self.dag)
         with pytest.raises(ValueError):
             sensor.poke({})
-
-
-class TestCollation(unittest.TestCase):
-    @mock.patch.dict(
-        'os.environ',
-        AIRFLOW__CORE__SQL_ALCHEMY_CONN='postgres://host/the_database',
-    )
-    def test_collation_empty_on_non_mysql(self):
-        assert {} == get_id_collation_args()
-
-    @mock.patch.dict(
-        'os.environ',
-        AIRFLOW__CORE__SQL_ALCHEMY_CONN='mysql://host/the_database',
-    )
-    def test_collation_set_on_mysql(self):
-        assert {"collation": "utf8mb3_general_ci"} == get_id_collation_args()
-
-    @mock.patch.dict(
-        'os.environ',
-        AIRFLOW__CORE__SQL_ALCHEMY_CONN='mysql+pymsql://host/the_database',
-    )
-    def test_collation_set_on_mysql_with_pymsql(self):
-        assert {"collation": "utf8mb3_general_ci"} == get_id_collation_args()
-
-    @mock.patch.dict(
-        'os.environ',
-        AIRFLOW__CORE__SQL_ALCHEMY_CONN='mysql://host/the_database',
-        AIRFLOW__CORE__SQL_ENGINE_COLLATION_FOR_IDS='ascii',
-    )
-    def
[GitHub] [airflow] potiuk merged pull request #17791: Move Model collation args tests to correct folder
potiuk merged pull request #17791: URL: https://github.com/apache/airflow/pull/17791
[GitHub] [airflow] potiuk commented on pull request #17757: Improves documentation about modules management
potiuk commented on pull request #17757: URL: https://github.com/apache/airflow/pull/17757#issuecomment-904152014 Thanks for the thorough comments @jedcunningham ! I think the only one remaining is about `__init__.py` presence - I'd love to hear what others think about it - is it really needed or is it my imagination :)?
[GitHub] [airflow] potiuk commented on a change in pull request #17757: Improves documentation about modules management
potiuk commented on a change in pull request #17757: URL: https://github.com/apache/airflow/pull/17757#discussion_r694324576 ## File path: docs/apache-airflow/modules_management.rst ## @@ -68,99 +81,192 @@ In the next section, you will learn how to create your own simple installable package and how to specify additional directories to be added to ``sys.path`` using the environment variable :envvar:`PYTHONPATH`. +If you want to import some packages from a directory that is added to ``PYTHONPATH`` you should import +it following the full Python path of the files. All directories where you put your files have to also +have an empty ``__init__.py`` file which turns it into a Python package. Take as an example such structure +as described below (the root directory which is on the ``PYTHONPATH`` might be any of the directories +listed in the next chapter or those that you added manually to the path). -Creating a package in Python - +Typical structure of packages +- -1. Before starting, install the following packages: +This is an example structure that you might have in your ``dags`` folder (see below) -``setuptools``: setuptools is a package development process library designed -for creating and distributing Python packages. +.. code-block:: none -``wheel``: The wheel package provides a bdist_wheel command for setuptools. It -creates .whl file which is directly installable through the ``pip install`` -command. We can then upload the same file to `PyPI `_. + + | .airflowignore -- only needed in the ``dags`` folder, see below + | -- my_company | __init__.py | common_package | | __init__.py | | common_module.py | | subpackage | | __init__.py | | subpackaged_util_module.py | + | my_custom_dags | __init__.py | my_dag_1.py | my_dag_2.py | base_dag.py +In the case above, those are the ways you should import the python files: -.. 
code-block:: python -pip install --upgrade pip setuptools wheel + from my_company.common_package.common_module import SomeClass + from my_company.common_package.subpackage.subpackaged_util_module import AnotherClass + from my_company.my_custom_dags.base_dag import BaseDag -2. Create the package directory - in our case, we will call it ``airflow_operators``. +You can see the ``.airflowignore`` file at the root of your folder. This is a file that you can put in your +``dags`` folder to tell Airflow which files from the ``dags`` folder should be ignored when the Airflow +scheduler looks for DAGs. It should contain regular expressions for the paths that should be ignored. You +do not need to have that file in any other folder in ``PYTHONPATH`` (and also you can only keep +shared code in the other folders, not the actual DAGs). -.. code-block:: bash +In the example above the dags are only in the ``my_custom_dags`` folder, the ``common_package`` should not be +scanned by the scheduler when searching for DAGs, so we should ignore the ``common_package`` folder. You also +want to ignore the ``base_dag`` if you keep a base DAG there that ``my_dag_1.py`` and ``my_dag_2.py`` derive +from. Your ``.airflowignore`` should then look like this: -mkdir airflow_operators -.. code-block:: none + my_company/common_package/.* + my_company/my_custom_dags/base_dag\.py -3. Create the file ``__init__.py`` inside the package and add following code: + +Built-in ``PYTHONPATH`` entries in Airflow +-- -.. code-block:: python -print("Hello from airflow_operators") +Airflow, when running, dynamically adds three directories to the ``sys.path``: -When we import this package, it should print the above message. +- The ``dags`` folder: It is configured with option ``dags_folder`` in section ``[core]``. +- The ``config`` folder: It is configured by setting ``AIRFLOW_HOME`` variable (``{AIRFLOW_HOME}/config``) by default. +- The ``plugins`` Folder: It is configured with option ``plugins_folder`` in section ``[core]``. -4. 
Create ``setup.py``: +.. note:: + DAGS folder in Airflow 2 should not be shared with Webserver. While you can do it, unlike in Airflow 1.10 + Airflow has no expectations that the DAGS folder is present for webserver. In fact it's a bit of + security risk to share ``dags`` folder with the webserver, because it means that people who write DAGS + can write code that webserver will be able to execute (And Airflow 2 approach is that webserver should + never run code which can be modified by users who write DAGs). Therefore if you need to
[GitHub] [airflow] potiuk commented on a change in pull request #17757: Improves documentation about modules management
potiuk commented on a change in pull request #17757: URL: https://github.com/apache/airflow/pull/17757#discussion_r694319921 ## File path: docs/apache-airflow/modules_management.rst ## @@ -68,99 +81,192 @@ In the next section, you will learn how to create your own simple installable package and how to specify additional directories to be added to ``sys.path`` using the environment variable :envvar:`PYTHONPATH`. +If you want to import some packages from a directory that is added to ``PYTHONPATH`` you should import +it following the full Python path of the files. All directories where you put your files have to also +have an empty ``__init__.py`` file which turns it into Python package. Take as an example such structure +as described below (the root directory which is on the ``PYTHONPATH`` might be any of the directories +listed in the next chapter or those that you added manually to the path. Review comment: I am not sure either :) removed it.
[GitHub] [airflow] potiuk commented on a change in pull request #17757: Improves documentation about modules management
potiuk commented on a change in pull request #17757: URL: https://github.com/apache/airflow/pull/17757#discussion_r694319259 ## File path: docs/apache-airflow/modules_management.rst ## @@ -68,99 +81,192 @@ In the next section, you will learn how to create your own simple installable package and how to specify additional directories to be added to ``sys.path`` using the environment variable :envvar:`PYTHONPATH`. +If you want to import some packages from a directory that is added to ``PYTHONPATH`` you should import +it following the full Python path of the files. All directories where you put your files have to also +have an empty ``__init__.py`` file which turns it into Python package. Take as an example such structure Review comment: > Python 3 doesn't need __init__.py's, right, That's not entirely correct... I also had that impression for some time, but (at least as I see it) it's a bit of a misunderstanding. Python 3 **can** turn regular folders into implicit packages with their own namespaces, but a number of tools that discover Python code (Airflow DAG loading included) do not deal with implicit namespace packages and still require an `__init__.py` file in the folders. Maybe I am wrong about it? @ashb? But that's the impression I have. I did separate it out, but I would love to have a good statement about it.
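For what it's worth, the PEP 420 side of the discussion is easy to demonstrate: a plain `import` succeeds without any `__init__.py`, which is exactly why the distinction only bites in tools that walk the filesystem themselves rather than going through the import system. A self-contained sketch (the directory and module names are invented for the example, loosely mirroring the tree in the quoted docs):

```python
import os
import sys
import tempfile

# Build a small package tree with NO __init__.py files anywhere.
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "my_company", "common_package")
os.makedirs(pkg_dir)
with open(os.path.join(pkg_dir, "common_module.py"), "w") as f:
    f.write("VALUE = 42\n")

sys.path.insert(0, root)

# PEP 420 implicit namespace packages: the import still works even
# though neither my_company nor common_package has an __init__.py.
from my_company.common_package import common_module
print(common_module.VALUE)  # 42
```

So the question in the thread is not whether the interpreter needs `__init__.py` (it does not, since Python 3.3), but whether Airflow's own file-based DAG discovery does.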
[GitHub] [airflow] ashwin153 opened a new issue #17795: Circular Dependency in Apache Airflow 2.1.3
ashwin153 opened a new issue #17795: URL: https://github.com/apache/airflow/issues/17795

**Apache Airflow version**: 2.1.3
**OS**: Ubuntu 20.04 LTS
**Deployment**: Bazel

**What happened**: When I tried to bump my Bazel monorepo from 2.1.2 to 2.1.3, Bazel complained about the following circular dependency.

```
ERROR: /github/home/.cache/bazel/_bazel_bookie/c5c5e4532705a81d38d884f806d2bf84/external/pip/pypi__apache_airflow/BUILD:11:11: in py_library rule @pip//pypi__apache_airflow:pypi__apache_airflow: cycle in dependency graph:
    //wager/publish/airflow:airflow
.-> @pip//pypi__apache_airflow:pypi__apache_airflow
|   @pip//pypi__apache_airflow_providers_http:pypi__apache_airflow_providers_http
`-- @pip//pypi__apache_airflow:pypi__apache_airflow
```

**What you expected to happen**: No dependency cycles.

**How to reproduce it**: A concise reproduction will require some effort. I am hoping that there is a quick resolution to this, but am willing to create a reproduction if it is required to determine the root cause.

**Anything else we need to know**: Perhaps related to apache/airflow#14128.

**Are you willing to submit a PR?** Yes
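The shape of what Bazel is reporting — the core package depending on a provider that depends back on the core — can be reproduced abstractly with a few lines of graph code (the package names are just labels, and the detection routine is a generic depth-first sketch, not Bazel's actual algorithm):

```python
# Dependency edges as reported in the Bazel error: airflow depends on
# the http provider, which in turn depends back on airflow.
deps = {
    "apache-airflow": ["apache-airflow-providers-http"],
    "apache-airflow-providers-http": ["apache-airflow"],
}

def find_cycle(graph):
    """Return one dependency cycle as a list of nodes, or None."""
    def visit(node, stack):
        if node in stack:
            # Close the loop: slice from the first repeat to here.
            return stack[stack.index(node):] + [node]
        for dep in graph.get(node, []):
            cycle = visit(dep, stack + [node])
            if cycle:
                return cycle
        return None

    for start in graph:
        cycle = visit(start, [])
        if cycle:
            return cycle
    return None

print(" -> ".join(find_cycle(deps)))
# apache-airflow -> apache-airflow-providers-http -> apache-airflow
```

This is why the issue is hard for a package manager that requires an acyclic graph: as long as the provider declares the core as an install dependency (and the core pulls in the provider), no topological ordering of the two packages exists.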
[GitHub] [airflow] kaxil opened a new pull request #17794: Chart: Update the default Airflow version to ``2.1.3``
kaxil opened a new pull request #17794: URL: https://github.com/apache/airflow/pull/17794 Since 2.1.3 is out we should use that as the default Airflow version. --- **^ Add meaningful description above** Read the **[Pull Request Guidelines](https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst#pull-request-guidelines)** for more information. In case of fundamental code change, Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)) is needed. In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x). In case of backwards incompatible changes please leave a note in [UPDATING.md](https://github.com/apache/airflow/blob/main/UPDATING.md).
[GitHub] [airflow] kaxil opened a new pull request #17793: Update ``README.md`` to point to Airflow 2.1.3
kaxil opened a new pull request #17793: URL: https://github.com/apache/airflow/pull/17793
[GitHub] [airflow] potiuk commented on issue #17788: livy operator print sparkUiUrl in log output
potiuk commented on issue #17788: URL: https://github.com/apache/airflow/issues/17788#issuecomment-904102830 That's a perfect case for Extra Operator link: https://airflow.apache.org/docs/apache-airflow/stable/howto/define_extra_link.html . Shall I assign you to that issue?
[GitHub] [airflow] github-actions[bot] commented on pull request #17775: Improve discoverability of Provider packages' functionality
github-actions[bot] commented on pull request #17775: URL: https://github.com/apache/airflow/pull/17775#issuecomment-904101694 The PR most likely needs to run full matrix of tests because it modifies parts of the core of Airflow. However, committers might decide to merge it quickly and take the risk. If they don't merge it quickly - please rebase it to the latest main at your convenience, or amend the last commit of the PR, and push it with --force-with-lease.
[GitHub] [airflow] kaxil closed issue #10127: Quarantined issues v1-10-stable
kaxil closed issue #10127: URL: https://github.com/apache/airflow/issues/10127
[GitHub] [airflow] kaxil closed issue #10128: Quarantined Issues v1-10-test
kaxil closed issue #10128: URL: https://github.com/apache/airflow/issues/10128
[GitHub] [airflow] Jorricks edited a comment on pull request #16634: Require can_edit on DAG privileges to modify TaskInstances and DagRuns
Jorricks edited a comment on pull request #16634: URL: https://github.com/apache/airflow/pull/16634#issuecomment-904079791 ~~Let me know if you need me to rebase to the latest main ~~ Nvm. Saw a merge conflict so rebased to the latest main.
[GitHub] [airflow] Jorricks commented on pull request #16634: Require can_edit on DAG privileges to modify TaskInstances and DagRuns
Jorricks commented on pull request #16634: URL: https://github.com/apache/airflow/pull/16634#issuecomment-904079791 Let me know if you need me to rebase to the latest main
[GitHub] [airflow] potiuk commented on a change in pull request #17775: Improve discoverability of Provider packages' functionality
potiuk commented on a change in pull request #17775: URL: https://github.com/apache/airflow/pull/17775#discussion_r694269305 ## File path: docs/apache-airflow-providers-elasticsearch/logging/index.rst ## @@ -17,7 +17,7 @@ Review comment: Yep. Found it. Fixed already in the latest push (I also checked that the redirection works).
[GitHub] [airflow] Jorricks commented on pull request #17207: WIP: Fix external_executor_id not being set for manually run jobs.
Jorricks commented on pull request #17207: URL: https://github.com/apache/airflow/pull/17207#issuecomment-904077786 I tried to add some tests on the CLI task part. That is pretty much done. However, I had quite some trouble wrapping my head around a decent test approach on the `celery_executor` part. There is currently not really a test for any of the functions I modified, which makes me wonder if I should add them. If so, do you have any remarks on how I could best do that?
[GitHub] [airflow] Jorricks commented on a change in pull request #17207: WIP: Fix external_executor_id not being set for manually run jobs.
Jorricks commented on a change in pull request #17207: URL: https://github.com/apache/airflow/pull/17207#discussion_r694266238 ## File path: airflow/executors/celery_executor.py ## @@ -125,8 +129,10 @@ def _execute_in_fork(command_to_exec: CommandType) -> None: os._exit(ret) -def _execute_in_subprocess(command_to_exec: CommandType) -> None: +def _execute_in_subprocess(command_to_exec: CommandType, celery_task_id: Optional[str] = None) -> None: env = os.environ.copy() +if celery_task_id: +env["celery_task_id"] = celery_task_id Review comment: This should now be fixed :)
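The pattern in the diff above — copying the parent environment and adding one key before spawning the subprocess — can be sketched with the standard library. The variable name `celery_task_id` follows the diff; the id value and the child command are invented for the demonstration:

```python
import os
import subprocess
import sys

env = os.environ.copy()
env["celery_task_id"] = "4d2a1c9e-example"  # hypothetical task id

# The child process sees the extra variable, while the parent's own
# os.environ is left untouched (we only modified the copy).
out = subprocess.check_output(
    [sys.executable, "-c", "import os; print(os.environ['celery_task_id'])"],
    env=env,
)
print(out.decode().strip())  # 4d2a1c9e-example
```

Passing the id through the environment rather than the command line keeps it out of `ps` output and avoids changing the CLI the task runner invokes.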
[GitHub] [airflow] gmcrocetti opened a new pull request #17792: Added Loadsmart to the list of companies using Airflow
gmcrocetti opened a new pull request #17792: URL: https://github.com/apache/airflow/pull/17792 Added Loadsmart to INTHEWILD.md
[GitHub] [airflow] mik-laj commented on a change in pull request #17775: Improve discoverability of Provider packages' functionality
mik-laj commented on a change in pull request #17775: URL: https://github.com/apache/airflow/pull/17775#discussion_r694264711 ## File path: docs/apache-airflow-providers-elasticsearch/logging/index.rst ## @@ -17,7 +17,7 @@ Review comment: Here is an example: https://github.com/apache/airflow/blob/main/docs/apache-airflow-providers-google/redirects.txt https://github.com/apache/airflow/blob/main/docs/apache-airflow/redirects.txt
[GitHub] [airflow] potiuk commented on a change in pull request #17775: Improve discoverability of Provider packages' functionality
potiuk commented on a change in pull request #17775: URL: https://github.com/apache/airflow/pull/17775#discussion_r694258279 ## File path: docs/apache-airflow-providers-elasticsearch/logging/index.rst ## @@ -17,7 +17,7 @@ Review comment: Ah. Found it! Good catch!
[GitHub] [airflow] potiuk commented on a change in pull request #17775: Improve discoverability of Provider packages' functionality
potiuk commented on a change in pull request #17775: URL: https://github.com/apache/airflow/pull/17775#discussion_r694257448 ## File path: docs/apache-airflow-providers-elasticsearch/logging/index.rst ## @@ -17,7 +17,7 @@ Review comment: How do I do that? Could you show me an example / maybe a past PR that did this? Do we have it explained somewhere?
[GitHub] [airflow] github-actions[bot] commented on pull request #17783: Add warning about https configuration in SimpleHttpOperator
github-actions[bot] commented on pull request #17783: URL: https://github.com/apache/airflow/pull/17783#issuecomment-904063607 The PR is likely ready to be merged. No tests are needed as no important environment files, nor python files were modified by it. However, committers might decide that full test matrix is needed and add the 'full tests needed' label. Then you should rebase it to the latest main or amend the last commit of the PR, and push it with --force-with-lease.
[GitHub] [airflow] ashb opened a new pull request #17791: Move Model collation args tests to correct folder
ashb opened a new pull request #17791: URL: https://github.com/apache/airflow/pull/17791 The tests for this got added to test_base.py in #17729, which is the right file name, but inside tests/sensors/, which isn't right :) Created a new tests/models/test_base.py for this. And cos I'm on a "removing lines of code" kick I've parameterized it too so it's a "data driven" test.
[GitHub] [airflow] potiuk commented on pull request #17576: Add pre/post execution hooks
potiuk commented on pull request #17576: URL: https://github.com/apache/airflow/pull/17576#issuecomment-904010653 > These two new fields you add don't need to be serialized (and callables can't generally be anyway) -- the general rule is that things needed by the Scheduler should be serialized, and that test was there to make people think about the change. > > In this case, since the scheduler doesn't care about these fields they should be added to the ignore list in the test. Ah cool. I will close/reopen to rebuild, but I think this one is good-to-go.
[GitHub] [airflow] potiuk closed pull request #17576: Add pre/post execution hooks
potiuk closed pull request #17576: URL: https://github.com/apache/airflow/pull/17576
[GitHub] [airflow-site] kaxil merged pull request #468: Add documentation for Apache Airflow 2.1.3
kaxil merged pull request #468: URL: https://github.com/apache/airflow-site/pull/468
[GitHub] [airflow-site] kaxil opened a new pull request #468: Add documentation for Apache Airflow 2.1.3
kaxil opened a new pull request #468: URL: https://github.com/apache/airflow-site/pull/468 Add docs for Airflow and Docker images
[airflow-site] 02/03: Remove log files
This is an automated email from the ASF dual-hosted git repository. kaxilnaik pushed a commit to branch airflow-2.1.3-docker in repository https://gitbox.apache.org/repos/asf/airflow-site.git

commit 260873efa12d4d008ef95a85538774a14aa48be7
Author: Kaxil Naik
AuthorDate: Mon Aug 23 18:56:44 2021 +0100

    Remove log files
---
 .../1.0.0/warning-build-apache-airflow-providers-airbyte.log | 0
 .../2.0.0/warning-build-apache-airflow-providers-airbyte.log | 0
 .../1.3.0/warning-build-apache-airflow-providers-amazon.log | 0
 .../1.4.0/warning-build-apache-airflow-providers-amazon.log | 0
 .../2.0.0/warning-build-apache-airflow-providers-amazon.log | 0
 .../2.0.0/warning-build-apache-airflow-providers-apache-beam.log | 0
 .../3.0.0/warning-build-apache-airflow-providers-apache-beam.log | 0
 .../1.0.1/warning-build-apache-airflow-providers-apache-cassandra.log | 0
 .../2.0.0/warning-build-apache-airflow-providers-apache-cassandra.log | 0
 .../1.1.0/warning-build-apache-airflow-providers-apache-druid.log | 0
 .../2.0.0/warning-build-apache-airflow-providers-apache-druid.log | 0
 .../1.0.1/warning-build-apache-airflow-providers-apache-hdfs.log | 0
 .../2.0.0/warning-build-apache-airflow-providers-apache-hdfs.log | 0
 .../1.0.3/warning-build-apache-airflow-providers-apache-hive.log | 0
 .../2.0.0/warning-build-apache-airflow-providers-apache-hive.log | 0
 .../1.0.1/warning-build-apache-airflow-providers-apache-kylin.log | 0
 .../2.0.0/warning-build-apache-airflow-providers-apache-kylin.log | 0
 .../1.1.0/warning-build-apache-airflow-providers-apache-livy.log | 0
 .../2.0.0/warning-build-apache-airflow-providers-apache-livy.log | 0
 .../1.0.1/warning-build-apache-airflow-providers-apache-pig.log | 0
 .../2.0.0/warning-build-apache-airflow-providers-apache-pig.log | 0
 .../1.0.1/warning-build-apache-airflow-providers-apache-pinot.log | 0
 .../2.0.0/warning-build-apache-airflow-providers-apache-pinot.log | 0
 .../1.0.3/warning-build-apache-airflow-providers-apache-spark.log | 0
 .../2.0.0/warning-build-apache-airflow-providers-apache-spark.log | 0
 .../1.0.1/warning-build-apache-airflow-providers-apache-sqoop.log | 0
 .../2.0.0/warning-build-apache-airflow-providers-apache-sqoop.log | 0
 .../1.0.0/warning-build-apache-airflow-providers-asana.log | 0
 .../1.0.1/warning-build-apache-airflow-providers-celery.log | 0
 .../2.0.0/warning-build-apache-airflow-providers-celery.log | 0
 .../1.0.1/warning-build-apache-airflow-providers-cloudant.log | 0
 .../2.0.0/warning-build-apache-airflow-providers-cloudant.log | 0
 .../1.1.0/warning-build-apache-airflow-providers-cncf-kubernetes.log | 0
 .../1.2.0/warning-build-apache-airflow-providers-cncf-kubernetes.log | 0
 .../2.0.0/warning-build-apache-airflow-providers-cncf-kubernetes.log | 0
 .../1.0.1/warning-build-apache-airflow-providers-databricks.log | 0
 .../2.0.0/warning-build-apache-airflow-providers-databricks.log | 0
 .../1.0.1/warning-build-apache-airflow-providers-datadog.log | 0
 .../2.0.0/warning-build-apache-airflow-providers-datadog.log | 0
 .../1.0.2/warning-build-apache-airflow-providers-dingding.log | 0
 .../2.0.0/warning-build-apache-airflow-providers-dingding.log | 0
 .../1.0.1/warning-build-apache-airflow-providers-discord.log | 0
 .../2.0.0/warning-build-apache-airflow-providers-discord.log | 0
 .../1.1.0/warning-build-apache-airflow-providers-docker.log | 0
 .../1.2.0/warning-build-apache-airflow-providers-docker.log | 0
 .../2.0.0/warning-build-apache-airflow-providers-docker.log | 0
 .../1.0.4/warning-build-apache-airflow-providers-elasticsearch.log | 0
 .../2.0.1/warning-build-apache-airflow-providers-elasticsearch.log | 0
 .../2.0.2/warning-build-apache-airflow-providers-elasticsearch.log | 0
 .../1.1.1/warning-build-apache-airflow-providers-exasol.log | 0
 .../2.0.0/warning-build-apache-airflow-providers-exasol.log | 0
 .../1.1.0/warning-build-apache-airflow-providers-facebook.log | 0
 .../2.0.0/warning-build-apache-airflow-providers-facebook.log | 0
 .../1.1.0/warning-build-apache-airflow-providers-ftp.log | 0
 .../2.0.0/warning-build-apache-airflow-providers-ftp.log | 0
 .../2.2.0/warning-build-apache-airflow-providers-google.log | 0
 .../3.0.0/warning-build-apache-airflow-providers-google.log | 0
 .../4.0.0/warning-build-apache-airflow-providers-google.log | 0
 .../1.1.0/warning-build-apache-airflow-providers-grpc.log | 0
[airflow-site] branch airflow-2.1.3-docker created (now 545bbb5)
This is an automated email from the ASF dual-hosted git repository. kaxilnaik pushed a change to branch airflow-2.1.3-docker in repository https://gitbox.apache.org/repos/asf/airflow-site.git.

      at 545bbb5  Remove Airflow Summit banner

This branch includes the following new commits:

     new ea50028  Add documentation for Apache Airflow 2.1.3
     new 260873e  Remove log files
     new 545bbb5  Remove Airflow Summit banner

The 3 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference.
[GitHub] [airflow] iblaine commented on pull request #17777: get_pandas_df() fails when it tries to read an empty table
iblaine commented on pull request #17777: URL: https://github.com/apache/airflow/pull/17777#issuecomment-903993858 Setting this [line](https://github.com/apache/airflow/blob/main/tests/test_utils/mock_process.py#L59) to `self.iterable = []` will trigger this bug.
[GitHub] [airflow] iblaine commented on pull request #17777: get_pandas_df() fails when it tries to read an empty table
iblaine commented on pull request #17777: URL: https://github.com/apache/airflow/pull/17777#issuecomment-903981341

> I wonder if it’d work to do `SELECT * FROM table LIMIT 0`

Mock seems to return a result set regardless of the SQL.
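The underlying bug is that `get_pandas_df()` fails when the query returns no rows. A stdlib-only illustration of the empty-result case (using `sqlite3` rather than pandas or Airflow's hooks, so the names below are illustrative): even with zero rows, the cursor still exposes column metadata, which is what lets a DataFrame-building helper return an empty frame instead of failing.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, name TEXT)")

cur = conn.execute("SELECT * FROM t")  # empty table: zero rows come back
rows = cur.fetchall()
# Column names survive even when the result set is empty.
columns = [d[0] for d in cur.description]

print(rows)     # prints []
print(columns)  # prints ['id', 'name']
```

This is also why the `LIMIT 0` idea in the thread is plausible for real databases: it yields the column metadata without any rows, though as noted above a mock cursor may not honor it.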
[GitHub] [airflow] iblaine commented on issue #17765: get_pandas_df() fails when it tries to read an empty table
iblaine commented on issue #17765: URL: https://github.com/apache/airflow/issues/17765#issuecomment-903980110 PR for this issue is https://github.com/apache/airflow/pull/17777
[GitHub] [airflow] subkanthi opened a new pull request #17790: Gcp ai hyperparameter tuning
subkanthi opened a new pull request #17790: URL: https://github.com/apache/airflow/pull/17790 closes: #17348
[GitHub] [airflow] dimberman commented on a change in pull request #15330: Add a Docker Taskflow decorator
dimberman commented on a change in pull request #15330: URL: https://github.com/apache/airflow/pull/15330#discussion_r694154586

## File path: airflow/providers_manager.py ##

@@ -278,6 +309,51 @@ def _get_attr(obj: Any, attr_name: str):
             return None
         return getattr(obj, attr_name)

    def _add_taskflow_decorator(
        self, decorator_name, decorator_class_name: str, provider_package: str
    ) -> None:
        if provider_package.startswith("apache-airflow"):
            provider_path = provider_package[len("apache-") :].replace("-", ".")
            if not decorator_class_name.startswith(provider_path):
                log.warning(
                    "Sanity check failed when importing '%s' from '%s' package. It should start with '%s'",
                    decorator_class_name,
                    provider_package,
                    provider_path,
                )
                return
        if decorator_name in self._taskflow_decorator_dict:
            log.warning(
                "The hook_class '%s' has been already registered.",
                decorator_class_name,
            )
            return
        try:
            module, class_name = decorator_class_name.rsplit('.', maxsplit=1)
            decorator_class = getattr(importlib.import_module(module), class_name)
            self._taskflow_decorator_dict[decorator_name] = decorator_class

Review comment: @ashb so is this something that should be addressed in this PR or another? As it stands there won't be many (if any) decorators to import, so maybe we can defer?
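The core of the code under review is dynamic registration: split a dotted class path, import the module, and stash the class in a dict, skipping duplicates. A minimal sketch of that pattern with hypothetical names (the real Airflow method carries extra provider-package sanity checks not reproduced here):

```python
import importlib
import logging
from typing import Any, Dict

log = logging.getLogger(__name__)


def import_class_by_path(class_path: str) -> Any:
    """Import a class given its dotted path, e.g. 'collections.OrderedDict'."""
    module_name, class_name = class_path.rsplit(".", maxsplit=1)
    return getattr(importlib.import_module(module_name), class_name)


def register_decorator(registry: Dict[str, Any], name: str, class_path: str) -> None:
    # Skip duplicates instead of silently overwriting an earlier registration,
    # mirroring the "already registered" warning in the diff above.
    if name in registry:
        log.warning("The decorator '%s' has been already registered.", name)
        return
    registry[name] = import_class_by_path(class_path)


registry: Dict[str, Any] = {}
register_decorator(registry, "ordered", "collections.OrderedDict")
```

`rsplit(".", maxsplit=1)` is the key trick: it splits only at the last dot, so nested module paths like `a.b.c.ClassName` resolve correctly.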
[GitHub] [airflow] jedcunningham commented on a change in pull request #17757: Improves documentation about modules management
jedcunningham commented on a change in pull request #17757: URL: https://github.com/apache/airflow/pull/17757#discussion_r694153747 ## File path: docs/apache-airflow/modules_management.rst ## @@ -68,99 +81,192 @@ In the next section, you will learn how to create your own simple installable package and how to specify additional directories to be added to ``sys.path`` using the environment variable :envvar:`PYTHONPATH`. +If you want to import some packages from a directory that is added to ``PYTHONPATH`` you should import +it following the full Python path of the files. All directories where you put your files have to also +have an empty ``__init__.py`` file which turns it into Python package. Take as an example such structure Review comment: Maybe mention `__init__.py`'s and link down to the "Add __init__.py in package folders" section?
[GitHub] [airflow] jedcunningham commented on a change in pull request #17757: Improves documentation about modules management
jedcunningham commented on a change in pull request #17757: URL: https://github.com/apache/airflow/pull/17757#discussion_r694114919

## File path: docs/apache-airflow/modules_management.rst ##

@@ -25,9 +25,21 @@ Airflow configuration. The following article will describe how you can create your own module so that Airflow can load it correctly, as well as diagnose problems when modules are not loaded properly.

+Often you want to use your own python code in your Airflow deployment,
+for example common code, libraries, you might want to generate DAGs using
+shared python code and have several DAG python files.

-Packages Loading in Python
--

+You can do it in one of those ways:
+
+* add your modules to one of the folders that airflow automatically adds to ``PYTHONPATH``
+* add extra folders where you keep your code to ``PYTHONPATH``
+* package your code into Python package and install it together with Airflow.

Review comment:
```suggestion
* package your code into a Python package and install it together with Airflow.
```

## File path: docs/apache-airflow/modules_management.rst ##

@@ -68,99 +81,192 @@ In the next section, you will learn how to create your own simple installable package and how to specify additional directories to be added to ``sys.path`` using the environment variable :envvar:`PYTHONPATH`.

+If you want to import some packages from a directory that is added to ``PYTHONPATH`` you should import
+it following the full Python path of the files. All directories where you put your files have to also
+have an empty ``__init__.py`` file which turns it into Python package. Take as an example such structure

Review comment:
```suggestion
it using the full Python path of the files. Take as an example such structure
```
Python 3 doesn't need `__init__.py`'s, right, particularly when you're just dealing with a normal dir being added to `PYTHONPATH`?
## File path: docs/apache-airflow/modules_management.rst ##

@@ -68,99 +81,192 @@ In the next section, you will learn how to create your own simple installable package and how to specify additional directories to be added to ``sys.path`` using the environment variable :envvar:`PYTHONPATH`.

+If you want to import some packages from a directory that is added to ``PYTHONPATH`` you should import
+it following the full Python path of the files. All directories where you put your files have to also
+have an empty ``__init__.py`` file which turns it into Python package. Take as an example such structure
+as described below (the root directory which is on the ``PYTHONPATH`` might be any of the directories
+listed in the next chapter or those that you added manually to the path.

-Creating a package in Python
-
+Typical structure of packages
+-

-1. Before starting, install the following packages:
+This is an example structure that you might have in your ``dags`` folder (see below)

-``setuptools``: setuptools is a package development process library designed
-for creating and distributing Python packages.
+.. code-block:: none

-``wheel``: The wheel package provides a bdist_wheel command for setuptools. It
-creates .whl file which is directly installable through the ``pip install``
-command. We can then upload the same file to `PyPI `_.
+
+    | .airflowignore -- only needed in in ``dags`` folder, see below
+    | -- my_company
+    |    __init__.py
+    |    common_package
+    |    |  __init__.py
+    |    |  common_module.py
+    |    |  subpackage
+    |    |     __init__.py
+    |    |     subpackaged_util_module.py
+    |
+    |    my_custom_dags
+    |       __init__.py
+    |       my_dag_1.py
+    |       my_dag_2.py
+    |       base_dag.py
+
+In the case above, those are the ways you should import the python files:

-.. code-block:: bash
+.. code-block:: python

-pip install --upgrade pip setuptools wheel
+    from my_company.common_package.common_module import SomeClass
+    from my_company.common_package.subpackge.subpackaged_util_module import AnotherClass
+    from my_company.my_custom_dags.base_dag import BaseDag

-2. Create the package directory - in our case, we will call it ``airflow_operators``.
+You can see the ``.ariflowignore`` file at the root of your folder. This is a file that you can put in your
+``dags`` folder to tell Airflow which files from the 'dags` folder should be ignored when Airflow
+scheduler looks for DAGs. It should contain regular expressions for the paths that should be ignored. You
+do not need to have that file in any other folder in ``PYTHONPATH`` (and also you
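The docs change under review describes importing from a package layout placed on ``PYTHONPATH``. A runnable sketch of exactly that mechanism, building a throwaway layout like the one in the diff (the `my_company` names come from the diff; `answer = 42` is made up for the demo) — adding the root to `sys.path` at runtime has the same effect as putting it on ``PYTHONPATH``:

```python
import os
import sys
import tempfile

# Build a throwaway package layout like the one described in the docs diff:
#   <root>/my_company/__init__.py
#   <root>/my_company/common_package/__init__.py
#   <root>/my_company/common_package/common_module.py
root = tempfile.mkdtemp()
pkg = os.path.join(root, "my_company", "common_package")
os.makedirs(pkg)
open(os.path.join(root, "my_company", "__init__.py"), "w").close()
open(os.path.join(pkg, "__init__.py"), "w").close()
with open(os.path.join(pkg, "common_module.py"), "w") as f:
    f.write("class SomeClass:\n    answer = 42\n")

# Equivalent to exporting PYTHONPATH=<root> before starting the interpreter.
sys.path.insert(0, root)

from my_company.common_package.common_module import SomeClass

print(SomeClass.answer)  # prints 42
```

This also illustrates the reviewer's point: on Python 3 the import works through regular-package resolution, and whether the empty ``__init__.py`` files are strictly required (vs. namespace packages) is exactly what the comment questions.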
[airflow] branch main updated: Fix failing Helm Chart docs test (#17789)
This is an automated email from the ASF dual-hosted git repository. kaxilnaik pushed a commit to branch main in repository https://gitbox.apache.org/repos/asf/airflow.git

The following commit(s) were added to refs/heads/main by this push:
     new d97e50f  Fix failing Helm Chart docs test (#17789)
d97e50f is described below

commit d97e50fb41b596eb942528efe99a46715b40026f
Author: Kaxil Naik
AuthorDate: Mon Aug 23 17:43:10 2021 +0100

    Fix failing Helm Chart docs test (#17789)

    Fixes: https://github.com/apache/airflow/runs/3402579247#step:7:718
---
 chart/values.schema.json | 1 +
 1 file changed, 1 insertion(+)

diff --git a/chart/values.schema.json b/chart/values.schema.json
index 1825caa..433bf35 100644
--- a/chart/values.schema.json
+++ b/chart/values.schema.json
@@ -12,6 +12,7 @@
                 "Scheduler",
                 "Webserver",
                 "Workers",
+                "Triggerer",
                 "Flower",
                 "Redis",
                 "Statsd",
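The one-line fix above adds ``"Triggerer"`` to an ``enum`` of component names in ``chart/values.schema.json``. A small stdlib-only sketch of what such an enum constraint enforces — the fragment mirrors the component list in the diff, but the `is_valid_component` helper is hypothetical, not part of the chart tooling (real validation would use a JSON Schema validator such as ``jsonschema``):

```python
import json

# Fragment mirroring the enum in chart/values.schema.json after the fix.
schema_fragment = json.loads("""
{
  "enum": ["Scheduler", "Webserver", "Workers", "Triggerer", "Flower", "Redis", "Statsd"]
}
""")


def is_valid_component(name: str) -> bool:
    # A hand-rolled stand-in for a JSON Schema "enum" check: the value
    # must be one of the listed literals, compared case-sensitively.
    return name in schema_fragment["enum"]


print(is_valid_component("Triggerer"))  # prints True
print(is_valid_component("Sidecar"))   # prints False
```

Before the fix, chart values naming the new Triggerer component would fail exactly this membership check, which is why the docs test broke.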
[GitHub] [airflow] kaxil merged pull request #17789: Fix failing Helm Chart docs test
kaxil merged pull request #17789: URL: https://github.com/apache/airflow/pull/17789