[jira] [Created] (AIRFLOW-6976) Correct cli dag command ignore-first-depends-on-past
zhongjiajie created AIRFLOW-6976: Summary: Correct cli dag command ignore-first-depends-on-past Key: AIRFLOW-6976 URL: https://issues.apache.org/jira/browse/AIRFLOW-6976 Project: Apache Airflow Issue Type: Improvement Components: cli Affects Versions: 1.10.9 Reporter: zhongjiajie Assignee: zhongjiajie ref PR [https://github.com/apache/airflow/pull/7490] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [airflow] zhongjiajie commented on a change in pull request #7490: [AIRFLOW-6860] Default ignore_first_depends_on_past to True
zhongjiajie commented on a change in pull request #7490: [AIRFLOW-6860] Default ignore_first_depends_on_past to True URL: https://github.com/apache/airflow/pull/7490#discussion_r386845124

## File path: airflow/cli/commands/dag_command.py

@@ -67,6 +67,13 @@ def dag_backfill(args, dag=None):
     signal.signal(signal.SIGTERM, sigint_handler)
+    import warnings
+    warnings.warn('--ignore_first_depends_on_past is deprecated as the value is always set to True',

Review comment: will submit new PR to correct it.

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [airflow] zhongjiajie commented on a change in pull request #7490: [AIRFLOW-6860] Default ignore_first_depends_on_past to True
zhongjiajie commented on a change in pull request #7490: [AIRFLOW-6860] Default ignore_first_depends_on_past to True URL: https://github.com/apache/airflow/pull/7490#discussion_r386844032

## File path: airflow/cli/commands/dag_command.py

@@ -67,6 +67,13 @@ def dag_backfill(args, dag=None):
     signal.signal(signal.SIGTERM, sigint_handler)
+    import warnings
+    warnings.warn('--ignore_first_depends_on_past is deprecated as the value is always set to True',

Review comment: Should be `--ignore-first-depends-on-past` due to https://github.com/apache/airflow/pull/7148
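The hyphenated spelling matters because the CLI flags were renamed to use dashes in apache/airflow#7148. As a rough illustration (not the actual Airflow source), a deprecation notice with the corrected flag name could look like:

```python
import warnings

# Illustrative only -- not the actual Airflow code. Shows the warning text
# with the corrected, hyphenated flag spelling from apache/airflow#7148.
def warn_ignore_first_depends_on_past():
    warnings.warn(
        "--ignore-first-depends-on-past is deprecated as the value is always set to True",
        DeprecationWarning,
        stacklevel=2,
    )

# Capture the warning to show what a caller would see.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    warn_ignore_first_depends_on_past()
```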
[jira] [Created] (AIRFLOW-6975) Base AWSHook AssumeRoleWithSAML
Bjorn Olsen created AIRFLOW-6975: Summary: Base AWSHook AssumeRoleWithSAML Key: AIRFLOW-6975 URL: https://issues.apache.org/jira/browse/AIRFLOW-6975 Project: Apache Airflow Issue Type: Improvement Components: aws Affects Versions: 1.10.9 Reporter: Bjorn Olsen Assignee: Bjorn Olsen The Base AWS Hook currently supports AssumeRole, but it should additionally support AssumeRoleWithSAML. +Current+ [https://docs.aws.amazon.com/IAM/latest/UserGuide/id_credentials_temp_request.html#api_assumerole] The AssumeRole API operation is useful for allowing existing IAM users to access AWS resources that they don't already have access to. (This requires an AWS IAM user.) +Proposed addition+ [https://docs.aws.amazon.com/IAM/latest/UserGuide/id_credentials_temp_request.html#api_assumerolewithsaml] The AssumeRoleWithSAML API operation returns a set of temporary security credentials for federated users who are authenticated by your organization's existing identity system. (This allows federated login using another IdP rather than requiring an AWS IAM user.) +Use case+ We need to authenticate an AD user against our IdP (Windows Active Directory). We can obtain a SAML assertion from the IdP and provide it to AWS STS in exchange for temporary AWS credentials, thus authorising us to use AWS services. The AWS AssumeRoleWithSAML API is intended for this use case, and the Base AWS Hook should be updated to allow for this method of authentication.
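The proposal maps onto the STS `AssumeRoleWithSAML` request shape. A minimal sketch of the parameters involved (the helper function and the example ARNs below are made up for illustration; with boto3 the dict would be passed as `boto3.client("sts").assume_role_with_saml(**kwargs)`):

```python
# Sketch of the parameters AssumeRoleWithSAML expects. The function name and
# the ARNs below are illustrative placeholders, not Airflow code.
def assume_role_with_saml_kwargs(role_arn, principal_arn, saml_assertion, duration=3600):
    return {
        "RoleArn": role_arn,              # IAM role to assume
        "PrincipalArn": principal_arn,    # SAML identity provider registered in IAM
        "SAMLAssertion": saml_assertion,  # base64-encoded assertion from the IdP
        "DurationSeconds": duration,      # lifetime of the temporary credentials
    }

kwargs = assume_role_with_saml_kwargs(
    "arn:aws:iam::123456789012:role/airflow",        # placeholder role
    "arn:aws:iam::123456789012:saml-provider/adfs",  # placeholder provider
    "PHNhbWw6...",                                   # truncated example assertion
)
```

Unlike plain `AssumeRole`, no long-lived IAM user credentials are needed; the SAML assertion itself authenticates the caller.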
[GitHub] [airflow] stale[bot] commented on issue #7184: [AIRFLOW-6574] Adding private_environment to docker operator.
stale[bot] commented on issue #7184: [AIRFLOW-6574] Adding private_environment to docker operator. URL: https://github.com/apache/airflow/pull/7184#issuecomment-593738311 This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
[GitHub] [airflow] stale[bot] commented on issue #7177: [AIRFLOW-6571] Rewrite BigQueryExecuteQueryOperator to use python client
stale[bot] commented on issue #7177: [AIRFLOW-6571] Rewrite BigQueryExecuteQueryOperator to use python client URL: https://github.com/apache/airflow/pull/7177#issuecomment-593738317 This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
[GitHub] [airflow] stale[bot] commented on issue #7087: [AIRFLOW-XXXX] Clarify breaking changes to macros in updating.md
stale[bot] commented on issue #7087: [AIRFLOW-XXXX] Clarify breaking changes to macros in updating.md URL: https://github.com/apache/airflow/pull/7087#issuecomment-593738315 This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
[jira] [Updated] (AIRFLOW-6974) Using MS SQL Server 17 as a backend, Migration cc1e65623dc7_add_max_tries_column_to_task_instance.py fails if load_examples = True
[ https://issues.apache.org/jira/browse/AIRFLOW-6974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tony Brookes updated AIRFLOW-6974: -- Description: This took me a while to figure out as there was another issue with the migration in question which is cc1e65623dc7_add_max_tries_column_to_task_instance.py This file USED to have an issue where it would sit there forever during an initdb on MS SQL Server, essentially deadlocked with itself. I couldn't figure out why it was still sitting there for me, given that I was using the version of the migration where this had been fixed, so I went looking at the locks on the DB. I found TWO processes running on the DB both originating inside the airflow initdb Python instance. The first was happily sitting there trying to query the max_retries column on a table, but the other was attempting to query the table "slot_pool" from within example_subdag_operator.py . I killed the session which was querying that table and of course my Python process crashed, helpfully with a stack trace. The session I killed was interacting with the DB running in EXAMPLES and was actually complaining that the table was not a valid object name. As soon as I set load_examples = False, the initdb process ran through in a few seconds and all was well. But with load_examples = True it would reliably hang on this specific migration every single time. I have attached a full stack trace from when I terminated the second DB session. was: This took me a while to figure out as there was another issue with the migration in question which is cc1e65623dc7_add_max_tries_column_to_task_instance.py This file USED to have an issue where it would sit there forever during an initdb essentially deadlocked with itself. I couldn't figure out why it was still sitting there for me, given that I was using the version of the migration where this had been fixed, so I went looking at the locks on the DB. 
I found TWO processes running on the DB both originating inside the airflow initdb Python instance. The first was happily sitting there trying to query the max_retries column on a table, but the OTHER as attempting to query the table "slot_pool" from within example_subdag_operator.py . I killed the session which was querying that table and of course my Python process crashed, but helpfully with a stack trace. The session I killed was interacting with the DB running in EXAMPLES and was actually complaining that the table was not a valid object name. As soon as I set load_examples = False, the initdb process ran through in a few seconds and all was well. But with load_examples = True it would reliably hang on this specific migration every single time. I have attached a full stack trace from when I terminated the second DB session. > Using MS SQL Server 17 as a backend, Migration > cc1e65623dc7_add_max_tries_column_to_task_instance.py fails if load_examples > = True > -- > > Key: AIRFLOW-6974 > URL: https://issues.apache.org/jira/browse/AIRFLOW-6974 > Project: Apache Airflow > Issue Type: Bug > Components: db >Affects Versions: 1.10.9 >Reporter: Tony Brookes >Priority: Minor > Attachments: airflow-mssql-stack-trace.txt > > > This took me a while to figure out as there was another issue with the > migration in question which is > cc1e65623dc7_add_max_tries_column_to_task_instance.py > This file USED to have an issue where it would sit there forever during an > initdb on MS SQL Server, essentially deadlocked with itself. > I couldn't figure out why it was still sitting there for me, given that I was > using the version of the migration where this had been fixed, so I went > looking at the locks on the DB. I found TWO processes running on the DB both > originating inside the airflow initdb Python instance. 
> The first was happily sitting there trying to query the max_retries column on > a table, but the other was attempting to query the table "slot_pool" from > within example_subdag_operator.py . I killed the session which was querying > that table and of course my Python process crashed, helpfully with a stack > trace. > The session I killed was interacting with the DB running in EXAMPLES and was > actually complaining that the table was not a valid object name. As soon as > I set load_examples = False, the initdb process ran through in a few seconds > and all was well. But with load_examples = True it would reliably hang on > this specific migration every single time. > I have attached a full stack trace from when I terminated the second DB > session.
[jira] [Created] (AIRFLOW-6974) Using MS SQL Server 17 as a backend, Migration cc1e65623dc7_add_max_tries_column_to_task_instance.py fails if load_examples = True
Tony Brookes created AIRFLOW-6974: - Summary: Using MS SQL Server 17 as a backend, Migration cc1e65623dc7_add_max_tries_column_to_task_instance.py fails if load_examples = True Key: AIRFLOW-6974 URL: https://issues.apache.org/jira/browse/AIRFLOW-6974 Project: Apache Airflow Issue Type: Bug Components: db Affects Versions: 1.10.9 Reporter: Tony Brookes Attachments: airflow-mssql-stack-trace.txt This took me a while to figure out as there was another issue with the migration in question, which is cc1e65623dc7_add_max_tries_column_to_task_instance.py. This file USED to have an issue where it would sit there forever during an initdb, essentially deadlocked with itself. I couldn't figure out why it was still sitting there for me, given that I was using the version of the migration where this had been fixed, so I went looking at the locks on the DB. I found TWO processes running on the DB, both originating inside the airflow initdb Python instance. The first was happily sitting there trying to query the max_retries column on a table, but the other was attempting to query the table "slot_pool" from within example_subdag_operator.py. I killed the session which was querying that table and of course my Python process crashed, but helpfully with a stack trace. The session I killed was interacting with the DB running the EXAMPLES and was actually complaining that the table was not a valid object name. As soon as I set load_examples = False, the initdb process ran through in a few seconds and all was well. But with load_examples = True it would reliably hang on this specific migration every single time. I have attached a full stack trace from when I terminated the second DB session.
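The workaround the reporter describes, disabling example DAG loading before running `airflow initdb`, is set in `airflow.cfg`:

```ini
[core]
# Avoid the initdb hang in cc1e65623dc7_add_max_tries_column_to_task_instance.py
# on MS SQL Server by not loading the example DAGs.
load_examples = False
```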
[GitHub] [airflow] kaxil closed pull request #7054: WIP: Test PR
kaxil closed pull request #7054: WIP: Test PR URL: https://github.com/apache/airflow/pull/7054
[GitHub] [airflow] kaxil opened a new pull request #7054: WIP: Test PR
kaxil opened a new pull request #7054: WIP: Test PR URL: https://github.com/apache/airflow/pull/7054

---
Link to JIRA issue: https://issues.apache.org/jira/browse/AIRFLOW-

- [ ] Description above provides context of the change
- [ ] Commit message starts with `[AIRFLOW-XXXX]`, where AIRFLOW-XXXX = JIRA ID*
- [ ] Unit tests coverage for changes (not needed for documentation changes)
- [ ] Commits follow "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)"
- [ ] Relevant documentation is updated including usage instructions.
- [ ] I will engage committers as explained in [Contribution Workflow Example](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#contribution-workflow-example).

(*) For document-only changes, no JIRA issue is needed. Commit message starts `[AIRFLOW-XXXX]`.

---
In case of fundamental code change, an Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)) is needed. In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x). In case of backwards incompatible changes please leave a note in [UPDATING.md](https://github.com/apache/airflow/blob/master/UPDATING.md). Read the [Pull Request Guidelines](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines) for more information.
[GitHub] [airflow] lucafuji commented on a change in pull request #6870: [AIRFLOW-0578] Check return code
lucafuji commented on a change in pull request #6870: [AIRFLOW-0578] Check return code URL: https://github.com/apache/airflow/pull/6870#discussion_r381445504

## File path: airflow/jobs/local_task_job.py

@@ -95,6 +95,14 @@ def signal_handler(signum, frame):
         # Monitor the task to see if it's done
         return_code = self.task_runner.return_code()
         if return_code is not None:
+            if return_code != 0:
+                self.task_instance.refresh_from_db()
+                # there is one case we should not treat non zero return
+                # code as failed: the job has been killed externally.
+                if (not self.terminating) or self.task_instance.state == State.FAILED:

Review comment:

1. `self.terminating` means the job has been killed externally but not marked `state="failed"`. It is set in heartbeat_callback:L137; basically, whenever `ti.state != State.RUNNING`, it is terminating. There are two cases where this happens: a. `terminate` is explicitly called on a StandardTaskRunner; the return code is -9 and the task_instance state is not failed. In this case, we should not treat a non-zero exit code as failure. b. the task instance is explicitly set to failed; in this case, we should treat a non-zero exit code as failure.
2. As mentioned above, "job failure" is handled in base_job.py:run:L230. If no exception is thrown here, the job state will not be marked as failure.
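The two cases in the review comment can be distilled into a small predicate. This is an illustrative paraphrase of the discussion, not the actual local_task_job.py code:

```python
# Illustrative paraphrase of the review discussion, not the real Airflow logic.
# A non-zero return code counts as failure unless the runner was being
# terminated externally (case a) -- except when the task instance was
# explicitly marked failed (case b).
def nonzero_exit_is_failure(return_code, terminating, ti_state):
    if return_code == 0:
        return False
    return (not terminating) or ti_state == "failed"

# case a: killed externally (e.g. SIGKILL, return code -9), state not failed
assert nonzero_exit_is_failure(-9, terminating=True, ti_state="running") is False
# case b: task instance explicitly set to failed
assert nonzero_exit_is_failure(-9, terminating=True, ti_state="failed") is True
```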
[GitHub] [airflow] codecov-io edited a comment on issue #7423: [AIRFLOW-3126] Add option to specify additional K8s volumes
codecov-io edited a comment on issue #7423: [AIRFLOW-3126] Add option to specify additional K8s volumes URL: https://github.com/apache/airflow/pull/7423#issuecomment-586568480

# [Codecov](https://codecov.io/gh/apache/airflow/pull/7423?src=pr=h1) Report

> Merging [#7423](https://codecov.io/gh/apache/airflow/pull/7423?src=pr=desc) into [master](https://codecov.io/gh/apache/airflow/commit/2ea9278f76bf71aafb5601160602bf7f4194242f?src=pr=desc) will **increase** coverage by `0.1%`.
> The diff coverage is `100%`.

[![Impacted file tree graph](https://codecov.io/gh/apache/airflow/pull/7423/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/airflow/pull/7423?src=pr=tree)

```diff
@@            Coverage Diff            @@
##           master    #7423    +/-   ##
========================================
+ Coverage   86.73%   86.84%   +0.1%
========================================
  Files         897      897
  Lines       42751    42797     +46
========================================
+ Hits        37081    37165     +84
+ Misses       5670     5632     -38
```

| [Impacted Files](https://codecov.io/gh/apache/airflow/pull/7423?src=pr=tree) | Coverage Δ | |
|---|---|---|
| [airflow/kubernetes/worker\_configuration.py](https://codecov.io/gh/apache/airflow/pull/7423/diff?src=pr=tree#diff-YWlyZmxvdy9rdWJlcm5ldGVzL3dvcmtlcl9jb25maWd1cmF0aW9uLnB5) | `99.35% <100%> (+0.04%)` | :arrow_up: |
| [airflow/executors/kubernetes\_executor.py](https://codecov.io/gh/apache/airflow/pull/7423/diff?src=pr=tree#diff-YWlyZmxvdy9leGVjdXRvcnMva3ViZXJuZXRlc19leGVjdXRvci5weQ==) | `60.23% <100%> (+3.06%)` | :arrow_up: |
| [airflow/jobs/scheduler\_job.py](https://codecov.io/gh/apache/airflow/pull/7423/diff?src=pr=tree#diff-YWlyZmxvdy9qb2JzL3NjaGVkdWxlcl9qb2IucHk=) | `90.64% <0%> (+0.14%)` | :arrow_up: |
| [airflow/jobs/backfill\_job.py](https://codecov.io/gh/apache/airflow/pull/7423/diff?src=pr=tree#diff-YWlyZmxvdy9qb2JzL2JhY2tmaWxsX2pvYi5weQ==) | `92.15% <0%> (+0.28%)` | :arrow_up: |
| [airflow/hooks/dbapi\_hook.py](https://codecov.io/gh/apache/airflow/pull/7423/diff?src=pr=tree#diff-YWlyZmxvdy9ob29rcy9kYmFwaV9ob29rLnB5) | `91.73% <0%> (+1.65%)` | :arrow_up: |
| [airflow/providers/postgres/hooks/postgres.py](https://codecov.io/gh/apache/airflow/pull/7423/diff?src=pr=tree#diff-YWlyZmxvdy9wcm92aWRlcnMvcG9zdGdyZXMvaG9va3MvcG9zdGdyZXMucHk=) | `94.36% <0%> (+16.9%)` | :arrow_up: |
| [...roviders/google/cloud/operators/postgres\_to\_gcs.py](https://codecov.io/gh/apache/airflow/pull/7423/diff?src=pr=tree#diff-YWlyZmxvdy9wcm92aWRlcnMvZ29vZ2xlL2Nsb3VkL29wZXJhdG9ycy9wb3N0Z3Jlc190b19nY3MucHk=) | `85.29% <0%> (+32.35%)` | :arrow_up: |
| [airflow/providers/postgres/operators/postgres.py](https://codecov.io/gh/apache/airflow/pull/7423/diff?src=pr=tree#diff-YWlyZmxvdy9wcm92aWRlcnMvcG9zdGdyZXMvb3BlcmF0b3JzL3Bvc3RncmVzLnB5) | `100% <0%> (+50%)` | :arrow_up: |

-- [Continue to review full report at Codecov](https://codecov.io/gh/apache/airflow/pull/7423?src=pr=continue).

> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
> `Δ = absolute (impact)`, `ø = not affected`, `? = missing data`

> Powered by [Codecov](https://codecov.io/gh/apache/airflow/pull/7423?src=pr=footer). Last update [2ea9278...f026674](https://codecov.io/gh/apache/airflow/pull/7423?src=pr=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
[GitHub] [airflow] tbobik1 commented on issue #7386: [AIRFLOW-6761] Fix WorkGroup param in AWSAthenaHook
tbobik1 commented on issue #7386: [AIRFLOW-6761] Fix WorkGroup param in AWSAthenaHook URL: https://github.com/apache/airflow/pull/7386#issuecomment-593672865 I had the same problem using version 1.10.9 trying to query Athena with aws_athena_hook.py. I changed Workgroup to WorkGroup, and I had to set WorkGroup = 'primary' for it to work; using 'default' gave me errors.
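For reference, the casing the Athena `StartQueryExecution` API expects is `WorkGroup` (capital G). A sketch of the request kwargs as a boto3 caller would build them; the database and bucket names here are placeholders:

```python
# Sketch of boto3 Athena start_query_execution kwargs. The fix in this PR is
# the key casing: "WorkGroup", not "Workgroup". Database and bucket names
# below are placeholders.
query_kwargs = {
    "QueryString": "SELECT 1",
    "QueryExecutionContext": {"Database": "my_database"},
    "ResultConfiguration": {"OutputLocation": "s3://my-athena-results/"},
    # the commenter reports 'primary' worked where 'default' gave errors
    "WorkGroup": "primary",
}
```

With boto3 this would be passed as `boto3.client("athena").start_query_execution(**query_kwargs)`.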
[jira] [Updated] (AIRFLOW-6968) Failing when executing a dag: 'log file does not exist'
[ https://issues.apache.org/jira/browse/AIRFLOW-6968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Omar Suárez updated AIRFLOW-6968: - Description: I am facing this error: {{*** Log file does not exist: /usr/local/airflow/logs/example_python_operator_virtualenv/virtualenv_python/2020-03-01T19:08:54.619529+00:00/1.log Fetching from: [http://894194e3daed:8793/log/example_python_operator_virtualenv/virtualenv_python/2020-03-01T19:08:54.619529+00:00/1.log]Failed to fetch log file from worker. HTTPConnectionPool(host='894194e3daed', port=8793): Max retries exceeded with url: /log/example_python_operator_virtualenv/virtualenv_python/2020-03-01T19:08:54.619529+00:00/1.log (Caused by NewConnectionError(': Failed to establish a new connection: [Errno 111] Connection refused',))}} when trying to execute a DAG using the PythonVirtualenvOperator. Attached is the configuration of the logs inside the 'airflow.cfg' file. I am using the Airflow Docker image from puckel. was: I am facing this error: {{*** Log file does not exist: /usr/local/airflow/logs/example_python_operator_virtualenv/virtualenv_python/2020-03-01T19:08:54.619529+00:00/1.log *** Fetching from: http://894194e3daed:8793/log/example_python_operator_virtualenv/virtualenv_python/2020-03-01T19:08:54.619529+00:00/1.log *** Failed to fetch log file from worker. HTTPConnectionPool(host='894194e3daed', port=8793): Max retries exceeded with url: /log/example_python_operator_virtualenv/virtualenv_python/2020-03-01T19:08:54.619529+00:00/1.log (Caused by NewConnectionError(': Failed to establish a new connection: [Errno 111] Connection refused',))}} when trying to execute a DAG using the PythonVirtualenvOperator. Attached is the configuration of the logs inside the 'airflow.cfg' file. I am using the Airflow Docker image from puckel. 
> Failing when executing a dag: 'log file does not exist' > --- > > Key: AIRFLOW-6968 > URL: https://issues.apache.org/jira/browse/AIRFLOW-6968 > Project: Apache Airflow > Issue Type: Bug > Components: logging > Affects Versions: 1.10.2 > Reporter: Omar Suárez > Priority: Minor > Attachments: airflow_log_config.png > > > I am facing this error: > > {{*** Log file does not exist: > /usr/local/airflow/logs/example_python_operator_virtualenv/virtualenv_python/2020-03-01T19:08:54.619529+00:00/1.log > Fetching from: > [http://894194e3daed:8793/log/example_python_operator_virtualenv/virtualenv_python/2020-03-01T19:08:54.619529+00:00/1.log]Failed > to fetch log file from worker. HTTPConnectionPool(host='894194e3daed', > port=8793): Max retries exceeded with url: > /log/example_python_operator_virtualenv/virtualenv_python/2020-03-01T19:08:54.619529+00:00/1.log > (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at > 0x7f1d0966c6d8>: Failed to establish a new connection: [Errno 111] Connection > refused',))}} > > when trying to execute a DAG using the PythonVirtualenvOperator. > Attached is the configuration of the logs inside the 'airflow.cfg' file. > I am using the Airflow Docker image from puckel.
[jira] [Commented] (AIRFLOW-6860) Default ignore_first_depends_on_past to True
[ https://issues.apache.org/jira/browse/AIRFLOW-6860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17049694#comment-17049694 ] ASF subversion and git services commented on AIRFLOW-6860: -- Commit 2ea9278f76bf71aafb5601160602bf7f4194242f in airflow's branch refs/heads/master from Ping Zhang [ https://gitbox.apache.org/repos/asf?p=airflow.git;h=2ea9278 ] [AIRFLOW-6860] Default ignore_first_depends_on_past to True (#7490) > Default ignore_first_depends_on_past to True > > > Key: AIRFLOW-6860 > URL: https://issues.apache.org/jira/browse/AIRFLOW-6860 > Project: Apache Airflow > Issue Type: Improvement > Components: cli > Affects Versions: 1.10.9 > Reporter: Ping Zhang > Assignee: Ping Zhang > Priority: Minor > > to avoid > BackfillJob is deadlocked. Some of the deadlocked tasks were unable to run > because of "depends_on_past" relationships.
[jira] [Commented] (AIRFLOW-6860) Default ignore_first_depends_on_past to True
[ https://issues.apache.org/jira/browse/AIRFLOW-6860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17049693#comment-17049693 ] ASF GitHub Bot commented on AIRFLOW-6860: - KevinYang21 commented on pull request #7490: [AIRFLOW-6860] Default ignore_first_depends_on_past to True URL: https://github.com/apache/airflow/pull/7490 > Default ignore_first_depends_on_past to True > > > Key: AIRFLOW-6860 > URL: https://issues.apache.org/jira/browse/AIRFLOW-6860 > Project: Apache Airflow > Issue Type: Improvement > Components: cli > Affects Versions: 1.10.9 > Reporter: Ping Zhang > Assignee: Ping Zhang > Priority: Minor > > to avoid > BackfillJob is deadlocked. Some of the deadlocked tasks were unable to run > because of "depends_on_past" relationships.
[GitHub] [airflow] KevinYang21 merged pull request #7490: [AIRFLOW-6860] Default ignore_first_depends_on_past to True
KevinYang21 merged pull request #7490: [AIRFLOW-6860] Default ignore_first_depends_on_past to True URL: https://github.com/apache/airflow/pull/7490
[GitHub] [airflow] leahecole commented on issue #7191: [AIRFLOW-4030] second attempt to add singularity to airflow
leahecole commented on issue #7191: [AIRFLOW-4030] second attempt to add singularity to airflow URL: https://github.com/apache/airflow/pull/7191#issuecomment-593639001 Hi @vsoch! Jarek spoke super highly about this contribution and I'd love to talk about how we can get you to the summit - I just followed you on twitter (@leahecole there too) - I think one good way would be if you're willing to give a talk about it - I'd be happy to help you through submitting the talk proposal. Feel free to DM me there to talk more details and thanks for your Airflow contribution! :)
[GitHub] [airflow] kaxil removed a comment on issue #6788: [AIRFLOW-5944] Rendering templated_fields without accessing DAG files
kaxil removed a comment on issue #6788: [AIRFLOW-5944] Rendering templated_fields without accessing DAG files URL: https://github.com/apache/airflow/pull/6788#issuecomment-592308818

**ToDo**:
- [ ] Add new column in SerializedDagTable to store unrendered template fields
- [x] Add tests
[GitHub] [airflow] kaxil edited a comment on issue #6788: [AIRFLOW-5944] Rendering templated_fields without accessing DAG files
kaxil edited a comment on issue #6788: [AIRFLOW-5944] Rendering templated_fields without accessing DAG files URL: https://github.com/apache/airflow/pull/6788#issuecomment-592308818

**ToDo**:
- [ ] Add new column in SerializedDagTable to store unrendered template fields
- [x] Add tests
[GitHub] [airflow] kaxil commented on a change in pull request #6788: [AIRFLOW-5944] Rendering templated_fields without accessing DAG files
kaxil commented on a change in pull request #6788: [AIRFLOW-5944] Rendering templated_fields without accessing DAG files URL: https://github.com/apache/airflow/pull/6788#discussion_r386659768

## File path: airflow/models/dagrun.py

@@ -470,6 +472,12 @@ def verify_integrity(self, session=None):
                     1, 1)
                 ti = TI(task, self.execution_date)
                 session.add(ti)
+        session.commit()
+
+        if STORE_SERIALIZED_DAGS:
+            RenderedTaskInstanceFields.delete_old_records(ti.task_id, ti.dag_id)
+            if not RenderedTaskInstanceFields.has_templated_fields(ti, session):
+                session.add(RenderedTaskInstanceFields(ti))

Review comment: ToDo: Change the location of this piece of code.
[GitHub] [airflow] kaxil commented on a change in pull request #6788: [AIRFLOW-5944] Rendering templated_fields without accessing DAG files
kaxil commented on a change in pull request #6788: [AIRFLOW-5944] Rendering templated_fields without accessing DAG files URL: https://github.com/apache/airflow/pull/6788#discussion_r386659768

## File path: airflow/models/dagrun.py

@@ -470,6 +472,12 @@ def verify_integrity(self, session=None):
                     1, 1)
                 ti = TI(task, self.execution_date)
                 session.add(ti)
+        session.commit()
+
+        if STORE_SERIALIZED_DAGS:
+            RenderedTaskInstanceFields.delete_old_records(ti.task_id, ti.dag_id)
+            if not RenderedTaskInstanceFields.has_templated_fields(ti, session):
+                session.add(RenderedTaskInstanceFields(ti))

Review comment: I want to change the location of this piece of code.
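The pattern under review, purge stale rendered-field rows, then insert a record if none remains, can be sketched independently of SQLAlchemy. Names mirror the diff, but this is a plain-dict illustration, not the Airflow implementation:

```python
# Plain-dict sketch of the control flow in the diff above: delete old
# rendered-field records for a task, then insert one if nothing remains.
# A dict stands in for the database session.
def upsert_rendered_fields(db, task_id, dag_id, fields, keep_last=0):
    key = (dag_id, task_id)
    records = db.setdefault(key, [])
    # delete_old_records analogue: keep only the newest `keep_last` entries
    del records[: len(records) - keep_last]
    # has_templated_fields analogue: insert only when nothing is stored
    if not records:
        records.append(fields)
    return db

db = {}
upsert_rendered_fields(db, "task", "dag", {"bash_command": "echo 1"})
```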
[jira] [Commented] (AIRFLOW-6747) UI - Show count of tasks in each dag on the main dags page
[ https://issues.apache.org/jira/browse/AIRFLOW-6747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17049657#comment-17049657 ] t oo commented on AIRFLOW-6747: --- as part of this could use the new column to indicate whether a dagid is no longer in the dagbag > UI - Show count of tasks in each dag on the main dags page > -- > > Key: AIRFLOW-6747 > URL: https://issues.apache.org/jira/browse/AIRFLOW-6747 > Project: Apache Airflow > Issue Type: Improvement > Components: ui > Affects Versions: 1.10.7 > Reporter: t oo > Assignee: Ebrima Jallow > Priority: Minor > Labels: gsoc, gsoc2020, mentor > > Main DAGs page in UI - would benefit from showing a new column: number of > tasks for each dag id
[GitHub] [airflow] BasPH commented on issue #5910: [WIP][AIRFLOW-5307] Move BaseOperator to airflow.operators.base_operator
BasPH commented on issue #5910: [WIP][AIRFLOW-5307] Move BaseOperator to airflow.operators.base_operator URL: https://github.com/apache/airflow/pull/5910#issuecomment-593616994 Okay, I'll wait for #7596 to be merged first.
[GitHub] [airflow] dferguson992 commented on a change in pull request #7407: [AIRFLOW-6786] Add KafkaConsumerHook, KafkaProduerHook and KafkaSensor
dferguson992 commented on a change in pull request #7407: [AIRFLOW-6786] Add KafkaConsumerHook, KafkaProduerHook and KafkaSensor URL: https://github.com/apache/airflow/pull/7407#discussion_r386641018

## File path: airflow/contrib/hooks/kafka_consumer_hook.py ##

```
@@ -0,0 +1,71 @@
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+from airflow.hooks import base_hook as BaseHook
+from kafka import KafkaConsumer
+
+
+class KafkaConsumerHook(BaseHook):
+
+    DEFAULT_HOST = 'localhost'
+    DEFAULT_PORT = 9092
+
+    def __init__(self, conn_id, topic):
+        super(KafkaConsumerHook, self).__init__(None)
+        self.conn = self.get_connection(conn_id)
```

Review comment: Addressed in newest commit.
[GitHub] [airflow] potiuk commented on issue #7596: [AIRFLOW-6870] [WIP] remove imports from models/__init__.py
potiuk commented on issue #7596: [AIRFLOW-6870] [WIP] remove imports from models/__init__.py URL: https://github.com/apache/airflow/pull/7596#issuecomment-593596094 @msb217 -> Thanks for your considerations :) Can you please push a fixup on top of the old one? That way we can review just the difference. If not, just push everything. I think this change needs to be committed as a single push even if it is huge, and pre-commit can be done as a next step, I think.
[GitHub] [airflow] potiuk commented on issue #5910: [WIP][AIRFLOW-5307] Move BaseOperator to airflow.operators.base_operator
potiuk commented on issue #5910: [WIP][AIRFLOW-5307] Move BaseOperator to airflow.operators.base_operator URL: https://github.com/apache/airflow/pull/5910#issuecomment-593592953 Hey @BasPH -> I think it's a good idea, but you have to be aware that we are moving a lot of stuff out of models/__init__.py in #7596 and also lazy-loading it in models/__init__.py, so it's probably better to rebase after that one is merged. We need to make sure of backwards compatibility, so we need to keep the old `from airflow.models import BaseOperator` working, and with lazy loading / PEP 562 we can get that behaviour.
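The PEP 562 lazy-loading idea mentioned above can be sketched as a module-level `__getattr__` that imports the real defining module only on first attribute access. This is an illustrative sketch, not the actual Airflow change; the factory name and the mapping below are assumptions:

```python
import importlib


def make_lazy_getattr(module_name, lazy_map):
    """Build a PEP 562 module-level __getattr__ that imports names on first use.

    lazy_map maps a public attribute name to the module that really defines it.
    """
    def __getattr__(name):
        try:
            target = lazy_map[name]
        except KeyError:
            raise AttributeError(
                f"module {module_name!r} has no attribute {name!r}")
        # Import the defining module only when the attribute is first requested.
        return getattr(importlib.import_module(target), name)
    return __getattr__


# In airflow/models/__init__.py this would keep old imports working, e.g.
# (hypothetical wiring, not the real file):
# __getattr__ = make_lazy_getattr(
#     __name__, {"BaseOperator": "airflow.models.baseoperator"})
```

With this in place, `from airflow.models import BaseOperator` would still resolve, while the heavy submodule is imported lazily.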
[GitHub] [airflow] BasPH commented on issue #5910: [WIP][AIRFLOW-5307] Move BaseOperator to airflow.operators.base_operator
BasPH commented on issue #5910: [WIP][AIRFLOW-5307] Move BaseOperator to airflow.operators.base_operator URL: https://github.com/apache/airflow/pull/5910#issuecomment-593580264 @kaxil given we now have the BashOperator in airflow.operators.bash, and the PythonOperator in airflow.operators.python, I'd like to place this in airflow.operators.base, while we're at it. What do you think?
[GitHub] [airflow] msb217 edited a comment on issue #7596: [AIRFLOW-6870] [WIP] remove imports from models/__init__.py
msb217 edited a comment on issue #7596: [AIRFLOW-6870] [WIP] remove imports from models/__init__.py URL: https://github.com/apache/airflow/pull/7596#issuecomment-593537042 @potiuk @nuclearpinguin I've addressed the majority of your comments; however, considering the size of this PR, and for sanity's sake, would you like me to close this PR and break it down into separate PRs and Jira issues? For example:
1. Implement `resetdb` and `all_models` case
2. Change module path for `DAG` to `from airflow import DAG` for `example_dags`
3. Change paths to model modules + lazy load user-facing models in `airflow.models`
4. Pre-commit hooks to be done by either of you
I just don't want to drive you insane with such a large review :) Or I can just push if you don't mind.
[GitHub] [airflow] potiuk commented on issue #7608: [AIRFLOW-6972] Shorter frequently used commands in Breeze
potiuk commented on issue #7608: [AIRFLOW-6972] Shorter frequently used commands in Breeze URL: https://github.com/apache/airflow/pull/7608#issuecomment-593563472 Sure, shorter is better :)
[GitHub] [airflow] mtagle commented on issue #7475: [AIRFLOW-6855]: Escape project_dataset_table in SQL query in gcs to bq …
mtagle commented on issue #7475: [AIRFLOW-6855]: Escape project_dataset_table in SQL query in gcs to bq … URL: https://github.com/apache/airflow/pull/7475#issuecomment-593554401 I had some trouble running the tests locally, so I'm exploiting Travis to see if I did it right.
[GitHub] [airflow] codecov-io edited a comment on issue #7584: [AIRFLOW-6956] Extract kill_child_processes_by_pids from DagFileProcessorManager
codecov-io edited a comment on issue #7584: [AIRFLOW-6956] Extract kill_child_processes_by_pids from DagFileProcessorManager URL: https://github.com/apache/airflow/pull/7584#issuecomment-592517938

# [Codecov](https://codecov.io/gh/apache/airflow/pull/7584?src=pr=h1) Report
> Merging [#7584](https://codecov.io/gh/apache/airflow/pull/7584?src=pr=desc) into [master](https://codecov.io/gh/apache/airflow/commit/1d16de7af0ba0b6c8493b105a6751693d2ef30f2?src=pr=desc) will **decrease** coverage by `0.23%`.
> The diff coverage is `85%`.

[![Impacted file tree graph](https://codecov.io/gh/apache/airflow/pull/7584/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/airflow/pull/7584?src=pr=tree)

```diff
@@            Coverage Diff             @@
##           master    #7584      +/-   ##
==========================================
- Coverage   86.81%   86.58%   -0.24%
==========================================
  Files         896      897       +1
  Lines       42704    42747      +43
==========================================
- Hits        37074    37011      -63
- Misses       5630     5736     +106
```

| [Impacted Files](https://codecov.io/gh/apache/airflow/pull/7584?src=pr=tree) | Coverage Δ | |
|---|---|---|
| [airflow/utils/dag\_processing.py](https://codecov.io/gh/apache/airflow/pull/7584/diff?src=pr=tree#diff-YWlyZmxvdy91dGlscy9kYWdfcHJvY2Vzc2luZy5weQ==) | `89.57% <50%> (+2.92%)` | :arrow_up: |
| [airflow/utils/process\_utils.py](https://codecov.io/gh/apache/airflow/pull/7584/diff?src=pr=tree#diff-YWlyZmxvdy91dGlscy9wcm9jZXNzX3V0aWxzLnB5) | `73.25% <88.88%> (+4.13%)` | :arrow_up: |
| [airflow/kubernetes/volume\_mount.py](https://codecov.io/gh/apache/airflow/pull/7584/diff?src=pr=tree#diff-YWlyZmxvdy9rdWJlcm5ldGVzL3ZvbHVtZV9tb3VudC5weQ==) | `44.44% <0%> (-55.56%)` | :arrow_down: |
| [airflow/kubernetes/volume.py](https://codecov.io/gh/apache/airflow/pull/7584/diff?src=pr=tree#diff-YWlyZmxvdy9rdWJlcm5ldGVzL3ZvbHVtZS5weQ==) | `52.94% <0%> (-47.06%)` | :arrow_down: |
| [airflow/kubernetes/pod\_launcher.py](https://codecov.io/gh/apache/airflow/pull/7584/diff?src=pr=tree#diff-YWlyZmxvdy9rdWJlcm5ldGVzL3BvZF9sYXVuY2hlci5weQ==) | `47.18% <0%> (-45.08%)` | :arrow_down: |
| [...viders/cncf/kubernetes/operators/kubernetes\_pod.py](https://codecov.io/gh/apache/airflow/pull/7584/diff?src=pr=tree#diff-YWlyZmxvdy9wcm92aWRlcnMvY25jZi9rdWJlcm5ldGVzL29wZXJhdG9ycy9rdWJlcm5ldGVzX3BvZC5weQ==) | `69.69% <0%> (-25.26%)` | :arrow_down: |
| [airflow/kubernetes/refresh\_config.py](https://codecov.io/gh/apache/airflow/pull/7584/diff?src=pr=tree#diff-YWlyZmxvdy9rdWJlcm5ldGVzL3JlZnJlc2hfY29uZmlnLnB5) | `50.98% <0%> (-23.53%)` | :arrow_down: |
| [airflow/config\_templates/airflow\_local\_settings.py](https://codecov.io/gh/apache/airflow/pull/7584/diff?src=pr=tree#diff-YWlyZmxvdy9jb25maWdfdGVtcGxhdGVzL2FpcmZsb3dfbG9jYWxfc2V0dGluZ3MucHk=) | `64.28% <0%> (-1.76%)` | :arrow_down: |
| [airflow/models/dag.py](https://codecov.io/gh/apache/airflow/pull/7584/diff?src=pr=tree#diff-YWlyZmxvdy9tb2RlbHMvZGFnLnB5) | `91.55% <0%> (-0.03%)` | :arrow_down: |
| [airflow/utils/log/cloudwatch\_task\_handler.py](https://codecov.io/gh/apache/airflow/pull/7584/diff?src=pr=tree#diff-YWlyZmxvdy91dGlscy9sb2cvY2xvdWR3YXRjaF90YXNrX2hhbmRsZXIucHk=) | `100% <0%> (ø)` | |

-- [Continue to review full report at Codecov](https://codecov.io/gh/apache/airflow/pull/7584?src=pr=continue).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
> `Δ = absolute (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/airflow/pull/7584?src=pr=footer). Last update [1d16de7...7d83211](https://codecov.io/gh/apache/airflow/pull/7584?src=pr=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
[GitHub] [airflow] mik-laj edited a comment on issue #7597: [AIRFLOW-6497] Avoid loading DAGs in the main scheduler loop
mik-laj edited a comment on issue #7597: [AIRFLOW-6497] Avoid loading DAGs in the main scheduler loop URL: https://github.com/apache/airflow/pull/7597#issuecomment-593430117 It is worth noting that this also solves one more problem: the modules are now always reloaded (https://issues.apache.org/jira/projects/AIRFLOW/issues/AIRFLOW-6497?filter=allopenissues), so when someone makes a change in an additional module, the new version is executed rather than a stale one. This matters because a handler is often stored in a helper module shared among many DAGs.
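The stale-module problem described above can be demonstrated with a small self-contained sketch (the file and variable names are made up for illustration; this is not scheduler code): an already-imported helper keeps its old definitions until it is explicitly reloaded.

```python
import importlib
import pathlib
import sys
import tempfile


def demo_reload():
    """Show that an edited helper module stays stale until reloaded."""
    sys.dont_write_bytecode = True  # avoid .pyc caching interfering with the demo
    with tempfile.TemporaryDirectory() as d:
        helper_path = pathlib.Path(d) / "helper_module.py"
        helper_path.write_text("VALUE = 1\n")
        sys.path.insert(0, d)
        try:
            import helper_module
            first = helper_module.VALUE            # freshly imported: 1
            helper_path.write_text("VALUE = 2\n")  # the user edits the helper
            stale = helper_module.VALUE            # still 1: modules are cached
            importlib.reload(helper_module)        # what the change guarantees
            fresh = helper_module.VALUE            # now 2
        finally:
            sys.path.remove(d)
            sys.modules.pop("helper_module", None)
    return first, stale, fresh
```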
[GitHub] [airflow] mik-laj commented on issue #7503: [AIRFLOW-3607] fixed the bug with picking just special cases while maintaining the p…
mik-laj commented on issue #7503: [AIRFLOW-3607] fixed the bug with picking just special cases while maintaining the p… URL: https://github.com/apache/airflow/pull/7503#issuecomment-593540891 @houqp What do you think?
[GitHub] [airflow] mik-laj merged pull request #7606: [AIRFLOW-XXXX] Add Ternary Data to README.md
mik-laj merged pull request #7606: [AIRFLOW-XXXX] Add Ternary Data to README.md URL: https://github.com/apache/airflow/pull/7606
[GitHub] [airflow] boring-cyborg[bot] commented on issue #7606: [AIRFLOW-XXXX] Add Ternary Data to README.md
boring-cyborg[bot] commented on issue #7606: [AIRFLOW-XXXX] Add Ternary Data to README.md URL: https://github.com/apache/airflow/pull/7606#issuecomment-593541241 Awesome work, congrats on your first merged pull request!
[GitHub] [airflow] nuclearpinguin opened a new pull request #7609: [AIRFLOW-6973] Make GCSCreateBucketOperator idempotent
nuclearpinguin opened a new pull request #7609: [AIRFLOW-6973] Make GCSCreateBucketOperator idempotent URL: https://github.com/apache/airflow/pull/7609

---
Issue link: WILL BE INSERTED BY [boring-cyborg](https://github.com/kaxil/boring-cyborg)

Make sure to mark the boxes below before creating PR: [x]
- [x] Description above provides context of the change
- [x] Commit message/PR title starts with `[AIRFLOW-]`. AIRFLOW- = JIRA ID*
- [x] Unit tests coverage for changes (not needed for documentation changes)
- [x] Commits follow "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)"
- [x] Relevant documentation is updated including usage instructions.
- [x] I will engage committers as explained in [Contribution Workflow Example](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#contribution-workflow-example).

* For document-only changes commit message can start with `[AIRFLOW-]`.

---
In case of fundamental code change, Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)) is needed. In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x). In case of backwards incompatible changes please leave a note in [UPDATING.md](https://github.com/apache/airflow/blob/master/UPDATING.md). Read the [Pull Request Guidelines](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines) for more information.
[jira] [Commented] (AIRFLOW-6973) Make GCSCreateBucketOperator idempotent
[ https://issues.apache.org/jira/browse/AIRFLOW-6973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17049494#comment-17049494 ] ASF GitHub Bot commented on AIRFLOW-6973: nuclearpinguin commented on pull request #7609: [AIRFLOW-6973] Make GCSCreateBucketOperator idempotent URL: https://github.com/apache/airflow/pull/7609

> Make GCSCreateBucketOperator idempotent
> Key: AIRFLOW-6973
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6973
> Project: Apache Airflow
> Issue Type: Improvement
> Components: gcp, operators
> Affects Versions: 2.0.0
> Reporter: Tomasz Urbaszek
> Priority: Minor
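A common way to make a create-style operator like GCSCreateBucketOperator idempotent is to treat an "already exists" error as success. The sketch below illustrates the pattern under assumptions: `Conflict` stands in for `google.api_core.exceptions.Conflict` (HTTP 409), the helper name is made up, and `FakeGCSClient` is a test double; this is not the actual diff of PR #7609.

```python
class Conflict(Exception):
    """Stand-in for google.api_core.exceptions.Conflict (HTTP 409)."""


def create_bucket_idempotent(client, bucket_name):
    """Create the bucket, treating 'already exists' as success."""
    try:
        client.create_bucket(bucket_name)
    except Conflict:
        # Re-running the task must not fail just because the bucket exists.
        pass
    return bucket_name


class FakeGCSClient:
    """Minimal fake client used to exercise the sketch."""

    def __init__(self):
        self.buckets = set()

    def create_bucket(self, name):
        if name in self.buckets:
            raise Conflict(name)
        self.buckets.add(name)
```

With this shape, retrying the task or re-running a backfill leaves the system in the same state as the first successful run.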
[GitHub] [airflow] mik-laj commented on issue #6670: [AIRFLOW-4816]MySqlToS3Operator
mik-laj commented on issue #6670: [AIRFLOW-4816]MySqlToS3Operator URL: https://github.com/apache/airflow/pull/6670#issuecomment-593527684 You should remove the "_operator" suffix from the file name. After that, this operator should be in the `airflow/providers/amazon/aws/operators` directory. More information: https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-21%3A+Changes+in+import+paths Best regards, Kamil
[jira] [Commented] (AIRFLOW-6972) Shorter frequently used commands in Breeze
[ https://issues.apache.org/jira/browse/AIRFLOW-6972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17049487#comment-17049487 ] ASF GitHub Bot commented on AIRFLOW-6972: mik-laj commented on pull request #7608: [AIRFLOW-6972] Shorter frequently used commands in Breeze URL: https://github.com/apache/airflow/pull/7608 I often make a typo in the word "environment".

> Shorter frequently used commands in Breeze
> Key: AIRFLOW-6972
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6972
> Project: Apache Airflow
> Issue Type: Improvement
> Components: breeze
> Affects Versions: 1.10.9
> Reporter: Kamil Bregula
> Priority: Major
[GitHub] [airflow] mik-laj opened a new pull request #7608: [AIRFLOW-6972] Shorter frequently used commands in Breeze
mik-laj opened a new pull request #7608: [AIRFLOW-6972] Shorter frequently used commands in Breeze URL: https://github.com/apache/airflow/pull/7608 I often make a typo in the word "environment".
[jira] [Created] (AIRFLOW-6973) Make GCSCreateBucketOperator idempotent
Tomasz Urbaszek created AIRFLOW-6973: Summary: Make GCSCreateBucketOperator idempotent Key: AIRFLOW-6973 URL: https://issues.apache.org/jira/browse/AIRFLOW-6973 Project: Apache Airflow Issue Type: Improvement Components: gcp, operators Affects Versions: 2.0.0 Reporter: Tomasz Urbaszek
[jira] [Created] (AIRFLOW-6972) Shorter frequently used commands in Breeze.
Kamil Bregula created AIRFLOW-6972: -- Summary: Shorter frequently used commands in Breeze. Key: AIRFLOW-6972 URL: https://issues.apache.org/jira/browse/AIRFLOW-6972 Project: Apache Airflow Issue Type: Improvement Components: breeze Affects Versions: 1.10.9 Reporter: Kamil Bregula
[jira] [Updated] (AIRFLOW-6972) Shorter frequently used commands in Breeze
[ https://issues.apache.org/jira/browse/AIRFLOW-6972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kamil Bregula updated AIRFLOW-6972: Summary: Shorter frequently used commands in Breeze (was: Shorter frequently used commands in Breeze.)
[GitHub] [airflow] JavierLopezT commented on issue #6670: [AIRFLOW-4816]MySqlToS3Operator
JavierLopezT commented on issue #6670: [AIRFLOW-4816]MySqlToS3Operator URL: https://github.com/apache/airflow/pull/6670#issuecomment-593514642 Also, what is this? Would this prevent the code from passing all the checks?

```
_ TestOperatorsHooks.test_no_illegal_suffixes __

self =

    def test_no_illegal_suffixes(self):
        illegal_suffixes = ["_operator.py", "_hook.py", "_sensor.py"]
        files = itertools.chain(*[
            glob.glob(f"{ROOT_FOLDER}/{part}/providers/**/{resource_type}/*.py", recursive=True)
            for resource_type in ["operators", "hooks", "sensors", "example_dags"]
            for part in ["airlfow", "tests"]
        ])
        invalid_files = [
            f for f in files
            if any(f.endswith(suffix) for suffix in illegal_suffixes)
        ]
>       self.assertEqual([], invalid_files)
E       AssertionError: Lists differ: [] != ['/opt/airflow/tests/providers/amazon/aws/[35 chars].py']
E
E       Second list contains 1 additional elements.
E       First extra element 0:
E       '/opt/airflow/tests/providers/amazon/aws/operators/test_mysql_to_s3_operator.py'
E
E       - []
E       + ['/opt/airflow/tests/providers/amazon/aws/operators/test_mysql_to_s3_operator.py']

tests/test_project_structure.py:267: AssertionError
```
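The failing check comes from the test file's name itself, so the fix is a rename that drops the `_operator` suffix, per AIP-21. The sketch below simulates the rename in a scratch directory (paths mirror the test output; in the real repository this would be a `git mv old new`):

```python
import pathlib
import tempfile


def simulate_aip21_rename():
    """Simulate dropping the '_operator' suffix from the offending test file."""
    with tempfile.TemporaryDirectory() as d:
        op_dir = pathlib.Path(d) / "tests" / "providers" / "amazon" / "aws" / "operators"
        op_dir.mkdir(parents=True)
        old = op_dir / "test_mysql_to_s3_operator.py"  # name flagged by the check
        old.touch()
        new = op_dir / "test_mysql_to_s3.py"           # AIP-21-compliant name
        old.rename(new)
        # After the rename the illegal-suffix check would find nothing.
        return old.exists(), new.exists()
```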
[GitHub] [airflow] JavierLopezT edited a comment on issue #6670: [AIRFLOW-4816]MySqlToS3Operator
JavierLopezT edited a comment on issue #6670: [AIRFLOW-4816]MySqlToS3Operator URL: https://github.com/apache/airflow/pull/6670#issuecomment-593511992 I am stuck again with the test. Could you help me, please? Sorry for bothering you again.

```
AssertionError: Expected call: load_file(bucket_name='bucket', filename=, key='key')
Actual call: load_file(bucket_name='bucket', filename=, key='key')
```

The code in the operator is:

```
with tempfile.NamedTemporaryFile(mode='r+', suffix='.csv') as tmp_csv:
    tmp_csv.file.write(data_df.to_csv(index=self.index, header=self.header))
    tmp_csv.file.seek(0)
    s3_conn.load_file(filename=tmp_csv.name,
                      key=self.s3_key,
                      bucket_name=self.s3_bucket)
```

And the testing code (its latest version) is:

```
@mock.patch("airflow.operators.mysql_to_s3_operator.tempfile.NamedTemporaryFile")
...
temp_mock.assert_called_once_with(mode='r+', suffix=".csv")
temp_mock.return_value.__enter__.return_value.name = "file"
mock_s3_hook.return_value.load_file.assert_called_once_with(filename=temp_mock.__enter__.name,
                                                            key=s3_key,
                                                            bucket_name=s3_bucket)
```

I have also tried:

```
mock_s3_hook.return_value.load_file.assert_called_once_with(filename="file", key=s3_key, bucket_name=s3_bucket)
```

and

```
mock_s3_hook.return_value.load_file.assert_called_once_with(filename=temp_mock.name, key=s3_key, bucket_name=s3_bucket)
```
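The likely issue is that the object bound by `with ... as tmp_csv` is the mock's `__enter__` return value, so `.name` must be set on `return_value.__enter__.return_value` and the assertion must compare against that same value (`temp_mock.__enter__.name` is a different auto-created mock). A self-contained sketch of the pattern; `upload` is a stand-in for the operator body, not the real operator:

```python
import tempfile
from unittest import mock


def upload(s3_conn, s3_key, s3_bucket, csv_text):
    # Stand-in for the operator's execute() body quoted in the comment above.
    with tempfile.NamedTemporaryFile(mode='r+', suffix='.csv') as tmp_csv:
        tmp_csv.write(csv_text)
        tmp_csv.seek(0)
        s3_conn.load_file(filename=tmp_csv.name, key=s3_key,
                          bucket_name=s3_bucket)


def run_test():
    with mock.patch.object(tempfile, "NamedTemporaryFile") as temp_mock:
        # The object bound by `with ... as tmp_csv` is __enter__'s return value,
        # so its .name must be configured here:
        temp_mock.return_value.__enter__.return_value.name = "file"
        s3_conn = mock.Mock()
        upload(s3_conn, "key", "bucket", "a,b\n1,2\n")
        temp_mock.assert_called_once_with(mode='r+', suffix='.csv')
        # ...and the assertion compares against that same configured value.
        s3_conn.load_file.assert_called_once_with(
            filename="file", key="key", bucket_name="bucket")
    return True
```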
[jira] [Commented] (AIRFLOW-6971) Fix return type in CloudSpeechToTextRecognizeSpeechOperator
[ https://issues.apache.org/jira/browse/AIRFLOW-6971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17049442#comment-17049442 ] ASF GitHub Bot commented on AIRFLOW-6971: - nuclearpinguin commented on pull request #7607: [AIRFLOW-6971] Fix return type in CloudSpeechToTextRecognizeSpeechOp URL: https://github.com/apache/airflow/pull/7607 --- Issue link: WILL BE INSERTED BY [boring-cyborg](https://github.com/kaxil/boring-cyborg) Make sure to mark the boxes below before creating PR: [x] - [x] Description above provides context of the change - [x] Commit message/PR title starts with `[AIRFLOW-]`. AIRFLOW- = JIRA ID* - [x] Unit tests coverage for changes (not needed for documentation changes) - [x] Commits follow "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)" - [x] Relevant documentation is updated including usage instructions. - [x] I will engage committers as explained in [Contribution Workflow Example](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#contribution-workflow-example). * For document-only changes commit message can start with `[AIRFLOW-]`. --- In case of fundamental code change, Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)) is needed. In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x). In case of backwards incompatible changes please leave a note in [UPDATING.md](https://github.com/apache/airflow/blob/master/UPDATING.md). Read the [Pull Request Guidelines](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines) for more information. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Fix return type in CloudSpeechToTextRecognizeSpeechOperator > > > Key: AIRFLOW-6971 > URL: https://issues.apache.org/jira/browse/AIRFLOW-6971 > Project: Apache Airflow > Issue Type: Improvement > Components: gcp >Affects Versions: 2.0.0 >Reporter: Tomasz Urbaszek >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (AIRFLOW-6971) Fix return type in CloudSpeechToTextRecognizeSpeechOperator
Tomasz Urbaszek created AIRFLOW-6971: Summary: Fix return type in CloudSpeechToTextRecognizeSpeechOperator Key: AIRFLOW-6971 URL: https://issues.apache.org/jira/browse/AIRFLOW-6971 Project: Apache Airflow Issue Type: Improvement Components: gcp Affects Versions: 2.0.0 Reporter: Tomasz Urbaszek -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [airflow] potiuk commented on issue #7570: [AIRFLOW-6946] Switch to MySQL 5.7 in 2.0 as base
potiuk commented on issue #7570: [AIRFLOW-6946] Switch to MySQL 5.7 in 2.0 as base URL: https://github.com/apache/airflow/pull/7570#issuecomment-593506014 Hey @ashb @kaxil @ANiteckiP @anitakar -> This change moves us to MySQL 5.7. It also contains the fix for the utf8mb4 problem I discovered while testing unicode DAGs.
[GitHub] [airflow] boring-cyborg[bot] commented on issue #7595: [AIRFLOW-XXXX] Fix typo from upstream to downstream
boring-cyborg[bot] commented on issue #7595: [AIRFLOW-XXXX] Fix typo from upstream to downstream URL: https://github.com/apache/airflow/pull/7595#issuecomment-593505971 Awesome work, congrats on your first merged pull request!
[GitHub] [airflow] feluelle merged pull request #7595: [AIRFLOW-XXXX] Fix typo from upstream to downstream
feluelle merged pull request #7595: [AIRFLOW-XXXX] Fix typo from upstream to downstream URL: https://github.com/apache/airflow/pull/7595
[GitHub] [airflow] mhousley opened a new pull request #7606: [AIRFLOW-XXXX] Add Ternary Data to README.md
mhousley opened a new pull request #7606: [AIRFLOW-] Add Ternary Data to README.md URL: https://github.com/apache/airflow/pull/7606 --- Issue link: WILL BE INSERTED BY [boring-cyborg](https://github.com/kaxil/boring-cyborg) Make sure to mark the boxes below before creating PR: [x] - [x] Description above provides context of the change - [x] Commit message/PR title starts with `[AIRFLOW-]`. AIRFLOW- = JIRA ID* - [x] Unit tests coverage for changes (not needed for documentation changes) - [x] Commits follow "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)" - [x] Relevant documentation is updated including usage instructions. - [x] I will engage committers as explained in [Contribution Workflow Example](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#contribution-workflow-example). * For document-only changes commit message can start with `[AIRFLOW-]`. --- In case of fundamental code change, Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)) is needed. In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x). In case of backwards incompatible changes please leave a note in [UPDATING.md](https://github.com/apache/airflow/blob/master/UPDATING.md). Read the [Pull Request Guidelines](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines) for more information. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [airflow] mhousley closed pull request #7605: Add Ternary Data to README.md
mhousley closed pull request #7605: Add Ternary Data to README.md URL: https://github.com/apache/airflow/pull/7605
[GitHub] [airflow] mhousley opened a new pull request #7605: Add Ternary Data to README.md
mhousley opened a new pull request #7605: Add Ternary Data to README.md URL: https://github.com/apache/airflow/pull/7605 --- Issue link: WILL BE INSERTED BY [boring-cyborg](https://github.com/kaxil/boring-cyborg) Make sure to mark the boxes below before creating PR: [x] - [x ] Description above provides context of the change - [ ] Commit message/PR title starts with `[AIRFLOW-]`. AIRFLOW- = JIRA ID* - [ ] Unit tests coverage for changes (not needed for documentation changes) - [ ] Commits follow "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)" - [ ] Relevant documentation is updated including usage instructions. - [ ] I will engage committers as explained in [Contribution Workflow Example](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#contribution-workflow-example). * For document-only changes commit message can start with `[AIRFLOW-]`. --- In case of fundamental code change, Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)) is needed. In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x). In case of backwards incompatible changes please leave a note in [UPDATING.md](https://github.com/apache/airflow/blob/master/UPDATING.md). Read the [Pull Request Guidelines](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines) for more information. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [airflow] marwan116 commented on issue #7595: [AIRFLOW-XXXX] Fix typo from upstream to downstream
marwan116 commented on issue #7595: [AIRFLOW-XXXX] Fix typo from upstream to downstream URL: https://github.com/apache/airflow/pull/7595#issuecomment-593498283 Thank you - so for documentation fixes it is actually - sorry, I was a bit confused about that. Fixed it.
[GitHub] [airflow] mik-laj commented on issue #7595: [AIRFLOW-6971] Fix typo from upstream to downstream
mik-laj commented on issue #7595: [AIRFLOW-6971] Fix typo from upstream to downstream URL: https://github.com/apache/airflow/pull/7595#issuecomment-593496847 Here is the pull-request guideline: https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines
[GitHub] [airflow] marwan116 edited a comment on issue #7595: [AIRFLOW-6971] Fix typo from upstream to downstream
marwan116 edited a comment on issue #7595: [AIRFLOW-6971] Fix typo from upstream to downstream URL: https://github.com/apache/airflow/pull/7595#issuecomment-593489182 Thank you for responding - yes, I noted that the format should start with [AIRFLOW-] but I was not sure what it is supposed to be. I just went to JIRA, checked the most recent number (6970), and incremented mine by one... it would be good if there were a resource on how to create the JIRA issue number.
[jira] [Commented] (AIRFLOW-6970) Improve GCP Video Intelligence system tests
[ https://issues.apache.org/jira/browse/AIRFLOW-6970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17049374#comment-17049374 ] ASF GitHub Bot commented on AIRFLOW-6970: - nuclearpinguin commented on pull request #7604: [AIRFLOW-6970] Improve GCP Video Intelligence system tests URL: https://github.com/apache/airflow/pull/7604 --- Issue link: WILL BE INSERTED BY [boring-cyborg](https://github.com/kaxil/boring-cyborg) Make sure to mark the boxes below before creating PR: [x] - [x] Description above provides context of the change - [x] Commit message/PR title starts with `[AIRFLOW-]`. AIRFLOW- = JIRA ID* - [x] Unit tests coverage for changes (not needed for documentation changes) - [x] Commits follow "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)" - [x] Relevant documentation is updated including usage instructions. - [x] I will engage committers as explained in [Contribution Workflow Example](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#contribution-workflow-example). * For document-only changes commit message can start with `[AIRFLOW-]`. --- In case of fundamental code change, Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)) is needed. In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x). In case of backwards incompatible changes please leave a note in [UPDATING.md](https://github.com/apache/airflow/blob/master/UPDATING.md). Read the [Pull Request Guidelines](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines) for more information. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Improve GCP Video Intelligence system tests > --- > > Key: AIRFLOW-6970 > URL: https://issues.apache.org/jira/browse/AIRFLOW-6970 > Project: Apache Airflow > Issue Type: Improvement > Components: gcp, tests >Affects Versions: 2.0.0 >Reporter: Tomasz Urbaszek >Priority: Minor > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [airflow] mik-laj commented on a change in pull request #7163: [AIRFLOW-6542] add spark-on-k8s operator/hook/sensor
mik-laj commented on a change in pull request #7163: [AIRFLOW-6542] add spark-on-k8s operator/hook/sensor URL: https://github.com/apache/airflow/pull/7163#discussion_r386495451

## File path: airflow/providers/cncf/kubernetes/operators/spark_kubernetes.py

```python
#
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
#   http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
from typing import Optional

import yaml
from kubernetes import client

from airflow.exceptions import AirflowException
from airflow.models import BaseOperator
from airflow.providers.cncf.kubernetes.hooks.kubernetes import KubernetesHook
from airflow.utils.decorators import apply_defaults


class SparkKubernetesOperator(BaseOperator):
    """
    Creates sparkApplication object in kubernetes cluster:

    .. seealso::
        For more detail about Spark Application Object have a look at the reference:
        https://github.com/GoogleCloudPlatform/spark-on-k8s-operator/blob/master/docs/api-docs.md#sparkapplication

    :param application_file: filepath to kubernetes custom_resource_definition of sparkApplication
    :type application_file: str
    :param namespace: kubernetes namespace to put sparkApplication
    :type namespace: str
    :param kubernetes_conn_id: the connection to Kubernetes cluster
    :type conn_id: str
    """

    template_fields = ['application_file', 'namespace']
    template_ext = ('yaml', 'yml', 'json')
    ui_color = '#f4a460'

    @apply_defaults
    def __init__(self,
                 application_file: str,
                 namespace: Optional[str] = None,
                 conn_id: str = 'kubernetes_default',
                 *args, **kwargs) -> None:
        super().__init__(*args, **kwargs)
        self.application_file = application_file
        self.namespace = namespace
        self.conn_id = conn_id

    def execute(self, context):
        self.log.info("Creating sparkApplication")
        hook = KubernetesHook(conn_id=self.conn_id)
        api_client = hook.get_conn()
        api = client.CustomObjectsApi(api_client)
        application_dict = self._load_application_to_dict()
        if self.namespace is None:
            namespace = hook.get_namespace()
        else:
            namespace = self.namespace
        try:
            response = api.create_namespaced_custom_object(
```

Review comment: This logic should be in the hook so that it can be used again by other custom operators. Then we can add more methods that start operators and wait for their completion, or others depending on the situation.
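To illustrate the reviewer's suggestion, here is a minimal, hypothetical sketch of moving the `CustomObjectsApi` call behind a hook method so other operators can reuse it. The class, method, and parameter names are illustrative assumptions, not the final Airflow API; a stub API object stands in for the real `kubernetes` client:

```python
class KubernetesHookSketch:
    """Hypothetical hook that owns the custom-object API call (names assumed)."""

    def __init__(self, api, default_namespace='default'):
        self.api = api                      # a kubernetes CustomObjectsApi, or a stub
        self.default_namespace = default_namespace

    def create_custom_object(self, group, version, plural, body, namespace=None):
        # Fall back to the hook-level namespace, mirroring the operator's
        # `if self.namespace is None: hook.get_namespace()` branch.
        namespace = namespace or self.default_namespace
        return self.api.create_namespaced_custom_object(
            group=group, version=version, namespace=namespace,
            plural=plural, body=body)


class StubApi:
    """Test double echoing its arguments, so the sketch runs without a cluster."""
    def create_namespaced_custom_object(self, **kwargs):
        return kwargs


# The operator's execute() then shrinks to one delegated call:
hook = KubernetesHookSketch(api=StubApi(), default_namespace='spark-jobs')
resp = hook.create_custom_object(
    group='sparkoperator.k8s.io', version='v1beta2',
    plural='sparkapplications', body={'kind': 'SparkApplication'})
```

With this split, follow-up methods on the hook (e.g. one that polls the object until completion) become reusable by sensors and other custom operators, which is the point of the review comment.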
[GitHub] [airflow] mik-laj commented on a change in pull request #7163: [AIRFLOW-6542] add spark-on-k8s operator/hook/sensor
mik-laj commented on a change in pull request #7163: [AIRFLOW-6542] add spark-on-k8s operator/hook/sensor URL: https://github.com/apache/airflow/pull/7163#discussion_r386491368 ## File path: airflow/providers/cncf/kubernetes/example_dags/example_spark_kubernetes_operator_spark_pi.yaml Review comment: Can you add this file to MANIFEST.in? https://github.com/apache/airflow/blob/master/MANIFEST.in
[GitHub] [airflow] mik-laj commented on a change in pull request #6576: [AIRFLOW-5922] Add option to specify the mysql client library used in MySqlHook
mik-laj commented on a change in pull request #6576: [AIRFLOW-5922] Add option to specify the mysql client library used in MySqlHook URL: https://github.com/apache/airflow/pull/6576#discussion_r386484225

## File path: airflow/providers/mysql/hooks/mysql.py

```python
# specific language governing permissions and limitations
# under the License.

"""
This module allows to connect to a MySQL database.
"""

import json

import MySQLdb
```

Review comment: Do we need this import here? Maybe we can load this library only in specific cases?
[GitHub] [airflow] mik-laj commented on a change in pull request #6576: [AIRFLOW-5922] Add option to specify the mysql client library used in MySqlHook
mik-laj commented on a change in pull request #6576: [AIRFLOW-5922] Add option to specify the mysql client library used in MySqlHook URL: https://github.com/apache/airflow/pull/6576#discussion_r386483618

## File path: airflow/providers/mysql/hooks/mysql.py

```python
            conn_config['unix_socket'] = conn.extra_dejson['unix_socket']
        if local_infile:
            conn_config["local_infile"] = 1
        return conn_config

    def _get_conn_config_mysql_connector_python(self, conn):
        conn_config = {
            'user': conn.login,
            'password': conn.password or '',
            'host': conn.host or 'localhost',
            'database': self.schema or conn.schema or '',
            'port': int(conn.port) if conn.port else 3306
        }

        if conn.extra_dejson.get('allow_local_infile', False):
            conn_config["allow_local_infile"] = True

        return conn_config

    def get_conn(self):
        """
        Establishes a connection to a mysql database
        by extracting the connection configuration from the Airflow connection.

        .. note:: By default it connects to the database via the mysqlclient library.
            But you can also choose the mysql-connector-python library which lets you
            connect through ssl without any further ssl parameters required.

        :return: a mysql connection object
        """
        conn = self.connection or self.get_connection(self.mysql_conn_id)  # pylint: disable=no-member

        client_name = conn.extra_dejson.get('client', 'mysqlclient')

        if client_name == 'mysql-connector-python':
            import mysql.connector
            conn_config = self._get_conn_config_mysql_connector_python(conn)
            return mysql.connector.connect(**conn_config)

        conn_config = self._get_conn_config_mysql_client(conn)
```

Review comment: Can you raise an exception when an invalid value is provided? Loading a specific client by default can lead to difficult-to-detect typos.
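A small sketch of what the reviewer is asking for: validate the `client` extra against the known choices and fail loudly on anything else, instead of silently falling back to the default. The function name and error wording here are illustrative, not the merged Airflow code:

```python
def resolve_mysql_client(client_name: str) -> str:
    """Return the client name if it is a known MySQL driver, else raise.

    Validating up front means a typo in the connection extras surfaces as an
    explicit error rather than a silent fallback to mysqlclient.
    """
    known = {'mysqlclient', 'mysql-connector-python'}
    if client_name not in known:
        raise ValueError(
            f"Unknown MySQL client {client_name!r}; expected one of {sorted(known)}")
    return client_name


assert resolve_mysql_client('mysqlclient') == 'mysqlclient'
try:
    resolve_mysql_client('mysql-conector-python')  # typo is now caught loudly
except ValueError as err:
    print(err)
```

The lazy `import mysql.connector` inside `get_conn()` would still happen only in the branch that needs it; only the dispatch on the extra becomes strict.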
[jira] [Created] (AIRFLOW-6970) Improve GCP Video Intelligence system tests
Tomasz Urbaszek created AIRFLOW-6970: Summary: Improve GCP Video Intelligence system tests Key: AIRFLOW-6970 URL: https://issues.apache.org/jira/browse/AIRFLOW-6970 Project: Apache Airflow Issue Type: Improvement Components: gcp, tests Affects Versions: 2.0.0 Reporter: Tomasz Urbaszek -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [airflow] kaxil commented on a change in pull request #6788: WIP: [AIRFLOW-5944] Rendering templated_fields without accessing DAG files
kaxil commented on a change in pull request #6788: WIP: [AIRFLOW-5944] Rendering templated_fields without accessing DAG files URL: https://github.com/apache/airflow/pull/6788#discussion_r386480827

## File path: airflow/www/views.py

```python
    @has_dag_access(can_dag_read=True)
    @has_access
    @action_logging
    @provide_session
    def rendered(self, session=None):
        dag_id = request.args.get('dag_id')
        task_id = request.args.get('task_id')
        execution_date = request.args.get('execution_date')
        dttm = timezone.parse(execution_date)
        form = DateTimeForm(data={'execution_date': dttm})
        root = request.args.get('root', '')

        logging.info("Retrieving rendered templates.")
        dag = dagbag.get_dag(dag_id)

        task = copy.copy(dag.get_task(task_id))
        ti = models.TaskInstance(task=task, execution_date=dttm)
        try:
            if STORE_SERIALIZED_DAGS:
                rtif = RenderedTaskInstanceFields.get_templated_fields(ti)
                if rtif:
                    for field_name, rendered_value in rtif.items():
                        setattr(task, field_name, rendered_value)
                else:
                    # ToDo: Fetch raw strings from RenderedTaskInstanceFields table
                    flash("Template field not found")
            else:
                ti.render_templates()
```

Review comment: Done!
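The core pattern in this view change can be shown in a few lines without Airflow: overlay the stored rendered values onto a copied task object with `setattr`, so the webserver never has to re-parse the DAG file. The `Task` class and the `rtif` dict below are illustrative stand-ins for the real operator and for `RenderedTaskInstanceFields.get_templated_fields()`:

```python
import copy

class Task:
    """Stand-in for an operator with one templated field (hypothetical)."""
    template_fields = ('bash_command',)

    def __init__(self):
        self.bash_command = 'echo {{ ds }}'   # unrendered Jinja template

# Copy first, so the cached DAG object keeps its unrendered templates.
task = copy.copy(Task())

# In the real view this dict comes from the database; hard-coded here.
rtif = {'bash_command': 'echo 2020-03-02'}

if rtif:
    for field_name, rendered_value in rtif.items():
        setattr(task, field_name, rendered_value)
```

Because the overlay touches only the copy, repeated requests see consistent rendered values without mutating the shared `dagbag` state.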
[GitHub] [airflow] kaxil commented on a change in pull request #6788: WIP: [AIRFLOW-5944] Rendering templated_fields without accessing DAG files
kaxil commented on a change in pull request #6788: WIP: [AIRFLOW-5944] Rendering templated_fields without accessing DAG files URL: https://github.com/apache/airflow/pull/6788#discussion_r386480519

## File path: airflow/utils/operator_helpers.py

```python
        params[AIRFLOW_VAR_NAME_FORMAT_MAPPING['AIRFLOW_CONTEXT_DAG_RUN_ID'][
            name_format]] = dag_run.run_id
    return params


def serialize_template_field(template_field):
```

Review comment: Moved
[GitHub] [airflow] kaxil commented on a change in pull request #6788: WIP: [AIRFLOW-5944] Rendering templated_fields without accessing DAG files
kaxil commented on a change in pull request #6788: WIP: [AIRFLOW-5944] Rendering templated_fields without accessing DAG files URL: https://github.com/apache/airflow/pull/6788#discussion_r386480422

## File path: airflow/serialization/serialized_objects.py

```python
        if op.operator_extra_links:
            serialize_op['_operator_extra_links'] = \
                cls._serialize_operator_extra_links(op.operator_extra_links)
        serialize_op['_templated_fields'] = {
            field: serialize_template_field(getattr(op, field)) for field in op.template_fields
```

Review comment: Removing the unrendered part from this PR. Will create a separate PR to add that.
[GitHub] [airflow] feluelle commented on issue #6576: [AIRFLOW-5922] Add option to specify the mysql client library used in MySqlHook
feluelle commented on issue #6576: [AIRFLOW-5922] Add option to specify the mysql client library used in MySqlHook URL: https://github.com/apache/airflow/pull/6576#issuecomment-593466069 @potiuk @mik-laj any final comments before I merge it? I would really like to get more feedback on this.
[GitHub] [airflow] feluelle edited a comment on issue #6578: [AIRFLOW-5907] Add S3 to MySql Operator
feluelle edited a comment on issue #6578: [AIRFLOW-5907] Add S3 to MySql Operator URL: https://github.com/apache/airflow/pull/6578#issuecomment-593462941 @RosterIn WDYT of this now? I personally don't like the `extra_options` (I added it to `mysql` myself in `bulk_load_custom` :D). So now you are able to specify these options in the transfer operation, meaning you can load json or csv or whatever, as long as LOAD DATA supports it. **EDIT:** I don't like the `extra_options` because it is not fully clear which options from https://dev.mysql.com/doc/refman/8.0/en/load-data.html these can be. But the issue is more because of `bulk_load_custom` than because of the implementation of S3ToMySql, I would say.
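The concern about `extra_options` becomes concrete when you see how such options typically end up inside a `LOAD DATA` statement: they are spliced into the SQL verbatim, so any suffix MySQL accepts can appear there and nothing constrains them up front. The helper below is an illustrative sketch of that pattern, not the actual `bulk_load_custom` code:

```python
def build_load_data_sql(table: str, tmp_file: str, extra_options: str = '') -> str:
    """Assemble a LOAD DATA statement with free-form trailing options.

    `extra_options` is appended verbatim, which is exactly why it is hard
    to document or validate: any clause MySQL's LOAD DATA grammar allows
    (FIELDS ..., LINES ..., IGNORE n LINES, ...) can legally appear there.
    """
    return (
        f"LOAD DATA LOCAL INFILE '{tmp_file}' "
        f"INTO TABLE {table} "
        f"{extra_options}"
    ).strip()


sql = build_load_data_sql(
    'mytable', '/tmp/data.csv',
    extra_options="FIELDS TERMINATED BY ',' IGNORE 1 LINES")
```

A stricter alternative would expose named parameters (delimiter, line terminator, header rows) instead of one opaque string, trading flexibility for validation.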
[GitHub] [airflow] feluelle edited a comment on issue #6578: [AIRFLOW-5907] Add S3 to MySql Operator
feluelle edited a comment on issue #6578: [AIRFLOW-5907] Add S3 to MySql Operator URL: https://github.com/apache/airflow/pull/6578#issuecomment-593462941 @RosterIn WDYT of this now? I personally don't like the `extra_options` (I added to `mysql` myself to `bulk_load_custom` :D). So now you are able to specify these options in the transfer operation. So you can load json or csv or whatever as long as LOAD DATA supports it. **EDIT:** I don't like the `extra_options` because it is not fully clear what options of https://dev.mysql.com/doc/refman/8.0/en/load-data.html these can be.
[GitHub] [airflow] feluelle commented on issue #6578: [AIRFLOW-5907] Add S3 to MySql Operator
feluelle commented on issue #6578: [AIRFLOW-5907] Add S3 to MySql Operator URL: https://github.com/apache/airflow/pull/6578#issuecomment-593462941 @RosterIn WDYT of this now? I personally don't like the `extra_options` (I added to `mysql` myself to `bulk_load_custom` :D). So now you are able to specify these options in the transfer operation. So you can load json or csv or whatever as long as LOAD DATA supports it.
[jira] [Commented] (AIRFLOW-2325) Task logging with AWS Cloud watch
[ https://issues.apache.org/jira/browse/AIRFLOW-2325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17049286#comment-17049286 ] ASF subversion and git services commented on AIRFLOW-2325: -- Commit 1e3cdddcd87be3c0f11b43efea11cdbddaff4470 in airflow's branch refs/heads/master from Daniel Hegberg [ https://gitbox.apache.org/repos/asf?p=airflow.git;h=1e3cddd ] [AIRFLOW-2325] Add CloudwatchTaskHandler option for remote task logging to Cloudwatch (#7437) > Task logging with AWS Cloud watch > - > > Key: AIRFLOW-2325 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2325 > Project: Apache Airflow > Issue Type: New Feature > Components: logging > Reporter: Fang-Pen Lin > Assignee: Daniel Hegberg > Priority: Minor > Fix For: 2.0.0 > > > In many cases, it's ideal to use remote logging while running Airflow in > production, as the workers could easily be scaled down or up. Or the > worker is running in containers, where the local storage is not meant to be > there forever. In that case, the S3 task logging handler could be used > [https://github.com/apache/incubator-airflow/blob/master/airflow/utils/log/s3_task_handler.py] > However, it comes with a drawback. The S3 logging handler only uploads the log when > the task has completed or failed. For long-running tasks, it's hard to know > what's going on with the process until it finishes. > To make logging more real-time, I built a logging handler based on AWS > CloudWatch. It uses a third-party python package `watchtower` > > [https://github.com/kislyuk/watchtower/tree/master/watchtower] > > I created a PR here [https://github.com/apache/incubator-airflow/pull/3229]; > basically I just copy-pasted the code I wrote for my own project. It works > fine with the 1.9 release, but was never tested with the master branch. Also, there is a > bug in watchtower causing the task runner to hang forever when it completes. I > created an issue in their repo > [https://github.com/kislyuk/watchtower/issues/57] > and a PR for addressing that issue > [https://github.com/kislyuk/watchtower/pull/58] > > The PR is still far from ready to be reviewed, but I just want to get some > feedback before I spend more time on it. I would like to see if you guys want > this cloudwatch handler to go into the main repo, or whether you guys prefer it to > be a standalone third-party module. If that's the case, I can close this > ticket and create a standalone repo on my own. If the PR is welcome, then I > can spend more time on polishing it based on your feedback, adding tests / > documents and other stuff. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (AIRFLOW-2325) Task logging with AWS Cloud watch
[ https://issues.apache.org/jira/browse/AIRFLOW-2325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17049285#comment-17049285 ] ASF subversion and git services commented on AIRFLOW-2325: -- Commit 1e3cdddcd87be3c0f11b43efea11cdbddaff4470 in airflow's branch refs/heads/master from Daniel Hegberg [ https://gitbox.apache.org/repos/asf?p=airflow.git;h=1e3cddd ] [AIRFLOW-2325] Add CloudwatchTaskHandler option for remote task logging to Cloudwatch (#7437)
[jira] [Resolved] (AIRFLOW-2325) Task logging with AWS Cloud watch
[ https://issues.apache.org/jira/browse/AIRFLOW-2325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felix Uellendall resolved AIRFLOW-2325. --- Fix Version/s: 2.0.0 Resolution: Fixed
[jira] [Commented] (AIRFLOW-2325) Task logging with AWS Cloud watch
[ https://issues.apache.org/jira/browse/AIRFLOW-2325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17049275#comment-17049275 ] ASF GitHub Bot commented on AIRFLOW-2325: - feluelle commented on pull request #7437: [AIRFLOW-2325] Add CloudwatchTaskHandler option for remote task loggi… URL: https://github.com/apache/airflow/pull/7437
[GitHub] [airflow] feluelle commented on issue #7437: [AIRFLOW-2325] Add CloudwatchTaskHandler option for remote task loggi…
feluelle commented on issue #7437: [AIRFLOW-2325] Add CloudwatchTaskHandler option for remote task loggi… URL: https://github.com/apache/airflow/pull/7437#issuecomment-593443949 Nice work @dhegberg!
[GitHub] [airflow] feluelle merged pull request #7437: [AIRFLOW-2325] Add CloudwatchTaskHandler option for remote task loggi…
feluelle merged pull request #7437: [AIRFLOW-2325] Add CloudwatchTaskHandler option for remote task loggi… URL: https://github.com/apache/airflow/pull/7437
[GitHub] [airflow] boring-cyborg[bot] commented on issue #7437: [AIRFLOW-2325] Add CloudwatchTaskHandler option for remote task loggi…
boring-cyborg[bot] commented on issue #7437: [AIRFLOW-2325] Add CloudwatchTaskHandler option for remote task loggi… URL: https://github.com/apache/airflow/pull/7437#issuecomment-593443592 Awesome work, congrats on your first merged pull request!
[GitHub] [airflow] mik-laj commented on issue #7597: [AIRFLOW-6497] Avoid loading DAGs in the main scheduler loop
mik-laj commented on issue #7597: [AIRFLOW-6497] Avoid loading DAGs in the main scheduler loop URL: https://github.com/apache/airflow/pull/7597#issuecomment-593431538 This also protects against killing the scheduler by incorrect error handling functions. If sys.exit(1) appears in the handler code, the scheduler will not be stopped. It will only affect one DAG.
[GitHub] [airflow] mik-laj edited a comment on issue #7597: [AIRFLOW-6497] Avoid loading DAGs in the main scheduler loop
mik-laj edited a comment on issue #7597: [AIRFLOW-6497] Avoid loading DAGs in the main scheduler loop URL: https://github.com/apache/airflow/pull/7597#issuecomment-593431538 This also protects against killing the scheduler by incorrect error handling functions. If sys.exit(1) appears in the handler code, the scheduler will not be stopped. It will only affect one DAG file.
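The protection described above can be illustrated with a minimal sketch. This is not Airflow's actual implementation; the `run_user_callback` and `bad_handler` names are hypothetical. The point is that `sys.exit(1)` raised inside a child process only terminates that child, and the parent merely observes an exit code:

```python
import multiprocessing


def run_user_callback(callback):
    # Run untrusted handler code in a child process so that a stray
    # sys.exit() (or a hard crash) cannot take down the parent loop.
    proc = multiprocessing.Process(target=callback)
    proc.start()
    proc.join()
    return proc.exitcode


def bad_handler():
    # Equivalent to calling sys.exit(1) inside user handler code;
    # run in-process, this would terminate the whole interpreter.
    raise SystemExit(1)


exit_code = run_user_callback(bad_handler)
print("handler exit code:", exit_code)  # parent process survives
```

The same isolation applies to DAG file parsing: a faulty callback affects only the subprocess handling that one file.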
[GitHub] [airflow] mik-laj commented on a change in pull request #7597: [AIRFLOW-6497] Avoid loading DAGs in the main scheduler loop
mik-laj commented on a change in pull request #7597: [AIRFLOW-6497] Avoid loading DAGs in the main scheduler loop URL: https://github.com/apache/airflow/pull/7597#discussion_r386425807 ## File path: airflow/jobs/scheduler_job.py ## @@ -1616,26 +1633,6 @@ def _validate_and_run_task_instances(self, simple_dag_bag: SimpleDagBag) -> bool self._process_executor_events(simple_dag_bag) return True -def _process_and_execute_tasks(self, simple_dag_bag): Review comment: I will revert this change and propose it in a separate PR.
[GitHub] [airflow] mik-laj commented on issue #7597: [AIRFLOW-6497] Avoid loading DAGs in the main scheduler loop
mik-laj commented on issue #7597: [AIRFLOW-6497] Avoid loading DAGs in the main scheduler loop URL: https://github.com/apache/airflow/pull/7597#issuecomment-593430117 It is worth noting that this also solves one more problem. The modules are always reloaded. https://issues.apache.org/jira/projects/AIRFLOW/issues/AIRFLOW-6497?filter=allopenissues So when someone makes a change in an additional module, the new version is executed; the old version is not. This can be a problem because handlers are often stored in helper functions and are shared among many DAGs.
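The caching behavior described above can be demonstrated with ordinary Python (this is an illustrative sketch, not Airflow code; the `helper` module name is made up): a long-lived process keeps serving the old module object from `sys.modules`, while an explicit reload, or a fresh subprocess per parse, picks up the change.

```python
import importlib
import pathlib
import sys
import tempfile

sys.dont_write_bytecode = True  # keep stale .pyc files out of the demo

tmp = tempfile.mkdtemp()
pathlib.Path(tmp, "helper.py").write_text("VALUE = 1\n")
sys.path.insert(0, tmp)
importlib.invalidate_caches()

import helper
first = helper.VALUE            # 1

pathlib.Path(tmp, "helper.py").write_text("VALUE = 2\n")
import helper                   # cached in sys.modules: old code still runs
cached = helper.VALUE           # still 1

importlib.invalidate_caches()
importlib.reload(helper)        # a fresh subprocess would also see the change
reloaded = helper.VALUE         # 2

print(first, cached, reloaded)
```

Because the scheduler change parses each DAG file in a fresh process, no explicit `reload` bookkeeping is needed: every parse starts with an empty module cache.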
[GitHub] [airflow] mik-laj edited a comment on issue #7597: [AIRFLOW-6497] Avoid loading DAGs in the main scheduler loop
mik-laj edited a comment on issue #7597: [AIRFLOW-6497] Avoid loading DAGs in the main scheduler loop URL: https://github.com/apache/airflow/pull/7597#issuecomment-593427244

> Do you have any numbers for this please?

It is very difficult to measure because it depends on the specific DAG file. Some DAG files take 30 seconds or more to load. During this time, the scheduler loop is stopped and does not start any new tasks. I can measure how long it takes to load example_dags, but that is just a subset of cases and doesn't provide real values... but I created a spreadsheet. When I ran the following script:

```python
import os
import sys
import time
from contextlib import contextmanager

import psutil

from airflow.models import DagBag


@contextmanager
def timing_ctx():
    time1 = time.time()
    try:
        yield
    finally:
        time2 = time.time()
        diff = (time2 - time1) * 1000.0
        print('Time: %0.3f ms' % diff)


def get_process_memory():
    process = psutil.Process(os.getpid())
    return process.memory_info().rss


@contextmanager
def memory_ctx():
    before = get_process_memory()
    try:
        yield
    finally:
        after = get_process_memory()
        diff = after - before
        print('Memory: %d bytes' % diff)


filename = sys.argv[1]
with timing_ctx(), memory_ctx():
    print("Filename:", filename)
    DagBag(dag_folder=filename, include_examples=False, store_serialized_dags=False)
```

```
find airflow/providers/google/cloud/example_dags/ -type f | sort | grep -v "__init__.py" | xargs -n 1 readlink -e | xargs -t -n 1 python /files/performance/load_dag_perf_test.py
```

I got the following values: https://docs.google.com/spreadsheets/d/1T0kLEQLSU5ujxU-W_PoxddbkEgWx70EQkpwjRLNWaic/edit?usp=sharing In this case, IPC communication should be very fast. I suspect it should take less than 3% of all DAG loading time. So it can be assumed that the main loop is faster by roughly the time spent loading the module.
[jira] [Commented] (AIRFLOW-6747) UI - Show count of tasks in each dag on the main dags page
[ https://issues.apache.org/jira/browse/AIRFLOW-6747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17049250#comment-17049250 ] Ebrima Jallow commented on AIRFLOW-6747: Thanks. I am looking into it. > UI - Show count of tasks in each dag on the main dags page > -- > > Key: AIRFLOW-6747 > URL: https://issues.apache.org/jira/browse/AIRFLOW-6747 > Project: Apache Airflow > Issue Type: Improvement > Components: ui > Affects Versions: 1.10.7 > Reporter: t oo > Assignee: Ebrima Jallow > Priority: Minor > Labels: gsoc, gsoc2020, mentor > > Main DAGs page in UI - would benefit from showing a new column: number of > tasks for each dag id
[GitHub] [airflow] mik-laj commented on issue #7597: [AIRFLOW-6497] Avoid loading DAGs in the main scheduler loop
mik-laj commented on issue #7597: [AIRFLOW-6497] Avoid loading DAGs in the main scheduler loop URL: https://github.com/apache/airflow/pull/7597#issuecomment-593427244

> Do you have any numbers for this please?

It is very difficult to measure because it depends on the specific DAG file. Some DAG files take 30 seconds or more to load. During this time, the scheduler loop is stopped and does not start any new tasks. I can measure how long it takes to load example_dags, but that is just a subset of cases and doesn't provide real values... but I created a spreadsheet. When I ran the following script:

```python
import os
import sys
import time
from contextlib import contextmanager

import psutil

from airflow.models import DagBag


@contextmanager
def timing_ctx():
    time1 = time.time()
    try:
        yield
    finally:
        time2 = time.time()
        diff = (time2 - time1) * 1000.0
        print('Time: %0.3f ms' % diff)


def get_process_memory():
    process = psutil.Process(os.getpid())
    return process.memory_info().rss


@contextmanager
def memory_ctx():
    before = get_process_memory()
    try:
        yield
    finally:
        after = get_process_memory()
        diff = after - before
        print('Memory: %d bytes' % diff)


filename = sys.argv[1]
with timing_ctx(), memory_ctx():
    print("Filename:", filename)
    DagBag(dag_folder=filename, include_examples=False, store_serialized_dags=False)
```

```
find airflow/providers/google/cloud/example_dags/ -type f | sort | grep -v "__init__.py" | xargs -n 1 readlink -e | xargs -t -n 1 python /files/performance/load_dag_perf_test.py
```

I got the following values: https://docs.google.com/spreadsheets/d/1T0kLEQLSU5ujxU-W_PoxddbkEgWx70EQkpwjRLNWaic/edit?usp=sharing
[jira] [Assigned] (AIRFLOW-824) Allow writing to XCOM values via API
[ https://issues.apache.org/jira/browse/AIRFLOW-824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robin Miller reassigned AIRFLOW-824: Assignee: (was: Robin Miller) > Allow writing to XCOM values via API > > > Key: AIRFLOW-824 > URL: https://issues.apache.org/jira/browse/AIRFLOW-824 > Project: Apache Airflow > Issue Type: Improvement > Reporter: Robin Miller > Priority: Minor
[jira] [Closed] (AIRFLOW-6931) One migration failed during "airflow initdb" in mssql server 2017
[ https://issues.apache.org/jira/browse/AIRFLOW-6931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Baoshan Gu closed AIRFLOW-6931. --- Resolution: Not A Problem > One migration failed during "airflow initdb" in mssql server 2017 > - > > Key: AIRFLOW-6931 > URL: https://issues.apache.org/jira/browse/AIRFLOW-6931 > Project: Apache Airflow > Issue Type: Bug > Components: database > Affects Versions: 1.10.9 > Environment: microsoft sqlserver 2017 > Reporter: Baoshan Gu > Priority: Major > > Running "airflow initdb" got an error: > {code:java} > _mssql.MSSQLDatabaseException: (5074, b"The object > 'UQ__dag_run__F78A9899295C1915' is dependent on column > 'execution_date'.DB-Lib error message 20018, severity 16:\nGeneral SQL Server > error: Check messages from the SQL Server\nDB-Lib error message 20018, > severity 16:\nGeneral SQL Server error: Check messages from the SQL Server\n") > {code} > The issue is that migration file > [74effc47d867_change_datetime_to_datetime2_6_on_mssql_.py|https://github.com/apache/airflow/blob/master/airflow/migrations/versions/74effc47d867_change_datetime_to_datetime2_6_on_mssql_.py#L235] > does not find all constraints. > Confirmed that changing it to a case-insensitive selection works: > {code}(tc.CONSTRAINT_TYPE = 'PRIMARY KEY' or LOWER(tc.CONSTRAINT_TYPE) = > 'unique'){code}
[jira] [Commented] (AIRFLOW-6931) One migration failed during "airflow initdb" in mssql server 2017
[ https://issues.apache.org/jira/browse/AIRFLOW-6931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17049221#comment-17049221 ] Baoshan Gu commented on AIRFLOW-6931: - Changing the SQL server collate to Latin1_General_CI_AI works without any code changes. I am closing the ticket.
[jira] [Commented] (AIRFLOW-6931) One migration failed during "airflow initdb" in mssql server 2017
[ https://issues.apache.org/jira/browse/AIRFLOW-6931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17049219#comment-17049219 ] ASF GitHub Bot commented on AIRFLOW-6931: - BaoshanGu commented on pull request #7574: [AIRFLOW-6931] Fixed migrations to find all dependencies for mssql URL: https://github.com/apache/airflow/pull/7574
[GitHub] [airflow] BaoshanGu commented on issue #7574: [AIRFLOW-6931] Fixed migrations to find all dependencies for mssql
BaoshanGu commented on issue #7574: [AIRFLOW-6931] Fixed migrations to find all dependencies for mssql URL: https://github.com/apache/airflow/pull/7574#issuecomment-593409257 Changing the SQL server collate to Latin1_General_CI_AI works without any code changes. I am closing the PR.
[GitHub] [airflow] BaoshanGu closed pull request #7574: [AIRFLOW-6931] Fixed migrations to find all dependencies for mssql
BaoshanGu closed pull request #7574: [AIRFLOW-6931] Fixed migrations to find all dependencies for mssql URL: https://github.com/apache/airflow/pull/7574
[GitHub] [airflow] mik-laj commented on a change in pull request #7597: [AIRFLOW-6497] Avoid loading DAGs in the main scheduler loop
mik-laj commented on a change in pull request #7597: [AIRFLOW-6497] Avoid loading DAGs in the main scheduler loop URL: https://github.com/apache/airflow/pull/7597#discussion_r386390242 ## File path: airflow/utils/dag_processing.py ## @@ -655,6 +693,7 @@ def start(self): # Update number of loop iteration. self._num_run += 1 +simple_dags = self.collect_results() Review comment: Yes. First we need to create processes, and then we can read the results. Otherwise, we will never get the value on the first iteration of the loop. Other solution: we can increase the loop iteration number: https://github.com/apache/airflow/blob/cb455dc81162680f90edcd78400e1ef46c09766d/tests/utils/test_dag_processing.py#L291
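The start-then-collect ordering discussed in the review can be sketched with a toy model (hypothetical names, not Airflow's real `DagFileProcessorManager`): each iteration first launches worker processes, then drains their results; collecting before anything has been started would leave the first iteration with nothing to read.

```python
import multiprocessing


def parse_file(path, results):
    # Stand-in for parsing a DAG file in a child process.
    results.put((path, "parsed " + path))


def manager_loop(paths):
    results = multiprocessing.Queue()
    procs = []
    for path in paths:
        # Start the processes first...
        p = multiprocessing.Process(target=parse_file, args=(path, results))
        p.start()
        procs.append(p)
    for p in procs:
        p.join()
    # ...and only then collect their results.
    return sorted(results.get() for _ in procs)


print(manager_loop(["a.py", "b.py"]))
```

In the real manager the collect step runs at the top of the next loop iteration rather than after a blocking join, but the ordering constraint is the same.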
[GitHub] [airflow] zhongjiajie edited a comment on issue #7593: [AIRFLOW-6959] Use NULL as dag.description default value and change r…
zhongjiajie edited a comment on issue #7593: [AIRFLOW-6959] Use NULL as dag.description default value and change r… URL: https://github.com/apache/airflow/pull/7593#issuecomment-592950147 **The detail change as below** **UPDATE AT 2020-03-02**: section `homepage without description` and `dag detail page without description` are same as before.

| section | old | new |
| :-- | :-- | :-- |
| database | empty string ![](https://i.loli.net/2020/02/29/QyXGLTSztmh7vUD.png) | null value ![](https://i.loli.net/2020/02/29/zCwlgTdiyGbkrPB.png) |
| ~~homepage without description~~ | | |
| dag detail page with description | ![](https://i.loli.net/2020/02/29/CzXxy5tURBnNLDv.png) | ![](https://i.loli.net/2020/02/29/MN4UX8IPpLWfgiS.png) |
| ~~dag detail page without description~~ | | |