[jira] [Commented] (AIRFLOW-6824) EMRAddStepsOperator does not work well with multi-step XCom
[ https://issues.apache.org/jira/browse/AIRFLOW-6824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17045157#comment-17045157 ]

ASF subversion and git services commented on AIRFLOW-6824:
----------------------------------------------------------

Commit 3ea3e1a2b580b7ed10efe668de0cc37b03673500 in airflow's branch refs/heads/master from Bjorn Olsen
[ https://gitbox.apache.org/repos/asf?p=airflow.git;h=3ea3e1a ]

[AIRFLOW-6824] EMRAddStepsOperator problem with multi-step XCom (#7443)

> EMRAddStepsOperator does not work well with multi-step XCom
> -----------------------------------------------------------
>
>                 Key: AIRFLOW-6824
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-6824
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: aws
>    Affects Versions: 1.10.9
>            Reporter: Bjorn Olsen
>            Assignee: Bjorn Olsen
>            Priority: Minor
>             Fix For: 2.0.0
>
> EmrAddStepsOperator allows you to add several steps to EMR for processing;
> the steps must be supplied as a list.
> This works well when passing an actual Python list as the 'steps' value, but
> we want to be able to generate the list of steps from a previous task, using
> an XCom.
> We must use the operator as follows for the templating to work correctly and
> for it to resolve the XCom:
>
> {code:python}
> add_steps_task = EmrAddStepsOperator(
>     task_id='add_steps',
>     job_flow_id=job_flow_id,
>     aws_conn_id='aws_default',
>     provide_context=True,
>     steps="{{task_instance.xcom_pull(task_ids='generate_steps')}}"
> )
> {code}
>
> The value in XCom from the 'generate_steps' task looks like this (simplified):
> {code:python}
> [{'Name': 'Step1'}, {'Name': 'Step2'}]
> {code}
> However, this is passed to the operator as a string, which cannot be handed to
> the underlying boto3 library, which expects a list object.
> The following won't work either, since it is not valid Python:
> {code:python}
> add_steps_task = EmrAddStepsOperator(
>     task_id='add_steps',
>     job_flow_id=job_flow_id,
>     aws_conn_id='aws_default',
>     provide_context=True,
>     steps={{task_instance.xcom_pull(task_ids='generate_steps')}}
> )
> {code}
> We have to pass the steps as a string to the operator, and then convert it
> into a list after render_template_fields has happened (immediately before
> the execute). Therefore the only option is to do the conversion from string
> to list in the operator's execute method.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
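The string-to-list conversion the issue describes can be sketched as follows. This is a minimal, hypothetical illustration (not necessarily the exact code merged in #7443): after templating renders `steps` to a string, the operator's execute method can parse it back into a list with `ast.literal_eval`, which safely evaluates Python literals without executing arbitrary code.

```python
import ast


def parse_rendered_steps(steps):
    """Convert rendered template output back into a list of EMR steps.

    After Jinja templating, `steps` may arrive as a str such as
    "[{'Name': 'Step1'}, {'Name': 'Step2'}]" rather than a list,
    which boto3's add_job_flow_steps cannot accept.
    """
    if isinstance(steps, str):
        # literal_eval parses Python literals (lists, dicts, strings, numbers)
        # and raises ValueError on anything else, unlike eval().
        steps = ast.literal_eval(steps)
    return steps


rendered = "[{'Name': 'Step1'}, {'Name': 'Step2'}]"
steps = parse_rendered_steps(rendered)
```

A list passed directly is returned unchanged, so the same execute path handles both templated and non-templated input.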
[jira] [Resolved] (AIRFLOW-6824) EMRAddStepsOperator does not work well with multi-step XCom
[ https://issues.apache.org/jira/browse/AIRFLOW-6824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kamil Bregula resolved AIRFLOW-6824.
------------------------------------
    Fix Version/s: 2.0.0
       Resolution: Fixed
[jira] [Commented] (AIRFLOW-6824) EMRAddStepsOperator does not work well with multi-step XCom
[ https://issues.apache.org/jira/browse/AIRFLOW-6824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17045156#comment-17045156 ]

ASF GitHub Bot commented on AIRFLOW-6824:
-----------------------------------------

mik-laj commented on pull request #7443: [AIRFLOW-6824] EMRAddStepsOperator problem with multi-step XCom
URL: https://github.com/apache/airflow/pull/7443

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] mik-laj merged pull request #7443: [AIRFLOW-6824] EMRAddStepsOperator problem with multi-step XCom
mik-laj merged pull request #7443: [AIRFLOW-6824] EMRAddStepsOperator problem with multi-step XCom
URL: https://github.com/apache/airflow/pull/7443

With regards,
Apache Git Services
[jira] [Commented] (AIRFLOW-6822) AWS hooks don't always cache the boto3 client
[ https://issues.apache.org/jira/browse/AIRFLOW-6822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17045153#comment-17045153 ]

ASF GitHub Bot commented on AIRFLOW-6822:
-----------------------------------------

baolsen commented on pull request #7541: [AIRFLOW-6822] AWS hooks should cache boto3 client
URL: https://github.com/apache/airflow/pull/7541

---
Issue link: WILL BE INSERTED BY [boring-cyborg](https://github.com/kaxil/boring-cyborg)

Make sure to mark the boxes below before creating PR:
- [x] Description above provides context of the change
- [x] Commit message/PR title starts with `[AIRFLOW-]`. AIRFLOW- = JIRA ID*
- [x] Unit tests coverage for changes (not needed for documentation changes)
- [x] Commits follow "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)"
- [x] Relevant documentation is updated including usage instructions.
- [x] I will engage committers as explained in [Contribution Workflow Example](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#contribution-workflow-example).

* For document-only changes commit message can start with `[AIRFLOW-]`.

---
In case of fundamental code change, an Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)) is needed.
In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x).
In case of backwards-incompatible changes, please leave a note in [UPDATING.md](https://github.com/apache/airflow/blob/master/UPDATING.md).
Read the [Pull Request Guidelines](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines) for more information.

> AWS hooks don't always cache the boto3 client
> ---------------------------------------------
>
>                 Key: AIRFLOW-6822
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-6822
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: aws
>    Affects Versions: 1.10.9
>            Reporter: Bjorn Olsen
>            Assignee: Bjorn Olsen
>            Priority: Minor
>
> Implementation of the Amazon AWS hooks (e.g. the S3 hook, the Glue hook, etc.) varies
> in how they call the underlying aws_hook.get_client_type(X) method.
> Most of the time the client that gets returned is cached by the superclass,
> but not always. The client should always be cached for performance reasons;
> creating a client is a time-consuming process.
> Example of how to do it (athena.py):
>
> {code:python}
> def get_conn(self):
>     """
>     Check if an AWS conn exists already, or create one, and return it.
>     :return: boto3 session
>     """
>     if not self.conn:
>         self.conn = self.get_client_type('athena')
>     return self.conn
> {code}
>
> Example of how not to do it (s3.py):
>
> {code:python}
> def get_conn(self):
>     return self.get_client_type('s3')
> {code}
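The caching pattern above can be illustrated with a minimal, self-contained sketch. The classes below are hypothetical stand-ins (not Airflow's real hook classes) that count client creations to show why the athena.py style is preferable:

```python
class BaseAwsHook:
    """Minimal stand-in for an AWS base hook, for illustration only."""

    def __init__(self):
        self.conn = None
        self.clients_created = 0  # instrumentation for this example

    def get_client_type(self, client_type):
        # In Airflow this would build a boto3 client (an expensive call);
        # here we just count invocations to show the effect of caching.
        self.clients_created += 1
        return {'client_type': client_type}


class CachedAthenaHook(BaseAwsHook):
    def get_conn(self):
        # athena.py style: create the client once and reuse it.
        if not self.conn:
            self.conn = self.get_client_type('athena')
        return self.conn


class UncachedS3Hook(BaseAwsHook):
    def get_conn(self):
        # s3.py anti-pattern: a fresh client on every call.
        return self.get_client_type('s3')


cached = CachedAthenaHook()
cached.get_conn()
cached.get_conn()   # reuses the cached client; still only one creation

uncached = UncachedS3Hook()
uncached.get_conn()
uncached.get_conn()  # creates a second client
```

The difference matters in hot paths such as sensors, where `get_conn` may be called on every poke interval.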
[GitHub] [airflow] baolsen opened a new pull request #7541: [AIRFLOW-6822] AWS hooks should cache boto3 client
baolsen opened a new pull request #7541: [AIRFLOW-6822] AWS hooks should cache boto3 client
URL: https://github.com/apache/airflow/pull/7541
[GitHub] [airflow] codecov-io edited a comment on issue #7476: [AIRFLOW-6856][depends on AIRFLOW-6862][WIP] Bulk fetch paused_dag_ids
codecov-io edited a comment on issue #7476: [AIRFLOW-6856][depends on AIRFLOW-6862][WIP] Bulk fetch paused_dag_ids
URL: https://github.com/apache/airflow/pull/7476#issuecomment-589383726

# Codecov Report
> Merging #7476 into master (0ec2774) will **decrease** coverage by `0.36%`.
> The diff coverage is `100%`.

```diff
@@            Coverage Diff             @@
##           master    #7476      +/-   ##
==========================================
- Coverage   86.81%   86.44%   -0.37%
==========================================
  Files         893      893
  Lines       42193    42342    +149
==========================================
- Hits        36629    36604     -25
- Misses       5564     5738    +174
```

| Impacted Files | Coverage Δ | |
|---|---|---|
| airflow/utils/dag_processing.py | `88.63% <ø> (+0.7%)` | :arrow_up: |
| airflow/dag/base_dag.py | `69.56% <ø> (+1.56%)` | :arrow_up: |
| airflow/jobs/scheduler_job.py | `90.29% <100%> (+0.05%)` | :arrow_up: |
| airflow/kubernetes/volume_mount.py | `44.44% <0%> (-55.56%)` | :arrow_down: |
| airflow/providers/postgres/operators/postgres.py | `50% <0%> (-50%)` | :arrow_down: |
| airflow/kubernetes/volume.py | `52.94% <0%> (-47.06%)` | :arrow_down: |
| airflow/kubernetes/pod_launcher.py | `47.18% <0%> (-45.08%)` | :arrow_down: |
| ...roviders/google/cloud/operators/postgres_to_gcs.py | `52.94% <0%> (-32.36%)` | :arrow_down: |
| ...viders/cncf/kubernetes/operators/kubernetes_pod.py | `69.69% <0%> (-25.26%)` | :arrow_down: |
| airflow/kubernetes/refresh_config.py | `50.98% <0%> (-23.53%)` | :arrow_down: |
| ... and 15 more | | |

> Powered by Codecov. Last update 0ec2774...72478de.
[GitHub] [airflow] codecov-io edited a comment on issue #7476: [AIRFLOW-6856][depends on AIRFLOW-6862][WIP] Bulk fetch paused_dag_ids
codecov-io edited a comment on issue #7476: [AIRFLOW-6856][depends on AIRFLOW-6862][WIP] Bulk fetch paused_dag_ids
URL: https://github.com/apache/airflow/pull/7476#issuecomment-589383726

# Codecov Report
> Merging #7476 into master (0ec2774) will **decrease** coverage by `0.26%`.
> The diff coverage is `100%`.

```diff
@@            Coverage Diff             @@
##           master    #7476      +/-   ##
==========================================
- Coverage   86.81%   86.55%   -0.27%
==========================================
  Files         893      893
  Lines       42193    42342    +149
==========================================
+ Hits        36629    36647     +18
- Misses       5564     5695    +131
```

| Impacted Files | Coverage Δ | |
|---|---|---|
| airflow/utils/dag_processing.py | `88.63% <ø> (+0.7%)` | :arrow_up: |
| airflow/dag/base_dag.py | `69.56% <ø> (+1.56%)` | :arrow_up: |
| airflow/jobs/scheduler_job.py | `90.72% <100%> (+0.48%)` | :arrow_up: |
| airflow/kubernetes/volume_mount.py | `44.44% <0%> (-55.56%)` | :arrow_down: |
| airflow/kubernetes/volume.py | `52.94% <0%> (-47.06%)` | :arrow_down: |
| airflow/kubernetes/pod_launcher.py | `47.18% <0%> (-45.08%)` | :arrow_down: |
| ...viders/cncf/kubernetes/operators/kubernetes_pod.py | `69.69% <0%> (-25.26%)` | :arrow_down: |
| airflow/kubernetes/refresh_config.py | `50.98% <0%> (-23.53%)` | :arrow_down: |
| airflow/providers/amazon/aws/hooks/sns.py | `96.42% <0%> (-3.58%)` | :arrow_down: |
| airflow/models/dagrun.py | `95.81% <0%> (-0.76%)` | :arrow_down: |
| ... and 9 more | | |

> Powered by Codecov. Last update 0ec2774...72478de.
[GitHub] [airflow] mik-laj commented on a change in pull request #7537: [AIRFLOW-6454] Add script to benchmark scheduler dag-run time
mik-laj commented on a change in pull request #7537: [AIRFLOW-6454] Add script to benchmark scheduler dag-run time
URL: https://github.com/apache/airflow/pull/7537#discussion_r384258420

## File path: scripts/perf/scheduler_dag_execution_timing.py
## @@ -0,0 +1,221 @@

+#!/usr/bin/env python3
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+import gc
+import os
+import statistics
+import time
+
+import click
+
+
+class ShortCircutExecutorMixin:
+    def __init__(self, stop_when_these_completed):
+        super().__init__()
+        self.reset(stop_when_these_completed)
+
+    def reset(self, stop_when_these_completed):
+        self.stop_when_these_completed = {
+            # Store the date as a timestamp, as sometimes this is a Pendulum
+            # object, others it is a datetime object.
+            (run.dag_id, run.execution_date.timestamp()): run for run in stop_when_these_completed
+        }
+
+    def change_state(self, key, state):
+        from airflow.utils.state import State
+        super().change_state(key, state)
+
+        dag_id, task_id, execution_date, __ = key
+        run_key = (dag_id, execution_date.timestamp())
+        run = self.stop_when_these_completed.get(run_key, None)
+        if run and all(t.state == State.SUCCESS for t in run.get_task_instances()):
+            self.stop_when_these_completed.pop(run_key)
+
+            if not self.stop_when_these_completed:
+                self.log.warning("STOPPING SCHEDULER -- all runs complete")
+                self.scheduler_job.processor_agent._done = True
+            else:
+                self.log.warning("WAITING ON %d RUNS", len(self.stop_when_these_completed))
+        elif state == State.SUCCESS:
+            self.log.warning("WAITING ON %d RUNS", len(self.stop_when_these_completed))
+
+
+def get_executor_under_test():
+    try:
+        # Run against master and 1.10.x releases
+        from tests.test_utils.mock_executor import MockExecutor
+    except ImportError:
+        from tests.executors.test_executor import TestExecutor as MockExecutor
+
+    # from airflow.executors.local_executor import LocalExecutor
+
+    # Change this to try other executors
+    Executor = MockExecutor
+
+    class ShortCircutExecutor(ShortCircutExecutorMixin, Executor):
+        pass
+
+    return ShortCircutExecutor
+
+
+def reset_dag(dag, num_runs, session):
+    import airflow.models
+    from airflow.utils import timezone
+    from airflow.utils.state import State
+
+    DR = airflow.models.DagRun
+    DM = airflow.models.DagModel
+    TI = airflow.models.TaskInstance
+    TF = airflow.models.TaskFail
+    dag_id = dag.dag_id
+
+    session.query(DM).filter(DM.dag_id == dag_id).update({'is_paused': False})
+    session.query(DR).filter(DR.dag_id == dag_id).delete()
+    session.query(TI).filter(TI.dag_id == dag_id).delete()
+    session.query(TF).filter(TF.dag_id == dag_id).delete()
+
+    next_run_date = dag.normalize_schedule(dag.start_date or min(t.start_date for t in dag.tasks))
+
+    for _ in range(num_runs):
+        next_run = dag.create_dagrun(
+            run_id=DR.ID_PREFIX + next_run_date.isoformat(),
+            execution_date=next_run_date,
+            start_date=timezone.utcnow(),
+            state=State.RUNNING,
+            external_trigger=False,
+            session=session,
+        )
+        next_run_date = dag.following_schedule(next_run_date)
+    return next_run
+
+
+def pause_all_dags(session):
+    from airflow.models.dag import DagModel
+    session.query(DagModel).update({'is_paused': True})
+
+
+@click.command()
+@click.option('--num-runs', default=1, help='number of DagRun, to run for each DAG')
+@click.option('--repeat', default=3, help='number of times to run test, to reduce variance')
+@click.argument('dag_ids', required=True, nargs=-1)
+def main(num_runs, repeat, dag_ids):
+    """
+    This script will run the SchedulerJob for the specified dags "to completion".
+
+    That is it creates a fixed number of DAG runs for the specified DAGs (from
+    the configured dag path/example dags etc), disable the scheduler from
+    creating more, and then monitor them for completion. When the file task of
+    the final dag run is
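The repeat-and-measure approach behind the script's `--repeat` option can be sketched as a generic timing harness. This is an illustrative sketch, not part of the actual script; `run_once` stands in for a single scheduler-to-completion run:

```python
import statistics
import time


def benchmark(run_once, repeat=3):
    """Run `run_once` `repeat` times and report mean/stdev wall-clock time.

    Repeating the run and aggregating with `statistics` reduces the impact
    of run-to-run variance, which is the stated purpose of --repeat.
    """
    times = []
    for _ in range(repeat):
        start = time.perf_counter()
        run_once()
        times.append(time.perf_counter() - start)
    mean = statistics.mean(times)
    stdev = statistics.stdev(times) if len(times) > 1 else 0.0
    return mean, stdev


# Example: time a trivial workload three times.
mean, stdev = benchmark(lambda: sum(range(10_000)), repeat=3)
```

`time.perf_counter()` is used rather than `time.time()` because it is monotonic and has the highest available resolution for interval timing.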
[GitHub] [airflow] codecov-io commented on issue #7539: [AIRFLOW-6919] Make Breeze DAG-test friendly
codecov-io commented on issue #7539: [AIRFLOW-6919] Make Breeze DAG-test friendly
URL: https://github.com/apache/airflow/pull/7539#issuecomment-591192171

# Codecov Report
> Merging #7539 into master (3111406) will **decrease** coverage by `0.04%`.
> The diff coverage is `n/a`.

```diff
@@            Coverage Diff             @@
##           master    #7539      +/-   ##
==========================================
- Coverage   86.76%   86.72%   -0.05%
==========================================
  Files         896      896
  Lines       42649    42649
==========================================
- Hits        37005    36987     -18
- Misses       5644     5662     +18
```

| Impacted Files | Coverage Δ | |
|---|---|---|
| airflow/executors/sequential_executor.py | `56% <0%> (-44%)` | :arrow_down: |
| airflow/utils/helpers.py | `75.77% <0%> (-6.84%)` | :arrow_down: |
| airflow/utils/dag_processing.py | `81.83% <0%> (-6.12%)` | :arrow_down: |
| airflow/jobs/scheduler_job.py | `89.49% <0%> (-0.15%)` | :arrow_down: |
| airflow/utils/sqlalchemy.py | `84.93% <0%> (+1.36%)` | :arrow_up: |
| airflow/hooks/dbapi_hook.py | `91.73% <0%> (+1.65%)` | :arrow_up: |
| airflow/providers/postgres/hooks/postgres.py | `94.36% <0%> (+16.9%)` | :arrow_up: |
| ...roviders/google/cloud/operators/postgres_to_gcs.py | `85.29% <0%> (+32.35%)` | :arrow_up: |
| airflow/providers/postgres/operators/postgres.py | `100% <0%> (+50%)` | :arrow_up: |

> Powered by Codecov. Last update 3111406...c966f8d.
[GitHub] [airflow] codecov-io edited a comment on issue #7492: [AIRFLOW-6871] optimize tree view for large DAGs
codecov-io edited a comment on issue #7492: [AIRFLOW-6871] optimize tree view for large DAGs
URL: https://github.com/apache/airflow/pull/7492#issuecomment-589921201

# Codecov Report
> Merging #7492 into master (3111406) will **increase** coverage by `0.08%`.
> The diff coverage is `84.21%`.

```diff
@@            Coverage Diff             @@
##           master    #7492      +/-   ##
==========================================
+ Coverage   86.76%   86.85%   +0.08%
==========================================
  Files         896      896
  Lines       42649    42663      +14
==========================================
+ Hits        37005    37055      +50
+ Misses       5644     5608      -36
```

| Impacted Files | Coverage Δ | |
|---|---|---|
| airflow/www/views.py | `76.19% <84.21%> (-0.05%)` | :arrow_down: |
| airflow/jobs/scheduler_job.py | `90.07% <0%> (+0.43%)` | :arrow_up: |
| airflow/utils/sqlalchemy.py | `84.93% <0%> (+1.36%)` | :arrow_up: |
| airflow/hooks/dbapi_hook.py | `91.73% <0%> (+1.65%)` | :arrow_up: |
| airflow/providers/postgres/hooks/postgres.py | `94.36% <0%> (+16.9%)` | :arrow_up: |
| ...roviders/google/cloud/operators/postgres_to_gcs.py | `85.29% <0%> (+32.35%)` | :arrow_up: |
| airflow/providers/postgres/operators/postgres.py | `100% <0%> (+50%)` | :arrow_up: |

> Powered by Codecov. Last update 3111406...adf0c6d.
[GitHub] [airflow] kaxil merged pull request #7540: [AIRFLOW-XXXX] Add instructions on using templating in Bash Script
kaxil merged pull request #7540: [AIRFLOW-XXXX] Add instructions on using templating in Bash Script URL: https://github.com/apache/airflow/pull/7540 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [airflow] codecov-io edited a comment on issue #6342: [AIRFLOW-5662] fix incorrect naming for scheduler used slot metric
codecov-io edited a comment on issue #6342: [AIRFLOW-5662] fix incorrect naming for scheduler used slot metric URL: https://github.com/apache/airflow/pull/6342#issuecomment-547121627

# [Codecov](https://codecov.io/gh/apache/airflow/pull/6342?src=pr=h1) Report
> Merging [#6342](https://codecov.io/gh/apache/airflow/pull/6342?src=pr=desc) into [master](https://codecov.io/gh/apache/airflow/commit/311140616daafe496310d642e4164bc53fbd2ad2?src=pr=desc) will **decrease** coverage by `0.17%`.
> The diff coverage is `95.23%`.

[![Impacted file tree graph](https://codecov.io/gh/apache/airflow/pull/6342/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/airflow/pull/6342?src=pr=tree)

```diff
@@            Coverage Diff             @@
##           master    #6342      +/-   ##
==========================================
- Coverage   86.76%   86.58%   -0.18%
==========================================
  Files         896      896
  Lines       42649    42677      +28
==========================================
- Hits        37005    36953      -52
- Misses       5644     5724      +80
```

| [Impacted Files](https://codecov.io/gh/apache/airflow/pull/6342?src=pr=tree) | Coverage Δ | |
|---|---|---|
| [airflow/ti\_deps/deps/pool\_slots\_available\_dep.py](https://codecov.io/gh/apache/airflow/pull/6342/diff?src=pr=tree#diff-YWlyZmxvdy90aV9kZXBzL2RlcHMvcG9vbF9zbG90c19hdmFpbGFibGVfZGVwLnB5) | `100% <100%> (ø)` | :arrow_up: |
| [airflow/models/pool.py](https://codecov.io/gh/apache/airflow/pull/6342/diff?src=pr=tree#diff-YWlyZmxvdy9tb2RlbHMvcG9vbC5weQ==) | `96.55% <95%> (-0.82%)` | :arrow_down: |
| [airflow/jobs/scheduler\_job.py](https://codecov.io/gh/apache/airflow/pull/6342/diff?src=pr=tree#diff-YWlyZmxvdy9qb2JzL3NjaGVkdWxlcl9qb2IucHk=) | `90.18% <95.23%> (+0.54%)` | :arrow_up: |
| [airflow/kubernetes/volume\_mount.py](https://codecov.io/gh/apache/airflow/pull/6342/diff?src=pr=tree#diff-YWlyZmxvdy9rdWJlcm5ldGVzL3ZvbHVtZV9tb3VudC5weQ==) | `44.44% <0%> (-55.56%)` | :arrow_down: |
| [airflow/kubernetes/volume.py](https://codecov.io/gh/apache/airflow/pull/6342/diff?src=pr=tree#diff-YWlyZmxvdy9rdWJlcm5ldGVzL3ZvbHVtZS5weQ==) | `52.94% <0%> (-47.06%)` | :arrow_down: |
| [airflow/kubernetes/pod\_launcher.py](https://codecov.io/gh/apache/airflow/pull/6342/diff?src=pr=tree#diff-YWlyZmxvdy9rdWJlcm5ldGVzL3BvZF9sYXVuY2hlci5weQ==) | `47.18% <0%> (-45.08%)` | :arrow_down: |
| [...viders/cncf/kubernetes/operators/kubernetes\_pod.py](https://codecov.io/gh/apache/airflow/pull/6342/diff?src=pr=tree#diff-YWlyZmxvdy9wcm92aWRlcnMvY25jZi9rdWJlcm5ldGVzL29wZXJhdG9ycy9rdWJlcm5ldGVzX3BvZC5weQ==) | `69.69% <0%> (-25.26%)` | :arrow_down: |
| [airflow/kubernetes/refresh\_config.py](https://codecov.io/gh/apache/airflow/pull/6342/diff?src=pr=tree#diff-YWlyZmxvdy9rdWJlcm5ldGVzL3JlZnJlc2hfY29uZmlnLnB5) | `50.98% <0%> (-23.53%)` | :arrow_down: |
| [airflow/utils/sqlalchemy.py](https://codecov.io/gh/apache/airflow/pull/6342/diff?src=pr=tree#diff-YWlyZmxvdy91dGlscy9zcWxhbGNoZW15LnB5) | `84.93% <0%> (+1.36%)` | :arrow_up: |
| ... and [4 more](https://codecov.io/gh/apache/airflow/pull/6342/diff?src=pr=tree-more) | |

--

[Continue to review full report at Codecov](https://codecov.io/gh/apache/airflow/pull/6342?src=pr=continue).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
> `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/airflow/pull/6342?src=pr=footer). Last update [3111406...9255ec7](https://codecov.io/gh/apache/airflow/pull/6342?src=pr=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [airflow] kaxil opened a new pull request #7540: [AIRFLOW-XXXX] Add instructions on using templating in Bash Script
kaxil opened a new pull request #7540: [AIRFLOW-XXXX] Add instructions on using templating in Bash Script URL: https://github.com/apache/airflow/pull/7540

Add instructions on using Jinja templating in Bash Script passed to BashOperator

---
Issue link: WILL BE INSERTED BY [boring-cyborg](https://github.com/kaxil/boring-cyborg)

Make sure to mark the boxes below before creating PR: [x]

- [x] Description above provides context of the change
- [x] Commit message/PR title starts with `[AIRFLOW-NNNN]`. AIRFLOW-NNNN = JIRA ID*
- [x] Unit tests coverage for changes (not needed for documentation changes)
- [x] Commits follow "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)"
- [x] Relevant documentation is updated including usage instructions.
- [x] I will engage committers as explained in [Contribution Workflow Example](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#contribution-workflow-example).

\* For document-only changes commit message can start with `[AIRFLOW-XXXX]`.

---
In case of fundamental code change, Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)) is needed.
In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x).
In case of backwards incompatible changes please leave a note in [UPDATING.md](https://github.com/apache/airflow/blob/master/UPDATING.md).
Read the [Pull Request Guidelines](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines) for more information.

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
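To illustrate the idea behind PR #7540's topic: Airflow renders fields listed in an operator's `template_fields` (such as `BashOperator.bash_command`) through Jinja before execution, substituting context values like `{{ ds }}`. The sketch below is *not* Airflow's rendering engine — it is a minimal stdlib stand-in (regex substitution instead of Jinja, and the `context` values are made up) that just shows the substitution concept:

```python
import re


def render_template(command: str, context: dict) -> str:
    """Naive stand-in for Jinja rendering of a templated bash_command.

    Replaces every ``{{ key }}`` with ``str(context[key])``. Real Airflow
    uses a full Jinja2 environment with macros, filters, and params.
    """
    def repl(match):
        key = match.group(1).strip()
        return str(context[key])

    return re.sub(r"\{\{([^}]+)\}\}", repl, command)


# Hypothetical context, mimicking a couple of Airflow template variables.
context = {"ds": "2020-02-25", "params.filename": "report.csv"}
bash_command = "echo processing {{ ds }} > /tmp/{{ params.filename }}.log"
print(render_template(bash_command, context))
# -> echo processing 2020-02-25 > /tmp/report.csv.log
```

In real DAG code the substitution happens automatically: you assign the templated string to `bash_command` and the scheduler renders it per task instance.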
[GitHub] [airflow] ashb commented on a change in pull request #7492: [AIRFLOW-6871] optimize tree view for large DAGs
ashb commented on a change in pull request #7492: [AIRFLOW-6871] optimize tree view for large DAGs URL: https://github.com/apache/airflow/pull/7492#discussion_r384191672

## File path: airflow/www/views.py ##

```diff
@@ -1374,90 +1376,115 @@ def tree(self):
             .all()
         )
         dag_runs = {
-            dr.execution_date: alchemy_to_dict(dr) for dr in dag_runs}
+            dr.execution_date: alchemy_to_dict(dr) for dr in dag_runs
+        }

         dates = sorted(list(dag_runs.keys()))
         max_date = max(dates) if dates else None
         min_date = min(dates) if dates else None

         tis = dag.get_task_instances(start_date=min_date, end_date=base_date)
-        task_instances = {}
+        task_instances: Dict[Tuple[str, datetime], models.TaskInstance] = {}
         for ti in tis:
-            tid = alchemy_to_dict(ti)
-            dr = dag_runs.get(ti.execution_date)
-            tid['external_trigger'] = dr['external_trigger'] if dr else False
-            task_instances[(ti.task_id, ti.execution_date)] = tid
+            task_instances[(ti.task_id, ti.execution_date)] = ti

-        expanded = []
+        expanded = set()
         # The default recursion traces every path so that tree view has full
         # expand/collapse functionality. After 5,000 nodes we stop and fall
         # back on a quick DFS search for performance. See PR #320.
-        node_count = [0]
+        node_count = 0
         node_limit = 5000 / max(1, len(dag.leaves))

+        def encode_ti(ti: Optional[models.TaskInstance]) -> Optional[List]:
+            if not ti:
+                return None
+
+            # NOTE: order of entry is important here because client JS relies on it for
+            # tree node reconstruction. Remember to change JS code in tree.html
+            # whenever order is altered.
+            data = [
+                ti.state,
+                ti.try_number,
+                None,  # start_ts
+                None,  # duration
+            ]
+
+            if ti.start_date:
+                # round to seconds to reduce payload size
+                data[2] = int(ti.start_date.timestamp())
+            if ti.duration is not None:
+                data[3] = int(ti.duration)
+
+            return data
+
         def recurse_nodes(task, visited):
+            nonlocal node_count
+            node_count += 1
             visited.add(task)
-            node_count[0] += 1
-
-            children = [
-                recurse_nodes(t, visited) for t in task.downstream_list
-                if node_count[0] < node_limit or t not in visited]
-
-            # D3 tree uses children vs _children to define what is
-            # expanded or not. The following block makes it such that
-            # repeated nodes are collapsed by default.
-            children_key = 'children'
-            if task.task_id not in expanded:
-                expanded.append(task.task_id)
-            elif children:
-                children_key = "_children"
-
-            def set_duration(tid):
-                if (isinstance(tid, dict) and tid.get("state") == State.RUNNING and
-                        tid["start_date"] is not None):
-                    d = timezone.utcnow() - timezone.parse(tid["start_date"])
-                    tid["duration"] = d.total_seconds()
-                return tid
-
-            return {
+            task_id = task.task_id
+
+            node = {
                 'name': task.task_id,
                 'instances': [
-                    set_duration(task_instances.get((task.task_id, d))) or {
-                        'execution_date': d.isoformat(),
-                        'task_id': task.task_id
-                    }
-                    for d in dates],
-                children_key: children,
+                    encode_ti(task_instances.get((task_id, d)))
+                    for d in dates
+                ],
                 'num_dep': len(task.downstream_list),
                 'operator': task.task_type,
                 'retries': task.retries,
                 'owner': task.owner,
-                'start_date': task.start_date,
-                'end_date': task.end_date,
-                'depends_on_past': task.depends_on_past,
                 'ui_color': task.ui_color,
-                'extra_links': task.extra_links,
             }

+            if task.downstream_list:
+                children = [
+                    recurse_nodes(t, visited) for t in task.downstream_list
+                    if node_count < node_limit or t not in visited]
+
+                # D3 tree uses children vs _children to define what is
+                # expanded or not. The following block makes it such that
+                # repeated nodes are collapsed by default.
+                if task.task_id not in expanded:
+                    children_key = 'children'
+                    expanded.add(task.task_id)
+                else:
```
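The core of the review above is `encode_ti`, which replaces a per-instance dict with a compact positional list to shrink the tree-view payload. Here is a standalone sketch of that encoding; `FakeTaskInstance` is a made-up stand-in for `models.TaskInstance`, used so the idea can run without Airflow:

```python
from datetime import datetime
from typing import List, Optional


class FakeTaskInstance:
    """Minimal stand-in for models.TaskInstance (illustration only)."""
    def __init__(self, state, try_number, start_date=None, duration=None):
        self.state = state
        self.try_number = try_number
        self.start_date = start_date
        self.duration = duration


def encode_ti(ti: Optional[FakeTaskInstance]) -> Optional[List]:
    """Encode a task instance as [state, try_number, start_ts, duration].

    The order is significant: the client-side JS in tree.html indexes
    into this list positionally, so both sides must change in lockstep.
    """
    if not ti:
        return None
    data = [ti.state, ti.try_number, None, None]  # start_ts, duration default to None
    if ti.start_date:
        data[2] = int(ti.start_date.timestamp())  # whole seconds shrink the payload
    if ti.duration is not None:
        data[3] = int(ti.duration)
    return data


ti = FakeTaskInstance("success", 1, datetime(2020, 2, 25), 42.7)
print(encode_ti(ti))    # -> ['success', 1, <epoch seconds>, 42]
print(encode_ti(None))  # None marks "no run for that date"
```

A positional list serializes to far fewer JSON bytes than a dict with repeated keys, which is the point of the optimization discussed in the PR.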
[jira] [Updated] (AIRFLOW-6389) add config for 'allow_multi_scheduler_instances' default True
[ https://issues.apache.org/jira/browse/AIRFLOW-6389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] t oo updated AIRFLOW-6389: -- Description:

right now common deployment pattern with blue/green build is:
1. on EC2 1, start scheduler
2. Assign 'final' DNS to EC2 1
3. create EC2 2
4. start scheduler on EC2 2
5. Assign 'final' DNS to EC2 2
6. Teardown EC2 1

Issue is that since the metastore db (ie mysql) is shared to both EC2s there is a period of time between points 4 and 6 above where there are multiple schedulers running. To avoid this, proposing config for 'allow_multi_scheduler_instances' that when set to False, the startup of the scheduler will detect that another scheduler is running and then exit (ie not start up) with a WARNING message

7. We have cron/systemd setup to keep retrying to start the scheduler pid, so as soon as point 6 completes the scheduler should successfully launch on EC2 1

was:

right now common deployment pattern with blue/green build is:
1. on EC2 1, start scheduler
2. Assign 'final' DNS to EC2 1
3. create EC2 2
4. start scheduler on EC2 2
5. Assign 'final' DNS to EC2 2
6. Teardown EC2 1

Issue is that since the metastore db (ie mysql) is shared to both EC2s there is a period of time between points 4 and 6 above where there are multiple schedulers running. To avoid this, proposing config for 'allow_multi_scheduler_instances' that when set to False, the startup of the scheduler will detect that another scheduler is running and then exit (ie not start up) with a WARNING message

> add config for 'allow_multi_scheduler_instances' default True
> -
>
> Key: AIRFLOW-6389
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6389
> Project: Apache Airflow
> Issue Type: New Feature
> Components: scheduler
> Affects Versions: 1.10.6
> Reporter: t oo
> Priority: Minor
>
> right now common deployment pattern with blue/green build is:
> 1. on EC2 1, start scheduler
> 2. Assign 'final' DNS to EC2 1
> 3. create EC2 2
> 4. start scheduler on EC2 2
> 5. Assign 'final' DNS to EC2 2
> 6. Teardown EC2 1
> Issue is that since the metastore db (ie mysql) is shared to both EC2s there is a period of time between points 4 and 6 above where there are multiple schedulers running. To avoid this, proposing config for 'allow_multi_scheduler_instances' that when set to False, the startup of the scheduler will detect that another scheduler is running and then exit (ie not start up) with a WARNING message
> 7. We have cron/systemd setup to keep retrying to start the scheduler pid, so as soon as point 6 completes the scheduler should successfully launch on EC2 1

-- This message was sent by Atlassian Jira (v8.3.4#803005)
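The proposed startup check could work by querying the shared metadata DB for another recently-heartbeating scheduler. Airflow has no `allow_multi_scheduler_instances` config; the sketch below is purely an illustration of the idea, using an in-memory sqlite table (schema, column names, and the 30-second timeout are all assumptions, not Airflow's real `jobs` table):

```python
import sqlite3
from datetime import datetime, timedelta


def another_scheduler_alive(conn, heartbeat_timeout=30):
    """Return True if a running SchedulerJob row heartbeated within the timeout."""
    cutoff = datetime.utcnow() - timedelta(seconds=heartbeat_timeout)
    # ISO-8601 strings compare correctly lexicographically, so a plain > works.
    row = conn.execute(
        "SELECT COUNT(*) FROM job WHERE job_type = 'SchedulerJob' "
        "AND state = 'running' AND latest_heartbeat > ?",
        (cutoff.isoformat(sep=' '),),
    ).fetchone()
    return row[0] > 0


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE job (job_type TEXT, state TEXT, latest_heartbeat TEXT)")
conn.execute(
    "INSERT INTO job VALUES ('SchedulerJob', 'running', ?)",
    (datetime.utcnow().isoformat(sep=' '),),
)
if another_scheduler_alive(conn):
    # The proposal: log a WARNING and refuse to start.
    print("WARNING: another scheduler is running, exiting")
```

With a check like this at scheduler startup, the cron/systemd retry loop in step 7 would keep failing harmlessly until the old EC2 instance is torn down in step 6.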
[jira] [Closed] (AIRFLOW-4502) new cli command - task_states_for_dag_run
[ https://issues.apache.org/jira/browse/AIRFLOW-4502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] t oo closed AIRFLOW-4502. - Resolution: Fixed

> new cli command - task_states_for_dag_run
> -
>
> Key: AIRFLOW-4502
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4502
> Project: Apache Airflow
> Issue Type: Improvement
> Components: cli
> Affects Versions: 1.10.3
> Reporter: t oo
> Assignee: t oo
> Priority: Minor
>
> This would be a new cli command where you pass arguments of dagid and execution_date and you get a response of all the task names and their state (ie skipped, failed, success, etc) and exec time

-- This message was sent by Atlassian Jira (v8.3.4#803005)
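The closed issue above describes a command that, given a dag id and execution date, lists each task's name, state, and execution time. The output format below is a guess (the ticket does not specify one), and `TaskState` is a stand-in for the real task-instance records:

```python
from datetime import datetime
from typing import List, NamedTuple


class TaskState(NamedTuple):
    """Stand-in for one task instance of a dag run (illustration only)."""
    task_id: str
    state: str
    start_date: datetime
    end_date: datetime


def task_states_for_dag_run(dag_id: str, execution_date: str,
                            tis: List[TaskState]) -> str:
    """Render one line per task: name, state, and execution time in seconds."""
    lines = ["dag_id=%s execution_date=%s" % (dag_id, execution_date)]
    for ti in sorted(tis, key=lambda t: t.task_id):
        exec_secs = (ti.end_date - ti.start_date).total_seconds()
        lines.append("%-20s %-10s %.1fs" % (ti.task_id, ti.state, exec_secs))
    return "\n".join(lines)


tis = [
    TaskState("extract", "success",
              datetime(2019, 5, 1, 8, 0, 0), datetime(2019, 5, 1, 8, 0, 12)),
    TaskState("load", "failed",
              datetime(2019, 5, 1, 8, 1, 0), datetime(2019, 5, 1, 8, 1, 3)),
]
print(task_states_for_dag_run("my_dag", "2019-05-01T08:00:00", tis))
```

In a real implementation the `tis` list would come from a metadata-DB query filtered on `dag_id` and `execution_date`.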
[jira] [Commented] (AIRFLOW-6919) Make Breeeze more Dag-test friendly
[ https://issues.apache.org/jira/browse/AIRFLOW-6919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17045000#comment-17045000 ] ASF GitHub Bot commented on AIRFLOW-6919: - potiuk commented on pull request #7539: [AIRFLOW-6919] Make Breeze DAG-test friedly URL: https://github.com/apache/airflow/pull/7539

Originally Breeze was used to run unit and integration tests, recently system tests and finally we make it a bit more friendly to test your DAGs there. You can now install any older airflow version in Breeze via --install-airflow-version switch and "files/dags" folder is mounted to "/files/dags" and this folder is used to read the dags from.

---
Issue link: WILL BE INSERTED BY [boring-cyborg](https://github.com/kaxil/boring-cyborg)

Make sure to mark the boxes below before creating PR: [x]

- [x] Description above provides context of the change
- [x] Commit message/PR title starts with `[AIRFLOW-NNNN]`. AIRFLOW-NNNN = JIRA ID*
- [x] Unit tests coverage for changes (not needed for documentation changes)
- [x] Commits follow "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)"
- [x] Relevant documentation is updated including usage instructions.
- [x] I will engage committers as explained in [Contribution Workflow Example](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#contribution-workflow-example).

\* For document-only changes commit message can start with `[AIRFLOW-XXXX]`.

---
In case of fundamental code change, Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)) is needed.
In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x).
In case of backwards incompatible changes please leave a note in [UPDATING.md](https://github.com/apache/airflow/blob/master/UPDATING.md).
Read the [Pull Request Guidelines](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines) for more information. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Make Breeeze more Dag-test friendly > --- > > Key: AIRFLOW-6919 > URL: https://issues.apache.org/jira/browse/AIRFLOW-6919 > Project: Apache Airflow > Issue Type: Bug > Components: breeze >Affects Versions: 2.0.0, 1.10.9 >Reporter: Jarek Potiuk >Priority: Major > > Originally Breeze was used to run unit and integration tests, recently system > tests and finally we make it a bit more friendly to test your DAGs there. > You can now install any older > airflow version in Breeze via --install-airflow-version switch and > "files/dags" folder is mounted to "/files/dags" and this folder is used to > read the dags from. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (AIRFLOW-6920) AIRFLOW Feature Parity with LUIGI & CONTROLM
t oo created AIRFLOW-6920: - Summary: AIRFLOW Feature Parity with LUIGI & CONTROLM Key: AIRFLOW-6920 URL: https://issues.apache.org/jira/browse/AIRFLOW-6920 Project: Apache Airflow Issue Type: Improvement Components: tests Affects Versions: 1.10.7 Reporter: t oo

*LUIGI* vs *AIRFLOW*

200 sequential tasks (so no parallelism):

+LUIGI:+
mkdir -p test_output8
pip install luigi
#no need to start web server, scheduler or meta db
#*8.3secs* total time for all 200
time python3 -m luigi --module cloop --local-scheduler ManyMany

+AIRFLOW:+
#*1032 sec* total time for all 200, .16s per task but 5sec gap between tasks
#intention was for tasks in the DAG to be completely sequential ie task 3 must wait for task 2 which must wait for task 1..etc but chain() not working as intended?? so used default_pool=1
airflow initdb
nohup airflow webserver -p 8080 &
nohup airflow scheduler &
airflow trigger_dag looper2
#look at dagrun start-endtime

cloop.py
{code:java}
import os
#import time
import luigi

# To run:
# cd ~/luigi_workflows
# pythonpath=.. luigi --module=luigi_workflows.test_resources ManyMany --workers=100

class Sleep(luigi.Task):
    #resources = {'foo': 10}
    fname = luigi.Parameter()

    def requires(self):
        #print(self)
        zin = self.fname
        ii = int(zin.split('_')[1])
        if ii > 1:
            return Sleep(fname='marker_{}'.format(ii - 1))
        else:
            []

    def full_path(self):
        return os.path.join(os.path.dirname(os.path.realpath(__file__)),
                            'test_output8', self.fname)

    def run(self):
        #time.sleep(1)
        with open(self.full_path(), 'w') as f:
            print('', file=f)

    def output(self):
        return luigi.LocalTarget(self.full_path())


class Many(luigi.WrapperTask):
    n = luigi.IntParameter()

    def requires(self):
        for i in range(self.n):
            yield Sleep(fname='marker_{}'.format(i))


class ManyMany(luigi.WrapperTask):
    n = luigi.IntParameter(default=200)

    def requires(self):
        for i in range(self.n):
            yield Many(n=self.n)
{code}

looper2.py
{code:java}
import airflow
from airflow.models import DAG
from airflow.operators.bash_operator import BashOperator
from airflow.operators.dummy_operator import DummyOperator
from airflow.utils.helpers import chain

args = {
    'owner': 'airflow',
    'retries': 3,
    'start_date': airflow.utils.dates.days_ago(2)
}

dag = DAG(
    dag_id='looper2',
    default_args=args,
    schedule_interval=None)

chain([DummyOperator(task_id='op' + str(i), dag=dag) for i in range(1, 201)])

if __name__ == "__main__":
    dag.cli()
{code}

I saw similar test in https://github.com/apache/airflow/pull/5096 but it did not seem to be sequential or using scheduler

Possible test scenarios:
1. 1 DAG with 200 tasks running sequentially
2. 1 DAG with 200 tasks running all in parallel (200 slots)
3. 1 DAG with 200 tasks running all in parallel (48 slots)
4. 200 DAGs each with 1 task

Then repeat above changing 200 to 2000 or 20.etc

Qs:
1. any plans for an 'in-memory' scheduler like Luigi's?
2. Anyone open to a Luigi Operator?
3. Any speedups to make existing scheduler faster?
Noting that the tasks here are sequential (should be similar time to 200 dags of 1 task each)

ControlM comparison: is it envisioned that airflow becomes a replacement for https://www.bmcsoftware.uk/it-solutions/control-m.html ? execution_date seems similar to Order Date, DAG seems similar to job, tasks in a dag seem similar to a command called by a job but some of the items I see missing:
1. integrating public holiday calendars,
2. ability to specify schedule like 11am on '2nd weekday of the month', 'last 5 days of the month', 'last business day of the month'
3. ability to visualise dependencies between dags (there does not seem to be a high level way to say at 11am schedule DAGc after DAGa and DAGb, then at 3pm schedule DAGd after DAGc only if DAGc was successful )
4. ability to click 1 to many dags in a UI and change their state to killed/success (force ok).etc and have it instantly affect task instances (ie stopping them)
5. ability to set whole DAGs to 'dummy' on certain days of the week. ie DAGb (runs 7 days a week and do stuff) must run after DAGa for each execdate (DAGa should do stuff on mon-fri but on sat/sun DAGa should 'do' nothing ie entire dag is 'dummy' just to satisfy 'IN condition' of DAGb)
6. ability to change the number of tasks within a DAG for a diff exec date without 'stuffing' up the scheduler/metadb
7. ability to 'order up' any day in the past/future (for all or some dags) and keep it on 'hold', visualise which dags 'would' be scheduled, see dag dependencies, and choose to run all/some (or just do nothing and delete them) of the DAGs while maintaining dependencies between them and optionally 'forcing
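One detail in the benchmark worth flagging: `looper2.py` calls `chain([...])` with a single list argument, while `airflow.utils.helpers.chain` in 1.10 takes the tasks as separate positional arguments, which may explain the "chain() not working as intended" observation (with one argument there are no pairs to wire, so the tasks stay independent). A dependency-free sketch of what the unpacking does, with a toy `Op` class standing in for real operators:

```python
class Op:
    """Toy stand-in for an Airflow operator (illustration only)."""
    def __init__(self, task_id):
        self.task_id = task_id
        self.downstream = []

    def set_downstream(self, other):
        self.downstream.append(other)


def chain(*tasks):
    """Wire each task to the next one, pairwise, like helpers.chain."""
    for up, down in zip(tasks, tasks[1:]):
        up.set_downstream(down)


ops = [Op("op" + str(i)) for i in range(1, 201)]
chain(*ops)  # note the unpacking: chain(ops) would pass ONE list, wiring nothing
print(ops[0].downstream[0].task_id)  # -> op2
```

Equivalently, `ops[i] >> ops[i + 1]` in a loop builds the same strictly sequential chain in real Airflow DAG code.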
[GitHub] [airflow] potiuk opened a new pull request #7539: [AIRFLOW-6919] Make Breeze DAG-test friedly
potiuk opened a new pull request #7539: [AIRFLOW-6919] Make Breeze DAG-test friedly URL: https://github.com/apache/airflow/pull/7539

Originally Breeze was used to run unit and integration tests, recently system tests and finally we make it a bit more friendly to test your DAGs there. You can now install any older airflow version in Breeze via --install-airflow-version switch and "files/dags" folder is mounted to "/files/dags" and this folder is used to read the dags from.

---
Issue link: WILL BE INSERTED BY [boring-cyborg](https://github.com/kaxil/boring-cyborg)

Make sure to mark the boxes below before creating PR: [x]

- [x] Description above provides context of the change
- [x] Commit message/PR title starts with `[AIRFLOW-NNNN]`. AIRFLOW-NNNN = JIRA ID*
- [x] Unit tests coverage for changes (not needed for documentation changes)
- [x] Commits follow "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)"
- [x] Relevant documentation is updated including usage instructions.
- [x] I will engage committers as explained in [Contribution Workflow Example](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#contribution-workflow-example).

\* For document-only changes commit message can start with `[AIRFLOW-XXXX]`.

---
In case of fundamental code change, Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)) is needed.
In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x).
In case of backwards incompatible changes please leave a note in [UPDATING.md](https://github.com/apache/airflow/blob/master/UPDATING.md).
Read the [Pull Request Guidelines](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines) for more information.

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Created] (AIRFLOW-6919) Make Breeeze more Dag-test friendly
Jarek Potiuk created AIRFLOW-6919: - Summary: Make Breeeze more Dag-test friendly Key: AIRFLOW-6919 URL: https://issues.apache.org/jira/browse/AIRFLOW-6919 Project: Apache Airflow Issue Type: Bug Components: breeze Affects Versions: 1.10.9, 2.0.0 Reporter: Jarek Potiuk Originally Breeze was used to run unit and integration tests, then system tests, and now we are making it a bit more friendly for testing your DAGs there. You can now install any older airflow version in Breeze via the --install-airflow-version switch; the "files/dags" folder is mounted to "/files/dags" and DAGs are read from that folder. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (AIRFLOW-5307) Move the BaseOperator to the operators package
[ https://issues.apache.org/jira/browse/AIRFLOW-5307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17044991#comment-17044991 ] ASF GitHub Bot commented on AIRFLOW-5307: - BasPH commented on pull request #5910: [WIP][AIRFLOW-5307] Move BaseOperator to airflow.operators.base_operator URL: https://github.com/apache/airflow/pull/5910

**WIP: working out a bug in the docs.**

Make sure you have checked _all_ steps below.

### Jira
- [x] My PR addresses the following [Airflow Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
  - https://issues.apache.org/jira/browse/AIRFLOW-5307
  - In case you are fixing a typo in the documentation you can prepend your commit with \[AIRFLOW-XXX\], code changes always need a Jira issue.
  - In case you are proposing a fundamental code change, you need to create an Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)).
  - In case you are adding a dependency, check if the license complies with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x).

### Description
- [x] Here are some details about my PR, including screenshots of any UI changes: The BaseOperator currently resides in /airflow/models but has never been a database model, i.e. it is not stored in a database table. I suggest moving it to its logical place, the operators package. To preserve backwards compatibility I import the BaseOperator in /airflow/models/\_\_init\_\_.py and raise a DeprecationWarning when imported from there. All references to airflow.models.BaseOperator have been removed and I suggest removing backward compatibility in Airflow 2.0.

### Tests
- [x] My PR adds the following unit tests __OR__ does not need testing for this extremely good reason: No test logic altered, only moved package names.

### Commits
- [x] My commits all reference Jira issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)":
  1. Subject is separated from body by a blank line
  1. Subject is limited to 50 characters (not including Jira issue reference)
  1. Subject does not end with a period
  1. Subject uses the imperative mood ("add", not "adding")
  1. Body wraps at 72 characters
  1. Body explains "what" and "why", not "how"

### Documentation
- [x] In case of new functionality, my PR adds documentation that describes how to use it.
  - All the public functions and the classes in the PR contain docstrings that explain what it does
  - If you implement backwards incompatible changes, please leave a note in the [Updating.md](https://github.com/apache/airflow/blob/master/UPDATING.md) so we can assign it to an appropriate release

### Code Quality
- [x] Passes `flake8`

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org

> Move the BaseOperator to the operators package
> --
>
> Key: AIRFLOW-5307
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5307
> Project: Apache Airflow
> Issue Type: Improvement
> Components: core
> Affects Versions: 2.0.0
> Reporter: Bas Harenslak
> Priority: Major
>
> The BaseOperator currently resides in /airflow/models but has never been a database model, i.e. it is not stored in a database table. I suggest moving it to its logical place, the operators package.

-- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [airflow] kaxil commented on issue #5910: [WIP][AIRFLOW-5307] Move BaseOperator to airflow.operators.base_operator
kaxil commented on issue #5910: [WIP][AIRFLOW-5307] Move BaseOperator to airflow.operators.base_operator URL: https://github.com/apache/airflow/pull/5910#issuecomment-591124831 Agree, this should be moved to the operators package. Can you rebase to the latest master? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [airflow] BasPH opened a new pull request #5910: [WIP][AIRFLOW-5307] Move BaseOperator to airflow.operators.base_operator
BasPH opened a new pull request #5910: [WIP][AIRFLOW-5307] Move BaseOperator to airflow.operators.base_operator URL: https://github.com/apache/airflow/pull/5910

**WIP: working out a bug in the docs.**

Make sure you have checked _all_ steps below.

### Jira
- [x] My PR addresses the following [Airflow Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
  - https://issues.apache.org/jira/browse/AIRFLOW-5307
  - In case you are fixing a typo in the documentation you can prepend your commit with \[AIRFLOW-XXX\], code changes always need a Jira issue.
  - In case you are proposing a fundamental code change, you need to create an Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)).
  - In case you are adding a dependency, check if the license complies with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x).

### Description
- [x] Here are some details about my PR, including screenshots of any UI changes: The BaseOperator currently resides in /airflow/models but has never been a database model, i.e. it is not stored in a database table. I suggest moving it to its logical place, the operators package. To preserve backwards compatibility I import the BaseOperator in /airflow/models/\_\_init\_\_.py and raise a DeprecationWarning when imported from there. All references to airflow.models.BaseOperator have been removed and I suggest removing backward compatibility in Airflow 2.0.

### Tests
- [x] My PR adds the following unit tests __OR__ does not need testing for this extremely good reason: No test logic altered, only moved package names.

### Commits
- [x] My commits all reference Jira issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)":
  1. Subject is separated from body by a blank line
  1. Subject is limited to 50 characters (not including Jira issue reference)
  1. Subject does not end with a period
  1. Subject uses the imperative mood ("add", not "adding")
  1. Body wraps at 72 characters
  1. Body explains "what" and "why", not "how"

### Documentation
- [x] In case of new functionality, my PR adds documentation that describes how to use it.
  - All the public functions and the classes in the PR contain docstrings that explain what it does
  - If you implement backwards incompatible changes, please leave a note in the [Updating.md](https://github.com/apache/airflow/blob/master/UPDATING.md) so we can assign it to an appropriate release

### Code Quality
- [x] Passes `flake8`

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
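The backwards-compatibility scheme described in the PR (keep `from airflow.models import BaseOperator` working, but raise a DeprecationWarning) can be sketched without Airflow installed. Everything below is a stand-in: the module name `airflow_models_stub` and the warning text are invented, and PEP 562's module-level `__getattr__` (Python 3.7+) is one possible mechanism, not necessarily the one the PR uses:

```python
import sys
import types
import warnings


class BaseOperator:
    """Stand-in for the class at its new home, airflow.operators.base_operator."""


# A legacy module that re-exports BaseOperator lazily, warning on access.
legacy = types.ModuleType("airflow_models_stub")


def _module_getattr(name):
    if name == "BaseOperator":
        warnings.warn(
            "Importing BaseOperator from airflow.models is deprecated; "
            "import it from airflow.operators.base_operator instead.",
            DeprecationWarning, stacklevel=2)
        return BaseOperator
    raise AttributeError(name)


legacy.__getattr__ = _module_getattr  # PEP 562 fallback for missing attributes
sys.modules["airflow_models_stub"] = legacy

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # The old-style import path still resolves, but emits a warning.
    from airflow_models_stub import BaseOperator as BO

print(BO is BaseOperator)  # the shim hands back the real class
```

The lazy `__getattr__` variant has the nice property that the warning fires only when the deprecated name is actually used, not on every import of the legacy module.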
[jira] [Updated] (AIRFLOW-6389) add config for 'allow_multi_scheduler_instances' default True
[ https://issues.apache.org/jira/browse/AIRFLOW-6389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] t oo updated AIRFLOW-6389: -- Description: right now common deployment pattern with blue/green build is: 1. on EC2 1, start scheduler 2. Assign 'final' DNS to EC2 1 3. create EC2 2 4. start scheduler on EC2 2 5. Assign 'final' DNS to EC2 2 6. Teardown EC2 1 Issue is that since the metastore db (ie mysql) is shared to both EC2s there is a period of time between point 4 and 6 above where there are multiple schedulers running. To avoid this proposing config for 'allow_multi_scheduler_instances' that when set to False, the startup of scheduler will detect that another scheduler is running then exit (ie not startup) with WARNING message was: right now common deployment pattern with blue/green build is: 1. on EC2 1, start scheduler 2. Assign 'final' DNS to EC2 1 3. create EC2 2 4. start scheduler on EC2 2 5. Assign 'final' DNS to EC2 2 6. Teardown EC2 1 Issue is that since the metastore db (ie mysql) is shared to both EC2s there is a period of time between point 4 and 6 above where there are multiple schedulers running. To avoid this proposing config for 'allow_multi_scheduler_instances' that when set tot False, the startup of scheduler will detect that another scheduler is running then exit (ie not startup) with WARNING message > add config for 'allow_multi_scheduler_instances' default True > - > > Key: AIRFLOW-6389 > URL: https://issues.apache.org/jira/browse/AIRFLOW-6389 > Project: Apache Airflow > Issue Type: New Feature > Components: scheduler >Affects Versions: 1.10.6 >Reporter: t oo >Priority: Minor > > right now common deployment pattern with blue/green build is: > 1. on EC2 1, start scheduler > 2. Assign 'final' DNS to EC2 1 > 3. create EC2 2 > 4. start scheduler on EC2 2 > 5. Assign 'final' DNS to EC2 2 > 6. 
Teardown EC2 1 > Issue is that since the metastore db (ie mysql) is shared to both EC2s there > is a period of time between point 4 and 6 above where there are multiple > schedulers running. To avoid this proposing config for > 'allow_multi_scheduler_instances' that when set to False, the startup of > scheduler will detect that another scheduler is running then exit (ie not > startup) with WARNING message -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (AIRFLOW-6389) add config for 'allow_multi_scheduler_instances' default True
[ https://issues.apache.org/jira/browse/AIRFLOW-6389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] t oo updated AIRFLOW-6389: -- Description: right now common deployment pattern with blue/green build is: 1. on EC2 1, start scheduler 2. Assign 'final' DNS to EC2 1 3. create EC2 2 4. start scheduler on EC2 2 5. Assign 'final' DNS to EC2 2 6. Teardown EC2 1 Issue is that since the metastore db (ie mysql) is shared to both EC2s there is a period of time between point 4 and 6 above where there are multiple schedulers running. To avoid this proposing config for 'allow_multi_scheduler_instances' that when set tot False, the startup of scheduler will detect that another scheduler is running then exit (ie not startup) with WARNING message was: right now common deployment pattern with blue/green build is: 1. on EC2 1, start scheduler 2. Assign 'final' DNS to EC2 1 3. create EC2 2 4. start scheduler on EC2 2 5. Assign 'final' DNS to EC2 2 6. Teardown EC2 1 Issue is that since the megastore db (ie mysql) is shared to both EC2s there is a period of time between point 4 and 6 above where there are multiple schedulers running. To avoid this proposing config for 'allow_multi_scheduler_instances' that when set tot False, the startup of scheduler will detect that another scheduler is running then exit (ie not startup) with WARNING message > add config for 'allow_multi_scheduler_instances' default True > - > > Key: AIRFLOW-6389 > URL: https://issues.apache.org/jira/browse/AIRFLOW-6389 > Project: Apache Airflow > Issue Type: New Feature > Components: scheduler >Affects Versions: 1.10.6 >Reporter: t oo >Priority: Minor > > right now common deployment pattern with blue/green build is: > 1. on EC2 1, start scheduler > 2. Assign 'final' DNS to EC2 1 > 3. create EC2 2 > 4. start scheduler on EC2 2 > 5. Assign 'final' DNS to EC2 2 > 6. 
Teardown EC2 1 > Issue is that since the metastore db (ie mysql) is shared to both EC2s there > is a period of time between point 4 and 6 above where there are multiple > schedulers running. To avoid this proposing config for > 'allow_multi_scheduler_instances' that when set to False, the startup of > scheduler will detect that another scheduler is running then exit (ie not > startup) with WARNING message
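The single-scheduler guard proposed in AIRFLOW-6389 could be sketched roughly as follows. This is an illustrative stand-in, not Airflow's actual implementation: the `scheduler_job` table, `latest_heartbeat` column, and the 30-second grace period are all invented here, and `sqlite3` simulates the shared metastore that both EC2 instances would point at.

```python
import sqlite3
from datetime import datetime, timedelta

# How stale a heartbeat may be before we treat that scheduler as dead.
HEARTBEAT_GRACE = timedelta(seconds=30)


def another_scheduler_alive(conn, now=None):
    """Return True if a recent scheduler heartbeat exists in the shared db."""
    now = now or datetime.utcnow()
    row = conn.execute(
        "SELECT MAX(latest_heartbeat) FROM scheduler_job WHERE state = 'running'"
    ).fetchone()
    if row[0] is None:
        return False
    latest = datetime.fromisoformat(row[0])
    return now - latest < HEARTBEAT_GRACE


# Simulate the shared metastore with an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scheduler_job (state TEXT, latest_heartbeat TEXT)")

assert not another_scheduler_alive(conn)  # no scheduler yet: safe to start

# EC2 1's scheduler is heartbeating, so a second scheduler should refuse to start.
conn.execute(
    "INSERT INTO scheduler_job VALUES ('running', ?)",
    (datetime.utcnow().isoformat(),),
)
if another_scheduler_alive(conn) and not ALLOW_MULTI if False else another_scheduler_alive(conn):
    print("WARNING: another scheduler appears to be running; exiting")
```

With `allow_multi_scheduler_instances = False`, scheduler startup would run a check like this and exit with a WARNING instead of proceeding; stale heartbeats (older than the grace window) would not block startup, which keeps the blue/green cutover from deadlocking if the old scheduler dies uncleanly.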
[jira] [Updated] (AIRFLOW-6388) SparkSubmitOperator polling should not 'consume' a pool slot
[ https://issues.apache.org/jira/browse/AIRFLOW-6388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] t oo updated AIRFLOW-6388: -- Summary: SparkSubmitOperator polling should not 'consume' a pool slot (was: SparkSubmitOperator polling should not 'consume' a slot) > SparkSubmitOperator polling should not 'consume' a pool slot > > > Key: AIRFLOW-6388 > URL: https://issues.apache.org/jira/browse/AIRFLOW-6388 > Project: Apache Airflow > Issue Type: Improvement > Components: dependencies, scheduler >Affects Versions: 1.10.3 >Reporter: t oo >Priority: Minor > > Spark jobs can often take many minutes (or even hours) to complete. > The spark submit operator submits a job to a spark cluster, then continually > polls its status until it detects the spark job has ended. This means it > could be consuming a 'slot' (ie parallelism, dag_concurrency, > max_active_dag_runs_per_dag, non_pooled_task_slot_count) for hours when it is > not 'doing' anything but polling for status. > https://github.com/apache/airflow/pull/6909#discussion_r361838225 suggested > it should move to a poke/reschedule model. > Another thing to note is that in cluster mode a spark-submit made to a 'full' > spark cluster will sit in WAITING state on spark side until some cores/memory > is freed, then the driver/app can go into RUNNING > "This actually means occupy worker and do nothing for n seconds is it not? > It was OK when it was 1 second but users may set it to even 5 min without > realising that it occupys the worker. > My comment here is more of a concern rather than an action to do. > Should this work by occupying the worker "indefinitely" or can it be > something like the sensors with (poke/reschedule)?" -- This message was sent by Atlassian Jira (v8.3.4#803005)
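The poke/reschedule model suggested for SparkSubmitOperator can be contrasted with the current blocking poll in a small stand-alone sketch. These helpers are invented for illustration (they are not Airflow's sensor API): the point is that a blocking poll occupies its slot for the job's full duration, while a reschedule-style check runs once and hands the slot back with a "re-check later" signal.

```python
import time
from datetime import datetime, timedelta


class Reschedule(Exception):
    """Signal: free the worker slot and re-check at `when` (cf. sensor reschedule mode)."""

    def __init__(self, when):
        self.when = when


def poll_blocking(get_status, interval=0.01):
    # Current behaviour: the operator holds its pool slot until the job ends.
    while get_status() not in ("FINISHED", "FAILED"):
        time.sleep(interval)
    return get_status()


def poll_once_then_reschedule(get_status, interval_s=300):
    # Proposed behaviour: check the Spark job once, then release the slot.
    status = get_status()
    if status in ("FINISHED", "FAILED"):
        return status
    raise Reschedule(datetime.utcnow() + timedelta(seconds=interval_s))


status = {"value": "RUNNING"}
get = lambda: status["value"]

try:
    poll_once_then_reschedule(get)
except Reschedule as r:
    print("slot released; re-check at", r.when)

status["value"] = "FINISHED"
```

This also addresses the WAITING-state concern: a submit made to a full cluster would simply keep rescheduling cheaply instead of pinning a worker while Spark queues the application.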
[jira] [Updated] (AIRFLOW-6443) Pool name - increase length > 50, make cli give error if too large
[ https://issues.apache.org/jira/browse/AIRFLOW-6443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] t oo updated AIRFLOW-6443: -- Labels: gsoc gsoc2020 mentor pool (was: pool) > Pool name - increase length > 50, make cli give error if too large > -- > > Key: AIRFLOW-6443 > URL: https://issues.apache.org/jira/browse/AIRFLOW-6443 > Project: Apache Airflow > Issue Type: Improvement > Components: cli, database, models >Affects Versions: 1.10.3 >Reporter: t oo >Priority: Major > Labels: gsoc, gsoc2020, mentor, pool > > create some pool names (using cli) with 70 or 80 character length > > 1. UI does not allow creating > 50 length but why does cli? > click on one of the pool names listed (link is cut to 50 char name: > [https://domain:8080/admin/airflow/task?flt1_pool_equals=qjfdal_CRCE_INTERCONNECTION_FORECAST_TNC_EJFLSA_LP)] > If click 'edit' it shows full 80chars in Description but cut 50chars in Pool > 2. why limit to 50 length at all? should be increased - say 256 > 3. if trying to create really large length (more than models length) then cli > should give error -- This message was sent by Atlassian Jira (v8.3.4#803005)
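Point 3 of AIRFLOW-6443 (the CLI should reject names longer than the model column) amounts to a small validation step before the insert. `validate_pool_name` below is a hypothetical helper, and the 256-character limit is the value proposed in the ticket, not Airflow's current one (which is 50):

```python
POOL_NAME_MAX_LEN = 256  # proposed in the ticket; the current model column is 50


def validate_pool_name(name: str, max_len: int = POOL_NAME_MAX_LEN) -> str:
    """Reject pool names the database column would otherwise silently truncate."""
    if not name:
        raise ValueError("pool name must not be empty")
    if len(name) > max_len:
        raise ValueError(
            f"pool name is {len(name)} chars, exceeds column limit of {max_len}"
        )
    return name


assert validate_pool_name("etl_pool") == "etl_pool"
```

Failing fast in the CLI this way would also prevent the UI symptom described above, where a pool created with an 80-character name shows up truncated to 50 characters in links and in the edit form.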
[GitHub] [airflow] saguziel commented on issue #7269: [AIRFLOW-6651] Add Redis Heartbeat option
saguziel commented on issue #7269: [AIRFLOW-6651] Add Redis Heartbeat option URL: https://github.com/apache/airflow/pull/7269#issuecomment-591116708 Celery does not support this type of thing, and it would not be portable across executors. Celery's consistency model is really simple, which is why it gets really messy with the task_acks_late arg and the fair scheduler and whatnot. I would say Celery is not designed to offer near-exactly-once type semantics which is why we have so many hacks on top of it.
[GitHub] [airflow] tooptoop4 commented on issue #7537: [AIRFLOW-XXXX] And script to benchmark scheduler dag-run time
tooptoop4 commented on issue #7537: [AIRFLOW-XXXX] And script to benchmark scheduler dag-run time URL: https://github.com/apache/airflow/pull/7537#issuecomment-591113663 maybe link to https://jira.apache.org/jira/browse/AIRFLOW-6454 ?
[GitHub] [airflow] ashb commented on a change in pull request #7537: [AIRFLOW-XXXX] And script to benchmark scheduler dag-run time
ashb commented on a change in pull request #7537: [AIRFLOW-] And script to benchmark scheduler dag-run time URL: https://github.com/apache/airflow/pull/7537#discussion_r384173851 ## File path: scripts/perf/scheduler_dag_execution_timing.py ## @@ -0,0 +1,221 @@ +#!/usr/bin/env python3 +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. + +import gc +import os +import statistics +import time + +import click + + +class ShortCircutExecutorMixin: +def __init__(self, stop_when_these_completed): +super().__init__() Review comment: Used like this: https://github.com/apache/airflow/pull/7537/files#diff-fda366d49e19f54d95380db869c369e1R68-R71 ```python Executor = MockExecutor class ShortCircutExecutor(ShortCircutExecutorMixin, Executor): pass ``` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [airflow] ashb commented on a change in pull request #7537: [AIRFLOW-XXXX] And script to benchmark scheduler dag-run time
ashb commented on a change in pull request #7537: [AIRFLOW-] And script to benchmark scheduler dag-run time URL: https://github.com/apache/airflow/pull/7537#discussion_r384173467 ## File path: scripts/perf/scheduler_dag_execution_timing.py ## @@ -0,0 +1,221 @@ +#!/usr/bin/env python3 +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. + +import gc +import os +import statistics +import time + +import click + + +class ShortCircutExecutorMixin: +def __init__(self, stop_when_these_completed): +super().__init__() Review comment: Yes, because it's a MixIn. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [airflow] kaxil commented on a change in pull request #7537: [AIRFLOW-XXXX] And script to benchmark scheduler dag-run time
kaxil commented on a change in pull request #7537: [AIRFLOW-] And script to benchmark scheduler dag-run time URL: https://github.com/apache/airflow/pull/7537#discussion_r384173156 ## File path: scripts/perf/scheduler_dag_execution_timing.py ## @@ -0,0 +1,221 @@ +#!/usr/bin/env python3 +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. + +import gc +import os +import statistics +import time + +import click + + +class ShortCircutExecutorMixin: +def __init__(self, stop_when_these_completed): +super().__init__() Review comment: Do we need a super call here as we are not subclassing anything? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
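The exchange above (kaxil: "Do we need a super call here as we are not subclassing anything?" / ashb: "Yes, because it's a MixIn") can be demonstrated with a minimal stand-alone example. The class names here are invented, not taken from the benchmark script: under cooperative multiple inheritance, `super()` inside a mixin that declares no base class still forwards along the MRO of the *combined* class, so dropping the call would skip the real executor's `__init__`.

```python
class Base:
    def __init__(self):
        self.log = []
        self.log.append("Base")


class Mixin:
    # No explicit base class, yet super() matters: Python resolves it against
    # the MRO of the instantiated class, so here it forwards to Base. Without
    # it, Base.__init__ would never run and self.log would not exist.
    def __init__(self, extra):
        super().__init__()
        self.log.append(f"Mixin({extra})")


class Combined(Mixin, Base):
    pass


c = Combined("stop-list")
print(c.log)  # both __init__ methods ran, Base first
```

This mirrors how `ShortCircutExecutorMixin` is composed with `MockExecutor` in the PR: `class ShortCircutExecutor(ShortCircutExecutorMixin, Executor)` puts the executor after the mixin in the MRO, and the mixin's `super().__init__()` reaches it.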
[GitHub] [airflow] ashb commented on issue #7510: [AIRFLOW-6887][WIP] Do not check the state of fresh DAGRun
ashb commented on issue #7510: [AIRFLOW-6887][WIP] Do not check the state of fresh DAGRun URL: https://github.com/apache/airflow/pull/7510#issuecomment-591112184 @Fokko `run.refresh_from_db()` just loads the id and state columns. The case you are talking about is still handled correctly (by the slightly badly named `verify_integrity` function).
[GitHub] [airflow] houqp commented on a change in pull request #7492: [AIRFLOW-6871] optimize tree view for large DAGs
houqp commented on a change in pull request #7492: [AIRFLOW-6871] optimize tree view for large DAGs URL: https://github.com/apache/airflow/pull/7492#discussion_r384172061 ## File path: airflow/www/views.py ## @@ -1374,90 +1376,115 @@ def tree(self): .all() ) dag_runs = { -dr.execution_date: alchemy_to_dict(dr) for dr in dag_runs} +dr.execution_date: alchemy_to_dict(dr) for dr in dag_runs +} dates = sorted(list(dag_runs.keys())) max_date = max(dates) if dates else None min_date = min(dates) if dates else None tis = dag.get_task_instances(start_date=min_date, end_date=base_date) -task_instances = {} +task_instances: Dict[Tuple[str, datetime], models.TaskInstance] = {} for ti in tis: -tid = alchemy_to_dict(ti) -dr = dag_runs.get(ti.execution_date) -tid['external_trigger'] = dr['external_trigger'] if dr else False -task_instances[(ti.task_id, ti.execution_date)] = tid +task_instances[(ti.task_id, ti.execution_date)] = ti -expanded = [] +expanded = set() # The default recursion traces every path so that tree view has full # expand/collapse functionality. After 5,000 nodes we stop and fall # back on a quick DFS search for performance. See PR #320. -node_count = [0] +node_count = 0 node_limit = 5000 / max(1, len(dag.leaves)) +def encode_ti(ti: Optional[models.TaskInstance]) -> Optional[List]: +if not ti: +return None + +# NOTE: order of entry is important here because client JS relies on it for +# tree node reconstruction. Remember to change JS code in tree.html +# whenever order is altered. 
+data = [ +ti.state, +ti.try_number, +None, # start_ts +None, # duration +] + +if ti.start_date: +# round to seconds to reduce payload size +data[2] = int(ti.start_date.timestamp()) +if ti.duration is not None: +data[3] = int(ti.duration) + +return data + def recurse_nodes(task, visited): +nonlocal node_count +node_count += 1 visited.add(task) -node_count[0] += 1 - -children = [ -recurse_nodes(t, visited) for t in task.downstream_list -if node_count[0] < node_limit or t not in visited] - -# D3 tree uses children vs _children to define what is -# expanded or not. The following block makes it such that -# repeated nodes are collapsed by default. -children_key = 'children' -if task.task_id not in expanded: -expanded.append(task.task_id) -elif children: -children_key = "_children" - -def set_duration(tid): -if (isinstance(tid, dict) and tid.get("state") == State.RUNNING and -tid["start_date"] is not None): -d = timezone.utcnow() - timezone.parse(tid["start_date"]) -tid["duration"] = d.total_seconds() -return tid - -return { +task_id = task.task_id + +node = { 'name': task.task_id, 'instances': [ -set_duration(task_instances.get((task.task_id, d))) or { -'execution_date': d.isoformat(), -'task_id': task.task_id -} -for d in dates], -children_key: children, +encode_ti(task_instances.get((task_id, d))) +for d in dates +], 'num_dep': len(task.downstream_list), 'operator': task.task_type, 'retries': task.retries, 'owner': task.owner, -'start_date': task.start_date, -'end_date': task.end_date, -'depends_on_past': task.depends_on_past, 'ui_color': task.ui_color, -'extra_links': task.extra_links, } +if task.downstream_list: +children = [ +recurse_nodes(t, visited) for t in task.downstream_list +if node_count < node_limit or t not in visited] + +# D3 tree uses children vs _children to define what is +# expanded or not. The following block makes it such that +# repeated nodes are collapsed by default. 
+if task.task_id not in expanded: +children_key = 'children' +expanded.add(task.task_id) +else: +
[GitHub] [airflow] ashb commented on a change in pull request #7492: [AIRFLOW-6871] optimize tree view for large DAGs
ashb commented on a change in pull request #7492: [AIRFLOW-6871] optimize tree view for large DAGs URL: https://github.com/apache/airflow/pull/7492#discussion_r384171265 ## File path: airflow/www/templates/airflow/tree.html ## @@ -85,8 +85,59 @@
[jira] [Commented] (AIRFLOW-6518) Task did not retry when there was temporary metastore db connectivity loss
[ https://issues.apache.org/jira/browse/AIRFLOW-6518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17044969#comment-17044969 ] t oo commented on AIRFLOW-6518: --- [~ash] [~potiuk] thoughts? > Task did not retry when there was temporary metastore db connectivity loss > -- > > Key: AIRFLOW-6518 > URL: https://issues.apache.org/jira/browse/AIRFLOW-6518 > Project: Apache Airflow > Issue Type: Bug > Components: database, scheduler >Affects Versions: 1.10.6 >Reporter: t oo >Priority: Major > > My DAG has got retries configured at the task level. I started a dagrun, then > while a task was running the metastore db crashed, the task failed, but the > dagrun did not attempt to retry the task (even though task retries are > configured!), db recovered 3 seconds after the task failed, instead the > dagrun went to FAILED state. > *Last part of log of TaskInstance:* > [2020-01-08 17:34:46,301] {base_task_runner.py:115} INFO - Job 34662: Subtask > mytsk Traceback (most recent call last): > [2020-01-08 17:34:46,301] {base_task_runner.py:115} INFO - Job 34662: Subtask > mytsk File "/home/ec2-user/venv/bin/airflow", line 37, in > [2020-01-08 17:34:46,302] {base_task_runner.py:115} INFO - Job 34662: Subtask > mytsk args.func(args) > [2020-01-08 17:34:46,302] {base_task_runner.py:115} INFO - Job 34662: Subtask > mytsk File > "/home/ec2-user/venv/local/lib/python2.7/site-packages/airflow/utils/cli.py", > line 74, in wrapper > [2020-01-08 17:34:46,302] {base_task_runner.py:115} INFO - Job 34662: Subtask > mytsk return f(*args, **kwargs) > [2020-01-08 17:34:46,302] {base_task_runner.py:115} INFO - Job 34662: Subtask > mytsk File > "/home/ec2-user/venv/local/lib/python2.7/site-packages/airflow/bin/cli.py", > line 551, in run > [2020-01-08 17:34:46,302] {base_task_runner.py:115} INFO - Job 34662: Subtask > mytsk _run(args, dag, ti) > [2020-01-08 17:34:46,302] {base_task_runner.py:115} INFO - Job 34662: Subtask > mytsk File > 
"/home/ec2-user/venv/local/lib/python2.7/site-packages/airflow/bin/cli.py", > line 469, in _run > [2020-01-08 17:34:46,302] {base_task_runner.py:115} INFO - Job 34662: Subtask > mytsk pool=args.pool, > [2020-01-08 17:34:46,302] {base_task_runner.py:115} INFO - Job 34662: Subtask > mytsk File > "/home/ec2-user/venv/local/lib/python2.7/site-packages/airflow/utils/db.py", > line 74, in wrapper > [2020-01-08 17:34:46,302] {base_task_runner.py:115} INFO - Job 34662: Subtask > mytsk return func(*args, **kwargs) > [2020-01-08 17:34:46,302] {base_task_runner.py:115} INFO - Job 34662: Subtask > mytsk File > "/home/ec2-user/venv/local/lib/python2.7/site-packages/airflow/models/taskinstance.py", > line 962, in _run_raw_task > [2020-01-08 17:34:46,302] {base_task_runner.py:115} INFO - Job 34662: Subtask > mytsk self.refresh_from_db() > [2020-01-08 17:34:46,303] {base_task_runner.py:115} INFO - Job 34662: Subtask > mytsk File > "/home/ec2-user/venv/local/lib/python2.7/site-packages/airflow/utils/db.py", > line 74, in wrapper > [2020-01-08 17:34:46,303] {base_task_runner.py:115} INFO - Job 34662: Subtask > mytsk return func(*args, **kwargs) > [2020-01-08 17:34:46,303] {base_task_runner.py:115} INFO - Job 34662: Subtask > mytsk File > "/home/ec2-user/venv/local/lib/python2.7/site-packages/airflow/models/taskinstance.py", > line 461, in refresh_from_db > [2020-01-08 17:34:46,303] {base_task_runner.py:115} INFO - Job 34662: Subtask > mytsk ti = qry.first() > [2020-01-08 17:34:46,303] {base_task_runner.py:115} INFO - Job 34662: Subtask > mytsk File > "/home/ec2-user/venv/local/lib64/python2.7/site-packages/sqlalchemy/orm/query.py", > line 3265, in first > [2020-01-08 17:34:46,303] {base_task_runner.py:115} INFO - Job 34662: Subtask > mytsk ret = list(self[0:1]) > [2020-01-08 17:34:46,303] {base_task_runner.py:115} INFO - Job 34662: Subtask > mytsk File > "/home/ec2-user/venv/local/lib64/python2.7/site-packages/sqlalchemy/orm/query.py", > line 3043, in __getitem__ > [2020-01-08 
17:34:46,303] {base_task_runner.py:115} INFO - Job 34662: Subtask > mytsk return list(res) > [2020-01-08 17:34:46,303] {base_task_runner.py:115} INFO - Job 34662: Subtask > mytsk File > "/home/ec2-user/venv/local/lib64/python2.7/site-packages/sqlalchemy/orm/query.py", > line 3367, in __iter__ > [2020-01-08 17:34:46,303] {base_task_runner.py:115} INFO - Job 34662: Subtask > mytsk return self._execute_and_instances(context) > [2020-01-08 17:34:46,304] {base_task_runner.py:115} INFO - Job 34662: Subtask > mytsk File > "/home/ec2-user/venv/local/lib64/python2.7/site-packages/sqlalchemy/orm/query.py", > line 3389, in _execute_and_instances > [2020-01-08 17:34:46,304] {base_task_runner.py:115} INFO - Job 34662: Subtask >
[jira] [Updated] (AIRFLOW-6747) UI - Show count of tasks in each dag on the main dags page
[ https://issues.apache.org/jira/browse/AIRFLOW-6747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] t oo updated AIRFLOW-6747: -- Labels: gsoc gsoc2020 mentor (was: ) > UI - Show count of tasks in each dag on the main dags page > -- > > Key: AIRFLOW-6747 > URL: https://issues.apache.org/jira/browse/AIRFLOW-6747 > Project: Apache Airflow > Issue Type: Improvement > Components: ui >Affects Versions: 1.10.7 >Reporter: t oo >Priority: Minor > Labels: gsoc, gsoc2020, mentor > > Main DAGs page in UI - would benefit from showing a new column: number of > tasks for each dag id
[GitHub] [airflow] ashb commented on issue #7538: [AIRFLOW-6382] Add reason for pre-commit rule
ashb commented on issue #7538: [AIRFLOW-6382] Add reason for pre-commit rule URL: https://github.com/apache/airflow/pull/7538#issuecomment-591108820 (so much for the CI skip)
[jira] [Updated] (AIRFLOW-6918) don't use 'is' in if conditions comparing STATE
[ https://issues.apache.org/jira/browse/AIRFLOW-6918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] t oo updated AIRFLOW-6918: -- Description: Trivial code tidy-up - 'is' is usually used for testing booleans or None presence. This change ensures STATE values are tested with == / != instead of 'is' > don't use 'is' in if conditions comparing STATE > --- > > Key: AIRFLOW-6918 > URL: https://issues.apache.org/jira/browse/AIRFLOW-6918 > Project: Apache Airflow > Issue Type: Improvement > Components: scheduler >Affects Versions: 1.10.9 >Reporter: t oo >Assignee: t oo >Priority: Trivial > > Trivial code tidy-up - 'is' is usually used for testing booleans or None > presence. This change ensures STATE values are tested with == / != instead of > 'is'
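A quick illustration of why the tidy-up in AIRFLOW-6918 matters. Airflow states are plain strings (e.g. `State.RUNNING == "running"`), so `is` compares object identity rather than value, and whether two equal strings are the same object depends on CPython's interning, which code must not rely on:

```python
RUNNING = "running"  # stands in for State.RUNNING, a plain string constant

# Build an equal string at runtime: same value, but a distinct object in CPython.
state = "".join(["run", "ning"])

assert state == RUNNING        # value comparison: always correct for states
assert state is not RUNNING    # identity differs here (CPython, non-interned result)

# 'is' remains the right operator for None, which is a singleton:
missing = None
assert missing is None
```

So `ti.state == State.RUNNING` is reliable while `ti.state is State.RUNNING` can silently evaluate to False for an equal string, which is exactly the class of bug the change removes.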
[GitHub] [airflow] ashb commented on a change in pull request #7492: [AIRFLOW-6871] optimize tree view for large DAGs
ashb commented on a change in pull request #7492: [AIRFLOW-6871] optimize tree view for large DAGs URL: https://github.com/apache/airflow/pull/7492#discussion_r384168703 ## File path: airflow/www/templates/airflow/tree.html ## @@ -85,8 +85,59 @@ $('span.status_square').tooltip({html: true}); +function ts_to_dtstr(ts) { + var dt = new Date(ts * 1000); + return dt.toISOString(); +} + +function is_dag_run(d) { + return d.run_id !== undefined; +} + +var now_ts = Date.now()/1000; + +function populate_taskinstance_properties(node) { + // populate task instance properties for display purpose + var j; + for (j=0; j
[GitHub] [airflow] tooptoop4 commented on issue #7536: [AIRFLOW-6918] don't use 'is' in if conditions comparing STATE
tooptoop4 commented on issue #7536: [AIRFLOW-6918] don't use 'is' in if conditions comparing STATE URL: https://github.com/apache/airflow/pull/7536#issuecomment-591107895 @kaxil updated
[GitHub] [airflow] ashb commented on a change in pull request #7492: [AIRFLOW-6871] optimize tree view for large DAGs
ashb commented on a change in pull request #7492: [AIRFLOW-6871] optimize tree view for large DAGs URL: https://github.com/apache/airflow/pull/7492#discussion_r384164611 ## File path: airflow/www/templates/airflow/tree.html ## @@ -85,8 +85,59 @@ $('span.status_square').tooltip({html: true}); +function ts_to_dtstr(ts) { + var dt = new Date(ts * 1000); + return dt.toISOString(); +} + +function is_dag_run(d) { + return d.run_id !== undefined; +} + +var now_ts = Date.now()/1000; + +function populate_taskinstance_properties(node) { + // populate task instance properties for display purpose + var j; + for (j=0; j
[GitHub] [airflow] ashb commented on a change in pull request #7537: [AIRFLOW-XXXX] And script to benchmark scheduler dag-run time
ashb commented on a change in pull request #7537: [AIRFLOW-] And script to benchmark scheduler dag-run time URL: https://github.com/apache/airflow/pull/7537#discussion_r384161251

## File path: scripts/perf/scheduler_dag_execution_timing.py ##

@@ -0,0 +1,220 @@
```python
#!/usr/bin/env python3
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements.  See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership.  The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License.  You may obtain a copy of the License at
#
#   http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied.  See the License for the
# specific language governing permissions and limitations
# under the License.

import gc
import os
import statistics
import time

import click


class ShortCircutExecutorMixin:
    def __init__(self, stop_when_these_completed):
        super().__init__()
        self.reset(stop_when_these_completed)

    def reset(self, stop_when_these_completed):
        self.stop_when_these_completed = {
            # Store the date as a timestamp, as sometimes this is a Pendulum
            # object, others it is a datetime object.
            (run.dag_id, run.execution_date.timestamp()): run for run in stop_when_these_completed
        }

    def change_state(self, key, state):
        from airflow.utils.state import State
        super().change_state(key, state)

        dag_id, task_id, execution_date, __ = key
        run_key = (dag_id, execution_date.timestamp())
        run = self.stop_when_these_completed.get(run_key, None)
        if run and all(t.state == State.SUCCESS for t in run.get_task_instances()):
            self.stop_when_these_completed.pop(run_key)

            if not self.stop_when_these_completed:
                self.log.warning("STOPPING SCHEDULER -- all runs complete")
                self.scheduler_job.processor_agent._done = True
            else:
                self.log.warning("WAITING ON %d RUNS", len(self.stop_when_these_completed))
        elif state == State.SUCCESS:
            self.log.warning("WAITING ON %d RUNS", len(self.stop_when_these_completed))


def get_executor_under_test():
    try:
        # Run against master and 1.10.x releases
        from tests.test_utils.mock_executor import MockExecutor
    except ImportError:
        from tests.executors.test_executor import TestExecutor as MockExecutor

    # from airflow.executors.local_executor import LocalExecutor

    # Change this to try other executors
    Executor = MockExecutor

    class ShortCircutExecutor(ShortCircutExecutorMixin, Executor):
        pass

    return ShortCircutExecutor


def reset_dag(dag, num_runs, session):
    import airflow.models
    from airflow.utils import timezone
    from airflow.utils.state import State

    DR = airflow.models.DagRun
    DM = airflow.models.DagModel
    TI = airflow.models.TaskInstance
    TF = airflow.models.TaskFail
    dag_id = dag.dag_id

    session.query(DM).filter(DM.dag_id == dag_id).update({'is_paused': False})
    session.query(DR).filter(DR.dag_id == dag_id).delete()
    session.query(TI).filter(TI.dag_id == dag_id).delete()
    session.query(TF).filter(TF.dag_id == dag_id).delete()

    next_run_date = dag.normalize_schedule(dag.start_date or min(t.start_date for t in dag.tasks))

    for _ in range(num_runs):
        next_run = dag.create_dagrun(
            run_id=DR.ID_PREFIX + next_run_date.isoformat(),
            execution_date=next_run_date,
            start_date=timezone.utcnow(),
            state=State.RUNNING,
            external_trigger=False,
            session=session,
        )
        next_run_date = dag.following_schedule(next_run_date)
    return next_run


def pause_all_dags(session):
    from airflow.models.dag import DagModel
    session.query(DagModel).update({'is_paused': True})


@click.command()
@click.option('--num-runs', default=1, help='number of DagRun, to run for each DAG')
@click.option('--repeat', default=3, help='number of times to run test, to reduce variance')
@click.argument('dag_ids', required=True, nargs=-1)
def main(num_runs, repeat, dag_ids):
    """
    This script will run the SchedulerJob for the specified dags "to completion".

    That is it creates a fixed number of DAG runs for the specified DAGs (from
    the configured dag path/example dags etc), disable the scheduler from
    creating more, and then monitor them for completion. When the file task of
    the final dag run is
```
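The executor-composition trick in `get_executor_under_test()` above can be isolated into a small sketch. This is an illustrative reconstruction, not the script itself: `BaseExecutor`, `ShortCircuitMixin`, and `make_executor` are made-up names, and the `done` flag stands in for the scheduler-stopping side effect.

```python
class BaseExecutor:
    """Stand-in for whichever executor class is under test."""

    def change_state(self, key, state):
        print(f"{key} -> {state}")


class ShortCircuitMixin:
    """Track a set of watched runs and flag completion when it empties."""

    def __init__(self, watched):
        super().__init__()
        self.watched = set(watched)
        self.done = not self.watched

    def change_state(self, key, state):
        # Delegate to the real executor first, then update the watch set.
        super().change_state(key, state)
        self.watched.discard(key)
        self.done = not self.watched


def make_executor(executor_cls):
    # Build the combined class at runtime, mirroring the script's approach of
    # composing the mixin with a chosen executor class.
    class ShortCircuitExecutor(ShortCircuitMixin, executor_cls):
        pass

    return ShortCircuitExecutor
```

Because the mixin is listed first in the bases, Python's MRO lets its `change_state` wrap whichever executor class is passed in, which is why the script can swap `MockExecutor` for `LocalExecutor` by changing a single assignment.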
[jira] [Commented] (AIRFLOW-6866) export EXTRA_DC_OPTIONS+= fails on mac
[ https://issues.apache.org/jira/browse/AIRFLOW-6866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17044952#comment-17044952 ] ASF subversion and git services commented on AIRFLOW-6866: -- Commit 34a1f92ffd5e8977f640e0f1333c70b0c15bdb99 in airflow's branch refs/heads/v1-10-test from Jarek Potiuk [ https://gitbox.apache.org/repos/asf?p=airflow.git;h=34a1f92 ] [AIRFLOW-6866] Fix wrong export for Mac on Breeze (#7485) (cherry picked from commit 6502cfa8e61cb57673df327f8d1548c4bef89bcf) > export EXTRA_DC_OPTIONS+= fails on mac > -- > > Key: AIRFLOW-6866 > URL: https://issues.apache.org/jira/browse/AIRFLOW-6866 > Project: Apache Airflow > Issue Type: Bug > Components: breeze >Affects Versions: 2.0.0, 1.10.9 >Reporter: Jarek Potiuk >Priority: Major > Fix For: 2.0.0 > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
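The failing construct in the title is `export EXTRA_DC_OPTIONS+=...`. A portable sketch of the usual workaround, assuming the problem is the `+=` form inside `export` on macOS's older system bash (the option value below is hypothetical, not the one Breeze uses):

```shell
# Append with a plain assignment, then export; works on old and new bash alike.
# "${EXTRA_DC_OPTIONS:-}" keeps `set -u` scripts happy when the var is unset.
EXTRA_DC_OPTIONS="${EXTRA_DC_OPTIONS:-} -f docker-compose.local.yml"
export EXTRA_DC_OPTIONS
```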
[jira] [Commented] (AIRFLOW-6763) Make system test ready for backport tests
[ https://issues.apache.org/jira/browse/AIRFLOW-6763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17044954#comment-17044954 ] ASF subversion and git services commented on AIRFLOW-6763: -- Commit 3b03f861af570629649099900e60fbc2a82daf82 in airflow's branch refs/heads/v1-10-test from Jarek Potiuk [ https://gitbox.apache.org/repos/asf?p=airflow.git;h=3b03f86 ] [AIRFLOW-6763] Make systems tests ready for backport tests (#7389) We will run system test on back-ported operators for 1.10* series of airflow and for that we need to have support for running system tests using pytest's markers and reading environment variables passed from HOST machine (to pass credentials). This is the first step to automate system tests execution. (cherry picked from commit 848fbab5bddb5250d323a2ae5337b34a6e0734dd) > Make system test ready for backport tests > - > > Key: AIRFLOW-6763 > URL: https://issues.apache.org/jira/browse/AIRFLOW-6763 > Project: Apache Airflow > Issue Type: Bug > Components: ci >Affects Versions: 2.0.0 >Reporter: Jarek Potiuk >Priority: Major > Fix For: 2.0.0 > > > We will run system test on back-ported operators for 1.10* series of airflow > and for that we need to have support for running system tests using pytest's > markers and reading environment variables passed from HOST machine (to pass > credentials). > This is the first step to automate system test execution. -- This message was sent by Atlassian Jira (v8.3.4#803005)
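The pattern the commit describes (gating system tests behind environment variables passed from the host machine) can be sketched with the standard library. The variable name `RUN_SYSTEM_TESTS` and the helper are assumptions for illustration, not the actual Airflow implementation:

```python
import os
import unittest


def system_tests_enabled(env=None) -> bool:
    """True when the host opted in, e.g. by exporting RUN_SYSTEM_TESTS=true."""
    env = os.environ if env is None else env
    return env.get("RUN_SYSTEM_TESTS", "").lower() == "true"


class ExampleSystemTest(unittest.TestCase):
    def setUp(self):
        # Credentials come from the host environment, so skip when absent.
        if not system_tests_enabled():
            self.skipTest("RUN_SYSTEM_TESTS not exported by the host")

    def test_smoke(self):
        self.assertTrue(True)
```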
[jira] [Commented] (AIRFLOW-6892) Fix broken non-wheel releases
[ https://issues.apache.org/jira/browse/AIRFLOW-6892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17044953#comment-17044953 ] ASF subversion and git services commented on AIRFLOW-6892: -- Commit 69f7d2e0ae305648d4c3746dfe4f82dc4cc1118f in airflow's branch refs/heads/v1-10-test from Kaxil Naik [ https://gitbox.apache.org/repos/asf?p=airflow.git;h=69f7d2e ] [AIRFLOW-6892] Fix broken non-wheel releases (#7514) (cherry picked from commit 20c507f0928efb406a7677d68a6382c100bdcfd3) > Fix broken non-wheel releases > - > > Key: AIRFLOW-6892 > URL: https://issues.apache.org/jira/browse/AIRFLOW-6892 > Project: Apache Airflow > Issue Type: Bug > Components: packages >Affects Versions: 1.10.7, 1.10.8, 1.10.9 >Reporter: Kaxil Naik >Assignee: Kaxil Naik >Priority: Major > Fix For: 1.10.10 > > > Our non-wheel releases on pypi are broken > {code} > pip install --no-binary apache-airflow apache-airflow > {code} > will install a version that does this: > {noformat} > FileNotFoundError: [Errno 2] No such file or directory: > '/home/ash/.virtualenvs/clean/lib/python3.7/site-packages/airflow/serialization/schema.json' > {noformat} > There are no issues with the wheel package. Pip installs the wheel package if available, unless you use "--no-binary" specifically
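The symptom above (a missing `schema.json` only in sdist installs) is consistent with non-Python data files not being declared for packaging: an sdist install only ships files the packaging config covers. A hedged sketch of the idea, where the `package_data` pattern is illustrative and not Airflow's actual `setup.py`:

```python
from fnmatch import fnmatch

# Hypothetical declaration: ship JSON schemas alongside the package sources.
package_data = {"airflow": ["serialization/*.json"]}


def is_shipped(package: str, relpath: str) -> bool:
    """True when some declared pattern covers the file."""
    return any(fnmatch(relpath, pattern) for pattern in package_data.get(package, []))
```

If no pattern covers a file, a wheel built from a full checkout may still contain it while the sdist install does not, matching the "wheel works, `--no-binary` breaks" behaviour reported here.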
[jira] [Commented] (AIRFLOW-6838) Introduce real subcommands for Breeze
[ https://issues.apache.org/jira/browse/AIRFLOW-6838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17044955#comment-17044955 ] ASF subversion and git services commented on AIRFLOW-6838: -- Commit 156b946d05abaae1c30517050f9045c7bd538815 in airflow's branch refs/heads/v1-10-test from Jarek Potiuk [ https://gitbox.apache.org/repos/asf?p=airflow.git;h=156b946 ] [AIRFLOW-6838] Introduce real subcommands for Breeze (#7515) This change introduces sub-commands in breeze tool. It is much needed as we have many commands now and it was difficult to separate commands from flags. Also --help output was very long and unreadable. With this change help it is much easier to discover what breeze can do for you as well as navigate with it. Co-authored-by: Jarek Potiuk Co-authored-by: Kamil Breguła > Introduce real subcommands for Breeze > - > > Key: AIRFLOW-6838 > URL: https://issues.apache.org/jira/browse/AIRFLOW-6838 > Project: Apache Airflow > Issue Type: Bug > Components: core >Affects Versions: 2.0.0, 1.10.9 >Reporter: Kamil Bregula >Priority: Major > Fix For: 1.10.10 > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
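The flags-vs-subcommands problem the commit message describes can be sketched with stdlib `argparse` subparsers (Breeze itself is a shell script, and the command names here are illustrative only):

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Top-level parser with real subcommands instead of one flat flag list."""
    parser = argparse.ArgumentParser(prog="breeze")
    sub = parser.add_subparsers(dest="command", required=True)

    shell = sub.add_parser("shell", help="enter the container shell")
    shell.add_argument("--python", default="3.7", help="Python version to use")

    sub.add_parser("stop", help="stop the environment")
    return parser
```

With subcommands, `--help` stays short at the top level and each command documents only its own flags, which is the readability problem the change addresses.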
[GitHub] [airflow] houqp commented on a change in pull request #7492: [AIRFLOW-6871] optimize tree view for large DAGs
houqp commented on a change in pull request #7492: [AIRFLOW-6871] optimize tree view for large DAGs URL: https://github.com/apache/airflow/pull/7492#discussion_r384160648

## File path: airflow/www/views.py ##

```diff
@@ -1374,90 +1376,115 @@ def tree(self):
             .all()
         )
         dag_runs = {
-            dr.execution_date: alchemy_to_dict(dr) for dr in dag_runs}
+            dr.execution_date: alchemy_to_dict(dr) for dr in dag_runs
+        }
         dates = sorted(list(dag_runs.keys()))
         max_date = max(dates) if dates else None
         min_date = min(dates) if dates else None

         tis = dag.get_task_instances(start_date=min_date, end_date=base_date)
-        task_instances = {}
+        task_instances: Dict[Tuple[str, datetime], models.TaskInstance] = {}
         for ti in tis:
-            tid = alchemy_to_dict(ti)
-            dr = dag_runs.get(ti.execution_date)
-            tid['external_trigger'] = dr['external_trigger'] if dr else False
-            task_instances[(ti.task_id, ti.execution_date)] = tid
+            task_instances[(ti.task_id, ti.execution_date)] = ti

-        expanded = []
+        expanded = set()
         # The default recursion traces every path so that tree view has full
         # expand/collapse functionality. After 5,000 nodes we stop and fall
         # back on a quick DFS search for performance. See PR #320.
-        node_count = [0]
+        node_count = 0
         node_limit = 5000 / max(1, len(dag.leaves))

+        def encode_ti(ti: Optional[models.TaskInstance]) -> Optional[List]:
+            if not ti:
+                return None
+
+            # NOTE: order of entry is important here because client JS relies on it for
+            # tree node reconstruction. Remember to change JS code in tree.html
+            # whenever order is altered.
+            data = [
+                ti.state,
+                ti.try_number,
+                None,  # start_ts
+                None,  # duration
+            ]
+
+            if ti.start_date:
+                # round to seconds to reduce payload size
+                data[2] = int(ti.start_date.timestamp())
+            if ti.duration is not None:
+                data[3] = int(ti.duration)
+
+            return data
+
         def recurse_nodes(task, visited):
+            nonlocal node_count
+            node_count += 1
             visited.add(task)
-            node_count[0] += 1
-
-            children = [
-                recurse_nodes(t, visited) for t in task.downstream_list
-                if node_count[0] < node_limit or t not in visited]
-
-            # D3 tree uses children vs _children to define what is
-            # expanded or not. The following block makes it such that
-            # repeated nodes are collapsed by default.
-            children_key = 'children'
-            if task.task_id not in expanded:
-                expanded.append(task.task_id)
-            elif children:
-                children_key = "_children"
-
-            def set_duration(tid):
-                if (isinstance(tid, dict) and tid.get("state") == State.RUNNING and
-                        tid["start_date"] is not None):
-                    d = timezone.utcnow() - timezone.parse(tid["start_date"])
-                    tid["duration"] = d.total_seconds()
-                return tid
-
-            return {
+            task_id = task.task_id
+
+            node = {
                 'name': task.task_id,
                 'instances': [
-                    set_duration(task_instances.get((task.task_id, d))) or {
-                        'execution_date': d.isoformat(),
-                        'task_id': task.task_id
-                    }
-                    for d in dates],
-                children_key: children,
+                    encode_ti(task_instances.get((task_id, d)))
+                    for d in dates
+                ],
                 'num_dep': len(task.downstream_list),
                 'operator': task.task_type,
                 'retries': task.retries,
                 'owner': task.owner,
-                'start_date': task.start_date,
-                'end_date': task.end_date,
-                'depends_on_past': task.depends_on_past,
                 'ui_color': task.ui_color,
-                'extra_links': task.extra_links,
             }
+            if task.downstream_list:
+                children = [
+                    recurse_nodes(t, visited) for t in task.downstream_list
+                    if node_count < node_limit or t not in visited]
+
+                # D3 tree uses children vs _children to define what is
+                # expanded or not. The following block makes it such that
+                # repeated nodes are collapsed by default.
+                if task.task_id not in expanded:
+                    children_key = 'children'
+                    expanded.add(task.task_id)
+                else:
+
```
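The positional encoding in `encode_ti` above exists to shrink the JSON payload: a fixed field order replaces repeated dict keys in every tree cell. A standalone sketch of the idea, with field names assumed, mirroring the diff's comment that the client-side decoder depends on the order:

```python
import json
from typing import Optional

# Field order is the contract shared with the client-side decoder.
FIELDS = ("state", "try_number", "start_ts", "duration")


def encode_ti(ti: Optional[dict]) -> Optional[list]:
    """Encode one task instance as a fixed-order list (None for empty cells)."""
    if not ti:
        return None
    return [ti.get(field) for field in FIELDS]


def decode_ti(data: Optional[list]) -> Optional[dict]:
    """Rebuild the dict form; must use the same FIELDS order as the encoder."""
    if data is None:
        return None
    return dict(zip(FIELDS, data))
```

Per cell this saves the repeated key strings, which adds up over the thousands of nodes a large DAG's tree view renders.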
[jira] [Commented] (AIRFLOW-6382) Extract provide/create session to session module
[ https://issues.apache.org/jira/browse/AIRFLOW-6382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17044949#comment-17044949 ] ASF GitHub Bot commented on AIRFLOW-6382: - ashb commented on pull request #7538: [AIRFLOW-6382] Add reason for pre-commit rule URL: https://github.com/apache/airflow/pull/7538 [ci-skip] --- Issue link: WILL BE INSERTED BY [boring-cyborg](https://github.com/kaxil/boring-cyborg) Make sure to mark the boxes below before creating PR: [x] - [x] Description above provides context of the change - [x] Commit message/PR title starts with `[AIRFLOW-]`. AIRFLOW- = JIRA ID* - [x] Unit tests coverage for changes (not needed for documentation changes) - [x] Commits follow "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)" - [x] Relevant documentation is updated including usage instructions. - [x] I will engage committers as explained in [Contribution Workflow Example](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#contribution-workflow-example). * For document-only changes commit message can start with `[AIRFLOW-]`. --- In case of fundamental code change, Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)) is needed. In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x). In case of backwards incompatible changes please leave a note in [UPDATING.md](https://github.com/apache/airflow/blob/master/UPDATING.md). Read the [Pull Request Guidelines](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines) for more information. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Extract provide/create session to session module > > > Key: AIRFLOW-6382 > URL: https://issues.apache.org/jira/browse/AIRFLOW-6382 > Project: Apache Airflow > Issue Type: Improvement > Components: core >Affects Versions: 2.0.0 >Reporter: Tomasz Urbaszek >Priority: Major > Fix For: 2.0.0 > >
[GitHub] [airflow] ashb opened a new pull request #7538: [AIRFLOW-6382] Add reason for pre-commit rule
ashb opened a new pull request #7538: [AIRFLOW-6382] Add reason for pre-commit rule URL: https://github.com/apache/airflow/pull/7538 [ci-skip] --- Issue link: WILL BE INSERTED BY [boring-cyborg](https://github.com/kaxil/boring-cyborg) Make sure to mark the boxes below before creating PR: [x] - [x] Description above provides context of the change - [x] Commit message/PR title starts with `[AIRFLOW-]`. AIRFLOW- = JIRA ID* - [x] Unit tests coverage for changes (not needed for documentation changes) - [x] Commits follow "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)" - [x] Relevant documentation is updated including usage instructions. - [x] I will engage committers as explained in [Contribution Workflow Example](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#contribution-workflow-example). * For document-only changes commit message can start with `[AIRFLOW-]`. --- In case of fundamental code change, Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)) is needed. In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x). In case of backwards incompatible changes please leave a note in [UPDATING.md](https://github.com/apache/airflow/blob/master/UPDATING.md). Read the [Pull Request Guidelines](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines) for more information.
[GitHub] [airflow] ashb commented on issue #6938: [AIRFLOW-6382] Extract provide/create session to session module
ashb commented on issue #6938: [AIRFLOW-6382] Extract provide/create session to session module URL: https://github.com/apache/airflow/pull/6938#issuecomment-591098934 https://github.com/apache/airflow/pull/7538 will I think solve my issue/brain fart.
[GitHub] [airflow] kaxil commented on issue #7536: [AIRFLOW-6918] don't use 'is' in if conditions comparing STATE
kaxil commented on issue #7536: [AIRFLOW-6918] don't use 'is' in if conditions comparing STATE URL: https://github.com/apache/airflow/pull/7536#issuecomment-591097807 The PR and JIRA have no description; please add one before someone can review.
[GitHub] [airflow] houqp commented on a change in pull request #7492: [AIRFLOW-6871] optimize tree view for large DAGs
houqp commented on a change in pull request #7492: [AIRFLOW-6871] optimize tree view for large DAGs URL: https://github.com/apache/airflow/pull/7492#discussion_r384158366 ## File path: airflow/www/templates/airflow/tree.html ## @@ -85,8 +85,59 @@
[GitHub] [airflow] houqp commented on a change in pull request #7492: [AIRFLOW-6871] optimize tree view for large DAGs
houqp commented on a change in pull request #7492: [AIRFLOW-6871] optimize tree view for large DAGs URL: https://github.com/apache/airflow/pull/7492#discussion_r384158262 ## File path: airflow/www/templates/airflow/tree.html ## @@ -85,8 +85,59 @@
[GitHub] [airflow] ashb commented on a change in pull request #7492: [AIRFLOW-6871] optimize tree view for large DAGs
ashb commented on a change in pull request #7492: [AIRFLOW-6871] optimize tree view for large DAGs URL: https://github.com/apache/airflow/pull/7492#discussion_r384158218 ## File path: airflow/www/views.py ## @@ -1374,90 +1376,115 @@
[GitHub] [airflow] ashb commented on issue #6938: [AIRFLOW-6382] Extract provide/create session to session module
ashb commented on issue #6938: [AIRFLOW-6382] Extract provide/create session to session module URL: https://github.com/apache/airflow/pull/6938#issuecomment-591096260 It's probably fine as it is. Def not worth changing all the files again anyway.
[GitHub] [airflow] houqp edited a comment on issue #7537: [AIRFLOW-XXXX] And script to benchmark scheduler dag-run time
houqp edited a comment on issue #7537: [AIRFLOW-] And script to benchmark scheduler dag-run time URL: https://github.com/apache/airflow/pull/7537#issuecomment-591094143 Looks like a useful script that's worth mentioning in CONTRIBUTING.rst?
[GitHub] [airflow] ashb commented on issue #7537: [AIRFLOW-XXXX] And script to benchmark scheduler dag-run time
ashb commented on issue #7537: [AIRFLOW-] And script to benchmark scheduler dag-run time URL: https://github.com/apache/airflow/pull/7537#issuecomment-591094351 Example invocation, and final line of output (there are a lot of logs in the middle):

```
./scripts/perf/scheduler_dag_execution_timing.py --num-runs 2 example_dag example_0{1,2,3}_dag
...  # Lots of logs here
Time for 2 dag runs of 4 dags with 48 total tasks: 91.8814s (±0.149s)
```
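The closing figure ("91.8814s (±0.149s)") is presumably the mean and spread over the `--repeat` runs; the script imports `statistics`, so a plausible sketch of the computation follows. The sample values and exact format string are assumptions, not taken from the script:

```python
import statistics

# Hypothetical wall-clock times, one per --repeat iteration.
times = [91.73, 91.88, 92.03]

# Mean plus sample standard deviation, formatted like the summary line above.
summary = f"{statistics.mean(times):.4f}s (±{statistics.stdev(times):.3f}s)"
```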
[GitHub] [airflow] houqp commented on issue #7537: [AIRFLOW-XXXX] And script to benchmark scheduler dag-run time
houqp commented on issue #7537: [AIRFLOW-] And script to benchmark scheduler dag-run time URL: https://github.com/apache/airflow/pull/7537#issuecomment-591094143 Looks like something that's worth mentioning in CONTRIBUTING.rst?
[GitHub] [airflow] ashb commented on a change in pull request #7537: [AIRFLOW-XXXX] And script to benchmark scheduler dag-run time
ashb commented on a change in pull request #7537: [AIRFLOW-] And script to benchmark scheduler dag-run time URL: https://github.com/apache/airflow/pull/7537#discussion_r384154553 ## File path: scripts/perf/scheduler_dag_execution_timing.py ## @@ -0,0 +1,220 @@
[GitHub] [airflow] ashb commented on a change in pull request #7492: [AIRFLOW-6871] optimize tree view for large DAGs
ashb commented on a change in pull request #7492: [AIRFLOW-6871] optimize tree view for large DAGs URL: https://github.com/apache/airflow/pull/7492#discussion_r384154242 ## File path: airflow/www/templates/airflow/tree.html ## @@ -85,8 +85,59 @@
[GitHub] [airflow] houqp commented on a change in pull request #7537: [AIRFLOW-XXXX] And script to benchmark scheduler dag-run time
houqp commented on a change in pull request #7537: [AIRFLOW-] And script to benchmark scheduler dag-run time URL: https://github.com/apache/airflow/pull/7537#discussion_r384154018 ## File path: scripts/perf/scheduler_dag_execution_timing.py ## @@ -0,0 +1,220 @@
[GitHub] [airflow] nuclearpinguin commented on issue #6938: [AIRFLOW-6382] Extract provide/create session to session module
nuclearpinguin commented on issue #6938: [AIRFLOW-6382] Extract provide/create session to session module URL: https://github.com/apache/airflow/pull/6938#issuecomment-591092356 That was a big cycle removal. As discussed above, I don't know of any other "session" in Airflow, so `session.py` was quite intuitive. Do you think it would be worth renaming it to `db_session`? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [airflow] ashb commented on issue #7537: [AIRFLOW-XXXX] And script to benchmark scheduler dag-run time
ashb commented on issue #7537: [AIRFLOW-XXXX] And script to benchmark scheduler dag-run time URL: https://github.com/apache/airflow/pull/7537#issuecomment-591092242 I haven't created a Jira for this, figuring it's not really "code" as it doesn't get shipped with the app, but I can do so if anyone would rather.
[GitHub] [airflow] nuclearpinguin commented on issue #6938: [AIRFLOW-6382] Extract provide/create session to session module
nuclearpinguin commented on issue #6938: [AIRFLOW-6382] Extract provide/create session to session module URL: https://github.com/apache/airflow/pull/6938#issuecomment-591091572 Yeah, so long that I at first edited your comment instead of replying :D
[GitHub] [airflow] ashb commented on a change in pull request #7492: [AIRFLOW-6871] optimize tree view for large DAGs
ashb commented on a change in pull request #7492: [AIRFLOW-6871] optimize tree view for large DAGs URL: https://github.com/apache/airflow/pull/7492#discussion_r384152129

## File path: airflow/www/views.py

@@ -1374,90 +1376,113 @@ def tree(self):
             .all()
         )
         dag_runs = {
-            dr.execution_date: alchemy_to_dict(dr) for dr in dag_runs}
+            dr.execution_date: alchemy_to_dict(dr) for dr in dag_runs
+        }
         dates = sorted(list(dag_runs.keys()))
         max_date = max(dates) if dates else None
         min_date = min(dates) if dates else None

         tis = dag.get_task_instances(start_date=min_date, end_date=base_date)
-        task_instances = {}
+        task_instances: Dict[Tuple[str, datetime], Optional[models.TaskInstance]] = {}
         for ti in tis:
-            tid = alchemy_to_dict(ti)
-            dr = dag_runs.get(ti.execution_date)
-            tid['external_trigger'] = dr['external_trigger'] if dr else False
-            task_instances[(ti.task_id, ti.execution_date)] = tid
+            task_instances[(ti.task_id, ti.execution_date)] = ti

-        expanded = []
+        expanded = set()
         # The default recursion traces every path so that tree view has full
         # expand/collapse functionality. After 5,000 nodes we stop and fall
         # back on a quick DFS search for performance. See PR #320.
-        node_count = [0]
+        node_count = 0
         node_limit = 5000 / max(1, len(dag.leaves))

+        def encode_ti(ti: models.TaskInstance) -> Optional[List]:
+            if not ti:
+                return None
+
+            # NOTE: order of entry is important here because client JS relies on it for
+            # tree node reconstruction. Remember to change JS code in tree.html
+            # whenever order is altered.
+            data = [
+                ti.state,
+                ti.try_number,
+                None,  # start_ts
+                None,  # duration
+            ]
+
+            if ti.start_date:
+                # round to seconds to reduce payload size
+                data[2] = int(ti.start_date.timestamp())
+                if ti.duration is not None:
+                    data[3] = int(ti.duration)
+
+            return data
+
         def recurse_nodes(task, visited):
+            nonlocal node_count
+            node_count += 1
             visited.add(task)
-            node_count[0] += 1
-
-            children = [
-                recurse_nodes(t, visited) for t in task.downstream_list
-                if node_count[0] < node_limit or t not in visited]
-
-            # D3 tree uses children vs _children to define what is
-            # expanded or not. The following block makes it such that
-            # repeated nodes are collapsed by default.
-            children_key = 'children'
-            if task.task_id not in expanded:
-                expanded.append(task.task_id)
-            elif children:
-                children_key = "_children"
-
-            def set_duration(tid):
-                if (isinstance(tid, dict) and tid.get("state") == State.RUNNING and
-                        tid["start_date"] is not None):
-                    d = timezone.utcnow() - timezone.parse(tid["start_date"])
-                    tid["duration"] = d.total_seconds()
-                return tid
-
-            return {
+            task_id = task.task_id
+
+            node = {
                 'name': task.task_id,
                 'instances': [
-                    set_duration(task_instances.get((task.task_id, d))) or {
-                        'execution_date': d.isoformat(),
-                        'task_id': task.task_id
-                    }
-                    for d in dates],
-                children_key: children,
+                    encode_ti(task_instances.get((task_id, d)))
+                    for d in dates
+                ],
                 'num_dep': len(task.downstream_list),
                 'operator': task.task_type,
                 'retries': task.retries,
                 'owner': task.owner,
-                'start_date': task.start_date,
-                'end_date': task.end_date,
-                'depends_on_past': task.depends_on_past,
                 'ui_color': task.ui_color,
-                'extra_links': task.extra_links,
             }
+            if task.downstream_list:
+                children = [
+                    recurse_nodes(t, visited) for t in task.downstream_list
+                    if node_count < node_limit or t not in visited]
+
+                # D3 tree uses children vs _children to define what is
+                # expanded or not. The following block makes it such that
+                # repeated nodes are collapsed by default.
+                if task.task_id not in expanded:
+                    children_key = 'children'

Review comment: Nope, what you've done sounds good.
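The payload reduction behind the `encode_ti` change above is that each task instance is serialized as a positional list rather than a keyed dict, so field names are not repeated once per instance. A minimal illustrative sketch (field names and values are hypothetical, not taken from the PR):

```python
import json

# Sketch of the encoding trick: the dict repeats every key name per instance,
# while the positional list sends only the values. The list's field order must
# then be agreed upon with the client-side JS that decodes it.
as_dict = {"state": "success", "try_number": 1, "start_ts": 1582900000, "duration": 42}
as_list = ["success", 1, 1582900000, 42]

dict_bytes = len(json.dumps(as_dict))
list_bytes = len(json.dumps(as_list))
assert list_bytes < dict_bytes  # the list encoding is strictly smaller
```

Multiplied over thousands of task instances in a large DAG's tree view, this per-instance saving is where the payload shrinks.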
[GitHub] [airflow] ashb commented on issue #6938: [AIRFLOW-6382] Extract provide/create session to session module
ashb commented on issue #6938: [AIRFLOW-6382] Extract provide/create session to session module URL: https://github.com/apache/airflow/pull/6938#issuecomment-591091381 :man_facepalming: oh yeah. Sorry, long day.
[GitHub] [airflow] nuclearpinguin commented on issue #6938: [AIRFLOW-6382] Extract provide/create session to session module
nuclearpinguin commented on issue #6938: [AIRFLOW-6382] Extract provide/create session to session module URL: https://github.com/apache/airflow/pull/6938#issuecomment-591091208 >And how did this reduce cyclic imports when utils.db imports the new file still?! That is not a problem. The main reason is that db.py imports all models, and plenty of model methods use the provide_session decorator. So every model imported db.provide_session, and db.py imported every model.
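The cycle described above can be shown in miniature: a decorator like `provide_session`, living in a tiny module with no model imports, can be imported by the models without pulling in the module that itself imports the models. This is an illustrative sketch only, not Airflow's actual implementation:

```python
import functools

# Hypothetical stand-in for a small "session" module: it imports nothing from
# the models or from db.py, so depending on it cannot create an import cycle.
def provide_session(func):
    """Inject a (fake) session into the wrapped function if none was given."""
    @functools.wraps(func)
    def wrapper(*args, session=None, **kwargs):
        if session is None:
            session = object()  # stand-in for creating a real DB session
        return func(*args, session=session, **kwargs)
    return wrapper

# A "model method" using the decorator, as the models do with provide_session.
@provide_session
def get_count(session=None):
    return session is not None

assert get_count() is True            # session injected by the decorator
assert get_count(session="s") is True  # caller-supplied session passed through
```

With the decorator in its own module, db.py can keep importing every model while the models import only this leaf module, breaking the db.py ↔ models cycle.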
[GitHub] [airflow] ashb edited a comment on issue #6938: [AIRFLOW-6382] Extract provide/create session to session module
ashb edited a comment on issue #6938: [AIRFLOW-6382] Extract provide/create session to session module URL: https://github.com/apache/airflow/pull/6938#issuecomment-591081255 Why did we make this change? Neither the PR nor the linked Jira give any justification for the change. It doesn't reduce any imports, and personally, I found the previous name clearer. Now, seeing that code for the first time, my question would be "what is this session?" Edit: sorry, not used to the old PR template. > Extracting provide_session and create_session to separate module reduces number of cyclic imports and makes a distinction between session and database. Can you explain to me what the difference between DB and session is? Cos I wouldn't be able to tell you if you asked me. The session is for the db. And how did this reduce cyclic imports when utils.db imports the new file still?!
[GitHub] [airflow] ashb edited a comment on issue #6938: [AIRFLOW-6382] Extract provide/create session to session module
ashb edited a comment on issue #6938: [AIRFLOW-6382] Extract provide/create session to session module URL: https://github.com/apache/airflow/pull/6938#issuecomment-591081255 > And how did this reduce cyclic imports when utils.db imports the new file still?! That is not a problem. The main reason is that db.py imports all models, and plenty of model methods use the `provide_session` decorator. So every model imported db.provide_session, and db.py imported every model.
[GitHub] [airflow] ashb opened a new pull request #7537: [AIRFLOW-XXXX] And script to benchmark scheduler dag-run time
ashb opened a new pull request #7537: [AIRFLOW-XXXX] And script to benchmark scheduler dag-run time URL: https://github.com/apache/airflow/pull/7537 This script will run the SchedulerJob for the specified dags "to completion". That is, it creates a fixed number of DAG runs for the specified DAGs (from the configured dag path/example dags etc.), disables the scheduler from creating more, and then monitors them for completion. When the final task of the final dag run is completed the scheduler will be terminated. The aim of this script is to have a benchmark for real-world scheduler performance -- i.e. total time taken to run N dag runs to completion. It is recommended to repeat the test at least 3 times so that you can get a somewhat-accurate variance on the reported timing numbers. --- Issue link: WILL BE INSERTED BY [boring-cyborg](https://github.com/kaxil/boring-cyborg) Make sure to mark the boxes below before creating PR: [x] - [x] Description above provides context of the change - [x] Commit message/PR title starts with `[AIRFLOW-]`. AIRFLOW- = JIRA ID* - [x] Unit tests coverage for changes (not needed for documentation changes) - [x] Commits follow "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)" - [x] Relevant documentation is updated including usage instructions. - [x] I will engage committers as explained in [Contribution Workflow Example](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#contribution-workflow-example). * For document-only changes commit message can start with `[AIRFLOW-]`. --- In case of fundamental code change, Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)) is needed. In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x).
In case of backwards incompatible changes please leave a note in [UPDATING.md](https://github.com/apache/airflow/blob/master/UPDATING.md). Read the [Pull Request Guidelines](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines) for more information.
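The PR's recommendation to repeat the test at least 3 times can be made concrete: repeated wall-clock timings are summarized by their mean and spread. A small illustrative sketch (the function name and the numbers are hypothetical, not taken from the script):

```python
import statistics

def summarize(timings):
    """Return (mean, sample stdev) of repeated benchmark timings in seconds."""
    mean = statistics.mean(timings)
    # Sample stdev needs at least two runs, hence the recommendation to repeat.
    stdev = statistics.stdev(timings) if len(timings) > 1 else 0.0
    return mean, stdev

# Three hypothetical runs of the same benchmark.
mean, stdev = summarize([12.1, 11.8, 12.4])
```

With only one run the variance is undefined, which is why a single timing number says little about scheduler performance.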
[GitHub] [airflow] ashb edited a comment on issue #6938: [AIRFLOW-6382] Extract provide/create session to session module
ashb edited a comment on issue #6938: [AIRFLOW-6382] Extract provide/create session to session module URL: https://github.com/apache/airflow/pull/6938#issuecomment-591081255 Why did we make this change? Neither the PR nor the linked Jira give _any_ justification for the change. It doesn't reduce any imports, and personally, I found the previous name _clearer_. Now, seeing that code for the first time, my question would be "what is this session?" Edit: sorry, not used to the old PR template. > Extracting provide_session and create_session to separate module > reduces number of cyclic imports and makes a distinction between > session and database. Can you explain to me what the difference between DB and session is? Cos I wouldn't be able to tell you if you asked me. The session _is for the db_. And how did this reduce cyclic imports when utils.db imports the new file still?!
[GitHub] [airflow] ashb commented on issue #6938: [AIRFLOW-6382] Extract provide/create session to session module
ashb commented on issue #6938: [AIRFLOW-6382] Extract provide/create session to session module URL: https://github.com/apache/airflow/pull/6938#issuecomment-591084955 (Excuse the grump, but this change, and the pre-commit rule we put in place, meant I had to work around it to have code live in the repo but work on both master and 1.10.)
[GitHub] [airflow-on-k8s-operator] barney-s opened a new issue #8: Setup CI for the repo
barney-s opened a new issue #8: Setup CI for the repo URL: https://github.com/apache/airflow-on-k8s-operator/issues/8
[GitHub] [airflow] ashb commented on issue #6938: [AIRFLOW-6382] Extract provide/create session to session module
ashb commented on issue #6938: [AIRFLOW-6382] Extract provide/create session to session module URL: https://github.com/apache/airflow/pull/6938#issuecomment-591081255 Why did we make this change? Neither the PR nor the linked Jira give _any_ justification for the change. It doesn't reduce any imports, and personally, I found the previous name _clearer_. Now, seeing that code for the first time, my question would be "what is this session?"
[GitHub] [airflow] tooptoop4 commented on issue #7169: [AIRFLOW-6555] mushroom cloud error when clicking 'task instance details' from graph view of dag that has never been run yet
tooptoop4 commented on issue #7169: [AIRFLOW-6555] mushroom cloud error when clicking 'task instance details' from graph view of dag that has never been run yet URL: https://github.com/apache/airflow/pull/7169#issuecomment-591075723 @ashb are there any similar tests I could 'reuse' as a starting point?
[jira] [Updated] (AIRFLOW-6918) don't use 'is' in if conditions comparing STATE
[ https://issues.apache.org/jira/browse/AIRFLOW-6918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] t oo updated AIRFLOW-6918: -- Priority: Trivial (was: Major) > don't use 'is' in if conditions comparing STATE > --- > > Key: AIRFLOW-6918 > URL: https://issues.apache.org/jira/browse/AIRFLOW-6918 > Project: Apache Airflow > Issue Type: Improvement > Components: scheduler >Affects Versions: 1.10.9 >Reporter: t oo >Assignee: t oo >Priority: Trivial > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work started] (AIRFLOW-6918) don't use 'is' in if conditions comparing STATE
[ https://issues.apache.org/jira/browse/AIRFLOW-6918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on AIRFLOW-6918 started by t oo. - > don't use 'is' in if conditions comparing STATE > --- > > Key: AIRFLOW-6918 > URL: https://issues.apache.org/jira/browse/AIRFLOW-6918 > Project: Apache Airflow > Issue Type: Improvement > Components: scheduler >Affects Versions: 1.10.9 >Reporter: t oo >Assignee: t oo >Priority: Major >
[GitHub] [airflow] tooptoop4 opened a new pull request #7536: [AIRFLOW-6918] don't use 'is' in if conditions comparing STATE
tooptoop4 opened a new pull request #7536: [AIRFLOW-6918] don't use 'is' in if conditions comparing STATE URL: https://github.com/apache/airflow/pull/7536
[jira] [Commented] (AIRFLOW-6918) don't use 'is' in if conditions comparing STATE
[ https://issues.apache.org/jira/browse/AIRFLOW-6918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17044890#comment-17044890 ] ASF GitHub Bot commented on AIRFLOW-6918: - tooptoop4 commented on pull request #7536: [AIRFLOW-6918] don't use 'is' in if conditions comparing STATE URL: https://github.com/apache/airflow/pull/7536 > don't use 'is' in if conditions comparing STATE > --- > > Key: AIRFLOW-6918 > URL: https://issues.apache.org/jira/browse/AIRFLOW-6918 > Project: Apache Airflow > Issue Type: Improvement > Components: scheduler >Affects Versions: 1.10.9 >Reporter: t oo >Priority: Major >
[jira] [Updated] (AIRFLOW-6918) don't use 'is' in if conditions comparing STATE
[ https://issues.apache.org/jira/browse/AIRFLOW-6918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] t oo updated AIRFLOW-6918: -- Summary: don't use 'is' in if conditions comparing STATE (was: don't use is in if comparing state) > don't use 'is' in if conditions comparing STATE > --- > > Key: AIRFLOW-6918 > URL: https://issues.apache.org/jira/browse/AIRFLOW-6918 > Project: Apache Airflow > Issue Type: Improvement > Components: scheduler >Affects Versions: 1.10.9 >Reporter: t oo >Priority: Major >
[jira] [Created] (AIRFLOW-6918) don't use is in if comparing state
t oo created AIRFLOW-6918: - Summary: don't use is in if comparing state Key: AIRFLOW-6918 URL: https://issues.apache.org/jira/browse/AIRFLOW-6918 Project: Apache Airflow Issue Type: Improvement Components: scheduler Affects Versions: 1.10.9 Reporter: t oo
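The point of AIRFLOW-6918 can be shown in a few lines: Airflow's states are plain strings, and identity (`is`) between equal-valued strings is an interpreter implementation detail (string interning), while `==` compares values. A minimal sketch (the `SUCCESS` constant here is illustrative, standing in for `airflow.utils.state.State.SUCCESS`):

```python
# States are plain strings, so they must be compared by value, not identity.
SUCCESS = "success"

# An equal-valued string built at runtime; CPython does not intern it,
# so it is a distinct object from the SUCCESS constant.
state = "".join(["suc", "cess"])

assert state == SUCCESS       # value comparison: always correct
assert state is not SUCCESS   # identity comparison: fails despite equal values
```

Code using `if state is State.SUCCESS:` may appear to work only because short literals happen to be interned; `==` removes that dependence on interning behaviour.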
[jira] [Commented] (AIRFLOW-6864) Make airfow/jobs pylint compatible
[ https://issues.apache.org/jira/browse/AIRFLOW-6864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17044861#comment-17044861 ] ASF subversion and git services commented on AIRFLOW-6864: -- Commit 311140616daafe496310d642e4164bc53fbd2ad2 in airflow's branch refs/heads/master from Tomek Urbaszek [ https://gitbox.apache.org/repos/asf?p=airflow.git;h=3111406 ] [AIRFLOW-6864] Make airfow/jobs pylint compatible (#7484) fixup! [AIRFLOW-6864] Make airfow/jobs pylint compatible fixup! fixup! [AIRFLOW-6864] Make airfow/jobs pylint compatible fixup! [AIRFLOW-6864] Make airfow/jobs pylint compatible fixup! [AIRFLOW-6864] Make airfow/jobs pylint compatible fixup! [AIRFLOW-6864] Make airfow/jobs pylint compatible fixup! [AIRFLOW-6864] Make airfow/jobs pylint compatible fixup! [AIRFLOW-6864] Make airfow/jobs pylint compatible fixup! [AIRFLOW-6864] Make airfow/jobs pylint compatible fixup! [AIRFLOW-6864] Make airfow/jobs pylint compatible > Make airfow/jobs pylint compatible > -- > > Key: AIRFLOW-6864 > URL: https://issues.apache.org/jira/browse/AIRFLOW-6864 > Project: Apache Airflow > Issue Type: Improvement > Components: pylint >Affects Versions: 2.0.0 >Reporter: Tomasz Urbaszek >Priority: Major > Fix For: 2.0.0 > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (AIRFLOW-6864) Make airfow/jobs pylint compatible
[ https://issues.apache.org/jira/browse/AIRFLOW-6864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17044868#comment-17044868 ] ASF subversion and git services commented on AIRFLOW-6864: -- Commit 311140616daafe496310d642e4164bc53fbd2ad2 in airflow's branch refs/heads/master from Tomek Urbaszek [ https://gitbox.apache.org/repos/asf?p=airflow.git;h=3111406 ] [AIRFLOW-6864] Make airfow/jobs pylint compatible (#7484) fixup! [AIRFLOW-6864] Make airfow/jobs pylint compatible fixup! fixup! [AIRFLOW-6864] Make airfow/jobs pylint compatible fixup! [AIRFLOW-6864] Make airfow/jobs pylint compatible fixup! [AIRFLOW-6864] Make airfow/jobs pylint compatible fixup! [AIRFLOW-6864] Make airfow/jobs pylint compatible fixup! [AIRFLOW-6864] Make airfow/jobs pylint compatible fixup! [AIRFLOW-6864] Make airfow/jobs pylint compatible fixup! [AIRFLOW-6864] Make airfow/jobs pylint compatible fixup! [AIRFLOW-6864] Make airfow/jobs pylint compatible > Make airfow/jobs pylint compatible > -- > > Key: AIRFLOW-6864 > URL: https://issues.apache.org/jira/browse/AIRFLOW-6864 > Project: Apache Airflow > Issue Type: Improvement > Components: pylint >Affects Versions: 2.0.0 >Reporter: Tomasz Urbaszek >Priority: Major > Fix For: 2.0.0 > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (AIRFLOW-6864) Make airfow/jobs pylint compatible
[ https://issues.apache.org/jira/browse/AIRFLOW-6864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17044860#comment-17044860 ] ASF subversion and git services commented on AIRFLOW-6864: -- Commit 311140616daafe496310d642e4164bc53fbd2ad2 in airflow's branch refs/heads/master from Tomek Urbaszek [ https://gitbox.apache.org/repos/asf?p=airflow.git;h=3111406 ] [AIRFLOW-6864] Make airfow/jobs pylint compatible (#7484) fixup! [AIRFLOW-6864] Make airfow/jobs pylint compatible fixup! fixup! [AIRFLOW-6864] Make airfow/jobs pylint compatible fixup! [AIRFLOW-6864] Make airfow/jobs pylint compatible fixup! [AIRFLOW-6864] Make airfow/jobs pylint compatible fixup! [AIRFLOW-6864] Make airfow/jobs pylint compatible fixup! [AIRFLOW-6864] Make airfow/jobs pylint compatible fixup! [AIRFLOW-6864] Make airfow/jobs pylint compatible fixup! [AIRFLOW-6864] Make airfow/jobs pylint compatible fixup! [AIRFLOW-6864] Make airfow/jobs pylint compatible > Make airfow/jobs pylint compatible > -- > > Key: AIRFLOW-6864 > URL: https://issues.apache.org/jira/browse/AIRFLOW-6864 > Project: Apache Airflow > Issue Type: Improvement > Components: pylint >Affects Versions: 2.0.0 >Reporter: Tomasz Urbaszek >Priority: Major > Fix For: 2.0.0 > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (AIRFLOW-6864) Make airfow/jobs pylint compatible
[ https://issues.apache.org/jira/browse/AIRFLOW-6864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17044866#comment-17044866 ] ASF subversion and git services commented on AIRFLOW-6864: -- Commit 311140616daafe496310d642e4164bc53fbd2ad2 in airflow's branch refs/heads/master from Tomek Urbaszek [ https://gitbox.apache.org/repos/asf?p=airflow.git;h=3111406 ] [AIRFLOW-6864] Make airfow/jobs pylint compatible (#7484) fixup! [AIRFLOW-6864] Make airfow/jobs pylint compatible fixup! fixup! [AIRFLOW-6864] Make airfow/jobs pylint compatible fixup! [AIRFLOW-6864] Make airfow/jobs pylint compatible fixup! [AIRFLOW-6864] Make airfow/jobs pylint compatible fixup! [AIRFLOW-6864] Make airfow/jobs pylint compatible fixup! [AIRFLOW-6864] Make airfow/jobs pylint compatible fixup! [AIRFLOW-6864] Make airfow/jobs pylint compatible fixup! [AIRFLOW-6864] Make airfow/jobs pylint compatible fixup! [AIRFLOW-6864] Make airfow/jobs pylint compatible > Make airfow/jobs pylint compatible > -- > > Key: AIRFLOW-6864 > URL: https://issues.apache.org/jira/browse/AIRFLOW-6864 > Project: Apache Airflow > Issue Type: Improvement > Components: pylint >Affects Versions: 2.0.0 >Reporter: Tomasz Urbaszek >Priority: Major > Fix For: 2.0.0 > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (AIRFLOW-6864) Make airfow/jobs pylint compatible
[ https://issues.apache.org/jira/browse/AIRFLOW-6864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17044863#comment-17044863 ] ASF subversion and git services commented on AIRFLOW-6864: -- Commit 311140616daafe496310d642e4164bc53fbd2ad2 in airflow's branch refs/heads/master from Tomek Urbaszek [ https://gitbox.apache.org/repos/asf?p=airflow.git;h=3111406 ] [AIRFLOW-6864] Make airfow/jobs pylint compatible (#7484) fixup! [AIRFLOW-6864] Make airfow/jobs pylint compatible fixup! fixup! [AIRFLOW-6864] Make airfow/jobs pylint compatible fixup! [AIRFLOW-6864] Make airfow/jobs pylint compatible fixup! [AIRFLOW-6864] Make airfow/jobs pylint compatible fixup! [AIRFLOW-6864] Make airfow/jobs pylint compatible fixup! [AIRFLOW-6864] Make airfow/jobs pylint compatible fixup! [AIRFLOW-6864] Make airfow/jobs pylint compatible fixup! [AIRFLOW-6864] Make airfow/jobs pylint compatible fixup! [AIRFLOW-6864] Make airfow/jobs pylint compatible > Make airfow/jobs pylint compatible > -- > > Key: AIRFLOW-6864 > URL: https://issues.apache.org/jira/browse/AIRFLOW-6864 > Project: Apache Airflow > Issue Type: Improvement > Components: pylint >Affects Versions: 2.0.0 >Reporter: Tomasz Urbaszek >Priority: Major > Fix For: 2.0.0 > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (AIRFLOW-6864) Make airfow/jobs pylint compatible
[ https://issues.apache.org/jira/browse/AIRFLOW-6864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17044867#comment-17044867 ] ASF subversion and git services commented on AIRFLOW-6864: -- Commit 311140616daafe496310d642e4164bc53fbd2ad2 in airflow's branch refs/heads/master from Tomek Urbaszek [ https://gitbox.apache.org/repos/asf?p=airflow.git;h=3111406 ] [AIRFLOW-6864] Make airfow/jobs pylint compatible (#7484) fixup! [AIRFLOW-6864] Make airfow/jobs pylint compatible fixup! fixup! [AIRFLOW-6864] Make airfow/jobs pylint compatible fixup! [AIRFLOW-6864] Make airfow/jobs pylint compatible fixup! [AIRFLOW-6864] Make airfow/jobs pylint compatible fixup! [AIRFLOW-6864] Make airfow/jobs pylint compatible fixup! [AIRFLOW-6864] Make airfow/jobs pylint compatible fixup! [AIRFLOW-6864] Make airfow/jobs pylint compatible fixup! [AIRFLOW-6864] Make airfow/jobs pylint compatible fixup! [AIRFLOW-6864] Make airfow/jobs pylint compatible > Make airfow/jobs pylint compatible > -- > > Key: AIRFLOW-6864 > URL: https://issues.apache.org/jira/browse/AIRFLOW-6864 > Project: Apache Airflow > Issue Type: Improvement > Components: pylint >Affects Versions: 2.0.0 >Reporter: Tomasz Urbaszek >Priority: Major > Fix For: 2.0.0 > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (AIRFLOW-6864) Make airfow/jobs pylint compatible
[ https://issues.apache.org/jira/browse/AIRFLOW-6864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tomasz Urbaszek resolved AIRFLOW-6864. -- Fix Version/s: 2.0.0 Resolution: Done > Make airfow/jobs pylint compatible > -- > > Key: AIRFLOW-6864 > URL: https://issues.apache.org/jira/browse/AIRFLOW-6864 > Project: Apache Airflow > Issue Type: Improvement > Components: pylint >Affects Versions: 2.0.0 >Reporter: Tomasz Urbaszek >Priority: Major > Fix For: 2.0.0 > >
[GitHub] [airflow] nuclearpinguin merged pull request #7484: [AIRFLOW-6864] Make airflow/jobs pylint compatible
nuclearpinguin merged pull request #7484: [AIRFLOW-6864] Make airflow/jobs pylint compatible URL: https://github.com/apache/airflow/pull/7484 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Commented] (AIRFLOW-6864) Make airfow/jobs pylint compatible
[ https://issues.apache.org/jira/browse/AIRFLOW-6864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17044858#comment-17044858 ] ASF GitHub Bot commented on AIRFLOW-6864: - nuclearpinguin commented on pull request #7484: [AIRFLOW-6864] Make airflow/jobs pylint compatible URL: https://github.com/apache/airflow/pull/7484 > Make airfow/jobs pylint compatible > -- > > Key: AIRFLOW-6864 > URL: https://issues.apache.org/jira/browse/AIRFLOW-6864 > Project: Apache Airflow > Issue Type: Improvement > Components: pylint >Affects Versions: 2.0.0 >Reporter: Tomasz Urbaszek >Priority: Major >
[jira] [Updated] (AIRFLOW-6917) Remove Pickling
[ https://issues.apache.org/jira/browse/AIRFLOW-6917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kaxil Naik updated AIRFLOW-6917: Labels: dag-serialization (was: ) > Remove Pickling > > > Key: AIRFLOW-6917 > URL: https://issues.apache.org/jira/browse/AIRFLOW-6917 > Project: Apache Airflow > Issue Type: Improvement > Components: serialization >Affects Versions: 2.0.0 >Reporter: Kaxil Naik >Priority: Major > Labels: dag-serialization > Fix For: 2.0.0 > > > We are going to use Serialized DAGs from the DB. We no longer require pickling > (this was already deprecated and only used when the config option is enabled)
[jira] [Created] (AIRFLOW-6917) Remove Pickling
Kaxil Naik created AIRFLOW-6917: --- Summary: Remove Pickling Key: AIRFLOW-6917 URL: https://issues.apache.org/jira/browse/AIRFLOW-6917 Project: Apache Airflow Issue Type: Improvement Components: serialization Affects Versions: 2.0.0 Reporter: Kaxil Naik Fix For: 2.0.0 We are going to use Serialized DAGs from the DB. We no longer require pickling (this was already deprecated and only used when the config option is enabled)
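The rationale for replacing pickling with serialized DAGs can be illustrated with a minimal sketch (this is not Airflow code; the `dag_repr` dict is a hypothetical stand-in for a DAG's serializable structure, using only the stdlib `pickle` and `json` modules):

```python
# Illustrative sketch: why a JSON-serialized DAG representation stored in
# the DB is preferable to a pickle. (Hypothetical data, not the actual
# Airflow serialization format.)
import json
import pickle

# Toy stand-in for a DAG's serializable structure.
dag_repr = {"dag_id": "example", "tasks": ["generate_steps", "add_steps"]}

# Pickle yields opaque, Python-specific bytes; loading them can execute
# arbitrary code, which is why pickling DAGs was deprecated.
pickled = pickle.dumps(dag_repr)
assert isinstance(pickled, bytes)

# JSON yields human-readable text that any consumer (webserver, scheduler)
# can parse safely without unpickling.
serialized = json.dumps(dag_repr)
assert json.loads(serialized) == dag_repr
```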
[jira] [Resolved] (AIRFLOW-3831) Remove DagBag from /pickle_info
[ https://issues.apache.org/jira/browse/AIRFLOW-3831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kaxil Naik resolved AIRFLOW-3831. - Resolution: Fixed > Remove DagBag from /pickle_info > --- > > Key: AIRFLOW-3831 > URL: https://issues.apache.org/jira/browse/AIRFLOW-3831 > Project: Apache Airflow > Issue Type: Improvement > Components: webserver >Reporter: Peter van 't Hof >Assignee: Bas Harenslak >Priority: Major >