[jira] [Commented] (AIRFLOW-161) Redirection to external url
[ https://issues.apache.org/jira/browse/AIRFLOW-161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15297652#comment-15297652 ] Sumit Maheshwari commented on AIRFLOW-161: -- Hi Chris, Yes, for offline things like mail or slack this is what we are using as well, but for real time cases like mine, it doesn't suit. Please take a look at this PR (https://github.com/apache/incubator-airflow/pull/1538) and let me know if I can achieve something similar using some other way. Thanks, Sumit > Redirection to external url > --- > > Key: AIRFLOW-161 > URL: https://issues.apache.org/jira/browse/AIRFLOW-161 > Project: Apache Airflow > Issue Type: Improvement > Components: webserver >Reporter: Sumit Maheshwari > > Hi, > I am not able to find a good way (apart from loading everything upfront), > where I can redirect someone to an external service url, using the information > stored in airflow. There could be many use cases like downloading a signed > file from s3, redirecting to hadoop job tracker, or a direct case on which I > am working which is linking airflow tasks to qubole commands. > I already have a working model and will open a PR soon. Please let me know if > there are existing ways already. > Thanks, > Sumit -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: [1/3] incubator-airflow git commit: use targetPartitionSize as the default partition spec
s/two/too .. sigh On Mon, May 23, 2016 at 8:29 PM, Chris Riccomini wrote: > Ah, yea. I get bitten by that two. It's annoying to have to ask people to > add a JIRA to their commit message. And we can't squash through GitHub > anymore. :( Wonder if the airflow-pr script allows us to edit it? I think > it might > > On Mon, May 23, 2016 at 5:50 PM, Dan Davydov < > dan.davy...@airbnb.com.invalid> wrote: > >> Yep sorry will check the versions in the future. My own commits have JIRA >> labels but I haven't validated that other users have done this for theirs >> when I merge their commits (as the LGTM is delegated to either another >> committer or the owner of a particular operator). Will be more vigilant in >> the future. >> >> On Mon, May 23, 2016 at 5:07 PM, Chris Riccomini >> wrote: >> >> > Hey Dan, >> > >> > Could you please file JIRAs, and put the JIRA name as the prefix to your >> > commits? >> > >> > Cheers, >> > Chris >> > >> > On Mon, May 23, 2016 at 5:01 PM, wrote: >> > >> >> Repository: incubator-airflow >> >> Updated Branches: >> >> refs/heads/airbnb_rb1.7.1_4 1d0d8681d -> 6f7ea90ae >> >> >> >> >> >> use targetPartitionSize as the default partition spec >> >> >> >> >> >> Project: http://git-wip-us.apache.org/repos/asf/incubator-airflow/repo >> >> Commit: >> >> >> http://git-wip-us.apache.org/repos/asf/incubator-airflow/commit/b58b5e09 >> >> Tree: >> >> http://git-wip-us.apache.org/repos/asf/incubator-airflow/tree/b58b5e09 >> >> Diff: >> >> http://git-wip-us.apache.org/repos/asf/incubator-airflow/diff/b58b5e09 >> >> >> >> Branch: refs/heads/airbnb_rb1.7.1_4 >> >> Commit: b58b5e09578d8a0df17b4de12fe3b49792e9feda >> >> Parents: 1d0d868 >> >> Author: Hongbo Zeng >> >> Authored: Sat May 14 17:00:42 2016 -0700 >> >> Committer: Dan Davydov >> >> Committed: Mon May 23 16:59:52 2016 -0700 >> >> >> >> -- >> >> airflow/hooks/druid_hook.py| 23 --- >> >> airflow/operators/hive_to_druid.py | 8 +--- >> >> 2 files changed, 21 insertions(+), 10 deletions(-) >> >> -- >> >> >> >> 
>> >> >> >> >> http://git-wip-us.apache.org/repos/asf/incubator-airflow/blob/b58b5e09/airflow/hooks/druid_hook.py >> >> -- >> >> diff --git a/airflow/hooks/druid_hook.py b/airflow/hooks/druid_hook.py >> >> index b6cb231..7c80c7c 100644 >> >> --- a/airflow/hooks/druid_hook.py >> >> +++ b/airflow/hooks/druid_hook.py >> >> @@ -10,7 +10,7 @@ from airflow.hooks.base_hook import BaseHook >> >> from airflow.exceptions import AirflowException >> >> >> >> LOAD_CHECK_INTERVAL = 5 >> >> - >> >> +TARGET_PARTITION_SIZE = 500 >> >> >> >> class AirflowDruidLoadException(AirflowException): >> >> pass >> >> @@ -52,13 +52,22 @@ class DruidHook(BaseHook): >> >> >> >> def construct_ingest_query( >> >> self, datasource, static_path, ts_dim, columns, >> metric_spec, >> >> -intervals, num_shards, >> hadoop_dependency_coordinates=None): >> >> +intervals, num_shards, target_partition_size, >> >> hadoop_dependency_coordinates=None): >> >> """ >> >> Builds an ingest query for an HDFS TSV load. >> >> >> >> :param datasource: target datasource in druid >> >> :param columns: list of all columns in the TSV, in the right >> >> order >> >> """ >> >> + >> >> +# backward compatibilty for num_shards, but >> >> target_partition_size is the default setting >> >> +# and overwrites the num_shards >> >> +if target_partition_size == -1: >> >> +if num_shards == -1: >> >> +target_partition_size = TARGET_PARTITION_SIZE >> >> +else: >> >> +num_shards = -1 >> >> + >> >> metric_names = [ >> >> m['fieldName'] for m in metric_spec if m['type'] != >> 'count'] >> >> dimensions = [c for c in columns if c not in metric_names and >> c >> >> != ts_dim] >> >> @@ -100,7 +109,7 @@ class DruidHook(BaseHook): >> >> }, >> >> "partitionsSpec" : { >> >> "type" : "hashed", >> >> -"targetPartitionSize" : -1, >> >> +"targetPartitionSize" : target_partition_size, >> >> "numShards" : num_shards, >> >> }, >> >> }, >> >> @@ -121,10 +130,10 @@ class DruidHook(BaseHook): >> >> >> >> def send_ingest_query( >> >> self, datasource, 
static_path, ts_dim, columns, >> metric_spec, >> >> -intervals, num_shards, >> hadoop_dependency_coordinates=None): >> >> +intervals, num_shards, target_partition_size, >> >> hadoop_dependency_coordinates=None): >>
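The backward-compatibility rule in the patch above can be restated as a standalone sketch (a paraphrase of the diff for illustration, not the hook's actual code; the function name is hypothetical, the constant value is taken from the diff):

```python
# Default from the patch: used when neither knob is set explicitly.
TARGET_PARTITION_SIZE = 500

def resolve_partition_spec(target_partition_size=-1, num_shards=-1):
    """Precedence rule from the diff: an explicit target_partition_size
    wins and clears num_shards; num_shards is honoured only for backward
    compatibility; the default applies when neither is set (-1 = unset)."""
    if target_partition_size == -1:
        if num_shards == -1:
            target_partition_size = TARGET_PARTITION_SIZE
    else:
        num_shards = -1
    return {
        "type": "hashed",
        "targetPartitionSize": target_partition_size,
        "numShards": num_shards,
    }
```

So a caller passing only `num_shards` keeps the old behaviour, while any explicit `target_partition_size` overrides it, as the patch comment describes.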
[jira] [Commented] (AIRFLOW-163) Running multiple LocalExecutor schedulers makes system load skyrocket
[ https://issues.apache.org/jira/browse/AIRFLOW-163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15296980#comment-15296980 ] Bolke de Bruin commented on AIRFLOW-163: Not sure. Bence are you able to test the PR attached to airflow-128 or provide a sample dag that exposes the issue? > Running multiple LocalExecutor schedulers makes system load skyrocket > - > > Key: AIRFLOW-163 > URL: https://issues.apache.org/jira/browse/AIRFLOW-163 > Project: Apache Airflow > Issue Type: Bug >Affects Versions: Airflow 1.7.1 > Environment: EC2 t2.medium instance, > Docker `version 1.11.1, build 5604cbe`, > Host is `Linux ip-172-31-44-140 3.13.0-85-generic #129-Ubuntu SMP Thu Mar 17 > 20:50:15 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux`, > Docker containers are built upon the `python:3.5` image, > LocalExecutor is used with two scheduler containers running >Reporter: Bence Nagy >Priority: Minor > Labels: scheduler > > I've been told on Gitter that this is expected currently, but thought I'd > create an issue for it anyway. > See this screenshot of a task duration chart — I launched a second scheduler > for the 8:50 execution. The orange line represents a PostgresOperator task > (i.e. processing happens independent of airflow), while the other lines > represent data copying tasks that go through a temp file on the airflow host > https://i.imgur.com/2tDKgKj.png > I'm seeing a system load of around 4.0-5.0 when processing tasks when one > scheduler is running, and 20.0-30.0 with two. > Running {{airflow scheduler --num_runs 3}} under yappi got me these results > when ordered by total time: http://pastebin.com/8TiEG4P3. I still have the > raw profiling data, let me know if another data extract would be useful. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Closed] (AIRFLOW-165) Add a description/metadata field to the Task
[ https://issues.apache.org/jira/browse/AIRFLOW-165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Riccomini closed AIRFLOW-165. --- Resolution: Information Provided > Add a description/metadata field to the Task > > > Key: AIRFLOW-165 > URL: https://issues.apache.org/jira/browse/AIRFLOW-165 > Project: Apache Airflow > Issue Type: Improvement >Reporter: Richard Davison >Priority: Minor > > I think that to help facilitate self-documentation, we should add either a > `description` field or a `metadata` field at the Task level so we can add an > arbitrary blob of information to describe it. > On the UI side, we could put the description in the alt text, a mouseover > popup, in the onclick overlay popup, or in a link inside of that similar to > the optional 'subdag' link. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (AIRFLOW-165) Add a description/metadata field to the Task
[ https://issues.apache.org/jira/browse/AIRFLOW-165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15296952#comment-15296952 ] Chris Riccomini commented on AIRFLOW-165: - I think this is already done, right? https://pythonhosted.org/airflow/concepts.html#task-documentation-notes You can do: {code} t = BashOperator("foo", dag=dag) t.doc_md = """\ # Title Here's a [url](www.airbnb.com) """ {code} > Add a description/metadata field to the Task > > > Key: AIRFLOW-165 > URL: https://issues.apache.org/jira/browse/AIRFLOW-165 > Project: Apache Airflow > Issue Type: Improvement >Reporter: Richard Davison >Priority: Minor > > I think that to help facilitate self-documentation, we should add either a > `description` field or a `metadata` field at the Task level so we can add an > arbitrary blob of information to describe it. > On the UI side, we could put the description in the alt text, a mouseover > popup, in the onclick overlay popup, or in a link inside of that similar to > the optional 'subdag' link. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (AIRFLOW-161) Redirection to external url
[ https://issues.apache.org/jira/browse/AIRFLOW-161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15296944#comment-15296944 ] Chris Riccomini commented on AIRFLOW-161: - What context is this under? You mean like you want to use the EmailOperator or SlackOperator to notify people to download a file that's been created as part of the DAG? We do this using XCom+EmailOperator. XCom variables can be accessed via templates. We store the file in a blob store (like S3). The file location is stored in XCom, which the EmailOperator references when it sends the email. > Redirection to external url > --- > > Key: AIRFLOW-161 > URL: https://issues.apache.org/jira/browse/AIRFLOW-161 > Project: Apache Airflow > Issue Type: Improvement > Components: webserver >Reporter: Sumit Maheshwari > > Hi, > I am not able to find a good way (apart from loading everything upfront), > where I can redirect someone to a external service url, using the information > stored in airflow. There could be many use cases like downloading a signed > file from s3, redirecting to hadoop job tracker, or a direct case on which I > am working which is linking airflow tasks to qubole commands. > I already have a working model and will open a PR soon. Please let me know if > there existing ways already. > Thanks, > Sumit -- This message was sent by Atlassian JIRA (v6.3.4#6332)
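The XCom-plus-EmailOperator pattern Chris describes can be illustrated outside Airflow with a toy store standing in for the XCom table (the class, task ids, and S3 path below are hypothetical, for illustration only):

```python
# Toy stand-in for Airflow's XCom mechanism; not Airflow's actual implementation.
class ToyXCom:
    def __init__(self):
        self._data = {}

    def push(self, task_id, value, key="return_value"):
        self._data[(task_id, key)] = value

    def pull(self, task_id, key="return_value"):
        return self._data[(task_id, key)]

xcom = ToyXCom()
# An upload task records where it put the file in the blob store...
xcom.push("upload_report", "s3://my-bucket/reports/2016-05-23.csv")
# ...and a later notification task resolves it, the way an EmailOperator
# template would via {{ ti.xcom_pull(task_ids='upload_report') }}.
email_body = "Your file is ready: {}".format(xcom.pull("upload_report"))
```

The real pattern works because XCom values are reachable from Jinja templates, so the email body can reference the file location without the DAG author wiring it through explicitly.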
[jira] [Assigned] (AIRFLOW-58) Add bulk_dump abstract method to DbApiHook
[ https://issues.apache.org/jira/browse/AIRFLOW-58?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bence Nagy reassigned AIRFLOW-58: - Assignee: Bence Nagy > Add bulk_dump abstract method to DbApiHook > -- > > Key: AIRFLOW-58 > URL: https://issues.apache.org/jira/browse/AIRFLOW-58 > Project: Apache Airflow > Issue Type: Improvement > Components: hooks >Affects Versions: Airflow 1.7.0 >Reporter: Bence Nagy >Assignee: Bence Nagy >Priority: Trivial > > I just see no reason for having a method for bulk loading but not for the > inverse. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (AIRFLOW-159) Documentation: Cloud integration : GCP
[ https://issues.apache.org/jira/browse/AIRFLOW-159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15296939#comment-15296939 ] Chris Riccomini commented on AIRFLOW-159: - Sounds good! > Documentation: Cloud integration : GCP > -- > > Key: AIRFLOW-159 > URL: https://issues.apache.org/jira/browse/AIRFLOW-159 > Project: Apache Airflow > Issue Type: Task > Components: gcp >Reporter: Alex Van Boxel >Assignee: Alex Van Boxel > > Start documenting all GCP operators and hooks. > I propose a new top-level documentation section that's called "Integration". > Under that section I would make a sub-section "Google Cloud Platform". > This way other Cloud integration can be documented as well in the Integration > section. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Closed] (AIRFLOW-10) Migrate GH issues to Apache JIRA
[ https://issues.apache.org/jira/browse/AIRFLOW-10?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Riccomini closed AIRFLOW-10. -- Resolution: Done > Migrate GH issues to Apache JIRA > > > Key: AIRFLOW-10 > URL: https://issues.apache.org/jira/browse/AIRFLOW-10 > Project: Apache Airflow > Issue Type: Bug > Components: project-management >Reporter: Chris Riccomini >Assignee: Bolke de Bruin > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (AIRFLOW-164) Disable the web UI's page load animations
[ https://issues.apache.org/jira/browse/AIRFLOW-164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15296927#comment-15296927 ] Chris Riccomini commented on AIRFLOW-164: - I'm +1 on removing it, personally. I suspect [~maxime.beauche...@apache.org] will have the strongest preference. > Disable the web UI's page load animations > - > > Key: AIRFLOW-164 > URL: https://issues.apache.org/jira/browse/AIRFLOW-164 > Project: Apache Airflow > Issue Type: Improvement > Components: ui >Affects Versions: Airflow 1.7.1 >Reporter: Bence Nagy > > Alright, some people might disagree, looking forward to the discussion here. > Basically my qualm is that opening almost any page (even the DAGs list) will > trigger an animation where the content kinda swoops in from the left/the > top/the top-left corner. This gets pretty annoying for a few reasons: > - It takes around half a second before the content is visually parsable, and > these half seconds accumulate pretty quick when doing lots of administration. > - This makes visual diffing when refreshing or editing the URL impossible. If > the animations weren't firing, it would be possible to refresh for instance > the tree view of a complicated DAG and just see the treemap change, making > the differences obvious. Currently you need to commit the state to memory and > then recall it after the animation has finished to try and figure out what > has changed. > - I think it just makes no sense from a design point of view anyway to have > all this data sliding around the screen. It's not like it passes off as a > transition animation or anything. > What does everyone else think? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
incubator-airflow git commit: docfix: Fix a couple of minor typos.
Repository: incubator-airflow Updated Branches: refs/heads/master 88f895aa6 -> 8d7297573 docfix: Fix a couple of minor typos. Project: http://git-wip-us.apache.org/repos/asf/incubator-airflow/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-airflow/commit/8d729757 Tree: http://git-wip-us.apache.org/repos/asf/incubator-airflow/tree/8d729757 Diff: http://git-wip-us.apache.org/repos/asf/incubator-airflow/diff/8d729757 Branch: refs/heads/master Commit: 8d72975734e66d6efa775cec62dd0aea87575c0d Parents: 88f895a Author: Mark Reid Authored: Mon May 23 09:16:03 2016 -0300 Committer: Mark Reid Committed: Mon May 23 09:16:38 2016 -0300 -- docs/concepts.rst | 2 +- docs/configuration.rst | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) -- http://git-wip-us.apache.org/repos/asf/incubator-airflow/blob/8d729757/docs/concepts.rst -- diff --git a/docs/concepts.rst b/docs/concepts.rst index 405048a..6e15ff8 100644 --- a/docs/concepts.rst +++ b/docs/concepts.rst @@ -172,7 +172,7 @@ functionally equivalent: When using the bitshift to compose operators, the relationship is set in the direction that the bitshift operator points. For example, ``op1 >> op2`` means -that ``op1`` runs first and ``op2`` runs seconds. Multiple operators can be +that ``op1`` runs first and ``op2`` runs second. Multiple operators can be composed -- keep in mind the chain is executed left-to-right and the rightmost object is always returned. For example: http://git-wip-us.apache.org/repos/asf/incubator-airflow/blob/8d729757/docs/configuration.rst -- diff --git a/docs/configuration.rst b/docs/configuration.rst index 2d8a9fb..3eed553 100644 --- a/docs/configuration.rst +++ b/docs/configuration.rst @@ -107,7 +107,7 @@ Here are a few imperative requirements for your workers: ``MySqlOperator``, the required Python library needs to be available in the ``PYTHONPATH`` somehow - The worker needs to have access to its ``DAGS_FOLDER``, and you need to - synchronize the filesystems by your own mean. 
A common setup would be to + synchronize the filesystems by your own means. A common setup would be to store your DAGS_FOLDER in a Git repository and sync it across machines using Chef, Puppet, Ansible, or whatever you use to configure machines in your environment. If all your boxes have a common mount point, having your
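The behaviour the corrected doc sentence describes (the chain executes left-to-right, and the rightmost object is returned) can be sketched with a minimal operator-like class; this is a hypothetical stand-in for illustration, not Airflow's BaseOperator:

```python
class ToyOp:
    """Minimal stand-in for an operator supporting >> composition."""
    def __init__(self, task_id):
        self.task_id = task_id
        self.downstream = []

    def __rshift__(self, other):
        # op1 >> op2: op1 runs first, op2 runs second.
        self.downstream.append(other)
        # Returning the right operand is what makes a >> b >> c
        # chain left-to-right and evaluate to the rightmost object.
        return other

a, b, c = ToyOp("a"), ToyOp("b"), ToyOp("c")
tail = a >> b >> c  # sets a -> b and b -> c; the expression evaluates to c
```

Because each `>>` returns its right operand, assigning the chained expression to a variable always yields the last operator, which is worth remembering when composing further from that variable.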
[jira] [Closed] (AIRFLOW-138) Airflow improperly shows task status as 'up for retry' for a task that failed on re-run
[ https://issues.apache.org/jira/browse/AIRFLOW-138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Anand closed AIRFLOW-138. --- > Airflow improperly shows task status as 'up for retry' for a task that failed > on re-run > --- > > Key: AIRFLOW-138 > URL: https://issues.apache.org/jira/browse/AIRFLOW-138 > Project: Apache Airflow > Issue Type: Bug > Components: scheduler >Affects Versions: Airflow 1.7.0 >Reporter: Tomasz Bartczak >Assignee: Siddharth Anand >Priority: Minor > > Migrated from https://github.com/apache/incubator-airflow/issues/1441 > Dear Airflow Maintainers, > *Environment* > Before I tell you about my issue, let me describe my Airflow environment: > {panel} > Airflow version: 1.7.0 > Airflow components: webserver, mysql, scheduler with celery executor > Python Version: 2.7.6 > Operating System: Linux Ubuntu 3.19.0-26-generic > {panel} > *Description of Issue* > Now that you know a little about me, let me tell you about the issue I am > having: > *What I expect:* > If I do a re-run and it fails - The task should be either re-tried again > (resetting retry count) and marked accordingly in GUI OR not retried - and > marked in GUI as 'failed' > What happened instead? The task in the GUI was presented as > 'up_for_retry' however it was not retried, even after retry_delay has passed > *Reproducing the Issue* > DAG does not have some strange settings: > {code} > concurrency= 3, > max_active_runs = 2, > start_date = datetime(2016,04,03,01), > default_args={ > 'depends_on_past': False, > 'retries': 2, > 'retry_delay': timedelta(minutes=3) } > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Closed] (AIRFLOW-151) trigger_rule='one_success' not allowing tasks downstream of a BranchPythonOperator to be executed
[ https://issues.apache.org/jira/browse/AIRFLOW-151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Anand closed AIRFLOW-151. --- > trigger_rule='one_success' not allowing tasks downstream of a > BranchPythonOperator to be executed > - > > Key: AIRFLOW-151 > URL: https://issues.apache.org/jira/browse/AIRFLOW-151 > Project: Apache Airflow > Issue Type: Bug >Reporter: Siddharth Anand >Assignee: Siddharth Anand > Attachments: DAG_Problem.png, DAG_Problem_Resolved.PNG, > DAG_Solution_Example.png > > > Porting from https://github.com/apache/incubator-airflow/issues/1521 > Dear Airflow Maintainers, > *Environment* > {panel} > Airflow version: 1.7.0rc3 > Airflow components: webserver, scheduler, worker, postgres database, > CeleryExecutor > Relevant airflow.cfg settings: nothing special here; mostly defaults > Python Version: 3.4.3 > Operating System: Centos 6.7 > Python packages: virtualenv with standard airflow install > {panel} > *Background* > We are constructing a workflow to automate standard business processes around > the creation and maintenance of virtual machines. After creation, we verify > several information points on the VM to ensure that it is a viable machine > and that no configuration errors occurred. If it fails verification and is > not running, then it should be deleted. If it fails verification and is > running, then we stop it first, then delete it. > *What did you expect to happen?* > After researching the BranchPythonOperator, I found that I should be using > trigger_rule='one_success' to allow a task at a join point downstream of the > branch(es) to be triggered, as mentioned in #1078. 
So, I defined the task as > follows: > {code} > delete_vm = PythonOperator( > task_id='delete_vm', > trigger_rule=TriggerRule.ONE_SUCCESS, > python_callable=_delete_vm, > provide_context=True, > dag=dag) > delete_vm.set_upstream({poll_vm_stop, verify_vm}) > {code} > *What happened instead?* > Rather than executing correctly, the delete_vm task is marked as skipped and > is not re-evaluated following poll_vm_stop. There is no stack trace > available, as the task simply does not execute. Sidenote: the PythonSensor > you see in the picture below is a sensor which evaluates the truthy- or > falsey-ness of a Python callable. It has been tested extensively and works as > intended. > !DAG_Problem.png! > Any help would be greatly appreciated. I've tested various ways of linking > the dag, providing DummyOperators as buffers, using a second > BranchPythonOperator to explicitly call the task; all of these have failed. > Am I missing something obvious here? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (AIRFLOW-151) trigger_rule='one_success' not allowing tasks downstream of a BranchPythonOperator to be executed
[ https://issues.apache.org/jira/browse/AIRFLOW-151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15296493#comment-15296493 ] William Clark commented on AIRFLOW-151: --- [~sanand], thank you so much for the assistance! With those changes, the DAG is now functioning as expected. !DAG_Problem_Resolved.PNG! > trigger_rule='one_success' not allowing tasks downstream of a > BranchPythonOperator to be executed > - > > Key: AIRFLOW-151 > URL: https://issues.apache.org/jira/browse/AIRFLOW-151 > Project: Apache Airflow > Issue Type: Bug >Reporter: Siddharth Anand >Assignee: Siddharth Anand > Attachments: DAG_Problem.png, DAG_Problem_Resolved.PNG, > DAG_Solution_Example.png > > > Porting from https://github.com/apache/incubator-airflow/issues/1521 > Dear Airflow Maintainers, > *Environment* > {panel} > Airflow version: 1.7.0rc3 > Airflow components: webserver, scheduler, worker, postgres database, > CeleryExecutor > Relevant airflow.cfg settings: nothing special here; mostly defaults > Python Version: 3.4.3 > Operating System: Centos 6.7 > Python packages: virtualenv with standard airflow install > {panel} > *Background* > We are constructing a workflow to automate standard business processes around > the creation and maintenance of virtual machines. After creation, we verify > several information points on the VM to ensure that it is a viable machine > and that no configuration errors occurred. If it fails verification and is > not running, then it should be deleted. If it fails verification and is > running, then we stop it first, then delete it. > *What did you expect to happen?* > After researching the BranchPythonOperator, I found that I should be using > trigger_rule='one_success' to allow a task at a join point downstream of the > branch(es) to be triggered, as mentioned in #1078. 
So, I defined the task as > follows: > {code} > delete_vm = PythonOperator( > task_id='delete_vm', > trigger_rule=TriggerRule.ONE_SUCCESS, > python_callable=_delete_vm, > provide_context=True, > dag=dag) > delete_vm.set_upstream({poll_vm_stop, verify_vm}) > {code} > *What happened instead?* > Rather than executing correctly, the delete_vm task is marked as skipped and > is not re-evaluated following poll_vm_stop. There is no stack trace > available, as the task simply does not execute. Sidenote: the PythonSensor > you see in the picture below is a sensor which evaluates the truthy- or > falsey-ness of a Python callable. It has been tested extensively and works as > intended. > !DAG_Problem.png! > Any help would be greatly appreciated. I've tested various ways of linking > the dag, providing DummyOperators as buffers, using a second > BranchPythonOperator to explicitly call the task; all of these have failed. > Am I missing something obvious here? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
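The trigger-rule semantics at issue in this ticket can be sketched as predicates over upstream task states; this is a simplification for illustration, not Airflow's actual dependency engine (which, as the ticket shows, also has to account for skipped branches):

```python
def one_success(upstream_states):
    # ONE_SUCCESS: the join task fires as soon as at least one
    # upstream task instance has succeeded, even if others were skipped.
    return any(s == "success" for s in upstream_states)

def all_success(upstream_states):
    # Default rule: every upstream task instance must have succeeded,
    # which is why a join below a branch needs a different rule.
    return all(s == "success" for s in upstream_states)
```

With a BranchPythonOperator upstream, one branch is skipped by design, so under the default rule the join would never fire; ONE_SUCCESS is the usual workaround.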
[jira] [Created] (AIRFLOW-165) Add a description/metadata field to the Task
Richard Davison created AIRFLOW-165: --- Summary: Add a description/metadata field to the Task Key: AIRFLOW-165 URL: https://issues.apache.org/jira/browse/AIRFLOW-165 Project: Apache Airflow Issue Type: Improvement Reporter: Richard Davison Priority: Minor I think that to help facilitate self-documentation, we should add either a `description` field or a `metadata` field at the Task level so we can add an arbitrary blob of information to describe it. On the UI side, we could put the description in the alt text, a mouseover popup, in the onclick overlay popup, or in a link inside of that similar to the optional 'subdag' link. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (AIRFLOW-140) DagRun state not updated
[ https://issues.apache.org/jira/browse/AIRFLOW-140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15292963#comment-15292963 ] dud edited comment on AIRFLOW-140 at 5/23/16 9:49 AM: -- Hello. I tried with the LocalExecutor as requested and I observed the same behaviour : {code} airflow=> SELECT * FROM task_instance WHERE dag_id = :dag_id ORDER BY execution_date ; SELECT * FROM dag_run WHERE dag_id = :dag_id ; SELECT * FROM job ORDER BY start_date DESC LIMIT 5; task_id | dag_id | execution_date| start_date | end_date | duration | state | try_number | hostname | unixname | job_id | pool | queue | priority_weight |operator| queued_dttm --+---+-+++---+-++---+--++--+-+-++- alt_sleep | dagrun_not_updated | 2016-05-20 07:45:00 | 2016-05-20 07:46:54.372843 || | running | 1 | localhost | airflow | 3203 | | default | 1 | PythonOperator | alt_sleep | dagrun_not_updated | 2016-05-20 07:46:00 | 2016-05-20 07:47:19.317705 | 2016-05-20 07:47:29.453316 | 10.135611 | success | 1 | localhost | airflow | 3204 | | default | 1 | PythonOperator | alt_sleep | dagrun_not_updated | 2016-05-20 07:47:00 | 2016-05-20 07:48:01.724885 || | running | 1 | localhost | airflow | 3205 | | default | 1 | PythonOperator | alt_sleep | dagrun_not_updated | 2016-05-20 07:48:00 | 2016-05-20 07:49:12.031225 | 2016-05-20 07:49:22.083763 | 10.052538 | success | 1 | localhost | airflow | 3206 | | default | 1 | PythonOperator | (4 rows) id |dag_id | execution_date| state | run_id | external_trigger | conf | end_date | start_date --+---+-+-++--+--+--+ 1485 | dagrun_not_updated | 2016-05-20 07:45:00 | running | scheduled__2016-05-20T07:45:00 | f| | | 2016-05-20 07:46:38.30924 1486 | dagrun_not_updated | 2016-05-20 07:46:00 | running | scheduled__2016-05-20T07:46:00 | f| | | 2016-05-20 07:47:01.563541 1487 | dagrun_not_updated | 2016-05-20 07:47:00 | running | scheduled__2016-05-20T07:47:00 | f| | | 2016-05-20 07:48:00.016718 1488 | dagrun_not_updated | 2016-05-20 07:48:00 | running | 
scheduled__2016-05-20T07:48:00 | f| | | 2016-05-20 07:49:00.203204 (4 rows) id | dag_id | state | job_type | start_date | end_date | latest_heartbeat | executor_class | hostname | unixname --++-+--+++++---+-- 3206 || success | LocalTaskJob | 2016-05-20 07:49:08.691714 | 2016-05-20 07:49:23.706144 | 2016-05-20 07:49:08.691725 | LocalExecutor | localhost | airflow 3205 || running | LocalTaskJob | 2016-05-20 07:48:01.155988 | | 2016-05-20 07:50:51.312164 | LocalExecutor | localhost | airflow 3204 || success | LocalTaskJob | 2016-05-20 07:47:16.153078 | 2016-05-20 07:47:31.168997 | 2016-05-20 07:47:16.153091 | LocalExecutor | localhost | airflow 3203 || running | LocalTaskJob | 2016-05-20 07:46:48.198379 | | 2016-05-20 07:50:53.42636 | LocalExecutor | localhost | airflow 3202 || running | SchedulerJob | 2016-05-20 07:45:31.43799 | | 2016-05-20 07:50:55.061958 | LocalExecutor | localhost | airflow {code} Extract of database logs : {code} 2016-05-20 07:47:31 UTC [24003-36] airflow@airflow LOG: duration: 38.731 ms statement: UPDATE job SET state='success', end_date='2016-05-20T07:47:31.168997'::timestamp, latest_heartbeat='2016-05-20T07:47:16.153091'::timestamp WHERE job.id = 3204 2016-05-20 07:49:23 UTC [24107-36] airflow@airflow LOG: duration: 0.179 ms statement: UPDATE job SET state='success', end_date='2016-05-20T07:49:23.706144'::timestamp, latest_heartbeat='2016-05-20T07:49:08.691725'::timestamp WHERE job.id = 3206 2016-05-20 07:52:03 UTC [23971-336] airflow@airflow LOG: duration: 0.291 ms statement: UPDATE job SET state='success', end_date='2016-05-20T07:52:03.526927'::timestamp, latest_heartbeat='2016-05-20T07:46:48.198389'::timestamp WHERE job.id = 3203 2016-05-20 07:53:06 UTC [24047-326] airflow@airflow LOG: duration: 0.179
[jira] [Created] (AIRFLOW-164) Disable the web UI's page load animations
Bence Nagy created AIRFLOW-164: -- Summary: Disable the web UI's page load animations Key: AIRFLOW-164 URL: https://issues.apache.org/jira/browse/AIRFLOW-164 Project: Apache Airflow Issue Type: Improvement Components: ui Affects Versions: Airflow 1.7.1 Reporter: Bence Nagy Alright, some people might disagree, looking forward to the discussion here. Basically my qualm is that opening almost any page (even the DAGs list) will trigger an animation where the content kinda swoops in from the left/the top/the top-left corner. This gets pretty annoying for a few reasons: - It takes around half a second before the content is visually parsable, and these half seconds accumulate pretty quick when doing lots of administration. - This makes visual diffing when refreshing or editing the URL impossible. If the animations weren't firing, it would be possible to refresh for instance the tree view of a complicated DAG and just see the treemap change, making the differences obvious. Currently you need to commit the state to memory and then recall it after the animation has finished to try and figure out what has changed. - I think it just makes no sense from a design point of view anyway to have all this data sliding around the screen. It's not like it passes off as a transition animation or anything. What does everyone else think? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (AIRFLOW-162) Allow variables to be exposed in the templates
Alex Van Boxel created AIRFLOW-162: -- Summary: Allow variables to be exposed in the templates Key: AIRFLOW-162 URL: https://issues.apache.org/jira/browse/AIRFLOW-162 Project: Apache Airflow Issue Type: Improvement Components: core Reporter: Alex Van Boxel Assignee: Alex Van Boxel Priority: Trivial Allow variables to be exposed in the templates. This makes it possible to access them in the following way, for example: {var.gcp_dataflow_base}/test-pipeline.jar In this example the basepath is configured in variables. This makes it possible to make some parts configurable (for example, differences between prod/staging/test). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
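The proposed lookup (a template placeholder resolving against stored variables) can be mimicked with plain `str.format` attribute access; the namespace class below is a toy stand-in for illustration, not the actual implementation being proposed:

```python
class ToyVars:
    """Toy stand-in exposing stored key/value variables as attributes."""
    def __init__(self, values):
        self._values = values

    def __getattr__(self, name):
        # Called only for names not found normally, i.e. the variable keys.
        return self._values[name]

var = ToyVars({"gcp_dataflow_base": "gs://my-bucket/dataflow"})
# str.format replacement fields support attribute access,
# so a {var.<key>} placeholder resolves against the store:
jar_path = "{var.gcp_dataflow_base}/test-pipeline.jar".format(var=var)
```

Keeping the base path in a variable store is what makes the prod/staging/test split configurable without editing the template itself.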
[jira] [Created] (AIRFLOW-161) Redirection to external url
Sumit Maheshwari created AIRFLOW-161: Summary: Redirection to external url Key: AIRFLOW-161 URL: https://issues.apache.org/jira/browse/AIRFLOW-161 Project: Apache Airflow Issue Type: Improvement Components: webserver Reporter: Sumit Maheshwari Hi, I am not able to find a good way (apart from loading everything upfront), where I can redirect someone to an external service url, using the information stored in airflow. There could be many use cases like downloading a signed file from s3, redirecting to hadoop job tracker, or a direct case on which I am working which is linking airflow tasks to qubole commands. I already have a working model and will open a PR soon. Please let me know if there are existing ways already. Thanks, Sumit -- This message was sent by Atlassian JIRA (v6.3.4#6332)