What is the best way to retry an entire DAG instead of just a single task?

2016-08-02 Thread Arthur Purvis
Apologies if this is a dumb question, but I'm looking for a way to retry an entire DAG if a single task fails, rather than retry just that task. The context is that of a job starter + sensor, and if the sensor fails it means the job needs to be restarted, not just re-sensored. From reading the d

Re: 1.7.1.2 got broken, 1.7.1.3 is out

2016-08-02 Thread siddharth anand
FYI, 1.4.0 does not in fact work for us, but 1.4.1 works on 1.7.1.3. It's weird that we are able to run 1.7.1.3 on 1.4.1. We are building using setup.py from the Git repo. -s On Mon, Jun 13, 2016 at 3:33 PM, Maxime Beauchemin < maximebeauche...@gmail.com> wrote: > Hi there, > > The new flask-a

Re: What is the best way to retry an entire DAG instead of just a single task?

2016-08-02 Thread Wang Yajun
Arthur Purvis, you can try: 1. airflow backfill -s start_date -e end_date your_DAG 2. airflow trigger_dag your_DAG. You can find the details in the official documentation. Hope this helps. On Tuesday, Aug 2, 2016 at 10:55 PM, Arthur Purvis wrote: > Apologies if this is a dumb question, but I'm looking for a way to re

Re: Projects using GitHub issues

2016-08-02 Thread siddharth anand
Mentors, Any updates here? I looked at the thread that Chris R had pasted to kick off this discussion. The thread meandered. I saw a comment from Jim J stating : Recall that issue trackers have not been, traditionally, held under the requirement of being under ASF infra nor infra control. If peopl

Re: DAG without a schedule interval?

2016-08-02 Thread siddharth anand
As Max mentioned, you can write : dag = DAG('ep_reload_data', schedule_interval=None, default_args=default_args) -s On Fri, Jun 10, 2016 at 8:37 AM, Maxime Beauchemin < maximebeauche...@gmail.com> wrote: > schedule_interval=None is supported and works nicely as an externally > triggered, "on d

DAG scheduled for start_date of today and an interval of 7 days keeps getting scheduled for the past

2016-08-02 Thread David Klosowski
I have a DAG that I just deployed that the scheduler keeps scheduling for the last two months in the past. start_date: 8/5/2016 scheduled runs started: 7/3/2016 6/5/2016 Here is the gist of this DAG's architecture: The DAG depends on another DAG's tasks using 7 dynamic ExternalTaskSensors that it b

Re: DAG scheduled for start_date of today and an interval of 7 days keeps getting scheduled for the past

2016-08-02 Thread siddharth anand
The problem might be that the start_date does not get updated. I work around this by changing the name of my dag. I do lose history as well, but it works. My dags are named "some_dag_v1". When I change a start date, I update the version suffix to force a reload : "some_dag_v2" -s On Tue, Aug 2,
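The renaming workaround described above can be sketched in plain Python. This is only an illustration: the helper `versioned_dag_id` and the constant `DAG_VERSION` are hypothetical names, not part of Airflow. The resulting string would be used as the DAG's name.

```python
# Hypothetical helper (not an Airflow API): build a DAG name with a
# version suffix such as "some_dag_v1" or "some_dag_v2".
def versioned_dag_id(base_name, version):
    """Return the DAG name with a "_v<version>" suffix."""
    if version < 1:
        raise ValueError("version must be >= 1")
    return "{}_v{}".format(base_name, version)


# Assumption: this number is bumped by hand whenever start_date changes,
# so that the DAG is picked up under a new name.
DAG_VERSION = 2
dag_id = versioned_dag_id("some_dag", DAG_VERSION)
```

As noted in the message, the renamed DAG starts with no run history, so this trades history for a clean reload.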

Re: DAG scheduled for start_date of today and an interval of 7 days keeps getting scheduled for the past

2016-08-02 Thread David Klosowski
start_date being updated isn't the issue here. I haven't changed it. New execution_dates keep getting created for the past before any dags or start_dates existed. Cheers, David On Tue, Aug 2, 2016 at 7:10 PM, siddharth anand wrote: > The problem might be that the start_date does not get upda

Re: DAG scheduled for start_date of today and an interval of 7 days keeps getting scheduled for the past

2016-08-02 Thread siddharth anand
Interesting. If you haven't already, can you create a Jira and attach an example DAG that I can run to reproduce it (likely capturing the code you have above)? You can then assign the bug to me to look into. Also, please provide enough context on your use-case and why you are structuring your code th

Re: What is the best way to retry an entire DAG instead of just a single task?

2016-08-02 Thread siddharth anand
Hi Arthur, It's not a dumb question. To the best of my knowledge, we don't have the ability to retry a DAG programmatically based on a task failure. Also, we don't allow cyclic dependencies, hence the Acyclic part of DAG. TriggerDagRunOperator won't work because the execution is async. The TDR
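To make the requested behaviour concrete, here is a minimal plain-Python sketch of "restart the job when the sensor fails". It is not an Airflow feature and does not use any Airflow API. The names `run_with_restart`, `start_job`, and `sense_job` are hypothetical, and the sketch assumes the starter and the sensor can be called together as one unit.

```python
# Hypothetical sketch: treat "start the job" and "sense the job" as one
# retryable unit, so a sensor failure restarts the job itself.
def run_with_restart(start_job, sense_job, max_attempts=3):
    """Start the job and check it; restart from scratch if the check fails.

    start_job: callable returning a handle for the started job.
    sense_job: callable taking that handle, returning True on success.
    Returns the attempt number (starting at 1) that succeeded.
    """
    for attempt in range(1, max_attempts + 1):
        handle = start_job()      # always restart the job, never only re-sense
        if sense_job(handle):
            return attempt
    raise RuntimeError(
        "job did not succeed after {} attempts".format(max_attempts)
    )
```

The design choice here is that the retry boundary wraps both steps, which is the semantics the original question asks for. Retrying only the sensing step would leave a failed job in place.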

Re: DAG status still running when all its tasks are complete

2016-08-02 Thread Nadeem Ahmed Nazeer
Could someone please shed some light on this DAG status? My airflow version is 1.7.0. This is the only version that works for me when it comes to the scheduler. With any version above this, the scheduler gets stuck without a trace and won't schedule anything. Thanks, Nadeem On Mon, Aug 1, 2016 at 2:29

Re: DAG status still running when all its tasks are complete

2016-08-02 Thread siddharth anand
Hi Nadeem, Can you open a JIRA, attach a DAG which I can run to reproduce your issue, and assign the JIRA to me? -s On Tue, Aug 2, 2016 at 8:40 PM, Nadeem Ahmed Nazeer wrote: > Could someone please shed some light on this DAG status? > > My airflow version is 1.7.0. This is the only version that

Re: Run input field in Gantt page

2016-08-02 Thread siddharth anand
This makes sense. Go ahead and submit a PR if you haven't already and ping me to review it. -s On Fri, Jul 15, 2016 at 7:37 PM, wood stock wrote: > Hi All, > > My first post. I installed the latest release with "pip install airflow" > and it is running. Initial impression is very good. > > Quest

Re: Running a task from the Airflow UI

2016-08-02 Thread siddharth anand
A REST API is long overdue. I suggest that anyone in the community who has the cycles start implementing one; your PRs would be welcome. Currently, we have a very powerful CLI that should ideally have similar functionality exposed via the API. The CLI's trigger_dag command is one of the first ones I'd

Re: Creating and accessing different variable values for each instance of DAG run

2016-08-02 Thread siddharth anand
+1 for Laura's suggestion. We (Agari) also find ourselves in situations where dag runs end up overlapping. We pass the execution_date via Jinja templates to BashOperator to remote shell commands to Spark jobs to the final Spark output files on S3. The execution date is what ties the DAG Run to the
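The idea of carrying the execution date all the way through to the output files can be sketched in plain Python. This is a hedged illustration only: the function `output_path`, the bucket name `example-bucket`, and the path layout are assumptions, not Agari's actual scheme. In Airflow the date would typically reach the command through a Jinja template rather than a direct function call.

```python
from datetime import datetime


# Hypothetical helper: derive an output location from the execution date,
# so that two overlapping DAG runs never write to the same place.
def output_path(bucket, job_name, execution_date):
    """Return an S3-style prefix that is unique per execution date."""
    stamp = execution_date.strftime("%Y-%m-%dT%H-%M-%S")
    return "s3://{}/{}/execution_date={}/".format(bucket, job_name, stamp)


# Two hourly runs that overlap in wall-clock time still get distinct paths.
run_a = output_path("example-bucket", "spark_job", datetime(2016, 8, 2, 10, 0, 0))
run_b = output_path("example-bucket", "spark_job", datetime(2016, 8, 2, 11, 0, 0))
```

Because the path depends only on the execution date and not on the time the job actually ran, a rerun of the same DAG run writes to the same location.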

Re: Airflow SLA shows tasks that doesn't miss SLA

2016-08-02 Thread siddharth anand
We have been using the SLA_miss_callback feature for over 6 months now for our hourly DAGs. We tend to miss SLAs (and rightly so) about 20% of the time. We haven't seen this bug. We are also running on the hour boundaries and not at 59 minutes from the hour. Is there a reason your start_da

Re: Handling running tasks

2016-08-02 Thread siddharth anand
+1 on the uniqueness of the solution. Wondering how it worked! -s On Sun, Jul 10, 2016 at 9:02 AM, Cyril Scetbon wrote: > Interesting. Thanks for this solution Lance, gonna try it > > On Jul 6, 2016, at 19:11, Lance Norskog wrote: > > > > You could use the XCOM feature to post a semaphore at t

Re: "Run" pulldown menu in TreeView, TaskDuration, LandingTimes

2016-08-02 Thread siddharth anand
That sounds correct to me. Is there a PR? -s On Sat, Jul 16, 2016 at 12:40 AM, wood stock wrote: > Hi, > I found an issue with the "Number of runs:" pulldown menu. Looks like we > intend to retrieve task instances for the last X runs until the "Base". > However, the logic has a flaw IMO. > > It