Re: Airflow with Celery

2018-05-15 Thread David Capwell
What I find is that we hit this when Celery rejects. We don't do work on the hosts, so we solve it by over-provisioning tasks in Celery. On Tue, May 15, 2018, 6:30 AM Andy Cooper wrote: > I have had very similar issues when there was a problem with the connection > string pointing to the message

Re: Improving Airflow SLAs

2018-05-02 Thread David Capwell
We use SLAs as well; they work great for some DAGs and are painful for others. We rely on sensors to validate that the data is ready before we run, and each DAG waits on sensors for a different length of time (one DAG waits for 8 hours, since it expects data at the start of the day but tends to get it 8 hours later). We also

AIRFLOW-1157 and 1.9.1

2018-04-04 Thread David Capwell
I was bitten by this and found the JIRA, which says it's resolved in 1.9.1, but I don't see the commit in v1-9-stable or test. What is the correct branch for 1.9 fixes? Thanks!

Re: schedule backfill jobs in reverse order

2018-04-02 Thread David Capwell
Nothing I know of. The scheduler finds the latest execution and then creates the next one based on the interval; this is also why updates to the start date have no effect (it doesn't try to fill gaps). On Mon, Apr 2, 2018, 11:26 AM Dennis O'Brien wrote: > Hi folks, > > I recently asked this question on gitter but
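The behaviour described in this reply (take the latest execution, add the interval) can be sketched roughly as below. This is a simplified illustration, not Airflow's scheduler code; `next_execution_date` is a hypothetical helper and the dates are made up.

```python
from datetime import datetime, timedelta


def next_execution_date(latest, start_date, interval):
    """Hypothetical helper: derive the next run from the latest run only."""
    if latest is None:
        # No run exists yet, so the start date is the first execution.
        return start_date
    # Once a run exists, the start date is no longer consulted.
    return latest + interval


latest = datetime(2018, 4, 1)
interval = timedelta(days=1)
# Moving the start date earlier does not create runs for the gap.
moved_start = datetime(2018, 1, 1)
nxt = next_execution_date(latest, moved_start, interval)
```

Because only the latest execution is consulted, there is no natural place in this loop to walk dates in reverse order.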

Re: Submitting 1000+ tasks to airflow programatically

2018-03-22 Thread David Capwell
We compile down to Python rather than doing the logic in Python, so loading the DAG doesn't do real work. We have our own DSL with a simplified compiler: parse, analyze, optimize, code gen. In code gen we just generate the Python code. Our build then packages it up and have ai
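A minimal sketch of the code-gen step, assuming the earlier compiler stages have already reduced the DSL to a list of tasks. `generate_dag_source` and the tuple layout are hypothetical, not the poster's compiler; the emitted file uses the Airflow 1.x `BashOperator` import path and assumes task ids are valid Python identifiers.

```python
def generate_dag_source(dag_id, tasks):
    """Hypothetical code generator: emit a static Python DAG file as text.

    `tasks` is a list of (task_id, bash_command, upstream_task_ids).
    """
    lines = [
        "from datetime import datetime",
        "from airflow import DAG",
        "from airflow.operators.bash_operator import BashOperator",
        "",
        "dag = DAG({!r}, start_date=datetime(2018, 1, 1))".format(dag_id),
    ]
    # One operator per task; all values are baked in as literals.
    for task_id, command, _ in tasks:
        lines.append(
            "{0} = BashOperator(task_id={0!r}, bash_command={1!r}, dag=dag)".format(
                task_id, command
            )
        )
    # Dependencies are emitted after every task variable exists.
    for task_id, _, upstream in tasks:
        for up in upstream:
            lines.append("{}.set_upstream({})".format(task_id, up))
    return "\n".join(lines) + "\n"


source = generate_dag_source(
    "example",
    [("extract", "echo extract", []), ("load", "echo load", ["extract"])],
)
```

The generated file contains only constructor calls, so parsing it costs roughly the same however much logic ran at build time.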

Re: Running dag run doesn't schedule task

2018-03-19 Thread David Capwell
delay? On Mon, Mar 19, 2018, 6:15 PM David Capwell wrote: > Ignore that, must be something with splunk since stdout doesn't have a > date field; the same process writing to a file is printing that out and > Filling is before that line... > > On Mon, Mar 19, 2018, 5:35 PM David

Re: Running dag run doesn't schedule task

2018-03-19 Thread David Capwell
Ignore that, must be something with splunk since stdout doesn't have a date field; the same process writing to a file is printing that out and Filling is before that line... On Mon, Mar 19, 2018, 5:35 PM David Capwell wrote: > This is weird and hope not bad utc conversion tri

Re: Running dag run doesn't schedule task

2018-03-19 Thread David Capwell
ific dag was delayed 5 hours which matches the logs... On Mon, Mar 19, 2018, 9:10 AM David Capwell wrote: > The major reason we have been waiting was mostly because 1.8.2 and 1.9 are > backwards incompatible (don't remember off the top of my head but one > operator broke importan

Re: Running dag run doesn't schedule task

2018-03-19 Thread David Capwell
e bug is still in there. > > Cheers, Fokko > > 2018-03-18 19:41 GMT+01:00 David Capwell : > > > Thanks for the reply > > > > Our script doesn't set it so should be off; the process does not normally > > restart (monitoring has a counter for number of restarts s

Re: Running dag run doesn't schedule task

2018-03-18 Thread David Capwell
t; B. > > Verstuurd vanaf mijn iPad > > > Op 18 mrt. 2018 om 19:08 heeft David Capwell het > volgende geschreven: > > > > We just started seeing this a few days ago after turning on SLA for our > > tasks (not saying SLA did this, may have been happening before a

Running dag run doesn't schedule task

2018-03-18 Thread David Capwell
We just started seeing this a few days ago after turning on SLA for our tasks (not saying SLA did this; it may have been happening before without us noticing), but we have a DAG that runs once an hour and we see that 4-5 DAG runs are marked running but their tasks are not getting scheduled. When we get the SLA

Re: How to add hooks for strong deployment consistency?

2018-03-01 Thread David Capwell
ke much more of a headache. Note that one tradeoff is that if > git and whatever it depends has then a need to be highly available. > > Max > > On Wed, Feb 28, 2018 at 6:55 PM, David Capwell wrote: > > > Thanks for all the details! With a pluggable fetcher we would be able to >

Re: How to add hooks for strong deployment consistency?

2018-02-28 Thread David Capwell
uld, or > an > >> > "ArtifactoryDagFetcher", or "TarballInS3DagFetcher" may as well. > >> > > >> > Of course that assumes that the scheduler knows and stores the active > >> > version number when generating a new D

Re: How to add hooks for strong deployment consistency?

2018-02-27 Thread David Capwell
ning towards (1) for sake of simplicity. Note that > some users may not want dag to fail/retry even when dag is updated, so > this should be an optional feature, not required. > > My scheduler-foo isn't that great, so curious what others have to say > about this. > > On Fr

Re: How to add hooks for strong deployment consistency?

2018-02-23 Thread David Capwell
an see it being a > building block for future use cases). > > Joy > > On Fri, Feb 23, 2018 at 1:00 PM, David Capwell wrote: > > > My current thinking is to add a field to the dag table that is optional > and > > provided by the dag. We currently intercept the load path d

Re: How to add hooks for strong deployment consistency?

2018-02-23 Thread David Capwell
corner cases where this would fail. Any other recommendations for how this could be done? On Mon, Feb 19, 2018, 10:33 PM David Capwell wrote: > We have been using airflow for logic that delegates to other systems so > inject a task that all tasks depend on to make sure all resources used are the

How to add hooks for strong deployment consistency?

2018-02-19 Thread David Capwell
We have been using Airflow for logic that delegates to other systems, so we inject a task that all tasks depend on to make sure all resources used are the same for all tasks in the DAG. This works well for tasks that delegate to external systems, but people are starting to need to run logic in Airflow and the

Re: Rerunning task without cleaning DB?

2018-02-07 Thread David Capwell
do that, unfortunately. Airflow schedule the task based on the > > current state in the DB. If you would like to preserve the history one > > option would be to add instrumentation on airflow_local_settings.py > > > > Regards, > > Ananth.P, > > > > >

Rerunning task without cleaning DB?

2018-02-05 Thread David Capwell
When a production issue happens, we commonly clear the history to get Airflow to run the task again. This is problematic because it throws away the history, which makes it harder to find out what really happened. Is there any way to rerun a task without deleting from the DB?

Re: Automatic DAGs deployment

2017-11-07 Thread David Capwell
e approach and would like to use it if this is not a concern. On Nov 7, 2017 9:52 PM, "David Capwell" wrote: > For us we use git commits to solve this for single node (don't have > distributed consistency), single task case (two tasks on same node may see > different state).

Re: Automatic DAGs deployment

2017-11-07 Thread David Capwell
For us we use git commits to solve this for single node (don't have distributed consistency), single task case (two tasks on same node may see different state). What we do is we install the whole code in the dag dir as the following DAG_DIR//. There is a metadata file we update when we deploy (a

Re: Airflow stops reading stdout of forked process with BashOperator

2017-10-06 Thread David Capwell
e might need to do the same here. > > > > Bolke > > > > Verstuurd vanaf mijn iPad > > > > > Op 3 okt. 2017 om 03:02 heeft David Capwell het > > volgende geschreven: > > > > > > We use the bash operator to call a Java command line. We noti

Airflow stops reading stdout of forked process with BashOperator

2017-10-02 Thread David Capwell
We use the bash operator to call a Java command line. We notice that sometimes the task stays running a long time (never stops) and that the logs in Airflow stop getting updated for the task. After debugging a bit, it turns out that the JVM is blocked on the stdout FD since the buffer is full. I ma
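The general failure mode, a child process blocking on a full stdout pipe because the parent stopped reading, is avoided when the parent keeps draining the pipe until end-of-file. The sketch below illustrates that pattern with the standard library only; it is not BashOperator's implementation, and `run_and_drain` is a hypothetical helper.

```python
import subprocess
import sys


def run_and_drain(cmd):
    """Run cmd and read its combined stdout/stderr until end-of-file."""
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    line_count = 0
    # readline returns b"" only at EOF, so the pipe is emptied continuously
    # and the child never waits on a full buffer.
    for raw in iter(proc.stdout.readline, b""):
        line_count += 1  # a real caller would log raw.decode().rstrip()
    proc.stdout.close()
    return proc.wait(), line_count


# The child writes roughly 500 KB, well beyond a typical pipe buffer.
child = [sys.executable, "-c", "for i in range(5000): print('x' * 100)"]
returncode, line_count = run_and_drain(child)
```

Merging stderr into stdout matters here: with two separate pipes, reading only one of them can still leave the child blocked on the other.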

Re: Upgrading to 1.8.2 fails to load variables page

2017-09-07 Thread David Capwell
to the DB directly to cleanup Thanks for your time! On Sep 7, 2017 9:21 AM, "David Capwell" wrote: > Going into a python repl I see the following when I list the DB state > > yaml.repo.update.frequency : [encrypted data] > foo : [encrypted data] > None : [encrypted data]

Re: Upgrading to 1.8.2 fails to load variables page

2017-09-07 Thread David Capwell
Going into a python repl I see the following when I list the DB state yaml.repo.update.frequency : [encrypted data] foo : [encrypted data] None : [encrypted data] On Thu, Sep 7, 2017 at 9:16 AM, David Capwell wrote: > I just deployed 1.8.2 to a test cluster that was running 1.8.0 and

Re: Upgrading to 1.8.2 fails to display variable page

2017-09-07 Thread David Capwell
ay) > > -ash > > > On 7 Sep 2017, at 17:12, David Capwell wrote: > > > > I just upgraded a test environment from 1.8.0 to 1.8.2 and notice that > the > > variables page is no longer able to load. > > > > The stacktrace is defined below > >

Upgrading to 1.8.2 fails to load variables page

2017-09-07 Thread David Capwell
I just deployed 1.8.2 to a test cluster that was running 1.8.0, and the stack trace below is all I get when I try to view the variables page. Looking at it and searching JIRA, I found https://issues.apache.org/jira/browse/AIRFLOW-1200 which looks like it's trying to block things from being created, but

Upgrading to 1.8.2 fails to display variable page

2017-09-07 Thread David Capwell
I just upgraded a test environment from 1.8.0 to 1.8.2 and noticed that the variables page is no longer able to load. The stack trace is below

Re: As history grows UI gets slower

2017-08-29 Thread David Capwell
So if I clean up the DB for anything older than 30 days, wouldn't the scheduler try to backfill? On Aug 29, 2017 11:02 AM, "David Capwell" wrote: > Thanks, will take a look at this project > > On Aug 29, 2017 10:35 AM, "Chris Riccomini" wrote: > >>

Re: As history grows UI gets slower

2017-08-29 Thread David Capwell
ziel > invalid > > > wrote: > > > > > Here at Airbnb we delete old "completed" task instances. > > > > > > On Mon, Aug 28, 2017 at 3:01 PM, David Capwell > > wrote: > > > > > > > We are on 1.8.0 and have a monitor DA
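A retention job of the kind mentioned in this thread (delete old "completed" rows, keep recent and in-flight ones) might look like the sketch below. The `task_history` table, its columns, and the state names are hypothetical stand-ins in an in-memory SQLite database, not Airflow's schema; the 30-day window is the figure raised in the thread.

```python
import sqlite3
from datetime import datetime, timedelta


def delete_old_completed(conn, now, keep_days=30):
    """Delete finished rows older than the retention window; return the count."""
    cutoff = (now - timedelta(days=keep_days)).isoformat()
    cur = conn.execute(
        "DELETE FROM task_history "
        "WHERE state IN ('success', 'failed') AND execution_date < ?",
        (cutoff,),
    )
    conn.commit()
    return cur.rowcount


# Hypothetical table standing in for the real metadata database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE task_history (task_id TEXT, state TEXT, execution_date TEXT)")
now = datetime(2017, 8, 29)
conn.executemany(
    "INSERT INTO task_history VALUES (?, ?, ?)",
    [
        ("old_done", "success", datetime(2017, 6, 1).isoformat()),
        ("old_running", "running", datetime(2017, 6, 1).isoformat()),
        ("recent_done", "success", datetime(2017, 8, 20).isoformat()),
    ],
)
deleted = delete_old_completed(conn, now)
remaining = [row[0] for row in conn.execute("SELECT task_id FROM task_history ORDER BY task_id")]
```

Filtering on state as well as age keeps rows that are still running, whatever their date.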

As history grows UI gets slower

2017-08-28 Thread David Capwell
We are on 1.8.0 and have a monitor DAG that checks the health of Airflow and Celery every minute. It has been running for a while now and is at 26k DAG runs. We see that the UI for this DAG is multiple seconds slower (6-7 seconds) than for any other DAG. My question is, what do people do about managin

Re: Pools and extra capacity?

2017-08-17 Thread David Capwell
arantee it provides. > > `priority_weight` works along with pool to define which task should be > scheduled first once slots open up. It won't kill any other tasks if higher > priority tasks show up, it just re-orders the queue. > > Max > > On Wed, Aug 16, 2017 at 10:47 PM
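The semantics described in this reply (priority re-orders the queue, running tasks are never killed) can be illustrated with a toy model. `PoolSketch` is a hypothetical class written for this example, not Airflow's pool implementation.

```python
import heapq


class PoolSketch:
    """Toy pool: fixed slots, priority-ordered queue, no preemption."""

    def __init__(self, slots):
        self.slots = slots
        self.running = []
        self._queue = []
        self._counter = 0  # tie-breaker so equal weights stay first-in, first-out

    def submit(self, task_id, priority_weight):
        # heapq is a min-heap, so the weight is negated to pop the highest first.
        heapq.heappush(self._queue, (-priority_weight, self._counter, task_id))
        self._counter += 1
        self._fill()

    def finish(self, task_id):
        self.running.remove(task_id)
        self._fill()

    def _fill(self):
        # Start queued tasks only while a slot is free; never evict a running one.
        while self._queue and len(self.running) < self.slots:
            _, _, task_id = heapq.heappop(self._queue)
            self.running.append(task_id)


pool = PoolSketch(slots=1)
pool.submit("low", 1)
pool.submit("mid", 5)
pool.submit("high", 10)
running_before = list(pool.running)  # "low" keeps its slot
pool.finish("low")
running_after = list(pool.running)  # "high" jumps ahead of "mid"
```

In this model a late high-priority task still waits for a slot to open; the weight only decides who goes next.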

Pools and extra capacity?

2017-08-16 Thread David Capwell
I'm looking into pools and had a few questions. Let's say I have two pools, each with 50% of the cluster. If one pool is at capacity and has a backlog, but the other pool is idle, will Airflow allow the first pool's work to start consuming the slots from the idle pool, and if so is there preemption

Re: Tasks stay queued when they fail in celery

2017-08-05 Thread David Capwell
tps://issues.apache.org/jira/browse/AIRFLOW-1463 > > I have been working on a fix but it's likely to be a few more days before I > have a chance to make some progress. > > --George > > On Fri, Jul 28, 2017 at 5:05 PM David Capwell wrote: > > > We noticed that

Tasks stay queued when they fail in celery

2017-07-28 Thread David Capwell
We noticed that in the past few days we keep seeing tasks stay in the queued state. Looking into celery, we see that the task had failed. Traceback (most recent call last): File "/python/lib/python2.7/site-packages/celery/app/trace.py", line 367, in trace_task R = retval = fun(*args, **kwar

Re: Hooks and connection

2017-06-28 Thread David Capwell
ection information (URL, protocol, > credentials, etc) is stored as a Connection with type set as 'Samba' (it > was the closest I could find); the Hook just wraps the SharePoint API. > > > I hope that helps. > > Cheers, > > Jim > > > > On 28/06/2017,

Hooks and connection

2017-06-28 Thread David Capwell
I'm just starting out with Airflow and looking to add my own Artifactory hook so my tasks can pull from there. Looking at the docs, this means I need an ArtifactoryHook, but it's not clear to me how this integrates with connections. Looking over the connection code the mapping is hard coded but the plug-