What I find is that when Celery rejects tasks we hit this. For us we don't do
work on the hosts, so we solve it by over-provisioning tasks in Celery.
On Tue, May 15, 2018, 6:30 AM Andy Cooper wrote:
> I have had very similar issues when there was a problem with the connection
> string pointing to the message
We use SLAs as well; it works great for some DAGs and is painful for others.
We rely on sensors to validate the data is ready before we run, and each DAG
waits on sensors for different times (one DAG waits for 8 hours since it
expects data at the start of the day but tends to get it 8 hours later). We
also
So I was bitten by this and found the JIRA which says it's resolved in 1.9.1,
but I don't see the commit in v1-9-stable or test; what is the correct
branch for 1.9 fixes?
Thanks!
Nothing I know of. The scheduler finds the latest execution, then creates
the next based off the interval; this is also why updates to start date have no
effect (it doesn't try to fill gaps).
On Mon, Apr 2, 2018, 11:26 AM Dennis O'Brien
wrote:
> Hi folks,
>
> I recently asked this question on gitter but
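The scheduling behavior described above (find the latest execution, add the interval, never go back to fill gaps) can be sketched in plain Python. This is a toy illustration of the idea, not Airflow's actual scheduler code:

```python
from datetime import datetime, timedelta

def next_run(latest_execution, interval):
    """Sketch of interval-based scheduling: the next run is always the
    latest execution plus the interval. Earlier gaps are never revisited,
    which is why moving start_date has no effect on an existing DAG."""
    return latest_execution + interval

latest = datetime(2018, 4, 2, 10, 0)
print(next_run(latest, timedelta(hours=1)))  # 2018-04-02 11:00:00
```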
For us we compile down to Python rather than write the logic in Python; that
makes it so the DAG load doesn't do real work.
We have our own DSL that is just a simplified compiler: parse, analyze,
optimize, code gen. In code gen we just generate the Python code. Our
build then packages it up and have ai
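The code-gen step of such a compile-to-Python approach can be sketched as below. The names and template are hypothetical; the point is that the generated module does no real work at import time:

```python
# Minimal sketch of the code-gen phase: parse/analyze/optimize happen
# earlier; here we only emit a Python module whose import does no real
# work (it just defines data for the DAG loader to read).
TEMPLATE = "# generated file; do not edit\nTASKS = {tasks!r}\n"

def generate_dag_module(task_names):
    """Turn the analyzed task list into Python source text."""
    return TEMPLATE.format(tasks=list(task_names))

source = generate_dag_module(["extract", "transform", "load"])
```

A build step would then write `source` to a file under the DAG directory and package it for deploy.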
delay?
On Mon, Mar 19, 2018, 6:15 PM David Capwell wrote:
> Ignore that, it must be something with Splunk since stdout doesn't have a
> date field; the same process writing to a file is printing that out and
> Filling is before that line...
>
> On Mon, Mar 19, 2018, 5:35 PM David
On Mon, Mar 19, 2018, 5:35 PM David Capwell wrote:
> This is weird and hope not bad utc conversion tri
ific
dag was delayed 5 hours which matches the logs...
On Mon, Mar 19, 2018, 9:10 AM David Capwell wrote:
> The major reason we have been waiting was mostly because 1.8.2 and 1.9 are
> backwards incompatible (don't remember off the top of my head but one
> operator broke importan
e bug is still in there.
>
> Cheers, Fokko
>
> 2018-03-18 19:41 GMT+01:00 David Capwell :
>
> > Thanks for the reply
> >
> > Our script doesn't set it so it should be off; the process does not normally
> > restart (monitoring has a counter for the number of restarts s
t; B.
>
> Sent from my iPad
>
> > > On 18 Mar 2018 at 19:08, David Capwell wrote the following:
> >
We just started seeing this a few days ago after turning on SLA for our
tasks (not saying SLA did this; it may have been happening before and we
weren't noticing), but we have a DAG that runs once an hour and we see that
4-5 dag runs are marked running but tasks are not getting scheduled. When we
get the SLA
ke much more of a headache. Note that one tradeoff is that
if
> git and whatever it depends on then need to be highly available.
>
> Max
>
> On Wed, Feb 28, 2018 at 6:55 PM, David Capwell wrote:
>
> > Thanks for all the details! With a pluggable fetcher we would be able to
>
uld, or
> an
> >> > "ArtifactoryDagFetcher", or "TarballInS3DagFetcher" may as well.
> >> >
> >> > Of course that assumes that the scheduler knows and stores the active
> >> > version number when generating a new D
ning towards (1) for the sake of simplicity. Note that
> some users may not want the DAG to fail/retry even when the DAG is updated, so
> this should be an optional feature, not required.
>
> My scheduler-foo isn't that great, so curious what others have to say
> about this.
>
> On Fr
an see it being a
> building block for future use cases).
>
> Joy
>
> On Fri, Feb 23, 2018 at 1:00 PM, David Capwell wrote:
>
> > My current thinking is to add a field to the dag table that is optional
> and
> > provided by the dag. We currently intercept the load path d
corner cases where this would fail.
Any other recommendations for how this could be done?
On Mon, Feb 19, 2018, 10:33 PM David Capwell wrote:
We have been using Airflow for logic that delegates to other systems, so we
inject a task that all tasks depend on to make sure all resources used are
the same for all tasks in the DAG. This works well for tasks that delegate to
external systems, but people are starting to need to run logic in Airflow
and the
do that, unfortunately. Airflow schedules the task based on the
> > current state in the DB. If you would like to preserve the history, one
> > option would be to add instrumentation in airflow_local_settings.py
> >
> > Regards,
> > Ananth.P,
> >
> >
>
When a production issue happens it's common that we clear the history to
get Airflow to run the task again. This is problematic since it throws
away the history, making it harder to find out what really happened.
Is there any way to rerun a task without deleting from the DB?
e approach and would like to use it if this is not a concern.
On Nov 7, 2017 9:52 PM, "David Capwell" wrote:
For us we use git commits to solve this for the single-node (we don't have
distributed consistency), single-task case (two tasks on the same node may
see different state).
What we do is we install the whole code in the dag dir as the following
DAG_DIR//.
There is a metadata file we update when we deploy (a
e might need to do the same here.
> >
> > Bolke
> >
> > Sent from my iPad
> >
> > > On 3 Oct 2017 at 03:02, David Capwell wrote the following:
> > >
> > > We use the bash operator to call a Java command line. We noti
We use the bash operator to call a Java command line. We notice that
sometimes the task stays running a long time (never stops) and that the logs
in Airflow stop getting updated for the task. After debugging a bit it turns
out that the JVM is blocked on the stdout FD since the buffer is full. I
ma
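The full-buffer block described above is the classic pipe-buffer deadlock: if nobody reads the child's stdout, the child's writes eventually block. A sketch of the usual fix, continuously draining stdout in a background thread (the function and names are illustrative, not what the bash operator actually does):

```python
import subprocess
import threading

def run_draining(cmd, logfile):
    """Run a child process while continuously draining its stdout,
    so the child never blocks on a full pipe buffer (the failure mode
    seen with the JVM above). `logfile` is any writable binary file."""
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT)

    def drain():
        for chunk in proc.stdout:   # reads until the child closes stdout
            logfile.write(chunk)

    t = threading.Thread(target=drain, daemon=True)
    t.start()
    rc = proc.wait()                # child can exit; its output was consumed
    t.join()                        # make sure all output reached the log
    return rc
```

Usage would look like `run_draining(["java", "-jar", "app.jar"], open("task.log", "wb"))`.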
to the DB directly to
cleanup
Thanks for your time!
On Sep 7, 2017 9:21 AM, "David Capwell" wrote:
Going into a python repl I see the following when I list the DB state
yaml.repo.update.frequency : [encrypted data]
foo : [encrypted data]
None : [encrypted data]
On Thu, Sep 7, 2017 at 9:16 AM, David Capwell wrote:
> I just deployed 1.8.2 to a test cluster that was running 1.8.0 and
ay)
>
> -ash
>
> > On 7 Sep 2017, at 17:12, David Capwell wrote:
> >
> > I just upgraded a test environment from 1.8.0 to 1.8.2 and notice that
> the
> > variables page is no longer able to load.
> >
> > The stacktrace is defined below
>
>
I just deployed 1.8.2 to a test cluster that was running 1.8.0, and the
below stacktrace is all I get when I try to view the variables page.
Looking at it and searching JIRA I found
https://issues.apache.org/jira/browse/AIRFLOW-1200, which looks like it's
trying to block things from being created, but
So if I clean up the DB for anything older than 30 days, wouldn't the
scheduler try to backfill?
On Aug 29, 2017 11:02 AM, "David Capwell" wrote:
> Thanks, will take a look at this project
>
> On Aug 29, 2017 10:35 AM, "Chris Riccomini" wrote:
>
>>
ziel > invalid
> > > wrote:
> >
> > > Here at Airbnb we delete old "completed" task instances.
> > >
> > > On Mon, Aug 28, 2017 at 3:01 PM, David Capwell
> > wrote:
> > >
> > > > We are on 1.8.0 and have a monitor DA
We are on 1.8.0 and have a monitor DAG that checks the health of Airflow
and Celery every minute. This has been running for a while now and is at 26k
dag runs. We see that the UI for this DAG is multiple seconds slower (6-7
seconds) than any other DAG.
My question is, what do people do about managin
arantee it provides.
>
> `priority_weight` works along with pool to define which task should be
> scheduled first once slots open up. It won't kill any other tasks if higher
> priority tasks show up; it just re-orders the queue.
>
> Max
>
> On Wed, Aug 16, 2017 at 10:47 PM
I'm looking into pools and had a few questions.
Let's say I have two pools, each with 50% of the cluster. If one pool is at
capacity and has a backlog, but the other pool is idle, will Airflow allow
the first pool's work to start consuming the slots from the idle pool, and
if so is there preemption
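The answer quoted above (priority only re-orders the queue as slots open up; nothing already running is killed) can be illustrated with a toy scheduling loop. The names here are hypothetical and not Airflow internals:

```python
def pick_for_slots(queued, open_slots):
    """Toy model of slot filling: when slots open up, queued tasks are
    taken in descending priority_weight order. Running tasks are never
    preempted; lower-priority tasks simply wait for the next free slot."""
    ordered = sorted(queued, key=lambda t: t["priority_weight"], reverse=True)
    return ordered[:open_slots]

queued = [
    {"task": "low_prio_backfill", "priority_weight": 1},
    {"task": "high_prio_report", "priority_weight": 10},
]
picked = pick_for_slots(queued, open_slots=1)
print([t["task"] for t in picked])  # ['high_prio_report']
```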
https://issues.apache.org/jira/browse/AIRFLOW-1463
>
> I have been working on a fix but it's likely to be a few more days before I
> have a chance to make some progress.
>
> --George
>
> On Fri, Jul 28, 2017 at 5:05 PM David Capwell wrote:
>
> > We noticed that
We noticed that in the past few days we keep seeing tasks stay in the
queued state. Looking into celery, we see that the task had failed.
Traceback (most recent call last):
  File "/python/lib/python2.7/site-packages/celery/app/trace.py", line 367, in trace_task
    R = retval = fun(*args, **kwar
ection information (URL, protocol,
> credentials, etc) is stored as a Connection with type set as 'Samba' (it
> was the closest I could find); the Hook just wraps the SharePoint API.
>
>
> I hope that helps.
>
> Cheers,
>
> Jim
>
>
>
> On 28/06/2017,
I'm just starting out with Airflow and looking to add my own Artifactory
hook so my tasks can pull from there.
Looking at the docs this means I need an ArtifactoryHook, but it's not clear
to me how this integrates with connections. Looking over the connection code
the mapping is hard coded, but the plug-