As the author of catch-up, the idea is that in many cases your data doesn't
"window" nicely and you want instead to just run as if it were a brilliant
Cron...
Ben
> On Jul 20, 2018, at 11:39 PM, Shah Altaf wrote:
>
> Hi my understanding is: if you use the LatestOnlyOperator
While I'm no longer a Googler, I was excited about this while I was there!
Thanks,
Ben
--
Ben Tallman - 503.680.5709
On Tue, May 1, 2018 at 10:31 AM, Maxime Beauchemin <
maximebeauche...@gmail.com> wrote:
> I'm sure the community agrees when I say that we're happy and ho
Yes, once Sumit asked that question, it made me dig a bit, and ARG.
:)
Thanks,
Ben
On Mon, Nov 14, 2016 at 11:40 AM, siddharth anand wrote:
> Ben,
> I ran into issues while maintaining my company's airflow fork and
> cherry-picking my changes
We are seeing an issue when running master where tasks sometimes never run.
It seems that once they get marked as "Dependencies Not Met" because the
pool is full, that state isn't being re-evaluated. Is anyone else seeing this?
https://issues.apache.org/jira/browse/AIRFLOW-627
Thanks,
Ben
or id_elem in id_], ())
  File "/Library/Python/2.7/site-packages/alembic/script/revision.py", line 304, in get_revisions
    for rev_id in resolved_id)
  File "/Library/Python/2.7/site-packages/alembic/script/revision.py", line 304, in <genexpr>
    for rev_id in resolved_id)
  File "/Library/Python/2.7/site-packages/alembic/script/revision.py", line 359, in _revision_for_ident
    resolved_id)
alembic.util.CommandError: Can't locate revision identified by 'f2ca10b85618'
Thanks,
Ben
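For reference, a minimal sketch of the usual workaround, assuming the stale revision is the stamp in the metadata DB's alembic_version table (table name per alembic's defaults; a SQLite stand-in here). Deleting the stale row lets the next migration run re-stamp from scratch:

```python
import sqlite3

# Simulate a metadata DB stamped with a revision alembic can no longer find
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE alembic_version (version_num VARCHAR(32) NOT NULL)")
conn.execute("INSERT INTO alembic_version VALUES ('f2ca10b85618')")

# The workaround: drop the stale stamp so the next migration run re-stamps
conn.execute("DELETE FROM alembic_version")

remaining = conn.execute("SELECT COUNT(*) FROM alembic_version").fetchone()[0]
assert remaining == 0
```

Against a real metadata DB you would run the same DELETE through your DB client before re-running the migration command.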
So to kill a running DAG (and keep it killed), we need to clear the state
of each task instance? Do we then pause the DAG? Or do that in advance?
Thanks,
Ben
On Tue, Nov 1, 2016 at 9:11 AM, Bolke de Bruin wrote:
> Clearing the state of the task, kills it. So
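For reference, a hedged sketch of that sequence with the 1.7-era CLI (flag spellings and dates are illustrative and may differ by version):

```shell
# Pause first so the scheduler stops creating new task instances
airflow pause my_dag_id

# Then clear the running task instances, which kills them
airflow clear my_dag_id -s 2016-11-01 -e 2016-11-01
```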
I vote for this feature! Preferably a polite and a NOW option.
Thanks,
Ben
--
Ben Tallman - 503.680.5709
On Tue, Nov 1, 2016 at 9:08 AM, Vishal Doshi wrote:
> I haven’t been able to find anything on this in the code / docs. Is there
> a supported way to kill a DAG (and its still running tasks)?
>
Boris -
The pull request includes an airflow.cfg config entry to set backfill=False
by default:

[scheduler]
backfill_by_default = (true|false)
Thanks,
Ben
*--*
*ben tallman* | *apigee*
The goal of the pickling was to
distribute the dag to distributed workers, not freeze it in time. I think
that storing the pickled dag in the dagrun could probably solve this, but
it is a major issue/change. It is one that I am beginning to work on for us
though.
Thanks,
Ben
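A minimal sketch of the idea, with a plain dict standing in for a DAG definition and SQLite standing in for the metadata DB (all names here are illustrative, not Airflow's actual schema): freezing the definition with the run means a worker later restores exactly the version that was current at trigger time.

```python
import pickle
import sqlite3

# Hypothetical stand-in for a DAG definition at trigger time
dag = {"dag_id": "example", "tasks": ["extract", "load"]}

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dag_run (id INTEGER PRIMARY KEY, pickled_dag BLOB)")

# Freeze the definition by storing its pickle alongside the run row
conn.execute("INSERT INTO dag_run (pickled_dag) VALUES (?)", (pickle.dumps(dag),))

# Later, a worker loads the frozen copy instead of re-parsing the DAG file
blob = conn.execute("SELECT pickled_dag FROM dag_run").fetchone()[0]
restored = pickle.loads(blob)
assert restored == dag
```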
https://github.com/apache/incubator-airflow/pull/1830
For instance:

dag = DAG(
    "test_dag_id_here",
    backfill=False,
    ...
)
Thanks,
Ben
To that end, staying on top of PRs is a huge
commitment, as well as a sign of health.
Thanks,
Ben
ure
of dag processing conflicting with sqlite?
Thanks,
Ben
Has anyone looked (or Is anyone looking) at surfacing the new scheduler
process logs? By default, they go in /tmp/airflow/scheduler/logs, but we
are going to put them into our regular tree...
Thanks,
Ben
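For reference, a sketch of the kind of entry meant here, assuming the 1.8-era config key for the scheduler's per-DAG processor logs (verify the key name against your version's default airflow.cfg; the path is illustrative):

```ini
[scheduler]
# Assumed key name; default was under /tmp in early versions
child_process_log_directory = /var/log/airflow/scheduler
```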
I will be there...
Thanks,
Ben
Thanks,
Ben
re-factor. Furthermore, I believe that the definition of a
Directed Acyclic Graph is that it is re-evaluated during runtime and that
the path is not determined until runtime.
Thanks,
Ben
Sorry, no video. I will also be speaking to the Data Warehouse team at
Optimizely on the 21st about our process, and really digging in to the
Airflow stuff.
Thanks,
Ben
I did the cleanup... If I see more, I'll send you the user_id.
Thanks,
Ben
Is there a limit on who gets to edit/add pages? Seems like someone is
putting TV listings into the pages...
Thanks,
Ben
Customer Reference: Ben Tallman, an Apigeek at Apigee, will discuss
Apigee's Data Warehouse and Business Performance Management, specifically
how leveraging Apigee Edge, RDS Postgres, Apache Airflow and Periscope Data
has allowed them to "free the data" and become a more agile and data-driven
Happy to take the extra time to facilitate a talk about "features" and
"issues"...
On Tue, Aug 16, 2016 at 1:28 PM -0700, "siddharth anand"
wrote:
Great!
I just tweeted it via our ApacheAirflow twitter account!
-s
On Tue, Aug 16, 2016 at 11:26 AM, Jeff Balo
Happy to give a talk about Apigee's use case as well...
Thanks,
Ben
The doc section on Celery essentially points to the Celery site for config;
however, the celery_result_backend setting seems to be skipped, as a
result...
Any insights?
Thanks,
Ben
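For context, a sketch of the kind of entry being asked about, assuming the 1.7-era config layout (the backend URL is purely illustrative; any result backend Celery accepts works):

```ini
[celery]
# Illustrative value; point this at your own result store
celery_result_backend = db+postgresql://airflow:airflow@localhost/airflow
```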
Here is the error... Any thoughts?
  File "/usr/local/lib/python2.7/dist-packages/airflow/models.py", line 3107, in get_val
    "Can't decrypt _val, configuration is missing")
AirflowException: Can't decrypt _val, configuration is missing
The configuration is there, and must be, or it wouldn't be c
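For context, a hedged sketch: this error usually points at the fernet_key setting in airflow.cfg (key name per 1.7-era configs; the value below is a placeholder, not a working key):

```ini
[core]
# Placeholder; generate a real key, e.g. with cryptography's Fernet.generate_key()
fernet_key = <your-generated-fernet-key>
```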
Already there, twice, but one with a pull request...
https://github.com/apache/incubator-airflow/pull/1601
On Tue, Jun 21, 2016 at 4:53 PM Ben Tallman wrote:
> Maxime -
>
> Wish I did have time... BUT, I can say that SLA timeouts will fail (and
> error) on any dag with schedule=
>
> Thanks,
>
> Max
>
> On Tue, Jun 21, 2016 at 9:15 AM, Ben Tallman wrote:
>
> > We have seen this too. Running 1.7.0 with Celery, neither DAG timeout nor
> > individual task sla's seem to be honored. In truth, we haven't done a lot
> >
We have seen this too. Running 1.7.0 with Celery, neither the DAG timeout nor
individual task SLAs seem to be honored. In truth, we haven't done a lot
of testing, as it is more important that we get our overall ETL migrated
with workarounds.
However, we will be digging in at some point for greater cl
We are running 1.7.0 and it's time to look at upgrading...
Do we have best practices? Should we be updating all the time (this seems
dangerous)? How often? Major changes? Things to watch out for (other than
testing)?
Thanks,
Ben
In the past, I have written/seen systems where the pattern is that a task
runner/worker is in charge of scheduling the next tasks that
need to run on completion of a task, and the Scheduler only handles "issues"
and initial kickoffs...
Opens another can of worms, but I think I've seen d
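The pattern above can be sketched in a few lines (task names and the dependency map are illustrative): the central scheduler only seeds the initial task, and each completing task enqueues its own successors.

```python
from collections import deque

# Hypothetical dependency map: task -> downstream tasks
downstream = {"extract": ["transform"], "transform": ["load"], "load": []}

queue = deque(["extract"])  # the scheduler only handles the initial kickoff
completed = []

while queue:
    task = queue.popleft()
    completed.append(task)           # "run" the task
    queue.extend(downstream[task])   # the worker schedules its successors itself

assert completed == ["extract", "transform", "load"]
```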