More ideas:
- An "airflow" plugin at the moment is more of an extension: operators,
hooks, macros. Consider an additional plugin API, plus a default
implementation, for code inside Airflow that has a cross-cutting concern,
like:
* We start to use Datadog for heavier monitoring of what's going on
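No such cross-cutting plugin API exists in Airflow today; a minimal sketch of what a listener-style hook for scheduler events could look like (every name here — SchedulerEventListener, DatadogListener, register_listener, emit — is hypothetical, not a current Airflow interface):

```python
# Hypothetical sketch of a cross-cutting plugin API for scheduler events.
# None of these names exist in Airflow; this only illustrates the idea.

class SchedulerEventListener:
    """Base class a plugin would subclass to observe scheduler events."""

    def on_task_state_change(self, dag_id, task_id, state):
        pass


class DatadogListener(SchedulerEventListener):
    """Example cross-cutting concern: forward task states to a metrics backend."""

    def __init__(self):
        self.emitted = []  # stand-in for a real Datadog/statsd client

    def on_task_state_change(self, dag_id, task_id, state):
        # A real implementation would emit a metric here instead of appending.
        self.emitted.append((dag_id, task_id, state))


_listeners = []


def register_listener(listener):
    _listeners.append(listener)


def emit(dag_id, task_id, state):
    # The scheduler would call this at every task state transition.
    for listener in _listeners:
        listener.on_task_state_change(dag_id, task_id, state)


listener = DatadogListener()
register_listener(listener)
emit("example_dag", "extract", "success")
```

The point of the sketch is that monitoring code lives entirely in the plugin; the scheduler only needs the single `emit` call sites.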
I just added a bit of information about dynamic DAG creation here:
https://github.com/apache/incubator-airflow/pull/1889/files#diff-c6f0a0722c6a2f86277535d7bcec7f8cR162
Let me know if it helps.
Max
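For reference, the pattern the linked docs describe is to build DAG objects in a loop at module top level and publish them into the module's globals() so the DagBag scanner can discover them. A self-contained sketch, using a stub `DAG` class so it runs outside Airflow (in a real DAG file you would use `airflow.DAG` instead; the source names are made up):

```python
# Sketch of dynamic DAG creation. `DAG` here is a stub standing in for
# airflow.DAG so the example is self-contained. The globals() assignment is
# the crucial part: Airflow only discovers DAG objects bound at module level.

class DAG:  # stand-in for airflow.DAG
    def __init__(self, dag_id):
        self.dag_id = dag_id


created = {}
for source in ["orders", "clicks", "invoices"]:  # e.g. driven by a config file
    dag_id = "ingest_{}".format(source)
    dag = DAG(dag_id)
    globals()[dag_id] = dag  # expose at module level for the DagBag scanner
    created[dag_id] = dag
```

Note this only covers DAGs generated when the file is parsed; DAGs that exist nowhere on disk are a different problem.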
On Mon, Nov 21, 2016 at 2:58 AM, Deepak Kumar Malladi wrote:
> Hi,
>
> I want to dynamically create a DAG at run time.
1) The restart should not be needed, but if folks are reporting it, I'm
curious what the problem might be. If you are running on master, then you
may not be aware of the min_file_process_interval setting.
[scheduler]
min_file_process_interval = 0
max_threads = 4
2) Yes, security is not there.
I am still deciding between Airflow and Oozie for our brand-new Hadoop
project, but here are a few things that I did not like during my limited
testing:
1) pain with scheduler/webserver restarts - things magically begin working
after a restart, or disappear (like DAG tasks that are no longer part of the DAG)
Also, a survey will be a little less noisy and easier to summarize than +1s
in this email thread.
-s (Sid)
On Mon, Nov 21, 2016 at 2:25 PM, siddharth anand wrote:
> Sergei,
> These are some great ideas -- I would classify at least half of them as
> pain points.
>
> Folks!
> I suggest people (on the dev list) keep feeding this thread at least for
> the next 2 days.
Sergei,
These are some great ideas -- I would classify at least half of them as
pain points.
Folks!
I suggest people (on the dev list) keep feeding this thread at least for
the next 2 days. I can then float a survey based on these ideas and give
the community a chance to vote so we can prioritize.
+1 on driving everything through a REST API, including the UI. This unifies
access to the scheduler and increases stability.
Consider running a very small webserver (node.js + socket.io), which
enables Airflow to communicate scheduler events as they happen
to anything that connects to it.
> Add an FK to dag_run in the task_instance table on Postgres so that
> task_instances can be uniquely attributed to DAG runs.
> Ensure scheduler can be run continuously without needing restarts.
> Ensure scheduler can handle tens of thousands of active workflows
+1
We are planning to run around 40
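The FK idea quoted above can be illustrated with a toy schema (sqlite is used here purely to keep the example self-contained; the proposal targets Postgres, and the real Airflow tables have many more columns):

```python
import sqlite3

# Toy version of the proposed constraint: every task_instance row must point
# at an existing dag_run row, so task instances attribute uniquely to a run.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # sqlite needs FKs enabled per-connection
conn.execute("CREATE TABLE dag_run (id INTEGER PRIMARY KEY, dag_id TEXT)")
conn.execute("""
    CREATE TABLE task_instance (
        id INTEGER PRIMARY KEY,
        task_id TEXT,
        dag_run_id INTEGER NOT NULL REFERENCES dag_run(id)
    )
""")

conn.execute("INSERT INTO dag_run (id, dag_id) VALUES (1, 'etl')")
conn.execute(
    "INSERT INTO task_instance (task_id, dag_run_id) VALUES ('extract', 1)"
)

# An orphan task_instance (no matching dag_run) is now rejected:
try:
    conn.execute(
        "INSERT INTO task_instance (task_id, dag_run_id) VALUES ('load', 99)"
    )
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```

With the constraint in place, "which run does this task instance belong to" becomes a join rather than a heuristic on execution_date.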
> Ensure scheduler can be run continuously without needing restarts
+1
On Mon, Nov 21, 2016 at 5:25 AM, David Batista wrote:
> A small request, which might be handy.
>
> Having the possibility to select multiple tasks and mark them as
> Success/Clear/etc.
>
> Allow the UI to select individual tasks (i.e., inside the Tree View) and
> then have a button to mark them as Success/Clear/etc.
Hi,
I want to dynamically create a DAG at run time. I tried the snippet given
in the documentation, but it didn't work for me.
Any pointer on how to trigger DAGs which aren't actually present in the DAG
folder but are created through code execution (dynamically created)?
Thanks & Regards,
Deepak
A small request, which might be handy.
Having the possibility to select multiple tasks and mark them as
Success/Clear/etc.
Allow the UI to select individual tasks (i.e., inside the Tree View) and
then have a button to mark them as Success/Clear/etc.
On 21 November 2016 at 14:22, Sergei Iakhnin wrote:
I've been running Airflow on 1500 cores in the context of scientific
workflows for the past year and a half. Features that would be important to
me for 2.0:
- Add an FK to dag_run in the task_instance table on Postgres so that
task_instances can be uniquely attributed to DAG runs.
- Ensure scheduler can be run continuously without needing restarts.
- Ensure scheduler can handle tens of thousands of active workflows.
-1. We rely heavily on data profiling as a pipeline health-monitoring tool.
-Original Message-
From: Chris Riccomini [mailto:criccom...@apache.org]
Sent: Saturday, November 19, 2016 1:57 AM
To: dev@airflow.incubator.apache.org
Subject: Re: Airflow 2.0
> RIP out the charting application
Hi,
Just as we have an admin panel where we can configure database connections
and query them, some health information should be provided based on the
executor backend chosen.
For example, with Airflow + RabbitMQ + Celery, if RabbitMQ goes down, it
keeps on showing the message that the task ha
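Until the UI surfaces backend health, a crude check along these lines can at least tell you whether the broker is reachable (the host and port below are assumptions for a default local RabbitMQ, and this only tests TCP connectivity, not AMQP-level health):

```python
import socket


def broker_reachable(host, port, timeout=1.0):
    """Return True if a TCP connection to the broker succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Default RabbitMQ port is 5672; adjust for your deployment, e.g.:
# broker_reachable("localhost", 5672)
```

A scheduler or webserver health endpoint could run a check like this and surface "broker unreachable" instead of leaving tasks silently stuck in a queued state.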