Chris,
I am running SequentialExecutor.
Thanks.
Jason
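A note on why the executor matters here: SequentialExecutor runs only one task instance at a time, which would explain the blocking described in observation (b) of the quoted message. The sketch below is a toy model of a single-slot executor, not Airflow code. The function name run_single_slot and the task durations are made up for illustration.

```python
# Toy model only: this is NOT Airflow code. It assumes the executor
# runs exactly one task at a time, as SequentialExecutor does.
from collections import deque


def run_single_slot(queue):
    """Run (name, duration_minutes) tasks one after another.

    Returns a list of (name, start_minute, end_minute) tuples.
    """
    clock = 0
    timeline = []
    pending = deque(queue)
    while pending:
        name, duration = pending.popleft()
        start = clock
        clock += duration
        timeline.append((name, start, clock))
    return timeline


# Hypothetical durations: dag1.task1 takes 20 minutes, dag2.taskA takes 1.
timeline = run_single_slot(
    [("dag1.task1", 20), ("dag2.taskA", 1), ("dag1.task2", 2)]
)
for name, start, end in timeline:
    print(f"{name}: minute {start} -> {end}")
```

In this model taskA cannot start until task1 has released the only slot, so it lands in the "gap" between task1 and task2 no matter how max_active_runs is set.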
On Tue, May 31, 2016 at 1:36 PM, Chris Riccomini <criccom...@apache.org> wrote:
> Hey Jason,
>
> Are you running the SequentialExecutor? This is the default out-of-the-box
> executor.
>
> Cheers,
> Chris
>
> On Tue, May 31, 2016 at 12:59 PM, Jason Chen <chingchien.c...@gmail.com>
> wrote:
>
>> Hi Chris,
>>
>> I made the changes and tried it out.
>> It does not seem to work as expected.
>> When a dag is running (a particular task inside that dag is taking time),
>> another task from another dag seems "blocked".
>>
>> My setting:
>> (1) airflow.cfg
>> max_active_runs_per_dag = 16
>> parallelism = 32
>> dag_concurrency = 16
>>
>> (2) A dag (dag1) python file is partially as below. Please note that
>> inside this DAG, the first task (task1) is a long running task.
>>
>> dag1 = DAG('dag1', schedule_interval=timedelta(minutes=15),
>>            max_active_runs=1, default_args=args)
>>
>> Then, the tasks run in the order...
>> task1 (long running) --> task2 --> task3
>> ...
>> (3) Another dag (dag2) python file is partially as below.
>> dag2 = DAG('dag2', schedule_interval=timedelta(minutes=3),
>>            max_active_runs=1, default_args=args)
>> ...
>> Then, the tasks run in the order...
>> taskA (short running task) --> taskB
>>
>> (4) Inside the upstart script file, this is the main part of how I start
>> the airflow scheduler:
>>
>> env SCHEDULER_RUNS=0
>> export SCHEDULER_RUNS
>>
>> script
>> exec >> ${AIRFLOW_HOME}/scheduler-log/airflow-scheduler.log 2>&1
>> exec usr/local/bin/airflow scheduler -n ${SCHEDULER_RUNS}
>> end script
>>
>> =========================
>>
>> What I observed is that
>> (a) task1 (of dag1) runs for about 20 mins and during its running
>> time, no other dag1 run is triggered. This is as expected.
>>
>> (b) taskA (of dag2) should be triggered to run every 3 mins. However, it
>> is NOT triggered if task1 of dag1 is running.
>> taskA seems to be queued/blocked and not run. It is executed after task1
>> (of dag1) is done.
>> So, it looks like it is dispatched into a "gap" between
>> task1 and task2 (of dag1). This does not look normal, as taskA
>> (of dag2) is expected to run no matter what happens to another dag (dag1).
>>
>>
>> Any suggestions?
>> Thanks.
>> Jason
>>
>>
>> On Tue, May 31, 2016 at 9:02 AM, Chris Riccomini <criccom...@apache.org>
>> wrote:
>>
>>> Hey Jason,
>>>
>>> The problem is max_active_runs_per_dag=1. Set it back to 16. You just
>>> need max_active_runs=1 for the individual DAGs. This will allow multiple
>>> (different) DAGs to run in parallel, but only one DAG of each type can
>>> run at the same time.
>>>
>>> Cheers,
>>> Chris
>>>
>>> On Fri, May 27, 2016 at 11:42 PM, Jason Chen <chingchien.c...@gmail.com>
>>> wrote:
>>>
>>> > Hi Chris,
>>> > Thanks for your reply. After setting it up, I observed how it works
>>> > for a couple of days.
>>> >
>>> > I tried to set max_active_runs=1 in the DAG
>>> > dag = DAG(...max_active_runs=1...) and it worked fine to avoid two
>>> > runs at the same time.
>>> > However, I noticed other dags (not the dag that is running) are also
>>> > "paused".
>>> > My understanding is that "max_active_runs" is basically
>>> > "max_active_runs_per_dag".
>>> > So, why can't another dag (different dag name) run at the same time
>>> > as the first dag?
>>> > I want the two dags to be able to run at the same time, with only
>>> > one run per dag inside each dag.
>>> > Thanks.
>>> >
>>> > Jason
>>> >
>>> > My other settings in airflow.cfg
>>> >
>>> > max_active_runs_per_dag=1
>>> > parallelism = 32
>>> > dag_concurrency = 16
>>> >
>>> >
>>> >
>>> > On Mon, May 16, 2016 at 8:57 PM, Chris Riccomini <criccom...@apache.org>
>>> > wrote:
>>> >
>>> > > Hey Jason,
>>> > >
>>> > > For (2), by default, task1 will start running again. You'll have two
>>> > > runs going at the same time. If you want to prevent this, you can set
>>> > > max_active_runs to 1 in your DAG.
>>> > >
>>> > > Cheers,
>>> > > Chris
>>> > >
>>> > > On Mon, May 16, 2016 at 1:09 PM, Jason Chen <chingchien.c...@gmail.com>
>>> > > wrote:
>>> > >
>>> > > > I have two questions
>>> > > >
>>> > > > (1) For the airflow UI: "Tree view", it lists the tasks along with
>>> > > > the time highlighted at the top (say, 08:30; 09:00, etc). What's the
>>> > > > meaning of the time? It does not look like the UTC time at which the
>>> > > > task was running. I know that overall, airflow uses UTC time.
>>> > > > (2) I have a DAG with two tasks: task1 --> task2
>>> > > > Task1 runs hourly and could sometimes take longer than one hour
>>> > > > to run.
>>> > > > In such a setup, task1 will be triggered hourly, and what happens
>>> > > > if the previous task1 is still running? Will the "new" task1 be
>>> > > > queued?
>>> > > >
>>> > > > Thanks.
>>> > > > Jason
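For readers following the max_active_runs discussion in the thread: the idea that the limit applies to each DAG separately, so that a running dag1 does not stop dag2 from starting a run, can be sketched as below. This is a toy model under that assumption, not Airflow's scheduler logic. The helper can_start_run and the Counter of active runs are hypothetical names used only for illustration.

```python
# Toy model only, not Airflow's scheduler logic. It assumes the limit
# is counted separately for each dag_id.
from collections import Counter


def can_start_run(dag_id, active_runs, max_active_runs=1):
    """Return True if dag_id has fewer active runs than its own limit."""
    return active_runs[dag_id] < max_active_runs


# One dag1 run is still in progress; dag2 has no active run.
active = Counter({"dag1": 1})

print(can_start_run("dag1", active))  # dag1 is already at its limit
print(can_start_run("dag2", active))  # dag2 is counted independently
```

Under this model a second dag1 run has to wait, while dag2 is free to start. Whether its tasks then actually run in parallel still depends on the executor in use.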