Repository: incubator-airflow
Updated Branches:
  refs/heads/master c6681681d -> b81bd08a3


[AIRFLOW-2538] Update faq doc on how to reduce airflow scheduler latency

Make sure you have checked _all_ steps below.

### JIRA
- [x] My PR addresses the following
[Airflow JIRA](https://issues.apache.org/jira/browse/AIRFLOW/)
issues and references them in the PR title. For
example, "\[AIRFLOW-XXX\] My Airflow PR"
    - https://issues.apache.org/jira/browse/AIRFLOW-2538
    - In case you are fixing a typo in the
documentation you can prepend your commit with
\[AIRFLOW-XXX\], code changes always need a JIRA
issue.

### Description
- [x] Here are some details about my PR, including
screenshots of any UI changes:
Update the FAQ doc on how to reduce Airflow
scheduler latency. The advice comes from our internal
production settings, which also align with Maxime's
email
(https://lists.apache.org/thread.html/%3CCAHEEp7WFAivyMJZ0N+0Zd1T3nvfyCJRudL3XSRLM4utSigR3dQmail.gmail.com%3E).

### Tests
- [ ] My PR adds the following unit tests __OR__
does not need testing for this extremely good
reason:

### Commits
- [ ] My commits all reference JIRA issues in
their subject lines, and I have squashed multiple
commits if they address the same issue. In
addition, my commits follow the guidelines from
"[How to write a good git commit
message](http://chris.beams.io/posts/git-commit/)":
    1. Subject is separated from body by a blank line
    2. Subject is limited to 50 characters
    3. Subject does not end with a period
    4. Subject uses the imperative mood ("add", not
"adding")
    5. Body wraps at 72 characters
    6. Body explains "what" and "why", not "how"

### Documentation
- [ ] In case of new functionality, my PR adds
documentation that describes how to use it.
    - When adding new operators/hooks/sensors, the
autoclass documentation generation needs to be
added.

### Code Quality
- [ ] Passes `git diff upstream/master -u -- "*.py" | flake8 --diff`

Closes #3434 from feng-tao/update_faq


Project: http://git-wip-us.apache.org/repos/asf/incubator-airflow/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-airflow/commit/b81bd08a
Tree: http://git-wip-us.apache.org/repos/asf/incubator-airflow/tree/b81bd08a
Diff: http://git-wip-us.apache.org/repos/asf/incubator-airflow/diff/b81bd08a

Branch: refs/heads/master
Commit: b81bd08a334efa5242af705743519be43346295e
Parents: c668168
Author: Tao feng <tf...@lyft.com>
Authored: Thu May 31 22:01:59 2018 -0700
Committer: Maxime Beauchemin <maximebeauche...@gmail.com>
Committed: Thu May 31 22:01:59 2018 -0700

----------------------------------------------------------------------
 docs/faq.rst | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/incubator-airflow/blob/b81bd08a/docs/faq.rst
----------------------------------------------------------------------
diff --git a/docs/faq.rst b/docs/faq.rst
index d2c6188..33b4d5a 100644
--- a/docs/faq.rst
+++ b/docs/faq.rst
@@ -162,10 +162,18 @@ How can we reduce the airflow UI page load time?
 
 If your dag takes long time to load, you could reduce the value of ``default_dag_run_display_number`` configuration in ``airflow.cfg`` to a smaller value. This configurable controls the number of dag run to show in UI with default value 25.
 
+
 How to fix Exception: Global variable explicit_defaults_for_timestamp needs to be on (1)?
----------------------------------------------------------------------------------------------
+-----------------------------------------------------------------------------------------
 
 This means ``explicit_defaults_for_timestamp`` is disabled in your mysql server and you need to enable it by:
 
 #. Set ``explicit_defaults_for_timestamp = 1`` under the mysqld section in your my.cnf file.
 #. Restart the Mysql server.
+
+
+How to reduce airflow dag scheduling latency in production?
+-----------------------------------------------------------
+
+- ``max_threads``: Scheduler will spawn multiple threads in parallel to schedule dags. This is controlled by ``max_threads`` with default value of 2. User should increase this value to a larger value(e.g numbers of cpus where scheduler runs - 1) in production.
+- ``scheduler_heartbeat_sec``: User should consider to increase ``scheduler_heartbeat_sec`` config to a higher value(e.g 60 secs) which controls how frequent the airflow scheduler gets the heartbeat and updates the job's entry in database.
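
The two settings added in the diff above can be turned into a concrete
``airflow.cfg`` fragment. The following is only a minimal sketch, not part of
the commit: the helper names ``suggested_scheduler_settings`` and
``render_scheduler_section`` are hypothetical, the floor of 1 thread on a
single-CPU host is an assumption, and it assumes both options live under the
``[scheduler]`` section of ``airflow.cfg``. The values follow the examples
given in the FAQ entry (CPU count minus one, and 60 seconds).

```python
import configparser
import io
import os


def suggested_scheduler_settings(cpu_count=None, heartbeat_sec=60):
    """Return the two scheduler values the FAQ entry suggests tuning.

    Hypothetical helper: cpu_count defaults to the CPUs of the current host,
    which is only meaningful when run on the machine hosting the scheduler.
    """
    cpus = cpu_count if cpu_count is not None else (os.cpu_count() or 1)
    # FAQ example: number of CPUs where the scheduler runs, minus one.
    # Assumption: never go below 1 thread on a single-CPU host.
    max_threads = max(1, cpus - 1)
    return {
        "max_threads": str(max_threads),
        "scheduler_heartbeat_sec": str(heartbeat_sec),
    }


def render_scheduler_section(settings):
    """Render the settings as an INI fragment for the [scheduler] section."""
    parser = configparser.ConfigParser()
    parser["scheduler"] = settings
    buf = io.StringIO()
    parser.write(buf)
    return buf.getvalue()


if __name__ == "__main__":
    print(render_scheduler_section(suggested_scheduler_settings()))
```

The rendered fragment is meant to be reviewed and merged by hand into an
existing ``airflow.cfg``; suitable values depend on the workload and on the
load the metadata database can absorb.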
