[ https://issues.apache.org/jira/browse/AIRFLOW-3058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16710064#comment-16710064 ]
Ash Berlin-Taylor commented on AIRFLOW-3058: -------------------------------------------- Was this fixed by disabling buffered IO from your python script? (for anyone else finding this ticket in the future) > Airflow log & multi-threading > ----------------------------- > > Key: AIRFLOW-3058 > URL: https://issues.apache.org/jira/browse/AIRFLOW-3058 > Project: Apache Airflow > Issue Type: Task > Reporter: jack > Priority: Major > Attachments: 456.PNG, Sni.PNG > > > The airflow log does not show messages in real time when executing scripts > with Multi-threading. > > for example: > > The left is the Airflow log time. the right is the actual time of the print > in my code. If I would execute the script without airflow the console will > show the times on the right. > !Sni.PNG! > {code:java} > 2018-09-13 14:19:17,325] {base_task_runner.py:98} INFO - Subtask: [2018-09-13 > 14:19:17,289] {bash_operator.py:101} INFO - 2018-09-13 14:14:55.230044 > Thread: Thread-1 Generate page: #0 run #0 with URL: > http://...&cultureid=2&offset=0&limit=1000 > [2018-09-13 14:19:17,325] {base_task_runner.py:98} INFO - Subtask: > [2018-09-13 14:19:17,289] {bash_operator.py:101} INFO - 2018-09-13 > 14:14:55.231635 Thread: Thread-2 Generate page: #1 run #0 with URL: > http://...&cultureid=2&offset=1000&limit=1000 > [2018-09-13 14:19:17,326] {base_task_runner.py:98} INFO - Subtask: > [2018-09-13 14:19:17,289] {bash_operator.py:101} INFO - 2018-09-13 > 14:14:55.233226 Thread: Thread-3 Generate page: #2 run #0 with URL: > http://...&cultureid=2&offset=2000&limit=1000 > [2018-09-13 14:19:17,326] {base_task_runner.py:98} INFO - Subtask: > [2018-09-13 14:19:17,289] {bash_operator.py:101} INFO - 2018-09-13 > 14:14:55.234020 Thread: Thread-4 Generate page: #3 run #0 with URL: > http://...&cultureid=2&offset=3000&limit=1000 > [2018-09-13 14:19:17,326] {base_task_runner.py:98} INFO - Subtask: > [2018-09-13 14:19:17,290] {bash_operator.py:101} INFO - 2018-09-13 > 14:15:43.100122 Thread: Thread-1 page 0 finished. Length is 1000 > [2018-09-13 14:19:17,326] {base_task_runner.py:98} INFO - Subtask: > [2018-09-13 14:19:17,290] {bash_operator.py:101} INFO - 2018-09-13 > 14:15:43.100877 Thread: Thread-1 Generate page: #4 run #0 with URL: > http://...&cultureid=2&offset=4000&limit=1000 > [2018-09-13 14:19:17,326] {base_task_runner.py:98} INFO - Subtask: > [2018-09-13 14:19:17,290] {bash_operator.py:101} INFO - 2018-09-13 > 14:15:46.254536 Thread: Thread-3 page 2 finished. Length is 1000 > [2018-09-13 14:19:17,327] {base_task_runner.py:98} INFO - Subtask: > [2018-09-13 14:19:17,290] {bash_operator.py:101} INFO - 2018-09-13 > 14:15:46.255508 Thread: Thread-3 Generate page: #5 run #0 with URL: > http://...&cultureid=2&offset=5000&limit=1000 > [2018-09-13 14:19:17,327] {base_task_runner.py:98} INFO - Subtask: > [2018-09-13 14:19:17,290] {bash_operator.py:101} INFO - 2018-09-13 > 14:15:51.096360 Thread: Thread-2 page 1 finished. Length is 1000 > [2018-09-13 14:19:17,327] {base_task_runner.py:98} INFO - Subtask: > [2018-09-13 14:19:17,290] {bash_operator.py:101} INFO - 2018-09-13 > 14:15:51.097269 Thread: Thread-2 Generate page: #6 run #0 with URL: > http://...&cultureid=2&offset=6000&limit=1000 > [2018-09-13 14:19:17,327] {base_task_runner.py:98} INFO - Subtask: > [2018-09-13 14:19:17,290] {bash_operator.py:101} INFO - 2018-09-13 > 14:15:53.112621 Thread: Thread-4 page 3 finished. Length is 1000 > [2018-09-13 14:19:17,327] {base_task_runner.py:98} INFO - Subtask: > [2018-09-13 14:19:17,290] {bash_operator.py:101} INFO - 2018-09-13 > 14:15:53.113455 Thread: Thread-4 Generate page: #7 run #0 with URL: > http://...&cultureid=2&offset=7000&limit=1000 > [2018-09-13 14:19:17,327] {base_task_runner.py:98} INFO - Subtask: > [2018-09-13 14:19:17,290] {bash_operator.py:101} INFO - 2018-09-13 > 14:16:37.345343 Thread: Thread-3 Generate page: #8 run #0 with URL: > http://...&cultureid=2&offset=8000&limit=1000 > [2018-09-13 14:19:17,328] {base_task_runner.py:98} INFO - Subtask: > [2018-09-13 14:19:17,290] {bash_operator.py:101} INFO - 2018-09-13 > 14:16:37.701201 Thread: Thread-2 Generate page: #9 run #0 with URL: > http://...&cultureid=2&offset=9000&limit=1000 > [2018-09-13 14:19:17,328] {base_task_runner.py:98} INFO - Subtask: > [2018-09-13 14:19:17,291] {bash_operator.py:101} INFO - 2018-09-13 > 14:16:47.283796 Thread: Thread-1 page 4 finished. Length is 1000 > [2018-09-13 14:19:17,328] {base_task_runner.py:98} INFO - Subtask: > [2018-09-13 14:19:17,291] {bash_operator.py:101} INFO - 2018-09-13 > 14:17:27.169359 Thread: Thread-2 page 9 finished. Length is 1000 > > {code} > This never happens when executing regular code.. Happens only with > multi-threading. I have some other scripts that the airflow print appears > after more than 30 minutes. > > Check this one: > hours of delay and then printing everything together. These are not real > time. the prints in the log has no correlation to the actual time the command > was executed. > > !456.PNG! -- This message was sent by Atlassian JIRA (v7.6.3#76005)