zikun opened a new issue #9626:
URL: https://github.com/apache/airflow/issues/9626


   
   **Apache Airflow version**: 1.10.10
   
   **Environment**: 
   Official Docker image
   Python 3.7
   CeleryExecutor with worker_autoscale = 512,16
   Scheduler, worker and webserver running in three different Docker containers.
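For reference, the autoscale setting above corresponds to this `airflow.cfg` fragment (a sketch showing only the relevant key; the `[celery]` section name is the standard location for it in 1.10):

```ini
[celery]
# Autoscale the Celery worker between a maximum of 512 and a minimum of 16
# concurrent task processes (format: max_concurrency,min_concurrency).
worker_autoscale = 512,16
```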
   
   - **Cloud provider or hardware configuration**:
   AWS EC2
   - **OS** (e.g. from /etc/os-release):
   PRETTY_NAME="Debian GNU/Linux 10 (buster)"
   NAME="Debian GNU/Linux"
   VERSION_ID="10"
   VERSION="10 (buster)"
   VERSION_CODENAME=buster
   ID=debian
   HOME_URL="https://www.debian.org/"
   SUPPORT_URL="https://www.debian.org/support"
   BUG_REPORT_URL="https://bugs.debian.org/"
   
   - **Kernel** (e.g. `uname -a`):
   Linux dumm.hostname.com 3.10.0-957.1.3.el7.x86_64 #1 SMP Thu Nov 29 14:49:43 UTC 2018 x86_64 GNU/Linux
   - **Install tools**:
   git
   procps
   OpenJDK JRE 11 headless
   - **Others**:
   pip packages
   jaydebeapi==1.2.0
   docker==4.2.1
   
   **What happened**:
   
   There are two problems, and they always seem to appear together:
   * The logs of running tasks are inconsistent: earlier lines disappear, i.e. the log file is overwritten while the task is still running.
   * The log contains `FAILED: Task is in the 'running' state which is not a valid state for execution. The task must be cleared in order to be run.` messages, yet the task eventually succeeds.
   
   Below are some snapshots from a `JDBCOperator` task log:
   
   <details>
   <summary>Task Log</summary>
   
   ## First snapshot

   ```
   airflow@xxx:/opt/airflow$ cat logs/dag_id/task_id/2020-07-01T22:07:00+00:00/4.log
   [2020-07-02 05:59:24,226] {taskinstance.py:669} INFO - Dependencies all met for <TaskInstance: dag_id.task_id 2020-07-01T22:07:00+00:00 [queued]>
   [2020-07-02 05:59:24,248] {taskinstance.py:669} INFO - Dependencies all met for <TaskInstance: dag_id.task_id 2020-07-01T22:07:00+00:00 [queued]>
   [2020-07-02 05:59:24,248] {taskinstance.py:879} INFO -
   --------------------------------------------------------------------------------
   [2020-07-02 05:59:24,248] {taskinstance.py:880} INFO - Starting attempt 4 of 5
   [2020-07-02 05:59:24,248] {taskinstance.py:881} INFO -
   --------------------------------------------------------------------------------
   [2020-07-02 05:59:24,267] {taskinstance.py:900} INFO - Executing <Task(JdbcOperator): task_id> on 2020-07-01T22:07:00+00:00
   [2020-07-02 05:59:24,270] {standard_task_runner.py:53} INFO - Started process 16712 to run task
   [2020-07-02 05:59:24,358] {logging_mixin.py:112} INFO - Running %s on host %s <TaskInstance: dag_id.task_id 2020-07-01T22:07:00+00:00 [running]> xxx
   [2020-07-02 05:59:24,410] {jdbc_operator.py:61} INFO - Executing: ['CREATE OR REPLACE TABLE TEST.T1 AS (SELECT * FROM  TEST.DUMMY WHERE ID <= 10;', 'SELECT * FROM TEST.T1 ORDER BY DBTIMESTAMP DESC limit 10;', 'CREATE OR REPLACE TABLE TEST.T2 AS (SELECT * FROM TEST.DUMMY WHERE ID > 10;', 'SELECT * FROM TEST.T2 limit 10;']
   [2020-07-02 05:59:24,415] {logging_mixin.py:112} INFO - [2020-07-02 05:59:24,415] {base_hook.py:87} INFO - Using connection to: id: exasol_prod_rw. Host: jdbc:exa:172.18.229.31..47:8563;schema=EXDDATA, Port: None, Schema: , Login: exddata, Password: XXXXXXXX, extra: XXXXXXXX
   [2020-07-02 05:59:25,396] {logging_mixin.py:112} INFO - [2020-07-02 05:59:25,396] {dbapi_hook.py:174} INFO - CREATE OR REPLACE TABLE TEST.T1 AS (SELECT * FROM  TEST.DUMMY WHERE ID <= 10;
   ```
   
   
   ## Second snapshot

   ```
   airflow@xxx:/opt/airflow$ cat logs/dag_id/task_id/2020-07-01T22:07:00+00:00/4.log
   [2020-07-02 06:00:55,465] {taskinstance.py:663} INFO - Dependencies not met for <TaskInstance: dag_id.task_id 2020-07-01T22:07:00+00:00 [running]>, dependency 'Task Instance State' FAILED: Task is in the 'running' state which is not a valid state for execution. The task must be cleared in order to be run.
   [2020-07-02 06:00:55,468] {logging_mixin.py:112} INFO - [2020-07-02 06:00:55,468] {local_task_job.py:91} INFO - Task is not able to be run
   [2020-07-02 06:01:20,667] {logging_mixin.py:112} INFO - [2020-07-02 06:01:20,667] {dbapi_hook.py:174} INFO - SELECT * FROM TEST.T2 limit 10;
   ```
   
   
   ## Third snapshot

   ```
   airflow@xxx:/opt/airflow$ cat logs/dag_id/task_id/2020-07-01T22:07:00+00:00/4.log
   [2020-07-02 06:00:55,465] {taskinstance.py:663} INFO - Dependencies not met for <TaskInstance: dag_id.task_id 2020-07-01T22:07:00+00:00 [running]>, dependency 'Task Instance State' FAILED: Task is in the 'running' state which is not a valid state for execution. The task must be cleared in order to be run.
   [2020-07-02 06:00:55,468] {logging_mixin.py:112} INFO - [2020-07-02 06:00:55,468] {local_task_job.py:91} INFO - Task is not able to be run
   [2020-07-02 06:01:20,667] {logging_mixin.py:112} INFO - [2020-07-02 06:01:20,667] {dbapi_hook.py:174} INFO - SELECT * FROM TEST.T2 limit 10;
   [2020-07-02 06:01:48,119] {taskinstance.py:1065} INFO - Marking task as SUCCESS.dag_id=dag_id, task_id=task_id, execution_date=20200701T220700, start_date=20200702T055924, end_date=20200702T060148
   [2020-07-02 06:01:50,254] {logging_mixin.py:112} INFO - [2020-07-02 06:01:50,254] {local_task_job.py:103} INFO - Task exited with return code 0
   ```
   
   </details>
   
   
   **What you expected to happen**:
   
   The log should be appended to, not overwritten.
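The overwrite symptom looks as if a second writer opens the same task log file in truncating (`'w'`) mode instead of append (`'a'`) mode. A minimal, stdlib-only sketch of that distinction (the file path is illustrative, and this is not Airflow's actual logging code):

```python
import os
import tempfile

# Simulate two processes logging to the same task log file.
path = os.path.join(tempfile.mkdtemp(), "4.log")

# The first "attempt" writes some log lines.
with open(path, "a") as f:
    f.write("INFO - Starting attempt 4 of 5\n")

# A second writer opening in append mode preserves the earlier lines.
with open(path, "a") as f:
    f.write("INFO - Task is not able to be run\n")

with open(path) as f:
    assert "Starting attempt" in f.read()  # earlier lines survive

# A writer opening in write mode truncates the file first, which matches
# the "log overwritten mid-run" behaviour observed above.
with open(path, "w") as f:
    f.write("INFO - Task is not able to be run\n")

with open(path) as f:
    assert "Starting attempt" not in f.read()  # earlier lines are lost
```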
   
   **How to reproduce it**:
   
   I can reproduce this every time in my Docker environment with certain JdbcOperator and SSHOperator tasks, but I have not tried other environments.
   
   
   
   **Anything else we need to know**:
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

