[jira] [Commented] (AIRFLOW-2970) Kubernetes logging is broken

2019-09-17 Thread Steven Miller (Jira)


[ https://issues.apache.org/jira/browse/AIRFLOW-2970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16931617#comment-16931617 ]

Steven Miller commented on AIRFLOW-2970:


For clarity, I'm talking about this button: !image-2019-09-17-12-19-45-207.png!

> Kubernetes logging is broken
> 
>
> Key: AIRFLOW-2970
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2970
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: executors
>Reporter: Jon Davies
>Assignee: Daniel Imberman
>Priority: Major
> Attachments: image-2019-09-17-12-19-45-207.png
>
>
> I'm using Airflow with the Kubernetes executor and pod operator. My DAGs 
> are configured with get_log=True, they all log to stdout, and I can see 
> all the logs via kubectl logs.
> I can see that the scheduler logs things to: 
> $AIRFLOW_HOME/logs/scheduler/2018-08-28/*
> However, this just consists of:
> {code:java}
> [2018-08-28 13:03:27,695] {jobs.py:385} INFO - Started process (PID=16994) to 
> work on /home/airflow/dags/dag.py
> [2018-08-28 13:03:27,697] {jobs.py:1782} INFO - Processing file 
> /home/airflow/dags/dag.py for tasks to queue
> [2018-08-28 13:03:27,697] {logging_mixin.py:95} INFO - [2018-08-28 
> 13:03:27,697] {models.py:258} INFO - Filling up the DagBag from 
> /home/airflow/dags/dag.py
> {code}
> If I quickly exec into the executor pod that the scheduler spins up, I can 
> see that things are properly logged to:
> {code:java}
> /home/airflow/logs/dag$ tail -f 
> dag-downloader/2018-08-28T13\:05\:07.704072+00\:00/1.log
> [2018-08-28 13:05:24,399] {logging_mixin.py:95} INFO - [2018-08-28 
> 13:05:24,399] {pod_launcher.py:112} INFO - Event: dag-downloader-015ca48c had 
> an event of type Pending
> ...
> [2018-08-28 13:05:37,193] {logging_mixin.py:95} INFO - [2018-08-28 
> 13:05:37,193] {pod_launcher.py:95} INFO - 
> b'INFO:botocore.vendored.requests.packages.urllib3.connectionpool:Starting 
> new HTTPS connection (7): blah-blah.s3.eu-west-1.amazonaws.com\n'
> ...
> ...all other log lines from pod...
> {code}
> However, this executor pod only exists for the lifetime of the task pod, so 
> the logs are lost almost immediately after the task runs. Nothing ships the 
> logs back to the scheduler and/or web UI.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Commented] (AIRFLOW-2970) Kubernetes logging is broken

2019-09-17 Thread Steven Miller (Jira)


[ https://issues.apache.org/jira/browse/AIRFLOW-2970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16931615#comment-16931615 ]

Steven Miller commented on AIRFLOW-2970:


Astronomer is debugging a similar issue in 1.10.x. If you try to download the 
logs from the UI (NOT view the logs: download them by clicking the button with 
a number on it), you get a network error, and the webserver logs show this 
stack trace:
```
[2019-09-17 11:51:11 +] [10636] [ERROR] Error handling request
Traceback (most recent call last):
  File "/usr/lib/python3.7/site-packages/gunicorn/workers/sync.py", line 181, in handle_request
    for item in respiter:
  File "/usr/lib/python3.7/site-packages/werkzeug/wsgi.py", line 507, in __next__
    return self._next()
  File "/usr/lib/python3.7/site-packages/werkzeug/wrappers/base_response.py", line 45, in _iter_encoded
    for item in iterable:
  File "/usr/lib/python3.7/site-packages/airflow/www_rbac/views.py", line 600, in _generate_log_stream
    logs, metadata = _get_logs_with_metadata(try_number, metadata)
  File "/usr/lib/python3.7/site-packages/airflow/www_rbac/views.py", line 569, in _get_logs_with_metadata
    logs, metadatas = handler.read(ti, try_number, metadata=metadata)
  File "/usr/lib/python3.7/site-packages/airflow/utils/log/file_task_handler.py", line 164, in read
    log, metadata = self._read(task_instance, try_number, metadata)
  File "/usr/lib/python3.7/site-packages/airflow/utils/log/es_task_handler.py", line 144, in _read
    and offset >= metadata['max_offset']:
TypeError: '>=' not supported between instances of 'str' and 'int'
```

Then it is the same problem we are experiencing. If that is the case, this 
change is what we are using to patch it while we get to the bottom of what's 
going on. [https://github.com/astronomer/airflow/pull/63]
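The TypeError above is easy to reproduce in isolation. Below is a minimal sketch of the failure and one defensive fix; the variable names and values are illustrative, not the actual es_task_handler code, and I'm not claiming this is exactly what the linked PR does:

```python
# Minimal reproduction: the offset tracked in the log-pagination metadata
# comes through as a string while the other side is an int (or vice
# versa), so the ordered comparison in _read() raises TypeError.
offset = "10"                    # illustrative value, arrives as a string
metadata = {"max_offset": 42}    # illustrative value, stored as an int

try:
    offset >= metadata["max_offset"]
except TypeError as exc:
    print(exc)  # '>=' not supported between instances of 'str' and 'int'

# One defensive fix: normalize both sides to int before comparing.
done = int(offset) >= int(metadata["max_offset"])
print(done)  # False
```

In Python 3 an ordered comparison between str and int always raises, whereas Python 2 silently allowed it, which is why this class of bug surfaces in 1.10.x deployments moving to Python 3.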






[jira] [Commented] (AIRFLOW-2970) Kubernetes logging is broken

2019-02-22 Thread Ash Berlin-Taylor (JIRA)


[ https://issues.apache.org/jira/browse/AIRFLOW-2970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16774953#comment-16774953 ]

Ash Berlin-Taylor commented on AIRFLOW-2970:


PRs welcome! An easy option that doesn't need a code change is to configure 
Airflow to write task logs to S3/GCS: 
https://airflow.apache.org/howto/write-logs.html#
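For reference, the remote-logging route looks roughly like this in a 1.10-era airflow.cfg; the bucket path and connection id below are placeholders, not settings from this thread:

```ini
[core]
# Ship task logs to object storage so they survive pod teardown
remote_logging = True
remote_base_log_folder = s3://my-airflow-logs/logs
remote_log_conn_id = my_s3_conn
```

With this in place, the task handler uploads each task's log to the bucket when the task finishes, so the web UI can still read it after the short-lived executor pod is gone.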




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-2970) Kubernetes logging is broken

2019-02-21 Thread dewin goh (JIRA)


[ https://issues.apache.org/jira/browse/AIRFLOW-2970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16773877#comment-16773877 ]

dewin goh commented on AIRFLOW-2970:


Hi, may I know the progress on this ticket? My team is interested in the 
real-time logs for this, but it seems the webserver is unable to reach the 
logs of the newly spawned pods in Kubernetes. We're willing to submit a PR 
for this, if need be.



