[ 
https://issues.apache.org/jira/browse/AIRFLOW-6040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16980540#comment-16980540
 ] 

ASF GitHub Bot commented on AIRFLOW-6040:
-----------------------------------------

maxirus commented on pull request #6643: [AIRFLOW-6040] Fix 
KubernetesJobWatcher Read time out error
URL: https://github.com/apache/airflow/pull/6643
 
 
   
   ### Jira
   
   - [ ] My PR addresses the following [Airflow Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues
     - [\[AIRFLOW-6040\]](https://issues.apache.org/jira/browse/AIRFLOW-6040)
   
   ### Description
   
   - [ ] Here are some details about my PR, including screenshots of any UI 
changes:
     - Setting timeout_seconds=50 in the Watch() loop produces a warning 
instead of an exception when a worker_uuid does not exist. timeout_seconds 
applies to the list_namespaced_pod call itself, whereas the read timeout in 
the underlying urllib3 library raises an exception.
     - Adding worker_uuid to the log message so users know which label is being 
watched.
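
The change described above might look roughly like the following. This is a minimal sketch, not the code from the PR: the helper names `build_watch_kwargs` and `watch_pods` and the `airflow-worker` label key are assumptions made for illustration. The watcher and client are passed in, so the snippet does not depend on how they are constructed. It relies on `Watch.stream()` forwarding keyword arguments to `list_namespaced_pod`, where `timeout_seconds` bounds the call on the API server side, in contrast to the client-side read timeout that urllib3 raises as `ReadTimeoutError`.

```python
import logging

log = logging.getLogger(__name__)


def build_watch_kwargs(worker_uuid, resource_version=None, timeout_seconds=50):
    """Build keyword arguments for list_namespaced_pod (hypothetical helper)."""
    kwargs = {
        # The label key is an assumption made for this sketch.
        "label_selector": "airflow-worker={}".format(worker_uuid),
        # Server-side limit: the API server ends the list/watch call after
        # this many seconds, so the stream finishes instead of hanging until
        # the HTTP client's read timeout fires.
        "timeout_seconds": timeout_seconds,
    }
    if resource_version:
        kwargs["resource_version"] = resource_version
    return kwargs


def watch_pods(watcher, kube_client, namespace, worker_uuid, resource_version=None):
    """Consume one watch stream and return the last resource_version seen."""
    kwargs = build_watch_kwargs(worker_uuid, resource_version)
    # Include worker_uuid so the log shows which label is being watched.
    log.info(
        "Event: and now my watch begins starting at resource_version: %s, "
        "worker_uuid: %s", resource_version, worker_uuid,
    )
    last_resource_version = resource_version
    for event in watcher.stream(kube_client.list_namespaced_pod, namespace, **kwargs):
        last_resource_version = event["object"].metadata.resource_version
    return last_resource_version
```

With the kubernetes Python client, `watcher` would typically be a `kubernetes.watch.Watch()` instance and `kube_client` a `CoreV1Api` object. Because the stream ends when the server-side timeout elapses, a caller would normally invoke `watch_pods` again in a loop, passing the returned resource version back in.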
   
   ### Tests
   
   ### Commits
   
   ### Documentation
   
 


> Airflow scheduler with kubernetes executor fails :- Unknown error in 
> KubernetesJobWatcher
> -----------------------------------------------------------------------------------------
>
>                 Key: AIRFLOW-6040
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-6040
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: contrib, executor-kubernetes, scheduler
>    Affects Versions: 1.10.6
>            Reporter: Ashutosh Srivastava
>            Assignee: Daniel Imberman
>            Priority: Major
>
> I am trying to set up Airflow with the Kubernetes executor. I have cloned 
> Airflow 1.10.6, built the Docker image, and deployed it with Kubernetes. The 
> pods are running and the Airflow service starts. The webserver works fine, 
> but when I check the scheduler logs I get the following error.
>  
> {{ERROR - Error while health checking kube watcher process. Process died for unknown reasons
> INFO - Event: and now my watch begins starting at resource_version: 0
> ERROR - Unknown error in KubernetesJobWatcher. Failing
> Traceback (most recent call last):
>   File "/usr/local/lib/python2.7/dist-packages/airflow/contrib/executors/kubernetes_executor.py", line 333, in run
>     self.worker_uuid, self.kube_config)
>   File "/usr/local/lib/python2.7/dist-packages/airflow/contrib/executors/kubernetes_executor.py", line 358, in _run
>     **kwargs):
>   File "/usr/local/lib/python2.7/dist-packages/kubernetes/watch/watch.py", line 144, in stream
>     for line in iter_resp_lines(resp):
>   File "/usr/local/lib/python2.7/dist-packages/kubernetes/watch/watch.py", line 48, in iter_resp_lines
>     for seg in resp.read_chunked(decode_content=False):
>   File "/usr/local/lib/python2.7/dist-packages/urllib3/response.py", line 781, in read_chunked
>     self._original_response.close()
>   File "/usr/lib/python2.7/contextlib.py", line 35, in __exit__
>     self.gen.throw(type, value, traceback)
>   File "/usr/local/lib/python2.7/dist-packages/urllib3/response.py", line 439, in _error_catcher
>     raise ReadTimeoutError(self._pool, None, "Read timed out.")
> ReadTimeoutError: HTTPSConnectionPool(host='10.0.0.1', port=443): Read timed out.}}



