mfjackson opened a new issue #13579:
URL: https://github.com/apache/airflow/issues/13579


   **Apache Airflow version**: 2.0.0
   
   
   **Kubernetes version (if you are using kubernetes)** (use `kubectl 
version`): 1.19.4
   
   **Environment**:
   
   - **Cloud provider or hardware configuration**: Google Cloud Platform/GKE
   
   **What happened**:
   
   I successfully cleared the state of a failed task using the graph view UI, 
but when I attempted to re-run the cleared task instance in graph view manually 
by selecting the task instance and clicking "Run", I received the following 
error:
   
   ```
   Something bad has happened.
   Please consider letting us know by creating a bug report using GitHub.
   
   Python version: 3.8.7
   Airflow version: 2.0.0
   Node: mr_node
   
-------------------------------------------------------------------------------
   Traceback (most recent call last):
     File "/home/airflow/.local/lib/python3.8/site-packages/flask/app.py", line 
2447, in wsgi_app
       response = self.full_dispatch_request()
     File "/home/airflow/.local/lib/python3.8/site-packages/flask/app.py", line 
1952, in full_dispatch_request
       rv = self.handle_user_exception(e)
     File "/home/airflow/.local/lib/python3.8/site-packages/flask/app.py", line 
1821, in handle_user_exception
       reraise(exc_type, exc_value, tb)
     File "/home/airflow/.local/lib/python3.8/site-packages/flask/_compat.py", 
line 39, in reraise
       raise value
     File "/home/airflow/.local/lib/python3.8/site-packages/flask/app.py", line 
1950, in full_dispatch_request
       rv = self.dispatch_request()
     File "/home/airflow/.local/lib/python3.8/site-packages/flask/app.py", line 
1936, in dispatch_request
       return self.view_functions[rule.endpoint](**req.view_args)
     File 
"/home/airflow/.local/lib/python3.8/site-packages/airflow/www/auth.py", line 
34, in decorated
       return func(*args, **kwargs)
     File 
"/home/airflow/.local/lib/python3.8/site-packages/airflow/www/decorators.py", 
line 60, in wrapper
       return f(*args, **kwargs)
     File 
"/home/airflow/.local/lib/python3.8/site-packages/airflow/www/views.py", line 
1366, in run
       executor.start()
     File 
"/home/airflow/.local/lib/python3.8/site-packages/airflow/executors/kubernetes_executor.py",
 line 493, in start
       raise AirflowException("Could not get scheduler_job_id")
   airflow.exceptions.AirflowException: Could not get scheduler_job_id
   ```
   
   **What you expected to happen**:
   
   I expected the task instance to be scheduled and begin running again.
   
   **How to reproduce it**:
   
   Configure Airflow 2.0.0 to run on GCP, clear the state of a finished task 
instance using the UI (I was able to reproduce the error on a task instance 
maked "Success" as well), and again use the Web UI to "Run" the task.
   
   **Anything else we need to know**:
   
   One important item to note is that when I _only_ clear task instance and do 
not attempt to run it manually using the UI, the task does queue and is placing 
in a `running` state, but quickly fails with the following error:
   
   ```
   [2021-01-08 21:08:16,140] {taskinstance.py:1396} ERROR - (0)
   Reason: Handshake status 500 Internal Server Error
   Traceback (most recent call last):
     File 
"/home/airflow/.local/lib/python3.8/site-packages/kubernetes/stream/ws_client.py",
 line 296, in websocket_call
       client = WSClient(configuration, get_websocket_url(url), headers, 
capture_all)
     File 
"/home/airflow/.local/lib/python3.8/site-packages/kubernetes/stream/ws_client.py",
 line 94, in __init__
       self.sock.connect(url, header=header)
     File 
"/home/airflow/.local/lib/python3.8/site-packages/websocket/_core.py", line 
226, in connect
       self.handshake_response = handshake(self.sock, *addrs, **options)
     File 
"/home/airflow/.local/lib/python3.8/site-packages/websocket/_handshake.py", 
line 80, in handshake
       status, resp = _get_resp_headers(sock)
     File 
"/home/airflow/.local/lib/python3.8/site-packages/websocket/_handshake.py", 
line 165, in _get_resp_headers
       raise WebSocketBadStatusException("Handshake status %d %s", status, 
status_message, resp_headers)
   websocket._exceptions.WebSocketBadStatusException: Handshake status 500 
Internal Server Error
   
   During handling of the above exception, another exception occurred:
   
   Traceback (most recent call last):
     File 
"/home/airflow/.local/lib/python3.8/site-packages/airflow/models/taskinstance.py",
 line 1086, in _run_raw_task
       self._prepare_and_execute_task_with_callbacks(context, task)
     File 
"/home/airflow/.local/lib/python3.8/site-packages/airflow/models/taskinstance.py",
 line 1260, in _prepare_and_execute_task_with_callbacks
       result = self._execute_task(context, task_copy)
     File 
"/home/airflow/.local/lib/python3.8/site-packages/airflow/models/taskinstance.py",
 line 1300, in _execute_task
       result = task_copy.execute(context=context)
     File 
"/home/airflow/.local/lib/python3.8/site-packages/airflow/providers/cncf/kubernetes/operators/kubernetes_pod.py",
 line 335, in execute
       final_state, result = self.handle_pod_overlap(
     File 
"/home/airflow/.local/lib/python3.8/site-packages/airflow/providers/cncf/kubernetes/operators/kubernetes_pod.py",
 line 375, in handle_pod_overlap
       final_state, result = self.monitor_launched_pod(launcher, pod)
     File 
"/home/airflow/.local/lib/python3.8/site-packages/airflow/providers/cncf/kubernetes/operators/kubernetes_pod.py",
 line 513, in monitor_launched_pod
       (final_state, result) = launcher.monitor_pod(pod, get_logs=self.get_logs)
     File 
"/home/airflow/.local/lib/python3.8/site-packages/airflow/kubernetes/pod_launcher.py",
 line 151, in monitor_pod
       result = self._extract_xcom(pod)
     File 
"/home/airflow/.local/lib/python3.8/site-packages/airflow/kubernetes/pod_launcher.py",
 line 246, in _extract_xcom
       resp = kubernetes_stream(
     File 
"/home/airflow/.local/lib/python3.8/site-packages/kubernetes/stream/stream.py", 
line 35, in stream
       return func(*args, **kwargs)
     File 
"/home/airflow/.local/lib/python3.8/site-packages/kubernetes/client/api/core_v1_api.py",
 line 841, in connect_get_namespaced_pod_exec
       (data) = self.connect_get_namespaced_pod_exec_with_http_info(name, 
namespace, **kwargs)  # noqa: E501
     File 
"/home/airflow/.local/lib/python3.8/site-packages/kubernetes/client/api/core_v1_api.py",
 line 927, in connect_get_namespaced_pod_exec_with_http_info
       return self.api_client.call_api(
     File 
"/home/airflow/.local/lib/python3.8/site-packages/kubernetes/client/api_client.py",
 line 340, in call_api
       return self.__call_api(resource_path, method,
     File 
"/home/airflow/.local/lib/python3.8/site-packages/kubernetes/client/api_client.py",
 line 172, in __call_api
       response_data = self.request(
     File 
"/home/airflow/.local/lib/python3.8/site-packages/kubernetes/stream/stream.py", 
line 30, in _intercept_request_call
       return ws_client.websocket_call(config, *args, **kwargs)
     File 
"/home/airflow/.local/lib/python3.8/site-packages/kubernetes/stream/ws_client.py",
 line 302, in websocket_call
       raise ApiException(status=0, reason=str(e))
   kubernetes.client.rest.ApiException: (0)
   Reason: Handshake status 500 Internal Server Error
   ```
   
   Thanks!


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to