[ 
https://issues.apache.org/jira/browse/SPARK-27574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Will Zhang updated SPARK-27574:
-------------------------------
    Description: 
I'm using Spark on Kubernetes to submit Spark applications to a Kubernetes cluster.
Most of the time it runs smoothly, but sometimes the logs after submission show the
driver pod phase changing from Running back to Pending, and a second container being
started in the same pod even though the first container exited successfully.
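
To cross-check what LoggingPodStatusWatcherImpl reports, the driver pod can also be 
inspected from the cluster side with plain kubectl. This is just a sketch; the pod name 
and namespace are the ones from the log below, nothing here is Spark-specific:

kubectl get pod com-xxxx-cloud-mf-trainer-submit-1555666719424-driver -n default -w        # watch phase transitions live
kubectl get pod com-xxxx-cloud-mf-trainer-submit-1555666719424-driver -n default -o yaml   # full status block: containerStatuses, restartCount, conditions
kubectl describe pod com-xxxx-cloud-mf-trainer-submit-1555666719424-driver -n default      # scheduler/kubelet events recorded for the pod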

I use the standard spark-submit to Kubernetes, for example:

/opt/spark/spark-2.4.0-bin-hadoop2.7/bin/spark-submit --deploy-mode cluster 
--class xxx ...
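
The exact command the launcher ran is reproduced verbatim further down in the log; 
stripped of the environment-specific --conf entries it boils down to roughly the 
following shape (master URL, container image, and main class are the ones from that 
log; the application arguments are elided here):

/opt/spark/spark-2.4.0-bin-hadoop2.7/bin/spark-submit \
  --deploy-mode cluster \
  --class com.xxxx.cloud.mf.trainer.Submit \
  --conf spark.master=k8s://https://10.155.197.12:6443 \
  --conf spark.kubernetes.container.image=10.96.0.100:5000/spark:spark-2.4.0 \
  --conf spark.driver.memory=2048m \
  --conf spark.driver.cores=1 \
  --conf spark.executor.instances=2 \
  hdfs://yq01-m12-ai2b-service02.yq01.xxxx.com:9000/bdl-service/module/jar/module-0.1-jar-with-dependencies.jar \
  <application arguments>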

 

The log is below:

19/04/19 09:38:40 INFO LineBufferedStream: stdout: 2019-04-19 09:38:40 INFO 
LoggingPodStatusWatcherImpl:54 - State changed, new state:
19/04/19 09:38:40 INFO LineBufferedStream: stdout: pod name: 
com-xxxx-cloud-mf-trainer-submit-1555666719424-driver
19/04/19 09:38:40 INFO LineBufferedStream: stdout: namespace: default
19/04/19 09:38:40 INFO LineBufferedStream: stdout: labels: DagTask_ID -> 
54f854e2-0bce-4bd6-50e7-57b521b216f7, spark-app-selector -> 
spark-4343fe80572c4240bd933246efd975da, spark-role -> driver
19/04/19 09:38:40 INFO LineBufferedStream: stdout: pod uid: 
ea4410d5-6286-11e9-ae72-e8611f1fbb2a
19/04/19 09:38:40 INFO LineBufferedStream: stdout: creation time: 
2019-04-19T09:38:40Z
19/04/19 09:38:40 INFO LineBufferedStream: stdout: service account name: default
19/04/19 09:38:40 INFO LineBufferedStream: stdout: volumes: spark-local-dir-1, 
spark-conf-volume, default-token-q7drh
19/04/19 09:38:40 INFO LineBufferedStream: stdout: node name: N/A
19/04/19 09:38:40 INFO LineBufferedStream: stdout: start time: N/A
19/04/19 09:38:40 INFO LineBufferedStream: stdout: container images: N/A
19/04/19 09:38:40 INFO LineBufferedStream: stdout: phase: Pending
19/04/19 09:38:40 INFO LineBufferedStream: stdout: status: []
19/04/19 09:38:40 INFO LineBufferedStream: stdout: 2019-04-19 09:38:40 INFO 
LoggingPodStatusWatcherImpl:54 - State changed, new state:
19/04/19 09:38:40 INFO LineBufferedStream: stdout: pod name: 
com-xxxx-cloud-mf-trainer-submit-1555666719424-driver
19/04/19 09:38:40 INFO LineBufferedStream: stdout: namespace: default
19/04/19 09:38:40 INFO LineBufferedStream: stdout: labels: DagTask_ID -> 
54f854e2-0bce-4bd6-50e7-57b521b216f7, spark-app-selector -> 
spark-4343fe80572c4240bd933246efd975da, spark-role -> driver
19/04/19 09:38:40 INFO LineBufferedStream: stdout: pod uid: 
ea4410d5-6286-11e9-ae72-e8611f1fbb2a
19/04/19 09:38:40 INFO LineBufferedStream: stdout: creation time: 
2019-04-19T09:38:40Z
19/04/19 09:38:40 INFO LineBufferedStream: stdout: service account name: default
19/04/19 09:38:40 INFO LineBufferedStream: stdout: volumes: spark-local-dir-1, 
spark-conf-volume, default-token-q7drh
19/04/19 09:38:40 INFO LineBufferedStream: stdout: node name: 
yq01-m12-ai2b-service02.yq01.xxxx.com
19/04/19 09:38:40 INFO LineBufferedStream: stdout: start time: N/A
19/04/19 09:38:40 INFO LineBufferedStream: stdout: container images: N/A
19/04/19 09:38:40 INFO LineBufferedStream: stdout: phase: Pending
19/04/19 09:38:40 INFO LineBufferedStream: stdout: status: []

19/04/19 09:38:41 INFO LineBufferedStream: stdout: 2019-04-19 09:38:41 INFO 
LoggingPodStatusWatcherImpl:54 - State changed, new state:
19/04/19 09:38:41 INFO LineBufferedStream: stdout: pod name: 
com-xxxx-cloud-mf-trainer-submit-1555666719424-driver
19/04/19 09:38:41 INFO LineBufferedStream: stdout: namespace: default
19/04/19 09:38:41 INFO LineBufferedStream: stdout: labels: DagTask_ID -> 
54f854e2-0bce-4bd6-50e7-57b521b216f7, spark-app-selector -> 
spark-4343fe80572c4240bd933246efd975da, spark-role -> driver
19/04/19 09:38:41 INFO LineBufferedStream: stdout: pod uid: 
ea4410d5-6286-11e9-ae72-e8611f1fbb2a
19/04/19 09:38:41 INFO LineBufferedStream: stdout: creation time: 
2019-04-19T09:38:40Z
19/04/19 09:38:41 INFO LineBufferedStream: stdout: service account name: default
19/04/19 09:38:41 INFO LineBufferedStream: stdout: volumes: spark-local-dir-1, 
spark-conf-volume, default-token-q7drh
19/04/19 09:38:41 INFO LineBufferedStream: stdout: node name: 
yq01-m12-ai2b-service02.yq01.xxxx.com
19/04/19 09:38:41 INFO LineBufferedStream: stdout: start time: 
2019-04-19T09:38:40Z
19/04/19 09:38:41 INFO LineBufferedStream: stdout: container images: 
10.96.0.100:5000/spark:spark-2.4.0
19/04/19 09:38:41 INFO LineBufferedStream: stdout: phase: Pending
19/04/19 09:38:41 INFO LineBufferedStream: stdout: status: 
[ContainerStatus(containerID=null, image=10.96.0.100:5000/spark:spark-2.4.0, 
imageID=, lastState=ContainerState(running=null, terminated=null, waiting=null, 
additionalProperties={}), name=spark-kubernetes-driver, ready=false, 
restartCount=0, state=ContainerState(running=null, terminated=null, 
waiting=ContainerStateWaiting(message=null, reason=ContainerCreating, 
additionalProperties={}), additionalProperties={}), additionalProperties={})]
19/04/19 09:38:45 INFO LineBufferedStream: stdout: 2019-04-19 09:38:45 INFO 
LoggingPodStatusWatcherImpl:54 - State changed, new state:
19/04/19 09:38:45 INFO LineBufferedStream: stdout: pod name: 
com-xxxx-cloud-mf-trainer-submit-1555666719424-driver
19/04/19 09:38:45 INFO LineBufferedStream: stdout: namespace: default
19/04/19 09:38:45 INFO LineBufferedStream: stdout: labels: DagTask_ID -> 
54f854e2-0bce-4bd6-50e7-57b521b216f7, spark-app-selector -> 
spark-4343fe80572c4240bd933246efd975da, spark-role -> driver
19/04/19 09:38:45 INFO LineBufferedStream: stdout: pod uid: 
ea4410d5-6286-11e9-ae72-e8611f1fbb2a
19/04/19 09:38:45 INFO LineBufferedStream: stdout: creation time: 
2019-04-19T09:38:40Z
19/04/19 09:38:45 INFO LineBufferedStream: stdout: service account name: default
19/04/19 09:38:45 INFO LineBufferedStream: stdout: volumes: spark-local-dir-1, 
spark-conf-volume, default-token-q7drh
19/04/19 09:38:45 INFO LineBufferedStream: stdout: node name: 
yq01-m12-ai2b-service02.yq01.xxxx.com
19/04/19 09:38:45 INFO LineBufferedStream: stdout: start time: 
2019-04-19T09:38:40Z
19/04/19 09:38:45 INFO LineBufferedStream: stdout: container images: 
10.96.0.100:5000/spark-2.4.0:latest_7fdb0b75-0e7b-4587-42c7-b79a3dbd9f83
19/04/19 09:38:45 INFO LineBufferedStream: stdout: phase: Running
19/04/19 09:38:45 INFO LineBufferedStream: stdout: status: 
[ContainerStatus(containerID=docker://3d21a87775d016719d2f318739fe16dac62422e61fdc023cacdafaa7fce0f6ec,
 
image=10.96.0.100:5000/spark-2.4.0:latest_7fdb0b75-0e7b-4587-42c7-b79a3dbd9f83, 
imageID=docker-pullable://10.96.0.100:5000/spark-2.4.0@sha256:5b47e2a29aeb1c644fc3853933be2ad08f9cd233dec0977908803e9a1f870b0f,
 lastState=ContainerState(running=null, terminated=null, waiting=null, 
additionalProperties={}), name=spark-kubernetes-driver, ready=true, 
restartCount=0, 
state=ContainerState(running=ContainerStateRunning(startedAt=Time(time=2019-04-19T09:38:44Z,
 additionalProperties={}), additionalProperties={}), terminated=null, 
waiting=null, additionalProperties={}), additionalProperties={})]
19/04/19 09:38:46 INFO BatchSession$: Creating batch session 211: [owner: null, 
request: [proxyUser: None, file: 
hdfs://yq01-m12-ai2b-service02.yq01.xxxx.com:9000/bdl-service/module/jar/module-0.1-jar-with-dependencies.jar,
 args: 
--mode,train,--graph,hdfs://yq01-m12-ai2b-service02.yq01.xxxx.com:9000/project/62247e3a-e322-4456-6387-a66e9490652e/exp/62c37ae9-12aa-43f7-671f-d187e1bf1f84/graph/08e1dfad-c272-45ca-4201-1a8bc691a56e/meta/node1555662130294/graph.json,--tracking_server_url,http://10.155.197.12:8080,--sk,56305f9f-b755-4b42-4218-592555f5c4a8,--ak,970f5e4c-7171-4c61-603e-f101b65a573b,
 driverMemory: 2048m, driverCores: 1, numExecutors: 2, conf: 
spark.kubernetes.driver.label.DagTask_ID -> 
5fd12b90-fbbb-41f0-41ad-7bc5bd0abfe0,spark.kubernetes.driverEnv.xxxx_KUBERNETES_LOG_ENDPOINT
 -> yq01-m12-ai2b-service02.yq01.xxxx.com:8070,spark.hadoop.fs.defaultFS -> 
hdfs://yq01-m12-ai2b-service02.yq01.xxxx.com:9000,spark.executorEnv.xxxx_KUBERNETES_LOG_FLUSH_FREQUENCY
 -> 10s,spark.kubernetes.driverEnv.xxxx_KUBERNETES_LOG_PATH -> 
/project/62247e3a-e322-4456-6387-a66e9490652e/exp/62c37ae9-12aa-43f7-671f-d187e1bf1f84/graph/08e1dfad-c272-45ca-4201-1a8bc691a56e/log/driver,spark.kubernetes.container.image
 -> 
10.96.0.100:5000/spark:spark-2.4.0,spark.executorEnv.xxxx_KUBERNETES_LOG_PATH 
-> 
/project/62247e3a-e322-4456-6387-a66e9490652e/exp/62c37ae9-12aa-43f7-671f-d187e1bf1f84/graph/08e1dfad-c272-45ca-4201-1a8bc691a56e/log/executor,spark.executorEnv.xxxx_KUBERNETES_LOG_ENDPOINT
 -> 
yq01-m12-ai2b-service02.yq01.xxxx.com:8070,spark.kubernetes.driverEnv.xxxx_KUBERNETES_LOG_FLUSH_FREQUENCY
 -> 10s]]
19/04/19 09:38:46 INFO SparkProcessBuilder: Running 
'/opt/spark/spark-2.4.0-bin-hadoop2.7/bin/spark-submit' '--deploy-mode' 
'cluster' '--class' 'com.xxxx.cloud.mf.trainer.Submit' '--conf' 
'spark.executorEnv.xxxx_KUBERNETES_LOG_PATH=/project/62247e3a-e322-4456-6387-a66e9490652e/exp/62c37ae9-12aa-43f7-671f-d187e1bf1f84/graph/08e1dfad-c272-45ca-4201-1a8bc691a56e/log/executor'
 '--conf' 'spark.driver.memory=2048m' '--conf' 'spark.executor.instances=2' 
'--conf' 
'spark.kubernetes.driver.label.DagTask_ID=5fd12b90-fbbb-41f0-41ad-7bc5bd0abfe0' 
'--conf' 'spark.kubernetes.driverEnv.xxxx_KUBERNETES_LOG_FLUSH_FREQUENCY=10s' 
'--conf' 'spark.driver.cores=1' '--conf' 
'spark.kubernetes.driverEnv.xxxx_KUBERNETES_LOG_PATH=/project/62247e3a-e322-4456-6387-a66e9490652e/exp/62c37ae9-12aa-43f7-671f-d187e1bf1f84/graph/08e1dfad-c272-45ca-4201-1a8bc691a56e/log/driver'
 '--conf' 
'spark.executorEnv.xxxx_KUBERNETES_LOG_ENDPOINT=yq01-m12-ai2b-service02.yq01.xxxx.com:8070'
 '--conf' 'spark.submit.deployMode=cluster' '--conf' 
'spark.hadoop.fs.defaultFS=hdfs://yq01-m12-ai2b-service02.yq01.xxxx.com:9000' 
'--conf' 
'spark.kubernetes.driverEnv.xxxx_KUBERNETES_LOG_ENDPOINT=yq01-m12-ai2b-service02.yq01.xxxx.com:8070'
 '--conf' 'spark.kubernetes.container.image=10.96.0.100:5000/spark:spark-2.4.0' 
'--conf' 'spark.master=k8s://https://10.155.197.12:6443' '--conf' 
'spark.executorEnv.xxxx_KUBERNETES_LOG_FLUSH_FREQUENCY=10s' 
'hdfs://yq01-m12-ai2b-service02.yq01.xxxx.com:9000/bdl-service/module/jar/module-0.1-jar-with-dependencies.jar'
 '--mode' 'train' '--graph' 
'hdfs://yq01-m12-ai2b-service02.yq01.xxxx.com:9000/project/62247e3a-e322-4456-6387-a66e9490652e/exp/62c37ae9-12aa-43f7-671f-d187e1bf1f84/graph/08e1dfad-c272-45ca-4201-1a8bc691a56e/meta/node1555662130294/graph.json'
 '--tracking_server_url' 'http://10.155.197.12:8080' '--sk' 
'56305f9f-b755-4b42-4218-592555f5c4a8' '--ak' 
'970f5e4c-7171-4c61-603e-f101b65a573b'

19/04/19 09:39:57 INFO LineBufferedStream: stdout: 2019-04-19 09:39:57 INFO 
LoggingPodStatusWatcherImpl:54 - State changed, new state:
19/04/19 09:39:57 INFO LineBufferedStream: stdout: pod name: 
com-xxxx-cloud-mf-trainer-submit-1555666719424-driver
19/04/19 09:39:57 INFO LineBufferedStream: stdout: namespace: default
19/04/19 09:39:57 INFO LineBufferedStream: stdout: labels: DagTask_ID -> 
54f854e2-0bce-4bd6-50e7-57b521b216f7, spark-app-selector -> 
spark-4343fe80572c4240bd933246efd975da, spark-role -> driver
19/04/19 09:39:57 INFO LineBufferedStream: stdout: pod uid: 
ea4410d5-6286-11e9-ae72-e8611f1fbb2a
19/04/19 09:39:57 INFO LineBufferedStream: stdout: creation time: 
2019-04-19T09:38:40Z
19/04/19 09:39:57 INFO LineBufferedStream: stdout: service account name: default
19/04/19 09:39:57 INFO LineBufferedStream: stdout: volumes: spark-local-dir-1, 
spark-conf-volume, default-token-q7drh
19/04/19 09:39:57 INFO LineBufferedStream: stdout: node name: 
yq01-m12-ai2b-service02.yq01.xxxx.com
19/04/19 09:39:57 INFO LineBufferedStream: stdout: start time: 
2019-04-19T09:38:40Z
19/04/19 09:39:57 INFO LineBufferedStream: stdout: container images: 
10.96.0.100:5000/spark:spark-2.4.0
19/04/19 09:39:57 INFO LineBufferedStream: stdout: phase: Pending
19/04/19 09:39:57 INFO LineBufferedStream: stdout: status: 
[ContainerStatus(containerID=null, image=10.96.0.100:5000/spark:spark-2.4.0, 
imageID=, lastState=ContainerState(running=null, terminated=null, waiting=null, 
additionalProperties={}), name=spark-kubernetes-driver, ready=false, 
restartCount=0, state=ContainerState(running=null, terminated=null, 
waiting=ContainerStateWaiting(message=null, reason=ContainerCreating, 
additionalProperties={}), additionalProperties={}), additionalProperties={})]
19/04/19 09:40:00 INFO LineBufferedStream: stdout: 2019-04-19 09:40:00 INFO 
LoggingPodStatusWatcherImpl:54 - State changed, new state:
19/04/19 09:40:00 INFO LineBufferedStream: stdout: pod name: 
com-xxxx-cloud-mf-trainer-submit-1555666719424-driver
19/04/19 09:40:00 INFO LineBufferedStream: stdout: namespace: default
19/04/19 09:40:00 INFO LineBufferedStream: stdout: labels: DagTask_ID -> 
54f854e2-0bce-4bd6-50e7-57b521b216f7, spark-app-selector -> 
spark-4343fe80572c4240bd933246efd975da, spark-role -> driver
19/04/19 09:40:00 INFO LineBufferedStream: stdout: pod uid: 
ea4410d5-6286-11e9-ae72-e8611f1fbb2a
19/04/19 09:40:00 INFO LineBufferedStream: stdout: creation time: 
2019-04-19T09:38:40Z
19/04/19 09:40:00 INFO LineBufferedStream: stdout: service account name: default
19/04/19 09:40:00 INFO LineBufferedStream: stdout: volumes: spark-local-dir-1, 
spark-conf-volume, default-token-q7drh
19/04/19 09:40:00 INFO LineBufferedStream: stdout: node name: 
yq01-m12-ai2b-service02.yq01.xxxx.com
19/04/19 09:40:00 INFO LineBufferedStream: stdout: start time: 
2019-04-19T09:38:40Z
19/04/19 09:40:00 INFO LineBufferedStream: stdout: container images: 
10.96.0.100:5000/spark-2.4.0:latest_7fdb0b75-0e7b-4587-42c7-b79a3dbd9f83
19/04/19 09:40:00 INFO LineBufferedStream: stdout: phase: Running

19/04/19 09:40:00 INFO LineBufferedStream: stdout: status: 
[ContainerStatus(containerID=docker://23c9ea6767a274f8e8759da39dee90f403d9d28b1fec97c1fa4cd8746b41c8c3,
 
image=10.96.0.100:5000/spark-2.4.0:latest_7fdb0b75-0e7b-4587-42c7-b79a3dbd9f83, 
imageID=docker-pullable://10.96.0.100:5000/spark-2.4.0@sha256:5b47e2a29aeb1c644fc3853933be2ad08f9cd233dec0977908803e9a1f870b0f,
 lastState=ContainerState(running=null, terminated=null, waiting=null, 
additionalProperties={}), name=spark-kubernetes-driver, ready=true, 
restartCount=0, 
state=ContainerState(running=ContainerStateRunning(startedAt=Time(time=2019-04-19T09:39:57Z,
 additionalProperties={}), additionalProperties={}), terminated=null, 
waiting=null, additionalProperties={}), additionalProperties={})]

19/04/19 09:40:51 INFO LineBufferedStream: stdout: 2019-04-19 09:40:51 INFO 
LoggingPodStatusWatcherImpl:54 - State changed, new state:
19/04/19 09:40:51 INFO LineBufferedStream: stdout: pod name: 
com-xxxx-cloud-mf-trainer-submit-1555666719424-driver
19/04/19 09:40:51 INFO LineBufferedStream: stdout: namespace: default
19/04/19 09:40:51 INFO LineBufferedStream: stdout: labels: DagTask_ID -> 
54f854e2-0bce-4bd6-50e7-57b521b216f7, spark-app-selector -> 
spark-4343fe80572c4240bd933246efd975da, spark-role -> driver
19/04/19 09:40:51 INFO LineBufferedStream: stdout: pod uid: 
ea4410d5-6286-11e9-ae72-e8611f1fbb2a
19/04/19 09:40:51 INFO LineBufferedStream: stdout: creation time: 
2019-04-19T09:38:40Z
19/04/19 09:40:51 INFO LineBufferedStream: stdout: service account name: default
19/04/19 09:40:51 INFO LineBufferedStream: stdout: volumes: spark-local-dir-1, 
spark-conf-volume, default-token-q7drh
19/04/19 09:40:51 INFO LineBufferedStream: stdout: node name: 
yq01-m12-ai2b-service02.yq01.xxxx.com
19/04/19 09:40:51 INFO LineBufferedStream: stdout: start time: 
2019-04-19T09:38:40Z
19/04/19 09:40:51 INFO LineBufferedStream: stdout: container images: 
10.96.0.100:5000/spark-2.4.0:latest_7fdb0b75-0e7b-4587-42c7-b79a3dbd9f83
19/04/19 09:40:51 INFO LineBufferedStream: stdout: phase: Failed
19/04/19 09:40:51 INFO LineBufferedStream: stdout: status: 
[ContainerStatus(containerID=docker://23c9ea6767a274f8e8759da39dee90f403d9d28b1fec97c1fa4cd8746b41c8c3,
 
image=10.96.0.100:5000/spark-2.4.0:latest_7fdb0b75-0e7b-4587-42c7-b79a3dbd9f83, 
imageID=docker-pullable://10.96.0.100:5000/spark-2.4.0@sha256:5b47e2a29aeb1c644fc3853933be2ad08f9cd233dec0977908803e9a1f870b0f,
 lastState=ContainerState(running=null, terminated=null, waiting=null, 
additionalProperties={}), name=spark-kubernetes-driver, ready=false, 
restartCount=0, state=ContainerState(running=null, 
terminated=ContainerStateTerminated(containerID=docker://23c9ea6767a274f8e8759da39dee90f403d9d28b1fec97c1fa4cd8746b41c8c3,
 exitCode=1, finishedAt=Time(time=2019-04-19T09:40:48Z, 
additionalProperties={}), message=null, reason=Error, signal=null, 
startedAt=Time(time=2019-04-19T09:39:57Z, additionalProperties={}), 
additionalProperties={}), waiting=null, additionalProperties={}), 
additionalProperties={})]
19/04/19 09:40:51 INFO LineBufferedStream: stdout: 2019-04-19 09:40:51 INFO 
LoggingPodStatusWatcherImpl:54 - Container final statuses:

Please let me know if I've missed anything. Any help is appreciated.

  was:
I'm using spark-on-kubernetes to submit spark app to kubernetes.
most of the time, it runs smoothly.
but sometimes, I see logs after submitting: the driver pod phase changed from 
running to pending and starts another container in the pod though the first 
container exited successfully.

I use the standard spark-submit to kubernetes like:

/opt/spark/spark-2.4.0-bin-hadoop2.7/bin/spark-submit --deploy-mode cluster 
--class xxx ...

 

log is below:

 

 

2019-04-25 13:37:01 INFO LoggingPodStatusWatcherImpl:54 - State changed, new 
state:
pod name: com-xxxx-cloud-mf-trainer-submit-1556199419847-driver
namespace: default
labels: DagTask_ID -> 5fd12b90-fbbb-41f0-41ad-7bc5bd0abfe0, spark-app-selector 
-> spark-3c8350a62ab44c139ce073d654fddebb, spark-role -> driver
pod uid: 348cdcf5-675f-11e9-ae72-e8611f1fbb2a
creation time: 2019-04-25T13:37:01Z
service account name: default
volumes: spark-local-dir-1, spark-conf-volume, default-token-q7drh
node name: N/A
start time: N/A
container images: N/A
phase: Pending
status: []
2019-04-25 13:37:01 INFO LoggingPodStatusWatcherImpl:54 - State changed, new 
state:
pod name: com-xxxx-cloud-mf-trainer-submit-1556199419847-driver
namespace: default
labels: DagTask_ID -> 5fd12b90-fbbb-41f0-41ad-7bc5bd0abfe0, spark-app-selector 
-> spark-3c8350a62ab44c139ce073d654fddebb, spark-role -> driver
pod uid: 348cdcf5-675f-11e9-ae72-e8611f1fbb2a
creation time: 2019-04-25T13:37:01Z
service account name: default
volumes: spark-local-dir-1, spark-conf-volume, default-token-q7drh
node name: yq01-m12-ai2b-service02.yq01.xxxx.com
start time: N/A
container images: N/A
phase: Pending
status: []
2019-04-25 13:37:01 INFO Client:54 - Waiting for application 
com.xxxx.cloud.mf.trainer.Submit to finish...
2019-04-25 13:37:01 INFO LoggingPodStatusWatcherImpl:54 - State changed, new 
state:
pod name: com-xxxx-cloud-mf-trainer-submit-1556199419847-driver
namespace: default
labels: DagTask_ID -> 5fd12b90-fbbb-41f0-41ad-7bc5bd0abfe0, spark-app-selector 
-> spark-3c8350a62ab44c139ce073d654fddebb, spark-role -> driver
pod uid: 348cdcf5-675f-11e9-ae72-e8611f1fbb2a
creation time: 2019-04-25T13:37:01Z
service account name: default
volumes: spark-local-dir-1, spark-conf-volume, default-token-q7drh
node name: yq01-m12-ai2b-service02.yq01.xxxx.com
start time: 2019-04-25T13:37:01Z
container images: 10.96.0.100:5000/spark:spark-2.4.0
phase: Pending
status: [ContainerStatus(containerID=null, 
image=10.96.0.100:5000/spark:spark-2.4.0, imageID=, 
lastState=ContainerState(running=null, terminated=null, waiting=null, 
additionalProperties={}), name=spark-kubernetes-driver, ready=false, 
restartCount=0, state=ContainerState(running=null, terminated=null, 
waiting=ContainerStateWaiting(message=null, reason=ContainerCreating, 
additionalProperties={}), additionalProperties={}), additionalProperties={})]
2019-04-25 13:37:04 INFO LoggingPodStatusWatcherImpl:54 - State changed, new 
state:
pod name: com-xxxx-cloud-mf-trainer-submit-1556199419847-driver
namespace: default
labels: DagTask_ID -> 5fd12b90-fbbb-41f0-41ad-7bc5bd0abfe0, spark-app-selector 
-> spark-3c8350a62ab44c139ce073d654fddebb, spark-role -> driver
pod uid: 348cdcf5-675f-11e9-ae72-e8611f1fbb2a
creation time: 2019-04-25T13:37:01Z
service account name: default
volumes: spark-local-dir-1, spark-conf-volume, default-token-q7drh
node name: yq01-m12-ai2b-service02.yq01.xxxx.com
start time: 2019-04-25T13:37:01Z
container images: 10.96.0.100:5000/spark:spark-2.4.0
phase: Running
status: 
[ContainerStatus(containerID=docker://120dbf8cb11cf8ef9b26cff3354e096a979beb35279de34be64b3c06e896b991,
 image=10.96.0.100:5000/spark:spark-2.4.0, 
imageID=docker-pullable://10.96.0.100:5000/spark@sha256:5b47e2a29aeb1c644fc3853933be2ad08f9cd233dec0977908803e9a1f870b0f,
 lastState=ContainerState(running=null, terminated=null, waiting=null, 
additionalProperties={}), name=spark-kubernetes-driver, ready=true, 
restartCount=0, 
state=ContainerState(running=ContainerStateRunning(startedAt=Time(time=2019-04-25T13:37:03Z,
 additionalProperties={}), additionalProperties={}), terminated=null, 
waiting=null, additionalProperties={}), additionalProperties={})]

2019-04-25 13:37:27 INFO LoggingPodStatusWatcherImpl:54 - State changed, new 
state:
pod name: com-xxxx-cloud-mf-trainer-submit-1556199419847-driver
namespace: default
labels: DagTask_ID -> 5fd12b90-fbbb-41f0-41ad-7bc5bd0abfe0, spark-app-selector 
-> spark-3c8350a62ab44c139ce073d654fddebb, spark-role -> driver
pod uid: 348cdcf5-675f-11e9-ae72-e8611f1fbb2a
creation time: 2019-04-25T13:37:01Z
service account name: default
volumes: spark-local-dir-1, spark-conf-volume, default-token-q7drh
node name: yq01-m12-ai2b-service02.yq01.xxxx.com
start time: 2019-04-25T13:37:01Z
container images: 10.96.0.100:5000/spark:spark-2.4.0
phase: Pending
status: [ContainerStatus(containerID=null, 
image=10.96.0.100:5000/spark:spark-2.4.0, imageID=, 
lastState=ContainerState(running=null, terminated=null, waiting=null, 
additionalProperties={}), name=spark-kubernetes-driver, ready=false, 
restartCount=0, state=ContainerState(running=null, terminated=null, 
waiting=ContainerStateWaiting(message=null, reason=ContainerCreating, 
additionalProperties={}), additionalProperties={}), additionalProperties={})]
2019-04-25 13:37:29 INFO LoggingPodStatusWatcherImpl:54 - State changed, new 
state:
pod name: com-xxxx-cloud-mf-trainer-submit-1556199419847-driver
namespace: default
labels: DagTask_ID -> 5fd12b90-fbbb-41f0-41ad-7bc5bd0abfe0, spark-app-selector 
-> spark-3c8350a62ab44c139ce073d654fddebb, spark-role -> driver
pod uid: 348cdcf5-675f-11e9-ae72-e8611f1fbb2a
creation time: 2019-04-25T13:37:01Z
service account name: default
volumes: spark-local-dir-1, spark-conf-volume, default-token-q7drh
node name: yq01-m12-ai2b-service02.yq01.xxxx.com
start time: 2019-04-25T13:37:01Z
container images: 10.96.0.100:5000/spark:spark-2.4.0
phase: Running
status: 
[ContainerStatus(containerID=docker://43753f5336c41eaec8cdcdfd271b34ac465de331aad2d612fe0c7ad1c3706aac,
 image=10.96.0.100:5000/spark:spark-2.4.0, 
imageID=docker-pullable://10.96.0.100:5000/spark@sha256:5b47e2a29aeb1c644fc3853933be2ad08f9cd233dec0977908803e9a1f870b0f,
 lastState=ContainerState(running=null, terminated=null, waiting=null, 
additionalProperties={}), name=spark-kubernetes-driver, ready=true, 
restartCount=0, 
state=ContainerState(running=ContainerStateRunning(startedAt=Time(time=2019-04-25T13:37:28Z,
 additionalProperties={}), additionalProperties={}), terminated=null, 
waiting=null, additionalProperties={}), additionalProperties={})]
2019-04-25 13:37:52 INFO LoggingPodStatusWatcherImpl:54 - State changed, new 
state:
pod name: com-xxxx-cloud-mf-trainer-submit-1556199419847-driver
namespace: default
labels: DagTask_ID -> 5fd12b90-fbbb-41f0-41ad-7bc5bd0abfe0, spark-app-selector 
-> spark-3c8350a62ab44c139ce073d654fddebb, spark-role -> driver
pod uid: 348cdcf5-675f-11e9-ae72-e8611f1fbb2a
creation time: 2019-04-25T13:37:01Z
service account name: default
volumes: spark-local-dir-1, spark-conf-volume, default-token-q7drh
node name: yq01-m12-ai2b-service02.yq01.xxxx.com
start time: 2019-04-25T13:37:01Z
container images: 10.96.0.100:5000/spark:spark-2.4.0
phase: Failed
status: 
[ContainerStatus(containerID=docker://43753f5336c41eaec8cdcdfd271b34ac465de331aad2d612fe0c7ad1c3706aac,
 image=10.96.0.100:5000/spark:spark-2.4.0, 
imageID=docker-pullable://10.96.0.100:5000/spark@sha256:5b47e2a29aeb1c644fc3853933be2ad08f9cd233dec0977908803e9a1f870b0f,
 lastState=ContainerState(running=null, terminated=null, waiting=null, 
additionalProperties={}), name=spark-kubernetes-driver, ready=false, 
restartCount=0, state=ContainerState(running=null, 
terminated=ContainerStateTerminated(containerID=docker://43753f5336c41eaec8cdcdfd271b34ac465de331aad2d612fe0c7ad1c3706aac,
 exitCode=1, finishedAt=Time(time=2019-04-25T13:37:48Z, 
additionalProperties={}), message=null, reason=Error, signal=null, 
startedAt=Time(time=2019-04-25T13:37:28Z, additionalProperties={}), 
additionalProperties={}), waiting=null, additionalProperties={}), 
additionalProperties={})]
2019-04-25 13:37:52 INFO LoggingPodStatusWatcherImpl:54 - Container final 
statuses:

Container name: spark-kubernetes-driver
 Container image: 10.96.0.100:5000/spark:spark-2.4.0
 Container state: Terminated
 Exit code: 1
2019-04-25 13:37:52 INFO Client:54 - Application 
com.xxxx.cloud.mf.trainer.Submit finished.
2019-04-25 13:37:52 INFO ShutdownHookManager:54 - Shutdown hook called
2019-04-25 13:37:52 INFO ShutdownHookManager:54 - Deleting directory 
/tmp/spark-84727675-4ced-491c-8993-22e8f3539bf3
bash-4.4#

 

 

Please let me know if I miss anything.


> spark on kubernetes driver pod phase changed from running to pending and 
> starts another container in pod
> --------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-27574
>                 URL: https://issues.apache.org/jira/browse/SPARK-27574
>             Project: Spark
>          Issue Type: Bug
>          Components: Kubernetes
>    Affects Versions: 2.4.0
>         Environment: Kubernetes version (use kubectl version):
> v1.10.0
> OS (e.g: cat /etc/os-release):
> CentOS-7
> Kernel (e.g. uname -a):
> 4.17.11-1.el7.elrepo.x86_64
> Spark-2.4.0
>            Reporter: Will Zhang
>            Priority: Major
>         Attachments: driver-pod-logs.zip
>


