[ https://issues.apache.org/jira/browse/SPARK-27927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16854470#comment-16854470 ]
Edwin Biemond commented on SPARK-27927:
---------------------------------------

the 2.4.3 output
{noformat}
root  1  0  0 09:03 ?  00:00:00 /usr/local/bin/tini -s -- /opt/run.sh /opt/spark/bin/spark-submit --conf spark.driver.bindAddress=10.244.38.18 --deploy-mode client --properties-file /opt/spark/conf/spark.properties --class org.apache.spark.deploy.PythonRunner oci://code-assets@paasdevsss/pyspark_min.py
root 17  1  0 09:03 ?  00:00:00 /bin/bash /opt/run.sh /opt/spark/bin/spark-submit --conf spark.driver.bindAddress=10.244.38.18 --deploy-mode client --properties-file /opt/spark/conf/spark.properties --class org.apache.spark.deploy.PythonRunner oci://code-assets@paasdevsss/pyspark_min.py
root 20 17  2 09:03 ?  00:00:30 /usr/local/sparta-server-jre/jdk1.8.0_162/bin/java -cp local:///livy/jars/kryo-2.22.jar:/opt/spark/conf/:/opt/spark/jars/*:/etc/hadoop/conf/ -Xmx15G -Dlog4j.configuration=file:///etc/spark/conf/log4j.properties org.apache.spark.deploy.SparkSubmit --deploy-mode client --conf spark.driver.bindAddress=10.244.38.18 --properties-file /opt/spark/conf/spark.properties --class org.apache.spark.deploy.PythonRunner oci://code-assets@paasdevsss/pyspark_min.py

bash-4.2# cat stdout.log
Our Spark version is 2.4.3
Spark context information: <SparkContext master=k8s://https://kubernetes.default.svc:443 appName=hello_world> parallelism=2 python version=3.6

bash-4.2# cat stderr.log
19/06/03 09:33:13 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
19/06/03 09:33:20 INFO SparkContext: Running Spark version 2.4.3
19/06/03 09:33:20 INFO SparkContext: Submitted application: hello_world
19/06/03 09:33:20 INFO SecurityManager: Changing view acls to: root
19/06/03 09:33:20 INFO SecurityManager: Changing modify acls to: root
19/06/03 09:33:20 INFO SecurityManager: Changing view acls groups to:
19/06/03 09:33:20 INFO SecurityManager: Changing modify acls groups to:
19/06/03 09:33:20 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); groups with view permissions: Set(); users with modify permissions: Set(root); groups with modify permissions: Set()
19/06/03 09:33:20 INFO Utils: Successfully started service 'sparkDriver' on port 7078.
19/06/03 09:33:20 INFO SparkEnv: Registering MapOutputTracker
19/06/03 09:33:20 INFO SparkEnv: Registering BlockManagerMaster
19/06/03 09:33:20 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
19/06/03 09:33:20 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
19/06/03 09:33:20 INFO DiskBlockManager: Created local directory at /var/data/spark-799160e5-a3a5-4df5-ba9e-a35664ba7d8f/blockmgr-caceaedb-3edc-4dc9-8792-f871f3328f27
19/06/03 09:33:20 INFO MemoryStore: MemoryStore started with capacity 7.8 GB
19/06/03 09:33:20 INFO SparkEnv: Registering OutputCommitCoordinator
19/06/03 09:33:21 INFO log: Logging initialized @9307ms
19/06/03 09:33:21 INFO Server: jetty-9.3.z-SNAPSHOT, build timestamp: 2017-11-21T21:27:37Z, git hash: 82b8fb23f757335bb3329d540ce37a2a2615f0a8
19/06/03 09:33:21 INFO Server: Started @9392ms
19/06/03 09:33:21 INFO AbstractConnector: Started ServerConnector@58e9e261{HTTP/1.1,[http/1.1]}{0.0.0.0:4040}
19/06/03 09:33:21 INFO Utils: Successfully started service 'SparkUI' on port 4040.
19/06/03 09:33:21 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@697da11e{/jobs,null,AVAILABLE,@Spark}
19/06/03 09:33:21 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@717b2b90{/jobs/json,null,AVAILABLE,@Spark}
19/06/03 09:33:21 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@c969346{/jobs/job,null,AVAILABLE,@Spark}
19/06/03 09:33:21 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@37e8a0b1{/jobs/job/json,null,AVAILABLE,@Spark}
19/06/03 09:33:21 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@1f672a77{/stages,null,AVAILABLE,@Spark}
19/06/03 09:33:21 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@27d3967e{/stages/json,null,AVAILABLE,@Spark}
19/06/03 09:33:21 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@a1cfb17{/stages/stage,null,AVAILABLE,@Spark}
19/06/03 09:33:21 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@65ca8be2{/stages/stage/json,null,AVAILABLE,@Spark}
19/06/03 09:33:21 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@8e3000f{/stages/pool,null,AVAILABLE,@Spark}
19/06/03 09:33:21 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@69ff3b89{/stages/pool/json,null,AVAILABLE,@Spark}
19/06/03 09:33:21 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@3b0dd0d8{/storage,null,AVAILABLE,@Spark}
19/06/03 09:33:21 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@aa9c9c0{/storage/json,null,AVAILABLE,@Spark}
19/06/03 09:33:21 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@1ada43e2{/storage/rdd,null,AVAILABLE,@Spark}
19/06/03 09:33:21 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@ae91043{/storage/rdd/json,null,AVAILABLE,@Spark}
19/06/03 09:33:21 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@5fb78ad6{/environment,null,AVAILABLE,@Spark}
19/06/03 09:33:21 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@1b94bf29{/environment/json,null,AVAILABLE,@Spark}
19/06/03 09:33:21 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@53992aea{/executors,null,AVAILABLE,@Spark}
19/06/03 09:33:21 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@11b053d2{/executors/json,null,AVAILABLE,@Spark}
19/06/03 09:33:21 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@72329a08{/executors/threadDump,null,AVAILABLE,@Spark}
19/06/03 09:33:21 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@4ce11e90{/executors/threadDump/json,null,AVAILABLE,@Spark}
19/06/03 09:33:21 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@5635a39c{/static,null,AVAILABLE,@Spark}
19/06/03 09:33:21 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@23ebdc35{/,null,AVAILABLE,@Spark}
19/06/03 09:33:21 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@4a320d90{/api,null,AVAILABLE,@Spark}
19/06/03 09:33:21 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@2687d6b{/jobs/job/kill,null,AVAILABLE,@Spark}
19/06/03 09:33:21 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@35c4ddc5{/stages/stage/kill,null,AVAILABLE,@Spark}
19/06/03 09:33:21 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://spark-c4c30022a67746d0b99333887b44f7e1-1559554387686-driver-svc.24f2k7cztfza.svc:4040
19/06/03 09:33:21 INFO SparkContext: Added file oci://code-assets@paasdevsss/pyspark_min.py at oci://code-assets@paasdevsss/pyspark_min.py with timestamp 1559554401257
19/06/03 09:33:21 INFO Utils: Fetching oci://code-assets@paasdevsss/pyspark_min.py to /var/data/spark-799160e5-a3a5-4df5-ba9e-a35664ba7d8f/spark-66e2cb9e-8f42-476e-b2c5-05665a25bf5c/userFiles-33e3d7c2-8b35-4963-98fc-4b55e9cea2b1/fetchFileTemp3752927275413095055.tmp
19/06/03 09:33:22 INFO ExecutorPodsAllocator: Going to request 1 executors from Kubernetes.
19/06/03 09:33:22 INFO Version: HV000001: Hibernate Validator 5.2.4.Final
19/06/03 09:33:22 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 7079.
19/06/03 09:33:22 INFO NettyBlockTransferService: Server created on spark-c4c30022a67746d0b99333887b44f7e1-1559554387686-driver-svc.24f2k7cztfza.svc:7079
19/06/03 09:33:22 INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
19/06/03 09:33:22 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, spark-c4c30022a67746d0b99333887b44f7e1-1559554387686-driver-svc.24f2k7cztfza.svc, 7079, None)
19/06/03 09:33:22 INFO BlockManagerMasterEndpoint: Registering block manager spark-c4c30022a67746d0b99333887b44f7e1-1559554387686-driver-svc.24f2k7cztfza.svc:7079 with 7.8 GB RAM, BlockManagerId(driver, spark-c4c30022a67746d0b99333887b44f7e1-1559554387686-driver-svc.24f2k7cztfza.svc, 7079, None)
19/06/03 09:33:22 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, spark-c4c30022a67746d0b99333887b44f7e1-1559554387686-driver-svc.24f2k7cztfza.svc, 7079, None)
19/06/03 09:33:22 INFO BlockManager: Initialized BlockManager: BlockManagerId(driver, spark-c4c30022a67746d0b99333887b44f7e1-1559554387686-driver-svc.24f2k7cztfza.svc, 7079, None)
19/06/03 09:33:22 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@c60c92{/metrics/json,null,AVAILABLE,@Spark}
19/06/03 09:33:22 INFO SparkContext: Registered listener oracle.dfcs.spark.listener.JobListener
19/06/03 09:33:22 INFO JobListener: Thread 64 called onApplicationStart...
19/06/03 09:33:22 INFO SparkUIIngressServiceBuilder: Intialize SparkUIIngressService using SparkConf...
19/06/03 09:33:22 INFO SparkUIIngressServiceBuilder: masterURL - https://kubernetes.default.svc:443, nameSpace - 24f2k7cztfza, backendServiceName - spark-c4c30022a67746d0b99333887b44f7e1-1559554387686-driver-svc, ingressServiceName - spark-c4c30022a67746d0b99333887b44f7e1-1559554387686-ingress, runId - c4c30022-a677-46d0-b993-33887b44f7e1
19/06/03 09:33:22 INFO SparkUIIngressServiceBuilder: Building SparkUIIngressService...
19/06/03 09:33:22 INFO SparkUIIngressServiceBuilder:
---
apiVersion: "extensions/v1beta1"
kind: "Ingress"
metadata:
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: "/"
    nginx.ingress.kubernetes.io/configuration-snippet: "rewrite /sparkui/c4c30022-a677-46d0-b993-33887b44f7e1/(.*)$ /$1 break;\nproxy_set_header Accept-Encoding \"\";\nsub_filter_types text/html application/javascript;\nsub_filter \"/static/\" \"/sparkui/c4c30022-a677-46d0-b993-33887b44f7e1/static/\";\nsub_filter \"/jobs/\" \"/sparkui/c4c30022-a677-46d0-b993-33887b44f7e1/jobs/\";\nsub_filter \"/stages/\" \"/sparkui/c4c30022-a677-46d0-b993-33887b44f7e1/stages/\";\nsub_filter \"/storage/\" \"/sparkui/c4c30022-a677-46d0-b993-33887b44f7e1/storage/\";\nsub_filter \"/environment/\" \"/sparkui/c4c30022-a677-46d0-b993-33887b44f7e1/environment/\";\nsub_filter \"/executors/\" \"/sparkui/c4c30022-a677-46d0-b993-33887b44f7e1/executors/\";\nsub_filter \"/streaming/\" \"/sparkui/c4c30022-a677-46d0-b993-33887b44f7e1/streaming/\";\nsub_filter \"/SQL/\" \"/sparkui/c4c30022-a677-46d0-b993-33887b44f7e1/SQL/\";\nsub_filter \"/api/\" \"/sparkui/c4c30022-a677-46d0-b993-33887b44f7e1/api/\";\nsub_filter \"</head>\" \"<script src='https://cdnjs.cloudflare.com/ajax/libs/iframe-resizer/3.6.5/iframeResizer.contentWindow.js'></script></head>\";\nsub_filter_once off;\n"
    nginx.ingress.kubernetes.io/proxy-redirect-from: "http://$host/"
    nginx.ingress.kubernetes.io/proxy-redirect-to: "$scheme://$host/sparkui/c4c30022-a677-46d0-b993-33887b44f7e1/"
    nginx.ingress.kubernetes.io/ssl-redirect: "false"
  labels:
    app: "spark-c4c30022a67746d0b99333887b44f7e1-1559554387686-ingress"
  name: "spark-c4c30022a67746d0b99333887b44f7e1-1559554387686-ingress"
  namespace: "24f2k7cztfza"
spec:
  rules:
  - http:
      paths:
      - backend:
          serviceName: "spark-c4c30022a67746d0b99333887b44f7e1-1559554387686-driver-svc"
          servicePort: 4040
        path: "/sparkui/c4c30022-a677-46d0-b993-33887b44f7e1"
19/06/03 09:33:22 WARN VersionUsageUtils: The client is using resource type 'ingresses' with unstable version 'v1beta1'
19/06/03 09:33:23 INFO SparkUIIngressServiceBuilder: Creating Ingress Service.
19/06/03 09:33:23 INFO SparkUIIngressServiceBuilder: Created Ingress Service.
19/06/03 09:33:52 INFO KubernetesClusterSchedulerBackend: SchedulerBackend is ready for scheduling beginning after waiting maxRegisteredResourcesWaitingTime: 30000(ms)
19/06/03 09:33:52 INFO SharedState: Setting hive.metastore.warehouse.dir ('null') to the value of spark.sql.warehouse.dir ('file:/spark-warehouse').
19/06/03 09:33:52 INFO SharedState: Warehouse path is 'file:/spark-warehouse'.
19/06/03 09:33:52 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@7d0fde73{/SQL,null,AVAILABLE,@Spark}
19/06/03 09:33:52 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@1d2515b7{/SQL/json,null,AVAILABLE,@Spark}
19/06/03 09:33:52 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@621b8047{/SQL/execution,null,AVAILABLE,@Spark}
19/06/03 09:33:52 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@5801cf32{/SQL/execution/json,null,AVAILABLE,@Spark}
19/06/03 09:33:52 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@6c86149c{/static/sql,null,AVAILABLE,@Spark}
19/06/03 09:33:53 INFO StateStoreCoordinatorRef: Registered StateStoreCoordinator endpoint
19/06/03 09:34:54 INFO KubernetesClusterSchedulerBackend$KubernetesDriverEndpoint: Registered executor NettyRpcEndpointRef(spark-client://Executor) (10.244.43.3:46308) with ID 1
19/06/03 09:34:55 INFO BlockManagerMasterEndpoint: Registering block manager 10.244.43.3:60844 with 8.4 GB RAM, BlockManagerId(1, 10.244.43.3, 60844, None)
{noformat}

> driver pod hangs with pyspark 2.4.3 and master on kubernetes
> ------------------------------------------------------------
>
>                 Key: SPARK-27927
>                 URL: https://issues.apache.org/jira/browse/SPARK-27927
>             Project: Spark
>          Issue Type: Bug
>          Components: Kubernetes
>    Affects Versions: 3.0.0, 2.4.3
>         Environment: k8s 1.11.9
> spark 2.4.3 and master branch.
>            Reporter: Edwin Biemond
>            Priority: Major
>
> When we run a simple pyspark script on Spark 2.4.3 or 3.0.0, the driver pod hangs and never calls the shutdown hook.
> {code:java}
> #!/usr/bin/env python
> from __future__ import print_function
> import os
> import os.path
> import sys
> # Are we really in Spark?
> from pyspark.sql import SparkSession
> spark = SparkSession.builder.appName('hello_world').getOrCreate()
> print('Our Spark version is {}'.format(spark.version))
> print('Spark context information: {} parallelism={} python version={}'.format(
> str(spark.sparkContext),
> spark.sparkContext.defaultParallelism,
> spark.sparkContext.pythonVer
> ))
> {code}
> When we run this on Kubernetes, the driver and executor just hang, even though we see the output of the python script.
> {noformat}
> bash-4.2# cat stdout.log
> Our Spark version is 2.4.3
> Spark context information: <SparkContext master=k8s://https://kubernetes.default.svc:443 appName=hello_world> parallelism=2 python version=3.6{noformat}
> What works:
> * a simple python script with only a print works fine on 2.4.3 and 3.0.0
> * the same setup on 2.4.0
> * 2.4.3 spark-submit with the above pyspark script
>
>

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
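Editor's note: the "never calls the shutdown hook" symptom described above is the classic signature of a lingering non-daemon thread keeping the process alive after the user script returns. The sketch below illustrates that mechanism in plain Python only; it is an analogy, not Spark's internals, and the names (`worker`, `stop`) are illustrative:

```python
import threading

def worker(stop_event):
    # Stands in for a background service thread (e.g. a connector's
    # keep-alive loop) that outlives the user code.
    stop_event.wait()

stop = threading.Event()

# daemon=True: the interpreter may exit while this thread still runs,
# just as the JVM exits (and runs shutdown hooks) once only daemon
# threads remain.
daemon_thread = threading.Thread(target=worker, args=(stop,), daemon=True)
daemon_thread.start()

# daemon=False: this thread blocks interpreter exit -- the analogue of
# the hanging driver pod -- until it is explicitly told to finish.
blocking_thread = threading.Thread(target=worker, args=(stop,), daemon=False)
blocking_thread.start()

# Releasing the event is the analogue of a clean shutdown (e.g. an
# explicit spark.stop()) of whatever spawned the non-daemon thread.
stop.set()
blocking_thread.join(timeout=5)
print("blocking thread finished:", not blocking_thread.is_alive())
# -> blocking thread finished: True
```

Without `stop.set()`, the non-daemon thread would keep the process alive indefinitely; comparing a `jstack` of a hung 2.4.3 driver against a 2.4.0 one for such threads would narrow down what changed.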