Though I have created the Kubernetes RBAC as per the Spark site in my GKE cluster, I'm still getting the POD NAME null error. This is how the service account and role binding were created:

kubectl create serviceaccount spark
kubectl create clusterrolebinding spark-role --clusterrole=edit --serviceaccount=default:spark --namespace=default
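A minimal sanity check, assuming the default namespace and the spark service account created above, is to ask Kubernetes whether that account is actually allowed to create pods (the permission the driver needs in order to launch executors):

kubectl auth can-i create pods \
  --as=system:serviceaccount:default:spark \
  --namespace=default

This prints "yes" once the edit clusterrole is effectively bound to the spark service account.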
On Thu, Feb 17, 2022 at 11:31 PM Gnana Kumar <gnana.kumar...@gmail.com> wrote:

> Hi Mich
>
> This is the latest error I'm stuck with. Please help me resolve this issue.
>
> Exception in thread "main"
> io.fabric8.kubernetes.client.KubernetesClientException: Operation: [create]
> for kind: [Pod] with name: [null] in namespace: [default] failed.
>
> ~/spark/spark-3.2.1-bin-hadoop3.2/bin/spark-submit \
>   --verbose \
>   --class org.apache.spark.examples.SparkPi \
>   --master k8s://${K8S_SERVER}:443 \
>   --deploy-mode cluster \
>   --name sparkBQ \
>   --conf spark.kubernetes.namespace=$NAMESPACE \
>   --conf spark.network.timeout=300 \
>   --conf spark.executor.instances=3 \
>   --conf spark.kubernetes.allocation.batch.size=3 \
>   --conf spark.kubernetes.allocation.batch.delay=1 \
>   --conf spark.driver.cores=3 \
>   --conf spark.executor.cores=3 \
>   --conf spark.driver.memory=8092m \
>   --conf spark.executor.memory=8092m \
>   --conf spark.dynamicAllocation.enabled=true \
>   --conf spark.dynamicAllocation.shuffleTracking.enabled=true \
>   --conf spark.kubernetes.driver.pod.name=spark-pi-driver \
>   --conf spark.kubernetes.driver.container.image=${SPARK_IMAGE} \
>   --conf spark.kubernetes.executor.container.image=${SPARK_IMAGE} \
>   --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
>   --conf spark.driver.extraJavaOptions="-Dio.netty.tryReflectionSetAccessible=true" \
>   --conf spark.executor.extraJavaOptions="-Dio.netty.tryReflectionSetAccessible=true" \
>   --conf spark.kubernetes.file.upload.path=file:///tmp \
>   local:///opt/spark/examples/jars/spark-examples_2.12-3.2.1.jar
>
> Thanks
> GK
>
> On Thu, Feb 17, 2022 at 6:55 PM Mich Talebzadeh <mich.talebza...@gmail.com> wrote:
>
>> Hi Gnana,
>>
>> That JAR file /home/gnana_kumar123/spark/spark-3.2.1-bin-hadoop3.2/examples/jars/spark-examples_2.12-3.2.1.jar
>> is not visible to the GKE cluster such that all nodes can read it. I suggest
>> that you put it on gs:// bucket in GCP and access it from there.
>>
>> HTH
>>
>> view my Linkedin profile
>> <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>
>>
>> https://en.everybodywiki.com/Mich_Talebzadeh
>>
>> *Disclaimer:* Use it at your own risk. Any and all responsibility for
>> any loss, damage or destruction of data or any other property which may
>> arise from relying on this email's technical content is explicitly
>> disclaimed. The author will in no case be liable for any monetary damages
>> arising from such loss, damage or destruction.
>>
>> On Thu, 17 Feb 2022 at 13:05, Gnana Kumar <gnana.kumar...@gmail.com> wrote:
>>
>>> Hi There,
>>>
>>> I'm getting below error though I pass --class and --jars values
>>> while submitting a spark job through Spark-Submit.
>>> Please help.
>>>
>>> Exception in thread "main" org.apache.spark.SparkException: Failed to
>>> get main class in JAR with error 'File file:/home/gnana_kumar123/spark/
>>> does not exist'. Please specify one with --class.
>>>         at org.apache.spark.deploy.SparkSubmit.error(SparkSubmit.scala:972)
>>>         at org.apache.spark.deploy.SparkSubmit.prepareSubmitEnvironment(SparkSubmit.scala:486)
>>>         at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:898)
>>>         at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
>>>         at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
>>>         at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
>>>         at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1043)
>>>         at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1052)
>>>         at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
>>>
>>> ~/spark/spark-3.2.1-bin-hadoop3.2/bin/spark-submit \
>>>   --master k8s://${K8S_SERVER}:443 \
>>>   --deploy-mode cluster \
>>>   --name sparkBQ \
>>>   --conf spark.kubernetes.namespace=$NAMESPACE \
>>>   --conf spark.network.timeout=300 \
>>>   --conf spark.executor.instances=3 \
>>>   --conf spark.kubernetes.allocation.batch.size=3 \
>>>   --conf spark.kubernetes.allocation.batch.delay=1 \
>>>   --conf spark.driver.cores=3 \
>>>   --conf spark.executor.cores=3 \
>>>   --conf spark.driver.memory=8092m \
>>>   --conf spark.executor.memory=8092m \
>>>   --conf spark.dynamicAllocation.enabled=true \
>>>   --conf spark.dynamicAllocation.shuffleTracking.enabled=true \
>>>   --conf spark.kubernetes.driver.container.image=${SPARK_IMAGE} \
>>>   --conf spark.kubernetes.executor.container.image=${SPARK_IMAGE} \
>>>   --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
>>>   --conf spark.driver.extraJavaOptions="-Dio.netty.tryReflectionSetAccessible=true" \
>>>   --conf spark.executor.extraJavaOptions="-Dio.netty.tryReflectionSetAccessible=true" \
>>>   --class org.apache.spark.examples.SparkPi \
>>>   --jars /home/gnana_kumar123/spark/spark-3.2.1-bin-hadoop3.2/examples/jars/spark-examples_2.12-3.2.1.jar
>>>
>>> Thanks
>>> GK
>>>
>>> On Wed, Feb 16, 2022 at 11:11 PM Gnana Kumar <gnana.kumar...@gmail.com> wrote:
>>>
>>>> Hi Mich
>>>>
>>>> Also I would like to run Spark nodes (Master and Worker nodes in
>>>> Kubernetes) and then run my Java Spark application from a JAR file.
>>>>
>>>> Can you please let me know how to specify the JAR file and the MAIN
>>>> class.
>>>>
>>>> Thanks
>>>> GK
>>>>
>>>> On Wed, Feb 16, 2022 at 10:36 PM Gnana Kumar <gnana.kumar...@gmail.com> wrote:
>>>>
>>>>> Hi Mich,
>>>>>
>>>>> I have built the image using the Dockerfile present
>>>>> in spark-3.2.1-bin-hadoop3.2.tgz.
>>>>>
>>>>> Also I have pushed the same image to my docker hub account ie.
>>>>> docker.io/gnanakumar123/spark3.2.1:latest
>>>>>
>>>>> I believe spark-submit can pull the image from docker hub when I run
>>>>> from GKE's Cloud Shell. Please confirm.
>>>>>
>>>>> Below is the command I'm running.
>>>>> ./spark-submit \
>>>>>   --master k8s://$K8S_SERVER \
>>>>>   --deploy-mode cluster \
>>>>>   --name spark-driver-pod \
>>>>>   --class org.apache.spark.examples.SparkPi \
>>>>>   --conf spark.executor.instances=2 \
>>>>>   --conf spark.kubernetes.driver.container.image=docker.io/gnanakumar123/spark3.2.1:latest \
>>>>>   --conf spark.kubernetes.executor.container.image=docker.io/gnanakumar123/spark3.2.1:latest \
>>>>>   --conf spark.kubernetes.container.image=docker.io/gnanakumar123/spark3.2.1:latest \
>>>>>   --conf spark.kubernetes.driver.pod.name=spark-driver-pod \
>>>>>   --conf spark.kubernetes.namespace=spark-demo \
>>>>>   --conf spark.kubernetes.container.image.pullPolicy=Never \
>>>>>   --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
>>>>>   $SPARK_HOME/examples/jars/spark-examples_2.12-3.2.1.jar
>>>>>
>>>>> Thanks
>>>>> GK
>>>>>
>>>>> On Mon, Feb 14, 2022 at 10:50 PM Mich Talebzadeh <mich.talebza...@gmail.com> wrote:
>>>>>
>>>>>> Hi
>>>>>>
>>>>>> It is complaining about the missing driver container image. Does
>>>>>> $SPARK_IMAGE point to a valid image in the GCP container registry?
>>>>>>
>>>>>> Example for a docker image for PySpark
>>>>>>
>>>>>> IMAGEDRIVER="eu.gcr.io/<PROJRECT>/spark-py:3.1.1-scala_2.12-8-jre-slim-buster-java8PlusPackages"
>>>>>>
>>>>>> spark-submit --verbose \
>>>>>>   --properties-file ${property_file} \
>>>>>>   --master k8s://https://$KUBERNETES_MASTER_IP:443 \
>>>>>>   --deploy-mode cluster \
>>>>>>   --name sparkBQ \
>>>>>>   --py-files $CODE_DIRECTORY_CLOUD/spark_on_gke.zip \
>>>>>>   --conf spark.kubernetes.namespace=$NAMESPACE \
>>>>>>   --conf spark.network.timeout=300 \
>>>>>>   --conf spark.executor.instances=$NEXEC \
>>>>>>   --conf spark.kubernetes.allocation.batch.size=3 \
>>>>>>   --conf spark.kubernetes.allocation.batch.delay=1 \
>>>>>>   --conf spark.driver.cores=3 \
>>>>>>   --conf spark.executor.cores=3 \
>>>>>>   --conf spark.driver.memory=8092m \
>>>>>>   --conf spark.executor.memory=8092m \
>>>>>>   --conf spark.dynamicAllocation.enabled=true \
>>>>>>   --conf spark.dynamicAllocation.shuffleTracking.enabled=true \
>>>>>>   --conf spark.kubernetes.driver.container.image=${IMAGEDRIVER} \
>>>>>>   --conf spark.kubernetes.executor.container.image=${IMAGEDRIVER} \
>>>>>>   --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark-bq \
>>>>>>   --conf spark.driver.extraJavaOptions="-Dio.netty.tryReflectionSetAccessible=true" \
>>>>>>   --conf spark.executor.extraJavaOptions="-Dio.netty.tryReflectionSetAccessible=true" \
>>>>>>   $CODE_DIRECTORY_CLOUD/${APPLICATION}
>>>>>>
>>>>>> HTH
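A rough sketch of that point, with placeholder registry path, project and tag (not values from this thread): an image built locally from the Spark Dockerfile has to be pushed to a registry the GKE nodes can pull from before $SPARK_IMAGE / ${IMAGEDRIVER} can point at it, for example Google Container Registry:

gcloud auth configure-docker                        # allow docker to push to gcr.io
docker tag spark:3.2.1 eu.gcr.io/<PROJECT>/spark:3.2.1
docker push eu.gcr.io/<PROJECT>/spark:3.2.1
export SPARK_IMAGE="eu.gcr.io/<PROJECT>/spark:3.2.1"

With the image served from a registry, spark.kubernetes.container.image.pullPolicy=Never (used in the command further up the thread) would also need to be relaxed, e.g. to IfNotPresent, since the cluster nodes will not have the locally built image on disk.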
>>>>>> On Mon, 14 Feb 2022 at 17:04, Gnana Kumar <gnana.kumar...@gmail.com> wrote:
>>>>>>
>>>>>>> Also I'm using the below parameters while submitting the spark job.
>>>>>>>
>>>>>>> spark-submit \
>>>>>>>   --master k8s://$K8S_SERVER \
>>>>>>>   --deploy-mode cluster \
>>>>>>>   --name $POD_NAME \
>>>>>>>   --class org.apache.spark.examples.SparkPi \
>>>>>>>   --conf spark.executor.instances=2 \
>>>>>>>   --conf spark.kubernetes.driver.container.image=$SPARK_IMAGE \
>>>>>>>   --conf spark.kubernetes.executor.container.image=$SPARK_IMAGE \
>>>>>>>   --conf spark.kubernetes.container.image=$SPARK_IMAGE \
>>>>>>>   --conf spark.kubernetes.driver.pod.name=$POD_NAME \
>>>>>>>   --conf spark.kubernetes.namespace=spark-demo \
>>>>>>>   --conf spark.kubernetes.container.image.pullPolicy=Never \
>>>>>>>   --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
>>>>>>>   $SPARK_HOME/examples/jars/spark-examples_2.12-3.2.1.jar
>>>>>>>
>>>>>>> On Mon, Feb 14, 2022 at 9:51 PM Gnana Kumar <gnana.kumar...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Hi There,
>>>>>>>>
>>>>>>>> I have been trying to run Spark 3.2.1 in Google Cloud's Kubernetes
>>>>>>>> cluster, version 1.19 or 1.21, but I kept getting the following error
>>>>>>>> and could not proceed.
>>>>>>>>
>>>>>>>> Please help me resolve this issue.
>>>>>>>>
>>>>>>>> 22/02/14 16:00:48 INFO SparkKubernetesClientFactory:
>>>>>>>> Auto-configuring K8S client using current context from users K8S
>>>>>>>> config file
>>>>>>>> Exception in thread "main" org.apache.spark.SparkException: Must
>>>>>>>> specify the driver container image
>>>>>>>>         at org.apache.spark.deploy.k8s.features.BasicDriverFeatureStep.$anonfun$driverContainerImage$1(BasicDriverFeatureStep.scala:45)
>>>>>>>>         at scala.Option.getOrElse(Option.scala:189)
>>>>>>>>         at org.apache.spark.deploy.k8s.features.BasicDriverFeatureStep.<init>(BasicDriverFeatureStep.scala:45)
>>>>>>>>         at org.apache.spark.deploy.k8s.submit.KubernetesDriverBuilder.buildFromFeatures(KubernetesDriverBuilder.scala:46)
>>>>>>>>         at org.apache.spark.deploy.k8s.submit.Client.run(KubernetesClientApplication.scala:106)
>>>>>>>>         at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.$anonfun$run$4(KubernetesClientApplication.scala:220)
>>>>>>>>         at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.$anonfun$run$4$adapted(KubernetesClientApplication.scala:214)
>>>>>>>>         at org.apache.spark.util.Utils$.tryWithResource(Utils.scala:2713)
>>>>>>>>         at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.run(KubernetesClientApplication.scala:214)
>>>>>>>>         at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.start(KubernetesClientApplication.scala:186)
>>>>>>>>         at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:955)
>>>>>>>>         at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
>>>>>>>>         at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
>>>>>>>>         at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
>>>>>>>>         at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1043)
>>>>>>>>         at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1052)
>>>>>>>>         at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
>>>>>>>>
>>>>>>>> --
>>>>>>>> Thanks
>>>>>>>> Gnana
--
Thanks
Gnana
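For reference, a minimal sketch that pulls the pieces of this thread together; the image path is a placeholder, and it assumes the examples jar is the copy baked into the image (hence the local:// scheme) and that the spark service account from the start of the thread exists in the target namespace:

export SPARK_IMAGE="eu.gcr.io/<PROJECT>/spark:3.2.1"

~/spark/spark-3.2.1-bin-hadoop3.2/bin/spark-submit \
  --master k8s://${K8S_SERVER}:443 \
  --deploy-mode cluster \
  --name spark-pi \
  --class org.apache.spark.examples.SparkPi \
  --conf spark.kubernetes.namespace=default \
  --conf spark.executor.instances=2 \
  --conf spark.kubernetes.driver.pod.name=spark-pi-driver \
  --conf spark.kubernetes.container.image=${SPARK_IMAGE} \
  --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
  local:///opt/spark/examples/jars/spark-examples_2.12-3.2.1.jar

Whether this clears the "Pod ... with name: [null]" error depends on its actual cause; the sketch only shows the shape of a submission in which the jar, the container image and the service account are all resolvable by the cluster, which were the three points raised in the thread.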