Hi There, I'm getting the below error even though I pass --class and --jars values while submitting a Spark job through spark-submit. Please help.
Exception in thread "main" org.apache.spark.SparkException: Failed to get main class in JAR with error 'File file:/home/gnana_kumar123/spark/ does not exist'. Please specify one with --class.
        at org.apache.spark.deploy.SparkSubmit.error(SparkSubmit.scala:972)
        at org.apache.spark.deploy.SparkSubmit.prepareSubmitEnvironment(SparkSubmit.scala:486)
        at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:898)
        at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
        at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
        at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
        at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1043)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1052)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

The command used:

~/spark/spark-3.2.1-bin-hadoop3.2/bin/spark-submit \
  --master k8s://${K8S_SERVER}:443 \
  --deploy-mode cluster \
  --name sparkBQ \
  --conf spark.kubernetes.namespace=$NAMESPACE \
  --conf spark.network.timeout=300 \
  --conf spark.executor.instances=3 \
  --conf spark.kubernetes.allocation.batch.size=3 \
  --conf spark.kubernetes.allocation.batch.delay=1 \
  --conf spark.driver.cores=3 \
  --conf spark.executor.cores=3 \
  --conf spark.driver.memory=8092m \
  --conf spark.executor.memory=8092m \
  --conf spark.dynamicAllocation.enabled=true \
  --conf spark.dynamicAllocation.shuffleTracking.enabled=true \
  --conf spark.kubernetes.driver.container.image=${SPARK_IMAGE} \
  --conf spark.kubernetes.executor.container.image=${SPARK_IMAGE} \
  --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
  --conf spark.driver.extraJavaOptions="-Dio.netty.tryReflectionSetAccessible=true" \
  --conf spark.executor.extraJavaOptions="-Dio.netty.tryReflectionSetAccessible=true" \
  --class org.apache.spark.examples.SparkPi \
  --jars /home/gnana_kumar123/spark/spark-3.2.1-bin-hadoop3.2/examples/jars/spark-examples_2.12-3.2.1.jar

Thanks
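A likely cause here, for what it's worth: the example JAR is only passed through --jars, which supplies extra dependency JARs, while spark-submit also expects the application JAR itself as the final positional argument (that is the JAR --class is resolved against). A minimal sketch of a corrected submit, under the assumption that the stock Spark 3.2.1 container image is used, which ships the examples under /opt/spark/examples/jars inside the image:

```shell
# Sketch (assumptions: stock Spark image layout, same env vars as above).
# The application JAR is the last positional argument; in cluster mode on
# Kubernetes a local:// URI refers to a path inside the container image.
~/spark/spark-3.2.1-bin-hadoop3.2/bin/spark-submit \
  --master k8s://${K8S_SERVER}:443 \
  --deploy-mode cluster \
  --name sparkBQ \
  --conf spark.kubernetes.namespace=$NAMESPACE \
  --conf spark.kubernetes.driver.container.image=${SPARK_IMAGE} \
  --conf spark.kubernetes.executor.container.image=${SPARK_IMAGE} \
  --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
  --class org.apache.spark.examples.SparkPi \
  local:///opt/spark/examples/jars/spark-examples_2.12-3.2.1.jar
```

A client-side path under /home/... is not visible inside the driver pod in cluster mode, which is why the local:// form is used in this sketch.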
GK

On Wed, Feb 16, 2022 at 11:11 PM Gnana Kumar <gnana.kumar...@gmail.com> wrote:

> Hi Mich
>
> Also, I would like to run the Spark nodes (master and worker nodes) in Kubernetes and then run my Java Spark application from a JAR file.
>
> Can you please let me know how to specify the JAR file and the main class?
>
> Thanks
> GK
>
> On Wed, Feb 16, 2022 at 10:36 PM Gnana Kumar <gnana.kumar...@gmail.com> wrote:
>
>> Hi Mich,
>>
>> I have built the image using the Dockerfile present in spark-3.2.1-bin-hadoop3.2.tgz.
>>
>> I have also pushed the same image to my Docker Hub account, i.e. docker.io/gnanakumar123/spark3.2.1:latest
>>
>> I believe spark-submit can pull the image from Docker Hub when I run from GKE's Cloud Shell. Please confirm.
>>
>> Below is the command I'm running:
>>
>> ./spark-submit \
>>   --master k8s://$K8S_SERVER \
>>   --deploy-mode cluster \
>>   --name spark-driver-pod \
>>   --class org.apache.spark.examples.SparkPi \
>>   --conf spark.executor.instances=2 \
>>   --conf spark.kubernetes.driver.container.image=docker.io/gnanakumar123/spark3.2.1:latest \
>>   --conf spark.kubernetes.executor.container.image=docker.io/gnanakumar123/spark3.2.1:latest \
>>   --conf spark.kubernetes.container.image=docker.io/gnanakumar123/spark3.2.1:latest \
>>   --conf spark.kubernetes.driver.pod.name=spark-driver-pod \
>>   --conf spark.kubernetes.namespace=spark-demo \
>>   --conf spark.kubernetes.container.image.pullPolicy=Never \
>>   --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
>>   $SPARK_HOME/examples/jars/spark-examples_2.12-3.2.1.jar
>>
>> Thanks
>> GK
>>
>> On Mon, Feb 14, 2022 at 10:50 PM Mich Talebzadeh <mich.talebza...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> It is complaining about the missing driver container image. Does $SPARK_IMAGE point to a valid image in the GCP container registry?
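One hedged note on the command quoted above: spark.kubernetes.container.image.pullPolicy=Never tells the kubelet never to pull the image, so an image hosted on Docker Hub will only be found if it is already present on every node. To let the cluster actually pull it from the registry, the usual choice is IfNotPresent (or Always):

```shell
# Sketch: allow Kubernetes to pull the image from Docker Hub when it is
# not already cached on the node (Never would require a pre-loaded image).
--conf spark.kubernetes.container.image.pullPolicy=IfNotPresent
```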
>>>
>>> Example of a Docker image for PySpark:
>>>
>>> IMAGEDRIVER="eu.gcr.io/<PROJECT>/spark-py:3.1.1-scala_2.12-8-jre-slim-buster-java8PlusPackages"
>>>
>>> spark-submit --verbose \
>>>   --properties-file ${property_file} \
>>>   --master k8s://https://$KUBERNETES_MASTER_IP:443 \
>>>   --deploy-mode cluster \
>>>   --name sparkBQ \
>>>   --py-files $CODE_DIRECTORY_CLOUD/spark_on_gke.zip \
>>>   --conf spark.kubernetes.namespace=$NAMESPACE \
>>>   --conf spark.network.timeout=300 \
>>>   --conf spark.executor.instances=$NEXEC \
>>>   --conf spark.kubernetes.allocation.batch.size=3 \
>>>   --conf spark.kubernetes.allocation.batch.delay=1 \
>>>   --conf spark.driver.cores=3 \
>>>   --conf spark.executor.cores=3 \
>>>   --conf spark.driver.memory=8092m \
>>>   --conf spark.executor.memory=8092m \
>>>   --conf spark.dynamicAllocation.enabled=true \
>>>   --conf spark.dynamicAllocation.shuffleTracking.enabled=true \
>>>   --conf spark.kubernetes.driver.container.image=${IMAGEDRIVER} \
>>>   --conf spark.kubernetes.executor.container.image=${IMAGEDRIVER} \
>>>   --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark-bq \
>>>   --conf spark.driver.extraJavaOptions="-Dio.netty.tryReflectionSetAccessible=true" \
>>>   --conf spark.executor.extraJavaOptions="-Dio.netty.tryReflectionSetAccessible=true" \
>>>   $CODE_DIRECTORY_CLOUD/${APPLICATION}
>>>
>>> HTH
>>>
>>> View my LinkedIn profile
>>> <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>
>>>
>>> https://en.everybodywiki.com/Mich_Talebzadeh
>>>
>>> *Disclaimer:* Use it at your own risk. Any and all responsibility for any loss, damage or destruction of data or any other property which may arise from relying on this email's technical content is explicitly disclaimed. The author will in no case be liable for any monetary damages arising from such loss, damage or destruction.
>>>
>>> On Mon, 14 Feb 2022 at 17:04, Gnana Kumar <gnana.kumar...@gmail.com> wrote:
>>>
>>>> Also, I'm using the below parameters while submitting the Spark job:
>>>>
>>>> spark-submit \
>>>>   --master k8s://$K8S_SERVER \
>>>>   --deploy-mode cluster \
>>>>   --name $POD_NAME \
>>>>   --class org.apache.spark.examples.SparkPi \
>>>>   --conf spark.executor.instances=2 \
>>>>   --conf spark.kubernetes.driver.container.image=$SPARK_IMAGE \
>>>>   --conf spark.kubernetes.executor.container.image=$SPARK_IMAGE \
>>>>   --conf spark.kubernetes.container.image=$SPARK_IMAGE \
>>>>   --conf spark.kubernetes.driver.pod.name=$POD_NAME \
>>>>   --conf spark.kubernetes.namespace=spark-demo \
>>>>   --conf spark.kubernetes.container.image.pullPolicy=Never \
>>>>   --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
>>>>   $SPARK_HOME/examples/jars/spark-examples_2.12-3.2.1.jar
>>>>
>>>> On Mon, Feb 14, 2022 at 9:51 PM Gnana Kumar <gnana.kumar...@gmail.com> wrote:
>>>>
>>>>> Hi There,
>>>>>
>>>>> I have been trying to run Spark 3.2.1 in Google Cloud's Kubernetes cluster, version 1.19 or 1.21,
>>>>> but I kept getting the following error and could not proceed.
>>>>>
>>>>> Please help me resolve this issue.
>>>>>
>>>>> 22/02/14 16:00:48 INFO SparkKubernetesClientFactory: Auto-configuring K8S client using current context from users K8S config file
>>>>> Exception in thread "main" org.apache.spark.SparkException: Must specify the driver container image
>>>>>         at org.apache.spark.deploy.k8s.features.BasicDriverFeatureStep.$anonfun$driverContainerImage$1(BasicDriverFeatureStep.scala:45)
>>>>>         at scala.Option.getOrElse(Option.scala:189)
>>>>>         at org.apache.spark.deploy.k8s.features.BasicDriverFeatureStep.<init>(BasicDriverFeatureStep.scala:45)
>>>>>         at org.apache.spark.deploy.k8s.submit.KubernetesDriverBuilder.buildFromFeatures(KubernetesDriverBuilder.scala:46)
>>>>>         at org.apache.spark.deploy.k8s.submit.Client.run(KubernetesClientApplication.scala:106)
>>>>>         at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.$anonfun$run$4(KubernetesClientApplication.scala:220)
>>>>>         at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.$anonfun$run$4$adapted(KubernetesClientApplication.scala:214)
>>>>>         at org.apache.spark.util.Utils$.tryWithResource(Utils.scala:2713)
>>>>>         at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.run(KubernetesClientApplication.scala:214)
>>>>>         at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.start(KubernetesClientApplication.scala:186)
>>>>>         at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:955)
>>>>>         at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
>>>>>         at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
>>>>>         at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
>>>>>         at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1043)
>>>>>         at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1052)
>>>>>         at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
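For reference, this exception is raised when no container image setting reaches spark-submit at all. A minimal sketch of the properties that satisfy it (the image name is a placeholder, not from the thread):

```shell
# Sketch: spark.kubernetes.container.image covers both driver and executors;
# the driver- and executor-specific keys override it when set.
--conf spark.kubernetes.container.image=docker.io/<repo>/spark:3.2.1
# Or, equivalently, per role:
--conf spark.kubernetes.driver.container.image=docker.io/<repo>/spark:3.2.1
--conf spark.kubernetes.executor.container.image=docker.io/<repo>/spark:3.2.1
```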
>>>>>
>>>>> --
>>>>> Thanks
>>>>> Gnana

--
Thanks
Gnana