Thanks, Chris, for your guidance. Your Maven commands worked really well. I was able to build the source and generate the binary distribution using the steps below.
>mvn -Denforcer.skip=true -DrecompileMode=all -Pkubernetes -Pvolcano -Pscala-2.12 -DskipTests clean package
>dev/make-distribution.sh -Denforcer.skip=true -DrecompileMode=all -Pkubernetes -Pvolcano -Pscala-2.12 -DskipTests
>docker build -t spark3.3.2_gnana_volcano_scheduler_snapshot -f kubernetes/dockerfiles/spark/Dockerfile .
>docker push spark3.3.2_gnana_volcano_scheduler_snapshot:latest

spark_3.3.2/bin/spark-submit \
  --verbose \
  --class com.demo.spark.SpringBootStarter \
  --master k8s://https://$KUBERNETES_MASTER_IP:443 \
  --deploy-mode cluster \
  --name sparkSampleApp2 \
  --conf spark.kubernetes.namespace=default \
  --conf spark.network.timeout=300 \
  --conf spark.executor.instances=1 \
  --conf spark.kubernetes.scheduler.name=volcano \
  --conf spark.kubernetes.scheduler.volcano.podGroupTemplateFile=/home/gnana_kumar123/spark/volcano_spark_podgroup_template_low_priority.yaml \
  --conf spark.kubernetes.driver.pod.featureSteps=org.apache.spark.deploy.k8s.features.VolcanoFeatureStep \
  --conf spark.kubernetes.executor.pod.featureSteps=org.apache.spark.deploy.k8s.features.VolcanoFeatureStep \

But unfortunately, when I run spark-submit to launch the Spark job in the Kubernetes cluster, it fails with a ClassNotFoundException. Please find the exception from spark-submit below:

Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
22/11/20 12:40:52 INFO SparkKubernetesClientFactory: Auto-configuring K8S client using current context from users K8S config file
Exception in thread "main" java.lang.ClassNotFoundException: org.apache.spark.deploy.k8s.features.VolcanoFeatureStep

So, I have extracted the jar from dist/jars and verified that the spark-kubernetes jar contains VolcanoFeatureStep.class, but I'm not sure whether it is actually packed into the Docker image.
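One way to confirm whether VolcanoFeatureStep really got packaged is to inspect the jar directly as a zip archive. The sketch below is a minimal, hypothetical helper — the jar path is an assumption based on the dist/jars layout and the spark-kubernetes_2.12-3.3.2-SNAPSHOT naming mentioned in this thread; adjust it to wherever your build placed the jar:

```python
import os
import zipfile

# Assumed paths -- adjust to your build output. The jar name follows the
# spark-kubernetes_2.12-3.3.2-SNAPSHOT naming used earlier in this thread.
JAR = "dist/jars/spark-kubernetes_2.12-3.3.2-SNAPSHOT.jar"
CLASS = "org/apache/spark/deploy/k8s/features/VolcanoFeatureStep.class"

def jar_contains(jar_path: str, entry: str) -> bool:
    """Return True if the given entry exists inside the jar (jars are zip files)."""
    with zipfile.ZipFile(jar_path) as jar:
        return entry in jar.namelist()

if __name__ == "__main__" and os.path.exists(JAR):
    print("VolcanoFeatureStep packaged:", jar_contains(JAR, CLASS))
```

The same check can be run against a jar copied out of the built image (e.g. via `docker cp` from a container, typically under /opt/spark/jars for images built from the bundled Dockerfile) to confirm that the image itself, not just dist/jars, carries the class.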
Thanks
Gnana

On Sat, Nov 19, 2022 at 12:27 AM Chris Nauroth <cnaur...@apache.org> wrote:

> Hello Gnana,
>
> I'm bringing this thread back to the user@ list for the benefit of anyone
> else who might want to try this feature.
>
> Running this from the root of the source tree should give you a working
> full build with Kubernetes and the experimental Volcano feature, using
> Scala 2.12:
>
> build/mvn -Pkubernetes -Pvolcano -Pscala-2.12 -DskipTests clean package
>
> If you want to use Scala 2.13, it would be this:
>
> dev/change-scala-version.sh 2.13
> build/mvn -Pkubernetes -Pvolcano -Pscala-2.13 -DskipTests clean package
>
> I don't expect you'd need to replace all jars in your deployment. However,
> in addition to spark-kubernetes.jar, I expect you'll need to get the
> Volcano client classes onto the classpath. Those are in
> volcano-client-5.12.2.jar and volcano-model-v1beta1-5.12.2.jar.
>
> I haven't tested this new feature myself, so I don't know if there are
> other steps you'll hit after this. Speaking just in terms of what the build
> does though, this should be sufficient.
>
> I hope this helps.
>
> Chris Nauroth
>
> On Thu, Nov 17, 2022 at 11:32 PM Gnana Kumar <gnana.kumar...@gmail.com> wrote:
>
>> I have maven built the spark-kubernetes jar
>> (spark-kubernetes_2.12-3.3.2-SNAPSHOT) but when I build the parent spark
>> directory, the build fails.
>>
>> mvn clean install -Denforcer.skip=true -Pvolcano -DskipTests -Dcheckstyle.skip
>>
>> On Fri, Nov 18, 2022 at 12:47 PM Gnana Kumar <gnana.kumar...@gmail.com> wrote:
>>
>>> Also please confirm if I have to use the SNAPSHOT version of all Spark
>>> jars for Volcano scheduling or only the kubernetes jar
>>> (spark-kubernetes_2.12-3.3.2-SNAPSHOT.jar) alone is enough to perform
>>> scheduling.
>>>
>>> Thanks
>>> Gnana
>>>
>>> On Fri, Nov 18, 2022 at 10:27 AM Gnana Kumar <gnana.kumar...@gmail.com> wrote:
>>>
>>>> Hi Chris,
>>>>
>>>> Thanks for the clarification.
>>>>
>>>> I have tried the below steps but getting below error. Please help me to
>>>> resolve this error and I would need the Volcano feature available in my
>>>> Spark-Kubernetes jar.
>>>>
>>>> >git clone https://github.com/apache/spark.git -b branch-3.3
>>>> >cd spark
>>>> >mvn clean install -Denforcer.skip=true -Pvolcano -DskipTests
>>>>
>>>> [INFO] ------------------------------------------------------------------------
>>>> [INFO] Reactor Summary for Spark Project Parent POM 3.3.2-SNAPSHOT:
>>>> [INFO]
>>>> [INFO] Spark Project Parent POM ........................... SUCCESS [02:02 min]
>>>> [INFO] Spark Project Tags ................................. FAILURE [ 16.548 s]
>>>> [INFO] Spark Project Sketch ............................... SKIPPED
>>>> [INFO] Spark Project Local DB ............................. SKIPPED
>>>> [INFO] Spark Project Networking ........................... SKIPPED
>>>> [INFO] Spark Project Shuffle Streaming Service ............ SKIPPED
>>>> [INFO] Spark Project Unsafe ............................... SKIPPED
>>>> [INFO] Spark Project Launcher ............................. SKIPPED
>>>> [INFO] Spark Project Core ................................. SKIPPED
>>>> [INFO] Spark Project ML Local Library ..................... SKIPPED
>>>> [INFO] Spark Project GraphX ............................... SKIPPED
>>>> [INFO] Spark Project Streaming ............................ SKIPPED
>>>> [INFO] Spark Project Catalyst ............................. SKIPPED
>>>> [INFO] Spark Project SQL .................................. SKIPPED
>>>> [INFO] Spark Project ML Library ........................... SKIPPED
>>>> [INFO] Spark Project Tools ................................ SKIPPED
>>>> [INFO] Spark Project Hive ................................. SKIPPED
>>>> [INFO] Spark Project REPL ................................. SKIPPED
>>>> [INFO] Spark Project Assembly ............................. SKIPPED
>>>> [INFO] Kafka 0.10+ Token Provider for Streaming ........... SKIPPED
>>>> [INFO] Spark Integration for Kafka 0.10 ................... SKIPPED
>>>> [INFO] Kafka 0.10+ Source for Structured Streaming ........ SKIPPED
>>>> [INFO] Spark Project Examples ............................. SKIPPED
>>>> [INFO] Spark Integration for Kafka 0.10 Assembly .......... SKIPPED
>>>> [INFO] Spark Avro ......................................... SKIPPED
>>>> [INFO] ------------------------------------------------------------------------
>>>> [INFO] BUILD FAILURE
>>>> [INFO] ------------------------------------------------------------------------
>>>> [INFO] Total time: 02:21 min
>>>> [INFO] Finished at: 2022-11-18T10:23:08+05:30
>>>> [INFO] ------------------------------------------------------------------------
>>>> [WARNING] The requested profile "volcano" could not be activated
>>>> because it does not exist.
>>>> [ERROR] Failed to execute goal
>>>> net.alchim31.maven:scala-maven-plugin:4.4.0:compile (scala-compile-first)
>>>> on project spark-tags_2.12: Execution scala-compile-first of goal
>>>> net.alchim31.maven:scala-maven-plugin:4.4.0:compile failed: An API
>>>> incompatibility was encountered while executing
>>>> net.alchim31.maven:scala-maven-plugin:4.4.0:compile:
>>>> java.lang.NoSuchMethodError:
>>>> org.fusesource.jansi.AnsiConsole.wrapOutputStream(Ljava/io/OutputStream;)Ljava/io/OutputStream;
>>>>
>>>> Thanks
>>>> Gnana
>>>>
>>>> On Thu, Nov 17, 2022 at 5:09 AM Chris Nauroth <cnaur...@apache.org> wrote:
>>>>
>>>>> Hello Gnana,
>>>>>
>>>>> I think it's intentional that this is excluded from the binary
>>>>> release. By default, the build excludes this class [1]. It must be enabled
>>>>> in the build by activating a Maven profile [2]. The release script does not
>>>>> activate this profile [3].
>>>>>
>>>>> See the relevant pull requests ([4], [5]) for discussion of how this
>>>>> feature is considered experimental and therefore excluded by default from
>>>>> the previously GA'd 3.3 release line. If you want to use the feature, you
>>>>> still have the option of building from source with the -Pvolcano profile
>>>>> activated.
>>>>>
>>>>> [1] https://github.com/apache/spark/blob/branch-3.3/resource-managers/kubernetes/core/pom.xml#L135
>>>>> [2] https://github.com/apache/spark/blob/branch-3.3/resource-managers/kubernetes/core/pom.xml#L35-L54
>>>>> [3] https://github.com/apache/spark/blob/branch-3.3/dev/create-release/release-build.sh
>>>>> [4] https://github.com/apache/spark/pull/34456
>>>>> [5] https://github.com/apache/spark/pull/35422
>>>>>
>>>>> Chris Nauroth
>>>>>
>>>>> On Wed, Nov 16, 2022 at 7:23 AM Gnana Kumar <gnana.kumar...@gmail.com> wrote:
>>>>>
>>>>>> Hi There,
>>>>>>
>>>>>> I have installed Spark 3.3.1 and tried to use the following
>>>>>> configuration in Spark Submit for a spark job to run in Kubernetes Cluster,
>>>>>> and I have got class not found exception for the reference
>>>>>> "VolcanoFeatureStep":
>>>>>>
>>>>>> --conf spark.kubernetes.driver.pod.featureSteps=org.apache.spark.deploy.k8s.features.VolcanoFeatureStep
>>>>>> --conf spark.kubernetes.executor.pod.featureSteps=org.apache.spark.deploy.k8s.features.VolcanoFeatureStep
>>>>>>
>>>>>> When I unzipped the Spark 3.3.1 archive, I could not see the
>>>>>> VolcanoFeatureStep.class file.
>>>>>>
>>>>>> May I know if the Volcano feature has been released in v3.3.1? How to
>>>>>> resolve this class not found exception?
>>>>>> Kindly help resolving this issue.
>>>>>>
>>>>>> Thanks
>>>>>> Gnana
>>>>>
>>>>
>>>> --
>>>> Thanks
>>>> Gnana
>>>
>>> --
>>> Thanks
>>> Gnana
>>
>> --
>> Thanks
>> Gnana

--
Thanks
Gnana