Sure, but first it would be beneficial to understand how Spark works on Kubernetes and the concepts involved.
Have a look at my article, Spark on Kubernetes, A Practitioner's Guide
<https://www.linkedin.com/pulse/spark-kubernetes-practitioners-guide-mich-talebzadeh-ph-d-%3FtrackingId=Wsu3lkoPaCWqGemYHe8%252BLQ%253D%253D/?trackingId=Wsu3lkoPaCWqGemYHe8%2BLQ%3D%3D>

HTH

Mich Talebzadeh,
Dad | Technologist | Solutions Architect | Engineer
London, United Kingdom

view my LinkedIn profile
<https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>

https://en.everybodywiki.com/Mich_Talebzadeh

*Disclaimer:* The information provided is correct to the best of my knowledge but of course cannot be guaranteed. It is essential to note that, as with any advice: "one test result is worth one-thousand expert opinions" (Wernher von Braun <https://en.wikipedia.org/wiki/Wernher_von_Braun>).

On Mon, 19 Feb 2024 at 15:09, Jagannath Majhi <jagannath.ma...@cloud.cbnits.com> wrote:

> Yes
>
> On Mon, Feb 19, 2024, 8:35 PM Mich Talebzadeh <mich.talebza...@gmail.com> wrote:
>
>> OK, so you have a jar file that you want to run with Spark on Kubernetes (EKS) as the execution engine, as opposed to YARN on EMR?
>>
>> On Mon, 19 Feb 2024 at 14:38, Jagannath Majhi <jagannath.ma...@cloud.cbnits.com> wrote:
>>
>>> I am not using any private docker image.
>>> I am only running the jar file in EMR using the spark-submit command, and now I want to run it in EKS. Can you please tell me how to set this up?
>>>
>>> On Mon, Feb 19, 2024, 8:06 PM Jagannath Majhi <jagannath.ma...@cloud.cbnits.com> wrote:
>>>
>>>> Can we connect over Google Meet?
>>>>
>>>> On Mon, Feb 19, 2024, 8:03 PM Mich Talebzadeh <mich.talebza...@gmail.com> wrote:
>>>>
>>>>> Where is your docker file? In the ECR container registry?
>>>>> If you are going to use EKS, it needs to be accessible to all nodes of the cluster.
>>>>>
>>>>> When you build your docker image, put your jar under the $SPARK_HOME directory, then add lines such as the following to your Dockerfile. Here I am accessing the Google BigQuery DW from an EKS cluster:
>>>>>
>>>>> # Add a BigQuery connector jar.
>>>>> ENV SPARK_EXTRA_JARS_DIR=/opt/spark/jars/
>>>>> ENV SPARK_EXTRA_CLASSPATH='/opt/spark/jars/*'
>>>>> RUN mkdir -p "${SPARK_EXTRA_JARS_DIR}" \
>>>>>     && chown spark:spark "${SPARK_EXTRA_JARS_DIR}"
>>>>> COPY --chown=spark:spark \
>>>>>     spark-bigquery-with-dependencies_2.12-0.22.2.jar "${SPARK_EXTRA_JARS_DIR}"
>>>>>
>>>>> HTH
>>>>>
>>>>> Mich Talebzadeh
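[The Dockerfile snippet above bakes the jar into the image, so every driver and executor pod sees it under /opt/spark/jars. As a sketch only — the cluster endpoint, image URI, namespace, service account, and application class below are hypothetical placeholders, not values from this thread — submitting such an image to an EKS cluster could look like:]

```shell
#!/bin/sh
# Sketch, not taken from the thread: every <...> value, the namespace,
# service account, and class name are hypothetical placeholders.
spark-submit \
  --master k8s://https://<EKS_API_SERVER_ENDPOINT>:443 \
  --deploy-mode cluster \
  --name my-spark-app \
  --class com.example.MyApp \
  --conf spark.kubernetes.namespace=spark \
  --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
  --conf spark.kubernetes.container.image=<ACCOUNT>.dkr.ecr.<REGION>.amazonaws.com/spark-app:latest \
  --conf spark.executor.instances=2 \
  local:///opt/spark/jars/my-app.jar
```

[The local:// scheme tells Spark the jar is already present inside the container image, so nothing needs to be fetched at submit time.]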
>>>>>
>>>>> On Mon, 19 Feb 2024 at 13:42, Jagannath Majhi <jagannath.ma...@cloud.cbnits.com> wrote:
>>>>>
>>>>>> Dear Spark Community,
>>>>>>
>>>>>> I hope this email finds you well. I am reaching out to seek assistance and guidance regarding a task I'm currently working on involving Apache Spark.
>>>>>>
>>>>>> I have developed a JAR file that contains some Spark applications and functionality, and I need to run this JAR file within a Spark cluster. However, the JAR file is located in an AWS S3 bucket, and I'm facing some challenges in configuring Spark to access and execute it directly from the bucket.
>>>>>>
>>>>>> I would greatly appreciate any advice, best practices, or pointers on how to achieve this integration effectively. Specifically, I'm looking for insights on:
>>>>>>
>>>>>> 1. Configuring Spark to access and retrieve the JAR file from an AWS S3 bucket.
>>>>>> 2. Setting up the necessary permissions and authentication mechanisms to ensure seamless access to the S3 bucket.
>>>>>> 3. Any potential performance considerations or optimizations when running Spark applications with dependencies stored in remote storage like AWS S3.
>>>>>>
>>>>>> If anyone in the community has prior experience or knowledge in this area, I would be extremely grateful for your guidance. Additionally, if there are any relevant resources, documentation, or tutorials you could recommend, that would be incredibly helpful.
>>>>>>
>>>>>> Thank you very much for considering my request. I look forward to hearing from you and benefiting from the collective expertise of the Spark community.
>>>>>>
>>>>>> Best regards,
>>>>>> Jagannath Majhi
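[On the original question — running the jar straight from S3 rather than baking it into the image — Spark can fetch an application jar via the s3a:// scheme, provided hadoop-aws and its matching AWS SDK bundle are available to the driver and executors and the pods can authenticate to S3 (for example through an IAM role mapped to the Kubernetes service account). A hedged sketch only; the bucket, class, image, and endpoint are placeholders, not values from this thread:]

```shell
#!/bin/sh
# Sketch only: all <...> values and the class name are hypothetical.
# Assumes the Spark image ships hadoop-aws + aws-sdk-bundle, and the
# pod's service account maps to an IAM role with s3:GetObject on the bucket.
spark-submit \
  --master k8s://https://<EKS_API_SERVER_ENDPOINT>:443 \
  --deploy-mode cluster \
  --name s3-jar-app \
  --class com.example.MyApp \
  --conf spark.kubernetes.namespace=spark \
  --conf spark.kubernetes.container.image=<ACCOUNT>.dkr.ecr.<REGION>.amazonaws.com/spark:3.5.0 \
  --conf spark.hadoop.fs.s3a.aws.credentials.provider=com.amazonaws.auth.WebIdentityTokenCredentialsProvider \
  s3a://<MY_BUCKET>/jars/my-app.jar
```

[On the performance point: the jar is downloaded from S3 when the job starts, so for large or frequently reused dependency jars, baking them into the image (as in the Dockerfile approach earlier in the thread) avoids the repeated fetch.]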