I am not using any private Docker image. I am only running the jar file on EMR using the spark-submit command, and now I want to run this jar file on EKS. Can you please tell me how to set this up?
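For reference, a spark-submit against EKS targets the Kubernetes API server rather than YARN. Below is a minimal sketch, not a definitive recipe: the application class com.example.Main, the image name, and the <EKS_API_ENDPOINT>, <ACCOUNT>, and <REGION> values are all placeholders you would substitute with your own.

    # Submit to the EKS (Kubernetes) API server in cluster mode.
    # Endpoint, account, region, image, and class name are placeholders.
    spark-submit \
      --master k8s://https://<EKS_API_ENDPOINT>:443 \
      --deploy-mode cluster \
      --name my-spark-job \
      --class com.example.Main \
      --conf spark.executor.instances=2 \
      --conf spark.kubernetes.namespace=spark \
      --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
      --conf spark.kubernetes.container.image=<ACCOUNT>.dkr.ecr.<REGION>.amazonaws.com/spark:latest \
      local:///opt/spark/jars/my-app.jar

The local:// scheme tells Spark the jar is already inside the container image, which is why the Dockerfile approach in the reply below copies the jar under /opt/spark/jars.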
On Mon, Feb 19, 2024, 8:06 PM Jagannath Majhi <jagannath.ma...@cloud.cbnits.com> wrote:

> Can we connect over Google Meet?
>
> On Mon, Feb 19, 2024, 8:03 PM Mich Talebzadeh <mich.talebza...@gmail.com> wrote:
>
>> Where is your Docker image? In the ECR container registry? If you are
>> going to use EKS, it needs to be accessible to all nodes of the cluster.
>>
>> When you build your Docker image, put your jar under the $SPARK_HOME
>> directory, then add lines like the following to your Dockerfile. Here I
>> am accessing the Google BigQuery DW from an EKS cluster:
>>
>>     # Add a BigQuery connector jar.
>>     ENV SPARK_EXTRA_JARS_DIR=/opt/spark/jars/
>>     ENV SPARK_EXTRA_CLASSPATH='/opt/spark/jars/*'
>>     RUN mkdir -p "${SPARK_EXTRA_JARS_DIR}" \
>>         && chown spark:spark "${SPARK_EXTRA_JARS_DIR}"
>>     COPY --chown=spark:spark \
>>         spark-bigquery-with-dependencies_2.12-0.22.2.jar \
>>         "${SPARK_EXTRA_JARS_DIR}"
>>
>> HTH
>>
>> Mich Talebzadeh,
>> Dad | Technologist | Solutions Architect | Engineer
>> London
>> United Kingdom
>>
>> View my LinkedIn profile
>> <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>
>>
>> https://en.everybodywiki.com/Mich_Talebzadeh
>>
>> *Disclaimer:* The information provided is correct to the best of my
>> knowledge but of course cannot be guaranteed. It is essential to note
>> that, as with any advice: "one test result is worth one-thousand expert
>> opinions" (Wernher von Braun
>> <https://en.wikipedia.org/wiki/Wernher_von_Braun>).
>>
>> On Mon, 19 Feb 2024 at 13:42, Jagannath Majhi <jagannath.ma...@cloud.cbnits.com> wrote:
>>
>>> Dear Spark Community,
>>>
>>> I hope this email finds you well. I am reaching out to seek assistance
>>> and guidance regarding a task I'm currently working on involving
>>> Apache Spark.
>>>
>>> I have developed a JAR file that contains some Spark applications and
>>> functionality, and I need to run this JAR file within a Spark cluster.
>>> However, the JAR file is located in an AWS S3 bucket, and I'm facing
>>> some challenges in configuring Spark to access and execute it directly
>>> from the S3 bucket.
>>>
>>> I would greatly appreciate any advice, best practices, or pointers on
>>> how to achieve this integration effectively. Specifically, I'm looking
>>> for insights on:
>>>
>>> 1. Configuring Spark to access and retrieve the JAR file from an AWS
>>>    S3 bucket.
>>> 2. Setting up the necessary permissions and authentication mechanisms
>>>    to ensure seamless access to the S3 bucket.
>>> 3. Any potential performance considerations or optimizations when
>>>    running Spark applications with dependencies stored in remote
>>>    storage like AWS S3.
>>>
>>> If anyone in the community has prior experience or knowledge in this
>>> area, I would be extremely grateful for your guidance. Additionally,
>>> if there are any relevant resources, documentation, or tutorials that
>>> you could recommend, it would be incredibly helpful.
>>>
>>> Thank you very much for considering my request. I look forward to
>>> hearing from you and benefiting from the collective expertise of the
>>> Spark community.
>>>
>>> Best regards,
>>> Jagannath Majhi
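On the original question of running the jar straight out of S3: Spark can download the application jar from an s3a:// URL at submit time, provided the hadoop-aws and aws-java-sdk-bundle jars are on the image's classpath. Below is a minimal sketch under those assumptions; the bucket, endpoint, image, and class name are placeholders, and it assumes credentials come from IRSA (IAM Roles for Service Accounts) rather than embedded keys.

    # Run the application jar directly from S3. Assumes hadoop-aws and
    # aws-java-sdk-bundle are in the image; bucket, endpoint, image, and
    # class are placeholders.
    spark-submit \
      --master k8s://https://<EKS_API_ENDPOINT>:443 \
      --deploy-mode cluster \
      --class com.example.Main \
      --conf spark.kubernetes.container.image=<ACCOUNT>.dkr.ecr.<REGION>.amazonaws.com/spark:latest \
      --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
      --conf spark.hadoop.fs.s3a.aws.credentials.provider=com.amazonaws.auth.WebIdentityTokenCredentialsProvider \
      s3a://my-bucket/jars/my-app.jar

For the permissions question, annotating the Kubernetes service account with an IAM role that grants s3:GetObject on the bucket avoids putting access keys in the image. On performance, the driver and executors fetch the jar once at startup, so keeping dependencies in S3 mainly adds submit-time latency rather than per-task cost.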