Thanks, David. I wanted to explain the difference between a package (--packages) and a jar (--jars), using comments from the community in a discussion from a few years ago, forwarded below.
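
As far as I understand it, --jars adds only the listed files to the classpath, whereas --packages resolves the Maven coordinates together with their transitive dependencies. That would explain why a class can be missing in one case and found in the other. A quick way to test Jeff's question is to look inside each jar for the class. Below is a minimal sketch in Python, assuming the jar paths from the forwarded message; the helper name jars_containing is my own and not part of Spark or any library.

```python
import zipfile
from pathlib import Path


def jars_containing(class_name, jar_paths):
    """Return the jar paths whose archive holds the given class."""
    # A class a.b.C is stored inside a jar as the entry a/b/C.class
    entry = class_name.replace(".", "/") + ".class"
    hits = []
    for jar in jar_paths:
        path = Path(jar)
        if not path.is_file():
            continue  # skip paths that do not exist on this machine
        # A jar is a zip archive, so the standard zipfile module can read it
        with zipfile.ZipFile(path) as archive:
            if entry in archive.namelist():
                hits.append(str(path))
    return hits


if __name__ == "__main__":
    # Paths taken from the spark-submit command in the forwarded message
    jars = [
        "/home/hduser/jars/spark-bigquery-latest.jar",
        "/home/hduser/jars/ddhybrid.jar",
        "/home/hduser/jars/spark-bigquery_2.11-0.2.6.jar",
    ]
    wanted = "com.google.api.client.http.HttpRequestInitializer"
    print(jars_containing(wanted, jars))
```

If the list printed is empty, none of the jars passed with --jars carries the class, and it would have to come from a transitive dependency, which only --packages pulls in.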

Cheers,

Mich Talebzadeh,
Technologist | Architect | Data Engineer | Generative AI | FinCrime
London
United Kingdom

view my Linkedin profile
<https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>

https://en.everybodywiki.com/Mich_Talebzadeh

*Disclaimer:* The information provided is correct to the best of my
knowledge but of course cannot be guaranteed. It is essential to note
that, as with any advice, quote "one test result is worth one-thousand
expert opinions (Wernher von Braun
<https://en.wikipedia.org/wiki/Wernher_von_Braun>)".

On Mon, 6 May 2024 at 18:32, David Rabinowitz <david...@google.com> wrote:

> Hi,
>
> It seems this library is several years old. Have you considered using the
> Google provided connector? You can find it in
> https://github.com/GoogleCloudDataproc/spark-bigquery-connector
>
> Regards,
> David Rabinowitz
>
> On Sun, May 5, 2024 at 6:07 PM Jeff Zhang <zjf...@gmail.com> wrote:
>
>> Are you sure com.google.api.client.http.HttpRequestInitializer is in
>> the spark-bigquery-latest.jar or it may be in the transitive dependency
>> of spark-bigquery_2.11?
>>
>> On Sat, May 4, 2024 at 7:43 PM Mich Talebzadeh <mich.talebza...@gmail.com>
>> wrote:
>>
>>> ---------- Forwarded message ---------
>>> From: Mich Talebzadeh <mich.talebza...@gmail.com>
>>> Date: Tue, 20 Oct 2020 at 16:50
>>> Subject: Why spark-submit works with package not with jar
>>> To: user @spark <u...@spark.apache.org>
>>>
>>> Hi,
>>>
>>> I have a scenario that I use in Spark submit as follows:
>>>
>>> spark-submit --driver-class-path /home/hduser/jars/ddhybrid.jar --jars
>>> /home/hduser/jars/spark-bigquery-latest.jar,/home/hduser/jars/ddhybrid.jar,
>>> */home/hduser/jars/spark-bigquery_2.11-0.2.6.jar*
>>>
>>> As you can see, the jar files needed are added.
>>>
>>> This comes back with the error message below:
>>>
>>> Creating model test.weights_MODEL
>>>
>>> java.lang.NoClassDefFoundError: com/google/api/client/http/HttpRequestInitializer
>>>   at com.samelamin.spark.bigquery.BigQuerySQLContext.bq$lzycompute(BigQuerySQLContext.scala:19)
>>>   at com.samelamin.spark.bigquery.BigQuerySQLContext.bq(BigQuerySQLContext.scala:19)
>>>   at com.samelamin.spark.bigquery.BigQuerySQLContext.runDMLQuery(BigQuerySQLContext.scala:105)
>>>   ... 76 elided
>>> Caused by: java.lang.ClassNotFoundException: com.google.api.client.http.HttpRequestInitializer
>>>   at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
>>>   at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>>>   at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>>>
>>> So there is an issue with finding the class, although the jar file used
>>>
>>> /home/hduser/jars/spark-bigquery_2.11-0.2.6.jar
>>>
>>> has it.
>>>
>>> Now if *I remove the above jar file and replace it with the same
>>> version but as a package*, it works!
>>>
>>> spark-submit --driver-class-path /home/hduser/jars/ddhybrid.jar --jars
>>> /home/hduser/jars/spark-bigquery-latest.jar,/home/hduser/jars/ddhybrid.jar
>>> *--packages com.github.samelamin:spark-bigquery_2.11:0.2.6*
>>>
>>> I have read the write-ups about packages searching the Maven
>>> libraries etc. I am not convinced why using the package should make the
>>> difference between failure and success. In other words, when should one
>>> use a package rather than a jar?
>>>
>>> Any ideas will be appreciated.
>>>
>>> Thanks
>>>
>>> *Disclaimer:* Use it at your own risk. Any and all responsibility for
>>> any loss, damage or destruction of data or any other property which may
>>> arise from relying on this email's technical content is explicitly
>>> disclaimed. The author will in no case be liable for any monetary damages
>>> arising from such loss, damage or destruction.
>>>
>>
>> --
>> Best Regards
>>
>> Jeff Zhang
>