From the looks of it, it's the com.google.http-client ones. But there may be more. You should not have to reason about this. That's why you let Maven / Ivy resolution figure it out. It is not true that everything in .ivy2 is on the classpath.
On Tue, Oct 20, 2020 at 3:48 PM Mich Talebzadeh <mich.talebza...@gmail.com> wrote:

> Hi Nicolas,
>
> I removed ~/.ivy2 and reran the Spark job with the package included (the
> one that works).
>
> Under ~/.ivy2/jars I have 37 jar files, including the one that I had
> before:
>
> /home/hduser/.ivy2/jars> ls
> com.databricks_spark-avro_2.11-4.0.0.jar
> com.google.cloud.bigdataoss_gcs-connector-1.9.4-hadoop2.jar
> com.google.oauth-client_google-oauth-client-1.24.1.jar
> org.checkerframework_checker-qual-2.5.2.jar
> com.fasterxml.jackson.core_jackson-core-2.9.2.jar
> com.google.cloud.bigdataoss_gcsio-1.9.4.jar
> com.google.oauth-client_google-oauth-client-java6-1.24.1.jar
> org.codehaus.jackson_jackson-core-asl-1.9.13.jar
> com.github.samelamin_spark-bigquery_2.11-0.2.6.jar
> com.google.cloud.bigdataoss_util-1.9.4.jar
> commons-codec_commons-codec-1.6.jar
> org.codehaus.jackson_jackson-mapper-asl-1.9.13.jar
> com.google.api-client_google-api-client-1.24.1.jar
> com.google.cloud.bigdataoss_util-hadoop-1.9.4-hadoop2.jar
> commons-logging_commons-logging-1.1.1.jar
> org.codehaus.mojo_animal-sniffer-annotations-1.14.jar
> com.google.api-client_google-api-client-jackson2-1.24.1.jar
> com.google.code.findbugs_jsr305-3.0.2.jar
> com.thoughtworks.paranamer_paranamer-2.3.jar
> org.slf4j_slf4j-api-1.7.5.jar
> com.google.api-client_google-api-client-java6-1.24.1.jar
> com.google.errorprone_error_prone_annotations-2.1.3.jar
> joda-time_joda-time-2.9.3.jar
> org.tukaani_xz-1.0.jar
> com.google.apis_google-api-services-bigquery-v2-rev398-1.24.1.jar
> com.google.guava_guava-26.0-jre.jar
> org.apache.avro_avro-1.7.6.jar
> org.xerial.snappy_snappy-java-1.0.5.jar
> com.google.apis_google-api-services-storage-v1-rev135-1.24.1.jar
> com.google.http-client_google-http-client-1.24.1.jar
> org.apache.commons_commons-compress-1.4.1.jar
> com.google.auto.value_auto-value-annotations-1.6.2.jar
> com.google.http-client_google-http-client-jackson2-1.24.1.jar
> org.apache.httpcomponents_httpclient-4.0.1.jar
> com.google.cloud.bigdataoss_bigquery-connector-0.13.4-hadoop2.jar
> com.google.j2objc_j2objc-annotations-1.1.jar
> org.apache.httpcomponents_httpcore-4.0.1.jar
>
> I don't think I need to add all of these to the spark-submit --jars list.
> Is there a way I can find out which dependency is missing?
>
> This is the error I am getting when I use the jar file
> *com.github.samelamin_spark-bigquery_2.11-0.2.6.jar* instead of the
> package *com.github.samelamin:spark-bigquery_2.11:0.2.6*:
>
> java.lang.NoClassDefFoundError: com/google/api/client/http/HttpRequestInitializer
>   at com.samelamin.spark.bigquery.BigQuerySQLContext.bq$lzycompute(BigQuerySQLContext.scala:19)
>   at com.samelamin.spark.bigquery.BigQuerySQLContext.bq(BigQuerySQLContext.scala:19)
>   at com.samelamin.spark.bigquery.BigQuerySQLContext.runDMLQuery(BigQuerySQLContext.scala:105)
>   ... 76 elided
> Caused by: java.lang.ClassNotFoundException: com.google.api.client.http.HttpRequestInitializer
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>
> Thanks
>
> *Disclaimer:* Use it at your own risk. Any and all responsibility for any
> loss, damage or destruction of data or any other property which may arise
> from relying on this email's technical content is explicitly disclaimed.
> The author will in no case be liable for any monetary damages arising from
> such loss, damage or destruction.
>
> On Tue, 20 Oct 2020 at 20:09, Nicolas Paris <nicolas.pa...@riseup.net> wrote:
>
>> Once you have the jars from --packages in the ~/.ivy2 folder, you can then
>> add the list to --jars; this way there is no missing dependency.
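Nicolas's suggestion above can be scripted rather than typed out: collect everything Ivy resolved into a single comma-separated value for --jars (which takes one comma-separated token, no spaces). A minimal sketch; the helper name `join_jars` is invented:

```shell
# Sketch: build a comma-separated --jars value from every jar Ivy resolved
# into a directory. The helper name join_jars is illustrative.
join_jars() {
  # one path per line from ls, joined with commas by paste
  ls "$1"/*.jar 2>/dev/null | paste -s -d, -
}

# e.g. spark-submit --jars "$(join_jars "$HOME/.ivy2/jars")" ...
```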
>>
>> ayan guha <guha.a...@gmail.com> writes:
>>
>> > Hi
>> >
>> > One way to think of this: --packages is better when you have a third-party
>> > dependency, and --jars is better when you have custom in-house built jars.
>> >
>> > On Wed, 21 Oct 2020 at 3:44 am, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:
>> >
>> >> Thanks Sean and Russell. Much appreciated.
>> >>
>> >> Just to clarify: recently I had issues with different versions of Google
>> >> Guava jar files when building an uber jar file (to evict the unwanted
>> >> ones). This used to work a year and a half ago using Google Dataproc
>> >> compute engines (which come with Spark preloaded) and I could create an
>> >> uber jar file.
>> >>
>> >> Unfortunately this has become problematic now, so I tried to use
>> >> spark-submit instead, as follows:
>> >>
>> >> ${SPARK_HOME}/bin/spark-submit \
>> >>   --master yarn \
>> >>   --deploy-mode client \
>> >>   --conf spark.executor.memoryOverhead=3000 \
>> >>   --class org.apache.spark.repl.Main \
>> >>   --name "Spark shell on Yarn" "$@"
>> >>   --driver-class-path /home/hduser/jars/ddhybrid.jar \
>> >>   --jars /home/hduser/jars/spark-bigquery-latest.jar, \
>> >>   /home/hduser/jars/ddhybrid.jar \
>> >>   --packages com.github.samelamin:spark-bigquery_2.11:0.2.6
>> >>
>> >> Effectively a tailored spark-shell. However, I do not think there is a
>> >> mechanism to resolve jar conflicts without building an uber jar file
>> >> through SBT?
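As quoted, that command also has two shell-level problems that would bite regardless of --jars vs --packages (possibly just copy-paste artifacts): the line ending in "$@" has no trailing backslash, so nothing after it reaches spark-submit, and the --jars value is split across lines, so ddhybrid.jar is not actually part of the list. A hedged rewrite with the same paths and coordinates, wrapped in a function (the name `submit_spark_shell` is invented) so it can be reviewed without launching anything:

```shell
# Sketch of the same invocation with the shell continuations fixed: every
# continued line needs a trailing backslash, and --jars must be a single
# comma-separated token. Paths and the package coordinate are from the thread.
submit_spark_shell() {
  "${SPARK_HOME}/bin/spark-submit" \
    --master yarn \
    --deploy-mode client \
    --conf spark.executor.memoryOverhead=3000 \
    --driver-class-path /home/hduser/jars/ddhybrid.jar \
    --jars /home/hduser/jars/spark-bigquery-latest.jar,/home/hduser/jars/ddhybrid.jar \
    --packages com.github.samelamin:spark-bigquery_2.11:0.2.6 \
    --class org.apache.spark.repl.Main \
    --name "Spark shell on Yarn" "$@"
}
```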
>> >>
>> >> Cheers
>> >>
>> >> On Tue, 20 Oct 2020 at 16:54, Russell Spitzer <russell.spit...@gmail.com> wrote:
>> >>
>> >>> --jars adds only that jar.
>> >>> --packages adds the jar and its dependencies listed in Maven.
>> >>>
>> >>> On Tue, Oct 20, 2020 at 10:50 AM Mich Talebzadeh <mich.talebza...@gmail.com> wrote:
>> >>>
>> >>>> Hi,
>> >>>>
>> >>>> I have a scenario that I use in spark-submit as follows:
>> >>>>
>> >>>> spark-submit --driver-class-path /home/hduser/jars/ddhybrid.jar --jars
>> >>>> /home/hduser/jars/spark-bigquery-latest.jar,/home/hduser/jars/ddhybrid.jar,
>> >>>> */home/hduser/jars/spark-bigquery_2.11-0.2.6.jar*
>> >>>>
>> >>>> As you can see, the jar files needed are added.
>> >>>>
>> >>>> This comes back with the error message below:
>> >>>>
>> >>>> Creating model test.weights_MODEL
>> >>>> java.lang.NoClassDefFoundError: com/google/api/client/http/HttpRequestInitializer
>> >>>>   at com.samelamin.spark.bigquery.BigQuerySQLContext.bq$lzycompute(BigQuerySQLContext.scala:19)
>> >>>>   at com.samelamin.spark.bigquery.BigQuerySQLContext.bq(BigQuerySQLContext.scala:19)
>> >>>>   at com.samelamin.spark.bigquery.BigQuerySQLContext.runDMLQuery(BigQuerySQLContext.scala:105)
>> >>>>   ... 76 elided
>> >>>> Caused by: java.lang.ClassNotFoundException: com.google.api.client.http.HttpRequestInitializer
>> >>>>   at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
>> >>>>   at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>> >>>>   at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>> >>>>
>> >>>> So there is an issue with finding the class, although the jar file used,
>> >>>> /home/hduser/jars/spark-bigquery_2.11-0.2.6.jar, has it.
>> >>>>
>> >>>> Now if *I remove the above jar file and replace it with the same
>> >>>> version but as a package*, it works!
>> >>>>
>> >>>> spark-submit --driver-class-path /home/hduser/jars/ddhybrid.jar --jars
>> >>>> /home/hduser/jars/spark-bigquery-latest.jar,/home/hduser/jars/ddhybrid.jar
>> >>>> *--packages com.github.samelamin:spark-bigquery_2.11:0.2.6*
>> >>>>
>> >>>> I have read the write-ups about packages searching the Maven
>> >>>> libraries etc., but I am not convinced why using the package should make
>> >>>> so much difference between a failure and a success. In other words, when
>> >>>> should one use a package rather than a jar?
>> >>>>
>> >>>> Any ideas will be appreciated.
>> >>>>
>> >>>> Thanks
>> >>>>
>> >>> --
>> > Best Regards,
>> > Ayan Guha
>>
>> --
>> nicolas paris