Prometheus with Spark
Hi Team,

We want to query Prometheus data with Spark. Any suggestions would be appreciated. I searched for documentation but did not find anything useful.
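There is no official Prometheus data source for Spark, but one common approach is to pull data over Prometheus's HTTP API (/api/v1/query or /api/v1/query_range, which return JSON) and flatten the result into rows for a DataFrame. A minimal sketch of the flattening step, using a canned sample response rather than a live server (the endpoint, metric, and column names are illustrative assumptions, not a standard):

```python
import json

def prometheus_to_rows(payload):
    """Flatten a Prometheus /api/v1/query JSON response into
    (metric_name, labels_json, timestamp, value) rows suitable
    for spark.createDataFrame."""
    rows = []
    for series in payload["data"]["result"]:
        labels = series["metric"]
        # instant queries return a single "value"; range queries return "values"
        samples = series.get("values") or [series["value"]]
        for ts, val in samples:
            rows.append((labels.get("__name__", ""),
                         json.dumps(labels),
                         float(ts),
                         float(val)))
    return rows

# Canned example of what /api/v1/query?query=up returns:
sample = {
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {"metric": {"__name__": "up", "job": "prometheus"},
             "value": [1645000000.0, "1"]}
        ],
    },
}
rows = prometheus_to_rows(sample)
# In a Spark session you could then do:
# df = spark.createDataFrame(rows, ["metric", "labels", "timestamp", "value"])
```

For large historical ranges you would page through query_range windows and parallelize the fetches, but the JSON shape being flattened is the same.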
Re: Spark kubernetes s3 connectivity issue
I understand what you are saying. However, I am not sure how to implement this when I create a Docker image using Spark 3.2.1 with Hadoop 3.2, which already has the Guava jar included as part of the distribution.

On Tue, Feb 15, 2022, 01:17 Mich Talebzadeh wrote:

> Hi Raj,
>
> I found the old email. That is what I did, but it is 2018 stuff.
>
> The email says:
>
> I sorted out this problem. I rewrote the assembly with shade rules to
> avoid old jar files, as follows:
>
> lazy val root = (project in file(".")).
>   settings(
>     name := "${APPLICATION}",
>     version := "1.0",
>     scalaVersion := "2.11.8",
>     mainClass in Compile := Some("myPackage.${APPLICATION}")
>   )
>
> assemblyShadeRules in assembly := Seq(
>   ShadeRule.rename("com.google.common.**" -> "my_conf.@1").inAll
> )
>
> libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.0.0" % "provided"
> libraryDependencies += "org.apache.hadoop" % "hadoop-client" % "2.4.0"
> libraryDependencies += "org.apache.spark" %% "spark-core" % "2.0.0" % "provided" exclude("org.apache.hadoop", "hadoop-client")
> resolvers += "Akka Repository" at "http://repo.akka.io/releases/"
> libraryDependencies += "com.amazonaws" % "aws-java-sdk" % "1.7.8"
> libraryDependencies += "commons-io" % "commons-io" % "2.4"
> libraryDependencies += "javax.servlet" % "javax.servlet-api" % "3.0.1" % "provided"
> libraryDependencies += "org.apache.spark" %% "spark-hive" % "2.0.0" % "provided"
> libraryDependencies += "com.google.cloud.bigdataoss" % "bigquery-connector" % "0.13.4-hadoop3"
> libraryDependencies += "com.google.cloud.bigdataoss" % "gcs-connector" % "1.9.4-hadoop3"
> libraryDependencies += "com.google.code.gson" % "gson" % "2.8.5"
> libraryDependencies += "org.apache.httpcomponents" % "httpcore" % "4.4.8"
> libraryDependencies += "org.apache.hadoop" % "hadoop-hdfs" % "2.4.0"
> libraryDependencies += "com.github.samelamin" %% "spark-bigquery" % "0.2.5"
>
> // META-INF discarding
> assemblyMergeStrategy in assembly := {
>   case PathList("META-INF", "MANIFEST.MF") => MergeStrategy.discard
>   case PathList("META-INF", xs @ _*) => MergeStrategy.discard
>   case x => MergeStrategy.first
> }
>
> HTH
>
> view my Linkedin profile
> <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>
>
> https://en.everybodywiki.com/Mich_Talebzadeh
>
> *Disclaimer:* Use it at your own risk. Any and all responsibility for any
> loss, damage or destruction of data or any other property which may arise
> from relying on this email's technical content is explicitly disclaimed.
> The author will in no case be liable for any monetary damages arising from
> such loss, damage or destruction.
Re: Spark kubernetes s3 connectivity issue
Should we remove the existing jar and upgrade it to some recent version?

On Tue, Feb 15, 2022, 01:08 Mich Talebzadeh wrote:

> I recall I had similar issues running Spark on Google Dataproc.
>
> It sounds like it gets Hadoop's jars on the classpath, which include an
> older version of Guava. The solution is to shade/relocate Guava in your
> distribution.
>
> HTH
Spark kubernetes s3 connectivity issue
Hi Team,

We are trying to build a Docker image using CentOS and trying to connect to S3. The same setup works with Hadoop 3.2.0 and Spark 3.1.2.

#Installing spark binaries
ENV SPARK_HOME /opt/spark
ENV SPARK_VERSION 3.2.1
ENV HADOOP_VERSION 3.2.0
ARG HADOOP_VERSION_SHORT=3.2
ARG HADOOP_AWS_VERSION=3.3.0
ARG AWS_SDK_VERSION=1.11.563

RUN set -xe \
  && cd /tmp \
  && wget http://mirrors.gigenet.com/apache/spark/spark-${SPARK_VERSION}/spark-${SPARK_VERSION}-bin-hadoop${HADOOP_VERSION_SHORT}.tgz \
  && tar -zxvf spark-${SPARK_VERSION}-bin-hadoop${HADOOP_VERSION_SHORT}.tgz \
  && rm *.tgz \
  && mv spark-${SPARK_VERSION}-bin-hadoop${HADOOP_VERSION_SHORT} ${SPARK_HOME} \
  && cp ${SPARK_HOME}/kubernetes/dockerfiles/spark/entrypoint.sh ${SPARK_HOME} \
  && wget https://repo1.maven.org/maven2/org/apache/hadoop/hadoop-aws/${HADOOP_AWS_VERSION}/hadoop-aws-${HADOOP_AWS_VERSION}.jar \
  && wget https://repo1.maven.org/maven2/com/amazonaws/aws-java-sdk-bundle/${AWS_SDK_VERSION}/aws-java-sdk-bundle-${AWS_SDK_VERSION}.jar \
  && wget https://repo1.maven.org/maven2/com/amazonaws/aws-java-sdk/${AWS_SDK_VERSION}/aws-java-sdk-${AWS_SDK_VERSION}.jar \
  && mv *.jar /opt/spark/jars/

Error (any help on this is appreciated):

java.lang.NoSuchMethodError:
com/google/common/base/Preconditions.checkArgument(ZLjava/lang/String;Ljava/lang/Object;Ljava/lang/Object;)V
(loaded from file:/opt/spark/jars/guava-14.0.1.jar by jdk.internal.loader.ClassLoaders$AppClassLoader@1e4553e)
called from class org.apache.hadoop.fs.s3a.S3AUtils
(loaded from file:/opt/spark/jars/hadoop-aws-3.3.0.jar by jdk.internal.loader.ClassLoaders$AppClassLoader@1e4553e).