Prometheus with spark

2022-10-21 Thread Raj ks
Hi Team,


We want to query Prometheus data with Spark. Any suggestions would be
appreciated.

We have searched for documentation but did not find anything definitive.
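One common approach, sketched below, is to pull data from the Prometheus HTTP
API (`/api/v1/query_range`) and flatten the JSON response into rows that
`spark.createDataFrame` can ingest. The endpoint URL, metric name, and column
names here are illustrative assumptions, not part of any official connector:

```python
# Sketch: flatten a Prometheus range-query response into (labels, ts, value)
# rows, then load them into a Spark DataFrame. Assumes a reachable Prometheus
# server and an existing SparkSession; both are placeholders here.
import json
from urllib import request, parse

def flatten_prometheus(result_json):
    """Flatten a /api/v1/query_range response into
    (metric_labels_json, timestamp, value) tuples."""
    rows = []
    for series in result_json["data"]["result"]:
        # Serialise the label set so each row carries its metric identity.
        labels = json.dumps(series["metric"], sort_keys=True)
        for ts, val in series["values"]:
            rows.append((labels, float(ts), float(val)))
    return rows

# Example usage (requires pyspark and a live Prometheus; hypothetical URL):
# url = "http://prometheus:9090/api/v1/query_range?" + parse.urlencode(
#     {"query": "up", "start": "1666300000", "end": "1666303600",
#      "step": "60s"})
# body = json.load(request.urlopen(url))
# df = spark.createDataFrame(flatten_prometheus(body),
#                            ["labels", "timestamp", "value"])
```

The flattening step is plain Python, so it can run on the driver for small
result sets; for large ranges you would shard the query windows across
executors instead.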


Re: Spark kubernetes s3 connectivity issue

2022-02-14 Thread Raj ks
I understand what you are saying. However, I am not sure how to implement
this when I create a Docker image using Spark 3.2.1 with Hadoop 3.2, which
already ships the Guava jar as part of the distribution.

On Tue, Feb 15, 2022, 01:17 Mich Talebzadeh 
wrote:

> Hi Raj,
>
> I found the old email. That is what I did but it is 2018 stuff.
>
> The email says
>
>  I sorted out this problem. I rewrote the assembly with shade rules to
> avoid old jar files as follows:
>
> lazy val root = (project in file(".")).
>   settings(
> name := "${APPLICATION}",
> version := "1.0",
> scalaVersion := "2.11.8",
> mainClass in Compile := Some("myPackage.${APPLICATION}")
>   )
> assemblyShadeRules in assembly := Seq(
> ShadeRule.rename("com.google.common.**" -> "my_conf.@1").inAll
> )
> libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.0.0" %
> "provided"
> libraryDependencies += "org.apache.hadoop" % "hadoop-client" % "2.4.0"
> libraryDependencies += "org.apache.spark" %% "spark-core" % "2.0.0"  %
> "provided" exclude("org.apache.hadoop", "hadoop-client")
> resolvers += "Akka Repository" at "http://repo.akka.io/releases/"
> libraryDependencies += "com.amazonaws" % "aws-java-sdk" % "1.7.8"
> libraryDependencies += "commons-io" % "commons-io" % "2.4"
> libraryDependencies += "javax.servlet" % "javax.servlet-api" % "3.0.1" %
> "provided"
> libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.0.0" %
> "provided"
> libraryDependencies += "org.apache.spark" %% "spark-hive" % "2.0.0" %
> "provided"
> libraryDependencies += "com.google.cloud.bigdataoss" %
> "bigquery-connector" % "0.13.4-hadoop3"
> libraryDependencies += "com.google.cloud.bigdataoss" % "gcs-connector" %
> "1.9.4-hadoop3"
> libraryDependencies += "com.google.code.gson" % "gson" % "2.8.5"
> libraryDependencies += "org.apache.httpcomponents" % "httpcore" % "4.4.8"
> libraryDependencies += "org.apache.hadoop" % "hadoop-hdfs" % "2.4.0"
> libraryDependencies += "com.github.samelamin" %% "spark-bigquery" %
> "0.2.5"
>
> // META-INF discarding
> assemblyMergeStrategy in assembly := {
>  case PathList("META-INF", "MANIFEST.MF") => MergeStrategy.discard
>  case PathList("META-INF", xs @ _*) => MergeStrategy.discard
>  case x => MergeStrategy.first
> }
>
> HTH
>
>
>
>view my Linkedin profile
> <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>
>
>
>  https://en.everybodywiki.com/Mich_Talebzadeh
>
>
>
> *Disclaimer:* Use it at your own risk. Any and all responsibility for any
> loss, damage or destruction of data or any other property which may arise
> from relying on this email's technical content is explicitly disclaimed.
> The author will in no case be liable for any monetary damages arising from
> such loss, damage or destruction.
>
>
>
>
> On Mon, 14 Feb 2022 at 19:40, Raj ks  wrote:
>
>> Should we remove the existing jar and upgrade it to some recent version?
>>
>> On Tue, Feb 15, 2022, 01:08 Mich Talebzadeh 
>> wrote:
>>
>>> I recall I had similar issues running Spark on Google Dataproc.
>>>
>>> sounds like it gets Hadoop's jars on the classpath which include an
>>> older version of Guava. The solution is to shade/relocate Guava in your
>>> distribution
>>>
>>>
>>> HTH
>>>
>>> On Mon, 14 Feb 2022 at 19:10, Raj ks  wrote:
>>>
>>>> Hi Team ,
>>>>
>>>> We are trying to build a Docker image using CentOS and to connect to S3.
>>>> The same works with Hadoop 3.2.0 and Spark 3.1.2.

Re: Spark kubernetes s3 connectivity issue

2022-02-14 Thread Raj ks
Should we remove the existing jar and upgrade it to some recent version?

On Tue, Feb 15, 2022, 01:08 Mich Talebzadeh 
wrote:

> I recall I had similar issues running Spark on Google Dataproc.
>
> sounds like it gets Hadoop's jars on the classpath which include an older
> version of Guava. The solution is to shade/relocate Guava in your
> distribution
>
>
> HTH
>
>
> On Mon, 14 Feb 2022 at 19:10, Raj ks  wrote:
>
>> Hi Team ,
>>
>> We are trying to build a Docker image using CentOS and to connect to S3.
>> The same works with Hadoop 3.2.0 and Spark 3.1.2.
>>
>> #Installing spark binaries
>> ENV SPARK_HOME /opt/spark
>> ENV SPARK_VERSION 3.2.1
>> ENV HADOOP_VERSION 3.2.0
>> ARG HADOOP_VERSION_SHORT=3.2
>> ARG HADOOP_AWS_VERSION=3.3.0
>> ARG AWS_SDK_VERSION=1.11.563
>>
>>
>> RUN set -xe \
>>   && cd /tmp \
>>   && wget http://mirrors.gigenet.com/apache/spark/spark-${SPARK_VERSION}/spark-${SPARK_VERSION}-bin-hadoop${HADOOP_VERSION_SHORT}.tgz \
>>   && tar -zxvf spark-${SPARK_VERSION}-bin-hadoop${HADOOP_VERSION_SHORT}.tgz \
>>   && rm *.tgz \
>>   && mv spark-${SPARK_VERSION}-bin-hadoop${HADOOP_VERSION_SHORT} ${SPARK_HOME} \
>>   && cp ${SPARK_HOME}/kubernetes/dockerfiles/spark/entrypoint.sh ${SPARK_HOME} \
>>   && wget https://repo1.maven.org/maven2/org/apache/hadoop/hadoop-aws/${HADOOP_AWS_VERSION}/hadoop-aws-${HADOOP_AWS_VERSION}.jar \
>>   && wget https://repo1.maven.org/maven2/com/amazonaws/aws-java-sdk-bundle/${AWS_SDK_VERSION}/aws-java-sdk-bundle-${AWS_SDK_VERSION}.jar \
>>   && wget https://repo1.maven.org/maven2/com/amazonaws/aws-java-sdk/${AWS_SDK_VERSION}/aws-java-sdk-${AWS_SDK_VERSION}.jar \
>>   && mv *.jar /opt/spark/jars/
>>
>> Any help on this is appreciated.
>>
>> Error:
>>
>> java.lang.NoSuchMethodError:
>> com/google/common/base/Preconditions.checkArgument(ZLjava/lang/String;Ljava/lang/Object;Ljava/lang/Object;)V
>> (loaded from file:/opt/spark/jars/guava-14.0.1.jar by
>> jdk.internal.loader.ClassLoaders$AppClassLoader@1e4553e) called from
>> class org.apache.hadoop.fs.s3a.S3AUtils (loaded from
>> file:/opt/spark/jars/hadoop-aws-3.3.0.jar by
>> jdk.internal.loader.ClassLoaders$AppClassLoader@1e4553e).
>>
>>


Spark kubernetes s3 connectivity issue

2022-02-14 Thread Raj ks
Hi Team ,

We are trying to build a Docker image using CentOS and to connect to S3.
The same works with Hadoop 3.2.0 and Spark 3.1.2.

#Installing spark binaries
ENV SPARK_HOME /opt/spark
ENV SPARK_VERSION 3.2.1
ENV HADOOP_VERSION 3.2.0
ARG HADOOP_VERSION_SHORT=3.2
ARG HADOOP_AWS_VERSION=3.3.0
ARG AWS_SDK_VERSION=1.11.563


RUN set -xe \
  && cd /tmp \
  && wget http://mirrors.gigenet.com/apache/spark/spark-${SPARK_VERSION}/spark-${SPARK_VERSION}-bin-hadoop${HADOOP_VERSION_SHORT}.tgz \
  && tar -zxvf spark-${SPARK_VERSION}-bin-hadoop${HADOOP_VERSION_SHORT}.tgz \
  && rm *.tgz \
  && mv spark-${SPARK_VERSION}-bin-hadoop${HADOOP_VERSION_SHORT} ${SPARK_HOME} \
  && cp ${SPARK_HOME}/kubernetes/dockerfiles/spark/entrypoint.sh ${SPARK_HOME} \
  && wget https://repo1.maven.org/maven2/org/apache/hadoop/hadoop-aws/${HADOOP_AWS_VERSION}/hadoop-aws-${HADOOP_AWS_VERSION}.jar \
  && wget https://repo1.maven.org/maven2/com/amazonaws/aws-java-sdk-bundle/${AWS_SDK_VERSION}/aws-java-sdk-bundle-${AWS_SDK_VERSION}.jar \
  && wget https://repo1.maven.org/maven2/com/amazonaws/aws-java-sdk/${AWS_SDK_VERSION}/aws-java-sdk-${AWS_SDK_VERSION}.jar \
  && mv *.jar /opt/spark/jars/

Any help on this is appreciated.

Error:

java.lang.NoSuchMethodError:
com/google/common/base/Preconditions.checkArgument(ZLjava/lang/String;Ljava/lang/Object;Ljava/lang/Object;)V
(loaded from file:/opt/spark/jars/guava-14.0.1.jar by
jdk.internal.loader.ClassLoaders$AppClassLoader@1e4553e) called from class
org.apache.hadoop.fs.s3a.S3AUtils (loaded from
file:/opt/spark/jars/hadoop-aws-3.3.0.jar by
jdk.internal.loader.ClassLoaders$AppClassLoader@1e4553e).
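An alternative to shading, when you control the image, is to replace the
bundled Guava jar with a newer release in the same Dockerfile. The error above
shows hadoop-aws 3.3.0 calling a `Preconditions.checkArgument` overload that
does not exist in the Guava 14.0.1 jar shipped with the Spark distribution. A
minimal sketch follows; the 27.0-jre version is an assumption (it matches what
Hadoop 3.3.x builds against), so pick whatever your Hadoop line expects:

```dockerfile
# Sketch (GUAVA_VERSION is an assumption, not from the original thread):
# drop the Guava 14 jar bundled with Spark and fetch a release new enough
# for the Preconditions overloads hadoop-aws 3.3.x calls.
ARG GUAVA_VERSION=27.0-jre
RUN rm -f ${SPARK_HOME}/jars/guava-14.0.1.jar \
  && wget -P ${SPARK_HOME}/jars/ \
     https://repo1.maven.org/maven2/com/google/guava/guava/${GUAVA_VERSION}/guava-${GUAVA_VERSION}.jar
```

This avoids rebuilding the application jar, at the cost of having to re-check
the swap whenever the Spark or Hadoop base version changes.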