I have updated the pom.xml in the external/kafka-0-10-sql folder as below
(the change is the maven-assembly-plugin block in the <build> section), and
have run the command
build/mvn package -DskipTests -pl external/kafka-0-10-sql
which generated
spark-sql-kafka-0-10_2.11-2.3.0-SNAPSHOT-jar-with-dependencies.jar
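
With that assembly jar in place, the shell can presumably be started with the
single fat jar rather than a glob (a sketch, assuming the jar stays under the
module's target folder):
bin/spark-shell --jars $SPARK_HOME/external/kafka-0-10-sql/target/spark-sql-kafka-0-10_2.11-2.3.0-SNAPSHOT-jar-with-dependencies.jar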

<?xml version="1.0" encoding="UTF-8"?>
<!--
  ~ Licensed to the Apache Software Foundation (ASF) under one or more
  ~ contributor license agreements.  See the NOTICE file distributed with
  ~ this work for additional information regarding copyright ownership.
  ~ The ASF licenses this file to You under the Apache License, Version 2.0
  ~ (the "License"); you may not use this file except in compliance with
  ~ the License.  You may obtain a copy of the License at
  ~
  ~    http://www.apache.org/licenses/LICENSE-2.0
  ~
  ~ Unless required by applicable law or agreed to in writing, software
  ~ distributed under the License is distributed on an "AS IS" BASIS,
  ~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  ~ See the License for the specific language governing permissions and
  ~ limitations under the License.
  -->

<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <parent>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-parent_2.11</artifactId>
    <version>2.3.0-SNAPSHOT</version>
    <relativePath>../../pom.xml</relativePath>
  </parent>

  <groupId>org.apache.spark</groupId>
  <artifactId>spark-sql-kafka-0-10_2.11</artifactId>
  <properties>
    <sbt.project.name>sql-kafka-0-10</sbt.project.name>
  </properties>
  <packaging>jar</packaging>
  <name>Kafka 0.10 Source for Structured Streaming</name>
  <url>http://spark.apache.org/</url>

  <dependencies>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-sql_${scala.binary.version}</artifactId>
      <version>${project.version}</version>
      <scope>provided</scope>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-core_${scala.binary.version}</artifactId>
      <version>${project.version}</version>
      <type>test-jar</type>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-catalyst_${scala.binary.version}</artifactId>
      <version>${project.version}</version>
      <type>test-jar</type>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-sql_${scala.binary.version}</artifactId>
      <version>${project.version}</version>
      <type>test-jar</type>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>org.apache.kafka</groupId>
      <artifactId>kafka-clients</artifactId>
      <version>0.10.0.1</version>
    </dependency>
    <dependency>
      <groupId>org.apache.kafka</groupId>
      <artifactId>kafka_${scala.binary.version}</artifactId>
      <version>0.10.0.1</version>
    </dependency>
    <dependency>
      <groupId>net.sf.jopt-simple</groupId>
      <artifactId>jopt-simple</artifactId>
      <version>3.2</version>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>org.scalacheck</groupId>
      <artifactId>scalacheck_${scala.binary.version}</artifactId>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-tags_${scala.binary.version}</artifactId>
    </dependency>

    <!--
      This spark-tags test-dep is needed even though it isn't used in this
      module, otherwise testing-cmds that exclude them will yield errors.
    -->
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-tags_${scala.binary.version}</artifactId>
      <type>test-jar</type>
      <scope>test</scope>
    </dependency>
  </dependencies>

  <build>
    <outputDirectory>target/scala-${scala.binary.version}/classes</outputDirectory>
    <testOutputDirectory>target/scala-${scala.binary.version}/test-classes</testOutputDirectory>
    <plugins>
      <plugin>
        <artifactId>maven-assembly-plugin</artifactId>
        <version>3.0.0</version>
        <configuration>
          <descriptorRefs>
            <descriptorRef>jar-with-dependencies</descriptorRef>
          </descriptorRefs>
        </configuration>
        <executions>
          <execution>
            <id>make-assembly</id> <!-- this is used for inheritance merges -->
            <phase>package</phase> <!-- bind to the packaging phase -->
            <goals>
              <goal>single</goal>
            </goals>
          </execution>
        </executions>
      </plugin>
    </plugins>
  </build>
</project>
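
To confirm the Kafka classes actually made it into the assembly, a quick
sanity check with standard jar tooling:
jar tf external/kafka-0-10-sql/target/spark-sql-kafka-0-10_2.11-2.3.0-SNAPSHOT-jar-with-dependencies.jar | grep ByteArrayDeserializer
should print org/apache/kafka/common/serialization/ByteArrayDeserializer.class,
the class the stack trace below reports missing.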


Regards,

Satyajit.

On Wed, Jun 28, 2017 at 12:12 PM, Shixiong(Ryan) Zhu <
shixi...@databricks.com> wrote:

> "--package" will add transitive dependencies that are not
> "$SPARK_HOME/external/kafka-0-10-sql/target/*.jar".
>
> > i have tried building the jar with dependencies, but still face the same
> error.
>
> What's the command you used?
>
> On Wed, Jun 28, 2017 at 12:00 PM, satyajit vegesna <
> satyajit.apas...@gmail.com> wrote:
>
>> Hi All,
>>
>> I am trying to build the kafka-0-10-sql module under the external folder
>> in the Apache Spark source code.
>> Once I generate the jar file using
>> build/mvn package -DskipTests -pl external/kafka-0-10-sql
>> I get a jar file created under external/kafka-0-10-sql/target.
>>
>> I then try to run spark-shell with the jars created in the target folder,
>> as below:
>> bin/spark-shell --jars $SPARK_HOME/external/kafka-0-10-sql/target/*.jar
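>>
>> (Note: the shell expands the *.jar glob into space-separated arguments,
>> while --jars expects a single comma-separated list, so only the first jar
>> may actually be registered. One possible workaround, as a sketch, is
>> bin/spark-shell --jars "$(echo $SPARK_HOME/external/kafka-0-10-sql/target/*.jar | tr ' ' ',')"
>> to join the paths with commas.)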
>>
>> I get the below error from that command:
>>
>> Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
>>
>> Setting default log level to "WARN".
>>
>> To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use
>> setLogLevel(newLevel).
>>
>> 17/06/28 11:54:03 WARN NativeCodeLoader: Unable to load native-hadoop
>> library for your platform... using builtin-java classes where applicable
>>
>> Spark context Web UI available at http://10.1.10.241:4040
>>
>> Spark context available as 'sc' (master = local[*], app id =
>> local-1498676043936).
>>
>> Spark session available as 'spark'.
>>
>> Welcome to
>>       ____              __
>>      / __/__  ___ _____/ /__
>>     _\ \/ _ \/ _ `/ __/  '_/
>>    /___/ .__/\_,_/_/ /_/\_\   version 2.3.0-SNAPSHOT
>>       /_/
>>
>> Using Scala version 2.11.8 (Java HotSpot(TM) 64-Bit Server VM, Java
>> 1.8.0_131)
>>
>> Type in expressions to have them evaluated.
>>
>> Type :help for more information.
>>
>> scala> val lines = spark.readStream.format("kafka").option("kafka.bootstrap.servers", "localhost:9092").option("subscribe", "test").load()
>>
>> java.lang.NoClassDefFoundError: org/apache/kafka/common/serialization/ByteArrayDeserializer
>>
>>   at org.apache.spark.sql.kafka010.KafkaSourceProvider$.<init>(KafkaSourceProvider.scala:378)
>>   at org.apache.spark.sql.kafka010.KafkaSourceProvider$.<clinit>(KafkaSourceProvider.scala)
>>   at org.apache.spark.sql.kafka010.KafkaSourceProvider.validateStreamOptions(KafkaSourceProvider.scala:325)
>>   at org.apache.spark.sql.kafka010.KafkaSourceProvider.sourceSchema(KafkaSourceProvider.scala:60)
>>   at org.apache.spark.sql.execution.datasources.DataSource.sourceSchema(DataSource.scala:192)
>>   at org.apache.spark.sql.execution.datasources.DataSource.sourceInfo$lzycompute(DataSource.scala:87)
>>   at org.apache.spark.sql.execution.datasources.DataSource.sourceInfo(DataSource.scala:87)
>>   at org.apache.spark.sql.execution.streaming.StreamingRelation$.apply(StreamingRelation.scala:30)
>>   at org.apache.spark.sql.streaming.DataStreamReader.load(DataStreamReader.scala:150)
>>   ... 48 elided
>>
>> Caused by: java.lang.ClassNotFoundException: org.apache.kafka.common.serialization.ByteArrayDeserializer
>>
>>   at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
>>   at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>>   at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>>   ... 57 more
>>
>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>
>> I have tried building the jar with dependencies, but still face the same
>> error.
>>
>> But when I use --packages with spark-shell, as in
>> bin/spark-shell --packages org.apache.spark:spark-sql-kafka-0-10_2.11:2.1.0
>> it works fine.
>>
>> The reason I am trying to build from source code is that I want to try
>> pushing DataFrame data into a Kafka topic, based on the commit
>> https://github.com/apache/spark/commit/b0a5cd89097c563e9949d8cfcf84d18b03b8d24c
>> which doesn't work with version 2.1.0.
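>>
>> For reference, my understanding of the batch usage that commit enables is
>> roughly the following (a sketch, assuming a DataFrame df that already has
>> a string or binary "value" column and the same local broker):
>> df.write.format("kafka").option("kafka.bootstrap.servers", "localhost:9092").option("topic", "test").save()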
>>
>>
>> Any help would be highly appreciated.
>>
>>
>> Regards,
>>
>> Satyajit.
>>
>>
>>
>
