I have updated the pom.xml in the external/kafka-0-10-sql folder as below (the relevant change is the added maven-assembly-plugin section near the end), and have run build/mvn package -DskipTests -pl external/kafka-0-10-sql, which generated spark-sql-kafka-0-10_2.11-2.3.0-SNAPSHOT-jar-with-dependencies.jar.
<?xml version="1.0" encoding="UTF-8"?>
<!--
  ~ Licensed to the Apache Software Foundation (ASF) under one or more
  ~ contributor license agreements.  See the NOTICE file distributed with
  ~ this work for additional information regarding copyright ownership.
  ~ The ASF licenses this file to You under the Apache License, Version 2.0
  ~ (the "License"); you may not use this file except in compliance with
  ~ the License.  You may obtain a copy of the License at
  ~
  ~    http://www.apache.org/licenses/LICENSE-2.0
  ~
  ~ Unless required by applicable law or agreed to in writing, software
  ~ distributed under the License is distributed on an "AS IS" BASIS,
  ~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  ~ See the License for the specific language governing permissions and
  ~ limitations under the License.
  -->
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <parent>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-parent_2.11</artifactId>
    <version>2.3.0-SNAPSHOT</version>
    <relativePath>../../pom.xml</relativePath>
  </parent>

  <groupId>org.apache.spark</groupId>
  <artifactId>spark-sql-kafka-0-10_2.11</artifactId>
  <properties>
    <sbt.project.name>sql-kafka-0-10</sbt.project.name>
  </properties>
  <packaging>jar</packaging>
  <name>Kafka 0.10 Source for Structured Streaming</name>
  <url>http://spark.apache.org/</url>

  <dependencies>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-sql_${scala.binary.version}</artifactId>
      <version>${project.version}</version>
      <scope>provided</scope>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-core_${scala.binary.version}</artifactId>
      <version>${project.version}</version>
      <type>test-jar</type>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-catalyst_${scala.binary.version}</artifactId>
      <version>${project.version}</version>
      <type>test-jar</type>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-sql_${scala.binary.version}</artifactId>
      <version>${project.version}</version>
      <type>test-jar</type>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>org.apache.kafka</groupId>
      <artifactId>kafka-clients</artifactId>
      <version>0.10.0.1</version>
    </dependency>
    <dependency>
      <groupId>org.apache.kafka</groupId>
      <artifactId>kafka_${scala.binary.version}</artifactId>
      <version>0.10.0.1</version>
    </dependency>
    <dependency>
      <groupId>net.sf.jopt-simple</groupId>
      <artifactId>jopt-simple</artifactId>
      <version>3.2</version>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>org.scalacheck</groupId>
      <artifactId>scalacheck_${scala.binary.version}</artifactId>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-tags_${scala.binary.version}</artifactId>
    </dependency>

    <!--
      This spark-tags test-dep is needed even though it isn't used in this module,
      otherwise testing-cmds that exclude them will yield errors.
    -->
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-tags_${scala.binary.version}</artifactId>
      <type>test-jar</type>
      <scope>test</scope>
    </dependency>
  </dependencies>

  <build>
    <outputDirectory>target/scala-${scala.binary.version}/classes</outputDirectory>
    <testOutputDirectory>target/scala-${scala.binary.version}/test-classes</testOutputDirectory>
    <plugins>
      <plugin>
        <artifactId>maven-assembly-plugin</artifactId>
        <version>3.0.0</version>
        <configuration>
          <descriptorRefs>
            <descriptorRef>jar-with-dependencies</descriptorRef>
          </descriptorRefs>
        </configuration>
        <executions>
          <execution>
            <id>make-assembly</id> <!-- this is used for inheritance merges -->
            <phase>package</phase> <!-- bind to the packaging phase -->
            <goals>
              <goal>single</goal>
            </goals>
          </execution>
        </executions>
      </plugin>
    </plugins>
  </build>
</project>
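Note that --jars expects a comma-separated list of jars, so a shell glob like target/*.jar expands to space-separated paths and may not add every jar; to rule that out, I am also launching the shell against the assembly jar explicitly:

bin/spark-shell --jars $SPARK_HOME/external/kafka-0-10-sql/target/spark-sql-kafka-0-10_2.11-2.3.0-SNAPSHOT-jar-with-dependencies.jar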
Regards,
Satyajit.

On Wed, Jun 28, 2017 at 12:12 PM, Shixiong(Ryan) Zhu <shixi...@databricks.com> wrote:

> "--packages" will add transitive dependencies that are not in
> "$SPARK_HOME/external/kafka-0-10-sql/target/*.jar".
>
>> i have tried building the jar with dependencies, but still face the same
>> error.
>
> What's the command you used?
>
> On Wed, Jun 28, 2017 at 12:00 PM, satyajit vegesna <satyajit.apas...@gmail.com> wrote:
>
>> Hi All,
>>
>> I am trying to build the kafka-0-10-sql module under the external folder
>> in the Apache Spark source code.
>> Once I generate the jar file using
>> build/mvn package -DskipTests -pl external/kafka-0-10-sql
>> the jar file is created under external/kafka-0-10-sql/target.
>>
>> I then try to run spark-shell with the jars created in the target folder,
>> as below:
>> bin/spark-shell --jars $SPARK_HOME/external/kafka-0-10-sql/target/*.jar
>>
>> and I get the following error:
>>
>> Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
>> Setting default log level to "WARN".
>> To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
>> 17/06/28 11:54:03 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
>> Spark context Web UI available at http://10.1.10.241:4040
>> Spark context available as 'sc' (master = local[*], app id = local-1498676043936).
>> Spark session available as 'spark'.
>> Welcome to
>>       ____              __
>>      / __/__  ___ _____/ /__
>>     _\ \/ _ \/ _ `/ __/  '_/
>>    /___/ .__/\_,_/_/ /_/\_\   version 2.3.0-SNAPSHOT
>>       /_/
>>
>> Using Scala version 2.11.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_131)
>> Type in expressions to have them evaluated.
>> Type :help for more information.
>>
>> scala> val lines = spark.readStream.format("kafka").option("kafka.bootstrap.servers", "localhost:9092").option("subscribe", "test").load()
>>
>> java.lang.NoClassDefFoundError: org/apache/kafka/common/serialization/ByteArrayDeserializer
>>   at org.apache.spark.sql.kafka010.KafkaSourceProvider$.<init>(KafkaSourceProvider.scala:378)
>>   at org.apache.spark.sql.kafka010.KafkaSourceProvider$.<clinit>(KafkaSourceProvider.scala)
>>   at org.apache.spark.sql.kafka010.KafkaSourceProvider.validateStreamOptions(KafkaSourceProvider.scala:325)
>>   at org.apache.spark.sql.kafka010.KafkaSourceProvider.sourceSchema(KafkaSourceProvider.scala:60)
>>   at org.apache.spark.sql.execution.datasources.DataSource.sourceSchema(DataSource.scala:192)
>>   at org.apache.spark.sql.execution.datasources.DataSource.sourceInfo$lzycompute(DataSource.scala:87)
>>   at org.apache.spark.sql.execution.datasources.DataSource.sourceInfo(DataSource.scala:87)
>>   at org.apache.spark.sql.execution.streaming.StreamingRelation$.apply(StreamingRelation.scala:30)
>>   at org.apache.spark.sql.streaming.DataStreamReader.load(DataStreamReader.scala:150)
>>   ... 48 elided
>>
>> Caused by: java.lang.ClassNotFoundException: org.apache.kafka.common.serialization.ByteArrayDeserializer
>>   at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
>>   at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>>   at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>>   ... 57 more
>>
>> I have tried building the jar with dependencies, but I still face the same
>> error.
>>
>> But when I try --packages with spark-shell, using
>> bin/spark-shell --packages org.apache.spark:spark-sql-kafka-0-10_2.11:2.1.0
>> it works fine.
>>
>> The reason I am trying to build from source code is that I want to try
>> pushing DataFrame data into a Kafka topic, based on
>> https://github.com/apache/spark/commit/b0a5cd89097c563e9949d8cfcf84d18b03b8d24c,
>> which doesn't work with version 2.1.0.
>>
>> Any help would be highly appreciated.
>>
>> Regards,
>> Satyajit.
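P.S. For context, the kind of write I am ultimately trying to get working is the Kafka sink added in that commit. A minimal sketch of the batch write path, typed into spark-shell (topic "test" and localhost:9092 are just placeholders for my setup):

scala> // the Kafka sink expects string or binary "key" and "value" columns
scala> val df = Seq(("k1", "v1"), ("k2", "v2")).toDF("key", "value")
scala> df.write.format("kafka").option("kafka.bootstrap.servers", "localhost:9092").option("topic", "test").save()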