Unable to create direct stream with SSL enabled Kafka cluster

2021-05-15 Thread dwgw
Hi, I am trying to stream a Kafka topic using createDirectStream(). The Kafka cluster is SSL-enabled. The code for this is: import findspark findspark.init('/u01/idp/spark') from pyspark import SparkContext from pyspark.streaming import StreamingContext from pys
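Since the SSL handshake is configured through Kafka client properties, the usual route on Spark 3.x is the Structured Streaming kafka source, where each Kafka client property is passed with a "kafka." prefix (the older Python createDirectStream API from the 0-8 integration predates SSL support). A minimal sketch; the broker address, paths, and passwords below are hypothetical placeholders, not values from the original post:

```python
# Hedged sketch: options for reading an SSL-secured Kafka topic with
# Spark Structured Streaming. Kafka client properties are passed with
# a "kafka." prefix; all values below are placeholders.
ssl_options = {
    "kafka.bootstrap.servers": "broker1:9093",
    "kafka.security.protocol": "SSL",
    "kafka.ssl.truststore.location": "/etc/kafka/client.truststore.jks",
    "kafka.ssl.truststore.password": "changeit",
    "kafka.ssl.keystore.location": "/etc/kafka/client.keystore.jks",
    "kafka.ssl.keystore.password": "changeit",
    "subscribe": "my_topic",
}

# In a live session these would be applied as:
#   df = spark.readStream.format("kafka").options(**ssl_options).load()
```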

value col is not a member of org.apache.spark.rdd.RDD

2020-09-01 Thread dwgw
Hi, I am trying to generate a hierarchy table using Spark GraphX, but at runtime I get the following error: error: value col is not a member of org.apache.spark.rdd.RDD[(Any, (Int, Any, String, Int, Int))] val empHirearchyDF = empHirearchyExtDF.join(empDF , empDF.col("emp_id") === e

Streaming AVRO data in console: java.lang.ArrayIndexOutOfBoundsException

2020-08-10 Thread dwgw
Hi, I am trying to stream a Kafka topic (in AVRO format) to the console. I have loaded the Avro data from the Kafka topic into a DataFrame, but when I try to stream it to the console I get the following error. scala> val records = spark. readStream. format("kafka").
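One frequent cause of java.lang.ArrayIndexOutOfBoundsException when decoding Kafka AVRO records is that the producer used Confluent's schema-registry serializer, which prepends a 5-byte header (one magic byte 0x00 plus a 4-byte big-endian schema ID) that plain Avro decoders such as spark-avro's from_avro do not expect. A small self-contained sketch of that framing; the message bytes here are fabricated for illustration:

```python
import struct

def strip_confluent_header(payload: bytes):
    """Split a Confluent-framed Kafka message into (schema_id, avro_body).

    Confluent's wire format is: 1 magic byte (0x00), a 4-byte
    big-endian schema-registry ID, then the raw Avro-encoded body.
    """
    if len(payload) < 5 or payload[0] != 0:
        raise ValueError("not Confluent-framed Avro")
    schema_id = struct.unpack(">I", payload[1:5])[0]
    return schema_id, payload[5:]

# Fabricated example message: schema ID 7 followed by a dummy body.
framed = b"\x00" + struct.pack(">I", 7) + b"avro-bytes"
```

In Spark SQL the same trimming is typically applied to the value column with a substring expression that drops the first five bytes before calling from_avro.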

error: object functions is not a member of package org.apache.spark.sql.avro

2020-08-08 Thread dwgw
Hi, I am getting the following error while trying to import the package org.apache.spark.sql.avro.functions._ in the Scala shell: scala> import org.apache.spark.sql.avro.functions._ :23: error: object functions is not a member of package org.apache.spark.sql.avro import org.apache.spark.sql.avro.fu
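The org.apache.spark.sql.avro.functions object lives in the external spark-avro module, which is not bundled with the Spark distribution, so the import fails in a plain shell. A hedged example of adding it at launch; the artifact version here assumes the Spark 3.0.0 / Scala 2.12 build mentioned elsewhere in these threads and should be matched to your own versions:

```shell
# spark-avro is an external module and must be added explicitly;
# match the artifact suffix and version to your Scala and Spark builds.
spark-shell --packages org.apache.spark:spark-avro_2.12:3.0.0
```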

Spark streaming with Confluent kafka

2020-07-03 Thread dwgw
Hi, I am trying to stream a Confluent Kafka topic in the Spark shell. For that I invoked the shell with the following command. # spark-shell --packages org.apache.spark:spark-sql-kafka-0-10_2.12:3.0.0 --conf "spark.executor.extraJavaOptions=-Djava.security.auth.login.config=/home/spark/kafka_jaa
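The --conf above points the executor JVMs at a JAAS file via java.security.auth.login.config. For a SASL-secured Confluent cluster, such a file typically contains a KafkaClient section; the login module and credentials below are hypothetical placeholders, not values from the original post:

```
KafkaClient {
  org.apache.kafka.common.security.plain.PlainLoginModule required
  username="client"
  password="client-secret";
};
```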

Re: Spark streaming with Kafka

2020-07-02 Thread dwgw
Hi, I was able to correct the issue. It was caused by a wrong version of a JAR file I had used. After removing those JAR files and copying in the correct versions, the error went away. Regards -- Sent from: http://apache-spark-user-list.1001560.n3.nabble.com/

Spark streaming with Kafka

2020-07-02 Thread dwgw
Hi, I am trying to stream a Kafka topic from the Spark shell, but I get the following error. I am using Spark 3.0.0 / Scala 2.12.10 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_212). [spark@hdp-dev ~]$ spark-shell --packages org.apache.spark:spark-sql-kafka-0-10_2.12:3.0.0 Ivy Default Cache s
