Re: class KafkaCluster related errors

2021-06-17 Thread Mich Talebzadeh
This is interesting because I am using PySpark but I need these jar files for Spark 3.1.1 and Kafka 2.7.0 to work kafka-clients-2.7.0.jar commons-pool2-2.9.0.jar spark-streaming_2.12-3.1.1.jar spark-sql-kafka-0-10_2.12-3.1.0.jar Do you have equivalent of these artifacts in your POM file? HTH

Re: class KafkaCluster related errors

2021-06-17 Thread Mich Talebzadeh
Hi Kiran, You need kafka-clients for the version of kafka you are using. So if it is the correct version keep it. Try running and see what the error says. HTH view my Linkedin profile *Disclaimer:* Use it at your own risk. Any

Re: class KafkaCluster related errors

2021-06-13 Thread Mich Talebzadeh
Hi Kiran Looking at your pom file Isee below org.apache.kafka kafka-streams 0.10.0.0 Why do you need kafka-streams? I don't think spark uses it and may get libraries confused. HTH view my Linkedin profile

Re: class KafkaCluster related errors

2021-06-11 Thread Mich Talebzadeh
Hi Kiran. I was using - Kafka Cluster 2.12-1.1.0 - Spark Streaming 2.3, Spark SQL 2.3 - Scala 2.11.8 Your Kafka version 0.10 seems to be pretty old. That may be the issue here. Try upgrading Kafka in a test environment to see if it helps. HTH view my Linkedin profile

Re: class KafkaCluster related errors

2021-06-08 Thread Mich Talebzadeh
Hi Kiran, I don't seem to have a reference to handling offsets in my old code. However, in Spark structured streaming (SSS) I handle it using a reference to checkpointLocation as below: (this is in Python) checkpoint_path = "file:///ssd/hduser/avgtemperature/chkpt" result =

Re: class KafkaCluster related errors

2021-06-07 Thread Kiran Biswal
Hi Mich, Thanks a lot for your response. I am basically trying to get some older code(streaming job to read from kafka) in 2.0.1 spark to work in 3.0,1. The specific area where I am having problem (KafkaCluster) has most likely to do with get/ set commit offsets in kafka // Create message

Re: class KafkaCluster related errors

2021-06-07 Thread Mich Talebzadeh
Hi Kiran, As you be aware createDirectStream is depreciated and you ought to use Spark Structured streaming, especially that you are moving to version 3.0.1. If you still want to use dstream then that page seems to be correct Looking at my old code I have import org.apache.spark.streaming._

Re: class KafkaCluster related errors

2021-06-07 Thread Mich Talebzadeh
Hi, Are you trying to read topics from Kafka in spark 3.0.1? Have you checked Spark 3.0.1 documentation? Integrating Spark with Kafka is pretty straight forward. with 3.0.1 and higher HTH view my Linkedin profile *Disclaimer:*

class KafkaCluster related errors

2021-06-06 Thread Kiran Biswal
*I am using spark 3.0.1 AND Kafka 0.10 AND Scala 2.12. Getting an error related to KafkaCluster (not found: type KafkaCluster). Is this class deprecated? How do I find a replacement?* *I am upgrading from spark 2.0.1 to spark 3.0.1* *In spark 2.0.1 KafkaCluster was supported*