Remove the kafka-clients package from your --packages list and add a starting offset to the reader options.
df = spark.readStream\
.format("kafka")\
.option("zookeeper.connect", "localhost:2181")\
.option("kafka.bootstrap.servers", "localhost:9092")\
.option("subscribe", "ingest")\
.option("failOnDataLoss", "false")\
.option("startingOf
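
For reference, a reader of this kind could be written out in full roughly as in the sketch below. This is only an illustration under stated assumptions, not the exact snippet above: the value "earliest" for startingOffsets is an assumed choice ("latest" is the other common one), the names KAFKA_OPTIONS and read_ingest_stream are made up for the example, and zookeeper.connect is left out on the assumption that the Kafka source only needs the broker list.

```python
# Options for the Structured Streaming Kafka source.
# Broker address and topic are taken from the thread; the starting offset
# value is an assumption made for this sketch.
KAFKA_OPTIONS = {
    "kafka.bootstrap.servers": "localhost:9092",
    "subscribe": "ingest",
    "failOnDataLoss": "false",
    "startingOffsets": "earliest",  # assumed value; "latest" is also valid
}


def read_ingest_stream(spark):
    """Return a streaming DataFrame for the 'ingest' topic.

    `spark` is an existing SparkSession, for example the one the pyspark
    shell creates at startup.
    """
    return (
        spark.readStream
        .format("kafka")
        .options(**KAFKA_OPTIONS)  # apply every option in one call
        .load()
    )
```

In the shell this would be called as df = read_ingest_stream(spark), with the shell started using only the connector package, e.g. pyspark2 --master=yarn --packages org.apache.spark:spark-sql-kafka-0-10_2.11:2.3.0, on the expectation that the connector brings in a matching kafka-clients by itself.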
Hi All,
I am trying to test Structured Streaming with PySpark. The launch command
and packages I used are:
pyspark2 --master=yarn --packages org.apache.spark:spark-sql-kafka-0-10_2.11:2.3.0 --packages org.apache.kafka:kafka-clients:0.10.0.1
However, I am getting the following error (in bold):