Hi All, I am using Spark 3.0.1 Structured Streaming with PySpark.
The problem is that Spark is running only 1 executor with 1 task. Following is a summary of what I am doing. Can anyone help with why I have only 1 executor?

def process_events(event):
    fetch_actual_data()
    # many more steps

def fetch_actual_data():
    # applying operations on the actual data
    df = spark.readStream.format("kafka") \
        .option("kafka.bootstrap.servers", KAFKA_URL) \
        .option("subscribe", KAFKA_TOPICS) \
        .option("startingOffsets", START_OFFSET) \
        .load() \
        .selectExpr("CAST(value AS STRING)")

query = df.writeStream.foreach(process_events) \
    .option("checkpointLocation", "/opt/checkpoint") \
    .trigger(processingTime="30 seconds") \
    .start()

Kind Regards,
Sachit Murarka