Hi All,

I am using Spark 3.0.1 Structured Streaming with PySpark.

The problem is that Spark is running only 1 executor with 1 task. Below is a
summary of what I am doing.

Can anyone help me understand why only one executor is being used?

def process_events(event):
    fetch_actual_data()
    # many more steps

def fetch_actual_data():
    # applying operation on actual data
    pass

df = spark.readStream.format("kafka") \
    .option("kafka.bootstrap.servers", KAFKA_URL) \
    .option("subscribe", KAFKA_TOPICS) \
    .option("startingOffsets", START_OFFSET) \
    .load() \
    .selectExpr("CAST(value AS STRING)")


query = df.writeStream \
    .foreach(process_events) \
    .option("checkpointLocation", "/opt/checkpoint") \
    .trigger(processingTime="30 seconds") \
    .start()
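For context, here is a minimal plain-Python sketch of the per-event flow the
snippet above describes (foreach() calls process_events once per row of each
micro-batch). The JSON payloads and the `value` field are assumptions for
illustration only; no Spark or Kafka is involved:

```python
import json

def fetch_actual_data(payload):
    # Hypothetical stand-in for the real fetch logic: here we just
    # parse the Kafka message value, assumed to be a JSON string.
    return json.loads(payload)

def process_events(event):
    # In Structured Streaming, foreach() invokes this once per row;
    # event["value"] mimics the CAST(value AS STRING) column.
    data = fetch_actual_data(event["value"])
    # ... many more steps would follow here ...
    return data

# Simulated micro-batch of two Kafka records
batch = [{"value": '{"id": 1}'}, {"value": '{"id": 2}'}]
results = [process_events(e) for e in batch]
```

Note that with foreach(), the per-row work is distributed according to the
partitioning of the input DataFrame, so the sketch's sequential loop stands in
for what Spark would run task-by-task.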



Kind Regards,
Sachit Murarka
