Hi All,

I have 10 million records in a Kafka topic, and I am just trying to run
spark.sql("select count(*) from kafka_view"). I am reading from Kafka and
writing back to Kafka.

My writeStream is set to "update" mode with a trigger interval of one second
(Trigger.ProcessingTime(1000)). I expected the count to be printed every
second, but it looks like it only prints after going through all 10M
records. Why?
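
For context, my job looks roughly like this (bootstrap servers, topic names,
and the checkpoint path below are placeholders, not my real config):

```scala
import org.apache.spark.sql.streaming.Trigger

// Source: read the 10M-record topic
val df = spark.readStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "broker:9092")
  .option("subscribe", "input_topic")
  .load()

df.createOrReplaceTempView("kafka_view")

// The aggregation in question; the Kafka sink needs a string "value" column
val counts = spark.sql("select cast(count(*) as string) as value from kafka_view")

// Sink: update mode, one-second trigger
counts.writeStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "broker:9092")
  .option("topic", "output_topic")
  .option("checkpointLocation", "/tmp/count-checkpoint")
  .outputMode("update")
  .trigger(Trigger.ProcessingTime(1000))
  .start()
```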

Also, it seems to take forever, whereas a Linux wc over 10M rows takes about
30 seconds.

Thanks!
