Re: Idiomatic way to rate-limit streaming sources to avoid OutOfMemoryError?

2024-04-07 Thread Mich Talebzadeh
OK, this is a common issue in Spark Structured Streaming (SSS), where the source generates data faster than Spark can process it. SSS doesn't have a built-in mechanism for directly rate-limiting the incoming data stream itself. However, consider the following: - Limit the rate at which data

Idiomatic way to rate-limit streaming sources to avoid OutOfMemoryError?

2024-04-07 Thread Baran, Mert
Hi Spark community, I have a Spark Structured Streaming application that reads data from a socket source (implemented very similarly to the TextSocketMicroBatchStream). The issue is that the source can generate data faster than Spark can process it, eventually leading to an OutOfMemoryError