Hi dev,

I'd like to propose the deprecation of DStream in Spark 3.4, in favor of
promoting Structured Streaming.
(Sorry for the late proposal; if we don't make the change in 3.4, we will
have to wait another 6 months.)

We have been focusing on Structured Streaming for years (across multiple
major and minor versions), and during that time we haven't made any
improvements to DStream. Furthermore, we recently updated the DStream doc
to explicitly say DStream is a legacy project.
https://spark.apache.org/docs/latest/streaming-programming-guide.html#note

The rationale for the deprecation is that we don't see a particular use
case which only DStream solves. The story is different for GraphX and
MLlib, as we don't have replacements for those.

The proposal does not mean we will remove the API soon; the Spark project
has deprecated public APIs before without removing them right away. I
don't intend to propose a target version for removal. The goal is to guide
users away from building new workloads on DStream. We might want to remove
the API in the future, but that would require a new discussion thread at
that time.

What do you think?

Thanks,
Jungtaek Lim (HeartSaVioR)
