Spark performance on small dataset

Prarthi Jain Sun, 20 Nov 2022 10:50:35 -0800

Hi Everyone,

Spark and the RDD approach it favors assume that most applications run on
big data and need massive parallelism via sharding and concurrent
computing. But some tasks run on small data and do not need or benefit from
RDD parallelism. How are these tasks expected to perform on Spark?


Looking forward to more insights on this!

Thanks,
Prarthi
