Hi Spark community,

We’d like to propose a new SPIP to improve the experience of running Apache Spark on laptops.

SPIP doc: https://docs.google.com/document/d/1Nphejrf_vh4YRECn0JPgKClqxDS_lB6wufZFJQxyY98/edit?tab=t.0#heading=h.hj76akdx5ul

Summary:

Spark’s execution model is optimized for distributed workloads, but this introduces noticeable overhead for small datasets (e.g., <100MB), where even simple queries can take multiple seconds. This makes Spark less suitable for interactive and exploratory use cases on laptops, and often pushes users toward alternative single-node tools.

This proposal aims to reduce that overhead in local mode, improving latency for small queries and making Spark more usable as an entry point for new users and iterative workflows.

We’d appreciate your review and feedback.

Thanks,
Daniel Tenedorio and Liang-Chi Hsieh
