Love it. Please count on me if any help is needed.

On Tue, May 5, 2026, 7:31, DB Tsai <[email protected]> wrote:
> Thanks Daniel and Liang-Chi for driving this. This is an exciting proposal
> that can significantly speed up local experimentation and development on
> laptops. It also helps make Spark a great fit for both big-data workloads
> and small-data exploratory workflows.
>
> DB Tsai | https://www.dbtsai.com/ | PGP 0x9FB9FAA3
>
> On Monday, May 4th, 2026 at 3:39 PM, Daniel Tenedorio <[email protected]> wrote:
>
> Hi Spark community,
>
> We’d like to propose a new SPIP to improve the experience of running
> Apache Spark on laptops.
>
> SPIP doc:
> https://docs.google.com/document/d/1Nphejrf_vh4YRECn0JPgKClqxDS_lB6wufZFJQxyY98/edit?tab=t.0#heading=h.hj76akdx5ul
>
> Summary:
>
> Spark’s execution model is optimized for distributed workloads, but this
> introduces noticeable overhead for small datasets (e.g., <100MB), where
> even simple queries can take multiple seconds. This makes Spark less
> suitable for interactive and exploratory use cases on laptops, and often
> pushes users toward alternative single-node tools.
>
> This proposal aims to reduce that overhead in local mode, improving
> latency for small queries and making Spark more usable as an entry point
> for new users and iterative workflows.
>
> We’d appreciate your review and feedback.
>
> Thanks,
> Daniel Tenedorio and Liang-Chi Hsieh
