Oh, great point.

On Mon, Aug 7, 2023 at 2:23 PM bo yang <bobyan...@gmail.com> wrote:

> Thanks Holden for bringing this up!
>
> Maybe another thing to think about is how to make dynamic allocation more
> friendly with Kubernetes and disaggregated shuffle storage?
>
> On Mon, Aug 7, 2023 at 1:27 PM Holden Karau <hol...@pigscanfly.ca> wrote:
>
>> So I'm wondering if there is interest in revisiting some of how Spark is
>> doing its dynamic allocation for Spark 4+?
>>
>> Some things that I've been thinking about:
>>
>> - Advisory user input (e.g. a way to say after X is done I know I need Y
>> where Y might be a bunch of GPU machines)
>> - Configurable tolerance (e.g. if we are at most Z% over target, no-op)
>> - Past runs of the same job (e.g. stage X of job Y had a peak of K)
>> - Faster executor launches (I'm a little fuzzy on what we can do here,
>> but one area, for example, is that we set up and tear down an RPC
>> connection to the driver with a blocking call, which does seem to have
>> some locking inside of the driver at first glance)
>>
>> Is this an area other folks are thinking about? Should I make an epic we
>> can track ideas in? Or are folks generally happy with today's dynamic
>> allocation (or just busy with other things)?
>>
>> --
>> Twitter: https://twitter.com/holdenkarau
>> Books (Learning Spark, High Performance Spark, etc.):
>> https://amzn.to/2MaRAG9
>> YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>

--
Twitter: https://twitter.com/holdenkarau
Books (Learning Spark, High Performance Spark, etc.):
https://amzn.to/2MaRAG9
YouTube Live Streams: https://www.youtube.com/user/holdenkarau