Re: [Internet]Re: Improving Dynamic Allocation Logic for Spark 4+

2023-08-23 Thread Holden Karau
One option could be to initially launch both drivers and initial executors (using the lazy executor ID allocation), but it would introduce a lot of complexity. On Wed, Aug 23, 2023 at 6:44 PM Qian Sun wrote: > Hi Mich > > I agree with your opinion that the startup time of the Spark on

Re: [Internet]Re: Improving Dynamic Allocation Logic for Spark 4+

2023-08-23 Thread Qian Sun
Hi Mich, I agree with your opinion that the startup time of Spark on Kubernetes clusters needs to be improved. Regarding fetching the image directly, I have used ImageCache to store the images on the node, eliminating the time required to pull images from a remote repository, which does

Re: Dynamic resource allocation for structured streaming [SPARK-24815]

2023-08-23 Thread Pavan Kotikalapudi
Thanks for the review, Mich. I have updated Q4 to be as concise as possible and moved the detailed explanation to the Appendix. Here is the updated answer to Q4. Thank you,

Re: [Internet]Re: Improving Dynamic Allocation Logic for Spark 4+

2023-08-23 Thread Mich Talebzadeh
Hi all, In this conversation, one of the issues I brought up was the driver start-up time. This is especially true in k8s. As Spark on k8s is modeled on the Spark standalone scheduler, Spark on k8s consists of a single driver pod (the “master” in standalone) and a number of executors (“workers”). When

Re: Dynamic resource allocation for structured streaming [SPARK-24815]

2023-08-23 Thread Mich Talebzadeh
Hi Pavan, I started reading your SPIP but have difficulty understanding it in detail. Specifically, under Q4, "What is new in your approach and why do you think it will be successful?", I believe it would be better to remove the plots and focus on "what this proposed solution is going to add to

Re: Volcano in spark distro

2023-08-23 Thread Santosh Pingale
> In any case, I'd like to say that the root cause of the difference is those scheduler designs instead of Apache Spark itself. For example, Apache YuniKorn doesn't force us to add a new dependency at all while Volcano did. This makes sense! > These days, I prefer and invest more Apache