Instead of External Shuffle Service, Apache Celeborn might be a good option as a Remote Shuffle Service for Spark on K8s.
There are some useful resources you might be interested in.

[1] https://celeborn.apache.org/
[2] https://www.youtube.com/watch?v=s5xOtG6Venw
[3] https://github.com/aws-samples/emr-remote-shuffle-service
[4] https://github.com/apache/celeborn/issues/2140

Thanks,
Cheng Pan

> On Apr 6, 2024, at 21:41, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:
>
> I have seen some older references for shuffle service for k8s,
> although it is not clear they are talking about a generic shuffle
> service for k8s.
>
> Anyhow, with the advent of GenAI and the need to allow for a larger
> volume of data, I was wondering if there has been any more work on
> this matter. Specifically, larger and scalable file systems like HDFS,
> GCS, S3, etc. offer significantly larger storage capacity than local
> disks on individual worker nodes in a k8s cluster, thus allowing
> handling much larger datasets more efficiently. Also, the degree of
> parallelism and fault tolerance with these file systems come into
> it. I will be interested in hearing more about any progress on this.
>
> Thanks
>
> Mich Talebzadeh,
> Technologist | Solutions Architect | Data Engineer | Generative AI
> London
> United Kingdom
>
> https://en.everybodywiki.com/Mich_Talebzadeh
>
> Disclaimer: The information provided is correct to the best of my
> knowledge but of course cannot be guaranteed. It is essential to note
> that, as with any advice, quote "one test result is worth one-thousand
> expert opinions (Wernher von Braun)".
>
> ---------------------------------------------------------------------
> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org