I have seen some older references to a shuffle service for k8s,
although it is not clear whether they were talking about a generic,
cluster-wide shuffle service for k8s.

Anyhow, with the advent of GenAI and the need to handle much larger
volumes of data, I was wondering whether there has been any further
work on this. Specifically, large and scalable file systems such as
HDFS, GCS and S3 offer significantly more storage capacity than the
local disks on individual worker nodes in a k8s cluster, which would
allow much larger datasets to be handled more efficiently. The degree
of parallelism and the fault tolerance these file systems provide
also come into it. I would be interested in hearing about any
progress on this.
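For context, the closest built-in mechanism I am aware of is the
shuffle block migration added around Spark 3.1, which can spill
shuffle data from decommissioning executors to a remote store via
spark.storage.decommission.fallbackStorage.path. A minimal sketch is
below; it assumes Spark 3.1+ on k8s with the relevant S3A/GCS
connectors on the classpath, and the bucket name is of course
hypothetical. This is not a generic disaggregated shuffle service,
but it illustrates the direction:

import org.apache.spark.sql.SparkSession

object RemoteShuffleFallbackSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("remote-shuffle-fallback-sketch")
      // Allow executors to be decommissioned gracefully
      .config("spark.decommission.enabled", "true")
      .config("spark.storage.decommission.enabled", "true")
      // Migrate shuffle blocks off executors that are going away
      .config("spark.storage.decommission.shuffleBlocks.enabled", "true")
      // If no peer executor can take the blocks, fall back to remote
      // storage (S3/GCS/HDFS). Hypothetical bucket name.
      .config("spark.storage.decommission.fallbackStorage.path",
              "s3a://my-shuffle-fallback-bucket/spark/")
      .getOrCreate()

    // A simple wide transformation so the job actually produces shuffle files
    val df = spark.range(0L, 10000000L).toDF("id")
    df.groupBy((df("id") % 100).as("bucket")).count().show()

    spark.stop()
  }
}

That said, this only covers executor loss/decommissioning; a proper
remote shuffle service for k8s would go further, which is what I am
asking about.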

Thanks.

Mich Talebzadeh,

Technologist | Solutions Architect | Data Engineer | Generative AI

London
United Kingdom

View my LinkedIn profile

https://en.everybodywiki.com/Mich_Talebzadeh

Disclaimer: The information provided is correct to the best of my
knowledge but of course cannot be guaranteed. As with any advice,
remember that "one test result is worth one thousand expert opinions"
(Wernher von Braun).
