Discussing the idea of Shared Volume block store client

2023-04-23 Thread Arun Ravi
Hi Spark Community,

Thank you for all the awesome features of Apache Spark 3.X like

   - GA of Spark on K8s
   - Push-based external shuffle
   - Easy mounting of volume claims in Spark K8s

We run thousands of Spark jobs on K8s, YARN (EMR), and Databricks. We run
into shuffle block loss and RDD block loss for a variety of infrastructure
reasons, such as spot instance loss and memory issues. Not all the resource
managers we use today support the external shuffle service natively. The
existing solutions to this block loss all revolve around using the
ShuffleManager interface to move shuffle data to more reliable locations,
such as cloud storage or remotely managed storage (Uber's shuffle service).
My understanding is that a shuffle manager only handles shuffle blocks,
whereas the external shuffle service (client) can handle any shuffle or RDD
block and can also support push-based shuffle merge. I was wondering what
the impact would be of implementing an alternative solution that uses
shared mounted volumes on high-performance filesystems and forces Spark's
host-local read mechanism to fetch these blocks in any executor that needs
them. This should ideally cover both shuffle and RDD blocks.
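
To make the host-local idea concrete, here is a minimal sketch of how blocks
might be sorted into fetch modes once blocks on a shared volume count as
host-local. It is a toy model, not Spark's ShuffleBlockFetcherIterator code:
FetchModeSketch, FetchMode, BlockLocation, classify, and partition are all
hypothetical names, and the onSharedVolume flag is an assumed piece of
metadata that Spark does not track today.

```java
import java.util.ArrayList;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

public class FetchModeSketch {
    /** The three ways a reader could obtain a block in this toy model. */
    public enum FetchMode { LOCAL, HOST_LOCAL, REMOTE }

    /** Hypothetical description of where a block lives. */
    public static final class BlockLocation {
        final String blockId;
        final String executorId;
        final String host;
        final boolean onSharedVolume; // assumed flag: block files sit on the shared mount

        public BlockLocation(String blockId, String executorId, String host, boolean onSharedVolume) {
            this.blockId = blockId;
            this.executorId = executorId;
            this.host = host;
            this.onSharedVolume = onSharedVolume;
        }
    }

    /** Classify one block from the point of view of the reading executor. */
    public static FetchMode classify(BlockLocation block, String myExecutorId, String myHost) {
        if (block.executorId.equals(myExecutorId)) {
            return FetchMode.LOCAL; // written by this executor
        }
        if (block.host.equals(myHost) || block.onSharedVolume) {
            // Same host, or readable through the shared mount: read the file
            // directly from disk instead of going over the network.
            return FetchMode.HOST_LOCAL;
        }
        return FetchMode.REMOTE;
    }

    /** Group block ids by fetch mode, preserving input order within each group. */
    public static Map<FetchMode, List<String>> partition(
            List<BlockLocation> blocks, String myExecutorId, String myHost) {
        Map<FetchMode, List<String>> result = new EnumMap<>(FetchMode.class);
        for (FetchMode mode : FetchMode.values()) {
            result.put(mode, new ArrayList<>());
        }
        for (BlockLocation block : blocks) {
            result.get(classify(block, myExecutorId, myHost)).add(block.blockId);
        }
        return result;
    }
}
```

In this model a block written by an executor on another host still lands in
the HOST_LOCAL group as long as its files are on the shared mount, which is
the behavior the proposal relies on.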

The outline of the idea is as follows:

   - Mount shared volumes (e.g. AWS FSx for Lustre, EFS, or other network
   filesystems) in the Spark driver and executors.
   - A Network Block Store Client (extending the existing block store
   client) retrieves executor info from metadata stored in the shared
   volume.
   - Override getHostLocalDirs in the external block store client to return
   the local directories of all registered executors (to trigger the
   host-local read path in the Spark codebase).
   - Update ShuffleBlockFetcherIterator's partitionBlockByFetchMode to
   treat blocks on the shared volume mount as host-local.
   - Deploy a small number of external shuffle servers to perform shuffle
   block merging based on the existing push-merge logic, thereby adding
   support for it in non-YARN resource managers.
   - Implement the push-based merge functions in the Network Block Store
   Client to work with these external shuffle servers.
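
The metadata lookup in the second and third points could be sketched
roughly as below, assuming each executor writes one small file listing its
local directories under a well-known folder on the shared mount. Everything
here is hypothetical (the class name, the executor-metadata folder, the
.dirs suffix, and the register method), and the getHostLocalDirs shown is a
simplified synchronous stand-in; the real method in Spark has a different
signature.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class SharedVolumeRegistrySketch {
    private static final String SUFFIX = ".dirs"; // hypothetical file suffix
    private final Path metadataDir;

    /** sharedMount is the path where the shared volume is mounted in the pod. */
    public SharedVolumeRegistrySketch(Path sharedMount) throws IOException {
        // Hypothetical layout: <mount>/executor-metadata/<executorId>.dirs
        this.metadataDir = Files.createDirectories(sharedMount.resolve("executor-metadata"));
    }

    /** Called by an executor at startup to publish its local directories. */
    public void register(String executorId, List<String> localDirs) throws IOException {
        Path tmp = metadataDir.resolve(executorId + ".tmp");
        Path target = metadataDir.resolve(executorId + SUFFIX);
        Files.write(tmp, localDirs, StandardCharsets.UTF_8);
        // Write-then-rename so readers do not see a half-written file.
        // Assumption: the shared filesystem supports atomic rename.
        Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE);
    }

    /** Returns executorId -> local directories for every registered executor. */
    public Map<String, List<String>> getHostLocalDirs() throws IOException {
        Map<String, List<String>> result = new TreeMap<>();
        try (DirectoryStream<Path> files = Files.newDirectoryStream(metadataDir, "*" + SUFFIX)) {
            for (Path file : files) {
                String name = file.getFileName().toString();
                String executorId = name.substring(0, name.length() - SUFFIX.length());
                result.put(executorId, Files.readAllLines(file, StandardCharsets.UTF_8));
            }
        }
        return result;
    }
}
```

A file-per-executor layout avoids concurrent writers on a single file, but
it leaves open how entries for lost executors are cleaned up, which this
sketch does not address.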


Thanks in advance for all the feedback and suggestions.

Arun Ravi M V

