Spark shuffle and inevitability of writing to Disk

Mich Talebzadeh Tue, 16 May 2023 10:08:39 -0700

Hi,

On the subject of Spark shuffle, it is generally accepted that a shuffle
*often involves* some, if not all, of the following:


   - Disk I/O
   - Data serialization and deserialization
   - Network I/O
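
To make these three costs concrete, here is a toy sketch in plain Python. It
is not Spark code and does not claim to reflect how Spark implements shuffle
internally. All names (map_side_write, reduce_side_read, toy_shuffle_sum) are
hypothetical, and the "network fetch" is assumed to be a simple local file
read, so only the general shape of the work is shown.

```python
import os
import pickle
import tempfile
from collections import defaultdict


def map_side_write(records, num_partitions, shuffle_dir):
    """Hash-partition (key, value) records and write each bucket to a file."""
    buckets = defaultdict(list)
    for key, value in records:
        buckets[hash(key) % num_partitions].append((key, value))

    paths = {}
    for pid, rows in buckets.items():
        path = os.path.join(shuffle_dir, f"shuffle_part_{pid}.bin")
        with open(path, "wb") as f:
            pickle.dump(rows, f)  # serialization followed by disk I/O
        paths[pid] = path
    return paths


def reduce_side_read(paths, pid):
    """Read one partition back and sum the values per key."""
    path = paths.get(pid)
    if path is None:  # no record hashed to this partition
        return {}
    totals = defaultdict(int)
    with open(path, "rb") as f:
        # Assumption: a real system would fetch this block over the network;
        # here a local read stands in for that step.
        for key, value in pickle.load(f):  # disk I/O plus deserialization
            totals[key] += value
    return dict(totals)


def toy_shuffle_sum(records, num_partitions=2):
    """Run the map-side write and the reduce-side read end to end."""
    with tempfile.TemporaryDirectory() as shuffle_dir:
        paths = map_side_write(records, num_partitions, shuffle_dir)
        result = {}
        for pid in range(num_partitions):
            result.update(reduce_side_read(paths, pid))
        return result
```

Calling toy_shuffle_sum([("a", 1), ("b", 2), ("a", 3)]) groups the values by
key after the round trip through the temporary files. In this sketch every
record goes through disk unconditionally; whether Spark itself can avoid that
step is exactly the question below.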

Excluding the external shuffle service, and without relying on the
configuration options that Spark provides for shuffle, does the operation
always involve disk usage (any HCFS-compatible file system), or will it use
the existing persistent memory where it can?

Thanks

Mich Talebzadeh,
Lead Solutions Architect/Engineering Lead
Palantir Technologies Limited
London
United Kingdom


   View my LinkedIn profile:
<https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>


 https://en.everybodywiki.com/Mich_Talebzadeh



*Disclaimer:* Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.
