Hello Spark community,

SPARK_LOCAL_DIRS (or spark.local.dir) is supposed to accept a comma-separated list of directories. I want to list one fast local drive followed by a GPFS network drive, similar to what is done here: https://cug.org/proceedings/cug2016_proceedings/includes/files/pap129s2-file1.pdf

"Thus it is preferable to bias the data towards faster storage by including multiple directories on the faster devices (e.g., SPARK LOCAL DIRS=/tmp/spark1, /tmp/spark2, /tmp/spark3, /lus/scratch/sparkscratch/)."

The purpose of this is to get the speed of local storage while also avoiding "out of space" errors. However, for me Spark only considers the first directory in the list:

export SPARK_LOCAL_DIRS="/tmp, /share/xxxx"

I am using Spark 3.4.1. Does anyone have experience getting this to work? If so, can you suggest a simple example I can try, and tell me which version of Spark you are using? I am also trying this with 2 local drives.

Regards,
Andrew

--
Andrew Petersen, PhD
Advanced Computing, Office of Information Technology
2620 Hillsborough Street
datascience.oit.ncsu.edu
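P.S. In case it helps anyone reproduce this, here is a minimal sketch of what I am testing. The directory names are placeholders for my actual drives (the `/tmp/spark` and `/share/xxxx/spark` paths are hypothetical), and I set the variable before starting the Spark processes:

```shell
# Hypothetical reproduction sketch: two scratch locations, one local (/tmp)
# and one on the network filesystem. Paths are placeholders for my setup.
export SPARK_LOCAL_DIRS="/tmp/spark,/share/xxxx/spark"

# Print what Spark should see when it reads the environment variable.
echo "$SPARK_LOCAL_DIRS"

# The same list can alternatively be passed via the spark.local.dir
# property, e.g. in spark-defaults.conf or on the command line:
#   spark-submit --conf spark.local.dir="/tmp/spark,/share/xxxx/spark" ...
```

Is this roughly the shape of configuration that worked for you, or did you need something different?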