Hi everyone,

I'm running into more and more cases where too many files are opened when
spark.shuffle.consolidateFiles is turned off.

I was wondering if this is a common scenario among the rest of the
community, and if so, if it is worth considering the setting to be turned on
by default. From the documentation, it seems like the performance could be
hurt on ext3 file systems. However, what are the concrete numbers of
performance degradation that is seen typically? A 2x slowdown in the average
job? 3x? Also, what cause the performance degradation on ext3 file systems
specifically?

Thanks,

-Matt Cheah


Attachment: smime.p7s
Description: S/MIME cryptographic signature

Reply via email to