Hi everyone, I'm running into more and more cases where too many files are opened when spark.shuffle.consolidateFiles is turned off.
I was wondering whether this is a common scenario in the rest of the community and, if so, whether it is worth turning the setting on by default. From the documentation, it seems that performance could suffer on ext3 file systems. However, what concrete performance degradation is typically seen? A 2x slowdown in the average job? 3x? Also, what causes the performance degradation on ext3 file systems specifically?

Thanks,
-Matt Cheah