Github user squito commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21474#discussion_r192487033

    --- Diff: core/src/main/scala/org/apache/spark/internal/config/package.scala ---
    @@ -429,7 +429,11 @@ package object config {
             "external shuffle service, this feature can only be worked when external shuffle" +
             "service is newer than Spark 2.2.")
           .bytesConf(ByteUnit.BYTE)
    -      .createWithDefault(Long.MaxValue)
    +      // fetch-to-mem is guaranteed to fail if the message is bigger than 2 GB, so we might
    +      // as well use fetch-to-disk in that case.  The message includes some metadata in addition
    +      // to the block data itself (in particular UploadBlock has a lot of metadata), so we leave
    +      // extra room.
    +      .createWithDefault(Int.MaxValue - 500)
    --- End diff --

    There's no guarantee it's big enough. It seemed OK in the test I tried, but UploadBlock has some variable-length strings, so I can't say for sure.

    I'm fine making this margin much bigger, e.g. 1 MB; the metadata would only be bigger than that in a pathological case. There would then be *some* cases where a message that used to be fine with fetch-to-mem would switch to fetch-to-disk. But that is such a tiny range of sizes, and switching is not an unreasonable change even for those, so it should be OK.
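
To make the arithmetic behind the two margins concrete, here is a rough sketch. It is not Spark code: the object, the value names, and the "block size greater than threshold" rule are assumptions made up for illustration, and the 1000-byte metadata figure in the usage note is hypothetical as well.

```scala
// Illustrative sketch only; none of these names are Spark APIs.
object FetchThresholdSketch {
  // Largest message that can be fetched into memory (just under 2 GB).
  val maxInMemoryMessage: Long = Int.MaxValue.toLong

  // Default proposed in the diff: 500 bytes of headroom for metadata.
  val currentDefault: Long = Int.MaxValue.toLong - 500L

  // Alternative floated in the comment: 1 MB of headroom.
  val widerDefault: Long = Int.MaxValue.toLong - 1024L * 1024L

  // Assumed decision rule: fetch to disk once the block exceeds the threshold.
  def useFetchToDisk(blockSize: Long, threshold: Long): Boolean =
    blockSize > threshold

  // True if block data plus metadata still fits in one in-memory message.
  def fitsInMemory(blockSize: Long, metadataSize: Long): Boolean =
    blockSize + metadataSize <= maxInMemoryMessage
}
```

Under these assumptions, a block of `Int.MaxValue - 600` bytes carrying 1000 bytes of metadata stays on the fetch-to-mem path with the 500-byte margin even though the full message no longer fits, while the 1 MB margin sends it to disk. The cost of the wider margin is that blocks in the last 1 MB below 2 GB always go to disk, including ones whose metadata would have fit.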