GitHub user squito commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21474#discussion_r192487033
  
    --- Diff: core/src/main/scala/org/apache/spark/internal/config/package.scala ---
    @@ -429,7 +429,11 @@ package object config {
             "external shuffle service, this feature can only be worked when external shuffle" +
             "service is newer than Spark 2.2.")
           .bytesConf(ByteUnit.BYTE)
    -      .createWithDefault(Long.MaxValue)
    +      // fetch-to-mem is guaranteed to fail if the message is bigger than 2 GB, so we might
    +      // as well use fetch-to-disk in that case.  The message includes some metadata in addition
    +      // to the block data itself (in particular UploadBlock has a lot of metadata), so we leave
    +      // extra room.
    +      .createWithDefault(Int.MaxValue - 500)
    --- End diff --
    
    There's no guarantee it's big enough. It seemed OK in the test I tried, but UploadBlock has some variable-length strings, so I can't say for sure.
    
    I'm fine making the extra room much bigger, e.g. 1 MB -- you'd only exceed that in a pathological case. Then there would be *some* cases where an old message that was fine with fetch-to-mem would switch to fetch-to-disk. But that's such a tiny case, and not an unreasonable change even for that ... so it should be OK.
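
To make the trade-off concrete, here is a minimal sketch of the arithmetic behind the two headroom choices discussed above. The object and method names (`FetchToMemThreshold`, `threshold`, `fetchToDisk`) are hypothetical and not part of Spark; the sketch also assumes that a block is fetched to disk when its size is strictly above the configured threshold.

```scala
// Hypothetical helper for illustration only; none of these names exist in Spark.
object FetchToMemThreshold {
  // Largest block size still fetched to memory if we reserve `headroomBytes`
  // below the 2 GB (Int.MaxValue) message limit for metadata.
  def threshold(headroomBytes: Long): Long = Int.MaxValue.toLong - headroomBytes

  // Default proposed in the diff: 500 bytes of headroom.
  val diffDefault: Long = threshold(500L)

  // Alternative floated in the comment: 1 MB of headroom.
  val oneMbHeadroom: Long = threshold(1024L * 1024L)

  // Assumption: blocks strictly larger than the threshold go to disk.
  def fetchToDisk(blockSize: Long, thresholdBytes: Long): Boolean =
    blockSize > thresholdBytes
}
```

Under these assumptions, the only blocks whose behavior differs between the two choices are those within the last 1 MB below the 2 GB limit, which is the "tiny case" mentioned above.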

