Github user squito commented on the issue: https://github.com/apache/spark/pull/22818

Actually there are quite a few more uses, even of `Int.MaxValue`, which I find suspicious, but for the moment I only wanted to touch the cases I understood better. For example, ["spark.sql.sortMergeJoinExec.buffer.in.memory.threshold"](https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala#L1225-L1231) is used as the max size for an [`ArrayBuffer` in `ExternalAppendOnlyUnsafeRowArray`](https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/ExternalAppendOnlyUnsafeRowArray.scala#L107-L108), and I'm pretty sure that will cause the same problem:

```scala
> scala -J-Xmx16G

scala> val x = new scala.collection.mutable.ArrayBuffer[Int](128)

scala> x.sizeHint(Int.MaxValue)
java.lang.OutOfMemoryError: Requested array size exceeds VM limit
  at scala.collection.mutable.ArrayBuffer.sizeHint(ArrayBuffer.scala:69)
  ... 30 elided
```

Do you think it's important to tackle them all here? I could also open another JIRA to do an audit.
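To illustrate the kind of fix this implies, here is a minimal sketch of clamping a configured threshold before using it as a size hint. The object name `SafeSizeHint` and the exact cap constant are illustrative assumptions, not Spark's actual API; the point is only that a config value of `Int.MaxValue` must be bounded below the JVM's practical array-length limit before it reaches `ArrayBuffer.sizeHint`:

```scala
object SafeSizeHint {
  // Assumption: the JVM refuses array allocations very close to
  // Int.MaxValue ("Requested array size exceeds VM limit"), so we
  // cap slightly below it. The exact slack (15 here) is illustrative.
  val MaxSafeArrayLength: Int = Int.MaxValue - 15

  // Clamp a user-configured threshold to a size the JVM can allocate.
  def boundedHint(requested: Int): Int =
    math.min(requested, MaxSafeArrayLength)
}

// Usage sketch: the hint passed on is never larger than the cap,
// even when the config is set to Int.MaxValue.
val buf = new scala.collection.mutable.ArrayBuffer[Int](128)
buf.sizeHint(SafeSizeHint.boundedHint(1024))
```

This does not change behavior for realistic thresholds; it only prevents the degenerate `Int.MaxValue` setting from turning into an impossible allocation request.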
---

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org