[GitHub] [spark] liuzqt commented on pull request #38064: [SPARK-40622][SQL][CORE]Result of a single task in collect() must fit in 2GB

2022-11-10 Thread GitBox
liuzqt commented on PR #38064: URL: https://github.com/apache/spark/pull/38064#issuecomment-1311348015 @mridulm I've tried `local-cluster[1,1,3072]` but it doesn't seem to help. Is there any way to turn up the JVM memory in the GitHub Actions job?
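For context, `local-cluster[numWorkers,coresPerWorker,memoryPerWorkerMB]` is the pseudo-cluster master URL Spark uses in its own test suites to spawn separate executor JVMs. A rough sketch of how such a master URL is wired into a test session (names and sizes here are illustrative, not the PR's actual test code):

```scala
// Sketch only: local-cluster[1,1,3072] = 1 worker, 1 core, 3072 MB per worker JVM.
// Note this controls executor memory; the driver heap is configured separately.
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .master("local-cluster[1,1,3072]")
  .appName("large-collect-sketch")
  .getOrCreate()
```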

[GitHub] [spark] liuzqt commented on pull request #38064: [SPARK-40622][SQL][CORE]Result of a single task in collect() must fit in 2GB

2022-11-09 Thread GitBox
liuzqt commented on PR #38064: URL: https://github.com/apache/spark/pull/38064#issuecomment-1309558197 Tried reducing the large-result test to ~2.1GB, as well as the local-cluster approach, but neither works. Since I've verified the test on my local machine, I removed it.
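For illustration only (this is not the removed test, just a sketch of its shape): produce a single-partition result of roughly 2.1GB and collect it, which exercises the over-2GB task-result path the PR fixes. It assumes an existing SparkSession `spark` with enough driver and executor memory, and the row count/size are placeholders.

```scala
import spark.implicits._

val bigString = "a" * (100 * 1024 * 1024)                  // ~100MB per row (placeholder size)
val ds = spark.range(0, 21, 1, numPartitions = 1)          // keep everything in one partition
  .map(_ => bigString)
val collected = ds.collect()                               // ~2.1GB single task result
assert(collected.length == 21)
```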

[GitHub] [spark] liuzqt commented on pull request #38064: [SPARK-40622][SQL][CORE]Result of a single task in collect() must fit in 2GB

2022-11-07 Thread GitBox
liuzqt commented on PR #38064: URL: https://github.com/apache/spark/pull/38064#issuecomment-1306681050 Finally fixed it by adding `org/apache/spark/util/io` (this should be internal) to `Unidoc.ignoreUndocumentedPackages`.
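That filter lives in the sbt build definition (`project/SparkBuild.scala`). Roughly, excluding an internal package from unidoc looks like the sketch below; the surrounding entries are paraphrased rather than copied from the actual build file, and only the `org/apache/spark/util/io` line reflects this comment:

```scala
// Paraphrased sketch of the Unidoc package filter; existing entries abbreviated.
private def ignoreUndocumentedPackages(packages: Seq[Seq[java.io.File]]): Seq[Seq[java.io.File]] = {
  packages
    .map(_.filterNot(_.getName.contains("$")))
    .map(_.filterNot(_.getCanonicalPath.contains("org/apache/spark/unsafe")))
    // ... other internal packages elided ...
    .map(_.filterNot(_.getCanonicalPath.contains("org/apache/spark/util/io"))) // internal, skip in unidoc
}
```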

[GitHub] [spark] liuzqt commented on pull request #38064: [SPARK-40622][SQL][CORE]Result of a single task in collect() must fit in 2GB

2022-11-07 Thread GitBox
liuzqt commented on PR #38064: URL: https://github.com/apache/spark/pull/38064#issuecomment-1306322963 I fixed the double quote issue in `Cast` and the command `build/sbt -Phadoop-3 -Pyarn -Pdocker-integration-tests -Pspark-ganglia-lgpl -Phive -Pmesos -Phive-thriftserver -Pkubernetes

[GitHub] [spark] liuzqt commented on pull request #38064: [SPARK-40622][SQL][CORE]Result of a single task in collect() must fit in 2GB

2022-11-04 Thread GitBox
liuzqt commented on PR #38064: URL: https://github.com/apache/spark/pull/38064#issuecomment-1304306588 @mridulm I got an error when running that command on my local machine ``` [error] /Users/ziqi.liu/spark/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala:51:

[GitHub] [spark] liuzqt commented on pull request #38064: [SPARK-40622][SQL][CORE]Result of a single task in collect() must fit in 2GB

2022-11-03 Thread GitBox
liuzqt commented on PR #38064: URL: https://github.com/apache/spark/pull/38064#issuecomment-1302834138 @mridulm I'm looking into it, but I don't have any clue so far. It's weird that many of the compilation errors seem unrelated to this PR, and I was able to build on my local machine, not sure

[GitHub] [spark] liuzqt commented on pull request #38064: [SPARK-40622][SQL][CORE]Result of a single task in collect() must fit in 2GB

2022-10-12 Thread GitBox
liuzqt commented on PR #38064: URL: https://github.com/apache/spark/pull/38064#issuecomment-1276518707 Some general comments about the performance implications of replacing `Array[Byte]` and `ByteBuffer` (backed by `Array[Byte]`) with `ChunkedByteBuffer`: - when reading from stream
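The list is cut off in the archive, but the underlying trade-off is between one contiguous buffer and many smaller chunks: chunking means no single `Array[Byte]` or `ByteBuffer` ever has to hold more than 2GB. A simplified sketch of that chunked-read idea (this is not Spark's `ChunkedByteBuffer` implementation; the helper name and chunk size are placeholders):

```scala
import java.io.InputStream
import java.nio.ByteBuffer
import scala.collection.mutable.ArrayBuffer

// Accumulate stream data into fixed-size chunks instead of one giant array.
def readChunked(in: InputStream, chunkSize: Int = 4 * 1024 * 1024): Array[ByteBuffer] = {
  val chunks = ArrayBuffer.empty[ByteBuffer]
  var chunk = new Array[Byte](chunkSize)
  var pos = 0
  var n = in.read(chunk, pos, chunkSize - pos)
  while (n != -1) {
    pos += n
    if (pos == chunkSize) {                   // chunk full: seal it and start a new one
      chunks += ByteBuffer.wrap(chunk)
      chunk = new Array[Byte](chunkSize)
      pos = 0
    }
    n = in.read(chunk, pos, chunkSize - pos)
  }
  if (pos > 0) chunks += ByteBuffer.wrap(chunk, 0, pos)  // keep the partial last chunk
  chunks.toArray
}
```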