[GitHub] spark issue #22998: [SPARK-26001][SQL]Reduce memory copy when writing decima...
Github user mgaido91 commented on the issue: https://github.com/apache/spark/pull/22998

Yes, I agree with @cloud-fan: this can create wrong results with nulls.

---
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22998

I think this is wrong. We have to zero out the bytes even when writing a null decimal, so that two unsafe rows with the same values (including null values) are exactly the same in binary format.

---
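The point above can be illustrated with a small, self-contained sketch. This is not Spark's actual `UnsafeRowWriter` (which writes to raw memory via `Platform`); the class and constant names here are invented for illustration. The idea: row equality and hashing operate on the raw bytes, so a null field's fixed-width slot must be zeroed, otherwise two logically equal rows can compare unequal because of stale buffer contents.

```java
import java.util.Arrays;

// Illustrative sketch (not Spark's actual UnsafeRowWriter): why a null
// decimal's fixed-width slot must be zeroed. Binary row comparison works
// on raw bytes, so stale bytes in a null field's slot break equality.
public class NullSlotZeroing {
    static final int SLOT_BYTES = 16; // a high-precision decimal occupies 16 bytes

    // Write a null decimal into `row` at `offset`, zeroing the whole slot.
    // (A real writer would also set the field's bit in the null bitmap.)
    static void writeNullDecimalZeroed(byte[] row, int offset) {
        Arrays.fill(row, offset, offset + SLOT_BYTES, (byte) 0);
    }

    public static void main(String[] args) {
        byte[] rowA = new byte[SLOT_BYTES];
        byte[] rowB = new byte[SLOT_BYTES];
        Arrays.fill(rowB, (byte) 0x7f);  // rowB's buffer holds stale data

        writeNullDecimalZeroed(rowA, 0);
        writeNullDecimalZeroed(rowB, 0);

        // With zeroing, both rows are byte-for-byte identical.
        System.out.println(Arrays.equals(rowA, rowB)); // prints "true"
    }
}
```

If the second `writeNullDecimalZeroed` call skipped the `Arrays.fill`, `rowB` would still contain `0x7f` bytes and the binary comparison would fail even though both rows hold the same (null) value.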
Github user heary-cao commented on the issue: https://github.com/apache/spark/pull/22998

@kiszk thank you for reviewing it.

- when writing null decimals:
```
OpenJDK 64-Bit Server VM 1.8.0_163-b01 on Windows 7 6.1
Intel64 Family 6 Model 94 Stepping 3, GenuineIntel
iter length 1048576:         Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------
before PR (input == null)           51 /   56         20.4          49.0       1.0X
after PR (input == null)             8 /    9        125.2           8.0       6.1X
```
- when writing non-null decimals:
```
OpenJDK 64-Bit Server VM 1.8.0_163-b01 on Windows 7 6.1
Intel64 Family 6 Model 94 Stepping 3, GenuineIntel
iter length 1048576:         Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------
before PR (input != null)           52 /   53         20.3          49.2       1.0X
after PR (input != null)            54 /   56         19.3          51.7       1.0X
```

---
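The ~6x speedup on the null path is consistent with avoiding a byte-array allocation and copy when the input is null. A hedged sketch of that kind of fast path follows; it is not the actual patch (real `UnsafeRowWriter` stores to off-heap memory via `Platform.putLong`), and all names here are invented for illustration:

```java
import java.nio.ByteBuffer;

// Hedged sketch of a null fast path like the one the benchmark measures.
// NOT the actual Spark patch: the real writer uses Platform.putLong on raw
// memory. The idea is the same: for a null high-precision decimal, skip
// materializing and copying a byte[] and just store two zero 8-byte words
// into the fixed 16-byte slot.
public class DecimalNullFastPath {
    static final int SLOT_BYTES = 16;

    // Slow path (pre-PR flavor): allocate a zero-filled array and copy it.
    static void writeNullSlow(ByteBuffer row, int offset) {
        byte[] zeros = new byte[SLOT_BYTES];
        row.position(offset);
        row.put(zeros);
    }

    // Fast path (post-PR flavor): two 8-byte stores, no allocation, no copy.
    static void writeNullFast(ByteBuffer row, int offset) {
        row.putLong(offset, 0L);
        row.putLong(offset + 8, 0L);
    }

    public static void main(String[] args) {
        ByteBuffer a = ByteBuffer.allocate(SLOT_BYTES);
        ByteBuffer b = ByteBuffer.allocate(SLOT_BYTES);
        // Pre-fill with garbage to simulate a reused writer buffer.
        for (int i = 0; i < SLOT_BYTES; i++) {
            a.put(i, (byte) 0x5a);
            b.put(i, (byte) 0x5a);
        }

        writeNullSlow(a, 0);
        writeNullFast(b, 0);

        // Both paths leave the slot in the same all-zero state.
        System.out.println(java.util.Arrays.equals(a.array(), b.array())
                ? "same bytes" : "DIFFERENT");
    }
}
```

Both paths produce identical bytes, which is why the optimization preserves the binary-equality guarantee for the null case while removing the per-row allocation and copy.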
Github user heary-cao commented on the issue: https://github.com/apache/spark/pull/22998

@mgaido91 thank you for reviewing it. I added a test case that writes a decimal with 16 bytes and then one with less than 8. With the current change, the remaining 8 bytes are no longer left dirty.

---
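The hazard that test case guards against can be sketched as follows. This is a hedged illustration, not the actual Spark test: the class, constant, and method names are invented, and a real `UnsafeRowWriter` reuses its buffer across rows, which is exactly why a short value must not inherit the tail of a longer one written earlier:

```java
import java.util.Arrays;

// Hedged sketch (not the actual Spark test) of the dirty-bytes scenario:
// the writer's buffer is reused, first for a decimal occupying the full
// 16-byte slot, then for one occupying less than 8 bytes. If the second
// write did not clear the slot first, the tail of the previous value
// would leak into the new row.
public class DirtyDecimalSlot {
    static final int SLOT_BYTES = 16;

    // Clear the whole slot, then copy in the value's bytes.
    static void writeDecimalBytes(byte[] row, int offset, byte[] value) {
        Arrays.fill(row, offset, offset + SLOT_BYTES, (byte) 0);
        System.arraycopy(value, 0, row, offset, value.length);
    }

    public static void main(String[] args) {
        byte[] row = new byte[SLOT_BYTES];

        byte[] big = new byte[SLOT_BYTES];   // first: a 16-byte decimal
        Arrays.fill(big, (byte) 0x11);
        writeDecimalBytes(row, 0, big);

        byte[] small = {0x22, 0x22};         // then: one with less than 8 bytes
        writeDecimalBytes(row, 0, small);

        boolean clean = true;
        for (int i = small.length; i < SLOT_BYTES; i++) {
            if (row[i] != 0) { clean = false; }
        }
        System.out.println(clean ? "slot is clean" : "slot is DIRTY");
    }
}
```

Dropping the `Arrays.fill` from `writeDecimalBytes` reproduces the bug: bytes 2 through 15 would still hold `0x11` from the previous row.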
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/22998

I have two questions.
1. Is this PR already tested with `"SPARK-25538: zero-out all bits for decimals"`?
2. How does this PR achieve its performance improvement? This PR may introduce some complexity, so we would like to know the trade-off between performance and ease of understanding.

---
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22998

Can one of the admins verify this patch?

---
Github user heary-cao commented on the issue: https://github.com/apache/spark/pull/22998

cc @mgaido91, @dongjoon-hyun, @cloud-fan, @kiszk

---