[GitHub] spark issue #20757: [SPARK-23595][SQL] ValidateExternalType should support i...

2018-03-07 Thread hvanhovell
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/20757 @maropu looks pretty solid. I left a few comments. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

[GitHub] spark pull request #20757: [SPARK-23595][SQL] ValidateExternalType should su...

2018-03-07 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/20757#discussion_r172787179 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala --- @@ -1408,11 +1409,37 @@ case class

[GitHub] spark pull request #20757: [SPARK-23595][SQL] ValidateExternalType should su...

2018-03-07 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/20757#discussion_r172784041 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala --- @@ -1408,11 +1409,37 @@ case class

[GitHub] spark pull request #20757: [SPARK-23595][SQL] ValidateExternalType should su...

2018-03-07 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/20757#discussion_r172783686 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala --- @@ -1408,11 +1409,37 @@ case class

[GitHub] spark pull request #20757: [SPARK-23595][SQL] ValidateExternalType should su...

2018-03-07 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/20757#discussion_r172782545 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala --- @@ -1408,11 +1409,37 @@ case class

[GitHub] spark pull request #20757: [SPARK-23595][SQL] ValidateExternalType should su...

2018-03-07 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/20757#discussion_r172782339 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala --- @@ -1408,11 +1409,37 @@ case class

[GitHub] spark pull request #20748: [SPARK-23611][SQL] Add a helper function to check...

2018-03-06 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/20748#discussion_r172588993 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ExpressionEvalHelper.scala --- @@ -133,37 +169,39 @@ trait

[GitHub] spark issue #20749: [SPARK-23590][SQL] Add interpreted execution to CreateEx...

2018-03-06 Thread hvanhovell
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/20749 Merging to master. Thanks! --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail

[GitHub] spark pull request #20751: [SPARK-23591][SQL] Add interpreted execution to E...

2018-03-06 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/20751#discussion_r172580937 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala --- @@ -1154,8 +1154,18 @@ case class

[GitHub] spark pull request #20751: [SPARK-23591][SQL] Add interpreted execution to E...

2018-03-06 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/20751#discussion_r172570633 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala --- @@ -1154,8 +1154,18 @@ case class

[GitHub] spark pull request #20751: [SPARK-23591][SQL] Add interpreted execution to E...

2018-03-06 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/20751#discussion_r172566965 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala --- @@ -1154,8 +1154,18 @@ case class

[GitHub] spark issue #20750: [SPARK-23581][SQL] Add interpreted unsafe projection

2018-03-06 Thread hvanhovell
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/20750 @mgaido91 you are right. It is on purpose. I do not like to introduce these things in one big bang (this makes it hard to review and might make it hard to merge). In this PR we just introduce

[GitHub] spark pull request #20748: [SPARK-23611][SQL] Add a helper function to check...

2018-03-06 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/20748#discussion_r172538852 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ExpressionEvalHelper.scala --- @@ -133,37 +169,39 @@ trait

[GitHub] spark pull request #20748: [SPARK-23611][SQL] Add a helper function to check...

2018-03-06 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/20748#discussion_r172538524 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ExpressionEvalHelper.scala --- @@ -93,6 +95,40 @@ trait

[GitHub] spark pull request #20750: [SPARK-23581][SQL] Add interpreted unsafe project...

2018-03-06 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/20750#discussion_r172536692 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/InterpretedUnsafeProjection.scala --- @@ -0,0 +1,345 @@ +/* + * Licensed

[GitHub] spark pull request #20750: [SPARK-23581][SQL] Add interpreted unsafe project...

2018-03-06 Thread hvanhovell
GitHub user hvanhovell opened a pull request: https://github.com/apache/spark/pull/20750 [SPARK-23581][SQL] Add interpreted unsafe projection ## What changes were proposed in this pull request? We currently can only create unsafe rows using code generation. This is a problem

[GitHub] spark issue #20750: [SPARK-23581][SQL] Add interpreted unsafe projection

2018-03-06 Thread hvanhovell
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/20750 cc @cloud-fan @mgaido91 @kiszk @maropu --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #20749: [SPARK-23590][SQL] Add interpreted execution to CreateEx...

2018-03-06 Thread hvanhovell
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/20749 LGTM - pending jenkins. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail

[GitHub] spark pull request #20748: [SPARK-23611][SQL] Add a helper function to check...

2018-03-06 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/20748#discussion_r172514416 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ExpressionEvalHelper.scala --- @@ -93,6 +95,40 @@ trait

[GitHub] spark pull request #20748: [SPARK-23611][SQL] Add a helper function to check...

2018-03-06 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/20748#discussion_r172513927 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ExpressionEvalHelper.scala --- @@ -93,6 +95,40 @@ trait

[GitHub] spark issue #20746: [SPARK-23594][SQL] GetExternalRowField should support in...

2018-03-06 Thread hvanhovell
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/20746 Merging to master. Thanks! --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail

[GitHub] spark pull request #20746: [SPARK-23594][SQL] GetExternalRowField should sup...

2018-03-06 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/20746#discussion_r172477800 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ObjectExpressionsSuite.scala --- @@ -84,4 +85,23 @@ class

[GitHub] spark pull request #20746: [SPARK-23594][SQL] GetExternalRowField should sup...

2018-03-06 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/20746#discussion_r172477430 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ObjectExpressionsSuite.scala --- @@ -84,4 +85,23 @@ class

[GitHub] spark issue #20746: [SPARK-23594][SQL] GetExternalRowField should support in...

2018-03-06 Thread hvanhovell
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/20746 @maropu look pretty good. I left one comment, otherwise this is gtg. --- - To unsubscribe, e-mail: reviews-unsubscr

[GitHub] spark pull request #20746: [SPARK-23594][SQL] GetExternalRowField should sup...

2018-03-06 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/20746#discussion_r172455421 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ObjectExpressionsSuite.scala --- @@ -84,4 +85,23 @@ class

[GitHub] spark issue #20741: [SPARK-23586][SQL] Add interpreted execution to WrapOpti...

2018-03-05 Thread hvanhovell
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/20741 Merging to master. Thanks! --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail

[GitHub] spark pull request #20741: [SPARK-23586][SQL] Add interpreted execution to W...

2018-03-05 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/20741#discussion_r172337853 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ObjectExpressionsSuite.scala --- @@ -75,4 +75,14 @@ class

[GitHub] spark pull request #20741: [SPARK-23586][SQL] Add interpreted execution to W...

2018-03-05 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/20741#discussion_r172333065 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ObjectExpressionsSuite.scala --- @@ -75,4 +75,14 @@ class

[GitHub] spark issue #20736: [SPARK-23585][SQL] Add interpreted execution to UnwrapOp...

2018-03-05 Thread hvanhovell
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/20736 Merging to master. Thanks! --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail

[GitHub] spark issue #20723: [SPARK-23538][core] Remove custom configuration for SSL ...

2018-03-05 Thread hvanhovell
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/20723 LGTM --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h

[GitHub] spark issue #20734: [SPARK-23510][DOC][FOLLOW-UP] Update spark.sql.hive.meta...

2018-03-05 Thread hvanhovell
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/20734 Merging to master. Thanks! --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail

[GitHub] spark issue #20736: [SPARK-23586][SQL] Add interpreted execution to UnwrapOp...

2018-03-05 Thread hvanhovell
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/20736 LGTM - pending jenkins --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail

[GitHub] spark pull request #20736: [SPARK-23586][SQL] Add interpreted execution to U...

2018-03-05 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/20736#discussion_r172194194 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala --- @@ -382,8 +382,14 @@ case class UnwrapOption

[GitHub] spark issue #20736: [SPARK-23586][SQL] Add interpreted execution to UnwrapOp...

2018-03-05 Thread hvanhovell
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/20736 @mgaido91 you picked this up phenomenally fast! It looks pretty good, let two minor comments. --- - To unsubscribe, e-mail

[GitHub] spark pull request #20736: [SPARK-23586][SQL] Add interpreted execution to U...

2018-03-05 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/20736#discussion_r172191654 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ObjectExpressionsSuite.scala --- @@ -66,4 +66,16 @@ class

[GitHub] spark pull request #20736: [SPARK-23586][SQL] Add interpreted execution to U...

2018-03-05 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/20736#discussion_r172188241 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala --- @@ -382,8 +382,14 @@ case class UnwrapOption

[GitHub] spark issue #20664: [SPARK-23496][CORE] Locality of coalesced partitions can...

2018-03-05 Thread hvanhovell
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/20664 Merging to master. Thanks! @mgaido91 if you feel this should be different, feel free to open a follow-up

[GitHub] spark issue #20734: [SPARK-23510][DOC][FOLLOW-UP] Update spark.sql.hive.meta...

2018-03-05 Thread hvanhovell
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/20734 LGTM :) --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h

[GitHub] spark issue #20043: [SPARK-22856][SQL] Add wrappers for codegen output and n...

2018-03-05 Thread hvanhovell
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/20043 @viirya big fan of this change! More structure will make code gen easier & safer to implement. I think we should merge this as is, and then I think it might be good to start adding t

[GitHub] spark issue #20712: [SPARK-23563] config codegen compile cache size

2018-03-05 Thread hvanhovell
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/20712 @passionke can you elaborate on why we need this? A problem with increasing the cache is that we keep more classes around and those classes can blow out the JVM's internal code caches

[GitHub] spark issue #20699: [SPARK-23544][SQL]Remove redundancy ShuffleExchange in t...

2018-03-05 Thread hvanhovell
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/20699 A very valid use cases for repartitioning is limiting concurrency. For example when you have a UDF that calls some service you don't want to DDOS that service because you make requests from too

[GitHub] spark issue #20700: [SPARK-23546][SQL] Refactor stateless methods/values in ...

2018-03-05 Thread hvanhovell
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/20700 I am having some problems with the merge script, give me a little bit of time. --- - To unsubscribe, e-mail: reviews

[GitHub] spark issue #20700: [SPARK-23546][SQL] Refactor stateless methods/values in ...

2018-03-05 Thread hvanhovell
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/20700 Merging to master. Thanks! --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail

[GitHub] spark issue #20700: [SPARK-23546][SQL] Refactor stateless methods/values in ...

2018-03-04 Thread hvanhovell
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/20700 @kiszk can you update the PR description? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #20706: [SPARK-23550][core] Cleanup `Utils`.

2018-03-03 Thread hvanhovell
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/20706 @jiangxb1987 keeping dead code around will inevitably lead to code rot. If there is not use for something then we should remove it. Adding things back is not a problem

[GitHub] spark issue #20712: [SPARK-23563] config codegen compile cache size

2018-03-02 Thread hvanhovell
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/20712 @mgaido91 Yeah, you are right. This is pretty pointless unless you either set this as a static spark conf or pass it to the executor using local properties. Both aren't very attractive IMO

[GitHub] spark pull request #20706: [SPARK-23550][core] Cleanup `Utils`.

2018-03-01 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/20706#discussion_r171683202 --- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala --- @@ -267,44 +264,20 @@ private[spark] object Utils extends Logging

[GitHub] spark pull request #20706: [SPARK-23550][core] Cleanup `Utils`.

2018-03-01 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/20706#discussion_r171676609 --- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala --- @@ -267,44 +264,20 @@ private[spark] object Utils extends Logging

[GitHub] spark pull request #20706: [SPARK-23550][core] Cleanup `Utils`.

2018-03-01 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/20706#discussion_r171674429 --- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala --- @@ -267,44 +264,20 @@ private[spark] object Utils extends Logging

[GitHub] spark pull request #20700: [SPARK-23546][SQL] Refactor stateless methods/val...

2018-03-01 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/20700#discussion_r171667563 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala --- @@ -1524,4 +1348,215 @@ object

[GitHub] spark issue #20691: [SPARK-18161] [Python] Allow pickle to serialize >4 GB o...

2018-02-28 Thread hvanhovell
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/20691 I am not sure that adding such a test is very good for test stability, but we could disable it by default

[GitHub] spark issue #20691: [SPARK-18161] [Python] Allow pickle to serialize >4 GB o...

2018-02-28 Thread hvanhovell
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/20691 Have you tried serializing an array larger than 2GB? There is a pretty big chance that we do not support on the Spark side

[GitHub] spark issue #20691: [SPARK-18161] [Python] Allow pickle to serialize >4 GB o...

2018-02-28 Thread hvanhovell
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/20691 ok to test --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h

[GitHub] spark issue #20691: [SPARK-18161] [Python] Allow pickle to serialize >4 GB o...

2018-02-28 Thread hvanhovell
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/20691 hmmm - jenkins seems not to be playing ball --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #20693: [SPARK-23523] [SQL] [FOLLOWUP] Minor refactor of Optimiz...

2018-02-28 Thread hvanhovell
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/20693 LGTM --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h

[GitHub] spark issue #20642: i m not able to open Spark UI on local using localhost:4...

2018-02-20 Thread hvanhovell
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/20642 @AtulKumVerma please close this PR. You can ask questions in the user list or on stack overflow. --- - To unsubscribe, e

[GitHub] spark pull request #20630: [SPARK-23381][CORE] Murmur3 hash generates a diff...

2018-02-16 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/20630#discussion_r168885323 --- Diff: common/unsafe/src/test/java/org/apache/spark/unsafe/hash/Murmur3_x86_32Suite.java --- @@ -51,6 +53,23 @@ public void testKnownLongInputs

[GitHub] spark pull request #20630: [SPARK-23381][CORE] Murmur3 hash generates a diff...

2018-02-16 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/20630#discussion_r168883881 --- Diff: common/sketch/src/main/java/org/apache/spark/util/sketch/Murmur3_x86_32.java --- @@ -60,6 +60,8 @@ public static int hashUnsafeWords(Object

[GitHub] spark issue #20568: [SPARK-23381][CORE] Murmur3 hash generates a different v...

2018-02-16 Thread hvanhovell
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/20568 @mrkm4ntr this is legitimate failure. Can you fix the python tests? --- - To unsubscribe, e-mail: reviews-unsubscr

[GitHub] spark issue #20626: [SPARK-23447][SQL] Cleanup codegen template for Literal

2018-02-16 Thread hvanhovell
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/20626 You are going to need to 'type' null values for this work, I think casting would be enough. --- - To unsubscribe, e-mail

[GitHub] spark issue #20568: [SPARK-23381][CORE] Murmur3 hash generates a different v...

2018-02-14 Thread hvanhovell
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/20568 @mrkm4ntr I see your point. Adding a method to Murmur3 would work. The problem is that we are now going to release a `FeatureHasher` in Spark 2.3 that uses the current Murmur3

[GitHub] spark issue #20568: [SPARK-23381][CORE] Murmur3 hash generates a different v...

2018-02-12 Thread hvanhovell
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/20568 @mrkm4ntr The change itself looks pretty reasonable. However I am very hesitant to merge this because this will probably break bucketing (it uses murmur3 to create the buckets); for example

[GitHub] spark issue #20579: [SPARK-23372][SQL] Writing empty struct in parquet fails...

2018-02-12 Thread hvanhovell
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/20579 @dilipbiswal this is a nice improvement. I left a few comments. --- - To unsubscribe, e-mail: reviews-unsubscr

[GitHub] spark pull request #20579: [SPARK-23372][SQL] Writing empty struct in parque...

2018-02-12 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/20579#discussion_r167625448 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/FileBasedDataSourceSuite.scala --- @@ -72,6 +72,29 @@ class FileBasedDataSourceSuite extends

[GitHub] spark pull request #20579: [SPARK-23372][SQL] Writing empty struct in parque...

2018-02-12 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/20579#discussion_r167623896 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFileFormat.scala --- @@ -68,6 +68,16 @@ class

[GitHub] spark pull request #20525: [SPARK-23271[SQL] Parquet output contains only _S...

2018-02-07 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/20525#discussion_r166767068 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileFormatWriter.scala --- @@ -190,9 +190,13 @@ object FileFormatWriter

[GitHub] spark issue #20530: [SPARK-23349][SQL]ShuffleExchangeExec: Duplicate and red...

2018-02-07 Thread hvanhovell
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/20530 @wujianping10043419 this change is too trivial to merge, I am sorry. The main reason for not merging is because this will mess up the git blame for the given code, which is sometimes very useful

[GitHub] spark issue #20504: [SPARK-23332][SQL] Update SQLQueryTestSuite to support a...

2018-02-04 Thread hvanhovell
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/20504 @wangyum this is a PR of 24K lines! How are we going to review this properly? Can you provide some guidelines (major changes etc

[GitHub] spark issue #20505: [SPARK-23251][SQL] Add checks for collection element Enc...

2018-02-04 Thread hvanhovell
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/20505 ok to test --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h

[GitHub] spark pull request #20505: [SPARK-23251][SQL] Add checks for collection elem...

2018-02-04 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/20505#discussion_r165889177 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/SQLImplicits.scala --- @@ -165,11 +165,15 @@ abstract class SQLImplicits extends

[GitHub] spark issue #18136: [SPARK-20910][SQL] Add build-in SQL function - UUID

2018-02-01 Thread hvanhovell
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/18136 I just came across this expression and I have a few concerns: 1. A row will not get the same UUID assigned when a task fails. This might cause some really weird problems when the UUID

[GitHub] spark issue #20463: [SQL][MINOR] Inline SpecifiedWindowFrame.defaultWindowFr...

2018-01-31 Thread hvanhovell
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/20463 LGTM --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h

[GitHub] spark issue #20402: [SPARK-23223][SQL] Make stacking dataset transforms more...

2018-01-29 Thread hvanhovell
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/20402 Ok, merging this to master/4.0. Thanks for all the reviews! --- - To unsubscribe, e-mail: reviews-unsubscr

[GitHub] spark pull request #20402: [SPARK-23223][SQL] Make stacking dataset transfor...

2018-01-28 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/20402#discussion_r164338267 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala --- @@ -62,7 +62,11 @@ import org.apache.spark.util.Utils private[sql

[GitHub] spark issue #20415: [SPARK-23247][SQL]combines Unsafe operations and statist...

2018-01-27 Thread hvanhovell
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/20415 @heary-cao have you benchmarked this? The reason I am asking is because Spark SQL chains iterators, these are pipelined and only materialized when we need to. Your PR effectively removes two

[GitHub] spark pull request #20402: [SPARK-23223][SQL] Make stacking dataset transfor...

2018-01-25 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/20402#discussion_r164030038 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala --- @@ -192,7 +192,7 @@ class Dataset[T] private[sql

[GitHub] spark issue #20402: [SPARK-23223][SQL] Make stacking dataset transforms more...

2018-01-25 Thread hvanhovell
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/20402 cc @cloud-fan @gatorsmile --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail

[GitHub] spark pull request #20402: [SPARK-23223][SQL] Make stacking dataset transfor...

2018-01-25 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/20402#discussion_r164029516 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala --- @@ -74,10 +74,36 @@ trait CheckAnalysis extends

[GitHub] spark pull request #20402: [SPARK-23223][SQL] Make stacking dataset transfor...

2018-01-25 Thread hvanhovell
GitHub user hvanhovell opened a pull request: https://github.com/apache/spark/pull/20402 [SPARK-23223][SQL] Make stacking dataset transforms more performant ## What changes were proposed in this pull request? It is a common pattern to apply multiple transforms to a `Dataset

[GitHub] spark pull request #20394: [SPARK-23214][SQL] cached data should not carry e...

2018-01-25 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/20394#discussion_r163943771 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/columnar/InMemoryRelation.scala --- @@ -73,11 +73,16 @@ case class InMemoryRelation

[GitHub] spark pull request #20391: [SPARK-23208][SQL] Fix code generation for comple...

2018-01-24 Thread hvanhovell
GitHub user hvanhovell opened a pull request: https://github.com/apache/spark/pull/20391 [SPARK-23208][SQL] Fix code generation for complex create array (related) expressions ## What changes were proposed in this pull request? The `GenArrayData.genCodeToCreateArrayData

[GitHub] spark issue #20391: [SPARK-23208][SQL] Fix code generation for complex creat...

2018-01-24 Thread hvanhovell
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/20391 cc @cloud-fan @kiszk @gatorsmile --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands

[GitHub] spark pull request #20350: [SPARK-23179][SQL] Support option to throw except...

2018-01-22 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/20350#discussion_r163013386 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/types/Decimal.scala --- @@ -237,14 +238,26 @@ final class Decimal extends Ordered[Decimal

[GitHub] spark pull request #20350: [SPARK-23179][SQL] Support option to throw except...

2018-01-22 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/20350#discussion_r163007885 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/types/Decimal.scala --- @@ -237,14 +238,26 @@ final class Decimal extends Ordered[Decimal

[GitHub] spark pull request #20350: [SPARK-23179][SQL] Support option to throw except...

2018-01-22 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/20350#discussion_r162957968 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/types/Decimal.scala --- @@ -237,14 +238,26 @@ final class Decimal extends Ordered[Decimal

[GitHub] spark issue #20302: [SPARK-23094] Fix invalid character handling in JsonData...

2018-01-18 Thread hvanhovell
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/20302 LGTM --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h

[GitHub] spark issue #20277: [SPARK-23090][SQL] polish ColumnVector

2018-01-16 Thread hvanhovell
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/20277 @cloud-fan did you do some benchmarks? I'd like to make sure that the abstract class to interface change does not negatively impact performance

[GitHub] spark pull request #20200: [SPARK-23005][Core] Improve RDD.take on small num...

2018-01-09 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/20200#discussion_r160391487 --- Diff: core/src/main/scala/org/apache/spark/rdd/RDD.scala --- @@ -1345,13 +1346,12 @@ abstract class RDD[T: ClassTag]( if (buf.isEmpty

[GitHub] spark pull request #20200: [SPARK-23005][Core] Improve RDD.take on small num...

2018-01-09 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/20200#discussion_r160390893 --- Diff: core/src/main/scala/org/apache/spark/rdd/RDD.scala --- @@ -985,7 +985,7 @@ abstract class RDD[T: ClassTag]( def subtract

[GitHub] spark issue #19764: [SPARK-22539][SQL] Add second order for rangepartitioner...

2018-01-03 Thread hvanhovell
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/19764 @caneGuy adding the second ordering makes the sort much more fine grained, this means that the range partitioner is probably going to use different range boundaries using all the ordering

[GitHub] spark pull request #20133: [SPARK-22934] [SQL] Make optional clauses order i...

2018-01-01 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/20133#discussion_r159164626 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkSqlParser.scala --- @@ -408,9 +417,17 @@ class SparkSqlAstBuilder(conf: SQLConf

[GitHub] spark issue #20023: [SPARK-22036][SQL] Decimal multiplication with high prec...

2017-12-20 Thread hvanhovell
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/20023 I don't fully agree :)... 1. You can use `SQLConf.get` for this. Or you can wire up the rules using the `SessionStateBuilder`. 2. I am reluctant to change this for a minor version. I

[GitHub] spark issue #20023: [SPARK-22036][SQL] Decimal multiplication with high prec...

2017-12-20 Thread hvanhovell
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/20023 In am generally in favor of following the SQL standard. How about we do this. Let's make the standard behavior the default, and add a flag to revert to the old behavior. This allows us to ease

[GitHub] spark pull request #19864: [SPARK-22673][SQL] InMemoryRelation should utiliz...

2017-12-13 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/19864#discussion_r156718225 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/columnar/InMemoryRelation.scala --- @@ -71,9 +74,10 @@ case class InMemoryRelation

[GitHub] spark pull request #19864: [SPARK-22673][SQL] InMemoryRelation should utiliz...

2017-12-13 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/19864#discussion_r156610279 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/columnar/InMemoryRelation.scala --- @@ -71,9 +74,10 @@ case class InMemoryRelation

[GitHub] spark pull request #19864: [SPARK-22673][SQL] InMemoryRelation should utiliz...

2017-12-13 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/19864#discussion_r156609849 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/columnar/InMemoryRelation.scala --- @@ -60,7 +62,8 @@ case class InMemoryRelation

[GitHub] spark pull request #19864: [SPARK-22673][SQL] InMemoryRelation should utiliz...

2017-12-13 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/19864#discussion_r156609277 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/CacheManager.scala --- @@ -80,6 +80,14 @@ class CacheManager extends Logging

[GitHub] spark pull request #19193: [WIP][SPARK-21896][SQL] Fix Stack Overflow when w...

2017-12-11 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/19193#discussion_r156060138 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -1920,7 +1927,34 @@ class Analyzer

[GitHub] spark pull request #19193: [WIP][SPARK-21896][SQL] Fix Stack Overflow when w...

2017-12-11 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/19193#discussion_r156059425 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -1920,7 +1927,34 @@ class Analyzer

[GitHub] spark pull request #19193: [WIP][SPARK-21896][SQL] Fix Stack Overflow when w...

2017-12-11 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/19193#discussion_r156059043 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -1920,7 +1927,34 @@ class Analyzer

<    1   2   3   4   5   6   7   8   9   10   >