[GitHub] spark pull request #22758: [SPARK-25332][SQL] Instead of broadcast hash join...

2018-10-17 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22758#discussion_r226151421 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveMetastoreCatalog.scala --- @@ -193,6 +193,16 @@ private[hive] class HiveMetastoreCatalog

[GitHub] spark pull request #22758: [SPARK-25332][SQL] Instead of broadcast hash join...

2018-10-17 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22758#discussion_r226150341 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/StatisticsSuite.scala --- @@ -1051,7 +1051,8 @@ class StatisticsSuite extends

[GitHub] spark pull request #22745: [SPARK-21402][SQL][FOLLOW-UP] Fix java map of str...

2018-10-17 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22745#discussion_r226149644 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/unresolved.scala --- @@ -509,3 +509,24 @@ case class UnresolvedOrdinal

[GitHub] spark issue #22309: [SPARK-20384][SQL] Support value class in schema of Data...

2018-10-17 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22309 what's still missing to support top level value class? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

[GitHub] spark pull request #22309: [SPARK-20384][SQL] Support value class in schema ...

2018-10-17 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22309#discussion_r226149168 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/ScalaReflection.scala --- @@ -632,13 +667,17 @@ object ScalaReflection extends

[GitHub] spark pull request #22309: [SPARK-20384][SQL] Support value class in schema ...

2018-10-17 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22309#discussion_r226148985 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/ScalaReflection.scala --- @@ -376,6 +386,23 @@ object ScalaReflection extends

[GitHub] spark pull request #22309: [SPARK-20384][SQL] Support value class in schema ...

2018-10-17 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22309#discussion_r226148892 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/ScalaReflection.scala --- @@ -376,6 +386,23 @@ object ScalaReflection extends

[GitHub] spark pull request #22732: [SPARK-25044][FOLLOW-UP] Change ScalaUDF construc...

2018-10-17 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22732#discussion_r226143221 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/UDFSuite.scala --- @@ -393,4 +393,30 @@ class UDFSuite extends QueryTest with SharedSQLContext

[GitHub] spark pull request #22732: [SPARK-25044][FOLLOW-UP] Change ScalaUDF construc...

2018-10-17 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22732#discussion_r226142651 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/expressions/UserDefinedFunction.scala --- @@ -73,19 +73,21 @@ case class UserDefinedFunction

[GitHub] spark pull request #22732: [SPARK-25044][FOLLOW-UP] Change ScalaUDF construc...

2018-10-17 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22732#discussion_r226141983 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/ScalaUDF.scala --- @@ -39,29 +42,29 @@ import

[GitHub] spark pull request #22756: [SPARK-25758][ML] Deprecate computeCost on Bisect...

2018-10-17 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22756#discussion_r226141315 --- Diff: mllib/src/main/scala/org/apache/spark/ml/clustering/BisectingKMeans.scala --- @@ -125,8 +125,13 @@ class BisectingKMeansModel private[ml

[GitHub] spark issue #22721: [SPARK-25403][SQL] Refreshes the table after inserting t...

2018-10-17 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22721 what's the impact to end users? wrong statistics? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

[GitHub] spark issue #20999: [SPARK-14922][SPARK-17732][SPARK-23866][SQL] Support par...

2018-10-17 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/20999 It seems like Hive can drop partitions directly with a partition expression: https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive

[GitHub] spark pull request #22732: [SPARK-25044][FOLLOW-UP] Change ScalaUDF construc...

2018-10-17 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22732#discussion_r225952267 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/ScalaUDF.scala --- @@ -39,29 +40,29 @@ import

[GitHub] spark issue #22732: [SPARK-25044][FOLLOW-UP] Change ScalaUDF constructor sig...

2018-10-17 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22732 After more thoughts, there is one case we don't handle well. For UDFs like `(a: Any, i: Int) => xxx`, previously we can still get the int type info at runtime, and add null check for it. Now

[GitHub] spark issue #22708: [SPARK-21402][SQL] Fix java array of structs deserializa...

2018-10-17 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22708 thanks, merging to master/2.4! --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e

[GitHub] spark issue #22674: [SPARK-25680][SQL] SQL execution listener shouldn't happ...

2018-10-17 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22674 since there is no objection, I'm merging it to master, thanks! --- - To unsubscribe, e-mail: reviews-unsubscr

[GitHub] spark issue #22750: [SPARK-25747][SQL] remove ColumnarBatchScan.needsUnsafeR...

2018-10-17 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22750 retest this please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews

[GitHub] spark issue #22732: [SPARK-25044][FOLLOW-UP] Change ScalaUDF constructor sig...

2018-10-17 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22732 There is an argument about whether #22259 introduced behavior changes. Here is my analysis. Before #22259 , the type and null check was done as 1. user registers UDFs, like `(a: Int

[GitHub] spark pull request #22732: [SPARK-25044][FOLLOW-UP] Change ScalaUDF construc...

2018-10-17 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22732#discussion_r225807388 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/ScalaUDF.scala --- @@ -31,6 +31,9 @@ import

[GitHub] spark pull request #22732: [SPARK-25044][FOLLOW-UP] Change ScalaUDF construc...

2018-10-17 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22732#discussion_r225804995 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/expressions/UserDefinedFunction.scala --- @@ -81,11 +81,11 @@ case class UserDefinedFunction

[GitHub] spark issue #21322: [SPARK-24225][CORE] Support closing AutoClosable objects...

2018-10-17 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/21322 I'd like to recommend http://apache-spark-developers-list.1001551.n3.nabble.com/SPIP-SPARK-25728-Structured-Intermediate-Representation-Tungsten-IR-for-generating-Java-code-td25370.html

[GitHub] spark issue #22746: [SPARK-24499][SQL][DOC] Split the page of sql-programmin...

2018-10-17 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22746 This is very cool! thanks! --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail

[GitHub] spark pull request #22746: [SPARK-24499][SQL][DOC] Split the page of sql-pro...

2018-10-17 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22746#discussion_r225797461 --- Diff: docs/_data/menu-sql.yaml --- @@ -0,0 +1,79 @@ +- text: Getting Started + url: sql-getting-started.html + subitems: +- text

[GitHub] spark issue #22724: [SPARK-25734][SQL] Literal should have a value correspon...

2018-10-16 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22724 thanks, merging to master! --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail

[GitHub] spark pull request #22708: [SPARK-21402][SQL] Fix java array of structs dese...

2018-10-16 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22708#discussion_r225767103 --- Diff: sql/core/src/test/java/test/org/apache/spark/sql/JavaBeanWithArraySuite.java --- @@ -0,0 +1,222 @@ +/* + * Licensed to the Apache

[GitHub] spark issue #22745: [SPARK-21402][SQL][FOLLOW-UP] Fix java map of structs de...

2018-10-16 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22745 It's a different issue, I think it worth a new ticket --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

[GitHub] spark pull request #22732: [SPARK-25044][FOLLOW-UP] Change ScalaUDF construc...

2018-10-16 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22732#discussion_r225614430 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/ScalaUDF.scala --- @@ -31,6 +31,7 @@ import

[GitHub] spark issue #22750: [SPARK-25747][SQL] remove ColumnarBatchScan.needsUnsafeR...

2018-10-16 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22750 cc @gatorsmile @hvanhovell --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail

[GitHub] spark pull request #22750: [SPARK-25747][SQL] remove ColumnarBatchScan.needs...

2018-10-16 Thread cloud-fan
GitHub user cloud-fan opened a pull request: https://github.com/apache/spark/pull/22750 [SPARK-25747][SQL] remove ColumnarBatchScan.needsUnsafeRowConversion ## What changes were proposed in this pull request? `needsUnsafeRowConversion` is used in 2 places: 1

[GitHub] spark issue #21322: [SPARK-24225][CORE] Support closing AutoClosable objects...

2018-10-16 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/21322 I think a design doc is better, to make sure we are on the same page before the actual coding, which saves time

[GitHub] spark issue #22708: [SPARK-21402][SQL] Fix java array of structs deserializa...

2018-10-16 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22708 retest this please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews

[GitHub] spark issue #22708: [SPARK-21402][SQL] Fix java array of structs deserializa...

2018-10-16 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22708 cc @dongjoon-hyun here is another instance of the FileBasedDataSourceSuite flaky test. --- - To unsubscribe, e-mail: reviews

[GitHub] spark issue #21322: [SPARK-24225][CORE] Support closing AutoClosable objects...

2018-10-16 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/21322 Ah i see. It's kind of a broadcast but has a much small scope and its lifecycle is bound to a stage. I'm looking forward to a full design of it, thanks

[GitHub] spark issue #22724: [SPARK-25734][SQL] Literal should have a value correspon...

2018-10-16 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22724 yea +1 on 3.0 only, this is kind of a developer API, advanced users may use it. --- - To unsubscribe, e-mail: reviews

[GitHub] spark pull request #22468: [SPARK-25374][SQL] SafeProjection supports fallba...

2018-10-16 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22468#discussion_r225522164 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/UserDefinedTypeSuite.scala --- @@ -31,48 +30,7 @@ import org.apache.spark.sql.types

[GitHub] spark pull request #22468: [SPARK-25374][SQL] SafeProjection supports fallba...

2018-10-16 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22468#discussion_r225521786 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/InterpretedSafeProjection.scala --- @@ -0,0 +1,173

[GitHub] spark pull request #22468: [SPARK-25374][SQL] SafeProjection supports fallba...

2018-10-16 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22468#discussion_r225521397 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/InterpretedSafeProjection.scala --- @@ -0,0 +1,173

[GitHub] spark pull request #22468: [SPARK-25374][SQL] SafeProjection supports fallba...

2018-10-16 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22468#discussion_r225521206 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/InterpretedSafeProjection.scala --- @@ -0,0 +1,173

[GitHub] spark pull request #22468: [SPARK-25374][SQL] SafeProjection supports fallba...

2018-10-16 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22468#discussion_r225520812 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/InterpretedSafeProjection.scala --- @@ -0,0 +1,173

[GitHub] spark pull request #22468: [SPARK-25374][SQL] SafeProjection supports fallba...

2018-10-16 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22468#discussion_r225520290 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/InterpretedSafeProjection.scala --- @@ -0,0 +1,173

[GitHub] spark issue #22724: [SPARK-25734][SQL] Literal should have a value correspon...

2018-10-16 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22724 `Literal` is an internal API, and AFAIK end-users can't construct an invalid `Literal` with public APIs. If they can, then it's a bug and we have a problem

[GitHub] spark issue #22708: [SPARK-21402][SQL] Fix java array of structs deserializa...

2018-10-16 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22708 lgtm --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h

[GitHub] spark issue #22708: [SPARK-21402][SQL] Fix java array of structs deserializa...

2018-10-16 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22708 ok to test --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h

[GitHub] spark issue #22728: [SPARK-25736][SQL][TEST] add tests to verify the behavio...

2018-10-16 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22728 FYI, I tried both hive and presto, neither of them supports multi-column count. --- - To unsubscribe, e-mail: reviews

[GitHub] spark issue #22597: [SPARK-25579][SQL] Use quoted attribute names if needed ...

2018-10-16 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22597 LGTM --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h

[GitHub] spark issue #22724: [SPARK-25734][SQL] Literal should have a value correspon...

2018-10-15 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22724 LGTM pending jenkins --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail

[GitHub] spark pull request #22724: [SPARK-25734][SQL] Literal should have a value co...

2018-10-15 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22724#discussion_r225396907 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/literals.scala --- @@ -196,6 +197,48 @@ object Literal { case

[GitHub] spark issue #22379: [SPARK-25393][SQL] Adding new function from_csv()

2018-10-15 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22379 LGTM --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h

[GitHub] spark pull request #22724: [SPARK-25734][SQL] Literal should have a value co...

2018-10-15 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22724#discussion_r225392951 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/literals.scala --- @@ -196,6 +197,48 @@ object Literal { case

[GitHub] spark pull request #22724: [SPARK-25734][SQL] Literal should have a value co...

2018-10-15 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22724#discussion_r225392843 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/literals.scala --- @@ -196,6 +197,48 @@ object Literal { case

[GitHub] spark pull request #22724: [SPARK-25734][SQL] Literal should have a value co...

2018-10-15 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22724#discussion_r225392872 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/literals.scala --- @@ -196,6 +197,48 @@ object Literal { case

[GitHub] spark pull request #22597: [SPARK-25579][SQL] Use quoted attribute names if ...

2018-10-15 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22597#discussion_r225373279 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/orc/OrcFilterSuite.scala --- @@ -383,4 +385,17 @@ class OrcFilterSuite

[GitHub] spark pull request #22597: [SPARK-25579][SQL] Use quoted attribute names if ...

2018-10-15 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22597#discussion_r225373223 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/orc/OrcFilterSuite.scala --- @@ -383,4 +385,17 @@ class OrcFilterSuite

[GitHub] spark issue #22719: [SPARK-25714] [BACKPORT-2.2] Fix Null Handling in the Op...

2018-10-15 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22719 thanks, merging to 2.2! --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail

[GitHub] spark issue #22718: [SPARK-25714] [BACKPORT-2.3] Fix Null Handling in the Op...

2018-10-15 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22718 thanks, merging to 2.3! --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail

[GitHub] spark pull request #22724: [SPARK-25734][SQL] Literal should have a value co...

2018-10-15 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22724#discussion_r225368754 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/literals.scala --- @@ -196,6 +197,48 @@ object Literal { case

[GitHub] spark issue #22731: [SPARK-25674][FOLLOW-UP] Update the stats for each Colum...

2018-10-15 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22731 LGTM, merging to master! --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail

[GitHub] spark issue #22733: [SPARK-25738][SQL] Fix LOAD DATA INPATH for hdfs port

2018-10-15 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22733 LGTM --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h

[GitHub] spark issue #22728: [SPARK-25736][SQL][TEST] add tests to verify the behavio...

2018-10-15 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22728 BTW MySQL doesn't support `count(a, b)` but supports `count(distinct a, b)`, the result is same as Spark

[GitHub] spark pull request #22379: [SPARK-25393][SQL] Adding new function from_csv()

2018-10-15 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22379#discussion_r225218562 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala --- @@ -777,7 +777,6 @@ case class SchemaOfJson

[GitHub] spark pull request #22724: [SPARK-25734][SQL] Literal should have a value co...

2018-10-15 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22724#discussion_r225218350 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/literals.scala --- @@ -196,6 +197,31 @@ object Literal { case

[GitHub] spark issue #22728: [SPARK-25736][SQL][TEST] add tests to verify the behavio...

2018-10-15 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22728 cc @gatorsmile @mgaido91 @viirya --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands

[GitHub] spark pull request #22728: [SPARK-25736][SQL][TEST] add tests to verify the ...

2018-10-15 Thread cloud-fan
GitHub user cloud-fan opened a pull request: https://github.com/apache/spark/pull/22728 [SPARK-25736][SQL][TEST] add tests to verify the behavior of multi-column count ## What changes were proposed in this pull request? AFAIK multi-column count is not widely supported

[GitHub] spark issue #22379: [SPARK-25393][SQL] Adding new function from_csv()

2018-10-15 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22379 have you addressed https://github.com/apache/spark/pull/22379/files#r225033808 ? --- - To unsubscribe, e-mail: reviews

[GitHub] spark pull request #22379: [SPARK-25393][SQL] Adding new function from_csv()

2018-10-15 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22379#discussion_r225183977 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala --- @@ -777,7 +777,6 @@ case class SchemaOfJson

[GitHub] spark pull request #22708: [SPARK-21402] Fix java array/map of structs deser...

2018-10-15 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22708#discussion_r225182746 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/JavaTypeInference.scala --- @@ -271,32 +272,41 @@ object JavaTypeInference

[GitHub] spark issue #22029: [SPARK-24395][SQL] IN operator should return NULL when c...

2018-10-15 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22029 I don't have a strong opinion. cc @gatorsmile @hvanhovell @juliuszsompolski --- - To unsubscribe, e-mail: reviews-unsubscr

[GitHub] spark issue #22029: [SPARK-24395][SQL] IN operator should return NULL when c...

2018-10-15 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22029 So, the goal here is to make the behavior consistent between multi-column IN-subquery and multi-column normal IN for Spark. That said, I feel it's reasonable to change the behavior

[GitHub] spark issue #22724: [SPARK-25734][SQL] Literal should have a value correspon...

2018-10-15 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22724 LGTM --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h

[GitHub] spark issue #22029: [SPARK-24395][SQL] IN operator should return NULL when c...

2018-10-15 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22029 > Oracle does the same. So what's the behavior? return false? > Hive behaves like Spark now (before and after the PR) for this case. Again what's the behavior?

[GitHub] spark issue #20999: [SPARK-14922][SPARK-17732][SPARK-23866][SQL] Support par...

2018-10-15 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/20999 retest this please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews

[GitHub] spark issue #22715: [SPARK-25727][SQL] Add outputOrdering to otherCopyArgs i...

2018-10-15 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22715 Hi @mgaido91 , since you are the major author of this part, do you have time to open a PR and move `outputOrdering` to the main constructor? Thanks in advance

[GitHub] spark pull request #22713: [SPARK-25691][SQL] Use semantic equality in Alias...

2018-10-15 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22713#discussion_r225119359 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/view.scala --- @@ -112,8 +112,8 @@ object EliminateView extends Rule

[GitHub] spark pull request #22708: [SPARK-21402] Fix java array/map of structs deser...

2018-10-15 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22708#discussion_r225118613 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/JavaTypeInference.scala --- @@ -271,32 +272,41 @@ object JavaTypeInference

[GitHub] spark issue #22561: [SPARK-25548][SQL]In the PruneFileSourcePartitions optim...

2018-10-15 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22561 Your rewrite is applied to the entire predicate expression tree, how about the case `NOT(p AND x)`? If `p` is true, `x` is false, then `NOT(p AND x)` is true. But `NOT(true AND true)` is false

[GitHub] spark pull request #20999: [SPARK-14922][SPARK-17732][SPARK-23866][SQL] Supp...

2018-10-15 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/20999#discussion_r225108214 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/ddl.scala --- @@ -523,35 +523,114 @@ case class AlterTableRenamePartitionCommand

[GitHub] spark issue #22029: [SPARK-24395][SQL] IN operator should return NULL when c...

2018-10-15 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22029 Do you know how other databases behave for `(a, b) in (struct_col1, struct_col2, ...)` and `input_struct_col in (struct_col1, struct_col2, ...)`? Since `(a, b)` may be treated specially, we need

[GitHub] spark pull request #22713: [SPARK-25691][SQL] Use semantic equality in Alias...

2018-10-15 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22713#discussion_r225103000 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/RemoveRedundantAliasAndProjectSuite.scala --- @@ -124,4 +124,11 @@ class

[GitHub] spark pull request #22514: [SPARK-25271][SQL] Hive ctas commands should use ...

2018-10-15 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22514#discussion_r225079174 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/ddl.scala --- @@ -34,11 +34,16 @@ import org.apache.spark.sql.types

[GitHub] spark pull request #20999: [SPARK-14922][SPARK-17732][SPARK-23866][SQL] Supp...

2018-10-15 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/20999#discussion_r225074416 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala --- @@ -293,6 +293,31 @@ class AstBuilder(conf: SQLConf

[GitHub] spark pull request #20999: [SPARK-14922][SPARK-17732][SPARK-23866][SQL] Supp...

2018-10-15 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/20999#discussion_r225073862 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/namedExpressions.scala --- @@ -382,6 +382,30 @@ case class

[GitHub] spark pull request #20999: [SPARK-14922][SPARK-17732][SPARK-23866][SQL] Supp...

2018-10-15 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/20999#discussion_r225074108 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/namedExpressions.scala --- @@ -382,6 +382,30 @@ case class

[GitHub] spark pull request #20999: [SPARK-14922][SPARK-17732][SPARK-23866][SQL] Supp...

2018-10-15 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/20999#discussion_r225076055 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/ddl.scala --- @@ -523,35 +523,114 @@ case class AlterTableRenamePartitionCommand

[GitHub] spark issue #22263: [SPARK-25269][SQL] SQL interface support specify Storage...

2018-10-15 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22263 LGTM --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h

[GitHub] spark issue #21322: [SPARK-24225][CORE] Support closing AutoClosable objects...

2018-10-15 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/21322 Introducing a new concept which is similar to broadcast seems like an overkill. We can just update broadcast, to allow it to be memory-only. However, there might be simpler solutions

[GitHub] spark issue #22561: [SPARK-25548][SQL]In the PruneFileSourcePartitions optim...

2018-10-15 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22561 Please describe the idea clearly in the PR description. After reading the code, I think the idea is, for any predicate `p`, we can rewrite it to `p'` that only contains partition columns

[GitHub] spark issue #22379: [SPARK-25393][SQL] Adding new function from_csv()

2018-10-15 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22379 I wouldn't make schema a `Columm`. `Column` means it's dynamic, while I think it should be a static value, to simplify the implementation

[GitHub] spark issue #22379: [SPARK-25393][SQL] Adding new function from_csv()

2018-10-14 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22379 BTW how would the `schema_of_csv` function help with `from_csv`? --- - To unsubscribe, e-mail: reviews-unsubscr

[GitHub] spark issue #22379: [SPARK-25393][SQL] Adding new function from_csv()

2018-10-14 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22379 Looks pretty good! just one minor commet --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark pull request #22379: [SPARK-25393][SQL] Adding new function from_csv()

2018-10-14 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22379#discussion_r225033899 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVDataSource.scala --- @@ -254,7 +256,7 @@ object

[GitHub] spark pull request #22379: [SPARK-25393][SQL] Adding new function from_csv()

2018-10-14 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22379#discussion_r225033808 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/csvExpressions.scala --- @@ -0,0 +1,117 @@ +/* + * Licensed

[GitHub] spark issue #22029: [SPARK-24395][SQL] IN operator should return NULL when c...

2018-10-14 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22029 It looks to me that this is another instance of special handling `(a, b, ..)`, like https://github.com/apache/spark/pull/21403 `(a, b) in (struct_col1, struct_col2, ...)` is different

[GitHub] spark issue #22708: [SPARK-21402] Fix java array/map of structs deserializat...

2018-10-14 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22708 Can you explain how this happens? Why thhe fields of structs get mixed up? --- - To unsubscribe, e-mail: reviews-unsubscr

[GitHub] spark pull request #22561: [SPARK-25548][SQL]In the PruneFileSourcePartition...

2018-10-14 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22561#discussion_r225030929 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/PruneFileSourcePartitions.scala --- @@ -39,21 +40,31 @@ private[sql] object

[GitHub] spark pull request #22309: [SPARK-20384][SQL] Support value class in schema ...

2018-10-14 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22309#discussion_r225030497 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/ScalaReflection.scala --- @@ -125,6 +125,17 @@ object ScalaReflection extends

[GitHub] spark pull request #22309: [SPARK-20384][SQL] Support value class in schema ...

2018-10-14 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22309#discussion_r225030384 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/ScalaReflection.scala --- @@ -376,6 +387,23 @@ object ScalaReflection extends

[GitHub] spark pull request #22309: [SPARK-20384][SQL] Support value class in schema ...

2018-10-14 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22309#discussion_r225030352 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/ScalaReflection.scala --- @@ -376,6 +387,23 @@ object ScalaReflection extends

[GitHub] spark issue #22697: [SPARK-25700][SQL][BRANCH-2.4] Partially revert append m...

2018-10-14 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22697 thanks, merging to 2.4! --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail

[GitHub] spark pull request #22713: [SPARK-25691][SQL] Use semantic equality in Optim...

2018-10-14 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22713#discussion_r225025827 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/RemoveRedundantAliasAndProjectSuite.scala --- @@ -124,4 +124,11 @@ class

<    5   6   7   8   9   10   11   12   13   14   >