[GitHub] spark pull request #22976: [SPARK-25974][SQL]Optimizes Generates bytecode fo...

2018-11-11 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22976#discussion_r232552336 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/GenerateOrdering.scala --- @@ -68,62 +68,55 @@ object

[GitHub] spark issue #22955: [SPARK-25949][SQL] Add test for PullOutPythonUDFInJoinCo...

2018-11-11 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22955 thanks, merging to master! --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail

[GitHub] spark pull request #22938: [SPARK-25935][SQL] Prevent null rows from JSON pa...

2018-11-11 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22938#discussion_r232550860 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/json/JsonSuite.scala --- @@ -1813,6 +1817,7 @@ class JsonSuite extends

[GitHub] spark pull request #22938: [SPARK-25935][SQL] Prevent null rows from JSON pa...

2018-11-11 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22938#discussion_r232550733 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/json/JsonSuite.scala --- @@ -1115,6 +1115,7 @@ class JsonSuite extends

[GitHub] spark pull request #22938: [SPARK-25935][SQL] Prevent null rows from JSON pa...

2018-11-11 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22938#discussion_r232550502 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala --- @@ -550,15 +550,23 @@ case class JsonToStructs

[GitHub] spark pull request #22938: [SPARK-25935][SQL] Prevent null rows from JSON pa...

2018-11-11 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22938#discussion_r232550186 --- Diff: docs/sql-migration-guide-upgrade.md --- @@ -15,6 +15,8 @@ displayTitle: Spark SQL Upgrading Guide - Since Spark 3.0, the `from_json

[GitHub] spark issue #22998: [SPARK-26001][SQL]Reduce memory copy when writing decima...

2018-11-11 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22998 I think this is wrong. We have to zero out the bytes even when writing a null decimal, so that 2 unsafe rows with the same values (including null values) are exactly the same (in binary format
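The point about zeroing bytes can be illustrated with a toy model. The sketch below is not Spark's `UnsafeRow` layout or writer code; the one-byte null flag, the 16-byte slot, and the names `write_decimal` and `null_row` are all assumptions made up for illustration. It only shows why a reused buffer needs its slot cleared when a null is written, if rows are to be compared byte for byte.

```python
from typing import Optional

# Toy layout (an assumption, not Spark's): 1 null-flag byte + one 16-byte slot.
SLOT_SIZE = 16


def write_decimal(buf: bytearray, unscaled: Optional[int], zero_on_null: bool) -> None:
    """Write an unscaled decimal value into the slot, or mark the field null."""
    if unscaled is None:
        buf[0] = 1  # set the null flag
        if zero_on_null:
            # Clear whatever an earlier write left behind in the slot.
            buf[1:1 + SLOT_SIZE] = bytes(SLOT_SIZE)
        return
    buf[0] = 0
    buf[1:1 + SLOT_SIZE] = unscaled.to_bytes(SLOT_SIZE, "big", signed=True)


def null_row(zero_on_null: bool, previous: int) -> bytes:
    """Reuse one buffer: write a real value first, then overwrite it with null."""
    buf = bytearray(1 + SLOT_SIZE)
    write_decimal(buf, previous, zero_on_null)
    write_decimal(buf, None, zero_on_null)
    return bytes(buf)
```

In this model, two logically identical null rows compare equal as raw bytes only when the slot is zeroed; without zeroing, stale bytes from the earlier value make them differ.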

[GitHub] spark issue #22990: [SPARK-25988] [SQL] Keep names unchanged when deduplicat...

2018-11-09 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22990 retest this please

[GitHub] spark issue #22990: [SPARK-25988] [SQL] Keep names unchanged when deduplicat...

2018-11-08 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22990 good catch! LGTM

[GitHub] spark pull request #22990: [SPARK-25988] [SQL] Keep names unchanged when ded...

2018-11-08 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22990#discussion_r232148751 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala --- @@ -2856,6 +2856,59 @@ class SQLQuerySuite extends QueryTest

[GitHub] spark pull request #22990: [SPARK-25988] [SQL] Keep names unchanged when ded...

2018-11-08 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22990#discussion_r232148583 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala --- @@ -2856,6 +2856,59 @@ class SQLQuerySuite extends QueryTest

[GitHub] spark pull request #22987: [SPARK-25979][SQL] Window function: allow parenth...

2018-11-08 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22987#discussion_r232135028 --- Diff: sql/core/src/test/resources/sql-tests/inputs/window.sql --- @@ -109,3 +109,9 @@ last_value(false, false) OVER w AS last_value_contain_null

[GitHub] spark pull request #22987: [SPARK-25979][SQL] Window function: allow parenth...

2018-11-08 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22987#discussion_r231992909 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/SQLWindowFunctionSuite.scala --- @@ -31,6 +32,19 @@ class SQLWindowFunctionSuite

[GitHub] spark pull request #22987: [SPARK-25979][SQL] Window function: allow parenth...

2018-11-08 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22987#discussion_r231992777 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/SQLWindowFunctionSuite.scala --- @@ -31,6 +32,19 @@ class SQLWindowFunctionSuite

[GitHub] spark issue #22978: [SPARK-25676][SQL][FOLLOWUP] Use 'foreach(_ => ())'

2018-11-08 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22978 thanks, merging to master!

[GitHub] spark pull request #22984: [minor] update HiveExternalCatalogVersionsSuite t...

2018-11-08 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22984#discussion_r231922622 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/HiveExternalCatalogVersionsSuite.scala --- @@ -206,7 +206,7 @@ class

[GitHub] spark issue #22984: [minor] update HiveExternalCatalogVersionsSuite to test ...

2018-11-08 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22984 cc @gatorsmile @srowen

[GitHub] spark pull request #22984: [minor ]update HiveExternalCatalogVersionsSuite t...

2018-11-08 Thread cloud-fan
GitHub user cloud-fan opened a pull request: https://github.com/apache/spark/pull/22984 [minor ]update HiveExternalCatalogVersionsSuite to test 2.4.0 ## What changes were proposed in this pull request? Since Spark 2.4.0 is released, we should test

[GitHub] spark pull request #22977: [BUILD] Bump previousSparkVersion in MimaBuild.sc...

2018-11-08 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22977#discussion_r231872904 --- Diff: project/MimaExcludes.scala --- @@ -84,7 +84,17 @@ object MimaExcludes { ProblemFilters.exclude[IncompatibleMethTypeProblem

[GitHub] spark pull request #22977: [BUILD] Bump previousSparkVersion in MimaBuild.sc...

2018-11-08 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22977#discussion_r231866958 --- Diff: project/MimaBuild.scala --- @@ -88,7 +88,7 @@ object MimaBuild { def mimaSettings(sparkHome: File, projectRef: ProjectRef

[GitHub] spark issue #22977: [BUILD] Bump previousSparkVersion in MimaBuild.scala to ...

2018-11-08 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22977 I think one major issue is that there is no documentation on how to update MiMa for new releases. Does anyone know the detailed process? It seems we need to update `MimaExcludes.scala` with something like

[GitHub] spark pull request #22977: [BUILD] Bump previousSparkVersion in MimaBuild.sc...

2018-11-08 Thread cloud-fan
GitHub user cloud-fan opened a pull request: https://github.com/apache/spark/pull/22977 [BUILD] Bump previousSparkVersion in MimaBuild.scala to be 2.4.0 ## What changes were proposed in this pull request? Since Spark 2.4.0 is already in maven repo, we can bump

[GitHub] spark pull request #22970: [SPARK-25676][FOLLOWUP][BUILD] Fix Scala 2.12 bui...

2018-11-08 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22970#discussion_r231833555 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/WideTableBenchmark.scala --- @@ -42,7 +43,7 @@ object WideTableBenchmark

[GitHub] spark pull request #22938: [SPARK-25935][SQL] Prevent null rows from JSON pa...

2018-11-07 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22938#discussion_r231762733 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala --- @@ -550,15 +550,33 @@ case class JsonToStructs

[GitHub] spark pull request #22938: [SPARK-25935][SQL] Prevent null rows from JSON pa...

2018-11-07 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22938#discussion_r231745125 --- Diff: docs/sql-migration-guide-upgrade.md --- @@ -15,6 +15,8 @@ displayTitle: Spark SQL Upgrading Guide - Since Spark 3.0, the `from_json

[GitHub] spark issue #22958: [SPARK-25952][SQL] Passing actual schema to JacksonParse...

2018-11-07 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22958 good catch! do we need to fix the CSV side?

[GitHub] spark issue #22818: [SPARK-25904][CORE] Allocate arrays smaller than Int.Max...

2018-11-07 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22818 since this is a bug fix, shall we also backport it?

[GitHub] spark pull request #22943: [SPARK-25098][SQL] Trim the string when cast stri...

2018-11-06 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22943#discussion_r231382309 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/DateTimeUtilsSuite.scala --- @@ -140,16 +140,10 @@ class DateTimeUtilsSuite

[GitHub] spark pull request #22943: [SPARK-25098][SQL] Trim the string when cast stri...

2018-11-06 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22943#discussion_r231380552 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/DateTimeUtilsSuite.scala --- @@ -140,16 +140,10 @@ class DateTimeUtilsSuite

[GitHub] spark issue #22956: [SPARK-25950][SQL] from_csv should respect to spark.sql....

2018-11-06 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22956 LGTM

[GitHub] spark pull request #22956: [SPARK-25950][SQL] from_csv should respect to spa...

2018-11-06 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22956#discussion_r231359024 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/csvExpressions.scala --- @@ -92,8 +93,14 @@ case class CsvToStructs

[GitHub] spark pull request #22944: [SPARK-25942][SQL] Fix Dataset.groupByKey to make...

2018-11-06 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22944#discussion_r231358749 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/object.scala --- @@ -262,25 +262,39 @@ object AppendColumns

[GitHub] spark pull request #22944: [SPARK-25942][SQL] Fix Dataset.groupByKey to make...

2018-11-06 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22944#discussion_r231358690 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/object.scala --- @@ -262,25 +262,39 @@ object AppendColumns

[GitHub] spark pull request #22944: [SPARK-25942][SQL] Fix Dataset.groupByKey to make...

2018-11-06 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22944#discussion_r231201502 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/object.scala --- @@ -262,25 +262,39 @@ object AppendColumns

[GitHub] spark issue #22873: [SPARK-25866][ML] Update KMeans formatVersion

2018-11-06 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22873 thanks, merging to master/2.4!

[GitHub] spark pull request #22944: [SPARK-25942][SQL] Fix Dataset.groupByKey to make...

2018-11-06 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22944#discussion_r231159305 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/object.scala --- @@ -262,25 +262,39 @@ object AppendColumns

[GitHub] spark pull request #22944: [SPARK-25942][SQL] Fix Dataset.groupByKey to make...

2018-11-06 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22944#discussion_r231129654 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/object.scala --- @@ -262,25 +262,39 @@ object AppendColumns

[GitHub] spark pull request #22944: [SPARK-25942][SQL] Fix Dataset.groupByKey to make...

2018-11-05 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22944#discussion_r230997935 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/object.scala --- @@ -262,25 +262,39 @@ object AppendColumns

[GitHub] spark pull request #22547: [SPARK-25528][SQL] data source V2 read side API r...

2018-11-05 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22547#discussion_r230989559 --- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaContinuousInputStream.scala --- @@ -46,17 +45,22 @@ import

[GitHub] spark pull request #22944: [SPARK-25942][SQL] Fix Dataset.groupByKey to make...

2018-11-05 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22944#discussion_r230989073 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/object.scala --- @@ -262,25 +262,39 @@ object AppendColumns

[GitHub] spark pull request #22944: [SPARK-25942][SQL] Fix Dataset.groupByKey to make...

2018-11-05 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22944#discussion_r230986226 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/object.scala --- @@ -262,25 +262,39 @@ object AppendColumns

[GitHub] spark pull request #22944: [SPARK-25942][SQL] Fix Dataset.groupByKey to make...

2018-11-05 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22944#discussion_r230977212 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/object.scala --- @@ -262,25 +262,39 @@ object AppendColumns

[GitHub] spark issue #22928: [SPARK-25926][CORE] Move config entries in core module t...

2018-11-05 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22928 thanks, merging to master!

[GitHub] spark issue #22946: [SPARK-25943][SQL] Fail if mismatching nested struct fie...

2018-11-05 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22946 ok to test

[GitHub] spark issue #22946: [SPARK-25943][SQL] Fail if mismatching nested struct fie...

2018-11-05 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22946 ah good catch! Can you also add a test?

[GitHub] spark issue #22502: [SPARK-25474][SQL]When the "fallBackToHdfsForStats= true...

2018-11-05 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22502 @shahidki31 thanks for fixing it! Do you know where we currently read `fallBackToHdfsForStats`, and can you see if we can have a unified place to do

[GitHub] spark issue #22889: [SPARK-25882][SQL] Added a function to join two datasets...

2018-11-05 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22889 I think the problem is real, maybe we should not use `Seq` in the end-user API, but always use Array to be more Java-friendly. This can also avoid bugs like https://github.com/apache/spark/pull

[GitHub] spark issue #22949: [minor] update known_translations

2018-11-05 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22949 Note that these updates are generated by the script, not by me. If someone is not in the list, it means the script can figure out the full name without translation

[GitHub] spark pull request #22949: [minor] update known_translations

2018-11-05 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22949#discussion_r230970453 --- Diff: dev/create-release/known_translations --- @@ -203,3 +203,61 @@ shenh062326 - Shen Hong aokolnychyi - Anton Okolnychyi linbojin - Linbo

[GitHub] spark issue #22949: [minor] update known_translations

2018-11-05 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22949 cc @gatorsmile

[GitHub] spark pull request #22949: [minor] update known_translations

2018-11-05 Thread cloud-fan
GitHub user cloud-fan opened a pull request: https://github.com/apache/spark/pull/22949 [minor] update known_translations ## What changes were proposed in this pull request? update known_translations after running `translate-contributors.py` during 2.4.0 release

[GitHub] spark pull request #22944: [SPARK-25942][SQL] Fix Dataset.groupByKey to make...

2018-11-05 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22944#discussion_r230772018 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DatasetSuite.scala --- @@ -1556,6 +1556,14 @@ class DatasetSuite extends QueryTest

[GitHub] spark pull request #22771: [SPARK-25773][Core]Cancel zombie tasks in a resul...

2018-11-05 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22771#discussion_r230730694 --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala --- @@ -1364,6 +1385,21 @@ private[spark] class DAGScheduler

[GitHub] spark issue #22847: [SPARK-25850][SQL] Make the split threshold for the code...

2018-11-05 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22847 thanks, merging to master!

[GitHub] spark issue #21732: [SPARK-24762][SQL] Enable Option of Product encoders

2018-11-05 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/21732 I think this is close, can you answer https://github.com/apache/spark/pull/21732/files#r228782670 ?

[GitHub] spark pull request #21732: [SPARK-24762][SQL] Enable Option of Product encod...

2018-11-05 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/21732#discussion_r230726457 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DatasetSuite.scala --- @@ -1556,6 +1547,54 @@ class DatasetSuite extends QueryTest

[GitHub] spark pull request #21732: [SPARK-24762][SQL] Enable Option of Product encod...

2018-11-05 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/21732#discussion_r230726136 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DatasetAggregatorSuite.scala --- @@ -393,4 +431,18 @@ class DatasetAggregatorSuite extends

[GitHub] spark issue #22889: [SPARK-25882][SQL] Added a function to join two datasets...

2018-11-05 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22889 So we introduce a new API just to save typing `Seq(...)`? Maintaining an API has a cost.

[GitHub] spark issue #22928: [SPARK-25926][CORE] Move config entries in core module t...

2018-11-05 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22928 retest this please

[GitHub] spark issue #22928: [SPARK-25926][CORE] Move config entries in core module t...

2018-11-05 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22928 LGTM

[GitHub] spark issue #22923: [SPARK-25910][CORE] accumulator updates from previous st...

2018-11-05 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22923 > We need to always update user accumulators Ah that's a good point. I'm going to close it. I missed one thing: the `AppStatusListener` will keep the `StageInfo` instance un

[GitHub] spark pull request #22923: [SPARK-25910][CORE] accumulator updates from prev...

2018-11-05 Thread cloud-fan
Github user cloud-fan closed the pull request at: https://github.com/apache/spark/pull/22923

[GitHub] spark issue #22764: [SPARK-25765][ML] Add training cost to BisectingKMeans s...

2018-11-05 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22764 cc @dbtsai

[GitHub] spark issue #22786: [SPARK-25764][ML][EXAMPLES] Update BisectingKMeans examp...

2018-11-05 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22786 cc @dbtsai

[GitHub] spark issue #22869: [SPARK-25758][ML] Deprecate computeCost in BisectingKMea...

2018-11-05 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22869 cc @dbtsai

[GitHub] spark pull request #22919: [SPARK-25906][SHELL] Documents '-I' option (from ...

2018-11-05 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22919#discussion_r230686892 --- Diff: bin/spark-shell --- @@ -32,7 +32,10 @@ if [ -z "${SPARK_HOME}" ]; then source "$(dirname "$0")"/find-spark-

[GitHub] spark issue #22942: [SPARK-25884][SQL][FOLLOW-UP] Add sample.json back.

2018-11-04 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22942 thanks, merging to master!

[GitHub] spark pull request #22919: [SPARK-25906][SHELL] Documents '-I' option (from ...

2018-11-04 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22919#discussion_r230655513 --- Diff: bin/spark-shell --- @@ -32,7 +32,10 @@ if [ -z "${SPARK_HOME}" ]; then source "$(dirname "$0")"/find-spark-

[GitHub] spark issue #22927: [SPARK-25918][SQL] LOAD DATA LOCAL INPATH should handle ...

2018-11-02 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22927 I'll list it as a known issue in 2.4.0, thanks for fixing it!

[GitHub] spark issue #22847: [SPARK-25850][SQL] Make the split threshold for the code...

2018-11-01 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22847 did you address https://github.com/apache/spark/pull/22847#issuecomment-434836278 ?

[GitHub] spark issue #22029: [SPARK-24395][SQL] IN operator should return NULL when c...

2018-11-01 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22029 If we decide to follow PostgreSQL about the EQUAL behavior eventually, then it will be much easier to fix the IN behavior, right

[GitHub] spark issue #22923: [SPARK-25910][CORE] accumulator updates from previous st...

2018-11-01 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22923 cc @vanzin @zsxwing @jiangxb1987

[GitHub] spark pull request #22923: [SPARK-25910][CORE] accumulator updates from prev...

2018-11-01 Thread cloud-fan
GitHub user cloud-fan opened a pull request: https://github.com/apache/spark/pull/22923 [SPARK-25910][CORE] accumulator updates from previous stage attempt should not log error ## What changes were proposed in this pull request? For shuffle map stages, we may have multiple

[GitHub] spark issue #22029: [SPARK-24395][SQL] IN operator should return NULL when c...

2018-11-01 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22029 If we want to follow PostgreSQL/Oracle for the IN behavior, why don't we follow the EQUAL behavior as well
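For readers following the IN/EQUAL discussion, here is a minimal sketch of standard SQL three-valued logic for scalar values only, with Python's `None` standing in for SQL NULL. It is not Spark's implementation and does not cover the multi-column (struct) case this pull request is about; the helper names `sql_eq` and `sql_in` are hypothetical.

```python
from typing import Any, Iterable, Optional


def sql_eq(a: Any, b: Any) -> Optional[bool]:
    """Scalar equality: the result is unknown (None) if either side is NULL."""
    if a is None or b is None:
        return None
    return a == b


def sql_in(value: Any, candidates: Iterable[Any]) -> Optional[bool]:
    """IN as an OR over equality checks, under three-valued logic."""
    saw_unknown = False
    for candidate in candidates:
        result = sql_eq(value, candidate)
        if result is True:
            return True  # one definite match is enough
        if result is None:
            saw_unknown = True
    # No match: the answer is unknown if any comparison was unknown.
    return None if saw_unknown else False
```

Defining IN on top of the equality helper is the design point here: whatever equality returns for NULL operands carries through to IN, which is why the two behaviours are hard to change independently.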

[GitHub] spark issue #22029: [SPARK-24395][SQL] IN operator should return NULL when c...

2018-11-01 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22029 Another point: I think it's also important to make the behavior of IN consistent with EQUAL. I tried PostgreSQL and `(1, 2) = (3, null)` returns null. Shall we update EQUAL first

[GitHub] spark issue #22919: [SPARK-25906][SHELL] Restores '-i' option's behaviour in...

2018-11-01 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22919 personally I think Spark Shell should be consistent with the upstream Scala Shell, otherwise we may get another ticket complaining why we didn't follow

[GitHub] spark issue #22919: [SPARK-25906][SHELL] Restores '-i' option's behaviour in...

2018-11-01 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22919 so we would support both `-i` and `-I` in 2.4?

[GitHub] spark issue #22029: [SPARK-24395][SQL] IN operator should return NULL when c...

2018-11-01 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22029 which Presto version did you test? I tried 0.203 and it fails ``` presto:default> select * from t2 where (1, 2) in (select x, y from t); Query 20181101_085707_00012_n644a failed: lin

[GitHub] spark issue #22029: [SPARK-24395][SQL] IN operator should return NULL when c...

2018-11-01 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22029 Do you know how Presto supports multi-value in subquery? By reading the PR description, it seems impossible if Presto treats `(a, b)` as a struct value. How does Presto distinguish `(a, b) IN (select x

[GitHub] spark issue #22898: [SPARK-25746][SQL][followup] do not add unnecessary If e...

2018-10-31 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22898 thanks, merging to master!

[GitHub] spark issue #22626: [SPARK-25638][SQL] Adding new function - to_csv()

2018-10-31 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22626 This needs to be rebased.

[GitHub] spark issue #22892: [SPARK-25884][SQL] Add TBLPROPERTIES and COMMENT, and us...

2018-10-31 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22892 thanks, merging to master!

[GitHub] spark issue #22847: [SPARK-25850][SQL] Make the split threshold for the code...

2018-10-31 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22847 @rednaxelafx ah good point! It's hardcoded as 1024 too, and it's also doing method splitting. Let's apply the config there too

[GitHub] spark pull request #22029: [SPARK-24395][SQL] IN operator should return NULL...

2018-10-31 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22029#discussion_r229788060 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala --- @@ -212,27 +212,27 @@ object

[GitHub] spark pull request #22029: [SPARK-24395][SQL] IN operator should return NULL...

2018-10-31 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22029#discussion_r22978 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala --- @@ -212,27 +212,27 @@ object

[GitHub] spark pull request #22905: [SPARK-25894][SQL] Add a ColumnarFileFormat type ...

2018-10-31 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22905#discussion_r229754853 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/DataSourceScanExec.scala --- @@ -306,7 +306,15 @@ case class FileSourceScanExec

[GitHub] spark pull request #22907: [SPARK-25896][CORE][WIP] Accumulator should only ...

2018-10-31 Thread cloud-fan
GitHub user cloud-fan opened a pull request: https://github.com/apache/spark/pull/22907 [SPARK-25896][CORE][WIP] Accumulator should only be updated once for each successful task in shuffle map stage ## What changes were proposed in this pull request? This is a followup

[GitHub] spark pull request #22029: [SPARK-24395][SQL] IN operator should return NULL...

2018-10-31 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22029#discussion_r229708259 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala --- @@ -212,27 +212,27 @@ object

[GitHub] spark issue #22905: [SPARK-25894][SQL] Add a ColumnarFileFormat type which r...

2018-10-31 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22905 is there anything blocked by this? I agree this is a good feature, but it asks the data source to provide a new ability, which may become a problem when migrating file sources to data source v2

[GitHub] spark pull request #22029: [SPARK-24395][SQL] IN operator should return NULL...

2018-10-31 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22029#discussion_r229701584 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala --- @@ -212,27 +212,27 @@ object

[GitHub] spark pull request #22029: [SPARK-24395][SQL] IN operator should return NULL...

2018-10-31 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22029#discussion_r229700708 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala --- @@ -212,27 +212,34 @@ object

[GitHub] spark pull request #22029: [SPARK-24395][SQL] IN operator should return NULL...

2018-10-31 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22029#discussion_r229699828 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala --- @@ -339,37 +371,57 @@ case class In(value

[GitHub] spark pull request #22029: [SPARK-24395][SQL] IN operator should return NULL...

2018-10-31 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22029#discussion_r229697077 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala --- @@ -1561,6 +1561,16 @@ object SQLConf { .booleanConf

[GitHub] spark pull request #22029: [SPARK-24395][SQL] IN operator should return NULL...

2018-10-31 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22029#discussion_r229692081 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala --- @@ -202,7 +225,11 @@ case class InSubquery(values

[GitHub] spark issue #22892: [SPARK-25884][SQL] Add TBLPROPERTIES and COMMENT, and us...

2018-10-31 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22892 LGTM except some minor comments

[GitHub] spark pull request #22892: [SPARK-25884][SQL] Add TBLPROPERTIES and COMMENT,...

2018-10-31 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22892#discussion_r229672667 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/HiveShowCreateTableSuite.scala --- @@ -0,0 +1,198 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request #22892: [SPARK-25884][SQL] Add TBLPROPERTIES and COMMENT,...

2018-10-31 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22892#discussion_r229671459 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/tables.scala --- @@ -1063,21 +1067,19 @@ case class ShowCreateTableCommand(table

[GitHub] spark pull request #22898: [SPARK-25746][SQL][followup] do not add unnecessa...

2018-10-31 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22898#discussion_r229667653 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/encoders/ExpressionEncoder.scala --- @@ -124,14 +124,9 @@ object ExpressionEncoder

[GitHub] spark issue #21860: [SPARK-24901][SQL]Merge the codegen of RegularHashMap an...

2018-10-31 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/21860 thanks, merging to master!

[GitHub] spark pull request #22788: [SPARK-25769][SQL]escape nested columns by backti...

2018-10-31 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22788#discussion_r229585323 --- Diff: sql/core/src/test/resources/sql-tests/results/columnresolution-negative.sql.out --- @@ -161,7 +161,7 @@ SELECT db1.t1.i1 FROM t1, mydb2.t1
