[GitHub] spark pull request #22957: [SPARK-25951][SQL] Ignore aliases for distributio...

2018-11-29 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22957#discussion_r237745005 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/PlannerSuite.scala --- @@ -780,6 +780,23 @@ class PlannerSuite extends SharedSQLContext

[GitHub] spark issue #23165: [SPARK-26188][SQL] FileIndex: don't infer data types of ...

2018-11-29 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/23165 thanks, merging to master/2.4! --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e

[GitHub] spark issue #23031: [SPARK-26060][SQL] Track SparkConf entries and make SET ...

2018-11-29 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/23031 thanks, merging to master!

[GitHub] spark pull request #23086: [SPARK-25528][SQL] data source v2 API refactor (b...

2018-11-29 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23086#discussion_r237734357 --- Diff: sql/core/src/main/java/org/apache/spark/sql/sources/v2/reader/Scan.java --- @@ -0,0 +1,68 @@ +/* + * Licensed to the Apache Software

[GitHub] spark issue #23152: [SPARK-26181][SQL] the `hasMinMaxStats` method of `Colum...

2018-11-29 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/23152 good catch! Like @liancheng said, this is a real bug (a numeric column with all values null), it's better to have an end-to-end test case
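The all-null case mentioned in this comment can be illustrated with a small sketch. It assumes the problem is that a numeric column whose values are all null has no min or max, so a check like the `hasMinMaxStats` one named in the subject has to look at whether the statistics are actually present rather than at the data type alone. `ColumnStatSketch` and its members are hypothetical names for illustration, not Spark's classes.

```java
import java.util.OptionalLong;

// Hypothetical stand-in for per-column statistics; not Spark's actual class.
final class ColumnStatSketch {
    final OptionalLong min;
    final OptionalLong max;
    final long nullCount;

    ColumnStatSketch(OptionalLong min, OptionalLong max, long nullCount) {
        this.min = min;
        this.max = max;
        this.nullCount = nullCount;
    }

    // Computes min/max over the non-null values and counts the nulls.
    static ColumnStatSketch of(Long[] values) {
        OptionalLong min = OptionalLong.empty();
        OptionalLong max = OptionalLong.empty();
        long nulls = 0;
        for (Long v : values) {
            if (v == null) {
                nulls++;
                continue;
            }
            min = OptionalLong.of(min.isPresent() ? Math.min(min.getAsLong(), v) : v);
            max = OptionalLong.of(max.isPresent() ? Math.max(max.getAsLong(), v) : v);
        }
        return new ColumnStatSketch(min, max, nulls);
    }

    // True only when both bounds exist; an all-null column yields false
    // even though its type is numeric.
    boolean hasMinMax() {
        return min.isPresent() && max.isPresent();
    }
}
```

An end-to-end test along the lines suggested in the comment would build a column containing only nulls and confirm that nothing downstream tries to read the missing bounds.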

[GitHub] spark pull request #23151: [SPARK-26180][CORE][TEST] Add a withCreateTempDir...

2018-11-29 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23151#discussion_r237732426 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/test/SQLTestUtils.scala --- @@ -66,6 +66,18 @@ private[sql] trait SQLTestUtils extends

[GitHub] spark pull request #23151: [SPARK-26180][CORE][TEST] Add a withCreateTempDir...

2018-11-29 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23151#discussion_r237732369 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/test/SQLTestUtils.scala --- @@ -66,6 +66,18 @@ private[sql] trait SQLTestUtils extends

[GitHub] spark pull request #23165: [SPARK-26188][SQL] FileIndex: don't infer data ty...

2018-11-29 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23165#discussion_r237532982 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/PartitioningUtils.scala --- @@ -147,13 +163,13 @@ object PartitioningUtils

[GitHub] spark pull request #22957: [SPARK-25951][SQL] Ignore aliases for distributio...

2018-11-29 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22957#discussion_r237532156 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -2542,10 +2542,10 @@ object EliminateUnions extends

[GitHub] spark pull request #23178: [SPARK-26216][SQL] Do not use case class as publi...

2018-11-29 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23178#discussion_r237522717 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/expressions/UserDefinedFunction.scala --- @@ -38,114 +38,108 @@ import

[GitHub] spark pull request #22957: [SPARK-25951][SQL] Ignore aliases for distributio...

2018-11-29 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22957#discussion_r237519971 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -2542,10 +2542,10 @@ object EliminateUnions extends

[GitHub] spark pull request #23171: [SPARK-26205][SQL] Optimize In for bytes, shorts,...

2018-11-29 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23171#discussion_r237519018 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala --- @@ -335,6 +343,41 @@ case class In(value

[GitHub] spark issue #23176: [SPARK-26211][SQL] Fix InSet for binary, and struct and ...

2018-11-29 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/23176 thanks, merging to master/2.4/2.3!

[GitHub] spark pull request #23165: [SPARK-26188][SQL] FileIndex: don't infer data ty...

2018-11-29 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23165#discussion_r237510705 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/PartitioningUtils.scala --- @@ -94,18 +94,34 @@ object PartitioningUtils

[GitHub] spark pull request #23165: [SPARK-26188][SQL] FileIndex: don't infer data ty...

2018-11-29 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23165#discussion_r237510162 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/PartitioningUtils.scala --- @@ -250,7 +276,13 @@ object PartitioningUtils

[GitHub] spark pull request #23165: [SPARK-26188][SQL] FileIndex: don't infer data ty...

2018-11-29 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23165#discussion_r237508120 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/PartitioningAwareFileIndex.scala --- @@ -126,33 +126,15 @@ abstract class

[GitHub] spark issue #23132: [SPARK-26163][SQL] Parsing decimals from JSON using loca...

2018-11-29 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/23132 thanks, merging to master!

[GitHub] spark pull request #23178: [SPARK-26216][SQL] Do not use case class as publi...

2018-11-29 Thread cloud-fan
GitHub user cloud-fan opened a pull request: https://github.com/apache/spark/pull/23178 [SPARK-26216][SQL] Do not use case class as public API (UserDefinedFunction) ## What changes were proposed in this pull request? It's a bad idea to use case class as public API

[GitHub] spark issue #23178: [SPARK-26216][SQL] Do not use case class as public API (...

2018-11-29 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/23178 cc @rxin @srowen @gatorsmile @HyukjinKwon @dongjoon-hyun

[GitHub] spark issue #23176: [SPARK-26211][SQL] Fix InSet for binary, and struct and ...

2018-11-29 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/23176 good catch!

[GitHub] spark pull request #23176: [SPARK-26211][SQL] Fix InSet for binary, and stru...

2018-11-29 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23176#discussion_r237383211 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/PredicateSuite.scala --- @@ -293,6 +293,54 @@ class PredicateSuite

[GitHub] spark pull request #23176: [SPARK-26211][SQL] Fix InSet for binary, and stru...

2018-11-29 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23176#discussion_r237382990 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala --- @@ -367,11 +367,29 @@ case class InSet(child

[GitHub] spark pull request #23176: [SPARK-26211][SQL] Fix InSet for binary, and stru...

2018-11-29 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23176#discussion_r237382322 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala --- @@ -367,11 +367,29 @@ case class InSet(child

[GitHub] spark pull request #23031: [SPARK-26060][SQL] Track SparkConf entries and ma...

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23031#discussion_r237364466 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala --- @@ -1610,6 +1610,14 @@ object SQLConf { """

[GitHub] spark pull request #23151: [SPARK-26180][CORE][TEST] Add a withCreateTempDir...

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23151#discussion_r237364160 --- Diff: core/src/test/scala/org/apache/spark/SparkFunSuite.scala --- @@ -105,5 +105,16 @@ abstract class SparkFunSuite logInfo(s"

[GitHub] spark issue #22514: [SPARK-25271][SQL] Hive ctas commands should use data so...

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22514 can we try a query and see what the SQL UI looks like?

[GitHub] spark pull request #22514: [SPARK-25271][SQL] Hive ctas commands should use ...

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22514#discussion_r237363826 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/HiveParquetSuite.scala --- @@ -92,4 +92,18 @@ class HiveParquetSuite extends QueryTest

[GitHub] spark pull request #22514: [SPARK-25271][SQL] Hive ctas commands should use ...

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22514#discussion_r237361155 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveStrategies.scala --- @@ -181,62 +180,39 @@ case class RelationConversions( conf

[GitHub] spark pull request #23151: [SPARK-26180][CORE][TEST] Add a withCreateTempDir...

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23151#discussion_r237347450 --- Diff: core/src/test/scala/org/apache/spark/SparkFunSuite.scala --- @@ -105,5 +105,16 @@ abstract class SparkFunSuite logInfo(s"

[GitHub] spark pull request #23086: [SPARK-25528][SQL] data source v2 API refactor (b...

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23086#discussion_r237346499 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2ScanExec.scala --- @@ -22,86 +22,56 @@ import

[GitHub] spark pull request #23086: [SPARK-25528][SQL] data source v2 API refactor (b...

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23086#discussion_r237346029 --- Diff: sql/core/src/main/java/org/apache/spark/sql/sources/v2/Table.java --- @@ -0,0 +1,51 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #23086: [SPARK-25528][SQL] data source v2 API refactor (b...

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23086#discussion_r237346128 --- Diff: sql/core/src/main/java/org/apache/spark/sql/sources/v2/TableProvider.java --- @@ -0,0 +1,62 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #23086: [SPARK-25528][SQL] data source v2 API refactor (b...

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23086#discussion_r237342899 --- Diff: sql/core/src/main/java/org/apache/spark/sql/sources/v2/Table.java --- @@ -0,0 +1,65 @@ +/* + * Licensed to the Apache Software

[GitHub] spark issue #22957: [SPARK-25951][SQL] Ignore aliases for distributions and ...

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22957 retest this please

[GitHub] spark pull request #23165: [SPARK-26188][SQL] FileIndex: don't infer data ty...

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23165#discussion_r237339152 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/FileIndexSuite.scala --- @@ -49,6 +50,21 @@ class FileIndexSuite extends

[GitHub] spark pull request #23165: [SPARK-26188][SQL] FileIndex: don't infer data ty...

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23165#discussion_r237339038 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/PartitioningAwareFileIndex.scala --- @@ -126,13 +126,14 @@ abstract class

[GitHub] spark pull request #23151: [SPARK-26180][CORE][TEST] Add a withCreateTempDir...

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23151#discussion_r237338904 --- Diff: core/src/test/scala/org/apache/spark/SparkFunSuite.scala --- @@ -105,5 +105,16 @@ abstract class SparkFunSuite logInfo(s"

[GitHub] spark issue #23171: [SPARK-26205][SQL] Optimize In for bytes, shorts, ints

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/23171 I'm wondering if this is still useful after we fix the boxing issue in `InSet`. We can write a binary hash set for primitive types, like `LongToUnsafeRowMap`, which should have better performance
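To make the boxing point concrete, here is a minimal sketch of a hash set specialized for `long` keys. It is not Spark's `InSet` or `LongToUnsafeRowMap`; the class `LongHashSet` and its methods are hypothetical, and the sketch only assumes that membership tests against a plain `long[]` avoid allocating `java.lang.Long` objects.

```java
// Hypothetical sketch: an open-addressing hash set specialized for long keys.
// Keys live in a plain long[] so add/contains never box values.
final class LongHashSet {
    private long[] keys;
    private boolean[] used; // marks occupied slots, so 0L can be a valid key
    private int size;
    private int mask;

    LongHashSet(int expectedSize) {
        int capacity = 16;
        while (capacity < expectedSize * 2) {
            capacity <<= 1; // keep capacity a power of two for cheap masking
        }
        keys = new long[capacity];
        used = new boolean[capacity];
        mask = capacity - 1;
    }

    // Multiplicative mixing so that sequential keys spread across slots.
    private static int hash(long key) {
        long h = key * 0x9E3779B97F4A7C15L;
        return (int) (h ^ (h >>> 32));
    }

    // Returns true if the key was newly inserted, false if already present.
    boolean add(long key) {
        if ((size + 1) * 2 > keys.length) {
            resize(); // keep the load factor at or below 0.5
        }
        int pos = hash(key) & mask;
        while (used[pos]) {
            if (keys[pos] == key) {
                return false;
            }
            pos = (pos + 1) & mask; // linear probing
        }
        used[pos] = true;
        keys[pos] = key;
        size++;
        return true;
    }

    boolean contains(long key) {
        int pos = hash(key) & mask;
        while (used[pos]) {
            if (keys[pos] == key) {
                return true;
            }
            pos = (pos + 1) & mask;
        }
        return false;
    }

    int size() {
        return size;
    }

    // Doubles the table and reinserts every existing key.
    private void resize() {
        long[] oldKeys = keys;
        boolean[] oldUsed = used;
        keys = new long[oldKeys.length * 2];
        used = new boolean[keys.length];
        mask = keys.length - 1;
        size = 0;
        for (int i = 0; i < oldKeys.length; i++) {
            if (oldUsed[i]) {
                add(oldKeys[i]);
            }
        }
    }
}
```

Bytes, shorts and ints can be widened to `long` before lookup, which is one reason a single primitive set might cover the cases discussed in this pull request; whether it outperforms the proposed approach would need to be measured.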

[GitHub] spark issue #23132: [SPARK-26163][SQL] Parsing decimals from JSON using loca...

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/23132 LGTM, does CSV need to do the same?

[GitHub] spark issue #23124: [SPARK-25829][SQL] remove duplicated map keys with last ...

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/23124 thanks, merging to master!

[GitHub] spark issue #23130: [SPARK-26161][SQL] Ignore empty files in load

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/23130 I think this change makes sense, at least it's good for performance. My only concern is, shall we ask all the parsers to return Nil for empty files? AFAIK JSON doesn't follow

[GitHub] spark pull request #23130: [SPARK-26161][SQL] Ignore empty files in load

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23130#discussion_r237121289 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/sources/SaveLoadSuite.scala --- @@ -142,4 +144,15 @@ class SaveLoadSuite extends DataSourceTest

[GitHub] spark pull request #23132: [SPARK-26163][SQL] Parsing decimals from JSON usi...

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23132#discussion_r237120812 --- Diff: docs/sql-migration-guide-upgrade.md --- @@ -9,6 +9,8 @@ displayTitle: Spark SQL Upgrading Guide ## Upgrading From Spark SQL 2.4

[GitHub] spark pull request #23120: [SPARK-26151][SQL] Return partial results for bad...

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23120#discussion_r237118916 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/csv/UnivocityParser.scala --- @@ -243,21 +243,27 @@ class UnivocityParser

[GitHub] spark pull request #23086: [SPARK-25528][SQL] data source v2 API refactor (b...

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23086#discussion_r237113069 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2ScanExec.scala --- @@ -23,85 +23,55 @@ import

[GitHub] spark pull request #23086: [SPARK-25528][SQL] data source v2 API refactor (b...

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23086#discussion_r237111240 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2Relation.scala --- @@ -170,15 +157,24 @@ object

[GitHub] spark pull request #23086: [SPARK-25528][SQL] data source v2 API refactor (b...

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23086#discussion_r237111058 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2Relation.scala --- @@ -40,8 +40,8 @@ import

[GitHub] spark pull request #23086: [SPARK-25528][SQL] data source v2 API refactor (b...

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23086#discussion_r237110747 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/DataFrameReader.scala --- @@ -38,7 +38,7 @@ import org.apache.spark.sql.execution.datasources.jdbc

[GitHub] spark pull request #23086: [SPARK-25528][SQL] data source v2 API refactor (b...

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23086#discussion_r237103452 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2ScanExec.scala --- @@ -54,27 +53,17 @@ case class

[GitHub] spark pull request #23086: [SPARK-25528][SQL] data source v2 API refactor (b...

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23086#discussion_r237092275 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2ScanExec.scala --- @@ -23,29 +23,28 @@ import

[GitHub] spark pull request #23086: [SPARK-25528][SQL] data source v2 API refactor (b...

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23086#discussion_r237080333 --- Diff: sql/core/src/main/java/org/apache/spark/sql/sources/v2/TableProvider.java --- @@ -0,0 +1,62 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #23086: [SPARK-25528][SQL] data source v2 API refactor (b...

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23086#discussion_r237078844 --- Diff: sql/core/src/main/java/org/apache/spark/sql/sources/v2/TableProvider.java --- @@ -0,0 +1,62 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #22957: [SPARK-25951][SQL] Ignore aliases for distributio...

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22957#discussion_r237076213 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Expression.scala --- @@ -195,14 +195,26 @@ abstract class Expression

[GitHub] spark pull request #23031: [SPARK-26060][SQL] Track SparkConf entries and ma...

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23031#discussion_r237075228 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLSuite.scala --- @@ -2715,4 +2715,11 @@ abstract class DDLSuite extends

[GitHub] spark pull request #23031: [SPARK-26060][SQL] Track SparkConf entries and ma...

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23031#discussion_r237075170 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/RuntimeConfigSuite.scala --- @@ -68,4 +68,13 @@ class RuntimeConfigSuite extends SparkFunSuite

[GitHub] spark pull request #23031: [SPARK-26060][SQL] Track SparkConf entries and ma...

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23031#discussion_r237074727 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/RuntimeConfig.scala --- @@ -154,5 +154,9 @@ class RuntimeConfig private[sql](sqlConf: SQLConf

[GitHub] spark pull request #22957: [SPARK-25951][SQL] Ignore aliases for distributio...

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22957#discussion_r237070739 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Expression.scala --- @@ -195,14 +195,26 @@ abstract class Expression

[GitHub] spark pull request #22957: [SPARK-25951][SQL] Ignore aliases for distributio...

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22957#discussion_r237069486 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Expression.scala --- @@ -195,14 +195,26 @@ abstract class Expression

[GitHub] spark issue #23153: [SPARK-26147][SQL] only pull out unevaluable python udf ...

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/23153 thanks, merging to master/2.4!

[GitHub] spark pull request #22957: [SPARK-25951][SQL] Ignore aliases for distributio...

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22957#discussion_r237065506 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Expression.scala --- @@ -195,14 +195,26 @@ abstract class Expression

[GitHub] spark pull request #23130: [SPARK-26161][SQL] Ignore empty files in load

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23130#discussion_r237062653 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/sources/SaveLoadSuite.scala --- @@ -142,4 +144,15 @@ class SaveLoadSuite extends DataSourceTest

[GitHub] spark issue #23083: [SPARK-26114][CORE] ExternalSorter's readingIterator fie...

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/23083 thanks, merging to master/2.4!

[GitHub] spark issue #23128: [SPARK-26142][SQL] Implement shuffle read metrics in SQL

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/23128 thanks, merging to master!

[GitHub] spark pull request #23151: [SPARK-26180][CORE][TEST] Add a withCreateTempDir...

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23151#discussion_r237059126 --- Diff: core/src/test/scala/org/apache/spark/SparkFunSuite.scala --- @@ -105,5 +105,16 @@ abstract class SparkFunSuite logInfo(s"

[GitHub] spark pull request #23151: [SPARK-26180][CORE][TEST] Add a withCreateTempDir...

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23151#discussion_r237058872 --- Diff: core/src/test/scala/org/apache/spark/SparkFunSuite.scala --- @@ -105,5 +105,16 @@ abstract class SparkFunSuite logInfo(s"

[GitHub] spark pull request #23132: [SPARK-26163][SQL] Parsing decimals from JSON usi...

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23132#discussion_r237058331 --- Diff: docs/sql-migration-guide-upgrade.md --- @@ -9,6 +9,8 @@ displayTitle: Spark SQL Upgrading Guide ## Upgrading From Spark SQL 2.4

[GitHub] spark issue #22887: [SPARK-25880][CORE] user set's hadoop conf should not ov...

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22887 Spark SQL SET command can't update any static config or Spark core configs, but I think hadoop configs are different. It's not static as users can update it via `SparkContext.hadoopConfiguration

[GitHub] spark pull request #22957: [SPARK-25951][SQL] Ignore aliases for distributio...

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22957#discussion_r237049166 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Expression.scala --- @@ -195,14 +195,26 @@ abstract class Expression

[GitHub] spark pull request #23120: [SPARK-26151][SQL] Return partial results for bad...

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23120#discussion_r237046616 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/FailureSafeParser.scala --- @@ -33,26 +33,21 @@ class FailureSafeParser

[GitHub] spark pull request #23120: [SPARK-26151][SQL] Return partial results for bad...

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23120#discussion_r237046251 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/csv/UnivocityParser.scala --- @@ -243,21 +243,27 @@ class UnivocityParser

[GitHub] spark pull request #23130: [SPARK-26161][SQL] Ignore empty files in load

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23130#discussion_r237045706 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/sources/SaveLoadSuite.scala --- @@ -142,4 +144,15 @@ class SaveLoadSuite extends DataSourceTest

[GitHub] spark pull request #21004: [SPARK-23896][SQL]Improve PartitioningAwareFileIn...

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/21004#discussion_r237040743 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/PartitioningAwareFileIndex.scala --- @@ -126,35 +126,32 @@ abstract class

[GitHub] spark issue #23052: [SPARK-26081][SQL] Prevent empty files for empty partiti...

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/23052 seems like a real failure

[GitHub] spark issue #23141: [SPARK-26021][SQL][followup] add test for special floati...

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/23141 thanks, merging to master!

[GitHub] spark pull request #23128: [SPARK-26142][SQL] Implement shuffle read metrics...

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23128#discussion_r236976020 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/ShuffledRowRDD.scala --- @@ -154,7 +156,10 @@ class ShuffledRowRDD

[GitHub] spark issue #23128: [SPARK-26142][SQL] Implement shuffle read metrics in SQL

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/23128 retest this please

[GitHub] spark issue #23153: [SPARK-26147][SQL] only pull out unevaluable python udf ...

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/23153 retest this please

[GitHub] spark issue #23052: [SPARK-26081][SQL] Prevent empty files for empty partiti...

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/23052 retest this please

[GitHub] spark issue #23083: [SPARK-26114][CORE] ExternalSorter's readingIterator fie...

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/23083 retest this please

[GitHub] spark pull request #23124: [SPARK-25829][SQL] remove duplicated map keys wit...

2018-11-27 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23124#discussion_r236962499 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/ArrayBasedMapBuilder.scala --- @@ -0,0 +1,118 @@ +/* + * Licensed

[GitHub] spark issue #23052: [SPARK-26081][SQL] Prevent empty files for empty partiti...

2018-11-27 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/23052 retest this please

[GitHub] spark pull request #23086: [SPARK-25528][SQL] data source v2 API refactor (b...

2018-11-27 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23086#discussion_r236957293 --- Diff: sql/core/src/main/java/org/apache/spark/sql/sources/v2/Table.java --- @@ -0,0 +1,51 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #23124: [SPARK-25829][SQL] remove duplicated map keys wit...

2018-11-27 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23124#discussion_r236949897 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetRowConverter.scala --- @@ -558,8 +558,11 @@ private[parquet

[GitHub] spark issue #23127: [SPARK-26159] Codegen for LocalTableScanExec and RDDScan...

2018-11-27 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/23127 thanks, merging to master!

[GitHub] spark issue #23083: [SPARK-26114][CORE] ExternalSorter's readingIterator fie...

2018-11-27 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/23083 retest this please

[GitHub] spark issue #23128: [SPARK-26142][SQL] Support passing shuffle metrics to ex...

2018-11-27 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/23128 LGTM except one comment

[GitHub] spark pull request #23128: [SPARK-26142][SQL] Support passing shuffle metric...

2018-11-27 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23128#discussion_r236637264 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/ShuffledRowRDD.scala --- @@ -154,7 +156,14 @@ class ShuffledRowRDD

[GitHub] spark pull request #23128: [SPARK-26142][SQL] Support passing shuffle metric...

2018-11-27 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23128#discussion_r236636819 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/ShuffledRowRDD.scala --- @@ -154,7 +156,14 @@ class ShuffledRowRDD

[GitHub] spark issue #23138: [SPARK-23356][SQL][TEST] add new test cases for a + 1,a ...

2018-11-27 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/23138 thanks, merging to master!

[GitHub] spark issue #23083: [SPARK-26114][CORE] ExternalSorter's readingIterator fie...

2018-11-27 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/23083 retest this please

[GitHub] spark issue #22887: [SPARK-25880][CORE] user set's hadoop conf should not ov...

2018-11-27 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22887 > Basically, if my "core-site.xml" says "mapreduce.input.fileinputformat.split.maxsize" is 2, and my Spark conf says "spark.hadoop.mapreduce.input.fileinp

[GitHub] spark pull request #23104: [SPARK-26138][SQL] Cross join requires push Local...

2018-11-27 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23104#discussion_r236625590 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -459,6 +459,7 @@ object LimitPushDown extends Rule

[GitHub] spark pull request #23153: [SPARK-26147][SQL] only pull out unevaluable pyth...

2018-11-27 Thread cloud-fan
GitHub user cloud-fan opened a pull request: https://github.com/apache/spark/pull/23153 [SPARK-26147][SQL] only pull out unevaluable python udf from join condition ## What changes were proposed in this pull request? https://github.com/apache/spark/pull/22326 made a mistake

[GitHub] spark issue #23153: [SPARK-26147][SQL] only pull out unevaluable python udf ...

2018-11-27 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/23153 @xuanyuanking @HyukjinKwon @gatorsmile

[GitHub] spark issue #23124: [SPARK-25829][SQL] remove duplicated map keys with last ...

2018-11-27 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/23124 retest this please

[GitHub] spark issue #23083: [SPARK-26114][CORE] ExternalSorter's readingIterator fie...

2018-11-27 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/23083 retest this please

[GitHub] spark issue #23128: [SPARK-26142][SQL] Support passing shuffle metrics to ex...

2018-11-27 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/23128 retest this please

[GitHub] spark issue #23083: [SPARK-26114][CORE] ExternalSorter's readingIterator fie...

2018-11-26 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/23083 retest this please

[GitHub] spark issue #23128: [SPARK-26142][SQL] Support passing shuffle metrics to ex...

2018-11-26 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/23128 retest this please

[GitHub] spark pull request #23130: [SPARK-26161][SQL] Ignore empty files in load

2018-11-26 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23130#discussion_r236515927 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/DataSourceScanExec.scala --- @@ -388,7 +388,7 @@ case class FileSourceScanExec
