[GitHub] spark pull request #22957: [SPARK-25951][SQL] Ignore aliases for distributio...

2018-11-29 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22957#discussion_r237745005 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/PlannerSuite.scala --- @@ -780,6 +780,23 @@ class PlannerSuite extends SharedSQLContext

[GitHub] spark issue #23165: [SPARK-26188][SQL] FileIndex: don't infer data types of ...

2018-11-29 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/23165 thanks, merging to master/2.4! --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e

[GitHub] spark issue #23031: [SPARK-26060][SQL] Track SparkConf entries and make SET ...

2018-11-29 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/23031 thanks, merging to master! --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail

[GitHub] spark pull request #23086: [SPARK-25528][SQL] data source v2 API refactor (b...

2018-11-29 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23086#discussion_r237734357 --- Diff: sql/core/src/main/java/org/apache/spark/sql/sources/v2/reader/Scan.java --- @@ -0,0 +1,68 @@ +/* + * Licensed to the Apache Software

[GitHub] spark issue #23152: [SPARK-26181][SQL] the `hasMinMaxStats` method of `Colum...

2018-11-29 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/23152 good catch! Like @liancheng said, this is a real bug (a numeric column with all values null), it's better to have an end-to-end test case f

[GitHub] spark pull request #23151: [SPARK-26180][CORE][TEST] Add a withCreateTempDir...

2018-11-29 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23151#discussion_r237732426 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/test/SQLTestUtils.scala --- @@ -66,6 +66,18 @@ private[sql] trait SQLTestUtils extends

[GitHub] spark pull request #23151: [SPARK-26180][CORE][TEST] Add a withCreateTempDir...

2018-11-29 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23151#discussion_r237732369 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/test/SQLTestUtils.scala --- @@ -66,6 +66,18 @@ private[sql] trait SQLTestUtils extends

[GitHub] spark pull request #23165: [SPARK-26188][SQL] FileIndex: don't infer data ty...

2018-11-29 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23165#discussion_r237532982 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/PartitioningUtils.scala --- @@ -147,13 +163,13 @@ object PartitioningUtils

[GitHub] spark pull request #22957: [SPARK-25951][SQL] Ignore aliases for distributio...

2018-11-29 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22957#discussion_r237532156 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -2542,10 +2542,10 @@ object EliminateUnions extends

[GitHub] spark pull request #23178: [SPARK-26216][SQL] Do not use case class as publi...

2018-11-29 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23178#discussion_r237522717 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/expressions/UserDefinedFunction.scala --- @@ -38,114 +38,108 @@ import

[GitHub] spark pull request #22957: [SPARK-25951][SQL] Ignore aliases for distributio...

2018-11-29 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22957#discussion_r237519971 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -2542,10 +2542,10 @@ object EliminateUnions extends

[GitHub] spark pull request #23171: [SPARK-26205][SQL] Optimize In for bytes, shorts,...

2018-11-29 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23171#discussion_r237519018 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala --- @@ -335,6 +343,41 @@ case class In(value

[GitHub] spark issue #23176: [SPARK-26211][SQL] Fix InSet for binary, and struct and ...

2018-11-29 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/23176 thanks, merging to master/2.4/2.3! --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands

[GitHub] spark pull request #23165: [SPARK-26188][SQL] FileIndex: don't infer data ty...

2018-11-29 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23165#discussion_r237510705 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/PartitioningUtils.scala --- @@ -94,18 +94,34 @@ object PartitioningUtils

[GitHub] spark pull request #23165: [SPARK-26188][SQL] FileIndex: don't infer data ty...

2018-11-29 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23165#discussion_r237510162 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/PartitioningUtils.scala --- @@ -250,7 +276,13 @@ object PartitioningUtils

[GitHub] spark pull request #23165: [SPARK-26188][SQL] FileIndex: don't infer data ty...

2018-11-29 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23165#discussion_r237508120 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/PartitioningAwareFileIndex.scala --- @@ -126,33 +126,15 @@ abstract class

[GitHub] spark issue #23132: [SPARK-26163][SQL] Parsing decimals from JSON using loca...

2018-11-29 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/23132 thanks, merging to master! --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail

[GitHub] spark pull request #23178: [SPARK-26216][SQL] Do not use case class as publi...

2018-11-29 Thread cloud-fan
GitHub user cloud-fan opened a pull request: https://github.com/apache/spark/pull/23178 [SPARK-26216][SQL] Do not use case class as public API (UserDefinedFunction) ## What changes were proposed in this pull request? It's a bad idea to use case class as public API,

[GitHub] spark issue #23178: [SPARK-26216][SQL] Do not use case class as public API (...

2018-11-29 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/23178 cc @rxin @srowen @gatorsmile @HyukjinKwon @dongjoon-hyun --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

[GitHub] spark issue #23176: [SPARK-26211][SQL] Fix InSet for binary, and struct and ...

2018-11-29 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/23176 good catch! --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h

[GitHub] spark pull request #23176: [SPARK-26211][SQL] Fix InSet for binary, and stru...

2018-11-29 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23176#discussion_r237383211 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/PredicateSuite.scala --- @@ -293,6 +293,54 @@ class PredicateSuite

[GitHub] spark pull request #23176: [SPARK-26211][SQL] Fix InSet for binary, and stru...

2018-11-29 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23176#discussion_r237382990 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala --- @@ -367,11 +367,29 @@ case class InSet(child

[GitHub] spark pull request #23176: [SPARK-26211][SQL] Fix InSet for binary, and stru...

2018-11-29 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23176#discussion_r237382322 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala --- @@ -367,11 +367,29 @@ case class InSet(child

[GitHub] spark pull request #23031: [SPARK-26060][SQL] Track SparkConf entries and ma...

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23031#discussion_r237364466 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala --- @@ -1610,6 +1610,14 @@ object SQLConf { ""&qu

[GitHub] spark pull request #23151: [SPARK-26180][CORE][TEST] Add a withCreateTempDir...

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23151#discussion_r237364160 --- Diff: core/src/test/scala/org/apache/spark/SparkFunSuite.scala --- @@ -105,5 +105,16 @@ abstract class SparkFunSuite logInfo(s"

[GitHub] spark issue #22514: [SPARK-25271][SQL] Hive ctas commands should use data so...

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22514 can we try a query and see what the SQL UI looks like? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For

[GitHub] spark pull request #22514: [SPARK-25271][SQL] Hive ctas commands should use ...

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22514#discussion_r237363826 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/HiveParquetSuite.scala --- @@ -92,4 +92,18 @@ class HiveParquetSuite extends QueryTest with

[GitHub] spark pull request #22514: [SPARK-25271][SQL] Hive ctas commands should use ...

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22514#discussion_r237361155 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveStrategies.scala --- @@ -181,62 +180,39 @@ case class RelationConversions( conf

[GitHub] spark pull request #23151: [SPARK-26180][CORE][TEST] Add a withCreateTempDir...

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23151#discussion_r237347450 --- Diff: core/src/test/scala/org/apache/spark/SparkFunSuite.scala --- @@ -105,5 +105,16 @@ abstract class SparkFunSuite logInfo(s"

[GitHub] spark pull request #23086: [SPARK-25528][SQL] data source v2 API refactor (b...

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23086#discussion_r237346499 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2ScanExec.scala --- @@ -22,86 +22,56 @@ import

[GitHub] spark pull request #23086: [SPARK-25528][SQL] data source v2 API refactor (b...

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23086#discussion_r237346029 --- Diff: sql/core/src/main/java/org/apache/spark/sql/sources/v2/Table.java --- @@ -0,0 +1,51 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #23086: [SPARK-25528][SQL] data source v2 API refactor (b...

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23086#discussion_r237346128 --- Diff: sql/core/src/main/java/org/apache/spark/sql/sources/v2/TableProvider.java --- @@ -0,0 +1,62 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #23086: [SPARK-25528][SQL] data source v2 API refactor (b...

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23086#discussion_r237342899 --- Diff: sql/core/src/main/java/org/apache/spark/sql/sources/v2/Table.java --- @@ -0,0 +1,65 @@ +/* + * Licensed to the Apache Software

[GitHub] spark issue #22957: [SPARK-25951][SQL] Ignore aliases for distributions and ...

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22957 retest this please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews

[GitHub] spark pull request #23165: [SPARK-26188][SQL] FileIndex: don't infer data ty...

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23165#discussion_r237339152 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/FileIndexSuite.scala --- @@ -49,6 +50,21 @@ class FileIndexSuite extends

[GitHub] spark pull request #23165: [SPARK-26188][SQL] FileIndex: don't infer data ty...

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23165#discussion_r237339038 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/PartitioningAwareFileIndex.scala --- @@ -126,13 +126,14 @@ abstract class

[GitHub] spark pull request #23151: [SPARK-26180][CORE][TEST] Add a withCreateTempDir...

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23151#discussion_r237338904 --- Diff: core/src/test/scala/org/apache/spark/SparkFunSuite.scala --- @@ -105,5 +105,16 @@ abstract class SparkFunSuite logInfo(s"

[GitHub] spark issue #23171: [SPARK-26205][SQL] Optimize In for bytes, shorts, ints

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/23171 I'm wondering if this is still useful after we fix the boxing issue in `InSet`. We can write a binary hash set for primitive types, like `LongToUnsafeRowMap`, which should have better perfor

[GitHub] spark issue #23132: [SPARK-26163][SQL] Parsing decimals from JSON using loca...

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/23132 LGTM, does CSV need to do the same? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands

[GitHub] spark issue #23124: [SPARK-25829][SQL] remove duplicated map keys with last ...

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/23124 thanks, merging to master! --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail

[GitHub] spark issue #23130: [SPARK-26161][SQL] Ignore empty files in load

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/23130 I think this change makes sense, at least it's good for performance. My only concern is, shall we ask all the parsers to return Nil for empty files? AFAIK JSON doesn't

[GitHub] spark pull request #23130: [SPARK-26161][SQL] Ignore empty files in load

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23130#discussion_r237121289 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/sources/SaveLoadSuite.scala --- @@ -142,4 +144,15 @@ class SaveLoadSuite extends DataSourceTest

[GitHub] spark pull request #23132: [SPARK-26163][SQL] Parsing decimals from JSON usi...

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23132#discussion_r237120812 --- Diff: docs/sql-migration-guide-upgrade.md --- @@ -9,6 +9,8 @@ displayTitle: Spark SQL Upgrading Guide ## Upgrading From Spark SQL 2.4 to

[GitHub] spark pull request #23120: [SPARK-26151][SQL] Return partial results for bad...

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23120#discussion_r237118916 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/csv/UnivocityParser.scala --- @@ -243,21 +243,27 @@ class UnivocityParser

[GitHub] spark pull request #23086: [SPARK-25528][SQL] data source v2 API refactor (b...

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23086#discussion_r237113069 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2ScanExec.scala --- @@ -23,85 +23,55 @@ import

[GitHub] spark pull request #23086: [SPARK-25528][SQL] data source v2 API refactor (b...

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23086#discussion_r237111240 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2Relation.scala --- @@ -170,15 +157,24 @@ object

[GitHub] spark pull request #23086: [SPARK-25528][SQL] data source v2 API refactor (b...

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23086#discussion_r237111058 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2Relation.scala --- @@ -40,8 +40,8 @@ import

[GitHub] spark pull request #23086: [SPARK-25528][SQL] data source v2 API refactor (b...

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23086#discussion_r237110747 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/DataFrameReader.scala --- @@ -38,7 +38,7 @@ import org.apache.spark.sql.execution.datasources.jdbc

[GitHub] spark pull request #23086: [SPARK-25528][SQL] data source v2 API refactor (b...

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23086#discussion_r237103452 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2ScanExec.scala --- @@ -54,27 +53,17 @@ case class

[GitHub] spark pull request #23086: [SPARK-25528][SQL] data source v2 API refactor (b...

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23086#discussion_r237092275 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2ScanExec.scala --- @@ -23,29 +23,28 @@ import

[GitHub] spark pull request #23086: [SPARK-25528][SQL] data source v2 API refactor (b...

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23086#discussion_r237080333 --- Diff: sql/core/src/main/java/org/apache/spark/sql/sources/v2/TableProvider.java --- @@ -0,0 +1,62 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #23086: [SPARK-25528][SQL] data source v2 API refactor (b...

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23086#discussion_r237078844 --- Diff: sql/core/src/main/java/org/apache/spark/sql/sources/v2/TableProvider.java --- @@ -0,0 +1,62 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #22957: [SPARK-25951][SQL] Ignore aliases for distributio...

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22957#discussion_r237076213 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Expression.scala --- @@ -195,14 +195,26 @@ abstract class Expression

[GitHub] spark pull request #23031: [SPARK-26060][SQL] Track SparkConf entries and ma...

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23031#discussion_r237075228 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLSuite.scala --- @@ -2715,4 +2715,11 @@ abstract class DDLSuite extends

[GitHub] spark pull request #23031: [SPARK-26060][SQL] Track SparkConf entries and ma...

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23031#discussion_r237075170 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/RuntimeConfigSuite.scala --- @@ -68,4 +68,13 @@ class RuntimeConfigSuite extends SparkFunSuite

[GitHub] spark pull request #23031: [SPARK-26060][SQL] Track SparkConf entries and ma...

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23031#discussion_r237074727 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/RuntimeConfig.scala --- @@ -154,5 +154,9 @@ class RuntimeConfig private[sql](sqlConf: SQLConf

[GitHub] spark pull request #22957: [SPARK-25951][SQL] Ignore aliases for distributio...

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22957#discussion_r237070739 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Expression.scala --- @@ -195,14 +195,26 @@ abstract class Expression

[GitHub] spark pull request #22957: [SPARK-25951][SQL] Ignore aliases for distributio...

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22957#discussion_r237069486 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Expression.scala --- @@ -195,14 +195,26 @@ abstract class Expression

[GitHub] spark issue #23153: [SPARK-26147][SQL] only pull out unevaluable python udf ...

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/23153 thanks, merging to master/2.4! --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e

[GitHub] spark pull request #22957: [SPARK-25951][SQL] Ignore aliases for distributio...

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22957#discussion_r237065506 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Expression.scala --- @@ -195,14 +195,26 @@ abstract class Expression

[GitHub] spark pull request #23130: [SPARK-26161][SQL] Ignore empty files in load

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23130#discussion_r237062653 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/sources/SaveLoadSuite.scala --- @@ -142,4 +144,15 @@ class SaveLoadSuite extends DataSourceTest

[GitHub] spark issue #23083: [SPARK-26114][CORE] ExternalSorter's readingIterator fie...

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/23083 thanks, merging to master/2.4! --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e

[GitHub] spark issue #23128: [SPARK-26142][SQL] Implement shuffle read metrics in SQL

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/23128 thanks , merging to master! --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail

[GitHub] spark pull request #23151: [SPARK-26180][CORE][TEST] Add a withCreateTempDir...

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23151#discussion_r237059126 --- Diff: core/src/test/scala/org/apache/spark/SparkFunSuite.scala --- @@ -105,5 +105,16 @@ abstract class SparkFunSuite logInfo(s"

[GitHub] spark pull request #23151: [SPARK-26180][CORE][TEST] Add a withCreateTempDir...

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23151#discussion_r237058872 --- Diff: core/src/test/scala/org/apache/spark/SparkFunSuite.scala --- @@ -105,5 +105,16 @@ abstract class SparkFunSuite logInfo(s"

[GitHub] spark pull request #23132: [SPARK-26163][SQL] Parsing decimals from JSON usi...

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23132#discussion_r237058331 --- Diff: docs/sql-migration-guide-upgrade.md --- @@ -9,6 +9,8 @@ displayTitle: Spark SQL Upgrading Guide ## Upgrading From Spark SQL 2.4 to

[GitHub] spark issue #22887: [SPARK-25880][CORE] user set's hadoop conf should not ov...

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22887 Spark SQL SET command can't update any static config or Spark core configs, but I think hadoop configs are different. It's not static as users can update it via `SparkContext.hadoopCon

[GitHub] spark pull request #22957: [SPARK-25951][SQL] Ignore aliases for distributio...

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22957#discussion_r237049166 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Expression.scala --- @@ -195,14 +195,26 @@ abstract class Expression

[GitHub] spark pull request #23120: [SPARK-26151][SQL] Return partial results for bad...

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23120#discussion_r237046616 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/FailureSafeParser.scala --- @@ -33,26 +33,21 @@ class FailureSafeParser[IN

[GitHub] spark pull request #23120: [SPARK-26151][SQL] Return partial results for bad...

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23120#discussion_r237046251 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/csv/UnivocityParser.scala --- @@ -243,21 +243,27 @@ class UnivocityParser

[GitHub] spark pull request #23130: [SPARK-26161][SQL] Ignore empty files in load

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23130#discussion_r237045706 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/sources/SaveLoadSuite.scala --- @@ -142,4 +144,15 @@ class SaveLoadSuite extends DataSourceTest

[GitHub] spark pull request #21004: [SPARK-23896][SQL]Improve PartitioningAwareFileIn...

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/21004#discussion_r237040743 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/PartitioningAwareFileIndex.scala --- @@ -126,35 +126,32 @@ abstract class

[GitHub] spark issue #23052: [SPARK-26081][SQL] Prevent empty files for empty partiti...

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/23052 seems like a real failure --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail

[GitHub] spark issue #23141: [SPARK-26021][SQL][followup] add test for special floati...

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/23141 thanks, merging to master! --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail

[GitHub] spark pull request #23128: [SPARK-26142][SQL] Implement shuffle read metrics...

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23128#discussion_r236976020 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/ShuffledRowRDD.scala --- @@ -154,7 +156,10 @@ class ShuffledRowRDD

[GitHub] spark issue #23128: [SPARK-26142][SQL] Implement shuffle read metrics in SQL

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/23128 retest this please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews

[GitHub] spark issue #23153: [SPARK-26147][SQL] only pull out unevaluable python udf ...

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/23153 retest this please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews

[GitHub] spark issue #23052: [SPARK-26081][SQL] Prevent empty files for empty partiti...

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/23052 retest this please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews

[GitHub] spark issue #23083: [SPARK-26114][CORE] ExternalSorter's readingIterator fie...

2018-11-28 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/23083 retest this please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews

[GitHub] spark pull request #23124: [SPARK-25829][SQL] remove duplicated map keys wit...

2018-11-27 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23124#discussion_r236962499 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/ArrayBasedMapBuilder.scala --- @@ -0,0 +1,118 @@ +/* + * Licensed to

[GitHub] spark issue #23052: [SPARK-26081][SQL] Prevent empty files for empty partiti...

2018-11-27 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/23052 retest this please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews

[GitHub] spark pull request #23086: [SPARK-25528][SQL] data source v2 API refactor (b...

2018-11-27 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23086#discussion_r236957293 --- Diff: sql/core/src/main/java/org/apache/spark/sql/sources/v2/Table.java --- @@ -0,0 +1,51 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #23124: [SPARK-25829][SQL] remove duplicated map keys wit...

2018-11-27 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23124#discussion_r236949897 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetRowConverter.scala --- @@ -558,8 +558,11 @@ private[parquet

[GitHub] spark issue #23127: [SPARK-26159] Codegen for LocalTableScanExec and RDDScan...

2018-11-27 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/23127 thanks, merging to master! --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail

[GitHub] spark issue #23083: [SPARK-26114][CORE] ExternalSorter's readingIterator fie...

2018-11-27 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/23083 retest this please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews

[GitHub] spark issue #23128: [SPARK-26142][SQL] Support passing shuffle metrics to ex...

2018-11-27 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/23128 LGTM except one comment --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail

[GitHub] spark pull request #23128: [SPARK-26142][SQL] Support passing shuffle metric...

2018-11-27 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23128#discussion_r236637264 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/ShuffledRowRDD.scala --- @@ -154,7 +156,14 @@ class ShuffledRowRDD

[GitHub] spark pull request #23128: [SPARK-26142][SQL] Support passing shuffle metric...

2018-11-27 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23128#discussion_r236636819 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/ShuffledRowRDD.scala --- @@ -154,7 +156,14 @@ class ShuffledRowRDD

[GitHub] spark issue #23138: [SPARK-23356][SQL][TEST] add new test cases for a + 1,a ...

2018-11-27 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/23138 thanks, merging to master! --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail

[GitHub] spark issue #23083: [SPARK-26114][CORE] ExternalSorter's readingIterator fie...

2018-11-27 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/23083 retest this please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews

[GitHub] spark issue #22887: [SPARK-25880][CORE] user set's hadoop conf should not ov...

2018-11-27 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22887 > Basically, if my "core-size.xml" says "mapreduce.input.fileinputformat.split.maxsize" is 2, and my Spark conf says "spark.hadoop.mapreduce.input.fileinputformat.s

[GitHub] spark pull request #23104: [SPARK-26138][SQL] Cross join requires push Local...

2018-11-27 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23104#discussion_r236625590 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -459,6 +459,7 @@ object LimitPushDown extends Rule

[GitHub] spark pull request #23153: [SPARK-26147][SQL] only pull out unevaluable pyth...

2018-11-27 Thread cloud-fan
GitHub user cloud-fan opened a pull request: https://github.com/apache/spark/pull/23153 [SPARK-26147][SQL] only pull out unevaluable python udf from join condition ## What changes were proposed in this pull request? https://github.com/apache/spark/pull/22326 made a mistake

[GitHub] spark issue #23153: [SPARK-26147][SQL] only pull out unevaluable python udf ...

2018-11-27 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/23153 @xuanyuanking @HyukjinKwon @gatorsmile --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #23124: [SPARK-25829][SQL] remove duplicated map keys with last ...

2018-11-27 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/23124 retest this please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews

[GitHub] spark issue #23083: [SPARK-26114][CORE] ExternalSorter's readingIterator fie...

2018-11-27 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/23083 retest this please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews

[GitHub] spark issue #23128: [SPARK-26142][SQL] Support passing shuffle metrics to ex...

2018-11-27 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/23128 retest this please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews

[GitHub] spark issue #23083: [SPARK-26114][CORE] ExternalSorter's readingIterator fie...

2018-11-26 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/23083 retest this please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews

[GitHub] spark issue #23128: [SPARK-26142][SQL] Support passing shuffle metrics to ex...

2018-11-26 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/23128 retest this please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews

[GitHub] spark pull request #23130: [SPARK-26161][SQL] Ignore empty files in load

2018-11-26 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23130#discussion_r236515927 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/DataSourceScanExec.scala --- @@ -388,7 +388,7 @@ case class FileSourceScanExec

<    1   2   3   4   5   6   7   8   9   10   >