[GitHub] spark pull request #23045: [SPARK-26071][SQL] disallow map as map key

2018-11-18 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23045#discussion_r234494542 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala --- @@ -521,13 +521,18 @@ case class

[GitHub] spark pull request #23054: [SPARK-26085][SQL] Key attribute of non-struct ty...

2018-11-18 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23054#discussion_r234476607 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/KeyValueGroupedDataset.scala --- @@ -459,7 +460,11 @@ class KeyValueGroupedDataset[K, V] private

[GitHub] spark pull request #23043: [SPARK-26021][SQL] replace minus zero with zero i...

2018-11-18 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23043#discussion_r234476361 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameAggregateSuite.scala --- @@ -723,4 +723,32 @@ class DataFrameAggregateSuite extends

[GitHub] spark pull request #23043: [SPARK-26021][SQL] replace minus zero with zero i...

2018-11-18 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23043#discussion_r234475978 --- Diff: common/unsafe/src/test/java/org/apache/spark/unsafe/PlatformUtilSuite.java --- @@ -157,4 +159,15 @@ public void heapMemoryReuse

[GitHub] spark pull request #23043: [SPARK-26021][SQL] replace minus zero with zero i...

2018-11-18 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23043#discussion_r234476055 --- Diff: common/unsafe/src/test/java/org/apache/spark/unsafe/PlatformUtilSuite.java --- @@ -157,4 +159,15 @@ public void heapMemoryReuse

[GitHub] spark pull request #23043: [SPARK-26021][SQL] replace minus zero with zero i...

2018-11-18 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23043#discussion_r234475858 --- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/Platform.java --- @@ -120,6 +120,9 @@ public static float getFloat(Object object, long

[GitHub] spark pull request #23025: [SPARK-26024][SQL]: Update documentation for repa...

2018-11-18 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23025#discussion_r234475550 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala --- @@ -2789,6 +2789,12 @@ class Dataset[T] private[sql]( * When no explicit

[GitHub] spark issue #23025: [SPARK-26024][SQL]: Update documentation for repartition...

2018-11-18 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/23025 ok to test --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h

[GitHub] spark pull request #23054: [SPARK-26085][SQL] Key attribute of primitive typ...

2018-11-18 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23054#discussion_r234475321 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala --- @@ -1594,6 +1594,15 @@ object SQLConf { "WHERE,

[GitHub] spark pull request #23054: [SPARK-26085][SQL] Key attribute of primitive typ...

2018-11-18 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23054#discussion_r234475156 --- Diff: docs/sql-migration-guide-upgrade.md --- @@ -17,6 +17,9 @@ displayTitle: Spark SQL Upgrading Guide - The `ADD JAR` command

[GitHub] spark pull request #23079: [SPARK-26107][SQL] Extend ReplaceNullWithFalseInP...

2018-11-18 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23079#discussion_r234474562 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala --- @@ -767,6 +767,15 @@ object ReplaceNullWithFalse

[GitHub] spark pull request #23042: [SPARK-26070][SQL] add rule for implicit type coe...

2018-11-17 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23042#discussion_r234401696 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala --- @@ -138,6 +138,11 @@ object TypeCoercion

[GitHub] spark pull request #23054: [SPARK-26085][SQL] Key attribute of primitive typ...

2018-11-17 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23054#discussion_r234401319 --- Diff: docs/sql-migration-guide-upgrade.md --- @@ -17,6 +17,8 @@ displayTitle: Spark SQL Upgrading Guide - The `ADD JAR` command

[GitHub] spark issue #22547: [SPARK-25528][SQL] data source V2 read side API refactor...

2018-11-17 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22547 I was stuck with some personal business recently; I'll send a PR for batch source after the weekend.

[GitHub] spark issue #23040: [SPARK-26068][Core]ChunkedByteBufferInputStream should h...

2018-11-16 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/23040 also cc @jiangxb1987 @zsxwing

[GitHub] spark issue #23040: [SPARK-26068][Core]ChunkedByteBufferInputStream should h...

2018-11-16 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/23040 LGTM except one comment

[GitHub] spark pull request #23040: [SPARK-26068][Core]ChunkedByteBufferInputStream s...

2018-11-16 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23040#discussion_r234395227 --- Diff: core/src/main/scala/org/apache/spark/util/io/ChunkedByteBuffer.scala --- @@ -222,7 +222,7 @@ private[spark] class ChunkedByteBufferInputStream

[GitHub] spark issue #23043: [SPARK-26021][SQL] replace minus zero with zero in Unsaf...

2018-11-16 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/23043 `UnsafeRow.set` is not the only place that writes float/double as binary data. Can you check other places like UnsafeWriter

[GitHub] spark issue #23043: [SPARK-26021][SQL] replace minus zero with zero in Unsaf...

2018-11-16 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/23043 Looking at `UnsafeRow.putFloat`, it normalizes the value of `Float.NaN`. I think we should do the same there for `-0.0`, and other related places (check how we handle Float.NaN
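
The normalization idea in the comment above can be sketched as follows. This is a minimal illustration, not Spark's code: `FloatNormalizer` and `normalize` are hypothetical names, and the sketch only assumes that every NaN should collapse to one canonical bit pattern and that `-0.0f` should collapse to `+0.0f` before the value is written out as raw bytes.

```java
// Hypothetical helper for illustration only; not part of Spark.
final class FloatNormalizer {
    // Returns a value whose raw bit pattern is canonical:
    // any NaN becomes Float.NaN, and -0.0f becomes +0.0f.
    static float normalize(float value) {
        if (Float.isNaN(value)) {
            return Float.NaN;   // one canonical NaN bit pattern
        }
        if (value == 0.0f) {    // true for both +0.0f and -0.0f
            return 0.0f;        // always return positive zero
        }
        return value;
    }

    public static void main(String[] args) {
        System.out.println(Integer.toHexString(Float.floatToRawIntBits(normalize(-0.0f))));
        System.out.println(Integer.toHexString(Float.floatToRawIntBits(normalize(Float.NaN))));
    }
}
```

Applying a step like this at each place that writes floats as binary data would be one way to keep byte-wise comparison consistent with numeric comparison; whether the PR ends up doing that is not shown in this snippet.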

[GitHub] spark issue #23043: [SPARK-26021][SQL] replace minus zero with zero in Unsaf...

2018-11-16 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/23043 Before rushing to a fix that replaces -0.0 with 0.0, I'd like to know how this bug happens. One possible reason is that 0.0 and -0.0 have different binary formats. Spark uses unsafe API
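
The "different binary format" point can be checked with plain Java. The snippet below is a small sketch (the class name `SignedZeroBits` is made up for illustration): `0.0` and `-0.0` compare equal numerically, but their raw IEEE 754 bit patterns differ in the sign bit, so any byte-wise comparison or hash of the raw bytes would treat them as different values.

```java
// Illustration only: shows that 0.0 and -0.0 are numerically equal
// but have different raw bit patterns.
final class SignedZeroBits {
    // Raw IEEE 754 bit pattern of a double, rendered as hex.
    static String bits(double d) {
        return Long.toHexString(Double.doubleToRawLongBits(d));
    }

    public static void main(String[] args) {
        System.out.println(0.0 == -0.0);   // numeric comparison
        System.out.println(bits(0.0));     // all bits zero
        System.out.println(bits(-0.0));    // only the sign bit set
    }
}
```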

[GitHub] spark issue #23054: [SPARK-26085][SQL] Key attribute of primitive type under...

2018-11-15 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/23054 Makes sense to me. This is a behavior change, right? Shall we write a migration guide?

[GitHub] spark pull request #23042: [SPARK-26070][SQL] add rule for implicit type coe...

2018-11-15 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23042#discussion_r234091858 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala --- @@ -138,6 +138,11 @@ object TypeCoercion

[GitHub] spark issue #23044: [SPARK-26073][SQL][FOLLOW-UP] remove invalid comment as ...

2018-11-15 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/23044 thanks, merging to master!

[GitHub] spark issue #23046: [SPARK-23207][SQL][FOLLOW-UP] Use `SQLConf.get.enableRad...

2018-11-15 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/23046 thanks, merging to master/2.4!

[GitHub] spark pull request #23046: [SPARK-23207][SQL][FOLLOW-UP] Use `SQLConf.get.en...

2018-11-15 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23046#discussion_r234088968 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/ShuffleExchangeExec.scala --- @@ -280,7 +280,7 @@ object ShuffleExchangeExec

[GitHub] spark issue #23040: [SPARK-26068][Core]ChunkedByteBufferInputStream should h...

2018-11-15 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/23040 ok to test

[GitHub] spark issue #23029: [SPARK-26055][CORE] InterfaceStability annotations shoul...

2018-11-15 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/23029 thanks, merging to master!

[GitHub] spark pull request #22989: [SPARK-25986][Build] Add rules to ban throw Error...

2018-11-15 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22989#discussion_r233821654 --- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/UnsafeAlignedOffset.java --- @@ -39,7 +39,9 @@ public static int getSize(Object object

[GitHub] spark issue #23043: [SPARK-26021][SQL] replace minus zero with zero in Unsaf...

2018-11-15 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/23043 This only works for attributes, not literals or intermediate results. Is there a better place to fix it?

[GitHub] spark issue #23040: [SPARK-26068][Core]ChunkedByteBufferInputStream should h...

2018-11-15 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/23040 It's good to fix a potential bug, can you add a unit test?

[GitHub] spark issue #23044: [SPARK-26073][SQL][FOLLOW-UP] remove invalid comment as ...

2018-11-15 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/23044 add to whitelist

[GitHub] spark issue #23044: [SPARK-26073][SQL][FOLLOW-UP] remove invalid comment as ...

2018-11-15 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/23044 ok to test

[GitHub] spark pull request #23044: [SPARK-26073][SQL][FOLLOW-UP] remove invalid comm...

2018-11-15 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23044#discussion_r233819066 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/GenerateOrdering.scala --- @@ -76,8 +76,6 @@ object

[GitHub] spark issue #23046: [SPARK-23207][SQL][FOLLOW-UP] Use `SQLConf.get.enableRad...

2018-11-15 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/23046 good catch! LGTM

[GitHub] spark issue #23035: [SPARK-26057][SQL] Transform also analyzed plans when de...

2018-11-15 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/23035 thanks, merging to master/2.4!

[GitHub] spark pull request #23042: [SPARK-26070][SQL] add rule for implicit type coe...

2018-11-15 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23042#discussion_r233816966 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala --- @@ -138,6 +138,11 @@ object TypeCoercion

[GitHub] spark issue #23045: [SPARK-26071][SQL] disallow map as map key

2018-11-15 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/23045 cc @gatorsmile @dongjoon-hyun @viirya

[GitHub] spark pull request #23045: [SPARK-26071][SQL] disallow map as map key

2018-11-15 Thread cloud-fan
GitHub user cloud-fan opened a pull request: https://github.com/apache/spark/pull/23045 [SPARK-26071][SQL] disallow map as map key ## What changes were proposed in this pull request? Due to implementation limitation, currently Spark can't compare or do equality check

[GitHub] spark pull request #22976: [SPARK-25974][SQL]Optimizes Generates bytecode fo...

2018-11-15 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22976#discussion_r233787377 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/GenerateOrdering.scala --- @@ -133,30 +126,26 @@ object

[GitHub] spark issue #22976: [SPARK-25974][SQL]Optimizes Generates bytecode for order...

2018-11-15 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22976 thanks, merging to master!

[GitHub] spark pull request #22989: [SPARK-25986][Build] Add rules to ban throw Error...

2018-11-14 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22989#discussion_r233706605 --- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/UnsafeAlignedOffset.java --- @@ -52,7 +54,9 @@ public static void putSize(Object object

[GitHub] spark pull request #22989: [SPARK-25986][Build] Add rules to ban throw Error...

2018-11-14 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22989#discussion_r233706517 --- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/UnsafeAlignedOffset.java --- @@ -39,7 +39,9 @@ public static int getSize(Object object

[GitHub] spark pull request #23035: [SPARK-26057][SQL] Transform also analyzed plans ...

2018-11-14 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23035#discussion_r233696401 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameSuite.scala --- @@ -2554,4 +2554,34 @@ class DataFrameSuite extends QueryTest

[GitHub] spark pull request #23035: [SPARK-26057][SQL] Transform also analyzed plans ...

2018-11-14 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23035#discussion_r233695765 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameSuite.scala --- @@ -2554,4 +2554,34 @@ class DataFrameSuite extends QueryTest

[GitHub] spark issue #23029: [SPARK-26055][CORE] InterfaceStability annotations shoul...

2018-11-14 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/23029 retest this please

[GitHub] spark pull request #23029: [SPARK-26055][CORE] InterfaceStability annotation...

2018-11-13 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23029#discussion_r233322079 --- Diff: common/tags/src/main/java/org/apache/spark/annotation/InterfaceStability.java --- @@ -17,7 +17,7 @@ package

[GitHub] spark pull request #23029: [SPARK-26055][CORE] InterfaceStability annotation...

2018-11-13 Thread cloud-fan
GitHub user cloud-fan opened a pull request: https://github.com/apache/spark/pull/23029 [SPARK-26055][CORE] InterfaceStability annotations should be retained at runtime ## What changes were proposed in this pull request? It's good to have annotations available at runtime
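
What "retained at runtime" means for an annotation can be shown with a small, self-contained sketch. The annotation and class names below are hypothetical and do not reproduce Spark's `InterfaceStability` definitions; the sketch only contrasts `RetentionPolicy.RUNTIME` with `RetentionPolicy.CLASS` (Java's default), which is recorded in the class file but is not visible through reflection.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

// Illustration only: hypothetical annotations, not Spark's actual ones.
final class RetentionDemo {
    @Retention(RetentionPolicy.RUNTIME)
    @interface RuntimeMarker {}

    @Retention(RetentionPolicy.CLASS)
    @interface ClassOnlyMarker {}

    @RuntimeMarker
    static class WithRuntime {}

    @ClassOnlyMarker
    static class WithClassOnly {}

    public static void main(String[] args) {
        // Reflection sees only annotations with RUNTIME retention.
        System.out.println(WithRuntime.class.isAnnotationPresent(RuntimeMarker.class));
        System.out.println(WithClassOnly.class.isAnnotationPresent(ClassOnlyMarker.class));
    }
}
```

With runtime retention, tools that inspect loaded classes reflectively can read the stability marker; with class-only retention they cannot.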

[GitHub] spark issue #23029: [SPARK-26055][CORE] InterfaceStability annotations shoul...

2018-11-13 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/23029 cc @rxin @srowen @vanzin @gatorsmile

[GitHub] spark pull request #21957: [SPARK-24994][SQL] When the data type of the fiel...

2018-11-13 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/21957#discussion_r233287374 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/DataSourceScanExec.scala --- @@ -269,7 +269,8 @@ case class FileSourceScanExec

[GitHub] spark issue #22518: [SPARK-25482][SQL] Avoid pushdown of subqueries to data ...

2018-11-13 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22518 thanks, merging to master!

[GitHub] spark issue #22961: [SPARK-25947][SQL] Reduce memory usage in ShuffleExchang...

2018-11-13 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22961 thanks, merging to master!

[GitHub] spark pull request #22962: [SPARK-25921][PySpark] Fix barrier task run witho...

2018-11-13 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22962#discussion_r233117222 --- Diff: python/pyspark/taskcontext.py --- @@ -147,8 +147,8 @@ def __init__(self): @classmethod def _getOrCreate(cls

[GitHub] spark issue #22961: [SPARK-25947][SQL] Reduce memory usage in ShuffleExchang...

2018-11-13 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22961 cool thanks! LGTM

[GitHub] spark pull request #22962: [SPARK-25921][PySpark] Fix barrier task run witho...

2018-11-13 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22962#discussion_r233033846 --- Diff: python/pyspark/taskcontext.py --- @@ -147,8 +147,8 @@ def __init__(self): @classmethod def _getOrCreate(cls

[GitHub] spark pull request #23004: [SPARK-26004][SQL] InMemoryTable support StartsWi...

2018-11-13 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23004#discussion_r233033597 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/columnar/InMemoryTableScanExec.scala --- @@ -237,6 +237,13 @@ case class

[GitHub] spark pull request #22518: [SPARK-25482][SQL] Avoid pushdown of subqueries t...

2018-11-13 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22518#discussion_r233032721 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SubquerySuite.scala --- @@ -1268,4 +1269,16 @@ class SubquerySuite extends QueryTest

[GitHub] spark pull request #22518: [SPARK-25482][SQL] Avoid pushdown of subqueries t...

2018-11-13 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22518#discussion_r233032650 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/subquery.scala --- @@ -22,7 +22,7 @@ import scala.collection.mutable.ArrayBuffer

[GitHub] spark issue #22962: [SPARK-25921][PySpark] Fix barrier task run without Barr...

2018-11-13 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22962 LGTM, merging to master/2.4!

[GitHub] spark pull request #23004: [SPARK-26004][SQL] InMemoryTable support StartsWi...

2018-11-13 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23004#discussion_r232945864 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/columnar/InMemoryTableScanExec.scala --- @@ -237,6 +237,13 @@ case class

[GitHub] spark pull request #22944: [SPARK-25942][SQL] Aggregate expressions shouldn'...

2018-11-13 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22944#discussion_r232941317 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DatasetSuite.scala --- @@ -1556,6 +1556,20 @@ class DatasetSuite extends QueryTest

[GitHub] spark issue #22977: [SPARK-26030][BUILD] Bump previousSparkVersion in MimaBu...

2018-11-12 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22977 Since this PR only touches MiMa, and Jenkins has already passed the MiMa check, I'm going to merge it to master, thanks

[GitHub] spark issue #22518: [SPARK-25482][SQL] Avoid pushdown of subqueries to data ...

2018-11-12 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22518 BTW can you include a simple benchmark to show this problem? e.g. just run a query in spark-shell, and post the result before and after this PR

[GitHub] spark issue #22518: [SPARK-25482][SQL] Avoid pushdown of subqueries to data ...

2018-11-12 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22518 I'd like to merge this simple PR first, to address the performance problem (unnecessary subquery execution). Let's create a new ticket for subquery filter pushing to data source

[GitHub] spark pull request #22518: [SPARK-25482][SQL] Avoid pushdown of subqueries t...

2018-11-12 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22518#discussion_r232906707 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/PruneFileSourcePartitions.scala --- @@ -47,7 +47,8 @@ private[sql] object

[GitHub] spark pull request #22518: [SPARK-25482][SQL] Avoid pushdown of subqueries t...

2018-11-12 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22518#discussion_r232906743 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SubquerySuite.scala --- @@ -1268,4 +1269,16 @@ class SubquerySuite extends QueryTest

[GitHub] spark pull request #22518: [SPARK-25482][SQL] Avoid pushdown of subqueries t...

2018-11-12 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22518#discussion_r232906652 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileSourceStrategy.scala --- @@ -155,15 +155,14 @@ object

[GitHub] spark pull request #22961: [SPARK-25947][SQL] Reduce memory usage in Shuffle...

2018-11-12 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22961#discussion_r232906123 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/ShuffleExchangeExec.scala --- @@ -214,13 +214,22 @@ object ShuffleExchangeExec

[GitHub] spark pull request #22944: [SPARK-25942][SQL] Aggregate expressions shouldn'...

2018-11-12 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22944#discussion_r232905784 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DatasetSuite.scala --- @@ -1556,6 +1556,20 @@ class DatasetSuite extends QueryTest

[GitHub] spark issue #23002: [SPARK-26003] Improve SQLAppStatusListener.aggregateMetr...

2018-11-12 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/23002 thanks, merging to master!

[GitHub] spark pull request #22977: [SPARK-26030][BUILD] Bump previousSparkVersion in...

2018-11-12 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22977#discussion_r232886260 --- Diff: project/MimaExcludes.scala --- @@ -164,7 +212,50 @@ object MimaExcludes { ProblemFilters.exclude[InheritedNewAbstractMethodProblem

[GitHub] spark issue #23015: [SPARK-26029][BUILD][2.4] Bump previousSparkVersion in M...

2018-11-12 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/23015 thanks, merging to 2.4! Since we have more violations in the master branch, I did not forward-port it, and I'll cherry-pick it in another PR

[GitHub] spark pull request #23015: [SPARK-26029][BUILD][2.4] Bump previousSparkVersi...

2018-11-12 Thread cloud-fan
Github user cloud-fan closed the pull request at: https://github.com/apache/spark/pull/23015

[GitHub] spark pull request #22518: [SPARK-25482][SQL] Avoid pushdown of subqueries t...

2018-11-12 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22518#discussion_r232737458 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SubquerySuite.scala --- @@ -1268,4 +1269,16 @@ class SubquerySuite extends QueryTest

[GitHub] spark pull request #22518: [SPARK-25482][SQL] Avoid pushdown of subqueries t...

2018-11-12 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22518#discussion_r232729788 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SubquerySuite.scala --- @@ -1268,4 +1269,16 @@ class SubquerySuite extends QueryTest

[GitHub] spark pull request #22518: [SPARK-25482][SQL] ReuseSubquery can be useless w...

2018-11-12 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22518#discussion_r232720903 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SubquerySuite.scala --- @@ -1268,4 +1269,16 @@ class SubquerySuite extends QueryTest

[GitHub] spark pull request #22944: [SPARK-25942][SQL] Aggregate expressions shouldn'...

2018-11-12 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22944#discussion_r232699342 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DatasetSuite.scala --- @@ -1556,6 +1556,20 @@ class DatasetSuite extends QueryTest

[GitHub] spark pull request #22944: [SPARK-25942][SQL] Aggregate expressions shouldn'...

2018-11-12 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22944#discussion_r232698607 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DatasetSuite.scala --- @@ -1556,6 +1556,20 @@ class DatasetSuite extends QueryTest

[GitHub] spark issue #23002: [SPARK-26003] Improve SQLAppStatusListener.aggregateMetr...

2018-11-12 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/23002 LGTM, also cc @gengliangwang

[GitHub] spark pull request #22518: [SPARK-25482][SQL] ReuseSubquery can be useless w...

2018-11-12 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22518#discussion_r232688360 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SubquerySuite.scala --- @@ -1268,4 +1269,16 @@ class SubquerySuite extends QueryTest

[GitHub] spark pull request #22518: [SPARK-25482][SQL] ReuseSubquery can be useless w...

2018-11-12 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22518#discussion_r232687865 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SubquerySuite.scala --- @@ -1268,4 +1269,16 @@ class SubquerySuite extends QueryTest

[GitHub] spark pull request #22518: [SPARK-25482][SQL] ReuseSubquery can be useless w...

2018-11-12 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22518#discussion_r232668569 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SubquerySuite.scala --- @@ -1268,4 +1269,16 @@ class SubquerySuite extends QueryTest

[GitHub] spark pull request #22944: [SPARK-25942][SQL] Aggregate expressions shouldn'...

2018-11-12 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22944#discussion_r232666996 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DatasetSuite.scala --- @@ -1556,6 +1556,20 @@ class DatasetSuite extends QueryTest

[GitHub] spark pull request #22944: [SPARK-25942][SQL] Aggregate expressions shouldn'...

2018-11-12 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22944#discussion_r232665302 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DatasetSuite.scala --- @@ -1556,6 +1556,20 @@ class DatasetSuite extends QueryTest

[GitHub] spark pull request #23015: [BUILD][2.4] Bump previousSparkVersion in MimaBui...

2018-11-12 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23015#discussion_r232661121 --- Diff: project/MimaExcludes.scala --- @@ -105,7 +105,50 @@ object MimaExcludes { ProblemFilters.exclude[InheritedNewAbstractMethodProblem

[GitHub] spark issue #23015: [BUILD][2.4] Bump previousSparkVersion in MimaBuild.scal...

2018-11-12 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/23015 cc @srowen @gatorsmile

[GitHub] spark pull request #23015: [BUILD][2.4] Bump previousSparkVersion in MimaBui...

2018-11-12 Thread cloud-fan
GitHub user cloud-fan opened a pull request: https://github.com/apache/spark/pull/23015 [BUILD][2.4] Bump previousSparkVersion in MimaBuild.scala to be 2.3.0 ## What changes were proposed in this pull request? Although it's a little late, we should still update mima

[GitHub] spark pull request #22200: [SPARK-25208][SQL] Loosen Cast.forceNullable for ...

2018-11-12 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22200#discussion_r232572693 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala --- @@ -154,6 +154,15 @@ object Cast

[GitHub] spark issue #21732: [SPARK-24762][SQL] Enable Option of Product encoders

2018-11-12 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/21732 LGTM except a few comments, good job!

[GitHub] spark issue #22961: [SPARK-25947][SQL] Reduce memory usage in ShuffleExchang...

2018-11-12 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22961 do you have some benchmark numbers?

[GitHub] spark issue #22961: [SPARK-25947][SQL] Reduce memory usage in ShuffleExchang...

2018-11-12 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22961 ok to test

[GitHub] spark pull request #22961: [SPARK-25947][SQL] Reduce memory usage in Shuffle...

2018-11-12 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22961#discussion_r232564430 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/ShuffleExchangeExec.scala --- @@ -214,13 +214,22 @@ object ShuffleExchangeExec

[GitHub] spark pull request #21732: [SPARK-24762][SQL] Enable Option of Product encod...

2018-11-12 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/21732#discussion_r232563132 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DatasetSuite.scala --- @@ -1556,6 +1547,69 @@ class DatasetSuite extends QueryTest

[GitHub] spark pull request #21732: [SPARK-24762][SQL] Enable Option of Product encod...

2018-11-12 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/21732#discussion_r232562929 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DatasetSuite.scala --- @@ -1556,6 +1547,69 @@ class DatasetSuite extends QueryTest

[GitHub] spark pull request #21732: [SPARK-24762][SQL] Enable Option of Product encod...

2018-11-12 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/21732#discussion_r232561288 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DatasetSuite.scala --- @@ -1556,6 +1547,69 @@ class DatasetSuite extends QueryTest

[GitHub] spark pull request #21732: [SPARK-24762][SQL] Enable Option of Product encod...

2018-11-12 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/21732#discussion_r232560471 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/encoders/ExpressionEncoder.scala --- @@ -198,7 +189,7 @@ case class

[GitHub] spark pull request #22518: [SPARK-25482][SQL] ReuseSubquery can be useless w...

2018-11-11 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22518#discussion_r232558384 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SubquerySuite.scala --- @@ -1268,4 +1269,16 @@ class SubquerySuite extends QueryTest

[GitHub] spark issue #22887: [SPARK-25880][CORE] user set's hadoop conf should not ov...

2018-11-11 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22887 looks reasonable, cc @gatorsmile

[GitHub] spark pull request #22944: [SPARK-25942][SQL] Aggregate expressions shouldn'...

2018-11-11 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22944#discussion_r232556359 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DatasetSuite.scala --- @@ -1556,6 +1556,20 @@ class DatasetSuite extends QueryTest

[GitHub] spark issue #22429: [SPARK-25440][SQL] Dumping query execution info to a fil...

2018-11-11 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22429 This is hard to review, do you mean we should add `maxFields: Option[Int]` to all the string related methods

[GitHub] spark issue #22976: [SPARK-25974][SQL]Optimizes Generates bytecode for order...

2018-11-11 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22976 LGTM except one comment, cc @rednaxelafx
