[GitHub] spark issue #16476: [SPARK-19084][SQL] Implement expression field

2018-07-16 Thread gczsjdy
Github user gczsjdy commented on the issue: https://github.com/apache/spark/pull/16476 @HyukjinKwon Done, thanks : ) Ping @maropu --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

[GitHub] spark issue #16476: [SPARK-19084][SQL] Implement expression field

2018-06-26 Thread gczsjdy
Github user gczsjdy commented on the issue: https://github.com/apache/spark/pull/16476 @maropu Sure, I will update it this week.

[GitHub] spark pull request #19755: [SPARK-22524][SQL] Subquery shows reused on UI SQ...

2018-06-10 Thread gczsjdy
Github user gczsjdy closed the pull request at: https://github.com/apache/spark/pull/19755

[GitHub] spark issue #20809: [SPARK-23667][CORE] Better scala version check

2018-05-20 Thread gczsjdy
Github user gczsjdy commented on the issue: https://github.com/apache/spark/pull/20809 @vanzin Sorry for the late reply. According to the call stack, this is the first place where `getScalaVersion` is called; `isTest` is true, so we go into that path. This happens in travis

[GitHub] spark issue #20809: [SPARK-23667][CORE] Better scala version check

2018-05-13 Thread gczsjdy
Github user gczsjdy commented on the issue: https://github.com/apache/spark/pull/20809 @vanzin Sorry, but I will update it next week, thanks.

[GitHub] spark pull request #21022: Fpga acc

2018-04-10 Thread gczsjdy
Github user gczsjdy closed the pull request at: https://github.com/apache/spark/pull/21022

[GitHub] spark issue #21022: Fpga acc

2018-04-10 Thread gczsjdy
Github user gczsjdy commented on the issue: https://github.com/apache/spark/pull/21022 @HyukjinKwon Sorry, it is.

[GitHub] spark pull request #21022: Fpga acc

2018-04-10 Thread gczsjdy
GitHub user gczsjdy opened a pull request: https://github.com/apache/spark/pull/21022 Fpga acc You can merge this pull request into a Git repository by running: $ git pull https://github.com/gczsjdy/spark fpga_acc Alternatively you can review and apply these changes

[GitHub] spark pull request #20844: [SPARK-23707][SQL] Don't need shuffle exchange wi...

2018-03-31 Thread gczsjdy
Github user gczsjdy commented on a diff in the pull request: https://github.com/apache/spark/pull/20844#discussion_r178437551 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/ConfigBehaviorSuite.scala --- @@ -39,7 +39,7 @@ class ConfigBehaviorSuite extends QueryTest

[GitHub] spark pull request #20844: [SPARK-23707][SQL] Don't need shuffle exchange wi...

2018-03-26 Thread gczsjdy
Github user gczsjdy commented on a diff in the pull request: https://github.com/apache/spark/pull/20844#discussion_r177311524 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/ConfigBehaviorSuite.scala --- @@ -39,7 +39,7 @@ class ConfigBehaviorSuite extends QueryTest

[GitHub] spark issue #20809: [SPARK-23667][CORE] Better scala version check

2018-03-19 Thread gczsjdy
Github user gczsjdy commented on the issue: https://github.com/apache/spark/pull/20809 @vanzin Thanks. : ) I am testing using [OAP](https://github.com/Intel-bigdata/OAP) with pre-built Spark on `LocalClusterMode`. This is on travis and no SPARK_HOME is set. The `mvn test

[GitHub] spark issue #20809: [SPARK-23667][CORE] Better scala version check

2018-03-16 Thread gczsjdy
Github user gczsjdy commented on the issue: https://github.com/apache/spark/pull/20809 @viirya Yes, but this only helps people who will investigate the Spark code, and it also requires manual effort. Isn't it better if we get this automatically

[GitHub] spark issue #20809: [SPARK-23667][CORE] Better scala version check

2018-03-15 Thread gczsjdy
Github user gczsjdy commented on the issue: https://github.com/apache/spark/pull/20809 cc @cloud-fan @viirya

[GitHub] spark pull request #20809: [CORE] Better scala version check

2018-03-13 Thread gczsjdy
GitHub user gczsjdy opened a pull request: https://github.com/apache/spark/pull/20809 [CORE] Better scala version check ## What changes were proposed in this pull request? In some cases, when an outer project uses pre-built Spark as a dependency, `getScalaVersion` will fail due

[GitHub] spark pull request #20303: [SPARK-23128][SQL] A new approach to do adaptive ...

2018-01-26 Thread gczsjdy
Github user gczsjdy commented on a diff in the pull request: https://github.com/apache/spark/pull/20303#discussion_r164089610 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/QueryStage.scala --- @@ -0,0 +1,222 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request #19862: [SPARK-22671][SQL] Make SortMergeJoin shuffle rea...

2018-01-24 Thread gczsjdy
Github user gczsjdy closed the pull request at: https://github.com/apache/spark/pull/19862

[GitHub] spark issue #19862: [SPARK-22671][SQL] Make SortMergeJoin shuffle read less ...

2018-01-24 Thread gczsjdy
Github user gczsjdy commented on the issue: https://github.com/apache/spark/pull/19862 @cloud-fan Ok, thanks for your time, I will close this.

[GitHub] spark pull request #20135: [SPARK-22937][SQL] SQL elt output binary for bina...

2018-01-02 Thread gczsjdy
Github user gczsjdy commented on a diff in the pull request: https://github.com/apache/spark/pull/20135#discussion_r159353652 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala --- @@ -271,33 +271,45 @@ case class ConcatWs

[GitHub] spark pull request #20135: [SPARK-22937][SQL] SQL elt output binary for bina...

2018-01-02 Thread gczsjdy
Github user gczsjdy commented on a diff in the pull request: https://github.com/apache/spark/pull/20135#discussion_r159235432 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala --- @@ -684,6 +685,34 @@ object TypeCoercion

[GitHub] spark pull request #20135: [SPARK-22937][SQL] SQL elt output binary for bina...

2018-01-02 Thread gczsjdy
Github user gczsjdy commented on a diff in the pull request: https://github.com/apache/spark/pull/20135#discussion_r159234455 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala --- @@ -271,33 +271,45 @@ case class ConcatWs

[GitHub] spark issue #20010: [SPARK-22826][SQL] findWiderTypeForTwo Fails over Struct...

2017-12-28 Thread gczsjdy
Github user gczsjdy commented on the issue: https://github.com/apache/spark/pull/20010 This doesn't seem to be a regular error? @bdrillard Maybe you can push a commit and trigger the test again.

[GitHub] spark pull request #20099: [SPARK-22916][SQL] shouldn't bias towards build r...

2017-12-28 Thread gczsjdy
Github user gczsjdy commented on a diff in the pull request: https://github.com/apache/spark/pull/20099#discussion_r158961184 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkStrategies.scala --- @@ -158,45 +158,65 @@ abstract class SparkStrategies extends

[GitHub] spark pull request #20099: [SPARK-22916][SQL] shouldn't bias towards build r...

2017-12-28 Thread gczsjdy
Github user gczsjdy commented on a diff in the pull request: https://github.com/apache/spark/pull/20099#discussion_r158961453 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkStrategies.scala --- @@ -158,45 +158,65 @@ abstract class SparkStrategies extends

[GitHub] spark issue #20043: [SPARK-22856][SQL] Add wrappers for codegen output and n...

2017-12-24 Thread gczsjdy
Github user gczsjdy commented on the issue: https://github.com/apache/spark/pull/20043 @viirya Thanks a lot. So a local variable corresponds to `VariableValue` and `StatementValue`? IIUC, `VariableValue` is a value that depends on something else, but what is `StatementValue`? Maybe

[GitHub] spark issue #20067: [SPARK-22894][SQL] DateTimeOperations should accept SQL ...

2017-12-24 Thread gczsjdy
Github user gczsjdy commented on the issue: https://github.com/apache/spark/pull/20067 LGTM

[GitHub] spark pull request #20010: [SPARK-22826][SQL] findWiderTypeForTwo Fails over...

2017-12-21 Thread gczsjdy
Github user gczsjdy commented on a diff in the pull request: https://github.com/apache/spark/pull/20010#discussion_r158440114 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala --- @@ -158,11 +169,6 @@ object TypeCoercion

[GitHub] spark pull request #20010: [SPARK-22826][SQL] findWiderTypeForTwo Fails over...

2017-12-21 Thread gczsjdy
Github user gczsjdy commented on a diff in the pull request: https://github.com/apache/spark/pull/20010#discussion_r158440005 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala --- @@ -158,11 +213,8 @@ object TypeCoercion

[GitHub] spark issue #20043: [SPARK-22856][SQL] Add wrappers for codegen output and n...

2017-12-21 Thread gczsjdy
Github user gczsjdy commented on the issue: https://github.com/apache/spark/pull/20043 @viirya Sorry, I didn't quite understand: how do we easily know the value by adding wrappers? Could you explain a little bit

[GitHub] spark pull request #20039: [SPARK-22850][core] Ensure queued events are deli...

2017-12-21 Thread gczsjdy
Github user gczsjdy commented on a diff in the pull request: https://github.com/apache/spark/pull/20039#discussion_r158424211 --- Diff: core/src/main/scala/org/apache/spark/scheduler/LiveListenerBus.scala --- @@ -124,13 +127,19 @@ private[spark] class LiveListenerBus(conf

[GitHub] spark pull request #20039: [SPARK-22850][core] Ensure queued events are deli...

2017-12-21 Thread gczsjdy
Github user gczsjdy commented on a diff in the pull request: https://github.com/apache/spark/pull/20039#discussion_r158309818 --- Diff: core/src/main/scala/org/apache/spark/scheduler/LiveListenerBus.scala --- @@ -124,13 +127,19 @@ private[spark] class LiveListenerBus(conf

[GitHub] spark pull request #19977: [SPARK-22771][SQL] Concatenate binary inputs into...

2017-12-19 Thread gczsjdy
Github user gczsjdy commented on a diff in the pull request: https://github.com/apache/spark/pull/19977#discussion_r157939820 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala --- @@ -48,17 +48,26 @@ import

[GitHub] spark pull request #20010: [SPARK-22826][SQL] findWiderTypeForTwo Fails over...

2017-12-19 Thread gczsjdy
Github user gczsjdy commented on a diff in the pull request: https://github.com/apache/spark/pull/20010#discussion_r157928910 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala --- @@ -99,6 +102,33 @@ object TypeCoercion

[GitHub] spark pull request #20010: [SPARK-22826][SQL] findWiderTypeForTwo Fails over...

2017-12-19 Thread gczsjdy
Github user gczsjdy commented on a diff in the pull request: https://github.com/apache/spark/pull/20010#discussion_r157929494 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala --- @@ -158,11 +169,6 @@ object TypeCoercion

[GitHub] spark pull request #20010: [SPARK-22826][SQL] findWiderTypeForTwo Fails over...

2017-12-19 Thread gczsjdy
Github user gczsjdy commented on a diff in the pull request: https://github.com/apache/spark/pull/20010#discussion_r157926754 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala --- @@ -99,6 +99,17 @@ object TypeCoercion { case

[GitHub] spark pull request #20010: [SPARK-22826][SQL] findWiderTypeForTwo Fails over...

2017-12-19 Thread gczsjdy
Github user gczsjdy commented on a diff in the pull request: https://github.com/apache/spark/pull/20010#discussion_r157926722 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala --- @@ -158,11 +169,6 @@ object TypeCoercion

[GitHub] spark pull request #19977: [SPARK-22771][SQL] Concatenate binary inputs into...

2017-12-19 Thread gczsjdy
Github user gczsjdy commented on a diff in the pull request: https://github.com/apache/spark/pull/19977#discussion_r157819483 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala --- @@ -48,17 +48,26 @@ import

[GitHub] spark pull request #19977: [SPARK-22771][SQL] Concatenate binary inputs into...

2017-12-19 Thread gczsjdy
Github user gczsjdy commented on a diff in the pull request: https://github.com/apache/spark/pull/19977#discussion_r157793004 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala --- @@ -1035,6 +1035,12 @@ object SQLConf { .booleanConf

[GitHub] spark issue #19977: [SPARK-22771][SQL] Concatenate binary inputs into a bina...

2017-12-19 Thread gczsjdy
Github user gczsjdy commented on the issue: https://github.com/apache/spark/pull/19977 You mean the MySQL results are unexpected? I think it's common for these databases to behave differently, while Spark mainly follows Hive

[GitHub] spark pull request #20010: [SPARK-22826][SQL] findWiderTypeForTwo Fails over...

2017-12-19 Thread gczsjdy
Github user gczsjdy commented on a diff in the pull request: https://github.com/apache/spark/pull/20010#discussion_r157780706 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala --- @@ -99,6 +99,17 @@ object TypeCoercion { case

[GitHub] spark pull request #20010: [SPARK-22826][SQL] findWiderTypeForTwo Fails over...

2017-12-19 Thread gczsjdy
Github user gczsjdy commented on a diff in the pull request: https://github.com/apache/spark/pull/20010#discussion_r157775323 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala --- @@ -99,6 +99,17 @@ object TypeCoercion { case

[GitHub] spark pull request #20010: [SPARK-22826][SQL] findWiderTypeForTwo Fails over...

2017-12-19 Thread gczsjdy
Github user gczsjdy commented on a diff in the pull request: https://github.com/apache/spark/pull/20010#discussion_r157697044 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala --- @@ -99,6 +99,17 @@ object TypeCoercion { case

[GitHub] spark pull request #20010: [SPARK-22826][SQL] findWiderTypeForTwo Fails over...

2017-12-19 Thread gczsjdy
Github user gczsjdy commented on a diff in the pull request: https://github.com/apache/spark/pull/20010#discussion_r157696626 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala --- @@ -99,6 +99,17 @@ object TypeCoercion { case

[GitHub] spark pull request #20010: [SPARK-22826][SQL] findWiderTypeForTwo Fails over...

2017-12-19 Thread gczsjdy
Github user gczsjdy commented on a diff in the pull request: https://github.com/apache/spark/pull/20010#discussion_r157695599 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala --- @@ -99,6 +99,17 @@ object TypeCoercion { case

[GitHub] spark pull request #20015: [SPARK-22829] Add new built-in function date_trun...

2017-12-19 Thread gczsjdy
Github user gczsjdy commented on a diff in the pull request: https://github.com/apache/spark/pull/20015#discussion_r157686437 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala --- @@ -1295,87 +1295,184 @@ case class

[GitHub] spark pull request #20015: [SPARK-22829] Add new built-in function date_trun...

2017-12-18 Thread gczsjdy
Github user gczsjdy commented on a diff in the pull request: https://github.com/apache/spark/pull/20015#discussion_r157676669 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala --- @@ -1295,87 +1295,184 @@ case class

[GitHub] spark pull request #20015: [SPARK-22829] Add new built-in function date_trun...

2017-12-18 Thread gczsjdy
Github user gczsjdy commented on a diff in the pull request: https://github.com/apache/spark/pull/20015#discussion_r157678588 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala --- @@ -1295,87 +1295,184 @@ case class

[GitHub] spark pull request #20015: [SPARK-22829] Add new built-in function date_trun...

2017-12-18 Thread gczsjdy
Github user gczsjdy commented on a diff in the pull request: https://github.com/apache/spark/pull/20015#discussion_r157680290 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeUtils.scala --- @@ -944,9 +954,16 @@ object DateTimeUtils { date

[GitHub] spark pull request #20015: [SPARK-22829] Add new built-in function date_trun...

2017-12-18 Thread gczsjdy
Github user gczsjdy commented on a diff in the pull request: https://github.com/apache/spark/pull/20015#discussion_r157674840 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala --- @@ -1295,87 +1295,184 @@ case class

[GitHub] spark issue #19862: [SPARK-22671][SQL] Make SortMergeJoin shuffle read less ...

2017-12-13 Thread gczsjdy
Github user gczsjdy commented on the issue: https://github.com/apache/spark/pull/19862 cc @cloud-fan @hvanhovell @viirya

[GitHub] spark issue #19862: [SPARK-22671][SQL] Make SortMergeJoin shuffle read less ...

2017-12-13 Thread gczsjdy
Github user gczsjdy commented on the issue: https://github.com/apache/spark/pull/19862 This is actually a small change, but it can provide a non-trivial optimization for users who don't use `WholeStageCodegen`; for example, there are still some users who use Spark versions below 2.0

[GitHub] spark pull request #19862: [SPARK-22671][SQL] Make SortMergeJoin shuffle rea...

2017-12-12 Thread gczsjdy
Github user gczsjdy commented on a diff in the pull request: https://github.com/apache/spark/pull/19862#discussion_r156581645 --- Diff: sql/catalyst/src/main/java/org/apache/spark/sql/execution/UnsafeExternalRowSorter.java --- @@ -159,6 +154,12 @@ public boolean hasNext

[GitHub] spark pull request #19862: [SPARK-22671][SQL] Make SortMergeJoin shuffle rea...

2017-12-04 Thread gczsjdy
Github user gczsjdy commented on a diff in the pull request: https://github.com/apache/spark/pull/19862#discussion_r154635850 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/joins/SortMergeJoinExec.scala --- @@ -699,39 +700,44 @@ private[joins] class

[GitHub] spark pull request #19862: [WIP][SPARK-22671][SQL] Make SortMergeJoin shuffl...

2017-12-04 Thread gczsjdy
Github user gczsjdy commented on a diff in the pull request: https://github.com/apache/spark/pull/19862#discussion_r154581844 --- Diff: sql/catalyst/src/main/java/org/apache/spark/sql/execution/UnsafeExternalRowSorter.java --- @@ -159,6 +154,12 @@ public boolean hasNext

[GitHub] spark pull request #19862: [SPARK-22671][SQL] Make SortMergeJoin shuffle rea...

2017-12-03 Thread gczsjdy
Github user gczsjdy commented on a diff in the pull request: https://github.com/apache/spark/pull/19862#discussion_r154563897 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/joins/SortMergeJoinExec.scala --- @@ -674,8 +674,9 @@ private[joins] class

[GitHub] spark pull request #19862: [SPARK-22671][SQL] Make SortMergeJoin shuffle rea...

2017-12-03 Thread gczsjdy
Github user gczsjdy commented on a diff in the pull request: https://github.com/apache/spark/pull/19862#discussion_r154564327 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/joins/SortMergeJoinExec.scala --- @@ -699,39 +700,44 @@ private[joins] class

[GitHub] spark pull request #19862: [SPARK-22671][SQL] Make SortMergeJoin shuffle rea...

2017-12-03 Thread gczsjdy
Github user gczsjdy commented on a diff in the pull request: https://github.com/apache/spark/pull/19862#discussion_r154564488 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/joins/SortMergeJoinExec.scala --- @@ -699,39 +700,44 @@ private[joins] class

[GitHub] spark issue #19862: [SPARK-22671][SQL] Make SortMergeJoin read less data whe...

2017-12-01 Thread gczsjdy
Github user gczsjdy commented on the issue: https://github.com/apache/spark/pull/19862 cc @cloud-fan @viirya @ConeyLiu

[GitHub] spark pull request #19862: Make SortMergeJoin read less data when wholeStage...

2017-12-01 Thread gczsjdy
GitHub user gczsjdy opened a pull request: https://github.com/apache/spark/pull/19862 Make SortMergeJoin read less data when wholeStageCodegen is off ## What changes were proposed in this pull request? In SortMergeJoin(with wholeStageCodegen), an optimization already exists

[GitHub] spark pull request #19823: [SPARK-22601][SQL] Data load is getting displayed...

2017-11-27 Thread gczsjdy
Github user gczsjdy commented on a diff in the pull request: https://github.com/apache/spark/pull/19823#discussion_r153131202 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/tables.scala --- @@ -341,6 +341,12 @@ case class LoadDataCommand

[GitHub] spark pull request #19823: [SPARK-22601][SQL] Data load is getting displayed...

2017-11-27 Thread gczsjdy
Github user gczsjdy commented on a diff in the pull request: https://github.com/apache/spark/pull/19823#discussion_r153127637 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala --- @@ -2624,7 +2624,13 @@ class SQLQuerySuite extends QueryTest

[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...

2017-11-26 Thread gczsjdy
Github user gczsjdy commented on the issue: https://github.com/apache/spark/pull/19788 Can we just add the `ContinuousShuffleBlockId` without adding the new conf `spark.shuffle.continuousFetch`? While in classes related to shuffle read like `ShuffleBlockFetcherIterator`, we also pattern
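
The comment above is cut off, but assuming it refers to pattern matching on block ID types in the shuffle-read path, a minimal sketch of that idea could look like the following. Every name here (`SketchBlockId`, `SingleShuffleBlock`, `ContiguousShuffleBlocks`, `partitionsCovered`) and every field is hypothetical and made up for illustration; none of it is Spark's actual `BlockId` hierarchy or the code in the pull request.

```scala
// Hypothetical block-id types for illustration only; names and fields are assumptions,
// not Spark's real BlockId classes.
sealed trait SketchBlockId

// One block holding a single reduce partition of one map output.
final case class SingleShuffleBlock(shuffleId: Int, mapId: Int, reduceId: Int)
  extends SketchBlockId

// One block covering `length` adjacent reduce partitions of one map output.
final case class ContiguousShuffleBlocks(
    shuffleId: Int,
    mapId: Int,
    startReduceId: Int,
    length: Int)
  extends SketchBlockId

object BlockIdMatchSketch {
  /** Returns how many reduce partitions a block id covers, chosen by pattern match. */
  def partitionsCovered(id: SketchBlockId): Int = id match {
    case SingleShuffleBlock(_, _, _) => 1
    case ContiguousShuffleBlocks(_, _, _, length) => length
  }
}
```

Because the trait is sealed, the compiler can warn when a match forgets one of the cases, which is one reason a reader might branch on the block ID type itself rather than on a separate configuration flag.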

[GitHub] spark issue #19764: [SPARK-22539][SQL] Add second order for rangepartitioner...

2017-11-26 Thread gczsjdy
Github user gczsjdy commented on the issue: https://github.com/apache/spark/pull/19764 @caneGuy Can you give a specific example to illustrate your change? Maybe former partition result & later partition re

[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...

2017-11-26 Thread gczsjdy
Github user gczsjdy commented on the issue: https://github.com/apache/spark/pull/19788 What is the `external shuffle service` here? Can you explain a little bit?

[GitHub] spark pull request #19788: [SPARK-9853][Core] Optimize shuffle fetch of cont...

2017-11-26 Thread gczsjdy
Github user gczsjdy commented on a diff in the pull request: https://github.com/apache/spark/pull/19788#discussion_r153117088 --- Diff: core/src/main/scala/org/apache/spark/storage/BlockId.scala --- @@ -116,8 +117,8 @@ object BlockId { def apply(name: String): BlockId = name

[GitHub] spark pull request #19763: [SPARK-22537][core] Aggregation of map output sta...

2017-11-24 Thread gczsjdy
Github user gczsjdy commented on a diff in the pull request: https://github.com/apache/spark/pull/19763#discussion_r152921091 --- Diff: core/src/main/scala/org/apache/spark/MapOutputTracker.scala --- @@ -472,15 +475,66 @@ private[spark] class MapOutputTrackerMaster

[GitHub] spark pull request #19763: [SPARK-22537][core] Aggregation of map output sta...

2017-11-24 Thread gczsjdy
Github user gczsjdy commented on a diff in the pull request: https://github.com/apache/spark/pull/19763#discussion_r152920483 --- Diff: core/src/main/scala/org/apache/spark/internal/config/package.scala --- @@ -485,4 +485,13 @@ package object config { "

[GitHub] spark pull request #19763: [SPARK-22537][core] Aggregation of map output sta...

2017-11-23 Thread gczsjdy
Github user gczsjdy commented on a diff in the pull request: https://github.com/apache/spark/pull/19763#discussion_r152912084 --- Diff: core/src/main/scala/org/apache/spark/internal/config/package.scala --- @@ -485,4 +485,13 @@ package object config { "

[GitHub] spark pull request #19763: [SPARK-22537][core] Aggregation of map output sta...

2017-11-23 Thread gczsjdy
Github user gczsjdy commented on a diff in the pull request: https://github.com/apache/spark/pull/19763#discussion_r152911325 --- Diff: core/src/main/scala/org/apache/spark/internal/config/package.scala --- @@ -485,4 +485,13 @@ package object config { "

[GitHub] spark pull request #19763: [SPARK-22537][core] Aggregation of map output sta...

2017-11-23 Thread gczsjdy
Github user gczsjdy commented on a diff in the pull request: https://github.com/apache/spark/pull/19763#discussion_r152907079 --- Diff: core/src/main/scala/org/apache/spark/internal/config/package.scala --- @@ -485,4 +485,13 @@ package object config { "

[GitHub] spark pull request #19763: [SPARK-22537][core] Aggregation of map output sta...

2017-11-23 Thread gczsjdy
Github user gczsjdy commented on a diff in the pull request: https://github.com/apache/spark/pull/19763#discussion_r152906960 --- Diff: core/src/main/scala/org/apache/spark/MapOutputTracker.scala --- @@ -472,15 +475,66 @@ private[spark] class MapOutputTrackerMaster

[GitHub] spark pull request #19763: [SPARK-22537][core] Aggregation of map output sta...

2017-11-23 Thread gczsjdy
Github user gczsjdy commented on a diff in the pull request: https://github.com/apache/spark/pull/19763#discussion_r152888380 --- Diff: core/src/main/scala/org/apache/spark/MapOutputTracker.scala --- @@ -472,15 +475,66 @@ private[spark] class MapOutputTrackerMaster

[GitHub] spark pull request #19763: [SPARK-22537][core] Aggregation of map output sta...

2017-11-23 Thread gczsjdy
Github user gczsjdy commented on a diff in the pull request: https://github.com/apache/spark/pull/19763#discussion_r152888257 --- Diff: core/src/main/scala/org/apache/spark/MapOutputTracker.scala --- @@ -472,15 +475,66 @@ private[spark] class MapOutputTrackerMaster

[GitHub] spark issue #19763: [SPARK-22537][core] Aggregation of map output statistics...

2017-11-23 Thread gczsjdy
Github user gczsjdy commented on the issue: https://github.com/apache/spark/pull/19763 @cloud-fan It seems Jenkins hasn't started?

[GitHub] spark pull request #19763: [SPARK-22537][core] Aggregation of map output sta...

2017-11-23 Thread gczsjdy
Github user gczsjdy commented on a diff in the pull request: https://github.com/apache/spark/pull/19763#discussion_r152827467 --- Diff: core/src/main/scala/org/apache/spark/MapOutputTracker.scala --- @@ -472,15 +475,66 @@ private[spark] class MapOutputTrackerMaster

[GitHub] spark pull request #19763: [SPARK-22537][core] Aggregation of map output sta...

2017-11-22 Thread gczsjdy
Github user gczsjdy commented on a diff in the pull request: https://github.com/apache/spark/pull/19763#discussion_r152493779 --- Diff: core/src/main/scala/org/apache/spark/MapOutputTracker.scala --- @@ -472,15 +475,66 @@ private[spark] class MapOutputTrackerMaster

[GitHub] spark pull request #19788: [SPARK-9853][Core] Optimize shuffle fetch of cont...

2017-11-20 Thread gczsjdy
Github user gczsjdy commented on a diff in the pull request: https://github.com/apache/spark/pull/19788#discussion_r152193203 --- Diff: core/src/main/scala/org/apache/spark/MapOutputTracker.scala --- @@ -812,10 +812,14 @@ private[spark] object MapOutputTracker extends Logging

[GitHub] spark issue #19755: [SPARK-22524][SQL] Subquery shows reused on UI SQL tab e...

2017-11-20 Thread gczsjdy
Github user gczsjdy commented on the issue: https://github.com/apache/spark/pull/19755 I can't find a way to distinguish between `reused` and `unreused` subqueries. For example, in the `ReuseSubquery` rule, after seeing the first `SubqueryExec` (with `unreused` in its name), it's buffered. When

[GitHub] spark pull request #19763: [SPARK-22537][core] Aggregation of map output sta...

2017-11-20 Thread gczsjdy
Github user gczsjdy commented on a diff in the pull request: https://github.com/apache/spark/pull/19763#discussion_r152185531 --- Diff: core/src/main/scala/org/apache/spark/internal/config/package.scala --- @@ -485,4 +485,13 @@ package object config { "

[GitHub] spark pull request #19763: [SPARK-22537][core] Aggregation of map output sta...

2017-11-20 Thread gczsjdy
Github user gczsjdy commented on a diff in the pull request: https://github.com/apache/spark/pull/19763#discussion_r152021708 --- Diff: core/src/main/scala/org/apache/spark/MapOutputTracker.scala --- @@ -472,16 +475,45 @@ private[spark] class MapOutputTrackerMaster

[GitHub] spark pull request #19763: [SPARK-22537][core] Aggregation of map output sta...

2017-11-20 Thread gczsjdy
Github user gczsjdy commented on a diff in the pull request: https://github.com/apache/spark/pull/19763#discussion_r152017736 --- Diff: core/src/main/scala/org/apache/spark/internal/config/package.scala --- @@ -485,4 +485,13 @@ package object config { "

[GitHub] spark pull request #19763: [SPARK-22537][core] Aggregation of map output sta...

2017-11-20 Thread gczsjdy
Github user gczsjdy commented on a diff in the pull request: https://github.com/apache/spark/pull/19763#discussion_r152016240 --- Diff: core/src/main/scala/org/apache/spark/MapOutputTracker.scala --- @@ -472,16 +475,45 @@ private[spark] class MapOutputTrackerMaster

[GitHub] spark pull request #19763: [SPARK-22537][core] Aggregation of map output sta...

2017-11-20 Thread gczsjdy
Github user gczsjdy commented on a diff in the pull request: https://github.com/apache/spark/pull/19763#discussion_r152008181 --- Diff: core/src/main/scala/org/apache/spark/internal/config/package.scala --- @@ -485,4 +485,13 @@ package object config { "

[GitHub] spark pull request #19763: [SPARK-22537][core] Aggregation of map output sta...

2017-11-20 Thread gczsjdy
Github user gczsjdy commented on a diff in the pull request: https://github.com/apache/spark/pull/19763#discussion_r152006310 --- Diff: core/src/main/scala/org/apache/spark/internal/config/package.scala --- @@ -485,4 +485,13 @@ package object config { "

[GitHub] spark pull request #19763: [SPARK-22537][core] Aggregation of map output sta...

2017-11-20 Thread gczsjdy
Github user gczsjdy commented on a diff in the pull request: https://github.com/apache/spark/pull/19763#discussion_r152005905 --- Diff: core/src/main/scala/org/apache/spark/internal/config/package.scala --- @@ -485,4 +485,13 @@ package object config { "

[GitHub] spark pull request #19763: [SPARK-22537][core] Aggregation of map output sta...

2017-11-20 Thread gczsjdy
Github user gczsjdy commented on a diff in the pull request: https://github.com/apache/spark/pull/19763#discussion_r152002860 --- Diff: core/src/main/scala/org/apache/spark/internal/config/package.scala --- @@ -485,4 +485,13 @@ package object config { "

[GitHub] spark pull request #19763: [SPARK-22537][core] Aggregation of map output sta...

2017-11-20 Thread gczsjdy
Github user gczsjdy commented on a diff in the pull request: https://github.com/apache/spark/pull/19763#discussion_r152002262 --- Diff: core/src/main/scala/org/apache/spark/internal/config/package.scala --- @@ -485,4 +485,13 @@ package object config { "

[GitHub] spark pull request #19763: [SPARK-22537][core] Aggregation of map output sta...

2017-11-20 Thread gczsjdy
Github user gczsjdy commented on a diff in the pull request: https://github.com/apache/spark/pull/19763#discussion_r151921740 --- Diff: core/src/main/scala/org/apache/spark/internal/config/package.scala --- @@ -485,4 +485,13 @@ package object config { "

[GitHub] spark issue #19763: [SPARK-22537][core] Aggregation of map output statistics...

2017-11-16 Thread gczsjdy
Github user gczsjdy commented on the issue: https://github.com/apache/spark/pull/19763 Actually, the time gap is O(number of mappers * shuffle partitions). In this case, the number of mappers is not very large, but users are more likely to get slowed down when they run on a big data

[GitHub] spark issue #19763: [SPARK-22537][core] Aggregation of map output statistics...

2017-11-16 Thread gczsjdy
Github user gczsjdy commented on the issue: https://github.com/apache/spark/pull/19763 This happens a lot in our TPC-DS 100TB test. The master runs on an Intel Xeon E5-2699 v4 CPU @ 2.2GHz, which influences the driver's performance. And we set `spark.sql.shuffle.partitions

[GitHub] spark issue #19755: [SPARK-22524][SQL] Subquery shows reused on UI SQL tab e...

2017-11-16 Thread gczsjdy
Github user gczsjdy commented on the issue: https://github.com/apache/spark/pull/19755 This targets subqueries that are not reused; reused subqueries are now shown correctly in the UI. @cloud-fan

[GitHub] spark pull request #19763: [SPARK-22537][core] Aggregation of map output sta...

2017-11-15 Thread gczsjdy
Github user gczsjdy commented on a diff in the pull request: https://github.com/apache/spark/pull/19763#discussion_r151339166 --- Diff: core/src/main/scala/org/apache/spark/MapOutputTracker.scala --- @@ -473,16 +477,41 @@ private[spark] class MapOutputTrackerMaster

[GitHub] spark pull request #19763: [SPARK-22537][core] Aggregation of map output sta...

2017-11-15 Thread gczsjdy
Github user gczsjdy commented on a diff in the pull request: https://github.com/apache/spark/pull/19763#discussion_r151332369 --- Diff: core/src/main/scala/org/apache/spark/MapOutputTracker.scala --- @@ -473,16 +477,41 @@ private[spark] class MapOutputTrackerMaster

[GitHub] spark issue #19763: [SPARK-22537] Aggregation of map output statistics on dr...

2017-11-15 Thread gczsjdy
Github user gczsjdy commented on the issue: https://github.com/apache/spark/pull/19763 cc @cloud-fan @viirya @gatorsmile @chenghao-intel

[GitHub] spark pull request #19763: [SPARK-22537] Aggregation of map output statistic...

2017-11-15 Thread gczsjdy
GitHub user gczsjdy opened a pull request: https://github.com/apache/spark/pull/19763 [SPARK-22537] Aggregation of map output statistics on driver faces single point bottleneck ## What changes were proposed in this pull request? In adaptive execution, the map output

[GitHub] spark issue #19755: [SPARK-22524][SQL] Subquery shows reused on UI SQL tab e...

2017-11-15 Thread gczsjdy
Github user gczsjdy commented on the issue: https://github.com/apache/spark/pull/19755 But it might confuse users. I think what the UI shows should be exactly what gets executed. Maybe accuracy is more important than clarity

[GitHub] spark issue #19755: [SPARK-22524][SQL] Subquery shows reused on UI SQL tab e...

2017-11-15 Thread gczsjdy
Github user gczsjdy commented on the issue: https://github.com/apache/spark/pull/19755 @cloud-fan @viirya @carsonwang @gatorsmile @yucai Could you please help me review this?

[GitHub] spark pull request #19755: [SPARK-22524] Subquery shows reused on UI SQL tab...

2017-11-15 Thread gczsjdy
GitHub user gczsjdy opened a pull request: https://github.com/apache/spark/pull/19755 [SPARK-22524] Subquery shows reused on UI SQL tab even if it's not reused After manually disabling the `reuseSubquery` rule, the subqueries won't be reused. But on the SQL graph, there is only one

[GitHub] spark issue #11403: [SPARK-13523] [SQL] Reuse exchanges in a query

2017-10-29 Thread gczsjdy
Github user gczsjdy commented on the issue: https://github.com/apache/spark/pull/11403 @davies Hi, what do you mean by "Since all the planner only work with tree, so this rule should be the last one for the entire planning."? Thanks if you

[GitHub] spark pull request #17359: [SPARK-20028][SQL] Add aggreagate expression nGra...

2017-10-10 Thread gczsjdy
Github user gczsjdy closed the pull request at: https://github.com/apache/spark/pull/17359

[GitHub] spark issue #17359: [SPARK-20028][SQL] Add aggreagate expression nGrams

2017-10-10 Thread gczsjdy
Github user gczsjdy commented on the issue: https://github.com/apache/spark/pull/17359 Sorry, but I think this is inactive. Thanks for your attention. @wzhfy @viirya @gatorsmile
