[GitHub] spark issue #23228: [MINOR][DOC]The condition description of serialized shuf...

2018-12-09 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/23228 Please update the title `[MINOR][DOC] Update the condition description of serialized shuffle` --- - To unsubscribe, e-mail

[GitHub] spark issue #23222: [SPARK-20636] Add the rule TransposeWindow to the optimi...

2018-12-05 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/23222 Shall we add a SQL tag to the title? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #23046: [SPARK-23207][SQL][FOLLOW-UP] Use `SQLConf.get.enableRad...

2018-11-15 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/23046 I searched the code and didn't find similar issues, so this is the only one that needs to be fixed.

[GitHub] spark issue #22912: [SPARK-25901][CORE] Use only one thread in BarrierTaskCo...

2018-11-03 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/22912 Thanks, merging to master!

[GitHub] spark pull request #22723: [SPARK-25729][CORE]It is better to replace `minPa...

2018-10-31 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/22723#discussion_r229717747 --- Diff: core/src/main/scala/org/apache/spark/input/WholeTextFileInputFormat.scala --- @@ -48,11 +50,11 @@ private[spark] class

[GitHub] spark pull request #22723: [SPARK-25729][CORE]It is better to replace `minPa...

2018-10-31 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/22723#discussion_r229717581 --- Diff: core/src/main/scala/org/apache/spark/input/WholeTextFileInputFormat.scala --- @@ -48,11 +50,11 @@ private[spark] class

[GitHub] spark issue #22849: [SPARK-25852][Core] we should filter the workOffers with...

2018-10-30 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/22849 It may happen that a busy executor is marked as lost and later re-registers with the driver; in that case we currently call `makeOffers()`, and that will add the executor

[GitHub] spark issue #22849: [SPARK-25852][Core] we should filter the workOffers with...

2018-10-30 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/22849 What do you mean by "better performance"? If that means we can spend less time in `TaskSchedulerImpl.resourceOffers()` then I agree it's true, but AFAIK it's never reporte

[GitHub] spark pull request #22853: [SPARK-25845][SQL] Fix MatchError for calendar in...

2018-10-28 Thread jiangxb1987
Github user jiangxb1987 closed the pull request at: https://github.com/apache/spark/pull/22853

[GitHub] spark issue #22853: [SPARK-25845][SQL] Fix MatchError for calendar interval ...

2018-10-28 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/22853 Merging to master, I can open another PR against 2.4 if required in the future.

[GitHub] spark issue #22853: [SPARK-25845][SQL] Fix MatchError for calendar interval ...

2018-10-27 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/22853 Also cc @gatorsmile @cloud-fan @hvanhovell

[GitHub] spark pull request #22853: [SPARK-25845][SQL] Fix MatchError for calendar in...

2018-10-26 Thread jiangxb1987
GitHub user jiangxb1987 opened a pull request: https://github.com/apache/spark/pull/22853 [SPARK-25845][SQL] Fix MatchError for calendar interval type in range frame left boundary ## What changes were proposed in this pull request? WindowSpecDefinition checks start < l

[GitHub] spark issue #22813: [SPARK-25818][CORE] WorkDirCleanup should only remove th...

2018-10-24 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/22813 IIUC it's not expected to share the SPARK_WORK_DIR with any other usage.

[GitHub] spark pull request #22771: [SPARK-25773][Core]Cancel zombie tasks in a resul...

2018-10-23 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/22771#discussion_r227459990 --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala --- @@ -1364,6 +1385,16 @@ private[spark] class DAGScheduler

[GitHub] spark issue #22674: [SPARK-25680][SQL] SQL execution listener shouldn't happ...

2018-10-15 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/22674 LGTM, do you have any other concerns @hvanhovell @brkyvz @dongjoon-hyun?

[GitHub] spark issue #22677: [SPARK-25683][Core] Make AsyncEventQueue.lastReportTimes...

2018-10-14 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/22677 Sounds good!

[GitHub] spark issue #22699: [SPARK-25711][Core] Allow start-history-server.sh to sho...

2018-10-11 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/22699 Let's also update the title to include the deprecation changes.

[GitHub] spark pull request #22699: [SPARK-25711][Core] Allow history server to show ...

2018-10-11 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/22699#discussion_r224508691 --- Diff: core/src/main/scala/org/apache/spark/deploy/history/HistoryServerArguments.scala --- @@ -34,26 +34,25 @@ private[history] class

[GitHub] spark pull request #22699: [SPARK-25711][Core] Allow history server to show ...

2018-10-11 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/22699#discussion_r224508223 --- Diff: sbin/start-history-server.sh --- @@ -28,7 +28,22 @@ if [ -z "${SPARK_HOME}" ]; then export SPARK_HOME="$(cd "`di

[GitHub] spark pull request #22699: [SPARK-25711][Core] Allow history server to show ...

2018-10-11 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/22699#discussion_r224507524 --- Diff: core/src/main/scala/org/apache/spark/deploy/history/HistoryServerArguments.scala --- @@ -34,26 +34,25 @@ private[history] class

[GitHub] spark pull request #22699: [SPARK-25711][Core] Allow history server to show ...

2018-10-11 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/22699#discussion_r224504103 --- Diff: sbin/start-history-server.sh --- @@ -28,7 +28,22 @@ if [ -z "${SPARK_HOME}" ]; then export SPARK_HOME="$(cd "`di

[GitHub] spark pull request #22699: [SPARK-25711][Core] Allow history server to show ...

2018-10-11 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/22699#discussion_r224504246 --- Diff: core/src/main/scala/org/apache/spark/deploy/history/HistoryServerArguments.scala --- @@ -34,26 +34,25 @@ private[history] class

[GitHub] spark issue #22165: [SPARK-25017][Core] Add test suite for BarrierCoordinato...

2018-10-11 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/22165 Actually my original thinking was like this: ``` val state = new ContextBarrierState(barrierId, numTasks) val requester = mockRequester() val request

[GitHub] spark issue #22677: [SPARK-25683][Core] Make AsyncEventQueue.lastReportTimes...

2018-10-11 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/22677 Though it looks a little strange, the log content is actually right. I don't think we want to set the last report timestamp to the current time (that could confuse users about what happened before

[GitHub] spark pull request #22674: [SPARK-25680][SQL] SQL execution listener shouldn...

2018-10-09 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/22674#discussion_r223729445 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/util/QueryExecutionListener.scala --- @@ -75,95 +76,69 @@ trait QueryExecutionListener

[GitHub] spark issue #22674: [SPARK-25680][SQL] SQL execution listener shouldn't happ...

2018-10-09 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/22674 retest this please

[GitHub] spark issue #22325: [SPARK-25318]. Add exception handling when wrapping the ...

2018-09-26 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/22325 LGTM

[GitHub] spark pull request #22165: [SPARK-25017][Core] Add test suite for BarrierCoo...

2018-09-26 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/22165#discussion_r220589416 --- Diff: core/src/main/scala/org/apache/spark/BarrierCoordinator.scala --- @@ -187,6 +191,12 @@ private[spark] class BarrierCoordinator

[GitHub] spark pull request #22165: [SPARK-25017][Core] Add test suite for BarrierCoo...

2018-09-26 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/22165#discussion_r220590215 --- Diff: core/src/main/scala/org/apache/spark/BarrierCoordinator.scala --- @@ -187,6 +191,12 @@ private[spark] class BarrierCoordinator

[GitHub] spark pull request #22165: [SPARK-25017][Core] Add test suite for BarrierCoo...

2018-09-26 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/22165#discussion_r220591706 --- Diff: core/src/test/scala/org/apache/spark/scheduler/BarrierCoordinatorSuite.scala --- @@ -0,0 +1,166 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request #22458: [SPARK-25459] Add viewOriginalText back to Catalo...

2018-09-25 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/22458#discussion_r220410022 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveDDLSuite.scala --- @@ -2348,4 +2348,17 @@ class HiveDDLSuite

[GitHub] spark pull request #22526: [SPARK-25502][CORE][WEBUI]Empty Page when page nu...

2018-09-24 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/22526#discussion_r219891354 --- Diff: core/src/main/scala/org/apache/spark/ui/jobs/StagePage.scala --- @@ -685,7 +685,7 @@ private[ui] class TaskDataSource( private

[GitHub] spark pull request #22458: [SPARK-25459] Add viewOriginalText back to Catalo...

2018-09-20 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/22458#discussion_r219370221 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala --- @@ -467,9 +467,9 @@ private[hive] class HiveClientImpl

[GitHub] spark pull request #22325: [SPARK-25318]. Add exception handling when wrappi...

2018-09-19 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/22325#discussion_r218873184 --- Diff: core/src/main/scala/org/apache/spark/storage/ShuffleBlockFetcherIterator.scala --- @@ -444,36 +444,34 @@ final class

[GitHub] spark pull request #22192: [SPARK-24918][Core] Executor Plugin API

2018-09-19 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/22192#discussion_r218861435 --- Diff: core/src/main/scala/org/apache/spark/executor/Executor.scala --- @@ -136,6 +136,26 @@ private[spark] class Executor( // for fetching

[GitHub] spark pull request #22192: [SPARK-24918][Core] Executor Plugin API

2018-09-19 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/22192#discussion_r218865220 --- Diff: core/src/test/java/org/apache/spark/ExecutorPluginSuite.java --- @@ -0,0 +1,128 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request #22325: [SPARK-25318]. Add exception handling when wrappi...

2018-09-19 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/22325#discussion_r218857081 --- Diff: core/src/main/scala/org/apache/spark/storage/ShuffleBlockFetcherIterator.scala --- @@ -444,36 +444,34 @@ final class

[GitHub] spark pull request #20414: [SPARK-23243][SQL] Shuffle+Repartition on an RDD ...

2018-09-11 Thread jiangxb1987
Github user jiangxb1987 closed the pull request at: https://github.com/apache/spark/pull/20414

[GitHub] spark issue #22351: [MINOR][SQL] Add a debug log when a SQL text is used for...

2018-09-09 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/22351 Just confirmed that if the view is both created and retrieved on the Spark side, then no exception will be thrown

[GitHub] spark issue #22351: [MINOR][SQL] Add a debug log when a SQL text is used for...

2018-09-09 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/22351 This actually reads a view created by Hive, so I don't think it is a problem with the view write side

[GitHub] spark issue #22165: [SPARK-25017][Core] Add test suite for BarrierCoordinato...

2018-09-05 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/22165 I think it should be fine to make `ContextBarrierState` private[spark] to test it, WDYT @mengxr?

[GitHub] spark issue #22277: [SPARK-25276] Redundant constrains when using alias

2018-09-05 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/22277 You can have `select * from (select a, a as c from table1 where a > 10) t where a > c`

[GitHub] spark issue #22277: [SPARK-25276] Redundant constrains when using alias

2018-09-04 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/22277 Thank you for your interest in this issue. However, I don't think the changes proposed in this PR are valid: suppose you have another predicate like `a > z`; it is surely desirable to infer a

[GitHub] spark issue #22240: [SPARK-25248] [CORE] Audit barrier Scala APIs for 2.4

2018-09-04 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/22240 LGTM

[GitHub] spark pull request #22330: [SPARK-19355][SQL][FOLLOWUP][TEST] Properly recyc...

2018-09-04 Thread jiangxb1987
GitHub user jiangxb1987 opened a pull request: https://github.com/apache/spark/pull/22330 [SPARK-19355][SQL][FOLLOWUP][TEST] Properly recycle SparkSession on TakeOrderedAndProjectSuite finishes ## What changes were proposed in this pull request? Previously

[GitHub] spark issue #22240: [SPARK-25248] [CORE] Audit barrier Scala APIs for 2.4

2018-09-04 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/22240 retest this please

[GitHub] spark issue #22112: [SPARK-23243][Core] Fix RDD.repartition() data correctne...

2018-08-29 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/22112 ping @tgravescs @mridulm @squito @markhamstra

[GitHub] spark pull request #22240: [SPARK-25248] [CORE] Audit barrier Scala APIs for...

2018-08-29 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/22240#discussion_r213754911 --- Diff: core/src/main/scala/org/apache/spark/scheduler/Task.scala --- @@ -82,31 +82,22 @@ private[spark] abstract class Task[T

[GitHub] spark issue #22112: [SPARK-23243][Core] Fix RDD.repartition() data correctne...

2018-08-29 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/22112 retest this please

[GitHub] spark issue #22112: [SPARK-23243][Core] Fix RDD.repartition() data correctne...

2018-08-29 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/22112 retest this please

[GitHub] spark issue #22112: [SPARK-23243][Core] Fix RDD.repartition() data correctne...

2018-08-29 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/22112 retest this please

[GitHub] spark pull request #22247: [SPARK-25253][PYSPARK] Refactor local connection ...

2018-08-28 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/22247#discussion_r213378931 --- Diff: python/pyspark/taskcontext.py --- @@ -108,38 +108,12 @@ def _load_from_socket(port, auth_secret): """ Load da

[GitHub] spark issue #22240: [SPARK-25248] [CORE] Audit barrier Scala APIs for 2.4

2018-08-28 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/22240 retest this please

[GitHub] spark issue #22112: [SPARK-23243][Core] Fix RDD.repartition() data correctne...

2018-08-28 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/22112 retest this please

[GitHub] spark pull request #21976: [SPARK-24909][core] Always unregister pending par...

2018-08-27 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/21976#discussion_r213176636 --- Diff: core/src/test/scala/org/apache/spark/scheduler/DAGSchedulerSuite.scala --- @@ -2474,19 +2478,21 @@ class DAGSchedulerSuite extends

[GitHub] spark pull request #22085: [SPARK-25095][PySpark] Python support for Barrier...

2018-08-27 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/22085#discussion_r213050049 --- Diff: core/src/main/scala/org/apache/spark/api/python/PythonRunner.scala --- @@ -180,7 +188,73 @@ private[spark] abstract class BasePythonRunner

[GitHub] spark pull request #21976: [SPARK-24909][core] Always unregister pending par...

2018-08-27 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/21976#discussion_r213042176 --- Diff: core/src/test/scala/org/apache/spark/scheduler/DAGSchedulerSuite.scala --- @@ -2474,19 +2478,21 @@ class DAGSchedulerSuite extends

[GitHub] spark issue #22112: [SPARK-23243][Core] Fix RDD.repartition() data correctne...

2018-08-26 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/22112 The changes look good from my side; they summarize our current understanding of the data correctness issue caused by input-order-aware operators and inconsistent shuffle output order

[GitHub] spark issue #21698: [SPARK-23243][Core] Fix RDD.repartition() data correctne...

2018-08-24 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/21698 Thanks everyone! I closed this in favor of #22112

[GitHub] spark pull request #21698: [SPARK-23243][Core] Fix RDD.repartition() data co...

2018-08-24 Thread jiangxb1987
Github user jiangxb1987 closed the pull request at: https://github.com/apache/spark/pull/21698

[GitHub] spark pull request #22112: [SPARK-23243][Core] Fix RDD.repartition() data co...

2018-08-24 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/22112#discussion_r212653282 --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala --- @@ -1502,6 +1502,53 @@ private[spark] class DAGScheduler

[GitHub] spark pull request #22112: [SPARK-23243][Core] Fix RDD.repartition() data co...

2018-08-24 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/22112#discussion_r212651948 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/ShuffleExchangeExec.scala --- @@ -305,17 +306,19 @@ object

[GitHub] spark issue #22211: [SPARK-23207][SPARK-22905][SPARK-24564][SPARK-25114][SQL...

2018-08-24 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/22211 LGTM

[GitHub] spark pull request #22112: [SPARK-23243][Core] Fix RDD.repartition() data co...

2018-08-23 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/22112#discussion_r212383406 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/ShuffleExchangeExec.scala --- @@ -305,17 +306,19 @@ object

[GitHub] spark pull request #22112: [SPARK-23243][Core] Fix RDD.repartition() data co...

2018-08-23 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/22112#discussion_r212379326 --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala --- @@ -1502,6 +1502,53 @@ private[spark] class DAGScheduler

[GitHub] spark pull request #22112: [SPARK-23243][Core] Fix RDD.repartition() data co...

2018-08-23 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/22112#discussion_r212381036 --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala --- @@ -1502,6 +1502,53 @@ private[spark] class DAGScheduler

[GitHub] spark pull request #22112: [SPARK-23243][Core] Fix RDD.repartition() data co...

2018-08-23 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/22112#discussion_r212368000 --- Diff: core/src/main/scala/org/apache/spark/rdd/RDD.scala --- @@ -812,11 +813,13 @@ abstract class RDD[T: ClassTag]( */ private[spark

[GitHub] spark pull request #22112: [SPARK-23243][Core] Fix RDD.repartition() data co...

2018-08-23 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/22112#discussion_r212376990 --- Diff: core/src/main/scala/org/apache/spark/rdd/RDD.scala --- @@ -1865,6 +1876,39 @@ abstract class RDD[T: ClassTag]( // RDD chain

[GitHub] spark issue #22165: [SPARK-25017][Core] Add test suite for BarrierCoordinato...

2018-08-22 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/22165 One general idea is that we don't need to rely on the RPC framework to test `ContextBarrierState`; just mocking `RpcCallContext`s should be enough (I haven't gone into detail, so correct me if I'm

[GitHub] spark issue #22165: [SPARK-25017][Core] Add test suite for BarrierCoordinato...

2018-08-21 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/22165 I'll make one pass of this later today :) Thanks for taking this task!

[GitHub] spark issue #22079: [SPARK-23207][SPARK-22905][SPARK-24564][SPARK-25114][SQL...

2018-08-21 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/22079 LGTM, thanks!

[GitHub] spark pull request #22085: [SPARK-25095][PySpark] Python support for Barrier...

2018-08-21 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/22085#discussion_r211683369 --- Diff: python/pyspark/taskcontext.py --- @@ -95,3 +99,126 @@ def getLocalProperty(self, key): Get a local property set upstream

[GitHub] spark pull request #22166: [2.3][SPARK-25114][Core][FOLLOWUP] Fix RecordBina...

2018-08-21 Thread jiangxb1987
Github user jiangxb1987 closed the pull request at: https://github.com/apache/spark/pull/22166

[GitHub] spark pull request #22166: [2.3][SPARK-25114][Core][FOLLOWUP] Fix RecordBina...

2018-08-21 Thread jiangxb1987
GitHub user jiangxb1987 opened a pull request: https://github.com/apache/spark/pull/22166 [2.3][SPARK-25114][Core][FOLLOWUP] Fix RecordBinaryComparatorSuite build failure ## What changes were proposed in this pull request? Fix RecordBinaryComparatorSuite build failure

[GitHub] spark issue #22158: [SPARK-25161][Core] Fix several bugs in failure handling...

2018-08-21 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/22158 retest this please

[GitHub] spark issue #22158: [SPARK-25161][Core] Fix several bugs in failure handling...

2018-08-20 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/22158 retest this please

[GitHub] spark pull request #22158: [SPARK-25161][Core] Fix several bugs in failure h...

2018-08-20 Thread jiangxb1987
GitHub user jiangxb1987 opened a pull request: https://github.com/apache/spark/pull/22158 [SPARK-25161][Core] Fix several bugs in failure handling of barrier execution mode ## What changes were proposed in this pull request? Fix several bugs in failure handling of barrier

[GitHub] spark issue #22101: [SPARK-25114][Core] Fix RecordBinaryComparator when subt...

2018-08-20 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/22101 retest this please

[GitHub] spark issue #22101: [SPARK-25114][Core] Fix RecordBinaryComparator when subt...

2018-08-20 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/22101 Thanks @squito, I've added another test case to cover the case where the last byte differs.

[GitHub] spark pull request #22085: [SPARK-25095][PySpark] Python support for Barrier...

2018-08-20 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/22085#discussion_r211182337 --- Diff: python/pyspark/taskcontext.py --- @@ -95,3 +99,124 @@ def getLocalProperty(self, key): Get a local property set upstream

[GitHub] spark pull request #22085: [WIP][SPARK-25095][PySpark] Python support for Ba...

2018-08-17 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/22085#discussion_r210963511 --- Diff: python/pyspark/taskcontext.py --- @@ -95,3 +99,124 @@ def getLocalProperty(self, key): Get a local property set upstream

[GitHub] spark pull request #22085: [WIP][SPARK-25095][PySpark] Python support for Ba...

2018-08-17 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/22085#discussion_r210963181 --- Diff: core/src/main/scala/org/apache/spark/api/python/PythonRunner.scala --- @@ -381,6 +421,45 @@ private[spark] abstract class BasePythonRunner

[GitHub] spark issue #22112: [WIP][SPARK-23243][Core] Fix RDD.repartition() data corr...

2018-08-16 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/22112 > IMO we should traverse the dependency graph and rely on how ShuffledRDD is configured A trivial point here - Since `ShuffleDependency` is also a DeveloperAPI, it's possible for us

[GitHub] spark issue #22101: [SPARK-25114][Core] Fix RecordBinaryComparator when subt...

2018-08-15 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/22101 ping @gatorsmile @mridulm @squito

[GitHub] spark pull request #22112: [WIP][SPARK-23243][Core] Fix RDD.repartition() da...

2018-08-15 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/22112#discussion_r210450123 --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala --- @@ -1441,6 +1441,18 @@ class DAGScheduler

[GitHub] spark pull request #22112: [WIP][SPARK-23243][Core] Fix RDD.repartition() da...

2018-08-15 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/22112#discussion_r210449640 --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala --- @@ -1441,6 +1441,18 @@ class DAGScheduler

[GitHub] spark issue #21698: [SPARK-23243][Core] Fix RDD.repartition() data correctne...

2018-08-14 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/21698 Thanks @cloud-fan your summary above is super useful, and I think it's clear enough. > So when we see fetch failure and rerun map tasks, we should track which reducers have its shuf

[GitHub] spark pull request #22085: [SPARK-25095][PySpark] Python support for Barrier...

2018-08-14 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/22085#discussion_r209974729 --- Diff: core/src/main/scala/org/apache/spark/api/python/PythonRunner.scala --- @@ -180,7 +183,42 @@ private[spark] abstract class BasePythonRunner

[GitHub] spark issue #22101: [SPARK-25114][Core] Fix RecordBinaryComparator when subt...

2018-08-14 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/22101 @squito I've created a new JIRA task and updated the title, thanks for the reminder!

[GitHub] spark pull request #22085: [SPARK-25095][PySpark] Python support for Barrier...

2018-08-14 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/22085#discussion_r209853276 --- Diff: python/pyspark/taskcontext.py --- @@ -95,3 +95,92 @@ def getLocalProperty(self, key): Get a local property set upstream

[GitHub] spark issue #22101: [SPARK-23207][Core][FOLLOWUP] Fix RecordBinaryComparator...

2018-08-14 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/22101 cc @mridulm @squito

[GitHub] spark pull request #22101: [SPARK-23207][Core][FOLLOWUP] Fix RecordBinaryCom...

2018-08-14 Thread jiangxb1987
GitHub user jiangxb1987 opened a pull request: https://github.com/apache/spark/pull/22101 [SPARK-23207][Core][FOLLOWUP] Fix RecordBinaryComparator when subtraction between two words is divisible by Integer.MAX_VALUE. ## What changes were proposed in this pull request

[GitHub] spark pull request #22079: [SPARK-23207][SPARK-22905][SQL][BACKPORT-2.2] Shu...

2018-08-13 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/22079#discussion_r209822194 --- Diff: sql/catalyst/src/main/java/org/apache/spark/sql/execution/RecordBinaryComparator.java --- @@ -0,0 +1,70 @@ +/* + * Licensed

[GitHub] spark issue #22001: [SPARK-24819][CORE] Fail fast when no enough slots to la...

2018-08-13 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/22001 retest this please

[GitHub] spark issue #21698: [SPARK-23243][Core] Fix RDD.repartition() data correctne...

2018-08-13 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/21698 @tgravescs I'm still working on this but I would be glad if you can also work on the "sort the serialized bytes of T" approach, actually the retry-all-tasks approach seems more comp

[GitHub] spark issue #21698: [SPARK-23243][Core] Fix RDD.repartition() data correctne...

2018-08-13 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/21698 We fixed the DataFrame repartition correctness issue by inserting a local sort before repartition, and feedback for this approach is generally negative because the performance of repartition

[GitHub] spark issue #22001: [SPARK-24819][CORE] Fail fast when no enough slots to la...

2018-08-13 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/22001 retest this please

[GitHub] spark pull request #22085: [SPARK-25095][PySpark] Python support for Barrier...

2018-08-12 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/22085#discussion_r209490553 --- Diff: core/src/main/scala/org/apache/spark/api/python/PythonRunner.scala --- @@ -180,7 +183,42 @@ private[spark] abstract class BasePythonRunner

[GitHub] spark issue #22079: [SPARK-23207][SQL][BACKPORT-2.2] Shuffle+Repartition on ...

2018-08-12 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/22079 Both seem fine to me; it's just a minor improvement. Normally we don't backport an improvement, but since it's a simple and small change I'm confident it is safe to also include the change

[GitHub] spark pull request #22085: [SPARK-25095][PySpark] Python support for Barrier...

2018-08-12 Thread jiangxb1987
GitHub user jiangxb1987 opened a pull request: https://github.com/apache/spark/pull/22085 [SPARK-25095][PySpark] Python support for BarrierTaskContext ## What changes were proposed in this pull request? Add method `barrier()` and `getTaskInfos()` in python TaskContext
