[GitHub] spark issue #21698: [SPARK-23243][Core] Fix RDD.repartition() data correctne...

2018-07-18 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/21698 > I see some discussion about making shuffles deterministic, but it proved to be very difficult. Is there a prior discussion on this you can point me to? Is it that even if you used fetch

[GitHub] spark pull request #21638: [SPARK-22357][CORE] SparkContext.binaryFiles igno...

2018-07-18 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/21638#discussion_r203589083 --- Diff: core/src/main/scala/org/apache/spark/input/PortableDataStream.scala --- @@ -47,7 +47,7 @@ private[spark] abstract class StreamFileInputFormat

[GitHub] spark issue #21533: [SPARK-24195][Core] Bug fix for local:/ path in SparkCon...

2018-07-18 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/21533 Please also update the title and PR description because we changed the proposed solution in the middle. --- - To

[GitHub] spark pull request #21638: [SPARK-22357][CORE] SparkContext.binaryFiles igno...

2018-07-16 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/21638#discussion_r202737100 --- Diff: core/src/main/scala/org/apache/spark/input/PortableDataStream.scala --- @@ -47,7 +47,7 @@ private[spark] abstract class StreamFileInputFormat

[GitHub] spark pull request #21729: [SPARK-24755][Core] Executor loss can cause task ...

2018-07-16 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/21729#discussion_r202727503 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala --- @@ -87,7 +87,7 @@ private[spark] class TaskSetManager( // Set

[GitHub] spark pull request #21729: [SPARK-24755][Core] Executor loss can cause task ...

2018-07-16 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/21729#discussion_r202725810 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala --- @@ -87,7 +87,7 @@ private[spark] class TaskSetManager( // Set

[GitHub] spark issue #21781: [INFRA] Close stale PR

2018-07-16 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/21781 I checked the listed PRs in Core module and it seems fine to close them. Also cc @gatorsmile

[GitHub] spark issue #21474: [SPARK-24297][CORE] Fetch-to-disk by default for > 2gb

2018-07-16 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/21474 LGTM

[GitHub] spark pull request #21758: [SPARK-24795][CORE] Implement barrier execution m...

2018-07-16 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/21758#discussion_r202605444 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala --- @@ -359,17 +368,49 @@ private[spark] class TaskSchedulerImpl

[GitHub] spark pull request #21758: [SPARK-24795][CORE] Implement barrier execution m...

2018-07-16 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/21758#discussion_r202605140 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala --- @@ -359,17 +368,49 @@ private[spark] class TaskSchedulerImpl

[GitHub] spark issue #21589: [SPARK-24591][CORE] Number of cores and executors in the...

2018-07-15 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/21589 > @felixcheung I am not sure that our users are so interested in getting a list of cores per executors and calculate total numbers cores by summarizing the list. It will just complicate API

[GitHub] spark pull request #21589: [SPARK-24591][CORE] Number of cores and executors...

2018-07-15 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/21589#discussion_r202545679 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskScheduler.scala --- @@ -67,6 +67,10 @@ private[spark] trait TaskScheduler { // Get

[GitHub] spark issue #21664: [SPARK-24687][CORE] NoClassDefFoundError will not be cat...

2018-07-15 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/21664 I agree we should fail the job instead of letting it hang.

[GitHub] spark pull request #21729: [SPARK-24755][Core] Executor loss can cause task ...

2018-07-15 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/21729#discussion_r202545576 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala --- @@ -87,7 +87,7 @@ private[spark] class TaskSetManager( // Set

[GitHub] spark issue #21698: [SPARK-23243][Core] Fix RDD.repartition() data correctne...

2018-07-15 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/21698 Actually I think @mridulm has a point here - if we only retry all the tasks for repartition/zip*, it's still possible that some tasks in a succeeding stage may have finished before retry

[GitHub] spark issue #21758: [SPARK-24795][CORE] Implement barrier execution mode

2018-07-12 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/21758 cc @mengxr @gatorsmile @cloud-fan

[GitHub] spark pull request #21758: [SPARK-24795][CORE] Implement barrier execution m...

2018-07-12 Thread jiangxb1987
GitHub user jiangxb1987 opened a pull request: https://github.com/apache/spark/pull/21758 [SPARK-24795][CORE] Implement barrier execution mode ## What changes were proposed in this pull request? Propose new APIs and modify job/task scheduling to support barrier execution

[GitHub] spark issue #21698: [SPARK-23243][Core] Fix RDD.repartition() data correctne...

2018-07-12 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/21698 IIUC the output produced by `rdd1.zip(rdd2).map(v => (computeKey(v._1, v._2), computeValue(v._1, v._2)))` shall always have the same cardinality, no matter how many tasks are retried, so wh

[GitHub] spark issue #21698: [SPARK-23243][Core] Fix RDD.repartition() data correctne...

2018-07-11 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/21698 > A synthetic example: > rdd1.zip(rdd2).map(v => (computeKey(v._1, v._2), computeValue(v._1, v._2))).groupByKey().map().save() The above example may create some differe

[GitHub] spark pull request #21526: [SPARK-24515][CORE] No need to warning when outpu...

2018-07-11 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/21526#discussion_r201720609 --- Diff: core/src/main/scala/org/apache/spark/rdd/PairRDDFunctions.scala --- @@ -1053,7 +1053,10 @@ class PairRDDFunctions[K, V](self: RDD[(K, V

[GitHub] spark issue #21658: [SPARK-24678][Spark-Streaming] Give priority in use of '...

2018-07-10 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/21658 a late LGTM

[GitHub] spark pull request #21729: [SPARK-24755][Core] Executor loss can cause task ...

2018-07-10 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/21729#discussion_r201374948 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala --- @@ -87,7 +87,7 @@ private[spark] class TaskSetManager( // Set

[GitHub] spark issue #19118: [SPARK-21882][CORE] OutputMetrics doesn't count written ...

2018-07-10 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/19118 retest this please

[GitHub] spark issue #21664: [SPARK-24687][CORE] NoClassDefFoundError will not be cat...

2018-07-10 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/21664 Unfortunately, I can't even tell which line in Spark hit the exception from the image you posted.

[GitHub] spark issue #21656: [SPARK-24677][Core]Avoid NoSuchElementException from Med...

2018-07-10 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/21656 The changes LGTM

[GitHub] spark pull request #21729: [SPARK-24755][Core] Executor loss can cause task ...

2018-07-10 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/21729#discussion_r201368147 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala --- @@ -87,7 +87,7 @@ private[spark] class TaskSetManager( // Set

[GitHub] spark issue #21664: [SPARK-24678][CORE] NoClassDefFoundError will not be cat...

2018-07-09 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/21664 Can you post a useful stack trace of the job hang issue you hit?

[GitHub] spark issue #21664: [SPARK-24678][CORE] NoClassDefFoundError will not be cat...

2018-07-09 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/21664 Seems the PR includes the wrong JIRA number

[GitHub] spark issue #21698: [SPARK-23243][Core] Fix RDD.repartition() data correctne...

2018-07-07 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/21698 Thank you for your comments @mridulm ! We focus on resolving the RDD.repartition() correctness issue here in this PR, because it is most commonly used, and that we can still address the

[GitHub] spark pull request #21656: [SPARK-24677][Core]MedianHeap is empty when specu...

2018-07-05 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/21656#discussion_r200366359 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala --- @@ -772,6 +772,12 @@ private[spark] class TaskSetManager

[GitHub] spark issue #21698: [SPARK-23243][Core] Fix RDD.repartition() data correctne...

2018-07-04 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/21698 Thanks @cloud-fan @viirya comments addressed :)

[GitHub] spark issue #21653: [SPARK-13343] speculative tasks that didn't commit shoul...

2018-07-03 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/21653 IIUC this speculative task is not really killed, right? It is actually ignored. Is it worth adding a new TaskState for this case

[GitHub] spark issue #21698: [SPARK-23243][Core] Fix RDD.repartition() data correctne...

2018-07-02 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/21698 retest this please

[GitHub] spark pull request #21698: [SPARK-23243] Fix RDD.repartition() data correctn...

2018-07-02 Thread jiangxb1987
GitHub user jiangxb1987 opened a pull request: https://github.com/apache/spark/pull/21698 [SPARK-23243] Fix RDD.repartition() data correctness issue ## What changes were proposed in this pull request? The RDD repartition uses a round-robin way to distribute data, thus there
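
The round-robin concern in the description above can be illustrated with a small model. This is only a sketch in plain Python, not Spark code: `round_robin_partition` is a hypothetical helper, and the sort step is one possible way to make the assignment order-independent, assumed here for illustration and not claimed to be what the PR does.

```python
from typing import List, Sequence


def round_robin_partition(records: Sequence[str], num_partitions: int) -> List[List[str]]:
    """Hypothetical helper: assign records to partitions in arrival order."""
    partitions: List[List[str]] = [[] for _ in range(num_partitions)]
    for offset, record in enumerate(records):
        # The target partition depends only on the record's position in the input.
        partitions[offset % num_partitions].append(record)
    return partitions


# The same three records, seen in two different orders by the original task
# and by a retried attempt of that task.
first_attempt = round_robin_partition(["a", "b", "c"], num_partitions=2)
retried_attempt = round_robin_partition(["c", "a", "b"], num_partitions=2)

# Assumption for illustration: sorting the input first removes the dependence
# on arrival order, so both attempts produce the same assignment.
stable_first = round_robin_partition(sorted(["a", "b", "c"]), num_partitions=2)
stable_retry = round_robin_partition(sorted(["c", "a", "b"]), num_partitions=2)
```

In this model the two unsorted attempts place the same records in different partitions, which is the kind of mismatch a downstream stage could observe if only some tasks are recomputed.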

[GitHub] spark pull request #21474: [SPARK-24297][CORE] Fetch-to-disk by default for ...

2018-06-27 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/21474#discussion_r198709276 --- Diff: core/src/main/scala/org/apache/spark/internal/config/package.scala --- @@ -429,7 +429,11 @@ package object config { "ext

[GitHub] spark issue #17267: [SPARK-19926][PYSPARK] Make pyspark exception more user-...

2018-06-26 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/17267 @dataknocker do you want to take over this one? then we can continue with #18324

[GitHub] spark issue #21624: [SPARK-24639][DOC] Add three config in the doc

2018-06-26 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/21624 cc @zsxwing

[GitHub] spark pull request #21624: [SPARK-24639][DOC] Add three config in the doc

2018-06-26 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/21624#discussion_r198189239 --- Diff: docs/configuration.md --- @@ -456,6 +456,13 @@ Apart from these, the following properties are also available, and may be useful from

[GitHub] spark pull request #21639: [SPARK-24631][tests] Avoid cross-job pollution in...

2018-06-26 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/21639#discussion_r198188319 --- Diff: core/src/main/scala/org/apache/spark/TestUtils.scala --- @@ -173,21 +173,23 @@ private[spark] object TestUtils { * Run some code

[GitHub] spark pull request #21639: [SPARK-24631][tests] Avoid cross-job pollution in...

2018-06-26 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/21639#discussion_r198186779 --- Diff: core/src/main/scala/org/apache/spark/TestUtils.scala --- @@ -173,21 +173,23 @@ private[spark] object TestUtils { * Run some code

[GitHub] spark issue #21639: [SPARK-24631][tests] Avoid cross-job pollution in TestUt...

2018-06-26 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/21639 Seems the JIRA number is not related?

[GitHub] spark pull request #21570: [SPARK-24564][TEST] Add test suite for RecordBina...

2018-06-26 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/21570#discussion_r198177547 --- Diff: sql/core/src/test/java/test/org/apache/spark/sql/execution/sort/RecordBinaryComparatorSuite.java --- @@ -0,0 +1,255

[GitHub] spark pull request #21603: [SPARK-17091][SQL] Add rule to convert IN predica...

2018-06-24 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/21603#discussion_r197685463 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFilters.scala --- @@ -270,6 +270,11 @@ private[parquet

[GitHub] spark pull request #21577: [SPARK-24589][core] Correctly identify tasks in o...

2018-06-19 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/21577#discussion_r196479588 --- Diff: core/src/main/scala/org/apache/spark/scheduler/OutputCommitCoordinator.scala --- @@ -109,20 +116,21 @@ private[spark] class

[GitHub] spark issue #21577: [SPARK-24589][core] Correctly identify tasks in output c...

2018-06-19 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/21577 This looks good in general. IMO we should focus on fixing the output commit coordinator issue in this PR, and discuss the data source issue in a separate thread. I'm OOO this week but

[GitHub] spark pull request #21570: [SPARK-24564][TEST] Add test suite for RecordBina...

2018-06-19 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/21570#discussion_r196437321 --- Diff: sql/core/src/test/java/test/org/apache/spark/sql/execution/sort/RecordBinaryComparatorSuite.java --- @@ -0,0 +1,255

[GitHub] spark issue #21567: [SPARK-24560][CORE][MESOS] Fix some getTimeAsMs as getTi...

2018-06-19 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/21567 Overall I don't think the current logic should be modified. However, it would be useful to document some of the configs mentioned in th

[GitHub] spark pull request #21567: [SPARK-24560][CORE][MESOS] Fix some getTimeAsMs a...

2018-06-19 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/21567#discussion_r196433023 --- Diff: core/src/main/scala/org/apache/spark/ui/ConsoleProgressBar.scala --- @@ -34,7 +34,7 @@ private[spark] class ConsoleProgressBar(sc

[GitHub] spark pull request #21567: [SPARK-24560][CORE][MESOS] Fix some getTimeAsMs a...

2018-06-19 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/21567#discussion_r196432597 --- Diff: core/src/main/scala/org/apache/spark/executor/Executor.scala --- @@ -613,7 +614,7 @@ private[spark] class Executor( private[this] val

[GitHub] spark pull request #21567: [SPARK-24560][CORE][MESOS] Fix some getTimeAsMs a...

2018-06-19 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/21567#discussion_r196431492 --- Diff: core/src/main/scala/org/apache/spark/deploy/worker/DriverRunner.scala --- @@ -58,7 +59,7 @@ private[deploy] class DriverRunner

[GitHub] spark pull request #21567: [SPARK-24560][CORE][MESOS] Fix some getTimeAsMs a...

2018-06-19 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/21567#discussion_r196431134 --- Diff: core/src/main/scala/org/apache/spark/api/python/PythonRunner.scala --- @@ -354,7 +355,8 @@ private[spark] abstract class BasePythonRunner[IN

[GitHub] spark pull request #21575: [SPARK-24566][CORE] spark.storage.blockManagerSla...

2018-06-19 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/21575#discussion_r196429048 --- Diff: core/src/main/scala/org/apache/spark/HeartbeatReceiver.scala --- @@ -75,16 +76,18 @@ private[spark] class HeartbeatReceiver(sc: SparkContext

[GitHub] spark pull request #21575: [SPARK-24566][CORE] spark.storage.blockManagerSla...

2018-06-19 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/21575#discussion_r196428732 --- Diff: core/src/main/scala/org/apache/spark/HeartbeatReceiver.scala --- @@ -75,16 +76,18 @@ private[spark] class HeartbeatReceiver(sc: SparkContext

[GitHub] spark issue #21558: [SPARK-24552][SQL] Use task ID instead of attempt number...

2018-06-15 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/21558 I guess https://issues.apache.org/jira/browse/SPARK-24492 is potentially caused by the output committer issue?

[GitHub] spark issue #21570: [SPARK-24564][TEST] Add test suite for RecordBinaryCompa...

2018-06-15 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/21570 cc @JoshRosen @gatorsmile

[GitHub] spark issue #21570: [SPARK-24564][TEST] Add test suite for RecordBinaryCompa...

2018-06-14 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/21570 @kiszk Thanks, updated!

[GitHub] spark pull request #21570: [SPARK-24564][TEST] Add test suite for RecordBina...

2018-06-14 Thread jiangxb1987
GitHub user jiangxb1987 opened a pull request: https://github.com/apache/spark/pull/21570 [SPARK-24564][TEST] Add test suite for RecordBinaryComparator ## What changes were proposed in this pull request? Add a new test suite to test RecordBinaryComparator. ## How

[GitHub] spark issue #21536: [MINOR][CORE][TEST] Remove unnecessary sort in UnsafeInM...

2018-06-13 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/21536 Thanks! @HyukjinKwon

[GitHub] spark issue #21536: [MINOR][CORE][TEST] Remove unnecessary sort in UnsafeInM...

2018-06-13 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/21536 @HyukjinKwon can we merge this?

[GitHub] spark issue #21533: [SPARK-24195][Core] Bug fix for local:/ path in SparkCon...

2018-06-13 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/21533 LGTM except for the comment from @HyukjinKwon

[GitHub] spark pull request #21533: [SPARK-24195][Core] Bug fix for local:/ path in S...

2018-06-13 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/21533#discussion_r195310633 --- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala --- @@ -1517,9 +1517,12 @@ class SparkContext(config: SparkConf) extends Logging

[GitHub] spark issue #21558: [SPARK-24552][SQL] Use task ID instead of attempt number...

2018-06-13 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/21558 As @squito suggested, we can either use taskAttemptId or combine stageAttemptId and taskAttemptNumber together; both should be able to represent a unique task attempt
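
The two options in the comment above can be compared with a toy model. This is a plain-Python sketch, not Spark's scheduler: the `TaskAttempt` class and `launch` helper are hypothetical, and the model assumes that attempt numbers restart from 0 in each stage attempt while the task attempt ID is unique across the application.

```python
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class TaskAttempt:
    task_attempt_id: int   # assumed unique across the whole application
    stage_attempt_id: int  # which attempt of the stage this task belongs to
    partition_id: int
    attempt_number: int    # assumed to restart from 0 for each stage attempt


_next_id = count()


def launch(stage_attempt_id: int, partition_id: int, attempt_number: int) -> TaskAttempt:
    """Hypothetical helper: hand out a fresh task attempt ID on every launch."""
    return TaskAttempt(next(_next_id), stage_attempt_id, partition_id, attempt_number)


# Two attempts for the same partition, launched in different stage attempts.
a = launch(stage_attempt_id=0, partition_id=3, attempt_number=0)
b = launch(stage_attempt_id=1, partition_id=3, attempt_number=0)

# The attempt number alone cannot tell the two attempts apart.
collides = (a.partition_id, a.attempt_number) == (b.partition_id, b.attempt_number)

# Either option from the comment distinguishes them in this model.
by_id_unique = a.task_attempt_id != b.task_attempt_id
by_pair_unique = (a.stage_attempt_id, a.attempt_number) != (b.stage_attempt_id, b.attempt_number)
```

Under these assumptions both identifiers separate the two attempts, whereas the attempt number by itself does not.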

[GitHub] spark issue #21536: [MINOR][CORE][TEST] Remove unnecessary sort in UnsafeInM...

2018-06-13 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/21536 retest this please

[GitHub] spark issue #21536: [MINOR][CORE][TEST] Remove unnecessary sort in UnsafeInM...

2018-06-12 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/21536 retest this please

[GitHub] spark pull request #21545: [SPARK-23010][BUILD] Fix java checkstyle failure ...

2018-06-12 Thread jiangxb1987
GitHub user jiangxb1987 opened a pull request: https://github.com/apache/spark/pull/21545 [SPARK-23010][BUILD] Fix java checkstyle failure of kubernetes-integration-tests ## What changes were proposed in this pull request? Fix java checkstyle failure of kubernetes

[GitHub] spark issue #21536: [MINOR][CORE][TEST] Remove unnecessary sort in UnsafeInM...

2018-06-12 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/21536 retest this please

[GitHub] spark pull request #21536: [MINOR][CORE][TEST] Remove unnecessary sort in Un...

2018-06-11 Thread jiangxb1987
GitHub user jiangxb1987 opened a pull request: https://github.com/apache/spark/pull/21536 [MINOR][CORE][TEST] Remove unnecessary sort in UnsafeInMemorySorterSuite ## What changes were proposed in this pull request? We don't require specific ordering of the input data

[GitHub] spark issue #19528: [SPARK-20393][WEBU UI][1.6] Strengthen Spark to prevent ...

2018-06-11 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/19528 retest this please

[GitHub] spark pull request #21514: [SPARK-22860] [Core] - hide key password from lin...

2018-06-10 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/21514#discussion_r194296152 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/StandaloneSchedulerBackend.scala --- @@ -100,7 +100,7 @@ private[spark] class

[GitHub] spark issue #21514: [SPARK-22860] [Core] - hide key password from linux ps l...

2018-06-10 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/21514 Have you tried the config "spark.redaction.regex"?

[GitHub] spark issue #21494: [WIP][SPARK-24375][Prototype] Support barrier scheduling

2018-06-06 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/21494 @Ngone51 You can refer to the SPIP that xiangrui proposed in SPARK-24374 for a basic background and major goal of barrier scheduling, and you can also refer to SPARK-24375 for a design sketch

[GitHub] spark pull request #21494: [WIP][SPARK-24375][Prototype] Support barrier sch...

2018-06-04 Thread jiangxb1987
GitHub user jiangxb1987 opened a pull request: https://github.com/apache/spark/pull/21494 [WIP][SPARK-24375][Prototype] Support barrier scheduling ## What changes were proposed in this pull request? Add new RDDBarrier and BarrierTaskContext to support barrier scheduling in

[GitHub] spark pull request #21333: [SPARK-23778][CORE] Avoid unneeded shuffle when u...

2018-05-30 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/21333#discussion_r192000399 --- Diff: core/src/test/scala/org/apache/spark/rdd/RDDSuite.scala --- @@ -154,6 +154,13 @@ class RDDSuite extends SparkFunSuite with SharedSparkContext

[GitHub] spark issue #21454: [SPARK-24337][Core] Improve error messages for Spark con...

2018-05-29 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/21454 LGTM

[GitHub] spark issue #21454: [SPARK-24337][Core] Improve error messages for Spark con...

2018-05-29 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/21454 IIUC this PR prints the config key in the error message if the config value (either the default or the one from the configMap) can't be cast properly. Personally I think it adds some value to include
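
The idea described in this comment, naming the config key when a value cannot be cast, might look roughly like the following. It is a plain-Python sketch, not the SparkConf implementation: `get_int`, the message wording, and the key `spark.example.retries` are all hypothetical.

```python
def get_int(conf: dict, key: str, default: str) -> int:
    """Hypothetical helper: read a config value as an int, naming the key on failure."""
    raw = conf.get(key, default)
    try:
        return int(raw)
    except ValueError as exc:
        # Include the key so the user can tell which setting is malformed.
        raise ValueError(f"Illegal value for config key {key}: {raw!r}") from exc


# A well-formed default parses normally.
retries = get_int({}, "spark.example.retries", "3")

# A malformed value fails with a message that names the offending key.
try:
    get_int({"spark.example.retries": "three"}, "spark.example.retries", "3")
    message = ""
except ValueError as err:
    message = str(err)
```

Without the key, the user would see only the underlying parse error and would have to guess which setting caused it.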

[GitHub] spark pull request #21454: [SPARK-24337][Core] Improve error messages for Sp...

2018-05-29 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/21454#discussion_r191584812 --- Diff: core/src/main/scala/org/apache/spark/SparkConf.scala --- @@ -448,6 +473,20 @@ class SparkConf(loadDefaults: Boolean) extends Cloneable with

[GitHub] spark pull request #21454: [SPARK-24337][Core] Improve error messages for Sp...

2018-05-29 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/21454#discussion_r191582665 --- Diff: core/src/main/scala/org/apache/spark/SparkConf.scala --- @@ -394,23 +407,35 @@ class SparkConf(loadDefaults: Boolean) extends Cloneable with

[GitHub] spark pull request #21454: [SPARK-24337][Core] Improve error messages for Sp...

2018-05-29 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/21454#discussion_r191582611 --- Diff: core/src/main/scala/org/apache/spark/SparkConf.scala --- @@ -394,23 +407,35 @@ class SparkConf(loadDefaults: Boolean) extends Cloneable with

[GitHub] spark pull request #21454: [SPARK-24337][Core] Improve error messages for Sp...

2018-05-29 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/21454#discussion_r191582499 --- Diff: core/src/main/scala/org/apache/spark/SparkConf.scala --- @@ -394,23 +407,35 @@ class SparkConf(loadDefaults: Boolean) extends Cloneable with

[GitHub] spark issue #21390: [SPARK-24340][Core] Clean up non-shuffle disk block mana...

2018-05-29 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/21390 Are there any other concerns over this PR?

[GitHub] spark issue #21390: [SPARK-24340][Core] Clean up non-shuffle disk block mana...

2018-05-24 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/21390 @jerryshao Agreed, it would be useful to add a `debug-delay-sec` config for ease of development. Since this PR has already brought in a bunch of code changes, maybe we can add the config in a

[GitHub] spark issue #21390: [SPARK-24340][Core] Clean up non-shuffle disk block mana...

2018-05-23 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/21390 retest this please

[GitHub] spark issue #21406: [Minor][Core] Cleanup unused vals in `DAGScheduler.handl...

2018-05-23 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/21406 cc @cloud-fan

[GitHub] spark pull request #21406: [Minor][Core] Cleanup unused vals in `DAGSchedule...

2018-05-23 Thread jiangxb1987
GitHub user jiangxb1987 opened a pull request: https://github.com/apache/spark/pull/21406 [Minor][Core] Cleanup unused vals in `DAGScheduler.handleTaskCompletion` ## What changes were proposed in this pull request? Cleanup unused vals in `DAGScheduler.handleTaskCompletion

[GitHub] spark pull request #21390: [SPARK-24340][Core] Clean up non-shuffle disk blo...

2018-05-22 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/21390#discussion_r190098764 --- Diff: common/network-shuffle/src/test/java/org/apache/spark/network/shuffle/NonShuffleFilesCleanupSuite.java --- @@ -0,0 +1,222

[GitHub] spark pull request #21390: [SPARK-24340][Core] Clean up non-shuffle disk blo...

2018-05-22 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/21390#discussion_r190098624 --- Diff: core/src/main/scala/org/apache/spark/deploy/worker/Worker.scala --- @@ -732,6 +736,9 @@ private[deploy] class Worker

[GitHub] spark pull request #21390: [SPARK-24340][Core] Clean up non-shuffle disk blo...

2018-05-22 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/21390#discussion_r190098033 --- Diff: core/src/main/scala/org/apache/spark/deploy/worker/Worker.scala --- @@ -97,6 +97,10 @@ private[deploy] class Worker( private val

[GitHub] spark pull request #21390: [SPARK-24340][Core] Clean up non-shuffle disk blo...

2018-05-22 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/21390#discussion_r190074813 --- Diff: common/network-common/src/main/java/org/apache/spark/network/util/JavaUtils.java --- @@ -157,10 +172,10 @@ private static void

[GitHub] spark pull request #21390: [SPARK-24340][Core] Clean up non-shuffle disk blo...

2018-05-22 Thread jiangxb1987
GitHub user jiangxb1987 opened a pull request: https://github.com/apache/spark/pull/21390 [SPARK-24340][Core] Clean up non-shuffle disk block manager files following executor death ## What changes were proposed in this pull request? Currently we only clean up the local

[GitHub] spark issue #21341: Revert "[SPARK-22938][SQL][FOLLOWUP] Assert that SQLConf...

2018-05-16 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/21341 Personally I feel it should be safe to do the revert since we have a better approach, but I'd prefer to hear what @squito thinks about

[GitHub] spark pull request #21299: [SPARK-24250][SQL] support accessing SQLConf insi...

2018-05-13 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/21299#discussion_r187833760 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/SQLExecution.scala --- @@ -90,13 +92,33 @@ object SQLExecution { * thread from

[GitHub] spark issue #21252: [SPARK-24193] Sort by disk when number of limit is big i...

2018-05-08 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/21252 retest this please

[GitHub] spark issue #21252: [SPARK-24193] Sort by disk when number of limit is big i...

2018-05-07 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/21252 The changes look good to me, but it would also be great to have a test suite to cover this change. It seems we don't have a test suite for the rule `SpecialL

[GitHub] spark issue #21213: [SPARK-24120] Show `Jobs` page when `jobId` is missing

2018-05-06 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/21213 also cc @gengliangwang

[GitHub] spark pull request #21225: [SPARK-24168][SQL] WindowExec should not access S...

2018-05-03 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/21225#discussion_r185793645 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/window/WindowExec.scala --- @@ -114,7 +114,8 @@ case class WindowExec

[GitHub] spark issue #21214: [SPARK-23775][TEST] Make DataFrameRangeSuite not flaky

2018-05-03 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/21214 > but the thread killing left the shared SparkContext sometimes in a state where further jobs can't be submitted. Just curious how this

[GitHub] spark pull request #21206: [SPARK-24133][SQL] Check for integer overflows wh...

2018-05-02 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/21206#discussion_r185526310 --- Diff: sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/WritableColumnVector.java --- @@ -92,17 +92,22 @@ public void reserve(int

[GitHub] spark pull request #21206: [SPARK-24133][SQL] Check for integer overflows wh...

2018-05-02 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/21206#discussion_r185489656 --- Diff: sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/WritableColumnVector.java --- @@ -92,17 +92,22 @@ public void reserve(int

[GitHub] spark issue #21212: [SPARK-24143] filter empty blocks when convert mapstatus...

2018-05-02 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/21212 How much memory did the converted pairs consume? If the empty blocks are an issue, can we just clean up the empty blocks

[GitHub] spark issue #21206: [SPARK-24133][SQL] Check for integer overflows when resi...

2018-05-02 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/21206 retest this please
