Github user jiangxb1987 commented on the issue:
https://github.com/apache/spark/pull/21698
> I see some discussion about making shuffles deterministic, but it proved
to be very difficult. Is there a prior discussion on this you can point me to?
Is it that even if you used fetch
Github user jiangxb1987 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21638#discussion_r203589083
--- Diff:
core/src/main/scala/org/apache/spark/input/PortableDataStream.scala ---
@@ -47,7 +47,7 @@ private[spark] abstract class StreamFileInputFormat
Github user jiangxb1987 commented on the issue:
https://github.com/apache/spark/pull/21533
Please also update the title and PR description because we changed the
proposed solution in the middle.
Github user jiangxb1987 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21638#discussion_r202737100
--- Diff:
core/src/main/scala/org/apache/spark/input/PortableDataStream.scala ---
@@ -47,7 +47,7 @@ private[spark] abstract class StreamFileInputFormat
Github user jiangxb1987 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21729#discussion_r202727503
--- Diff:
core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala ---
@@ -87,7 +87,7 @@ private[spark] class TaskSetManager(
// Set
Github user jiangxb1987 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21729#discussion_r202725810
--- Diff:
core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala ---
@@ -87,7 +87,7 @@ private[spark] class TaskSetManager(
// Set
Github user jiangxb1987 commented on the issue:
https://github.com/apache/spark/pull/21781
I checked the listed PRs in the Core module and it seems fine to close them.
Also cc @gatorsmile
Github user jiangxb1987 commented on the issue:
https://github.com/apache/spark/pull/21474
LGTM
Github user jiangxb1987 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21758#discussion_r202605444
--- Diff:
core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala ---
@@ -359,17 +368,49 @@ private[spark] class TaskSchedulerImpl
Github user jiangxb1987 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21758#discussion_r202605140
--- Diff:
core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala ---
@@ -359,17 +368,49 @@ private[spark] class TaskSchedulerImpl
Github user jiangxb1987 commented on the issue:
https://github.com/apache/spark/pull/21589
> @felixcheung I am not sure that our users are so interested in getting a
list of cores per executor and calculating the total number of cores by summing
up the list. It will just complicate the API
Github user jiangxb1987 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21589#discussion_r202545679
--- Diff:
core/src/main/scala/org/apache/spark/scheduler/TaskScheduler.scala ---
@@ -67,6 +67,10 @@ private[spark] trait TaskScheduler {
// Get
Github user jiangxb1987 commented on the issue:
https://github.com/apache/spark/pull/21664
I agree we shall fail the job instead of letting it hang.
Github user jiangxb1987 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21729#discussion_r202545576
--- Diff:
core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala ---
@@ -87,7 +87,7 @@ private[spark] class TaskSetManager(
// Set
Github user jiangxb1987 commented on the issue:
https://github.com/apache/spark/pull/21698
Actually I think @mridulm has a point here - if we only retry all the
tasks for repartition/zip*, it's still possible that some tasks in a succeeding
stage may have finished before retry
Github user jiangxb1987 commented on the issue:
https://github.com/apache/spark/pull/21758
cc @mengxr @gatorsmile @cloud-fan
GitHub user jiangxb1987 opened a pull request:
https://github.com/apache/spark/pull/21758
[SPARK-24795][CORE] Implement barrier execution mode
## What changes were proposed in this pull request?
Propose new APIs and modify job/task scheduling to support barrier
execution
Github user jiangxb1987 commented on the issue:
https://github.com/apache/spark/pull/21698
IIUC the output produced by `rdd1.zip(rdd2).map(v => (computeKey(v._1,
v._2), computeValue(v._1, v._2)))` shall always have the same cardinality, no
matter how many tasks are retried, so wh
Github user jiangxb1987 commented on the issue:
https://github.com/apache/spark/pull/21698
> A synthetic example:
> rdd1.zip(rdd2).map(v => (computeKey(v._1, v._2), computeValue(v._1,
v._2))).groupByKey().map().save()
The above example may create some differe
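A minimal Scala sketch of the pipeline quoted above, for illustration only: `computeKey`, `computeValue`, and the output path are placeholders, and `sc` is assumed to be an existing SparkContext. The record count is stable across retries, but which elements get zipped together is not, once an upstream RDD (e.g. the output of `repartition()`) can be recomputed in a different order.

```scala
// Placeholders standing in for the user functions in the quoted example.
def computeKey(a: Int, b: Int): Int = a % 10
def computeValue(a: Int, b: Int): Long = a.toLong + b

// Two RDDs with matching partition counts so zip() is legal.
val rdd1 = sc.parallelize(1 to 1000, 8)
val rdd2 = sc.parallelize(1001 to 2000, 8)

rdd1.zip(rdd2)
  .map(v => (computeKey(v._1, v._2), computeValue(v._1, v._2)))
  .groupByKey()
  .mapValues(_.sum)
  .saveAsTextFile("/tmp/zip-groupByKey-sketch") // placeholder output path
```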
Github user jiangxb1987 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21526#discussion_r201720609
--- Diff: core/src/main/scala/org/apache/spark/rdd/PairRDDFunctions.scala
---
@@ -1053,7 +1053,10 @@ class PairRDDFunctions[K, V](self: RDD[(K, V
Github user jiangxb1987 commented on the issue:
https://github.com/apache/spark/pull/21658
a late LGTM
Github user jiangxb1987 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21729#discussion_r201374948
--- Diff:
core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala ---
@@ -87,7 +87,7 @@ private[spark] class TaskSetManager(
// Set
Github user jiangxb1987 commented on the issue:
https://github.com/apache/spark/pull/19118
retest this please
Github user jiangxb1987 commented on the issue:
https://github.com/apache/spark/pull/21664
Unfortunately, I can't even track which line in Spark hit the
exception from the image you posted.
Github user jiangxb1987 commented on the issue:
https://github.com/apache/spark/pull/21656
The changes LGTM
Github user jiangxb1987 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21729#discussion_r201368147
--- Diff:
core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala ---
@@ -87,7 +87,7 @@ private[spark] class TaskSetManager(
// Set
Github user jiangxb1987 commented on the issue:
https://github.com/apache/spark/pull/21664
Can you post a useful stack trace of the job hang issue you hit?
Github user jiangxb1987 commented on the issue:
https://github.com/apache/spark/pull/21664
Seems the PR included the wrong JIRA number
Github user jiangxb1987 commented on the issue:
https://github.com/apache/spark/pull/21698
Thank you for your comments @mridulm !
We focus on resolving the RDD.repartition() correctness issue here in this
PR, because it is the most commonly used, and we can still address the
Github user jiangxb1987 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21656#discussion_r200366359
--- Diff:
core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala ---
@@ -772,6 +772,12 @@ private[spark] class TaskSetManager
Github user jiangxb1987 commented on the issue:
https://github.com/apache/spark/pull/21698
Thanks @cloud-fan @viirya comments addressed :)
Github user jiangxb1987 commented on the issue:
https://github.com/apache/spark/pull/21653
IIUC this speculative task is not really killed, right? It is actually
ignored. Is it worth adding a new TaskState for this case
Github user jiangxb1987 commented on the issue:
https://github.com/apache/spark/pull/21698
retest this please
GitHub user jiangxb1987 opened a pull request:
https://github.com/apache/spark/pull/21698
[SPARK-23243] Fix RDD.repartition() data correctness issue
## What changes were proposed in this pull request?
RDD.repartition() uses a round-robin way to distribute data, thus there
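A simplified model of the round-robin assignment, for illustration only (not Spark's exact implementation): the target partition of an element depends on its position in the input, so if a recomputed upstream task emits the same elements in a different order, they land in different output partitions than in the first attempt.

```scala
// Plain Scala, no cluster needed: model round-robin placement by position.
def roundRobinAssign[T](elems: Seq[T], numPartitions: Int): Map[Int, Seq[T]] =
  elems.zipWithIndex
    .groupBy { case (_, idx) => idx % numPartitions }
    .map { case (p, pairs) => p -> pairs.map(_._1) }

val firstAttempt  = roundRobinAssign(Seq("a", "b", "c", "d"), numPartitions = 2)
val secondAttempt = roundRobinAssign(Seq("b", "a", "d", "c"), numPartitions = 2)
// firstAttempt:  Map(0 -> List(a, c), 1 -> List(b, d))
// secondAttempt: Map(0 -> List(b, d), 1 -> List(a, c))  -- same data, different placement
```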
Github user jiangxb1987 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21474#discussion_r198709276
--- Diff:
core/src/main/scala/org/apache/spark/internal/config/package.scala ---
@@ -429,7 +429,11 @@ package object config {
"ext
Github user jiangxb1987 commented on the issue:
https://github.com/apache/spark/pull/17267
@dataknocker do you want to take over this one? then we can continue with
#18324
Github user jiangxb1987 commented on the issue:
https://github.com/apache/spark/pull/21624
cc @zsxwing
Github user jiangxb1987 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21624#discussion_r198189239
--- Diff: docs/configuration.md ---
@@ -456,6 +456,13 @@ Apart from these, the following properties are also
available, and may be useful
from
Github user jiangxb1987 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21639#discussion_r198188319
--- Diff: core/src/main/scala/org/apache/spark/TestUtils.scala ---
@@ -173,21 +173,23 @@ private[spark] object TestUtils {
* Run some code
Github user jiangxb1987 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21639#discussion_r198186779
--- Diff: core/src/main/scala/org/apache/spark/TestUtils.scala ---
@@ -173,21 +173,23 @@ private[spark] object TestUtils {
* Run some code
Github user jiangxb1987 commented on the issue:
https://github.com/apache/spark/pull/21639
Seems the JIRA number is not related?
Github user jiangxb1987 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21570#discussion_r198177547
--- Diff:
sql/core/src/test/java/test/org/apache/spark/sql/execution/sort/RecordBinaryComparatorSuite.java
---
@@ -0,0 +1,255
Github user jiangxb1987 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21603#discussion_r197685463
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFilters.scala
---
@@ -270,6 +270,11 @@ private[parquet
Github user jiangxb1987 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21577#discussion_r196479588
--- Diff:
core/src/main/scala/org/apache/spark/scheduler/OutputCommitCoordinator.scala ---
@@ -109,20 +116,21 @@ private[spark] class
Github user jiangxb1987 commented on the issue:
https://github.com/apache/spark/pull/21577
This in general looks good. IMO we shall focus on fixing the output commit
coordinator issue in this PR, and discuss the data source issue in a separate
thread.
I'm OOO this week but
Github user jiangxb1987 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21570#discussion_r196437321
--- Diff:
sql/core/src/test/java/test/org/apache/spark/sql/execution/sort/RecordBinaryComparatorSuite.java
---
@@ -0,0 +1,255
Github user jiangxb1987 commented on the issue:
https://github.com/apache/spark/pull/21567
Overall I don't think the current logic shall be modified. However, it
shall be useful to document some of the configs mentioned in th
Github user jiangxb1987 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21567#discussion_r196433023
--- Diff: core/src/main/scala/org/apache/spark/ui/ConsoleProgressBar.scala
---
@@ -34,7 +34,7 @@ private[spark] class ConsoleProgressBar(sc
Github user jiangxb1987 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21567#discussion_r196432597
--- Diff: core/src/main/scala/org/apache/spark/executor/Executor.scala ---
@@ -613,7 +614,7 @@ private[spark] class Executor(
private[this] val
Github user jiangxb1987 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21567#discussion_r196431492
--- Diff:
core/src/main/scala/org/apache/spark/deploy/worker/DriverRunner.scala ---
@@ -58,7 +59,7 @@ private[deploy] class DriverRunner
Github user jiangxb1987 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21567#discussion_r196431134
--- Diff:
core/src/main/scala/org/apache/spark/api/python/PythonRunner.scala ---
@@ -354,7 +355,8 @@ private[spark] abstract class BasePythonRunner[IN
Github user jiangxb1987 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21575#discussion_r196429048
--- Diff: core/src/main/scala/org/apache/spark/HeartbeatReceiver.scala ---
@@ -75,16 +76,18 @@ private[spark] class HeartbeatReceiver(sc:
SparkContext
Github user jiangxb1987 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21575#discussion_r196428732
--- Diff: core/src/main/scala/org/apache/spark/HeartbeatReceiver.scala ---
@@ -75,16 +76,18 @@ private[spark] class HeartbeatReceiver(sc:
SparkContext
Github user jiangxb1987 commented on the issue:
https://github.com/apache/spark/pull/21558
I guess https://issues.apache.org/jira/browse/SPARK-24492 is potentially
caused by the output committer issue?
Github user jiangxb1987 commented on the issue:
https://github.com/apache/spark/pull/21570
cc @JoshRosen @gatorsmile
Github user jiangxb1987 commented on the issue:
https://github.com/apache/spark/pull/21570
@kiszk Thanks, updated!
GitHub user jiangxb1987 opened a pull request:
https://github.com/apache/spark/pull/21570
[SPARK-24564][TEST] Add test suite for RecordBinaryComparator
## What changes were proposed in this pull request?
Add a new test suite to test RecordBinaryComparator.
## How
Github user jiangxb1987 commented on the issue:
https://github.com/apache/spark/pull/21536
Thanks! @HyukjinKwon
Github user jiangxb1987 commented on the issue:
https://github.com/apache/spark/pull/21536
@HyukjinKwon can we merge this?
Github user jiangxb1987 commented on the issue:
https://github.com/apache/spark/pull/21533
LGTM except for the comment from @HyukjinKwon
Github user jiangxb1987 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21533#discussion_r195310633
--- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala ---
@@ -1517,9 +1517,12 @@ class SparkContext(config: SparkConf) extends
Logging
Github user jiangxb1987 commented on the issue:
https://github.com/apache/spark/pull/21558
As @squito suggested, we can either use taskAttemptId or combine
stageAttemptId and taskAttemptNumber together, both shall be able to represent
a unique task attempt
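A hypothetical sketch of the two options above; the names are illustrative and not the actual OutputCommitCoordinator code.

```scala
// Option 1: the globally unique taskAttemptId assigned by the scheduler.
def keyFromGlobalId(taskAttemptId: Long): Long = taskAttemptId

// Option 2: combine stageAttemptId and taskAttemptNumber, which together
// identify a unique attempt within one (stage, partition).
final case class AttemptKey(stageAttemptId: Int, taskAttemptNumber: Int)

def keyFromStageInfo(stageAttemptId: Int, taskAttemptNumber: Int): AttemptKey =
  AttemptKey(stageAttemptId, taskAttemptNumber)
```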
Github user jiangxb1987 commented on the issue:
https://github.com/apache/spark/pull/21536
retest this please
Github user jiangxb1987 commented on the issue:
https://github.com/apache/spark/pull/21536
retest this please
GitHub user jiangxb1987 opened a pull request:
https://github.com/apache/spark/pull/21545
[SPARK-23010][BUILD] Fix java checkstyle failure of
kubernetes-integration-tests
## What changes were proposed in this pull request?
Fix java checkstyle failure of kubernetes
Github user jiangxb1987 commented on the issue:
https://github.com/apache/spark/pull/21536
retest this please
GitHub user jiangxb1987 opened a pull request:
https://github.com/apache/spark/pull/21536
[MINOR][CORE][TEST] Remove unnecessary sort in UnsafeInMemorySorterSuite
## What changes were proposed in this pull request?
We don't require specific ordering of the input data
Github user jiangxb1987 commented on the issue:
https://github.com/apache/spark/pull/19528
retest this please
Github user jiangxb1987 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21514#discussion_r194296152
--- Diff:
core/src/main/scala/org/apache/spark/scheduler/cluster/StandaloneSchedulerBackend.scala
---
@@ -100,7 +100,7 @@ private[spark] class
Github user jiangxb1987 commented on the issue:
https://github.com/apache/spark/pull/21514
Have you tried the config "spark.redaction.regex" ?
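For reference, a minimal sketch of setting `spark.redaction.regex`; the extended pattern below is only an example, not the built-in default.

```scala
import org.apache.spark.SparkConf

// Redact any config key matching the pattern (here also keys containing
// "token") in the UI and event logs.
val conf = new SparkConf()
  .setAppName("redaction-example")
  .set("spark.redaction.regex", "(?i)secret|password|token")
```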
Github user jiangxb1987 commented on the issue:
https://github.com/apache/spark/pull/21494
@Ngone51 You can refer to the SPIP that xiangrui proposed in SPARK-24374
for basic background and the major goals of barrier scheduling, and you can also
refer to SPARK-24375 for a design sketch
GitHub user jiangxb1987 opened a pull request:
https://github.com/apache/spark/pull/21494
[WIP][SPARK-24375][Prototype] Support barrier scheduling
## What changes were proposed in this pull request?
Add new RDDBarrier and BarrierTaskContext to support barrier scheduling in
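A sketch of how the proposed API might be used, inferred only from the class names above (`RDDBarrier`, `BarrierTaskContext`); the prototype was still WIP, so names and semantics were subject to change, and `sc` is assumed to be an existing SparkContext.

```scala
import org.apache.spark.BarrierTaskContext

val doubled = sc
  .parallelize(1 to 100, 4)
  .barrier()                      // mark this stage for barrier execution
  .mapPartitions { iter =>
    val context = BarrierTaskContext.get()
    context.barrier()             // all tasks in the stage reach this point together
    iter.map(_ * 2)
  }
  .collect()
```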
Github user jiangxb1987 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21333#discussion_r192000399
--- Diff: core/src/test/scala/org/apache/spark/rdd/RDDSuite.scala ---
@@ -154,6 +154,13 @@ class RDDSuite extends SparkFunSuite with
SharedSparkContext
Github user jiangxb1987 commented on the issue:
https://github.com/apache/spark/pull/21454
LGTM
Github user jiangxb1987 commented on the issue:
https://github.com/apache/spark/pull/21454
IIUC this PR prints the config key in the error message if the config
value (either the default or the one from the configMap) can't be cast properly.
Personally I think it adds some value to include
Github user jiangxb1987 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21454#discussion_r191584812
--- Diff: core/src/main/scala/org/apache/spark/SparkConf.scala ---
@@ -448,6 +473,20 @@ class SparkConf(loadDefaults: Boolean) extends
Cloneable with
Github user jiangxb1987 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21454#discussion_r191582665
--- Diff: core/src/main/scala/org/apache/spark/SparkConf.scala ---
@@ -394,23 +407,35 @@ class SparkConf(loadDefaults: Boolean) extends
Cloneable with
Github user jiangxb1987 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21454#discussion_r191582611
--- Diff: core/src/main/scala/org/apache/spark/SparkConf.scala ---
@@ -394,23 +407,35 @@ class SparkConf(loadDefaults: Boolean) extends
Cloneable with
Github user jiangxb1987 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21454#discussion_r191582499
--- Diff: core/src/main/scala/org/apache/spark/SparkConf.scala ---
@@ -394,23 +407,35 @@ class SparkConf(loadDefaults: Boolean) extends
Cloneable with
Github user jiangxb1987 commented on the issue:
https://github.com/apache/spark/pull/21390
Are there any other concerns over this PR?
Github user jiangxb1987 commented on the issue:
https://github.com/apache/spark/pull/21390
@jerryshao Agree it would be useful to add a `debug-delay-sec` config for
ease of development; since this PR has already brought in a bunch of code
changes, maybe we can add the config in a
Github user jiangxb1987 commented on the issue:
https://github.com/apache/spark/pull/21390
retest this please
Github user jiangxb1987 commented on the issue:
https://github.com/apache/spark/pull/21406
cc @cloud-fan
GitHub user jiangxb1987 opened a pull request:
https://github.com/apache/spark/pull/21406
[Minor][Core] Cleanup unused vals in `DAGScheduler.handleTaskCompletion`
## What changes were proposed in this pull request?
Cleanup unused vals in `DAGScheduler.handleTaskCompletion
Github user jiangxb1987 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21390#discussion_r190098764
--- Diff:
common/network-shuffle/src/test/java/org/apache/spark/network/shuffle/NonShuffleFilesCleanupSuite.java
---
@@ -0,0 +1,222
Github user jiangxb1987 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21390#discussion_r190098624
--- Diff: core/src/main/scala/org/apache/spark/deploy/worker/Worker.scala
---
@@ -732,6 +736,9 @@ private[deploy] class Worker
Github user jiangxb1987 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21390#discussion_r190098033
--- Diff: core/src/main/scala/org/apache/spark/deploy/worker/Worker.scala
---
@@ -97,6 +97,10 @@ private[deploy] class Worker(
private val
Github user jiangxb1987 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21390#discussion_r190074813
--- Diff:
common/network-common/src/main/java/org/apache/spark/network/util/JavaUtils.java
---
@@ -157,10 +172,10 @@ private static void
GitHub user jiangxb1987 opened a pull request:
https://github.com/apache/spark/pull/21390
[SPARK-24340][Core] Clean up non-shuffle disk block manager files following
executor death
## What changes were proposed in this pull request?
Currently we only clean up the local
Github user jiangxb1987 commented on the issue:
https://github.com/apache/spark/pull/21341
Personally I feel it should be safe to do the revert since we have a better
approach, but I'd prefer to hear what @squito thinks about
Github user jiangxb1987 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21299#discussion_r187833760
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/SQLExecution.scala ---
@@ -90,13 +92,33 @@ object SQLExecution {
* thread from
Github user jiangxb1987 commented on the issue:
https://github.com/apache/spark/pull/21252
retest this please
Github user jiangxb1987 commented on the issue:
https://github.com/apache/spark/pull/21252
The changes look good to me, but it would also be great to have a test
suite to cover this change. Seems we don't have a test suite for the rule
`SpecialL
Github user jiangxb1987 commented on the issue:
https://github.com/apache/spark/pull/21213
also cc @gengliangwang
Github user jiangxb1987 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21225#discussion_r185793645
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/window/WindowExec.scala
---
@@ -114,7 +114,8 @@ case class WindowExec
Github user jiangxb1987 commented on the issue:
https://github.com/apache/spark/pull/21214
> but the thread killing left the shared SparkContext sometimes in a state
where further jobs can't be submitted.
Just curious how this
Github user jiangxb1987 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21206#discussion_r185526310
--- Diff:
sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/WritableColumnVector.java
---
@@ -92,17 +92,22 @@ public void reserve(int
Github user jiangxb1987 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21206#discussion_r185489656
--- Diff:
sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/WritableColumnVector.java
---
@@ -92,17 +92,22 @@ public void reserve(int
Github user jiangxb1987 commented on the issue:
https://github.com/apache/spark/pull/21212
How much memory did the converted pairs consume? If the empty blocks should
be an issue, can we just clean up the empty blocks
Github user jiangxb1987 commented on the issue:
https://github.com/apache/spark/pull/21206
retest this please