Github user xuanyuanking commented on a diff in the pull request:
https://github.com/apache/spark/pull/21370#discussion_r191685596
--- Diff: python/pyspark/sql/dataframe.py ---
@@ -351,8 +352,62 @@ def show(self, n=20, truncate=True, vertical=False):
else
Github user xuanyuanking commented on the issue:
https://github.com/apache/spark/pull/21370
@viirya @gatorsmile @ueshin @felixcheung @HyukjinKwon
The refactoring to move HTML generation out of `Dataset.scala` was done in
94f3414. Please help check whether it is appropriate
Github user xuanyuanking commented on the issue:
https://github.com/apache/spark/pull/21445
```
Looks like the patch is needed only with #21353 #21332 #21293 as of now,
right?
```
@HeartSaVioR Yes, sorry for the late explanation. The background is that we are
running a POC based
Github user xuanyuanking commented on a diff in the pull request:
https://github.com/apache/spark/pull/21385#discussion_r191149214
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/continuous/shuffle/UnsafeRowReceiver.scala
---
@@ -41,11 +50,15 @@ private
Github user xuanyuanking commented on a diff in the pull request:
https://github.com/apache/spark/pull/21370#discussion_r191080316
--- Diff: docs/configuration.md ---
@@ -456,6 +456,29 @@ Apart from these, the following properties are also
available, and may be useful
Github user xuanyuanking commented on a diff in the pull request:
https://github.com/apache/spark/pull/21370#discussion_r191080194
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala ---
@@ -237,9 +238,13 @@ class Dataset[T] private[sql](
* @param truncate
Github user xuanyuanking commented on a diff in the pull request:
https://github.com/apache/spark/pull/21370#discussion_r191080082
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala ---
@@ -358,6 +357,43 @@ class Dataset[T] private[sql](
sb.toString
Github user xuanyuanking commented on a diff in the pull request:
https://github.com/apache/spark/pull/21370#discussion_r191080049
--- Diff: python/pyspark/sql/dataframe.py ---
@@ -347,13 +347,30 @@ def show(self, n=20, truncate=True, vertical=False):
name | Bob
Github user xuanyuanking commented on a diff in the pull request:
https://github.com/apache/spark/pull/21370#discussion_r191080066
--- Diff: python/pyspark/sql/dataframe.py ---
@@ -347,13 +347,30 @@ def show(self, n=20, truncate=True, vertical=False):
name | Bob
Github user xuanyuanking commented on a diff in the pull request:
https://github.com/apache/spark/pull/21370#discussion_r191080057
--- Diff: python/pyspark/sql/tests.py ---
@@ -3040,6 +3040,50 @@ def test_csv_sampling_ratio(self):
.csv(rdd, samplingRatio=0.5
Github user xuanyuanking commented on a diff in the pull request:
https://github.com/apache/spark/pull/21370#discussion_r191080044
--- Diff: python/pyspark/sql/dataframe.py ---
@@ -347,13 +347,30 @@ def show(self, n=20, truncate=True, vertical=False):
name | Bob
Github user xuanyuanking commented on a diff in the pull request:
https://github.com/apache/spark/pull/21370#discussion_r191080037
--- Diff: python/pyspark/sql/dataframe.py ---
@@ -347,13 +347,30 @@ def show(self, n=20, truncate=True, vertical=False):
name | Bob
Github user xuanyuanking commented on a diff in the pull request:
https://github.com/apache/spark/pull/21370#discussion_r191080026
--- Diff: docs/configuration.md ---
@@ -456,6 +456,29 @@ Apart from these, the following properties are also
available, and may be useful
Github user xuanyuanking commented on the issue:
https://github.com/apache/spark/pull/21370
```
Can we also do something a bit more generic that works for non-Jupyter
notebooks as well?
```
Can we accept `spark.sql.repl.eagerEval.enabled` to control both
`__repr__` and
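The question above (a single config flag gating both the plain-text and HTML eager outputs) can be sketched as below. This is an illustrative mock, not PySpark's actual DataFrame code: the `EagerFrame` class and its `eager_eval` flag are made-up stand-ins for `spark.sql.repl.eagerEval.enabled`.

```python
import html


class EagerFrame:
    """Hypothetical REPL-friendly frame; not the real pyspark DataFrame."""

    def __init__(self, rows, header, eager_eval=True):
        self.rows = rows              # list of tuples of cell values
        self.header = header          # list of column names
        self.eager_eval = eager_eval  # stand-in for spark.sql.repl.eagerEval.enabled

    def __repr__(self):
        # Plain REPLs (and non-Jupyter notebooks) use this path.
        if not self.eager_eval:
            return object.__repr__(self)
        widths = [max(len(str(v)) for v in [h] + [r[i] for r in self.rows])
                  for i, h in enumerate(self.header)]
        fmt = " | ".join("{:%d}" % w for w in widths)
        lines = [fmt.format(*self.header)]
        lines += [fmt.format(*(str(v) for v in r)) for r in self.rows]
        return "\n".join(lines)

    def _repr_html_(self):
        # Jupyter calls this when present; returning None falls back to __repr__.
        if not self.eager_eval:
            return None
        head = "".join("<th>%s</th>" % html.escape(h) for h in self.header)
        body = "".join(
            "<tr>%s</tr>" % "".join("<td>%s</td>" % html.escape(str(v)) for v in r)
            for r in self.rows)
        return "<table><tr>%s</tr>%s</table>" % (head, body)
```

With `eager_eval` off, both representations degrade to the lazy default, which is the generic behavior asked about for non-Jupyter notebooks.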
Github user xuanyuanking commented on a diff in the pull request:
https://github.com/apache/spark/pull/21370#discussion_r190244648
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala ---
@@ -358,6 +357,43 @@ class Dataset[T] private[sql](
sb.toString
Github user xuanyuanking commented on a diff in the pull request:
https://github.com/apache/spark/pull/21370#discussion_r190154145
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala ---
@@ -237,9 +238,13 @@ class Dataset[T] private[sql](
* @param truncate
Github user xuanyuanking commented on a diff in the pull request:
https://github.com/apache/spark/pull/21370#discussion_r190154231
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala ---
@@ -358,6 +357,43 @@ class Dataset[T] private[sql](
sb.toString
Github user xuanyuanking commented on a diff in the pull request:
https://github.com/apache/spark/pull/21370#discussion_r190153907
--- Diff: docs/configuration.md ---
@@ -456,6 +456,29 @@ Apart from these, the following properties are also
available, and may be useful
Github user xuanyuanking commented on a diff in the pull request:
https://github.com/apache/spark/pull/21370#discussion_r190153833
--- Diff: python/pyspark/sql/dataframe.py ---
@@ -347,13 +347,26 @@ def show(self, n=20, truncate=True, vertical=False):
name | Bob
Github user xuanyuanking commented on a diff in the pull request:
https://github.com/apache/spark/pull/21370#discussion_r190153812
--- Diff: python/pyspark/sql/dataframe.py ---
@@ -347,13 +347,26 @@ def show(self, n=20, truncate=True, vertical=False):
name | Bob
Github user xuanyuanking commented on a diff in the pull request:
https://github.com/apache/spark/pull/21370#discussion_r189614136
--- Diff: docs/configuration.md ---
@@ -456,6 +456,29 @@ Apart from these, the following properties are also
available, and may be useful
Github user xuanyuanking commented on a diff in the pull request:
https://github.com/apache/spark/pull/21370#discussion_r189614067
--- Diff: python/pyspark/sql/dataframe.py ---
@@ -347,13 +347,26 @@ def show(self, n=20, truncate=True, vertical=False):
name | Bob
Github user xuanyuanking commented on a diff in the pull request:
https://github.com/apache/spark/pull/21370#discussion_r189613358
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala ---
@@ -292,31 +297,25 @@ class Dataset[T] private[sql
Github user xuanyuanking commented on a diff in the pull request:
https://github.com/apache/spark/pull/21370#discussion_r189611792
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala ---
@@ -358,6 +357,43 @@ class Dataset[T] private[sql](
sb.toString
Github user xuanyuanking commented on a diff in the pull request:
https://github.com/apache/spark/pull/21370#discussion_r189603851
--- Diff: python/pyspark/sql/dataframe.py ---
@@ -347,13 +347,26 @@ def show(self, n=20, truncate=True, vertical=False):
name | Bob
Github user xuanyuanking commented on the issue:
https://github.com/apache/spark/pull/21370
Thanks for all the reviewers' comments; I addressed them all in this commit.
Please have a look.
Github user xuanyuanking commented on a diff in the pull request:
https://github.com/apache/spark/pull/21370#discussion_r189574938
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala ---
@@ -237,9 +238,13 @@ class Dataset[T] private[sql](
* @param truncate
Github user xuanyuanking commented on a diff in the pull request:
https://github.com/apache/spark/pull/21370#discussion_r189570764
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala ---
@@ -358,6 +357,43 @@ class Dataset[T] private[sql](
sb.toString
Github user xuanyuanking commented on a diff in the pull request:
https://github.com/apache/spark/pull/21370#discussion_r189570479
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala ---
@@ -292,31 +297,25 @@ class Dataset[T] private[sql
Github user xuanyuanking commented on a diff in the pull request:
https://github.com/apache/spark/pull/21370#discussion_r189569952
--- Diff: python/pyspark/sql/dataframe.py ---
@@ -78,6 +78,12 @@ def __init__(self, jdf, sql_ctx):
self.is_cached = False
Github user xuanyuanking commented on a diff in the pull request:
https://github.com/apache/spark/pull/21370#discussion_r189569437
--- Diff: python/pyspark/sql/dataframe.py ---
@@ -347,13 +353,18 @@ def show(self, n=20, truncate=True, vertical=False):
name | Bob
Github user xuanyuanking commented on a diff in the pull request:
https://github.com/apache/spark/pull/21370#discussion_r189567614
--- Diff: python/pyspark/sql/dataframe.py ---
@@ -78,6 +78,12 @@ def __init__(self, jdf, sql_ctx):
self.is_cached = False
Github user xuanyuanking commented on a diff in the pull request:
https://github.com/apache/spark/pull/21370#discussion_r189567350
--- Diff: docs/configuration.md ---
@@ -456,6 +456,29 @@ Apart from these, the following properties are also
available, and may be useful
Github user xuanyuanking commented on a diff in the pull request:
https://github.com/apache/spark/pull/21370#discussion_r189567315
--- Diff: docs/configuration.md ---
@@ -456,6 +456,29 @@ Apart from these, the following properties are also
available, and may be useful
Github user xuanyuanking commented on a diff in the pull request:
https://github.com/apache/spark/pull/21370#discussion_r189567259
--- Diff: docs/configuration.md ---
@@ -456,6 +456,29 @@ Apart from these, the following properties are also
available, and may be useful
Github user xuanyuanking commented on a diff in the pull request:
https://github.com/apache/spark/pull/21370#discussion_r189483903
--- Diff: docs/configuration.md ---
@@ -456,6 +456,29 @@ Apart from these, the following properties are also
available, and may be useful
Github user xuanyuanking commented on a diff in the pull request:
https://github.com/apache/spark/pull/21370#discussion_r189483894
--- Diff: docs/configuration.md ---
@@ -456,6 +456,29 @@ Apart from these, the following properties are also
available, and may be useful
Github user xuanyuanking commented on the issue:
https://github.com/apache/spark/pull/21370
```
this will need to escape the values to make sure it is legal html too right?
```
Yes, you're right; thanks for your guidance. The new patch considers the
escaping and adds n
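The escaping concern raised above can be illustrated with Python's standard `html` module: cell values must be escaped before being embedded in generated table markup. The cell value here is a made-up example, not data from the patch.

```python
import html

# A value that would break (or inject markup into) a naive HTML table.
cell = '<b>name & "value"</b>'

# html.escape replaces &, <, >, and (by default) quote characters.
escaped = html.escape(cell)
print(escaped)                    # &lt;b&gt;name &amp; &quot;value&quot;&lt;/b&gt;
print("<td>%s</td>" % escaped)    # now safe to embed in a table cell
```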
Github user xuanyuanking commented on a diff in the pull request:
https://github.com/apache/spark/pull/21370#discussion_r189463652
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala ---
@@ -237,9 +236,13 @@ class Dataset[T] private[sql](
* @param truncate
Github user xuanyuanking commented on a diff in the pull request:
https://github.com/apache/spark/pull/21370#discussion_r189463098
--- Diff: python/pyspark/sql/dataframe.py ---
@@ -78,6 +78,12 @@ def __init__(self, jdf, sql_ctx):
self.is_cached = False
Github user xuanyuanking commented on a diff in the pull request:
https://github.com/apache/spark/pull/21370#discussion_r189463079
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala ---
@@ -3056,7 +3059,6 @@ class Dataset[T] private[sql](
* view, e.g
Github user xuanyuanking commented on the issue:
https://github.com/apache/spark/pull/21370
Not sure who the right reviewer is; maybe @rdblue @gatorsmile?
Could you help me check whether this is the right implementation for the
discussion on the dev list
GitHub user xuanyuanking opened a pull request:
https://github.com/apache/spark/pull/21370
[SPARK-24215][PySpark] Implement _repr_html_ for dataframes in PySpark
## What changes were proposed in this pull request?
Implement _repr_html_ for PySpark when running in a notebook, and add
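For context on the PR title, this is roughly the notebook display protocol involved: Jupyter calls `_repr_html_()` when an object defines it, and falls back to `__repr__` otherwise. The `Point` class below is a hypothetical illustration of the protocol, not code from the patch.

```python
class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y

    def __repr__(self):
        # Plain-text output used by the standard REPL.
        return "Point(%r, %r)" % (self.x, self.y)

    def _repr_html_(self):
        # Rich output picked up automatically by Jupyter notebooks.
        return "<b>Point</b>(%d, %d)" % (self.x, self.y)
```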
Github user xuanyuanking commented on a diff in the pull request:
https://github.com/apache/spark/pull/21353#discussion_r188975680
--- Diff:
core/src/main/scala/org/apache/spark/scheduler/ContinuousShuffleMapTask.scala
---
@@ -0,0 +1,139 @@
+/*
+ * Licensed to the Apache
Github user xuanyuanking commented on a diff in the pull request:
https://github.com/apache/spark/pull/21353#discussion_r188974718
--- Diff: core/src/main/scala/org/apache/spark/SparkEnv.scala ---
@@ -140,6 +140,7 @@ object SparkEnv extends Logging {
private[spark] val
Github user xuanyuanking commented on a diff in the pull request:
https://github.com/apache/spark/pull/21353#discussion_r188974568
--- Diff: core/src/main/scala/org/apache/spark/MapOutputTracker.scala ---
@@ -213,6 +213,12 @@ private[spark] sealed trait MapOutputTrackerMessage
Github user xuanyuanking commented on a diff in the pull request:
https://github.com/apache/spark/pull/21353#discussion_r188974319
--- Diff: core/src/main/scala/org/apache/spark/Dependency.scala ---
@@ -88,14 +96,53 @@ class ShuffleDependency[K: ClassTag, V: ClassTag, C:
ClassTag
GitHub user xuanyuanking opened a pull request:
https://github.com/apache/spark/pull/21353
[SPARK-24036][SS] Scheduler changes for continuous processing shuffle
support
## What changes were proposed in this pull request?
This is the last part of the preview PRs, the mainly
Github user xuanyuanking commented on the issue:
https://github.com/apache/spark/pull/21332
> As discussed in the other PR, I'm not sure about how we're integrating
with the scheduler here, so I can't really give a more detailed review at this
point.
My
Github user xuanyuanking commented on the issue:
https://github.com/apache/spark/pull/21114
retest this please
Github user xuanyuanking commented on a diff in the pull request:
https://github.com/apache/spark/pull/21337#discussion_r188604001
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/continuous/shuffle/ContinuousShuffleReadRDD.scala
---
@@ -0,0 +1,64
Github user xuanyuanking commented on a diff in the pull request:
https://github.com/apache/spark/pull/21337#discussion_r188601016
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/continuous/shuffle/UnsafeRowReceiver.scala
---
@@ -0,0 +1,56
Github user xuanyuanking commented on the issue:
https://github.com/apache/spark/pull/21332
cc @jose-torres
As we discussed in #21293, the main difference between us is whether we
can reuse the current implementation of the scheduler and shuffle, but in this
part about the
GitHub user xuanyuanking opened a pull request:
https://github.com/apache/spark/pull/21332
[SPARK-24236][SS] Continuous replacement for ShuffleExchangeExec
## What changes were proposed in this pull request?
1. New RDD named ContinuousShuffleRowRDD
2. New case class
Github user xuanyuanking commented on the issue:
https://github.com/apache/spark/pull/21293
@jose-torres Many thanks for your advice and guidance! I found that the
main difference between us is whether we can reuse the current implementation
of the scheduler and shuffle. I marked in your
Github user xuanyuanking commented on a diff in the pull request:
https://github.com/apache/spark/pull/21293#discussion_r188277683
--- Diff:
core/src/main/scala/org/apache/spark/scheduler/ContinuousShuffleMapTask.scala
---
@@ -0,0 +1,139 @@
+/*
+ * Licensed to the Apache
Github user xuanyuanking commented on a diff in the pull request:
https://github.com/apache/spark/pull/21293#discussion_r188273722
--- Diff: core/src/main/scala/org/apache/spark/MapOutputTracker.scala ---
@@ -769,6 +796,43 @@ private[spark] class MapOutputTrackerWorker(conf
Github user xuanyuanking commented on a diff in the pull request:
https://github.com/apache/spark/pull/21293#discussion_r188270290
--- Diff: core/src/main/scala/org/apache/spark/Dependency.scala ---
@@ -88,14 +90,53 @@ class ShuffleDependency[K: ClassTag, V: ClassTag, C:
ClassTag
Github user xuanyuanking commented on a diff in the pull request:
https://github.com/apache/spark/pull/21293#discussion_r188269208
--- Diff: core/src/main/scala/org/apache/spark/Dependency.scala ---
@@ -65,15 +65,17 @@ abstract class NarrowDependency[T](_rdd: RDD[T])
extends
Github user xuanyuanking commented on a diff in the pull request:
https://github.com/apache/spark/pull/21114#discussion_r188152980
--- Diff: core/src/test/scala/org/apache/spark/AccumulatorSuite.scala ---
@@ -237,6 +236,65 @@ class AccumulatorSuite extends SparkFunSuite with
Github user xuanyuanking commented on the issue:
https://github.com/apache/spark/pull/21114
cc @cloud-fan
Github user xuanyuanking commented on a diff in the pull request:
https://github.com/apache/spark/pull/21114#discussion_r187823469
--- Diff: core/src/test/scala/org/apache/spark/AccumulatorSuite.scala ---
@@ -237,6 +236,65 @@ class AccumulatorSuite extends SparkFunSuite with
Github user xuanyuanking commented on a diff in the pull request:
https://github.com/apache/spark/pull/21114#discussion_r187763308
--- Diff: core/src/test/scala/org/apache/spark/AccumulatorSuite.scala ---
@@ -237,6 +236,65 @@ class AccumulatorSuite extends SparkFunSuite with
Github user xuanyuanking commented on a diff in the pull request:
https://github.com/apache/spark/pull/21114#discussion_r187763285
--- Diff: core/src/test/scala/org/apache/spark/AccumulatorSuite.scala ---
@@ -209,10 +209,8 @@ class AccumulatorSuite extends SparkFunSuite with
Github user xuanyuanking commented on a diff in the pull request:
https://github.com/apache/spark/pull/21293#discussion_r187599741
--- Diff: core/src/main/scala/org/apache/spark/MapOutputTracker.scala ---
@@ -769,6 +796,43 @@ private[spark] class MapOutputTrackerWorker(conf
Github user xuanyuanking commented on a diff in the pull request:
https://github.com/apache/spark/pull/21293#discussion_r187598787
--- Diff:
core/src/main/scala/org/apache/spark/scheduler/ContinuousShuffleMapTask.scala
---
@@ -0,0 +1,139 @@
+/*
+ * Licensed to the Apache
Github user xuanyuanking commented on a diff in the pull request:
https://github.com/apache/spark/pull/21293#discussion_r187598365
--- Diff: core/src/main/scala/org/apache/spark/SparkEnv.scala ---
@@ -227,6 +228,7 @@ object SparkEnv extends Logging
Github user xuanyuanking commented on a diff in the pull request:
https://github.com/apache/spark/pull/21293#discussion_r187598100
--- Diff: core/src/main/scala/org/apache/spark/MapOutputTracker.scala ---
@@ -233,6 +239,28 @@ private[spark] class MapOutputTrackerMasterEndpoint
Github user xuanyuanking commented on a diff in the pull request:
https://github.com/apache/spark/pull/21293#discussion_r187597922
--- Diff: core/src/main/scala/org/apache/spark/Dependency.scala ---
@@ -88,14 +90,53 @@ class ShuffleDependency[K: ClassTag, V: ClassTag, C:
ClassTag
Github user xuanyuanking commented on a diff in the pull request:
https://github.com/apache/spark/pull/21293#discussion_r187596748
--- Diff: core/src/main/scala/org/apache/spark/Dependency.scala ---
@@ -65,15 +65,17 @@ abstract class NarrowDependency[T](_rdd: RDD[T])
extends
Github user xuanyuanking commented on the issue:
https://github.com/apache/spark/pull/21293
cc @jose-torres @zsxwing
GitHub user xuanyuanking opened a pull request:
https://github.com/apache/spark/pull/21293
[SPARK-24237][SS] Continuous shuffle dependency and map output tracker
## What changes were proposed in this pull request?
As discussed in the [jira
comment](https
Github user xuanyuanking commented on a diff in the pull request:
https://github.com/apache/spark/pull/21199#discussion_r186764630
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/continuous/ContinuousTextSocketSource.scala
---
@@ -0,0 +1,304
Github user xuanyuanking commented on a diff in the pull request:
https://github.com/apache/spark/pull/21199#discussion_r186765402
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/continuous/ContinuousTextSocketSource.scala
---
@@ -0,0 +1,304
Github user xuanyuanking commented on the issue:
https://github.com/apache/spark/pull/21188
@maasg As commented in #21194, I just think we should not change the
behavior while `seconds > rampUpTimeSeconds`. Maybe it is more important than
smo
Github user xuanyuanking commented on a diff in the pull request:
https://github.com/apache/spark/pull/21188#discussion_r185852663
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/sources/RateStreamProvider.scala
---
@@ -107,14 +107,25 @@ object
Github user xuanyuanking commented on a diff in the pull request:
https://github.com/apache/spark/pull/21194#discussion_r185851172
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/execution/streaming/sources/RateStreamProviderSuite.scala
---
@@ -173,55 +173,154 @@ class
Github user xuanyuanking commented on a diff in the pull request:
https://github.com/apache/spark/pull/21194#discussion_r185252544
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/sources/RateStreamProvider.scala
---
@@ -101,25 +101,10 @@ object
Github user xuanyuanking commented on a diff in the pull request:
https://github.com/apache/spark/pull/21194#discussion_r185252360
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/sources/RateStreamProvider.scala
---
@@ -101,25 +101,10 @@ object
Github user xuanyuanking commented on a diff in the pull request:
https://github.com/apache/spark/pull/21175#discussion_r184882338
--- Diff:
core/src/test/scala/org/apache/spark/io/ChunkedByteBufferSuite.scala ---
@@ -20,12 +20,12 @@ package org.apache.spark.io
import
Github user xuanyuanking commented on a diff in the pull request:
https://github.com/apache/spark/pull/21175#discussion_r184882396
--- Diff:
core/src/test/scala/org/apache/spark/io/ChunkedByteBufferSuite.scala ---
@@ -20,12 +20,12 @@ package org.apache.spark.io
import
Github user xuanyuanking commented on a diff in the pull request:
https://github.com/apache/spark/pull/21177#discussion_r184725980
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala
---
@@ -78,7 +81,7 @@ object TPCDSQueryBenchmark
Github user xuanyuanking commented on a diff in the pull request:
https://github.com/apache/spark/pull/21177#discussion_r184724132
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala
---
@@ -87,10 +90,20 @@ object
Github user xuanyuanking commented on the issue:
https://github.com/apache/spark/pull/20930
> Have you applied this patch: #17955 ?
No, this happened on Spark 2.1. Thanks xingbo & wenchen, I'll backport
this patch to our internal Spark 2.1.
> That
Github user xuanyuanking commented on a diff in the pull request:
https://github.com/apache/spark/pull/20930#discussion_r184276403
--- Diff:
core/src/test/scala/org/apache/spark/scheduler/DAGSchedulerSuite.scala ---
@@ -2399,6 +2399,84 @@ class DAGSchedulerSuite extends
Github user xuanyuanking commented on a diff in the pull request:
https://github.com/apache/spark/pull/20930#discussion_r184274946
--- Diff:
core/src/test/scala/org/apache/spark/scheduler/DAGSchedulerSuite.scala ---
@@ -2399,6 +2399,84 @@ class DAGSchedulerSuite extends
Github user xuanyuanking commented on a diff in the pull request:
https://github.com/apache/spark/pull/20930#discussion_r184260597
--- Diff:
core/src/test/scala/org/apache/spark/scheduler/DAGSchedulerSuite.scala ---
@@ -2399,6 +2399,84 @@ class DAGSchedulerSuite extends
Github user xuanyuanking commented on a diff in the pull request:
https://github.com/apache/spark/pull/20930#discussion_r184260210
--- Diff:
core/src/test/scala/org/apache/spark/scheduler/DAGSchedulerSuite.scala ---
@@ -2399,6 +2399,84 @@ class DAGSchedulerSuite extends
Github user xuanyuanking commented on a diff in the pull request:
https://github.com/apache/spark/pull/20930#discussion_r184109204
--- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala
---
@@ -1266,6 +1266,9 @@ class DAGScheduler
Github user xuanyuanking commented on a diff in the pull request:
https://github.com/apache/spark/pull/21114#discussion_r183772627
--- Diff: core/src/test/scala/org/apache/spark/AccumulatorSuite.scala ---
@@ -209,10 +209,8 @@ class AccumulatorSuite extends SparkFunSuite with
Github user xuanyuanking commented on a diff in the pull request:
https://github.com/apache/spark/pull/21114#discussion_r183770468
--- Diff: core/src/main/scala/org/apache/spark/util/AccumulatorV2.scala ---
@@ -258,14 +258,8 @@ private[spark] object AccumulatorContext
Github user xuanyuanking commented on the issue:
https://github.com/apache/spark/pull/21136
+1 for this.
We found this when a CP app used filter with functions; this can be
supported by the current implementation.
cc @jose-torres @zsxwing @tdas
Github user xuanyuanking commented on a diff in the pull request:
https://github.com/apache/spark/pull/21136#discussion_r183604217
--- Diff:
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/UnsupportedOperationsSuite.scala
---
@@ -771,7 +778,16 @@ class
Github user xuanyuanking commented on a diff in the pull request:
https://github.com/apache/spark/pull/20946#discussion_r183447816
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/OffsetSeq.scala
---
@@ -39,7 +39,9 @@ case class OffsetSeq(offsets: Seq
Github user xuanyuanking commented on a diff in the pull request:
https://github.com/apache/spark/pull/20946#discussion_r183447988
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/execution/streaming/OffsetSeqLogSuite.scala
---
@@ -125,6 +125,19 @@ class OffsetSeqLogSuite
Github user xuanyuanking commented on a diff in the pull request:
https://github.com/apache/spark/pull/21116#discussion_r183224838
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/continuous/WriteToContinuousDataSourceExec.scala
---
@@ -0,0 +1,126
Github user xuanyuanking commented on the issue:
https://github.com/apache/spark/pull/20930
![image](https://user-images.githubusercontent.com/4833765/39091106-ff11d0a6-461f-11e8-968f-7fcbe6652bb3.png)
Stages 0/1/2/3 are the same as 20/21/22/23 in this screenshot; stage 2's shuf
Github user xuanyuanking commented on the issue:
https://github.com/apache/spark/pull/20930
@Ngone51 Ah, maybe I know how the description misled you: in item 5 of the
description, 'this stage' refers to 'Stage 2' in the screenshot. Thanks for
your check, I modifie
Github user xuanyuanking commented on the issue:
https://github.com/apache/spark/pull/20930
@Ngone51
You can check the screenshot in detail: stage 2's shuffleId is 1, but stage
3 failed due to a missing output for shuffle 0! So here stage 2's skip cause
Github user xuanyuanking commented on a diff in the pull request:
https://github.com/apache/spark/pull/20930#discussion_r183198368
--- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala
---
@@ -1266,6 +1266,9 @@ class DAGScheduler