Github user ssaavedra commented on a diff in the pull request:
https://github.com/apache/spark/pull/22392#discussion_r218616371
--- Diff:
streaming/src/main/scala/org/apache/spark/streaming/Checkpoint.scala ---
@@ -54,6 +54,10 @@ class Checkpoint(ssc: StreamingContext, val
Github user ssaavedra commented on the issue:
https://github.com/apache/spark/pull/22392
@skonto this seems to be related to how the checkpointing code works in
general. Variables that are not explicitly whitelisted in the list I'm
changing in this proposal seem to be reloaded
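The whitelist being discussed is the `propertiesToReload` list in Checkpoint.scala: on restore, the checkpointed SparkConf wins for every key except the whitelisted ones, which are re-read from the current submission. A minimal plain-Scala sketch of that idea (the `Map`s here are hypothetical stand-ins for SparkConf, and this subset of keys is illustrative, not the full list):

```scala
// Sketch of whitelist-based reload on checkpoint restore.
// Everything defaults to the checkpointed value; only whitelisted
// keys are overwritten (or dropped) from the current environment.
object ReloadSketch {
  // Hypothetical subset of propertiesToReload in Checkpoint.scala.
  val propertiesToReload: Set[String] =
    Set("spark.driver.host", "spark.driver.bindAddress")

  def restoredConf(checkpointed: Map[String, String],
                   current: Map[String, String]): Map[String, String] =
    propertiesToReload.foldLeft(checkpointed) { (conf, key) =>
      current.get(key) match {
        case Some(v) => conf.updated(key, v) // take the fresh value
        case None    => conf - key           // stale value must not survive
      }
    }

  def main(args: Array[String]): Unit = {
    val old = Map("spark.app.name" -> "job",
                  "spark.driver.bindAddress" -> "10.0.0.12") // stale pod IP
    val now = Map("spark.driver.bindAddress" -> "10.0.3.7")  // current pod IP
    println(restoredConf(old, now))
  }
}
```

This is why Kubernetes-specific settings need to be on the list: pod IPs and similar values are different on every submission, so the checkpointed copies are stale by construction.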
Github user ssaavedra commented on a diff in the pull request:
https://github.com/apache/spark/pull/22392#discussion_r218231743
--- Diff:
streaming/src/main/scala/org/apache/spark/streaming/Checkpoint.scala ---
@@ -54,6 +54,10 @@ class Checkpoint(ssc: StreamingContext, val
Github user ssaavedra commented on a diff in the pull request:
https://github.com/apache/spark/pull/22392#discussion_r218048682
--- Diff:
streaming/src/main/scala/org/apache/spark/streaming/Checkpoint.scala ---
@@ -54,6 +54,10 @@ class Checkpoint(ssc: StreamingContext, val
Github user ssaavedra commented on a diff in the pull request:
https://github.com/apache/spark/pull/22392#discussion_r217821786
--- Diff:
streaming/src/main/scala/org/apache/spark/streaming/Checkpoint.scala ---
@@ -54,6 +54,10 @@ class Checkpoint(ssc: StreamingContext, val
GitHub user ssaavedra opened a pull request:
https://github.com/apache/spark/pull/22392
[SPARK-23200] Reset Kubernetes-specific config on Checkpoint restore
Several configuration parameters related to Kubernetes need to be
reset, as they are changed with each invocation of spark
Github user ssaavedra commented on the issue:
https://github.com/apache/spark/pull/20383
Yes, sorry for the misunderstanding; I was probably also too eager with
this. However, if the changes I'm stating above don't work, I am not sure
what I'm missing now. I'll take a further look
Github user ssaavedra commented on the issue:
https://github.com/apache/spark/pull/20383
Sorry, I hadn't answered yet because it seems my patch does not apply
cleanly on 2.3. Many names were rewritten as part of the merge, as was some
of the logic for how the executor pods look up the configMap
Github user ssaavedra commented on the issue:
https://github.com/apache/spark/pull/20383
I can probably take a look at testing this over 2.3.0-rc2 on Monday. I did
not test this on a clean 2.3.0-ish branch
Github user ssaavedra commented on the issue:
https://github.com/apache/spark/pull/20383
spark-integration was created much later. I originally opened this as
https://github.com/apache-spark-on-k8s/spark/pull/516 last September. However,
the integration tests repo exists since
Github user ssaavedra commented on the issue:
https://github.com/apache/spark/pull/20383
However, Spark Streaming should always be used with checkpointing enabled if
you are using `updateStateByKey` or `reduceByKeyAndWindow` and you
don't want to lose data or miscalculate
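The reason those two operations force checkpointing is that they carry state across micro-batches; losing the checkpoint means losing the accumulated state. A plain-Scala sketch of the per-key update in the style of `updateStateByKey` (the DStream machinery is elided; `updateRunningCount` and `step` are hypothetical names for illustration):

```scala
// Sketch of the state carried across micro-batches by updateStateByKey.
object StateSketch {
  // Update function: merge a batch's new values for a key into the
  // previous state, exactly the shape updateStateByKey expects.
  def updateRunningCount(newValues: Seq[Int], state: Option[Int]): Option[Int] =
    Some(newValues.sum + state.getOrElse(0))

  // Apply one micro-batch of (key, value) pairs to the running state map.
  def step(state: Map[String, Int], batch: Seq[(String, Int)]): Map[String, Int] =
    batch.groupBy(_._1).foldLeft(state) { case (s, (k, pairs)) =>
      updateRunningCount(pairs.map(_._2), s.get(k)) match {
        case Some(v) => s.updated(k, v)
        case None    => s - k
      }
    }

  def main(args: Array[String]): Unit = {
    val batches = Seq(Seq("a" -> 1, "b" -> 2), Seq("a" -> 3))
    // This accumulated map is what the checkpoint must preserve:
    // a restarted driver cannot recompute it from the next batch alone.
    println(batches.foldLeft(Map.empty[String, Int])(step))
  }
}
```

Hence the interaction with this PR: if checkpointing is effectively mandatory for stateful streaming jobs, restore has to work on Kubernetes too, which means stale cluster-manager config must not leak out of the checkpoint.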
GitHub user ssaavedra opened a pull request:
https://github.com/apache/spark/pull/20383
[SPARK-23200] Reset Kubernetes-specific config on Checkpoint restore
## What changes were proposed in this pull request?
When using the Kubernetes cluster-manager and spawning
Github user ssaavedra commented on a diff in the pull request:
https://github.com/apache/spark/pull/19427#discussion_r148945009
--- Diff:
streaming/src/main/scala/org/apache/spark/streaming/Checkpoint.scala ---
@@ -62,6 +63,7 @@ class Checkpoint(ssc: StreamingContext, val
Github user ssaavedra commented on the issue:
https://github.com/apache/spark/pull/19427
Is anyone considering this patch? Should I advertise it anywhere else?
Github user ssaavedra commented on the issue:
https://github.com/apache/spark/pull/19469
The one I submitted here https://issues.apache.org/jira/browse/SPARK-22294
is not Kubernetes-related as such, although it does affect deployments in
Kubernetes. It should affect any spark-submit
Github user ssaavedra commented on the issue:
https://github.com/apache/spark/pull/19469
I think that may be a good idea. I'd say this can depend on the scheduler.
Should that be discussed under a different JIRA number
Github user ssaavedra commented on the issue:
https://github.com/apache/spark/pull/19427
Should I create the appropriate issue in JIRA? I'm not sure if there is any
automation which does that.
GitHub user ssaavedra opened a pull request:
https://github.com/apache/spark/pull/19427
Reset spark.driver.bindAddress when starting a Checkpoint
## What changes were proposed in this pull request?
It seems that recovering from a checkpoint can replace the old
driver