sohurdc opened a new pull request, #51264:
URL: https://github.com/apache/spark/pull/51264
### What changes were proposed in this pull request?
### Why are the changes needed?
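Once checkpointing is enabled in Spark Streaming, an application restarted
from a checkpoint restores the configuration saved in that checkpoint. As a
result, configuration changes only take effect after the checkpoint is deleted,
which results in loss of state.
For example, with code like this: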
// Get StreamingContext from checkpoint data or create a new one
val context = StreamingContext.getOrCreate(checkpointDirectory,
  functionToCreateContext _)
...
context.start()
context.awaitTermination()
I modified org.apache.spark.streaming.StreamingContext by updating the
getOrCreate method, so that the latest configuration is applied when
recovering from a checkpoint. This way, there is no need to delete the
checkpoint to make configuration changes take effect.
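As a rough illustration of the intended behavior (not the code in this PR), the
idea can be sketched in plain Scala as a merge in which settings supplied at
restart override those stored in the checkpoint. The object name
`ConfMergeSketch`, the helper `mergeConf`, and the checkpointed values below
are hypothetical. The actual change lives inside `StreamingContext.getOrCreate`
and may handle individual properties differently.

```scala
// Hypothetical sketch only: it models configurations as plain maps and does
// not use or reproduce any Spark API.
object ConfMergeSketch {

  // checkpointed: settings as they were saved in the checkpoint (assumed).
  // current: settings supplied when the application is restarted.
  def mergeConf(
      checkpointed: Map[String, String],
      current: Map[String, String]): Map[String, String] =
    // `++` keeps the keys of both maps; on a clash the right-hand side wins,
    // so the restart-time value replaces the checkpointed one.
    checkpointed ++ current

  def main(args: Array[String]): Unit = {
    // Made-up values standing in for what an older run checkpointed.
    val checkpointed = Map(
      "spark.app.name" -> "demo",
      "spark.executor.instances" -> "2",
      "spark.executor.memory" -> "4g")

    // Equivalent of: --num-executors 4 --executor-memory 6G --executor-cores 3
    val current = Map(
      "spark.executor.instances" -> "4",
      "spark.executor.memory" -> "6g",
      "spark.executor.cores" -> "3")

    val merged = mergeConf(checkpointed, current)
    merged.toSeq.sorted.foreach { case (k, v) => println(s"$k=$v") }
  }
}
```

In this sketch, keys present only in the checkpoint (here `spark.app.name`) are
kept, while the executor settings take the restart-time values.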
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
After modifying the Spark source code, I replaced the client-side
spark/jars/spark-streaming_2.12-4.0.0.jar with the rebuilt jar. Then I changed
the configuration, updating the runtime resources to --num-executors 4
--executor-memory 6G --executor-cores 3. Without touching the checkpoint, I
restarted the Spark Streaming application and confirmed that the new settings
took effect.
### Was this patch authored or co-authored using generative AI tooling?
No
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]