dongjoon-hyun opened a new pull request, #644:
URL: https://github.com/apache/spark-kubernetes-operator/pull/644
### What changes were proposed in this pull request?
This PR adds a periodic `System.gc()` invocation in the Spark Operator JVM,
controlled by a new configuration:
- `spark.kubernetes.operator.periodicGC.intervalSeconds` (default: `120`)
- Set to `0` or a negative value to disable.
- Note that `System.gc()` is a no-op when the JVM is started with
`-XX:+DisableExplicitGC`.
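
As a rough illustration of the interval semantics described above, here is a minimal sketch using a `ScheduledExecutorService`. It is not the code in this PR: the class name `PeriodicGcSketch`, the method names, the `periodic-gc` thread name, and the choice of `scheduleAtFixedRate` with an initial delay equal to the interval are all assumptions made for the example.

```java
import java.util.Optional;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch; names are illustrative and not taken from the operator source.
final class PeriodicGcSketch {
  private PeriodicGcSketch() {}

  /** Zero or a negative interval disables the periodic GC. */
  static boolean isEnabled(long intervalSeconds) {
    return intervalSeconds > 0;
  }

  /**
   * Schedules gcTask every intervalSeconds on a daemon thread.
   * Returns an empty Optional when the interval disables the feature.
   */
  static Optional<ScheduledExecutorService> start(long intervalSeconds, Runnable gcTask) {
    if (!isEnabled(intervalSeconds)) {
      return Optional.empty();
    }
    ScheduledExecutorService scheduler =
        Executors.newSingleThreadScheduledExecutor(r -> {
          Thread t = new Thread(r, "periodic-gc");
          t.setDaemon(true); // must not keep the JVM alive on shutdown
          return t;
        });
    // Assumption: the first run happens one full interval after startup.
    scheduler.scheduleAtFixedRate(gcTask, intervalSeconds, intervalSeconds, TimeUnit.SECONDS);
    return Optional.of(scheduler);
  }

  public static void main(String[] args) throws InterruptedException {
    // System.gc() is only a request, and a no-op under -XX:+DisableExplicitGC.
    Optional<ScheduledExecutorService> scheduler = start(1, System::gc);
    Thread.sleep(2500);
    scheduler.ifPresent(ScheduledExecutorService::shutdownNow);
  }
}
```

A daemon thread is used here so that the scheduler never blocks JVM exit; whether the actual implementation does the same is not stated in this description.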
### Why are the changes needed?
The Spark Operator JVM is a long-running process that continuously allocates
objects through JOSDK reconcilers and `SentinelManager` health checks. Without
an explicit Full GC, fragmentation and old-generation garbage can accumulate
over time, leading to unpredictable Full GC pauses. Triggering `System.gc()`
periodically gives operators a deterministic, observable point at which the old
generation is reclaimed, which helps keep memory behavior steady in
long-running deployments.
### Does this PR introduce _any_ user-facing change?
Yes. A new operator configuration
`spark.kubernetes.operator.periodicGC.intervalSeconds` is introduced and is
enabled by default (`120` seconds). Users who wish to opt out can set the
value to `0` or a negative number. New `INFO`-level log lines are emitted at
startup and on every GC cycle.
### How was this patch tested?
Pass the CIs. Also manually installed the operator and checked the log.
```
26/04/27 21:51:30 INFO o.a.s.k.o.SparkOperator Version: 0.9.0-SNAPSHOT
26/04/27 21:51:30 INFO o.a.s.k.o.SparkOperator Java Version: 26.0.1+8
26/04/27 21:51:30 INFO o.a.s.k.o.SparkOperator Built-in Spark Version: 4.2.0-preview4
...
26/04/27 21:51:30 INFO o.a.s.k.o.SparkOperator Periodic System.gc() enabled with interval 120s
...
26/04/27 21:53:30 INFO o.a.s.k.o.SparkOperator System.gc() finished in 41 ms. used: 31 MB -> 23 MB, total: 31 MB -> 206 MB
26/04/27 21:55:30 INFO o.a.s.k.o.SparkOperator System.gc() finished in 48 ms. used: 31 MB -> 22 MB, total: 206 MB -> 53 MB
26/04/27 21:57:30 INFO o.a.s.k.o.SparkOperator System.gc() finished in 43 ms. used: 25 MB -> 22 MB, total: 53 MB -> 53 MB
```
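
For readers wondering how figures like `used` and `total` in the log above could be obtained, the following is a minimal sketch based on `java.lang.Runtime`. It assumes that `used` means `totalMemory() - freeMemory()` and that sizes are truncated to whole MiB; neither detail is confirmed by this description, and the class name `GcLogSketch` and its methods are hypothetical.

```java
import java.util.Locale;

// Hypothetical sketch; not the operator's actual logging code.
final class GcLogSketch {
  private GcLogSketch() {}

  /** Converts bytes to whole mebibytes, truncating any remainder. */
  static long toMb(long bytes) {
    return bytes / (1024L * 1024L);
  }

  /** Builds a message shaped like the log lines above. */
  static String format(long elapsedMs, long usedBefore, long usedAfter,
                       long totalBefore, long totalAfter) {
    return String.format(Locale.ROOT,
        "System.gc() finished in %d ms. used: %d MB -> %d MB, total: %d MB -> %d MB",
        elapsedMs, toMb(usedBefore), toMb(usedAfter), toMb(totalBefore), toMb(totalAfter));
  }

  /** Requests a GC and reports heap usage before and after. */
  static String runOnce() {
    Runtime rt = Runtime.getRuntime();
    long totalBefore = rt.totalMemory();
    long usedBefore = totalBefore - rt.freeMemory(); // assumption: used = total - free
    long start = System.nanoTime();
    System.gc(); // a hint only; no-op under -XX:+DisableExplicitGC
    long elapsedMs = (System.nanoTime() - start) / 1_000_000L;
    long totalAfter = rt.totalMemory();
    long usedAfter = totalAfter - rt.freeMemory();
    return format(elapsedMs, usedBefore, usedAfter, totalBefore, totalAfter);
  }

  public static void main(String[] args) {
    System.out.println(runOnce());
  }
}
```

Under this reading, `total` is the heap currently committed by the JVM rather than the configured maximum, which would be consistent with it both growing and shrinking between cycles in the log.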
### Was this patch authored or co-authored using generative AI tooling?
Generated-by: Claude Code