[
https://issues.apache.org/jira/browse/FELIX-6475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17451645#comment-17451645
]
Georg Henzler commented on FELIX-6475:
--------------------------------------
While it's unlikely that a Throwable occurs at this spot (because system
errors raised inside the future are wrapped in an ExecutionException), in theory
the code that handles the future could itself run into the OOM error. So I
think it's conceptually correct to catch Throwable here even if it will rarely
happen. It would arguably be even nicer to also show the message "System error
during health check execution" for the regular case; for that we would have to
catch the ExecutionException explicitly and show that system error message
when the ExecutionException wraps a system error.
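
To illustrate the idea, here is a minimal sketch, not the actual
HealthCheckExecutorImpl code. The class FutureErrorHandlingSketch, the helper
collectResult and the returned strings are made up for this example; only the
message "System error during health check execution" comes from the
discussion. It assumes the result is obtained via Future.get(), as the stack
trace in the issue suggests.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch; names are illustrative and not taken from Felix.
final class FutureErrorHandlingSketch {

    static final String SYSTEM_ERROR_MSG = "System error during health check execution";

    // Turns the outcome of a future into a plain status string.
    static String collectResult(Future<String> future) {
        try {
            return "OK: " + future.get();
        } catch (ExecutionException e) {
            // Regular case: anything thrown inside the task arrives wrapped here.
            Throwable cause = e.getCause();
            if (cause instanceof Error) {
                // The wrapped cause is a system error such as OutOfMemoryError.
                return SYSTEM_ERROR_MSG + ": " + cause;
            }
            return "Exception during health check execution: " + cause;
        } catch (InterruptedException e) {
            // Restore the interrupt flag so callers can still observe it.
            Thread.currentThread().interrupt();
            return "Interrupted while waiting for health check result";
        } catch (Throwable t) {
            // Rare case: the future handling itself failed, e.g. with an OOM.
            return SYSTEM_ERROR_MSG + ": " + t;
        }
    }

    public static void main(String[] args) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            Callable<String> healthy = () -> "GREEN";
            Callable<String> outOfMemory = () -> {
                // Simulated error; no memory is actually exhausted.
                throw new OutOfMemoryError("simulated");
            };
            System.out.println(collectResult(executor.submit(healthy)));
            System.out.println(collectResult(executor.submit(outOfMemory)));
        } finally {
            executor.shutdown();
        }
    }
}
```

With this shape, both the wrapped and the unwrapped system error end up with
the same message, while ordinary exceptions from a check stay distinguishable.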
> How to handle OutOfMemoryError in health check
> ----------------------------------------------
>
> Key: FELIX-6475
> URL: https://issues.apache.org/jira/browse/FELIX-6475
> Project: Felix
> Issue Type: Bug
> Components: Health Checks
> Affects Versions: healthcheck.core 2.0.10
> Reporter: Christian Schneider
> Priority: Critical
>
> Currently a Java Error thrown during a health check results in a HealthCheck
> ERROR state.
> This is especially problematic when the health check is a k8s readiness check,
> as the pod is then taken out of the load balancer but not necessarily
> restarted.
> After digging further, I found that the error happens inside the
> BundlesStartedCheck. I don't think the check is causing the OutOfMemoryError,
> but it repeatedly shows it.
> So the question is: how should the Felix health check code handle such a Java
> Error?
>
> {code:java}
> 10.04.2021 08:00:10.181 *WARN* [hc-monitor-15-systemalive,systemready] org.apache.felix.hc.core.impl.executor.HealthCheckExecutorImpl Unexpected Exception during future.get(): java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: GC overhead limit exceeded
> java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: GC overhead limit exceeded
> at java.base/java.util.concurrent.FutureTask.report(FutureTask.java:122)
> at java.base/java.util.concurrent.FutureTask.get(FutureTask.java:191)
> at org.apache.felix.hc.core.impl.executor.HealthCheckExecutorImpl.collectResultFromFuture(HealthCheckExecutorImpl.java:430) [org.apache.felix.healthcheck.core:2.0.8]
> at org.apache.felix.hc.core.impl.executor.HealthCheckExecutorImpl.collectResultsFromFutures(HealthCheckExecutorImpl.java:408) [org.apache.felix.healthcheck.core:2.0.8]
> at org.apache.felix.hc.core.impl.executor.HealthCheckExecutorImpl.createResultsForDescriptors(HealthCheckExecutorImpl.java:268) [org.apache.felix.healthcheck.core:2.0.8]
> at org.apache.felix.hc.core.impl.executor.HealthCheckExecutorImpl.execute(HealthCheckExecutorImpl.java:211) [org.apache.felix.healthcheck.core:2.0.8]
> at org.apache.felix.hc.core.impl.executor.HealthCheckExecutorImpl.execute(HealthCheckExecutorImpl.java:181) [org.apache.felix.healthcheck.core:2.0.8]
> at org.apache.felix.hc.core.impl.executor.HealthCheckExecutorImpl.execute(HealthCheckExecutorImpl.java:168) [org.apache.felix.healthcheck.core:2.0.8]
> at org.apache.felix.hc.core.impl.monitor.HealthState.update(HealthState.java:123) [org.apache.felix.healthcheck.core:2.0.8]
> at org.apache.felix.hc.core.impl.monitor.HealthCheckMonitor.runWithThreadNameContext(HealthCheckMonitor.java:321) [org.apache.felix.healthcheck.core:2.0.8]
> at org.apache.felix.hc.core.impl.monitor.HealthCheckMonitor.lambda$null$2(HealthCheckMonitor.java:264) [org.apache.felix.healthcheck.core:2.0.8]
> at java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
> at java.base/java.util.concurrent.ConcurrentHashMap$ValueSpliterator.forEachRemaining(ConcurrentHashMap.java:3605)
> at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484)
> at java.base/java.util.stream.ForEachOps$ForEachTask.compute(ForEachOps.java:290)
> at java.base/java.util.concurrent.CountedCompleter.exec(CountedCompleter.java:746)
> at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:290)
> at java.base/java.util.concurrent.ForkJoinTask.doInvoke(ForkJoinTask.java:408)
> at java.base/java.util.concurrent.ForkJoinTask.invoke(ForkJoinTask.java:736)
> at java.base/java.util.stream.ForEachOps$ForEachOp.evaluateParallel(ForEachOps.java:159)
> at java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateParallel(ForEachOps.java:173)
> at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:233)
> at java.base/java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:497)
> at java.base/java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:661)
> at org.apache.felix.hc.core.impl.monitor.HealthCheckMonitor.lambda$run$3(HealthCheckMonitor.java:263) [org.apache.felix.healthcheck.core:2.0.8]
> at org.apache.felix.hc.core.impl.monitor.HealthCheckMonitor.runWithThreadNameContext(HealthCheckMonitor.java:321) [org.apache.felix.healthcheck.core:2.0.8]
> at org.apache.felix.hc.core.impl.monitor.HealthCheckMonitor.run(HealthCheckMonitor.java:259) [org.apache.felix.healthcheck.core:2.0.8]
> at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
> at java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305)
> at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305)
> at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
> at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
> at java.base/java.lang.Thread.run(Thread.java:834)
> Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded
> 10.04.2021 08:00:10.231 *WARN* [hc-monitor-15-systemalive,systemready] org.apache.felix.hc.core.impl.executor.HealthCheckExecutorImpl Unexpected Exception during future.get(): java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: GC overhead limit exceeded
> java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: GC overhead limit exceeded
> at java.base/java.util.concurrent.FutureTask.report(FutureTask.java:122)
> at java.base/java.util.concurrent.FutureTask.get(FutureTask.java:191)
> at org.apache.felix.hc.core.impl.executor.HealthCheckExecutorImpl.collectResultFromFuture(HealthCheckExecutorImpl.java:430) [org.apache.felix.healthcheck.core:2.0.8]
> at org.apache.felix.hc.core.impl.executor.HealthCheckExecutorImpl.collectResultsFromFutures(HealthCheckExecutorImpl.java:408) [org.apache.felix.healthcheck.core:2.0.8]
> at org.apache.felix.hc.core.impl.executor.HealthCheckExecutorImpl.createResultsForDescriptors(HealthCheckExecutorImpl.java:268) [org.apache.felix.healthcheck.core:2.0.8]
> at org.apache.felix.hc.core.impl.executor.HealthCheckExecutorImpl.execute(HealthCheckExecutorImpl.java:211) [org.apache.felix.healthcheck.core:2.0.8]
> at org.apache.felix.hc.core.impl.executor.HealthCheckExecutorImpl.execute(HealthCheckExecutorImpl.java:181) [org.apache.felix.healthcheck.core:2.0.8]
> at org.apache.felix.hc.core.impl.executor.HealthCheckExecutorImpl.execute(HealthCheckExecutorImpl.java:168) [org.apache.felix.healthcheck.core:2.0.8]
> at org.apache.felix.hc.core.impl.monitor.HealthState.update(HealthState.java:123) [org.apache.felix.healthcheck.core:2.0.8]
> at org.apache.felix.hc.core.impl.monitor.HealthCheckMonitor.runWithThreadNameContext(HealthCheckMonitor.java:321) [org.apache.felix.healthcheck.core:2.0.8]
> at org.apache.felix.hc.core.impl.monitor.HealthCheckMonitor.lambda$null$2(HealthCheckMonitor.java:264) [org.apache.felix.healthcheck.core:2.0.8]
> at java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
> at java.base/java.util.concurrent.ConcurrentHashMap$ValueSpliterator.forEachRemaining(ConcurrentHashMap.java:3605)
> at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484)
> at java.base/java.util.stream.ForEachOps$ForEachTask.compute(ForEachOps.java:290)
> at java.base/java.util.concurrent.CountedCompleter.exec(CountedCompleter.java:746)
> at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:290)
> at java.base/java.util.concurrent.ForkJoinTask.doInvoke(ForkJoinTask.java:408)
> at java.base/java.util.concurrent.ForkJoinTask.invoke(ForkJoinTask.java:736)
> at java.base/java.util.stream.ForEachOps$ForEachOp.evaluateParallel(ForEachOps.java:159)
> at java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateParallel(ForEachOps.java:173)
> at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:233)
> at java.base/java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:497)
> at java.base/java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:661)
> at org.apache.felix.hc.core.impl.monitor.HealthCheckMonitor.lambda$run$3(HealthCheckMonitor.java:263) [org.apache.felix.healthcheck.core:2.0.8]
> at org.apache.felix.hc.core.impl.monitor.HealthCheckMonitor.runWithThreadNameContext(HealthCheckMonitor.java:321) [org.apache.felix.healthcheck.core:2.0.8]
> at org.apache.felix.hc.core.impl.monitor.HealthCheckMonitor.run(HealthCheckMonitor.java:259) [org.apache.felix.healthcheck.core:2.0.8]
> at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
> at java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305)
> at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305)
> at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
> at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
> at java.base/java.lang.Thread.run(Thread.java:834)
> Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded
> {code}
>