[ 
https://issues.apache.org/jira/browse/FELIX-6475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17451645#comment-17451645
 ] 

Georg Henzler commented on FELIX-6475:
--------------------------------------

While it's unlikely that a Throwable occurs at this spot (because system 
errors thrown inside the future will be wrapped in an ExecutionException), in 
theory the code that handles the future could itself run into the OOM error. 
So I think it's conceptually correct to catch Throwable here even if it will 
rarely happen. It would arguably be even nicer to also show the message 
"System error during health check execution" for the regular case; for that 
we would have to catch the ExecutionException explicitly and show the system 
error message when the ExecutionException wraps a system error.
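
To make the idea concrete, here is a minimal sketch of what such handling could 
look like. It is not the actual HealthCheckExecutorImpl code: the class name 
FutureResultSketch, the method collect(), the String return type, the timeout 
value and every message except "System error during health check execution" are 
assumptions made up for illustration.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.FutureTask;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical sketch, not the real Felix executor implementation.
public class FutureResultSketch {

    static final String SYSTEM_ERROR_MSG = "System error during health check execution";

    // Turns the outcome of a future into a status message.
    static String collect(Future<String> future) {
        try {
            return future.get(5, TimeUnit.SECONDS);
        } catch (ExecutionException e) {
            // Regular case: whatever the task threw arrives wrapped here.
            Throwable cause = e.getCause();
            if (cause instanceof Error) {
                // e.g. OutOfMemoryError raised inside the health check
                return SYSTEM_ERROR_MSG + ": " + cause;
            }
            return "Exception during health check execution: " + cause;
        } catch (InterruptedException e) {
            // Restore the interrupt flag for callers further up the stack.
            Thread.currentThread().interrupt();
            return "Interrupted while waiting for health check result";
        } catch (TimeoutException e) {
            return "Timed out waiting for health check result";
        } catch (Throwable t) {
            // Rare case: the future handling code itself fails, e.g. with an OOM.
            return SYSTEM_ERROR_MSG + ": " + t;
        }
    }

    public static void main(String[] args) {
        // FutureTask.run() captures anything the callable throws, including Errors.
        FutureTask<String> ok = new FutureTask<String>(() -> "OK");
        FutureTask<String> oom = new FutureTask<String>(() -> {
            throw new OutOfMemoryError("simulated");
        });
        ok.run();
        oom.run();
        System.out.println(collect(ok));
        System.out.println(collect(oom));
    }
}
```

With this shape both paths (an Error wrapped in the ExecutionException and a 
Throwable raised directly by the handling code) end up with the same system 
error message, so a caller could treat them identically.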

> How to handle OutOfMemoryError in health check
> ----------------------------------------------
>
>                 Key: FELIX-6475
>                 URL: https://issues.apache.org/jira/browse/FELIX-6475
>             Project: Felix
>          Issue Type: Bug
>          Components: Health Checks
>    Affects Versions: healthcheck.core 2.0.10
>            Reporter: Christian Schneider
>            Priority: Critical
>
> Currently a Java Error thrown during a health check results in a HealthCheck 
> ERROR state.
> This is especially problematic when the health check is a k8s readiness check, 
> as then the pod is taken out of the load balancer but not necessarily 
> restarted.
> After digging further, the error happens inside the BundlesStartedCheck. I 
> don't think the check is causing the OutOfMemoryError, but it repeatedly 
> shows it.
> So the question is: how should the Felix healthcheck code handle such a Java 
> Error?
>  
> {code:java}
> 10.04.2021 08:00:10.181 *WARN* [hc-monitor-15-systemalive,systemready] 
> org.apache.felix.hc.core.impl.executor.HealthCheckExecutorImpl Unexpected 
> Exception during future.get(): java.util.concurrent.ExecutionException: 
> java.lang.OutOfMemoryError: GC overhead limit exceeded
> java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: GC 
> overhead limit exceeded
>       at java.base/java.util.concurrent.FutureTask.report(FutureTask.java:122)
>       at java.base/java.util.concurrent.FutureTask.get(FutureTask.java:191)
>       at 
> org.apache.felix.hc.core.impl.executor.HealthCheckExecutorImpl.collectResultFromFuture(HealthCheckExecutorImpl.java:430)
>  [org.apache.felix.healthcheck.core:2.0.8]
>       at 
> org.apache.felix.hc.core.impl.executor.HealthCheckExecutorImpl.collectResultsFromFutures(HealthCheckExecutorImpl.java:408)
>  [org.apache.felix.healthcheck.core:2.0.8]
>       at 
> org.apache.felix.hc.core.impl.executor.HealthCheckExecutorImpl.createResultsForDescriptors(HealthCheckExecutorImpl.java:268)
>  [org.apache.felix.healthcheck.core:2.0.8]
>       at 
> org.apache.felix.hc.core.impl.executor.HealthCheckExecutorImpl.execute(HealthCheckExecutorImpl.java:211)
>  [org.apache.felix.healthcheck.core:2.0.8]
>       at 
> org.apache.felix.hc.core.impl.executor.HealthCheckExecutorImpl.execute(HealthCheckExecutorImpl.java:181)
>  [org.apache.felix.healthcheck.core:2.0.8]
>       at 
> org.apache.felix.hc.core.impl.executor.HealthCheckExecutorImpl.execute(HealthCheckExecutorImpl.java:168)
>  [org.apache.felix.healthcheck.core:2.0.8]
>       at 
> org.apache.felix.hc.core.impl.monitor.HealthState.update(HealthState.java:123)
>  [org.apache.felix.healthcheck.core:2.0.8]
>       at 
> org.apache.felix.hc.core.impl.monitor.HealthCheckMonitor.runWithThreadNameContext(HealthCheckMonitor.java:321)
>  [org.apache.felix.healthcheck.core:2.0.8]
>       at 
> org.apache.felix.hc.core.impl.monitor.HealthCheckMonitor.lambda$null$2(HealthCheckMonitor.java:264)
>  [org.apache.felix.healthcheck.core:2.0.8]
>       at 
> java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
>       at 
> java.base/java.util.concurrent.ConcurrentHashMap$ValueSpliterator.forEachRemaining(ConcurrentHashMap.java:3605)
>       at 
> java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484)
>       at 
> java.base/java.util.stream.ForEachOps$ForEachTask.compute(ForEachOps.java:290)
>       at 
> java.base/java.util.concurrent.CountedCompleter.exec(CountedCompleter.java:746)
>       at 
> java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:290)
>       at 
> java.base/java.util.concurrent.ForkJoinTask.doInvoke(ForkJoinTask.java:408)
>       at 
> java.base/java.util.concurrent.ForkJoinTask.invoke(ForkJoinTask.java:736)
>       at 
> java.base/java.util.stream.ForEachOps$ForEachOp.evaluateParallel(ForEachOps.java:159)
>       at 
> java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateParallel(ForEachOps.java:173)
>       at 
> java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:233)
>       at 
> java.base/java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:497)
>       at 
> java.base/java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:661)
>       at 
> org.apache.felix.hc.core.impl.monitor.HealthCheckMonitor.lambda$run$3(HealthCheckMonitor.java:263)
>  [org.apache.felix.healthcheck.core:2.0.8]
>       at 
> org.apache.felix.hc.core.impl.monitor.HealthCheckMonitor.runWithThreadNameContext(HealthCheckMonitor.java:321)
>  [org.apache.felix.healthcheck.core:2.0.8]
>       at 
> org.apache.felix.hc.core.impl.monitor.HealthCheckMonitor.run(HealthCheckMonitor.java:259)
>  [org.apache.felix.healthcheck.core:2.0.8]
>       at 
> java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
>       at 
> java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305)
>       at 
> java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305)
>       at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>       at 
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>       at java.base/java.lang.Thread.run(Thread.java:834)
> Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded
> 10.04.2021 08:00:10.231 *WARN* [hc-monitor-15-systemalive,systemready] 
> org.apache.felix.hc.core.impl.executor.HealthCheckExecutorImpl Unexpected 
> Exception during future.get(): java.util.concurrent.ExecutionException: 
> java.lang.OutOfMemoryError: GC overhead limit exceeded
> java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: GC 
> overhead limit exceeded
>       at java.base/java.util.concurrent.FutureTask.report(FutureTask.java:122)
>       at java.base/java.util.concurrent.FutureTask.get(FutureTask.java:191)
>       at 
> org.apache.felix.hc.core.impl.executor.HealthCheckExecutorImpl.collectResultFromFuture(HealthCheckExecutorImpl.java:430)
>  [org.apache.felix.healthcheck.core:2.0.8]
>       at 
> org.apache.felix.hc.core.impl.executor.HealthCheckExecutorImpl.collectResultsFromFutures(HealthCheckExecutorImpl.java:408)
>  [org.apache.felix.healthcheck.core:2.0.8]
>       at 
> org.apache.felix.hc.core.impl.executor.HealthCheckExecutorImpl.createResultsForDescriptors(HealthCheckExecutorImpl.java:268)
>  [org.apache.felix.healthcheck.core:2.0.8]
>       at 
> org.apache.felix.hc.core.impl.executor.HealthCheckExecutorImpl.execute(HealthCheckExecutorImpl.java:211)
>  [org.apache.felix.healthcheck.core:2.0.8]
>       at 
> org.apache.felix.hc.core.impl.executor.HealthCheckExecutorImpl.execute(HealthCheckExecutorImpl.java:181)
>  [org.apache.felix.healthcheck.core:2.0.8]
>       at 
> org.apache.felix.hc.core.impl.executor.HealthCheckExecutorImpl.execute(HealthCheckExecutorImpl.java:168)
>  [org.apache.felix.healthcheck.core:2.0.8]
>       at 
> org.apache.felix.hc.core.impl.monitor.HealthState.update(HealthState.java:123)
>  [org.apache.felix.healthcheck.core:2.0.8]
>       at 
> org.apache.felix.hc.core.impl.monitor.HealthCheckMonitor.runWithThreadNameContext(HealthCheckMonitor.java:321)
>  [org.apache.felix.healthcheck.core:2.0.8]
>       at 
> org.apache.felix.hc.core.impl.monitor.HealthCheckMonitor.lambda$null$2(HealthCheckMonitor.java:264)
>  [org.apache.felix.healthcheck.core:2.0.8]
>       at 
> java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
>       at 
> java.base/java.util.concurrent.ConcurrentHashMap$ValueSpliterator.forEachRemaining(ConcurrentHashMap.java:3605)
>       at 
> java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484)
>       at 
> java.base/java.util.stream.ForEachOps$ForEachTask.compute(ForEachOps.java:290)
>       at 
> java.base/java.util.concurrent.CountedCompleter.exec(CountedCompleter.java:746)
>       at 
> java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:290)
>       at 
> java.base/java.util.concurrent.ForkJoinTask.doInvoke(ForkJoinTask.java:408)
>       at 
> java.base/java.util.concurrent.ForkJoinTask.invoke(ForkJoinTask.java:736)
>       at 
> java.base/java.util.stream.ForEachOps$ForEachOp.evaluateParallel(ForEachOps.java:159)
>       at 
> java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateParallel(ForEachOps.java:173)
>       at 
> java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:233)
>       at 
> java.base/java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:497)
>       at 
> java.base/java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:661)
>       at 
> org.apache.felix.hc.core.impl.monitor.HealthCheckMonitor.lambda$run$3(HealthCheckMonitor.java:263)
>  [org.apache.felix.healthcheck.core:2.0.8]
>       at 
> org.apache.felix.hc.core.impl.monitor.HealthCheckMonitor.runWithThreadNameContext(HealthCheckMonitor.java:321)
>  [org.apache.felix.healthcheck.core:2.0.8]
>       at 
> org.apache.felix.hc.core.impl.monitor.HealthCheckMonitor.run(HealthCheckMonitor.java:259)
>  [org.apache.felix.healthcheck.core:2.0.8]
>       at 
> java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
>       at 
> java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305)
>       at 
> java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305)
>       at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>       at 
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>       at java.base/java.lang.Thread.run(Thread.java:834)
> Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded {code}
>  



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
