[
https://issues.apache.org/jira/browse/BROOKLYN-313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15369293#comment-15369293
]
ASF GitHub Bot commented on BROOKLYN-313:
-----------------------------------------
GitHub user aledsage opened a pull request:
https://github.com/apache/brooklyn-library/pull/49
BROOKLYN-313: test with ControlledDynamicWebAppCluster
Please review and merge https://github.com/apache/brooklyn-server/pull/245
first.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/aledsage/brooklyn-library test/BROOKLYN-313
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/brooklyn-library/pull/49.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #49
----
commit 2bf0e970a8142b06b7b3750421d3f226fe5265c8
Author: Aled Sage <[email protected]>
Date: 2016-07-09T17:44:40Z
BROOKLYN-313: test with ControlledDynamicWebAppCluster
----
> Exception stacktrace never logged when resize effector fails, called by
> AutoScaler policy
> -----------------------------------------------------------------------------------------
>
> Key: BROOKLYN-313
> URL: https://issues.apache.org/jira/browse/BROOKLYN-313
> Project: Brooklyn
> Issue Type: Bug
> Affects Versions: 0.9.0
> Reporter: Aled Sage
>
> A customer was using the ControlledDynamicWebAppCluster, with a
> AutoScalerPolicy. However, they also had an EntitlementManager that throws a
> NullPointerException whenever it is called with {{context==null ||
> context.user==null}} (which happens for automated calls not triggered
> explicitly by a user, such as for policies).
> The underlying cause of the NullPointerException was never logged. All that
> made it into the logs was something like:
> {noformat}
> 2016-07-09 16:07:57,491 DEBUG o.a.b.util.core.task.BasicTask
> [brooklyn-autoscalerpolicy-0]: call from
> Thread[brooklyn-autoscalerpolicy-0,5,main], blocking until
> 'Task[Auto-scaler]@DHtSIGZf' finishes, ended with error:
> java.util.concurrent.ExecutionException:
> org.apache.brooklyn.util.exceptions.PropagatedRuntimeException: Error
> invoking resize at DynamicWebAppClusterImpl{id=iyo37ngogv}:
> NullPointerException
> {noformat}
> We should *always* log (at least at debug) the underlying cause of an
> exception, even if it risks logging it multiple times.
> I've reproduced this in a test case. The thread stacktraces are shown below:
> The thread below is about to throw a NullPointerException. The logging from
> this thread is also shown. The underlying exception is recorded in the task
> (so a subsequent task.get() would throw it):
> {noformat}
> Daemon Thread [brooklyn-execmanager-cU8EcGHT-39] (Suspended (breakpoint at
> line 64 in EffectorExceptionLoggingTest$ThrowingEntitlementManager))
>
> EffectorExceptionLoggingTest$ThrowingEntitlementManager.isEntitled(EntitlementContext,
> EntitlementClass<T>, T) line: 64
> Entitlements.isEntitled(EntitlementManager, EntitlementClass<T>, T)
> line: 409
> Entitlements.checkEntitled(EntitlementManager, EntitlementClass<T>, T)
> line: 414
>
> EntityManagementSupport$EntityChangeListenerImpl.onEffectorStarting(Effector<?>,
> Object) line: 469
> EffectorUtils.invokeMethodEffector(Entity, Effector<T>, Map<String,?>)
> line: 274
> MethodEffector<T>.call(Entity, Map) line: 148
> MethodEffector<T>(AbstractEffector<T>).call(Map, Entity) line: 61
> AbstractEffector$1$1.call() line: 83
> DynamicSequentialTask$DstJob.call() line: 359
> BasicExecutionManager$SubmissionCallable<T>.call() line: 519
> FutureTask<V>.run() line: 262
> ThreadPoolExecutor.runWorker(ThreadPoolExecutor$Worker) line: 1145
> ThreadPoolExecutor$Worker.run() line: 615
> Thread.run() line: 745
> 2016-07-09 16:33:47,333 WARN o.a.b.c.m.i.EffectorUtils
> [brooklyn-execmanager-cU8EcGHT-39]: Error invoking resize at
> DynamicWebAppClusterImpl{id=iyo37ngogv}: NullPointerException
> 2016-07-09 16:33:51,213 DEBUG o.a.b.u.c.t.BasicExecutionManager
> [brooklyn-execmanager-cU8EcGHT-39]: Exception running task
> Task[resize]@ZbLY4WUg (rethrowing):
> org.apache.brooklyn.core.mgmt.internal.EffectorUtils$EffectorCallPropagatedRuntimeException:
> Error invoking resize at DynamicWebAppClusterImpl{id=iyo37ngogv}:
> NullPointerException
> 2016-07-09 16:33:51,214 DEBUG o.a.b.u.core.task.CompoundTask
> [brooklyn-execmanager-cU8EcGHT-36]: Secondary job queue for
> Task[Auto-scaler]@pOJ7r6Y0 ignoring error in inessential task
> Task[resize]@ZbLY4WUg: java.util.concurrent.ExecutionException:
> org.apache.brooklyn.core.mgmt.internal.EffectorUtils$EffectorCallPropagatedRuntimeException:
> Error invoking resize at DynamicWebAppClusterImpl{id=iyo37ngogv}:
> NullPointerException
> {noformat}
> The thread below invoked the effector - note that the AutoScalerPolicy has
> wrapped the invocation in a task. The logging from this thread is also shown.
> Within that task, it will not catch the exception:
> {noformat}
> "brooklyn-execmanager-cU8EcGHT-0" daemon prio=5 tid=0x00007fcfe5b5f800
> nid=0x6003 waiting on condition [0x00007000018de000]
> java.lang.Thread.State: WAITING (parking)
> at sun.misc.Unsafe.park(Native Method)
> - parking to wait for <0x00000007f93d7ab0> (a
> java.util.concurrent.FutureTask)
> at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
> at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:425)
> at java.util.concurrent.FutureTask.get(FutureTask.java:187)
> at
> com.google.common.util.concurrent.ForwardingFuture.get(ForwardingFuture.java:63)
> at
> org.apache.brooklyn.util.core.task.BasicTask.get(BasicTask.java:361)
> at
> org.apache.brooklyn.util.core.task.BasicTask.getUnchecked(BasicTask.java:370)
> at
> org.apache.brooklyn.util.core.task.DynamicTasks$TaskQueueingResult.andWaitForSuccess(DynamicTasks.java:160)
> at
> org.apache.brooklyn.core.objs.proxy.EntityProxyImpl.invoke(EntityProxyImpl.java:211)
> at com.sun.proxy.$Proxy19.resize(Unknown Source)
> at
> org.apache.brooklyn.entity.webapp.ControlledDynamicWebAppClusterImpl.resize(ControlledDynamicWebAppClusterImpl.java:297)
> at
> org.apache.brooklyn.policy.autoscaling.AutoScalerPolicy$3.resize(AutoScalerPolicy.java:364)
> at
> org.apache.brooklyn.policy.autoscaling.AutoScalerPolicy$14.call(AutoScalerPolicy.java:1054)
> at
> org.apache.brooklyn.policy.autoscaling.AutoScalerPolicy$14.call(AutoScalerPolicy.java:1)
> at
> org.apache.brooklyn.util.core.task.DynamicSequentialTask$DstJob.call(DynamicSequentialTask.java:359)
> at
> org.apache.brooklyn.util.core.task.BasicExecutionManager$SubmissionCallable.call(BasicExecutionManager.java:519)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> 2016-07-09 16:36:54,385 DEBUG o.a.b.u.c.t.BasicExecutionManager
> [brooklyn-execmanager-cU8EcGHT-0]: Exception running task
> Task[Auto-scaler]@pOJ7r6Y0 (rethrowing):
> org.apache.brooklyn.util.exceptions.PropagatedRuntimeException: Error
> invoking resize at DynamicWebAppClusterImpl{id=iyo37ngogv}:
> NullPointerException
> {noformat}
> The thread below invoked the task. It has called {{task.blockUntilEnded}} to
> wait for the result:
> {noformat}
> "brooklyn-autoscalerpolicy-0" prio=5 tid=0x00007fcfe5b49800 nid=0x1407
> waiting on condition [0x0000700002b14000]
> java.lang.Thread.State: WAITING (parking)
> at sun.misc.Unsafe.park(Native Method)
> - parking to wait for <0x00000007f81a07b0> (a
> java.util.concurrent.FutureTask)
> at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
> at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:425)
> at java.util.concurrent.FutureTask.get(FutureTask.java:187)
> at
> com.google.common.util.concurrent.ForwardingFuture.get(ForwardingFuture.java:63)
> at
> org.apache.brooklyn.util.core.task.BasicTask.blockUntilEnded(BasicTask.java:417)
> at
> org.apache.brooklyn.util.core.task.BasicTask.blockUntilEnded(BasicTask.java:407)
> at
> org.apache.brooklyn.policy.autoscaling.AutoScalerPolicy.resizeNow(AutoScalerPolicy.java:1066)
> at
> org.apache.brooklyn.policy.autoscaling.AutoScalerPolicy.access$15(AutoScalerPolicy.java:1021)
> at
> org.apache.brooklyn.policy.autoscaling.AutoScalerPolicy$13.run(AutoScalerPolicy.java:952)
> at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
> at
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> 2016-07-09 16:38:45,089 DEBUG o.a.b.util.core.task.BasicTask
> [brooklyn-autoscalerpolicy-0]: call from
> Thread[brooklyn-autoscalerpolicy-0,5,main], blocking until
> 'Task[Auto-scaler]@pOJ7r6Y0' finishes, ended with error:
> java.util.concurrent.ExecutionException:
> org.apache.brooklyn.util.exceptions.PropagatedRuntimeException: Error
> invoking resize at DynamicWebAppClusterImpl{id=iyo37ngogv}:
> NullPointerException
> {noformat}
> ---
> Note we'd have had the same problem if the cluster.resize() effector itself
> had thrown the exception (rather than the entitlements). We'd have only got
> logging like:
> {noformat}
> 2016-07-09 16:43:52,293 DEBUG o.a.b.u.c.t.BasicExecutionManager
> [brooklyn-execmanager-NpLl6jt3-1]: Exception running task
> Task[Auto-scaler]@uLBsngUp (rethrowing): java.lang.NullPointerException
> 2016-07-09 16:43:52,294 DEBUG o.a.b.util.core.task.BasicTask
> [brooklyn-autoscalerpolicy-0]: call from
> Thread[brooklyn-autoscalerpolicy-0,5,main], blocking until
> 'Task[Auto-scaler]@uLBsngUp' finishes, ended with error:
> java.util.concurrent.ExecutionException: java.lang.NullPointerException
> {noformat}
> This lack of logging is hard to reproduce in a simpler test case where
> effectors are invoked directly. It is primarily because of the additional
> layers of task, and the use of {{basicTask.blockUntilEnded}}, that we lose
> the message.
> However, I have in the past seen something similar when there is an exception
> in an effector invoked from the web-console: the web-console's activity view
> showed the full exception stacktrace, but the debug log did not have it. This
> is very bad for when a customer sends us a debug log!
> ---
> It's unclear what the holistic fix for this should be.
> We do not want to require the caller of the effector to always wrap the call
> in a try-catch with logging: the failure of an effector should get logged
> automatically, and should include the underlying cause. Any other option
> risks us being sent customer log files that give no indication of what caused
> the problem.
> ---
> We can hopefully log any effector invocation error in the same way.
> Alternatively we could treat framework exceptions (e.g. NPE preparing to
> execute) as different from exceptions during the actual effector
> implementation.
> I don't want to do it in low-level code like {{BasicTask.blockUntilEnded}},
> because that will probably result in a huge amount of duplicate logging.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)