[ 
https://issues.apache.org/jira/browse/BROOKLYN-313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15370628#comment-15370628
 ] 

ASF GitHub Bot commented on BROOKLYN-313:
-----------------------------------------

Github user bostko commented on the issue:

    https://github.com/apache/brooklyn-server/pull/245
  
    LGTM


> Exception stacktrace never logged when resize effector fails, called by 
> AutoScaler policy
> -----------------------------------------------------------------------------------------
>
>                 Key: BROOKLYN-313
>                 URL: https://issues.apache.org/jira/browse/BROOKLYN-313
>             Project: Brooklyn
>          Issue Type: Bug
>    Affects Versions: 0.9.0
>            Reporter: Aled Sage
>
> A customer was using the ControlledDynamicWebAppCluster, with a 
> AutoScalerPolicy. However, they also had an EntitlementManager that throws a 
> NullPointerException whenever it is called with {{context==null || 
> context.user==null}} (which happens for automated calls not triggered 
> explicitly by a user, such as for policies).
> The underlying cause of the NullPointerException was never logged. All that 
> made it into the logs was something like:
> {noformat}
> 2016-07-09 16:07:57,491 DEBUG o.a.b.util.core.task.BasicTask 
> [brooklyn-autoscalerpolicy-0]: call from 
> Thread[brooklyn-autoscalerpolicy-0,5,main], blocking until 
> 'Task[Auto-scaler]@DHtSIGZf' finishes, ended with error: 
> java.util.concurrent.ExecutionException: 
> org.apache.brooklyn.util.exceptions.PropagatedRuntimeException: Error 
> invoking resize at DynamicWebAppClusterImpl{id=iyo37ngogv}: 
> NullPointerException
> {noformat}
> We should *always* log (at least at debug) the underlying cause of an 
> exception, even if it risks logging it multiple times.
> I've reproduced this in a test case. The thread stacktraces are shown below:
> The thread below is about to throw a NullPointerException. The logging from 
> this thread is also shown. The underlying exception is recorded in the task 
> (so a subsequent task.get() would throw it):
> {noformat}
> Daemon Thread [brooklyn-execmanager-cU8EcGHT-39] (Suspended (breakpoint at 
> line 64 in EffectorExceptionLoggingTest$ThrowingEntitlementManager))       
>       
> EffectorExceptionLoggingTest$ThrowingEntitlementManager.isEntitled(EntitlementContext,
>  EntitlementClass<T>, T) line: 64 
>       Entitlements.isEntitled(EntitlementManager, EntitlementClass<T>, T) 
> line: 409   
>       Entitlements.checkEntitled(EntitlementManager, EntitlementClass<T>, T) 
> line: 414        
>       
> EntityManagementSupport$EntityChangeListenerImpl.onEffectorStarting(Effector<?>,
>  Object) line: 469      
>       EffectorUtils.invokeMethodEffector(Entity, Effector<T>, Map<String,?>) 
> line: 274        
>       MethodEffector<T>.call(Entity, Map) line: 148   
>       MethodEffector<T>(AbstractEffector<T>).call(Map, Entity) line: 61       
>       AbstractEffector$1$1.call() line: 83    
>       DynamicSequentialTask$DstJob.call() line: 359   
>       BasicExecutionManager$SubmissionCallable<T>.call() line: 519    
>       FutureTask<V>.run() line: 262   
>       ThreadPoolExecutor.runWorker(ThreadPoolExecutor$Worker) line: 1145      
>       ThreadPoolExecutor$Worker.run() line: 615       
>       Thread.run() line: 745  
> 2016-07-09 16:33:47,333 WARN  o.a.b.c.m.i.EffectorUtils 
> [brooklyn-execmanager-cU8EcGHT-39]: Error invoking resize at 
> DynamicWebAppClusterImpl{id=iyo37ngogv}: NullPointerException
> 2016-07-09 16:33:51,213 DEBUG o.a.b.u.c.t.BasicExecutionManager 
> [brooklyn-execmanager-cU8EcGHT-39]: Exception running task 
> Task[resize]@ZbLY4WUg (rethrowing): 
> org.apache.brooklyn.core.mgmt.internal.EffectorUtils$EffectorCallPropagatedRuntimeException:
>  Error invoking resize at DynamicWebAppClusterImpl{id=iyo37ngogv}: 
> NullPointerException
> 2016-07-09 16:33:51,214 DEBUG o.a.b.u.core.task.CompoundTask 
> [brooklyn-execmanager-cU8EcGHT-36]: Secondary job queue for 
> Task[Auto-scaler]@pOJ7r6Y0 ignoring error in inessential task 
> Task[resize]@ZbLY4WUg: java.util.concurrent.ExecutionException: 
> org.apache.brooklyn.core.mgmt.internal.EffectorUtils$EffectorCallPropagatedRuntimeException:
>  Error invoking resize at DynamicWebAppClusterImpl{id=iyo37ngogv}: 
> NullPointerException
> {noformat}
> The thread below invoked the effector - note that the AutoScalerPolicy has 
> wrapped the invocation in a task. The logging from this thread is also shown. 
> Within that task, it will not catch the exception:
> {noformat}
> "brooklyn-execmanager-cU8EcGHT-0" daemon prio=5 tid=0x00007fcfe5b5f800 
> nid=0x6003 waiting on condition [0x00007000018de000]
>    java.lang.Thread.State: WAITING (parking)
>         at sun.misc.Unsafe.park(Native Method)
>         - parking to wait for  <0x00000007f93d7ab0> (a 
> java.util.concurrent.FutureTask)
>         at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
>         at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:425)
>         at java.util.concurrent.FutureTask.get(FutureTask.java:187)
>         at 
> com.google.common.util.concurrent.ForwardingFuture.get(ForwardingFuture.java:63)
>         at 
> org.apache.brooklyn.util.core.task.BasicTask.get(BasicTask.java:361)
>         at 
> org.apache.brooklyn.util.core.task.BasicTask.getUnchecked(BasicTask.java:370)
>         at 
> org.apache.brooklyn.util.core.task.DynamicTasks$TaskQueueingResult.andWaitForSuccess(DynamicTasks.java:160)
>         at 
> org.apache.brooklyn.core.objs.proxy.EntityProxyImpl.invoke(EntityProxyImpl.java:211)
>         at com.sun.proxy.$Proxy19.resize(Unknown Source)
>         at 
> org.apache.brooklyn.entity.webapp.ControlledDynamicWebAppClusterImpl.resize(ControlledDynamicWebAppClusterImpl.java:297)
>         at 
> org.apache.brooklyn.policy.autoscaling.AutoScalerPolicy$3.resize(AutoScalerPolicy.java:364)
>         at 
> org.apache.brooklyn.policy.autoscaling.AutoScalerPolicy$14.call(AutoScalerPolicy.java:1054)
>         at 
> org.apache.brooklyn.policy.autoscaling.AutoScalerPolicy$14.call(AutoScalerPolicy.java:1)
>         at 
> org.apache.brooklyn.util.core.task.DynamicSequentialTask$DstJob.call(DynamicSequentialTask.java:359)
>         at 
> org.apache.brooklyn.util.core.task.BasicExecutionManager$SubmissionCallable.call(BasicExecutionManager.java:519)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>         at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:745)
> 2016-07-09 16:36:54,385 DEBUG o.a.b.u.c.t.BasicExecutionManager 
> [brooklyn-execmanager-cU8EcGHT-0]: Exception running task 
> Task[Auto-scaler]@pOJ7r6Y0 (rethrowing): 
> org.apache.brooklyn.util.exceptions.PropagatedRuntimeException: Error 
> invoking resize at DynamicWebAppClusterImpl{id=iyo37ngogv}: 
> NullPointerException
> {noformat}
> The thread below invoked the task. It has called {{task.blockUntilEnded}} to 
> wait for the result:
> {noformat}
> "brooklyn-autoscalerpolicy-0" prio=5 tid=0x00007fcfe5b49800 nid=0x1407 
> waiting on condition [0x0000700002b14000]
>    java.lang.Thread.State: WAITING (parking)
>         at sun.misc.Unsafe.park(Native Method)
>         - parking to wait for  <0x00000007f81a07b0> (a 
> java.util.concurrent.FutureTask)
>         at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
>         at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:425)
>         at java.util.concurrent.FutureTask.get(FutureTask.java:187)
>         at 
> com.google.common.util.concurrent.ForwardingFuture.get(ForwardingFuture.java:63)
>         at 
> org.apache.brooklyn.util.core.task.BasicTask.blockUntilEnded(BasicTask.java:417)
>         at 
> org.apache.brooklyn.util.core.task.BasicTask.blockUntilEnded(BasicTask.java:407)
>         at 
> org.apache.brooklyn.policy.autoscaling.AutoScalerPolicy.resizeNow(AutoScalerPolicy.java:1066)
>         at 
> org.apache.brooklyn.policy.autoscaling.AutoScalerPolicy.access$15(AutoScalerPolicy.java:1021)
>         at 
> org.apache.brooklyn.policy.autoscaling.AutoScalerPolicy$13.run(AutoScalerPolicy.java:952)
>         at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>         at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
>         at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
>         at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:745)
> 2016-07-09 16:38:45,089 DEBUG o.a.b.util.core.task.BasicTask 
> [brooklyn-autoscalerpolicy-0]: call from 
> Thread[brooklyn-autoscalerpolicy-0,5,main], blocking until 
> 'Task[Auto-scaler]@pOJ7r6Y0' finishes, ended with error: 
> java.util.concurrent.ExecutionException: 
> org.apache.brooklyn.util.exceptions.PropagatedRuntimeException: Error 
> invoking resize at DynamicWebAppClusterImpl{id=iyo37ngogv}: 
> NullPointerException
> {noformat}
> ---
> Note we'd have had the same problem if the cluster.resize() effector itself 
> had thrown the exception (rather than the entitlements). We'd have only got 
> logging like:
> {noformat}
> 2016-07-09 16:43:52,293 DEBUG o.a.b.u.c.t.BasicExecutionManager 
> [brooklyn-execmanager-NpLl6jt3-1]: Exception running task 
> Task[Auto-scaler]@uLBsngUp (rethrowing): java.lang.NullPointerException
> 2016-07-09 16:43:52,294 DEBUG o.a.b.util.core.task.BasicTask 
> [brooklyn-autoscalerpolicy-0]: call from 
> Thread[brooklyn-autoscalerpolicy-0,5,main], blocking until 
> 'Task[Auto-scaler]@uLBsngUp' finishes, ended with error: 
> java.util.concurrent.ExecutionException: java.lang.NullPointerException
> {noformat}
> This lack of logging is hard to reproduce in a simpler test case where 
> effectors are invoked directly. It is primarily because of the additional 
> layers of task, and the use of {{basicTask.blockUntilEnded}}, that we lose 
> the message.
> However, I have in the past seen something similar when there is an exception 
> in an effector invoked from the web-console: the web-console's activity view 
> showed the full exception stacktrace, but the debug log did not have it. This 
> is very bad for when a customer sends us a debug log!
> ---
> It's unclear what the holistic fix for this should be. 
> We do not want to require the caller of the effector to always wrap the call 
> in a try-catch with logging: the failure of an effector should get logged 
> automatically, and should include the underlying cause. Any other option 
> risks us being sent customer log files that give no indication of what caused 
> the problem.
> ---
> We can hopefully log any effector invocation error in the same way. 
> Alternatively we could treat framework exceptions (e.g. NPE preparing to 
> execute) as different from exceptions during the actual effector 
> implementation. 
> I don't want to do it in low-level code like {{BasicTask.blockUntilEnded}}, 
> because that will probably result in a huge amount of duplicate logging.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to