[ 
https://issues.apache.org/jira/browse/USERGRID-1283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15276919#comment-15276919
 ] 

David Johnson commented on USERGRID-1283:
-----------------------------------------

One idea for moving the management app lookup to an earlier step in the process:

1) In the CpEntityManagerFactory constructor, call the (existing) method 
initMgmtAppInternal()

2) In initMgmtAppInternal(), add retry logic for the case where it cannot 
fetch the management application. If every retry fails, throw a 
RuntimeException, which should prevent Usergrid from deploying to Tomcat

3) Add a management app field in CpEntityManagerFactory and a getter that 
keeps that field up to date, and returns the cached field value whenever it 
cannot look up the management app from Cassandra
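
A rough sketch of how the three steps above might fit together, in plain 
Java. Everything here is hypothetical and only for illustration: the class 
name ManagementAppHolder, the Supplier standing in for the Cassandra lookup, 
and the maxAttempts / sleepMillis parameters are assumptions, not existing 
Usergrid code.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

/** Hypothetical holder for the management app; not part of Usergrid. */
final class ManagementAppHolder<T> {

    // Stands in for the real Cassandra lookup (assumption for this sketch).
    private final Supplier<T> lookup;

    // Last successfully fetched value; set in the constructor, never null afterwards.
    private final AtomicReference<T> cached = new AtomicReference<>();

    /** Steps 1 and 2: fetch at construction time, retrying, and fail fast if every attempt fails. */
    ManagementAppHolder(Supplier<T> lookup, int maxAttempts, long sleepMillis) {
        this.lookup = lookup;
        this.cached.set(fetchWithRetry(maxAttempts, sleepMillis));
    }

    private T fetchWithRetry(int maxAttempts, long sleepMillis) {
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                T app = lookup.get();
                if (app != null) {
                    return app;
                }
            } catch (RuntimeException e) {
                last = e; // remember the failure and try again
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(sleepMillis);
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    break; // stop retrying if the thread is interrupted
                }
            }
        }
        // Throwing here is what would abort startup in the scenario described above.
        throw new RuntimeException("Could not load the management app", last);
    }

    /** Step 3: refresh the cached value when possible, otherwise return the last good one. */
    T get() {
        try {
            T fresh = lookup.get();
            if (fresh != null) {
                cached.set(fresh);
            }
        } catch (RuntimeException e) {
            // Lookup failed; fall through and return the cached value.
        }
        return cached.get();
    }
}
```

Because the constructor either stores a value or throws, callers of get() 
never see null, which matches the requirement in the issue description that 
the cache must never be null.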

> Improve ServiceManager.init() start-up logic
> --------------------------------------------
>
>                 Key: USERGRID-1283
>                 URL: https://issues.apache.org/jira/browse/USERGRID-1283
>             Project: Usergrid
>          Issue Type: Improvement
>            Reporter: David Johnson
>
> Sometimes on Usergrid startup there is a failure contacting Cassandra, either 
> an immediate communications failure or a time-out.
> In some cases when this happens, the ServiceManager.init() method cannot 
> retrieve the internal Management Application that holds information about 
> Usergrid orgs, apps and admin users.
> We added some retry logic to the ServiceManager.init() method, which is not 
> an ideal fix because that method is also invoked during the processing of 
> HTTP requests.  The problem is that if the retries do not work, we end up 
> with an instance of Usergrid that is alive and able to respond to /status 
> requests, but everything else fails.
> We should fix this by:
> 1) Moving the Management App lookup (and retry logic) to a much earlier point 
> in the startup process. 
> 2) Caching the Management App in some place where all threads can get it. 
> This cache should never be allowed to be null.  We always need to be able to 
> fall back to a recent version of the Management App



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
