[
https://issues.apache.org/jira/browse/USERGRID-1283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15285227#comment-15285227
]
ASF GitHub Bot commented on USERGRID-1283:
------------------------------------------
GitHub user snoopdave opened a pull request:
https://github.com/apache/usergrid/pull/524
Retries on Management App init, plus caching
First stab at improving the startup process so that, if we are unable to
look up or create the management app, we fail fast, and if we are able to, we
cache the value for later times when we may be temporarily unable to look it up.
- Attempt to get or create management app in CpEntityManagerFactory constructor
  - With configurable retries
  - Throw RuntimeException if retry count exceeded
- Cache Management App
  - CpEntityManager uses that new cache to get management app
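For illustration only, the "retry, then fail fast" step could be sketched roughly as below. This is not the code from the pull request: the class name StartupRetry, the method getWithRetries, and its parameters are hypothetical, and the real lookup in CpEntityManagerFactory is stood in for by a plain Callable.

```java
import java.util.concurrent.Callable;

/** Hypothetical helper for illustration; not part of the Usergrid codebase. */
final class StartupRetry {

    private StartupRetry() {}

    /**
     * Calls the lookup up to maxAttempts times, sleeping between failed attempts.
     * Returns the first non-null result, or throws RuntimeException once the
     * attempts are exhausted so that startup fails fast.
     */
    static <T> T getWithRetries(Callable<T> lookup, int maxAttempts, long sleepMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        Exception lastFailure = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                T result = lookup.call();
                if (result != null) {
                    return result;
                }
            } catch (Exception e) {
                // Remember the failure so it can be reported if every attempt fails.
                lastFailure = e;
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(sleepMillis);
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    throw new RuntimeException("Interrupted while waiting to retry", ie);
                }
            }
        }
        throw new RuntimeException(
                "Unable to get or create management app after " + maxAttempts + " attempts",
                lastFailure);
    }
}
```

In a sketch like this, the constructor would call getWithRetries once with the configured attempt count and let the RuntimeException propagate, so a node that cannot reach the management app never finishes starting.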
https://issues.apache.org/jira/browse/USERGRID-1283
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/snoopdave/usergrid usegrid-1283-mgmt-app-init
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/usergrid/pull/524.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #524
----
commit e0c0c875271cda47f9baf9072f029a92921fd1be
Author: Dave Johnson <[email protected]>
Date: 2016-05-16T20:10:07Z
Move the initial get of management app to the CpEntityManagerFactory with
retries, and caching for the management app itself.
----
> Improve ServiceManager.init() start-up logic
> --------------------------------------------
>
> Key: USERGRID-1283
> URL: https://issues.apache.org/jira/browse/USERGRID-1283
> Project: Usergrid
> Issue Type: Improvement
> Affects Versions: 2.1.0
> Reporter: David Johnson
> Fix For: 2.1.1
>
>
> Sometimes on Usergrid startup there is a failure contacting Cassandra, either
> an immediate communications failure or a time-out.
> In some cases when this happens, the ServiceManager.init() method cannot
> retrieve the internal Management Application that holds information about
> Usergrid orgs, apps and admin users.
> We added some retry logic to the ServiceManager.init() method, which is not
> an ideal fix because that method is also invoked in processing of HTTP
> requests. Problem is, if the retries do not work we end up with an
> instance of Usergrid that is alive and able to respond to /status requests,
> but everything else fails.
> We should fix this by:
> 1) Moving the Management App lookup (and retry logic) to a much earlier point
> in the startup process.
> 2) Caching the Management App in some place where all threads can get it.
> This cache should never be allowed to be null. We always need to be able to
> fall back to a recent version of the Management App.
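A minimal sketch of the "never null, fall back to a recent version" cache described in point 2, under stated assumptions: the class LastKnownGoodCache and its methods are hypothetical names, the loader stands in for whatever actually reads the Management App, and the initial value is assumed to come from a successful startup lookup.

```java
import java.util.Objects;
import java.util.concurrent.Callable;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical thread-safe holder for illustration; not part of the Usergrid codebase. */
final class LastKnownGoodCache<T> {

    private final AtomicReference<T> current;
    private final Callable<T> loader;

    /** The initial value must be non-null, so readers never observe an empty cache. */
    LastKnownGoodCache(T initialValue, Callable<T> loader) {
        this.current = new AtomicReference<>(Objects.requireNonNull(initialValue, "initialValue"));
        this.loader = Objects.requireNonNull(loader, "loader");
    }

    /**
     * Tries to load a fresh value. If the load fails or returns null,
     * the previous value is kept. Returns true only if the value was replaced.
     */
    boolean refresh() {
        try {
            T fresh = loader.call();
            if (fresh != null) {
                current.set(fresh);
                return true;
            }
        } catch (Exception e) {
            // Load failed (for example, a timeout): keep the last known good value.
        }
        return false;
    }

    /** Returns the most recent successfully loaded value; never null. */
    T get() {
        return current.get();
    }
}
```

Because a failed refresh leaves the stored reference untouched, any thread calling get() during a temporary outage still sees the last successfully loaded value.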