GitHub user nickwallen opened a pull request:
https://github.com/apache/metron/pull/748
METRON-1177 Stale running topologies seen post-kerberization and cause
exceptions
[METRON-1177](https://issues.apache.org/jira/browse/METRON-1177)
### Problem
After running the Ambari Kerberization process on a cluster where Metron
was installed with the MPack, the Kerberization process would often
complete successfully, but the running Metron topologies were stale and had not
been restarted properly after all Kerberization steps completed. In other
cases, the Metron service check would fail when Ambari began restarting all
cluster services.
One clue that this has occurred: querying Storm through the Thrift API to
check topology status after Kerberization results in the following error.
```
AuthorizationException(msg:getTopologyInfo on topology snort is not
authorized)
```
### Solution
* All Metron services must be started before a Metron service check is
performed.
* All external dependencies, such as Storm, HBase, and Kafka, must complete
their service checks before the service check of any Metron service that
depends on them.
* Added Storm as a start dependency for the Metron Profiler.
* Metron Profiler has to be stopped before Storm is stopped.
### Testing
This was tested by launching the Full Dev environment, kerberizing the
environment, and then monitoring the order in which the service start,
stop, and status check actions occurred. With this fix, I was not able to
replicate the failure condition.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/nickwallen/metron METRON-1177
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/metron/pull/748.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #748
----
commit a115f726eb5ccd3d215a7282871d290317cc1052
Author: Nick Allen <[email protected]>
Date: 2017-09-08T15:23:49Z
METRON-1177 Stale running topologies seen post-kerberization and cause
exceptions
----