-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31483/
-----------------------------------------------------------

Review request for Ambari, Andrew Onischuk, Emil Anca, Jonathan Hurley, and 
Vitalyi Brodetskyi.


Bugs: AMBARI-9785
    https://issues.apache.org/jira/browse/AMBARI-9785


Repository: ambari


Description
-------

After enabling Kerberos, the root user's ticket cache ends up holding a ticket 
for the SPNEGO identity:

```
[root@c6501 ~]# klist
Ticket cache: FILE:/tmp/krb5cc_0
Default principal: HTTP/[email protected]

Valid starting     Expires            Service principal
02/18/15 22:14:51  02/19/15 22:14:51  krbtgt/[email protected]
        renew until 02/18/15 22:14:51
```

It appears that the issue is related to the agent-side scheduler and/or some 
job that is scheduled to run periodically. Apparently some job is kinit-ing 
with the SPNEGO identity as the running user (root in this case) without 
changing the ticket cache. Thus whenever the job runs the root user's ticket 
cache gets changed to contain the SPNEGO identity's ticket.

While investigating and solving the issue, it was found that other credentials 
were also being written to this cache during background processing, 
overwriting what was there.

Most of the issues were related to _alert_ checks on web-based UI endpoints, 
specifically the step that configures the environment for curl to use Kerberos 
authentication. Another issue (in Oozie) was a failure to run a command as the 
`oozie` local user.

Solving this includes using an alternate credential cache when kinit-ing. While 
at it, the cache is checked to see whether its tickets are expired (or missing 
altogether) before kinit-ing.
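
As an illustration only (this is not the code in the diff), the approach can be 
sketched roughly as follows. The function names, the `alert_cc_` file prefix, 
and the hashing scheme are hypothetical; the sketch assumes MIT Kerberos 
`klist -s`, which exits non-zero when the cache cannot be read or its tickets 
have expired, and `kinit -c`, which writes to the named cache.

```python
import hashlib
import os
import subprocess
import tempfile


def alert_ccache_path(principal, tmp_dir=None):
    """Build a per-principal cache path (hypothetical naming scheme)."""
    tmp_dir = tmp_dir or tempfile.gettempdir()
    digest = hashlib.sha256(principal.encode("utf-8")).hexdigest()[:16]
    return os.path.join(tmp_dir, "alert_cc_" + digest)


def ensure_ticket(principal, keytab, ccache_path,
                  klist="klist", kinit="kinit", run=subprocess.call):
    """kinit into ccache_path only if it has no valid ticket.

    Returns True if kinit was run, False if the cached ticket was reused.
    """
    # Point child processes at the alternate cache so the running user's
    # default cache (e.g. /tmp/krb5cc_0 for root) is never touched.
    env = dict(os.environ)
    env["KRB5CCNAME"] = ccache_path

    # `klist -s` is silent; exit status 0 means the cache is readable
    # and its tickets have not expired.
    if run([klist, "-s", ccache_path], env=env) == 0:
        return False

    status = run([kinit, "-c", ccache_path, "-kt", keytab, principal], env=env)
    if status != 0:
        raise RuntimeError("kinit failed with exit status %d" % status)
    return True
```

A caller would then pass the same `KRB5CCNAME` value in the environment of the 
curl (or other) command it launches, so that command authenticates from the 
alternate cache rather than from the user's default one.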


Diffs
-----

  ambari-agent/src/main/python/ambari_agent/alerts/web_alert.py 8ee6606 
  ambari-common/src/main/python/resource_management/libraries/functions/__init__.py 44d235c 
  ambari-common/src/main/python/resource_management/libraries/functions/get_klist_path.py PRE-CREATION 
  ambari-server/src/main/resources/common-services/HIVE/0.12.0.2.0/package/alerts/alert_webhcat_server.py 970ddde 
  ambari-server/src/main/resources/common-services/OOZIE/4.0.0.2.0/package/alerts/alert_check_oozie_server.py a5a066b 
  ambari-server/src/main/resources/common-services/OOZIE/4.0.0.2.0/package/scripts/oozie_service.py 092149d 
  ambari-server/src/main/resources/stacks/BIGTOP/0.8/services/OOZIE/package/files/alert_check_oozie_server.py a5a066b 
  ambari-server/src/main/resources/stacks/BIGTOP/0.8/services/WEBHCAT/package/files/alert_webhcat_server.py 970ddde 
  ambari-server/src/test/python/stacks/2.0.6/OOZIE/test_oozie_server.py 45e9dc4 

Diff: https://reviews.apache.org/r/31483/diff/


Testing
-------

Manually tested all services in a test cluster to see which might have this 
issue. Only OOZIE and HIVE were affected, and tests showed that both are now 
fixed and working as expected.

#Jenkins Test Results

[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 01:12 h
[INFO] Finished at: 2015-02-26T06:35:45+00:00
[INFO] Final Memory: 44M/457M
[INFO] ------------------------------------------------------------------------


Thanks,

Robert Levas
