Nigel Jones created ATLAS-1869:
----------------------------------

             Summary: Atlas "plugin" for Ranger (metadata capture)
                 Key: ATLAS-1869
                 URL: https://issues.apache.org/jira/browse/ATLAS-1869
             Project: Atlas
          Issue Type: Bug
            Reporter: Nigel Jones


For a variety of data processing engines in Hadoop, such as Hive, we have an 
Atlas plugin & hook that captures new & updated metadata from those engines and 
pushes it to Atlas. This metadata can then be used to support governance, 
including lineage.

We already have a "Ranger plugin" for Atlas, which allows Ranger to control 
access to metadata in Atlas. That is NOT the subject of this Jira; this one is 
about "the other way around": Atlas capturing metadata from Ranger.

Examples might include:
 * Capture information about the policies that are deployed in a Ranger server: 
   the types of assets they refer to and the classifications that are used.
 * Capture information about the topology of Ranger, i.e. the plugins that are 
   deployed and active and the nodes they run on, and feed this back into an 
   operational model in Atlas.

In each case the information could be published by Ranger and consumed by 
Atlas, and stewardship activities around the Atlas metadata could help in tying 
things together.
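
As a rough illustration of the publish/consume idea above, here is a minimal 
Python sketch of what a message from Ranger to Atlas might carry. Every name in 
it (PolicySummary, PluginInstance, RangerMetadataEvent, to_message, 
from_message) is hypothetical and made up for this example; neither Ranger nor 
Atlas defines these types, and the transport (e.g. a Kafka topic) is 
deliberately left out.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import List

# NOTE: all types and field names here are assumptions for illustration only.

@dataclass
class PolicySummary:
    policy_name: str
    service_type: str            # e.g. "hive"
    asset_types: List[str]       # kinds of assets the policy refers to
    classifications: List[str]   # classifications (tags) the policy uses

@dataclass
class PluginInstance:
    service_type: str            # which plugin this is, e.g. "hive"
    host: str                    # node the plugin runs on
    active: bool                 # whether the plugin is currently active

@dataclass
class RangerMetadataEvent:
    policies: List[PolicySummary] = field(default_factory=list)
    plugins: List[PluginInstance] = field(default_factory=list)

def to_message(event: RangerMetadataEvent) -> str:
    """Publisher side: serialize the event to a JSON string."""
    return json.dumps(asdict(event), sort_keys=True)

def from_message(message: str) -> RangerMetadataEvent:
    """Consumer side: rebuild the event from the JSON string."""
    data = json.loads(message)
    return RangerMetadataEvent(
        policies=[PolicySummary(**p) for p in data["policies"]],
        plugins=[PluginInstance(**p) for p in data["plugins"]],
    )
```

The point of the sketch is only that policy summaries and plugin topology can 
travel together in one loosely coupled message; the actual model would need to 
be agreed between the two projects.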

The benefits would be:
 - a better end-to-end view (since we know the endpoints and the identifiers in 
   audit logs)
 - optimized interfaces (REST & Kafka), by being able to better target useful 
   information, i.e. if only a Hive plugin is being used & configured for 
   tags, let's just worry about the tags it needs.
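
To make the second benefit concrete, here is a small Python sketch of how 
knowledge of the active plugins could narrow down which tags need to be sent. 
The function name tags_needed and the shape of its inputs are assumptions for 
illustration, not an existing Atlas or Ranger API.

```python
from typing import Dict, Iterable, Set

def tags_needed(active_plugins: Iterable[str],
                tags_by_service: Dict[str, Set[str]]) -> Set[str]:
    """Return only the tags used by policies of the active plugins.

    active_plugins:  service types with a deployed, active plugin (hypothetical input)
    tags_by_service: tags referenced by tag-based policies, keyed by service type
    """
    needed: Set[str] = set()
    for service in active_plugins:
        # Services without tag-based policies contribute nothing.
        needed |= tags_by_service.get(service, set())
    return needed
```

For example, with only a "hive" plugin active, tags that are referenced solely 
by policies for other services would be left out of what is pushed.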

I see this as useful for our open metadata work & specifically VDC, though not 
essential for an initial MVP.

Placeholder for now... will elaborate further



At the same time, the coupling would be loose and shouldn't hinder any existing 
integrations or decisions as to what is done in Atlas vs. Ranger.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
