[ 
https://issues.apache.org/jira/browse/ATLAS-1690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16002261#comment-16002261
 ] 

David Radley commented on ATLAS-1690:
-------------------------------------

Hi [~madhan.neethiraj] 
Great feedback - thanks for your thoughtful and open response.
I will change the array to endpoint1 and endpoint2. I think that is clearer.

I am keen that we propagate tags; this is very powerful. 
I thought I would explain how we could do classifications and then see how this 
option fits in.
The classifications our customers are working with include confidentiality. The 
confidentiality scheme might have C1, C2, C3 and C4 levels. C1 might be public 
and C3 top secret. Different companies name these Cn levels differently, but in 
all these cases there is an order, C4 being the highest classification level. 
Though more complex classification schemes are possible, most cases can work 
with this ordered-list sort of classification scheme. A particular 
glossary term or column might be associated with one of these classifications. 
We were thinking that the classification levels (C1, C2 etc.) would be new 
system types (I suggest an EntityDef). A classification use is the relationship 
between the classification level and the thing it is classifying. By default 
the classification will be that from the level, but a rule can run and 
increase the classification; this would be calculated (and stored?) in the 
relationship instance.
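
To make the idea concrete, here is a minimal sketch in plain Python of ordered levels and a classification use whose level a rule can raise. It is not Atlas code: the names Confidentiality, ClassificationUse, raised_to and effective are all hypothetical, and I have assumed that a rule can only increase the classification, never lower it.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Optional


class Confidentiality(IntEnum):
    """Ordered levels; names and numbering are illustrative only."""
    C1 = 1
    C2 = 2
    C3 = 3
    C4 = 4  # highest classification level


@dataclass
class ClassificationUse:
    """Hypothetical relationship instance between a level and the thing it classifies."""
    classified: str                               # e.g. a column or glossary term
    level: Confidentiality                        # default, taken from the level
    raised_to: Optional[Confidentiality] = None   # set by a rule, if one has run

    def effective(self) -> Confidentiality:
        # Assumption: a rule may increase the classification but never lower it.
        if self.raised_to is None:
            return self.level
        return max(self.level, self.raised_to)


use = ClassificationUse("employee.salary", Confidentiality.C2)
use.raised_to = Confidentiality.C3
print(use.effective().name)
```

Whether the raised value is stored on the relationship instance or recalculated each time is the open question above; the sketch stores it only to keep the example short.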

So to address your proposal:
- I think the propagated classifications would be derived at query time and 
could be useful. Do we need an effective classification? I am not convinced 
by the proposed mechanism.
- Your example around tables and columns and PII assumes that PII is a binary 
flag (or one tag). I am suggesting that this is not the way that 
classifications are normally implemented; they should be an ordered list of 
levels. I see that in some of your recent demos you use v1 terms to implement 
these classification levels. If a table is 
classified as public and has a PII column, we would not want the public 
classification to override the PII column. If a query brings together 2 public 
fields, like name and salary, the combination becomes PII; in this case we need 
a rule to drive this.
This implementation would encourage bidirectional relationships to 
be implemented purely to propagate tags. I suggest many propagations would not 
stop at one relationship, but could flow much further, for example to all has-a 
terms by following all the has-a links. 
I am also concerned that the role that authors the relationship is not the right 
role to make classification propagation decisions.  
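
The name and salary case above could be driven by a rule along these lines. This is only a sketch in plain Python under my own assumptions: the Level enum, FIELD_LEVELS, COMBINATION_RULES and classify_result are hypothetical names, and PII is treated as a level above public purely for illustration.

```python
from enum import IntEnum
from typing import Dict, FrozenSet, Iterable


class Level(IntEnum):
    PUBLIC = 1
    PII = 2  # assumption: PII ranks above public in the ordering


# Hypothetical per-field classifications.
FIELD_LEVELS: Dict[str, Level] = {
    "name": Level.PUBLIC,
    "salary": Level.PUBLIC,
}

# Hypothetical rules: when every field in the key appears in one query
# result, the result is classified at least as high as the value.
COMBINATION_RULES: Dict[FrozenSet[str], Level] = {
    frozenset({"name", "salary"}): Level.PII,
}


def classify_result(fields: Iterable[str]) -> Level:
    """Return the highest level implied by the fields and the combination rules."""
    selected = set(fields)
    level = max((FIELD_LEVELS[f] for f in selected), default=Level.PUBLIC)
    for combination, raised in COMBINATION_RULES.items():
        if combination <= selected:  # all fields of the rule are present
            level = max(level, raised)
    return level


print(classify_result(["name"]).name)
print(classify_result(["name", "salary"]).name)
```

The point is that neither field carries the PII classification itself, so propagation along a relationship alone could not produce this result; a rule has to.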

I wonder whether a smarter approach would be to tag the relationship as 
"propagate-1-to-2" (hopefully something more meaningful like 
"propagate-table-to-column") and Ranger picks up this hint. Ranger could decide 
to run a simple rule of propagating all the tags from 1 to 2 or a more complex 
rule taking other conditions into account.
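
As a rough illustration of the simple rule, the following plain Python sketch copies all tags from end 1 to end 2 of every relationship that carries the hint. It uses no Ranger or Atlas API; the Relationship class and the propagate function are hypothetical, and I have assumed a single hop with no chaining across relationships.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set

PROPAGATE_HINT = "propagate-1-to-2"


@dataclass
class Relationship:
    """Hypothetical relationship instance with two ends and its own tags."""
    end1: str
    end2: str
    tags: Set[str] = field(default_factory=set)


def propagate(relationships: List[Relationship],
              entity_tags: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Simple rule: copy every tag from end 1 to end 2 where the hint is set.

    Reads only from the original entity_tags, so tags travel one hop.
    """
    result = {name: set(tags) for name, tags in entity_tags.items()}
    for rel in relationships:
        if PROPAGATE_HINT in rel.tags:
            source_tags = entity_tags.get(rel.end1, set())
            result.setdefault(rel.end2, set()).update(source_tags)
    return result


rels = [
    Relationship("table", "column", {PROPAGATE_HINT}),
    Relationship("table", "other"),  # no hint, so nothing is propagated
]
tags = {"table": {"PII"}}
out = propagate(rels, tags)
print(sorted(out["column"]))
```

A more complex rule would replace the body of the loop, for example by checking other conditions before copying, without changing how the relationship is authored.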

I suggest that we explicitly implement these classification levels and uses. I 
hope there is a simple case where some classifications should be 
propagated for all consumer cases, with rules that can run to override the 
classifications, and that we can find a way of doing this using a governance 
role. Maybe there could be some supplied Ranger rules and tags that Atlas 
uses out of the box. GDPR rules and tags would be a good use case here.



  


> Introduce top level relationships
> ---------------------------------
>
>                 Key: ATLAS-1690
>                 URL: https://issues.apache.org/jira/browse/ATLAS-1690
>             Project: Atlas
>          Issue Type: Improvement
>            Reporter: David Radley
>            Assignee: David Radley
>              Labels: VirtualDataConnector
>         Attachments: Atlas_RelationDef_Json_Structure_v1.pdf, Atlas 
> Relationships proposal v1.0.pdf, Atlas Relationships proposal v1.1.pdf, Atlas 
> Relationships proposal v1.2.pdf, Atlas Relationships proposal v1.3.pdf, Atlas 
> Relationships proposal v1.4.pdf, Atlas Relationships proposal v1.5.pdf, Atlas 
> Relationships proposal v1.6.pdf, Atlas Relationships proposal v1.7.pdf
>
>
> Introduce top level relationships including support for 
> - many-to-many relationships
> - relationship names including the name for both ends and the relationship.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
