[ 
https://issues.apache.org/jira/browse/ATLAS-1410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15928409#comment-15928409
 ] 

David Radley commented on ATLAS-1410:
-------------------------------------

Thank you [~zimnymc] for your feedback. It is great. Here are my responses.
ad. Use Case
use case 1 
It shouldn't be possible to define two terms with exactly the same name. 
<<David: Yes, this might be desirable within a single glossary, but we should 
not restrict another glossary (a namespace with a new context) from using the 
same name. There are also cases where we want two terms in the same glossary 
to share a name, for example when one term replaces another. >> 
It can be possible to do it only through synonyms if the definition stays the 
same. 
<<David: I agree that synonyms are a useful way of equating differently named 
terms that have the same meaning.>>
If we have a different definition then we must also have a different name for 
each term. 
If we allow the same name to be reused, we will probably put enormous stress on 
glossary integrity. <<David: Maybe we should consider a constraint of the form: 
active terms need to have unique names in a glossary, unless they are connected 
by replaces relationships. [~mandy_chessell] Is this too restrictive? >>
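The constraint suggested above could be sketched roughly as follows. This is 
only an illustration: the Term class, its fields and the GlossaryNameRule 
helper are hypothetical names, not part of the proposal or of any Atlas API, 
and only direct replaces links are considered.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical term model for illustration; not an Atlas class. */
final class Term {
    final String id;
    final String name;
    final boolean active;
    /** Ids of the terms that this term replaces (assumed representation). */
    final Set<String> replacesIds = new HashSet<>();

    Term(String id, String name, boolean active) {
        this.id = id;
        this.name = name;
        this.active = active;
    }
}

/** Hypothetical helper that applies the proposed naming constraint. */
final class GlossaryNameRule {

    /** True when the two terms are directly linked by replaces, in either direction. */
    static boolean linkedByReplaces(Term a, Term b) {
        return a.replacesIds.contains(b.id) || b.replacesIds.contains(a.id);
    }

    /**
     * Returns true if the candidate may join the glossary: active terms must
     * have unique names unless a replaces relationship connects them.
     */
    static boolean canAdd(List<Term> glossaryTerms, Term candidate) {
        if (!candidate.active) {
            return true; // the constraint only covers active terms
        }
        for (Term existing : glossaryTerms) {
            boolean sameName = existing.name.equals(candidate.name);
            if (existing.active && sameName && !linkedByReplaces(existing, candidate)) {
                return false;
            }
        }
        return true;
    }
}
```

Under this sketch a second active "Customer" term is rejected unless it 
replaces, or is replaced by, the existing one. Inactive terms are not checked, 
and a different glossary would simply be a different list of terms.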
use cases 2 and 3
I agree that Categories are needed to give more control over how terms are 
organized, but I need to think a bit more about whether categories should help 
in creating hierarchies. It might be the case, but then we should allow terms 
to be leaves only, and every kind of grouping should be done via a category. 
This would mean that categories should also have classifiers. <<David: I am not 
sure what the benefit would be of disallowing non-leaf terms. I like your idea 
of classifying categories; do you have a use case in mind that would require 
this? >>
use case 7
Do you mean collections? <<David: Yes, I'll make this clearer.>>
use case 11
This sounds a bit too high level; it would be good to describe it in more 
detail. <<David: OK, I will add more detail.>>
I'm explicitly missing two things:
1. ability to inherit classifiers <<David: I assume you mean that a term has a 
classification and there are downstream terms that should get (inherit) the 
classifier. This is important - we want the OMAS layers to do this based on 
context, as we only want the classification stored once in the metadata store. 
Or are you thinking of an is-a relationship? >>
2. are there any models between terms and assets, or is it only about term to 
asset? 
We might want to include a couple of levels of models (like LDM and/or PDM for 
a particular technology). 
At least one is already there - by connecting terms to other terms we are 
creating concepts, which should be visualized in some way for easier 
navigation. <<David: Great idea - I suggest raising a Jira for this. This 
proposal only talks of associating assets and terms. An asset could have many 
terms, and a term could be assigned to many assets. This proposal does not 
introduce any models to control the assignments. >>
ad. discussion point on p. 5
yes, that's how I also see it - Taxonomy is the name of the hierarchy of 
Glossary Categories 
but does this mean that Taxonomy is the name of a Glossary instance? <<David: 
From discussions, it seems that taxonomies would naturally fit as named 
category hierarchies. The terms would not be in a taxonomy. As the existing 
Atlas v1 taxonomy has terms, we felt it clearer to remove taxonomy and add a 
glossary to hold terms and categories. >> 
ad. Glossary Terms and Glossary Categories
discussion point - can there be a term without a Category? If not, will there 
always be at least one prime category for each Glossary? 
If yes, what is then the difference between a Glossary and a prime category? Is 
there any at all? <<David: We are suggesting that a term does not need a 
category; it is owned by its glossary. We are suggesting not having a prime 
category. >> 
point for discussion - should it be allowed for a term from one Glossary to be 
inside a Category from another Glossary? 
I think we should not allow this kind of situation, as it increases the risk 
of losing integrity for a particular Glossary. <<David: I am interested in how 
you define glossary integrity; is there a set of rules you have in mind?>>
I'd say that a copy of that term should be made in the other Glossary, with 
some kind of "inspired by" marker. 
Otherwise we will create a tight connection between the two Glossaries and 
their maintenance will be more difficult (e.g. upgrades). <<David: I am curious 
which upgrade scenarios you feel might be problematic, and how you think the 
"inspired-by" term would solve this. The risk with the inspired-by approach is 
that the "inspired-by" term gets out of sync with the original, leading to 
duplication of slightly different metadata. I think your concerns would be 
addressed when terms and other metadata are versioned (not in scope of this 
document). >> 
ad. Glossary Term identification and names
Glossary Term names might not be unique in a Glossary. For example, there could 
be 2 definitions of customer. - just NO  :-) <<David: let's discuss offline :-) 
>> 
"we do not allow 2 Glossary Terms of the same name inheriting from a parent 
Glossary Term" - so do we allow it or not? Or have I missed something? 
I need an example for this one to properly understand it.
ad. Glossary Term context
I'd like to create a clear distinction between what is meant here by context 
and the term business context (term-to-term relations that create a business 
context) - I just don't like using the word "context" for both. 
<<David: We are not introducing anything called context; it was just an English 
word used to try to explain the idea. I can see that it might confuse. I will 
take a look and see if I can rephrase it to make it clearer.>>
ad. page 9, example
In general I do agree with the line of thinking but I have a question: 
both customer and attributes are terms, right? If so, is the "has-a" 
relationship the best one for term-to-term assignments? <<David: I assume you 
are suggesting we do not need attribute terms, as we can work this out from the 
"has-a" relationship. >> 
ad. Owning relationships
I have some doubts about this: "Concept Glossary Terms own Attribute Glossary 
Terms." (see the remark above for page 9, example) 
I'm not saying we shouldn't go there, I just want to explore it more to 
understand it better. <<David: I was inspired by the IBM industry models. I am 
interested in whether this is too prescriptive at this level. 
[~mandy_chessell] what do you think? >> 
ad. Discussion point – maybe we should consider defining the Glossary Term 
attributes using the
type system rather than relationships - yes we should
ad. Discussion point: we could add homophones as well – if there was a need.
I don't think there is a need now to do that. <<David agreed>> 
ad. Discussion point preferred-term attribute could be stored in the entity, 
AtlasObjectId or
classification. I suggest storing it in the entity.
I agree.
ad. Discussion point: We will enable collection types to be created. 
Additionally, we may want to
consider including a Collection type that has one attribute called contents 
with multiple
values of the top-level type.
Do you mean nesting Collections? <<David: I was not thinking of that, but if 
we do allow it, we would need to check for recursion. I was thinking of the 
implementation of a single collection.>>
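If nesting were allowed, the recursion check mentioned above might look 
something like this sketch. GlossaryCollection and its methods are 
hypothetical names used for illustration only; the proposal does not define 
them.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical nestable collection for illustration; not an Atlas class. */
final class GlossaryCollection {
    final String name;
    final List<GlossaryCollection> nested = new ArrayList<>();

    GlossaryCollection(String name) {
        this.name = name;
    }

    /** True if target is this collection or is reachable through nested collections. */
    boolean contains(GlossaryCollection target) {
        Deque<GlossaryCollection> toVisit = new ArrayDeque<>();
        Set<GlossaryCollection> seen = new HashSet<>();
        toVisit.add(this);
        while (!toVisit.isEmpty()) {
            GlossaryCollection current = toVisit.poll();
            if (current == target) {
                return true;
            }
            // Only expand a collection the first time it is seen.
            if (seen.add(current)) {
                toVisit.addAll(current.nested);
            }
        }
        return false;
    }

    /** Nests child inside this collection, refusing any nesting that would create a cycle. */
    void addNested(GlossaryCollection child) {
        if (child.contains(this)) {
            throw new IllegalArgumentException(
                    "Nesting " + child.name + " in " + name + " would create a cycle");
        }
        nested.add(child);
    }
}
```

The check runs before the link is stored, so a collection can never end up 
containing itself, either directly or through a chain of nested collections.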
ad. Discussion point Introduction of bidirectional relationships, could be done 
separately from
the Glossary enhancement.
We may take a step-by-step approach but I'd say we need this from nearly the 
very beginning. <<David: I agree it is important. Pragmatically, I suggest this 
is a well-defined addition, so we could have a pretty functional glossary 
without it and then add it as the next enhancement. >>
ad. Discussion point: We may wish to take a more revolutionary approach and 
allow
relationships to be defined as top level artifacts, of which classifications 
are a type.
Can we explore it more? It sounds pretty ambitious and worth doing, but let's 
list the consequences. 
<<David: let's discuss>>  
 

> V2 Glossary API
> ---------------
>
>                 Key: ATLAS-1410
>                 URL: https://issues.apache.org/jira/browse/ATLAS-1410
>             Project: Atlas
>          Issue Type: Improvement
>            Reporter: David Radley
>            Assignee: David Radley
>         Attachments: Atlas Glossary V2 proposal v1.0.pdf, Atlas Glossary V2 
> proposal v1.1.pdf
>
>
> The BaseResourceDefinition uses the AttributeDefinition class from typesystem. 
> There are newer, more functional versions of this capability in the atlas-intg 
> project. This Jira is changing over the glossary implementation to the newer 
> entity / type classes.  
> Instead of the instanceProperties and collectionProperties in the 
> BaseResourceDefinition we should use something in this sort of style:  
> "
>  AtlasEntityDef deptTypeDef =
>          AtlasTypeUtil.createClassTypeDef(DEPARTMENT_TYPE, "Department"+_description, ImmutableSet.<String>of(),
>                  AtlasTypeUtil.createRequiredAttrDef("name", "string"),
>                  new AtlasAttributeDef("employees", String.format("array<%s>", "Person"), true,
>                          AtlasAttributeDef.Cardinality.SINGLE, 0, 1, false, false,
>                          Collections.<AtlasStructDef.AtlasConstraintDef>emptyList()));
>  AtlasEntityDef personTypeDef =
>          AtlasTypeUtil.createClassTypeDef("Person", "Person"+_description, ImmutableSet.<String>of(),
>                  AtlasTypeUtil.createRequiredAttrDef("name", "string"),
>                  AtlasTypeUtil.createOptionalAttrDef("address", "Address"),
>                  AtlasTypeUtil.createOptionalAttrDef("birthday", "date"),
>                  AtlasTypeUtil.createOptionalAttrDef("hasPets", "boolean"),
>                  AtlasTypeUtil.createOptionalAttrDef("numberOfCars", "byte"),
>                  AtlasTypeUtil.createOptionalAttrDef("houseNumber", "short"),
>                  AtlasTypeUtil.createOptionalAttrDef("carMileage", "int"),
>                  AtlasTypeUtil.createOptionalAttrDef("age", "float"),
> "
> For the parent-child relationships between glossary categories and terms, we 
> should be able to have the type system manage edge deletion. As part of this, 
> we will need to investigate whether we can get rid of the disconnect and 
> connect methods added in ATLAS-1186. 
>  



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
