[ 
https://issues.apache.org/jira/browse/ATLAS-1410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15936571#comment-15936571
 ] 

David Radley commented on ATLAS-1410:
-------------------------------------

Responding to [~eostic] 
Use Case 1 (page 5), with references also to comments on page 7.... "It is 
important that duplicate Glossary Term names
can be defined in the glossary, each with their own context..". That "context" 
is important, and in fact, should be what makes the Term unique. It might be a 
"duplicate" in the highest sense of the word, for the whole "collection" of 
Terms in a glossary, but parentage is indeed important, and keeps things 
separate. Enterprises can't always agree on Term names or their categorization. 
One company might, in an insurance example, spend the time to clarify their 
terms and be sure to have "Automobile Accident Claim" along with "Homeowners 
Claim" in the same Glossary, but another site might just as easily have the 
same Term "Claim", existing many times.....once in a "Personal 
Lines/Claims/Auto/Accident" category and another in "Personal 
Lines/Claims/Homeowners" category.  <<David Stefhan's point is, what would 
these look like in search results. I will articulate this point in the document 
in a discussion point >>   
This is a complex topic, however ....so it might be beneficial for the model 
and the APIs to allow full duplication within the higher Glossary level, 
without requiring parentage definition, leaving it to the "implementer" of any 
GUI to support (or not) some further level of identity....but it should be 
strongly recommended. 
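
To make the "context makes the Term unique" idea concrete, here is a minimal sketch. It is not taken from the proposal or from the Atlas code base: the `Term` and `Glossary` classes, the `qualifiedName()` helper and the use of "/" as a path separator are all assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class TermContextSketch {

    /** Hypothetical term: a display name plus the category path that gives it context. */
    static final class Term {
        final String name;
        final List<String> categoryPath;

        Term(String name, List<String> categoryPath) {
            this.name = name;
            this.categoryPath = categoryPath;
        }

        /** Context-qualified key, e.g. "Personal Lines/Claims/Homeowners/Claim". */
        String qualifiedName() {
            List<String> parts = new ArrayList<>(categoryPath);
            parts.add(name);
            return String.join("/", parts);
        }
    }

    /** Hypothetical glossary keyed by qualified name, so equal display names can coexist. */
    static final class Glossary {
        private final Map<String, Term> byQualifiedName = new LinkedHashMap<>();

        /** Returns false when the same name already exists in the same context. */
        boolean add(Term term) {
            return byQualifiedName.putIfAbsent(term.qualifiedName(), term) == null;
        }

        /** Qualified names of every term with this display name, as a search might list them. */
        List<String> search(String name) {
            List<String> hits = new ArrayList<>();
            for (Term t : byQualifiedName.values()) {
                if (t.name.equals(name)) {
                    hits.add(t.qualifiedName());
                }
            }
            return hits;
        }
    }

    public static void main(String[] args) {
        Glossary glossary = new Glossary();
        glossary.add(new Term("Claim", Arrays.asList("Personal Lines", "Claims", "Auto", "Accident")));
        glossary.add(new Term("Claim", Arrays.asList("Personal Lines", "Claims", "Homeowners")));
        for (String hit : glossary.search("Claim")) {
            System.out.println(hit);
        }
    }
}
```

In this sketch a search for "Claim" returns both terms, each shown with its full category path, which is one possible answer to the question of what duplicates would look like in search results.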
Use Case 9. Classifying terms rather than assets sounds very natural, but 
shouldn't be a "requirement". Is this implying that direct asset classification 
wouldn't be permitted? <<David I did not mean to. We will still support 
classifying assets.>>There will be times when the classifications are only 
applicable to assets<<David absolutely>>, because a Term does not yet exist, or 
because the type of classification isn't fine grained enough, or other equally 
creative reasons. 
Page 7. Kinds of Glossary Terms. It isn't clear why there is a need for 
different formal "types" of mutually exclusive Terms. Relationships determine 
the "use" of a particular Term, and if it is important for a consumer to have, 
for their users and model, a set of "Semantic" terms vs "Classifying" terms, it 
can be done other ways, such as by putting the Terms into their own separate 
categories or parental structures. <<David that does make sense. I have removed 
the types >>
Page 10 and 11 ....this discussion is more clear now, since the removal of the 
idea of an "Attribute Term". On page 11 specifically, it may be beneficial to 
also show how "has-a" can cascade. Customer "has_a" address, and then address 
could also "has_a" City, State and Zip (and so forth...). 
<<David will do>> 
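
As a rough illustration of the cascade, the sketch below stores "has_a" edges between term names and walks them transitively. The class and method names (`HasACascadeSketch`, `addHasA`, `allParts`) are hypothetical and are not part of the proposal or of any Atlas API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class HasACascadeSketch {

    // Term name -> names of the terms it directly "has".
    private final Map<String, List<String>> hasA = new LinkedHashMap<>();

    /** Records one direct has_a relationship from owner to part. */
    void addHasA(String owner, String part) {
        hasA.computeIfAbsent(owner, k -> new ArrayList<>()).add(part);
    }

    /** Every term reachable from owner by following has_a edges, in depth-first order. */
    List<String> allParts(String owner) {
        List<String> result = new ArrayList<>();
        collect(owner, result);
        return result;
    }

    private void collect(String owner, List<String> result) {
        for (String part : hasA.getOrDefault(owner, Collections.<String>emptyList())) {
            if (!result.contains(part)) { // skip repeats so a cyclic model still terminates
                result.add(part);
                collect(part, result);
            }
        }
    }

    public static void main(String[] args) {
        HasACascadeSketch model = new HasACascadeSketch();
        model.addHasA("Customer", "Address");
        model.addHasA("Address", "City");
        model.addHasA("Address", "State");
        model.addHasA("Address", "Zip");
        System.out.println(model.allParts("Customer")); // [Address, City, State, Zip]
    }
}
```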
Page 13. Great that you brought up the need for custom relationships. We need 
to ensure that this capability as "hoped for", remains intact. ; ) 
Page 14. Perhaps it needs more explanation, but I found the definition of 
Has-type and Types to be a bit confusing. "Has-types/is-a-type-of" seems more 
natural.....and that perhaps these could be combined into one. <<David Maybe we 
should discuss further so we can agree on clearer definitions. It is important 
we have the difference between is-a and is-type-of. It seems to me that their 
opposites should also be uniquely named. >> 
Page 14. Synonyms. Perhaps needs more explanation? Synonyms are difficult to 
have any kind of "owner". They are all peers in a "collection" of similar 
concepts. Having one owner, in the model itself, could create issues if/when 
that owner is deleted. <<David agreed. Ownership is a stop gap until we have 
bidirectional associations.>> 
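
One way to picture synonyms as peers rather than owned terms is a shared group that every member points to. This is only a sketch under that assumption; `SynonymPeersSketch` and its methods are invented for illustration and do not reflect how Atlas stores relationships.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

public class SynonymPeersSketch {

    // Every member of a synonym group maps to the same shared set; no member is the owner.
    private final Map<String, Set<String>> groupOf = new HashMap<>();

    /** Declares a and b synonyms, merging any groups they already belong to. */
    void addSynonym(String a, String b) {
        Set<String> merged = new LinkedHashSet<>();
        merged.addAll(groupOf.getOrDefault(a, Collections.singleton(a)));
        merged.addAll(groupOf.getOrDefault(b, Collections.singleton(b)));
        for (String term : merged) {
            groupOf.put(term, merged);
        }
    }

    /** The other members of the term's group (empty if it has no synonyms). */
    Set<String> synonymsOf(String term) {
        Set<String> result =
                new LinkedHashSet<>(groupOf.getOrDefault(term, Collections.<String>emptySet()));
        result.remove(term);
        return result;
    }

    /** Deleting any member leaves the remaining peers linked to each other. */
    void delete(String term) {
        Set<String> group = groupOf.remove(term);
        if (group != null) {
            group.remove(term);
        }
    }

    public static void main(String[] args) {
        SynonymPeersSketch synonyms = new SynonymPeersSketch();
        synonyms.addSynonym("Customer", "Client");
        synonyms.addSynonym("Client", "Patron");
        synonyms.delete("Client");
        System.out.println(synonyms.synonymsOf("Customer")); // [Patron]
    }
}
```

Because no term is singled out as the owner, deleting "Client" above does not orphan the link between "Customer" and "Patron".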
Page 14. Antonyms....needs further definition so that it is explained 
separately from Synonyms. In this case, there could be many Terms that are 
opposites, but they themselves are not necessarily antonyms of each other. This 
one seems ok to have an "owner" concept. <<David good spot - I will amend>>
Page 14 Homonyms. This one is more like Synonyms, where they can be peers of 
each other.
Page 15. Preferred Term. Great concept. Especially important for enterprises 
that are overloading the glossary to meet a lot of their governance objectives, 
but still want to retain the idea that "this term" is "the one to use" for 
specific alternate name, or priority reference purposes. Specifically critical 
to scenarios where Terms are seen as a "replacement for names in retrieval 
requests or reporting tool interfaces". <<David I have included these words in 
the document>> 
Page 15. Collections. Very important concept — but is it part of the Glossary 
specification? <<David yes this is a separate concept to the glossary - but 
they would effectively be enabled if we allow terms to point to any entity in 
the type definition. These may need a separate API, which would be out of scope 
for this document. I have removed collections >>  ...or should it be reviewed at 
a much higher Atlas perspective? Certainly the glossary could have a set of 
Terms, qualified in some way as a "Collection Glossary" and then more 
generically use "assigned assets" [including other Terms] as a generic 
relationship ---- but it may be that this is overloading the Glossary too much.

> V2 Glossary API
> ---------------
>
>                 Key: ATLAS-1410
>                 URL: https://issues.apache.org/jira/browse/ATLAS-1410
>             Project: Atlas
>          Issue Type: Improvement
>            Reporter: David Radley
>            Assignee: David Radley
>         Attachments: Atlas Glossary V2 proposal v1.0.pdf, Atlas Glossary V2 
> proposal v1.1.pdf, Atlas Glossary V2 proposal v1.2.pdf
>
>
> The BaseResourceDefinition uses the AttributeDefinition class from typesystem. 
> There are newer, more functional versions of this capability in the atlas-intg 
> project. This Jira is changing over the glossary implementation to the newer 
> entity / type classes.  
> Instead of the instanceProperties and collectionProperties in the 
> BaseResourceDefinition, we should use something in this sort of style:  
> "
>  AtlasEntityDef deptTypeDef =
>          AtlasTypeUtil.createClassTypeDef(DEPARTMENT_TYPE, "Department"+_description, ImmutableSet.<String>of(),
>                  AtlasTypeUtil.createRequiredAttrDef("name", "string"),
>                  new AtlasAttributeDef("employees", String.format("array<%s>", "Person"), true,
>                          AtlasAttributeDef.Cardinality.SINGLE, 0, 1, false, false,
>                          Collections.<AtlasStructDef.AtlasConstraintDef>emptyList()));
>  AtlasEntityDef personTypeDef =
>          AtlasTypeUtil.createClassTypeDef("Person", "Person"+_description, ImmutableSet.<String>of(),
>                  AtlasTypeUtil.createRequiredAttrDef("name", "string"),
>                  AtlasTypeUtil.createOptionalAttrDef("address", "Address"),
>                  AtlasTypeUtil.createOptionalAttrDef("birthday", "date"),
>                  AtlasTypeUtil.createOptionalAttrDef("hasPets", "boolean"),
>                  AtlasTypeUtil.createOptionalAttrDef("numberOfCars", "byte"),
>                  AtlasTypeUtil.createOptionalAttrDef("houseNumber", "short"),
>                  AtlasTypeUtil.createOptionalAttrDef("carMileage", "int"),
>                  AtlasTypeUtil.createOptionalAttrDef("age", "float"),
> "
> For the parent child relationships with glossary categories and terms we 
> should be able to have the type system manage edge deletion. As part of this, 
> we will need to investigate whether we could get rid of the disconnect and 
> connect methods added in ATLAS-1186 
>  



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
