[ https://issues.apache.org/jira/browse/ATLAS-1410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15936571#comment-15936571 ]
David Radley commented on ATLAS-1410:
-------------------------------------

Responding to [~eostic]

Use Case 1 (page 5), with references also to comments on page 7.... "It is important that duplicate Glossary Term names can be defined in the glossary, each with their own context..". That "context" is important, and in fact, should be what makes the Term unique. It might be a "duplicate" in the highest sense of the word, for the whole "collection" of Terms in a glossary, but parentage is indeed important, and keeps things separate. Enterprises can't always agree on Term names or their categorization. One company might, in an insurance example, spend the time to clarify their terms and be sure to have "Automobile Accident Claim" along with "Homeowners Claim" in the same Glossary, but another site might just as easily have the same Term "Claim", existing many times.....once in a "Personal Lines/Claims/Auto/Accident" category and another in "Personal Lines/Claims/Homeowners" category.
<<David Stefhan's point is, what would these look like in search results. I will articulate this point in the document in a discussion point >>
This is a complex topic, however ....so it might be beneficial for the model and the APIs to allow full duplication within the higher Glossary level, without requiring parentage definition, leaving it to the "implementer" of any GUI to support (or not) some further level of identity....but it should be strongly recommended.

Use Case 9. Classifying terms rather than assets sounds very natural, but shouldn't be a "requirement". Is this implying that direct asset classification wouldn't be permitted?
<<David I did not mean to. We will still support classifying assets.>>
There will be times when the classifications are only applicable to assets <<David absolutely>>, because a Term does not yet exist, or because the type of classification isn't fine grained enough, or other equally creative reasons.

Page 7. Kinds of Glossary Terms.
It isn't clear why there is a need for different formal "types" of mutually exclusive Terms. Relationships determine the "use" of a particular Term, and if it is important for a consumer to have, for their users and model, a set of "Semantic" terms vs "Classifying" terms, it can be done other ways, such as by putting the Terms into their own separate categories or parental structures.
<<David that does make sense. I have removed the types >>

Page 10 and 11 ....this discussion is clearer now, since the removal of the idea of an "Attribute Term". On page 11 specifically, it may be beneficial to also show how "has-a" can cascade. Customer "has_a" address, and then address could also "has_a" City, State and Zip (and so forth...).
<<David will do>>

Page 13. Great that you brought up the need for custom relationships. We need to ensure that this capability as "hoped for", remains intact. ; )

Page 14. Perhaps it needs more explanation, but I found the definition of Has-type and Types to be a bit confusing. "Has-types/is-a-type-of" seems more natural.....and that perhaps these could be combined into one.
<<David Maybe we should discuss further so we can agree on clearer definitions. It is important we have the difference between is-a and is-type-of. It seems to me that their opposites should also be uniquely named. >>

Page 14. Synonyms. Perhaps needs more explanation? Synonyms are difficult to have any kind of "owner". They are all peers in a "collection" of similar concepts. Having one owner, in the model itself, could create issues if/when that owner is deleted.
<<David agreed. Ownership is a stop gap until we have bidirectional associations.>>

Page 14. Antonyms....needs further definition so that it is explained separately from Synonyms. In this case, there could be many Terms that are opposites, but they themselves are not necessarily antonyms of each other. This one seems ok to have an "owner" concept.
<<David good spot - I will amend>>

Page 14 Homonyms.
This one is more like Synonyms, where they can be peers of each other.

Page 15. Preferred Term. Great concept. Especially important for enterprises that are overloading the glossary to meet a lot of their governance objectives, but still want to retain the idea that "this term" is "the one to use" for specific alternate name, or priority reference purposes. Specifically critical to scenarios where Terms are seen as a "replacement for names in retrieval requests or reporting tool interfaces".
<<David I have included these words in the document>>

Page 15. Collections. Very important concept - but is it part of the Glossary specification?
<<David yes this is a separate concept to the glossary - but they would effectively be enabled if we allow terms to point to any entity in the type definition. These may need a separate API, which would be out of scope for this document. I have removed collections >>
...or should it be reviewed at a much higher Atlas perspective? Certainly the glossary could have a set of Terms, qualified in some way as a "Collection Glossary" and then more generically use "assigned assets" [including other Terms] as a generic relationship ---- but it may be that this is overloading the Glossary too much

> V2 Glossary API
> ---------------
>
>                 Key: ATLAS-1410
>                 URL: https://issues.apache.org/jira/browse/ATLAS-1410
>             Project: Atlas
>          Issue Type: Improvement
>            Reporter: David Radley
>            Assignee: David Radley
>         Attachments: Atlas Glossary V2 proposal v1.0.pdf, Atlas Glossary V2
> proposal v1.1.pdf, Atlas Glossary V2 proposal v1.2.pdf
>
>
> The BaseResourceDefinition uses the AttributeDefinition class from typesystem.
> There are newer, more functional versions of this capability in the atlas-intg
> project. This Jira is changing over the glossary implementation to the newer
> entity / type classes.
> Instead of the instanceProperties and collectionProperties in the
> BaseResourceDefinitions we should use something in this sort of style:
> "
> AtlasEntityDef deptTypeDef =
>     AtlasTypeUtil.createClassTypeDef(DEPARTMENT_TYPE,
>         "Department"+_description, ImmutableSet.<String>of(),
>         AtlasTypeUtil.createRequiredAttrDef("name", "string"),
>         new AtlasAttributeDef("employees",
>             String.format("array<%s>", "Person"), true,
>             AtlasAttributeDef.Cardinality.SINGLE, 0, 1,
>             false, false,
>             Collections.<AtlasStructDef.AtlasConstraintDef>emptyList()));
> AtlasEntityDef personTypeDef =
>     AtlasTypeUtil.createClassTypeDef("Person", "Person"+_description,
>         ImmutableSet.<String>of(),
>         AtlasTypeUtil.createRequiredAttrDef("name", "string"),
>         AtlasTypeUtil.createOptionalAttrDef("address", "Address"),
>         AtlasTypeUtil.createOptionalAttrDef("birthday", "date"),
>         AtlasTypeUtil.createOptionalAttrDef("hasPets", "boolean"),
>         AtlasTypeUtil.createOptionalAttrDef("numberOfCars", "byte"),
>         AtlasTypeUtil.createOptionalAttrDef("houseNumber", "short"),
>         AtlasTypeUtil.createOptionalAttrDef("carMileage", "int"),
>         AtlasTypeUtil.createOptionalAttrDef("age", "float"),
> "
> For the parent-child relationships between glossary categories and terms we
> should be able to have the type system manage edge deletion. As part of this,
> we will need to investigate whether we could get rid of the disconnect and
> connect methods added in ATLAS-1186.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)