[ 
https://issues.apache.org/jira/browse/ATLAS-1765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16000621#comment-16000621
 ] 

ernie ostic commented on ATLAS-1765:
------------------------------------

Initial thoughts on search and query types, based on various use cases that 
are often seen with InfoSphere Information Governance Catalog (IGC).

Search/Queries against the repository.   

Here are various patterns we see frequently with IGC. The categories below 
are loose, but they correspond to the user's objective, their level of 
experience with the tool, and whether they are in the role of "governance team" 
or "regular enterprise user". They are broken up here just to aid further 
discussion. Each of these comes up in three "access" modes, fairly equally: 
(1) online, using the GUI; (2) batch, via the command line, for extraction to a 
.csv or other export file structure; (3) via the REST API. Each also typically 
allows a list of "properties" (name, description, internal identifier, date 
created, etc.) to be selected along with the returned "asset".
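
To make the "selected properties plus export" idea concrete, here is a minimal 
Python sketch of the batch/.csv mode. It is an illustration only: the asset 
records, the property names, and the helpers select_properties and to_csv are 
all hypothetical, and nothing here is an IGC or Atlas API.

```python
import csv
import io

# Hypothetical asset records; a real repository would return these from a query.
ASSETS = [
    {"name": "CUST_ID", "description": "Customer key", "id": "a1", "created": "2017-04-01"},
    {"name": "ORDER_TS", "description": "Order timestamp", "id": "a2", "created": "2017-04-15"},
]


def select_properties(assets, properties):
    """Keep only the requested properties; missing ones become empty strings."""
    return [{p: a.get(p, "") for p in properties} for a in assets]


def to_csv(rows, properties):
    """Render the selected rows as CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=properties, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


props = ["name", "created"]
print(to_csv(select_properties(ASSETS, props), props))
```

The same select_properties step could sit in front of a GUI result table or a 
REST response; only the final rendering differs between the three access modes.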

This is more a listing of "syntax examples" than pure business use cases, but 
each should map back easily to a persona or business use case as necessary. 


Governance Queries

List out all assets (usually columns) that have not yet been assigned a Term
List out all assets (usually columns) that have not yet been assigned a Steward
List out all assets (any kind) that are being managed by Steward <steward>
List out all assets that have been modified since <date>
List out all assets that are in a particular state (such as "Draft", where 
"workflow" in IGC has been implemented)
List out all assets based on their time remaining in a particular state ("all 
terms in draft for more than <n> days")
List out all assets (usually Terms) where property <property> is null 
[similar to "where property <property> is <value>", but called out here 
specifically because it is a common "management level" governance query]
List out all assets (usually Terms) where relationship <relationship> is null  
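
As a rough illustration of the governance queries above, the following Python 
sketch expresses several of them as filters over in-memory records. The Asset 
class, its field names, and the sample data are assumptions made up for this 
example; they are not the IGC or Atlas data model.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Asset:
    # Hypothetical record shape, for illustration only.
    name: str
    type: str
    term: Optional[str] = None
    steward: Optional[str] = None
    state: str = "Published"
    state_since: date = date(2017, 1, 1)
    modified: date = date(2017, 1, 1)


ASSETS = [
    Asset("CUST_ID", "column", term="Customer Identifier", steward="alice",
          modified=date(2017, 5, 1)),
    Asset("CUST_NAME", "column", steward="alice", modified=date(2017, 3, 1)),
    Asset("ORDER_TS", "column", term="Order Timestamp", modified=date(2017, 4, 15)),
    Asset("Churn Rate", "term", steward="bob", state="Draft",
          state_since=date(2017, 3, 1)),
    Asset("Net Revenue", "term", state="Draft", state_since=date(2017, 5, 1)),
]


def unassigned(assets, relationship, asset_type=None):
    """Assets whose relationship (e.g. 'term', 'steward') is not set."""
    return [a.name for a in assets
            if getattr(a, relationship) is None
            and (asset_type is None or a.type == asset_type)]


def managed_by(assets, steward):
    """Assets of any kind managed by the given steward."""
    return [a.name for a in assets if a.steward == steward]


def modified_since(assets, since):
    """Assets modified on or after the given date."""
    return [a.name for a in assets if a.modified >= since]


def in_state_longer_than(assets, state, days, today):
    """Assets that have been in a workflow state for more than `days` days."""
    return [a.name for a in assets
            if a.state == state and (today - a.state_since).days > days]


print(unassigned(ASSETS, "term", "column"))
print(in_state_longer_than(ASSETS, "Draft", 30, today=date(2017, 5, 8)))
```

Note that "relationship is null" and "property is null" collapse into the same 
check in this flat sketch; in a real repository they would likely be separate 
kinds of lookup.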



Research Queries

Often performed by an individual user or data researcher, and sometimes by 
developers, often exploiting a "lineage" relationship.

List out all assets <a specific type> where property <property> is <value> 
[string, between, equal_to, etc.]
List out all assets <relationship, such as "owned by"> <steward> 
Show all assets "written by" <name of process or other data-mover kind of 
asset>.   For Atlas in its current form, this might be the name of a Sqoop 
process
For <asset> (type and name), show immediate upstream asset (and properties of 
that asset...last time it ran, status code, etc.)
Show a Term and all of its "history" (particularly important for comments by 
reviewers over time)
Various complex "set" retrievals, qualified by existence of a particular 
instance, such as "dump out all database/table/column details for every 
database that contains a schema called <schemaName>" [at times, the qualifier 
is just "if it exists" as a child but still dump all children, possibly 
requiring multiple requests or additional filtering against the final returned 
list]
List all transformations (and their sources/targets/processes) where 
nullability was changed for a column from null to "not null" [that is a 
specific example, but similar queries could exist for datatype changes, column 
name changes, specific mappings or functions, etc.]
Requests that exploit multiple relationships for qualification...such as "list 
all tables that have a Steward...but only for Stewards who also manage/own 
assets in the Risk Collection" 
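
A few of the research queries above can be sketched in Python over a toy 
lineage graph. Everything here is hypothetical: the process and table names, 
the dictionaries holding them, and the helper functions are invented for 
illustration and do not reflect how Atlas or IGC store lineage.

```python
# Hypothetical lineage: each process reads some assets and writes others.
PROCESSES = {
    "sqoop_import_orders": {"inputs": ["mysql.orders"],
                            "outputs": ["hive.staging_orders"]},
    "clean_orders": {"inputs": ["hive.staging_orders"],
                     "outputs": ["hive.orders"]},
}

# Hypothetical stewardship and collection membership.
TABLES = {"mysql.orders", "hive.staging_orders", "hive.orders"}
STEWARD_OF = {"hive.orders": "alice", "hive.staging_orders": "bob",
              "risk_report": "alice"}
COLLECTIONS = {"Risk": {"risk_report"}}


def written_by(process):
    """Show all assets written by the named process."""
    return list(PROCESSES[process]["outputs"])


def immediate_upstream(asset):
    """(process, input asset) pairs one lineage hop upstream of the asset."""
    return [(name, src)
            for name, proc in PROCESSES.items()
            if asset in proc["outputs"]
            for src in proc["inputs"]]


def tables_with_steward_in_collection(collection):
    """Tables whose steward also manages at least one asset in the collection."""
    stewards = {STEWARD_OF[a] for a in COLLECTIONS[collection] if a in STEWARD_OF}
    return sorted(t for t in TABLES if STEWARD_OF.get(t) in stewards)


print(written_by("sqoop_import_orders"))
print(immediate_upstream("hive.orders"))
print(tables_with_steward_in_collection("Risk"))
```

The last function shows why the multi-relationship case is harder than a 
simple filter: it needs one pass to find the qualifying stewards and a second 
pass to find their tables.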



> Self-Service Catalog Search and Data Preview
> --------------------------------------------
>
>                 Key: ATLAS-1765
>                 URL: https://issues.apache.org/jira/browse/ATLAS-1765
>             Project: Atlas
>          Issue Type: New Feature
>          Components: atlas-webui
>    Affects Versions: 0.9-incubating
>            Reporter: Mandy Chessell
>            Assignee: Mandy Chessell
>              Labels: Self-Service-UIs, VirtualDataConnector
>
> This JIRA covers the development of the catalog search and preview of data 
> for data scientists and business users.  It supports the search of the Atlas 
> metadata repository, display of search results, additional filtering and 
> drill down into details of the data sources, including a data preview option 
> if the end user has access permission.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
