[jira] [Commented] (ATLAS-1147) UI: column name doesn't show up in schema tab for hive table
[ https://issues.apache.org/jira/browse/ATLAS-1147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15494017#comment-15494017 ]

Andrew Ahn commented on ATLAS-1147:
-----------------------------------

Sample file of patch attached as: SchemaLayoutView.js

> UI: column name doesn't show up in schema tab for hive table
> ------------------------------------------------------------
>
> Key: ATLAS-1147
> URL: https://issues.apache.org/jira/browse/ATLAS-1147
> Project: Atlas
> Issue Type: Bug
> Reporter: Shwetha G S
> Assignee: Kalyani Kashikar
> Attachments: ATLAS-1147.patch, SchemaLayoutView.js, Screen Shot 2016-08-30 at 9.21.07 PM.png

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (ATLAS-1147) UI: column name doesn't show up in schema tab for hive table
[ https://issues.apache.org/jira/browse/ATLAS-1147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Ahn updated ATLAS-1147:
------------------------------
    Attachment: SchemaLayoutView.js

- On each host that runs the Atlas server, copy the attached SchemaLayoutView.js to /usr/hdp/current/atlas-server/server/webapp/atlas/js/views/schema/SchemaLayoutView.js
- Refresh the browser so that its cache picks up the updated JavaScript file.

> UI: column name doesn't show up in schema tab for hive table
> ------------------------------------------------------------
>
> Key: ATLAS-1147
> URL: https://issues.apache.org/jira/browse/ATLAS-1147
> Project: Atlas
> Issue Type: Bug
> Reporter: Shwetha G S
> Assignee: Kalyani Kashikar
> Attachments: ATLAS-1147.patch, SchemaLayoutView.js, Screen Shot 2016-08-30 at 9.21.07 PM.png

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
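For anyone who has to apply the file on several Atlas hosts, the copy step above could be scripted roughly as follows. This is only a sketch: the function name `deploy_patched_view` and the `.bak` backup convention are assumptions made for illustration and are not part of the attached patch.

```python
import shutil
from pathlib import Path


def deploy_patched_view(patched_file, target_file):
    """Copy the patched JS file over the deployed one, keeping a backup.

    Returns the backup path, or None if there was no file to back up.
    """
    patched = Path(patched_file)
    target = Path(target_file)

    # Refuse to create the webapp directory tree: if it is missing,
    # this is probably not an Atlas server host.
    if not target.parent.is_dir():
        raise FileNotFoundError(f"target directory not found: {target.parent}")

    backup = None
    if target.exists():
        # Hypothetical convention: keep the previous file next to the new one.
        backup = target.with_name(target.name + ".bak")
        shutil.copy2(target, backup)

    shutil.copy2(patched, target)
    return backup
```

On a real host the target would be the path given in the comment above, for example `deploy_patched_view("SchemaLayoutView.js", "/usr/hdp/current/atlas-server/server/webapp/atlas/js/views/schema/SchemaLayoutView.js")`. The browser refresh still has to be done by hand.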
[jira] [Commented] (ATLAS-274) No lineage is recorded for creating a table using LIKE
[ https://issues.apache.org/jira/browse/ATLAS-274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15018219#comment-15018219 ]

Andrew Ahn commented on ATLAS-274:
----------------------------------

Good point.
1) We should definitely track lineage for table creation with LIKE.
2) From a visualization perspective, will the operation (SQL) not show what was done to create the new entity? Perhaps we could use a different graphic element, such as a different color, to represent this and other cases like a materialized view.

AA

> No lineage is recorded for creating a table using LIKE
> ------------------------------------------------------
>
> Key: ATLAS-274
> URL: https://issues.apache.org/jira/browse/ATLAS-274
> Project: Atlas
> Issue Type: Bug
> Affects Versions: 0.6-incubating
> Reporter: Ayub Khan
> Assignee: Hemanth Yamijala
> Attachments: application.log
>
> Seems like no lineage is recorded for creating a table using LIKE
> example:
> create table table_1242 LIKE table_1240
>
> Input graph query
> {noformat}
> curl 'http://localhost:21000/api/atlas/lineage/hive/table/primary.default.table_1242/inputs/graph' -H 'Accept-Encoding: gzip, deflate, sdch' -H 'Accept-Language: en-US,en;q=0.8' -H 'User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/46.0.2490.80 Safari/537.36' -H 'Accept: application/json, text/plain, */*' -H 'Referer: http://localhost:21000/' -H 'Cookie: JSESSIONID=15b7c1abmgzpsv2gax1e68sor' -H 'Connection: keep-alive' --compressed | python -m json.tool
> {
>     "requestId": "qtp983814036-14184 - f97fb1cb-be0b-4eaa-8338-5c1543803438",
>     "results": {
>         "jsonClass": "org.apache.atlas.typesystem.json.InstanceSerialization$_Struct",
>         "typeName": "__tempQueryResultStruct267",
>         "values": {
>             "edges": {},
>             "vertices": {}
>         }
>     },
>     "tableName": "primary.default.table_1242"
> }
> {noformat}
>
> Output graph query
> {noformat}
> curl 'http://localhost:21000/api/atlas/lineage/hive/table/primary.default.table_1242/outputs/graph' -H 'Accept-Encoding: gzip, deflate, sdch' -H 'Accept-Language: en-US,en;q=0.8' -H 'User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/46.0.2490.80 Safari/537.36' -H 'Accept: application/json, text/plain, */*' -H 'Referer: http://localhost:21000/' -H 'Cookie: JSESSIONID=15b7c1abmgzpsv2gax1e68sor' -H 'Connection: keep-alive' --compressed | python -m json.tool
> {
>     "requestId": "qtp983814036-14184 - 14e8cbc2-c2df-42b5-b506-51649fe2705b",
>     "results": {
>         "jsonClass": "org.apache.atlas.typesystem.json.InstanceSerialization$_Struct",
>         "typeName": "__tempQueryResultStruct261",
>         "values": {
>             "edges": {},
>             "vertices": {}
>         }
>     },
>     "tableName": "primary.default.table_1242"
> }
> {noformat}

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
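To make the symptom in the report easier to check from a script, the two queries can be reduced to a URL builder plus an emptiness test on the returned graph. This is a sketch only: the helper names `lineage_graph_url` and `has_lineage` are made up here, the URL layout is copied from the curl commands in the report, and the response shape is assumed to match the JSON shown there.

```python
from urllib.parse import quote


def lineage_graph_url(base_url, table_name, direction):
    """Build the lineage graph URL used in the report.

    direction is "inputs" or "outputs", as in the two curl commands.
    """
    if direction not in ("inputs", "outputs"):
        raise ValueError("direction must be 'inputs' or 'outputs'")
    table = quote(table_name, safe="")
    return f"{base_url.rstrip('/')}/api/atlas/lineage/hive/table/{table}/{direction}/graph"


def has_lineage(response):
    """Return True if the parsed JSON response contains any edges or vertices."""
    values = response.get("results", {}).get("values", {})
    return bool(values.get("edges")) or bool(values.get("vertices"))


# The response body from the report, trimmed to the fields that matter here.
reported = {
    "results": {"values": {"edges": {}, "vertices": {}}},
    "tableName": "primary.default.table_1242",
}
print(lineage_graph_url("http://localhost:21000", "primary.default.table_1242", "inputs"))
print(has_lineage(reported))
```

With the response from the report, `has_lineage` is false for both the inputs and the outputs graph, which is the behaviour this issue describes.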
[jira] [Updated] (ATLAS-183) Add a Hook in Storm to post the topology metadata
[ https://issues.apache.org/jira/browse/ATLAS-183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Ahn updated ATLAS-183:
-----------------------------
    Description:

Apache Storm Integration with Apache Atlas (incubating)

Introduction
Apache Storm is a distributed real-time computation system. Storm makes it easy to reliably process unbounded streams of data, doing for real-time processing what Hadoop did for batch processing. The process is essentially a DAG of nodes, which is called a topology.
Apache Atlas is a metadata repository that enables end-to-end data lineage, search, and association with business classifications.

Overview
The goal of this integration is, at a minimum, to push the operational topology metadata along with the underlying data source(s), target(s), derivation processes and any available business context so Atlas can capture the lineage for this topology.
It would also help to support custom user annotations per node in the topology.
There are 2 parts in this process, detailed below:
- Data model to represent the concepts in Storm
- Storm Bridge to update metadata in Atlas

Data Model
A data model is represented as a Type in Atlas. It contains the descriptions of various nodes in the DAG, such as spouts and bolts, and the corresponding source and target types. These need to be expressed as Types in the Atlas type system.
At the least, we need to create types for:
- Storm topology containing spouts, bolts, etc. with associations between them
- Source (typically Kafka, etc.)
- Target (typically Hive, HBase, HDFS, etc.)
You can take a look at the data model code for Hive. Storm should only be simpler than Hive from a data modeling perspective.

Pushing Metadata into Atlas
There are 2 parts to the bridge:
- Storm Bridge
  This is a one-time import for Storm to list all the active topologies and push the metadata into Atlas to address cases where Storm deployments exist before Atlas.
  You can refer to the bridge code for Hive.
- Post-execution Hook
  Atlas needs to be notified when a new topology is registered successfully in Storm or when someone changes the definition of an existing topology.
  You can refer to the hook code for Hive.

Example use case:
Custom annotations associated with each node in the topology. For example: Data Quality Rules, Error Handling, etc. A set of annotations that enumerates rules for handling nulls (for example, all nulls for a column get filtered).

> Add a Hook in Storm to post the topology metadata
> -------------------------------------------------
>
> Key: ATLAS-183
> URL: https://issues.apache.org/jira/browse/ATLAS-183
> Project: Atlas
> Issue Type: Sub-task
> Affects Versions: 0.6-incubating
> Reporter: Venkatesh Seetharam
> Fix For: 0.6-incubating
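To make the data model discussion a little more concrete, here is a rough sketch of the three kinds of types the description asks for (topology with spouts and bolts, sources, targets) plus per-node annotations. It uses plain Python dataclasses rather than the Atlas type system, and every class and field name (`TopologyNode`, `Topology`, `annotations`, and so on) is a placeholder invented for illustration, not an existing Atlas or Storm type.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class TopologyNode:
    name: str
    kind: str  # "spout" or "bolt"
    # Custom user annotations per node, e.g. data quality rules.
    annotations: Dict[str, str] = field(default_factory=dict)


@dataclass
class Topology:
    name: str
    nodes: Dict[str, TopologyNode] = field(default_factory=dict)
    edges: List[Tuple[str, str]] = field(default_factory=list)  # (upstream, downstream)
    sources: List[str] = field(default_factory=list)  # e.g. Kafka topics
    targets: List[str] = field(default_factory=list)  # e.g. Hive tables, HDFS paths

    def add_node(self, node: TopologyNode) -> None:
        if node.kind not in ("spout", "bolt"):
            raise ValueError(f"unknown node kind: {node.kind}")
        self.nodes[node.name] = node

    def connect(self, upstream: str, downstream: str) -> None:
        # Associations are only allowed between nodes that were registered.
        for name in (upstream, downstream):
            if name not in self.nodes:
                raise KeyError(name)
        self.edges.append((upstream, downstream))

    def downstream_of(self, name: str) -> List[str]:
        return [d for u, d in self.edges if u == name]


topology = Topology(name="clickstream", sources=["kafka:clicks"], targets=["hive:default.clicks"])
topology.add_node(TopologyNode("kafka_spout", "spout"))
topology.add_node(TopologyNode("filter_bolt", "bolt", {"null_handling": "drop rows with null user_id"}))
topology.add_node(TopologyNode("hive_bolt", "bolt"))
topology.connect("kafka_spout", "filter_bolt")
topology.connect("filter_bolt", "hive_bolt")
```

A bridge or hook would walk a structure like this and translate it into Atlas type instances; how that maps onto the actual type system is left open here.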
[jira] [Commented] (ATLAS-181) Integrate storm topology metadata into Atlas
[ https://issues.apache.org/jira/browse/ATLAS-181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14949035#comment-14949035 ]

Andrew Ahn commented on ATLAS-181:
----------------------------------

Please reference additional info with Storm Jira: https://issues.apache.org/jira/browse/STORM-1098

> Integrate storm topology metadata into Atlas
> --------------------------------------------
>
> Key: ATLAS-181
> URL: https://issues.apache.org/jira/browse/ATLAS-181
> Project: Atlas
> Issue Type: Improvement
> Affects Versions: 0.6-incubating
> Reporter: Venkatesh Seetharam
> Fix For: 0.6-incubating

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (ATLAS-184) Integrate Sqoop metadata into Atlas
[ https://issues.apache.org/jira/browse/ATLAS-184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Ahn updated ATLAS-184:
-----------------------------
    Description:

Apache Sqoop Integration with Apache Atlas (incubating)

Introduction
Apache Sqoop is a tool designed for efficiently transferring bulk data between Apache Hadoop and structured data stores such as relational databases.
Apache Atlas is a metadata repository that enables end-to-end data lineage, search, and association with business classifications.

Overview
The goal of this integration is, at a minimum, to push the Sqoop generated query metadata along with the source provenance, target(s), and any available business context so Atlas can capture the lineage for this topology.
There are 2 parts in this process, detailed below:
1. Data model to represent the concepts in Sqoop
2. Sqoop Bridge/Hook to update metadata in Atlas

Data Model
A data model is represented as a Type in Atlas. This can reuse or be closely modeled after the Hive data types that already exist. At the least, we need to create types for:
• Sqoop processes containing the SQL query text, start/end times, user, etc.
• Source Provenance, fine-grained at DB, Table, Column, etc. so we have a 1-1 mapping between source and target assets
• Target (typically Hive, HBase, HDFS, etc.)
You can take a look at the data model code for Hive. Sqoop should reuse the data model from Hive or be closely modeled after it.

Pushing Metadata into Atlas
There are 2 parts to the bridge:
1. Sqoop Bridge
   This does not apply to the Sqoop tool. However, it will apply if and when we migrate to Sqoop 2.
2. Post-execution Hook
   Atlas needs to be notified when a new Sqoop Ingest is executed successfully or when someone changes the definition of an existing Sqoop Job.
   You can refer to the hook code for Hive.
3. Column-level lineage
   It would be good to have column-level lineage for data flowing from the source database/WH into Hive.

> Integrate Sqoop metadata into Atlas
> -----------------------------------
>
> Key: ATLAS-184
> URL: https://issues.apache.org/jira/browse/ATLAS-184
> Project: Atlas
> Issue Type: Improvement
> Affects Versions: 0.6-incubating
> Reporter: Venkatesh Seetharam
> Fix For: 0.6-incubating

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
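As a rough illustration of the Sqoop data model and the 1-1 source/target mapping described above, something along these lines might be a starting point. It is a sketch in plain Python, not Atlas type definitions, and all names (`ColumnRef`, `SqoopProcess`, `map_column`, `source_of`) are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Dict, Optional


@dataclass(frozen=True)
class ColumnRef:
    """Fine-grained asset reference: database, table and column."""
    database: str
    table: str
    column: str

    @property
    def qualified_name(self) -> str:
        return f"{self.database}.{self.table}.{self.column}"


@dataclass
class SqoopProcess:
    query_text: str
    user: str
    start_time: datetime
    end_time: datetime
    # target column -> source column; kept strictly one-to-one.
    column_map: Dict[ColumnRef, ColumnRef] = field(default_factory=dict)

    def map_column(self, source: ColumnRef, target: ColumnRef) -> None:
        if target in self.column_map:
            raise ValueError(f"target already mapped: {target.qualified_name}")
        if source in self.column_map.values():
            raise ValueError(f"source already mapped: {source.qualified_name}")
        self.column_map[target] = source

    def source_of(self, target: ColumnRef) -> Optional[ColumnRef]:
        """Column-level lineage lookup: which source column feeds this target?"""
        return self.column_map.get(target)


process = SqoopProcess(
    query_text="SELECT id, name FROM customers WHERE $CONDITIONS",
    user="etl",
    start_time=datetime(2015, 10, 1, 2, 0),
    end_time=datetime(2015, 10, 1, 2, 5),
)
src = ColumnRef("crm", "customers", "id")
dst = ColumnRef("default", "customers", "id")
process.map_column(src, dst)
```

The one-to-one check is what would let a post-execution hook report column-level lineage from the source database into Hive without ambiguity.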
[jira] [Updated] (ATLAS-77) REST API to access all functions in Atlas
[ https://issues.apache.org/jira/browse/ATLAS-77?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Ahn updated ATLAS-77:
----------------------------
    Description:
Create REST API to access all functions in Atlas. Specifically, the capabilities should include:
1. Create new types
2. Modify existing types
3. Create new relationships for all types (data and trait)
4. Remove relationships -- disassociate
5. Allow assignment of properties to all types
6. Allow search for type, entity, properties
7. Return search results - detail, lineage (relations)
These APIs will be used for UI, HDP component and 3rd party interfaces.

    Summary: REST API to access all functions in Atlas (was: REST API to create new connections to Atlas)

REST API to access all functions in Atlas
-----------------------------------------

Key: ATLAS-77
URL: https://issues.apache.org/jira/browse/ATLAS-77
Project: Atlas
Issue Type: New Feature
Affects Versions: 0.5-incubating
Reporter: Linda George

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (ATLAS-76) Atlas business classification
[ https://issues.apache.org/jira/browse/ATLAS-76?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Ahn updated ATLAS-76:
----------------------------
    Assignee: Venkatesh Seetharam

Atlas business classification
-----------------------------

Key: ATLAS-76
URL: https://issues.apache.org/jira/browse/ATLAS-76
Project: Atlas
Issue Type: New Feature
Affects Versions: 0.5-incubating
Reporter: Linda George
Assignee: Venkatesh Seetharam

Create an agile facility to model a business organizational taxonomy. This logical model would include the following capabilities:
1. Allow the creation of a hierarchical model
2. Allow the creation of properties on any object in the model
3. Have inheritance so that child objects will get the same attributes (association) as the parent object.
4. Relate to data objects such as a Hive table, Kafka topic or Storm topology.
5. Be able to create (import) and report (export) on structures in bulk or ad hoc.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
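The hierarchy, property inheritance and export capabilities listed in this issue can be pictured with a small sketch like the one below. It only illustrates the intended behaviour and is not a proposal for the Atlas implementation; `TaxonomyNode` and its methods are invented names.

```python
class TaxonomyNode:
    """One object in a hierarchical business taxonomy."""

    def __init__(self, name, parent=None, properties=None):
        self.name = name
        self.parent = parent
        self.properties = dict(properties or {})
        self.children = []
        self.data_objects = []  # e.g. "hive:default.salaries", "kafka:payroll"
        if parent is not None:
            parent.children.append(self)

    def effective_properties(self):
        """Own properties layered on top of everything inherited from ancestors."""
        inherited = self.parent.effective_properties() if self.parent else {}
        inherited.update(self.properties)
        return inherited

    def path(self):
        prefix = self.parent.path() + "/" if self.parent else ""
        return prefix + self.name

    def export(self):
        """Flatten the subtree into rows, as a stand-in for a bulk report."""
        rows = [(self.path(), self.effective_properties(), list(self.data_objects))]
        for child in self.children:
            rows.extend(child.export())
        return rows


finance = TaxonomyNode("Finance", properties={"owner": "cfo"})
payroll = TaxonomyNode("Payroll", parent=finance, properties={"pii": "true"})
payroll.data_objects.append("hive:default.salaries")
```

Here `payroll.effective_properties()` contains both its own `pii` property and the `owner` property inherited from `finance`, which is the inheritance behaviour item 3 asks for.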
[jira] [Updated] (ATLAS-75) Atlas Hive integration
[ https://issues.apache.org/jira/browse/ATLAS-75?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Ahn updated ATLAS-75:
----------------------------
    Due Date: 17/Jul/15
    Fix Version/s: 0.5.1-incubating

Atlas Hive integration
----------------------

Key: ATLAS-75
URL: https://issues.apache.org/jira/browse/ATLAS-75
Project: Atlas
Issue Type: New Feature
Affects Versions: 0.5-incubating
Reporter: Linda George
Assignee: Venkatesh Seetharam
Fix For: 0.5.1-incubating

Native Atlas connector for Hive. This should be a jar file, or a simple install, to add to Hive to allow complete capture of Hive metadata with Atlas.
1) Should require no configuration from either Hive or Atlas
2) Have bootstrapping capability to get an initial sync. Should be able to re-run in the event of DR situations.
3) Post-execution Hive hook to capture incremental changes. This should not impact Hive performance or reliability.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (ATLAS-77) REST API to access all functions in Atlas
[ https://issues.apache.org/jira/browse/ATLAS-77?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Ahn resolved ATLAS-77.
-----------------------------
    Resolution: Fixed

REST API to access all functions in Atlas
-----------------------------------------

Key: ATLAS-77
URL: https://issues.apache.org/jira/browse/ATLAS-77
Project: Atlas
Issue Type: New Feature
Affects Versions: 0.5-incubating
Reporter: Linda George
Assignee: Venkatesh Seetharam
Fix For: 0.5.1-incubating

Create REST API to access all functions in Atlas. Specifically, the capabilities should include:
1. Create new types
2. Modify existing types
3. Create new relationships for all types (data and trait)
4. Remove relationships -- disassociate
5. Allow assignment of properties to all types
6. Allow search for type, entity, properties
7. Return search results - detail, lineage (relations)
These APIs will be used for UI, HDP component and 3rd party interfaces.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (ATLAS-77) REST API to access all functions in Atlas
[ https://issues.apache.org/jira/browse/ATLAS-77?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Ahn updated ATLAS-77:
----------------------------
    Assignee: Venkatesh Seetharam

REST API to access all functions in Atlas
-----------------------------------------

Key: ATLAS-77
URL: https://issues.apache.org/jira/browse/ATLAS-77
Project: Atlas
Issue Type: New Feature
Affects Versions: 0.5-incubating
Reporter: Linda George
Assignee: Venkatesh Seetharam

Create REST API to access all functions in Atlas. Specifically, the capabilities should include:
1. Create new types
2. Modify existing types
3. Create new relationships for all types (data and trait)
4. Remove relationships -- disassociate
5. Allow assignment of properties to all types
6. Allow search for type, entity, properties
7. Return search results - detail, lineage (relations)
These APIs will be used for UI, HDP component and 3rd party interfaces.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (ATLAS-75) Atlas Hive integration
[ https://issues.apache.org/jira/browse/ATLAS-75?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Ahn updated ATLAS-75:
----------------------------
    Description:
Native Atlas connector for Hive. This should be a jar file, or a simple install, to add to Hive to allow complete capture of Hive metadata with Atlas.
1) Should require no configuration from either Hive or Atlas
2) Have bootstrapping capability to get an initial sync. Should be able to re-run in the event of DR situations.
3) Post-execution Hive hook to capture incremental changes. This should not impact Hive performance or reliability.

Atlas Hive integration
----------------------

Key: ATLAS-75
URL: https://issues.apache.org/jira/browse/ATLAS-75
Project: Atlas
Issue Type: New Feature
Affects Versions: 0.5-incubating
Reporter: Linda George

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (ATLAS-75) Atlas Hive integration
[ https://issues.apache.org/jira/browse/ATLAS-75?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Ahn resolved ATLAS-75.
-----------------------------
    Resolution: Fixed

Atlas Hive integration
----------------------

Key: ATLAS-75
URL: https://issues.apache.org/jira/browse/ATLAS-75
Project: Atlas
Issue Type: New Feature
Affects Versions: 0.5-incubating
Reporter: Linda George
Assignee: Venkatesh Seetharam
Fix For: 0.5.1-incubating

Native Atlas connector for Hive. This should be a jar file, or a simple install, to add to Hive to allow complete capture of Hive metadata with Atlas.
1) Should require no configuration from either Hive or Atlas
2) Have bootstrapping capability to get an initial sync. Should be able to re-run in the event of DR situations.
3) Post-execution Hive hook to capture incremental changes. This should not impact Hive performance or reliability.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (ATLAS-77) REST API to access all functions in Atlas
[ https://issues.apache.org/jira/browse/ATLAS-77?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Ahn updated ATLAS-77:
----------------------------
    Due Date: 17/Jul/15
    Fix Version/s: 0.5.1-incubating

REST API to access all functions in Atlas
-----------------------------------------

Key: ATLAS-77
URL: https://issues.apache.org/jira/browse/ATLAS-77
Project: Atlas
Issue Type: New Feature
Affects Versions: 0.5-incubating
Reporter: Linda George
Assignee: Venkatesh Seetharam
Fix For: 0.5.1-incubating

Create REST API to access all functions in Atlas. Specifically, the capabilities should include:
1. Create new types
2. Modify existing types
3. Create new relationships for all types (data and trait)
4. Remove relationships -- disassociate
5. Allow assignment of properties to all types
6. Allow search for type, entity, properties
7. Return search results - detail, lineage (relations)
These APIs will be used for UI, HDP component and 3rd party interfaces.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (ATLAS-76) Atlas business classification
[ https://issues.apache.org/jira/browse/ATLAS-76?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Ahn updated ATLAS-76:
----------------------------
    Due Date: 17/Jul/15
    Fix Version/s: 0.5.1-incubating

Atlas business classification
-----------------------------

Key: ATLAS-76
URL: https://issues.apache.org/jira/browse/ATLAS-76
Project: Atlas
Issue Type: New Feature
Affects Versions: 0.5-incubating
Reporter: Linda George
Assignee: Venkatesh Seetharam
Fix For: 0.5.1-incubating

Create an agile facility to model a business organizational taxonomy. This logical model would include the following capabilities:
1. Allow the creation of a hierarchical model
2. Allow the creation of properties on any object in the model
3. Have inheritance capabilities so that child objects will get the same attributes (association) as the parent object.
4. Allow relating the business logical model to data objects such as a Hive table, Kafka topic or Storm topology.
5. Be able to create (import) and report (export) on structures in bulk or ad hoc.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)