[jira] [Updated] (ATLAS-1207) Dataset exists query in lineage APIs takes longer
[ https://issues.apache.org/jira/browse/ATLAS-1207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shwetha G S updated ATLAS-1207:
-------------------------------
    Attachment: ATLAS-1207-v2.patch

Fixed. Also changed the tests to catch this issue (previously the tests used the same value for both name and qualifiedName).

> Dataset exists query in lineage APIs takes longer
> -------------------------------------------------
>
>                 Key: ATLAS-1207
>                 URL: https://issues.apache.org/jira/browse/ATLAS-1207
>             Project: Atlas
>          Issue Type: Bug
>            Reporter: Sharmadha Sainath
>            Assignee: Shwetha G S
>             Fix For: 0.8-incubating
>
>         Attachments: ATLAS-1207-v2.patch, ATLAS-1207.patch
>
>
> Hive_column now extends DataSet. The Lineage Service uses the DSL query
> Dataset where __guid = , which maps to the gremlin query
> g.V().has(supertype, Dataset).has(__guid, ). Because the first filter is on
> type, which matches many vertices, this query is slow. Supertypes is a list
> property, and I am not sure how adding a combined index would work. The DSL
> query can be replaced with a direct graph query that filters on the GUID
> first, like
> {code}
> titanGraph.query().has(Constants.GUID_PROPERTY_KEY, guid)
>     .has(Constants.SUPER_TYPES_PROPERTY_KEY, AtlasClient.DATA_SET_SUPER_TYPE)
> {code}
> Thanks [~ssainath] for helping to test this

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
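The reasoning in the description (filter on the selective GUID before the broad supertype) can be illustrated with a small in-memory sketch. This is not Atlas or Titan code: `DatasetLookupSketch`, `Vertex`, `existsTypeFirst`, `existsGuidFirst`, and the `inspected` counter are all hypothetical names, and the `HashMap` only stands in for whatever index the graph backend keeps on the GUID property.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory model; none of these names come from Atlas or Titan.
final class DatasetLookupSketch {

    static final class Vertex {
        final String guid;
        final Set<String> superTypes;

        Vertex(String guid, Set<String> superTypes) {
            this.guid = guid;
            this.superTypes = superTypes;
        }
    }

    private final List<Vertex> vertices = new ArrayList<>();
    // Stand-in for an index on the GUID property (an assumption of this sketch).
    private final Map<String, Vertex> byGuid = new HashMap<>();
    // Number of vertices the most recent lookup had to inspect.
    int inspected = 0;

    void add(String guid, String... superTypes) {
        Vertex v = new Vertex(guid, new HashSet<>(Arrays.asList(superTypes)));
        vertices.add(v);
        byGuid.put(guid, v);
    }

    // Type-first: walk the vertices, test the supertype, then compare GUIDs.
    boolean existsTypeFirst(String superType, String guid) {
        inspected = 0;
        for (Vertex v : vertices) {
            inspected++;
            if (v.superTypes.contains(superType) && v.guid.equals(guid)) {
                return true;
            }
        }
        return false;
    }

    // GUID-first: one indexed lookup, then a supertype check on that vertex only.
    boolean existsGuidFirst(String superType, String guid) {
        inspected = 0;
        Vertex v = byGuid.get(guid);
        if (v == null) {
            return false;
        }
        inspected = 1;
        return v.superTypes.contains(superType);
    }

    public static void main(String[] args) {
        DatasetLookupSketch g = new DatasetLookupSketch();
        for (int i = 0; i < 1000; i++) {
            g.add("col-" + i, "DataSet");
        }
        g.existsTypeFirst("DataSet", "col-999");
        System.out.println("type-first inspected " + g.inspected);
        g.existsGuidFirst("DataSet", "col-999");
        System.out.println("guid-first inspected " + g.inspected);
    }
}
```

Both methods return the same answer; only the number of vertices touched differs, which is the effect the proposed direct graph query is meant to achieve once many column vertices share the DataSet supertype.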
[jira] [Updated] (ATLAS-1207) Dataset exists query in lineage APIs takes longer
[ https://issues.apache.org/jira/browse/ATLAS-1207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shwetha G S updated ATLAS-1207:
-------------------------------
    Attachment: ATLAS-1207.patch

> Dataset exists query in lineage APIs takes longer
> -------------------------------------------------
>
>                 Key: ATLAS-1207
>                 URL: https://issues.apache.org/jira/browse/ATLAS-1207
>             Project: Atlas
>          Issue Type: Bug
>            Reporter: Sharmadha Sainath
>            Assignee: Shwetha G S
>             Fix For: 0.8-incubating
>
>         Attachments: ATLAS-1207.patch
>
>
> Hive_column now extends DataSet. Lineage Service uses the DSL query Dataset
> where __guid = which maps to the gremlin query g.V().has(supertype,
> Dataset).has(__guid, ). Since the first filter is on type which returns
> many vertices, this query is slow. Supertypes is a list property and not sure
> how adding combined index will work. This can be replaced with graph query
> directly like
> {code}
> titanGraph.query().has(Constants.GUID_PROPERTY_KEY, guid)
>     .has(Constants.SUPER_TYPES_PROPERTY_KEY, AtlasClient.DATA_SET_SUPER_TYPE)
> {code}
> Thanks [~ssainath] for helping to test this