[ https://issues.apache.org/jira/browse/ATLAS-2816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Chengbing Liu updated ATLAS-2816:
---------------------------------
    Attachment: ATLAS-2816.01.patch

> Allow ignoring relationship in EntityGraphRetriever for FullTextMapperV2
> ------------------------------------------------------------------------
>
>                 Key: ATLAS-2816
>                 URL: https://issues.apache.org/jira/browse/ATLAS-2816
>             Project: Atlas
>          Issue Type: Bug
>    Affects Versions: 1.0.0
>            Reporter: Chengbing Liu
>            Priority: Major
>         Attachments: ATLAS-2816.01.patch
>
>
> We encountered a problem when using the Hive bridge in production. One
> database has 5000+ tables. Importing the first table takes only tens of
> milliseconds, but each import gets slower as more tables are added. In the
> end, it takes 1~2 seconds to import one table.
> After investigation, we realized that it is not necessary for the
> {{FullTextMapperV2}} to retrieve all the relationships of the database each
> time a table is imported. Because every import re-reads all tables imported
> so far, the time complexity of importing a whole database becomes O(n^2)
> (where n is the number of tables).
> We propose adding a parameter to the constructor of {{EntityGraphRetriever}}:
> {{ignoreRelationship}}. When set to true, {{mapVertexToAtlasEntity}} will
> skip the {{mapRelationshipAttributes}} call. Since {{FullTextMapperV2}} does
> not use the relationship attributes of the entity, this can save plenty of
> time when importing entities with a large number of relations.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
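
The proposal above can be illustrated with a minimal, self-contained sketch. It is not the Atlas code or the attached patch: `SketchRetriever`, `mapVertexToEntity`, `relationshipReads` and `simulateImport` are hypothetical stand-ins, and the assumption is that each table import triggers one re-mapping of the database entity, which reads one relationship per table imported so far. Under that assumption the number of relationship reads for n tables is 1 + 2 + ... + n = n(n+1)/2, and the flag removes them entirely.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-in for the retriever discussed in the issue.
// It only counts how many relationship edges would be read.
class SketchRetriever {
    private final boolean ignoreRelationship;
    private long relationshipReads = 0;

    SketchRetriever(boolean ignoreRelationship) {
        this.ignoreRelationship = ignoreRelationship;
    }

    // Loosely mirrors the idea of mapVertexToAtlasEntity: plain attributes are
    // always mapped; relationship attributes are mapped only when not ignored.
    List<String> mapVertexToEntity(String dbName, List<String> tablesOfDb) {
        List<String> mapped = new ArrayList<>();
        mapped.add("name=" + dbName);
        if (!ignoreRelationship) {
            for (String table : tablesOfDb) {
                mapped.add("tables->" + table);
                relationshipReads++; // one read per related table
            }
        }
        return mapped;
    }

    long relationshipReads() {
        return relationshipReads;
    }
}

class ImportSketch {
    // Simulates importing n tables one at a time. After each import the
    // database entity is re-mapped (assumed here to happen for full-text indexing).
    static long simulateImport(int n, boolean ignoreRelationship) {
        SketchRetriever retriever = new SketchRetriever(ignoreRelationship);
        List<String> tables = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            tables.add("table_" + i);
            retriever.mapVertexToEntity("db", tables);
        }
        return retriever.relationshipReads();
    }

    public static void main(String[] args) {
        System.out.println(simulateImport(5000, false)); // 12502500 = 5000 * 5001 / 2
        System.out.println(simulateImport(5000, true));  // 0
    }
}
```

In this toy model, 5000 tables cost about 12.5 million relationship reads without the flag and none with it, which matches the quadratic slowdown described in the report.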