Ashutosh Mestry created ATLAS-2434:
--------------------------------------

             Summary: Import: Performance Improvement
                 Key: ATLAS-2434
                 URL: https://issues.apache.org/jira/browse/ATLAS-2434
             Project: Atlas
          Issue Type: Bug
          Components: atlas-core
    Affects Versions: trunk
            Reporter: Ashutosh Mestry
            Assignee: Ashutosh Mestry
             Fix For: trunk

*Background*

The introduction of _relationships_ in Atlas caused the _EntityMutationResponse_ to report many more entities as modified than before. This hurts performance during bulk entity creation, which happens during the import process. Single entity creation.

*Behavior*

During import, in a typical scenario where a database is being imported, the list of updated entities in the _EntityMutationResponse_ grows progressively. This happens because every edge created between a database and its tables, and between a table and its columns, is recorded as an updated entity. Import therefore slows down progressively.

A ZIP file used for benchmarks showed:
* Branch-0.8 (last release): 2 minutes.
* Master (current development): 40+ minutes.

The behavior deteriorates as the size of the import increases.

*Possible Solution*

During the import process, avoid marking entities affected by relationship edge creation as modified.
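
The possible solution could be sketched roughly as follows. This is a minimal model, not Atlas code: the class {{MutationResponseSketch}}, its methods, and the {{importInProgress}} flag are all hypothetical names chosen for illustration, and the assumption that each edge appends one entry to the updated list is a simplification of the behavior described above.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a mutation response; NOT the Atlas EntityMutationResponse API. */
class MutationResponseSketch {
    private final List<String> created = new ArrayList<>();
    private final List<String> updated = new ArrayList<>();
    private final boolean importInProgress; // assumed flag, set by the import code path

    MutationResponseSketch(boolean importInProgress) {
        this.importInProgress = importInProgress;
    }

    void recordCreated(String guid) {
        created.add(guid);
    }

    /** Called when creating a relationship edge touches an already existing entity. */
    void recordUpdatedByRelationshipEdge(String guid) {
        if (importInProgress) {
            return; // proposed change: do not report edge-only changes during import
        }
        updated.add(guid);
    }

    int createdCount() {
        return created.size();
    }

    int updatedCount() {
        return updated.size();
    }

    /** Simulates importing one database with the given number of tables and columns. */
    static MutationResponseSketch importDatabase(int tables, int columnsPerTable,
                                                 boolean importInProgress) {
        MutationResponseSketch response = new MutationResponseSketch(importInProgress);
        response.recordCreated("db");
        for (int t = 0; t < tables; t++) {
            String table = "table-" + t;
            response.recordCreated(table);
            // The database-table edge touches the database entity.
            response.recordUpdatedByRelationshipEdge("db");
            for (int c = 0; c < columnsPerTable; c++) {
                response.recordCreated(table + "/column-" + c);
                // The table-column edge touches the table entity.
                response.recordUpdatedByRelationshipEdge(table);
            }
        }
        return response;
    }
}
```

In this model, the updated list grows with the number of edges when the flag is off and stays empty when it is on, while the created list is the same in both cases. Whether suppressing these entries is safe for consumers of the real response would need to be checked against the Atlas code.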