[ https://issues.apache.org/jira/browse/ATLAS-2434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Madhan Neethiraj updated ATLAS-2434:
------------------------------------
    Attachment: ATLAS-2434-2.patch

> Import: Performance Improvement
> -------------------------------
>
>                 Key: ATLAS-2434
>                 URL: https://issues.apache.org/jira/browse/ATLAS-2434
>             Project: Atlas
>          Issue Type: Bug
>          Components: atlas-core
>    Affects Versions: trunk
>            Reporter: Ashutosh Mestry
>            Assignee: Ashutosh Mestry
>            Priority: Major
>             Fix For: trunk
>
>         Attachments: ATLAS-2434-2.patch,
> ATLAS-2434-Import-Perf-Improvement.patch
>
>
> *Background*
> The introduction of _relationships_ within Atlas caused the
> _EntityMutationResponse_ to contain many more entities marked as modified
> than before.
> This has an adverse impact on performance during bulk entity creation.
> Bulk entity creation happens during the import process. Single entity
> creation.
> *Behavior*
> During import, in a typical scenario where a database is being imported,
> the _EntityMutationResponse_'s list of updated entities grows
> progressively. This happens because every edge created between
> database-table and table-column is marked as an updated entity.
> Import thus slows down progressively.
> A ZIP file used for benchmarks showed:
> * Branch-0.8 (last release): 2 minutes.
> * Master (current development): 40+ minutes.
> The behavior deteriorates as the size of the import increases.
> *Possible Solution*
> During the import process, avoid marking entities affected by relationship
> edge creation as modified.
>

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
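
The behavior and the possible solution described in the issue can be illustrated with a small toy model. This is only a sketch, not Atlas code: `MutationResponse`, `create_entities`, `run_import`, and the `import_mode` flag are hypothetical names, and the model assumes that each newly created relationship edge adds both of its end entities to the response's updated list.

```python
from dataclasses import dataclass, field


@dataclass
class MutationResponse:
    """Toy stand-in for an entity mutation response (hypothetical, not the Atlas class)."""
    created: list = field(default_factory=list)
    updated: list = field(default_factory=list)


def create_entities(entities, edges, response, import_mode):
    """Record created entities, then process one relationship edge per (parent, child) pair.

    Assumption: outside import mode, both ends of every new edge are recorded
    as updated. In import mode that bookkeeping is skipped entirely.
    """
    response.created.extend(entities)
    for parent, child in edges:
        if not import_mode:
            response.updated.append(parent)
            response.updated.append(child)


def run_import(num_tables, columns_per_table, import_mode):
    """Simulate importing one database with tables and columns.

    The database entity "db" is assumed to exist already, so it is not
    added to the created list.
    """
    response = MutationResponse()
    db = "db"
    for t in range(num_tables):
        table = f"db.t{t}"
        columns = [f"{table}.c{c}" for c in range(columns_per_table)]
        # One database-table edge plus one table-column edge per column.
        edges = [(db, table)] + [(table, col) for col in columns]
        create_entities([table] + columns, edges, response, import_mode)
    return response


if __name__ == "__main__":
    with_marking = run_import(3, 4, import_mode=False)
    without_marking = run_import(3, 4, import_mode=True)
    print(len(with_marking.created), len(with_marking.updated))
    print(len(without_marking.created), len(without_marking.updated))
```

In this model the updated list grows with every edge, so it ends up larger than the created list, while the import-mode run leaves it empty. The real cost in Atlas depends on what is done with each updated entity, which this sketch does not model.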