[ 
https://issues.apache.org/jira/browse/ATLAS-2434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Madhan Neethiraj updated ATLAS-2434:
------------------------------------
    Attachment: ATLAS-2434-2.patch

> Import: Performance Improvement
> -------------------------------
>
>                 Key: ATLAS-2434
>                 URL: https://issues.apache.org/jira/browse/ATLAS-2434
>             Project: Atlas
>          Issue Type: Bug
>          Components:  atlas-core
>    Affects Versions: trunk
>            Reporter: Ashutosh Mestry
>            Assignee: Ashutosh Mestry
>            Priority: Major
>             Fix For: trunk
>
>         Attachments: ATLAS-2434-2.patch, 
> ATLAS-2434-Import-Perf-Improvement.patch
>
>
> *Background*
> The introduction of _relationships_ within Atlas caused the 
> _EntityMutationResponse_ to contain many more entities marked as modified 
> than before.
> This has an adverse impact on the performance of bulk entity creation. 
> Bulk entity creation happens during the import process. Single entity creation.
> *Behavior*
> During import, in a typical scenario where a database is being imported, the 
> list of updated entities in _EntityMutationResponse_ grows progressively. 
> This happens because every edge created between a database and its tables, 
> and between a table and its columns, marks the affected entities as updated.
> Import thus slows down progressively.
> A benchmark import of a ZIP file showed:
>  * Branch-0.8 (last release): 2 minutes.
>  * Master (current development): 40+ minutes.
> The slowdown worsens as the size of the import increases.
> *Possible Solution*
> During the import process, avoid marking entities affected by relationship 
> edge creation as modified.
>  
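The proposed change can be illustrated with a minimal Java sketch. It is not taken from the attached patches and does not use the Atlas API: the class `MutationResponseSketch`, its methods, and the `importInProgress` flag are all hypothetical names. It also assumes that each relationship edge marks both of its endpoints as updated, which the description does not spell out.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a mutation response; NOT the Atlas EntityMutationResponse. */
final class MutationResponseSketch {
    private final boolean importInProgress;              // assumed flag, set by the import code path
    private final List<String> created = new ArrayList<>();
    private final List<String> updated = new ArrayList<>();

    MutationResponseSketch(boolean importInProgress) {
        this.importInProgress = importInProgress;
    }

    void recordCreated(String guid) {
        created.add(guid);
    }

    /** Called when an entity is touched only because a relationship edge was attached to it. */
    void recordUpdatedByEdge(String guid) {
        if (importInProgress) {
            return;                                      // during import, do not grow the updated list
        }
        updated.add(guid);
    }

    int createdCount() { return created.size(); }
    int updatedCount() { return updated.size(); }

    /** Simulates creating one database with the given number of tables and columns per table. */
    static MutationResponseSketch importDatabase(int tables, int columnsPerTable, boolean importInProgress) {
        MutationResponseSketch response = new MutationResponseSketch(importInProgress);
        response.recordCreated("db");
        for (int t = 0; t < tables; t++) {
            String table = "table-" + t;
            response.recordCreated(table);
            // Assumption: a database-table edge marks both endpoints as updated.
            response.recordUpdatedByEdge("db");
            response.recordUpdatedByEdge(table);
            for (int c = 0; c < columnsPerTable; c++) {
                String column = table + "-col-" + c;
                response.recordCreated(column);
                // Assumption: a table-column edge marks both endpoints as updated.
                response.recordUpdatedByEdge(table);
                response.recordUpdatedByEdge(column);
            }
        }
        return response;
    }
}
```

In this sketch, calling `importDatabase(3, 4, false)` leaves 30 entries in the updated list for only 16 created entities, while the same call with the flag set to `true` leaves the updated list empty and the created list unchanged.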



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
