[ 
https://issues.apache.org/jira/browse/ATLAS-2434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16356501#comment-16356501
 ] 

Madhan Neethiraj commented on ATLAS-2434:
-----------------------------------------

[~ashutoshm] - the following change might leave updated entities out of notifications, 
for example when an import adds a process entity that connects two existing entities. 
In this case, the update to the existing entities should be handled as such (instead of 
being ignored). Please review the updated patch.

{code}
--- a/repository/src/main/java/org/apache/atlas/repository/store/graph/v1/EntityGraphMapper.java
+++ b/repository/src/main/java/org/apache/atlas/repository/store/graph/v1/EntityGraphMapper.java
@@ -403,7 +403,7 @@ public class EntityGraphMapper {

                 // created new relationship,
                 // record entity update on both vertices of the new relationship
-                if (currentEdge == null && newEdge != null) {
+                if (!context.isImport() && currentEdge == null && newEdge != null) {

                     // based on relationship edge direction record update only on attribute vertex
                     if (edgeDirection == IN) {
@@ -706,7 +706,7 @@ public class EntityGraphMapper {

                     // if relationship did not exist before and new relationship was created
                     // record entity update on both relationship vertices
-                    if (!relationshipExists) {
+                    if (!relationshipExists && !context.isImport()) {
                         recordEntityUpdate(attributeVertex);
                     }
                 }
{code}
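
To make the concern concrete, here is a rough, self-contained sketch. It is not Atlas code: {{ImportUpdateSketch}}, {{Policy}}, {{Edge}} and {{recordedUpdates}} are hypothetical names, and the third policy is only one conceivable way of keeping updates to pre-existing entities, not a description of what the updated patch does.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Simplified, hypothetical model; none of these types are Atlas classes.
final class ImportUpdateSketch {

    // How a newly created relationship edge affects the "updated" list.
    enum Policy {
        ALWAYS_RECORD,           // behavior before the change in the diff
        SKIP_ON_IMPORT,          // the "!context.isImport()" guard from the diff
        EXISTING_ONLY_ON_IMPORT  // hypothetical alternative, not taken from any patch
    }

    // A new relationship edge between two entity GUIDs.
    static final class Edge {
        final String from;
        final String to;

        Edge(String from, String to) {
            this.from = from;
            this.to = to;
        }
    }

    // Returns the GUIDs that would be recorded as updated after creating newEdges.
    // existingGuids holds entities that were in the store before this request.
    static Set<String> recordedUpdates(List<Edge> newEdges, Set<String> existingGuids,
                                       boolean isImport, Policy policy) {
        Set<String> updated = new LinkedHashSet<>();
        for (Edge edge : newEdges) {
            for (String guid : Arrays.asList(edge.from, edge.to)) {
                boolean record;
                if (!isImport || policy == Policy.ALWAYS_RECORD) {
                    record = true;                          // every edge end is marked updated
                } else if (policy == Policy.SKIP_ON_IMPORT) {
                    record = false;                         // nothing is marked during import
                } else {
                    record = existingGuids.contains(guid);  // only pre-existing entities
                }
                if (record) {
                    updated.add(guid);
                }
            }
        }
        return updated;
    }

    public static void main(String[] args) {
        // Two entities already in the store; the import adds "process" linking them.
        Set<String> existing = new HashSet<>(Arrays.asList("tableA", "tableB"));
        List<Edge> edges = Arrays.asList(new Edge("tableA", "process"),
                                         new Edge("process", "tableB"));
        for (Policy policy : Policy.values()) {
            System.out.println(policy + " -> " + recordedUpdates(edges, existing, true, policy));
        }
    }
}
```

Under these assumptions, {{SKIP_ON_IMPORT}} records nothing during an import, so the two existing entities that the new process entity connects never show up as updated. That is the case described above.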

> Import: Performance Improvement
> -------------------------------
>
>                 Key: ATLAS-2434
>                 URL: https://issues.apache.org/jira/browse/ATLAS-2434
>             Project: Atlas
>          Issue Type: Bug
>          Components:  atlas-core
>    Affects Versions: trunk
>            Reporter: Ashutosh Mestry
>            Assignee: Ashutosh Mestry
>            Priority: Major
>             Fix For: trunk
>
>         Attachments: ATLAS-2434-2.patch, 
> ATLAS-2434-Import-Perf-Improvement.patch
>
>
> *Background*
> The introduction of _relationships_ within Atlas caused the 
> _EntityMutationResponse_ to contain many more entities marked as modified than 
> before.
> This has an adverse impact on performance during bulk entity creation. 
> Bulk entity creation happens during the import process. Single entity creation.
> *Behavior*
> During import, in a typical scenario where a database is being imported, the 
> list of updated entities in _EntityMutationResponse_ grows progressively. This happens 
> because every edge created between database-table and table-column is marked 
> as an updated entity.
> Import thus slows down progressively.
> A ZIP file used for benchmarks showed:
>  * Branch-0.8 (last release): 2 minutes.
>  * Master (current development): 40+ minutes.
> The behavior deteriorates as the size of the import increases.
> *Possible Solution*
> During the import process, avoid marking entities affected by relationship 
> edge creation as modified.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
