[ https://issues.apache.org/jira/browse/ATLAS-3132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16818819#comment-16818819 ]
ASF subversion and git services commented on ATLAS-3132:
--------------------------------------------------------

Commit 9350a7d4bf13694c11127687843a49ef3ab29a95 in atlas's branch refs/heads/branch-2.0 from Ashutosh Mestry
[ https://gitbox.apache.org/repos/asf?p=atlas.git;h=9350a7d ]

ATLAS-3132: performance improvements in UniqueAttributesPatch

Signed-off-by: Madhan Neethiraj <mad...@apache.org>
(cherry picked from commit efc4bebc1623c9d00fe4fdf0df424918654a73df)

> Data Patch Fx: Improve Data Patching Performance
> ------------------------------------------------
>
>                 Key: ATLAS-3132
>                 URL: https://issues.apache.org/jira/browse/ATLAS-3132
>             Project: Atlas
>          Issue Type: Improvement
>          Components: atlas-core
>    Affects Versions: trunk
>            Reporter: Ashutosh Mestry
>            Assignee: Ashutosh Mestry
>            Priority: Major
>             Fix For: trunk
>
>
> *Background*
> The Java patch framework (now called data patching framework) introduced
> recently performs patching at the rate of 1 million entities per 15 hrs. This
> can be improved.
> *Proposed Solution*
> * Use the Producer-Consumer framework to spawn multiple workers to perform
> concurrent updates to entity vertices.
> * Use _AtlasGraph_ in bulk loading mode to further gain performance.
> * Perform duplicate data checks during processing.
> *Projected Performance Improvement*
> * Based on various tests, these give increased throughput. New rate can be
> ~300K entities per 5 mins.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
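
For readers unfamiliar with the pattern, the producer-consumer approach named in the proposed solution can be sketched roughly as below. This is a minimal illustration using only java.util.concurrent, not the code from the commit: the class name PatchWorkerSketch, the method patchAll, the POISON_PILL sentinel, the queue capacity, and the in-memory duplicate check are all hypothetical, and the actual vertex update is replaced by a counter.

```java
import java.util.List;
import java.util.Set;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch of a producer-consumer patch run; not the Atlas implementation. */
class PatchWorkerSketch {
    // Sentinel telling a worker that no more work will arrive.
    // Assumption: no real vertex id equals Long.MIN_VALUE.
    private static final Long POISON_PILL = Long.MIN_VALUE;

    /** Feeds ids to numWorkers consumers; returns how many distinct ids were processed. */
    static int patchAll(List<Long> vertexIds, int numWorkers) throws InterruptedException {
        BlockingQueue<Long> queue = new ArrayBlockingQueue<>(1024);
        Set<Long> seen = ConcurrentHashMap.newKeySet(); // duplicate check (assumed to be in-memory)
        AtomicInteger patched = new AtomicInteger();
        ExecutorService workers = Executors.newFixedThreadPool(numWorkers);

        // Consumers: each worker takes ids from the queue until it sees the sentinel.
        for (int i = 0; i < numWorkers; i++) {
            workers.submit(() -> {
                try {
                    while (true) {
                        Long id = queue.take();
                        if (id.equals(POISON_PILL)) {
                            return;
                        }
                        if (seen.add(id)) {
                            // Placeholder for the real entity-vertex update.
                            patched.incrementAndGet();
                        }
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }

        // Producer: enqueue all ids, then one sentinel per worker.
        for (Long id : vertexIds) {
            queue.put(id);
        }
        for (int i = 0; i < numWorkers; i++) {
            queue.put(POISON_PILL);
        }

        workers.shutdown();
        workers.awaitTermination(1, TimeUnit.MINUTES);
        return patched.get();
    }
}
```

The bounded queue makes the producer block when the workers fall behind, which keeps memory use flat. Bulk loading mode is left out here because its API is not described in this message.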