> On April 2, 2020, 5:39 a.m., Madhan Neethiraj wrote:
> > repository/src/main/java/org/apache/atlas/repository/graph/GraphBackedSearchIndexer.java
> > Lines 344 (patched)
> > <https://reviews.apache.org/r/72287/diff/1/?file=2216514#file2216514line344>
> >
> >     edgeLabel is typically used to find a subset of edges from a given 
> > vertex. Having an edge index on the label probably won't help improve 
> > performance; however, we need to understand the impact of creating this 
> > index in an existing Atlas instance that has a large number of edges. 
> > 1) Would the index be populated with existing edge labels? 2) If yes, how 
> > long would the index creation take, say for 1m edges? 3) If no, would 
> > search ignore edges that were not indexed?
> >     
> >     I suggest measuring the performance impact of not having this index.

I did a run last night without the index and it had no impact on 
performance. I have removed this change.
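
For reference, a minimal sketch of why a lookup by label that starts from a 
given vertex would not be expected to benefit from a global edge-label index. 
This is a simplified in-memory model and an assumption for illustration only; 
it is not Atlas or JanusGraph code, and `EdgeLabelLookup`, `Vertex`, `Edge` 
and `edgesByLabel` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified in-memory model for illustration; NOT the Atlas or JanusGraph API.
public class EdgeLabelLookup {

    static final class Edge {
        final String label;
        final Vertex out;
        final Vertex in;

        Edge(String label, Vertex out, Vertex in) {
            this.label = label;
            this.out = out;
            this.in = in;
        }
    }

    static final class Vertex {
        final String id;
        final List<Edge> outEdges = new ArrayList<>();

        Vertex(String id) {
            this.id = id;
        }

        Edge addEdge(String label, Vertex target) {
            Edge edge = new Edge(label, this, target);
            outEdges.add(edge);
            return edge;
        }

        // Scans only this vertex's own outgoing edges and filters by label.
        // No graph-wide label index is consulted in this model.
        List<Edge> edgesByLabel(String label) {
            List<Edge> result = new ArrayList<>();
            for (Edge edge : outEdges) {
                if (edge.label.equals(label)) {
                    result.add(edge);
                }
            }
            return result;
        }
    }

    public static void main(String[] args) {
        Vertex table = new Vertex("table");
        table.addEdge("columns", new Vertex("col1"));
        table.addEdge("columns", new Vertex("col2"));
        table.addEdge("db", new Vertex("db1"));

        System.out.println(table.edgesByLabel("columns").size()); // prints 2
    }
}
```

In this model the cost of the lookup depends on the number of edges attached 
to the starting vertex, not on the total number of edges in the graph, which 
is consistent with the run without the index showing no difference.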


- Ashutosh


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72287/#review220181
-----------------------------------------------------------


On March 30, 2020, 11:19 p.m., Ashutosh Mestry wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72287/
> -----------------------------------------------------------
> 
> (Updated March 30, 2020, 11:19 p.m.)
> 
> 
> Review request for atlas, Madhan Neethiraj, Nikhil Bonte, Nixon Rodrigues, 
> and Sarath Subramanian.
> 
> 
> Bugs: ATLAS-3706
>     https://issues.apache.org/jira/browse/ATLAS-3706
> 
> 
> Repository: atlas
> 
> 
> Description
> -------
> 
> **Approach**
> 
> 1. Added metrics to most of the methods involved in entity creation. (The 
> patch does not include the extra metrics that were added in other places.)
> 2. Started importing a large number of entities using the 
> _ZipFileMigrationImporter_.
> 3. Observed the behavior of the import over 24 hours. Observations included 
> CPU usage, memory usage and the import throughput using the _metric.log_.
> 4. Changes were added one at a time. The impact of each change on 
> performance (via metric.log) and accuracy was observed before the next 
> change was added.
> 
> **Observations**
> * Relationship creation took an inordinately large amount of time under 
> load. The time was spent in _GraphHelper.getAdjacentEdgesByLabel_. This 
> implementation also caused a memory buildup of _AtlasEdge_ objects, which 
> stayed in memory for a long time. This had the secondary effect of slowing 
> down entity creation operations after about 6 hours (this duration differed 
> with node configuration).
> * _GraphHelper.getOrCreateEdge_ did a vertex-to-vertex comparison, which is 
> time-consuming.
> * _GraphBackedSearchIndexer_ edge label index: the majority of edge creation 
> operations included a lookup by edge label.
> 
> **Configuration**
> Cluster: 3 node: 40 cores, 128 GB RAM, 1.5 TB of disk space.
> Atlas configuration: 32 GB RAM.
> 
> 
> Diffs
> -----
> 
>   repository/src/main/java/org/apache/atlas/repository/graph/GraphBackedSearchIndexer.java 647e3040c 
>   repository/src/main/java/org/apache/atlas/repository/graph/GraphHelper.java 5ab9f4d13 
> 
> 
> Diff: https://reviews.apache.org/r/72287/diff/1/
> 
> 
> Testing
> -------
> 
> **Manual tests**
> (See above).
> Accuracy verification.
> 
> **Unit tests**
> Executed existing unit tests.
> 
> **Pre-commit build**
> https://builds.apache.org/view/A/view/Atlas/job/PreCommit-ATLAS-Build-Test/1776/
> 
> 
> Thanks,
> 
> Ashutosh Mestry
> 
>
