Yushi Hayasaka created ATLAS-5095:
-------------------------------------
Summary: Cache shell entity after creation to prevent from cache
miss
Key: ATLAS-5095
URL: https://issues.apache.org/jira/browse/ATLAS-5095
Project: Atlas
Issue Type: Improvement
Reporter: Yushi Hayasaka
Sometimes Atlas attempts to load an entity from the cache (e.g., to notify
listeners of processed entities after `createOrUpdate()`).
[https://github.com/apache/atlas/blob/18d7f9dccf5658988d32e387339948286810f0a8/repository/src/main/java/org/apache/atlas/repository/graph/FullTextMapperV2.java#L176]
[https://github.com/apache/atlas/blob/18d7f9dccf5658988d32e387339948286810f0a8/repository/src/main/java/org/apache/atlas/repository/store/graph/v2/AtlasEntityChangeNotifier.java#L595]
[https://github.com/apache/atlas/blob/18d7f9dccf5658988d32e387339948286810f0a8/repository/src/main/java/org/apache/atlas/repository/store/graph/v2/AtlasEntityChangeNotifier.java#L465]
[https://github.com/apache/atlas/blob/18d7f9dccf5658988d32e387339948286810f0a8/repository/src/main/java/org/apache/atlas/repository/store/graph/v2/AtlasEntityChangeNotifier.java#L418]
[https://github.com/apache/atlas/blob/18d7f9dccf5658988d32e387339948286810f0a8/repository/src/main/java/org/apache/atlas/repository/store/graph/v2/AtlasEntityChangeNotifier.java#L111-L115]
[https://github.com/apache/atlas/blob/18d7f9dccf5658988d32e387339948286810f0a8/repository/src/main/java/org/apache/atlas/repository/store/graph/v2/AtlasEntityStoreV2.java#L1145-L1146]
If the specified entity is not found in the cache, Atlas falls back to
retrieving it through `EntityGraphRetriever#toAtlasEntity`, which is a slow
path compared to the cache.
[https://github.com/apache/atlas/blob/18d7f9dccf5658988d32e387339948286810f0a8/repository/src/main/java/org/apache/atlas/repository/graph/FullTextMapperV2.java#L181]
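To make the cost of a cache miss concrete, here is a minimal sketch of the
lookup-with-fallback pattern described above. It is not Atlas code:
`CacheFallbackSketch`, `Entity`, `slowLoader` and `getOrLoad` are hypothetical
names, the map stands in for the per-request cache, and `slowLoader` stands in
for `EntityGraphRetriever#toAtlasEntity`. Storing the loaded entity back into
the cache on a miss is an assumption of the sketch.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Illustrative only: none of these types are Atlas classes.
class CacheFallbackSketch {
    // Minimal stand-in for an entity.
    static final class Entity {
        final String guid;
        final String typeName;

        Entity(String guid, String typeName) {
            this.guid = guid;
            this.typeName = typeName;
        }
    }

    // Stand-in for the per-request entity cache.
    private final Map<String, Entity> cache = new HashMap<>();
    // Stand-in for the slow retrieval through the graph.
    private final Function<String, Entity> slowLoader;
    // Counts how often the slow path was taken.
    int slowPathCalls = 0;

    CacheFallbackSketch(Function<String, Entity> slowLoader) {
        this.slowLoader = slowLoader;
    }

    void put(Entity entity) {
        cache.put(entity.guid, entity);
    }

    // Fast path: cache hit. Slow path: load, then remember the result
    // (caching on a miss is an assumption of this sketch).
    Entity getOrLoad(String guid) {
        Entity entity = cache.get(guid);
        if (entity == null) {
            slowPathCalls++;
            entity = slowLoader.apply(guid);
            if (entity != null) {
                cache.put(guid, entity);
            }
        }
        return entity;
    }

    public static void main(String[] args) {
        CacheFallbackSketch sketch =
                new CacheFallbackSketch(guid -> new Entity(guid, "hive_table"));
        sketch.put(new Entity("guid-1", "hive_table"));
        sketch.getOrLoad("guid-1"); // served from the cache
        sketch.getOrLoad("guid-2"); // not cached, goes through the slow loader
        System.out.println("slow path calls: " + sketch.slowPathCalls);
    }
}
```

In this sketch every entity that is looked up without having been cached first
costs one call to the slow loader.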
Currently, we observe that Atlas retrieves shell entities through
`EntityGraphRetriever` instead of from the cache.
When an event contains many shell entities, this increases the operation
time.
As introduced in ATLAS-3405, if an event includes entities that do not exist
yet, Atlas creates shell entities for them.
In my understanding (please correct me if I am wrong), a shell entity should
only have the few properties that are set in `createShellEntityVertex`.
[https://github.com/apache/atlas/blob/18d7f9dccf5658988d32e387339948286810f0a8/repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphMapper.java#L291-L312]
So, I guess it is safe to cache the shell entity after creation (e.g. right
after `createShellEntityVertex`), which should improve performance by reducing
calls to the slow path.
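A rough sketch of the proposal, under stated assumptions: it is not the Atlas
implementation, and `ShellEntityCachingSketch`, `Entity`, `createShellEntity`,
`getOrLoad`, the `incomplete` flag and the `cacheOnCreate` switch are all
hypothetical names. The `graph` map stands in for the graph store and
`requestCache` for the per-request cache. The only point illustrated is that
caching a shell entity at creation time removes the later slow-path load.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: none of these types are Atlas classes.
class ShellEntityCachingSketch {
    // Minimal stand-in for an entity; "incomplete" marks a shell (assumed flag).
    static final class Entity {
        final String guid;
        final String typeName;
        final boolean incomplete;

        Entity(String guid, String typeName, boolean incomplete) {
            this.guid = guid;
            this.typeName = typeName;
            this.incomplete = incomplete;
        }
    }

    private final Map<String, Entity> graph = new HashMap<>();        // slow store
    private final Map<String, Entity> requestCache = new HashMap<>(); // fast cache
    private final boolean cacheOnCreate;
    // Counts how often a lookup had to go to the slow store.
    int slowPathCalls = 0;

    ShellEntityCachingSketch(boolean cacheOnCreate) {
        this.cacheOnCreate = cacheOnCreate;
    }

    // Creates a shell entity; optionally caches it right away (the proposal).
    Entity createShellEntity(String guid, String typeName) {
        Entity shell = new Entity(guid, typeName, true);
        graph.put(guid, shell);
        if (cacheOnCreate) {
            requestCache.put(guid, shell);
        }
        return shell;
    }

    // Later lookup, e.g. when notifying listeners about processed entities.
    Entity getOrLoad(String guid) {
        Entity entity = requestCache.get(guid);
        if (entity == null) {
            slowPathCalls++;
            entity = graph.get(guid);
            if (entity != null) {
                requestCache.put(guid, entity);
            }
        }
        return entity;
    }

    public static void main(String[] args) {
        for (boolean cacheOnCreate : new boolean[] {false, true}) {
            ShellEntityCachingSketch sketch = new ShellEntityCachingSketch(cacheOnCreate);
            for (int i = 0; i < 3; i++) {
                sketch.createShellEntity("guid-" + i, "hive_table");
            }
            for (int i = 0; i < 3; i++) {
                sketch.getOrLoad("guid-" + i);
            }
            System.out.println("cacheOnCreate=" + cacheOnCreate
                    + " slowPathCalls=" + sketch.slowPathCalls);
        }
    }
}
```

Whether this is safe in Atlas depends on the shell entity really carrying
nothing beyond what `createShellEntityVertex` sets, as asked above.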
--
This message was sent by Atlassian Jira
(v8.20.10#820010)