-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/73329/
-----------------------------------------------------------

(Updated May 5, 2021, 12:39 a.m.)


Review request for atlas, Radhika Kundam and Sarath Subramanian.


Changes
-------

Updates include:
- Modified approach. Now leveraging serializers for adding the spool attribute.
- Utilizing msgCreated within correlations.


Bugs: ATLAS-4152
    https://issues.apache.org/jira/browse/ATLAS-4152


Repository: atlas


Description
-------

**Background**
As part of ATLAS-4204, HS2 notifications send entity-lineage only (provided the property is enabled).

When spooling is enabled, the order of messages can potentially change. The notification messages coming from HS2 and HMS may not be in the same order as when they arrived with direct notification.

Problem:
Consider the sequence of arriving messages for Entity 1 (C = create, U = update, D = delete, L?x = Lineage of type 'x'):

No problem: C1, U1, L1x, L1y, D1
Problem: C1, U1, D1, L1x, L1y

This implementation attempts to handle the problem mentioned above.

If the above case is not handled, it will end up creating shell entities, since deleted entities are not looked up as part of entity creation.

**Approach**
Used a bounded stream approach where an incoming stream of messages is bounded with an indicator that it originates from the spool. This helps make localized decisions on the incoming stream of messages.

High-level approach:
- Messages are tagged with a timestamp when they are written to the spool.
- Deleted entities are maintained in a cache.
- Lineage-only messages are checked to see whether they refer to a deleted entity.
- If they refer to a deleted entity, they are stitched to the one present in the cache only if it falls within the threshold.

New: _EntityCorrelationManager_: Uses message timestamp and cached entity qualifiedName-GUID map.
Modified: _NotificationHookConsumer_: Uses the new class.
New: _HiveDdlLineagePreprocessor_: Uses entity-correlation to link to deleted entities.

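For reviewers who want a concrete picture of the high-level approach, here is a minimal sketch of the deleted-entity lookup. It is not the code in this patch: the class name `DeletedEntityCorrelationSketch`, its methods, and the exact threshold rule (absolute gap between the delete time and the message-created time) are all assumptions made for illustration.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/**
 * Hypothetical sketch of correlating lineage-only messages with deleted
 * entities. Names and the threshold rule are illustrative assumptions,
 * not the EntityCorrelationManager implementation from the patch.
 */
class DeletedEntityCorrelationSketch {
    /** Cached record of a deleted entity: its GUID and when it was deleted. */
    private static final class Deleted {
        final String guid;
        final long deletedAtMs;

        Deleted(String guid, long deletedAtMs) {
            this.guid = guid;
            this.deletedAtMs = deletedAtMs;
        }
    }

    // qualifiedName -> deleted entity record
    private final Map<String, Deleted> deletedByQualifiedName = new ConcurrentHashMap<>();
    private final long thresholdMs;

    DeletedEntityCorrelationSketch(long thresholdMs) {
        this.thresholdMs = thresholdMs;
    }

    /** Remember a deleted entity so later lineage messages can be stitched to it. */
    void recordDelete(String qualifiedName, String guid, long deletedAtMs) {
        deletedByQualifiedName.put(qualifiedName, new Deleted(guid, deletedAtMs));
    }

    /**
     * Returns the GUID of the cached deleted entity if the spooled message's
     * created time is within the threshold of the delete time, else null.
     * Assumption: "within the threshold" means an absolute time gap.
     */
    String findCorrelatedGuid(String qualifiedName, long msgCreatedMs) {
        Deleted deleted = deletedByQualifiedName.get(qualifiedName);
        if (deleted == null) {
            return null; // entity was never deleted, or is not in the cache
        }
        long gapMs = Math.abs(deleted.deletedAtMs - msgCreatedMs);
        return gapMs <= thresholdMs ? deleted.guid : null;
    }
}
```

In this sketch a lineage message for a cached qualifiedName resolves to the deleted entity's GUID instead of triggering creation of a shell entity; a null result means the message should follow the normal path.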

Diffs (updated)
-----

  intg/src/main/java/org/apache/atlas/model/notification/AtlasNotificationMessage.java 810ba97c9
  notification/src/main/java/org/apache/atlas/hook/AtlasHook.java 9162ac144
  notification/src/main/java/org/apache/atlas/kafka/AtlasKafkaConsumer.java f7d9668ec
  notification/src/main/java/org/apache/atlas/kafka/AtlasKafkaMessage.java 22bd79fdf
  notification/src/main/java/org/apache/atlas/notification/AbstractNotification.java c45a1da95
  notification/src/main/java/org/apache/atlas/notification/AtlasNotificationMessageDeserializer.java 3264e264c
  notification/src/main/java/org/apache/atlas/notification/spool/Spooler.java 2cacaaadc
  webapp/src/main/java/org/apache/atlas/notification/EntityCorrelationManager.java PRE-CREATION
  webapp/src/main/java/org/apache/atlas/notification/NotificationHookConsumer.java 84cc8d813
  webapp/src/main/java/org/apache/atlas/notification/preprocessor/EntityPreprocessor.java 89568e236
  webapp/src/main/java/org/apache/atlas/notification/preprocessor/HiveDdlLineagePreprocessor.java PRE-CREATION
  webapp/src/main/java/org/apache/atlas/notification/preprocessor/HivePreprocessor.java e69d63e3a
  webapp/src/main/java/org/apache/atlas/notification/preprocessor/PreprocessorContext.java 608b4a304


Diff: https://reviews.apache.org/r/73329/diff/4/

Changes: https://reviews.apache.org/r/73329/diff/3-4/


Testing
-------

**Functional tests**
Manual verification of scenarios.


Thanks,

Ashutosh Mestry