-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72893/
-----------------------------------------------------------

(Updated Oct. 22, 2020, 9:05 p.m.)


Review request for atlas, Deep Singh, Madhan Neethiraj, mayank jain, Nikhil 
Bonte, Nixon Rodrigues, and Sarath Subramanian.


Changes
-------

Updates include: 
- Added property for enabling/disabling the feature.


Bugs: ATLAS-3427
    https://issues.apache.org/jira/browse/ATLAS-3427


Repository: atlas


Description
-------

(Internal review) 
**Description**
Integration: _AtlasHook_ receives an enhanced _NotificationInterface_ in the 
form of _AtlasFileSpool_. This capability is optional and is configured using 
the _atlas.hook.spool.dir_ configuration property.
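
For illustration only, a minimal sketch of how a hook might resolve the spool directory from _atlas.hook.spool.dir_. The class name `SpoolSettings`, the use of `java.util.Properties`, and the rule that an unset or blank value leaves spooling off are assumptions, not the code under review. The separate enable/disable property mentioned under Changes is not modeled, since its name is not given here.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;
import java.util.Properties;

// Hypothetical helper, not part of the patch.
final class SpoolSettings {
    static final String SPOOL_DIR_PROPERTY = "atlas.hook.spool.dir";

    private SpoolSettings() {
    }

    // Returns the configured spool directory, or empty when the property is
    // unset or blank (assumed here to mean that spooling stays off).
    static Optional<Path> spoolDir(Properties conf) {
        String value = conf.getProperty(SPOOL_DIR_PROPERTY);
        if (value == null || value.trim().isEmpty()) {
            return Optional.empty();
        }
        return Optional.of(Paths.get(value.trim()));
    }
}
```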

_AtlasFileSpool_: Enhances the default hook functionality by wrapping the 
default _NotificationInterface_ with file spooling capability. If the 
destination is unavailable, the messages received are spooled to the local 
file system.
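
The wrapping described above can be sketched roughly as follows. This is a simplified illustration, not the code in the diff: `MessageSink` is a hypothetical stand-in for _NotificationInterface_, `SpoolingSink` is a hypothetical stand-in for _AtlasFileSpool_, and writing one message per line is an assumption about the spool format.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.List;

// Hypothetical, simplified stand-in for NotificationInterface.
interface MessageSink {
    void send(List<String> messages) throws Exception;
}

// Hypothetical wrapper: try the real destination first and append the
// messages to a local spool file if the destination throws.
final class SpoolingSink implements MessageSink {
    private final MessageSink delegate;
    private final Path spoolFile;

    SpoolingSink(MessageSink delegate, Path spoolFile) {
        this.delegate = delegate;
        this.spoolFile = spoolFile;
    }

    @Override
    public void send(List<String> messages) throws IOException {
        try {
            delegate.send(messages);
        } catch (Exception destinationUnavailable) {
            // One message per line, appended, so the original order is kept.
            Files.write(spoolFile, messages, StandardCharsets.UTF_8,
                    StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        }
    }
}
```

This sketch sends directly whenever the destination is reachable. To keep message order, a real implementation would also have to keep spooling while earlier messages are still waiting on disk.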

What is a spool?
- Spool files are files containing data (messages, in this specific case).
- They are stored on disk and managed using index files. Index files have a 
special structure that describes the spool files.

Who generates spool files?
_AtlasHook_ is the base class from which all hooks derive. It is responsible 
for sending messages to a destination (mostly Kafka). If the destination is 
down, the destination raises an exception. Before this change, the message was 
logged after the retry. With this change, the message is spooled and, once the 
destination is up again, sent to the destination.

How are spool files stored?
- Spool files and index files are stored on disk in the local file system.

Structure of _AtlasFileSpool_:
- _IndexManagement_: Storage and retrieval of index files. Provides the 
_Spooler_ (see below) with a _PrintWriter_ for writing messages to disk.
- _Spooler_: 
  - Interacts with _IndexManagement_ to receive a _PrintWriter_. 
  - Serializes messages and writes them to the spool file. 
- _Publisher_: 
  - Interacts with _IndexManagement_ to receive an _IndexRecord_ that is ready 
to be published.
  - Reads messages from the spool file described by the _IndexRecord_ and 
attempts to send them to the destination.
  - If the destination is down, it waits for the destination to come back up.
- _SpoolConfiguration_: Stores configuration specific to the implementation.
- _SpoolUtils_: Utility methods for file storage.
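
As a rough illustration of the _Publisher_ behaviour listed above (read a spool file in order, wait while the destination is down), here is a minimal sketch. `Destination`, `SpoolFilePublisher`, the one-message-per-line layout, and the fixed retry delay are all assumptions made for the example. The index bookkeeping done by _IndexManagement_ is left out.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical destination: returns normally on success, throws when down.
interface Destination {
    void send(String message) throws Exception;
}

// Hypothetical publisher: drains one spool file in order and retries the
// current message until the destination accepts it.
final class SpoolFilePublisher {
    private final Destination destination;
    private final long retryDelayMillis;

    SpoolFilePublisher(Destination destination, long retryDelayMillis) {
        this.destination = destination;
        this.retryDelayMillis = retryDelayMillis;
    }

    // Publishes every line of the spool file; returns the number published.
    int publish(Path spoolFile) throws IOException, InterruptedException {
        int published = 0;
        try (BufferedReader reader =
                Files.newBufferedReader(spoolFile, StandardCharsets.UTF_8)) {
            String message;
            while ((message = reader.readLine()) != null) {
                sendWithRetry(message);
                published++;
            }
        }
        return published;
    }

    // Blocks on the same message until it is sent, so later messages can
    // never overtake earlier ones.
    private void sendWithRetry(String message) throws InterruptedException {
        while (true) {
            try {
                destination.send(message);
                return;
            } catch (InterruptedException e) {
                throw e;
            } catch (Exception destinationDown) {
                Thread.sleep(retryDelayMillis);
            }
        }
    }
}
```

Retrying the current message before moving on is what preserves the message sequence in this sketch.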


**Impacted Areas**
Hooks:
- Hive: HS2
- Hive: HMS
- Impala
- HBase
- Spark


Diffs (updated)
-----

  addons/hive-bridge/src/main/java/org/apache/atlas/hive/hook/HiveHook.java 651323490
  addons/hive-bridge/src/main/java/org/apache/atlas/hive/hook/HiveMetastoreHookImpl.java 3c0f0c106
  notification/pom.xml 8affd59a2
  notification/src/main/java/org/apache/atlas/hook/AtlasHook.java 8659126eb
  notification/src/main/java/org/apache/atlas/hook/FailedMessagesLogger.java b319e81b8
  notification/src/main/java/org/apache/atlas/kafka/NotificationProvider.java 2dd970ef7
  notification/src/main/java/org/apache/atlas/notification/AbstractNotification.java 45a66bf07
  notification/src/main/java/org/apache/atlas/notification/LogConfigUtils.java PRE-CREATION
  notification/src/main/java/org/apache/atlas/notification/NotificationException.java 2dd9c9fa0
  notification/src/main/java/org/apache/atlas/notification/NotificationInterface.java 6caf7e2d5
  notification/src/main/java/org/apache/atlas/notification/spool/Archiver.java PRE-CREATION
  notification/src/main/java/org/apache/atlas/notification/spool/AtlasFileSpool.java PRE-CREATION
  notification/src/main/java/org/apache/atlas/notification/spool/FileOperations.java PRE-CREATION
  notification/src/main/java/org/apache/atlas/notification/spool/IndexManagement.java PRE-CREATION
  notification/src/main/java/org/apache/atlas/notification/spool/Publisher.java PRE-CREATION
  notification/src/main/java/org/apache/atlas/notification/spool/SpoolConfiguration.java PRE-CREATION
  notification/src/main/java/org/apache/atlas/notification/spool/SpoolUtils.java PRE-CREATION
  notification/src/main/java/org/apache/atlas/notification/spool/Spooler.java PRE-CREATION
  notification/src/main/java/org/apache/atlas/notification/spool/models/IndexRecord.java PRE-CREATION
  notification/src/main/java/org/apache/atlas/notification/spool/models/IndexRecords.java PRE-CREATION
  notification/src/main/java/org/apache/atlas/notification/spool/utils/local/FileLockedReadWrite.java PRE-CREATION
  notification/src/main/java/org/apache/atlas/notification/spool/utils/local/FileOpAppend.java PRE-CREATION
  notification/src/main/java/org/apache/atlas/notification/spool/utils/local/FileOpCompaction.java PRE-CREATION
  notification/src/main/java/org/apache/atlas/notification/spool/utils/local/FileOpDelete.java PRE-CREATION
  notification/src/main/java/org/apache/atlas/notification/spool/utils/local/FileOpRead.java PRE-CREATION
  notification/src/main/java/org/apache/atlas/notification/spool/utils/local/FileOpUpdate.java PRE-CREATION
  notification/src/main/java/org/apache/atlas/notification/spool/utils/local/FileOperation.java PRE-CREATION
  notification/src/test/java/org/apache/atlas/notification/AbstractNotificationTest.java 94cb70d20
  notification/src/test/java/org/apache/atlas/notification/spool/AtlasFileSpoolTest.java PRE-CREATION
  notification/src/test/java/org/apache/atlas/notification/spool/BaseTest.java PRE-CREATION
  notification/src/test/java/org/apache/atlas/notification/spool/IndexManagementTest.java PRE-CREATION
  notification/src/test/resources/spool/archive/spool-1.json PRE-CREATION
  notification/src/test/resources/spool/index-test-src-1.json PRE-CREATION
  notification/src/test/resources/spool/index-test-src-1_closed.json PRE-CREATION
  pom.xml b9242016b


Diff: https://reviews.apache.org/r/72893/diff/10/

Changes: https://reviews.apache.org/r/72893/diff/9-10/


Testing
-------

**Unit testing**
Additional unit tests added.

**System Testing**
End-to-end verification for each of the hooks.

**Volume Testing**
- Spooled large data.
- Spooled compressed messages.
- Spooled split messages.
- Verified publishing. 
- Ensured that sequence of messages is maintained.

**Spark-specific**
- Multiple spark-shells publishing with Kafka down.
- Medium-sized lineage creation using a special script.

**Pre-commit**
https://ci-builds.apache.org/job/Atlas/job/PreCommit-ATLAS-Build-Test/85


Thanks,

Ashutosh Mestry
