-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70463/
-----------------------------------------------------------

Review request for atlas, Kapildeo Nayak, Madhan Neethiraj, Nikhil Bonte, Nixon Rodrigues, and Sarath Subramanian.


Bugs: ATLAS-3132
    https://issues.apache.org/jira/browse/ATLAS-3132


Repository: atlas


Description
-------

**Approach**
- Refactored the existing implementation for the new design.
- Renamed 'Java Patch Framework' to 'Data Patch Framework', since its purpose is essentially to modify the structure of existing data.
- New _DataPatchService_: Modified the order in which services are called. _DataPatchService_ is called before other services are invoked, giving it a chance to complete before new data is accepted.
- New _DataPatchRegistry_: Data access (CRUD) operations for data patches.
- New _UniqueAttributePatchHandler_: Current implementation for adding the new property to data vertices. Implemented rudimentary caching to prevent repetitive look-ups.
- New REST endpoint to query the status of patches.

**Performance**
Since data patching is a high-volume operation, it has been treated with priority.
- New _NewPropertyDataHandler_ uses the database in bulk-loading mode for rapid processing. This scales with resources. Additional properties:
    - _atlas.processing.batchSize_: Size of batch.
    - _atlas.processing.numWorkers_: Number of worker threads to be employed.
- Leverages the existing PC framework.
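
To illustrate how a batch size and a worker count might interact, here is a minimal sketch. It assumes a plain `java.util.concurrent` thread pool rather than the Atlas PC framework, and it is not the code in this patch. The names `BatchPatchSketch`, `toBatches` and `process` are hypothetical, and the per-vertex work is a placeholder counter.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicLong;

// Illustrative only: none of these names come from the patch itself.
class BatchPatchSketch {

    // Splits the vertex ids into batches of at most batchSize elements.
    static List<List<Long>> toBatches(List<Long> vertexIds, int batchSize) {
        List<List<Long>> batches = new ArrayList<>();
        for (int i = 0; i < vertexIds.size(); i += batchSize) {
            batches.add(vertexIds.subList(i, Math.min(i + batchSize, vertexIds.size())));
        }
        return batches;
    }

    // Processes all batches on a fixed pool of numWorkers threads and
    // returns the number of vertices handled.
    static long process(List<Long> vertexIds, int batchSize, int numWorkers) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(numWorkers);
        AtomicLong processed = new AtomicLong();
        try {
            List<Future<?>> pending = new ArrayList<>();
            for (List<Long> batch : toBatches(vertexIds, batchSize)) {
                pending.add(pool.submit(() -> {
                    for (Long id : batch) {
                        // Placeholder for "add the new property to this vertex".
                        processed.incrementAndGet();
                    }
                    // A real handler would commit once per batch at this point.
                }));
            }
            for (Future<?> f : pending) {
                f.get(); // wait for the batch and surface any worker failure
            }
        } finally {
            pool.shutdown();
        }
        return processed.get();
    }

    public static void main(String[] args) throws Exception {
        List<Long> ids = new ArrayList<>();
        for (long i = 0; i < 2500; i++) {
            ids.add(i);
        }
        // Values standing in for atlas.processing.batchSize and atlas.processing.numWorkers.
        System.out.println(process(ids, 1000, 4));
    }
}
```

In this sketch, a larger batch size means fewer commits, and more workers means more batches in flight at once, which is the sense in which such a scheme scales with resources.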

Processing speed:
- 300K vertices: ~5 mins
- 4.2M entities: ~45 mins (from: 2019-04-12 04:44:50 to 2019-04-12 05:29:04)


Diffs
-----

  graphdb/api/src/main/java/org/apache/atlas/repository/graphdb/AtlasGraph.java d282c9966 
  graphdb/api/src/main/java/org/apache/atlas/repository/graphdb/DataPatchGraphDBHandler.java PRE-CREATION 
  graphdb/janus/src/main/java/org/apache/atlas/repository/graphdb/janus/VertexIterator.java PRE-CREATION 
  graphdb/janus/src/main/java/org/apache/atlas/repository/graphdb/janus/patches/NewPropertyDataPatch.java PRE-CREATION 
  intg/src/main/java/org/apache/atlas/pc/WorkItemConsumer.java b7eb4d89c 
  intg/src/main/java/org/apache/atlas/pc/WorkItemManager.java 0e7d3f22d 
  notification/src/main/java/org/apache/atlas/kafka/EmbeddedKafkaServer.java 32b597fb6 
  notification/src/main/java/org/apache/atlas/kafka/KafkaNotification.java 1d0a2734b 
  repository/src/main/java/org/apache/atlas/repository/patches/AtlasJavaPatchHandler.java 9153d497b 
  repository/src/main/java/org/apache/atlas/repository/patches/DataPatchHandler.java PRE-CREATION 
  repository/src/main/java/org/apache/atlas/repository/patches/DataPatchManager.java PRE-CREATION 
  repository/src/main/java/org/apache/atlas/repository/patches/DataPatchRegistry.java PRE-CREATION 
  repository/src/main/java/org/apache/atlas/repository/patches/DataPatchService.java PRE-CREATION 
  repository/src/main/java/org/apache/atlas/repository/patches/PatchContext.java a60422b80 
  repository/src/main/java/org/apache/atlas/repository/patches/TypeNameAttributeCache.java PRE-CREATION 
  repository/src/main/java/org/apache/atlas/repository/patches/UniqueAttributePatch.java PRE-CREATION 
  repository/src/main/java/org/apache/atlas/repository/patches/UniqueAttributePatchHandler.java f2238f1b0 
  repository/src/main/java/org/apache/atlas/repository/store/bootstrap/AtlasTypeDefStoreInitializer.java 78f3faf99 
  repository/src/main/java/org/apache/atlas/repository/store/graph/v2/AtlasGraphUtilsV2.java 80141b4f1 
  repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphRetriever.java 5aa6c8f0e 
  repository/src/test/java/org/apache/atlas/patches/DataPatchRegistryTest.java PRE-CREATION 
  webapp/src/main/java/org/apache/atlas/notification/NotificationHookConsumer.java ce2d76f11 
  webapp/src/main/java/org/apache/atlas/web/resources/AdminResource.java c5ceb9d6d 
  webapp/src/test/java/org/apache/atlas/web/resources/AdminResourceTest.java 223a90a9c 


Diff: https://reviews.apache.org/r/70463/diff/1/


Testing
-------

**Unit tests**
Additional tests added.

**Volume tests**
Verification with large datasets:
- 4M entities
- 3.2M entities
- 16K entities

**Performance tests**
CPU usage, memory usage and disk IO.

**Pre-commit build**
https://builds.apache.org/view/A/view/Atlas/job/PreCommit-ATLAS-Build-Test/1031/


Thanks,

Ashutosh Mestry