jackhalfalltrades opened a new pull request, #567: URL: https://github.com/apache/atlas/pull/567
## What changes were proposed in this pull request?

This patch addresses a performance degradation observed in Atlas async table replication by reducing overhead in the async import path. The following optimizations were implemented:

- Introduced caching for AtlasAsyncImportRequest to store intermediate state and avoid repeated serialization/deserialization and frequent updates during entity processing.
- Reduced the number of graph transactions/commits by more than 50% by consolidating import and bookkeeping operations, significantly lowering transaction overhead.
- Updated the processedEntities result structure from List to Set to avoid repeated reconstruction and improve lookup efficiency while processing entities.

These changes reduce per-entity processing overhead and significantly improve async import throughput without introducing any functional or behavioral changes.

## How was this patch tested?

Manual testing and unit tests.
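
For reviewers who want a concrete picture of the optimizations described in this pull request, here is a minimal sketch of the general pattern, not the code in this patch. The class `ImportProgressTracker`, its `flushInterval` parameter, and the `persister` callback are all hypothetical names invented for illustration. The sketch keeps processed entity GUIDs in an in-memory `Set` (constant-time duplicate checks) and persists the cached state once every few entities instead of once per entity.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.function.Consumer;

/**
 * Hypothetical illustration only; this is NOT the Atlas implementation.
 * Caches import progress in memory and persists it in batches.
 */
final class ImportProgressTracker {
    // A Set gives O(1) membership checks and rejects duplicates,
    // unlike a List, which needs a linear scan for contains().
    private final Set<String> processedEntities = new LinkedHashSet<>();
    private final int flushInterval;                // assumed tuning knob
    private final Consumer<Set<String>> persister;  // stands in for a graph commit
    private int pendingSinceFlush = 0;
    private int flushCount = 0;

    ImportProgressTracker(int flushInterval, Consumer<Set<String>> persister) {
        if (flushInterval < 1) {
            throw new IllegalArgumentException("flushInterval must be >= 1");
        }
        this.flushInterval = flushInterval;
        this.persister = persister;
    }

    /** Records a processed entity; returns false if it was already recorded. */
    boolean markProcessed(String guid) {
        if (!processedEntities.add(guid)) {
            return false; // duplicate: no state change, nothing to persist
        }
        pendingSinceFlush++;
        if (pendingSinceFlush >= flushInterval) {
            flush();
        }
        return true;
    }

    /** Persists the cached state if anything changed since the last flush. */
    void flush() {
        if (pendingSinceFlush == 0) {
            return;
        }
        persister.accept(Collections.unmodifiableSet(processedEntities));
        pendingSinceFlush = 0;
        flushCount++;
    }

    int flushCount() {
        return flushCount;
    }

    int processedCount() {
        return processedEntities.size();
    }
}
```

With a flush interval of 3, marking four distinct entities (plus one duplicate) and then calling `flush()` once at the end results in two persistence calls rather than four. The actual interval and commit boundaries used by the patch are not stated in this description.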
