aplex opened a new pull request #3178: URL: https://github.com/apache/incubator-gobblin/pull/3178
We use dataset descriptors to track lineage. Previously, a descriptor included only the platform name (hive, hdfs) and the path of the dataset. As a result, we could not differentiate data copies between multiple production clusters, because their dataset descriptors were identical. This change adds an optional cluster name to address that. It will be used by the data copy audit system. The Hive and file-based copy code is updated to include cluster names.
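
To illustrate the problem, here is a minimal sketch of a descriptor with an optional cluster name. The class `LineageDescriptor`, its fields, and the sample values are hypothetical and chosen only for illustration. They are not the actual Gobblin classes or the code in this PR.

```java
import java.util.Objects;

// Hypothetical illustration only; not Gobblin's actual descriptor class.
final class LineageDescriptor {
    private final String platform;     // e.g. "hive" or "hdfs"
    private final String path;         // dataset path or table name
    private final String clusterName;  // optional; null when not provided

    // Old-style descriptor: platform and path only.
    LineageDescriptor(String platform, String path) {
        this(platform, path, null);
    }

    // Descriptor with an optional cluster name.
    LineageDescriptor(String platform, String path, String clusterName) {
        this.platform = Objects.requireNonNull(platform, "platform");
        this.path = Objects.requireNonNull(path, "path");
        this.clusterName = clusterName;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof LineageDescriptor)) {
            return false;
        }
        LineageDescriptor other = (LineageDescriptor) o;
        // The cluster name takes part in equality, so copies of the same
        // dataset on different clusters are distinct descriptors.
        return platform.equals(other.platform)
            && path.equals(other.path)
            && Objects.equals(clusterName, other.clusterName);
    }

    @Override
    public int hashCode() {
        return Objects.hash(platform, path, clusterName);
    }

    @Override
    public String toString() {
        String prefix = clusterName == null ? "" : clusterName + ":";
        return prefix + platform + ":" + path;
    }
}
```

In this sketch, two descriptors built from platform and path alone compare equal even if they refer to different clusters, while descriptors that carry different cluster names do not. Keeping the cluster name nullable means callers that never set it see the same behavior as before.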
