[
https://issues.apache.org/jira/browse/GOBBLIN-1339?focusedWorklogId=525316&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-525316
]
ASF GitHub Bot logged work on GOBBLIN-1339:
-------------------------------------------
Author: ASF GitHub Bot
Created on: 16/Dec/20 22:49
Start Date: 16/Dec/20 22:49
Worklog Time Spent: 10m
Work Description: aplex opened a new pull request #3178:
URL: https://github.com/apache/incubator-gobblin/pull/3178
We use dataset descriptors to track lineage. Previously, they
included only the platform name (hive, hdfs) and the path of the
dataset. As a result, we could not differentiate data copies
between multiple production clusters, because their dataset
descriptors were identical. We add an optional cluster name to
address that.
This change will be used by the data copy audit system.
The Hive and file-based copy code is updated to include cluster names.
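As a rough illustration of the idea (not the code in this pull request), the sketch below shows how an optional cluster name can make two otherwise identical descriptors distinguishable. The class name `ClusterAwareDescriptor`, its fields, and the sample cluster names are hypothetical and are not taken from Gobblin's actual API.

```java
import java.util.Objects;

// Hypothetical illustration only; this is NOT Gobblin's DatasetDescriptor class.
final class ClusterAwareDescriptor {
    private final String platform;     // e.g. "hive" or "hdfs"
    private final String clusterName;  // optional; null when the cluster is unknown
    private final String path;         // dataset path or name

    // Old-style descriptor: platform and path only.
    ClusterAwareDescriptor(String platform, String path) {
        this(platform, null, path);
    }

    // New-style descriptor: platform, optional cluster name, and path.
    ClusterAwareDescriptor(String platform, String clusterName, String path) {
        this.platform = Objects.requireNonNull(platform, "platform");
        this.clusterName = clusterName;
        this.path = Objects.requireNonNull(path, "path");
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof ClusterAwareDescriptor)) {
            return false;
        }
        ClusterAwareDescriptor that = (ClusterAwareDescriptor) other;
        // The cluster name takes part in equality, so copies of the same
        // path on different clusters are no longer treated as one dataset.
        return platform.equals(that.platform)
            && Objects.equals(clusterName, that.clusterName)
            && path.equals(that.path);
    }

    @Override
    public int hashCode() {
        return Objects.hash(platform, clusterName, path);
    }

    @Override
    public String toString() {
        // Omit the cluster segment when no cluster name was supplied.
        return clusterName == null
            ? platform + ":" + path
            : platform + ":" + clusterName + ":" + path;
    }
}
```

With this sketch, `new ClusterAwareDescriptor("hdfs", "cluster-a", "/data/events")` and `new ClusterAwareDescriptor("hdfs", "cluster-b", "/data/events")` compare as different, while two descriptors built without a cluster name behave as before.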
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
Issue Time Tracking
-------------------
Worklog Id: (was: 525316)
Remaining Estimate: 0h
Time Spent: 10m
> Add cluster name to dataset descriptor
> --------------------------------------
>
> Key: GOBBLIN-1339
> URL: https://issues.apache.org/jira/browse/GOBBLIN-1339
> Project: Apache Gobblin
> Issue Type: Improvement
> Components: gobblin-core
> Reporter: Alex Prokofiev
> Assignee: Abhishek Tiwari
> Priority: Major
> Time Spent: 10m
> Remaining Estimate: 0h
>
> This is going to be used for more detailed lineage tracking.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)