[
https://issues.apache.org/jira/browse/GOBBLIN-2128?focusedWorklogId=929621&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-929621
]
ASF GitHub Bot logged work on GOBBLIN-2128:
-------------------------------------------
Author: ASF GitHub Bot
Created on: 09/Aug/24 22:26
Start Date: 09/Aug/24 22:26
Worklog Time Spent: 10m
Work Description: arjun4084346 commented on code in PR #4020:
URL: https://github.com/apache/gobblin/pull/4020#discussion_r1712291646
##########
gobblin-data-management/src/main/java/org/apache/gobblin/data/management/copy/ManifestBasedDataset.java:
##########
@@ -157,20 +161,24 @@ public Iterator<FileSet<CopyEntity>>
getFileSetIterator(FileSystem targetFs, Cop
toDelete.add(targetFs.getFileStatus(fileToCopy));
}
}
- // Only set permission for newly created folders on target
- // To change permissions for existing dirs, expectation is to add the
folder to the manifest
- Set<String> parentFolders = new
HashSet<>(flattenedAncestorPermissions.keySet());
- for (String folder : parentFolders) {
- if (targetFs.exists(new Path(folder))) {
- flattenedAncestorPermissions.remove(folder);
- }
- }
- // We need both precommit step to create the directories copying to, and
a postcommit step to ensure that the execute bit needed for recursive rename is
reset
+
+ // Precreate the directories to avoid an edge case where recursive
rename can create extra directories in the target
CommitStep createDirectoryWithPermissionsCommitStep = new
CreateDirectoryWithPermissionsCommitStep(targetFs, ancestorOwnerAndPermissions,
this.properties);
- CommitStep setPermissionCommitStep = new
SetPermissionCommitStep(targetFs, flattenedAncestorPermissions,
this.properties);
copyEntities.add(new PrePublishStep(datasetURN(), Maps.newHashMap(),
createDirectoryWithPermissionsCommitStep, 1));
- copyEntities.add(new PostPublishStep(datasetURN(), Maps.newHashMap(),
setPermissionCommitStep, 1));
+ if (this.enableSetPermissionPostPublish) {
+ for (String parent : ancestorOwnerAndPermissions.keySet()) {
+ Path currentPath = new Path(parent);
+ for (OwnerAndPermission ownerAndPermission :
ancestorOwnerAndPermissions.get(parent)) {
+ if
(!flattenedAncestorPermissions.containsKey(currentPath.toString()) &&
!targetFs.exists(currentPath)) {
+ flattenedAncestorPermissions.put(currentPath.toString(),
ownerAndPermission);
Review Comment:
On left side, the `value` we are putting is output of
`resolveReplicatedAncestorOwnerAndPermissionsRecursively` but on right it is
the output of `replicateAncestorsOwnerAndPermission`. what is the rational
behind this change?
Issue Time Tracking
-------------------
Worklog Id: (was: 929621)
Time Spent: 0.5h (was: 20m)
> Set permission in Manifest distcp does not set all paths correctly
> ------------------------------------------------------------------
>
> Key: GOBBLIN-2128
> URL: https://issues.apache.org/jira/browse/GOBBLIN-2128
> Project: Apache Gobblin
> Issue Type: Bug
> Reporter: William Lo
> Priority: Major
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> There's a bug in the treemap for `SetPermissionCommitStep` where paths of the
> equal file depth can be misassigned permissions due to an error in the
> Treemap initialization:
> {code:java}
> Long.compare(o1.chars().filter(ch -> ch == '/').count(), o2.chars().filter(ch
> -> ch == '/').count()) {code}
> Treats any key with the same depth as equal, this could lead to incorrect
> behavior when performing map operations in this map.
> We want to rearrange this so that the comparator also compares the source
> string.
> Additionally, build off the map from the same file permissions as the
> ancestor tree in copyable file so that it can stay consistent permission
> reading and reduce FS calls.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)