[
https://issues.apache.org/jira/browse/GOBBLIN-2206?focusedWorklogId=968347&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-968347
]
ASF GitHub Bot logged work on GOBBLIN-2206:
-------------------------------------------
Author: ASF GitHub Bot
Created on: 02/May/25 09:35
Start Date: 02/May/25 09:35
Worklog Time Spent: 10m
Work Description: khandelwal-prateek commented on code in PR #4115:
URL: https://github.com/apache/gobblin/pull/4115#discussion_r2071172287
##########
gobblin-data-management/src/main/java/org/apache/gobblin/data/management/copy/ManifestBasedDataset.java:
##########
@@ -143,12 +146,28 @@ public Iterator<FileSet<CopyEntity>>
getFileSetIterator(FileSystem targetFs, Cop
copyableFile.setFsDatasets(srcFs, targetFs);
copyEntities.add(copyableFile);
+ // In case of directory with 000 permission, the permission is
changed to 100 due to HadoopUtils::addExecutePermissionToOwner
+ // getting called from
CopyDataPublisher::preserveFileAttrInPublisher ->
FileAwareInputStreamDataWriter::setPathPermission ->
+ // FileAwareInputStreamDataWriter::setOwnerExecuteBitIfDirectory
-> HadoopUtils::addExecutePermissionToOwner
+ // We need to revert this extra permission change in
setPermissionStep
+ if (srcFile.isDirectory() &&
!srcFile.getPermission().getUserAction().implies(FsAction.EXECUTE)
+ &&
!ancestorOwnerAndPermissionsForSetPermissionStep.containsKey(PathUtils.getPathWithoutSchemeAndAuthority(fileToCopy).toString())
+ && !targetFs.exists(fileToCopy)) {
+ List<OwnerAndPermission> ancestorsOwnerAndPermission = new
ArrayList<>(copyableFile.getAncestorsOwnerAndPermission());
+ OwnerAndPermission srcFileOwnerPermission = new
OwnerAndPermission(srcFile.getOwner(), srcFile.getGroup(),
srcFile.getPermission());
+ ancestorsOwnerAndPermission.add(0, srcFileOwnerPermission);
+
ancestorOwnerAndPermissionsForSetPermissionStep.put(fileToCopy.toString(),
ancestorsOwnerAndPermission);
+ }
+
// Always grab the parent since the above permission setting
should be setting the permission for a folder itself
// {@link CopyDataPublisher#preserveFileAttrInPublisher} will be
setting the permission for the empty child dir
Path fromPath = fileToCopy.getParent();
// Avoid duplicate calculation for the same ancestor
if (fromPath != null &&
!ancestorOwnerAndPermissions.containsKey(PathUtils.getPathWithoutSchemeAndAuthority(fromPath).toString())
&& !targetFs.exists(fromPath)) {
ancestorOwnerAndPermissions.put(fromPath.toString(),
copyableFile.getAncestorsOwnerAndPermission());
+ if
(!ancestorOwnerAndPermissionsForSetPermissionStep.containsKey(PathUtils.getPathWithoutSchemeAndAuthority(fromPath).toString()))
{
+
ancestorOwnerAndPermissionsForSetPermissionStep.put(fromPath.toString(),
copyableFile.getAncestorsOwnerAndPermission());
Review Comment:
for `containsKey` we are checking
`PathUtils.getPathWithoutSchemeAndAuthority(fromPath).toString()`, but in `put`
`fromPath.toString()` is used directly. There can be mismatch due to this, it
would be better to use
`PathUtils.getPathWithoutSchemeAndAuthority(fromPath).toString()` in both
places
Issue Time Tracking
-------------------
Worklog Id: (was: 968347)
Time Spent: 0.5h (was: 20m)
> Fix extra execute bit getting set in ManifestDistcp
> ---------------------------------------------------
>
> Key: GOBBLIN-2206
> URL: https://issues.apache.org/jira/browse/GOBBLIN-2206
> Project: Apache Gobblin
> Issue Type: Improvement
> Reporter: Vivek Rai
> Priority: Major
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> In Manifest Distcp extra execute bit is set to owner permission even if owner
> permission doesn't have execute bit set
--
This message was sent by Atlassian Jira
(v8.20.10#820010)