[
https://issues.apache.org/jira/browse/GOBBLIN-1619?focusedWorklogId=752536&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-752536
]
ASF GitHub Bot logged work on GOBBLIN-1619:
-------------------------------------------
Author: ASF GitHub Bot
Created on: 04/Apr/22 22:37
Start Date: 04/Apr/22 22:37
Worklog Time Spent: 10m
Work Description: hanghangliu commented on code in PR #3477:
URL: https://github.com/apache/gobblin/pull/3477#discussion_r842190964
##########
gobblin-utility/src/main/java/org/apache/gobblin/util/WriterUtils.java:
##########
@@ -307,10 +303,20 @@ public static void mkdirsWithRecursivePermissionWithRetry(final FileSystem fs, f
throw new IOException("Path " + path + "does not exist however it should. Giving up..."+ e);
}
}
+ }
+
+ private static void gobblinMkDirs(final FileSystem fs, final Path path, FsPermission perm) throws IOException {
+ Set<Path> parentsThatDidntExistBefore = new HashSet<>();
+ for (Path p = path.getParent(); p != null && !fs.exists(p); p = p.getParent()) {
+ parentsThatDidntExistBefore.add(p);
+ }
+
+ if (!FileSystem.mkdirs(fs, path, perm)) {
+ throw new IOException(String.format("Unable to mkdir %s with permission %s", path, perm));
+ }
- // Double check permission, since fs.mkdirs() may not guarantee to set the permission correctly
Review Comment:
Do you think it's worth keeping the if condition before setting the
permission?
Issue Time Tracking
-------------------
Worklog Id: (was: 752536)
Time Spent: 2h 20m (was: 2h 10m)
> WriterUtils.mkdirsWithRecursivePermission contains race condition and puts
> unnecessary load on filesystem
> ---------------------------------------------------------------------------------------------------------
>
> Key: GOBBLIN-1619
> URL: https://issues.apache.org/jira/browse/GOBBLIN-1619
> Project: Apache Gobblin
> Issue Type: Bug
> Reporter: Matthew Ho
> Priority: Minor
> Time Spent: 2h 20m
> Remaining Estimate: 0h
>
> The current implementation, which recursively calls fs.mkdirs, has the following
> issues:
> * *Race condition when creating parent directories, causing a FileNotFound
> exception even when the file exists on the file system*
> * *HDFS fs.mkdirs atomically creates missing parent directories. Thus, the
> recursive approach makes unnecessary calls.*
> HDFS, which the current FileSystem interface is built upon, guarantees the
> parents will be created. So all FileSystem class implementations should also
> follow this behavior.
>
> *Note the
> [FileSystem|https://hadoop.apache.org/docs/stable/api/org/apache/hadoop/fs/FileSystem.html]
> abstract class documentation says the following:*
> The behaviour of the filesystem is [specified in the Hadoop documentation.|https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/filesystem/filesystem.html] However,
> the normative specification of the behavior of this class is actually HDFS:
> {color:#de350b}if HDFS does not behave the way these Javadocs or the
> specification in the Hadoop documentations define, assume that the
> documentation is incorrect{color}
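
For illustration only, the approach described in the issue and in the diff (record the missing parents, create the whole chain with one call, then fix permissions only where needed) can be sketched as below. This is not the code from the PR: it uses java.nio.file as a stand-in for Hadoop's FileSystem so that it runs without extra dependencies, it assumes a POSIX file system, and the class name MkdirsSketch and method name mkdirsWithPermission are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for the pattern in the diff; java.nio.file replaces
// Hadoop's FileSystem here, so the behaviour is analogous, not identical.
class MkdirsSketch {

  // Creates `path` and any missing ancestors with a single call, then makes sure
  // `perm` is set on `path` and on every ancestor that was missing beforehand.
  // Returns the directories whose permission was checked, deepest first.
  static List<Path> mkdirsWithPermission(Path path, Set<PosixFilePermission> perm)
      throws IOException {
    List<Path> targets = new ArrayList<>();
    targets.add(path);
    // Record which ancestors are missing before anything is created.
    for (Path p = path.getParent(); p != null && !Files.exists(p); p = p.getParent()) {
      targets.add(p);
    }

    // One call creates the whole chain and does not fail if a concurrent
    // writer has already created some of the directories.
    Files.createDirectories(path);

    for (Path p : targets) {
      // Only write to the file system when the permission is not already correct.
      if (!Files.getPosixFilePermissions(p).equals(perm)) {
        Files.setPosixFilePermissions(p, perm);
      }
    }
    return targets;
  }

  public static void main(String[] args) throws IOException {
    Path base = Files.createTempDirectory("mkdirs-sketch");
    Path target = base.resolve("a/b/c");
    Set<PosixFilePermission> perm = PosixFilePermissions.fromString("rwxr-x---");
    List<Path> checked = mkdirsWithPermission(target, perm);
    System.out.println("Checked permissions on " + checked.size() + " directories");
  }
}
```

The permission check before the set call corresponds to the "if condition" raised in the review comment: it costs one read per directory but avoids a write when nothing needs to change. Whether that trade-off is worthwhile on HDFS is a question for the PR discussion, not something this sketch settles.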