[ https://issues.apache.org/jira/browse/HADOOP-14600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16274962#comment-16274962 ]
Chris Douglas commented on HADOOP-14600:
----------------------------------------

Thanks, [~myapachejira]. Glad to have that cleared up.

+1 on the latest patch. [~ste...@apache.org], unless you have other feedback, let's leave further refinements to followup JIRAs and commit this.

One question: this speeds up calls through {{DeprecatedRawLocalFileStatus}}. Did you look at refactoring the deprecation logic, to see if this class is still necessary? There are multiple checks for the platform and whether the native library is loaded, and not only for {{FileStatus}} operations. This is likely due to accumulated layers of ad hoc improvements and optimizations in {{RawLocalFileSystem}}. At a glance, it looks feasible to cut the number of inline checks substantially. What do you think?

> LocatedFileStatus constructor forces RawLocalFS to exec a process to get the permissions
> ----------------------------------------------------------------------------------------
>
>                 Key: HADOOP-14600
>                 URL: https://issues.apache.org/jira/browse/HADOOP-14600
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 2.7.3
>         Environment: file:// in a dir with many files
>            Reporter: Steve Loughran
>            Assignee: Ping Liu
>         Attachments: HADOOP-14600.001.patch, HADOOP-14600.002.patch, HADOOP-14600.003.patch, HADOOP-14600.004.patch, HADOOP-14600.005.patch, HADOOP-14600.006.patch, HADOOP-14600.007.patch, HADOOP-14600.008.patch, HADOOP-14600.009.patch, TestRawLocalFileSystemContract.java, command_line_test_result__linux.txt, command_line_test_result__windows.txt
>
>
> Reported in SPARK-21137. A {{FileSystem.listStatus}} call really crawls against the local FS, because the {{FileStatus.getPermission}} call forces {{DeprecatedRawLocalFileStatus}} to spawn a process to read the real UGI values.
> That is: what is a field lookup or even a no-op on every other FS is a process exec/spawn on the local FS, with all the associated costs.
> This gets expensive if you have many files.
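
To illustrate the cost difference described in the issue, here is a minimal Java sketch. It is not the code in the patch. It compares an in-process permission lookup through java.nio.file with spawning one child process per file. The class name PermissionLookupSketch, its method names, and the choice of `ls -ld` as the spawned command are assumptions made for illustration; the sketch also assumes a POSIX file system with `ls` on the PATH.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.PosixFilePermissions;

// Hypothetical illustration only; none of these names come from Hadoop.
class PermissionLookupSketch {

    /** In-process lookup: reads file attributes directly, no child process. */
    public static String viaAttributes(Path p) throws IOException {
        return PosixFilePermissions.toString(Files.getPosixFilePermissions(p));
    }

    /** Extracts "rwxr-xr-x" from an `ls -ld` line like "-rwxr-xr-x 1 user group ...". */
    public static String parseLsMode(String lsLine) {
        // Skip the leading file-type character; keep the nine permission characters.
        return lsLine.substring(1, 10);
    }

    /** Per-file process spawn: pays for process creation and output parsing on every call. */
    public static String viaProcess(Path p) throws IOException, InterruptedException {
        Process proc = new ProcessBuilder("ls", "-ld", p.toString())
                .redirectErrorStream(true)
                .start();
        try (BufferedReader r = new BufferedReader(
                new InputStreamReader(proc.getInputStream(), StandardCharsets.UTF_8))) {
            String line = r.readLine();
            if (proc.waitFor() != 0 || line == null) {
                throw new IOException("ls failed for " + p);
            }
            return parseLsMode(line);
        }
    }

    public static void main(String[] args) throws Exception {
        // Build a scratch directory with many files, as in the issue's environment.
        Path dir = Files.createTempDirectory("perm-sketch");
        int n = 200;
        Path[] files = new Path[n];
        for (int i = 0; i < n; i++) {
            files[i] = Files.createFile(dir.resolve("f" + i));
        }

        long t0 = System.nanoTime();
        for (Path f : files) {
            viaAttributes(f);
        }
        long t1 = System.nanoTime();
        for (Path f : files) {
            viaProcess(f);
        }
        long t2 = System.nanoTime();

        System.out.printf("in-process: %d ms, process per file: %d ms%n",
                (t1 - t0) / 1_000_000, (t2 - t1) / 1_000_000);

        // Clean up the scratch files and directory.
        for (Path f : files) {
            Files.delete(f);
        }
        Files.delete(dir);
    }
}
```

Running main on a POSIX system prints both timings. The exact numbers depend on the machine, but the second loop follows the one-process-per-file pattern that the issue blames for slow listings of large directories.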