[ 
https://issues.apache.org/jira/browse/HIVE-26887?focusedWorklogId=839996&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839996
 ]

ASF GitHub Bot logged work on HIVE-26887:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 18/Jan/23 14:54
            Start Date: 18/Jan/23 14:54
    Worklog Time Spent: 10m 
      Work Description: skysiders commented on PR #3894:
URL: https://github.com/apache/hive/pull/3894#issuecomment-1387202288

   Hi @zabetak, thanks for your review.
   You mentioned that this is "kind of a breaking change", but I don't agree. When setting file permissions we can usually rely on the umask of the underlying file system, as with the commonly used fs.create call, but in my fix all file permissions are set explicitly. Such an explicit permission exists because the developer intended the file to have exactly that mode. If a file does not need explicit permissions, the underlying umask can indeed constrain it; but once a specific permission is intended, the umask may make the resulting permissions too strict and leave the file unusable. To give a deliberately extreme example: if the umask of the underlying file system is 777, the effective permissions become 000 and the resulting files have no permissions at all. So for files whose permissions are set explicitly, I think we should make sure those permissions are actually applied.
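   To make the umask effect concrete, here is a minimal sketch (my own illustration, not code from this PR; the local file system and the demo path are assumptions) of how the requested mode gets narrowed and how an explicit setPermission grants exactly the requested bits:

   ```java
   import org.apache.hadoop.conf.Configuration;
   import org.apache.hadoop.fs.FileSystem;
   import org.apache.hadoop.fs.Path;
   import org.apache.hadoop.fs.permission.FsPermission;

   public class UmaskDemo {
     public static void main(String[] args) throws Exception {
       Configuration conf = new Configuration();
       FileSystem fs = FileSystem.getLocal(conf);      // assumption: local FS for the demo
       Path dir = new Path("/tmp/umask-demo");          // placeholder path
       FsPermission requested = new FsPermission("700");

       // mkdirs(Path, FsPermission) masks the requested mode with the
       // configured umask, so the directory can end up stricter than asked for.
       FsPermission umask = FsPermission.getUMask(conf);
       System.out.println("mode after umask = " + requested.applyUMask(umask));

       // An explicit setPermission afterwards applies exactly the requested bits.
       fs.mkdirs(dir, requested);
       fs.setPermission(dir, requested);
     }
   }
   ```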
   
   Regarding the second point about the "programming pattern" you mentioned: it is indeed possible to use Hadoop's static FileSystem.create(fs, path, perm) here, and I now think that pattern should be adopted, because it is safer and more reliable than fs.create(path, perm). This kind of fix is mainly aimed at API misuse. I raised the same problem in [HBASE-26994](https://github.com/apache/hbase/pull/4391): the developer originally intended to set specific permissions, but mistakenly assumed that fs.create(path, perm) would apply the permission perm to the path. In fact it does not. In the discussion with the HBase developer I pointed out the mistake and he agreed with the fix.
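   For directories the analogous helper is the static FileSystem.mkdirs(fs, dir, perm), which creates the directory and then calls setPermission, so the final mode is not narrowed by the umask. A small sketch of that pattern (the path here is a placeholder, not the exact code in this PR):

   ```java
   import org.apache.hadoop.conf.Configuration;
   import org.apache.hadoop.fs.FileSystem;
   import org.apache.hadoop.fs.Path;
   import org.apache.hadoop.fs.permission.FsPermission;

   public class MkdirsPattern {
     public static void main(String[] args) throws Exception {
       Configuration conf = new Configuration();
       Path cacheDir = new Path("/tmp/hive-results-cache");   // placeholder path
       FileSystem fs = cacheDir.getFileSystem(conf);
       FsPermission perm = new FsPermission("700");

       // Static helper: mkdirs followed by an explicit setPermission,
       // so the directory ends up with exactly the requested mode.
       FileSystem.mkdirs(fs, cacheDir, perm);
     }
   }
   ```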
   
   Finally, I believe this fix is necessary. I searched the Hive code base and found these four instances of the same API misuse, which is why I am raising the problem here.




Issue Time Tracking
-------------------

    Worklog Id:     (was: 839996)
    Time Spent: 1h 50m  (was: 1h 40m)

> Make sure dirPath has the correct permissions
> ---------------------------------------------
>
>                 Key: HIVE-26887
>                 URL: https://issues.apache.org/jira/browse/HIVE-26887
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Zhang Dongsheng
>            Priority: Major
>              Labels: backward-incompatible, pull-request-available
>          Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> In the constructor of class QueryResultsCache, there is the following code segment:
> {code:java}
>   private QueryResultsCache(HiveConf configuration) throws IOException {
>     ......
>     FileSystem fs = cacheDirPath.getFileSystem(conf);
>     FsPermission fsPermission = new FsPermission("700");
>     fs.mkdirs(cacheDirPath, fsPermission);
>     ......
> }
> {code}
> The constructor uses mkdirs to create cacheDirPath, passing in the path variable cacheDirPath and the permission 700, but it never verifies that this permission is actually applied to the directory.
> This question arises because Hadoop provides two mkdirs methods:
> {code:java}
> mkdirs(Path f, FsPermission permission)
> {code}
>  and 
> {code:java}
> mkdirs(FileSystem fs, Path dir, FsPermission permission)
> {code}
> The first form is used here. The permission it applies is affected by the underlying umask. Although 700 is unlikely to be narrowed by a typical umask, from a rigorous point of view we should add an explicit permission check and permission grant here.
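> A minimal sketch of such a fix (not the exact patch; it reuses the fs, cacheDirPath and fsPermission variables from the snippet above) could look like:
> {code:java}
> // Option 1: the static helper creates the directory and then calls
> // setPermission, so the umask cannot narrow the final mode.
> FileSystem.mkdirs(fs, cacheDirPath, fsPermission);
>
> // Option 2: keep the instance call but grant the permission explicitly.
> if (fs.mkdirs(cacheDirPath, fsPermission)) {
>   fs.setPermission(cacheDirPath, fsPermission);
> }
> {code}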
> I found the same issue in three other methods.
> In class Context
> {code:java}
> private Path getScratchDir(String scheme, String authority,
>       boolean mkdir, String scratchDir) {
>           ......
>           FileSystem fs = dirPath.getFileSystem(conf);
>           dirPath = new Path(fs.makeQualified(dirPath).toString());
>           FsPermission fsPermission = new FsPermission(scratchDirPermission);
>           if (!fs.mkdirs(dirPath, fsPermission)) {
>             throw new RuntimeException("Cannot make directory: "
>                 + dirPath.toString());
>           ......
>   }
> {code}
> In class SessionState
> {code:java}
>   static void createPath(HiveConf conf, Path path, String permission, boolean isLocal,
>       boolean isCleanUp) throws IOException {
>     FsPermission fsPermission = new FsPermission(permission);
>     FileSystem fs;
>     ......
>     if (!fs.mkdirs(path, fsPermission)) {
>       throw new IOException("Failed to create directory " + path + " on fs " + fs.getUri());
>     }
>     ......
>   }
> {code}
> and in class TezSessionState
> {code:java}
> private Path createTezDir(String sessionId, String suffix) throws IOException {
>     ......
>     Path tezDir = new Path(hdfsScratchDir, TEZ_DIR);
>     FileSystem fs = tezDir.getFileSystem(conf);
>     FsPermission fsPermission = new FsPermission(HiveConf.getVar(conf,
>         HiveConf.ConfVars.SCRATCHDIRPERMISSION));
>     fs.mkdirs(tezDir, fsPermission);
>     ......
>   }
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
