[ https://issues.apache.org/jira/browse/HADOOP-8731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13456057#comment-13456057 ]
Vinod Kumar Vavilapalli commented on HADOOP-8731:
-------------------------------------------------

Apologies for repeating the questions; I overlooked your answers.

There are two cases:
- In the case of a real cluster with HDFS, the definition of a public dist-cache file is one which is accessible to all users, and HDFS also has POSIX-style permissions. The isPublic() method is eventually used by the JobClient to figure out which of the user-needed artifacts are public and which are not. So in the distributed-cluster case with DFS, this definition of a public cache file doesn't need to change, irrespective of whether Windows or Linux is underneath.
- If you are talking about a distributed MR cluster working on a local filesystem, then yes, your changes will be needed, but that mode is not a supported setup anyway and will most likely need many more changes besides yours.

Regarding the permissions-related changes:
- I believe the TT absolutely needs to set ugo+rx on dirs containing expanded archives. This is needed because some of the artifacts retain permissions from the original bits that a user uploads. So let's not move/change that code out of the archives code block.
- And for files, can you tell me why the 2nd line in the code fragment shown below doesn't already do it correctly on Windows? It may in fact be because of some other bug, so I'm asking: is it not enough to set the correct permissions on the file itself in the case of Windows?

{code}
...
sourceFs.copyToLocalFile(sourcePath, workFile);
localFs.setPermission(workFile, permission);
if (isArchive) {
...
{code}

> Public distributed cache support for Windows
> --------------------------------------------
>
> Key: HADOOP-8731
> URL: https://issues.apache.org/jira/browse/HADOOP-8731
> Project: Hadoop Common
> Issue Type: Bug
> Components: filecache
> Reporter: Ivan Mitic
> Assignee: Ivan Mitic
> Attachments: HADOOP-8731-PublicCache.patch
>
>
> A distributed cache file is considered public (sharable between MR jobs) if OTHER has read permission on the file and +x permission all the way up the folder hierarchy. By default, Windows permissions are mapped to "700" all the way up to the drive letter, and it is unreasonable to ask users to change the permission on the whole drive to make the file public. IOW, it is hardly possible to have a public distributed cache on Windows.
> To enable the scenario and make it more "Windows friendly", the criteria for when a file is considered public should be relaxed. One proposal is to check whether the user has given the EVERYONE group permission on the file only (and discard the +x check on parent folders).
> Security considerations for the proposal: default permissions on Unix platforms are usually "775" or "755", meaning that OTHER users can read files and list folders by default. What this also means is that Hadoop users have to explicitly make files private in order for them to be private in the cluster (please correct me if this is not the case in real life!). On Windows, default permissions are "700", meaning that by default all files are private. In the new model, if users want to make them public, they have to explicitly add EVERYONE group permissions on the file.
> TestTrackerDistributedCacheManager fails because of this issue.
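To make the two definitions being compared concrete, here is a minimal sketch of the current public-cache check (OTHER read on the file plus +x on every ancestor directory) versus the proposed Windows-friendly relaxation (OTHER/EVERYONE read on the file only). The method names isPublicOnDfs and isPublicRelaxed are hypothetical and this is not the actual TrackerDistributedCacheManager code; only the FileSystem/FsPermission calls are real Hadoop APIs.

{code}
import java.io.IOException;

import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.permission.FsAction;
import org.apache.hadoop.fs.permission.FsPermission;

public class PublicCacheCheckSketch {

  /** Current definition: OTHER can read the file and traverse (+x) every ancestor dir. */
  static boolean isPublicOnDfs(FileSystem fs, Path file) throws IOException {
    FsPermission perm = fs.getFileStatus(file).getPermission();
    if (!perm.getOtherAction().implies(FsAction.READ)) {
      return false;
    }
    // Walk up the folder hierarchy and require +x for OTHER on every ancestor.
    for (Path dir = file.getParent(); dir != null; dir = dir.getParent()) {
      FsPermission dirPerm = fs.getFileStatus(dir).getPermission();
      if (!dirPerm.getOtherAction().implies(FsAction.EXECUTE)) {
        return false;
      }
    }
    return true;
  }

  /** Proposed relaxation: only require OTHER (EVERYONE on Windows) read on the file itself. */
  static boolean isPublicRelaxed(FileSystem fs, Path file) throws IOException {
    return fs.getFileStatus(file).getPermission()
        .getOtherAction().implies(FsAction.READ);
  }
}
{code}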
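And to illustrate the archive-versus-plain-file distinction in the comment above: a minimal sketch, assuming an archive has just been unpacked into a local directory destDirStr, of why the archive branch recursively re-opens permissions on the whole expanded tree while a plain file only needs a single setPermission on itself. This is not the literal TaskTracker localization code; FileUtil.chmod and FileSystem.setPermission are the real Hadoop APIs, everything else here is illustrative.

{code}
import java.io.IOException;

import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.FileUtil;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.permission.FsPermission;

public class CachePermissionFixupSketch {

  /** Archive entries keep whatever mode the uploader packed, so re-open the unpacked tree. */
  static void fixupExpandedArchive(String destDirStr) throws IOException {
    // Recursively grant read+execute to user, group and other on the expanded directory.
    FileUtil.chmod(destDirStr, "ugo+rx", true);
  }

  /** Plain files: setting the desired permission on the localized file itself is sufficient. */
  static void fixupPlainFile(FileSystem localFs, Path workFile, FsPermission permission)
      throws IOException {
    localFs.setPermission(workFile, permission);
  }
}
{code}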