[ https://issues.apache.org/jira/browse/DRILL-4990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15665682#comment-15665682 ]
ASF GitHub Bot commented on DRILL-4990:
---------------------------------------

GitHub user ppadma opened a pull request:

    https://github.com/apache/drill/pull/652

    DRILL-4990:Use new HDFS API access instead of listStatus to check if …

    …users have permissions to access workspace.

    Manually tested the fix with impersonation enabled.
    All unit and regression tests pass.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/ppadma/drill DRILL-4990

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/drill/pull/652.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #652

----

----

> Use new HDFS API access instead of listStatus to check if users have
> permissions to access workspace.
> -----------------------------------------------------------------------------------------------------
>
>                 Key: DRILL-4990
>                 URL: https://issues.apache.org/jira/browse/DRILL-4990
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Query Planning & Optimization
>    Affects Versions: 1.8.0
>            Reporter: Padma Penumarthy
>            Assignee: Padma Penumarthy
>             Fix For: 1.9.0
>
>
> For every query, we build the schema tree
> (runSQL->getPlan->getNewDefaultSchema->getRootSchema). All workspaces in all
> storage plugins are checked and are added to the schema tree if they are
> accessible by the user who initiated the query. For the file system plugin,
> the listStatus API is used to check whether the workspace is accessible by
> the user (WorkspaceSchemaFactory.accessible). The idea seems to be that if
> the user does not have access to the file(s) in the workspace, listStatus
> will throw an exception and we return false. But listStatus (which lists all
> the entries of a directory) is an expensive operation when the directory
> contains a large number of files.
> A new API called access (HDFS-6570) was added in Hadoop 2.6; it provides the
> ability to check whether the user has permissions on a file/directory. Use
> this new API instead of listStatus.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
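The difference between the two checks described in the issue can be sketched with a local-filesystem analogue. This is not the Drill patch and it does not use the Hadoop API: it assumes java.nio.file as a stand-in, where enumerating a directory plays the role of listStatus and a single permission query plays the role of the access call (FileSystem.access(Path, FsAction) in Hadoop 2.6+). The class and method names below are hypothetical.

```java
import java.io.IOException;
import java.nio.file.DirectoryIteratorException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical illustration only; a local-filesystem stand-in for the HDFS calls.
final class WorkspaceAccessCheck {

    // Analogue of the listStatus-based check: enumerate the directory and
    // treat any failure as "not accessible". Work grows with the entry count.
    static boolean accessibleByListing(Path workspace) {
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(workspace)) {
            for (Path ignored : entries) {
                // Visit every entry, as a full listing would.
            }
            return true;
        } catch (IOException | DirectoryIteratorException e) {
            return false;
        }
    }

    // Analogue of the access-based check: one permission query on the
    // directory itself, independent of how many entries it contains.
    static boolean accessibleByPermissionCheck(Path workspace) {
        return Files.isReadable(workspace);
    }

    public static void main(String[] args) {
        Path workspace = Paths.get(System.getProperty("java.io.tmpdir"));
        System.out.println("listing check:    " + accessibleByListing(workspace));
        System.out.println("permission check: " + accessibleByPermissionCheck(workspace));
    }
}
```

Both methods answer the same yes/no question, but only the first has to touch every directory entry, which is the cost the issue attributes to listStatus on large workspaces.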