[ https://issues.apache.org/jira/browse/DRILL-4577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15232949#comment-15232949 ]
ASF GitHub Bot commented on DRILL-4577:
---------------------------------------

Github user jinfengni commented on a diff in the pull request:

https://github.com/apache/drill/pull/461#discussion_r59090757

--- Diff: contrib/storage-hive/core/src/main/java/org/apache/drill/exec/store/hive/schema/HiveDatabaseSchema.java ---
@@ -72,4 +80,76 @@ public String getTypeName() {
     return HiveStoragePluginConfig.NAME;
   }
 
+  @Override
+  public List<Pair<String, ? extends Table>> getTablesByNames(final List<String> tableNames) {
+    final String schemaName = getName();
+    final List<Pair<String, ? extends Table>> tableNameToTable = Lists.newArrayList();
+    List<org.apache.hadoop.hive.metastore.api.Table> tables;
+    // Retries once if the first call to fetch the metadata fails
+    synchronized(mClient) {
+      final List<String> tableNamesWithAuth = Lists.newArrayList();
+      for(String tableName : tableNames) {
+        try {
+          if(mClient.tableExists(schemaName, tableName)) {
--- End diff --

According to [1], under "Sql standard based authorization", Drill will return all the tables, even if the user does not have read access. That was the behavior before Sean's change to use bulk loading with getTableObjectsByNames().

However, under "Storage based authorization", the current expected behavior is to list only the tables that the user has access to [2].

@vkorukanti, does this current behavior make sense? Why would Drill show different behavior under these two models?

Essentially, it looks to me that the bulk loading will make Drill show the same behavior under both "Sql standard based authorization" and "Storage based authorization". That is, "show tables" will list all the tables, whether a user has access or not. But when a user queries a table he does not have read access to, an error will be raised.
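To make the two listing behaviors concrete, here is a minimal sketch. Everything in it is hypothetical and for illustration only: the class `ShowTablesSketch`, the table names, and the `READABLE` set stand in for a metastore and its permissions, and none of it is the Hive metastore client or Drill code from the pull request.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical in-memory stand-in for a metastore; not Hive's or Drill's API.
public class ShowTablesSketch {
    // Every table in the schema, and the subset the current user may read (assumed data).
    static final List<String> ALL_TABLES = Arrays.asList("orders", "customers", "salaries");
    static final Set<String> READABLE = new HashSet<>(Arrays.asList("orders", "customers"));

    // Per-table listing: each table is checked individually,
    // so tables the user cannot read are filtered out of the listing.
    static List<String> listWithPerTableCheck() {
        List<String> visible = new ArrayList<>();
        for (String table : ALL_TABLES) {
            if (READABLE.contains(table)) {
                visible.add(table);
            }
        }
        return visible;
    }

    // Bulk listing: one call returns every table, with no access check at listing time.
    static List<String> listInBulk() {
        return new ArrayList<>(ALL_TABLES);
    }

    // With bulk listing, access is enforced only when the table is actually queried.
    static String query(String table) {
        if (!READABLE.contains(table)) {
            throw new SecurityException("no read access to " + table);
        }
        return "rows of " + table;
    }

    public static void main(String[] args) {
        System.out.println("per-table check: " + listWithPerTableCheck());
        System.out.println("bulk listing:    " + listInBulk());
        try {
            query("salaries");
        } catch (SecurityException e) {
            System.out.println("query failed: " + e.getMessage());
        }
    }
}
```

In this sketch the bulk path shows `salaries` in the listing and rejects it only at query time, which is the behavior described above for "show tables" after the change.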
[1] https://github.com/apache/drill/blob/master/contrib/storage-hive/core/src/test/java/org/apache/drill/exec/impersonation/hive/TestSqlStdBasedAuthorization.java#L153
[2] https://github.com/apache/drill/blob/master/contrib/storage-hive/core/src/test/java/org/apache/drill/exec/impersonation/hive/TestStorageBasedHiveAuthorization.java#L244-L247


> Improve performance for query on INFORMATION_SCHEMA when HIVE is plugged in
> ---------------------------------------------------------------------------
>
>                 Key: DRILL-4577
>                 URL: https://issues.apache.org/jira/browse/DRILL-4577
>             Project: Apache Drill
>          Issue Type: Improvement
>          Components: Storage - Hive
>            Reporter: Sean Hsuan-Yi Chu
>            Assignee: Sean Hsuan-Yi Chu
>             Fix For: 1.7.0
>
>
> A query such as
> {code}
> select * from INFORMATION_SCHEMA.`TABLES`
> {code}
> is converted into calls to fetch all tables from storage plugins.
> When users have Hive, the calls to the Hive metadata storage would be:
> 1) get_table
> 2) get_partitions
> However, the information regarding partitions is not used in this type of
> query. Besides, a more efficient way to fetch tables is to use the
> get_multi_table call.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)