Neer393 commented on code in PR #6020:
URL: https://github.com/apache/hive/pull/6020#discussion_r2265871821
##########
standalone-metastore/metastore-client/src/main/java/org/apache/hadoop/hive/metastore/utils/TableFetcher.java:
##########
@@ -117,6 +119,35 @@ public List<TableName> getTables() throws Exception {
return candidates;
}
+ public List<Table> getTables(int maxBatchSize) throws Exception {
+ List<Table> candidates = new ArrayList<>();
Review Comment:
The issue is that ```getTables()``` does more work than
```getTableNames()```: inside the for loop I iterate over all databases, get
the names of all tables in each database, and then fetch those tables
batch-wise per database. ```getTableNames()``` has no such per-database
processing.
On your second point, calling ```getTableNames()``` within ```getTables()```
will not reduce the redundancy either. As the loop below shows, I still need to
iterate over the databases, which means
```client.getDatabases(catalogName, dbPattern)``` would be called twice (once
by ```getTableNames()``` and once by me within ```getTables()```). So that
would still increase the number of msc calls by one.
```
for (Table table : new TableIterable(client, db, tablesNames, maxBatchSize)) {
  candidates.add(table);
}
```
So I think we cannot extract them to a common method. Do let me know if you
think otherwise :)
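To make the call-count argument concrete, here is a minimal, self-contained
sketch. It does not use the real metastore client: ```FakeClient```,
```getAllTables```, ```getTableObjects```, ```fetchInBatches``` and
```getTablesViaNames``` are hypothetical stand-ins I made up for illustration,
and tables are modelled as plain strings. It only counts how often the database
list is fetched in each variant.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class BatchedFetchSketch {

    /** Hypothetical in-memory stand-in for the metastore client (not Hive's API). */
    static class FakeClient {
        private final Map<String, List<String>> tablesByDb = new LinkedHashMap<>();
        int getDatabasesCalls = 0; // counts simulated msc calls for the database list

        FakeClient add(String db, String... tables) {
            tablesByDb.put(db, Arrays.asList(tables));
            return this;
        }

        List<String> getDatabases() {
            getDatabasesCalls++;
            return new ArrayList<>(tablesByDb.keySet());
        }

        List<String> getAllTables(String db) {
            return tablesByDb.get(db);
        }

        // Stands in for fetching full table objects for one batch of names.
        List<String> getTableObjects(String db, List<String> names) {
            List<String> out = new ArrayList<>();
            for (String name : names) {
                out.add(db + "." + name);
            }
            return out;
        }
    }

    // Names only: a single pass over the databases, no batching.
    static List<String> getTableNames(FakeClient client) {
        List<String> names = new ArrayList<>();
        for (String db : client.getDatabases()) {
            for (String table : client.getAllTables(db)) {
                names.add(db + "." + table);
            }
        }
        return names;
    }

    // Fetches one database's tables in chunks; maxBatchSize must be positive.
    static List<String> fetchInBatches(FakeClient client, String db,
                                       List<String> names, int maxBatchSize) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < names.size(); i += maxBatchSize) {
            int end = Math.min(i + maxBatchSize, names.size());
            out.addAll(client.getTableObjects(db, names.subList(i, end)));
        }
        return out;
    }

    // Variant A: getTables does its own database loop (one getDatabases call).
    static List<String> getTables(FakeClient client, int maxBatchSize) {
        List<String> candidates = new ArrayList<>();
        for (String db : client.getDatabases()) {
            candidates.addAll(
                fetchInBatches(client, db, client.getAllTables(db), maxBatchSize));
        }
        return candidates;
    }

    // Variant B: reuse getTableNames, then loop over databases again for batching.
    static List<String> getTablesViaNames(FakeClient client, int maxBatchSize) {
        List<String> qualified = getTableNames(client);   // getDatabases call #1
        List<String> candidates = new ArrayList<>();
        for (String db : client.getDatabases()) {         // getDatabases call #2
            List<String> names = new ArrayList<>();
            for (String q : qualified) {
                if (q.startsWith(db + ".")) {
                    names.add(q.substring(db.length() + 1));
                }
            }
            candidates.addAll(fetchInBatches(client, db, names, maxBatchSize));
        }
        return candidates;
    }
}
```

In this toy model both variants return the same tables, but variant B fetches
the database list twice, which is the extra call I mentioned above.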
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]