saihemanth-cloudera commented on code in PR #5843:
URL: https://github.com/apache/hive/pull/5843#discussion_r2127106017
##########
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/tools/metatool/MetaToolTaskMetadataSummary.java:
##########
@@ -183,25 +185,45 @@ Pair<MetaSummarySchema, List<MetadataTableSummary>> obtainAndFilterSummary() thr
     MetaSummarySchema extraSchema = new MetaSummarySchema();
     for (Class<? extends MetaSummaryHandler> handler : nonNativeSummaries.keys()) {
       Configuration conf = getObjectStore().getConf();
+      List<Future<?>> futures = new ArrayList<>();
       try (MetaSummaryHandler summaryHandler = JavaUtils.newInstance(handler)) {
         summaryHandler.setConf(conf);
         summaryHandler.initialize(MetaStoreUtils.getDefaultCatalog(conf), formatJson, extraSchema);
         List<MetadataTableSummary> tableSummaries = nonNativeSummaries.get(handler);
         // Filter those we don't want to collect
         Set<Long> tableIds =
             getObjectStore().filterTablesForSummary(tableSummaries, recentUpdatedDays, maxNonNativeTables);
+        if (service == null) {
+          int nThreads = Math.min(MetastoreConf.getIntVar(conf, MetastoreConf.ConfVars.METADATA_SUMMARY_NONNATIVE_THREADS),
+              tableIds.size());
+          if (nThreads > 1) {
+            service = Executors.newFixedThreadPool(nThreads,
+                new ThreadFactoryBuilder().setDaemon(true).setNameFormat("MetaToolTaskMetadataSummary #%d").build());
+          }
+        }
         for (MetadataTableSummary summary : tableSummaries) {
-          if (tableIds.contains(summary.getTableId())) {
+          if (!tableIds.contains(summary.getTableId())) {
+            filteredSummary.put(summary, null);
+          } else {
             TableName tableName = new TableName(summary.getCatalogName(), summary.getDbName(), summary.getTblName());
-            summaryHandler.appendSummary(tableName, summary);
-          } else {
-            filteredSummary.put(summary, null);
-          }
-          // If there is an exception while collecting the summary, remove it
-          if (summary.isDropped()) {
-            filteredSummary.put(summary, null);
+            Runnable task = () -> {
+              summaryHandler.appendSummary(tableName, summary);
+              // If there is an exception while collecting the summary, remove it
+              if (summary.isDropped()) {
+                filteredSummary.put(summary, null);
+              }
+            };
+            if (service != null) {
+              futures.add(service.submit(task));
+            } else {
+              task.run();
+            }
           }
         }
+        // Waiting for the result before closing the MetaSummaryHandler
+        for (Future<?> future : futures) {
+          future.get();

Review Comment:
Should we consider a timeout for this wait? If the worker threads block on the
back end, the metadata summary tool can run for hours (I have seen this happen
in a customer environment).
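
To make the suggestion concrete, here is a minimal sketch of a bounded wait using the standard `Future.get(long, TimeUnit)` overload. It is not taken from the PR: the class `BoundedFutureWait`, the method `awaitAll`, and the choice of one shared deadline for all futures are assumptions made for illustration, and the timeout value would presumably come from a new configuration entry that does not exist today.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper, not part of Hive.
class BoundedFutureWait {

  /**
   * Waits for every future against one shared deadline.
   * Tasks still running at the deadline are cancelled.
   * Returns the number of tasks that timed out or failed.
   */
  public static int awaitAll(List<Future<?>> futures, long timeout, TimeUnit unit)
      throws InterruptedException {
    long deadline = System.nanoTime() + unit.toNanos(timeout);
    int incomplete = 0;
    for (Future<?> future : futures) {
      long remaining = Math.max(deadline - System.nanoTime(), 0L);
      try {
        // With a zero wait, get() still succeeds if the task has already finished.
        future.get(remaining, TimeUnit.NANOSECONDS);
      } catch (TimeoutException e) {
        // Interrupts the worker; this only helps if the task responds to interruption.
        future.cancel(true);
        incomplete++;
      } catch (ExecutionException e) {
        incomplete++;
      }
    }
    return incomplete;
  }

  public static void main(String[] args) throws Exception {
    ExecutorService service = Executors.newFixedThreadPool(2);
    List<Future<?>> futures = new ArrayList<>();
    // One task that finishes immediately and one that simulates a stuck back end.
    futures.add(service.submit(() -> { }));
    futures.add(service.submit(() -> {
      try {
        Thread.sleep(60_000);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    }));
    int incomplete = awaitAll(futures, 500, TimeUnit.MILLISECONDS);
    System.out.println("incomplete tasks: " + incomplete);
    service.shutdownNow();
  }
}
```

One caveat: `cancel(true)` only interrupts the worker thread, so a thread stuck in a call that ignores interruption keeps running in the background. Since the pool in this PR uses daemon threads, that would not stop the tool from exiting, but the handler should probably not be closed while such a task may still be using it.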
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]