kasakrisz commented on code in PR #3552:
URL: https://github.com/apache/hive/pull/3552#discussion_r967046476
##########
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/views/HiveMaterializedViewUtils.java:
##########
@@ -160,11 +181,79 @@ public static Boolean isOutdatedMaterializedView(
return false;
}
+ private static Boolean isOutdatedMaterializedView(
+ MaterializationSnapshot snapshot, Hive db,
+ Set<TableName> tablesUsed, Table materializedViewTable) throws HiveException {
+ List<String> tablesUsedNames = tablesUsed.stream()
+ .map(tableName -> TableName.getDbTable(tableName.getDb(), tableName.getTable()))
+ .collect(Collectors.toList());
+
+ Map<String, String> snapshotMap = snapshot.getTableSnapshots();
+ if (snapshotMap == null || snapshotMap.isEmpty()) {
+ LOG.debug("Materialized view " + materializedViewTable.getFullyQualifiedName() +
+ " ignored for rewriting as we could not obtain current snapshot ids");
+ return null;
+ }
+
+ Set<String> storedTablesUsed = materializedViewTable.getMVMetadata().getSourceTableFullNames();
+ for (String fullyQualifiedTableName : tablesUsedNames) {
+ // Note. If the materialized view does not contain a table that is contained in the query,
+ // we do not need to check whether that specific table is outdated or not. If a rewriting
+ // is produced in those cases, it is because that additional table is joined with the
+ // existing tables with an append-columns only join, i.e., PK-FK + not null.
+ if (!storedTablesUsed.contains(fullyQualifiedTableName)) {
+ continue;
+ }
+
+ Table table = db.getTable(fullyQualifiedTableName);
+ if (table.getStorageHandler() == null) {
+ LOG.debug("Materialized view {} ignored for rewriting as we could not storage handler of table {}",
+ materializedViewTable.getFullyQualifiedName(), fullyQualifiedTableName);
+ return null;
+ }
+ String currentTableSnapshot = table.getStorageHandler().getCurrentSnapshotId(table);
+ if (isBlank(currentTableSnapshot)) {
Review Comment:
Refactored this API and its usage:
* Renamed `getCurrentSnapshotId` to `getCurrentSnapshotContext`; it now returns a
`SnapshotContext` object that wraps the `long snapshotId` instead of a `String`.
* The default implementation of `getCurrentSnapshotContext` returns `null`. The
Iceberg implementation can also return `null` when the table is empty and
therefore has no current snapshot.
* Introduced the `boolean areSnapshotsSupported` API method to distinguish
an empty table from a storage handler that does not support snapshots.
* Adjusted the client-side usage that checks whether the MV is up to date.
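
For readers following along, here is a rough sketch of how the refactored contract could fit together. It is not the code in this PR: `SnapshotSketch`, the simplified `StorageHandler` interface (which takes a table name instead of a `Table`), and the `isOutdated` helper are hypothetical stand-ins. Treating an empty table as up to date only when the MV also recorded no snapshot is an assumption made for illustration.

```java
// Hypothetical, simplified sketch; none of these types are the real Hive classes.
class SnapshotSketch {

  /** Stand-in for the SnapshotContext wrapper around a long snapshot id. */
  public static final class SnapshotContext {
    private final long snapshotId;

    public SnapshotContext(long snapshotId) {
      this.snapshotId = snapshotId;
    }

    public long getSnapshotId() {
      return snapshotId;
    }

    @Override
    public boolean equals(Object o) {
      return o instanceof SnapshotContext && ((SnapshotContext) o).snapshotId == snapshotId;
    }

    @Override
    public int hashCode() {
      return Long.hashCode(snapshotId);
    }
  }

  /** Stand-in for the storage handler API; the signatures are simplified assumptions. */
  public interface StorageHandler {
    // By default a handler does not support snapshots.
    default boolean areSnapshotsSupported() {
      return false;
    }

    // By default there is no snapshot; a snapshot-aware handler may still
    // return null when the table is empty.
    default SnapshotContext getCurrentSnapshotContext(String tableName) {
      return null;
    }
  }

  /**
   * Returns TRUE if the stored snapshot is stale, FALSE if it is current,
   * and null if the handler cannot answer (snapshots not supported).
   */
  public static Boolean isOutdated(StorageHandler handler, String tableName, SnapshotContext stored) {
    if (handler == null || !handler.areSnapshotsSupported()) {
      return null; // cannot decide, so the MV would be ignored for rewriting
    }
    SnapshotContext current = handler.getCurrentSnapshotContext(tableName);
    if (current == null) {
      // Assumption: an empty table is current only if the MV also saw it empty.
      return stored != null;
    }
    return !current.equals(stored);
  }
}
```

The point of the separate `areSnapshotsSupported` check is that a `null` snapshot context alone is ambiguous: it can mean "empty table" or "no snapshot support", and only the former lets the caller reach a definite answer.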
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]