Re: [PR] [SPARK-54119] Support METRIC_VIEW creation on V2 catalogs [spark]

via GitHub Tue, 28 Apr 2026 10:58:44 -0700


cloud-fan commented on code in PR #55487:
URL: https://github.com/apache/spark/pull/55487#discussion_r3156204831



##########
sql/core/src/main/scala/org/apache/spark/sql/execution/command/metricViewCommands.scala:
##########
@@ -65,6 +83,66 @@ case class CreateMetricViewCommand(
       ignoreIfExists = allowExisting)
     Seq.empty
   }
+
+  private def createMetricViewInV2Catalog(
+      sparkSession: SparkSession,
+      resolved: ResolvedIdentifier): Seq[Row] = {
+    // Metric views are persisted through the same `ViewCatalog` interface as 
plain views; the
+    // only differences are `PROP_TABLE_TYPE = METRIC_VIEW` (so 
`V1Table.toCatalogTable` maps the
+    // round-tripped row back to `CatalogTableType.METRIC_VIEW`), the 
`metric_view.*` descriptor
+    // properties produced by `MetricView.getProperties`, and the typed 
`viewDependencies` field.
+    val viewCatalog = resolved.catalog match {
+      case vc: ViewCatalog => vc
+      case other =>
+        throw QueryCompilationErrors.missingCatalogViewsAbilityError(other)
+    }
+    val ident = resolved.identifier
+    val name = ident.asTableIdentifier
+
+    val analyzed = MetricViewHelper.analyzeMetricViewText(sparkSession, name, 
originalText)
+    validateUserColumns(name, analyzed)
+
+    // `retainMetadata = true` preserves the per-column `metric_view.type` / 
`metric_view.expr`
+    // metadata that `ResolveMetricView.buildMetricViewOutput` attaches, even 
when the user
+    // supplies column names with comments (same contract as the V1 
session-catalog path, which
+    // goes through `ViewHelper.prepareTable` with `isMetricView = true`).
+    val aliasedSchema = ViewHelper
+      .aliasPlan(sparkSession, analyzed, userSpecifiedColumns, retainMetadata 
= true)
+      .schema

Review Comment:
   Compared with `V2ViewPreparation.buildViewInfo` for plain v2 views, the v2 
metric-view path is missing:
   
   - `CharVarcharUtils.getRawSchema(...)` on the aliased schema before passing 
it to the builder.
   - `SchemaUtils.checkColumnNameDuplication(...)`.
   - `SchemaUtils.checkIndeterminateCollationInSchema(query.schema)`.
   - `withOwner(CurrentUserContext.getCurrentUser)` — v1 (`prepareTable` → 
`CatalogTable`) and plain v2 (`CreateV2ViewExec`) both stamp the owner, but 
here the captured `ViewInfo` has no `PROP_OWNER`.
   
   All four are exactly what `V2ViewPreparation.buildViewInfo` already does — 
the cleanest fix is to share that code rather than re-add these one by one.



##########
sql/core/src/main/scala/org/apache/spark/sql/execution/command/metricViewCommands.scala:
##########
@@ -65,6 +83,66 @@ case class CreateMetricViewCommand(
       ignoreIfExists = allowExisting)
     Seq.empty
   }
+
+  private def createMetricViewInV2Catalog(
+      sparkSession: SparkSession,
+      resolved: ResolvedIdentifier): Seq[Row] = {
+    // Metric views are persisted through the same `ViewCatalog` interface as 
plain views; the
+    // only differences are `PROP_TABLE_TYPE = METRIC_VIEW` (so 
`V1Table.toCatalogTable` maps the
+    // round-tripped row back to `CatalogTableType.METRIC_VIEW`), the 
`metric_view.*` descriptor
+    // properties produced by `MetricView.getProperties`, and the typed 
`viewDependencies` field.
+    val viewCatalog = resolved.catalog match {
+      case vc: ViewCatalog => vc
+      case other =>
+        throw QueryCompilationErrors.missingCatalogViewsAbilityError(other)
+    }
+    val ident = resolved.identifier
+    val name = ident.asTableIdentifier
+
+    val analyzed = MetricViewHelper.analyzeMetricViewText(sparkSession, name, 
originalText)
+    validateUserColumns(name, analyzed)
+
+    // `retainMetadata = true` preserves the per-column `metric_view.type` / 
`metric_view.expr`
+    // metadata that `ResolveMetricView.buildMetricViewOutput` attaches, even 
when the user
+    // supplies column names with comments (same contract as the V1 
session-catalog path, which
+    // goes through `ViewHelper.prepareTable` with `isMetricView = true`).
+    val aliasedSchema = ViewHelper
+      .aliasPlan(sparkSession, analyzed, userSpecifiedColumns, retainMetadata 
= true)
+      .schema
+
+    // Describe this metric view's source and filter as user-visible 
properties so catalogs and
+    // tooling can inspect them without re-parsing the YAML body.
+    val metricView = MetricViewFactory.fromYAML(originalText)
+    val viewProperties = new java.util.HashMap[String, String]()
+    properties.foreach { case (k, v) => viewProperties.put(k, v) }
+    metricView.getProperties.foreach { case (k, v) => viewProperties.put(k, v) 
}
+    viewProperties.put(TableCatalog.PROP_TABLE_TYPE, 
TableSummary.METRIC_VIEW_TABLE_TYPE)
+
+    val sourceTableNames = MetricViewHelper.collectTableDependencies(analyzed)
+    val deps = if (sourceTableNames.nonEmpty) {
+      DependencyList.of(sourceTableNames.map(Dependency.table): _*)
+    } else {
+      null
+    }
+
+    val manager = sparkSession.sessionState.catalogManager
+    val builder = new ViewInfo.Builder()
+      .withSchema(aliasedSchema)
+      .withProperties(viewProperties)
+      .withQueryText(originalText)
+      .withCurrentCatalog(manager.currentCatalog.name)
+      .withCurrentNamespace(manager.currentNamespace)
+      .withSqlConfigs(
+        ViewHelper.sqlConfigsToProps(sparkSession.sessionState.conf, 
"").asJava)
+      .withSchemaMode(SchemaUnsupported.toString)
+      .withQueryColumnNames(analyzed.output.map(_.name).toArray)
+      .withViewDependencies(deps)
+    comment.foreach(builder.withComment)
+
+    viewCatalog.createView(ident, builder.build())

Review Comment:
   `allowExisting` and `replace` are silently ignored here:
   
   - `CREATE VIEW IF NOT EXISTS … WITH METRICS` on a v2 catalog will throw 
`ViewAlreadyExistsException` instead of being a no-op — regression vs. v1 
(`createMetricViewInSessionCatalog` passes `ignoreIfExists = allowExisting`).
   - `CREATE OR REPLACE VIEW … WITH METRICS` will likewise throw instead of 
replacing.
   - If a *table* sits at the ident (mixed `RelationCatalog`), the 
`ViewAlreadyExistsException` from `RelationCatalog.createView` will leak as a 
generic "already exists" instead of the dedicated 
`unsupportedCreateOrReplaceViewOnTableError` that `CreateV2ViewExec`'s catch 
block (CreateV2ViewExec.scala:153-169) decodes via `tableExists`.
   
   `CreateV2ViewExec.run` shows the full pattern: `viewExists` short-circuit on 
`IF NOT EXISTS`, then `createOrReplaceView` vs `createView`, with the catch 
block handling the table-collision case. None of the three paths is covered by 
the new test suite today — see comment there.



##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/interface.scala:
##########
@@ -1089,8 +1085,9 @@ object CatalogTableType {
   val EXTERNAL = new CatalogTableType("EXTERNAL")
   val MANAGED = new CatalogTableType("MANAGED")
   val VIEW = new CatalogTableType("VIEW")
+  val METRIC_VIEW = new CatalogTableType("METRIC_VIEW")

Review Comment:
   Adding `METRIC_VIEW` as a separate `CatalogTableType` means every existing 
`tableType == CatalogTableType.VIEW` site needs to be audited for whether it 
should also accept `METRIC_VIEW`. Two visible regressions I noticed in 
unchanged code:
   
   **(a) `CatalogTable.toJsonLinkedHashMap` (line 670 of this file).** The 
block emitting `"View Text"`, `"View Original Text"`, `"View Schema Mode"`, 
`"View Catalog and Namespace"`, and `"SQL Path"` is gated on `tableType == 
CatalogTableType.VIEW`. With this PR's change to `tableType = METRIC_VIEW`, all 
of those rows disappear from `DESCRIBE TABLE EXTENDED` output — a user-visible 
regression vs. the pre-PR `VIEW + viewWithMetrics=true` storage. Likely fix: 
change to `if (tableType == VIEW || tableType == METRIC_VIEW)`.
   
   **(b) `HiveExternalCatalog`.** Three sites only handle `VIEW`:
   - `createTable` (lines 277, 297) — empty-schema fallback for 
Hive-incompatible types.
   - `alterTable` (line 598) — VIEW-specific alter path.
   - `restoreTableMetadata` (line 838) — read-back falls into 
`restoreHiveSerdeTable` instead of restoring schema from view properties.
   
   Tests use the in-memory catalog so the Hive path isn't exercised, but a 
Hive-metastore-backed session-catalog metric view (the realistic deployment) 
won't round-trip correctly.
   
   Worth a repo-wide grep for `tableType == CatalogTableType.VIEW` / `tableType 
== VIEW` to enumerate the rest before merging.



##########
sql/core/src/main/scala/org/apache/spark/sql/execution/command/metricViewCommands.scala:
##########
@@ -65,6 +83,66 @@ case class CreateMetricViewCommand(
       ignoreIfExists = allowExisting)
     Seq.empty
   }
+
+  private def createMetricViewInV2Catalog(

Review Comment:
   This new method open-codes a v2 view-CREATE path inside a v1 
`RunnableCommand`. The plain-view v2 path is already factored: 
`V2ViewPreparation.buildViewInfo()` (CreateV2ViewExec.scala:62-110) does almost 
exactly what this method's lines 109-138 do — alias plan, schema, properties, 
currentCatalog/Namespace, sqlConfigs, schemaMode, queryColumnNames, comment — 
and `CreateV2ViewExec` adds the IF NOT EXISTS / OR REPLACE / 
cross-type-collision handling.
   
   Would you consider refactoring this into either (a) a 
`CreateV2MetricViewExec extends V2ViewPreparation` exec invoked by a 
`DataSourceV2Strategy` rule that matches `CreateMetricViewCommand` on a 
non-session catalog, or (b) extending `V2ViewPreparation` with a 
`viewDependencies` / `extraProperties` hook that `CreateV2ViewExec` ignores and 
a metric-view subclass populates? Either way the metric-view path picks up the 
same flag handling and validation as plain v2 views without copy-paste.



##########
sql/catalyst/src/main/scala/org/apache/spark/sql/metricview/util/MetricViewPlanner.scala:
##########
@@ -65,9 +64,11 @@ object MetricViewPlanner {
       MetricViewFactory.fromYAML(yaml)
     } catch {
       case e: MetricViewValidationException =>
-        throw QueryCompilationErrors.invalidLiteralForWindowDurationError()
+        throw SparkException.internalError(
+          s"Invalid metric view YAML: ${e.getMessage}", e)
       case e: MetricViewYAMLParsingException =>
-        throw QueryCompilationErrors.invalidLiteralForWindowDurationError()
+        throw SparkException.internalError(

Review Comment:
   Replacing `invalidLiteralForWindowDurationError` is good — that one was 
definitely wrong — but `SparkException.internalError(...)` is also the wrong 
category. A malformed user-supplied YAML body in `CREATE VIEW … LANGUAGE YAML 
AS $$ … $$` is user input error, not a Spark internal bug; `internalError` 
renders with framing along the lines of "please contact support / file a bug."
   
   Recommend a `QueryCompilationErrors.invalidMetricViewYAMLError(...)` (or 
similar) routed through `AnalysisException` with a real error class, so the 
message is identified as a user-correctable problem. Same applies to the 
`MetricViewValidationException` branch immediately above.



##########
sql/core/src/test/scala/org/apache/spark/sql/execution/MetricViewV2CatalogSuite.scala:
##########
@@ -0,0 +1,462 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution
+
+import java.util.concurrent.ConcurrentHashMap
+
+import scala.jdk.CollectionConverters._
+
+import org.apache.spark.sql.{AnalysisException, QueryTest}
+import org.apache.spark.sql.catalyst.analysis.{NoSuchViewException, 
ViewAlreadyExistsException}
+import org.apache.spark.sql.connector.catalog.{Identifier, 
InMemoryTableCatalog, MetadataOnlyTable, RelationCatalog, Table, TableCatalog, 
TableDependency, TableSummary, ViewInfo}
+import org.apache.spark.sql.metricview.serde.{AssetSource, Column, Constants, 
DimensionExpression, MeasureExpression, MetricView, MetricViewFactory, 
SQLSource}
+import org.apache.spark.sql.test.SharedSparkSession
+import org.apache.spark.sql.types.Metadata
+
+/**
+ * Tests that exercise 
[[org.apache.spark.sql.execution.command.CreateMetricViewCommand]] on a
+ * non-session V2 catalog. Metric views are persisted through the same 
[[ViewCatalog]] interface
+ * as plain views; the only marker that distinguishes them is `PROP_TABLE_TYPE 
= METRIC_VIEW`
+ * plus the typed `viewDependencies` field on [[ViewInfo]]. The recording 
catalog used here is a
+ * minimal [[RelationCatalog]] so the same instance can also host the source 
table referenced by
+ * the metric view's YAML.
+ */
+class MetricViewV2CatalogSuite extends QueryTest with SharedSparkSession {

Review Comment:
   Test gaps worth filling — especially the first three, which would have 
caught the silent-flag-ignore findings on the create path:
   
   - `CREATE OR REPLACE VIEW … WITH METRICS` on the v2 catalog (currently 
throws `ViewAlreadyExistsException` instead of replacing).
   - `CREATE VIEW IF NOT EXISTS … WITH METRICS` on the v2 catalog (regression 
vs. v1; currently throws instead of no-op).
   - `CREATE VIEW … WITH METRICS` against a v2 catalog where a *table* sits at 
the target ident — should produce the dedicated cross-type-collision error, not 
a generic `ViewAlreadyExistsException`.
   - A V1 source dependency: a v2 metric view referencing a session-catalog 
table (`HiveTableRelation` / `LogicalRelation`). Surfaces what 
`tableFullName()` actually contains for that case.
   - A multi-part-namespace V2 source: a metric view whose source identifier is 
`cat.ns1.ns2.tbl`. Surfaces what `mkString(".")` produces for `tableFullName()`.
   - Read-back: `SELECT * FROM <v2 metric view>` (or at least `EXPLAIN` / 
`DESCRIBE EXTENDED`) so the `loadRelation` → `MetadataOnlyTable` → 
`V1Table.toCatalogTable(ViewInfo)` round-trip is exercised end-to-end. The 
capture-only fixture pins what flows INTO `createView` but not what comes back 
OUT.



##########
sql/core/src/main/scala/org/apache/spark/sql/execution/command/metricViewCommands.scala:
##########
@@ -73,6 +151,37 @@ case class CreateMetricViewCommand(
 case class AlterMetricViewCommand(child: LogicalPlan, originalText: String)
 
 object MetricViewHelper {
+
+  /**
+   * Walks the analyzed plan to collect direct table/view dependencies.
+   * Stops recursion at relation leaf nodes and persistent View nodes so that 
only
+   * direct (not transitive) dependencies are recorded.
+   */
+  private[execution] def collectTableDependencies(plan: LogicalPlan): 
Seq[String] = {
+    val tables = scala.collection.mutable.ArrayBuffer.empty[String]
+    def traverse(p: LogicalPlan): Unit = p match {
+      case v: View if !v.isTempView =>
+        tables += v.desc.identifier.unquotedString
+      case r: DataSourceV2Relation if r.catalog.isDefined && 
r.identifier.isDefined =>
+        val cat = r.catalog.get.name()
+        val ns = r.identifier.get.namespace().mkString(".")
+        val name = r.identifier.get.name()
+        tables += s"$cat.$ns.$name"

Review Comment:
   Inconsistent name forms across the four match arms:
   
   - `View` / `HiveTableRelation` / `LogicalRelation`: use 
`TableIdentifier.unquotedString`, which is `db.tbl` when `catalog` is `None` 
and `cat.db.tbl` when set.
   - `DataSourceV2Relation`: `s"$cat.$ns.$name"` with `ns = 
namespace().mkString(".")` — N+2 parts for an N-level namespace.
   
   So the same metric view yields 2-, 3-, or 4+-part dependency names depending 
on what kind of source was referenced. This ties into the `TableDependency` 
Javadoc/contract issue — once `nameParts` is structural, all four arms can pass 
`Seq[String]` directly without hand-formatting.



##########
sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/ViewInfo.java:
##########
@@ -57,11 +58,14 @@ private ViewInfo(Builder builder) {
     this.sqlConfigs = Collections.unmodifiableMap(builder.sqlConfigs);
     this.schemaMode = builder.schemaMode;
     this.queryColumnNames = builder.queryColumnNames;
-    // Force PROP_TABLE_TYPE = VIEW so that `properties()` reflects the typed 
ViewInfo
-    // classification. Catalogs and generic viewers reading PROP_TABLE_TYPE 
from the properties
-    // bag (e.g. TableCatalog.listTableSummaries default impl, DESCRIBE) see 
"VIEW" without
-    // requiring authors to remember to call withTableType(VIEW).
-    properties().put(TableCatalog.PROP_TABLE_TYPE, 
TableSummary.VIEW_TABLE_TYPE);
+    this.viewDependencies = builder.viewDependencies;
+    // Force PROP_TABLE_TYPE = VIEW by default so that `properties()` reflects 
the typed
+    // ViewInfo classification. Metric views intentionally set PROP_TABLE_TYPE 
= METRIC_VIEW
+    // before construction and must retain that more-specific view kind.
+    String tableType = properties().get(TableCatalog.PROP_TABLE_TYPE);
+    if (!TableSummary.METRIC_VIEW_TABLE_TYPE.equals(tableType)) {
+      properties().put(TableCatalog.PROP_TABLE_TYPE, 
TableSummary.VIEW_TABLE_TYPE);
+    }

Review Comment:
   Embedding the metric-view-specific value in the generic `ViewInfo` 
constructor is awkward — it makes `ViewInfo` aware of one specific 
`PROP_TABLE_TYPE` value that lives in the metric-view code. 
`BaseBuilder.withTableType` already exists (`TableInfo.java:147`); the 
metric-view path could just call it after `withProperties(...)` and let this 
constructor remain the unconditional default:
   
   ```java
     private ViewInfo(Builder builder) {
       super(builder);
       ...
       properties().putIfAbsent(TableCatalog.PROP_TABLE_TYPE, 
TableSummary.VIEW_TABLE_TYPE);
     }
   ```
   
   and in `metricViewCommands.scala` drop the `viewProperties.put(...)` for 
`PROP_TABLE_TYPE` and add `.withTableType(TableSummary.METRIC_VIEW_TABLE_TYPE)` 
after `.withProperties(viewProperties)`. That way `ViewInfo` doesn't reference 
`METRIC_VIEW_TABLE_TYPE` at all, and adding future view kinds doesn't require 
touching this class.



##########
sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/TableDependency.java:
##########
@@ -0,0 +1,38 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.connector.catalog;
+
+import java.util.Objects;
+
+import org.apache.spark.annotation.Evolving;
+
+/**
+ * A table dependency of a SQL object.
+ * <p>
+ * The dependent table is identified by its fully-qualified three-part name
+ * in the form {@code catalog_name.schema_name.table_name}.
+ *
+ * @param tableFullName fully-qualified three-part table name

Review Comment:
   The "three-part name in the form `catalog_name.schema_name.table_name`" 
claim doesn't hold for catalogs with multi-level namespaces (e.g. Iceberg 
`cat.db1.db2.tbl` is 4 parts), and the producer in 
`MetricViewHelper.collectTableDependencies` doesn't always emit three parts 
either: V1 sources use `TableIdentifier.unquotedString` which yields 2 parts 
when the catalog component is unset, and V2 sources dot-join the namespace 
which yields N parts.
   
   Since the dot-flattened string is also ambiguous against quoted identifiers 
containing a literal `.`, the cleanest fix is to make this DTO structural — 
e.g. `record TableDependency(String[] nameParts)` (and the same for 
`FunctionDependency`). Consumers can join the parts themselves if they need a 
flat string, but they can't recover arity if the join happens at storage time.



##########
sql/catalyst/src/main/scala/org/apache/spark/sql/metricview/serde/MetricViewCanonical.scala:
##########
@@ -167,4 +167,33 @@ private[sql] case class MetricView(
     version: String,
     from: Source,
     where: Option[String] = None,
-    select: Seq[Column])
+    select: Seq[Column]) {
+
+  /**
+   * Returns a set of table properties describing this metric view's source and
+   * filter clauses. Mirrors the property keys used by the canonical metric 
view
+   * representation on other Spark platforms so consumers of the catalog see a
+   * consistent property layout.
+   */
+  def getProperties: Map[String, String] = {

Review Comment:
   Worth noting in the Scaladoc that `metric_view.from.sql` is truncated to 
`Constants.MAXIMUM_PROPERTY_SIZE` and is therefore a descriptive value (for 
catalog browsers / lineage tooling), not a round-trippable representation of 
the source — a consumer relying on it to reconstruct the view body would 
silently get a truncated query for any view whose source SQL exceeds the limit.



##########
sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/Dependency.java:
##########
@@ -0,0 +1,39 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.connector.catalog;
+
+import org.apache.spark.annotation.Evolving;
+
+/**
+ * Represents a dependency of a SQL object such as a view or metric view.
+ * <p>
+ * A dependency is one of: {@link TableDependency} or {@link 
FunctionDependency}.
+ *
+ * @since 4.2.0
+ */
+@Evolving
+public interface Dependency {

Review Comment:
   The Javadoc claim "A dependency is one of: TableDependency or 
FunctionDependency" is only structurally enforced if the interface is sealed. 
As a plain `interface` on Java 17, third-party connectors can implement it and 
the "is one of" claim is no longer true.
   
   ```suggestion
   @Evolving
   public sealed interface Dependency permits TableDependency, 
FunctionDependency {
   ```
   
   (Or alternatively, drop the "is one of" line if you want to leave the 
interface open for future extension.)



##########
sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/DependencyList.java:
##########
@@ -0,0 +1,46 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.connector.catalog;
+
+import java.util.Objects;
+
+import org.apache.spark.annotation.Evolving;
+
+/**
+ * A list of dependencies for a SQL object such as a view or metric view.
+ * <p>
+ * <ul>
+ *   <li>When {@code null}, the dependency information is not provided.</li>
+ *   <li>When the array is empty, dependencies are provided but the object has 
none.</li>
+ *   <li>When the array is non-empty, each entry describes one dependency.</li>
+ * </ul>
+ *
+ * @param dependencies array of dependencies
+ * @since 4.2.0
+ */
+@Evolving
+public record DependencyList(Dependency[] dependencies) {
+
+  public DependencyList {
+    Objects.requireNonNull(dependencies, "dependencies must not be null");

Review Comment:
   Records auto-generate accessors that return field references, so 
`DependencyList.dependencies()` hands callers the live array — a downstream 
consumer can mutate the lineage list of any `ViewInfo` via this accessor.
   
   Either defensively copy in the canonical constructor and the accessor, or 
expose `List<Dependency>` instead:
   
   ```java
   public DependencyList {
     Objects.requireNonNull(dependencies, "dependencies must not be null");
     dependencies = dependencies.clone();
   }
   
   public Dependency[] dependencies() {
     return dependencies.clone();
   }
   ```
   
   Same consideration applies to existing `ViewInfo.queryColumnNames()` / 
`currentNamespace()`, but those are pre-existing — at least worth doing it on 
the new types.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Re: [PR] [SPARK-54119] Support METRIC_VIEW creation on V2 catalogs [spark]

Reply via email to