gengliangwang commented on code in PR #55507:
URL: https://github.com/apache/spark/pull/55507#discussion_r3132688322


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/execution/datasources/v2/ChangelogTable.scala:
##########
@@ -45,3 +53,65 @@ case class ChangelogTable(
 
   override def capabilities: JSet[TableCapability] = JEnumSet.of(BATCH_READ, MICRO_BATCH_READ)
 }
+
+object ChangelogTable {
+
+  def validateSchema(cl: Changelog): Unit = {
+    val byName = cl.columns.map(c => c.name -> c).toMap

Review Comment:
   Duplicate column names in the connector schema are silently collapsed here, 
because `toMap` keeps only the last entry for each key. A buggy connector that 
emits two `_change_type` columns would pass the validator, and the problem 
would only surface later as an attribute-resolution ambiguity. Worth rejecting 
duplicates here. (Not a blocker.)
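
   A minimal sketch of what such a check could look like, not the PR's actual 
code. `Col`, `DuplicateColumnCheck`, `duplicateNames`, and 
`validateNoDuplicates` are hypothetical names used for illustration only; the 
real column type and error-reporting mechanism in Spark may differ. The sketch 
compares names exactly (case-sensitive), which is also an assumption.

```scala
// Hypothetical stand-in for the connector's column type (assumption, not Spark's API).
final case class Col(name: String)

object DuplicateColumnCheck {

  // Returns every column name that occurs more than once, sorted so the
  // resulting error message is deterministic.
  def duplicateNames(columns: Seq[Col]): Seq[String] =
    columns
      .map(_.name)
      .groupBy(identity)
      .collect { case (name, occurrences) if occurrences.size > 1 => name }
      .toSeq
      .sorted

  // Fails fast instead of letting `toMap` silently keep the last entry per name.
  def validateNoDuplicates(columns: Seq[Col]): Unit = {
    val dups = duplicateNames(columns)
    if (dups.nonEmpty) {
      throw new IllegalArgumentException(
        s"Duplicate column names in changelog schema: ${dups.mkString(", ")}")
    }
  }
}
```

   Running a check like this before building `byName` would keep the map 
lookup as-is while making the duplicate case an explicit validation failure.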



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

