[ 
https://issues.apache.org/jira/browse/SPARK-28438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ShuMing Li updated SPARK-28438:
-------------------------------
    Description: 
When users register a datasource table with Spark, Spark currently requires
the datasource's original schema and the user-specified schema to be exactly
equal.

However, the datasource's original schema may differ slightly from the
user-specified schema: the difference may be only a column's comment or other
metadata.

Can we ignore column comments and other metadata when comparing the schemas?

// DataSource.scala
case (dataSource: RelationProvider, Some(schema)) =>
  val baseRelation =
    dataSource.createRelation(sparkSession.sqlContext, caseInsensitiveOptions)
  if (baseRelation.schema != schema) {
    throw new AnalysisException(
      s"$className does not allow user-specified schemas, " +
        s"source schema: ${baseRelation.schema}, user-specific schema: ${schema}")
  }

// StructType.scala
override def equals(that: Any): Boolean = {
  that match {
    case StructType(otherFields) =>
      java.util.Arrays.equals(
        fields.asInstanceOf[Array[AnyRef]],
        otherFields.asInstanceOf[Array[AnyRef]])
    case _ => false
  }
}
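
One possible way to do the comparison requested above is to strip field metadata from both schemas before comparing them. The following is only a sketch, not a proposed patch: the names `SchemaCompare`, `stripMetadata` and `sameIgnoringMetadata` are hypothetical and do not exist in Spark, and the code assumes the public `org.apache.spark.sql.types` API is on the classpath.

```scala
import org.apache.spark.sql.types._

// Hypothetical helper, not part of Spark.
object SchemaCompare {
  // Recursively drop field metadata (which is where column comments are
  // stored) from a data type, descending into nested structs, arrays, maps.
  def stripMetadata(dt: DataType): DataType = dt match {
    case StructType(fields) =>
      StructType(fields.map { f =>
        f.copy(dataType = stripMetadata(f.dataType), metadata = Metadata.empty)
      })
    case ArrayType(elementType, containsNull) =>
      ArrayType(stripMetadata(elementType), containsNull)
    case MapType(keyType, valueType, valueContainsNull) =>
      MapType(stripMetadata(keyType), stripMetadata(valueType), valueContainsNull)
    case other => other
  }

  // Field names, data types and nullability must still match exactly.
  def sameIgnoringMetadata(a: StructType, b: StructType): Boolean =
    stripMetadata(a) == stripMetadata(b)

  def main(args: Array[String]): Unit = {
    val source = StructType(Seq(StructField("id", IntegerType)))
    val user = StructType(
      Seq(StructField("id", IntegerType).withComment("primary key")))

    // The comment alone makes the current equality check fail.
    assert(source != user)
    // After stripping metadata the two schemas compare equal.
    assert(sameIgnoringMetadata(source, user))
  }
}
```

With a helper like this, the check in DataSource.scala could compare `baseRelation.schema` and `schema` without being affected by comments, while still rejecting real differences in names, types or nullability.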

> [SQL] Ignore metadata's(comments) difference when comparing datasource's 
> schema and user-specific schema
> --------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-28438
>                 URL: https://issues.apache.org/jira/browse/SPARK-28438
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.3.0
>            Reporter: ShuMing Li
>            Priority: Minor
>


