Github user rdblue commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22009#discussion_r226780862

    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2Relation.scala ---
    @@ -169,15 +174,16 @@ object DataSourceV2Relation {
           options: Map[String, String],
           tableIdent: Option[TableIdentifier] = None,
           userSpecifiedSchema: Option[StructType] = None): DataSourceV2Relation = {
    -    val reader = source.createReader(options, userSpecifiedSchema)
    +    val readSupport = source.createReadSupport(options, userSpecifiedSchema)
    --- End diff --

    In the long term, I don't think that sources should use the reader to get a schema. This is a temporary hack until we have catalog support, which is really where schemas should come from.

    The way this works in our version (which is substantially ahead of upstream Spark, unfortunately) is that a Table is loaded by a Catalog. The schema reported by that table is used to validate writes. That way, the table can report its schema and Spark knows that data written must be compatible with that schema, but the source isn't required to be readable.
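    To make the catalog-based flow concrete, here is a minimal Scala sketch of the idea. The `Catalog` and `Table` traits, `loadTable`, and `validateWrite` are hypothetical names for illustration, not the API in this PR, and the exact-match check stands in for real compatibility rules:

    ```scala
    import org.apache.spark.sql.catalyst.TableIdentifier
    import org.apache.spark.sql.types.StructType

    // Hypothetical: a table reports its schema itself, independent of any
    // read path, so it does not need to be readable to accept writes.
    trait Table {
      def name: String
      def schema: StructType
    }

    // Hypothetical: tables are loaded by a catalog rather than constructed
    // from a reader.
    trait Catalog {
      def loadTable(ident: TableIdentifier): Table
    }

    // Write-side validation against the schema the table reports, without
    // ever creating a reader. A simplistic exact-match check stands in for
    // real schema-compatibility rules.
    def validateWrite(
        catalog: Catalog,
        ident: TableIdentifier,
        writeSchema: StructType): Unit = {
      val table = catalog.loadTable(ident)
      val expected = table.schema.fields.map(f => (f.name, f.dataType))
      val actual = writeSchema.fields.map(f => (f.name, f.dataType))
      require(
        actual.sameElements(expected),
        s"Cannot write to ${table.name}: incompatible schema")
    }
    ```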