Github user rdblue commented on a diff in the pull request:

    https://github.com/apache/spark/pull/23086#discussion_r237178976

    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2ScanExec.scala ---
    @@ -23,29 +23,28 @@
     import org.apache.spark.sql.catalyst.expressions._
     import org.apache.spark.sql.catalyst.plans.physical
     import org.apache.spark.sql.catalyst.plans.physical.SinglePartition
     import org.apache.spark.sql.execution.{ColumnarBatchScan, LeafExecNode, WholeStageCodegenExec}
    -import org.apache.spark.sql.execution.streaming.continuous._
     import org.apache.spark.sql.sources.v2.DataSourceV2
     import org.apache.spark.sql.sources.v2.reader._
    -import org.apache.spark.sql.sources.v2.reader.streaming.{ContinuousPartitionReaderFactory, ContinuousReadSupport, MicroBatchReadSupport}

     /**
    - * Physical plan node for scanning data from a data source.
    + * Physical plan node for scanning a batch of data from a data source.
      */
     case class DataSourceV2ScanExec(
         output: Seq[AttributeReference],
         @transient source: DataSourceV2,
         @transient options: Map[String, String],
    --- End diff --

    With a catalog, there is no expectation that a `source` will be passed. This could be a string that identifies either the source or the catalog, to give a good string representation of the physical plan. This is another area where I think `Table.name` would be helpful, because the table's identifying information is really what should be shown here instead of its source or catalog.

    For options: these are part of the scan and don't affect the behavior of this physical node, so I think they shouldn't be part of the node's arguments.

    I think a good way to solve this problem is to change the pretty string format to use `Scan` instead. The scan carries the information that defines what this node is doing, like the filters, projection, and options. And being able to convert a logical scan to text would be useful across all 3 execution modes.
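    To sketch the idea (this is a hypothetical illustration, not Spark's actual API: the `Scan` and `ScanExec` classes below are stand-ins), the physical node's pretty string would delegate to the logical scan's description, so the same text works across batch, micro-batch, and continuous execution:

    ```scala
    // Stand-in for a logical Scan: it carries the information that defines
    // the node -- the projection and filters (options omitted for brevity).
    case class Scan(table: String, projection: Seq[String], filters: Seq[String]) {
      def description: String =
        s"$table [${projection.mkString(", ")}] filters=[${filters.mkString(", ")}]"
    }

    // Stand-in physical node: its string form is derived from the Scan,
    // not from a source or catalog reference and not from raw options.
    case class ScanExec(scan: Scan) {
      def simpleString: String = s"ScanExec ${scan.description}"
    }

    object Demo extends App {
      val scan = Scan("db.events", Seq("id", "ts"), Seq("ts > 100"))
      // Prints: ScanExec db.events [id, ts] filters=[ts > 100]
      println(ScanExec(scan).simpleString)
    }
    ```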