[GitHub] spark pull request #23086: [SPARK-25528][SQL] data source v2 API refactor (b...

cloud-fan Wed, 28 Nov 2018 19:59:59 -0800

Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/23086#discussion_r237346499
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2ScanExec.scala ---
    @@ -22,86 +22,56 @@ import org.apache.spark.sql.catalyst.InternalRow
     import org.apache.spark.sql.catalyst.expressions._
     import org.apache.spark.sql.catalyst.plans.physical
     import org.apache.spark.sql.catalyst.plans.physical.SinglePartition
    +import org.apache.spark.sql.catalyst.util.truncatedString
     import org.apache.spark.sql.execution.{ColumnarBatchScan, LeafExecNode, WholeStageCodegenExec}
    -import org.apache.spark.sql.execution.streaming.continuous._
    -import org.apache.spark.sql.sources.v2.DataSourceV2
     import org.apache.spark.sql.sources.v2.reader._
    -import org.apache.spark.sql.sources.v2.reader.streaming.{ContinuousPartitionReaderFactory, ContinuousReadSupport, MicroBatchReadSupport}
     
     /**
    - * Physical plan node for scanning data from a data source.
    + * Physical plan node for scanning a batch of data from a data source.
      */
     case class DataSourceV2ScanExec(
         output: Seq[AttributeReference],
    -    @transient source: DataSourceV2,
    -    @transient options: Map[String, String],
    -    @transient pushedFilters: Seq[Expression],
    -    @transient readSupport: ReadSupport,
    -    @transient scanConfig: ScanConfig)
    -  extends LeafExecNode with DataSourceV2StringFormat with ColumnarBatchScan {
    +    scanDesc: String,
    +    @transient batch: Batch)
    --- End diff --
    
    @rdblue I want to reuse this plan node for both batch and micro-batch.
    That is why it takes a `Batch` rather than a `Scan`: the caller is then
    free to decide how to produce the batch(es) from a scan.
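
To illustrate the point above, here is a minimal, self-contained sketch of the idea. It does not use Spark's real interfaces: `Scan`, `Batch`, `toBatchForRange`, `ScanExecSketch`, `RangeScan`, and `CallerSideSketch` are simplified, hypothetical stand-ins, and the partition strings are made up for the example. The only thing taken from the diff is the shape of the node, which receives a description plus a `Batch` and never sees the `Scan`.

```scala
// Simplified stand-ins for illustration only; these are NOT Spark's real interfaces.
trait Batch {
  def planInputPartitions(): Seq[String]
}

trait Scan {
  def description: String
  // Hypothetical: one batch covering all of the data.
  def toBatch: Batch
  // Hypothetical: one batch covering an offset range, as a micro-batch caller might request.
  def toBatchForRange(start: Long, end: Long): Batch
}

// Same shape as the node in the diff: a description and a Batch, no Scan.
final case class ScanExecSketch(scanDesc: String, batch: Batch) {
  def partitions: Seq[String] = batch.planInputPartitions()
}

object CallerSideSketch {
  // Toy scan over the offsets [0, total); each batch is a single partition.
  final class RangeScan(total: Long) extends Scan {
    val description: String = s"RangeScan($total)"

    def toBatch: Batch = toBatchForRange(0L, total)

    def toBatchForRange(start: Long, end: Long): Batch = new Batch {
      def planInputPartitions(): Seq[String] = Seq(s"[$start, $end)")
    }
  }

  // Batch caller: one scan becomes one plan node.
  def planBatch(scan: Scan): ScanExecSketch =
    ScanExecSketch(scan.description, scan.toBatch)

  // Micro-batch caller: the same scan becomes one plan node per offset range.
  def planMicroBatches(scan: Scan, ranges: Seq[(Long, Long)]): Seq[ScanExecSketch] =
    ranges.map { case (start, end) =>
      ScanExecSketch(scan.description, scan.toBatchForRange(start, end))
    }
}
```

In this sketch both callers build the same node type; they differ only in how they derive `Batch` instances from the scan, which is the flexibility the comment describes.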



