[GitHub] spark pull request #22547: [SPARK-25528][SQL] data source V2 read side API r...

mccheah Fri, 02 Nov 2018 13:56:03 -0700

Github user mccheah commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22547#discussion_r230505785
  
    --- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaContinuousInputStream.scala ---
    @@ -46,17 +45,22 @@ import org.apache.spark.sql.types.StructType
      *                       scenarios, where some offsets after the specified initial ones can't be
      *                       properly read.
      */
    -class KafkaContinuousReadSupport(
    +class KafkaContinuousInputStream(
    --- End diff --
    
    +1 for this. Many of the changes right now just move code around,
    especially the streaming code, which makes it harder to isolate the
    proposed API for review.
    
    An alternative is to split this PR into separate commits. The individual
    commits may not compile on their own because of mismatched signatures, but
    all the commits taken together would compile, and each commit can be
    reviewed individually: first to assess the API, then the implementation.
    
    For example, I'd propose 3 PRs:
    
    * Batch read, with one commit for the interface changes and a separate
      commit for the implementation changes
    * Micro-batch streaming read, with one commit for the interface changes
      and a separate commit for the implementation changes
    * Continuous streaming read, split the same way
    
    Thoughts?



---
