[GitHub] spark issue #22143: [SPARK-24647][SS] Report KafkaStreamWriter's written min...

vackosar Mon, 20 Aug 2018 12:33:33 -0700

Github user vackosar commented on the issue:

    https://github.com/apache/spark/pull/22143
  
    @arunmahadevan min and max are used because there can be other writers to 
the same topic in a different job. The messages sent would then be 
interleaved, and one would have to return a large number of intervals to be 
accurate. This approach gives sufficient information about where the data 
ended up being written, while also being resilient and simple. Would you 
recommend adding this as a Javadoc?
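
    As a rough illustration of the trade-off (a standalone Python sketch, not 
code from this PR; the `Ack` tuple and both function names are hypothetical), 
compare one min/max pair per partition with the exact list of contiguous 
intervals when another job interleaves its own messages:

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical stand-in for producer acknowledgements: (topic, partition, offset).
Ack = Tuple[str, int, int]
Key = Tuple[str, int]


def min_max_offsets(acks: Iterable[Ack]) -> Dict[Key, Tuple[int, int]]:
    """Keep a single (min, max) offset pair per topic-partition."""
    ranges: Dict[Key, Tuple[int, int]] = {}
    for topic, partition, offset in acks:
        key = (topic, partition)
        lo, hi = ranges.get(key, (offset, offset))
        ranges[key] = (min(lo, offset), max(hi, offset))
    return ranges


def exact_intervals(acks: Iterable[Ack]) -> Dict[Key, List[Tuple[int, int]]]:
    """Return every contiguous offset interval written, per topic-partition."""
    by_partition: Dict[Key, List[int]] = {}
    for topic, partition, offset in acks:
        by_partition.setdefault((topic, partition), []).append(offset)

    result: Dict[Key, List[Tuple[int, int]]] = {}
    for key, offsets in by_partition.items():
        intervals: List[Tuple[int, int]] = []
        for off in sorted(offsets):
            if intervals and off == intervals[-1][1] + 1:
                # Extend the current interval by one offset.
                intervals[-1] = (intervals[-1][0], off)
            else:
                # A gap (someone else's message) starts a new interval.
                intervals.append((off, off))
        result[key] = intervals
    return result


# Assumed scenario: this job wrote offsets 10, 12 and 14, while another job
# writing to the same partition received offsets 11 and 13.
acks = [("events", 0, 10), ("events", 0, 12), ("events", 0, 14)]
summary = min_max_offsets(acks)
precise = exact_intervals(acks)
```

    Here the min/max summary stays at one pair per partition, while the exact 
representation grows with every interleaved foreign message.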
    
    To explain the motivation, I updated the description of this PR using the 
description from the Jira. (To track data lineage, we need to know, at least 
approximately, where data was read from and written to.)



