[jira] [Created] (SPARK-20378) StreamSinkProvider should provide schema in createSink.

Yogesh Mahajan (JIRA) Tue, 18 Apr 2017 14:59:17 -0700

Yogesh Mahajan created SPARK-20378:
--------------------------------------

             Summary: StreamSinkProvider should provide schema in createSink. 
                 Key: SPARK-20378
                 URL: https://issues.apache.org/jira/browse/SPARK-20378
             Project: Spark
          Issue Type: Improvement
          Components: Structured Streaming
    Affects Versions: 2.1.0
            Reporter: Yogesh Mahajan



We have our own Sink implementation based on our in memory store and this sink 
is also queryable through SparkSQL with corresponding logical and physical 
plans.  It is very similar to memory Sink provided in structured streaming. 
Custome Sinks are registered through DataSource and it's per query and hence 
per schema. 
StreamingQueryManager can have multiple queries in one sparkSession and their 
schema could be different. 

So with the proposed changes StreamSinkProvider trait will change as follows - 

>From this definition 

trait StreamSinkProvider {
  def createSink(
      sqlContext: SQLContext,
      parameters: Map[String, String],
      partitionColumns: Seq[String],
      outputMode: OutputMode): Sink
}

to this definition - 

trait StreamSinkProvider {
  def createSink(
      schema: StructType,
      sqlContext: SQLContext,
      parameters: Map[String, String],
      partitionColumns: Seq[String],
      outputMode: OutputMode): Sink
}




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org

[jira] [Created] (SPARK-20378) StreamSinkProvider should provide schema in createSink.

Reply via email to