Re: SPARK-20325 - Spark Structured Streaming documentation Update: checkpoint configuration

Michael Armbrust Fri, 14 Apr 2017 13:35:01 -0700

>
> 1) Could we update the Structured Streaming documentation to describe
> that the checkpoint location can be set with
> spark.sql.streaming.checkpointLocation
> at the SparkSession level, so that checkpoint dirs are created
> automatically for each query?
>
>
Sure, please open a pull request.



> 2) Do we really need to specify the checkpoint dir per query? What is the
> reason for this? In the end we will be forced to write some checkpoint dir
> name generator, for example one that associates each dir with a particular
> named query, and so on.
>

Every query needs to have a unique checkpoint location, as this is how we
track what has been processed.  If we don't have this, we can't restart the
query where it left off.  In your example, I would suggest including the
metric name in the checkpoint location path.
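
As a rough sketch of that suggestion (not code from this thread), the Python
below derives one checkpoint directory per metric name and passes it to the
query through the "checkpointLocation" option. The helper names
checkpoint_path and start_metric_query, the base directory, and the console
sink are all hypothetical choices made for illustration; stream_df is assumed
to be a streaming DataFrame created elsewhere.

```python
import posixpath
import re


def checkpoint_path(base_dir: str, metric_name: str) -> str:
    """Return a per-query checkpoint directory under base_dir.

    Hypothetical helper: names with characters outside [A-Za-z0-9_.-] are
    rejected rather than rewritten, so two different metric names can never
    end up sharing one checkpoint directory.
    """
    if metric_name in (".", "..") or not re.fullmatch(r"[A-Za-z0-9_.-]+", metric_name):
        raise ValueError(f"unsafe metric name for a checkpoint path: {metric_name!r}")
    return posixpath.join(base_dir, metric_name)


def start_metric_query(stream_df, base_dir: str, metric_name: str):
    """Start one streaming query that owns its own checkpoint directory.

    stream_df is assumed to be a PySpark streaming DataFrame; the console
    sink is only a placeholder for whatever sink the job really uses.
    """
    return (
        stream_df.writeStream
        .queryName(metric_name)
        .option("checkpointLocation", checkpoint_path(base_dir, metric_name))
        .format("console")
        .start()
    )
```

With a base directory such as "/checkpoints/metrics" (an assumed path), a
query for the metric "cpu_load" would checkpoint to
"/checkpoints/metrics/cpu_load". Restarting the job with the same metric name
reuses that directory, which is what lets the query resume where it left off.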
