> 1) could we update documentation for Structured Streaming and describe
> that checkpointing could be specified by
> spark.sql.streaming.checkpointLocation
> on SparkSession level and thus automatically checkpoint dirs will be
> created per foreach query?

Sure, please open a pull request.

> 2) Do we really need to specify the checkpoint dir per query? what the
> reason for this? finally we will be forced to write some checkpointDir name
> generator, for example associate it with some particular named query and so
> on?

Every query needs its own unique checkpoint location, because the checkpoint is how we track what has already been processed. Without it, we can't restart the query where it left off. In your example, I would suggest including the metric name in the checkpoint location path.
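
As a rough sketch of that suggestion, something like the following could work. It is written in Python and assumes a streaming DataFrame per metric; the helper names (checkpoint_location, start_metric_query), the base directory and the sanitizing rule are all made up for illustration and are not part of Spark.

```python
import re


def checkpoint_location(base_dir, metric_name):
    """Build a per-query checkpoint path that embeds the metric name.

    Hypothetical helper, not a Spark API.
    """
    # Replace characters that are awkward in paths. Two different names
    # could map to the same path after this step, so keep metric names
    # distinct on the safe character set.
    safe = re.sub(r"[^A-Za-z0-9_.-]", "_", metric_name)
    if not safe:
        raise ValueError("metric_name must not be empty")
    return f"{base_dir.rstrip('/')}/{safe}"


def start_metric_query(df, metric_name, writer, base_dir="/tmp/checkpoints"):
    """Start one foreach query for a metric, with its own checkpoint dir.

    `df` is assumed to be a streaming DataFrame and `writer` a function or
    object accepted by DataStreamWriter.foreach.
    """
    return (
        df.writeStream
        .queryName(metric_name)
        .option("checkpointLocation", checkpoint_location(base_dir, metric_name))
        .foreach(writer)
        .start()
    )
```

Reusing the same metric name after a restart yields the same path, which is what lets the query pick up where it left off.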