I have a use case where I use Spark Streaming to distribute a set of computations, some of which need to call an external service. Naturally, I'd like to manage my connections per executor/worker.
I know this pattern for DStreams: https://people.apache.org/~pwendell/spark-nightly/spark-branch-2.0-docs/spark-2.0.1-SNAPSHOT-2016_07_21_04_05-f9367d6-docs/streaming-programming-guide.html#design-patterns-for-using-foreachrdd — but I was wondering how I'd do the same inside map functions, since I'd like to "commit" the output iterator and only afterwards "return" my connection to the pool. And more generally, how is this going to work with Structured Streaming?

Thanks,
Amit
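One way to get the "return the connection only after the output is committed" behavior in a map-side transformation is `mapPartitions` with a lazy generator: the cleanup runs when the downstream consumer exhausts the iterator, not when the function returns. Below is a minimal sketch in plain Python (not actual Spark, so it stays self-contained); `ConnectionPool` and `send` are hypothetical stand-ins for a real client pool and service call.

```python
class ConnectionPool:
    """Hypothetical per-executor pool; replace with your real client pool."""
    returned = []

    @classmethod
    def get(cls):
        return object()  # stand-in for a live connection

    @classmethod
    def give_back(cls, conn):
        cls.returned.append(conn)


def send(conn, record):
    """Hypothetical call to the external service."""
    return record * 2


def process_partition(records):
    """Generator suitable for rdd.mapPartitions(process_partition).

    Because this is a generator, the `finally` block (which returns the
    connection) runs only once the caller has fully consumed the output
    iterator — i.e. after the partition's output is "committed".
    """
    conn = ConnectionPool.get()
    try:
        for record in records:
            yield send(conn, record)
    finally:
        ConnectionPool.give_back(conn)


# Simulate Spark draining the partition's output iterator:
out = list(process_partition(iter([1, 2, 3])))  # connection returned here
```

In real Spark this would be `rdd.mapPartitions(process_partition)` (or `Dataset.mapPartitions` in the typed API). For Structured Streaming, as I understand it, the sink-side equivalent is `ForeachWriter` with its per-partition `open`/`process`/`close` lifecycle, and newer versions also offer `foreachBatch`; whether the same lazy-iterator trick is safe in map-side transformations there is part of what I'm asking.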