On 24.03.2016 at 10:34, Simon Hafner wrote:
> 2016-03-24 9:54 GMT+01:00 Max Schmidt <m...@datapath.io>:
> > we're using with the java-api (1.6.0) a ScheduledExecutor that continuously
> > executes a SparkJob to a standalone cluster.
> I'd recommend Scala.
Why should I use Scala?

> > After each job we close the JavaSparkContext and create a new one.
> Why do that? You can happily reuse it. Pretty sure that also causes
> the other problems, because you have a race condition on waiting for
> the job to finish and stopping the Context.
I do that because it is a very common pattern to create an object for a specific "job" and release its resources when it's done.
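For what it's worth, a sketch of the reuse pattern Simon suggests: one long-lived JavaSparkContext shared by the scheduled task, with shutdown ordered so the running job finishes before the context is stopped. Class names, the master URL, and the schedule are illustrative, not from this thread:

```java
import java.util.Arrays;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

public class ScheduledSparkJobs {
    public static void main(String[] args) {
        // One context for the lifetime of the application -- creating and
        // stopping a context per job is expensive and, as noted above, racy.
        SparkConf conf = new SparkConf()
                .setAppName("scheduled-jobs")        // fixed app name
                .setMaster("spark://master:7077");   // hypothetical cluster URL
        final JavaSparkContext sc = new JavaSparkContext(conf);

        final ScheduledExecutorService scheduler =
                Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(() -> {
            // Each run is an ordinary job on the shared context.
            long count = sc.parallelize(Arrays.asList(1, 2, 3)).count();
            System.out.println("processed " + count + " records");
        }, 0, 10, TimeUnit.MINUTES);

        // On shutdown: stop scheduling first, wait for the in-flight job to
        // finish, and only then stop the context. This avoids the race
        // between job completion and sc.stop().
        Runtime.getRuntime().addShutdownHook(new Thread(() -> {
            scheduler.shutdown();
            try {
                scheduler.awaitTermination(1, TimeUnit.HOURS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            sc.stop();
        }));
    }
}
```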
The first problem that came to my mind was that the appName is immutable once the JavaSparkContext has been created, so it is, to me, not possible to reuse the JavaSparkContext for jobs with different IDs (that we want to see in the web UI). And of course it is possible to wait for the JavaSparkContext to close gracefully, except when there is some asynchronous action in the background?

--
Max Schmidt, Senior Java Developer | m...@datapath.io | LinkedIn <https://www.linkedin.com/in/maximilian-schmidt-9893b7bb/>
Datapath.io. Decreasing AWS latency. Your traffic optimized.
Datapath.io GmbH, Mainz | HRB Nr. 46222 | Sebastian Spies, CEO
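P.S. Regarding distinguishing jobs in the web UI: the appName is indeed fixed at context creation, but as far as I can tell, individual jobs on a shared context can still be labelled via JavaSparkContext.setJobGroup, and the UI shows that description per job. A minimal sketch, assuming a local master and made-up group IDs:

```java
import java.util.Arrays;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

public class JobLabels {
    public static void main(String[] args) {
        JavaSparkContext sc = new JavaSparkContext(
                new SparkConf().setAppName("shared-app").setMaster("local[2]"));

        // Label the next job; the web UI shows the description, and the
        // group id can later be used with sc.cancelJobGroup(...).
        sc.setJobGroup("batch-2016-03-24-001", "import batch #1");
        sc.parallelize(Arrays.asList(1, 2, 3)).count();

        // A different label for the next run on the same context.
        sc.setJobGroup("batch-2016-03-24-002", "import batch #2");
        sc.parallelize(Arrays.asList(4, 5, 6)).count();

        sc.stop();
    }
}
```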