On 24.03.2016 at 10:34, Simon Hafner wrote:
> 2016-03-24 9:54 GMT+01:00 Max Schmidt <m...@datapath.io>:
> > we're using the Java API (1.6.0) with a ScheduledExecutor that
> > continuously executes a SparkJob against a standalone cluster.
> I'd recommend Scala.
Why should I use Scala?
>
> > After each job we close the JavaSparkContext and create a new one.
> Why do that? You can happily reuse it. Pretty sure that also causes
> the other problems, because you have a race condition between waiting
> for the job to finish and stopping the context.
I do that because it is a very common pattern to create an object for a
specific "job" and release its resources when it's done.

The first problem that came to my mind was that the appName is immutable
once the JavaSparkContext has been created, so to me it is not possible to
reuse the JavaSparkContext for jobs with different IDs (which we want to
see in the web UI).
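
The closest thing I found in the 1.6 API is setJobGroup, which only labels
the jobs inside one application, not the application name itself (a sketch,
assuming a shared long-lived JavaSparkContext and a hypothetical per-run
jobId):

    import org.apache.spark.api.java.JavaSparkContext;
    import java.util.Arrays;

    // sc is the shared, long-lived context; jobId is a hypothetical per-run identifier
    static void runLabelled(JavaSparkContext sc, String jobId) {
        sc.setJobGroup("job-" + jobId, "scheduled run " + jobId);     // visible per job in the web UI
        long count = sc.parallelize(Arrays.asList(1, 2, 3)).count();  // placeholder job body
        sc.clearJobGroup();
    }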

And of course it is possible to wait for the JavaSparkContext to close
gracefully, except when there is some asynchronous action still running in
the background?
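
For example, if the scheduled task keeps the future of an asynchronous
action, it can block on it before stopping (a sketch, assuming an existing
JavaSparkContext sc):

    import org.apache.spark.api.java.JavaFutureAction;
    import org.apache.spark.api.java.JavaSparkContext;
    import java.util.Arrays;

    // wait for an outstanding asynchronous action before stopping the context
    static void finishAndStop(JavaSparkContext sc) throws Exception {
        JavaFutureAction<Long> pending =
                sc.parallelize(Arrays.asList(1, 2, 3)).countAsync();  // placeholder async action
        pending.get();   // blocks until the action has finished
        sc.stop();       // nothing is running on the context any more
    }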

-- 
*Max Schmidt, Senior Java Developer* | m...@datapath.io | LinkedIn
<https://www.linkedin.com/in/maximilian-schmidt-9893b7bb/>
Datapath.io
 
Decreasing AWS latency.
Your traffic optimized.

Datapath.io GmbH
Mainz | HRB Nr. 46222
Sebastian Spies, CEO
