Re: No active SparkContext

2016-03-31 Thread Max Schmidt
Just to mark this question closed - we experienced an OOM exception on
the Master, which we didn't see on the Driver, but which caused the crash.

On 24.03.2016 at 09:54, Max Schmidt wrote:
> Hi there,
>
> we're using a ScheduledExecutor with the Java API (1.6.0) that
> continuously submits a SparkJob to a standalone cluster.
>
> After each job we close the JavaSparkContext and create a new one.
>
> But sometimes the Scheduling JVM crashes with:
>
> 24.03.2016-08:30:27:375# error - Application has been killed. Reason:
> All masters are unresponsive! Giving up.
> 24.03.2016-08:30:27:398# error - Error initializing SparkContext.
> java.lang.IllegalStateException: Cannot call methods on a stopped
> SparkContext.
> This stopped SparkContext was created at:
>
> org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:59)
> io.datapath.spark.AbstractSparkJob.createJavaSparkContext(AbstractSparkJob.java:53)
> io.datapath.measurement.SparkJobMeasurements.work(SparkJobMeasurements.java:130)
> io.datapath.measurement.SparkMeasurementScheduler.lambda$submitSparkJobMeasurement$30(SparkMeasurementScheduler.java:117)
> io.datapath.measurement.SparkMeasurementScheduler$$Lambda$17/1568787282.run(Unknown
> Source)
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> java.util.concurrent.FutureTask.run(FutureTask.java:266)
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> java.lang.Thread.run(Thread.java:745)
>
> The currently active SparkContext was created at:
>
> (No active SparkContext.)
>
> at
> org.apache.spark.SparkContext.org$apache$spark$SparkContext$$assertNotStopped(SparkContext.scala:106)
> at
> org.apache.spark.SparkContext.getSchedulingMode(SparkContext.scala:1578)
> at
> org.apache.spark.SparkContext.postEnvironmentUpdate(SparkContext.scala:2179)
> at org.apache.spark.SparkContext.<init>(SparkContext.scala:579)
> at
> org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:59)
> at
> io.datapath.spark.AbstractSparkJob.createJavaSparkContext(AbstractSparkJob.java:53)
> at
> io.datapath.measurement.SparkJobMeasurements.work(SparkJobMeasurements.java:130)
> at
> io.datapath.measurement.SparkMeasurementScheduler.lambda$submitSparkJobMeasurement$30(SparkMeasurementScheduler.java:117)
> at
> io.datapath.measurement.SparkMeasurementScheduler$$Lambda$17/1568787282.run(Unknown
> Source)
> at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> 24.03.2016-08:30:27:402# info - SparkMeasurement - finished.
>
> Any guess?
> -- 
> *Max Schmidt, Senior Java Developer* | m...@datapath.io | LinkedIn
> <https://www.linkedin.com/in/maximilian-schmidt-9893b7bb/>
> Datapath.io
>  
> Decreasing AWS latency.
> Your traffic optimized.
>
> Datapath.io GmbH
> Mainz | HRB Nr. 46222
> Sebastian Spies, CEO
>

-- 
*Max Schmidt, Senior Java Developer* | m...@datapath.io
<mailto:m...@datapath.io> | LinkedIn
<https://www.linkedin.com/in/maximilian-schmidt-9893b7bb/>
Datapath.io
 
Decreasing AWS latency.
Your traffic optimized.

Datapath.io GmbH
Mainz | HRB Nr. 46222
Sebastian Spies, CEO



Re: No active SparkContext

2016-03-24 Thread Max Schmidt

On 2016-03-24 18:00, Mark Hamstra wrote:

You seem to be confusing the concepts of Job and Application.  A
Spark Application has a SparkContext.  A Spark Application is capable
of running multiple Jobs, each with its own ID, visible in the webUI.


Obviously I mixed the two up, but then I would like to know how my Java 
application should be constructed if I wanted to submit periodic 
'Applications' to my cluster.

Has anyone used the SparkLauncher API

http://spark.apache.org/docs/latest/api/java/index.html?org/apache/spark/launcher/package-summary.html

for this scenario?
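
What I have in mind is roughly the following (untested sketch based on the 
launcher package Javadoc; the Spark home, jar path, main class and master URL 
are placeholders):

import org.apache.spark.launcher.SparkLauncher;

public class PeriodicSubmit {
  public static void main(String[] args) throws Exception {
    // Each invocation submits a separate Application (own SparkContext,
    // own entry in the master webUI) via a child spark-submit process.
    Process spark = new SparkLauncher()
        .setSparkHome("/opt/spark")                                // placeholder
        .setMaster("spark://master:7077")                          // placeholder
        .setAppResource("/path/to/measurement-job.jar")            // placeholder
        .setMainClass("io.datapath.measurement.MeasurementMain")   // placeholder
        .setAppName("measurement-" + System.currentTimeMillis())
        .launch();
    int exitCode = spark.waitFor(); // block until the submitted job finishes
    System.out.println("submission finished with exit code " + exitCode);
  }
}

Since every run would then be its own Application, each submission gets its 
own application ID and name in the webUI.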


On Thu, Mar 24, 2016 at 6:11 AM, Max Schmidt  wrote:

On 24.03.2016 at 10:34, Simon Hafner wrote:

2016-03-24 9:54 GMT+01:00 Max Schmidt :
> we're using a ScheduledExecutor with the Java API (1.6.0) that
> continuously submits a SparkJob to a standalone cluster.
I'd recommend Scala.

Why should I use Scala?


> After each job we close the JavaSparkContext and create a new one.
Why do that? You can happily reuse it. Pretty sure that also causes
the other problems, because you have a race condition on waiting for
the job to finish and stopping the Context.

I do that because it is a very common pattern to create an object
for a specific "job" and release its resources when it's done.


The first problem that came to mind was that the appName is
immutable once the JavaSparkContext is created, so to me it is not
possible to reuse the JavaSparkContext for jobs with different IDs
(which we want to see in the webUI).


And of course it should be possible to wait for closing the
JavaSparkContext gracefully, unless there is some asynchronous
action running in the background?


--

MAX SCHMIDT, SENIOR JAVA DEVELOPER | m...@datapath.io | LinkedIn [1]

 
Decreasing AWS latency.
Your traffic optimized.

Datapath.io GmbH
Mainz | HRB Nr. 46222
Sebastian Spies, CEO




Links:
--
[1] https://www.linkedin.com/in/maximilian-schmidt-9893b7bb/






Re: No active SparkContext

2016-03-24 Thread Mark Hamstra
You seem to be confusing the concepts of Job and Application.  A Spark
Application has a SparkContext.  A Spark Application is capable of running
multiple Jobs, each with its own ID, visible in the webUI.
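
For illustration (rough, untested sketch; the app name, master URL and group 
ids are placeholders), a single Application can keep one SparkContext for its 
whole lifetime and tag each scheduled run so that its Jobs are identifiable 
in the webUI:

import java.util.Arrays;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

public class SingleApplicationManyJobs {
  public static void main(String[] args) {
    SparkConf conf = new SparkConf()
        .setAppName("measurement-scheduler")   // one Application, one appName
        .setMaster("spark://master:7077");     // placeholder master URL
    JavaSparkContext jsc = new JavaSparkContext(conf);

    for (int run = 0; run < 3; run++) {
      // Every action below is a separate Job inside the same Application;
      // the group id and description show up in the webUI's job list.
      jsc.setJobGroup("run-" + run, "scheduled measurement run " + run);
      long n = jsc.parallelize(Arrays.asList(1, 2, 3, 4)).count();
      System.out.println("run " + run + " counted " + n);
    }

    jsc.stop(); // stop the single context only when the Application is done
  }
}

The appName is indeed fixed for the lifetime of the context, but each Job 
still gets its own ID, group and description.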

On Thu, Mar 24, 2016 at 6:11 AM, Max Schmidt  wrote:

> On 24.03.2016 at 10:34, Simon Hafner wrote:
>
> 2016-03-24 9:54 GMT+01:00 Max Schmidt :
> > we're using a ScheduledExecutor with the Java API (1.6.0) that
> > continuously submits a SparkJob to a standalone cluster.
> I'd recommend Scala.
>
> Why should I use Scala?
>
>
> > After each job we close the JavaSparkContext and create a new one.
> Why do that? You can happily reuse it. Pretty sure that also causes
> the other problems, because you have a race condition on waiting for
> the job to finish and stopping the Context.
>
> I do that because it is a very common pattern to create an object for a
> specific "job" and release its resources when it's done.
>
> The first problem that came to mind was that the appName is immutable
> once the JavaSparkContext is created, so to me it is not possible to
> reuse the JavaSparkContext for jobs with different IDs (which we want to see
> in the webUI).
>
> And of course it should be possible to wait for the JavaSparkContext to close
> gracefully, unless there is some asynchronous action running in the background?
>
> --
> *Max Schmidt, Senior Java Developer* | m...@datapath.io |
> LinkedIn 
> [image: Datapath.io]
>
> Decreasing AWS latency.
> Your traffic optimized.
>
> Datapath.io GmbH
> Mainz | HRB Nr. 46222
> Sebastian Spies, CEO
>


Re: No active SparkContext

2016-03-24 Thread Max Schmidt
On 24.03.2016 at 10:34, Simon Hafner wrote:
> 2016-03-24 9:54 GMT+01:00 Max Schmidt :
> > we're using a ScheduledExecutor with the Java API (1.6.0) that
> > continuously submits a SparkJob to a standalone cluster.
> I'd recommend Scala.
Why should I use Scala?
>
> > After each job we close the JavaSparkContext and create a new one.
> Why do that? You can happily reuse it. Pretty sure that also causes
> the other problems, because you have a race condition on waiting for
> the job to finish and stopping the Context.
I do that because it is a very common pattern to create an object for a
specific "job" and release its resources when it's done.

The first problem that came to mind was that the appName is immutable
once the JavaSparkContext is created, so to me it is not possible to
reuse the JavaSparkContext for jobs with different IDs (which we want
to see in the webUI).

And of course it should be possible to wait for the JavaSparkContext to close
gracefully, unless there is some asynchronous action running in the background?
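
What I mean by waiting gracefully is roughly this (simplified, untested 
sketch; schedule and job body are placeholders):

import java.util.Arrays;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

public class GracefulShutdown {
  public static void main(String[] args) {
    JavaSparkContext jsc = new JavaSparkContext(
        new SparkConf().setAppName("measurements").setMaster("spark://master:7077"));
    ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();

    // Placeholder job body: all Spark actions run inside the scheduled task.
    scheduler.scheduleWithFixedDelay(
        () -> jsc.parallelize(Arrays.asList(1, 2, 3)).count(),
        0, 10, TimeUnit.MINUTES);

    Runtime.getRuntime().addShutdownHook(new Thread(() -> {
      scheduler.shutdown();                            // stop scheduling new runs
      try {
        scheduler.awaitTermination(1, TimeUnit.HOURS); // wait for a running job
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
      jsc.stop();                                      // only then stop the context
    }));
  }
}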

-- 
*Max Schmidt, Senior Java Developer* | m...@datapath.io
 | LinkedIn

Datapath.io
 
Decreasing AWS latency.
Your traffic optimized.

Datapath.io GmbH
Mainz | HRB Nr. 46222
Sebastian Spies, CEO



Re: No active SparkContext

2016-03-24 Thread Simon Hafner
2016-03-24 9:54 GMT+01:00 Max Schmidt :
> we're using a ScheduledExecutor with the Java API (1.6.0) that
> continuously submits a SparkJob to a standalone cluster.
I'd recommend Scala.

> After each job we close the JavaSparkContext and create a new one.
Why do that? You can happily reuse it. Pretty sure that also causes
the other problems, because you have a race condition on waiting for
the job to finish and stopping the Context.
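
Roughly like this (untested sketch; names and the schedule are made up) - 
create the context once and stop it only when the whole application shuts 
down:

import java.util.Arrays;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

public class ReuseContext {
  public static void main(String[] args) {
    // One context for the whole lifetime of the scheduler process.
    JavaSparkContext jsc = new JavaSparkContext(
        new SparkConf().setAppName("measurements").setMaster("spark://master:7077"));

    ScheduledExecutorService exec = Executors.newSingleThreadScheduledExecutor();
    exec.scheduleWithFixedDelay(() -> {
      // The scheduled task reuses the same context for every run.
      long n = jsc.parallelize(Arrays.asList(1, 2, 3)).count();
      System.out.println("run finished, count = " + n);
    }, 0, 10, TimeUnit.MINUTES);

    // jsc.stop() would be called exactly once, on shutdown of the whole
    // process, never between runs.
  }
}

That avoids any window in which the next run can pick up a half-stopped context.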


No active SparkContext

2016-03-24 Thread Max Schmidt
Hi there,

we're using a ScheduledExecutor with the Java API (1.6.0) that
continuously submits a SparkJob to a standalone cluster.

After each job we close the JavaSparkContext and create a new one.
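
Simplified, the pattern looks roughly like this (sketch only; the real class 
and job names differ):

import java.util.Arrays;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

public class MeasurementScheduler {
  public static void main(String[] args) {
    Executors.newSingleThreadScheduledExecutor().scheduleWithFixedDelay(() -> {
      // A fresh context is created for every scheduled run ...
      JavaSparkContext jsc = new JavaSparkContext(
          new SparkConf().setAppName("measurement").setMaster("spark://master:7077"));
      try {
        jsc.parallelize(Arrays.asList(1, 2, 3)).count(); // the actual job
      } finally {
        jsc.close(); // ... and closed again when the run is done
      }
    }, 0, 10, TimeUnit.MINUTES);
  }
}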

But sometimes the Scheduling JVM crashes with:

24.03.2016-08:30:27:375# error - Application has been killed. Reason:
All masters are unresponsive! Giving up.
24.03.2016-08:30:27:398# error - Error initializing SparkContext.
java.lang.IllegalStateException: Cannot call methods on a stopped
SparkContext.
This stopped SparkContext was created at:

org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:59)
io.datapath.spark.AbstractSparkJob.createJavaSparkContext(AbstractSparkJob.java:53)
io.datapath.measurement.SparkJobMeasurements.work(SparkJobMeasurements.java:130)
io.datapath.measurement.SparkMeasurementScheduler.lambda$submitSparkJobMeasurement$30(SparkMeasurementScheduler.java:117)
io.datapath.measurement.SparkMeasurementScheduler$$Lambda$17/1568787282.run(Unknown
Source)
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
java.util.concurrent.FutureTask.run(FutureTask.java:266)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
java.lang.Thread.run(Thread.java:745)

The currently active SparkContext was created at:

(No active SparkContext.)

at
org.apache.spark.SparkContext.org$apache$spark$SparkContext$$assertNotStopped(SparkContext.scala:106)
at
org.apache.spark.SparkContext.getSchedulingMode(SparkContext.scala:1578)
at
org.apache.spark.SparkContext.postEnvironmentUpdate(SparkContext.scala:2179)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:579)
at
org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:59)
at
io.datapath.spark.AbstractSparkJob.createJavaSparkContext(AbstractSparkJob.java:53)
at
io.datapath.measurement.SparkJobMeasurements.work(SparkJobMeasurements.java:130)
at
io.datapath.measurement.SparkMeasurementScheduler.lambda$submitSparkJobMeasurement$30(SparkMeasurementScheduler.java:117)
at
io.datapath.measurement.SparkMeasurementScheduler$$Lambda$17/1568787282.run(Unknown
Source)
at
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
24.03.2016-08:30:27:402# info - SparkMeasurement - finished.

Any guess?
-- 
*Max Schmidt, Senior Java Developer* | m...@datapath.io
<mailto:m...@datapath.io> | LinkedIn
<https://www.linkedin.com/in/maximilian-schmidt-9893b7bb/>
Datapath.io
 
Decreasing AWS latency.
Your traffic optimized.

Datapath.io GmbH
Mainz | HRB Nr. 46222
Sebastian Spies, CEO