Yes, you can submit multiple actions from different threads to the same
SparkContext; it is safe.
Indeed, what you want to achieve is quite common: exposing some operations
over a SparkContext through HTTP. I have used spray for this and it worked
just fine.
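
A minimal sketch of the multi-threaded part, assuming local mode and made-up
data (the class, app, and master names here are just illustrative, not from
your code):

    import java.util.Arrays;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.Future;
    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaSparkContext;

    public class ConcurrentActions {
        public static void main(String[] args) throws Exception {
            // One long-lived context shared by every thread in the process.
            JavaSparkContext sc = new JavaSparkContext(
                    new SparkConf().setAppName("shared-sc").setMaster("local[4]"));

            ExecutorService pool = Executors.newFixedThreadPool(2);
            // Two actions submitted concurrently against the same context;
            // Spark schedules them as independent jobs.
            Future<Long> a = pool.submit(() ->
                    sc.parallelize(Arrays.asList(1, 2, 3, 4), 2).map(x -> x * 2).count());
            Future<Long> b = pool.submit(() ->
                    sc.parallelize(Arrays.asList(5, 6, 7, 8), 2).filter(x -> x % 2 == 0).count());
            System.out.println(a.get() + " / " + b.get());

            pool.shutdown();
            sc.stop();
        }
    }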

At bootstrap of your web app, start a SparkContext, maybe preprocess some
data and cache it, then start accepting requests against this sc. Depending
on where you place the initialization code, you can block the server from
initializing until your context is ready, which is nice if you don't want to
accept requests while the context is being prepared.
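
A rough sketch of that bootstrap step, with hypothetical class and path names
(the HTTP layer itself is left out):

    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaRDD;
    import org.apache.spark.api.java.JavaSparkContext;

    // Hypothetical holder built once at web-app startup and shared by request handlers.
    public class SparkBootstrap {
        private final JavaSparkContext sc;
        private final JavaRDD<String> cachedData;

        public SparkBootstrap() {
            sc = new JavaSparkContext(new SparkConf().setAppName("long-lived-driver"));
            // Preprocess once and cache; the input path is a placeholder.
            cachedData = sc.textFile("hdfs:///data/input").cache();
            // Running an action here materializes the cache, so server startup
            // blocks until the context and the cached data are ready.
            cachedData.count();
        }

        // Called from the HTTP layer's request threads; each call becomes its own Spark job.
        public long handleRequest(String keyword) {
            return cachedData.filter(line -> line.contains(keyword)).count();
        }
    }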


Eugen


2015-02-05 23:22 GMT+01:00 Shuai Zheng <szheng.c...@gmail.com>:

> This example helps a lot :-)
>
>
>
> But I am thinking a below case:
>
>
>
> Assume I have a SparkContext as a global variable.
>
> Then if I use multiple threads to access/use it. Will it mess up?
>
>
>
> For example:
>
>
>
> My code:
>
>
>
> public static List<Tuple2<Integer, Double>> run(JavaSparkContext sparkContext,
>         Map<Integer, List<ExposureInfo>> cache, Properties prop, List<EghInfo> el)
>         throws IOException, InterruptedException {
>     // build an RDD from the passed-in list and run transformations/actions on it
>     JavaRDD<EghInfo> lines = sparkContext.parallelize(el, 100);
>     lines.map(…)
>     …
>     lines.count();
> }
>
>
>
> Suppose I have two threads that call this method at the same time and pass
> in the same SparkContext.
>
>
>
> Will SparkContext be thread-safe? I am a bit worried here; in traditional
> Java it should be, but in a Spark context I am not 100% sure.
>
>
>
> Basically the sparkContext needs to be smart enough to differentiate the
> different method contexts (RDDs added to it from different methods), so it
> creates two different DAGs for the different methods.
>
>
>
> Can anyone confirm this? It is not something I can easily test with code.
> Thanks!
>
>
>
> Regards,
>
>
>
> Shuai
>
>
>
> From: Corey Nolet [mailto:cjno...@gmail.com]
> Sent: Thursday, February 05, 2015 11:55 AM
> To: Charles Feduke
> Cc: Shuai Zheng; user@spark.apache.org
> Subject: Re: How to design a long live spark application
>
>
>
> Here's another lightweight example of running a SparkContext in a common
> java servlet container: https://github.com/calrissian/spark-jetty-server
>
>
>
> On Thu, Feb 5, 2015 at 11:46 AM, Charles Feduke <charles.fed...@gmail.com>
> wrote:
>
> If you want to design something like the Spark shell, have a look at:
>
>
>
> http://zeppelin-project.org/
>
>
>
> It's open source and may already do what you need. If not, its source code
> will be helpful in answering your questions about how to integrate with
> long-running jobs.
>
>
>
> On Thu Feb 05 2015 at 11:42:56 AM Boromir Widas <vcsub...@gmail.com>
> wrote:
>
> You can check out https://github.com/spark-jobserver/spark-jobserver -
> this allows several users to upload their jars and run jobs with a REST
> interface.
>
>
>
> However, if all users are using the same functionality, you can write a
> simple spray server which will act as the driver and host the spark
> context + RDDs, launched in client mode.
>
>
>
> On Thu, Feb 5, 2015 at 10:25 AM, Shuai Zheng <szheng.c...@gmail.com>
> wrote:
>
> Hi All,
>
>
>
> I want to develop a server side application:
>
>
>
> User submits a request -> server runs a Spark application and returns the
> result (this might take a few seconds).
>
>
>
> So I want the server to keep a long-lived context; I don't know whether
> this is reasonable or not.
>
>
>
> Basically I am trying to have a global JavaSparkContext instance, keep it
> there, and initialize some RDDs. Then my Java application will use it to
> submit jobs.
>
>
>
> So now I have some questions:
>
>
>
> 1, If I don’t close it, is there any timeout I need to configure on the
> Spark server?
>
> 2, In theory I want to design something similar to the Spark shell (which
> also hosts a default sc), just not shell-based.
>
>
>
> Any suggestions? I think my requirement is very common for application
> development; surely someone has done this before?
>
>
>
> Regards,
>
>
>
> Shawn
>
>
>
>
>
