Thanks a lot, Mark and Christopher, for your prompt replies and clarifications.

Regards,

Kapil Malik | kma...@adobe.com

From: Christopher Nguyen [mailto:c...@adatao.com]
Sent: 25 January 2014 22:34
To: user@spark.incubator.apache.org
Subject: RE: Can I share the RDD between multiprocess


Kapil, that's right, your #2 is the pattern I was referring to. Of course it 
could be Tomcat or something even lighter weight as long as you define some 
suitable client/server protocol.

Sent while mobile. Pls excuse typos etc.
On Jan 25, 2014 6:03 AM, "Kapil Malik" <kma...@adobe.com> wrote:
Hi Christopher,

“make a "server" out of that JVM, and serve up (via HTTP/THRIFT, etc.) some 
kind of reference to those RDDs to multiple clients of that server”

Could you suggest a starting point for this approach?
As I understand it, the SparkContext constructor creates an Akka actor system 
and starts a Jetty UI server. Can we use or adapt those to serve multiple 
clients? Or can we simply construct a SparkContext inside a Java server (such 
as Tomcat)?

Regards,

Kapil Malik | kma...@adobe.com | 33430 / 8800836581

From: Christopher Nguyen [mailto:c...@adatao.com]
Sent: 25 January 2014 12:00
To: user@spark.incubator.apache.org
Subject: Re: Can I share the RDD between multiprocess

D.Y., it depends on what you mean by "multiprocess".

RDD lifecycles are currently limited to a single SparkContext. So to "share" 
RDDs you need to somehow access the same SparkContext.

This means one way to share RDDs is to make sure your accessors are in the same 
JVM that started the SparkContext.

Another is to make a "server" out of that JVM, and serve up (via HTTP/THRIFT, 
etc.) some kind of reference to those RDDs to multiple clients of that server, 
even though there is only one SparkContext (held by the server). We have built 
a server product using this pattern so I know it can work well.
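
To make the server pattern above concrete, here is a minimal sketch in plain 
Python. It is not Spark code and not the product mentioned above: FakeContext 
stands in for the single SparkContext, plain lists stand in for RDDs, and the 
DatasetServer class and the GET /<name>/<op> URL scheme are hypothetical names 
invented for this illustration. In a real deployment the server process would 
be the JVM that holds the SparkContext. The point is only that clients never 
touch the data or the context directly; they send a name and get a result.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class FakeContext:
    """Stand-in for the one SparkContext. This is NOT a Spark API."""

    def parallelize(self, data):
        return list(data)  # a plain list plays the role of an RDD


class DatasetServer:
    """Owns the single context and a registry of named datasets."""

    def __init__(self):
        self.context = FakeContext()  # exactly one per server process
        self.datasets = {}            # name -> dataset (the "reference")
        self.lock = threading.Lock()

    def register(self, name, data):
        with self.lock:
            self.datasets[name] = self.context.parallelize(data)

    def run(self, name, op):
        with self.lock:
            data = self.datasets[name]  # raises KeyError for unknown names
        if op == "count":
            return len(data)
        if op == "sum":
            return sum(data)
        raise ValueError("unsupported op: " + op)


def make_handler(state):
    class Handler(BaseHTTPRequestHandler):
        def do_GET(self):
            # Hypothetical protocol: GET /<dataset-name>/<op>
            parts = self.path.strip("/").split("/")
            try:
                name, op = parts
                status, body = 200, {"result": state.run(name, op)}
            except KeyError:
                status, body = 404, {"error": "unknown dataset"}
            except ValueError as exc:
                status, body = 400, {"error": str(exc)}
            payload = json.dumps(body).encode("utf-8")
            self.send_response(status)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(payload)))
            self.end_headers()
            self.wfile.write(payload)

        def log_message(self, fmt, *args):
            pass  # silence per-request logging in this sketch

    return Handler


def start(state, port=0):
    """Serve in a background thread; port 0 lets the OS pick a free port."""
    httpd = ThreadingHTTPServer(("127.0.0.1", port), make_handler(state))
    threading.Thread(target=httpd.serve_forever, daemon=True).start()
    return httpd


def query(httpd, name, op):
    """A client call: it sends only a name and an op, never the data."""
    host, port = httpd.server_address
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        conn.request("GET", "/%s/%s" % (name, op))
        resp = conn.getresponse()
        return resp.status, json.loads(resp.read())
    finally:
        conn.close()


if __name__ == "__main__":
    state = DatasetServer()
    state.register("numbers", range(1, 11))
    httpd = start(state)
    print(query(httpd, "numbers", "count"))
    print(query(httpd, "numbers", "sum"))
    httpd.shutdown()
    httpd.server_close()
```

Any number of client processes can call the same endpoint concurrently, and 
all of them see the datasets registered in the one server process. Swapping 
HTTP for Thrift or another protocol changes only the transport, not the idea.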

--
Christopher T. Nguyen
Co-founder & CEO, Adatao <http://adatao.com>
linkedin.com/in/ctnguyen


On Fri, Jan 24, 2014 at 6:06 PM, D.Y Feng <yyfeng88...@gmail.com> wrote:
How can I share an RDD between multiple processes?

--


DY.Feng(叶毅锋)
yyfeng88625@twitter
Department of Applied Mathematics
Guangzhou University,China
dyf...@stu.gzhu.edu.cn

