[ 
https://issues.apache.org/jira/browse/SPARK-3215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14120657#comment-14120657
 ] 

Matei Zaharia commented on SPARK-3215:
--------------------------------------

Thanks Marcelo! Just a few notes on the API:
- It needs to be Java-friendly, so it's probably not good to use Scala 
functions (e.g. `JobContext => T`) and maybe even the Future type.
- It's a little weird to be passing a SparkConf to the JobClient, since most 
flags in there will not affect the jobs run (as they use the remote Spark 
cluster's SparkConf). Maybe it would be better to just pass a cluster URL.
- It would be good to give jobs some kind of ID that client apps can log and 
can refer to even if the client crashes and the JobHandle object is gone. This 
is similar to how Hive prints the MapReduce job IDs it launched, and lets you 
kill them later using MR's `hadoop job -kill`.
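
To make the three notes above concrete, here is a rough Java sketch of what such an API could look like. It is only an illustration, not the proposed design: `Job`, `LocalJobClient`, `clusterUrl()`, `cancel(String)` and the `job-` ID format are hypothetical names made up for this example, and the "remote" side is simulated with a local thread pool. It uses a plain Java interface instead of a Scala function, a minimal handle instead of a Scala Future, a cluster URL instead of a SparkConf, and a string job ID that stays usable after the handle is gone.

```java
import java.io.Serializable;
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for whatever the remote context exposes to a job.
interface JobContext {
    String clusterUrl();
}

// Plain Java interface, usable from Java and Scala alike,
// in place of a Scala function type such as JobContext => T.
interface Job<T> extends Serializable {
    T call(JobContext ctx) throws Exception;
}

// Minimal handle: no Scala Future in the public signature.
interface JobHandle<T> {
    String jobId(); // loggable ID, meaningful without this object
    T get(long timeout, TimeUnit unit) throws Exception;
}

interface JobClient {
    <T> JobHandle<T> submit(Job<T> job);
    boolean cancel(String jobId); // works with only the logged ID
    void stop();
}

// In-process simulation only; a real client would talk to a remote context.
final class LocalJobClient implements JobClient {
    private final String clusterUrl;
    private final ExecutorService pool = Executors.newSingleThreadExecutor();
    private final Map<String, Future<?>> running = new ConcurrentHashMap<>();

    // Takes a cluster URL rather than a full SparkConf.
    LocalJobClient(String clusterUrl) {
        this.clusterUrl = clusterUrl;
    }

    @Override
    public <T> JobHandle<T> submit(Job<T> job) {
        final String id = "job-" + UUID.randomUUID();
        final JobContext ctx = () -> clusterUrl;
        final Callable<T> task = () -> job.call(ctx);
        final Future<T> future = pool.submit(task);
        running.put(id, future);
        return new JobHandle<T>() {
            @Override
            public String jobId() {
                return id;
            }

            @Override
            public T get(long timeout, TimeUnit unit) throws Exception {
                return future.get(timeout, unit);
            }
        };
    }

    @Override
    public boolean cancel(String jobId) {
        Future<?> f = running.remove(jobId);
        return f != null && f.cancel(true);
    }

    @Override
    public void stop() {
        pool.shutdownNow();
    }
}

class JobClientSketch {
    public static void main(String[] args) throws Exception {
        JobClient client = new LocalJobClient("spark://example-host:7077");
        try {
            JobHandle<Integer> handle = client.submit(ctx -> 6 * 7);
            System.out.println("submitted " + handle.jobId());
            System.out.println("result " + handle.get(5, TimeUnit.SECONDS));
        } finally {
            client.stop();
        }
    }
}
```

In a real implementation the ID-to-job mapping would have to live on the remote side, so that a restarted client (or a separate command-line tool) could still cancel a job by the ID it finds in the logs.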

> Add remote interface for SparkContext
> -------------------------------------
>
>                 Key: SPARK-3215
>                 URL: https://issues.apache.org/jira/browse/SPARK-3215
>             Project: Spark
>          Issue Type: New Feature
>          Components: Spark Core
>            Reporter: Marcelo Vanzin
>              Labels: hive
>         Attachments: RemoteSparkContext.pdf
>
>
> A quick description of the issue: as part of running Hive jobs on top of 
> Spark, it's desirable to have a SparkContext that is running in the 
> background and listening for job requests for a particular user session.
> Running multiple contexts in the same JVM is not a very good solution. Not 
> only does SparkContext currently have issues sharing the same JVM among 
> multiple instances, but doing so also turns the JVM running the contexts into 
> a huge bottleneck in the system.
> So I'm proposing a solution where we have a SparkContext that is running in a 
> separate process, and listening for requests from the client application via 
> some RPC interface (most probably Akka).
> I'll attach a document shortly with the current proposal. Let's use this bug 
> to discuss the proposal and any other suggestions.


