Hi Tobias

I am very interested in implementing a REST-based API on top of Spark. My
REST-based system would make predictions from data provided in the request,
using models trained in batch. My SLA is 250 ms.
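
To make the question concrete, here is a minimal sketch of the scoring path I
have in mind, assuming an MLlib logistic regression model that the batch job
has saved (the model path here is made up for illustration):

import org.apache.spark.SparkContext
import org.apache.spark.mllib.classification.LogisticRegressionModel
import org.apache.spark.mllib.linalg.Vectors

object ModelScoring {
  // Load once at server start-up; the path is just a placeholder
  def loadModel(sc: SparkContext): LogisticRegressionModel =
    LogisticRegressionModel.load(sc, "hdfs:///models/latest")

  // Scoring a single request is a local call on the driver (no Spark job),
  // which is what I am counting on to stay within the 250 ms budget
  def predict(model: LogisticRegressionModel, features: Array[Double]): Double =
    model.predict(Vectors.dense(features))
}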

Would you mind sharing how you implemented your REST server?

I am using spark-1.6.1. I have several unit tests that create a Spark context
with the master set to 'local[4]'. I do not think the unit-test approach is
going to scale. Can each REST server have a pool of Spark contexts?
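
As far as I know, Spark 1.6 only allows one active SparkContext per JVM, so
rather than a pool I am currently picturing a single shared context created at
server start-up, roughly like this (the master URL is just a placeholder):

import org.apache.spark.{SparkConf, SparkContext}

object SharedSpark {
  // One long-lived context shared by all request handlers in this server
  lazy val sc: SparkContext = new SparkContext(
    new SparkConf()
      .setAppName("rest-prediction-server")
      .setMaster("spark://spark-master:7077"))
}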


The system I would like to replace is set up as follows:

Layer of dumb load balancers: L1, L2, L3
Layer of proxy servers:       P1, P2, P3, P4, P5, … Pn
Layer of containers:          C1, C2, C3, … Cn

Where Cn is much larger than Pn


Kind regards

Andy

P.S. There is a talk on 5/5 about Spark 2.0. I am hoping there is something
relevant in the near future.
https://www.brighttalk.com/webcast/12891/202021?utm_campaign=google-calendar&utm_content=&utm_source=brighttalk-portal&utm_medium=calendar&utm_term=
From:  Tobias Eriksson <tobias.eriks...@qvantel.com>
Date:  Tuesday, May 3, 2016 at 7:34 AM
To:  "user @spark" <user@spark.apache.org>
Subject:  Multiple Spark Applications that use Cassandra, how to share
resources/nodes

> Hi 
>  We are using Spark for a long-running job; in fact it is a REST server that
> does some joins with some tables in Cassandra and returns the result.
> Now we need to have multiple applications running in the same Spark cluster,
> and from what I understand this is not possible, or should I say somewhat
> complicated
> 1. A Spark application takes all the resources / nodes in the cluster (we have
> 4 nodes, one for each Cassandra node)
> 2. A Spark application returns its resources when it is done (exits or the
> context is closed/returned)
> 3. Sharing resources using Mesos only allows scaling down and then scaling up
> by a step-by-step policy, i.e. 2 nodes, 3 nodes, 4 nodes, … and increases as
> the need increases
> But if this is true, I cannot have several applications running in parallel,
> is that true ?
> If I use Mesos then the whole idea with one Spark Worker per Cassandra Node
> fails, as it talks directly to a node, and that is how it is so efficient.
> In this case I need all nodes, not 3 out of 4.
> 
> Any mistakes in my thinking ?
> Any ideas on how to solve this ? Should be a common problem I think
> 
> -Tobias
> 
> 

