Re: Best practice for multi-user web controller in front of Spark

2014-11-11 Thread Sonal Goyal
I believe the Spark Job Server by Ooyala can help you share data across multiple jobs; take a look at http://engineering.ooyala.com/blog/open-sourcing-our-spark-job-server. It seems to be a close fit for what you need.
Best Regards,
Sonal
Founder, Nube Technologies
http://www.nubetech.co
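For illustration, a minimal sketch of how two jobs submitted to the job server could share an RDD, assuming the SparkJob trait and NamedRddSupport mixin from the spark-jobserver project (check the project's README for the exact signatures in your version; "users" and the config keys below are placeholders):

import com.typesafe.config.Config
import org.apache.spark.SparkContext
import spark.jobserver.{NamedRddSupport, SparkJob, SparkJobValid, SparkJobValidation}

// First job: builds an RDD once and registers it under a name so that
// later jobs running in the same long-lived context can reuse it.
object CacheUsersJob extends SparkJob with NamedRddSupport {
  override def validate(sc: SparkContext, config: Config): SparkJobValidation = SparkJobValid

  override def runJob(sc: SparkContext, config: Config): Any = {
    val users = sc.textFile(config.getString("input.path")).cache()
    namedRdds.update("users", users)   // publish the RDD for other jobs
    users.count()
  }
}

// Second job: submitted later (e.g. per web request) against the same
// context; it looks up the shared RDD by name instead of recomputing it.
object QueryUsersJob extends SparkJob with NamedRddSupport {
  override def validate(sc: SparkContext, config: Config): SparkJobValidation = SparkJobValid

  override def runJob(sc: SparkContext, config: Config): Any = {
    val users = namedRdds.get[String]("users").get
    users.filter(_.contains(config.getString("query"))).take(10)
  }
}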

Re: Best practice for multi-user web controller in front of Spark

2014-11-11 Thread Evan R. Sparks
For sharing RDDs across multiple jobs you could also have a look at Tachyon. It provides an HDFS-compatible in-memory storage layer that keeps data in memory across multiple jobs/frameworks - http://tachyon-project.org/
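As a rough sketch of that pattern (the Tachyon master address and paths are placeholders), one application writes its result to Tachyon and a separate application reads it back through the HDFS-compatible interface without touching disk:

import org.apache.spark.{SparkConf, SparkContext}

// Application A: materialize a result set into Tachyon's in-memory file system.
val scA = new SparkContext(new SparkConf().setAppName("writer"))
scA.textFile("hdfs:///logs/2014-11-11")
   .filter(_.contains("ERROR"))
   .saveAsTextFile("tachyon://tachyon-master:19998/shared/errors")

// Application B (a different job or even a different framework): read the
// same data back from memory via the tachyon:// URI.
val scB = new SparkContext(new SparkConf().setAppName("reader"))
val errors = scB.textFile("tachyon://tachyon-master:19998/shared/errors")
println(errors.count())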

RE: Best practice for multi-user web controller in front of Spark

2014-11-11 Thread Mohammed Guller
David, here is what I would suggest:
1 - Does a new SparkContext get created in the web tier for each new request for processing?
Create a single SparkContext that gets shared across multiple web requests. Depending on the framework that you are using for the web tier, it should not be difficult to do that.
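A minimal sketch of that idea in plain Scala (names like SparkHolder and handleQuery, and the master URL, are illustrative only): the context is created once when the web application starts, and every request handler reuses it.

import org.apache.spark.{SparkConf, SparkContext}

// Created once at application startup, never per request.
object SparkHolder {
  lazy val sc: SparkContext = {
    val conf = new SparkConf()
      .setAppName("web-controller")
      .setMaster("spark://spark-master:7077")   // placeholder master URL
    new SparkContext(conf)
  }
}

// Any request handler simply borrows the shared context.
def handleQuery(keyword: String): Long = {
  val lines = SparkHolder.sc.textFile("hdfs:///data/events")
  lines.filter(_.contains(keyword)).count()
}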

Re: Best practice for multi-user web controller in front of Spark

2014-11-11 Thread Tobias Pfeiffer
Hi, there is also Spindle (https://github.com/adobe-research/spindle), which was introduced on this list some time ago. I haven't looked into it deeply, but you might gain some valuable insights from their architecture; they are also using Spark to fulfill requests coming from the web. Tobias