We've built a model server internally, based on Scalatra and Akka
Clustering. Our use case is more geared towards serving possibly thousands
of smaller models.
It's actually very basic: it just reads models from S3 as strings (!!) (it
uses HDFS FileSystem, so it can read from local, HDFS, or S3) and uses
Hi Nick,
Any specific reason for choosing Scalatra rather than Play/Spray (now that
they are getting integrated)?
Sean,
Would you be interested in a module in oryx2 based on Play and Akka
Clustering, to see how it compares against the servlets? I am interested in
understanding the scalability
Well, when I started development ~2 years ago, Scalatra just appealed more,
being more lightweight (I didn't need MVC, just bare-bones REST endpoints),
and I still find its API / DSL much nicer to work with. Also, the Swagger
API docs integration was important to me. So it's more familiarity than
The shared-nothing load-balanced server architecture works for all but the
most massive models - and even then a few big EC2 r3 instances should do
the trick.
One nice thing about Akka (and especially the new HTTP) is fault tolerance,
recovery and potential for persistence.
For us arguably the
A concrete plan and a definite version to which the upgrade would be
applied sound like they would benefit the community. If you plan far enough
out (as Hadoop has done) and give the community enough notice, I can't
see it being a problem, as they would have ample time to upgrade.
On Sat, Oct
BTW, several people asked about registration and student passes. Registration
will open in a few weeks, and as in previous Spark Summits, I expect there to
be a special pass for students.
Matei
On Oct 18, 2014, at 9:52 PM, Matei Zaharia matei.zaha...@gmail.com wrote:
After successful
Hi Saurabh,
A good way to start is to use Spark with your applications, file any
issues you find, and perhaps provide patches for those or for
existing ones.
Please take a look at Spark's "how to contribute" page [1] to help you
get started.
Hope this helps.
- Henry
[1]