Well, when I started development ~2 years ago, Scalatra just appealed more, being more lightweight (I didn't need MVC, just barebones REST endpoints), and I still find its API/DSL much nicer to work with. Also, the Swagger API docs integration was important to me. So it's more familiarity than any other reason.
If I were to build a model server from scratch, perhaps Spray/Akka HTTP would be the better way to go, purely for integration purposes. Having said that, I think Scalatra is great and performant, so it's not a no-brainer either way.

On Sun, Oct 19, 2014 at 5:29 PM, Debasish Das <debasish.da...@gmail.com> wrote:

> Hi Nick,
>
> Any specific reason of choosing scalatra and not play/spray (now that they
> are getting integrated) ?
>
> Sean,
>
> Would you be interested in a play and akka clustering based module in
> oryx2 and see how it compares against the servlets ? I am interested to
> understand the scalability....
>
> Thanks.
> Deb
>
> On Sat, Oct 18, 2014 at 11:22 PM, Nick Pentreath <nick.pentre...@gmail.com> wrote:
>
>> We've built a model server internally, based on Scalatra and Akka
>> Clustering. Our use case is more geared towards serving possibly thousands
>> of smaller models.
>>
>> It's actually very basic, just reads models from S3 as strings (!!) (uses
>> HDFS FileSystem so can read from local, HDFS, S3) and uses Breeze for
>> linear algebra. (Technically it is also not dependent on Spark, it could be
>> reading models generated by any computation layer).
>>
>> It's designed to allow scaling via cluster sharding, by adding nodes (but
>> could also support a load-balanced approach). Not using persistent actors
>> as doing a model reload on node failure is not a disaster as we have
>> multiple levels of fallback.
>>
>> Currently it is a bit specific to our setup (and only focused on
>> recommendation models for now), but could with some work be made generic.
>> I'm certainly considering if we can find the time to make it a releasable
>> project.
>>
>> One major difference to Oryx is that it only handles the model loading
>> and vector computations, not the filtering-related and other things that
>> come as part of a recommender system (that is done elsewhere in our
>> system). It also does not handle the ingesting of data at all.
>>
>> On Sun, Oct 19, 2014 at 7:10 AM, Sean Owen <so...@cloudera.com> wrote:
>>
>>> Yes, that is exactly what the next 2.x version does. Still in progress but
>>> the recommender app and framework are code-complete. It is not even
>>> specific to MLlib and could plug in other model build functions.
>>>
>>> The current 1.x version will not use MLlib. Neither uses Play but is
>>> intended to scale just by adding web servers however you usually do.
>>>
>>> See graphflow too.
>>> On Oct 18, 2014 5:06 PM, "Rajiv Abraham" <rajiv.abra...@gmail.com> wrote:
>>>
>>> > Oryx 2 seems to be geared for Spark
>>> >
>>> > https://github.com/OryxProject/oryx
>>> >
>>> > 2014-10-18 11:46 GMT-04:00 Debasish Das <debasish.da...@gmail.com>:
>>> >
>>> > > Hi,
>>> > >
>>> > > Is someone working on a project on integrating Oryx model serving layer
>>> > > with Spark ? Models will be built using either Streaming data / Batch data
>>> > > in HDFS and cross validated with mllib APIs but the model serving layer
>>> > > will give API endpoints like Oryx
>>> > > and read the models may be from hdfs/impala/SparkSQL
>>> > >
>>> > > One of the requirement is that the API layer should be scalable and
>>> > > elastic...as requests grow we should be able to add more nodes...using play
>>> > > and akka clustering module...
>>> > >
>>> > > If there is a ongoing project on github please point to it...
>>> > >
>>> > > Is there a plan of adding model serving and experimentation layer to mllib ?
>>> > >
>>> > > Thanks.
>>> > > Deb
>>> >
>>> > --
>>> > Take care,
>>> > Rajiv