The key motivation behind this idea/request is to: Simplify baseline PredictionIO deployment, both conceptually & technically.
My vision with this thread is to: Enable single-process, single-network-listener PredictionIO app deployment (i.e. the Queries & Events APIs served from the same process).

Attempting to address some previous questions & statements…

From Pat Ferrel on Tue, 11 Jul 2017 10:53:48 -0700 (PDT):
> how much of your problem is workflow vs installation vs bundling of APIs? Can
> you explain it more?

I am focused on deploying PredictionIO on Heroku via this buildpack:
https://github.com/heroku/predictionio-buildpack

Heroku is an app-centric platform, where each app gets a single routable network port. By default apps get a URL like:
https://tdx-classi.herokuapp.com (an example PIO Classification engine)

Deploying a separate Eventserver app that must be configured to share storage config & backends leads to all kinds of complexity, especially when an unsuspecting developer deploys a new engine with a different storage config and doesn't realize that the Eventserver is not simply shareable. Despite a lot of docs & discussion suggesting its shareability, there is precious little documentation of how the multi-backend Storage really works in PIO. (I didn't understand it until I read a bunch of the Storage source code.)

From Kenneth Chan on Tue, 11 Jul 2017 12:49:58 -0700 (PDT):
> For example, one can modify the classification to train a classifier on the
> same set of data used by recommendation.

…and later on Wed, 12 Jul 2017 13:44:01 -0700:
> My concern of embedding event server in engine is
> - what problem are we solving by providing an illusion that events are only
> limited for one engine?

This is a great ideal target, but the reality is that it takes significant design & engineering to reach that level of data shareability. I'm not suggesting that we do anything to undercut the possibilities of such a distributed architecture. I suggest that we streamline PIO for everyone who is not operating at that level of distributed architecture. Make PIO not *require* it.
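To make the shared-storage coupling concrete, here is a sketch of the relevant `pio-env.sh` storage configuration (the repository names, the `PGSQL` source label, and the connection values are illustrative, not from any particular deployment). The point is that an engine and a separately deployed Eventserver must agree on these exact repository-to-source mappings, or events written by one are invisible to the other:

```shell
# Sketch of pio-env.sh storage config (illustrative values).
# Each repository (metadata, event data, model data) is mapped to a named
# source; each named source declares a backend type and connection settings.
PIO_STORAGE_REPOSITORIES_METADATA_NAME=pio_meta
PIO_STORAGE_REPOSITORIES_METADATA_SOURCE=PGSQL

PIO_STORAGE_REPOSITORIES_EVENTDATA_NAME=pio_event
PIO_STORAGE_REPOSITORIES_EVENTDATA_SOURCE=PGSQL

PIO_STORAGE_REPOSITORIES_MODELDATA_NAME=pio_model
PIO_STORAGE_REPOSITORIES_MODELDATA_SOURCE=PGSQL

# The named source "PGSQL", backed by JDBC/PostgreSQL in this sketch:
PIO_STORAGE_SOURCES_PGSQL_TYPE=jdbc
PIO_STORAGE_SOURCES_PGSQL_URL=jdbc:postgresql://localhost/pio
PIO_STORAGE_SOURCES_PGSQL_USERNAME=pio
PIO_STORAGE_SOURCES_PGSQL_PASSWORD=pio
```

If a new engine is deployed with a different `EVENTDATA` source than the Eventserver it points at, each side silently reads and writes a different event store — exactly the confusion described above.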
The best example I have is that you can run Spark in local mode, without worrying about any aspect of its ideal distributed purpose. (In fact, PredictionIO is built on this feature of Spark!) I don't know the history there, but I would imagine Spark was not always so friendly for small or embedded tasks like this.

A huge part of my reality is seeing how many newcomers fumble around and get frustrated. I'm looking at PredictionIO from a very Heroku-style perspective of "how do we help [new] developers be successful", which is probably going to seem like I want to take away capabilities. I just want to make the onramp more graceful!

*Mars ( <> .. <> )