One of the ongoing challenges we face with PredictionIO is the separation of 
Engine & Eventserver APIs. This separation leads to several problems:

1. Deploying a complete PredictionIO app requires multiple processes, each with 
its own network listener
2. Eventserver & Engine must be configured to share exactly the same storage 
backends (same `pio-env.sh`)
3. Confusion between "Eventserver" (an optional REST API) & "event storage" (a 
required database)
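
To make problem 2 concrete, here is a minimal Python sketch of the kind of check that the split currently forces on operators. It assumes storage is configured through `PIO_STORAGE_*` environment variables, as in a typical `pio-env.sh`. The helper `storage_config_diff` and the sample values are hypothetical and not part of PredictionIO:

```python
# Hypothetical helper: compare the storage settings seen by two processes.
# Assumption: storage backends are selected via PIO_STORAGE_* variables.

def storage_config_diff(engine_env, eventserver_env, prefix="PIO_STORAGE_"):
    """Return {key: (engine_value, eventserver_value)} for every mismatch."""
    keys = {k for k in (*engine_env, *eventserver_env) if k.startswith(prefix)}
    return {
        k: (engine_env.get(k), eventserver_env.get(k))
        for k in sorted(keys)
        if engine_env.get(k) != eventserver_env.get(k)
    }

# Illustrative values only: the two processes disagree on Meta Storage.
engine_env = {
    "PIO_STORAGE_REPOSITORIES_METADATA_SOURCE": "PGSQL",
    "PIO_STORAGE_REPOSITORIES_EVENTDATA_SOURCE": "PGSQL",
}
eventserver_env = {
    "PIO_STORAGE_REPOSITORIES_METADATA_SOURCE": "ELASTICSEARCH",
    "PIO_STORAGE_REPOSITORIES_EVENTDATA_SOURCE": "PGSQL",
}
mismatches = storage_config_diff(engine_env, eventserver_env)
```

Any non-empty result means the two processes are not looking at the same storage, which is the situation that has to be avoided by hand today.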

These challenges are exacerbated by the fact that PredictionIO's docs & 
`pio app` CLI make it appear that sharing an Eventserver between Engines is a 
good idea. I recently filed a JIRA issue about this topic. TL;DR: sharing an 
Eventserver between Engines with different Meta Storage config will cause data 
corruption:
  https://issues.apache.org/jira/browse/PIO-96


I believe a lot of these issues could be alleviated with one change to 
PredictionIO core:

By default, expose the Eventserver API from the `pio deploy` Engine process, so 
that it is not necessary to deploy a second Eventserver-only process. A separate 
`pio eventserver` process could remain available as an option for anyone who 
needs the separation of concerns for scalability.
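
A rough sketch of the idea, in Python rather than PredictionIO's Scala, purely to illustrate one process with one network listener serving both APIs. The paths `/events.json` and `/queries.json` and port 8000 follow the current Eventserver and Engine defaults as I understand them. Everything else (`dispatch`, `CombinedHandler`, the in-memory `EVENTS` list) is a hypothetical stand-in, not PredictionIO code:

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urlsplit

EVENTS = []  # stand-in for event storage (a real deployment uses a database)

def handle_event(body):
    """Eventserver-style endpoint: record one event."""
    EVENTS.append(body)
    return 201, {"eventId": str(len(EVENTS))}

def handle_query(body):
    """Engine-style endpoint: answer a query (placeholder, no real model)."""
    return 200, {"query": body, "eventsSeen": len(EVENTS)}

ROUTES = {
    ("POST", "/events.json"): handle_event,
    ("POST", "/queries.json"): handle_query,
}

def dispatch(method, path, body):
    """Route a request to the matching API; both live in one process."""
    handler = ROUTES.get((method, urlsplit(path).path))
    if handler is None:
        return 404, {"message": "not found"}
    return handler(body)

class CombinedHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = json.loads(self.rfile.read(length) or b"{}")
        status, payload = dispatch("POST", self.path, body)
        data = json.dumps(payload).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)

def serve(port=8000):
    """Start the single combined listener (blocks until interrupted)."""
    HTTPServer(("", port), CombinedHandler).serve_forever()
```

Calling `serve()` starts the one listener. Because both handlers run in the same process, they necessarily share one storage configuration, which is the point of the proposal.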


I'd love to hear what you folks think. I will file a JIRA enhancement issue if 
this seems like an acceptable approach.

Mars Hall
Customer Facing Architect
Salesforce Platform / Heroku
San Francisco, California
