And glad you did.

The needs of Heroku are just as important as those of any other user of an Apache 
project *but no more so*, since one extremely important measure of TLP eligibility 
is demonstrating freedom from corporate dominance.

So let me chime in with my own reasons to look at a major refactoring of PIO:

- Simplified deployment: one server with integrated Engine(s), all incorporated 
into a single REST API and a single JVM process (perhaps identical to what Mars 
is asking for).
- No need to "train" or "deploy" on different machines, but full access to 
clustered compute and storage services (also something Mars mentions).
- Kappa and non-Spark-based Engines; a pure, clean REST API that allows GUIs to 
be plugged in; optional true security (SSL + Auth).
- Independence from any specific compute backend: the ML/AI community is moving 
on from Hadoop MapReduce, to Spark, to TensorFlow and streaming online learners 
(Kappa).
- Multi-tenant, with multiple instances and types of Engines.
- Secure: TLS + authentication + authorization, but optional, so there is no 
overhead when it isn't needed.
- The CLI is just another client communicating with the server's REST API, and 
can be replaced with custom admin GUIs, for example.

We now have an MVP that delivers the above requirements, but as a replacement 
for PIO. At first we saw this as PIO-Kappa, and early code was named that, but 
things have changed: it requires some major re-thinking, and so it now has its 
own name, Harness. Getting these features would also require re-thinking the 
PIO codebase, along with a *lot* of implementation work, so we chose to start 
from scratch as the easier route. The server is one JVM process with REST 
endpoints for all input and queries, and even methods to trigger training for 
Lambda Engines. We have benchmarked performance on our scaffold Template (a 
minimal operational Engine) at 6 ms/request for one user (connection) in one 
thread on a 2013 MacBook Pro in localhost mode; add 1 ms for SSL + Auth. Since 
it uses akka-http, it will also handle a self-tuning number of parallel 
requests (no benchmarks yet). Suffice it to say, it is fast.
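
To make the single-process claim concrete, here is a minimal akka-http sketch 
of the shape of such a server; the paths and port are illustrative assumptions, 
not the actual Harness code (rest_spec.md, linked below, has the real routes):

    import akka.actor.ActorSystem
    import akka.http.scaladsl.Http
    import akka.http.scaladsl.server.Directives._
    import akka.stream.ActorMaterializer

    // One JVM, one listener: input, query, and train-trigger endpoints for
    // every Engine instance, addressed by engine id (multi-tenant).
    object SketchServer extends App {
      implicit val system = ActorSystem("sketch")
      implicit val materializer = ActorMaterializer()

      val routes =
        pathPrefix("engines" / Segment) { engineId =>
          path("events")  { post { complete(s"""{"status":"event accepted by $engineId"}""") } } ~
          path("queries") { post { complete("{}") } } ~
          path("train")   { post { complete(s"""{"status":"training $engineId (Lambda only)"}""") } }
        }

      Http().bindAndHandle(routes, "0.0.0.0", 9090)
    }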

Templates for this server are quite a bit different, partly because they now 
include their own robust validation mechanism for input, queries, and 
engine.json, but also because Templates must now do some of what pio does. With 
this responsibility comes great freedom: freedom to use any compute backend, 
freedom to use any storage mechanism for the model or input, freedom to be 
Kappa, Lambda, or any hybrid in between. And Engines get new functionality from 
the server, as listed in the requirements.
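
To make the contract concrete, here is a rough sketch in Scala of the kind of 
Template API I mean; the names here are illustrative assumptions, not the 
actual code (the real contract is linked below):

    import scala.util.Try

    // Illustrative sketch only; see the template contract link below for the
    // real trait in rest-server/core.
    trait EngineTemplate {
      // Validate and apply the engine's JSON config (the engine.json analog).
      def init(configJson: String): Try[Unit]

      // Validate and absorb one input event; a Kappa Engine may update its
      // model incrementally right here.
      def input(eventJson: String): Try[Unit]

      // Answer a query from the current model.
      def query(queryJson: String): Try[String]

      // Batch training trigger; a no-op for pure Kappa Engines, used by Lambda.
      def train(): Try[Unit] = Try(())
    }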

Even though there are structural Template differences, they remain JSON-input 
compatible with PIO. We took a PIO Template we had created in 2016 that uses 
Vowpal Wabbit as a compute backend and re-implemented it in this new ML server 
as a clean Kappa Template, so we can talk about the differences with some 
evidence to back up the statements. There was zero change to input, so backups 
of the PIO engine were moved to the new server quite easily with the CLI and no 
change to data.
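
For context, here is the shape of input both servers accept unchanged; this is 
the standard PIO event body (the values are made up):

    {
      "event": "purchase",
      "entityType": "user",
      "entityId": "u-123",
      "targetEntityType": "item",
      "targetEntityId": "i-456",
      "properties": {},
      "eventTime": "2017-09-22T18:35:00.000Z"
    }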

There are long, tedious discussions that could be had about how to get what 
Mars and I are asking for from PIO, but Apache is a do-ocracy. All of our asks 
can be done incrementally, with incremental disruption, or they can be done all 
at once (and have been). There are so many trade-offs that the discussion will, 
in all likelihood, never end.

I therefore suggest that Mars *do* what he thinks is needed; alternatively, I 
am willing to donate what we have running. I'm planning to make the UR a Kappa 
algorithm soon, requiring no `pio train` (and no Spark). This must, of 
necessity, be done on the new server framework, so whether the new framework 
becomes part of PIO 2 or not is a choice for the team. I suppose I could just 
push it to an "experimental" branch, but this is something I'm not willing to 
*do* without some indication that it is welcome.

https://github.com/actionml/harness
https://github.com/actionml/harness/blob/develop/commands.md
https://github.com/actionml/harness/blob/develop/rest_spec.md
Template contract: 
https://github.com/actionml/harness/tree/develop/rest-server/core/src/main/scala/com/actionml/core/template

The major downside, I will volunteer, is that Templates will require a fair bit 
of work to port, and we have no Spark-based ones to use as examples yet. Also, 
we have not yet integrated PIO-Stores as the lead-in diagram implies. Remember, 
this is an MVP running a Template in a production environment; it makes no 
effort to replicate all PIO features.

 
On Sep 22, 2017, at 6:35 PM, Mars Hall <mars.h...@salesforce.com> wrote:

I'm bringing this thread back to life!

There is another thread here this week:
"How to training and deploy on different machine?"

In it, Pat replies:

You will have to spread the pio "workflow" out over a permanent 
deploy+eventserver machine. I usually call this a combo PredictionServer and 
EventServer. These are 2 JVM processes that take events and respond to queries 
and so must be available all the time. You will run `pio eventserver` and `pio 
deploy` on this machine.

This is exactly what I'm talking about. Two processes on a single machine to 
run a complete deployment. Doesn't it make sense to allow these APIs to coexist 
in a single JVM?

Sure, in some cases you may want to scale out and tune two different JVMs for 
these two different use-cases, but for most of us, making it so the main 
runtime only requires a single process/JVM would make PredictionIO much more 
friendly to operate.
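
As a sketch of what I mean (illustrative akka-http code, not PIO's actual 
server internals), the two APIs are just route fragments that could be 
concatenated under a single listener:

    import akka.actor.ActorSystem
    import akka.http.scaladsl.Http
    import akka.http.scaladsl.server.Directives._
    import akka.stream.ActorMaterializer

    // The Events and Queries APIs mounted in one JVM on one port; the paths
    // mirror PIO's /events.json and /queries.json endpoints.
    object SingleProcessPio extends App {
      implicit val system = ActorSystem("pio-single")
      implicit val materializer = ActorMaterializer()

      val eventRoutes = path("events.json")  { post { complete("""{"eventId":"stub"}""") } }
      val queryRoutes = path("queries.json") { post { complete("{}") } }

      Http().bindAndHandle(eventRoutes ~ queryRoutes, "0.0.0.0", 8000)
    }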

A few more comments inline below…


On Wed, Jul 12, 2017 at 7:43 PM, Kenneth Chan <kenn...@apache.org> wrote:
Mars, I totally understand and agree that we should make developers successful, 
but I would like to understand your problem more before jumping to conclusions.

First, a complete PIO setup has the following:
1. PIO framework layer
2. PIO administration (e.g. PIO app)
3. PIO event server
4. one or more PIO engines

The storage and setup config apply to 1 globally; 2, 3, and 4 run on top of 1.

My understanding is that the Buildpack takes engine code and then builds, 
releases, and deploys it, after which it can serve queries.

When a Heroku user uses the buildpack:
- Where is the event server in the picture?

The eventserver is considered optional. If a Heroku user wants to use the 
events API, then they must provision a second Heroku app for the eventserver:
  https://github.com/heroku/predictionio-buildpack/blob/master/CUSTOM.md#user-content-eventserver

- How does the user set up the storage config for 1?

With the Heroku buildpack, PostgreSQL is the default for all storage sources, 
and it is automatically configured.
 
- If I use the buildpack to deploy another engine, does it share 1 and 2 above?

No. Every engine is another Heroku app. Every eventserver is another Heroku 
app. These can be configured to intentionally share databases/storage, such as 
for a specific engine+eventserver pair.

 
On Wed, Jul 12, 2017 at 3:21 PM, Mars Hall <m...@heroku.com> wrote:
The key motivation behind this idea/request is to:

    Simplify baseline PredictionIO deployment, both conceptually & technically.

My vision with this thread is to:

    Enable single-process, single network-listener PredictionIO app deployment
    (i.e. Queries & Events APIs in the same process.)


Attempting to address some previous questions & statements…


From Pat Ferrel on Tue, 11 Jul 2017 10:53:48 -0700 (PDT):
> how much of your problem is workflow vs installation vs bundling of APIs? Can 
> you explain it more?

I am focused on deploying PredictionIO on Heroku via this buildpack:
  https://github.com/heroku/predictionio-buildpack

Heroku is an app-centric platform, where each app gets a single routable 
network port. By default apps get a URL like:
  https://tdx-classi.herokuapp.com (an example PIO Classification engine)

Deploying a separate Eventserver app that must be configured to share storage 
config & backends leads to all kinds of complexity, especially when a developer 
unsuspectingly deploys a new engine with a different storage config without 
realizing that the Eventserver is not simply shareable. Despite a lot of docs & 
discussion suggesting its share-ability, there is precious little documentation 
that presents how the multi-backend Storage really works in PIO. (I didn't 
understand it until I read a bunch of Storage source code.)
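
For reference, the crux is that PIO's named repositories (metadata, event data, 
model data) each map onto a named source; a minimal pio-env.sh shows the whole 
mechanism (these are the standard PIO setting names, with illustrative values):

    # Three repositories, each pointed at one named source
    PIO_STORAGE_REPOSITORIES_METADATA_NAME=pio_meta
    PIO_STORAGE_REPOSITORIES_METADATA_SOURCE=PGSQL
    PIO_STORAGE_REPOSITORIES_EVENTDATA_NAME=pio_event
    PIO_STORAGE_REPOSITORIES_EVENTDATA_SOURCE=PGSQL
    PIO_STORAGE_REPOSITORIES_MODELDATA_NAME=pio_model
    PIO_STORAGE_REPOSITORIES_MODELDATA_SOURCE=PGSQL

    # The PGSQL source itself
    PIO_STORAGE_SOURCES_PGSQL_TYPE=jdbc
    PIO_STORAGE_SOURCES_PGSQL_URL=jdbc:postgresql://localhost/pio
    PIO_STORAGE_SOURCES_PGSQL_USERNAME=pio
    PIO_STORAGE_SOURCES_PGSQL_PASSWORD=pio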


From Kenneth Chan on Tue, 11 Jul 2017 12:49:58 -0700 (PDT):
> For example, one can modify the classification to train a classifier on the 
> same set of data used by recommendation.
…and later on Wed, 12 Jul 2017 13:44:01 -0700:
> My concern of embedding event server in engine is
> - what problem are we solving by providing an illusion that events are only 
> limited for one engine?

This is a great ideal target, but the reality is that it takes significant 
design & engineering to reach that level of data share-ability. I'm not 
suggesting that we do anything to undercut the possibilities of such a 
distributed architecture. I suggest that we streamline PIO for everyone who is 
not at that level of distributed architecture. Make PIO not *require* it.

The best example I have is that you can run Spark in local mode, without 
worrying about any aspect of its ideal distributed purpose. (In fact, 
PredictionIO is built on this feature of Spark!) I don't know the history 
there, but I would imagine Spark was not always so friendly for small or 
embedded tasks like this.
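
For instance, the whole distributed machinery collapses to a single process 
with one setting (standard Spark API, runnable from spark-shell or a script):

    import org.apache.spark.sql.SparkSession

    // "local[*]" runs Spark in-process on all local cores: no cluster,
    // no daemons, nothing beyond the Spark jars.
    val spark = SparkSession.builder()
      .appName("local-example")
      .master("local[*]")
      .getOrCreate()

    println(spark.range(1000).count())  // prints 1000
    spark.stop()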


A huge part of my reality is seeing how many newcomers fumble around and get 
frustrated. I'm looking at PredictionIO from a very Heroku-style perspective of 
"how do we help [new] developers be successful", which is probably going to 
seem like I want to take away capabilities. I just want to make the onramp more 
graceful!

*Mars

( <> .. <> )




-- 
*Mars Hall
415-818-7039
Customer Facing Architect
Salesforce Platform / Heroku
San Francisco, California
