Let's say I have used Spark ML to train a linear model. I know I can save and
load the model to disk. I am not sure how I can use the model in a real-time
environment. For example, I do not think I can easily return a "prediction"
to the client using Spark Streaming. Also, for some applications the extra
latency created by the batch process might not be acceptable.

If I were not using Spark, I would re-implement the model I trained in my
batch environment in a language like Java and implement a REST service that
uses the model to create a prediction and return the prediction to the
client. Many models make predictions using linear algebra. Implementing
predictions is relatively easy if you have a good vectorized LA package. Is
there a way to use a model I trained using Spark ML outside of Spark?
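
To make the idea concrete, here is a minimal sketch of the kind of
re-implementation I have in mind, written in Python for brevity. It assumes
the trained model's coefficients and intercept have already been written out
to a small JSON document. The JSON layout and the names LinearModel and
load_model are hypothetical and only for illustration; they are not part of
Spark. A real service would wrap predict() in a REST handler and would
probably use a vectorized LA package for the dot product.

```python
import json
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class LinearModel:
    """Hypothetical stand-alone copy of a trained linear model."""
    weights: List[float]
    intercept: float

    def predict(self, features: Sequence[float]) -> float:
        # A linear prediction is the dot product of weights and features
        # plus the intercept.
        if len(features) != len(self.weights):
            raise ValueError(
                f"expected {len(self.weights)} features, got {len(features)}"
            )
        return sum(w * x for w, x in zip(self.weights, features)) + self.intercept


def load_model(text: str) -> LinearModel:
    # Assumed export format: {"weights": [...], "intercept": ...}
    params = json.loads(text)
    return LinearModel(
        weights=[float(w) for w in params["weights"]],
        intercept=float(params["intercept"]),
    )


# Made-up parameters, only to show the call pattern.
model = load_model('{"weights": [0.5, -2.0], "intercept": 1.0}')
print(model.predict([4.0, 1.0]))  # 0.5*4.0 + (-2.0)*1.0 + 1.0 = 1.0
```

Nothing in the prediction path touches Spark, so the per-request latency is
just the cost of one dot product.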

As a motivating example: even if it's possible to return data to the client
using Spark Streaming, I think the mini-batch latency would not be
acceptable for a high-frequency stock trading system.

Kind regards

Andy

P.S. The examples I have seen so far use Spark Streaming to "preprocess"
predictions. For example, a recommender system might use what current users
are watching to calculate "trending recommendations". These are stored on
disk and served up to users when they use the "movie guide". If a
recommendation were a couple of minutes old, it would not affect the end
user's experience.


