+1 on that. It would be useful to use the model outside of Spark.


    _____________________________
From: DB Tsai <dbt...@dbtsai.com>
Sent: Wednesday, November 11, 2015 11:57 PM
Subject: Re: thought experiment: use spark ML to real time prediction
To: Nirmal Fernando <nir...@wso2.com>
Cc: Andy Davidson <a...@santacruzintegration.com>, Adrian Tanase 
<atan...@adobe.com>, user @spark <user@spark.apache.org>


Do you think it would be useful to separate those models and the model
loader/writer code into a separate spark-ml-common jar, without any Spark
platform dependencies, so users can load models trained by Spark ML in their
own applications and run predictions?
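
For illustration, a rough sketch of what a model class in such a jar might
look like: plain Scala, no Spark types, loadable from a simple exported file.
No such jar exists today; the class name and the on-disk format below are
purely hypothetical.

    import scala.io.Source

    // Hypothetical Spark-free linear model: just weights and an intercept.
    case class LocalLinearModel(weights: Array[Double], intercept: Double) {
      def predict(features: Array[Double]): Double =
        features.zip(weights).map { case (x, w) => x * w }.sum + intercept
    }

    object LocalLinearModel {
      // Assumed export format: line 1 = intercept, line 2 = comma-separated weights.
      def load(path: String): LocalLinearModel = {
        val lines = Source.fromFile(path).getLines().toList
        LocalLinearModel(lines(1).split(",").map(_.toDouble), lines.head.toDouble)
      }
    }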
            
Sincerely,

DB Tsai
----------------------------------------------------------
Web: https://www.dbtsai.com
PGP Key ID: 0xAF08DF8D

On Wed, Nov 11, 2015 at 3:14 AM, Nirmal Fernando <nir...@wso2.com> wrote:

As of now, we are basically serializing the ML model and then deserializing
it for prediction in real time.
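
A minimal sketch of that serialize/deserialize approach, assuming an MLlib
LogisticRegressionModel (MLlib models are java.io.Serializable); the file
path and feature handling here are only illustrative:

    import java.io._
    import org.apache.spark.mllib.classification.LogisticRegressionModel
    import org.apache.spark.mllib.linalg.Vectors

    // After training: write the model object out with plain Java serialization.
    def saveModel(model: LogisticRegressionModel, path: String): Unit = {
      val out = new ObjectOutputStream(new FileOutputStream(path))
      try out.writeObject(model) finally out.close()
    }

    // At prediction time: read it back and score a single feature vector locally.
    def loadAndPredict(path: String, features: Array[Double]): Double = {
      val in = new ObjectInputStream(new FileInputStream(path))
      val model =
        try in.readObject().asInstanceOf[LogisticRegressionModel] finally in.close()
      model.predict(Vectors.dense(features)) // local call, no Spark job needed
    }
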
On Wed, Nov 11, 2015 at 4:39 PM, Adrian Tanase <atan...@adobe.com> wrote:

I don’t think this answers your question but here’s how you would evaluate
the model in real time in a streaming app:
https://databricks.gitbooks.io/databricks-spark-reference-applications/content/twitter_classifier/predict.html
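
Roughly the pattern from that reference app, sketched here with an assumed
MLlib LogisticRegressionModel and a socket source; the model path, host and
port are placeholders:

    import org.apache.spark.SparkConf
    import org.apache.spark.mllib.classification.LogisticRegressionModel
    import org.apache.spark.mllib.linalg.Vectors
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    object StreamingPredict {
      def main(args: Array[String]): Unit = {
        val ssc = new StreamingContext(
          new SparkConf().setAppName("streaming-predict"), Seconds(1))

        // Load the precomputed model once on the driver (placeholder path).
        val model = LogisticRegressionModel.load(ssc.sparkContext, "hdfs:///models/lr")

        // Score each incoming comma-separated feature vector as it arrives.
        val lines = ssc.socketTextStream("localhost", 9999)
        val predictions = lines.map { line =>
          model.predict(Vectors.dense(line.split(",").map(_.toDouble)))
        }
        predictions.print()

        ssc.start()
        ssc.awaitTermination()
      }
    }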
                                                                                
                         
Maybe you can find a way to extract portions of MLlib and run them outside of
Spark – loading the precomputed model and calling .predict on it…

-adrian
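
For what it's worth, a sketch of that idea with MLlib as it is today: the
model's save/load path still goes through a SparkContext, but predict on a
single Vector is a plain in-process computation, so a long-running service
could load once at startup and answer requests without submitting jobs. The
model path below is a placeholder.

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.mllib.regression.LinearRegressionModel
    import org.apache.spark.mllib.linalg.Vectors

    object ModelService {
      // Local SparkContext used only to load the model at startup.
      private val sc =
        new SparkContext(new SparkConf().setAppName("model-service").setMaster("local[*]"))
      private val model = LinearRegressionModel.load(sc, "/models/linreg") // placeholder

      // Request handler: in-process linear algebra, no Spark job is submitted.
      def handlePredict(features: Array[Double]): Double =
        model.predict(Vectors.dense(features))
    }
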
From: Andy Davidson
Date: Tuesday, November 10, 2015 at 11:31 PM
To: "user @spark"
Subject: thought experiment: use spark ML to real time prediction
                                                                    
                                                                                
Let's say I have used Spark ML to train a linear model. I know I can save and
load the model to disk, but I am not sure how I can use the model in a
real-time environment. For example, I do not think I can easily return a
“prediction” to the client using Spark Streaming. Also, for some applications
the extra latency created by the batch process might not be acceptable.
If I were not using Spark, I would re-implement the model I trained in my
batch environment in a language like Java and implement a REST service that
uses the model to create a prediction and return it to the client. Many
models make predictions using linear algebra, and implementing prediction is
relatively easy if you have a good vectorized linear algebra package. Is
there a way to use a model I trained with Spark ML outside of Spark?
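
For illustration, a sketch of that hand-rolled route using Breeze (the
linear-algebra library Spark ML itself builds on), with coefficients copied
out of the trained Spark model; the numbers shown are placeholders:

    import breeze.linalg.DenseVector

    // A linear model's prediction is just a dot product plus the intercept.
    class ExportedLinearModel(weights: DenseVector[Double], intercept: Double) {
      def predict(features: DenseVector[Double]): Double =
        (weights dot features) + intercept
    }

    // e.g. inside a REST handler:
    // val model = new ExportedLinearModel(DenseVector(0.3, -1.2, 0.8), 0.05)
    // val score = model.predict(DenseVector(1.0, 2.0, 3.0))
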
As a motivating example: even if it is possible to return data to the client
using Spark Streaming, I think the mini-batch latency would not be acceptable
for a high-frequency stock trading system.

Kind regards,

Andy
P.S. The examples I have seen so far use Spark Streaming to “preprocess”
predictions. For example, a recommender system might use what current users
are watching to calculate “trending recommendations”. These are stored on
disk and served up to users when they use the “movie guide”. If a
recommendation were a couple of minutes old, it would not affect the end
user's experience.

--
Thanks & regards,
Nirmal

Team Lead - WSO2 Machine Learner
Associate Technical Lead - Data Technologies Team, WSO2 Inc.
Mobile: +94715779733
Blog: http://nirmalfdo.blogspot.com/
