[ 
https://issues.apache.org/jira/browse/SPARK-6345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiangrui Meng updated SPARK-6345:
---------------------------------
    Assignee: Jeremy Freeman

> Model update propagation during prediction in Streaming Regression
> ------------------------------------------------------------------
>
>                 Key: SPARK-6345
>                 URL: https://issues.apache.org/jira/browse/SPARK-6345
>             Project: Spark
>          Issue Type: Bug
>          Components: MLlib, Streaming
>            Reporter: Jeremy Freeman
>            Assignee: Jeremy Freeman
>
> In streaming regression (Streaming Linear Regression and Streaming Logistic 
> Regression), model updates from training data are not reflected in 
> subsequent calls to predictOn or predictOnValues, even though the updates 
> themselves succeed. This may be due to recent changes to how the model is 
> declared. I have a working fix, with expanded test coverage, ready to submit 
> ASAP.
> A temporary workaround is to retrieve the updated model inside a foreachRDD 
> and use it there, as in:
> {code}
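> // Register training first so the model is updated on each batch.
> model.trainOn(trainingData)
> testingData.foreachRDD { rdd =>
>     // Fetch the current model once per batch, then predict with it.
>     val latest = model.latestModel()
>     val predictions = rdd.map(lp => latest.predict(lp.features))
>     // ...print or other side effects...
> }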
> {code}
> Or within a transform, as in:
> {code}
> model.trainOn(trainingData)
> val predictions = testingData.transform { rdd =>
>     // Fetch the current model once per batch.
>     val latest = model.latestModel()
>     // Pair each true label with its prediction.
>     rdd.map(lp => (lp.label, latest.predict(lp.features)))
> }
> {code}
> Note that this does not affect Streaming KMeans, which works as expected for 
> combinations of training and prediction.
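> Both workarounds rely on the function passed to foreachRDD or transform being 
> run once per batch, so latestModel() is read after each training update. The 
> sketch below illustrates that difference in plain Scala without Spark. It is 
> not the MLlib implementation and is not a confirmed diagnosis of this bug; 
> ToyModel, ToyStreamingAlgorithm, snapshotPredictor and perCallPredictor are 
> hypothetical names used only for illustration:
> {code}
> // Hypothetical stand-in for a regression model (not an MLlib class).
> final case class ToyModel(weight: Double) {
>     def predict(x: Double): Double = weight * x
> }
>
> // Hypothetical stand-in for a streaming algorithm holding a mutable model.
> class ToyStreamingAlgorithm {
>     private var model: ToyModel = ToyModel(0.0)
>
>     // Each training step replaces the model reference.
>     def train(newWeight: Double): Unit = { model = ToyModel(newWeight) }
>
>     def latestModel(): ToyModel = model
>
>     // Captures whichever model exists when the predictor is built.
>     def snapshotPredictor(): Double => Double = {
>         val snapshot = model
>         x => snapshot.predict(x)
>     }
>
>     // Looks the model up again on every call.
>     def perCallPredictor(): Double => Double =
>         x => latestModel().predict(x)
> }
>
> val alg = new ToyStreamingAlgorithm
> val stale = alg.snapshotPredictor()
> val fresh = alg.perCallPredictor()
> alg.train(2.0)          // model updated after both predictors were built
> println(stale(3.0))     // 0.0: still uses the initial model
> println(fresh(3.0))     // 6.0: sees the updated model
> {code}
> In the workarounds above, calling latestModel() inside the per-batch function 
> plays the role of perCallPredictor.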



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
