[ https://issues.apache.org/jira/browse/SPARK-6345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Xiangrui Meng updated SPARK-6345:
---------------------------------
    Assignee: Jeremy Freeman

> Model update propagation during prediction in Streaming Regression
> ------------------------------------------------------------------
>
>                 Key: SPARK-6345
>                 URL: https://issues.apache.org/jira/browse/SPARK-6345
>             Project: Spark
>          Issue Type: Bug
>          Components: MLlib, Streaming
>            Reporter: Jeremy Freeman
>            Assignee: Jeremy Freeman
>
> During streaming regression analyses (Streaming Linear Regression and
> Streaming Logistic Regression), model updates based on training data are not
> being reflected in subsequent calls to predictOn or predictOnValues, even
> though the updates themselves occur successfully. It may be due to recent
> changes to model declaration, and I have a working fix prepared to be
> submitted ASAP (alongside expanded test coverage).
>
> A temporary workaround is to retrieve and use the updated model within a
> foreachRDD, as in:
> {code}
> model.trainOn(trainingData)
> testingData.foreachRDD { rdd =>
>   val latest = model.latestModel()
>   val predictions = rdd.map(lp => latest.predict(lp.features))
>   // ...print or other side effects...
> }
> {code}
> Or within a transform, as in:
> {code}
> model.trainOn(trainingData)
> val predictions = testingData.transform { rdd =>
>   val latest = model.latestModel()
>   rdd.map(lp => (lp.label, latest.predict(lp.features)))
> }
> {code}
>
> Note that this does not affect Streaming KMeans, which works as expected for
> combinations of training and prediction.
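
The workaround relies on the function passed to foreachRDD or transform being invoked once per batch, so model.latestModel() is re-read every time. A minimal plain-Scala sketch of that difference follows, with no Spark dependency. LinearModel, StreamingModel, and train are hypothetical stand-ins rather than MLlib classes, and the "snapshot" behaviour is only an assumed illustration of how a stale model could arise, not a confirmed diagnosis of this issue.

```scala
// Hypothetical stand-ins for illustration only; these are not Spark/MLlib classes.
final case class LinearModel(weight: Double) {
  def predict(x: Double): Double = weight * x
}

final class StreamingModel(initial: LinearModel) {
  private var current: LinearModel = initial

  // Returns whatever model is current at the moment of the call.
  def latestModel(): LinearModel = current

  // Stand-in for a training step that replaces the current model.
  def train(newWeight: Double): Unit = {
    current = LinearModel(newWeight)
  }
}

object StaleModelSketch {
  def main(args: Array[String]): Unit = {
    val model = new StreamingModel(LinearModel(0.0))

    // Model read once, when the prediction function is set up.
    val snapshot = model.latestModel()
    val predictWithSnapshot: Double => Double = x => snapshot.predict(x)

    // Model read on every call, mirroring the foreachRDD/transform workaround.
    val predictWithLatest: Double => Double = x => model.latestModel().predict(x)

    model.train(2.0) // a "training batch" arrives after setup

    println(predictWithSnapshot(3.0)) // still uses the initial weight 0.0
    println(predictWithLatest(3.0))   // uses the updated weight 2.0
  }
}
```

The second function corresponds to calling latestModel() inside the per-batch closure: the lookup is deferred until the data arrives, so any training that happened in between is visible.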