[ https://issues.apache.org/jira/browse/SPARK-9120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Alexander Ulanov updated SPARK-9120:
------------------------------------
    Description:
org.apache.spark.ml.regression.RegressionModel supports prediction only for a single variable, through the method "predict:Double" that it gets by extending the Predictor. There is a need for multivariate prediction, at least for regression. I propose to modify the "RegressionModel" interface similarly to how it is done in "ClassificationModel", which supports multiclass classification. It has "predict:Double" and "predictRaw:Vector". Analogously, "RegressionModel" should have something like "predictMultivariate:Vector".

Update: After reading the design docs, adding "predictMultivariate" to RegressionModel does not seem reasonable to me anymore. The issue is as follows. RegressionModel extends PredictionModel, which has "predict:Double". Its "train" method uses "predict:Double" for prediction, i.e. PredictionModel is hard-coded to have only one output. It is the same problem that I pointed out a long time ago in MLlib (

  was:
org.apache.spark.ml.regression.RegressionModel supports prediction only for a single variable with a method "predict:Double" by extending the Predictor. There is a need for multivariate prediction, at least for regression. I propose to modify "RegressionModel" interface similarly to how it is done in "ClassificationModel", which supports multiclass classification. It has "predict:Double" and "predictRaw:Vector". Analogously, "RegressionModel" should have something like "predictMultivariate:Vector".
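
As a rough illustration of the difference between a "predict:Double" contract and a vector-valued one, here is a minimal, self-contained Scala sketch. It does not use Spark, and every name in it (MultivariateSketch, Vec, SingleOutputModel, MultivariateModel, LinearMultivariateModel, component) is hypothetical: these are stand-ins invented for this example, not Spark classes and not a proposed patch. The only assumption taken from the description is that a single-output model exposes one Double per feature vector, while a multivariate model would expose a vector.

```scala
// Hypothetical stand-ins for illustration only; none of these are Spark APIs.
object MultivariateSketch {
  // Plain array used in place of Spark's Vector type to keep the sketch dependency-free.
  type Vec = Array[Double]

  // Mirrors the single-output contract described above: one Double per input.
  trait SingleOutputModel {
    def predict(features: Vec): Double
  }

  // Hypothetical vector-valued contract: one Double per output variable.
  trait MultivariateModel {
    def numOutputs: Int
    def predictMultivariate(features: Vec): Vec
  }

  // Toy linear model with one weight row and one intercept per output.
  final class LinearMultivariateModel(weights: Array[Vec], intercepts: Vec)
      extends MultivariateModel {
    require(weights.length == intercepts.length, "one intercept per output")

    val numOutputs: Int = weights.length

    def predictMultivariate(features: Vec): Vec =
      weights.zip(intercepts).map { case (w, b) =>
        require(w.length == features.length, "dimension mismatch")
        // Dot product of this output's weights with the features, plus intercept.
        w.zip(features).map { case (wi, xi) => wi * xi }.sum + b
      }
  }

  // Single-output view of one component of a multivariate model,
  // showing the scalar case as a special case of the vector-valued one.
  def component(model: MultivariateModel, index: Int): SingleOutputModel =
    new SingleOutputModel {
      def predict(features: Vec): Double =
        model.predictMultivariate(features)(index)
    }
}
```

In this sketch the scalar interface can be recovered from the vector-valued one (via component), but not the other way around, which is one way to read the concern that a base class fixed to a single Double output cannot be extended to the multivariate case.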
> Add multivariate regression (or prediction) interface
> -----------------------------------------------------
>
>                 Key: SPARK-9120
>                 URL: https://issues.apache.org/jira/browse/SPARK-9120
>             Project: Spark
>          Issue Type: Improvement
>          Components: ML
>    Affects Versions: 1.4.0
>            Reporter: Alexander Ulanov
>             Fix For: 1.4.0
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> org.apache.spark.ml.regression.RegressionModel supports prediction only for a
> single variable, through the method "predict:Double" that it gets by extending
> the Predictor. There is a need for multivariate prediction, at least for
> regression. I propose to modify the "RegressionModel" interface similarly to
> how it is done in "ClassificationModel", which supports multiclass
> classification. It has "predict:Double" and "predictRaw:Vector". Analogously,
> "RegressionModel" should have something like "predictMultivariate:Vector".
>
> Update: After reading the design docs, adding "predictMultivariate" to
> RegressionModel does not seem reasonable to me anymore. The issue is as
> follows. RegressionModel extends PredictionModel, which has "predict:Double".
> Its "train" method uses "predict:Double" for prediction, i.e. PredictionModel
> is hard-coded to have only one output. It is the same problem that I pointed
> out a long time ago in MLlib (