[ 
https://issues.apache.org/jira/browse/SPARK-9120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15383081#comment-15383081
 ] 

Ruben Janssen edited comment on SPARK-9120 at 7/18/16 9:18 PM:
---------------------------------------------------------------

Bumping this JIRA because the recent PR for 
https://issues.apache.org/jira/browse/SPARK-10409 triggered the same 
discussion. Given that SPARK-10409 is on the roadmap for 2.1 
(https://issues.apache.org/jira/browse/SPARK-5575), we should keep the 
discussion in one place, or at least link this JIRA to SPARK-10409. 

Regarding the update to the description, which states 'The issue is as follows. 
RegressionModel extends PredictionModel which has "predict:Double".': unless I 
am missing something, this seems to be out of date. ClassificationModel in 
ML seems to extend PredictionModel in the same way RegressionModel does. 
The initially proposed solution therefore seems sufficient if we want 
multivariate regression for all regression algorithms that implement the 
interface. I am not sure that is the case, though. If it is not, I think it 
would be best to create a separate interface that individual algorithms can 
implement (and, to keep things consistent, we can let ClassificationModel 
have it as well: we wouldn't really have to change any code).
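
To make the "separate interface" option concrete, here is a minimal sketch in plain Scala with no Spark dependency. All names in it (SimplePredictionModel, MultivariatePrediction, ToyMultiOutputLinearModel, numOutputs) are hypothetical stand-ins for illustration, not existing Spark classes, and Array[Double] is assumed in place of Spark's Vector type.

```scala
object MultivariateInterfaceSketch {
  // Assumption: a plain array stands in for Spark ML's Vector type.
  type Vec = Array[Double]

  // Hypothetical simplification of PredictionModel: one Double per input.
  trait SimplePredictionModel {
    def predict(features: Vec): Double
  }

  // Hypothetical separate interface that individual algorithms could opt into.
  trait MultivariatePrediction {
    def numOutputs: Int
    def predictMultivariate(features: Vec): Vec
  }

  // Toy linear model with one weight row per output variable.
  class ToyMultiOutputLinearModel(weights: Array[Vec])
      extends SimplePredictionModel with MultivariatePrediction {
    require(weights.nonEmpty, "need at least one output")

    private def dot(a: Vec, b: Vec): Double =
      a.zip(b).map { case (x, y) => x * y }.sum

    override def numOutputs: Int = weights.length

    // One dot product per output row.
    override def predictMultivariate(features: Vec): Vec =
      weights.map(row => dot(row, features))

    // The single-output view simply exposes the first output.
    override def predict(features: Vec): Double =
      predictMultivariate(features).head
  }
}
```

In this sketch the existing single-output contract is untouched, and only models that mix in the extra trait expose a vector-valued prediction.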



> Add multivariate regression (or prediction) interface
> -----------------------------------------------------
>
>                 Key: SPARK-9120
>                 URL: https://issues.apache.org/jira/browse/SPARK-9120
>             Project: Spark
>          Issue Type: Improvement
>          Components: ML
>    Affects Versions: 1.4.0
>            Reporter: Alexander Ulanov
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> org.apache.spark.ml.regression.RegressionModel supports prediction only for a 
> single variable with a method "predict:Double" by extending the Predictor. 
> There is a need for multivariate prediction, at least for regression. I 
> propose to modify the "RegressionModel" interface similarly to how it is done 
> in "ClassificationModel", which supports multiclass classification. It has 
> "predict:Double" and "predictRaw:Vector". Analogously, "RegressionModel" 
> should have something like "predictMultivariate:Vector".
> Update: After reading the design docs, adding "predictMultivariate" to 
> RegressionModel does not seem reasonable to me anymore. The issue is as 
> follows. RegressionModel extends PredictionModel which has "predict:Double". 
> Its "train" method uses "predict:Double" for prediction, i.e. PredictionModel 
> (and RegressionModel) is hard-coded to have only one output. There exists a 
> similar problem in MLlib (https://issues.apache.org/jira/browse/SPARK-5362). 
> A possible solution to this problem might require redesigning the class 
> hierarchy or adding a separate interface that extends Model, though the 
> latter means code duplication.
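
The single-output constraint described in the issue can be illustrated with a small plain-Scala sketch. The classes below (SimplePredictionModel, SimpleClassificationModel, ToyClassifier) and the predictAll helper are hypothetical simplifications, not the actual Spark ML classes, and Array[Double] is assumed in place of Spark's Vector.

```scala
object SingleOutputSketch {
  // Assumption: a plain array stands in for Spark ML's Vector type.
  type Vec = Array[Double]

  // Hypothetical base class: the output type is fixed to Double.
  abstract class SimplePredictionModel {
    def predict(features: Vec): Double

    // Anything built on predict inherits the single-Double output.
    def predictAll(rows: Seq[Vec]): Seq[Double] = rows.map(predict)
  }

  // Classification adds a vector-valued method next to the scalar one.
  abstract class SimpleClassificationModel extends SimplePredictionModel {
    def predictRaw(features: Vec): Vec

    // The scalar prediction is the index of the largest raw score.
    override def predict(features: Vec): Double = {
      val raw = predictRaw(features)
      raw.indexOf(raw.max).toDouble
    }
  }

  // Toy classifier: the raw score of each class is a dot product.
  class ToyClassifier(weights: Array[Vec]) extends SimpleClassificationModel {
    require(weights.nonEmpty, "need at least one class")

    override def predictRaw(features: Vec): Vec =
      weights.map(row => row.zip(features).map { case (w, x) => w * x }.sum)
  }
}
```

Here the vector-valued predictRaw exists alongside predict, but predictAll still returns one Double per row, which is the kind of hard-coding a multivariate regression model would run into.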


