[ 
https://issues.apache.org/jira/browse/FLINK-2116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14597836#comment-14597836
 ] 

Sachin Goel commented on FLINK-2116:
------------------------------------

It would make sense to have three functions:

predict for Vectors: returns LabeledVector

test for LabeledVector: returns (predicted_label, original vector). The user is 
then free to run any evaluation framework on the output set to compute 
whatever metrics they want.

evaluate for LabeledVector: prints/returns some scores that we choose for each 
algorithm, for example R-squared for regression, or accuracy, recall and 
F-score for an SVM. There will be no code duplication, because evaluate simply 
calls test first and then uses the output data set for the evaluation. 
This needs to live in the Predictor because we have to evaluate the 
validation sets for model selection during cross-validation.
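
As a rough illustration of how the three functions could fit together, here is 
a minimal sketch. It is not the FlinkML API: Vec, LabeledVec, predictLabel, 
score and SignClassifier are hypothetical names, and a plain Seq stands in for 
DataSet so the snippet runs without a Flink dependency. It also assumes that 
"original vector" in test means the original labeled vector.

```scala
// Hypothetical stand-ins for the FlinkML vector types (assumption).
case class Vec(values: Seq[Double])
case class LabeledVec(label: Double, vector: Vec)

trait Predictor {
  // Model-specific: predicted label for a single vector.
  protected def predictLabel(v: Vec): Double

  // Model-specific: scores computed from (predicted label, original) pairs.
  protected def score(tested: Seq[(Double, LabeledVec)]): Map[String, Double]

  // predict: unlabeled vectors in, labeled vectors out.
  def predict(data: Seq[Vec]): Seq[LabeledVec] =
    data.map(v => LabeledVec(predictLabel(v), v))

  // test: pairs each predicted label with the original labeled vector,
  // so users can compute any metric they like on the result.
  def test(data: Seq[LabeledVec]): Seq[(Double, LabeledVec)] =
    data.map(lv => (predictLabel(lv.vector), lv))

  // evaluate: calls test first and scores its output, so nothing is duplicated.
  def evaluate(data: Seq[LabeledVec]): Map[String, Double] =
    score(test(data))
}

// Toy classifier used only for illustration: the label is the sign of the
// first component (an empty vector is labeled -1.0).
class SignClassifier extends Predictor {
  protected def predictLabel(v: Vec): Double =
    if (v.values.headOption.exists(_ >= 0.0)) 1.0 else -1.0

  // Accuracy: fraction of predictions equal to the true label.
  protected def score(tested: Seq[(Double, LabeledVec)]): Map[String, Double] = {
    val correct = tested.count { case (predicted, lv) => predicted == lv.label }
    Map("accuracy" -> (if (tested.isEmpty) 0.0 else correct.toDouble / tested.size))
  }
}
```

With this shape, cross-validation would only need to call evaluate on each 
validation fold, while users who want other metrics can work from the output 
of test directly.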

> Make pipeline extension require less coding
> -------------------------------------------
>
>                 Key: FLINK-2116
>                 URL: https://issues.apache.org/jira/browse/FLINK-2116
>             Project: Flink
>          Issue Type: Improvement
>          Components: Machine Learning Library
>            Reporter: Mikio Braun
>            Assignee: Till Rohrmann
>            Priority: Minor
>
> Right now, implementing pipeline methods for new types, or even adding new 
> methods to pipelines, requires many steps:
> 1) implementing methods for new types
>   - implementing an implicit of the corresponding class encapsulating the 
>     operation in the companion object
> 2) adding methods to the pipeline
>   - adding a method
>   - adding a trait for the operation
>   - implementing an implicit in the companion object
> These are all objects which contain many generic parameters, so reducing the 
> work would be great.
> The goal should be that you can really focus on the code to add, and have as 
> little boilerplate code as possible.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
