[ https://issues.apache.org/jira/browse/SPARK-7674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14596672#comment-14596672 ]

Joseph K. Bradley commented on SPARK-7674:
------------------------------------------

I created initial subtasks for LinearRegression.  However, if someone wants to 
work on another model type, we can create analogous JIRAs for those.  Some 
high-priority models include:
* Linear regression
* Logistic regression
* Decision tree
* K-Means clustering
* Random Forest
* Gradient-Boosted Trees

> R-like stats for ML models
> --------------------------
>
>                 Key: SPARK-7674
>                 URL: https://issues.apache.org/jira/browse/SPARK-7674
>             Project: Spark
>          Issue Type: New Feature
>          Components: ML
>            Reporter: Joseph K. Bradley
>            Assignee: Joseph K. Bradley
>            Priority: Critical
>
> This is an umbrella JIRA for supporting ML model summaries and statistics, 
> following the example of R's summary() and plot() functions.
> [Design doc|https://docs.google.com/document/d/1oswC_Neqlqn5ElPwodlDY4IkSaHAi0Bx6Guo_LvhHK8/edit?usp=sharing]
> From the design doc:
> {quote}
> R and its well-established packages provide extensive functionality for 
> inspecting a model and its results.  This inspection is critical to 
> interpreting, debugging and improving models.
> R is arguably a gold standard for a statistics/ML library, so this doc 
> largely attempts to imitate it.  The challenge we face is supporting similar 
> functionality, but on big (distributed) data.  Data size makes both efficient 
> computation and meaningful displays/summaries difficult.
> R model and result summaries generally take 2 forms:
> * summary(model): Display text with information about the model and results 
> on data
> * plot(model): Display plots about the model and results
> We aim to provide both of these types of information.  Visualization for the 
> plottable results will not be supported in MLlib itself, but we can provide 
> results in a form which can be plotted easily with other tools.
> {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
