[ https://issues.apache.org/jira/browse/SPARK-14567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joseph K. Bradley resolved SPARK-14567.
---------------------------------------
       Resolution: Fixed
    Fix Version/s: 2.2.0

> Add instrumentation logs to MLlib training algorithms
> -----------------------------------------------------
>
>                 Key: SPARK-14567
>                 URL: https://issues.apache.org/jira/browse/SPARK-14567
>             Project: Spark
>          Issue Type: Umbrella
>          Components: ML, MLlib
>            Reporter: Timothy Hunter
>            Assignee: Timothy Hunter
>             Fix For: 2.2.0
>
>
> To debug performance issues when training MLlib algorithms, it is useful to
> log some metrics about the training dataset, the training parameters, etc.
> This ticket is an umbrella for adding simple logging messages to the most
> common MLlib estimators. There should be no performance impact on the current
> implementation, and the output is simply printed in the logs.
> The following values are of interest when debugging training tasks:
> * number of features
> * number of instances
> * number of partitions
> * number of classes
> * input RDD/DF cache level
> * hyper-parameters
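
As a rough illustration of the kind of log line the list above describes, here is a minimal sketch in plain Python using only the standard library. It is not Spark's instrumentation code: the logger name, the function names (format_training_metadata, log_training_metadata), and the JSON field names are all hypothetical, and the values are assumed to be already known to the caller so that logging adds no extra pass over the data.

```python
import json
import logging

# Hypothetical logger name, chosen only for this sketch.
logger = logging.getLogger("training.instrumentation")


def format_training_metadata(estimator, num_features, num_instances,
                             num_partitions, num_classes, storage_level,
                             params):
    """Build one JSON line describing a training run (illustrative only)."""
    record = {
        "estimator": estimator,
        "numFeatures": num_features,
        "numInstances": num_instances,
        "numPartitions": num_partitions,
        "numClasses": num_classes,
        "storageLevel": storage_level,
        "params": params,
    }
    # sort_keys gives a stable field order, which makes log lines easy to diff.
    return json.dumps(record, sort_keys=True)


def log_training_metadata(**metadata):
    """Log the metadata at INFO level and return the formatted line."""
    line = format_training_metadata(**metadata)
    logger.info(line)
    return line
```

A caller would pass values it already has at hand (for example, a partition count read from the input, or the hyper-parameters set on the estimator) rather than computing them only for logging, since counting instances on a distributed dataset can itself require a pass over the data.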



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
