[ https://issues.apache.org/jira/browse/SPARK-14567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Joseph K. Bradley updated SPARK-14567:
--------------------------------------
    Assignee: Timothy Hunter

> Add instrumentation logs to MLlib training algorithms
> -----------------------------------------------------
>
>                 Key: SPARK-14567
>                 URL: https://issues.apache.org/jira/browse/SPARK-14567
>             Project: Spark
>          Issue Type: Umbrella
>          Components: ML, MLlib
>            Reporter: Timothy Hunter
>            Assignee: Timothy Hunter
>
> In order to debug performance issues when training mllib algorithms,
> it is useful to log some metrics about the training dataset, the training
> parameters, etc.
> This ticket is an umbrella to add some simple logging messages to the most
> common MLlib estimators. There should be no performance impact on the current
> implementation, and the output is simply printed in the logs.
> Here are some values that are of interest when debugging training tasks:
> * number of features
> * number of instances
> * number of partitions
> * number of classes
> * input RDD/DF cache level
> * hyper-parameters
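
To illustrate the kind of log line the ticket describes, here is a minimal sketch in plain Python. It is not Spark or MLlib code: the function name `summarize_training_input`, the logger name, and the representation of the dataset as a list of partitions holding `(label, features)` pairs are all assumptions made for illustration. The cache level and hyper-parameters are simply passed in by the caller.

```python
import logging
from typing import Dict, List, Sequence, Tuple

# Hypothetical logger name; any logger would do.
logger = logging.getLogger("training.instrumentation")

# One training instance: (label, feature vector). Illustrative type only.
Instance = Tuple[float, Sequence[float]]


def summarize_training_input(
    partitions: List[List[Instance]],
    cache_level: str,
    params: Dict[str, object],
) -> Dict[str, object]:
    """Gather the values listed in the ticket and log them once at INFO level."""
    # Flatten the partitions to inspect individual instances.
    instances = [inst for part in partitions for inst in part]
    # Assume every instance has the same number of features as the first one.
    num_features = len(instances[0][1]) if instances else 0
    # Count distinct labels as the number of classes.
    num_classes = len({label for label, _ in instances})

    summary = {
        "numFeatures": num_features,
        "numInstances": len(instances),
        "numPartitions": len(partitions),
        "numClasses": num_classes,
        "cacheLevel": cache_level,
        "params": dict(params),
    }
    logger.info("training instrumentation: %s", summary)
    return summary
```

Note that this sketch scans the data to count instances and classes. To meet the ticket's "no performance impact" goal, a real estimator would presumably reuse values it already computes during training rather than make an extra pass.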