Repository: spark
Updated Branches:
  refs/heads/master c8b7f97b8 -> d8741b2b0
[SPARK-21911][ML][FOLLOW-UP] Fix doc for parallel ML Tuning in PySpark

## What changes were proposed in this pull request?

Fix doc issue mentioned here:
https://github.com/apache/spark/pull/19122#issuecomment-340111834

## How was this patch tested?

N/A

Author: WeichenXu <weichen...@databricks.com>

Closes #19641 from WeichenXu123/fix_doc.

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/d8741b2b
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/d8741b2b
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/d8741b2b

Branch: refs/heads/master
Commit: d8741b2b0fe8b8da74f120859e969326fb170629
Parents: c8b7f97
Author: WeichenXu <weichen...@databricks.com>
Authored: Mon Nov 13 17:00:51 2017 -0800
Committer: Joseph K. Bradley <jos...@databricks.com>
Committed: Mon Nov 13 17:00:51 2017 -0800

----------------------------------------------------------------------
 docs/ml-tuning.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/spark/blob/d8741b2b/docs/ml-tuning.md
----------------------------------------------------------------------
diff --git a/docs/ml-tuning.md b/docs/ml-tuning.md
index 64dc46c..54d9cd2 100644
--- a/docs/ml-tuning.md
+++ b/docs/ml-tuning.md
@@ -55,7 +55,7 @@ for multiclass problems. The default metric used to choose the best `ParamMap` c
 method in each of these evaluators.
 To help construct the parameter grid, users can use the [`ParamGridBuilder`](api/scala/index.html#org.apache.spark.ml.tuning.ParamGridBuilder) utility.
-By default, sets of parameters from the parameter grid are evaluated in serial. Parameter evaluation can be done in parallel by setting `parallelism` with a value of 2 or more (a value of 1 will be serial) before running model selection with `CrossValidator` or `TrainValidationSplit` (NOTE: this is not yet supported in Python).
+By default, sets of parameters from the parameter grid are evaluated in serial. Parameter evaluation can be done in parallel by setting `parallelism` with a value of 2 or more (a value of 1 will be serial) before running model selection with `CrossValidator` or `TrainValidationSplit`. The value of `parallelism` should be chosen carefully to maximize parallelism without exceeding cluster resources, and larger values may not always lead to improved performance. Generally speaking, a value up to 10 should be sufficient for most clusters.

 # Cross-Validation

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org