Repository: spark
Updated Branches:
  refs/heads/master e04811137 -> a76bde9da

[SPARK-10469] [DOC] Try and document the three options

From JIRA:
Add documentation for tungsten-sort.
From the mailing list "I saw a new "spark.shuffle.manager=tungsten-sort"
implemented in https://issues.apache.org/jira/browse/SPARK-7081, but it can't be found its
corresponding description in
http://people.apache.org/~pwendell/spark-releases/spark-1.5.0-rc3-docs/configuration.html
(Currently there are only 'sort' and 'hash' two options)."

Author: Holden Karau <hol...@pigscanfly.ca>

Closes #8638 from holdenk/SPARK-10469-document-tungsten-sort.

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/a76bde9d
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/a76bde9d
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/a76bde9d

Branch: refs/heads/master
Commit: a76bde9dae54c4641e21f3c1ceb4870e3dc91881
Parents: e048111
Author: Holden Karau <hol...@pigscanfly.ca>
Authored: Thu Sep 10 11:49:53 2015 -0700
Committer: Andrew Or <and...@databricks.com>
Committed: Thu Sep 10 11:49:53 2015 -0700

----------------------------------------------------------------------
 docs/configuration.md | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/spark/blob/a76bde9d/docs/configuration.md
----------------------------------------------------------------------
diff --git a/docs/configuration.md b/docs/configuration.md
index e287591..0b1a273 100644
--- a/docs/configuration.md
+++ b/docs/configuration.md
@@ -447,9 +447,12 @@ Apart from these, the following properties are also available, and may be useful
   <td><code>spark.shuffle.manager</code></td>
   <td>sort</td>
   <td>
-    Implementation to use for shuffling data. There are two implementations available:
-    <code>sort</code> and <code>hash</code>. Sort-based shuffle is more memory-efficient and is
-    the default option starting in 1.2.
+    Implementation to use for shuffling data. There are three implementations available:
+    <code>sort</code>, <code>hash</code> and the new (1.5+) <code>tungsten-sort</code>.
+    Sort-based shuffle is more memory-efficient and is the default option starting in 1.2.
+    Tungsten-sort is similar to the sort based shuffle, with a direct binary cache-friendly
+    implementation with a fall back to regular sort based shuffle if its requirements are not
+    met.
   </td>
 </tr>
 <tr>
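
For readers who want to try the option documented above, here is a minimal sketch of passing it on the command line. It assumes the property is supplied through spark-submit's `--conf` flag. The helper `shuffle_manager_conf`, the constant `SHUFFLE_MANAGERS`, and the application name `app.py` are hypothetical names used only for illustration and are not part of Spark. The check accepts only the three short names listed in the doc change.

```python
# Hypothetical helper, not part of Spark: builds the spark-submit arguments
# that select one of the shuffle managers named in the documentation change.
SHUFFLE_MANAGERS = ("sort", "hash", "tungsten-sort")  # short names from the doc text


def shuffle_manager_conf(manager="sort"):
    """Return the `--conf` arguments that set spark.shuffle.manager.

    Defaults to "sort", which the docs describe as the default since 1.2.
    """
    if manager not in SHUFFLE_MANAGERS:
        raise ValueError(
            "unknown shuffle manager %r; expected one of: %s"
            % (manager, ", ".join(SHUFFLE_MANAGERS))
        )
    return ["--conf", "spark.shuffle.manager=" + manager]


if __name__ == "__main__":
    # app.py is a placeholder for the application being submitted.
    command = ["spark-submit"] + shuffle_manager_conf("tungsten-sort") + ["app.py"]
    print(" ".join(command))
    # prints: spark-submit --conf spark.shuffle.manager=tungsten-sort app.py
```

Per the doc text, tungsten-sort falls back to the regular sort-based shuffle when its requirements are not met, so the setting is a request rather than a guarantee.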