Repository: spark
Updated Branches:
  refs/heads/master ef48222c1 -> 96f28c972

[SPARK-2522] set default broadcast factory to torrent

HttpBroadcastFactory is the current default broadcast factory. It sends the broadcast data to each worker one by one, which is slow when the cluster is big. TorrentBroadcastFactory scales much better than http. Maybe we should make torrent the default broadcast method.

Author: Xiangrui Meng <m...@databricks.com>

Closes #1437 from mengxr/bt-broadcast and squashes the following commits:

ed492fe [Xiangrui Meng] set default broadcast factory to torrent


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/96f28c97
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/96f28c97
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/96f28c97

Branch: refs/heads/master
Commit: 96f28c9726d18f3b0d7a57b128c16ec9157f1532
Parents: ef48222
Author: Xiangrui Meng <m...@databricks.com>
Authored: Wed Jul 16 11:27:51 2014 -0700
Committer: Reynold Xin <r...@apache.org>
Committed: Wed Jul 16 11:27:51 2014 -0700

----------------------------------------------------------------------
 .../main/scala/org/apache/spark/broadcast/BroadcastManager.scala | 2 +-
 docs/configuration.md                                            | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/96f28c97/core/src/main/scala/org/apache/spark/broadcast/BroadcastManager.scala
----------------------------------------------------------------------
diff --git a/core/src/main/scala/org/apache/spark/broadcast/BroadcastManager.scala b/core/src/main/scala/org/apache/spark/broadcast/BroadcastManager.scala
index c88be6a..8f8a0b1 100644
--- a/core/src/main/scala/org/apache/spark/broadcast/BroadcastManager.scala
+++ b/core/src/main/scala/org/apache/spark/broadcast/BroadcastManager.scala
@@ -39,7 +39,7 @@ private[spark] class BroadcastManager(
     synchronized {
       if (!initialized) {
         val broadcastFactoryClass =
-          conf.get("spark.broadcast.factory", "org.apache.spark.broadcast.HttpBroadcastFactory")
+          conf.get("spark.broadcast.factory", "org.apache.spark.broadcast.TorrentBroadcastFactory")
 
         broadcastFactory =
           Class.forName(broadcastFactoryClass).newInstance.asInstanceOf[BroadcastFactory]

http://git-wip-us.apache.org/repos/asf/spark/blob/96f28c97/docs/configuration.md
----------------------------------------------------------------------
diff --git a/docs/configuration.md b/docs/configuration.md
index 9d3fe74..a70007c 100644
--- a/docs/configuration.md
+++ b/docs/configuration.md
@@ -419,7 +419,7 @@ Apart from these, the following properties are also available, and may be useful
   </tr>
 <tr>
   <td><code>spark.broadcast.factory</code></td>
-  <td>org.apache.spark.broadcast.<br />HttpBroadcastFactory</td>
+  <td>org.apache.spark.broadcast.<br />TorrentBroadcastFactory</td>
   <td>
     Which broadcast implementation to use.
   </td>
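
Note for readers of this change: the patch only swaps the fallback value passed to conf.get, so an explicitly set spark.broadcast.factory should still take precedence. The following is a minimal sketch of that lookup, not the Spark implementation. SketchConf and BroadcastFactoryDefault are hypothetical names introduced here so the example runs without Spark on the classpath; only the property key and the two factory class names come from the diff.

```scala
// Sketch only: SketchConf is a hypothetical stand-in for SparkConf, kept
// dependency-free so the example runs on its own.
final case class SketchConf(settings: Map[String, String] = Map.empty) {
  // Same shape as the conf.get(key, default) call shown in the diff.
  def get(key: String, default: String): String =
    settings.getOrElse(key, default)
}

// Hypothetical helper object; the key and class names are taken from the diff.
object BroadcastFactoryDefault {
  val Key = "spark.broadcast.factory"
  val Torrent = "org.apache.spark.broadcast.TorrentBroadcastFactory"
  val Http = "org.apache.spark.broadcast.HttpBroadcastFactory"

  // After this commit the fallback is the torrent factory.
  def factoryClassName(conf: SketchConf): String = conf.get(Key, Torrent)

  def main(args: Array[String]): Unit = {
    // Property unset: the new default applies.
    println(factoryClassName(SketchConf()))
    // Property set explicitly: the previous HTTP behaviour is kept.
    println(factoryClassName(SketchConf(Map(Key -> Http))))
  }
}
```

In an actual application, the equivalent override would presumably be made with SparkConf.set("spark.broadcast.factory", ...) before the SparkContext is created, assuming the HTTP factory class is still present in the Spark version in use.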