I'm looking for information about how broadcast variables scale in
Spark and which algorithm they use.

I have found this report:
http://www.cs.berkeley.edu/~agearh/cs267.sp10/files/mosharaf-spark-bc-report-spring10.pdf
I don't know whether it describes the current version (1.2.1),
because the file was created in 2010.
I took a look at the documentation and API, and I read that there is a
TorrentBroadcastFactory for broadcast variables. Is that the one Spark
uses right now? The article says that Spark uses a different one
(Centralized HDFS Broadcast).

How does the current algorithm scale on a big cluster (about 300
nodes)? Is it linear? Are there options for choosing a different
algorithm?
