Say I have a main method with the following pseudo-code (to be run on a Spark standalone cluster):

    main(args) {
      RDD rdd
      rdd1 = rdd.map(...)
      // some other statements not using the RDD
      rdd2 = rdd.filter(...)
    }
When executed, will each of the two RDD transformations (map and filter) be individually partitioned and distributed across the available cluster nodes? And will the statements not involving RDDs (or DataFrames) typically be executed on the driver? Is that how Spark takes advantage of the cluster?