Hi,
I am running Spark applications on GCE. I set up clusters with the number of
nodes varying from 1 to 7; the machines are single-core. For each cluster I
set spark.default.parallelism to the number of nodes. I then ran the four
applications from the Spark examples, SparkTC, SparkALS, SparkLR, and
SparkPi, on each configuration.
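For reference, I submitted each job roughly like this (a sketch; the master
URL, jar path, and parallelism value are placeholders for my actual setup):

    ./bin/spark-submit \
      --class org.apache.spark.examples.SparkPi \
      --master spark://<master-host>:7077 \
      --conf spark.default.parallelism=<number-of-nodes> \
      <path-to-spark-examples-assembly.jar>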
What I notice is the following:
For SparkTC and SparkALS, the time to complete the job increases as the
number of nodes in the cluster grows, whereas for SparkLR and SparkPi the
completion time stays roughly the same across all configurations.
Could anyone explain this to me?

Thank you.
Regards,
Deep
