Hi all,

I have a three-node cluster with identical hardware. I am trying a workflow
that reads data from HDFS, repartitions it, runs a few map operations, and
then writes the results back to HDFS.
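
The workflow described above has roughly the following shape. This is only
an illustrative PySpark sketch, not the actual job: the function name
run_job, the paths, the partition count, and the two map functions are all
placeholders, and sc is assumed to be an already created SparkContext.

```python
def run_job(sc, input_path, output_path, num_partitions):
    """Read from HDFS, repartition, apply a few maps, write back to HDFS.

    `sc` is assumed to be an existing pyspark.SparkContext. The map
    functions below are placeholders for the real per-record work.
    """
    lines = sc.textFile(input_path)                   # read text data from HDFS
    spread = lines.repartition(num_partitions)        # shuffle records across the cluster
    cleaned = spread.map(lambda line: line.strip())   # placeholder map step
    result = cleaned.map(lambda line: line.upper())   # placeholder map step
    result.saveAsTextFile(output_path)                # each task writes its own part file
    return result
```

It would be called as, for example,
run_job(sc, "hdfs:///path/to/input", "hdfs:///path/to/output", 12), where
both paths and the partition count are hypothetical. The saveAsTextFile
call is the step where the timing difference shows up.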

It looks like all the computation, including the repartitioning and the
maps, completes within similar time intervals on all the nodes. The exception
is the write back to HDFS, where the master node does the job much faster
than the slaves (15 s for each block, as opposed to 1.2 min on the slaves).

Any suggestions as to what the reason might be?

Thanks,



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/writing-to-hdfs-on-master-node-much-faster-tp22570.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
