> The best network results are achieved when Spark nodes share the same
> hosts as Hadoop or they happen to be on the same subnet.

That's only true for those portions of a Spark execution pipeline that are actually reading from HDFS. If you're re-using an RDD for which the needed shuffle files are already available on Executor nodes, or are looking at stages of a Spark SQL query execution later than those reading from HDFS, then data locality and network utilization concerns have nothing to do with co-location of Executors and HDFS data nodes.

On Fri, Sep 23, 2016 at 1:31 PM, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:

> Does this assume that Spark is running on the same hosts as HDFS? Hence
> does increasing the latency affect the network latency on Hadoop nodes as
> well in your tests?
>
> The best network results are achieved when Spark nodes share the same
> hosts as Hadoop or they happen to be on the same subnet.
>
> HTH
>
> Dr Mich Talebzadeh
>
> LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>
> http://talebzadehmich.wordpress.com
>
> *Disclaimer:* Use it at your own risk. Any and all responsibility for any
> loss, damage or destruction of data or any other property which may arise
> from relying on this email's technical content is explicitly disclaimed.
> The author will in no case be liable for any monetary damages arising from
> such loss, damage or destruction.
>
> On 22 September 2016 at 14:54, gusiri <dreame...@gmail.com> wrote:
>
>> Hi,
>>
>> When I increase the network latency among spark nodes,
>> I see compute time (=executor computing time in Spark Web UI) also
>> increases.
>>
>> In the graph attached, left = latency 1ms vs right = latency 500ms.
>>
>> Is there any communication between worker and driver/master even 'during'
>> executor computing? or any idea on this result?
>>
>> <http://apache-spark-user-list.1001560.n3.nabble.com/file/n27779/Screen_Shot_2016-09-21_at_5.png>
>>
>> Thank you very much in advance.
>>
>> //gusiri
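
A rough sketch of the distinction drawn at the top of this reply, for anyone who wants it spelled out. This is a toy model in plain Python, not Spark's scheduler: the `Stage` class, the `input_source` labels, the host names and the `rack_of` map are all hypothetical. Only the locality level names (NODE_LOCAL, RACK_LOCAL, ANY) are borrowed from Spark.

```python
# Toy model only: illustrates which stages can benefit from placing
# Executors on HDFS data nodes. It does not call or reproduce Spark.
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    name: str
    input_source: str  # hypothetical labels: "hdfs" or "shuffle"


def benefits_from_hdfs_colocation(stage: Stage) -> bool:
    """Only stages that read HDFS blocks care where the data nodes are.

    Stages that read shuffle files fetch them from Executor nodes,
    so HDFS block placement is irrelevant to them.
    """
    return stage.input_source == "hdfs"


def hdfs_locality(executor_host: str, block_hosts: set, rack_of: dict) -> str:
    """Classify one HDFS block read, using Spark's locality level names."""
    if executor_host in block_hosts:
        return "NODE_LOCAL"  # Executor runs on a host holding the block
    rack = rack_of.get(executor_host)
    if rack is not None and any(rack_of.get(h) == rack for h in block_hosts):
        return "RACK_LOCAL"  # same rack as a replica, different host
    return "ANY"             # block must cross racks


# Hypothetical three-stage pipeline: one HDFS scan, then shuffle-fed stages.
pipeline = [
    Stage("scan", "hdfs"),
    Stage("join", "shuffle"),
    Stage("aggregate", "shuffle"),
]
colocation_sensitive = [s.name for s in pipeline if benefits_from_hdfs_colocation(s)]
print(colocation_sensitive)
```

In this model only the "scan" stage is sensitive to where the HDFS data nodes sit. The later stages move data between Executors, so network latency among Spark nodes still matters for them, but co-location with HDFS does not.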