>
> The best network results are achieved when Spark nodes share the same
> hosts as Hadoop or they happen to be on the same subnet.
>

That's only true for the portions of a Spark execution pipeline that
actually read from HDFS.  If you're re-using an RDD whose needed shuffle
files are already available on the Executor nodes, or you're looking at
stages of a Spark SQL query that run after the ones reading from HDFS,
then data locality and network utilization have little to do with
co-location of Executors and HDFS data nodes.
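
To make the distinction concrete, here is a rough sketch in plain Python.
It is a toy model, not Spark code: the Stage class, the "hdfs"/"shuffle"
labels and the stages_reading_hdfs helper are all names I made up for
illustration. It only shows which stages of a job would touch HDFS, and so
which ones could benefit from Executors sitting next to the data nodes.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Stage:
    """Toy stand-in for a Spark stage (hypothetical, not a Spark class)."""
    name: str
    input_source: str                       # "hdfs" or "shuffle" (illustrative labels)
    shuffle_output_available: bool = False  # map output already on Executor disks


def stages_reading_hdfs(stages: List[Stage]) -> List[str]:
    """Return the names of stages whose tasks would actually read HDFS blocks."""
    names = []
    for stage in stages:
        if stage.shuffle_output_available:
            # The stage's output can be reused, so it is skipped and HDFS is not read.
            continue
        if stage.input_source == "hdfs":
            names.append(stage.name)
    return names


# First run: only the scan stage reads from HDFS.
first_run = [
    Stage("scan", "hdfs"),
    Stage("aggregate", "shuffle"),
    Stage("join", "shuffle"),
]

# Re-use of the same RDD: the scan stage's shuffle files already exist.
rerun = [
    Stage("scan", "hdfs", shuffle_output_available=True),
    Stage("aggregate", "shuffle"),
    Stage("join", "shuffle"),
]

print(stages_reading_hdfs(first_run))  # ['scan']
print(stages_reading_hdfs(rerun))      # []
```

In this model, co-location with HDFS matters only for the stages that are
returned; the others fetch their input from other Executors.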

On Fri, Sep 23, 2016 at 1:31 PM, Mich Talebzadeh <mich.talebza...@gmail.com>
wrote:

> Does this assume that Spark is running on the same hosts as HDFS? If so,
> does increasing the latency also affect the network latency on the Hadoop
> nodes in your tests?
>
> The best network results are achieved when Spark nodes share the same
> hosts as Hadoop or they happen to be on the same subnet.
>
>
> HTH
>
>
> Dr Mich Talebzadeh
>
> On 22 September 2016 at 14:54, gusiri <dreame...@gmail.com> wrote:
>
>> Hi,
>>
>> When I increase the network latency among Spark nodes, I see that the
>> compute time (= executor computing time in the Spark Web UI) also
>> increases.
>>
>> In the graph attached, left = latency 1ms vs right = latency 500ms.
>>
>> Is there any communication between the workers and the driver/master even
>> 'during' executor computing? Or does anyone have another explanation for
>> this result?
>>
>>
>> <http://apache-spark-user-list.1001560.n3.nabble.com/file/n27779/Screen_Shot_2016-09-21_at_5.png>
>>
>>
>>
>>
>>
>> Thank you very much in advance.
>>
>> //gusiri
>
