[jira] [Commented] (IMPALA-6088) Rack aware broadcast operator

2020-12-23 Thread Tim Armstrong (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-6088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17254181#comment-17254181
 ] 

Tim Armstrong commented on IMPALA-6088:
---

Note that we could either rely on a static configuration or try to decide 
this dynamically, using stats like those from IMPALA-9235.
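
As a rough illustration of the static-configuration option, the two-level 
broadcast described in the issue could be planned as below. This is only a 
sketch and not Impala code: the host names, the RACK_OF map, and the 
plan_broadcast and cross_rack_sends helpers are all made up for the example, 
and the relay on each rack is simply the first host in sorted order.

```python
from collections import defaultdict

# Hypothetical static configuration: host name -> rack name.
RACK_OF = {
    "host-a1": "va", "host-a2": "va",
    "host-c1": "vc", "host-c2": "vc", "host-c3": "vc",
    "host-d1": "vd1",
}


def plan_broadcast(sender, receivers, rack_of):
    """Return (source, destination) sends for a two-level broadcast.

    Hosts on the sender's rack are sent to directly. For every other rack,
    the sender sends once to a relay host, which forwards within its rack.
    """
    by_rack = defaultdict(list)
    for host in receivers:
        if host != sender:
            by_rack[rack_of[host]].append(host)

    sends = []
    for rack, hosts in sorted(by_rack.items()):
        hosts = sorted(hosts)
        if rack == rack_of[sender]:
            # Same rack as the sender: no relay needed.
            sends.extend((sender, h) for h in hosts)
        else:
            # Remote rack: one cross-rack send, then intra-rack forwarding.
            relay, rest = hosts[0], hosts[1:]
            sends.append((sender, relay))
            sends.extend((relay, h) for h in rest)
    return sends


def cross_rack_sends(sends, rack_of):
    """Count sends whose source and destination are on different racks."""
    return sum(1 for src, dst in sends if rack_of[src] != rack_of[dst])


if __name__ == "__main__":
    sender = "host-a1"
    two_level = plan_broadcast(sender, list(RACK_OF), RACK_OF)
    naive = [(sender, h) for h in RACK_OF if h != sender]
    print("naive cross-rack sends:    ", cross_rack_sends(naive, RACK_OF))
    print("two-level cross-rack sends:", cross_rack_sends(two_level, RACK_OF))
```

A dynamic variant would presumably replace the fixed RACK_OF map with 
groupings derived from measured stats; that part is not sketched here.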

> Rack aware broadcast operator
> -
>
> Key: IMPALA-6088
> URL: https://issues.apache.org/jira/browse/IMPALA-6088
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Distributed Exec
>Reporter: Mostafa Mokhtar
>Priority: Major
>
> When conducting large-scale experiments on a 6-rack cluster with an 
> aggregator core network topology, overall cluster bandwidth utilization was 
> limited. 
> In aggregator core networks, nodes and racks are not equidistant, which 
> means a broadcast operation can be inefficient: the broadcasting node needs 
> to send the same data once for each of the N nodes on a remote rack. 
> Ideally, row batches should be sent once per remote rack, and a node on each 
> remote rack would then broadcast within its rack. 
> The table below shows rack-to-rack latency for 90% of operations; the ratio 
> between the best and worst case is 7.3x. 
> || ||va||vc||vd1||vd3||ve||
> |va|4,238|4,290|9,692|8,897|8,208|
> |vc|9,290|4,396|30,952|13,529|14,578|
> |vd1|9,131|29,066|4,346|17,265|16,849|
> |vd3|7,409|15,517|17,265|4,370|4,687|
> |ve|4,914|16,894|16,430|4,713|4,472|
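
The 7.3x figure can be checked directly from the table above. The snippet 
below is just that arithmetic; the values are copied from the table, the 
issue does not state their unit, and the variable and function names are 
made up for the example.

```python
# Rack-to-rack latency values copied from the table in the issue
# (rows and columns in the order va, vc, vd1, vd3, ve; unit not stated).
RACKS = ["va", "vc", "vd1", "vd3", "ve"]
LATENCY = [
    [4238, 4290, 9692, 8897, 8208],
    [9290, 4396, 30952, 13529, 14578],
    [9131, 29066, 4346, 17265, 16849],
    [7409, 15517, 17265, 4370, 4687],
    [4914, 16894, 16430, 4713, 4472],
]


def best_and_worst(racks, latency):
    """Return the smallest and largest cells as (value, row, column)."""
    cells = [
        (latency[i][j], racks[i], racks[j])
        for i in range(len(racks))
        for j in range(len(racks))
    ]
    return min(cells), max(cells)


best, worst = best_and_worst(RACKS, LATENCY)
ratio = worst[0] / best[0]
print(f"best:  {best[1]}/{best[2]} = {best[0]}")
print(f"worst: {worst[1]}/{worst[2]} = {worst[0]}")
print(f"ratio: {ratio:.1f}x")
```

The best case is the va/va cell (4,238) and the worst is the vc/vd1 cell 
(30,952), which gives the ratio of roughly 7.3x quoted in the description.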



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-6088) Rack aware broadcast operator

2019-04-26 Thread Ruslan Dautkhanov (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-6088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16827317#comment-16827317
 ] 

Ruslan Dautkhanov commented on IMPALA-6088:
---

It would also be interesting to see whether something like a torrent-style 
protocol helps here too.
Such a protocol can take rack location into account, if available. 

Spark uses this to keep the Spark driver from becoming a network bottleneck: 

https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/broadcast/TorrentBroadcast.scala

{quote}

The driver divides the serialized object into small chunks and stores those 
chunks in the BlockManager of the driver.

On each executor, the executor first attempts to fetch the object from its 
BlockManager. If it does not exist, it then uses remote fetches to fetch the 
small chunks from the driver and/or other executors if available. Once it gets 
the chunks, it puts the chunks in its own BlockManager, ready for other 
executors to fetch from.

This prevents the driver from being the bottleneck in sending out multiple 
copies of the broadcast data (one per executor).

{quote}
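
To make the idea concrete, here is a toy, single-threaded simulation of the 
chunked fetch quoted above, with a rack preference added on top. It is a 
sketch under assumptions, not Spark's or Impala's implementation: the 
pick_source and torrent_broadcast functions, the node names, and the rule 
"prefer a holder on my own rack, otherwise pick any holder at random" are all 
invented for illustration, and executors are processed one after another 
rather than concurrently.

```python
import random


def pick_source(holders, my_rack, rack_of, rng):
    """Pick a node that holds a chunk, preferring one on the caller's rack."""
    holders = sorted(holders)  # sorted for reproducible random choices
    local = [h for h in holders if rack_of[h] == my_rack]
    return rng.choice(local or holders)


def torrent_broadcast(driver, executors, rack_of, num_chunks, seed=0):
    """Simulate chunked fetching; return a list of (source, dest, chunk)."""
    rng = random.Random(seed)
    # chunk index -> set of nodes that currently hold that chunk
    holders = {c: {driver} for c in range(num_chunks)}
    fetches = []
    for ex in executors:
        for c in range(num_chunks):
            src = pick_source(holders[c], rack_of[ex], rack_of, rng)
            fetches.append((src, ex, c))
            # Once fetched, the chunk can be served to other executors.
            holders[c].add(ex)
    return fetches


if __name__ == "__main__":
    # Hypothetical layout: the driver and one executor on rack va,
    # three executors on rack vc.
    example_racks = {
        "driver": "va", "a2": "va",
        "c1": "vc", "c2": "vc", "c3": "vc",
    }
    result = torrent_broadcast(
        "driver", ["a2", "c1", "c2", "c3"], example_racks, num_chunks=4
    )
    cross = sum(1 for s, d, _ in result if example_racks[s] != example_racks[d])
    from_driver = sum(1 for s, _, _ in result if s == "driver")
    print("total fetches:      ", len(result))
    print("cross-rack fetches: ", cross)
    print("served by driver:   ", from_driver)
```

In this layout only the first executor on the remote rack pulls chunks across 
racks; later executors on that rack find every chunk locally, which is the 
same effect the issue asks for from a rack-aware broadcast.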
