[ 
https://issues.apache.org/jira/browse/SPARK-6987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14954983#comment-14954983
 ] 

Piotr Kołaczkowski commented on SPARK-6987:
-------------------------------------------

Probably just having the ability to list the host names that Spark knows of
would be enough.
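
A rough sketch of how a connector might use such a list, if Spark exposed one. Everything here is hypothetical: `known_hosts` stands in for whatever Spark would report, `candidates` for the addresses an external source advertises for one node, and `pick_preferred_locations` is an illustrative helper, not an existing Spark API.

```python
def pick_preferred_locations(candidates, known_hosts):
    """Keep only the candidate addresses that Spark already knows by that exact string.

    candidates  -- addresses the external source advertises for one node
                   (hypothetical input, e.g. its RPC and listen addresses)
    known_hosts -- host names as Spark reports them (hypothetical input;
                   the listing capability discussed above is assumed to exist)
    """
    known = set(known_hosts)
    # Order of the candidates is preserved so the caller's preference survives.
    return [c for c in candidates if c in known]


# One data node advertises two addresses; Spark only knows the second one.
node_addresses = ["10.0.0.5", "192.168.1.5"]
spark_hosts = ["192.168.1.5", "192.168.1.6"]
preferred = pick_preferred_locations(node_addresses, spark_hosts)
```

With a list like this, the connector would no longer have to guess which of a node's addresses matches the string Spark compares against.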

> Node Locality is determined with String Matching instead of Inet Comparison
> ---------------------------------------------------------------------------
>
>                 Key: SPARK-6987
>                 URL: https://issues.apache.org/jira/browse/SPARK-6987
>             Project: Spark
>          Issue Type: Bug
>          Components: Scheduler, Spark Core
>    Affects Versions: 1.2.0, 1.3.0
>            Reporter: Russell Alexander Spitzer
>
> When determining whether or not a task can be run NodeLocal, the 
> TaskSetManager ends up using a direct string comparison between the 
> preferredIp and the executor's bound interface.
> https://github.com/apache/spark/blob/c84d91692aa25c01882bcc3f9fd5de3cfa786195/core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala#L878-L880
> https://github.com/apache/spark/blob/c84d91692aa25c01882bcc3f9fd5de3cfa786195/core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala#L488-L490
> This means that the preferredIp must be a direct string match of the IP the 
> worker is bound to. As a result, APIs that gather data from other distributed 
> sources must develop their own mapping between the interfaces bound (or 
> exposed) by the external sources and the interface bound by the Spark 
> executor, since these may differ. 
> For example, Cassandra exposes a broadcast RPC address which doesn't have to 
> match the address which the service is bound to. This means that when adding 
> preferredLocation data we must add both the RPC and the listen address to 
> ensure that we can get a string match (and of course we are out of luck if 
> Spark has been bound to another interface). 
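
To illustrate the difference between string matching and address comparison, here is a minimal sketch in Python using the standard `ipaddress` module. It is not Spark's code: the function names are made up for illustration, and the addresses come from the IPv6 documentation range.

```python
import ipaddress


def same_host_by_string(preferred, bound):
    # Plain string equality, the kind of comparison the issue describes.
    return preferred == bound


def same_host_by_address(preferred, bound):
    # Parse both sides as IP literals and compare the parsed addresses.
    # If either side is not an IP literal (e.g. a host name), fall back to
    # string equality; no DNS resolution is attempted in this sketch.
    try:
        return ipaddress.ip_address(preferred) == ipaddress.ip_address(bound)
    except ValueError:
        return preferred == bound


# Two spellings of the same IPv6 address.
short_form = "2001:db8::1"
long_form = "2001:0db8:0000:0000:0000:0000:0000:0001"

string_match = same_host_by_string(short_form, long_form)
address_match = same_host_by_address(short_form, long_form)
```

Here `string_match` is false while `address_match` is true. Note that this only covers different spellings of one address. A node reachable on two distinct interfaces, as in the Cassandra example, still needs an explicit mapping between those addresses.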


