[ https://issues.apache.org/jira/browse/SPARK-542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13967185#comment-13967185 ]
Mridul Muralidharan commented on SPARK-542:
-------------------------------------------

Spark uses only hostnames, not IPs. Even for hostnames, it should ideally pick only the canonical hostname, not the others. This was done by design in 0.8 ... trying to work out whether multiple hostnames/IPs all refer to the same physical host/container is fraught with too many issues.

> Cache Miss when machine have multiple hostname
> ----------------------------------------------
>
>                 Key: SPARK-542
>                 URL: https://issues.apache.org/jira/browse/SPARK-542
>             Project: Spark
>          Issue Type: Bug
>            Reporter: frankvictor
>
> Hi, I ran into strange PageRank run times over the last few days.
> After debugging the job, I found the cause was the DNS name.
> The machines in my cluster have multiple hostnames; for example, slave 1 has
> the names c001 and c001.cm.cluster.
> When Spark adds a cache entry in cacheTracker, it gets "c001" and registers the cache under that name.
> But when a task is scheduled in SimpleJob, the Mesos offer gives Spark
> "c001.cm.cluster",
> so the task never gets its preferred location!
> I think Spark should handle the multiple-hostname case (by using the IP instead
> of the hostname, or by some other method).
> Thanks!
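
For readers following the thread, here is a minimal sketch of the mismatch described above and of the "canonical hostname only" approach from the comment. It is not Spark's code: `LocationTable`, `canonical`, and the `fake_dns` mapping are hypothetical names made up for illustration, and the assumption that `c001` canonicalizes to `c001.cm.cluster` is taken from the reporter's example rather than from any real DNS setup. The resolver defaults to Python's `socket.getfqdn`, but it is injectable so the example runs without a network.

```python
import socket


def canonical(host, resolve=socket.getfqdn):
    """Reduce a hostname to one canonical, lower-cased form.

    `resolve` is injectable so the sketch can run without real DNS.
    """
    return resolve(host).lower()


class LocationTable:
    """Hypothetical stand-in for a cache-location registry (not Spark's API)."""

    def __init__(self, resolve=socket.getfqdn):
        self._resolve = resolve
        self._hosts = {}  # partition id -> canonical hostname

    def register(self, partition, host):
        # Canonicalize once, at registration time.
        self._hosts[partition] = canonical(host, self._resolve)

    def is_preferred(self, partition, offered_host):
        # Canonicalize the offered name the same way before comparing.
        return self._hosts.get(partition) == canonical(offered_host, self._resolve)


# The failure in the report: a raw string comparison between two aliases.
raw = {0: "c001"}
raw_match = raw[0] == "c001.cm.cluster"
print(raw_match)  # False: same machine, different strings

# Assumed alias mapping, standing in for DNS in this sketch.
fake_dns = {"c001": "c001.cm.cluster"}
resolve = lambda h: fake_dns.get(h, h)

table = LocationTable(resolve)
table.register(0, "c001")
canonical_match = table.is_preferred(0, "c001.cm.cluster")
print(canonical_match)  # True: both names reduce to the same canonical form
```

This only works if every component reports or canonicalizes names the same way; it makes no attempt to discover that two unrelated names or IPs point at the same machine, which is the part the comment calls out as too error-prone.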