[ 
https://issues.apache.org/jira/browse/SAMZA-922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15218429#comment-15218429
 ] 

Jake Maes commented on SAMZA-922:
---------------------------------

An update based on my testing. 

The ANY_HOST issue triggers the exception but is not its root cause. It 
produces a container request with an invalid host, so the fix is still 
necessary. Also, because the fix removes "ANY_HOST" as the preferred host, 
ScriptBasedMapping is never invoked, which is why I saw the exception 
disappear.

However, the exception seems to occur for any request with a preferred host, 
which is a separate issue. I suspect that issue lies in our internal Yarn 
implementation. 

Anyhow, I wanted to set the record straight. There are 2 issues in the original 
description:
1. ContainerRequests were using "ANY_HOST" as the preferred host
2. ScriptBasedMapping exception

The patch only fixes issue #1.
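
For illustration, the check behind issue #1 can be sketched as follows. This is 
a minimal, self-contained sketch and not the attached patch: the class name 
PreferredHostSketch and the helper hostsForRequest are hypothetical, and it 
assumes that a null node list is how a request signals "no host preference" 
to Yarn.

```java
import java.util.Arrays;

public class PreferredHostSketch {
    // Samza-internal sentinel meaning "no preferred host" (string taken from the issue description).
    static final String ANY_HOST = "ANY_HOST";

    // Hypothetical helper: returns the node list to put in the container request,
    // or null when any host is acceptable. Treating ANY_HOST the same as null
    // keeps the sentinel string from ever reaching Yarn.
    static String[] hostsForRequest(String preferredHost) {
        if (preferredHost == null || ANY_HOST.equals(preferredHost)) {
            return null; // assumption: null nodes means no locality constraint
        }
        return new String[] { preferredHost };
    }

    public static void main(String[] args) {
        // Sentinel and null are both normalized to "no preference".
        System.out.println(Arrays.toString(hostsForRequest(ANY_HOST)));
        System.out.println(Arrays.toString(hostsForRequest(null)));
        // A real host name (made up here) is passed through unchanged.
        System.out.println(Arrays.toString(hostsForRequest("host-1.example.com")));
    }
}
```

Writing the comparison as ANY_HOST.equals(preferredHost) rather than 
preferredHost.equals(ANY_HOST) keeps it null-safe even if the null check is 
later removed or reordered.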

> Host Affinity - Bug in SamzaContainerRequest causes (recoverable) exceptions 
> in YARN
> ------------------------------------------------------------------------------------
>
>                 Key: SAMZA-922
>                 URL: https://issues.apache.org/jira/browse/SAMZA-922
>             Project: Samza
>          Issue Type: Bug
>            Reporter: Jake Maes
>            Assignee: Jake Maes
>             Fix For: 0.10.1
>
>         Attachments: SAMZA-922.patch
>
>
> The constructor for SamzaContainerRequest creates the Yarn container request 
> differently depending on whether there is a preferred host or not. 
> Unfortunately, it checks for preferredHost == null but not 
> preferredHost.equals(ANY_HOST), and ANY_HOST is the string passed when there 
> is no preferred host. 
> As a result, the Yarn container request actually asks for a container on 
> the host name "ANY_HOST", which causes the following exception:
> 2016-03-29 21:25:53.892 [main] ScriptBasedMapping [WARN] Exception running /OMITTED/sbin/yarn-topology.py ANY_HOST 
> java.io.IOException: Cannot run program "/OMITTED/application_1452292535523_0047/container_1452292535523_0047_02_000001"): error=2, No such file or directory
>       at java.lang.ProcessBuilder.start(ProcessBuilder.java:1042)
>       at org.apache.hadoop.util.Shell.runCommand(Shell.java:485)
>       at org.apache.hadoop.util.Shell.run(Shell.java:455)
>       at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:715)
>       at org.apache.hadoop.net.ScriptBasedMapping$RawScriptBasedMapping.runResolveCommand(ScriptBasedMapping.java:251)
>       at org.apache.hadoop.net.ScriptBasedMapping$RawScriptBasedMapping.resolve(ScriptBasedMapping.java:188)
>       at org.apache.hadoop.net.CachedDNSToSwitchMapping.resolve(CachedDNSToSwitchMapping.java:119)
>       at org.apache.hadoop.yarn.util.RackResolver.coreResolve(RackResolver.java:101)
>       at org.apache.hadoop.yarn.util.RackResolver.resolve(RackResolver.java:95)
>       at org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl.resolveRacks(AMRMClientImpl.java:551)
>       at org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl.addContainerRequest(AMRMClientImpl.java:411)
>       at org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl.addContainerRequest(AMRMClientAsyncImpl.java:166)
>       at org.apache.samza.job.yarn.ContainerRequestState.updateRequestState(ContainerRequestState.java:82)
>       at org.apache.samza.job.yarn.AbstractContainerAllocator.requestContainer(AbstractContainerAllocator.java:102)
>       at org.apache.samza.job.yarn.AbstractContainerAllocator.requestContainers(AbstractContainerAllocator.java:85)
>       at org.apache.samza.job.yarn.SamzaTaskManager.onInit(SamzaTaskManager.java:112)
>       at org.apache.samza.job.yarn.SamzaAppMaster$$anonfun$run$1.apply(SamzaAppMaster.scala:117)
>       at org.apache.samza.job.yarn.SamzaAppMaster$$anonfun$run$1.apply(SamzaAppMaster.scala:117)
>       at scala.collection.immutable.List.foreach(List.scala:318)
>       at org.apache.samza.job.yarn.SamzaAppMaster$.run(SamzaAppMaster.scala:117)
>       at org.apache.samza.job.yarn.SamzaAppMaster$.main(SamzaAppMaster.scala:104)
>       at org.apache.samza.job.yarn.SamzaAppMaster.main(SamzaAppMaster.scala)
> Caused by: java.io.IOException: error=2, No such file or directory
>       at java.lang.UNIXProcess.forkAndExec(Native Method)
>       at java.lang.UNIXProcess.<init>(UNIXProcess.java:187)
>       at java.lang.ProcessImpl.start(ProcessImpl.java:134)
>       at java.lang.ProcessBuilder.start(ProcessBuilder.java:1023)
> The exception is recoverable when relaxed locality = true because Yarn just 
> defaults to a random host on the default rack, which was the desired result 
> of the ANY_HOST request. However, the behavior is incorrect and the stack 
> traces tend to fill the log.
> The string "ANY_HOST" is internal to Samza, and Yarn should never see it. 


