[jira] [Commented] (SAMZA-498) Kill YARN Job doesn't kill the container

Falk Scheerschmidt (JIRA) Tue, 16 Dec 2014 00:14:16 -0800

    [ 
https://issues.apache.org/jira/browse/SAMZA-498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14247921#comment-14247921
 ]


Falk Scheerschmidt commented on SAMZA-498:
------------------------------------------

In YARN-1842 the reporter resolved the "bug" by installing YARN on CentOS. We 
did the same (and it works), and found that with the same versions of Java, 
YARN and Samza, the only difference is the kernel. On CentOS we have Linux 
kernel 2.6.32-504.el6.x86_64, and on Debian Wheezy 3.16.0-0.bpo.4-amd64. 

On both machines the Java version is:

{noformat}
java version "1.7.0_71"
Java(TM) SE Runtime Environment (build 1.7.0_71-b14)
Java HotSpot(TM) 64-Bit Server VM (build 24.71-b01, mixed mode)
{noformat}

We are now trying to downgrade the kernel on Debian to 2.6.*.
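
To make the comparison between the two hosts repeatable, something like the following could be run on each machine. It is only a minimal sketch: the helper names `kernel_series` and `java_version_banner` are made up for illustration, and it assumes `java` is on the PATH and that Python 3.5 or later is available.

```python
import platform
import re
import subprocess


def kernel_series(release):
    """Return (major, minor) parsed from a kernel release string.

    Example input: '3.16.0-0.bpo.4-amd64' or '2.6.32-504.el6.x86_64'.
    """
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError("unrecognised kernel release: %r" % release)
    return int(match.group(1)), int(match.group(2))


def java_version_banner():
    """Return the first line printed by `java -version`, or None if java is missing.

    `java -version` writes to stderr, so stderr is merged into stdout here.
    """
    try:
        proc = subprocess.run(
            ["java", "-version"],
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            universal_newlines=True,
        )
    except OSError:
        return None
    lines = proc.stdout.splitlines()
    return lines[0] if lines else None


def host_summary():
    """Collect the kernel release and Java banner of the local machine."""
    return {
        "kernel_release": platform.release(),
        "java": java_version_banner(),
    }
```

Calling `host_summary()` on both machines and `kernel_series()` on each reported release shows at a glance whether the hosts differ only in the kernel series (2.6 versus 3.16 in our case).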

> Kill YARN Job doesn't kill the container
> ----------------------------------------
>
>                 Key: SAMZA-498
>                 URL: https://issues.apache.org/jira/browse/SAMZA-498
>             Project: Samza
>          Issue Type: Bug
>          Components: container, yarn
>    Affects Versions: 0.8.0
>            Reporter: Falk Scheerschmidt
>
> I tried to kill a Samza job with kill-yarn-job.sh, and the console output 
> looks like this:
> {noformat}2014-12-12 13:47:54 RMProxy [INFO] Connecting to ResourceManager at xxxxxxxx/10.2.0.79:8032
> 2014-12-12 13:48:03 NativeCodeLoader [WARN] Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
> Killing application application_1418381509268_0016
> 2014-12-12 13:48:03 YarnClientImpl [INFO] Killed application application_1418381509268_0016{noformat}
> But the job wasn't killed. All containers are still running, and on the 
> server the log looks like this:
> {noformat}2014-12-12 13:48:04,003 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=user.name  IP=10.255.250.131  OPERATION=Kill Application Request  TARGET=ClientRMService  RESULT=SUCCESS  APPID=application_1418381509268_0016
> 2014-12-12 13:48:04,565 ERROR org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService: Application attempt appattempt_1418381509268_0016_000002 doesn't exist in ApplicationMasterService cache.
> 2014-12-12 13:48:04,565 INFO org.apache.hadoop.ipc.Server: IPC Server handler 27 on 8030, call org.apache.hadoop.yarn.api.ApplicationMasterProtocolPB.allocate from 10.2.0.88:40934 Call#751 Retry#0
> org.apache.hadoop.yarn.exceptions.ApplicationAttemptNotFoundException: Application attempt appattempt_1418381509268_0016_000002 doesn't exist in ApplicationMasterService cache.
>         at org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService.allocate(ApplicationMasterService.java:436)
>         at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationMasterProtocolPBServiceImpl.allocate(ApplicationMasterProtocolPBServiceImpl.java:60)
>         at org.apache.hadoop.yarn.proto.ApplicationMasterProtocol$ApplicationMasterProtocolService$2.callBlockingMethod(ApplicationMasterProtocol.java:99)
>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:962)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2039)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2035)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2033)
> 2014-12-12 13:48:05,202 INFO org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService: application_1418381509268_0016 unregistered successfully.
> 2014-12-12 13:48:06,229 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler: Null container completed...{noformat}
> Any hints on what I can try?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
