This is an excerpt from the yarn-root-resourcemanager-kfk-samza01.out file. Tell me if you need another file.
Thanks,
jordi

0001 State change from NEW to SUBMITTED
2015-09-28 11:15:58,081 INFO [ResourceManager Event Processor] capacity.LeafQueue (LeafQueue.java:activateApplications(626)) - Application application_1443431699703_0003 from user: root activated in queue: default
2015-09-28 11:15:58,081 INFO [ResourceManager Event Processor] capacity.LeafQueue (LeafQueue.java:addApplicationAttempt(643)) - Application added - appId: application_1443431699703_0003 user: org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue$User@3d3c59a, leaf-queue: default #user-pending-applications: 0 #user-active-applications: 3 #queue-pending-applications: 0 #queue-active-applications: 3
2015-09-28 11:15:58,081 INFO [ResourceManager Event Processor] capacity.CapacityScheduler (CapacityScheduler.java:addApplicationAttempt(746)) - Added Application Attempt appattempt_1443431699703_0003_000001 to scheduler from user root in queue default
2015-09-28 11:15:58,082 INFO [AsyncDispatcher event handler] attempt.RMAppAttemptImpl (RMAppAttemptImpl.java:handle(764)) - appattempt_1443431699703_0003_000001 State change from SUBMITTED to SCHEDULED
2015-09-28 11:15:58,135 INFO [ResourceManager Event Processor] rmcontainer.RMContainerImpl (RMContainerImpl.java:handle(380)) - container_1443431699703_0003_01_000001 Container Transitioned from NEW to ALLOCATED
2015-09-28 11:15:58,135 INFO [ResourceManager Event Processor] resourcemanager.RMAuditLogger (RMAuditLogger.java:logSuccess(106)) - USER=root OPERATION=AM Allocated Container TARGET=SchedulerApp RESULT=SUCCESS APPID=application_1443431699703_0003 CONTAINERID=container_1443431699703_0003_01_000001
2015-09-28 11:15:58,135 INFO [ResourceManager Event Processor] scheduler.SchedulerNode (SchedulerNode.java:allocateContainer(141)) - Assigned container container_1443431699703_0003_01_000001 of capacity <memory:256, vCores:1> on host kfk-samza01:36066, which has 3 containers, <memory:768, vCores:3> used and <memory:1280, vCores:5> available after allocation
2015-09-28 11:15:58,136 INFO [ResourceManager Event Processor] capacity.LeafQueue (LeafQueue.java:assignContainer(1570)) - assignedContainer application attempt=appattempt_1443431699703_0003_000001 container=Container: [ContainerId: container_1443431699703_0003_01_000001, NodeId: kfk-samza01:36066, NodeHttpAddress: kfk-samza01:8042, Resource: <memory:256, vCores:1>, Priority: 0, Token: null, ] queue=default: capacity=1.0, absoluteCapacity=1.0, usedResources=<memory:512, vCores:2>, usedCapacity=0.125, absoluteUsedCapacity=0.125, numApps=3, numContainers=2 clusterResource=<memory:4096, vCores:16>
2015-09-28 11:15:58,136 INFO [ResourceManager Event Processor] capacity.ParentQueue (ParentQueue.java:assignContainersToChildQueues(601)) - Re-sorting assigned queue: root.default stats: default: capacity=1.0, absoluteCapacity=1.0, usedResources=<memory:768, vCores:3>, usedCapacity=0.1875, absoluteUsedCapacity=0.1875, numApps=3, numContainers=3
2015-09-28 11:15:58,137 INFO [ResourceManager Event Processor] capacity.ParentQueue (ParentQueue.java:assignContainers(464)) - assignedContainer queue=root usedCapacity=0.1875 absoluteUsedCapacity=0.1875 used=<memory:768, vCores:3> cluster=<memory:4096, vCores:16>
2015-09-28 11:15:58,138 INFO [AsyncDispatcher event handler] security.NMTokenSecretManagerInRM (NMTokenSecretManagerInRM.java:createAndGetNMToken(200)) - Sending NMToken for nodeId : kfk-samza01:36066 for container : container_1443431699703_0003_01_000001
2015-09-28 11:15:58,144 INFO [AsyncDispatcher event handler] rmcontainer.RMContainerImpl (RMContainerImpl.java:handle(380)) - container_1443431699703_0003_01_000001 Container Transitioned from ALLOCATED to ACQUIRED
2015-09-28 11:15:58,144 INFO [AsyncDispatcher event handler] security.NMTokenSecretManagerInRM (NMTokenSecretManagerInRM.java:clearNodeSetForAttempt(146)) - Clear node set for appattempt_1443431699703_0003_000001
2015-09-28 11:15:58,144 INFO [AsyncDispatcher event handler] attempt.RMAppAttemptImpl (RMAppAttemptImpl.java:storeAttempt(1837)) - Storing attempt: AppId: application_1443431699703_0003 AttemptId: appattempt_1443431699703_0003_000001 MasterContainer: Container: [ContainerId: container_1443431699703_0003_01_000001, NodeId: kfk-samza01:36066, NodeHttpAddress: kfk-samza01:8042, Resource: <memory:256, vCores:1>, Priority: 0, Token: Token { kind: ContainerToken, service: 192.168.15.92:36066 }, ]
2015-09-28 11:15:58,145 INFO [AsyncDispatcher event handler] attempt.RMAppAttemptImpl (RMAppAttemptImpl.java:handle(764)) - appattempt_1443431699703_0003_000001 State change from SCHEDULED to ALLOCATED_SAVING
2015-09-28 11:15:58,145 INFO [AsyncDispatcher event handler] attempt.RMAppAttemptImpl (RMAppAttemptImpl.java:handle(764)) - appattempt_1443431699703_0003_000001 State change from ALLOCATED_SAVING to ALLOCATED
2015-09-28 11:15:58,161 INFO [pool-1-thread-3] amlauncher.AMLauncher (AMLauncher.java:run(253)) - Launching masterappattempt_1443431699703_0003_000001
2015-09-28 11:15:58,165 INFO [pool-1-thread-3] amlauncher.AMLauncher (AMLauncher.java:launch(106)) - Setting up container Container: [ContainerId: container_1443431699703_0003_01_000001, NodeId: kfk-samza01:36066, NodeHttpAddress: kfk-samza01:8042, Resource: <memory:256, vCores:1>, Priority: 0, Token: Token { kind: ContainerToken, service: 192.168.15.92:36066 }, ] for AM appattempt_1443431699703_0003_000001
2015-09-28 11:15:58,166 INFO [pool-1-thread-3] amlauncher.AMLauncher (AMLauncher.java:createAMContainerLaunchContext(191)) - Command to launch container container_1443431699703_0003_01_000001 : export SAMZA_LOG_DIR=<LOG_DIR> && ln -sfn <LOG_DIR> logs && exec ./__package/bin/run-am.sh 1>logs/stdout 2>logs/stderr
2015-09-28 11:15:58,166 INFO [pool-1-thread-3] security.AMRMTokenSecretManager (AMRMTokenSecretManager.java:createAndGetAMRMToken(195)) - Create AMRMToken for ApplicationAttempt: appattempt_1443431699703_0003_000001
2015-09-28 11:15:58,166 INFO [pool-1-thread-3] security.AMRMTokenSecretManager (AMRMTokenSecretManager.java:createPassword(307)) - Creating password for appattempt_1443431699703_0003_000001
2015-09-28 11:15:58,194 INFO [pool-1-thread-3] amlauncher.AMLauncher (AMLauncher.java:launch(127)) - Done launching container Container: [ContainerId: container_1443431699703_0003_01_000001, NodeId: kfk-samza01:36066, NodeHttpAddress: kfk-samza01:8042, Resource: <memory:256, vCores:1>, Priority: 0, Token: Token { kind: ContainerToken, service: 192.168.15.92:36066 }, ] for AM appattempt_1443431699703_0003_000001
2015-09-28 11:15:58,194 INFO [AsyncDispatcher event handler] attempt.RMAppAttemptImpl (RMAppAttemptImpl.java:handle(764)) - appattempt_1443431699703_0003_000001 State change from ALLOCATED to LAUNCHED
2015-09-28 11:15:59,138 INFO [ResourceManager Event Processor] rmcontainer.RMContainerImpl (RMContainerImpl.java:handle(380)) - container_1443431699703_0003_01_000001 Container Transitioned from ACQUIRED to RUNNING
2015-09-28 11:18:30,736 INFO [Socket Reader #1 for port 8030] ipc.Server (Server.java:saslProcess(1306)) - Auth successful for appattempt_1443431699703_0001_000001 (auth:SIMPLE)
2015-09-28 11:18:30,766 INFO [IPC Server handler 8 on 8030] resourcemanager.ApplicationMasterService (ApplicationMasterService.java:registerApplicationMaster(274)) - AM registration appattempt_1443431699703_0001_000001
2015-09-28 11:18:30,768 INFO [IPC Server handler 8 on 8030] resourcemanager.RMAuditLogger (RMAuditLogger.java:logSuccess(127)) - USER=root IP=192.168.15.92 OPERATION=Register App Master TARGET=ApplicationMasterService RESULT=SUCCESS APPID=application_1443431699703_0001 APPATTEMPTID=appattempt_1443431699703_0001_000001
2015-09-28 11:18:30,769 INFO [AsyncDispatcher event handler] attempt.RMAppAttemptImpl (RMAppAttemptImpl.java:handle(764)) - appattempt_1443431699703_0001_000001 State change from LAUNCHED to RUNNING
2015-09-28 11:18:30,769 INFO [AsyncDispatcher event handler] rmapp.RMAppImpl (RMAppImpl.java:handle(718)) - application_1443431699703_0001 State change from ACCEPTED to RUNNING
2015-09-28 11:18:31,952 INFO [ResourceManager Event Processor] rmcontainer.RMContainerImpl (RMContainerImpl.java:handle(380)) - container_1443431699703_0001_01_000002 Container Transitioned from NEW to ALLOCATED
2015-09-28 11:18:31,952 INFO [ResourceManager Event Processor] resourcemanager.RMAuditLogger (RMAuditLogger.java:logSuccess(106)) - USER=root OPERATION=AM Allocated Container TARGET=SchedulerApp RESULT=SUCCESS APPID=application_1443431699703_0001 CONTAINERID=container_1443431699703_0001_01_000002
2015-09-28 11:18:31,953 INFO [ResourceManager Event Processor] scheduler.SchedulerNode (SchedulerNode.java:allocateContainer(141)) - Assigned container container_1443431699703_0001_01_000002 of capacity <memory:256, vCores:1> on host kfk-samza02:59687, which has 1 containers, <memory:256, vCores:1> used and <memory:1792, vCores:7> available after allocation
2015-09-28 11:18:31,953 INFO [ResourceManager Event Processor] capacity.LeafQueue (LeafQueue.java:assignContainer(1570)) - assignedContainer application attempt=appattempt_1443431699703_0001_000001 container=Container: [ContainerId: container_1443431699703_0001_01_000002, NodeId: kfk-samza02:59687, NodeHttpAddress: kfk-samza02:8042, Resource: <memory:256, vCores:1>, Priority: 0, Token: null, ] queue=default: capacity=1.0, absoluteCapacity=1.0, usedResources=<memory:768, vCores:3>, usedCapacity=0.1875, absoluteUsedCapacity=0.1875, numApps=3, numContainers=3 clusterResource=<memory:4096, vCores:16>
2015-09-28 11:18:31,953 INFO [ResourceManager Event Processor] capacity.ParentQueue (ParentQueue.java:assignContainersToChildQueues(601)) - Re-sorting assigned queue: root.default stats: default: capacity=1.0, absoluteCapacity=1.0, usedResources=<memory:1024, vCores:4>, usedCapacity=0.25, absoluteUsedCapacity=0.25, numApps=3, numContainers=4
2015-09-28 11:18:31,954 INFO [ResourceManager Event Processor] capacity.ParentQueue (ParentQueue.java:assignContainers(464)) - assignedContainer queue=root usedCapacity=0.25 absoluteUsedCapacity=0.25 used=<memory:1024, vCores:4> cluster=<memory:4096, vCores:16>
2015-09-28 11:18:32,921 INFO [IPC Server handler 18 on 8030] security.NMTokenSecretManagerInRM (NMTokenSecretManagerInRM.java:createAndGetNMToken(200)) - Sending NMToken for nodeId : kfk-samza02:59687 for container : container_1443431699703_0001_01_000002
2015-09-28 11:18:32,923 INFO [IPC Server handler 18 on 8030] rmcontainer.RMContainerImpl (RMContainerImpl.java:handle(380)) - container_1443431699703_0001_01_000002 Container Transitioned from ALLOCATED to ACQUIRED
2015-09-28 11:18:33,980 INFO [ResourceManager Event Processor] rmcontainer.RMContainerImpl (RMContainerImpl.java:handle(380)) - container_1443431699703_0001_01_000002 Container Transitioned from ACQUIRED to RUNNING
2015-09-28 11:22:17,889 INFO [AsyncDispatcher event handler] rmapp.RMAppImpl (RMAppImpl.java:handle(718)) - application_1443431699703_0003 State change from ACCEPTED to KILLING
2015-09-28 11:22:17,890 INFO [AsyncDispatcher event handler] attempt.RMAppAttemptImpl (RMAppAttemptImpl.java:rememberTargetTransitionsAndStoreState(1129)) - Updating application attempt appattempt_1443431699703_0003_000001 with final state: KILLED, and exit status: -1000
2015-09-28 11:22:17,891 INFO [AsyncDispatcher event handler] attempt.RMAppAttemptImpl (RMAppAttemptImpl.java:handle(764)) - appattempt_1443431699703_0003_000001 State change from LAUNCHED to FINAL_SAVING
2015-09-28 11:22:17,892 INFO [AsyncDispatcher event handler] resourcemanager.ApplicationMasterService (ApplicationMasterService.java:unregisterAttempt(676)) - Unregistering app attempt : appattempt_1443431699703_0003_000001
2015-09-28 11:22:17,893 INFO [AsyncDispatcher event handler] security.AMRMTokenSecretManager (AMRMTokenSecretManager.java:applicationMasterFinished(124)) - Application finished, removing password for appattempt_1443431699703_0003_000001
2015-09-28 11:22:17,893 INFO [AsyncDispatcher event handler] attempt.RMAppAttemptImpl (RMAppAttemptImpl.java:handle(764)) - appattempt_1443431699703_0003_000001 State change from FINAL_SAVING to KILLED
2015-09-28 11:22:17,894 INFO [AsyncDispatcher event handler] rmapp.RMAppImpl (RMAppImpl.java:rememberTargetTransitionsAndStoreState(992)) - Updating application application_1443431699703_0003 with final state: KILLED
2015-09-28 11:22:17,895 INFO [AsyncDispatcher event handler] rmapp.RMAppImpl (RMAppImpl.java:handle(718)) - application_1443431699703_0003 State change from KILLING to FINAL_SAVING
2015-09-28 11:22:17,895 INFO [ResourceManager Event Processor] capacity.CapacityScheduler (CapacityScheduler.java:doneApplicationAttempt(785)) - Application Attempt appattempt_1443431699703_0003_000001 is done. finalState=KILLED
2015-09-28 11:22:17,895 INFO [AsyncDispatcher event handler] recovery.RMStateStore (RMStateStore.java:transition(161)) - Updating info for app: application_1443431699703_0003
2015-09-28 11:22:17,901 INFO [ResourceManager Event Processor] rmcontainer.RMContainerImpl (RMContainerImpl.java:handle(380)) - container_1443431699703_0003_01_000001 Container Transitioned from RUNNING to KILLED
2015-09-28 11:22:17,901 INFO [ResourceManager Event Processor] fica.FiCaSchedulerApp (FiCaSchedulerApp.java:containerCompleted(98)) - Completed container: container_1443431699703_0003_01_000001 in state: KILLED event:KILL
2015-09-28 11:22:17,901 INFO [ResourceManager Event Processor] resourcemanager.RMAuditLogger (RMAuditLogger.java:logSuccess(106)) - USER=root OPERATION=AM Released Container TARGET=SchedulerApp RESULT=SUCCESS APPID=application_1443431699703_0003 CONTAINERID=container_1443431699703_0003_01_000001
2015-09-28 11:22:17,902 INFO [ResourceManager Event Processor] scheduler.SchedulerNode (SchedulerNode.java:releaseContainer(204)) - Released container container_1443431699703_0003_01_000001 of capacity <memory:256, vCores:1> on host kfk-samza01:36066, which currently has 2 containers, <memory:512, vCores:2> used and <memory:1536, vCores:6> available, release resources=true
2015-09-28 11:22:17,902 INFO [ResourceManager Event Processor] capacity.LeafQueue (LeafQueue.java:releaseResource(1723)) - default used=<memory:768, vCores:3> numContainers=3 user=root user-resources=<memory:768, vCores:3>
2015-09-28 11:22:17,903 INFO [ResourceManager Event Processor] capacity.LeafQueue (LeafQueue.java:completedContainer(1674)) - completedContainer container=Container: [ContainerId: container_1443431699703_0003_01_000001, NodeId: kfk-samza01:36066, NodeHttpAddress: kfk-samza01:8042, Resource: <memory:256, vCores:1>, Priority: 0, Token: Token { kind: ContainerToken, service: 192.168.15.92:36066 }, ] queue=default: capacity=1.0, absoluteCapacity=1.0, usedResources=<memory:768, vCores:3>, usedCapacity=0.1875, absoluteUsedCapacity=0.1875, numApps=3, numContainers=3 cluster=<memory:4096, vCores:16>
2015-09-28 11:22:17,903 INFO [ResourceManager Event Processor] capacity.ParentQueue (ParentQueue.java:completedContainer(646)) - completedContainer queue=root usedCapacity=0.1875 absoluteUsedCapacity=0.1875 used=<memory:768, vCores:3> cluster=<memory:4096, vCores:16>
2015-09-28 11:22:17,904 INFO [ResourceManager Event Processor] capacity.ParentQueue (ParentQueue.java:completedContainer(664)) - Re-sorting completed queue: root.default stats: default: capacity=1.0, absoluteCapacity=1.0, usedResources=<memory:768, vCores:3>, usedCapacity=0.1875, absoluteUsedCapacity=0.1875, numApps=3, numContainers=3
2015-09-28 11:22:17,904 INFO [ResourceManager Event Processor] capacity.CapacityScheduler (CapacityScheduler.java:completedContainer(1215)) - Application attempt appattempt_1443431699703_0003_000001 released container container_1443431699703_0003_01_000001 on node: host: kfk-samza01:36066 #containers=2 available=1536 used=512 with event: KILL
2015-09-28 11:22:17,904 INFO [AsyncDispatcher event handler] rmapp.RMAppImpl (RMAppImpl.java:handle(718)) - application_1443431699703_0003 State change from FINAL_SAVING to KILLED
2015-09-28 11:22:17,905 INFO [ResourceManager Event Processor] scheduler.AppSchedulingInfo (AppSchedulingInfo.java:clearRequests(115)) - Application application_1443431699703_0003 requests cleared
2015-09-28 11:22:17,906 INFO [ResourceManager Event Processor] capacity.LeafQueue (LeafQueue.java:removeApplicationAttempt(686)) - Application removed - appId: application_1443431699703_0003 user: root queue: default #user-pending-applications: 0 #user-active-applications: 2 #queue-pending-applications: 0 #queue-active-applications: 2
2015-09-28 11:22:17,906 INFO [ResourceManager Event Processor] capacity.ParentQueue (ParentQueue.java:removeApplication(411)) - Application removed - appId: application_1443431699703_0003 user: root leaf-queue of parent: root #applications: 2
2015-09-28 11:22:17,907 INFO [AsyncDispatcher event handler] resourcemanager.RMAuditLogger (RMAuditLogger.java:logSuccess(148)) - USER=root OPERATION=Application Finished - Killed TARGET=RMAppManager RESULT=SUCCESS APPID=application_1443431699703_0003
2015-09-28 11:22:17,909 INFO [pool-1-thread-4] amlauncher.AMLauncher (AMLauncher.java:run(267)) - Cleaning master appattempt_1443431699703_0003_000001
2015-09-28 11:22:17,910 INFO [AsyncDispatcher event handler] resourcemanager.RMAppManager$ApplicationSummary (RMAppManager.java:logAppSummary(179)) - appId=application_1443431699703_0003,name=flow.OperationJob_1,user=root,queue=default,state=KILLED,trackingUrl=http://kfk-samza01:8088/cluster/app/application_1443431699703_0003,appMasterHost=N/A,startTime=1443431758077,finishTime=1443432137894,finalStatus=KILLED
2015-09-28 11:22:18,102 INFO [IPC Server handler 3 on 8032] resourcemanager.RMAuditLogger (RMAuditLogger.java:logSuccess(148)) - USER=root IP=192.168.15.92 OPERATION=Kill Application Request TARGET=ClientRMService RESULT=SUCCESS APPID=application_1443431699703_0003
2015-09-28 11:22:20,493 INFO [AsyncDispatcher event handler] rmapp.RMAppImpl (RMAppImpl.java:handle(718)) - application_1443431699703_0001 State change from RUNNING to KILLING
2015-09-28 11:22:20,494 INFO [AsyncDispatcher event handler] attempt.RMAppAttemptImpl (RMAppAttemptImpl.java:rememberTargetTransitionsAndStoreState(1129)) - Updating application attempt appattempt_1443431699703_0001_000001 with final state: KILLED, and exit status: -1000
2015-09-28 11:22:20,494 INFO [AsyncDispatcher event handler] attempt.RMAppAttemptImpl (RMAppAttemptImpl.java:handle(764)) - appattempt_1443431699703_0001_000001 State change from RUNNING to FINAL_SAVING
2015-09-28 11:22:20,494 INFO [AsyncDispatcher event handler] resourcemanager.ApplicationMasterService (ApplicationMasterService.java:unregisterAttempt(676)) - Unregistering app attempt : appattempt_1443431699703_0001_000001
2015-09-28 11:22:20,495 INFO [AsyncDispatcher event handler] security.AMRMTokenSecretManager (AMRMTokenSecretManager.java:applicationMasterFinished(124)) - Application finished, removing password for appattempt_1443431699703_0001_000001
2015-09-28 11:22:20,495 INFO [AsyncDispatcher event handler] attempt.RMAppAttemptImpl (RMAppAttemptImpl.java:handle(764)) - appattempt_1443431699703_0001_000001 State change from FINAL_SAVING to KILLED
2015-09-28 11:22:20,495 INFO [AsyncDispatcher event handler] rmapp.RMAppImpl (RMAppImpl.java:rememberTargetTransitionsAndStoreState(992)) - Updating application application_1443431699703_0001 with final state: KILLED
2015-09-28 11:22:20,495 INFO [AsyncDispatcher event handler] rmapp.RMAppImpl (RMAppImpl.java:handle(718)) - application_1443431699703_0001 State change from KILLING to FINAL_SAVING
2015-09-28 11:22:20,496 INFO [ResourceManager Event Processor] capacity.CapacityScheduler (CapacityScheduler.java:doneApplicationAttempt(785)) - Application Attempt appattempt_1443431699703_0001_000001 is done. finalState=KILLED
2015-09-28 11:22:20,496 INFO [AsyncDispatcher event handler] recovery.RMStateStore (RMStateStore.java:transition(161)) - Updating info for app: application_1443431699703_0001
2015-09-28 11:22:20,496 INFO [ResourceManager Event Processor] rmcontainer.RMContainerImpl (RMContainerImpl.java:handle(380)) - container_1443431699703_0001_01_000002 Container Transitioned from RUNNING to KILLED
2015-09-28 11:22:20,496 INFO [AsyncDispatcher event handler] rmapp.RMAppImpl (RMAppImpl.java:handle(718)) - application_1443431699703_0001 State change from FINAL_SAVING to KILLED
2015-09-28 11:22:20,496 INFO [pool-1-thread-5] amlauncher.AMLauncher (AMLauncher.java:run(267)) - Cleaning master appattempt_1443431699703_0001_000001
2015-09-28 11:22:20,497 INFO [AsyncDispatcher event handler] resourcemanager.RMAuditLogger (RMAuditLogger.java:logSuccess(148)) - USER=root OPERATION=Application Finished - Killed TARGET=RMAppManager RESULT=SUCCESS APPID=application_1443431699703_0001
2015-09-28 11:22:20,496 INFO [ResourceManager Event Processor] fica.FiCaSchedulerApp (FiCaSchedulerApp.java:containerCompleted(98)) - Completed container: container_1443431699703_0001_01_000002 in state: KILLED event:KILL
2015-09-28 11:22:20,497 INFO [ResourceManager Event Processor] resourcemanager.RMAuditLogger (RMAuditLogger.java:logSuccess(106)) - USER=root OPERATION=AM Released Container TARGET=SchedulerApp RESULT=SUCCESS APPID=application_1443431699703_0001 CONTAINERID=container_1443431699703_0001_01_000002
2015-09-28 11:22:20,497 INFO [AsyncDispatcher event handler] resourcemanager.RMAppManager$ApplicationSummary (RMAppManager.java:logAppSummary(179)) - appId=application_1443431699703_0001,name=flow.Router_1,user=root,queue=default,state=KILLED,trackingUrl=http://kfk-samza01:8088/cluster/app/application_1443431699703_0001,appMasterHost=N/A,startTime=1443431751189,finishTime=1443432140495,finalStatus=KILLED
2015-09-28 11:22:20,497 INFO [ResourceManager Event Processor] scheduler.SchedulerNode (SchedulerNode.java:releaseContainer(204)) - Released container container_1443431699703_0001_01_000002 of capacity <memory:256, vCores:1> on host kfk-samza02:59687, which currently has 0 containers, <memory:0, vCores:0> used and <memory:2048, vCores:8> available, release resources=true
2015-09-28 11:22:20,498 INFO [ResourceManager Event Processor] capacity.LeafQueue (LeafQueue.java:releaseResource(1723)) - default used=<memory:512, vCores:2> numContainers=2 user=root user-resources=<memory:512, vCores:2>
2015-09-28 11:22:20,498 INFO [ResourceManager Event Processor] capacity.LeafQueue (LeafQueue.java:completedContainer(1674)) - completedContainer container=Container: [ContainerId: container_1443431699703_0001_01_000002, NodeId: kfk-samza02:59687, NodeHttpAddress: kfk-samza02:8042, Resource: <memory:256, vCores:1>, Priority: 0, Token: Token { kind: ContainerToken, service: 192.168.15.94:59687 }, ] queue=default: capacity=1.0, absoluteCapacity=1.0, usedResources=<memory:512, vCores:2>, usedCapacity=0.125, absoluteUsedCapacity=0.125, numApps=2, numContainers=2 cluster=<memory:4096, vCores:16>
2015-09-28 11:22:20,499 INFO [ResourceManager Event Processor] capacity.ParentQueue (ParentQueue.java:completedContainer(646)) - completedContainer queue=root usedCapacity=0.125 absoluteUsedCapacity=0.125 used=<memory:512, vCores:2> cluster=<memory:4096, vCores:16>
2015-09-28 11:22:20,499 INFO [ResourceManager Event Processor] capacity.ParentQueue (ParentQueue.java:completedContainer(664)) - Re-sorting completed queue: root.default stats: default: capacity=1.0, absoluteCapacity=1.0, usedResources=<memory:512, vCores:2>, usedCapacity=0.125, absoluteUsedCapacity=0.125, numApps=2, numContainers=2
2015-09-28 11:22:20,499 INFO [ResourceManager Event Processor] capacity.CapacityScheduler (CapacityScheduler.java:completedContainer(1215)) - Application attempt appattempt_1443431699703_0001_000001 released container container_1443431699703_0001_01_000002 on node: host: kfk-samza02:59687 #containers=0 available=2048 used=0 with event: KILL
2015-09-28 11:22:20,500 INFO [ResourceManager Event Processor] rmcontainer.RMContainerImpl (RMContainerImpl.java:handle(380)) - container_1443431699703_0001_01_000001 Container Transitioned from RUNNING to KILLED
2015-09-28 11:22:20,500 INFO [ResourceManager Event Processor] fica.FiCaSchedulerApp (FiCaSchedulerApp.java:containerCompleted(98)) - Completed container: container_1443431699703_0001_01_000001 in state: KILLED event:KILL
2015-09-28 11:22:20,500 INFO [ResourceManager Event Processor] resourcemanager.RMAuditLogger (RMAuditLogger.java:logSuccess(106)) - USER=root OPERATION=AM Released Container TARGET=SchedulerApp RESULT=SUCCESS APPID=application_1443431699703_0001 CONTAINERID=container_1443431699703_0001_01_000001
2015-09-28 11:22:20,500 INFO [ResourceManager Event Processor] scheduler.SchedulerNode (SchedulerNode.java:releaseContainer(204)) - Released container container_1443431699703_0001_01_000001 of capacity <memory:256, vCores:1> on host kfk-samza01:36066, which currently has 1 containers, <memory:256, vCores:1> used and <memory:1792, vCores:7> available, release resources=true
2015-09-28 11:22:20,500 INFO [ResourceManager Event Processor] capacity.LeafQueue (LeafQueue.java:releaseResource(1723)) - default used=<memory:256, vCores:1> numContainers=1 user=root user-resources=<memory:256, vCores:1>
2015-09-28 11:22:20,501 INFO [ResourceManager Event Processor] capacity.LeafQueue (LeafQueue.java:completedContainer(1674)) - completedContainer container=Container: [ContainerId: container_1443431699703_0001_01_000001, NodeId: kfk-samza01:36066, NodeHttpAddress: kfk-samza01:8042, Resource: <memory:256, vCores:1>, Priority: 0, Token: Token { kind: ContainerToken, service: 192.168.15.92:36066 }, ] queue=default: capacity=1.0, absoluteCapacity=1.0, usedResources=<memory:256, vCores:1>, usedCapacity=0.0625, absoluteUsedCapacity=0.0625, numApps=2, numContainers=1 cluster=<memory:4096, vCores:16>
2015-09-28 11:22:20,501 INFO [ResourceManager Event Processor] capacity.ParentQueue (ParentQueue.java:completedContainer(646)) - completedContainer queue=root usedCapacity=0.0625 absoluteUsedCapacity=0.0625 used=<memory:256, vCores:1> cluster=<memory:4096, vCores:16>
2015-09-28 11:22:20,501 INFO [ResourceManager Event Processor] capacity.ParentQueue (ParentQueue.java:completedContainer(664)) - Re-sorting completed queue: root.default stats: default: capacity=1.0, absoluteCapacity=1.0, usedResources=<memory:256, vCores:1>, usedCapacity=0.0625, absoluteUsedCapacity=0.0625, numApps=2, numContainers=1
2015-09-28 11:22:20,501 INFO [ResourceManager Event Processor] capacity.CapacityScheduler (CapacityScheduler.java:completedContainer(1215)) - Application attempt appattempt_1443431699703_0001_000001 released container container_1443431699703_0001_01_000001 on node: host: kfk-samza01:36066 #containers=1 available=1792 used=256 with event: KILL
2015-09-28 11:22:20,502 INFO [ResourceManager Event Processor] scheduler.AppSchedulingInfo (AppSchedulingInfo.java:clearRequests(115)) - Application application_1443431699703_0001 requests cleared
2015-09-28 11:22:20,502 INFO [ResourceManager Event Processor] capacity.LeafQueue (LeafQueue.java:removeApplicationAttempt(686)) - Application removed - appId: application_1443431699703_0001 user: root queue: default #user-pending-applications: 0 #user-active-applications: 1 #queue-pending-applications: 0 #queue-active-applications: 1
2015-09-28 11:22:20,502 INFO [ResourceManager Event Processor] capacity.ParentQueue (ParentQueue.java:removeApplication(411)) - Application removed - appId: application_1443431699703_0001 user: root leaf-queue of parent: root #applications: 1
2015-09-28 11:22:20,700 INFO [IPC Server handler 4 on 8032] resourcemanager.RMAuditLogger (RMAuditLogger.java:logSuccess(148)) - USER=root IP=192.168.15.92 OPERATION=Kill Application Request TARGET=ClientRMService RESULT=SUCCESS APPID=application_1443431699703_0001
2015-09-28 11:22:20,971 ERROR [IPC Server handler 22 on 8030] resourcemanager.ApplicationMasterService (ApplicationMasterService.java:allocate(435)) - Application attempt appattempt_1443431699703_0001_000001 doesn't exist in ApplicationMasterService cache.
2015-09-28 11:22:20,974 INFO [IPC Server handler 22 on 8030] ipc.Server (Server.java:run(2060)) - IPC Server handler 22 on 8030, call org.apache.hadoop.yarn.api.ApplicationMasterProtocolPB.allocate from 192.168.15.92:53988 Call#231 Retry#0
org.apache.hadoop.yarn.exceptions.ApplicationAttemptNotFoundException: Application attempt appattempt_1443431699703_0001_000001 doesn't exist in ApplicationMasterService cache.
    at org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService.allocate(ApplicationMasterService.java:436)
    at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationMasterProtocolPBServiceImpl.allocate(ApplicationMasterProtocolPBServiceImpl.java:60)
    at org.apache.hadoop.yarn.proto.ApplicationMasterProtocol$ApplicationMasterProtocolService$2.callBlockingMethod(ApplicationMasterProtocol.java:99)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:962)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2039)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2035)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2033)
2015-09-28 11:22:23,161 INFO [AsyncDispatcher event handler] rmapp.RMAppImpl (RMAppImpl.java:handle(718)) - application_1443431699703_0002 State change from ACCEPTED to KILLING
2015-09-28 11:22:23,162 INFO [AsyncDispatcher event handler] attempt.RMAppAttemptImpl (RMAppAttemptImpl.java:rememberTargetTransitionsAndStoreState(1129)) - Updating application attempt appattempt_1443431699703_0002_000001 with final state: KILLED, and exit status: -1000
2015-09-28 11:22:23,163 INFO [AsyncDispatcher event handler] attempt.RMAppAttemptImpl (RMAppAttemptImpl.java:handle(764)) - appattempt_1443431699703_0002_000001 State change from LAUNCHED to FINAL_SAVING
2015-09-28 11:22:23,163 INFO [AsyncDispatcher event handler] resourcemanager.ApplicationMasterService (ApplicationMasterService.java:unregisterAttempt(676)) - Unregistering app attempt : appattempt_1443431699703_0002_000001
2015-09-28 11:22:23,164 INFO [AsyncDispatcher event handler] security.AMRMTokenSecretManager (AMRMTokenSecretManager.java:applicationMasterFinished(124)) - Application finished, removing password for appattempt_1443431699703_0002_000001
2015-09-28 11:22:23,164 INFO [AsyncDispatcher event handler] attempt.RMAppAttemptImpl (RMAppAttemptImpl.java:handle(764)) - appattempt_1443431699703_0002_000001 State change from FINAL_SAVING to KILLED
2015-09-28 11:22:23,165 INFO [AsyncDispatcher event handler] rmapp.RMAppImpl (RMAppImpl.java:rememberTargetTransitionsAndStoreState(992)) - Updating application application_1443431699703_0002 with final state: KILLED
2015-09-28 11:22:23,165 INFO [AsyncDispatcher event handler] rmapp.RMAppImpl (RMAppImpl.java:handle(718)) - application_1443431699703_0002 State change from KILLING to FINAL_SAVING
2015-09-28 11:22:23,165 INFO [ResourceManager Event Processor] capacity.CapacityScheduler (CapacityScheduler.java:doneApplicationAttempt(785)) - Application Attempt appattempt_1443431699703_0002_000001 is done. finalState=KILLED
2015-09-28 11:22:23,166 INFO [AsyncDispatcher event handler] recovery.RMStateStore (RMStateStore.java:transition(161)) - Updating info for app: application_1443431699703_0002
2015-09-28 11:22:23,166 INFO [ResourceManager Event Processor] rmcontainer.RMContainerImpl (RMContainerImpl.java:handle(380)) - container_1443431699703_0002_01_000001 Container Transitioned from RUNNING to KILLED
2015-09-28 11:22:23,166 INFO [AsyncDispatcher event handler] rmapp.RMAppImpl (RMAppImpl.java:handle(718)) - application_1443431699703_0002 State change from FINAL_SAVING to KILLED
2015-09-28 11:22:23,166 INFO [ResourceManager Event Processor] fica.FiCaSchedulerApp (FiCaSchedulerApp.java:containerCompleted(98)) - Completed container: container_1443431699703_0002_01_000001 in state: KILLED event:KILL
2015-09-28 11:22:23,167 INFO [AsyncDispatcher event handler] resourcemanager.RMAuditLogger (RMAuditLogger.java:logSuccess(148)) - USER=root OPERATION=Application Finished - Killed TARGET=RMAppManager RESULT=SUCCESS APPID=application_1443431699703_0002
2015-09-28 11:22:23,167 INFO [ResourceManager Event Processor] resourcemanager.RMAuditLogger (RMAuditLogger.java:logSuccess(106)) - USER=root OPERATION=AM Released Container TARGET=SchedulerApp RESULT=SUCCESS APPID=application_1443431699703_0002 CONTAINERID=container_1443431699703_0002_01_000001
2015-09-28 11:22:23,167 INFO [ResourceManager Event Processor] scheduler.SchedulerNode (SchedulerNode.java:releaseContainer(204)) - Released container container_1443431699703_0002_01_000001 of capacity <memory:256, vCores:1> on host kfk-samza01:36066, which currently has 0 containers, <memory:0, vCores:0> used and <memory:2048, vCores:8> available, release resources=true
2015-09-28 11:22:23,168 INFO [ResourceManager Event Processor] capacity.LeafQueue (LeafQueue.java:releaseResource(1723)) - default used=<memory:0, vCores:0> numContainers=0 user=root user-resources=<memory:0, vCores:0>
2015-09-28 11:22:23,168 INFO [ResourceManager Event Processor] capacity.LeafQueue (LeafQueue.java:completedContainer(1674)) - completedContainer container=Container: [ContainerId: container_1443431699703_0002_01_000001, NodeId: kfk-samza01:36066, NodeHttpAddress: kfk-samza01:8042, Resource: <memory:256, vCores:1>, Priority: 0, Token: Token { kind: ContainerToken, service: 192.168.15.92:36066 }, ] queue=default: capacity=1.0, absoluteCapacity=1.0, usedResources=<memory:0, vCores:0>, usedCapacity=0.0, absoluteUsedCapacity=0.0, numApps=1, numContainers=0 cluster=<memory:4096, vCores:16>
2015-09-28 11:22:23,169 INFO [ResourceManager Event Processor] capacity.ParentQueue (ParentQueue.java:completedContainer(646)) - completedContainer queue=root usedCapacity=0.0 absoluteUsedCapacity=0.0 used=<memory:0, vCores:0> cluster=<memory:4096, vCores:16>
2015-09-28 11:22:23,169 INFO [ResourceManager Event Processor] capacity.ParentQueue (ParentQueue.java:completedContainer(664)) - Re-sorting completed queue: root.default stats: default: capacity=1.0, absoluteCapacity=1.0, usedResources=<memory:0, vCores:0>, usedCapacity=0.0, absoluteUsedCapacity=0.0, numApps=1, numContainers=0
2015-09-28 11:22:23,169 INFO [ResourceManager Event Processor] capacity.CapacityScheduler (CapacityScheduler.java:completedContainer(1215)) - Application attempt appattempt_1443431699703_0002_000001 released container container_1443431699703_0002_01_000001 on node: host: kfk-samza01:36066 #containers=0 available=2048 used=0 with event: KILL
2015-09-28 11:22:23,169 INFO [ResourceManager Event Processor] scheduler.AppSchedulingInfo (AppSchedulingInfo.java:clearRequests(115)) - Application application_1443431699703_0002 requests cleared
2015-09-28 11:22:23,170 INFO [pool-1-thread-6] amlauncher.AMLauncher (AMLauncher.java:run(267)) - Cleaning master appattempt_1443431699703_0002_000001
2015-09-28 11:22:23,168 INFO [AsyncDispatcher event handler] resourcemanager.RMAppManager$ApplicationSummary (RMAppManager.java:logAppSummary(179)) -
appId=application_1443431699703_0002,name=flow.WorkFlow_1,user=root,queue=default,state=KILLED,trackingUrl=http://kfk-samza01:8088/cluster/app/application_1443431699703_0002,appMasterHost=N/A,startTime=1443431754782,finishTime=1443432143165,finalStatus=KILLED 2015-09-28 11:22:23,170 INFO [ResourceManager Event Processor] capacity.LeafQueue (LeafQueue.java:removeApplicationAttempt(686)) - Application removed - appId: application_1443431699703_0002 user: root queue: default #user-pending-applications: 0 #user-active-applications: 0 #queue-pending-applications: 0 #queue-active-applications: 0 2015-09-28 11:22:23,172 INFO [ResourceManager Event Processor] capacity.ParentQueue (ParentQueue.java:removeApplication(411)) - Application removed - appId: application_1443431699703_0002 user: root leaf-queue of parent: root #applications: 0 2015-09-28 11:22:23,184 INFO [IPC Server handler 20 on 8030] resourcemanager.ApplicationMasterService (ApplicationMasterService.java:finishApplicationMaster(351)) - application_1443431699703_0001 unregistered successfully. 2015-09-28 11:22:23,371 INFO [IPC Server handler 0 on 8032] resourcemanager.RMAuditLogger (RMAuditLogger.java:logSuccess(148)) - USER=root IP=192.168.15.92 OPERATION=Kill Application Request TARGET=ClientRMService RESULT=SUCCESS APPID=application_1443431699703_0002 2015-09-28 11:22:24,195 INFO [ResourceManager Event Processor] capacity.CapacityScheduler (CapacityScheduler.java:completedContainer(1190)) - Null container completed... 2015-09-28 11:24:59,693 INFO [Timer-3] scheduler.AbstractYarnScheduler (AbstractYarnScheduler.java:run(407)) - Release request cache is cleaned up -----Original Message----- From: Yi Pan [mailto:nickpa...@gmail.com] Sent: Monday, September 28, 2015 10:37 To: dev@samza.apache.org Subject: Re: process killing Hm... interesting. What did you see in the application master's logs? I saw that the remaining processes running are SamzaAppMasters.
On Tue, Sep 22, 2015 at 1:05 AM, Jordi Blasi Uribarri <jbl...@nextel.es> wrote: > Hi, > > I have two machines running YARN and Samza. They run Samza 0.9.1 and > Hadoop 2.6.0. > > I ran the kill-all.sh script I recently wrote, which calls kill-yarn-job.sh. > This is the output: > > java version "1.7.0_79" > OpenJDK Runtime Environment (IcedTea 2.5.6) (7u79-2.5.6-1~deb7u1) > OpenJDK 64-Bit Server VM (build 24.79-b02, mixed mode) > SLF4J: Class path contains multiple SLF4J bindings. > SLF4J: Found binding in > [jar:file:/opt/jobs/lib/redirect-0.0.1.jar!/org/slf4j/impl/StaticLogge > rBinder.class] > SLF4J: Found binding in > [jar:file:/opt/jobs/lib/samzafroga-0.0.1-jar-with-dependencies.jar!/or > g/slf4j/impl/StaticLoggerBinder.class] > SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an > explanation. > java version "1.7.0_79" > OpenJDK Runtime Environment (IcedTea 2.5.6) (7u79-2.5.6-1~deb7u1) > OpenJDK 64-Bit Server VM (build 24.79-b02, mixed mode) > /usr/lib/jvm/java-7-openjdk-amd64/bin/java > -Dlog4j.configuration=file:bin/log4j-console.xml > -Dsamza.log.dir=/opt/jobs -Djava.io.tmpdir=/opt/jobs/tmp -Xmx768M > -XX:+PrintGCDateStamps -Xloggc:/opt/jobs/gc.log > -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 > -XX:GCLogFileSize=10241024 -d64 -cp > /opt/hadoop-2.6.0/conf:/opt/jobs/lib/redirect-0.0.1.jar:/opt/jobs/lib/ > samzafroga-0.0.1-jar-with-dependencies.jar > org.apache.hadoop.yarn.client.cli.ApplicationCLI application -kill > application_1442908447829_0001 > 2015-09-22 10:02:46 RMProxy [INFO] Connecting to ResourceManager at > kfk-samza01/192.168.15.92:8032 > SLF4J: Class path contains multiple SLF4J bindings. > SLF4J: Found binding in > [jar:file:/opt/jobs/lib/redirect-0.0.1.jar!/org/slf4j/impl/StaticLogge > rBinder.class] > SLF4J: Found binding in > [jar:file:/opt/jobs/lib/samzafroga-0.0.1-jar-with-dependencies.jar!/or > g/slf4j/impl/StaticLoggerBinder.class] > SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an > explanation.
> 2015-09-22 10:02:46 NativeCodeLoader [WARN] Unable to load > native-hadoop library for your platform... using builtin-java classes > where applicable Killing application application_1442908447829_0001 > 2015-09-22 10:02:47 YarnClientImpl [INFO] Killed application > application_1442908447829_0001 > java version "1.7.0_79" > OpenJDK Runtime Environment (IcedTea 2.5.6) (7u79-2.5.6-1~deb7u1) > OpenJDK 64-Bit Server VM (build 24.79-b02, mixed mode) > /usr/lib/jvm/java-7-openjdk-amd64/bin/java > -Dlog4j.configuration=file:bin/log4j-console.xml > -Dsamza.log.dir=/opt/jobs -Djava.io.tmpdir=/opt/jobs/tmp -Xmx768M > -XX:+PrintGCDateStamps -Xloggc:/opt/jobs/gc.log > -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 > -XX:GCLogFileSize=10241024 -d64 -cp > /opt/hadoop-2.6.0/conf:/opt/jobs/lib/redirect-0.0.1.jar:/opt/jobs/lib/ > samzafroga-0.0.1-jar-with-dependencies.jar > org.apache.hadoop.yarn.client.cli.ApplicationCLI application -kill > application_1442908447829_0002 > 2015-09-22 10:02:49 RMProxy [INFO] Connecting to ResourceManager at > kfk-samza01/192.168.15.92:8032 > SLF4J: Class path contains multiple SLF4J bindings. > SLF4J: Found binding in > [jar:file:/opt/jobs/lib/redirect-0.0.1.jar!/org/slf4j/impl/StaticLogge > rBinder.class] > SLF4J: Found binding in > [jar:file:/opt/jobs/lib/samzafroga-0.0.1-jar-with-dependencies.jar!/or > g/slf4j/impl/StaticLoggerBinder.class] > SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an > explanation. > 2015-09-22 10:02:49 NativeCodeLoader [WARN] Unable to load > native-hadoop library for your platform... 
using builtin-java classes > where applicable Killing application application_1442908447829_0002 > 2015-09-22 10:02:50 YarnClientImpl [INFO] Killed application > application_1442908447829_0002 > > When I run ps -fe |grep java I see this: > > root 9542 1 3 09:54 pts/0 00:00:21 > /usr/lib/jvm/java-7-openjdk-amd64/bin/java -Dproc_resourcemanager > -Xmx1000m -Dhadoop.log.dir=/opt/hadoop-2.6.0/logs > -Dyarn.log.dir=/opt/hadoop-2.6.0/logs > -Dhadoop.log.file=yarn-root-resourcemanager-kfk-samza01.log > -Dyarn.log.file=yarn-root-resourcemanager-kfk-samza01.log > -Dyarn.home.dir=/opt/hadoop-2.6.0 -Dhadoop.home.dir=/opt/hadoop-2.6.0 > -Dhadoop.root.logger=INFO,RFA -Dyarn.root.logger=INFO,RFA > -Djava.library.path=/opt/hadoop-2.6.0/lib/native -classpath > /opt/hadoop-2.6.0/conf:/opt/hadoop-2.6.0/conf:/opt/hadoop-2.6.0/conf:/ > opt/hadoop-2.6.0/share/hadoop/common/lib/*:/opt/hadoop-2.6.0/share/had > oop/common/*:/opt/hadoop-2.6.0/share/hadoop/hdfs:/opt/hadoop-2.6.0/sha > re/hadoop/hdfs/lib/*:/opt/hadoop-2.6.0/share/hadoop/hdfs/*:/opt/hadoop > -2.6.0/share/hadoop/yarn/lib/*:/opt/hadoop-2.6.0/share/hadoop/yarn/*:/ > opt/hadoop-2.6.0/share/hadoop/mapreduce/lib/*:/opt/hadoop-2.6.0/share/ > hadoop/mapreduce/*:/opt/hadoop-2.6.0/share/hadoop/yarn/*:/opt/hadoop-2 > .6.0/share/hadoop/yarn/lib/*:/opt/hadoop-2.6.0/conf/rm-config/log4j.pr > operties org.apache.hadoop.yarn.server.resourcemanager.ResourceManager > root 9814 1 4 09:54 ? 
00:00:24 > /usr/lib/jvm/java-7-openjdk-amd64/bin/java -Dproc_nodemanager > -Xmx1000m -server -Dhadoop.log.dir=/opt/hadoop-2.6.0/logs > -Dyarn.log.dir=/opt/hadoop-2.6.0/logs > -Dhadoop.log.file=yarn-root-nodemanager-kfk-samza01.log > -Dyarn.log.file=yarn-root-nodemanager-kfk-samza01.log > -Dyarn.home.dir=/opt/hadoop-2.6.0 -Dhadoop.home.dir=/opt/hadoop-2.6.0 > -Dhadoop.root.logger=INFO,RFA -Dyarn.root.logger=INFO,RFA > -Djava.library.path=/opt/hadoop-2.6.0/lib/native -classpath > /opt/hadoop-2.6.0/conf:/opt/hadoop-2.6.0/conf:/opt/hadoop-2.6.0/conf:/ > opt/hadoop-2.6.0/share/hadoop/common/lib/*:/opt/hadoop-2.6.0/share/had > oop/common/*:/opt/hadoop-2.6.0/share/hadoop/hdfs:/opt/hadoop-2.6.0/sha > re/hadoop/hdfs/lib/*:/opt/hadoop-2.6.0/share/hadoop/hdfs/*:/opt/hadoop > -2.6.0/share/hadoop/yarn/lib/*:/opt/hadoop-2.6.0/share/hadoop/yarn/*:/ > opt/hadoop-2.6.0/share/hadoop/mapreduce/lib/*:/opt/hadoop-2.6.0/share/ > hadoop/mapreduce/*:/opt/hadoop-2.6.0/share/hadoop/yarn/*:/opt/hadoop-2 > .6.0/share/hadoop/yarn/lib/*:/opt/hadoop-2.6.0/conf/nm-config/log4j.pr > operties org.apache.hadoop.yarn.server.nodemanager.NodeManager > root 10271 10268 0 09:54 ? 
00:00:05 > /usr/lib/jvm/java-7-openjdk-amd64/bin/java -server > -Dsamza.container.name=samza-application-master > -Dsamza.log.dir=/opt/hadoop-2.6.0/logs/userlogs/application_1442908447 > 829_0002/container_1442908447829_0002_01_000001 > -Djava.io.tmpdir=/tmp/hadoop-root/nm-local-dir/usercache/root/appcache > /application_1442908447829_0002/container_1442908447829_0002_01_000001 > /__package/tmp > -Xmx768M -XX:+PrintGCDateStamps > -Xloggc:/opt/hadoop-2.6.0/logs/userlogs/application_1442908447829_0002 > /container_1442908447829_0002_01_000001/gc.log > -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 > -XX:GCLogFileSize=10241024 -d64 -cp > /opt/hadoop-2.6.0/conf:/tmp/hadoop-root/nm-local-dir/usercache/root/ap > pcache/application_1442908447829_0002/container_1442908447829_0002_01_ > 000001/__package/lib/jackson-annotations-2.6.0.jar:/tmp/hadoop-root/nm > -local-dir/usercache/root/appcache/application_1442908447829_0002/cont > ainer_1442908447829_0002_01_000001/__package/lib/jackson-core-2.6.0.ja > r:/tmp/hadoop-root/nm-local-dir/usercache/root/appcache/application_14 > 42908447829_0002/container_1442908447829_0002_01_000001/__package/lib/ > jackson-databind-2.6.0.jar:/tmp/hadoop-root/nm-local-dir/usercache/roo > t/appcache/application_1442908447829_0002/container_1442908447829_0002 > _01_000001/__package/lib/jackson-dataformat-smile-2.6.0.jar:/tmp/hadoo > p-root/nm-local-dir/usercache/root/appcache/application_1442908447829_ > 0002/container_1442908447829_0002_01_000001/__package/lib/jackson-jaxr > s-json-provider-2.6.0.jar:/tmp/hadoop-root/nm-local-dir/usercache/root > /appcache/application_1442908447829_0002/container_1442908447829_0002_ > 01_000001/__package/lib/jackson-module-jaxb-annotations-2.6.0.jar:/tmp > /hadoop-root/nm-local-dir/usercache/root/appcache/application_14429084 > 47829_0002/container_1442908447829_0002_01_000001/__package/lib/nxtBro > ker-0.0.1.jar:/tmp/hadoop-root/nm-local-dir/usercache/root/appcache/ap > 
plication_1442908447829_0002/container_1442908447829_0002_01_000001/__ > package/lib/nxtBroker-0.0.1-jar-with-dependencies.jar > org.apache.samza.job.yarn.SamzaAppMaster > root 10346 10344 0 09:54 ? 00:00:04 > /usr/lib/jvm/java-7-openjdk-amd64/bin/java -server > -Dsamza.container.name=samza-application-master > -Dsamza.log.dir=/opt/hadoop-2.6.0/logs/userlogs/application_1442908447 > 829_0001/container_1442908447829_0001_01_000001 > -Djava.io.tmpdir=/tmp/hadoop-root/nm-local-dir/usercache/root/appcache > /application_1442908447829_0001/container_1442908447829_0001_01_000001 > /__package/tmp > -Xmx768M -XX:+PrintGCDateStamps > -Xloggc:/opt/hadoop-2.6.0/logs/userlogs/application_1442908447829_0001 > /container_1442908447829_0001_01_000001/gc.log > -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 > -XX:GCLogFileSize=10241024 -d64 -cp > /opt/hadoop-2.6.0/conf:/tmp/hadoop-root/nm-local-dir/usercache/root/ap > pcache/application_1442908447829_0001/container_1442908447829_0001_01_ > 000001/__package/lib/jackson-annotations-2.6.0.jar:/tmp/hadoop-root/nm > -local-dir/usercache/root/appcache/application_1442908447829_0001/cont > ainer_1442908447829_0001_01_000001/__package/lib/jackson-core-2.6.0.ja > r:/tmp/hadoop-root/nm-local-dir/usercache/root/appcache/application_14 > 42908447829_0001/container_1442908447829_0001_01_000001/__package/lib/ > jackson-databind-2.6.0.jar:/tmp/hadoop-root/nm-local-dir/usercache/roo > t/appcache/application_1442908447829_0001/container_1442908447829_0001 > _01_000001/__package/lib/jackson-dataformat-smile-2.6.0.jar:/tmp/hadoo > p-root/nm-local-dir/usercache/root/appcache/application_1442908447829_ > 0001/container_1442908447829_0001_01_000001/__package/lib/jackson-jaxr > s-json-provider-2.6.0.jar:/tmp/hadoop-root/nm-local-dir/usercache/root > /appcache/application_1442908447829_0001/container_1442908447829_0001_ > 01_000001/__package/lib/jackson-module-jaxb-annotations-2.6.0.jar:/tmp > 
/hadoop-root/nm-local-dir/usercache/root/appcache/application_14429084 > 47829_0001/container_1442908447829_0001_01_000001/__package/lib/nxtBro > ker-0.0.1.jar:/tmp/hadoop-root/nm-local-dir/usercache/root/appcache/ap > plication_1442908447829_0001/container_1442908447829_0001_01_000001/__ > package/lib/nxtBroker-0.0.1-jar-with-dependencies.jar > org.apache.samza.job.yarn.SamzaAppMaster > > As you can see, the processes are still there. In the web application > they appear as KILLED. > > Thanks, > > Jordi > > -----Original Message----- > From: Yan Fang [mailto:yanfang...@gmail.com] Sent: Tuesday, September 22, > 2015 9:59 > To: dev@samza.apache.org > Subject: Re: process killing > > Hi Jordi, > > 1. Are you running the job on a single-machine YARN setup or in a cluster? > > 2. What kind of Java process do you see after killing the YARN > application? Usually, when we run kill-yarn-job applicationId, > all the processes are killed (this is actually done by YARN). > > 3. Which versions of Samza and YARN are you using? This sometimes matters. > > Thanks, > > Fang, Yan > yanfang...@gmail.com > > On Tue, Sep 22, 2015 at 3:42 PM, Jordi Blasi Uribarri > <jbl...@nextel.es> > wrote: > > > Hi, > > > > I am currently developing a solution using Samza, and in the > > development process I need to constantly change the code and test it in the > > system. > > What I am seeing is that most of the time, when I kill a job using the > > kill-yarn-job script, the job gets killed according to the web > > interface but I still see the Java process running. I have also seen that > > the job was actually still executing, as I got messages at the far end > > of the application. I have been manually killing these processes > > (kill -9), but I have some questions: > > > > > > - Is there a reason for the processes not to be killed? It was > > not a matter of time, as I could still find them hours later.
> > > > - I don’t know whether any other action is needed to > > completely clean up the information, or whether killing the process the hard way > > is enough. > > > > - I am seeing some memory consumption problems, and I don’t know > > whether they are related to this. Maybe I will describe them in another > > message. > > > > Thanks, > > > > Jordi > > ________________________________ > > Jordi Blasi Uribarri > > R&D&I Area > > > > jbl...@nextel.es > > Bilbao Office