[jira] [Created] (YARN-8508) GPU does not get released even though the container is killed
Sumana Sathish created YARN-8508: Summary: GPU does not get released even though the container is killed Key: YARN-8508 URL: https://issues.apache.org/jira/browse/YARN-8508 Project: Hadoop YARN Issue Type: Bug Reporter: Sumana Sathish Assignee: Wangda Tan

The GPU fails to be released even though the container using it is killed:
{code}
2018-07-06 05:22:26,201 INFO container.ContainerImpl (ContainerImpl.java:handle(2093)) - Container container_e20_1530854311763_0006_01_01 transitioned from RUNNING to KILLING
2018-07-06 05:22:26,250 INFO container.ContainerImpl (ContainerImpl.java:handle(2093)) - Container container_e20_1530854311763_0006_01_02 transitioned from RUNNING to KILLING
2018-07-06 05:22:26,251 INFO application.ApplicationImpl (ApplicationImpl.java:handle(632)) - Application application_1530854311763_0006 transitioned from RUNNING to FINISHING_CONTAINERS_WAIT
2018-07-06 05:22:26,251 INFO launcher.ContainerLaunch (ContainerLaunch.java:cleanupContainer(734)) - Cleaning up container container_e20_1530854311763_0006_01_02
2018-07-06 05:22:31,358 INFO launcher.ContainerLaunch (ContainerLaunch.java:getContainerPid(1102)) - Could not get pid for container_e20_1530854311763_0006_01_02. Waited for 5000 ms.
2018-07-06 05:22:31,358 WARN launcher.ContainerLaunch (ContainerLaunch.java:cleanupContainer(784)) - Container clean up before pid file created container_e20_1530854311763_0006_01_02
2018-07-06 05:22:31,359 INFO launcher.ContainerLaunch (ContainerLaunch.java:reapDockerContainerNoPid(940)) - Unable to obtain pid, but docker container request detected. Attempting to reap container container_e20_1530854311763_0006_01_02
2018-07-06 05:22:31,494 INFO nodemanager.LinuxContainerExecutor (LinuxContainerExecutor.java:deleteAsUser(828)) - Deleting absolute path : /grid/0/hadoop/yarn/local/usercache/hrt_qa/appcache/application_1530854311763_0006/container_e20_1530854311763_0006_01_02/launch_container.sh
2018-07-06 05:22:31,500 INFO nodemanager.LinuxContainerExecutor (LinuxContainerExecutor.java:deleteAsUser(828)) - Deleting absolute path : /grid/0/hadoop/yarn/local/usercache/hrt_qa/appcache/application_1530854311763_0006/container_e20_1530854311763_0006_01_02/container_tokens
2018-07-06 05:22:31,510 INFO container.ContainerImpl (ContainerImpl.java:handle(2093)) - Container container_e20_1530854311763_0006_01_01 transitioned from KILLING to CONTAINER_CLEANEDUP_AFTER_KILL
2018-07-06 05:22:31,510 INFO container.ContainerImpl (ContainerImpl.java:handle(2093)) - Container container_e20_1530854311763_0006_01_02 transitioned from KILLING to CONTAINER_CLEANEDUP_AFTER_KILL
2018-07-06 05:22:31,512 INFO container.ContainerImpl (ContainerImpl.java:handle(2093)) - Container container_e20_1530854311763_0006_01_01 transitioned from CONTAINER_CLEANEDUP_AFTER_KILL to DONE
2018-07-06 05:22:31,513 INFO container.ContainerImpl (ContainerImpl.java:handle(2093)) - Container container_e20_1530854311763_0006_01_02 transitioned from CONTAINER_CLEANEDUP_AFTER_KILL to DONE
2018-07-06 05:22:38,955 INFO container.ContainerImpl (ContainerImpl.java:handle(2093)) - Container container_e20_1530854311763_0007_01_02 transitioned from NEW to SCHEDULED
{code}
A new container requesting GPUs then fails to launch:
{code}
2018-07-06 05:22:39,048 ERROR nodemanager.LinuxContainerExecutor (LinuxContainerExecutor.java:handleLaunchForLaunchType(550)) - ResourceHandlerChain.preStart() failed!
org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources.ResourceHandlerException: Failed to find enough GPUs, requestor=container_e20_1530854311763_0007_01_02, #RequestedGPUs=2, #availableGpus=1
	at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources.gpu.GpuResourceAllocator.internalAssignGpus(GpuResourceAllocator.java:225)
	at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources.gpu.GpuResourceAllocator.assignGpus(GpuResourceAllocator.java:173)
	at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources.gpu.GpuResourceHandlerImpl.preStart(GpuResourceHandlerImpl.java:98)
	at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources.ResourceHandlerChain.preStart(ResourceHandlerChain.java:75)
	at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.handleLaunchForLaunchType(LinuxContainerExecutor.java:509)
	at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.launchContainer(LinuxContainerExecutor.java:479)
	at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.launchContainer(ContainerLaunch.java:494)
	at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:306)
{code}
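The failure pattern in these logs (devices assigned at preStart but never returned on an abnormal kill path) can be sketched as a small bookkeeping model. This is an illustrative sketch only, not the actual GpuResourceAllocator code; all class and method names here are hypothetical.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: a minimal device-to-container map showing how a GPU
// stays marked "used" forever if a kill path never calls release().
class GpuAllocatorSketch {
    private final Set<Integer> allGpus = new HashSet<>();
    private final Map<Integer, String> usedGpuByContainer = new HashMap<>();

    GpuAllocatorSketch(int numGpus) {
        for (int i = 0; i < numGpus; i++) {
            allGpus.add(i);
        }
    }

    synchronized int available() {
        return allGpus.size() - usedGpuByContainer.size();
    }

    // Mirrors the error in the log: allocation fails when fewer GPUs are
    // free than the container requested.
    synchronized void assign(String containerId, int requested) {
        if (requested > available()) {
            throw new IllegalStateException("Failed to find enough GPUs, requestor="
                + containerId + ", #RequestedGPUs=" + requested
                + ", #availableGpus=" + available());
        }
        int granted = 0;
        for (Integer gpu : allGpus) {
            if (granted < requested && !usedGpuByContainer.containsKey(gpu)) {
                usedGpuByContainer.put(gpu, containerId);
                granted++;
            }
        }
    }

    // A leak like the one reported happens when a container exit path
    // (e.g. "clean up before pid file created") never reaches this call.
    synchronized void release(String containerId) {
        usedGpuByContainer.values().removeIf(containerId::equals);
    }
}
```

If the kill path skips `release()`, every later `assign()` sees a permanently reduced `available()` count, which is exactly the `#availableGpus=1` symptom on a node that should have had 2 free GPUs.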
[jira] [Created] (YARN-8474) sleeper service fails to launch with "Authentication Required"
Sumana Sathish created YARN-8474: Summary: sleeper service fails to launch with "Authentication Required" Key: YARN-8474 URL: https://issues.apache.org/jira/browse/YARN-8474 Project: Hadoop YARN Issue Type: Bug Components: yarn Reporter: Sumana Sathish

The sleeper job fails with "Authentication required":
{code}
yarn app -launch sl1 a/YarnServiceLogs/sleeper-orig.json
18/06/28 22:00:43 INFO client.ApiServiceClient: Loading service definition from local FS: /a/YarnServiceLogs/sleeper-orig.json
18/06/28 22:00:44 ERROR client.ApiServiceClient: Authentication required
{code}
-- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
[jira] [Created] (YARN-8460) please add a way to fetch 'yarn.cluster.max-application-priority'
Sumana Sathish created YARN-8460: Summary: please add a way to fetch 'yarn.cluster.max-application-priority' Key: YARN-8460 URL: https://issues.apache.org/jira/browse/YARN-8460 Project: Hadoop YARN Issue Type: Bug Reporter: Sumana Sathish Add a method to fetch the value of 'yarn.cluster.max-application-priority'. Since the property is not exposed by default, please add either a REST API or a CLI method.
[jira] [Created] (YARN-8423) GPU does not get released even though the application gets killed.
Sumana Sathish created YARN-8423: Summary: GPU does not get released even though the application gets killed. Key: YARN-8423 URL: https://issues.apache.org/jira/browse/YARN-8423 Project: Hadoop YARN Issue Type: Bug Components: yarn Reporter: Sumana Sathish Assignee: Wangda Tan
[jira] [Resolved] (YARN-8317) fix href in Queue page for RM UIV2
[ https://issues.apache.org/jira/browse/YARN-8317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sumana Sathish resolved YARN-8317. -- Resolution: Won't Fix The href is '#0' because clicking on a queue does not redirect to a new page; only the panel changes. > fix href in Queue page for RM UIV2 > -- > > Key: YARN-8317 > URL: https://issues.apache.org/jira/browse/YARN-8317 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn-ui-v2 >Reporter: Sumana Sathish >Assignee: Yesha Vora >Priority: Major > > No matter how many queues you create, the href is always '#0' for all > the queues. > Can you please make the href point to something like '#/yarn-queue/'
[jira] [Created] (YARN-8317) fix href in Queue page for RM UIV2
Sumana Sathish created YARN-8317: Summary: fix href in Queue page for RM UIV2 Key: YARN-8317 URL: https://issues.apache.org/jira/browse/YARN-8317 Project: Hadoop YARN Issue Type: Bug Components: yarn-ui-v2 Reporter: Sumana Sathish Assignee: Yesha Vora No matter how many queues you create, the href is always '#0' for every queue. Can you please make the href point to something like '#/yarn-queue/'
[jira] [Created] (YARN-8292) Preemption of GPU resource does not happen if memory/vcores is not required to be preempted
Sumana Sathish created YARN-8292: Summary: Preemption of GPU resource does not happen if memory/vcores is not required to be preempted Key: YARN-8292 URL: https://issues.apache.org/jira/browse/YARN-8292 Project: Hadoop YARN Issue Type: Bug Components: yarn Reporter: Sumana Sathish Assignee: Tan, Wangda
[jira] [Created] (YARN-8264) [UI2 GPU] GPU Info tab disappears if we click any sub link under List of Applications or List of Containers
Sumana Sathish created YARN-8264: Summary: [UI2 GPU] GPU Info tab disappears if we click any sub link under List of Applications or List of Containers Key: YARN-8264 URL: https://issues.apache.org/jira/browse/YARN-8264 Project: Hadoop YARN Issue Type: Bug Reporter: Sumana Sathish Assignee: Sunil G

Run an application asking for 5000 containers so that it remains RUNNING for a long time.
1. Click the Nodes tab in the RM UI
2. Click on the node link under 'Node HTTP Address'
3. Click on the 'List of Applications' tab and click on the application id available
4. The GPU Info tab goes away
5. Similarly, click on the 'List of Containers' tab and click on the container id link available
6. The GPU Info tab goes away again
[jira] [Created] (YARN-8230) [UI2] Attempt Info page url shows NA for several fields for container info
Sumana Sathish created YARN-8230: Summary: [UI2] Attempt Info page url shows NA for several fields for container info Key: YARN-8230 URL: https://issues.apache.org/jira/browse/YARN-8230 Project: Hadoop YARN Issue Type: Bug Components: yarn, yarn-ui-v2 Reporter: Sumana Sathish Assignee: Chandni Singh

1. Click on any application
2. Click on the appAttempt present
3. Click on grid view
4. It shows container info, but Logs, Node Manager, and several other fields show NA, and the finished time shows as Invalid
[jira] [Created] (YARN-8229) exp
Sumana Sathish created YARN-8229: Summary: exp Key: YARN-8229 URL: https://issues.apache.org/jira/browse/YARN-8229 Project: Hadoop YARN Issue Type: Bug Reporter: Sumana Sathish
[jira] [Created] (YARN-8205) AM launching is delayed, then state is not updated in ATS
Sumana Sathish created YARN-8205: Summary: AM launching is delayed, then state is not updated in ATS Key: YARN-8205 URL: https://issues.apache.org/jira/browse/YARN-8205 Project: Hadoop YARN Issue Type: Bug Reporter: Sumana Sathish Assignee: Rohith Sharma K S There is a transient issue in the window between the app's ACCEPTED and RUNNING states: if AM launching is delayed, the state is not updated in ATS.
[jira] [Created] (YARN-8197) Tracking URL in the app state does not get redirected to MR ApplicationMaster for Running applications
Sumana Sathish created YARN-8197: Summary: Tracking URL in the app state does not get redirected to MR ApplicationMaster for Running applications Key: YARN-8197 URL: https://issues.apache.org/jira/browse/YARN-8197 Project: Hadoop YARN Issue Type: Bug Components: yarn Reporter: Sumana Sathish Assignee: Sunil G
{code}
org.eclipse.jetty.servlet.ServletHandler: javax.servlet.ServletException: Could not determine the proxy server for redirection
	at org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter.findRedirectUrl(AmIpFilter.java:211)
	at org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter.doFilter(AmIpFilter.java:145)
	at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759)
	at org.apache.hadoop.http.HttpServer2$QuotingInputFilter.doFilter(HttpServer2.java:1617)
	at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759)
	at org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45)
	at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759)
	at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582)
	at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
	at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
	at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
	at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
	at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)
	at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
	at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
	at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
	at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)
	at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
	at org.eclipse.jetty.server.Server.handle(Server.java:534)
	at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:320)
	at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)
	at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:283)
	at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:108)
	at org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
	at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303)
	at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148)
	at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136)
	at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671)
	at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589)
	at java.lang.Thread.run(Thread.java:748)
{code}
[jira] [Created] (YARN-8187) [UI2] clicking on Individual Nodes does not contain breadcrumbs in Nodes Page
Sumana Sathish created YARN-8187: Summary: [UI2] clicking on Individual Nodes does not contain breadcrumbs in Nodes Page Key: YARN-8187 URL: https://issues.apache.org/jira/browse/YARN-8187 Project: Hadoop YARN Issue Type: Bug Components: yarn-ui-v2 Reporter: Sumana Sathish Assignee: Zian Chen

1. Click on the 'Nodes' tab in the RM home page
2. Click on an individual node under 'Node HTTP Address'
3. No breadcrumbs are available, like '/Home/Nodes/Node Id/'
4. Breadcrumbs come back once we click on other tabs like 'List of Applications' or 'List of Containers'.
[jira] [Created] (YARN-8183) YarnClient for Kill Application stuck in infinite loop with message "Waiting for Application to be killed"
Sumana Sathish created YARN-8183: Summary: YarnClient for Kill Application stuck in infinite loop with message "Waiting for Application to be killed" Key: YARN-8183 URL: https://issues.apache.org/jira/browse/YARN-8183 Project: Hadoop YARN Issue Type: Bug Components: yarn Reporter: Sumana Sathish Assignee: Suma Shivaprasad YarnClient gets stuck killing the application, repeatedly printing the following message:
{code}
INFO impl.YarnClientImpl: Waiting for application application_1523604760756_0001 to be killed.
{code}
The RM shows the following exception:
{code}
ERROR resourcemanager.ResourceManager (ResourceManager.java:handle(995)) - Error in handling event type APP_UPDATE_SAVED for application application_ID
java.util.ConcurrentModificationException
	at java.util.HashMap$HashIterator.nextNode(HashMap.java:1442)
	at java.util.HashMap$EntryIterator.next(HashMap.java:1476)
	at java.util.HashMap$EntryIterator.next(HashMap.java:1474)
	at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptMetrics.convertAtomicLongMaptoLongMap(RMAppAttemptMetrics.java:212)
	at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptMetrics.getAggregateAppResourceUsage(RMAppAttemptMetrics.java:133)
	at org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl.getRMAppMetrics(RMAppImpl.java:1660)
	at org.apache.hadoop.yarn.server.resourcemanager.metrics.TimelineServiceV2Publisher.appFinished(TimelineServiceV2Publisher.java:178)
	at org.apache.hadoop.yarn.server.resourcemanager.metrics.CombinedSystemMetricsPublisher.appFinished(CombinedSystemMetricsPublisher.java:73)
	at org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl$FinalTransition.transition(RMAppImpl.java:1470)
	at org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl$AppKilledTransition.transition(RMAppImpl.java:1408)
	at org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl$AppKilledTransition.transition(RMAppImpl.java:1400)
	at org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl$FinalStateSavedTransition.transition(RMAppImpl.java:1177)
	at org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl$FinalStateSavedTransition.transition(RMAppImpl.java:1164)
	at org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:385)
	at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
	at org.apache.hadoop.yarn.state.StateMachineFactory.access$500(StateMachineFactory.java:46)
	at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:487)
	at org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl.handle(RMAppImpl.java:898)
	at org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl.handle(RMAppImpl.java:118)
	at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationEventDispatcher.handle(ResourceManager.java:993)
	at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationEventDispatcher.handle(ResourceManager.java:977)
	at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:197)
	at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:126)
	at java.lang.Thread.run(Thread.java:748)
{code}
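The ConcurrentModificationException above comes from iterating a plain HashMap while it is mutated concurrently (here, the attempt-metrics map read by the timeline publisher while it is still being updated). A minimal, self-contained reproduction of that failure mode, with hypothetical key names:

```java
import java.util.Map;

// Minimal reproduction of the failure mode in the trace above: mutating a
// plain java.util.HashMap while iterating it throws
// ConcurrentModificationException, while ConcurrentHashMap's weakly
// consistent iterators tolerate concurrent updates. Key names are made up.
class MetricsIterationSketch {
    // Returns true if iteration survives a put() performed mid-iteration.
    static boolean iterateWhileMutating(Map<String, Long> metrics) {
        metrics.put("memorySeconds", 1L);
        metrics.put("vcoreSeconds", 2L);
        try {
            for (Map.Entry<String, Long> e : metrics.entrySet()) {
                // Simulates another thread adding an entry to the map while
                // a reader is iterating it (a structural modification).
                metrics.put("resourceSecondsUpdatedAt", System.nanoTime());
            }
            return true;
        } catch (java.util.ConcurrentModificationException ex) {
            return false;
        }
    }
}
```

Passing a `new HashMap<>()` here returns false (the exception above), while a `new ConcurrentHashMap<>()` returns true; switching the shared map to a concurrent collection, or copying it under a lock before iterating, are the usual directions for a fix (the actual fix chosen in YARN may differ).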
[jira] [Created] (YARN-8182) [UI2] Proxy- Clicking on nodes under Nodes HeatMap gives 401 error
Sumana Sathish created YARN-8182: Summary: [UI2] Proxy- Clicking on nodes under Nodes HeatMap gives 401 error Key: YARN-8182 URL: https://issues.apache.org/jira/browse/YARN-8182 Project: Hadoop YARN Issue Type: Bug Reporter: Sumana Sathish Assignee: Sunil G

1. Click on the 'Nodes' tab in the RM UI
2. Click on the 'Nodes HeatMap' tab under Nodes
3. Click on any of the nodes available. It gives a 401 error
[jira] [Created] (YARN-8075) DShell does not Fail when we ask more GPUs than available even though AM throws 'InvalidResourceRequestException'
Sumana Sathish created YARN-8075: Summary: DShell does not Fail when we ask more GPUs than available even though AM throws 'InvalidResourceRequestException' Key: YARN-8075 URL: https://issues.apache.org/jira/browse/YARN-8075 Project: Hadoop YARN Issue Type: Bug Reporter: Sumana Sathish Assignee: Wangda Tan

Run a DShell application asking for 4 GPUs per container. The application remains in RUNNING state, but the AM log shows it failed to launch containers with org.apache.hadoop.yarn.exceptions.InvalidResourceRequestException:
{code}
ERROR impl.AMRMClientAsyncImpl: Exception on heartbeat
org.apache.hadoop.yarn.exceptions.InvalidResourceRequestException: Invalid resource request, requested resource type=[yarn.io/gpu] < 0 or greater than maximum allowed allocation. Requested resource=, maximum allowed allocation=, please note that maximum allowed allocation is calculated by scheduler based on maximum resource of registered NodeManagers, which might be less than configured maximum allocation=
	at org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.validateResourceRequest(SchedulerUtils.java:286)
	at org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.normalizeAndValidateRequest(SchedulerUtils.java:242)
	at org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.normalizeAndvalidateRequest(SchedulerUtils.java:258)
	at org.apache.hadoop.yarn.server.resourcemanager.RMServerUtils.normalizeAndValidateRequests(RMServerUtils.java:249)
	at org.apache.hadoop.yarn.server.resourcemanager.DefaultAMSProcessor.allocate(DefaultAMSProcessor.java:230)
	at org.apache.hadoop.yarn.server.resourcemanager.scheduler.constraint.processor.DisabledPlacementProcessor.allocate(DisabledPlacementProcessor.java:75)
	at org.apache.hadoop.yarn.server.resourcemanager.AMSProcessingChain.allocate(AMSProcessingChain.java:92)
	at org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService.allocate(ApplicationMasterService.java:433)
	at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationMasterProtocolPBServiceImpl.allocate(ApplicationMasterProtocolPBServiceImpl.java:60)
	at org.apache.hadoop.yarn.proto.ApplicationMasterProtocol$ApplicationMasterProtocolService$2.callBlockingMethod(ApplicationMasterProtocol.java:99)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
	at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:869)
	at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:815)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1682)
	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2675)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
	at org.apache.hadoop.yarn.ipc.RPCUtil.instantiateException(RPCUtil.java:53)
	at org.apache.hadoop.yarn.ipc.RPCUtil.instantiateYarnException(RPCUtil.java:75)
	at org.apache.hadoop.yarn.ipc.RPCUtil.unwrapAndThrowException(RPCUtil.java:116)
	at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationMasterProtocolPBClientImpl.allocate(ApplicationMasterProtocolPBClientImpl.java:79)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
	at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
	at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
	at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
	at com.sun.proxy.$Proxy8.allocate(Unknown Source)
	at org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl.allocate(AMRMClientImpl.java:313)
	at org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl$HeartbeatThread.run(A
{code}
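The bug is essentially an error-propagation gap: the AM receives InvalidResourceRequestException on its heartbeat, but nothing turns that into a failed application, so the shell stays RUNNING. A hypothetical sketch of the fail-fast pattern one would expect; none of these class or method names are actual Hadoop APIs:

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch (not Hadoop API): an AM-side heartbeat callback that
// records a fatal allocation error so the application can unregister as
// FAILED instead of staying in RUNNING forever, as this report describes.
class FailFastAmSketch {
    interface HeartbeatCallback {
        void onError(Throwable t);
    }

    static class AmState implements HeartbeatCallback {
        private final AtomicBoolean fatalError = new AtomicBoolean(false);
        private volatile String diagnostics = "";

        @Override
        public void onError(Throwable t) {
            // An invalid resource request can never succeed on retry, so it
            // should be treated as fatal rather than logged and ignored.
            diagnostics = String.valueOf(t.getMessage());
            fatalError.set(true);
        }

        boolean shouldUnregisterAsFailed() {
            return fatalError.get();
        }

        String getDiagnostics() {
            return diagnostics;
        }
    }
}
```

The AM's main loop would poll `shouldUnregisterAsFailed()` and unregister with a FAILED final status plus the diagnostics string, so the client sees the failure instead of an application that never finishes.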
[jira] [Created] (YARN-8005) Add unit tests for queue priority with dominant resource calculator
Sumana Sathish created YARN-8005: Summary: Add unit tests for queue priority with dominant resource calculator Key: YARN-8005 URL: https://issues.apache.org/jira/browse/YARN-8005 Project: Hadoop YARN Issue Type: Bug Reporter: Sumana Sathish Assignee: Wangda Tan
[jira] [Created] (YARN-8004) Add unit tests for inter queue preemption for dominant resource calculator
Sumana Sathish created YARN-8004: Summary: Add unit tests for inter queue preemption for dominant resource calculator Key: YARN-8004 URL: https://issues.apache.org/jira/browse/YARN-8004 Project: Hadoop YARN Issue Type: Bug Components: yarn Reporter: Sumana Sathish Assignee: Wangda Tan
[jira] [Created] (YARN-7761) [UI2] Clicking 'master container log' or 'Link' next to 'log' under application's appAttempt goes to Old UI's Log link
Sumana Sathish created YARN-7761: Summary: [UI2] Clicking 'master container log' or 'Link' next to 'log' under application's appAttempt goes to Old UI's Log link Key: YARN-7761 URL: https://issues.apache.org/jira/browse/YARN-7761 Project: Hadoop YARN Issue Type: Bug Components: yarn-ui-v2 Reporter: Sumana Sathish Assignee: Vasudevan Skm Clicking 'master container log' or 'Link' next to 'Log' under application's appAttempt goes to Old UI's Log link
[jira] [Created] (YARN-7760) [UI2] Clicking 'Master Node' or link next to 'AM Node Web UI' under application's appAttempt page goes to OLD RM UI
Sumana Sathish created YARN-7760: Summary: [UI2] Clicking 'Master Node' or link next to 'AM Node Web UI' under application's appAttempt page goes to OLD RM UI Key: YARN-7760 URL: https://issues.apache.org/jira/browse/YARN-7760 Project: Hadoop YARN Issue Type: Bug Reporter: Sumana Sathish Assignee: Vasudevan Skm Clicking 'Master Node' or the link next to 'AM Node Web UI' under an application's appAttempt page goes to the old RM UI
[jira] [Created] (YARN-7759) [UI2] GPU chart shows as "Available: 0" even though GPU is available
Sumana Sathish created YARN-7759: Summary: [UI2] GPU chart shows as "Available: 0" even though GPU is available Key: YARN-7759 URL: https://issues.apache.org/jira/browse/YARN-7759 Project: Hadoop YARN Issue Type: Bug Reporter: Sumana Sathish Assignee: Vasudevan Skm The GPU chart on the NodeManager page shows zero GPUs available even though GPUs are present. Only when we click the 'GPU Information' chart does it show the correct GPU information.
[jira] [Created] (YARN-7738) DShell requesting gpu resources fails to run
Sumana Sathish created YARN-7738: Summary: DShell requesting gpu resources fails to run Key: YARN-7738 URL: https://issues.apache.org/jira/browse/YARN-7738 Project: Hadoop YARN Issue Type: Bug Reporter: Sumana Sathish Assignee: Tan, Wangda Priority: Critical Run a DShell app requesting 1 GPU on a node which has 2 GPUs; the application finishes in FAILED state -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
[jira] [Resolved] (YARN-7234) Kill Application button shows "404 Error" even though the application gets killed
[ https://issues.apache.org/jira/browse/YARN-7234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sumana Sathish resolved YARN-7234. -- Resolution: Cannot Reproduce Release Note: Do not see the issue anymore > Kill Application button shows "404 Error" even though the application gets > killed > - > > Key: YARN-7234 > URL: https://issues.apache.org/jira/browse/YARN-7234 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Sumana Sathish >Assignee: Suma Shivaprasad >Priority: Critical > > In the secured UI, > run an application as 'hrt_qa'. > Kill the application by clicking the "kill application" button in the UI once you > have logged in as the hrt_qa user. > A 404 error is shown even though the application gets killed.
[jira] [Created] (YARN-7269) Tracking URL in the app state does not get redirected to ApplicationMaster for Running applications
Sumana Sathish created YARN-7269: Summary: Tracking URL in the app state does not get redirected to ApplicationMaster for Running applications Key: YARN-7269 URL: https://issues.apache.org/jira/browse/YARN-7269 Project: Hadoop YARN Issue Type: Bug Reporter: Sumana Sathish Assignee: Tan, Wangda Priority: Critical The tracking URL in the app state does not get redirected to the ApplicationMaster for running applications. It gives the following exception:
{code}
org.mortbay.log: /ws/v1/mapreduce/info
javax.servlet.ServletException: Could not determine the proxy server for redirection
	at org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter.findRedirectUrl(AmIpFilter.java:199)
	at org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter.doFilter(AmIpFilter.java:141)
	at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
	at org.apache.hadoop.http.HttpServer2$QuotingInputFilter.doFilter(HttpServer2.java:1426)
	at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
	at org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45)
	at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
	at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
	at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
	at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
	at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
	at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
	at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
	at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
	at org.mortbay.jetty.Server.handle(Server.java:326)
	at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
	at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
	at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
	at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
	at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
	at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410)
	at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
{code}
[jira] [Created] (YARN-7234) Kill Application button shows "404 Error" even though the application gets killed
Sumana Sathish created YARN-7234: Summary: Kill Application button shows "404 Error" even though the application gets killed Key: YARN-7234 URL: https://issues.apache.org/jira/browse/YARN-7234 Project: Hadoop YARN Issue Type: Bug Reporter: Sumana Sathish Priority: Critical In a secured UI, run an application as 'hrt_qa'. Log in to the UI as the hrt_qa user and kill the application by clicking the "kill application" button. A 404 error is shown even though the application gets killed.
[jira] [Created] (YARN-7185) Application fails to go to FINISHED state or sometimes to RUNNING state
Sumana Sathish created YARN-7185: Summary: Application fails to go to FINISHED state or sometimes to RUNNING state Key: YARN-7185 URL: https://issues.apache.org/jira/browse/YARN-7185 Project: Hadoop YARN Issue Type: Bug Components: yarn Reporter: Sumana Sathish Assignee: Tan, Wangda Priority: Critical Application fails to go to FINISHED state or sometimes to RUNNING state. In the NodeManager log, we can see the following warning {Code} WARN scheduler.ContainerScheduler (ContainerScheduler.java:pickOpportunisticContainersToKill(458)) - There are no sufficient resources to start guaranteed container_ at the moment. Opportunistic containers are in the process of being killed to make room {Code}
[jira] [Created] (YARN-7011) yarn-daemon.sh is not respecting --config option
Sumana Sathish created YARN-7011: Summary: yarn-daemon.sh is not respecting --config option Key: YARN-7011 URL: https://issues.apache.org/jira/browse/YARN-7011 Project: Hadoop YARN Issue Type: Bug Components: yarn Reporter: Sumana Sathish Priority: Blocker Fix For: 3.0.0-beta1 Steps to reproduce: 1. Copy the conf to a temporary location /tmp/Conf 2. Modify anything in yarn-site.xml under /tmp/Conf/. Ex: Give an invalid RM address 3. Restart the resourcemanager using yarn-daemon.sh with --config /tmp/Conf 4. --config is not respected: the changes made in /tmp/Conf/yarn-site.xml are not picked up when the RM restarts
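For illustration, a minimal Python sketch of the precedence the report expects a daemon launcher to honor: an explicit --config directory should override the HADOOP_CONF_DIR environment variable, which in turn overrides the packaged default. The helper name `resolve_conf_dir` is hypothetical, not actual yarn-daemon.sh logic.

```python
import os

def resolve_conf_dir(cli_config=None, env=None):
    """Pick the configuration directory a daemon launcher should use.

    Precedence (highest first): explicit --config argument, then the
    HADOOP_CONF_DIR environment variable, then the packaged default.
    Hypothetical helper for illustration only.
    """
    env = env if env is not None else os.environ
    if cli_config:
        return cli_config
    return env.get("HADOOP_CONF_DIR", "/etc/hadoop/conf")

# The report describes --config /tmp/Conf being ignored; the expected
# resolution is that the CLI argument wins:
assert resolve_conf_dir("/tmp/Conf", {"HADOOP_CONF_DIR": "/etc/hadoop/conf"}) == "/tmp/Conf"
```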
[jira] [Created] (YARN-6992) "Kill application" button is present even if the application is FINISHED in RM UI
Sumana Sathish created YARN-6992: Summary: "Kill application" button is present even if the application is FINISHED in RM UI Key: YARN-6992 URL: https://issues.apache.org/jira/browse/YARN-6992 Project: Hadoop YARN Issue Type: Bug Reporter: Sumana Sathish Assignee: Suma Shivaprasad
[jira] [Created] (YARN-6991) "Kill application" button does not show error if other user tries to kill the application for secure cluster
Sumana Sathish created YARN-6991: Summary: "Kill application" button does not show error if other user tries to kill the application for secure cluster Key: YARN-6991 URL: https://issues.apache.org/jira/browse/YARN-6991 Project: Hadoop YARN Issue Type: Bug Reporter: Sumana Sathish Assignee: Suma Shivaprasad 1. Submit an application as user 1 2. Log into the RM UI as user 2 3. Kill the application submitted by user 1 4. Even though the application does not get killed, no error/info dialog box is shown to let the user know that they do not have permission to kill another user's application
[jira] [Created] (YARN-6977) Node information is not provided for non am containers in RM logs
Sumana Sathish created YARN-6977: Summary: Node information is not provided for non am containers in RM logs Key: YARN-6977 URL: https://issues.apache.org/jira/browse/YARN-6977 Project: Hadoop YARN Issue Type: Bug Reporter: Sumana Sathish There is no information on which node a non-AM container is assigned in trunk for 3.0. Earlier we used to have logs for non-AM containers in a similar way {code} Assigned container container_ of capacity on host , which has 1 containers, used and available after allocation {code} 3.0 has this information for the AM container alone in the following way {code} Done launching container Container: [ContainerId: container_, AllocationRequestId: 0, Version: 0, NodeId:nodeID, NodeHttpAddress: nodeAddress, Resource: , Priority: 0, Token: Token { kind: ContainerToken, service: service}, ExecutionType: GUARANTEED, ] for AM appattempt_ {code} Can we please have a similar message for non-AM containers too?
[jira] [Created] (YARN-6891) Can kill other user's applications via RM UI
Sumana Sathish created YARN-6891: Summary: Can kill other user's applications via RM UI Key: YARN-6891 URL: https://issues.apache.org/jira/browse/YARN-6891 Project: Hadoop YARN Issue Type: Bug Reporter: Sumana Sathish Assignee: Junping Du Priority: Critical In a secured cluster with an unsecured UI, which has the following config {code} "hadoop.http.authentication.simple.anonymous.allowed" => "true" "hadoop.http.authentication.type" => kerberos {code} the UI can be accessed without any security setting. Also, any user can kill another user's applications via the UI.
[jira] [Created] (YARN-6570) No logs were found for running application, running container
Sumana Sathish created YARN-6570: Summary: No logs were found for running application, running container Key: YARN-6570 URL: https://issues.apache.org/jira/browse/YARN-6570 Project: Hadoop YARN Issue Type: Bug Components: yarn Reporter: Sumana Sathish Assignee: Junping Du Priority: Critical 1. Obtain running containers from the following CLI for a running application: yarn container -list appattempt 2. Could not fetch logs {code} Can not find any log file matching the pattern: ALL for the container {code}
[jira] [Created] (YARN-6271) yarn rmadin -getGroups returns information from standby RM
Sumana Sathish created YARN-6271: Summary: yarn rmadin -getGroups returns information from standby RM Key: YARN-6271 URL: https://issues.apache.org/jira/browse/YARN-6271 Project: Hadoop YARN Issue Type: Bug Components: yarn Reporter: Sumana Sathish Assignee: Jian He Priority: Critical
[jira] [Created] (YARN-6174) Log files pattern should be same for both running and finished container
Sumana Sathish created YARN-6174: Summary: Log files pattern should be same for both running and finished container Key: YARN-6174 URL: https://issues.apache.org/jira/browse/YARN-6174 Project: Hadoop YARN Issue Type: Improvement Components: yarn Reporter: Sumana Sathish Assignee: Xuan Gong The log files pattern should be the same for both running and finished containers. As of now, {Code:title= running container} Container: container_e48_1486721307661_0002_01_01 on ctr-e127-1486658464320-1142-01-04.hwx.site:25454 LogType: LOCAL FileName:syslog {Code} {Code:title= finished container} LogType:syslog LogLength:3730 |Log Contents: {Code}
[jira] [Resolved] (YARN-5539) AM fails due to "java.net.SocketTimeoutException: Read timed out"
[ https://issues.apache.org/jira/browse/YARN-5539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sumana Sathish resolved YARN-5539. -- Resolution: Cannot Reproduce Not able to reproduce the issue. > AM fails due to "java.net.SocketTimeoutException: Read timed out" > - > > Key: YARN-5539 > URL: https://issues.apache.org/jira/browse/YARN-5539 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Reporter: Sumana Sathish >Assignee: Junping Du >Priority: Critical
[jira] [Created] (YARN-5539) AM fails due to "java.net.SocketTimeoutException: Read timed out"
Sumana Sathish created YARN-5539: Summary: AM fails due to "java.net.SocketTimeoutException: Read timed out" Key: YARN-5539 URL: https://issues.apache.org/jira/browse/YARN-5539 Project: Hadoop YARN Issue Type: Bug Components: yarn Reporter: Sumana Sathish Assignee: Junping Du Priority: Critical AM fails with the following exception {code} FATAL distributedshell.ApplicationMaster: Error running ApplicationMaster com.sun.jersey.api.client.ClientHandlerException: java.net.SocketTimeoutException: Read timed out at com.sun.jersey.client.urlconnection.URLConnectionClientHandler.handle(URLConnectionClientHandler.java:149) at org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl$TimelineJerseyRetryFilter$1.run(TimelineClientImpl.java:236) at org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl$TimelineClientConnectionRetry.retryOn(TimelineClientImpl.java:185) at org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl$TimelineJerseyRetryFilter.handle(TimelineClientImpl.java:247) at com.sun.jersey.api.client.Client.handle(Client.java:648) at com.sun.jersey.api.client.WebResource.handle(WebResource.java:670) at com.sun.jersey.api.client.WebResource.access$200(WebResource.java:74) at com.sun.jersey.api.client.WebResource$Builder.post(WebResource.java:563) at org.apache.hadoop.yarn.client.api.impl.TimelineWriter.doPostingObject(TimelineWriter.java:154) at org.apache.hadoop.yarn.client.api.impl.TimelineWriter$1.run(TimelineWriter.java:115) at org.apache.hadoop.yarn.client.api.impl.TimelineWriter$1.run(TimelineWriter.java:112) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724) at org.apache.hadoop.yarn.client.api.impl.TimelineWriter.doPosting(TimelineWriter.java:112) at org.apache.hadoop.yarn.client.api.impl.TimelineWriter.putEntities(TimelineWriter.java:92) at 
org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.putEntities(TimelineClientImpl.java:345) at org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster.publishApplicationAttemptEvent(ApplicationMaster.java:1166) at org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster.run(ApplicationMaster.java:567) at org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster.main(ApplicationMaster.java:298) Caused by: java.net.SocketTimeoutException: Read timed out at java.net.SocketInputStream.socketRead0(Native Method) at java.net.SocketInputStream.socketRead(SocketInputStream.java:116) at java.net.SocketInputStream.read(SocketInputStream.java:170) at java.net.SocketInputStream.read(SocketInputStream.java:141) at java.io.BufferedInputStream.fill(BufferedInputStream.java:246) at java.io.BufferedInputStream.read1(BufferedInputStream.java:286) at java.io.BufferedInputStream.read(BufferedInputStream.java:345) at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:704) at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:647) at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1536) at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1441) at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:480) at org.apache.hadoop.security.authentication.client.AuthenticatedURL.extractToken(AuthenticatedURL.java:253) at org.apache.hadoop.security.authentication.client.PseudoAuthenticator.authenticate(PseudoAuthenticator.java:77) at org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticator.authenticate(DelegationTokenAuthenticator.java:132) at org.apache.hadoop.security.authentication.client.AuthenticatedURL.openConnection(AuthenticatedURL.java:216) at org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticatedURL.openConnection(DelegationTokenAuthenticatedURL.java:322) at 
org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl$TimelineURLConnectionFactory.getHttpURLConnection(TimelineClientImpl.java:472) at com.sun.jersey.client.urlconnection.URLConnectionClientHandler._invoke(URLConnectionClientHandler.java:159) at com.sun.jersey.client.urlconnection.URLConnectionClientHandler.handle(URLConnectionClientHandler.java:147) ... 19 more {code}
[jira] [Created] (YARN-5500) 'Master node' link under application tab is broken
Sumana Sathish created YARN-5500: Summary: 'Master node' link under application tab is broken Key: YARN-5500 URL: https://issues.apache.org/jira/browse/YARN-5500 Project: Hadoop YARN Issue Type: Bug Reporter: Sumana Sathish Priority: Critical Steps to reproduce: * Click on the running application portion on the donut under "Cluster resource usage by applications" * Under App Master Info, there is a link provided for "Master Node". The link is broken. It doesn't redirect to any page.
[jira] [Created] (YARN-5499) Logs of container loads first time but fails if you go back and click again
Sumana Sathish created YARN-5499: Summary: Logs of container loads first time but fails if you go back and click again Key: YARN-5499 URL: https://issues.apache.org/jira/browse/YARN-5499 Project: Hadoop YARN Issue Type: Bug Reporter: Sumana Sathish Priority: Critical Steps to reproduce: *Click on Nodes. This page will list nodes of the cluster *Select a node which has running container *Select 'list of containers' *Select any one of the logs link like stdout or stderr. Logs link appear properly *Click back on the browser to go to container page again *Select any other logs link The Url prompts "Sorry, Error Occurred." {code} jquery.js:8630 XMLHttpRequest cannot load http://cn044-10.l42scl.hortonworks.com:8042cn044-10.l42scl.hortonworks.com:…e/containerlogs/container_1469893274276_0261_01_09/launch_container.sh. Cross origin requests are only supported for protocol schemes: http, data, chrome, chrome-extension, https, chrome-extension-resource.send @ jquery.js:8630ajax @ jquery.js:8166(anonymous function) @ rest-adapter.js:764initializePromise @ ember.debug.js:52308Promise @ ember.debug.js:54158ajax @ rest-adapter.js:729ajax @ yarn-container-log.js:36superWrapper @ ember.debug.js:22060findRecord @ rest-adapter.js:333ember$data$lib$system$store$finders$$_find @ finders.js:18fetchRecord @ store.js:541_fetchRecord @ store.js:595_flushPendingFetchForType @ store.js:641cb @ ember.debug.js:17448forEach @ ember.debug.js:17251forEach @ ember.debug.js:17456flushAllPendingFetches @ store.js:584invoke @ ember.debug.js:320flush @ ember.debug.js:384flush @ ember.debug.js:185end @ ember.debug.js:563run @ ember.debug.js:685run @ ember.debug.js:20105(anonymous function) @ ember.debug.js:23761dispatch @ jquery.js:4435elemData.handle @ jquery.js:4121 ember.debug.js:30877 Error: Adapter operation failed at new Error (native) at Error.EmberError (http://localhost:4200/assets/vendor.js:25278:21) at Error.ember$data$lib$adapters$errors$$AdapterError 
(http://localhost:4200/assets/vendor.js:91198:50) at Class.handleResponse (http://localhost:4200/assets/vendor.js:92494:16) at Class.hash.error (http://localhost:4200/assets/vendor.js:92574:33) at fire (http://localhost:4200/assets/vendor.js:3306:30) at Object.fireWith [as rejectWith] (http://localhost:4200/assets/vendor.js:3418:7) at done (http://localhost:4200/assets/vendor.js:8473:14) at XMLHttpRequest. (http://localhost:4200/assets/vendor.js:8806:9) at Object.send (http://localhost:4200/assets/vendor.js:8837:10) {code}
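The CORS error above shows a malformed request URL where the node address appears twice with no scheme separator ("...:8042cn044-10..."), which suggests an address was joined into a URL without a scheme prefix. A minimal Python sketch of defensively normalizing a node address before building a log URL; the helper `build_log_url` is hypothetical and not the UI's actual code.

```python
def build_log_url(node_http_address, path):
    """Prefix a scheme if absent so two bare addresses never concatenate
    into one malformed URL like 'host:8042host:8042/...'.
    Hypothetical helper illustrating the malformed-URL symptom.
    """
    if not node_http_address.startswith(("http://", "https://")):
        node_http_address = "http://" + node_http_address
    # Join with exactly one slash between host and path.
    return node_http_address.rstrip("/") + "/" + path.lstrip("/")

# A bare NodeManager address gets a scheme instead of being glued onto
# the previous URL fragment:
assert build_log_url("cn044:8042", "node/containerlogs/c1/stdout") == \
    "http://cn044:8042/node/containerlogs/c1/stdout"
```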
[jira] [Created] (YARN-5361) Obtaining logs for completed container says 'file belongs to a running container ' at the end
Sumana Sathish created YARN-5361: Summary: Obtaining logs for completed container says 'file belongs to a running container ' at the end Key: YARN-5361 URL: https://issues.apache.org/jira/browse/YARN-5361 Project: Hadoop YARN Issue Type: Improvement Reporter: Sumana Sathish Assignee: Xuan Gong Priority: Critical Obtaining logs via the YARN CLI for a completed container of a still-running application says "This log file belongs to a running container (container_e32_1468319707096_0001_01_04) and so may not be complete", which is not correct. {code} LogType:stdout Log Upload Time:Tue Jul 12 10:38:14 + 2016 Log Contents: End of LogType:stdout. This log file belongs to a running container (container_e32_1468319707096_0001_01_04) and so may not be complete. {code}
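The report implies the disclaimer is keyed off the application's state rather than the container's. A minimal Python sketch of the expected behavior: the footer should depend on whether the *container* has finished, since a completed container's aggregated log is final even while the application keeps running. The helper `log_footer` is hypothetical, not the actual CLI code.

```python
def log_footer(container_id, container_finished):
    """Return the trailing disclaimer for a dumped container log.

    The decision is keyed off the container's state, not the
    application's: no disclaimer for a finished container.
    Hypothetical sketch of the behavior the report asks for.
    """
    if container_finished:
        return ""
    return ("This log file belongs to a running container "
            f"({container_id}) and so may not be complete.")
```

With this rule, the completed container in the report would get no "running container" disclaimer even though its application is still RUNNING.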
[jira] [Created] (YARN-5340) App Name/User/RPC Port/AM Host info is missing from ATS web service or YARN CLI's app info
Sumana Sathish created YARN-5340: Summary: App Name/User/RPC Port/AM Host info is missing from ATS web service or YARN CLI's app info Key: YARN-5340 URL: https://issues.apache.org/jira/browse/YARN-5340 Project: Hadoop YARN Issue Type: Bug Components: yarn Reporter: Sumana Sathish Assignee: Li Lu Priority: Critical App Name/User/RPC Port/AM Host info is missing from ATS web service or YARN CLI's app info {code} RUNNING: /usr/hdp/current/hadoop-yarn-client/bin/yarn --config /tmp/hadoopConf application -status application_1467931619679_0001 Application Report : Application-Id : application_1467931619679_0001 Application-Name : null Application-Type : null User : null Queue : null Application Priority : null Start-Time : 0 Finish-Time : 1467931672057 Progress : 100% State : FINISHED Final-State : SUCCEEDED Tracking-URL : N/A RPC Port : -1 AM Host : N/A Aggregate Resource Allocation : 290014 MB-seconds, 74 vcore-seconds Log Aggregation Status : N/A {code}
[jira] [Created] (YARN-5339) passing file to -out for YARN log CLI doesnt give warning or error code
Sumana Sathish created YARN-5339: Summary: passing file to -out for YARN log CLI doesnt give warning or error code Key: YARN-5339 URL: https://issues.apache.org/jira/browse/YARN-5339 Project: Hadoop YARN Issue Type: Improvement Reporter: Sumana Sathish Assignee: Xuan Gong Passing a file to -out for the YARN log CLI doesn't give a warning or an error code {code} /usr/hdp/current/hadoop-yarn-client/bin/yarn logs -applicationId application_1467117709224_0003 -out /grid/0/hadoopqe/artifacts/file.txt {code}
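Since the CLI writes per-container log files under the -out target, an existing regular file at that path should be rejected up front. A minimal Python sketch of the validation the report asks for; the helper `validate_out_dir` is hypothetical, not the LogsCLI implementation.

```python
import os

def validate_out_dir(path):
    """Return (exit_code, message) for a hypothetical '-out' check.

    Per-container log files are written under the target directory, so
    an existing regular file should produce a warning and a nonzero
    exit code instead of being silently accepted.
    """
    if os.path.isfile(path):
        return 1, f"Invalid value for -out: {path} is a file, expected a directory"
    return 0, ""
```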
[jira] [Created] (YARN-5337) Dshell AM failed with "java.lang.OutOfMemoryError: GC overhead limit exceeded"
Sumana Sathish created YARN-5337: Summary: Dshell AM failed with "java.lang.OutOfMemoryError: GC overhead limit exceeded" Key: YARN-5337 URL: https://issues.apache.org/jira/browse/YARN-5337 Project: Hadoop YARN Issue Type: Bug Reporter: Sumana Sathish Assignee: Jian He Please find AM logs with the exception {code} INFO distributedshell.ApplicationMaster: Container completed successfully., containerId=container_e49_1467633982200_0001_01_04 Exception in thread "AMRM Callback Handler Thread" org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.lang.OutOfMemoryError: GC overhead limit exceeded at org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl$CallbackHandlerThread.run(AMRMClientAsyncImpl.java:312) Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded at java.lang.Object.clone(Native Method) at java.lang.reflect.Method.getParameterTypes(Method.java:264) at java.lang.reflect.Executable.getGenericParameterTypes(Executable.java:285) at java.lang.reflect.Method.getGenericParameterTypes(Method.java:283) at org.codehaus.jackson.map.introspect.AnnotatedMethod.getParameterTypes(AnnotatedMethod.java:143) at org.codehaus.jackson.map.introspect.AnnotatedMethod.getParameterCount(AnnotatedMethod.java:139) at org.codehaus.jackson.map.introspect.POJOPropertiesCollector._addMethods(POJOPropertiesCollector.java:427) at org.codehaus.jackson.map.introspect.POJOPropertiesCollector.collect(POJOPropertiesCollector.java:219) at org.codehaus.jackson.map.introspect.BasicClassIntrospector.collectProperties(BasicClassIntrospector.java:160) at org.codehaus.jackson.map.introspect.BasicClassIntrospector.forSerialization(BasicClassIntrospector.java:96) at org.codehaus.jackson.map.introspect.BasicClassIntrospector.forSerialization(BasicClassIntrospector.java:16) at org.codehaus.jackson.map.SerializationConfig.introspect(SerializationConfig.java:973) at org.codehaus.jackson.map.ser.BeanSerializerFactory.createSerializer(BeanSerializerFactory.java:251) at 
org.codehaus.jackson.map.ser.StdSerializerProvider._createUntypedSerializer(StdSerializerProvider.java:782) at org.codehaus.jackson.map.ser.StdSerializerProvider._createAndCacheUntypedSerializer(StdSerializerProvider.java:735) at org.codehaus.jackson.map.ser.StdSerializerProvider.findValueSerializer(StdSerializerProvider.java:344) at org.codehaus.jackson.map.ser.impl.PropertySerializerMap.findAndAddSerializer(PropertySerializerMap.java:39) at org.codehaus.jackson.map.ser.std.MapSerializer._findAndAddDynamic(MapSerializer.java:403) at org.codehaus.jackson.map.ser.std.MapSerializer.serializeFields(MapSerializer.java:257) at org.codehaus.jackson.map.ser.std.MapSerializer.serialize(MapSerializer.java:186) at org.codehaus.jackson.map.ser.std.MapSerializer.serialize(MapSerializer.java:23) at org.codehaus.jackson.map.ser.BeanPropertyWriter.serializeAsField(BeanPropertyWriter.java:446) at org.codehaus.jackson.map.ser.std.BeanSerializerBase.serializeFields(BeanSerializerBase.java:150) at org.codehaus.jackson.map.ser.BeanSerializer.serialize(BeanSerializer.java:112) at org.codehaus.jackson.map.ser.std.StdContainerSerializers$IndexedListSerializer.serializeContents(StdContainerSerializers.java:122) at org.codehaus.jackson.map.ser.std.StdContainerSerializers$IndexedListSerializer.serializeContents(StdContainerSerializers.java:71) at org.codehaus.jackson.map.ser.std.AsArraySerializerBase.serialize(AsArraySerializerBase.java:86) at org.codehaus.jackson.map.ser.BeanPropertyWriter.serializeAsField(BeanPropertyWriter.java:446) at org.codehaus.jackson.map.ser.std.BeanSerializerBase.serializeFields(BeanSerializerBase.java:150) at org.codehaus.jackson.map.ser.BeanSerializer.serialize(BeanSerializer.java:112) at org.codehaus.jackson.map.ser.std.StdContainerSerializers$IndexedListSerializer.serializeContents(StdContainerSerializers.java:122) at org.codehaus.jackson.map.ser.std.StdContainerSerializers$IndexedListSerializer.serializeContents(StdContainerSerializers.java:71) {code}
[jira] [Created] (YARN-5268) DShell AM fails java.lang.InterruptedException
Sumana Sathish created YARN-5268: Summary: DShell AM fails java.lang.InterruptedException Key: YARN-5268 URL: https://issues.apache.org/jira/browse/YARN-5268 Project: Hadoop YARN Issue Type: Bug Components: yarn Reporter: Sumana Sathish Assignee: Tan, Wangda Priority: Critical Fix For: 2.9.0 Distributed Shell AM failed with the following error {Code} 16/06/16 11:08:10 INFO impl.NMClientAsyncImpl: NMClient stopped. 16/06/16 11:08:10 INFO distributedshell.ApplicationMaster: Application completed. Signalling finish to RM 16/06/16 11:08:10 INFO distributedshell.ApplicationMaster: Diagnostics., total=16, completed=19, allocated=21, failed=4 16/06/16 11:08:10 INFO impl.AMRMClientImpl: Waiting for application to be successfully unregistered. 16/06/16 11:08:10 INFO distributedshell.ApplicationMaster: Application Master failed. exiting 16/06/16 11:08:10 INFO impl.AMRMClientAsyncImpl: Interrupted while waiting for queue java.lang.InterruptedException at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2014) at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2048) at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442) at org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl$CallbackHandlerThread.run(AMRMClientAsyncImpl.java:287) End of LogType:AppMaster.stderr {Code}
[jira] [Created] (YARN-5266) Wrong exit code while trying to get app logs using regex via CLI
Sumana Sathish created YARN-5266: Summary: Wrong exit code while trying to get app logs using regex via CLI Key: YARN-5266 URL: https://issues.apache.org/jira/browse/YARN-5266 Project: Hadoop YARN Issue Type: Bug Components: yarn Reporter: Sumana Sathish Assignee: Xuan Gong Priority: Critical The test performs a negative test by passing the regex 'ds+' and expects an exit code != 0. *The exit code is zero and the error message is printed more than once* {code} RUNNING: /usr/hdp/current/hadoop-yarn-client/bin/yarn logs -applicationId application_1465500362360_0016 -logFiles ds+ Can not find any log file matching the pattern: [ds+] for the application: application_1465500362360_0016 2016-06-14 19:19:25,079|beaver.machine|INFO|4427|140145752217344|MainThread|Can not find any log file matching the pattern: [ds+] for the application: application_1465500362360_0016 2016-06-14 19:19:25,216|beaver.machine|INFO|4427|140145752217344|MainThread|Can not find any log file matching the pattern: [ds+] for the application: application_1465500362360_0016 2016-06-14 19:19:25,331|beaver.machine|INFO|4427|140145752217344|MainThread|Can not find any log file matching the pattern: [ds+] for the application: application_1465500362360_0016 2016-06-14 19:19:25,432|beaver.machine|INFO|4427|140145752217344|MainThread|Can not find any log file matching the pattern: [ds+] for the application: application_1465500362360_0016 {code}
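A minimal Python sketch of the expected CLI behavior: when no log file matches the requested pattern, print the diagnostic exactly once and return a nonzero exit code so the shell sees the failure. The function name `fetch_matching_logs` is hypothetical, not the LogsCLI implementation.

```python
import re

def fetch_matching_logs(available_logs, pattern, app_id):
    """Dump logs whose names fully match `pattern`.

    When nothing matches, print the diagnostic once and return 1 so a
    negative test passing a bogus pattern sees exit code != 0.
    Hypothetical sketch of the expected behavior.
    """
    regex = re.compile(pattern)
    matched = [name for name in available_logs if regex.fullmatch(name)]
    if not matched:
        print(f"Can not find any log file matching the pattern: "
              f"[{pattern}] for the application: {app_id}")
        return 1  # propagate the failure to the shell
    for name in matched:
        print(f"dumping {name} ...")
    return 0
```

Under this sketch, the 'ds+' pattern in the report would yield exit code 1 and a single diagnostic line.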
[jira] [Created] (YARN-5231) obtaining yarn logs for last 'n' bytes using CLI gives 'java.io.IOException'
Sumana Sathish created YARN-5231:

Summary: obtaining yarn logs for last 'n' bytes using CLI gives 'java.io.IOException'
Key: YARN-5231
URL: https://issues.apache.org/jira/browse/YARN-5231
Project: Hadoop YARN
Issue Type: Bug
Components: yarn
Reporter: Sumana Sathish
Assignee: Xuan Gong
Priority: Blocker

Obtaining logs for the last 'n' bytes gives the following exception
{code}
yarn logs -applicationId application_1465421211793_0004 -containerId container_e07_1465421211793_0004_01_01 -logFiles syslog -size -1000
Exception in thread "main" java.io.IOException: The bytes were skipped are different from the caller requested
    at org.apache.hadoop.yarn.logaggregation.AggregatedLogFormat$LogReader.readContainerLogsForALogType(AggregatedLogFormat.java:838)
    at org.apache.hadoop.yarn.logaggregation.LogCLIHelpers.dumpAContainerLogsForALogType(LogCLIHelpers.java:300)
    at org.apache.hadoop.yarn.logaggregation.LogCLIHelpers.dumpAContainersLogsForALogTypeWithoutNodeId(LogCLIHelpers.java:224)
    at org.apache.hadoop.yarn.client.cli.LogsCLI.printContainerLogsForFinishedApplicationWithoutNodeId(LogsCLI.java:447)
    at org.apache.hadoop.yarn.client.cli.LogsCLI.fetchContainerLogs(LogsCLI.java:782)
    at org.apache.hadoop.yarn.client.cli.LogsCLI.run(LogsCLI.java:228)
    at org.apache.hadoop.yarn.client.cli.LogsCLI.main(LogsCLI.java:264)
{code}
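The exception message suggests a call to InputStream.skip() returned fewer bytes than requested, which skip() is explicitly allowed to do. The robust pattern is to loop until the full count is consumed. A sketch of that pattern (an illustrative utility, not the actual AggregatedLogFormat code):

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

// InputStream.skip() may legally skip fewer bytes than requested, so a caller
// that treats one short skip() as an error (as the stack trace above suggests)
// should instead loop until the requested count is consumed.
// (Illustrative utility, not the actual AggregatedLogFormat code.)
public class StreamUtil {
    public static void skipFully(InputStream in, long toSkip) throws IOException {
        long remaining = toSkip;
        while (remaining > 0) {
            long skipped = in.skip(remaining);
            if (skipped > 0) {
                remaining -= skipped;
            } else if (in.read() >= 0) {
                // skip() made no progress but the stream is not at EOF;
                // consume one byte directly and keep going.
                remaining--;
            } else {
                throw new EOFException("EOF after skipping " + (toSkip - remaining)
                        + " of " + toSkip + " bytes");
            }
        }
    }
}
```

This only fails when the stream genuinely ends early, which is the one case where a short skip is a real error.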
[jira] [Created] (YARN-5131) Distributed shell AM fails Java Null Point Exception
Sumana Sathish created YARN-5131:

Summary: Distributed shell AM fails Java Null Point Exception
Key: YARN-5131
URL: https://issues.apache.org/jira/browse/YARN-5131
Project: Hadoop YARN
Issue Type: Bug
Reporter: Sumana Sathish
Assignee: Wangda Tan
Priority: Critical

DShell AM fails with the following exception
{code}
INFO impl.AMRMClientAsyncImpl: Interrupted while waiting for queue
java.lang.InterruptedException
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2017)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2052)
    at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
    at org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl$CallbackHandlerThread.run(AMRMClientAsyncImpl.java:287)
End of LogType:AppMaster.stderr
{code}
[jira] [Created] (YARN-5103) With NM recovery enabled, restarting NM multiple times results in AM restart
Sumana Sathish created YARN-5103:

Summary: With NM recovery enabled, restarting NM multiple times results in AM restart
Key: YARN-5103
URL: https://issues.apache.org/jira/browse/YARN-5103
Project: Hadoop YARN
Issue Type: Bug
Components: yarn
Reporter: Sumana Sathish
Assignee: Junping Du
Priority: Critical

AM is restarted on NM restart even though NM recovery is enabled.
{code:title=NM log on which AM attempt 1 was running}
ERROR launcher.RecoveredContainerLaunch (RecoveredContainerLaunch.java:call(88)) - Unable to recover container container_e12_1463043063682_0002_01_01
java.io.IOException: java.lang.InterruptedException
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:579)
    at org.apache.hadoop.util.Shell.run(Shell.java:487)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:753)
    at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.signalContainer(LinuxContainerExecutor.java:478)
    at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.isContainerProcessAlive(LinuxContainerExecutor.java:542)
    at org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor.reacquireContainer(ContainerExecutor.java:185)
    at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.reacquireContainer(LinuxContainerExecutor.java:445)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.RecoveredContainerLaunch.call(RecoveredContainerLaunch.java:83)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.RecoveredContainerLaunch.call(RecoveredContainerLaunch.java:46)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
{code}
[jira] [Created] (YARN-5084) Cannot obtain AM container logs for the finished application using YARN CLI
Sumana Sathish created YARN-5084:

Summary: Cannot obtain AM container logs for the finished application using YARN CLI
Key: YARN-5084
URL: https://issues.apache.org/jira/browse/YARN-5084
Project: Hadoop YARN
Issue Type: Bug
Components: yarn
Reporter: Sumana Sathish
Assignee: Xuan Gong
Priority: Critical

YARN log CLI for getting AM container logs for finished application gives the following error
{code}
yarn logs -applicationId application_1463168167998_0001 -am 1
16/05/13 19:48:38 INFO impl.TimelineClientImpl: Timeline service address:
16/05/13 19:48:38 INFO client.RMProxy: Connecting to ResourceManager at
Can not get AMContainers logs for the application:application_1463168167998_0001
This application:application_1463168167998_0001 is finished. Please enable the application history service. Or Using yarn logs -applicationId -containerId --nodeAddress to get the container logs
{code}
[jira] [Created] (YARN-5083) YARN CLI for AM logs does not give any error message if entered invalid am value
Sumana Sathish created YARN-5083:

Summary: YARN CLI for AM logs does not give any error message if entered invalid am value
Key: YARN-5083
URL: https://issues.apache.org/jira/browse/YARN-5083
Project: Hadoop YARN
Issue Type: Improvement
Components: yarn
Reporter: Sumana Sathish
Assignee: Xuan Gong

Entering an invalid value for -am in the yarn logs CLI does not give any error message
{code:title=there is no AM attempt 30 for the application}
yarn logs -applicationId -am 30
impl.TimelineClientImpl: Timeline service address:
INFO client.RMProxy: Connecting to ResourceManager at
{code}
[jira] [Created] (YARN-5080) Cannot obtain logs using YARN CLI -am for either KILLED or RUNNING AM
Sumana Sathish created YARN-5080:

Summary: Cannot obtain logs using YARN CLI -am for either KILLED or RUNNING AM
Key: YARN-5080
URL: https://issues.apache.org/jira/browse/YARN-5080
Project: Hadoop YARN
Issue Type: Bug
Components: yarn
Reporter: Sumana Sathish
Assignee: Xuan Gong
Priority: Critical

When the application is running, if we try to obtain AM logs using
{code}
yarn logs -applicationId -am 1
{code}
It throws the following error
{code}
Unable to get AM container informations for the application: Illegal character in scheme name at index 0: 0.0.0.0://
{code}
[jira] [Created] (YARN-5002) getApplicationReport call may raise NPE
Sumana Sathish created YARN-5002:

Summary: getApplicationReport call may raise NPE
Key: YARN-5002
URL: https://issues.apache.org/jira/browse/YARN-5002
Project: Hadoop YARN
Issue Type: Bug
Reporter: Sumana Sathish
Priority: Critical

getApplicationReport call may raise NPE
{code}
Exception in thread "main" java.lang.NullPointerException: java.lang.NullPointerException
    org.apache.hadoop.yarn.server.resourcemanager.security.QueueACLsManager.checkAccess(QueueACLsManager.java:57)
    org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.checkAccess(ClientRMService.java:279)
    org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getApplications(ClientRMService.java:760)
    org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getApplications(ClientRMService.java:682)
    org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getApplications(ApplicationClientProtocolPBServiceImpl.java:234)
    org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:425)
    org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
    org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
    org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2268)
    org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2264)
    java.security.AccessController.doPrivileged(Native Method)
    javax.security.auth.Subject.doAs(Subject.java:422)
    org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1708)
    org.apache.hadoop.ipc.Server$Handler.run(Server.java:2262)
    sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    org.apache.hadoop.yarn.ipc.RPCUtil.instantiateException(RPCUtil.java:53)
    org.apache.hadoop.yarn.ipc.RPCUtil.unwrapAndThrowException(RPCUtil.java:107)
    org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getApplications(ApplicationClientProtocolPBClientImpl.java:254)
    sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    java.lang.reflect.Method.invoke(Method.java:498)
    org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:256)
    org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
    com.sun.proxy.$Proxy18.getApplications(Unknown Source)
    org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getApplications(YarnClientImpl.java:479)
    org.apache.hadoop.mapred.ResourceMgrDelegate.getAllJobs(ResourceMgrDelegate.java:135)
    org.apache.hadoop.mapred.YARNRunner.getAllJobs(YARNRunner.java:167)
    org.apache.hadoop.mapreduce.Cluster.getAllJobStatuses(Cluster.java:294)
    org.apache.hadoop.mapreduce.tools.CLI.listJobs(CLI.java:553)
    org.apache.hadoop.mapreduce.tools.CLI.run(CLI.java:338)
    org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
    org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
    org.apache.hadoop.mapred.JobClient.main(JobClient.java:1274)
{code}
[jira] [Created] (YARN-4965) Distributed shell AM failed due to ClientHandlerException thrown by jersey
Sumana Sathish created YARN-4965:

Summary: Distributed shell AM failed due to ClientHandlerException thrown by jersey
Key: YARN-4965
URL: https://issues.apache.org/jira/browse/YARN-4965
Project: Hadoop YARN
Issue Type: Bug
Affects Versions: 2.7.2
Reporter: Sumana Sathish
Priority: Critical
Fix For: 2.8.0

Distributed shell AM failed with java.io.IOException
{code:title=app logs}
Exception in thread "AMRM Callback Handler Thread" org.apache.hadoop.yarn.exceptions.YarnRuntimeException: com.sun.jersey.api.client.ClientHandlerException: java.io.IOException: Stream closed.
    at org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl$CallbackHandlerThread.run(AMRMClientAsyncImpl.java:312)
Caused by: com.sun.jersey.api.client.ClientHandlerException: java.io.IOException: Stream closed.
    at com.sun.jersey.api.client.ClientResponse.getEntity(ClientResponse.java:563)
    at com.sun.jersey.api.client.ClientResponse.getEntity(ClientResponse.java:506)
    at org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.putEntities(TimelineClientImpl.java:446)
    at org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster.publishContainerEndEvent(ApplicationMaster.java:1144)
    at org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster.access$400(ApplicationMaster.java:169)
    at org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster$RMCallbackHandler.onContainersCompleted(ApplicationMaster.java:779)
    at org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl$CallbackHandlerThread.run(AMRMClientAsyncImpl.java:300)
Caused by: java.io.IOException: Stream closed.
    at java.net.AbstractPlainSocketImpl.available(AbstractPlainSocketImpl.java:458)
    at java.net.SocketInputStream.available(SocketInputStream.java:245)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:342)
    at sun.net.www.http.ChunkedInputStream.readAheadBlocking(ChunkedInputStream.java:552)
    at sun.net.www.http.ChunkedInputStream.readAhead(ChunkedInputStream.java:609)
    at sun.net.www.http.ChunkedInputStream.read(ChunkedInputStream.java:696)
    at java.io.FilterInputStream.read(FilterInputStream.java:133)
    at sun.net.www.protocol.http.HttpURLConnection$HttpInputStream.read(HttpURLConnection.java:3067)
    at org.codehaus.jackson.impl.ByteSourceBootstrapper.ensureLoaded(ByteSourceBootstrapper.java:507)
    at org.codehaus.jackson.impl.ByteSourceBootstrapper.detectEncoding(ByteSourceBootstrapper.java:129)
    at org.codehaus.jackson.impl.ByteSourceBootstrapper.constructParser(ByteSourceBootstrapper.java:224)
    at org.codehaus.jackson.JsonFactory._createJsonParser(JsonFactory.java:785)
    at org.codehaus.jackson.JsonFactory.createJsonParser(JsonFactory.java:561)
    at org.codehaus.jackson.jaxrs.JacksonJsonProvider.readFrom(JacksonJsonProvider.java:414)
    at com.sun.jersey.api.client.ClientResponse.getEntity(ClientResponse.java:553)
    ... 6 more
{code}
[jira] [Created] (YARN-4794) Distributed shell app gets stuck on stopping containers after App completes
Sumana Sathish created YARN-4794:

Summary: Distributed shell app gets stuck on stopping containers after App completes
Key: YARN-4794
URL: https://issues.apache.org/jira/browse/YARN-4794
Project: Hadoop YARN
Issue Type: Bug
Reporter: Sumana Sathish
Priority: Critical

Distributed shell app gets stuck on stopping containers after the app completes, with the following exception
{code:title=app log}
16/03/10 23:19:41 INFO distributedshell.ApplicationMaster: Application completed. Stopping running containers
16/03/10 23:19:41 INFO impl.NMClientAsyncImpl: NM Client is being stopped.
16/03/10 23:19:41 INFO impl.NMClientAsyncImpl: Waiting for eventDispatcherThread to be interrupted.
16/03/10 23:19:41 INFO impl.NMClientAsyncImpl: eventDispatcherThread exited.
16/03/10 23:19:41 INFO impl.NMClientAsyncImpl: Stopping NM client.
16/03/10 23:19:41 INFO impl.NMClientImpl: Clean up running containers on stop.
16/03/10 23:19:41 INFO impl.NMClientImpl: Stopping container_e05_1457650296862_0046_01_08
16/03/10 23:19:41 INFO impl.NMClientImpl: ok, stopContainerInternal.. container_e05_1457650296862_0046_01_08
16/03/10 23:19:41 INFO impl.ContainerManagementProtocolProxy: Opening proxy : yarn-sumana-1.novalocal:25454
16/03/10 23:19:41 WARN ipc.Client: Exception encountered while connecting to the server : java.nio.channels.ClosedByInterruptException
    at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202)
    at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:407)
    at org.apache.hadoop.net.SocketInputStream$Reader.performIO(SocketInputStream.java:57)
    at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:142)
    at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:161)
    at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:131)
    at java.io.FilterInputStream.read(FilterInputStream.java:133)
    at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:265)
    at java.io.DataInputStream.readInt(DataInputStream.java:387)
    at org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:367)
    at org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:558)
    at org.apache.hadoop.ipc.Client$Connection.access$1800(Client.java:373)
    at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:727)
    at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:723)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
    at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:722)
    at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:373)
    at org.apache.hadoop.ipc.Client.getConnection(Client.java:1493)
    at org.apache.hadoop.ipc.Client.call(Client.java:1397)
    at org.apache.hadoop.ipc.Client.call(Client.java:1358)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
    at com.sun.proxy.$Proxy30.startContainers(Unknown Source)
    at org.apache.hadoop.yarn.api.impl.pb.client.ContainerManagementProtocolPBClientImpl.startContainers(ContainerManagementProtocolPBClientImpl.java:96)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:252)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
    at com.sun.proxy.$Proxy31.startContainers(Unknown Source)
    at org.apache.hadoop.yarn.client.api.impl.NMClientImpl.startContainer(NMClientImpl.java:205)
    at org.apache.hadoop.yarn.client.api.async.impl.NMClientAsyncImpl$StatefulContainer$StartContainerTransition.transition(NMClientAsyncImpl.java:382)
    at org.apache.hadoop.yarn.client.api.async.impl.NMClientAsyncImpl$StatefulContainer$StartContainerTransition.transition(NMClientAsyncImpl.java:368)
    at org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:385)
    at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
    at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
    at org.apache.h
[jira] [Created] (YARN-3753) RM failed to come up with "java.io.IOException: Wait for ZKClient creation timed out"
Sumana Sathish created YARN-3753:

Summary: RM failed to come up with "java.io.IOException: Wait for ZKClient creation timed out"
Key: YARN-3753
URL: https://issues.apache.org/jira/browse/YARN-3753
Project: Hadoop YARN
Issue Type: Bug
Components: yarn
Reporter: Sumana Sathish
Priority: Critical

RM failed to come up with the following error while submitting a mapreduce job.
{code:title=RM log}
2015-05-30 03:40:12,190 ERROR recovery.RMStateStore (RMStateStore.java:transition(179)) - Error storing app: application_1432956515242_0006
java.io.IOException: Wait for ZKClient creation timed out
    at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$ZKAction.runWithCheck(ZKRMStateStore.java:1098)
    at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$ZKAction.runWithRetries(ZKRMStateStore.java:1122)
    at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.doStoreMultiWithRetries(ZKRMStateStore.java:923)
    at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.doStoreMultiWithRetries(ZKRMStateStore.java:937)
    at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.createWithRetries(ZKRMStateStore.java:970)
    at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.storeApplicationStateInternal(ZKRMStateStore.java:609)
    at org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$StoreAppTransition.transition(RMStateStore.java:175)
    at org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$StoreAppTransition.transition(RMStateStore.java:160)
    at org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:362)
    at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
    at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
    at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
    at org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore.handleStoreEvent(RMStateStore.java:837)
    at org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$ForwardingEventHandler.handle(RMStateStore.java:900)
    at org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$ForwardingEventHandler.handle(RMStateStore.java:895)
    at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:175)
    at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:108)
    at java.lang.Thread.run(Thread.java:745)
2015-05-30 03:40:12,194 FATAL resourcemanager.ResourceManager (ResourceManager.java:handle(750)) - Received a org.apache.hadoop.yarn.server.resourcemanager.RMFatalEvent of type STATE_STORE_OP_FAILED. Cause: java.io.IOException: Wait for ZKClient creation timed out
    at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$ZKAction.runWithCheck(ZKRMStateStore.java:1098)
    at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$ZKAction.runWithRetries(ZKRMStateStore.java:1122)
    at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.doStoreMultiWithRetries(ZKRMStateStore.java:923)
    at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.doStoreMultiWithRetries(ZKRMStateStore.java:937)
    at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.createWithRetries(ZKRMStateStore.java:970)
    at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.storeApplicationStateInternal(ZKRMStateStore.java:609)
    at org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$StoreAppTransition.transition(RMStateStore.java:175)
    at org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$StoreAppTransition.transition(RMStateStore.java:160)
    at org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:362)
    at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
    at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
    at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
    at org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore.handleStoreEvent(RMStateStore.java:837)
    at org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$ForwardingEventHandler.handle(RMStateStore.java:900)
    at org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$ForwardingEventHandler.handle(RMSt
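The "Wait for ZKClient creation timed out" failure is an instance of a bounded-wait pattern: the state store waits a fixed interval for an asynchronous client connection to be established and throws if the deadline passes. A generic sketch of that pattern (hypothetical names, not the actual ZKRMStateStore code):

```java
import java.io.IOException;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Generic bounded-wait-for-connection pattern behind the error above:
// wait up to a deadline for an asynchronous client to become ready,
// and fail with an IOException when the deadline passes.
// (Hypothetical names; not the actual ZKRMStateStore implementation.)
public class ClientWaiter {
    private final CountDownLatch connected = new CountDownLatch(1);

    // Called from the client's event/watcher thread once the session is up.
    public void onConnected() {
        connected.countDown();
    }

    public void waitForClient(long timeoutMs) throws IOException, InterruptedException {
        if (!connected.await(timeoutMs, TimeUnit.MILLISECONDS)) {
            throw new IOException("Wait for ZKClient creation timed out");
        }
    }
}
```

When ZooKeeper is unreachable this surfaces exactly as in the log: the store operation fails with the timeout IOException and the RM treats it as a fatal state-store error.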
[jira] [Created] (YARN-3681) yarn cmd says "could not find main class 'queue'" in windows
Sumana Sathish created YARN-3681:

Summary: yarn cmd says "could not find main class 'queue'" in windows
Key: YARN-3681
URL: https://issues.apache.org/jira/browse/YARN-3681
Project: Hadoop YARN
Issue Type: Bug
Components: yarn
Affects Versions: 2.7.0
Environment: Windows Only
Reporter: Sumana Sathish
Priority: Critical

Attached is a screenshot of the Windows command prompt running the yarn queue command.
[jira] [Created] (YARN-3493) RM fails to come up with error "Failed to load/recover state" when mem settings are changed
Sumana Sathish created YARN-3493:

Summary: RM fails to come up with error "Failed to load/recover state" when mem settings are changed
Key: YARN-3493
URL: https://issues.apache.org/jira/browse/YARN-3493
Project: Hadoop YARN
Issue Type: Bug
Components: yarn
Affects Versions: 2.7.0
Reporter: Sumana Sathish
Priority: Critical
Fix For: 2.7.0

RM fails to come up for the following case:
1. Change yarn.nodemanager.resource.memory-mb and yarn.scheduler.maximum-allocation-mb to 4000 in yarn-site.xml
2. Start a randomtextwriter job with mapreduce.map.memory.mb=4000 in background and wait for the job to reach running state
3. Restore yarn-site.xml to have yarn.scheduler.maximum-allocation-mb to 2048 before the above job completes
4. Restart RM
5. RM fails to come up with the below error
{code:title=RM error for Mem settings changed}
- RM app submission failed in validating AM resource request for application application_1429094976272_0008
org.apache.hadoop.yarn.exceptions.InvalidResourceRequestException: Invalid resource request, requested memory < 0, or requested memory > max configured, requestedMemory=3072, maxMemory=2048
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.validateResourceRequest(SchedulerUtils.java:204)
    at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.validateAndCreateResourceRequest(RMAppManager.java:385)
    at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.createAndPopulateNewRMApp(RMAppManager.java:328)
    at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.recoverApplication(RMAppManager.java:317)
    at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.recover(RMAppManager.java:422)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.recover(ResourceManager.java:1187)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceStart(ResourceManager.java:574)
    at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.startActiveServices(ResourceManager.java:994)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1035)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1031)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.transitionToActive(ResourceManager.java:1031)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStart(ResourceManager.java:1071)
    at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1208)
2015-04-15 13:19:18,623 ERROR resourcemanager.ResourceManager (ResourceManager.java:serviceStart(579)) - Failed to load/recover state
org.apache.hadoop.yarn.exceptions.InvalidResourceRequestException: Invalid resource request, requested memory < 0, or requested memory > max configured, requestedMemory=3072, maxMemory=2048
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.validateResourceRequest(SchedulerUtils.java:204)
    at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.validateAndCreateResourceRequest(RMAppManager.java:385)
    at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.createAndPopulateNewRMApp(RMAppManager.java:328)
    at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.recoverApplication(RMAppManager.java:317)
    at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.recover(RMAppManager.java:422)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.recover(ResourceManager.java:1187)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceStart(ResourceManager.java:574)
    at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.startActiveServices(ResourceManager.java:994)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1035)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1031)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformatio
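The failure above reduces to a simple invariant applied during recovery: a previously granted request (3072 MB) is re-validated against the new, smaller configured maximum (2048 MB) and rejected. A minimal sketch of that check (an illustrative method, not the actual SchedulerUtils code):

```java
// Sketch of the invariant that fails during recovery: a memory request is
// rejected when it is negative or exceeds the currently configured maximum.
// After the maximum is lowered to 2048 MB, the recovered 3072 MB request fails.
// (Illustrative check, not the actual SchedulerUtils implementation.)
public class ResourceValidator {
    public static void validateMemory(long requestedMb, long maxMb) {
        if (requestedMb < 0 || requestedMb > maxMb) {
            throw new IllegalArgumentException(
                "Invalid resource request, requested memory < 0, or requested memory > max configured"
                + ", requestedMemory=" + requestedMb + ", maxMemory=" + maxMb);
        }
    }
}
```

The repro steps exercise exactly this: the request was valid under the old 4000 MB maximum, but re-validation after the config change uses the new 2048 MB limit.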