[ https://issues.apache.org/jira/browse/YARN-8116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16432938#comment-16432938 ]

Wangda Tan commented on YARN-8116:
----------------------------------

+1, thanks [~csingh], will commit shortly.

> Nodemanager fails with NumberFormatException: For input string: ""
> ------------------------------------------------------------------
>
>                 Key: YARN-8116
>                 URL: https://issues.apache.org/jira/browse/YARN-8116
>             Project: Hadoop YARN
>          Issue Type: Bug
>    Affects Versions: 3.1.0
>            Reporter: Yesha Vora
>            Assignee: Chandni Singh
>            Priority: Critical
>         Attachments: YARN-8116.001.patch, YARN-8116.002.patch
>
>
> Steps to reproduce:
> 1) Update the NodeManager debug delay config:
> {code}
> <property>
>       <name>yarn.nodemanager.delete.debug-delay-sec</name>
>       <value>350</value>
>     </property>{code}
> 2) Launch the distributed shell application multiple times:
> {code}
> /usr/hdp/current/hadoop-yarn-client/bin/yarn  jar 
> hadoop-yarn-applications-distributedshell-*.jar  -shell_command "sleep 120" 
> -num_containers 1 -shell_env YARN_CONTAINER_RUNTIME_TYPE=docker -shell_env 
> YARN_CONTAINER_RUNTIME_DOCKER_IMAGE=centos/httpd-24-centos7:latest -shell_env 
> YARN_CONTAINER_RUNTIME_DOCKER_DELAYED_REMOVAL=true -jar 
> hadoop-yarn-applications-distributedshell-*.jar{code}
> 3) Restart the NodeManager (NM).
> The NodeManager fails to start with the error below.
> {code:title=NM log}
> 2018-03-23 21:32:14,437 INFO  monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:serviceInit(181)) - ContainersMonitor enabled: true
> 2018-03-23 21:32:14,439 INFO  logaggregation.LogAggregationService (LogAggregationService.java:serviceInit(130)) - rollingMonitorInterval is set as 3600. The logs will be aggregated every 3600 seconds
> 2018-03-23 21:32:14,455 INFO  service.AbstractService (AbstractService.java:noteFailure(267)) - Service org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl failed in state INITED
> java.lang.NumberFormatException: For input string: ""
>       at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
>       at java.lang.Long.parseLong(Long.java:601)
>       at java.lang.Long.parseLong(Long.java:631)
>       at org.apache.hadoop.yarn.server.nodemanager.recovery.NMLeveldbStateStoreService.loadContainerState(NMLeveldbStateStoreService.java:350)
>       at org.apache.hadoop.yarn.server.nodemanager.recovery.NMLeveldbStateStoreService.loadContainersState(NMLeveldbStateStoreService.java:253)
>       at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.recover(ContainerManagerImpl.java:365)
>       at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.serviceInit(ContainerManagerImpl.java:316)
>       at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
>       at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:108)
>       at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:464)
>       at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
>       at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:899)
>       at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:960)
> 2018-03-23 21:32:14,458 INFO  logaggregation.LogAggregationService (LogAggregationService.java:serviceStop(148)) - org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService waiting for pending aggregation during exit
> 2018-03-23 21:32:14,460 INFO  service.AbstractService (AbstractService.java:noteFailure(267)) - Service NodeManager failed in state INITED
> java.lang.NumberFormatException: For input string: ""
>       at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
>       at java.lang.Long.parseLong(Long.java:601)
>       at java.lang.Long.parseLong(Long.java:631)
>       at org.apache.hadoop.yarn.server.nodemanager.recovery.NMLeveldbStateStoreService.loadContainerState(NMLeveldbStateStoreService.java:350)
>       at org.apache.hadoop.yarn.server.nodemanager.recovery.NMLeveldbStateStoreService.loadContainersState(NMLeveldbStateStoreService.java:253)
>       at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.recover(ContainerManagerImpl.java:365)
>       at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.serviceInit(ContainerManagerImpl.java:316)
>       at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
>       at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:108)
>       at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:464)
>       at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
>       at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:899)
>       at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:960)
> 2018-03-23 21:32:14,463 INFO  impl.MetricsSystemImpl (MetricsSystemImpl.java:stop(210)) - Stopping NodeManager metrics system...
> 2018-03-23 21:32:14,464 INFO  impl.MetricsSinkAdapter (MetricsSinkAdapter.java:publishMetricsFromQueue(141)) - timeline thread interrupted.
> 2018-03-23 21:32:14,468 INFO  impl.MetricsSystemImpl (MetricsSystemImpl.java:stop(216)) - NodeManager metrics system stopped.
> 2018-03-23 21:32:14,508 INFO  impl.MetricsSystemImpl (MetricsSystemImpl.java:shutdown(607)) - NodeManager metrics system shutdown complete.{code}
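
The stack trace in the quoted report shows Long.parseLong being called during state-store recovery with an empty string. The following is a minimal, standalone Java sketch of that failure mode and of one defensive way to handle it. It is an illustration only: the class EmptyValueParse and the helpers failsToParse and parseOrDefault are hypothetical names, the assumption that the stored value decodes from an empty byte array is a guess, and nothing here is taken from the attached patches or from NMLeveldbStateStoreService.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration; not code from Hadoop or from the YARN-8116 patches.
class EmptyValueParse {

    // Returns true when Long.parseLong rejects the given text.
    public static boolean failsToParse(String raw) {
        try {
            Long.parseLong(raw);
            return false;
        } catch (NumberFormatException e) {
            return true;
        }
    }

    // Defensive variant (an assumption, not the committed fix):
    // treat a null or blank value as "not set" and return the fallback.
    public static long parseOrDefault(String raw, long fallback) {
        if (raw == null || raw.trim().isEmpty()) {
            return fallback;
        }
        return Long.parseLong(raw.trim());
    }

    public static void main(String[] args) {
        // Assumed scenario: the stored value is an empty byte array.
        String stored = new String(new byte[0], StandardCharsets.UTF_8);

        // Long.parseLong("") throws NumberFormatException: For input string: ""
        System.out.println("empty value fails to parse: " + failsToParse(stored));

        // The guarded parse returns the fallback instead of throwing.
        System.out.println("guarded parse: " + parseOrDefault(stored, 0L));
    }
}
```

Whether a fallback value is appropriate, or whether the empty entry should be skipped or never written in the first place, depends on what the stored value means; the sketch only shows why the unguarded parse aborts startup.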



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)