[ https://issues.apache.org/jira/browse/GRIFFIN-197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nikolay Sokolov closed GRIFFIN-197.
-----------------------------------
    Resolution: Fixed

> Job is left in UNKNOWN state by Service if Yarn RM is restarted
> ---------------------------------------------------------------
>
>                 Key: GRIFFIN-197
>                 URL: https://issues.apache.org/jira/browse/GRIFFIN-197
>             Project: Griffin (Incubating)
>          Issue Type: Bug
>            Reporter: Nikolay Sokolov
>            Priority: Minor
>
> On the one hand, judging by Livy's behavior, a missing app can be treated as being in the DEAD state.
> On the other hand, the logged client errors pollute the log with unnecessary stack traces while not showing the error description returned by Yarn.
> Sample stack trace on the Service side:
> {code:none}
> 2018-09-21 14:30:58.016 WARN 14699 --- [nio-8080-exec-4] o.a.g.c.j.JobServiceImpl : sessionId(300) appId(application_1534940268145_0318) 404 Not Found.
> 2018-09-21 14:30:58.016 WARN 14699 --- [nio-8080-exec-4] o.a.g.c.j.JobServiceImpl : Spark session 300 may be overdue! Now we use yarn to update state.
> 2018-09-21 14:30:58.020 ERROR 14699 --- [nio-8080-exec-4] o.a.g.c.u.YarnNetUtil : update exception happens by yarn. {}
> org.springframework.web.client.HttpClientErrorException: 404 Not Found
>         at org.springframework.web.client.DefaultResponseErrorHandler.handleError(DefaultResponseErrorHandler.java:91) ~[spring-web-4.3.6.RELEASE.jar!/:4.3.6.RELEASE]
>         at org.springframework.web.client.RestTemplate.handleResponse(RestTemplate.java:700) ~[spring-web-4.3.6.RELEASE.jar!/:4.3.6.RELEASE]
>         at org.springframework.web.client.RestTemplate.doExecute(RestTemplate.java:653) ~[spring-web-4.3.6.RELEASE.jar!/:4.3.6.RELEASE]
>         at org.springframework.web.client.RestTemplate.execute(RestTemplate.java:613) ~[spring-web-4.3.6.RELEASE.jar!/:4.3.6.RELEASE]
>         at org.springframework.web.client.RestTemplate.getForObject(RestTemplate.java:287) ~[spring-web-4.3.6.RELEASE.jar!/:4.3.6.RELEASE]
>         at org.apache.griffin.core.util.YarnNetUtil.update(YarnNetUtil.java:53) [classes!/:0.3.1-incubating-SNAPSHOT]
>         at org.apache.griffin.core.job.JobServiceImpl.setStateByYarn(JobServiceImpl.java:569) [classes!/:0.3.1-incubating-SNAPSHOT]
>         at org.apache.griffin.core.job.JobServiceImpl.setStateByYarn(JobServiceImpl.java:530) [classes!/:0.3.1-incubating-SNAPSHOT]
>         at org.apache.griffin.core.job.JobServiceImpl.syncInstancesOfJob(JobServiceImpl.java:514) [classes!/:0.3.1-incubating-SNAPSHOT]
>         at org.apache.griffin.core.job.JobServiceImpl.updateState(JobServiceImpl.java:274) [classes!/:0.3.1-incubating-SNAPSHOT]
>         at org.apache.griffin.core.job.JobServiceImpl.findInstancesOfJob(JobServiceImpl.java:267) [classes!/:0.3.1-incubating-SNAPSHOT]
>         at org.apache.griffin.core.job.JobController.findInstancesOfJob(JobController.java:94) [classes!/:0.3.1-incubating-SNAPSHOT]
>         at sun.reflect.GeneratedMethodAccessor124.invoke(Unknown Source) ~[?:?]
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_181]
>         at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_181]
>         at org.springframework.web.method.support.InvocableHandlerMethod.doInvoke(InvocableHandlerMethod.java:205) [spring-web-4.3.6.RELEASE.jar!/:4.3.6.RELEASE]
> {code}

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
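
For readers of this thread, the behavior requested in the description can be sketched roughly as follows. This is only an illustration in plain Java, not the change that actually resolved GRIFFIN-197: the class `YarnStateSketch`, the `State` enum, and both methods are hypothetical names, and the HTTP status and response body are assumed to have already been extracted from the failed Yarn request. The idea is to treat a 404 for the application as DEAD (mirroring the Livy behavior mentioned above) and to log a single line carrying the description returned by Yarn instead of a full stack trace.

```java
public class YarnStateSketch {

    /** Hypothetical subset of job states; only DEAD and UNKNOWN are named in the issue. */
    enum State { DEAD, UNKNOWN }

    /**
     * Decide the job state after a failed Yarn lookup.
     * Assumption: 404 means Yarn no longer knows the application
     * (for example after an RM restart), so the job is considered DEAD.
     * Any other status leaves the previous state unchanged.
     */
    static State resolveState(int httpStatus, State previous) {
        if (httpStatus == 404) {
            return State.DEAD;
        }
        return previous;
    }

    /**
     * Build a one-line log message that carries the error description
     * returned by Yarn, rather than dumping a stack trace.
     */
    static String describeFailure(int httpStatus, String responseBody) {
        String detail = (responseBody == null || responseBody.trim().isEmpty())
                ? "<no response body>"
                : responseBody.trim();
        return "Yarn returned HTTP " + httpStatus + ": " + detail;
    }

    public static void main(String[] args) {
        // Hypothetical response body; the real text depends on the Yarn RM.
        String body = "app not found";
        System.out.println(resolveState(404, State.UNKNOWN));
        System.out.println(describeFailure(404, body));
    }
}
```

In a Spring-based service, the status and body would presumably come from the caught `HttpClientErrorException`; that wiring is left out here so the sketch stays self-contained.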