Re: LevelDB corruption in YARN Application TimelineServer
Hi Abhishek!

You might also want to pull in https://issues.apache.org/jira/browse/YARN-6054 .

HTH
Ravi

On Mon, Mar 6, 2017 at 8:39 AM, Jason Lowe wrote:
> Verify that something outside of Hadoop/YARN is not coming along
> periodically and removing "old" files (e.g., tmpwatch). Users have
> reported similar cases in the past that were tracked down to an invalid
> setup: state was being corrupted by a periodic cleanup tool, like
> tmpwatch, removing files.
>
> Jason
Re: LevelDB corruption in YARN Application TimelineServer
Verify that something outside of Hadoop/YARN is not coming along periodically and removing "old" files (e.g., tmpwatch). Users have reported similar cases in the past that were tracked down to an invalid setup: state was being corrupted by a periodic cleanup tool, like tmpwatch, removing files.

Jason

On Thursday, March 2, 2017 5:59 PM, Abhishek Das wrote:
> Hi,
>
> I am running a Hadoop 2.6.0 cluster on EC2 instances, with an r3.2xlarge
> instance as the master node. The YARN Application TimelineServer running on
> the master node is throwing an exception because of LevelDB corruption. This
> issue seems to happen when the cluster has been up for a long time
> (more than 7 days).
>
> ERROR org.apache.hadoop.yarn.server.timeline.TimelineDataManager: Skip the
> timeline entity: { id: , type: TEZ_TASK_ID }
> java.lang.RuntimeException:
> org.fusesource.leveldbjni.internal.NativeDB$DBException: *IO error:
> /media/ephemeral0/hadoop-root/yarn/timeline/leveldb-timeline-store.ldb/330951.sst:
> No such file or directory*
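
One way to act on the suggestion above is to look at how old the files in the store directory are, since age-based cleanup tools generally select files by access or modification time. The following is only a minimal sketch under that assumption: the function name `files_older_than` and the 7-day threshold are hypothetical, and the script does not inspect the configuration of tmpwatch or any other tool.

```python
import time
from pathlib import Path


def files_older_than(directory, max_age_days, now=None):
    """Return (path, age_in_days) for regular files in `directory` whose most
    recent access or modification is older than `max_age_days`.

    Hypothetical helper: it only reports files that an age-based cleaner
    could consider "old"; it does not delete or modify anything.
    """
    now = time.time() if now is None else now
    cutoff_seconds = max_age_days * 86400
    stale = []
    for path in sorted(Path(directory).iterdir()):
        if not path.is_file():
            continue
        st = path.stat()
        # Use the later of atime and mtime as the "last touched" time.
        last_touched = max(st.st_atime, st.st_mtime)
        age_seconds = now - last_touched
        if age_seconds > cutoff_seconds:
            stale.append((path, age_seconds / 86400))
    return stale
```

For example, `files_older_than("/media/ephemeral0/hadoop-root/yarn/timeline/leveldb-timeline-store.ldb", 7)` would list the store files untouched for more than 7 days. If such files exist and a cleaner with a similar threshold runs on that filesystem, that would be consistent with the missing `.sst` file, though it does not prove the cause.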
LevelDB corruption in YARN Application TimelineServer
Hi,

I am running a Hadoop 2.6.0 cluster on EC2 instances, with an r3.2xlarge instance as the master node. The YARN Application TimelineServer running on the master node is throwing an exception because of LevelDB corruption. This issue seems to happen when the cluster has been up for a long time (more than 7 days). The stack trace is given below.

ERROR org.apache.hadoop.yarn.server.timeline.TimelineDataManager: Skip the timeline entity: { id: , type: TEZ_TASK_ID }
java.lang.RuntimeException: org.fusesource.leveldbjni.internal.NativeDB$DBException: *IO error: /media/ephemeral0/hadoop-root/yarn/timeline/leveldb-timeline-store.ldb/330951.sst: No such file or directory*
    at org.fusesource.leveldbjni.internal.JniDBIterator.seek(JniDBIterator.java:68)
    at org.apache.hadoop.yarn.server.timeline.LeveldbTimelineStore.getEntity(LeveldbTimelineStore.java:444)
    at org.apache.hadoop.yarn.server.timeline.TimelineDataManager.postEntities(TimelineDataManager.java:257)
    at org.apache.hadoop.yarn.server.timeline.webapp.TimelineWebServices.postEntities(TimelineWebServices.java:259)
    at sun.reflect.GeneratedMethodAccessor35.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at com.sun.jersey.spi.container.JavaMethodInvokerFactory$1.invoke(JavaMethodInvokerFactory.java:60)
    at com.sun.jersey.server.impl.model.method.dispatch.AbstractResourceMethodDispatchProvider$TypeOutInvoker._dispatch(AbstractResourceMethodDispatchProvider.java:185)
    at com.sun.jersey.server.impl.model.method.dispatch.ResourceJavaMethodDispatcher.dispatch(ResourceJavaMethodDispatcher.java:75)
    at com.sun.jersey.server.impl.uri.rules.HttpMethodRule.accept(HttpMethodRule.java:288)
    at com.sun.jersey.server.impl.uri.rules.ResourceClassRule.accept(ResourceClassRule.java:108)
    at com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)
    at com.sun.jersey.server.impl.uri.rules.RootResourceClassesRule.accept(RootResourceClassesRule.java:84)
    at com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1469)
    at com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1400)
    at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1349)
    at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1339)
    at com.sun.jersey.spi.container.servlet.WebComponent.service(WebComponent.java:416)
    at com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:537)
    at com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:886)
    at com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:834)
    at com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:795)
    at com.google.inject.servlet.FilterDefinition.doFilter(FilterDefinition.java:163)
    at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:58)
    at com.google.inject.servlet.ManagedFilterPipeline.dispatch(ManagedFilterPipeline.java:118)
    at com.google.inject.servlet.GuiceFilter.doFilter(GuiceFilter.java:113)
    at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
    at org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter.doFilter(StaticUserWebFilter.java:96)
    at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
    at org.apache.hadoop.yarn.server.timeline.webapp.CrossOriginFilter.doFilter(CrossOriginFilter.java:95)
    at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
    at org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:572)
    at org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticationFilter.doFilter(DelegationTokenAuthenticationFilter.java:269)
    at org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:542)
    at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
    at org.apache.hadoop.http.HttpServer2$QuotingInputFilter.doFilter(HttpServer2.java:1242)
    at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
    at org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45)
    at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
    at org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45)

There are a lot of .sst files in the level db