[ https://issues.apache.org/jira/browse/SPARK-16711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Thomas Graves updated SPARK-16711: ---------------------------------- Fix Version/s: 2.0.1 > YarnShuffleService doesn't re-init properly on YARN rolling upgrade > ------------------------------------------------------------------- > > Key: SPARK-16711 > URL: https://issues.apache.org/jira/browse/SPARK-16711 > Project: Spark > Issue Type: Bug > Components: Shuffle, YARN > Affects Versions: 1.5.2 > Reporter: Thomas Graves > Assignee: Thomas Graves > Fix For: 2.0.1, 2.1.0 > > > When a yarn rolling upgrade happens the Spark YarnShuffleService isn't > re-initializing the tokens soon enough which causes running applications to > fail with NullPointerExceptions rather then IOExceptions which causes clients > to not retry which in turn causes the application to totally fail when it > should have just retried and succeeded. > 2016-07-22 23:22:05,460 [shuffle-server-1] ERROR > server.TransportRequestHandler: Error while invoking RpcHandler#receive() on > RPC id 6235606084052282795 > java.lang.NullPointerException: Password cannot be null if SASL is enabled > at > org.spark-project.guava.base.Preconditions.checkNotNull(Preconditions.java:208) > at > org.apache.spark.network.sasl.SparkSaslServer.encodePassword(SparkSaslServer.java:196) > at > org.apache.spark.network.sasl.SparkSaslServer$DigestCallbackHandler.handle(SparkSaslServer.java:166) > at > com.sun.security.sasl.digest.DigestMD5Server.validateClientResponse(DigestMD5Server.java:589) > at > com.sun.security.sasl.digest.DigestMD5Server.evaluateResponse(DigestMD5Server.java:244) > at > org.apache.spark.network.sasl.SparkSaslServer.response(SparkSaslServer.java:119) > at > org.apache.spark.network.sasl.SaslRpcHandler.receive(SaslRpcHandler.java:101) > at > org.apache.spark.network.server.TransportRequestHandler.processRpcRequest(TransportRequestHandler.java:149) > at > org.apache.spark.network.server.TransportRequestHandler.handle(TransportRequestHandler.java:102) > at > org.apache.spark.network.server.TransportChannelHandler.channelRead0(TransportChannelHandler.java:104) > at > org.apache.spark.network.server.TransportChannelHandler.channelRead0(TransportChannelHandler.java:51) > at > io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105) > at > io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333) > at > io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319) > at > io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:254) > at > io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333) > at > io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319) > at > io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103) > at > io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333) > at > io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319) > at > org.apache.spark.network.util.TransportFrameDecoder.channelRead(TransportFrameDecoder.java:86) > at > io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333) > at > io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319) > at > io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:787) > at > io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:130) > at > io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511) > at > io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468) > at > io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382) > at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354) > at > io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116) > at java.lang.Thread.run(Thread.java:745) -- This message was sent by Atlassian JIRA (v6.3.4#6332) --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org