[ https://issues.apache.org/jira/browse/SPARK-14437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15231442#comment-15231442 ]
Kevin Hogeland commented on SPARK-14437:
----------------------------------------

[~zsxwing] Can confirm that after applying this commit to 1.6.1, the driver is able to connect to the block manager. Thanks for the quick patch. I also encountered this error when trying to run on the latest 2.0.0-SNAPSHOT; possibly unrelated, but worth documenting here:

{code}
org.apache.spark.SparkException: Job aborted due to stage failure: Task 3 in stage 29.0 failed 4 times, most recent failure: Lost task 3.3 in stage 29.0 (TID 24, ip-172-16-15-0.us-west-2.compute.internal): java.lang.RuntimeException: Stream '/jars/' was not found.
	at org.apache.spark.network.client.TransportResponseHandler.handle(TransportResponseHandler.java:223)
	at org.apache.spark.network.server.TransportChannelHandler.channelRead0(TransportChannelHandler.java:121)
	at org.apache.spark.network.server.TransportChannelHandler.channelRead0(TransportChannelHandler.java:51)
	at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
	at io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:266)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
	at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
	at org.apache.spark.network.util.TransportFrameDecoder.channelRead(TransportFrameDecoder.java:85)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
	at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:846)
	at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
	at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
	at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
	at java.lang.Thread.run(Thread.java:745)
{code}
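For what it's worth, the stream ID in that error appears to be missing a jar name entirely ({{'/jars/'}} with nothing after it). A minimal sketch of how such an ID could end up empty; {{jarStreamId}} is a hypothetical helper for illustration, not actual Spark code:

{code}
// Hypothetical sketch: the executor requests each jar from the driver's
// Netty file server by a stream ID of the form "/jars/<name>". An empty
// name produces the bare ID "/jars/", which the server has no stream
// registered under -- hence "Stream '/jars/' was not found".
object StreamIdSketch {
  def jarStreamId(jarName: String): String = s"/jars/$jarName"

  def main(args: Array[String]): Unit = {
    println(jarStreamId("app.jar")) // "/jars/app.jar" -> resolvable
    println(jarStreamId(""))        // "/jars/"        -> not found
  }
}
{code}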
> Spark using Netty RPC gets wrong address in some setups
> -------------------------------------------------------
>
>                 Key: SPARK-14437
>                 URL: https://issues.apache.org/jira/browse/SPARK-14437
>             Project: Spark
>          Issue Type: Bug
>          Components: Block Manager, Spark Core
>    Affects Versions: 1.6.0, 1.6.1
>         Environment: AWS, Docker, Flannel
>            Reporter: Kevin Hogeland
>
> Netty can't get the correct origin address in certain network setups. Spark
> should handle this, as relying on Netty correctly reporting all addresses
> leads to incompatible and unpredictable network states. We're currently using
> Docker with Flannel on AWS. Container communication looks something like:
> {{Container 1 (1.2.3.1) -> Docker host A (1.2.3.0) -> Docker host B (4.5.6.0) -> Container 2 (4.5.6.1)}}
> If the client in that setup is Container 1 (1.2.3.1), Netty channels from
> there to Container 2 will have a client address of 1.2.3.0.
> The {{RequestMessage}} object that is sent over the wire already contains a
> {{senderAddress}} field that the sender can use to specify their address. In
> {{NettyRpcEnv#internalReceive}}, this is replaced with the Netty client
> socket address when null. {{senderAddress}} in the messages sent from the
> executors is currently always null, meaning all messages will have these
> incorrect addresses (we've switched back to Akka as a temporary workaround
> for this). The executor should send its address explicitly so that the
> driver doesn't attempt to infer addresses based on possibly incorrect
> information from Netty.
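To make the substitution described above concrete, here is a minimal sketch of the null-fallback and the proposed fix. The types are simplified stand-ins, not the actual {{NettyRpcEnv}} source; only the names {{RpcAddress}}, {{RequestMessage}}, and {{senderAddress}} mirror Spark's, the rest is illustrative:

{code}
object SenderAddressSketch {
  case class RpcAddress(host: String, port: Int)

  case class RequestMessage(
      senderAddress: RpcAddress, // null when the sender did not set it
      receiverName: String,
      content: Any)

  def internalReceive(clientAddress: RpcAddress, req: RequestMessage): RequestMessage =
    if (req.senderAddress == null) {
      // Current fallback: infer the sender from the Netty socket. Behind
      // Flannel/NAT this yields the Docker host (1.2.3.0) rather than the
      // container (1.2.3.1), so replies go to the wrong address.
      req.copy(senderAddress = clientAddress)
    } else {
      // Proposed behavior: the executor fills in senderAddress itself, so
      // the driver never guesses from possibly rewritten socket information.
      req
    }

  def main(args: Array[String]): Unit = {
    val socketAddress = RpcAddress("1.2.3.0", 56789) // Docker host, not the container
    val fromExecutor  = RequestMessage(null, "driver", "heartbeat")
    // Without an explicit senderAddress, the driver records 1.2.3.0:
    println(internalReceive(socketAddress, fromExecutor).senderAddress)
    // With the executor sending its real address, 1.2.3.1 is preserved:
    val explicit = fromExecutor.copy(senderAddress = RpcAddress("1.2.3.1", 43210))
    println(internalReceive(socketAddress, explicit).senderAddress)
  }
}
{code}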