Hi, I am able to run a Spark Streaming job in local mode, but when I try to run the same job on my YARN cluster, it throws errors.
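For context, I launch the job roughly like this (the class name, jar name, and options here are just placeholders, not my real values):

    # local mode, which works:
    spark-submit --master local[*] --class com.example.StreamingJob streaming-job.jar

    # YARN cluster mode, which fails with the exceptions below:
    spark-submit --master yarn --deploy-mode cluster --class com.example.StreamingJob streaming-job.jar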
Any help with this is appreciated. Here are my exception logs:

Exception 1:

java.net.SocketTimeoutException: 480000 millis timeout while waiting for channel to be ready for write. ch : java.nio.channels.SocketChannel[connected local=/172.16.28.192:50010 remote=/172.16.28.193:46147]
    at org.apache.hadoop.net.SocketIOWithTimeout.waitForIO(SocketIOWithTimeout.java:246)
    at org.apache.hadoop.net.SocketOutputStream.waitForWritable(SocketOutputStream.java:172)
    at org.apache.hadoop.net.SocketOutputStream.transferToFully(SocketOutputStream.java:220)
    at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendPacket(BlockSender.java:559)
    at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:728)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:496)
    at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opReadBlock(Receiver.java:116)
    at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:71)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:235)
    at java.lang.Thread.run(Thread.java:745)

Exception 2:

2016-03-22 12:17:47,838 WARN org.apache.hadoop.hdfs.BlockReaderFactory: I/O error constructing remote block reader.
java.nio.channels.ClosedByInterruptException
    at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202)
    at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:658)
    at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:192)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:530)
    at org.apache.hadoop.hdfs.DFSClient.newConnectedPeer(DFSClient.java:3101)
    at org.apache.hadoop.hdfs.BlockReaderFactory.nextTcpPeer(BlockReaderFactory.java:755)
    at org.apache.hadoop.hdfs.BlockReaderFactory.getRemoteBlockReaderFromTcp(BlockReaderFactory.java:670)
    at org.apache.hadoop.hdfs.BlockReaderFactory.build(BlockReaderFactory.java:337)
    at org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:576)
    at org.apache.hadoop.hdfs.DFSInputStream.seekToBlockSource(DFSInputStream.java:1460)
    at org.apache.hadoop.hdfs.DFSInputStream.readBuffer(DFSInputStream.java:773)
    at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:806)
    at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:847)
    at java.io.DataInputStream.read(DataInputStream.java:100)
    at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:84)
    at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:52)
    at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:112)
    at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:366)
    at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:265)
    at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:61)
    at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:359)
    at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:357)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
    at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:356)
    at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:60)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
2016-03-22 12:17:47,838 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Container container_1458629096860_0001_01_000001 transitioned from KILLING to DONE
2016-03-22 12:17:47,841 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application: Removing container_1458629096860_0001_01_000001 from application application_1458629096860_0001
2016-03-22 12:17:47,842 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices: Got event CONTAINER_STOP for appId application_1458629096860_0001
2016-03-22 12:17:47,842 WARN org.apache.hadoop.hdfs.DFSClient: Failed to connect to /node1:50010 for block, add to deadNodes and continue. java.nio.channels.ClosedByInterruptException
java.nio.channels.ClosedByInterruptException