Hi Pedro,

Based on your suggestion, I deployed this on an AWS node and it worked fine. Thanks for your advice. I am still trying to figure out the issues in the local environment.
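One fix worth trying there is to force Spark to advertise localhost instead of the machine's LAN address, either by setting SPARK_LOCAL_IP=127.0.0.1 in conf/spark-env.sh or directly in code. A minimal sketch of the in-code variant, assuming Spark's Java API (spark.driver.host is a standard Spark property; the wrapper class name is just for illustration):

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

public class LocalhostDriver {
    public static void main(String[] args) {
        // Same local[2] master and "DR" app name as in the thread below,
        // but built from a SparkConf so the driver advertises localhost
        // rather than a LAN address like 192.168.1.3 that may not be
        // routable back to the process.
        SparkConf conf = new SparkConf()
                .setMaster("local[2]")
                .setAppName("DR")
                .set("spark.driver.host", "localhost");
        JavaSparkContext jsc = new JavaSparkContext(conf);
        // ... job code ...
        jsc.stop();
    }
}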
Anyway, thanks again.

-VG

On Sat, Jul 23, 2016 at 9:26 PM, Pedro Rodriguez <ski.rodrig...@gmail.com> wrote:

> Have you changed spark-env.sh or spark-defaults.conf from the default? It
> looks like Spark is trying to address local workers via a network address
> (e.g. 192.168.…) instead of localhost (localhost, 127.0.0.1, 0.0.0.0, …),
> and that network address doesn't resolve correctly. You might also check
> /etc/hosts to make sure that you don't have anything weird going on.
>
> Last thing to try, perhaps: are you running Spark within a VM and/or
> Docker? If networking isn't set up correctly on those, you may also run
> into trouble.
>
> What would be helpful is to know everything about your setup that might
> affect networking.
>
> —
> Pedro Rodriguez
> PhD Student in Large-Scale Machine Learning | CU Boulder
> Systems Oriented Data Scientist
> UC Berkeley AMPLab Alumni
>
> pedrorodriguez.io | 909-353-4423
> github.com/EntilZha | LinkedIn
> <https://www.linkedin.com/in/pedrorodriguezscience>
>
> On July 23, 2016 at 9:10:31 AM, VG (vlin...@gmail.com) wrote:
>
> Hi Pedro,
>
> Apologies for not adding this earlier. This is running on a local cluster
> set up as follows:
>
> JavaSparkContext jsc = new JavaSparkContext("local[2]", "DR");
>
> Any suggestions based on this? The ports are not blocked by the firewall.
>
> Regards,
>
> On Sat, Jul 23, 2016 at 8:35 PM, Pedro Rodriguez <ski.rodrig...@gmail.com>
> wrote:
>
>> Make sure that you don't have ports firewalled. You don't really give
>> much information to work from, but it looks like the master can't access
>> the worker nodes for some reason. If you give more information on the
>> cluster, networking, etc., it would help.
>>
>> For example, on AWS you can create a security group which allows all
>> traffic to/from itself. If you are using something like ufw on Ubuntu,
>> then you probably need to know the IP addresses of the worker nodes
>> beforehand.
>>
>> —
>> Pedro Rodriguez
>> PhD Student in Large-Scale Machine Learning | CU Boulder
>> Systems Oriented Data Scientist
>> UC Berkeley AMPLab Alumni
>>
>> pedrorodriguez.io | 909-353-4423
>> github.com/EntilZha | LinkedIn
>> <https://www.linkedin.com/in/pedrorodriguezscience>
>>
>> On July 23, 2016 at 7:38:01 AM, VG (vlin...@gmail.com) wrote:
>>
>> Please suggest if I am doing something wrong, or an alternative way of
>> doing this.
>>
>> I have an RDD with two values, as follows:
>>
>> JavaPairRDD<String, Long> rdd
>>
>> When I execute rdd.collectAsMap(), it always fails with IO exceptions.
>>
>> 16/07/23 19:03:58 ERROR RetryingBlockFetcher: Exception while beginning fetch of 1 outstanding blocks
>> java.io.IOException: Failed to connect to /192.168.1.3:58179
>>   at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:228)
>>   at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:179)
>>   at org.apache.spark.network.netty.NettyBlockTransferService$$anon$1.createAndStart(NettyBlockTransferService.scala:96)
>>   at org.apache.spark.network.shuffle.RetryingBlockFetcher.fetchAllOutstanding(RetryingBlockFetcher.java:140)
>>   at org.apache.spark.network.shuffle.RetryingBlockFetcher.start(RetryingBlockFetcher.java:120)
>>   at org.apache.spark.network.netty.NettyBlockTransferService.fetchBlocks(NettyBlockTransferService.scala:105)
>>   at org.apache.spark.network.BlockTransferService.fetchBlockSync(BlockTransferService.scala:92)
>>   at org.apache.spark.storage.BlockManager.getRemoteBytes(BlockManager.scala:546)
>>   at org.apache.spark.scheduler.TaskResultGetter$$anon$2$$anonfun$run$1.apply$mcV$sp(TaskResultGetter.scala:76)
>>   at org.apache.spark.scheduler.TaskResultGetter$$anon$2$$anonfun$run$1.apply(TaskResultGetter.scala:57)
>>   at org.apache.spark.scheduler.TaskResultGetter$$anon$2$$anonfun$run$1.apply(TaskResultGetter.scala:57)
>>   at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1793)
>>   at org.apache.spark.scheduler.TaskResultGetter$$anon$2.run(TaskResultGetter.scala:56)
>>   at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
>>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
>>   at java.lang.Thread.run(Unknown Source)
>> Caused by: java.net.ConnectException: Connection timed out: no further information: /192.168.1.3:58179
>>   at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>   at sun.nio.ch.SocketChannelImpl.finishConnect(Unknown Source)
>>   at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:224)
>>   at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:289)
>>   at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:528)
>>   at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
>>   at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
>>   at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
>>   at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
>>   ... 1 more
>> 16/07/23 19:03:58 INFO RetryingBlockFetcher: Retrying fetch (1/3) for 1 outstanding blocks after 5000 ms
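For completeness, the failing pattern above boils down to something like the sketch below; the sample data is hypothetical, but the JavaPairRDD<String, Long> type and the collectAsMap() call come from the thread. collectAsMap() pulls the results back to the driver through the block-transfer service, which is the connection to /192.168.1.3:58179 that times out in the trace:

import java.util.Arrays;
import java.util.Map;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaSparkContext;

import scala.Tuple2;

public class CollectAsMapRepro {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setMaster("local[2]").setAppName("DR");
        JavaSparkContext jsc = new JavaSparkContext(conf);

        // Hypothetical sample data standing in for the real pairs in the thread.
        JavaPairRDD<String, Long> rdd = jsc.parallelizePairs(Arrays.asList(
                new Tuple2<>("a", 1L),
                new Tuple2<>("b", 2L)));

        // collectAsMap() fetches the results to the driver; if the driver's
        // advertised address is unreachable, this is where the IOException
        // ("Failed to connect to /192.168.1.3:58179") surfaces.
        Map<String, Long> result = rdd.collectAsMap();
        System.out.println(result);

        jsc.stop();
    }
}

Note that collectAsMap() also assumes the whole map fits in driver memory; for large RDDs it is safer to keep the data distributed, or to use lookup() for individual keys, rather than collecting everything to the driver.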