Have you changed spark-env.sh or spark-defaults.conf from the defaults? It looks 
like Spark is trying to address local workers via a network address (e.g. 
192.168…) instead of via localhost (localhost, 127.0.0.1, 0.0.0.0, …), and that 
network address doesn't appear to be reachable. You might also check /etc/hosts 
to make sure there is nothing unusual going on there.
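
If you want everything pinned to localhost, a minimal sketch (the exact values 
depend on your setup, so treat this as a starting point, not a verified fix) 
would be:

    # conf/spark-env.sh
    SPARK_LOCAL_IP=127.0.0.1

    # conf/spark-defaults.conf
    spark.driver.host    127.0.0.1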

One last thing to check: are you running Spark inside a VM and/or Docker? If 
networking isn't set up correctly there, you can also run into this kind of 
trouble.
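
If it is Docker, one common workaround (a sketch, assuming a Linux host; the 
image name is just a placeholder) is to run with host networking so the driver 
and executors are not separated by NAT:

    # host networking avoids port mapping between container and host
    docker run --rm -it --network host my-spark-image \
        spark-shell --conf spark.driver.host=127.0.0.1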

It would help to know everything about your setup that might affect networking.

—
Pedro Rodriguez
PhD Student in Large-Scale Machine Learning | CU Boulder
Systems Oriented Data Scientist
UC Berkeley AMPLab Alumni

pedrorodriguez.io | 909-353-4423
github.com/EntilZha | LinkedIn

On July 23, 2016 at 9:10:31 AM, VG (vlin...@gmail.com) wrote:

Hi Pedro,

Apologies for not adding this earlier. 

This is running in Spark's local mode, set up as follows:
JavaSparkContext jsc = new JavaSparkContext("local[2]", "DR");
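
For reference, the same context can also be built from a SparkConf, which makes 
it possible to pin the driver to localhost (a sketch; the extra setting is an 
untested guess at a fix, not something verified here):

    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaSparkContext;

    SparkConf conf = new SparkConf()
            .setMaster("local[2]")                   // same local mode, two threads
            .setAppName("DR")
            .set("spark.driver.host", "127.0.0.1");  // bind driver to localhost, not the LAN address
    JavaSparkContext jsc = new JavaSparkContext(conf);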

Any suggestions based on this?

The ports are not blocked by the firewall.

Regards,



On Sat, Jul 23, 2016 at 8:35 PM, Pedro Rodriguez <ski.rodrig...@gmail.com> 
wrote:
Make sure that you don't have ports firewalled. You don't really give much 
information to work from, but it looks like the master can't reach the worker 
nodes for some reason. More information on the cluster, networking, etc. would 
help.

For example, on AWS you can create a security group that allows all traffic to 
and from itself. If you are using something like ufw on Ubuntu, then you 
probably need to know the IP addresses of the worker nodes beforehand.
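
As a sketch (the subnet below is only an example; substitute your workers' 
actual range), opening things up with ufw on Ubuntu might look like:

    # allow all traffic from the workers' subnet (example range)
    sudo ufw allow from 192.168.1.0/24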

—
Pedro Rodriguez
PhD Student in Large-Scale Machine Learning | CU Boulder
Systems Oriented Data Scientist
UC Berkeley AMPLab Alumni

pedrorodriguez.io | 909-353-4423
github.com/EntilZha | LinkedIn

On July 23, 2016 at 7:38:01 AM, VG (vlin...@gmail.com) wrote:

Please suggest if I am doing something wrong, or an alternative way of doing 
this.

I have an RDD with two values, as follows:
JavaPairRDD<String, Long> rdd

When I execute rdd.collectAsMap()
it always fails with an IOException:


16/07/23 19:03:58 ERROR RetryingBlockFetcher: Exception while beginning fetch of 1 outstanding blocks
java.io.IOException: Failed to connect to /192.168.1.3:58179
	at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:228)
	at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:179)
	at org.apache.spark.network.netty.NettyBlockTransferService$$anon$1.createAndStart(NettyBlockTransferService.scala:96)
	at org.apache.spark.network.shuffle.RetryingBlockFetcher.fetchAllOutstanding(RetryingBlockFetcher.java:140)
	at org.apache.spark.network.shuffle.RetryingBlockFetcher.start(RetryingBlockFetcher.java:120)
	at org.apache.spark.network.netty.NettyBlockTransferService.fetchBlocks(NettyBlockTransferService.scala:105)
	at org.apache.spark.network.BlockTransferService.fetchBlockSync(BlockTransferService.scala:92)
	at org.apache.spark.storage.BlockManager.getRemoteBytes(BlockManager.scala:546)
	at org.apache.spark.scheduler.TaskResultGetter$$anon$2$$anonfun$run$1.apply$mcV$sp(TaskResultGetter.scala:76)
	at org.apache.spark.scheduler.TaskResultGetter$$anon$2$$anonfun$run$1.apply(TaskResultGetter.scala:57)
	at org.apache.spark.scheduler.TaskResultGetter$$anon$2$$anonfun$run$1.apply(TaskResultGetter.scala:57)
	at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1793)
	at org.apache.spark.scheduler.TaskResultGetter$$anon$2.run(TaskResultGetter.scala:56)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
	at java.lang.Thread.run(Unknown Source)
Caused by: java.net.ConnectException: Connection timed out: no further information: /192.168.1.3:58179
	at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
	at sun.nio.ch.SocketChannelImpl.finishConnect(Unknown Source)
	at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:224)
	at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:289)
	at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:528)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
	at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
	... 1 more
16/07/23 19:03:58 INFO RetryingBlockFetcher: Retrying fetch (1/3) for 1 outstanding blocks after 5000 ms



