Most likely a network issue. Check your network configuration in the AWS
console and make sure the ports are accessible within the cluster.
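
One way to verify that a port is accessible between nodes is a plain TCP
connect test run from one executor host against another. The following is
only a minimal sketch, not part of Spark: the function name
port_is_reachable is hypothetical, and the host and port in the usage
comment are taken from the logs below purely for illustration. Substitute
the port you pinned spark.blockManager.port to.

```python
import socket


def port_is_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "Connection refused", timeouts and unresolvable host names.
        return False


# Example usage (illustrative values, run from another node in the cluster):
#   port_is_reachable("ip-10-63-160-18.ec2.internal", 50929)
```

A False result only says the connection could not be made. It does not
distinguish a port blocked by the network configuration from a port that
no process is listening on, so check that the executor is up as well.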

Thanks
Best Regards

On Thu, Oct 22, 2015 at 8:53 PM, Eugen Cepoi <cepoi.eu...@gmail.com> wrote:

> Huh, indeed this worked, thanks. Do you know why this happens? Is it a
> known issue?
>
> Thanks,
> Eugen
>
> 2015-10-22 19:08 GMT+07:00 Akhil Das <ak...@sigmoidanalytics.com>:
>
>> Can you try fixing spark.blockManager.port to a specific port and see if
>> the issue persists?
>>
>> Thanks
>> Best Regards
>>
>> On Mon, Oct 19, 2015 at 6:21 PM, Eugen Cepoi <cepoi.eu...@gmail.com>
>> wrote:
>>
>>> Hi,
>>>
>>> I am running Spark Streaming 1.4.1 on EMR (AMI 3.9) over YARN.
>>> The job reads data from Kinesis with a batch interval of 30s (I
>>> used the same value for the Kinesis checkpointing).
>>> In the executor logs I see, every 5 seconds, a sequence of stack traces
>>> indicating that block replication failed. I am using the default
>>> storage level MEMORY_AND_DISK_SER_2.
>>> Neither WAL nor checkpointing is enabled (the checkpoint dir is configured
>>> for the Spark context but not for the streaming context).
>>>
>>> Here is an example of those logs for ip-10-63-160-18. They occur in
>>> every executor while trying to replicate to any other executor.
>>>
>>>
>>> 15/10/19 03:11:55 INFO nio.SendingConnection: Initiating connection to [ip-10-63-160-18.ec2.internal/10.63.160.18:50929]
>>> 15/10/19 03:11:55 WARN nio.SendingConnection: Error finishing connection to ip-10-63-160-18.ec2.internal/10.63.160.18:50929
>>> java.net.ConnectException: Connection refused
>>>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>>     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
>>>     at org.apache.spark.network.nio.SendingConnection.finishConnect(Connection.scala:344)
>>>     at org.apache.spark.network.nio.ConnectionManager$$anon$10.run(ConnectionManager.scala:292)
>>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>     at java.lang.Thread.run(Thread.java:745)
>>> 15/10/19 03:11:55 ERROR nio.ConnectionManager: Exception while sending message.
>>> java.net.ConnectException: Connection refused
>>>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>>     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
>>>     at org.apache.spark.network.nio.SendingConnection.finishConnect(Connection.scala:344)
>>>     at org.apache.spark.network.nio.ConnectionManager$$anon$10.run(ConnectionManager.scala:292)
>>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>     at java.lang.Thread.run(Thread.java:745)
>>> 15/10/19 03:11:55 INFO nio.ConnectionManager: Notifying ConnectionManagerId(ip-10-63-160-18.ec2.internal,50929)
>>> 15/10/19 03:11:55 INFO nio.ConnectionManager: Handling connection error on connection to ConnectionManagerId(ip-10-63-160-18.ec2.internal,50929)
>>> 15/10/19 03:11:55 WARN storage.BlockManager: Failed to replicate input-1-1445242310000 to BlockManagerId(3, ip-10-159-151-22.ec2.internal, 50929), failure #0
>>> java.net.ConnectException: Connection refused
>>>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>>     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
>>>     at org.apache.spark.network.nio.SendingConnection.finishConnect(Connection.scala:344)
>>>     at org.apache.spark.network.nio.ConnectionManager$$anon$10.run(ConnectionManager.scala:292)
>>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>     at java.lang.Thread.run(Thread.java:745)
>>> 15/10/19 03:11:55 INFO nio.ConnectionManager: Removing SendingConnection to ConnectionManagerId(ip-10-63-160-18.ec2.internal,50929)
>>> 15/10/19 03:11:55 INFO nio.SendingConnection: Initiating connection to [ip-10-63-160-18.ec2.internal/10.63.160.18:39506]
>>> 15/10/19 03:11:55 WARN nio.SendingConnection: Error finishing connection to ip-10-63-160-18.ec2.internal/10.63.160.18:39506
>>> java.net.ConnectException: Connection refused
>>>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>>     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
>>>     at org.apache.spark.network.nio.SendingConnection.finishConnect(Connection.scala:344)
>>>     at org.apache.spark.network.nio.ConnectionManager$$anon$10.run(ConnectionManager.scala:292)
>>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>     at java.lang.Thread.run(Thread.java:745)
>>> 15/10/19 03:11:55 ERROR nio.ConnectionManager: Exception while sending message.
>>> java.net.ConnectException: Connection refused
>>>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>>     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
>>>     at org.apache.spark.network.nio.SendingConnection.finishConnect(Connection.scala:344)
>>>     at org.apache.spark.network.nio.ConnectionManager$$anon$10.run(ConnectionManager.scala:292)
>>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>     at java.lang.Thread.run(Thread.java:745)
>>> 15/10/19 03:11:55 INFO nio.ConnectionManager: Notifying ConnectionManagerId(ip-10-63-160-18.ec2.internal,39506)
>>> 15/10/19 03:11:55 INFO nio.ConnectionManager: Handling connection error on connection to ConnectionManagerId(ip-10-63-160-18.ec2.internal,39506)
>>> 15/10/19 03:11:55 INFO nio.ConnectionManager: Removing SendingConnection to ConnectionManagerId(ip-10-63-160-18.ec2.internal,39506)
>>> 15/10/19 03:11:55 WARN storage.BlockManager: Failed to replicate input-1-1445242310000 to BlockManagerId(2, ip-10-141-12-91.ec2.internal, 39506), failure #1
>>> java.net.ConnectException: Connection refused
>>>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>>     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
>>>     at org.apache.spark.network.nio.SendingConnection.finishConnect(Connection.scala:344)
>>>     at org.apache.spark.network.nio.ConnectionManager$$anon$10.run(ConnectionManager.scala:292)
>>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>     at java.lang.Thread.run(Thread.java:745)
>>> 15/10/19 03:11:55 WARN storage.BlockManager: Block input-1-1445242310000 replicated to only 0 peer(s) instead of 1 peers
>>> 15/10/19 03:11:55 INFO receiver.BlockGenerator: Pushed block input-1-1445242310000
>>>
>>>
>>>
>>> Thanks,
>>> Eugen
>>>
>>
>>
>
