Can you post the logs from the workers?
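
In the meantime, since the trace shows a connect timeout to 10.248.0.218:52104 rather than a refused connection, it might be worth checking plain TCP reachability between the two workers. Below is a minimal sketch in Python. The helper name can_connect is my own and is not part of Spark. Also note that the block manager port is picked at random per executor unless spark.blockManager.port is set, so the port from the log may no longer be in use.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port opens within `timeout` seconds.

    Hypothetical helper for illustration only; it just tests reachability
    and says nothing about whether Spark itself is healthy on that port.
    """
    try:
        # create_connection resolves the address and applies the timeout
        # to the connect attempt.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False
```

Run from the worker that logs the error, something like can_connect("10.248.0.218", 52104) against the port of a currently running executor would show whether the connection is blocked at the network level (a firewall or security group, say) or whether the problem is more likely inside Spark.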

On Wed, Jan 6, 2016 at 1:57 PM, Jeff Jones <jjo...@adaptivebiotech.com>
wrote:

> I upgraded our Spark standalone cluster from 1.4.1 to 1.6.0 yesterday. We
> are now seeing regular timeouts between two of the workers when making
> connections. These workers and the same driver code worked fine running on
> 1.4.1 and finished in under a second. Any thoughts on what might have
> changed?
>
> 16/01/06 19:17:58 ERROR RetryingBlockFetcher: Exception while beginning fetch of 1 outstanding blocks (after 3 retries)
> java.io.IOException: Connecting to /10.248.0.218:52104 timed out (120000 ms)
>     at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:214)
>     at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:167)
>     at org.apache.spark.network.netty.NettyBlockTransferService$$anon$1.createAndStart(NettyBlockTransferService.scala:90)
>     at org.apache.spark.network.shuffle.RetryingBlockFetcher.fetchAllOutstanding(RetryingBlockFetcher.java:140)
>     at org.apache.spark.network.shuffle.RetryingBlockFetcher.access$200(RetryingBlockFetcher.java:43)
>     at org.apache.spark.network.shuffle.RetryingBlockFetcher$1.run(RetryingBlockFetcher.java:170)
>     at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
>     at java.util.concurrent.FutureTask.run(Unknown Source)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
>     at java.lang.Thread.run(Unknown Source)
> 16/01/06 19:17:58 WARN BlockManager: Failed to fetch remote block rdd_74_3 from BlockManagerId(1, 10.248.0.218, 52104) (failed attempt 1)
> java.io.IOException: Connecting to /10.248.0.218:52104 timed out (120000 ms)
>     at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:214)
>     at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:167)
>     at org.apache.spark.network.netty.NettyBlockTransferService$$anon$1.createAndStart(NettyBlockTransferService.scala:90)
>     at org.apache.spark.network.shuffle.RetryingBlockFetcher.fetchAllOutstanding(RetryingBlockFetcher.java:140)
>     at org.apache.spark.network.shuffle.RetryingBlockFetcher.access$200(RetryingBlockFetcher.java:43)
>     at org.apache.spark.network.shuffle.RetryingBlockFetcher$1.run(RetryingBlockFetcher.java:170)
>     at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
>     at java.util.concurrent.FutureTask.run(Unknown Source)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
>     at java.lang.Thread.run(Unknown Source)
>
> Thanks,
> Jeff
