Looking a bit more in the error logs, I see this:

java.lang.OutOfMemoryError: GC overhead limit exceeded
    at java.util.Arrays.copyOfRange(Arrays.java:2694)
    at java.lang.String.<init>(String.java:203)
    at java.lang.StringBuilder.toString(StringBuilder.java:405)
    at java.io.UnixFileSystem.resolve(UnixFileSystem.java:108)
    at java.io.File.<init>(File.java:367)
    at org.apache.spark.storage.DiskBlockManager.getFile(DiskBlockManager.scala:81)
    at org.apache.spark.storage.DiskBlockManager.getFile(DiskBlockManager.scala:84)
    at org.apache.spark.shuffle.IndexShuffleBlockManager.getIndexFile(IndexShuffleBlockManager.scala:60)
    at org.apache.spark.shuffle.IndexShuffleBlockManager.getBlockData(IndexShuffleBlockManager.scala:107)
    at org.apache.spark.storage.BlockManager.getBlockData(BlockManager.scala:304)
    at org.apache.spark.network.netty.NettyBlockRpcServer$$anonfun$2.apply(NettyBlockRpcServer.scala:57)
    at org.apache.spark.network.netty.NettyBlockRpcServer$$anonfun$2.apply(NettyBlockRpcServer.scala:57)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
    at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
    at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108)
    at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
    at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:108)
    at org.apache.spark.network.netty.NettyBlockRpcServer.receive(NettyBlockRpcServer.scala:57)
    at org.apache.spark.network.server.TransportRequestHandler.processRpcRequest(TransportRequestHandler.java:124)
    at org.apache.spark.network.server.TransportRequestHandler.handle(TransportRequestHandler.java:97)
    at org.apache.spark.network.server.TransportChannelHandler.channelRead0(TransportChannelHandler.java:91)
    at org.apache.spark.network.server.TransportChannelHandler.channelRead0(TransportChannelHandler.java:44)
    at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
    at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
    at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:163)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)

On Wed, Jun 24, 2015 at 7:16 AM, Akhil Das <ak...@sigmoidanalytics.com>
wrote:

> Can you look a bit more in the error logs? It could be getting killed
> because of OOM etc. One thing you can try is to set
> spark.shuffle.blockTransferService to nio instead of netty.
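>
> For reference, a minimal sketch of setting that property (the app name and
> executor memory below are placeholders, not values from this job;
> spark.shuffle.blockTransferService is a Spark 1.x option):
>
>     import org.apache.spark.{SparkConf, SparkContext}
>
>     // spark.shuffle.blockTransferService accepts "nio" or "netty" in Spark 1.x.
>     // The master URL is supplied by spark-submit, so it is not set here.
>     val conf = new SparkConf()
>       .setAppName("seven-stage-join")                    // placeholder name
>       .set("spark.shuffle.blockTransferService", "nio")  // default is netty
>       .set("spark.executor.memory", "8g")                // placeholder size
>     val sc = new SparkContext(conf)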
>
> Thanks
> Best Regards
>
> On Wed, Jun 24, 2015 at 5:46 AM, ÐΞ€ρ@Ҝ (๏̯͡๏) <deepuj...@gmail.com>
> wrote:
>
>> I have a Spark job that has 7 stages. The first 3 stages complete and the
>> fourth stage begins (it joins two RDDs). This stage has multiple task
>> failures, all with the exception below.
>>
>> Multiple tasks (100s of them) get the same exception, each naming a
>> different host. How can all the hosts suddenly stop responding when, a few
>> moments ago, 3 stages ran successfully? If I re-run the job, the first
>> three stages again run successfully. I cannot think of it being a cluster
>> issue.
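>>
>> For context, the failing stage is essentially this shape (names and inline
>> data are hypothetical stand-ins, assuming a spark-shell sc):
>>
>>     // Hypothetical sketch of the stage-4 join; tiny inline pairs stand in
>>     // for the two RDDs produced by the first three stages.
>>     val a = sc.parallelize(Seq(("k1", 1L), ("k2", 2L)))
>>     val b = sc.parallelize(Seq(("k1", "x"), ("k3", "y")))
>>     val joined = a.join(b)  // wide dependency: the shuffle fetch here is what fails
>>     joined.count()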
>>
>>
>> Any suggestions?
>>
>>
>> Spark Version : 1.3.1
>>
>> Exception:
>>
>> org.apache.spark.shuffle.FetchFailedException: Failed to connect to HOST
>>      at org.apache.spark.shuffle.hash.BlockStoreShuffleFetcher$.org$apache$spark$shuffle$hash$BlockStoreShuffleFetcher$$unpackBlock$1(BlockStoreShuffleFetcher.scala:67)
>>      at org.apache.spark.shuffle.hash.BlockStoreShuffleFetcher$$anonfun$3.apply(BlockStoreShuffleFetcher.scala:83)
>>      at org.apache.spark.shuffle.hash.BlockStoreShuffleFetcher$$anonfun$3.apply(BlockStoreShuffleFetcher.scala:83)
>>      at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
>>      at org.apache.spark.util.CompletionIterator.hasNext(CompletionIterator.scala:32)
>>      at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:39)
>>      at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
>>      at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
>>      at org.apache.spark.util.collection.ExternalAppendOnlyMap.insertAll(ExternalAppendOnlyMap.scala:125)
>>      at org.apache.sp
>>
>>
>> --
>> Deepak
>>
>>
>


-- 
Deepak
