The solution is to set 'spark.shuffle.io.preferDirectBufs' to 'false'.
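For example, this can be set on the SparkConf before the context is created (a minimal sketch; the app name is a placeholder):

```python
from pyspark import SparkConf, SparkContext

conf = SparkConf().setAppName("example")
# Tell Netty to prefer heap buffers over direct (off-heap) buffers
conf.set("spark.shuffle.io.preferDirectBufs", "false")
sc = SparkContext(conf=conf)
```

The same setting can be passed on the command line via `spark-submit --conf spark.shuffle.io.preferDirectBufs=false`.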

It works now.

Cheers!

On Fri, Oct 9, 2015 at 3:13 PM, Ivan Héda <ivan.h...@gmail.com> wrote:

> Hi,
>
> I'm facing an issue with PySpark (1.5.1, 1.6.0-SNAPSHOT) running over Yarn
> (2.6.0-cdh5.4.4). Everything seems fine when working with DataFrames, but
> when I work with RDDs the workers start to fail, as in the following code:
>
> table1 = sqlContext.table('someTable')
> table1.count() ## OK ## approx. 500 million rows
>
> table1.groupBy(table1.field).count().show() ## no problem
>
> table1.rdd.count() ## fails; driver log below
>
> # Py4JJavaError: An error occurred while calling
> z:org.apache.spark.api.python.PythonRDD.collectAndServe.
>
> # : org.apache.spark.SparkException: Job aborted due to stage failure: Task 
> 23 in stage 117.0 failed 4 times, most recent failure: Lost task 23.3 in 
> stage 117.0 (TID 23836, some_host): ExecutorLostFailure (executor 2446 lost)
>
> The affected workers fail with this log:
>
> 15/10/09 14:56:59 WARN TransportChannelHandler: Exception in connection from 
> host/ip:port
> java.io.IOException: Connection reset by peer
>       at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
>       at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
>       at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
>       at sun.nio.ch.IOUtil.read(IOUtil.java:192)
>       at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:379)
>       at 
> io.netty.buffer.PooledUnsafeDirectByteBuf.setBytes(PooledUnsafeDirectByteBuf.java:313)
>       at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:881)
>       at 
> io.netty.channel.socket.nio.NioSocketChannel.doReadBytes(NioSocketChannel.java:242)
>       at 
> io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:119)
>       at 
> io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
>       at 
> io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
>       at 
> io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
>       at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
>       at 
> io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
>       at java.lang.Thread.run(Thread.java:745)
>
>
> RDDs work as expected if I use 
> conf.set("spark.shuffle.blockTransferService", "nio").
>
> Since "nio" is deprecated, I'm looking for a better solution. Any ideas?
>
> Thanks in advance
>
> ih
>
>
