Hi,

I'm facing an issue with PySpark (1.5.1, 1.6.0-SNAPSHOT) running over YARN
(2.6.0-cdh5.4.4). Everything seems fine when working with DataFrames, but
when I need an RDD, the workers start to fail, as in the following code:

table1 = sqlContext.table('someTable')
table1.count() ## OK, approx. 500 million rows

table1.groupBy(table1.field).count().show() ## no problem

table1.rdd.count() ## fails with the error below on the driver

# Py4JJavaError: An error occurred while calling
# z:org.apache.spark.api.python.PythonRDD.collectAndServe.
# : org.apache.spark.SparkException: Job aborted due to stage failure:
# Task 23 in stage 117.0 failed 4 times, most recent failure: Lost task
# 23.3 in stage 117.0 (TID 23836, some_host): ExecutorLostFailure
# (executor 2446 lost)

The failing workers log this:

15/10/09 14:56:59 WARN TransportChannelHandler: Exception in
connection from host/ip:port
java.io.IOException: Connection reset by peer
        at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
        at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
        at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
        at sun.nio.ch.IOUtil.read(IOUtil.java:192)
        at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:379)
        at io.netty.buffer.PooledUnsafeDirectByteBuf.setBytes(PooledUnsafeDirectByteBuf.java:313)
        at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:881)
        at io.netty.channel.socket.nio.NioSocketChannel.doReadBytes(NioSocketChannel.java:242)
        at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:119)
        at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
        at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
        at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
        at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
        at java.lang.Thread.run(Thread.java:745)


The RDD works as expected if I use
conf.set("spark.shuffle.blockTransferService", "nio").

Since "nio" is deprecated, I'm looking for a better solution. Any ideas?
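For completeness, this is a sketch of how I apply the workaround; the
setting must go on the SparkConf before the SparkContext is created (the
table name is the same placeholder as above):

```python
from pyspark import SparkConf, SparkContext
from pyspark.sql import SQLContext

# Workaround: fall back to the deprecated NIO block-transfer service
# instead of the default netty-based one.
conf = SparkConf().set("spark.shuffle.blockTransferService", "nio")
sc = SparkContext(conf=conf)
sqlContext = SQLContext(sc)

table1 = sqlContext.table('someTable')
table1.rdd.count()  # with "nio" set, no executors are lost
```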

Thanks in advance

ih
