Hello,
I am using BulkOutputFormat to load data from a .csv file into Cassandra. I am 
using Cassandra 1.1.3 and Hadoop 0.20.2. I have 7 Hadoop nodes: 1 
namenode/jobtracker and 6 datanodes/tasktrackers. Cassandra is installed on 4 
of these 6 datanodes/tasktrackers.

The issue happens when I have more than 1 reducer. SSTables are generated on 
each node; however, I get the following error in the tasktrackers' logs when 
the SSTables are streamed into the Cassandra cluster:

Exception in thread "Streaming to /172.16.110.79:1" java.lang.RuntimeException: java.io.EOFException
        at org.apache.cassandra.utils.FBUtilities.unchecked(FBUtilities.java:628)
        at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:34)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
        at java.lang.Thread.run(Unknown Source)
Caused by: java.io.EOFException
        at java.io.DataInputStream.readInt(Unknown Source)
        at org.apache.cassandra.streaming.FileStreamTask.receiveReply(FileStreamTask.java:194)
        at org.apache.cassandra.streaming.FileStreamTask.stream(FileStreamTask.java:181)
        at org.apache.cassandra.streaming.FileStreamTask.runMayThrow(FileStreamTask.java:94)
        at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
        ... 3 more
Exception in thread "Streaming to /172.16.110.92:1" java.lang.RuntimeException: java.io.EOFException
        at org.apache.cassandra.utils.FBUtilities.unchecked(FBUtilities.java:628)
        at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:34)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
        at java.lang.Thread.run(Unknown Source)
Caused by: java.io.EOFException
        at java.io.DataInputStream.readInt(Unknown Source)
        at org.apache.cassandra.streaming.FileStreamTask.receiveReply(FileStreamTask.java:194)
        at org.apache.cassandra.streaming.FileStreamTask.stream(FileStreamTask.java:181)
        at org.apache.cassandra.streaming.FileStreamTask.runMayThrow(FileStreamTask.java:94)
        at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
        ... 3 more ...

This is what I get in the logs of one of my Cassandra nodes:

ERROR 16:47:34,904 Sending retry message failed, closing session.
java.io.IOException: Broken pipe
        at sun.nio.ch.FileDispatcher.write0(Native Method)
        at sun.nio.ch.SocketDispatcher.write(Unknown Source)
        at sun.nio.ch.IOUtil.writeFromNativeBuffer(Unknown Source)
        at sun.nio.ch.IOUtil.write(Unknown Source)
        at sun.nio.ch.SocketChannelImpl.write(Unknown Source)
        at java.nio.channels.Channels.writeFullyImpl(Unknown Source)
        at java.nio.channels.Channels.writeFully(Unknown Source)
        at java.nio.channels.Channels.access$000(Unknown Source)
        at java.nio.channels.Channels$1.write(Unknown Source)
        at java.io.OutputStream.write(Unknown Source)
        at java.nio.channels.Channels$1.write(Unknown Source)
        at java.io.DataOutputStream.writeInt(Unknown Source)
        at org.apache.cassandra.net.OutboundTcpConnection.write(OutboundTcpConnection.java:196)
        at org.apache.cassandra.streaming.StreamInSession.sendMessage(StreamInSession.java:171)
        at org.apache.cassandra.streaming.StreamInSession.retry(StreamInSession.java:160)
        at org.apache.cassandra.streaming.IncomingStreamReader.retry(IncomingStreamReader.java:168)
        at org.apache.cassandra.streaming.IncomingStreamReader.read(IncomingStreamReader.java:98)
        at org.apache.cassandra.net.IncomingTcpConnection.stream(IncomingTcpConnection.java:182)
        at org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:78)

Does anyone know what caused these errors?

Thank you for your help.

Regards,
Ralph
