Hello,

I am using BulkOutputFormat to load data from a .csv file into Cassandra. I am using Cassandra 1.1.3 and Hadoop 0.20.2.

I have 7 Hadoop nodes: 1 namenode/jobtracker and 6 datanodes/tasktrackers. Cassandra is installed on 4 of these 6 datanodes/tasktrackers.

The issue occurs when I have more than one reducer. SSTables are generated on each node; however, I get the following errors in the tasktrackers' logs when the SSTables are streamed into the Cassandra cluster:

Exception in thread "Streaming to /172.16.110.79:1" java.lang.RuntimeException: java.io.EOFException
    at org.apache.cassandra.utils.FBUtilities.unchecked(FBUtilities.java:628)
    at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:34)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)
Caused by: java.io.EOFException
    at java.io.DataInputStream.readInt(Unknown Source)
    at org.apache.cassandra.streaming.FileStreamTask.receiveReply(FileStreamTask.java:194)
    at org.apache.cassandra.streaming.FileStreamTask.stream(FileStreamTask.java:181)
    at org.apache.cassandra.streaming.FileStreamTask.runMayThrow(FileStreamTask.java:94)
    at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
    ... 3 more
Exception in thread "Streaming to /172.16.110.92:1" java.lang.RuntimeException: java.io.EOFException
    at org.apache.cassandra.utils.FBUtilities.unchecked(FBUtilities.java:628)
    at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:34)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)
Caused by: java.io.EOFException
    at java.io.DataInputStream.readInt(Unknown Source)
    at org.apache.cassandra.streaming.FileStreamTask.receiveReply(FileStreamTask.java:194)
    at org.apache.cassandra.streaming.FileStreamTask.stream(FileStreamTask.java:181)
    at org.apache.cassandra.streaming.FileStreamTask.runMayThrow(FileStreamTask.java:94)
    at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
    ... 3 more
...

This is what I get in the logs of one of my Cassandra nodes:

ERROR 16:47:34,904 Sending retry message failed, closing session.
java.io.IOException: Broken pipe
    at sun.nio.ch.FileDispatcher.write0(Native Method)
    at sun.nio.ch.SocketDispatcher.write(Unknown Source)
    at sun.nio.ch.IOUtil.writeFromNativeBuffer(Unknown Source)
    at sun.nio.ch.IOUtil.write(Unknown Source)
    at sun.nio.ch.SocketChannelImpl.write(Unknown Source)
    at java.nio.channels.Channels.writeFullyImpl(Unknown Source)
    at java.nio.channels.Channels.writeFully(Unknown Source)
    at java.nio.channels.Channels.access$000(Unknown Source)
    at java.nio.channels.Channels$1.write(Unknown Source)
    at java.io.OutputStream.write(Unknown Source)
    at java.nio.channels.Channels$1.write(Unknown Source)
    at java.io.DataOutputStream.writeInt(Unknown Source)
    at org.apache.cassandra.net.OutboundTcpConnection.write(OutboundTcpConnection.java:196)
    at org.apache.cassandra.streaming.StreamInSession.sendMessage(StreamInSession.java:171)
    at org.apache.cassandra.streaming.StreamInSession.retry(StreamInSession.java:160)
    at org.apache.cassandra.streaming.IncomingStreamReader.retry(IncomingStreamReader.java:168)
    at org.apache.cassandra.streaming.IncomingStreamReader.read(IncomingStreamReader.java:98)
    at org.apache.cassandra.net.IncomingTcpConnection.stream(IncomingTcpConnection.java:182)
    at org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:78)

Does anyone know what caused these errors?

Thank you for your help.

Regards,
Ralph