A follow-up for anyone who may end up on this thread again:

I kept trying, but neither changing the number of concurrent map tasks
nor the slice size helped.
Finally, I found a screw-up in our logging system, which had
prevented us from noticing a couple of recurring errors in the logs:

ERROR [ROW-READ-STAGE:1] 2010-05-11 16:43:32,328 DebuggableThreadPoolExecutor.java (line 101) Error in ThreadPoolExecutor
java.lang.RuntimeException: java.lang.RuntimeException: corrupt sstable
        at org.apache.cassandra.service.RangeSliceVerbHandler.doVerb(RangeSliceVerbHandler.java:53)
        at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:40)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:619)
Caused by: java.lang.RuntimeException: corrupt sstable
        at org.apache.cassandra.io.SSTableScanner.seekTo(SSTableScanner.java:73)
        at org.apache.cassandra.db.ColumnFamilyStore.getKeyRange(ColumnFamilyStore.java:907)
        at org.apache.cassandra.db.ColumnFamilyStore.getRangeSlice(ColumnFamilyStore.java:1000)
        at org.apache.cassandra.service.RangeSliceVerbHandler.doVerb(RangeSliceVerbHandler.java:41)
        ... 4 more
Caused by: java.io.FileNotFoundException: /path/to/data/Keyspace/CF-123-Index.db (Too many open files)
        at java.io.RandomAccessFile.open(Native Method)
        at java.io.RandomAccessFile.<init>(RandomAccessFile.java:212)
        at java.io.RandomAccessFile.<init>(RandomAccessFile.java:98)
        at org.apache.cassandra.io.util.BufferedRandomAccessFile.<init>(BufferedRandomAccessFile.java:143)
        at org.apache.cassandra.io.util.BufferedRandomAccessFile.<init>(BufferedRandomAccessFile.java:138)
        at org.apache.cassandra.io.SSTableReader.getNearestPosition(SSTableReader.java:414)
        at org.apache.cassandra.io.SSTableScanner.seekTo(SSTableScanner.java:62)
        ... 7 more

and the related warning:

 WARN [main] 2010-05-11 16:43:38,076 TThreadPoolServer.java (line 190) Transport error occurred during acceptance of message.
org.apache.thrift.transport.TTransportException: java.net.SocketException: Too many open files
        at org.apache.thrift.transport.TServerSocket.acceptImpl(TServerSocket.java:124)
        at org.apache.thrift.transport.TServerSocket.acceptImpl(TServerSocket.java:35)
        at org.apache.thrift.transport.TServerTransport.accept(TServerTransport.java:31)
        at org.apache.thrift.server.TThreadPoolServer.serve(TThreadPoolServer.java:184)
        at org.apache.cassandra.thrift.CassandraDaemon.start(CassandraDaemon.java:149)
        at org.apache.cassandra.thrift.CassandraDaemon.main(CassandraDaemon.java:190)
Caused by: java.net.SocketException: Too many open files
        at java.net.PlainSocketImpl.socketAccept(Native Method)
        at java.net.PlainSocketImpl.accept(PlainSocketImpl.java:390)
        at java.net.ServerSocket.implAccept(ServerSocket.java:453)
        at java.net.ServerSocket.accept(ServerSocket.java:421)
        at org.apache.thrift.transport.TServerSocket.acceptImpl(TServerSocket.java:119)
        ... 5 more

The client was reporting timeouts in this case.


The max fd limit on the process was in fact fairly low (1024), and
raising it seems to have solved the problem.
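
For anyone who wants to keep an eye on this, here is a minimal sketch that
reads the process's current and maximum file descriptor counts from inside
the JVM. It assumes a Sun/Oracle JVM on a Unix-like system (where
com.sun.management.UnixOperatingSystemMXBean is available); the class name
is just for illustration:

import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;
import com.sun.management.UnixOperatingSystemMXBean;

// Hypothetical helper class; run it in the JVM you want to inspect.
public class FdUsage
{
    public static void main(String[] args)
    {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof UnixOperatingSystemMXBean)
        {
            UnixOperatingSystemMXBean unixOs = (UnixOperatingSystemMXBean) os;
            // current vs. maximum file descriptors for this process
            System.out.println("open fds: " + unixOs.getOpenFileDescriptorCount()
                               + " / max: " + unixOs.getMaxFileDescriptorCount());
        }
    }
}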

Anyway, it still seems that there may be two issues:

- Since we had never seen this error before with normal (i.e. non-Hadoop)
client connections, is it possible that the Cassandra/Hadoop layer is not
closing sockets properly between one connection and the next, or is not
reusing connections efficiently? For example, TSocket has a close() method,
but I don't see it used in ColumnFamilyInputFormat (getSubSplits,
getRangeMap); it may well be called inside CassandraClient, though.
(See the sketch after these two points.)

That said, judging by lsof's output I can only see about a hundred TCP
connections, and those from the Hadoop jobs always seem to stay below 60,
so this may just be a wrong impression on my part.

- Is it possible that such errors show up on the client side as timeout
errors when they could be reported more explicitly? That would probably
help other people diagnose and report internal errors in the future.
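
Regarding the first point, here is a minimal sketch of what I mean by
closing the transport explicitly. It is only an illustration, assuming a
new TSocket is opened per split; withClient and its arguments are
hypothetical names, not the actual code in ColumnFamilyInputFormat:

import org.apache.cassandra.thrift.Cassandra;
import org.apache.thrift.TException;
import org.apache.thrift.protocol.TBinaryProtocol;
import org.apache.thrift.transport.TSocket;
import org.apache.thrift.transport.TTransport;

public class ClosingClientSketch
{
    // Hypothetical helper: open a connection, run the calls, always close the socket.
    static void withClient(String host, int port) throws TException
    {
        TTransport transport = new TSocket(host, port);
        Cassandra.Client client = new Cassandra.Client(new TBinaryProtocol(transport));
        transport.open();
        try
        {
            // ... describe_splits / get_range_slices calls would go here ...
        }
        finally
        {
            transport.close(); // without this, each split/task can leak one socket fd
        }
    }
}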


Thanks again to everyone for the help with this; I promise I'll put the
discussion on the wiki for future reference :)
