Even without lsof, you should be able to get the data from /proc/$pid -ryan
On Fri, Jun 17, 2011 at 5:08 AM, Dominic Williams <dwilli...@system7.co.uk> wrote: > Unfortunately I shutdown that node and anyway lsof wasn't installed. > But $ulimit gives > unlimited > > On 17 June 2011 13:00, Sylvain Lebresne <sylv...@datastax.com> wrote: >> >> On Fri, Jun 17, 2011 at 1:51 PM, Dominic Williams >> <dwilli...@system7.co.uk> wrote: >> > As far as scrub goes that could be it. I'm already running unlimited >> > file >> > handles though so ulimit not answer unfortunately >> >> Are you sure ? How many file descriptors are open on the system when >> you get that >> scrub exception ? If you really have unlimited file handles and >> Cassandra uses them >> all, then we have a more serious problem than just CASSANDRA-2669. >> >> -- >> Sylvain >> >> > Dominic >> > On 17 June 2011 12:12, Sylvain Lebresne <sylv...@datastax.com> wrote: >> >> >> >> Scrub apparently dies because it cannot acquire a file descriptor. >> >> Scrub >> >> does >> >> not correctly closes files >> >> (https://issues.apache.org/jira/browse/CASSANDRA-2669) >> >> so that may be part of why that happens. However, a simple fix is >> >> probably >> >> to >> >> raise up the file descriptor limit. >> >> >> >> -- >> >> Sylvain >> >> >> >> >> >> On Fri, Jun 17, 2011 at 11:31 AM, Dominic Williams >> >> <dwilli...@system7.co.uk> wrote: >> >> > Hi all, >> >> > Anyone experiencing this..? >> >> > I noticed one of my 7.6-2 nodes had inexplicable and consistently >> >> > high >> >> > cpu >> >> > usage. Checking the log I found that there was a some kind of SSTable >> >> > corruption that was stopping a bunch of files from compacting (first >> >> > trace >> >> > copied below). >> >> > I then tried scrub (before anyone asks it had been run when >> >> > upgrading, >> >> > but I >> >> > was just trying to find out if this fixed it) but that failed with >> >> > "Too >> >> > many >> >> > open files" even though node should be using mem map (second trace >> >> > copied >> >> > below). >> >> > Current plan is to snapshot, decommission & bootstrap but not what >> >> > should be >> >> > happening. Only factor I can think of that might have caused problem >> >> > is >> >> > we >> >> > are currently running huge GCGraceSeconds because we can't run repair >> >> > (we >> >> > are hanging on for 8.1 as CASSANDRA-2280 causes repair to consume all >> >> > disk >> >> > space and fail) >> >> > INFO [NonPeriodicTasks:1] 2011-06-16 22:43:45,232 SSTable.java (line >> >> > 147) >> >> > Deleted /var/opt/cassandra/data/FightMyMonster/Users_CisIndex-f-5922 >> >> > INFO [MutationStage:7989] 2011-06-16 22:49:57,852 >> >> > ColumnFamilyStore.java >> >> > (line 1065) Enqueuing flush of Memtable-UserMonsters@423965881(34471 >> >> > bytes, >> >> > 322112 operations) >> >> > INFO [FlushWriter:1176] 2011-06-16 22:49:57,917 Memtable.java (line >> >> > 157) >> >> > Writing Memtable-UserMonsters@423965881(34471 bytes, 322112 >> >> > operations) >> >> > INFO [FlushWriter:1176] 2011-06-16 22:50:01,087 Memtable.java (line >> >> > 172) >> >> > Completed flushing >> >> > /var/opt/cassandra/data/FightMyMonster/UserMonsters-f-7614-Data.db >> >> > (7250239 >> >> > bytes) >> >> > INFO [CompactionExecutor:1] 2011-06-16 22:50:01,090 >> >> > CompactionManager.java >> >> > (line 395) Compacting >> >> > >> >> > >> >> > [SSTableReader(path='/var/opt/cassandra/data/FightMyMonster/UserMonsters-f-6993-Data.db'),SSTableReader( >> >> > >> >> > >> >> > path='/var/opt/cassandra/data/FightMyMonster/UserMonsters-f-6994-Data.db'),SSTableReader(path='/var/opt/cassandra/data/FightMyMonster/UserMonsters-f-6995-Data.db'),SSTableReader(path='/var/opt/cassandra >> >> > >> >> > >> >> > /data/FightMyMonster/UserMonsters-f-6996-Data.db'),SSTableReader(path='/var/opt/cassandra/data/FightMyMonster/UserMonsters-f-6998-Data.db'),SSTableReader(path='/var/opt/cassandra/data/FightMyMonster/Use >> >> > >> >> > >> >> > rMonsters-f-7000-Data.db'),SSTableReader(path='/var/opt/cassandra/data/FightMyMonster/UserMonsters-f-7002-Data.db'),SSTableReader(path='/var/opt/cassandra/data/FightMyMonster/UserMonsters-f-7004-Data.db >> >> > >> >> > >> >> > '),SSTableReader(path='/var/opt/cassandra/data/FightMyMonster/UserMonsters-f-7006-Data.db'),SSTableReader(path='/var/opt/cassandra/data/FightMyMonster/UserMonsters-f-7008-Data.db'),SSTableReader(path='/ >> >> > >> >> > >> >> > var/opt/cassandra/data/FightMyMonster/UserMonsters-f-7010-Data.db'),SSTableReader(path='/var/opt/cassandra/data/FightMyMonster/UserMonsters-f-7012-Data.db'),SSTableReader(path='/var/opt/cassandra/data/F >> >> > >> >> > >> >> > ightMyMonster/UserMonsters-f-7014-Data.db'),SSTableReader(path='/var/opt/cassandra/data/FightMyMonster/UserMonsters-f-7016-Data.db'),SSTableReader(path='/var/opt/cassandra/data/FightMyMonster/UserMonste >> >> > >> >> > >> >> > rs-f-7018-Data.db'),SSTableReader(path='/var/opt/cassandra/data/FightMyMonster/UserMonsters-f-7020-Data.db'),SSTableReader(path='/var/opt/cassandra/data/FightMyMonster/UserMonsters-f-7022-Data.db'),SSTa >> >> > >> >> > >> >> > bleReader(path='/var/opt/cassandra/data/FightMyMonster/UserMonsters-f-7024-Data.db'),SSTableReader(path='/var/opt/cassandra/data/FightMyMonster/UserMonsters-f-7026-Data.db'),SSTableReader(path='/var/opt >> >> > >> >> > >> >> > /cassandra/data/FightMyMonster/UserMonsters-f-7028-Data.db'),SSTableReader(path='/var/opt/cassandra/data/FightMyMonster/UserMonsters-f-7030-Data.db'),SSTableReader(path='/var/opt/cassandra/data/FightMyM >> >> > >> >> > >> >> > onster/UserMonsters-f-7032-Data.db'),SSTableReader(path='/var/opt/cassandra/data/FightMyMonster/UserMonsters-f-7034-Data.db'),SSTableReader(path='/var/opt/cassandra/data/FightMyMonster/UserMonsters-f-70 >> >> > >> >> > >> >> > 36-Data.db'),SSTableReader(path='/var/opt/cassandra/data/FightMyMonster/UserMonsters-f-7038-Data.db'),SSTableReader(path='/var/opt/cassandra/data/FightMyMonster/UserMonsters-f-7040-Data.db'),SSTableRead >> >> > >> >> > >> >> > er(path='/var/opt/cassandra/data/FightMyMonster/UserMonsters-f-7042-Data.db'),SSTableReader(path='/var/opt/cassandra/data/FightMyMonster/UserMonsters-f-7044-Data.db'),SSTableReader(path='/var/opt/cassan >> >> > >> >> > >> >> > dra/data/FightMyMonster/UserMonsters-f-7046-Data.db'),SSTableReader(path='/var/opt/cassandra/data/FightMyMonster/UserMonsters-f-7048-Data.db'),SSTableReader(path='/var/opt/cassandra/data/FightMyMonster/ >> >> > >> >> > >> >> > UserMonsters-f-7050-Data.db'),SSTableReader(path='/var/opt/cassandra/data/FightMyMonster/UserMonsters-f-7052-Data.db')] >> >> > INFO [NonPeriodicTasks:1] 2011-06-16 22:50:02,399 >> >> > ColumnFamilyStore.java >> >> > (line 1065) Enqueuing flush of Memtable-Monsters@562380513(497561 >> >> > bytes, >> >> > 13860 operations) >> >> > INFO [FlushWriter:1176] 2011-06-16 22:50:02,399 Memtable.java (line >> >> > 157) >> >> > Writing Memtable-Monsters@562380513(497561 bytes, 13860 operations) >> >> > INFO [FlushWriter:1176] 2011-06-16 22:50:02,580 Memtable.java (line >> >> > 172) >> >> > Completed flushing >> >> > /var/opt/cassandra/data/FightMyMonster/Monsters-f-7160-Data.db >> >> > (471100 >> >> > bytes) >> >> > ERROR [CompactionExecutor:1] 2011-06-16 22:50:37,776 >> >> > AbstractCassandraDaemon.java (line 114) Fatal exception in thread >> >> > Thread[CompactionExecutor:1,1,main] >> >> > java.io.IOError: java.io.IOException: Corrupt (negative) value length >> >> > encountered >> >> > at >> >> > >> >> > >> >> > org.apache.cassandra.io.util.ColumnIterator.deserializeNext(ColumnSortedMap.java:252) >> >> > at >> >> > >> >> > >> >> > org.apache.cassandra.io.util.ColumnIterator.next(ColumnSortedMap.java:268) >> >> > at >> >> > >> >> > >> >> > org.apache.cassandra.io.util.ColumnIterator.next(ColumnSortedMap.java:227) >> >> > at >> >> > >> >> > >> >> > java.util.concurrent.ConcurrentSkipListMap.buildFromSorted(ConcurrentSkipListMap.java:1493) >> >> > at >> >> > >> >> > >> >> > java.util.concurrent.ConcurrentSkipListMap.<init>(ConcurrentSkipListMap.java:1443) >> >> > at >> >> > >> >> > >> >> > org.apache.cassandra.db.SuperColumnSerializer.deserialize(SuperColumn.java:379) >> >> > at >> >> > >> >> > >> >> > org.apache.cassandra.db.SuperColumnSerializer.deserialize(SuperColumn.java:362) >> >> > at >> >> > >> >> > >> >> > org.apache.cassandra.db.SuperColumnSerializer.deserialize(SuperColumn.java:322) >> >> > at >> >> > >> >> > >> >> > org.apache.cassandra.db.ColumnFamilySerializer.deserializeColumns(ColumnFamilySerializer.java:129) >> >> > at >> >> > >> >> > >> >> > org.apache.cassandra.io.sstable.SSTableIdentityIterator.getColumnFamilyWithColumns(SSTableIdentityIterator.java:201) >> >> > at >> >> > >> >> > org.apache.cassandra.io.PrecompactedRow.<init>(PrecompactedRow.java:78) >> >> > at >> >> > >> >> > >> >> > org.apache.cassandra.io.CompactionIterator.getCompactedRow(CompactionIterator.java:154) >> >> > at >> >> > >> >> > >> >> > org.apache.cassandra.io.CompactionIterator.getReduced(CompactionIterator.java:110) >> >> > at >> >> > >> >> > >> >> > org.apache.cassandra.io.CompactionIterator.getReduced(CompactionIterator.java:45) >> >> > at >> >> > >> >> > >> >> > org.apache.cassandra.utils.ReducingIterator.computeNext(ReducingIterator.java:74) >> >> > at >> >> > >> >> > >> >> > com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:136) >> >> > at >> >> > >> >> > >> >> > com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:131) >> >> > at >> >> > >> >> > >> >> > org.apache.commons.collections.iterators.FilterIterator.setNextObject(FilterIterator.java:183) >> >> > at >> >> > >> >> > >> >> > org.apache.commons.collections.iterators.FilterIterator.hasNext(FilterIterator.java:94) >> >> > at >> >> > >> >> > >> >> > org.apache.cassandra.db.CompactionManager.doCompaction(CompactionManager.java:448) >> >> > at >> >> > >> >> > >> >> > org.apache.cassandra.db.CompactionManager$1.call(CompactionManager.java:124) >> >> > at >> >> > >> >> > >> >> > org.apache.cassandra.db.CompactionManager$1.call(CompactionManager.java:94) >> >> > at >> >> > java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) >> >> > at java.util.concurrent.FutureTask.run(FutureTask.java:138) >> >> > at >> >> > >> >> > >> >> > java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) >> >> > at >> >> > >> >> > >> >> > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) >> >> > at java.lang.Thread.run(Thread.java:619) >> >> > Caused by: java.io.IOException: Corrupt (negative) value length >> >> > encountered >> >> > at >> >> > >> >> > >> >> > org.apache.cassandra.utils.ByteBufferUtil.readWithLength(ByteBufferUtil.java:315) >> >> > at >> >> > >> >> > >> >> > org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:99) >> >> > at >> >> > >> >> > >> >> > org.apache.cassandra.io.util.ColumnIterator.deserializeNext(ColumnSortedMap.java:248) >> >> > ... 26 more >> >> > INFO [CompactionExecutor:1] 2011-06-16 22:50:37,794 >> >> > CompactionManager.java >> >> > (line 395) Compacting >> >> > >> >> > >> >> > [SSTableReader(path='/var/opt/cassandra/data/FightMyMonster/Monsters-f-7157-Data.db'),SSTableReader(path >> >> > >> >> > >> >> > ='/var/opt/cassandra/data/FightMyMonster/Monsters-f-7158-Data.db'),SSTableReader(path='/var/opt/cassandra/data/FightMyMonster/Monsters-f-7159-Data.db'),SSTableReader(path='/var/opt/cassandra/data/FightM >> >> > yMonster/Monsters-f-7160-Data.db')] >> >> > >> >> > INFO [CompactionExecutor:1] 2011-06-17 00:46:24,756 >> >> > CompactionManager.java >> >> > (line 652) Scrub of >> >> > >> >> > >> >> > SSTableReader(path='/var/opt/cassandra/data/FightMyMonster/UserMonsters-f-7554-Data.db') >> >> > complete: 244 rows in new sstable and 0 empty (tombstoned) rows >> >> > dropped >> >> > INFO [CompactionExecutor:1] 2011-06-17 00:46:24,756 >> >> > CompactionManager.java >> >> > (line 511) Scrubbing >> >> > >> >> > >> >> > SSTableReader(path='/var/opt/cassandra/data/FightMyMonster/UserMonsters-f-7446-Data.db') >> >> > INFO [CompactionExecutor:1] 2011-06-17 00:46:26,115 >> >> > CompactionManager.java >> >> > (line 652) Scrub of >> >> > >> >> > >> >> > SSTableReader(path='/var/opt/cassandra/data/FightMyMonster/UserMonsters-f-7446-Data.db') >> >> > complete: 493 rows in new sstable and 0 empty (tombstoned) rows >> >> > dropped >> >> > INFO [CompactionExecutor:1] 2011-06-17 00:46:26,115 >> >> > CompactionManager.java >> >> > (line 511) Scrubbing >> >> > >> >> > >> >> > SSTableReader(path='/var/opt/cassandra/data/FightMyMonster/UserMonsters-f-7018-Data.db') >> >> > INFO [CompactionExecutor:1] 2011-06-17 00:46:26,225 >> >> > CompactionManager.java >> >> > (line 652) Scrub of >> >> > >> >> > >> >> > SSTableReader(path='/var/opt/cassandra/data/FightMyMonster/UserMonsters-f-7018-Data.db') >> >> > complete: 275 rows in new sstable and 0 empty (tombstoned) rows >> >> > dropped >> >> > INFO [CompactionExecutor:1] 2011-06-17 00:46:26,226 >> >> > CompactionManager.java >> >> > (line 511) Scrubbing >> >> > >> >> > >> >> > SSTableReader(path='/var/opt/cassandra/data/FightMyMonster/UserMonsters-f-7580-Data.db') >> >> > INFO [CompactionExecutor:1] 2011-06-17 00:46:28,383 >> >> > CompactionManager.java >> >> > (line 652) Scrub of >> >> > >> >> > >> >> > SSTableReader(path='/var/opt/cassandra/data/FightMyMonster/UserMonsters-f-7580-Data.db') >> >> > complete: 297 rows in new sstable and 0 empty (tombstoned) rows >> >> > dropped >> >> > INFO [CompactionExecutor:1] 2011-06-17 00:46:28,384 >> >> > CompactionManager.java >> >> > (line 511) Scrubbing >> >> > >> >> > >> >> > SSTableReader(path='/var/opt/cassandra/data/FightMyMonster/UserMonsters-f-7574-Data.db') >> >> > INFO [CompactionExecutor:1] 2011-06-17 00:46:29,300 >> >> > CompactionManager.java >> >> > (line 652) Scrub of >> >> > >> >> > >> >> > SSTableReader(path='/var/opt/cassandra/data/FightMyMonster/UserMonsters-f-7574-Data.db') >> >> > complete: 347 rows in new sstable and 0 empty (tombstoned) rows >> >> > dropped >> >> > INFO [CompactionExecutor:1] 2011-06-17 00:46:29,300 >> >> > CompactionManager.java >> >> > (line 511) Scrubbing >> >> > >> >> > >> >> > SSTableReader(path='/var/opt/cassandra/data/FightMyMonster/UserMonsters-f-7010-Data.db') >> >> > ERROR [CompactionExecutor:1] 2011-06-17 00:46:29,374 >> >> > AbstractCassandraDaemon.java (line 114) Fatal exception in thread >> >> > Thread[CompactionExecutor:1,1,main] >> >> > java.io.FileNotFoundException: >> >> > >> >> > /var/opt/cassandra/data/FightMyMonster/UserMonsters-tmp-f-7823-Data.db >> >> > (Too >> >> > many open files) >> >> > at java.io.RandomAccessFile.open(Native Method) >> >> > at java.io.RandomAccessFile.<init>(RandomAccessFile.java:212) >> >> > at >> >> > >> >> > >> >> > org.apache.cassandra.io.util.BufferedRandomAccessFile.<init>(BufferedRandomAccessFile.java:113) >> >> > at >> >> > >> >> > >> >> > org.apache.cassandra.io.sstable.SSTableWriter.<init>(SSTableWriter.java:78) >> >> > at >> >> > >> >> > >> >> > org.apache.cassandra.db.ColumnFamilyStore.createCompactionWriter(ColumnFamilyStore.java:2243) >> >> > at >> >> > >> >> > >> >> > org.apache.cassandra.db.CompactionManager.maybeCreateWriter(CompactionManager.java:794) >> >> > at >> >> > >> >> > >> >> > org.apache.cassandra.db.CompactionManager.doScrub(CompactionManager.java:534) >> >> > at >> >> > >> >> > >> >> > org.apache.cassandra.db.CompactionManager.access$600(CompactionManager.java:56) >> >> > at >> >> > >> >> > >> >> > org.apache.cassandra.db.CompactionManager$3.call(CompactionManager.java:195) >> >> > at >> >> > java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) >> >> > at java.util.concurrent.FutureTask.run(FutureTask.java:138) >> >> > at >> >> > >> >> > >> >> > java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) >> >> > at >> >> > >> >> > >> >> > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) >> >> > at java.lang.Thread.run(Thread.java:619) >> >> > Dominic >> > >> > > >