[ 
https://issues.apache.org/jira/browse/CASSANDRA-7275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14253271#comment-14253271
 ] 

Benedict commented on CASSANDRA-7275:
-------------------------------------

I should clarify, since it sounds like we are not too far in disagreement on 
this point: I'm suggesting only that the failure is reported to the flush call 
site, not so the callsite can specialised on the kind of exception, but so that 
if this specific callsite can safely cope with _any_ failure, it can be 
specialised to do so, with remedial action if necessary. A whitelist would be a 
subset of this approach, and hence simpler - but only if it's genuinely safe to 
just drop the problem on the floor; I'm not sufficiently familiar with these 
system tables to say for sure, but I do recollect problems safely starting a 
node when compactions_in_progress was not properly maintained, so I expect 
_some_ remedial action will probably be necessary, perhaps on a case-by-case 
basis.

> Errors in FlushRunnable may leave threads hung
> ----------------------------------------------
>
>                 Key: CASSANDRA-7275
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7275
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Tyler Hobbs
>            Assignee: Pavel Yaskevich
>            Priority: Minor
>             Fix For: 2.0.12
>
>         Attachments: 0001-Move-latch.countDown-into-finally-block.patch, 
> 7252-2.0-v2.txt, CASSANDRA-7275-flush-info.patch
>
>
> In Memtable.FlushRunnable, the CountDownLatch will never be counted down if 
> there are errors, which results in hanging any threads that are waiting for 
> the flush to complete.  For example, an error like this causes the problem:
> {noformat}
> ERROR [FlushWriter:474] 2014-05-20 12:10:31,137 CassandraDaemon.java (line 
> 198) Exception in thread Thread[FlushWriter:474,5,main]
> java.lang.IllegalArgumentException
>     at java.nio.Buffer.position(Unknown Source)
>     at 
> org.apache.cassandra.db.marshal.AbstractCompositeType.getBytes(AbstractCompositeType.java:64)
>     at 
> org.apache.cassandra.db.marshal.AbstractCompositeType.getWithShortLength(AbstractCompositeType.java:72)
>     at 
> org.apache.cassandra.db.marshal.AbstractCompositeType.split(AbstractCompositeType.java:138)
>     at 
> org.apache.cassandra.io.sstable.ColumnNameHelper.minComponents(ColumnNameHelper.java:103)
>     at 
> org.apache.cassandra.db.ColumnFamily.getColumnStats(ColumnFamily.java:439)
>     at 
> org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:194)
>     at 
> org.apache.cassandra.db.Memtable$FlushRunnable.writeSortedContents(Memtable.java:397)
>     at 
> org.apache.cassandra.db.Memtable$FlushRunnable.runWith(Memtable.java:350)
>     at 
> org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
>     at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
>     at java.lang.Thread.run(Unknown Source)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to