[ https://issues.apache.org/jira/browse/CASSANDRA-5244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Brandon Williams updated CASSANDRA-5244: ---------------------------------------- Reviewer: vijay2...@yahoo.com > Compactions don't work while node is bootstrapping > -------------------------------------------------- > > Key: CASSANDRA-5244 > URL: https://issues.apache.org/jira/browse/CASSANDRA-5244 > Project: Cassandra > Issue Type: Bug > Components: Core > Affects Versions: 1.2.0 beta 1 > Reporter: Jouni Hartikainen > Assignee: Brandon Williams > Priority: Critical > Labels: gossip > Fix For: 1.2.2 > > Attachments: 5244.txt > > > It seems that there is a race condition in StorageService that prevents > compactions from completing while node is in a bootstrap state. > I have been able to reproduce this multiple times by throttling streaming > throughput to extend the bootstrap time while simultaneously inserting data > to the cluster. > The problems lies in the synchronization of initServer(int delay) and > reportSeverity(double incr) methods as they both try to acquire the instance > lock of StorageService through the use of synchronized keyword. As initServer > does not return until the bootstrap has completed, all calls to > reportSeverity will block until that. However, reportSeverity is called when > starting compactions in CompactionInfo and thus all compactions block until > bootstrap completes. > This might severely degrade node's performance after bootstrap as it might > have lots of compactions pending while simultaneously starting to serve reads. > I have been able to solve the issue by adding a separate lock for > reportSeverity and removing its class level synchronization. This of course > is not a valid approach if we must assume that any of Gossiper's > IEndpointStateChangeSubscribers could potentially end up calling back to > StorageService's synchronized methods. However, at least at the moment, that > does not seem to be the case. > Maybe somebody with more experience about the codebase comes up with a better > solution? > (This might affect DynamicEndpointSnitch as well, as it also calls to > reportSeverity in its setSeverity method) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira