[jira] [Created] (CASSANDRA-5384) SSTables are evicted from the page cache during compaction even if populate_io_cache_on_flush is true
Jouni Hartikainen created CASSANDRA-5384:

Summary: SSTables are evicted from the page cache during compaction even if populate_io_cache_on_flush is true
Key: CASSANDRA-5384
URL: https://issues.apache.org/jira/browse/CASSANDRA-5384
Project: Cassandra
Issue Type: Improvement
Affects Versions: 1.2.3
Reporter: Jouni Hartikainen
Priority: Minor

AbstractCompactionStrategy acquires direct scanners on the SSTables to be compacted. These scanners are always created with skipIOCache set to true. Because of this, compactions, even for CFs that have populate_io_cache_on_flush set to true, evict the source SSTables from the page cache after 128MB (CACHE_FLUSH_INTERVAL_IN_BYTES in RandomAccessReader) have been read from them. This leads to disk reads even when the dataset fits completely into memory, and it unnecessarily limits compaction throughput on nodes that have lots of RAM.

Maybe the compaction strategy should try to avoid skipping the IO cache if the CF has populate_io_cache_on_flush set to true?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
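The suggestion at the end of CASSANDRA-5384 can be illustrated with a minimal sketch. It is not Cassandra code: the class IoCachePolicySketch, the TableOptions holder, and the method skipIoCacheForCompaction are hypothetical names introduced only for this example. It assumes the compaction path could consult the CF's populate_io_cache_on_flush setting before choosing the skipIOCache flag for its scanners.

```java
/** Hypothetical sketch; none of these types exist in Cassandra under these names. */
final class IoCachePolicySketch {

    /** Illustrative stand-in for the one per-CF option discussed in the report. */
    static final class TableOptions {
        final boolean populateIoCacheOnFlush;

        TableOptions(boolean populateIoCacheOnFlush) {
            this.populateIoCacheOnFlush = populateIoCacheOnFlush;
        }
    }

    /**
     * Decides whether a compaction scanner should drop the pages it has read
     * from the page cache. The report says this flag is currently always true;
     * the sketch derives it from the CF setting instead.
     */
    static boolean skipIoCacheForCompaction(TableOptions options) {
        return !options.populateIoCacheOnFlush;
    }

    public static void main(String[] args) {
        System.out.println("populate=true  -> skipIOCache="
                + skipIoCacheForCompaction(new TableOptions(true)));
        System.out.println("populate=false -> skipIOCache="
                + skipIoCacheForCompaction(new TableOptions(false)));
    }
}
```

With this rule, a CF that asks for its flushed data to stay cached would also keep its compaction inputs cached, while other CFs keep the current behaviour.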
[jira] [Created] (CASSANDRA-5244) Compactions don't work while node is bootstrapping
Jouni Hartikainen created CASSANDRA-5244:

Summary: Compactions don't work while node is bootstrapping
Key: CASSANDRA-5244
URL: https://issues.apache.org/jira/browse/CASSANDRA-5244
Project: Cassandra
Issue Type: Bug
Affects Versions: 1.2.1
Reporter: Jouni Hartikainen
Priority: Critical

It seems that there is a race condition in StorageService that prevents compactions from completing while a node is in the bootstrap state. I have been able to reproduce this multiple times by throttling streaming throughput to extend the bootstrap time while simultaneously inserting data into the cluster.

The problem lies in the synchronization of the initServer(int delay) and reportSeverity(double incr) methods, as both try to acquire the instance lock of StorageService through the synchronized keyword. As initServer does not return until the bootstrap has completed, all calls to reportSeverity block until then. However, reportSeverity is called when starting compactions in CompactionInfo, and thus all compactions block until the bootstrap completes.

This might severely degrade the node's performance after bootstrap, as it might have lots of compactions pending while simultaneously starting to serve reads.

I have been able to solve the issue by adding a separate lock for reportSeverity and removing the synchronized keyword from the method. This of course is not a valid approach if we must assume that any of Gossiper's IEndpointStateChangeSubscribers could potentially end up calling back into StorageService's synchronized methods. However, at least at the moment, that does not seem to be the case.

Maybe somebody with more experience of the codebase can come up with a better solution?

(This might affect DynamicEndpointSnitch as well, as it also calls reportSeverity in its setSeverity method.)
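The workaround described in CASSANDRA-5244 can be sketched in isolation. This is a simplified model and not the StorageService source: the class SeverityLockSketch, its latches, and the helper reportCompletesDuringBootstrap are assumptions made for the example. Only the method names initServer and reportSeverity come from the report. The sketch keeps initServer synchronized on the instance, gives reportSeverity its own lock, and checks that a reporting thread is no longer stuck behind a bootstrap that is still in progress.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

/** Simplified model of the locking issue; not the actual StorageService code. */
final class SeverityLockSketch {
    // Dedicated lock, so reportSeverity no longer competes for the instance monitor.
    private final Object severityLock = new Object();
    private double severity;

    // Latches simulate a long-running bootstrap inside initServer.
    final CountDownLatch bootstrapStarted = new CountDownLatch(1);
    final CountDownLatch bootstrapMayFinish = new CountDownLatch(1);

    /** Holds the instance monitor for the whole (simulated) bootstrap. */
    synchronized void initServer() throws InterruptedException {
        bootstrapStarted.countDown();
        bootstrapMayFinish.await();
    }

    /** Not synchronized on the instance; guarded by severityLock only. */
    void reportSeverity(double incr) {
        synchronized (severityLock) {
            severity += incr;
        }
    }

    double severity() {
        synchronized (severityLock) {
            return severity;
        }
    }

    /** Returns true if reportSeverity completes while initServer still holds the instance lock. */
    static boolean reportCompletesDuringBootstrap() throws InterruptedException {
        SeverityLockSketch service = new SeverityLockSketch();
        Thread bootstrap = new Thread(() -> {
            try {
                service.initServer();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        bootstrap.start();
        service.bootstrapStarted.await(); // initServer now owns the instance monitor

        CountDownLatch reported = new CountDownLatch(1);
        Thread compaction = new Thread(() -> {
            service.reportSeverity(1.0);
            reported.countDown();
        });
        compaction.start();
        boolean completed = reported.await(2, TimeUnit.SECONDS);

        service.bootstrapMayFinish.countDown(); // let the simulated bootstrap end
        bootstrap.join();
        compaction.join();
        return completed;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("reportSeverity completed during bootstrap: "
                + reportCompletesDuringBootstrap());
    }
}
```

If reportSeverity were declared synchronized instead, the compaction thread in this model would wait until initServer returned, which is the behaviour the report describes. The caveat from the report still applies: a separate lock is only safe as long as nothing reached from reportSeverity calls back into methods synchronized on the same instance.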
[jira] [Commented] (CASSANDRA-4784) Create separate sstables for each token range handled by a node
[ https://issues.apache.org/jira/browse/CASSANDRA-4784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13575153#comment-13575153 ]

Jouni Hartikainen commented on CASSANDRA-4784:

I'm not really sure I understood this correctly, but wouldn't this change lead to memtable flushes creating much more random I/O than before? Especially when using vnodes, wouldn't the incoming data be spread across num_tokens files per CF instead of one per CF?

Wouldn't this affect compactions as well? E.g. for the default size-tiered strategy, instead of compacting 4 larger SSTables into one even larger SSTable per CF, we would be compacting num_tokens * 4 smaller files into num_tokens larger ones per CF.

Am I missing something here?

> Create separate sstables for each token range handled by a node
>
> Key: CASSANDRA-4784
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4784
> Project: Cassandra
> Issue Type: Improvement
> Components: Core
> Affects Versions: 1.2.0 beta 1
> Reporter: sankalp kohli
> Assignee: Benjamin Coverston
> Priority: Minor
> Labels: perfomance
> Fix For: 2.0
>
> Attachments: 4784.patch
>
> Currently, each sstable has data for all the ranges that node is handling. If we change that and rather have separate sstables for each range that node is handling, it can lead to some improvements.
>
> Improvements
> 1) Node rebuild will be very fast as sstables can be directly copied over to the bootstrapping node. It will minimize any application level logic. We can directly use Linux native methods to transfer sstables without using CPU and putting less pressure on the serving node. I think in theory it will be the fastest way to transfer data.
> 2) Backup can transfer only the sstables of a node that belong to its primary key range.
> 3) ETL process can copy only one replica of data and will be much faster.
>
> Changes:
> We can split the writes into multiple memtables for each range it is handling. The sstables being flushed from these can have details of which range of data it is handling.
> There will be no change I think for any reads as they work with interleaved data anyway. But maybe we can improve there as well?
>
> Complexities:
> The change does not look very complicated. I am not taking into account how it will work when ranges are being changed for nodes.
> Vnodes might make this work more complicated. We can also have a bit on each sstable which says whether it is primary data or not.
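The file-count concern raised in the comment on CASSANDRA-4784 can be made concrete with a little arithmetic. The sketch below is illustrative only: the class SSTableCountSketch and its methods are hypothetical, the value 256 for num_tokens is just an example, and it assumes the worst case in which every token range received writes before a flush.

```java
/** Hypothetical arithmetic sketch for the per-range sstable proposal. */
final class SSTableCountSketch {

    /** Files written by one memtable flush per CF if every range has data. */
    static long filesPerFlush(int numTokens) {
        return numTokens;
    }

    /** Input files touched when each range compacts sstablesPerCompaction files. */
    static long compactionInputs(int numTokens, int sstablesPerCompaction) {
        return (long) numTokens * sstablesPerCompaction;
    }

    /** Output files produced when each range compacts into a single file. */
    static long compactionOutputs(int numTokens) {
        return numTokens;
    }

    public static void main(String[] args) {
        int numTokens = 256;          // example value only
        int sstablesPerCompaction = 4; // the figure used in the comment

        System.out.println("today:     1 file per flush, "
                + sstablesPerCompaction + " inputs -> 1 output per CF");
        System.out.println("per range: " + filesPerFlush(numTokens) + " files per flush, "
                + compactionInputs(numTokens, sstablesPerCompaction) + " inputs -> "
                + compactionOutputs(numTokens) + " outputs per CF");
    }
}
```

Under these assumptions the amount of data compacted stays the same, but it is spread over many more, smaller files, which is the source of the extra random I/O the comment asks about.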
[jira] [Updated] (CASSANDRA-5097) cassandra-shuffle fails as system keyspace is not user-modifiable
[ https://issues.apache.org/jira/browse/CASSANDRA-5097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jouni Hartikainen updated CASSANDRA-5097:

Attachment: CASSANDRA-5097.patch

> cassandra-shuffle fails as system keyspace is not user-modifiable
>
> Key: CASSANDRA-5097
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5097
> Project: Cassandra
> Issue Type: Bug
> Components: Core, Tools
> Affects Versions: 1.2.0 rc2, 1.2.0
> Reporter: Jouni Hartikainen
> Assignee: Aleksey Yeschenko
> Fix For: 1.2.1
>
> Attachments: CASSANDRA-5097.patch
>
> cassandra-shuffle tool fails to insert calculated relocations into the system keyspace as it is not user-modifiable. When run, the following exception is thrown after printing out the list of relocations for the first node in ring:
>
> Exception in thread "main" java.lang.RuntimeException: InvalidRequestException(why:system keyspace is not user-modifiable.)
>     at org.apache.cassandra.tools.Shuffle.executeCqlQuery(Shuffle.java:516)
>     at org.apache.cassandra.tools.Shuffle.shuffle(Shuffle.java:359)
>     at org.apache.cassandra.tools.Shuffle.main(Shuffle.java:678)
> Caused by: InvalidRequestException(why:system keyspace is not user-modifiable.)
>     at org.apache.cassandra.thrift.Cassandra$execute_cql3_query_result.read(Cassandra.java:37849)
>     at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78)
>     at org.apache.cassandra.thrift.Cassandra$Client.recv_execute_cql3_query(Cassandra.java:1562)
>     at org.apache.cassandra.thrift.Cassandra$Client.execute_cql3_query(Cassandra.java:1547)
>     at org.apache.cassandra.tools.CassandraClient.execute_cql_query(Shuffle.java:733)
>     at org.apache.cassandra.tools.Shuffle.executeCqlQuery(Shuffle.java:502)
>     ... 2 more
>
> By quickly checking the code it seems that the patch set for CASSANDRA-4874 disallows modifications to system keyspace again (they were previously allowed by CASSANDRA-4664) thus rendering cassandra-shuffle unable to do its job.
[jira] [Created] (CASSANDRA-5097) cassandra-shuffle fails as system keyspace is not user-modifiable
Jouni Hartikainen created CASSANDRA-5097:

Summary: cassandra-shuffle fails as system keyspace is not user-modifiable
Key: CASSANDRA-5097
URL: https://issues.apache.org/jira/browse/CASSANDRA-5097
Project: Cassandra
Issue Type: Bug
Affects Versions: 1.2.0 rc2
Reporter: Jouni Hartikainen

cassandra-shuffle tool fails to insert calculated relocations into the system keyspace as it is not user-modifiable. When run, the following exception is thrown after printing out the list of relocations for the first node in ring:

Exception in thread "main" java.lang.RuntimeException: InvalidRequestException(why:system keyspace is not user-modifiable.)
    at org.apache.cassandra.tools.Shuffle.executeCqlQuery(Shuffle.java:516)
    at org.apache.cassandra.tools.Shuffle.shuffle(Shuffle.java:359)
    at org.apache.cassandra.tools.Shuffle.main(Shuffle.java:678)
Caused by: InvalidRequestException(why:system keyspace is not user-modifiable.)
    at org.apache.cassandra.thrift.Cassandra$execute_cql3_query_result.read(Cassandra.java:37849)
    at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78)
    at org.apache.cassandra.thrift.Cassandra$Client.recv_execute_cql3_query(Cassandra.java:1562)
    at org.apache.cassandra.thrift.Cassandra$Client.execute_cql3_query(Cassandra.java:1547)
    at org.apache.cassandra.tools.CassandraClient.execute_cql_query(Shuffle.java:733)
    at org.apache.cassandra.tools.Shuffle.executeCqlQuery(Shuffle.java:502)
    ... 2 more

By quickly checking the code it seems that the patch set for CASSANDRA-4874 disallows modifications to system keyspace again (they were previously allowed by CASSANDRA-4664) thus rendering cassandra-shuffle unable to do its job.
[jira] [Updated] (CASSANDRA-5097) cassandra-shuffle fails as system keyspace is not user-modifiable
[ https://issues.apache.org/jira/browse/CASSANDRA-5097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jouni Hartikainen updated CASSANDRA-5097:

Affects Version/s: 1.2.0

> cassandra-shuffle fails as system keyspace is not user-modifiable
>
> Key: CASSANDRA-5097
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5097
> Project: Cassandra
> Issue Type: Bug
> Affects Versions: 1.2.0 rc2, 1.2.0
> Reporter: Jouni Hartikainen
>
> cassandra-shuffle tool fails to insert calculated relocations into the system keyspace as it is not user-modifiable. When run, the following exception is thrown after printing out the list of relocations for the first node in ring:
>
> Exception in thread "main" java.lang.RuntimeException: InvalidRequestException(why:system keyspace is not user-modifiable.)
>     at org.apache.cassandra.tools.Shuffle.executeCqlQuery(Shuffle.java:516)
>     at org.apache.cassandra.tools.Shuffle.shuffle(Shuffle.java:359)
>     at org.apache.cassandra.tools.Shuffle.main(Shuffle.java:678)
> Caused by: InvalidRequestException(why:system keyspace is not user-modifiable.)
>     at org.apache.cassandra.thrift.Cassandra$execute_cql3_query_result.read(Cassandra.java:37849)
>     at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78)
>     at org.apache.cassandra.thrift.Cassandra$Client.recv_execute_cql3_query(Cassandra.java:1562)
>     at org.apache.cassandra.thrift.Cassandra$Client.execute_cql3_query(Cassandra.java:1547)
>     at org.apache.cassandra.tools.CassandraClient.execute_cql_query(Shuffle.java:733)
>     at org.apache.cassandra.tools.Shuffle.executeCqlQuery(Shuffle.java:502)
>     ... 2 more
>
> By quickly checking the code it seems that the patch set for CASSANDRA-4874 disallows modifications to system keyspace again (they were previously allowed by CASSANDRA-4664) thus rendering cassandra-shuffle unable to do its job.