[jira] [Created] (CASSANDRA-3006) Enormous counter
Enormous counter
----------------

Key: CASSANDRA-3006
URL: https://issues.apache.org/jira/browse/CASSANDRA-3006
Project: Cassandra
Issue Type: Bug
Affects Versions: 0.8.3
Environment: ubuntu 10.04
Reporter: Boris Yen

I have a two-node cluster with the following keyspace and column family settings.

Cluster Information:
   Snitch: org.apache.cassandra.locator.SimpleSnitch
   Partitioner: org.apache.cassandra.dht.RandomPartitioner
   Schema versions:
      63fda700-c243-11e0--2d03dcafebdf: [172.17.19.151, 172.17.19.152]

Keyspace: test:
  Replication Strategy: org.apache.cassandra.locator.NetworkTopologyStrategy
  Durable Writes: true
  Options: [datacenter1:2]
  Column Families:
    ColumnFamily: testCounter (Super)
      "APP status information."
      Key Validation Class: org.apache.cassandra.db.marshal.BytesType
      Default column value validator: org.apache.cassandra.db.marshal.CounterColumnType
      Columns sorted by: org.apache.cassandra.db.marshal.BytesType/org.apache.cassandra.db.marshal.BytesType
      Row cache size / save period in seconds: 0.0/0
      Key cache size / save period in seconds: 20.0/14400
      Memtable thresholds: 1.1578125/1440/247 (millions of ops/MB/minutes)
      GC grace seconds: 864000
      Compaction min/max thresholds: 4/32
      Read repair chance: 1.0
      Replicate on write: true
      Built indexes: []

Then, I use a test program based on hector to add a counter column (testCounter[sc][column]) 1000 times. In the middle of the adding process, I intentionally shut down the node 172.17.19.152. In addition, the test program is smart enough to switch the consistency level from Quorum to One, so that the subsequent add operations do not fail. After all the adds are done, I start cassandra on 172.17.19.152 and use cassandra-cli to check whether the counter is correct on both nodes; I get a result of 1001, which seems reasonable because hector retries once. However, I then shut down 172.17.19.151, wait until 172.17.19.152 is aware that 172.17.19.151 is down, and start cassandra on 172.17.19.151 again.
Then, I check the counter again; this time I get a result of 481387, which is very wrong. I used 0.8.3 to reproduce this bug, but I think it also happens on 0.8.2 and earlier. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
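The 1001 result above illustrates a general property of counters, sketched below in plain Python (hypothetical names, not Cassandra code): counter increments are not idempotent, so when a write times out but was actually applied, a client-side retry counts it twice.

```python
# Why retrying a counter add after an uncertain failure over-counts:
# the increment applied on the replica, the ack was lost, and the
# client (like hector here) retries the same non-idempotent write.

class CounterReplica:
    """A single replica holding one counter value."""
    def __init__(self):
        self.value = 0

    def add(self, delta):
        self.value += delta

def add_with_retry(replica, delta, timed_out):
    replica.add(delta)        # the write is applied on the replica...
    if timed_out:             # ...but the ack never reaches the client,
        replica.add(delta)    # so the client retries and it applies again

replica = CounterReplica()
for i in range(1000):
    # one add times out mid-run, e.g. while a node is being shut down
    add_with_retry(replica, 1, timed_out=(i == 500))

print(replica.value)  # 1001, the "reasonable" over-count reported above
```

A retry-induced double count explains 1001; the jump to 481387 after the restart sequence points at a different, server-side replication problem.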
[jira] [Commented] (CASSANDRA-3006) Enormous counter
[ https://issues.apache.org/jira/browse/CASSANDRA-3006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081545#comment-13081545 ] Boris Yen commented on CASSANDRA-3006:

I forgot to mention that the counter is out of sync between these two nodes: one shows 481387 and the other shows 20706.
[jira] [Updated] (CASSANDRA-3006) Enormous counter
[ https://issues.apache.org/jira/browse/CASSANDRA-3006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Boris Yen updated CASSANDRA-3006: edited the description (a minor wording fix; the content is otherwise identical to the report above).
[jira] [Updated] (CASSANDRA-2843) better performance on long row read
[ https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sylvain Lebresne updated CASSANDRA-2843: Attachment: 2843_h.patch

bq. the IColumnMap name when it does not implement Map interface, and some things it has in common with Map (iteration) it changes semantics of (iterating values instead of keys). not sure what to use instead though, since we already have an IColumnContainer. Maybe ISortedColumns?

Yeah, I'm not sure I have a better name either. Maybe ISortedColumnHolder, but I'm not sure it's better than ISortedColumns, so the attached rebased patch simply renames ColumnMap -> SortedColumns.

bq. TSCM and ALCM extending instead of wrapping CSLM/AL, respectively

The idea was to save one object creation. I admit this is probably not a huge deal, but it felt that in this case it was no big deal to extend instead of wrapping either, so it felt worth "optimizing". I still stand by that choice, but I have no good argument against the criticism that it is possibly premature.

bq. unrelated reformatting

If we're talking about the ones in SuperColumn.java, sorry, I mistakenly forced re-indentation on the file, which rewrote the tabs to spaces. The new patch keeps the old formatting. I'd mention that there are also a few places where I've rewritten cf.getSortedColumns().iterator() to cf.iterator(), which is arguably a bit gratuitous for this patch, but I figured this avoids creating a new Collection in the case of CSLM and there aren't so many occurrences.
> better performance on long row read
> -----------------------------------
>
> Key: CASSANDRA-2843
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2843
> Project: Cassandra
> Issue Type: New Feature
> Reporter: Yang Yang
> Fix For: 1.0
>
> Attachments: 2843.patch, 2843_d.patch, 2843_g.patch, 2843_h.patch, fix.diff, microBenchmark.patch, patch_timing, std_timing
>
> currently if a row contains > 1000 columns, the run time becomes considerably slow (my test of a row with 30 00 columns (standard, regular), each with 8 bytes in name and 40 bytes in value, is about 16ms). this is all running in memory, no disk read is involved.
> through debugging we can find most of this time is spent on:
> [Wall Time] org.apache.cassandra.db.Table.getRow(QueryFilter)
> [Wall Time] org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter, ColumnFamily)
> [Wall Time] org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter, int, ColumnFamily)
> [Wall Time] org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(QueryFilter, int, ColumnFamily)
> [Wall Time] org.apache.cassandra.db.filter.QueryFilter.collectCollatedColumns(ColumnFamily, Iterator, int)
> [Wall Time] org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(IColumnContainer, Iterator, int)
> [Wall Time] org.apache.cassandra.db.ColumnFamily.addColumn(IColumn)
>
> ColumnFamily.addColumn() is slow because it inserts into an internal ConcurrentSkipListMap that maps column names to values. this structure is slow for two reasons: it needs to do synchronization, and it needs to maintain a more complex map structure.
> but if we look at the whole read path, thrift already defines the read output to be a List, so it does not make sense to use a luxury map data structure in the interim and finally convert it to a list. on the synchronization side, since the returned CF is never going to be shared/modified by other threads, we know the access is always single-threaded, so no synchronization is needed.
> but these 2 features are indeed needed for ColumnFamily in other cases, particularly write. so we can provide a different ColumnFamily to CFS.getTopLevelColumnFamily(), so getTopLevelColumnFamily no longer always creates the standard ColumnFamily, but takes a provided returnCF, whose cost is much cheaper.
> the provided patch is for demonstration now; will work on it further once we agree on the general direction.
> CFS, ColumnFamily, and Table are changed; a new FastColumnFamily is provided. the main work is to let the FastColumnFamily use an array for internal storage. at first I used binary search to insert new columns in addColumn(), but later I found that even this is not necessary, since all calling scenarios of ColumnFamily.addColumn() have an invariant that the inserted columns come in sorted order (I still have an issue to resolve, descending or ascending, but ascending works). so the current logic is simply to compare the new column against the end column in the array: if the names are not equal, append; if equal, reconcile.
> slight temporary hacks are made on getTopLevelColumnFamily so we have 2 flavors of the method, one accepting a returnCF. but we could definitely
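The append-or-reconcile idea quoted above can be sketched as follows (a minimal illustration with hypothetical names, not the actual FastColumnFamily patch): because addColumn() is always called with column names in sorted order, inserting into a plain array only requires comparing against the current last element, with no binary search, skip list, or synchronization.

```python
# Sorted-append insertion: columns is a list of (name, value, timestamp)
# tuples kept in ascending name order, relying on the invariant that
# incoming columns arrive sorted by name.

def reconcile(a, b):
    # keep the column with the newer timestamp
    return a if a[2] >= b[2] else b

def add_column(columns, col):
    if columns and columns[-1][0] == col[0]:
        columns[-1] = reconcile(columns[-1], col)  # same name: reconcile
    else:
        columns.append(col)                        # new name: O(1) append

cols = []
for c in [("a", 1, 10), ("b", 2, 10), ("b", 3, 20), ("c", 4, 10)]:
    add_column(cols, c)
print(cols)  # [('a', 1, 10), ('b', 3, 20), ('c', 4, 10)]
```

The design choice mirrors the ticket: the read path builds a throwaway, single-threaded result, so the concurrent map's ordering and thread-safety guarantees are pure overhead there.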
[jira] [Created] (CASSANDRA-3007) NullPointerException in MessagingService.java:420
NullPointerException in MessagingService.java:420
-------------------------------------------------

Key: CASSANDRA-3007
URL: https://issues.apache.org/jira/browse/CASSANDRA-3007
Project: Cassandra
Issue Type: Bug
Components: Core
Affects Versions: 0.8.3
Environment: Linux w0 2.6.35-24-virtual #42-Ubuntu SMP Thu Dec 2 05:15:26 UTC 2010 x86_64 GNU/Linux
  java version "1.6.0_18"
  OpenJDK Runtime Environment (IcedTea6 1.8.7) (6b18-1.8.7-2~squeeze1)
  OpenJDK 64-Bit Server VM (build 14.0-b16, mixed mode)
Reporter: Viliam Holub
Priority: Minor

I'm getting a large quantity of exceptions during streaming. It is always at MessagingService.java:420, and streaming appears to be blocked.

 INFO 10:11:14,734 Streaming to /10.235.77.27
ERROR 10:11:14,734 Fatal exception in thread Thread[StreamStage:2,5,main]
java.lang.NullPointerException
        at org.apache.cassandra.net.MessagingService.stream(MessagingService.java:420)
        at org.apache.cassandra.streaming.StreamOutSession.begin(StreamOutSession.java:176)
        at org.apache.cassandra.streaming.StreamOut.transferRangesForRequest(StreamOut.java:148)
        at org.apache.cassandra.streaming.StreamRequestVerbHandler.doVerb(StreamRequestVerbHandler.java:54)
        at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:59)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
        at java.lang.Thread.run(Thread.java:636)
[jira] [Updated] (CASSANDRA-1717) Cassandra cannot detect corrupt-but-readable column data
[ https://issues.apache.org/jira/browse/CASSANDRA-1717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pavel Yaskevich updated CASSANDRA-1717: Attachment: CASSANDRA-1717-v2.patch

bq. CSW.flushData() forgot to reset the checksum (this is caught by the unit tests btw).

Not a problem, since it was due to Sylvain's bad merge.

bq. We should convert the CRC32 to an int (and only write that) as it is an int internally (getValue() returns a long only because CRC32 implements the Checksum interface, which requires that).

Let's leave that to the ticket for CRC optimization, which will allow us to modify that system-wide.

bq. Here we checksum the compressed data. The other approach would be to checksum the uncompressed data. The advantage of checksumming compressed data is the speed (less data to checksum), but checksumming the uncompressed data would be a little bit safer. In particular, it would prevent us from messing up in the decompression (and we don't have to trust the compression algorithm; not that I don't trust Snappy, but...). This is clearly a trade-off that we have to make, but I admit that my personal preference would lean towards safety (in particular, I know that checksumming the uncompressed data gives a bit more safety; I don't know what our exact gain is quantitatively with checksumming compressed data). On the other side, checksumming the uncompressed data would likely mean that a good part of the bitrot would result in a decompression error rather than a checksum error, which is maybe less convenient from the implementation point of view. So I don't know, I guess I'm thinking aloud to have others' opinions more than anything else.

Checksum is moved to the original data.

bq. Let's add some unit tests. At least it's relatively easy to write a few blocks, switch one bit in the resulting file, and check this is caught at read time (or better, do that multiple times, changing a different bit each time).

Test was added to CompressedRandomAccessReaderTest.

bq. As Todd noted, HADOOP-6148 contains a bunch of discussions on the efficiency of java CRC32. In particular, it seems they have been able to close to double the speed of the CRC32, with a solution that seems fairly simple to me. It would be ok to use java native CRC32 and leave the improvement to another ticket, but quite frankly if it is that simple and since the hadoop guys have done all the hard work for us, I say we start with the efficient version directly.

As decided previously, this will be a matter of a separate ticket.

Rebased with latest trunk (last commit 1e36fb1e44bff96005dd75a25648ff25eea6a95f)

> Cassandra cannot detect corrupt-but-readable column data
> --------------------------------------------------------
>
> Key: CASSANDRA-1717
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1717
> Project: Cassandra
> Issue Type: New Feature
> Components: Core
> Reporter: Jonathan Ellis
> Assignee: Pavel Yaskevich
> Fix For: 1.0
>
> Attachments: CASSANDRA-1717-v2.patch, CASSANDRA-1717.patch, checksums.txt
>
> Most corruptions of on-disk data due to bitrot render the column (or row) unreadable, so the data can be replaced by read repair or anti-entropy. But if the corruption keeps column data readable, we do not detect it, and if it corrupts to a higher timestamp value it can even resist being overwritten by newer values.
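The layout the discussion settles on, checksumming the uncompressed data, storing the CRC with each compressed chunk, and verifying after decompression, can be sketched like this (a minimal illustration using zlib rather than Snappy and the real chunk format, which are not shown in this thread):

```python
import zlib

def write_chunk(data: bytes) -> bytes:
    # CRC is computed over the *original* data, then appended
    # after the compressed bytes of the chunk.
    crc = zlib.crc32(data) & 0xFFFFFFFF
    return zlib.compress(data) + crc.to_bytes(4, "big")

def read_chunk(chunk: bytes) -> bytes:
    compressed, stored = chunk[:-4], int.from_bytes(chunk[-4:], "big")
    data = zlib.decompress(compressed)             # decompress first...
    if (zlib.crc32(data) & 0xFFFFFFFF) != stored:  # ...then verify the CRC
        raise IOError("checksum mismatch: corrupt chunk")
    return data

chunk = write_chunk(b"some column data")
assert read_chunk(chunk) == b"some column data"

# flip one bit in the stored checksum, as the unit-test suggestion describes
corrupt = chunk[:-1] + bytes([chunk[-1] ^ 0x01])
try:
    read_chunk(corrupt)
except IOError as e:
    print(e)  # checksum mismatch: corrupt chunk
```

Note the trade-off Sylvain raises is visible here: a bit flip in the compressed body may surface as a decompression error before the checksum is ever compared.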
[jira] [Issue Comment Edited] (CASSANDRA-1717) Cassandra cannot detect corrupt-but-readable column data
[ https://issues.apache.org/jira/browse/CASSANDRA-1717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081569#comment-13081569 ] Pavel Yaskevich edited comment on CASSANDRA-1717 at 8/9/11 11:25 AM, marking the "As Todd noted, HADOOP-6148..." paragraph as a quote (bq.); the comment body is otherwise unchanged from the version above.
[jira] [Issue Comment Edited] (CASSANDRA-1717) Cassandra cannot detect corrupt-but-readable column data
[ https://issues.apache.org/jira/browse/CASSANDRA-1717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081569#comment-13081569 ] Pavel Yaskevich edited comment on CASSANDRA-1717 at 8/9/11 11:29 AM, expanding the answer on checksum placement to: "It checksums original (non-compressed) data and stores checksum at the end of the compressed chunk, reader makes a checksum check after decompression." The comment body is otherwise unchanged from the version above.
[jira] [Commented] (CASSANDRA-3007) NullPointerException in MessagingService.java:420
[ https://issues.apache.org/jira/browse/CASSANDRA-3007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081601#comment-13081601 ] Jonathan Ellis commented on CASSANDRA-3007:

What kind of streaming are you attempting?
[jira] [Updated] (CASSANDRA-3007) NullPointerException in MessagingService.java:420
[ https://issues.apache.org/jira/browse/CASSANDRA-3007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-3007: -- Attachment: 3007.txt Never mind, not relevant. Looks like you upgraded from 0.7 without updating your configuration file? Fix for missing encryption_options attached. > NullPointerException in MessagingService.java:420 > - > > Key: CASSANDRA-3007 > URL: https://issues.apache.org/jira/browse/CASSANDRA-3007 > Project: Cassandra > Issue Type: Bug > Components: Core >Affects Versions: 0.8.3 > Environment: Linux w0 2.6.35-24-virtual #42-Ubuntu SMP Thu Dec 2 > 05:15:26 UTC 2010 x86_64 GNU/Linux > java version "1.6.0_18" > OpenJDK Runtime Environment (IcedTea6 1.8.7) (6b18-1.8.7-2~squeeze1) > OpenJDK 64-Bit Server VM (build 14.0-b16, mixed mode) >Reporter: Viliam Holub >Priority: Minor > Labels: nullpointerexception, streaming > Fix For: 0.8.4 > > Attachments: 3007.txt > > > I'm getting large quantity of exceptions during streaming. It is always in > MessagingService.java:420. The streaming appears to be blocked. > INFO 10:11:14,734 Streaming to /10.235.77.27 > ERROR 10:11:14,734 Fatal exception in thread Thread[StreamStage:2,5,main] > java.lang.NullPointerException > at > org.apache.cassandra.net.MessagingService.stream(MessagingService.java:420) > at > org.apache.cassandra.streaming.StreamOutSession.begin(StreamOutSession.java:176) > at > org.apache.cassandra.streaming.StreamOut.transferRangesForRequest(StreamOut.java:148) > at > org.apache.cassandra.streaming.StreamRequestVerbHandler.doVerb(StreamRequestVerbHandler.java:54) > at > org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:59) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603) > at java.lang.Thread.run(Thread.java:636) -- This message is automatically generated by JIRA. 
For more information on JIRA, see: http://www.atlassian.com/software/jira
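Jonathan's fix adds the missing encryption_options stanza. For reference, the 0.8-era cassandra.yaml block looks roughly like this (key names taken from the stock 0.8 configuration; verify against your release before relying on them):

```yaml
# Internode encryption settings introduced in the 0.8 line; a node upgraded
# from 0.7 without refreshing its config lacks this stanza entirely, which
# is what led to the NullPointerException during streaming above.
encryption_options:
    internode_encryption: none   # 'none' disables internode encryption
    keystore: conf/.keystore
    keystore_password: cassandra
    truststore: conf/.truststore
    truststore_password: cassandra
```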
[jira] [Assigned] (CASSANDRA-3006) Enormous counter
[ https://issues.apache.org/jira/browse/CASSANDRA-3006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis reassigned CASSANDRA-3006: - Assignee: Sylvain Lebresne > Enormous counter > - > > Key: CASSANDRA-3006 > URL: https://issues.apache.org/jira/browse/CASSANDRA-3006 > Project: Cassandra > Issue Type: Bug >Affects Versions: 0.8.3 > Environment: ubuntu 10.04 >Reporter: Boris Yen >Assignee: Sylvain Lebresne > > I have a two-node cluster with the following keyspace and column family > settings. > Cluster Information: >Snitch: org.apache.cassandra.locator.SimpleSnitch >Partitioner: org.apache.cassandra.dht.RandomPartitioner >Schema versions: > 63fda700-c243-11e0--2d03dcafebdf: [172.17.19.151, 172.17.19.152] > Keyspace: test: > Replication Strategy: org.apache.cassandra.locator.NetworkTopologyStrategy > Durable Writes: true > Options: [datacenter1:2] > Column Families: > ColumnFamily: testCounter (Super) > "APP status information." > Key Validation Class: org.apache.cassandra.db.marshal.BytesType > Default column value validator: > org.apache.cassandra.db.marshal.CounterColumnType > Columns sorted by: > org.apache.cassandra.db.marshal.BytesType/org.apache.cassandra.db.marshal.BytesType > Row cache size / save period in seconds: 0.0/0 > Key cache size / save period in seconds: 20.0/14400 > Memtable thresholds: 1.1578125/1440/247 (millions of ops/MB/minutes) > GC grace seconds: 864000 > Compaction min/max thresholds: 4/32 > Read repair chance: 1.0 > Replicate on write: true > Built indexes: [] > Then, I used a test program based on hector to add a counter column > (testCounter[sc][column]) 1000 times. In the middle of the adding process, I > intentionally shut down the node 172.17.19.152. In addition, the test > program is smart enough to switch the consistency level from Quorum to One, > so that the subsequent add operations would not fail.
> After all the add operations are done, I start cassandra on > 172.17.19.152 and use cassandra-cli to check whether the counter is correct on > both nodes. I get a result of 1001, which is reasonable because hector > will retry once. However, when I shut down 172.17.19.151, wait until > 172.17.19.152 is aware that 172.17.19.151 is down, and then start cassandra > on 172.17.19.151 again and check the counter, this time I get a > result of 481387, which is badly wrong. > I used 0.8.3 to reproduce this bug, but I think it also happens on 0.8.2 and > earlier.
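The expected 1001 (rather than 1000) comes from a basic property of counters: increments are deltas, not idempotent overwrites, so a replayed delta changes the total. A minimal illustrative sketch (not Cassandra code):

```java
public class CounterRetryDemo {
    // Counter writes are deltas, not idempotent overwrites: replaying a
    // delta changes the total. Illustrative sketch only.
    static long incrementWithOneRetry(long counter) {
        counter += 1; // original increment applied by the cluster
        counter += 1; // client times out, retries, and the delta lands again
        return counter;
    }

    public static void main(String[] args) {
        // One logical increment plus one retry yields 2, not 1 -- the same
        // effect that turns 1000 adds plus one hector retry into 1001.
        System.out.println(incrementWithOneRetry(0)); // prints 2
    }
}
```

The jump to 481387, by contrast, is not explainable by retries alone, which is what makes this a bug rather than expected counter behavior.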
[jira] [Commented] (CASSANDRA-1717) Cassandra cannot detect corrupt-but-readable column data
[ https://issues.apache.org/jira/browse/CASSANDRA-1717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081603#comment-13081603 ] Sylvain Lebresne commented on CASSANDRA-1717: - {quote} bq. We should convert the CRC32 to an int (and only write that) as it is an int internally (getValue() returns a long only because CRC32 implements the Checksum interface, which requires that). Let's leave that to the ticket for CRC optimization, which will allow us to modify that system-wide {quote} Let's not: * this is completely orthogonal to switching to a drop-in, faster CRC implementation. * it is unclear that we want to make it system-wide. Imho, it is not worth breaking commit log compatibility for that, but it is stupid to commit new code that perpetuates the mistake, only to change it later. > Cassandra cannot detect corrupt-but-readable column data > > > Key: CASSANDRA-1717 > URL: https://issues.apache.org/jira/browse/CASSANDRA-1717 > Project: Cassandra > Issue Type: New Feature > Components: Core >Reporter: Jonathan Ellis >Assignee: Pavel Yaskevich > Fix For: 1.0 > > Attachments: CASSANDRA-1717-v2.patch, CASSANDRA-1717.patch, > checksums.txt > > > Most corruptions of on-disk data due to bitrot render the column (or row) > unreadable, so the data can be replaced by read repair or anti-entropy. But > if the corruption keeps column data readable we do not detect it, and if it > corrupts to a higher timestamp value it can even resist being overwritten by > newer values.
[jira] [Commented] (CASSANDRA-1717) Cassandra cannot detect corrupt-but-readable column data
[ https://issues.apache.org/jira/browse/CASSANDRA-1717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081605#comment-13081605 ] Jonathan Ellis commented on CASSANDRA-1717: --- Saving 4 bytes out of 64K doesn't seem like enough benefit to make life harder for ourselves if we want to use a long checksum later.
[jira] [Commented] (CASSANDRA-1717) Cassandra cannot detect corrupt-but-readable column data
[ https://issues.apache.org/jira/browse/CASSANDRA-1717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081609#comment-13081609 ] Pavel Yaskevich commented on CASSANDRA-1717: +1 with Jonathan; it is also better to satisfy the interface instead of relying on internal implementation details, which could also be helpful if we decide to change the checksum algorithm.
[jira] [Commented] (CASSANDRA-1717) Cassandra cannot detect corrupt-but-readable column data
[ https://issues.apache.org/jira/browse/CASSANDRA-1717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081629#comment-13081629 ] Sylvain Lebresne commented on CASSANDRA-1717: - What are the chances we'll switch from CRC32 any time soon? And even if we do, why would that help us to save 4 bytes of 0's right now? We will still have to bump the file format version and keep code compatible with the old CRC32 format if we do so. It's not as if the only difference between checksum algorithms were the size of the checksum. So yes, 4 bytes out of 64K is not a lot of data, but knowingly writing 4 bytes of 0's every 64K, every time, for the vague, remote chance that it may save us 1 or 2 lines of code someday (which, again, remains to be proven) feels ridiculous to me. But if I'm the only one who feels that way, fine, it's not a big deal.
[jira] [Commented] (CASSANDRA-1717) Cassandra cannot detect corrupt-but-readable column data
[ https://issues.apache.org/jira/browse/CASSANDRA-1717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081637#comment-13081637 ] Pavel Yaskevich commented on CASSANDRA-1717: I still think that such a change is a matter for a separate ticket: since we will want to change CRC handling globally, we can make our own Checksum class that returns an int value, apply the performance improvements mentioned in HADOOP-6148 to it, and use it system-wide. Is there anything else that keeps this from being committed?
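For context on the 4-vs-8-byte debate above: java.util.zip.CRC32 computes a 32-bit value, and getValue() returns a long only to satisfy the Checksum interface, so the upper 32 bits are always zero. A sketch of the narrowing being discussed (illustrative, not the patch code):

```java
import java.util.zip.CRC32;

public class Crc32WidthDemo {
    // CRC32 is a 32-bit checksum; getValue() returns a long only because the
    // java.util.zip.Checksum interface mandates it, so the top 32 bits are
    // always zero and the narrowing cast below is lossless.
    static int crc32AsInt(byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(data, 0, data.length);
        return (int) crc.getValue();
    }

    public static void main(String[] args) {
        byte[] data = "some column data".getBytes();
        CRC32 crc = new CRC32();
        crc.update(data, 0, data.length);
        long asLong = crc.getValue();
        int asInt = crc32AsInt(data);
        // Masking the int back up to a long recovers the value exactly:
        // the 4 extra bytes written to disk would always be zero.
        System.out.println(asLong == (asInt & 0xFFFFFFFFL)); // prints true
    }
}
```

This is Sylvain's "4 bytes of 0's" point; the counter-argument is that narrowing bakes the 32-bit assumption into the file format.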
svn commit: r1155374 - /cassandra/branches/cassandra-0.8/debian/control
Author: eevans
Date: Tue Aug 9 14:05:55 2011
New Revision: 1155374
URL: http://svn.apache.org/viewvc?rev=1155374&view=rev

Log: build requires subversion (line 235 of build.xml)

Patch by Sven Wilhelm; reviewed by eevans

Modified: cassandra/branches/cassandra-0.8/debian/control

Modified: cassandra/branches/cassandra-0.8/debian/control
URL: http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.8/debian/control?rev=1155374&r1=1155373&r2=1155374&view=diff
==
--- cassandra/branches/cassandra-0.8/debian/control (original)
+++ cassandra/branches/cassandra-0.8/debian/control Tue Aug 9 14:05:55 2011
@@ -2,7 +2,7 @@
 Source: cassandra
 Section: misc
 Priority: extra
 Maintainer: Eric Evans
-Build-Depends: debhelper (>= 5), openjdk-6-jdk (>= 6b11) | java6-sdk, ant (>= 1.7), ant-optional (>= 1.7)
+Build-Depends: debhelper (>= 5), openjdk-6-jdk (>= 6b11) | java6-sdk, ant (>= 1.7), ant-optional (>= 1.7), subversion
 Homepage: http://cassandra.apache.org
 Vcs-Svn: https://svn.apache.org/repos/asf/cassandra/trunk
 Vcs-Browser: http://svn.apache.org/viewvc/cassandra/trunk
[jira] [Created] (CASSANDRA-3008) Error getting range slices
Error getting range slices -- Key: CASSANDRA-3008 URL: https://issues.apache.org/jira/browse/CASSANDRA-3008 Project: Cassandra Issue Type: Bug Affects Versions: 0.8.2 Environment: Ubuntu, using the 08x repository Reporter: Luis Eduardo Villares Matta Priority: Critical I can't get a range slice on one of my column families. ERROR 14:16:26,672 Internal error processing get_range_slices java.io.IOError: java.io.EOFException: EOF after 26948 bytes out of 1681403191 at org.apache.cassandra.db.columniterator.SimpleSliceReader.<init>(SimpleSliceReader.java:66) at org.apache.cassandra.db.columniterator.SSTableSliceIterator.createReader(SSTableSliceIterator.java:91) at org.apache.cassandra.db.columniterator.SSTableSliceIterator.<init>(SSTableSliceIterator.java:86) at org.apache.cassandra.db.filter.SliceQueryFilter.getSSTableColumnIterator(SliceQueryFilter.java:71) at org.apache.cassandra.db.filter.QueryFilter.getSSTableColumnIterator(QueryFilter.java:87) at org.apache.cassandra.io.sstable.SSTableScanner$KeyScanningIterator.next(SSTableScanner.java:184) at org.apache.cassandra.io.sstable.SSTableScanner$KeyScanningIterator.next(SSTableScanner.java:144) at org.apache.cassandra.io.sstable.SSTableScanner.next(SSTableScanner.java:136) at org.apache.cassandra.io.sstable.SSTableScanner.next(SSTableScanner.java:39) at org.apache.commons.collections.iterators.CollatingIterator.set(CollatingIterator.java:284) at org.apache.commons.collections.iterators.CollatingIterator.least(CollatingIterator.java:326) at org.apache.commons.collections.iterators.CollatingIterator.next(CollatingIterator.java:230) at org.apache.cassandra.utils.ReducingIterator.computeNext(ReducingIterator.java:69) at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140) at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135) at org.apache.cassandra.db.RowIterator.hasNext(RowIterator.java:49) at 
org.apache.cassandra.db.ColumnFamilyStore.getRangeSlice(ColumnFamilyStore.java:1392) at org.apache.cassandra.service.StorageProxy.getRangeSlice(StorageProxy.java:684) at org.apache.cassandra.thrift.CassandraServer.get_range_slices(CassandraServer.java:617) at org.apache.cassandra.thrift.Cassandra$Processor$get_range_slices.process(Cassandra.java:3202) at org.apache.cassandra.thrift.Cassandra$Processor.process(Cassandra.java:2889) at org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:187) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) Caused by: java.io.EOFException: EOF after 26948 bytes out of 1681403191 at org.apache.cassandra.io.util.FileUtils.skipBytesFully(FileUtils.java:229) at org.apache.cassandra.io.sstable.IndexHelper.skipBloomFilter(IndexHelper.java:50) at org.apache.cassandra.db.columniterator.SimpleSliceReader.<init>(SimpleSliceReader.java:57) ... 24 more
[jira] [Commented] (CASSANDRA-3006) Enormous counter
[ https://issues.apache.org/jira/browse/CASSANDRA-3006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081660#comment-13081660 ] Sylvain Lebresne commented on CASSANDRA-3006: - I haven't had any luck reproducing this so far. I've tried to stick to the description above but did not use hector (not saying it is hector's fault, though; maybe it is the way it does retries that I don't emulate well). If you are able to share a minimal hector script with which you can reproduce this easily, that would be very helpful.
[jira] [Commented] (CASSANDRA-2474) CQL support for compound columns
[ https://issues.apache.org/jira/browse/CASSANDRA-2474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081665#comment-13081665 ] T Jake Luciani commented on CASSANDRA-2474: --- I don't (yet) know how to add hint types to hive, but once a transposed hint operator is added we should be able to hook it into the hive driver. > CQL support for compound columns > > > Key: CASSANDRA-2474 > URL: https://issues.apache.org/jira/browse/CASSANDRA-2474 > Project: Cassandra > Issue Type: Sub-task > Components: API, Core >Reporter: Eric Evans > Labels: cql > Fix For: 1.0 > > > For the most part, this boils down to supporting the specification of > compound column names (the CQL syntax is colon-delimited terms), and then > teaching the decoders (drivers) to create structures from the results.
[jira] [Commented] (CASSANDRA-2474) CQL support for compound columns
[ https://issues.apache.org/jira/browse/CASSANDRA-2474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081669#comment-13081669 ] Jonathan Ellis commented on CASSANDRA-2474: --- Isn't changing query semantics kind of the opposite of what hints are supposed to be for?
[jira] [Commented] (CASSANDRA-3008) Error getting range slices
[ https://issues.apache.org/jira/browse/CASSANDRA-3008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081673#comment-13081673 ] Jonathan Ellis commented on CASSANDRA-3008: --- did you try "nodetool scrub"?
[jira] [Commented] (CASSANDRA-2749) fine-grained control over data directories
[ https://issues.apache.org/jira/browse/CASSANDRA-2749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081679#comment-13081679 ] Chris Burroughs commented on CASSANDRA-2749: It would also be cool (but this is obviously speculative) to have the ability to keep Index files on an SSD, and the larger data files on rotating disks. > fine-grained control over data directories > -- > > Key: CASSANDRA-2749 > URL: https://issues.apache.org/jira/browse/CASSANDRA-2749 > Project: Cassandra > Issue Type: New Feature > Components: Core >Reporter: Jonathan Ellis >Assignee: Pavel Yaskevich >Priority: Minor > Fix For: 1.0 > > > Currently Cassandra supports multiple data directories but no way to control > what sstables are placed where. Particularly for systems with mixed SSDs and > rotational disks, it would be nice to pin frequently accessed columnfamilies > to the SSDs. > Postgresql does this with tablespaces > (http://www.postgresql.org/docs/9.0/static/manage-ag-tablespaces.html) but we > should probably avoid using that name because of confusing similarity to > "keyspaces."
[jira] [Commented] (CASSANDRA-3007) NullPointerException in MessagingService.java:420
[ https://issues.apache.org/jira/browse/CASSANDRA-3007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081680#comment-13081680 ] Viliam Holub commented on CASSANDRA-3007: - It's the removetoken command. Yes, I upgraded the node and forgot to specify encryption_options - thanks!
[jira] [Commented] (CASSANDRA-1717) Cassandra cannot detect corrupt-but-readable column data
[ https://issues.apache.org/jira/browse/CASSANDRA-1717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081685#comment-13081685 ] Sylvain Lebresne commented on CASSANDRA-1717: - As previously said, I disagree both with using 8 bytes when we need 4 and with the idea that using 4 is a matter for another ticket, but since this is probably me being too anal as usual, +1 on the rest of the patch, modulo a small optional nitpick: the toLong() function is a bit hard to read imho. It's hard to see where the parentheses are and whether it does the right thing. It seems ok though; I just think a simple for loop over the bytes would be more readable. We also historically keep ByteBufferUtil for ByteBuffer manipulations and use FBUtilities for byte[] manipulation.
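Sylvain's readability suggestion can be sketched as a simple for loop over the bytes (a hypothetical reading of the helper, not the code from the actual patch):

```java
public class ToLongSketch {
    // Builds a long from a big-endian byte[] one byte at a time -- the
    // "simple for loop" alternative to nested shift/mask expressions.
    static long toLong(byte[] bytes) {
        long value = 0;
        for (byte b : bytes)
            value = (value << 8) | (b & 0xFFL); // mask avoids sign extension
        return value;
    }

    public static void main(String[] args) {
        byte[] bytes = { 0, 0, 0, 0, 0, 0, 1, 2 };
        System.out.println(toLong(bytes)); // prints 258
    }
}
```

The `& 0xFFL` mask is the one subtlety: without it, negative bytes would sign-extend and corrupt the accumulated value.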
[jira] [Commented] (CASSANDRA-1717) Cassandra cannot detect corrupt-but-readable column data
[ https://issues.apache.org/jira/browse/CASSANDRA-1717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081689#comment-13081689 ] Pavel Yaskevich commented on CASSANDRA-1717: Ok, I will move toLong(byte[] bytes) to FBUtilities and commit, thanks!
[jira] [Commented] (CASSANDRA-1717) Cassandra cannot detect corrupt-but-readable column data
[ https://issues.apache.org/jira/browse/CASSANDRA-1717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081690#comment-13081690 ] Jonathan Ellis commented on CASSANDRA-1717: --- You're right: if we change the checksum implementation we need to bump the sstable revision anyway. +1 on casting to int here. (But as you said above, -1 on changing this in CommitLog.)
[jira] [Commented] (CASSANDRA-3008) Error getting range slices
[ https://issues.apache.org/jira/browse/CASSANDRA-3008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081701#comment-13081701 ] Luis Eduardo Villares Matta commented on CASSANDRA-3008: No, I had not; running it seems to have fixed my issues. Thank you very much. (I am inclined to close this issue, but I do not know if I should. I am also testing everything over the next few hours.)
[jira] [Updated] (CASSANDRA-1717) Cassandra cannot detect corrupt-but-readable column data
[ https://issues.apache.org/jira/browse/CASSANDRA-1717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pavel Yaskevich updated CASSANDRA-1717: --- Attachment: CASSANDRA-1717-v3.patch v3 which removes BBU.toLong and adds FBU.byteArrayToInt + uses int instead of long for checksum > Cassandra cannot detect corrupt-but-readable column data > > > Key: CASSANDRA-1717 > URL: https://issues.apache.org/jira/browse/CASSANDRA-1717 > Project: Cassandra > Issue Type: New Feature > Components: Core >Reporter: Jonathan Ellis >Assignee: Pavel Yaskevich > Fix For: 1.0 > > Attachments: CASSANDRA-1717-v2.patch, CASSANDRA-1717-v3.patch, > CASSANDRA-1717.patch, checksums.txt > > > Most corruptions of on-disk data due to bitrot render the column (or row) > unreadable, so the data can be replaced by read repair or anti-entropy. But > if the corruption keeps column data readable we do not detect it, and if it > corrupts to a higher timestamp value can even resist being overwritten by > newer values. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
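The v3 note above (FBU.byteArrayToInt, int instead of long for the checksum) suggests the general shape of the fix. A minimal, hypothetical sketch of the idea — checksum column data on write and verify on read so corrupt-but-readable data is detected — not the actual patch; class and method names here are illustrative:

```java
import java.util.zip.CRC32;

// Hypothetical sketch of the CASSANDRA-1717 idea: checksum data on write,
// verify on read. Per the v3 note, the CRC is truncated to an int.
public class ChecksumSketch {
    public static int checksum(byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(data, 0, data.length);
        return (int) crc.getValue();     // keep only 32 bits, matching the int checksum
    }

    public static boolean verify(byte[] data, int stored) {
        return checksum(data) == stored; // mismatch => bitrot, candidate for repair
    }
}
```

A mismatch on read would mark the value as corrupt so read repair or anti-entropy can replace it, which covers the corrupted-to-a-higher-timestamp case described in the report.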
[jira] [Commented] (CASSANDRA-3008) Error getting range slices
[ https://issues.apache.org/jira/browse/CASSANDRA-3008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081712#comment-13081712 ] Jonathan Ellis commented on CASSANDRA-3008: --- Check (scrub) your other nodes -- data corruption can happen (usually from bad memory), but if there's a pattern of all the nodes being affected at the same time there could be a Cassandra bug.
> Error getting range slices
> Key: CASSANDRA-3008
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3008
[jira] [Commented] (CASSANDRA-1717) Cassandra cannot detect corrupt-but-readable column data
[ https://issues.apache.org/jira/browse/CASSANDRA-1717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081718#comment-13081718 ] Sylvain Lebresne commented on CASSANDRA-1717: - lgtm, +1
> Cassandra cannot detect corrupt-but-readable column data
> Key: CASSANDRA-1717
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1717
[jira] [Created] (CASSANDRA-3009) 404 on apt-get install from http://www.apache.org/dist/cassandra/debian
404 on apt-get install from http://www.apache.org/dist/cassandra/debian
---
Key: CASSANDRA-3009
URL: https://issues.apache.org/jira/browse/CASSANDRA-3009
Project: Cassandra
Issue Type: Bug
Components: Documentation & website
Affects Versions: 0.8.3
Environment: ubuntu maverick 64-bit
Reporter: Chris Lohfink
Priority: Minor

First bug report on here, so sorry if I am doing something incorrectly. I followed the wiki (http://wiki.apache.org/cassandra/DebianPackaging) but I am receiving a 404 error during the install:
{code}
clohfink@roc-lvm-dev:~dev$ sudo apt-get install cassandra
[sudo] password for clohfink:
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following packages were automatically installed and are no longer required:
  libcommons-pool-java authbind libmcrypt4 libtomcat6-java libcommons-dbcp-java tomcat6-common
Use 'apt-get autoremove' to remove them.
The following NEW packages will be installed:
  cassandra
0 upgraded, 1 newly installed, 0 to remove and 66 not upgraded.
Need to get 8,415kB of archives.
After this operation, 9,540kB of additional disk space will be used.
Err http://www.apache.org/dist/cassandra/debian/ unstable/main cassandra all 0.8.0
  404 Not Found
Failed to fetch http://www.apache.org/dist/cassandra/debian/pool/main/c/cassandra/cassandra_0.8.0_all.deb 404 Not Found
E: Unable to fetch some archives, maybe run apt-get update or try with --fix-missing?
{code}
For debugging info:
{code}
clohfink@roc-lvm-dev:~dev/fabrictests$ sudo apt-cache show cassandra
N: Can't select versions from package 'cassandra' as it purely virtual
N: No packages found
clohfink@roc-lvm-dev:~dev/fabrictests$ sudo add-apt-repository "deb http://www.apache.org/dist/cassandra/debian unstable main"
clohfink@roc-lvm-dev:~dev/fabrictests$ sudo apt-get update
...
Ign http://www.apache.org/dist/cassandra/debian/ unstable/main Translation-en
Ign http://www.apache.org/dist/cassandra/debian/ unstable/main Translation-en_US
...
Hit http://us.archive.ubuntu.com maverick-proposed/universe amd64 Packages
Fetched 6,989B in 1s (5,974B/s)
Reading package lists... Done
clohfink@roc-lvm-dev:~dev/fabrictests$ sudo apt-cache show cassandra
Package: cassandra
Version: 0.8.0
Architecture: all
Maintainer: Eric Evans
Installed-Size: 9316
Depends: openjdk-6-jre-headless (>= 6b11) | java6-runtime, jsvc (>= 1.0), libcommons-daemon-java (>= 1.0), adduser
Recommends: libjna-java
Homepage: http://cassandra.apache.org
Priority: extra
Section: misc
Filename: pool/main/c/cassandra/cassandra_0.8.0_all.deb
Size: 8415180
SHA256: 7eaaeb9d3ef5af6abff834fe93f1a84349dff98776eaee83f8dabb267ffe4833
SHA1: 9cca3ffbcbab9e6ba2385f734691c97afeaa8be6
MD5sum: 01e0435495f7ff40e1b4e4be5857a1ea
Description: distributed storage system for structured data
 Cassandra is a distributed (peer-to-peer) system for the management and storage of structured data.
{code}
I included a fabric script; if you have fabric installed you can run:
{code}
fab -H localhost install_cassandra
{code}
[jira] [Assigned] (CASSANDRA-1974) PFEPS-like snitch that uses gossip instead of a property file
[ https://issues.apache.org/jira/browse/CASSANDRA-1974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis reassigned CASSANDRA-1974: - Assignee: (was: Brandon Williams) I think the biggest win is when you can automatically determine rack/dc from the environment somehow (e.g.: ec2snitch). Otherwise the advantage of editing a file, vs edit + rsync, is small. Small enough that it's probably not worth the education headache. > PFEPS-like snitch that uses gossip instead of a property file > - > > Key: CASSANDRA-1974 > URL: https://issues.apache.org/jira/browse/CASSANDRA-1974 > Project: Cassandra > Issue Type: New Feature >Reporter: Brandon Williams >Priority: Minor > > Now that we have an ec2 snitch that propagates its rack/dc info via gossip > from CASSANDRA-1654, it doesn't make a lot of sense to use PFEPS where you > have to rsync the property file across all the machines when you add a node. > Instead, we could have a snitch where you specify its rack/dc in a property > file, and propagate this via gossip like the ec2 snitch. In order to not > break PFEPS, this should probably be a new snitch. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CASSANDRA-3009) 404 on apt-get install from http://www.apache.org/dist/cassandra/debian
[ https://issues.apache.org/jira/browse/CASSANDRA-3009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Lohfink updated CASSANDRA-3009: - Attachment: fabfile.py
> 404 on apt-get install from http://www.apache.org/dist/cassandra/debian
> Key: CASSANDRA-3009
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3009
[jira] [Updated] (CASSANDRA-2892) Don't "replicate_on_write" with RF=1
[ https://issues.apache.org/jira/browse/CASSANDRA-2892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sylvain Lebresne updated CASSANDRA-2892: Attachment: 2892.patch That's a super easy one, and it removes some nasty boolean flag from SP.sendToHintedEndpoints, so let's do it.
> Don't "replicate_on_write" with RF=1
>
> Key: CASSANDRA-2892
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2892
> Project: Cassandra
> Issue Type: Improvement
> Components: Core
> Affects Versions: 0.8.0
> Reporter: Sylvain Lebresne
> Assignee: Sylvain Lebresne
> Priority: Trivial
> Labels: counters
> Fix For: 0.8.4
> Attachments: 2892.patch
>
> For counters with RF=1, we still do a read to replicate, even though there is nothing to replicate it to.
[jira] [Commented] (CASSANDRA-2892) Don't "replicate_on_write" with RF=1
[ https://issues.apache.org/jira/browse/CASSANDRA-2892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081728#comment-13081728 ] Jonathan Ellis commented on CASSANDRA-2892: --- can you spell out what's going on with this part?
{code}
-if (cm.shouldReplicateOnWrite())
+hintedEndpoints.removeAll(FBUtilities.getLocalAddress());
+
+if (cm.shouldReplicateOnWrite() && !hintedEndpoints.isEmpty())
{code}
> Don't "replicate_on_write" with RF=1
> Key: CASSANDRA-2892
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2892
[Cassandra Wiki] Trivial Update of "DebianPackaging" by SylvainLebresne
Dear Wiki user, You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for change notification. The "DebianPackaging" page has been changed by SylvainLebresne: http://wiki.apache.org/cassandra/DebianPackaging?action=diff&rev1=22&rev2=23

To install on Debian or Debian derivatives, use the following sources:
{{{
- deb http://www.apache.org/dist/cassandra/debian unstable main
+ deb http://www.apache.org/dist/cassandra/debian 08x main
- deb-src http://www.apache.org/dist/cassandra/debian unstable main
+ deb-src http://www.apache.org/dist/cassandra/debian 08x main
}}}
- ''Note: the unstable suite points to the most current branch of development (for historical reasons). Production systems should use a version-specific suite/codename, (for example, `06x` for the 0.6.x series, `07x` for the 0.7.x series, etc).''
+ You will want to replace `08x` by the series you want to use: `06x` for the 0.6.x series, `07x` for the 0.7.x series, etc. It does mean that you will not get major version updates unless you change the series, but that is ''a feature''.
+ If you run ''apt-get update'' now, you will see an error similar to this:
{{{
[jira] [Commented] (CASSANDRA-2843) better performance on long row read
[ https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081729#comment-13081729 ] Jonathan Ellis commented on CASSANDRA-2843: --- +1
> better performance on long row read
> ---
>
> Key: CASSANDRA-2843
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2843
> Project: Cassandra
> Issue Type: New Feature
> Reporter: Yang Yang
> Fix For: 1.0
> Attachments: 2843.patch, 2843_d.patch, 2843_g.patch, 2843_h.patch, fix.diff, microBenchmark.patch, patch_timing, std_timing
>
> Currently, if a row contains more than 1000 columns, the run time becomes considerably slow: my test of a row with 3000 columns (standard, regular), each with 8 bytes in name and 40 bytes in value, is about 16ms.
> This is all running in memory; no disk read is involved.
> Through debugging we can find most of this time is spent on:
> [Wall Time] org.apache.cassandra.db.Table.getRow(QueryFilter)
> [Wall Time] org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter, ColumnFamily)
> [Wall Time] org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter, int, ColumnFamily)
> [Wall Time] org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(QueryFilter, int, ColumnFamily)
> [Wall Time] org.apache.cassandra.db.filter.QueryFilter.collectCollatedColumns(ColumnFamily, Iterator, int)
> [Wall Time] org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(IColumnContainer, Iterator, int)
> [Wall Time] org.apache.cassandra.db.ColumnFamily.addColumn(IColumn)
> ColumnFamily.addColumn() is slow because it inserts into an internal ConcurrentSkipListMap that maps column names to values.
> This structure is slow for two reasons: it needs to do synchronization, and it needs to maintain a more complex map structure.
> But if we look at the whole read path, thrift already defines the read output to be a List, so it does not make sense to use a luxury map data structure in the interim and finally convert it to a list. On the synchronization side, since the returned CF is never going to be shared/modified by other threads, we know the access is always single threaded, so no synchronization is needed.
> But these 2 features are indeed needed for ColumnFamily in other cases, particularly write. So we can provide a different ColumnFamily to CFS.getTopLevelColumnFamily(), so getTopLevelColumnFamily no longer always creates the standard ColumnFamily, but takes a provided returnCF, whose cost is much cheaper.
> The provided patch is for demonstration now; I will work on it further once we agree on the general direction.
> CFS, ColumnFamily, and Table are changed; a new FastColumnFamily is provided. The main work is to let the FastColumnFamily use an array for internal storage. At first I used binary search to insert new columns in addColumn(), but later I found that even this is not necessary, since all calling scenarios of ColumnFamily.addColumn() have an invariant that the inserted columns come in sorted order (I still have an issue to resolve, descending or ascending, but ascending works). So the current logic is simply to compare the new column against the last column in the array: if the names are not equal, append; if equal, reconcile.
> Slight temporary hacks are made on getTopLevelColumnFamily so we have 2 flavors of the method, one accepting a returnCF. But we could definitely think about what is the better way to provide this returnCF.
> This patch compiles fine; no tests are provided yet. But I tested it in my application, and the performance improvement is dramatic: it offers about a 50% reduction in read time in the 3000-column case.
> thanks
> Yang
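The append-or-reconcile logic the report describes can be sketched in a few lines. This is a toy model under stated assumptions — Column and FastColumnsSketch are stand-ins, not Cassandra's IColumn/ColumnFamily, and reconcile is simplified to "higher timestamp wins":

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of the array-backed addColumn(): since callers insert columns in
// sorted order, a plain append suffices, and a new column whose name equals
// the last one is reconciled in place instead of going through a skip list.
public class FastColumnsSketch {
    public static final class Column {
        public final String name;
        public final long timestamp;
        public Column(String name, long timestamp) { this.name = name; this.timestamp = timestamp; }
    }

    private final List<Column> columns = new ArrayList<>();

    public void addColumn(Column c) {
        int last = columns.size() - 1;
        if (last >= 0 && columns.get(last).name.equals(c.name)) {
            if (c.timestamp > columns.get(last).timestamp)
                columns.set(last, c);   // names equal: reconcile, newer timestamp wins
        } else {
            columns.add(c);             // sorted input: append, no tree/skip-list cost
        }
    }

    public int size() { return columns.size(); }
    public long lastTimestamp() { return columns.get(columns.size() - 1).timestamp; }
}
```

The speedup claimed in the report comes from replacing the per-insert ConcurrentSkipListMap navigation and synchronization with this O(1) tail comparison.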
[jira] [Resolved] (CASSANDRA-3009) 404 on apt-get install from http://www.apache.org/dist/cassandra/debian
[ https://issues.apache.org/jira/browse/CASSANDRA-3009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sylvain Lebresne resolved CASSANDRA-3009. - Resolution: Not A Problem Sorry, this is because I don't update the 'unstable' series anymore. You should use 08x instead (or 07x if you feel so inclined). An 'unstable' series that silently does major version upgrades seemed too easily harmful, so we've switched to numbered series instead. I've updated the wiki accordingly. Sorry for the inconvenience.
> 404 on apt-get install from http://www.apache.org/dist/cassandra/debian
> Key: CASSANDRA-3009
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3009
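Concretely, the resolution amounts to pointing APT at a versioned suite instead of 'unstable'. A sketch of the corrected sources entry (the 08x suite name comes from the wiki change above; where you keep the entry in your sources list is up to you):

{code}
deb http://www.apache.org/dist/cassandra/debian 08x main
deb-src http://www.apache.org/dist/cassandra/debian 08x main
{code}

After editing the sources, run apt-get update before reinstalling the package.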
[jira] [Commented] (CASSANDRA-2892) Don't "replicate_on_write" with RF=1
[ https://issues.apache.org/jira/browse/CASSANDRA-2892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081739#comment-13081739 ] Sylvain Lebresne commented on CASSANDRA-2892: - Sure. Applying a counter update on the first replica has two parts: we apply locally (we know we are on a replica at that point), we read back (in cm.makeReplicationMutation, not shown in that diff), and finally we use sendToHintedEndpoints to send what was read to the remaining replicas. To avoid reapplying locally in sendToHintedEndpoints, we used to set its 'applyMutationLocally' flag to false. Instead, this patch removes the local node from hintedEndpoints (since we just applied locally already) before calling sendToHintedEndpoints. We can then just check whether hintedEndpoints is empty as a synonym for 'is it RF=1'.
> Don't "replicate_on_write" with RF=1
> Key: CASSANDRA-2892
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2892
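The logic described above can be sketched in isolation. This is a hypothetical illustration, not Cassandra's actual StorageProxy API — the names and the String-based endpoints are assumptions:

```java
import java.util.HashSet;
import java.util.Set;

// Sketch of the CASSANDRA-2892 idea: remove the local node from the replica
// set first (the mutation was already applied locally), then replicate-on-write
// only if any remote replicas remain. With RF=1 the set becomes empty and the
// read-back-and-send step is skipped entirely.
public class ReplicateOnWriteSketch {
    public static boolean shouldReplicate(Set<String> replicaEndpoints,
                                          String localAddress,
                                          boolean replicateOnWrite) {
        Set<String> remote = new HashSet<>(replicaEndpoints);
        remote.remove(localAddress);                  // already applied locally
        return replicateOnWrite && !remote.isEmpty(); // empty <=> nothing to replicate to
    }
}
```

This is why the patch can drop the 'applyMutationLocally' flag: the local node is simply no longer in the set handed to sendToHintedEndpoints.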
[jira] [Commented] (CASSANDRA-1608) Redesigned Compaction
[ https://issues.apache.org/jira/browse/CASSANDRA-1608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081738#comment-13081738 ] Jonathan Ellis commented on CASSANDRA-1608: --- Where I was going is: if we are compacting {L2.1, L3.1, L3.2, ..., L3.11}, we can also compact {L2.9, L3.90, L3.91, ..., L3.99}, for instance. Because if the input keys are non-overlapping, we know that the output keys will be as well. Right?
> Redesigned Compaction
> -
>
> Key: CASSANDRA-1608
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1608
> Project: Cassandra
> Issue Type: Improvement
> Components: Core
> Reporter: Chris Goffinet
> Assignee: Benjamin Coverston
> Attachments: 1608-v11.txt, 1608-v2.txt
>
> After seeing the I/O issues in CASSANDRA-1470, I've been doing some more thinking on this subject that I wanted to lay out.
> I propose we redo the concept of how compaction works in Cassandra. At the moment, compaction is kicked off based on a write access pattern, not read access pattern. In most cases, you want the opposite. You want to be able to track how well each SSTable is performing in the system. If we were to keep statistics in-memory for each SSTable, prioritize them based on most accessed, and bloom filter hit/miss ratios, we could intelligently group sstables that are being read most often and schedule them for compaction. We could also schedule lower-priority maintenance on SSTables not often accessed.
> I also propose we limit the size of each SSTable to a fixed size, which gives us the ability to better utilize our bloom filters in a predictable manner. At the moment, after a certain size the bloom filters become less reliable. This would also allow us to group the data most accessed. Currently the size of an SSTable can grow to a point where large portions of the data might not actually be accessed as often.
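The disjointness argument in the comment above reduces to a simple range-overlap check: two compactions whose input key ranges do not intersect can run concurrently, because non-overlapping inputs produce non-overlapping outputs. A sketch, modeling ranges as inclusive [start, end] longs purely for illustration (Cassandra's actual tokens and range types are not used here):

```java
// Two compactions are safe to parallelize iff their input key ranges are
// disjoint; disjoint inputs guarantee disjoint outputs.
public class CompactionRangeSketch {
    public static boolean overlaps(long aStart, long aEnd, long bStart, long bEnd) {
        return aStart <= bEnd && bStart <= aEnd;
    }

    public static boolean canRunConcurrently(long aStart, long aEnd, long bStart, long bEnd) {
        return !overlaps(aStart, aEnd, bStart, bEnd);  // disjoint => no shared keys
    }
}
```

In Jonathan's example, {L2.1, L3.1..L3.11} and {L2.9, L3.90..L3.99} would pass this check because leveled sstables within a level cover disjoint key ranges.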
[jira] [Commented] (CASSANDRA-1608) Redesigned Compaction
[ https://issues.apache.org/jira/browse/CASSANDRA-1608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081746#comment-13081746 ] Benjamin Coverston commented on CASSANDRA-1608: --- We know the input and output keys, yes. If we isolate the problem to concurrent compactions in the same level, and staggered levels {L2, L3}, {L4, L5}.
> Redesigned Compaction
> Key: CASSANDRA-1608
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1608
[jira] [Issue Comment Edited] (CASSANDRA-1608) Redesigned Compaction
[ https://issues.apache.org/jira/browse/CASSANDRA-1608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081746#comment-13081746 ] Benjamin Coverston edited comment on CASSANDRA-1608 at 8/9/11 4:36 PM: --- We know the input and output keys, yes. If we isolate the problem to concurrent compactions in the same level, and staggered levels {L2, L3}, {L4, L5} it is certainly an easier problem. was (Author: bcoverston): We know the input and output keys, yes. If we isolate the problem to concurrent compactions in the same level, and staggered levels {L2, L3}, {L4, L5}. > Redesigned Compaction > - > > Key: CASSANDRA-1608 > URL: https://issues.apache.org/jira/browse/CASSANDRA-1608 > Project: Cassandra > Issue Type: Improvement > Components: Core >Reporter: Chris Goffinet >Assignee: Benjamin Coverston > Attachments: 1608-v11.txt, 1608-v2.txt > > > After seeing the I/O issues in CASSANDRA-1470, I've been doing some more > thinking on this subject that I wanted to lay out. > I propose we redo the concept of how compaction works in Cassandra. At the > moment, compaction is kicked off based on a write access pattern, not read > access pattern. In most cases, you want the opposite. You want to be able to > track how well each SSTable is performing in the system. If we were to keep > statistics in-memory of each SSTable, prioritize them based on most accessed, > and bloom filter hit/miss ratios, we could intelligently group sstables that > are being read most often and schedule them for compaction. We could also > schedule lower priority maintenance on SSTable's not often accessed. > I also propose we limit the size of each SSTable to a fix sized, that gives > us the ability to better utilize our bloom filters in a predictable manner. > At the moment after a certain size, the bloom filters become less reliable. > This would also allow us to group data most accessed. 
Currently the size of > an SSTable can grow to a point where large portions of the data might not > actually be accessed as often.
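The proposal above — keep per-SSTable read statistics and bloom-filter hit/miss ratios in memory, then compact the hottest tables first — can be sketched roughly as follows. This is an illustrative sketch only: the class, field, and weighting choices are invented for the example and are not the API of the attached patches.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch of read-driven compaction ordering: SSTables that are
// read most often, or whose bloom filters miss most often, get compacted first.
public class CompactionPriority {
    // Minimal stand-in for per-SSTable statistics; field names are invented.
    public static final class SSTableStats {
        final String name;
        final long readCount;              // how often this sstable was read
        final double bloomFalsePositive;   // observed bloom filter FP ratio, 0..1
        public SSTableStats(String name, long readCount, double bloomFalsePositive) {
            this.name = name;
            this.readCount = readCount;
            this.bloomFalsePositive = bloomFalsePositive;
        }
    }

    // Higher score = more urgent to compact. The weighting is illustrative only.
    public static double score(SSTableStats s) {
        return s.readCount * (1.0 + s.bloomFalsePositive);
    }

    // Return sstable names ordered most-urgent first.
    public static List<String> order(List<SSTableStats> stats) {
        List<SSTableStats> copy = new ArrayList<>(stats);
        copy.sort(Comparator.<SSTableStats>comparingDouble(CompactionPriority::score).reversed());
        List<String> names = new ArrayList<>();
        for (SSTableStats s : copy) names.add(s.name);
        return names;
    }
}
```

The fixed-size-SSTable idea from the same description is what makes a ranking like this meaningful: with bounded file sizes, "hot" tables stay small enough that compacting them together is cheap.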
[jira] [Commented] (CASSANDRA-3009) 404 on apt-get install from http://www.apache.org/dist/cassandra/debian
[ https://issues.apache.org/jira/browse/CASSANDRA-3009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081748#comment-13081748 ] Chris Lohfink commented on CASSANDRA-3009: -- Thanks! Worked great. > 404 on apt-get install from http://www.apache.org/dist/cassandra/debian > --- > > Key: CASSANDRA-3009 > URL: https://issues.apache.org/jira/browse/CASSANDRA-3009 > Project: Cassandra > Issue Type: Bug > Components: Documentation & website >Affects Versions: 0.8.3 > Environment: ubuntu maverick 64-bit >Reporter: Chris Lohfink >Priority: Minor > Attachments: fabfile.py > > > First bug report on here so sorry if I am doing something incorrectly. I > followed the wiki (http://wiki.apache.org/cassandra/DebianPackaging) but I am > receiving a 404 error during the install. Looks like the > {code} > clohfink@roc-lvm-dev:~dev$ sudo apt-get install cassandra > [sudo] password for clohfink: > Reading package lists... Done > Building dependency tree > Reading state information... Done > The following packages were automatically installed and are no longer > required: > libcommons-pool-java authbind libmcrypt4 libtomcat6-java > libcommons-dbcp-java tomcat6-common > Use 'apt-get autoremove' to remove them. > The following NEW packages will be installed: > cassandra > 0 upgraded, 1 newly installed, 0 to remove and 66 not upgraded. > Need to get 8,415kB of archives. > After this operation, 9,540kB of additional disk space will be used. > Err http://www.apache.org/dist/cassandra/debian/ unstable/main cassandra all > 0.8.0 > 404 Not Found > Failed to fetch > http://www.apache.org/dist/cassandra/debian/pool/main/c/cassandra/cassandra_0.8.0_all.deb > 404 Not Found > E: Unable to fetch some archives, maybe run apt-get update or try with > --fix-missing? 
> {code} > for debugging info: > {code} > clohfink@roc-lvm-dev:~dev/fabrictests$ sudo apt-cache show cassandra > N: Can't select versions from package 'cassandra' as it purely virtual > N: No packages found > clohfink@roc-lvm-dev:~dev/fabrictests$ sudo add-apt-repository "deb > http://www.apache.org/dist/cassandra/debian unstable main" > clohfink@roc-lvm-dev:~dev/fabrictests$ sudo apt-get update > ... > Ign http://www.apache.org/dist/cassandra/debian/ unstable/main Translation-en > > Ign http://www.apache.org/dist/cassandra/debian/ unstable/main > Translation-en_US > ... > Hit http://us.archive.ubuntu.com maverick-proposed/universe amd64 Packages > Fetched 6,989B in 1s (5,974B/s) > Reading package lists... Done > clohfink@roc-lvm-dev:~dev/fabrictests$ sudo apt-cache show cassandra > Package: cassandra > Version: 0.8.0 > Architecture: all > Maintainer: Eric Evans > Installed-Size: 9316 > Depends: openjdk-6-jre-headless (>= 6b11) | java6-runtime, jsvc (>= 1.0), > libcommons-daemon-java (>= 1.0), adduser > Recommends: libjna-java > Homepage: http://cassandra.apache.org > Priority: extra > Section: misc > Filename: pool/main/c/cassandra/cassandra_0.8.0_all.deb > Size: 8415180 > SHA256: 7eaaeb9d3ef5af6abff834fe93f1a84349dff98776eaee83f8dabb267ffe4833 > SHA1: 9cca3ffbcbab9e6ba2385f734691c97afeaa8be6 > MD5sum: 01e0435495f7ff40e1b4e4be5857a1ea > Description: distributed storage system for structured data > Cassandra is a distributed (peer-to-peer) system for the management > and storage of structured data. > {code} > included fabric script, if have fabric installed can run > {code} > fab -H localhost install_cassandra > {code} -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CASSANDRA-3009) 404 on apt-get install from http://www.apache.org/dist/cassandra/debian
[ https://issues.apache.org/jira/browse/CASSANDRA-3009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Lohfink updated CASSANDRA-3009: - Comment: was deleted (was: wiki was updated with distribution changes) > 404 on apt-get install from http://www.apache.org/dist/cassandra/debian > --- > > Key: CASSANDRA-3009 > URL: https://issues.apache.org/jira/browse/CASSANDRA-3009 > Project: Cassandra > Issue Type: Bug > Components: Documentation & website >Affects Versions: 0.8.3 > Environment: ubuntu maverick 64-bit >Reporter: Chris Lohfink >Priority: Minor > Attachments: fabfile.py > > > First bug report on here so sorry if I am doing something incorrectly. I > followed the wiki (http://wiki.apache.org/cassandra/DebianPackaging) but I am > receiving a 404 error during the install. Looks like the > {code} > clohfink@roc-lvm-dev:~dev$ sudo apt-get install cassandra > [sudo] password for clohfink: > Reading package lists... Done > Building dependency tree > Reading state information... Done > The following packages were automatically installed and are no longer > required: > libcommons-pool-java authbind libmcrypt4 libtomcat6-java > libcommons-dbcp-java tomcat6-common > Use 'apt-get autoremove' to remove them. > The following NEW packages will be installed: > cassandra > 0 upgraded, 1 newly installed, 0 to remove and 66 not upgraded. > Need to get 8,415kB of archives. > After this operation, 9,540kB of additional disk space will be used. > Err http://www.apache.org/dist/cassandra/debian/ unstable/main cassandra all > 0.8.0 > 404 Not Found > Failed to fetch > http://www.apache.org/dist/cassandra/debian/pool/main/c/cassandra/cassandra_0.8.0_all.deb > 404 Not Found > E: Unable to fetch some archives, maybe run apt-get update or try with > --fix-missing? 
> {code} > for debugging info: > {code} > clohfink@roc-lvm-dev:~dev/fabrictests$ sudo apt-cache show cassandra > N: Can't select versions from package 'cassandra' as it purely virtual > N: No packages found > clohfink@roc-lvm-dev:~dev/fabrictests$ sudo add-apt-repository "deb > http://www.apache.org/dist/cassandra/debian unstable main" > clohfink@roc-lvm-dev:~dev/fabrictests$ sudo apt-get update > ... > Ign http://www.apache.org/dist/cassandra/debian/ unstable/main Translation-en > > Ign http://www.apache.org/dist/cassandra/debian/ unstable/main > Translation-en_US > ... > Hit http://us.archive.ubuntu.com maverick-proposed/universe amd64 Packages > Fetched 6,989B in 1s (5,974B/s) > Reading package lists... Done > clohfink@roc-lvm-dev:~dev/fabrictests$ sudo apt-cache show cassandra > Package: cassandra > Version: 0.8.0 > Architecture: all > Maintainer: Eric Evans > Installed-Size: 9316 > Depends: openjdk-6-jre-headless (>= 6b11) | java6-runtime, jsvc (>= 1.0), > libcommons-daemon-java (>= 1.0), adduser > Recommends: libjna-java > Homepage: http://cassandra.apache.org > Priority: extra > Section: misc > Filename: pool/main/c/cassandra/cassandra_0.8.0_all.deb > Size: 8415180 > SHA256: 7eaaeb9d3ef5af6abff834fe93f1a84349dff98776eaee83f8dabb267ffe4833 > SHA1: 9cca3ffbcbab9e6ba2385f734691c97afeaa8be6 > MD5sum: 01e0435495f7ff40e1b4e4be5857a1ea > Description: distributed storage system for structured data > Cassandra is a distributed (peer-to-peer) system for the management > and storage of structured data. > {code} > included fabric script, if have fabric installed can run > {code} > fab -H localhost install_cassandra > {code} -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Closed] (CASSANDRA-3009) 404 on apt-get install from http://www.apache.org/dist/cassandra/debian
[ https://issues.apache.org/jira/browse/CASSANDRA-3009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Lohfink closed CASSANDRA-3009. wiki was updated with distribution changes > 404 on apt-get install from http://www.apache.org/dist/cassandra/debian > --- > > Key: CASSANDRA-3009 > URL: https://issues.apache.org/jira/browse/CASSANDRA-3009 > Project: Cassandra > Issue Type: Bug > Components: Documentation & website >Affects Versions: 0.8.3 > Environment: ubuntu maverick 64-bit >Reporter: Chris Lohfink >Priority: Minor > Attachments: fabfile.py > > > First bug report on here so sorry if I am doing something incorrectly. I > followed the wiki (http://wiki.apache.org/cassandra/DebianPackaging) but I am > receiving a 404 error during the install. Looks like the > {code} > clohfink@roc-lvm-dev:~dev$ sudo apt-get install cassandra > [sudo] password for clohfink: > Reading package lists... Done > Building dependency tree > Reading state information... Done > The following packages were automatically installed and are no longer > required: > libcommons-pool-java authbind libmcrypt4 libtomcat6-java > libcommons-dbcp-java tomcat6-common > Use 'apt-get autoremove' to remove them. > The following NEW packages will be installed: > cassandra > 0 upgraded, 1 newly installed, 0 to remove and 66 not upgraded. > Need to get 8,415kB of archives. > After this operation, 9,540kB of additional disk space will be used. > Err http://www.apache.org/dist/cassandra/debian/ unstable/main cassandra all > 0.8.0 > 404 Not Found > Failed to fetch > http://www.apache.org/dist/cassandra/debian/pool/main/c/cassandra/cassandra_0.8.0_all.deb > 404 Not Found > E: Unable to fetch some archives, maybe run apt-get update or try with > --fix-missing? 
> {code} > for debugging info: > {code} > clohfink@roc-lvm-dev:~dev/fabrictests$ sudo apt-cache show cassandra > N: Can't select versions from package 'cassandra' as it purely virtual > N: No packages found > clohfink@roc-lvm-dev:~dev/fabrictests$ sudo add-apt-repository "deb > http://www.apache.org/dist/cassandra/debian unstable main" > clohfink@roc-lvm-dev:~dev/fabrictests$ sudo apt-get update > ... > Ign http://www.apache.org/dist/cassandra/debian/ unstable/main Translation-en > > Ign http://www.apache.org/dist/cassandra/debian/ unstable/main > Translation-en_US > ... > Hit http://us.archive.ubuntu.com maverick-proposed/universe amd64 Packages > Fetched 6,989B in 1s (5,974B/s) > Reading package lists... Done > clohfink@roc-lvm-dev:~dev/fabrictests$ sudo apt-cache show cassandra > Package: cassandra > Version: 0.8.0 > Architecture: all > Maintainer: Eric Evans > Installed-Size: 9316 > Depends: openjdk-6-jre-headless (>= 6b11) | java6-runtime, jsvc (>= 1.0), > libcommons-daemon-java (>= 1.0), adduser > Recommends: libjna-java > Homepage: http://cassandra.apache.org > Priority: extra > Section: misc > Filename: pool/main/c/cassandra/cassandra_0.8.0_all.deb > Size: 8415180 > SHA256: 7eaaeb9d3ef5af6abff834fe93f1a84349dff98776eaee83f8dabb267ffe4833 > SHA1: 9cca3ffbcbab9e6ba2385f734691c97afeaa8be6 > MD5sum: 01e0435495f7ff40e1b4e4be5857a1ea > Description: distributed storage system for structured data > Cassandra is a distributed (peer-to-peer) system for the management > and storage of structured data. > {code} > included fabric script, if have fabric installed can run > {code} > fab -H localhost install_cassandra > {code} -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (CASSANDRA-2919) CQL system test for counters is failing
[ https://issues.apache.org/jira/browse/CASSANDRA-2919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sylvain Lebresne resolved CASSANDRA-2919. - Resolution: Cannot Reproduce Ok, I cannot reproduce either anymore. Probably got fixed, or I screwed up the first time. Sorry for that. > CQL system test for counters is failing > --- > > Key: CASSANDRA-2919 > URL: https://issues.apache.org/jira/browse/CASSANDRA-2919 > Project: Cassandra > Issue Type: Bug > Components: Tests > Environment: ubuntu 11.04 64 bit >Reporter: Sylvain Lebresne >Assignee: Tyler Hobbs >Priority: Minor > Labels: cql, test > > On my machine (and on current 0.8 branch) the CQL system test for counters is > failing. While reading the counter value, junk bytes are apparently returned > instead of the value (on the following excerpt it looks like a empty value, > but on the terminal it does show a random character): > {noformat} > == > FAIL: update statement should be able to work with counter columns > -- > Traceback (most recent call last): > File "/usr/lib/pymodules/python2.7/nose/case.py", line 186, in runTest > self.test(*self.arg) > File "/home/pcmanus/Git/cassandra/test/system/test_cql.py", line 1130, in > test_counter_column_support > "unrecognized value '%s'" % r[1] > AssertionError: unrecognized value '' > -- > {noformat} > I've checked, the server correctly fetch the right column and return what it > should. So this seems to be on the python driver side. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-1717) Cassandra cannot detect corrupt-but-readable column data
[ https://issues.apache.org/jira/browse/CASSANDRA-1717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081773#comment-13081773 ] Hudson commented on CASSANDRA-1717: --- Integrated in Cassandra #1010 (See [https://builds.apache.org/job/Cassandra/1010/]) Add block level checksum for compressed data patch by Pavel Yaskevich; reviewed by Sylvain Lebresne for CASSANDRA-1717 xedin : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1155420 Files : * /cassandra/trunk/test/unit/org/apache/cassandra/Util.java * /cassandra/trunk/test/unit/org/apache/cassandra/io/compress/CompressedRandomAccessReaderTest.java * /cassandra/trunk/src/java/org/apache/cassandra/io/compress/CorruptedBlockException.java * /cassandra/trunk/CHANGES.txt * /cassandra/trunk/src/java/org/apache/cassandra/io/compress/CompressedRandomAccessReader.java * /cassandra/trunk/test/unit/org/apache/cassandra/io/util/BufferedRandomAccessFileTest.java * /cassandra/trunk/src/java/org/apache/cassandra/io/compress/CompressionMetadata.java * /cassandra/trunk/src/java/org/apache/cassandra/utils/FBUtilities.java * /cassandra/trunk/src/java/org/apache/cassandra/io/compress/CompressedSequentialWriter.java > Cassandra cannot detect corrupt-but-readable column data > > > Key: CASSANDRA-1717 > URL: https://issues.apache.org/jira/browse/CASSANDRA-1717 > Project: Cassandra > Issue Type: New Feature > Components: Core >Reporter: Jonathan Ellis >Assignee: Pavel Yaskevich > Fix For: 1.0 > > Attachments: CASSANDRA-1717-v2.patch, CASSANDRA-1717-v3.patch, > CASSANDRA-1717.patch, checksums.txt > > > Most corruptions of on-disk data due to bitrot render the column (or row) > unreadable, so the data can be replaced by read repair or anti-entropy. But > if the corruption keeps column data readable we do not detect it, and if it > corrupts to a higher timestamp value can even resist being overwritten by > newer values. -- This message is automatically generated by JIRA. 
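The mechanism committed for CASSANDRA-1717 — a checksum per block of compressed data, so that bitrot which leaves bytes "readable" is still caught on read — can be illustrated with a minimal sketch. This is not the actual CompressedRandomAccessReader code; block size and method names are chosen for the example.

```java
import java.util.Arrays;
import java.util.zip.CRC32;

// Illustrative sketch of block-level checksumming: compute one CRC32 per
// fixed-size block on the write path, recompute and compare on the read path.
public class BlockChecksum {
    static final int BLOCK_SIZE = 4; // tiny for demonstration; real blocks are KBs

    // Write path: one CRC32 per block.
    public static long[] checksums(byte[] data) {
        int blocks = (data.length + BLOCK_SIZE - 1) / BLOCK_SIZE;
        long[] crcs = new long[blocks];
        for (int i = 0; i < blocks; i++) {
            CRC32 crc = new CRC32();
            int from = i * BLOCK_SIZE;
            crc.update(data, from, Math.min(BLOCK_SIZE, data.length - from));
            crcs[i] = crc.getValue();
        }
        return crcs;
    }

    // Read path: any mismatch means the block is corrupt, even though the
    // bytes themselves still deserialize into a plausible-looking column.
    public static boolean verify(byte[] data, long[] expected) {
        return Arrays.equals(checksums(data), expected);
    }
}
```

The point of doing this per block rather than per file is that a read only has to verify the blocks it actually touches.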
[jira] [Commented] (CASSANDRA-2843) better performance on long row read
[ https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081772#comment-13081772 ] Hudson commented on CASSANDRA-2843: --- Integrated in Cassandra #1010 (See [https://builds.apache.org/job/Cassandra/1010/]) Make ColumnFamily backing column map pluggable and introduce unsynchronized ArrayList backed map for reads patch by slebresne; reviewed by jbellis for CASSANDRA-2843 slebresne : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1155426 Files : * /cassandra/trunk/src/java/org/apache/cassandra/db/filter/IFilter.java * /cassandra/trunk/src/java/org/apache/cassandra/db/filter/QueryFilter.java * /cassandra/trunk/src/java/org/apache/cassandra/db/columniterator/SSTableNamesIterator.java * /cassandra/trunk/src/java/org/apache/cassandra/db/CounterColumn.java * /cassandra/trunk/src/java/org/apache/cassandra/db/SuperColumn.java * /cassandra/trunk/src/java/org/apache/cassandra/db/filter/NamesQueryFilter.java * /cassandra/trunk/src/java/org/apache/cassandra/db/Table.java * /cassandra/trunk/src/java/org/apache/cassandra/service/RowRepairResolver.java * /cassandra/trunk/src/java/org/apache/cassandra/db/IColumnContainer.java * /cassandra/trunk/test/unit/org/apache/cassandra/db/ArrayBackedSortedColumnsTest.java * /cassandra/trunk/src/java/org/apache/cassandra/db/ArrayBackedSortedColumns.java * /cassandra/trunk/test/unit/org/apache/cassandra/db/ColumnFamilyStoreTest.java * /cassandra/trunk/src/java/org/apache/cassandra/db/AbstractColumnContainer.java * /cassandra/trunk/src/java/org/apache/cassandra/db/ThreadSafeSortedColumns.java * /cassandra/trunk/src/java/org/apache/cassandra/db/ColumnFamilyStore.java * /cassandra/trunk/src/java/org/apache/cassandra/thrift/CassandraServer.java * /cassandra/trunk/src/java/org/apache/cassandra/db/filter/SliceQueryFilter.java * /cassandra/trunk/src/java/org/apache/cassandra/db/CounterMutation.java * 
/cassandra/trunk/src/java/org/apache/cassandra/db/ColumnFamily.java * /cassandra/trunk/src/java/org/apache/cassandra/db/RowMutation.java * /cassandra/trunk/src/java/org/apache/cassandra/db/ISortedColumns.java * /cassandra/trunk/src/java/org/apache/cassandra/db/ReadResponse.java * /cassandra/trunk/CHANGES.txt * /cassandra/trunk/test/unit/org/apache/cassandra/db/RowTest.java * /cassandra/trunk/src/java/org/apache/cassandra/db/Row.java * /cassandra/trunk/test/unit/org/apache/cassandra/streaming/StreamingTransferTest.java * /cassandra/trunk/test/unit/org/apache/cassandra/service/AntiEntropyServiceTestAbstract.java * /cassandra/trunk/src/java/org/apache/cassandra/db/ColumnFamilySerializer.java > better performance on long row read > --- > > Key: CASSANDRA-2843 > URL: https://issues.apache.org/jira/browse/CASSANDRA-2843 > Project: Cassandra > Issue Type: New Feature >Reporter: Yang Yang >Assignee: Sylvain Lebresne > Fix For: 1.0 > > Attachments: 2843.patch, 2843_d.patch, 2843_g.patch, 2843_h.patch, > fix.diff, microBenchmark.patch, patch_timing, std_timing > > > currently if a row contains > 1000 columns, the run time becomes considerably > slow (my test of > a row with 30 00 columns (standard, regular) each with 8 bytes in name, and > 40 bytes in value, is about 16ms. > this is all running in memory, no disk read is involved. 
> through debugging we can find > most of this time is spent on > [Wall Time] org.apache.cassandra.db.Table.getRow(QueryFilter) > [Wall Time] > org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter, > ColumnFamily) > [Wall Time] > org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter, int, > ColumnFamily) > [Wall Time] > org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(QueryFilter, > int, ColumnFamily) > [Wall Time] > org.apache.cassandra.db.filter.QueryFilter.collectCollatedColumns(ColumnFamily, > Iterator, int) > [Wall Time] > org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(IColumnContainer, > Iterator, int) > [Wall Time] org.apache.cassandra.db.ColumnFamily.addColumn(IColumn) > ColumnFamily.addColumn() is slow because it inserts into an internal > concurrentSkipListMap() that maps column names to values. > this structure is slow for two reasons: it needs to do synchronization; it > needs to maintain a more complex structure of map. > but if we look at the whole read path, thrift already defines the read output > to be List so it does not make sense to use a luxury map > data structure in the interium and finally convert it to a list. on the > synchronization side, since the return CF is never going to be > shared/modified by other threads, we know the access is always single thread, > so no synchronization is needed. > but these 2 features are indeed needed for ColumnFamily in other cases, > parti
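The argument in the report above — the single-threaded read path ultimately returns a list, so a synchronized skip-list-backed map is wasted overhead — is the motivation for the ArrayBackedSortedColumns class in the patch. A rough sketch of the idea, with invented names rather than the patch's actual API:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Sketch of an unsynchronized, array-backed sorted column container for the
// read path: no locking, and O(1) appends when columns arrive already sorted,
// which is the common case when draining an sstable iterator.
public class ArrayBackedColumns {
    private final List<String> names = new ArrayList<>();

    public void addColumn(String name) {
        if (names.isEmpty() || names.get(names.size() - 1).compareTo(name) < 0) {
            names.add(name); // fast path: column is already in order, just append
            return;
        }
        // Out-of-order insert: binary search for the insertion point.
        int idx = Collections.binarySearch(names, name);
        if (idx < 0) names.add(-idx - 1, name);
        // duplicates ignored for brevity; the real code reconciles them
    }

    public List<String> columnNames() {
        return names;
    }
}
```

Compared with ConcurrentSkipListMap, this trades thread safety (not needed, since the result ColumnFamily is never shared across threads) for a flat array with much better cache behavior on long rows.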
[jira] [Updated] (CASSANDRA-2990) We should refuse query for counters at CL.ANY
[ https://issues.apache.org/jira/browse/CASSANDRA-2990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sylvain Lebresne updated CASSANDRA-2990: Attachment: 2990.patch > We should refuse query for counters at CL.ANY > - > > Key: CASSANDRA-2990 > URL: https://issues.apache.org/jira/browse/CASSANDRA-2990 > Project: Cassandra > Issue Type: Bug >Affects Versions: 0.8.0 >Reporter: Sylvain Lebresne >Assignee: Sylvain Lebresne >Priority: Trivial > Labels: counters > Fix For: 0.8.4 > > Attachments: 2990.patch > > > We currently do not reject writes for counters at CL.ANY, even though this is > not supported (and rightly so). -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
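The fix attached to CASSANDRA-2990 amounts to an up-front validation: counter writes cannot be hinted, so ConsistencyLevel.ANY must be rejected before the mutation is applied. A minimal sketch of that check (enum trimmed to what the example needs; not the patch's actual code):

```java
// Sketch of rejecting counter writes at CL.ANY: since a hinted write has no
// live replica to read the counter back from, ANY cannot be supported.
public class CounterValidation {
    public enum ConsistencyLevel { ANY, ONE, QUORUM, ALL }

    public static void validateCounterWrite(ConsistencyLevel cl) {
        if (cl == ConsistencyLevel.ANY)
            throw new IllegalArgumentException(
                "consistency level ANY is not supported for counter writes");
    }
}
```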
[jira] [Updated] (CASSANDRA-2892) Don't "replicate_on_write" with RF=1
[ https://issues.apache.org/jira/browse/CASSANDRA-2892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-2892: -- Attachment: 2892-v1.5.txt v1.5 attached. I thought I could improve it more, but couldn't. :) Ended up just extracting counterWriteTask() to remove the executeOnMutationStage flag. > Don't "replicate_on_write" with RF=1 > > > Key: CASSANDRA-2892 > URL: https://issues.apache.org/jira/browse/CASSANDRA-2892 > Project: Cassandra > Issue Type: Improvement > Components: Core >Affects Versions: 0.8.0 >Reporter: Sylvain Lebresne >Assignee: Sylvain Lebresne >Priority: Trivial > Labels: counters > Fix For: 0.8.4 > > Attachments: 2892-v1.5.txt, 2892.patch > > > For counters with RF=1, we still do a read to replicate, even though there is > nothing to replicate it too. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-2892) Don't "replicate_on_write" with RF=1
[ https://issues.apache.org/jira/browse/CASSANDRA-2892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081802#comment-13081802 ] Sylvain Lebresne commented on CASSANDRA-2892: - v1.5 lgtm > Don't "replicate_on_write" with RF=1 > > > Key: CASSANDRA-2892 > URL: https://issues.apache.org/jira/browse/CASSANDRA-2892 > Project: Cassandra > Issue Type: Improvement > Components: Core >Affects Versions: 0.8.0 >Reporter: Sylvain Lebresne >Assignee: Sylvain Lebresne >Priority: Trivial > Labels: counters > Fix For: 0.8.4 > > Attachments: 2892-v1.5.txt, 2892.patch > > > For counters with RF=1, we still do a read to replicate, even though there is > nothing to replicate it too. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
svn commit: r1155460 - in /cassandra/branches/cassandra-0.8: CHANGES.txt src/java/org/apache/cassandra/service/StorageProxy.java
Author: jbellis Date: Tue Aug 9 18:37:20 2011 New Revision: 1155460 URL: http://svn.apache.org/viewvc?rev=1155460&view=rev Log: avoid doing read forno-op replicate-on-write at CL=1 patch by slebresne and jbellis for CASSANDRA-2892 Modified: cassandra/branches/cassandra-0.8/CHANGES.txt cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/service/StorageProxy.java Modified: cassandra/branches/cassandra-0.8/CHANGES.txt URL: http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.8/CHANGES.txt?rev=1155460&r1=1155459&r2=1155460&view=diff == --- cassandra/branches/cassandra-0.8/CHANGES.txt (original) +++ cassandra/branches/cassandra-0.8/CHANGES.txt Tue Aug 9 18:37:20 2011 @@ -1,6 +1,7 @@ 0.8.4 * include files-to-be-streamed in StreamInSession.getSources (CASSANDRA-2972) * use JAVA env var in cassandra-env.sh (CASSANDRA-2785, 2992) + * avoid doing read for no-op replicate-on-write at CL=1 (CASSANDRA-2892) 0.8.3 Modified: cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/service/StorageProxy.java URL: http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/service/StorageProxy.java?rev=1155460&r1=1155459&r2=1155460&view=diff == --- cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/service/StorageProxy.java (original) +++ cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/service/StorageProxy.java Tue Aug 9 18:37:20 2011 @@ -96,7 +96,7 @@ public class StorageProxy implements Sto public void apply(IMutation mutation, Multimap hintedEndpoints, IWriteResponseHandler responseHandler, String localDataCenter, ConsistencyLevel consistency_level) throws IOException { assert mutation instanceof RowMutation; -sendToHintedEndpoints((RowMutation) mutation, hintedEndpoints, responseHandler, localDataCenter, true, consistency_level); +sendToHintedEndpoints((RowMutation) mutation, hintedEndpoints, responseHandler, localDataCenter, consistency_level); } }; @@ -110,7 +110,11 @@ public class StorageProxy 
implements Sto { public void apply(IMutation mutation, Multimap hintedEndpoints, IWriteResponseHandler responseHandler, String localDataCenter, ConsistencyLevel consistency_level) throws IOException { -applyCounterMutation(mutation, hintedEndpoints, responseHandler, localDataCenter, consistency_level, false); +if (logger.isDebugEnabled()) +logger.debug("insert writing local & replicate " + mutation.toString(true)); + +Runnable runnable = counterWriteTask(mutation, hintedEndpoints, responseHandler, localDataCenter, consistency_level); +runnable.run(); } }; @@ -118,7 +122,11 @@ public class StorageProxy implements Sto { public void apply(IMutation mutation, Multimap hintedEndpoints, IWriteResponseHandler responseHandler, String localDataCenter, ConsistencyLevel consistency_level) throws IOException { -applyCounterMutation(mutation, hintedEndpoints, responseHandler, localDataCenter, consistency_level, true); +if (logger.isDebugEnabled()) +logger.debug("insert writing local & replicate " + mutation.toString(true)); + +Runnable runnable = counterWriteTask(mutation, hintedEndpoints, responseHandler, localDataCenter, consistency_level); +StageManager.getStage(Stage.MUTATION).execute(runnable); } }; } @@ -218,7 +226,7 @@ public class StorageProxy implements Sto return ss.getTokenMetadata().getWriteEndpoints(StorageService.getPartitioner().getToken(key), table, naturalEndpoints); } -private static void sendToHintedEndpoints(final RowMutation rm, Multimap hintedEndpoints, IWriteResponseHandler responseHandler, String localDataCenter, boolean insertLocalMessages, ConsistencyLevel consistency_level) +private static void sendToHintedEndpoints(final RowMutation rm, Multimap hintedEndpoints, IWriteResponseHandler responseHandler, String localDataCenter, ConsistencyLevel consistency_level) throws IOException { // Multimap that holds onto all the messages and addresses meant for a specific datacenter @@ -237,8 +245,7 @@ public class StorageProxy implements Sto // unhinted writes if 
(destination.equals(FBUtilities.getLocalAddress())) { -if (insertLocalMessages) -insertLocal(rm, responseHandler); +insertLocal(rm, responseHandler); } else { @@ -425,13 +432,9 @@ public class Storag
[jira] [Resolved] (CASSANDRA-2892) Don't "replicate_on_write" with RF=1
[ https://issues.apache.org/jira/browse/CASSANDRA-2892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis resolved CASSANDRA-2892. --- Resolution: Fixed Reviewer: jbellis committed > Don't "replicate_on_write" with RF=1 > > > Key: CASSANDRA-2892 > URL: https://issues.apache.org/jira/browse/CASSANDRA-2892 > Project: Cassandra > Issue Type: Improvement > Components: Core >Affects Versions: 0.8.0 >Reporter: Sylvain Lebresne >Assignee: Sylvain Lebresne >Priority: Trivial > Labels: counters > Fix For: 0.8.4 > > Attachments: 2892-v1.5.txt, 2892.patch > > > For counters with RF=1, we still do a read to replicate, even though there is > nothing to replicate it too. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
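The decision committed here for CASSANDRA-2892 boils down to one predicate: after applying a counter mutation locally, the read-then-replicate round only makes sense when another replica exists. A hedged sketch of that decision (names illustrative, not the StorageProxy code):

```java
// Sketch of the RF=1 short-circuit: with a single replica there is nobody to
// replicate the counter value to, so the read-before-replicate is a no-op
// and can be skipped entirely.
public class ReplicateOnWrite {
    // replicationFactor includes the local replica.
    public static boolean shouldReplicateOnWrite(boolean replicateOnWriteEnabled,
                                                 int replicationFactor) {
        return replicateOnWriteEnabled && replicationFactor > 1;
    }
}
```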
svn commit: r1155466 - in /cassandra/trunk: ./ contrib/ debian/ interface/thrift/gen-java/org/apache/cassandra/thrift/ redhat/ src/java/org/apache/cassandra/cli/ src/java/org/apache/cassandra/service/
Author: jbellis Date: Tue Aug 9 18:40:54 2011 New Revision: 1155466 URL: http://svn.apache.org/viewvc?rev=1155466&view=rev Log: merge from 0.8 Modified: cassandra/trunk/ (props changed) cassandra/trunk/CHANGES.txt cassandra/trunk/contrib/ (props changed) cassandra/trunk/debian/control cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java (props changed) cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java (props changed) cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/InvalidRequestException.java (props changed) cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/NotFoundException.java (props changed) cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/SuperColumn.java (props changed) cassandra/trunk/redhat/cassandra cassandra/trunk/src/java/org/apache/cassandra/cli/Cli.g cassandra/trunk/src/java/org/apache/cassandra/cli/CliClient.java cassandra/trunk/src/java/org/apache/cassandra/cli/CliCompleter.java cassandra/trunk/src/java/org/apache/cassandra/service/StorageProxy.java cassandra/trunk/src/java/org/apache/cassandra/service/StorageService.java cassandra/trunk/src/resources/org/apache/cassandra/cli/CliHelp.yaml cassandra/trunk/test/unit/org/apache/cassandra/cli/CliTest.java Propchange: cassandra/trunk/ -- --- svn:mergeinfo (original) +++ svn:mergeinfo Tue Aug 9 18:40:54 2011 @@ -1,7 +1,7 @@ /cassandra/branches/cassandra-0.6:922689-1052356,1052358-1053452,1053454,1053456-1131291 /cassandra/branches/cassandra-0.7:1026516-1151306 /cassandra/branches/cassandra-0.7.0:1053690-1055654 -/cassandra/branches/cassandra-0.8:1090934-1125013,1125019-1154424 +/cassandra/branches/cassandra-0.8:1090934-1125013,1125019-1155460 /cassandra/branches/cassandra-0.8.0:1125021-1130369 /cassandra/branches/cassandra-0.8.1:1101014-1125018 /cassandra/tags/cassandra-0.7.0-rc3:1051699-1053689 Modified: cassandra/trunk/CHANGES.txt URL: 
http://svn.apache.org/viewvc/cassandra/trunk/CHANGES.txt?rev=1155466&r1=1155465&r2=1155466&view=diff == --- cassandra/trunk/CHANGES.txt (original) +++ cassandra/trunk/CHANGES.txt Tue Aug 9 18:40:54 2011 @@ -33,6 +33,8 @@ 0.8.4 * include files-to-be-streamed in StreamInSession.getSources (CASSANDRA-2972) + * use JAVA env var in cassandra-env.sh (CASSANDRA-2785, 2992) + * avoid doing read for no-op replicate-on-write at CL=1 (CASSANDRA-2892) 0.8.3 Propchange: cassandra/trunk/contrib/ -- --- svn:mergeinfo (original) +++ svn:mergeinfo Tue Aug 9 18:40:54 2011 @@ -1,7 +1,7 @@ /cassandra/branches/cassandra-0.6/contrib:922689-1052356,1052358-1053452,1053454,1053456-1068009 /cassandra/branches/cassandra-0.7/contrib:1026516-1151306 /cassandra/branches/cassandra-0.7.0/contrib:1053690-1055654 -/cassandra/branches/cassandra-0.8/contrib:1090934-1125013,1125019-1154424 +/cassandra/branches/cassandra-0.8/contrib:1090934-1125013,1125019-1155460 /cassandra/branches/cassandra-0.8.0/contrib:1125021-1130369 /cassandra/branches/cassandra-0.8.1/contrib:1101014-1125018 /cassandra/tags/cassandra-0.7.0-rc3/contrib:1051699-1053689 Modified: cassandra/trunk/debian/control URL: http://svn.apache.org/viewvc/cassandra/trunk/debian/control?rev=1155466&r1=1155465&r2=1155466&view=diff == --- cassandra/trunk/debian/control (original) +++ cassandra/trunk/debian/control Tue Aug 9 18:40:54 2011 @@ -2,7 +2,7 @@ Source: cassandra Section: misc Priority: extra Maintainer: Eric Evans -Build-Depends: debhelper (>= 5), openjdk-6-jdk (>= 6b11) | java6-sdk, ant (>= 1.7), ant-optional (>= 1.7) +Build-Depends: debhelper (>= 5), openjdk-6-jdk (>= 6b11) | java6-sdk, ant (>= 1.7), ant-optional (>= 1.7), subversion Homepage: http://cassandra.apache.org Vcs-Svn: https://svn.apache.org/repos/asf/cassandra/trunk Vcs-Browser: http://svn.apache.org/viewvc/cassandra/trunk Propchange: cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java -- --- svn:mergeinfo (original) +++ svn:mergeinfo Tue 
Aug 9 18:40:54 2011 @@ -1,7 +1,7 @@ /cassandra/branches/cassandra-0.6/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:922689-1052356,1052358-1053452,1053454,1053456-1131291 /cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1026516-1151306 /cassandra/branches/cassandra-0.7.0/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1053690-1055654 -/cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thr
[jira] [Commented] (CASSANDRA-2990) We should refuse query for counters at CL.ANY
[ https://issues.apache.org/jira/browse/CASSANDRA-2990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081831#comment-13081831 ] Jonathan Ellis commented on CASSANDRA-2990: --- A few days ago, you said, "A counter mutation only lives long enough to be applied to the first replica. Once this is done, a *row* mutation is generated for the other replicas. That second mutation can be hinted. But that is a row mutation, so there should be no special casing at all for that." Why can't we hint the first replica? > We should refuse query for counters at CL.ANY > - > > Key: CASSANDRA-2990 > URL: https://issues.apache.org/jira/browse/CASSANDRA-2990 > Project: Cassandra > Issue Type: Bug >Affects Versions: 0.8.0 >Reporter: Sylvain Lebresne >Assignee: Sylvain Lebresne >Priority: Trivial > Labels: counters > Fix For: 0.8.4 > > Attachments: 2990.patch > > > We currently do not reject writes for counters at CL.ANY, even though this is > not supported (and rightly so). -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
buildbot failure in ASF Buildbot on cassandra-trunk
The Buildbot has detected a new failure on builder cassandra-trunk while building ASF Buildbot. Full details are available at: http://ci.apache.org/builders/cassandra-trunk/builds/1503 Buildbot URL: http://ci.apache.org/ Buildslave for this Build: isis_ubuntu Build Reason: scheduler Build Source Stamp: [branch cassandra/trunk] 1155466 Blamelist: jbellis BUILD FAILED: failed compile sincerely, -The Buildbot
[jira] [Commented] (CASSANDRA-2868) Native Memory Leak
[ https://issues.apache.org/jira/browse/CASSANDRA-2868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081834#comment-13081834 ] Brandon Williams commented on CASSANDRA-2868: - bq. Wouldn't it be worth indicating how many collections have been done since the last log message if it's > 1, since it can be > 1. The only reason I added count tracking was to prevent it from firing when there were no GCs (the api is flakey.) I've never actually been able to get > 1 to happen, but we can add it to the logging. bq. IMO the duration-based thresholds are hard to reason about here, where we're dealing w/ summaries and not individual GC results. We are dealing with individual GCs at least 99% of the time in practice. The worst case is >1 GC inflates the gctime enough that we errantly log when it's not needed, but I imagine to trigger that you would have to be in a gc pressure situation already. bq. I think I'd rather have something like the dropped messages logger, where every N seconds we log the summary we get from the mbean. That seems like it could be a lot of noise since GC is constantly happening. bq. The flushLargestMemtables/reduceCacheSizes stuff should probably be removed. I think the logic there is still sound ("Did we just do a CMS? Is the heap still 80% full?") and it seems to work as well as it always has. > Native Memory Leak > -- > > Key: CASSANDRA-2868 > URL: https://issues.apache.org/jira/browse/CASSANDRA-2868 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Daniel Doubleday >Assignee: Brandon Williams >Priority: Minor > Fix For: 0.8.4 > > Attachments: 2868-v1.txt, 2868-v2.txt, 48hour_RES.png, > low-load-36-hours-initial-results.png > > > We have memory issues with long running servers. These have been confirmed by > several users in the user list. That's why I report.
> The memory consumption of the cassandra java process increases steadily until > it's killed by the os because of oom (with no swap) > Our server is started with -Xmx3000M and running for around 23 days. > pmap -x shows > Total SST: 1961616 (mem mapped data and index files) > Anon RSS: 6499640 > Total RSS: 8478376 > This shows that > 3G are 'overallocated'. > We will use BRAF on one of our less important nodes to check whether it is > related to mmap and report back.
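The logging policy Brandon describes — stay quiet when no collection actually happened since the last poll, log when the summed GC time crosses a threshold, and escalate ("Did we just do a CMS? Is the heap still 80% full?") to flushing memtables — can be restated as a small decision function. This is an illustrative Python sketch under those stated rules; the function name and thresholds are invented for the example and are not Cassandra's actual GCInspector code:

```python
# Illustrative sketch of the GC-summary logging policy discussed above.
# Names and thresholds are hypothetical, not Cassandra's GCInspector.

def gc_log_decision(prev_count, count, duration_ms, heap_used, heap_max,
                    threshold_ms=200, flush_fraction=0.80):
    """Return (should_log, should_flush) for one poll of the GC MBean."""
    delta = count - prev_count
    if delta == 0:
        # No collections since the last poll: do nothing (this is the
        # count-tracking guard against the flaky MBean reporting).
        return False, False
    # When delta > 1, duration_ms summarizes several GCs and can inflate
    # past the threshold even if no single GC was slow -- the worst case
    # Brandon mentions.
    should_log = duration_ms > threshold_ms
    # Post-CMS escalation: heap still mostly full -> flush largest memtables.
    should_flush = heap_used > flush_fraction * heap_max
    return should_log, should_flush
```

The guard on `delta == 0` is why the count tracking exists at all; the duration check only ever sees summaries when more than one GC lands between polls.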
[jira] [Issue Comment Edited] (CASSANDRA-2868) Native Memory Leak
[ https://issues.apache.org/jira/browse/CASSANDRA-2868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081834#comment-13081834 ] Brandon Williams edited comment on CASSANDRA-2868 at 8/9/11 6:43 PM: - bq. Wouldn't it be worth indicating that how many collection have been done since last log message if it's > 1, since it can (be > 1). The only reason I added count tracking was to prevent it from firing when there were no GCs (the api is flakey.) I've never actually been able to get > 1 to happen, but we can add it to the logging. bq. IMO the duration-based thresholds are hard to reason about here, where we're dealing w/ summaries and not individual GC results. We are dealing with individual GCs at least 99% of the time in practice. The worst case is >1 GC inflates the gctime enough that we errantly log when it's not needed, but I imagine to trigger that you would have to be in a gc pressure situation already. bq. I think I'd rather have something like the dropped messages logger, where every N seconds we log the summary we get from the mbean. That seems like it could be a lot of noise since GC is constantly happening. bq. The flushLargestMemtables/reduceCacheSizes stuff should probably be removed. I think the logic there is still sound ("Did we just do a CMS? Is the heap still 80% full?") and it seems to work as well as it always has. was (Author: brandon.williams): bq. Wouldn't it be worth indicating that how many collection have been done since last log message if it's > 1, since it can (be > 1). The only reason I added count tracking was to prevent it from firing when there were no GCs (the api is flakey.) I've never actually been able to get > 1 to happen, but we can add it to the logging. bq. IMO the duration-based thresholds are hard to reason about here, where we're dealing w/ summaries and not individual GC results. We are dealing with individual GCs at least 99% of the time in practice. 
The worst case is >1 GC inflates the gctime enough that we errantly log when it's not needed, but I imagine to trigger that you would have to be in a gc pressure situation already. bq. I think I'd rather have something like the dropped messages logger, where every N seconds we log the summary we get from the mbean. That seems like it could a lot of noise since GC is constantly happening. bq. The flushLargestMemtables/reduceCacheSizes stuff should probably be removed. I think the logic there is still sound ("Did we just do a CMS? Is the heap still 80% full?") and it seems to work as well as it always has. > Native Memory Leak > -- > > Key: CASSANDRA-2868 > URL: https://issues.apache.org/jira/browse/CASSANDRA-2868 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Daniel Doubleday >Assignee: Brandon Williams >Priority: Minor > Fix For: 0.8.4 > > Attachments: 2868-v1.txt, 2868-v2.txt, 48hour_RES.png, > low-load-36-hours-initial-results.png > > > We have memory issues with long running servers. These have been confirmed by > several users in the user list. That's why I report. > The memory consumption of the cassandra java process increases steadily until > it's killed by the os because of oom (with no swap) > Our server is started with -Xmx3000M and running for around 23 days. > pmap -x shows > Total SST: 1961616 (mem mapped data and index files) > Anon RSS: 6499640 > Total RSS: 8478376 > This shows that > 3G are 'overallocated'. > We will use BRAF on one of our less important nodes to check wether it is > related to mmap and report back. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-2990) We should refuse query for counters at CL.ANY
[ https://issues.apache.org/jira/browse/CASSANDRA-2990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081854#comment-13081854 ] Sylvain Lebresne commented on CASSANDRA-2990: - bq. Why can't we hint the first replica? Well, actually I think we could. Or at least if we cannot, I forgot why. We would need to be sure we never replay a hint twice though, which I'm not sure is a guarantee right now. Also, we can only do this if what we store as a hint is the serialized mutation (in this case, the serialized CounterMutation): we can't apply the CounterMutation on a non-replica (partly because that would potentially increase the counter context too much, partly because counter removes suck, which would probably be a problem at some point). So it should be doable, but it's a bit of work. > We should refuse query for counters at CL.ANY > - > > Key: CASSANDRA-2990 > URL: https://issues.apache.org/jira/browse/CASSANDRA-2990 > Project: Cassandra > Issue Type: Bug >Affects Versions: 0.8.0 >Reporter: Sylvain Lebresne >Assignee: Sylvain Lebresne >Priority: Trivial > Labels: counters > Fix For: 0.8.4 > > Attachments: 2990.patch > > > We currently do not reject writes for counters at CL.ANY, even though this is > not supported (and rightly so).
[jira] [Commented] (CASSANDRA-2892) Don't "replicate_on_write" with RF=1
[ https://issues.apache.org/jira/browse/CASSANDRA-2892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081858#comment-13081858 ] Hudson commented on CASSANDRA-2892: --- Integrated in Cassandra-0.8 #264 (See [https://builds.apache.org/job/Cassandra-0.8/264/]) avoid doing read for no-op replicate-on-write at CL=1 patch by slebresne and jbellis for CASSANDRA-2892 jbellis : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1155460 Files : * /cassandra/branches/cassandra-0.8/CHANGES.txt * /cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/service/StorageProxy.java > Don't "replicate_on_write" with RF=1 > > > Key: CASSANDRA-2892 > URL: https://issues.apache.org/jira/browse/CASSANDRA-2892 > Project: Cassandra > Issue Type: Improvement > Components: Core >Affects Versions: 0.8.0 >Reporter: Sylvain Lebresne >Assignee: Sylvain Lebresne >Priority: Trivial > Labels: counters > Fix For: 0.8.4 > > Attachments: 2892-v1.5.txt, 2892.patch > > > For counters with RF=1, we still do a read to replicate, even though there is > nothing to replicate it to.
[jira] [Commented] (CASSANDRA-2990) We should refuse query for counters at CL.ANY
[ https://issues.apache.org/jira/browse/CASSANDRA-2990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081862#comment-13081862 ] Jonathan Ellis commented on CASSANDRA-2990: --- Okay, +1 on making the validation match what is actually currently supported (no ANY for counters), although I'd change "not supported" to "not yet supported." We can deal w/ adding ANY support if and when someone actually needs it. > We should refuse query for counters at CL.ANY > - > > Key: CASSANDRA-2990 > URL: https://issues.apache.org/jira/browse/CASSANDRA-2990 > Project: Cassandra > Issue Type: Bug >Affects Versions: 0.8.0 >Reporter: Sylvain Lebresne >Assignee: Sylvain Lebresne >Priority: Trivial > Labels: counters > Fix For: 0.8.4 > > Attachments: 2990.patch > > > We currently do not reject writes for counters at CL.ANY, even though this is > not supported (and rightly so).
[jira] [Resolved] (CASSANDRA-2518) invalid column name length 0
[ https://issues.apache.org/jira/browse/CASSANDRA-2518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis resolved CASSANDRA-2518. --- Resolution: Duplicate probably CASSANDRA-2675, fixed in 0.7.7 > invalid column name length 0 > > > Key: CASSANDRA-2518 > URL: https://issues.apache.org/jira/browse/CASSANDRA-2518 > Project: Cassandra > Issue Type: Bug >Affects Versions: 0.7.3 > Environment: three nodes, > JVM: > -XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=42 -Xms6G -Xmx6G -Xmn2400M > -XX:+HeapDumpOnOutOfMemoryError -Xss128k -XX:+UseParNewGC > -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:SurvivorRatio=8 > -XX:MaxTenuringThreshold=1 -XX:CMSInitiatingOccupancyFraction=75 > -XX:+UseCMSInitiatingOccupancyOnly -Djava.net.preferIPv4Stack=true >Reporter: lichenglin > > one of the three nodes cassandra 0.7.3 report error after start up: > ERROR [CompactionExecutor:1] 2011-04-16 22:18:39,281 PrecompactedRow.java > (line 82) Skipping row DecoratedKey(3813860378406449638560060231106122758, > 79616e79776275636b65743030303030303030312f6f626a303030303030323534) in > /opt/cassandra/data/Keyspace/cf-f-4715-Data.db > org.apache.cassandra.db.ColumnSerializer$CorruptColumnException: invalid > column name length 0 > at > org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:68) > at > org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:35) > at > org.apache.cassandra.db.ColumnFamilySerializer.deserializeColumns(ColumnFamilySerializer.java:129) > at > org.apache.cassandra.io.sstable.SSTableIdentityIterator.getColumnFamilyWithColumns(SSTableIdentityIterator.java:176) > at > org.apache.cassandra.io.PrecompactedRow.(PrecompactedRow.java:78) > at > org.apache.cassandra.io.CompactionIterator.getCompactedRow(CompactionIterator.java:139) > at > org.apache.cassandra.io.CompactionIterator.getReduced(CompactionIterator.java:108) > at > 
org.apache.cassandra.io.CompactionIterator.getReduced(CompactionIterator.java:43) > at > org.apache.cassandra.utils.ReducingIterator.computeNext(ReducingIterator.java:73) > at > com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:136) > at > com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:131) > at > org.apache.commons.collections.iterators.FilterIterator.setNextObject(FilterIterator.java:183) > at > org.apache.commons.collections.iterators.FilterIterator.hasNext(FilterIterator.java:94) > at > org.apache.cassandra.db.CompactionManager.doCompaction(CompactionManager.java:449) > at > org.apache.cassandra.db.CompactionManager$1.call(CompactionManager.java:124) > at > org.apache.cassandra.db.CompactionManager$1.call(CompactionManager.java:94) > at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) > at java.util.concurrent.FutureTask.run(FutureTask.java:138) > at > java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) > at java.lang.Thread.run(Thread.java:619) > and few minutes later, > ERROR [CompactionExecutor:1] 2011-04-16 22:20:20,073 > AbstractCassandraDaemon.java (line 114) Fatal exception in thread > Thread[CompactionExecutor:1,1,main] > java.lang.OutOfMemoryError: Java heap space > at > org.apache.cassandra.io.util.BufferedRandomAccessFile.readBytes(BufferedRandomAccessFile.java:267) > at > org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:310) > at > org.apache.cassandra.utils.ByteBufferUtil.readWithLength(ByteBufferUtil.java:267) > at > org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:94) > at > org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:35) > at > org.apache.cassandra.db.ColumnFamilySerializer.deserializeColumns(ColumnFamilySerializer.java:129) > at > 
org.apache.cassandra.io.sstable.SSTableIdentityIterator.getColumnFamilyWithColumns(SSTableIdentityIterator.java:176) > at > org.apache.cassandra.io.PrecompactedRow.<init>(PrecompactedRow.java:78) > at > org.apache.cassandra.io.CompactionIterator.getCompactedRow(CompactionIterator.java:139) > at > org.apache.cassandra.io.CompactionIterator.getReduced(CompactionIterator.java:108) > at > org.apache.cassandra.io.CompactionIterator.getReduced(CompactionIterator.java:43) > at > org.apache.cassandra.utils.ReducingIterator.computeNext(ReducingIterator.java:73) > at > com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:136)
[Cassandra Wiki] Trivial Update of "Committers" by JonathanEllis
Dear Wiki user, You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for change notification. The "Committers" page has been changed by JonathanEllis: http://wiki.apache.org/cassandra/Committers?action=diff&rev1=15&rev2=16 Comment: update release manager ||Avinash Lakshman||Jan 2009||Facebook||Co-author of Facebook Cassandra|| ||Prashant Malik||Jan 2009||Facebook||Co-author of Facebook Cassandra|| ||Jonathan Ellis||Mar 2009||Datastax||Project chair|| - ||Eric Evans||Jun 2009||Rackspace||PMC member, Release manager, Debian packager|| + ||Eric Evans||Jun 2009||Rackspace||PMC member, Debian packager|| ||Jun Rao||Jun 2009||!LinkedIn||PMC member|| ||Chris Goffinet||Sept 2009||Twitter||PMC member|| ||Johan Oskarsson||Nov 2009||Twitter||Also a [[http://hadoop.apache.org/|Hadoop]] committer|| @@ -12, +12 @@ ||Jaakko Laine||Dec 2009||?|| || ||Brandon Williams||Jun 2010||Datastax||PMC member|| ||Jake Luciani||Jan 2011||Datastax||Also a [[http://thrift.apache.org/|Thrift]] committer|| - ||Sylvain Lebresne||Mar 2011||Datastax||PMC member|| + ||Sylvain Lebresne||Mar 2011||Datastax||PMC member, Release manager|| ||Pavel Yaskevich||Aug 2011||Datastax|| ||
[jira] [Commented] (CASSANDRA-2993) Issues with parameters being escaped correctly in Python CQL
[ https://issues.apache.org/jira/browse/CASSANDRA-2993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081879#comment-13081879 ] Blake Visin commented on CASSANDRA-2993: Works for me too. Thanks Tyler! > Issues with parameters being escaped correctly in Python CQL > > > Key: CASSANDRA-2993 > URL: https://issues.apache.org/jira/browse/CASSANDRA-2993 > Project: Cassandra > Issue Type: Bug > Environment: Python CQL >Reporter: Blake Visin >Assignee: Tyler Hobbs > Labels: CQL, parameter, python > Attachments: 2993-cql-grammar.txt, 2993-pycql.txt, > 2993-system-test.txt > > > When using parameterised queries in Python CQL strings are not being escaped > correctly. > Query and Parameters: > {code} > 'UPDATE sites SET :col = :val WHERE KEY = :site_id' > {'col': 'feed_stats:1312493736688033024', > 'site_id': '899d15e8-bd4a-11e0-bc8c-001fe14cba06', > 'val': > "(dp0\nS'1'\np1\n(lp2\nI1\naI2\naI3\naI4\nasS'0'\np3\n(lp4\nI1\naI2\naI3\naI4\nasS'3'\np5\n(lp6\nI1\naI2\naI3\naI4\nasS'2'\np7\n(lp8\nI1\naI2\naI3\naI4\nas."} > {code} > Query trying to be executed after processing parameters > {code} > "UPDATE sites SET 'feed_stats:1312493736688033024' = > '(dp0\nS''1''\np1\n(lp2\nI1\naI2\naI3\naI4\nasS''0''\np3\n(lp4\nI1\naI2\naI3\naI4\nasS''3''\np5\n(lp6\nI1\naI2\naI3\naI4\nasS''2''\np7\n(lp8\nI1\naI2\naI3\naI4\nas.' > WHERE KEY = '899d15e8-bd4a-11e0-bc8c-001fe14cba06'" > {code} -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CASSANDRA-2993) Issues with parameters being escaped correctly in Python CQL
[ https://issues.apache.org/jira/browse/CASSANDRA-2993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-2993: -- Reviewer: xedin > Issues with parameters being escaped correctly in Python CQL > > > Key: CASSANDRA-2993 > URL: https://issues.apache.org/jira/browse/CASSANDRA-2993 > Project: Cassandra > Issue Type: Bug > Environment: Python CQL >Reporter: Blake Visin >Assignee: Tyler Hobbs > Labels: CQL, parameter, python > Attachments: 2993-cql-grammar.txt, 2993-pycql.txt, > 2993-system-test.txt > > > When using parameterised queries in Python CQL strings are not being escaped > correctly. > Query and Parameters: > {code} > 'UPDATE sites SET :col = :val WHERE KEY = :site_id' > {'col': 'feed_stats:1312493736688033024', > 'site_id': '899d15e8-bd4a-11e0-bc8c-001fe14cba06', > 'val': > "(dp0\nS'1'\np1\n(lp2\nI1\naI2\naI3\naI4\nasS'0'\np3\n(lp4\nI1\naI2\naI3\naI4\nasS'3'\np5\n(lp6\nI1\naI2\naI3\naI4\nasS'2'\np7\n(lp8\nI1\naI2\naI3\naI4\nas."} > {code} > Query trying to be executed after processing parameters > {code} > "UPDATE sites SET 'feed_stats:1312493736688033024' = > '(dp0\nS''1''\np1\n(lp2\nI1\naI2\naI3\naI4\nasS''0''\np3\n(lp4\nI1\naI2\naI3\naI4\nasS''3''\np5\n(lp6\nI1\naI2\naI3\naI4\nasS''2''\np7\n(lp8\nI1\naI2\naI3\naI4\nas.' > WHERE KEY = '899d15e8-bd4a-11e0-bc8c-001fe14cba06'" > {code} -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
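The doubled quotes (`''`) visible in the failing query above are CQL's escape convention for a single quote inside a string literal, applied here to a pickled payload full of apostrophes. A minimal sketch of that quoting rule and a naive named-parameter substitution follows; the helper names are hypothetical and this is not the python-cql driver's actual implementation:

```python
# Sketch of CQL string-literal quoting: an embedded single quote is
# escaped by doubling it. Helper names here are illustrative only.

def cql_quote(value):
    """Render a Python string as a CQL string literal."""
    return "'" + str(value).replace("'", "''") + "'"

def substitute(query, params):
    """Naive :name substitution, quoting every parameter as a string.

    Real drivers must also avoid replacing :name occurrences inside
    already-quoted literals; this sketch ignores that subtlety.
    """
    for name, value in params.items():
        query = query.replace(":" + name, cql_quote(value))
    return query
```

For example, `substitute("UPDATE sites SET :col = :val WHERE KEY = :key", {...})` with a value containing an apostrophe yields a literal like `'it''s'`, which is what the escaped pickle in the bug report shows.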
[jira] [Updated] (CASSANDRA-2325) invalidateKeyCache / invalidateRowCache should remove saved cache files from disk
[ https://issues.apache.org/jira/browse/CASSANDRA-2325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Edward Capriolo updated CASSANDRA-2325: --- Attachment: cassandra-2325.patch.2.txt > invalidateKeyCache / invalidateRowCache should remove saved cache files from > disk > - > > Key: CASSANDRA-2325 > URL: https://issues.apache.org/jira/browse/CASSANDRA-2325 > Project: Cassandra > Issue Type: Improvement >Affects Versions: 0.7.8, 0.8.2 >Reporter: Matthew F. Dennis >Assignee: Edward Capriolo >Priority: Minor > Attachments: cassandra-2325-1.patch.txt, cassandra-2325.patch.2.txt > > > the invalidate[Key|Row]Cache calls don't remove the saved caches from disk. > It seems logical that if you are clearing the caches you don't expect them to > be reinstantiated with the old values the next time C* starts. > This is not a huge issue since next time the caches are saved the old values > will be removed. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
svn commit: r1155544 - /cassandra/trunk/src/java/org/apache/cassandra/cli/CliClient.java
Author: jbellis Date: Tue Aug 9 20:18:47 2011 New Revision: 1155544 URL: http://svn.apache.org/viewvc?rev=1155544&view=rev Log: r/m merged reference to obsolete memtable_flush_after_mins Modified: cassandra/trunk/src/java/org/apache/cassandra/cli/CliClient.java Modified: cassandra/trunk/src/java/org/apache/cassandra/cli/CliClient.java URL: http://svn.apache.org/viewvc/cassandra/trunk/src/java/org/apache/cassandra/cli/CliClient.java?rev=1155544&r1=1155543&r2=1155544&view=diff == --- cassandra/trunk/src/java/org/apache/cassandra/cli/CliClient.java (original) +++ cassandra/trunk/src/java/org/apache/cassandra/cli/CliClient.java Tue Aug 9 20:18:47 2011 @@ -1671,7 +1671,6 @@ public class CliClient normaliseType(cfDef.key_validation_class, "org.apache.cassandra.db.marshal")); writeAttr(sb, false, "memtable_operations", cfDef.memtable_operations_in_millions); writeAttr(sb, false, "memtable_throughput", cfDef.memtable_throughput_in_mb); -writeAttr(sb, false, "memtable_flush_after", cfDef.memtable_flush_after_mins); writeAttr(sb, false, "rows_cached", cfDef.row_cache_size); writeAttr(sb, false, "row_cache_save_period", cfDef.row_cache_save_period_in_seconds); writeAttr(sb, false, "keys_cached", cfDef.key_cache_size);
buildbot success in ASF Buildbot on cassandra-trunk
The Buildbot has detected a restored build on builder cassandra-trunk while building ASF Buildbot. Full details are available at: http://ci.apache.org/builders/cassandra-trunk/builds/1504 Buildbot URL: http://ci.apache.org/ Buildslave for this Build: isis_ubuntu Build Reason: scheduler Build Source Stamp: [branch cassandra/trunk] 1155544 Blamelist: jbellis Build succeeded! sincerely, -The Buildbot
svn commit: r1155548 - in /cassandra/branches/cassandra-0.8: CHANGES.txt src/java/org/apache/cassandra/cql/UpdateStatement.java src/java/org/apache/cassandra/thrift/ThriftValidation.java test/system/t
Author: slebresne Date: Tue Aug 9 20:24:17 2011 New Revision: 1155548 URL: http://svn.apache.org/viewvc?rev=1155548&view=rev Log: Refuse counter write at CL.ANY patch by slebresne; reviewed by jbellis for CASSANDRA-2990 Modified: cassandra/branches/cassandra-0.8/CHANGES.txt cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/cql/UpdateStatement.java cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/thrift/ThriftValidation.java cassandra/branches/cassandra-0.8/test/system/test_cql.py cassandra/branches/cassandra-0.8/test/system/test_thrift_server.py Modified: cassandra/branches/cassandra-0.8/CHANGES.txt URL: http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.8/CHANGES.txt?rev=1155548&r1=1155547&r2=1155548&view=diff == --- cassandra/branches/cassandra-0.8/CHANGES.txt (original) +++ cassandra/branches/cassandra-0.8/CHANGES.txt Tue Aug 9 20:24:17 2011 @@ -2,6 +2,7 @@ * include files-to-be-streamed in StreamInSession.getSources (CASSANDRA-2972) * use JAVA env var in cassandra-env.sh (CASSANDRA-2785, 2992) * avoid doing read for no-op replicate-on-write at CL=1 (CASSANDRA-2892) + * refuse counter write for CL.ANY (CASSANDRA-2990) 0.8.3 Modified: cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/cql/UpdateStatement.java URL: http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/cql/UpdateStatement.java?rev=1155548&r1=1155547&r2=1155548&view=diff == --- cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/cql/UpdateStatement.java (original) +++ cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/cql/UpdateStatement.java Tue Aug 9 20:24:17 2011 @@ -39,6 +39,7 @@ import static org.apache.cassandra.cql.Q import static org.apache.cassandra.cql.Operation.OperationType; import static org.apache.cassandra.thrift.ThriftValidation.validateColumnFamily; +import static org.apache.cassandra.thrift.ThriftValidation.validateCommutativeForWrite; /** * An UPDATE statement parsed from a 
CQL query statement. @@ -142,6 +143,8 @@ public class UpdateStatement extends Abs } CFMetaData metadata = validateColumnFamily(keyspace, columnFamily, hasCommutativeOperation); +if (hasCommutativeOperation) +validateCommutativeForWrite(metadata, cLevel); QueryProcessor.validateKeyAlias(metadata, keyName); Modified: cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/thrift/ThriftValidation.java URL: http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/thrift/ThriftValidation.java?rev=1155548&r1=1155547&r2=1155548&view=diff == --- cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/thrift/ThriftValidation.java (original) +++ cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/thrift/ThriftValidation.java Tue Aug 9 20:24:17 2011 @@ -627,7 +627,11 @@ public class ThriftValidation public static void validateCommutativeForWrite(CFMetaData metadata, ConsistencyLevel consistency) throws InvalidRequestException { -if (!metadata.getReplicateOnWrite() && consistency != ConsistencyLevel.ONE) +if (consistency == ConsistencyLevel.ANY) +{ +throw new InvalidRequestException("Consistency level ANY is not yet supported for counter columnfamily " + metadata.cfName); +} +else if (!metadata.getReplicateOnWrite() && consistency != ConsistencyLevel.ONE) { throw new InvalidRequestException("cannot achieve CL > CL.ONE without replicate_on_write on columnfamily " + metadata.cfName); } Modified: cassandra/branches/cassandra-0.8/test/system/test_cql.py URL: http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.8/test/system/test_cql.py?rev=1155548&r1=1155547&r2=1155548&view=diff == --- cassandra/branches/cassandra-0.8/test/system/test_cql.py (original) +++ cassandra/branches/cassandra-0.8/test/system/test_cql.py Tue Aug 9 20:24:17 2011 @@ -1260,6 +1260,11 @@ class TestCql(ThriftTester): cursor.execute, "UPDATE CounterCF SET count_me = count_not_me + 2 WHERE key = 'counter1'") +# counters can't do ANY 
+assert_raises(cql.ProgrammingError, + cursor.execute, + "UPDATE CounterCF USING CONSISTENCY ANY SET count_me = count_me + 2 WHERE key = 'counter1'") + def test_key_alias_support(self): "should be possible to use alias instead of KEY keyword" cursor = init() Modified: cassandra/branches/cassandra-0.8/test/system/test_thrift_server.py URL: http://svn.a
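For readers skimming the diff, the check added to `validateCommutativeForWrite` boils down to two rules: reject CL.ANY for counters outright, and otherwise require replicate_on_write for any level above CL.ONE. Here is a Python paraphrase mirroring the Java patch (a sketch for clarity, not the shipped code; `ValueError` stands in for `InvalidRequestException`):

```python
# Python paraphrase of the ThriftValidation.validateCommutativeForWrite
# logic added by the CASSANDRA-2990 patch above. Illustrative only.

def validate_commutative_for_write(cf_name, consistency, replicate_on_write):
    """Raise ValueError for counter writes the patch rejects."""
    if consistency == "ANY":
        # New in this patch: counters cannot be hinted, so ANY is refused.
        raise ValueError(
            "Consistency level ANY is not yet supported for counter "
            "columnfamily " + cf_name)
    if not replicate_on_write and consistency != "ONE":
        # Pre-existing rule: without replicate_on_write, only CL.ONE works.
        raise ValueError(
            "cannot achieve CL > CL.ONE without replicate_on_write on "
            "columnfamily " + cf_name)
```

Note the ANY check fires regardless of replicate_on_write, which is why it is tested first in the Java version.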
svn commit: r1155549 - in /cassandra/trunk: ./ contrib/ interface/thrift/gen-java/org/apache/cassandra/thrift/ src/java/org/apache/cassandra/cql/ src/java/org/apache/cassandra/thrift/ test/system/
Author: slebresne Date: Tue Aug 9 20:26:07 2011 New Revision: 1155549 URL: http://svn.apache.org/viewvc?rev=1155549&view=rev Log: commit from 0.8 Modified: cassandra/trunk/ (props changed) cassandra/trunk/CHANGES.txt cassandra/trunk/contrib/ (props changed) cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java (props changed) cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java (props changed) cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/InvalidRequestException.java (props changed) cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/NotFoundException.java (props changed) cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/SuperColumn.java (props changed) cassandra/trunk/src/java/org/apache/cassandra/cql/UpdateStatement.java cassandra/trunk/src/java/org/apache/cassandra/thrift/ThriftValidation.java cassandra/trunk/test/system/test_cql.py cassandra/trunk/test/system/test_thrift_server.py Propchange: cassandra/trunk/ -- --- svn:mergeinfo (original) +++ svn:mergeinfo Tue Aug 9 20:26:07 2011 @@ -1,7 +1,7 @@ /cassandra/branches/cassandra-0.6:922689-1052356,1052358-1053452,1053454,1053456-1131291 /cassandra/branches/cassandra-0.7:1026516-1151306 /cassandra/branches/cassandra-0.7.0:1053690-1055654 -/cassandra/branches/cassandra-0.8:1090934-1125013,1125019-1155460 +/cassandra/branches/cassandra-0.8:1090934-1125013,1125019-1155460,1155548 /cassandra/branches/cassandra-0.8.0:1125021-1130369 /cassandra/branches/cassandra-0.8.1:1101014-1125018 /cassandra/tags/cassandra-0.7.0-rc3:1051699-1053689 Modified: cassandra/trunk/CHANGES.txt URL: http://svn.apache.org/viewvc/cassandra/trunk/CHANGES.txt?rev=1155549&r1=1155548&r2=1155549&view=diff == --- cassandra/trunk/CHANGES.txt (original) +++ cassandra/trunk/CHANGES.txt Tue Aug 9 20:26:07 2011 @@ -35,6 +35,7 @@ * include files-to-be-streamed in StreamInSession.getSources (CASSANDRA-2972) * use JAVA env var in 
cassandra-env.sh (CASSANDRA-2785, 2992) * avoid doing read for no-op replicate-on-write at CL=1 (CASSANDRA-2892) + * refuse counter write for CL.ANY (CASSANDRA-2990) 0.8.3 Propchange: cassandra/trunk/contrib/ -- --- svn:mergeinfo (original) +++ svn:mergeinfo Tue Aug 9 20:26:07 2011 @@ -1,7 +1,7 @@ /cassandra/branches/cassandra-0.6/contrib:922689-1052356,1052358-1053452,1053454,1053456-1068009 /cassandra/branches/cassandra-0.7/contrib:1026516-1151306 /cassandra/branches/cassandra-0.7.0/contrib:1053690-1055654 -/cassandra/branches/cassandra-0.8/contrib:1090934-1125013,1125019-1155460 +/cassandra/branches/cassandra-0.8/contrib:1090934-1125013,1125019-1155460,1155548 /cassandra/branches/cassandra-0.8.0/contrib:1125021-1130369 /cassandra/branches/cassandra-0.8.1/contrib:1101014-1125018 /cassandra/tags/cassandra-0.7.0-rc3/contrib:1051699-1053689 Propchange: cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java -- --- svn:mergeinfo (original) +++ svn:mergeinfo Tue Aug 9 20:26:07 2011 @@ -1,7 +1,7 @@ /cassandra/branches/cassandra-0.6/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:922689-1052356,1052358-1053452,1053454,1053456-1131291 /cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1026516-1151306 /cassandra/branches/cassandra-0.7.0/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1053690-1055654 -/cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1090934-1125013,1125019-1155460 +/cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1090934-1125013,1125019-1155460,1155548 /cassandra/branches/cassandra-0.8.0/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1125021-1130369 /cassandra/branches/cassandra-0.8.1/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1101014-1125018 
/cassandra/tags/cassandra-0.7.0-rc3/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1051699-1053689 Propchange: cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java -- --- svn:mergeinfo (original) +++ svn:mergeinfo Tue Aug 9 20:26:07 2011 @@ -1,7 +1,7 @@ /cassandra/branches/cassandra-0.6/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java:922689-1052356,1052358-1053452,1053454,1053456-1131291 /cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandr
[jira] [Commented] (CASSANDRA-3004) Once a message has been dropped, cassandra logs total messages dropped and tpstats every 5s forever
[ https://issues.apache.org/jira/browse/CASSANDRA-3004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081901#comment-13081901 ] Brandon Williams commented on CASSANDRA-3004: - +1 > Once a message has been dropped, cassandra logs total messages dropped and > tpstats every 5s forever > --- > > Key: CASSANDRA-3004 > URL: https://issues.apache.org/jira/browse/CASSANDRA-3004 > Project: Cassandra > Issue Type: Bug > Components: Core >Affects Versions: 0.8.3 >Reporter: Brandon Williams >Assignee: Jonathan Ellis >Priority: Minor > Labels: lhf > Fix For: 0.8.4 > > Attachments: 3004.txt > > -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
svn commit: r1155558 - in /cassandra/branches/cassandra-0.8: CHANGES.txt src/java/org/apache/cassandra/net/MessagingService.java
Author: jbellis Date: Tue Aug 9 20:47:35 2011 New Revision: 1155558 URL: http://svn.apache.org/viewvc?rev=1155558&view=rev Log: switch back to only logging recent dropped messages patch by jbellis; reviewed by brandonwilliams for CASSANDRA-3004 Modified: cassandra/branches/cassandra-0.8/CHANGES.txt cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/net/MessagingService.java Modified: cassandra/branches/cassandra-0.8/CHANGES.txt URL: http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.8/CHANGES.txt?rev=1155558&r1=1155557&r2=1155558&view=diff
==============================================================================
--- cassandra/branches/cassandra-0.8/CHANGES.txt (original)
+++ cassandra/branches/cassandra-0.8/CHANGES.txt Tue Aug 9 20:47:35 2011
@@ -3,6 +3,7 @@
  * use JAVA env var in cassandra-env.sh (CASSANDRA-2785, 2992)
  * avoid doing read for no-op replicate-on-write at CL=1 (CASSANDRA-2892)
  * refuse counter write for CL.ANY (CASSANDRA-2990)
+ * switch back to only logging recent dropped messages (CASSANDRA-3004)
 0.8.3

Modified: cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/net/MessagingService.java URL: http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/net/MessagingService.java?rev=1155558&r1=1155557&r2=1155558&view=diff
==============================================================================
--- cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/net/MessagingService.java (original)
+++ cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/net/MessagingService.java Tue Aug 9 20:47:35 2011
@@ -100,18 +100,11 @@ public final class MessagingService impl
     private final Map<StorageService.Verb, AtomicInteger> droppedMessages = new EnumMap<StorageService.Verb, AtomicInteger>(StorageService.Verb.class);
     // dropped count when last requested for the Recent api. high concurrency isn't necessary here.
     private final Map<StorageService.Verb, Integer> lastDropped = Collections.synchronizedMap(new EnumMap<StorageService.Verb, Integer>(StorageService.Verb.class));
+    private final Map<StorageService.Verb, Integer> lastDroppedInternal = new EnumMap<StorageService.Verb, Integer>(StorageService.Verb.class);
     private final List subscribers = new ArrayList();

     private static final long DEFAULT_CALLBACK_TIMEOUT = (long) (1.1 * DatabaseDescriptor.getRpcTimeout());

-    {
-        for (StorageService.Verb verb : DROPPABLE_VERBS)
-        {
-            droppedMessages.put(verb, new AtomicInteger());
-            lastDropped.put(verb, 0);
-        }
-    }
-
     private static class MSHandle
     {
         public static final MessagingService instance = new MessagingService();
@@ -123,6 +116,13 @@ public final class MessagingService impl
     private MessagingService()
     {
+        for (StorageService.Verb verb : DROPPABLE_VERBS)
+        {
+            droppedMessages.put(verb, new AtomicInteger());
+            lastDropped.put(verb, 0);
+            lastDroppedInternal.put(verb, 0);
+        }
+
         listenGate = new SimpleCondition();
         verbHandlers_ = new EnumMap(StorageService.Verb.class);
         streamExecutor_ = new DebuggableThreadPoolExecutor("Streaming", DatabaseDescriptor.getCompactionThreadPriority());
@@ -584,11 +584,13 @@ public final class MessagingService impl
         for (Map.Entry<StorageService.Verb, AtomicInteger> entry : droppedMessages.entrySet())
         {
             AtomicInteger dropped = entry.getValue();
-            if (dropped.get() > 0)
+            StorageService.Verb verb = entry.getKey();
+            int recent = dropped.get() - lastDroppedInternal.get(verb);
+            if (recent > 0)
             {
                 logTpstats = true;
-                logger_.info("{} {} messages dropped in server lifetime",
-                             dropped, entry.getKey());
+                logger_.info("{} {} messages dropped in server lifetime", recent, verb);
+                lastDroppedInternal.put(verb, dropped.get());
             }
         }
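The patch above boils down to remembering, per verb, the dropped-message total as of the last log line and reporting only the delta, so a server with no new drops stops logging. A minimal standalone sketch of that pattern (hypothetical DroppedMessageTracker class and Verb enum, not Cassandra's actual MessagingService):

```java
import java.util.EnumMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical, simplified version of the CASSANDRA-3004 approach: keep a
// running total of drops per verb plus the total as of the last report,
// and surface only the difference each reporting interval.
class DroppedMessageTracker
{
    enum Verb { MUTATION, READ }

    private final Map<Verb, AtomicInteger> dropped = new EnumMap<Verb, AtomicInteger>(Verb.class);
    private final Map<Verb, Integer> lastReported = new EnumMap<Verb, Integer>(Verb.class);

    DroppedMessageTracker()
    {
        for (Verb v : Verb.values())
        {
            dropped.put(v, new AtomicInteger());
            lastReported.put(v, 0);
        }
    }

    void recordDrop(Verb verb)
    {
        dropped.get(verb).incrementAndGet();
    }

    // Drops since the previous call; advances the marker, so a second call
    // with no new drops returns 0 and nothing would be logged.
    int recentDrops(Verb verb)
    {
        int total = dropped.get(verb).get();
        int recent = total - lastReported.get(verb);
        lastReported.put(verb, total);
        return recent;
    }
}
```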
[jira] [Updated] (CASSANDRA-3004) Once a message has been dropped, cassandra logs total messages dropped and tpstats every 5s forever
[ https://issues.apache.org/jira/browse/CASSANDRA-3004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-3004: -- Affects Version/s: (was: 0.8.3) 0.8.2 Issue Type: Improvement (was: Bug) > Once a message has been dropped, cassandra logs total messages dropped and > tpstats every 5s forever > --- > > Key: CASSANDRA-3004 > URL: https://issues.apache.org/jira/browse/CASSANDRA-3004 > Project: Cassandra > Issue Type: Improvement > Components: Core >Affects Versions: 0.8.2 >Reporter: Brandon Williams >Assignee: Jonathan Ellis >Priority: Minor > Labels: lhf > Fix For: 0.8.4 > > Attachments: 3004.txt > > -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CASSANDRA-2325) invalidateKeyCache / invalidateRowCache should remove saved cache files from disk
[ https://issues.apache.org/jira/browse/CASSANDRA-2325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-2325: -- Affects Version/s: (was: 0.7.8) (was: 0.8.2) 0.6 Fix Version/s: 0.8.4 > invalidateKeyCache / invalidateRowCache should remove saved cache files from > disk > - > > Key: CASSANDRA-2325 > URL: https://issues.apache.org/jira/browse/CASSANDRA-2325 > Project: Cassandra > Issue Type: Improvement >Affects Versions: 0.6 >Reporter: Matthew F. Dennis >Assignee: Edward Capriolo >Priority: Minor > Fix For: 0.8.4 > > Attachments: cassandra-2325-1.patch.txt, cassandra-2325.patch.2.txt > > > the invalidate[Key|Row]Cache calls don't remove the saved caches from disk. > It seems logical that if you are clearing the caches you don't expect them to > be reinstantiated with the old values the next time C* starts. > This is not a huge issue since next time the caches are saved the old values > will be removed. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-2325) invalidateKeyCache / invalidateRowCache should remove saved cache files from disk
[ https://issues.apache.org/jira/browse/CASSANDRA-2325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081920#comment-13081920 ] Jonathan Ellis commented on CASSANDRA-2325: --- Shouldn't we check that the file exists first? otherwise we log spurious errors. > invalidateKeyCache / invalidateRowCache should remove saved cache files from > disk > - > > Key: CASSANDRA-2325 > URL: https://issues.apache.org/jira/browse/CASSANDRA-2325 > Project: Cassandra > Issue Type: Improvement >Affects Versions: 0.6 >Reporter: Matthew F. Dennis >Assignee: Edward Capriolo >Priority: Minor > Fix For: 0.8.4 > > Attachments: cassandra-2325-1.patch.txt, cassandra-2325.patch.2.txt > > > the invalidate[Key|Row]Cache calls don't remove the saved caches from disk. > It seems logical that if you are clearing the caches you don't expect them to > be reinstantiated with the old values the next time C* starts. > This is not a huge issue since next time the caches are saved the old values > will be removed. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
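Jonathan's suggestion can be sketched as a guard around the delete. This hypothetical SavedCacheCleaner helper (not the attached patch) checks existence first so a never-saved cache is treated as success rather than logged as a spurious error:

```java
import java.io.File;

// Hypothetical helper illustrating the suggested existence check: only
// attempt (and report on) deletion when a saved cache file actually exists.
class SavedCacheCleaner
{
    // Returns false only when a file existed but could not be removed;
    // a missing file is not an error and produces no log noise.
    static boolean deleteIfPresent(File savedCache)
    {
        if (!savedCache.exists())
            return true;
        return savedCache.delete();
    }
}
```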
[jira] [Assigned] (CASSANDRA-3005) OutboundTcpConnection's sending queue goes unboundedly without any backpressure logic
[ https://issues.apache.org/jira/browse/CASSANDRA-3005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Melvin Wang reassigned CASSANDRA-3005: -- Assignee: Melvin Wang > OutboundTcpConnection's sending queue goes unboundedly without any > backpressure logic > - > > Key: CASSANDRA-3005 > URL: https://issues.apache.org/jira/browse/CASSANDRA-3005 > Project: Cassandra > Issue Type: Improvement >Reporter: Melvin Wang >Assignee: Melvin Wang > > OutboundTcpConnection's sending queue unconditionally queues up the request > and process them in sequence. Thinking about tagging the message coming in > with timestamp and drop them before actually sending it if the message stays > in the queue for too long, which is defined by the message's own time out > value. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
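Melvin's idea of stamping each queued message with its enqueue time and discarding it at send time once it has outlived its own timeout could look roughly like this (hypothetical ExpiringOutboundQueue, single-threaded sketch; the real OutboundTcpConnection would need thread safety and real message types):

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Hypothetical sketch of an outbound queue that drops messages which have
// waited longer than their own timeout, instead of growing without bound.
class ExpiringOutboundQueue
{
    static class Entry
    {
        final String message;
        final long enqueuedAtMillis;
        final long timeoutMillis;

        Entry(String message, long enqueuedAtMillis, long timeoutMillis)
        {
            this.message = message;
            this.enqueuedAtMillis = enqueuedAtMillis;
            this.timeoutMillis = timeoutMillis;
        }
    }

    private final Queue<Entry> queue = new ArrayDeque<Entry>();

    void enqueue(String message, long nowMillis, long timeoutMillis)
    {
        queue.add(new Entry(message, nowMillis, timeoutMillis));
    }

    // Next message still worth sending; anything already expired is discarded.
    String pollSendable(long nowMillis)
    {
        Entry e;
        while ((e = queue.poll()) != null)
        {
            if (nowMillis - e.enqueuedAtMillis <= e.timeoutMillis)
                return e.message;
        }
        return null;
    }
}
```

Dropping at dequeue time keeps enqueue cheap; a replica that cannot keep up then sheds exactly the requests whose coordinators have already timed them out anyway.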
[jira] [Commented] (CASSANDRA-2988) Improve SSTableReader.load() when loading index files
[ https://issues.apache.org/jira/browse/CASSANDRA-2988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081930#comment-13081930 ] Pavel Yaskevich commented on CASSANDRA-2988: First of all I would like to point you to http://wiki.apache.org/cassandra/CodeStyle, please modify your code according to the conventions listed there. Regarding c2988-modified-buffer.patch: - please encapsulate your modifications, because if you compare how it was and how it is in your patch it's hard to understand and just looks like a mess; I would like to suggest moving those modifications to a separate inner class (IndexReader maybe?) and replacing only the RandomAccessReader initialization in the SSTableReader.load(...) method... - let's add a test comparing "getEstimatedRowSize().count();" and "SSTable.estimateRowsFromIndex(input);" just to be sure it works correctly. Also I don't quite understand the logic behind "while (buffer.remaining() > 10) {" in SSTableReader.loadByteBuffer; let's avoid any hardcoding, or at least comment why you did that. I'm going to take a closer look at the patch for parallel index file loading after we are done with the index reader patch (c2988-modified-buffer.patch). > Improve SSTableReader.load() when loading index files > - > > Key: CASSANDRA-2988 > URL: https://issues.apache.org/jira/browse/CASSANDRA-2988 > Project: Cassandra > Issue Type: Improvement > Components: Core >Reporter: Melvin Wang >Assignee: Melvin Wang >Priority: Minor > Fix For: 1.0 > > Attachments: c2988-modified-buffer.patch, > c2988-parallel-load-sstables.patch > > > * when we create BufferedRandomAccessFile, we pass skipCache=true. This > hurts the read performance because we always process the index files > sequentially. A simple fix would be to set it to false. > * multiple index files of a single column family can be loaded in parallel. > This buys a lot when you have multiple super large index files. > * we may also change how we buffer.
By using BufferedRandomAccessFile, for > every read, we need a bunch of checks like > - do we need to rebuffer? > - isEOF()? > - assertions > These can be simplified to some extent. We can blindly buffer the index > file by chunks and process the buffer until a key lies across the boundary of a > chunk. Then we rebuffer and start from the beginning of the partially read > key. Conceptually, this is the same as what BRAF does but w/o the overhead in the > read**() methods in BRAF. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
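The chunked-buffering scheme described in the issue, parsing a buffer until a key straddles a chunk boundary and then rebuffering from the start of the partially read record, can be sketched over an in-memory byte array of length-prefixed keys (hypothetical ChunkedIndexReader; the real index file format also carries positions and is read from disk):

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of chunked index scanning: parse length-prefixed keys
// out of fixed-size chunks, and when a record straddles a chunk boundary,
// restart the next chunk at the beginning of the partially read record.
// Assumes well-formed input and records that fit within one chunk.
class ChunkedIndexReader
{
    static List<String> readKeys(byte[] file, int chunkSize)
    {
        List<String> keys = new ArrayList<String>();
        int pos = 0; // absolute offset of the next unparsed record
        while (pos < file.length)
        {
            int end = Math.min(pos + chunkSize, file.length);
            ByteBuffer chunk = ByteBuffer.wrap(file, pos, end - pos);
            while (chunk.remaining() >= 2)
            {
                chunk.mark();
                int len = chunk.getShort() & 0xFFFF;
                if (chunk.remaining() < len)
                {
                    chunk.reset(); // key straddles the boundary; rebuffer from here
                    break;
                }
                byte[] key = new byte[len];
                chunk.get(key);
                keys.add(new String(key));
            }
            if (chunk.position() == pos)
                throw new IllegalStateException("record larger than chunk size");
            pos = chunk.position(); // start of the partially read record, if any
        }
        return keys;
    }

    // Test helper: serialize keys as (2-byte length, key bytes) records.
    static byte[] encode(String... keys)
    {
        int size = 0;
        for (String k : keys)
            size += 2 + k.getBytes().length;
        ByteBuffer buf = ByteBuffer.allocate(size);
        for (String k : keys)
        {
            byte[] b = k.getBytes();
            buf.putShort((short) b.length);
            buf.put(b);
        }
        return buf.array();
    }
}
```

The EOF and rebuffer checks happen once per chunk rather than once per read call, which is the overhead reduction the comment is after.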
[jira] [Commented] (CASSANDRA-2950) Data from truncated CF reappears after server restart
[ https://issues.apache.org/jira/browse/CASSANDRA-2950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081941#comment-13081941 ] Brandon Williams commented on CASSANDRA-2950: - Currently, truncate does: * force a flush * record the time * delete any sstables older than the time This isn't quite enough if the machine crashes shortly afterward, however, since there can be mutations present in the commitlog that were previously truncated and are now resurrected by CL replay. One thing we could do is record the truncate time for the CF in the system ks and then ignore mutations older than that, however this would require time synchronization between the client and the server to be accurate. > Data from truncated CF reappears after server restart > - > > Key: CASSANDRA-2950 > URL: https://issues.apache.org/jira/browse/CASSANDRA-2950 > Project: Cassandra > Issue Type: Bug >Reporter: Cathy Daw >Assignee: Brandon Williams > > * Configure 3 node cluster > * Ensure the java stress tool creates Keyspace1 with RF=3 > {code} > // Run Stress Tool to generate 10 keys, 1 column > stress --operation=INSERT -t 2 --num-keys=50 --columns=20 > --consistency-level=QUORUM --average-size-values --replication-factor=3 > --create-index=KEYS --nodes=cathy1,cathy2 > // Verify 50 keys in CLI > use Keyspace1; > list Standard1; > // TRUNCATE CF in CLI > use Keyspace1; > truncate counter1; > list counter1; > // Run stress tool and verify creation of 1 key with 10 columns > stress --operation=INSERT -t 2 --num-keys=1 --columns=10 > --consistency-level=QUORUM --average-size-values --replication-factor=3 > --create-index=KEYS --nodes=cathy1,cathy2 > // Verify 1 key in CLI > use Keyspace1; > list Standard1; > // Restart all three nodes > // You will see 51 keys in CLI > use Keyspace1; > list Standard1; > {code} -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
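Brandon's proposed mitigation, recording the truncate time per column family and ignoring older mutations during commitlog replay, might be sketched like this (hypothetical TruncationFilter; as he notes, it is only as reliable as the clocks that produced the timestamps):

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the proposed mitigation: remember when each column
// family was truncated and have commitlog replay skip anything older.
class TruncationFilter
{
    private final Map<String, Long> truncatedAt = new HashMap<String, Long>();

    void recordTruncation(String columnFamily, long timestampMillis)
    {
        truncatedAt.put(columnFamily, timestampMillis);
    }

    // True if commitlog replay should re-apply this mutation.
    boolean shouldReplay(String columnFamily, long mutationTimestampMillis)
    {
        Long mark = truncatedAt.get(columnFamily);
        return mark == null || mutationTimestampMillis > mark;
    }
}
```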
[jira] [Commented] (CASSANDRA-3004) Once a message has been dropped, cassandra logs total messages dropped and tpstats every 5s forever
[ https://issues.apache.org/jira/browse/CASSANDRA-3004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081948#comment-13081948 ] Hudson commented on CASSANDRA-3004: --- Integrated in Cassandra-0.8 #265 (See [https://builds.apache.org/job/Cassandra-0.8/265/]) switch back to only logging recent dropped messages patch by jbellis; reviewed by brandonwilliams for CASSANDRA-3004 jbellis : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1155558 Files : * /cassandra/branches/cassandra-0.8/CHANGES.txt * /cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/net/MessagingService.java > Once a message has been dropped, cassandra logs total messages dropped and > tpstats every 5s forever > --- > > Key: CASSANDRA-3004 > URL: https://issues.apache.org/jira/browse/CASSANDRA-3004 > Project: Cassandra > Issue Type: Improvement > Components: Core >Affects Versions: 0.8.2 >Reporter: Brandon Williams >Assignee: Jonathan Ellis >Priority: Minor > Labels: lhf > Fix For: 0.8.4 > > Attachments: 3004.txt > > -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-2990) We should refuse query for counters at CL.ANY
[ https://issues.apache.org/jira/browse/CASSANDRA-2990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081947#comment-13081947 ] Hudson commented on CASSANDRA-2990: --- Integrated in Cassandra-0.8 #265 (See [https://builds.apache.org/job/Cassandra-0.8/265/]) Refuse counter write at CL.ANY patch by slebresne; reviewed by jbellis for CASSANDRA-2990 slebresne : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1155548 Files : * /cassandra/branches/cassandra-0.8/test/system/test_cql.py * /cassandra/branches/cassandra-0.8/CHANGES.txt * /cassandra/branches/cassandra-0.8/test/system/test_thrift_server.py * /cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/thrift/ThriftValidation.java * /cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/cql/UpdateStatement.java > We should refuse query for counters at CL.ANY > - > > Key: CASSANDRA-2990 > URL: https://issues.apache.org/jira/browse/CASSANDRA-2990 > Project: Cassandra > Issue Type: Bug >Affects Versions: 0.8.0 >Reporter: Sylvain Lebresne >Assignee: Sylvain Lebresne >Priority: Trivial > Labels: counters > Fix For: 0.8.4 > > Attachments: 2990.patch > > > We currently do not reject writes for counters at CL.ANY, even though this is > not supported (and rightly so). -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
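The change this commit integrates is essentially a guard at write-validation time: a CL.ANY write may be satisfied purely by hints, and counter increments cannot safely be replayed from hints, so the request is rejected up front. A hypothetical sketch of the shape of that check (not the actual ThriftValidation/UpdateStatement code):

```java
// Hypothetical stand-in for the CASSANDRA-2990 validation: reject a counter
// write at ConsistencyLevel.ANY before it is accepted and silently mishandled.
class CounterWriteValidator
{
    enum ConsistencyLevel { ANY, ONE, QUORUM, ALL }

    static void validate(boolean isCounterWrite, ConsistencyLevel cl)
    {
        if (isCounterWrite && cl == ConsistencyLevel.ANY)
            throw new IllegalArgumentException("consistency level ANY is not supported for counter writes");
    }
}
```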
[jira] [Updated] (CASSANDRA-2982) Refactor secondary index api
[ https://issues.apache.org/jira/browse/CASSANDRA-2982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] T Jake Luciani updated CASSANDRA-2982: -- Attachment: 2982-v1.txt refactored api, should cover new index types. Should we consider removing IndexType enum and just use classname? > Refactor secondary index api > > > Key: CASSANDRA-2982 > URL: https://issues.apache.org/jira/browse/CASSANDRA-2982 > Project: Cassandra > Issue Type: Sub-task > Components: Core >Reporter: T Jake Luciani >Assignee: T Jake Luciani > Fix For: 1.0 > > Attachments: 2982-v1.txt > > > Secondary indexes currently make some bad assumptions about the underlying > indexes. > 1. That they are always stored in other column families. > 2. That there is a unique index per column > In the case of CASSANDRA-2915 neither of these are true. The new api should > abstract the search concepts and allow any search api to plug in. > Once the code is refactored and basically pluggable we can remove the > IndexType enum and use class names similar to how we handle partitioners and > comparators. > Basic api is to add a SecondaryIndexManager that handles different index > types per CF and a SecondaryIndex base class that handles a particular type > implementation. > This requires major changes to ColumnFamilyStore and Table.IndexBuilder -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-2950) Data from truncated CF reappears after server restart
[ https://issues.apache.org/jira/browse/CASSANDRA-2950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081966#comment-13081966 ] Jonathan Ellis commented on CASSANDRA-2950: --- but we record CL "context" at time of flush in the sstable it makes, and on replay we ignore any mutations from before that position. I checked, and we do wait for flush to complete in truncate. > Data from truncated CF reappears after server restart > - > > Key: CASSANDRA-2950 > URL: https://issues.apache.org/jira/browse/CASSANDRA-2950 > Project: Cassandra > Issue Type: Bug >Reporter: Cathy Daw >Assignee: Brandon Williams > > * Configure 3 node cluster > * Ensure the java stress tool creates Keyspace1 with RF=3 > {code} > // Run Stress Tool to generate 10 keys, 1 column > stress --operation=INSERT -t 2 --num-keys=50 --columns=20 > --consistency-level=QUORUM --average-size-values --replication-factor=3 > --create-index=KEYS --nodes=cathy1,cathy2 > // Verify 50 keys in CLI > use Keyspace1; > list Standard1; > // TRUNCATE CF in CLI > use Keyspace1; > truncate counter1; > list counter1; > // Run stress tool and verify creation of 1 key with 10 columns > stress --operation=INSERT -t 2 --num-keys=1 --columns=10 > --consistency-level=QUORUM --average-size-values --replication-factor=3 > --create-index=KEYS --nodes=cathy1,cathy2 > // Verify 1 key in CLI > use Keyspace1; > list Standard1; > // Restart all three nodes > // You will see 51 keys in CLI > use Keyspace1; > list Standard1; > {code} -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-2982) Refactor secondary index api
[ https://issues.apache.org/jira/browse/CASSANDRA-2982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081968#comment-13081968 ] Jonathan Ellis commented on CASSANDRA-2982: --- I don't think full index pluggability is a goal here. So I don't see the point of that. > Refactor secondary index api > > > Key: CASSANDRA-2982 > URL: https://issues.apache.org/jira/browse/CASSANDRA-2982 > Project: Cassandra > Issue Type: Sub-task > Components: Core >Reporter: T Jake Luciani >Assignee: T Jake Luciani > Fix For: 1.0 > > Attachments: 2982-v1.txt > > > Secondary indexes currently make some bad assumptions about the underlying > indexes. > 1. That they are always stored in other column families. > 2. That there is a unique index per column > In the case of CASSANDRA-2915 neither of these are true. The new api should > abstract the search concepts and allow any search api to plug in. > Once the code is refactored and basically pluggable we can remove the > IndexType enum and use class names similar to how we handle partitioners and > comparators. > Basic api is to add a SecondaryIndexManager that handles different index > types per CF and a SecondaryIndex base class that handles a particular type > implementation. > This requires major changes to ColumnFamilyStore and Table.IndexBuilder -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-2950) Data from truncated CF reappears after server restart
[ https://issues.apache.org/jira/browse/CASSANDRA-2950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081969#comment-13081969 ] Brandon Williams commented on CASSANDRA-2950: - bq. but we record CL "context" at time of flush in the sstable it makes, and we on replay we ignore any mutations from before that position. I think there's something wrong with that, then: {noformat}
INFO 21:25:15,274 Replaying /var/lib/cassandra/commitlog/CommitLog-1312924388053.log
DEBUG 21:25:15,290 Replaying /var/lib/cassandra/commitlog/CommitLog-1312924388053.log starting at 0
DEBUG 21:25:15,291 Reading mutation at 0
DEBUG 21:25:15,295 replaying mutation for system.4c: {ColumnFamily(LocationInfo [47656e65726174696f6e:false:4@131292438814,])}
DEBUG 21:25:15,321 Reading mutation at 89
DEBUG 21:25:15,322 replaying mutation for system.426f6f747374726170: {ColumnFamily(LocationInfo [42:false:1@1312924388203,])}
DEBUG 21:25:15,322 Reading mutation at 174
DEBUG 21:25:15,322 replaying mutation for system.4c: {ColumnFamily(LocationInfo [546f6b656e:false:16@1312924388204,])}
DEBUG 21:25:15,322 Reading mutation at 270
DEBUG 21:25:15,324 replaying mutation for Keyspace1.3030: {ColumnFamily(Standard1 [C0:false:34@1312924813259,C1:false:34@1312924813260,C2:false:34@1312924813260,C3:false:34@1312924813260,C4:false:34@1312924813260,])}
{noformat}
The last entry there is the first of many errant mutations.
> Data from truncated CF reappears after server restart > - > > Key: CASSANDRA-2950 > URL: https://issues.apache.org/jira/browse/CASSANDRA-2950 > Project: Cassandra > Issue Type: Bug >Reporter: Cathy Daw >Assignee: Brandon Williams > > * Configure 3 node cluster > * Ensure the java stress tool creates Keyspace1 with RF=3 > {code} > // Run Stress Tool to generate 10 keys, 1 column > stress --operation=INSERT -t 2 --num-keys=50 --columns=20 > --consistency-level=QUORUM --average-size-values --replication-factor=3 > --create-index=KEYS --nodes=cathy1,cathy2 > // Verify 50 keys in CLI > use Keyspace1; > list Standard1; > // TRUNCATE CF in CLI > use Keyspace1; > truncate counter1; > list counter1; > // Run stress tool and verify creation of 1 key with 10 columns > stress --operation=INSERT -t 2 --num-keys=1 --columns=10 > --consistency-level=QUORUM --average-size-values --replication-factor=3 > --create-index=KEYS --nodes=cathy1,cathy2 > // Verify 1 key in CLI > use Keyspace1; > list Standard1; > // Restart all three nodes > // You will see 51 keys in CLI > use Keyspace1; > list Standard1; > {code} -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Assigned] (CASSANDRA-2950) Data from truncated CF reappears after server restart
[ https://issues.apache.org/jira/browse/CASSANDRA-2950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis reassigned CASSANDRA-2950: - Assignee: Jonathan Ellis (was: Brandon Williams) > Data from truncated CF reappears after server restart > - > > Key: CASSANDRA-2950 > URL: https://issues.apache.org/jira/browse/CASSANDRA-2950 > Project: Cassandra > Issue Type: Bug >Reporter: Cathy Daw >Assignee: Jonathan Ellis > > * Configure 3 node cluster > * Ensure the java stress tool creates Keyspace1 with RF=3 > {code} > // Run Stress Tool to generate 10 keys, 1 column > stress --operation=INSERT -t 2 --num-keys=50 --columns=20 > --consistency-level=QUORUM --average-size-values --replication-factor=3 > --create-index=KEYS --nodes=cathy1,cathy2 > // Verify 50 keys in CLI > use Keyspace1; > list Standard1; > // TRUNCATE CF in CLI > use Keyspace1; > truncate counter1; > list counter1; > // Run stress tool and verify creation of 1 key with 10 columns > stress --operation=INSERT -t 2 --num-keys=1 --columns=10 > --consistency-level=QUORUM --average-size-values --replication-factor=3 > --create-index=KEYS --nodes=cathy1,cathy2 > // Verify 1 key in CLI > use Keyspace1; > list Standard1; > // Restart all three nodes > // You will see 51 keys in CLI > use Keyspace1; > list Standard1; > {code} -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (CASSANDRA-3010) Java CQL command-line shell
Java CQL command-line shell --- Key: CASSANDRA-3010 URL: https://issues.apache.org/jira/browse/CASSANDRA-3010 Project: Cassandra Issue Type: New Feature Components: Tools Reporter: Jonathan Ellis Assignee: Pavel Yaskevich Fix For: 1.0 We need a "real" CQL shell that: - does not require installing additional environments - includes "show keyspaces" and other introspection tools - does not break existing cli scripts I.e., it needs to be java, but it should be a new tool instead of replacing the existing cli. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-3010) Java CQL command-line shell
[ https://issues.apache.org/jira/browse/CASSANDRA-3010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081974#comment-13081974 ] Jonathan Ellis commented on CASSANDRA-3010: --- I.e., do we do "\d CF" (postgresql) or "describe CF" (mysql) or "desc CF" (oracle)? > Java CQL command-line shell > --- > > Key: CASSANDRA-3010 > URL: https://issues.apache.org/jira/browse/CASSANDRA-3010 > Project: Cassandra > Issue Type: New Feature > Components: Tools >Reporter: Jonathan Ellis >Assignee: Pavel Yaskevich > Fix For: 1.0 > > > We need a "real" CQL shell that: > - does not require installing additional environments > - includes "show keyspaces" and other introspection tools > - does not break existing cli scripts > I.e., it needs to be java, but it should be a new tool instead of replacing > the existing cli. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-3010) Java CQL command-line shell
[ https://issues.apache.org/jira/browse/CASSANDRA-3010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081973#comment-13081973 ] Jonathan Ellis commented on CASSANDRA-3010: --- We should also pick a SQL command line to imitate for the introspection stuff. Might as well get that degree of familiarity as well since there is no reason not to. > Java CQL command-line shell > --- > > Key: CASSANDRA-3010 > URL: https://issues.apache.org/jira/browse/CASSANDRA-3010 > Project: Cassandra > Issue Type: New Feature > Components: Tools >Reporter: Jonathan Ellis >Assignee: Pavel Yaskevich > Fix For: 1.0 > > > We need a "real" CQL shell that: > - does not require installing additional environments > - includes "show keyspaces" and other introspection tools > - does not break existing cli scripts > I.e., it needs to be java, but it should be a new tool instead of replacing > the existing cli. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-3010) Java CQL command-line shell
[ https://issues.apache.org/jira/browse/CASSANDRA-3010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081977#comment-13081977 ] Jeremy Hanna commented on CASSANDRA-3010: - If I had to choose one, it would be nice to be more descriptive (describe versus \d). However, it would be really nice to have a basic concept of synonyms. For example mysql's cli supports both describe and desc. Building that type of functionality in from the start shouldn't be too onerous. > Java CQL command-line shell > --- > > Key: CASSANDRA-3010 > URL: https://issues.apache.org/jira/browse/CASSANDRA-3010 > Project: Cassandra > Issue Type: New Feature > Components: Tools >Reporter: Jonathan Ellis >Assignee: Pavel Yaskevich > Fix For: 1.0 > > > We need a "real" CQL shell that: > - does not require installing additional environments > - includes "show keyspaces" and other introspection tools > - does not break existing cli scripts > I.e., it needs to be java, but it should be a new tool instead of replacing > the existing cli. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Issue Comment Edited] (CASSANDRA-3010) Java CQL command-line shell
[ https://issues.apache.org/jira/browse/CASSANDRA-3010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081977#comment-13081977 ] Jeremy Hanna edited comment on CASSANDRA-3010 at 8/9/11 10:36 PM: -- If I had to choose one, it would be nice to be more descriptive (describe versus \d). However, it would be really nice to have a basic concept of synonyms. For example mysql's cli supports both describe and desc. Building that type of functionality in from the start would hopefully not be too onerous. was (Author: jeromatron): If I had to choose one, it would be nice to be more descriptive (describe versus \d). However, it would be really nice to have a basic concept of synonyms. For example mysql's cli supports both describe and desc. Building that type of functionality in from the start shouldn't be too onerous.
[jira] [Commented] (CASSANDRA-3010) Java CQL command-line shell
[ https://issues.apache.org/jira/browse/CASSANDRA-3010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081982#comment-13081982 ] Pavel Yaskevich commented on CASSANDRA-3010: I don't think that we should choose anything, because we can support all of those notations using synonyms in the ANTLR grammar. It would be hard to include all of the possible synonyms from the beginning, but the grammar will be designed in a way that makes it easy to add new synonyms as we go.
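Pavel's synonym idea does not need the full ANTLR grammar to illustrate. Below is a minimal sketch in plain Java (the command spellings and class name are hypothetical, not the actual cqlsh grammar) of resolving several spellings to one canonical command:

```java
import java.util.HashMap;
import java.util.Map;

public class CommandSynonyms {
    private static final Map<String, String> CANONICAL = new HashMap<String, String>();
    static {
        // several spellings resolve to one canonical command, the way an
        // ANTLR grammar would list alternative tokens for the same rule
        for (String s : new String[] {"describe", "desc", "\\d"})
            CANONICAL.put(s, "DESCRIBE");
        for (String s : new String[] {"show", "\\s"})
            CANONICAL.put(s, "SHOW");
    }

    public static String resolve(String input) {
        String canonical = CANONICAL.get(input.toLowerCase());
        return canonical != null ? canonical : "UNKNOWN";
    }

    public static void main(String[] args) {
        System.out.println(resolve("desc")); // DESCRIBE
        System.out.println(resolve("\\d"));  // DESCRIBE
    }
}
```

New synonyms are then a one-line addition to the table (or one extra token alternative in the grammar), which is the "easy to add as we go" property Pavel describes.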
[jira] [Commented] (CASSANDRA-2988) Improve SSTableReader.load() when loading index files
[ https://issues.apache.org/jira/browse/CASSANDRA-2988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13082008#comment-13082008 ] Melvin Wang commented on CASSANDRA-2988: bq. First of all I would like to point you to http://wiki.apache.org/cassandra/CodeStyle, please modify your code according to conventions listed in there. Sure. This boils down to where to put the curly braces. bq. please encapsulate your modifications because if you compare how it was and how it is in your patch it's hard to understand and just looks like a mess, I would like to suggest moving those modifications to separate inner class (IndexReader maybe?) and replace only RandomAccessReader initialization in the SSTableReader.load(...) method... This patch changes most of the load() method; I am not clear how we could change only the initialization of RandomAccessReader. bq. Also I don't quite understand logic behind "while (buffer.remaining() > 10) {" in SSTableReader.loadByteBuffer, let's avoid any hardcoding or at least comment why you did that. Sorry for the lack of comments; I will add them. However, this is not really hardcoding: a Short consists of 2 bytes and a Long consists of 8 bytes, so the sum is 10 bytes. It is just a quick check for whether we have reached the end. bq. I'm going to take a closer look at patch for parallel index file loading after we are done with index reader patch (c2988-modified-buffer.patch). FYI, these two patches are completely independent of each other. > Improve SSTableReader.load() when loading index files > - > > Key: CASSANDRA-2988 > URL: https://issues.apache.org/jira/browse/CASSANDRA-2988 > Project: Cassandra > Issue Type: Improvement > Components: Core >Reporter: Melvin Wang >Assignee: Melvin Wang >Priority: Minor > Fix For: 1.0 > > Attachments: c2988-modified-buffer.patch, > c2988-parallel-load-sstables.patch > > > * when we create BufferredRandomAccessFile, we pass skipCache=true. This > hurts the read performance because we always process the index files > sequentially. A simple fix would be to set it to false. > * multiple index files of a single column family can be loaded in parallel. > This buys a lot when you have multiple super-large index files. > * we may also change how we buffer. By using BufferredRandomAccessFile, for > every read, we need a bunch of checks like > - do we need to rebuffer? > - isEOF()? > - assertions > These can be simplified to some extent. We can blindly buffer the index > file in chunks and process the buffer until a key lies across the boundary of a > chunk. Then we rebuffer and start from the beginning of the partially read > key. Conceptually, this is the same as what BRAF does but w/o the overhead in the > read**() methods in BRAF. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
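The "buffer.remaining() > 10" check Melvin explains (2 bytes for a short key-length prefix plus 8 bytes for a long position) and the rebuffer-on-boundary idea from the ticket description can be sketched with plain java.nio. This is a simplified stand-in; the class and method names are assumptions for illustration, not the patch's actual code:

```java
import java.nio.ByteBuffer;
import java.util.List;

public class IndexChunkScanner {
    // a serialized index entry is at least: short key length (2) + long position (8)
    private static final int MIN_ENTRY_OVERHEAD = 2 + 8;

    /**
     * Reads complete (key, position) entries out of one buffered chunk and
     * returns the offset of the first partially buffered entry, so the caller
     * can rebuffer starting from the beginning of that entry.
     */
    public static int scan(ByteBuffer buffer, List<Long> positions) {
        while (buffer.remaining() >= MIN_ENTRY_OVERHEAD) {
            buffer.mark();
            int keyLength = buffer.getShort() & 0xFFFF;
            if (buffer.remaining() < keyLength + 8) {
                buffer.reset(); // entry crosses the chunk boundary; rebuffer from here
                break;
            }
            buffer.position(buffer.position() + keyLength); // skip the key bytes
            positions.add(buffer.getLong());                // the data-file position
        }
        return buffer.position();
    }

    public static void main(String[] args) {
        ByteBuffer chunk = ByteBuffer.allocate(32);
        chunk.putShort((short) 3).put(new byte[] {'k', 'e', 'y'}).putLong(42L);
        chunk.flip();
        List<Long> positions = new java.util.ArrayList<Long>();
        scan(chunk, positions);
        System.out.println(positions); // [42]
    }
}
```

This is the per-chunk loop; BRAF's per-read rebuffer/EOF checks collapse into the single remaining() test at the top.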
[jira] [Updated] (CASSANDRA-2777) Pig storage handler should implement LoadMetadata
[ https://issues.apache.org/jira/browse/CASSANDRA-2777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Williams updated CASSANDRA-2777: Attachment: 2777-v2.txt v2 rebased. > Pig storage handler should implement LoadMetadata > - > > Key: CASSANDRA-2777 > URL: https://issues.apache.org/jira/browse/CASSANDRA-2777 > Project: Cassandra > Issue Type: Improvement > Components: Contrib >Reporter: Brandon Williams >Assignee: Brandon Williams >Priority: Minor > Fix For: 0.7.9 > > Attachments: 2777-v2.txt, 2777.txt > > > The reason for this is many builtin functions like SUM won't work on longs > (you can workaround using LongSum, but that's lame) because the query planner > doesn't know about the types beforehand, even though we are casting to native > longs. > There is some impact to this, though. With LoadMetadata implemented, > existing scripts that specify schema will need to remove it (since LM is > doing it for them) and they will need to conform to LM's terminology (key, > columns, name, value) within the script. This is trivial to change, however, > and the increased functionality is worth the switch. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-2988) Improve SSTableReader.load() when loading index files
[ https://issues.apache.org/jira/browse/CASSANDRA-2988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13082016#comment-13082016 ] Jonathan Ellis commented on CASSANDRA-2988: --- bq. Short consists of 2 bytes and Long consists of 8 bytes, the sum is 10 bytes IMO that's more obvious if you leave it as "2 + 8," or use the DBConstants class.
[jira] [Updated] (CASSANDRA-2810) RuntimeException in Pig when using "dump" command on column name
[ https://issues.apache.org/jira/browse/CASSANDRA-2810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Williams updated CASSANDRA-2810: Attachment: 2810-v2.txt It looks like the final problem here is that IntegerType always returns a BigInteger, which pig does not like. This is unfortunate since IntegerType can't be easily subclassed and overridden to return ints. v2 instead adds a setTupleValue method that is always used for adding values to tuples; it houses all the special-casing currently needed and provides a spot for more in the future, rather than proliferating custom type converters, since I'm sure IntegerType won't be alone here. > RuntimeException in Pig when using "dump" command on column name > > > Key: CASSANDRA-2810 > URL: https://issues.apache.org/jira/browse/CASSANDRA-2810 > Project: Cassandra > Issue Type: Bug >Affects Versions: 0.8.1 > Environment: Ubuntu 10.10, 32 bits > java version "1.6.0_24" > Brisk beta-2 installed from Debian packages >Reporter: Silvère Lestang >Assignee: Brandon Williams > Attachments: 2810-v2.txt, 2810.txt > > > This bug was previously reported on the [Brisk bug > tracker|https://datastax.jira.com/browse/BRISK-232].
> In cassandra-cli: > {code} > [default@unknown] create keyspace Test > with placement_strategy = 'org.apache.cassandra.locator.SimpleStrategy' > and strategy_options = [{replication_factor:1}]; > [default@unknown] use Test; > Authenticated to keyspace: Test > [default@Test] create column family test; > [default@Test] set test[ascii('row1')][long(1)]=integer(35); > set test[ascii('row1')][long(2)]=integer(36); > set test[ascii('row1')][long(3)]=integer(38); > set test[ascii('row2')][long(1)]=integer(45); > set test[ascii('row2')][long(2)]=integer(42); > set test[ascii('row2')][long(3)]=integer(33); > [default@Test] list test; > Using default limit of 100 > --- > RowKey: 726f7731 > => (column=0001, value=35, timestamp=1308744931122000) > => (column=0002, value=36, timestamp=1308744931124000) > => (column=0003, value=38, timestamp=1308744931125000) > --- > RowKey: 726f7732 > => (column=0001, value=45, timestamp=1308744931127000) > => (column=0002, value=42, timestamp=1308744931128000) > => (column=0003, value=33, timestamp=1308744932722000) > 2 Rows Returned. 
> [default@Test] describe keyspace; > Keyspace: Test: > Replication Strategy: org.apache.cassandra.locator.SimpleStrategy > Durable Writes: true > Options: [replication_factor:1] > Column Families: > ColumnFamily: test > Key Validation Class: org.apache.cassandra.db.marshal.BytesType > Default column value validator: > org.apache.cassandra.db.marshal.BytesType > Columns sorted by: org.apache.cassandra.db.marshal.BytesType > Row cache size / save period in seconds: 0.0/0 > Key cache size / save period in seconds: 20.0/14400 > Memtable thresholds: 0.571875/122/1440 (millions of ops/MB/minutes) > GC grace seconds: 864000 > Compaction min/max thresholds: 4/32 > Read repair chance: 1.0 > Replicate on write: false > Built indexes: [] > {code} > In Pig command line: > {code} > grunt> test = LOAD 'cassandra://Test/test' USING CassandraStorage() AS > (rowkey:chararray, columns: bag {T: (name:long, value:int)}); > grunt> value_test = foreach test generate rowkey, columns.name, columns.value; > grunt> dump value_test; > {code} > In /var/log/cassandra/system.log, I have severals time this exception: > {code} > INFO [IPC Server handler 3 on 8012] 2011-06-22 15:03:28,533 > TaskInProgress.java (line 551) Error from > attempt_201106210955_0051_m_00_3: java.lang.RuntimeException: Unexpected > data type -1 found in stream. 
> at org.apache.pig.data.BinInterSedes.writeDatum(BinInterSedes.java:478) > at org.apache.pig.data.BinInterSedes.writeTuple(BinInterSedes.java:541) > at org.apache.pig.data.BinInterSedes.writeBag(BinInterSedes.java:522) > at org.apache.pig.data.BinInterSedes.writeDatum(BinInterSedes.java:361) > at org.apache.pig.data.BinInterSedes.writeTuple(BinInterSedes.java:541) > at org.apache.pig.data.BinInterSedes.writeDatum(BinInterSedes.java:357) > at > org.apache.pig.impl.io.InterRecordWriter.write(InterRecordWriter.java:73) > at org.apache.pig.impl.io.InterStorage.putNext(InterStorage.java:87) > at > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:138) > at > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:97) > at > org.apache.hadoop.mapred.MapTask$
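The special-casing Brandon describes centers on narrowing BigInteger values to native types Pig can serialize. A JDK-only sketch of that one conversion (the method name is illustrative; the actual setTupleValue in the patch operates on Pig tuples, which this sketch does not depend on) might look like:

```java
import java.math.BigInteger;

public class TupleValueNarrowing {
    /**
     * Narrows a BigInteger to Integer or Long when it fits, mirroring the
     * kind of special case a centralized setTupleValue helper would house.
     */
    public static Object narrow(BigInteger value) {
        if (value.bitLength() < 32)
            return value.intValue();   // fits in an int
        if (value.bitLength() < 64)
            return value.longValue();  // fits in a long
        return value;                  // too big; pass through unchanged
    }

    public static void main(String[] args) {
        System.out.println(narrow(BigInteger.valueOf(35)).getClass().getSimpleName());       // Integer
        System.out.println(narrow(BigInteger.valueOf(1L << 40)).getClass().getSimpleName()); // Long
    }
}
```

Keeping conversions like this in one method, rather than in per-type converters, is the design point of the v2 patch.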
[jira] [Commented] (CASSANDRA-2982) Refactor secondary index api
[ https://issues.apache.org/jira/browse/CASSANDRA-2982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13082039#comment-13082039 ] Jonathan Ellis commented on CASSANDRA-2982: --- Want to give a high-level overview of the changes here? > Refactor secondary index api > > > Key: CASSANDRA-2982 > URL: https://issues.apache.org/jira/browse/CASSANDRA-2982 > Project: Cassandra > Issue Type: Sub-task > Components: Core >Reporter: T Jake Luciani >Assignee: T Jake Luciani > Fix For: 1.0 > > Attachments: 2982-v1.txt > > > Secondary indexes currently make some bad assumptions about the underlying > indexes: > 1. That they are always stored in other column families. > 2. That there is a unique index per column. > In the case of CASSANDRA-2915 neither of these is true. The new api should > abstract the search concepts and allow any search api to plug in. > Once the code is refactored and basically pluggable, we can remove the > IndexType enum and use class names, similar to how we handle partitioners and > comparators. > The basic api is to add a SecondaryIndexManager that handles different index > types per CF and a SecondaryIndex base class that handles a particular type > implementation. > This requires major changes to ColumnFamilyStore and Table.IndexBuilder -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
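A rough shape for the pluggable api the ticket describes (a per-CF manager plus a SecondaryIndex base class, resolved by class name the way partitioners and comparators are) might look like the sketch below. Every name here is an assumption based on the ticket text, not the committed code:

```java
import java.util.HashMap;
import java.util.Map;

public class SecondaryIndexApi {
    // Base class each index implementation extends (replacing the IndexType enum).
    public static abstract class SecondaryIndex {
        public abstract String getIndexName();
        public abstract void index(String columnName, byte[] value);
    }

    // One concrete implementation, standing in for the existing keys index.
    public static class KeysIndex extends SecondaryIndex {
        public String getIndexName() { return "keys"; }
        public void index(String columnName, byte[] value) { /* write to the index storage */ }
    }

    // Per-column-family manager that routes each indexed column to its implementation.
    public static class SecondaryIndexManager {
        private final Map<String, SecondaryIndex> indexesByColumn =
                new HashMap<String, SecondaryIndex>();

        // resolved by class name, like partitioners and comparators
        public void addIndexedColumn(String columnName, String indexClassName) {
            try {
                SecondaryIndex index = (SecondaryIndex) Class.forName(indexClassName).newInstance();
                indexesByColumn.put(columnName, index);
            } catch (Exception e) {
                throw new RuntimeException(e);
            }
        }

        public SecondaryIndex getIndexForColumn(String columnName) {
            return indexesByColumn.get(columnName);
        }
    }
}
```

With this shape, an index backed by something other than a column family (the CASSANDRA-2915 case) is just another subclass; neither of the two bad assumptions is baked into the manager.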
[jira] [Commented] (CASSANDRA-3006) Enormous counter
[ https://issues.apache.org/jira/browse/CASSANDRA-3006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13082091#comment-13082091 ] Boris Yen commented on CASSANDRA-3006: -- Here is the test program I am using now; the hector version is 0.8.0-2. Hope this is helpful.

import java.util.Arrays;
import me.prettyprint.cassandra.model.AllOneConsistencyLevelPolicy;
import me.prettyprint.cassandra.serializers.StringSerializer;
import me.prettyprint.cassandra.service.CassandraHostConfigurator;
import me.prettyprint.cassandra.service.ThriftCluster;
import me.prettyprint.hector.api.Keyspace;
import me.prettyprint.hector.api.beans.HCounterColumn;
import me.prettyprint.hector.api.factory.HFactory;
import me.prettyprint.hector.api.mutation.Mutator;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class CounterTest {
    private Logger logger = LoggerFactory.getLogger(CounterTest.class);
    private static final Integer COUNTER_NUM = 1000;
    private static final StringSerializer ss = StringSerializer.get();
    private static final String HOST = "172.17.19.151:9160";
    private ThriftCluster cluster;

    public static void main(String[] args) {
        CounterTest tc = new CounterTest();
        try {
            tc.testAlarmCounter();
        } catch (InterruptedException e) {
        }
    }

    public CounterTest() {
        CassandraHostConfigurator chc = new CassandraHostConfigurator(HOST);
        chc.setMaxActive(100);
        chc.setMaxIdle(10);
        chc.setCassandraThriftSocketTimeout(6);
        cluster = new ThriftCluster("Test Cluster", chc);
    }

    public void testAlarmCounter() throws InterruptedException {
        int successCounter = 0;
        int cl = 0;
        for (int i = 0; i < COUNTER_NUM; i++) {
            try {
                Mutator<String> mutator = HFactory.createMutator(getKeyspace(cl), StringSerializer.get());
                HCounterColumn<String> column = HFactory.createCounterColumn("testSC", 1L);
                mutator.addCounter("sc", "testCounter",
                        HFactory.createCounterSuperColumn("testC", Arrays.asList(column), ss, ss));
                mutator.execute();
                successCounter++;
            } catch (Exception e) {
                logger.info("Error! Change consistency level to 1.", e);
                cl = 1;
            }
            Thread.sleep(50);
        }
        logger.info("\nsuccess counter: " + successCounter);
    }

    private Keyspace getKeyspace(int cl) {
        if (cl == 1)
            return HFactory.createKeyspace("test", cluster, new AllOneConsistencyLevelPolicy());
        else
            return HFactory.createKeyspace("test", cluster); // default consistency level is Quorum
    }
}

> Enormous counter > - > > Key: CASSANDRA-3006 > URL: https://issues.apache.org/jira/browse/CASSANDRA-3006 > Project: Cassandra > Issue Type: Bug >Affects Versions: 0.8.3 > Environment: ubuntu 10.04 >Reporter: Boris Yen >Assignee: Sylvain Lebresne > > I have two-node cluster with the following keyspace and column family > settings. > Cluster Information: >Snitch: org.apache.cassandra.locator.SimpleSnitch >Partitioner: org.apache.cassandra.dht.RandomPartitioner >Schema versions: > 63fda700-c243-11e0--2d03dcafebdf: [172.17.19.151, 172.17.19.152] > Keyspace: test: > Replication Strategy: org.apache.cassandra.locator.NetworkTopologyStrategy > Durable Writes: true > Options: [datacenter1:2] > Column Families: > ColumnFamily: testCounter (Super) > "APP status information." > Key Validation Class: org.apache.cassandra.db.marshal.BytesType > Default column value validator: > org.apache.cassandra.db.marshal.CounterColumnType > Columns sorted by: > org.apache.cassandra.db.marshal.BytesType/org.apache.cassandra.db.marshal.BytesType > Row cache size / save period in seconds: 0.0/0 > Key cache size / save period in seconds: 20.0/14400 > Memtable thresholds: 1.1578125/1440/247 (millions of ops/MB/minutes) > GC grace seconds: 864000 > Compaction min/max thresholds: 4/32 > Read repair chance: 1.0 > Replicate on write: true > Built indexes: [] > Then, I use a test program based on hector
[jira] [Commented] (CASSANDRA-2991) Add a 'load new sstables' JMX/nodetool command
[ https://issues.apache.org/jira/browse/CASSANDRA-2991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13082097#comment-13082097 ] Jonathan Ellis commented on CASSANDRA-2991: --- What about the "restore snapshot" scenario? > Add a 'load new sstables' JMX/nodetool command > -- > > Key: CASSANDRA-2991 > URL: https://issues.apache.org/jira/browse/CASSANDRA-2991 > Project: Cassandra > Issue Type: New Feature >Reporter: Brandon Williams >Priority: Minor > Fix For: 0.8.4 > > > Sometimes people have to create a new cluster to get around a problem and > need to copy sstables around. It would be convenient to be able to trigger > this from nodetool or JMX instead of doing a restart of the node. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CASSANDRA-1608) Redesigned Compaction
[ https://issues.apache.org/jira/browse/CASSANDRA-1608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Coverston updated CASSANDRA-1608: -- Attachment: 1608-v13.txt 1608 without some of the cruft > Redesigned Compaction > - > > Key: CASSANDRA-1608 > URL: https://issues.apache.org/jira/browse/CASSANDRA-1608 > Project: Cassandra > Issue Type: Improvement > Components: Core >Reporter: Chris Goffinet >Assignee: Benjamin Coverston > Attachments: 1608-v11.txt, 1608-v13.txt, 1608-v2.txt > > > After seeing the I/O issues in CASSANDRA-1470, I've been doing some more > thinking on this subject that I wanted to lay out. > I propose we redo the concept of how compaction works in Cassandra. At the > moment, compaction is kicked off based on a write access pattern, not read > access pattern. In most cases, you want the opposite. You want to be able to > track how well each SSTable is performing in the system. If we were to keep > in-memory statistics for each SSTable, prioritize them based on most accessed, > and bloom filter hit/miss ratios, we could intelligently group sstables that > are being read most often and schedule them for compaction. We could also > schedule lower-priority maintenance on SSTables not often accessed. > I also propose we limit the size of each SSTable to a fixed size, which gives > us the ability to better utilize our bloom filters in a predictable manner. > At the moment, after a certain size, the bloom filters become less reliable. > This would also allow us to group the data most accessed. Currently the size of > an SSTable can grow to a point where large portions of the data might not > actually be accessed as often. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
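The prioritization Chris describes (rank SSTables by read traffic and bloom-filter miss ratio, compact the hottest group first) can be sketched as follows. The score formula and all names are assumptions for illustration, not anything from the attached patches:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

public class CompactionPriority {
    public static class SSTableStats {
        public final String name;
        public final long reads;            // in-memory read counter for this sstable
        public final double bloomMissRatio; // bloom filter false-positive ratio

        public SSTableStats(String name, long reads, double bloomMissRatio) {
            this.name = name;
            this.reads = reads;
            this.bloomMissRatio = bloomMissRatio;
        }

        public double score() {
            // hotter and leakier sstables benefit most from compaction
            return reads * (1.0 + bloomMissRatio);
        }
    }

    /** Returns the sstables ordered most-compaction-worthy first. */
    public static List<SSTableStats> prioritize(List<SSTableStats> sstables) {
        List<SSTableStats> sorted = new ArrayList<SSTableStats>(sstables);
        Collections.sort(sorted, new Comparator<SSTableStats>() {
            public int compare(SSTableStats a, SSTableStats b) {
                return Double.compare(b.score(), a.score()); // highest score first
            }
        });
        return sorted;
    }
}
```

A scheduler would compact from the front of this list and defer the tail to low-priority maintenance, which is the read-driven inversion of the current write-driven trigger.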