error casting Column to SuperColumn during compaction (CASSANDRA-1992?)
I got the error below on a newish 0.7.0 cluster with the following:
- no schema changes
- original RF at 1, changed to 3 via cassandra-cli, and repair run
- stable node membership, i.e. no nodes added

Was thinking it may have to do with CASSANDRA-1992 (see http://www.mail-archive.com/user@cassandra.apache.org/msg09276.html), but I've not seen this error associated with that issue before. I can apply the patch or run the head of the 0.7 branch if that will help. May be able to dig into it further later today. (not ruling out me doing stupid things yet)

Aaron

ERROR [CompactionExecutor:1] 2011-02-08 16:59:35,380 AbstractCassandraDaemon.java (line org.apache.cassandra.service.AbstractCassandraDaemon) Fatal exception in thread Thread[CompactionExecutor:1,1,main]
java.lang.ClassCastException: org.apache.cassandra.db.Column cannot be cast to org.apache.cassandra.db.SuperColumn
        at org.apache.cassandra.db.SuperColumnSerializer.serialize(SuperColumn.java:329)
        at org.apache.cassandra.db.SuperColumnSerializer.serialize(SuperColumn.java:313)
        at org.apache.cassandra.db.ColumnFamilySerializer.serializeForSSTable(ColumnFamilySerializer.java:87)
        at org.apache.cassandra.db.ColumnFamilySerializer.serializeWithIndexes(ColumnFamilySerializer.java:106)
        at org.apache.cassandra.io.PrecompactedRow.<init>(PrecompactedRow.java:97)
        at org.apache.cassandra.io.CompactionIterator.getCompactedRow(CompactionIterator.java:138)
        at org.apache.cassandra.io.CompactionIterator.getReduced(CompactionIterator.java:107)
        at org.apache.cassandra.io.CompactionIterator.getReduced(CompactionIterator.java:42)
        at org.apache.cassandra.utils.ReducingIterator.computeNext(ReducingIterator.java:73)
        at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:136)
        at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:131)
        at org.apache.commons.collections.iterators.FilterIterator.setNextObject(FilterIterator.java:183)
        at org.apache.commons.collections.iterators.FilterIterator.hasNext(FilterIterator.java:94)
        at org.apache.cassandra.db.CompactionManager.doCompaction(CompactionManager.java:323)
        at org.apache.cassandra.db.CompactionManager$1.call(CompactionManager.java:122)
        at org.apache.cassandra.db.CompactionManager$1.call(CompactionManager.java:92)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
        at java.util.concurrent.FutureTask.run(FutureTask.java:138)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:619)

Then after restart started getting...

ERROR [CompactionExecutor:1] 2011-02-09 10:39:15,496 PrecompactedRow.java (line org.apache.cassandra.io.PrecompactedRow.<init>(PrecompactedRow.java:82)) Skipping row DecoratedKey(71445620198865609512646197760056087250, 2f77686174696674686973776572656d617374657266696c652f746e6e5f73686f74735f3035305f3030352f6d664964) in /local1/cassandra/data/junkbox/ObjectIndex-e-216-Data.db
java.io.IOException: Corrupt (negative) value length encountered
        at org.apache.cassandra.utils.FBUtilities.readByteArray(FBUtilities.java:274)
        at org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:94)
        at org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:35)
        at org.apache.cassandra.db.ColumnFamilySerializer.deserializeColumns(ColumnFamilySerializer.java:129)
        at org.apache.cassandra.io.sstable.SSTableIdentityIterator.getColumnFamilyWithColumns(SSTableIdentityIterator.java:137)
        at org.apache.cassandra.io.PrecompactedRow.<init>(PrecompactedRow.java:78)
        at org.apache.cassandra.io.CompactionIterator.getCompactedRow(CompactionIterator.java:138)
        at org.apache.cassandra.io.CompactionIterator.getReduced(CompactionIterator.java:107)
        at org.apache.cassandra.io.CompactionIterator.getReduced(CompactionIterator.java:42)
        at org.apache.cassandra.utils.ReducingIterator.computeNext(ReducingIterator.java:73)
        at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:136)
        at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:131)
        at org.apache.commons.collections.iterators.FilterIterator.setNextObject(FilterIterator.java:183)
        at org.apache.commons.collections.iterators.FilterIterator.hasNext(FilterIterator.java:94)
        at org.apache.cassandra.db.CompactionManager.doCompaction(CompactionManager.java:323)
        at org.apache.cassandra.db.CompactionManager$1.call(CompactionManager.java:122)
        at org.apache.cassandra.db.CompactionManager$1.call(CompactionManager.java:92)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
        at java.util.concurrent.FutureTask.run(FutureTask.java:138)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at
Different comparator types for column and supercolumn don't work
Hello, I'm new to Cassandra. I'm using Cassandra release 0.7.0 (local, single node). I can't perform write operations when the column and supercolumn families have different comparator types. For example, if I run the commands given by Jonathan Ellis in https://issues.apache.org/jira/browse/CASSANDRA-1712 in the CLI, I get the following output:

[default@Keyspace1] create keyspace KS1
8bb2fc2d-1fcb-11e0-add0-a9c93d38c544
[default@Keyspace1] use KS1
Authenticated to keyspace: KS1
[default@KS1] create column family CFCli with column_type = 'Super' and comparator = 'LongType' and subcomparator = 'UTF8Type'
97742bbe-1fcb-11e0-add0-a9c93d38c544
[default@KS1] set CFCli['newrow'][1234567890]['column'] = 'value'
'column' could not be translated into a LongType.

I also tried a setup with the example keyspace included in the release (loaded via the StorageService bean's loadSchemaFromYAML method):

ColumnFamily: Super3 (Super)
  "A column family with supercolumns, whose column names are Longs (8 bytes)"
  Columns sorted by: org.apache.cassandra.db.marshal.LongType/org.apache.cassandra.db.marshal.BytesType
  Subcolumns sorted by: org.apache.cassandra.db.marshal.LongType
  Row cache size / save period: 0.0/0
  Key cache size / save period: 20.0/3600
  Memtable thresholds: 0.2953125/63/60
  GC grace seconds: 864000
  Compaction min/max thresholds: 4/32

CLI output:

[default@Keyspace1] set Super3['account_value']['1:1'][1234567890] = 'value1'
A long is exactly 8 bytes: 3
[default@Keyspace1] set Super3['account_value'][1234567890]['test'] = 'value1'
'test' could not be translated into a LongType.
[default@Keyspace1] set Super3['account_value'][1234567890][1234567890] = 'value1'
A long is exactly 8 bytes: 10
[default@Keyspace1] set Super3[1234567890][1234567890][1234567890] = 'value1'
Syntax error at position 11: mismatched input '1234567890' expecting set null
[default@Keyspace1] set Super3['account_value']['test'][1234567890] = 'value1'
A long is exactly 8 bytes: 4
[default@Keyspace1] set Super3[1234567890]['test']['column'] = 'value1'
Syntax error at position 11: mismatched input '1234567890' expecting set null

According to the CLI help the format is: set cf['key']['super']['col'] = value, so these errors seem odd to me. What am I doing wrong?

Thanks in advance,
Kind regards,
Karin
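For what it's worth, the "A long is exactly 8 bytes: N" messages come from LongType validation: a LongType column name must be exactly 8 raw bytes, and the CLI here is sending the literal ASCII bytes of the quoted string (so '1:1' arrives as 3 bytes). A rough illustration in Python of the same check; this is my own sketch of the validation logic, not Cassandra's code:

```python
import struct

def to_longtype(value):
    # Encode an int the way a client should send a LongType name:
    # exactly 8 bytes, big-endian, signed.
    return struct.pack(">q", value)

def validate_longtype(name):
    # Mimics the length check that produces the CLI errors above.
    if len(name) != 8:
        raise ValueError("A long is exactly 8 bytes: %d" % len(name))
    return struct.unpack(">q", name)[0]

packed = to_longtype(1234567890)
print(len(packed))                # 8
print(validate_longtype(packed))  # 1234567890

# The quoted CLI string '1:1' is just 3 ASCII bytes, hence "8 bytes: 3".
print(len(b"1:1"))                # 3
```

A Thrift client that packs its longs this way avoids the error; the CLI output above suggests it is the CLI's literal-to-comparator translation that trips over mixed comparator/subcomparator setups.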
Re: set ReplicationFactor and Token at Column Family/SuperColumn level.
If I create 3-4 keyspaces, will this impact performance and resources (esp. memory and disk I/O) too much?

Thanks,
Zhong

On Aug 5, 2010, at 4:52 PM, Benjamin Black wrote:

> On Thu, Aug 5, 2010 at 12:59 PM, Zhong Li z...@voxeo.com wrote:
>> The big thing bother me is initial ring token. We have some Column
>> Families. It is very hard to choose one token suitable for all CFs. Also
>> some Column Families need higher Consistent Level and some don't. If we set
>
> Consistency Level is set by clients, per request. If you require
> different _Replication Factors_ for different CFs, then just put them in
> different keyspaces. Additional keyspaces have very little overhead
> (unlike CFs).
>
>> ReplicationFactor too high, it is too costy for crossing datacenter,
>> especially in otherside the world. I know we can setup multiple rings,
>> but it costs more hardware. if Cassandra can implement Ring, Token and RF
>> on the CF level, or even SuperColumn level, it will make design much
>> easier and more efficiency. Is it possible?
>
> The approach I described above is what you can do. The rest of what
> you asked is not happening.
>
> b
Re: set ReplicationFactor and Token at Column Family/SuperColumn level.
Additional keyspaces have very little overhead (unlike CFs).

On Fri, Aug 6, 2010 at 9:42 AM, Zhong Li z...@voxeo.com wrote:
> If I create 3-4 keyspaces, will this impact performance and resources
> (esp. memory and disk I/O) too much?
>
> Thanks,
> Zhong
set ReplicationFactor and Token at Column Family/SuperColumn level.
All,

Thanks for the Apache Cassandra project; it is a great project. This is my first time using it. We installed it on 10 nodes and it runs great. The 10 nodes span 5 datacenters around the world.

The big thing bothering me is the initial ring token. We have several Column Families, and it is very hard to choose one token suitable for all CFs. Also, some Column Families need a higher Consistency Level and some don't. If we set the ReplicationFactor too high, it is too costly to replicate across datacenters, especially to the other side of the world. I know we can set up multiple rings, but that costs more hardware. If Cassandra could implement Ring, Token, and RF at the CF level, or even the SuperColumn level, it would make design much easier and more efficient. Is it possible?

Thanks,
Zhong
Re: set ReplicationFactor and Token at Column Family/SuperColumn level.
On Thu, Aug 5, 2010 at 14:59, Zhong Li z...@voxeo.com wrote:
> All,
> Thanks for Apache Cassandra Project, it is great project. This is my
> first time to use it. We install it on 10 nodes and runs great. The 10
> nodes cross all 5 datacenters around the world.
> The big thing bother me is initial ring token. We have some Column
> Families. It is very hard to choose one token suitable for all CFs. Also
> some Column Families need higher Consistent Level and some don't. If we
> set ReplicationFactor too high, it is too costy for crossing datacenter,
> especially in otherside the world. I know we can setup multiple rings,
> but it costs more hardware. if Cassandra can implement Ring, Token and RF
> on the CF level, or even SuperColumn level, it will make design much
> easier and more efficiency.

You can get separate replication factors by putting your column families in separate keyspaces. Token per CF (or even token per KS) isn't on the roadmap though. You can mitigate this to some extent by choosing row keys that play nicely with your ring.

Gary

> Is it possible?
> Thanks,
> Zhong
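The one-keyspace-per-replication-factor approach suggested in this thread might look like the following in the 0.7-era CLI. The keyspace and CF names are made up for illustration, and the exact option names should be checked against `help create keyspace` on your version:

```
[default@unknown] create keyspace CriticalKS with replication_factor = 3
[default@unknown] create keyspace BulkKS with replication_factor = 1
[default@unknown] use CriticalKS
[default@CriticalKS] create column family Accounts
```

Each keyspace carries its own replication factor, so CFs needing strong cross-datacenter durability go in CriticalKS while cheap bulk data goes in BulkKS, all on the same ring.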
Re: Column or SuperColumn
if you have a relatively small, static set of subcolumns that you read as a group, then using supercolumns is reasonable

On Tue, Jun 1, 2010 at 7:33 PM, Peter Hsu pe...@motivecast.com wrote:
> I have a pretty simple data modeling question. I don't know whether or
> not to use a CF or SCF in one instance. Here's my example. I have a Store
> entry and locations for each store. So I have something like:
>
> Using CF:
> Store { //CF
>   storeId { //row key
>     storeName:str, storeLogo:image
>   }
>   storeId:locationId1 {
>     locationName:str, latLong:coordinate
>   }
>   storeId:locationId2 {
>     locationName:str, latLong:coordinate
>   }
> }
>
> Using SCF:
> Store { //SCF
>   storeId { //row key
>     store { storeName:str, storeLogo:image }
>     locationId1 { locationName:str, latLong:coordinate }
>     locationId2 { locationName:str, latLong:coordinate }
>   }
> }
>
> Queries:
> Reads:
> 1. Read store and all locations (could be done by range query efficiently
>    when using CF, since I'm using OPP)
> 2. Read only a particular location of a store (don't need the store meta
>    data here)
> 3. Read only store name info (don't need any location info here)
> Writes:
> 1. Update store meta data (without touching location info)
> 2. Update location data for a store (without touching rest of store data)
> 3. Add a new location to an existing store (would have a unique identifier
>    for location, no worries about having to do a read..)
>
> I read that SuperColumns are not as fast as Columns, and obviously you
> can't have indexed subcolumns of supercolumns, but in this case I don't
> need the sub-subcolumn indices. It seems cleaner to model it as a
> SuperColumn, but why would I want to pay a performance penalty instead of
> just concatenating my keys? This seems like a fairly common pattern.
> What's the rule to decide between CF and SCF?
>
> Thanks,
> Peter

--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com
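Peter's two layouts can be sketched as plain dicts to see how the reads and writes fall out. This is only an illustration of the key structure; the helper functions are hypothetical, not a Cassandra API:

```python
def cf_rows(store_id, meta, locations):
    """Plain CF with concatenated keys: one row for the store's metadata,
    plus one 'storeId:locationId' row per location."""
    rows = {store_id: dict(meta)}
    for loc_id, loc in locations.items():
        rows["%s:%s" % (store_id, loc_id)] = dict(loc)
    return rows

def scf_row(store_id, meta, locations):
    """SCF: a single row whose supercolumns are the store metadata
    plus one supercolumn per location."""
    row = {"store": dict(meta)}
    for loc_id, loc in locations.items():
        row[loc_id] = dict(loc)
    return {store_id: row}

def read_store_and_locations(rows, store_id):
    """Emulates read #1 on the CF layout: with an order-preserving
    partitioner, the store row and all its 'storeId:...' rows sort
    contiguously, so one range scan returns them all."""
    return {k: v for k, v in rows.items()
            if k == store_id or k.startswith(store_id + ":")}
```

In the CF layout, writes 1-3 each touch exactly one row; in the SCF layout they all mutate the same row. The often-cited SuperColumn penalty is that a supercolumn is deserialized whole to read any of its subcolumns, which is one reason the concatenated-key pattern stays popular despite the uglier keys.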