Rüdiger "SortedMap<byte[], SortedMap<byte[], Pair<Long, byte[]>>>"
When using a RandomPartitioner or Murmur3Partitioner, the outer map is a
simple Map, not a SortedMap. The only case where you have a SortedMap for
the row key is when using OrderPreservingPartitioner, which is clearly not
advised in most cases because it creates hot spots in the cluster.

On Thu, Feb 20, 2014 at 10:49 PM, Rüdiger Klaehn <rkla...@gmail.com> wrote:

> Hi Sylvain,
>
> I applied the patch to the cassandra-2.0 branch (this required some manual
> work since I could not figure out which commit it was supposed to apply
> for, and it did not apply to the head of cassandra-2.0).
>
> The benchmark now runs in pretty much identical time to the thrift based
> benchmark. ~30s for 1000 inserts of 10000 key/value pairs each. Great work!
>
>
> I still have some questions regarding the mapping. Please bear with me if
> these are stupid questions. I am quite new to Cassandra.
>
> The basic cassandra data model for a keyspace is something like this,
> right?
>
> SortedMap<byte[], SortedMap<byte[], Pair<Long, byte[]>>>
>           ^ row key. determines which server(s) the rest is stored on
>                             ^ column key
>                                          ^ timestamp (latest one wins)
>                                                ^ value (can be size 0)
>
> So if I have a table like the one in my benchmark (using blobs)
>
> CREATE TABLE IF NOT EXISTS test.wide (
>   time blob,
>   name blob,
>   value blob,
>   PRIMARY KEY (time,name))
>   WITH COMPACT STORAGE
>
> From reading http://www.datastax.com/dev/blog/thrift-to-cql3 it seems
> that
>
> - time maps to the row key and name maps to the column key without any
>   overhead
> - value directly maps to value in the model above without any prefix
>
> is that correct, or is there some overhead involved in CQL over the raw
> model as described above? If so, where exactly?
>
> kind regards and many thanks for your help,
>
> Rüdiger
>
>
> On Thu, Feb 20, 2014 at 8:36 AM, Sylvain Lebresne <sylv...@datastax.com> wrote:
>
>> On Wed, Feb 19, 2014 at 9:38 PM, Rüdiger Klaehn <rkla...@gmail.com> wrote:
>>
>>> I have cloned the cassandra repo, applied the patch, and built it. But
>>> when I want to run the benchmark I get an exception. See below. I tried with
>>> a non-managed dependency to
>>> cassandra-driver-core-2.0.0-rc3-SNAPSHOT-jar-with-dependencies.jar, which I
>>> compiled from source because I read that that might help. But that did not
>>> make a difference.
>>>
>>> So currently I don't know how to give the patch a try. Any ideas?
>>>
>>> cheers,
>>>
>>> Rüdiger
>>>
>>> Exception in thread "main" java.lang.IllegalArgumentException: replicate_on_write is not a column defined in this metadata
>>>     at com.datastax.driver.core.ColumnDefinitions.getAllIdx(ColumnDefinitions.java:273)
>>>     at com.datastax.driver.core.ColumnDefinitions.getFirstIdx(ColumnDefinitions.java:279)
>>>     at com.datastax.driver.core.Row.getBool(Row.java:117)
>>>     at com.datastax.driver.core.TableMetadata$Options.<init>(TableMetadata.java:474)
>>>     at com.datastax.driver.core.TableMetadata.build(TableMetadata.java:107)
>>>     at com.datastax.driver.core.Metadata.buildTableMetadata(Metadata.java:128)
>>>     at com.datastax.driver.core.Metadata.rebuildSchema(Metadata.java:89)
>>>     at com.datastax.driver.core.ControlConnection.refreshSchema(ControlConnection.java:259)
>>>     at com.datastax.driver.core.ControlConnection.tryConnect(ControlConnection.java:214)
>>>     at com.datastax.driver.core.ControlConnection.reconnectInternal(ControlConnection.java:161)
>>>     at com.datastax.driver.core.ControlConnection.connect(ControlConnection.java:77)
>>>     at com.datastax.driver.core.Cluster$Manager.init(Cluster.java:890)
>>>     at com.datastax.driver.core.Cluster$Manager.newSession(Cluster.java:910)
>>>     at com.datastax.driver.core.Cluster$Manager.access$200(Cluster.java:806)
>>>     at com.datastax.driver.core.Cluster.connect(Cluster.java:158)
>>>     at cassandra.CassandraTestMinimized$delayedInit$body.apply(CassandraTestMinimized.scala:31)
>>>     at scala.Function0$class.apply$mcV$sp(Function0.scala:40)
>>>     at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:12)
>>>     at scala.App$$anonfun$main$1.apply(App.scala:71)
>>>     at scala.App$$anonfun$main$1.apply(App.scala:71)
>>>     at scala.collection.immutable.List.foreach(List.scala:318)
>>>     at scala.collection.generic.TraversableForwarder$class.foreach(TraversableForwarder.scala:32)
>>>     at scala.App$class.main(App.scala:71)
>>>     at cassandra.CassandraTestMinimized$.main(CassandraTestMinimized.scala:5)
>>>     at cassandra.CassandraTestMinimized.main(CassandraTestMinimized.scala)
>>
>> I believe you've tried the cassandra trunk branch? trunk is basically the
>> future Cassandra 2.1 and the driver is currently unhappy because the
>> replicate_on_write option has been removed in that version. I'm supposed to
>> have fixed that on the driver 2.0 branch like 2 days ago so maybe you're
>> also using a slightly old version of the driver sources in there? Or maybe
>> I've screwed up my fix, I'll double check. But anyway, it would be overall
>> simpler to test with the cassandra-2.0 branch of Cassandra, with which you
>> shouldn't run into that.
>>
>> --
>> Sylvain
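
For anyone following the thread, the nested-map view of a keyspace discussed above can be sketched in Java (9 or later, for Arrays.compareUnsigned). This is only a toy illustration under stated assumptions, not Cassandra's storage engine or the driver API: the class WideRowModel and its methods are hypothetical names, the outer HashMap stands in for a hash-based partitioner such as Murmur3Partitioner, the unsigned byte-wise ordering of column keys is an assumption about how blob column names compare, and the tie rule for equal timestamps is a simplification.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Toy model of the nested-map view from the thread; all names are hypothetical.
class WideRowModel {
    // A cell pairs a write timestamp with a value (the value may be empty).
    static final class Cell {
        final long timestamp;
        final byte[] value;

        Cell(long timestamp, byte[] value) {
            this.timestamp = timestamp;
            this.value = value;
        }
    }

    // Assumption: column keys sort byte-wise, treating bytes as unsigned.
    static final Comparator<byte[]> UNSIGNED = Arrays::compareUnsigned;

    // Outer map is a plain HashMap: with a hash-based partitioner there is no
    // useful ordering of row keys. ByteBuffer supplies the content-based
    // equals/hashCode that byte[] lacks.
    private final Map<ByteBuffer, NavigableMap<byte[], Cell>> rows = new HashMap<>();

    void insert(byte[] rowKey, byte[] columnKey, long timestamp, byte[] value) {
        NavigableMap<byte[], Cell> columns = rows.computeIfAbsent(
                ByteBuffer.wrap(rowKey.clone()),
                k -> new TreeMap<byte[], Cell>(UNSIGNED));
        Cell existing = columns.get(columnKey);
        // Latest timestamp wins; on a tie this sketch keeps the existing cell.
        if (existing == null || timestamp > existing.timestamp) {
            columns.put(columnKey.clone(), new Cell(timestamp, value.clone()));
        }
    }

    // Returns the columns of one row in column-key order (empty if absent).
    NavigableMap<byte[], Cell> columns(byte[] rowKey) {
        NavigableMap<byte[], Cell> columns = rows.get(ByteBuffer.wrap(rowKey));
        return columns != null ? columns : new TreeMap<byte[], Cell>(UNSIGNED);
    }

    // Returns the current value of a cell, or null if it does not exist.
    byte[] get(byte[] rowKey, byte[] columnKey) {
        Cell cell = columns(rowKey).get(columnKey);
        return cell == null ? null : cell.value;
    }

    public static void main(String[] args) {
        WideRowModel model = new WideRowModel();
        byte[] time = {0x01};
        model.insert(time, new byte[] {(byte) 0x80}, 1L, new byte[] {10});
        model.insert(time, new byte[] {0x7f}, 1L, new byte[0]);
        model.insert(time, new byte[] {0x7f}, 2L, new byte[] {20}); // newer write wins
        for (Map.Entry<byte[], Cell> e : model.columns(time).entrySet()) {
            System.out.println(Arrays.toString(e.getKey()) + " -> ts=" + e.getValue().timestamp);
        }
    }
}
```

In terms of the test.wide table from the thread, time plays the role of rowKey, name the role of columnKey, and value the cell value. Only the inner map is sorted, which matches the point in the reply that row keys are ordered only under OrderPreservingPartitioner.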