Re: How to create a TupleType/TupleValue in a UDF
I’m running 3.0.8, so it probably wasn’t fixed? ;) [cqlsh 5.0.1 | Cassandra 3.0.8 | CQL spec 3.4.0 | Native protocol v4] The CodecNotFoundException is very random; when I get it, if I re-run the exact same query, it works! I’ll see if I can reproduce it more consistently. BTW, is there a way to get the CodecRegistry and the ProtocolVersion from the UDF environment so I don’t have to create them? - Drew > On Aug 18, 2016, at 10:10 AM, Tyler Hobbs wrote: > > The logback-related error is due to > https://issues.apache.org/jira/browse/CASSANDRA-11033, which is fixed in > 3.0.4 and 3.4. > > I'm not sure about the CodecNotFoundException, can you reproduce that one > reliably? > > On Thu, Aug 18, 2016 at 10:52 AM, Drew Kutcharian <d...@venarc.com> wrote: > Hi All, > > I have a UDF/UDA that returns a map of date -> TupleValue. > > CREATE OR REPLACE FUNCTION min_max_by_timestamps_udf(state map<timestamp, frozen<tuple<timestamp, timestamp>>>, flake blob) > RETURNS NULL ON NULL INPUT > RETURNS map<timestamp, frozen<tuple<timestamp, timestamp>>> > LANGUAGE java > > CREATE OR REPLACE AGGREGATE min_max_by_timestamps(blob) > SFUNC min_max_by_timestamps_udf > STYPE map<timestamp, frozen<tuple<timestamp, timestamp>>> > INITCOND {}; > > I’ve been using the following syntax to build the TupleType/TupleValue in my > UDF: > > TupleType tupleType = > TupleType.of(com.datastax.driver.core.ProtocolVersion.NEWEST_SUPPORTED, > CodecRegistry.DEFAULT_INSTANCE, DataType.timestamp(), DataType.timestamp()); > tupleType.newValue(new java.util.Date(timestamp), new > java.util.Date(timestamp)); > > But “randomly” I get errors like the following: > FunctionFailure: code=1400 [User Defined Function failure] message="execution > of 'testdb.min_max_by_timestamps_udf[map<timestamp, frozen<tuple<timestamp, timestamp>>>, blob]' failed: java.security.AccessControlException: access > denied ("java.io.FilePermission" "/etc/cassandra/logback.xml" "read")" > > Or a CodecNotFoundException for Cassandra not being able to find a codec for > "map<timestamp, frozen<tuple<timestamp, timestamp>>>". > > Is this a bug or am I doing something wrong?
> > > Thanks, > > Drew > > > > -- > Tyler Hobbs > DataStax <http://datastax.com/>
How to create a TupleType/TupleValue in a UDF
Hi All, I have a UDF/UDA that returns a map of date -> TupleValue. CREATE OR REPLACE FUNCTION min_max_by_timestamps_udf(state map<timestamp, frozen<tuple<timestamp, timestamp>>>, flake blob) RETURNS NULL ON NULL INPUT RETURNS map<timestamp, frozen<tuple<timestamp, timestamp>>> LANGUAGE java CREATE OR REPLACE AGGREGATE min_max_by_timestamps(blob) SFUNC min_max_by_timestamps_udf STYPE map<timestamp, frozen<tuple<timestamp, timestamp>>> INITCOND {}; I’ve been using the following syntax to build the TupleType/TupleValue in my UDF: TupleType tupleType = TupleType.of(com.datastax.driver.core.ProtocolVersion.NEWEST_SUPPORTED, CodecRegistry.DEFAULT_INSTANCE, DataType.timestamp(), DataType.timestamp()); tupleType.newValue(new java.util.Date(timestamp), new java.util.Date(timestamp)); But “randomly” I get errors like the following: FunctionFailure: code=1400 [User Defined Function failure] message="execution of 'testdb.min_max_by_timestamps_udf[map<timestamp, frozen<tuple<timestamp, timestamp>>>, blob]' failed: java.security.AccessControlException: access denied ("java.io.FilePermission" "/etc/cassandra/logback.xml" "read")" Or a CodecNotFoundException for Cassandra not being able to find a codec for "map<timestamp, frozen<tuple<timestamp, timestamp>>>". Is this a bug or am I doing something wrong? Thanks, Drew
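[Note: the state function above boils down to a min/max-per-bucket fold. Below is a minimal plain-Java sketch of that logic, with a long[] {min, max} pair standing in for the TupleValue and the class/method names invented for illustration; the real UDF would build a TupleValue via tupleType.newValue(...) as shown in the thread.]

```java
import java.util.HashMap;
import java.util.Map;

public class MinMaxByDay {
    static final long DAY_MS = 24L * 60 * 60 * 1000;

    // Mirrors the UDA's state function: key the state by day bucket and
    // keep a (min, max) timestamp pair per bucket. A long[] pair is a
    // stand-in for the frozen<tuple<timestamp, timestamp>> value.
    static Map<Long, long[]> update(Map<Long, long[]> state, long eventMs) {
        long bucket = (eventMs / DAY_MS) * DAY_MS; // midnight of the event's day
        long[] pair = state.get(bucket);
        if (pair == null) {
            state.put(bucket, new long[] { eventMs, eventMs });
        } else {
            pair[0] = Math.min(pair[0], eventMs);
            pair[1] = Math.max(pair[1], eventMs);
        }
        return state;
    }
}
```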
Re: Cassandra Debian repos (Apache vs DataStax)
OK to make things even more confusing, the “Release” files in the Apache Repo say "Origin: Unofficial Cassandra Packages”!! i.e. http://dl.bintray.com/apache/cassandra/dists/35x/:Release > On May 17, 2016, at 12:11 PM, Drew Kutcharian wrote: > > BTW, the language on this page should probably change since it currently > sounds like the official repo is the DataStax one and Apache is only an > “alternative" > > http://wiki.apache.org/cassandra/DebianPackaging > > - Drew > >> On May 17, 2016, at 11:35 AM, Drew Kutcharian wrote: >> >> Thanks Eric. >> >> >>> On May 17, 2016, at 7:50 AM, Eric Evans wrote: >>> >>> On Mon, May 16, 2016 at 5:19 PM, Drew Kutcharian wrote: >>>> >>>> What’s the difference between the two “Community” repositories Apache >>>> (http://www.apache.org/dist/cassandra/debian) and DataStax >>>> (http://debian.datastax.com/community/)? >>> >>> Good question. All I can tell you is that the Apache repository is >>> the official one (the only official one). >>> >>>> If they are just mirrors, then it seems like the DataStax one is a bit >>>> behind (version 3.0.6 is available on Apache but not on DataStax). >>>> >>>> I’ve been using the DataStax community repo and wanted to see if I still >>>> should continue using it or switch to the Apache repo. >>> >>> If it is your intention to run Apache Cassandra, from the Apache >>> Cassandra project, then you should be using the Apache repo. >>> >>> -- >>> Eric Evans >>> eev...@apache.org >> >
Re: Cassandra Debian repos (Apache vs DataStax)
BTW, the language on this page should probably change since it currently sounds like the official repo is the DataStax one and Apache is only an “alternative" http://wiki.apache.org/cassandra/DebianPackaging - Drew > On May 17, 2016, at 11:35 AM, Drew Kutcharian wrote: > > Thanks Eric. > > >> On May 17, 2016, at 7:50 AM, Eric Evans wrote: >> >> On Mon, May 16, 2016 at 5:19 PM, Drew Kutcharian wrote: >>> >>> What’s the difference between the two “Community” repositories Apache >>> (http://www.apache.org/dist/cassandra/debian) and DataStax >>> (http://debian.datastax.com/community/)? >> >> Good question. All I can tell you is that the Apache repository is >> the official one (the only official one). >> >>> If they are just mirrors, then it seems like the DataStax one is a bit >>> behind (version 3.0.6 is available on Apache but not on DataStax). >>> >>> I’ve been using the DataStax community repo and wanted to see if I still >>> should continue using it or switch to the Apache repo. >> >> If it is your intention to run Apache Cassandra, from the Apache >> Cassandra project, then you should be using the Apache repo. >> >> -- >> Eric Evans >> eev...@apache.org >
Re: Cassandra Debian repos (Apache vs DataStax)
Thanks Eric. > On May 17, 2016, at 7:50 AM, Eric Evans wrote: > > On Mon, May 16, 2016 at 5:19 PM, Drew Kutcharian wrote: >> >> What’s the difference between the two “Community” repositories Apache >> (http://www.apache.org/dist/cassandra/debian) and DataStax >> (http://debian.datastax.com/community/)? > > Good question. All I can tell you is that the Apache repository is > the official one (the only official one). > >> If they are just mirrors, then it seems like the DataStax one is a bit >> behind (version 3.0.6 is available on Apache but not on DataStax). >> >> I’ve been using the DataStax community repo and wanted to see if I still >> should continue using it or switch to the Apache repo. > > If it is your intention to run Apache Cassandra, from the Apache > Cassandra project, then you should be using the Apache repo. > > -- > Eric Evans > eev...@apache.org
Cassandra Debian repos (Apache vs DataStax)
Hi, What’s the difference between the two “Community” repositories Apache (http://www.apache.org/dist/cassandra/debian) and DataStax (http://debian.datastax.com/community/)? If they are just mirrors, then it seems like the DataStax one is a bit behind (version 3.0.6 is available on Apache but not on DataStax). I’ve been using the DataStax community repo and wanted to see if I still should continue using it or switch to the Apache repo. Best, Drew
Cassandra 3.0.6 Release?
Hi, What’s the 3.0.6 release date? Seems like the code has been frozen for a few days now. I ask because I want to install Cassandra on Ubuntu 16.04 and CASSANDRA-10853 is blocking it. Best, Drew
Re: Data partitioning and composite partition key
Mainly lower latency and lower network overhead in multi-get requests (WHERE IN (….)). The coordinator needs to connect to only one node vs. potentially all the nodes in the cluster. On Aug 29, 2014, at 5:23 PM, Jack Krupansky wrote: > Okay, but what benefit do you think you get from having the partitions on the > same node – since they would be separate partitions anyway? I mean, what > exactly do you think you’re going to do with them, that wouldn’t be a whole > lot more performant by being able to process data in parallel from separate > nodes? I mean, the whole point of Cassandra is scalability and distributed > processing, right? > > -- Jack Krupansky > > From: Drew Kutcharian > Sent: Friday, August 29, 2014 7:31 PM > To: user@cassandra.apache.org > Subject: Re: Data partitioning and composite partition key > > Hi Jack, > > I think you missed the point of my email which was trying to avoid the > problem of having very wide rows :) In the notation of sensorId-datetime, > the datetime is a datetime bucket, say a day. The CQL rows would still be > keyed by the actual time of the event. So you’d end up having > SensorId->Datetime Bucket (day/week/month)->actual event. What I wanted to be > able to do was to colocate all the events related to a sensor id on a single > node (token). > > See "High Throughput Timelines" at > http://www.datastax.com/dev/blog/advanced-time-series-with-cassandra > > - Drew > > > On Aug 29, 2014, at 3:58 PM, Jack Krupansky wrote: > >> With CQL3, you, the developer, get to decide whether to place a primary key >> column in the partition key or as a clustering column. So, make sensorID the >> partition key and datetime as a clustering column. 
>> >> -- Jack Krupansky >> >> From: Drew Kutcharian >> Sent: Friday, August 29, 2014 6:48 PM >> To: user@cassandra.apache.org >> Subject: Data partitioning and composite partition key >> >> Hey Guys, >> >> AFAIK, currently Cassandra partitions (thrift) rows using the row key, >> basically uses the hash(row_key) to decide what node that row needs to be >> stored on. Now there are times when there is a need to shard a wide row, say >> storing events per sensor, so you’d have sensorId-datetime row key so you >> don’t end up with very large rows. Is there a way to have the partitioner >> use only the “sensorId” part of the row key for the hash? This way we would >> be able to store all the data relating to a sensor in one node. >> >> Another use case of this would be multi-tenancy: >> >> Say we have accounts and accounts have users. So we would have the following >> tables: >> >> CREATE TABLE account ( >> id timeuuid PRIMARY KEY, >> company text //timezone >> ); >> >> CREATE TABLE user ( >> id timeuuid PRIMARY KEY, >> accountId timeuuid, >> email text, >> password text >> ); >> >> // Get users by account >> CREATE TABLE user_account_index ( >> accountId timeuuid, >> userId timeuuid, >> PRIMARY KEY(accountId, userId) >> ); >> >> Say I want to get all the users that belong to an account. I would first >> have to get the results from user_account_index and then use a multi-get >> (WHERE IN) to get the records from user table. Now this multi-get part could >> potentially query a lot of different nodes in the cluster. It’d be great if >> there was a way to limit storage of users of an account to a single node so >> that way multi-get would only need to query a single node. >> >> Note that the problem cannot be simply fixed by using (accountId, id) as the >> primary key for the user table since that would create a problem of having a >> very large number of (thrift) rows in the users table. >> >> I did look thru the code and JIRA and I couldn’t really find a solution. 
The >> closest I got was to have a custom partitioner, but then you can’t have a >> partitioner per keyspace and that’s not even something that’d be implemented >> in future based on the following JIRA: >> https://issues.apache.org/jira/browse/CASSANDRA-295 >> >> Any ideas are much appreciated. >> >> Best, >> >> Drew > >
Re: Data partitioning and composite partition key
Hi Rob, I agree that one should not mess around with the default partitioner. But there might be value in improving the Murmur3 partitioner to be “Composite Aware”. Since we can have composites in row keys now, why not be able to use only a part of the row key for partitioning? Makes sense? I just opened this JIRA https://issues.apache.org/jira/browse/CASSANDRA-7850 - Drew On Aug 29, 2014, at 4:36 PM, Robert Coli wrote: > On Fri, Aug 29, 2014 at 3:48 PM, Drew Kutcharian wrote: > AFAIK, currently Cassandra partitions (thrift) rows using the row key, > basically uses the hash(row_key) to decide what node that row needs to be > stored on. Now there are times when there is a need to shard a wide row, say > storing events per sensor, so you’d have sensorId-datetime row key so you > don’t end up with very large rows. Is there a way to have the partitioner use > only the “sensorId” part of the row key for the hash? This way we would be > able to store all the data relating to a sensor in one node. > > As a general statement, if you believe you need to create a custom > Partitioner in order to handle your use case, you are almost certainly wrong > or Doing It Wrong. > > =Rob
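[Note: a toy illustration of the CASSANDRA-7850 idea — if the partitioner hashed only the first component of a composite row key, every time bucket of a sensor would land on the same node. String.hashCode() stands in for Murmur3 and a plain node count stands in for a token ring; both are illustrative assumptions, not Cassandra's actual routing.]

```java
public class PrefixPartitioner {
    // Route by hashing only the sensorId prefix of the composite key
    // ("sensorId-dayBucket"). The bucket component deliberately does not
    // participate in the hash, so all buckets of one sensor co-locate.
    static int nodeFor(String sensorId, String dayBucket, int nodeCount) {
        return Math.floorMod(sensorId.hashCode(), nodeCount);
    }
}
```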
Re: Data partitioning and composite partition key
Hi Jack, I think you missed the point of my email which was trying to avoid the problem of having very wide rows :) In the notation of sensorId-datetime, the datetime is a datetime bucket, say a day. The CQL rows would still be keyed by the actual time of the event. So you’d end up having SensorId->Datetime Bucket (day/week/month)->actual event. What I wanted to be able to do was to colocate all the events related to a sensor id on a single node (token). See "High Throughput Timelines" at http://www.datastax.com/dev/blog/advanced-time-series-with-cassandra - Drew On Aug 29, 2014, at 3:58 PM, Jack Krupansky wrote: > With CQL3, you, the developer, get to decide whether to place a primary key > column in the partition key or as a clustering column. So, make sensorID the > partition key and datetime as a clustering column. > > -- Jack Krupansky > > From: Drew Kutcharian > Sent: Friday, August 29, 2014 6:48 PM > To: user@cassandra.apache.org > Subject: Data partitioning and composite partition key > > Hey Guys, > > AFAIK, currently Cassandra partitions (thrift) rows using the row key, > basically uses the hash(row_key) to decide what node that row needs to be > stored on. Now there are times when there is a need to shard a wide row, say > storing events per sensor, so you’d have sensorId-datetime row key so you > don’t end up with very large rows. Is there a way to have the partitioner use > only the “sensorId” part of the row key for the hash? This way we would be > able to store all the data relating to a sensor in one node. > > Another use case of this would be multi-tenancy: > > Say we have accounts and accounts have users. 
So we would have the following > tables: > > CREATE TABLE account ( > id timeuuid PRIMARY KEY, > company text //timezone > ); > > CREATE TABLE user ( > id timeuuid PRIMARY KEY, > accountId timeuuid, > email text, > password text > ); > > // Get users by account > CREATE TABLE user_account_index ( > accountId timeuuid, > userId timeuuid, > PRIMARY KEY(accountId, userId) > ); > > Say I want to get all the users that belong to an account. I would first have > to get the results from user_account_index and then use a multi-get (WHERE > IN) to get the records from user table. Now this multi-get part could > potentially query a lot of different nodes in the cluster. It’d be great if > there was a way to limit storage of users of an account to a single node so > that way multi-get would only need to query a single node. > > Note that the problem cannot be simply fixed by using (accountId, id) as the > primary key for the user table since that would create a problem of having a > very large number of (thrift) rows in the users table. > > I did look thru the code and JIRA and I couldn’t really find a solution. The > closest I got was to have a custom partitioner, but then you can’t have a > partitioner per keyspace and that’s not even something that’d be implemented > in future based on the following JIRA: > https://issues.apache.org/jira/browse/CASSANDRA-295 > > Any ideas are much appreciated. > > Best, > > Drew
Data partitioning and composite partition key
Hey Guys, AFAIK, currently Cassandra partitions (thrift) rows using the row key, basically uses the hash(row_key) to decide what node that row needs to be stored on. Now there are times when there is a need to shard a wide row, say storing events per sensor, so you’d have sensorId-datetime row key so you don’t end up with very large rows. Is there a way to have the partitioner use only the “sensorId” part of the row key for the hash? This way we would be able to store all the data relating to a sensor in one node. Another use case of this would be multi-tenancy: Say we have accounts and accounts have users. So we would have the following tables: CREATE TABLE account ( id timeuuid PRIMARY KEY, company text //timezone ); CREATE TABLE user ( id timeuuid PRIMARY KEY, accountId timeuuid, email text, password text ); // Get users by account CREATE TABLE user_account_index ( accountId timeuuid, userId timeuuid, PRIMARY KEY(accountId, userId) ); Say I want to get all the users that belong to an account. I would first have to get the results from user_account_index and then use a multi-get (WHERE IN) to get the records from user table. Now this multi-get part could potentially query a lot of different nodes in the cluster. It’d be great if there was a way to limit storage of users of an account to a single node so that way multi-get would only need to query a single node. Note that the problem cannot be simply fixed by using (accountId, id) as the primary key for the user table since that would create a problem of having a very large number of (thrift) rows in the users table. I did look thru the code and JIRA and I couldn’t really find a solution. The closest I got was to have a custom partitioner, but then you can’t have a partitioner per keyspace and that’s not even something that’d be implemented in future based on the following JIRA: https://issues.apache.org/jira/browse/CASSANDRA-295 Any ideas are much appreciated. Best, Drew
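[Note: the sensorId-datetime sharding described above can be sketched as follows; the rowKey helper and the day-number bucket granularity are illustrative assumptions, not an API — week/month buckets work the same way.]

```java
public class SensorRowKey {
    static final long DAY_MS = 24L * 60 * 60 * 1000;

    // Build the composite "sensorId-dayBucket" row key, so one sensor's
    // events split into one (thrift) row per day instead of a single
    // ever-growing wide row.
    static String rowKey(String sensorId, long eventMs) {
        return sensorId + "-" + (eventMs / DAY_MS);
    }
}
```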
Re: Relation between Atomic Batches and Consistency Level
Alright, this is much better. The main thing I’m trying to figure out is whether there is a way to stop the batch if the first statement fails, or whether there is a better pattern/construct for Cassandra to handle that scenario. - Drew On Mar 18, 2014, at 4:46 AM, Jonathan Lacefield wrote: > Okay, your question is clear to me now. > > My understanding, after talking this through with some of the engineers here, > is that we have 2 levels of success with batches: > > 1) Did the batch make it to the batch log table? [yes or no] > - yes = success > - no = not success > 2) Did each statement in the batch succeed? [yes or no] > - yes = success > - no = not success > - the case you are interested in. > > If 1 and 2 are both successful - you will receive a success message > if 1 is successful but 2 is not successful (your case) - you will receive a > message stating the batch succeeded but not all replicas are live yet > - in this case, the batch will be retried by Cassandra. This is the > target scenario for atomic batches (to take the burden off of the client app > to monitor, maintain, and retry batches) > - i am going to test this, was shooting for last night but didn't get > to it, to see what actually happens inside the batch > - you could test this scenario with a trace to see what occurs (i.e. if > statement 1 fails is statement 2 tried) > if 1 is not successful then the batch "fails" > - this is because it couldn't make it to the batchlog table for execution > > Hope this helps. I believe this is the best i can do for you at the moment. > > Thanks, > > Jonathan Lacefield > Solutions Architect, DataStax > (404) 822 3487 > > > > > > > On Mon, Mar 17, 2014 at 4:05 PM, Drew Kutcharian wrote: > I have read that blog post which actually was the source of the initial > confusion ;) > > If I write normally (no batch) at Quorum, then a hinted write wouldn’t count > as a valid write so the write wouldn’t succeed, which means I would have to > retry. 
That’s a pretty well defined outcome. > > Now if I write a logged batch at Quorum, then by definition, a hinted write > shouldn’t be considered a valid response, no? > > - Drew > > > On Mar 17, 2014, at 11:23 AM, Jonathan Lacefield > wrote: > >> Hello, >> >> Have you seen this blog post, it's old but still relevant. I think it >> will answer your questions. >> http://www.datastax.com/dev/blog/atomic-batches-in-cassandra-1-2. >> >> I think the answer lies in how Cassandra defines a batch "In the context >> of a Cassandra batch operation, atomic means that if any of the batch >> succeeds, all of it will." >> >> My understanding is that in your scenario if either statement succeeded, >> your batch would succeed. So #1 would get "hinted" and #2 would be applied, >> assuming no other failure events occur, like the coordinator fails, the >> client fails, etc. >> >> Hope that helps. >> >> Thanks, >> >> Jonathan >> >> Jonathan Lacefield >> Solutions Architect, DataStax >> (404) 822 3487 >> >> >> >> >> >> >> On Mon, Mar 17, 2014 at 1:38 PM, Drew Kutcharian wrote: >> Hi Jonathan, >> >> I’m still a bit unclear on this. Say I have two CQL3 tables: >> - user (replication of 3) >> - user_email_index (replication of 3) >> >> Now I create a new logged batch at quorum consistency level and put two >> inserts in there: >> #1 Insert into the “user" table with partition key of a timeuuid of the user >> #2 Insert into the “user_email_index" with partition key of user’s email >> address >> >> As you can see, there is a chance that these two insert statements will be >> executed on two different nodes because they are keyed by different >> partition keys. So based on the docs for Logged Batches, a batch will be >> applied “eventually” in an "all or nothing” fashion. So my question is, what >> happens if insert #1 fails (say replicas are unavailable), would insert #2 >> get applied? Would the whole thing be rejected and return an error to the >> client? >> >> PS. 
I’m aware of the isolation guarantees and that’s not an issue. All I >> need to make sure is that if the first statement failed, the whole batch >> needs to fail. >> >> Thanks, >> >> Drew >> >> On Mar 17, 2014, at 5:33 AM, Jonathan Lacefield >> wrote: >> >>> Hello, >>>
Re: Relation between Atomic Batches and Consistency Level
I have read that blog post which actually was the source of the initial confusion ;) If I write normally (no batch) at Quorum, then a hinted write wouldn’t count as a valid write so the write wouldn’t succeed, which means I would have to retry. That’s a pretty well defined outcome. Now if I write a logged batch at Quorum, then by definition, a hinted write shouldn’t be considered a valid response, no? - Drew On Mar 17, 2014, at 11:23 AM, Jonathan Lacefield wrote: > Hello, > > Have you seen this blog post, it's old but still relevant. I think it will > answer your questions. > http://www.datastax.com/dev/blog/atomic-batches-in-cassandra-1-2. > > I think the answer lies in how Cassandra defines a batch "In the context of > a Cassandra batch operation, atomic means that if any of the batch succeeds, > all of it will." > > My understanding is that in your scenario if either statement succeeded, > your batch would succeed. So #1 would get "hinted" and #2 would be applied, > assuming no other failure events occur, like the coordinator fails, the > client fails, etc. > > Hope that helps. > > Thanks, > > Jonathan > > Jonathan Lacefield > Solutions Architect, DataStax > (404) 822 3487 > > > > > > > On Mon, Mar 17, 2014 at 1:38 PM, Drew Kutcharian wrote: > Hi Jonathan, > > I’m still a bit unclear on this. Say I have two CQL3 tables: > - user (replication of 3) > - user_email_index (replication of 3) > > Now I create a new logged batch at quorum consistency level and put two > inserts in there: > #1 Insert into the “user" table with partition key of a timeuuid of the user > #2 Insert into the “user_email_index" with partition key of user’s email > address > > As you can see, there is a chance that these two insert statements will be > executed on two different nodes because they are keyed by different partition > keys. So based on the docs for Logged Batches, a batch will be applied > “eventually” in an "all or nothing” fashion. 
So my question is, what happens > if insert #1 fails (say replicas are unavailable), would insert #2 get > applied? Would the whole thing be rejected and return an error to the client? > > PS. I’m aware of the isolation guarantees and that’s not an issue. All I need > to make sure is that if the first statement failed, the whole batch needs > to fail. > > Thanks, > > Drew > > On Mar 17, 2014, at 5:33 AM, Jonathan Lacefield > wrote: > >> Hello, >> >> Consistency is declared at the statement level, i.e. batch level when >> writing, but enforced at each batch row level. My understanding is that >> each batch (and all of its contents) will be controlled through a specific >> CL declaration. So batch A could use a CL of QUORUM while batch B could use >> a CL of ONE. >> >> The detail that may help sort this out for you is that batch statements do >> not provide isolation guarantees: >> www.datastax.com/documentation/cql/3.0/cql/cql_reference/batch_r.html. This >> means that you write the batch as a batch but the reads are per row. If you >> are reading records contained in the batch, you will read results of >> partially updated batches. Taking this into account for your second >> question, you should expect that your read CL will perform as it would for >> any individual row mutation. >> >> Hope this helps. >> >> Jonathan >> >> Jonathan Lacefield >> Solutions Architect, DataStax >> (404) 822 3487 >> >> >> >> >> >> >> On Sat, Mar 15, 2014 at 12:23 PM, Drew Kutcharian wrote: >> Hi Guys, >> >> How do Atomic Batches and Consistency Level relate to each other? More >> specifically: >> >> - Is consistency level set/applicable per statement in the batch or the >> batch as a whole? >> >> - Say if I write a Logged Batch at QUORUM and read it back at QUORUM, what >> can I expect at normal, single node replica failure or double node replica >> failure scenarios? >> >> Thanks, >> >> Drew >> > >
Re: Relation between Atomic Batches and Consistency Level
Hi Jonathan, I’m still a bit unclear on this. Say I have two CQL3 tables: - user (replication of 3) - user_email_index (replication of 3) Now I create a new logged batch at quorum consistency level and put two inserts in there: #1 Insert into the “user" table with partition key of a timeuuid of the user #2 Insert into the “user_email_index" with partition key of user’s email address As you can see, there is a chance that these two insert statements will be executed on two different nodes because they are keyed by different partition keys. So based on the docs for Logged Batches, a batch will be applied “eventually” in an "all or nothing” fashion. So my question is, what happens if insert #1 fails (say replicas are unavailable), would insert #2 get applied? Would the whole thing be rejected and return an error to the client? PS. I’m aware of the isolation guarantees and that’s not an issue. All I need to make sure is that if the first statement failed, the whole batch needs to fail. Thanks, Drew On Mar 17, 2014, at 5:33 AM, Jonathan Lacefield wrote: > Hello, > > Consistency is declared at the statement level, i.e. batch level when > writing, but enforced at each batch row level. My understanding is that each > batch (and all of its contents) will be controlled through a specific CL > declaration. So batch A could use a CL of QUORUM while batch B could use a > CL of ONE. > > The detail that may help sort this out for you is that batch statements do > not provide isolation guarantees: > www.datastax.com/documentation/cql/3.0/cql/cql_reference/batch_r.html. This > means that you write the batch as a batch but the reads are per row. If you > are reading records contained in the batch, you will read results of > partially updated batches. Taking this into account for your second > question, you should expect that your read CL will perform as it would for > any individual row mutation. > > Hope this helps. 
> > Jonathan > > Jonathan Lacefield > Solutions Architect, DataStax > (404) 822 3487 > > > > > > > On Sat, Mar 15, 2014 at 12:23 PM, Drew Kutcharian wrote: > Hi Guys, > > How do Atomic Batches and Consistency Level relate to each other? More > specifically: > > - Is consistency level set/applicable per statement in the batch or the batch > as a whole? > > - Say if I write a Logged Batch at QUORUM and read it back at QUORUM, what > can I expect at normal, single node replica failure or double node replica > failure scenarios? > > Thanks, > > Drew >
Relation between Atomic Batches and Consistency Level
Hi Guys, How do Atomic Batches and Consistency Level relate to each other? More specifically: - Is consistency level set/applicable per statement in the batch or the batch as a whole? - Say if I write a Logged Batch at QUORUM and read it back at QUORUM, what can I expect at normal, single node replica failure or double node replica failure scenarios? Thanks, Drew
Re: Consistency Level One Question
Thanks, this clears things up. > On Feb 21, 2014, at 6:47 AM, Edward Capriolo wrote: > > When you write at one, as soon as one node acknowledges the write the ack is > returned to the client. This means if you quickly read from some other node > 1) you may get the result because by the time the read is processed the data > may be on that node > 2) the node you read from may proxy the request to the node with the data or > not > 3) you may get a column not found because the read might hit a node where the > data does not exist yet. > > Generally even at level one the replication is fast. I have done an > experiment on what you are asking. Write at ONE, read from another node as soon as the > client gets an ack. Most of the time the data is replicated by the time the > second request is received. However "most of the time" is not a guarantee. If > the nodes are geographically separate, who is to say if the first request and > the second route around the internet a different way and the second action > arrives on a node before the first. That is eventual consistency for you. > > On Friday, February 21, 2014, graham sanderson wrote: > > My bad; should have checked the code: > > > > /** > > * This function executes local and remote reads, and blocks for the > > results: > > * > > * 1. Get the replica locations, sorted by response time according to > > the snitch > > * 2. Send a data request to the closest replica, and digest requests > > to either > > *a) all the replicas, if read repair is enabled > > *b) the closest R-1 replicas, where R is the number required to > > satisfy the ConsistencyLevel > > * 3. Wait for a response from R replicas > > * 4. If the digests (if any) match the data return the data > > * 5. else carry out read repair by getting data from all the nodes. 
> > */ > > > > On Feb 21, 2014, at 3:10 AM, Duncan Sands wrote: > > > >> Hi Graham, > >> > >> On 21/02/14 07:54, graham sanderson wrote: > >>> Note also; that reading at ONE there will be no read repair, since the > >>> coordinator does not know that another replica has stale data (remember > >>> at ONE, basically only one node is asked for the answer). > >> > >> I don't think this is right. My understanding is that while only one node > >> will be sent a direct read request, all other replicas will (not on every > >> query - it depends on the value of read_repair_chance) get a background > >> read repair request. You can test this experimentally using cqlsh and > >> turning tracing on: issue a read request many times. Most of the time you > >> will see that the coordinator sends a message to one node, but from time > >> to time (depending on read_repair_chance) you will see it sending messages > >> to many nodes. > >> > >> Best wishes, Duncan. > >> > >>> > >>> In practice for our use cases, we always write at LOCAL_QUORUM (failing > >>> the whole update if that doesn’t work - stale data is OK if >1 node is > >>> down), and we read at LOCAL_QUORUM, but (because stale data is better > >>> than no data), we will fall back per read request to LOCAL_ONE if we > >>> detect that there were insufficient nodes - this lets us cope with 2 down > >>> nodes in a 3 replica environment (or more if the nodes are not > >>> consecutive in the ring). > >>> > >>> On Feb 20, 2014, at 11:21 PM, Drew Kutcharian wrote: > >>> > >>>> Hi Guys, > >>>> > >>>> I wanted to get some clarification on what happens when you write and > >>>> read at consistency level 1. Say I have a keyspace with replication > >>>> factor of 3 and a table which will contain write-once/read-only wide > >>>> rows. 
If I write at consistency level 1 and the write happens on node A > >>>> and I read back at consistency level 1 from another node other than A, > >>>> say B, will C* return “not found” or will it trigger a read-repair > >>>> before responding? In addition, what’s the best consistency level for > >>>> reading/writing write-once/read-only wide rows? > >>>> > >>>> Thanks, > >>>> > >>>> Drew > >>>> > >>> > >> > > > > > > -- > Sorry this was sent from mobile. Will do less grammar and spell check than > usual.
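The write-at-ONE / read-at-ONE race discussed above can be sketched as a toy model (plain Python, no Cassandra involved; every name below is made up for illustration):

```python
# Toy model of write-at-ONE / read-at-ONE against RF=3. This is only an
# illustration of the race, not how Cassandra is actually implemented.

class Replica:
    def __init__(self):
        self.data = {}

class Cluster:
    def __init__(self, rf=3):
        self.replicas = [Replica() for _ in range(rf)]
        self.pending = []  # replication messages not yet delivered

    def write_one(self, key, value):
        # The coordinator acks as soon as one replica has the write;
        # the other replicas receive it asynchronously.
        self.replicas[0].data[key] = value
        self.pending.append((key, value))
        return "ack"

    def read_one(self, replica_index, key):
        # A read at ONE asks a single replica: no quorum, no repair here.
        return self.replicas[replica_index].data.get(key)

    def deliver_replication(self):
        # "Eventually" the async replication messages arrive.
        for key, value in self.pending:
            for r in self.replicas[1:]:
                r.data[key] = value
        self.pending.clear()

cluster = Cluster()
cluster.write_one("row1", "hello")
stale = cluster.read_one(1, "row1")   # None: replica 1 has not caught up yet
cluster.deliver_replication()
fresh = cluster.read_one(1, "row1")   # after replication: "hello"
print(stale, fresh)
```

In a real cluster whether the first read misses depends on timing; the point is only that nothing forces replica 1 to have the write before the read arrives.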
Consistency Level One Question
Hi Guys, I wanted to get some clarification on what happens when you write and read at consistency level 1. Say I have a keyspace with replication factor of 3 and a table which will contain write-once/read-only wide rows. If I write at consistency level 1 and the write happens on node A and I read back at consistency level 1 from another node other than A, say B, will C* return “not found” or will it trigger a read-repair before responding? In addition, what’s the best consistency level for reading/writing write-once/read-only wide rows? Thanks, Drew
Re: CQL3 Custom Functions
In that case, are there any plans to support microsecond versions of the dateOf() and now() functions? It's pretty common to use microsecond-precision timeuuids. I created this JIRA: https://issues.apache.org/jira/browse/CASSANDRA-6672 cheers, Drew > On Feb 11, 2014, at 1:11 AM, Sylvain Lebresne wrote: > >> On Mon, Feb 10, 2014 at 7:16 PM, Drew Kutcharian wrote: >> Hey Guys, >> >> How can I define custom CQL3 functions (similar to dateOf, now, etc)? > > You can't, there is currently no way to define custom functions. > > -- > Sylvain
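Until something like that JIRA lands, the microsecond timestamp has to be recovered client-side from the version-1 UUID; a sketch in Python (the constant is the standard offset, in 100-nanosecond intervals, between the UUID v1 epoch of 1582-10-15 and the Unix epoch — the same arithmetic applies in a Java client):

```python
import uuid

# Offset between the UUID v1 epoch (1582-10-15) and the Unix epoch
# (1970-01-01), in 100-nanosecond intervals.
GREGORIAN_TO_UNIX_100NS = 0x01B21DD213814000

def timeuuid_to_micros(u: uuid.UUID) -> int:
    """Microseconds since the Unix epoch encoded in a version-1 UUID."""
    if u.version != 1:
        raise ValueError("not a time-based (version 1) UUID")
    # u.time is 100-ns intervals since 1582-10-15; divide by 10 for micros.
    return (u.time - GREGORIAN_TO_UNIX_100NS) // 10

u = uuid.uuid1()
print(timeuuid_to_micros(u))  # microseconds since the Unix epoch
```

dateOf()-style second precision is then just `timeuuid_to_micros(u) // 1_000_000`.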
CQL3 Custom Functions
Hey Guys, How can I define custom CQL3 functions (similar to dateOf, now, etc)? Cheers, Drew
Re: Data modeling users table with CQL
You’re right. I didn’t catch that. No need to have email in the PRIMARY KEY. On Jan 21, 2014, at 5:11 PM, Jon Ribbens wrote: > On Tue, Jan 21, 2014 at 10:40:39AM -0800, Drew Kutcharian wrote: >> Thanks, I was actually thinking of doing that. Something along the lines >> of >> CREATE TABLE user ( >> idtimeuuid PRIMARY KEY, >> emailtext, >> nametext, >> ... >> ); >> CREATE TABLE user_email_index ( >> email text, >> id timeuuid, >> PRIMARY KEY (email, id) >> ); >> And during registration, I would just use LWT on the user_email_index >> table first and insert the record and then insert the actual user record >> into user table w/o LWT. Does that sound right to you? > > Yes, although unless I'm confused you don't need "id" in the > primary key on "user_email_index", just "PRIMARY KEY (email)".
Re: Data modeling users table with CQL
Cool. BTW, what do you mean by have additional session tracking ids? What’d that be for? - Drew On Jan 21, 2014, at 10:48 AM, Tupshin Harper wrote: > It does sound right. > > You might want to have additional session tracking id's, separate from the > user id, but that is an additional implementation detail, and could be > external to Cassandra. But the approach you describe accurately describes > what I would do as a first pass, at least. > > -Tupshin > > On Jan 21, 2014 10:41 AM, "Drew Kutcharian" wrote: > Thanks, I was actually thinking of doing that. Something along the lines of > > CREATE TABLE user ( > idtimeuuid PRIMARY KEY, > emailtext, > nametext, > ... > ); > > CREATE TABLE user_email_index ( > email text, > id timeuuid, > PRIMARY KEY (email, id) > ); > > And during registration, I would just use LWT on the user_email_index table > first and insert the record and then insert the actual user record into user > table w/o LWT. Does that sound right to you? > > - Drew > > > > On Jan 21, 2014, at 10:01 AM, Tupshin Harper wrote: > >> One CQL row per user, keyed off of the UUID. >> >> Another table keyed off of email, with another column containing the UUID >> for lookups in the first table. Only registration will require a >> lightweight transaction, and only for the purpose of avoiding duplicate >> email registration race conditions. >> >> -Tupshin >> >> On Jan 21, 2014 9:17 AM, "Drew Kutcharian" wrote: >> A shameful bump ;) >> >> > On Jan 20, 2014, at 2:14 PM, Drew Kutcharian wrote: >> > >> > Hey Guys, >> > >> > I’m new to CQL (but have been using C* for a while now). What would be the >> > best way to model a users table using CQL/Cassandra 2.0 Lightweight >> > Transactions where we would like to have: >> > - A unique TimeUUID as the primary key of the user >> > - A unique email address used for logging in >> > >> > In the past I would use Zookeeper and/or Astyanax’s "Uniqueness >> > Constraint” but I want to see how can this be handled natively. 
>> > >> > Cheers, >> > >> > Drew >> > >
Re: Data modeling users table with CQL
Thanks, I was actually thinking of doing that. Something along the lines of CREATE TABLE user ( idtimeuuid PRIMARY KEY, emailtext, nametext, ... ); CREATE TABLE user_email_index ( email text, id timeuuid, PRIMARY KEY (email, id) ); And during registration, I would just use LWT on the user_email_index table first and insert the record and then insert the actual user record into user table w/o LWT. Does that sound right to you? - Drew On Jan 21, 2014, at 10:01 AM, Tupshin Harper wrote: > One CQL row per user, keyed off of the UUID. > > Another table keyed off of email, with another column containing the UUID for > lookups in the first table. Only registration will require a lightweight > transaction, and only for the purpose of avoiding duplicate email > registration race conditions. > > -Tupshin > > On Jan 21, 2014 9:17 AM, "Drew Kutcharian" wrote: > A shameful bump ;) > > > On Jan 20, 2014, at 2:14 PM, Drew Kutcharian wrote: > > > > Hey Guys, > > > > I’m new to CQL (but have been using C* for a while now). What would be the > > best way to model a users table using CQL/Cassandra 2.0 Lightweight > > Transactions where we would like to have: > > - A unique TimeUUID as the primary key of the user > > - A unique email address used for logging in > > > > In the past I would use Zookeeper and/or Astyanax’s "Uniqueness Constraint” > > but I want to see how can this be handled natively. > > > > Cheers, > > > > Drew > >
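The two-step flow above (LWT on the index table, plain insert on the user table) can be sketched in Python, with a dict's `setdefault` standing in for `INSERT ... IF NOT EXISTS`; table and column names follow the schema in the thread, everything else is invented:

```python
import uuid

# In-memory stand-ins for the two tables in the schema above.
user_email_index = {}   # email -> user id  (the uniqueness-constraint table)
user = {}               # user id -> row    (the main user table)

def register(email: str, name: str):
    """Two-step registration: claim the email with a compare-and-set,
    then write the user row without any LWT."""
    user_id = uuid.uuid1()
    # Step 1: LWT-style "INSERT ... IF NOT EXISTS" on the index table.
    winner = user_email_index.setdefault(email, user_id)
    if winner != user_id:
        raise ValueError(f"email already registered: {email}")
    # Step 2: plain (non-LWT) insert into the user table.
    user[user_id] = {"email": email, "name": name}
    return user_id

uid = register("drew@example.com", "Drew")
print(uid in user)  # True
try:
    register("drew@example.com", "Someone Else")
except ValueError:
    print("duplicate rejected")
```

The race it closes is two concurrent registrations of the same email: only one wins the compare-and-set, so only one user row is ever written.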
Re: Data modeling users table with CQL
A shameful bump ;) > On Jan 20, 2014, at 2:14 PM, Drew Kutcharian wrote: > > Hey Guys, > > I’m new to CQL (but have been using C* for a while now). What would be the > best way to model a users table using CQL/Cassandra 2.0 Lightweight > Transactions where we would like to have: > - A unique TimeUUID as the primary key of the user > - A unique email address used for logging in > > In the past I would use Zookeeper and/or Astyanax’s "Uniqueness Constraint” > but I want to see how can this be handled natively. > > Cheers, > > Drew >
Re: one or more nodes were unavailable.
What do you see when you run "desc keyspace;” in cqlsh? On Jan 20, 2014, at 10:10 PM, Vivek Mishra wrote: > I have downloaded cassandra 2.x and set it up on a single machine. Started > Cassandra server and connecting via cqlsh. Created a column family and > inserted a single record into it (via cqlsh). > > Wondering why it gives "No node available" > > Even though simple insert queries (without CAS) work! > > -Vivek > > > On Tue, Jan 21, 2014 at 11:33 AM, Drew Kutcharian wrote: > If you are trying this out on a single node, make sure you set the > replication_factor of the keyspace to one. > > > On Jan 20, 2014, at 7:41 PM, Vivek Mishra wrote: >> Single node and default consistency. Running via cqlsh >> >> >> On Tue, Jan 21, 2014 at 1:47 AM, sankalp kohli >> wrote: >> Also do you have any nodes down...because it is possible to reach write >> consistency and not do CAS because some machines are down. >> >> >> On Mon, Jan 20, 2014 at 12:16 PM, sankalp kohli >> wrote: >> What consistency level are you using? >> >> >> On Mon, Jan 20, 2014 at 7:16 AM, Vivek Mishra wrote: >> Hi, >> Trying the CAS feature of cassandra 2.x and somehow getting the error below: >> >> >> cqlsh:sample> insert into "User"(user_id,first_name) values( >> fe08e810-81e4-11e3-9470-c3aa8ce77cc4,'vivek1') if not exists; >> Unable to complete request: one or more nodes were unavailable. >> cqlsh:training> >> >> >> cqlsh:sample> insert into "User"(user_id,first_name) values( >> fe08e810-81e4-11e3-9470-c3aa8ce77cc4,'vivek1') >> >> It works fine. >> >> Any idea? >> >> -Vivek >> >> >> >> >> > >
Re: one or more nodes were unavailable.
If you are trying this out on a single node, make sure you set the replication_factor of the keyspace to one. On Jan 20, 2014, at 7:41 PM, Vivek Mishra wrote: > Single node and default consistency. Running via cqlsh > > > On Tue, Jan 21, 2014 at 1:47 AM, sankalp kohli wrote: > Also do you have any nodes down...because it is possible to reach write > consistency and not do CAS because some machines are down. > > > On Mon, Jan 20, 2014 at 12:16 PM, sankalp kohli > wrote: > What consistency level are you using? > > > On Mon, Jan 20, 2014 at 7:16 AM, Vivek Mishra wrote: > Hi, > Trying the CAS feature of cassandra 2.x and somehow getting the error below: > > > cqlsh:sample> insert into "User"(user_id,first_name) values( > fe08e810-81e4-11e3-9470-c3aa8ce77cc4,'vivek1') if not exists; > Unable to complete request: one or more nodes were unavailable. > cqlsh:training> > > > cqlsh:sample> insert into "User"(user_id,first_name) values( > fe08e810-81e4-11e3-9470-c3aa8ce77cc4,'vivek1') > > It works fine. > > Any idea? > > -Vivek > > > > >
Data modeling users table with CQL
Hey Guys, I’m new to CQL (but have been using C* for a while now). What would be the best way to model a users table using CQL/Cassandra 2.0 Lightweight Transactions where we would like to have: - A unique TimeUUID as the primary key of the user - A unique email address used for logging in In the past I would use Zookeeper and/or Astyanax’s "Uniqueness Constraint” but I want to see how can this be handled natively. Cheers, Drew
Re: Data Modeling: How to keep track of arbitrarily inserted column names?
One thing I can do is to have a client-side cache of the keys to reduce the number of updates. On Apr 5, 2013, at 6:14 AM, Edward Capriolo wrote: > Since there are few column names what you can do is this. Make a reverse > index, low read repair chance, Be aggressive with compaction. It will be many > extra writes but that is ok. > > Other option is turn on row cache and try read before write. It is a good > case for row cache because it is a very small data set. > > On Thursday, April 4, 2013, Drew Kutcharian wrote: > > I don't really need to answer "what rows contain column named X", so no > > need for a reverse index here. All I want is a distinct set of all the > > column names, so I can answer "what are all the available column names" > > > > On Apr 4, 2013, at 4:20 PM, Edward Capriolo wrote: > > > > Your reverse index of "which rows contain a column named X" will have very > > wide rows. You could look at cassandra's secondary indexing, or possibly > > look at a solandra/solr approach. Another option is you can shift the > > problem slightly, "which rows have column X that was added between time y > > and time z". Remember with few distinct column names that reverse index of > > column to row is going to be a very big list. > > > > > > On Thu, Apr 4, 2013 at 5:45 PM, Drew Kutcharian wrote: > >> > >> Hi Edward, > >> I anticipate that the column names will be reused a lot. For example, key1 > >> will be in many rows. So I think the number of distinct column names will > >> be much much smaller than the number of rows. Is there a way to have a > >> separate CF that keeps track of the column names? > >> What I was thinking was to have a separate CF that I write only the column > >> name with a null value in there every time I write a key/value to the main > >> CF. In this case if that column name exist, then it will just be > >> overridden. Now if I wanted to get all the column names, then I can just > >> query that CF. 
Not sure if that's the best approach at high load (100k > >> inserts a second). > >> -- Drew > >> > >> On Apr 4, 2013, at 12:02 PM, Edward Capriolo wrote: > >> > >> You can not get only the column name (which you are calling a key) you can > >> use get_range_slice which returns all the columns. When you specify an > >> empty byte array (new byte[0]{}) as the start and finish you get back all > >> the columns. From there you can return only the columns to the user in a > >> format that you like. > >> > >> > >> On Thu, Apr 4, 2013 at 2:18 PM, Drew Kutcharian wrote: > >>> > >>> Hey Guys, > >>> > >>> I'm working on a project and one of the requirements is to have a schema > >>> free CF where end users can insert arbitrary key/value pairs per row. > >>> What would be the best way to know what are all the "keys" that were > >>> inserted (preferably w/o any locking). For example, > >>> > >>> Row1 => key1 -> XXX, key2 -> XXX > >>> Row2 => key1 -> XXX, key3 -> XXX > >>> Row3 => key4 -> XXX, key5 -> XXX > >>> Row4 => key2 -> XXX, key5 -> XXX > >>> … > >>> > >>> The query would be give me all the inserted keys and the response would > >>> be {key1, key2, key3, key4, key5} > >>> > >>> Thanks, > >>> > >>> Drew > >>> > >> > >> > > > > > >
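The client-side cache mentioned above can be as simple as a set that suppresses redundant index writes; a minimal sketch, where the `write_name_to_index` callback stands in for whatever actually inserts the (name, null) column into the names CF (all names here are invented):

```python
class ColumnNameIndexer:
    """Client-side cache that only forwards column names it has not
    already sent to the (hypothetical) names column family."""

    def __init__(self, write_name_to_index):
        self._seen = set()
        self._write = write_name_to_index

    def record(self, column_name: str):
        # Only hit the index CF the first time this process sees the name;
        # re-inserts on other clients are harmless overwrites anyway.
        if column_name not in self._seen:
            self._seen.add(column_name)
            self._write(column_name)

written = []
indexer = ColumnNameIndexer(written.append)
for name in ["key1", "key2", "key1", "key3", "key1"]:
    indexer.record(name)
print(written)  # ['key1', 'key2', 'key3']
```

At 100k inserts/second this turns most index writes into a local set lookup, at the cost of one redundant write per distinct name per client process.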
Re: Data Modeling: How to keep track of arbitrarily inserted column names?
I don't really need to answer "what rows contain column named X", so no need for a reverse index here. All I want is a distinct set of all the column names, so I can answer "what are all the available column names" On Apr 4, 2013, at 4:20 PM, Edward Capriolo wrote: > Your reverse index of "which rows contain a column named X" will have very > wide rows. You could look at cassandra's secondary indexing, or possibly look > at a solandra/solr approach. Another option is you can shift the problem > slightly, "which rows have column X that was added between time y and time > z". Remember with few distinct column names that reverse index of column to > row is going to be a very big list. > > > On Thu, Apr 4, 2013 at 5:45 PM, Drew Kutcharian wrote: > Hi Edward, > > I anticipate that the column names will be reused a lot. For example, key1 > will be in many rows. So I think the number of distinct column names will be > much much smaller than the number of rows. Is there a way to have a separate > CF that keeps track of the column names? > > What I was thinking was to have a separate CF that I write only the column > name with a null value in there every time I write a key/value to the main > CF. In this case if that column name exist, then it will just be overridden. > Now if I wanted to get all the column names, then I can just query that CF. > Not sure if that's the best approach at high load (100k inserts a second). > > -- Drew > > > On Apr 4, 2013, at 12:02 PM, Edward Capriolo wrote: > >> You can not get only the column name (which you are calling a key) you can >> use get_range_slice which returns all the columns. When you specify an empty >> byte array (new byte[0]{}) as the start and finish you get back all the >> columns. From there you can return only the columns to the user in a format >> that you like. 
>> >> >> On Thu, Apr 4, 2013 at 2:18 PM, Drew Kutcharian wrote: >> Hey Guys, >> >> I'm working on a project and one of the requirements is to have a schema >> free CF where end users can insert arbitrary key/value pairs per row. What >> would be the best way to know what are all the "keys" that were inserted >> (preferably w/o any locking). For example, >> >> Row1 => key1 -> XXX, key2 -> XXX >> Row2 => key1 -> XXX, key3 -> XXX >> Row3 => key4 -> XXX, key5 -> XXX >> Row4 => key2 -> XXX, key5 -> XXX >> … >> >> The query would be give me all the inserted keys and the response would be >> {key1, key2, key3, key4, key5} >> >> Thanks, >> >> Drew >> >> > >
Re: Data Modeling: How to keep track of arbitrarily inserted column names?
Hi Edward, I anticipate that the column names will be reused a lot. For example, key1 will be in many rows. So I think the number of distinct column names will be much much smaller than the number of rows. Is there a way to have a separate CF that keeps track of the column names? What I was thinking was to have a separate CF that I write only the column name with a null value in there every time I write a key/value to the main CF. In this case if that column name exist, then it will just be overridden. Now if I wanted to get all the column names, then I can just query that CF. Not sure if that's the best approach at high load (100k inserts a second). -- Drew On Apr 4, 2013, at 12:02 PM, Edward Capriolo wrote: > You can not get only the column name (which you are calling a key) you can > use get_range_slice which returns all the columns. When you specify an empty > byte array (new byte[0]{}) as the start and finish you get back all the > columns. From there you can return only the columns to the user in a format > that you like. > > > On Thu, Apr 4, 2013 at 2:18 PM, Drew Kutcharian wrote: > Hey Guys, > > I'm working on a project and one of the requirements is to have a schema free > CF where end users can insert arbitrary key/value pairs per row. What would > be the best way to know what are all the "keys" that were inserted > (preferably w/o any locking). For example, > > Row1 => key1 -> XXX, key2 -> XXX > Row2 => key1 -> XXX, key3 -> XXX > Row3 => key4 -> XXX, key5 -> XXX > Row4 => key2 -> XXX, key5 -> XXX > … > > The query would be give me all the inserted keys and the response would be > {key1, key2, key3, key4, key5} > > Thanks, > > Drew > >
Data Modeling: How to keep track of arbitrarily inserted column names?
Hey Guys, I'm working on a project and one of the requirements is to have a schema free CF where end users can insert arbitrary key/value pairs per row. What would be the best way to know what are all the "keys" that were inserted (preferably w/o any locking). For example, Row1 => key1 -> XXX, key2 -> XXX Row2 => key1 -> XXX, key3 -> XXX Row3 => key4 -> XXX, key5 -> XXX Row4 => key2 -> XXX, key5 -> XXX … The query would be give me all the inserted keys and the response would be {key1, key2, key3, key4, key5} Thanks, Drew
Re: Any plans for read-before-write update operations in CQL3?
I guess it'd be safe to say that the read consistency could be the same as the consistency of the update. But regardless, that would be a lot better than reading a value, modifying it at the client side and then writing it back. On Apr 3, 2013, at 7:12 PM, Edward Capriolo wrote: > Counters are currently read before write, some collection operations on List > are read before write. > > > On Wed, Apr 3, 2013 at 9:59 PM, aaron morton wrote: > I would guess not. > >> I know this goes against keeping updates idempotent, > There are also issues with consistency. i.e. is the read local or does it > happen at the CL level ? > And it makes things go slower. > >> We currently do things like this in client code, but it would be great to >> be able to this on the server side to minimize the chance of race conditions. > Sometimes you can write the plus one into a new column and then apply the > changes in the reading client thread. > > Cheers > > - > Aaron Morton > Freelance Cassandra Consultant > New Zealand > > @aaronmorton > http://www.thelastpickle.com > > On 4/04/2013, at 12:48 AM, Drew Kutcharian wrote: > >> Hi Guys, >> >> Are there any short/long term plans to support UPDATE operations that >> require read-before-write, such as increment on a numeric non-counter >> column? >> i.e. >> >> UPDATE CF SET NON_COUNTER_NUMERIC_COLUMN = NON_COUNTER_NUMERIC_COLUMN + 1; >> >> UPDATE CF SET STRING_COLUMN = STRING_COLUMN + "postfix"; >> >> etc. >> >> I know this goes against keeping updates idempotent, but there are times you >> need to do these kinds of operations. We currently do things like this in >> client code, but it would be great to be able to this on the server side to >> minimize the chance of race conditions. >> >> -- Drew > >
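The client-side workaround being discussed is a read-modify-write, which is racy unless paired with some compare step and retry; a sketch of that pattern with an in-memory store and an explicit version column (none of this is Cassandra API, it only illustrates the shape of the client code):

```python
# Optimistic read-modify-write: read the value with a version, apply the
# change, and only write back if the version is unchanged; retry otherwise.
# A dict stands in for the column family.

store = {"counter": (0, 0)}  # name -> (value, version)

def increment(name, retries=10):
    for _ in range(retries):
        value, version = store[name]
        # Compare-and-set: only apply if nobody bumped the version meanwhile.
        if store[name][1] == version:
            store[name] = (value + 1, version + 1)
            return value + 1
    raise RuntimeError("too much contention")

for _ in range(5):
    increment("counter")
print(store["counter"][0])  # 5
```

Single-threaded the compare always succeeds; under concurrency the version check is what keeps two clients from both writing value+1 off the same read.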
Any plans for read-before-write update operations in CQL3?
Hi Guys, Are there any short/long term plans to support UPDATE operations that require read-before-write, such as increment on a numeric non-counter column? i.e. UPDATE CF SET NON_COUNTER_NUMERIC_COLUMN = NON_COUNTER_NUMERIC_COLUMN + 1; UPDATE CF SET STRING_COLUMN = STRING_COLUMN + "postfix"; etc. I know this goes against keeping updates idempotent, but there are times you need to do these kinds of operations. We currently do things like this in client code, but it would be great to be able to this on the server side to minimize the chance of race conditions. -- Drew
Re: Cassandra Compression and Wide Rows
Thanks Sylvain. So C* compression is block-based and has nothing to do with the format of the rows. On Mar 19, 2013, at 1:31 AM, Sylvain Lebresne wrote: > That's just describing what compression is about. Compression (not in C*, in > general) is based on recognizing repeated patterns. > > So yes, in that sense, static column families are more likely to yield a better > compression ratio because it is more likely to have repeated patterns in the > compressed blocks. But: > 1) it doesn't necessarily mean that wide column families won't have a good > compression ratio per se. > 2) you can absolutely have a crappy compression ratio with a static column > family. Just create a column family where each row has 1 column 'image' that > contains a png. > > And to come back to your initial question, I highly doubt disk level > compression would be much of a workaround because again, that's more about > how compression works than how Cassandra uses it. > > At the end of the day, I really think the best choice is to try it and decide > for yourself if it does more good than harm or the converse. > > -- > Sylvain > > > On Tue, Mar 19, 2013 at 3:58 AM, Drew Kutcharian wrote: > Edward/Sylvain, > > I also came across this post on DataStax's blog: > >> When to use compression >> Compression is best suited for ColumnFamilies where there are many rows, >> with each row having the same columns, or at least many columns in common. >> For example, a ColumnFamily containing user data such as username, email, >> etc., would be a good candidate for compression. The more similar the data >> across rows, the greater the compression ratio will be, and the larger the >> gain in read performance. >> Compression is not as good a fit for ColumnFamilies where each row has a >> different set of columns, or where there are just a few very wide rows. >> Dynamic column families such as this will not yield good compression ratios. 
> > http://www.datastax.com/dev/blog/whats-new-in-cassandra-1-0-compression > > @Sylvain, does this still apply on more recent versions of C*? > > > -- Drew > > > > On Mar 18, 2013, at 7:16 PM, Edward Capriolo wrote: > >> I feel this has come up before. I believe the compression is block based, so >> just because no two column names are the same does not mean the compression >> will not be effective. Possibly in their case the compression was not >> effective. >> >> On Mon, Mar 18, 2013 at 9:08 PM, Drew Kutcharian wrote: >> That's what I originally thought but the OOYALA presentation from C*2012 got >> me confused. Do you guys know what's going on here? >> >> The video: >> http://www.youtube.com/watch?v=r2nGBUuvVmc&feature=player_detailpage#t=790s >> The slides: Slide 22 @ >> http://www.datastax.com/wp-content/uploads/2012/08/C2012-Hastur-NoahGibbs.pdf >> >> -- Drew >> >> >> On Mar 18, 2013, at 6:14 AM, Edward Capriolo wrote: >> >>> >>> Imho it is probably more efficient for wide. When you decompress 8k blocks >>> to get at a 200 byte row you create overhead , particularly young gen. >>> On Monday, March 18, 2013, Sylvain Lebresne wrote: >>> > The way compression is implemented, it is oblivious to the CF being >>> > wide-row or narrow-row. There is nothing intrinsically less efficient in >>> > the compression for wide-rows. >>> > -- >>> > Sylvain >>> > >>> > On Fri, Mar 15, 2013 at 11:53 PM, Drew Kutcharian wrote: >>> >> >>> >> Hey Guys, >>> >> >>> >> I remember reading somewhere that C* compression is not very effective >>> >> when most of the CFs are in wide-row format and some folks turn the >>> >> compression off and use disk level compression as a workaround. >>> >> Considering that wide rows with composites are "first class citizens" in >>> >> CQL3, is this still the case? Has there been any improvements on this? >>> >> >>> >> Thanks, >>> >> >>> >> Drew >>> > >> >> > >
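The point Sylvain and Edward make — block compression rewards repetition in the data, not any particular row shape — is easy to demonstrate with any general-purpose compressor; a quick sketch with Python's zlib (nothing here is Cassandra's actual compression path, and the sample data is invented):

```python
import os
import zlib

BLOCK = 64 * 1024  # compress a fixed-size block, as a block compressor would

# Rows with many repeated column names/values compress very well...
repetitive = (b"username:drew|email:drew@example.com|" * 2000)[:BLOCK]
# ...while already-high-entropy data (random bytes, PNGs, etc.) does not.
random_ish = os.urandom(BLOCK)

ratio_rep = len(zlib.compress(repetitive)) / len(repetitive)
ratio_rnd = len(zlib.compress(random_ish)) / len(random_ish)
print(f"repetitive: {ratio_rep:.3f}  random: {ratio_rnd:.3f}")
```

So a wide-row CF whose blocks contain repeated clustering prefixes can still compress well, and a static CF full of binary blobs can compress terribly — the ratio tracks the bytes in the block, not the schema.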
Re: Cassandra Compression and Wide Rows
Edward/Sylvain, I also came across this post on DataStax's blog: > When to use compression > Compression is best suited for ColumnFamilies where there are many rows, with > each row having the same columns, or at least many columns in common. For > example, a ColumnFamily containing user data such as username, email, etc., > would be a good candidate for compression. The more similar the data across > rows, the greater the compression ratio will be, and the larger the gain in > read performance. > Compression is not as good a fit for ColumnFamilies where each row has a > different set of columns, or where there are just a few very wide rows. > Dynamic column families such as this will not yield good compression ratios. http://www.datastax.com/dev/blog/whats-new-in-cassandra-1-0-compression @Sylvain, does this still apply on more recent versions of C*? -- Drew On Mar 18, 2013, at 7:16 PM, Edward Capriolo wrote: > I feel this has come up before. I believe the compression is block based, so > just because no two column names are the same does not mean the compression > will not be effective. Possibly in their case the compression was not > effective. > > On Mon, Mar 18, 2013 at 9:08 PM, Drew Kutcharian wrote: > That's what I originally thought but the OOYALA presentation from C*2012 got > me confused. Do you guys know what's going on here? > > The video: > http://www.youtube.com/watch?v=r2nGBUuvVmc&feature=player_detailpage#t=790s > The slides: Slide 22 @ > http://www.datastax.com/wp-content/uploads/2012/08/C2012-Hastur-NoahGibbs.pdf > > -- Drew > > > On Mar 18, 2013, at 6:14 AM, Edward Capriolo wrote: > >> >> Imho it is probably more efficient for wide. When you decompress 8k blocks >> to get at a 200 byte row you create overhead , particularly young gen. >> On Monday, March 18, 2013, Sylvain Lebresne wrote: >> > The way compression is implemented, it is oblivious to the CF being >> > wide-row or narrow-row. 
There is nothing intrinsically less efficient in >> > the compression for wide-rows. >> > -- >> > Sylvain >> > >> > On Fri, Mar 15, 2013 at 11:53 PM, Drew Kutcharian wrote: >> >> >> >> Hey Guys, >> >> >> >> I remember reading somewhere that C* compression is not very effective >> >> when most of the CFs are in wide-row format and some folks turn the >> >> compression off and use disk level compression as a workaround. >> >> Considering that wide rows with composites are "first class citizens" in >> >> CQL3, is this still the case? Has there been any improvements on this? >> >> >> >> Thanks, >> >> >> >> Drew >> > > >
Re: Cassandra Compression and Wide Rows
That's what I originally thought but the OOYALA presentation from C*2012 got me confused. Do you guys know what's going on here? The video: http://www.youtube.com/watch?v=r2nGBUuvVmc&feature=player_detailpage#t=790s The slides: Slide 22 @ http://www.datastax.com/wp-content/uploads/2012/08/C2012-Hastur-NoahGibbs.pdf -- Drew On Mar 18, 2013, at 6:14 AM, Edward Capriolo wrote: > > Imho it is probably more efficient for wide. When you decompress 8k blocks to > get at a 200 byte row you create overhead , particularly young gen. > On Monday, March 18, 2013, Sylvain Lebresne wrote: > > The way compression is implemented, it is oblivious to the CF being > > wide-row or narrow-row. There is nothing intrinsically less efficient in > > the compression for wide-rows. > > -- > > Sylvain > > > > On Fri, Mar 15, 2013 at 11:53 PM, Drew Kutcharian wrote: > >> > >> Hey Guys, > >> > >> I remember reading somewhere that C* compression is not very effective > >> when most of the CFs are in wide-row format and some folks turn the > >> compression off and use disk level compression as a workaround. > >> Considering that wide rows with composites are "first class citizens" in > >> CQL3, is this still the case? Has there been any improvements on this? > >> > >> Thanks, > >> > >> Drew > >
Cassandra Compression and Wide Rows
Hey Guys, I remember reading somewhere that C* compression is not very effective when most of the CFs are in wide-row format and some folks turn the compression off and use disk level compression as a workaround. Considering that wide rows with composites are "first class citizens" in CQL3, is this still the case? Has there been any improvements on this? Thanks, Drew
Re: Cassandra instead of memcached
I think the dataset should fit in memory easily. The main purpose of this would be as a store for an API rate limiting/accounting system. I think the eBay guys are using C* too for the same reason. Initially we were thinking of using Hazelcast or memcached. But Hazelcast (at least the community edition) has Java GC issues with big heaps and the problem with memcached is the lack of a reliable distribution (you lose a node, you need to rehash everything), so I figured why not just use C*. On Mar 6, 2013, at 9:08 AM, Edward Capriolo wrote: > If you're writing much more data than RAM cassandra will not work as fast as > memcache. Cassandra is not magical, if all of your data fits in memory it is > going to be fast, if most of your data fits in memory it can still be fast. > However if you plan on having much more data than disk you need to think > about more RAM and/or SSD disks. > > We do not use c* as an "in-memory store". However for many of our datasets we > do not have a separate caching tier. In those cases cassandra is both our > "database" and our "in-memory store" if you want to use those terms :) > > On Wed, Mar 6, 2013 at 12:02 PM, Drew Kutcharian wrote: > Thanks guys, this is what I was looking for. > > @Edward. I definitely like crazy ideas ;), I think the only issue here is > that C* is a disk space hog, so not sure if that would be feasible since free > RAM is not as abundant as disk. BTW, I watched your presentation, are you > guys still using C* as an in-memory store? > > > > > On Mar 6, 2013, at 7:44 AM, Edward Capriolo wrote: > >> http://www.slideshare.net/edwardcapriolo/cassandra-as-memcache >> >> Read at ONE. >> READ_REPAIR_CHANCE as low as possible. >> >> Use short TTL and short GC_GRACE. >> >> Make the in memory memtable size as high as possible to avoid flushing and >> compacting. >> >> Optionally turn off commit log. >> >> You can use cassandra like memcache but it is not a memcache replacement. 
>> Cassandra persists writes and compacts SSTables, memcache only has to keep >> data in memory. >> >> If you want to try a crazy idea. try putting your persistent data on a ram >> disk! Not data/system however! >> >> >> >> >> >> >> On Wed, Mar 6, 2013 at 2:45 AM, aaron morton wrote: >> consider disabling durable_writes in the KS config to remove writing to the >> commit log. That will speed things up for you. Note that you risk losing >> data is cassandra crashes or is not shut down with nodetool drain. >> >> Even if you set the gc_grace to 0, deletes will still need to be committed >> to disk. >> >> Cheers >> >> - >> Aaron Morton >> Freelance Cassandra Developer >> New Zealand >> >> @aaronmorton >> http://www.thelastpickle.com >> >> On 5/03/2013, at 9:51 AM, Drew Kutcharian wrote: >> >>> Thanks Ben, that article was actually the reason I started thinking about >>> removing memcached. >>> >>> I wanted to see what would be the optimum config to use C* as an in-memory >>> store. >>> >>> -- Drew >>> >>> >>> On Mar 5, 2013, at 2:39 AM, Ben Bromhead wrote: >>> >>>> Check out >>>> http://techblog.netflix.com/2012/07/benchmarking-high-performance-io-with.html >>>> >>>> Netflix used Cassandra with SSDs and were able to drop their memcache >>>> layer. Mind you they were not using it purely as an in memory KV store. >>>> >>>> Ben >>>> Instaclustr | www.instaclustr.com | @instaclustr >>>> >>>> >>>> >>>> On 05/03/2013, at 4:33 PM, Drew Kutcharian wrote: >>>> >>>>> Hi Guys, >>>>> >>>>> I'm thinking about using Cassandra as an in-memory key/value store >>>>> instead of memcached for a new project (just to get rid of a dependency >>>>> if possible). I was thinking about setting the replication factor to 1, >>>>> enabling off-heap row-cache and setting gc_grace_period to zero for the >>>>> CF that will be used for the key/value store. >>>>> >>>>> Has anyone tried this? Any comments? >>>>> >>>>> Thanks, >>>>> >>>>> Drew >>>>> >>>>> >>>> >>> >> >> > >
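For the rate-limiting use case Drew describes, the short-TTL pattern from Edward's checklist maps naturally onto per-key counters that expire with the window; a minimal in-process sketch (the clock is injected so it is testable, and none of this is a Cassandra client call):

```python
import time

class RateLimiter:
    """Fixed-window rate limiter: a per-key counter that resets after
    `window` seconds, mimicking a short-TTL counter row."""

    def __init__(self, limit, window, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock
        self._counts = {}  # key -> (window_start, count)

    def allow(self, key):
        now = self.clock()
        start, count = self._counts.get(key, (now, 0))
        if now - start >= self.window:   # the "TTL" expired: start fresh
            start, count = now, 0
        self._counts[key] = (start, count + 1)
        return count + 1 <= self.limit

fake_now = [0.0]
rl = RateLimiter(limit=3, window=60, clock=lambda: fake_now[0])
print([rl.allow("api-key-1") for _ in range(4)])  # [True, True, True, False]
fake_now[0] = 61.0  # window expired
print(rl.allow("api-key-1"))  # True
```

In the C* version the expiry would come from the TTL on the counter row rather than a clock check, but the accounting shape is the same.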
Re: Cassandra instead of memcached
Thanks guys, this is what I was looking for. @Edward. I definitely like crazy ideas ;), I think the only issue here is that C* is a disk space hog, so not sure if that would be feasible since free RAM is not as abundant as disk. BTW, I watched your presentation, are you guys still using C* as in-memory store? On Mar 6, 2013, at 7:44 AM, Edward Capriolo wrote: > http://www.slideshare.net/edwardcapriolo/cassandra-as-memcache > > Read at ONE. > READ_REPAIR_CHANCE as low as possible. > > Use short TTL and short GC_GRACE. > > Make the in memory memtable size as high as possible to avoid flushing and > compacting. > > Optionally turn off commit log. > > You can use cassandra like memcache but it is not a memcache replacement. > Cassandra persists writes and compacts SSTables, memcache only has to keep > data in memory. > > If you want to try a crazy idea, try putting your persistent data on a ram > disk! Not data/system however! > > > > > > > On Wed, Mar 6, 2013 at 2:45 AM, aaron morton wrote: > consider disabling durable_writes in the KS config to remove writing to the > commit log. That will speed things up for you. Note that you risk losing data > if cassandra crashes or is not shut down with nodetool drain. > > Even if you set the gc_grace to 0, deletes will still need to be committed to > disk. > > Cheers > > - > Aaron Morton > Freelance Cassandra Developer > New Zealand > > @aaronmorton > http://www.thelastpickle.com > > On 5/03/2013, at 9:51 AM, Drew Kutcharian wrote: > >> Thanks Ben, that article was actually the reason I started thinking about >> removing memcached. >> >> I wanted to see what would be the optimum config to use C* as an in-memory >> store. >> >> -- Drew >> >> >> On Mar 5, 2013, at 2:39 AM, Ben Bromhead wrote: >> >>> Check out >>> http://techblog.netflix.com/2012/07/benchmarking-high-performance-io-with.html >>> >>> Netflix used Cassandra with SSDs and were able to drop their memcache >>> layer. 
Mind you they were not using it purely as an in memory KV store. >>> >>> Ben >>> Instaclustr | www.instaclustr.com | @instaclustr >>> >>> >>> >>> On 05/03/2013, at 4:33 PM, Drew Kutcharian wrote: >>> >>>> Hi Guys, >>>> >>>> I'm thinking about using Cassandra as an in-memory key/value store instead >>>> of memcached for a new project (just to get rid of a dependency if >>>> possible). I was thinking about setting the replication factor to 1, >>>> enabling off-heap row-cache and setting gc_grace_period to zero for the CF >>>> that will be used for the key/value store. >>>> >>>> Has anyone tried this? Any comments? >>>> >>>> Thanks, >>>> >>>> Drew >>>> >>>> >>> >> > >
Re: Cassandra instead of memcached
Thanks Ben, that article was actually the reason I started thinking about removing memcached. I wanted to see what would be the optimum config to use C* as an in-memory store. -- Drew On Mar 5, 2013, at 2:39 AM, Ben Bromhead wrote: > Check out > http://techblog.netflix.com/2012/07/benchmarking-high-performance-io-with.html > > Netflix used Cassandra with SSDs and were able to drop their memcache layer. > Mind you they were not using it purely as an in memory KV store. > > Ben > Instaclustr | www.instaclustr.com | @instaclustr > > > > On 05/03/2013, at 4:33 PM, Drew Kutcharian wrote: > >> Hi Guys, >> >> I'm thinking about using Cassandra as an in-memory key/value store instead >> of memcached for a new project (just to get rid of a dependency if >> possible). I was thinking about setting the replication factor to 1, >> enabling off-heap row-cache and setting gc_grace_period to zero for the CF >> that will be used for the key/value store. >> >> Has anyone tried this? Any comments? >> >> Thanks, >> >> Drew >> >> >
Cassandra instead of memcached
Hi Guys, I'm thinking about using Cassandra as an in-memory key/value store instead of memcached for a new project (just to get rid of a dependency if possible). I was thinking about setting the replication factor to 1, enabling off-heap row-cache and setting gc_grace_period to zero for the CF that will be used for the key/value store. Has anyone tried this? Any comments? Thanks, Drew
Re: Cassandra Geospatial Search
Hey Dean, do you guys have any thoughts on how to implement it yet? On Feb 15, 2013, at 6:18 AM, "Hiller, Dean" wrote: > Yes, this is in PlayOrm's roadmap as well but not there yet. > > Dean > > On 2/13/13 6:42 PM, "Drew Kutcharian" wrote: > >> Hi Guys, >> >> Has anyone on this mailing list tried to build a bounding box style (get >> the records inside a known bounding box) geospatial search? I've been >> researching this a bit and it seems like the only attempt at this was by >> the SimpleGeo guys, but there isn't much public info out there on how they >> did it besides a video. >> >> -- Drew >> >
Re: multiget_slice using CQL3
Thanks Edward. I assume I can still do a column slice using WHERE in case of wide rows. I wonder if the multiget count is the only thing that you can do using thrift but not CQL3. On Feb 14, 2013, at 6:35 PM, Edward Capriolo wrote: > The equivalent of multiget slice is > > select * from table where primary_key in ('that', 'this', 'the other thing') > > Not sure if you can count these in a way that makes sense since you > cannot group. > > On Thu, Feb 14, 2013 at 9:17 PM, Michael Kjellman > wrote: >> I'm confused what you are looking to do. >> >> CQL3 syntax (SELECT * FROM keyspace.cf WHERE user = 'cooldude') has >> nothing to do with thrift client calls (such as multiget_slice) >> >> What is your goal here? >> >> Best, >> michael >> >> On 2/14/13 5:57 PM, "Drew Kutcharian" wrote: >> >>> Hi Guys, >>> >>> What's the syntax for multiget_slice in CQL3? How about multiget_count? >>> >>> -- Drew >>
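Edward's mapping can be spelled out against a hypothetical wide-row table (table and column names here are illustrative, not from the thread):

```sql
-- One partition per user, one CQL3 row per event (a classic wide row):
CREATE TABLE events (
  user_id text,
  event_time timestamp,
  payload text,
  PRIMARY KEY (user_id, event_time)
);

-- multiget_slice over several keys, per Edward's IN example:
SELECT * FROM events WHERE user_id IN ('that', 'this', 'the other thing');

-- A column slice on one wide row, via a range on the clustering column:
SELECT * FROM events
WHERE user_id = 'that'
  AND event_time >= '2013-01-01' AND event_time < '2013-02-01';

-- multiget_count has no grouped equivalent in CQL3 of this era;
-- the workaround is one COUNT per key:
SELECT COUNT(*) FROM events WHERE user_id = 'that';
```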
multiget_slice using CQL3
Hi Guys, What's the syntax for multiget_slice in CQL3? How about multiget_count? -- Drew
Cassandra Geospatial Search
Hi Guys, Has anyone on this mailing list tried to build a bounding box style (get the records inside a known bounding box) geospatial search? I've been researching this a bit and it seems like the only attempt at this was by the SimpleGeo guys, but there isn't much public info out there on how they did it besides a video. -- Drew
Re: Documentation/Examples for DataStax java-driver
That's kinda what I was thinking too, just wanted to see if there's a built-in way. On Feb 13, 2013, at 10:07 AM, Shahryar Sedghi wrote: > The API allows you to build your own batch through building a query. I do not use > that, nor counter columns. I do not build a query, I create a CQL like: > String batchInsert = "BEGIN BATCH " + > > "INSERT INTO xyz (a, b, c) " + > " VALUES ( ?, ?, ?) " + > > "INSERT INTO def (a, b, c) " + > > " VALUES ( ?, ?, ?) " + > > "APPLY BATCH"; > > PreparedStatement prBatchInsert = session.prepare(batchInsert); > prBatchInsert.setConsistencyLevel(ConsistencyLevel.QUORUM); > BoundStatement query = prBatchInsert.bind(1,2,3, 1,2,3); > session.execute(query); > > I got session through this: > > cluster = > Cluster.builder().addContactPoint(getInitParameter("cassandraCluster")) > > .withRetryPolicy(DowngradingConsistencyRetryPolicy.INSTANCE ).build(); > session = cluster.connect(getInitParameter("keyspace")); > > I have queries where I use BEGIN UNLOGGED BATCH instead of BEGIN BATCH > > Hopefully it helps > > > On Wed, Feb 13, 2013 at 12:23 PM, Drew Kutcharian wrote: > @Shahryar/Gabriel > I know the source code is nicely documented, but I couldn't find much info on: > 1. Creating/submitting atomic/non-atomic batches. > 2. Handling Counter columns > Do you have any examples for that? > > @Edward > I was under the impression that the client-dev mailing list was to be used by the > developers/committers of the client libs and each client has their own > mailing list such as hector, but I'm not sure there exists a mailing list for > DataStax's java-driver. > > > -- Drew > > > > On Feb 13, 2013, at 8:06 AM, Edward Capriolo wrote: > > > Just an FYI. More appropriate for the client-dev list. > > > > On Wed, Feb 13, 2013 at 10:37 AM, Gabriel Ciuloaica > > wrote: > >> Code has good documentation and also the example module has enough sample > >> code to help you get started. 
> >> > >> --Gabi > >> > >> On 2/13/13 5:31 PM, Shahryar Sedghi wrote: > >> > >> Source code has enough documentation in it, apparently this is how they do > >> it with new stuff. Start with the Cluster class, it tells you how to write. If > >> you still have a problem let me know, I can give you sample code. > >> > >> > >> On Tue, Feb 12, 2013 at 9:19 PM, Drew Kutcharian wrote: > >>> > >>> Are there any documentation/examples available for DataStax java-driver > >>> besides what's in the GitHub repo? > >>> > >>> -- Drew > >> > >> > >> > >> > >> -- > >> "Life is what happens while you are making other plans." ~ John Lennon > >> > >> > > > > > -- > "Life is what happens while you are making other plans." ~ John Lennon
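Written out as plain CQL, the batch that Shahryar's Java string builds looks roughly like this (using the thread's placeholder tables xyz and def; column names are the thread's placeholders too):

```sql
BEGIN BATCH
  INSERT INTO xyz (a, b, c) VALUES (?, ?, ?);
  INSERT INTO def (a, b, c) VALUES (?, ?, ?);
APPLY BATCH;
```

All six bind markers are then bound in order in a single prepared-statement call, as in the quoted `prBatchInsert.bind(1,2,3, 1,2,3)`. `BEGIN UNLOGGED BATCH` is the variant Shahryar mentions: it skips the batchlog, trading the atomicity guarantee for lower overhead.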
Re: Documentation/Examples for DataStax java-driver
@Edward. I completely agree. I was just explaining my original rationale in posting to this mailing list. Maybe it's time to start a dedicated DataStax java-driver mailing list. On Feb 13, 2013, at 12:53 PM, Edward Capriolo wrote: > @Drew > > This list is for cassandra users. Since the DataStax java-driver is > not actually part of Cassandra. If every user comes here to talk about > their driver/orm/problems they are having with code that is not part > of cassandra this list will get noisy. > > IMHO client-dev is the right place for these type of topics. > Occasionally a cross post makes sense. > > Edward > > On Wed, Feb 13, 2013 at 12:23 PM, Drew Kutcharian wrote: >> @Shahryar/Gabriel >> I know the source code is nicely documented, but I couldn't find much info >> on: >> 1. Creating/submitting atomic/non-atomic batches. >> 2. Handling Counter columns >> Do you have any examples for that? >> >> @Edward >> I was under the impression that the client-dev mailing list was to be used by the >> developers/committers of the client libs and each client has their own >> mailing list such as hector, but I'm not sure there exists a mailing list for >> DataStax's java-driver. >> >> >> -- Drew >> >> >> >> On Feb 13, 2013, at 8:06 AM, Edward Capriolo wrote: >> >>> Just an FYI. More appropriate for the client-dev list. >>> >>> On Wed, Feb 13, 2013 at 10:37 AM, Gabriel Ciuloaica >>> wrote: >>>> Code has good documentation and also the example module has enough sample >>>> code to help you get started. >>>> >>>> --Gabi >>>> >>>> On 2/13/13 5:31 PM, Shahryar Sedghi wrote: >>>> >>>> Source code has enough documentation in it, apparently this is how they do >>>> it with new stuff. Start with the Cluster class, it tells you how to write. If >>>> you still have a problem let me know, I can give you sample code. >>>> >>>> >>>> On Tue, Feb 12, 2013 at 9:19 PM, Drew Kutcharian wrote: >>>>> >>>>> Are there any documentation/examples available for DataStax java-driver >>>>> besides what's in the GitHub repo? 
>>>>> >>>>> -- Drew >>>> >>>> >>>> >>>> >>>> -- >>>> "Life is what happens while you are making other plans." ~ John Lennon >>>> >>>> >>
Re: Documentation/Examples for DataStax java-driver
@Shahryar/Gabriel I know the source code is nicely documented, but I couldn't find much info on: 1. Creating/submitting atomic/non-atomic batches. 2. Handling Counter columns Do you have any examples for that? @Edward I was under the impression that the client-dev mailing list was to be used by the developers/committers of the client libs and each client has their own mailing list such as hector, but I'm not sure there exists a mailing list for DataStax's java-driver. -- Drew On Feb 13, 2013, at 8:06 AM, Edward Capriolo wrote: > Just an FYI. More appropriate for the client-dev list. > > On Wed, Feb 13, 2013 at 10:37 AM, Gabriel Ciuloaica > wrote: >> Code has good documentation and also the example module has enough sample >> code to help you get started. >> >> --Gabi >> >> On 2/13/13 5:31 PM, Shahryar Sedghi wrote: >> >> Source code has enough documentation in it, apparently this is how they do >> it with new stuff. Start with the Cluster class, it tells you how to write. If >> you still have a problem let me know, I can give you sample code. >> >> >> On Tue, Feb 12, 2013 at 9:19 PM, Drew Kutcharian wrote: >>> >>> Are there any documentation/examples available for DataStax java-driver >>> besides what's in the GitHub repo? >>> >>> -- Drew >> >> >> >> >> -- >> "Life is what happens while you are making other plans." ~ John Lennon >> >>
Documentation/Examples for DataStax java-driver
Are there any documentation/examples available for DataStax java-driver besides what's in the GitHub repo? -- Drew
Re: Cassandra 1.2 Atomic Batches and Thrift API
Thanks Sylvain. BTW, what's the status of the java-driver? When will it be GA? On Feb 12, 2013, at 1:19 AM, Sylvain Lebresne wrote: > Yes, it's called atomic_batch_mutate and is used like batch_mutate. If you > don't use thrift directly (which would qualify as a very good idea), you'll > need to refer to whatever client library you are using to see if 1) support > for that new call has been added and 2) how to use it. If you are not sure > what is the best way to contact the developers of you client library, then > you may try the Cassandra client mailing list: > client-...@cassandra.apache.org. > > -- > Sylvain > > > On Tue, Feb 12, 2013 at 4:44 AM, Drew Kutcharian wrote: > Hey Guys, > > Is the new atomic batch feature in Cassandra 1.2 available via the thrift > API? If so, how can I use it? > > -- Drew > >
Re: Operation Consideration with Counter Column Families
For anyone interested, I came across this video where Sylvain explains how counters are actually implemented in Cassandra. http://vimeo.com/26011102 On Feb 6, 2013, at 8:08 PM, aaron morton wrote: >> Thanks Aaron, so will there only be one "value" for each counter column per >> sstable just like regular columns? > Yes. > >> For some reason I was under the impression that Cassandra keeps a log of >> all the increments not the actual value. > Not as far as I understand. > > Cheers > > - > Aaron Morton > Freelance Cassandra Developer > New Zealand > > @aaronmorton > http://www.thelastpickle.com > > On 6/02/2013, at 11:15 AM, Drew Kutcharian wrote: > >> Thanks Aaron, so will there only be one "value" for each counter column per >> sstable just like regular columns? For some reason I was under the >> impression that Cassandra keeps a log of all the increments not the actual >> value. >> >> >> On Feb 5, 2013, at 12:36 PM, aaron morton wrote: >> >>>> Are there any specific operational considerations one should make when >>>> using counter column families? >>> Performance, as they incur a read and a write. >>> There were some issues with overcounts in log replay (see the changes.txt). >>> >>>> How are counter column families stored on disk? >>> Same as regular CF's. >>> >>>> How do they affect compaction? >>> None. >>> >>> Cheers >>> >>> - >>> Aaron Morton >>> Freelance Cassandra Developer >>> New Zealand >>> >>> @aaronmorton >>> http://www.thelastpickle.com >>> >>> On 6/02/2013, at 7:47 AM, Drew Kutcharian wrote: >>> >>>> Hey Guys, >>>> >>>> Are there any specific operational considerations one should make when >>>> using counter column families? How are counter column families stored on >>>> disk? How do they affect compaction? >>>> >>>> -- Drew >>>> >>> >> >
Cassandra 1.2 Atomic Batches and Thrift API
Hey Guys, Is the new atomic batch feature in Cassandra 1.2 available via the thrift API? If so, how can I use it? -- Drew
Re: Operation Consideration with Counter Column Families
Thanks Aaron, so will there only be one "value" for each counter column per sstable just like regular columns? For some reason I was under the impression that Cassandra keeps a log of all the increments not the actual value. On Feb 5, 2013, at 12:36 PM, aaron morton wrote: >> Are there any specific operational considerations one should make when using >> counter column families? > Performance, as they incur a read and a write. > There were some issues with overcounts in log replay (see the changes.txt). > >> How are counter column families stored on disk? > Same as regular CF's. > >> How do they affect compaction? > None. > > Cheers > > - > Aaron Morton > Freelance Cassandra Developer > New Zealand > > @aaronmorton > http://www.thelastpickle.com > > On 6/02/2013, at 7:47 AM, Drew Kutcharian wrote: > >> Hey Guys, >> >> Are there any specific operational considerations one should make when using >> counter column families? How are counter column families stored on disk? >> How do they affect compaction? >> >> -- Drew >> >
Operation Consideration with Counter Column Families
Hey Guys, Are there any specific operational considerations one should make when using counter column families? How are counter column families stored on disk? How do they affect compaction? -- Drew
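For context on what a counter column family looks like in practice, here is a minimal CQL3 sketch (table and column names are illustrative, not from the thread). A counter table is a normal table whose non-key columns are all of type counter:

```sql
CREATE TABLE page_views (
  page text PRIMARY KEY,
  views counter
);

-- Counters are only ever incremented or decremented, never INSERTed,
-- which is why (per Aaron's reply above) each update incurs a read and a write:
UPDATE page_views SET views = views + 1 WHERE page = 'home';
```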
Re: How to show unread messages counts?
Thanks Aaron. On Jan 2, 2013, at 2:55 PM, aaron morton wrote: >> Currently I'm thinking of having a separate Counter CF just to keep the >> number of unread messages in there. Is this a good approach? > Yup. > Add a UserMetrics CF with columns for the counts you want to keep. > > Cheers > > - > Aaron Morton > Freelance Cassandra Developer > New Zealand > > @aaronmorton > http://www.thelastpickle.com > > On 3/01/2013, at 8:37 AM, Drew Kutcharian wrote: > >> Happy New Year Everyone! >> >> What's the best way to model "unread messages count" in Cassandra? I have a >> UserMessage CF where the row key is the user id and the column name is the >> message id (timeuuid) and I store the message and the status (READ/UNREAD) >> in the value column. I would like to be able to show the number of unread >> messages to the user. Currently I'm thinking of having a separate Counter CF >> just to keep the number of unread messages in there. Is this a good >> approach? Is there a better one? >> >> Thanks, >> >> Drew >
How to show unread messages counts?
Happy New Year Everyone! What's the best way to model "unread messages count" in Cassandra? I have a UserMessage CF where the row key is the user id and the column name is the message id (timeuuid) and I store the message and the status (READ/UNREAD) in the value column. I would like to be able to show the number of unread messages to the user. Currently I'm thinking of having a separate Counter CF just to keep the number of unread messages in there. Is this a good approach? Is there a better one? Thanks, Drew
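Drew's model plus Aaron's UserMetrics suggestion can be sketched in CQL3 (table and column names are illustrative, not from the thread):

```sql
-- One partition per user; timeuuid clustering gives chronological order:
CREATE TABLE user_message (
  user_id uuid,
  message_id timeuuid,
  body text,
  status text,          -- 'READ' or 'UNREAD'
  PRIMARY KEY (user_id, message_id)
);

-- Separate counter CF for the unread count (Aaron's "UserMetrics" CF):
CREATE TABLE user_metrics (
  user_id uuid PRIMARY KEY,
  unread_count counter
);

-- Increment on delivery; decrement when a message is marked read:
UPDATE user_metrics SET unread_count = unread_count + 1 WHERE user_id = ?;
UPDATE user_metrics SET unread_count = unread_count - 1 WHERE user_id = ?;
```

One caveat worth noting: the counter and the status column cannot be updated atomically, so the count can drift from the true number of UNREAD rows if a client fails between the two writes.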
Re: State of Cassandra and Java 7
In addition, the DataStax official documentation states: "Versions earlier than 1.6.0_19 should not be used. Java 7 is not recommended." http://www.datastax.com/docs/1.1/install/install_rpm On Dec 14, 2012, at 9:42 AM, Aaron Turner wrote: > Does Datastax (or any other company) support Cassandra under Java 7? > Or will they tell you to downgrade when you have some problem, because > they don't support C* running on 7? > > At least that would be one way of defining "officially supported". > > On Fri, Dec 14, 2012 at 2:22 AM, Sylvain Lebresne > wrote: >> What kind of official statement do you want? As far as I can be considered >> an official voice of the project, my statement is: "various people run in >> production with Java 7 and it seems to work". >> >> Or to answer the initial question, the only issue related to Java 7 that I >> know of is CASSANDRA-4958, but that's osx specific (I wouldn't advise using >> osx in production anyway) and it's not directly related to Cassandra anyway >> so you can easily use the beta version of snappy-java as a workaround if you >> want to. So that non blocking issue aside, and as far as we know, Cassandra >> supports Java 7. Is it rock-solid in production? Well, only repeated use in >> production can tell, and that's not really in the hand of the project. We do >> obviously encourage people to try Java 7 as much as possible and report any >> problem they may run into, but I would have though this goes without saying. >> >> >> On Fri, Dec 14, 2012 at 4:05 AM, Rob Coli wrote: >>> >>> On Thu, Dec 13, 2012 at 11:43 AM, Drew Kutcharian wrote: >>>> With Java 6 being EOL'd soon >>>> (https://blogs.oracle.com/java/entry/end_of_public_updates_for), what's the >>>> status of Cassandra's Java 7 support? Anyone using it in production? Any >>>> outstanding *known* issues? >>> >>> I'd love to see an official statement from the project, due to the >>> sort of EOL issues you're referring to. 
Unfortunately previous >>> requests on this list for such a statement have gone unanswered. >>> >>> The non-official response is that various people run in production >>> with Java 7 and it seems to work. :) >>> >>> =Rob >>> >>> -- >>> =Robert Coli >>> AIM>ALK - rc...@palominodb.com >>> YAHOO - rcoli.palominob >>> SKYPE - rcoli_palominodb >> >> > > > > -- > Aaron Turner > http://synfin.net/ Twitter: @synfinatic > http://tcpreplay.synfin.net/ - Pcap editing and replay tools for Unix & > Windows > Those who would give up essential Liberty, to purchase a little temporary > Safety, deserve neither Liberty nor Safety. >-- Benjamin Franklin > "carpe diem quam minimum credula postero"
State of Cassandra and Java 7
Hey Guys, With Java 6 being EOL'd soon (https://blogs.oracle.com/java/entry/end_of_public_updates_for), what's the status of Cassandra's Java 7 support? Anyone using it in production? Any outstanding *known* issues? -- Drew
Re: Single Node Cassandra Installation
Thanks Rob, this makes sense. We only have one rack at this point, so I think it'd be better to start with PropertyFileSnitch to make Cassandra think that these nodes each are in a different rack without having to put them on different subnets. And I will have more flexibility (at the cost of keeping the property file in sync) when it comes to growth. What do you think? -- Drew On Nov 5, 2012, at 7:50 PM, Rob Coli wrote: > On Mon, Nov 5, 2012 at 12:23 PM, Drew Kutcharian wrote: >>> Switching from SimpleStrategy to RackAware can be a pain. >> >> Can you elaborate a bit? What would be the pain point? > > If you don't maintain the same replica placement vis a vis nodes on > your cluster, you have to dump and reload. > > Simple example, 6 node cluster RF=3 : > > SimpleSnitch : A B C D E F > > Data for natural range of A is also on B and C, the "next" nodes in the ring. > > RackAwareSnitches : A B C D E F > "racks" they are in : 1 1 2 2 3 3 > > Data for natural range of A is also on C and E, because despite not > being the next nodes in the RING, they are the first nodes in the next > rack. > > If however you go from simple to rack aware and put your nodes in racks like : > > A B C D E F > 1 2 3 1 2 3 > > Then you have the same replica placement that SimpleStrategy gives you > and can safely switch strategies/snitches on an existing cluster. Data > for A is on B and C, on the same hosts, but for different reasons. Use > nodetool getendpoints to test. > > =Rob > > -- > =Robert Coli > AIM>ALK - rc...@palominodb.com > YAHOO - rcoli.palominob > SKYPE - rcoli_palominodb
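For the PropertyFileSnitch approach Drew describes, each node's placement is declared in cassandra-topology.properties, which must be kept identical on every node. A hedged sketch with made-up addresses, mirroring Rob's "1 2 3 1 2 3" layout so replica placement matches what SimpleSnitch would have produced:

```properties
# cassandra-topology.properties (addresses and DC/rack names are hypothetical)
# Format: node_ip=datacenter:rack
10.0.0.1=DC1:RAC1
10.0.0.2=DC1:RAC2
10.0.0.3=DC1:RAC3
10.0.0.4=DC1:RAC1
10.0.0.5=DC1:RAC2
10.0.0.6=DC1:RAC3

# Fallback for any node not listed above
default=DC1:RAC1
```

As Rob suggests, `nodetool getendpoints` can be used before and after the snitch change to confirm the replicas for a given key did not move.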
Re: Single Node Cassandra Installation
I understand that with one node we will have no HA, but since we are just starting out we wanted to see what would be the bare minimum to go to production with and as we see traction we can add more nodes. > Switching from SimpleStrategy to RackAware can be a pain. Can you elaborate a bit? What would be the pain point? On Nov 5, 2012, at 10:12 AM, Michael Kjellman wrote: > Should be fine if one node can deal with your read and write load. > Switching from SimpleStrategy to RackAware can be a pain. That's a > potential growth point way down the line (if you ever have your nodes on > different switches). You might want to just setup your keyspace as > RackAware if you intend to keep this keyspace in production in the future. > > Redistributing load could take a while if your one node gets really > loaded, but obviously there are well documented ways to fix this with > nodetool. > > I'd make sure you start off your single node with a token of 0 so it's > easy to add more nodes later. > > Also, with a single node you will have no replication so I assume this is > okay in your production environment. > > On 11/5/12 9:59 AM, "zGreenfelder" wrote: > >> On Mon, Nov 5, 2012 at 12:49 PM, Drew Kutcharian wrote: >>> Hey Guys, >>> >>> What should I look out for when deploying a single node installation? >>> We want to launch a product that uses Cassandra and since we are going >>> to have very little load initially, we were thinking of just going live >>> with one node and eventually add more nodes as the load (hopefully) >>> grows. Is this practice recommended? >>> >> >> I'm far from an expert and single node should be fine for development, >> but for your production site, I'd suggest going with a minimal 'real' >> cluster (2-3 nodes). in my limited experience with hadoop, single >> node is significantly different from multinode, and I wouldn't want to >> start from a position where config is certain to have to >> fundamentally change as opposed to just grow. 
But that's just my >> first guess, you could try a dev move from a single node to a dual >> node to find out how difficult it might be. >> >> >> -- >> Even the Magic 8 ball has an opinion on email clients: Outlook not so >> good.
Single Node Cassandra Installation
Hey Guys, What should I look out for when deploying a single node installation? We want to launch a product that uses Cassandra and since we are going to have very little load initially, we were thinking of just going live with one node and eventually add more nodes as the load (hopefully) grows. Is this practice recommended? Thanks, Drew
Re: Data Modeling: Comments with Voting
Thanks Roshni, I'm not sure how #d will work when users are actually voting on a comment. What happens when two users vote on the same comment simultaneously? How do you update the entries in the #d column family to prevent duplicates? Also #a and #c can be combined together using TimeUUIDs as comment ids. - Drew On Sep 27, 2012, at 2:13 AM, Roshni Rajagopal wrote: > Hi Drew, > > I think you have 4 requirements. Here are my suggestions. > > a) store comments : have a static column family for comments with master data > like created date, created by , length etc > b) when a person votes for a comment, increment a vote counter : have a > counter column family for incrementing the votes for each comment > c) display comments sorted by date created: have a column family with a dummy > row id 'sort_by_time_list', column names can be date created(timeUUID), and > column value can be comment id > d) display comments sorted by number of votes: have a column family with a > dummy row id 'sort_by_votes_list' and column names can be a composite of > number of votes , and comment id ( as more than 1 comment can have the same > votes) > > > Regards, > Roshni > > > Date: Wed, 26 Sep 2012 17:36:13 -0700 > > From: k...@mustardgrain.com > > To: user@cassandra.apache.org > > CC: d...@venarc.com > > Subject: Re: Data Modeling: Comments with Voting > > > > Depending on your needs, you could simply duplicate the comments in two > > separate CFs with the column names including time in one and the vote in > > the other. If you allow for updates to the comments, that would pose > > some issues you'd need to solve at the app level. > > > > On 9/26/12 4:28 PM, Drew Kutcharian wrote: > > > Hi Guys, > > > > > > Wondering what would be the best way to model a flat (no sub comments, > > > i.e. twitter) comments list with support for voting (where I can sort by > > > create time or votes) in Cassandra? 
> > > > > > To demonstrate: > > > > > > Sorted by create time: > > > - comment 1 (5 votes) > > > - comment 2 (1 votes) > > > - comment 3 (no votes) > > > - comment 4 (10 votes) > > > > > > Sorted by votes: > > > - comment 4 (10 votes) > > > - comment 1 (5 votes) > > > - comment 2 (1 votes) > > > - comment 3 (no votes) > > > > > > It's the sorted-by-votes that I'm having a bit of trouble with. I'm > > > looking for a roll-your-own approach and prefer not to use secondary > > > indexes and CQL sorting. > > > > > > Thanks, > > > > > > Drew > > > > >
Data Modeling: Comments with Voting
Hi Guys, Wondering what would be the best way to model a flat (no sub comments, i.e. twitter) comments list with support for voting (where I can sort by create time or votes) in Cassandra? To demonstrate: Sorted by create time: - comment 1 (5 votes) - comment 2 (1 votes) - comment 3 (no votes) - comment 4 (10 votes) Sorted by votes: - comment 4 (10 votes) - comment 1 (5 votes) - comment 2 (1 votes) - comment 3 (no votes) It's the sorted-by-votes that I'm having a bit of trouble with. I'm looking for a roll-your-own approach and prefer not to use secondary indexes and CQL sorting. Thanks, Drew
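Roshni's #c/#d suggestions from the reply above can be sketched as two CQL3 tables, one per sort order (table and column names are illustrative, not from the thread). The by-time listing comes free from TimeUUID comment ids; the by-votes listing is a second table that must be rewritten whenever a vote count changes:

```sql
-- Sorted by create time: timeuuid clustering gives chronological order
CREATE TABLE comments_by_time (
  topic_id uuid,
  comment_id timeuuid,
  body text,
  votes int,
  PRIMARY KEY (topic_id, comment_id)
);

-- Sorted by votes: (votes, comment_id) clustering is Roshni's composite
-- of vote count and comment id; comment_id breaks ties between comments
-- with the same number of votes
CREATE TABLE comments_by_votes (
  topic_id uuid,
  votes int,
  comment_id timeuuid,
  PRIMARY KEY (topic_id, votes, comment_id)
) WITH CLUSTERING ORDER BY (votes DESC, comment_id ASC);
```

On each vote the application deletes the old (votes, comment_id) row and inserts the new one; as Drew's follow-up notes, doing that safely under concurrent voters is the hard part of this design.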
Re: DROP keyspace doesn't delete the files
bump On Jul 29, 2012, at 11:03 PM, Drew Kutcharian wrote: > Hi, > > What's the correct procedure to drop a keyspace? When I drop a keyspace, the > files of that keyspace don't get deleted. There is a JIRA on this: > > https://issues.apache.org/jira/browse/CASSANDRA-4075 > > Is this a bug or I'm missing something? > > I'm using Cassandra 1.1.2 on Ubuntu Linux with Sun JVM 1.6, 64bit > > Thanks, > > Drew >
DROP keyspace doesn't delete the files
Hi, What's the correct procedure to drop a keyspace? When I drop a keyspace, the files of that keyspace don't get deleted. There is a JIRA on this: https://issues.apache.org/jira/browse/CASSANDRA-4075 Is this a bug or I'm missing something? I'm using Cassandra 1.1.2 on Ubuntu Linux with Sun JVM 1.6, 64bit Thanks, Drew
Re: Highest and lowest valid values for UUIDs/TimeUUIDs
Nice, that's exactly what I was looking for. On Apr 24, 2012, at 11:21 AM, Tyler Hobbs wrote: > Oh, I just realized that you're asking about the lowest TimeUUID *overall*, > not just for a particular timestamp. Sorry. > > The lowest possible TimeUUID is '--1000-8080-808080808080'. > The highest is '--1fff-bf7f-7f7f7f7f7f7f'. > > On Tue, Apr 24, 2012 at 12:47 PM, Drew Kutcharian wrote: > Thanks. So looking at the code, to get the lowest possible TimeUUID value > using your function I should just call convert_time_to_uuid(0) ? > > > On Apr 24, 2012, at 10:15 AM, Tyler Hobbs wrote: > >> Yes, I have tested it. >> >> On Tue, Apr 24, 2012 at 12:08 PM, Drew Kutcharian wrote: >> Thanks Tyler. So have you actually tried this with Cassandra? >> >> >> >> On Apr 24, 2012, at 5:44 AM, Tyler Hobbs wrote: >> >>> At least for TimeUUIDs, this email I sent to client-dev@ a couple of weeks >>> ago should help to explain things: >>> http://www.mail-archive.com/client-dev@cassandra.apache.org/msg00125.html >>> >>> Looking at the linked pycassa code might be the most useful thing. >>> >>> On Tue, Apr 24, 2012 at 1:46 AM, Drew Kutcharian wrote: >>> Hi All, >>> >>> Considering that UUIDs are compared as numbers in Java [1], what are the >>> lowest and highest possible values a valid UUID can have? How about >>> TimeUUIDs? >>> >>> The reason I ask is that I would like to pick a "default" UUID value in a >>> composite column definition like Composite(UUID1, UUID2) where UUID1 can be >>> set to the default value if not supplied. In addition, it'd be nice if the >>> "default" columns are always sorted before the rest of the columns. >>> >>> I was thinking of just doing "new UUID(Long.MAX_VALUE, Long.MAX_VALUE)" or >>> "new UUID(Long.MIN_VALUE, Long.MIN_VALUE)" but not sure if that's going to >>> cause other issues that I'm not aware of. 
>>> >>> Thanks, >>> >>> Drew >>> >>> >>> [1] Here's the compareTo of java.util.UUID as a reference: >>> >>> public int compareTo(UUID val) { >>>// The ordering is intentionally set up so that the UUIDs >>>// can simply be numerically compared as two numbers >>>return (this.mostSigBits < val.mostSigBits ? -1 : >>>(this.mostSigBits > val.mostSigBits ? 1 : >>> (this.leastSigBits < val.leastSigBits ? -1 : >>> (this.leastSigBits > val.leastSigBits ? 1 : >>> 0)))); >>> } >>> >>> >>> >>> -- >>> Tyler Hobbs >>> DataStax >>> >> >> >> >> >> -- >> Tyler Hobbs >> DataStax >> > > > > > -- > Tyler Hobbs > DataStax >
Re: Highest and lowest valid values for UUIDs/TimeUUIDs
Thanks. So looking at the code, to get the lowest possible TimeUUID value using your function I should just call convert_time_to_uuid(0)? On Apr 24, 2012, at 10:15 AM, Tyler Hobbs wrote: > Yes, I have tested it. > > On Tue, Apr 24, 2012 at 12:08 PM, Drew Kutcharian wrote: > Thanks Tyler. So have you actually tried this with Cassandra? > > > > On Apr 24, 2012, at 5:44 AM, Tyler Hobbs wrote: > >> At least for TimeUUIDs, this email I sent to client-dev@ a couple of weeks >> ago should help to explain things: >> http://www.mail-archive.com/client-dev@cassandra.apache.org/msg00125.html >> >> Looking at the linked pycassa code might be the most useful thing. >> >> On Tue, Apr 24, 2012 at 1:46 AM, Drew Kutcharian wrote: >> Hi All, >> >> Considering that UUIDs are compared as numbers in Java [1], what are the >> lowest and highest possible values a valid UUID can have? How about >> TimeUUIDs? >> >> The reason I ask is that I would like to pick a "default" UUID value in a >> composite column definition like Composite(UUID1, UUID2) where UUID1 can be >> set to the default value if not supplied. In addition, it'd be nice if the >> "default" columns are always sorted before the rest of the columns. >> >> I was thinking of just doing "new UUID(Long.MAX_VALUE, Long.MAX_VALUE)" or >> "new UUID(Long.MIN_VALUE, Long.MIN_VALUE)" but not sure if that's going to >> cause other issues that I'm not aware of. >> >> Thanks, >> >> Drew >> >> >> [1] Here's the compareTo of java.util.UUID as a reference: >> >> public int compareTo(UUID val) { >> // The ordering is intentionally set up so that the UUIDs >> // can simply be numerically compared as two numbers >> return (this.mostSigBits < val.mostSigBits ? -1 : >> (this.mostSigBits > val.mostSigBits ? 1 : >> (this.leastSigBits < val.leastSigBits ? -1 : >> (this.leastSigBits > val.leastSigBits ? 1 : >> 0)))); >> } >> >> >> >> >> -- >> Tyler Hobbs >> DataStax >> > > > > > -- > Tyler Hobbs > DataStax >
Re: Highest and lowest valid values for UUIDs/TimeUUIDs
Thanks Tyler. So have you actually tried this with Cassandra? On Apr 24, 2012, at 5:44 AM, Tyler Hobbs wrote: > At least for TimeUUIDs, this email I sent to client-dev@ a couple of weeks > ago should help to explain things: > http://www.mail-archive.com/client-dev@cassandra.apache.org/msg00125.html > > Looking at the linked pycassa code might be the most useful thing. > > On Tue, Apr 24, 2012 at 1:46 AM, Drew Kutcharian wrote: > Hi All, > > Considering that UUIDs are compared as numbers in Java [1], what are the > lowest and highest possible values a valid UUID can have? How about TimeUUIDs? > > The reason I ask is that I would like to pick a "default" UUID value in a > composite column definition like Composite(UUID1, UUID2) where UUID1 can be > set to the default value if not supplied. In addition, it'd be nice if the > "default" columns are always sorted before the rest of the columns. > > I was thinking of just doing "new UUID(Long.MAX_VALUE, Long.MAX_VALUE)" or > "new UUID(Long.MIN_VALUE, Long.MIN_VALUE)" but not sure if that's going to > cause other issues that I'm not aware of. > > Thanks, > > Drew > > > [1] Here's the compareTo of java.util.UUID as a reference: > > public int compareTo(UUID val) { > // The ordering is intentionally set up so that the UUIDs > // can simply be numerically compared as two numbers > return (this.mostSigBits < val.mostSigBits ? -1 : > (this.mostSigBits > val.mostSigBits ? 1 : > (this.leastSigBits < val.leastSigBits ? -1 : > (this.leastSigBits > val.leastSigBits ? 1 : > 0)))); > } > > > > > -- > Tyler Hobbs > DataStax >
Highest and lowest valid values for UUIDs/TimeUUIDs
Hi All, Considering that UUIDs are compared as numbers in Java [1], what are the lowest and highest possible values a valid UUID can have? How about TimeUUIDs? The reason I ask is that I would like to pick a "default" UUID value in a composite column definition like Composite(UUID1, UUID2) where UUID1 can be set to the default value if not supplied. In addition, it'd be nice if the "default" columns are always sorted before the rest of the columns. I was thinking of just doing "new UUID(Long.MAX_VALUE, Long.MAX_VALUE)" or "new UUID(Long.MIN_VALUE, Long.MIN_VALUE)" but not sure if that's going to cause other issues that I'm not aware of. Thanks, Drew [1] Here's the compareTo of java.util.UUID as a reference: public int compareTo(UUID val) { // The ordering is intentionally set up so that the UUIDs // can simply be numerically compared as two numbers return (this.mostSigBits < val.mostSigBits ? -1 : (this.mostSigBits > val.mostSigBits ? 1 : (this.leastSigBits < val.leastSigBits ? -1 : (this.leastSigBits > val.leastSigBits ? 1 : 0)))); }
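Since compareTo treats the two 64-bit halves as signed longs, the Long.MIN_VALUE/Long.MAX_VALUE idea from the question can be checked directly in plain Java. A minimal sketch (this only reflects Java-side ordering; Cassandra's comparators work on raw bytes, so verify against your column comparator before relying on it):

```java
import java.util.UUID;

public class UuidBounds {
    // java.util.UUID.compareTo compares mostSigBits, then leastSigBits,
    // as *signed* longs, so these are the numeric extremes on the Java side.
    public static final UUID LOWEST  = new UUID(Long.MIN_VALUE, Long.MIN_VALUE);
    public static final UUID HIGHEST = new UUID(Long.MAX_VALUE, Long.MAX_VALUE);

    public static void main(String[] args) {
        UUID random = UUID.randomUUID();
        // Every version-4 UUID falls strictly between the two extremes,
        // because its version nibble can never be 0x0 or 0xf.
        System.out.println(LOWEST.compareTo(random) < 0);   // true
        System.out.println(random.compareTo(HIGHEST) < 0);  // true
        System.out.println(LOWEST);  // 80000000-0000-0000-8000-000000000000
    }
}
```

Note that LOWEST renders as 80000000-... because the sign bit of each half is set; a byte-wise (unsigned) comparator such as Cassandra's BytesType would order these two extremes the other way around.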
OFF TOPIC: Anyone knows of a good mailing list for network administrators?
Hi Guys, Sorry for posting this here, but I figured there should be a lot of smart Network/System Admins on this list who can recommend a good mailing list/forum for questions related to networking and rack/datacenter setups. Best, Drew
Re: Issue with cassandra-cli "assume"
Well if what you're saying is true, then the help is inconsistent: > It is also valid to specify the fully-qualified class name to a class that > extends org.apache.Cassandra.db.marshal.AbstractType. [default@unknown] help assume; assume <cf> comparator as <type>; assume <cf> sub_comparator as <type>; assume <cf> validator as <type>; assume <cf> keys as <type>; Assume one of the attributes (comparator, sub_comparator, validator or keys) of the given column family match specified type. The specified type will be used when displaying data returned from the column family. This statement does not change the column family definition stored in Cassandra. It only affects the cli and how it will transform values to be sent to and interprets results from Cassandra. If results from Cassandra do not validate according to the assumptions an error is displayed in the cli. Required Parameters: - cf: Name of the column family to make the assumption about. - type: Validator type to use when processing values. Supported values are: - ascii - bytes - counterColumn (distributed counter column) - int - integer (a generic variable-length integer type) - lexicalUUID - long - utf8 It is also valid to specify the fully-qualified class name to a class that extends org.apache.Cassandra.db.marshal.AbstractType. Examples: assume Standard1 comparator as lexicaluuid; assume Standard1 keys as ascii; On Mar 23, 2012, at 7:11 PM, Dave Brosius wrote: > I don't think that's possible with the cli. You'd have to embellish > CliClient.Function > > > > On 03/23/2012 09:59 PM, Drew Kutcharian wrote: >> I actually have a custom type, I put the BytesType in the example to >> demonstrate the issue is not with my custom type. >> >> -- Drew >> >> On Mar 23, 2012, at 6:46 PM, Dave Brosius wrote: >> >>> I think you want >>> >>> assume UserDetails validator as bytes; >>> >>> >>> >>> On 03/23/2012 08:09 PM, Drew Kutcharian wrote: >>>> Hi Everyone, >>>> >>>> I'm having an issue with cassandra-cli's assume command with a custom >>>> type.
I tried it with the built-in BytesType and got the same error: >>>> >>>> [default@test] assume UserDetails validator as >>>> org.apache.cassandra.db.marshal.BytesType; >>>> Syntax error at position 35: missing EOF at '.' >>>> >>>> I also tried it with single and double quotes with no success: >>>> [default@test] assume UserDetails validator as >>>> 'org.apache.cassandra.db.marshal.BytesType'; >>>> Syntax error at position 32: mismatched input >>>> ''org.apache.cassandra.db.marshal.BytesType'' expecting Identifier >>>> >>>> Is this a bug? >>>> >>>> I'm using Cassandra 1.0.7 on Mac OS X Lion. >>>> >>>> Thanks, >>>> >>>> Drew >>>> >>>> >> >
Re: Issue with cassandra-cli "assume"
I actually have a custom type; I put the BytesType in the example to demonstrate the issue is not with my custom type. -- Drew On Mar 23, 2012, at 6:46 PM, Dave Brosius wrote: > I think you want > > assume UserDetails validator as bytes; > > > > On 03/23/2012 08:09 PM, Drew Kutcharian wrote: >> Hi Everyone, >> >> I'm having an issue with cassandra-cli's assume command with a custom type. >> I tried it with the built-in BytesType and got the same error: >> >> [default@test] assume UserDetails validator as >> org.apache.cassandra.db.marshal.BytesType; >> Syntax error at position 35: missing EOF at '.' >> >> I also tried it with single and double quotes with no success: >> [default@test] assume UserDetails validator as >> 'org.apache.cassandra.db.marshal.BytesType'; >> Syntax error at position 32: mismatched input >> ''org.apache.cassandra.db.marshal.BytesType'' expecting Identifier >> >> Is this a bug? >> >> I'm using Cassandra 1.0.7 on Mac OS X Lion. >> >> Thanks, >> >> Drew >> >> >
Issue with cassandra-cli "assume"
Hi Everyone, I'm having an issue with cassandra-cli's assume command with a custom type. I tried it with the built-in BytesType and got the same error: [default@test] assume UserDetails validator as org.apache.cassandra.db.marshal.BytesType; Syntax error at position 35: missing EOF at '.' I also tried it with single and double quotes with no success: [default@test] assume UserDetails validator as 'org.apache.cassandra.db.marshal.BytesType'; Syntax error at position 32: mismatched input ''org.apache.cassandra.db.marshal.BytesType'' expecting Identifier Is this a bug? I'm using Cassandra 1.0.7 on Mac OS X Lion. Thanks, Drew
Re: Single Node Cassandra Installation
Thanks for the comments, I guess I will end up doing a 2-node cluster with replica count 2 and read consistency 1. -- Drew On Mar 15, 2012, at 4:20 PM, Thomas van Neerijnen wrote: > So long as data loss and downtime are acceptable risks a one node cluster is > fine. > Personally this is usually only acceptable on my workstation, even my dev > environment is redundant, because servers fail, usually when you least want > them to, like for example when you've decided to save costs by waiting before > implementing redundancy. Could a failure end up costing you more than you've > saved? I'd rather get cheaper servers (maybe even used off ebay??) so I could > have at least two of them. > > If you do go with a one node solution, although I haven't tried it myself Priam > looks like a good place to start for backups, otherwise roll your own with > incremental snapshotting turned on and a watch on the snapshot directory. > Storage on something like S3 or Cloud Files is very cheap so there's no good > excuse for no backups. > > On Thu, Mar 15, 2012 at 7:12 PM, R. Verlangen wrote: > Hi Drew, > > One other disadvantage is the lack of "consistency level" and "replication". > Both are part of the high availability / redundancy. So you would really > need to back up your single-node-"cluster" to some other external location. > > Good luck! > > > 2012/3/15 Drew Kutcharian > Hi, > > We are working on a project that initially is going to have very little data, > but we would like to use Cassandra to ease the future scalability. Due to > budget constraints, we were thinking to run a single node Cassandra for now > and then add more nodes as required. > > I was wondering if it is recommended to run a single node cassandra in > production? Are there any other issues besides lack of high availability? > > Thanks, > > Drew > > >
Single Node Cassandra Installation
Hi, We are working on a project that initially is going to have very little data, but we would like to use Cassandra to ease the future scalability. Due to budget constraints, we were thinking to run a single node Cassandra for now and then add more nodes as required. I was wondering if it is recommended to run a single node cassandra in production? Are there any other issues besides lack of high availability? Thanks, Drew
Anybody using Cassandra/DataStax Distribution with Java Service Wrapper?
Hi Everyone, I noticed that the DataStax's distribution of Cassandra uses Jsvc. There's also Tanuki Java Service Wrapper (http://wrapper.tanukisoftware.com/doc/english/download.jsp) that does a lot more than simply launching a Java process, for example it monitors the JVM and even restarts failed/hung JVMs. Is anyone using it with Cassandra? The license for Java Service Wrapper is GPLv2, so it's not really compatible with Apache License, but it would be even nicer to have an option of using it instead of Jsvc in the distribution. -- Drew
Implications of length of column names
What are the implications of using short vs long column names? Is it better to use short column names or longer ones? I know for MongoDB you are better of using short field names http://www.mongodb.org/display/DOCS/Optimizing+Storage+of+Small+Objects Does this apply to Cassandra column names? -- Drew
Deleting a column vs setting its value to empty
Hi Everyone, Let's say I have the following object which I would like to save in Cassandra: class User { UUID id; //row key String name; //columnKey: "name", columnValue: the name of the user String description; //columnKey: "description", columnValue: the description of the user } Description can be nullable. What's the best approach when a user updates her description and sets it to null? Should I delete the description column or set it to an empty string? In addition, if I go with the delete-column strategy, since I don't know what the previous value of description was (the column might not even exist), what would happen when I delete a non-existent column? Thanks, Drew
Re: How to reliably achieve unique constraints with Cassandra?
It makes great sense. You're a genius!! On Jan 6, 2012, at 10:43 PM, Narendra Sharma wrote: > Instead of trying to solve the generic problem of uniqueness, I would focus > on the specific problem. > > For example, let's consider your use case of user registration with email address as > key. You can do the following: > 1. Create CF (Users) where row key is UUID and has user info specific columns. > 2. Whenever user registers create a row in this CF with user status flag as > waiting for confirmation. > 3. Send email to the user's email address with link that contains the UUID > (or encrypted UUID) > 4. When user clicks on the link, use the UUID (or decrypted UUID) to lookup > user > 5. If the user exists with given UUID and status as waiting for confirmation > then update the status and create an entry in another CF (EmailUUIDIndex) > representing email address to UUID mapping. > 6. For authentication you can lookup in the index to get UUID and proceed. > 7. If a malicious user registers with someone else's email id then he will > never be able to confirm and will never have an entry in EmailUUIDIndex. As an > additional check if the entry for email id exists in EmailUUIDIndex then the > request for registration can be rejected right away. > > Make sense? > > -Naren > > On Fri, Jan 6, 2012 at 4:00 PM, Drew Kutcharian wrote: > So what are the common RIGHT solutions/tools for this? > > > On Jan 6, 2012, at 2:46 PM, Narendra Sharma wrote: > >> >>>It's very surprising that no one seems to have solved such a common use >> >>>case. >> I would say people have solved it using RIGHT tools for the task. >> >> >> >> On Fri, Jan 6, 2012 at 2:35 PM, Drew Kutcharian wrote: >> Thanks everyone for the replies. Seems like there is no easy way to handle >> this. It's very surprising that no one seems to have solved such a common >> use case.
>> >> -- Drew >> >> On Jan 6, 2012, at 2:11 PM, Bryce Allen wrote: >> >> > That's a good question, and I'm not sure - I'm fairly new to both ZK >> > and Cassandra. I found this wiki page: >> > http://wiki.apache.org/hadoop/ZooKeeper/FailureScenarios >> > and I think the lock recipe still works, even if a stale read happens. >> > Assuming that wiki page is correct. >> > >> > There is still subtlety to locking with ZK though, see (Locks based >> > on ephemeral nodes) from the zk mailing list in October: >> > http://mail-archives.apache.org/mod_mbox/zookeeper-user/201110.mbox/thread?0 >> > >> > -Bryce >> > >> > On Fri, 6 Jan 2012 13:36:52 -0800 >> > Drew Kutcharian wrote: >> >> Bryce, >> >> >> >> I'm not sure about ZooKeeper, but I know if you have a partition >> >> between HazelCast nodes, then the nodes can acquire the same lock >> >> independently in each divided partition. How does ZooKeeper handle >> >> this situation? >> >> >> >> -- Drew >> >> >> >> >> >> On Jan 6, 2012, at 12:48 PM, Bryce Allen wrote: >> >> >> >>> On Fri, 6 Jan 2012 10:03:38 -0800 >> >>> Drew Kutcharian wrote: >> >>>> I know that this can be done using a lock manager such as ZooKeeper >> >>>> or HazelCast, but the issue with using either of them is that if >> >>>> ZooKeeper or HazelCast is down, then you can't be sure about the >> >>>> reliability of the lock. So this potentially, in the very rare >> >>>> instance where the lock manager is down and two users are >> >>>> registering with the same email, can cause major issues. >> >>> >> >>> For most applications, if the lock manager is down, you don't >> >>> acquire the lock, so you don't enter the critical section. Rather >> >>> than allowing inconsistency, you become unavailable (at least to >> >>> writes that require a lock).
>> >>> >> >>> -Bryce >> >> >> >> >> >> >> -- >> Narendra Sharma >> Software Engineer >> http://www.aeris.com >> http://narendrasharma.blogspot.com/ >> >> > > > > > -- > Narendra Sharma > Software Engineer > http://www.aeris.com > http://narendrasharma.blogspot.com/ > >
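Narendra's steps 1-7 above can be sketched in plain Java, with ConcurrentHashMaps standing in for the Users and EmailUUIDIndex column families (all class and method names here are illustrative stand-ins, not from any Cassandra client):

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative sketch of the email-confirmation registration flow.
public class RegistrationFlow {
    enum Status { PENDING, CONFIRMED }

    static final Map<UUID, String> users = new ConcurrentHashMap<>();       // Users CF: UUID -> email
    static final Map<UUID, Status> status = new ConcurrentHashMap<>();
    static final Map<String, UUID> emailIndex = new ConcurrentHashMap<>();  // EmailUUIDIndex: written only on confirm

    // Steps 1-2: register creates a PENDING row keyed by UUID; only the
    // real mailbox owner ever sees the UUID (it goes out in the email link).
    static UUID register(String email) {
        if (emailIndex.containsKey(email)) {
            throw new IllegalStateException("email already confirmed");
        }
        UUID id = UUID.randomUUID();
        users.put(id, email);
        status.put(id, Status.PENDING);
        return id;
    }

    // Steps 4-5: only the holder of the emailed UUID can flip the status
    // and claim the email -> UUID index entry.
    static boolean confirm(UUID id) {
        if (status.replace(id, Status.PENDING, Status.CONFIRMED)) {
            emailIndex.putIfAbsent(users.get(id), id);
            return true;
        }
        return false;
    }

    // Step 6: authentication resolves email -> UUID through the index.
    static UUID lookup(String email) {
        return emailIndex.get(email);
    }

    public static void main(String[] args) {
        UUID id = register("drew@example.com");
        System.out.println(lookup("drew@example.com") == null); // true: not confirmed yet
        System.out.println(confirm(id));                        // true: first confirm wins
        System.out.println(id.equals(lookup("drew@example.com"))); // true
    }
}
```

The key design point is step 7: a squatter who registers someone else's address never learns the UUID, so the index entry (the thing that actually needs to be unique) is only ever written by the mailbox owner.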
Re: How to reliably achieve unique constraints with Cassandra?
So what are the common RIGHT solutions/tools for this? On Jan 6, 2012, at 2:46 PM, Narendra Sharma wrote: > >>>It's very surprising that no one seems to have solved such a common use > >>>case. > I would say people have solved it using RIGHT tools for the task. > > > > On Fri, Jan 6, 2012 at 2:35 PM, Drew Kutcharian wrote: > Thanks everyone for the replies. Seems like there is no easy way to handle > this. It's very surprising that no one seems to have solved such a common use > case. > > -- Drew > > On Jan 6, 2012, at 2:11 PM, Bryce Allen wrote: > > > That's a good question, and I'm not sure - I'm fairly new to both ZK > > and Cassandra. I found this wiki page: > > http://wiki.apache.org/hadoop/ZooKeeper/FailureScenarios > > and I think the lock recipe still works, even if a stale read happens. > > Assuming that wiki page is correct. > > > > There is still subtlety to locking with ZK though, see (Locks based > > on ephemeral nodes) from the zk mailing list in October: > > http://mail-archives.apache.org/mod_mbox/zookeeper-user/201110.mbox/thread?0 > > > > -Bryce > > > > On Fri, 6 Jan 2012 13:36:52 -0800 > > Drew Kutcharian wrote: > >> Bryce, > >> > >> I'm not sure about ZooKeeper, but I know if you have a partition > >> between HazelCast nodes, then the nodes can acquire the same lock > >> independently in each divided partition. How does ZooKeeper handle > >> this situation? > >> > >> -- Drew > >> > >> > >> On Jan 6, 2012, at 12:48 PM, Bryce Allen wrote: > >> > >>> On Fri, 6 Jan 2012 10:03:38 -0800 > >>> Drew Kutcharian wrote: > >>>> I know that this can be done using a lock manager such as ZooKeeper > >>>> or HazelCast, but the issue with using either of them is that if > >>>> ZooKeeper or HazelCast is down, then you can't be sure about the > >>>> reliability of the lock. So this potentially, in the very rare > >>>> instance where the lock manager is down and two users are > >>>> registering with the same email, can cause major issues.
> >>> > >>> For most applications, if the lock manager is down, you don't > >>> acquire the lock, so you don't enter the critical section. Rather > >>> than allowing inconsistency, you become unavailable (at least to > >>> writes that require a lock). > >>> > >>> -Bryce > >> > > > > > -- > Narendra Sharma > Software Engineer > http://www.aeris.com > http://narendrasharma.blogspot.com/ > >
Re: How to reliably achieve unique constraints with Cassandra?
Thanks everyone for the replies. Seems like there is no easy way to handle this. It's very surprising that no one seems to have solved such a common use case. -- Drew On Jan 6, 2012, at 2:11 PM, Bryce Allen wrote: > That's a good question, and I'm not sure - I'm fairly new to both ZK > and Cassandra. I found this wiki page: > http://wiki.apache.org/hadoop/ZooKeeper/FailureScenarios > and I think the lock recipe still works, even if a stale read happens. > Assuming that wiki page is correct. > > There is still subtlety to locking with ZK though, see (Locks based > on ephemeral nodes) from the zk mailing list in October: > http://mail-archives.apache.org/mod_mbox/zookeeper-user/201110.mbox/thread?0 > > -Bryce > > On Fri, 6 Jan 2012 13:36:52 -0800 > Drew Kutcharian wrote: >> Bryce, >> >> I'm not sure about ZooKeeper, but I know if you have a partition >> between HazelCast nodes, then the nodes can acquire the same lock >> independently in each divided partition. How does ZooKeeper handle >> this situation? >> >> -- Drew >> >> >> On Jan 6, 2012, at 12:48 PM, Bryce Allen wrote: >> >>> On Fri, 6 Jan 2012 10:03:38 -0800 >>> Drew Kutcharian wrote: >>>> I know that this can be done using a lock manager such as ZooKeeper >>>> or HazelCast, but the issue with using either of them is that if >>>> ZooKeeper or HazelCast is down, then you can't be sure about the >>>> reliability of the lock. So this potentially, in the very rare >>>> instance where the lock manager is down and two users are >>>> registering with the same email, can cause major issues. >>> >>> For most applications, if the lock manager is down, you don't >>> acquire the lock, so you don't enter the critical section. Rather >>> than allowing inconsistency, you become unavailable (at least to >>> writes that require a lock). >>> >>> -Bryce >>
Re: How to reliably achieve unique constraints with Cassandra?
Bryce, I'm not sure about ZooKeeper, but I know if you have a partition between HazelCast nodes, then the nodes can acquire the same lock independently in each divided partition. How does ZooKeeper handle this situation? -- Drew On Jan 6, 2012, at 12:48 PM, Bryce Allen wrote: > On Fri, 6 Jan 2012 10:03:38 -0800 > Drew Kutcharian wrote: >> I know that this can be done using a lock manager such as ZooKeeper >> or HazelCast, but the issue with using either of them is that if >> ZooKeeper or HazelCast is down, then you can't be sure about the >> reliability of the lock. So this potentially, in the very rare >> instance where the lock manager is down and two users are registering >> with the same email, can cause major issues. > > For most applications, if the lock manager is down, you don't acquire > the lock, so you don't enter the critical section. Rather than allowing > inconsistency, you become unavailable (at least to writes that require > a lock). > > -Bryce
Re: How to reliably achieve unique constraints with Cassandra?
Yes, my issue is with handling concurrent requests. I'm not sure how your logic will work with eventual consistency. I'm going to have the same issue in the "tracker" CF too, no? On Jan 6, 2012, at 10:38 AM, Mohit Anchlia wrote: > On Fri, Jan 6, 2012 at 10:03 AM, Drew Kutcharian wrote: >> Hi Everyone, >> >> What's the best way to reliably have unique constraints like functionality >> with Cassandra? I have the following (which I think should be very common) >> use case. >> >> User CF >> Row Key: user email >> Columns: userId: UUID, etc... >> >> UserAttribute1 CF: >> Row Key: userId (which is the uuid that's mapped to user email) >> Columns: ... >> >> UserAttribute2 CF: >> Row Key: userId (which is the uuid that's mapped to user email) >> Columns: ... >> >> The issue is we need to guarantee that no two people register with the same >> email address. In addition, without locking, potentially a malicious user >> can "hijack" another user's account by registering using the user's email >> address. > > It could be as simple as reading before writing to make sure that > email doesn't exist. But I think you are looking at how to handle 2 > concurrent requests for same email? Only way I can think of is: > > 1) Create new CF say tracker > 2) write email and time uuid to CF tracker > 3) read from CF tracker > 4) if you find a row other than yours then wait and read again from > tracker after few ms > 5) read from USER CF > 6) write if no rows in USER CF > 7) delete from tracker > > Please note you might have to modify this logic a little bit, but this > should give you some ideas of how to approach this problem without > locking. > > Regarding hijacking accounts, can you elaborate little more? >> >> I know that this can be done using a lock manager such as ZooKeeper or >> HazelCast, but the issue with using either of them is that if ZooKeeper or >> HazelCast is down, then you can't be sure about the reliability of the lock. 
>> So this potentially, in the very rare instance where the lock manager is >> down and two users are registering with the same email, can cause major >> issues. >> >> In addition, I know this can be done with other tools such as Redis (use >> Redis for this use case, and Cassandra for everything else), but I'm >> interested in hearing if anyone has solved this issue using Cassandra only. >> >> Thanks, >> >> Drew
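For reference, Mohit's seven tracker steps (quoted above) can be sketched like this, with in-memory maps standing in for the tracker and Users CFs. This is an illustrative model only: in Cassandra each line would be a separate write/read at an appropriate consistency level, and the back-off in step 4 only makes a collision unlikely rather than impossible, which is exactly the weakness being discussed in this thread.

```java
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Sketch of the announce/check/claim protocol; names are illustrative.
public class TrackerSketch {
    static final ConcurrentMap<String, UUID> tracker = new ConcurrentHashMap<>();
    static final ConcurrentMap<String, UUID> users = new ConcurrentHashMap<>();

    static boolean register(String email, UUID me) {
        tracker.put(email, me);                    // steps 1-2: announce intent
        try {
            if (!me.equals(tracker.get(email))) {  // step 3: read back our marker
                sleepQuietly(10);                  // step 4: someone else is racing; back off
                if (!me.equals(tracker.get(email))) return false;
            }
            // steps 5-6: write only if nobody has claimed the email yet
            return users.putIfAbsent(email, me) == null;
        } finally {
            tracker.remove(email, me);             // step 7: delete our tracker entry
        }
    }

    private static void sleepQuietly(long ms) {
        try { Thread.sleep(ms); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
    }

    public static void main(String[] args) {
        UUID me = UUID.randomUUID();
        System.out.println(register("drew@example.com", me));                // true: first writer wins
        System.out.println(register("drew@example.com", UUID.randomUUID())); // false: email already taken
    }
}
```

Note that putIfAbsent gives this sketch an atomicity the real Thrift-era Cassandra writes did not have; with plain last-write-wins columns, two writers racing inside the back-off window could still both "succeed", which is why the thread keeps circling back to external coordinators.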
How to reliably achieve unique constraints with Cassandra?
Hi Everyone, What's the best way to reliably have unique-constraint-like functionality with Cassandra? I have the following (which I think should be very common) use case. User CF Row Key: user email Columns: userId: UUID, etc... UserAttribute1 CF: Row Key: userId (which is the uuid that's mapped to user email) Columns: ... UserAttribute2 CF: Row Key: userId (which is the uuid that's mapped to user email) Columns: ... The issue is we need to guarantee that no two people register with the same email address. In addition, without locking, potentially a malicious user can "hijack" another user's account by registering using the user's email address. I know that this can be done using a lock manager such as ZooKeeper or HazelCast, but the issue with using either of them is that if ZooKeeper or HazelCast is down, then you can't be sure about the reliability of the lock. So this potentially, in the very rare instance where the lock manager is down and two users are registering with the same email, can cause major issues. In addition, I know this can be done with other tools such as Redis (use Redis for this use case, and Cassandra for everything else), but I'm interested in hearing if anyone has solved this issue using Cassandra only. Thanks, Drew
Choosing a Partitioner Type for Random java.util.UUID Row Keys
Hey Guys, I just came across http://wiki.apache.org/cassandra/ByteOrderedPartitioner and it got me thinking. If the row keys are java.util.UUIDs which are generated randomly (and securely), then what type of partitioner would be the best? Since the key values are already random, would it make a difference to use RandomPartitioner, or could one use ByteOrderedPartitioner or OrderPreservingPartitioner as well and get the same result? -- Drew
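One way to see why the choice barely matters for random keys: RandomPartitioner derives the placement token by MD5-hashing the key bytes, while ByteOrderedPartitioner uses the raw key bytes as the token, so with uniformly random version-4 UUID keys both spread load evenly. A rough sketch of the hashed-token idea (simplified; the real logic lives in org.apache.cassandra.dht.RandomPartitioner, and details like token range differ):

```java
import java.math.BigInteger;
import java.nio.ByteBuffer;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.UUID;

public class TokenDemo {
    // Simplified RandomPartitioner-style token: MD5 of the 16 key bytes.
    // With ByteOrderedPartitioner the raw key bytes themselves are the
    // token, so placement is uniform only if the keys are (random UUIDs are).
    static BigInteger md5Token(UUID key) {
        try {
            ByteBuffer bb = ByteBuffer.allocate(16);
            bb.putLong(key.getMostSignificantBits())
              .putLong(key.getLeastSignificantBits());
            byte[] digest = MessageDigest.getInstance("MD5").digest(bb.array());
            return new BigInteger(1, digest); // non-negative token
        } catch (NoSuchAlgorithmException e) {
            throw new AssertionError(e); // MD5 is always present in the JDK
        }
    }

    public static void main(String[] args) {
        // Adjacent keys land far apart on the ring under the hashed scheme.
        System.out.println(md5Token(new UUID(0, 0)));
        System.out.println(md5Token(new UUID(0, 1)));
    }
}
```

The practical difference shows up later: if any non-random keys are ever added to a ByteOrderedPartitioner cluster, the ring becomes unbalanced, whereas the hash keeps RandomPartitioner balanced regardless of key distribution.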
Re: raid 0 and ssd
RAID 0 is the fastest, but you'll lose the whole array if you lose a drive. One thing to keep in mind is that SSDs get slower as they get filled up and closer to their capacity due to garbage collection. If you want more info on how SSDs perform in general, the Percona guys have done extensive tests, in addition to comparing all the RAID levels, etc.: http://www.percona.com/docs/wiki/benchmark:ssd:start http://www.mysqlperformanceblog.com/2009/05/01/raid-vs-ssd-vs-fusionio/ (see the "RELATED SEARCHES" on the right side too) - Drew On Apr 13, 2011, at 9:42 PM, Anurag Gujral wrote: > Hi All, > We are using three SSD disks with Cassandra 0.7.3, should we set > them as RAID 0. What are the advantages and disadvantages of doing this. > Please advise. > > Thanks > Anurag
Re: Possible design flaw in "Cassandra By Example" blog
Thanks for your response. In general, seems like you always need some kind of external coordination if you are doing inverted indexes. How do others tackle this issue? Now would using secondary indexes be a good idea in this case considering cardinality of the keys will be pretty high? cheers, Drew On Apr 13, 2011, at 4:51 PM, Eric Evans wrote: > On Wed, 2011-04-13 at 15:07 -0700, Drew Kutcharian wrote: >> username = 'jericevans' >> password = '**' >> useruuid = str(uuid()) >> columns = {'id': useruuid, 'username': username, 'password': password} >> USER.insert(useruuid, columns) >> USERNAME.insert(username, {'id': useruuid}) >> >> How can I guarantee that USERNAME.insert(username, {'id': useruuid}) >> won't overwrite someone else's account. What I mean is how can I >> guarantee that a user's username doesn't already exist in Cassandra? I >> know I can check first, but in a highly concurrent environment, >> there's a possibility that between USER.insert(useruuid, columns) and >> USERNAME.insert(username, {'id': useruuid}) someone else does the same >> USERNAME.insert(username, {'id': useruuid}) and hijack the user's >> account. > > Yes, this is a flaw. You'd need some sort of external coordination to > be sure you could prevent this. > > There are probably many such flaws, Twissandra wasn't meant to be a Real > app, it's an aid in teaching the query and data models, and a lot was > glossed over to keep it concise. > >> Seems like that USERNAME is something that the author has added since >> it's missing in original Twissandra source code. > > Right, since that article was written, the Username column family was > removed, and the User column family is now keyed on username (which > solves the problem of concurrent updates, by making it "last write > wins"). > > -- > Eric Evans > eev...@rackspace.com >
Possible design flaw in "Cassandra By Example" blog
Hi Everyone, I was going thru the Cassandra By Example blog http://www.rackspace.com/cloud/blog/2010/05/12/cassandra-by-example/ and I had a question about the user sign-up section: username = 'jericevans' password = '**' useruuid = str(uuid()) columns = {'id': useruuid, 'username': username, 'password': password} USER.insert(useruuid, columns) USERNAME.insert(username, {'id': useruuid}) How can I guarantee that USERNAME.insert(username, {'id': useruuid}) won't overwrite someone else's account? What I mean is how can I guarantee that a user's username doesn't already exist in Cassandra? I know I can check first, but in a highly concurrent environment, there's a possibility that between USER.insert(useruuid, columns) and USERNAME.insert(username, {'id': useruuid}) someone else does the same USERNAME.insert(username, {'id': useruuid}) and hijacks the user's account. Seems like USERNAME is something that the author added, since it's missing in the original Twissandra source code. Thanks, Drew
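The window described here is the classic check-then-act race. A minimal Java model of the two approaches (maps standing in for the USER/USERNAME column families; these names are illustrative — at the time Cassandra itself had no atomic compare-and-set, which is the whole problem):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

public class UsernameClaim {
    static final Map<String, UUID> naive = new HashMap<>();
    static final ConcurrentHashMap<String, UUID> atomic = new ConcurrentHashMap<>();

    // Racy: mirrors "check first, then insert". Between containsKey and put,
    // a concurrent caller can claim the same username and be silently
    // overwritten - both callers think they won.
    static boolean claimNaive(String username, UUID userId) {
        if (naive.containsKey(username)) return false;
        naive.put(username, userId);
        return true;
    }

    // Safe: a single atomic claim; exactly one caller wins and everyone
    // else observes the winner's id. This is the primitive an external
    // coordinator (or keying User on username, as Eric describes) supplies.
    static boolean claimAtomic(String username, UUID userId) {
        return atomic.putIfAbsent(username, userId) == null;
    }

    public static void main(String[] args) {
        UUID a = UUID.randomUUID(), b = UUID.randomUUID();
        System.out.println(claimAtomic("jericevans", a)); // true: first claim wins
        System.out.println(claimAtomic("jericevans", b)); // false: name already taken
    }
}
```

Eric's fix in the follow-up (keying the User CF on username) works because it collapses the two writes into one last-write-wins row, removing the window rather than locking around it.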
Re: Atomicity Strategies
I'm interested in this too, but I don't think this can be done with Cassandra alone. Cassandra doesn't support transactions. I think hector can retry operations, but I'm not sure about the atomicity of the whole thing. On Apr 8, 2011, at 1:26 PM, Alex Araujo wrote: > Hi, I was wondering if there are any patterns/best practices for creating > atomic units of work when dealing with several column families and their > inverted indices. > > For example, if I have Users and Groups column families and did something > like: > > Users.insert( user_id, columns ) > UserGroupTimeline.insert( group_id, { timeuuid() : user_id } ) > UserGroupStatus.insert( group_id + ":" + user_id, { "Active" : "True" } ) > UserEvents.insert( timeuuid(), { "user_id" : user_id, "group_id" : group_id, > "event_type" : "join" } ) > > Would I want the client to retry all subsequent operations that failed > against other nodes after n succeeded, maintain an "undo" queue of > operations to run, batch the mutations and choose a strong consistency level, > some combination of these/others, etc? > > Thanks, > Alex > >
Re: Secondary Indexes
I just added a new page to the wiki: http://wiki.apache.org/cassandra/SecondaryIndexes On Apr 3, 2011, at 7:37 PM, Drew Kutcharian wrote: > Yea I know, I just didn't know anyone can update it. > > > On Apr 3, 2011, at 1:26 PM, Joe Stump wrote: > >> >> On Apr 3, 2011, at 2:22 PM, Drew Kutcharian wrote: >> >>> Thanks Tyler. Can you update the wiki with these answers so they are stored >>> there for others to see too? >> >> Dude, it's a wiki. >
Re: Secondary Indexes
Yea I know, I just didn't know anyone can update it. On Apr 3, 2011, at 1:26 PM, Joe Stump wrote: > > On Apr 3, 2011, at 2:22 PM, Drew Kutcharian wrote: > >> Thanks Tyler. Can you update the wiki with these answers so they are stored >> there for others to see too? > > Dude, it's a wiki.
Re: Secondary Indexes
Thanks Tyler. Can you update the wiki with these answers so they are stored there for others to see too? On Apr 3, 2011, at 12:51 PM, Tyler Hobbs wrote: > I'm not familiar with some of the details, but I'll try to answer your > questions in general. Secondary indexes are implemented as a slightly > special separate column family with the indexed value serving as the key; > most of the properties of secondary indexes follow from that. > > On Sun, Apr 3, 2011 at 2:28 PM, Drew Kutcharian wrote: > Hi Everyone, > > I posted the following email a couple of days ago and I didn't get any > responses. Makes me wonder, does anyone on this list know/use Secondary > Indexes? They seem to me like a pretty big feature and it's a bit > disappointing to not be able to get a documentation on it. > > The only thing I could find on the Wiki was the end of > http://wiki.apache.org/cassandra/StorageConfiguration and that was pointing > to the non-existing page http://wiki.apache.org/cassandra/SecondaryIndexes . > In addition, I checked the JIRA CASSANDRA-749 and there's a lot of back and > forth that I couldn't really figure out what the conclusion was. What gives? > > I think the Cassandra committers are doing a heck of a job adding all these > cool functionalities but the documenting side doesn't really keep up. > Jonathan Ellis's blog post on Secondary Indexes only scratches the surface of > the topic, and if you consider that the whole point of using Cassandra is > scalability, there isn't a single mention of how Secondary Indexes scale!!! > (This same thing applies to Counters too) > > I'm not trying to be a complainer, but as someone new to this community, I > hope you guys take my comments as productive criticism. > > Thanks, > > Drew > > > [ORIGINAL POST] > > I just read Jonathan Ellis' great post on Secondary Indexes > (http://www.datastax.com/dev/blog/whats-new-cassandra-07-secondary-indexes) > and I was wondering where I can find a bit more info on them. 
I would like to > know: > > 1) Are there any limitations besides the hash properties (no between queries)? > Like size or memory, etc.? > > No. > > > 2) Are they distributed? If so, how does that work? How are they stored on > the nodes? > > Each node only indexes data that it holds locally. > > > 3) When you write a new row, when/how does the index get updated? What I > would like to know is the atomicity of the operation: is the "index write" > part of the "row write"? > > The row and index updates are one atomic operation. > > > 4) Is there a difference between creating a secondary index vs. creating an > "index" CF manually, such as "users_by_country"? > > > Yes. First, when creating your own index, a node may index data held by > another node. Second, updates to the index and data are not atomic. > > Your feedback is certainly helpful and hopefully we can get some of these > details into the documentation! > > -- > Tyler Hobbs > Software Engineer, DataStax > Maintainer of the pycassa Cassandra Python client library >
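Tyler's description (the index is itself a slightly special column family keyed by the indexed value, updated atomically with the row) can be sketched in a few lines; the dicts below are purely illustrative, not Cassandra internals:

```python
# Toy sketch of a node-local secondary index: the index is a little
# "column family" whose row key is the indexed value, and it is updated
# in the same step as the data write. Dicts stand in for CFs.
data = {}               # row key -> columns
index_by_country = {}   # indexed value -> set of row keys

def write_row(row_key, columns):
    # Drop any stale index entry for the old value of "country".
    old = data.get(row_key, {}).get("country")
    if old is not None:
        index_by_country[old].discard(row_key)
    # Data write and index write happen together; Cassandra keeps this
    # pairing atomic per node, which is what a hand-rolled index CF lacks.
    data[row_key] = columns
    index_by_country.setdefault(columns["country"], set()).add(row_key)

def rows_where_country(value):
    return sorted(index_by_country.get(value, ()))

write_row("drew", {"country": "US"})
write_row("roshan", {"country": "IN"})
write_row("drew", {"country": "CA"})     # an update re-indexes the row

print(rows_where_country("US"))  # -> []
print(rows_where_country("CA"))  # -> ['drew']
```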
Secondary Indexes
Hi Everyone, I posted the following email a couple of days ago and I didn't get any responses. Makes me wonder, does anyone on this list know/use Secondary Indexes? They seem like a pretty big feature, and it's a bit disappointing not to be able to find any documentation on them. The only thing I could find on the wiki was the end of http://wiki.apache.org/cassandra/StorageConfiguration and that was pointing to the non-existent page http://wiki.apache.org/cassandra/SecondaryIndexes . In addition, I checked the JIRA issue CASSANDRA-749, and there's so much back and forth that I couldn't figure out what the conclusion was. What gives?

I think the Cassandra committers are doing a heck of a job adding all these cool features, but the documentation side doesn't really keep up. Jonathan Ellis's blog post on Secondary Indexes only scratches the surface of the topic, and considering that the whole point of using Cassandra is scalability, there isn't a single mention of how Secondary Indexes scale! (The same applies to Counters.)

I'm not trying to be a complainer, but as someone new to this community, I hope you take my comments as constructive criticism.

Thanks, Drew

[ORIGINAL POST]

I just read Jonathan Ellis' great post on Secondary Indexes (http://www.datastax.com/dev/blog/whats-new-cassandra-07-secondary-indexes) and I was wondering where I can find a bit more info on them. I would like to know:

1) Are there any limitations besides the hash properties (no between queries)? Like size or memory, etc.?
2) Are they distributed? If so, how does that work? How are they stored on the nodes?
3) When you write a new row, when/how does the index get updated? What I would like to know is the atomicity of the operation: is the "index write" part of the "row write"?
4) Is there a difference between creating a secondary index vs. creating an "index" CF manually, such as "users_by_country"?
Re: Revised: Data Modeling advice for Cassandra 0.8 (added #8)
Thanks Aaron, I have already checked out Twissandra. I was mainly looking to see how Secondary Indexes can be used and how they affect data modeling. There doesn't seem to be a lot of coverage on them. In addition, I couldn't tell what kind of Partitioner Twissandra uses, or why. cheers, Drew On Mar 31, 2011, at 5:53 AM, aaron morton wrote: > Drew, > The Twissandra project is a twitter clone in cassandra, it may give you > some insight into how things can be modelled > https://github.com/thobbs/twissandra > > If you are just starting then consider something like... > > - CF to hold the user, their data and their network links > - standard CF to hold a blog entry, key is a timestamp > - standard CF to hold blog comments, each comment as a single column > where the name is a long timestamp > - standard CF to hold the blogs for a user, key is the user id and each > column is the blog key > > Thats not a great schema but it's a simple starting point you can build on > and refine using things like secondary indexes and doing more/less in the > same CF. > > Good luck. > Aaron > > On 30 Mar 2011, at 15:13, Drew Kutcharian wrote: > >> I'm pretty new to Cassandra and I would like to get your advice on modeling. >> The object model of the project that I'm working on will be pretty close to >> Blogger, Tumblr, etc. (or any other blogging website). >> Where you have Users, that each can have many Blogs and each Blog can have >> many comments. How would you model this efficiently considering: >> >> 1) Be able to directly link to a User >> 2) Be able to directly link to a Blog >> 3) Be able to query and get all the Blogs for a User ordered by time created >> descending (new blogs first) >> 4) Be able to query and get all the Comments for each Blog ordered by time >> created ascending (old comments first) >> 5) Be able to link different Users to each other, as a network. 
>> 6) Have a well distributed hash so we don't end up with "hot" nodes, while >> the rest of the nodes are idle >> 7) It would be nice to show a User how many Blogs they have or how many >> comments are on a Blog, without iterating thru the whole dataset. >> NEW: 8) Be able to query for the most recently added Blogs. For example, >> Blogs added today, this week, this month, etc. >> >> The target Cassandra version is 0.8 to use the Secondary Indexes. The goal >> is to be very efficient, so no Text keys. We were thinking of using Time >> Based 64bit ids, using Snowflake. >> >> Thanks, >> >> Drew >
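A rough in-memory picture of the starter schema Aaron sketches above; dicts stand in for column families, sorted() stands in for the column comparator, and every name here is illustrative:

```python
# Aaron's starter schema, modeled with plain dicts:
# - blog entries keyed by timestamp
# - comments as columns named by a long timestamp (sorts old-first)
# - a per-user CF listing that user's blog keys
users = {"u1": {"name": "drew"}}
blog_entries = {}    # timestamp -> entry columns
blog_comments = {}   # blog key  -> {comment timestamp: comment text}
user_blogs = {}      # user id   -> {blog key: ""}

def post_blog(user_id, ts, title):
    blog_entries[ts] = {"title": title, "author": user_id}
    user_blogs.setdefault(user_id, {})[ts] = ""

def add_comment(blog_ts, comment_ts, text):
    blog_comments.setdefault(blog_ts, {})[comment_ts] = text

post_blog("u1", 1000, "hello")
add_comment(1000, 1001, "first!")
add_comment(1000, 1005, "second")

# Columns sorted by name give time-ordered comments (oldest first):
ordered = [blog_comments[1000][t] for t in sorted(blog_comments[1000])]
```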
Re: Any way to get different unique time UUIDs for the same time value?
Hi Ed, Cool, I guess we both read/interpreted his post differently and gave two valid answers ;) - Drew On Mar 30, 2011, at 5:40 PM, Ed Anuff wrote: > Hey Drew, I'm somewhat familiar with Snowflake, and it's certainly a > good option, but, my impression was that the main reason to use it is > because you find the 128-bits for a UUID overkill, not because it's > doing anything you can't do with UUID's. The difference in time > resolution between UUIDs and Snowflake ids is actually greater than > the size of the sequence value that Snowflake uses to differentiate > duplicated timestamps, so the easiest thing would be just to round to > milliseconds unless your goal was to save the extra 64 bits per UUID. > I was just over-reading into Roshan's question that he wanted the full > time resolution of a UUID and on top of that be able to have a number > of duplicate timestamps. > > On Wed, Mar 30, 2011 at 4:24 PM, Drew Kutcharian wrote: >> Hi Ed, >> >> There's no need to re-invent the wheel that's pretty much what Twitter >> Snowflake does. The way it works is it creates a 64 bit long id which is >> formatted as such >> >> time_bits : data_center_id : machine_id : sequence >> >> Where time_bits are the milliseconds since a custom epoch. >> >> So If you see, you would get ids that are unique and ordered by time up to >> 1ms (if two ids were created during the same millisecond, then the ordering >> is not preserved) >> >> - Drew >> >> >> On Mar 30, 2011, at 4:13 PM, Ed Anuff wrote: >> >>> If I understand the question, it's not that >>> UUIDGen.makeType1UUIDFromHost(InetAddress.getLocalHost()) is returning >>> duplicate UUID's. It should always be giving unique time-based uuids >>> and has checks to make sure it does. The question was whether it was >>> possible to get multiple unique time-based UUID's with the exact same >>> timestamp component, rather than avoiding duplicates in the timestamp >>> the way UUIDGen currently does. 
The answer to that is that you could >>> take a look at the code for the UUIDGen class and create your own >>> version that perhaps generated the clock sequence in a different way, >>> such as leaving a certain number of low order bits of the clock >>> sequence empty and then incrementing those when duplicate timestamps >>> were generated rather than incrementing the timestamp the way UUIDGen >>> currently does. >>> >>> On Wed, Mar 30, 2011 at 10:08 AM, Drew Kutcharian wrote: >>>> Hi Roshan, >>>> You probably want to look at Twitter's >>>> Snowflake: https://github.com/twitter/snowflake >>>> There's also another Java variant: https://github.com/earnstone/eid >>>> - Drew >>>> >>>> On Mar 30, 2011, at 6:08 AM, Roshan Dawrani wrote: >>>> >>>> Hi, >>>> Is there any way I can get multiple unique time UUIDs for the same >>>> timestamp >>>> value - I mean, the UUIDs that are same in their time (most significant >>>> bits), but differ in their least significant bits? >>>> The least significant bits added by >>>> me.prettyprint.cassandra.utils.TimeUUIDUtils seem to be a fixed value based >>>> on mac/ip address, which makes sure that I get the same UUID for a >>>> timestamp >>>> value, everytime I ask. >>>> I need the "(timestamp): " kind of columns that need to be >>>> sorted by time, and I wanted to use TimeUUID to use column sorting that >>>> comes out-of-the-box, but the problem is that I can get multiple values for >>>> the same timestamp. >>>> So, I am looking for some way where the time portion is same, but the other >>>> UUID half is different so that I can safely store "1 time UUID: 1 value". >>>> Any help there is appreciated. >>>> -- >>>> Roshan >>>> Blog: http://roshandawrani.wordpress.com/ >>>> Twitter: @roshandawrani >>>> Skype: roshandawrani >>>> >>>> >>>> >> >>
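Ed's suggestion (keep the timestamp fixed and vary the clock sequence) can be sketched with Python's standard uuid module; the field packing below follows the RFC 4122 layout and is illustrative, not the actual UUIDGen code:

```python
import uuid

def time_uuid(timestamp, clock_seq, node):
    # Split a 60-bit timestamp into the RFC 4122 time fields, and set
    # the version (1) and variant bits explicitly.
    time_low = timestamp & 0xFFFFFFFF
    time_mid = (timestamp >> 32) & 0xFFFF
    time_hi_version = ((timestamp >> 48) & 0x0FFF) | 0x1000  # version 1
    clock_seq_hi_variant = ((clock_seq >> 8) & 0x3F) | 0x80  # RFC 4122 variant
    clock_seq_low = clock_seq & 0xFF
    return uuid.UUID(fields=(time_low, time_mid, time_hi_version,
                             clock_seq_hi_variant, clock_seq_low, node))

ts = 0x1E0ABCDEF123456          # some fixed 60-bit timestamp
node = 0x112233445566           # a made-up 48-bit node id

u1 = time_uuid(ts, 0, node)
u2 = time_uuid(ts, 1, node)     # same timestamp, different clock sequence
```

Both UUIDs carry the identical time component, so under a TimeUUID comparator they sort adjacently, while the differing clock sequence keeps them distinct.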
Questions about Secondary Indexes
Hi Everyone, I just read Jonathan Ellis' great post on Secondary Indexes (http://www.datastax.com/dev/blog/whats-new-cassandra-07-secondary-indexes) and I was wondering where I can find a bit more info on them. I would like to know:

1) Are there any limitations besides the hash properties (no between queries)? Like size or memory, etc.?
2) Are they distributed? If so, how does that work? How are they stored on the nodes?
3) When you write a new row, when/how does the index get updated? What I would like to know is the atomicity of the operation: is the "index write" part of the "row write"?
4) Is there a difference between creating a secondary index vs. creating an "index" CF manually, such as "users_by_country"?

Thanks, Drew
Re: Any way to get different unique time UUIDs for the same time value?
Hi Ed, There's no need to re-invent the wheel; that's pretty much what Twitter Snowflake does. The way it works is it creates a 64-bit long id, formatted as: time_bits : data_center_id : machine_id : sequence, where time_bits are the milliseconds since a custom epoch. So, as you can see, you get ids that are unique and ordered by time down to 1 ms (if two ids are created during the same millisecond, their ordering is not preserved). - Drew On Mar 30, 2011, at 4:13 PM, Ed Anuff wrote: > If I understand the question, it's not that > UUIDGen.makeType1UUIDFromHost(InetAddress.getLocalHost()) is returning > duplicate UUID's. It should always be giving unique time-based uuids > and has checks to make sure it does. The question was whether it was > possible to get multiple unique time-based UUID's with the exact same > timestamp component, rather than avoiding duplicates in the timestamp > the way UUIDGen currently does. The answer to that is that you could > take a look at the code for the UUIDGen class and create your own > version that perhaps generated the clock sequence in a different way, > such as leaving a certain number of low order bits of the clock > sequence empty and then incrementing those when duplicate timestamps > were generated rather than incrementing the timestamp the way UUIDGen > currently does. > > On Wed, Mar 30, 2011 at 10:08 AM, Drew Kutcharian wrote: >> Hi Roshan, >> You probably want to look at Twitter's >> Snowflake: https://github.com/twitter/snowflake >> There's also another Java variant: https://github.com/earnstone/eid >> - Drew >> >> On Mar 30, 2011, at 6:08 AM, Roshan Dawrani wrote: >> >> Hi, >> Is there any way I can get multiple unique time UUIDs for the same timestamp >> value - I mean, the UUIDs that are same in their time (most significant >> bits), but differ in their least significant bits? 
>> The least significant bits added by >> me.prettyprint.cassandra.utils.TimeUUIDUtils seem to be a fixed value based >> on mac/ip address, which makes sure that I get the same UUID for a timestamp >> value, everytime I ask. >> I need the "(timestamp): " kind of columns that need to be >> sorted by time, and I wanted to use TimeUUID to use column sorting that >> comes out-of-the-box, but the problem is that I can get multiple values for >> the same timestamp. >> So, I am looking for some way where the time portion is same, but the other >> UUID half is different so that I can safely store "1 time UUID: 1 value". >> Any help there is appreciated. >> -- >> Roshan >> Blog: http://roshandawrani.wordpress.com/ >> Twitter: @roshandawrani >> Skype: roshandawrani >> >> >>
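The layout Drew describes can be sketched as plain bit packing; the 41/5/5/12 bit split below matches Twitter's published Snowflake layout, but treat the widths as an assumption and check the Snowflake source for the authoritative split:

```python
# Sketch of the Snowflake-style id layout:
#   time_bits : data_center_id : machine_id : sequence, packed into 64 bits.
DATACENTER_BITS = 5
MACHINE_BITS = 5
SEQUENCE_BITS = 12

def make_id(millis_since_epoch, datacenter_id, machine_id, sequence):
    # Timestamp occupies the high bits, so numeric order is time order.
    return ((millis_since_epoch << (DATACENTER_BITS + MACHINE_BITS + SEQUENCE_BITS))
            | (datacenter_id << (MACHINE_BITS + SEQUENCE_BITS))
            | (machine_id << SEQUENCE_BITS)
            | sequence)

# Ids order by time first; within the same millisecond the sequence
# breaks ties on one node, so cross-node ordering is only good to 1 ms.
a = make_id(1000, 1, 1, 0)
b = make_id(1000, 1, 1, 1)   # same millisecond, next sequence number
c = make_id(1001, 0, 0, 0)   # later millisecond, any node
```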
Looking for an independent consultant
Hi Everyone, Anyone on this list interested in a remote very flexible contract gig? If yes, please contact me directly. Thanks, Drew Kutcharian Chief Technology Officer Venarc Inc. www.venarc.com Phone: 818-524-2500
Re: Any way to get different unique time UUIDs for the same time value?
Hi Roshan, You probably want to look at Twitter's Snowflake: https://github.com/twitter/snowflake There's also another Java variant: https://github.com/earnstone/eid - Drew On Mar 30, 2011, at 6:08 AM, Roshan Dawrani wrote: > Hi, > > Is there any way I can get multiple unique time UUIDs for the same timestamp > value - I mean, the UUIDs that are same in their time (most significant > bits), but differ in their least significant bits? > > The least significant bits added by > me.prettyprint.cassandra.utils.TimeUUIDUtils seem to be a fixed value based > on mac/ip address, which makes sure that I get the same UUID for a timestamp > value, everytime I ask. > > I need the "(timestamp): " kind of columns that need to be sorted > by time, and I wanted to use TimeUUID to use column sorting that comes > out-of-the-box, but the problem is that I can get multiple values for the > same timestamp. > > So, I am looking for some way where the time portion is same, but the other > UUID half is different so that I can safely store "1 time UUID: 1 value". > > Any help there is appreciated. > > -- > Roshan > Blog: http://roshandawrani.wordpress.com/ > Twitter: @roshandawrani > Skype: roshandawrani >
Revised: Data Modeling advice for Cassandra 0.8 (added #8)
I'm pretty new to Cassandra and I would like to get your advice on modeling. The object model of the project I'm working on will be pretty close to Blogger, Tumblr, etc. (or any other blogging website), where you have Users, each of which can have many Blogs, and each Blog can have many comments. How would you model this efficiently, considering:

1) Be able to directly link to a User
2) Be able to directly link to a Blog
3) Be able to query and get all the Blogs for a User ordered by time created, descending (new blogs first)
4) Be able to query and get all the Comments for each Blog ordered by time created, ascending (old comments first)
5) Be able to link different Users to each other, as a network.
6) Have a well-distributed hash so we don't end up with "hot" nodes while the rest of the nodes are idle
7) It would be nice to show a User how many Blogs they have or how many comments are on a Blog, without iterating through the whole dataset.
NEW: 8) Be able to query for the most recently added Blogs. For example, Blogs added today, this week, this month, etc.

The target Cassandra version is 0.8, to use the Secondary Indexes. The goal is to be very efficient, so no Text keys. We were thinking of using time-based 64-bit ids, using Snowflake.

Thanks, Drew
Data Modeling advice for Cassandra 0.8
I'm pretty new to Cassandra and I would like to get your advice on modeling. The object model of the project I'm working on will be pretty close to Blogger, Tumblr, etc. (or any other blogging website), where you have Users, each of which can have many Blogs, and each Blog can have many comments. How would you model this efficiently, considering:

1) Be able to directly link to a User
2) Be able to directly link to a Blog
3) Be able to query and get all the Blogs for a User ordered by time created, descending (new blogs first)
4) Be able to query and get all the Comments for each Blog ordered by time created, ascending (old comments first)
5) Be able to link different Users to each other, as a network.
6) Have a well-distributed hash so we don't end up with "hot" nodes while the rest of the nodes are idle
7) It would be nice to show a User how many Blogs they have or how many comments are on a Blog, without iterating through the whole dataset.

The target Cassandra version is 0.8, to use the Secondary Indexes. The goal is to be very efficient, so no Text keys. We were thinking of using time-based 64-bit ids, using Snowflake.

Thanks, Drew