schema design question

2010-03-08 Thread Matteo Caprari
Hi. We have a collection operation that generates documents like this: item: { "id": "", "title": "...", "liked_by": ["user_2", "user_3", ...] } The liked_by list contains on average 100 unique users. Users may also appear in other items. Our database contains a few million entries and is grow

Use Case scenario: Keeping a window of data + online analytics

2010-03-08 Thread Aníbal Rojas
Hello, Have been testing alternatives for MySQL / Postgres based app with the following characteristics: - A high rate of inserts. Heavy bursts are expected. - A high rate of deletes to remove old data. We keep a window, as old data is not relevant. - Online analytics based on _ag

Re: Use Case scenario: Keeping a window of data + online analytics

2010-03-08 Thread Daniel Lundin
A few comments on building a time-series store in Cassandra... Using the timestamp dimension of columns, "reusing" columns, could prove quite useful. This allows simple use of batch_mutate deletes (new in 0.6) to purge old data outside the active time window. Otherwise, performance wise, deletes
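As a rough illustration of the "window of data" idea discussed in this thread, here is a minimal in-memory sketch that keeps entries ordered by timestamp and purges everything older than a cutoff in one range operation. It does not use Cassandra or its batch_mutate call; the class and method names (WindowPurgeSketch, record, purgeOlderThan) are hypothetical and exist only for this example.

```java
import java.util.TreeMap;

final class WindowPurgeSketch {
    // Entries keyed by timestamp (milliseconds); TreeMap keeps them in time order.
    private final TreeMap<Long, String> events = new TreeMap<>();

    void record(long timestampMillis, String value) {
        events.put(timestampMillis, value);
    }

    // Removes every entry strictly older than the cutoff and returns how many were dropped.
    int purgeOlderThan(long cutoffMillis) {
        var old = events.headMap(cutoffMillis, false); // live view of keys < cutoff
        int removed = old.size();
        old.clear(); // clearing the view removes the entries from the backing map
        return removed;
    }

    int size() {
        return events.size();
    }

    public static void main(String[] args) {
        WindowPurgeSketch window = new WindowPurgeSketch();
        window.record(100L, "a");
        window.record(200L, "b");
        window.record(300L, "c");
        System.out.println("removed: " + window.purgeOlderThan(200L));
        System.out.println("remaining: " + window.size());
    }
}
```

Because the data is ordered by time, the purge is a single range removal rather than a scan over individual entries.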

Re: Use Case scenario: Keeping a window of data + online analytics

2010-03-08 Thread Aníbal Rojas
Daniel, Thanks for your quick response. > Using the timestamp dimension of columns, "reusing" columns, could prove > quite useful. This allows simple use of batch_mutate deletes (new in > 0.6) to purge old data outside the active time window. Interesting, while draft modeling the app in Ca

Reason for not allowing null values for in Column

2010-03-08 Thread Erik Holstad
Hey! Been looking at the src and have a couple of questions: Why is it that null column values are not allowed? What is the reason for using a ConcurrentSkipListMap for columns_ in ColumnFamily compared to using the set version and use the comparator to sort on the name field in IColumn? For the

Re: Reason for not allowing null values for in Column

2010-03-08 Thread Jonathan Ellis
On Mon, Mar 8, 2010 at 11:07 AM, Erik Holstad wrote: > Why is it that null column values are not allowed? It's semantically unnecessary and potentially harmful at an implementation level. (Many java Map implementations can't distinguish between a null key and a key that is not present.) > What
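The Map ambiguity mentioned in this reply can be seen with plain JDK collections. The sketch below is not Cassandra code; NullValueDemo is a hypothetical class name. It shows that HashMap.get returns null both for a key mapped to null and for a key that is absent, and that ConcurrentSkipListMap (the map type asked about in this thread) rejects null values outright.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentSkipListMap;

final class NullValueDemo {
    // True when get() alone cannot tell a stored null apart from a missing key.
    static boolean getIsAmbiguous() {
        Map<String, byte[]> columns = new HashMap<>();
        columns.put("present", null); // HashMap permits null values
        boolean bothNull = columns.get("present") == null && columns.get("absent") == null;
        // Only containsKey() can tell the two cases apart.
        boolean distinguishable = columns.containsKey("present") && !columns.containsKey("absent");
        return bothNull && distinguishable;
    }

    // True when ConcurrentSkipListMap refuses a null value with a NullPointerException.
    static boolean skipListRejectsNull() {
        try {
            new ConcurrentSkipListMap<String, byte[]>().put("name", null);
            return false;
        } catch (NullPointerException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("get() ambiguous: " + getIsAmbiguous());
        System.out.println("skip list rejects null: " + skipListRejectsNull());
    }
}
```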

Re: Reason for not allowing null values for in Column

2010-03-08 Thread Erik Holstad
On Mon, Mar 8, 2010 at 9:10 AM, Jonathan Ellis wrote: > On Mon, Mar 8, 2010 at 11:07 AM, Erik Holstad > wrote: > > Why is it that null column values are not allowed? > > It's semantically unnecessary and potentially harmful at an > implementation level. (Many java Map implementations can't > di

Re: Reason for not allowing null values for in Column

2010-03-08 Thread Jonathan Ellis
On Mon, Mar 8, 2010 at 11:22 AM, Erik Holstad wrote: > I was probably a little bit unclear here. I'm wondering about the two byte[] > in Column. > One for name and one for value. I was under the impression that the > skiplistmap > wraps the Columns, not that the name and the value are themselves i

Re: Reason for not allowing null values for in Column

2010-03-08 Thread Erik Holstad
On Mon, Mar 8, 2010 at 9:30 AM, Jonathan Ellis wrote: > On Mon, Mar 8, 2010 at 11:22 AM, Erik Holstad > wrote: > > I was probably a little bit unclear here. I'm wondering about the two > byte[] > > in Column. > > One for name and one for value. I was under the impression that the > > skiplistmap

Re: Reason for not allowing null values for in Column

2010-03-08 Thread Jonathan Ellis
On Mon, Mar 8, 2010 at 12:07 PM, Erik Holstad wrote: > So why is it again that the value field in the Column cannot be null if it > is not the > value field in the map, but just a part of the value field? Because without a compelling reason to allow nulls, the best policy is not to do so. > All

Re: Incr/Decr Counters in Cassandra

2010-03-08 Thread Jonathan Ellis
On Sat, Mar 6, 2010 at 4:59 PM, simon.reavely wrote: > Is there a place on the Cassandra wiki where the proposals/thinking on these > issues has been captured in one place? The wiki is a terrible place for proposals. Use the ML for those, and use JIRA when you start to actually generate code. h

Re: Reason for not allowing null values for in Column

2010-03-08 Thread Erik Holstad
On Mon, Mar 8, 2010 at 10:14 AM, Jonathan Ellis wrote: > On Mon, Mar 8, 2010 at 12:07 PM, Erik Holstad > wrote: > > So why is it again that the value field in the Column cannot be null if > it > > is not the > > value field in the map, but just a part of the value field? > > Because without a co

RE: Testing row cache feature in trunk: write should put record in cache

2010-03-08 Thread Daniel Kluesing
This is interesting for the use cases I'm looking at Cassandra for, so if that offer still stands I'll take you up on it. I took a crack at it in https://issues.apache.org/jira/browse/CASSANDRA-860 - also in large part to get my feet wet with the code. -----Original Message----- From: Jonathan

RE: Latest check-in to trunk/ is broken

2010-03-08 Thread Stu Hood
Run `ant clean` before building. A few files moved around. -----Original Message----- From: "Cool BSD" Sent: Monday, March 8, 2010 5:18pm To: "cassandra-user" Subject: Latest check-in to trunk/ is broken version info: $ svn info Path: . URL: https://svn.apache.org/repos/asf/incubator/cassandra/

DigestMismatchException

2010-03-08 Thread B. Todd Burruss
i am seeing a lot of these INFO level messages in cassandra server's logs: 2010-03-08 15:30:08,123 INFO [pool-1-thread-625] [StorageProxy.java:485] DigestMismatchException: Mismatch for key vmguest85__1349889195 (076e19c042e3756a619a8b3fcdd1b9f2 vs 6a2d12aebba5c51c1d9a60cd0ec556d9) anything to w

Re: DigestMismatchException

2010-03-08 Thread Jonathan Ellis
It means that you're doing a lot of reads that saw multiple versions of the answer, which depending on your workload may be normal. On Mon, Mar 8, 2010 at 5:31 PM, B. Todd Burruss wrote: > i am seeing a lot of these INFO level messages in cassandra server's logs: > > 2010-03-08 15:30:08,123 INFO

Re: DigestMismatchException

2010-03-08 Thread B. Todd Burruss
i'm doing quorum reads and quorum writes with N=3 and 4 node cluster. i am "updating" values in cassandra cluster at a fairly high rate. so does this mean that a read is obtaining its two values (because of "quorum") and one of them must have been from the "third" replica that may not have been u

Cassandra latency question

2010-03-08 Thread David Dabbs
Hello. I've been running the vPork load generator against two Cassandra nodes running in VMs. I'm running a trunk build with W=2 and R=1 and out-of-the-box JVM_OPTS which should be fine, or so I thought. Throughput is lower than I expected. Are my expectations out-of-line? Thanks, David Writi

Re: Cassandra latency question

2010-03-08 Thread Jonathan Ellis
something is screwed up if writes are 10x slower than reads On Mon, Mar 8, 2010 at 5:52 PM, David Dabbs wrote: > > Hello. I've been running the vPork load generator against two Cassandra > nodes running in VMs. > I'm running a trunk build with W=2 and R=1 and out-of-the-box JVM_OPTS which > shoul

Re: schema design question

2010-03-08 Thread Jonathan Ellis
On Mon, Mar 8, 2010 at 6:18 AM, Matteo Caprari wrote: > The 'key' queries are: These map straightforwardly to one CF per query. > - list all the items a user liked row key is user id, columns names are timeuuid of when the like-ing occurred, column value is either item id, or a supercolumn cont
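The first column family suggested here (row key is the user id, column names ordered by when the like occurred, column value is the item id) can be sketched in memory as a map of sorted maps. This is only an illustration, not Cassandra client code: LikesByUserSketch is a hypothetical name, and the zero-padded "timestamp-sequence" string is an assumed stand-in for a real TimeUUID, chosen so that two likes in the same millisecond still get distinct, time-ordered column names.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class LikesByUserSketch {
    // Row key (user id) -> columns sorted by name (time-ordered name -> item id).
    private final Map<String, TreeMap<String, String>> rows = new HashMap<>();
    private long sequence = 0; // tiebreaker for likes that share a timestamp

    void like(String userId, long timestampMillis, String itemId) {
        // Zero padding makes lexicographic order match numeric (time) order.
        String columnName = String.format("%013d-%06d", timestampMillis, sequence++);
        rows.computeIfAbsent(userId, k -> new TreeMap<>()).put(columnName, itemId);
    }

    // "List all the items a user liked": one row lookup, columns already in time order.
    List<String> itemsLikedBy(String userId) {
        TreeMap<String, String> row = rows.get(userId);
        return row == null ? new ArrayList<>() : new ArrayList<>(row.values());
    }

    public static void main(String[] args) {
        LikesByUserSketch cf = new LikesByUserSketch();
        cf.like("user_2", 2000L, "item_b");
        cf.like("user_2", 1000L, "item_a");
        cf.like("user_2", 2000L, "item_c");
        System.out.println(cf.itemsLikedBy("user_2"));
    }
}
```

The point of the layout is that the query is answered by reading a single row, with no filtering or sorting at read time.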

Re: DigestMismatchException

2010-03-08 Thread Jonathan Ellis
yes. On Mon, Mar 8, 2010 at 5:40 PM, B. Todd Burruss wrote: > i'm doing quorum reads and quorum writes with N=3 and 4 node cluster.  i am > "updating" values in cassandra cluster at a fairly high rate. > > so does this mean that a read is obtaining its two values (because of > "quorum") and one o
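The quorum arithmetic behind this exchange can be sketched as follows. With N=3, a quorum is 2, so a quorum read and a quorum write always share at least one replica (R + W > N), while the remaining replica may briefly hold an older value and produce a different digest. This is a sketch under stated assumptions, not Cassandra's implementation: QuorumDigestSketch and its methods are hypothetical, and MD5 is assumed for the digest only because the hashes in the log message are 32 hex characters long.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class QuorumDigestSketch {
    // Majority of the replicas for a given replication factor.
    static int quorum(int replicationFactor) {
        return replicationFactor / 2 + 1;
    }

    // Read and write replica sets must intersect when R + W > N.
    static boolean overlapGuaranteed(int n, int r, int w) {
        return r + w > n;
    }

    // Hex digest of a value (MD5 is an assumption made for this example).
    static String md5Hex(String value) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest(value.getBytes(StandardCharsets.UTF_8))) {
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Two replicas agree only if their digests are identical.
    static boolean digestsMatch(String replicaA, String replicaB) {
        return md5Hex(replicaA).equals(md5Hex(replicaB));
    }

    public static void main(String[] args) {
        int n = 3;
        int q = quorum(n);
        System.out.println("quorum for N=3: " + q);
        System.out.println("overlap guaranteed: " + overlapGuaranteed(n, q, q));
        // A replica that has not yet received the latest write yields a different digest.
        System.out.println("digests match: " + digestsMatch("value-v2", "value-v1"));
    }
}
```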

Re: schema design question

2010-03-08 Thread Keith Thornhill
jonathan, wouldn't using Long values as the column names for the 3rd CF cause potential conflicts if 2 users liked the same # of items? (only saving one user for any given value) was thinking about this same problem (sorted lists of top N user activity) and thought that was a roadblock for that d
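The conflict raised here is easy to reproduce with a sorted map: if the column name is just the like count, two users with the same count land on the same column and one overwrites the other. The sketch below shows that collision, then one possible way around it, a composite column name of count plus user id. The composite name is an assumption made for illustration, not something proposed in this message, and CountColumnCollision is a hypothetical class name.

```java
import java.util.TreeMap;

final class CountColumnCollision {
    // Column name = like count only: the second put replaces the first.
    static int columnsWithLongNames() {
        TreeMap<Long, String> row = new TreeMap<>();
        row.put(5L, "user_2");
        row.put(5L, "user_3"); // same name, so user_2 is lost
        return row.size();
    }

    // Column name = zero-padded count + user id: both users survive, still sorted by count.
    static int columnsWithCompositeNames() {
        TreeMap<String, String> row = new TreeMap<>();
        row.put(String.format("%019d:%s", 5L, "user_2"), "");
        row.put(String.format("%019d:%s", 5L, "user_3"), "");
        return row.size();
    }

    public static void main(String[] args) {
        System.out.println("long names: " + columnsWithLongNames());
        System.out.println("composite names: " + columnsWithCompositeNames());
    }
}
```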