Re: if cassandra isn't ideal for keep track of counts, how does digg count diggs?
On Sun, Apr 11, 2010 at 3:30 AM, Mark Robson wrote:
> Can we not implement counts by just storing all the deltas in a row, and
> then summing them all up to achieve a count?
>
> If a row ends up with too many deltas, a reader could just summarise the
> deltas occasionally into a single value (in a way which avoids race
> conditions, of course).

How do you avoid the race condition? Don't you need a lock?

Paul Prescod
Ayogo, Inc.
Re: if cassandra isn't ideal for keep track of counts, how does digg count diggs?
Can we not implement counts by just storing all the deltas in a row, and then summing them all up to achieve a count?

If a row ends up with too many deltas, a reader could just summarise the deltas occasionally into a single value (in a way which avoids race conditions, of course).

So you'd map

key => { uniqueid: delta1, uniqueid: delta2 }

Every column in Cassandra also has a timestamp, so your app can decide, when it does a read, which deltas to summarise.

Mark
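The delta-row idea above can be sketched as a small in-memory simulation. This is only an illustration, not code against a real Cassandra client: the `rows` dictionary stands in for a column family, and the names `add_delta`, `read_count` and `summarise` are hypothetical. The sketch also does not solve the race condition raised in this thread; it only shows where the problem sits.

```python
import time
import uuid
from collections import defaultdict

# Hypothetical in-memory stand-in for a row store:
# row key -> {unique column name: (delta, timestamp)}
rows = defaultdict(dict)


def add_delta(key, delta, now=None):
    """Record one increment as its own column, so writers never overwrite each other."""
    ts = time.time() if now is None else now
    rows[key][uuid.uuid4().hex] = (delta, ts)


def read_count(key):
    """The count is the sum of every delta currently stored in the row."""
    return sum(value for value, _ in rows[key].values())


def summarise(key, cutoff):
    """Fold all deltas older than `cutoff` into a single column.

    The summary is written before the old columns are deleted, so a reader
    that runs between the two steps would double count. A real system would
    have to make both steps appear atomic to readers and stop two
    summarisers from folding the same deltas; this sketch does neither.
    """
    row = rows[key]
    old = {name: col for name, col in row.items() if col[1] < cutoff}
    if len(old) < 2:
        return  # nothing worth folding
    total = sum(value for value, _ in old.values())
    newest = max(ts for _, ts in old.values())
    row[uuid.uuid4().hex] = (total, newest)
    for name in old:
        del row[name]


if __name__ == "__main__":
    add_delta("story:42", 1, now=100)
    add_delta("story:42", 1, now=101)
    add_delta("story:42", 1, now=200)
    summarise("story:42", cutoff=150)
    print(read_count("story:42"), len(rows["story:42"]))
```

Using a cutoff based on the column timestamp means recent deltas, which may still be arriving from other writers, are left alone; only older columns are folded.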
Re: if cassandra isn't ideal for keep track of counts, how does digg count diggs?
What will be the latency for the ZooKeeper-based atomic increment?

On Tue, Apr 6, 2010 at 8:22 PM, Chris Goffinet wrote:
> http://issues.apache.org/jira/browse/CASSANDRA-704
> http://issues.apache.org/jira/browse/CASSANDRA-721
>
> We have our own internal codebase of Cassandra at Digg. But we are using
> those above patches until we have the vector clock work cleaned up; that
> patch will also go to JIRA. Most likely the vector clock work will go into
> 0.7, but since we run 0.6 and built it for that version, we will share that
> patch too.
>
> -Chris
Re: if cassandra isn't ideal for keep track of counts, how does digg count diggs?
http://issues.apache.org/jira/browse/CASSANDRA-704
http://issues.apache.org/jira/browse/CASSANDRA-721

We have our own internal codebase of Cassandra at Digg. But we are using those above patches until we have the vector clock work cleaned up; that patch will also go to JIRA. Most likely the vector clock work will go into 0.7, but since we run 0.6 and built it for that version, we will share that patch too.

-Chris

On Apr 6, 2010, at 10:17 AM, S Ahmed wrote:
> Chris,
>
> When you say patch, does that mean for Cassandra or your own internal
> codebase?
>
> Sounds interesting, thanks!
Re: if cassandra isn't ideal for keep track of counts, how does digg count diggs?
Chris,

When you say patch, does that mean for Cassandra or your own internal codebase?

Sounds interesting, thanks!

On Tue, Apr 6, 2010 at 12:54 PM, Chris Goffinet wrote:
> That's not true. We have been using the ZooKeeper work we posted on JIRA.
> That's what we are using internally and have been for months. We are now
> just wrapping up our vector clocks + distributed counter patch so we can
> begin transitioning away from the ZooKeeper approach because there are
> problems with it long-term.
>
> -Chris
Re: if cassandra isn't ideal for keep track of counts, how does digg count diggs?
That's not true. We have been using the ZooKeeper work we posted on JIRA. That's what we are using internally and have been for months. We are now just wrapping up our vector clocks + distributed counter patch so we can begin transitioning away from the ZooKeeper approach because there are problems with it long-term.

-Chris

On Apr 6, 2010, at 9:50 AM, Ryan King wrote:
> They don't use Cassandra for it yet.
>
> -ryan
Re: if cassandra isn't ideal for keep track of counts, how does digg count diggs?
Is it just the counters they are using MySQL/PostgreSQL for, or also the list of stories? E.g. get me the top stories in category x.

On Tue, Apr 6, 2010 at 12:50 PM, Ryan King wrote:
> They don't use Cassandra for it yet.
>
> -ryan
Re: if cassandra isn't ideal for keep track of counts, how does digg count diggs?
They don't use Cassandra for it yet.

-ryan

On Tue, Apr 6, 2010 at 9:00 AM, S Ahmed wrote:
> From what I read in another thread, Cassandra isn't 'ideal' for keeping
> track of counts.
> For example, I would understand this to mean keeping track of which stories
> were dugg.
> If this is true, how would a site like Digg keep track of the 'dugg'
> counter?
> Also, I am assuming with eventual consistency the number *may* not be 100%
> accurate. If you wanted it to be accurate, would you just use the Quorum
> flag? (I believe quorum is to ensure all writes are written to disk)
if cassandra isn't ideal for keep track of counts, how does digg count diggs?
From what I read in another thread, Cassandra isn't 'ideal' for keeping track of counts.

For example, I would understand this to mean keeping track of which stories were dugg.

If this is true, how would a site like Digg keep track of the 'dugg' counter?

Also, I am assuming with eventual consistency the number *may* not be 100% accurate. If you wanted it to be accurate, would you just use the Quorum flag? (I believe quorum is to ensure all writes are written to disk)
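
A note on the quorum question above: as far as I understand it, the QUORUM consistency level is about how many replicas must acknowledge a read or a write, not about whether data has been flushed to disk. The small sketch below shows the arithmetic; the function names `quorum` and `overlaps` are made up for illustration, and `n`, `r` and `w` stand for the replication factor and the number of replicas consulted on reads and writes.

```python
def quorum(replication_factor):
    """Number of replicas that form a majority of the replica set."""
    return replication_factor // 2 + 1


def overlaps(n, r, w):
    """True when every read set of size r shares a replica with every write set of size w."""
    return r + w > n


if __name__ == "__main__":
    n = 3
    q = quorum(n)
    # Quorum reads plus quorum writes always overlap on at least one replica.
    print(q, overlaps(n, q, q))
    # Reading and writing a single replica each gives no such guarantee.
    print(overlaps(n, 1, 1))
```

Even with overlapping quorums, a counter built as "read the value, add one, write it back" can still lose updates when two clients read the same value at the same time, which is why the thread turns to ZooKeeper and to storing deltas.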