Re: Site Not Surviving a Single Cassandra Node Crash

2011-04-09 Thread Joe Stump
Did the Cassandra cluster go down or did you start getting failures from the client when it routed queries to the downed node? The key in the client is to keep working around the ring if the initial node is down. --Joe On Apr 9, 2011, at 12:52 PM, Vram Kouramajian wrote: We have a 5
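
A minimal Python sketch of the "keep working around the ring" idea from the reply above. It is not taken from any particular Cassandra client: `query_around_ring`, `AllNodesDown`, and the `run_query` callable are hypothetical names used only for illustration, and the sketch assumes a downed node surfaces as a `ConnectionError`.

```python
from typing import Callable, Optional, Sequence, TypeVar

T = TypeVar("T")


class AllNodesDown(Exception):
    """Raised when every node in the list refused the request (hypothetical)."""


def query_around_ring(
    nodes: Sequence[str],
    start: int,
    run_query: Callable[[str], T],
) -> T:
    """Try the node at index `start`, then each following node, wrapping around.

    `run_query` is an assumed callable that sends the request to one node and
    raises ConnectionError if that node is unreachable.
    """
    if not nodes:
        raise AllNodesDown("no nodes configured")

    last_error: Optional[Exception] = None
    for offset in range(len(nodes)):
        # Walk the list like a ring: start, start + 1, ..., wrapping to 0.
        node = nodes[(start + offset) % len(nodes)]
        try:
            return run_query(node)
        except ConnectionError as exc:
            # This node is down; remember the error and move to the next one.
            last_error = exc

    raise AllNodesDown(f"all {len(nodes)} nodes failed") from last_error
```

With a wrapper like this, a single crashed node costs one failed connection attempt per request rather than an error returned to the user.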

Re: Pyramid Organization of Data

2011-04-08 Thread Joe Stump
A few lines of Java in a partitioning or rack-aware strategy might be able to achieve this. --Joe -- Typed with big fingers on a small keyboard. On Apr 8, 2011, at 13:17, Patrick Julien pjul...@gmail.com wrote: We have a pilot project running where all our historical data worldwide would

Re: Secondary Indexes

2011-04-03 Thread Joe Stump
On Apr 3, 2011, at 2:22 PM, Drew Kutcharian wrote: Thanks Tyler. Can you update the wiki with these answers so they are stored there for others to see too? Dude, it's a wiki.

Re: cassandra as session store

2011-02-01 Thread Joe Stump
FWIW we used Memcached for session data at Digg without any major issues. The one thing we did end up doing to reduce the LRU on sessions was to modify the slab size and put sessions in their own Memcached cluster. Probably not an issue for you though. +1 on Memcached. On Feb 1, 2011, at

Re: Cassandra on AWS across Regions

2010-09-01 Thread Joe Stump
On Sep 1, 2010, at 1:42 PM, Peter Fales wrote: I probably should have made it clear that I wasn't proposing this as an official patch (as you point out, it's not general enough for production use). I'm just looking for feedback on the concept (thanks!) and thought it might possibly be

Re: Cassandra HAProxy

2010-08-28 Thread Joe Stump
On Aug 28, 2010, at 12:29 PM, Mark wrote: Also, what would be a good way of monitoring the health of the cluster? We use Ganglia. I believe failover is usually built into clients. Not sure why using HAProxy or LVS wouldn't be a good option though. I used to use it with MySQL slaves with much

Re: RackAwareStrategy vs RackUnAwareStrategy on AWS EC2 cloud

2010-07-09 Thread Joe Stump
On Jul 9, 2010, at 1:16 PM, maneela a wrote: Is there any way to mark cassandra node to keep it as just for replication purpose and not to be as Primary for any data range in the ring? I believe there is. This is what we're doing, but we do all of our writes via a queue. Derek or Mike from

Re: Digg 4 Preview on TWiT

2010-07-06 Thread Joe Stump
On Jul 6, 2010, at 6:18 PM, David Strauss wrote: Then I'll tell my friend at Facebook to stick to topics he's qualified to speak about. :-) You might want to clarify that this advice applies to all topics of discussion and not just Facebook related ones. ;) --Joe

Re: Cassandra on AWS across Regions

2010-06-29 Thread Joe Stump
On Jun 29, 2010, at 12:44 PM, Anthony Molinaro wrote: Maybe you need to modify the security groups to allow the ports to be accessible from one to the other? A likely better solution would be to look into the VPNCubed product which was built specifically for this purpose. We're in the middle

Re: Cassandra on AWS across Regions

2010-06-29 Thread Joe Stump
On Jun 29, 2010, at 2:56 PM, Lenin Gali wrote: Thanks Joe, I was hoping to hear from you. Can you pass me the SA contact at AWS we would love to look in to it. Just contact your account representative. They'll get you hooked up. They have multiple SAs that help out account representatives.

Re: django or pylons

2010-06-20 Thread Joe Stump
A lot of the magic that Django brings to the table is derived from the ORM. If you're skipping that then Pylons likely makes more sense. --Joe On Jun 20, 2010, at 5:08 PM, Charles Woerner charleswoer...@gmail.com wrote: I recently looked into this and came to the same conclusion, but I'm not

Re: Cassandra data loss

2010-05-24 Thread Joe Stump
This is largely FUD. Cassandra lets you choose how consistent you want writes to be. The more consistency you choose, the slower the writes, but it's very unlikely with high consistency that you'll lose data. That being said, if you write with a consistency level of 0 then, yes, you could
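
A rough Python sketch of the trade-off described above, assuming the write consistency levels of Cassandra releases from that period (ZERO, ONE, QUORUM, ALL). The function `acks_required` is a hypothetical helper for illustration, not part of any client library; it only computes how many replicas must acknowledge a write before the client is told it succeeded.

```python
def acks_required(level: str, replication_factor: int) -> int:
    """Return how many replica acknowledgements a write waits for.

    Assumed semantics:
      ZERO   - fire and forget, no acknowledgement awaited
      ONE    - one replica must acknowledge
      QUORUM - a majority of replicas must acknowledge
      ALL    - every replica must acknowledge
    """
    if replication_factor < 1:
        raise ValueError("replication_factor must be at least 1")

    levels = {
        "ZERO": 0,
        "ONE": 1,
        "QUORUM": replication_factor // 2 + 1,
        "ALL": replication_factor,
    }
    try:
        return levels[level.upper()]
    except KeyError:
        raise ValueError(f"unknown consistency level: {level}") from None
```

For example, with a replication factor of 3, a QUORUM write waits for 2 replicas, while a ZERO write waits for none, which is why a crash at the wrong moment can lose it.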

Re: is cassandra really a 'handsoff' solution once setup?

2010-05-14 Thread Joe Stump
On May 14, 2010, at 12:46 PM, S Ahmed wrote: For those with live apps, how has it been? (fb/digg/twitter people, would love your experiences) I didn't say it didn't require *any* administration. Just that it required *minimal* administration. I'd say we spend about a quarter of an engineer

Re: The Difference Between Cassandra and HBase

2010-04-25 Thread Joe Stump
On Apr 25, 2010, at 11:40 AM, Mark Robson wrote: For me an important difference is that Cassandra is operationally much more straightforward - there is only one type of node, and it is fully redundant (depending what consistency level you're using). This seems to be an advantage in

Re: The Difference Between Cassandra and HBase

2010-04-25 Thread Joe Stump
On Apr 25, 2010, at 5:18 PM, Eric Hauser wrote: Out of curiosity, are you planning on copying the data you store in HBase/Hive into separate Hadoop cluster in a different data center or backing up HDFS in some other manner? Redundancy isn't an issue within the cluster; it's more a

Re: Just to be clear, cassandra is web framework agnostic b/c of Thrift?

2010-04-18 Thread Joe Stump
On Apr 18, 2010, at 5:33 PM, S Ahmed wrote: Obviously if you run asp.net on windows, it is probably a VERY good idea to be running cassandra on a linux box. Actually, I'm not sure this is true. A few people have found Windows performs fairly well with Cassandra, if I recall correctly.

Re: Deployment on AWS

2010-04-03 Thread Joe Stump
On Apr 3, 2010, at 1:53 PM, Benjamin Black wrote: What specific features are you looking for to operate on EC2? It seemed people weren't looking for features, but tools to help with the management. The two things we've created that people might be interested in are: 1. An EC2-specific

Re: LazyBoy question

2010-04-03 Thread Joe Stump
On Apr 3, 2010, at 2:00 PM, Jonathan Ellis wrote: I don't think Lazyboy exposes range queries [that is, iterating rows whose keys you do not know ahead of time]. Pycassa does, though. I think ieure's fork has itertools support that will let you do crazy iteration stuff with it. I haven't

Re: Deployment on AWS

2010-04-03 Thread Joe Stump
On Apr 3, 2010, at 2:54 PM, Benjamin Black wrote: I'm pretty familiar with EC2, hence the question. I don't believe any patches are required to do these things. Regardless, as I noted in that ticket, you definitely do NOT need AWS credentials to determine your availability zone. It is

Re: How reliable is cassandra?

2010-03-29 Thread Joe Stump
On Mar 29, 2010, at 12:40 PM, Eric Hauser wrote: BTW, does anyone from Digg patrol the list? I'm really interested in some additional the implementation of atomic counters with ZooKeeper. I know at least three Diggers patrol the list and one of them is a committer to Cassandra. Last I

Re: Digg's data model

2010-03-20 Thread Joe Stump
On Mar 20, 2010, at 2:53 AM, Lenin Gali wrote: 1. Eventual consistency: Given a volume of 5K writes / sec and roughly 1500 writes are Updates per sec while the rest are inserts, what kind of latency can be expected in eventual consistency? Depending on the size of the cluster you're not