Re: Programmatically allow only one out of two types of rows in a CF to enter the CACHE

2011-10-30 Thread David Jeske
If your summary data is frequently accessed, you will probably be best off storing the two sets of data separately (either in separate column families or with different key-prefixes). This will give you the greatest cache-locality for your summary data, which you say is popular. If your summary

ByteBuffer as an initial serializer to read columns with mixed datatypes ?

2011-10-30 Thread Ertio Lew
I have a mix of byte[] and Integer column names/values within a CF's rows. Should ByteBuffer be my initial choice of serializer when making the read query to the database for the mixed datatypes, and should I then retrieve the byte[] or Integer from the ByteBuffer using the ByteBuffer API's getInt()
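
A minimal sketch of the approach the question describes, using only the standard java.nio.ByteBuffer API: read every value as a raw ByteBuffer, then decode it per column. The class and method names (MixedColumnDecoder, readInt, readBytes) are hypothetical and not part of any Cassandra client. It assumes integers were written as 4-byte big-endian values and that the caller already knows which columns hold integers, since the raw bytes carry no type tag.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

public class MixedColumnDecoder {
    // Decode a 4-byte big-endian integer. Working on a duplicate leaves
    // the caller's buffer position and limit untouched.
    public static int readInt(ByteBuffer raw) {
        return raw.duplicate().getInt();
    }

    // Copy the remaining bytes of the buffer into a fresh byte[].
    public static byte[] readBytes(ByteBuffer raw) {
        ByteBuffer copy = raw.duplicate();
        byte[] out = new byte[copy.remaining()];
        copy.get(out);
        return out;
    }

    public static void main(String[] args) {
        // Stand-ins for values returned by a read query (hypothetical data).
        ByteBuffer intValue = ByteBuffer.allocate(4).putInt(42);
        intValue.flip();
        ByteBuffer blobValue = ByteBuffer.wrap(new byte[] {1, 2, 3});

        System.out.println(readInt(intValue));
        System.out.println(Arrays.toString(readBytes(blobValue)));
    }
}
```

Decoding through duplicate() matters when the same buffer is read more than once: calling getInt() directly on the original would advance its position, and a second read would fail or return different bytes.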

Re: Cassandra cluster HW spec (commit log directory vs data file directory)

2011-10-30 Thread Alexandru Dan Sicoe
Hi Chris, Thanks for your post. I can see you guys handle extremely large amounts of data compared to my system. Yes I will own the racks and the machines but the problem is I am limited by actual physical space in our data center (believe it or not) and also the budget. It would be hard for me

What does a cluster throttled by the network look like ?

2011-10-30 Thread Philippe
Dear all, I'm working with a 12-node, RF=3 cluster on low-end hardware (Core i5 with 16GB of RAM and SATA disks). I'm using a BOP and each node has a load between 50GB and 100GB (yes, I apparently did not set my tokens right... I'll fix that later). I'm hitting the cluster with a little over 100

Re: Cassandra cluster HW spec (commit log directory vs data file directory)

2011-10-30 Thread Radim Kolar
On 30.10.2011 23:34, Sorin Julean wrote: Hey Chris, Thanks for sharing all the info. I have few questions: 1. What are you doing with so much memory :) ? Cassandra eats memory like there is no tomorrow on large databases. It keeps some structures in memory which depend on database

Re: What does a cluster throttled by the network look like ?

2011-10-30 Thread David Jeske
You are answering your own question here. If you are running at 80% of network bandwidth, you are saturating your network. AFAIK, most distributed databases run on gigabit, not 100 Mbit. I recommend you upgrade your switch (and NICs if necessary). Gigabit is insanely cheap now. In the

Very slow writes in Cassandra

2011-10-30 Thread Evgeny
Hello Cassandra users, I'm a newbie in NoSQL and Cassandra in particular. At the moment I'm doing some benchmarking with Cassandra and experiencing very slow write throughput. It is said that Cassandra can perform hundreds of thousands of inserts per second; however, I'm not observing this: 1) when I

Re: Cassandra cluster HW spec (commit log directory vs data file directory)

2011-10-30 Thread Chris Goffinet
On Sun, Oct 30, 2011 at 3:34 PM, Sorin Julean sorin.jul...@gmail.com wrote: Hey Chris, Thanks for sharing all the info. I have few questions: 1. What are you doing with so much memory :) ? How much of it do you allocate for heap ? Max heap is 12GB. We use the rest for cache. We run

Re: Cassandra cluster HW spec (commit log directory vs data file directory)

2011-10-30 Thread Mohit Anchlia
On Sun, Oct 30, 2011 at 6:53 PM, Chris Goffinet c...@chrisgoffinet.com wrote: On Sun, Oct 30, 2011 at 3:34 PM, Sorin Julean sorin.jul...@gmail.com wrote: Hey Chris, Thanks for sharing all the info. I have few questions: 1. What are you doing with so much memory :) ? How much of it do

Re: Cassandra cluster HW spec (commit log directory vs data file directory)

2011-10-30 Thread Chris Goffinet
No. We built a pluggable cache provider for memcache. On Sun, Oct 30, 2011 at 7:31 PM, Mohit Anchlia mohitanch...@gmail.com wrote: On Sun, Oct 30, 2011 at 6:53 PM, Chris Goffinet c...@chrisgoffinet.com wrote: On Sun, Oct 30, 2011 at 3:34 PM, Sorin Julean sorin.jul...@gmail.com wrote: