Re: concurrent reads

2010-07-13 Thread Peter Schuller
Has anyone experimented with different settings for concurrent reads?  I have set our servers to 4 ( 2 per processor core ).  I have noticed that occasionally, our pending reads will get backed up and our servers don't appear to be under too much load.  In fact, most of the load appears to be

Re: advice, is cassandra suitable for a multi-tanency vBulletin type application?

2010-07-13 Thread Michael Dürgner
Are your PVs mostly read or write? If they are mostly reads, I'd think you wouldn't need Cassandra-like storage, which is tuned towards writes. On 12.07.2010 at 23:40, Sandeep Kalidindi at PaGaLGuY.com wrote: well we were going down constantly with VB running on 3-4 dedicated servers due to

Re: Authentication

2010-07-13 Thread Michael Pearson
Hey Stu, I've been using 0.6.3's SimpleAuthenticator without a hitch (just had to figure out the daemon args -Dpasswd.properties=conf/passwd.properties -Daccess.properties=conf/access.properties) - why do you ask? -michael -- http://www.github.com/mjpearson

Re: advice, is cassandra suitable for a multi-tanency vBulletin type application?

2010-07-13 Thread Michael Dürgner
The point about joins being slow is true (we experience that ourselves), but I still wonder why you use Cassandra for the indices. Can't you just store them in MySQL instead? On 13.07.2010 at 08:26, Sandeep Kalidindi at PaGaLGuY.com wrote: @paul - cassandra is really good for storing

Re: advice, is cassandra suitable for a multi-tanency vBulletin type application?

2010-07-13 Thread Benjamin Black
We use Cassandra (multidimensional metrics) *and* redis (counters and alerts) *and* MySQL (supporting Rails). Right tool for each job. The idea that it is a good thing to cram everything into a single database (and data model), beaten into everyone by years of relational database marketing, is

Re: advice, is cassandra suitable for a multi-tanency vBulletin type application?

2010-07-13 Thread Benjamin Black
On Mon, Jul 12, 2010 at 11:35 PM, Michael Dürgner mich...@duergner.de wrote: The thing about slow on joins is true (we experience that ourselves) but still I wonder myself, why you use cassandra for the indices. Can't you just store them in MySQL although? ...and then shard and shard and

Re: advice, is cassandra suitable for a multi-tanency vBulletin type application?

2010-07-13 Thread Sandeep Kalidindi at PaGaLGuY.com
@michael - benjamin answered your question. The thing is, if you use MySQL just for indices, you are not using any of the benefits of the whole relational database engine (which is fine) but are still inheriting all its disadvantages. You can use mysql for storing indices and then write your own

Cassandra client - clock sync

2010-07-13 Thread Narendra Sharma
Hi, We have an application that uses Cassandra to store data. The application is deployed on multiple nodes that are part of an application cluster. We are at present using a single Cassandra node. We have noticed a few errors in the application, and our analysis revealed that the root cause was that the

Re: CassandraBulkLoader

2010-07-13 Thread Torsten Curdt
On Tue, Jul 13, 2010 at 04:35, Mubarak Seyed mubarak.se...@gmail.com wrote: Where can I find the documentation for BinaryMemTable (bmt_example in contrib) to use CassandraBulkLoader? What is the input to be supplied to CassandraBulkLoader? How to form the input data and what is the format of

Re: Is anyone using version 0.7 schema update API

2010-07-13 Thread GH
They are not complicated; it's more that they are not in the package they should be in. I assume the client package exposes the functionality of the server, and it does not have the ability to manage the tables in the database, which to me seems extremely limiting. When I did not see that

Re: advice, is cassandra suitable for a multi-tanency vBulletin type application?

2010-07-13 Thread Paul Prescod
On Mon, Jul 12, 2010 at 11:44 PM, Benjamin Black b...@b3k.us wrote: We use Cassandra (multidimensional metrics) *and* redis (counters and alerts) *and* MySQL (supporting Rails).  Right tool for each job.  The idea that it is a good thing to cram everything into a single database (and data

Re: concurrent reads

2010-07-13 Thread Schubert Zhang
For reads, the bottleneck is usually the disk. Use iostat to check the utilization of your disks. On Tue, Jul 13, 2010 at 2:07 PM, Peter Schuller peter.schul...@infidyne.com wrote: Has anyone experimented with different settings for concurrent reads? I have set our servers to 4 ( 2 per

Re: Iterate all keys - doing it as the faq fails for me :(

2010-07-13 Thread Thomas Heller
I'm not entirely sure, but I think you can only use get_range_slices with start_key/end_key on a cluster using OrderPreservingPartitioner. I don't know if that is intentional or buggy as Jonathan suggests, but I saw the same duplicates behaviour when trying to iterate all rows using RP and
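
The key-iteration pattern this thread is about can be sketched roughly as follows. This is only an illustration under stated assumptions, not the FAQ's code or a real client call: `fetch_page` is a hypothetical stand-in for a range query such as get_range_slices, assumed to return keys starting at an inclusive start key in whatever order the partitioner produces. Because the start is inclusive, each page after the first repeats the last key of the previous page, which is one plausible source of the duplicates mentioned above.

```python
from typing import Callable, Iterator, List

# Hypothetical stand-in for a range query such as get_range_slices:
# returns up to `count` row keys starting at `start_key` (inclusive),
# in whatever order the partitioner yields them.
FetchPage = Callable[[str, int], List[str]]


def iterate_all_keys(fetch_page: FetchPage, page_size: int = 100) -> Iterator[str]:
    """Yield every key once by paging with the last key seen as the next start."""
    if page_size < 2:
        # With an inclusive start key, a page of one would never advance.
        raise ValueError("page_size must be at least 2")
    start_key = ""  # assumption: an empty start key means "from the beginning"
    first_page = True
    while True:
        page = fetch_page(start_key, page_size)
        # Every page after the first begins with a key that was already yielded.
        rows = page if first_page else page[1:]
        for key in rows:
            yield key
        if len(page) < page_size:
            break  # a short page means the scan reached the end
        start_key = page[-1]
        first_page = False


def make_fake_cluster(keys: List[str]) -> FetchPage:
    """In-memory stub that mimics an inclusive-start range scan."""
    def fetch_page(start_key: str, count: int) -> List[str]:
        begin = keys.index(start_key) if start_key else 0
        return keys[begin:begin + count]
    return fetch_page
```

The stub keeps keys in an arbitrary fixed order to mimic hash ordering; against a real cluster the same loop would wrap whatever range call the client library exposes.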

Re: Question regarding consistency and deletion

2010-07-13 Thread Samuru Jackson
Thanks for the links. Actually it is pretty easy to catch those tombstoned keys on the client side. However, in certain applications it can generate some additional overhead on the network. I think it would be nice to have a forced garbage collection in the API. This would IMHO ease to write

RE: Iterate all keys - doing it as the faq fails for me :(

2010-07-13 Thread Per Olesen
I'm not entirely sure, but I think you can only use get_range_slices with start_key/end_key on a cluster using OrderPreservingPartitioner. I don't know if that is intentional or buggy as Jonathan suggests, but I saw the same duplicates behaviour when trying to iterate all rows using RP and

Re: Iterate all keys - doing it as the faq fails for me :(

2010-07-13 Thread Jonathan Ellis
On Tue, Jul 13, 2010 at 7:38 AM, Thomas Heller i...@zilence.net wrote: I'm not entirely sure but I think you can only use get_range_slices with start_key/end_key on a cluster using OrderPreservingPartitioner. Dont know if that is intentional or buggy like Jonathan suggest but I saw the same

Re: Cassandra client - clock sync

2010-07-13 Thread Jonathan Ellis
You should use ntp in daemon mode, not as a one-time fix. http://linux.die.net/man/1/ntpd On Tue, Jul 13, 2010 at 2:45 AM, Narendra Sharma narendra.sha...@gmail.com wrote: Hi, We have an application that uses Cassandra to store data. The application is deployed on multiple nodes that are part
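
Why client clocks matter here can be shown with a small simulation. It is a toy model, not Cassandra code: it assumes only that each write carries a timestamp supplied by the client and that the write with the highest timestamp wins. The class and function names are made up for illustration, and tie-breaking on equal timestamps is deliberately left out.

```python
from typing import Dict, Tuple


class LastWriteWinsStore:
    """Toy store that keeps, per key, the value with the highest timestamp."""

    def __init__(self) -> None:
        self._data: Dict[str, Tuple[int, str]] = {}

    def write(self, key: str, value: str, timestamp: int) -> None:
        current = self._data.get(key)
        # Simplification: only a strictly newer timestamp replaces the value.
        if current is None or timestamp > current[0]:
            self._data[key] = (timestamp, value)

    def read(self, key: str) -> str:
        return self._data[key][1]


def client_timestamp(true_time_us: int, skew_us: int) -> int:
    """Timestamp a client would send, given how far its clock is off."""
    return true_time_us + skew_us


def demo_skewed_write() -> str:
    store = LastWriteWinsStore()
    # Client A's clock runs 5 seconds fast; it writes first in real time.
    store.write("user:1", "old", client_timestamp(1_000_000, 5_000_000))
    # Client B's clock is accurate; it writes one second later in real time.
    store.write("user:1", "new", client_timestamp(2_000_000, 0))
    # The later write is discarded because its timestamp is lower.
    return store.read("user:1")
```

In this model the later write from the accurate client is silently lost, which is the kind of error that keeping every application node's clock continuously in sync is meant to prevent.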

Re: advice, is cassandra suitable for a multi-tanency vBulletin type application?

2010-07-13 Thread S Ahmed
The only issue I see (please correct me if I am wrong) is that you now have single points of failure in the system, i.e. redis etc. On Tue, Jul 13, 2010 at 3:33 AM, Sandeep Kalidindi at PaGaLGuY.com sandeep.kalidi...@pagalguy.com wrote: @michael - benjamin answered your

Re: concurrent reads

2010-07-13 Thread Lee Parker
The iostat numbers are rather low, as is cpu utilization. We have a couple of nightly jobs which do a lot of reads in a short amount of time. That is when the pending reads were climbing. I'm going to bump up the number and see how things run. Lee Parker On Tue, Jul 13, 2010 at 6:18 AM, Schubert

Re: advice, is cassandra suitable for a multi-tanency vBulletin type application?

2010-07-13 Thread Sandeep Kalidindi at PaGaLGuY.com
@Ahmed - we are trying to use Redis + gizzard, with gizzard responsible for sharding and maintaining replicas. Need to test it well before plunging into production though. Cheers, Deepu. On Tue, Jul 13, 2010 at 7:46 PM, S Ahmed sahmed1...@gmail.com wrote: The only issue I see (please

Consequences of having many columns

2010-07-13 Thread Kochheiser,Todd W - TOK-DITT-1
I recently ran across a blog posting with a comment from a Cassandra committer that indicated a performance penalty when having a large number of columns per row/key. Unfortunately I didn't bookmark the blog posting and now I can't find it. Regardless, since our current plan and design is to

Performance Issues

2010-07-13 Thread Samuru Jackson
Hi, I have set up a ring with a couple of servers and wanted to run some stress tests. Unfortunately, there is some kind of bottleneck on the client side. I'm using Hector and Cassandra 0.6.1. The subsequent profile results are based on a small Java program that sequentially inserts records,

Re: Consequences of having many columns

2010-07-13 Thread Mason Hale
Currently there is a limitation that each row must fit in memory (with some not insignificant overhead), thus having lots of columns per row can trigger out-of-memory errors. This limitation should be removed in a future release. Please see: -

Re: GCGraceSeconds per ColumnFamily/Keyspace

2010-07-13 Thread Todd Burruss
Yes -Original Message- From: Jonathan Ellis [jbel...@gmail.com] Received: 7/12/10 9:15 PM To: user@cassandra.apache.org [u...@cassandra.apache.org] Subject: Re: GCGraceSeconds per ColumnFamily/Keyspace Probably. Can you open a ticket? On Mon, Jul 12, 2010 at 10:41 PM, Todd Burruss

Re: advice, is cassandra suitable for a multi-tanency vBulletin type application?

2010-07-13 Thread Benjamin Black
On Tue, Jul 13, 2010 at 2:43 AM, Paul Prescod p...@prescod.net wrote: On Mon, Jul 12, 2010 at 11:44 PM, Benjamin Black b...@b3k.us wrote: We use Cassandra (multidimensional metrics) *and* redis (counters and alerts) *and* MySQL (supporting Rails).  Right tool for each job.  The idea that it is

Re: Question regarding consistency and deletion

2010-07-13 Thread Benjamin Black
On Tue, Jul 13, 2010 at 5:47 AM, Samuru Jackson samurujack...@googlemail.com wrote: Thanks for the links. Actually it is pretty easy to catch those tombstoned keys on the client side. However, in certain applications it can generate some additional overhead on the network. I think it would

Re: Is anyone using version 0.7 schema update API

2010-07-13 Thread Benjamin Black
I updated the Ruby client to 0.7, but I am not a Cassandra committer (and not much of a Java guy), so haven't touched the Java client. Is there more to it than regenerating Thrift bindings? On Tue, Jul 13, 2010 at 1:42 AM, GH gavan.h...@gmail.com wrote: They are not complicated, its more that

Consulting for Rollout + Cassandra

2010-07-13 Thread David Boxenhorn
We are planning a rollout of our online product ~September 1. Cassandra is a major part of our online system. We need some Cassandra consulting + general online consulting for determining our server configuration so it will support Cassandra under all possible scenarios. Does anybody have any

Re: Consulting for Rollout + Cassandra

2010-07-13 Thread Benjamin Black
http://riptano.com On Tue, Jul 13, 2010 at 9:14 AM, David Boxenhorn da...@lookin2.com wrote: We are planning a rollout of our online product ~September 1. Cassandra is a major part of our online system. We need some Cassandra consulting + general online consulting for determining our server

Re: Cassandra client - clock sync

2010-07-13 Thread Benjamin Black
On Tue, Jul 13, 2010 at 12:45 AM, Narendra Sharma narendra.sha...@gmail.com wrote: How are other Cassandra users handling the clock sync in production environment? By structuring access in the app such that there are never conflicts in the first place, for example by using UUIDs for row and
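
One way to read the suggestion above is that if every write targets a freshly generated key, two clients never update the same cell and timestamp ordering stops mattering for those writes. A minimal sketch using Python's standard `uuid` module follows; the function names are illustrative and not part of any Cassandra client.

```python
import uuid


def new_row_key() -> str:
    """Random (version 4) UUID as a hex string; collisions are practically impossible."""
    return uuid.uuid4().hex


def new_time_based_name() -> uuid.UUID:
    """Time-based (version 1) UUID, which embeds a timestamp and a node identifier."""
    return uuid.uuid1()


def generate_keys(n: int) -> list:
    """Generate n row keys, e.g. one per newly inserted record."""
    return [new_row_key() for _ in range(n)]
```

This only removes conflicts for data that is written once under its own key; values that are later overwritten in place still rely on timestamps.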

Re: Authentication

2010-07-13 Thread Ben Standefer
Are there any plans or talks of adding SSL/encryption support between Cassandra nodes? This would make setting up secure cross-country Cassandra clusters much easier, without having to set up a secure overlay network. MySQL supports this in its replication. -Ben On Mon, Jul 12, 2010 at 11:23

Re: Performance Issues

2010-07-13 Thread Ran Tavory
Since you're using Hector, hector-users@ is a good place to be, so u...@cassandra to bcc. operateWithFailover is one stop before sending the request over the network and waiting, so it makes lots of sense that a significant part of the application's time is spent in it. On Tue, Jul 13, 2010 at 6:22 PM,

Re: CassandraBulkLoader

2010-07-13 Thread Mubarak Seyed
Thanks Torsten. Jonathan's blog on Fact Vs Fiction says that Fact: It has always been straightforward to send the output of Hadoop jobs to Cassandra, and Facebook, Digg, and others have been using Hadoop like this as a Cassandra bulk-loader for over a year. Does anyone from Facebook or Digg

RE: Consequences of having many columns

2010-07-13 Thread Kochheiser,Todd W - TOK-DITT-1
So it would appear that 0.7 will have solved the requirement that a single row must be able to fit in memory. That issue aside, how would one expect the read/write performance to be in the scenarios listed below? From: Mason Hale [mailto:ma...@onespot.com]

Re: CassandraBulkLoader

2010-07-13 Thread Jonathan Ellis
look at contrib/bmt_example, with the caveat that it's usually premature optimization On Tue, Jul 13, 2010 at 12:31 PM, Mubarak Seyed mubarak.se...@gmail.com wrote: Thanks Torsten. Jonathan's blog on Fact Vs Fiction says that Fact: It has always been straightforward to send the output of

Re: NYC Cassandra training

2010-07-13 Thread Jonathan Ellis
We would like to do one in Europe in October. On Fri, Jul 9, 2010 at 11:02 AM, Dave Gardner dave.gard...@imagini.net wrote: Do you have a rough estimate as to when there might be a training day in London (UK). I'm currently weighing up whether I should be making a journey across the pond for

Re: NYC Cassandra training

2010-07-13 Thread Jonathan Ellis
On Fri, Jul 9, 2010 at 9:36 AM, Jeremy Dunck jdu...@gmail.com wrote: On Fri, Jul 2, 2010 at 1:08 PM, Jonathan Ellis jbel...@gmail.com wrote: Riptano's one day Cassandra training is coming to NYC in August, our first public session on the East coast: http://www.eventbrite.com/event/749518831

Re: High CPU usage on all nodes without any read or write

2010-07-13 Thread Jonathan Ellis
did you look at compaction activity? On Mon, Jul 12, 2010 at 9:31 AM, Olivier Rosello orose...@corp.free.fr wrote: But in Cassandra output log : r...@cassandra-2:~#  tail -f /var/log/cassandra/output.log  INFO 15:32:05,390 GC for ConcurrentMarkSweep: 1359 ms, 4295787600 reclaimed leaving

Re: Authentication

2010-07-13 Thread Jonathan Ellis
It's been suggested, but it's not very useful w/o having encryption for Thrift as well (in case a client has to fail over to the cross-country Cassandra nodes). So using a secure VPN makes the most sense to me. On Tue, Jul 13, 2010 at 12:02 PM, Ben Standefer b...@simplegeo.com wrote: Are there

Re: Authentication

2010-07-13 Thread Ben Standefer
Many apps would find it realistic or feasible to failover database connections across the country (going from 1ms latency to ~90ms latency). The scheme of failing over client database connections across the country is probably the minority case. SSL between Cassandra nodes, even without

Re: Authentication

2010-07-13 Thread Ben Standefer
Err, find it *unrealistic* -Ben On Tue, Jul 13, 2010 at 2:22 PM, Ben Standefer b...@simplegeo.com wrote: Many apps would find it realistic or feasible to failover database connections across the country (going from 1ms latency to ~90ms latency). The scheme of failing over client database

Re: CassandraBulkLoader

2010-07-13 Thread Torsten Curdt
look at contrib/bmt_example, with the caveat that it's usually premature optimization I wish that was true for us :) Fact: It has always been straightforward to send the output of Hadoop jobs to Cassandra, and Facebook, Digg, and others have been using Hadoop like this as a Cassandra

Re: live nodes list in ring

2010-07-13 Thread Artie Copeland
Benjamin, Yes, I have seen this when adding a new node into the cluster. The new node doesn't see the complete ring through nodetool, but the strange part is that looking at the ring through jconsole shows the complete ring. It is as if there is a bug in nodetool publishing the actual ring. has

Re: Authentication

2010-07-13 Thread Jonathan Ellis
Are you interested in contributing this? On Tue, Jul 13, 2010 at 4:22 PM, Ben Standefer b...@simplegeo.com wrote: Many apps would find it realistic or feasible to failover database connections across the country (going from 1ms latency to ~90ms latency).  The scheme of failing over client

Elastic Load Balancing Cassandra

2010-07-13 Thread Brian Helfrich
Hi, has anyone been able to load balance a Cassandra cluster with an AWS Elastic Load Balancer? I've set up an ELB with the obvious settings (namely, --listener lb-port=9160,instance-port=9160,protocol=TCP) but clients simply hang trying to load records from the ELB hostname:9160. Thanks,

Using Pelops with Cassandra 0.7.X

2010-07-13 Thread Peter Harrison
I know Cassandra 0.7 isn't released yet, but I was wondering if anyone has used Pelops with the latest builds of Cassandra? I'm having some issues, but I wanted to make sure that somebody else isn't working on a branch of Pelops to support Cassandra 7. I have downloaded and built the latest code

java.lang.NoSuchMethodError: org.apache.cassandra.db.ColumnFamily.id()I

2010-07-13 Thread Arya Goudarzi
I just built today's trunk successfully and am getting the following exception on startup, which seems bogus to me as the method exists, but I don't know why: ERROR 15:27:00,957 Exception encountered during startup. java.lang.NoSuchMethodError: org.apache.cassandra.db.ColumnFamily.id()I

Re: Using Pelops with Cassandra 0.7.X

2010-07-13 Thread Ran Tavory
Hector doesn't have 0.7 support yet On Jul 14, 2010 1:34 AM, Peter Harrison cheetah...@gmail.com wrote: I know Cassandra 0.7 isn't released yet, but I was wondering if anyone has used Pelops with the latest builds of Cassandra? I'm having some issues, but I wanted to make sure that somebody else

Re: java.lang.NoSuchMethodError: org.apache.cassandra.db.ColumnFamily.id()I

2010-07-13 Thread Jonathan Ellis
ant clean On Tue, Jul 13, 2010 at 5:33 PM, Arya Goudarzi agouda...@gaiaonline.com wrote: I just build today's trunk successfully and am getting the following exception on startup which to me it seams bogus as the method exists but I don't know why: ERROR 15:27:00,957 Exception encountered

Re: Elastic Load Balancing Cassandra

2010-07-13 Thread Dave Viner
I haven't used ELB, but I've set up HAProxy to do it... appears to work well so far. Dave Viner On Tue, Jul 13, 2010 at 3:30 PM, Brian Helfrich helfrich9...@gmail.com wrote: Hi, has anyone been able to load balance a Cassandra cluster with an AWS Elastic Load Balancer? I've setup an ELB with

Re: nodetool loadbalance : Strerams Continue on Non Acceptance of New Token

2010-07-13 Thread Arya Goudarzi
Hi Gary, Thanks for the reply. I tried this again today. Streams gets stuck, pls read my comment: https://issues.apache.org/jira/browse/CASSANDRA-1221 -arya - Original Message - From: Gary Dusbabek gdusba...@gmail.com To: user@cassandra.apache.org Sent: Wednesday, June 23, 2010

Re: Is anyone using version 0.7 schema update API

2010-07-13 Thread GH
To be honest I do not know how to regenerate the bindings; I will look into that. Following your email, I went on and took the unit test code and created a client. Given that this code works I am guessing that the thrift bindings are in place and it is more that the client code does not support

Re: Using Pelops with Cassandra 0.7.X

2010-07-13 Thread Dan Washusen
http://github.com/danwashusen/pelops/tree/cassandra-0.7.0 p.s. Pelops doesn't have any test coverage and my implicit tests (my app integration tests) don't touch anywhere near all of the Pelops API. p.s.s. I've made API breaking changes to support the new 0.7.0 API and Dominic (the original

Re: Using Pelops with Cassandra 0.7.X

2010-07-13 Thread Peter Harrison
On Wed, Jul 14, 2010 at 2:43 PM, Dan Washusen d...@reactive.org wrote: http://github.com/danwashusen/pelops/tree/cassandra-0.7.0 Doh - I've just finished making most of the changes for the new API. p.s. Pelops doesn't have any test coverage and my implicit tests (my app integration tests)

Re: Is anyone using version 0.7 schema update API

2010-07-13 Thread Dave Viner
Check out step 4 of this page: https://wiki.fourkitchens.com/display/PF/Using+Cassandra+with+PHP ./compiler/cpp/thrift -gen php ../PATH-TO-CASSANDRA/interface/cassandra.thrift That is how to compile the thrift client from the cassandra bindings. Just replace the php with the language of your