Re: MESSAGE-DESERIALIZER-POOL is slow down by SliceFromReadCommand

2010-06-28 Thread Lu Ming
OK, I found the reason. The task queue for the ROW-READ-STAGE ThreadPoolExecutor is a bounded queue of size 4096 (defined by DatabaseDescriptor.getStageQueueSize()). When the ROW-READ-STAGE ThreadPoolExecutor has more than 4096 pending tasks, the follow-up submitted task will be
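As a rough illustration of the bounded-queue behaviour described in this message, here is a minimal sketch using Python's standard `queue` module rather than Java's ThreadPoolExecutor. The function name `fill_stage_queue` is hypothetical; only the capacity of 4096 comes from the message. What Cassandra actually does with an overflowing task depends on the executor's rejection policy, which the truncated excerpt does not show, so the sketch only counts the overflow.

```python
import queue

STAGE_QUEUE_SIZE = 4096  # capacity quoted in the message; everything else is illustrative


def fill_stage_queue(num_tasks, capacity=STAGE_QUEUE_SIZE):
    """Submit num_tasks to a bounded queue and return (accepted, overflowed)."""
    pending = queue.Queue(maxsize=capacity)
    accepted = 0
    overflowed = 0
    for task_id in range(num_tasks):
        try:
            # put_nowait raises queue.Full instead of blocking when the queue is full
            pending.put_nowait(task_id)
            accepted += 1
        except queue.Full:
            overflowed += 1
    return accepted, overflowed
```

With no consumer draining the queue, every submission beyond the capacity overflows, which is the situation the message describes once reads back up.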

Hintedhandoff will never complete when a BIG rowmutation

2010-06-28 Thread Lu Ming
Hi: These days I found my Cassandra behaving strangely, much slower than before. I spent a lot of time figuring it out, and today I got the answer. Some bad guy kept writing a lot of data day and night, which produced a very big row mutation of about 140M. In this period I restarted

RE: Map Reduce support

2010-06-28 Thread Carlos Sanchez
Drew, I was wondering if you care to share your map-reduce code Thanks Carlos From: Drew Dahlke [drew.dah...@bronto.com] Sent: Monday, June 28, 2010 7:17 AM To: user@cassandra.apache.org Subject: Re: Map Reduce support The difference is noticeable but

Re: Map Reduce support

2010-06-28 Thread Drew Dahlke
I'm afraid I didn't hold on to it, sorry folks On Mon, Jun 28, 2010 at 8:58 AM, Carlos Sanchez carlos.sanc...@riskmetrics.com wrote: Drew, I was wondering if you care to share your map-reduce code Thanks Carlos From: Drew Dahlke

Digg 4 Preview on TWiT

2010-06-28 Thread Kochheiser,Todd W - TOK-DITT-1
On yesterday's This Week in Tech (TWiT, http://www.twit.tv/254) podcast with Leo Laporte (Wiki: http://wiki.twit.tv/wiki/TWiT_254), Kevin Rose of Digg (http://digg.com/) fame was a guest. He gave a public preview of the new Digg 4; it looks very nice and should be released in the next month or two.

Re: Hintedhandoff will never complete when a BIG rowmutation

2010-06-28 Thread Jonathan Ellis
Yes, you should increase your timeout if you are hinting big mutations (or big rows that were built from smaller mutations). 2010/6/28 Lu Ming xl...@live.com: Hi: These days I found my Cassandra behaving strangely, much slower than before. I spent a lot of time figuring it out, and today I got

Re: forum application data model conversion

2010-06-28 Thread Jonathan Ellis
The principle in Cassandra modeling is that for each query, you should denormalize your data at write time such that you can get the data for that query from a single row. The best explanation so far is at http://www.rackspacecloud.com/blog/2010/05/12/cassandra-by-example/ On Tue, Jun 22, 2010
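The write-time denormalization principle stated here can be sketched as follows. This is a minimal illustration, not Cassandra client code: plain dictionaries stand in for column families, and the names `posts_by_thread`, `posts_by_user`, `write_post`, and `thread_posts` are hypothetical choices for a forum-style model.

```python
from collections import defaultdict

# Plain dicts stand in for column families: row key -> {column name: value}.
posts_by_thread = defaultdict(dict)
posts_by_user = defaultdict(dict)


def write_post(thread_id, user, timestamp, body):
    """Write the same post into one row per query it must serve."""
    posts_by_thread[thread_id][timestamp] = (user, body)
    posts_by_user[user][timestamp] = (thread_id, body)


def thread_posts(thread_id):
    """Serve 'all posts in a thread' from a single row, in timestamp order."""
    row = posts_by_thread[thread_id]
    return [row[ts] for ts in sorted(row)]
```

Each query ("posts in a thread", "posts by a user") reads exactly one row; the cost is that every write touches several rows.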

Re: MESSAGE-DESERIALIZER-POOL is slow down by SliceFromReadCommand

2010-06-28 Thread Jonathan Ellis
If you're getting to where you have thousands of backed-up commands on RRS then it doesn't really matter what executor behavior is, because you're severely screwed. :) Often this means you are i/o bound and you should look into row or key caching. Possibly you may simply need to add capacity to

Re: sstable2json and key ordering

2010-06-28 Thread Jonathan Ellis
Yes. On Sat, Jun 26, 2010 at 12:53 AM, Todd Nine t...@spidertracks.co.nz wrote: Hi all, I'm having a lot of problems getting Lucandra to correctly handle numeric document fields. After examining the keys it has written to the CF, I believe it may be an issue of column ordering by bytes.
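To see why byte ordering can trip up numeric fields, consider this small sketch. It does not reflect how Lucandra actually encodes values, which the excerpt does not show; the fixed-width big-endian encoding is just one common workaround, assumed here for non-negative integers, and both function names are hypothetical.

```python
import struct


def sort_as_text(values):
    """Sort integers by the bytes of their decimal text, as a byte comparator would."""
    return sorted(str(v).encode("ascii") for v in values)


def sort_as_packed(values):
    """Sort non-negative integers by 8-byte big-endian encoding, then decode them."""
    packed = sorted(struct.pack(">Q", v) for v in values)
    return [struct.unpack(">Q", b)[0] for b in packed]
```

For the values 2, 10, 100, and 9, the text form sorts "10" and "100" ahead of "2", while the packed form keeps numeric order under the same byte-by-byte comparison.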

Re: Hintedhandoff will never complete when a BIG rowmutation

2010-06-28 Thread Robert Coli
On 6/28/10 3:29 AM, Lu Ming wrote: Every hour, HintedHandOffManager checks the hintedhandoff ColumnFamily and sends the big rowmutations to live nodes. It fails again because of the TimeoutException, so the task never finishes and the big rowmutation is sent again and again. In

Re: Digg 4 Preview on TWiT

2010-06-28 Thread Chris Goffinet
Digg is not forking Cassandra. We use 0.6 for production, with a few in-house patches (related to our infrastructure). The biggest difference between our branch and the Apache 0.6 branch is that we have the work Kelvin and Twitter have done on Vector Clocks + Distributed Counters. This will never

Re: Digg 4 Preview on TWiT

2010-06-28 Thread Kelvin Kakugawa
If you're interested: https://issues.apache.org/jira/browse/CASSANDRA-1072 https://issues.apache.org/jira/browse/CASSANDRA-580 -Kelvin On Mon, Jun 28, 2010 at 9:35 AM, Chris Goffinet c...@chrisgoffinet.com wrote: Digg is not forking Cassandra. We use 0.6 for production, with a few in-house

Re: Distributed work-queues?

2010-06-28 Thread Jeremy Davis
The approach that I took incorporates some of the ideas listed here... Basically each message in the queue was assigned a sequence number (needs to be unique and increasing per queue), and then read out in sequence number order. The Message CF is logically one row per queue, with each column being
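The sequence-number approach described here might look roughly like the sketch below. It is an in-memory stand-in, not the author's implementation: a dictionary plays the role of the single row per queue, and `itertools.count` replaces whatever source of unique, increasing sequence numbers a real cluster would need. The class name `SequencedQueue` is hypothetical.

```python
import itertools


class SequencedQueue:
    """One logical row per queue; column name = sequence number, value = message."""

    def __init__(self):
        # Stand-in for a real per-queue source of unique, increasing numbers.
        self._next_seq = itertools.count(1)
        self._row = {}

    def enqueue(self, message):
        """Store the message under the next sequence number and return that number."""
        seq = next(self._next_seq)
        self._row[seq] = message
        return seq

    def dequeue(self):
        """Remove and return the message with the lowest sequence number, or None."""
        if not self._row:
            return None
        seq = min(self._row)
        return self._row.pop(seq)
```

Reading in sequence-number order gives first-in, first-out delivery as long as the numbers really are unique and increasing per queue, which is the hard part in a distributed setting.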

Re: New User: OSX vs. Debian on Cassandra 0.5.0 with Thrift

2010-06-28 Thread JavierCanillas
Carlos Alvarez cbalvarez at gmail.com writes: On Sat, Apr 24, 2010 at 10:59 PM, Jonathan Ellis jbellis at gmail.com wrote: No, Framed is totally different. You are right. Seeing both, the java and c# thrift code, I think that there is no need to use other transport than TSocket in Java

Description of clustertool functionality?

2010-06-28 Thread Carlos Armas
Hi everyone, Can anyone point me to an in-depth description of clustertool functionality? I am specifically interested in the global_snapshot/clear_global_snapshot options. The terse --help doesn't tell much. Thanks in advance! Carlos

Cassandra and Thrift on the Server Side

2010-06-28 Thread Peter Minearo
First, let me preface this: I am new to Cassandra. I just got it installed and was able to add data, connect to Cassandra via Thrift and retrieve the data. Since Thrift uses RPC, I was wondering if Cassandra uses Thrift on the server side to handle the requests from the clients? I know Thrift is

Re: Cassandra and Thrift on the Server Side

2010-06-28 Thread Paul Prescod
On Mon, Jun 28, 2010 at 1:30 PM, Peter Minearo peter.mine...@reardencommerce.com wrote: First, let me preface this: I am new to Cassandra. I just got it installed and was able to add data, connect to Cassandra via Thrift and retrieve the data. Since Thrift uses RPC, I was wondering if Cassandra

Re: Digg 4 Preview on TWiT

2010-06-28 Thread Ryan King
On Mon, Jun 28, 2010 at 9:35 AM, Chris Goffinet c...@chrisgoffinet.com wrote: Digg is not forking Cassandra. We use 0.6 for production, with a few in-house patches (related to our infrastructure). The biggest difference between our branch and the Apache 0.6 branch is that we have the work Kelvin and

Re: Cassandra and Thrift on the Server Side

2010-06-28 Thread Marty Greenia
Would it ever be useful to someday have browser clients access Cassandra servers directly? I imagine that would be the most compelling scenario to have a REST API for. On Mon, Jun 28, 2010 at 1:36 PM, Paul Prescod p...@prescod.net wrote: On Mon, Jun 28, 2010 at 1:30 PM, Peter Minearo

RE: Cassandra and Thrift on the Server Side

2010-06-28 Thread Peter Minearo
You answered my questions on that. Thanks!! -Original Message- From: pres...@gmail.com [mailto:pres...@gmail.com] On Behalf Of Paul Prescod Sent: Monday, June 28, 2010 1:37 PM To: user@cassandra.apache.org Subject: Re: Cassandra and Thrift on the Server Side On Mon, Jun 28, 2010 at

Re: Cassandra and Thrift on the Server Side

2010-06-28 Thread Paul Prescod
On Mon, Jun 28, 2010 at 1:45 PM, Marty Greenia martygree...@gmail.com wrote: Would it ever be useful to someday have browser clients access Cassandra servers directly? I imagine that would be the most compelling scenario to have a REST API for. This is an interesting idea, but introduces quite a

Re: Cassandra and Thrift on the Server Side

2010-06-28 Thread Clint Byrum
On Mon, 2010-06-28 at 13:45 -0700, Marty Greenia wrote: Would it ever be useful to someday have browser clients access Cassandra servers directly? I imagine that would be the most compelling scenario to have a REST API for. I think that's an interesting use case for REST. From what I've seen,

Re: Cassandra and Thrift on the Server Side

2010-06-28 Thread Marty Greenia
I agree, it would probably make more sense to just use a conventional HTTP server to interface with the browser clients on the front end and act as a pass-through to Cassandra on the back end. No sense re-implementing all that functionality. Still, to Clint's point, everyone knows how to make an

Re: geo distance calculations

2010-06-28 Thread Ben Standefer
Joe, (Disclaimer: I work here) Check out http://www.simplegeo.com, we're building a horizontally scalable spatial database on top of Cassandra along with geographic analytics, data provisioning, and hosting services. Using us might be easier than hacking around in the weeds with MongoDB. Sorry

RE: Understanding SuperColumns

2010-06-28 Thread Anthony Ikeda
So no one is able to help? From: Anthony Ikeda [mailto:anthony.ik...@cardlink.com.au] Sent: Monday, 28 June 2010 12:36 PM To: user@cassandra.apache.org Subject: Understanding SuperColumns I seem to be scratching my head about how to model Super Columns properly and how they relate to a 1

Re: Description of clustertool functionality?

2010-06-28 Thread Jonathan Ellis
Clustertool is nodetool for cluster-wide operations. On Mon, Jun 28, 2010 at 2:27 PM, Carlos Armas carl...@gmail.com wrote: Hi everyone, Can anyone point me to an in-depth description of clustertool functionality? Specifically interested in the global_snapshot/clear_global_snapshot options.

Cassandra on AWS across Regions

2010-06-28 Thread Lenin Gali
Hi All, We have a Cassandra cluster set up with 2 EC2 m1.large machines: one is running in the us-east zone and the second in the us-west zone. In the Seed section, I provided the other node's EC2 public DNS entry in each storage-conf.xml, but when I ran the nodetool command, I am not

Re: Cassandra on AWS across Regions

2010-06-28 Thread Marty Greenia
What are you using for ListenAddress and ThriftAddress in the storage-conf.xml file? By default, they are set to 'localhost', which means Node A is probably telling Node B to contact itself on the private IP address. Marty On Mon, Jun 28, 2010 at 7:18 PM, Lenin Gali galile...@gmail.com wrote:

Re: Understanding SuperColumns

2010-06-28 Thread Benjamin Black
On Sun, Jun 27, 2010 at 7:36 PM, Anthony Ikeda anthony.ik...@cardlink.com.au wrote: Say my query is: Get all Work addresses in New York and the address owner. Steps to get the data would be: If this is the query you want to run, then you probably just want to put the owner in the index

Migrate cassandra between two DCs

2010-06-28 Thread albert_e
Hi all, We have several nodes in DC1 and DC2, and we want to move all nodes in DC2 to a new DC3; the IPs will also change. The whole process will last about 1-2 days. Can we simply stop all nodes in DC2, move them to DC3, change the IP config in storage-conf.xml, then start all nodes in DC3? Or shall we do

Re: Cassandra on AWS across Regions

2010-06-28 Thread maneela a
I have provided the East node's public DNS/IP address for both endpoints (ListenAddress and ThriftAddress), and added the West instance's public IP address as the Seed for East. Similarly, I used the West node's public DNS/IP for ListenAddress and ThriftAddress and included the East node's public IP

RE: Understanding SuperColumns

2010-06-28 Thread Anthony Ikeda
Thanks Ben. I've just been trying to nail this down - it's difficult knowing if you are getting it right when it feels like you're going backwards. Anthony From: Benjamin Black [mailto:b...@b3k.us] Sent: Tuesday, 29 June 2010 1:00 PM To: user@cassandra.apache.org Subject: Re:

about use batch_mutate,can some one provide a example,i am using FluentCassandra

2010-06-28 Thread Medcl~
Now I am confused about using the batch_mutate operation. When I follow the code from here: http://stackoverflow.com/questions/2783323/cassandra-batch-mutate/3137753#3137753 it seems to have successfully inserted something, but when I use cassandra-cli to check it, it just returns nothing. Could