Re: Crash with TombstoneOverwhelmingException

2014-01-17 Thread Shao-Chuan Wang
Agree with Robert about the dogfood.

http://www.datastax.com/docs/datastax_enterprise3.2/dse_release_notes#rn-3-2-4
It may be a good indicator when DSE starts using Cassandra 2.x.y in
production.


> From: Robert Coli 
> Date: Mon, Dec 30, 2013 at 2:58 PM
> Subject: Re: Crash with TombstoneOverwhelmingException
> To: "user@cassandra.apache.org" 
>
>
>  On Wed, Dec 25, 2013 at 10:01 AM, Edward Capriolo 
> wrote:
>
>> I have to hijack this thread. There seem to be many problems with the
>> 2.0.3 release.
>>
>
> +1. There is no 2.0.x release I consider production ready, even after
> today's 2.0.4.
>
>> Outside of passing all unit tests, what factors into the release voting
>> process? What other type of extended real world testing should be done to
>> find bugs like this one that unit testing won't?
>>
>
> I also +1 these questions. Voting seems of limited use given the outputs
> of the process.
>
>>
>> Here is a wacky idea that I am half serious about. Make a CMS for
>> http://cassndra.apache.org  that back-ends its data and reporting into
>> Cassandra. No release unless the Cassandra DB that serves the site is
>> upgraded first. :)
>>
>
> I agree wholeheartedly that eating one's own dogfood is informative.
>
> =Rob
>
>
>


Re: Tracking word frequencies

2014-01-17 Thread Jonathan Lacefield
Hi David,

  How do you know that you are receiving a seek for each row?  Are you
querying for a specific word at a time or do the queries span multiple
words, i.e. what's the query pattern? Also, what is your goal for read
latency?  Most customers can achieve microsecond partition-key-based query
reads with Cassandra.  This can be done through tuning, data modeling,
and/or scaling.  Please post a cfhistograms for this table as well as
provide some details on the specific queries you are running.

Thanks,

Jonathan

Jonathan Lacefield
Solutions Architect, DataStax
(404) 822 3487

On Fri, Jan 17, 2014 at 1:41 AM, David Tinker wrote:

> I have an app that stores lots of bits of text in Cassandra. One of
> the things I need to do is keep a global word frequency table.
> Something like this:
>
> CREATE TABLE IF NOT EXISTS word_count (
>   word text,
>   count value,
>   PRIMARY KEY (word)
> );
>
> This is slow to read as the rows (hundreds of thousands of them) each
> need a seek. Is there a better way to model this in Cassandra? I could
> periodically snapshot the rows into a fat row in another table I
> suppose.
>
> Or should I use Redis or something instead? I would prefer to keep it
> all Cassandra if possible.
>
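
For illustration, the "fat row" snapshot David mentions could look roughly
like the sketch below. Nothing here comes from the thread beyond the idea
itself: the table name word_count_snapshot, the bucket column, and the bigint
type for count are all assumptions (the original schema gives the type only
as "value"). The point is that many words share one partition, so a single
partition read returns them together instead of one seek per word.

```cql
-- Hypothetical snapshot table: all names and types are assumptions.
-- Words are grouped under a small, fixed set of bucket values so that
-- each bucket is one wide partition, clustered by word.
CREATE TABLE IF NOT EXISTS word_count_snapshot (
  bucket int,      -- assumed: assigned by the job that writes the snapshot
  word text,
  count bigint,    -- assumed type; copied from word_count at snapshot time
  PRIMARY KEY (bucket, word)
);

-- Read every word in one bucket with a single partition query.
SELECT word, count FROM word_count_snapshot WHERE bucket = 0;
```

How many buckets to use is a trade-off that would need testing: one bucket
gives a single very wide partition on one replica set, while more buckets
spread the load but require one query per bucket.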