Re: Cassandra query degradation with high frequency updated tables.

Tyler Hobbs Thu, 08 Oct 2015 14:36:31 -0700

Upgrade to 2.2.2.  Your sstables are probably not compacting due to
CASSANDRA-10270 <https://issues.apache.org/jira/browse/CASSANDRA-10270>,
which was fixed in 2.2.2.
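
One way to check whether uncompacted sstables are really the cause is to
trace the slow read in cqlsh. This is only a sketch: it reuses the table and
query from the quoted message below, and TRACING is a cqlsh command rather
than a CQL statement.

```cql
-- Run inside cqlsh, after USE-ing the keyspace that holds myprofile.
TRACING ON;

-- The monthly-profile read from the quoted message.
SELECT * FROM myprofile WHERE id='1' AND month='Oct' AND day='' AND hour='';

TRACING OFF;
```

The trace output should include a line reporting how many sstables were
merged for the read. If that number keeps growing as the partition is
updated, the sstables are not being compacted.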


Additionally, you may want to look into using leveled compaction (
http://www.datastax.com/dev/blog/when-to-use-leveled-compaction).
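
As a rough sketch, assuming the myprofile table from the quoted message and
default LCS settings, switching the compaction strategy would look something
like this:

```cql
-- Switch myprofile from the default size-tiered strategy to leveled
-- compaction. Existing sstables are recompacted into levels afterwards,
-- which causes extra compaction I/O for a while.
ALTER TABLE myprofile
  WITH compaction = { 'class' : 'LeveledCompactionStrategy' };
```

Leveled compaction tends to suit partitions that are overwritten frequently
and read often, at the cost of more compaction work on writes.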

On Thu, Oct 8, 2015 at 4:27 PM, Nazario Parsacala <dodongj...@gmail.com>
wrote:

>
> Hi,
>
> We are developing a system that computes profiles of the things it
> observes. The observations arrive as events. Each thing it observes has
> an id and contains a set of subthings, each of which carries a
> measurement of some kind. There are roughly 500 subthings within each
> thing. We receive events containing measurements of these 500 subthings
> every 10 seconds or so.
>
> As we receive events, we read the old profile value, calculate the new
> profile based on the new value, and save it back. We use the following
> schema to hold the profile.
>
> CREATE TABLE myprofile (
>     id text,
>     month text,
>     day text,
>     hour text,
>     subthings text,
>     lastvalue double,
>     count int,
>     stddev double,
>     PRIMARY KEY ((id, month, day, hour), subthings)
> ) WITH CLUSTERING ORDER BY (subthings ASC);
>
>
> This profile will then be used for certain analytics, either in the
> context of the 'thing' or in the context of a specific thing and subthing.
>
> A profile can be monthly, daily, or hourly. For a monthly profile, the
> month is set to the current month (e.g. 'Oct') and the day and hour are
> set to the empty string ''.
>
>
> The problem we have observed is that over time (actually in just a
> matter of hours) query response for the monthly profile degrades badly.
> At the start it responds in 10-100 ms, and after a couple of hours it
> goes to 2000-3000 ms. If you leave it for a couple of days you start
> getting read timeouts. The query is basically just:
>
> select * from myprofile where id='1' and month='Oct' and day='' and hour='';
>
> This will have only about 500 rows or so.
>
>
> I believe this is caused by the many updates made to this specific
> partition. What can be done to resolve this?
>
> BTW, I am using Cassandra 2.2.1. And since this is a test, it is just
> running on a single node.
>
>
>
>
>


-- 
Tyler Hobbs
DataStax <http://datastax.com/>
