What is your gc_grace set to? It sounds like as the number of tombstone
records increases, your performance decreases. (Which I would expect.)


gc_grace is at the default.


Cassandra's data files are write-once; deletes are just another write. Until
compaction, they all live on disk. Making really big rows has this problem.
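
For reference, a minimal sketch of what that means in practice, assuming the
DataStax python-driver and a CQL3 rendering of the wide row; the names
(ks.messages, row_key, col_name) are hypothetical, not from this thread:

    from cassandra.cluster import Cluster

    cluster = Cluster(['127.0.0.1'])
    session = cluster.connect()

    # Each DELETE is itself a write: it produces a tombstone, and both the
    # deleted column and its tombstone stay in the SSTables until compaction
    # runs and gc_grace_seconds has elapsed.
    session.execute(
        "DELETE FROM ks.messages WHERE row_key = %s AND col_name = %s",
        ('queue-1', 'msg-42'))

    # Lowering gc_grace_seconds lets compaction purge tombstones sooner, at
    # the cost of a shorter window for repair to propagate deletes to all
    # replicas.
    session.execute("ALTER TABLE ks.messages WITH gc_grace_seconds = 3600")

    cluster.shutdown()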

Oh, so it looks like I should lower the min_compaction_threshold for this
column family, right?
What does this threshold value really mean?
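
For what it's worth: with SizeTieredCompactionStrategy (the default), this
threshold is the minimum number of similarly sized SSTables that have to
accumulate before a minor compaction merges them; the default is 4. A sketch
of lowering it via CQL3, again with a hypothetical keyspace/table name:

    from cassandra.cluster import Cluster

    cluster = Cluster(['127.0.0.1'])
    session = cluster.connect()

    # min_threshold (min_compaction_threshold) is the minimum number of
    # similarly sized SSTables that must pile up before a minor compaction
    # merges them. Lowering it makes compaction run more often, so
    # tombstoned columns get purged from disk sooner.
    session.execute(
        "ALTER TABLE ks.messages WITH compaction = "
        "{'class': 'SizeTieredCompactionStrategy', 'min_threshold': '2'}")

    cluster.shutdown()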


Guys, thanks for the help so far.
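
Regarding the question at the bottom of the thread ("how can I avoid those
deleted columns?"): one pattern that is sometimes used for queue-like rows is
to remember the last column consumed and start each slice just after it,
rather than always slicing from the beginning of the row. A minimal sketch,
assuming the DataStax python-driver and CQL3; the table and column names
(ks.messages, row_key, col_name, payload) are hypothetical:

    from cassandra.cluster import Cluster

    cluster = Cluster(['127.0.0.1'])
    session = cluster.connect()

    last_seen = ''  # highest column name already processed (and deleted)

    # Starting the slice just past the last consumed column keeps the read
    # from walking over the tombstones that pile up at the front of the row.
    rows = session.execute(
        "SELECT col_name, payload FROM ks.messages "
        "WHERE row_key = %s AND col_name > %s LIMIT 100",
        ('queue-1', last_seen))

    for row in rows:
        # ... process the message, then mark it for deletion as before ...
        last_seen = row.col_name

    cluster.shutdown()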

On Sat, Mar 2, 2013 at 3:42 PM, Michael Kjellman <mkjell...@barracuda.com> wrote:

> What is your gc_grace set to? It sounds like as the number of tombstone
> records increases, your performance decreases. (Which I would expect.)
>
> On Mar 2, 2013, at 10:28 AM, "Víctor Hugo Oliveira Molinar" <
> vhmoli...@gmail.com> wrote:
>
> I have a daily maintenance window on my cluster where I truncate this column
> family, because its data doesn't need to be kept for more than a day.
> Since all the regular operations on it finish around 4 hours before the end
> of the day, I regularly run a truncate on it followed by a repair at the end
> of the day.
>
> And every day, when the operations are started (when there are only a few
> deleted columns), the performance looks pretty good.
> Unfortunately it degrades over the course of the day.
>
>
> On Sat, Mar 2, 2013 at 2:54 PM, Michael Kjellman
> <mkjell...@barracuda.com> wrote:
>
>> When is the last time you did a cleanup on the cf?
>>
>> On Mar 2, 2013, at 9:48 AM, "Víctor Hugo Oliveira Molinar" <
>> vhmoli...@gmail.com> wrote:
>>
>> > Hello guys.
>> > I'm investigating the reasons for performance degradation in my case
>> > scenario, which is as follows:
>> >
>> > - I have a column family which is filled with thousands of columns
>> > inside a single row (varies between 10k ~ 200k). And I also have
>> > thousands of rows, not much more than 15k.
>> > - These rows are constantly updated, but the write load is not that
>> > intensive. I estimate it at 100 writes/sec on the column family.
>> > - Each column represents a message which is read and processed by
>> > another process. After reading it, the column is marked for deletion in
>> > order to keep it out of the next query on this row.
>> >
>> > Ok, so, I've figured out that after many insertions plus deletion
>> > updates, my queries (column slice queries) are taking more time to
>> > complete, even when there are only a few columns, fewer than 100.
>> >
>> > So it looks like the greater the number of deleted columns, the longer
>> > the time spent on a query.
>> > -> Internally in C*, does a column slice query range over deleted
>> > columns?
>> > If so, how can I mitigate the impact on my queries? Or, how can I avoid
>> > those deleted columns?
>>
>
>
>
