> We were thinking of doing a major compaction after each year is 'closed off'. 
Not a terrible idea. Years tend to happen annually, so their growth pattern is 
well understood. 

> This would mean that compactions for the current year were dealing with a 
> smaller amount of data and hence be faster and have less impact on a 
> day-to-day basis.
Older data is compacted into higher tiers / generations so will not be included 
when compacting new data (background 
http://www.datastax.com/dev/blog/leveled-compaction-in-apache-cassandra). That 
said, there is a chance that at some point the big older files will get 
compacted, i.e. if you end up with (by default) 4 x 100GB files they will get 
compacted into 1. 
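
As a rough illustration of the tiering described above, here is a toy Python sketch. It is not Cassandra's actual compaction code: the names `compact` and `simulate`, the fixed 100 MB flush size, and grouping files by exactly equal size (rather than "similar" size) are all simplifying assumptions. Only the threshold of 4 comes from the default mentioned above.

```python
# Toy model of size-tiered compaction. NOT Cassandra's implementation:
# compact(), simulate() and the equal-size grouping are hypothetical
# simplifications used only to show how often old data gets rewritten.
from collections import Counter

MIN_THRESHOLD = 4  # default number of similar-sized files that triggers a merge


def compact(sstables):
    """Merge groups of MIN_THRESHOLD equal-sized files until none are left.

    Returns (sorted list of file sizes, total MB rewritten by the merges).
    """
    sstables = list(sstables)  # work on a copy, leave the caller's list alone
    rewritten = 0
    while True:
        counts = Counter(sstables)
        ready = [size for size, n in counts.items() if n >= MIN_THRESHOLD]
        if not ready:
            return sorted(sstables), rewritten
        size = min(ready)  # compact the smallest eligible tier first
        for _ in range(MIN_THRESHOLD):
            sstables.remove(size)
        sstables.append(size * MIN_THRESHOLD)
        rewritten += size * MIN_THRESHOLD


def simulate(flushes, flush_mb=100):
    """Flush `flushes` memtables of `flush_mb` each, compacting after each one.

    Returns (final file sizes, MB rewritten after each flush).
    """
    sstables = []
    history = []
    for _ in range(flushes):
        sstables.append(flush_mb)
        sstables, rewritten = compact(sstables)
        history.append(rewritten)
    return sstables, history
```

In this model `simulate(5)` leaves one 400 MB file plus one new 100 MB file, and most later flushes rewrite nothing or only the small tier. The oldest data is only rewritten again once four 400 MB files have built up (the 16th flush here), which is the same effect as the 4 x 100GB case above, just at a smaller scale.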

It feels a bit like a premature optimisation. 
 
-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 23/05/2012, at 1:52 PM, Franc Carter wrote:

> On Wed, May 23, 2012 at 7:42 AM, aaron morton <aa...@thelastpickle.com> wrote:
> 1 KS with 24 CF's will use roughly the same resources as 24 KS's with 1 CF. 
> Each CF:
> 
> * loads the bloom filter for each SSTable
> * samples the index for each sstable
> * uses row and key cache
> * has a current memtable and potentially memtables waiting to flush.
> * may have secondary index CF's
> 
> I would generally avoid a data model that calls for CF's to be added in 
> response to new entities or new data. Older data will be moved to larger 
> files, and not included in compaction for newer data.
> 
> We were thinking of doing a major compaction after each year is 'closed off'. 
> This would mean that compactions for the current year were dealing with a 
> smaller amount of data and hence be faster and have less impact on a 
> day-to-day basis. Our query patterns will only infrequently cross year 
> boundaries.
> 
> Are we being naive ?
> 
> cheers
>  
> 
> Hope that helps. 
> 
> -----------------
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
> 
> On 23/05/2012, at 3:31 AM, Luís Ferreira wrote:
> 
>> I have 24 keyspaces, each with a column family, and am considering changing 
>> it to 1 keyspace with 24 CFs. Would this be beneficial?
>> On May 22, 2012, at 12:56 PM, samal wrote:
>> 
>>> Not ideally. Cassandra now has global memtable tuning, but each CF still 
>>> corresponds to memory in RAM. A year-wise CF means it will be in a read-only 
>>> state for the next year, yet its memtable will still consume RAM.
>>> 
>>> On 22-May-2012 5:01 PM, "Franc Carter" <franc.car...@sirca.org.au> wrote:
>>> On Tue, May 22, 2012 at 9:19 PM, aaron morton <aa...@thelastpickle.com> 
>>> wrote:
>>> It's more the number of CF's than keyspaces.
>>> 
>>> Oh - does increasing the number of Column Families affect performance ?
>>> 
>>> The design we are working on at the moment is considering using a Column 
>>> Family per year. We were thinking this would isolate compactions to a more 
>>> manageable size as we don't update previous years.
>>> 
>>> cheers
>>>  
>>> 
>>> Cheers
>>> 
>>> -----------------
>>> Aaron Morton
>>> Freelance Developer
>>> @aaronmorton
>>> http://www.thelastpickle.com
>>> 
>>> On 22/05/2012, at 6:58 PM, R. Verlangen wrote:
>>> 
>>>> Yes, it does. However, there's no real answer as to what the limit is: it 
>>>> depends on your hardware and cluster configuration. 
>>>> 
>>>> You might even want to search the archives of this mailing list; I remember 
>>>> this has been asked before.
>>>> 
>>>> Cheers!
>>>> 
>>>> 2012/5/21 Luís Ferreira <zamith...@gmail.com>
>>>> Hi,
>>>> 
>>>> Does the number of keyspaces affect the overall cassandra performance?
>>>> 
>>>> 
>>>> Cumprimentos,
>>>> Luís Ferreira
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> -- 
>>>> With kind regards,
>>>> 
>>>> Robin Verlangen
>>>> www.robinverlangen.nl
>>>> 
>>> 
>>> 
>>> 
>>> 
>>> -- 
>>> Franc Carter | Systems architect | Sirca Ltd
>>> franc.car...@sirca.org.au | www.sirca.org.au
>>> Tel: +61 2 9236 9118 
>>> Level 9, 80 Clarence St, Sydney NSW 2000
>>> PO Box H58, Australia Square, Sydney NSW 1215
>>> 
>> 
>> Cumprimentos,
>> Luís Ferreira
>> 
>> 
>> 
> 
> 
> 
> 
> -- 
> Franc Carter | Systems architect | Sirca Ltd
> franc.car...@sirca.org.au | www.sirca.org.au
> Tel: +61 2 9236 9118 
> Level 9, 80 Clarence St, Sydney NSW 2000
> PO Box H58, Australia Square, Sydney NSW 1215
> 
