Re: Compaction Strategy guidance

Andrei Ivanov Tue, 25 Nov 2014 10:26:38 -0800

Nikolai,

Just in case you've missed my comment in the thread (guess you have) -
increasing the sstable size does nothing (in our case at least). That is,
it's not worse, but the load pattern is still the same - doing nothing
most of the time. So I switched to STCS, and we will have to live with the
extra storage cost - storage is way cheaper than CPU etc. anyhow :-)
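
For anyone following along, here is a minimal sketch of what the two options discussed in this thread look like as CQL: switching a table to STCS, or keeping LCS with a larger sstable size. The keyspace and table names (my_ks, my_cf), the helper function, and the 256 MB value are made up for illustration; only the strategy class names and the sstable_size_in_mb option are standard Cassandra settings. Check the rendered statements against your version before running them.

```python
def compaction_cql(keyspace, table, strategy, **options):
    """Render an ALTER TABLE statement that sets the compaction strategy.

    keyspace/table are hypothetical names; options are passed through
    as quoted string values in the CQL compaction map.
    """
    params = {"class": strategy}
    params.update({key: str(value) for key, value in options.items()})
    body = ", ".join("'%s': '%s'" % (key, value) for key, value in params.items())
    return "ALTER TABLE %s.%s WITH compaction = {%s};" % (keyspace, table, body)


# Option 1: switch the table to size-tiered compaction (STCS).
stcs = compaction_cql("my_ks", "my_cf", "SizeTieredCompactionStrategy")

# Option 2: stay on leveled compaction (LCS) with a larger sstable size.
# 256 is an arbitrary illustrative value, not a recommendation.
lcs = compaction_cql(
    "my_ks", "my_cf", "LeveledCompactionStrategy", sstable_size_in_mb=256
)

print(stcs)
print(lcs)
```

The printed statements can be pasted into cqlsh. Changing the strategy triggers recompaction of the existing sstables, so it is worth trying on one table at a time.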


On Tue, Nov 25, 2014 at 5:53 PM, Nikolai Grigoriev <ngrigor...@gmail.com> wrote:
> Hi Jean-Armel,
>
> I am using latest and greatest DSE 4.5.2 (4.5.3 in another cluster but there
> are no relevant changes between 4.5.2 and 4.5.3) - thus, Cassandra 2.0.10.
>
> I have about 1.8 TB of data per node now in total, which falls into that
> range.
>
> As I said, it is really a problem with a large amount of data in a single CF,
> not the total amount of data. Quite often the nodes are idle yet have quite a
> bit of pending compactions. I have discussed it with other members of the C*
> community and the DataStax guys, and they have confirmed my observation.
>
> I believe that increasing the sstable size won't help at all and probably
> will make things worse - everything else being equal, of course. But I
> would like to hear from Andrei when he is done with his test.
>
> Regarding the last statement - yes, C* clearly likes many small servers more
> than fewer large ones. But it is all relative - and can all be recalculated
> in $$$ :) C* is all about partitioning everything - storage,
> traffic... Less data per node and more nodes give you lower latency, lower
> heap usage, etc. I think I have learned this with my project. The somewhat
> hard way, but still, nothing is better than personal experience :)
>
> On Tue, Nov 25, 2014 at 3:23 AM, Jean-Armel Luce <jaluc...@gmail.com> wrote:
>>
>> Hi Andrei, Hi Nikolai,
>>
>> Which version of C* are you using?
>>
>> There are some recommendations about the max storage per node:
>> http://www.datastax.com/dev/blog/performance-improvements-in-cassandra-1-2
>>
>> "For 1.0 we recommend 300-500GB. For 1.2 we are looking to be able to
>> handle 10x (3-5TB)".
>>
>> I have the feeling that those recommendations depend on many criteria,
>> such as:
>> - your hardware
>> - the compaction strategy
>> - ...
>>
>> It looks like LCS lowers those limits.
>>
>> Increasing the size of sstables might help if you have enough CPU and you
>> can put more load on your I/O system (@Andrei, I am interested in the
>> results of your experiment with large sstable files).
>>
>> From my point of view, there are some usage patterns where it is better to
>> have many small servers than a few large servers. Probably, it is better to
>> have many small servers if you need LCS for large tables.
>>
>> Just my 2 cents.
>>
>> Jean-Armel
>>
>> 2014-11-24 19:56 GMT+01:00 Robert Coli <rc...@eventbrite.com>:
>>>
>>> On Mon, Nov 24, 2014 at 6:48 AM, Nikolai Grigoriev <ngrigor...@gmail.com>
>>> wrote:
>>>>
>>>> One of the obvious recommendations I have received was to run more than
>>>> one instance of C* per host. Makes sense - it will reduce the amount of
>>>> data per node and will make better use of the resources.
>>>
>>>
>>> This is usually a Bad Idea to do in production.
>>>
>>> =Rob
>>>
>>
>>
>
>
>
> --
> Nikolai Grigoriev
> (514) 772-5178
