Nikolai,

Just in case you missed my comment in the thread (I guess you have): increasing the sstable size does nothing, in our case at least. That is, it is no worse, but the load pattern is still the same: the nodes do nothing most of the time. So I switched to STCS, and we will have to live with the extra storage cost. Storage is way cheaper than CPU etc. anyhow :-)
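For anyone following along, switching a table between LCS and STCS is a schema change on the table's compaction setting. Below is a minimal sketch that only builds the CQL text; the keyspace and table names (`my_ks`, `my_cf`) and the helper function are hypothetical, and the 160 MB value is just an illustrative number, not a setting taken from this thread.

```python
def compaction_alter_statement(keyspace, table, strategy, **options):
    """Build a CQL ALTER TABLE statement that sets the compaction strategy.

    keyspace/table are placeholders supplied by the caller; options are
    passed through as quoted strings in the compaction map.
    """
    settings = {"class": strategy}
    settings.update(options)
    body = ", ".join("'%s': '%s'" % (key, value) for key, value in settings.items())
    return "ALTER TABLE %s.%s WITH compaction = {%s};" % (keyspace, table, body)


# Hypothetical names: switch a table to size-tiered compaction (STCS).
print(compaction_alter_statement("my_ks", "my_cf", "SizeTieredCompactionStrategy"))

# Hypothetical names: leveled compaction (LCS) with an explicit sstable size.
print(compaction_alter_statement(
    "my_ks", "my_cf", "LeveledCompactionStrategy", sstable_size_in_mb=160))
```

The generated statement would be run through cqlsh or a driver. Existing sstables are rewritten by compaction over time, so expect extra compaction activity after the change.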

On Tue, Nov 25, 2014 at 5:53 PM, Nikolai Grigoriev <ngrigor...@gmail.com> wrote:
> Hi Jean-Armel,
>
> I am using latest and greatest DSE 4.5.2 (4.5.3 in another cluster but there
> are no relevant changes between 4.5.2 and 4.5.3) - thus, Cassandra 2.0.10.
>
> I have about 1,8Tb of data per node now in total, which falls into that
> range.
>
> As I said, it is really a problem with large amount of data in a single CF,
> not total amount of data. Quite often the nodes are idle yet having quite a
> bit of pending compactions. I have discussed it with other members of C*
> community and DataStax guys and, they have confirmed my observation.
>
> I believe that increasing the sstable size won't help at all and probably
> will make the things worse - everything else being equal, of course. But I
> would like to hear from Andrei when he is done with his test.
>
> Regarding the last statement - yes, C* clearly likes many small servers more
> than fewer large ones. But it is all relative - and can be all recalculated
> to $$$ :) C* is all about partitioning of everything - storage,
> traffic...Less data per node and more nodes give you lower latency, lower
> heap usage etc, etc. I think I have learned this with my project. Somewhat
> hard way but still, nothing is better than the personal experience :)
>
> On Tue, Nov 25, 2014 at 3:23 AM, Jean-Armel Luce <jaluc...@gmail.com> wrote:
>>
>> Hi Andrei, Hi Nicolai,
>>
>> Which version of C* are you using ?
>>
>> There are some recommendations about the max storage per node :
>> http://www.datastax.com/dev/blog/performance-improvements-in-cassandra-1-2
>>
>> "For 1.0 we recommend 300-500GB. For 1.2 we are looking to be able to
>> handle 10x (3-5TB)".
>>
>> I have the feeling that those recommendations are sensitive according many
>> criteria such as :
>> - your hardware
>> - the compaction strategy
>> - ...
>>
>> It looks that LCS lower those limitations.
>>
>> Increasing the size of sstables might help if you have enough CPU and you
>> can put more load on your I/O system (@Andrei, I am interested by the
>> results of your experimentation about large sstable files)
>>
>> From my point of view, there are some usage patterns where it is better to
>> have many small servers than a few large servers. Probably, it is better to
>> have many small servers if you need LCS for large tables.
>>
>> Just my 2 cents.
>>
>> Jean-Armel
>>
>> 2014-11-24 19:56 GMT+01:00 Robert Coli <rc...@eventbrite.com>:
>>>
>>> On Mon, Nov 24, 2014 at 6:48 AM, Nikolai Grigoriev <ngrigor...@gmail.com>
>>> wrote:
>>>>
>>>> One of the obvious recommendations I have received was to run more than
>>>> one instance of C* per host. Makes sense - it will reduce the amount of
>>>> data per node and will make better use of the resources.
>>>
>>> This is usually a Bad Idea to do in production.
>>>
>>> =Rob
>>
>
> --
> Nikolai Grigoriev
> (514) 772-5178