On Wed, Sep 26, 2012 at 11:07 AM, Rob Coli <rc...@palominodb.com> wrote:
> On Wed, Sep 26, 2012 at 9:30 AM, Andrey Ilinykh <ailin...@gmail.com> wrote:
>> [ repair ballooned my data size ]
>> 1. Why repair almost triples data size?
>
> You didn't mention what version of cassandra you're running. In some
> old versions of cassandra (prior to 1.0), repair often creates even
> more extraneous data than it should by design.
>
Thank you for the reply.
I run 1.1.5. Honestly, I don't understand what is going on. I ran a major
compaction on Sep 15; as a result I had one big sstable and several small
ones. This is the biggest one:

-rw-rw-r-- 1 ubuntu ubuntu 90G Sep 15 12:56 Bidgely-rawstreams-he-8475-Data.db

On Sep 22 (one week later) I ran repair and got two more sstables:

-rw-rw-r-- 1 ubuntu ubuntu 85G Sep 22 00:41 Bidgely-rawstreams-he-8605-Data.db
-rw-rw-r-- 1 ubuntu ubuntu 86G Sep 22 00:45 Bidgely-rawstreams-he-8606-Data.db

I don't understand why it copied the data twice. In the worst-case scenario
it should copy everything (~90G), but the data is triplicated
(90G + 85G + 86G).

Yesterday I ran repair one more time, and six(!) more big sstables were
added. It doesn't make any sense! What am I missing?

-rw-rw-r-- 1 ubuntu ubuntu 75G Sep 26 09:43 Bidgely-rawstreams-he-8785-Data.db
-rw-rw-r-- 1 ubuntu ubuntu 77G Sep 26 09:45 Bidgely-rawstreams-he-8788-Data.db
-rw-rw-r-- 1 ubuntu ubuntu 76G Sep 26 11:54 Bidgely-rawstreams-he-8793-Data.db
-rw-rw-r-- 1 ubuntu ubuntu 75G Sep 26 11:55 Bidgely-rawstreams-he-8797-Data.db
-rw-rw-r-- 1 ubuntu ubuntu 76G Sep 26 14:03 Bidgely-rawstreams-he-8804-Data.db
-rw-rw-r-- 1 ubuntu ubuntu 75G Sep 26 14:03 Bidgely-rawstreams-he-8807-Data.db

Even if I somehow compact it back to 100G, I will have the same problem
again very soon. What did I do wrong?

Andrey
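
To make the arithmetic in the listings above concrete, here is a minimal Python sketch that totals the Data.db sizes per day. The helper name size_by_day and the regular expression are illustrative assumptions, not part of any Cassandra tooling, and the sketch assumes every size is printed in whole G exactly as in the listings.

```python
import re
from collections import defaultdict

# The sstable listings quoted in this message, copied verbatim.
LS_OUTPUT = """\
-rw-rw-r-- 1 ubuntu ubuntu 90G Sep 15 12:56 Bidgely-rawstreams-he-8475-Data.db
-rw-rw-r-- 1 ubuntu ubuntu 85G Sep 22 00:41 Bidgely-rawstreams-he-8605-Data.db
-rw-rw-r-- 1 ubuntu ubuntu 86G Sep 22 00:45 Bidgely-rawstreams-he-8606-Data.db
-rw-rw-r-- 1 ubuntu ubuntu 75G Sep 26 09:43 Bidgely-rawstreams-he-8785-Data.db
-rw-rw-r-- 1 ubuntu ubuntu 77G Sep 26 09:45 Bidgely-rawstreams-he-8788-Data.db
-rw-rw-r-- 1 ubuntu ubuntu 76G Sep 26 11:54 Bidgely-rawstreams-he-8793-Data.db
-rw-rw-r-- 1 ubuntu ubuntu 75G Sep 26 11:55 Bidgely-rawstreams-he-8797-Data.db
-rw-rw-r-- 1 ubuntu ubuntu 76G Sep 26 14:03 Bidgely-rawstreams-he-8804-Data.db
-rw-rw-r-- 1 ubuntu ubuntu 75G Sep 26 14:03 Bidgely-rawstreams-he-8807-Data.db
"""

# Hypothetical pattern: captures the size in G and the "Mon DD" date
# of each line that ends in a -Data.db file name.
LINE_RE = re.compile(r"\s(\d+)G\s+(\w{3}\s+\d+)\s+\d\d:\d\d\s+\S+-Data\.db$")


def size_by_day(ls_output):
    """Return {date: total size in G} for the Data.db files in the listing."""
    totals = defaultdict(int)
    for line in ls_output.splitlines():
        match = LINE_RE.search(line)
        if match:
            totals[match.group(2)] += int(match.group(1))
    return dict(totals)


totals = size_by_day(LS_OUTPUT)
for day, gigabytes in totals.items():
    print(f"{day}: {gigabytes}G")
print(f"total: {sum(totals.values())}G")
```

On the listings above this reports 90G for Sep 15, 171G for Sep 22 and 454G for Sep 26, or 715G in total against roughly 90G after the major compaction.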