On Wed, Sep 26, 2012 at 11:07 AM, Rob Coli <rc...@palominodb.com> wrote:
> On Wed, Sep 26, 2012 at 9:30 AM, Andrey Ilinykh <ailin...@gmail.com> wrote:
>> [ repair ballooned my data size ]
>> 1. Why repair almost triples data size?
>
> You didn't mention what version of cassandra you're running. In some
> old versions of cassandra (prior to 1.0), repair often creates even
> more extraneous data than it should by design.
>
Thank you for the reply.

I am running 1.1.5.

Honestly, I don't understand what is going on.

I ran a major compaction on Sep 15. As a result I had one big sstable
and several small ones. This is the biggest one:

-rw-rw-r-- 1 ubuntu ubuntu  90G Sep 15 12:56 Bidgely-rawstreams-he-8475-Data.db

On Sep 22 (one week later) I ran repair and got two more sstables:

-rw-rw-r-- 1 ubuntu ubuntu  85G Sep 22 00:41 Bidgely-rawstreams-he-8605-Data.db
-rw-rw-r-- 1 ubuntu ubuntu  86G Sep 22 00:45 Bidgely-rawstreams-he-8606-Data.db

I don't understand why it copied the data twice. In the worst case
scenario it should copy everything (~90G), but the data is triplicated
(90G + 85G + 86G).
Yesterday I ran repair one more time, and six (!) more big sstables were
added. It doesn't make any sense! What am I missing?

-rw-rw-r-- 1 ubuntu ubuntu  75G Sep 26 09:43 Bidgely-rawstreams-he-8785-Data.db
-rw-rw-r-- 1 ubuntu ubuntu  77G Sep 26 09:45 Bidgely-rawstreams-he-8788-Data.db
-rw-rw-r-- 1 ubuntu ubuntu  76G Sep 26 11:54 Bidgely-rawstreams-he-8793-Data.db
-rw-rw-r-- 1 ubuntu ubuntu  75G Sep 26 11:55 Bidgely-rawstreams-he-8797-Data.db
-rw-rw-r-- 1 ubuntu ubuntu  76G Sep 26 14:03 Bidgely-rawstreams-he-8804-Data.db
-rw-rw-r-- 1 ubuntu ubuntu  75G Sep 26 14:03 Bidgely-rawstreams-he-8807-Data.db
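
To put numbers on it, here is a rough tally of the listings above. It is
only a sketch: the sizes are the rounded values that ls printed, the
small sstables are not counted, and the variable names are made up for
illustration.

```python
# Sizes in GB, copied from the ls output above (rounded as ls printed them).
# Only the big sstables are counted; the small ones are ignored.
after_compaction = {"he-8475": 90}
after_first_repair = {"he-8605": 85, "he-8606": 86}
after_second_repair = {
    "he-8785": 75, "he-8788": 77, "he-8793": 76,
    "he-8797": 75, "he-8804": 76, "he-8807": 75,
}

# Running totals on disk after each step.
baseline = sum(after_compaction.values())
first_total = baseline + sum(after_first_repair.values())
second_total = first_total + sum(after_second_repair.values())

print(f"after major compaction: {baseline}G")
print(f"after first repair:     {first_total}G ({first_total / baseline:.1f}x)")
print(f"after second repair:    {second_total}G ({second_total / baseline:.1f}x)")
```

So the big sstables alone add up to about 261G after the first repair and
about 715G after the second one, against 90G right after the major
compaction.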

Even if I somehow compact it back to 100G, I will have the same problem
again very soon. What did I do wrong?

Andrey
