Re: Understanding of proliferation of sstables during a repair

2017-02-26 Thread Benjamin Roth
Too many open files. The limit is 100k file descriptors by default, and we had >40k SSTables. Normally there are around 500-1000.
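To see how close a node is getting to that limit, here is a rough diagnostic sketch (plain Python, not Cassandra tooling; the data path and the per-SSTable handle estimate are assumptions for illustration):

```python
# Rough sketch: compare the process file-descriptor limit against the
# SSTable count, the two numbers involved in a TMOF ("too many open
# files") crash. The data directory path is the common default, an
# assumption about your layout; the x4 factor is a rough guess at
# handles per SSTable (Data, Index, Summary, ...), not an exact figure.
import resource
from pathlib import Path

def fd_limit() -> int:
    """Soft limit on open file descriptors for this process."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return soft

def count_sstables(data_dir: str) -> int:
    """Count *-Data.db files (one per SSTable) under a data directory."""
    return sum(1 for _ in Path(data_dir).rglob("*-Data.db"))

if __name__ == "__main__":
    limit = fd_limit()
    n = count_sstables("/var/lib/cassandra/data")  # assumed default path
    print(f"fd limit: {limit}, sstables: {n}")
    if n * 4 > limit:  # several file handles per SSTable; rough heuristic
        print("WARNING: SSTable count is approaching the fd limit")
```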

On 2017-02-27 02:40, Seth Edwards wrote:

> This makes a lot more sense. What does TMOF stand for?


Re: Understanding of proliferation of sstables during a repair

2017-02-26 Thread Seth Edwards
This makes a lot more sense. What does TMOF stand for?

On Sun, Feb 26, 2017 at 1:01 PM, Benjamin Roth wrote:



Re: Understanding of proliferation of sstables during a repair

2017-02-26 Thread Benjamin Roth
Hi Seth,

Repairs can create a lot of tiny SSTables. I have also seen the creation
of so many SSTables that the node died because of TMOF. At that time the
affected nodes were REALLY inconsistent.

One reason can be immense inconsistencies spread over many partition
ranges, combined with a lot of subrange repairs that trigger a lot of
independent streams. Each stream results in a single SSTable, which can be
very small. No matter how small it is, it has to be compacted, and the
compaction impact can be a lot bigger than you would expect from such a
tiny table.

Also consider that there is a theoretical race condition that can trigger
repairs even though the data is not inconsistent, caused by mutations that
are still in flight during merkle tree calculation.
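The effect can be illustrated with a toy model (plain Python, not Cassandra code; the numbers are made up): inconsistencies spread across the token ring intersect many subranges, and each affected subrange streams its own SSTable to the receiving node.

```python
# Toy model: why scattered inconsistencies plus subrange repair produce
# many tiny SSTables. Every subrange containing at least one mismatched
# token triggers its own independent stream, and each incoming stream is
# written as its own SSTable, however few rows it carries.

def sstables_created(num_subranges: int, inconsistent_tokens: list,
                     total_tokens: int) -> int:
    """Count subranges that contain at least one inconsistent token.

    Each such subrange repairs via an independent stream, i.e. one new
    SSTable on the receiving node.
    """
    width = total_tokens / num_subranges
    hit_subranges = {int(t // width) for t in inconsistent_tokens}
    return len(hit_subranges)

# 1000 inconsistent rows spread evenly over the ring:
spread = list(range(0, 100_000, 100))
print(sstables_created(1, spread, 100_000))    # one full-range repair
print(sstables_created(512, spread, 100_000))  # 512 subrange repairs
```

The same amount of inconsistent data yields one SSTable with a single full-range repair but hundreds with fine-grained subrange repairs, and each of those tiny tables still has to flow through compaction.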

2017-02-26 20:41 GMT+01:00 Seth Edwards:




-- 
Benjamin Roth
Prokurist

Jaumo GmbH · www.jaumo.com
Wehrstraße 46 · 73035 Göppingen · Germany
Phone +49 7161 304880-6 · Fax +49 7161 304880-1
AG Ulm · HRB 731058 · Managing Director: Jens Kammerer


Understanding of proliferation of sstables during a repair

2017-02-26 Thread Seth Edwards
Hello,

We just ran a repair on a keyspace using TWCS and a mixture of TTLs. This
caused a large proliferation of SSTables and compactions. There is likely a
lot of entropy in this keyspace. I am trying to better understand why this
is.

I've also read that you may not want to run repairs on short-TTL data and
instead rely on other anti-entropy mechanisms to achieve consistency. Is
this generally true?
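One way to frame that advice as a toy check (illustrative only; the comparison is an assumption for this sketch, not official Cassandra guidance): if the TTL is shorter than the repair cadence, most repaired rows expire before the next repair cycle would complete anyway.

```python
# Toy check: is repairing TTL'd data worth it? If the TTL is shorter
# than the repair cadence, repaired rows mostly turn into tombstones
# before the next cycle. The threshold is an assumption for
# illustration, not official guidance.

def repair_useful(ttl_seconds: int, repair_interval_seconds: int) -> bool:
    """True if data is expected to outlive a full repair cycle."""
    return ttl_seconds > repair_interval_seconds

DAY = 86_400
print(repair_useful(ttl_seconds=2 * DAY, repair_interval_seconds=7 * DAY))
print(repair_useful(ttl_seconds=90 * DAY, repair_interval_seconds=7 * DAY))
```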


Thanks!