Re: Usage of allocate_tokens_for_keyspace for a new cluster

2019-02-14 Thread DuyHai Doan
Ok thanks John

On Thu, Feb 14, 2019 at 8:51 PM Jonathan Haddad  wrote:

> Create the first node, setting the tokens manually.
> Create the keyspace.
> Add the rest of the nodes with the allocate tokens uncommented.
>
> On Thu, Feb 14, 2019 at 11:43 AM DuyHai Doan  wrote:
>
>> Hello users
>>
>> Looking at the mailing list archive, there have already been some questions
>> about the "allocate_tokens_for_keyspace" flag from cassandra.yaml.
>>
>> I'm starting a fresh new cluster (with 0 data).
>>
>> The keyspace used by the project is raw_data so I
>> set allocate_tokens_for_keyspace = raw_data in the cassandra.yaml
>>
>> However, the cluster fails to start because the keyspace does not exist yet
>> (of course, since it has not been created yet).
>>
>> So to me it looks like a chicken-and-egg issue:
>>
>> 1. You create a fresh new cluster with the option
>> "allocate_tokens_for_keyspace" commented out; in this case you cannot
>> optimize the token allocations.
>> 2. You create a fresh new cluster with the option
>> "allocate_tokens_for_keyspace" pointing to a not-yet-created keyspace; it
>> fails (logically).
>>
>> The third option is:
>>
>>  a. create a new cluster with "allocate_tokens_for_keyspace" commented out
>>  b. create the keyspace "raw_data"
>>  c. set allocate_tokens_for_keyspace = raw_data
>>
>> My question is: since after step a. the token allocation is *already*
>> done, what's the point of setting the flag in step c.?
>>
>> Regards
>>
>> Duy Hai DOAN
>>
> --
> Jon Haddad
> http://www.rustyrazorblade.com
> twitter: rustyrazorblade
>


Re: Usage of allocate_tokens_for_keyspace for a new cluster

2019-02-14 Thread Jonathan Haddad
Create the first node, setting the tokens manually.
Create the keyspace.
Add the rest of the nodes with the allocate tokens uncommented.
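
A minimal sketch of that sequence (the num_tokens value, token list, datacenter
name and replication factor below are illustrative assumptions, not values from
this thread):

# --- node 1 only: tokens chosen by hand, allocation flag still commented out ---
# cassandra.yaml:
#   num_tokens: 8
#   initial_token: <8 evenly spaced Murmur3 tokens, comma-separated>
#   # allocate_tokens_for_keyspace: raw_data

# once node 1 is up, create the keyspace the allocator will target:
cqlsh -e \
  "CREATE KEYSPACE raw_data WITH replication = {'class': 'NetworkTopologyStrategy', 'dc1': 3};"

# --- nodes 2..N: uncomment the flag before their first start ---
# cassandra.yaml:
#   num_tokens: 8
#   allocate_tokens_for_keyspace: raw_data

Once the keyspace exists, each joining node can ask the allocator to pick tokens
that balance ownership for that keyspace's replication settings.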

On Thu, Feb 14, 2019 at 11:43 AM DuyHai Doan  wrote:

> Hello users
>
> Looking at the mailing list archive, there have already been some questions
> about the "allocate_tokens_for_keyspace" flag from cassandra.yaml.
>
> I'm starting a fresh new cluster (with 0 data).
>
> The keyspace used by the project is raw_data so I
> set allocate_tokens_for_keyspace = raw_data in the cassandra.yaml
>
> However, the cluster fails to start because the keyspace does not exist yet
> (of course, since it has not been created yet).
>
> So to me it looks like a chicken-and-egg issue:
>
> 1. You create a fresh new cluster with the option
> "allocate_tokens_for_keyspace" commented out; in this case you cannot
> optimize the token allocations.
> 2. You create a fresh new cluster with the option
> "allocate_tokens_for_keyspace" pointing to a not-yet-created keyspace; it
> fails (logically).
>
> The third option is:
>
>  a. create a new cluster with "allocate_tokens_for_keyspace" commented out
>  b. create the keyspace "raw_data"
>  c. set allocate_tokens_for_keyspace = raw_data
>
> My question is: since after step a. the token allocation is *already*
> done, what's the point of setting the flag in step c.?
>
> Regards
>
> Duy Hai DOAN
>
-- 
Jon Haddad
http://www.rustyrazorblade.com
twitter: rustyrazorblade


Usage of allocate_tokens_for_keyspace for a new cluster

2019-02-14 Thread DuyHai Doan
Hello users

Looking at the mailing list archive, there have already been some questions
about the "allocate_tokens_for_keyspace" flag from cassandra.yaml.

I'm starting a fresh new cluster (with 0 data).

The keyspace used by the project is raw_data so I
set allocate_tokens_for_keyspace = raw_data in the cassandra.yaml

However, the cluster fails to start because the keyspace does not exist yet
(of course, since it has not been created yet).

So to me it looks like a chicken-and-egg issue:

1. You create a fresh new cluster with the option
"allocate_tokens_for_keyspace" commented out; in this case you cannot
optimize the token allocations.
2. You create a fresh new cluster with the option
"allocate_tokens_for_keyspace" pointing to a not-yet-created keyspace; it
fails (logically).

The third option is:

 a. create a new cluster with "allocate_tokens_for_keyspace" commented out
 b. create the keyspace "raw_data"
 c. set allocate_tokens_for_keyspace = raw_data

My question is: since after step a. the token allocation is *already* done,
what's the point of setting the flag in step c.?

Regards

Duy Hai DOAN


Re: AxonOps - Cassandra operational management tool

2019-02-14 Thread Nitan Kainth
This is really cool!

Will it be open source or licensed in the near future?

On Thu, Feb 14, 2019 at 12:15 PM AxonOps  wrote:

> Hi folks,
>
> We are excited to announce that AxonOps, an operational management tool for
> Apache Cassandra, is now ready for beta testing.
>
> We'd be interested to have you try this and let us know what you think!
>
> Please read the installation instructions on https://www.axonops.com
>
> AxonOps Team
>


Re: Bootstrap keeps failing

2019-02-14 Thread Léo FERLIN SUTTON
On Thu, Feb 14, 2019 at 6:56 PM Kenneth Brotman
 wrote:

> Those aren’t the same error messages so I think progress has been made.
>
>
>
> What version of C* are you running?
>
3.0.17. We will upgrade to 3.0.18 soon.

> How did you clear out the space?
>
I had a few topology changes to clean up after. `nodetool cleanup` did miracles.

Regards,

Leo

>
> *From:* Léo FERLIN SUTTON [mailto:lfer...@mailjet.com.INVALID]
> *Sent:* Thursday, February 14, 2019 7:54 AM
> *To:* user@cassandra.apache.org
> *Subject:* Re: Bootstrap keeps failing
>
>
>
> Hello again !
>
>
>
> I have managed to free a lot of disk space and now most nodes hover
> between 50% and 80%.
>
> I am still getting bootstrapping failures :(
>
>
>
> Here I have some logs :
>
> 2019-02-14T15:23:05+00:00 cass02-0001.c.company.internal user err
> cassandra  [org.apache.cassandra.streaming.StreamSession] [onError] -
> [Stream #ea8ae230-2f8f-11e9-8418-6d4f57de615d] Streaming error occurred
>
> 2019-02-14T15:23:05+00:00 cass02-0001.c.company..internal user info
> cassandra  [org.apache.cassandra.streaming.StreamResultFuture]
> [handleSessionComplete] - [Stream #ea8ae230-2f8f-11e9-8418-6d4f57de615d]
> Session with /10.10.23.155 is complete
>
> 2019-02-14T15:23:05+00:00 cass02-0001.c.company..internal user warning
> cassandra  [org.apache.cassandra.streaming.StreamResultFuture]
> [maybeComplete] - [Stream #ea8ae230-2f8f-11e9-8418-6d4f57de615d] Stream
> failed
>
> 2019-02-14T15:23:05+00:00 cass02-0001.c.company..internal user warning
> cassandra  [org.apache.cassandra.service.StorageService] [onFailure] -
> Error during bootstrap.
>
> 2019-02-14T15:23:05+00:00 cass02-0001.c.company..internal user err
> cassandra  [org.apache.cassandra.service.StorageService] [bootstrap] -
> Error while waiting on bootstrap to complete. Bootstrap will have to be
> restarted.
>
> 2019-02-14T15:23:05+00:00 cass02-0001.c.company..internal user warning
> cassandra  [org.apache.cassandra.service.StorageService] [joinTokenRing] -
> Some data streaming failed. Use nodetool to check bootstrap state and
> resume. For more, see `nodetool help bootstrap`. IN_PROGRESS
>
>
>
>
>
> I can see a `Streaming error occurred` for all of the nodes it is trying to
> stream from. Is there a way to get more logs to know why the failures
> occurred?
>
> I have set a `<logger … level="DEBUG" />` entry, but it doesn't seem to give
> me more details; is there another class I should set to DEBUG?
>
>
>
> Finally I have also noticed a lot of :
>
> [org.apache.cassandra.db.compaction.LeveledManifest]
> [getCompactionCandidates] - Bootstrapping - doing STCS in L0
>
> in my log files. It might be important.
>
>
>
> Regards,
>
>
>
> Leo
>
>
>
> On Fri, Feb 8, 2019 at 3:59 PM Léo FERLIN SUTTON 
> wrote:
>
> On Fri, Feb 8, 2019 at 3:37 PM Kenneth Brotman
>  wrote:
>
> Thanks for the details; that helps us understand the situation.  I’m pretty
> sure you’ve exceeded the working capacity of some of those nodes.  Going over
> 50% - 75% depending on compaction strategy is ill-advised.
>
> 50% free disk space is a steep price to pay for disk space not used. We
> have about 90 terabytes of data on SSD and we are paying about $100 per
> terabyte of SSD storage (on Google Cloud).
>
> Maybe we can get closer to 75%.
>
>
>
> Our compaction strategy is `LeveledCompactionStrategy` on our two biggest
> tables (90% of the data).
>
>
>
> You need to clear out as much room as possible to add more nodes.
>
> Are the tombstones clearing out?
>
> I think we don't have a lot of tombstones :
>
> We have 0 deletes on our two biggest tables.
>
> One of them gets updated with new data (messages.messages), but the updates
> are filling previously empty columns; I am unsure, but I think this doesn't
> cause any tombstones.
>
> I have attached the info from `nodetool tablestats` for our two largest
> tables.
>
>
>
> We are using cassandra-reaper that manages our repairs. A full repair
> takes about 13 days. So if we have tombstones they should not be older than
> 13 days.
>
>
>
> Are there old snapshots that you can delete?  And so on.
>
> Unfortunately no. We take a daily snapshot that we back up, then drop.
>
>
>
> You have to make more room on the existing nodes.
>
>
>
> I am trying to run `nodetool cleanup` on our most "critical" nodes to see
> if it helps. If that doesn't do the trick we will only have two solutions :
>
>- Add more disk space on each node
>- Adding new nodes
>
> We have looked at some other companies' case studies, and it looks like we
> have a few very big nodes instead of a lot of smaller ones.
>
> We are currently trying to add nodes, and are hoping to eventually
> transition to a "lot of small nodes" model and be able to add nodes a lot
> faster.
>
>
>
> Thank you again for your interest,
>
>
>
> Regards,
>
>
>
> Leo
>
>
>
> *From:* Léo FERLIN SUTTON [mailto:lfer...@mailjet.com.INVALID]
> *Sent:* Friday, February 08, 2019 6:16 AM
> *To:* user@cassandra.apache.org
> *Subject:* Re: Bootstrap keeps failing
>
>
>
> On Thu, Feb 7, 

AxonOps - Cassandra operational management tool

2019-02-14 Thread AxonOps
Hi folks,

We are excited to announce that AxonOps, an operational management tool for
Apache Cassandra, is now ready for beta testing.

We'd be interested to have you try this and let us know what you think!

Please read the installation instructions on https://www.axonops.com

AxonOps Team


RE: Bootstrap keeps failing

2019-02-14 Thread Kenneth Brotman
Those aren’t the same error messages so I think progress has been made.  

 

What version of C* are you running?

How did you clear out the space?

 

Kenneth Brotman

 

From: Léo FERLIN SUTTON [mailto:lfer...@mailjet.com.INVALID] 
Sent: Thursday, February 14, 2019 7:54 AM
To: user@cassandra.apache.org
Subject: Re: Bootstrap keeps failing

 

Hello again !

 

I have managed to free a lot of disk space and now most nodes hover between 50% 
and 80%.

I am still getting bootstrapping failures :(

 

Here I have some logs : 

2019-02-14T15:23:05+00:00 cass02-0001.c.company.internal user err cassandra  
[org.apache.cassandra.streaming.StreamSession] [onError] - [Stream 
#ea8ae230-2f8f-11e9-8418-6d4f57de615d] Streaming error occurred

2019-02-14T15:23:05+00:00 cass02-0001.c.company..internal user info cassandra  
[org.apache.cassandra.streaming.StreamResultFuture] [handleSessionComplete] - 
[Stream #ea8ae230-2f8f-11e9-8418-6d4f57de615d] Session with /10.10.23.155 is complete

2019-02-14T15:23:05+00:00 cass02-0001.c.company..internal user warning 
cassandra  [org.apache.cassandra.streaming.StreamResultFuture] [maybeComplete] 
- [Stream #ea8ae230-2f8f-11e9-8418-6d4f57de615d] Stream failed

2019-02-14T15:23:05+00:00 cass02-0001.c.company..internal user warning 
cassandra  [org.apache.cassandra.service.StorageService] [onFailure] - Error 
during bootstrap.

2019-02-14T15:23:05+00:00 cass02-0001.c.company..internal user err cassandra  
[org.apache.cassandra.service.StorageService] [bootstrap] - Error while waiting 
on bootstrap to complete. Bootstrap will have to be restarted.

2019-02-14T15:23:05+00:00 cass02-0001.c.company..internal user warning 
cassandra  [org.apache.cassandra.service.StorageService] [joinTokenRing] - Some 
data streaming failed. Use nodetool to check bootstrap state and resume. For 
more, see `nodetool help bootstrap`. IN_PROGRESS

 

 

I can see a `Streaming error occurred` for all of the nodes it is trying to
stream from. Is there a way to get more logs to know why the failures
occurred?

I have set a `<logger … level="DEBUG" />` entry, but it doesn't seem to give me
more details; is there another class I should set to DEBUG?

 

Finally I have also noticed a lot of :

[org.apache.cassandra.db.compaction.LeveledManifest] [getCompactionCandidates] 
- Bootstrapping - doing STCS in L0

in my log files. It might be important.

 

Regards,

 

Leo

 

On Fri, Feb 8, 2019 at 3:59 PM Léo FERLIN SUTTON  wrote:

On Fri, Feb 8, 2019 at 3:37 PM Kenneth Brotman  
wrote:

Thanks for the details; that helps us understand the situation.  I’m pretty sure
you’ve exceeded the working capacity of some of those nodes.  Going over 50% -
75% depending on compaction strategy is ill-advised.

50% free disk space is a steep price to pay for disk space not used. We have 
about 90 terabytes of data on SSD and we are paying about $100 per terabyte of
SSD storage (on Google Cloud).

Maybe we can get closer to 75%.

 

Our compaction strategy is `LeveledCompactionStrategy` on our two biggest 
tables (90% of the data).

 

You need to clear out as much room as possible to add more nodes.  

Are the tombstones clearing out?

I think we don't have a lot of tombstones :

We have 0 deletes on our two biggest tables. 

One of them gets updated with new data (messages.messages), but the updates are
filling previously empty columns; I am unsure, but I think this doesn't cause
any tombstones.

I have attached the info from `nodetool tablestats` for our two largest tables.

 

We are using cassandra-reaper that manages our repairs. A full repair takes 
about 13 days. So if we have tombstones they should not be older than 13 days.

 

Are there old snapshots that you can delete?  And so on.

Unfortunately no. We take a daily snapshot that we back up, then drop.

 

You have to make more room on the existing nodes.

 

I am trying to run `nodetool cleanup` on our most "critical" nodes to see if it 
helps. If that doesn't do the trick we will only have two solutions :

*   Add more disk space on each node
*   Adding new nodes

We have looked at some other companies' case studies, and it looks like we have a
few very big nodes instead of a lot of smaller ones.

We are currently trying to add nodes, and are hoping to eventually transition 
to a "lot of small nodes" model and be able to add nodes a lot faster.

 

Thank you again for your interest,

 

Regards,

 

Leo

 

From: Léo FERLIN SUTTON [mailto:lfer...@mailjet.com.INVALID] 
Sent: Friday, February 08, 2019 6:16 AM
To: user@cassandra.apache.org
Subject: Re: Bootstrap keeps failing

 

On Thu, Feb 7, 2019 at 10:11 PM Kenneth Brotman  
wrote:

Lots of things come to mind. We need more information from you to help us 
understand:

How long have you had your cluster running?

A bit more than a year old. But it has been constantly growing (3 nodes to 6 
nodes to 12 nodes, etc).

We have a replication_factor of 3 on all keyspaces and 3 racks with an equal 
number of nodes.

 

Is it generally 

Re: forgot to run nodetool cleanup

2019-02-14 Thread Oleksandr Shulgin
On Thu, Feb 14, 2019 at 4:39 PM Jeff Jirsa  wrote:
>
> Wait, doesn't cleanup just rewrite every SSTable one by one?  Why would
> compaction strategy matter?  Do you mean that after cleanup STCS may pick
> some resulting tables to re-compact them due to the min/max size
> difference, which would not be the case with LCS?
>
>
> LCS has smaller, non-overlapping files. The upleveling process and
> non-overlapping part makes it very likely (but not guaranteed) that within
> a level, only 2 sstables will overlap a losing range.
>
> Since cleanup only rewrites files if they’re out of range, LCS probably
> only has 5 (levels) * 2 (lower and upper) * number of ranges sstables that
> are going to get rewritten, whereas TWCS/STCS is probably going to rewrite
> all of them.

Thanks for the explanation!

Still with the default number of vnodes, there is probably not much of a
difference as even a single additional node will touch a lot of ranges?

--
Alex


Re: Bootstrap keeps failing

2019-02-14 Thread Léo FERLIN SUTTON
Hello again !

I have managed to free a lot of disk space and now most nodes hover between
50% and 80%.
I am still getting bootstrapping failures :(

Here I have some logs :

> 2019-02-14T15:23:05+00:00 cass02-0001.c.company.internal user err
>> cassandra  [org.apache.cassandra.streaming.StreamSession] [onError] -
>> [Stream #ea8ae230-2f8f-11e9-8418-6d4f57de615d] Streaming error occurred
>
> 2019-02-14T15:23:05+00:00 cass02-0001.c.company..internal user info
>> cassandra  [org.apache.cassandra.streaming.StreamResultFuture]
>> [handleSessionComplete] - [Stream #ea8ae230-2f8f-11e9-8418-6d4f57de615d]
>> Session with /10.10.23.155 is complete
>
> 2019-02-14T15:23:05+00:00 cass02-0001.c.company..internal user warning
>> cassandra  [org.apache.cassandra.streaming.StreamResultFuture]
>> [maybeComplete] - [Stream #ea8ae230-2f8f-11e9-8418-6d4f57de615d] Stream
>> failed
>
> 2019-02-14T15:23:05+00:00 cass02-0001.c.company..internal user warning
>> cassandra  [org.apache.cassandra.service.StorageService] [onFailure] -
>> Error during bootstrap.
>
> 2019-02-14T15:23:05+00:00 cass02-0001.c.company..internal user err
>> cassandra  [org.apache.cassandra.service.StorageService] [bootstrap] -
>> Error while waiting on bootstrap to complete. Bootstrap will have to be
>> restarted.
>
> 2019-02-14T15:23:05+00:00 cass02-0001.c.company..internal user warning
>> cassandra  [org.apache.cassandra.service.StorageService] [joinTokenRing] -
>> Some data streaming failed. Use nodetool to check bootstrap state and
>> resume. For more, see `nodetool help bootstrap`. IN_PROGRESS
>
>
>
I can see a `Streaming error occurred` for all of the nodes it is trying to
stream from. Is there a way to get more logs to know why the failures
occurred?
I have set a `<logger … level="DEBUG" />` entry, but it doesn't seem to give me
more details; is there another class I should set to DEBUG?
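
For reference, a sketch of the knobs involved here. The logger name simply
matches the org.apache.cassandra.streaming classes visible in the log lines
above; the runtime command and the resume step assume a reasonably recent
nodetool (2.2 or later):

# logback.xml (or set it at runtime, without a restart):
#   <logger name="org.apache.cassandra.streaming" level="DEBUG"/>
nodetool setlogginglevel org.apache.cassandra.streaming DEBUG

# see which sessions are streaming, from where, and how far along they are:
nodetool netstats

# if the join died mid-stream, try resuming rather than wiping and restarting:
nodetool bootstrap resume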

Finally I have also noticed a lot of :

> [org.apache.cassandra.db.compaction.LeveledManifest]
> [getCompactionCandidates] - Bootstrapping - doing STCS in L0

in my log files. It might be important.

Regards,

Leo

On Fri, Feb 8, 2019 at 3:59 PM Léo FERLIN SUTTON 
wrote:

> On Fri, Feb 8, 2019 at 3:37 PM Kenneth Brotman
>  wrote:
>
>> Thanks for the details; that helps us understand the situation.  I’m
>> pretty sure you’ve exceeded the working capacity of some of those nodes.
>> Going over 50% - 75% depending on compaction strategy is ill-advised.
>>
> 50% free disk space is a steep price to pay for disk space not used. We
> have about 90 terabytes of data on SSD and we are paying about $100 per
> terabyte of SSD storage (on Google Cloud).
> Maybe we can get closer to 75%.
>
> Our compaction strategy is `LeveledCompactionStrategy` on our two biggest
> tables (90% of the data).
>
> You need to clear out as much room as possible to add more nodes.
>>
> Are the tombstones clearing out?
>>
> I think we don't have a lot of tombstones :
> We have 0 deletes on our two biggest tables.
> One of them gets updated with new data (messages.messages), but the updates
> are filling previously empty columns; I am unsure, but I think this doesn't
> cause any tombstones.
> I have attached the info from `nodetool tablestats` for our two largest
> tables.
>
> We are using cassandra-reaper that manages our repairs. A full repair
> takes about 13 days. So if we have tombstones they should not be older than
> 13 days.
>
> Are there old snapshots that you can delete?  And so on.
>>
> Unfortunately no. We take a daily snapshot that we back up, then drop.
>
>
>> You have to make more room on the existing nodes.
>>
>
> I am trying to run `nodetool cleanup` on our most "critical" nodes to see
> if it helps. If that doesn't do the trick we will only have two solutions :
>
>- Add more disk space on each node
>- Adding new nodes
>
> We have looked at some other companies' case studies, and it looks like we
> have a few very big nodes instead of a lot of smaller ones.
> We are currently trying to add nodes, and are hoping to eventually
> transition to a "lot of small nodes" model and be able to add nodes a lot
> faster.
>
> Thank you again for your interest,
>
> Regards,
>
> Leo
>
>
>> *From:* Léo FERLIN SUTTON [mailto:lfer...@mailjet.com.INVALID]
>> *Sent:* Friday, February 08, 2019 6:16 AM
>> *To:* user@cassandra.apache.org
>> *Subject:* Re: Bootstrap keeps failing
>>
>>
>>
>> On Thu, Feb 7, 2019 at 10:11 PM Kenneth Brotman
>>  wrote:
>>
>> Lots of things come to mind. We need more information from you to help us
>> understand:
>>
>> How long have you had your cluster running?
>>
>> A bit more than a year old. But it has been constantly growing (3 nodes
>> to 6 nodes to 12 nodes, etc).
>>
>> We have a replication_factor of 3 on all keyspaces and 3 racks with an
>> equal number of nodes.
>>
>>
>>
>> Is it generally working ok?
>>
>> Works fine. Good performance, repairs managed by cassandra-reaper.
>>
>>
>>
>> Is it just one node that is misbehaving at a time?
>>
>> We only bootstrap nodes one at a time. Sometimes it works 

Re: [EXTERNAL] Re: Make large partitions lighter on select without changing primary partition formation.

2019-02-14 Thread Jeff Jirsa
It takes effect on each sstable as it’s written, so you have to rewrite your 
dataset before it’s fully in effect

You can do that with “nodetool upgradesstables -a”
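
A small sketch of the full sequence (256 KB is just the example size mentioned
elsewhere in this thread; the keyspace and table names are placeholders):

# cassandra.yaml, on each node, then restart it:
#   column_index_size_in_kb: 256

# newly written sstables pick the value up immediately; force-rewrite the
# existing ones so old data uses it too:
nodetool upgradesstables -a my_keyspace my_table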



-- 
Jeff Jirsa


> On Feb 13, 2019, at 11:43 PM, "ishib...@gmail.com"  wrote:
> 
> Hi Jeff,
> If I increase the value, it will affect only newly created indexes. Will repair
> rebuild old indexes with the new, larger size, or leave them at the same
> size?
> 
> Best regards, Ilya
> 
> 
> 
>  Original message 
> Subject: Re: [EXTERNAL] Re: Make large partitions lighter on select without
> changing primary partition formation.
> From: Jeff Jirsa 
> To: user@cassandra.apache.org
> Cc: 
> 
> 
> We build an index on each partition as we write it - in 3.0 it’s a list 
> that relates the clustering columns for a given partition key to a point in 
> the file. When you read, we use that index to skip to the point at the 
> beginning of your read.
> 
> That 64k value is just a default that few people ever have reason to change - 
> it’s somewhat similar to the 64k compression chunk size, though they’re 
> not aligned.
> 
> If you increase the value from 64k to 128k, you’ll have half as many index 
> markers per partition. This means when you use the index, you’ll do a bit 
> more IO to find the actual start of your result. However, it also means you 
> have half as many index objects created in the heap on reads - for many use 
> cases with wide partitions, the IndexInfo objects on reads create far too 
> much garbage and cause bad latency/gc. This just gives you a way to trade off 
> between the two options - disk IO or gc pauses.
> 
> 
> -- 
> Jeff Jirsa
> 
> 
>> On Feb 13, 2019, at 10:45 PM, "ishib...@gmail.com"  
>> wrote:
>> 
>> Hello!
>> increase column_index_size_in_kb so that index entries are created less
>> often, am I correct? But will it be used in every read request, or is the
>> column index only used for queries within a row?
>> 
>> Best regards, Ilya
>> 
>> 
>> 
>>  Original message 
>> Subject: Re: [EXTERNAL] Re: Make large partitions lighter on select
>> without changing primary partition formation.
>> From: Jeff Jirsa 
>> To: user@cassandra.apache.org
>> Cc: 
>> 
>> 
>> Cassandra-11206 (https://issues.apache.org/jira/browse/CASSANDRA-11206) is 
>> in 3.11 and does have a few knobs to make this less painful
>> 
>> You can also increase the column index size from 64kb to something 
>> significantly higher to decrease the cost of those reads on the JVM 
>> (shifting cost to the disk) - consider 256k or 512k for 100-1000mb 
>> partitions.
>> 
>> -- 
>> Jeff Jirsa
>> 
>> 
>>> On Feb 13, 2019, at 5:48 AM, Durity, Sean R  
>>> wrote:
>>> 
>>> Agreed. It’s pretty close to impossible to administrate your way out 
>>> of a data model that doesn’t play to Cassandra’s strengths, which is 
>>> true for other data storage technologies as well – you need to 
>>> model the data the way that the engine is designed to work.
>>>  
>>>  
>>> Sean Durity
>>>  
>>> From: DuyHai Doan  
>>> Sent: Wednesday, February 13, 2019 8:08 AM
>>> To: user 
>>> Subject: [EXTERNAL] Re: Make large partitions lighter on select without 
>>> changing primary partition formation.
>>>  
>>> Plain answer is NO
>>>  
>>> There is a slight hope that the JIRA 
>>> https://issues.apache.org/jira/browse/CASSANDRA-9754 gets into the 4.0 release
>>>  
>>> But right now, there seems to be little interest in this ticket; the last 
>>> comment dates from 23/Feb/2017 ...
>>>  
>>>  
>>> On Wed, Feb 13, 2019 at 1:18 PM Vsevolod Filaretov  
>>> wrote:
>>> Hi all,
>>>  
>>> The question.
>>>  
>>> We have Cassandra 3.11.1 with really heavy primary partitions:
>>> cfhistograms shows the 95th percentile partition size at 130+ MB, the 95th
>>> percentile cell count at 3.3 million and higher, and the 98th percentile
>>> partition size at 220+ MB; sometimes partitions are 1+ GB. We have regular
>>> problems with node lockdowns leading to read request timeouts under read
>>> request load.
>>>  
>>> Changing the primary partition key structure is out of the question.
>>>  
>>> Are there any sharding techniques available to dilute partitions at a level
>>> below the 'select' requests, to make read performance better without
>>> changing the read request syntax?
>>>  
>>> Thank you all in advance,
>>> Vsevolod Filaretov.
>>> 
>>> 

Re: forgot to run nodetool cleanup

2019-02-14 Thread Jeff Jirsa



> On Feb 14, 2019, at 12:19 AM, Oleksandr Shulgin 
>  wrote:
> 
>> On Wed, Feb 13, 2019 at 6:47 PM Jeff Jirsa  wrote:
>> Depending on how bad data resurrection is, you should run it for any host 
>> that loses a range. In vnodes, that's usually all hosts. 
>> 
>> Cleanup with LCS is very cheap. Cleanup with STCS/TWCS is a bit more work.
> 
> Wait, doesn't cleanup just rewrite every SSTable one by one?  Why would 
> compaction strategy matter?  Do you mean that after cleanup STCS may pick 
> some resulting tables to re-compact them due to the min/max size difference, 
> which would not be the case with LCS?

LCS has smaller, non-overlapping files. The upleveling process and 
non-overlapping part makes it very likely (but not guaranteed) that within a 
level, only 2 sstables will overlap a losing range. 

Since cleanup only rewrites files if they’re out of range, LCS probably only 
has 5 (levels) * 2 (lower and upper) * number of ranges sstables that are going 
to get rewritten, whereas TWCS/STCS is probably going to rewrite all of them.

Re: Ansible scripts for Cassandra to help with automation needs

2019-02-14 Thread Abdul Patel
One idea would be a rolling restart of the complete cluster; that script would
be a huge help.
I also just read a blog post saying that The Last Pickle has come up with a
tool called 'cstar' that can help with rolling restarts.
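
A bare-bones sketch of what such a script usually does, one node at a time
(the host list, SSH access, the `cassandra` service name and the use of the
first address from `hostname -I` are all assumptions about a typical install):

for host in cass-01 cass-02 cass-03; do
  ip=$(ssh "$host" 'hostname -I | cut -d" " -f1')

  # flush memtables and stop accepting traffic cleanly, then restart
  ssh "$host" 'nodetool drain && sudo systemctl restart cassandra'

  # move on only once this node is back to Up/Normal in the ring
  until ssh "$host" nodetool status 2>/dev/null | grep -q "^UN  *$ip "; do
    sleep 10
  done
done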


On Thursday, February 14, 2019, Jeff Jirsa  wrote:

>
>
>
> On Feb 13, 2019, at 9:51 PM, Kenneth Brotman 
> wrote:
>
> I want to generate a variety of Ansible scripts to share with the Apache
> Cassandra community.  I’ll put them in a Github repository.  Just email me
> offline what scripts would help the most.
>
>
>
> Does this exist already?  I can’t find it.  Let me know if it does.
>
>
> Not aware of any repo that does this, but it’s a good idea
>
>
>
> If not, let’s put it together for the community.  Maybe we’ll end up with
> a download right on the Apache Cassandra web site or packaged with future
> releases of Cassandra.
>
>
>
> Kenneth Brotman
>
>
>
> P.S.  Terraform is next!
>
>


Re: forgot to run nodetool cleanup

2019-02-14 Thread shalom sagges
Cleanup is a great way to free up disk space.

Just note you might run into
https://issues.apache.org/jira/browse/CASSANDRA-9036 if you use a version
older than 2.0.15.
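
A minimal sketch of the usual procedure once all new nodes have finished
joining (the keyspace name and data directory path are placeholders; cleanup
rewrites sstables locally, so each node needs some free headroom while it runs):

# on every pre-existing node, keyspace by keyspace:
nodetool cleanup my_keyspace

# watch it run and watch the space come back:
nodetool compactionstats
df -h /var/lib/cassandra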



On Thu, Feb 14, 2019 at 10:20 AM Oleksandr Shulgin <
oleksandr.shul...@zalando.de> wrote:

> On Wed, Feb 13, 2019 at 6:47 PM Jeff Jirsa  wrote:
>
>> Depending on how bad data resurrection is, you should run it for any host
>> that loses a range. In vnodes, that's usually all hosts.
>>
>> Cleanup with LCS is very cheap. Cleanup with STCS/TWCS is a bit more work.
>>
>
> Wait, doesn't cleanup just rewrite every SSTable one by one?  Why would
> compaction strategy matter?  Do you mean that after cleanup STCS may pick
> some resulting tables to re-compact them due to the min/max size
> difference, which would not be the case with LCS?
>
>
>> If you're just TTL'ing all data, it may not be worth the effort.
>>
>
> Indeed, but in our case the main reason to scale out is that the nodes are
> running out of disk space, so we really want to get rid of the extra copies.
>
> --
> Alex
>
>


Re: forgot to run nodetool cleanup

2019-02-14 Thread Oleksandr Shulgin
On Wed, Feb 13, 2019 at 6:47 PM Jeff Jirsa  wrote:

> Depending on how bad data resurrection is, you should run it for any host
> that loses a range. In vnodes, that's usually all hosts.
>
> Cleanup with LCS is very cheap. Cleanup with STCS/TWCS is a bit more work.
>

Wait, doesn't cleanup just rewrite every SSTable one by one?  Why would
compaction strategy matter?  Do you mean that after cleanup STCS may pick
some resulting tables to re-compact them due to the min/max size
difference, which would not be the case with LCS?


> If you're just TTL'ing all data, it may not be worth the effort.
>

Indeed, but in our case the main reason to scale out is that the nodes are
running out of disk space, so we really want to get rid of the extra copies.

--
Alex