Re: Right sizing Cassandra data nodes

2018-02-28 Thread kurt greaves
The problem with higher densities is operations, not querying. When you
need to add nodes, repair, or do any streaming operation, having more than 3TB
per node becomes more difficult. It's certainly doable, but you'll probably
run into issues. Having said that, an insert-only workload is the best
candidate for higher densities.

I'll note that you don't really need to bucket by partition: if you can use
clustering keys (e.g. a timestamp), Cassandra will be smart enough to only
read from the SSTables that contain the relevant rows.
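
As a sketch (hypothetical table, assuming a keyspace "ks" already exists;
DataStax Java driver of that era), a timestamp clustering column means a query
for recent rows only touches SSTables whose min/max timestamps overlap the
requested range:

    import com.datastax.driver.core.Cluster;
    import com.datastax.driver.core.ResultSet;
    import com.datastax.driver.core.Row;
    import com.datastax.driver.core.Session;

    public class TimeSeriesSketch {
        public static void main(String[] args) {
            try (Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
                 Session session = cluster.connect()) {
                // Hypothetical schema: month bucket as partition key (as in the
                // question below), event time as the clustering key.
                session.execute("CREATE TABLE IF NOT EXISTS ks.events ("
                        + " month_bucket int,"        // months since 1970
                        + " event_time timestamp,"
                        + " payload text,"
                        + " PRIMARY KEY ((month_bucket), event_time))"
                        + " WITH CLUSTERING ORDER BY (event_time DESC)");

                // Only SSTables containing rows from the last day are read.
                ResultSet rs = session.execute(
                        "SELECT * FROM ks.events WHERE month_bucket = ? AND event_time >= ?",
                        577,  // e.g. February 2018
                        new java.util.Date(System.currentTimeMillis() - 86_400_000L));
                for (Row row : rs)
                    System.out.println(row.getTimestamp("event_time"));
            }
        }
    }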

But to answer your question, all data is active data. There is no inactive
data. If all you query is the past two months, that's the only data that
will be read by Cassandra. It won't go and read old data unless you tell it
to.

On 24 February 2018 at 07:02, onmstester onmstester wrote:

> Another question on node density, in this scenario:
> 1. we should keep time series data of some years for a heavy write system
> in Cassandra (>10K ops per second)
> 2. the system is insert only and inserted data would never be updated
> 3. for the partition key, we used the number of months since 1970, so data for
> every month would be on a separate partition
> 4. because of rule 2, after the end of a month the previous partitions would
> never be accessed by write requests
> 5. more than 90% of read requests would concern the current month's partitions,
> so we rarely access old data; we just keep it for that 10% of
> reports!
> 6. the overall reads in comparison to writes are so small (like 0.0001% of
> overall time)
>
> So, finally, the question:
> Even in this scenario, would the active data be the whole data (this month
> + all previous months)? Or the data which would be accessed by most reads
> and writes (only the past two months)?
> Could I use more than 3TB per node for this scenario?
>
> Sent using Zoho Mail 
>
>
>  On Tue, 20 Feb 2018 14:58:39 +0330 *Rahul Singh* wrote:
>
> Node density is active data managed in the cluster divided by the number
> of active nodes. E.g., if you have 500TB of active data under management,
> then you would need 250-500 nodes to get beast-like optimum performance. It
> also depends on how much memory is on the boxes and whether you are using SSD
> drives. SSDs don't replace memory, but they don't hurt.
>
> --
> Rahul Singh
> rahul.si...@anant.us
>
> Anant Corporation
>
> On Feb 19, 2018, 5:55 PM -0500, Charulata Sharma (charshar) <
> chars...@cisco.com>, wrote:
>
> Thanks for the response Rahul. I did not understand the “node density”
> point.
>
>
>
> Charu
>
>
>
> *From:* Rahul Singh 
> *Reply-To:* "user@cassandra.apache.org" 
> *Date:* Monday, February 19, 2018 at 12:32 PM
> *To:* "user@cassandra.apache.org" 
> *Subject:* Re: Right sizing Cassandra data nodes
>
>
>
> 1. I would keep OpsCenter on a different cluster. Why unnecessarily put
> traffic and computing for OpsCenter data on a real business data cluster?
> 2. Don’t put more than 1-2 TB per node. Maybe 3TB. As node density
> increases, it creates more replication, read repairs, etc., and memory usage
> for doing the compactions.
> 3. You can have as much as you want for snapshots as long as you have it on
> another disk or even move it to a SAN / NAS. All you may care about is the
> most recent snapshot on the physical machine / disks on a live node.
>
>
> --
> Rahul Singh
> rahul.si...@anant.us
>
> Anant Corporation
>
>
> On Feb 19, 2018, 3:08 PM -0500, Charulata Sharma (charshar) <
> chars...@cisco.com>, wrote:
>
> Hi All,
>
>
>
> Looking for some insight into how application data archive and purge is
> carried out for a C* database. Are there standard guidelines on calculating
> the amount of space that can be used for storing data on a specific node?
>
>
>
> Some pointers that I got while researching are:
>
>
>
> - Allocate 50% space for compaction, e.g. if data size is 50GB
> then allocate 25GB for compaction.
>
> - Snapshot strategy: if old snapshots are present, they
> occupy disk space.
>
> - Allocate some percentage of storage (  ) for system tables
> and OpsCenter tables?
>
>
>
> We have a scenario where certain transaction data needs to be archived
> based on business rules and some purged, so before deciding on an
> archiving strategy, I am trying to analyze how much transactional data
> can be stored given the current node capacity.
>
> I also found out that the space available metric shown in OpsCenter is not
> very reliable because it doesn't show the snapshot space. In our case, we
> have a huge snapshot size. For some unexplained reason, we seem to be
> taking snapshots of our data every hour and purging them only after 7 days.
>
>
>
>
>
> Thanks,
>
> Charu
>
> Cisco Systems.


Re: The home page of Cassandra is mobile friendly but the link to the third parties is not

2018-02-28 Thread kurt greaves
Already addressed in CASSANDRA-14128, however we're waiting on
review/comments regarding what we actually do with this page.

If you want to bring attention to JIRAs, the user list is probably
appropriate. I'd avoid spamming it too much, though.

On 26 February 2018 at 19:22, Kenneth Brotman wrote:

> The home page of Cassandra is mobile friendly but the link to the third
> parties from that web page is not.  Any suggestions?
>
>
>
> I made a JIRA for it: https://issues.apache.org/jira/browse/CASSANDRA-14263
>
>
>
> Should posts about JIRAs be on this list or the dev list?
>
>
>
> Kenneth Brotman


RE: [EXTERNAL] Re: Version Rollback

2018-02-28 Thread Durity, Sean R
My short answer is always – there are no rollbacks, we only go forward. Jeff’s
answer is much more complete and technically precise. You *could* roll back a
few nodes (depending on topology) by just replacing them as if they had died.

I always upgrade all nodes (the binaries) as quickly as possible (but one node
at a time). The application stays up, stays happy, and my customers love 
“always up” Cassandra. I have clusters where we have done 3 or more major 
upgrades with 0 downtime for the application. One of the best things about 
supporting Cassandra! One node at a time upgrades can also be automated (which 
we have done).

After upgrading binaries on all nodes, I execute upgradesstables on groups of 
nodes (depending on load, hardware, cluster size, etc.). Reasoning: You cannot 
do any streaming operations (bootstrap, repairs) in a mixed-version cluster 
(except for maybe very minor version upgrades).


Sean Durity
From: shalom sagges [mailto:shalomsag...@gmail.com]
Sent: Wednesday, February 28, 2018 3:54 AM
To: user@cassandra.apache.org
Subject: [EXTERNAL] Re: Version Rollback

These are really good directions. Thanks a lot everyone!
@Kenneth - The cluster is comprised of 44 nodes, version 2.0.14, ~2.5TB of data 
per node. It's gonna be a major version upgrade (or upgrades to be exact... 
version 3.x is the target).

@Jeff, I have a passive DC. What if I upgrade the passive DC and if all goes 
well, move the applications to work with the passive DC and then upgrade the 
active DC. Is this doable?
Also, would you suggest upgrading one node (binaries), upgrading the SSTables,
and moving to the second node, and then the third, etc.? Or first upgrading the
binaries on all nodes, and only then starting with the SSTables upgrade?
Thanks!


On Tue, Feb 27, 2018 at 7:47 PM, Jeff Jirsa wrote:
MOST minor versions support rollback - the exceptions are those where internode 
protocol changes (3.0.14 being the only one in recent memory), or where sstable 
format changes (again rare). No major versions support rollback - the only way 
to do it is to upgrade in a way that you can effectively reinstall the old 
version without data loss.

The steps usually look like:

Test in a lab
Test in a lab again
Test in a lab a few more times
Snapshot everything

If you have a passive data center:
- upgrade one instance
- check to see if it’s happy
- upgrade another
- check to see if it’s happy
- continue until the passive dc is done
- if at any point they’re unhappy rebuild (wipe and restream the old version) 
the dc from the active dc

On the active DCs, you’ll want to canary it one replica at a time so you can 
treat a failed upgrade like a bad disk:
- upgrade one instance
- check if it’s happy; if it’s not treat it like a failed disk and replace it 
with the old version
- if you’re using single token, do another instance in a different replica set, 
repeat until you’re out of different replicas.
- if you’re using vnodes but a rack aware snitch and have more racks than your 
RF, do another instance in the same rack as the canary, repeat until you’re out 
of instances in that rack

This is typically your point of no return - as soon as you have two replicas in 
the new version there’s no more rollback practical.


--
Jeff Jirsa


On Feb 27, 2018, at 9:22 AM, Carl Mueller wrote:
My speculation is that IF (big if) the sstable formats are compatible between
the versions, which probably isn't the case for major versions, then you could
drop back.

If the sstables changed format, then you'll probably need to figure out how to 
rewrite the sstables in the older format and then sstableloader them in the 
older-version cluster if need be. Alas, while there is an sstable upgrader, 
there isn't a downgrader AFAIK.

And I don't have an intimate view of version-by-version sstable format changes 
and compatibilities. You'd probably need to check the upgrade instructions 
(which you presumably did if you're upgrading versions) to tell.

Basically, version rollback is pretty unlikely to be done.

The OTHER option:

1) build a new cluster with the new version, no new data.

2) code your driver interfaces to interface with both clusters. Write to both, 
but read preferentially from the new, then fall through to the old. Yes, that 
gets hairy on multiple row queries. Port your data with sstable loading from 
the old to the new gradually.

When you've done a full load of all the data from old to new, and you're 
satisfied with the new cluster stability, retire the old cluster.

For merging two multirow sets you'll probably need your multirow queries to
return the partition hash value (or extract the code that generates the hash),
and have two simultaneous java-driver ResultSets going, and merge their results,
providing the illusion of a single database query. You'll need to pay attention
to both the row key ordering and column key ordering to ensure the combined
results are properly ordered.
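
A rough sketch of the dual-cluster read/write path (made-up table and contact
points, DataStax Java driver of that era; the multi-row merge described above
is elided):

    import com.datastax.driver.core.Cluster;
    import com.datastax.driver.core.Row;
    import com.datastax.driver.core.Session;

    public class DualClusterDao {
        private final Session newSession;
        private final Session oldSession;

        DualClusterDao(String newContact, String oldContact) {
            newSession = Cluster.builder().addContactPoint(newContact).build().connect("ks");
            oldSession = Cluster.builder().addContactPoint(oldContact).build().connect("ks");
        }

        // Writes go to both clusters so the new one converges on the old.
        void insert(String id, String payload) {
            String cql = "INSERT INTO events (id, payload) VALUES (?, ?)";
            newSession.execute(cql, id, payload);
            oldSession.execute(cql, id, payload);
        }

        // Single-row read: prefer the new cluster, fall through to the old
        // one for data that hasn't been ported yet.
        Row findById(String id) {
            Row row = newSession.execute("SELECT * FROM events WHERE id = ?", id).one();
            return row != null ? row
                    : oldSession.execute("SELECT * FROM events WHERE id = ?", id).one();
        }
    }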

Re: JMX metrics for CL

2018-02-28 Thread Eric Evans
On Tue, Feb 27, 2018 at 2:26 AM, Kyrylo Lebediev wrote:

> Hello!
>
>
> Is it possible to get counters from the C* side regarding CQL queries
> executed since startup, for each CL?
> For example:
> CL ONE: NNN queries
> CL QUORUM: MMM queries
> etc
>

It's possible to get a count of client requests.  You want the count
attribute of the client-request latency histogram (closest thing to
documentation here:
http://cassandra.apache.org/doc/latest/operating/metrics.html#client-request-metrics).
For the scope, there are more request-types than are listed in the
documentation; for each CL there is a read/write scope, a la
(Read|Write)-(LOCAL-QUORUM,LOCAL-ONE,QUORUM,ONE,...).
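
For example, reading one of those counts over JMX (a sketch; I'm assuming the
Read-LOCAL_QUORUM spelling of the scope here, so verify the exact ObjectName
against your version with a JMX browser first):

    import javax.management.MBeanServerConnection;
    import javax.management.ObjectName;
    import javax.management.remote.JMXConnector;
    import javax.management.remote.JMXConnectorFactory;
    import javax.management.remote.JMXServiceURL;

    public class ClientRequestCounts {
        public static void main(String[] args) throws Exception {
            // Cassandra listens for JMX on port 7199 by default.
            JMXServiceURL url = new JMXServiceURL(
                    "service:jmx:rmi:///jndi/rmi://localhost:7199/jmxrmi");
            try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
                MBeanServerConnection mbs = connector.getMBeanServerConnection();
                ObjectName latency = new ObjectName(
                        "org.apache.cassandra.metrics:type=ClientRequest,"
                        + "scope=Read-LOCAL_QUORUM,name=Latency");
                // The latency histogram's Count attribute is the number of
                // requests at that CL since startup.
                Object count = mbs.getAttribute(latency, "Count");
                System.out.println("Read-LOCAL_QUORUM requests: " + count);
            }
        }
    }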

An example dashboard:

https://grafana.wikimedia.org/dashboard/db/cassandra-client-request?panelId=1=1=eqiad%20prometheus%2Fservices=restbase=All=99p

Note: This graph suppresses series with a rate of zero for the graph
time-span, so you won't see all possible request-types.



-- 
Eric Evans
john.eric.ev...@gmail.com


Re: JMX metrics for CL

2018-02-28 Thread Simon Fontana Oscarsson
It's not a change in the source code though, since it's a plugin. You
simply do the following:


 * Implement the QueryHandler interface with new metrics (a rough sketch
   follows below).
 * Add the compiled jar to the CLASSPATH in cassandra.in.sh.
 * In cassandra-env.sh, set -Dcassandra.custom_query_handler_class to
   your custom class, e.g. JVM_OPTS="$JVM_OPTS
   -Dcassandra.custom_query_handler_class=example.com.MyQueryHandler".
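
A rough sketch of such a handler (this assumes the 3.0-era QueryHandler
signatures, which differ between Cassandra versions, and it elides the
remaining interface methods, which would delegate to QueryProcessor.instance
the same way; treat it as an outline, not a drop-in implementation):

    import java.nio.ByteBuffer;
    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.atomic.LongAdder;

    import org.apache.cassandra.cql3.QueryHandler;
    import org.apache.cassandra.cql3.QueryOptions;
    import org.apache.cassandra.cql3.QueryProcessor;
    import org.apache.cassandra.db.ConsistencyLevel;
    import org.apache.cassandra.service.QueryState;
    import org.apache.cassandra.transport.messages.ResultMessage;

    // Counts queries per consistency level, then delegates to the default handler.
    public class ConsistencyCountingHandler implements QueryHandler
    {
        private static final ConcurrentHashMap<ConsistencyLevel, LongAdder> COUNTS =
                new ConcurrentHashMap<>();

        private static void record(ConsistencyLevel cl)
        {
            COUNTS.computeIfAbsent(cl, c -> new LongAdder()).increment();
        }

        public ResultMessage process(String query, QueryState state, QueryOptions options,
                                     Map<String, ByteBuffer> customPayload)
        {
            record(options.getConsistency());
            return QueryProcessor.instance.process(query, state, options, customPayload);
        }

        // prepare / getPrepared / processPrepared / processBatch omitted here;
        // they delegate to QueryProcessor.instance and call record(...) wherever
        // a QueryOptions is available.
    }

You'd then expose COUNTS through an MBean (or proper metrics) to read it over JMX.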

BR,
Simon

On 2018-02-28 11:21, Kyrylo Lebediev wrote:


Thanks, Horia!

I wouldn't like to introduce any changes in the source code.


Are there any alternatives for tracing the CLs used, from the C* side? If not
'since startup', then at least online metrics will be fine.



Regards,

Kyrill


*From:* Horia Mocioi 
*Sent:* Tuesday, February 27, 2018 7:38:23 PM
*To:* user@cassandra.apache.org
*Subject:* Re: JMX metrics for CL
Hello,

There are no such metrics that I am aware of.

One way you could do this is to have your own implementation of
QueryHandler with your own metrics, filter the queries based on the
CL, and increment the corresponding metric.


Then, in cassandra-env.sh you could specify to use your class using 
 -Dcassandra.custom_query_handler_class.


HTH,
Horia
On Tue, 2018-02-27 at 08:26 +, Kyrylo Lebediev wrote:


Hello!


Is it possible to get counters from the C* side regarding CQL queries
executed since startup, for each CL?

For example:
CL ONE: NNN queries
CL QUORUM: MMM queries
etc

Regards,

Kyrill





Cassandra Chunk Cache hints ?

2018-02-28 Thread Nicolas Guyomar
Hi everyone,

I'm trying to find information on the Chunk Cache, and the "only" thing I
found so far is the Jira
https://issues.apache.org/jira/browse/CASSANDRA-5863 where this
functionality was added.

I'm wondering if this is something to be adjusted when it's full? Are
there some rules of thumb for file_cache_size_in_mb?

As an example, on a quite new cluster with barely 100GB per node, this
cache is full:

Chunk Cache: entries 7680, size 480 MiB, capacity 480 MiB,
1059773 misses, 1964564 requests, 0.461 recent hit rate, 52360.062
microseconds miss latency

I'm wondering if I could benefit from reducing the number of "misses" by
increasing this cache?
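
(For reference, the setting I'm looking at is file_cache_size_in_mb in
cassandra.yaml; if I read the defaults right, it's capped at the smaller of
1/4 heap and 512MB, which would explain the 480 MiB capacity above. So the
experiment would be something like

    file_cache_size_in_mb: 1024

on each node, watching the hit rate afterwards.)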

Is anyone tuning this cache already for some use case ?

Thank you

@Kenneth, might be worth an entry in the documentation maybe?


Re: Batch too large exception

2018-02-28 Thread Marek Kadek -T (mkadek - CONSOL PARTNERS LTD at Cisco)
Hi,

Are you writing the batch to the same partition? If not, there is a much stricter
limit (I think 50KB).
Check https://docs.datastax.com/en/cql/3.3/cql/cql_using/useBatch.html and
followups.
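
If the rows really do go to different partitions, individual asynchronous
writes are usually a better fit than one big batch. A sketch with the Java
driver of that era (made-up table and columns):

    import java.util.ArrayList;
    import java.util.List;

    import com.datastax.driver.core.Cluster;
    import com.datastax.driver.core.PreparedStatement;
    import com.datastax.driver.core.ResultSetFuture;
    import com.datastax.driver.core.Session;

    public class UnbatchedWrites {
        public static void main(String[] args) {
            try (Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
                 Session session = cluster.connect("ks")) {
                PreparedStatement insert = session.prepare(
                        "INSERT INTO wide_table (id, col1) VALUES (?, ?)");
                List<ResultSetFuture> futures = new ArrayList<>();
                for (int i = 0; i < 10; i++) {
                    // Each row travels as its own request, so no single message
                    // has to fit under the batch size thresholds.
                    futures.add(session.executeAsync(insert.bind("row-" + i, "payload")));
                }
                // Wait for all writes to complete (check errors in real code).
                futures.forEach(ResultSetFuture::getUninterruptibly);
            }
        }
    }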

From: Goutham reddy 
Reply-To: "user@cassandra.apache.org" 
Date: Tuesday, February 27, 2018 at 9:55 PM
To: "user@cassandra.apache.org" 
Subject: Batch too large exception

Hi,
I have been getting a batch too large exception when performing WRITEs from the
client application. My insert size is 5MB, so I have to split the 10 insert
objects to insert in one go. It saves some inserts and closes after some
uncertain time. And it is a wide column table; we do have 113 columns. Can
anyone kindly advise what is going wrong in my execution? Appreciate your help.

Regards
Goutham Reddy



Re: JMX metrics for CL

2018-02-28 Thread Kyrylo Lebediev
Thanks, Horia!

I wouldn't like to introduce any changes in the source code.


Are there any alternatives for tracing the CLs used, from the C* side? If not
'since startup', then at least online metrics will be fine.


Regards,

Kyrill


From: Horia Mocioi 
Sent: Tuesday, February 27, 2018 7:38:23 PM
To: user@cassandra.apache.org
Subject: Re: JMX metrics for CL

Hello,

There are no such metrics that I am aware of.

One way you could do this is to have your own implementation of QueryHandler
with your own metrics, filter the queries based on the CL, and increment the
corresponding metric.

Then, in cassandra-env.sh you could specify to use your class using  
-Dcassandra.custom_query_handler_class.

HTH,
Horia
On Tue, 2018-02-27 at 08:26 +, Kyrylo Lebediev wrote:

Hello!


Is it possible to get counters from the C* side regarding CQL queries executed
since startup, for each CL?
For example:
CL ONE: NNN queries
CL QUORUM: MMM queries
etc

Regards,

Kyrill


Re: Cassandra on high performance machine: virtualization vs Docker

2018-02-28 Thread Kyrylo Lebediev
In terms of Cassandra, a rack is considered a single point of failure. So, using
a rack-aware snitch (GossipingPropertyFileSnitch would be the best for your
case), Cassandra won't place multiple replicas of the same range in the same
rack.


Basically, there are two requirements that should be met:

1) The number of C* racks must be not less than the chosen RF [not less than 3 for RF=3].

2) [Recommended] The number of nodes should be [more or less] the same for all racks.
In this case data/workload will be evenly balanced across all racks and nodes.


With requirement #1 met, the failure of a rack would cause just the unavailability
of one replica of 3 for some token ranges.

Of course, queries at consistency level ALL won't work in this case, but it's
not typical to use CL=ALL with Cassandra. BTW, which CLs will you use?


In case you use Cassandra version >= 3.0, you may lower the number of vnodes per
server to 32 [maybe even to 16]. This will reduce the overhead of anti-entropy
repairs, AFAIK. Reference: https://issues.apache.org/jira/browse/CASSANDRA-7032
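
With GossipingPropertyFileSnitch, each node declares its DC/rack in
cassandra-rackdc.properties, one file per node; e.g. for the three nodes on
your first physical server (the names here are just an example):

    dc=DC1
    rack=server1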


Kind Regards,

Kyrill


From: onmstester onmstester 
Sent: Wednesday, February 28, 2018 10:11:19 AM
To: user
Subject: Re: Cassandra on high performance machine: virtualization vs Docker

Thanks
Unfortunately yes! This is production; that's the only thing I have! And I'm
going to use ESX (I'm not worried about throughput overhead, although stress
tests show no problem with ESX; something like: throughput of Cassandra on a
single physical server < 3 * nodes on the same server).
If I use 3 nodes per physical server, using RF=3, and configure the nodes on
every physical server to be in the same rack (Cassandra config), and one of my
servers (with 3 nodes on it) failed, would the data be lost? And would my
application fail (using write consistency = 3 and read consistency = 1)?


Sent using Zoho Mail


On Wed, 28 Feb 2018 08:13:01 +0330 daemeon reiydelle wrote:

Docker will provide less per node overhead.

And yes, virtualizing smaller nodes out of a bigger physical machine makes sense.
Of course you lose the per-node failure protection, but I guess this is not
production?


<==>
"Who do you think made the first stone spear? The Asperger guy.
If you get rid of the autism genetics, there would be no Silicon Valley"
Temple Grandin
Daemeon C.M. Reiydelle
San Francisco 1.415.501.0198
London 44 020 8144 9872

On Tue, Feb 27, 2018 at 8:26 PM, onmstester onmstester wrote:


What I've got to set up my Apache Cassandra cluster are some servers with
20-core CPUs * 2 threads, 128 GB RAM, and 8 * 2TB disks.
Having read all over the web "do not use big nodes for your cluster", I'm
convinced to run multiple nodes on a single physical server.
So the question is which technology should I use: Docker or virtualization
(ESX)? Any experience?

Sent using Zoho Mail






Re: Version Rollback

2018-02-28 Thread shalom sagges
These are really good directions. Thanks a lot everyone!

@Kenneth - The cluster is comprised of 44 nodes, version 2.0.14, ~2.5TB of
data per node. It's gonna be a major version upgrade (or upgrades to be
exact... version 3.x is the target).

@Jeff, I have a passive DC. What if I upgrade the passive DC and if all
goes well, move the applications to work with the passive DC and then
upgrade the active DC. Is this doable?
Also, would you suggest upgrading one node (binaries), upgrading the
SSTables, and moving to the second node, and then the third, etc.? Or first
upgrading the binaries on all nodes, and only then starting with the SSTables
upgrade?

Thanks!



On Tue, Feb 27, 2018 at 7:47 PM, Jeff Jirsa  wrote:

> MOST minor versions support rollback - the exceptions are those where
> internode protocol changes (3.0.14 being the only one in recent memory), or
> where sstable format changes (again rare). No major versions support
> rollback - the only way to do it is to upgrade in a way that you can
> effectively reinstall the old version without data loss.
>
> The steps usually look like:
>
> Test in a lab
> Test in a lab again
> Test in a lab a few more times
> Snapshot everything
>
> If you have a passive data center:
> - upgrade one instance
> - check to see if it’s happy
> - upgrade another
> - check to see if it’s happy
> - continue until the passive dc is done
> - if at any point they’re unhappy rebuild (wipe and restream the old
> version) the dc from the active dc
>
> On the active DCs, you’ll want to canary it one replica at a time so you
> can treat a failed upgrade like a bad disk:
> - upgrade one instance
> - check if it’s happy; if it’s not treat it like a failed disk and replace
> it with the old version
> - if you’re using single token, do another instance in a different replica
> set, repeat until you’re out of different replicas.
> - if you’re using vnodes but a rack aware snitch and have more racks than
> your RF, do another instance in the same rack as the canary, repeat until
> you’re out of instances in that rack
>
> This is typically your point of no return - as soon as you have two
> replicas in the new version there’s no more rollback practical.
>
>
>
> --
> Jeff Jirsa
>
>
> On Feb 27, 2018, at 9:22 AM, Carl Mueller wrote:
>
> My speculation is that IF (big if) the sstable formats are compatible
> between the versions, which probably isn't the case for major versions,
> then you could drop back.
>
> If the sstables changed format, then you'll probably need to figure out
> how to rewrite the sstables in the older format and then sstableloader them
> in the older-version cluster if need be. Alas, while there is an sstable
> upgrader, there isn't a downgrader AFAIK.
>
> And I don't have an intimate view of version-by-version sstable format
> changes and compatibilities. You'd probably need to check the upgrade
> instructions (which you presumably did if you're upgrading versions) to
> tell.
>
> Basically, version rollback is pretty unlikely to be done.
>
> The OTHER option:
>
> 1) build a new cluster with the new version, no new data.
>
> 2) code your driver interfaces to interface with both clusters. Write to
> both, but read preferentially from the new, then fall through to the old.
> Yes, that gets hairy on multiple row queries. Port your data with sstable
> loading from the old to the new gradually.
>
> When you've done a full load of all the data from old to new, and you're
> satisfied with the new cluster stability, retire the old cluster.
>
> For merging two multirow sets you'll probably need your multirow queries
> to return the partition hash value (or extract the code that generates the
> hash), and have two simultaneous java-driver ResultSets going, and merge
> their results, providing the illusion of a single database query. You'll
> need to pay attention to both the row key ordering and column key ordering
> to ensure the combined results are properly ordered.
>
> Writes will be slowed by the double-writes; for reads, you'll be bound by the
> worse-performing cluster.
>
> On Tue, Feb 27, 2018 at 8:23 AM, Kenneth Brotman <
> kenbrot...@yahoo.com.invalid> wrote:
>
>> Could you tell us the size and configuration of your Cassandra cluster?
>>
>>
>>
>> Kenneth Brotman
>>
>>
>>
>> *From:* shalom sagges [mailto:shalomsag...@gmail.com]
>> *Sent:* Tuesday, February 27, 2018 6:19 AM
>> *To:* user@cassandra.apache.org
>> *Subject:* Version Rollback
>>
>>
>>
>> Hi All,
>>
>> I'm planning to upgrade my C* cluster to version 3.x and was wondering
>> what's the best way to perform a rollback if need be.
>>
>> If I used snapshot restoration, I would be facing data loss, depending on when
>> I took the snapshot (i.e. a rollback might be required after upgrading half
>> the cluster, for example).
>>
>> If I add another DC to the cluster with the old version, then I could
>> point the apps to talk to that DC if anything bad happens, but building it
>> is really time consuming.

Re: Cassandra on high performance machine: virtualization vs Docker

2018-02-28 Thread onmstester onmstester
Thanks

Unfortunately yes! This is production; that's the only thing I have! And I'm
going to use ESX (I'm not worried about throughput overhead, although stress
tests show no problem with ESX; something like: throughput of Cassandra on a
single physical server < 3 * nodes on the same server).

If I use 3 nodes per physical server, using RF=3, and configure the nodes on
every physical server to be in the same rack (Cassandra config), and one of my
servers (with 3 nodes on it) failed, would the data be lost? And would my
application fail (using write consistency = 3 and read consistency = 1)?


Sent using Zoho Mail






 On Wed, 28 Feb 2018 08:13:01 +0330 daemeon reiydelle daeme...@gmail.com wrote:

Docker will provide less per node overhead.

And yes, virtualizing smaller nodes out of a bigger physical machine makes
sense. Of course you lose the per-node failure protection, but I guess this is
not production?

<==>
"Who do you think made the first stone spear? The Asperger guy.
If you get rid of the autism genetics, there would be no Silicon Valley"
Temple Grandin
Daemeon C.M. Reiydelle
San Francisco 1.415.501.0198
London 44 020 8144 9872

On Tue, Feb 27, 2018 at 8:26 PM, onmstester onmstester onmstes...@zoho.com wrote:

What I've got to set up my Apache Cassandra cluster are some servers with
20-core CPUs * 2 threads, 128 GB RAM, and 8 * 2TB disks.

Having read all over the web "do not use big nodes for your cluster", I'm
convinced to run multiple nodes on a single physical server.

So the question is which technology should I use: Docker or virtualization
(ESX)? Any experience?



Sent using Zoho Mail