Re: Changing replication factor of Cassandra cluster

2015-01-06 Thread Pranay Agarwal
Thanks Robert. Also, I have seen the node-repair operation fail on some
nodes. What are the chances of the data getting corrupted if node-repair
fails? I am okay with data availability issues for some time as long as I
don't lose or corrupt data. Also, is there a way to restore the graph
from just the data backup, without having to back up the token ring ranges?

-Pranay

On Mon, Dec 29, 2014 at 1:58 PM, Robert Coli rc...@eventbrite.com wrote:

 On Mon, Dec 29, 2014 at 1:40 PM, Pranay Agarwal agarwalpran...@gmail.com
 wrote:

 I want to understand the best way to increase/change the replication
 factor of a Cassandra cluster. My priority is consistency, and I can
 probably tolerate some downtime of the cluster. Is it totally weird to
 change the replication factor later, or have people done it in production
 environments before?


 The way you are doing it is fine, but risks false-negative reads.

 Basically, if you ask the wrong node "does this key exist?" before it is
 repaired, you will get the answer "no" when in fact the key does exist on
 the node that owned it under RF=1. Unfortunately the only way to avoid this
 case is to do all reads with ConsistencyLevel.ALL until the whole cluster
 is repaired.

 =Rob



Re: Changing replication factor of Cassandra cluster

2015-01-06 Thread Robert Coli
On Tue, Jan 6, 2015 at 4:40 PM, Pranay Agarwal agarwalpran...@gmail.com
wrote:

 Thanks Robert. Also, I have seen the node-repair operation fail on
 some nodes. What are the chances of the data getting corrupted if
 node-repair fails?


If repair does not complete within gc_grace_seconds, the chance of data
getting corrupted is non-zero and depends on how under-replicated your
tombstones are.

You probably want to increase gc_grace_seconds if you're experiencing
failing repairs, as the default value (10 days) is ambitious in non-toy
clusters. I personally recommend 34 days so that you can start repair on the
first of every month and have up to 7 days to complete the repair.
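
A minimal sketch of the unit conversion behind that recommendation, in Python. gc_grace_seconds is a per-table setting, so the change is applied with an ALTER TABLE statement; the keyspace and table names used here (my_keyspace, my_table) are hypothetical placeholders, not names from this thread.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Cassandra's default gc_grace_seconds is 864000, i.e. 10 days.
DEFAULT_GC_GRACE_SECONDS = 10 * SECONDS_PER_DAY


def gc_grace_seconds(days: int) -> int:
    """Convert a grace period in days to the seconds value the setting expects."""
    return days * SECONDS_PER_DAY


def alter_table_cql(keyspace: str, table: str, days: int) -> str:
    """Build the CQL statement that sets gc_grace_seconds on one table."""
    return (
        f"ALTER TABLE {keyspace}.{table} "
        f"WITH gc_grace_seconds = {gc_grace_seconds(days)};"
    )


if __name__ == "__main__":
    # my_keyspace and my_table are hypothetical names for illustration only.
    print(alter_table_cql("my_keyspace", "my_table", 34))
```

The statement would need to be run for every table whose repairs are at risk of overrunning the grace period.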

=Rob


Re: Changing replication factor of Cassandra cluster

2014-12-29 Thread Pranay Agarwal
Thanks Ryan.

I want to understand the best way to increase/change the replication
factor of a Cassandra cluster. My priority is consistency, and I can
probably tolerate some downtime of the cluster. Is it totally weird to
change the replication factor later, or have people done it in production
environments before?

On Tue, Dec 16, 2014 at 9:47 AM, Ryan Svihla rsvi...@datastax.com wrote:

 Repair performance varies heavily with a large number of factors. Hours
 for one node to finish is within the range of what I see in the wild;
 again, there are so many factors that it's impossible to say whether that
 is good or bad for your cluster. Factors that matter include:

    1. speed of disk io
    2. amount of ram and cpu on each node
    3. network interface speed
    4. is this multidc or not
    5. are vnodes enabled or not
    6. what are the jvm tunings
    7. compaction settings
    8. current load on the cluster
    9. streaming settings

 Suffice it to say, improving repair performance is a full-on tuning
 exercise. Note that your current operation is going to be slower than a
 traditional repair, as you're streaming copies of data around and not just
 doing the normal Merkle tree work.

 Restoring from backup to a new cluster (including how to handle token
 ranges) is discussed in detail here:
 http://www.datastax.com/documentation/cassandra/2.0/cassandra/operations/ops_snapshot_restore_new_cluster.html


 On Mon, Dec 15, 2014 at 4:14 PM, Pranay Agarwal agarwalpran...@gmail.com
 wrote:

 Hi All,


 I have a 20-node Cassandra cluster with 500 GB of data and a replication
 factor of 1. I increased the replication factor to 3 and ran nodetool
 repair on each node one by one, as the docs say. But it takes hours for one
 node to finish repair. Is that normal or am I doing something wrong?

 Also, I took a backup of the Cassandra data on each node. How do I restore
 the graph in a new cluster of nodes using the backup? Do I have to have the
 token ranges backed up as well?

 -Pranay






Re: Changing replication factor of Cassandra cluster

2014-12-29 Thread Robert Coli
On Mon, Dec 29, 2014 at 1:40 PM, Pranay Agarwal agarwalpran...@gmail.com
wrote:

 I want to understand the best way to increase/change the replication
 factor of a Cassandra cluster. My priority is consistency, and I can
 probably tolerate some downtime of the cluster. Is it totally weird to
 change the replication factor later, or have people done it in production
 environments before?


The way you are doing it is fine, but risks false-negative reads.

Basically, if you ask the wrong node "does this key exist?" before it is
repaired, you will get the answer "no" when in fact the key does exist on
the node that owned it under RF=1. Unfortunately the only way to avoid this
case is to do all reads with ConsistencyLevel.ALL until the whole cluster
is repaired.
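
The false-negative case can be illustrated with a toy simulation in Python. This is only a sketch of the idea, not Cassandra's read path: replicas are modelled as plain dictionaries, timestamps and read repair are ignored, and the key and value are made up.

```python
from typing import Dict, List, Optional

Replica = Dict[str, str]


def read_one(replica: Replica, key: str) -> Optional[str]:
    """Consistency level ONE: trust whichever single replica was asked."""
    return replica.get(key)


def read_all(replicas: List[Replica], key: str) -> Optional[str]:
    """Consistency level ALL: consult every replica; any copy that exists is found."""
    for replica in replicas:
        if key in replica:
            return replica[key]
    return None


# State after raising RF from 1 to 3 but before repair has run:
# only the node that owned the row under RF=1 actually holds it.
original_owner: Replica = {"user:42": "alice"}
new_replica_a: Replica = {}
new_replica_b: Replica = {}
replicas = [new_replica_a, original_owner, new_replica_b]

if __name__ == "__main__":
    print(read_one(new_replica_a, "user:42"))  # None: the false negative
    print(read_all(replicas, "user:42"))       # alice
```

Once repair has copied the row to the two new replicas, a read at ONE from any of them returns the value and the stricter consistency level is no longer needed for correctness.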

=Rob


Re: Changing replication factor of Cassandra cluster

2014-12-16 Thread Ryan Svihla
Repair performance varies heavily with a large number of factors. Hours for
one node to finish is within the range of what I see in the wild; again,
there are so many factors that it's impossible to say whether that is good
or bad for your cluster. Factors that matter include:

   1. speed of disk io
   2. amount of ram and cpu on each node
   3. network interface speed
   4. is this multidc or not
   5. are vnodes enabled or not
   6. what are the jvm tunings
   7. compaction settings
   8. current load on the cluster
   9. streaming settings

Suffice it to say, improving repair performance is a full-on tuning
exercise. Note that your current operation is going to be slower than a
traditional repair, as you're streaming copies of data around and not just
doing the normal Merkle tree work.
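
As a rough, back-of-the-envelope sketch of how much data has to move, using the figures from the original question (20 nodes, 500 GB, RF 1 to 3): the even distribution across nodes and the 10 MB/s streaming rate are assumptions for illustration, and the result covers network transfer only, not validation or compaction.

```python
def extra_data_gb(total_gb: float, old_rf: int, new_rf: int) -> float:
    """Extra replica data created when RF rises; total_gb is the on-disk total at old_rf."""
    unique_gb = total_gb / old_rf
    return unique_gb * (new_rf - old_rf)


def per_node_gb(total_gb: float, nodes: int) -> float:
    """Share per node, assuming data is spread evenly (an assumption)."""
    return total_gb / nodes


def hours_to_stream(gb: float, mb_per_second: float) -> float:
    """Transfer time in hours at a sustained rate, ignoring all other repair work."""
    return gb * 1024 / mb_per_second / 3600


if __name__ == "__main__":
    extra = extra_data_gb(500, old_rf=1, new_rf=3)
    incoming = per_node_gb(extra, nodes=20)
    print(f"extra data cluster-wide: {extra:.0f} GB")
    print(f"incoming per node:       {incoming:.0f} GB")
    # 10 MB/s is a hypothetical sustained rate, not a measured one.
    print(f"transfer time per node:  {hours_to_stream(incoming, 10):.1f} h")
```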

Restoring from backup to a new cluster (including how to handle token
ranges) is discussed in detail here:
http://www.datastax.com/documentation/cassandra/2.0/cassandra/operations/ops_snapshot_restore_new_cluster.html
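
As a hedged illustration of the token-handling part: the new nodes need to own the same tokens as the nodes whose backups they receive, which is done through the initial_token setting in cassandra.yaml (a comma-separated list when a node owns several tokens). The sketch below only formats that line; the token values are made up, and collecting the real ones from the old cluster is left to the linked procedure.

```python
from typing import Iterable


def initial_token_line(tokens: Iterable[int]) -> str:
    """Render the cassandra.yaml line that pins a node to the given tokens."""
    return "initial_token: " + ",".join(str(t) for t in sorted(tokens))


# Hypothetical tokens recorded from one node of the old cluster.
old_node_tokens = [
    3074457345618258602,
    -9223372036854775808,
    -3074457345618258603,
]

print(initial_token_line(old_node_tokens))
```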


On Mon, Dec 15, 2014 at 4:14 PM, Pranay Agarwal agarwalpran...@gmail.com
wrote:

 Hi All,


 I have a 20-node Cassandra cluster with 500 GB of data and a replication
 factor of 1. I increased the replication factor to 3 and ran nodetool
 repair on each node one by one, as the docs say. But it takes hours for one
 node to finish repair. Is that normal or am I doing something wrong?

 Also, I took a backup of the Cassandra data on each node. How do I restore
 the graph in a new cluster of nodes using the backup? Do I have to have the
 token ranges backed up as well?

 -Pranay



-- 

Ryan Svihla

Solution Architect



Changing replication factor of Cassandra cluster

2014-12-15 Thread Pranay Agarwal
Hi All,


I have a 20-node Cassandra cluster with 500 GB of data and a replication
factor of 1. I increased the replication factor to 3 and ran nodetool repair
on each node one by one, as the docs say. But it takes hours for one node to
finish repair. Is that normal or am I doing something wrong?
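
For reference, the procedure described above can be sketched as a dry run in Python that only prints the statement and commands. The keyspace name and host names are hypothetical, and SimpleStrategy is assumed as the replication class; a multi-datacenter cluster would use NetworkTopologyStrategy instead.

```python
from typing import List


def alter_keyspace_cql(keyspace: str, replication_factor: int) -> str:
    """CQL that changes the replication factor (SimpleStrategy is an assumption)."""
    return (
        f"ALTER KEYSPACE {keyspace} WITH replication = "
        f"{{'class': 'SimpleStrategy', 'replication_factor': {replication_factor}}};"
    )


def repair_commands(hosts: List[str], keyspace: str) -> List[List[str]]:
    """One nodetool repair invocation per node, intended to be run one at a time."""
    return [["nodetool", "-h", host, "repair", keyspace] for host in hosts]


if __name__ == "__main__":
    # my_keyspace, node01 and node02 are placeholder names.
    print(alter_keyspace_cql("my_keyspace", 3))
    for command in repair_commands(["node01", "node02"], "my_keyspace"):
        print(" ".join(command))  # dry run: print instead of executing
```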

Also, I took a backup of the Cassandra data on each node. How do I restore
the graph in a new cluster of nodes using the backup? Do I have to have the
token ranges backed up as well?

-Pranay