Re: Node added, no performance boost -- are the tokens correct?
The DS docs go with "should" regarding setting the initial token to zero. It's not a must, but never having to move tokens on that node is convenient enough that I'm not sure why you wouldn't do it. If anyone has a compelling reason not to do so, I'm happy to hear it :)

On Fri, Apr 1, 2011 at 10:58 AM, Edward Capriolo edlinuxg...@gmail.com wrote:

On Fri, Apr 1, 2011 at 1:15 PM, Peter Schuller peter.schul...@infidyne.com wrote:

Now, I moved the tokens. I still observe that read latency deteriorated with 3 machines vs original one. Replication factor is 1, Cassandra version 0.7.2 (didn't have time to upgrade as I need results by this weekend).

Read *latency* is fully expected to increase if you just add a node. *Throughput* should increase, unless you have a workload that manages to be more expensive on RPC than actual reads/writes. Latency would only be improved by additional nodes under some significant load.

How are you benchmarking? Are you concurrently submitting requests to all nodes at the same time? Try using stress.py from the Cassandra tree as a comparison. If you're sending one request at a time, there is no expectation at all of a performance improvement - just a decrease in performance.

-- / Peter Schuller

To be clear on this issue: it does not matter where the tokens start; it only matters that they are equally spaced around the token space.

So for a 4-node cluster your tokens should be either

1 * ((2^127) / 4) = 42535295865117307932921825928971026432
2 * ((2^127) / 4) = 85070591730234615865843651857942052864
3 * ((2^127) / 4) = 127605887595351923798765477786913079296
4 * ((2^127) / 4) = 170141183460469231731687303715884105728

or

0 * ((2^127) / 4) = 0
1 * ((2^127) / 4) = 42535295865117307932921825928971026432
2 * ((2^127) / 4) = 85070591730234615865843651857942052864
3 * ((2^127) / 4) = 127605887595351923798765477786913079296

If you move one you have to move the rest, because the distance between 170141183460469231731687303715884105728 and 0 is 1.
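The spacing rule above can be sketched in a few lines of Python. This is only an illustration, not the ctokens.py script mentioned elsewhere in this thread: the function name evenly_spaced_tokens and its start parameter are made up for the example, and the ring size of 2**127 is taken from the arithmetic above.

```python
RING_SIZE = 2 ** 127  # token space used in the arithmetic above


def evenly_spaced_tokens(node_count, start=0):
    """Return node_count tokens spaced RING_SIZE // node_count apart.

    start is the multiplier of the first token (hypothetical parameter):
    0 gives the list that begins at 0, 1 gives the list that ends at 2**127.
    """
    step = RING_SIZE // node_count
    return [(start + i) * step for i in range(node_count)]


if __name__ == "__main__":
    # Print the zero-based token list for a 4-node cluster.
    for token in evenly_spaced_tokens(4):
        print(token)
```

Either list is equally spaced; the zero-based one simply keeps the first node at token 0.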
Re: Node added, no performance boost -- are the tokens correct?
A script that I have says the following:

$ python ctokens.py
How many nodes are in your cluster? 2
node 0: 0
node 1: 85070591730234615865843651857942052864

The first token should be zero, for the reasons discussed here: http://www.datastax.com/dev/tutorials/getting_started_0_7/configuring#initial-token-values

More details are available in http://www.datastax.com/docs/0.7/operations/clustering#adding-capacity

The DS docs have some weak areas, but these two pages have been pretty well vetted over the past months :)

On Thu, Mar 31, 2011 at 3:06 PM, buddhasystem potek...@bnl.gov wrote:

I just configured a cluster of two nodes -- do these token values make sense? The reason I'm asking is that so far I don't see load balancing happening, judging from performance.

Address          Status  State   Load       Owns    Token
                                                    170141183460469231731687303715884105728
130.199.185.194  Up      Normal  153.52 GB  50.00%  85070591730234615865843651857942052864
130.199.185.193  Up      Normal  199.82 GB  50.00%  170141183460469231731687303715884105728

-- View this message in context: http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Node-added-no-performance-boost-are-the-tokens-correct-tp6228872p6228872.html
Sent from the cassandra-u...@incubator.apache.org mailing list archive at Nabble.com.
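As a rough cross-check of the "Owns" column in the ring output quoted above, the share of the ring each token covers can be computed directly. This is a sketch under assumptions, not what nodetool does internally: it treats the ring as having 2**127 positions, assumes each node owns the range from the previous token up to its own, and the function name ownership is invented for the example.

```python
from fractions import Fraction

RING_SIZE = 2 ** 127  # assumed number of positions on the ring


def ownership(tokens):
    """Map each token to the fraction of the ring that ends at that token."""
    ordered = sorted(tokens)
    if len(ordered) == 1:
        return {ordered[0]: Fraction(1)}
    shares = {}
    for i, token in enumerate(ordered):
        previous = ordered[i - 1]  # for i == 0 this wraps to the highest token
        width = token - previous
        if width < 0:
            width += RING_SIZE  # the range wraps around the end of the ring
        shares[token] = Fraction(width, RING_SIZE)
    return shares


if __name__ == "__main__":
    ring = [
        85070591730234615865843651857942052864,
        170141183460469231731687303715884105728,
    ]
    for token, share in ownership(ring).items():
        print("%d owns %.2f%%" % (token, float(share) * 100))
```

For the two tokens in the quoted output both shares come out to one half, which agrees with the 50.00% figures, so the tokens themselves are evenly spaced even though neither is zero.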
Re: How to determine if repair need to be run
Peter, I want to join everyone else thanking you for helping out so much with this thread, and especially for pointing out the problems with the DS docs on this topic. We have some corrections posted today, and will keep looking to improve the information.

On Thu, Mar 31, 2011 at 3:11 PM, Peter Schuller peter.schul...@infidyne.com wrote:

Thanks a lot for elaborating on repairs. Still, it's a bit fuzzy to me why it is so important to run a repair before the GCGraceSeconds kicks in. Does this mean a delete does not get replicated? In other words when I delete something on a node, doesn't cassandra set tombstones on its replica copies?

Deletes are replicated, but deletes are special in that unlike actual data, you're wanting to *remove* something, but the information that says stuff is gone is information in and of itself. Clearly you don't want to forever and ever keep track of anything ever removed in the cluster, so this has to expire somehow. For that reason, there is a requirement that tombstones are replicated prior to their expiry. See: http://wiki.apache.org/cassandra/DistributedDeletes

And technically, isn't repair only needed for cases where things weren't properly propagated in the cluster? If all writes are written to the right replicas, and all deletes are written to all the replicas, and all nodes were available at all times, then everything should work as designed - without manual intervention, right?

Yes, but you can assume that doesn't happen in real life for extended periods of time. It doesn't take a lot at all for a *few* writes not getting replicated (for example, just restarting a Cassandra node will cause some writes to be dropped - hinted handoff is not a guarantee, only an optimization).

-- / Peter Schuller
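The reason a tombstone has to reach every replica before it expires can be shown with a toy model. Everything here is a simplification for illustration and not Cassandra code: the Replica class, the repair and read helpers, and the use of plain seconds as timestamps are all invented, and the 864000-second grace period is assumed to match the default.

```python
GC_GRACE_SECONDS = 864000  # assumed default grace period (10 days)


class Replica:
    """Toy replica: key -> (timestamp, value), where value None is a tombstone."""

    def __init__(self):
        self.cells = {}

    def write(self, key, value, timestamp):
        current = self.cells.get(key)
        if current is None or timestamp > current[0]:
            self.cells[key] = (timestamp, value)

    def delete(self, key, timestamp):
        self.write(key, None, timestamp)  # a delete is a write of a tombstone

    def compact(self, now):
        # Drop tombstones that are older than the grace period.
        for key, (timestamp, value) in list(self.cells.items()):
            if value is None and now - timestamp > GC_GRACE_SECONDS:
                del self.cells[key]


def repair(a, b):
    """Exchange cells so both replicas hold the newest version of each key."""
    for source, target in ((a, b), (b, a)):
        for key, (timestamp, value) in list(source.cells.items()):
            target.write(key, value, timestamp)


def read(key, *replicas):
    cells = [r.cells[key] for r in replicas if key in r.cells]
    newest = max(cells, key=lambda cell: cell[0], default=None)
    return None if newest is None else newest[1]


def scenario(repair_before_expiry):
    a, b = Replica(), Replica()
    a.write("k", "v", timestamp=0)
    b.write("k", "v", timestamp=0)
    a.delete("k", timestamp=100)  # replica b never sees the delete
    if repair_before_expiry:
        repair(a, b)  # the tombstone reaches b in time
    later = 100 + GC_GRACE_SECONDS + 1
    a.compact(later)
    b.compact(later)
    repair(a, b)  # a late repair, after the tombstone has expired
    return read("k", a, b)


if __name__ == "__main__":
    print("repaired in time:", scenario(True))
    print("repaired too late:", scenario(False))
```

In this model, repairing before expiry leaves the key deleted everywhere, while skipping it lets the replica that missed the delete hand the old value back once the tombstone is gone.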
Re: Add node to balanced cluster?
Ruslan, I'm not sure exactly what risks you are referring to -- can you be more specific? Do the CPU-intensive operations one at a time, including doing the cleanup when it will not interfere with other operations, and I think you should be fine, from my understanding.

1. Start the new nodes in staggered fashion, allowing at least two minutes between each node startup for the gossip protocol to perform important inter-node communication. You can monitor the startup and data streaming process to its completion using *nodetool netstats* (http://www.datastax.com/docs/0.7/utilities/nodetool#nodetool-streams).

2. After the new nodes are fully bootstrapped, run nodetool move new_token on each existing node, one node at a time, where new_token is the value you calculated for the node. Only the first node in the ring, whose token value is zero, does not need to be moved.

3. Run *nodetool cleanup* (http://www.datastax.com/docs/0.7/utilities/nodetool#nodetool-cleanup) on each of the previously existing nodes to remove the keys no longer belonging to those nodes. This operation is as disk-intensive as a major compaction, so run only one cleanup command at a time. Cleanup may be safely postponed for low-usage hours.

On Fri, Mar 25, 2011 at 10:39 AM, ruslan usifov ruslan.usi...@gmail.com wrote:

2011/3/25 Eric Gilmore e...@datastax.com

Also: http://www.datastax.com/docs/0.7/operations/clustering#adding-capacity

Can do that about i represent, but I am afraid that when I begin to balance the cluster with the new node this will be a big stress for it. Maybe there are some strategies for how to do that?
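Step 2 above can be prepared ahead of time with a small helper that lists the move commands for the existing nodes. This is only a sketch: the host names node1 and node2, the function names, and the choice of which target token goes to which host are made up, and it assumes the "nodetool -h <host> move <token>" form of the command.

```python
RING_SIZE = 2 ** 127


def target_tokens(node_count):
    """Evenly spaced tokens for the enlarged cluster, starting at 0."""
    return [i * (RING_SIZE // node_count) for i in range(node_count)]


def move_commands(current_tokens, new_tokens):
    """Return one nodetool command per existing host whose token changes.

    Both arguments map host name -> token. Hosts are listed in ring order
    so the moves can be run one node at a time.
    """
    commands = []
    for host in sorted(current_tokens, key=current_tokens.get):
        new_token = new_tokens[host]
        if new_token != current_tokens[host]:
            commands.append("nodetool -h %s move %d" % (host, new_token))
    return commands


if __name__ == "__main__":
    # Hypothetical 2-node cluster growing to 3 nodes; the new node is
    # assumed to bootstrap with targets[1] as its initial_token.
    targets = target_tokens(3)
    current = {"node1": 0, "node2": 2 ** 126}
    planned = {"node1": targets[0], "node2": targets[2]}
    for command in move_commands(current, planned):
        print(command)
```

The node at token 0 produces no command, matching the note that the first node in the ring does not need to be moved.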
Re: where to find the stress testing programs?
There are both Python and Java stress testing tools. I found the Java version easier to use. These directions (which echo the README for stress.java) may help get you going: http://www.datastax.com/docs/0.7/utilities/stress_java

On Tue, Mar 15, 2011 at 9:25 AM, Jeremy Hanna jeremy.hanna1...@gmail.com wrote:

contrib is only in the source download of cassandra

On Mar 15, 2011, at 11:23 AM, Jonathan Colby wrote:

According to the Cassandra Wiki and O'Reilly book supposedly there is a contrib directory within the cassandra download containing the Python Stress Test script stress.py. It's not in the binary tarball of 0.7.3. Anyone know where to find it? Anyone know of other, maybe better stress testing scripts?

Jon
Re: Installation
The DataStax packaged releases (http://www.datastax.com/docs/0.7/configuration/packaged_releases) follow standard practices for Linux-ish installation, so they might be a good model to follow. For instance, the RHEL/CentOS package installs the binaries (cassandra-cli, nodetool) in /usr/bin, configuration-related things in /etc/cassandra/conf/, and start/stop scripts in /etc/init.d/.

The installation page (http://www.datastax.com/dev/tutorials/getting_started_0.7/installing) in our Getting Started guide describes the breakdown in binary files that you can download from Apache (http://cassandra.apache.org/download/) and unpack, but in the near future we may add a more detailed description of the package releases.

On Mon, Mar 7, 2011 at 8:40 AM, Mark static.void@gmail.com wrote:

Where do most people install Cassandra to? /var or /opt?

Thanks
Re: cassandra-rack.properties or cassandra-topology.properties
Apologies A. J. -- the reference to rack.properties is an error in the DataStax docs. We'll update it ASAP.

On Thu, Mar 3, 2011 at 10:56 AM, A J s5a...@gmail.com wrote:

Yes, that has topology and not rack.

conf/access.properties
conf/log4j-server.properties
conf/cassandra-env.sh
conf/log4j-tools.properties
conf/cassandra-topology.properties
conf/passwd.properties
conf/cassandra.yaml
conf/README.txt

On Thu, Mar 3, 2011 at 1:28 PM, Jonathan Ellis jbel...@gmail.com wrote:

Did you try ls conf/ ?

On Thu, Mar 3, 2011 at 11:27 AM, A J s5a...@gmail.com wrote:

In PropertyFileSnitch, is the cassandra-rack.properties or the cassandra-topology.properties file used? Little confused by the statement:

PropertyFileSnitch determines the location of nodes by referring to a user-defined description of the network details located in the property file cassandra-rack.properties. Your installation contains an example properties file for PropertyFileSnitch in $CASSANDRA_HOME/conf/cassandra-topology.properties.

-- Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com
Re: Seed Nodes
Yes, two per DC is a recommendation I've heard from Jonathan Ellis. We put that in yet more documentation at http://www.datastax.com/dev/tutorials/getting_started_0.7/configuring#seed-list (appreciate the citation Aaron :)

I had a recent conversation with a Cassandra expert who had me convinced that, as long as a node already online had one seed in its list at startup, you wouldn't really need to restart it -- at least right away -- after adding a second seed to the config. Rather than confuse the issue by trying to remember/recreate the rationale for that, I'll ping him and see if he'll comment here.

On Tue, Mar 1, 2011 at 12:04 PM, Aaron Morton aa...@thelastpickle.com wrote:

AFAIK it's recommended to have two seed nodes per dc. Some info on seeds here: http://www.datastax.com/docs/0.7/operations/clustering

You will need a restart.

Aaron

On 2/03/2011, at 6:08 AM, shan...@accenture.com wrote:

How many seed nodes should I have for a cluster of 100 nodes each with about 500gb of data? Also, to add seed nodes, must I change the seed nodes list on all existing nodes through the cassandra.yaml file? Will changes take effect without restarting the node?

Shan (Susie) Lu, Analyst
Accenture Technology Labs - Silicon Valley
cell +1 425.749.2546
email shan...@accenture.com charles.nebol...@accenture.com

-- This message is for the designated recipient only and may contain privileged, proprietary, or otherwise private information. If you have received it in error, please notify the sender immediately and delete the original. Any other use of the email by you is prohibited.
Re: Multiple Seeds
The DataStax documentation offers some answers to those questions in the Getting Started section (http://www.datastax.com/dev/tutorials/getting_started_0.7/configuring#adding-nodes-to-a-cassandra-cluster) and the Clustering reference docs (http://www.datastax.com/docs/0.7/operations/clustering#adding-capacity).

Autobootstrap should be true, but with the important caveat that initial_token values should be specified.

Have a look at those docs, and please give feedback on how helpful they are/aren't.

Regards,
Eric Gilmore

On Wed, Feb 23, 2011 at 11:15 AM, jeremy.truel...@barclayscapital.com wrote:

What’s the best way to bring multiple seeds up, should only one of them have auto bootstrap set to true or should neither of them? Should they list themselves and the other seed in their seed section in the yaml config?

___

This e-mail may contain information that is confidential, privileged or otherwise protected from disclosure. If you are not an intended recipient of this e-mail, do not duplicate or redistribute it by any means. Please delete it and any attachments and notify the sender that you have received it in error. Unless specifically indicated, this e-mail is not an offer to buy or sell or a solicitation to buy or sell any securities, investment products or other financial product or service, an official confirmation of any transaction, or an official statement of Barclays. Any views or opinions presented are solely those of the author and do not necessarily represent those of Barclays. This e-mail is subject to terms available at the following link: www.barcap.com/emaildisclaimer. By messaging with Barclays you consent to the foregoing. Barclays Capital is the investment banking division of Barclays Bank PLC, a company registered in England (number 1026167) with its registered office at 1 Churchill Place, London, E14 5HP. This email may relate to or be sent from other members of the Barclays Group.

___
Re: Multiple Seeds
Well -- when you first bring a node into a ring, you will probably want to stream data to it with auto_bootstrap: true. If you want that node to be a seed, then add it to the seeds list AFTER it has joined the ring.

I'd refer you to the Seed List and Autobootstrapping sections of the Getting Started guide, which contain the following blurbs:

*There is no strict rule to determine which hosts need to be listed as seeds, but all nodes in a cluster need the same seed list. For a production deployment, DataStax recommends two seeds per data center.*

*An autobootstrapping node cannot have itself in the list of seeds nor can it contain an initial_token (http://www.datastax.com/docs/0.7/configuration/storage_configuration#initial-token) already claimed by another node. To add new seeds, autobootstrap the nodes first, and then configure them as seeds.*

On Wed, Feb 23, 2011 at 11:39 AM, jeremy.truel...@barclayscapital.com wrote:

So all seeds should always be set to 'auto_bootstrap: false' in their .yaml file.

-----Original Message-----
From: Edward Capriolo [mailto:edlinuxg...@gmail.com]
Sent: Wednesday, February 23, 2011 2:36 PM
To: user@cassandra.apache.org
Cc: Truelove, Jeremy: IT (NYK)
Subject: Re: Multiple Seeds

On Wed, Feb 23, 2011 at 2:30 PM, jeremy.truel...@barclayscapital.com wrote:

Yeah I set the tokens, I'm more asking if I start the first seed node with autobootstrap set to false the second seed should have it set to true as well as all the slave nodes correct? I didn't see this in the docs but I may have just missed it.

From: Eric Gilmore [mailto:e...@datastax.com]
Sent: Wednesday, February 23, 2011 2:24 PM
To: user@cassandra.apache.org
Subject: Re: Multiple Seeds

The DataStax documentation offers some answers to those questions in the Getting Started section and the Clustering reference docs.

Autobootstrap should be true, but with the important caveat that initial_token values should be specified.

Have a look at those docs, and please give feedback on how helpful they are/aren't.

Regards,
Eric Gilmore

On Wed, Feb 23, 2011 at 11:15 AM, jeremy.truel...@barclayscapital.com wrote:

What's the best way to bring multiple seeds up, should only one of them have auto bootstrap set to true or should neither of them? Should they list themselves and the other seed in their seed section in the yaml config?

If a node is defined as a seed it will never auto bootstrap. After it has bootstrapped and has a system table you can set it as a seed in its yaml file if you wish.
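The seed and bootstrap rules discussed in this message can be written down as a small pre-flight check. This is a hypothetical helper, not part of Cassandra: the dictionary keys address, seeds, auto_bootstrap and initial_token only mirror the settings named in this thread, and the function names are invented for the example.

```python
def check_node_config(config, claimed_tokens):
    """Return a list of problems for a node that is about to join a ring.

    config is a dict with the (assumed) keys 'address', 'seeds',
    'auto_bootstrap' and 'initial_token'; claimed_tokens is the set of
    tokens already held by other nodes.
    """
    problems = []
    if config["auto_bootstrap"] and config["address"] in config["seeds"]:
        problems.append("auto-bootstrapping node lists itself as a seed")
    if config["auto_bootstrap"] and config.get("initial_token") in claimed_tokens:
        problems.append("initial_token already claimed by another node")
    return problems


def seed_lists_match(configs):
    """True when every node is configured with the same set of seeds."""
    return len({tuple(sorted(c["seeds"])) for c in configs}) <= 1


if __name__ == "__main__":
    joining = {
        "address": "10.0.0.3",
        "seeds": ["10.0.0.1", "10.0.0.3"],
        "auto_bootstrap": True,
        "initial_token": 0,
    }
    for problem in check_node_config(joining, claimed_tokens={0}):
        print(problem)
```

The intended order is the one described above: let the node bootstrap with itself absent from the seed list, then add it to the seed list on every node.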
Re: Error when bringing up 3rd node
It sounds like one of your existing nodes already has the initial token zero. Did you set the initial token of the first node you brought online to zero?

On Fri, Feb 18, 2011 at 12:35 PM, mcasandra mohitanch...@gmail.com wrote:

I see the following error. Is it because I have initial token defined? What token should I use as initial token?

INFO 12:31:36,689 Finished hinted handoff of 0 rows to endpoint /172.16.208.12
INFO 12:32:58,448 Joining: getting bootstrap token
ERROR 12:32:58,451 Fatal error: Bootstraping to existing token 0 is not allowed (decommission/removetoken the old node first).
Bad configuration; unable to start server

-- View this message in context: http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Error-when-bringing-up-3rd-node-tp6041409p6041409.html
Sent from the cassandra-u...@incubator.apache.org mailing list archive at Nabble.com.
Re: Error when bringing up 3rd node
A Java program should work fine. The Wiki and the DataStax documentation use a Python program for the same purpose: http://www.datastax.com/docs/0.7/operations/clustering#calculating-tokens

On Fri, Feb 18, 2011 at 12:45 PM, mcasandra mohitanch...@gmail.com wrote:

Yes I had set the first node to token 0. I think I read that somewhere in the docs. What should I do? Should I write a Java program to calculate the hash for 3 nodes and distribute it across 3 nodes?

-- View this message in context: http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Error-when-bringing-up-3rd-node-tp6041409p6041430.html
Sent from the cassandra-u...@incubator.apache.org mailing list archive at Nabble.com.
Re: Error when bringing up 3rd node
I'm not sure I can say exactly why, but I'm sure those numbers can't be correct. One node should be zero and the other values should be very long numbers like 85070591730234615865843651857942052863.

We need another Java expert's opinion here, but it looks like your snippet may have integer overflow (http://www.mkyong.com/java/javas-silent-killer-buffer-overflow-careful/) or integer overload going on.

On Fri, Feb 18, 2011 at 1:04 PM, mcasandra mohitanch...@gmail.com wrote:

Thanks! This is what I got. Is this right?

public class TokenCalc {
    public static void main(String... args) {
        int nodes = 3;
        for (int i = 1; i <= nodes; i++) {
            System.out.println((2 ^ 127) / nodes * i);
        }
    }
}

41
82
123

-- View this message in context: http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Error-when-bringing-up-3rd-node-tp6041409p6041471.html
Sent from the cassandra-u...@incubator.apache.org mailing list archive at Nabble.com.
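For what it's worth, the output 41, 82, 123 can be reproduced without any overflow: in Java, as in Python, the ^ operator is bitwise XOR rather than exponentiation, so 2 ^ 127 evaluates to 125. The short Python sketch below mirrors the arithmetic of the snippet above to show this; the variable names are made up, and a Java version would presumably need java.math.BigInteger because 2 to the power 127 does not fit in a long.

```python
nodes = 3

# `^` is bitwise XOR: 2 is 0b0000010 and 127 is 0b1111111, so 2 ^ 127 == 125.
xor_based = [(2 ^ 127) // nodes * i for i in range(1, nodes + 1)]

# `**` is exponentiation, which is what the token formula intends.
power_based = [(2 ** 127) // nodes * i for i in range(1, nodes + 1)]

if __name__ == "__main__":
    print(xor_based)
    for token in power_based:
        print(token)
```

The first list matches the three small numbers quoted in the message; the second list contains tokens of the expected magnitude.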
Re: Internal error processing insert
For now, I have committed a change in the misleading documentation, substituting SimpleStrategy for NTS. Sorry you ran into trouble due to that, mcasandra.

On Mon, Feb 14, 2011 at 4:28 PM, Aaron Morton aa...@thelastpickle.com wrote:

Will take a closer look at the code tonight, perhaps we should return an error if you try to use Network Topology and it cannot detect any DCs.

Cheers
Aaron

On 15 Feb, 2011, at 01:22 PM, mcasandra mohitanch...@gmail.com wrote:

That's what I thought might be happening since network topology will try to find one node on the other data center. Message is a little confusing though.

[default@unknown] update keyspace twissandra placement_strategy='org.apache.cassandra.locator.SimpleStrategy';
Syntax error at position 28: missing EOF at 'placement_strategy'
[default@unknown] update keyspace twissandra with placement_strategy='org.apache.cassandra.locator.SimpleStrategy';
5c487967-3899-11e0-993f-b7fa7ed61af9
[default@unknown] use twissandra
... ;
Authenticated to keyspace: twissandra
[default@twissandra] set users['jsmith']['password']='ch@ngem3';
Value inserted.

-- View this message in context: http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Internal-error-processing-insert-tp6025740p6025813.html
Sent from the cassandra-u...@incubator.apache.org mailing list archive at Nabble.com.
Re: How does Bootstrapping work in 0.7 ??
Thanks very much Patrick for the good words and suggestions. Those are important points about initial_token and nodetool move. Definitely, keep us informed about any and all doc issues you have, and we will do what we can to keep improving the docs.

Eric

On Sun, Jan 23, 2011 at 2:26 PM, Patrick de Torcy pdeto...@gmail.com wrote:

Thanks, Eric and Peter, for all the replies! It helped me a lot...

I've read again the Getting Started page (http://www.datastax.com/docs/0.7/getting_started/index)... In fact I have particularly read the chapter on setting up a multi node cluster (as it was what I tried to do). It's written:

"The rest of this section uses a two-node example cluster. With two nodes, a new Cassandra cluster’s load is automatically balanced, and remains balanced if you increase to four nodes, then eight."

So I thought that keeping the default parameter (ie blank) would have been enough. So I think you should add in your doc (and in the yaml comments) that if it's the first node of your cluster, you really should set initial_token to 0, if you want your further nodes balanced. You should add too that if your cluster is unbalanced, you'll have to run: nodetool host move newToken (it's not specified in this section). Some examples would be useful too. For people new to Cassandra, this part is very confusing (well, maybe I'm a little dumb too...)

But even with these flaws, your documentation is the best I have read. And if I have other issues, don't worry, you'll be informed :-).

Thanks again,
Patrick

On Thu, Jan 20, 2011 at 8:55 PM, Eric Gilmore e...@riptano.com wrote:

Patrick, if you try adding capacity again from the beginning, I'd be curious to hear if the DataStax/Riptano docs (http://www.datastax.com/docs/0.7/operations/clustering#adding-capacity) are helpful or not.

Also, in the Getting Started page (http://www.datastax.com/docs/0.7/getting_started/index), we note that it may be best to set initial_token to 0 on the very first node that you start.

Regards,
Eric Gilmore

On Thu, Jan 20, 2011 at 11:05 AM, Peter Schuller peter.schul...@infidyne.com wrote:

Is it supposed to work that way, or have I missed something?

I don't see that you did anything wrong based on your description and based on my understanding of how it works in 0.7 (not sure about 0.6), but hopefully someone else can address that part.

What I can think of - did you inspect the log on the new node? Does it say anything about bootstrapping or streaming data from other nodes? Does 'nodetool ring' indicate it considers itself completely up and in the cluster already? Trying to determine whether the node is in fact considering itself done bootstrapping and joined to the ring, yet containing no data.

I tried then to put values for initialToken for both nodes (stopping and restarting the servers), but it didn't change anything: I have the same token values...

This is expected. Once the node has bootstrapped into the cluster and saved its token, it will no longer try to acquire a new one. Any initial token in the configuration is ignored; it is only the *initial* token, quite literally. Changing the token would require a 'nodetool move' command.

-- / Peter Schuller

-- Eric Gilmore
Consulting Technical Writer
Riptano, Inc.
Ph: 510 684 9786 (cell)
Re: How does Bootstrapping work in 0.7 ??
Patrick, if you try adding capacity again from the beginning, I'd be curious to hear if the DataStax/Riptano docs (http://www.datastax.com/docs/0.7/operations/clustering#adding-capacity) are helpful or not.

Also, in the Getting Started page (http://www.datastax.com/docs/0.7/getting_started/index), we note that it may be best to set initial_token to 0 on the very first node that you start.

Regards,
Eric Gilmore

On Thu, Jan 20, 2011 at 11:05 AM, Peter Schuller peter.schul...@infidyne.com wrote:

Is it supposed to work that way, or have I missed something?

I don't see that you did anything wrong based on your description and based on my understanding of how it works in 0.7 (not sure about 0.6), but hopefully someone else can address that part.

What I can think of - did you inspect the log on the new node? Does it say anything about bootstrapping or streaming data from other nodes? Does 'nodetool ring' indicate it considers itself completely up and in the cluster already? Trying to determine whether the node is in fact considering itself done bootstrapping and joined to the ring, yet containing no data.

I tried then to put values for initialToken for both nodes (stopping and restarting the servers), but it didn't change anything: I have the same token values...

This is expected. Once the node has bootstrapped into the cluster and saved its token, it will no longer try to acquire a new one. Any initial token in the configuration is ignored; it is only the *initial* token, quite literally. Changing the token would require a 'nodetool move' command.

-- / Peter Schuller

-- Eric Gilmore
Consulting Technical Writer
Riptano, Inc.
Ph: 510 684 9786 (cell)
Re: How does Bootstrapping work in 0.7 ??
Sorry, my comments were indeed a little short on elucidation. :)

The cited doc suggests that setting initial_token to 0 on the first node

simplifies load balancing as you later expand the cluster . . . . If this is unset (the default), Cassandra picks a token number randomly.

A more complete explanation might look something like:

. . . it is recommended to set the initial token's value to zero. This simplifies load balancing as you later expand the cluster, since the node starting at 0 will never need to be moved to a new token. Also, if this is unset (the default), Cassandra picks a token number randomly, which can lead to hot spots in the ring.

On Thu, Jan 20, 2011 at 12:59 PM, Brandon Williams dri...@gmail.com wrote:

On Thu, Jan 20, 2011 at 2:14 PM, Robert Coli rc...@digg.com wrote:

On Thu, Jan 20, 2011 at 11:55 AM, Eric Gilmore e...@riptano.com wrote:

Also, in the Getting Started page, we note that it may be best to set initial_token to 0 on the very first node that you start.

Could you expand a bit on the reasons for and implications of this, for our collective elucidation? :)

Because then the node never has to move. Same would be true of 2**127, but zero is mnemonically easier. :)

-Brandon

-- Eric Gilmore
Consulting Technical Writer
Riptano, Inc.
Ph: 510 684 9786 (cell)
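The "never has to move" point can be checked with a little arithmetic. The sketch below assumes evenly spaced tokens that start at 0 on a ring of size 2**127; the function names tokens_for and tokens_that_survive are invented for the example.

```python
RING_SIZE = 2 ** 127


def tokens_for(node_count):
    """Evenly spaced tokens for node_count nodes, with the first at 0."""
    return {i * RING_SIZE // node_count for i in range(node_count)}


def tokens_that_survive(old_count, new_count):
    """Tokens that are valid both before and after the cluster is resized."""
    return sorted(tokens_for(old_count) & tokens_for(new_count))


if __name__ == "__main__":
    print(tokens_that_survive(2, 3))  # only token 0 stays put
    print(tokens_that_survive(2, 4))  # doubling keeps every old token
```

Token 0 appears in the result for any pair of sizes, and doubling the cluster keeps all of the old tokens, so under these assumptions only the new nodes need tokens assigned when going from two nodes to four.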
Re: If one seed node crash, how can I add one seed node?
What would comprise a sane and reasonably balanced list? Should there be a certain proportion of seeds per total nodes? Any other considerations besides a) list must be identical on all nodes and b) you can't auto-bootstrap a seed node?

I'm new to thinking about this setting, but it sounds like this discussion may be approaching some best-practice guidelines.

On Tue, Dec 7, 2010 at 1:01 PM, Jonathan Ellis jbel...@gmail.com wrote:

The gossip-to-seed each round is to prevent cluster partitions, so if you're following correct procedure and making every node's seed list identical, then any potential new nodes gossiping to one of the old seeds means it is still harmless for old nodes not to gossip to the new one until the next restart.

On Tue, Dec 7, 2010 at 2:10 PM, Aaron Morton aa...@thelastpickle.com wrote:

Ryan, I've not checked with the code but the wiki docs for the Gossip Protocol say it makes use of the seed list. http://wiki.apache.org/cassandra/ArchitectureGossip

During each gossip round a node will try to gossip to one seed node. Which made me think keeping the list sane and reasonably balanced was a good idea. Obviously would not matter too much on a small cluster though.

Aaron

On 08 Dec, 2010, at 07:16 AM, Ryan King r...@twitter.com wrote:

Note that there's not really anything special about the seed node and it's all relative; the cluster doesn't necessarily have to agree on who the seeds are. So, to bring up a new node to replace the old seed, just set the new node's seed to any existing node in the system. After that you can go back and make the setting consistent across the cluster.

-ryan

On Tue, Dec 7, 2010 at 7:01 AM, Nick Bailey n...@riptano.com wrote:

Yes, cassandra only reads the configuration when it starts up. However seed nodes are only used when a node starts. After that they aren't needed. There should be no reason to restart your cluster after adding a seed node to your cluster.

On Tue, Dec 7, 2010 at 2:09 AM, aaron morton aa...@thelastpickle.com wrote:

You will need to restart the nodes for them to pick up changes in cassandra.yaml

Aaron

On 7 Dec 2010, at 16:32, lei liu wrote:

Thanks Nick. After I add the new node as seed node in the configuration for all of my nodes, do I need to restart all of my nodes?

2010/12/7 Nick Bailey n...@riptano.com

The node can be set as a seed node at any time. It does not need to be a seed node when it joins the cluster. You should remove it as a seed node, set autobootstrap to true and let it join the cluster. Once it has joined the cluster you should add it as a seed node in the configuration for all of your nodes.

On Mon, Dec 6, 2010 at 9:59 AM, lei liu liulei...@gmail.com wrote:

Thanks Jonathan for your reply. How can I bootstrap the node into the cluster? I know if the node is a seed node, I can't set AutoBootstrap to true.

2010/12/6 Jonathan Ellis jbel...@gmail.com

set it as a seed _after_ bootstrapping it into the cluster.

On Mon, Dec 6, 2010 at 5:01 AM, lei liu liulei...@gmail.com wrote:

After one seed node crashed, I want to add one node as seed node. I set auto_bootstrap to true, but the new node doesn't migrate data from other nodes. How can I add one new seed node and let the node migrate data from other nodes?

Thanks,
LiuLei

-- Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com

-- Eric Gilmore
Consulting Technical Writer
Riptano, Inc.
Ph: 510 684 9786 (cell)