Re: Maintaining counter column consistency
Hi Ben,

If you make sure R + W > N you should be fine. Have a read of this: http://www.slideshare.net/benjaminblack/introduction-to-cassandra-replication-and-consistency

Thanks, H

On 1 Oct 2013, at 18:29, Ben Hood 0x6e6...@gmail.com wrote:

Hi,

We're maintaining a bunch of application-specific counters that are incremented on a per-event basis, just after the event has been inserted. Given that they can get out of sync, we were wondering if there are any best practices, or just plain real-world experience, for handling the consistency of these counters? The application could tolerate an inconsistency for a while, so I'm not sure that the cost of any full-on ACID semantics (should they actually be possible in Cassandra) would be justified. So the first inclination was to issue the increment after the insert and hope for the best. Then at some later point we would run a reconciliation on the underlying data in the column family and compare this with the counter values. Obviously you can only do this once a counter column has gone cold - i.e. it wouldn't make sense to reconcile something that could still get incremented. Does it make sense to put the insert and increment in a CQL batch? Does anybody have any high-level advice for this design deliberation?

Cheers, Ben
Re: Maintaining counter column consistency
Hi Haithem,

I might have phrased my question wrongly - I wasn't referring to considerations of consistency level or replication factor. I was referring to the fact that I want to insert a row and increment a counter in the same operation. I was concerned about the inconsistency that could arise if the counter increment failed after the underlying record on which the increment was based succeeded. So I wasn't talking about consistency between Cassandra nodes, rather the consistency between an idempotent base record and a non-idempotent summary counter.

Cheers, Ben

On October 2, 2013 at 10:09:40 AM, Haithem Jarraya (a-hjarr...@expedia.com) wrote: [...]
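A note on the batch question: a CQL batch cannot mix counter and non-counter mutations, so the insert and the increment cannot be coupled into a single atomic batch anyway. The reconciliation pass Ben describes can be sketched as pure comparison logic; a minimal sketch, assuming the true counts come from recounting rows in the base table and the observed values from reading the counter column (all names here are hypothetical):

```python
# Sketch of the reconciliation pass for "cold" counter partitions:
# recount the idempotent base rows, compare against the non-idempotent
# counter value, and report any drift to be corrected.

def reconcile(base_counts, counter_values):
    """base_counts: {partition_key: rows counted from the base table}
    counter_values: {partition_key: value read from the counter column}
    Returns {partition_key: drift} for every partition whose counter
    disagrees with the recount (positive drift = counter is behind)."""
    drift = {}
    for key, true_count in base_counts.items():
        observed = counter_values.get(key, 0)
        if observed != true_count:
            drift[key] = true_count - observed
    return drift

# Example: the counter missed one increment for partition 42.
print(reconcile({41: 10, 42: 7}, {41: 10, 42: 6}))  # {42: 1}
```

The drift map could then drive compensating increments, which is safe precisely because reconciliation only runs on partitions that have gone cold.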
Re: Cassandra Heap Size for data more than 1 TB
The version of Cassandra I am using is 1.0.11; we are migrating to 1.2.X though. We had tuned bloom filters (0.1) and AFAIK making it lower than this won't matter. Thanks!

On Tue, Oct 1, 2013 at 11:54 PM, Mohit Anchlia mohitanch...@gmail.com wrote:

Which Cassandra version are you on? Essentially heap size is a function of the number of keys/metadata. In Cassandra 1.2 a lot of the metadata, like bloom filters, was moved off heap.

On Tue, Oct 1, 2013 at 9:34 PM, srmore comom...@gmail.com wrote:

Does anyone know what would roughly be the heap size for Cassandra with 1 TB of data? We started with about 200 G and now on one of the nodes we are already at 1 TB. We were using 8 G of heap and that served us well up until we reached 700 G, where we started seeing failures and nodes flapping. With 1 TB of data the node refuses to come back due to lack of memory. Needless to say, repairs and compactions take a lot of time. We upped the heap from 8 G to 12 G and suddenly everything started moving rapidly, i.e. the repair tasks and the compaction tasks. But soon (in about 9-10 hrs) we started seeing the same symptoms as we were seeing with 8 G. So my question is: how do I determine the optimal size of heap for data around 1 TB?

Following are some of my JVM settings:

-Xms8G -Xmx8G -Xmn800m -XX:NewSize=1200M -XX:MaxTenuringThreshold=2 -XX:SurvivorRatio=4

Thanks!
RE: Rollback question regarding system metadata change
I went with deleting the extra rows created in schema_columns and I've now successfully bootstrapped three nodes back on 1.2.10. No sour side effects to report yet. Thanks for your help.

From: Robert Coli [mailto:rc...@eventbrite.com]
Sent: 02 October 2013 01:00
To: user@cassandra.apache.org
Subject: Re: Rollback question regarding system metadata change

On Tue, Oct 1, 2013 at 3:45 PM, Chris Wirt chris.w...@struq.com wrote:

Yep, they still work. They don't actually have any of the new system CFs created for 2.0 (paxos, etc.) but they do have new rows in the schema_columns table preventing startup and bootstrapping of new nodes.

It *may* be least risky to manually remove these rows and then restart DC3. But unfortunately, without really diving into the code, I can't make any statement about what effects it might have.

But anyway, actions to do this would be:

- drop schema (won't actually delete data?)

What actually happens is that you automatically create a snapshot in the snapshots dir when you drop, so you would have to move (or, better, hard link) those files back into place.

- create schema (will create all the metadata and leave my data directories alone?)
- on each node run nodetool refresh (will load my existing data?)

Right. Refresh will rename all SSTables while opening them. As an alternative to refresh, you can restart the node; Cassandra loads whatever files it finds in the data dir at startup.

=Rob
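The "move or (better) hard link those files back into place" step Rob describes can be scripted. A rough sketch, with hypothetical paths (the real snapshot directory sits under the table's data directory, e.g. .../data/<keyspace>/<table>/snapshots/<tag>/), that hard-links every file from a snapshot directory back into the data directory:

```python
import os

def relink_snapshot(snapshot_dir, data_dir):
    """Hard-link every file from a snapshot directory back into the
    table's data directory (the snapshot was created automatically by
    the DROP). Hard links are instant, copy no data, and leave the
    snapshot itself intact. Returns the list of files linked."""
    restored = []
    for name in os.listdir(snapshot_dir):
        src = os.path.join(snapshot_dir, name)
        dst = os.path.join(data_dir, name)
        if os.path.isfile(src) and not os.path.exists(dst):
            os.link(src, dst)  # same inode, no copy
            restored.append(name)
    return sorted(restored)
```

After relinking, `nodetool refresh <keyspace> <table>` (or a node restart, per Rob's note) picks the files up.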
Re: Cassandra Heap Size for data more than 1 TB
Have a look at index_interval.

Cem.

On Wed, Oct 2, 2013 at 2:25 PM, srmore comom...@gmail.com wrote: [...]
Re: Cassandra Heap Size for data more than 1 TB
I changed my index_interval from 128 to 512; does it make sense to increase it more than this?

On Wed, Oct 2, 2013 at 9:30 AM, cem cayiro...@gmail.com wrote:

Have a look at index_interval.

Cem. [...]
Issue with source command and utf8 file
Hi,

I'm trying to load some data into Cassandra with the source command in cqlsh. The file is UTF-8 encoded; however, Cassandra seems unable to handle the UTF-8 encoded characters. Here is a sample:

insert into positions8(iddevice,timestampevent,idunit,idevent,status,value) values(40135,'2013-06-06T10:08:02',13524915,0,'G','{sp:0,A1:FRANCE,lat:45216954,iDD:40135,A2:RHÔNE-ALPES,tEv:2013-06-06T10:08:02,iE:0,iTE:0,lng:6462520,iD:13318089,mi:0,st:ÉCHANGEUR DE ST-MICHEL-DE-MAURIENNE,A4:SAINT-MARTIN-D'ARC,iU:13524915,A3:SAVOIE,tRx:2013-06-06T10:12:56}');

Here is the hex dump of the file:

6e69 6573 7472 6920 746e 206f 6f70 6973 6974 6e6f 3873 6928 6464 7665 6369 2c65 6974 656d 7473 6d61 6570 6576 746e 692c 7564 696e 2c74 6469 7665 6e65 2c74 7473 7461 7375 762c 6c61 6575 2029 6176 756c 7365 3428 3130 3030 3030 3533 272c 3032 3331 302d 2d36 3630 3154 3a30 3830 303a 2732 312c 3533 3432 3139 2c35 2c30 4727 2c27 7b27 7322 2270 223a 2230 222c 3141 3a22 4622 4152 434e 2245 222c 616c 2274 223a 3534 3132 3936 3435 2c22 6922 3a22 3422 3130 3030 3030 3533 2c22 4122 2232 223a 4852 94c3 454e 412d 504c 5345 2c22 7422 7645 3a22 3222 3130 2d33 3630 302d 5436 3031 303a 3a38 3230 2c22 6922 2245 223a 2230 222c 5469 2245 223a 2230 222c 6e6c 2267 223a 3436 3236 3235 2230 222c 4469 3a22 3122 3831 3830 2239 222c 696d 3a22 2c30 7322 2274 223a 89c3 4843 4e41 4547 5255 4420 2045 5453 4d2d 4349 4548 2d4c 4544 4d2d 5541 4952 4e45 454e 2c22 4122 2234 223a 4153 4e49 2d54 414d 5452 4e49 442d 4127 4352 2c22 6922 2255 223a 3331 3235 3934 3531 2c22 4122 2233 223a 4153 4f56 4945 2c22 7422 7852 3a22 3222 3130 2d33 3630 302d 5436 3031 313a 3a32 3635 7d22 2927 0a3b 000a

As an example, Ô is encoded as C394. When I try to load the file I get this error:

cqlsh:demodb> source 'rhone.cql';
rhone.cql:3: Incomplete statement at end of file

The error disappears only when I remove all the non-ASCII characters. If I copy and paste the insert into the cqlsh shell, it works.

Cassandra is installed on a CentOS 6.3 server, LANG is .UTF8, and I tried connecting remotely both with gnome-terminal and with PuTTY on Windows, both with a UTF-8 shell; no success with either. Has anybody got any clue?

Regards, Paolo

--
Paolo Crosato
Software engineer/Custom Solutions
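One common cause of exactly this symptom is a UTF-8 byte-order mark (or a stray non-UTF-8 byte) at the start of the file, which older cqlsh versions may choke on while a pasted statement works fine. This is speculation, not a confirmed diagnosis, but it is cheap to rule out. A small sketch that verifies a .cql file decodes cleanly as UTF-8 and strips a leading BOM if one is present:

```python
import codecs

def clean_cql_file(path):
    """Verify the file decodes as UTF-8 (raises UnicodeDecodeError on
    any bad byte, pointing at the offending offset) and strip a UTF-8
    BOM if present. Returns True if a BOM was removed."""
    with open(path, "rb") as f:
        raw = f.read()
    raw.decode("utf-8")  # fail fast on non-UTF-8 bytes
    if raw.startswith(codecs.BOM_UTF8):
        with open(path, "wb") as f:
            f.write(raw[len(codecs.BOM_UTF8):])
        return True
    return False
```

If the file decodes cleanly and has no BOM, the next suspect would be the cqlsh version itself rather than the file.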
Unable to bootstrap new node
Hi all,

We are running C* 1.2.8 with vnodes enabled and are attempting to bootstrap a new node, and are having issues. When we add the node we see it bootstrap, and we see data start to stream over from the other nodes; however, one of the other nodes gets stuck in full GCs, to the point where we had to restart it. I assume this is because building the merkle tree is expensive. The main issue is that the streaming from that node never recovers. We see the following:

Stream failed because /10.8.44.80 died or was restarted/removed (streams may still be active in background, but further streams won't be started)

Any way to force the streaming to restart? Have others seen this?

Thanks
Re: Cassandra Heap Size for data more than 1 TB
I think 512 is fine. Could you tell me more about your traffic characteristics?

Cem

On Wed, Oct 2, 2013 at 4:32 PM, srmore comom...@gmail.com wrote: [...]
Problem with sstableloader from text data
Hi,

Following the article at http://www.datastax.com/dev/blog/bulk-loading , I developed a custom builder app to serialize a text file with rows in JSON format to an SSTable. I managed to get the tool running and building the tables; however, when I try to load them I get this error:

sstableloader -d localhost demodb/
Exception in thread "main" java.lang.NullPointerException
    at org.apache.cassandra.io.sstable.SSTableLoader.<init>(SSTableLoader.java:64)
    at org.apache.cassandra.tools.BulkLoader.main(BulkLoader.java:64)

and when I try to decode the sstables to JSON I get this one:

sstable2json demodb/demodb-positions8-jb-1-Data.db
[
{"key": "000800bae94e08013f188b9bd00400","columns": [Exception in thread "main" java.lang.IllegalArgumentException
    at java.nio.Buffer.limit(Buffer.java:267)
    at org.apache.cassandra.db.marshal.AbstractCompositeType.getBytes(AbstractCompositeType.java:55)
    at org.apache.cassandra.db.marshal.AbstractCompositeType.getWithShortLength(AbstractCompositeType.java:64)
    at org.apache.cassandra.db.marshal.AbstractCompositeType.getString(AbstractCompositeType.java:230)
    at org.apache.cassandra.tools.SSTableExport.serializeColumn(SSTableExport.java:183)
    at org.apache.cassandra.tools.SSTableExport.serializeAtom(SSTableExport.java:152)
    at org.apache.cassandra.tools.SSTableExport.serializeAtoms(SSTableExport.java:140)
    at org.apache.cassandra.tools.SSTableExport.serializeRow(SSTableExport.java:238)
    at org.apache.cassandra.tools.SSTableExport.serializeRow(SSTableExport.java:223)
    at org.apache.cassandra.tools.SSTableExport.export(SSTableExport.java:360)
    at org.apache.cassandra.tools.SSTableExport.export(SSTableExport.java:382)
    at org.apache.cassandra.tools.SSTableExport.export(SSTableExport.java:394)
    at org.apache.cassandra.tools.SSTableExport.main(SSTableExport.java:477)

So it seems something is wrong with how I am writing the data.
These are the relevant parts of the code. This is the POJO to deserialize the JSON:

public class PositionJsonModel {
    @JsonProperty("iD") private Long idDevice;
    @JsonProperty("iU") private Long idUnit;
    @JsonProperty("iE") private Integer idEvent;
    @JsonProperty("iTE") private Integer idTypeEvent;
    @JsonProperty("tEv") private String timestampEvent;
    @JsonProperty("tRx") private String timestampRx;
    @JsonProperty("mi") private Long mileage;
    private Long lat;
    private Long lng;
    @JsonProperty("A1") private String country;
    @JsonProperty("A2") private String state;
    @JsonProperty("A3") private String county;
    @JsonProperty("A4") private String city;
    @JsonProperty("A5") private String locality;
    @JsonProperty("st") private String street;
    @JsonProperty("cn") private String civnum;
    @JsonProperty("in") private String info;
    @JsonProperty("sp") private Integer speed;
    // getters, setters, toString ...
}

And this is the main class:

BufferedReader reader = new BufferedReader(new FileReader(filename));
String keyspace = "demodb";
String columnFamily = "positions8";
File directory = new File(keyspace);
if (!directory.exists()) {
    directory.mkdir();
}
Murmur3Partitioner partitioner = new Murmur3Partitioner();
SSTableSimpleUnsortedWriter positionsWriter = new SSTableSimpleUnsortedWriter(
        directory, partitioner, keyspace, columnFamily, UTF8Type.instance, null, 64);
String line = "";
ObjectMapper mapper = new ObjectMapper();
while ((line = reader.readLine()) != null) {
    long timestamp = System.currentTimeMillis() * 1000;
    System.out.println("timestamp: " + timestamp);
    PositionJsonModel model = mapper.readValue(line, PositionJsonModel.class);
    //CREATE TABLE positions8 (
    //    iddevice bigint,
    //    timestampevent timestamp,
    //    idevent int,
    //    idunit bigint,
    //    status text,
    //    value text,
    //    PRIMARY KEY (iddevice, timestampevent, idevent)
    //) WITH CLUSTERING ORDER BY (timestampevent DESC, idevent ASC)
    List<AbstractType<?>> typeList = new ArrayList<AbstractType<?>>();
    typeList.add(LongType.instance);
    typeList.add(DateType.instance);
    typeList.add(IntegerType.instance);
    CompositeType compositeKeyTypes = CompositeType.getInstance(typeList);
    Builder cpBuilder = new Builder(compositeKeyTypes);
    System.out.println("getIdDevice: " + model.getIdDevice());
    System.out.println("getTimestampEvent: " + model.getTimestampEvent());
    System.out.println("getIdEvent: " + model.getIdEvent());
    cpBuilder.add(bytes(model.getIdDevice()));
Re: Cassandra Heap Size for data more than 1 TB
Sure. I was testing using high traffic, with about 6K-7K req/sec reads and writes combined. I added a node and ran repair; at this time the traffic was stopped and the heap was 8 G. I saw a lot of flushing and GC activity and finally it died with out of memory. So I gave it more memory (12 G) and restarted the nodes. This sped up the compactions and validations for around 12 hours, and now I am back to the flushing and high GC activity, even though there has been no traffic for more than 24 hours. Again, thanks for the help!

On Wed, Oct 2, 2013 at 10:19 AM, cem cayiro...@gmail.com wrote: [...]
Re: Cassandra Heap Size for data more than 1 TB
Did you upgrade your existing sstables after lowering the value? BTW: if you have tried all other avenues, then my suggestion is to increase your heap to 12 GB and ParNew to 3 GB. Test it out.

On Wed, Oct 2, 2013 at 5:25 AM, srmore comom...@gmail.com wrote: [...]
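For readers wondering why index_interval matters here: on 1.0.x both the partition index samples and the bloom filters live on heap, and the number of index samples is roughly total_keys / index_interval, so going from 128 to 512 cuts that term by 4x. A rough back-of-envelope sketch; the per-entry constants below are ballpark assumptions for illustration, not measured values:

```python
def estimate_index_heap_mb(total_keys, index_interval=128,
                           bytes_per_sample=88, bloom_bits_per_key=10):
    """Very rough on-heap estimate (MB) for index samples plus bloom
    filters on Cassandra 1.0.x, where both structures are on heap.
    bytes_per_sample and bloom_bits_per_key are assumed constants."""
    samples = total_keys // index_interval
    sample_bytes = samples * bytes_per_sample
    bloom_bytes = total_keys * bloom_bits_per_key // 8
    return (sample_bytes + bloom_bytes) / (1024 * 1024)

# e.g. 1 billion keys, interval 128 vs 512: only the sample term shrinks,
# the bloom filter term is unchanged (hence the bloom_filter_fp tuning).
for interval in (128, 512):
    print(interval, round(estimate_index_heap_mb(10**9, interval)))
```

The takeaway matches the thread: raising index_interval trades a bit of read latency for a 4x smaller index-sample footprint, but the bloom filter term dominates at a billion keys, which is why 1.2's off-heap bloom filters help so much.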
Re: Unable to bootstrap new node
On Wed, Oct 2, 2013 at 8:12 AM, Keith Wright kwri...@nanigans.com wrote:

We are running C* 1.2.8 with vnodes enabled and are attempting to bootstrap a new node [...] we are seeing one of the other nodes get stuck in full GCs to the point where we had to restart one of the nodes. I assume this is because building the merkle tree is expensive.

Merkle trees are only involved in repair, not in normal bootstrap. Have you considered lowering the throttle for streaming? Bootstrap will be slower but should be less likely to overwhelm the heap.

Any way to force the streaming to restart? Have others seen this?

In the bootstrap case, you can just wipe the bootstrapping node and re-start the bootstrap. In the general case regarding hung streaming:

https://issues.apache.org/jira/browse/CASSANDRA-3486

The only solution to hung non-bootstrap streaming is to restart all nodes participating in the streaming. With vnodes, this will probably approach 100% of nodes...

=Rob
Re: Best version to upgrade from 1.1.10 to 1.2.X
Hello,

I just started the rolling upgrade procedure from 1.1.10 to 1.2.10. Our strategy is to simultaneously upgrade one server from each replication group. So, if we have a 6-node cluster with RF=2, we will upgrade 3 nodes at a time (from distinct replication groups). My question is: do the newly upgraded nodes show as Down in the nodetool ring of the old cluster (1.1.10)? Because I thought that network compatibility meant nodes from a newer version would receive traffic (writes + reads) from the previous version without problems.

Cheers, Paulo

2013/9/26 Paulo Motta pauloricard...@gmail.com

Hello Charles,

Thank you very much for your detailed upgrade report. It'll be very helpful during our upgrade operation (even though we'll do a rolling production upgrade). I'll also share our findings during the upgrade here.

Cheers, Paulo

2013/9/24 Charles Brophy cbro...@zulily.com

Hi Paulo,

I just completed a migration from 1.1.10 to 1.2.10 and it was surprisingly painless. The course of action that I took:

1) describe cluster - make sure all nodes are on the same schema
2) shut off all maintenance tasks; i.e. make sure no scheduled repair is going to kick off in the middle of what you're doing
3) snapshot - maybe not necessary, but it's so quick it makes no sense to skip this step
4) drain the nodes - I shut down the entire cluster rather than chance any incompatible gossip concerns that might come from a rolling upgrade. I have the luxury of controlling both the providers and consumers of our data, so this wasn't so disruptive for us.
5) Upgrade the nodes, turn them on one by one, monitor the logs for funny business.
6) nodetool upgradesstables
7) Turn various maintenance tasks back on, etc.

The worst part was managing the yaml/config changes between the versions. It wasn't horrible, but the diff was noisier than a more incremental upgrade typically is.

A few things I recall that were special:

1) Since you have an existing cluster, you'll probably need to set the default partitioner back to RandomPartitioner in cassandra.yaml. I believe that is outlined in NEWS.
2) I set the initial tokens to be the same as what the nodes held previously.
3) The timeout is now divided into more atomic settings and you get to decide how (or if) to configure each from the default appropriately.

tl;dr: I did a standard upgrade and paid careful attention to the NEWS.txt upgrade notices. I did a full cluster restart and NOT a rolling upgrade. It went without a hitch.

Charles

On Tue, Sep 24, 2013 at 2:33 PM, Paulo Motta pauloricard...@gmail.com wrote:

Cool, sounds fair enough. Thanks for the help, Rob! If anyone has upgraded from 1.1.X to 1.2.X, please feel invited to share any tips on issues you've encountered that are not yet documented.

Cheers, Paulo

2013/9/24 Robert Coli rc...@eventbrite.com

On Tue, Sep 24, 2013 at 1:41 PM, Paulo Motta pauloricard...@gmail.com wrote:

Doesn't the probability of something going wrong increase as the gap between the versions increases? So, by this reasoning, upgrading from 1.1.10 to 1.2.6 would have less chance of something going wrong than from 1.1.10 to 1.2.9 or 1.2.10.

Sorta, but sorta not.

https://github.com/apache/cassandra/blob/trunk/NEWS.txt

is the canonical source of concerns on upgrade. There are a few cases where upgrading to the root of X.Y.Z creates issues that do not exist if you upgrade to the head of that line. AFAIK there have been no cases where upgrading to the head of a line (where that line is mature, like 1.2.10) has created problems which would have been avoided by upgrading to the root first.

I'm hoping this reasoning is wrong and I can upgrade directly from 1.1.10 to 1.2.10. :-)

That's what I plan to do when we move to 1.2.X, FWIW.
=Rob

--
Paulo Ricardo - European Master in Distributed Computing
Royal Institute of Technology - KTH
Instituto Superior Técnico - IST
http://paulormg.com
Re: Best version to upgrade from 1.1.10 to 1.2.X
Never mind the question - it was a firewall problem. Now the nodes on different versions are able to see each other! =)

Cheers, Paulo

2013/10/2 Paulo Motta pauloricard...@gmail.com [...]

--
Paulo Ricardo - European Master in Distributed Computing
Royal Institute of Technology - KTH
Instituto Superior Técnico - IST
http://paulormg.com