Re: Maintaining counter column consistency

2013-10-02 Thread Haithem Jarraya
Hi Ben,

If you make sure R + W > N you should be fine.
Have a read of this 
http://www.slideshare.net/benjaminblack/introduction-to-cassandra-replication-and-consistency

Thanks,

H
On 1 Oct 2013, at 18:29, Ben Hood 0x6e6...@gmail.com wrote:

Hi,

We're maintaining a bunch of application specific counters that are
incremented on a per event basis just after the event has been
inserted.

Given the fact that they can get out of sync, we were wondering if there
are any best practices or just plain real world experience for
handling the consistency of these counters?

The application could tolerate an inconsistency for a while, so I'm
not sure that the cost of any full-on ACID semantics (should they
actually be possible in Cassandra) would be justified.

So the first inclination was to issue the increment after the insert
and hope for the best. Then at some later point, we would run a
reconciliation on the underlying data in the column family and compare
this with the counter values. Obviously you can only do this once a
counter column has gone cold - i.e. it wouldn't make sense to
reconcile something that could still get incremented.
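
For concreteness, the reconciliation I have in mind would be roughly the
following, where events/event_counts and their columns are made-up names just
for illustration (and this assumes the per-key row count is small enough for a
CQL COUNT to be practical):

  -- recount the base rows for a key that has gone cold
  SELECT COUNT(*) FROM events WHERE key = 'k1';
  -- and compare against the summary counter
  SELECT total FROM event_counts WHERE key = 'k1';
  -- if the two disagree, apply a correcting increment/decrement to the counter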

Does it make sense to put the insert and increment in a CQL batch?
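
i.e. something like the following, using the same made-up events/event_counts
tables as above. As far as I know a counter update can't be mixed into the
same batch as a regular insert, so it would really be a normal statement (or
batch) plus a separate counter batch:

  INSERT INTO events (key, event_id, payload) VALUES ('k1', 42, '...');

  BEGIN COUNTER BATCH
    UPDATE event_counts SET total = total + 1 WHERE key = 'k1';
  APPLY BATCH;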

Does anybody have any high level advice for this design deliberation?

Cheers,

Ben



Re: Maintaining counter column consistency

2013-10-02 Thread Ben Hood
Hi Haithem,

I might have phrased my question wrongly - I wasn't referring to the 
considerations of consistency level or replication factors - I was referring to 
the fact that I want to insert a row and increment a counter in the same operation. 
I was concerned about the inconsistency that could arise if the counter 
increment failed after the underlying record on which the increment was based 
succeeded. So I wasn't talking about the consistency between Cassandra nodes, 
but rather the consistency between an idempotent base record and a non-idempotent 
summary counter.
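
To illustrate the non-idempotent side with the same sort of made-up schema as
before: retrying

  UPDATE event_counts SET total = total + 1 WHERE key = 'k1';

after a timeout can apply the increment twice, whereas re-running the INSERT of
the underlying event row (same primary key) just overwrites the same data.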

Cheers,

Ben

On October 2, 2013 at 10:09:40 AM, Haithem Jarraya (a-hjarr...@expedia.com) 
wrote:

Hi Ben,

If you make sure R + W > N you should be fine.
Have a read of this 
http://www.slideshare.net/benjaminblack/introduction-to-cassandra-replication-and-consistency

Thanks,

H
On 1 Oct 2013, at 18:29, Ben Hood 0x6e6...@gmail.com wrote:

Hi,

We're maintaining a bunch of application specific counters that are
incremented on a per event basis just after the event has been
inserted.

Given the fact that they can get out of sync, we were wondering if there
are any best practices or just plain real world experience for
handling the consistency of these counters?

The application could tolerate an inconsistency for a while, so I'm
not sure that the cost of any full-on ACID semantics (should they
actually be possible in Cassandra) would be justified.

So the first inclination was to issue the increment after the insert
and hope for the best. Then at some later point, we would run a
reconciliation on the underlying data in the column family and compare
this with the counter values. Obviously you can only do this once a
counter column has gone cold - i.e. it wouldn't make sense to
reconcile something that could still get incremented.

Does it make sense to put the insert and increment in a CQL batch?

Does anybody have any high level advice for this design deliberation?

Cheers,

Ben



Re: Cassandra Heap Size for data more than 1 TB

2013-10-02 Thread srmore
The version of Cassandra I am using is 1.0.11; we are migrating to 1.2.X
though. We had tuned the bloom filters (0.1) and AFAIK making it lower than
this won't matter.

Thanks !


On Tue, Oct 1, 2013 at 11:54 PM, Mohit Anchlia mohitanch...@gmail.com wrote:

 Which Cassandra version are you on? Essentially heap size is a function of
 the number of keys/metadata. In Cassandra 1.2 a lot of the metadata, like bloom
 filters, was moved off heap.


 On Tue, Oct 1, 2013 at 9:34 PM, srmore comom...@gmail.com wrote:

 Does anyone know what would roughly be the heap size for cassandra with
 1TB of data ? We started with about 200 G and now on one of the nodes we
 are already on 1 TB. We were using 8G of heap and that served us well up
 until we reached 700 G where we started seeing failures and nodes flipping.

 With 1 TB of data the node refuses to come back due to lack of memory.
 Needless to say, repairs and compactions take a lot of time. We upped the
 heap from 8 G to 12 G and suddenly everything started moving rapidly, i.e.
 the repair tasks and the compaction tasks. But soon (in about 9-10 hrs) we
 started seeing the same symptoms as we were seeing with 8 G.

 So my question is how do I determine what is the optimal size of heap for
 data around 1 TB ?

 Following are some of my JVM settings

 -Xms8G
 -Xmx8G
 -Xmn800m
 -XX:NewSize=1200M
 -XX:MaxTenuringThreshold=2
 -XX:SurvivorRatio=4

 Thanks !





RE: Rollback question regarding system metadata change

2013-10-02 Thread Christopher Wirt
I went with deleting the extra rows created in schema_columns and I've now
successfully bootstrapped three nodes back on 1.2.10. 
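
For the record, the delete was of roughly this shape - the keyspace/table/column
names below are placeholders, not the real ones, and in 1.2 the schema rows live
in system.schema_columns keyed by keyspace name, column family name and column
name (I can't vouch for every version letting you write to the system keyspace
from cqlsh):

  DELETE FROM system.schema_columns
  WHERE keyspace_name = 'my_keyspace'
    AND columnfamily_name = 'my_cf'
    AND column_name = 'offending_column';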

 

No sour side effects to report yet.

 

Thanks for your help

 

From: Robert Coli [mailto:rc...@eventbrite.com] 
Sent: 02 October 2013 01:00
To: user@cassandra.apache.org
Subject: Re: Rollback question regarding system metadata change

 

On Tue, Oct 1, 2013 at 3:45 PM, Chris Wirt chris.w...@struq.com wrote:

Yep they still work. They don't actually have any of the new system CFs
created for 2.0, paxos, etc., but they do have new rows in the
schema_columns table preventing startup and bootstrapping of new
nodes.

 

It *may* be least risky to manually remove these rows and then restart DC3.
But unfortunately without really diving into the code, I can't make any
statement about what effects it might have.

  

But anyway, actions to do this would be:
- drop schema (won't actually delete data?)

 

What actually happens is that you automatically create a snapshot in the
snapshots dir when you drop, so you would have to move (or (better) hard
link) those files back into place.

 

- create schema (will create all the metadata and leave my data
directories alone?)
- on each node run nodetool refresh (will load my existing data?)

 

Right. Refresh will rename all SSTables while opening them.

 

As an alternative to refresh, you can restart the node; Cassandra loads
whatever files it finds in the data dir at startup.

 

=Rob

 

 

 



Re: Cassandra Heap Size for data more than 1 TB

2013-10-02 Thread cem
Have a look at index_interval.

Cem.


On Wed, Oct 2, 2013 at 2:25 PM, srmore comom...@gmail.com wrote:

 The version of Cassandra I am using is 1.0.11, we are migrating to 1.2.X
 though. We had tuned bloom filters (0.1) and AFAIK making it lower than
 this won't matter.

 Thanks !


 On Tue, Oct 1, 2013 at 11:54 PM, Mohit Anchlia mohitanch...@gmail.com wrote:

 Which Cassandra version are you on? Essentially heap size is function of
 number of keys/metadata. In Cassandra 1.2 lot of the metadata like bloom
 filters were moved off heap.


 On Tue, Oct 1, 2013 at 9:34 PM, srmore comom...@gmail.com wrote:

 Does anyone know what would roughly be the heap size for cassandra with
 1TB of data ? We started with about 200 G and now on one of the nodes we
 are already on 1 TB. We were using 8G of heap and that served us well up
 until we reached 700 G where we started seeing failures and nodes flipping.

 With 1 TB of data the node refuses to come back due to lack of memory.
 needless to say repairs and compactions takes a lot of time. We upped the
 heap from 8 G to 12 G and suddenly everything started moving rapidly i.e.
 the repair tasks and the compaction tasks. But soon (in about 9-10 hrs) we
 started seeing the same symptoms as we were seeing with 8 G.

 So my question is how do I determine what is the optimal size of heap
 for data around 1 TB ?

 Following are some of my JVM settings

 -Xms8G
 -Xmx8G
 -Xmn800m
 -XX:NewSize=1200M
 XX:MaxTenuringThreshold=2
 -XX:SurvivorRatio=4

 Thanks !






Re: Cassandra Heap Size for data more than 1 TB

2013-10-02 Thread srmore
I changed my index_interval from 128 to 512; does it
make sense to increase it more than this?


On Wed, Oct 2, 2013 at 9:30 AM, cem cayiro...@gmail.com wrote:

 Have a look to index_interval.

 Cem.


 On Wed, Oct 2, 2013 at 2:25 PM, srmore comom...@gmail.com wrote:

 The version of Cassandra I am using is 1.0.11, we are migrating to 1.2.X
 though. We had tuned bloom filters (0.1) and AFAIK making it lower than
 this won't matter.

 Thanks !


  On Tue, Oct 1, 2013 at 11:54 PM, Mohit Anchlia mohitanch...@gmail.com wrote:

 Which Cassandra version are you on? Essentially heap size is function of
 number of keys/metadata. In Cassandra 1.2 lot of the metadata like bloom
 filters were moved off heap.


 On Tue, Oct 1, 2013 at 9:34 PM, srmore comom...@gmail.com wrote:

 Does anyone know what would roughly be the heap size for cassandra with
 1TB of data ? We started with about 200 G and now on one of the nodes we
 are already on 1 TB. We were using 8G of heap and that served us well up
 until we reached 700 G where we started seeing failures and nodes flipping.

 With 1 TB of data the node refuses to come back due to lack of memory.
 needless to say repairs and compactions takes a lot of time. We upped the
 heap from 8 G to 12 G and suddenly everything started moving rapidly i.e.
 the repair tasks and the compaction tasks. But soon (in about 9-10 hrs) we
 started seeing the same symptoms as we were seeing with 8 G.

 So my question is how do I determine what is the optimal size of heap
 for data around 1 TB ?

 Following are some of my JVM settings

 -Xms8G
 -Xmx8G
 -Xmn800m
 -XX:NewSize=1200M
 XX:MaxTenuringThreshold=2
 -XX:SurvivorRatio=4

 Thanks !







Issue with source command and utf8 file

2013-10-02 Thread Paolo Crosato

Hi,

I'm trying to load some data into Cassandra with the source command in cqlsh.
The file is UTF-8 encoded; however, Cassandra seems unable to handle the
UTF-8 encoded characters.


Here is a sample:

insert into 
positions8(iddevice,timestampevent,idunit,idevent,status,value) 
values(40135,'2013-06-06T10:08:02',13524915,0,'G','{sp:0,A1:FRANCE,lat:45216954,iDD:40135,A2:RHÔNE-ALPES,tEv:2013-06-06T10:08:02,iE:0,iTE:0,lng:6462520,iD:13318089,mi:0,st:ÉCHANGEUR 
DE 
ST-MICHEL-DE-MAURIENNE,A4:SAINT-MARTIN-D'ARC,iU:13524915,A3:SAVOIE,tRx:2013-06-06T10:12:56}');


Here is the hex dump of the file:

6e69 6573 7472 6920 746e 206f 6f70 6973
6974 6e6f 3873 6928 6464 7665 6369 2c65
6974 656d 7473 6d61 6570 6576 746e 692c
7564 696e 2c74 6469 7665 6e65 2c74 7473
7461 7375 762c 6c61 6575 2029 6176 756c
7365 3428 3130 3030 3030 3533 272c 3032
3331 302d 2d36 3630 3154 3a30 3830 303a
2732 312c 3533 3432 3139 2c35 2c30 4727
2c27 7b27 7322 2270 223a 2230 222c 3141
3a22 4622 4152 434e 2245 222c 616c 2274
223a 3534 3132 3936 3435 2c22 6922 
3a22 3422 3130 3030 3030 3533 2c22 4122
2232 223a 4852 94c3 454e 412d 504c 5345
2c22 7422 7645 3a22 3222 3130 2d33 3630
302d 5436 3031 303a 3a38 3230 2c22 6922
2245 223a 2230 222c 5469 2245 223a 2230
222c 6e6c 2267 223a 3436 3236 3235 2230
222c 4469 3a22 3122  3831 3830 2239
222c 696d 3a22 2c30 7322 2274 223a 89c3
4843 4e41 4547 5255 4420 2045 5453 4d2d
4349 4548 2d4c 4544 4d2d 5541 4952 4e45
454e 2c22 4122 2234 223a 4153 4e49 2d54
414d 5452 4e49 442d 4127 4352 2c22 6922
2255 223a 3331 3235 3934 3531 2c22 4122
2233 223a 4153 4f56 4549 2c22 7422 7852
3a22 3222 3130 2d33 3630 302d 5436 3031
313a 3a32 3635 7d22 2927 0a3b 000a

As an example, Ô is encoded as C394. When I try to load the file I get 
this error:


cqlsh:demodb> source 'rhone.cql';
rhone.cql:3:Incomplete statement at end of file

The error disappears only when I remove all the non-ASCII characters.

If I copy and paste the insert on cqlsh shell, it works.

Cassandra is installed on a CentOS 6.3 server and LANG is .UTF8. I tried 
connecting remotely both with GNOME Terminal and with PuTTY on Windows, 
with a UTF-8 shell, with no success in either case.


Has anybody got any clue?

Regards,

Paolo

--
Paolo Crosato
Software engineer/Custom Solutions



Unable to bootstrap new node

2013-10-02 Thread Keith Wright
Hi all,

   We are running C* 1.2.8 with vnodes enabled and are attempting to bootstrap 
a new node, and are having issues.  When we add the node we see it bootstrap and 
we see data start to stream over from other nodes; however, we are seeing one of 
the other nodes get stuck in full GCs, to the point where we had to restart it.  
I assume this is because building the merkle tree is expensive.  The main issue 
is that the streaming from the node never recovers.  We see the 
following:

Stream failed because /10.8.44.80 died or was restarted/removed (streams may 
still be active in background, but further streams won't be started)

Any way to force the streaming to restart?   Have others seen this?

Thanks



Re: Cassandra Heap Size for data more than 1 TB

2013-10-02 Thread cem
I think 512 is fine. Could you tell us more about your traffic
characteristics?

Cem


On Wed, Oct 2, 2013 at 4:32 PM, srmore comom...@gmail.com wrote:

 I changed my index_interval from 128 to index_interval: 128 to 512, does
 it make sense to increase more than this ?


 On Wed, Oct 2, 2013 at 9:30 AM, cem cayiro...@gmail.com wrote:

 Have a look to index_interval.

 Cem.


 On Wed, Oct 2, 2013 at 2:25 PM, srmore comom...@gmail.com wrote:

 The version of Cassandra I am using is 1.0.11, we are migrating to 1.2.X
 though. We had tuned bloom filters (0.1) and AFAIK making it lower than
 this won't matter.

 Thanks !


 On Tue, Oct 1, 2013 at 11:54 PM, Mohit Anchlia 
  mohitanch...@gmail.com wrote:

 Which Cassandra version are you on? Essentially heap size is function
 of number of keys/metadata. In Cassandra 1.2 lot of the metadata like bloom
 filters were moved off heap.


 On Tue, Oct 1, 2013 at 9:34 PM, srmore comom...@gmail.com wrote:

 Does anyone know what would roughly be the heap size for cassandra
 with 1TB of data ? We started with about 200 G and now on one of the nodes
 we are already on 1 TB. We were using 8G of heap and that served us well 
 up
 until we reached 700 G where we started seeing failures and nodes 
 flipping.

 With 1 TB of data the node refuses to come back due to lack of memory.
 needless to say repairs and compactions takes a lot of time. We upped the
 heap from 8 G to 12 G and suddenly everything started moving rapidly i.e.
 the repair tasks and the compaction tasks. But soon (in about 9-10 hrs) we
 started seeing the same symptoms as we were seeing with 8 G.

 So my question is how do I determine what is the optimal size of heap
 for data around 1 TB ?

 Following are some of my JVM settings

 -Xms8G
 -Xmx8G
 -Xmn800m
 -XX:NewSize=1200M
 XX:MaxTenuringThreshold=2
 -XX:SurvivorRatio=4

 Thanks !








Problem with sstableloader from text data

2013-10-02 Thread Paolo Crosato

Hi,

following the article at http://www.datastax.com/dev/blog/bulk-loading,
I developed a custom builder app to serialize a text file with rows in 
JSON format to an sstable.
I managed to get the tool running and building the tables; however, when 
I try to load them I get this error:


sstableloader -d localhost demodb/
Exception in thread "main" java.lang.NullPointerException
at 
org.apache.cassandra.io.sstable.SSTableLoader.<init>(SSTableLoader.java:64)

at org.apache.cassandra.tools.BulkLoader.main(BulkLoader.java:64)

and when I try to decode the sstables to json I get this one:

sstable2json demodb/demodb-positions8-jb-1-Data.db
[
{"key": 
"000800bae94e08013f188b9bd00400","columns": 
[Exception in thread "main" java.lang.IllegalArgumentException

at java.nio.Buffer.limit(Buffer.java:267)
at 
org.apache.cassandra.db.marshal.AbstractCompositeType.getBytes(AbstractCompositeType.java:55)
at 
org.apache.cassandra.db.marshal.AbstractCompositeType.getWithShortLength(AbstractCompositeType.java:64)
at 
org.apache.cassandra.db.marshal.AbstractCompositeType.getString(AbstractCompositeType.java:230)
at 
org.apache.cassandra.tools.SSTableExport.serializeColumn(SSTableExport.java:183)
at 
org.apache.cassandra.tools.SSTableExport.serializeAtom(SSTableExport.java:152)
at 
org.apache.cassandra.tools.SSTableExport.serializeAtoms(SSTableExport.java:140)
at 
org.apache.cassandra.tools.SSTableExport.serializeRow(SSTableExport.java:238)
at 
org.apache.cassandra.tools.SSTableExport.serializeRow(SSTableExport.java:223)
at 
org.apache.cassandra.tools.SSTableExport.export(SSTableExport.java:360)
at 
org.apache.cassandra.tools.SSTableExport.export(SSTableExport.java:382)
at 
org.apache.cassandra.tools.SSTableExport.export(SSTableExport.java:394)
at 
org.apache.cassandra.tools.SSTableExport.main(SSTableExport.java:477)


So it seems something is wrong with me streaming the data.
These are the relevant parts of the code:

This is the pojo to deserialize the json:

public class PositionJsonModel {

    @JsonProperty("iD")
    private Long idDevice;
    @JsonProperty("iU")
    private Long idUnit;
    @JsonProperty("iE")
    private Integer idEvent;
    @JsonProperty("iTE")
    private Integer idTypeEvent;
    @JsonProperty("tEv")
    private String timestampEvent;
    @JsonProperty("tRx")
    private String timestampRx;
    @JsonProperty("mi")
    private Long mileage;
    private Long lat;
    private Long lng;
    @JsonProperty("A1")
    private String country;
    @JsonProperty("A2")
    private String state;
    @JsonProperty("A3")
    private String county;
    @JsonProperty("A4")
    private String city;
    @JsonProperty("A5")
    private String locality;
    @JsonProperty("st")
    private String street;
    @JsonProperty("cn")
    private String civnum;
    @JsonProperty("in")
    private String info;
    @JsonProperty("sp")
    private Integer speed;

    // getters, setters, toString
    ...

And this is the main class:


BufferedReader reader = new BufferedReader(new FileReader(filename));

String keyspace = "demodb";
String columnFamily = "positions8";
File directory = new File(keyspace);
if (!directory.exists()) {
    directory.mkdir();
}
Murmur3Partitioner partitioner = new Murmur3Partitioner();
SSTableSimpleUnsortedWriter positionsWriter =
    new SSTableSimpleUnsortedWriter(directory, partitioner, keyspace, columnFamily,
        UTF8Type.instance, null, 64);

String line = "";
ObjectMapper mapper = new ObjectMapper();
while ((line = reader.readLine()) != null) {
    long timestamp = System.currentTimeMillis() * 1000;
    System.out.println("timestamp: " + timestamp);
    PositionJsonModel model = mapper.readValue(line, PositionJsonModel.class);

    // CREATE TABLE positions8 (
    //   iddevice bigint,
    //   timestampevent timestamp,
    //   idevent int,
    //   idunit bigint,
    //   status text,
    //   value text,
    //   PRIMARY KEY (iddevice, timestampevent, idevent)
    // ) WITH CLUSTERING ORDER BY (timestampevent DESC, idevent ASC)

    List<AbstractType<?>> typeList = new ArrayList<AbstractType<?>>();

    typeList.add(LongType.instance);
    typeList.add(DateType.instance);
    typeList.add(IntegerType.instance);
    CompositeType compositeKeyTypes = CompositeType.getInstance(typeList);

    Builder cpBuilder = new Builder(compositeKeyTypes);
    System.out.println("getIdDevice: " + model.getIdDevice());
    System.out.println("getTimestampEvent: " + model.getTimestampEvent());
    System.out.println("getIdEvent: " + model.getIdEvent());
    cpBuilder.add(bytes(model.getIdDevice()));

Re: Cassandra Heap Size for data more than 1 TB

2013-10-02 Thread srmore
Sure. I was testing using high traffic with about 6K - 7K req/sec reads and
writes combined. I added a node and ran repair; at this time the traffic was
stopped and the heap was 8G. I saw a lot of flushing and GC activity and
finally it died saying out of memory. So I gave it more memory (12 G) and
started the nodes. This sped up the compactions and validations for around
12 hours, and now I am back to the flushing and high GC activity; at this
point there has been no traffic for more than 24 hours.

Again, thanks for the help !


On Wed, Oct 2, 2013 at 10:19 AM, cem cayiro...@gmail.com wrote:

 I think 512 is fine. Could you tell more about your traffic
 characteristics?

 Cem


 On Wed, Oct 2, 2013 at 4:32 PM, srmore comom...@gmail.com wrote:

 I changed my index_interval from 128 to index_interval: 128 to 512, does
 it make sense to increase more than this ?


 On Wed, Oct 2, 2013 at 9:30 AM, cem cayiro...@gmail.com wrote:

 Have a look to index_interval.

 Cem.


 On Wed, Oct 2, 2013 at 2:25 PM, srmore comom...@gmail.com wrote:

 The version of Cassandra I am using is 1.0.11, we are migrating to
 1.2.X though. We had tuned bloom filters (0.1) and AFAIK making it lower
 than this won't matter.

 Thanks !


 On Tue, Oct 1, 2013 at 11:54 PM, Mohit Anchlia 
  mohitanch...@gmail.com wrote:

 Which Cassandra version are you on? Essentially heap size is function
 of number of keys/metadata. In Cassandra 1.2 lot of the metadata like 
 bloom
 filters were moved off heap.


 On Tue, Oct 1, 2013 at 9:34 PM, srmore comom...@gmail.com wrote:

 Does anyone know what would roughly be the heap size for cassandra
 with 1TB of data ? We started with about 200 G and now on one of the 
 nodes
 we are already on 1 TB. We were using 8G of heap and that served us well 
 up
 until we reached 700 G where we started seeing failures and nodes 
 flipping.

 With 1 TB of data the node refuses to come back due to lack of
 memory. needless to say repairs and compactions takes a lot of time. We
 upped the heap from 8 G to 12 G and suddenly everything started moving
 rapidly i.e. the repair tasks and the compaction tasks. But soon (in 
 about
 9-10 hrs) we started seeing the same symptoms as we were seeing with 8 G.

 So my question is how do I determine what is the optimal size of heap
 for data around 1 TB ?

 Following are some of my JVM settings

 -Xms8G
 -Xmx8G
 -Xmn800m
 -XX:NewSize=1200M
 XX:MaxTenuringThreshold=2
 -XX:SurvivorRatio=4

 Thanks !









Re: Cassandra Heap Size for data more than 1 TB

2013-10-02 Thread Mohit Anchlia
Did you upgrade your existing sstables after lowering the value?

BTW: If you have tried all other avenues then my suggestion is to increase
your heap to 12GB and ParNew to 3GB. Test it out.

On Wed, Oct 2, 2013 at 5:25 AM, srmore comom...@gmail.com wrote:

 The version of Cassandra I am using is 1.0.11, we are migrating to 1.2.X
 though. We had tuned bloom filters (0.1) and AFAIK making it lower than
 this won't matter.

 Thanks !


 On Tue, Oct 1, 2013 at 11:54 PM, Mohit Anchlia mohitanch...@gmail.com wrote:

 Which Cassandra version are you on? Essentially heap size is function of
 number of keys/metadata. In Cassandra 1.2 lot of the metadata like bloom
 filters were moved off heap.


 On Tue, Oct 1, 2013 at 9:34 PM, srmore comom...@gmail.com wrote:

 Does anyone know what would roughly be the heap size for cassandra with
 1TB of data ? We started with about 200 G and now on one of the nodes we
 are already on 1 TB. We were using 8G of heap and that served us well up
 until we reached 700 G where we started seeing failures and nodes flipping.

 With 1 TB of data the node refuses to come back due to lack of memory.
 needless to say repairs and compactions takes a lot of time. We upped the
 heap from 8 G to 12 G and suddenly everything started moving rapidly i.e.
 the repair tasks and the compaction tasks. But soon (in about 9-10 hrs) we
 started seeing the same symptoms as we were seeing with 8 G.

 So my question is how do I determine what is the optimal size of heap
 for data around 1 TB ?

 Following are some of my JVM settings

 -Xms8G
 -Xmx8G
 -Xmn800m
 -XX:NewSize=1200M
 XX:MaxTenuringThreshold=2
 -XX:SurvivorRatio=4

 Thanks !






Re: Unable to bootstrap new node

2013-10-02 Thread Robert Coli
On Wed, Oct 2, 2013 at 8:12 AM, Keith Wright kwri...@nanigans.com wrote:

We are running C* 1.2.8 with Vnodes enabled and are attempting to
 bootstrap a new node and are having issues.  When we add the node we see it
 bootstrap and we see data start to stream over from other nodes however we
 are seeing one of the other nodes get stuck in full GCs to the point where
 we had to restart one of the nodes.  I assume this is because building the
 merkle tree is expensive.


Merkle trees are only involved in repair, not in normal bootstrap. Have
you considered lowering the throttle for streaming? Bootstrap will be
slower but should be less likely to overwhelm heap.


 Any way to force the streaming to restart?   Have others seen this?


In the bootstrap case, you can just wipe the bootstrapping node and
re-start the bootstrap.

In the general case regarding hung streaming:

https://issues.apache.org/jira/browse/CASSANDRA-3486

The only solution to hung non-bootstrap streaming is to restart all nodes
participating in the streaming. With vnodes, this will probably approach
100% of nodes...

=Rob


Re: Best version to upgrade from 1.1.10 to 1.2.X

2013-10-02 Thread Paulo Motta
Hello,

I just started the rolling upgrade procedure from 1.1.10 to 1.2.10. Our
strategy is to simultaneously upgrade one server from each replication
group. So, if we have 6 nodes with RF=2, we will upgrade 3 nodes at a
time (from distinct replication groups).

My question is: do the newly upgraded nodes show as Down in the nodetool
ring output of the old cluster (1.1.10)? Because I thought that network
compatibility meant nodes on a newer version would receive traffic (writes
+ reads) from the previous version without problems.

Cheers,

Paulo


2013/9/26 Paulo Motta pauloricard...@gmail.com

 Hello Charles,

 Thank you very much for your detailed upgrade report. It'll be very
 helpful during our upgrade operation (even though we'll do a rolling
 production upgrade).

 I'll also share our findings during the upgrade here.

 Cheers,

 Paulo


 2013/9/24 Charles Brophy cbro...@zulily.com

 Hi Paulo,

 I just completed a migration from 1.1.10 to 1.2.10 and it was
 surprisingly painless.

 The course of action that I took:
 1) describe cluster - make sure all nodes are on the same schema
 2) shut off all maintenance tasks; i.e. make sure no scheduled repair is
 going to kick off in the middle of what you're doing
 3) snapshot - maybe not necessary but it's so quick it makes no sense to
 skip this step
 4) drain the nodes - I shut down the entire cluster rather than chance
 any incompatible gossip concerns that might come from a rolling upgrade. I
 have the luxury of controlling both the providers and consumers of our
 data, so this wasn't so disruptive for us.
 5) Upgrade the nodes, turn them on one-by-one, monitor the logs for funny
 business.
 6) nodetool upgradesstables
 7) Turn various maintenance tasks back on, etc.

 The worst part was managing the yaml/config changes between the versions.
 It wasn't horrible, but the diff was noisier than a more incremental
 upgrade typically is. A few things I recall that were special:
 1) Since you have an existing cluster, you'll probably need to set the
 default partitioner back to RandomPartitioner in cassandra.yaml. I believe
 that is outlined in NEWS.
 2) I set the initial tokens to be the same as what the nodes held
 previously.
 3) The timeout is now divided into more atomic settings and you get to
 decide how (or if) to configure it from the default appropriately.

 tldr; I did a standard upgrade and paid careful attention to the
 NEWS.txt upgrade notices. I did a full cluster restart and NOT a rolling
 upgrade. It went without a hitch.

 Charles






  On Tue, Sep 24, 2013 at 2:33 PM, Paulo Motta pauloricard...@gmail.com wrote:

 Cool, sounds fair enough. Thanks for the help, Rob!

 If anyone has upgraded from 1.1.X to 1.2.X, please feel invited to share
 any tips on issues you've encountered that are not yet documented.

 Cheers,

 Paulo


 2013/9/24 Robert Coli rc...@eventbrite.com

 On Tue, Sep 24, 2013 at 1:41 PM, Paulo Motta 
  pauloricard...@gmail.com wrote:

 Doesn't the probability of something going wrong increase as the gap
 between the versions increases? So, using this reasoning, upgrading from
 1.1.10 to 1.2.6 would have less chance of something going wrong than from
 1.1.10 to 1.2.9 or 1.2.10.


 Sorta, but sorta not.

 https://github.com/apache/cassandra/blob/trunk/NEWS.txt

 Is the canonical source of concerns on upgrade. There are a few cases
 where upgrading to the root of X.Y.Z creates issues that do not exist if
 you upgrade to the head of that line. AFAIK there have been no cases
 where upgrading to the head of a line (where that line is mature, like
 1.2.10) has created problems which would have been avoided by upgrading to
 the root first.


 I'm hoping this reasoning is wrong and I can update directly from
 1.1.10 to 1.2.10. :-)


 That's what I plan to do when we move to 1.2.X, FWIW.

 =Rob




 --
 Paulo Ricardo

 --
 European Master in Distributed Computing***
 Royal Institute of Technology - KTH
 *
 *Instituto Superior Técnico - IST*
 *http://paulormg.com*





 --
 Paulo Ricardo

 --
 European Master in Distributed Computing***
 Royal Institute of Technology - KTH
 *
 *Instituto Superior Técnico - IST*
 *http://paulormg.com*




-- 
Paulo Ricardo

-- 
European Master in Distributed Computing***
Royal Institute of Technology - KTH
*
*Instituto Superior Técnico - IST*
*http://paulormg.com*


Re: Best version to upgrade from 1.1.10 to 1.2.X

2013-10-02 Thread Paulo Motta
Never mind the question. It was a firewall problem. Now the nodes running
different versions are able to see each other! =)

Cheers,

Paulo


2013/10/2 Paulo Motta pauloricard...@gmail.com

 Hello,

 I just started the rolling upgrade procedure from 1.1.10 to 1.2.10. Our
 strategy is to simultaneously upgrade one server from each replication
 group. So, if we have a 6 nodes with RF=2, we will upgrade 3 nodes at a
 time (from distinct replication groups).

 My question is: do the newly upgraded nodes show as Down in the
 nodetool ring of the old cluster (1.1.10)? Because I thought that network
 compatibility meant nodes from a newer version would receive traffic (write
 + reads) from the previous version without problems.

 Cheers,

 Paulo


 2013/9/26 Paulo Motta pauloricard...@gmail.com

 Hello Charles,

 Thank you very much for your detailed upgrade report. It'll be very
 helpful during our upgrade operation (even though we'll do a rolling
 production upgrade).

 I'll also share our findings during the upgrade here.

 Cheers,

 Paulo


 2013/9/24 Charles Brophy cbro...@zulily.com

 Hi Paulo,

 I just completed a migration from 1.1.10 to 1.2.10 and it was
 surprisingly painless.

 The course of action that I took:
 1) describe cluster - make sure all nodes are on the same schema
 2) shutoff all maintenance tasks; i.e. make sure no scheduled repair is
 going to kick off in the middle of what you're doing
 3) snapshot - maybe not necessary but it's so quick it makes no sense to
 skip this step
 4) drain the nodes - I shut down the entire cluster rather than chance
 any incompatible gossip concerns that might come from a rolling upgrade. I
 have the luxury of controlling both the providers and consumers of our
 data, so this wasn't so disruptive for us.
 5) Upgrade the nodes, turn them on one-by-one, monitor the logs for
 funny business.
 6) nodetool upgradesstables
 7) Turn various maintenance tasks back on, etc.

 The worst part was managing the yaml/config changes between the
 versions. It wasn't horrible, but the diff was noisier than a more
 incremental upgrade typically is. A few things I recall that were special:
 1) Since you have an existing cluster, you'll probably need to set the
 default partitioner back to RandomPartitioner in cassandra.yaml. I believe
 that is outlined in NEWS.
 2) I set the initial tokens to be the same as what the nodes held
 previously.
 3) The timeout is now divided into more atomic settings and you get to
 decided how (or if) to configure it from the default appropriately.

 tldr; I did a standard upgrade and payed careful attention to the
 NEWS.txt upgrade notices. I did a full cluster restart and NOT a rolling
 upgrade. It went without a hitch.

 Charles






 On Tue, Sep 24, 2013 at 2:33 PM, Paulo Motta 
  pauloricard...@gmail.com wrote:

 Cool, sounds fair enough. Thanks for the help, Rob!

 If anyone has upgraded from 1.1.X to 1.2.X, please feel invited to
 share any tips on issues you're encountered that are not yet documented.

 Cheers,

 Paulo


 2013/9/24 Robert Coli rc...@eventbrite.com

 On Tue, Sep 24, 2013 at 1:41 PM, Paulo Motta pauloricard...@gmail.com
  wrote:

 Doesn't the probability of something going wrong increases as the gap
 between the versions increase? So, using this reasoning, upgrading from
 1.1.10 to 1.2.6 would have less chance of something going wrong then from
 1.1.10 to 1.2.9 or 1.2.10.


 Sorta, but sorta not.

 https://github.com/apache/cassandra/blob/trunk/NEWS.txt

 Is the canonical source of concerns on upgrade. There are a few cases
 where upgrading to the root of X.Y.Z creates issues that do not exist if
 you upgrade to the head of that line. AFAIK there have been no cases
 where upgrading to the head of a line (where that line is mature, like
 1.2.10) has created problems which would have been avoided by upgrading to
 the root first.


 I'm hoping this reasoning is wrong and I can update directly from
 1.1.10 to 1.2.10. :-)


 That's what I plan to do when we move to 1.2.X, FWIW.

 =Rob




 --
 Paulo Ricardo

 --
 European Master in Distributed Computing***
 Royal Institute of Technology - KTH
 *
 *Instituto Superior Técnico - IST*
 *http://paulormg.com*





 --
 Paulo Ricardo

 --
 European Master in Distributed Computing***
 Royal Institute of Technology - KTH
 *
 *Instituto Superior Técnico - IST*
 *http://paulormg.com*




 --
 Paulo Ricardo

 --
 European Master in Distributed Computing***
 Royal Institute of Technology - KTH
 *
 *Instituto Superior Técnico - IST*
 *http://paulormg.com*




-- 
Paulo Ricardo

-- 
European Master in Distributed Computing***
Royal Institute of Technology - KTH
*
*Instituto Superior Técnico - IST*
*http://paulormg.com*