Re: Column family ID mismatch-Error on concurrent schema modifications

2014-11-30 Thread Jens-U. Mozdzen

Hi Eric,

Quoting Eric Stevens migh...@gmail.com:

@Jens,


will inactive CFs be released from C*'s memory after e.g. a few days
or when under resource pressure?


No, certain memory structures are allocated and will remain resident on
each node for as long as the table exists.


That's good to know, while not so good for our current design ;)


These CFs are used as time buckets, but are to be kept for speedy
recovery

I would recommend a structure where you include the time bucket as part of
your primary key and use a single column family for all time buckets. Use
TTLs if you want this old data to expire automatically after a certain
amount of time.
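
As a rough illustration of the single-CF layout suggested above, here is a minimal sketch. The table name, column names, daily bucket granularity, and one-week TTL are all assumptions made for the example and do not come from the thread. It only builds the CQL text and the bucket value; it does not talk to a cluster.

```python
from datetime import datetime, timezone

# Hypothetical schema: every name here is illustrative, not from the thread.
# The time bucket is the partition key, so all buckets share one table.
CREATE_TABLE = """
CREATE TABLE IF NOT EXISTS events (
    bucket     text,
    event_time timestamp,
    event_id   timeuuid,
    payload    text,
    PRIMARY KEY ((bucket), event_time, event_id)
)
"""

# Assumed retention of one week; rows expire on their own after this.
TTL_SECONDS = 7 * 24 * 3600

INSERT = (
    "INSERT INTO events (bucket, event_time, event_id, payload) "
    f"VALUES (?, ?, ?, ?) USING TTL {TTL_SECONDS}"
)


def bucket_for(ts: datetime) -> str:
    """Map a timezone-aware timestamp to its daily bucket (UTC)."""
    return ts.astimezone(timezone.utc).strftime("%Y-%m-%d")
```

With millions of rows per bucket, a daily bucket may produce very large partitions, so a finer granularity or an extra partition key component might be worth testing.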


Just FYI:

We've had a severe impact on access times when running in single-CF  
mode, because each bucket itself is quite big (large number of rows,  
but with small records). That's when we turned to separating data into  
CFs per time bucket, using TTLs to auto-expire data and deleting the  
CF once the count of rows dropped to zero. We're talking about  
millions of (small) records here, not thousands. Per bucket.


As we created an architecture that will let us switch between these  
models, we'll re-test to check the performance impact of both variants.


Best regards,
Jens



Performance Difference between Batch Insert and Bulk Load

2014-11-30 Thread Dong Dai
Hi all,

I have a performance question about batch insert and bulk load.

According to the documentation, both batch insert and bulk load are options
for importing a large volume of data into Cassandra. Using batch insert is
pretty straightforward, but there has not been an ‘official’ way to use bulk
load to import the data (in this case, I mean data that is generated online).

So I am thinking that clients first use CQLSSTableWriter to create the SSTable
files, then use “org.apache.cassandra.tools.BulkLoader” to import these
SSTables into Cassandra directly.

The question is: can I expect better performance using the BulkLoader this way
compared with using batch insert?

I am not so familiar with the implementation of bulk load, but I do see a huge
performance improvement using batch insert. I really want to know the upper
limits of the write performance. Any comments will be helpful. Thanks!

- Dong

does safe cassandra shutdown require disable binary?

2014-11-30 Thread Kevin Burton
I’m trying to figure out a safe way to do a rolling restart.

http://devblog.michalski.im/2012/11/25/safe-cassandra-shutdown-and-restart/

It has the following commands, which make sense:

root@cssa01:~# nodetool -h cssa01.michalski.im disablegossip
root@cssa01:~# nodetool -h cssa01.michalski.im disablethrift
root@cssa01:~# nodetool -h cssa01.michalski.im drain


… but I don’t think this takes CQL into consideration.


So you would first disablethrift, then disablebinary


Is anything else needed in modern Cassandra?
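
The sequence being asked about can be sketched as a small script. This is only an illustration of the order proposed in this message (gossip, then Thrift, then the native CQL transport, then drain). Whether that exact order is required is not confirmed here, and the helper names and the dry-run default are assumptions.

```python
import subprocess
from typing import List

# Order as proposed in the message above; treat it as an assumption,
# not as a verified procedure.
SHUTDOWN_STEPS = ["disablegossip", "disablethrift", "disablebinary", "drain"]


def shutdown_commands(host: str) -> List[List[str]]:
    """Build one nodetool invocation per shutdown step for the given host."""
    return [["nodetool", "-h", host, step] for step in SHUTDOWN_STEPS]


def run_shutdown(host: str, dry_run: bool = True) -> None:
    """Print the commands, or execute them in order when dry_run is False."""
    for cmd in shutdown_commands(host):
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True stops the sequence if any step fails.
            subprocess.run(cmd, check=True)
```

Calling run_shutdown("cssa01.michalski.im") prints the four commands without running anything.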

-- 

Founder/CEO Spinn3r.com
Location: *San Francisco, CA*
blog: http://burtonator.wordpress.com
… or check out my Google+ profile
https://plus.google.com/102718274791889610666/posts
http://spinn3r.com


Cassandra add a node and remove a node

2014-11-30 Thread Neha Trivedi
Hi,
I need to add a new node and remove an existing node.

Should I first remove the node and then add the new node, or add the new node
and then remove the existing node?
Which practice is better, and what do I need to take care of?

regards
Neha


Re: Cassandra add a node and remove a node

2014-11-30 Thread Jens Rantil
Hi Neha,




Generally, best practice is to add the new node before removing the old one.
This is especially important if the cluster’s resources (such as available disk
space) are low. Also, adding the new node first lets you verify that it is
functioning correctly (check the logs) before decommissioning the old node. See [1].




[1] http://www.datastax.com/documentation/cassandra/2.0/cassandra/operations/ops_replace_live_node.html




Cheers,

Jens




———
Jens Rantil
Backend engineer
Tink AB

Email: jens.ran...@tink.se
Phone: +46 708 84 18 32
Web: www.tink.se

