Re: Cassandra OOM, many deletedColumn

2013-03-13 Thread aaron morton
> For JVM Heap it is 2G Try 4G > and gc_grace = 1800 Realised that I did not provide a warning about the implication this has for nodetool repair. If you are doing deletes on the CF you need to run nodetool repair every gc_grace seconds. In this case I think your main problem was not enough

Re: Best practices for nodes to come back

2013-03-13 Thread aaron morton
If your node has been dead for less than gc_grace you can return it to the cluster and run nodetool repair (without the -pr). Until repair has completed you will be getting inconsistent results, but if you have been using ONE / ONE for all ops that is a possibility for everything. If the node has b

Re: Quorum read after quorum write guarantee

2013-03-13 Thread aaron morton
>> If you are doing reads and writes using QUORUM double check that your code >> is correct. If it is, provide some more info on what you are seeing. > > It seems correct. What more info could be of value? Can you reproduce the fault in a small script/program? We have a pretty solid history of Qu

Re: Row cache off-heap ?

2013-03-13 Thread aaron morton
> No, I didn't. I used the nodetool setcachecapacity and didn't restart the > node. ok. > I find them huge, and just happened on the node in which I had enabled row > cache. I just enabled it on .164 node from 10:45 to 10:48 and the heap size > doubled from 3.5GB to 7GB (out of 8, which induc

Re: Cassandra cluster setup issue

2013-03-13 Thread aaron morton
Try it on the localhost. A - Aaron Morton Freelance Cassandra Consultant New Zealand @aaronmorton http://www.thelastpickle.com On 11/03/2013, at 11:29 PM, amulya rattan wrote: > > Here is a screenshot showing command ran and the fact that cassandra started > fine with JMX bo

Re: HintedHandoff IOError?

2013-03-13 Thread aaron morton
> What is the sanctioned way of removing hints? rm -f HintsColumnFamily*? > Truncate from CLI? There is a JMX command to do it for a particular node. But if you just want to remove all of them, stop and delete the files. > the only one with zero size are the -tmp- files. It seems odd… Temp fi

Re: repair hangs

2013-03-13 Thread Dane Miller
On Wed, Mar 13, 2013 at 12:39 PM, Wei Zhu wrote: > My guess would be there is some exception during the repair and your session > is aborted. > Here is the code of doing repair: > >https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/service/AntiEntropyService.java > > loo

Re: mmaped files and swap

2013-03-13 Thread Jared Biel
See http://en.wikipedia.org/wiki/Swappiness On 13 March 2013 19:56, Fredrik Stigbäck wrote: > Well, we've seen a Cassandra process swap out 500 MB on a Linux OS > with plenty of RAM, so I was just curious as to why the OS thinks it > should use the swap at all. > > 2013/3/13 karim duran : > > I agr

Re: mmaped files and swap

2013-03-13 Thread Fredrik Stigbäck
Well, we've seen a Cassandra process swap out 500 MB on a Linux OS with plenty of RAM, so I was just curious as to why the OS thinks it should use the swap at all. 2013/3/13 karim duran : > I agree with Edward Capriolo, > Even when swap is enabled on your system, swapping rarely occurs on OS > today..

Re: repair hangs

2013-03-13 Thread Wei Zhu
My guess would be there is some exception during the repair and your session is aborted. Here is the code that does the repair: https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/service/AntiEntropyService.java Look for logger.info. Compare that with your log file, it s

Re: repair hangs

2013-03-13 Thread Dane Miller
On Wed, Mar 13, 2013 at 11:44 AM, Wei Zhu wrote: >Do you see anything related to "merkle" tree in your log? > >Also do a nodetool compactionstats, during merkle tree calculation, you will >see >validation there. The last mention of "merkle" is 2 days old. compactionstats are: $ nodetool compac

Re: About the heap

2013-03-13 Thread Wei Zhu
Here is the JIRA I submitted regarding the ancestor. https://issues.apache.org/jira/browse/CASSANDRA-5342 -Wei - Original Message - From: "Wei Zhu" To: user@cassandra.apache.org Sent: Wednesday, March 13, 2013 11:35:29 AM Subject: Re: About the heap Hi Dean, The index_interval is cont

Re: repair hangs

2013-03-13 Thread Wei Zhu
Do you see anything related to "merkle" tree in your log? Also do a nodetool compactionstats, during merkle tree calculation, you will see validation there. -Wei - Original Message - From: "Dane Miller" To: user@cassandra.apache.org Sent: Wednesday, March 13, 2013 10:54:50 AM Subject:

Re: data model to store large volume syslog

2013-03-13 Thread Aaron Turner
On Wed, Mar 13, 2013 at 4:23 AM, Mohan L wrote: > > > On Fri, Mar 8, 2013 at 9:42 PM, aaron morton > wrote: >> >> > 1). create a column family 'cfrawlog' which stores raw log as received. >> > row key could be 'ddmmhh'(new row is added for each hour or less), each >> > 'column name' is uuid w

Re: About the heap

2013-03-13 Thread Wei Zhu
It's not BloomFilter. Cassandra will read through sstable index files on start-up, doing what is known as "index sampling". This is used to keep a subset (currently and by default, 1 out of 100) of keys and their on-disk location in the index, in memory. See ArchitectureInternals. This mea

Re: About the heap

2013-03-13 Thread Alain RODRIGUEZ
"You can trigger minor compaction on an individual SStable file when the percentage of tombstones in that Sstable crosses a user-defined threshold." We have just one cf with TTL. I don't think the problem comes from there. "Peaks may be occurring during compaction, when Sstable files are memmappe

Re: About the heap

2013-03-13 Thread Wei Zhu
Hi Dean, The index_interval is controlling the sampling of the SSTable to speed up the lookup of the keys in the SSTable. Here is the code: https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/DataTracker.java#L478 To increase the interval meaning, taking less samples,

Re: Pig / Map Reduce on Cassandra

2013-03-13 Thread cscetbon.ext
Finally I've found the answer in CassandraStorage.java ! columns is not an alias but a bag that you fill with columns (name+value) that don't have metadata. That's why your sample doesn't return anything in my test as I've only filled existing columns (found in the CQL CREATE command) I think y

Re: About the heap

2013-03-13 Thread Alain RODRIGUEZ
"called index_interval set to 128" I think this is for BloomFilters actually. 2013/3/13 Hiller, Dean > Going to 1.2.2 helped us quite a bit as well as turning on LCS from STCS > which gave us smaller bloomfilters. > > As far as key cache. There is an entry in cassandra.yaml called > index_int

Re: About the heap

2013-03-13 Thread Hiller, Dean
Going to 1.2.2 helped us quite a bit as well as turning on LCS from STCS which gave us smaller bloomfilters. As far as key cache. There is an entry in cassandra.yaml called index_interval set to 128. I am not sure if that is related to key_cache. I think it is. By turning that to 512 or may

repair hangs

2013-03-13 Thread Dane Miller
Hi, On one of my nodes, nodetool repair -pr has been running for 48 hours and appears to be hung, with no output and no AntiEntropy messages in system.log for 40+ hours. Load, cpu, etc are all near zero. There are no other repair jobs running in my cluster. What's the recommended way to deal wi

Re: Pig / Map Reduce on Cassandra

2013-03-13 Thread cscetbon.ext
I'm trying to execute your sample pig script and I don't understand where the alias "columns" comes from : grunt> rows = LOAD 'cassandra://MyKeyspace/MyColumnFamily' USING CassandraStorage(); grunt> cols = FOREACH rows GENERATE flatten(columns); I suppose it's defined by the call to getSchema

RE: About the heap

2013-03-13 Thread moshe.kranc
Peaks may be occurring during compaction, when Sstable files are memmapped. If so: Upgrading to C* 1.2 may bring some relief: You can trigger minor compaction on an individual SStable file when the percentage of tombstones in that Sstable crosses a user-defined threshold. (Aaron, can you confirm?

Accessing timestamp of a cassandra column Using CQL3

2013-03-13 Thread Haithem Jarraya
Hi there, I am wondering if it's possible to access the timestamp of a cassandra column via CQL3? Many Thanks, Haithem

Re: mmaped files and swap

2013-03-13 Thread karim duran
I agree with Edward Capriolo, Even when swap is enabled on your system, swapping rarely occurs on OS today...(except for very loaded systems). But, take care that some 32-bit system kernels allow only 2^32 bits memory mapped file length ( ~ 2 GB ). It could be a limitation for NoSQL databases. It

Re: mmaped files and swap

2013-03-13 Thread Edward Capriolo
You really can not control what the OS swaps out. Java has other memory usage outside the heap, and native memory. Best to turn swap off. Swap is kinda old school anyway at this point. It made sense when machines had 32MB RAM. Keeping your read 95th percentile low is mostly about removing deviatio

mmaped files and swap

2013-03-13 Thread Fredrik
I've got a question regarding understanding the recommendation to disable swap. Since Cassandra uses mlockall to lock the heap in RAM what is the reason for disabling swap? My guess is that it has to do with memory mapped files but as far as I understand, accessing pages of memory mapped files, t

PlayOrm as an example for implementing Conceptual Hierarchies in C*

2013-03-13 Thread raman
Thank you Raman > You may want to look at PlayOrm and its class hierarchy for a single > table. Just a possible option. > > Dean > > On 3/12/13 6:04 PM, "Raman" wrote: > >>Can someone refer me to a C* tutorial on how to define dynamic schema >>and populate data. >> >>I am trying to an inheritan

Re: commitlog -deleted keyspaces.

2013-03-13 Thread Hiller, Dean
I do not know if it is a known issue. I only know what I saw in QA. If I remember correctly, I tested this on 1.1.4 and 1.2.2 with the same result that commit logs are left there(maybe it is by design….I really don't know). All I know is that the sstables do correctly grow in size. Dean Fro

Re: commitlog -deleted keyspaces.

2013-03-13 Thread Hiller, Dean
You may want to look at PlayOrm and its class hierarchy for a single table. Just a possible option. Dean On 3/12/13 6:04 PM, "Raman" wrote: >Can someone refer me to a C* tutorial on how to define dynamic schema >and populate data. > >I am trying to an inheritance hierarchy object population i

Re: data model to store large volume syslog

2013-03-13 Thread Mohan L
On Fri, Mar 8, 2013 at 9:42 PM, aaron morton wrote: > > 1). create a column family 'cfrawlog' which stores raw log as received. > row key could be 'ddmmhh'(new row is added for each hour or less), each > 'column name' is uuid with 'value' is raw log data. Since we are also going > to use this

Re: Pig / Map Reduce on Cassandra

2013-03-13 Thread cscetbon.ext
Ok forget it. It was a mix of mistakes like environment variables not set, package name not added in the script and libraries not found. Regards -- Cyril SCETBON On Mar 12, 2013, at 10:43 AM, cscetbon@orange.com wrote: I'm already using Cassandra 1.2.2 with

Problem setting up encrypted communication

2013-03-13 Thread Jan Kesten
Hello together, after my initial tests all is up and running, replacing a dead node was no problem at all. Now I tried to set up encryption between nodes. I set up keystores and a truststore as described in the docs. Every node has its own keystore with one private key and a truststore with all