RE: Reading SSTables Potential File Descriptor Leak 1.2.18

2014-09-23 Thread Job Thomas
Hi, It looks like the offset in the key cache is wrong. Refreshing the key cache may solve the issue. Thanks & Regards Job M Thomas Platform & Technology From: Tim Heckman [mailto:t...@pagerduty.com] Sent: Wed 9/24/2014 6:17 AM To: user@cassandra.apache.org Subject

Reading SSTables Potential File Descriptor Leak 1.2.18

2014-09-23 Thread Tim Heckman
Hello, I ran into a problem today where Cassandra 1.2.18 exhausted its number of permitted open file descriptors (65,535). This node has 256 tokens (vnodes) and runs in a test environment with relatively little traffic/data. As best I could tell, the majority of the file descriptors open were fo

Re: How to get data which has changed within x minutes using CQL?

2014-09-23 Thread DuyHai Doan
No, you need to compute now - 15 mins yourself. CQL3 does not offer built-in functions to deal with dates right now. On 24 Sep 2014 at 00:47, "Check Peck" wrote: > > On Tue, Sep 23, 2014 at 3:41 PM, DuyHai Doan wrote: > >> now - 15 mins > > > > Can I run like this in CQL using cqlsh? > > SELECT

Re: How to get data which has changed within x minutes using CQL?

2014-09-23 Thread Check Peck
On Tue, Sep 23, 2014 at 3:41 PM, DuyHai Doan wrote: > now - 15 mins Can I run like this in CQL using cqlsh? SELECT * FROM client_data WHERE client_id = 1 and last_modified_date >= now - 15 mins When I ran the above query I got an error on my cql client - Bad Request: line 1:81 no viable alt

Re: How to get data which has changed within x minutes using CQL?

2014-09-23 Thread DuyHai Doan
let previous15Min = now - 15 mins, then: SELECT * FROM client_data WHERE client_id = 1 and last_modified_date >= previous15Min. Same thing for the last 5 mins. On Wed, Sep 24, 2014 at 12:32 AM, Check Peck wrote: > Yes I can provide client_id in my where clause. So now my query pattern > will be - > > Give
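A minimal sketch of the client-side computation suggested in this thread, assuming a Python client and assuming the cutoff is passed to CQL as integer milliseconds since the Unix epoch (a form CQL accepts for timestamp values). The function names cutoff_millis and build_query are hypothetical, introduced only for illustration; client_data, client_id and last_modified_date come from the thread.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def cutoff_millis(minutes, now=None):
    """Return (now - minutes) as integer milliseconds since the Unix epoch."""
    if now is None:
        now = datetime.now(timezone.utc)
    cutoff = now - timedelta(minutes=minutes)
    # Integer division of timedeltas avoids floating-point rounding.
    return (cutoff - EPOCH) // timedelta(milliseconds=1)


def build_query(client_id, minutes, now=None):
    """Build the range query with the cutoff computed on the client.

    Hypothetical helper: a real application would normally bind these
    values through its driver's prepared statements instead of
    formatting them into the query string.
    """
    return (
        "SELECT * FROM client_data "
        "WHERE client_id = %d AND last_modified_date >= %d"
        % (client_id, cutoff_millis(minutes, now))
    )


if __name__ == "__main__":
    # Rows for client 1 changed in the last 15 and the last 5 minutes.
    print(build_query(1, 15))
    print(build_query(1, 5))
```

The partition key (client_id) is still supplied in every query, as noted elsewhere in the thread; only the lower bound on last_modified_date changes between the 15-minute and 5-minute variants.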

Re: How to get data which has changed within x minutes using CQL?

2014-09-23 Thread Check Peck
Yes I can provide client_id in my where clause. So now my query pattern will be - Give me everything that has changed within the last 15 minutes or 5 minutes whose client_id is equal to 1? What will my query look like then? On Tue, Sep 23, 2014 at 3:26 PM, DuyHai Doan wrote: > It is possi

Re: How to get data which has changed within x minutes using CQL?

2014-09-23 Thread DuyHai Doan
It is possible to request a "range" of data according to the last_modified_date, but you still need to provide the client_id, the partition key, in any case. On Wed, Sep 24, 2014 at 12:23 AM, Check Peck wrote: > I have a table structure like below - > > CREATE TABLE client_data ( > cli

How to get data which has changed within x minutes using CQL?

2014-09-23 Thread Check Peck
I have a table structure like below - CREATE TABLE client_data ( client_id int, consumer_id text, last_modified_date timestamp, PRIMARY KEY (client_id, last_modified_date, consumer_id) ) I have a query pattern like this - Give me everything for what has changed wit

Re: CPU consumption of Cassandra

2014-09-23 Thread DuyHai Doan
Nice catch, Daniel. The comment from Sylvain explains a lot! On Tue, Sep 23, 2014 at 11:33 PM, Daniel Chia wrote: > If I had to guess, it might be due in part to inefficiencies in > 2.0 with regard to CompositeType (which is used in CQL3 tables) - > https://issues.apache.org/jira/bro

Re: CPU consumption of Cassandra

2014-09-23 Thread Daniel Chia
If I had to guess, it might be due in part to inefficiencies in 2.0 with regard to CompositeType (which is used in CQL3 tables) - https://issues.apache.org/jira/browse/CASSANDRA-5417?focusedCommentId=13821243&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-

Re: CPU consumption of Cassandra

2014-09-23 Thread DuyHai Doan
I did some benchmarking in the past when we faced high CPU usage even though the data set was very small and sat entirely in memory. Read the report here: https://github.com/doanduyhai/Cassandra_Data_Model_Bench Our *partial* conclusions were: 1) slice query fetching a page of 64kb of data and dec

Re: CPU consumption of Cassandra

2014-09-23 Thread Chris Lohfink
CPU consumption may be affected by the cassandra-stress tool in the 2nd example as well. Running it on a separate system eliminates it as a possible cause. There is a little extra work but not anything that I think would be that obvious. tracing (can enable with nodetool) or profiling (ie with you

Re: Is there harm from having all the nodes in the seed list?

2014-09-23 Thread DuyHai Doan
Well, having all nodes in the seed list does not compromise the correctness of the gossip protocol. However, there will be extra network traffic when nodes are starting because each one will ping all nodes for topology discovery, AFAIK. On Tue, Sep 23, 2014 at 7:31 PM, Donald Smith < donald.sm...@audiencescie

RE: CPU consumption of Cassandra

2014-09-23 Thread Leleu Eric
First of all, thanks for your help! :) Here are some details: > With RF=N=2 you're essentially testing a single machine locally, which isn't the > best indicator long term I will test with more nodes (4 with RF = 2), but for now I'm limited to 2 nodes for non-technical reasons ... > Well, first o

Cassandra sometimes times out on write queries and spends the majority of its CPU time in method org.apache.cassandra.db.marshal.AbstractCompositeType.compare()

2014-09-23 Thread Li, George
Hi, I am running a load test on a 5-node Cassandra cluster (EC2, single region, each node has 15 GB RAM, Cassandra version 2.0.6, replication factor 3). My Java program uses Java driver version 2.0.6 and it does 2000 rounds of batch write queries, each with 8 inserts, 8 updates and 8 deletes. W

Is there harm from having all the nodes in the seed list?

2014-09-23 Thread Donald Smith
Is there any harm from having all the nodes listed in the seeds list in cassandra.yaml? Donald A. Smith | Senior Software Engineer P: 425.201.3900 x 3866 C: (206) 819-5965 F: (646) 443-2333 dona...@audiencescience.com [AudienceScience]

Re: CPU consumption of Cassandra

2014-09-23 Thread Chris Lohfink
Well, first off you shouldn't run the stress tool on the node you're testing. Give it its own box. With RF=N=2 you're essentially testing a single machine locally, which isn't the best indicator long term (optimizations are available when reading data that's local to the node). 80k/sec on a system is pret

RE: CPU consumption of Cassandra

2014-09-23 Thread Leleu Eric
I tried to run "cassandra-stress" on some of my tables as proposed by Jake Luciani. For a simple table, this tool is able to perform 8 read op/s with little CPU consumption if I request the table by the PK (name, tenantid). Ex: TABLE: CREATE TABLE IF NOT EXISTS buckets (tenantid varchar, nam