Re: Missing data in range query

2014-10-08 Thread Owen Kim
Nope. No secondary index. Just a slice query on the PK. On Tuesday, October 7, 2014, Robert Coli wrote: > On Tue, Oct 7, 2014 at 3:11 PM, Owen Kim > wrote: > >> Sigh, it is a bit grating. I (genuinely) appreciate your acknowledgement >> of that. Though, I didn't int

Re: Missing data in range query

2014-10-07 Thread Robert Coli
On Tue, Oct 7, 2014 at 3:11 PM, Owen Kim wrote: > Sigh, it is a bit grating. I (genuinely) appreciate your acknowledgement > of that. Though, I didn't intend for the question to be "about" > supercolumns. > (Yep, understand tho that if you hadn't been told that advice before, it would grate a lo

Re: Missing data in range query

2014-10-07 Thread Owen Kim
Sigh, it is a bit grating. I (genuinely) appreciate your acknowledgement of that. Though, I didn't intend for the question to be "about" supercolumns. It is possible I'm hitting an odd edge case though I'm having trouble reproducing the issue in a controlled environment since there seems to be a t

Re: Missing data in range query

2014-10-07 Thread Robert Coli
On Tue, Oct 7, 2014 at 2:03 PM, Owen Kim wrote: > I'm aware. I've had the system up since pre-composite columns and haven't > had the cycles to do a major data and schema migration. > > And that's not "slightly" non-responsive. > "There may be unknown bugs in the code you're using, especially be

Re: Missing data in range query

2014-10-07 Thread Owen Kim
I'm aware. I've had the system up since pre-composite columns and haven't had the cycles to do a major data and schema migration. And that's not "slightly" non-responsive. On Tue, Oct 7, 2014 at 1:49 PM, Robert Coli wrote: > On Tue, Oct 7, 2014 at 1:38 PM, Owen Kim wrote: > >> I'm running Cass

Re: Missing data in range query

2014-10-07 Thread Robert Coli
On Tue, Oct 7, 2014 at 1:38 PM, Owen Kim wrote: > I'm running Cassandra 1.2.16 with supercolumns and Hector. > Slightly non-responsive response : In general supercolumn use is not recommended. It makes it more difficult to get support when one uses a feature no one else uses. =Rob

Missing data in range query

2014-10-07 Thread Owen Kim
nd replicate_on_write = true and compaction_strategy = 'org.apache.cassandra.db.compaction.LeveledCompactionStrategy' and caching = 'KEYS_ONLY'; I'm adding a time series supercolumn, then doing a slice query over this supercolumn. I'm really just trying to see if any data is in the time slice s

Re: CQL query throws TombstoneOverwhelmingException against a LeveledCompactionStrategy table

2014-10-06 Thread dlu66061
If it is of the same cause, does that mean I should switch to SizeTieredCompactionStrategy?

CQL query throws TombstoneOverwhelmingException against a LeveledCompactionStrategy table

2014-10-03 Thread dlu66061
amount of records have 30 day TTL. Now a simple CQL query like “select * from event_index limit 1” won’t run and Cassandra log says ERROR [ReadStage:68] 2014-10-01 15:40:14,751 SliceQueryFilter.java (line 200) Scanned over 10 tombstones in event_index; query aborted (see

Re: Unable to query with token range.. unable to make long from ‘...'

2014-09-28 Thread Kevin Burton
> >> It is expecting a 64 bit value … Murmur3 partitioner uses 64 bit long >> tokens… where did you get your 128 bit long from, and what partitioner are >> you using? >> >> On Sep 28, 2014, at 1:39 PM, Kevin Burton wrote: >> >> I’m trying to query an entir

Re: Unable to query with token range.. unable to make long from ‘...'

2014-09-28 Thread graham sanderson
ioner uses 64 bit long tokens… > where did you get your 128 bit long from, and what partitioner are you using? > > On Sep 28, 2014, at 1:39 PM, Kevin Burton wrote: > >> I’m trying to query an entire table in parallel by splitting it up in token >> ranges. >> >>

Re: Unable to query with token range.. unable to make long from ‘...'

2014-09-28 Thread Kevin Burton
> > On Sep 28, 2014, at 1:39 PM, Kevin Burton wrote: > > I’m trying to query an entire table in parallel by splitting it up in > token ranges. > > However, it’s not working because I get this: > > cqlsh:blogindex> select token(hashcode), hashcode from source where

Re: Unable to query with token range.. unable to make long from ‘...'

2014-09-28 Thread graham sanderson
It is expecting a 64 bit value … Murmur3 partitioner uses 64 bit long tokens… where did you get your 128 bit long from, and what partitioner are you using? On Sep 28, 2014, at 1:39 PM, Kevin Burton wrote: > I’m trying to query an entire table in parallel by splitting it up in token >
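
A minimal sketch of the point made in this reply, in Python. Murmur3Partitioner tokens are signed 64-bit values, so a parallel full-table scan has to be split over the range -2^63 to 2^63-1, and the upper bound in the original query does not fit in that range. The helper names `split_token_range` and `range_query` are illustrative only; the table and column names are copied from the query in the original post.

```python
MIN_TOKEN = -(2**63)      # smallest Murmur3Partitioner token
MAX_TOKEN = 2**63 - 1     # largest Murmur3Partitioner token


def split_token_range(splits):
    """Return `splits` contiguous (start, end) pairs covering the whole ring.

    Both bounds are inclusive and consecutive pairs do not overlap.
    """
    if splits < 1:
        raise ValueError("splits must be at least 1")
    total = MAX_TOKEN - MIN_TOKEN + 1          # 2**64 possible tokens
    step = total // splits
    ranges = []
    start = MIN_TOKEN
    for i in range(splits):
        # The last range absorbs any remainder so the ring is fully covered.
        end = MAX_TOKEN if i == splits - 1 else start + step - 1
        ranges.append((start, end))
        start = end + 1
    return ranges


def range_query(start, end):
    """Build the CQL text for one sub-range (table and column from the post)."""
    return (
        "SELECT token(hashcode), hashcode FROM source "
        f"WHERE token(hashcode) >= {start} AND token(hashcode) <= {end}"
    )


if __name__ == "__main__":
    for start, end in split_token_range(4):
        print(range_query(start, end))
```

Each generated statement can then be handed to a separate worker. Inclusive bounds on both ends are safe here because adjacent sub-ranges never share a token.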

Unable to query with token range.. unable to make long from ‘...'

2014-09-28 Thread Kevin Burton
I’m trying to query an entire table in parallel by splitting it up in token ranges. However, it’s not working because I get this: cqlsh:blogindex> select token(hashcode), hashcode from source where token(hashcode) >= 0 and token(hashcode) <= 17014118346046923173168730371588410572 limit

How to avoid column family duplication (when query requires multiple restrictions)

2014-09-22 Thread Gianluca Borello
Hi, I have a column family storing very large blobs that I would not like to duplicate, if possible. Here's a simplified version: CREATE TABLE timeline ( key text, a int, b int, value blob, PRIMARY KEY (key, a, b) ); On this, I run exactly two types of query. Both of them

Re: Slow down of secondary index query with VNODE (C* version 1.2.18, jre6).

2014-09-19 Thread Jay Patel
>> (e.g. [min(-9223372036854775808), max(-9193352069377957523), and >> (max(-9136021049555745100), max(-8959555493872108621)], etc. ] >> >> Seems like it needs to query data in token order. So, >> min(-9223372036854775808), max(-*9193352069377957523*) on 192.168.51.22. >>

Re: Slow down of secondary index query with VNODE (C* version 1.2.18, jre6).

2014-09-19 Thread Tyler Hobbs
> just do one scan through all of the ranges held by it, isn't it? > (e.g. [min(-9223372036854775808), max(-9193352069377957523), and > (max(-9136021049555745100), max(-8959555493872108621)], etc. ] > > Seems like it needs to query data in token order. So, > min(-922337203685

Re: Slow down of secondary index query with VNODE (C* version 1.2.18, jre6).

2014-09-19 Thread Jay Patel
he queried data, at once. Also, internally that node should be able to just do one scan through all of the ranges held by it, isn't it? (e.g. [min(-9223372036854775808), max(-9193352069377957523), and (max(-9136021049555745100), max(-8959555493872108621)], and etc. ] Seems like it needs to

Re: Slow down of secondary index query with VNODE (C* version 1.2.18, jre6).

2014-09-19 Thread Robert Coli
On Fri, Sep 19, 2014 at 2:19 PM, DuyHai Doan wrote: > But does it implies that with vnodes, there are actually "extra work" to > do for scanning indices ? > Vnodes are just nodes, so they have all the problems-associated-with-many-nodes one would get with 256x as many nodes. =Rob

Re: Slow down of secondary index query with VNODE (C* version 1.2.18, jre6).

2014-09-19 Thread Jay Patel
ges for the queried data, at once. Also, internally that node should be able to just do one scan through all of the ranges held by it, isn't it? (e.g. [min(-9223372036854775808), max(-9193352069377957523), and (max(-9136021049555745100), max(-8959555493872108621)], etc. ] Seems like it needs to q

Re: Slow down of secondary index query with VNODE (C* version 1.2.18, jre6).

2014-09-19 Thread Tyler Hobbs
On Fri, Sep 19, 2014 at 4:19 PM, DuyHai Doan wrote: > > But does it implies that with vnodes, there are actually "extra work" to > do for scanning indices ? > Yes. > If yes, is this "extra load" rather I/O bound or CPU bound ? > It doesn't nece

Re: Slow down of secondary index query with VNODE (C* version 1.2.18, jre6).

2014-09-19 Thread DuyHai Doan
es, there are actually "extra work" to do for scanning indices ? If yes, is this "extra load" rather I/O bound or CPU bound ? On Fri, Sep 19, 2014 at 11:10 PM, Tyler Hobbs wrote: > > On Fri, Sep 19, 2014 at 12:41 PM, Jay Patel > wrote: > >> >> Btw

Re: Slow down of secondary index query with VNODE (C* version 1.2.18, jre6).

2014-09-19 Thread Tyler Hobbs
On Fri, Sep 19, 2014 at 12:41 PM, Jay Patel wrote: > > Btw, there is no data in the table. Table is empty. Query is fired on the > empty table. > This is actually the worst case for secondary index lookups. > > From the tracing output, I don't understand why it's d

Re: Slow down of secondary index query with VNODE (C* version 1.2.18, jre6).

2014-09-19 Thread Jay Patel
is no data in the table. Table is empty. Query is fired on the empty table. From the tracing output, I don't understand why it's doing multiple scans on one node. With non-vnode, there is only one scan per node & same query works fine. If you look at the output1.txt attached e

Re: Slow down of secondary index query with VNODE (C* version 1.2.18, jre6).

2014-09-19 Thread Tyler Hobbs
r 19, 2014 4:01 AM > To: user@cassandra.apache.org > Subject: Re: Slow down of secondary index query with VNODE (C* version > 1.2.18, jre6). > > Keep in mind secondary indexes in cassandra are not there to improve > performance, or even really be used in a serious user facing manner.

RE: Slow down of secondary index query with VNODE (C* version 1.2.18, jre6).

2014-09-19 Thread Parag Patel
Of Jonathan Haddad Sent: Friday, September 19, 2014 4:01 AM To: user@cassandra.apache.org Subject: Re: Slow down of secondary index query with VNODE (C* version 1.2.18, jre6). Keep in mind secondary indexes in cassandra are not there to improve performance, or even really be used in a serious

Re: Slow down of secondary index query with VNODE (C* version 1.2.18, jre6).

2014-09-19 Thread Jonathan Haddad
seeing extreme slow down (500ms to 1s) in query on secondary index > with vnode. I'm seeing multiple secondary index scans on a given node in > trace output when vnode is enabled. Without vnode, everything is good. > > Cluster size: 6 nodes > Replication factor: 3 > Consisten

Re: Slow down of secondary index query with VNODE (C* version 1.2.18, jre6).

2014-09-18 Thread DuyHai Doan
Hello Jay Your query is : "select * from keyspaceuser.company_testusers where lastname = ‘lau’ LIMIT 1" Why do you think that the slowness is due to vnodes and not your query asking for 10 000 results ? On Fri, Sep 19, 2014 at 3:33 AM, Jay Patel wrote: > Hi there, >

Re: Java sample code for non-blocking async query

2014-09-01 Thread Stephen Portanova
'm looking for non-blocking async query sample code. The one I found in > the following link is async query but blocking. Could anyone share such > code? > > > http://www.datastax.com/documentation/developer/java-driver/1.0/java-driver/asynchronous_t.html > > Thanks > Gary > -- Stephen Portanova (480) 495-2634

Java sample code for non-blocking async query

2014-09-01 Thread Gary Zhao
Hello I'm looking for non-blocking async query sample code. The one I found in the following link is async query but blocking. Could anyone share such code? http://www.datastax.com/documentation/developer/java-driver/1.0/java-driver/asynchronous_t.html Thanks Gary

Re: Help with select IN query in cassandra

2014-09-01 Thread Subodh Nijsure
Thanks Michael I will certainly go with this approach for now. -Subodh On Mon, Sep 1, 2014 at 6:33 AM, Laing, Michael wrote: > This should work for your query requirements - 2 tables w same info because > disk is cheap and writes are fast so optimize for reads: > > CREATE TABLE

Re: Help with select IN query in cassandra

2014-09-01 Thread Subodh Nijsure
Laing, Michael > Sent: Monday, September 1, 2014 11:34 AM > To: user@cassandra.apache.org > Subject: Re: Help with select IN query in cassandra > > Did the OP propose that? > > > On Mon, Sep 1, 2014 at 10:53 AM, Jack Krupansky > wrote: >> >> One comment on del

Re: Help with select IN query in cassandra

2014-09-01 Thread Jack Krupansky
1, 2014 11:34 AM To: user@cassandra.apache.org Subject: Re: Help with select IN query in cassandra Did the OP propose that? On Mon, Sep 1, 2014 at 10:53 AM, Jack Krupansky wrote: One comment on deletions – aren’t deletions kind of an anti-pattern for modern data processing, such as sensor

Re: Help with select IN query in cassandra

2014-09-01 Thread Laing, Michael
ging” rather than the exercise in > futility of doing a massive number of deletes and updates in place? > > -- Jack Krupansky > > *From:* Laing, Michael > *Sent:* Monday, September 1, 2014 9:33 AM > *To:* user@cassandra.apache.org > *Subject:* Re: Help with select IN quer

Re: Help with select IN query in cassandra

2014-09-01 Thread Jack Krupansky
Krupansky From: Laing, Michael Sent: Monday, September 1, 2014 9:33 AM To: user@cassandra.apache.org Subject: Re: Help with select IN query in cassandra This should work for your query requirements - 2 tables w same info because disk is cheap and writes are fast so optimize for reads: CREATE TABLE

Re: Help with select IN query in cassandra

2014-09-01 Thread Laing, Michael
This should work for your query requirements - 2 tables w same info because disk is cheap and writes are fast so optimize for reads: CREATE TABLE sensor_asset ( asset_id text, event_time timestamp, tuuid timeuuid, sensor_reading map, sensor_serial_number text, sensor_type int

Re: Help with select IN query in cassandra

2014-08-31 Thread Subodh Nijsure
"timestamp" timeuuid, sensor_reading map, sensor_serial_number text, sensor_type int, PRIMARY KEY ((asset_id, "timestamp"), event_time) ); It does what I want to do, and I removed the index for timestamp item since now it is part of primary key and thus my query l

Re: Help with select IN query in cassandra

2014-08-31 Thread Laing, Michael
> Hmm. Because the clustering key is (event_time, "timestamp"), event_time > must be specified as well - hopefully that info is available to the ux. > > Unfortunately you will then hit another problem with your query: you are > selecting a collection field... this will not

Re: Help with select IN query in cassandra

2014-08-31 Thread Laing, Michael
Hmm. Because the clustering key is (event_time, "timestamp"), event_time must be specified as well - hopefully that info is available to the ux. Unfortunately you will then hit another problem with your query: you are selecting a collection field... this will not work with IN on "

Re: Help with select IN query in cassandra

2014-08-31 Thread Subodh Nijsure
et_id), event_time, "timestamp") >> ); >> >> CREATE INDEX event_time_index ON sensor_info_table (event_time); >> >> CREATE INDEX timestamp_index ON sensor_info_table ("timestamp"); >> >> Now I am able to insert the data into this table, howev

Re: Help with select IN query in cassandra

2014-08-31 Thread Laing, Michael
"timestamp"); > > Now I am able to insert the data into this table, however I am unable > to do following query where I want to select items with specific > timeuuid values. > > It gives me following error. > > SELECT * from mydb.sensor_info_table where timestamp IN (

Help with select IN query in cassandra

2014-08-31 Thread Subodh Nijsure
event_time_index ON sensor_info_table (event_time); CREATE INDEX timestamp_index ON sensor_info_table ("timestamp"); Now I am able to insert the data into this table, however I am unable to do following query where I want to select items with specific timeuuid values. It gives me f

Re: range query times out (on 1 node, just 1 row in table)

2014-08-20 Thread Subodh Nijsure
d a >> definitive answer on this but all I have come up with is this (old, >> non-authoritative) blog post which states "Cassandra’s native index is like >> a hashed index, which means you can only do equality query and not range >> query." > > > Somewhere i

Re: Reply: can not query data from cassandra

2014-08-20 Thread Mark Reddy
/www.gtafe.com/ > > [image: description: cid:image001.png@01CF5897.E1268DE0] > > > > > > *From:* 鄢来琼 [mailto:laiqiong@gtafe.com] > *Sent:* August 20, 2014 14:13 > *To:* user@cassandra.apache.org > *Subject:* can not query data from cassandra > > > > HI ALL, > >

Reply: can not query data from cassandra

2014-08-20 Thread 鄢来琼
.png@01CF5897.E1268DE0] From: 鄢来琼 [mailto:laiqiong@gtafe.com] Sent: August 20, 2014 14:13 To: user@cassandra.apache.org Subject: can not query data from cassandra HI ALL, I setup Cassandra on a linux host. I have insert some data into “mykeyspace.cffex_l23” table. The following error are raised during

can not query data from cassandra

2014-08-19 Thread 鄢来琼
Hi all, I set up Cassandra on a Linux host. I have inserted some data into the “mykeyspace.cffex_l23” table. The following error is raised when querying data from “mykeyspace.cffex_l23”. Could you give me any suggestions to fix it? According to the “top” command, I found that most of the memory is used by

Re: Strange select result when using date greater than query

2014-08-17 Thread Subodh Nijsure
I am running cqlsh on the same machine as my Cassandra server. I am observing really strange behavior: if I do this query, all 3 rows show up. SELECT asset_id,event_time,sensor_type, temperature,humidity from temp_humidity_data ALLOW FILTERING; asset_id | event_time | sensor_type

Re: Strange select result when using date greater than query

2014-08-17 Thread Jack Krupansky
I should have asked where your coordinator node is located. Check its time zone, relative to GMT. cqlsh is simply formatting the time stamp for your local display. That is separate from the actual query execution on the server coordinator node. cqlsh is merely a "client", not t

Re: Strange select result when using date greater than query

2014-08-17 Thread Subodh Nijsure
7 05:33:17-0500 | 1 | 67.228 | 91.228 2 | 2014-08-17 05:33:19-0500 | 1 | 61.97 |73.97 So for the query I thought I should be giving time strings in the local time zone too, no? -Subodh On Sun, Aug 17, 2014 at 5:17 AM, Jack Krupansky wrote: > Are you mor

Re: Strange select result when using date greater than query

2014-08-17 Thread Jack Krupansky
Are you more than 7 time zones behind GMT? If so, that would make 03:33 your query less than 03:33-0700 Your query is using the default time zone, which will be the time zone configured for the coordinator node executing the query. IOW, where are you? -- Jack Krupansky -Original
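
To illustrate the point about default time zones, here is a small Python sketch that uses only the standard library, not a Cassandra driver. It shows that the same zone-less literal from the query names a different instant depending on the UTC offset assumed for it, so the comparison against a stored value can flip. The offsets tried below are examples, not the poster's actual configuration, and `as_instant` is a hypothetical helper.

```python
from datetime import datetime, timedelta, timezone


def as_instant(literal, utc_offset_hours):
    """Interpret a zone-less 'YYYY-MM-DD HH:MM:SS' literal at a fixed UTC offset."""
    naive = datetime.strptime(literal, "%Y-%m-%d %H:%M:%S")
    return naive.replace(tzinfo=timezone(timedelta(hours=utc_offset_hours)))


# A stored value as cqlsh displayed it in the thread: 2014-08-17 05:33:17-0500
stored = datetime(2014, 8, 17, 5, 33, 17, tzinfo=timezone(timedelta(hours=-5)))

literal = "2014-08-17 03:33:20"
for offset in (0, -5, -7):
    bound = as_instant(literal, offset)
    print(f"offset {offset:+d}h: bound = {bound.astimezone(timezone.utc)}, "
          f"stored > bound: {stored > bound}")
```

Writing the offset into the CQL literal itself, for example '2014-08-17 03:33:20-0500', avoids depending on the coordinator's default.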

Strange select result when using date greater than query

2014-08-17 Thread Subodh Nijsure
1 | 61.97 |73.97 Now if I execute a query : SELECT asset_id,event_time,sensor_type, temperature,humidity from temp_humidity_data where asset_id='2' and event_time > '2014-08-17 03:33:20' ALLOW FILTERING; it gives me back same results (!), I expected it to give me 0

Re: range query times out (on 1 node, just 1 row in table)

2014-08-13 Thread Robert Coli
nd a > definitive answer on this but all I have come up with is this (old, > non-authoritative) blog post which states "Cassandra’s native index is > like a hashed index, which means you can only do equality query and not > range query." > Somewhere in google I'm p

Re: range query times out (on 1 node, just 1 row in table)

2014-08-13 Thread Ian Rose
val bigint, >> PRIMARY KEY ((foo_name, foo_shard)) >> ) WITH read_repair_chance=0.1; >> >> CREATE INDEX ON foo (int_val); >> CREATE INDEX ON foo (foo_name); >> >> I have inserted just a single row into this table: >> insert into foo(foo_name, foo_shard,

Re: range query times out (on 1 node, just 1 row in table)

2014-08-13 Thread Sylvain Lebresne
ow into this table: > insert into foo(foo_name, foo_shard, int_val) values('dave', 27, 100); > > This query works fine: > select * from foo where foo_name='dave'; > > But when I run this query, I get an RPC timeout: > select * from foo where foo_name='

Re: range query times out (on 1 node, just 1 row in table)

2014-08-13 Thread Ian Rose
Frankly, no matter how inefficient / expensive the query is, surely it should still work when there is only 1 row and 1 node (which is localhost)! I'm starting to wonder if range queries on secondary indexes aren't supported at all (although if that is the case, I would certainly prefe

Re: range query times out (on 1 node, just 1 row in table)

2014-08-13 Thread DuyHai Doan
ave" 2) how many primary keys of table "foo" match the condition int_val>0 --> read from the 2nd index "int_val" where partition key > 0, so basically it is a range scan Once it gets all the results from 2nd indices, C* can query the primary table to return data.

Re: range query times out (on 1 node, just 1 row in table)

2014-08-13 Thread Jack Krupansky
Agreed, but... in this case the table has ONE row, so what exactly could be causing this timeout? I mean, it can’t be the row count, right? -- Jack Krupansky From: DuyHai Doan Sent: Wednesday, August 13, 2014 9:01 AM To: user@cassandra.apache.org Subject: Re: range query times out (on 1 node

Re: range query times out (on 1 node, just 1 row in table)

2014-08-13 Thread DuyHai Doan
Hello Ian, secondary indexes perform poorly with inequalities (<, ≤, >, ≥). Indeed, inequalities force the server to scan the whole cluster to find the requested range, which is clearly not optimal. That's the reason why you need to add "ALLOW FILTERING" for the query to

Re: range query times out (on 1 node, just 1 row in table)

2014-08-13 Thread Ian Rose
Confusingly, it appears to be the presence of an index on int_val that is causing this timeout. If I drop that index (leaving only the index on foo_name) the query works just fine. On Tue, Aug 12, 2014 at 10:25 PM, Ian Rose wrote: > Hi - > > I am currently running a single Cassandr

range query times out (on 1 node, just 1 row in table)

2014-08-12 Thread Ian Rose
, foo_shard)) ) WITH read_repair_chance=0.1; CREATE INDEX ON foo (int_val); CREATE INDEX ON foo (foo_name); I have inserted just a single row into this table: insert into foo(foo_name, foo_shard, int_val) values('dave', 27, 100); This query works fine: select * from foo where foo_name='dav

Re: horizontal query scaling issues follow on

2014-07-23 Thread Benedict Elliott Smith
> > if you find that adding nodes causes performance to degrade I would > suspect that you are querying data in one CQL statement that is spread over > multiple partitions This is exactly what is happening. The better way to query multiple partitions is to simply despatch mult
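
The message above is cut off in this archive. As a hedged illustration only, the Python sketch below assumes the advice is to replace one multi-partition statement with separate single-partition queries issued concurrently. `fetch_many` and `fetch_one` are hypothetical names, and the thread pool stands in for whatever asynchronous facility a real driver provides.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_many(keys, fetch_one, max_workers=8):
    """Run fetch_one(key) for each key concurrently; return {key: result}.

    fetch_one is a hypothetical callable that executes a single-partition
    query such as: SELECT * FROM foo WHERE key = ? AND col_name = 'LATEST'
    """
    keys = list(keys)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() preserves input order, so results line up with keys.
        results = list(pool.map(fetch_one, keys))
    return dict(zip(keys, results))


if __name__ == "__main__":
    # Stand-in for a real driver call, so the sketch runs without a cluster.
    def fake_fetch(key):
        return {"key": key, "col_name": "LATEST"}

    print(fetch_many(["Type1:a", "Type1:b", "Type1:c"], fake_fetch))
```

With this shape each request touches a single partition, so no single coordinator has to gather data from many partitions for one statement.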

Re: Should PREPARE QUERY return metadata for the query result?

2014-07-23 Thread Ben Hood
The rest of the body of a Prepared result is: where: - is [short bytes] representing the prepared query ID. - is defined exactly as for a Rows RESULT (See section 4.2.5.2; you can however assume that the Has_more_pages flag is always off) and is the specification for the

Re: Should PREPARE QUERY return metadata for the query result?

2014-07-23 Thread Ben Hood
ssage. The rest of the body of a Prepared result is: where: - is [short bytes] representing the prepared query ID. - is defined exactly as for a Rows RESULT (See section 4.2.5.2) - this represents the type information for the query arguments - is defined exactly as for a R

Should PREPARE QUERY return metadata for the query result?

2014-07-23 Thread Ben Hood
Hi all, I'm looking at the specification of statement preparation (section 4.2.5.4 of the CQL protocol) and I'm wondering whether the metadata result of the PREPARE query only returns column information for the query arguments, and not for the columns of the actual query result. The

Re: horizontal query scaling issues follow on

2014-07-23 Thread Diane Griffith
I posted the query wrong, I gave the query for 1 key versus the large batch of ids like I was testing. What it was using for large batch was IN, so Select * from foo where key IN and col_name='LATEST So after breaking it down and reading as much as I can with regard to our - s

Re: horizontal query scaling issues follow on

2014-07-21 Thread Diane Griffith
So I appreciate all the help so far. Upfront, it is possible the schema and data query pattern could be contributing to the problem. The schema was born out of certain design requirements. If it proves to be part of what makes the scalability crumble, then I hope it will help shape the design

Re: horizontal query scaling issues follow on

2014-07-21 Thread Robert Coli
On Sun, Jul 20, 2014 at 6:12 PM, Diane Griffith wrote: > I am running tests again across different number of client threads and > number of nodes but this time I tweaked some of the timeouts configured for > the nodes in the cluster. I was able to get better performance on the > nodes at 10 clie

Re: horizontal query scaling issues follow on

2014-07-21 Thread Jonathan Lacefield
Hello, Here is the documentation for cfhistograms, which is in microseconds. http://www.datastax.com/documentation/cassandra/2.0/cassandra/tools/toolsCFhisto.html Your question about setting timeouts is subjective, but you have set your timeout limits to 4 mins, which seems excessive. The

Re: horizontal query scaling issues follow on

2014-07-20 Thread Diane Griffith
I am running tests again across different number of client threads and number of nodes but this time I tweaked some of the timeouts configured for the nodes in the cluster. I was able to get better performance on the nodes at 10 client threads by upping 4 timeout values in cassandra.yaml to 24

Re: horizontal query scaling issues follow on

2014-07-18 Thread Diane Griffith
key=”Type1:1109dccb-169b-40ef-b7f8-d072f04d8139“ and col_name=”LATEST“ Read result from above query: {"key":"1109dccb-169b-40ef-b7f8-d072f04d8139","keyType":" Type1","state":"state3","timestamp":1303446284614,"eventId"

Re: horizontal query scaling issues follow on

2014-07-18 Thread Tyler Hobbs
On Fri, Jul 18, 2014 at 8:01 AM, Diane Griffith wrote: > > Partition Size (bytes) > 1109 bytes: 1800 > > Cell Count per Partition > 8 cells: 1800 > > meaning I can't glean anything about how it partitioned or if it broke a > key across partitions from this right? Does it mean for 180

Re: horizontal query scaling issues follow on

2014-07-18 Thread Diane Griffith
stering >>> columns, or does each row have a unique partition key and no clustering >>> columns. >>> >>> -- Jack Krupansky >>> >>> *From:* Diane Griffith >>> *Sent:* Thursday, July 17, 2014 6:21 PM >>> *To:* user >>> *Subjec

Re: horizontal query scaling issues follow on

2014-07-18 Thread Benedict Elliott Smith
your primary key and whether you >> are using a small number of partition keys and a large number of clustering >> columns, or does each row have a unique partition key and no clustering >> columns. >> >> -- Jack Krupansky >> >> *From:* Diane Griffith >

Re: horizontal query scaling issues follow on

2014-07-18 Thread Diane Griffith
g > columns. > > -- Jack Krupansky > > *From:* Diane Griffith > *Sent:* Thursday, July 17, 2014 6:21 PM > *To:* user > *Subject:* Re: horizontal query scaling issues follow on > > So do partitions equate to tokens/vnodes? > > If so we had configured all

Re: horizontal query scaling issues follow on

2014-07-17 Thread Jonathan Haddad
The problem with starting without vnodes is moving to them is a bit hairy. In particular, nodetool shuffle has been reported to take an extremely long time (days, weeks). I would start with vnodes if you have any intent on using them. On Thu, Jul 17, 2014 at 6:03 PM, Robert Coli wrote: > On Thu

Re: horizontal query scaling issues follow on

2014-07-17 Thread Jack Krupansky
whether you are using a small number of partition keys and a large number of clustering columns, or does each row have a unique partition key and no clustering columns. -- Jack Krupansky From: Diane Griffith Sent: Thursday, July 17, 2014 6:21 PM To: user Subject: Re: horizontal query scaling

Re: horizontal query scaling issues follow on

2014-07-17 Thread Robert Coli
On Thu, Jul 17, 2014 at 5:16 PM, Diane Griffith wrote: > I did tests comparing 1, 2, 10, 20, 50, 100 clients spawned all querying. > Performance on 2 nodes starts to degrade from 10 clients on. I saw > similar behavior on 4 nodes but haven't done the official runs on that yet. > > Ok, if you'v

Re: horizontal query scaling issues follow on

2014-07-17 Thread Diane Griffith
So I stripped out the number of clients experiment path information. It is unclear if I can only show horizontal scaling by also spawning many client requests all working at once. So that is why I stripped that information out to distill what our original attempt was at how to show horizontal sca

Re: horizontal query scaling issues follow on

2014-07-17 Thread Robert Coli
On Thu, Jul 17, 2014 at 3:21 PM, Diane Griffith wrote: > So do partitions equate to tokens/vnodes? > A partition is what used to be called a "row". Each individual token in the token ring can contain a partition, which you request using the token as the key. A "token range" is the space betwee

Re: horizontal query scaling issues follow on

2014-07-17 Thread Diane Griffith
er. I didn't think I was hitting an i/o wall on the client vm (separate vm) where we command line scripted our query call to the cassandra cluster. I can break the client call load across vms which I tried early on. Happy to verify that again though. So given that I was assuming the partition

Re: horizontal query scaling issues follow on

2014-07-17 Thread Jack Krupansky
a single partition would certainly not be a test of “horizontal scaling” (adding nodes to handle more data – more token values or partitions.) -- Jack Krupansky From: Diane Griffith Sent: Thursday, July 17, 2014 1:33 PM To: user Subject: horizontal query scaling issues follow on This is a

horizontal query scaling issues follow on

2014-07-17 Thread Diane Griffith
Procedure: - Inserted 54 million cells in 18 million rows (so 3 cells per row), using randomly generated row keys. That was to be our data control for the test. - Spawn a client on a different VM to query 100k rows and do that for 100 reps. Each row key queried is drawn randomly

How to get different columns for different rows in one query from Cassandra?

2014-07-09 Thread srinivas rao
Hi, Is there any way to get values for column "column1" for key "rowkey1" and column "column2" for key "rowkey2" and column "columns2" and "column3" for key "rowkey3" etc' from Cassandra in one single query? Thanks Srini

Re: RPC timeout paging secondary index query results

2014-07-02 Thread Phil Luckhurst
oing to have to rework our data model to avoid them.

Re: RPC timeout paging secondary index query results

2014-07-01 Thread Ken Hancock
Luckhurst < phil.luckhu...@powerassure.com> wrote: > But would you expect performance to drop off so quickly? At 250,000 records > we can still page through the query with LIMIT 5 but when adding an > additional 50,000 records we can't page past the first 10,000 records

Read 75k live rows in a query that should only return 500 (in queue-like table).

2014-06-30 Thread Kevin Burton
ssentially my schema is: bucket: int sequence: long value: text… primary key( bucket, sequence ) … value is just a big chunk of html. sequence is a timestamp essentially. I have 100 buckets… and that's the partition key. So I can stick these buckets across 100 servers token ranges. The que

CQL IN query with 2i index

2014-06-14 Thread tommaso barbugli
Hi there, I was wondering if there is a good reason for select queries on secondary indexes to not support any where operator other than the equality operator, or if its just a missing feature in CQL. Thanks, Tommaso

Re: Cannot query secondary index

2014-06-13 Thread Mohit Anchlia
r the index item. > > The cost of the "every once in a while" delete may be infrequent enough > for you to do what you were actually trying to do in the first place, use a > secondary index and query the table leveraging the ALLOW FILTERING clause. > > My recommendation

Re: Cannot query secondary index

2014-06-13 Thread Jonathan Lacefield
elps manage the effort of the manual delete. However, you would still have to insert into this separate table per the index item. The cost of the "every once in a while" delete may be infrequent enough for you to do what you were actually trying to do in the first place, use a seco

Re: CQL query regarding indexes

2014-06-13 Thread Akash Pandey
your query (which, being a date, you can get from the timestamp you are searching (e.g. 140154480)) and the range of timestamps you want. You won't need any secondary indices in this solution. If you need to make some queries on partition id also, keep the original table but you'll need the

Re: RPC timeout paging secondary index query results

2014-06-13 Thread Phil Luckhurst
But would you expect performance to drop off so quickly? At 250,000 records we can still page through the query with LIMIT 5 but when adding an additional 50,000 records we can't page past the first 10,000 records even if we drop to LIMIT 10. What about the case where we add 100,000 re

Re: CQL query regarding indexes

2014-06-12 Thread Jabbar Azam
' : 64 }; > > CREATE INDEX idx_messagepayload_senttime ON services.messagepayload > (senttime); > > While I am running the below query I am getting an exception. > > SELECT * FROM b_bank_services.messagepayload WHERE senttime>=140154480 > AND senttime<=140171760

Re: CQL query regarding indexes

2014-06-12 Thread Bulat Shakirzyanov
As far as I can tell, the problem is that you're not using a partition key in your query. AFAIK, you always have to use the partition key in the WHERE clause. And the ALLOW FILTERING option lets Cassandra filter data from the rows it found using the partition key. One way to solve it is to

CQL query regarding indexes

2014-06-12 Thread Roshan
ssion' : 'LZ4Compressor', 'chunk_length_kb' : 64 }; CREATE INDEX idx_messagepayload_senttime ON services.messagepayload (senttime); While I am running the below query I am getting an exception. SELECT * FROM b_bank_services.messagepayload WHERE senttime>=140154480 AND sentti

Re: RPC timeout paging secondary index query results

2014-06-12 Thread Robert Coli
On Thu, Jun 12, 2014 at 9:18 AM, Phil Luckhurst < phil.luckhu...@powerassure.com> wrote: > The problem appears to be directly related to number of entries in the > index. > I started with an empty table and added 50,000 entries at a time with the > same indexed value. All requests in Cassandra a

Re: RPC timeout paging secondary index query results

2014-06-12 Thread Phil Luckhurst
The problem appears to be directly related to the number of entries in the index. I started with an empty table and added 50,000 entries at a time with the same indexed value. I was able to page through the results of a query that used the secondary index with 250,000 records in the table using a

Re: Large number of row keys in query kills cluster

2014-06-12 Thread Laing, Michael
to have been requesting a large number of row keys >>> combined with a large number of named columns in a query. 20K rows with 20K >>> columns destroyed my cluster. Splitting it into slices of 100 sequential >>> queries fixed the performance issue. >>> >>> Wh
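
A minimal sketch of the workaround described in this thread, assuming "slices of 100" means 100 row keys per query, issued one after another. The names `chunked`, `fetch_in_slices`, and the `fetch_slice` callback are hypothetical; the callback stands in for whatever multi-key read your client (Hector, a CQL driver, etc.) provides, and no real driver API is shown here.

```python
from typing import Callable, Dict, Iterator, Sequence, TypeVar

K = TypeVar("K")
V = TypeVar("V")


def chunked(keys: Sequence[K], size: int) -> Iterator[Sequence[K]]:
    """Yield consecutive slices of `keys`, each at most `size` long."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(keys), size):
        yield keys[start:start + size]


def fetch_in_slices(
    keys: Sequence[K],
    fetch_slice: Callable[[Sequence[K]], Dict[K, V]],
    slice_size: int = 100,  # assumption: 100 keys per query, as in the thread
) -> Dict[K, V]:
    """Run one small multi-key query per slice, sequentially, and merge the results.

    `fetch_slice` is a hypothetical callback that performs the actual read
    for a small batch of row keys and returns a {key: value} mapping.
    """
    results: Dict[K, V] = {}
    for batch in chunked(keys, slice_size):
        results.update(fetch_slice(batch))
    return results
```

Issuing the slices sequentially keeps each request small for the coordinator node; whether 100 is the right slice size for a given cluster is something to measure rather than assume.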

Re: Large number of row keys in query kills cluster

2014-06-12 Thread Jeremy Jongsma
emy Jongsma > wrote: > >> The big problem seems to have been requesting a large number of row keys >> combined with a large number of named columns in a query. 20K rows with 20K >> columns destroyed my cluster. Splitting it into slices of 100 sequential >> queries fixed

Re: Large number of row keys in query kills cluster

2014-06-12 Thread Peter Sanford
On Wed, Jun 11, 2014 at 9:17 PM, Jack Krupansky wrote: > Hmmm... that multiple-gets section is not present in the 2.0 doc: > > http://www.datastax.com/documentation/cassandra/2.0/cassandra/architecture/architecturePlanningAntiPatterns_c.html > > Was that intentional – is that anti-pattern no lon

Re: Large number of row keys in query kills cluster

2014-06-11 Thread Jack Krupansky
batches” as an anti-pattern: http://www.slideshare.net/mattdennis -- Jack Krupansky From: Peter Sanford Sent: Wednesday, June 11, 2014 7:34 PM To: user@cassandra.apache.org Subject: Re: Large number of row keys in query kills cluster On Wed, Jun 11, 2014 at 10:12 AM, Jeremy Jongsma wrote: The
