Nope. No secondary index. Just a slice query on the PK.
On Tuesday, October 7, 2014, Robert Coli wrote:
On Tue, Oct 7, 2014 at 3:11 PM, Owen Kim wrote:
> Sigh, it is a bit grating. I (genuinely) appreciate your acknowledgement
> of that. Though, I didn't intend for the question to be "about"
> supercolumns.
>
(Yep, understand though that if you hadn't been told that advice before, it
would grate a lot.)
Sigh, it is a bit grating. I (genuinely) appreciate your acknowledgement of
that. Though, I didn't intend for the question to be "about" supercolumns.
It is possible I'm hitting an odd edge case though I'm having trouble
reproducing the issue in a controlled environment since there seems to be a
t
On Tue, Oct 7, 2014 at 2:03 PM, Owen Kim wrote:
> I'm aware. I've had the system up since pre-composite columns and haven't
> had the cycles to do a major data and schema migration.
>
> And that's not "slightly" non-responsive.
>
"There may be unknown bugs in the code you're using, especially be
I'm aware. I've had the system up since pre-composite columns and haven't
had the cycles to do a major data and schema migration.
And that's not "slightly" non-responsive.
On Tue, Oct 7, 2014 at 1:49 PM, Robert Coli wrote:
> On Tue, Oct 7, 2014 at 1:38 PM, Owen Kim wrote:
>
>> I'm running Cass
On Tue, Oct 7, 2014 at 1:38 PM, Owen Kim wrote:
> I'm running Cassandra 1.2.16 with supercolumns and Hector.
>
Slightly non-responsive response:
In general supercolumn use is not recommended. It makes it more difficult
to get support when one uses a feature no one else uses.
=Rob
and replicate_on_write = true
and compaction_strategy =
'org.apache.cassandra.db.compaction.LeveledCompactionStrategy'
and caching = 'KEYS_ONLY';
I'm adding a time series supercolumn, then doing a slice query over
this super column. I'm really just trying to see if any data is in the time
slice s
If it is of the same cause, does that mean I should switch to
SizeTieredCompactionStrategy?
--
View this message in context:
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/CQL-query-throws-TombstoneOverwhelmingException-against-a-LeveledCompactionStrategy-table-tp7597077p759709
amount of records have
30 day TTL.
Now a simple CQL query like “select * from event_index limit 1” won’t run
and Cassandra log says
ERROR [ReadStage:68] 2014-10-01 15:40:14,751 SliceQueryFilter.java (line
200) Scanned over 10 tombstones in event_index; query aborted (see
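For readers of the archive: the compaction switch asked about above is a one-line ALTER TABLE. A sketch only, assuming the event_index table from this thread; whether STCS actually helps depends on why the tombstones accumulate:

```sql
-- Sketch, assuming the event_index table from this thread (CQL 3).
ALTER TABLE event_index
  WITH compaction = { 'class' : 'SizeTieredCompactionStrategy' };

-- With 30-day TTLs, a lower gc_grace_seconds lets expired cells be purged
-- sooner, but it is only safe if repair runs more often than this window.
ALTER TABLE event_index WITH gc_grace_seconds = 86400;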
>> It is expecting a 64 bit value … murmur3 partitioner uses 64 bit long
>> tokens… where did you get your 128 bit long from, and what partitioner are
>> you using?
>>
>> On Sep 28, 2014, at 1:39 PM, Kevin Burton wrote:
>>
>> I’m trying to query an entir
ioner uses 64 bit long tokens…
> where did you get your 128 bit long from, and what partitioner are you using?
>
> On Sep 28, 2014, at 1:39 PM, Kevin Burton wrote:
>
>> I’m trying to query an entire table in parallel by splitting it up in token
>> ranges.
>>
>>
> On Sep 28, 2014, at 1:39 PM, Kevin Burton wrote:
>
> I’m trying to query an entire table in parallel by splitting it up in
> token ranges.
>
> However, it’s not working because I get this:
>
> cqlsh:blogindex> select token(hashcode), hashcode from source where
It is expecting a 64 bit value … murmur3 partitioner uses 64 bit long tokens…
where did you get your 128 bit long from, and what partitioner are you using?
On Sep 28, 2014, at 1:39 PM, Kevin Burton wrote:
> I’m trying to query an entire table in parallel by splitting it up in token
>
I’m trying to query an entire table in parallel by splitting it up in token
ranges.
However, it’s not working because I get this:
cqlsh:blogindex> select token(hashcode), hashcode from source where
token(hashcode) >= 0 and token(hashcode) <=
17014118346046923173168730371588410572 limit
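For context on the error above: the Murmur3Partitioner token space is the signed 64-bit range [-2^63, 2^63-1], so the 38-digit value in the query cannot be a valid token (it looks more like a 128-bit RandomPartitioner token). A plain-Python sketch, not driver code, of splitting the 64-bit space into contiguous ranges for a parallel scan:

```python
# Murmur3Partitioner tokens are signed 64-bit longs.
MIN_TOKEN = -2**63
MAX_TOKEN = 2**63 - 1

def split_token_range(n):
    """Split the full token space into n contiguous (start, end] ranges,
    each usable as: SELECT ... WHERE token(pk) > start AND token(pk) <= end."""
    total = MAX_TOKEN - MIN_TOKEN
    ranges = []
    start = MIN_TOKEN
    for i in range(1, n + 1):
        end = MIN_TOKEN + (total * i) // n
        ranges.append((start, end))
        start = end
    return ranges

ranges = split_token_range(4)
assert ranges[0][0] == MIN_TOKEN and ranges[-1][1] == MAX_TOKEN

# The 38-digit value from the failing query cannot be a Murmur3 token:
assert 17014118346046923173168730371588410572 > MAX_TOKEN
```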
Hi,
I have a column family storing very large blobs that I would not like to
duplicate, if possible.
Here's a simplified version:
CREATE TABLE timeline (
key text,
a int,
b int,
value blob,
PRIMARY KEY (key, a, b)
);
On this, I run exactly two types of query. Both of them
>> (e.g. [min(-9223372036854775808), max(-9193352069377957523), and
>> (max(-9136021049555745100), max(-8959555493872108621)], etc. ]
>>
>> Seems like it needs to query data in token order. So,
>> min(-9223372036854775808), max(-9193352069377957523) on 192.168.51.22.
> just do one scan through all of the ranges held by it, shouldn't it?
> (e.g. [min(-9223372036854775808), max(-9193352069377957523), and
> (max(-9136021049555745100), max(-8959555493872108621)], etc. ]
>
> Seems like it needs to query data in token order. So,
> min(-922337203685
he queried data, at once. Also, internally that node should be able to
just do one scan through all of the ranges held by it, shouldn't it?
(e.g. [min(-9223372036854775808), max(-9193352069377957523), and
(max(-9136021049555745100), max(-8959555493872108621)], and etc. ]
Seems like it needs to
On Fri, Sep 19, 2014 at 2:19 PM, DuyHai Doan wrote:
> But does it imply that with vnodes there is actually "extra work" to
> do for scanning indices?
>
Vnodes are just nodes, so they have all the
problems-associated-with-many-nodes one would get with 256x as many nodes.
=Rob
ges for
the queried data, at once. Also, internally that node should be able to
just do one scan through all of the ranges held by it, shouldn't it?
(e.g. [min(-9223372036854775808), max(-9193352069377957523), and
(max(-9136021049555745100), max(-8959555493872108621)], etc. ]
Seems like it needs to q
On Fri, Sep 19, 2014 at 4:19 PM, DuyHai Doan wrote:
>
> But does it imply that with vnodes there is actually "extra work" to
> do for scanning indices?
>
Yes.
> If yes, is this "extra load" I/O bound or CPU bound?
>
It doesn't nece
es, there is actually "extra work" to
do for scanning indices? If yes, is this "extra load" I/O bound or
CPU bound?
On Fri, Sep 19, 2014 at 11:10 PM, Tyler Hobbs wrote:
>
> On Fri, Sep 19, 2014 at 12:41 PM, Jay Patel
> wrote:
>
>>
>> Btw
On Fri, Sep 19, 2014 at 12:41 PM, Jay Patel wrote:
>
> Btw, there is no data in the table. Table is empty. Query is fired on the
> empty table.
>
This is actually the worst case for secondary index lookups.
>
> From the tracing output, I don't understand why it's d
is no data in the table. Table is empty. Query is fired on the
empty table.
From the tracing output, I don't understand why it's doing multiple scans on
one node. Without vnodes, there is only one scan per node and the same query
works fine.
If you look at the output1.txt attached e
r 19, 2014 4:01 AM
> To: user@cassandra.apache.org
> Subject: Re: Slow down of secondary index query with VNODE (C* version
> 1.2.18, jre6).
>
> Keep in mind secondary indexes in cassandra are not there to improve
> performance, or even really be used in a serious user facing manner.
Of
Jonathan Haddad
Sent: Friday, September 19, 2014 4:01 AM
To: user@cassandra.apache.org
Subject: Re: Slow down of secondary index query with VNODE (C* version 1.2.18,
jre6).
Keep in mind secondary indexes in cassandra are not there to improve
performance, or even really be used in a serious
seeing extreme slow down (500ms to 1s) in query on secondary index
> with vnode. I'm seeing multiple secondary index scans on a given node in
> trace output when vnode is enabled. Without vnode, everything is good.
>
> Cluster size: 6 nodes
> Replication factor: 3
> Consisten
Hello Jay
Your query is : "select * from keyspaceuser.company_testusers where
lastname = ‘lau’ LIMIT 1"
Why do you think that the slowness is due to vnodes and not your query
asking for 10,000 results?
On Fri, Sep 19, 2014 at 3:33 AM, Jay Patel wrote:
> Hi there,
>
> I'm looking for non-blocking async query sample code. The one I found in
> the following link is async query but blocking. Could anyone share such
> code?
>
>
> http://www.datastax.com/documentation/developer/java-driver/1.0/java-driver/asynchronous_t.html
>
> Thanks
> Gary
>
--
Stephen Portanova
(480) 495-2634
Hello
I'm looking for non-blocking async query sample code. The one I found in
the following link is async query but blocking. Could anyone share such
code?
http://www.datastax.com/documentation/developer/java-driver/1.0/java-driver/asynchronous_t.html
Thanks
Gary
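The distinction the question is after: an async execute returns a future immediately, but calling get()/result() on that future blocks. A truly non-blocking approach registers a callback instead. A stand-in sketch using Python's stdlib concurrent.futures (the DataStax drivers expose the same future/callback shape; execute_query here is a placeholder, not driver API):

```python
import concurrent.futures

def execute_query(cql):
    # Placeholder standing in for a driver's asynchronous execute; a real
    # driver would return immediately with its own future object.
    return {"query": cql, "rows": []}

pool = concurrent.futures.ThreadPoolExecutor(max_workers=4)
results = []

def on_done(future):
    # The callback runs when the future completes; the submitting thread
    # never blocks on .result() itself.
    results.append(future.result())

future = pool.submit(execute_query, "SELECT * FROM t LIMIT 1")
future.add_done_callback(on_done)  # returns immediately: non-blocking

# ... the caller is free to do other work here ...
pool.shutdown(wait=True)  # demo only: wait before inspecting results
assert results == [{"query": "SELECT * FROM t LIMIT 1", "rows": []}]
```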
Thanks Michael I will certainly go with this approach for now.
-Subodh
On Mon, Sep 1, 2014 at 6:33 AM, Laing, Michael
wrote:
> This should work for your query requirements - 2 tables w same info because
> disk is cheap and writes are fast so optimize for reads:
>
> CREATE TABLE
Laing, Michael
> Sent: Monday, September 1, 2014 11:34 AM
> To: user@cassandra.apache.org
> Subject: Re: Help with select IN query in cassandra
>
> Did the OP propose that?
>
>
> On Mon, Sep 1, 2014 at 10:53 AM, Jack Krupansky
> wrote:
>>
>> One comment on del
1, 2014 11:34 AM
To: user@cassandra.apache.org
Subject: Re: Help with select IN query in cassandra
Did the OP propose that?
On Mon, Sep 1, 2014 at 10:53 AM, Jack Krupansky wrote:
One comment on deletions – aren’t deletions kind of an anti-pattern for
modern data processing, such as sensor
ging” rather than the exercise in
> futility of doing a massive number of deletes and updates in place?
>
> -- Jack Krupansky
>
> *From:* Laing, Michael
> *Sent:* Monday, September 1, 2014 9:33 AM
> *To:* user@cassandra.apache.org
> *Subject:* Re: Help with select IN quer
Krupansky
From: Laing, Michael
Sent: Monday, September 1, 2014 9:33 AM
To: user@cassandra.apache.org
Subject: Re: Help with select IN query in cassandra
This should work for your query requirements - 2 tables w same info because
disk is cheap and writes are fast so optimize for reads:
CREATE TABLE
This should work for your query requirements - 2 tables w same info because
disk is cheap and writes are fast so optimize for reads:
CREATE TABLE sensor_asset (
asset_id text,
event_time timestamp,
tuuid timeuuid,
sensor_reading map,
sensor_serial_number text,
sensor_type int
"timestamp" timeuuid,
sensor_reading map,
sensor_serial_number text,
sensor_type int,
PRIMARY KEY ((asset_id, "timestamp"), event_time)
);
It does what I want to do, and I removed the index for the timestamp item
since it is now part of the primary key and thus my query l
> Hmm. Because the clustering key is (event_time, "timestamp"), event_time
> must be specified as well - hopefully that info is available to the ux.
>
> Unfortunately you will then hit another problem with your query: you are
> selecting a collection field... this will not
Hmm. Because the clustering key is (event_time, "timestamp"), event_time
must be specified as well - hopefully that info is available to the ux.
Unfortunately you will then hit another problem with your query: you are
selecting a collection field... this will not work with IN on "
et_id), event_time, "timestamp")
>> );
>>
>> CREATE INDEX event_time_index ON sensor_info_table (event_time);
>>
>> CREATE INDEX timestamp_index ON sensor_info_table ("timestamp");
>>
>> Now I am able to insert the data into this table, howev
quot;timestamp");
>
> Now I am able to insert the data into this table, however I am unable
> to do following query where I want to select items with specific
> timeuuid values.
>
> It gives me following error.
>
> SELECT * from mydb.sensor_info_table where timestamp IN (
event_time_index ON sensor_info_table (event_time);
CREATE INDEX timestamp_index ON sensor_info_table ("timestamp");
Now I am able to insert the data into this table; however, I am unable
to do the following query, where I want to select items with specific
timeuuid values.
It gives me f
d a
>> definitive answer on this but all I have come up with is this (old,
>> non-authoritative) blog post which states "Cassandra’s native index is like
>> a hashed index, which means you can only do equality query and not range
>> query."
>
>
> Somewhere i
/www.gtafe.com/
>
>
> *From:* 鄢来琼 [mailto:laiqiong@gtafe.com]
> *Sent:* August 20, 2014 14:13
> *To:* user@cassandra.apache.org
> *Subject:* can not query data from cassandra
>
>
>
> HI ALL,
>
>
From: 鄢来琼 [mailto:laiqiong@gtafe.com]
Sent: August 20, 2014 14:13
To: user@cassandra.apache.org
Subject: can not query data from cassandra
Hi all,
I set up Cassandra on a Linux host.
I have inserted some data into the “mykeyspace.cffex_l23” table.
The following errors are raised during
Hi all,
I set up Cassandra on a Linux host.
I have inserted some data into the “mykeyspace.cffex_l23” table.
The following errors are raised when querying data from “mykeyspace.cffex_l23”.
Could you give me any suggestions to fix it?
According to the “top” cmd, I found that most of the memory is used by
I am running cqlsh on the same machine as my Cassandra server.
I am observing really strange behavior: if I do this query, all 3 rows show up.
SELECT asset_id,event_time,sensor_type, temperature,humidity from
temp_humidity_data ALLOW FILTERING;
asset_id | event_time | sensor_type
I should have asked where your coordinator node is located. Check its time
zone, relative to GMT.
cqlsh is simply formatting the time stamp for your local display. That is
separate from the actual query execution on the server coordinator node.
cqlsh is merely a "client", not t
7 05:33:17-0500 | 1 | 67.228 | 91.228
2 | 2014-08-17 05:33:19-0500 | 1 | 61.97 |73.97
So for the query I thought I should be giving time strings in the local timezone too, no?
-Subodh
On Sun, Aug 17, 2014 at 5:17 AM, Jack Krupansky wrote:
> Are you mor
Are you more than 7 time zones behind GMT? If so, that would make the 03:33 in
your query less than 03:33-0700. Your query is using the default time zone,
which will be the time zone configured for the coordinator node executing the
query.
IOW, where are you?
-- Jack Krupansky
-Original
1 | 61.97 |73.97
Now if I execute a query :
SELECT asset_id,event_time,sensor_type, temperature,humidity from
temp_humidity_data where asset_id='2' and event_time > '2014-08-17
03:33:20' ALLOW FILTERING;
it gives me back the same results (!), whereas I expected it to give me 0
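The timezone confusion in this thread can be reproduced without Cassandra: a timestamp string with no offset is interpreted in some default zone, so the same wall-clock string names different instants under different offsets. A plain-Python illustration (stdlib only; this mimics, not reuses, cqlsh's parsing):

```python
from datetime import datetime, timezone, timedelta

s = "2014-08-17 03:33:20"  # the bare string from the query above
naive = datetime.strptime(s, "%Y-%m-%d %H:%M:%S")

# The same wall-clock string pinned to two different zone assumptions:
as_utc = naive.replace(tzinfo=timezone.utc)
as_cdt = naive.replace(tzinfo=timezone(timedelta(hours=-5)))  # -0500, as in the rows

# As an absolute instant, 03:33:20-0500 is 5 hours later than 03:33:20 UTC:
diff = as_cdt - as_utc
assert diff == timedelta(hours=5)

# A row stamped 2014-08-17 05:33:19-0500 (10:33:19 UTC) is therefore after
# '03:33:20' read as UTC, so which rows match depends entirely on the zone
# the server assumes for the bare string.
row_ts = datetime(2014, 8, 17, 5, 33, 19, tzinfo=timezone(timedelta(hours=-5)))
assert row_ts > as_utc
assert row_ts > as_cdt
```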
nd a
> definitive answer on this but all I have come up with is this (old,
> non-authoritative) blog post which states "Cassandra’s native index is
> like a hashed index, which means you can only do equality query and not
> range query."
>
Somewhere in google I'm p
val bigint,
>> PRIMARY KEY ((foo_name, foo_shard))
>> ) WITH read_repair_chance=0.1;
>>
>> CREATE INDEX ON foo (int_val);
>> CREATE INDEX ON foo (foo_name);
>>
>> I have inserted just a single row into this table:
>> insert into foo(foo_name, foo_shard,
ow into this table:
> insert into foo(foo_name, foo_shard, int_val) values('dave', 27, 100);
>
> This query works fine:
> select * from foo where foo_name='dave';
>
> But when I run this query, I get an RPC timeout:
> select * from foo where foo_name='
Frankly, no matter how inefficient / expensive the query is, surely it
should still work when there is only 1 row and 1 node (which is localhost)!
I'm starting to wonder if range queries on secondary indexes aren't
supported at all (although if that is the case, I would certainly prefe
ave"
2) how many primary keys of table "foo" match the condition int_val>0 --> read
from the 2nd index "int_val" where partition key > 0, so basically it is a
range scan
Once it gets all the results from 2nd indices, C* can query the primary
table to return data.
Agreed, but... in this case the table has ONE row, so what exactly could be
causing this timeout? I mean, it can’t be the row count, right?
-- Jack Krupansky
From: DuyHai Doan
Sent: Wednesday, August 13, 2014 9:01 AM
To: user@cassandra.apache.org
Subject: Re: range query times out (on 1 node
Hello Ian
Secondary indexes perform poorly with inequalities (<, ≤, >, ≥). Inequalities
force the server to scan the whole cluster to find the requested range, which
is clearly not optimal. That's the reason why you need to add "ALLOW
FILTERING" for the query to
Confusingly, it appears to be the presence of an index on int_val that is
causing this timeout. If I drop that index (leaving only the index on
foo_name) the query works just fine.
On Tue, Aug 12, 2014 at 10:25 PM, Ian Rose wrote:
> Hi -
>
> I am currently running a single Cassandr
, foo_shard))
) WITH read_repair_chance=0.1;
CREATE INDEX ON foo (int_val);
CREATE INDEX ON foo (foo_name);
I have inserted just a single row into this table:
insert into foo(foo_name, foo_shard, int_val) values('dave', 27, 100);
This query works fine:
select * from foo where foo_name='dav
>
> if you find that adding nodes causes performance to degrade I would
> suspect that you are querying data in one CQL statement that is spread over
> multiple partitions
This is exactly what is happening. The better way to query multiple
partitions is to simply despatch mult
The rest of the body of a Prepared result is:
<id><metadata><result_metadata>
where:
- <id> is [short bytes] representing the prepared query ID.
- <result_metadata> is defined exactly as for a Rows RESULT (See section
4.2.5.2; you can however assume that the Has_more_pages flag is always off)
and is the specification for the
ssage. The rest of the body of a Prepared result is:
<id><metadata><result_metadata>
where:
- <id> is [short bytes] representing the prepared query ID.
- <metadata> is defined exactly as for a Rows RESULT (See section
4.2.5.2) - this represents the type information for the query arguments
- <result_metadata> is defined exactly as for a R
Hi all,
I'm looking at the specification of statement preparation (section
4.2.5.4 of the CQL protocol) and I'm wondering whether the metadata
result of the PREPARE query only returns column information for the
query arguments, and not for the columns of the actual query result.
The
I posted the query wrong; I gave the query for 1 key versus the large batch
of ids I was testing.
What it was using for large batch was IN, so
Select * from foo where key IN and col_name='LATEST
So after breaking it down and reading as much as I can with regard to our
- s
So I appreciate all the help so far. Upfront, it is possible the schema
and data query pattern could be contributing to the problem. The schema
was born out of certain design requirements. If it proves to be part of
what makes the scalability crumble, then I hope it will help shape the
design
On Sun, Jul 20, 2014 at 6:12 PM, Diane Griffith
wrote:
> I am running tests again across different number of client threads and
> number of nodes but this time I tweaked some of the timeouts configured for
> the nodes in the cluster. I was able to get better performance on the
> nodes at 10 clie
Hello,
Here is the documentation for cfhistograms, which is in microseconds.
http://www.datastax.com/documentation/cassandra/2.0/cassandra/tools/toolsCFhisto.html
Your question about setting timeouts is subjective, but you have set your
timeout limits to 4 mins, which seems excessive.
The
I am running tests again across different number of client threads and
number of nodes but this time I tweaked some of the timeouts configured for
the nodes in the cluster. I was able to get better performance on the
nodes at 10 client threads by upping 4 timeout values in cassandra.yaml to
24
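The snippet cuts off before naming the four values; for reference, the request-timeout knobs in a 1.2/2.0-era cassandra.yaml are the following (the values shown are placeholders, not the settings used in this thread):

```yaml
# cassandra.yaml request-timeout knobs (milliseconds); placeholder values.
read_request_timeout_in_ms: 10000     # single-partition reads
range_request_timeout_in_ms: 10000    # range scans
write_request_timeout_in_ms: 10000    # single writes
request_timeout_in_ms: 10000          # catch-all for other operations
```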
key="Type1:1109dccb-169b-40ef-b7f8-d072f04d8139" and col_name="LATEST"
Read result from above query:
{"key":"1109dccb-169b-40ef-b7f8-d072f04d8139","keyType":"
Type1","state":"state3","timestamp":1303446284614,"eventId
On Fri, Jul 18, 2014 at 8:01 AM, Diane Griffith
wrote:
>
> Partition Size (bytes)
> 1109 bytes: 1800
>
> Cell Count per Partition
> 8 cells: 1800
>
> meaning I can't glean anything about how it partitioned or if it broke a
> key across partitions from this right? Does it mean for 180
stering
>>> columns, or does each row have a unique partition key and no clustering
>>> columns.
>>>
>>> -- Jack Krupansky
>>>
>>> *From:* Diane Griffith
>>> *Sent:* Thursday, July 17, 2014 6:21 PM
>>> *To:* user
>>> *Subjec
your primary key and whether you
>> are using a small number of partition keys and a large number of clustering
>> columns, or does each row have a unique partition key and no clustering
>> columns.
>>
>> -- Jack Krupansky
>>
>> *From:* Diane Griffith
>
g
> columns.
>
> -- Jack Krupansky
>
> *From:* Diane Griffith
> *Sent:* Thursday, July 17, 2014 6:21 PM
> *To:* user
> *Subject:* Re: horizontal query scaling issues follow on
>
> So do partitions equate to tokens/vnodes?
>
> If so we had configured all
The problem with starting without vnodes is that moving to them is a bit
hairy. In particular, nodetool shuffle has been reported to take an
extremely long time (days, weeks). I would start with vnodes if you
have any intent on using them.
On Thu, Jul 17, 2014 at 6:03 PM, Robert Coli wrote:
> On Thu
whether you are
using a small number of partition keys and a large number of clustering
columns, or does each row have a unique partition key and no clustering columns.
-- Jack Krupansky
From: Diane Griffith
Sent: Thursday, July 17, 2014 6:21 PM
To: user
Subject: Re: horizontal query scaling
On Thu, Jul 17, 2014 at 5:16 PM, Diane Griffith
wrote:
> I did tests comparing 1, 2, 10, 20, 50, 100 clients spawned all querying.
> Performance on 2 nodes starts to degrade from 10 clients on. I saw
> similar behavior on 4 nodes but haven't done the official runs on that yet.
>
>
Ok, if you'v
So I stripped out the number of clients experiment path information. It is
unclear if I can only show horizontal scaling by also spawning many client
requests all working at once. So that is why I stripped that information
out to distill what our original attempt was at how to show horizontal
sca
On Thu, Jul 17, 2014 at 3:21 PM, Diane Griffith
wrote:
> So do partitions equate to tokens/vnodes?
>
A partition is what used to be called a "row".
Each individual token in the token ring can contain a partition, which you
request using the token as the key.
A "token range" is the space betwee
er.
I didn't think I was hitting an i/o wall on the client vm (separate vm)
where we command line scripted our query call to the cassandra cluster.
I can break the client call load across vms which I tried early on. Happy
to verify that again though.
So given that I was assuming the partition
a single partition would certainly not be a test of
“horizontal scaling” (adding nodes to handle more data – more token values or
partitions.)
-- Jack Krupansky
From: Diane Griffith
Sent: Thursday, July 17, 2014 1:33 PM
To: user
Subject: horizontal query scaling issues follow on
This is a
Procedure:
- Inserted 54 million cells in 18 million rows (so 3 cells per row),
using randomly generated row keys. That was to be our data control for the
test.
- Spawn a client on a different VM to query 100k rows and do that for
100 reps. Each row key queried is drawn randomly
Hi,
Is there any way to get values for column "column1" for key "rowkey1", and
column "column2" for key "rowkey2", and columns "column2" and "column3" for
key "rowkey3", etc., from Cassandra in one single query?
Thanks
Srini
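No single query does this; the usual workaround is one narrow SELECT per key, dispatched concurrently and merged client-side. A stand-in sketch (plain Python; fetch and the key/column names are hypothetical placeholders, not driver API):

```python
import concurrent.futures

# Hypothetical keys and per-key column lists, mirroring the question above.
wanted = {
    "rowkey1": ["column1"],
    "rowkey2": ["column2"],
    "rowkey3": ["column2", "column3"],
}

def fetch(key, cols):
    # Stand-in for a driver call like: SELECT col, ... FROM cf WHERE key = ?
    # Here it just fabricates a value per requested column.
    return key, {c: f"{key}:{c}" for c in cols}

# Dispatch one narrow query per key concurrently, then merge client-side.
with concurrent.futures.ThreadPoolExecutor() as pool:
    futures = [pool.submit(fetch, k, cs) for k, cs in wanted.items()]
    merged = dict(f.result() for f in futures)

assert merged["rowkey3"] == {"column2": "rowkey3:column2",
                             "column3": "rowkey3:column3"}
```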
oing to have
to rework our data model to avoid them.
--
View this message in context:
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/RPC-timeout-paging-secondary-index-query-results-tp7595078p7595486.html
Phil Luckhurst <
phil.luckhu...@powerassure.com> wrote:
> But would you expect performance to drop off so quickly? At 250,000 records
> we can still page through the query with LIMIT 5 but when adding an
> additional 50,000 records we can't page past the first 10,000 records
ssentially my schema is:
bucket: int
sequence: long
value: text…
primary key( bucket, sequence )
… value is just a big chunk of html.
sequence is a timestamp essentially.
I have 100 buckets… and that's the partition key. So I can stick these
buckets across 100 servers token ranges.
The que
Hi there,
I was wondering if there is a good reason for select queries on secondary
indexes to not support any where operator other than the equality operator,
or if its just a missing feature in CQL.
Thanks,
Tommaso
r the index item.
>
> The cost of the "every once in a while" delete may be infrequent enough
> for you to do what you were actually trying to do in the first place, use a
> secondary index and query the table leveraging the ALLOW FILTERING clause.
>
> My recommendation
elps manage the
effort of the manual delete. However, you would still have to insert into
this separate table per the index item.
The cost of the "every once in a while" delete may be infrequent enough
for you to do what you were actually trying to do in the first place, use a
seco
your query (which, being a date, you can get from the timestamp
you are searching (e.g. 140154480)) and the range of timestamps you
want. You won't need any secondary indices in this solution.
If you need to make some queries on partition id also, keep the original
table but you'll need the
But would you expect performance to drop off so quickly? At 250,000 records
we can still page through the query with LIMIT 5 but when adding an
additional 50,000 records we can't page past the first 10,000 records even
if we drop to LIMIT 10.
What about the case where we add 100,000 re
> 'chunk_length_kb' : 64 };
>
> CREATE INDEX idx_messagepayload_senttime ON services.messagepayload
> (senttime);
>
> While I am running the below query I am getting an exception.
>
> SELECT * FROM b_bank_services.messagepayload WHERE senttime>=140154480
> AND senttime<=140171760
As far as I can tell, the problem is that you're not using a partition key
in your query. AFAIK, you always have to use a partition key in the where clause.
And the ALLOW FILTERING option is to let Cassandra filter data from the rows it
found using the partition key.
One way to solve it is to
'sstable_compression' : 'LZ4Compressor', 'chunk_length_kb' : 64 };
CREATE INDEX idx_messagepayload_senttime ON services.messagepayload
(senttime);
While I am running the below query I am getting an exception.
SELECT * FROM b_bank_services.messagepayload WHERE senttime>=140154480
AND sentti
On Thu, Jun 12, 2014 at 9:18 AM, Phil Luckhurst <
phil.luckhu...@powerassure.com> wrote:
> The problem appears to be directly related to number of entries in the
> index.
> I started with an empty table and added 50,000 entries at a time with the
> same indexed value.
All requests in Cassandra a
The problem appears to be directly related to the number of entries in the index.
I started with an empty table and added 50,000 entries at a time with the
same indexed value. I was able to page through the results of a query that
used the secondary index with 250,000 records in the table using a
to have been requesting a large number of row keys
>>> combined with a large number of named columns in a query. 20K rows with 20K
>>> columns destroyed my cluster. Splitting it into slices of 100 sequential
>>> queries fixed the performance issue.
>>>
>>> Wh
Jeremy Jongsma
> wrote:
>
>> The big problem seems to have been requesting a large number of row keys
>> combined with a large number of named columns in a query. 20K rows with 20K
>> columns destroyed my cluster. Splitting it into slices of 100 sequential
>> queries fixed
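The fix described, slicing one huge multiget into batches of ~100 keys issued sequentially, is simple to sketch:

```python
def chunks(keys, size=100):
    """Yield consecutive slices of at most `size` keys, to be issued as
    separate queries instead of one multiget of tens of thousands of keys."""
    for i in range(0, len(keys), size):
        yield keys[i:i + size]

# 20K row keys, as in the thread, become 200 batches of 100.
batches = list(chunks([f"k{i}" for i in range(20000)], 100))
assert len(batches) == 200
assert all(len(b) == 100 for b in batches)
```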
On Wed, Jun 11, 2014 at 9:17 PM, Jack Krupansky
wrote:
> Hmmm... that multiple-gets section is not present in the 2.0 doc:
>
> http://www.datastax.com/documentation/cassandra/2.0/cassandra/architecture/architecturePlanningAntiPatterns_c.html
>
> Was that intentional – is that anti-pattern no lon
batches” as an anti-pattern:
http://www.slideshare.net/mattdennis
-- Jack Krupansky
From: Peter Sanford
Sent: Wednesday, June 11, 2014 7:34 PM
To: user@cassandra.apache.org
Subject: Re: Large number of row keys in query kills cluster
On Wed, Jun 11, 2014 at 10:12 AM, Jeremy Jongsma wrote:
The