RE: no change observed in read latency after switching from EBS to SSD storage

2014-09-18 Thread Mohammed Guller
Benedict,
That makes perfect sense. Even though the node has multiple cores, I do see 
that only one core is pegged at 100%.

Interestingly, after I switched to 2.1, cqlsh trace now shows that the same 
query takes only 600ms. However, cqlsh still waits almost 20-30 seconds before 
it starts showing the result. I noticed similar latency when I ran the query 
from our app, which uses the Astyanax driver. So I first thought there might 
be a bug in the cqlsh code that tracks the statistics and that the reported 
numbers were incorrect. But I now think the numbers shown by cqlsh trace are 
correct and the bottleneck is somewhere else. In other words, the read 
operation itself is much faster in 2.1, but something else delays the response 
back to the client.
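
One way to confirm that would be to compare the server-reported trace duration 
against client wall-clock time for the same query. Here is a minimal sketch 
using the DataStax Java driver (illustrative only, since we actually use 
Astyanax; the contact point and keyspace name are assumptions):

import com.datastax.driver.core.*;

public class TraceVsWallClock {
    public static void main(String[] args) {
        Cluster cluster = Cluster.builder().addContactPoint("10.10.100.5").build();
        Session session = cluster.connect("myks");   // keyspace name is an assumption

        // Same (masked) query as elsewhere in this thread, with tracing enabled
        Statement stmt = new SimpleStatement(
            "SELECT h, non_key100, non_key200 FROM dummy " +
            "WHERE a='' AND b='bb' AND c='ccc' AND d='dd' AND e='' AND f='ff' " +
            "AND g='g' AND h >= '2014-09-10T00:00:00' AND h <= '2014-09-10T23:40:41'")
            .enableTracing();

        long t0 = System.nanoTime();
        ResultSet rs = session.execute(stmt);
        int rows = rs.all().size();                  // force full materialization
        long wallMs = (System.nanoTime() - t0) / 1_000_000;

        // Server-side view of the same request, as reported by the trace
        QueryTrace trace = rs.getExecutionInfo().getQueryTrace();
        System.out.printf("rows=%d serverTrace=%dus wallClock=%dms%n",
                          rows, trace.getDurationMicros(), wallMs);
        cluster.close();
    }
}

If the wall-clock number stays at 20-30 seconds while the trace reports 
~600ms, the time is going into building and shipping the full result set 
rather than into the read itself.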

Mohammed

Re: no change observed in read latency after switching from EBS to SSD storage

2014-09-18 Thread Tony Anecito
Thanks for the detail.

1. So how do we know from the trace whether the read came from memory or from 
disk I/O? You might want to use a Java profiler such as VisualVM with a plugin 
that shows I/O, but you will need Cassandra developer knowledge to understand 
the output and to tune the tool's filters.

2. Also, I have never seen such a big primary key (mostly varchars) before, 
but maybe that is common for the data you are using. I usually use int keys: 
with a star schema and compound keys of at most two ints, my queries take less 
than 100 microseconds, and that covers two queries (one to get a list of keys, 
and a second that takes that list as input). I also changed the default key 
from a varchar to an int, and I may return only 40-100 records.

3. I have done performance testing with Oracle and SQL Server, and I rarely 
see response times this long, except maybe for ad-hoc queries against a 
mainframe.

Granted, I have a different set of requirements and data, and I have not even 
optimized the design and architecture yet, though I hope for sub-millisecond 
overall times. I am very pleased with Cassandra for what I am doing.


Good Luck,
-Tony



Re: no change observed in read latency after switching from EBS to SSD storage

2014-09-18 Thread Benedict Elliott Smith
It is possible this is CPU bound. In 2.1 we have optimised the comparison
of clustering columns (CASSANDRA-5417
<https://issues.apache.org/jira/browse/CASSANDRA-5417>), but in 2.0 it is
quite expensive. So for a large row with several million comparisons to
perform (to merge, filter, etc.) it could be a significant proportion of the
cost. Note that these costs for a given query are all bound by a single core;
there is no parallelism, since the assumption is that we are serving more
queries at once than there are cores. (In general, Cassandra is not designed
to serve workloads consisting of single large queries, at least not yet.)
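
To make that concrete, here is a toy, single-threaded Java sketch (not
Cassandra's actual code) of the kind of work involved: merging a wide
partition means comparing composite clustering values cell by cell, all on
one core. The iteration count is only illustrative:

import java.nio.ByteBuffer;

public class ClusteringCompareCost {
    // Compare two composite clustering keys component by component,
    // the way a merge across memtable/sstable iterators must.
    static int compare(ByteBuffer[] a, ByteBuffer[] b) {
        for (int i = 0; i < a.length; i++) {
            int c = a[i].compareTo(b[i]);
            if (c != 0) return c;
        }
        return 0;
    }

    public static void main(String[] args) {
        ByteBuffer[] x = { ByteBuffer.wrap("g".getBytes()),
                           ByteBuffer.wrap("2014-09-10T00:00:00".getBytes()) };
        ByteBuffer[] y = { ByteBuffer.wrap("g".getBytes()),
                           ByteBuffer.wrap("2014-09-10T23:40:41".getBytes()) };

        long start = System.nanoTime();
        long acc = 0;  // keep the loop from being optimised away
        for (int i = 0; i < 5_000_000; i++) {
            acc += compare(x, y);
        }
        System.out.printf("5M comparisons on one core: %d ms (acc=%d)%n",
                          (System.nanoTime() - start) / 1_000_000, acc);
    }
}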

RE: no change observed in read latency after switching from EBS to SSD storage

2014-09-17 Thread Mohammed Guller
Chris,
I agree that reading 250k rows is a bit excessive and that breaking up the 
partition would help reduce the query time. That part is well understood. The 
part that we can't figure out is why the read time did not change when we 
switched from slow network-attached storage (AWS EBS) to local SSD.

One possibility is that the read is not bound by disk I/O, but it is not CPU 
or memory bound either. So where is it spending all that time? Another 
possibility is that even though it is returning only 193311 cells, C* reads 
the entire partition, which may have a lot more cells. But even in that case, 
reading from a local SSD should have been a lot faster than reading from 
non-provisioned EBS.

Mohammed


Re: no change observed in read latency after switching from EBS to SSD storage

2014-09-17 Thread Chris Lohfink
"Read 193311 live and 0 tombstoned cells " 

is your killer. Returning 250k rows is a bit excessive; you should really page 
this in smaller chunks. What client are you using to access the data? This 
partition (a, b, c, d, e, f) may be too large as well (you can check the 
partition max size in the output of nodetool cfstats); it may be worth 
including g in the partition key to break it up more, but I don't know enough 
about your data model.
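
For example, here is a hedged sketch of driver-side paging with the DataStax
Java driver (whatever client you are on, the idea is the same; the contact
point, keyspace, and fetch size are assumptions). On 2.0+ the native protocol
can stream the partition in pages instead of one 250k-row response:

import com.datastax.driver.core.*;

public class PagedRead {
    public static void main(String[] args) {
        Cluster cluster = Cluster.builder().addContactPoint("10.10.100.5").build();
        Session session = cluster.connect("myks");   // keyspace name is an assumption

        Statement stmt = new SimpleStatement(
            "SELECT h, non_key100, non_key200 FROM dummy " +
            "WHERE a='' AND b='bb' AND c='ccc' AND d='dd' AND e='' AND f='ff' " +
            "AND g='g' AND h >= '2014-09-10T00:00:00' AND h <= '2014-09-10T23:40:41'")
            .setFetchSize(5000);                     // rows per page

        int rows = 0;
        for (Row row : session.execute(stmt)) {      // the driver fetches pages lazily
            rows++;                                  // process row.getString("non_key100") etc. here
        }
        System.out.println("streamed " + rows + " rows in 5000-row pages");
        cluster.close();
    }
}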

---
Chris Lohfink


RE: no change observed in read latency after switching from EBS to SSD storage

2014-09-17 Thread Mohammed Guller
Thank you all for your responses.

Alex –
  Instance (ephemeral) SSD

Ben –
the query reads data from just one partition. If disk i/o is the bottleneck, 
then in theory, if reading from EBS takes 10 seconds, then it should take lot 
less when reading the same amount of data from local SSD. My question is not 
about why it is taking 10 seconds, but why is the read time same for both EBS 
(network attached storage) and local SSD?

Tony –
if the data was cached in memory, then a read should not take 10 seconds just 
for 20MB data

Rob –
Here is the schema, query, and trace. I masked the actual column names to 
protect the innocents ☺

create table dummy(
  a   varchar,
  b   varchar,
  c   varchar,
  d   varchar,
  e   varchar,
  f   varchar,
  g   varchar,
  h   timestamp,
  i   int,
  non_key1   varchar,
  ...
  non_keyN   varchar,
  PRIMARY KEY ((a, b, c, d, e, f), g, h, i)
) WITH CLUSTERING ORDER BY (g ASC, h DESC, i ASC)

SELECT h, non_key100, non_key200 FROM dummy WHERE a='' AND b='bb' AND 
c='ccc' AND d='dd' AND e='' AND f='ff' AND g='g' AND 
h >= '2014-09-10T00:00:00' AND h <= '2014-09-10T23:40:41';

The above query returns around 250,000 CQL rows.

cqlsh trace:

activity | timestamp | source | source_elapsed
---------------------------------------------------------------------------
execute_cql3_query | 21:57:16,830 | 10.10.100.5 | 0
Parsing query; | 21:57:16,830 | 10.10.100.5 | 673
Preparing statement | 21:57:16,831 | 10.10.100.5 | 1602
Executing single-partition query on event | 21:57:16,845 | 10.10.100.5 | 14871
Acquiring sstable references | 21:57:16,845 | 10.10.100.5 | 14896
Merging memtable tombstones | 21:57:16,845 | 10.10.100.5 | 14954
Bloom filter allows skipping sstable 1049 | 21:57:16,845 | 10.10.100.5 | 15090
Bloom filter allows skipping sstable 989 | 21:57:16,845 | 10.10.100.5 | 15146
Partition index with 0 entries found for sstable 937 | 21:57:16,845 | 10.10.100.5 | 15565
Seeking to partition indexed section in data file | 21:57:16,845 | 10.10.100.5 | 15581
Partition index with 7158 entries found for sstable 884 | 21:57:16,898 | 10.10.100.5 | 68644
Seeking to partition indexed section in data file | 21:57:16,899 | 10.10.100.5 | 69014
Partition index with 20819 entries found for sstable 733 | 21:57:16,916 | 10.10.100.5 | 86121
Seeking to partition indexed section in data file | 21:57:16,916 | 10.10.100.5 | 86412
Skipped 1/6 non-slice-intersecting sstables, included 0 due to tombstones | 21:57:16,916 | 10.10.100.5 | 86494
Merging data from memtables and 3 sstables | 21:57:16,916 | 10.10.100.5 | 86522
Read 193311 live and 0 tombstoned cells | 21:57:24,552 | 10.10.100.5 | 7722425
Request complete | 21:57:29,074 | 10.10.100.5 | 12244832


Mohammed

Re: no change observed in read latency after switching from EBS to SSD storage

2014-09-17 Thread Robert Coli
On Tue, Sep 16, 2014 at 10:00 PM, Mohammed Guller 
wrote:

>  The 10 seconds latency that I gave earlier is from CQL tracing. Almost 5
> seconds out of that was taken up by the “merge memtable and sstables” step.
> The remaining 5 seconds are from “read live and tombstoned cells.”
>

Could you paste the query, the schema, and the trace? Your summary is not
likely to be as usefully diagnostic as those. :)

=Rob


Re: no change observed in read latency after switching from EBS to SSD storage

2014-09-17 Thread Alex Major
When you say you moved from EBS to SSD, do you mean the EBS HDD drives to
EBS SSD drives? Or instance SSD drives? The m3.large only comes with 32GB
of instance based SSD storage. If you're using EBS SSD drives then network
will still be the slowest thing so switching won't likely make much of a
difference.

RE: no change observed in read latency after switching from EBS to SSD storage

2014-09-16 Thread Mohammed Guller
Rob,
The 10-second latency that I gave earlier is from CQL tracing. Almost 5 
seconds of that was taken up by the “merge memtable and sstables” step. The 
remaining 5 seconds come from “read live and tombstoned cells.”

I too first thought that maybe disk is not the bottleneck and Cassandra is 
serving everything from cache, but in that case, it should not take 10 seconds 
for reading just 20MB data.

Also, I narrowed down the query to limit it to a single partition read and I 
ran the query in cqlsh running on the same node. I turned on tracing, which 
shows that all the steps got executed on the same node. htop shows that CPU and 
memory are not the bottlenecks. Network should not come into play since the 
cqlsh is running on the same node.

Is there any performance tuning parameter in the cassandra.yaml file for large 
reads?

Mohammed


Re: no change observed in read latency after switching from EBS to SSD storage

2014-09-16 Thread Ben Bromhead
EBS vs local SSD: in terms of latency you are talking about differences
measured in milliseconds. If your query runs for 10 seconds you will not
notice anything; what is a few milliseconds less over the life of a 10-second
query?

To reiterate what Rob said. The query is probably slow because of your use
case / data model, not the underlying disk.



-- 
Ben Bromhead
Instaclustr | www.instaclustr.com | @instaclustr | +61 415 936 359


Re: no change observed in read latency after switching from EBS to SSD storage

2014-09-16 Thread Tony Anecito
If you cached your tables or the database you may not see any difference at all.
 
Regards,
-Tony  



Re: no change observed in read latency after switching from EBS to SSD storage

2014-09-16 Thread Alex Kamil
Mohammed, to add to the previous answers: EBS is network-attached, so with or
without SSD you access your disk over the network, constrained by network
bandwidth and latency. If you really need to improve I/O performance, try
switching to ephemeral storage (also called instance storage), which is
physically attached to the EC2 instance and is as good as native disk I/O gets.


Re: no change observed in read latency after switching from EBS to SSD storage

2014-09-16 Thread James Briggs
To expand on what Robert said, Cassandra is a log-structured database:

- writes are append operations, so both correctly configured disk volumes and 
SSD are fast at that
- reads can be helped by SSD if the data is not in cache (i.e. it is on disk)
- but compaction is definitely helped by SSD with large data loads (compaction 
is the trade-off for fast writes)

 
Thanks, James Briggs. 
-- 
Cassandra/MySQL DBA. Available in San Jose area or remote. 
Mailbox dimensions: 10"x12"x14"




Re: no change observed in read latency after switching from EBS to SSD storage

2014-09-16 Thread Robert Coli
On Tue, Sep 16, 2014 at 5:35 PM, Mohammed Guller 
wrote:

> Does anyone have insight as to why we don't see any performance impact on
> the reads going from EBS to SSD?
>

What does it say when you enable tracing on this CQL query?

10 seconds is a really long time to access anything in Cassandra. There is,
generally speaking, a reason why the default timeouts are lower than this.

My conjecture is that the data in question was previously being served from
the page cache and is now being served from SSD. You have, in switching from
EBS-plus-page-cache to SSD, successfully proved that SSD and RAM are both
very fast. There is also a strong suggestion that whatever access pattern you
are using is not bounded by disk performance.
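
A quick way to see that effect (plain Java, nothing Cassandra-specific): time
the same large file read twice. If the second pass is dramatically faster, it
was served from the page cache, which is exactly what switching to SSD cannot
improve on:

import java.io.IOException;
import java.io.InputStream;
import java.nio.file.*;

public class PageCacheDemo {
    static long readAllMs(Path p) throws IOException {
        long t0 = System.nanoTime();
        try (InputStream in = Files.newInputStream(p)) {
            byte[] buf = new byte[1 << 20];          // 1 MB read buffer
            while (in.read(buf) != -1) { /* discard */ }
        }
        return (System.nanoTime() - t0) / 1_000_000;
    }

    public static void main(String[] args) throws IOException {
        Path p = Paths.get(args[0]);                 // e.g. a large sstable data file
        System.out.println("first read:  " + readAllMs(p) + " ms");
        System.out.println("second read: " + readAllMs(p) + " ms (likely page cache)");
    }
}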

=Rob


no change observed in read latency after switching from EBS to SSD storage

2014-09-16 Thread Mohammed Guller
Hi -

We are running Cassandra 2.0.5 on AWS on m3.large instances. These instances 
were using EBS for storage (I know it is not recommended). We replaced the EBS 
storage with SSDs. However, we didn't see any change in read latency. A query 
that took 10 seconds when data was stored on EBS still takes 10 seconds even 
after we moved the data directory to SSD. It is a large query returning 200,000 
CQL rows from a single partition. We are reading 3 columns from each row and 
the combined data in these three columns for each row is around 100 bytes. In 
other words, the raw data returned by the query is approximately 20MB.

I was expecting at least a 5-10x reduction in read latency going from EBS to 
SSD, so I am puzzled that we are not seeing any change in performance.

Does anyone have insight as to why we don't see any performance impact on the 
reads going from EBS to SSD?

Thanks,
Mohammed