[jira] [Commented] (CASSANDRA-9619) Read performance regression in tables with many columns on trunk and 2.2 vs. 2.1

2015-06-27 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14604413#comment-14604413
 ] 

Jonathan Ellis commented on CASSANDRA-9619:
---

Nice work, [~snazy]!

> Read performance regression in tables with many columns on trunk and 2.2 vs. 
> 2.1
> 
>
> Key: CASSANDRA-9619
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9619
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Jim Witschey
>  Labels: perfomance
> Fix For: 2.2.0 rc2
>
>
> There seems to be a read performance regression in 2.2 and trunk, compared to 
> 2.1 and 2.0. I found it running cstar_perf jobs with 50-column tables. 2.2 may be 
> worse than trunk, though my results on that aren't consistent. The relevant 
> cstar_perf jobs are here:
> http://cstar.datastax.com/tests/id/273e2ea8-0fc8-11e5-816c-42010af0688f
> http://cstar.datastax.com/tests/id/3a8002d6-1480-11e5-97ff-42010af0688f
> http://cstar.datastax.com/tests/id/40ff2766-1248-11e5-bac8-42010af0688f
> The sequence of commands for these jobs is
> {code}
> stress write n=6500 -rate threads=300 -col n=FIXED\(50\)
> stress read n=6500 -rate threads=300
> stress read n=6500 -rate threads=300
> {code}
> Have a look at the operations per second going from [the first read 
> operation|http://cstar.datastax.com/graph?stats=273e2ea8-0fc8-11e5-816c-42010af0688f&metric=op_rate&operation=2_read&smoothing=1&show_aggregates=true&xmin=0&xmax=729.08&ymin=0&ymax=174379.7]
>  to [the second read 
> operation|http://cstar.datastax.com/graph?stats=273e2ea8-0fc8-11e5-816c-42010af0688f&metric=op_rate&operation=2_read&smoothing=1&show_aggregates=true&xmin=0&xmax=729.08&ymin=0&ymax=174379.7].
>  They've fallen from ~135K to ~100K comparing trunk to 2.1 and 2.0. It's 
> slightly worse for 2.2, and 2.2 operations per second fall continuously from 
> the first to the second read operation.
> There's a corresponding increase in read latency -- it's noticeable on trunk 
> and pretty bad on 2.2. Again, the latency gets higher and higher on 2.2 as 
> the read operations progress (see the graphs 
> [here|http://cstar.datastax.com/graph?stats=273e2ea8-0fc8-11e5-816c-42010af0688f&metric=95th_latency&operation=2_read&smoothing=1&show_aggregates=true&xmin=0&xmax=729.08&ymin=0&ymax=17.27]
>  and 
> [here|http://cstar.datastax.com/graph?stats=273e2ea8-0fc8-11e5-816c-42010af0688f&metric=95th_latency&operation=3_read&smoothing=1&show_aggregates=true&xmin=0&xmax=928.62&ymin=0&ymax=14.52]).
> I see a similar regression in a [more recent 
> test|http://cstar.datastax.com/graph?stats=40ff2766-1248-11e5-bac8-42010af0688f&metric=op_rate&operation=2_read&smoothing=1&show_aggregates=true&xmin=0&xmax=752.62&ymin=0&ymax=171799.1],
>  though in this one trunk performed worse than 2.2. This run also didn't 
> display the increasing latency in 2.2.
> This regression may show for smaller numbers of columns, but not as 
> prominently, as shown [in the results of this test with the stress default of 
> 5 
> columns|http://cstar.datastax.com/graph?stats=227cb89e-0fc8-11e5-9f14-42010af0688f&metric=99.9th_latency&operation=3_read&smoothing=1&show_aggregates=true&xmin=0&xmax=498.19&ymin=0&ymax=334.29].
>  There's an increase in latency variability on trunk and 2.2, but I don't see 
> a regression in summary statistics.
> My measurements aren't confounded by [the recent regression in 
> cassandra-stress|https://issues.apache.org/jira/browse/CASSANDRA-9558]; 
> cstar_perf uses the same stress program (from trunk) on all versions on the 
> cluster.
> I'm currently working to
> - reproduce with a smaller workload so this is easier to bisect and debug.
> - get results with larger numbers of columns, since we've seen the regression 
> on 50 columns but not the stress default of 5.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9619) Read performance regression in tables with many columns on trunk and 2.2 vs. 2.1

2015-06-27 Thread Robert Stupp (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14604072#comment-14604072
 ] 

Robert Stupp commented on CASSANDRA-9619:
-

The regression for this workload is caused by 
{{sstable_preemptive_open_interval_in_mb}} being ignored (hard-coded to {{-1}}) 
in 2.1.3 and 2.1.4. It is evaluated in versions before and after these releases.

cstar runs:
* [last "bisect" 
run|http://cstar.datastax.com/tests/id/8ed4f4c0-1c48-11e5-b36d-42010af0688f] 
that 
[identifies|http://cstar.datastax.com/graph?stats=8ed4f4c0-1c48-11e5-b36d-42010af0688f&metric=op_rate&operation=3_read&smoothing=1&show_aggregates=true&xmin=0&xmax=50.6&ymin=0&ymax=196958.3]
 this 
[commit|https://github.com/apache/cassandra/commit/cf3e748cbf1faaed68870f22a45edc603eb1b4e8].
* [cross 
check|http://cstar.datastax.com/tests/id/1eee9132-1c4f-11e5-bcd7-42010af0688f] 
with [latest 2.1 and 2.1 with that commit 
reversed|http://cstar.datastax.com/graph?stats=1eee9132-1c4f-11e5-bcd7-42010af0688f&metric=op_rate&operation=3_read&smoothing=1&show_aggregates=true&xmin=0&xmax=50.82&ymin=0&ymax=206758.2]
* [cross 
check|http://cstar.datastax.com/tests/id/53f35062-1c53-11e5-bcd7-42010af0688f] 
with [latest 2.2 and 2.2 with that commit 
reversed|http://cstar.datastax.com/graph?stats=53f35062-1c53-11e5-bcd7-42010af0688f&metric=op_rate&operation=3_read&smoothing=1&show_aggregates=true&xmin=0&xmax=50.27&ymin=0&ymax=195163.1]

That's the good news. More good news: a simple {{cassandra.yaml}} change can 
"solve" this regression.
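For reference, the workaround amounts to a one-line {{cassandra.yaml}} change (a sketch; {{-1}} disables preemptive opening, matching the effective behavior of 2.1.3/2.1.4):

```yaml
# Disable preemptive opening of compacting sstables,
# matching the (accidental) hard-coded behavior of 2.1.3/2.1.4.
# The shipped default is 50 (MB).
sstable_preemptive_open_interval_in_mb: -1
```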

The bad news, IMO, is that {{sstable_preemptive_open_interval_in_mb}} has some 
meaning and AFAIK should give some improvement for "matching" workloads. 
Frankly, I don't really know what to do next - whether to let it default to 
{{-1}}, stick with the current default of {{50}}, or change it to something else. 
IMO some extensive perf testing should be done (again??) to give better advice 
for this parameter.
I think this is also the reason why blade_11 and bdplab gave different results 
- one has SSDs and the other has spindles - just a guess. For reference, I've 
started the [2.1-cross-check on 
bdplab|http://cstar.datastax.com/tests/id/32532262-1cac-11e5-8031-42010af0688f].

More bad news: there seems to be another, smaller regression of approx. 1.5-4% 
for both reads and writes when comparing 2.1.4 to the current 2.1/2.2 branches 
with {{sstable_preemptive_open_interval_in_mb=-1}}. This one is much harder to 
find, but it is likely caused by "pure" code change(s).

Finally, I think we should have at least a daily cstar performance test with 
some "standard" workloads (90% writes, 90% reads, 50/50) against the current 
dev branches (2.1, 2.2, trunk), linked in cassci (since that's where we usually 
look). These tests don't need to run for long - 2M or 3M keys should be enough 
to catch obvious regressions. For more "detailed" results we already have 
extensive tests in place. Besides that, we should run perf tests before commit 
for everything that is likely to affect performance.

[jira] [Commented] (CASSANDRA-9619) Read performance regression in tables with many columns on trunk and 2.2 vs. 2.1

2015-06-26 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14602570#comment-14602570
 ] 

Benedict commented on CASSANDRA-9619:
-

2.2 is still consistently a good 6% slower :(

What's weird is that [~mambocab]'s runs seem to show the git bisect point 
prior to 2.1.6 as still slower than 2.1.6 
([link|http://cstar.datastax.com/graph?stats=700ea9f4-1ab9-11e5-ac85-42010af0688f&metric=op_rate&operation=3_read&smoothing=1&show_aggregates=true&xmin=0&xmax=564.85&ymin=0&ymax=169929.1]).

Looks like performance has hopped around a few times.



[jira] [Commented] (CASSANDRA-9619) Read performance regression in tables with many columns on trunk and 2.2 vs. 2.1

2015-06-25 Thread Jim Witschey (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14602365#comment-14602365
 ] 

Jim Witschey commented on CASSANDRA-9619:
-

Ryan's made some improvements to the {{cstar_perf}} backend that make it 
possible to run write workloads without deleting data afterwards, then run read 
workloads over those datasets. In that environment, I've got my initial writes 
down and I've got the first read step going. I'll review the results in the 
morning.



[jira] [Commented] (CASSANDRA-9619) Read performance regression in tables with many columns on trunk and 2.2 vs. 2.1

2015-06-25 Thread T Jake Luciani (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14601790#comment-14601790
 ] 

T Jake Luciani commented on CASSANDRA-9619:
---

Well, according to this run, it's really only the 2.1.5 issue; 2.2 is about the 
same: 
http://cstar.datastax.com/graph?stats=c118cac6-1b5b-11e5-8031-42010af0688f&metric=op_rate&operation=3_read&smoothing=1&show_aggregates=true&xmin=0&xmax=53.79&ymin=0&ymax=200154.9



[jira] [Commented] (CASSANDRA-9619) Read performance regression in tables with many columns on trunk and 2.2 vs. 2.1

2015-06-25 Thread Jim Witschey (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14601427#comment-14601427
 ] 

Jim Witschey commented on CASSANDRA-9619:
-

Good find, thank you. I'll start bisecting over that range, since we're not 
very confident about my earlier findings.



[jira] [Commented] (CASSANDRA-9619) Read performance regression in tables with many columns on trunk and 2.2 vs. 2.1

2015-06-25 Thread Robert Stupp (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14601298#comment-14601298
 ] 

Robert Stupp commented on CASSANDRA-9619:
-

Looks like the regression has been introduced by something [between 2.1.4 and 
2.1.5|http://cstar.datastax.com/graph?stats=1bf00fb6-1b3b-11e5-bcd7-42010af0688f&metric=op_rate&operation=2_read&smoothing=1&show_aggregates=true&xmin=0&xmax=57.2&ymin=0&ymax=197016.6].



[jira] [Commented] (CASSANDRA-9619) Read performance regression in tables with many columns on trunk and 2.2 vs. 2.1

2015-06-24 Thread Jim Witschey (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14600167#comment-14600167
 ] 

Jim Witschey commented on CASSANDRA-9619:
-

The bisect is down to this range of commits:

https://github.com/apache/cassandra/compare/f34f712ad340ddb9b03619c84c950a1b854244d6...a4d075800cd7a56f8b5091e502aae979c318972b

To me it seems plausible -- a change in the memory management system could 
cause unpredictable performance regressions. Unfortunately, these commits are 
merged in from 2.1, so they don't explain a performance difference between 2.1 
and 2.2, at least not on their own. That, plus the fact that the regression 
doesn't show itself every time, makes me wonder if I have incorrectly marked 
some bad commits as good based on one good run. I'll revisit some of my 
bisecting decisions.

In the meantime, [~tjake], do you think trying to revert that memory management 
code is doable/worth a try?
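For anyone repeating the bisect, the manual marking above can be automated with {{git bisect run}}. A minimal sketch - the {{check_perf.sh}} wrapper name and the 90%-of-baseline threshold are assumptions for illustration, not what was actually run (the real measurements went through cstar_perf):

```shell
#!/bin/sh
# Decide good/bad for `git bisect run` from a measured read op rate.
# Exit 0 (good) if the measured rate is at least 90% of the baseline.
check_op_rate() {
  measured=$1
  baseline=$2
  # Integer math: measured/baseline >= 0.9  <=>  measured*10 >= baseline*9
  if [ $(( measured * 10 )) -ge $(( baseline * 9 )) ]; then
    return 0   # good commit: rate within 10% of baseline
  else
    return 1   # bad commit: regression larger than 10%
  fi
}

# Typical usage (sketch):
#   git bisect start <bad-sha> <good-sha>
#   git bisect run ./check_perf.sh   # builds, runs stress, calls check_op_rate
```

Running each candidate commit more than once before marking it would also guard against the single-good-run problem described above.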

> Read performance regression in tables with many columns on trunk and 2.2 vs. 
> 2.1
> 
>
> Key: CASSANDRA-9619
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9619
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Jim Witschey
>Assignee: T Jake Luciani
>  Labels: perfomance
> Fix For: 2.2.0 rc2
>
>
> There seems to be a regression in read in 2.2 and trunk, as compared to 2.1 
> and 2.0. I found it running cstar_perf jobs with 50-column tables. 2.2 may be 
> worse than trunk, though my results on that aren't consistent. The relevant 
> cstar_perf jobs are here:
> http://cstar.datastax.com/tests/id/273e2ea8-0fc8-11e5-816c-42010af0688f
> http://cstar.datastax.com/tests/id/3a8002d6-1480-11e5-97ff-42010af0688f
> http://cstar.datastax.com/tests/id/40ff2766-1248-11e5-bac8-42010af0688f
> The sequence of commands for these jobs is
> {code}
> stress write n=6500 -rate threads=300 -col n=FIXED\(50\)
> stress read n=6500 -rate threads=300
> stress read n=6500 -rate threads=300
> {code}
> Have a look at the operations per second going from [the first read 
> operation|http://cstar.datastax.com/graph?stats=273e2ea8-0fc8-11e5-816c-42010af0688f&metric=op_rate&operation=2_read&smoothing=1&show_aggregates=true&xmin=0&xmax=729.08&ymin=0&ymax=174379.7]
>  to [the second read 
> operation|http://cstar.datastax.com/graph?stats=273e2ea8-0fc8-11e5-816c-42010af0688f&metric=op_rate&operation=2_read&smoothing=1&show_aggregates=true&xmin=0&xmax=729.08&ymin=0&ymax=174379.7].
>  They've fallen from ~135K to ~100K comparing trunk to 2.1 and 2.0. It's 
> slightly worse for 2.2, and 2.2 operations per second fall continuously from 
> the first to the second read operation.
> There's a corresponding increase in read latency -- it's noticeable on trunk 
> and pretty bad on 2.2. Again, the latency gets higher and higher on 2.2 as 
> the read operations progress (see the graphs 
> [here|http://cstar.datastax.com/graph?stats=273e2ea8-0fc8-11e5-816c-42010af0688f&metric=95th_latency&operation=2_read&smoothing=1&show_aggregates=true&xmin=0&xmax=729.08&ymin=0&ymax=17.27]
>  and 
> [here|http://cstar.datastax.com/graph?stats=273e2ea8-0fc8-11e5-816c-42010af0688f&metric=95th_latency&operation=3_read&smoothing=1&show_aggregates=true&xmin=0&xmax=928.62&ymin=0&ymax=14.52]).
> I see a similar regression in a [more recent 
> test|http://cstar.datastax.com/graph?stats=40ff2766-1248-11e5-bac8-42010af0688f&metric=op_rate&operation=2_read&smoothing=1&show_aggregates=true&xmin=0&xmax=752.62&ymin=0&ymax=171799.1],
>  though in this one trunk performed worse than 2.2. This run also didn't 
> display the increasing latency in 2.2.
> This regression may show for smaller numbers of columns, but not as 
> prominently, as shown [in the results to this test with the stress default of 
> 5 
> columns|http://cstar.datastax.com/graph?stats=227cb89e-0fc8-11e5-9f14-42010af0688f&metric=99.9th_latency&operation=3_read&smoothing=1&show_aggregates=true&xmin=0&xmax=498.19&ymin=0&ymax=334.29].
>  There's an increase in latency variability on trunk and 2.2, but I don't see 
> a regression in summary statistics.
> My measurements aren't confounded by [the recent regression in 
> cassandra-stress|https://issues.apache.org/jira/browse/CASSANDRA-9558]; 
> cstar_perf uses the same stress program (from trunk) on all versions on the 
> cluster.
> I'm currently working to
> - reproduce with a smaller workload so this is easier to bisect and debug.
> - get results with larger numbers of columns, since we've seen the regression 
> on 50 columns but not the stress default of 5.





[jira] [Commented] (CASSANDRA-9619) Read performance regression in tables with many columns on trunk and 2.2 vs. 2.1

2015-06-22 Thread Jim Witschey (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14595924#comment-14595924
 ] 

Jim Witschey commented on CASSANDRA-9619:
-

Bisecting is going, but slowly -- I've only been able to reproduce the 
regression with a lot of data written, so each run takes several hours. The 
current bisect log is in the description of [this {{cstar_perf}} 
job|http://cstar.datastax.com/tests/id/30987b60-18e3-11e5-9c22-42010af0688f].
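
For a sense of the turnaround time: git bisect halves the candidate range each
run, so it needs roughly ceil(log2(N)) runs for N candidate commits. A rough
cost sketch (the 100-commit range and 3 hours per run are assumed figures for
illustration, not measurements):

```shell
# git bisect needs about ceil(log2(N)) test runs for N commits.
commits=100
hours_per_run=3
runs=0
n=1
while [ "$n" -lt "$commits" ]; do
  n=$((n * 2))        # each run halves the remaining range
  runs=$((runs + 1))
done
echo "${runs} runs, ~$((runs * hours_per_run)) hours"  # → 7 runs, ~21 hours
```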








[jira] [Commented] (CASSANDRA-9619) Read performance regression in tables with many columns on trunk and 2.2 vs. 2.1

2015-06-21 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14595320#comment-14595320
 ] 

Jonathan Ellis commented on CASSANDRA-9619:
---

Any progress, Jim?



