[jira] [Comment Edited] (CASSANDRA-7247) Provide top ten most frequent keys per column family

2014-05-20 Thread Chris Lohfink (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14004005#comment-14004005
 ] 

Chris Lohfink edited comment on CASSANDRA-7247 at 5/20/14 10:15 PM:


Added a patch that uses the trace executor to track the partitions that are 
updated the most, the ones with the most columns inserted (useful for finding 
rows that are too wide), and the partitions with the slowest insertion times.  
It only tracks when the trace probability is > 0.  It returns 
(key, count, error) tuples.
!jconsole.png!
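
For reference, a minimal sketch of how a Space-Saving style summary ends up with (key, count, error) tuples. This is not the patch and not stream-lib's StreamSummary; the class and method names (SpaceSavingSketch, Counter, offer, topK) are made up for illustration, and the linear scan for the minimum is a simplification.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative Space-Saving counter; hypothetical, not stream-lib's implementation. */
class SpaceSavingSketch {
    /** One monitored key: its estimated count and the maximum overestimation (error). */
    static final class Counter {
        final String key;
        long count;
        long error;

        Counter(String key, long count, long error) {
            this.key = key;
            this.count = count;
            this.error = error;
        }
    }

    private final int capacity;
    private final Map<String, Counter> counters = new HashMap<>();

    SpaceSavingSketch(int capacity) {
        this.capacity = capacity;
    }

    /** Records one occurrence of key, evicting the smallest counter when full. */
    void offer(String key) {
        Counter existing = counters.get(key);
        if (existing != null) {
            existing.count++;
            return;
        }
        if (counters.size() < capacity) {
            counters.put(key, new Counter(key, 1, 0));
            return;
        }
        // Find the monitored key with the smallest count (linear scan for brevity).
        Counter min = null;
        for (Counter c : counters.values()) {
            if (min == null || c.count < min.count) {
                min = c;
            }
        }
        // The newcomer takes over the slot: it inherits min.count as its possible error.
        counters.remove(min.key);
        counters.put(key, new Counter(key, min.count + 1, min.count));
    }

    /** Returns up to k counters, highest estimated count first. */
    List<Counter> topK(int k) {
        List<Counter> sorted = new ArrayList<>(counters.values());
        sorted.sort(Comparator.comparingLong((Counter c) -> c.count).reversed());
        return sorted.subList(0, Math.min(k, sorted.size()));
    }

    public static void main(String[] args) {
        SpaceSavingSketch summary = new SpaceSavingSketch(2);
        for (String key : new String[] {"a", "a", "a", "b", "c", "a"}) {
            summary.offer(key);
        }
        for (Counter c : summary.topK(2)) {
            System.out.println("(" + c.key + "," + c.count + "," + c.error + ")");
        }
    }
}
```

The error field is the reason the tuples carry three values: a key that replaced an evicted one may have its count overestimated by at most that amount.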


> Provide top ten most frequent keys per column family
> 
>
> Key: CASSANDRA-7247
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7247
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Chris Lohfink
> Assignee: Chris Lohfink
> Priority: Minor
> Attachments: jconsole.png, patch.txt
>
>
> Since we already have the nice addthis stream library, we can use it to keep 
> track of the most frequent DecoratedKeys that come through the system using 
> StreamSummaries ([nice 
> explanation|http://boundary.com/blog/2013/05/14/approximate-heavy-hitters-the-spacesaving-algorithm/]).
>   Then provide a new metric to access them via JMX.  



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Comment Edited] (CASSANDRA-7247) Provide top ten most frequent keys per column family

2014-05-20 Thread Chris Lohfink (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14004005#comment-14004005
 ] 

Chris Lohfink edited comment on CASSANDRA-7247 at 5/20/14 9:32 PM:
---

Added a patch that uses the trace executor to track the partitions that are 
updated the most, the ones with the most columns inserted (useful for finding 
rows that are too wide), and the partitions with the slowest insertion times.  
It only tracks when the trace probability is > 0.  It returns 
(key, count, error) tuples.


was (Author: cnlwsu):
Added a patch that uses the trace executor to track the partitions that are 
updated the most, the ones with the most columns inserted (useful for finding 
rows that are too wide), and the partitions with the slowest insertion times.  
It only tracks when the trace probability is > 0.



[jira] [Comment Edited] (CASSANDRA-7247) Provide top ten most frequent keys per column family

2014-05-16 Thread Chris Lohfink (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13999621#comment-13999621
 ] 

Chris Lohfink edited comment on CASSANDRA-7247 at 5/16/14 5:53 AM:
---

The problem is that StreamSummary is not thread-safe.  There is a 
ConcurrentStreamSummary, but in this implementation I found it to be ~5x slower 
than a synchronized block around the offer call of the non-thread-safe one.  
The concurrent version did perform similarly when also wrapped in a 
synchronized block, which I will show below, but because it loses any benefit 
of being a concurrent implementation when access is serialized, I think the 
faster implementation is best.
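
To make the "synchronized block around the offer" idea concrete, here is a rough sketch. PlainCounter is only a stand-in for a non-thread-safe summary (it is not stream-lib's StreamSummary), and all names here are hypothetical; the point is just that every access goes through one lock.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative only: serializes access to a non-thread-safe structure. */
class SynchronizedOfferSketch {
    /** Hypothetical stand-in for a non-thread-safe summary. */
    static final class PlainCounter {
        private final Map<String, Long> counts = new HashMap<>();

        void offer(String key) {
            counts.merge(key, 1L, Long::sum);
        }

        long count(String key) {
            return counts.getOrDefault(key, 0L);
        }
    }

    private final PlainCounter summary = new PlainCounter();

    /** All writers take the same lock, so updates are never interleaved. */
    void offer(String key) {
        synchronized (summary) {
            summary.offer(key);
        }
    }

    /** Readers take the lock too, so they see a consistent state. */
    long count(String key) {
        synchronized (summary) {
            return summary.count(key);
        }
    }

    /** Offers the same key from several threads and returns its final count. */
    static long runDemo(int threads, int offersPerThread) {
        SynchronizedOfferSketch tracker = new SynchronizedOfferSketch();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < offersPerThread; j++) {
                    tracker.offer("hot-key");
                }
            });
            workers[i].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return tracker.count("hot-key");
    }

    public static void main(String[] args) {
        // With the lock in place no increments are lost: 4 * 10000 offers.
        System.out.println(runDemo(4, 10_000));
    }
}
```

Without the synchronized blocks the HashMap could lose updates or be corrupted under concurrent writes, which is the same class of problem as offering to a non-thread-safe summary from multiple threads.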

Run on a 2013 Retina MBP with a 500 GB SSD, against trunk:

{code:title=No Changes}
id            ,     ops,  op/s, key/s, mean,  med,  .95,  .99,  .999,    max, time,  stderr
 4 threadCount,  634450, 21692, 21692,  0.2,  0.2,  0.2,  0.2,   0.4,  740.1, 29.2, 0.01188
 8 threadCount,  886600, 29762, 29762,  0.3,  0.2,  0.3,  0.4,   1.3, 1007.3, 29.8, 0.01220
16 threadCount,  912050, 29035, 29035,  0.5,  0.3,  0.9,  2.5,  11.2, 1393.8, 31.4, 0.01162
24 threadCount, 1022250, 32681, 32681,  0.7,  0.5,  1.0,  2.9,  13.5, 1126.5, 31.3, 0.00923
36 threadCount,  946550, 30900, 30900,  1.2,  0.8,  1.4,  3.0,  22.5, 1369.2, 30.6, 0.01089
{code}

{code:title=With Patch}
id            ,     ops,  op/s, key/s, mean,  med,  .95,  .99,  .999,    max, time,  stderr
 4 threadCount,  643900, 21700, 21700,  0.2,  0.2,  0.2,  0.2,   0.9,  941.1, 29.7, 0.01079
 8 threadCount,  942100, 32300, 32300,  0.2,  0.2,  0.3,  0.3,   1.2,  849.5, 29.2, 0.01519
16 threadCount,  907400, 30650, 30650,  0.5,  0.3,  0.8,  1.9,  10.7, 1124.0, 29.6, 0.01112
24 threadCount, 1026150, 31753, 31753,  0.7,  0.5,  0.9,  3.3,  20.6, 1299.0, 32.3, 0.01295
36 threadCount,  980600, 30077, 30077,  1.2,  0.8,  1.3,  2.7,  24.9, 1394.3, 32.6, 0.01747
{code}

{code:title=ConcurrentStreamSummary with sync}
id            ,     ops,  op/s, key/s, mean,  med,  .95,  .99,  .999,    max, time,  stderr
 4 threadCount,  494350, 16643, 16643,  0.2,  0.2,  0.3,  0.3,   1.0,  943.6, 29.7, 0.01286
 8 threadCount,  812950, 26358, 26358,  0.3,  0.2,  0.3,  0.5,   1.4, 1488.9, 30.8, 0.01909
16 threadCount,  877500, 27396, 27396,  0.6,  0.3,  1.0,  2.2,  12.1, 1299.2, 32.0, 0.01824
24 threadCount,  837550, 25345, 25345,  0.9,  0.4,  1.2,  3.7,  84.2, 2123.6, 33.0, 0.02437
36 threadCount,  910200, 28008, 28008,  1.3,  0.6,  2.8,  9.2,  32.2, 1212.8, 32.5, 0.01654
{code}



[jira] [Comment Edited] (CASSANDRA-7247) Provide top ten most frequent keys per column family

2014-05-16 Thread Chris Lohfink (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13999621#comment-13999621
 ] 

Chris Lohfink edited comment on CASSANDRA-7247 at 5/16/14 5:51 AM:
---

The problem is that StreamSummary is not thread-safe.  There is a 
ConcurrentStreamSummary, but in this implementation I found it to be ~5x slower 
than a synchronized block around the offer call of the non-thread-safe one.  
The concurrent version did perform similarly when also wrapped in a 
synchronized block, which I will show below, but because it loses any benefit 
of being a concurrent implementation when access is serialized, I think the 
faster implementation is best.

Run on a 2013 Retina MBP with a 500 GB SSD, against trunk:

{code:title=No Changes}
id            ,     ops,  op/s, key/s, mean,  med,  .95,  .99,  .999,    max, time,  stderr
 4 threadCount,  634450, 21692, 21692,  0.2,  0.2,  0.2,  0.2,   0.4,  740.1, 29.2, 0.01188
 8 threadCount,  886600, 29762, 29762,  0.3,  0.2,  0.3,  0.4,   1.3, 1007.3, 29.8, 0.01220
16 threadCount,  912050, 29035, 29035,  0.5,  0.3,  0.9,  2.5,  11.2, 1393.8, 31.4, 0.01162
24 threadCount, 1022250, 32681, 32681,  0.7,  0.5,  1.0,  2.9,  13.5, 1126.5, 31.3, 0.00923
36 threadCount,  946550, 30900, 30900,  1.2,  0.8,  1.4,  3.0,  22.5, 1369.2, 30.6, 0.01089
{code}

{code:title=With Patch}
id            ,     ops,  op/s, key/s, mean,  med,  .95,  .99,  .999,    max, time,  stderr
 4 threadCount,  643900, 21700, 21700,  0.2,  0.2,  0.2,  0.2,   0.9,  941.1, 29.7, 0.01079
 8 threadCount,  942100, 32300, 32300,  0.2,  0.2,  0.3,  0.3,   1.2,  849.5, 29.2, 0.01519
16 threadCount,  907400, 30650, 30650,  0.5,  0.3,  0.8,  1.9,  10.7, 1124.0, 29.6, 0.01112
24 threadCount, 1026150, 31753, 31753,  0.7,  0.5,  0.9,  3.3,  20.6, 1299.0, 32.3, 0.01295
36 threadCount,  980600, 30077, 30077,  1.2,  0.8,  1.3,  2.7,  24.9, 1394.3, 32.6, 0.01747
{code}


was (Author: cnlwsu):
The problem is that StreamSummary is not thread-safe.  There is a 
ConcurrentStreamSummary, but in this implementation I found it to be ~5x slower 
than a synchronized block around the offer call of the non-thread-safe one.  
The concurrent version did perform similarly when also wrapped in a 
synchronized block, which I will show below, but because it loses any benefit 
of being a concurrent implementation when access is serialized, I think the 
faster implementation is best.

Run on a 2013 Retina MBP with a 500 GB SSD:

{code:title=No Changes}
id            ,     ops,  op/s, key/s, mean,  med,  .95,  .99,  .999,    max, time,  stderr
 4 threadCount,  634450, 21692, 21692,  0.2,  0.2,  0.2,  0.2,   0.4,  740.1, 29.2, 0.01188
 8 threadCount,  886600, 29762, 29762,  0.3,  0.2,  0.3,  0.4,   1.3, 1007.3, 29.8, 0.01220
16 threadCount,  912050, 29035, 29035,  0.5,  0.3,  0.9,  2.5,  11.2, 1393.8, 31.4, 0.01162
24 threadCount, 1022250, 32681, 32681,  0.7,  0.5,  1.0,  2.9,  13.5, 1126.5, 31.3, 0.00923
36 threadCount,  946550, 30900, 30900,  1.2,  0.8,  1.4,  3.0,  22.5, 1369.2, 30.6, 0.01089
{code}

{code:title=With Patch}
id            ,     ops,  op/s, key/s, mean,  med,  .95,  .99,  .999,    max, time,  stderr
 4 threadCount,  643900, 21700, 21700,  0.2,  0.2,  0.2,  0.2,   0.9,  941.1, 29.7, 0.01079
 8 threadCount,  942100, 32300, 32300,  0.2,  0.2,  0.3,  0.3,   1.2,  849.5, 29.2, 0.01519
16 threadCount,  907400, 30650, 30650,  0.5,  0.3,  0.8,  1.9,  10.7, 1124.0, 29.6, 0.01112
24 threadCount, 1026150, 31753, 31753,  0.7,  0.5,  0.9,  3.3,  20.6, 1299.0, 32.3, 0.01295
36 threadCount,  980600, 30077, 30077,  1.2,  0.8,  1.3,  2.7,  24.9, 1394.3, 32.6, 0.01747
{code}

> Provide top ten most frequent keys per column family
> 
>
> Key: CASSANDRA-7247
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7247
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Chris Lohfink
> Priority: Minor
> Attachments: patch.diff
>
>
> Since we already have the nice addthis stream library, we can use it to keep 
> track of the most frequent DecoratedKeys that come through the system using 
> StreamSummaries ([nice 
> explanation|http://boundary.com/blog/2013/05/14/approximate-heavy-hitters-the-spacesaving-algorithm/]).
>   Then provide a new metric to ac

[jira] [Comment Edited] (CASSANDRA-7247) Provide top ten most frequent keys per column family

2014-05-16 Thread Chris Lohfink (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13999621#comment-13999621
 ] 

Chris Lohfink edited comment on CASSANDRA-7247 at 5/16/14 5:55 AM:
---

The problem is that StreamSummary is not thread-safe.  There is a 
ConcurrentStreamSummary, but in this implementation I found it to be ~4x slower 
than a synchronized block around the offer call of the non-thread-safe one.  
The concurrent version did perform similarly when also wrapped in a 
synchronized block, which I will show below, but because it loses any benefit 
of being a concurrent implementation when access is serialized, I think the 
faster implementation is best.

Run on a 2013 Retina MBP with a 500 GB SSD, against trunk:

{code:title=No Changes}
id            ,     ops,  op/s, key/s, mean,  med,  .95,  .99,  .999,    max, time,  stderr
 4 threadCount,  634450, 21692, 21692,  0.2,  0.2,  0.2,  0.2,   0.4,  740.1, 29.2, 0.01188
 8 threadCount,  886600, 29762, 29762,  0.3,  0.2,  0.3,  0.4,   1.3, 1007.3, 29.8, 0.01220
16 threadCount,  912050, 29035, 29035,  0.5,  0.3,  0.9,  2.5,  11.2, 1393.8, 31.4, 0.01162
24 threadCount, 1022250, 32681, 32681,  0.7,  0.5,  1.0,  2.9,  13.5, 1126.5, 31.3, 0.00923
36 threadCount,  946550, 30900, 30900,  1.2,  0.8,  1.4,  3.0,  22.5, 1369.2, 30.6, 0.01089
{code}

{code:title=With Patch}
id            ,     ops,  op/s, key/s, mean,  med,  .95,  .99,  .999,    max, time,  stderr
 4 threadCount,  643900, 21700, 21700,  0.2,  0.2,  0.2,  0.2,   0.9,  941.1, 29.7, 0.01079
 8 threadCount,  942100, 32300, 32300,  0.2,  0.2,  0.3,  0.3,   1.2,  849.5, 29.2, 0.01519
16 threadCount,  907400, 30650, 30650,  0.5,  0.3,  0.8,  1.9,  10.7, 1124.0, 29.6, 0.01112
24 threadCount, 1026150, 31753, 31753,  0.7,  0.5,  0.9,  3.3,  20.6, 1299.0, 32.3, 0.01295
36 threadCount,  980600, 30077, 30077,  1.2,  0.8,  1.3,  2.7,  24.9, 1394.3, 32.6, 0.01747
{code}

{code:title=ConcurrentStreamSummary with sync}
id            ,     ops,  op/s, key/s, mean,  med,  .95,  .99,  .999,    max, time,  stderr
 4 threadCount,  494350, 16643, 16643,  0.2,  0.2,  0.3,  0.3,   1.0,  943.6, 29.7, 0.01286
 8 threadCount,  812950, 26358, 26358,  0.3,  0.2,  0.3,  0.5,   1.4, 1488.9, 30.8, 0.01909
16 threadCount,  877500, 27396, 27396,  0.6,  0.3,  1.0,  2.2,  12.1, 1299.2, 32.0, 0.01824
24 threadCount,  837550, 25345, 25345,  0.9,  0.4,  1.2,  3.7,  84.2, 2123.6, 33.0, 0.02437
36 threadCount,  910200, 28008, 28008,  1.3,  0.6,  2.8,  9.2,  32.2, 1212.8, 32.5, 0.01654
{code}

{code:title=ConcurrentStreamSummary no blocking}
id            ,     ops,  op/s, key/s, mean,  med,  .95,  .99,  .999,    max, time,  stderr
 4 threadCount,  183600,  6145,  6145,  0.6,  0.6,  0.8,  1.0,   2.6,  354.5, 29.9, 0.01063
 8 threadCount,  197200,  6593,  6593,  1.2,  1.1,  1.4,  1.8,   3.3,  413.5, 29.9, 0.00716
16 threadCount,  203200,  6794,  6794,  2.3,  2.2,  2.6,  3.5,  12.1,  649.1, 29.9, 0.01096
24 threadCount,  198000,  6615,  6615,  3.6,  3.3,  4.2,  4.9,  44.2,  570.4, 29.9, 0.00894
36 threadCount,  199800,  6627,  6627,  5.4,  4.9,  6.5,  8.0, 110.8,  272.3, 30.1, 0.01452
{code}
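
As a sanity check on the "~4x" figure, the op/s columns of the With Patch and no-blocking runs above can be divided directly. The sketch below just does that arithmetic with the numbers copied from those two tables; the class and method names are made up for illustration. The ratios come out between roughly 3.5x (4 threads) and 4.9x (8 threads).

```java
/** Illustrative arithmetic only: slowdown of the unsynchronized concurrent run. */
class ThroughputRatio {
    /** How many times faster the baseline is than the candidate. */
    static double ratio(double baselineOpsPerSec, double candidateOpsPerSec) {
        return baselineOpsPerSec / candidateOpsPerSec;
    }

    public static void main(String[] args) {
        int[] threads = {4, 8, 16, 24, 36};
        // op/s values copied from the "With Patch" table.
        double[] withPatch = {21700, 32300, 30650, 31753, 30077};
        // op/s values copied from the ConcurrentStreamSummary "no blocking" table.
        double[] noBlocking = {6145, 6593, 6794, 6615, 6627};

        for (int i = 0; i < threads.length; i++) {
            System.out.printf("%2d threads: %.1fx%n",
                    threads[i], ratio(withPatch[i], noBlocking[i]));
        }
    }
}
```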



[jira] [Comment Edited] (CASSANDRA-7247) Provide top ten most frequent keys per column family

2014-05-16 Thread Chris Lohfink (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13999621#comment-13999621
 ] 

Chris Lohfink edited comment on CASSANDRA-7247 at 5/16/14 5:54 AM:
---

The problem is that StreamSummary is not thread-safe.  There is a 
ConcurrentStreamSummary, but in this implementation I found it to be ~5x slower 
than a synchronized block around the offer call of the non-thread-safe one.  
The concurrent version did perform similarly when also wrapped in a 
synchronized block, which I will show below, but because it loses any benefit 
of being a concurrent implementation when access is serialized, I think the 
faster implementation is best.

Run on a 2013 Retina MBP with a 500 GB SSD, against trunk:

{code:title=No Changes}
id            ,     ops,  op/s, key/s, mean,  med,  .95,  .99,  .999,    max, time,  stderr
 4 threadCount,  634450, 21692, 21692,  0.2,  0.2,  0.2,  0.2,   0.4,  740.1, 29.2, 0.01188
 8 threadCount,  886600, 29762, 29762,  0.3,  0.2,  0.3,  0.4,   1.3, 1007.3, 29.8, 0.01220
16 threadCount,  912050, 29035, 29035,  0.5,  0.3,  0.9,  2.5,  11.2, 1393.8, 31.4, 0.01162
24 threadCount, 1022250, 32681, 32681,  0.7,  0.5,  1.0,  2.9,  13.5, 1126.5, 31.3, 0.00923
36 threadCount,  946550, 30900, 30900,  1.2,  0.8,  1.4,  3.0,  22.5, 1369.2, 30.6, 0.01089
{code}

{code:title=With Patch}
id            ,     ops,  op/s, key/s, mean,  med,  .95,  .99,  .999,    max, time,  stderr
 4 threadCount,  643900, 21700, 21700,  0.2,  0.2,  0.2,  0.2,   0.9,  941.1, 29.7, 0.01079
 8 threadCount,  942100, 32300, 32300,  0.2,  0.2,  0.3,  0.3,   1.2,  849.5, 29.2, 0.01519
16 threadCount,  907400, 30650, 30650,  0.5,  0.3,  0.8,  1.9,  10.7, 1124.0, 29.6, 0.01112
24 threadCount, 1026150, 31753, 31753,  0.7,  0.5,  0.9,  3.3,  20.6, 1299.0, 32.3, 0.01295
36 threadCount,  980600, 30077, 30077,  1.2,  0.8,  1.3,  2.7,  24.9, 1394.3, 32.6, 0.01747
{code}

{code:title=ConcurrentStreamSummary with sync}
id            ,     ops,  op/s, key/s, mean,  med,  .95,  .99,  .999,    max, time,  stderr
 4 threadCount,  494350, 16643, 16643,  0.2,  0.2,  0.3,  0.3,   1.0,  943.6, 29.7, 0.01286
 8 threadCount,  812950, 26358, 26358,  0.3,  0.2,  0.3,  0.5,   1.4, 1488.9, 30.8, 0.01909
16 threadCount,  877500, 27396, 27396,  0.6,  0.3,  1.0,  2.2,  12.1, 1299.2, 32.0, 0.01824
24 threadCount,  837550, 25345, 25345,  0.9,  0.4,  1.2,  3.7,  84.2, 2123.6, 33.0, 0.02437
36 threadCount,  910200, 28008, 28008,  1.3,  0.6,  2.8,  9.2,  32.2, 1212.8, 32.5, 0.01654
{code}

{code:title=ConcurrentStreamSummary no blocking}
id            ,     ops,  op/s, key/s, mean,  med,  .95,  .99,  .999,    max, time,  stderr
 4 threadCount,  183600,  6145,  6145,  0.6,  0.6,  0.8,  1.0,   2.6,  354.5, 29.9, 0.01063
 8 threadCount,  197200,  6593,  6593,  1.2,  1.1,  1.4,  1.8,   3.3,  413.5, 29.9, 0.00716
16 threadCount,  203200,  6794,  6794,  2.3,  2.2,  2.6,  3.5,  12.1,  649.1, 29.9, 0.01096
24 threadCount,  198000,  6615,  6615,  3.6,  3.3,  4.2,  4.9,  44.2,  570.4, 29.9, 0.00894
36 threadCount,  199800,  6627,  6627,  5.4,  4.9,  6.5,  8.0, 110.8,  272.3, 30.1, 0.01452
{code}

