[jira] [Commented] (CASSANDRA-15388) Add compaction allocation measurement test to support compaction gc optimization.

2020-03-09 Thread David Capwell (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17055216#comment-17055216
 ] 

David Capwell commented on CASSANDRA-15388:
---

bq. This is not meant to be in a state where it can be plugged into our ci 
process.

Sure, would be good for this to evolve over time but not a blocker for this.

The new changes are fine; only nits are really left (though I would still prefer 
isAgentLoaded, since the logs are too dense and it's easy to miss).

+1

> Add compaction allocation measurement test to support compaction gc 
> optimization. 
> --
>
> Key: CASSANDRA-15388
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15388
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Local/Compaction
>Reporter: Blake Eggleston
>Assignee: Blake Eggleston
>Priority: Normal
> Fix For: 4.0
>
>
> This adds a test that is able to quickly and accurately measure the effect of 
> potential gc optimizations against a wide range of (synthetic) compaction 
> workloads. This test accurately measures allocation rates from 16 workloads 
> in less than 2 minutes.
> This test uses google’s {{java-allocation-instrumenter}} agent to measure the 
> workloads. Measurements using this agent are very accurate and pretty 
> repeatable from run to run, with most variance being negligible (1-2 bytes 
> per partition), although workloads with larger but fewer partitions vary a 
> bit more (still less than 0.03%).
> The thinking behind this patch is that with compaction, we’re generally 
> interested in the memory allocated per partition, since garbage scales more 
> or less linearly with the number of partitions compacted. So measuring 
> allocation from a small number of partitions that otherwise represent real 
> world use cases is a good enough approximation.
> In addition to helping with compaction optimizations, this test could be used 
> as a template for future optimization work. This pattern could also be used 
> to set allocation limits on workloads/operations and fail CI if the 
> allocation behavior changes past some threshold. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-15388) Add compaction allocation measurement test to support compaction gc optimization.

2020-03-04 Thread Blake Eggleston (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17051445#comment-17051445
 ] 

Blake Eggleston commented on CASSANDRA-15388:
-

I've added a few comments addressing questions. However, regarding:
{quote}Personally would love if test-memory target could go away in favor of 
testclasslist so it matches the rest of CI.
{quote}
and
{quote}Make this overridable; a system property would be fine
{quote}
and
{quote}Once you complete you can add to the logger for now (though more value 
if you generate a test report, but fine if this is a different JIRA)
{quote}
This is basically some ad-hoc code intended as a tool to identify optimization 
opportunities, and quantify changes. It's being contributed so the methodology 
can be reviewed, to quantify / justify the changes in the other tickets, and so 
that it can be re-used / built upon for later work. This is not meant to be in 
a state where it can be plugged into our ci process.

I think that we _should_ have allocation measurement as part of a performance 
test suite, but that's not the goal of this patch. I also don't think it's a 
good use of time to start making changes like these, because they'll make it 
more awkward to use this test as currently intended, and will probably just be 
re-written once we have a better idea of what we want a performance test suite 
to do.
{quote}Why all the sleeps?
{quote}
The ones when profiling are there to give you time to start and stop recording. 
The one after compactions works around some issues with ending recording 
immediately after compactions. The one at the end is for logging: when running 
the test in IntelliJ, the report doesn't always make it to the output without 
it.




[jira] [Commented] (CASSANDRA-15388) Add compaction allocation measurement test to support compaction gc optimization.

2020-02-14 Thread David Capwell (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17037376#comment-17037376
 ] 

David Capwell commented on CASSANDRA-15388:
---

Sadly, 
com.google.monitoring.runtime.instrumentation.AllocationRecorder#getInstrumentation
 is package-private; it would be great to use that to see if the agent is 
loaded. 

To make things clear, at 
https://github.com/apache/cassandra/compare/trunk...bdeggleston:15388#diff-5d25f445e825fbeef166372877b810d1R140
 it would be good if you did

{code}
diff --git a/test/memory/org/apache/cassandra/db/compaction/CompactionAllocationTest.java b/test/memory/org/apache/cassandra/db/compaction/CompactionAllocationTest.java
index 6394b92e6d..53d11c9031 100644
--- a/test/memory/org/apache/cassandra/db/compaction/CompactionAllocationTest.java
+++ b/test/memory/org/apache/cassandra/db/compaction/CompactionAllocationTest.java
@@ -74,7 +74,7 @@ public class CompactionAllocationTest
     private static final Logger logger = LoggerFactory.getLogger(CompactionAllocationTest.class);
     private static final ThreadMXBean threadMX = (ThreadMXBean) ManagementFactory.getThreadMXBean();
 
-    private static final boolean AGENT_MEASUREMENT = true;
+    private static final boolean AGENT_MEASUREMENT = isAgentLoaded();
 
     private static final boolean PROFILING_READS = false;
     private static final boolean PROFILING_COMPACTION = false;
@@ -153,6 +153,14 @@ public class CompactionAllocationTest
         }
     }
 
+    private static boolean isAgentLoaded()
+    {
+        AgentMeasurement measurement = new AgentMeasurement();
+        measurement.start();
+        measurement.stop();
+        return measurement.objectsAllocated != 0 || measurement.bytesAllocated != 0;
+    }
+
     @BeforeClass
     public static void setupClass() throws Throwable
     {

{code}

This way we only use the agent if loaded.  Right now you don't modify the IDE 
files, so if you run the test in any IDE it will always log that you allocated 
0 bytes; with the patch I posted it uses the thread mbean.
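
For reference, the thread-mbean fallback mentioned above can be sketched roughly as follows. This is an illustration, not the code in the patch: the class name ThreadAllocationProbe and the helper allocatedBytes are hypothetical, and it assumes a HotSpot JVM where the platform ThreadMXBean can be cast to com.sun.management.ThreadMXBean and thread allocation accounting is enabled.

```java
import java.lang.management.ManagementFactory;

// Hypothetical helper, not part of the patch: measures bytes allocated by the
// current thread while a task runs, using the HotSpot ThreadMXBean extension.
final class ThreadAllocationProbe
{
    // Assumption: running on a HotSpot-based JVM, where this cast succeeds.
    private static final com.sun.management.ThreadMXBean THREAD_MX =
        (com.sun.management.ThreadMXBean) ManagementFactory.getThreadMXBean();

    static long allocatedBytes(Runnable task)
    {
        long threadId = Thread.currentThread().getId();
        // getThreadAllocatedBytes returns -1 when the measurement is unsupported or disabled
        long before = THREAD_MX.getThreadAllocatedBytes(threadId);
        task.run();
        long after = THREAD_MX.getThreadAllocatedBytes(threadId);
        return after - before;
    }

    public static void main(String[] args)
    {
        long bytes = allocatedBytes(() -> {
            byte[][] sink = new byte[64][];
            for (int i = 0; i < sink.length; i++)
                sink[i] = new byte[1024];
        });
        System.out.println("allocated roughly " + bytes + " bytes");
    }
}
```

The counter is an approximation maintained by the JVM, so small differences between runs are expected.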

https://github.com/apache/cassandra/compare/trunk...bdeggleston:15388#diff-5d25f445e825fbeef166372877b810d1R79
 and 
https://github.com/apache/cassandra/compare/trunk...bdeggleston:15388#diff-5d25f445e825fbeef166372877b810d1R80.
 Make this overridable; a system property would be fine
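
A minimal sketch of what such an override could look like, assuming the two linked lines are the PROFILING_READS and PROFILING_COMPACTION constants shown in the diff above. The class name and the property names are hypothetical and do not come from the patch.

```java
// Hypothetical sketch: read the profiling switches from system properties
// instead of hard-coding them. Both default to false when the property is unset.
final class ProfilingFlags
{
    // Assumed property names, chosen for illustration only
    static final String READS_PROPERTY = "cassandra.test.allocation.profile_reads";
    static final String COMPACTION_PROPERTY = "cassandra.test.allocation.profile_compaction";

    static boolean profilingReads()
    {
        // Boolean.getBoolean returns true only if the property exists and equals "true" (case-insensitive)
        return Boolean.getBoolean(READS_PROPERTY);
    }

    static boolean profilingCompaction()
    {
        return Boolean.getBoolean(COMPACTION_PROPERTY);
    }

    public static void main(String[] args)
    {
        System.out.println("profiling reads: " + profilingReads());
        System.out.println("profiling compaction: " + profilingCompaction());
    }
}
```

With this shape the switches could be flipped from the command line, for example with -Dcassandra.test.allocation.profile_reads=true, without editing the test.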

https://github.com/apache/cassandra/compare/trunk...bdeggleston:15388#diff-5d25f445e825fbeef166372877b810d1R396.
 remove the branch or get rid of "|| true"

https://github.com/apache/cassandra/compare/trunk...bdeggleston:15388#diff-5d25f445e825fbeef166372877b810d1R402.
 Why all the sleeps?  If I remove them I don't really see a difference

Without:
{code}
INFO  [main] 2020-02-14 16:24:20,357 CompactionAllocationTest.java:433 - *** tinyNonOverlapping3 reads summary
INFO  [main] 2020-02-14 16:24:20,357 CompactionAllocationTest.java:434 - 11592072 bytes, 4293 /read, 11649 cpu
INFO  [main] 2020-02-14 16:24:20,357 CompactionAllocationTest.java:435 - *** tinyNonOverlapping3 compaction summary
INFO  [main] 2020-02-14 16:24:20,357 CompactionAllocationTest.java:436 - 10893688 bytes, 4034 /partition, 4034 /row, 93812000 cpu
{code}

With:

{code}
INFO  [main] 2020-02-14 16:25:17,491 CompactionAllocationTest.java:433 - *** tinyNonOverlapping3 reads summary
INFO  [main] 2020-02-14 16:25:17,491 CompactionAllocationTest.java:434 - 11589808 bytes, 4292 /read, 109551000 cpu
INFO  [main] 2020-02-14 16:25:17,491 CompactionAllocationTest.java:435 - *** tinyNonOverlapping3 compaction summary
INFO  [main] 2020-02-14 16:25:17,491 CompactionAllocationTest.java:436 - 10893848 bytes, 4034 /partition, 4034 /row, 74584000 cpu
{code}

The only real change I see is in terms of CPU, which makes sense since you just 
track total time (though implemented from mbean).

The only hint I have as to why is the comment "// maybe log entries will stop 
disappearing?"  which makes me think logger?  

https://github.com/apache/cassandra/compare/trunk...bdeggleston:15388#diff-5d25f445e825fbeef166372877b810d1R200

Looks like you are trying to create a report, but targeting humans?  Can you 
move the strings out of the logger and into something like StringBuilder?  Once 
you complete you can add to the logger for now (though more value if you 
generate a test report, but fine if this is a different JIRA)
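
One way the StringBuilder suggestion could look, as a rough sketch only. The class AllocationReport and its methods are hypothetical, the line format simply mirrors the log output quoted above, and the count of 2700 reads/partitions is an assumption chosen because it reproduces the per-read and per-partition figures in those logs; the real workload size is not stated in this thread.

```java
// Hypothetical sketch: accumulate the summary in a StringBuilder and hand the
// finished text to the logger (or a report file) once, instead of logging line by line.
final class AllocationReport
{
    private final StringBuilder out = new StringBuilder();

    AllocationReport addReads(String workload, long bytes, long reads, long cpu)
    {
        out.append("*** ").append(workload).append(" reads summary\n");
        out.append(bytes).append(" bytes, ")
           .append(bytes / Math.max(1, reads)).append(" /read, ")
           .append(cpu).append(" cpu\n");
        return this;
    }

    AllocationReport addCompaction(String workload, long bytes, long partitions, long rows, long cpu)
    {
        out.append("*** ").append(workload).append(" compaction summary\n");
        out.append(bytes).append(" bytes, ")
           .append(bytes / Math.max(1, partitions)).append(" /partition, ")
           .append(bytes / Math.max(1, rows)).append(" /row, ")
           .append(cpu).append(" cpu\n");
        return this;
    }

    String render()
    {
        return out.toString();
    }

    public static void main(String[] args)
    {
        // Byte and cpu values are copied from the log output above; 2700 is an assumed count.
        String report = new AllocationReport()
            .addReads("tinyNonOverlapping3", 11592072L, 2700, 11649L)
            .addCompaction("tinyNonOverlapping3", 10893688L, 2700, 2700, 93812000L)
            .render();
        System.out.print(report);
    }
}
```

The rendered string could then be passed to a single logger call, or written to a file if a test report is added later.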

Last comment, it would be nice if you asserted the bounds of memory allocated 
rather than log.
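
A bounds assertion along these lines might look like the sketch below. Everything here is hypothetical: the class and method names, the 1% tolerance, and the use of 4034 bytes per partition (taken from the log output above) as the expected value.

```java
// Hypothetical sketch: fail the test when per-partition allocation drifts
// outside a tolerance band around an expected value, instead of only logging it.
final class AllocationBounds
{
    static void assertBytesPerPartition(String workload, long actual, long expected, double tolerance)
    {
        // Allowed absolute deviation, rounded up so tiny expectations still get some slack
        long slack = (long) Math.ceil(expected * tolerance);
        if (Math.abs(actual - expected) > slack)
            throw new AssertionError(String.format("%s allocated %d bytes/partition, expected %d +/- %d",
                                                   workload, actual, expected, slack));
    }

    public static void main(String[] args)
    {
        // 4034 comes from the compaction summary above; the 1% tolerance is an assumption
        assertBytesPerPartition("tinyNonOverlapping3", 4034, 4034, 0.01);
        System.out.println("within bounds");
    }
}
```

The expected values would need to be recorded per workload, and the tolerance tuned to the run-to-run variance of whichever measurement method is in use.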


[jira] [Commented] (CASSANDRA-15388) Add compaction allocation measurement test to support compaction gc optimization.

2020-01-16 Thread Benedict Elliott Smith (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17017562#comment-17017562
 ] 

Benedict Elliott Smith commented on CASSANDRA-15388:


bq. Given the size of the jmh patch though, I’d lean towards reviewing it 
separately. 

Sure, I can file a follow-up JIRA

bq. What's the compilation breakage you’re seeing?

If you set {{AGENT_MEASUREMENT = false}}, you need 
[this|https://github.com/belliottsmith/cassandra/commit/1a5cfea508285255eea8927b5ea5ad596c405151#diff-5d25f445e825fbeef166372877b810d1R150]




[jira] [Commented] (CASSANDRA-15388) Add compaction allocation measurement test to support compaction gc optimization.

2020-01-16 Thread Blake Eggleston (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17017559#comment-17017559
 ] 

Blake Eggleston commented on CASSANDRA-15388:
-

Yeah I’d really like to quantify the performance changes from these tickets 
before committing them. Given the size of the jmh patch though, I’d lean 
towards reviewing it separately. 

What's the compilation breakage you’re seeing? I haven’t had any issues.




[jira] [Commented] (CASSANDRA-15388) Add compaction allocation measurement test to support compaction gc optimization.

2020-01-14 Thread Benedict Elliott Smith (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17015376#comment-17015376
 ] 

Benedict Elliott Smith commented on CASSANDRA-15388:


The change LGTM, except a minor compilation breakage if the javaagent isn't 
used.

I've pushed 
[here|https://github.com/belliottsmith/cassandra/tree/15388-suggest] some extra 
tests that I haven't yet had available server time to run.  These are just 
re-abstractions of tests I wrote for some work I plan to post in the coming 
days. They permit us to cover a slightly wider range of partition 
characteristics (though still probably not as many as we might like), and 
integrate them into JMH so we can compare performance as well as allocations.

It's very much complementary with the work you've done, as it doesn't track 
end-to-end costs of compaction, only the isolated costs each of merge and 
deserialization (and not the entire deserialization pipeline, to keep it 
simple).  But I think it should capture the main areas of expense and 
improvement, and these tests have been informative in my other work - certain 
data characteristics can lead to surprising results with a given approach.

We can always file this as a follow-up, but I think it would be nice (if you 
agree with the tests) to run some comparisons before we commit the follow-up 
work. A full comparison will take a long time to run (several days), though we 
can prune the state-space.




[jira] [Commented] (CASSANDRA-15388) Add compaction allocation measurement test to support compaction gc optimization.

2019-12-11 Thread Benedict Elliott Smith (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16993994#comment-16993994
 ] 

Benedict Elliott Smith commented on CASSANDRA-15388:


I wonder if it would be worth setting it up as a profiler add-on for JMH?  It 
would be really nice to be able to flip between different measurements easily 
without having to write different tests.  It doesn't _look_ like a super 
complicated API.  _Absolutely not_ a demand, just idle curiosity about how 
viable it might be.






[jira] [Commented] (CASSANDRA-15388) Add compaction allocation measurement test to support compaction gc optimization.

2019-12-11 Thread Blake Eggleston (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16993993#comment-16993993
 ] 

Blake Eggleston commented on CASSANDRA-15388:
-

Looking at the code for GCProfiler, it looks like the ThreadMXBean is used to 
measure allocation. ThreadMXBean was one of the first things I looked at, and 
it wasn’t accurate enough, although it is available as an alternate measurement 
method in these tests.

AFAIK, ThreadMXBean uses TLAB overflows to approximate allocations with minimum 
overhead, and the java allocation instrumenter actually invokes the measurement 
method anytime anything is allocated. So it’s much slower, but super accurate, 
which is perfect for this use case.




[jira] [Commented] (CASSANDRA-15388) Add compaction allocation measurement test to support compaction gc optimization.

2019-12-11 Thread Blake Eggleston (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16993812#comment-16993812
 ] 

Blake Eggleston commented on CASSANDRA-15388:
-

I'd looked at and rejected a few other options, but I don't think jmh was one of 
them. I'll take a look at it.




[jira] [Commented] (CASSANDRA-15388) Add compaction allocation measurement test to support compaction gc optimization.

2019-12-10 Thread Benedict Elliott Smith (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16993136#comment-16993136
 ] 

Benedict Elliott Smith commented on CASSANDRA-15388:


With absolutely no preconceived answer to this question on my part: have you 
considered and rejected JMH's GC profiler?

I'm unsure how the approaches compare with respect to sophistication (I haven't 
looked closely at how either underlying tool is implemented), but having played 
with it myself recently, using jmh 1.22, it seems to produce exactly the 
numbers I would calculate manually for small functions performing easily 
calculated allocations.

I only ask because it would hook into our (minimal) pre-existing benchmark 
workflows, and produce throughput/latency data alongside.  The latest version 
of jmh can even neatly hook into perf counters to produce very sophisticated 
ancillary data to consider at the same time, to weigh the pros/cons of a change 
(although it's definitely not as easy to explore as a flamegraph).
