[ 
https://issues.apache.org/jira/browse/KAFKA-16052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17800892#comment-17800892
 ] 

Divij Vaidya commented on KAFKA-16052:
--------------------------------------

I don't think that clearing the mocks is helping here. I ran "./gradlew 
:core:test --tests GroupCoordinatorConcurrencyTest --rerun" and took a heap 
dump. The test spikes to take 778MB heap at peak. Here's the heap dump 
[https://www.dropbox.com/scl/fi/dttxe7e8mzifaaw57er1d/GradleWorkerMain_79331_28_12_2023_00_10_47.hprof.zip?rlkey=cg6w2e26vzprbrz92gcdhwfkd&dl=0]
 

For reference, this is how I was clearing the mocks


{code:java}
@@ -89,11 +91,12 @@ class GroupCoordinatorConcurrencyTest extends 
AbstractCoordinatorConcurrencyTest
   @AfterEach
   override def tearDown(): Unit = {
     try {
-      if (groupCoordinator != null)
-        groupCoordinator.shutdown()
+      CoreUtils.swallow(groupCoordinator.shutdown(), this)
+      CoreUtils.swallow(metrics.close(), this)
     } finally {
       super.tearDown()
     }
+    Mockito.framework().clearInlineMocks()
   } {code}


!Screenshot 2023-12-28 at 00.13.06.png!

> OOM in Kafka test suite
> -----------------------
>
>                 Key: KAFKA-16052
>                 URL: https://issues.apache.org/jira/browse/KAFKA-16052
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 3.7.0
>            Reporter: Divij Vaidya
>            Priority: Major
>         Attachments: Screenshot 2023-12-27 at 14.04.52.png, Screenshot 
> 2023-12-27 at 14.22.21.png, Screenshot 2023-12-27 at 14.45.20.png, Screenshot 
> 2023-12-27 at 15.31.09.png, Screenshot 2023-12-27 at 17.44.09.png, Screenshot 
> 2023-12-28 at 00.13.06.png
>
>
> *Problem*
> Our test suite is failing with frequent OOM. Discussion in the mailing list 
> is here: [https://lists.apache.org/thread/d5js0xpsrsvhgjb10mbzo9cwsy8087x4] 
> *Setup*
> To find the source of leaks, I ran the :core:test build target with a single 
> thread (see below on how to do it) and attached a profiler to it. This Jira 
> tracks the list of action items identified from the analysis.
> How to run tests using a single thread:
> {code:java}
> diff --git a/build.gradle b/build.gradle
> index f7abbf4f0b..81df03f1ee 100644
> --- a/build.gradle
> +++ b/build.gradle
> @@ -74,9 +74,8 @@ ext {
>        "--add-opens=java.security.jgss/sun.security.krb5=ALL-UNNAMED"
>      )-  maxTestForks = project.hasProperty('maxParallelForks') ? 
> maxParallelForks.toInteger() : Runtime.runtime.availableProcessors()
> -  maxScalacThreads = project.hasProperty('maxScalacThreads') ? 
> maxScalacThreads.toInteger() :
> -      Math.min(Runtime.runtime.availableProcessors(), 8)
> +  maxTestForks = 1
> +  maxScalacThreads = 1
>    userIgnoreFailures = project.hasProperty('ignoreFailures') ? 
> ignoreFailures : false   userMaxTestRetries = 
> project.hasProperty('maxTestRetries') ? maxTestRetries.toInteger() : 0
> diff --git a/gradle.properties b/gradle.properties
> index 4880248cac..ee4b6e3bc1 100644
> --- a/gradle.properties
> +++ b/gradle.properties
> @@ -30,4 +30,4 @@ scalaVersion=2.13.12
>  swaggerVersion=2.2.8
>  task=build
>  org.gradle.jvmargs=-Xmx2g -Xss4m -XX:+UseParallelGC
> -org.gradle.parallel=true
> +org.gradle.parallel=false {code}
> *Result of experiment*
> This is how the heap memory utilized looks like, starting from tens of MB to 
> ending with 1.5GB (with spikes of 2GB) of heap being used as the test 
> executes. Note that the total number of threads also increases but it does 
> not correlate with sharp increase in heap memory usage. The heap dump is 
> available at 
> [https://www.dropbox.com/scl/fi/nwtgc6ir6830xlfy9z9cu/GradleWorkerMain_10311_27_12_2023_13_37_08.hprof.zip?rlkey=ozbdgh5vih4rcynnxbatzk7ln&dl=0]
>  
> !Screenshot 2023-12-27 at 14.22.21.png!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to