[ 
https://issues.apache.org/jira/browse/KAFKA-16052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17800851#comment-17800851
 ] 

Justine Olshan commented on KAFKA-16052:
----------------------------------------

Thanks Divij for the digging. These tests share the common 
AbstractCoordinatorCouncurrencyTest. I wonder if the mock in question is there.

We do mock replica manager, but in an interesting way


replicaManager = mock(classOf[TestReplicaManager], 
withSettings().defaultAnswer(CALLS_REAL_METHODS))

> OOM in Kafka test suite
> -----------------------
>
>                 Key: KAFKA-16052
>                 URL: https://issues.apache.org/jira/browse/KAFKA-16052
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 3.7.0
>            Reporter: Divij Vaidya
>            Priority: Major
>         Attachments: Screenshot 2023-12-27 at 14.04.52.png, Screenshot 
> 2023-12-27 at 14.22.21.png, Screenshot 2023-12-27 at 14.45.20.png, Screenshot 
> 2023-12-27 at 15.31.09.png, Screenshot 2023-12-27 at 17.44.09.png
>
>
> *Problem*
> Our test suite is failing with frequent OOM. Discussion in the mailing list 
> is here: [https://lists.apache.org/thread/d5js0xpsrsvhgjb10mbzo9cwsy8087x4] 
> *Setup*
> To find the source of leaks, I ran the :core:test build target with a single 
> thread (see below on how to do it) and attached a profiler to it. This Jira 
> tracks the list of action items identified from the analysis.
> How to run tests using a single thread:
> {code:java}
> diff --git a/build.gradle b/build.gradle
> index f7abbf4f0b..81df03f1ee 100644
> --- a/build.gradle
> +++ b/build.gradle
> @@ -74,9 +74,8 @@ ext {
>        "--add-opens=java.security.jgss/sun.security.krb5=ALL-UNNAMED"
>      )-  maxTestForks = project.hasProperty('maxParallelForks') ? 
> maxParallelForks.toInteger() : Runtime.runtime.availableProcessors()
> -  maxScalacThreads = project.hasProperty('maxScalacThreads') ? 
> maxScalacThreads.toInteger() :
> -      Math.min(Runtime.runtime.availableProcessors(), 8)
> +  maxTestForks = 1
> +  maxScalacThreads = 1
>    userIgnoreFailures = project.hasProperty('ignoreFailures') ? 
> ignoreFailures : false   userMaxTestRetries = 
> project.hasProperty('maxTestRetries') ? maxTestRetries.toInteger() : 0
> diff --git a/gradle.properties b/gradle.properties
> index 4880248cac..ee4b6e3bc1 100644
> --- a/gradle.properties
> +++ b/gradle.properties
> @@ -30,4 +30,4 @@ scalaVersion=2.13.12
>  swaggerVersion=2.2.8
>  task=build
>  org.gradle.jvmargs=-Xmx2g -Xss4m -XX:+UseParallelGC
> -org.gradle.parallel=true
> +org.gradle.parallel=false {code}
> *Result of experiment*
> This is how the heap memory utilized looks like, starting from tens of MB to 
> ending with 1.5GB (with spikes of 2GB) of heap being used as the test 
> executes. Note that the total number of threads also increases but it does 
> not correlate with sharp increase in heap memory usage. The heap dump is 
> available at 
> [https://www.dropbox.com/scl/fi/nwtgc6ir6830xlfy9z9cu/GradleWorkerMain_10311_27_12_2023_13_37_08.hprof.zip?rlkey=ozbdgh5vih4rcynnxbatzk7ln&dl=0]
>  
> !Screenshot 2023-12-27 at 14.22.21.png!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to