Re: RFR: JDK-8324930: java/lang/StringBuilder problem with concurrent jtreg runs

2024-03-14 Thread Matthias Baesken
On Tue, 30 Jan 2024 09:08:28 GMT, Matthias Baesken  wrote:

> On some Windows machines we see sometimes OOM errors because of high resource 
> (memory/swap) consumption. This is especially seen when the jtreg runs have 
> higher concurrency. A solution is to put the java/lang/StringBuilder tests in 
> the exclusiveAccess.dirs group so that they are not executed concurrently, 
> which helps to mitigate the resource shortages.
> Of course this has the downside that on very large machines the concurrent 
> execution is not done any more.

With some rebalancing/adjustments to our test landscape, the issues are gone.
Unfortunately there was not much interest in the resource-related discussion
on jtreg-dev:
https://mail.openjdk.org/pipermail/jtreg-dev/2024-February/001926.html

Closing for now because the issues are currently no longer seen on our side.

-

PR Comment: https://git.openjdk.org/jdk/pull/17625#issuecomment-1996885622
PR Comment: https://git.openjdk.org/jdk/pull/17625#issuecomment-1996886625


Re: RFR: JDK-8324930: java/lang/StringBuilder problem with concurrent jtreg runs

2024-02-20 Thread Matthias Baesken
On Tue, 30 Jan 2024 09:08:28 GMT, Matthias Baesken  wrote:

> On some Windows machines we see sometimes OOM errors because of high resource 
> (memory/swap) consumption. This is especially seen when the jtreg runs have 
> higher concurrency. A solution is to put the java/lang/StringBuilder tests in 
> the exclusiveAccess.dirs group so that they are not executed concurrently, 
> which helps to mitigate the resource shortages.
> Of course this has the downside that on very large machines the concurrent 
> execution is not done any more.

Hi Jaikiran, the exclude and match files sound promising; this could help us
achieve what we need.

-

PR Comment: https://git.openjdk.org/jdk/pull/17625#issuecomment-1954373348


Re: RFR: JDK-8324930: java/lang/StringBuilder problem with concurrent jtreg runs

2024-02-19 Thread Jaikiran Pai
On Tue, 30 Jan 2024 09:08:28 GMT, Matthias Baesken  wrote:

> On some Windows machines we see sometimes OOM errors because of high resource 
> (memory/swap) consumption. This is especially seen when the jtreg runs have 
> higher concurrency. A solution is to put the java/lang/StringBuilder tests in 
> the exclusiveAccess.dirs group so that they are not executed concurrently, 
> which helps to mitigate the resource shortages.
> Of course this has the downside that on very large machines the concurrent 
> execution is not done any more.

> What do you think about marking jtreg tests with higher memory requirements 
> with a jtreg key like highmemusage ? This way we do not need to put these 
> tests into the exclusiveAccess.dirs group, but get a way (only if needed) to 
> execute those with high memory usage separately e.g. with lower concurrency.

`jtreg --help Tests` shows this (among other things):


    Test Selection Options
        These options can be used to refine the set of tests to be
        executed.
    ...

        -exclude:<file> | -Xexclude:<file>
            Provide a file specifying tests that should not be run

    ...

        -match:<file>   Provide a file specifying tests that can be run
                        (inverse of -exclude)



Maybe you could experiment with these options to exclude the 
`java/lang/StringBuilder` test directory from your high concurrency run and 
then only run those in a low concurrency run?
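
The split suggested above can be sketched roughly as follows. This is only an illustration of partitioning test paths into a "high concurrency" and a "low concurrency" set; the helper name `split_tests` and the StringBuffer test name are hypothetical, and the sketch deliberately does not write the exclude/match files, since their exact format is not shown in this thread and should be checked against the jtreg documentation.

```python
from pathlib import PurePosixPath


def split_tests(tests, low_conc_dir="java/lang/StringBuilder"):
    """Split test paths into (high_concurrency, low_concurrency) lists.

    A test goes to the low-concurrency list if it lives under low_conc_dir.
    """
    prefix = PurePosixPath(low_conc_dir)
    high, low = [], []
    for test in tests:
        # Compare whole path components, so "java/lang/StringBuilderX/..."
        # is not treated as part of "java/lang/StringBuilder".
        if prefix in PurePosixPath(test).parents:
            low.append(test)
        else:
            high.append(test)
    return high, low


if __name__ == "__main__":
    tests = [
        "java/lang/StringBuilder/HugeCapacity.java",
        "java/lang/StringBuilder/Insert.java",
        "java/lang/StringBuffer/Example.java",  # hypothetical test name
    ]
    high, low = split_tests(tests)
    print("high-concurrency run:", high)
    print("low-concurrency run: ", low)
```

The two lists would then feed two separate jtreg invocations, one with the usual concurrency and one with a lower value.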

-

PR Comment: https://git.openjdk.org/jdk/pull/17625#issuecomment-1952574423


Re: RFR: JDK-8324930: java/lang/StringBuilder problem with concurrent jtreg runs

2024-02-19 Thread Jaikiran Pai
On Tue, 30 Jan 2024 09:08:28 GMT, Matthias Baesken  wrote:

> On some Windows machines we see sometimes OOM errors because of high resource 
> (memory/swap) consumption. This is especially seen when the jtreg runs have 
> higher concurrency. A solution is to put the java/lang/StringBuilder tests in 
> the exclusiveAccess.dirs group so that they are not executed concurrently, 
> which helps to mitigate the resource shortages.
> Of course this has the downside that on very large machines the concurrent 
> execution is not done any more.

Thank you for those additional details.

> It happens on various machines, two for example
>
> Windows Server 2022 Standard 16 cores 32G RAM
> Windows Server 2019 Standard 16 cores 32G RAM
>
> On both machines we run :tier1 -avm with -conc:15 (concurrency jtreg flag).

That then looks like an (extremely high) concurrency of 15 that has been
explicitly set when launching those tests. By default, the concurrency gets set
to `num_cores/2` (so it should have been 8 in your case):
https://github.com/openjdk/jdk/blob/master/make/RunTests.gmk#L152

I had a quick look at our internal CI; a lot of our Windows systems use 12-core,
24 GB setups (I haven't looked at all of them). The tests on those systems
end up using a concurrency of 6 (which is the default computed in RunTests.gmk
and matches the `num_cores/2` arithmetic).
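
The `num_cores/2` arithmetic above can be sketched as follows. This only illustrates the rule as described in this thread; it is not the actual logic in RunTests.gmk (which is written in make and may apply further limits). The function name `default_concurrency` and the lower bound of 1 are assumptions made for the sketch.

```python
def default_concurrency(num_cores: int) -> int:
    """Hypothetical helper: half the cores, but never less than 1 (assumed)."""
    return max(1, num_cores // 2)


if __name__ == "__main__":
    # 16 cores: the machines reported in this thread; 12 cores: the CI setups
    # mentioned above.
    for cores in (16, 12):
        print(f"{cores} cores -> concurrency {default_concurrency(cores)}")
```

Under this rule the 16-core machines would run with a concurrency of 8, roughly half of the explicitly configured 15.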

-

PR Comment: https://git.openjdk.org/jdk/pull/17625#issuecomment-1952559923


Re: RFR: JDK-8324930: java/lang/StringBuilder problem with concurrent jtreg runs

2024-02-19 Thread Matthias Baesken
On Tue, 30 Jan 2024 09:08:28 GMT, Matthias Baesken  wrote:

> On some Windows machines we see sometimes OOM errors because of high resource 
> (memory/swap) consumption. This is especially seen when the jtreg runs have 
> higher concurrency. A solution is to put the java/lang/StringBuilder tests in 
> the exclusiveAccess.dirs group so that they are not executed concurrently, 
> which helps to mitigate the resource shortages.
> Of course this has the downside that on very large machines the concurrent 
> execution is not done any more.

It happens on various machines; two examples:

 Windows Server 2022 Standard 16 cores 32G RAM
 Windows Server 2019 Standard 16 cores 32G RAM

On both machines we run :tier1 -avm with -conc:15 (jtreg concurrency flag).

> The other unanswered question is - why is this happening now?

I filed the issue this year, but there are also a couple of occurrences from
last year.
I also found similar older failures of
java/lang/StringBuilder/HugeCapacity.java from 2022 caused by resource shortages
(for some reason those did not generate an hs_err file, just some text output).
So the issue has been there for months already (maybe years?).

-

PR Comment: https://git.openjdk.org/jdk/pull/17625#issuecomment-1952490909


Re: RFR: JDK-8324930: java/lang/StringBuilder problem with concurrent jtreg runs

2024-02-19 Thread Jaikiran Pai
On Tue, 30 Jan 2024 09:08:28 GMT, Matthias Baesken  wrote:

> On some Windows machines we see sometimes OOM errors because of high resource 
> (memory/swap) consumption. This is especially seen when the jtreg runs have 
> higher concurrency. A solution is to put the java/lang/StringBuilder tests in 
> the exclusiveAccess.dirs group so that they are not executed concurrently, 
> which helps to mitigate the resource shortages.
> Of course this has the downside that on very large machines the concurrent 
> execution is not done any more.

The other unanswered question is - why is this happening now? I did:


git log test/jdk/java/lang/StringBuilder/

which shows:


commit df22fb322e6c4c9931a770bd0abf4c43b83c4e4a
Author: Jim Laskey
Date:   Thu Jan 4 12:46:31 2024 +

    8322512: StringBuffer.repeat does not work correctly after toString() was called

    Reviewed-by: rriggs, jpai

commit 9b9b5a7a5c624f3512567f5d9b2e9eec231cabb3
Author: Jim Laskey
Date:   Mon Apr 3 15:29:21 2023 +

    8302323: Add repeat methods to StringBuilder/StringBuffer

    Reviewed-by: tvaleev, redestad

So there's been only 1 commit in that test directory since April 2023. That
commit happened on Jan 4th 2024, but at first glance, that change itself
doesn't look like something that can cause this issue. The JBS issue you filed
is dated Jan 30th 2024. Did you notice such failures in the
`test/jdk/java/lang/StringBuilder/` tests last year?

-

PR Comment: https://git.openjdk.org/jdk/pull/17625#issuecomment-1952476504


Re: RFR: JDK-8324930: java/lang/StringBuilder problem with concurrent jtreg runs

2024-02-19 Thread Jaikiran Pai
On Tue, 30 Jan 2024 09:08:28 GMT, Matthias Baesken  wrote:

> On some Windows machines we see sometimes OOM errors because of high resource 
> (memory/swap) consumption. This is especially seen when the jtreg runs have 
> higher concurrency. A solution is to put the java/lang/StringBuilder tests in 
> the exclusiveAccess.dirs group so that they are not executed concurrently, 
> which helps to mitigate the resource shortages.
> Of course this has the downside that on very large machines the concurrent 
> execution is not done any more.

Hello Matthias,

> What do you think about marking jtreg tests with higher memory requirements 
> with a jtreg key like highmemusage ?

I still don't have any concrete suggestions - it isn't fully clear to me what
we should do here. Part of the reason is that details like the exact
command being used to run these tests, the "-concurrency" value that's either
being computed or explicitly set, the exact Windows OS version, and Windows
system configuration such as the total memory available and the number of CPUs
are all unknown right now. Having those details would, I think, help us
understand what approach to take here. They will also help us understand
why this isn't observed in our internal CI runs.

-

PR Comment: https://git.openjdk.org/jdk/pull/17625#issuecomment-1952463101


Re: RFR: JDK-8324930: java/lang/StringBuilder problem with concurrent jtreg runs

2024-02-19 Thread Matthias Baesken
On Wed, 31 Jan 2024 08:13:25 GMT, Matthias Baesken  wrote:

> Can we maybe see if we can fix these tests without exclusive-accessing them? 
> I find it surprising that `java/lang/StringBuilder` tests are problematic, 
> but `java/lang/StringBuffer` tests are not. Which tests fail?

What do you think about marking jtreg tests with higher memory requirements
with a jtreg key like highmemusage? This way we do not need to put these
tests into the _exclusiveAccess.dirs_ group, but get a way (only if needed) to
execute those with high memory usage separately, e.g. with lower concurrency.

-

PR Comment: https://git.openjdk.org/jdk/pull/17625#issuecomment-1951934706


Re: RFR: JDK-8324930: java/lang/StringBuilder problem with concurrent jtreg runs

2024-02-14 Thread Matthias Baesken
On Tue, 30 Jan 2024 09:08:28 GMT, Matthias Baesken  wrote:

> On some Windows machines we see sometimes OOM errors because of high resource 
> (memory/swap) consumption. This is especially seen when the jtreg runs have 
> higher concurrency. A solution is to put the java/lang/StringBuilder tests in 
> the exclusiveAccess.dirs group so that they are not executed concurrently, 
> which helps to mitigate the resource shortages.
> Of course this has the downside that on very large machines the concurrent 
> execution is not done any more.

I started a discussion on jtreg-dev
https://mail.openjdk.org/pipermail/jtreg-dev/2024-February/001926.html
but there has not been much response so far. Adding a jtreg test key for tests
with higher memory requirements (like HugeCapacity) would probably help to
solve these resource issues.

-

PR Comment: https://git.openjdk.org/jdk/pull/17625#issuecomment-1943742596


Re: RFR: JDK-8324930: java/lang/StringBuilder problem with concurrent jtreg runs

2024-02-13 Thread Matthias Baesken
On Mon, 12 Feb 2024 10:47:56 GMT, Jaikiran Pai  wrote:

> What seems to be happening is that the system where this run appears to be 
> launching too many tests concurrently. 

Sure, that's why I want to limit the concurrency *for certain tests/test
groups*. Limiting it for the whole tier1 would slow down tests that are
absolutely fine with the concurrency we set.

-

PR Comment: https://git.openjdk.org/jdk/pull/17625#issuecomment-1940741219


Re: RFR: JDK-8324930: java/lang/StringBuilder problem with concurrent jtreg runs

2024-02-12 Thread Jaikiran Pai
On Tue, 30 Jan 2024 09:08:28 GMT, Matthias Baesken  wrote:

> On some Windows machines we see sometimes OOM errors because of high resource 
> (memory/swap) consumption. This is especially seen when the jtreg runs have 
> higher concurrency. A solution is to put the java/lang/StringBuilder tests in 
> the exclusiveAccess.dirs group so that they are not executed concurrently, 
> which helps to mitigate the resource shortages.
> Of course this has the downside that on very large machines the concurrent 
> execution is not done any more.

Hello Matthias, looking at the crash log you pasted, it's clear that the test 
itself isn't a culprit here. Specifically, the failure appears to be when a JVM 
launch is being attempted for the 
`test/jdk/java/lang/StringBuilder/Insert.java` test (which looking at the code 
doesn't use too much memory once launched).

What seems to be happening is that the system where this runs appears to be
launching too many tests concurrently. The exact command used to launch these
tests on that setup would be helpful in understanding the configuration.

By default, the JDK build "computes" the `TEST_JOBS` value which controls this
concurrency (the number of jtreg tests to run concurrently). That's done here:
https://github.com/openjdk/jdk/blob/master/make/RunTests.gmk#L151
As noted in testing.md, it is configurable (and has a per-system default):
https://github.com/openjdk/jdk/blob/master/doc/testing.md#jobs-1
This configuration ultimately translates to the `-concurrency` option of jtreg,
which is explained in sections `3.8 How do I specify whether to run tests
concurrently?` and `3.25 My system is unusable while I run tests. How do I fix
that?` of the jtreg FAQ: https://openjdk.org/jtreg/faq.html

Based on the available details so far, it appears that you might have to reduce 
the value for this concurrency option, through the right build/test option.

-

PR Comment: https://git.openjdk.org/jdk/pull/17625#issuecomment-1938437285


Re: RFR: JDK-8324930: java/lang/StringBuilder problem with concurrent jtreg runs

2024-02-08 Thread Matthias Baesken
On Tue, 30 Jan 2024 09:08:28 GMT, Matthias Baesken  wrote:

> On some Windows machines we see sometimes OOM errors because of high resource 
> (memory/swap) consumption. This is especially seen when the jtreg runs have 
> higher concurrency. A solution is to put the java/lang/StringBuilder tests in 
> the exclusiveAccess.dirs group so that they are not executed concurrently, 
> which helps to mitigate the resource shortages.
> Of course this has the downside that on very large machines the concurrent 
> execution is not done any more.

Hello, any further comments on this?
Or should we take the discussion to jtreg, on how to deal with resource (in
this case memory) issues in concurrent runs?
Can we **_configure_** jtreg to execute some tests with higher memory
requirements at a lower concurrency?

-

PR Comment: https://git.openjdk.org/jdk/pull/17625#issuecomment-1933560174


Re: RFR: JDK-8324930: java/lang/StringBuilder problem with concurrent jtreg runs

2024-02-05 Thread Matthias Baesken
On Tue, 30 Jan 2024 09:08:28 GMT, Matthias Baesken  wrote:

> On some Windows machines we see sometimes OOM errors because of high resource 
> (memory/swap) consumption. This is especially seen when the jtreg runs have 
> higher concurrency. A solution is to put the java/lang/StringBuilder tests in 
> the exclusiveAccess.dirs group so that they are not executed concurrently, 
> which helps to mitigate the resource shortages.
> Of course this has the downside that on very large machines the concurrent 
> execution is not done any more.

This is what the thread stack looks like in the hs_err file, for example
java\lang\StringBuilder\Insert\hs_err_pid910208.log, which we got on
Sun Jan 07 20:32:56 CET 2024:


#
# There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (mmap) failed to map 536870912 bytes. Error detail: G1 virtual space
# Possible reasons:
#   The system is out of physical RAM or swap space
#   This process is running with CompressedOops enabled, and the Java Heap may be blocking the growth of the native heap
# Possible solutions:
#   Reduce memory load on the system
#   Increase physical memory or swap space
#   Check if swap backing store is full
#   Decrease Java heap size (-Xmx/-Xms)
#   Decrease number of Java threads
#   Decrease Java thread stack sizes (-Xss)
#   Set larger code cache with -XX:ReservedCodeCacheSize=
#   JVM is running with Unscaled Compressed Oops mode in which the Java heap is
#     placed in the first 4GB address space. The Java Heap base address is the
#     maximum limit for the native heap growth. Please use -XX:HeapBaseMinAddress
#     to set the Java Heap base and to place the Java Heap above 4GB virtual address.
# This output file may be truncated or incomplete.
#
#  Out of Memory Error (c:\openjdk-jdk-dev-windows_x86_64-dbg\jdk\src\hotspot\os\windows\os_windows.cpp:3627), pid=910208, tid=910648
#
# JRE version:  (23.0) (fastdebug build )
# Java VM: OpenJDK 64-Bit Server VM (fastdebug 23-internal-adhoc.GLOBALsapmachine.jdk, mixed mode, sharing, tiered, compressed oops, compressed class ptrs, g1 gc, windows-amd64)
# CreateCoredumpOnCrash turned off, no core file dumped
#

...

---  T H R E A D  ---

Current thread (0x02178c5a44c0):  JavaThread "Unknown thread" [_thread_in_vm, id=910648, stack(0x00eeec50,0x00eeec60) (1024K)]

Stack: [0x00eeec50,0x00eeec60]
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V  [jvm.dll+0xc96581]  os::win32::platform_print_native_stack+0x101  (os_windows_x86.cpp:236)
V  [jvm.dll+0xfe7b31]  VMError::report+0x1491  (vmError.cpp:1005)
V  [jvm.dll+0xfea055]  VMError::report_and_die+0x645  (vmError.cpp:1834)
V  [jvm.dll+0xfea7cf]  VMError::report_and_die+0x5f  (vmError.cpp:1604)
V  [jvm.dll+0x559d4f]  report_vm_out_of_memory+0x5f  (debug.cpp:225)
V  [jvm.dll+0xc91c5d]  os::pd_commit_memory_or_exit+0xad  (os_windows.cpp:3635)
V  [jvm.dll+0xc82a2e]  os::commit_memory_or_exit+0x6e  (os.cpp:2051)
V  [jvm.dll+0x6de800]  G1PageBasedVirtualSpace::commit+0x100  (g1PageBasedVirtualSpace.cpp:192)
V  [jvm.dll+0x6f0aff]  G1RegionsLargerThanCommitSizeMapper::commit_regions+0x7f  (g1RegionToSpaceMapper.cpp:100)
V  [jvm.dll+0x7806da]  HeapRegionManager::expand+0x8a  (heapRegionManager.cpp:164)
V  [jvm.dll+0x780be6]  HeapRegionManager::expand_by+0xf6  (heapRegionManager.cpp:361)
V  [jvm.dll+0x6812e4]  G1CollectedHeap::expand+0xf4  (g1CollectedHeap.cpp:1014)
V  [jvm.dll+0x682dc6]  G1CollectedHeap::initialize+0x596  (g1CollectedHeap.cpp:1389)
V  [jvm.dll+0xf823e0]  universe_init+0x140  (universe.cpp:794)
V  [jvm.dll+0x79c8c1]  init_globals+0x31  (init.cpp:126)
V  [jvm.dll+0xf5c20e]  Threads::create_vm+0x2ae  (threads.cpp:552)
V  [jvm.dll+0x8c17b2]  JNI_CreateJavaVM_inner+0x82  (jni.cpp:3576)
V  [jvm.dll+0x8c5d9f]  JNI_CreateJavaVM+0x1f  (jni.cpp:3667)
C  [jli.dll+0x539f]  JavaMain+0x113  (java.c:491)
C  [ucrtbase.dll+0x2268a]  (no source info available)
C  [KERNEL32.DLL+0x17ac4]  (no source info available)
C  [ntdll.dll+0x5a4e1]  (no source info available)
-

PR Comment: https://git.openjdk.org/jdk/pull/17625#issuecomment-1927319309


Re: RFR: JDK-8324930: java/lang/StringBuilder problem with concurrent jtreg runs

2024-02-04 Thread Jaikiran Pai
On Tue, 30 Jan 2024 09:08:28 GMT, Matthias Baesken  wrote:

> On some Windows machines we see sometimes OOM errors because of high resource 
> (memory/swap) consumption. This is especially seen when the jtreg runs have 
> higher concurrency. A solution is to put the java/lang/StringBuilder tests in 
> the exclusiveAccess.dirs group so that they are not executed concurrently, 
> which helps to mitigate the resource shortages.
> Of course this has the downside that on very large machines the concurrent 
> execution is not done any more.

Hello Matthias, would you be able to include a stacktrace from one such 
failure? The tests you mention as failing:


java/lang/StringBuilder/StringBufferRepeat.java
java/lang/StringBuilder/CompactStringBuilderSerialization.java
java/lang/StringBuilder/Insert.java

are all "othervm" tests, so I'm curious what kind of OOM is being reported.

-

PR Comment: https://git.openjdk.org/jdk/pull/17625#issuecomment-1925764255


Re: RFR: JDK-8324930: java/lang/StringBuilder problem with concurrent jtreg runs

2024-01-31 Thread Matthias Baesken
On Wed, 31 Jan 2024 00:48:35 GMT, Joe Darcy  wrote:

> Can we maybe see if we can fix these tests without exclusive-accessing them? 
> I find it surprising that `java/lang/StringBuilder` tests are problematic, 
> but `java/lang/StringBuffer` tests are not. Which tests fail?

It is a bit arbitrary which tests fail.
One day:

java/lang/StringBuilder/StringBufferRepeat.java
java/lang/StringBuilder/CompactStringBuilderSerialization.java
java/lang/StringBuilder/Insert.java

Another day:
java/lang/StringBuilder/HugeCapacity.java

The next day it might differ a bit.
Maybe it would be sufficient to execute only the HugeCapacity test
non-concurrently, because this one seems to be especially resource hungry, but I
don't know how this would work in jtreg (I can only set whole directories).
Currently we run with this patch and the issues are gone.

Is there a way to balance resource usage in jtreg runs?

-

PR Comment: https://git.openjdk.org/jdk/pull/17625#issuecomment-1918595685


Re: RFR: JDK-8324930: java/lang/StringBuilder problem with concurrent jtreg runs

2024-01-30 Thread Joe Darcy
On Tue, 30 Jan 2024 17:21:07 GMT, Aleksey Shipilev  wrote:

> Can we maybe see if we can fix these tests without exclusive-accessing them? 
> I find it surprising that `java/lang/StringBuilder` tests are problematic, 
> but `java/lang/StringBuffer` tests are not. Which tests fail?

I agree it would be strongly preferable to allow these tests to run without 
exclusive access.

-

PR Comment: https://git.openjdk.org/jdk/pull/17625#issuecomment-1918163496


Re: RFR: JDK-8324930: java/lang/StringBuilder problem with concurrent jtreg runs

2024-01-30 Thread Aleksey Shipilev
On Tue, 30 Jan 2024 09:08:28 GMT, Matthias Baesken  wrote:

> On some Windows machines we see sometimes OOM errors because of high resource 
> (memory/swap) consumption. This is especially seen when the jtreg runs have 
> higher concurrency. A solution is to put the java/lang/StringBuilder tests in 
> the exclusiveAccess.dirs group so that they are not executed concurrently, 
> which helps to mitigate the resource shortages.
> Of course this has the downside that on very large machines the concurrent 
> execution is not done any more.

Can we maybe see if we can fix these tests without exclusive-accessing them? I 
find it surprising that `java/lang/StringBuilder` tests are problematic, but 
`java/lang/StringBuffer` tests are not. Which tests fail?

-

PR Review: https://git.openjdk.org/jdk/pull/17625#pullrequestreview-1851921699


RFR: JDK-8324930: java/lang/StringBuilder problem with concurrent jtreg runs

2024-01-30 Thread Matthias Baesken
On some Windows machines we see sometimes OOM errors because of high resource 
(memory/swap) consumption. This is especially seen when the jtreg runs have 
higher concurrency. A solution is to put the java/lang/StringBuilder tests in 
the exclusiveAccess.dirs group so that they are not executed concurrently, 
which helps to mitigate the resource shortages.
Of course this has the downside that on very large machines the concurrent 
execution is not done any more.

-

Commit messages:
 - JDK-8324930

Changes: https://git.openjdk.org/jdk/pull/17625/files
 Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=17625&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8324930
  Stats: 2 lines in 1 file changed: 1 ins; 0 del; 1 mod
  Patch: https://git.openjdk.org/jdk/pull/17625.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/17625/head:pull/17625

PR: https://git.openjdk.org/jdk/pull/17625