Re: Testing no memory leak occurs via references

2023-03-07 Thread Philip Race
Having just a very few sources of wisdom on this in the JDK test suites 
is a good idea because
then any tests that might be affected by policy changes would be easily 
spotted.
I say this despite an instinctive reticence to rely on "frameworks" and 
"utilities" in jtreg tests.

As resources allow we should look into that (across all areas).

I don't know if tests which expect to run with the default GC would be 
wise to set a specific GC.

Testing "out of the box" is more important to (e.g.) client tests.

I also think native code is a problem because a lot of tests are run 
using jtreg *outside* of a build,
i.e. add jtreg and a build to your path and then run jtreg. This is 
actually normal, not an aberration.
It has come as a surprise to me in the past that folks who work on the VM 
etc. were surprised at this :-)

So it's not easy to build the native code then.

The observations about the fragility of the VM in OOME situations are noted.
Also othervm mode just seems a lot safer for all the above tests.

-phil.

PS someone

On 3/6/23 7:11 AM, Aleksei Ivanov wrote:

On 06/03/2023 13:51, Albert Yang wrote:
Upon a cursory inspection of ForceGC.java, it seems that the 
fundamental logic involves waiting for a certain duration and relying 
on chance. However, I am of the opinion that utilizing the WhiteBox 
API can provide greater determinism and potentially strengthen some 
of the assertions.


Yes, it calls System.gc in a loop and waits for a reference to become 
cleared.


(It looks as if the body of ForceGC duplicates what one would have in 
the passed BooleanSupplier which again tests if a reference is cleared.)



I decided ForceGC is simpler and easier to use.
I was not aware of your specific requirements, so I cannot say for 
certain which approach is best. (However, it is worth noting that the 
WhiteBox API can be utilized to implement ForceGC if necessary.)


My test is written to ensure awt.List gets garbage-collected when 
there are no strong references to it. Before JDK-8040076 was fixed, 
it wasn't.


So the test creates awt.List, adds it to a frame, calls 
setMultipleMode(true) to enable multi-selection in the list component, 
removes it from the frame. At this point, I want to confirm the 
awt.List can be garbage-collected.


The original test created a very long String to cause OutOfMemoryError 
and then verified whether the WeakReference to awt.List is cleared or 
not.


In my first fix, I replaced generating OutOfMemoryError with a call to 
System.gc() in a loop and waited for the reference object to be 
cleared. Usually, the reference is cleared in the second iteration if 
not in the first one.


Essentially, ForceGC does the same thing. So, I replaced my custom 
code with ForceGC.
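
For readers unfamiliar with the pattern described above, here is a minimal sketch of "call System.gc() in a loop and wait for the reference to be cleared". It is not the ForceGC source from the JDK test library; the class name GcWait, the method waitFor, the 50 ms pause and the timeout values are assumptions made for illustration, and a plain Object stands in for awt.List.

```java
import java.lang.ref.WeakReference;
import java.util.function.BooleanSupplier;

public class GcWait {
    /**
     * Repeatedly requests a GC until the condition holds or the timeout
     * expires. Returns true if the condition became true in time.
     * (Hypothetical helper, not the JDK test library's ForceGC.)
     */
    public static boolean waitFor(BooleanSupplier condition, long timeoutMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            System.gc(); // only a request; the VM may ignore or delay it
            try {
                Thread.sleep(50); // give the collector a moment to run
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return condition.getAsBoolean();
            }
        }
    }

    public static void main(String[] args) {
        Object component = new Object(); // stand-in for the awt.List under test
        WeakReference<Object> ref = new WeakReference<>(component);
        component = null; // drop the only strong reference

        boolean cleared = waitFor(() -> ref.get() == null, 10_000);
        System.out.println(cleared ? "reference cleared"
                                   : "reference still reachable");
    }
}
```

Because System.gc() is only a hint, a test built this way can time out on a VM that ignores the request, which is the "relying on chance" concern raised earlier in the thread.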






Re: Usage of iconv()

2024-04-25 Thread Philip Race




On 4/24/24 4:24 AM, Magnus Ihse Bursie wrote:
That is a good question. libiconv is used only on macOS and AIX, for a 
few libraries, as you say. I just tried removing -liconv from the 
macOS dependencies and recompiled, just to see what would happen. 
There were three instances for macOS: libsplashscreen, libjdwp and 
libinstrument.




libsplashscreen uses it in splashscreen_sys.m, where it is used to 
convert the jar file name.


This is called from the launcher, in java.base/share/native/libjli/java.c




libjdwp uses it in utf_util.c, where it is used to convert file name 
and command lines, iiuc.


It is likely that we have similar (but better) ways to convert 
charsets elsewhere in our libraries that can be used instead of 
libiconv. I guess these are not the only two places where a filename 
must be converted from the platform charset to UTF8.


So whatever replacement there might be, it needs to be something that is 
available very early in the life of the VM, in fact before there is a VM 
running.


-phil.


Re: stack overflow in regex engine

2024-05-22 Thread Philip Race
P4 is the default JBS priority, so sometimes it just means no one 
figured out the true priority.

But in general P4 bugs could be open for years, or even never get fixed.
The priority is also partially an assessment of where it falls as a 
priority for the JDK developers.

A user of JDK may have an entirely different perspective.
And that's why there are vendors who provide support for JDK. They can 
also arrange the backports you need.
But that's not done here. Here is where you come to participate and 
contribute fixes, not ask for fixes.
So my suggestion is to raise it via your support channel to your 
particular vendor who provided your binary.


-phil


On 5/21/24 8:46 PM, mark.yagnatin...@barclays.com wrote:


(Sorry about my previous “do I need to subscribe?” email; in 
retrospect that was needless noise.)


The purpose of this email is twofold: first, to inquire about the status 
of a ticket filed a few years ago, and second, to point out some 
non-obvious reasons why it might be slightly more serious than it seems.


The ticket is this one https://bugs.openjdk.org/browse/JDK-8260866 
(stack overflow in regex matching quantified alternation)


The priority is listed as P4, which I guess means something like 
“medium” (more important than p5, but less than p3)


It also has a specific person assigned, which seems vaguely 
encouraging, but no updates at all in the years since it’s been 
created, which seems less encouraging.


It was seemingly never once discussed on this mailing list, not even 
when it was first filed.


As an outsider, I’m not quite sure how to interpret all these various 
omens and turn them into guesses about its eventual fate.


Will it remain unfixed for another decade or two?  Will it be fixed in 
a few months, but then never backported to old versions?  Something 
else?  No one knows?


That concludes the status inquiry.  Now on to the advocacy.  Some bugs 
are annoying, but once you hit them, you can work around them by 
changing your code so it does not trigger the bug.


Note the phrase “your code” above.  This is much more awkward to do if 
the bug is triggered by third-party code you got from maven central or 
something.


At that point your options are to either ask the third party library 
to work around it, or else fork the dependency (which is not well 
supported by mainstream build tools (or maybe I’m just using them wrong)).


In this case, regular expressions are so ubiquitous that the bug is 
quite plausibly more likely to be triggered by some third party 
dependency than by code you own.


That was the case for me today: after spending hours trying to track 
down a stack overflow error I found the offending regex in a third 
party library.


The good news is that for the kinds of inputs we need to handle, it is 
indeed easy to substitute a much simpler regex that would avoid the issue.
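
To make the kind of substitution concrete, here is a small sketch. The patterns are illustrative assumptions, not the regex from the third-party library: "(a|b)*" is a quantified alternation of the shape named in the ticket title, and "[ab]*" is a character class that accepts the same strings. Whether the first pattern actually overflows depends on the input length and the thread's stack size, so the program reports what happened rather than assuming it.

```java
import java.util.regex.Pattern;

public class RegexStackDemo {
    // Quantified alternation: the shape named in the title of JDK-8260866.
    static final Pattern ALTERNATION = Pattern.compile("(a|b)*");
    // A character class that accepts the same strings for this example.
    static final Pattern CHAR_CLASS = Pattern.compile("[ab]*");

    /** Returns "match", "no match", or "StackOverflowError". */
    public static String tryMatch(Pattern pattern, CharSequence input) {
        try {
            return pattern.matcher(input).matches() ? "match" : "no match";
        } catch (StackOverflowError e) {
            return "StackOverflowError";
        }
    }

    public static void main(String[] args) {
        String input = "ab".repeat(100_000); // String.repeat needs Java 11+
        System.out.println("(a|b)* -> " + tryMatch(ALTERNATION, input));
        System.out.println("[ab]*  -> " + tryMatch(CHAR_CLASS, input));
    }
}
```

The rewrite only works when the alternatives are single characters; more general alternations need a different workaround.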


The bad news is that it’s not my code, so I can’t.  I could petition 
the maintainers of the library, but this is not great because:


First, maybe the version I’m on is no longer even supported, and 
newer versions are not compatible,


Second, it may take them a while to fix it, and third, it is wasteful 
(and inelegant) to have workarounds slowly percolate throughout the 
Java ecosystem instead of fixing the problem at the root.


The other annoying thing here is that even when you have “enough” 
stack space to avoid crashing, using it may not be quite “free”.


For instance, project loom’s foundational premise seems to be that 
“most threads have oversized stacks; we can have more threads if we 
start off with small stacks and grow them only when needed”.


This would be false when the thread in question uses a regex with 
quantified alternation.


(Since many Loom threads will be based on the same Runnable, it’s a 
pretty safe bet that if one of them uses this feature, many will, so 
you can’t assume it will “average out”.)


There are other reasons besides loom to be low on stack space; maybe 
you’re using some crazy framework(s) that like(s) to have call stacks 
that are crazy deep.


Or maybe you’re running with -Xss set pretty low.  Or you passed a 
small value for stack space to the Thread constructor.
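
As an illustration of the Thread constructor case, here is a sketch that runs the same match on threads requesting different stack sizes. The class name SmallStackDemo, the helper runWithStack, the pattern, the input length and the sizes are all assumptions chosen for the example. The stackSize argument is documented as a hint that some platforms ignore, so the output will vary from system to system.

```java
import java.util.regex.Pattern;

public class SmallStackDemo {
    /**
     * Runs the task on a new thread that requests the given stack size.
     * Returns whatever the task threw, or null if it completed normally.
     */
    public static Throwable runWithStack(Runnable task, long stackSizeBytes) {
        Throwable[] failure = new Throwable[1];
        Thread thread = new Thread(null, () -> {
            try {
                task.run();
            } catch (Throwable t) {
                failure[0] = t; // visible to the caller after join()
            }
        }, "stack-demo", stackSizeBytes);
        thread.start();
        try {
            thread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return failure[0];
    }

    public static void main(String[] args) {
        Pattern pattern = Pattern.compile("(a|b)*"); // quantified alternation
        String input = "ab".repeat(2_000);           // Java 11+
        for (long kib : new long[] {128, 1024, 16 * 1024}) {
            Throwable t = runWithStack(
                    () -> pattern.matcher(input).matches(), kib * 1024);
            System.out.println(kib + " KiB requested -> "
                    + (t == null ? "ok" : t.getClass().getSimpleName()));
        }
    }
}
```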


Or maybe none of these things are true, but in most operating systems 
a thread stack costs “real” memory in proportion to its 
high-watermark, so even a SINGLE heavy regex in the lifetime of a 
thread is tantamount to a memory leak of hundreds of kilobytes.


Practicalities aside, I don’t like it when code consumes “surprising” 
types of resources, or surprising amounts of them.


For instance, you wouldn’t expect a sorting function to spawn threads 
behind your back, unless it was called “parallel sort” or something 
like that.


You wouldn’t expect it to allocate multi-gigabyte arrays, nor to 
perform I/O.


Similarly, most functions need only O(1) stack space, so this tends to 
be the default assumption unless the docs explicitly call out “this 
thing might throw stack ov

Re: stack overflow in regex engine

2024-05-22 Thread Philip Race



On 5/22/24 10:51 AM, mark.yagnatin...@barclays.com wrote:


Ah, didn’t realize P4 is default; that makes sense.

So I should not even be trying to derive omens from that.

So I guess only the assignee would know whether or not the status is 
closer to


“I was going to work on that next week” versus

“I totally forgot about that thing, and am about to forget about it again”

I’m quite sure he’s on this list and will hopefully read the advocacy 
section of my email.


Um.  I feel awkward writing this paragraph because you know how 
OpenJDK works much better than I do, so it feels a bit silly to argue 
with you about it.  But.  Um.


When you say “this is not the place to ask for fixes” …

I was under the impression that “asking for fixes” actually does 
provide value, and not all of that value can be replaced by merely 
providing fixes.


In particular, asking for fixes gives maintainers a vague sense of how 
often people in the “real world” tend to run into an issue, which in 
turn informs how much “cost” is worth spending on addressing it.


(Where “cost” could mean things like “time” and also things like “this 
makes the code trickier and hence harder to maintain”.)


In fact, I was under the impression that OpenJDK is slightly hostile 
to “big” fixes by “outsiders” because of the worry that there’s now a 
big/complicated chunk of code that no one inside the project 
understands and yet the project is responsible for, and the original 
author might never be heard from again.




I think that is mainly the case for some new feature.
Or if you want to take some existing feature / functionality and 
re-write it in a different way.


True "bug fixes" to existing code are generally welcome, although that 
isn't the same as saying they are quickly accepted.  They still need 
review and testing, and if the area is sensitive or complex that can be 
quite time consuming on all ends of it. Which would all also be the case 
even if an experienced contributor provided the fix.

-phil.


Anyway, thanks a bunch for responding; I was worried that no one would.
