On Wed, 4 Mar 2026 19:25:54 GMT, Kieran Farrell <[email protected]> wrote:

>> Currently, it is only possible to read the number of open file descriptors 
>> of a Java process via the `UnixOperatingSystemMXBean` which is only 
>> accessible via JMX enabled tools. To improve servicability, it would be 
>> benifical to be able to view this information from jcmd VM.info output or 
>> hs_err_pid crash logs. This could help diagnose resource exhaustion and 
>> troubleshoot "too many open files" errors in Java processes on Unix 
>> platforms.
>> 
>> This PR adds reporting the current open file descriptor count to both jcmd 
>> VM.info output or hs_err_pid crash logs by refactoring the native JNI logic 
>> from 
>> `Java_com_sun_management_internal_OperatingSystemImpl_getOpenFileDescriptorCount0`
>>  of the `UnixOperatingSystemMXBean` into hotspot. Apple's API for retrieving 
>> open file descriptor count provides an array of the actual FDs to determine 
>> the count. To avoid using `malloc` to store this array in a potential signal 
>> handling context where stack space may be limited, the apple implementation 
>> instead allocates a fixed 32KB struct on the stack to store the open FDs and 
>> only reports the result if the struct is less than the max (1024 FDs). This 
>> should cover the majoirty of use cases.
>
> Kieran Farrell has updated the pull request incrementally with two additional 
> commits since the last revision:
> 
>  - use scratch buffer in hs_error mac
>  - use scratch buffer in hs_error mac

src/hotspot/os/aix/os_aix.cpp line 83:

> 81: #include <alloca.h>
> 82: #include <ctype.h>
> 83: #include <dirent.h>

what are these for? pls remove

src/hotspot/os/bsd/os_bsd.cpp line 2587:

> 2585: #ifdef __APPLE__
> 2586:   pid_t my_pid;
> 2587:   const int MAX_SAFE_FDS = 1024;

Suggestion:

  constexpr int MAX_SAFE_FDS = 1024;

src/hotspot/os/bsd/os_bsd.cpp line 2613:

> 2611: }
> 2612: 
> 2613: void os::Bsd::print_open_file_descriptors(outputStream* st, char* buf, 
> size_t buflen) {

This is not what I had in mind with my proposal. Too much duplication.

What I meant was:

- Write one version that takes an input buffer and a size. Let that one be the 
version that does all the work.
- Call that version directly from error reporting using its scratch buffer
- Write a second version that allocates a static local buffer, and calls the 
first version with that static local buffer as input buffer.

The second version should only have one or two lines, top.

src/hotspot/os/bsd/os_bsd.cpp line 2620:

> 2618:       st->print_cr("Open File Descriptors: unknown");
> 2619:       return;
> 2620:   }

Simplify this to an assert. Input buffer size is not a surprise at runtime, its 
hardcoded somewhere in the JVM, so it must be correct.

src/hotspot/os/linux/os_linux.cpp line 5395:

> 5393:   struct timespec start, now;
> 5394:   bool timed_out = false;
> 5395:   int status = clock_gettime(CLOCK_MONOTONIC, &start);

Here and below, please just use `os::javaTimeNanos()`.

src/hotspot/os/linux/os_linux.cpp line 5401:

> 5399:     st->print_cr("Open File Descriptors: unknown");
> 5400:     return;
> 5401:   }

Suggestion:

  assert(dirp != nullptr, "No proc fs?");

can never be false on a normal linux

src/hotspot/os/linux/os_linux.cpp line 5405:

> 5403:   // limit proc file read to 50ms
> 5404:   while ((dentp = readdir(dirp)) != nullptr) {
> 5405:     if (isdigit(dentp->d_name[0])) fds++;

This can also be an assert(isdigit). There is nothing else in that directory.

src/hotspot/os/linux/os_linux.cpp line 5413:

> 5411:       if (elapsed_ms > TIMEOUT_MS) {
> 5412:         timed_out = true;
> 5413:         break;

Remove break, and make `timed_out` part of the while condition

src/hotspot/os/linux/os_linux.cpp line 5420:

> 5418:   closedir(dirp);
> 5419:   if (timed_out) {
> 5420:     st->print_cr("Open File Descriptors: > %d", fds - 1); // minus the 
> opendir fd itself

Lets just avoid the `-1`. Nobody cares for that precision, and it is (quite 
likely) wrong: If we run into this branch, the process has a buckload of open 
fds, so chances are high that the opendir fd, opened last, will have a very 
high number and thus not be part of the set you just counted.

This whole counting thing is racy anyway, and the set of open file descriptors 
can change while you iterate the directory.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/27971#discussion_r2932149950
PR Review Comment: https://git.openjdk.org/jdk/pull/27971#discussion_r2932169942
PR Review Comment: https://git.openjdk.org/jdk/pull/27971#discussion_r2932212595
PR Review Comment: https://git.openjdk.org/jdk/pull/27971#discussion_r2932228676
PR Review Comment: https://git.openjdk.org/jdk/pull/27971#discussion_r2932317044
PR Review Comment: https://git.openjdk.org/jdk/pull/27971#discussion_r2932274784
PR Review Comment: https://git.openjdk.org/jdk/pull/27971#discussion_r2932281480
PR Review Comment: https://git.openjdk.org/jdk/pull/27971#discussion_r2932322875
PR Review Comment: https://git.openjdk.org/jdk/pull/27971#discussion_r2932336058

Reply via email to