zclllyybb commented on issue #64138: URL: https://github.com/apache/doris/issues/64138#issuecomment-4628998988
Breakwater-GitHub-Analysis-Slot: slot_5ce3712a7728 Initial analysis from the current broker code: This looks like a valid broker-side resource lifetime bug, not just a heap sizing symptom. In `BrokerFileSystem.closeFileSystem()`, the underlying Hadoop `FileSystem.close()` call is still commented out, so the method only drops Doris' reference by setting `dfsFileSystem = null`. The HDFS broker path creates the instance through `FileSystem.get(...)` with `fs.hdfs.impl.disable.cache=true`, which means the broker's own cache effectively owns the lifecycle. When `openReader`, `openWriter`, or other file operations hit an `IOException`, they call `fileSystem.closeFileSystem()`. With the current implementation, that invalidates the cached wrapper but does not close the Hadoop/LiteFS resources. If the concrete filesystem owns Netty event-loop threads, repeated failing imports/exports can accumulate those resources, which matches the reported `DFS_OPEN` OOM stack and the suggested `jstack` symptom. I also agree that simply reverting the old close change is unsafe. The broker stores the same cached `BrokerFileSystem` behind active readers/writers, and normal operations use `getDFSFileSystem()` without holding `BrokerFileSystem.lock`. Re-enabling an unconditional close in an error or expiry path could close the shared `FileSystem` while another thread is still using streams opened from it, which explains the historical `Filesystem closed` failure mode. There is a separate real concurrency issue in `updateCachedFileSystem()`. It uses `containsKey()`, `get()`, and `put()` as separate operations on a `ConcurrentHashMap`. During concurrent first use, two callers can create different wrappers and the later `put()` can overwrite a wrapper that another thread may already have initialized with a Hadoop `FileSystem`. During expiry replacement, the code locks the wrapper read earlier, but then rechecks the map separately and can still replace the wrong generation. Both cases can leave a `FileSystem` outside the intended cache lifecycle or make close-on-eviction unsafe. Suggested fix direction: 1. Make cache creation/replacement atomic per `FileSystemIdentity`, for example with a `compute`/generation-check pattern or an explicit per-identity lifecycle lock, so an initialized wrapper cannot be overwritten accidentally. 2. Reintroduce real resource closing only for a wrapper that has first been removed from visibility to new users and is no longer referenced by active streams. A reference-count or lease-style wrapper would address the leak without reintroducing the old `Filesystem closed` regression. 3. Add a concurrency test covering simultaneous first creation, open failure, and expiry replacement, plus a stress/repro check that repeated failing broker opens do not monotonically increase Hadoop/LiteFS/Netty worker threads. Useful confirmation data for the PR review: - the exact Doris commit/branch and the Hadoop/LiteFS dependency versions used by the broker; - jstack thread-count snapshots over time and one heap histogram or heap dump showing retained `FileSystem` / LiteFS / Netty event-loop objects; - a small repro config or script for the failing import/export path that repeatedly triggers `DFS_OPEN` failures. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
