naahk37 opened a new issue, #24816:
URL: https://github.com/apache/pulsar/issues/24816

   ### Search before reporting
   
   - [x] I searched in the [issues](https://github.com/apache/pulsar/issues) 
and found nothing similar.
   
   
   ### Read release policy
   
   - [x] I understand that [unsupported 
versions](https://pulsar.apache.org/contribute/release-policy/#supported-versions)
 don't get bug fixes. I will attempt to reproduce the issue on a supported 
version of Pulsar client and Pulsar broker.
   
   
   ### User environment
   
   Pulsar 4.0.2
   openjdk version "17.0.14"
   Ubuntu 24.04
   3 brokers, 3 bookies, 3 zookeepers
   
   ### Issue Description
   
   Hi,
   
   I'm running a Pulsar cluster with the specs listed above and partitioned 
topics, and I have two problems:
   
   - The filesystem usage on the bookies doesn't seem to go down (bk1: 100GB, 
bk2: 400GB, bk3: 100GB). I already set a retention policy on my main namespace 
(2 weeks, 10GB), and the metrics in Grafana report the expected topic sizes 
(100GB storage size and ~30GB backlog size). The usage on bk1 and bk3 makes 
sense to me, but the 400GB on bk2 does not.
   
   - The bookie service on bk2 stops frequently with the error 
"io.netty.util.internal.OutOfDirectMemoryError: failed to allocate 4194304 
byte(s) of direct memory (used: 2147483648, max: 2147483648)". I can't find the 
setting that controls this direct memory limit. I already increased the JVM 
allocations in pulsar_env.sh, but that doesn't seem to change the limit 
reported in the error.
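   
   For readability, here is a minimal sketch that parses the error text quoted 
above and converts the byte counts to MiB. The function names are made up for 
illustration, and the script only restates the numbers in the message; it does 
not tell where the limit is configured.

```python
import re

# Error text copied verbatim from the bk2 bookie log quoted above.
MESSAGE = (
    "io.netty.util.internal.OutOfDirectMemoryError: failed to allocate 4194304 "
    "byte(s) of direct memory (used: 2147483648, max: 2147483648)"
)

# Captures the three byte counts that appear in the message.
PATTERN = re.compile(
    r"failed to allocate (?P<request>\d+) byte\(s\) of direct memory "
    r"\(used: (?P<used>\d+), max: (?P<max>\d+)\)"
)


def parse_direct_memory_error(message: str) -> dict[str, int]:
    """Return the requested, used and max byte counts found in the message."""
    match = PATTERN.search(message)
    if match is None:
        raise ValueError("message does not look like an OutOfDirectMemoryError")
    return {name: int(value) for name, value in match.groupdict().items()}


def to_mib(num_bytes: int) -> float:
    """Convert a byte count to mebibytes (1 MiB = 1024 * 1024 bytes)."""
    return num_bytes / (1024 * 1024)


if __name__ == "__main__":
    for name, value in parse_direct_memory_error(MESSAGE).items():
        print(f"{name}: {value} bytes = {to_mib(value):g} MiB")
```

   In other words, the process had used its full 2 GiB (2048 MiB) of direct 
memory when a further 4 MiB allocation failed.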
   
   On bookie bk2 there are around 12k ledger log files, whereas the other 
two bookies have ~200 and 85 files.
   On bk2 I have the following log entries regarding GC:
   "Forced garbage collection triggered by thread: LedgerDirsMonitorThread",
   "Garbage collector thread forced to perform GC before expiry of wait time",
   "Extracting entry log meta from entryLogId: 185",
   "GarbageCollectorThread-6-1 Set forceGarbageCollection to false after force 
GC to make it forceGC-able again"
   
   GC does appear to be happening, because the "deleted ledger" count goes up 
while the bookie is running.
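   
   The file counts above can be reproduced per bookie with a small sketch like 
the one below. It assumes the ledger log files end in `.log` and live somewhere 
under the bookie's ledger directory; the function name is hypothetical, and the 
directory has to be supplied by whoever runs it.

```python
from pathlib import Path


def summarize_log_files(directory: Path) -> tuple[int, int]:
    """Return (file count, total size in bytes) of *.log files under directory."""
    files = [path for path in directory.rglob("*.log") if path.is_file()]
    return len(files), sum(path.stat().st_size for path in files)


# Example call (the path is a placeholder, not taken from the cluster above):
# count, total_bytes = summarize_log_files(Path("/path/to/bookie/ledgers"))
# print(f"{count} log files, {total_bytes / 1024**3:.1f} GiB")
```

   Running it on each bookie gives a like-for-like comparison of file count and 
on-disk size, which can be set against the storage size reported in Grafana.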
   
   ### Error messages
   
   ```text
   
   ```
   
   ### Reproducing the issue
   
   Not really applicable; it keeps happening while the cluster is running.
   
   ### Additional information
   
   _No response_
   
   ### Are you willing to submit a PR?
   
   - [ ] I'm willing to submit a PR!

