keyboardbobo opened a new issue, #18408: URL: https://github.com/apache/pulsar/issues/18408
### Search before asking

- [X] I searched in the [issues](https://github.com/apache/pulsar/issues) and found nothing similar.

### Version

Broker 2.9.2, BookKeeper 4.14

### Minimal reproduce step

The topic persistent://qlm-test/qlm-ns/qlm-test has 500 partitions. The Full GC can be reproduced locally with the following steps:

> Start 5 instances on the same Linux server:
> nohup bin/pulsar-perf produce -threads 20 -u pulsar://clusterIp:port -n 20 -s 200 -r 100000 persistent://qlm-test/qlm-ns/qlm-test &

> Start 2 instances on another Linux server:
> nohup bin/pulsar-perf consume -u pulsar://clusterIp:port -q 5000 -ss qlm-sub -st Shared persistent://qlm-test/qlm-ns/qlm-test &

The broker logs many entries like these:

> 2022-11-10 15:41:27.0581 [BookKeeperClientWorker-OrderedExecutor-2-0] WARN org.apache.bookkeeper.client.PendingAddOp - Fencing exception on write: L780738 E1371097 on 10.101.129.75:3181
> 2022-11-10 15:41:27.0581 [BookKeeperClientWorker-OrderedExecutor-39-0] WARN org.apache.bookkeeper.client.PendingAddOp - Fencing exception on write: L780711 E1370071 on 10.101.129.68:3181
> 2022-11-10 15:41:27.0581 [BookKeeperClientWorker-OrderedExecutor-18-0] ERROR org.apache.bookkeeper.client.PendingAddOp - Write of ledger entry to quorum failed: L780690 E1374427
> 2022-11-10 15:41:27.0581 [BookKeeperClientWorker-OrderedExecutor-40-0] WARN org.apache.bookkeeper.client.PendingAddOp - Failed to write entry (780712, 1354072): Bookie operation timeout
> 2022-11-10 15:41:27.0581 [BookKeeperClientWorker-OrderedExecutor-7-0] WARN org.apache.bookkeeper.client.PendingAddOp - Fencing exception on write: L780743 E1370335 on 10.101.129.68:3181
> 2022-11-10 15:41:27.0581 [BookKeeperClientWorker-OrderedExecutor-18-0] ERROR org.apache.bookkeeper.client.PendingAddOp - Write of ledger entry to quorum failed: L780690 E1374428

The client logs many entries like these:

> 2022-11-10 16:48:50.0588 [pulsar-timer-78-1] INFO org.apache.pulsar.client.impl.ConnectionHandler - [persistent://qlm-test/qlm-ns/qlm-test-partition-279] [pulsar_dev2-118-13100] Reconnecting after timeout
> 2022-11-10 16:48:50.0588 [pulsar-client-io-16-1] WARN org.apache.pulsar.client.impl.ClientCnx - [id: 0x28f60fe8, L:/10.101.75.4:45744 ! R:10.101.129.70/10.101.129.70:6650] Failed to send request to broker: null
> 2022-11-10 16:48:50.0537 [pulsar-timer-75-1] INFO org.apache.pulsar.client.impl.ConnectionHandler - [persistent://qlm-test/qlm-ns/qlm-test-partition-488] [pulsar_dev2-118-173804] Reconnecting after timeout
> 2022-11-10 16:48:50.0588 [pulsar-client-io-16-1] ERROR org.apache.pulsar.client.impl.ProducerImpl - [persistent://qlm-test/qlm-ns/qlm-test-partition-160] [pulsar_dev2-119-27579] Failed to create producer: null
> 2022-11-10 16:48:50.0534 [pulsar-client-io-6-1] WARN org.apache.pulsar.client.impl.ClientCnx - [id: 0x9f18bc7a, L:/10.101.75.4:35788 ! R:10.101.129.68/10.101.129.68:6650] Failed to send request 1295575096907667936 to broker: null
> 2022-11-10 16:48:50.0588 [pulsar-client-io-19-1] INFO org.apache.pulsar.client.impl.ProducerImpl - [persistent://qlm-test/qlm-ns/qlm-test-partition-226] [pulsar_dev2-119-60703] Created producer on cnx [id: 0xfdb3f279, L:/10.101.75.4:45962 - R:10.101.129.70/10.101.129.70:6650]

### What did you expect to see?

The system runs stably.

### What did you see instead?

Generally, within half an hour, a Full GC occurs on one or more brokers.

### Anything else?

_No response_

### Are you willing to submit a PR?

- [ ] I'm willing to submit a PR!

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at: [email protected]
