congbobo184 opened a new issue, #15133:
URL: https://github.com/apache/pulsar/issues/15133

   - **Status**: Discussion
   - **Author**: Bo Cong
   - **Pull Request**: 
   - **Mailing List discussion**: 
   - **Release**: 2.11
   
   # Motivation
   
   Currently, the transaction coordinator (TC) does not limit the number of 
active transactions, which may cause the following problems:
   
   - A large number of active transactions puts heavy pressure on memory.
   - The number of transactions a single TC can handle is limited, so the 
number of active transactions cannot grow without bound.
   - Ending a transaction must wait until the transaction pending ack (TP) or 
transaction buffer (TB) has recovered successfully, so many end requests will 
be pending in the TP or TB while the TC does not know their state. This wastes 
machine resources, and a large number of pending TP or TB requests can cause 
an OOM.
   
   ## Implementation
   
   ### Add config
   
   Add maxActiveTransactions to broker.conf:
   
   ```properties
   # The maximum number of active transactions in one transaction coordinator.
   # The default value 0 does not block opening transactions.
   maxActiveTransactions=0
   ```
   
   
   
   ### What should happen when the number of active transactions reaches maxActiveTransactions?
   
   One option is to return an exception to the client when 
maxActiveTransactions is reached. This has several disadvantages:
   
   1. The broker has to add a ReachMaxActiveTxnException for this case. The 
client then needs to catch this exception before continuing, so every client 
has to handle ReachMaxActiveTxnException.
   2. A client that receives this exception will not stop opening 
transactions, because it does not know when the TC will recover. It will retry 
immediately. If the TC cannot recover, the client keeps retrying, which is 
pointless.
   
   ### Design
   
   When an open-transaction request arrives after maxActiveTransactions has 
been reached, the coordinator does not return any response and simply ignores 
the request. This way, the broker does not need to add a new exception for 
this config.
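
The admission check described above can be sketched as follows. This is a minimal illustration and not Pulsar's coordinator code: the class `ActiveTxnLimiter` and its methods are hypothetical names, and treating 0 as "no limit" is an assumption based on the default described in the compatibility section.

```java
import java.util.Optional;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch of a per-coordinator limit on active transactions.
final class ActiveTxnLimiter {
    // Assumption: 0 (or a negative value) means the limit is disabled.
    private final long maxActiveTransactions;
    private final AtomicLong active = new AtomicLong();
    private final AtomicLong nextTxnId = new AtomicLong();

    ActiveTxnLimiter(long maxActiveTransactions) {
        this.maxActiveTransactions = maxActiveTransactions;
    }

    // Returns a new transaction id, or empty when the limit is reached.
    // On empty, the caller sends no response at all, so the client's
    // request simply times out.
    Optional<Long> tryOpen() {
        while (true) {
            long current = active.get();
            if (maxActiveTransactions > 0 && current >= maxActiveTransactions) {
                return Optional.empty();
            }
            // CAS so that concurrent opens cannot exceed the limit.
            if (active.compareAndSet(current, current + 1)) {
                return Optional.of(nextTxnId.getAndIncrement());
            }
        }
    }

    // Called when a transaction is committed, aborted, or timed out.
    void onTxnEnded() {
        active.decrementAndGet();
    }

    long activeCount() {
        return active.get();
    }
}
```

With a limit of 2, the first two `tryOpen()` calls succeed and the third returns empty until `onTxnEnded()` frees a slot.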
   
   
   
   #### How does this affect the client?
   
   If the broker does not return a response, the open-transaction op times 
out. The coordinator client has a semaphore that controls transaction ops 
(open, add produce topic, add ack topic, end txn). Within one timeout period, 
the coordinator client can open at most as many transactions as the semaphore 
has permits; any other request is blocked. So this design solves both 
problems:
   
   1. No new exception needs to be added.
   2. The client will not retry indefinitely.
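
The client-side effect can be illustrated with a small sketch. It is not the actual Pulsar coordinator client: `TxnOpGate`, `maxPendingOps`, and `timeoutMs` are hypothetical names, and the sketch assumes that a permit is held until the op either gets a response or times out.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Hypothetical sketch: a semaphore bounds the number of in-flight txn ops.
final class TxnOpGate {
    private final Semaphore permits;
    private final long timeoutMs;

    TxnOpGate(int maxPendingOps, long timeoutMs) {
        this.permits = new Semaphore(maxPendingOps);
        this.timeoutMs = timeoutMs;
    }

    // Blocks while maxPendingOps requests are outstanding, then sends.
    CompletableFuture<Long> send(Supplier<CompletableFuture<Long>> request) {
        permits.acquireUninterruptibly();
        CompletableFuture<Long> pending;
        try {
            pending = request.get();
        } catch (RuntimeException e) {
            permits.release(); // do not leak the permit if sending fails
            throw e;
        }
        // If the broker never answers, the future fails with a
        // TimeoutException after timeoutMs (requires Java 9+).
        pending.orTimeout(timeoutMs, TimeUnit.MILLISECONDS);
        // The permit is returned on a response or on the timeout.
        pending.whenComplete((value, error) -> permits.release());
        return pending;
    }

    int availablePermits() {
        return permits.availablePermits();
    }
}
```

When every request is ignored by the broker, each permit is held for the full timeout, so the client can issue at most `maxPendingOps` requests per timeout period instead of retrying in a tight loop.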
   
   #### Worries
   
   You may worry that this design hurts the client-side experience, because 
open-transaction requests will keep timing out and other transaction ops will 
be blocked. I think this worry is unnecessary: in that situation, you should 
either improve the performance of the cluster or find and fix the problematic 
client.
   
   
   
   ### Flow chart
   
   
![image](https://user-images.githubusercontent.com/39078850/162964277-6342ae82-1691-48b5-af84-18bb7a422ff1.png)
   
   
   
   ### Compatibility, Deprecation, and Migration Plan
   
   The default is maxActiveTransactions=0. With this default, opening 
transactions is not blocked.
   
   ### Test Plan
   
   When maxActiveTransactions is reached, a client's open-transaction request 
times out.
   
   ### Rejected Alternatives
   
   Return an exception to the client when maxActiveTransactions is reached. 
This has several disadvantages:
   
   1. The broker has to add a ReachMaxActiveTxnException for this case. The 
client then needs to catch this exception before continuing, so every client 
has to handle ReachMaxActiveTxnException.
   2. A client that receives this exception will not stop opening 
transactions, because it does not know when the TC will recover. It will retry 
immediately. If the TC cannot recover, the client keeps retrying, which is 
pointless.

