poorbarcode commented on code in PR #24945:
URL: https://github.com/apache/pulsar/pull/24945#discussion_r2501424938


##########
pulsar-broker/src/main/java/org/apache/pulsar/broker/transaction/buffer/impl/TopicTransactionBuffer.java:
##########
@@ -269,45 +260,124 @@ public long getCommittedTxnCount() {
 
     @Override
     public CompletableFuture<Position> appendBufferToTxn(TxnID txnId, long sequenceId, ByteBuf buffer) {
-        // Method `takeAbortedTxnsSnapshot` will be executed in the different thread.
-        // So we need to retain the buffer in this thread. It will be released after message persistent.
-        buffer.retain();
-        CompletableFuture<Position> future = getPublishFuture().thenCompose(ignore -> {
-            if (checkIfNoSnapshot()) {
-                CompletableFuture<Void> completableFuture = new CompletableFuture<>();
-                // `publishFuture` will be completed after message persistent, so there will not be two threads
-                // writing snapshots at the same time.
-                snapshotAbortedTxnProcessor.takeAbortedTxnsSnapshot(maxReadPosition).thenRun(() -> {
-                    if (changeToReadyStateFromNoSnapshot()) {
-                        timer.newTimeout(TopicTransactionBuffer.this,
-                                takeSnapshotIntervalTime, TimeUnit.MILLISECONDS);
-                        completableFuture.complete(null);
-                    } else {
-                        log.error("[{}]Failed to change state of transaction buffer to Ready from NoSnapshot",
-                                topic.getName());
-                        completableFuture.completeExceptionally(new BrokerServiceException.ServiceUnitNotReadyException(
-                                "Transaction Buffer take first snapshot failed, the current state is: " + getState()));
-                    }
-                }).exceptionally(exception -> {
-                    log.error("Topic {} failed to take snapshot", this.topic.getName());
-                    completableFuture.completeExceptionally(exception);
-                    return null;
-                });
-                return completableFuture.thenCompose(__ -> internalAppendBufferToTxn(txnId, buffer));
-            } else if (checkIfReady()) {
-                return internalAppendBufferToTxn(txnId, buffer);
-            } else {
-                // `publishFuture` will be completed after transaction buffer recover completely
-                // during initializing, so this case should not happen.
+        synchronized (pendingAppendingTxnBufferTasks) {
+            // The first snapshot is in progress, the following publish tasks will be pending.
+            if (!pendingAppendingTxnBufferTasks.isEmpty()) {
+                CompletableFuture<Position> res = new CompletableFuture<>();
+                buffer.retain();
+                pendingAppendingTxnBufferTasks.offer(new PendingAppendingTxnBufferTask(txnId, sequenceId, buffer, res));
+                return res;
+            }
+
+            // `publishFuture` will be completed after transaction buffer recover completely
+            // during initializing, so this case should not happen.
+            if (!checkIfReady() && !checkIfNoSnapshot() && !checkIfFirstSnapshotting() && !checkIfInitializing()) {
+                log.error("[{}] unexpected state: {} when try to take the first transaction buffer snapshot",
+                        topic.getName(), getState());
                 return FutureUtil.failedFuture(new BrokerServiceException.ServiceUnitNotReadyException(
                         "Transaction Buffer recover failed, the current state is: " + getState()));
             }
-        }).whenComplete(((position, throwable) -> buffer.release()));
-        setPublishFuture(future);
-        return future;
+
+            // The transaction buffer is ready to write.
+            if (checkIfReady()) {
+                return internalAppendBufferToTxn(txnId, buffer, sequenceId);
+            }

Review Comment:
   >  The main concern about using synchronized could cause contention and a 
performance regression.
   
   No. Once the state is `Ready`, this method always runs in the same thread
   (the producer's pulsar-io thread), so there is no contention for the lock
   and `synchronized` will not cause a performance regression.
   
   By the way, this change actually improves performance, as I described
   under issue 3 of the Motivation.
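
   To make the argument concrete, here is a minimal, simplified sketch of the
   pattern, not the actual `TopicTransactionBuffer` code. The class name
   `PendingAppendSketch`, its `State` enum, `onFirstSnapshotDone()` and the
   in-memory `log` are all hypothetical and exist only for illustration.
   Appends that arrive before the first snapshot completes are queued under
   the lock; once the state is ready, each append takes the lock and writes
   directly. If only one thread ever calls `append` at that point, the lock
   is uncontended.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.CompletableFuture;

/** Hypothetical, simplified model of the pending-append pattern (illustration only). */
class PendingAppendSketch {
    enum State { NO_SNAPSHOT, FIRST_SNAPSHOTTING, READY }

    /** One queued append: the payload plus the future handed back to the caller. */
    private static final class PendingTask {
        final String payload;
        final CompletableFuture<Integer> result;

        PendingTask(String payload, CompletableFuture<Integer> result) {
            this.payload = payload;
            this.result = result;
        }
    }

    private final Queue<PendingTask> pendingTasks = new ArrayDeque<>();
    private final List<String> log = new ArrayList<>();
    private State state = State.NO_SNAPSHOT;

    /** Appends a payload; the future completes with its index in the log. */
    CompletableFuture<Integer> append(String payload) {
        synchronized (pendingTasks) {
            if (state == State.READY) {
                // Steady state: write directly while holding the lock.
                return CompletableFuture.completedFuture(write(payload));
            }
            if (state == State.NO_SNAPSHOT) {
                // The first append is what would trigger the first snapshot.
                state = State.FIRST_SNAPSHOTTING;
            }
            // The snapshot is still in progress, so park the task in order.
            CompletableFuture<Integer> res = new CompletableFuture<>();
            pendingTasks.offer(new PendingTask(payload, res));
            return res;
        }
    }

    /** Called when the (simulated) first snapshot finishes: drain the queue in order. */
    void onFirstSnapshotDone() {
        synchronized (pendingTasks) {
            state = State.READY;
            PendingTask task;
            while ((task = pendingTasks.poll()) != null) {
                task.result.complete(write(task.payload));
            }
        }
    }

    /** Returns a copy of everything written so far. */
    List<String> written() {
        synchronized (pendingTasks) {
            return new ArrayList<>(log);
        }
    }

    // Only called while holding the lock on pendingTasks.
    private int write(String payload) {
        log.add(payload);
        return log.size() - 1;
    }
}
```

   In this sketch, ordering is preserved because queued tasks are drained
   before any later append can take the direct path; both paths run under
   the same lock.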



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
