lhotari commented on code in PR #24623:
URL: https://github.com/apache/pulsar/pull/24623#discussion_r2306463882


##########
managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/ActiveManagedCursorContainerImpl.java:
##########
@@ -0,0 +1,836 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.bookkeeper.mledger.impl;
+
+import static java.util.Objects.requireNonNull;
+import com.google.common.annotations.VisibleForTesting;
+import java.util.ArrayList;
+import java.util.Comparator;
+import java.util.HashMap;
+import java.util.Iterator;
+import java.util.List;
+import java.util.Map;
+import java.util.PriorityQueue;
+import java.util.concurrent.ConcurrentHashMap;
+import java.util.concurrent.ConcurrentMap;
+import java.util.concurrent.locks.StampedLock;
+import lombok.extern.slf4j.Slf4j;
+import org.apache.bookkeeper.mledger.ManagedCursor;
+import org.apache.bookkeeper.mledger.Position;
+import org.apache.commons.lang3.mutable.MutableInt;
+import org.apache.commons.lang3.tuple.Pair;
+
+/**
+ * Implementation of {@link ActiveManagedCursorContainer} that tracks active 
cursors for cache eviction purposes.
+ * This implementation is optimized for the use with 
cacheEvictionByExpectedReadCount. It doesn't implement
+ * the {@link #cursorUpdated(ManagedCursor, Position)} method, as it is not 
needed for this use case. Cursors
+ * are updated using the {@link #updateCursor(ManagedCursor, Position)} method 
instead. This allows lazy updates
+ * to track the ordering of cursors and their positions without the need to 
immediately reorder the list of cursors
+ * on every cursor update. The cacheEvictionByExpectedReadCount use case will 
only need to know the number of cursors
+ * and to be able to know how many cursors are at the same position or before 
a given position when a backlogged read
+ * is performed. When cursors are all performing tailing reads, the ordering 
of cursors is not important and can be
+ * updated lazily when needed.
+ */
+@Slf4j
+public class ActiveManagedCursorContainerImpl implements 
ActiveManagedCursorContainer {
+    private static class Node {
+        final ManagedCursor cursor;
+        Position position;
+        Position pendingPosition;
+        boolean pendingRemove = false;
+        MutableInt numberOfCursorsAtSamePositionOrBefore;

Review Comment:
   `int` doesn't work for this case since the instance for 
`numberOfCursorsAtSamePositionOrBefore` is shared for all nodes at the same 
position. For nodes at the same position, the single shared `MutableInt` 
instance gets updated when there are changes in ordering. 
   
   Regarding the algorithm, I did try a version where positions would become 
nodes and there would be a count for the position that would be kept 
up-to-date. However, that had a lot more overhead compared to what is the 
current implementation. The current implementation performs well for both 
shared positions and individual positions without generating garbage since it's 
just field updates that happen when there are changes. 
   The position node would also require creating a new object instance and some 
construct to keep the nodes in the position. That's why I discarded that 
solution and sticked with the double-linked list.
   
   Another important feature in the ActiveManagedCursorContainerImpl is the 
ability to postpone updates until the `numberOfCursorsAtSamePositionOrBefore` 
information is actually needed. It's only needed when there are backlogged 
reads and therefore when cursors stay "on the tail", this calculation won't 
need to happen. This is the reason why I split the 
`ActiveManagedCursorContainer` interface from `ManagedCursorContainer` so that 
updates don't need to return the oldest position in 
`ActiveManagedCursorContainer` immediately after the update.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Reply via email to