lhotari commented on code in PR #24623: URL: https://github.com/apache/pulsar/pull/24623#discussion_r2306463882
########## managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/ActiveManagedCursorContainerImpl.java: ########## @@ -0,0 +1,836 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ +package org.apache.bookkeeper.mledger.impl; + +import static java.util.Objects.requireNonNull; +import com.google.common.annotations.VisibleForTesting; +import java.util.ArrayList; +import java.util.Comparator; +import java.util.HashMap; +import java.util.Iterator; +import java.util.List; +import java.util.Map; +import java.util.PriorityQueue; +import java.util.concurrent.ConcurrentHashMap; +import java.util.concurrent.ConcurrentMap; +import java.util.concurrent.locks.StampedLock; +import lombok.extern.slf4j.Slf4j; +import org.apache.bookkeeper.mledger.ManagedCursor; +import org.apache.bookkeeper.mledger.Position; +import org.apache.commons.lang3.mutable.MutableInt; +import org.apache.commons.lang3.tuple.Pair; + +/** + * Implementation of {@link ActiveManagedCursorContainer} that tracks active cursors for cache eviction purposes. + * This implementation is optimized for the use with cacheEvictionByExpectedReadCount. It doesn't implement + * the {@link #cursorUpdated(ManagedCursor, Position)} method, as it is not needed for this use case. Cursors + * are updated using the {@link #updateCursor(ManagedCursor, Position)} method instead. This allows lazy updates + * to track the ordering of cursors and their positions without the need to immediately reorder the list of cursors + * on every cursor update. The cacheEvictionByExpectedReadCount use case will only need to know the number of cursors + * and to be able to know how many cursors are at the same position or before a given position when a backlogged read + * is performed. When cursors are all performing tailing reads, the ordering of cursors is not important and can be + * updated lazily when needed. + */ +@Slf4j +public class ActiveManagedCursorContainerImpl implements ActiveManagedCursorContainer { + private static class Node { + final ManagedCursor cursor; + Position position; + Position pendingPosition; + boolean pendingRemove = false; + MutableInt numberOfCursorsAtSamePositionOrBefore; Review Comment: `int` doesn't work for this case since the instance for `numberOfCursorsAtSamePositionOrBefore` is shared for all nodes at the same position. For nodes at the same position, the single shared `MutableInt` instance gets updated when there are changes in ordering. Regarding the algorithm, I did try a version where positions would become nodes and there would be a count for the position that would be kept up-to-date. However, that had a lot more overhead compared to what is the current implementation. The current implementation performs well for both shared positions and individual positions without generating garbage since it's just field updates that happen when there are changes. The position node would also require creating a new object instance and some construct to keep the nodes in the position. That's why I discarded that solution and sticked with the double-linked list. Another important feature in the ActiveManagedCursorContainerImpl is the ability to postpone updates until the `numberOfCursorsAtSamePositionOrBefore` information is actually needed. It's only needed when there are backlogged reads and therefore when cursors stay "on the tail", this calculation won't need to happen. This is the reason why I split the `ActiveManagedCursorContainer` interface from `ManagedCursorContainer` so that updates don't need to return the oldest position in `ActiveManagedCursorContainer` immediately after the update. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@pulsar.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org