Allan Clements created TINKERPOP-3113:
-----------------------------------------

             Summary: UnifiedChannelizer Retains Sessions Forever
                 Key: TINKERPOP-3113
                 URL: https://issues.apache.org/jira/browse/TINKERPOP-3113
             Project: TinkerPop
          Issue Type: Bug
          Components: server
    Affects Versions: 3.7.2
            Reporter: Allan Clements
         Attachments: dominator_tree.png, path_to_gc_root.png

([Prior Discord Thread|https://discord.com/channels/838910279550238720/1290792261117939722/1290792261117939722])

I believe UnifiedChannelizer is causing sessions to be retained until either the channel is closed or an OOME occurs.

My JanusGraph deployment would generally OOME after a couple of hours. After collecting some heap dumps, I noticed that over 40k traversal bytecode submissions to the server were still in the dump according to the dominator tree view of the Eclipse Memory Analyzer Tool (dominator_tree.png, attached).

Inspecting further, I picked one instance of the bytecode. Its path to the GC root, which prevents its collection, ran through a CloseFuture associated with its initialChannel (path_to_gc_root.png, attached). So if the channel were closed regularly, that might sidestep the issue.

In either case, I was able to reproduce it in a toy Java application using the official driver, by continuously submitting traversals with large payloads via an inject() step without closing the connection. I also reproduced it in a toy Rust application using the unofficial driver.

I believe the issue originates [here|https://github.com/apache/tinkerpop/blob/b7c9ddda16a3d059b2a677f578f131a7124187b6/gremlin-server/src/main/java/org/apache/tinkerpop/gremlin/server/handler/AbstractSession.java#L166-L170]. The code registers a listener on the channel's close future using a closure that just calls close(). However, in doing so the closure captures a reference to the current object. SingleTaskSession, a child class of AbstractSession, holds a reference to its [onlySessionTask here|https://github.com/apache/tinkerpop/blob/a6777d0c8394f95c5aa400b965e8f4f996db1abd/gremlin-server/src/main/java/org/apache/tinkerpop/gremlin/server/handler/SingleTaskSession.java#L33].
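
The capture described above can be sketched in plain Java, with no Netty or TinkerPop classes involved. Everything here is a hypothetical stand-in: FakeChannel models a long-lived channel and its close listeners, FakeSession models a single-request session, and the byte array models the request's bytecode. It is only meant to illustrate how a listener that calls an instance method keeps the whole session reachable for as long as the channel stays open, and how deregistering on normal completion would drop that reference. It is not the server's actual code, nor a proposed patch.

```java
import java.util.ArrayList;
import java.util.List;

public class CloseListenerRetention {

    /** Hypothetical stand-in for a long-lived channel and its close future. */
    static final class FakeChannel {
        private final List<Runnable> closeListeners = new ArrayList<>();

        void addCloseListener(Runnable listener) {
            closeListeners.add(listener);
        }

        boolean removeCloseListener(Runnable listener) {
            return closeListeners.remove(listener);
        }

        int listenerCount() {
            return closeListeners.size();
        }
    }

    /** Hypothetical stand-in for a session that serves exactly one request. */
    static final class FakeSession {
        private final byte[] payload;          // models the request's bytecode
        private final FakeChannel channel;
        private final Runnable onChannelClose;
        private boolean closed;

        FakeSession(FakeChannel channel, int payloadSize) {
            this.channel = channel;
            this.payload = new byte[payloadSize];
            // The lambda calls an instance method, so it captures `this`.
            // While the channel holds the listener, it also holds the
            // session and, through it, the payload.
            this.onChannelClose = () -> close();
            channel.addCloseListener(onChannelClose);
        }

        void close() {
            closed = true;
        }

        /** Completes normally but leaves the listener registered. */
        void finishWithoutDeregistering() {
            close();
        }

        /** Completes normally and removes the listener (one possible remedy). */
        void finishAndDeregister() {
            close();
            channel.removeCloseListener(onChannelClose);
        }
    }

    /** Runs `requests` sessions on one open channel; returns listeners left behind. */
    static int listenersAfter(int requests, boolean deregister) {
        FakeChannel channel = new FakeChannel();
        for (int i = 0; i < requests; i++) {
            FakeSession session = new FakeSession(channel, 1024);
            if (deregister) {
                session.finishAndDeregister();
            } else {
                session.finishWithoutDeregistering();
            }
        }
        return channel.listenerCount();
    }

    public static void main(String[] args) {
        System.out.println("leaky: " + listenersAfter(1000, false));        // prints "leaky: 1000"
        System.out.println("deregistering: " + listenersAfter(1000, true)); // prints "deregistering: 0"
    }
}
```

In the first case every finished session is still referenced from the channel's listener list, so none of the payloads can be collected until the channel closes. In the second case the list is empty once each session completes.
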
[SessionTask|https://github.com/apache/tinkerpop/blob/a6777d0c8394f95c5aa400b965e8f4f996db1abd/gremlin-server/src/main/java/org/apache/tinkerpop/gremlin/server/handler/SessionTask.java] extends Context, which holds a reference to the traversal's [Bytecode here|https://github.com/apache/tinkerpop/blob/a6777d0c8394f95c5aa400b965e8f4f996db1abd/gremlin-server/src/main/java/org/apache/tinkerpop/gremlin/server/Context.java#L62].

There doesn't appear to be a "happy path" removal of the anonymously created listener from the channel's closeFuture() once the session completes normally (for example, when a traversal is fully iterated).

As a side note, that anonymously created listener may also be problematic because it exposes the object under construction before the constructor completes. This can cause problems in multi-threaded environments, assuming the underlying asynchronous runtime permits multi-threaded operation. Imagine the channel closing while the constructor is still executing, after the listener has been added.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)