Allan Clements created TINKERPOP-3113:
-----------------------------------------

             Summary: UnifiedChannelizer Retains Sessions Forever
                 Key: TINKERPOP-3113
                 URL: https://issues.apache.org/jira/browse/TINKERPOP-3113
             Project: TinkerPop
          Issue Type: Bug
          Components: server
    Affects Versions: 3.7.2
            Reporter: Allan Clements
         Attachments: dominator_tree.png, path_to_gc_root.png

([Prior Discord 
Thread|https://discord.com/channels/838910279550238720/1290792261117939722/1290792261117939722])

I believe UnifiedChannelizer retains sessions until either the channel is 
closed or an OOME occurs.

 

My JanusGraph deployment would generally OOME after a couple of hours. After 
collecting some heap dumps, I noticed that over 40k traversal bytecode 
submissions to the server were still present in the dump, according to the 
dominator tree view of the Eclipse Memory Analyzer Tool (see the attached 
dominator_tree.png).

 

Inspecting further, I picked an instance of the bytecode, and the path to the 
GC root that prevented its collection appeared to go through a CloseFuture 
associated with its initialChannel (see the attached path_to_gc_root.png).

 

So if the channel were closed regularly, the issue might be avoided. In either 
case, I was able to reproduce it in a toy Java application using the official 
driver, by continuously submitting traversals with large payloads via an 
inject() step without closing the connection, and also in a toy Rust 
application using the unofficial driver.

 

I believe the issue originates 
[here|https://github.com/apache/tinkerpop/blob/b7c9ddda16a3d059b2a677f578f131a7124187b6/gremlin-server/src/main/java/org/apache/tinkerpop/gremlin/server/handler/AbstractSession.java#L166-L170].
 The code attaches a listener to the channel's close future using a closure 
that just calls close(). In doing so, however, the closure captures a reference 
to the current object. SingleTaskSession, a child class of AbstractSession, 
holds a reference to its 
[onlySessionTask 
here|https://github.com/apache/tinkerpop/blob/a6777d0c8394f95c5aa400b965e8f4f996db1abd/gremlin-server/src/main/java/org/apache/tinkerpop/gremlin/server/handler/SingleTaskSession.java#L33].
 
[SessionTask|https://github.com/apache/tinkerpop/blob/a6777d0c8394f95c5aa400b965e8f4f996db1abd/gremlin-server/src/main/java/org/apache/tinkerpop/gremlin/server/handler/SessionTask.java] 
extends Context, which holds a reference to the traversal's [Bytecode 
here|https://github.com/apache/tinkerpop/blob/a6777d0c8394f95c5aa400b965e8f4f996db1abd/gremlin-server/src/main/java/org/apache/tinkerpop/gremlin/server/Context.java#L62].
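
The reference chain described above can be illustrated with a minimal sketch. 
It uses plain Java stand-ins rather than Netty or TinkerPop classes: 
FakeChannel, FakeSession, and listenersAfter are hypothetical names invented 
for illustration, and the list of Runnables only approximates how a close 
future keeps its listeners. The point it shows is that a long-lived channel 
keeps every listener, and each listener keeps the session (and its payload) 
reachable even after the session has finished.

```java
import java.util.ArrayList;
import java.util.List;

public class ListenerRetentionSketch {

    /** Hypothetical stand-in for a long-lived channel and its close future. */
    static class FakeChannel {
        final List<Runnable> closeListeners = new ArrayList<>();

        void addCloseListener(Runnable listener) {
            closeListeners.add(listener);
        }
    }

    /** Hypothetical stand-in for a session that owns one request payload. */
    static class FakeSession {
        final byte[] payload;
        boolean closed;

        FakeSession(FakeChannel channel, int payloadBytes) {
            this.payload = new byte[payloadBytes];
            // The lambda calls an instance method, so it captures `this`.
            channel.addCloseListener(() -> close());
        }

        void close() {
            closed = true;
        }
    }

    /** Runs `requests` sessions to completion; returns how many listeners remain. */
    static int listenersAfter(int requests) {
        FakeChannel channel = new FakeChannel();
        for (int i = 0; i < requests; i++) {
            FakeSession session = new FakeSession(channel, 1024);
            session.close(); // the session finishes normally...
        }
        // ...but each listener, and the session it captured, is still registered.
        return channel.closeListeners.size();
    }

    public static void main(String[] args) {
        System.out.println(listenersAfter(1000)); // prints 1000
    }
}
```

In this sketch nothing is released until the channel object itself becomes 
unreachable, which matches the behaviour seen in the heap dump.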

 

There doesn't appear to be a "happy path" removal of the anonymously created 
listener for the channel's closeFuture() once the session completes normally 
(like when a traversal is fully iterated).
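
One possible shape for such a "happy path" removal is sketched below, again 
with plain Java stand-ins. FakeChannel, FakeSession, open, and 
removeCloseListener are hypothetical names, not TinkerPop's API. The sketch 
assumes the session keeps a reference to the exact listener instance it 
registered so that it can deregister that instance on completion. Netty's 
Future interface does offer removeListener(...), but whether that is the 
appropriate fix for AbstractSession is an assumption here.

```java
import java.util.ArrayList;
import java.util.List;

public class ListenerRemovalSketch {

    /** Hypothetical stand-in for a channel's close future. */
    static class FakeChannel {
        final List<Runnable> closeListeners = new ArrayList<>();

        void addCloseListener(Runnable listener) {
            closeListeners.add(listener);
        }

        void removeCloseListener(Runnable listener) {
            closeListeners.remove(listener);
        }
    }

    /** Hypothetical session that deregisters its own listener when it closes. */
    static class FakeSession {
        private final FakeChannel channel;
        private final byte[] payload;
        // Keep the listener instance so the same object can be removed later.
        private final Runnable closeListener = this::close;
        private boolean closed;

        private FakeSession(FakeChannel channel, int payloadBytes) {
            this.channel = channel;
            this.payload = new byte[payloadBytes];
        }

        /** Registers the listener only after construction has finished. */
        static FakeSession open(FakeChannel channel, int payloadBytes) {
            FakeSession session = new FakeSession(channel, payloadBytes);
            channel.addCloseListener(session.closeListener);
            return session;
        }

        void close() {
            if (closed) {
                return; // idempotent: safe if invoked on both paths
            }
            closed = true;
            channel.removeCloseListener(closeListener);
        }
    }

    /** Runs `requests` sessions to completion; returns how many listeners remain. */
    static int listenersAfter(int requests) {
        FakeChannel channel = new FakeChannel();
        for (int i = 0; i < requests; i++) {
            FakeSession.open(channel, 1024).close();
        }
        return channel.closeListeners.size();
    }

    public static void main(String[] args) {
        System.out.println(listenersAfter(1000)); // prints 0
    }
}
```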

 

As a side note, that anonymously created listener may also be problematic 
because it exposes the object being constructed before the constructor 
completes. This can cause problems in multi-threaded environments, assuming the 
underlying asynchronous runtime permits multi-threaded operation. Imagine the 
channel closing concurrently with the constructor, which is still executing 
after the listener was added.
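
The constructor hazard can be sketched deterministically. The real scenario 
is a concurrent close; to avoid depending on thread timing, this sketch 
instead assumes a hypothetical ClosedChannel that runs the listener 
immediately when it is added. EagerSession, SafeSession, and create are 
likewise invented names. EagerSession registers the listener before its 
field is assigned, so close() observes a half-constructed object. 
SafeSession registers from a factory method after construction.

```java
public class ThisEscapeSketch {

    /** Hypothetical channel that is already closed: listeners fire at once. */
    static class ClosedChannel {
        void addCloseListener(Runnable listener) {
            listener.run();
        }
    }

    /** Registers the listener inside the constructor, so `this` escapes early. */
    static class EagerSession {
        final String id;
        String seenAtClose = "not closed";

        EagerSession(ClosedChannel channel, String id) {
            channel.addCloseListener(() -> close()); // `this` escapes here
            this.id = id;                            // assigned too late
        }

        void close() {
            seenAtClose = "id=" + id;
        }
    }

    /** Registers the listener only after the constructor has returned. */
    static class SafeSession {
        final String id;
        String seenAtClose = "not closed";

        private SafeSession(String id) {
            this.id = id;
        }

        static SafeSession create(ClosedChannel channel, String id) {
            SafeSession session = new SafeSession(id);
            channel.addCloseListener(session::close);
            return session;
        }

        void close() {
            seenAtClose = "id=" + id;
        }
    }

    public static void main(String[] args) {
        ClosedChannel channel = new ClosedChannel();
        System.out.println(new EagerSession(channel, "s1").seenAtClose);   // prints id=null
        System.out.println(SafeSession.create(channel, "s1").seenAtClose); // prints id=s1
    }
}
```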



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
