[ 
https://issues.apache.org/jira/browse/TINKERPOP-3113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17887416#comment-17887416
 ] 

Allan Clements commented on TINKERPOP-3113:
-------------------------------------------

FWIW, if the fix is just to remove the listener on the happy path, I'd be 
happy to try to tackle this, but I wanted to post the issue first in case 
doing so has larger architectural consequences I'm unaware of.
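
To make the idea concrete, here is a minimal sketch of what a happy-path removal could look like. It does not use the actual gremlin-server or Netty classes: CloseNotifier, Session, start() and complete() are hypothetical stand-ins for the channel's closeFuture() and the session, and the payload size is arbitrary. The point is only that the listener is kept in a field so the same instance can be deregistered once the session finishes normally.

```java
import java.util.ArrayList;
import java.util.List;

public class ListenerRemovalSketch {

    /** Hypothetical stand-in for a channel's close future; not Netty's API. */
    static final class CloseNotifier {
        private final List<Runnable> listeners = new ArrayList<>();

        synchronized void addListener(Runnable l) { listeners.add(l); }
        synchronized void removeListener(Runnable l) { listeners.remove(l); }
        synchronized int listenerCount() { return listeners.size(); }
    }

    /** Hypothetical session holding a large payload, like a task's bytecode. */
    static final class Session {
        private final byte[] payload;
        private final CloseNotifier notifier;
        // Keep the listener in a field so the same instance can be removed later.
        private final Runnable closeListener = this::close;

        Session(CloseNotifier notifier, int payloadSize) {
            this.notifier = notifier;
            this.payload = new byte[payloadSize];
        }

        /** Register for channel close only once construction has finished. */
        void start() {
            notifier.addListener(closeListener);
        }

        /** Happy path: deregister so the long-lived notifier no longer references this session. */
        void complete() {
            notifier.removeListener(closeListener);
        }

        void close() { /* release resources on channel close */ }
    }

    public static void main(String[] args) {
        CloseNotifier channelCloseFuture = new CloseNotifier();
        for (int i = 0; i < 1000; i++) {
            Session s = new Session(channelCloseFuture, 1024);
            s.start();
            s.complete(); // without this call, all 1000 sessions stay reachable
        }
        System.out.println("listeners retained: " + channelCloseFuture.listenerCount());
    }
}
```

With the complete() call in place the notifier ends up holding no listeners; dropping it leaves one listener (and therefore one session plus its payload) per request for as long as the channel stays open.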

> UnifiedChannelizer Retains Sessions Forever
> -------------------------------------------
>
>                 Key: TINKERPOP-3113
>                 URL: https://issues.apache.org/jira/browse/TINKERPOP-3113
>             Project: TinkerPop
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.7.2
>            Reporter: Allan Clements
>            Priority: Major
>         Attachments: dominator_tree.png, path_to_gc_root.png
>
>
> ([Prior Discord 
> Thread|https://discord.com/channels/838910279550238720/1290792261117939722/1290792261117939722])
> I believe UnifiedChannelizer is causing sessions to be retained until 
> either the channel is closed or an OOME occurs.
>  
> My JanusGraph deployment would generally OOME after a couple hours. After 
> collecting some heap dumps I noticed over 40k traversal bytecode submissions 
> to the server were still in the dump according to the Eclipse Memory Analyzer 
> Tool's dominator tree view, attached below.
>  
> Inspecting further, I picked an instance of the bytecode, and it appeared that 
> the path to the GC root, which prevents its collection, was through a 
> CloseFuture associated with its initialChannel, also attached below.
>  
> So if the channel were regularly closed, that may sidestep the issue. In either 
> case, I was able to reproduce it in a toy Java application using the official 
> driver by continuously submitting traversals with large payloads via an 
> inject() step without closing the connection, and also in a toy Rust 
> application using the unofficial driver.
>  
> I believe the issue originates 
> [here|https://github.com/apache/tinkerpop/blob/b7c9ddda16a3d059b2a677f578f131a7124187b6/gremlin-server/src/main/java/org/apache/tinkerpop/gremlin/server/handler/AbstractSession.java#L166-L170].
>  The code registers a listener on the channel's closeFuture() using a closure 
> that just calls close(). However, in doing so it captures a reference to the 
> current object. SingleTaskSession, a child class of AbstractSession, holds a 
> reference to its 
> [onlySessionTask 
> here|https://github.com/apache/tinkerpop/blob/a6777d0c8394f95c5aa400b965e8f4f996db1abd/gremlin-server/src/main/java/org/apache/tinkerpop/gremlin/server/handler/SingleTaskSession.java#L33].
>  [SessionTask|https://github.com/apache/tinkerpop/blob/a6777d0c8394f95c5aa400b965e8f4f996db1abd/gremlin-server/src/main/java/org/apache/tinkerpop/gremlin/server/handler/SessionTask.java] 
> extends Context, which holds a reference to the traversal's [Bytecode 
> here|https://github.com/apache/tinkerpop/blob/a6777d0c8394f95c5aa400b965e8f4f996db1abd/gremlin-server/src/main/java/org/apache/tinkerpop/gremlin/server/Context.java#L62].
>  
> There doesn't appear to be a "happy path" removal of the anonymously created 
> listener for the channel's closeFuture() once the session completes normally 
> (like when a traversal is fully iterated).
>  
> As a side note, that anonymously created listener may also be problematic 
> because it exposes the object being constructed before the constructor 
> completes. This can cause problems in multi-threaded environments, assuming 
> the underlying asynchronous runtime permits multi-threaded operation. Imagine 
> the channel closing concurrently with the constructor, after the listener 
> has been added but before construction has finished.

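The constructor-escape concern in the side note can be illustrated with a small sketch. Again, this is not the gremlin-server code: CloseNotifier, Session and create() are hypothetical names, and the static factory is just one common way to avoid publishing `this` early, not necessarily the right fix for AbstractSession. The listener is registered only after the constructor has returned, so a concurrent close can never observe a half-built object.

```java
import java.util.ArrayList;
import java.util.List;

public class SafePublicationSketch {

    /** Hypothetical stand-in for a channel's close future; not Netty's API. */
    static final class CloseNotifier {
        private final List<Runnable> listeners = new ArrayList<>();

        synchronized void addListener(Runnable l) { listeners.add(l); }

        /** Simulates the channel closing: runs every registered listener. */
        void fire() {
            List<Runnable> snapshot;
            synchronized (this) { snapshot = new ArrayList<>(listeners); }
            snapshot.forEach(Runnable::run);
        }
    }

    static final class Session {
        private final String id;
        private volatile boolean closed;

        private Session(String id) {
            this.id = id; // no reference to 'this' leaves the constructor
        }

        /** Construct fully first, then register the listener. */
        static Session create(String id, CloseNotifier notifier) {
            Session s = new Session(id);
            notifier.addListener(s::close);
            return s;
        }

        void close() { closed = true; }
        boolean isClosed() { return closed; }
        String id() { return id; }
    }

    public static void main(String[] args) {
        CloseNotifier notifier = new CloseNotifier();
        Session s = Session.create("session-1", notifier);
        notifier.fire();
        System.out.println(s.id() + " closed: " + s.isClosed());
    }
}
```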


--
This message was sent by Atlassian Jira
(v8.20.10#820010)
