[ 
https://issues.apache.org/jira/browse/TINKERPOP-3087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Bennington updated TINKERPOP-3087:
----------------------------------------
    Attachment: tinkerpop-3087.diff

> Panic in gremlin-go driver responseHandler
> ------------------------------------------
>
>                 Key: TINKERPOP-3087
>                 URL: https://issues.apache.org/jira/browse/TINKERPOP-3087
>             Project: TinkerPop
>          Issue Type: Bug
>          Components: go
>    Affects Versions: 3.7.1
>         Environment: Ubuntu 20.04, x86_64, AWS EC2 image. Think it's 2c2g, 
> can't recall off the top of my head.
> Compiled _without_ CGO.
>            Reporter: David Bennington
>            Priority: Minor
>         Attachments: tinkerpop-3087.diff, tinkerpop-3087.txt
>
>
> Occasionally, I get the following panic in the gremlin-go driver:
>  
> {noformat}
> 10:22:39.895 <successfully added a vertex to janusgraph>
> 10:22:44.487 panic: runtime error: invalid memory address or nil pointer 
> dereference
> 10:22:44.487 [signal SIGSEGV: segmentation violation code=0x1 addr=0x78 
> pc=0x11fdd16]
> 10:22:44.487 
> 10:22:44.487 goroutine 275 [running]:
> 10:22:44.487 
> github.com/apache/tinkerpop/gremlin-go/v3/driver.(*gremlinServerWSProtocol).responseHandler(0xc0006be6c0,
>  0xc0001dfc60, {{0x9, 0xf6, 0x33, 0x7f, 0x19, 0x16, 0x4e, 0xc9, ...}, ...})
> 10:22:44.487 
> /go/pkg/mod/github.com/apache/tinkerpop/gremlin-go/v3@v3.7.1/driver/protocol.go:116
>  +0x7d6
> 10:22:44.487 
> github.com/apache/tinkerpop/gremlin-go/v3/driver.(*gremlinServerWSProtocol).readLoop(0xc0006be6c0,
>  0xc0001dfc60, 0xc0001dfca0)
> 10:22:44.487 
> /go/pkg/mod/github.com/apache/tinkerpop/gremlin-go/v3@v3.7.1/driver/protocol.go:82
>  +0x272
> 10:22:44.487 created by 
> github.com/apache/tinkerpop/gremlin-go/v3/driver.newGremlinServerWSProtocol 
> in goroutine 272
> 10:22:44.487 
> /go/pkg/mod/github.com/apache/tinkerpop/gremlin-go/v3@v3.7.1/driver/protocol.go:197
>  +0x23a{noformat}
> I can't reliably reproduce this, and it has never happened locally. It has 
> occurred just after doing some work, as in the logs above, but it has also 
> happened after a period of doing nothing.
>  
> At first glance, it looks like a nil is being returned from the `resultSets` 
> synchronizedMap because something is getting in and removing the entry, which 
> should only happen when the `channelResultSet` itself is closed. That seems 
> like a race condition to me.
> I think the sensible way to handle this, if it is what I suspect (I've no 
> proof yet), would be to
>  * Store a reference to the ResultSet when first loading it in responseHandler
>  * Make `channelResultSet` discard modifications after close OR
>  * responseHandler needs to acquire the `channelMutex` somehow for the 
> duration of modifications (I think I prefer this, it's more obvious what's 
> happening)
> I did try to get the project tests to run, but it's been _years_ since I've 
> had to use Maven and I can't seem to find a good idiot's guide to working with 
> the project from a non-Java POV - I need to know how to build the Docker 
> images the tests require. If someone could help me get started I'd be more 
> than happy to contribute a PR (although given the nature, a reliable test may 
> be hard to write).
> As it stands, I'm probably going to modify my fork and use a go.mod replace 
> to see if a fix would actually work in my deployments.
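
The first two options in the description (keep a reference to the result set once it is loaded, and have the result set discard modifications after close) can be sketched roughly as below. This is only an illustration of the suspected race and one way to guard against it, not the actual gremlin-go code: `resultSet`, `syncMap`, `handleResponse` and `errUnknownRequest` are hypothetical stand-ins for `channelResultSet`, the `resultSets` synchronizedMap and `responseHandler`, and it assumes the panic really is a nil coming back from the map lookup, which is unconfirmed.

```go
package main

import (
	"errors"
	"sync"
)

// resultSet is a hypothetical stand-in for the driver's per-request result set.
type resultSet struct {
	mu      sync.Mutex
	closed  bool
	results []string
}

// add appends a result and reports whether it was accepted.
// Writes that arrive after close are discarded instead of applied.
func (r *resultSet) add(v string) bool {
	r.mu.Lock()
	defer r.mu.Unlock()
	if r.closed {
		return false
	}
	r.results = append(r.results, v)
	return true
}

// close marks the result set as closed; later add calls become no-ops.
func (r *resultSet) close() {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.closed = true
}

// syncMap is a hypothetical mutex-guarded map keyed by request ID.
type syncMap struct {
	mu sync.Mutex
	m  map[string]*resultSet
}

func newSyncMap() *syncMap {
	return &syncMap{m: make(map[string]*resultSet)}
}

func (s *syncMap) store(id string, rs *resultSet) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.m[id] = rs
}

// load returns nil when the entry is missing, for example because another
// goroutine removed it between two lookups.
func (s *syncMap) load(id string) *resultSet {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.m[id]
}

func (s *syncMap) remove(id string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	delete(s.m, id)
}

// errUnknownRequest is a hypothetical error for a response with no result set.
var errUnknownRequest = errors.New("no result set for request ID")

// handleResponse looks the result set up exactly once, checks it for nil, and
// then only uses the local reference, so a concurrent remove cannot turn a
// later lookup into a nil dereference.
func handleResponse(sets *syncMap, id, payload string) error {
	rs := sets.load(id)
	if rs == nil {
		return errUnknownRequest
	}
	rs.add(payload) // silently dropped if rs was closed in the meantime
	return nil
}
```

With this shape, a response that arrives after its entry has been removed yields an error instead of a panic, and a response that arrives after close is dropped. The third option (holding a mutex across the whole handler) is not shown here.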



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
