[ 
https://issues.apache.org/jira/browse/TINKERPOP-2352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17064737#comment-17064737
 ] 

Stephen Mallette commented on TINKERPOP-2352:
---------------------------------------------

Thanks for your thoughts on this one. I don't think there's a problem with 
changing the default pool size to what you suggest, as long as we keep tests 
that continue to validate the behavior of a larger pool size somehow. A pull 
request would be great, especially one that included some more documentation of 
the type you describe. That said, I think an even nicer pull request would 
solve the keep-alive problem more generally, as described on TINKERPOP-1886. 
Fixing that would be a much more robust solution, and if you have the 
opportunity to help there it would be appreciated. If you could solve that, 
then perhaps we should just close this ticket in favor of that one and continue 
our discussion there. 

> Gremlin Python driver default pool size makes Gremlin keep-alive difficult
> --------------------------------------------------------------------------
>
>                 Key: TINKERPOP-2352
>                 URL: https://issues.apache.org/jira/browse/TINKERPOP-2352
>             Project: TinkerPop
>          Issue Type: Bug
>          Components: python
>    Affects Versions: 3.3.5, 3.4.5
>         Environment: AWS Lambda, Python 3.7 runtime, AWS Neptune.
> (AWS Lambda functions can remain in memory and thus hold connections open for 
> many minutes between invocations)
>            Reporter: Mark Br...e
>            Priority: Major
>
> I'm working with a Gremlin database that (like many) terminates connections 
> if they don't execute any transactions within a timeout period.  When we want 
> to run a traversal, we first check our `GraphTraversalSource` by running 
> `g.V().limit(1).count().next()`, and if that raises an exception we know we 
> need to reconnect before running the actual traversal.
> We've been very confused that this hasn't worked as expected: we 
> intermittently see traversals fail with `WebSocketClosed` or other 
> connection-related errors immediately after the "connection test" passes. 
> I've (finally) found that the cause of this inconsistency is the default pool 
> size in `gremlin_python.driver.client.Client` being 4.  This means there's no 
> visibility outside the `Client` of which connection in the pool is tested 
> and/or used, and in fact no way for the application (`GraphTraversalSource`) 
> to run keep-alive type traversals reliably.  Any time an application passes 
> in a pool size of `None` or a number > 1, there'll be no way to ensure that 
> each and every connection in the pool actually sends keep-alive traversals to 
> the remote, _except_ in the case of a single-threaded application where a 
> tight loop could issue `pool_size` of them.  In that latter case, as the 
> application is single-threaded, a `pool_size` above 1 won't provide much 
> benefit anyway.
> I've raised this as a bug because I think a default `pool_size` of 1 would 
> give much more predictable behaviour, and in the specific case of the Python 
> driver is probably more appropriate, because Python applications tend to run 
> single-threaded by default, with multi-threading carefully added when 
> performance requires it.  Perhaps this is really a wish rather than a bug, 
> but as the behaviour of the default option is quite confusing, it feels more 
> like a bug, at least.  If it would help, I'm happy to raise a PR with some 
> updated function header comments, or maybe updated documentation about 
> multi-threaded / multi-async-loop usage of gremlin-python.
> (This is my first issue here, apologies if it has some fields wrong.)
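
The "probe, then reconnect on failure" pattern described above can be sketched 
as follows. This is a minimal illustration, not code from the driver: the 
`ensure_alive` helper and `reconnect` callable are hypothetical names, and it 
assumes the probe and the subsequent real traversal share the same underlying 
connection, which (per the report) only holds when the pool has a single 
connection (`pool_size=1`).

```python
def ensure_alive(g, reconnect):
    """Run a cheap probe traversal; rebuild the source if it fails.

    g         -- a GraphTraversalSource (or any object with the same chain).
    reconnect -- zero-arg callable returning a fresh traversal source.
    Returns a traversal source whose connection just answered a probe.
    """
    try:
        # The probe traversal from the issue description.
        g.V().limit(1).count().next()
        return g
    except Exception:
        # The server closed the idle connection; build a new source.
        return reconnect()
```

With gremlin-python this would typically sit alongside a 
`DriverRemoteConnection` created with `pool_size=1` (endpoint URL here is a 
placeholder), so that the probe exercises the one and only pooled connection:

```python
# conn = DriverRemoteConnection('wss://host:8182/gremlin', 'g', pool_size=1)
# g = traversal().withRemote(conn)
# g = ensure_alive(g, reconnect=rebuild_source)
```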



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
