[jira] [Updated] (TINKERPOP-2820) gremlin-python _close_session race condition/FD leak

2023-04-25 Thread Valentyn Kahamlyk (Jira)


 [ 
https://issues.apache.org/jira/browse/TINKERPOP-2820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Valentyn Kahamlyk updated TINKERPOP-2820:
-
Fix Version/s: 3.7.0
   3.6.3
   3.5.6

> gremlin-python _close_session race condition/FD leak
> 
>
> Key: TINKERPOP-2820
> URL: https://issues.apache.org/jira/browse/TINKERPOP-2820
> Project: TinkerPop
> Issue Type: Bug
> Components: python
> Affects Versions: 3.6.1
> Reporter: Alex Hamilton
> Assignee: Valentyn Kahamlyk
> Priority: Critical
> Fix For: 3.7.0, 3.6.3, 3.5.6
>
>
> There is a race condition in gremlin-python when closing session-based 
> connections that results in leaking file descriptors for event loops - 
> eventually leading to an `OSError [Errno 24] too many open files` error after 
> enough transactions occur.
> The problem stems from a race condition when closing session-based
> connections that causes the event loop opened for the session's connection to
> be left open.
> The problem is completely contained in these two methods from
> `gremlin_python.driver.client.py`:
> ```py
> def close(self):
>     # prevent the Client from being closed more than once. it raises errors if new jobby jobs
>     # get submitted to the executor when it is shutdown
>     if self._closed:
>         return
>     if self._session_enabled:
>         self._close_session()  # 1. (see below)
>     log.info("Closing Client with url '%s'", self._url)
>     while not self._pool.empty():  # 3. (see below)
>         conn = self._pool.get(True)
>         conn.close()
>     self._executor.shutdown()
>     self._closed = True
>
> def _close_session(self):
>     message = request.RequestMessage(
>         processor='session', op='close',
>         args={'session': str(self._session)})
>     conn = self._pool.get(True)
>     return conn.write(message).result()  # 2. (see below)
> ```
> 1. `_close_session()` is called.
> 2. `.result()` waits for the _write_ to finish, but does *not* wait for the
> _read_ to finish. `conn` does not get put back into `self._pool` until AFTER
> the read finishes (`gremlin_python.driver.connection.Connection._receive()`).
> However, `_close_session()` returns early and execution continues at 3.
> 3. Because `conn` has not yet been put back, the pool is still empty and this
> while loop is never entered, so the connection is never closed. This leaves
> the conn's event loop running.
> I was able to solve this by modifying `_close_session` as follows:
> ```py
> def _close_session(self):
>     message = request.RequestMessage(
>         processor='session', op='close',
>         args={'session': str(self._session)})
>     conn = self._pool.get(True)
>     try:
>         write_result_set = conn.write(message).result()
>         return write_result_set.all().result() # wait for _receive() to finish
>     except protocol.GremlinServerError:
>         pass
> ```
> I'm not sure if this is the correct solution, but wanted to point out the bug.
> In the meantime, however, I wrote a context manager to handle this cleanup for
> me:
> ```py
> @contextlib.contextmanager
> def transaction():
>     tx = g.tx()
>     gtx = tx.begin()
>     try:
>         yield tx, gtx
>         tx.commit()
>     except Exception as e:
>         tx.rollback()
>     finally:
>         while not tx._session_based_connection._client._pool.empty():
>             conn = tx._session_based_connection._client._pool.get(True)
>             conn.close()
>             logger.info("Closed abandoned session connection")
> with transaction() as (tx, gtx):
>     foo = gtx.some_traversal().to_list()
>     # do something with foo
>     gtx.some_other_traversal().iterate()
> ```
> Cheers



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (TINKERPOP-2820) gremlin-python _close_session race condition/FD leak

2023-03-24 Thread Yang Xia (Jira)


 [ 
https://issues.apache.org/jira/browse/TINKERPOP-2820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Xia updated TINKERPOP-2820:

Priority: Critical  (was: Minor)

> gremlin-python _close_session race condition/FD leak
> 
>
> Key: TINKERPOP-2820
> URL: https://issues.apache.org/jira/browse/TINKERPOP-2820
> Project: TinkerPop
> Issue Type: Bug
> Components: python
> Affects Versions: 3.6.1
> Reporter: Alex Hamilton
> Priority: Critical
>





[jira] [Updated] (TINKERPOP-2820) gremlin-python _close_session race condition/FD leak

2022-11-04 Thread Alex Hamilton (Jira)


 [ 
https://issues.apache.org/jira/browse/TINKERPOP-2820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Hamilton updated TINKERPOP-2820:
-
Description: 
There is a race condition in gremlin-python when closing session-based 
connections that results in leaking file descriptors for event loops - 
eventually leading to an `OSError [Errno 24] too many open files` error after 
enough transactions occur.

The problem stems from a race condition when closing session-based connections
that causes the event loop opened for the session's connection to be left open.
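
To illustrate why an event loop that is never closed costs file descriptors, here is a minimal sketch that is independent of gremlin-python. It assumes a POSIX system, where each asyncio event loop typically holds a selector and a wake-up socket pair. `count_open_fds` is a helper made up for this example, not part of any library.

```python
import asyncio
import os


def count_open_fds(limit=1024):
    """Count descriptors below `limit` that are currently open (POSIX assumption)."""
    open_fds = 0
    for fd in range(limit):
        try:
            os.fstat(fd)
        except OSError:
            continue  # descriptor is not open
        open_fds += 1
    return open_fds


before = count_open_fds()

# Each new loop allocates its own descriptors and keeps them until close().
loops = [asyncio.new_event_loop() for _ in range(5)]
during = count_open_fds()

# Closing the loops releases the descriptors again.
for loop in loops:
    loop.close()
after = count_open_fds()

print(before, during, after)
```

If the loops were never closed, as happens to the session connection described here, the count would keep growing with every transaction until the process hits its descriptor limit.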

The problem is completely contained in these two methods from
`gremlin_python.driver.client.py`:

```py
def close(self):
    # prevent the Client from being closed more than once. it raises errors if new jobby jobs
    # get submitted to the executor when it is shutdown
    if self._closed:
        return

    if self._session_enabled:
        self._close_session()  # 1. (see below)
    log.info("Closing Client with url '%s'", self._url)
    while not self._pool.empty():  # 3. (see below)
        conn = self._pool.get(True)
        conn.close()
    self._executor.shutdown()
    self._closed = True

def _close_session(self):
    message = request.RequestMessage(
        processor='session', op='close',
        args={'session': str(self._session)})
    conn = self._pool.get(True)
    return conn.write(message).result()  # 2. (see below)
```

1. `_close_session()` is called.
2. `.result()` waits for the _write_ to finish, but does *not* wait for the
_read_ to finish. `conn` does not get put back into `self._pool` until AFTER
the read finishes (`gremlin_python.driver.connection.Connection._receive()`).
However, `_close_session()` returns early and execution continues at 3.
3. Because `conn` has not yet been put back, the pool is still empty and this
while loop is never entered, so the connection is never closed. This leaves the
conn's event loop running.
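
The three steps above can be reproduced with a small simulation. This is only a sketch under stated assumptions: `FakeConnection`, `drain`, `buggy_close`, `fixed_close` and `run` are stand-ins invented for illustration and do not use the gremlin-python classes. A `threading.Event` plays the role of the server reply so that the ordering is deterministic.

```python
import queue
import threading
from concurrent.futures import Future, ThreadPoolExecutor


class FakeConnection:
    """Stand-in for a pooled connection; not the gremlin-python class."""

    def __init__(self, pool, executor, reply_gate):
        self._pool = pool
        self._executor = executor
        self._reply_gate = reply_gate  # set once the simulated server replies
        self.closed = False

    def write(self, message):
        read_done = Future()

        def _write():
            # The write completes at once; the read is scheduled separately.
            self._executor.submit(self._receive, read_done)
            return read_done

        return self._executor.submit(_write)

    def _receive(self, read_done):
        self._reply_gate.wait()      # block until the "server" answers
        self._pool.put_nowait(self)  # only now is the connection returned
        read_done.set_result("session closed")

    def close(self):
        self.closed = True


def drain(pool):
    while not pool.empty():
        pool.get(True).close()


def buggy_close(pool):
    conn = pool.get(True)
    conn.write("close-session").result()  # waits for the write only
    drain(pool)                           # pool is still empty here


def fixed_close(pool):
    conn = pool.get(True)
    read_done = conn.write("close-session").result()
    read_done.result()                    # also wait for the read
    drain(pool)


def run(close_fn, reply_during_close):
    pool = queue.Queue()
    gate = threading.Event()
    with ThreadPoolExecutor(max_workers=2) as executor:
        conn = FakeConnection(pool, executor, gate)
        pool.put_nowait(conn)
        if reply_during_close:
            threading.Timer(0.05, gate.set).start()
        close_fn(pool)
        gate.set()  # release any pending read so the executor can exit
    return conn.closed


leaked = not run(buggy_close, reply_during_close=False)
cleaned = run(fixed_close, reply_during_close=True)
print(leaked, cleaned)
```

In the buggy variant the connection is put back into the pool only after the drain loop has already found it empty, so `close()` is never called on it.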

I was able to solve this by modifying `_close_session` as follows:

```py
def _close_session(self):
    message = request.RequestMessage(
        processor='session', op='close',
        args={'session': str(self._session)})
    conn = self._pool.get(True)
    try:
        write_result_set = conn.write(message).result()
        return write_result_set.all().result() # wait for _receive() to finish
    except protocol.GremlinServerError:
        pass
```

I'm not sure if this is the correct solution, but wanted to point out the bug.

In the meantime, however, I wrote a context manager to handle this cleanup for me:

```py
@contextlib.contextmanager
def transaction():
    tx = g.tx()
    gtx = tx.begin()

    try:
        yield tx, gtx
        tx.commit()
    except Exception as e:
        tx.rollback()
    finally:
        while not tx._session_based_connection._client._pool.empty():
            conn = tx._session_based_connection._client._pool.get(True)
            conn.close()
            logger.info("Closed abandoned session connection")

with transaction() as (tx, gtx):
    foo = gtx.some_traversal().to_list()

    # do something with foo
    gtx.some_other_traversal().iterate()
```
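
One property of the workaround above is worth noting: because its `except` branch does not re-raise, `contextlib.contextmanager` suppresses any exception raised inside the `with` body after the rollback. The following sketch shows a variant that propagates the error. `FakeTx` and its `cleanup` method are hypothetical stand-ins used only to record calls; they are not gremlin-python objects.

```python
import contextlib


class FakeTx:
    """Hypothetical stand-in for a transaction; it only records calls."""

    def __init__(self):
        self.calls = []

    def commit(self):
        self.calls.append("commit")

    def rollback(self):
        self.calls.append("rollback")

    def cleanup(self):
        # Placeholder for draining and closing pooled connections.
        self.calls.append("cleanup")


@contextlib.contextmanager
def transaction(tx):
    try:
        yield tx
        tx.commit()
    except Exception:
        tx.rollback()
        raise  # let the caller see the original error
    finally:
        tx.cleanup()  # runs on both the success and the failure path


ok = FakeTx()
with transaction(ok):
    pass

failed = FakeTx()
try:
    with transaction(failed):
        raise ValueError("boom")
except ValueError:
    propagated = True
else:
    propagated = False

print(ok.calls, failed.calls, propagated)
```

Whether swallowing or propagating the error is preferable depends on the caller; the sketch only makes the difference visible.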

Cheers

  was:
There is a race condition in gremlin-python when closing session-based 
connections that results in leaking file descriptors for event loops - 
eventually leading to an `OSError [Errno 24] too many open files` error after 
enough transactions occur.

The problem stems from a race condition when closing session based connections 
that causes the event loop opened for the session's connection to be left open.

The problem is completely contained in these two methods from 
`gremlin_python.driver.client.py`

```py
def close(self):
    # prevent the Client from being closed more than once. it raises errors if new jobby jobs
    # get submitted to the executor when it is shutdown
    if self._closed:
        return

    if self._session_enabled:
        self._close_session()  # 1. (see below)
    log.info("Closing Client with url '%s'", self._url)
    while not self._pool.empty():  # 3. (see below)
        conn = self._pool.get(True)
        conn.close()
    self._executor.shutdown()
    self._closed = True

def _close_session(self):
    message = request.RequestMessage(
        processor='session', op='close',
        args={'session': str(self._session)})
    conn = self._pool.get(True)
    return conn.write(message).result()  # 2. (see below)
```

1. `_close_session()` called
2. `.result()` waits for the _write_ to finish, but does *not* wait for
the _read_ to finish. `conn` does not get put back into `self._pool` until 
AFTER the read finishes 
(`gremlin_python.driver.connection.Connection._receive()`). However, this 
method returns early and goes to 3.
3. this while loop is not entered to close out the connections. This leaves the 
conn's event loop running, never to be closed.

I was able to solve this by modifying `_close_session` as follows:

```py
def _close_session(self):
    message = 

[jira] [Updated] (TINKERPOP-2820) gremlin-python _close_session race condition/FD leak

2022-11-04 Thread Alex Hamilton (Jira)


 [ 
https://issues.apache.org/jira/browse/TINKERPOP-2820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Hamilton updated TINKERPOP-2820:
-
Description: 
There is a race condition in gremlin-python when closing session-based 
connections that results in leaking file descriptors for event loops - 
eventually leading to an `OSError [Errno 24] too many open files` error after 
enough transactions occur.

The problem stems from a race condition when closing session-based connections
that causes the event loop opened for the session's connection to be left open.

The problem is completely contained in these two methods from
`gremlin_python.driver.client.py`:

```py
def close(self):
    # prevent the Client from being closed more than once. it raises errors if new jobby jobs
    # get submitted to the executor when it is shutdown
    if self._closed:
        return

    if self._session_enabled:
        self._close_session()  # 1. (see below)
    log.info("Closing Client with url '%s'", self._url)
    while not self._pool.empty():  # 3. (see below)
        conn = self._pool.get(True)
        conn.close()
    self._executor.shutdown()
    self._closed = True

def _close_session(self):
    message = request.RequestMessage(
        processor='session', op='close',
        args={'session': str(self._session)})
    conn = self._pool.get(True)
    return conn.write(message).result()  # 2. (see below)
```

1. `_close_session()` is called.
2. `.result()` waits for the _write_ to finish, but does *not* wait for the
_read_ to finish. `conn` does not get put back into `self._pool` until AFTER
the read finishes (`gremlin_python.driver.connection.Connection._receive()`).
However, `_close_session()` returns early and execution continues at 3.
3. Because `conn` has not yet been put back, the pool is still empty and this
while loop is never entered, so the connection is never closed. This leaves the
conn's event loop running.

I was able to solve this by modifying `_close_session` as follows:

```py
def _close_session(self):
    message = request.RequestMessage(
        processor='session', op='close',
        args={'session': str(self._session)})
    conn = self._pool.get(True)
    try:
        write_result_set = conn.write(message).result()
        return write_result_set.all().result() # wait for _receive() to finish
    except protocol.GremlinServerError:
        pass
```

I'm not sure if this is the correct solution, but wanted to point out the bug.

In the meantime, however, I wrote a context manager to handle this cleanup for me:

```py
@contextlib.contextmanager
def transaction():
    tx = g.tx()
    gtx = tx.begin()

    try:
        yield tx, gtx
        tx.commit()
    except Exception as e:
        tx.rollback()
    finally:
        while not tx._session_based_connection._client._pool.empty():
            conn = tx._session_based_connection._client._pool.get(True)
            conn.close()
            logger.info("Closed abandoned session connection")

with transaction() as (tx, gtx):
    foo = gtx.some_traversal().to_list()

    # do something with foo
    gtx.some_other_traversal().iterate()
```

Cheers

  was:
There is a race condition in gremlin-python when closing session-based 
connections that results in leaking file descriptors for event loops - 
eventually leading to an `OSError [Errno 24] too many open files` error after 
enough transactions occur.

The problem stems from a race condition when closing session based connections 
that causes the event loop opened for the session's connection to be left open.

The problem is completely contained in these two methods from 
`gremlin_python.driver.client.py`

```py
def close(self):
# prevent the Client from being closed more than once. it raises errors if new jobby jobs
# get submitted to the executor when it is shutdown
if self._closed:
return

if self._session_enabled:
self._close_session()  # 1. (see below)
log.info("Closing Client with url '%s'", self._url)
while not self._pool.empty():  # 3. (see below)
conn = self._pool.get(True)
conn.close()
self._executor.shutdown()
self._closed = True

def _close_session(self):
message = request.RequestMessage(
processor='session', op='close',
args={'session': str(self._session)})
conn = self._pool.get(True)
return conn.write(message).result()# 2. (see below)
```

1. `_close_session()` called
2. `.result()` waits for the _write_ to finish, but does __not__ wait for the 
_read_ to finish. `conn` does not get put back into `self._pool` until AFTER 
the read finishes (`gremlin_python.driver.connection.Connection._receive()`). 
However, this method returns early and goes to 3.
3. this while loop is not entered to close out the connections. This leaves the 
conn's event loop running, never to be closed.


I was able to solve this by modifying `_close_session` as follows:

```py
def _close_session(self):
message =