On 6/17/15 4:05 AM, Soeren Medard wrote:
Hi,
I have had an issue with bulk operations arising from the following
use case:
I was looping an iterator of mappings, which are either already in the
DB (inserts) or not (updates). As I did not want to loop over it twice
and wanted to do only on transaction, I used greenlets do split the
iterator and concurrently run `bulk_insert_mappings` and
`bulk_update_mappings` on the same session object.
that won't work, because it means you are running multiple greenlets
against a single database connection, and that is not thread/greenlet
safe assuming you are using a greenlet-monkeypatched DBAPI. you need
to synchronize the work of each greenlet, or use individual connections.
additionally, as the stack trace you've passed on indicates, the bulk
methods on Session still make use of a subtransaction internally, which
is stateful. So that's the specific error you're seeing, multiple
greenlets are competing for the state of the session.transaction object
which is a linked list of transaction nests. It absolutely is not
threadsafe (which means greenlets also). The Session doesn't have
much need to support concurrent threads/greenlets doing things on it
without locking because the DBAPI connections it refers to are virtually
never safe in to use this way in any case.
So, questions related:
1) Is there indeed a better way to run both bulk operations while
looping over an iterator only once?
2) Should this be considered a bug?
Versions used:
|
SQLAlchemy==1.0.5
greenlet==0.4.7
psycopg2==2.6.1
|
and Postgres 9.4.1, Python 3.4.3
So the async support for psycopg2 is ultimately using a single
Postgresql Connection with async=1, this is documented at
http://initd.org/psycopg/docs/advanced.html#asynchronous-support. We can
see here that this use is not supported: " Two cursors can’t execute
concurrent queries on the same asynchronous connection."
So your options are to either synchronize the work of the multiple
greenlets, which will pretty much eliminate any point to doing it that
way, or to use a connection/session per greenlet. At the end of the
day you're communicating on a TCP socket where an INSERT/UPDATE string
is sent along the wire and a response is being waited for, and multiple
statements cannot be simultaneously multiplexed on a single connection.
--
You received this message because you are subscribed to the Google Groups
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at http://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.