Re: [sqlalchemy] Custom Dialect - recommendations needed for handling of Sequences/lastrowid
On Jan 25, 2013, at 2:25 AM, jank wrote:

Hello, I have implemented a dialect for a new database (EXASol).

that's great. I'd like to point you to a new system we have for testing and deploying external dialects, where your dialect can be packaged with a standard layout and make use of a series of compliance suites within SQLAlchemy. Within this test system you can fully customize which capabilities your dialect supports. If you check out SQLAlchemy-Access and SQLAlchemy-Akiban, you can see the standard forms:

https://bitbucket.org/zzzeek/sqlalchemy-access
https://github.com/zzzeek/sqlalchemy_akiban

the key files within these packages for using the SQLAlchemy compliance suite are:

/run_tests.py - test runner, a front end to Nose
/setup.cfg - test runner configuration
/test/requirements.py - a custom SuiteRequirements class which provides rules for the features and behaviors supported by the database
/test/test_suite.py - pulls in the sqlalchemy.testing.suite package, which makes the suite tests visible to the Nose runner

the compliance suite is a work in progress and doesn't cover everything yet. Key areas it does cover are the whole INSERT/lastrowid mechanics you're concerned with here, database reflection, and basic SQL types.

I have not run tests using the ORM layer of SA so far, as I am primarily interested in the Core layer. So far things have worked out pretty well; DDL and DML support are basically running. The EXASol DB does not offer Sequences, but it has autoincrement columns that are very similar to Postgres SERIAL types. Example DDL statement:

CREATE TABLE test_exadialect.test (
    id INTEGER IDENTITY 10 NOT NULL,
    name VARCHAR(40) NOT NULL,
    age INTEGER,
    PRIMARY KEY (id)
)

IDENTITY is the keyword that adds autoincrement behavior to an integer-like column; 10 is the initial value of the autoincrement.
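As a side note on the "Sequence is optional" point: a Sequence marked optional=True is only rendered on backends that actually need a sequence to generate ids. A minimal sketch of this, compiling the Table from this thread against SQLite (standing in for EXASol, whose dialect is not part of SQLAlchemy itself):

```python
from sqlalchemy import Table, Column, Integer, String, MetaData, Sequence
from sqlalchemy.schema import CreateTable
from sqlalchemy.dialects import sqlite

metadata = MetaData()
t = Table('test', metadata,
          Column('id', Integer,
                 Sequence('test.id.seq', start=10, optional=True),
                 primary_key=True),
          Column('name', String(40), nullable=False),
          Column('age', Integer))

# optional=True means: use the sequence only if the backend has no other
# way to generate autoincrementing ids. SQLite does, so no sequence
# appears in the compiled DDL.
ddl = str(CreateTable(t).compile(dialect=sqlite.dialect()))
print(ddl)
```

A custom dialect like the EXASol one would render its own IDENTITY clause in its DDL compiler instead, using the sequence's start value.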
This DDL statement is generated from this table metadata:

Table('test', self.metadata,
    Column('id', Integer, Sequence('test.id.seq', start=10, optional=True), primary_key=True),
    Column('name', String(40), nullable=False),
    Column('age', Integer)
)

Looking at the Postgres dialect implementation, I came to the conclusion that using Sequences is the only way to get the desired behavior.

The best dialect for you to look at here would be the MSSQL dialect, lib/sqlalchemy/dialects/mssql/base.py, and perhaps the pyodbc implementation of it, lib/sqlalchemy/dialects/mssql/pyodbc.py. MSSQL's INSERT system resembles this the most: we use the Sequence to allow configurability of the IDENTITY column, and a post-fetch at the cursor level is used to get at the last inserted identity. The post-fetch is performed right on the same cursor that the INSERT occurred on and bypasses the usual SQLAlchemy mechanics of executing a statement, so to that degree the Python overhead of this post-fetch is negligible.

I have also implemented the get_lastrowid() method of the ExecutionContext class. This all works as expected, albeit at the cost of an additional round trip for each single insert, as the DB in question does not support RETURNING. First question: is this the intended way to implement autoincrement behavior in the absence of support for explicit sequence objects in the DB?

sounds like you're on the right track; the usage of Sequence is optional overall, but if you want configurability of the start value and all that, then yes.

Now to the problem that I could not solve so far: I want to make the cost of fetching the last autoincrement id upon insert/update optional.

the lastrowid mechanics only come into play when an Insert() construct is used. This construct supports a flag inline=True which is intended to indicate an INSERT where you don't need any of the default values back. If you execute a table.insert(inline=True)...
the entire lastrowid mechanics are bypassed; you can see this in sqlalchemy/engine/default.py line 663, post_insert(). This flag is also invoked automatically whenever the Insert() construct is used in an executemany context.

In our use case we are fine with the DB determining the next id value without knowing that value upon insert. I tried to fiddle around with various configuration switches, namely: postfetch_lastrowid

postfetch_lastrowid refers to whether or not the method of acquiring the last inserted id, when it is desired, is done via post-fetch, or whether the last inserted id is provided by some other method, which could be one of: pre-execute and embed in the INSERT, embed in the INSERT and use RETURNING, or use the DBAPI lastrowid() method. When this flag is False, in the absence of lastrowid() or RETURNING, the system behaves as though a pre-executed value were present, but since that isn't implemented either, you get a NULL. The flag does not indicate that the
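The single-row vs. executemany distinction described above can be demonstrated with a short, runnable sketch (using SQLite here, since the EXASol dialect isn't publicly available): a single-row INSERT post-fetches the generated id, exposed as inserted_primary_key, while an executemany runs with the inline behavior and skips the per-row post-fetch.

```python
from sqlalchemy import create_engine, MetaData, Table, Column, Integer, String

engine = create_engine("sqlite://")
metadata = MetaData()
t = Table("person", metadata,
          Column("id", Integer, primary_key=True),
          Column("name", String(40)))
metadata.create_all(engine)

with engine.begin() as conn:
    # single-row INSERT: the dialect post-fetches the generated id
    # (on SQLite, via the DBAPI cursor.lastrowid)
    result = conn.execute(t.insert().values(name="alice"))
    pk = result.inserted_primary_key[0]

    # executemany: runs with the "inline" behavior described above,
    # no per-row id post-fetch is performed
    conn.execute(t.insert(), [{"name": "bob"}, {"name": "carol"}])

    rows = conn.execute(t.select()).fetchall()
```

The same division of labor applies to a custom dialect: a get_lastrowid() implementation is only consulted on the single-row path.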
[sqlalchemy] Explicit locking
Hello, I have multiple processes that can potentially insert duplicate rows into the database. These inserts do not happen very frequently (a few times every hour), so it is not performance critical. I've tried an existence check before doing the insert, like so:

# Assume we're inserting a Camera object, a valid SQLAlchemy ORM object
# that inherits from declarative_base...
try:
    stmt = exists().where(Camera.id == camera_id)
    exists_result = session.query(Camera).with_lockmode('update').filter(stmt).first()
    if exists_result is None:
        session.add(Camera(...))  # lots of parameters, just assume it works
        session.commit()
except IntegrityError as e:
    session.rollback()

The problem I'm running into is that the exists() check doesn't lock the table, so there is a chance that multiple processes could attempt to insert the same object at the same time. In such a scenario, one process succeeds with the insert and the others fail with an IntegrityError exception. While this works, it doesn't feel clean to me, as I end up with gaps in the primary key IDs because failed inserts still increment the primary key counter. I would really like some way of locking the Camera table before doing the exists() check.

Thanks, Phil

p.s. I've also posted this on Stack Overflow: http://stackoverflow.com/questions/14520340/sqlalchemy-and-explicit-locking

-- You received this message because you are subscribed to the Google Groups sqlalchemy group. To post to this group, send email to sqlalchemy@googlegroups.com. To unsubscribe from this group, send email to sqlalchemy+unsubscr...@googlegroups.com. Visit this group at http://groups.google.com/group/sqlalchemy?hl=en. For more options, visit https://groups.google.com/groups/opt_out.
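One caveat with row locks here: SELECT ... FOR UPDATE cannot lock a row that doesn't exist yet, so it doesn't fully close this race. A common alternative (a sketch, not from the thread; it uses the SQLAlchemy 1.4+ API and reduces Camera to just an id column) is to treat IntegrityError as "another process inserted it first" and re-read:

```python
from sqlalchemy import create_engine, Column, Integer
from sqlalchemy.exc import IntegrityError
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()

class Camera(Base):
    __tablename__ = "cameras"
    id = Column(Integer, primary_key=True)

engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

def ensure_camera(session, camera_id):
    """Return the Camera row with this id, inserting it if necessary.

    A loser of the insert race sees IntegrityError, rolls back, and
    re-reads the row the winner committed.
    """
    obj = session.get(Camera, camera_id)
    if obj is not None:
        return obj
    try:
        obj = Camera(id=camera_id)
        session.add(obj)
        session.commit()
        return obj
    except IntegrityError:
        session.rollback()
        return session.get(Camera, camera_id)

session = Session(engine)
a = ensure_camera(session, 1)   # inserts the row
b = ensure_camera(session, 1)   # finds the existing row
```

This accepts the occasional IntegrityError rather than serializing all writers behind a table lock; the primary key gaps it can leave are harmless.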
[sqlalchemy] Can't access model object backref
Hello. I have a Category model:

class Category(Base):
    __tablename__ = 'categories'
    id = Column(Integer, primary_key=True)
    parent_id = Column(Integer, ForeignKey('categories.id'))
    name = Column(Unicode(255), nullable=False)
    description = Column(UnicodeText)
    position = Column(Integer)
    children = relationship('Category',
                            backref=backref('parent', remote_side=[id]),
                            lazy='joined', join_depth=1,
                            order_by='Category.position')

But I can't access its 'parent' backref (I want to use it in order_by). Why?

>>> from whs.models import Category
>>> Category.parent
Traceback (most recent call last):
  File "<console>", line 1, in <module>
AttributeError: type object 'Category' has no attribute 'parent'
[sqlalchemy] Re: Can't access model object backref
I want to order_by(Category.parent.name, Category.name). Is that possible?
Re: [sqlalchemy] Re: Can't access model object backref
You need to join to Category.parent first. It also must be aliased, because this is self-referential:

ca = aliased(Category)
query(Category).join(ca, Category.parent).order_by(ca.name, Category.name)

On Jan 25, 2013, at 3:49 PM, sector119 wrote: I want to order_by(Category.parent.name, Category.name). Is that possible?
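A runnable version of the aliased self-join above (a sketch with the model trimmed down; sample names and the explicit configure_mappers() call are illustrative — the 'parent' backref only exists on the class once the mappers are configured, which normally happens on first use):

```python
from sqlalchemy import create_engine, Column, Integer, ForeignKey, Unicode
from sqlalchemy.orm import (Session, aliased, backref, configure_mappers,
                            declarative_base, relationship)

Base = declarative_base()

class Category(Base):
    __tablename__ = "categories"
    id = Column(Integer, primary_key=True)
    parent_id = Column(Integer, ForeignKey("categories.id"))
    name = Column(Unicode(255), nullable=False)
    children = relationship(
        "Category",
        backref=backref("parent", remote_side=[id]),
        order_by="Category.name")

# force mapper configuration so Category.parent is present at class level
configure_mappers()

engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
session = Session(engine)
root = Category(name="root")
session.add_all([root, Category(name="leaf", parent=root)])
session.commit()

# join to the parent via an alias, then order by the parent's name
ca = aliased(Category)
names = [c.name for c in
         session.query(Category)
                .join(ca, Category.parent)
                .order_by(ca.name, Category.name)]
print(names)
```

Note the join is an inner join, so top-level categories (which have no parent) are excluded; an outerjoin would keep them.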
[sqlalchemy] Refresh session: rollback() or commit()?
I'm having a problem with many concurrent scripts, workers, and uwsgi instances writing and reading the same tables and rows almost simultaneously: sometimes one of them seems to get an older state, even from an object it never touched in the first place and is querying for the first time. I find that weird, but I assume it has to do with the database isolation level. The problem is, how do I adequately deal with that and make sure it never happens? I added a session.commit() before doing anything and it works; I assume rollback would work too. Is there any better solution?
Re: [sqlalchemy] Refresh session: rollback() or commit()?
On Jan 25, 2013, at 4:33 PM, Pedro Werneck wrote:

I'm having a problem with many concurrent scripts, workers, and uwsgi instances writing and reading the same tables and rows almost simultaneously, and sometimes one of them seems to get an older state, even from an object it never touched in the first place and is querying for the first time. I find that weird, but I assume it has to do with the database isolation level.

sure, if the updates to that row are still pending in an uncommitted transaction, the outside world would still see the old data.

The problem is, how to adequately deal with that and make sure it never happens? I added a session.commit() before doing anything and it works; I assume rollback would work too. Is there any better solution?

You should be committing *after* you've done some work. Then when a new request comes in, it should start out with a brand new Session, which will make a new database connection as soon as the database is accessed. When the request completes, the Session should be closed out. The documentation at http://docs.sqlalchemy.org/en/rel_0_8/orm/session.html#session-frequently-asked-questions discusses this, and the discussion continues at http://docs.sqlalchemy.org/en/rel_0_8/orm/session.html#using-thread-local-scope-with-web-applications .
Re: [sqlalchemy] Refresh session: rollback() or commit()?
Well... I'm afraid it's not as simple as that. I'll give an example: I have a webservice A, which triggers a callback and calls webservice B, creating a new row in the database with status = 0 and committing the transaction. Then I have a script which finds all rows with status = 0 and sends their ids, one by one, to a worker, which is supposed to get lots of data from many sources and then send that to another webservice C.

Now, sometimes, especially when things happen too fast, the query the worker does for the row with that id returns empty, even though that row isn't in an uncommitted transaction, and the script that called the worker found it itself. In principle, if things are running smoothly, that isn't supposed to happen. Get the problem? The worker doesn't have uncommitted changes; actually it never makes any changes at all. It got the id from a script that got the row, so the row exists for someone who just started a new session.

So, how can I be sure the worker will see that new row? I'm doing a commit with the empty transaction the worker has as soon as it's called, and it seems to be working, but is there any better way?

On Fri, Jan 25, 2013 at 7:42 PM, Michael Bayer mike...@zzzcomputing.com wrote:

On Jan 25, 2013, at 4:33 PM, Pedro Werneck wrote: I'm having a problem with many concurrent scripts, workers, and uwsgi instances writing and reading the same tables and rows almost simultaneously, and sometimes one of them seems to get an older state, even from an object it never touched in the first place and is querying for the first time. I find that weird, but I assume it has to do with the database isolation level.

sure, if the updates to that row are still pending in an uncommitted transaction, the outside world would still see the old data.

The problem is, how to adequately deal with that and make sure it never happens? I added a session.commit() before doing anything and it works; I assume rollback would work too. Is there any better solution?
You should be committing *after* you've done some work. Then when a new request comes in, it should start out with a brand new Session which will make a new database connection as soon as the database is accessed. When the request completes, the Session should be closed out. The documentation at http://docs.sqlalchemy.org/en/rel_0_8/orm/session.html#session-frequently-asked-questions discusses this, and continues the discussion at http://docs.sqlalchemy.org/en/rel_0_8/orm/session.html#using-thread-local-scope-with-web-applications .

---
Pedro Werneck
Re: [sqlalchemy] Refresh session: rollback() or commit()?
On Jan 25, 2013, at 5:12 PM, Pedro Werneck wrote:

Well... I'm afraid it's not as simple as that. I'll give an example: I have a webservice A, which triggers a callback and calls webservice B, creating a new row in the database with status = 0 and committing the transaction. Then I have a script which finds all rows with status = 0 and sends their ids, one by one, to a worker, which is supposed to get lots of data from many sources and then send that to another webservice C. Now, sometimes, especially when things happen too fast, the query the worker does for the row with that id returns empty, even though that row isn't in an uncommitted transaction, and the script that called the worker found it itself. In principle, if things are running smoothly, that isn't supposed to happen.

Is there some kind of distribution to the database, like master/slave? Otherwise, once data is committed, it is readable by all new transactions subsequent to that commit. If the script that is searching for status=0 is finding rows that are committed, then the worker that is querying for those rows should be able to see them, unless the worker has been holding open a long-running transaction. Long-running transactions here are more of the antipattern. The worker should ensure it responds to new messages from the status=0 script with a brand new transaction to read the message in question.

So, how can I be sure the worker will see that new row? I'm doing a commit with the empty transaction the worker has as soon as it's called, and it seems to be working, but is there any better way?

The worker should wait for a request from the script in a non-transactional state, without a Session. A request from the script comes in; the worker starts a new Session to respond to that request, hence a new transaction. Thinking about transaction demarcation in reverse still seems to suggest that this worker is leaving a dormant connection open as it waits for new jobs.
All of that said, this is only based on what you're telling me so far. There may be many more details here that entirely change how this might have to work.
Re: [sqlalchemy] Refresh session: rollback() or commit()?
On Jan 25, 2013, at 5:35 PM, Pedro Werneck wrote:

If the script that is searching for status=0 is finding rows that are committed, then the worker that is querying for those rows should be able to see them, unless the worker has been holding open a long-running transaction.

Exactly.

Long-running transactions here are more of the antipattern. The worker should ensure it responds to new messages from the status=0 script with a brand new transaction to read the message in question.

That's the point. What's the best way to do that, considering the worker is never updating anything, only reading? Should I commit at the end of every task, then, even without anything to commit? Should I start a new session on every call? The commit does that automatically if I'm not using autocommit=True, right?

just do it like this:

def receive_some_request(args):
    # connect to the database (in reality, this pulls a connection from a
    # pool as soon as the Session is used to emit SQL)
    session = Session(some_engine)
    try:
        # .. do things with session ...
        session.commit()  # if you have data to commit
    finally:
        session.close()   # close what was opened above

just like it were a plain database connection. that's per request received by your worker.

The worker should wait for a request from the script in a non-transactional state, without a Session. A request from the script comes in; the worker starts a new Session to respond to that request, hence a new transaction. Thinking about transaction demarcation in reverse still seems to suggest that this worker is leaving a dormant connection open as it waits for new jobs.

I'm pretty sure it does. I'm using Flask-SQLAlchemy and Celery for the workers. The workers reach the global app for the session and are keeping the connection open, but they do have work almost all the time and never sleep for more than a few secs.

---
Pedro Werneck
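The per-task pattern described in this exchange can be packaged as a small, runnable sketch (the in-memory SQLite engine and the context-manager wrapper are illustrative, not from the thread):

```python
from contextlib import contextmanager
from sqlalchemy import create_engine, text
from sqlalchemy.orm import Session

some_engine = create_engine("sqlite://")   # stands in for the real engine

@contextmanager
def task_session():
    # one Session, and hence one transaction, per worker task
    session = Session(some_engine)
    try:
        yield session
        session.commit()    # harmless if nothing was written
    finally:
        session.close()     # returns the connection to the pool

# per request received by the worker:
with task_session() as session:
    value = session.execute(text("select 1")).scalar()
```

Between tasks the worker holds no Session and no open transaction, so each task starts from the latest committed state of the database.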
Re: [sqlalchemy] Refresh session: rollback() or commit()?
That works, but now I'll have to change how my models use the session. Would this all be solved if I just use READ COMMITTED transaction isolation?

On Fri, Jan 25, 2013 at 8:45 PM, Michael Bayer mike...@zzzcomputing.com wrote:

On Jan 25, 2013, at 5:35 PM, Pedro Werneck wrote: If the script that is searching for status=0 is finding rows that are committed, then the worker that is querying for those rows should be able to see them, unless the worker has been holding open a long-running transaction. Exactly. Long-running transactions here are more of the antipattern. The worker should ensure it responds to new messages from the status=0 script with a brand new transaction to read the message in question. That's the point. What's the best way to do that, considering the worker is never updating anything, only reading? Should I commit at the end of every task, then, even without anything to commit? Should I start a new session on every call? The commit does that automatically if I'm not using autocommit=True, right? just do it like this:

def receive_some_request(args):
    # connect to the database (in reality, this pulls a connection from a
    # pool as soon as the Session is used to emit SQL)
    session = Session(some_engine)
    try:
        # .. do things with session ...
        session.commit()  # if you have data to commit
    finally:
        session.close()   # close what was opened above

just like it were a plain database connection. that's per request received by your worker. The worker should wait for a request from the script in a non-transactional state, without a Session. A request from the script comes in; the worker starts a new Session to respond to that request, hence a new transaction. Thinking about transaction demarcation in reverse still seems to suggest that this worker is leaving a dormant connection open as it waits for new jobs. I'm pretty sure it does. I'm using Flask-SQLAlchemy and Celery for the workers.
The workers reach the global app for the session and are keeping the connection open, but they do have work almost all the time and never sleep for more than a few secs.

---
Pedro Werneck
Re: [sqlalchemy] Refresh session: rollback() or commit()?
On Jan 25, 2013, at 8:02 PM, Pedro Werneck wrote:

That works, but now I'll have to change how my models use the session.

hmm, is that because your model objects themselves are controlling the scope of the transaction? That's another pattern I don't really recommend...

Would this all be solved if I just use READ COMMITTED transaction isolation?

maybe? If the problem is really just exactly those rows needing to be visible. But the long-running dormant transaction thing is still kind of an antipattern that will generally have negative effects.

On Fri, Jan 25, 2013 at 8:45 PM, Michael Bayer mike...@zzzcomputing.com wrote: On Jan 25, 2013, at 5:35 PM, Pedro Werneck wrote: If the script that is searching for status=0 is finding rows that are committed, then the worker that is querying for those rows should be able to see them, unless the worker has been holding open a long-running transaction. Exactly. Long-running transactions here are more of the antipattern. The worker should ensure it responds to new messages from the status=0 script with a brand new transaction to read the message in question. That's the point. What's the best way to do that, considering the worker is never updating anything, only reading? Should I commit at the end of every task, then, even without anything to commit? Should I start a new session on every call? The commit does that automatically if I'm not using autocommit=True, right? just do it like this:

def receive_some_request(args):
    # connect to the database (in reality, this pulls a connection from a
    # pool as soon as the Session is used to emit SQL)
    session = Session(some_engine)
    try:
        # .. do things with session ...
        session.commit()  # if you have data to commit
    finally:
        session.close()   # close what was opened above

just like it were a plain database connection. that's per request received by your worker. The worker should wait for a request from the script in a non-transactional state, without a Session.
A request from the script comes in; the worker starts a new Session to respond to that request, hence a new transaction. Thinking about transaction demarcation in reverse still seems to suggest that this worker is leaving a dormant connection open as it waits for new jobs. I'm pretty sure it does. I'm using Flask-SQLAlchemy and Celery for the workers. The workers reach the global app for the session and are keeping the connection open, but they do have work almost all the time and never sleep for more than a few secs.

---
Pedro Werneck
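On the READ COMMITTED question: the isolation level can be configured once on the Engine rather than per connection. A sketch of that (an assumption, not from the thread: READ COMMITTED is the value you would pass for backends such as PostgreSQL or MySQL; SQLite, used here only so the snippet runs anywhere, accepts SERIALIZABLE instead). Note it only helps if the reader's transaction is also short-lived, per the caveat above about long-running dormant transactions.

```python
from sqlalchemy import create_engine, text

# on PostgreSQL or MySQL: isolation_level="READ COMMITTED"
engine = create_engine("sqlite://", isolation_level="SERIALIZABLE")

with engine.connect() as conn:
    value = conn.execute(text("select 1")).scalar()
```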