Re: [sqlalchemy] ORM events order
Thank you very much.

On Thursday, December 5, 2013 5:20:57 PM UTC, Michael Bayer wrote:

On Dec 5, 2013, at 12:14 PM, Tim Kersten t...@io41.com wrote:

thank you. What of the relative ordering of the different ORM event types? i.e. before_flush, before_delete, after_flush, etc. When looking at before_flush I see that before_delete has not yet been fired, yet it has been fired by the time of after_flush. Is this guaranteed to always be the case?

yes, before_flush and after_flush provide boundaries around the mechanics of the flush itself. before_delete, as well as the other mapper-level events like before_update, before_insert, after_update, etc., are all within the flush mechanics. you can't necessarily rely upon the ordering of insert/update/delete events within the flush, however, relative to different objects and especially across different kinds of objects. The mapper-level flush events are fired right as individual batches of objects are being prepared for INSERT/UPDATE/DELETE statements.

On 5 Dec 2013, at 16:01, Michael Bayer mik...@zzzcomputing.com wrote:

On Dec 5, 2013, at 10:51 AM, Tim Kersten t...@io41.com wrote:

Hi Folks, Is the order ORM events ( http://docs.sqlalchemy.org/en/rel_0_9/orm/events.html ) are fired in deterministic and guaranteed to be the same every time? I've searched the docs and google but couldn't find anything relating to their relative order.

the events are ordered. when you do an event.listen() it appends the event listener to a list of listeners. the events are fired from the beginning of the list on forward. there's actually an undocumented argument to event.listen(), "insert=True", that will cause the listener to be inserted at position zero rather than appended. the reason the order of events is not really mentioned much is because there are complex cases where the order of listener application has not been evaluated or tested.
When you make use of mapper or instrumentation events against un-mapped base classes and such, the actual append() operation doesn't occur until later, when classes are actually mapped, and this works by shuttling the event listener functions around to those classes. In these cases we don't as yet have guarantees in place as to the order of the listeners being first applied, e.g. if you had a class that is a product of two mixins, and each mixin has listeners applied to it, that sort of thing. however, the order of listeners once applied should definitely be the same each time, assuming no changes to the listener collections.

-- You received this message because you are subscribed to the Google Groups sqlalchemy group. To unsubscribe from this group and stop receiving emails from it, send an email to sqlalchemy+unsubscr...@googlegroups.com. To post to this group, send email to sqlalchemy@googlegroups.com. Visit this group at http://groups.google.com/group/sqlalchemy. For more options, visit https://groups.google.com/groups/opt_out.
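[Editorial sketch] The append-vs-insert mechanics described above can be illustrated with a tiny plain-Python stand-in for the listener collection. This is an illustration only, not SQLAlchemy's actual event implementation; EventRegistry is an invented name.

```python
# Plain-Python stand-in for the listener ordering described above:
# listen() appends by default, while insert=True prepends, mirroring
# the (then-undocumented) insert=True argument to event.listen().
class EventRegistry:
    def __init__(self):
        self._listeners = []

    def listen(self, fn, insert=False):
        if insert:
            self._listeners.insert(0, fn)  # will fire first
        else:
            self._listeners.append(fn)     # fires in registration order

    def fire(self):
        # events are fired from the beginning of the list on forward
        for fn in self._listeners:
            fn()

calls = []
reg = EventRegistry()
reg.listen(lambda: calls.append("first registered"))
reg.listen(lambda: calls.append("second registered"))
reg.listen(lambda: calls.append("inserted"), insert=True)
reg.fire()
# the listener registered with insert=True runs before the others
```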
Re: [sqlalchemy] postgres schema per model
well ... i think i was right when i said that this implementation doesn't work, and i don't know exactly why. using abstract, declared_attr or __table_args__ via inheritance is not creating my models in another schema. if i put it explicitly in the model, it works fine. inherited? nope. no way. one question that can be raised: am i already using __table_args__ in those inherited models? yes, i am, but not in all of them, and even so all tables are created in the public schema. i'm using sa 0.8.3, psycopg2 and postgres 9.3.

On 12/05/2013 01:37 PM, Michael Bayer wrote:

here's an example, works on this end:

    from sqlalchemy import *
    from sqlalchemy.orm import *
    from sqlalchemy.ext.declarative import declarative_base

    Base = declarative_base()

    class AltSchema(object):
        __table_args__ = {'schema': 'test_schema'}

    class A(AltSchema, Base):
        __tablename__ = 'a'
        id = Column(Integer, primary_key=True)
        bs = relationship("B")

    class B(AltSchema, Base):
        __tablename__ = 'b'
        id = Column(Integer, primary_key=True)
        a_id = Column(Integer, ForeignKey('test_schema.a.id'))

    e = create_engine("postgresql://scott:tiger@localhost/test", echo=True)
    Base.metadata.create_all(e)

echo=True will tell all here...

On Dec 5, 2013, at 8:20 AM, Richard Gerd Kuesters rich...@humantech.com.br wrote:

hi all! another question: i have a postgres database, and i would like to work with schemas for module models. so far so good, but i didn't find much information besides what is available in the sa docs. so, i came into this: http://stackoverflow.com/questions/9298296/sqlalchemy-support-of-postgres-schemas but i would like to declare it at the model level, not the metadata level. is it possible? i tried using *__table_args__ = {'schema': 'foo'}* in *__abstract__*, with and without the *declared_attr* decorator and also without *__abstract__*; neither worked and all tables were created in the public schema. any tips?
:) i'm asking this because i have a LOT of tables, and declaring *__table_args__* in all of them just for the schema seems kinda weird, since we can mixin almost everything in sa. thanks in advance! richard.
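[Editorial sketch] One way the mixin approach can silently stop working is when a subclass defines its own __table_args__, which replaces the mixin's dict outright rather than merging with it. A sketch of a declared_attr-based mixin that merges per-class constraints with the schema; this uses modern SQLAlchemy import paths, and `test_schema` and `__extra_table_args__` are illustrative names, not SQLAlchemy API.

```python
from sqlalchemy import Column, Integer, UniqueConstraint
from sqlalchemy.orm import declarative_base, declared_attr

Base = declarative_base()

class AltSchema:
    # declared_attr lets each subclass contribute its own constraints
    # while still inheriting the schema; a plain __table_args__ dict on
    # the mixin is silently replaced when a subclass defines its own.
    @declared_attr
    def __table_args__(cls):
        extra = getattr(cls, "__extra_table_args__", ())
        return tuple(extra) + ({"schema": "test_schema"},)

class A(AltSchema, Base):
    __tablename__ = "a"
    id = Column(Integer, primary_key=True)

class B(AltSchema, Base):
    __tablename__ = "b"
    id = Column(Integer, primary_key=True)
    code = Column(Integer)
    # per-class constraints go here instead of overriding __table_args__
    __extra_table_args__ = (UniqueConstraint("code"),)
```

Both tables end up in test_schema, and B keeps its unique constraint.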
[sqlalchemy] Parallel connection attempts in QueuePool
Hello, I have faced a problem in my production server (gevent based); when a connection attempt is made and MySQL server does not respond (due to listen backlog full), the whole application hangs. This seems to be because SQLAlchemy QueuePool does not allow multiple connection attempts simultaneously. It is waiting for overflow count lock. I suggest that we allow multiple connection attempts at the same time as I don't see any side effects of doing so. Details follow. Details of the problem: python-sqlalchemy 0.7.4-1 python 2.7.3-0ubuntu2 python-gevent 1.0~b1-1 mysql-server 5.5.34-0ubuntu0.12.0 I opened a gevent backdoor connection to the hung server and created a test method and ran it. def test(): import pdb pdb.set_trace() import sqlalchemy import mysql.connector p = sqlalchemy.pool.manage(mysql.connector, pool_size=128) p.connect(host='myhost', port=3306, user='myuser', password='mypassword', buffered=True) It ran with following trace (excerpt): (Pdb) step --Call-- /usr/lib/python2.7/dist-packages/sqlalchemy/pool.py(931)connect() - def connect(self, *args, **kw): [...] (Pdb) /usr/lib/python2.7/dist-packages/sqlalchemy/pool.py(210)connect() - return _ConnectionFairy(self).checkout() [...] (Pdb) /usr/lib/python2.7/dist-packages/sqlalchemy/pool.py(371)__init__() - rec = self._connection_record = pool._do_get() [...] (Pdb) --Call-- /usr/lib/python2.7/dist-packages/sqlalchemy/pool.py(671)_do_get() - def _do_get(self): [...] 
(Pdb) /usr/lib/python2.7/dist-packages/sqlalchemy/pool.py(676)_do_get() - except sqla_queue.Empty: (Pdb) /usr/lib/python2.7/dist-packages/sqlalchemy/pool.py(677)_do_get() - if self._max_overflow > -1 and \ (Pdb) /usr/lib/python2.7/dist-packages/sqlalchemy/pool.py(678)_do_get() - self._overflow >= self._max_overflow: (Pdb) /usr/lib/python2.7/dist-packages/sqlalchemy/pool.py(687)_do_get() - if self._overflow_lock is not None: (Pdb) /usr/lib/python2.7/dist-packages/sqlalchemy/pool.py(688)_do_get() - self._overflow_lock.acquire() (Pdb) /usr/lib/pymodules/python2.7/gevent/coros.py(98)acquire() - def acquire(self, blocking=True, timeout=None):

Analysis: Before making a connection attempt the overflow counter lock is obtained and it is released only after the connection either succeeds or fails. In my case, a connection remained hung, possibly because of a surge in new DB connections, and the SYN backlog overflowed on the database server (I have since added a timeout and tuned my database server to have a much higher backlog). While this connection didn't respond, any new connection attempt, as seen in the above trace, waited trying to acquire the overflow lock. The whole application became incapable of serving requests. Cause is this code:

    class QueuePool(Pool):
        def _do_get(self):
            [...]
            if self._overflow_lock is not None:
                self._overflow_lock.acquire()
            if self._max_overflow > -1 and \
                    self._overflow >= self._max_overflow:
                if self._overflow_lock is not None:
                    self._overflow_lock.release()
                return self._do_get()
            try:
                con = self._create_connection()
                self._overflow += 1
            finally:
                if self._overflow_lock is not None:
                    self._overflow_lock.release()
            return con

Changeset 5f0a7bb cleaned up this code but does not seem to have changed the flow (behaviour should be the same on trunk). Since disabling the overflow with max_overflow = -1 does not use the lock at all, this behaviour is possibly an oversight rather than intended behaviour.
Possible solution: Since the overflow lock seems to exist only to maintain the overflow count, I suggest that we increment the counter *before* the connection attempt, not hold the lock during the connection attempt, and then decrement the counter in case of an error. If there is interest in doing this, I shall find time for a patch and possibly a test case. Thank you, -- Sunil Mohan Adapa
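[Editorial sketch] The proposed flow - take the lock only long enough to reserve an overflow slot, connect outside the lock, and roll the counter back on failure - can be sketched in plain Python. This is a simplified stand-in for illustration, not SQLAlchemy's actual QueuePool; OverflowCounter is an invented name.

```python
import threading

class OverflowCounter:
    """Sketch of the proposal: reserve an overflow slot under the lock,
    attempt the connection *outside* the lock, and give the slot back
    if the connection attempt fails."""

    def __init__(self, max_overflow):
        self._max_overflow = max_overflow
        self._overflow = 0
        self._lock = threading.Lock()

    def checkout(self, create_connection):
        with self._lock:
            if self._overflow >= self._max_overflow:
                return None  # caller should wait for a pooled connection
            self._overflow += 1  # reserve the slot before connecting
        try:
            # a hanging create_connection() no longer blocks other
            # threads, because the lock is not held here
            return create_connection()
        except Exception:
            with self._lock:
                self._overflow -= 1  # release the reserved slot
            raise
```

A failed attempt frees the slot again, so a later attempt can retry.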
Re: [sqlalchemy] fake models ?
I can only do this by making an example for you and for that I need the actual bits I'd be working with. The examples at http://docs.sqlalchemy.org/en/rel_0_9/orm/examples.html#module-examples.vertical already show how to create simple "queryable" properties if you only need to see the basic idea.

On Dec 6, 2013, at 6:06 AM, Richard Gerd Kuesters rich...@humantech.com.br wrote:

well ... i was trying to be a bit lazy on describing my models, because i'm interested in querying python classes that were not mapped to any one table, but to a lot of tables. my model works, it doesn't have all the flaws pointed out here. i just thought it would be clearer to see. but no problem. let's just forget about schematics and go to a simple class where i can save to the database any attribute that doesn't start with an underscore -- a pretty common pattern among pythonistas :) so, i have two tables that define a structure (entity and attribute), and some other tables that contain their values: instance, to tell me what entity i'm using; types and values, to tell me what i have persisted, what instance it belongs to, the type of the attribute and, from there, any validation i may find necessary. now we can start again? :) what i would like to do is get this persisted pseudo-model and add mappings so i can query it using the sqlalchemy orm api, even better if i would also be able to insert, update, delete ... my best regards, richard.

On 12/05/2013 07:44 PM, Michael Bayer wrote:

Overall the different pieces aren't making sense entirely. We have the notion of a "schema", stored in the database - that is, Entity/Attribute. Those tables are fixed per type. It can tell me for example that there's a "Person" type with two attributes associated, "name" and "website". So what happens when I do this:

    class Person(Model):
        name = StringType(required=True)
        website = URLType()

is there a metaclass that's writing the Entity/Attribute rows for that type to the database at that point?
It's not clear when I say Person.website, am I just going by the fact that we have Person.website in the Python model, and if so what am I getting with the Attribute or even the Value table? There could just be a table called "url_values" and I join from Instance to that. The schema seems to be stated twice here in two very different ways. Also not clear what the purpose of Instance.valid_attributes is; this seems redundant vs. Entity already referring to Attribute.

On Dec 5, 2013, at 11:48 AM, Richard Gerd Kuesters rich...@humantech.com.br wrote:

the vertical mapper example was kind of a base for me to develop my eav system. i have also added a table that defines the structure of the entity, since i use schematics (and its json schema import/export utility). it's very handy. i think some pseudo-code can explain a little better, in sa style (i'll just write it simplified):

    class Entity(Base):
        id = Column(int, pk)
        name = Column(str, unique)

    class Attribute(Base):
        id = Column(int, pk)
        name = Column(str)
        discriminator = Column(enum)  # bool, string, integer, blob, etc
        entity_id = Column(fk(Entity.id))
        entity = relationship(Entity)
        tbl_args = uniqueconstraint(name, entity_id)

basically, with those classes i can define how my eav objects look (of course, they're much more complete, but for now this gives the idea). so, when i have this schematics model:

    class Person(Model):
        name = StringType(required=True)
        website = URLType()

it will be interpreted as:

    person_entity = Entity()
    person_entity.name = 'person'
    session.add(person_entity)
    session.commit()

    name_attr = Attribute()
    name_attr.name = 'name'
    name_attr.discriminator = TypeDiscriminator.STRING  # i used here a slightly modified code from decl_enum, a post you wrote in your blog
    name_attr.entity = person_entity
    session.add(name_attr)
    session.commit()

    website_attr = Attribute()
    ...

i think this starts to illustrate better what the Person model really is.
now, let's go to the values (in poor code again, lol):

    class Instance(Base):
        id = Column(int, pk)
        entity_id = Column(fk(Entity.id))
        entity = relationship(Entity)
        valid_attributes = relationship(Attribute, primaryjoin=entity_id == Attribute.entity_id)

    class Value(Base):
        id = Column(int, pk)
        attribute_id = Column(fk(Attribute.id))
        discriminator = Column(enum)
        __mapper_args__ = dict(polymorphic_on=discriminator)

    class StringValue(Value):
        id = Column(fk(Value.id))
        value = Column(string)
        __mapper_args__ = dict(polymorphic_identity=TypeDiscriminator.STRING)

    class BoolValue(Value):
        ...

then, with a dozen ifs and elses, I can translate the values given to a Person instance from schematics directly to the database.
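[Editorial sketch] The "dozen ifs and elses" could collapse into a dispatch table keyed by the discriminator. A plain-Python sketch with stand-in classes: the real ones would be the mapped Value subclasses above, and TypeDiscriminator here is a simple Enum stand-in for the decl_enum-based type.

```python
from enum import Enum

class TypeDiscriminator(Enum):  # stand-in for the decl_enum-based type
    STRING = "string"
    BOOL = "bool"
    INTEGER = "integer"

class StringValue:  # stand-ins for the polymorphic Value subclasses
    def __init__(self, value):
        self.value = value

class BoolValue(StringValue):
    pass

class IntegerValue(StringValue):
    pass

# one dict lookup instead of a chain of if/elif on the discriminator
VALUE_CLASSES = {
    TypeDiscriminator.STRING: StringValue,
    TypeDiscriminator.BOOL: BoolValue,
    TypeDiscriminator.INTEGER: IntegerValue,
}

def make_value(discriminator, raw):
    try:
        return VALUE_CLASSES[discriminator](raw)
    except KeyError:
        raise ValueError("no Value class for %r" % discriminator)
```

Adding a new value type then means one new class and one new dict entry.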
[sqlalchemy] Re: Parallel connection attempts in QueuePool
On Friday, December 6, 2013 10:16:21 AM UTC-5, Sunil Adapa wrote:

Before making a connection attempt the overflow counter lock is obtained and it is released only after the connection either succeeds or fails. In my case, a connection remained hung, possibly because of a surge in new DB connections, and the SYN backlog overflowed on the database server (I have since added a timeout and tuned my database server to have a much higher backlog). While this connection didn't respond, any new connection attempt, as seen in the above trace, waited trying to acquire the overflow lock. The whole application became incapable of serving requests. Cause is this code:

It sounds to me like you have an anti-pattern somewhere. Your entire application shouldn't hang because of a database connectivity issue. Specific requests to code that is trying to interact with the database should hang -- but the application itself shouldn't. SQLAlchemy shouldn't even connect to the database on a request that doesn't have DB logic in it. The immediate concern I see with your fix -- at least in your situation -- is that it could create a further surge and backlog on the MySQL server. I've worked with a few large online properties where some sort of surge like this (where a bunch of app-servers kept reconnecting to mysql in a short time) ended up creating a self-inflicted Denial of Service that took the sites offline, and in most of those instances the DB got corrupted and had to be repaired. This could be a good fix, Mike is the arbiter of all that stuff -- but this seems to me like it would facilitate a different problem.
[sqlalchemy] Re: Many2many, referential integrity and introspection
can you share your existing schema for these relations?
Re: [sqlalchemy] Parallel connection attempts in QueuePool
On Dec 6, 2013, at 10:16 AM, Sunil Adapa su...@innopark.in wrote:

Hello, I have faced a problem in my production server (gevent based); when a connection attempt is made and the MySQL server does not respond (due to the listen backlog being full), the whole application hangs. This seems to be because SQLAlchemy's QueuePool does not allow multiple connection attempts simultaneously. It is waiting for the overflow count lock. I suggest that we allow multiple connection attempts at the same time as I don't see any side effects of doing so. Details follow.

OK, I see this is with gevent - while I like the idea of gevent, I'm not deeply familiar with best practices for it. The QueuePool specifically uses thread-based locks to achieve its work. I can't comment on what modifications might be needed to it in order to work with gevent's model, but overall I'd suggest an entirely different pool implementation optimized for gevent. When I spent some time trying out gevent I noticed that QueuePool might have been having problems, and this is not surprising. For starters, I'd probably use NullPool with a gevent-based application, if there are in fact gevent-specific issues occurring.

Analysis: Before making a connection attempt the overflow counter lock is obtained and it is released only after the connection either succeeds or fails. In my case, a connection remained hung, possibly because of a surge in new DB connections, and the SYN backlog overflowed on the database server (I have since added a timeout and tuned my database server to have a much higher backlog). While this connection didn't respond, any new connection attempt, as seen in the above trace, waited trying to acquire the overflow lock. The whole application became incapable of serving requests. Cause is this code:

    class QueuePool(Pool):
        def _do_get(self):
            [...]
            if self._overflow_lock is not None:
                self._overflow_lock.acquire()
            if self._max_overflow > -1 and \
                    self._overflow >= self._max_overflow:
                if self._overflow_lock is not None:
                    self._overflow_lock.release()
                return self._do_get()
            try:
                con = self._create_connection()
                self._overflow += 1
            finally:
                if self._overflow_lock is not None:
                    self._overflow_lock.release()
            return con

Changeset 5f0a7bb cleaned up this code but does not seem to have changed the flow (behaviour should be the same on trunk). Since disabling the overflow with max_overflow = -1 does not use the lock at all, this behaviour is possibly an oversight rather than intended behaviour.

Noting that I haven't deeply gotten into this code at the moment, overall I'm confused about "the application became incapable of serving requests" - if the QueuePool serves out as many connections as it's supposed to, it's supposed to block all callers at that point. If you set max_overflow to -1, then there is no overflow_lock present at all; it's set to None in the constructor. Otherwise, blocking on the call is what it's supposed to do in a traditionally threaded application. If when using gevent this means that other workers are blocked because the whole thing expects any kind of waiting to be handled "async style", then that suggests we need a totally different approach for gevent.

Since the overflow lock seems to exist only to maintain the overflow count, I suggest that we increment the counter *before* the connection attempt, not hold the lock during the connection attempt, and then decrement the counter in case of an error. If there is interest in doing this, I shall find time for a patch and possibly a test case.

How would that work with a traditionally threaded application? My program goes to get a connection, the QueuePool says there's none available yet and I should wait, then the call returns with... what? if it isn't waiting.
I apologize that I have only a fuzzy view of how things work with gevent, and at this time of the morning I'm probably not engaging the traditional threading model in my head so well either.
[sqlalchemy] Functions sometimes return decimal or float
Hello, I'm working with a database that stores mostly integer values or floats with little (relevant) precision. Thus, I am using all numeric types with asdecimal=True. Unfortunately, our database calculations return Decimal objects when run in MySQL, though they return int/float in SQLite. While I can deal with either type, the inconsistency is causing some pain -- as we only notice the lack of return value conversion / db CAST rather late in the development process. Generally, I would like the ORM to ensure consistent result types for a query like:

    DBSession.query(func.sum(Table.int_column))

We often handle this as:

    DBSession.query(cast(func.sum(Table.int_column), Integer))

Is there something we can do to generally avoid this type of bug? E.g., always convert all DB results to float, issue a warning when an explicit cast is missing and some DBs may return decimal, ...? How do you deal with this situation? Cheers, Robert
[sqlalchemy] Re: Functions sometimes return decimal or float
what about using custom compiler functions ( http://docs.sqlalchemy.org/en/rel_0_9/core/compiler.html#utc-timestamp-function )? you could make a custom function... sum_integer() and just call that instead of the default sqlalchemy option
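[Editorial sketch] A related route uses SQLAlchemy's GenericFunction hook rather than the compiler extension: registering a function class with a fixed return type means the result is coerced the same way on every backend. The name/identifier split is SQLAlchemy's documented GenericFunction API; sum_integer is an invented name for illustration.

```python
from sqlalchemy import Integer, column, func
from sqlalchemy.sql.functions import GenericFunction

class sum_integer(GenericFunction):
    """Renders as SUM(...) but always carries an Integer return type,
    so result processing is consistent across backends."""
    name = "sum"                # the SQL name that gets rendered
    identifier = "sum_integer"  # invoked as func.sum_integer(...)
    type = Integer()            # fixed Python-side return type
    inherit_cache = True        # avoids a cache warning on SQLAlchemy 1.4+

# GenericFunction subclasses register themselves with func automatically
expr = func.sum_integer(column("x"))
```

expr renders as sum(x) while its .type is always Integer, regardless of dialect.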
Re: [sqlalchemy] Re: Functions sometimes return decimal or float
Hey Jonathan, On Fr, 2013-12-06 at 10:12 -0800, Jonathan Vanasco wrote: what about using custom compiler functions -- that is definitely a good idea to cast the computation result more elegantly. Do you also know a way to solve the issue of doing it wrong by accident and not noticing? Besides SUM, there are still AVG and other mathematical functions that you can use without being aware of the impact on other DB engines. My goal was to make it harder to make a mistake unknowingly. Cheers, Robert
Re: [sqlalchemy] Parallel connection attempts in QueuePool
On Dec 6, 2013, at 11:15 AM, Michael Bayer mike...@zzzcomputing.com wrote:

Since the overflow lock seems to exist only to maintain the overflow count, I suggest that we increment the counter *before* the connection attempt, not hold the lock during the connection attempt, and then decrement the counter in case of an error. If there is interest in doing this, I shall find time for a patch and possibly a test case.

starting again, I think the part I missed is that you're saying the creation of a new connection is the part that hung, not the waiting for existing connections to become available. So suppose _create_connection() is hanging, but there's plenty of overflow available - other threads should still be able to go in and all access _create_connection(). the proposal is simple enough:

    diff --git a/lib/sqlalchemy/pool.py b/lib/sqlalchemy/pool.py
    index 34681ef..219a98f 100644
    --- a/lib/sqlalchemy/pool.py
    +++ b/lib/sqlalchemy/pool.py
    @@ -812,11 +812,14 @@ class QueuePool(Pool):
                         self._overflow >= self._max_overflow:
                     return self._do_get()
                 else:
    -                con = self._create_connection()
                     self._overflow += 1
    -                return con
             finally:
                 self._overflow_lock.release()
    +        try:
    +            return self._create_connection()
    +        except:
    +            self._overflow -= 1
    +            raise

         def recreate(self):
             self.logger.info("Pool recreating")

the hard part is producing a test case. I noticed just now that even if I take the overflow_lock out entirely, all the current tests pass, but this is because it's not easy for tests to catch race conditions like that. To test the new change, it should be simpler: inject a mock connection that will hang on one attempt and pass on another, then ensure that the second attempt successfully connects within the overflow range before the "hanging" one does (or errors out). Also can you confirm the MySQL behavior here is such that only arbitrary connection attempts are hanging?
That is, a subsequent connection attempt succeeds while the previous one continues to hang - otherwise I'm not sure how this patch improves the situation.
Re: [sqlalchemy] Re: Functions sometimes return decimal or float
I just learned all about the Events model. You could potentially write a listener that can filter the data for you. I'm not very familiar with the events though, and it could be a bit difficult to do -- as some items would be functions, others column operations, and others from 'text' constructs -- you might end up needing to apply a filter to every 'number-like' value, and that could cause bad side effects. are there any methods in the databases that would set up per-session options? mysql has session-level options, sqlite has pragma and some other stuff. maybe you could tell the db to format stuff on its end?
Re: [sqlalchemy] Parallel connection attempts in QueuePool
yeah, sorry I get this entirely now, ticket is http://www.sqlalchemy.org/trac/ticket/2880 and here's a test, has nothing to do with gevent. the patch should be fine and I'll see if i can get it into 0.8.4 as well.

    from sqlalchemy.pool import QueuePool
    from sqlalchemy.testing.mock import Mock, call
    import threading
    import time

    dbapi = Mock()

    def hanging_dbapi():
        time.sleep(5)
        return dbapi.connect()

    def fast_dbapi():
        return dbapi.connect()

    def failing_dbapi():
        time.sleep(5)
        raise Exception("connection failed")

    creator = threading.local()

    def create():
        return creator.mock_connector()

    def run_test(name, should_hang, should_fail):
        print("run test: %s %s %s" % (name, should_hang, should_fail))
        if should_fail:
            creator.mock_connector = failing_dbapi
        elif should_hang:
            creator.mock_connector = hanging_dbapi
        else:
            creator.mock_connector = fast_dbapi
        conn = pool.connect()
        print("connected: %s" % name)
        conn.operation(name)
        time.sleep(3)
        conn.close()

    pool = QueuePool(creator=create, pool_size=2, max_overflow=3)

    success_one = threading.Thread(target=run_test, args=("success_one", False, False))
    success_two = threading.Thread(target=run_test, args=("success_two", False, False))
    overflow_one = threading.Thread(target=run_test, args=("overflow_one", True, False))
    overflow_two = threading.Thread(target=run_test, args=("overflow_two", False, False))
    overflow_three = threading.Thread(target=run_test, args=("overflow_three", False, False))

    success_one.start()
    time.sleep(.5)
    success_two.start()
    time.sleep(.5)
    overflow_one.start()
    time.sleep(.5)
    overflow_two.start()
    time.sleep(.5)
    overflow_three.start()
    time.sleep(.5)

    overflow_one.join(timeout=10)

    assert \
        dbapi.connect().operation.mock_calls == \
        [call("success_one"), call("success_two"),
         call("overflow_two"), call("overflow_three"),
         call("overflow_one")], \
        dbapi.connect().operation.mock_calls
Re: [sqlalchemy] Functions sometimes return decimal or float
On Dec 6, 2013, at 12:28 PM, Robert Buchholz robert.buchh...@goodpoint.de wrote:

Hello, I'm working with a database that stores mostly integer values or floats with little (relevant) precision. Thus, I am using all numeric types with asdecimal=True. Unfortunately, our database calculations return Decimal objects when run in MySQL, though they return int/float in SQLite. While I can deal with either type, the inconsistency is causing some pain -- as we only notice the lack of return value conversion / db CAST rather late in the development process. Generally, I would like the ORM to ensure consistent result types for a query like: DBSession.query(func.sum(Table.int_column)) We often handle this as: DBSession.query(cast(func.sum(Table.int_column), Integer)) Is there something we can do to generally avoid this type of bug?

the func.XYZ() construct returns an object that is untyped in most cases, meaning SQLAlchemy has no opinion about the data that's returned by the database. This type can be specified using "type_=sometype". However, that should not be necessary for sum() - there's a small subset of known SQL functions for which we assign a fixed type. sum() is actually one of them, and should be using the same type as that of the given expression. So if you hand it a Column with type Numeric(asdecimal=True), that will be the return type of the func, and the dialect-specific logic will take effect. A test case (below) confirms this is the case for sum() - the type of the column is maintained, whether asdecimal is True or False; it succeeds on both MySQL and SQLite. If you're using some function other than sum() which isn't one of the known types, then you need to add type_=yourtype to it, like func.myfunc(table.c.column, type_=Float(asdecimal=True)).
from sqlalchemy import func, Table, Column, MetaData, Float, select, create_engine
from decimal import Decimal

# create a table with a Float column not using decimal.
m = MetaData()
t = Table('t', m, Column('data', Float))

dburl = "sqlite://"
# dburl = "mysql://scott:tiger@localhost/test"
e = create_engine(dburl, echo=True)
m.drop_all(e)
m.create_all(e)
e.execute(t.insert(), data=45.67)

# now let's query func.sum() using a Float with asdecimal:
m2 = MetaData()
t_with_decimal = Table('t', m2, Column('data', Float(asdecimal=True)))

# with the Float column, we get float
assert isinstance(
    e.scalar(select([func.sum(t.c.data)])),
    float
)

# with the Float(asdecimal=True) column, we get Decimal
assert isinstance(
    e.scalar(select([func.sum(t_with_decimal.c.data)])),
    Decimal
)

Cheers,

Robert

--
You received this message because you are subscribed to the Google Groups "sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at http://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/groups/opt_out.
Re: [sqlalchemy] Parallel connection attempts in QueuePool
On Dec 6, 2013, at 6:01 PM, Claudio Freire klaussfre...@gmail.com wrote:

On Fri, Dec 6, 2013 at 3:50 PM, Michael Bayer mike...@zzzcomputing.com wrote:

+        except:
+            self._overflow -= 1
+            raise

That also needs to acquire the lock.

I think I'm going to make a different object out of "overflow" just to make this easier to follow. It just needs to be an atomic counter that goes between M and N and tells you "yes" or "no". The code right now is very inlined, as it originates from a certain performance-crazy time in SQLA's history, but connection pool checkouts aren't really per-statement these days.
Re: [sqlalchemy] Parallel connection attempts in QueuePool
On Dec 6, 2013, at 6:27 PM, Michael Bayer mike...@zzzcomputing.com wrote:

On Dec 6, 2013, at 6:01 PM, Claudio Freire klaussfre...@gmail.com wrote:

On Fri, Dec 6, 2013 at 3:50 PM, Michael Bayer mike...@zzzcomputing.com wrote:

+        except:
+            self._overflow -= 1
+            raise

That also needs to acquire the lock.

I think I'm going to make a different object out of "overflow" just to make this easier to follow. It just needs to be an atomic counter that goes between M and N and tells you "yes" or "no". The code right now is very inlined, as it originates from a certain performance-crazy time in SQLA's history, but connection pool checkouts aren't really per-statement these days.

OK, more or less that - please review my commit at https://github.com/zzzeek/sqlalchemy/commit/d1cc78479d988bd9acbcf395483d2130b0873b1c which moves handling of "overflow" into _inc_overflow() and _dec_overflow() methods; the handling of the _overflow_lock is local to those methods. The _do_get() method is a lot easier to read now. Also added the existing test plus a new one to check the "decrement overflow on connection failure" case. Thanks all for the help on this one.
Re: [sqlalchemy] Re: Parallel connection attempts in QueuePool
On Friday 06 December 2013 09:35 PM, Jonathan Vanasco wrote:

[...] It sounds to me like you have an anti-pattern somewhere. Your entire application shouldn't hang because of a database connectivity issue. Specific requests to code that is trying to interact with the database should hang -- but the application itself shouldn't. SQLAlchemy shouldn't even connect to the database on a request that doesn't have DB logic in it.

Sorry, I forgot to mention that all my application's requests require database queries. As expected, other parts of my application are indeed working properly. In fact, as mentioned, I was able to connect to the gevent backdoor (like Twisted's manhole) port and obtain the provided trace on the 'hung' server.

The immediate concern I see with your fix -- at least in your situation -- is that it could create a further surge and backlog on the MySQL server. I've worked with a few large online properties where some sort of surge like this (where a bunch of app servers kept reconnecting to MySQL in a short time) ended up creating a self-inflicted denial of service that took the sites offline, and in most of the instances the DB got corrupted and had to be repaired.

Hundreds of connections per second to the database are being created by a legacy PHP application. My Python/SQLAlchemy application is much better behaved thanks to connection pooling; it just got caught in the overall mess. My database can handle a far higher connection rate than the temporary backlog surge. The proper fix for the backlog problem, of course, is to raise MySQL's backlog parameter and the overall system backlog limits in the kernel, which I have done [1]. However, the current problem is that once a temporary backlog overflow is hit (or a few connections are lost for some reason), the application does not *ever* serve another database-related request. It certainly requires the fix that has been committed.
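The backlog tuning Sunil describes (link [1] in his message) operates at two levels: the backlog MySQL requests in its listen() call, and the kernel limits that clamp it. A rough sketch with illustrative values only; tune for your workload:

```ini
# /etc/my.cnf -- backlog MySQL passes to listen() (illustrative value)
[mysqld]
back_log = 512

# /etc/sysctl.conf -- kernel-side limits that clamp any listen() backlog
# (illustrative values; apply with "sysctl -p")
# net.core.somaxconn = 1024
# net.ipv4.tcp_max_syn_backlog = 2048
```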
Links:

1) http://www.mysqlperformanceblog.com/2012/01/06/mysql-high-number-connections-per-secon/

Thank you,

--
Sunil
Re: [sqlalchemy] Parallel connection attempts in QueuePool
On Friday 06 December 2013 09:45 PM, Michael Bayer wrote:

[...] OK, I see this is with gevent - while I like the idea of gevent, I'm not deeply familiar with best practices for it. The QueuePool specifically uses thread-based locks to achieve its work. I can't comment on what modifications might be needed for it to work with gevent's model, but overall I'd suggest an entirely different pool implementation optimized for gevent. When I spent some time trying out gevent I noticed that QueuePool might have been having problems, and this is not surprising. For starters, I'd probably use NullPool with a gevent-based application, if there are in fact gevent-specific issues occurring.

The threading API is transparently replaced with gevent's own lightweight threading implementation via monkey patching, and this includes the lock implementation. After monkey patching, a library like SQLAlchemy will unknowingly spawn gthreads instead of threads, and unknowingly use gthread locks instead of regular thread locks. Where in the traditional model a lock blocks a thread while other threads continue to run, a gthread lock suspends the gthread and returns to the event loop, which processes and runs other events/gthreads. This all usually works fine except in rare situations, and I see nothing in SQLAlchemy/QueuePool that would make it not work properly. I am happy to report that I have been running Gevent/SQLAlchemy/QueuePool for quite some time in a highly available setup with ~2-3k QPS of database load. As you note later, the problem at hand has nothing to do with gevent and would occur in a traditional threading model too. Sorry to have introduced the gevent confusion, but I felt obliged to mention it for the purpose of a full report.

[...] Changeset 5f0a7bb cleaned up this code but does not seem to have changed the flow (behaviour should be the same on trunk).
Since disabling the overflow with max_overflow = -1 does not use a lock at all, this behaviour is possibly an oversight rather than intended.

Noting that I haven't deeply gotten into this code at the moment, overall I'm confused about "the application became incapable of serving requests" - if the QueuePool serves out as many connections as it's supposed to, it's supposed to block all callers at that point. If you set max_overflow to -1, then there is no overflow_lock present at all; it's set to None in the constructor. Otherwise, blocking on the call is what it's supposed to do in a traditionally threaded application. If when using gevent this means that other workers are blocked, because the whole thing expects any kind of waiting to be handled "async style", then that suggests we need a totally different approach for gevent.

Since the overflow lock seems to exist only to maintain the overflow count, I suggest that we increment the counter *before* the connection attempt, not hold the lock during the connection attempt, and then decrement the counter in case of an error. If there is interest in doing this, I shall find time for a patch and possibly a test case.

How would that work with a traditionally threaded application? My program goes to get a connection, the QueuePool says there's none available yet and I should wait, then the call returns with... what? if it isn't waiting. I apologize that I have only a fuzzy view of how things work with gevent, and at this time of the morning I'm probably not engaging the traditional threading model in my head so well either.

As you predicted in the later mail, this problem has in fact occurred well before the pool size was reached:

Pool limit = 128 + 10 overflow
Checked out connections at the time of the problem = 27

--
Sunil
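Sunil's proposed scheme (increment before the attempt, no lock held while connecting, decrement on failure) is essentially what the committed _inc_overflow()/_dec_overflow() change does. A rough stdlib-only sketch of the idea; the names here are illustrative, not SQLAlchemy's actual code:

```python
import threading

class OverflowCounter:
    """Bounded counter guarding pool overflow (illustrative names only)."""
    def __init__(self, max_overflow):
        self._max = max_overflow          # -1 means unlimited, no lock needed
        self._count = 0
        self._lock = threading.Lock()

    def inc(self):
        """Reserve an overflow slot *before* attempting to connect."""
        if self._max == -1:
            self._count += 1
            return True
        with self._lock:
            if self._count >= self._max:
                return False
            self._count += 1
            return True

    def dec(self):
        """Release the slot, e.g. when the connection attempt failed."""
        if self._max == -1:
            self._count -= 1
            return
        with self._lock:
            self._count -= 1

def checkout(counter, connect):
    # The lock is NOT held while connecting, so one hung connection
    # attempt cannot block every other checkout.
    if not counter.inc():
        raise RuntimeError("pool exhausted; a real pool would wait here")
    try:
        return connect()
    except Exception:
        counter.dec()   # a failed connect must not leak the slot
        raise
```

The key property is that the slot taken by a failed connection attempt is given back, so a burst of connection errors cannot permanently pin the overflow count at its maximum.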
Re: [sqlalchemy] Parallel connection attempts in QueuePool
On Saturday 07 December 2013 12:20 AM, Michael Bayer wrote:

[...] the hard part is producing a test case. I noticed just now that even if I take overflow_lock out entirely, all the current tests pass, but this is because it's not easy for tests to catch race conditions like that. To test the new change it should be simpler: inject a mock connection that will hang on one attempt and pass on another, then ensure that the second attempt successfully connects within the overflow range before the "hanging" one does (or errors out).

I was thinking of writing a dummy listener to point the MySQL python/connector at for the test case. I see that your test case using a dummy connector method is much simpler and more elegant.

Also can you confirm the MySQL behavior here is such that only arbitrary connection attempts are hanging? That is, a subsequent connection attempt succeeds while the previous one continues to hang - otherwise I'm not sure how this patch improves the situation.

My observation is that subsequent connection attempts do go through. Although some of my application instances (3 out of 12) hung, the other instances (the 9 lucky ones) continued to work, as did the legacy PHP code without connection pooling. Due to the nature of the listen() backlog, this problem, I assume, is not specific to MySQL. The material I used to fix the MySQL problem provides more background [1][2][3].

Links:

1) https://dev.mysql.com/doc/refman/5.5/en/server-system-variables.html#sysvar_back_log
2) http://www.mysqlperformanceblog.com/2012/01/06/mysql-high-number-connections-per-secon/
3) http://man7.org/linux/man-pages/man2/listen.2.html

Thank you,

--
Sunil
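Michael's suggested test shape - a mock creator that hangs on one attempt while another attempt gets through - might be sketched roughly like this, using only the stdlib; the names and structure are illustrative, not the actual SQLAlchemy test:

```python
import threading

class FlakyCreator:
    """Mock connect(): one call hangs until released, then errors out;
    any other call returns a stand-in connection immediately."""
    def __init__(self):
        self._lock = threading.Lock()
        self._first = True
        self._release = threading.Event()

    def __call__(self):
        with self._lock:
            first, self._first = self._first, False
        if first:
            self._release.wait(timeout=5)      # simulate a hung connection attempt
            raise OSError("simulated connect timeout")
        return object()                         # stand-in for a DBAPI connection

creator = FlakyCreator()
results = []

def attempt():
    try:
        results.append(creator())
    except OSError:
        results.append(None)

hanging = threading.Thread(target=attempt)      # will hang, then error out
second = threading.Thread(target=attempt)       # must get through regardless
hanging.start()
second.start()
second.join(timeout=2)

# One attempt connected while the other was still hanging.
assert any(r is not None for r in results)

creator._release.set()                          # let the hung attempt finish
hanging.join()
second.join()
```

The assertion in the middle is the essence of the fix being tested: a hung connection attempt must not prevent a concurrent attempt from succeeding.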